Bidirectional Encoder Representations from Transformers (BERT), like GPT-3 and GPT-2, is a large language model designed for natural language processing tasks. The Transformer architecture forms the backbone of BERT.
The architecture comprises multiple layers of self-attention and feed-forward neural networks, which enable BERT to process sequences of data in parallel rather than sequentially.
The BERT-base model contains 110 million parameters, and the larger variant, BERT-large, contains 340 million parameters. Deploying such BERT variants to production can pose significant challenges, including:
By the end of this article, you will have learned:
Here’s an overview of the solution you will build in this article:
Let’s dive right in! 🏊
Text classification involves applying precise labels or categories to textual data. With this, you can train a model to sort text into defined categories. Many industries and large businesses use text classification models for real-world applications like document classification.
One everyday example of text classification silently operating in the background is spam filtering, which separates legitimate mail from unsolicited messages in your email inbox. Text classification also plays a crucial role in sentiment analysis and content moderation: classifying the sentiment behind posts helps identify harmful content, such as hate speech or offensive language, a significant step toward creating a safer online environment.
You might wonder, "How does BERT grasp human language and execute text classification?" Essentially, there are two critical phases to harnessing BERT's capabilities:
1. The pre-training phase
2. The fine-tuning phase
During the pre-training phase, the model undergoes training on vast amounts of textual data to grasp linguistic structures. This phase demands significant computational power. For instance, to pre-train BERT, Google used multiple TPUs, specialized processors designed for deep learning models.
The pre-training phase is structured around three pivotal steps:
Next-sentence prediction: This self-supervised task involves presenting BERT with two sentences, A and B, and challenging it to determine whether B logically follows A or is merely a random excerpt from the dataset.
Once the BERT model is pre-trained, it's primed for fine-tuning on any specific NLP task. At this stage, you can introduce a domain-specific dataset in the same language and leverage BERT’s pre-trained weights.
The beauty of fine-tuning BERT is that it doesn't necessitate vast datasets, making the process more cost-effective. BERT's architecture makes it very good at understanding the subtleties and context of language, but the pre-trained model alone is not set up to classify text.
You must fine-tune BERT on text classification tasks, adapting it to categorize text effectively. Alternatively, you can use a BERT model that has already been fine-tuned for text classification, available in open-source variants. This should provide immediate, nuanced textual analysis without the need for extensive computational resources.
For an interactive experience, check out the Colab Notebook, which contains all the provided code and is ready to run!
Use the Hugging Face Hub to find an ideal pre-trained BERT model. The Hugging Face Hub hosts the most extensive registry of open-source machine learning models. As of this writing, the Hub boasts a repository of over 6,900 fine-tuned BERT models.
From the hub, you can find fine-tuned BERT models that may meet your needs and even contribute back by pushing your optimized model to the community. We will use BERT for text classification, specifically “spam detection,” to filter out and isolate spam content.
Install the "transformers" library from Hugging Face:
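A standard install command looks like the following (the article notes the version tested for this demo is 4.34.1; pinning is optional):

```shell
pip install transformers
```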
To build the spam classification pipeline, we will use the `pipeline()` function from the transformers library to run inference with the pre-trained BERT model. It acts as a high-level wrapper, requiring you to specify the task and model, along with other parameters, as detailed in its official documentation.
This setup makes it easier to use the BERT model for specific tasks, in this case, text classification. The `pipeline` abstracts the model’s complexity and gives you a simple way to interact with it.
Build the pre-trained BERT model pipeline:
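A minimal sketch of the pipeline setup is below. The model ID is one publicly available example of a BERT model fine-tuned for SMS spam detection at the time of writing; any fine-tuned spam-detection BERT model from the Hugging Face Hub can be substituted, and the label names ("SPAM"/"HAM" vs. "LABEL_0"/"LABEL_1") depend on the chosen model's configuration:

```python
from transformers import pipeline

# Example model ID; swap in any BERT model from the Hub
# that was fine-tuned for spam/ham classification.
spam_classifier = pipeline(
    task="text-classification",
    model="mrm8488/bert-tiny-finetuned-sms-spam-detection",
)
```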
Pass your data through the spam classifier pipeline to classify the text as “SPAM” or “HAM”:
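A sketch of running inference is below (the pipeline construction is repeated here so the snippet is self-contained; the model ID and sample messages are illustrative only):

```python
from transformers import pipeline

# Same example model as before; label names depend on the model's config.
spam_classifier = pipeline(
    task="text-classification",
    model="mrm8488/bert-tiny-finetuned-sms-spam-detection",
)

messages = [
    "Congratulations! You have won a $1,000 gift card. Click here to claim.",
    "Hey, are we still meeting for lunch tomorrow?",
]

# Each result is a dict with a predicted label and a probability score.
results = spam_classifier(messages)
for message, result in zip(messages, results):
    print(f"{message!r} -> {result['label']} ({result['score']:.4f})")
```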
You should see an output similar to the one below, distinguishing between both classes with corresponding probability scores.
To get started with Modelbit, create an account. This demo can run on a free account.
Install the Modelbit Python package with pip:
Before deploying the pipeline, wrap the classification pipeline in a function whose argument is the text data you want to classify. Here is how we set up the “classifier_text” function:
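A sketch of the wrapper function is below (the model ID is illustrative, as above; the exact body can be adapted to whatever pipeline you built):

```python
from transformers import pipeline

# Build the pipeline once at import time so the deployed function reuses it.
spam_classifier = pipeline(
    task="text-classification",
    model="mrm8488/bert-tiny-finetuned-sms-spam-detection",
)

def classifier_text(text: str) -> dict:
    """Classify a message and return its predicted label and score."""
    # The pipeline returns a list of dicts, one per input; unwrap the first.
    return spam_classifier(text)[0]
```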
Authenticate your notebook kernel and connect it to your workspace by calling the "modelbit.login()" API:
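The login call looks like the following; it prints an authentication link to open in your browser, and the returned client (named `mb` here, matching the rest of this article) is used for all subsequent Modelbit calls:

```python
import modelbit

# Prints a link; open it in a browser to approve this kernel's session.
mb = modelbit.login()
```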
In Modelbit, separate Git branches allow for independent work. They also streamline Git processes like code reviews and merges. To create a new branch, head to Modelbit's UI, click the branch dropdown in the top-right, and select "New Git Branch". Enter the name of your branch in the dialog pop-up.
Now, switch to the new branch you created (replace “your_branch” with the new branch name):
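A sketch of the branch switch, assuming `mb` is the client returned by `modelbit.login()`:

```python
# Point this session at the branch created in the Modelbit UI.
mb.switch_branch("your_branch")
```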
Next, deploy the spam classifier pipeline containing the BERT Model as a REST API endpoint with Modelbit.
Call "mb.deploy()" and pass it the deployment function ("classifier_text"). Modelbit detects all notebook dependencies and system packages, then pickles the Python functions and variables that "classifier_text" depends on.
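The deployment call itself is a single line, assuming `mb` and `classifier_text` are defined as in the earlier steps:

```python
# Ship the function (and everything it depends on) as a REST endpoint.
mb.deploy(classifier_text)
```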
Under the hood, the API call pushes your source code to Modelbit. Modelbit builds a container with the model weights, Python functions, and necessary dependencies to replicate the notebook environment in production:
For reproducibility, the transformers library version used in this demo is "4.34.1".
You should see an output similar to the following:
Next, verify the deployment by clicking the "View in Modelbit" button. It should direct you to the endpoint you deployed within your dashboard on Modelbit. There, select the “classifier_text” deployment to proceed. You might have to wait a few minutes for the container to build and for Modelbit to deploy the endpoint.
After your endpoint ships, you should see your new endpoint and instructions to send requests using cURL:
You can invoke the API programmatically using Python with the assistance of the "requests" package, or you can opt for "cURL" for the task. Check the Colab Notebook for how to use both request formats.
Use the “requests” library (change “ENTER_WORKSPACE_NAME” to your deployment workspace name):
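A sketch of the request is below. The URL pattern follows Modelbit's endpoint format at the time of writing, and the single-argument "data" request body is an assumption; check the cURL instructions shown on your deployment's page for the exact format your endpoint expects:

```python
import requests

# Replace ENTER_WORKSPACE_NAME with your workspace name.
url = "https://ENTER_WORKSPACE_NAME.app.modelbit.com/v1/classifier_text/latest"

response = requests.post(
    url,
    json={"data": "Congratulations! You won a free cruise. Reply WIN to claim."},
)
print(response.json())
```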
If everything works in your code, your output should look like this:
Looking at the “📚Log” panel, we can see that the API processed the request successfully:
That’s it! You have a real-time endpoint ready to classify messages as either “SPAM” or “HAM.”
As you saw in the intro, deploying BERT-based models can be challenging, but Modelbit, under the hood, ran the following for you:
All you had to do was call “mb.deploy()”, and everything required for production is provisioned and auto-scales to your requirements.
Before integrating your API into a product or client application, familiarize yourself with the security measures necessary to safeguard your API.
When you navigate to your Modelbit dashboard, you'll see a suite of features designed to operate your endpoint in a production environment. This includes functionality for logging prediction data, monitoring endpoint activity, and managing dependencies in your production environment.
Modelbit also offers integration with Arize, so you can monitor data drift, diagnose model issues in Arize, fix them, and quickly redeploy to production.
To maximize these resources, don't hesitate to explore your dashboard further and check out more tutorials in the documentation for comprehensive guidance.