Diving into the world of machine learning and AI, whether as an engineer, data scientist, or AI specialist, often means one thing: bringing your ML models to life through deployment to production environments.
Every week, there's a new "Eureka!" moment in AI, signaling a breakthrough. Yet, crafting these state-of-the-art models is only the starting point. The true measure of their worth unfolds when they're seamlessly integrated into production, addressing real-world problems and enhancing business solutions. Picking the right tools for deployment can make all the difference in ensuring the effectiveness of your models.
There are vital questions that arise when selecting your deployment strategy:
Addressing these questions is essential before settling on a deployment strategy. Today, end-to-end ML platforms like Modelbit and SageMaker, as well as open-source solutions like Ray Serve, BentoML, and Seldon, are available to ease the deployment phase.
While several end-to-end ML model deployment platforms have come onto the scene over the last few years, Amazon SageMaker has long stood as the de facto choice for ML teams who want to consolidate onto one platform. Yet you do not need to dive too far into the various ML communities on Slack or Reddit to learn that SageMaker is not universally loved and that an alternative is in demand.
In this in-depth comparison, we will dissect the capabilities, workflows, pricing structures, and real-world use cases of Amazon SageMaker and Modelbit. By the end of this article, you'll have the knowledge needed to make an informed choice between the two tools for your machine learning models.
This comparison will also guide you through the steps to deploy an ML model in both SageMaker and its alternative, Modelbit.
Amazon SageMaker is marketed as a comprehensive machine learning platform that runs on the AWS Cloud. It provides an ecosystem to build, train, and deploy machine learning models for any use case with fully managed infrastructure, tools, and workflows.
Machine learning and AI engineers will, however, face some unique challenges when deploying machine learning models to production environments with Amazon SageMaker. In this section, you will learn about the limitations most ML engineers run into when using SageMaker, drawn from real-world examples, personal experiences, and user feedback.
Here are four pain points we hear users face when deploying models with SageMaker:
With SageMaker comes the burden of learning how to use different AWS services to deploy your model to production successfully:
Amazon SageMaker also has several AI/ML components running under the hood for end-to-end machine learning workflows. We have heard users complain that the many SageMaker services do not all play well together. You may find it cumbersome to iterate across the entire workflow using different components before you can deploy a model, especially if you only want to deploy one.
The web-serving framework SageMaker provides could be more intuitive to use. We have met several CTOs who said they had to hire front-end engineers to build and wrap a custom UI around SageMaker (as well as Databricks) so that their teams could use it.
As with most services powered by big public cloud providers, vendor lock-in within the AWS ecosystem is a crucial concern for our users regarding SageMaker. The SageMaker API only works within that ecosystem, and how well it interoperates with other AWS services ranges from tight to moderate, depending on the service.
A key concern with vendor lock-in is that not all services may be well-developed enough to solve different aspects of your stack. So if you use a component and need the best service for your workflow, you might not be able to leverage external tools without some operational costs.
Large model deployments on SageMaker frequently escalate operational costs because they might require more resource-intensive instance types. Your SageMaker experience is fully managed, which means many operational details are abstracted away from you. Features such as data processing, batch transform, notebook instances, training, and feature stores each come with their own cost. For end-to-end ML workflows, these costs add up quickly and significantly increase the operational cost of large model deployments on SageMaker.
AWS offers detailed pricing for each feature, but you are responsible for familiarizing yourself with the associated costs and actively monitoring expenses using tools like AWS Cost Explorer.
To avoid such “hidden costs,” carefully consider the resource requirements of your large model before deploying it on SageMaker. Also, consider SageMaker's built-in cost optimization features to help reduce your costs.
SageMaker's complexity often presents a challenge for many teams. This intricacy hinders the ability to prototype swiftly, forcing ML teams to adapt and reshape their workflow around the tool, rather than the tool enhancing their processes.
We have had discussions with users who find it challenging to move medium- to large-scale models through SageMaker components to quickly deploy, update, and ship new features without operational overhead. They cannot prototype with new model types rapidly because they have workflows configured only to support specific models.
In particular, they have had to write custom code and automations in order to make SageMaker work for them, and that code makes assumptions about model types and resource constraints. Those assumptions then get violated when the team wants to deploy new types of models. SageMaker can require such low-level configuration just to get working that the cost of changing its configurations to adapt to new model types becomes prohibitive.
Let’s look at SageMaker alternatives for shipping your models to production in the next section.
Beyond Amazon SageMaker, other deployment platforms are available for hosting your machine learning models as endpoints. In this section, you will learn about some alternatives to AWS SageMaker Inference.
Here are other options:
Modelbit simplifies deploying and managing machine learning models in production. It emphasizes usability and simplicity—quickly deploy models as REST APIs with an intuitive and user-friendly interface. This ease of use speeds up the deployment process to move ML models to market.
Modelbit also prioritizes monitoring and management, with features for keeping close tabs on the health and performance of the models you deploy. This is critical for maintaining model reliability in production and meeting service level agreements (SLAs).
In terms of pricing, you only pay for what you use. Modelbit has a monthly and an annual pricing model. Modelbit customers can also prepay for compute at a discounted rate.
Ray is an open-source, all-encompassing computing framework for scaling various AI and Python workloads. It offers a seamless platform for extending the capabilities of AI and Python applications, covering a wide range of tasks, including reinforcement learning, deep learning, hyperparameter tuning, and model deployment.
Ray Serve is built on top of Ray. You deploy a machine learning model by applying the deployment decorator (`@serve.deployment`) to a Python class that contains the prediction logic, then defining an application (one or more deployments that handle inbound traffic). It serves large language models and “traditional” deep learning models.
No direct pricing is associated with using the open-source Ray libraries, but Anyscale recently started providing Ray Serve as a service.
Triton Inference Server is free (but the compute is, of course, not) and open to the community for use and contribution. NVIDIA also provides the option to purchase NVIDIA AI Enterprise, which includes Triton Inference Server, along with a suite of enhanced features and support services to meet the specific needs of businesses looking for comprehensive AI solutions.
In this comparative analysis, we will compare the essential features of Modelbit and Amazon SageMaker to help you choose the right solution for your model deployment.
These features are the criteria for comparison:
We decided to compare Amazon SageMaker and Modelbit’s deployment capabilities based on these features because we see them repeatedly come up in conversations with users and in broad discussions in communities that widely use both platforms.
Let’s compare! 👀
Phew! Now that you understand how Modelbit and SageMaker’s Inference options stack up, let’s put our concerns and comparisons into practice by comparing the workflows for deploying the same model.
Head over to the fun section 👇.
It’s time to see both SageMaker and Modelbit in action! In this section, you will deploy an XGBoost model for a diabetes binary classification problem on the popular “Diabetes Dataset.” You will build and deploy the same model with Amazon SageMaker and Modelbit to practically compare the workflow for both platforms.
For a balanced comparison, we will build and deploy the same model with the same hyperparameters and data preprocessing code.
Let’s start with Amazon SageMaker.
First step, let’s set up the data and development environment.
Create an S3 bucket to store your data:
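If you prefer to create the bucket from code rather than the console, here is a minimal sketch using Boto3's `create_bucket` call. The helper name `create_bucket` and its arguments are our own for illustration. It takes the S3 client as a parameter so you can pass in `boto3.client("s3")` yourself.

```python
def create_bucket(s3_client, bucket_name, region):
    """Create an S3 bucket in the given region with a Boto3 S3 client."""
    if region == "us-east-1":
        # us-east-1 is the default location and must not be sent
        # as a LocationConstraint.
        return s3_client.create_bucket(Bucket=bucket_name)
    # Every other region has to be named explicitly.
    return s3_client.create_bucket(
        Bucket=bucket_name,
        CreateBucketConfiguration={"LocationConstraint": region},
    )
```

In a notebook you might call it as `create_bucket(boto3.client("s3", region_name="eu-west-1"), "my-diabetes-demo-bucket", "eu-west-1")`, where the bucket name is a placeholder. Bucket names are globally unique, so pick your own.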
Create an Amazon SageMaker notebook instance. To do this, access the AWS Management Console and search for "SageMaker." This action will allow you to create a SageMaker notebook environment for development and model deployment.
After successfully setting up your SageMaker notebook instance, the next step is to ensure that your IAM (Identity and Access Management) role has the necessary permissions to access data in the S3 bucket.
Navigate to the IAM section in the AWS Management Console. Locate and select the IAM role associated with your SageMaker instance. Attach the appropriate S3 permissions to grant your SageMaker notebook the required access to the contents stored in the designated S3 bucket. This access is vital for effectively handling and utilizing the data within your SageMaker environment.
Here, the notebook's filename is "SageMaker-Deployment." Once you create the notebook, import your dataset from S3.
One common approach is to download the file from your S3 storage into your notebook's working directory. Subsequently, you can utilize a library like Pandas to read and manipulate the dataset.
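As a rough sketch of that approach, the helpers below download an object with Boto3's `download_file` and preview the first rows. The function names, bucket, key, and local path are placeholders we made up. The preview uses the standard-library `csv` module only so the sketch runs without extra dependencies.

```python
import csv


def download_dataset(s3_client, bucket, key, local_path):
    """Copy one S3 object into the notebook's working directory."""
    s3_client.download_file(Bucket=bucket, Key=key, Filename=local_path)
    return local_path


def preview_rows(local_path, n=5):
    """Return the first n rows of the CSV as dictionaries keyed by column name."""
    with open(local_path, newline="") as f:
        reader = csv.DictReader(f)
        return [row for _, row in zip(range(n), reader)]
```

In the notebook itself, `pandas.read_csv(local_path)` is the natural replacement for `preview_rows` once the file is on disk.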
Find the complete code for this section in this Colab notebook.
Create your AWS SageMaker session and initialize the IAM execution role:
Amazon SageMaker provides a default S3 bucket, which you can access with `sagemaker.Session().default_bucket()`. To streamline the process, use the following code block to upload the CSV files you downloaded locally in your Jupyter instance to this bucket.
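A minimal sketch of the upload step, assuming version 2.x of the SageMaker Python SDK, could look like this. The helper name `upload_splits`, the file names, and the key prefix are illustrative choices of ours. `default_bucket()` and `upload_data()` are methods on the SDK's `Session` object.

```python
def upload_splits(session, local_files, key_prefix):
    """Upload local CSV splits to the session's default bucket.

    local_files maps a channel name (for example "train") to a local path.
    Returns a dict mapping each channel name to the S3 URI of its upload.
    """
    bucket = session.default_bucket()
    return {
        name: session.upload_data(path=path, bucket=bucket, key_prefix=key_prefix)
        for name, path in local_files.items()
    }
```

With a real session, created via `session = sagemaker.Session()` and `role = sagemaker.get_execution_role()`, you would call something like `upload_splits(session, {"train": "train.csv", "validation": "validation.csv"}, "diabetes-demo")`.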
This step is essential for making the data accessible within the SageMaker environment. With the data successfully uploaded to the default S3 bucket, run the training code in the Colab notebook.
Here’s the code to define your model (a SageMaker Estimator) and set the hyperparameters of the XGBoost model:
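The sketch below shows one way such an estimator could be put together with version 2.x of the SageMaker Python SDK and the built-in XGBoost container. The hyperparameter values, container version, and training instance type are assumptions for illustration, not the values from the original notebook.

```python
def xgboost_hyperparameters():
    """Illustrative hyperparameters for a binary classifier (assumed values)."""
    return {
        "objective": "binary:logistic",  # output a probability for the positive class
        "num_round": 100,
        "max_depth": 5,
        "eta": 0.2,
        "subsample": 0.8,
    }


def build_estimator(session, role, output_path):
    """Build an Estimator backed by the built-in XGBoost container."""
    # Imported here so the sketch can be read and loaded without the SDK installed.
    from sagemaker import image_uris
    from sagemaker.estimator import Estimator

    image_uri = image_uris.retrieve(
        framework="xgboost",
        region=session.boto_region_name,
        version="1.7-1",  # assumed container version
    )
    estimator = Estimator(
        image_uri=image_uri,
        role=role,
        instance_count=1,
        instance_type="ml.m4.xlarge",  # assumed training instance type
        output_path=output_path,
        sagemaker_session=session,
    )
    estimator.set_hyperparameters(**xgboost_hyperparameters())
    return estimator
```

Keeping the hyperparameters in their own function makes it easier to reuse the exact same values when you train the model for the Modelbit deployment.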
To initialize training, fit the estimator on the training and validation splits:
The training process may take some time to complete, depending on the size of your data. Once it completes, you should see an output similar to the one below.
Once training is complete, deploy the model by calling `.deploy()` on the XGBoost SageMaker estimator you just fitted:
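As a small sketch of that call, assuming a fitted estimator from version 2.x of the SageMaker Python SDK, the wrapper below requests one `ml.m4.xlarge` instance. The wrapper name and the optional `endpoint_name` argument are ours.

```python
def deploy_estimator(estimator, endpoint_name=None):
    """Deploy a fitted estimator as a real-time endpoint and return the predictor."""
    return estimator.deploy(
        initial_instance_count=1,      # a single instance behind the endpoint
        instance_type="ml.m4.xlarge",
        endpoint_name=endpoint_name,   # None lets SageMaker generate a name
    )
```

The call blocks until the endpoint is in service, which typically takes several minutes.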
The code deploys your model on a single ml.m4.xlarge instance.
Perfect! You have successfully deployed your AWS SageMaker model as an endpoint. Confirm deployment by heading to the SageMaker console > Inference > Endpoints.
After creating the endpoint, you can test it using Amazon SageMaker Studio, the AWS SDK, or the AWS CLI.
You would have to configure the endpoint to be accessible and test it from your applications.
Test your SageMaker endpoint using the AWS SDK (Boto3). First, authenticate the request with your AWS access key ID and secret access key.
After successful authentication, pass a payload to the SageMaker endpoint.
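A minimal sketch of such a request is shown below. It assumes the endpoint runs the built-in XGBoost container, which accepts a `text/csv` row and returns the predicted probability as text. The helper name and the feature values are hypothetical.

```python
def predict_csv(runtime_client, endpoint_name, features):
    """Send one row of features to the endpoint and return the predicted probability."""
    body = ",".join(str(value) for value in features)
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=body,
    )
    # The response body is a stream; a single-row request yields a single number.
    return float(response["Body"].read().decode("utf-8"))
```

Here `runtime_client` would be `boto3.client("sagemaker-runtime")`, which picks up your credentials from the environment or your AWS configuration files.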
With more complex applications, you might need to create and manage APIs using Amazon API Gateway, create an execution role for the REST API, a mapping template for response integration, and deploy the API.
Remember to delete your endpoint when you are done with this demo to save costs. Delete the endpoint and its endpoint configuration from your notebook:
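One way to do the cleanup with Boto3 is sketched below. The helper name is ours. `describe_endpoint`, `delete_endpoint`, and `delete_endpoint_config` are operations on the Boto3 `sagemaker` client.

```python
def delete_endpoint_resources(sm_client, endpoint_name):
    """Delete an endpoint and the endpoint configuration it was created from."""
    # Look up the configuration name before the endpoint disappears.
    description = sm_client.describe_endpoint(EndpointName=endpoint_name)
    config_name = description["EndpointConfigName"]
    sm_client.delete_endpoint(EndpointName=endpoint_name)
    sm_client.delete_endpoint_config(EndpointConfigName=config_name)
    return config_name
```

Pass in `boto3.client("sagemaker")` as `sm_client`. The model artifacts in S3 are not touched by this helper, so remove them separately if you no longer need them.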
Interested in learning how to deploy SageMaker models to Modelbit? Head over to our detailed tutorial: Deploying models built with AWS SageMaker
Modelbit gives you the option to deploy ML models as REST API endpoints directly from your notebooks using Python and Git APIs. In this section, you will deploy your model with a few lines of code from a Colab notebook to highlight the simplicity and quick time-to-market features.
Modelbit offers a free plan—sign up if you haven't already. It provides a fully custom Python environment backed by your git repo.
Install the Modelbit package in your Google Colab (or Jupyter) notebook by running `!pip install modelbit` in a cell.
Follow the steps in this Colab notebook to load the sample dataset, train, and tune the XGBoost model.
Log in to Modelbit and create a development ("dev") or staging ("stage") branch for staging your deployment. Learn how to work with branches in the docs.
If you cannot create a “dev” branch, you can use the default "main" branch for your deployment:
You should see a link to authenticate your kernel to connect to Modelbit. Click on that link to authenticate the notebook kernel.
After successful authentication, you should see an onboarding screen if it’s your first time using Modelbit or your dashboard if you are an existing user.
Now, you are ready to deploy the model! First, create a deployment function. This is necessary because `modelbit.deploy()` takes a callable deployment function as a parameter.
In this case, define the `diabetes_likelihood_prediction()` function, which takes a patient’s features and predicts the likelihood of diabetes (hypothetically, of course).
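A sketch of what such a function could look like is shown below. The argument names assume the eight columns of the widely used Pima Indians diabetes dataset, and `PlaceholderModel` is a stand-in we made up so the example runs on its own. In the notebook, `model` would be the XGBoost classifier you trained.

```python
class PlaceholderModel:
    """Stand-in for the trained classifier. The rule is arbitrary and has no clinical meaning."""

    def predict(self, rows):
        # Index 1 is the glucose column in the row layout used below.
        return [1 if row[1] > 125 else 0 for row in rows]


model = PlaceholderModel()  # replace with your trained XGBoost model


def diabetes_likelihood_prediction(pregnancies, glucose, blood_pressure,
                                   skin_thickness, insulin, bmi,
                                   diabetes_pedigree, age):
    """Return 1 if the model predicts diabetes for this patient, otherwise 0."""
    row = [pregnancies, glucose, blood_pressure, skin_thickness,
           insulin, bmi, diabetes_pedigree, age]
    return int(model.predict([row])[0])
```

Returning a plain `int` keeps the response JSON-serializable, which matters once the function sits behind a REST endpoint.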
You are now production-ready! 🚀 Pass the model prediction function `diabetes_likelihood_prediction` and the project dependencies to the `mb.deploy()` API.
Deploy your prediction function:
Calling `mb.deploy(diabetes_likelihood_prediction)` detects all your notebook dependencies, copies the environment configuration, and deploys the model and metadata files to Modelbit.
Modelbit runs a container build and creates a REST endpoint to access your model.
If everything works correctly, you should see the following output:
Test the endpoint by sending a request to the endpoint:
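As a hedged sketch, the request could be built with the standard library as shown below. The `{"data": ...}` payload shape and the URL pattern in the usage note are assumptions on our part, so copy the exact URL and request format from your deployment's page in Modelbit.

```python
import json
import urllib.request


def build_prediction_request(endpoint_url, features):
    """Build a JSON POST request that carries one row of features."""
    payload = json.dumps({"data": features}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_prediction_request(request, timeout=30):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

You would call `build_prediction_request` with a placeholder such as `"https://<your-workspace>.app.modelbit.com/v1/diabetes_likelihood_prediction/latest"` replaced by your real endpoint URL, then pass the result to `send_prediction_request`.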
The response contains the model's prediction. Here the value is `1`, which means the model classifies the patient as likely to have diabetes.
Check the “📚Logs” panel in the Modelbit UI to see real-time logs of every request made to your endpoint.
With two steps, `modelbit.login()` and `modelbit.deploy()`, you have a live production endpoint that:
The best part? You can achieve all of this within your notebook environment without changing your current tech stack!
Need to ship a new model to the endpoint? Simply switch to a new git branch, and all deployments from your notebook will go through that branch. That's it! 😎.
From the analysis and code samples in this article, deploying machine learning models with Modelbit looks simpler than with AWS SageMaker, although you can be the judge of that.
Here’s a recap of some notable advantages of using Modelbit:
1. Lightweight and intuitive deployment: The intricacies of SageMaker often pose challenges for its users. Once it's integrated into a workflow, making any modifications can feel like navigating a maze. Modelbit simplifies the deployment process. Instead of getting bogged down with endless configurations, you're just a few clicks away from deployment, thanks to its lightweight design that seamlessly integrates with your existing workflow.
2. Large model support: Modelbit’s lightweight nature does not stop it from deploying large models. It provides robust support for resource-intensive models, and its one-click deployment can remove some of the complexity that large models usually bring. Still, compare this with SageMaker Inference’s options to make an informed decision.
3. Affordable deployment: Users complain about paying a premium for inference with SageMaker—especially for large models. With Modelbit, the pricing structure accommodates various project sizes and budgets. This means more flexibility and cost savings for your production workloads.
4. On-demand compute: One reason users stick with SageMaker is the availability of many instance types. Modelbit provides CPU and GPU compute resources on demand that autoscale to your training and production workloads. Compute is optimized to support large deployments.
5. Platform agnosticism: Another concern users have with SageMaker services is vendor lock-in within the AWS ecosystem. Modelbit allows you to deploy your models from anywhere your notebooks run, or perform inference anywhere your models live.
Modelbit’s affordability, simplicity, and support for small and large models make it an ideal alternative to Amazon SageMaker for model deployment. Whether you’re part of a small team seeking a cost-effective solution or a large team dealing with resource-intensive models, Modelbit’s products should cater to your production requirements.
Interested in exploring Modelbit further? The Getting Started guide is a good starting point. You can get started for free without the need to set up an entire SageMaker account and enable billing to deploy models.