Introduction
Welcome to our tutorial on deploying a machine learning (ML) model on Amazon Web Services (AWS) Lambda using Docker. In this tutorial, we will walk you through the process of packaging an ML model as a Docker container and deploying it on AWS Lambda, a serverless computing service.
By the end of this tutorial, you will have a working ML model that you can invoke through an API, and you will have gained a deeper understanding of how to deploy ML models on the cloud. Whether you are a machine learning engineer, data scientist, or developer, this tutorial is designed to be accessible to anyone with a basic understanding of ML and Docker. So, let’s get started!
What is Docker?
Docker is a tool that makes it easier to create, deploy, and run applications by using containers. A container packages an application together with everything it needs, such as libraries and other dependencies, and ships it as a single unit. Because the container carries its own dependencies, the application behaves the same on any machine that runs Docker, regardless of how that machine's settings differ from the one used to write and test the code. Containers are also lightweight and portable, which makes it easier to keep development, testing, and production environments consistent and to deploy applications quickly and reliably. Install Docker from https://docs.docker.com/get-docker/ before continuing.
What is AWS Lambda?
Amazon Web Services (AWS) Lambda is a serverless computing platform that runs code in response to events and automatically manages the underlying compute resources for you. It is a service offered by AWS that allows developers to run their code in the cloud without having to worry about the infrastructure required to run it. AWS Lambda automatically scales your applications in response to incoming request traffic, and you only pay for the computing time that you consume. This makes it an attractive option for building and running microservices, real-time data processing, and event-driven applications.
What is AWS ECR?
Amazon Web Services (AWS) Elastic Container Registry (ECR) is a fully-managed Docker container registry that makes it easy for developers to store, manage, and deploy Docker container images. It is a secure and scalable service that enables developers to store and manage Docker images in the AWS cloud and to easily deploy them to Amazon Elastic Container Service (ECS) or other cloud-based container orchestration platforms. ECR is integrated with other AWS services, such as Amazon ECS and Amazon EKS, and provides native support for the Docker command line interface (CLI). This makes it easy to push and pull Docker images from ECR using familiar Docker commands, and to automate the build, test, and deploy processes for containerized applications.
Install AWS-CLI
Install the AWS CLI on your system by following the official installation guide. Then create an IAM user in your AWS account to obtain an AWS Access Key ID and an AWS Secret Access Key. After installation, run the command below to configure the AWS CLI and enter the values it asks for (access key ID, secret access key, default region, and output format).
aws configure
Deploying a Lambda function with Docker
In this tutorial we deploy the OpenAI CLIP model to vectorize input text. We build on the AWS-provided base image public.ecr.aws/lambda/python:3.8, which is based on Amazon Linux 2 and already contains the components Lambda needs to run a Python function from a container. Also, because the Lambda filesystem is read-only apart from the /tmp directory, the function cannot download the model into the image at run time, so we download the model files and copy them into the image while building it.
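The actual handler ships with the tutorial's code archive. Purely as an illustration of the shape such a handler can take, here is a minimal sketch. The names `encode_text` and `extract_text` are hypothetical, and `encode_text` is a stand-in that returns a dummy vector so the sketch runs without the CLIP model; in a real image the model would be loaded once at import time from files copied in during the build.

```python
import json


def encode_text(text):
    """Hypothetical placeholder for the real CLIP text encoder.

    Returns a dummy two-element vector so the sketch runs without the model.
    """
    return [float(len(text)), float(len(text.split()))]


def extract_text(event):
    """Read the input text from a direct invocation or a function URL request."""
    # Direct invocation or console test event: {"text": "..."}
    if "text" in event:
        return event["text"]
    # Function URL request: query parameters arrive under queryStringParameters.
    params = event.get("queryStringParameters") or {}
    return params.get("text")


def handler(event, context):
    """Lambda entry point: validate the input and return the vector as JSON."""
    text = extract_text(event)
    if not text:
        return {"statusCode": 400, "body": json.dumps({"error": "missing 'text'"})}
    return {"statusCode": 200, "body": json.dumps({"vector": encode_text(text)})}
```

Accepting both event shapes means the same function can serve the local curl test, the console test event, and a function URL request with a `text` query parameter.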
Get the working code from here and extract it.
Change into the directory that contains the Dockerfile and run the command below:
docker build -t lambda_image .
We now have an image ready to deploy on Lambda. To test it locally first, run:
docker run -p 9000:8080 lambda_image
With the container running, send it a request with curl from a second terminal. It should return the vector for the input text:
curl -XPOST "http://localhost:9000/2015-03-31/functions/function/invocations" -d '{"text": "This is a test for text encoding"}'
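If you prefer to run the local check from Python, the same request can be sketched with the standard library. The helper names `build_local_invocation` and `invoke_local` are made up for this example, and the sketch assumes the container from the previous step is listening on port 9000 and replies with JSON.

```python
import json
import urllib.request

# Invocation path served by the local Lambda emulator in the AWS base images.
INVOCATION_PATH = "/2015-03-31/functions/function/invocations"


def build_local_invocation(text, host="http://localhost:9000"):
    """Build the same POST request that the curl command sends."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        host + INVOCATION_PATH,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def invoke_local(text):
    """Send the request to the running container and decode the JSON reply."""
    request = build_local_invocation(text)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `invoke_local("This is a test for text encoding")` while the container is running should return the same payload as the curl command.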
To deploy the image on Lambda, we first need to push it to ECR. Log in to your AWS account and create a repository named lambda_image in ECR. Then open the new repository and click the View push commands button to see the commands for pushing an image to it.
Run the first command to authenticate your Docker client through the AWS CLI.
We have already built the Docker image, so skip the second command and run the third to tag the image for the repository.
Finally, run the last command to push the image to ECR. Docker prints the upload progress of each layer while it runs.
Once the push is complete, you will see the image with the latest tag in the ECR repository.
Copy the image URI; we will need it when creating the Lambda function.
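For reference, the four push commands shown in the console generally follow the pattern below. This sketch only builds the command strings; the account ID and region in the usage note are placeholders, and the commands displayed for your own repository are the ones to trust.

```python
def ecr_push_commands(account_id, region, repository="lambda_image", tag="latest"):
    """Return the usual login, build, tag, and push commands for an ECR repository."""
    registry = f"{account_id}.dkr.ecr.{region}.amazonaws.com"
    remote_image = f"{registry}/{repository}:{tag}"
    return [
        # 1. Authenticate the Docker client against the registry.
        f"aws ecr get-login-password --region {region} | "
        f"docker login --username AWS --password-stdin {registry}",
        # 2. Build the image (skipped in this tutorial, it is already built).
        f"docker build -t {repository} .",
        # 3. Tag the local image with the repository URI.
        f"docker tag {repository}:{tag} {remote_image}",
        # 4. Push the tagged image to ECR.
        f"docker push {remote_image}",
    ]
```

For example, `ecr_push_commands("123456789012", "us-east-1")` uses the placeholder account ID 123456789012; the last element of the result is the push command, and the image URI to copy is the part after `docker push`.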
Now open the Lambda console and click Create function. Because we are creating the function from an image, select the Container image option. Enter a name for the function and paste the image URI copied from ECR, or browse for the image instead. Select the x86_64 architecture and click Create function.
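The same step can also be scripted. The sketch below assumes boto3 is installed and that you already have an IAM execution role for the function; the console creates one for you, but the API needs its ARN. The helper names and all argument values are hypothetical.

```python
def build_create_function_request(function_name, image_uri, role_arn):
    """Assemble the arguments for creating a Lambda function from a container image."""
    return {
        "FunctionName": function_name,
        "PackageType": "Image",           # deploy from a container image, not a zip
        "Code": {"ImageUri": image_uri},  # the URI copied from ECR
        "Role": role_arn,                 # execution role ARN (assumed to exist)
        "Architectures": ["x86_64"],
    }


def create_function(function_name, image_uri, role_arn):
    """Create the function with boto3 (needs AWS credentials and network access)."""
    import boto3  # imported lazily so the sketch loads without boto3 installed

    client = boto3.client("lambda")
    return client.create_function(
        **build_create_function_request(function_name, image_uri, role_arn)
    )
```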
Creating the function might take some time, so be patient. When it finishes, the console shows the page for the new function.
By default, a Lambda function has a timeout of 3 seconds and 128 MB of memory. That is too little for loading and running the model, so we need to increase both; otherwise the invocation fails with an error. To do so, go to the Configuration tab and click Edit under General configuration.
Now set the timeout to 5-10 minutes (the maximum is 15 minutes) and the memory to 2-3 GB, then click Save. It will take a moment for the new configuration to take effect.
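As an alternative to the console, the same settings can be applied with boto3. This is a sketch under the assumption that boto3 is installed; the helper names are made up for illustration. Note that the API expects the timeout in seconds and the memory in megabytes.

```python
MAX_TIMEOUT_SECONDS = 900                  # Lambda's 15-minute limit
MIN_MEMORY_MB, MAX_MEMORY_MB = 128, 10240  # allowed memory range


def build_config_update(function_name, timeout_minutes, memory_gb):
    """Convert minutes and GB into the units the Lambda API expects."""
    timeout = int(timeout_minutes * 60)
    memory = int(memory_gb * 1024)
    if not 1 <= timeout <= MAX_TIMEOUT_SECONDS:
        raise ValueError("timeout must be between 1 second and 15 minutes")
    if not MIN_MEMORY_MB <= memory <= MAX_MEMORY_MB:
        raise ValueError("memory must be between 128 MB and 10240 MB")
    return {"FunctionName": function_name, "Timeout": timeout, "MemorySize": memory}


def apply_config_update(function_name, timeout_minutes, memory_gb):
    """Send the update with boto3 (needs AWS credentials and network access)."""
    import boto3  # imported lazily so the sketch loads without boto3 installed

    client = boto3.client("lambda")
    return client.update_function_configuration(
        **build_config_update(function_name, timeout_minutes, memory_gb)
    )
```

For instance, `build_config_update("clip-encoder", 5, 2)` yields a timeout of 300 seconds and 2048 MB of memory, where `clip-encoder` is a placeholder function name.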
After the changes are applied, the function is ready to test. Go to the Test tab and set the Event JSON to {"text": "This is a test for text encoding"}. Then click the Test button.
Because this is the first time the function runs, it might take a while to execute. After it succeeds, you will see the vector for the input text in the execution result.
Now our lambda function is deployed and working properly. To access it via API we’ll need to create a function URL.
To create the URL, go to the Configuration tab and select Function URL, then click Create function URL.
For now, set the Auth type to NONE and click Save. Keep in mind that with this setting anyone who knows the URL can invoke the function.
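For completeness, here is a rough boto3 equivalent of this step, assuming boto3 is installed. The console adds the public-access permission automatically when the auth type is NONE, whereas a script has to add it explicitly. The statement ID and helper names are hypothetical, and the exact permissions required may differ for your account.

```python
def build_function_url_requests(function_name):
    """Assemble the two API requests behind a public function URL."""
    url_config = {"FunctionName": function_name, "AuthType": "NONE"}
    permission = {
        "FunctionName": function_name,
        "StatementId": "AllowPublicFunctionUrl",  # hypothetical statement ID
        "Action": "lambda:InvokeFunctionUrl",
        "Principal": "*",                         # anyone may call the URL
        "FunctionUrlAuthType": "NONE",
    }
    return url_config, permission


def create_public_function_url(function_name):
    """Create the URL with boto3 (needs AWS credentials and network access)."""
    import boto3  # imported lazily so the sketch loads without boto3 installed

    client = boto3.client("lambda")
    url_config, permission = build_function_url_requests(function_name)
    response = client.create_function_url_config(**url_config)
    client.add_permission(**permission)
    return response["FunctionUrl"]
```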
After the process is done, you will get a URL for accessing the Lambda function via API. Here is sample Python code that calls the function through that URL:
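import requests
# Paste the function URL shown in the Lambda console here.
function_url = ""
# Pass the text as a query parameter so that requests URL-encodes the spaces.
params = {"text": "this is test text"}
response = requests.get(function_url, params=params)
# Raise an exception on HTTP errors instead of printing an error body.
response.raise_for_status()
print(response.text)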
After the code runs successfully, it prints the vector for the input text.
That concludes this example of deploying ML models on AWS Lambda using Docker. Do let us know if you have any questions.