From the beginning, the Manifold team set the tone for our work together and has consistently exceeded our expectations. This platform has given us the ability to own end-to-end model lifecycle, and empower multiple parts of the organization to consume these technologies in a scalable, reliable, and repeatable manner.
— VP, Data Engineering and Analytics
Our client is a leading entertainment company that works with both consumers and advertisers. Their existing machine learning (ML) models ran on third-party infrastructure and proprietary software, which impeded their internal data science teams' efforts to improve model performance. The infrastructure was also managed manually rather than automated, which delayed model updates and introduced human error.
Our client partnered with Manifold so they could own an end-to-end, automated infrastructure for model development and deployment. Making this change required:
- Migrating existing models to the client's Amazon Web Services (AWS) environment
- Rebuilding the models on a more developer-friendly tech stack
In three months, the Manifold team, consisting of three machine learning engineers and an infrastructure engineer, re-implemented the previous proprietary modeling work as a Python code base that used containerization for local development and deployment. A new container-based model deployment pipeline was built in the client's AWS environment, using AWS CodeBuild and Elastic Container Registry (ECR).
The client was then able to easily run predictions from within their own infrastructure, using trained models that outperformed the previous versions. Groups throughout the company are beginning to use the infrastructure—easily onboarding their data science teams and ensuring the company has one set of pipelines to manage for analytical work.
Solution
The required engineering work had two main components: model development and model deployment.
The Manifold team architected and built a flexible ML pipeline that allowed for rapid iteration cycles of experimentation and analysis. The ML pipeline was completely container-based, including local development environments and remote training environments. Because the model artifacts were small, they were baked directly into streamlined inference-time images for deployment. The inference-time images were also built to a deployment pipeline specification with standard entry points for running predictions.
With a standardized inference image specification, the Manifold team was able to build a flexible container-based deploy-and-run pipeline that could support future modeling work that followed the same inference image specification:
Figure 1: Inference image specification
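To make the idea of a standard entry point concrete, here is a minimal sketch of what a command-line interface inside an inference image might look like. It is not the specification from the figure: the `predict` subcommand, the `MODEL_PATH` variable, the default artifact path, and the toy JSON linear model are all assumptions chosen for illustration.

```python
import argparse
import json
import os

# Hypothetical location of the model artifact baked into the image.
DEFAULT_MODEL_PATH = "/opt/model/artifact.json"


def build_parser():
    """Build the CLI that every inference image would expose (assumed shape)."""
    parser = argparse.ArgumentParser(prog="inference")
    commands = parser.add_subparsers(dest="command", required=True)
    predict = commands.add_parser("predict", help="Score a batch of records")
    predict.add_argument(
        "--model-path",
        default=os.environ.get("MODEL_PATH", DEFAULT_MODEL_PATH),
    )
    predict.add_argument("--input", required=True, help="JSON list of records")
    predict.add_argument("--output", required=True, help="Where to write scores")
    return parser


def score(model, record):
    """Toy linear model standing in for the real trained artifact."""
    weights = model["weights"]
    return model["bias"] + sum(
        weight * record.get(name, 0.0) for name, weight in weights.items()
    )


def main(argv=None):
    """Parse arguments, load the baked-in artifact, and write predictions."""
    args = build_parser().parse_args(argv)
    with open(args.model_path) as handle:
        model = json.load(handle)
    with open(args.input) as handle:
        records = json.load(handle)
    predictions = [score(model, record) for record in records]
    with open(args.output, "w") as handle:
        json.dump(predictions, handle)
    return 0
```

In an image built this way, the container's entry point would call `main()`, so any pipeline that knows the `predict` arguments could run any model that follows the same convention.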
The Manifold team built a flexible and robust serverless model deployment pipeline using the following native AWS services:
- Elastic Container Registry (ECR)
- Simple Storage Service (S3)
- CloudWatch Logs
- Elastic File System (EFS)
- Systems Manager Parameter Store
- Aurora for PostgreSQL
Figure 2: Deployment infrastructure diagram
In this pipeline, once a machine learning engineer has finished experimentation and trained a release candidate model, they run a script that builds a new production-ready inference-time image with all development and testing libraries and packages removed. These inference images are then tagged and pushed to ECR for testing in a staging environment.
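The tag-and-push step can be sketched as a small helper that assembles the Docker commands. This is an illustration rather than the team's actual script: the function names, account ID, region, and repository name are hypothetical, and the commands are only constructed here, not executed.

```python
def ecr_image_uri(account_id, region, repository, tag):
    """Return the fully qualified ECR URI for an image tag."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


def release_commands(local_image, account_id, region, repository, tag):
    """Build the docker commands that tag a local image and push it to ECR.

    Assumes the Docker client has already authenticated against the registry.
    """
    remote = ecr_image_uri(account_id, region, repository, tag)
    return [
        ["docker", "tag", local_image, remote],
        ["docker", "push", remote],
    ]


# Hypothetical values, for illustration only.
commands = release_commands(
    local_image="inference:candidate",
    account_id="123456789012",
    region="us-east-1",
    repository="models/inference",
    tag="1.4.0",
)
for command in commands:
    print(" ".join(command))
```

Each command list could be passed to `subprocess.run(command, check=True)` once the registry login has succeeded.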
With the tagged image in ECR, the engineer updates a staging CodeBuild job to use the new semantic version tag of the release candidate and runs the release candidate with a default set of environment variables. The job's build specification stores output in CloudWatch Logs and pushes artifacts to S3. As part of the job configuration, sensitive build parameters such as database credentials are pulled in at runtime from AWS Systems Manager Parameter Store and loaded as environment variables. Lastly, the build specification mounts an EFS volume via NFS so that non-S3 build artifacts that other systems need to access persist across job runs.
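One way to express such a run programmatically is to build the arguments for the CodeBuild `StartBuild` API, which accepts an image override and typed environment variables. Note that this sketch overrides the image per build, which is a variation on updating the job itself as described above. The project name, variable names, and parameter paths are hypothetical.

```python
import re

# Plain MAJOR.MINOR.PATCH versions only; pre-release suffixes are out of scope.
SEMVER = re.compile(r"^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)$")


def staging_build_request(project, repository_uri, version, plain_env, secret_params):
    """Assemble keyword arguments for CodeBuild's StartBuild call.

    plain_env maps variable names to literal values.
    secret_params maps variable names to Parameter Store parameter names,
    which CodeBuild resolves at runtime.
    """
    if not SEMVER.match(version):
        raise ValueError(f"not a semantic version: {version!r}")
    variables = [
        {"name": name, "value": value, "type": "PLAINTEXT"}
        for name, value in sorted(plain_env.items())
    ]
    variables += [
        {"name": name, "value": parameter, "type": "PARAMETER_STORE"}
        for name, parameter in sorted(secret_params.items())
    ]
    return {
        "projectName": project,
        "imageOverride": f"{repository_uri}:{version}",
        "environmentVariablesOverride": variables,
    }


# Hypothetical names and paths, for illustration only.
request = staging_build_request(
    project="model-staging",
    repository_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/models/inference",
    version="1.4.0",
    plain_env={"RUN_MODE": "batch"},
    secret_params={"DB_PASSWORD": "/staging/analytics/db_password"},
)
# With AWS credentials configured, the build could be started with:
#   boto3.client("codebuild").start_build(**request)
```

Keeping the credentials as `PARAMETER_STORE` references means the secret values never appear in the request itself, only the names of the parameters that hold them.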
Once the CodeBuild job runs the inference image in a container via the standard interface, the resulting predictions are written to an Aurora analytics database for downstream consumption. One of the downstream consumers is a Lambda job that performs a periodic ETL process to copy certain fields to an application database that is used by an Elastic Beanstalk application for visualizing various predictions.
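The field-copying ETL step might look roughly like the sketch below. It uses in-memory SQLite connections from the Python standard library as stand-ins for the Aurora analytics database and the application database so that the example is self-contained. The table names, column names, and the full-refresh strategy are assumptions, not details from the project.

```python
import sqlite3

# Hypothetical subset of prediction fields that the application needs.
COPIED_FIELDS = ("item_id", "score")


def copy_predictions(source, target, fields=COPIED_FIELDS):
    """Copy selected prediction columns from the analytics DB to the app DB.

    Performs a full refresh inside one transaction and returns the row count.
    """
    columns = ", ".join(fields)
    placeholders = ", ".join("?" for _ in fields)
    rows = source.execute(f"SELECT {columns} FROM predictions").fetchall()
    with target:  # commits on success, rolls back on error
        target.execute("DELETE FROM app_predictions")
        target.executemany(
            f"INSERT INTO app_predictions ({columns}) VALUES ({placeholders})",
            rows,
        )
    return len(rows)


# Stand-in databases with hypothetical schemas, for illustration only.
analytics = sqlite3.connect(":memory:")
analytics.execute(
    "CREATE TABLE predictions (item_id TEXT, score REAL, model_version TEXT)"
)
analytics.executemany(
    "INSERT INTO predictions VALUES (?, ?, ?)",
    [("a", 0.9, "1.4.0"), ("b", 0.1, "1.4.0")],
)
application = sqlite3.connect(":memory:")
application.execute("CREATE TABLE app_predictions (item_id TEXT, score REAL)")

copied = copy_predictions(analytics, application)
```

In a Lambda deployment, a scheduled handler would open connections to the two real databases and call a function like `copy_predictions` on each invocation.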
With a serverless design for the machine learning model deployment pipeline, the client can now easily deploy new models to development, staging, and production environments and run them there without having to worry about scaling and server management.
- Using an internal development and deployment infrastructure provides much more flexibility and scalability
- An automated infrastructure speeds up model updates and reduces human error
- The AWS suite of services, including AWS CodeBuild and ECR, provides tools for building a serverless machine learning model deployment pipeline