This repository provides practice code for building MLOps on AWS, with a specific focus on the operations side of MLOps. The Python application code references the following GitHub repository: https://github.com/nsakki55/aws-mlops-handson
We guide you through setting up a Python development environment that ensures code quality and maintainability. This environment is carefully configured to enable efficient development practices and facilitate collaboration.
This repository includes the implementation of a training pipeline covering data preprocessing, model training, and evaluation. Multiple models can be trained in parallel, and the inference server can then be updated with the new models. DynamoDB is used to version-control your models and to run a sanity check before model serving.
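As a rough illustration of the versioning step, the sketch below writes a model-version record to DynamoDB with boto3. The table name, key schema, and attributes here are assumptions for illustration only; the repository's actual schema may differ.

```python
# Minimal sketch: register a trained model version in DynamoDB
# (table name and attributes are illustrative, not the repo's real schema).
from datetime import datetime, timezone

import boto3

dynamodb = boto3.resource("dynamodb", region_name="ap-northeast-1")
table = dynamodb.Table("model-versions")  # hypothetical table name


def register_model_version(model_name: str, s3_model_path: str, auc: float) -> str:
    """Store a model version record so serving can look it up and sanity-check it."""
    version = datetime.now(timezone.utc).strftime("%Y%m%d%H%M%S")
    table.put_item(
        Item={
            "model_name": model_name,        # partition key (assumed)
            "version": version,              # sort key (assumed)
            "artifact_s3_path": s3_model_path,
            "eval_auc": str(auc),            # stored as string; DynamoDB rejects floats
            "status": "pending_validation",  # flipped to "serving" after the sanity check
        }
    )
    return version
```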
This repository provides an implementation of a prediction server that serves predictions from your trained CTR prediction model. Canary releases are achieved by splitting traffic between separate ALB target groups.
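To give a feel for how a canary release over separate ALB target groups can be driven, the sketch below shifts a small share of listener traffic to a canary target group via boto3. The ARNs are placeholders, and the repository may implement this differently (for example through Terraform or the Makefile's canary target).

```python
# Minimal sketch: weight ALB listener traffic between a stable and a canary
# target group (ARNs are placeholders; the repo may drive this via Terraform).
import boto3

elbv2 = boto3.client("elbv2", region_name="ap-northeast-1")


def shift_canary_traffic(listener_arn: str, stable_tg_arn: str, canary_tg_arn: str,
                         canary_weight: int = 10) -> None:
    """Route `canary_weight` percent of requests to the canary target group."""
    elbv2.modify_listener(
        ListenerArn=listener_arn,
        DefaultActions=[
            {
                "Type": "forward",
                "ForwardConfig": {
                    "TargetGroups": [
                        {"TargetGroupArn": stable_tg_arn, "Weight": 100 - canary_weight},
                        {"TargetGroupArn": canary_tg_arn, "Weight": canary_weight},
                    ]
                },
            }
        ],
    )
```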
A comprehensive monitoring system is provided using Grafana and Prometheus. It covers software metrics of the inference server and its predictions, and detects data drift in the training pipeline.
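As a minimal sketch of the Prometheus side, the snippet below exposes request and prediction metrics from an inference process using the `prometheus_client` library. The metric names and port are illustrative assumptions, not the ones this repository actually exports.

```python
# Minimal sketch: expose inference metrics for Prometheus to scrape
# (metric names and port are illustrative; the repo's exporters may differ).
import random
import time

from prometheus_client import Counter, Histogram, start_http_server

PREDICTION_COUNT = Counter("ctr_predictions_total", "Number of CTR predictions served")
PREDICTION_VALUE = Histogram(
    "ctr_prediction_value", "Distribution of predicted CTR",
    buckets=[0.01, 0.05, 0.1, 0.2, 0.5, 1.0],
)


def predict(features: dict) -> float:
    ctr = random.random() * 0.1  # stand-in for the real model call
    PREDICTION_COUNT.inc()
    PREDICTION_VALUE.observe(ctr)
    return ctr


if __name__ == "__main__":
    start_http_server(8001)  # metrics served at :8001/metrics
    while True:
        predict({"ad_id": "123"})
        time.sleep(1)
```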
To showcase industry-standard practices, this repository guides you through deploying the training pipeline, inference server, and dashboard on AWS.
AWS infrastructure architecture created by this repository:
| Software | Install (Mac) |
|---|---|
| pyenv | brew install pyenv |
| Poetry | curl -sSL https://install.python-poetry.org \| python3 - |
| direnv | brew install direnv |
| Terraform | brew install terraform |
| Docker | install via dmg |
| docker-buildx | brew install docker-buildx |
| awscli | curl "https://awscli.amazonaws.com/AWSCLIV2.pkg" -o "AWSCLIV2.pkg" |
Use pyenv to install a Python 3.11.7 environment:

```bash
$ pyenv install 3.11.7
$ pyenv local 3.11.7
```

Use Poetry to install the library dependencies:

```bash
$ poetry install
```

Use direnv to configure environment variables:

```bash
$ cp .env.example .env
$ direnv allow .
```

Set your environment variables in `.env`:

```
AWS_REGION=ap-northeast-1
AWS_ACCOUNT_ID=
AWS_PROFILE=mlops-practice
AWS_ALB_DNS=
USER_NAME=
S3_BUCKET=${USER_NAME}-mlops-practice
TF_VAR_aws_region=${AWS_REGION}
TF_VAR_aws_profile=${AWS_PROFILE}
TF_VAR_aws_account_id=${AWS_ACCOUNT_ID}
TF_VAR_name=${USER_NAME}
```

Use Terraform to create the AWS resources, then apply the configuration:
```bash
$ make init
$ make plan
$ make apply
```

Unzip the training data:

```bash
$ unzip data.zip
```

Upload the training data to S3:

```bash
$ make upload
```

| Tool | Usage |
|---|---|
| ruff | Code formatting, import sorting, and code quality checks |
| mypy | Static type checking |
| pytest | Test Code |
| tox | Test automation in isolated virtual environments |
Build the ML pipeline:

```bash
$ make build-ml
```

Run the ML pipeline:

```bash
$ make run-ml
```

Build the Predict API:

```bash
$ make build-predictor
```

Run the Predict API locally:

```bash
$ docker compose up --build
```

Shut down the local Predict API:

```bash
$ docker compose down
```

Send a request to the local Predict API (a request sketch in Python follows at the end of this section):

```bash
$ make predict
```

Run the formatter:

```bash
$ make format
```

Run pytest, ruff, and mypy:

```bash
$ tox
```

There are many more commands; check the Makefile (fluentd, grafana, prometheus, canary, importer).
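For reference, a request to the local Predict API might look like the sketch below. The endpoint path and payload fields are hypothetical assumptions, so check the application code (or the `make predict` target) for the real ones.

```python
# Minimal sketch: call the locally running Predict API
# (endpoint path and payload fields are hypothetical; see `make predict`).
import requests

response = requests.post(
    "http://localhost:8080/predict",          # assumed host/port for the local container
    json={"ad_id": "123", "user_id": "abc"},  # placeholder feature payload
    timeout=5,
)
response.raise_for_status()
print(response.json())  # e.g. a predicted CTR value
```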

