62 changes: 62 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
@@ -1 +1,63 @@
# Example MLflow project
This is a simple example ML project that demonstrates how to use [MLflow](https://mlflow.org/) to track machine learning experiments using the Wine Quality dataset.

## 📁 Project Structure
.
├── conda.yaml # Conda environment dependencies
├── LICENSE.txt # License file
├── MLproject # MLflow project configuration
├── README.md # Project description (this file)
├── train.py # Training script
└── wine-quality.csv # Dataset

## 📦 Requirements

- Python 3.7+
- MLflow
- scikit-learn
- pandas
- numpy

You can install dependencies using:

```bash
pip install -r requirements.txt
```

Or using the conda.yaml file:

```bash
conda env create -f conda.yaml
conda activate mlflow-env
```

## 🧪 Running the Training Script
To run the training script with MLflow:

```bash
python train.py 0.5 0.5
```

Or with MLflow CLI:

```bash
mlflow run . -P alpha=0.5 -P l1_ratio=0.5
```
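
How `train.py` reads these two values is not shown in this diff. As a minimal sketch, assuming the first positional argument is `alpha` and the second is `l1_ratio`, a script could parse them as below. The helper name `parse_hyperparams` and the fallback defaults of 0.5 are hypothetical and chosen only for illustration.

```python
def parse_hyperparams(argv, default_alpha=0.5, default_l1_ratio=0.5):
    """Read alpha and l1_ratio from a sys.argv-style list.

    Hypothetical helper: argv[0] is the script name, argv[1] is alpha,
    argv[2] is l1_ratio. Missing values fall back to the defaults.
    """
    alpha = float(argv[1]) if len(argv) > 1 else default_alpha
    l1_ratio = float(argv[2]) if len(argv) > 2 else default_l1_ratio
    return alpha, l1_ratio


# Equivalent to running: python train.py 0.5 0.5
alpha, l1_ratio = parse_hyperparams(["train.py", "0.5", "0.5"])
print(f"alpha={alpha} l1_ratio={l1_ratio}")
```

In a real script the list would come from `sys.argv`; passing it in explicitly keeps the helper easy to test.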
## 📈 Logged Metrics
This project logs the following metrics:

- RMSE (Root Mean Squared Error)
- MAE (Mean Absolute Error)
- R² (R-squared)

The trained model is also saved and logged with MLflow.

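
For readers who want to see what the three metrics measure, here is a dependency-free sketch of their standard definitions. `train.py` itself computes them with scikit-learn; the function name `regression_metrics` and the toy values are made up for this illustration.

```python
import math


def regression_metrics(actual, pred):
    """Return (rmse, mae, r2) for two equal-length sequences of numbers."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, pred)]

    # Residual sum of squares, shared by RMSE and R².
    ss_res = sum(e * e for e in errors)

    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n

    # R² compares the model against always predicting the mean of `actual`.
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot  # assumes `actual` is not constant

    return rmse, mae, r2


# Toy data, not taken from the wine dataset.
rmse, mae, r2 = regression_metrics([5, 6, 7], [5, 5, 8])
print(rmse, mae, r2)
```

Lower RMSE and MAE are better. An R² near 0 means the model does little better than predicting the mean quality.
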
## 📚 Dataset Info
The dataset used is the Wine Quality dataset from the UCI Machine Learning Repository.

P. Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis.
"Modeling wine preferences by data mining from physicochemical properties".
Decision Support Systems, Elsevier, 47(4):547-553, 2009.
14 changes: 14 additions & 0 deletions mlruns/0/79442e6bb3ff42c5a7ca58e4858e5f62/meta.yaml
@@ -0,0 +1,14 @@
artifact_uri: file:///C:/Users/hp/mlflow-example/mlruns/0/79442e6bb3ff42c5a7ca58e4858e5f62/artifacts
end_time: 1753625116637
entry_point_name: ''
experiment_id: '0'
lifecycle_stage: active
run_id: 79442e6bb3ff42c5a7ca58e4858e5f62
run_name: carefree-jay-416
source_name: ''
source_type: 4
source_version: ''
start_time: 1753625111329
status: 3
tags: []
user_id: hp
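
The `start_time` and `end_time` fields above are Unix timestamps in milliseconds. As a rough sketch, the snippet below converts them to UTC and computes the run duration. The helper name `ms_to_utc` is introduced here for illustration only.

```python
from datetime import datetime, timedelta, timezone

START_TIME_MS = 1753625111329  # start_time from the run's meta.yaml
END_TIME_MS = 1753625116637    # end_time from the run's meta.yaml


def ms_to_utc(ms):
    """Convert epoch milliseconds to a timezone-aware UTC datetime."""
    seconds, millis = divmod(ms, 1000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(milliseconds=millis)


started = ms_to_utc(START_TIME_MS)
duration_s = (END_TIME_MS - START_TIME_MS) / 1000

print(started.strftime("%Y-%m-%d %H:%M:%S"), "UTC")
print(f"run took {duration_s:.3f} s")
```

The start time lands on 2025-07-27 at 14:05:11 UTC, and the run lasted a little over five seconds.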
2 changes: 2 additions & 0 deletions mlruns/0/79442e6bb3ff42c5a7ca58e4858e5f62/metrics/mae
@@ -0,0 +1,2 @@
1753625111412 0.6278761410160693 0
1753625111412 0.6278761410160693 0
2 changes: 2 additions & 0 deletions mlruns/0/79442e6bb3ff42c5a7ca58e4858e5f62/metrics/r2
@@ -0,0 +1,2 @@
1753625111400 0.12678721972772689 0
1753625111400 0.12678721972772689 0
2 changes: 2 additions & 0 deletions mlruns/0/79442e6bb3ff42c5a7ca58e4858e5f62/metrics/rmse
@@ -0,0 +1,2 @@
1753625111380 0.82224284975954 0
1753625111380 0.82224284975954 0
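
Each line in the metric files above appears to hold three space-separated fields: a timestamp in epoch milliseconds, the metric value, and the step. That reading is an assumption based on how MLflow's local file store lays out metrics. Every metric in this run is recorded twice with identical fields, so the sketch below also collapses exact duplicates. `MetricPoint` and `parse_metric_line` are hypothetical names.

```python
from typing import NamedTuple


class MetricPoint(NamedTuple):
    timestamp_ms: int
    value: float
    step: int


def parse_metric_line(line):
    """Parse one 'timestamp value step' line; any extra fields are ignored."""
    timestamp, value, step = line.split()[:3]
    return MetricPoint(int(timestamp), float(value), int(step))


# The two lines committed in metrics/rmse.
rmse_lines = [
    "1753625111380 0.82224284975954 0",
    "1753625111380 0.82224284975954 0",
]

points = [parse_metric_line(line) for line in rmse_lines]
unique_points = sorted(set(points))  # exact duplicates collapse to one point
print(unique_points)
```

Taking only the first three fields means the same helper also accepts the four-field lines under `mlruns/0/models/`, which carry the run ID as a trailing field.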
@@ -0,0 +1,6 @@
destination_id: m-716f80ebde1c4e5bb22cf127c5367bca
destination_type: MODEL_OUTPUT
source_id: m-716f80ebde1c4e5bb22cf127c5367bca
source_type: RUN_OUTPUT
step: 0
tags: {}
1 change: 1 addition & 0 deletions mlruns/0/79442e6bb3ff42c5a7ca58e4858e5f62/params/alpha
@@ -0,0 +1 @@
0.5
1 change: 1 addition & 0 deletions mlruns/0/79442e6bb3ff42c5a7ca58e4858e5f62/params/l1_ratio
@@ -0,0 +1 @@
0.5
@@ -0,0 +1 @@
carefree-jay-416
@@ -0,0 +1 @@
0651d1c962aa35e4dd02608c51a7b0efc2412407
@@ -0,0 +1 @@
train.py
@@ -0,0 +1 @@
LOCAL
1 change: 1 addition & 0 deletions mlruns/0/79442e6bb3ff42c5a7ca58e4858e5f62/tags/mlflow.user
@@ -0,0 +1 @@
hp
6 changes: 6 additions & 0 deletions mlruns/0/meta.yaml
@@ -0,0 +1,6 @@
artifact_location: file:///C:/Users/hp/mlflow-example/mlruns/0
creation_time: 1753625110390
experiment_id: '0'
last_update_time: 1753625110390
lifecycle_stage: active
name: Default
@@ -0,0 +1,22 @@
artifact_path: file:///C:/Users/hp/mlflow-example/mlruns/0/models/m-716f80ebde1c4e5bb22cf127c5367bca/artifacts
flavors:
  python_function:
    env:
      conda: conda.yaml
      virtualenv: python_env.yaml
    loader_module: mlflow.sklearn
    model_path: model.pkl
    predict_fn: predict
    python_version: 3.10.0
  sklearn:
    code: null
    pickled_model: model.pkl
    serialization_format: cloudpickle
    sklearn_version: 1.7.0
mlflow_version: 3.1.4
model_id: m-716f80ebde1c4e5bb22cf127c5367bca
model_size_bytes: 879
model_uuid: m-716f80ebde1c4e5bb22cf127c5367bca
prompts: null
run_id: 79442e6bb3ff42c5a7ca58e4858e5f62
utc_time_created: '2025-07-27 14:05:11.490711'
@@ -0,0 +1,14 @@
channels:
- conda-forge
dependencies:
- python=3.10.0
- pip<=21.2.3
- pip:
  - mlflow==3.1.4
  - cloudpickle==3.1.1
  - numpy==2.2.6
  - pandas==2.3.0
  - psutil==7.0.0
  - scikit-learn==1.7.0
  - scipy==1.15.3
name: mlflow-env
Binary file not shown.
@@ -0,0 +1,7 @@
python: 3.10.0
build_dependencies:
- pip==21.2.3
- setuptools==57.4.0
- wheel
dependencies:
- -r requirements.txt
@@ -0,0 +1,7 @@
mlflow==3.1.4
cloudpickle==3.1.1
numpy==2.2.6
pandas==2.3.0
psutil==7.0.0
scikit-learn==1.7.0
scipy==1.15.3
10 changes: 10 additions & 0 deletions mlruns/0/models/m-716f80ebde1c4e5bb22cf127c5367bca/meta.yaml
@@ -0,0 +1,10 @@
artifact_location: file:///C:/Users/hp/mlflow-example/mlruns/0/models/m-716f80ebde1c4e5bb22cf127c5367bca/artifacts
creation_timestamp: 1753625111456
experiment_id: '0'
last_updated_timestamp: 1753625116628
model_id: m-716f80ebde1c4e5bb22cf127c5367bca
model_type: null
name: model
source_run_id: 79442e6bb3ff42c5a7ca58e4858e5f62
status: 2
status_message: null
@@ -0,0 +1 @@
1753625111412 0.6278761410160693 0 79442e6bb3ff42c5a7ca58e4858e5f62
@@ -0,0 +1 @@
1753625111400 0.12678721972772689 0 79442e6bb3ff42c5a7ca58e4858e5f62
@@ -0,0 +1 @@
1753625111380 0.82224284975954 0 79442e6bb3ff42c5a7ca58e4858e5f62
@@ -0,0 +1 @@
0.5
@@ -0,0 +1 @@
0.5
@@ -0,0 +1 @@
0651d1c962aa35e4dd02608c51a7b0efc2412407
@@ -0,0 +1 @@
train.py
@@ -0,0 +1 @@
LOCAL
@@ -0,0 +1 @@
hp
18 changes: 9 additions & 9 deletions train.py
@@ -6,6 +6,7 @@
 import warnings
 import sys

+#importing the necessary libraries
 import pandas as pd
 import numpy as np
 from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
@@ -15,16 +16,15 @@
 import mlflow
 import mlflow.sklearn

-
+#define a function to evaluate the metrics
 def eval_metrics(actual, pred):
-    rmse = np.sqrt(mean_squared_error(actual, pred))
-    mae = mean_absolute_error(actual, pred)
-    r2 = r2_score(actual, pred)
+    rmse = np.sqrt(mean_squared_error(actual, pred)) #root mean squared error
+    mae = mean_absolute_error(actual, pred) #mean absolute error
+    r2 = r2_score(actual, pred) #R-squared score
     return rmse, mae, r2

-
-
 if __name__ == "__main__":
+    #ignore warnings for clean output
     warnings.filterwarnings("ignore")
     np.random.seed(40)

@@ -47,7 +47,7 @@ def eval_metrics(actual, pred):
     with mlflow.start_run():
         lr = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, random_state=42)
         lr.fit(train_x, train_y)
-
+        # Predicting the quality of wine using the trained model
         predicted_qualities = lr.predict(test_x)

         (rmse, mae, r2) = eval_metrics(test_y, predicted_qualities)
@@ -56,11 +56,11 @@ def eval_metrics(actual, pred):
         print(" RMSE: %s" % rmse)
         print(" MAE: %s" % mae)
         print(" R2: %s" % r2)
-
+        # Log parameters and metrics to MLflow
         mlflow.log_param("alpha", alpha)
         mlflow.log_param("l1_ratio", l1_ratio)
         mlflow.log_metric("rmse", rmse)
         mlflow.log_metric("r2", r2)
         mlflow.log_metric("mae", mae)
-
+        # save the trained model in MLflow
         mlflow.sklearn.log_model(lr, "model")
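
For context on the two hyperparameters logged above: scikit-learn documents the `ElasticNet` objective as a least-squares term plus a mix of L1 and L2 penalties. The notation below is ours: $n$ is the number of training samples, $w$ the coefficient vector, $\alpha$ is `alpha`, and $\rho$ is `l1_ratio`.

```latex
\min_{w} \; \frac{1}{2n} \lVert y - Xw \rVert_2^2
  + \alpha \rho \lVert w \rVert_1
  + \frac{\alpha (1 - \rho)}{2} \lVert w \rVert_2^2
```

With the values used in this run, $\alpha = 0.5$ and $\rho = 0.5$, the L1 term has weight $0.25$ and the squared L2 term has weight $0.125$.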