How to Go with the MLFlow: Tracking Tutorial


Anyone who has worked on a professional or personal machine learning project knows that keeping track of each model’s performance and evaluations can get messy. In my case, I can somewhat fondly recall scrolling bleary-eyed through a finalized Jupyter notebook at the eleventh hour to make sure I had created a model from each relevant algorithm and tuned each one’s hyperparameters properly.

MLflow is an open-source framework released in 2018 by Databricks, the company founded by the creators of Apache Spark, to help users keep track of the machine learning lifecycle. Inspired in part by platforms built in-house at tech companies like Google (TFX), Uber (Michelangelo), and Facebook (FBLearner), MLflow allows tracking, planning, and training to work across libraries and algorithms through an open interface design. MLflow Tracking lets users record parameters, metrics, source code, and code version, as well as artifacts such as data and models. This is critical for scaling a model, and it also helps in cases where data governance is an issue.

In about twenty lines of code, you can track your modeling process in MLflow. Here’s what my MLflow tracking UI looked like on my first attempt:

For each model for this classification problem, I have details on the parameters and evaluations as well as a handy reference to the algorithm used.

Once you have run “!pip install mlflow” in a notebook cell, you can start tracking the modeling process in MLflow with the short tutorial below.


In the code snippet below, I import MLflow and the MLflow client, set the experiment, and set up my evaluator objects for later.

import mlflow
from mlflow import spark
from mlflow.tracking import MlflowClient
# Spark ML evaluators used to score the models later
from pyspark.ml.evaluation import (BinaryClassificationEvaluator,
                                   MulticlassClassificationEvaluator)

mlflow.set_experiment(experiment_name="Experiment-3")
client = MlflowClient()

Bin_evaluator = BinaryClassificationEvaluator(rawPredictionCol='prediction')
MC_evaluator = MulticlassClassificationEvaluator(metricName="accuracy")

LaylaAI, in her Udemy course, provided this handy function to create a run for each model:

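# Newer MLflow releases replace list_experiments() with client.search_experiments()
experiments = client.list_experiments()
experiment_name = "Experiment-3"

def create_run(experiment_name):
    mlflow.set_experiment(experiment_name=experiment_name)
    for x in experiments:
        # NOTE: the lines marked "reconstructed" were cut off in the original
        # listing and are inferred from the surrounding code.
        if experiment_name in x.name:  # reconstructed
            experiment_index = experiments.index(x)
            run = client.create_run(experiments[experiment_index].experiment_id)  # reconstructed
    return run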
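The lookup step can also be sketched without scanning the full experiment list. This is a minimal sketch, not the course's version: the helper name `start_tracked_run` is hypothetical, and it assumes `client` is an `MlflowClient` (it only relies on `get_experiment_by_name`, `create_experiment`, and `create_run`).

```python
def start_tracked_run(client, experiment_name):
    """Create a run under the named experiment, creating the experiment if needed.

    `client` is assumed to be an mlflow.tracking.MlflowClient. The helper
    name is hypothetical and not part of MLflow itself.
    """
    # Returns None when no experiment has this exact name
    experiment = client.get_experiment_by_name(experiment_name)
    if experiment is None:
        # create_experiment returns the ID of the new experiment
        experiment_id = client.create_experiment(experiment_name)
    else:
        experiment_id = experiment.experiment_id
    return client.create_run(experiment_id)
```

With a real client this would be called as `run = start_tracked_run(MlflowClient(), "Experiment-3")`. Unlike the substring check above, it matches the experiment name exactly.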

The code below is what you would want to have for each model.

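# test the functionality here
run = create_run('Experiment-3')
# The run ID is the first argument of every client call below
# (it was dropped from the original listing and is restored here)
run_id = run.info.run_id
# Add tags to a run
client.set_tag(run_id, "Algorithm", "Gradient Boosted Tree")
client.set_tag(run_id, "Random Seed", 908)
client.set_tag(run_id, "Train Perct", 0.7)
# Add params and metrics to a run
client.log_param(run_id, "Max Depth", 90)
client.log_param(run_id, "Max Bins", 50)
client.log_metric(run_id, "Accuracy", 0.87)
# Terminate the run
client.set_terminated(run_id)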
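Since the same tag, param, and metric calls repeat for every model, they could be wrapped in one helper. The following is a sketch under assumptions: `log_run` is a hypothetical name, `client` is assumed to be an `MlflowClient`, and marking the run as "FAILED" when a logging call raises is my own design choice rather than something from the course.

```python
def log_run(client, run_id, tags=None, params=None, metrics=None):
    """Write tags, params, and metrics to one run, then close the run.

    `client` is assumed to be an mlflow.tracking.MlflowClient; `log_run`
    is a hypothetical helper, not an MLflow function.
    """
    status = "FAILED"
    try:
        for key, value in (tags or {}).items():
            client.set_tag(run_id, key, value)
        for key, value in (params or {}).items():
            client.log_param(run_id, key, value)
        for key, value in (metrics or {}).items():
            # Metrics must be numeric, so convert explicitly
            client.log_metric(run_id, key, float(value))
        status = "FINISHED"
    finally:
        # Close the run even if one of the logging calls raised
        client.set_terminated(run_id, status=status)
```

The gradient boosted tree example above would then become a single call: `log_run(client, run.info.run_id, tags={"Algorithm": "Gradient Boosted Tree"}, params={"Max Depth": 90, "Max Bins": 50}, metrics={"Accuracy": 0.87})`.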

With a logistic regression model and cross-validation, the same steps look like this:

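from pyspark.ml.classification import LogisticRegression
from pyspark.ml.tuning import CrossValidator, ParamGridBuilder

# NOTE: parts of this listing were lost from the original. The run ID
# arguments, `.build()`, the CrossValidator's `estimatorParamMaps` and
# `evaluator`, the `crossval.fit(train)` call, `key.name`, and the final
# `set_terminated` call are reconstructed from context. `train`, `test`,
# `seed`, and `train_val` are assumed to be defined earlier in the notebook.
run = create_run(experiment_name)
run_id = run.info.run_id
classifier = LogisticRegression()
# Set up your parameter grid
paramGrid = (ParamGridBuilder()
             .addGrid(classifier.regParam, [0.1, 0.01])
             .addGrid(classifier.maxIter, [10, 15, 20])
             .build())
# Cross Validator
crossval = CrossValidator(estimator=classifier,
                          estimatorParamMaps=paramGrid,
                          evaluator=MC_evaluator,
                          numFolds=2)  # 3 + is best practice
# Fit Model: Run cross-validation, and choose the best set of parameters.
fitModel = crossval.fit(train)
BestModel = fitModel.bestModel
print("Intercept: " + str(BestModel.interceptVector))
print("Coefficients: \n" + str(BestModel.coefficientMatrix))

predictions = fitModel.transform(test)
accuracy = (MC_evaluator.evaluate(predictions))*100

########### Track results in MLflow UI ################
# Add tags to a run
# Extract the name of the classifier
classifier_name = type(classifier).__name__
client.set_tag(run_id, "Algorithm", classifier_name)
client.set_tag(run_id, "Random Seed", seed)
client.set_tag(run_id, "Train Perct", train_val)
# Extract params of Best Model
paramMap = BestModel.extractParamMap()
# Log parameters to the client
for key, val in paramMap.items():
    if 'maxIter' in key.name:
        client.log_param(run_id, "Max Iter", val)
for key, val in paramMap.items():
    if 'regParam' in key.name:
        client.log_param(run_id, "Reg Param", val)
# Log metrics to the client
client.log_metric(run_id, "Accuracy", accuracy)
# Set a run's status to finished (best practice)
client.set_terminated(run_id)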

You could, of course, log more parameters depending on your needs.
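One way to log more parameters without adding a loop per parameter is to pick them out of the param map by name in a single pass. This is a rough sketch: `select_params` is a hypothetical helper, it assumes the map's keys expose a `.name` attribute (as Spark's `Param` objects do), and it matches names exactly rather than by substring.

```python
def select_params(param_map, wanted):
    """Pick named entries out of a Spark-style param map.

    `param_map` maps Param-like objects (each with a `.name`) to values, such
    as the result of `extractParamMap()`. `wanted` maps a param name to the
    label it should be logged under. Hypothetical helper, not part of MLflow.
    """
    selected = {}
    for param, value in param_map.items():
        label = wanted.get(param.name)
        if label is not None:
            selected[label] = value
    return selected

# Intended use with the objects from the tutorial (left commented out because
# it needs a fitted model and a live MlflowClient):
# labels = {"maxIter": "Max Iter", "regParam": "Reg Param"}
# for label, value in select_params(BestModel.extractParamMap(), labels).items():
#     client.log_param(run.info.run_id, label, value)
```

Adding another parameter then only means adding one entry to the `labels` dictionary.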

To display the tracking UI, I ran the command “mlflow ui” on my command line from within the project folder, then opened the address it printed (http://127.0.0.1:5000 by default) in my browser.
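The same comparison the UI offers can be approximated in code once the runs are fetched, for example with `client.search_runs([experiment_id])`. The sketch below is only an illustration: `best_run` is a hypothetical helper, and it assumes each run exposes its logged metrics as a dictionary at `run.data.metrics`, as MLflow's `Run` objects do.

```python
def best_run(runs, metric_name):
    """Return the run with the highest value of `metric_name`, or None.

    `runs` is assumed to be a list of MLflow Run objects, for example the
    result of `client.search_runs([experiment_id])`. Runs that never logged
    the metric are skipped. Hypothetical helper, not part of MLflow.
    """
    scored = [run for run in runs if metric_name in run.data.metrics]
    if not scored:
        return None
    return max(scored, key=lambda run: run.data.metrics[metric_name])
```

For the runs logged in this tutorial, `best_run(runs, "Accuracy").data.tags["Algorithm"]` would give the name of the algorithm with the highest logged accuracy.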

Further Reading

Corey Zumar, a software developer at Databricks, has covered using MLflow on the popular MNIST digit-recognition dataset, going further into the cloud functionality of MLflow for productionizing your models and avoiding the rat’s nest of mappings that can occur when an organization uses a variety of ML frameworks.

Data Scientist and Writer, passionate about language
