How to Go with the MLFlow: Tracking Tutorial

Anton Haugen
3 min read · Mar 8, 2021


Anyone who has worked on a professional or personal machine learning project knows that keeping track of the performance and evaluations of each of your models can get particularly messy. In my case, I can somewhat fondly recall bleary-eyed scrolling through a finalized Jupyter notebook at the eleventh hour to make sure I had created a model from each relevant algorithm and tuned each one’s hyperparameters properly.

MLflow is an open-source framework released in 2018 by Databricks, the company behind the Apache Spark project, to help users keep track of the machine learning lifecycle. Inspired in part by platforms created in-house at tech companies like Google (TFX), Uber (Michelangelo), and Facebook (FBLearner), MLflow allows tracking, planning, and training to work across libraries and algorithms through an open interface design. MLflow Tracking lets users keep track of parameters, metrics, source code, and code versions, as well as any artifacts like data and models. This is not just critical for scaling a model; it is also beneficial in cases where data governance is an issue.
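To give a sense of how little ceremony that takes, here is a minimal sketch of MLflow Tracking using the fluent API. This snippet is my own illustration rather than part of the tutorial below, and the experiment name, tag, parameter, and metric values are placeholders:

import mlflow

mlflow.set_experiment("quick-demo")

# Create a small file so we have an artifact to log
with open("model_summary.txt", "w") as f:
    f.write("example artifact")

with mlflow.start_run():
    mlflow.set_tag("Algorithm", "Logistic Regression")  # descriptive tag
    mlflow.log_param("regParam", 0.01)                  # a hyperparameter
    mlflow.log_metric("accuracy", 0.87)                 # an evaluation metric
    mlflow.log_artifact("model_summary.txt")            # any file worth keeping

The tutorial below does the same kind of logging, but through the lower-level tracking client so that runs can be created and updated explicitly.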

In about twenty lines of code, you can track your modeling process in MLFlow. Here’s what my MLflow tracker client looked like on my first attempt:

For each model in this classification problem, I have details on the parameters and evaluations as well as a handy reference to the algorithm used.

Once you run “!pip install mlflow” in your notebook, you can get started tracking the modeling process in MLflow with the short tutorial below.

Tutorial

In the code snippet below, I import MLflow and the MLflow client, set the experiment, and set up my PySpark evaluator objects for later.

import mlflow
from mlflow import spark
from mlflow.tracking import MlflowClient
from pyspark.ml.evaluation import BinaryClassificationEvaluator, MulticlassClassificationEvaluator

mlflow.set_experiment(experiment_name="Experiment-3")
client = MlflowClient()

# Evaluators used later to score the fitted models
Bin_evaluator = BinaryClassificationEvaluator(rawPredictionCol='prediction')
MC_evaluator = MulticlassClassificationEvaluator(metricName="accuracy")

In her Udemy course PySpark Essentials for Data Scientists, LaylaAI provides this handy function for creating a run for each model:

experiments = client.list_experiments()
experiment_name = "Experiment-3"

def create_run(experiment_name):
    mlflow.set_experiment(experiment_name=experiment_name)
    for x in experiments:
        if experiment_name in x.name:
            experiment_index = experiments.index(x)
    run = client.create_run(experiments[experiment_index].experiment_id)
    return run

The code below is what you would want to run for each model.

# test the functionality here
run = create_run('Experiment-3')
# Add tag to a run
client.set_tag(run.info.run_id, "Algorithm", "Gradient Boosted Tree")
client.set_tag(run.info.run_id,"Random Seed",908)
client.set_tag(run.info.run_id,"Train Perct",0.7)
# Add params and metrics to a run
client.log_param(run.info.run_id, "Max Depth", 90)
client.log_param(run.info.run_id, "Max Bins", 50)
client.log_metric(run.info.run_id, "Accuracy", 0.87)
# Mark the run as finished
client.set_terminated(run.info.run_id)
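If you prefer not to pass run IDs around, the same information can be recorded with MLflow’s fluent API. This is a sketch I’m adding for comparison, not part of the original course code, and it reuses the same placeholder tag, parameter, and metric values as above:

import mlflow

mlflow.set_experiment("Experiment-3")

with mlflow.start_run():  # the run is ended automatically when the block exits
    mlflow.set_tag("Algorithm", "Gradient Boosted Tree")
    mlflow.set_tag("Random Seed", 908)
    mlflow.set_tag("Train Perct", 0.7)
    mlflow.log_param("Max Depth", 90)
    mlflow.log_param("Max Bins", 50)
    mlflow.log_metric("Accuracy", 0.87)

The client API used in this tutorial has the advantage that you can create the run up front and keep logging to it from anywhere you hold the run ID.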

Below is what this would look like for a logistic regression model with cross-validation:

from pyspark.ml.classification import LogisticRegression
from pyspark.ml.tuning import ParamGridBuilder, CrossValidator
from pyspark.ml.evaluation import MulticlassClassificationEvaluator

run = create_run(experiment_name)
classifier = LogisticRegression()

# Set up your parameter grid
paramGrid = (ParamGridBuilder()
             .addGrid(classifier.regParam, [0.1, 0.01])
             .addGrid(classifier.maxIter, [10, 15, 20])
             .build())

# Cross validator
crossval = CrossValidator(estimator=classifier,
                          estimatorParamMaps=paramGrid,
                          evaluator=MulticlassClassificationEvaluator(),
                          numFolds=2)  # 3+ is best practice

# Fit model: run cross-validation and choose the best set of parameters
# (train and test are the DataFrames from your train/test split)
fitModel = crossval.fit(train)
BestModel = fitModel.bestModel
print("Intercept: " + str(BestModel.interceptVector))
print("Coefficients: \n" + str(BestModel.coefficientMatrix))

predictions = fitModel.transform(test)
accuracy = (MC_evaluator.evaluate(predictions)) * 100
print(accuracy)

########### Track results in MLflow UI ###########
# Add tags to the run
# Extract the name of the classifier
classifier_name = type(classifier).__name__
client.set_tag(run.info.run_id, "Algorithm", classifier_name)
client.set_tag(run.info.run_id, "Random Seed", seed)       # seed used for the train/test split
client.set_tag(run.info.run_id, "Train Perct", train_val)  # training fraction used in the split
# Extract params of the best model
paramMap = BestModel.extractParamMap()
# Log parameters to the client
for key, val in paramMap.items():
    if 'maxIter' in key.name:
        client.log_param(run.info.run_id, "Max Iter", val)
for key, val in paramMap.items():
    if 'regParam' in key.name:
        client.log_param(run.info.run_id, "Reg Param", val)
# Log metrics to the client
client.log_metric(run.info.run_id, "Accuracy", accuracy)
# Set the run's status to finished (best practice)
client.set_terminated(run.info.run_id)

You could, of course, log more parameters depending on your needs.
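You could also attach the fitted Spark model itself to the run as an artifact, which is what the “from mlflow import spark” import is good for. The snippet below is a sketch I’m adding rather than part of the original tutorial; it would go just before the client.set_terminated call, and it resumes the existing run so that the model lands in the same place as the tags and metrics:

import mlflow
import mlflow.spark

# Sketch: log the best model as an artifact of the run created above
with mlflow.start_run(run_id=run.info.run_id):
    mlflow.spark.log_model(BestModel, artifact_path="best-model")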

To display the tracker, I ran the command “mlflow ui” on my command line within the project folder and then accessed http://localhost:5000/ in my browser.
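If you would rather inspect runs programmatically instead of (or in addition to) the UI, the tracking client can also query them. A small sketch, assuming the same client and experiment name as above:

# Sketch: list the tracked runs for "Experiment-3" without opening the UI
experiment = client.get_experiment_by_name("Experiment-3")
for r in client.search_runs(experiment_ids=[experiment.experiment_id]):
    print(r.data.tags.get("Algorithm"), r.data.params, r.data.metrics)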

Further Reading

Corey Zumar, a software engineer at Databricks, has an awesome demo on using MLflow with the popular MNIST digit-recognition dataset, covering more of MLflow’s cloud functionality for productionizing your models and avoiding the rat’s nest of mappings that can occur when an organization uses a variety of ML frameworks.
