MLOps: Bring Your Model to Production

The field of ML is growing up. In the past, questions like “Which AI algorithm should I choose?” and “What is the best way to train it?” were the main focus of AI projects. Today, these topics are well explored. Startups and big cloud providers offer services that ease the first steps with AI. In addition, we can take publicly available, pre-trained models and apply transfer learning to reach sound predictive performance with less effort than building our own model from scratch.

Now, new questions come into focus: How can we actually bring our model to production in a reliable and reproducible way? How can we integrate data acquisition, training, and monitoring of our model into an automated system?

Software engineering faces similar questions. To tackle them, we use methods like continuous integration and deployment to build and distribute our software in a reliable and reproducible way. We use infrastructure as code to provide the underlying infrastructure that is necessary to run our software.

How can we adapt state-of-the-art practices from continuous integration and deployment (CI/CD) and infrastructure as code (IaC) to our AI workflows and pipelines, so that they achieve degrees of reliability and reproducibility similar to those in software engineering?

Data Sets

In this student research project, the student may choose between two different kinds of data sets and ML problems. Both data sets cover a large period of time.

  • Predicting the “helpfulness” of a new Amazon customer review. For this NLP problem, the publicly available Amazon Customer Reviews Dataset will be used (15 years’ worth of data).
  • Predicting stock prices. The necessary data is publicly available, e.g. through Yahoo! Finance (more than 40 years’ worth of data).

Using one of these data sets, we will implement and simulate the life cycle of a typical machine learning product. For example, we’ll build our initial model on the first two years of data. After that, we periodically (e.g. every 6 months) retrain our model using “newly acquired” data from the data set.
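The simulated life cycle can be sketched as a sequence of time windows. The following is a minimal sketch, not a prescribed implementation: the function names `add_months` and `retraining_schedule`, the example dates, and the choice to train on all data seen so far (rather than on a sliding window) are assumptions made for illustration.

```python
from datetime import date


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months; the result is always the first of a month."""
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, 1)


def retraining_schedule(start, end, initial_months=24, step_months=6):
    """Yield (train_start, train_end, eval_end) tuples.

    Training uses data in [train_start, train_end). The following
    step_months of data, [train_end, eval_end), play the role of
    "newly acquired" data: first for evaluation, then for retraining.
    Assumes start falls on the first day of a month.
    """
    train_end = add_months(start, initial_months)
    while train_end < end:
        eval_end = min(add_months(train_end, step_months), end)
        yield start, train_end, eval_end
        train_end = eval_end  # the next run also trains on the evaluated window


# Hypothetical date range, chosen only to keep the example short.
schedule = list(retraining_schedule(date(2000, 1, 1), date(2004, 1, 1)))
for train_start, train_end, eval_end in schedule:
    print(f"train [{train_start}, {train_end})  evaluate [{train_end}, {eval_end})")
```

Each tuple corresponds to one simulated pipeline run: the model is trained on everything before `train_end` and judged on the half year that follows.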

We’ll use IaC to build and maintain our infrastructure and CI/CD to build, publish and maintain our model.


  • The main goals are:
    • Build a suitable IaC and CI/CD pipeline to:
      • train/retrain our model to improve it and/or adapt it to changes in the ground truth
      • test our model
      • publish the new model to production
    • Ensure that the predictive performance of the model doesn’t decrease over time

    Furthermore, an actual ML problem is needed to exercise the CI/CD pipeline. You may choose between the data sets mentioned above to:

    • Initially build a model
    • Retrain the model
    • Evaluate the model
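One way to make the goal of non-decreasing predictive performance checkable is a promotion gate: a retrained model replaces the production model only if it scores at least as well on the same held-out data. The sketch below assumes a classification task scored by accuracy (higher is better). The names `accuracy` and `should_promote`, the `tolerance` parameter, and the toy labels are illustrative assumptions, not part of the project description.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def should_promote(candidate_score: float, production_score: float,
                   tolerance: float = 0.0) -> bool:
    """Return True if the candidate is no worse than production.

    Both scores must be computed on the same held-out data and follow
    a "higher is better" convention. tolerance allows for small noise.
    """
    return candidate_score >= production_score - tolerance


# Toy labels and predictions, made up for illustration.
labels = [1, 0, 1, 1, 0, 1, 0, 0]
production_pred = [1, 0, 0, 1, 0, 1, 1, 0]
candidate_pred = [1, 0, 1, 1, 0, 1, 1, 0]

production_score = accuracy(labels, production_pred)
candidate_score = accuracy(labels, candidate_pred)
promote = should_promote(candidate_score, production_score)
print(production_score, candidate_score, promote)
```

In a CI/CD setting, a `False` result would fail the test stage and keep the current production model in place.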


  • Explore currently available CI/CD tools for ML applications (e.g. Airbnb’s Bighead, Facebook’s FBLearner Flow, Google TFX)
  • Data wrangling for one of the proposed data sets
  • Build and train an ML model
  • Configure a CI/CD pipeline to:
    • prepare and preprocess new data that was acquired since the last model training
    • retrain the model with new (and old) data
    • test the model, e.g. for predictive performance
    • publish the model to production
  • Monitor your model to ensure:
    • availability
    • predictive performance that is stable or improving
    • (optional) absence of security issues
  • (Optional) Consider building the pipeline in a way such that it can deal with new ML technology in the future (new ML algorithms, additional ML hardware, etc.)
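The pipeline stages and the monitoring step listed above can be sketched in plain Python before committing to a specific CI/CD tool. This is a toy sketch under loud assumptions: the "model" is just a constant that predicts the training mean, `Registry` is a hypothetical in-memory stand-in for a model registry, and all function names and data values are invented for illustration. In a real setup, each function would map to a pipeline stage or job.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import List, Optional


@dataclass
class Registry:
    """Hypothetical in-memory stand-in for a model registry and production slot."""
    production_model: Optional[float] = None
    history: List[float] = field(default_factory=list)  # production error per run


def preprocess(raw):
    """Prepare newly acquired data; here we only drop missing values."""
    return [x for x in raw if x is not None]


def train(values):
    """Toy 'model': a constant that always predicts the training mean."""
    return mean(values)


def evaluate(model, holdout):
    """Mean absolute error on held-out data (lower is better)."""
    return mean(abs(y - model) for y in holdout)


def run_pipeline(registry, old_data, new_raw, holdout):
    """Run prepare -> retrain -> test -> publish once; return True if published."""
    new_data = preprocess(new_raw)
    candidate = train(old_data + new_data)
    candidate_error = evaluate(candidate, holdout)

    # Re-evaluate the production model on the same holdout for a fair comparison.
    if registry.production_model is None:
        production_error = float("inf")
    else:
        production_error = evaluate(registry.production_model, holdout)

    published = candidate_error <= production_error
    if published:
        registry.production_model = candidate
    # Record the error of whichever model is in production after this run.
    registry.history.append(min(candidate_error, production_error))
    return published


def performance_is_stable(history, tolerance=0.0):
    """Monitoring check: the latest production error did not get worse."""
    if len(history) < 2:
        return True
    return history[-1] <= history[-2] + tolerance


registry = Registry()
first = run_pipeline(registry, [10.0, 12.0], [11.0, None, 13.0], holdout=[11.0, 12.0])
second = run_pipeline(registry, [10.0, 12.0, 11.0, 13.0], [100.0], holdout=[11.0, 12.0])
print(first, second, registry.production_model, performance_is_stable(registry.history))
```

The second run feeds in an outlier, so the retrained candidate is worse than the production model and is not published. Keeping the stages as separate functions with plain inputs and outputs is one way to leave room for swapping in new ML algorithms or hardware later. The monitoring check becomes informative once the held-out data changes between runs, as it would with periodically acquired data.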

Kind of Work
20% Theory, 60% Implementation, 20% Testing/Benchmarking


  • Machine Learning Basics
  • DevOps Basics
  • One of: Python, Scala, R

Time & Effort
Master’s Thesis, 1-2 Students