Supervised Machine Learning for Science

How to stop worrying and love your black box


Christoph Molnar & Timo Freiesleben


Machine learning has revolutionized science, from folding proteins and predicting tornadoes to studying human nature. While science has always had an intimate relationship with prediction, machine learning has amplified this focus. But can this hyper-focus on prediction models be justified? Can a machine learning model be part of a scientific model? Or are we on the wrong track?

In this book, we explore and justify supervised machine learning in science. However, a naive application of supervised learning won’t get you far, because machine learning in its raw form is unsuitable for science: it lacks interpretability, uncertainty quantification, causality, and many other desirable attributes. Yet we already have all the puzzle pieces needed to improve machine learning, from incorporating domain knowledge and ensuring the representativeness of the training data to creating robust, interpretable, and causal models. The problem is that these solutions are scattered everywhere.

We bring together the philosophical justification and the solutions that make supervised machine learning a powerful tool for science.

The book consists of two parts:

  • Part 1 discusses the relationship between science and machine learning.
  • Part 2 addresses the shortcomings of supervised machine learning.

Supervised Machine Learning for Science by Christoph Molnar & Timo Freiesleben is licensed under CC BY-NC-SA 4.0.