DeepNotebooks: Deep Probabilistic Models Construct Python Notebooks for Reporting Datasets

Abstract

Machine learning is taking an increasingly relevant role in science, business, entertainment, and other fields. However, the most advanced techniques are still in the hands of well-educated and well-funded experts only. To help democratize machine learning, we propose DeepNotebooks as a novel way to empower a broad spectrum of users who are not machine learning experts but have some basic programming skills and are interested in data science. Within the DeepNotebook framework, users feed a cleaned tabular dataset to the system. The system then automatically estimates a deep but tractable probabilistic model and compiles it into an interactive Python notebook that already contains a preliminary yet comprehensive analysis of the dataset at hand. If users want to change the parameters of the interactive report or pose different queries to the underlying model, they can quickly do so within the DeepNotebook. This flexibility allows users to interact with the framework in a feedback loop: they can discover patterns and dig deeper into the data using targeted questions, even if they are not experts in machine learning.

Publication
Proceedings of ECML PKDD Workshop on Automating Data Science
Claas A. Voelcker
PhD Student in Reinforcement Learning and ML