Scott Stevenson

Visibility and monitoring in deployed machine learning systems

Machine learning allows us to build systems of unprecedented capability, enabling everything from self-driving cars to the synthesis of speech indistinguishable from a human voice. This sophistication comes at a cost, however, making it harder to understand and monitor the behaviour of live ML systems.

Viewing Jupyter notebooks at the command line

The Jupyter notebook is a literate programming environment that has become ubiquitous in machine learning. While the standard tools for interacting with notebooks are web applications, it’s often useful to be able to view notebooks at the command line. This is especially convenient when logged into a training workstation via SSH, where configuring SSH to forward a port, starting a Jupyter server, and navigating to it in a web browser is a chore just to view a notebook for a few seconds.
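
As a rough illustration of the idea, and not the tool the post describes, the sketch below renders a notebook as plain text using only the Python standard library. It assumes the notebook follows the nbformat 4 layout, with a top-level `cells` list whose entries carry `cell_type` and `source` fields. The function names `render_notebook` and `render_notebook_file` are hypothetical, and cell outputs are ignored.

```python
import json


def render_notebook(nb: dict) -> str:
    """Render a parsed notebook (assumed nbformat 4) as plain text."""
    lines = []
    for index, cell in enumerate(nb.get("cells", []), start=1):
        source = cell.get("source", "")
        # The source may be stored as one string or as a list of strings.
        if isinstance(source, list):
            source = "".join(source)
        # Label each cell with its position and type, then show its source.
        lines.append(f"[{index}] {cell.get('cell_type', 'unknown')}")
        lines.append(source)
        lines.append("")
    return "\n".join(lines)


def render_notebook_file(path: str) -> str:
    """Load a .ipynb file from disk and render it as plain text."""
    with open(path, encoding="utf-8") as handle:
        return render_notebook(json.load(handle))
```

Calling `print(render_notebook_file("analysis.ipynb"))`, where the path is a placeholder, prints each cell in order, which is often enough for a quick look over SSH without starting a server.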

Representation learning for audio data

Classical machine learning often cannot be applied to modern, complex datasets, such as audio recordings of human speech, without extensive feature engineering. Traditionally, feature engineering requires deep domain knowledge to extract the key components of the data.

Jupyter notebooks and collaboration

Git has become the de facto standard for sharing and collaborating on code, and Jupyter notebooks have become the same for interactive data exploration and modelling. Herein lies a problem: Git was designed to version plain text files containing source code, not structured data such as the JSON source of Jupyter notebooks or binary data such as embedded images.
