Scott Stevenson

Visibility and monitoring in deployed machine learning systems

Modern machine learning has ushered in an era of unparalleled system capabilities, exemplified by self-driving cars and synthetic speech indistinguishable from a human voice. However, these techniques bring with them the challenge of monitoring and understanding the behaviour of live ML systems.


Viewing Jupyter notebooks at the command line

The Jupyter notebook is a literate programming environment that has become ubiquitous in machine learning. While the standard tools for interacting with notebooks are web applications, it’s often useful to be able to view notebooks at the command line. This is particularly convenient when logged into a training workstation via SSH, where configuring SSH to forward a port, starting a Jupyter server, and navigating to it in a web browser is a chore when all you want is to view a notebook for a few seconds.
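
To illustrate why this is feasible without a browser: a notebook file is a JSON document whose "cells" list holds each cell's type and source. The following is a minimal sketch, not the tool the article describes. It assumes the nbformat version 4 layout, uses only the standard library, and the function name render_notebook is hypothetical.

```python
import json


def render_notebook(notebook_json: str) -> str:
    """Render the cells of an nbformat v4 notebook as plain text.

    Hypothetical helper for illustration; it ignores cell outputs.
    """
    notebook = json.loads(notebook_json)
    lines = []
    for index, cell in enumerate(notebook.get("cells", []), start=1):
        source = cell.get("source", "")
        # The notebook format stores "source" either as a single string
        # or as a list of strings, one per line.
        if isinstance(source, list):
            source = "".join(source)
        lines.append(f"--- [{index}] {cell.get('cell_type', 'unknown')} ---")
        lines.append(source)
    return "\n".join(lines)


# A tiny in-memory notebook stands in for the contents of an .ipynb file.
example = json.dumps(
    {
        "cells": [
            {"cell_type": "markdown", "source": ["# Title\n", "Some notes"]},
            {"cell_type": "code", "source": "print(1)"},
        ]
    }
)
print(render_notebook(example))
```

In practice the string passed to render_notebook would be read from the .ipynb file on the remote machine, and a fuller viewer would also render outputs and apply syntax highlighting.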


Representation learning for audio data

Applying classical machine learning methods to complex data formats, such as audio of human speech, typically necessitates extensive feature engineering, which requires significant domain knowledge to extract the key components of the data.

Deep learning can allow models to learn their own data representations, obviating the need for feature engineering. However, as the quality of the learned representations strongly influences performance on downstream tasks, how can we ensure that these representations are appropriate?


Jupyter notebooks and collaboration

Git is now widely adopted as the primary means of collaborating on code, and Jupyter notebooks as the standard environment for data exploration and interactive modelling. However, Git was designed to version plain text files, such as those containing source code, and not structured data like JSON documents or binary data such as the images embedded in Jupyter notebooks.
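
One common mitigation, which is not necessarily the approach the article takes, is to strip cell outputs before committing so that Git only has to diff the source. The sketch below assumes the nbformat version 4 layout, in which code cells carry "outputs" and "execution_count" fields. The function name strip_outputs is hypothetical.

```python
import json


def strip_outputs(notebook_json: str) -> str:
    """Return the notebook JSON with code cell outputs removed.

    Hypothetical helper for illustration, assuming nbformat v4.
    """
    notebook = json.loads(notebook_json)
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            # Outputs can contain embedded binary data such as base64 images,
            # which produce large, unreadable diffs.
            cell["outputs"] = []
            # Execution counts change on every run and add noise to diffs.
            cell["execution_count"] = None
    # Stable key order and indentation keep successive versions comparable.
    return json.dumps(notebook, indent=1, sort_keys=True) + "\n"
```

The trade-off is that stripped notebooks no longer show their results in the repository, so collaborators must re-run them to see outputs.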
