HiPlot: High-dimensional interactive plots made easy

1/31/2020

What it is:

HiPlot is a lightweight interactive visualization tool to help AI researchers discover correlations and patterns in high-dimensional data. It uses parallel plots and other graphical ways to represent information more clearly, and it can be run quickly from a Jupyter notebook with no setup required. HiPlot enables machine learning (ML) researchers to more easily evaluate the influence of their hyperparameters, such as learning rate, regularizations, and architecture. It can also be used by researchers in other fields, so they can observe and analyze correlations in data relevant to their work.

Parallel plots are a convenient way to visualize and filter high-dimensional data. For example, suppose you are running multiple training tasks, each with two scalar parameters, named dropout and lr, and one optimizer (either “SGD” or “Adam”), and each task produces a loss, which is another scalar. Each training task can then be represented as a data point with values (dropout, lr, optimizer, loss). HiPlot draws one scaled vertical axis each for dropout, lr, optimizer, and loss, and each data point becomes a continuous line that passes through its value on each axis. The results are shown here with three different data points, in red, blue, and black.
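To make the idea of "one scaled axis per dimension" concrete, here is a minimal sketch of how a parallel plot could place each data point on its axes. It is an illustration of the general technique, not HiPlot's actual rendering code: the helper name axis_positions, the min-max scaling, and the alphabetical ordering of categories are all assumptions made for this example.

```python
def axis_positions(rows, axes):
    """Map every value to a position in [0, 1] along its own axis."""
    scales = {}
    for axis in axes:
        values = [row[axis] for row in rows]
        if all(isinstance(v, (int, float)) for v in values):
            # Numeric axis: min-max scaling (assumed here for simplicity).
            low, high = min(values), max(values)
            span = (high - low) or 1.0
            scales[axis] = lambda v, low=low, span=span: (v - low) / span
        else:
            # Categorical axis: spread the distinct values evenly.
            levels = sorted(set(values))
            step = 1.0 / max(len(levels) - 1, 1)
            scales[axis] = lambda v, levels=levels, step=step: levels.index(v) * step
    # One polyline per data point: its position on each axis, in order.
    return [[scales[axis](row[axis]) for axis in axes] for row in rows]


tasks = [
    {"dropout": 0.1, "lr": 0.001, "optimizer": "SGD", "loss": 10.0},
    {"dropout": 0.15, "lr": 0.01, "optimizer": "Adam", "loss": 3.5},
    {"dropout": 0.3, "lr": 0.1, "optimizer": "Adam", "loss": 4.5},
]
lines = axis_positions(tasks, ["dropout", "lr", "optimizer", "loss"])
```

Each inner list in lines is the sequence of heights at which one training task crosses the four axes, which is all a renderer needs to draw that task's line.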

What it does:

HiPlot is designed to offer several advantages over other visualization tools:

Interactivity. In HiPlot, parallel plots are interactive, which makes it easy to change the visualization for different use cases. For example, you can focus on experiments that take a range or value along one or several axes, set the color scheme according to yet another axis, reorder or remove axes, or extract a particular selection of data.

In this example, we first chose to display only data points obtained after 20 or more epochs of training. Then, by slicing through the “loss” axis, we observed that larger learning rates led to better performance (lower perplexity). You can reproduce this example here: https://facebookresearch.github.io/hiplot/_static/demo/ml1.csv.html?hip.color_by=%22valid+ppl%22 .
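The same two steps can be sketched in plain Python, outside the interactive plot. The records below are made up for illustration (they are not taken from the demo file), and select_range is a hypothetical helper that mimics selecting a range along one axis.

```python
# Made-up runs for illustration; field names and values are assumptions.
runs = [
    {"epoch": 5, "lr": 0.001, "valid ppl": 310.0},
    {"epoch": 25, "lr": 0.001, "valid ppl": 180.0},
    {"epoch": 30, "lr": 0.01, "valid ppl": 120.0},
    {"epoch": 40, "lr": 0.1, "valid ppl": 95.0},
]


def select_range(rows, key, low=float("-inf"), high=float("inf")):
    """Keep rows whose value for `key` lies in [low, high]."""
    return [row for row in rows if low <= row[key] <= high]


# Step 1: keep only runs trained for 20 or more epochs.
long_runs = select_range(runs, "epoch", low=20)

# Step 2: order the remaining runs by perplexity, best (lowest) first.
best_first = sorted(long_runs, key=lambda row: row["valid ppl"])
```

In the interactive plot, both steps are done by dragging along an axis rather than by writing code, which is what makes this kind of exploration fast.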

Simplicity. You can use HiPlot in two equally straightforward ways:

Through an IPython notebook. This will reproduce the first example plot above:

import hiplot as hip

# One dictionary per training task: its hyperparameters and resulting loss.
data = [
    {'dropout': 0.1, 'lr': 0.001, 'loss': 10.0, 'optimizer': 'SGD'},
    {'dropout': 0.15, 'lr': 0.01, 'loss': 3.5, 'optimizer': 'Adam'},
    {'dropout': 0.3, 'lr': 0.1, 'loss': 4.5, 'optimizer': 'Adam'},
]
hip.Experiment.from_iterable(data).display(force_full_width=True)

Through a server with the “hiplot” command. You can then access it through http://127.0.0.1:5005/ and use it to visualize, manage, and share your experiments. The simple syntax also allows you to see multiple experiments at the same time.

Extendability. By default, HiPlot’s web server can parse CSV or JSON files. You can also provide it with a custom Python parser that will convert your experiments into a HiPlot experiment. To help researchers performing hyperparameter searches, HiPlot is already compatible with the logs of open source Facebook AI libraries, such as wav2letter@anywhere, our inference framework for online speech recognition; Nevergrad, our open source tool for derivative-free optimization; and fairseq, our sequence modeling toolkit.
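As a rough idea of what the parsing half of a custom parser might involve, the sketch below turns a hypothetical key=value log format into a list of dictionaries. The log format and the parse_log_lines helper are assumptions for this example, and the sketch does not show how a parser is registered with the HiPlot server.

```python
def parse_log_lines(lines):
    """Turn lines such as 'lr=0.01 loss=3.5' into one dictionary per line."""
    records = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        record = {}
        for token in line.split():
            key, _, raw = token.partition("=")
            try:
                record[key] = float(raw)  # numeric values become floats
            except ValueError:
                record[key] = raw  # anything else stays a string
        records.append(record)
    return records


# Hypothetical log content, reusing the values from the notebook example.
sample_log = [
    "# hypothetical training log",
    "dropout=0.1 lr=0.001 optimizer=SGD loss=10.0",
    "dropout=0.15 lr=0.01 optimizer=Adam loss=3.5",
]
records = parse_log_lines(sample_log)
```

The resulting list has the same shape as the data list in the notebook example, so it could be passed to hip.Experiment.from_iterable in the same way.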

Visualization for Population-Based Training. Current approaches to hyperparameter tuning include genetic algorithms such as Population-Based Training, where training tasks can be forked several times with different hyperparameters. Such experiments can be challenging to analyze and may contain bugs that are difficult to spot. With HiPlot, such experiments can be visualized, as its XY plot can render edges between related data points.
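To illustrate what "edges between related data points" means, here is a small sketch of a forked population in plain Python. The key names uid and from_uid, the sample values, and both helpers are illustrative choices for this example, not a description of HiPlot's internals.

```python
# Hypothetical population: each task records which task it was forked from.
population = [
    {"uid": "a0", "from_uid": None, "lr": 0.1, "loss": 5.0},
    {"uid": "a1", "from_uid": "a0", "lr": 0.05, "loss": 4.2},
    {"uid": "b1", "from_uid": "a0", "lr": 0.2, "loss": 6.1},
    {"uid": "a2", "from_uid": "a1", "lr": 0.04, "loss": 3.9},
]


def edges(rows):
    """List (parent, child) pairs, one per fork."""
    return [(row["from_uid"], row["uid"]) for row in rows if row["from_uid"] is not None]


def lineage(rows, uid):
    """Walk from a task back to its oldest ancestor, returned oldest first."""
    by_uid = {row["uid"]: row for row in rows}
    chain, seen = [], set()
    while uid is not None and uid in by_uid and uid not in seen:
        seen.add(uid)  # guards against accidental cycles in the log
        chain.append(uid)
        uid = by_uid[uid]["from_uid"]
    return list(reversed(chain))


fork_edges = edges(population)
best_path = lineage(population, "a2")
```

Drawing each pair in fork_edges as a line on an XY plot makes it easy to see which forks improved on their parent and which did not.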

Why it matters:

ML models are getting ever more complex and often have many hyperparameters. At Facebook AI, we have been using HiPlot to explore and efficiently analyze hyperparameter tuning of deep neural networks with dozens of hyperparameters and more than 100,000 experiments. We hope this tool will enable other scientists and engineers to explore and make the most of their own experimental data, while also paving the way for more dynamic training methods, such as those inspired by genetic algorithms.