Introducing Opacus: A high-speed library for training PyTorch models with differential privacy

August 31, 2020

We are releasing Opacus, a new high-speed library for training PyTorch models with differential privacy (DP) that’s more scalable than existing state-of-the-art methods. Differential privacy is a mathematically rigorous framework for quantifying the anonymization of sensitive data. It’s often used in analytics, with growing interest in the machine learning (ML) community. With the release of Opacus, we hope to provide an easier path for researchers and engineers to adopt differential privacy in ML, as well as to accelerate DP research in the field.

Opacus provides:

Speed: By leveraging Autograd hooks in PyTorch, Opacus can compute batched per-sample gradients, resulting in an order of magnitude speedup compared with existing DP libraries that rely on microbatching.
Safety: Opacus uses a cryptographically safe pseudo-random number generator for its security-critical code. The generator runs at high speed on the GPU for an entire batch of parameters.
Flexibility: Thanks to PyTorch, engineers and researchers can quickly prototype their ideas by mixing and matching our code with PyTorch code and pure Python code.
Productivity: Opacus comes with tutorials, helper functions that warn about incompatible layers before your training even starts, and automatic refactoring mechanisms.
Interactivity: Opacus keeps track of how much of your privacy budget (a core mathematical concept in DP) you are spending at any given point in time, enabling early stopping and real-time monitoring.

Opacus defines a lightweight API by introducing the PrivacyEngine abstraction, which takes care of both tracking your privacy budget and clipping and adding noise to your model’s gradients. You don’t need to call it directly for it to operate, as it attaches to a standard PyTorch optimizer. It works behind the scenes, making training with Opacus as easy as adding these lines of code at the beginning of your training code:

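import torch
from opacus import PrivacyEngine

# Net and train_loader are your own model class and DataLoader, defined elsewhere.
model = Net()
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)

privacy_engine = PrivacyEngine(
    model,
    batch_size=32,
    sample_size=len(train_loader.dataset),
    alphas=range(2, 32),   # orders used for the privacy accounting
    noise_multiplier=1.3,  # noise standard deviation relative to max_grad_norm
    max_grad_norm=1.0,     # clipping threshold for each per-sample gradient
)
privacy_engine.attach(optimizer)
# That's it! Now it's business as usual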
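
To give a feel for what "tracking your privacy budget" involves, here is a minimal sketch in plain Python. It is not Opacus code and does not reproduce the library's accounting. It assumes the simplest possible setting: every step is a Gaussian mechanism with sensitivity 1, the Rényi DP cost at order alpha is alpha / (2 * sigma^2), costs add up across steps, and the total converts to epsilon through the standard bound epsilon = rdp + log(1/delta) / (alpha - 1), minimized over the candidate orders. The function name, the budget of 50 and the delta of 1e-5 are illustrative assumptions. Because the sketch ignores the privacy amplification that comes from sampling small batches out of a large dataset, its numbers are far more pessimistic than what a real accountant would report.

```python
import math


def epsilon_after_steps(steps, noise_multiplier, delta, alphas):
    """Return (epsilon, best_alpha) after `steps` Gaussian-mechanism releases.

    Simplified accounting: no subsampling amplification is applied,
    so the result is a loose upper bound on the privacy spent.
    """
    best = None
    for alpha in alphas:
        # Renyi DP of one Gaussian release, composed additively over steps.
        rdp = steps * alpha / (2.0 * noise_multiplier ** 2)
        # Standard conversion from Renyi DP to (epsilon, delta)-DP.
        eps = rdp + math.log(1.0 / delta) / (alpha - 1)
        if best is None or eps < best[0]:
            best = (eps, alpha)
    return best


# Early stopping: keep training only while the budget is not exceeded.
target_epsilon = 50.0  # hypothetical budget chosen for this example
steps_taken = 0
for step in range(1, 1001):
    eps, _ = epsilon_after_steps(step, noise_multiplier=1.3,
                                 delta=1e-5, alphas=range(2, 32))
    if eps > target_epsilon:
        break
    steps_taken = step
```

The loop shows the pattern the budget enables: recompute the privacy spent after each step and stop as soon as the next step would exceed the target.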

After training, the resulting artifact is a standard PyTorch model with no extra steps or hurdles for deploying private models: If you can deploy a model today, you can deploy it after it has been trained with DP without changing a single line of code.

The Opacus library also includes pre-trained and fine-tuned models, tutorials for large-scale models, and the infrastructure designed for experiments in privacy research. It’s open-sourced here.

Achieving high-speed privacy training with Opacus

Our goal with Opacus is to preserve the privacy of each training sample while limiting the impact on the accuracy of the final model. Opacus does this by modifying a standard PyTorch optimizer in order to enforce (and measure) DP during training. More specifically, our approach is centered on differentially private stochastic gradient descent (DP-SGD).

The core idea behind this algorithm is that we can protect the privacy of a training dataset by intervening on the parameter gradients that the model uses to update its weights, rather than the data directly. By adding noise to the gradients in every iteration, we prevent the model from memorizing its training examples while still enabling learning in aggregate. The (unbiased) noise will naturally tend to cancel out over the many batches seen during the course of training.

However, adding noise requires a delicate balance: Too much noise would destroy the signal and too little would not guarantee privacy. To determine the right scale, we look at the norm of the gradients. It’s important to limit how much each sample can contribute to the gradient because outliers have larger gradients than most samples. We need to ensure privacy for those outliers, especially because they are at the greatest risk of being memorized by the model. To do this, we compute the gradient for each individual sample in a minibatch. We clip each of these gradients individually, accumulate them back into a single gradient tensor, and then add noise to the total sum.
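
The clip, accumulate and add-noise sequence can be sketched in a few lines of plain Python. This is an illustration of the arithmetic, not the Opacus implementation: gradients are flat lists of floats, the function names are made up for this example, and the noise standard deviation is assumed to be noise_multiplier times max_grad_norm. The small constant in the denominator is only there to avoid dividing by zero.

```python
import math
import random


def clip_to_norm(grad, max_grad_norm):
    """Scale one per-sample gradient so its L2 norm is at most max_grad_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_grad_norm / (norm + 1e-6))
    return [g * scale for g in grad]


def dp_sgd_gradient(per_sample_grads, max_grad_norm, noise_multiplier, rng):
    """Clip each sample's gradient, sum them, add Gaussian noise, then average."""
    clipped = [clip_to_norm(g, max_grad_norm) for g in per_sample_grads]
    summed = [sum(column) for column in zip(*clipped)]
    sigma = noise_multiplier * max_grad_norm  # assumed noise scale
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [n / len(per_sample_grads) for n in noisy]


# Three samples; the second is an outlier with a gradient norm of 50.
grads = [[0.3, 0.4], [30.0, 40.0], [-0.6, 0.8]]
rng = random.Random(0)  # seeded only to make the sketch repeatable
private_grad = dp_sgd_gradient(grads, max_grad_norm=1.0,
                               noise_multiplier=1.3, rng=rng)
```

After clipping, the outlier contributes a gradient of norm 1 instead of 50, so no single sample can move the sum by more than max_grad_norm. That bound is what lets a fixed amount of noise cover every sample. Note that Python's random module is not cryptographically safe, so it stands in here only for illustration.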

This per-sample computation was one of the biggest hurdles in building Opacus. It is harder than the typical PyTorch workflow, in which Autograd computes a single gradient tensor for the entire batch, because that is what most other ML use cases need and it gives the best performance. To overcome this, we used an efficient technique to obtain all the desired gradient vectors when training a standard neural network. For the model parameters, we return the gradient of the loss for each example in a given batch in isolation.

Here’s a diagram of the Opacus workflow in which we compute per-sample gradients.

By tracking some intermediate quantities as we run our layers, we can train with any batch size that fits in memory, making our approach an order of magnitude faster compared with the alternative micro-batch method used in other packages.
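
As a rough illustration of how intermediate quantities yield per-sample gradients, consider a single linear layer y = W x. If we keep each sample's input x and the gradient of the loss with respect to its output y, then that sample's gradient with respect to W is simply their outer product, and no separate backward pass per sample is needed. The sketch below is plain Python and assumes a squared-error loss; the helper names are hypothetical and the code is not taken from Opacus.

```python
def linear_forward(weight, x):
    """Compute y = W x for one sample; weight is a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]


def per_sample_weight_grads(activations, output_grads):
    """For y = W x, each sample's dL/dW is outer(dL/dy, x)."""
    return [
        [[gy * xi for xi in x] for gy in grad_y]
        for x, grad_y in zip(activations, output_grads)
    ]


weight = [[1.0, 2.0], [0.0, -1.0]]
batch_x = [[1.0, 0.0], [0.5, 2.0]]   # layer inputs saved during the forward pass
targets = [[0.0, 0.0], [1.0, 1.0]]

# Loss per sample is 0.5 * ||y - target||^2, so dL/dy = y - target.
outputs = [linear_forward(weight, x) for x in batch_x]
output_grads = [[y - t for y, t in zip(out, tgt)]
                for out, tgt in zip(outputs, targets)]

grads = per_sample_weight_grads(batch_x, output_grads)

# Summing over samples recovers the usual whole-batch gradient.
batch_grad = [[sum(g[i][j] for g in grads) for j in range(2)] for i in range(2)]
```

In a tensor library the same outer products can be computed for the whole batch in one batched operation, which is why this style of computation scales with the batch size rather than requiring one pass per sample.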

The importance of privacy-preserving ML

The security community has encouraged developers of security-critical code to use a small number of carefully vetted and professionally maintained libraries. This “don’t roll your own crypto” principle helps minimize attack surface by allowing application developers to focus on what they know best: building great products. As applications and research of ML continue to accelerate, it’s important for ML researchers to access easy-to-use tools for mathematically rigorous privacy guarantees without slowing down the training process.

We hope that by developing PyTorch tools like Opacus, we’re democratizing access to such privacy-preserving resources. We’re bridging the divide between the security community and general ML engineers with a faster, more flexible platform using PyTorch.

Building community

Over the last few years, the privacy-preserving machine learning (PPML) community has grown rapidly. We’re excited by the ecosystem that’s already forming around Opacus with leaders in PPML.

One of our key contributors is OpenMined, a community of thousands of developers who are building applications with privacy in mind. The OpenMined community already contributes to CrypTen and leverages many of the PyTorch building blocks to underpin PySyft and PyGrid for differential privacy and federated learning. As part of the collaboration, Opacus will become a dependency for the OpenMined libraries, such as PySyft.

We look forward to continuing our collaboration and growing the community further.

Opacus is part of Facebook AI’s broader efforts to spur progress in developing secure computing techniques for machine learning and responsible AI. Overall, this is an important stepping stone in shifting the field toward building privacy-first systems in the future.

To dive deeper into the concepts of differential privacy, we are starting a series of Medium posts dedicated to differentially-private machine learning. The first piece focuses on the key fundamental concepts. Read the PyTorch Medium blog here.
We also offer comprehensive tutorials and the Opacus open source library here.

Written By

Davide Testuggine

Applied Research Scientist

Ilya Mironov

Applied Research Scientist
