PyTorch used in effort to accelerate drug discovery

September 30, 2020

At a time when biopharmaceutical scientists must comb through massive amounts of data in the quest to deliver life-changing medicines, PyTorch is helping AstraZeneca work to accelerate its drug discovery research.

AstraZeneca, which is headquartered in Cambridge, UK, has a broad portfolio of prescription medicines, primarily for the treatment of diseases in Oncology; Cardiovascular, Renal & Metabolism; and Respiratory & Immunology.

“The vast amount of data our research scientists have access to is exponentially growing each year, and maintaining a comprehensive knowledge of all this information is increasingly challenging,” wrote Gavin Edwards, a Machine Learning Engineer at AstraZeneca.

Edwards is part of AstraZeneca’s Biological Insights Knowledge Graph (BIKG) team. He explains that knowledge graphs are networks of contextualized scientific data, such as genes, proteins, diseases, and compounds, together with the relationships between them. As these knowledge graphs grow and become more complex, machine learning gives AstraZeneca’s BIKG team a way to analyze the data within them and find relevant connections more quickly and efficiently.
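To make the idea of a knowledge graph concrete, here is a minimal sketch in plain Python. It assumes the common convention of storing a graph as (head, relation, tail) triples. The entity names, relation names, and helper functions are all hypothetical placeholders chosen for illustration. They are not taken from BIKG, whose actual schema and tooling are not described in this article.

```python
from collections import defaultdict

# Hypothetical toy triples (head, relation, tail); placeholder names, not real BIKG data.
TRIPLES = [
    ("GENE_A", "encodes", "PROTEIN_A"),
    ("PROTEIN_A", "associated_with", "DISEASE_X"),
    ("COMPOUND_1", "inhibits", "PROTEIN_A"),
    ("GENE_B", "associated_with", "DISEASE_X"),
]


def build_index(triples):
    """Map each head entity to its outgoing (relation, tail) edges."""
    index = defaultdict(list)
    for head, relation, tail in triples:
        index[head].append((relation, tail))
    return index


def heads_linked_to(triples, tail, relation):
    """Return, in sorted order, every head connected to `tail` by `relation`."""
    return sorted(h for h, r, t in triples if r == relation and t == tail)


index = build_index(TRIPLES)
linked = heads_linked_to(TRIPLES, "DISEASE_X", "associated_with")
print(linked)
```

A query such as `heads_linked_to` is the kind of "relevant connection" lookup that becomes hard to do by hand once a graph holds millions of edges, which is where learned models come in.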

“We can use this approach to identify, say, the top 10 drug targets our scientists should pursue for a given disease,” wrote Edwards.
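As a rough illustration of the ranking step Edwards describes, the sketch below selects the highest-scoring candidates from a set of per-target scores. It assumes some model has already assigned each candidate target a relevance score for one disease. The target names, the scores, and the function `top_k_targets` are hypothetical, and the article does not say how AstraZeneca computes or ranks its scores.

```python
import heapq


def top_k_targets(scores, k=10):
    """Return the k highest-scoring (target, score) pairs, best first."""
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])


# Hypothetical model scores for one disease; placeholder values only.
scores = {
    "TARGET_A": 0.91,
    "TARGET_B": 0.42,
    "TARGET_C": 0.77,
    "TARGET_D": 0.13,
}

top = top_k_targets(scores, k=2)
print(top)
```

If fewer than `k` candidates exist, `heapq.nlargest` simply returns all of them in descending order, so the default `k=10` is safe on small inputs.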

Since a great deal of the data used to form knowledge graphs comes in the form of unstructured text, AstraZeneca uses PyTorch’s natural language processing (NLP) library to define and train models.

“PyTorch is a natural choice for teams that want to be at the forefront of NLP research, as it’s widely used by the academic community and allows us to quickly implement ideas from the latest papers,” wrote Edwards.

The BIKG team also uses PyTorch in conjunction with Microsoft’s Azure Machine Learning to build machine learning models that recommend drug targets.

Learn more about how AstraZeneca is using PyTorch and Azure in an effort to accelerate drug discovery.