Artificial Intelligence Computing Leadership from NVIDIA

2022-09-24 07:53:21 By : Mr. Jack Dong

As scientists probe for new insights about DNA, proteins and other building blocks of life, the NVIDIA BioNeMo framework — announced today at NVIDIA GTC — will accelerate their research.

NVIDIA BioNeMo is a framework for training and deploying large biomolecular language models at supercomputing scale — helping scientists better understand disease and find therapies for patients. The large language model (LLM) framework will support chemistry, protein, DNA and RNA data formats.

It’s part of the NVIDIA Clara Discovery collection of frameworks, applications and AI models for drug discovery.

Just as AI is learning to understand human languages with LLMs, it’s also learning the languages of biology and chemistry. By making it easier to train massive neural networks on biomolecular data, NVIDIA BioNeMo helps researchers discover new patterns and insights in biological sequences — insights that researchers can connect to biological properties or functions, and even human health conditions.

NVIDIA BioNeMo provides a framework for scientists to train large-scale language models using bigger datasets, resulting in better-performing neural networks. The framework will be available in early access on NVIDIA NGC, a hub for GPU-optimized software.

In addition to the language model framework, NVIDIA BioNeMo has a cloud API service that will support a growing list of pretrained AI models.

Scientists using natural language processing models for biological data today often train relatively small neural networks that require custom preprocessing. By adopting BioNeMo, they can scale up to LLMs with billions of parameters that capture information about molecular structure, protein solubility and more.

BioNeMo is an extension of the NVIDIA NeMo Megatron framework for GPU-accelerated training of large-scale, self-supervised language models. It’s domain specific, designed to support molecular data represented in the SMILES notation for chemical structures, and in FASTA sequence strings for amino acids and nucleic acids.

“The framework allows researchers across the healthcare and life sciences industry to take advantage of their rapidly growing biological and chemical datasets,” said Mohammed AlQuraishi, founding member of the OpenFold Consortium and assistant professor at Columbia University’s Department of Systems Biology. “This makes it easier to discover and design therapeutics that precisely target the molecular signature of a disease.”

For developers looking to quickly get started with LLMs for digital biology and chemistry applications, the NVIDIA BioNeMo LLM service will include four pretrained language models. These are optimized for inference and will be available under early access through a cloud API running on NVIDIA DGX Foundry.

In the future, researchers using the BioNeMo LLM service will be able to customize the LLM models for higher accuracy on their applications in a few hours — with fine-tuning and new techniques such as p-tuning, a training method that requires a dataset with just a few hundred examples instead of millions.

A wave of experts in biotech and pharma are adopting NVIDIA BioNeMo to support drug discovery research.

“The BioNeMo framework is an enabling technology to efficiently leverage the power of LLMs for data-driven protein design within our design-build-test cycle,” said Andrew Ferguson, co-founder and head of computation at Evozyne. “This will have an immediate impact on our design of novel functional proteins, with applications in human health and sustainability.”

“As we see the ever-widening adoption of large language models in the protein space, being able to efficiently train LLMs and quickly modulate model architectures is becoming hugely important,” said Istvan Redl, machine learning lead at Peptone, a biotech startup in the NVIDIA Inception program. “We believe that these two engineering aspects — scalability and rapid experimentation — are exactly what the BioNeMo framework could provide.”

Sign up for early access to the NVIDIA BioNeMo LLM service or BioNeMo framework. For hands on-experience with the MegaMolBART chemistry model in BioNeMo, request a free lab from NVIDIA LaunchPad on training and deploying LLMs.

Discover the latest in AI and healthcare at GTC, running online through Thursday, Sept. 22. Registration is free. 

Watch the GTC keynote address by NVIDIA founder and CEO Jensen Huang below:

Main image by Mahendra awale, licensed under CC BY-SA 3.0 via Wikimedia Commons

An Elevated Experience: XPENG Launches G9 EV, Taking Innovation Even Higher with NVIDIA DRIVE Orin

World-Class: NVIDIA Research Builds AI Model to Populate Virtual Worlds With 3D Objects, Characters

Go Hands On: Logitech G CLOUD Launches With Support for GeForce NOW

Continental and AEye Join NVIDIA DRIVE Sim Sensor Ecosystem, Providing Rich Capabilities for AV Development

Inside AI: NVIDIA DRIVE Ecosystem Creates Pioneering In-Cabin Features With NVIDIA DRIVE IX

NVIDIA websites use cookies to deliver and improve the website experience. See our cookie policy for further details on how we use cookies and how to change your cookie settings.