Machine-learning program imagines a protein's many possible structures

New imaging technology powered by a machine-learning algorithm promises to reveal the idiosyncrasies of ribosome assembly and protein production inside cells. Photo by Dcrjsr/Wikimedia Commons

Feb. 4 (UPI) -- To study biological molecules like proteins, scientists rely on cryo-electron microscopy. The 3D-imaging technology is ideal for studying proteins that exist in only a single structural arrangement, or conformation.

Unfortunately, many proteins can assume a variety of shapes, complicating biomolecular surveys.


Thanks to a new machine-learning algorithm, however, scientists can now anticipate and recognize a protein's varied structural iterations.

The new AI system, described Thursday in the journal Nature Methods, does more than image a diversity of conformations. It can also predict the varied motions of different protein structures.

Instead of mapping different protein structures on individual 3D lattices, the neural network combines the full assemblage of potential structures -- and the idiosyncrasies of their movements -- into a single, working model.

"Our idea was to try to use machine-learning techniques to better capture the underlying structural heterogeneity, and to allow us to inspect the variety of structural states that are present in a sample," corresponding author Joseph Davis, an assistant professor in the biology department at the Massachusetts Institute of Technology, said in a press release.


To showcase the new technology, researchers analyzed the structures formed during ribosome assembly, the process by which cells build ribosomes, the molecular machines that translate messenger RNA into proteins.

Researchers stalled, or blocked, the ribosome assembly process at various points and used cryo-EM to capture images of the different protein structures.

At different times during the assembly process, the blockage resulted in a single protein conformation, suggesting some proteins can only be built a certain way.

Other times, the blockage yielded a variety of structures. For some proteins, construction can follow a variety of paths.

Because the process of ribosome assembly is fast, complex and yields so many structures, documenting the full diversity of structural states is impossible using traditional cryo-EM techniques.

"In general, it's an extremely challenging problem to try to figure out how many states you have when you have a mixture of particles," Davis said.

By using machine learning to analyze a variety of 2D images of the ribosome assembly process, researchers were able to model the full scope of structural diversity hidden within it.

In doing so, researchers were able to gain new insights into the protein synthesis process.

Previously, scientists believed ribosomes first form large structural elements, like the foundation of a building, upon which a diversity of additions can be layered to form different ribosome "active sites," where RNA is translated and proteins are assembled.


But using the new imaging and analysis system, researchers realized that in a small percentage of ribosomes, a structure scientists thought was added at the end is actually formed before the larger foundation.

The discovery suggests that assembling molecular structures in a precise order every time may sometimes require too much energy to be worth it.

"The cells are likely evolved to find a balance between what they can tolerate, which is maybe a small percentage of these types of potentially deleterious structures, and what it would cost to completely remove them from the assembly pathway," Davis said.

Scientists are now putting their machine-learning algorithm to work surveying the infamous spike protein of SARS-CoV-2, the virus that causes COVID-19. The spike protein latches onto human cell receptors and allows the virus to invade.

Analysis suggests the spike protein features three receptor binding domains, or RBDs, each of which can exist in one of two positions -- facing either up or down.

Understanding the structural characteristics of potentially harmful biomolecules can help researchers identify weaknesses and develop potential therapies.

"As we start to think about how one might develop small molecule compounds to force all of the RBDs into the 'down' state so that they can't interact with human cells, understanding exactly what the 'up' state looks like and how much conformational flexibility there is will be informative for drug design," Davis said.


"We hope our new technique can reveal these sorts of structural details," Davis said.
