The Guardian - UK
Technology
Natalie Grover Science correspondent

AI firm DeepMind puts database of the building blocks of life online

Protein structures representing the data obtained using AlphaFold.
The AlphaFold database will increase our understanding of how proteins function, say scientists. Photograph: Karen Arnott/EMBL-EBI/PA

Last year the artificial intelligence group DeepMind cracked a mystery that has flummoxed scientists for decades: stripping bare the structure of proteins, the building blocks of life. Now, having amassed a database of nearly all human protein structures, the company is making the resource available online free for researchers to use.

The key to understanding our basic biological machinery is its architecture. The chains of amino acids that make up proteins twist and turn to make the most confounding of 3D shapes. It is this elaborate form that explains protein function, from enzymes that are crucial to metabolism to antibodies that fight infectious attacks.

Despite years of onerous and expensive lab work that began in the 1950s, scientists have decoded the structure of only a fraction of human proteins. DeepMind’s AI program, AlphaFold, has predicted the structure of nearly all 20,000 proteins expressed by humans. In an independent benchmark test that compared predictions to known structures, the system was able to predict the shape of a protein to a good standard 95% of the time.

DeepMind, which has partnered with the European Molecular Biology Laboratory’s European Bioinformatics Institute (EMBL-EBI), hopes the database will help researchers to analyse how life works at an atomic scale by unpacking the apparatus that drives some diseases, make strides in the field of personalised medicine, create more nutritious crops and develop “green enzymes” that can break down plastic.

Collaboration in recent months with scientists working on a range of projects – from diseases that disproportionately affect poorer parts of the world to antibiotic resistance and the biology of the virus that causes Covid – has already begun.

“The applications are actually limited only by our imagination – but at a more fundamental level, the AlphaFold database will increase our understanding of how proteins function, and their role in the fundamental processes of life,” said Prof Edith Heard, the director-general of the EMBL.

“This understanding means we can be better equipped to unravel the molecular mechanisms of life and accelerate our pursuits to protect and treat human health, as well as the health of our planet, and making this tool open access will accelerate the power of research discovery and innovation for scientists around the world.”

AlphaFold’s ability to predict protein structure with dizzying accuracy was unveiled at the biennial “protein olympics” last year. Participants were given the amino acid sequences for about 100 proteins and challenged to work out their structures. AlphaFold not only eclipsed the performance of other computer programs but achieved accuracy comparable to that of laborious lab-based methods.

“I almost fell off my chair in just excitement and amazement that this longstanding problem of how proteins fold had been solved,” said Prof Ewan Birney, the director of the EMBL-EBI, after the results were first presented in November.

“This dataset is rather like the human genome … and it’s this dataset where we start some new bits of science that we weren’t able to do beforehand. I’m very excited to start walking down that road.”
