By Marc Zimmer
New London: During her Nobel Prize in Chemistry lecture in 2018, Frances Arnold said: “Today we can, for all practical purposes, read, write and edit any DNA sequence, but we cannot compose it.” That is no longer true.
Since then, science and technology have progressed so much that artificial intelligence has learned to compose DNA and, with genetically modified bacteria, scientists are on the way to designing and manufacturing custom proteins.
The goal is that, with AI's design skills and gene editing's engineering abilities, scientists can modify bacteria to act as mini factories that produce new proteins able to reduce greenhouse gases, digest plastics, or act as species-specific pesticides.
As a chemistry professor and computational chemist who studies molecular science and environmental chemistry, I think advances in AI and gene editing make this a realistic possibility.
Gene sequencing: reading the recipes of life
All living beings contain genetic material (DNA and RNA) that provides the hereditary information necessary to replicate and produce proteins. Proteins make up 75 percent of human dry weight. They form muscles, enzymes, hormones, blood, hair and cartilage. Understanding proteins means understanding much of biology. The order of nucleotide bases in DNA, or in RNA in some viruses, encodes this information, and genomic sequencing technologies identify the order of these bases.
The Human Genome Project was an international effort that sequenced the entire human genome between 1990 and 2003. Thanks to rapidly improving technologies, it took seven years to sequence the first 1 percent of the genome and another seven years for the remaining 99 percent. By 2003, scientists had the complete sequence of the 3 billion nucleotide base pairs that encode between 20,000 and 25,000 genes in the human genome.
However, understanding the functions of most proteins and correcting their dysfunctions remained a challenge.
AI learns proteins
The shape of each protein is fundamental to its function and is determined by the sequence of its amino acids, which in turn is determined by the nucleotide sequence of the gene. Misfolded proteins have the wrong shape and can cause illnesses such as neurodegenerative diseases, cystic fibrosis and type 2 diabetes. Understanding these diseases and developing treatments requires knowledge of protein shapes.
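To make the link between nucleotide sequence and amino acid sequence concrete, here is a minimal sketch of how three-letter codons map to amino acids. The toy gene and the function name `translate` are made up for illustration, and the codon table is deliberately partial (the standard genetic code has 64 codons), so treat this as a rough illustration rather than a real bioinformatics tool.

```python
# Partial codon table: DNA coding-strand codons -> one-letter amino acid codes.
# Only the codons used by the toy example are listed (an assumption made for
# brevity); the standard genetic code has 64 entries.
CODON_TABLE = {
    "ATG": "M",  # methionine, also the usual start codon
    "GCC": "A",  # alanine
    "AAA": "K",  # lysine
    "TGG": "W",  # tryptophan
    "TAA": "*",  # stop codon
}


def translate(dna: str) -> str:
    """Translate a coding-strand DNA string, stopping at the first stop codon."""
    protein = []
    usable_length = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    for i in range(0, usable_length, 3):
        codon = dna[i:i + 3].upper()
        amino_acid = CODON_TABLE.get(codon)
        if amino_acid is None:
            raise KeyError(f"codon {codon!r} is not in the partial table")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


toy_gene = "ATGGCCAAATGGTAA"  # hypothetical sequence, not a real gene
print(translate(toy_gene))  # prints MAKW
```

Changing a single base in the toy gene can change an amino acid or introduce an early stop, which is one way a change in a gene can alter the protein it encodes.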
Before 2016, the only way to determine the shape of a protein was through X-ray crystallography, a laboratory technique that uses the diffraction of X-rays by single crystals to determine the precise three-dimensional arrangement of atoms in a molecule. At that time, the structures of about 200,000 proteins had been determined by crystallography, at a cost of billions of dollars.
AlphaFold, a machine learning program, used these crystal structures as a training set to determine the shape of proteins from their nucleotide sequences. And in less than a year, the program calculated the protein structures of the 214 million genes that have been sequenced and published. All protein structures determined by AlphaFold have been published in a freely accessible database.
To effectively address non-infectious diseases and design new drugs, scientists need more detailed knowledge of how proteins, especially enzymes, bind to small molecules. Enzymes are protein catalysts that enable and regulate biochemical reactions.
AlphaFold3, launched on May 8, 2024, can predict the shapes of proteins and the places where small molecules can bind to these proteins. In rational drug design, drugs are designed to bind to proteins involved in a pathway related to the disease being treated. Small molecule drugs bind to the binding site of proteins and modulate their activity, thereby influencing the disease trajectory. By being able to predict protein binding sites, AlphaFold3 will improve researchers’ drug development capabilities.
AI + CRISPR = composing new proteins
Around 2015, the development of CRISPR technology revolutionized gene editing. CRISPR can be used to find a specific part of a gene, change or delete it, make the cell express more or less of its gene product, or even add a completely foreign gene in its place.
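The "find a specific part of a gene" step can be pictured as a pattern search. The sketch below scans one strand of a DNA string for 20-base stretches followed by an NGG motif, which is the targeting requirement of the widely used Cas9 enzyme from Streptococcus pyogenes. That detail is not from this article, and the function name `find_target_sites` and the toy sequence are hypothetical. Real guide design also checks the opposite strand and possible off-target matches, which this sketch ignores.

```python
import re


def find_target_sites(dna: str, guide_len: int = 20):
    """Return (position, target, motif) for each guide_len-base stretch
    that is immediately followed by an NGG motif on the given strand."""
    dna = dna.upper()
    # A lookahead lets overlapping candidate sites all be reported.
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]


# Hypothetical toy sequence: 20 bases, then a TGG motif, then filler.
toy_dna = "ACGTACGTACGTACGTACGT" + "TGG" + "AAAA"
for position, target, motif in find_target_sites(toy_dna):
    print(position, target, motif)
```

In the laboratory the search is carried out by a guide RNA that pairs with the matching DNA. The code only mimics the matching logic.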
In 2020, Jennifer Doudna and Emmanuelle Charpentier received the Nobel Prize in Chemistry “for the development of a method (CRISPR) for genome editing.” With CRISPR, gene editing, which once took years and was species-specific, expensive and labor-intensive, can now be done in days and at a fraction of the cost.
AI and genetic engineering are advancing rapidly. What was once complicated and expensive is now routine. Looking ahead, the dream is to have custom proteins designed and produced using a combination of machine learning and CRISPR-modified bacteria. The AI would design the proteins and the CRISPR-altered bacteria would produce the proteins. Enzymes produced this way could potentially inhale carbon dioxide and methane while exhaling organic raw materials, or break down plastics into concrete substitutes.
I think these ambitions are not unrealistic, given that genetically modified organisms already represent 2 percent of the US economy in the agricultural and pharmaceutical sectors.
Two groups have created functional enzymes from scratch that were designed using different artificial intelligence systems. David Baker's Institute for Protein Design at the University of Washington devised a new deep-learning-based protein design strategy that it called “family-wide hallucination,” which it used to create a unique light-emitting enzyme. Meanwhile, the biotech startup Profluent has used an AI trained on the sum of all CRISPR-Cas knowledge to design new functional genome editors.
If AI can learn to create new CRISPR systems, as well as working bioluminescent enzymes that have never been seen on Earth, there is hope that CRISPR combined with AI could be used to design other new custom enzymes. Although the CRISPR-AI combination is still in its infancy, once it matures it is likely to be very beneficial and could even help the world address climate change.
It is important to remember, however, that the more powerful a technology is, the greater the risks it poses. Furthermore, humans have not had much success in engineering nature because natural systems are complex and interconnected, which often leads to unintended consequences. (The Conversation)