Accelerating breakthroughs: How artificial intelligence is transforming synthetic biology
Thanks to synthetic biology, the ways we grow, harvest, and use biological resources are changing drastically, and for the better, offering solutions to some of our most pressing environmental challenges in sectors like food production and agriculture. But how will this change once artificial intelligence enters the equation?
Machine learning-assisted processes are not new in Synbio. Since the arrival of large language models, however, several promising models have been researched and applied in synthetic biology, and they are drastically speeding up tasks such as the creation of artificial proteins.
Artificial protein creation with language models
With examples like Solar Foods or Impossible Foods, we already have artificially created proteins successfully entering the market. Protein language models like ProGen are now speeding up the creation of artificial proteins, and it is exciting to see how this will be applied to food production.
ProGen creates artificial proteins through conditional language modeling, an application of the same kind of technology that most of us know from ChatGPT.
Just as language models like ChatGPT are trained on examples of natural language, ProGen was trained on existing protein sequences: 280 million of them, from more than 19,000 protein families.
ProGen predicts the probability of the next amino acid based on the amino acids that precede it, and in this way generates full-length protein sequences, similar to natural proteins, for any protein family. In essence, ProGen writes in the very language of life, amino acid sequences, to create custom proteins that we can grow in the lab. [1]
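To make the idea of next-amino-acid prediction concrete, here is a minimal Python sketch. It is not ProGen: instead of a large neural network it uses a simple count table that looks only one residue back, and the family tags, the boundary markers, and the toy sequences are all made up for illustration. What it shares with the description above is the generation loop: pick a family tag, get a probability for every possible next amino acid, sample one, and repeat.

```python
import random
from collections import Counter, defaultdict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
START, END = "^", "$"                  # hypothetical sequence boundary markers

# Made-up toy sequences, NOT real proteins; the family tags are hypothetical.
TOY_DATA = [
    ("family_a", "MKKLLAAGKKL"),
    ("family_a", "MKALLGAKKLA"),
    ("family_b", "MSDEEDSTDEE"),
    ("family_b", "MSEEDDTSEED"),
]

def train(tagged_sequences):
    """Count how often each residue follows (family tag, previous residue)."""
    counts = defaultdict(Counter)
    for tag, seq in tagged_sequences:
        prev = START
        for aa in seq + END:
            counts[(tag, prev)][aa] += 1
            prev = aa
    return counts

def next_probs(counts, tag, prev):
    """P(next residue | tag, previous residue), with add-one smoothing."""
    vocab = AMINO_ACIDS + END
    seen = counts[(tag, prev)]
    total = sum(seen.values()) + len(vocab)
    return {aa: (seen[aa] + 1) / total for aa in vocab}

def generate(counts, tag, rng, max_len=40):
    """Sample one residue at a time until END is drawn or max_len is reached."""
    prev, out = START, []
    while len(out) < max_len:
        probs = next_probs(counts, tag, prev)
        aa = rng.choices(list(probs), weights=list(probs.values()))[0]
        if aa == END:
            break
        out.append(aa)
        prev = aa
    return "".join(out)

model = train(TOY_DATA)
print(generate(model, "family_a", random.Random(0)))
```

A real protein language model conditions each prediction on the whole preceding sequence rather than a single residue, which is what lets it produce sequences that resemble a full natural protein.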
Just as ProGen predicts the next amino acid, another model, ESMAraPPI, predicts how proteins interact with each other. Remarkably, ESMAraPPI can predict interactions even for protein pairs it has not seen before. When tested on Arabidopsis, a widely used model plant, it was able to take new protein sequences as input and predict the probability of the proteins interacting with one another. [2]
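The input and output of such a predictor can be illustrated with a small Python sketch: two sequences go in, one interaction probability comes out. This is not ESMAraPPI's implementation. The "embedding" here is just the amino acid composition of each sequence, standing in for the much richer representation a protein language model would provide, the classifier is a plain logistic regression, and the labeled pairs are made up for illustration.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def embed(seq):
    """Stand-in embedding: fraction of each of the 20 residues in the sequence."""
    return [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]

def pair_features(seq_a, seq_b):
    """Describe a protein pair by concatenating the two embeddings."""
    return embed(seq_a) + embed(seq_b)

def predict(weights, bias, seq_a, seq_b):
    """Return a probability between 0 and 1 that the two proteins interact."""
    feats = pair_features(seq_a, seq_b)
    z = bias + sum(w * x for w, x in zip(weights, feats))
    return 1.0 / (1.0 + math.exp(-z))

def train(pairs, epochs=200, lr=0.5):
    """Fit logistic regression with plain stochastic gradient descent."""
    weights, bias = [0.0] * (2 * len(AMINO_ACIDS)), 0.0
    for _ in range(epochs):
        for seq_a, seq_b, label in pairs:
            err = predict(weights, bias, seq_a, seq_b) - label
            feats = pair_features(seq_a, seq_b)
            weights = [w - lr * err * x for w, x in zip(weights, feats)]
            bias -= lr * err
    return weights, bias

# Made-up training pairs (sequence A, sequence B, 1 = interacts, 0 = does not).
TOY_PAIRS = [
    ("KKRKKR", "DDEDDE", 1),
    ("RKRKRK", "EDEDED", 1),
    ("KKRKKR", "KRKKRK", 0),
    ("DDEEDD", "EEDDEE", 0),
]

weights, bias = train(TOY_PAIRS)
# Score a pair that was not in the training data.
print(predict(weights, bias, "RRKKRR", "EEDDEE"))
```

The point of the sketch is the last line: once trained, the model can score a pair it has never seen, because it works from features of the sequences rather than from a lookup of known pairs.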
Will AI make Synbio funding more accessible?
As many of us know, AI lets synthetic biologists design and program biology faster than ever before.
AI models can already design core ingredients of life, such as proteins, which is changing how living organisms are engineered.
There’s a lot of room to further automate existing laboratory processes. A lot of time in the lab is spent on repetitive, manual tasks such as liquid handling and pipetting of samples. With AI, processes that were previously manual or machine learning-assisted are becoming even more automatable, so that scientists only need to monitor a task rather than perform it themselves.
It is reasonable to expect that these developments will produce more cost-effective and more environmentally friendly solutions. In my daily work with investors such as angels and VCs, it is commonly acknowledged that Synbio, like other deep tech, requires heavy capital. Research costs will come down, however, as AI-assisted practices become the new default, and that makes the funding rounds of these ventures accessible to more participants.
As the cost of funding research and projects drops, these ventures are no longer solely in the hands of VC funds. Modern, active communities such as DAOs will play a major role in taking Synbio AI solutions to the market.
Milja Inkeri Mäkelä
ValleyDAO Community Member & Writer in Tech and Finance
[1] Large language models generate functional protein sequences across diverse families, Madani et al., Nature Biotechnology (2023). https://doi.org/10.1038/s41587-022-01618-2
[2] Pre-trained protein language model sheds new light on the prediction of Arabidopsis protein–protein interactions, Zhou et al., Plant Methods 19:141 (2023). https://doi.org/10.1186/s13007-023-01119-6