Exploring the Fusion of Machine Learning and Synthetic Biology
This week's newsletter spotlights the convergence of machine learning and synthetic biology. Two recent articles published in Nature Communications (check the press release) by researchers from the Lawrence Berkeley National Laboratory and the Department of Energy’s Joint BioEnergy Institute introduce a machine learning framework designed to aid metabolic engineers in optimizing the production of specific molecules.
While I found the findings quite impressive, I also want to emphasize the importance of mechanistic understanding in science. New machine learning tools can potentially identify an “optimized” metabolic pathway by examining datasets from numerous experiments, but they do not tell metabolic engineers why certain proteins or promoters produce the desired outcome.
In these studies, the authors used a machine learning framework known as the Automated Recommendation Tool (ART), which accelerates the "Learn" phase of the Design-Build-Test-Learn cycle. Notably, this tool can function "without requiring a complete mechanistic understanding of the biological system."
One of my least favorite articles is a 2008 op-ed by Chris Anderson in WIRED, in which he claimed that the traditional scientific method (hypothesize, model, test) was becoming obsolete because of the flood of massive data. That prediction has not come true, and it was made before machine learning became a major focus in biology, bringing hopes that it would solve our experimental challenges.
Thus, while this edition of This Week in Synthetic Biology celebrates the potential of machine learning and synthetic biology, I urge synthetic biologists to remain connected to the essence of being a biologist—engaging with the marvels and complexities of life.
As my undergraduate research mentor used to say, "That’s interesting and all, but what’s the mechanism?"
Machine Learning Enhances Metabolic Engineering: What’s Next, Robot? (Open Access)
The first study from Berkeley's researchers discusses ART, which primarily leverages Python's scikit-learn library. This machine learning framework assists researchers in maximizing target molecule production, reducing cellular toxicity, or adjusting metabolite levels to specific concentrations. It works most of the time. For instance, the researchers applied it to engineer E. coli and S. cerevisiae to produce limonene, synthesize hop flavor metabolites for beer, and generate dodecanol from fatty acids. In the case of dodecanol, however, the predictions were not particularly useful despite data from 50 engineering cycles. Nevertheless, the tool appears valuable and merits exploration. This study was published in Nature Communications.
Machine Learning Optimizes Tryptophan Production in Cells (Open Access)
In the subsequent paper, also published in Nature Communications, ART was employed to enhance tryptophan production in engineered S. cerevisiae. In this instance, it performed exceptionally well. From 7,776 combinatorial designs (five genes, each driven by one of six promoters, drawn from a pool of thirty promoters), ART pinpointed designs yielding up to 74% higher tryptophan titers than the best designs used for model training. The researchers did, however, need a substantial amount of data to train their predictive models: they collected over 120,000 time-series data points and generated more than 500 unique yeast strains during the research.
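To make the size of that design space concrete, here is a minimal sketch that enumerates it. It assumes each of the five genes has its own set of six candidate promoters, which would account for the pool of thirty. The gene and promoter labels are hypothetical placeholders, not names from the paper, and this is not how ART itself represents designs.

```python
from itertools import product

# Hypothetical labels: the newsletter does not name the genes or promoters.
genes = [f"gene_{i}" for i in range(1, 6)]  # five genes

# Assumption: each gene has its own six candidate promoters (5 x 6 = 30 total).
promoters_per_gene = {
    gene: [f"{gene}_promoter_{j}" for j in range(1, 7)]
    for gene in genes
}


def enumerate_designs(options):
    """Yield every design as a dict mapping gene -> chosen promoter."""
    names = list(options)
    for combo in product(*(options[name] for name in names)):
        yield dict(zip(names, combo))


total_promoters = sum(len(p) for p in promoters_per_gene.values())
design_count = sum(1 for _ in enumerate_designs(promoters_per_gene))

print(total_promoters, design_count)  # prints: 30 7776
```

Six choices for each of five genes gives 6^5 = 7,776 designs. Building and testing every one is impractical, which is why a model trained on a few hundred strains is used to recommend which designs to build next.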
Use Your Pasta Maker to Extract DNA—Seriously! (Open Access)
In just 30 seconds, you can use “cellulose-based dipsticks” to extract nucleic acids, according to a new method detailed in Nature Protocols. Each dipstick, which can be produced in large quantities in under 30 minutes, is “dipped” into three buffers: an extraction buffer, where nucleic acids bind to the dipstick; a wash buffer that removes impurities; and an amplification buffer, into which the nucleic acids elute. You’ll also need a pasta maker.
The authors of this study recommend a low-cost, unbranded pasta maker sourced from eBay, which is similar to the Avanti pasta maker (Avanti, cat. no. 26812). However, any brand should suffice—no need to fret about whether your Cucina Pro will work!
Engineered Bacteria Thrive on CO2 and Formic Acid
Engineered E. coli with synthetic pathways for carbon dioxide and formic acid assimilation can survive solely on these substrates. This study, published in Nature Microbiology, was led by Sang Yup Lee’s team at the Korea Advanced Institute of Science and Technology. It builds on a 2019 study where Ron Milo’s team engineered E. coli to derive carbon exclusively from carbon dioxide, using formate as a reducing agent.
A New Viral Gene Drive (Open Access)
Viral gene drives have arrived. Using human cytomegalovirus (a herpesvirus), researchers Marius Walter and Eric Verdin from the Buck Institute for Research on Aging have developed a gene drive capable of propagating through a viral population. When two viruses infect the same host cell, one carrying the gene drive and the other not, the Cas9 encoded by the gene drive cleaves the wildtype sequence. The cut sequence is then repaired using the “gene drive sequence as a repair template,” which converts the wildtype locus into a new copy of the gene drive. This straightforward yet effective approach allows a genetic element to spread within a viral population. The study appeared in Nature Communications.
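The spreading logic can be illustrated with a toy model. The sketch below is my own simplification, not the authors' model. It assumes a well-mixed viral population, a fixed chance that a wildtype genome shares a cell with a drive genome, and a fixed conversion efficiency. The function name and all parameter values are hypothetical.

```python
def simulate_drive(initial_fraction, coinfection_rate, conversion_efficiency, generations):
    """Return the gene-drive fraction of the viral population after each generation.

    Toy assumptions (not from the study): the population is well mixed, and a
    wildtype genome is converted only when it co-infects a cell with a drive
    genome and Cas9 cutting plus templated repair succeeds.
    """
    p = initial_fraction
    fractions = [p]
    for _ in range(generations):
        # Wildtype fraction (1 - p) meets drive fraction p at the co-infection
        # rate; a share of those encounters is converted to drive.
        converted = coinfection_rate * conversion_efficiency * p * (1.0 - p)
        p = p + converted
        fractions.append(p)
    return fractions


# Hypothetical starting point: 5% drive virus, 50% co-infection, 80% conversion.
trajectory = simulate_drive(0.05, 0.5, 0.8, generations=30)
for generation in (0, 5, 10, 20, 30):
    print(generation, round(trajectory[generation], 3))
```

In this sketch the drive fraction can only rise, because conversion runs in one direction, from wildtype to drive. How quickly it rises depends on how often co-infection occurs.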
Rapid-Fire Highlights
More research & reviews worth your time:
- A stiff, functional material created entirely from living cells (free of biopolymers or biominerals) was developed by the Joshi lab. My top pick this week. bioRxiv. Link (Open Access)
- DNA nanoswitches programmed to change shape in response to various viral RNAs, including Zika and SARS-CoV-2, can be detected through gel electrophoresis. Science Advances. Link (Open Access)
- A lab-evolved E. coli strain capable of utilizing “acetate as its sole carbon and energy source” was engineered to produce mevalonate and n-butanol. Metabolic Engineering. Link
- An enzyme called β-xylosidase, which hydrolyzes D-xylose sugars, was engineered to form O-, N-, S-, and Se-glycosides along with sugar esters and phosphoesters. This mutant enzyme is termed thioglycoligase. Nature Communications. Link (Open Access)
- A web tool for designing pegRNAs for your next prime-editing experiments has been created. Nature Biomedical Engineering. Link
- Prime-editing was employed in adult stem cells to correct “disease-causing mutations in…liver organoids from a patient with Wilson disease,” a rare genetic disorder that causes copper accumulation in the liver, brain, and eyes. bioRxiv. Link (Open Access)
- A review discussing how CRISPR–Cas systems can enhance plant yields, bolster disease resistance, and expedite domestication was published. Nature Reviews Molecular Cell Biology. Link
- Another review focusing on CRISPR-Cas applications in cotton plants has been released. Trends in Biotechnology. Link
- Archaea produce unique lipids, such as the curious “C25, C25-archaeal diether-type membrane lipids.” Now, engineered E. coli can produce them as well. Synthetic Biology. Link (Open Access)
- A new Cas9 fusion nuclease, named Cas9-N57, can specifically integrate DNA sequences up to 12 kb long. Nucleic Acids Research. Link (Open Access)
- An adenine base editor capable of converting A•T to G•C in genomic DNA has been engineered for improved “on-target editing efficiency” and reduced off-target effects. Nature Communications. Link (Open Access)
- The Church lab developed a synthetic auxotrophic E. coli strain that remained contained, even after 100 days of continuous growth. bioRxiv. Link (Open Access)
- Interested in the financial viability of DNA storage? A recent review covered this topic. Biotechnology Advances. Link (Open Access)
- Synthetic cells are not as densely packed as real cells, which are filled with proteins, nucleic acids, and other molecules. Researchers have now recreated the “crowded cytoplasm” of cells in protocells, affecting diffusion rates and, consequently, transcription and translation. ACS Synthetic Biology. Link
- Bristol researchers engineered a two-heme binding protein, termed 4D2, elucidated its structure, and further modified it to produce various heme-binding proteins. bioRxiv. Link (Open Access)
- Every single residue in the tip domain of T7 bacteriophage was swapped out to investigate each amino acid's role in bacteriophage:host interactions. Researchers tested a total of 1660 variants. bioRxiv. Link (Open Access)
- The Z-ring (composed mainly of FtsZ and FtsA) initiates bacterial cell division. A new study has reconstituted FtsA-FtsZ “ring-like structures” entirely through cell-free gene expression within liposome compartments, marking a significant step toward programmable, dividing cells created from the ground up. Communications Biology. Link (Open Access)
- An article in EMBO Reports outlines “a policy framework for transitioning towards a sustainable carbon cycle economy,” with synthetic biology playing a crucial role. EMBO Reports. Link
- A variant of ubiquitin was engineered to conditionally regulate the stability and expression level of its fused proteins. Cell Chemical Biology. Link
#SynBio in the News
(Not much to report this week)
- Alden Wicker, a seasoned fashion journalist, discussed “carbon-neutral, recyclable, biodegradable, and affordable materials” for Neo.Life.
- Matthias Berninger, Bayer's Head of Public Affairs, Science, and Sustainability, was interviewed for a Bioeconomy.xyz article.
- Base editing was utilized to reverse inherited deafness in mice by correcting a mutation in the TMC1 gene. Emily Mullin reported for OneZero.
Thank you for reading This Week in Synthetic Biology, a part of Bioeconomy.XYZ. If you find this newsletter enjoyable, please share it with a friend.
A version of these newsletters is also available on bioeconomy.xyz and my website, nikomccarty.com. Feel free to reach out with tips and feedback via Twitter at @NikoMcCarty. I welcome constructive criticism.