Semantic Design of de novo Genes with Evo
Music:
Enjoy the music based on this article at the end of the episode.
DOI:
10.1038/s41586-025-09749-7
License:
This episode is based on an open-access article published under the Creative Commons Attribution 4.0 International License (CC BY 4.0) – https://creativecommons.org/licenses/by/4.0/
Support:
Base by Base – Stripe donations: https://donate.stripe.com/7sY4gz71B2sN3RWac5gEg00
Official website: https://basebybase.com
On PaperCast Base by Base you’ll discover the latest in genomics, functional genomics, structural genomics, and proteomics.
Episode link: https://basebybase.com/episodes/semantic-design-of-de-novo-genes-with-evo
Episode:
207: Semantic Design of de novo Genes with Evo
Season:
1
Article title:
Semantic design of functional de novo genes from a genomic language model
Journal:
Nature
QC:
This episode was checked against the original article PDF and publication metadata for the episode release published on 2025-11-24.
QC Scope:
- article metadata and core scientific claims from the narration
- excludes analogies, intro/outro, and music
- transcript coverage: Audited transcript sections covering Evo 1.5 long-context genomic language modeling, in-context design with gene autocomplete, toxin–antitoxin (T2TA/T3TA) design and validation, anti-CRISPR (Acr) design, SynGenome database creation and analyses, and discussed limitations/future directions.
- transcript topics: Evo 1.5 long-context genomic language model; In-context genomic design and autocomplete assessments; Amino acid sequence recovery and sequence diversity; Toxin–antitoxin (T2TA and T3TA) design and validation; Anti-CRISPR (Acr) design and validation; SynGenome database creation and analyses
QC Summary:
- factual score: 10/10
- metadata score: 10/10
- supported core claims: 6
- claims flagged for review: 0
- metadata checks passed: 4
- metadata issues found: 0
Metadata Audited:
- article_doi
- article_title
- article_journal
- license
Factual Items Audited:
- Semantic design uses genomic context to enable function-guided design and generate de novo genes (Evo learns distributional semantics over prokaryotic genes).
- Autocomplete test on conserved genes (e.g., rpoS): Evo 1.5 achieved ~85% amino acid sequence recovery when prompted with 30% of the input sequence.
- Evo-generated functional toxin–antitoxin (T2TA) systems with low sequence identity to known proteins, including multitoxin neutralization by EvoAT2 and EvoAT4.
- Evo-based design of type III toxin–antitoxin (T3TA) systems produced functional RNA antitoxins; EvoAT6 neutralized ToxN.
- Anti-CRISPR (Acr) design yielded functional Acrs with a 17% experimental success rate in SpCas9 assays.
- SynGenome: a database of over 120 billion base pairs of AI-generated genomic sequences; Pfam domain frequencies in SynGenome closely mirror natural genomes (Pearson r ≈ 0.78).
QC result: Pass.
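Two of the audited items are summary statistics: amino acid recovery and the Pearson correlation of Pfam domain frequencies. The sketch below shows, in simplified form, how such numbers can be computed. It is an illustration only, not the article's pipeline: the function names are hypothetical, recovery is reduced to position-wise identity between two pre-aligned sequences of equal length (the article's exact procedure may differ), and all sequences and frequencies are made-up toy values.

```python
from math import sqrt
from typing import Sequence


def sequence_recovery(generated: str, reference: str) -> float:
    """Fraction of matching positions between two pre-aligned sequences.

    Assumption: both strings are already aligned and of equal length.
    """
    if not reference or len(generated) != len(reference):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(g == r for g, r in zip(generated, reference))
    return matches / len(reference)


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# Toy protein fragments (made up): one mismatch in ten positions.
recovery = sequence_recovery("MKTAYIAKQR", "MKTAYLAKQR")

# Toy domain frequencies (made up) for a natural and a generated corpus.
natural_freq = [0.40, 0.30, 0.20, 0.10]
generated_freq = [0.35, 0.33, 0.18, 0.14]
r = pearson_r(natural_freq, generated_freq)

print(f"recovery = {recovery:.2f}")
print(f"Pearson r = {r:.2f}")
```

With real data, the inputs would be a generated protein aligned to its natural counterpart, and per-domain frequency vectors computed over the same set of Pfam domains in both corpora.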
Base by Base with Gustavo Barra is available on multiple platforms. The information on this page comes from public podcast feeds.
