Base by Base

207: Semantic Design of de novo Genes with Evo

15 min · 24 November 2025

Music:
Enjoy the music based on this article at the end of the episode.

DOI:
10.1038/s41586-025-09749-7

License:
This episode is based on an open-access article published under the Creative Commons Attribution 4.0 International License (CC BY 4.0) – https://creativecommons.org/licenses/by/4.0/

Support:
Base by Base – Stripe donations: https://donate.stripe.com/7sY4gz71B2sN3RWac5gEg00

Official website https://basebybase.com

On PaperCast Base by Base you’ll discover the latest in genomics, functional genomics, structural genomics, and proteomics.

Episode link: https://basebybase.com/episodes/semantic-design-of-de-novo-genes-with-evo

Episode:
207: Semantic Design of de novo Genes with Evo

Season:
1

Article title:
Semantic design of functional de novo genes from a genomic language model

Journal:
Nature

QC:
This episode was checked against the original article PDF and publication metadata for the episode release published on 2025-11-24.

QC Scope:
- article metadata and core scientific claims from the narration
- excludes analogies, intro/outro, and music
- transcript coverage: Audited transcript sections covering Evo 1.5 long-context genomic language modeling, in-context design with gene autocomplete, toxin–antitoxin (T2TA/T3TA) design and validation, anti-CRISPR (Acr) design, SynGenome database creation and analyses, and discussed limitations/future directions.
- transcript topics: Evo 1.5 long-context genomic language model; In-context genomic design and autocomplete assessments; Amino acid sequence recovery and sequence diversity; Toxin–antitoxin (T2TA and T3TA) design and validation; Anti-CRISPR (Acr) design and validation; SynGenome database creation and analyses

QC Summary:
- factual score: 10/10
- metadata score: 10/10
- supported core claims: 6
- claims flagged for review: 0
- metadata checks passed: 4
- metadata issues found: 0

Metadata Audited:
- article_doi
- article_title
- article_journal
- license

Factual Items Audited:
- Semantic design uses genomic context to enable function-guided design and generate de novo genes (Evo learns distributional semantics over prokaryotic genes).
- Autocomplete test on conserved genes (e.g., rpoS): Evo 1.5 achieved ~85% amino acid recovery when given 30% of the sequence as input.
- Evo generated functional type II toxin–antitoxin (T2TA) systems with low sequence identity to known proteins, including multitoxin neutralization by EvoAT2 and EvoAT4.
- Evo-based design of type III toxin–antitoxin (T3TA) systems produced functional RNA antitoxins; EvoAT6 neutralized ToxN.
- Anti-CRISPR (Acr) design yielded functional Acrs with a 17% experimental success rate in SpCas9 assays.
- SynGenome: a database of over 120 billion base pairs of AI-generated genomic sequences; Pfam domain frequencies in SynGenome closely mirror natural genomes (Pearson r ≈ 0.78).
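
Worked example:
Two of the audited items rest on simple metrics: amino acid recovery (the share of positions where a generated protein matches the natural one) and a Pearson correlation between domain frequencies. The sketch below shows one plausible way to compute both. It is an illustration, not the article's pipeline: the function names `amino_acid_recovery` and `pearson_r` are hypothetical, the sequences and frequencies are invented toy values, and the recovery function assumes the two sequences are already aligned to equal length.

```python
from math import sqrt


def amino_acid_recovery(reference: str, generated: str) -> float:
    """Percentage of positions where the generated residue equals the reference.

    Assumption: both sequences are pre-aligned and have the same length.
    """
    if not reference or len(reference) != len(generated):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(r == g for r, g in zip(reference, generated))
    return 100.0 * matches / len(reference)


def pearson_r(xs, ys) -> float:
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy sequences (invented): one mismatch in ten positions.
    print(amino_acid_recovery("MKTAYIAKQR", "MKTAYLAKQR"))

    # Toy domain frequencies (invented) for a natural and a generated set.
    natural = [0.30, 0.25, 0.20, 0.15, 0.10]
    generated = [0.28, 0.22, 0.24, 0.14, 0.12]
    print(round(pearson_r(natural, generated), 3))
```

A real comparison would first align the sequences and would count Pfam domains with a dedicated annotation tool; both steps are outside this sketch.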

QC result: Pass.

Base by Base with Gustavo Barra is available on several platforms. The information on this page comes from public podcast feeds.