Lassen F et al., The American Journal of Human Genetics - Meta-analysis of up to 948,690 exome- or whole-genome-sequenced individuals across six biobanks used statistical phasing to infer compound-heterozygous genotypes, increasing detectable bi-allelic damaging genotypes by 19% and identifying 58 significant gene-trait associations, 17 of which show stronger recessive than additive effects. Key terms: recessive genetics, compound heterozygous, biobank meta-analysis, loss-of-function, statistical phasing.
Study Highlights:
The study combined data from UK Biobank (UKB), All of Us, the 100,000 Genomes Project (100kGP), Genes & Health, BioMe, and BioBank Japan (BBJ), totaling 948,690 samples, and phased rare variants to detect compound-heterozygous genotypes. Phasing increased the number of bi-allelic damaging genotypes by 19% and identified 5,563 genes with bi-allelic predicted loss-of-function (pLoF) genotypes. Gene-based recessive testing across 41 traits found 58 significant associations after meta-analysis and Cauchy combination, with 17 instances showing stronger recessive than additive effects, including HBB with heart failure and LECT2 with height. The federated, cross-biobank approach improved power and highlighted the value of diverse ancestries for discovering recessive effects.
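Why phasing matters for compound heterozygotes can be shown with a minimal sketch. It is not the study's pipeline: the function names, the labels, and the input layout (one `(hap0, hap1)` pair per rare damaging variant in a gene) are assumptions made for illustration. The point is that two heterozygous damaging variants knock out both copies of a gene only when they sit on opposite haplotypes, which unphased genotypes cannot tell apart from the case where both sit on the same copy.

```python
def classify_gene_genotype(phased_calls):
    """Classify one person's damaging variants within one gene.

    phased_calls: list of (hap0, hap1) tuples, one per rare damaging variant.
    Each entry is 0 (reference) or 1 (damaging allele) on the haplotype that
    phasing assigned it to. This layout is a hypothetical simplification.
    """
    # A single variant present on both haplotypes already hits both gene copies.
    if any(a == 1 and b == 1 for a, b in phased_calls):
        return "homozygous"

    on_hap0 = sum(1 for a, _ in phased_calls if a == 1)
    on_hap1 = sum(1 for _, b in phased_calls if b == 1)

    if on_hap0 and on_hap1:
        # Different variants on opposite haplotypes (in trans): both copies hit.
        return "compound_heterozygous"
    if on_hap0 + on_hap1 >= 2:
        # Several variants on the same haplotype (in cis): one copy is intact.
        return "cis_heterozygous"
    if on_hap0 + on_hap1 == 1:
        return "heterozygous"
    return "wild_type"


def is_biallelic(label):
    """Bi-allelic means both copies carry a damaging allele."""
    return label in {"homozygous", "compound_heterozygous"}


if __name__ == "__main__":
    # Same unphased genotype (two heterozygous variants), different phase.
    print(classify_gene_genotype([(1, 0), (0, 1)]))
    print(classify_gene_genotype([(1, 0), (1, 0)]))
```

The two calls in the usage example carry identical unphased genotypes, yet only the first counts as a gene knockout under a recessive model. Real statistical phasing is probabilistic, so a practical version would also need to handle phasing uncertainty, which this sketch ignores.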
Conclusion:
Federated meta-analysis across multiple biobanks combined with statistical phasing substantially increases discovery of rare recessive gene-trait associations and expands the catalog of human gene knockouts, demonstrating the importance of phasing and diverse cohorts for recessive-effect discovery.
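For listeners who want to see the mechanics, here is a hedged sketch of the two combination steps named in the highlights. The episode notes do not say which meta-analysis weighting the authors used or exactly which p-values entered the Cauchy combination, so fixed-effect inverse-variance weighting and equal Cauchy weights are assumptions here, and all function names are hypothetical. In a federated setting only per-cohort summary statistics (effect estimate and standard error) need to be shared, which is what the first function consumes.

```python
import math


def inverse_variance_meta(estimates):
    """Fixed-effect meta-analysis (assumed method, not confirmed by the source).

    estimates: list of (beta, se) pairs, one per cohort.
    Returns the pooled (beta, se, z).
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    se = math.sqrt(1.0 / total)
    return beta, se, beta / se


def two_sided_p(z):
    """Two-sided p-value for a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def cauchy_combination(pvalues, weights=None):
    """Cauchy combination of p-values.

    Each p-value is mapped to a Cauchy variate, the variates are averaged,
    and the average is mapped back to a p-value. Inputs of exactly 0 or 1
    are not handled in this sketch.
    """
    if weights is None:
        weights = [1.0] * len(pvalues)  # equal weights are an assumption
    t = sum(w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, pvalues))
    t /= sum(weights)
    return 0.5 - math.atan(t) / math.pi


if __name__ == "__main__":
    # Made-up per-cohort summary statistics, for illustration only.
    beta, se, z = inverse_variance_meta([(0.30, 0.10), (0.20, 0.15), (0.40, 0.20)])
    print(beta, se, two_sided_p(z))
    print(cauchy_combination([0.001, 0.20, 0.60]))
```

The numbers in the usage example are invented and carry no relation to the study's results. The Cauchy combination is attractive for gene-based testing because it stays valid when the combined p-values are correlated, for example when several variant masks of the same gene are tested.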
Music:
Enjoy the music based on this article at the end of the episode.
Article title:
Meta-analysis across six global biobanks identifies recessive coding associations with complex traits and diseases
First author:
Lassen F
Journal:
The American Journal of Human Genetics
DOI:
10.1016/j.ajhg.2026.04.005
Reference:
Lassen F.H. et al., 2026. Meta-analysis across six global biobanks identifies recessive coding associations with complex traits and diseases. The American Journal of Human Genetics 113, 1–17. https://doi.org/10.1016/j.ajhg.2026.04.005
License:
This episode is based on an open-access article published under the Creative Commons Attribution 4.0 International License (CC BY 4.0) – https://creativecommons.org/licenses/by/4.0/
Support:
Base by Base – Stripe donations: https://donate.stripe.com/7sY4gz71B2sN3RWac5gEg00
Official website: https://basebybase.com
On PaperCast Base by Base you'll discover the latest in genomics, functional genomics, structural genomics, and proteomics.
Episode link: https://basebybase.com/episodes/recessive-coding-associations-six-biobanks
QC:
This episode was checked against the original article PDF and publication metadata for the episode release published on 2026-05-03.
QC Scope:
- article metadata and core scientific claims from the narration
- excludes analogies, intro/outro, and music
- transcript coverage: Audited transcript sections describing: the role of statistical phasing to identify compound-heterozygous genotypes, the scale across six biobanks (~950k individuals), the rise in bi-allelic genotypes, and key recessive gene–trait associations (HBB, LECT2, ENSG00000267561, PYGM, ODAD1), plus pleiotropy and conditional
- transcript topics: Introduction to human knockouts and biobank-scale data; Compound heterozygosity and the need for phasing; Statistical phasing across six biobanks and study scale; Gene-based recessive associations across 41 traits; Notable associations: HBB with heart failure and lipids; LECT2 with height; ENSG00000267561 with height; BTNL9 association with HDL-C and triglycerides
QC Summary:
- factual score: 10/10
- metadata score: 10/10
- supported core claims: 7
- claims flagged for review: 0
- metadata checks passed: 4
- metadata issues found: 0
Metadata Audited:
- article_doi
- article_title
- article_journal
- license
Factual Items Audited:
- Phasing increased bi-allelic damaging genotypes (compound-heterozygous and homozygous) by 19% across biobanks
- Identified 5,563 genes harboring bi-allelic genotypes; 1,767 additional genes beyond previous studies; total 8,925 unique genes when accounting for overlaps
- Compound-heterozygous (CH) variants increased the number of testable genes by 8.9%, from 1,151 to 1,253
- 58 significant gene–trait associations identified; 17 likely recessive (based on comparing recessive vs additive signals)
- Notable recessive associations include HBB with heart failure and lipid traits; LECT2 with height; ENSG00000267561 with height; PYGM with AST; ODAD1 with COPD
- Ancestry-diversity contribution: 1,371 of the new knockouts were found in non-European ancestries, concentrated in South Asian (SAS) cohorts
QC result: Pass.
Base by Base with Gustavo Barra is available on multiple platforms.
