Zhou Y et al., Cell Genomics - This paper introduces scPrediXcan, which combines a deep-learning model (ctPred) built on Enformer-derived features with single-cell RNA-seq to perform cell-type-specific TWAS via a linearized SNP predictor (ℓ-ctPred), improving gene discovery for type 2 diabetes (T2D) and systemic lupus erythematosus (SLE). Key terms: cell-type-specific expression, deep learning, TWAS, single-cell RNA-seq, GWAS.
Study Highlights:
The authors developed ctPred, a compact deep-learning model that uses Enformer epigenomic features to predict cell-type-specific pseudobulk expression. They linearized ctPred into ℓ-ctPred (SNP-based elastic net) so predictions can be used with GWAS summary statistics in S-PrediXcan. Applied to type 2 diabetes and systemic lupus erythematosus, scPrediXcan identifies more candidate causal genes, explains more GWAS loci, and provides cell-type-resolved insights compared with canonical TWAS approaches. Models and weights for 46 cell types are released via predictdb.org for community use.
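To make the summary-statistics step concrete, here is a minimal sketch of how a linear SNP predictor can be combined with GWAS z-scores, using the S-PrediXcan approximation Z_g ≈ Σ_l w_l (σ_l / σ_g) z_l. It is an illustration only, not the authors' code and not the interface of the S-PrediXcan software: the function name `spredixcan_z` is hypothetical, and the weights, z-scores, and covariance values are made-up toy numbers.

```python
import math


def spredixcan_z(weights, gwas_z, snp_cov):
    """Approximate a gene-level z-score from SNP-level inputs.

    weights: per-SNP prediction weights w_l from a linear expression model
             (hypothetical stand-in for the weights of a model such as l-ctPred)
    gwas_z:  per-SNP GWAS z-scores (effect size divided by its standard error)
    snp_cov: SNP-by-SNP genotype covariance matrix from a reference panel,
             given as a list of lists
    """
    n = len(weights)
    if len(gwas_z) != n or len(snp_cov) != n or any(len(row) != n for row in snp_cov):
        raise ValueError("weights, gwas_z and snp_cov must have matching dimensions")

    # Variance of predicted expression: w' * Cov * w
    var_g = sum(
        weights[i] * snp_cov[i][j] * weights[j]
        for i in range(n)
        for j in range(n)
    )
    if var_g <= 0:
        raise ValueError("predicted expression has non-positive variance")
    sigma_g = math.sqrt(var_g)

    # Each SNP contributes w_l * (sigma_l / sigma_g) * z_l
    return sum(
        w * math.sqrt(snp_cov[i][i]) / sigma_g * z
        for i, (w, z) in enumerate(zip(weights, gwas_z))
    )


# Toy example: two uncorrelated SNPs with equal weights (all values made up)
toy_weights = [1.0, 1.0]
toy_gwas_z = [2.0, 2.0]
toy_cov = [[1.0, 0.0], [0.0, 1.0]]
toy_gene_z = spredixcan_z(toy_weights, toy_gwas_z, toy_cov)
```

Because only the weights, GWAS z-scores, and a reference-panel covariance are needed, this step runs without individual-level genotype data, which is why linearizing ctPred into a SNP-based model matters for use with GWAS summary statistics.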
Conclusion:
scPrediXcan leverages sequence-based deep learning and single-cell data to enable cell-type-level TWAS with higher power and broader gene coverage than canonical methods, offering improved mechanistic resolution for complex traits. The authors note limitations related to model biases and cis-focused inference.
Music:
Enjoy the music based on this article at the end of the episode.
Article title:
scPrediXcan integrates deep learning methods and single-cell data into a cell-type-specific transcriptome-wide association study framework
First author:
Zhou Y
Journal:
Cell Genomics
DOI:
10.1016/j.xgen.2025.100875
Reference:
Zhou Y., Adeluwa T., Zhu L., Salazar-Magaña S., Sumner S., Kim H., Gona S., Nyasimi F., Kulkarni R., Powell J.E., Madduri R., Liu B., Chen M., Im H.K. scPrediXcan integrates deep learning methods and single-cell data into a cell-type-specific transcriptome-wide association study framework. Cell Genomics. 2025;5:100875. doi:10.1016/j.xgen.2025.100875
License:
This episode is based on an open-access article published under the Creative Commons Attribution 4.0 International License (CC BY 4.0) – https://creativecommons.org/licenses/by/4.0/
Support:
Base by Base – Stripe donations: https://donate.stripe.com/7sY4gz71B2sN3RWac5gEg00
Official website: https://basebybase.com
On PaperCast Base by Base you’ll discover the latest in genomics, functional genomics, structural genomics, and proteomics.
Episode link: https://basebybase.com/episodes/scpredixcan-cell-type-twas
QC:
This episode was checked against the original article PDF and publication metadata for the episode release published on 2025-05-22.
QC Scope:
- article metadata and core scientific claims from the narration
- excludes analogies, intro/outro, and music
- transcript coverage: Audited the transcript for substantive scientific content: ctPred architecture, Enformer-derived features, ℓ-ctPred linearization, S-PrediXcan integration, disease results (T2D and SLE) with cell-type specificity, limitations (directionality, distal enhancers, ancestry), and public resources.
- transcript topics: Cell-type-specific TWAS rationale vs bulk tissue TWAS; ctPred architecture and Enformer-derived epigenomic features; ℓ-ctPred linearization and S-PrediXcan integration; Comparison with PEN and scalability considerations; T2D results in islet cell types (gamma, stellate cells) and bulk vs pseudobulk; SLE results in immune cell types (T cells, monocytes; CFB, CXCR5)
QC Summary:
- factual score: 10/10
- metadata score: 10/10
- supported core claims: 8
- claims flagged for review: 0
- metadata checks passed: 4
- metadata issues found: 0
Metadata Audited:
- article_doi
- article_title
- article_journal
- license
Factual Items Audited:
- ctPred uses Enformer-derived epigenomic features and a lightweight four-layer MLP to predict cell-type-specific expression
- ℓ-ctPred is a SNP-based elastic net linearization enabling TWAS with GWAS summary statistics (S-PrediXcan)
- ctPred predictions are cross-validated across cell types with high Pearson correlations (0.753–0.892 reported in the article)
- Enformer input window around TSS is 196,608 bp and yields 5,313 epigenomic features
- T2D: scPrediXcan identified 222 candidate genes across 108 LD blocks vs 12 genes across 11 LD blocks (TWAS-pseudobulk) and 111 genes across 64 LD blocks (TWAS-bulk)
- SLE: scPrediXcan identified 129 genes across 24 LD blocks vs 11 genes (TWAS-pseudobulk) and 54 genes (TWAS-bulk)
QC result: Pass.
Base by Base with Gustavo Barra is available on several platforms. The information on this page comes from public podcast feeds.
