Fast and scalable ensemble learning method for versatile polygenic risk prediction

Tony Chen; Haoyu Zhang; Rahul Mazumder; Xihong Lin

doi:10.1073/pnas.2403210121

What is it about?

Polygenic risk scores (PRSs) are a promising clinical tool for quantifying individuals’ genetic predisposition to common traits and diseases. When used along with other clinical risk factors, PRSs can help identify individuals at high risk for disease and support more personalized prevention and intervention plans. Most PRSs are calculated using publicly available summary statistics from genome-wide association studies (GWAS), which leverage large study sample sizes while preserving individual-level privacy. In this paper, we introduce ALL-Sum, which improves on many existing methods in PRS accuracy, computational efficiency, and robustness across a broad spectrum of phenotypes and diverse genetic architectures. We applied our PRS method to a wide range of phenotypes using summary statistics from published GWAS and validated it in UK Biobank.

Why is it important?

PRSs offer great potential for using individuals’ genetic profiles in early disease detection, identification of high-risk individuals, and disease prevention and intervention. Existing PRS methods face a tradeoff between prediction accuracy and computational efficiency. By combining powerful statistical and machine learning algorithms with an ensemble learning strategy, ALL-Sum improves on both sides of this tradeoff. In extensive simulation studies and analyses of a broad range of phenotypes using GWAS summary statistics from large consortia and biobanks, ALL-Sum produced more accurate PRSs for predicting common diseases and traits than many existing methods, while reducing computational costs. This enhanced PRS method can potentially accelerate the integration of genetics into clinical practice and advance precision health.

Perspectives

Many of the most popular PRS approaches come at a high computational cost, which can hinder their utility in the clinic. While there is no single best approach, we developed a more computationally efficient method that yields higher prediction accuracy and robustness. We hope this can expedite the process of translating genetics into the clinic and enhance precision medicine.
Tony Chen

This page is a summary of: Fast and scalable ensemble learning method for versatile polygenic risk prediction, Proceedings of the National Academy of Sciences, August 2024,
DOI: 10.1073/pnas.2403210121.
Contributors

The following have contributed to this page

Tony Chen

Accurate and efficient method for computing polygenic risk scores for common diseases and traits
