What is it about?
Congenital heart disease (CHD) is the most common birth defect, affecting about 1 in every 100 babies worldwide. It occurs when the heart does not form properly before birth. Genes are known to play an important role in CHD, but for most children we still don’t know the exact genetic cause. One challenge is that humans have around 20,000 genes, and many of them have not been studied enough for us to understand whether they are important for heart formation.
In this study, we used artificial intelligence (machine learning) to help address this problem. We trained a computer program to recognise common features that are typical of genes involved in building the heart. To do this, we used information from mice, because many genes involved in heart development are shared between mice and humans. We started by assembling two groups of mouse genes:
• Genes already known to be important for heart development
• Genes known not to affect the heart
We then analysed many types of information about each gene, for example, how active it is during development and how the protein it produces interacts with other proteins. Using this information, the computer program learned what distinguishes “heart genes” from other genes. It then examined thousands of additional genes and predicted which ones are likely to be important for heart development.
The model performed very well. It correctly identified most known heart-related genes in a test dataset and predicted more than 4,400 additional genes that may also play a role in heart development. To check the accuracy of these predictions, we tested some of the top predictions in animal models and confirmed that disrupting these genes can cause heart defects. We also compared our predictions with a list of newly published CHD genes and found a strong overlap. This suggests that our approach can help identify genes that may be important in human patients as well.
We also created a public online resource, CDGD, where researchers and clinicians can explore known and predicted heart genes, view confidence scores, and download data.
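For readers who want to see the general idea in code, the workflow described above (learn from two labelled groups of genes, then score genes whose role is unknown) can be sketched roughly as follows. This is a toy illustration only, not the classifier used in the study: the algorithm (a simple logistic regression), the two numeric features, the gene names, and all of the data are assumptions made up for this example.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(features, labels, epochs=500, lr=0.1):
    """Fit a logistic regression by batch gradient descent.

    Assumption: logistic regression stands in for whatever model the
    study actually used, which this summary does not specify.
    """
    n = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y  # gradient of the log-loss with respect to the logit
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_score(model, x):
    """Return a confidence score between 0 and 1 that a gene is a 'heart gene'."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def make_toy_genes(n, is_heart, rng):
    """Generate synthetic genes with two hypothetical, standardised features:
    [expression during development, number of protein interaction partners]."""
    centre = 1.0 if is_heart else -1.0
    return [[rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)] for _ in range(n)]


rng = random.Random(0)
heart_genes = make_toy_genes(100, True, rng)       # group 1: known heart genes
non_heart_genes = make_toy_genes(100, False, rng)  # group 2: known non-heart genes

features = heart_genes + non_heart_genes
labels = [1] * len(heart_genes) + [0] * len(non_heart_genes)
model = train_logistic(features, labels)

# Score genes of unknown function (hypothetical names and feature values)
# and rank them from most to least likely to matter for heart development.
unlabelled = {"geneA": [1.5, 1.2], "geneB": [-1.4, -0.9]}
ranked = sorted(
    unlabelled,
    key=lambda gene: predict_score(model, unlabelled[gene]),
    reverse=True,
)
for gene in ranked:
    print(gene, round(predict_score(model, unlabelled[gene]), 3))
```

In this sketch, the score printed for each gene plays the role of a confidence score: genes whose features resemble those of the known heart genes rank higher. A real analysis would use many more features and would evaluate the model on a held-out test set before trusting its predictions.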
Why is it important?
Genome sequencing is now widely used in patients with CHD, but interpreting the results remains a major challenge. Sequencing often reveals rare genetic changes in genes whose roles in heart development are unclear. Distinguishing which of these changes are likely to be disease-causing is one of the central bottlenecks in clinical genetics. Our AI-based approach could help address this challenge by:
• Highlighting the genes most likely to be involved in heart development
• Supporting interpretation of genetic test results
• Helping speed up molecular diagnosis
• Providing a framework that could be adapted to other genetic or developmental conditions
By improving how genes are prioritised, this work could help more families receive clear genetic answers, reduce uncertainty in diagnosis, and guide future research into better treatments for congenital heart disease.
Perspectives
This study provides new information about which genes are likely to be needed during the process of heart formation. Because errors in this process lead to babies being born with heart defects, it is important that we understand which genes control the correct formation of the heart. We plan to utilise the results of this study in further research to understand which genes work together to ensure the heart, an organ that is critical to survival, forms correctly. Studying these genes may lead to providing genetic diagnoses to patients and their families.
Kathryn Hentges
University of Manchester
Artificial intelligence (AI) is rapidly changing how we approach complex biological questions. In genetics, where vast amounts of data are now generated, traditional experimental methods alone cannot easily keep pace. Computational approaches provide an opportunity to explore these datasets at scale, enabling the identification of patterns and candidate genes that may otherwise remain overlooked. In this work, AI (machine learning) is used to help address an important challenge in congenital heart disease research: identifying genes that may contribute to heart development. By prioritising potential candidate genes from large genomic datasets, this framework offers a way to focus future experimental efforts and may also support the interpretation of genetic testing in clinical settings. Looking ahead, integrating AI with experimental and clinical research will likely play an increasingly important role in understanding developmental biology. Approaches like this may not only advance research in heart development but could also help uncover genetic contributors to a wide range of developmental disorders where the underlying causes are still not fully understood.
Mitra Kabir
University of Manchester
Read the Original
This page is a summary of: A machine learning classifier to identify and prioritise genes associated with murine cardiac development, PLoS Genetics, February 2026, PLOS,
DOI: 10.1371/journal.pgen.1011489.