Loading...

 

What is it about?

This study introduces a new AI-powered computational method for analyzing Hi-C data, which is used to study the 3D structure of DNA. When analyzing Hi-C data, determining the optimal bin size (resolution) is crucial—if the bin size is too large, important details may be lost, while if it is too small, noise can obscure meaningful patterns. This becomes even more challenging when integrating multiple Hi-C datasets, as a common optimal bin size must be found for all datasets. The researchers developed a novel approach using tensor decomposition-based unsupervised feature extraction (TD-based FE). This AI-driven method can automatically determine the best bin size by detecting phase transition-like phenomena, without requiring any manual parameter tuning.

Featured Image

Why is it important?

・The proposed method was tested on two Hi-C datasets (GSE260760 and GSE255264). ・It successfully identified the optimal bin sizes: 1,000,000 base pairs (bp) for GSE260760 and 150,000 bp for GSE255264. ・Compared to traditional methods, TD-based FE showed a higher correlation with functional genomic sites such as CTCF binding sites and topologically associating domains (TADs). ・This approach outperformed simple averaging techniques commonly used in Hi-C analysis.

Perspectives

Sorry, your browser does not support inline SVG.

We have developed the method called "Tensor decomposition based unsupervised feature extraction" and applied it to wide range of topics in the last decade. This time, we apply it to Hi-C data set and found striking results!

Professor Y-h. Taguchi
Chuo Daigaku

Read the Original

This page is a summary of: Novel AI-powered computational method using tensor decomposition for identification of common optimal bin sizes when integrating multiple Hi-C datasets, Scientific Reports, March 2025, Springer Science + Business Media,
DOI: 10.1038/s41598-025-91355-8.
You can read the full text:

Read

Contributors

The following have contributed to this page