What is it about?

The study integrated comprehensive published genomic datasets to systematically investigate the evolution of young transcription start sites (TSSs) in the human genome. We have shown 1) that repeat sequences, which are abundant in the genome, are an important source of novel TSSs when they occur in the right genomic contexts; 2) that new TSSs evolve rapidly in the early stages of formation, driven by mutagenic processes that target repeats; and 3) how the surviving TSSs eventually settle into being bona fide TSSs in the genome.

Why is it important?

TSSs are central to initiating gene expression. Previous studies revealed widespread transcription initiation and fast turnover of TSSs in mammalian genomes, but little is known about where new TSSs come from, how they evolve over time, and their functional impact on transcription. Our work fills an important gap in the fields of transcriptional regulation and genome evolution.

Perspectives

This work was the first time that I used only published omics datasets to answer an interesting question in biology. Given the surging amount of omics data, I think proper reuse and integration of published datasets can provide important new insights and save considerable time and money that would otherwise be spent generating new data. However, I also found that accessing the datasets from some published studies was time-consuming. Good guidelines and practices for disseminating data are important for the community.

Cai Li
Francis Crick Institute

Read the Original

This page is a summary of: Integrated analysis sheds light on evolutionary trajectories of young transcription start sites in the human genome, Genome Research, April 2018, Cold Spring Harbor Laboratory Press,
DOI: 10.1101/gr.231449.117.
