What is it about?

Nucleotide correlations in coding sequences result from functional constraints on the physico-chemical properties of proteins. These constraints are imprinted in coding DNA as a purine bias, with a preference for purines in the first codon position. The resulting codon pattern, RNY (or Rrr), has been called the "ancestral codon pattern". Here, we describe a method, which we call UFM (for Universal Feature Measure), for CDS/intron classification based on the statistics of purine bias and stop codons.
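
The two statistics named above can be illustrated with a minimal sketch. This is not the published UFM implementation: it only computes, for each of the three forward reading frames, the fraction of purines (A or G) at each codon position and the number of stop codons. The function names (codons, purine_fraction_by_position, stop_codon_count, frame_summary) and the example sequence are hypothetical and chosen for illustration.

```python
PURINES = {"A", "G"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq, frame=0):
    """Split seq into complete codons, starting at offset frame (0, 1 or 2)."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def purine_fraction_by_position(seq, frame=0):
    """Return the fraction of purines at codon positions 1, 2 and 3."""
    cds = codons(seq, frame)
    if not cds:
        return (0.0, 0.0, 0.0)
    counts = [0, 0, 0]
    for codon in cds:
        for pos, base in enumerate(codon):
            if base in PURINES:
                counts[pos] += 1
    return tuple(c / len(cds) for c in counts)


def stop_codon_count(seq, frame=0):
    """Count in-frame stop codons (TAA, TAG, TGA) in the given frame."""
    return sum(1 for codon in codons(seq, frame) if codon in STOP_CODONS)


def frame_summary(seq):
    """Collect both statistics for the three forward reading frames."""
    return {
        frame: {
            "purine": purine_fraction_by_position(seq, frame),
            "stops": stop_codon_count(seq, frame),
        }
        for frame in range(3)
    }


if __name__ == "__main__":
    # Hypothetical toy sequence, far shorter than the fragments discussed here.
    example = "ATGGCTGAAACCGGTAAA"
    for frame, stats in frame_summary(example).items():
        print(frame, stats)
```

In this toy sequence, frame 0 has a purine at the first position of every codon and no in-frame stop codon, whereas frame 2 contains two stop codons. How such statistics are combined into a classification decision is specific to the original paper and is not reproduced here.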

Why is it important?

The proposed method is independent of species and GC content, requires neither prior training nor parameter adjustment, and performs well with small DNA fragments (>300 bp). Results obtained with six model organisms (A. thaliana, D. melanogaster, P. falciparum, O. sativa, C. reinhardtii and Homo sapiens) show that, for sequences longer than 600 bp, the classifier achieves a sensitivity >97% and a specificity >94% in all species.

Perspectives

This review presents results that were improved upon in later papers.

Nicolas Carels
Oswaldo Cruz Foundation

Read the Original

This page is a summary of: UNIVERSAL FEATURES FOR EXON PREDICTION, March 2011, World Scientific Pub Co Pte Lt, DOI: 10.1142/9789814343435_0021.