What is it about?
The human genome contains many types of genes. Among the most important are those that encode proteins, which make up the molecular machinery of the cell. The basic paradigm of molecular biology is that the stretches of DNA making up a protein-coding gene are first transcribed into messenger RNA, which is then translated into new proteins by the ribosomes. For any species, the number of basic proteins encoded in its genome is largely fixed and is a defining characteristic of the organism. For example, yeast has about 6,000 protein-coding genes. Estimates from the initial human genome were surprisingly low compared with the well over 30,000 that people had expected. As methods for assessing this number improved, it fell over the years, and most sources now put the count at around 20,000. Different teams collate these protein sets in different ways for different purposes. Somewhat unexpectedly, however, more than 15 years after completion of the human genome, the protein counts from different teams are still not the same (i.e. there is still no exact consensus). This paper includes a detailed comparison of the numbers from different sources and provides at least some explanations as to why they still do not agree.
Why is it important?
Defining the canonical protein number is important for describing an organism in molecular terms, at least as a prelude to functional exploration. Assessing nine different sources in this work produced nine different numbers, with a spread of 3,000 between the highest and lowest. Considering the massive experimental focus on, and data generation for, the human genome, transcriptome and proteome, it seems peculiar that such a lack of consensus persists for such an important parameter of human biology. In addition, defining this number (while we expect some fuzziness for a variety of reasons) is crucial for the biomedical domain. For example, it defines the scope of potential disease-associated genetic perturbations that relate to protein-coding regions, as well as the scope of potential drug targets for ameliorating diseases.
Read the Original
This page is a summary of: Last rolls of the yoyo: Assessing the human canonical protein count, F1000Research, April 2017, Faculty of 1000, Ltd., DOI: 10.12688/f1000research.11119.1.