Abstract
We celebrate the 10th anniversary of the launch of the HUPO Human Proteome Project (HPP) and its major milestone of confident detection of at least one protein from each of 90% of the predicted protein-coding genes, based on the output of the entire proteomics community. The Human Genome Project reached a similar decadal milestone 20 years ago. The HPP has engaged proteomics teams around the world, strongly influenced data-sharing, enhanced quality assurance, and issued stringent guidelines for claims of detecting previously "missing proteins." This invited perspective complements papers on "A High-Stringency Blueprint of the Human Proteome" and "The Human Proteome Reaches a Major Milestone" in special issues of Nature Communications and Journal of Proteome Research, respectively, released in conjunction with the October 2020 virtual HUPO Congress and its celebration of the 10th anniversary of the HUPO HPP.
Keywords: Human Proteome Project; Mass Spectrometry Data Interpretation Guidelines; blueprint; functionally unannotated proteins; missing proteins; neXtProt
Year: 2021 PMID: 33640492 PMCID: PMC8058560 DOI: 10.1016/j.mcpro.2021.100062
Source DB: PubMed Journal: Mol Cell Proteomics ISSN: 1535-9476 Impact factor: 5.911
Fig. 1. Schema showing the matrix structure of the Human Proteome Project (HPP). There are 25 chromosome-centric HPP (C-HPP) teams corresponding to chromosomes 1–22, X, and Y plus mitochondria, with lead country shown. There are 19 Biology and Disease-driven HPP teams and four Resource Pillars: mass spectrometry, antibody profiling, knowledge base, and pathology. MP-50 and CP-50 refer to the C-HPP challenges to find 50 missing proteins per chromosome and to generate functional annotations for 50 uncharacterized PE1 proteins. See text.
Fig. 2. The data flow for the Human Proteome Project, including the connectedness of ProteomeXchange with the major proteomics data set resources PRIDE and PeptideAtlas (founding partners), iProX, jPOST, MassIVE, and Panorama (modified from www.proteomeXchange.org).
neXtProt protein existence (PE) evidence levels in releases from 2012-02 to 2020-01, showing progress in reducing the PE2, PE3, and PE4 missing proteins, identifying proteins as PE1, and approaching a complete protein parts list (adapted from Omenn et al. (28), JPR, 2020, and informed by Adhikari et al. (12)).
| PE level / neXtProt release date | 2012-02 | 2013-09 | 2014-10 | 2016-01 | 2017-01 | 2018-01 | 2019-01 | 2020-01 |
|---|---|---|---|---|---|---|---|---|
| PE1: Evidence at protein level | 13,975 | 15,646 | 16,491 | 16,518 | 17,008 | 17,470 | 17,694 | 17,874 |
| Missing Proteins (MP) = PE2 + PE3 + PE4 | 5511 | 3844 | 2948 | 2949 | 2579 | 2186 | 2129 | 1899 |
| PE2: Evidence at transcript level | 5205 | 3570 | 2647 | 2290 | 1939 | 1660 | 1548 | 1596 |
| PE3: Inferred from homology | 218 | 187 | 214 | 565 | 563 | 452 | 510 | 253 |
| PE4: Predicted | 88 | 87 | 87 | 94 | 77 | 74 | 71 | 50 |
PE1 = high-quality evidence for expression of the protein in compliance with HPP Guidelines; PE2 = detection of corresponding transcript without sufficient evidence of protein expression; PE3 = evidence of protein in nonhuman species; PE4 = protein predicted from a gene model, all according to neXtProt.
PE1/(PE1 + PE2 + PE3 + PE4) = 17,874/19,773 = 90.4%.
PE2 + PE3 + PE4 = 1899 “missing proteins” as of neXtProt release 2020-01 (January).
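The footnote arithmetic can be checked directly against the table. The following is a minimal sketch in Python, assuming the counts are transcribed exactly as printed above; the names `RELEASES`, `COUNTS`, `MP_REPORTED`, and `summarize_release` are illustrative and are not part of neXtProt or any HPP tooling.

```python
# Counts transcribed from the neXtProt table above, one entry per release.
RELEASES = ["2012-02", "2013-09", "2014-10", "2016-01",
            "2017-01", "2018-01", "2019-01", "2020-01"]
COUNTS = {
    "PE1": [13975, 15646, 16491, 16518, 17008, 17470, 17694, 17874],
    "PE2": [5205, 3570, 2647, 2290, 1939, 1660, 1548, 1596],
    "PE3": [218, 187, 214, 565, 563, 452, 510, 253],
    "PE4": [88, 87, 87, 94, 77, 74, 71, 50],
}
# The "Missing Proteins (MP)" row as printed, used only as a cross-check.
MP_REPORTED = [5511, 3844, 2948, 2949, 2579, 2186, 2129, 1899]


def summarize_release(index):
    """Return (missing, total, percent PE1) for the release at `index`."""
    missing = COUNTS["PE2"][index] + COUNTS["PE3"][index] + COUNTS["PE4"][index]
    total = COUNTS["PE1"][index] + missing
    pe1_percent = 100.0 * COUNTS["PE1"][index] / total
    return missing, total, pe1_percent


# Map each release label to its derived summary.
SUMMARY = {release: summarize_release(i) for i, release in enumerate(RELEASES)}

if __name__ == "__main__":
    for release, (missing, total, pct) in SUMMARY.items():
        print(f"{release}: MP={missing} total={total} PE1={pct:.1f}%")
```

For the 2020-01 release this reproduces the figures in the footnotes: 1899 missing proteins out of 19,773 predicted proteins, with PE1 at 90.4%.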