Abstract
Seventeen years after the initial publication of the human genome, we still haven't found all of our genes. The answer turns out to be more complex than anyone had imagined when the Human Genome Project began.
Year: 2018 PMID: 30124169 PMCID: PMC6100717 DOI: 10.1186/s12915-018-0564-x
Source DB: PubMed Journal: BMC Biol ISSN: 1741-7007 Impact factor: 7.431
Gene annotations in Gencode, Ensembl, RefSeq, and CHESS
| | Gencode<sup>a</sup> | Ensembl<sup>b</sup> | RefSeq<sup>c</sup> | CHESS<sup>d</sup> |
|---|---|---|---|---|
| Protein-coding genes | 19,901 | 20,376 | 20,345 | 21,306 |
| lncRNA genes | 15,779 | 14,720 | 17,712 | 18,484 |
| Antisense RNA | 5501 | 28 | 2694 | |
| Miscellaneous RNA | 2213 | 2222 | 13,899 | 4347 |
| Pseudogenes | 14,723 | 1740 | 15,952 | |
| Total transcripts | 203,835 | 203,903 | 154,484 | 323,827 |
Note that despite the many differences shown for Gencode and Ensembl, Gencode is created by merging the Havana manual annotation and the Ensembl automated annotation, and the releases coincide (https://www.gencodegenes.org/faq.html)
<sup>a</sup> Gencode statistics for version 28 from www.gencodegenes.org/stats/current.html as of July 12, 2018
<sup>b</sup> Ensembl statistics for version 92.38, which corresponds to Gencode v28, from ensembl.org/Homo_sapiens/Info/Annotation as of July 12, 2018
<sup>c</sup> RefSeq statistics for release 108 from www.ncbi.nlm.nih.gov/genome/annotation_euk/Homo_sapiens/108/ as of July 12, 2018
<sup>d</sup> CHESS statistics for version 2.0 from ccb.jhu.edu/chess as of July 12, 2018. CHESS does not currently include pseudogenes.
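
To make the disagreement between catalogs concrete, the spread in each fully populated row of the table can be computed directly. The following is a minimal sketch in Python, not part of the original article: the counts are copied from the table, rows with blank cells are left out because their column assignment is not recoverable here, and the names `COUNTS`, `spread`, and `summarize` are hypothetical helpers introduced only for illustration.

```python
# Counts copied from the table above, in the order
# Gencode, Ensembl, RefSeq, CHESS. Rows with blank cells are omitted.
DATABASES = ("Gencode", "Ensembl", "RefSeq", "CHESS")
COUNTS = {
    "Protein-coding genes": (19_901, 20_376, 20_345, 21_306),
    "lncRNA genes": (15_779, 14_720, 17_712, 18_484),
    "Total transcripts": (203_835, 203_903, 154_484, 323_827),
}


def spread(values):
    """Return (lowest, highest, absolute difference, difference relative to lowest)."""
    lo, hi = min(values), max(values)
    return lo, hi, hi - lo, (hi - lo) / lo


def summarize(counts=COUNTS, databases=DATABASES):
    """Map each category to the databases with the lowest and highest counts."""
    rows = {}
    for category, values in counts.items():
        lo, hi, diff, rel = spread(values)
        rows[category] = {
            "lowest": databases[values.index(lo)],
            "highest": databases[values.index(hi)],
            "difference": diff,
            "relative": rel,
        }
    return rows


if __name__ == "__main__":
    for category, row in summarize().items():
        print(
            f"{category}: {row['lowest']} to {row['highest']}, "
            f"difference {row['difference']:,} ({row['relative']:.1%})"
        )
```

Under these assumptions, the protein-coding counts differ by 1,405 genes between the lowest (Gencode) and highest (CHESS) catalogs, while the total transcript counts differ by 169,343 between RefSeq and CHESS.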