Eric W Sayers, Richa Agarwala, Evan E Bolton, J Rodney Brister, Kathi Canese, Karen Clark, Ryan Connor, Nicolas Fiorini, Kathryn Funk, Timothy Hefferon, J Bradley Holmes, Sunghwan Kim, Avi Kimchi, Paul A Kitts, Stacy Lathrop, Zhiyong Lu, Thomas L Madden, Aron Marchler-Bauer, Lon Phan, Valerie A Schneider, Conrad L Schoch, Kim D Pruitt, James Ostell.
Abstract
The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed database of citations and abstracts published in life science journals. The Entrez system provides search and retrieval operations for most of these data from 38 distinct databases. The E-utilities serve as the programming interface for the Entrez system. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. New resources released in the past year include PubMed Labs and a new sequence database search. Resources that were updated in the past year include PubMed, PMC, Bookshelf, genome data viewer, Assembly, prokaryotic genomes, Genome, BioProject, dbSNP, dbVar, BLAST databases, igBLAST, iCn3D and PubChem. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
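As an illustration of the E-utilities acting as a programming interface to Entrez, the following minimal Python sketch builds request URLs for two E-utilities endpoints, ESearch and EInfo. The helper names `esearch_url` and `einfo_url` are hypothetical and chosen only for this example; the sketch constructs URLs and does not send any request, so rate limits, API keys and response parsing are left out.

```python
from typing import Optional
from urllib.parse import urlencode

# Base URL of the public E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Build an ESearch URL (hypothetical helper) that searches one
    Entrez database for `term` and requests a JSON response."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    # urlencode percent-encodes values and turns spaces into '+'.
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def einfo_url(db: Optional[str] = None) -> str:
    """Build an EInfo URL (hypothetical helper). Without `db` the request
    asks for the list of Entrez databases; with `db` it asks for
    details about that single database."""
    params = {"retmode": "json"}
    if db is not None:
        params["db"] = db
    return f"{EUTILS_BASE}/einfo.fcgi?{urlencode(params)}"
```

Calling `esearch_url("pubmed", "genome data viewer")` returns a URL that could be fetched with any HTTP client; the database name `pubmed` is used here only as an example.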
Year: 2019 PMID: 30395293 PMCID: PMC6323993 DOI: 10.1093/nar/gky1069
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
The Entrez Databases (as of 1 September 2018)
| Database | Records | Description |
|---|---|---|
| PubMed | 28 809 515 | scientific and medical abstracts/citations |
| PubMed Central | 5 096 212 | full-text journal articles |
| NLM Catalog | 1 586 932 | index of NLM collections |
| Books | 653 701 | books and reports |
| MeSH | 277 030 | ontology used for PubMed indexing |
| | | |
| ClinVar | 442 601 | human variations of clinical significance |
| dbGaP | 344 078 | genotype/phenotype interaction studies |
| MedGen | 307 690 | medical genetics literature and links |
| GTR | 55 299 | genetic testing registry |
| | | |
| SNP | 672 043 185 | short genetic variations |
| Nucleotide | 265 485 730 | DNA and RNA sequences |
| GSS | 40 713 027 | genome survey sequences |
| Clone | 38 325 184 | genomic and cDNA clones |
| Probe | 32 407 891 | sequence-based probes and primers |
| BioSample | 9 015 281 | descriptions of biological source materials |
| SRA | 6 243 265 | high-throughput DNA and RNA sequence read archive |
| dbVar | 5 227 838 | genome structural variation studies |
| Taxonomy | 1 969 776 | taxonomic classification and nomenclature catalog |
| BioProject | 309 309 | biological projects providing data to NCBI |
| Assembly | 194 537 | genome assembly information |
| Genome | 38 734 | genome sequencing projects by organism |
| BioCollections | 7623 | museum, herbaria, and other biorepository collections |
| | | |
| GEO Profiles | 128 414 055 | gene expression and molecular abundance profiles |
| EST | 76 990 816 | expressed sequence tag sequences |
| Gene | 32 928 347 | collected information about gene loci |
| UniGene | 6 473 284 | clusters of expressed transcripts |
| GEO DataSets | 2 756 045 | functional genomics studies |
| PopSet | 307 577 | sequence sets from phylogenetic and population studies |
| HomoloGene | 141 268 | homologous gene sets for selected organisms |
| | | |
| Protein | 568 577 026 | protein sequences |
| Identical Protein Groups | 182 401 155 | protein sequences grouped by identity |
| Protein Clusters | 1 137 329 | sequence similarity-based protein clusters |
| Structure | 142 217 | experimentally-determined biomolecular structures |
| Conserved Domains | 56 066 | conserved protein domains |
| | | |
| PubChem Substance | 247 411 095 | deposited substance and chemical information |
| PubChem Compound | 96 501 627 | chemical structures with associated information and links |
| PubChem BioAssay | 1 252 901 | bioactivity screening studies |
| BioSystems | 983 968 | molecular pathways with links to genes, proteins and chemicals |
Figure 1. Annual growth rates of the number of records in each Entrez database as of 1 September 2018. Identical Protein Groups is not included since this database was released during the past year. Please see the text for a discussion of a change in scope for dbVar and SNP.