Abstract
The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. The Entrez system provides search and retrieval operations for most of these data from 37 distinct databases. The E-utilities serve as the programming interface for the Entrez system. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. New resources released in the past year include iCn3D, MutaBind, and the Antimicrobial Resistance Gene Reference Database; and resources that were updated in the past year include My Bibliography, SciENcv, the Pathogen Detection Project, Assembly, Genome, the Genome Data Viewer, BLAST and PubChem. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov. Published by Oxford University Press on behalf of Nucleic Acids Research 2016. This work is written by (a) US Government employee(s) and is in the public domain in the US.
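As an illustration of the E-utilities acting as the programming interface to Entrez, the following minimal sketch builds an ESearch request URL without sending it. The helper name `build_esearch_url` and the default `retmax` value are assumptions made for this example; the base URL and the `db`, `term`, `retmax` and `retmode` parameters follow the public E-utilities conventions.

```python
from urllib.parse import urlencode

# Public base URL for the NCBI E-utilities.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(db, term, retmax=20):
    """Return an ESearch URL for one Entrez database (hypothetical helper).

    db     -- Entrez database name, e.g. "pubmed" or "gene"
    term   -- Entrez query string
    retmax -- maximum number of record identifiers to return
    """
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# Query PubMed for this article by its PMID; the URL is only constructed here.
url = build_esearch_url("pubmed", "27899561[pmid]")
print(url)
```

Fetching the resulting URL (for example with `urllib.request.urlopen`) would return the matching record identifiers; the request itself is left out so the sketch runs offline.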
Year: 2016 PMID: 27899561 PMCID: PMC5210554 DOI: 10.1093/nar/gkw1071
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
The Entrez databases (as of 3 September 2016)
| Database | Records | Annual growth | Description |
|---|---|---|---|
| Books | 528 176 | 18.2% | Books and reports |
| PubMed Central | 4 066 155 | 11.9% | Full-text journal articles |
| PubMed | 26 413 966 | 4.7% | Scientific and medical abstracts/citations |
| MeSH | 265 382 | 2.4% | Ontology used for PubMed indexing |
| NLM Catalog | 1 551 801 | 1.4% | Index of NLM collections |
| GTR | 48 612 | 52.0% | Genetic testing registry |
| ClinVar | 159 184 | 27.4% | Human variations of clinical significance |
| PubMed Health | 62 991 | 14.0% | Clinical effectiveness, disease and drug reports |
| dbGaP | 223 662 | 7.6% | Genotype/phenotype interaction studies |
| MedGen | 292 341 | 7.1% | Medical genetics literature and links |
| SRA | 3 092 408 | 82.2% | High-throughput DNA and RNA sequence read archive |
| Assembly | 90 727 | 52.3% | Genome assembly information |
| BioSample | 5 224 211 | 43.2% | Descriptions of biological source materials |
| dbVar | 6 147 903 | 37.2% | Genome structural variation studies |
| BioProject | 193 972 | 27.4% | Biological projects providing data to NCBI |
| Genome | 16 962 | 25.3% | Genome sequencing projects by organism |
| SNP | 819 309 474 | 16.1% | Short genetic variations |
| Taxonomy | 1 617 350 | 13.3% | Taxonomic classification and nomenclature catalog |
| Nucleotide | 210 148 411 | 5.2% | DNA and RNA sequences |
| Clone | 38 083 613 | 2.0% | Genomic and cDNA clones |
| GSS | 39 614 616 | 0.6% | Genome survey sequences |
| Probe | 32 405 018 | 0.1% | Sequence-based probes and primers |
| GEO DataSets | 2 008 226 | 22.1% | Functional genomics studies |
| GEO Profiles | 128 414 055 | 18.1% | Gene expression and molecular abundance profiles |
| Gene | 24 351 351 | 13.8% | Collected information about gene loci |
| PopSet | 257 306 | 11.0% | Sequence sets from phylogenetic and population studies |
| EST | 76 257 001 | 0.3% | Expressed sequence tag sequences |
| UniGene | 6 473 284 | 0.0% | Clusters of expressed transcripts |
| HomoloGene | 141 268 | 0.0% | Homologous gene sets for selected organisms |
| Protein | 307 799 547 | 37.7% | Protein sequences |
| Structure | 121 463 | 9.2% | Experimentally-determined biomolecular structures |
| Conserved Domains | 52 411 | 3.5% | Conserved protein domains |
| Protein Clusters | 820 546 | 0.0% | Sequence similarity-based protein clusters |
| PubChem Compound | 91 679 397 | 50.9% | Chemical information with structures, information and links |
| PubChem Substance | 223 159 019 | 41.8% | Deposited substance and chemical information |
| BioSystems | 879 994 | 9.3% | Molecular pathways with links to genes, proteins and chemicals |
| PubChem BioAssay | 1 218 668 | 5.6% | Bioactivity screening studies |
Figure 1. Graphical Depiction of Selected Entrez Links. Each cell in the matrix is shaded according to the log (base 10) of the number of records in the source database (rows) that have an Entrez link to the destination database (columns). Diagonal cells represent computational links (e.g. pubmed related articles) and off-diagonal cells assert biological relationships (e.g. nuccore to taxonomy). The matrix is not diagonal because an individual record in a source database may have many links to a destination database (e.g. genome to protein).
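The shading rule in the caption can be sketched as a small transformation from link counts to log (base 10) values. The function name `shade_levels`, the choice to leave zero-count cells unshaded (`None`), and the counts themselves are assumptions for illustration only; they are not taken from the figure.

```python
import math


def shade_levels(link_counts):
    """Map each count of linked source records to its log10 value.

    Cells with no linked records have no defined logarithm, so they are
    returned as None (treated here as unshaded, an assumption).
    """
    return [
        [math.log10(n) if n > 0 else None for n in row]
        for row in link_counts
    ]


# Hypothetical counts: rows are source databases, columns are destinations.
counts = [
    [1000, 10],
    [0, 100000],
]
levels = shade_levels(counts)
for row in levels:
    print(row)
```

A log scale is a natural fit here because record counts across the Entrez databases span several orders of magnitude.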