Abstract
In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI, http://www.ncbi.nlm.nih.gov) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, the Genetic Testing Registry, Genome and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, BioProject, BioSample, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus, Probe, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool, Biosystems, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page.
Year: 2012 PMID: 23193264 PMCID: PMC3531099 DOI: 10.1093/nar/gks1189
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
The Entrez Databases (as of September 1, 2012)
| Database | Section within this article | Records | Data source |
|---|---|---|---|
| Site search | Introduction | 10 686 | N |
| Assembly | Recent developments | 9597 | D, C, N |
| PubMed | Literature | 22 076 132 | C |
| PubMed Central | Literature | 2 523 284 | D, C |
| NLM Catalog | Literature | 1 461 835 | C, N |
| MeSH | Literature | 236 253 | N |
| Books | Literature | 186 112 | C, N |
| Taxonomy | Taxonomy | 932 345 | C, N |
| EST | DNA and RNA | 73 666 909 | D (GenBank) |
| Nucleotide | DNA and RNA | 66 319 706 | D (GenBank), C, N |
| GSS | DNA and RNA | 34 533 114 | D (GenBank) |
| BioSample | DNA and RNA | 970 304 | N |
| SRA | DNA and RNA | 228 739 | D |
| PopSet | DNA and RNA | 159 345 | D (GenBank) |
| Protein | Proteins | 56 394 380 | C, N |
| Protein clusters | Proteins | 794 663 | N |
| GEO profiles | Genes and expression | 63 811 486 | D |
| Probe | Genes and expression | 14 248 527 | D |
| Gene | Genes and expression | 11 290 372 | C, N |
| UniGene | Genes and expression | 5 831 327 | N |
| GEO data sets | Genes and expression | 841 518 | N |
| Biosystems | Genes and expression | 396 029 | C |
| HomoloGene | Genes and expression | 133 012 | N |
| Clone | Genomes | 29 597 231 | D, N |
| UniSTS | Genomes | 545 353 | D (dbSTS) |
| BioProject | Genomes | 58 227 | D |
| Genome | Genomes | 8276 | C, N |
| Epigenomics | Genomes | 5484 | D |
| SNP | Genetics and medicine | 162 674 947 | D (dbSNP), N |
| dbVar | Genetics and medicine | 2 729 616 | D |
| dbGaP | Genetics and medicine | 143 624 | D |
| Online Mendelian Inheritance in Animals | Genetics and medicine | 2810 | C |
| PubChem substance | Chemicals and bioassays | 100 157 112 | D |
| PubChem compound | Chemicals and bioassays | 35 545 766 | N |
| PubChem bioassay | Chemicals and bioassays | 621 642 | D |
| Structure | Domains and structures | 83 913 | C, N |
| CDD | Domains and structures | 46 389 | C, N |
D, direct submission; C, collaboration/agreement; N, internal NCBI/NLM curation.
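The abstract lists the Entrez Programming Utilities among the NCBI resources, so a short sketch of how the databases in the table above might be addressed programmatically may help. The example only builds request URLs and does not contact NCBI. The base URL, the `einfo.fcgi` and `esearch.fcgi` endpoints and the `db` and `term` parameters follow the public E-utilities interface; the database identifier `pubmed` and the search term are illustrative assumptions, and the helper names `einfo_url` and `esearch_url` are hypothetical.

```python
from urllib.parse import urlencode

# Base address of the E-utilities service (assumption: the current public
# HTTPS endpoint; the article itself does not state this URL).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"


def einfo_url(db=None):
    """Build an EInfo URL: all databases if db is None, else one database."""
    if db is None:
        return EUTILS_BASE + "einfo.fcgi"
    return EUTILS_BASE + "einfo.fcgi?" + urlencode({"db": db})


def esearch_url(db, term):
    """Build an ESearch URL for a query term in one Entrez database."""
    return EUTILS_BASE + "esearch.fcgi?" + urlencode({"db": db, "term": term})


if __name__ == "__main__":
    # "pubmed" is assumed to be the identifier of the PubMed row in the table.
    print(einfo_url("pubmed"))
    print(esearch_url("pubmed", "GenBank[Title]"))
```

Keeping URL construction separate from the network call makes the request easy to inspect before sending it; the record counts in the table are a snapshot from September 1, 2012, so a live query would be expected to return different numbers.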
Selected NCBI software available for download
| Software | Available binaries | Category within this article |
|---|---|---|
| BLAST (standalone) | Win, Mac, LINUX, Solaris | BLAST sequence analysis |
| BLAST (network client) | Win, Mac, LINUX, Solaris | BLAST sequence analysis |
| BLAST (web server) | Mac, LINUX, Solaris | BLAST sequence analysis |
| CD-Tree | Win, Mac | Domains and structures |
| Cn3D | Win, Mac | Domains and structures |
| PC3D | Win, Mac, LINUX | Chemicals and bioassays |
| gene2xml | Win, Mac, LINUX, Solaris | Genes and expression |
| Genome Workbench | Win, Mac, LINUX | Genomes |
| Splign | LINUX, Solaris | Genomes |
| tbl2asn | Win, Mac, LINUX, Solaris | Genomes |
Summary of dbSNP FTP human VCF files
| File name | Update frequency | dbSNP RefSNP count (based on build 137) | Description |
|---|---|---|---|
| clinvar.vcf.gz | Weekly | 36 K | A list of all human variations submitted through clinical channels that contain a mixture of variations asserted to be pathogenic and those known to be non-pathogenic |
| 00-All.vcf.gz | Once per dbSNP build | 52 M | A comprehensive list of all short human variations based on the most recent dbSNP build |
| common_all.vcf.gz | Once per dbSNP build | 28 M | A subset of variations from 00-All.vcf.gz that are determined to be ‘common’ based on germline origin and a minor allele frequency of ≥0.01 in at least one major population, with at least two individuals from different families having the minor allele |
| common_no_known_medical_impact.vcf.gz | Weekly | 28 M | A list of all ‘common’ germline human variations that fall within the scope of VCF processing. To create this list, variation records of probable medical interest from clinvar.vcf.gz are removed from the list of common_all.vcf.gz |
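The relationships among these files can be illustrated with a minimal sketch. It is not the dbSNP pipeline: it only restates the ‘common’ rule from the table as a predicate and shows the subtraction of clinvar.vcf.gz records from common_all.vcf.gz on toy data. The function names, the argument names and the rs identifiers are hypothetical, and matching records by the VCF ID column is an assumption about how the two files would be compared.

```python
def is_common(is_germline, maf_by_population, families_with_minor_allele):
    """Restate the table's 'common' rule (argument names are hypothetical).

    Germline origin, minor allele frequency >= 0.01 in at least one
    population, and the minor allele seen in at least two families.
    """
    return (
        is_germline
        and any(maf >= 0.01 for maf in maf_by_population.values())
        and families_with_minor_allele >= 2
    )


def record_id(line):
    """Return the ID field (third tab-separated column) of a VCF data line."""
    return line.rstrip("\n").split("\t")[2]


def data_lines(vcf_lines):
    """Yield VCF data lines, skipping '#' header lines and blank lines."""
    for line in vcf_lines:
        if line.strip() and not line.startswith("#"):
            yield line


def remove_clinical(common_lines, clinvar_lines):
    """Keep common records whose ID does not occur in the clinical list."""
    clinical_ids = {record_id(line) for line in data_lines(clinvar_lines)}
    return [
        line for line in data_lines(common_lines)
        if record_id(line) not in clinical_ids
    ]


if __name__ == "__main__":
    header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO"
    # Toy records with made-up identifiers, not real dbSNP data.
    common_all = [header, "1\t100\trs1\tA\tG\t.\t.\t.",
                  "1\t200\trs2\tC\tT\t.\t.\t.", "2\t300\trs3\tG\tA\t.\t.\t."]
    clinvar = [header, "1\t200\trs2\tC\tT\t.\t.\t."]
    for line in remove_clinical(common_all, clinvar):
        print(record_id(line))
```

For the actual compressed files, Python's `gzip.open(path, "rt")` returns a text-mode file object whose lines could be passed to these functions in the same way.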