Dennis A Benson, Ilene Karsch-Mizrachi, David J Lipman, James Ostell, Eric W Sayers.
Abstract
GenBank® is a comprehensive database that contains publicly available nucleotide sequences for more than 380,000 organisms named at the genus level or lower, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the NCBI Entrez retrieval system that integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI Homepage: www.ncbi.nlm.nih.gov.
Year: 2010 PMID: 21071399 PMCID: PMC3013681 DOI: 10.1093/nar/gkq1079
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
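The abstract notes that GenBank records are retrieved through the NCBI Entrez system. As a minimal sketch of programmatic retrieval, the snippet below builds a request URL for the NCBI E-utilities `efetch` service, which is an assumption here since the abstract names only Entrez itself. The helper name `genbank_flatfile_url` is hypothetical, and the accession `U49845` is used purely as an illustrative identifier.

```python
from urllib.parse import urlencode

# Assumed endpoint: the NCBI E-utilities efetch service (not named in the abstract).
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def genbank_flatfile_url(accession: str) -> str:
    """Build an efetch URL requesting one nucleotide record as a GenBank flat file.

    Hypothetical helper for illustration; it only constructs the URL and
    performs no network access.
    """
    params = {
        "db": "nuccore",     # Entrez nucleotide database
        "id": accession,     # accession number or GI of the record
        "rettype": "gb",     # GenBank flat-file format
        "retmode": "text",   # plain-text response
    }
    return f"{EFETCH_URL}?{urlencode(params)}"


# Illustrative accession; substitute any GenBank accession number.
url = genbank_flatfile_url("U49845")
print(url)
```

The resulting URL can be opened in a browser or passed to `urllib.request.urlopen`. NCBI publishes usage guidelines for E-utilities, so bulk retrieval is usually better served by the FTP releases mentioned in the abstract.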
Growth of GenBank Divisions (nucleotide base-pairs)
| Division | Description | Release 173 (8/2009) | Release 179 (8/2010) | Increase (%) |
|---|---|---|---|---|
| TSA | Transcriptome shotgun data | 39 829 979 | 398 676 845 | 900.9 |
| ENV | Environmental samples | 1 091 072 890 | 1 723 286 428 | 57.9 |
| PAT | Patented sequences | 5 592 927 651 | 8 519 294 473 | 52.3 |
| BCT | Bacteria | 4 107 328 206 | 5 333 010 385 | 29.8 |
| VRL | Viruses | 779 481 462 | 970 125 245 | 24.5 |
| PHG | Phages | 36 100 172 | 43 456 808 | 20.4 |
| MAM | Other mammals | 576 977 646 | 679 274 390 | 17.7 |
| INV | Invertebrates | 1 734 996 371 | 2 036 240 836 | 17.4 |
| WGS | WGS data | 148 165 117 763 | 169 253 846 128 | 14.2 |
| GSS | Genome survey sequences | 16 738 219 857 | 18 442 479 673 | 10.2 |
| PLN | Plants | 3 695 552 256 | 4 038 424 961 | 9.3 |
| SYN | Synthetic | 131 361 806 | 142 548 355 | 8.5 |
| VRT | Other vertebrates | 2 366 300 257 | 2 533 789 261 | 7.1 |
| EST | ESTs | 34 522 977 161 | 36 803 930 321 | 6.6 |
| HTC | High-throughput cDNA | 636 472 189 | 659 355 057 | 3.6 |
| PRI | Primates | 5 751 413 009 | 5 943 029 356 | 3.3 |
| ROD | Rodents | 4 206 718 960 | 4 298 354 944 | 2.2 |
| HTG | High-throughput genomic | 23 895 733 886 | 24 276 862 305 | 1.6 |
| UNA | Unannotated | 119 348 | 120 289 | 0.8 |
| STS | Sequence tagged sites | 629 573 650 | 634 263 196 | 0.7 |
| TOTAL | All GenBank sequences | 254 698 274 519 | 286 730 369 256 | 12.6 |
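The "Increase (%)" column appears to be the growth in base pairs from Release 173 to Release 179, expressed as a percentage of the Release 173 value. The sketch below recomputes it for three rows under that assumption; the function name `percent_increase` is illustrative, and the figures are copied from the table.

```python
def percent_increase(old: int, new: int) -> float:
    """Return the growth from old to new as a percentage of old."""
    return (new - old) / old * 100


# Base-pair counts copied from the table: (Release 173, Release 179).
divisions = {
    "TSA": (39_829_979, 398_676_845),
    "WGS": (148_165_117_763, 169_253_846_128),
    "TOTAL": (254_698_274_519, 286_730_369_256),
}

# Round to one decimal place, matching the precision used in the table.
increases = {
    name: round(percent_increase(old, new), 1)
    for name, (old, new) in divisions.items()
}

for name, pct in increases.items():
    print(f"{name}: {pct}%")
# TSA: 900.9%
# WGS: 14.2%
# TOTAL: 12.6%
```

The recomputed values agree with the table, which supports reading the column as relative growth over the one-year interval between the two releases.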
Top organisms in GenBank (Release 179)
| Organism | Non-WGS base pairs |
|---|---|
| | 14 792 487 417 |
| | 8 859 010 528 |
| | 6 443 768 086 |
| | 5 361 712 195 |
| | 5 037 629 354 |
| | 4 783 381 701 |
| | 3 137 945 523 |
| | 1 352 920 226 |
| | 1 197 245 122 |
| | 1 187 388 273 |
| | 1 147 132 278 |
| | 1 047 707 620 |
| | 1 001 926 471 |
| | 1 001 073 627 |
| | 943 043 649 |
| | 913 911 649 |
| | 891 463 513 |
| | 886 103 518 |
| | 821 393 285 |
| | 748 350 657 |