Kim D Pruitt, Tatiana Tatusova, Donna R Maglott.
Abstract
The National Center for Biotechnology Information (NCBI) Reference Sequence (RefSeq) database (http://www.ncbi.nlm.nih.gov/RefSeq/) provides a non-redundant collection of sequences representing genomic data, transcripts and proteins. Although the goal is to provide a comprehensive dataset representing the complete sequence information for any given species, the database pragmatically includes sequence data that are currently publicly available in the archival databases. The database incorporates data from over 2400 organisms and includes over one million proteins representing significant taxonomic diversity spanning prokaryotes, eukaryotes and viruses. Nucleotide and protein sequences are explicitly linked, and the sequences are linked to other resources including the NCBI Map Viewer and Gene. Sequences are annotated to include coding regions, conserved domains, variation, references, names, database cross-references, and other features using a combined approach of collaboration and other input from the scientific community, automated annotation, propagation from GenBank and curation by NCBI staff.
Year: 2005 PMID: 15608248 PMCID: PMC539979 DOI: 10.1093/nar/gki025
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Annual growth of the RefSeq collection
| Date | FTP release | Species | Genomic records | Transcript records | Protein records |
|---|---|---|---|---|---|
| 6/30/2003 | 1 | 2005 | 64 729 | 211 803 | 785 143 |
| 7/5/2004 | 6 | 2467 | 68 592 | 247 639 | 1 050 975 |
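
The growth implied by the two releases in the table above can be worked out with a few lines of arithmetic. The sketch below is only an illustration: the counts are copied from the table, while the dictionary layout and the helper names `percent_growth` and `growth_table` are assumptions made for this example and do not come from RefSeq or NCBI tooling.

```python
# Counts copied from the table above (FTP release 1 vs. FTP release 6).
# The data layout and function names are illustrative assumptions.
releases = {
    "2003-06-30": {"species": 2005, "genomic": 64_729,
                   "transcript": 211_803, "protein": 785_143},
    "2004-07-05": {"species": 2467, "genomic": 68_592,
                   "transcript": 247_639, "protein": 1_050_975},
}


def percent_growth(old: int, new: int) -> float:
    """Relative change from old to new, expressed as a percentage."""
    return (new - old) / old * 100.0


def growth_table(first: dict, last: dict) -> dict:
    """Percentage growth per category, rounded to one decimal place."""
    return {key: round(percent_growth(first[key], last[key]), 1)
            for key in first}


growth = growth_table(releases["2003-06-30"], releases["2004-07-05"])

for category, pct in growth.items():
    print(f"{category:<10} {pct:+.1f}%")
```

Under these figures the protein set grows fastest and the genomic set slowest, which is consistent with the abstract's statement that the collection now exceeds one million proteins.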
RefSeq information, access and feedback
| Resource | URL |
|---|---|
| RefSeq home page | http://www.ncbi.nlm.nih.gov/RefSeq/ |
| FTP—RefSeq release | |
| BLAST home page | |
| Entrez home page | |
| RefSeq feedback form | |
| Contact NCBI Help Desk | info@ncbi.nlm.nih.gov |
| Subscribe to RefSeq announce | |