Eli Kaminuma, Jun Mashima, Yuichi Kodama, Takashi Gojobori, Osamu Ogasawara, Kousaku Okubo, Toshihisa Takagi, Yasukazu Nakamura.
Abstract
The DNA Data Bank of Japan (DDBJ) (http://www.ddbj.nig.ac.jp) collected and released 1,701,110 entries/1,116,138,614 bases between July 2008 and June 2009. Highlighted data releases from DDBJ in this period included the complete genome sequence of an endosymbiont within protist cells in the termite gut, and Cap Analysis Gene Expression tags for human and mouse deposited by the Functional Annotation of the Mammalian cDNA consortium. In this period, we started a novel user announcement service using Really Simple Syndication (RSS) to deliver a daily list of data released from DDBJ. Comprehensive visualization of DDBJ release data was attempted using a word cloud program. Moreover, a new archive for sequencing data from next-generation sequencers, the 'DDBJ Read Archive' (DRA), was launched. Concurrently, for read data registered in DRA, a semi-automatic annotation tool called the 'DDBJ Read Annotation Pipeline' was released as a preliminary step. The pipeline consists of two parts: basic analysis, covering reference genome mapping and de novo assembly, and high-level analysis, covering structural and functional annotation. These new services will aid users' research and provide easier access to DDBJ databases.
Year: 2009 PMID: 19850725 PMCID: PMC2808917 DOI: 10.1093/nar/gkp847
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Datasets released from DDBJ
| Type | Database name | No. of records | Released date |
|---|---|---|---|
| Primary DB | INSD-core (processed by DDBJ) | 17 440 910 entries (1 701 110 entries) | 29 May 2009 |
| Primary DB | WGS | 1 246 513 entries | 10 September 2009 |
| Primary DB | MGA | 34 740 058 entries | 1 June 2009 |
| Primary DB | TPA | 593 entries | 10 September 2009 |
| Primary DB | DTA | 2 submissions | 7 July 2008 |
| Primary DB | DRA | 12 submissions | 11 September 2009 |
| Secondary DB | DAD | 14 710 673 entries | 29 May 2009 |
| Secondary DB | GTPS | 690 genomes | 25 May 2009 |
| Secondary DB | GIB | 982 genomes | 10 September 2009 |
The number of records represents only published data.
Figure 1. A feed file for RSS 2.0 is published from the DDBJ homepage every day (http://www.ddbj.nig.ac.jp/rss/update_information.xml). The daily released contents of DDBJ databases can be checked with RSS reader programs.
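As a rough illustration of how such a feed could be read programmatically, the following Python sketch extracts the standard RSS 2.0 item fields (`title`, `link`, `pubDate`) from a feed document. The sample feed text, its titles and URLs, and the function name `parse_rss_items` are hypothetical and used only for illustration; the actual contents of the DDBJ feed are not shown in this record.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; the real DDBJ feed contents are not reproduced here.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example update feed</title>
    <item>
      <title>Example release A</title>
      <link>http://example.org/a</link>
      <pubDate>Mon, 01 Jun 2009 00:00:00 +0900</pubDate>
    </item>
    <item>
      <title>Example release B</title>
      <link>http://example.org/b</link>
    </item>
  </channel>
</rss>"""


def parse_rss_items(xml_text):
    """Return a list of dicts, one per <item> element in an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            # findtext returns the default when the child element is absent
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "pubDate": item.findtext("pubDate", default=""),
        })
    return items


if __name__ == "__main__":
    for entry in parse_rss_items(SAMPLE_FEED):
        print(entry["pubDate"], entry["title"], entry["link"])
```

To read a live feed, the XML text would first have to be downloaded (for example with `urllib.request`) and then passed to the same function.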
Figure 2. Word cloud images created using a DDBJ database release. The upper image uses the feature keys ranking among the top 100 by total number of nucleotides; the lower image similarly uses species names.
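The ranking step behind such a word cloud might look like the minimal Python sketch below, which sums nucleotide counts per term and keeps the top N terms as word weights. This is an assumption about the general approach, not the program used for the figure: the function name `top_terms_by_nucleotides`, the input layout of (term, length) pairs, and the sample numbers are all hypothetical.

```python
from collections import Counter


def top_terms_by_nucleotides(records, n=100):
    """Sum nucleotide counts per term and return the n largest totals.

    records: iterable of (term, nucleotide_count) pairs, where a term is a
    feature key or a species name (hypothetical input layout).
    """
    totals = Counter()
    for term, length in records:
        totals[term] += length
    # most_common sorts by total in descending order
    return totals.most_common(n)


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    records = [("CDS", 900), ("gene", 1200), ("CDS", 600), ("tRNA", 75)]
    for term, total in top_terms_by_nucleotides(records, n=2):
        print(term, total)
```

The resulting (term, total) pairs could then be passed to any word cloud renderer as word weights.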
Figure 3. DRA sheets, which contain an Excel macro that generates XML-formatted files for submitting metadata to DRA.
Figure 4. Flowchart of the DDBJ Read Annotation Pipeline. The result files from the structural and functional annotation analyses are deposited in the DDBJ databases DRA and INSD.