Dongwan Hong, Sung-Soo Park, Young Seok Ju, Sheehyun Kim, Jong-Yeon Shin, Sujung Kim, Saet-Byeol Yu, Won-Chul Lee, Seungbok Lee, Hansoo Park, Jong-Il Kim, Jeong-Sun Seo.
Abstract
High-throughput genomic technologies have been used to explore personal human genomes for the past few years. Although the integration of technologies is important for high-accuracy detection of personal genomic variations, no databases have been prepared to systematically archive genomes and to facilitate the comparison of personal genomic data sets prepared using a variety of experimental platforms. We describe here the Total Integrated Archive of Short-Read and Array (TIARA; http://tiara.gmi.ac.kr) database, which contains personal genomic information obtained from next-generation sequencing (NGS) techniques and ultra-high-resolution comparative genomic hybridization (CGH) arrays. This database improves the accuracy of detecting personal genomic variations, such as SNPs, short indels and structural variants (SVs). At present, 36 individual genomes have been archived and may be displayed in the database. TIARA supports a user-friendly genome browser, which retrieves read-depths (RDs) from NGS data and log2 ratios from CGH arrays. In addition, this database provides information on all genomic variants and the raw data, including short reads and feature-level CGH data, through anonymous file transfer protocol. More personal genomes will be archived as more individuals are analyzed by NGS or CGH array. TIARA provides a new approach to the accurate interpretation of personal genomes for genome research.
Year: 2010 PMID: 21051338 PMCID: PMC3013693 DOI: 10.1093/nar/gkq1101
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Summary of massively parallel sequencing data in TIARA
| Sample name | Technology | Read length (bp) | Insert size (bp) | Number of reads | Total bases | Sequencing coverage | Aligned coverage | SNPs | Indels | CNV (region) |
|---|---|---|---|---|---|---|---|---|---|---|
| AK1 | Illumina Genome Analyzer | 1 × 36; 2 × 36; 2 × 88; 2 × 106 | 200 | 519 486 218; 1 646 543 336; 123 322 768; 177 416 122 | 18 701 503 848; 59 275 560 096; 10 852 403 584; 18 806 108 932 | 35.9x | 27.8x | 3 453 653 | 170 202 | 1237 (24 193 059) |
| AK2 | AB SOLiD | 2 × 25; 2 × 50 | 1500; 4700 | 6 371 995 780; 3 390 922 334 | 159 299 894 500; 169 546 116 700 | 109.6x | 27.5x | 3 586 271 | 213 718 | 607 (9 248 044) |
| AK4 | Illumina Genome Analyzer | 2 × 76; 2 × 101 | 500 | 444 312 562; 430 032 812 | 33 767 754 712; 43 433 314 012 | 25.7x | 23.1x | 3 630 428 | 429 258 | 696 (8 463 889) |
| AK6 | Illumina Genome Analyzer | 2 × 36; 2 × 76; 2 × 101 | 500 | 55 752 362; 540 079 624; 301 478 526 | 2 007 085 032; 41 046 051 424; 30 449 331 126 | 24.5x | 22.3x | 3 558 703 | 413 949 | 706 (11 958 848) |
| NA10851 | Illumina Genome Analyzer | 2 × 36; 2 × 76; 2 × 101 | 500 | 1 114 121 056; 318 924 496; 203 842 434 | 40 108 358 016; 24 238 261 696; 20 588 085 834 | 28.3x | 25.0x | 3 683 016 | 319 266 | 1309 (23 198 937) |
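
The columns of the table above are related by simple arithmetic, and a short sketch can make that explicit. The figures are consistent with each library's total bases being its read length multiplied by its number of reads (counting individual reads, not pairs). The sequencing coverage values can be reproduced by dividing the summed bases by a genome size of about 3.0 Gb. That genome size is an assumption inferred from the numbers, not a value stated in the source, and the names `LIBRARIES`, `total_bases` and `sequencing_coverage` are illustrative only.

```python
# Per-library (read length in bp, number of reads) pairs copied from the table.
LIBRARIES = {
    "AK1": [(36, 519_486_218), (36, 1_646_543_336),
            (88, 123_322_768), (106, 177_416_122)],
    "AK2": [(25, 6_371_995_780), (50, 3_390_922_334)],
    "AK4": [(76, 444_312_562), (101, 430_032_812)],
    "AK6": [(36, 55_752_362), (76, 540_079_624), (101, 301_478_526)],
    "NA10851": [(36, 1_114_121_056), (76, 318_924_496), (101, 203_842_434)],
}

# Assumption: a nominal 3.0 Gb genome size; the source does not state
# which value was used to compute the coverage column.
ASSUMED_GENOME_SIZE_BP = 3_000_000_000


def total_bases(libraries):
    """Sum read_length * n_reads over all libraries of one sample."""
    return sum(read_length * n_reads for read_length, n_reads in libraries)


def sequencing_coverage(libraries, genome_size_bp=ASSUMED_GENOME_SIZE_BP):
    """Raw sequencing coverage: sequenced bases divided by genome size."""
    return total_bases(libraries) / genome_size_bp


for sample, libs in LIBRARIES.items():
    print(f"{sample}: {total_bases(libs):,} bases, "
          f"~{sequencing_coverage(libs):.1f}x sequencing coverage")
```

Under the 3.0 Gb assumption, the computed values round to the sequencing coverage column of the table. The aligned coverage column cannot be derived this way, because it depends on how many bases were actually aligned, which the table does not list.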
Figure 1. System configuration of TIARA.
Figure 2. User interface of TIARA. (a) The user interface consists of the following areas: (A) control panel, (B) RefSeq genes, (C) SNPs, (D) indels, (E) RD display window for high-throughput sequencing data, (F) CNV regions and (G) log2 ratio display window for high-resolution CGH array data. (b) Retrieval of genomic data for the TP53 gene using the gene name search function. (c) An example of heterozygous and homozygous SNPs at the same position in several selected individuals. These SNPs are related to colorectal and endometrial cancer. (d) An example of the popup window displaying common SNPs.