Jianwei Zhang1,2, Ling-Ling Chen1, Shuai Sun1, Dave Kudrna2, Dario Copetti2,3, Weiming Li1, Ting Mu1, Wen-Biao Jiao1, Feng Xing1, Seunghee Lee2, Jayson Talag2, Jia-Ming Song1, Bogu Du1, Weibo Xie1, Meizhong Luo1, Carlos Ernesto Maldonado2, Jose Luis Goicoechea2, Lizhong Xiong1, Changyin Wu1, Yongzhong Xing1, Dao-Xiu Zhou1, Sibin Yu1, Yu Zhao1, Gongwei Wang1, Yeisoo Yu2, Yijie Luo1, Beatriz Elena Padilla Hurtado2, Ann Danowitz2, Rod A Wing2,3, Qifa Zhang1.
Abstract
Over the past 30 years, we have performed many fundamental studies on two Oryza sativa subsp. indica varieties, Zhenshan 97 (ZS97) and Minghui 63 (MH63). To improve the resolution of many of these investigations, we generated two reference-quality genome assemblies using the most advanced sequencing technologies. Using PacBio SMRT technology, we produced over 108 (ZS97) and 174 (MH63) Gb of raw sequence data from 166 (ZS97) and 209 (MH63) pools of BAC clones, and generated ~97 (ZS97) and ~74 (MH63) Gb of paired-end whole-genome shotgun (WGS) sequence data with Illumina sequencing technology. With these data, we successfully assembled two platinum standard reference genomes that have been publicly released. Here we provide the full sets of raw data used to generate these two reference genome assemblies. These data sets can be used to test new programs for better genome assembly and annotation, to aid the discovery of new insights into genome structure, function, and evolution, and to support biological research in general.
Year: 2016 PMID: 27622467 PMCID: PMC5020871 DOI: 10.1038/sdata.2016.76
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 6.444
Figure 1. Base coverage distributions of ZS97 (a) and MH63 (b) BAC sequences.
Summary of Illumina read data for ZS97 and MH63.
| Library | Raw reads | Raw bases (bp) | Retained reads | Retained bases (bp) | Reads retained (%) | Bases retained (%) |
|---|---|---|---|---|---|---|
| ZS97_short | 338,293,782 | 34,167,671,982 | 299,201,824 | 29,652,703,238 | 88.44 | 86.79 |
| ZS97_5 kb | 436,436,254 | 33,169,155,304 | 345,177,784 | 25,846,914,440 | 79.09 | 77.92 |
| ZS97_10 kb | 396,565,650 | 30,138,989,400 | 327,334,042 | 24,554,200,308 | 82.54 | 81.47 |
| MH63_short | 382,103,532 | 38,592,456,732 | 341,947,062 | 33,482,334,975 | 89.49 | 86.76 |
| MH63_5 kb | 267,288,070 | 20,313,893,320 | 214,879,076 | 16,135,182,511 | 80.39 | 79.43 |
| MH63_10 kb | 198,185,740 | 10,107,472,740 | 173,925,986 | 8,866,402,830 | 87.76 | 87.72 |
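The two percentage columns in the table above appear to be simple ratios: the fourth column divided by the second (read counts) and the fifth divided by the third (base counts). The following minimal sketch checks that reading for the ZS97_short row. The function name `percent_retained` is hypothetical, and the interpretation of the columns is an assumption inferred from the numbers, not something stated in the record.

```python
def percent_retained(before: int, after: int) -> float:
    """Return `after` as a percentage of `before`, rounded to two decimals."""
    if before <= 0:
        raise ValueError("`before` must be a positive count")
    return round(100.0 * after / before, 2)


# Values copied from the ZS97_short row of the table.
reads_before, bases_before = 338_293_782, 34_167_671_982
reads_after, bases_after = 299_201_824, 29_652_703_238

print(percent_retained(reads_before, reads_after))  # compare with column 6
print(percent_retained(bases_before, bases_after))  # compare with column 7
```

The same check can be run on the other rows to confirm that the percentages agree with the counts.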
ZS97 NGS contig assembly statistics*.
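| Chromosome | Total length (bp) | Number of sequences | N50 (bp) | N50 number† | N90 (bp) | N90 number | Longest sequence (bp) |
|---|---|---|---|---|---|---|---|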
| chr01 | 41,505,150 | 1,800 | 249,739 | 47 | 31,557 | 200 | 1,560,848 |
| chr02 | 34,325,857 | 1,467 | 266,730 | 41 | 37,069 | 159 | 884,689 |
| chr03 | 35,724,790 | 1,154 | 287,031 | 38 | 50,086 | 136 | 1,808,391 |
| chr04 | 29,991,449 | 2,303 | 151,932 | 53 | 6,534 | 451 | 981,333 |
| chr05 | 27,896,270 | 1,635 | 188,526 | 42 | 10,219 | 243 | 973,308 |
| chr06 | 29,165,153 | 1,755 | 183,332 | 47 | 13,724 | 210 | 749,307 |
| chr07 | 27,477,590 | 1,958 | 151,089 | 48 | 8,298 | 277 | 1,065,432 |
| chr08 | 26,789,663 | 1,556 | 173,772 | 48 | 14,332 | 219 | 739,850 |
| chr09 | 21,657,142 | 1,190 | 218,997 | 24 | 15,452 | 147 | 1,087,529 |
| chr10 | 21,689,031 | 1,663 | 129,607 | 45 | 6,829 | 278 | 871,567 |
| chr11 | 27,003,295 | 2,130 | 126,457 | 60 | 5,702 | 366 | 527,948 |
| chr12 | 24,887,049 | 2,048 | 141,176 | 43 | 5,320 | 354 | 628,599 |
| chrUn | 1,921,273 | 1,382 | 1,904 | 239 | 593 | 1,028 | 20,713 |
| All | 350,033,712 | 22,041 | 188,515 | 507 | 11,222 | 2,876 | 1,808,391 |
*The statistics are based on sequences longer than 500 bp.
†The number of sequences with lengths equal to or larger than N50.
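The footnotes above imply a two-step calculation: discard sequences of 500 bp or shorter, then find the N50 and count the sequences at least that long. The sketch below uses the standard definition of N50 (the largest length L such that sequences of length L or more hold at least half of the total bases). The function name `nx_stats` and its parameters are hypothetical, and this is not the authors' own script.

```python
from typing import Iterable, Tuple


def nx_stats(lengths: Iterable[int], fraction: float = 0.5,
             min_length: int = 500) -> Tuple[int, int]:
    """Return (Nx, count) for sequences strictly longer than `min_length`.

    Nx is the length at which the running total of the sorted lengths
    (longest first) reaches `fraction` of all retained bases; `count` is
    the number of retained sequences with length >= Nx.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    kept = sorted((n for n in lengths if n > min_length), reverse=True)
    if not kept:
        raise ValueError("no sequences longer than min_length")

    target = fraction * sum(kept)
    running = 0
    nx = kept[-1]
    for length in kept:
        running += length
        if running >= target:
            nx = length
            break

    # Count every sequence at least as long as Nx, ties included.
    count = sum(1 for n in kept if n >= nx)
    return nx, count


# Toy contig lengths (illustrative only, not taken from the assemblies).
toy = [1000, 900, 800, 700, 600, 400]
print(nx_stats(toy))                # N50 and its count
print(nx_stats(toy, fraction=0.9))  # N90 and its count
```

Calling the function with `fraction=0.9` gives the analogous N90 statistic, which is a stricter measure of how much of the assembly sits in long sequences.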
MH63 NGS contig assembly statistics*.
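| Chromosome | Total length (bp) | Number of sequences | N50 (bp) | N50 number† | N90 (bp) | N90 number | Longest sequence (bp) |
|---|---|---|---|---|---|---|---|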
| chr01 | 41,467,158 | 2,360 | 146,312 | 79 | 14,985 | 367 | 659,130 |
| chr02 | 34,737,720 | 1,760 | 162,664 | 68 | 19,298 | 266 | 616,724 |
| chr03 | 35,174,344 | 1,580 | 180,406 | 62 | 27,156 | 238 | 526,233 |
| chr04 | 32,237,781 | 2,758 | 83,872 | 88 | 5,715 | 629 | 652,894 |
| chr05 | 27,841,841 | 1,926 | 113,801 | 68 | 9,194 | 372 | 530,915 |
| chr06 | 28,862,311 | 2,744 | 59,335 | 114 | 5,123 | 737 | 447,621 |
| chr07 | 27,358,412 | 2,464 | 100,207 | 73 | 4,994 | 487 | 530,282 |
| chr08 | 25,940,982 | 1,999 | 93,823 | 73 | 7,300 | 401 | 587,588 |
| chr09 | 21,608,822 | 1,703 | 113,836 | 52 | 6,627 | 312 | 686,925 |
| chr10 | 21,690,036 | 2,087 | 80,081 | 72 | 4,308 | 482 | 376,283 |
| chr11 | 27,926,507 | 2,880 | 79,019 | 95 | 3,592 | 713 | 598,709 |
| chr12 | 25,125,875 | 2,559 | 80,004 | 85 | 3,845 | 585 | 367,314 |
| chrUn | 1,784,506 | 1,316 | 1,723 | 257 | 603 | 990 | 20,724 |
| All | 351,756,295 | 28,136 | 107,523 | 867 | 6,550 | 5,407 | 686,925 |
*The statistics are based on sequences longer than 500 bp.
†The number of sequences with lengths equal to or larger than N50.
Non-sequence data resources deposited at the iPlant Datastore.
| Data resource | Folder* |
|---|---|
| Physical maps (tag, bands and fpc files) | physical-maps |
| MTP clones | mtps |
| Sequencing pools | pools |
| HGAP jobs | smrt-jobs |
*URL: https://de.iplantcollaborative.org/de/?type=data&folder=/iplant/home/shared/agi_data/ZS97MH63.