Yuko Arai-Kichise, Yuh Shiwa, Kaworu Ebana, Mari Shibata-Hatta, Hirofumi Yoshikawa, Masahiro Yano, Kyo Wakasa.
Abstract
Elucidation of the rice genome is expected to broaden our understanding of genes related to agronomic characteristics and of the genetic relationships among cultivars. In this study, we conducted whole-genome sequencing of 6 cultivars, comprising 5 temperate japonica cultivars and 1 tropical japonica cultivar (Moroberekan), by next-generation sequencing (NGS) with the Nipponbare genome as a reference. The temperate japonica cultivars comprised 2 sake-brewing cultivars (Yamadanishiki and Gohyakumangoku), 1 landrace (Kameji), and 2 modern cultivars (Koshihikari and Norin 8). More than 83% of the Nipponbare genome sequence could be covered by the sequenced short-reads of each cultivar, including Omachi, which has previously been reported to be a temperate japonica cultivar. Numerous single nucleotide polymorphisms (SNPs), insertions, and deletions were detected between the various cultivars and the Nipponbare genome. Comparison of the SNPs detected in each cultivar suggested that Moroberekan had 5-fold more SNPs than the temperate japonica cultivars. Two approaches to improving the efficacy of NGS sequence data were successful. First, sequencing depth was directly related to sequencing coverage of coding DNA sequences: genome sequencing in excess of 30× was required to cover approximately 80% of the genes in the rice genome. Second, contigs assembled from unmapped reads could increase the value of NGS short-reads and, consequently, cover previously unavailable sequences. These approaches facilitated the identification of new genes in coding DNA sequences and increased mapping efficiency in different regions. The DNA polymorphism information between the 7 cultivars and Nipponbare is available at NGRC_Rices_Build1.0 (http://www.nodai-genome.org/oryza_sativa_en.html).
Year: 2014 PMID: 24466017 PMCID: PMC3897683 DOI: 10.1371/journal.pone.0086312
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Coverage and sequencing depth of mapped reads with reference to the Nipponbare chromosomal genome IRGSP1.0.
| Cultivar | Mapped reads: number of nucleotides (bp) | Mapped reads: genome coverage with sequencing depth ≥5 (%) | Uniquely mapped reads: number of nucleotides (bp) | Uniquely mapped reads: genome coverage with sequencing depth ≥5 (%) | Average sequencing depth (fold) |
| Omachi | 19,860,548,296 | 95.9 | 16,630,615,027 | 87.3 | 51 |
| Yamadanishiki | 14,879,630,564 | 96.0 | 12,458,183,627 | 88.8 | 38 |
| Kameji | 15,440,489,841 | 95.6 | 13,024,192,299 | 87.8 | 40 |
| Gohyakumangoku | 20,204,766,808 | 96.2 | 17,088,347,679 | 89.3 | 51 |
| Koshihikari | 16,560,127,574 | 97.0 | 13,914,811,848 | 90.1 | 41 |
| Norin 8 | 22,155,138,982 | 97.0 | 18,740,487,803 | 89.6 | 56 |
| Moroberekan | 11,909,078,634 | 91.5 | 9,924,861,817 | 83.4 | 32 |
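The coverage columns above report the share of reference positions with a sequencing depth of at least 5. A minimal sketch of that calculation follows, assuming one depth value per reference position is already available; the function names `coverage_at_depth` and `mean_depth` and the toy depths are illustrative assumptions, not the authors' pipeline.

```python
from typing import Sequence


def coverage_at_depth(depths: Sequence[int], min_depth: int = 5) -> float:
    """Return the percentage of positions whose depth is >= min_depth."""
    if not depths:
        raise ValueError("depths must contain at least one position")
    covered = sum(1 for d in depths if d >= min_depth)
    return 100.0 * covered / len(depths)


def mean_depth(depths: Sequence[int]) -> float:
    """Return the average depth over all positions."""
    if not depths:
        raise ValueError("depths must contain at least one position")
    return sum(depths) / len(depths)


# Toy per-base depths for a 10 bp reference (hypothetical values).
toy_depths = [0, 3, 5, 8, 12, 40, 41, 4, 6, 7]
print(coverage_at_depth(toy_depths))  # 7 of 10 positions have depth >= 5
print(mean_depth(toy_depths))
```

In practice the per-position depths would come from an alignment tool's depth report; the sketch only shows how the two summary columns relate to those values.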
Figure 1. Relationships between sequencing depth and sequence coverage in the coding sequences.
Omachi reads that were originally sequenced using a sequencing depth of 58× the genome were randomly eliminated to produce adjusted sequencing depths of 50×, 40×, 30×, 20×, and 10× the genome. The x- and y-axes show the sequencing depth and the number of genes covered with short-reads in over 90% of the coding sequences, respectively.
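The random elimination of reads described in this legend can be sketched as sampling without replacement. The example below assumes that all reads have the same length, so that keeping a fraction of the reads keeps the same fraction of the depth; the function name `downsample_reads`, the read identifiers, and the fixed seed are hypothetical and are not taken from the study.

```python
import random
from typing import List, Sequence


def downsample_reads(reads: Sequence[str], original_depth: float,
                     target_depth: float, seed: int = 0) -> List[str]:
    """Randomly keep a target_depth/original_depth fraction of the reads."""
    if not 0 < target_depth <= original_depth:
        raise ValueError("target_depth must be in (0, original_depth]")
    n_keep = round(len(reads) * target_depth / original_depth)
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    return rng.sample(list(reads), n_keep)


# Hypothetical read identifiers standing in for the 58x Omachi data set.
reads = [f"read_{i}" for i in range(5800)]
for depth in (50, 40, 30, 20, 10):
    subset = downsample_reads(reads, original_depth=58, target_depth=depth)
    print(depth, len(subset))
```

For paired-end data, both mates of a pair would need to be kept or dropped together; the sketch treats each read independently for brevity.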
Numbers of genes with defined sequence coverage in each cultivar.
| Sequence coverage (%) | Omachi | Yamadanishiki | Kameji | Gohyakumangoku | Koshihikari | Norin 8 | Moroberekan |
| 100 | 36,629 | 36,687 | 33,229 | 38,940 | 40,296 | 38,589 | 39,193 |
| 90≤ | 4,522 | 3,669 | 5,861 | 2,674 | 2,039 | 3,333 | 2,324 |
| 80≤ | 1,115 | 1,326 | 1,897 | 822 | 515 | 742 | 595 |
| 70≤ | 480 | 743 | 1,054 | 432 | 259 | 366 | 258 |
| 60≤ | 321 | 441 | 599 | 254 | 183 | 216 | 196 |
| 50≤ | 211 | 273 | 446 | 184 | 152 | 173 | 166 |
| 40≤ | 165 | 233 | 283 | 160 | 122 | 136 | 128 |
| 30≤ | 121 | 169 | 185 | 135 | 94 | 108 | 124 |
| 20≤ | 127 | 143 | 165 | 121 | 119 | 111 | 106 |
| 10≤ | 108 | 102 | 126 | 76 | 67 | 90 | 118 |
| 0< | 95 | 107 | 124 | 92 | 80 | 83 | 118 |
| 0 | 620 | 621 | 545 | 624 | 588 | 567 | 1,188 |
Sequence coverage indicates the percentage of the coding sequence covered by short-reads.
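As an illustration of how genes could be sorted into the coverage classes of this table, the sketch below maps a per-gene coverage percentage to a row label. It assumes that each class such as `90≤` covers values from its lower bound up to, but not including, the next class; the source does not state the bin edges explicitly, although each column summing to the same gene count (44,514) is consistent with non-overlapping classes. The names `coverage_class` and `tabulate` are hypothetical.

```python
from collections import Counter
from typing import Iterable

# Row labels in the same order as the table.
CLASS_LABELS = ["100", "90≤", "80≤", "70≤", "60≤", "50≤",
                "40≤", "30≤", "20≤", "10≤", "0<", "0"]


def coverage_class(percent: float) -> str:
    """Map a coding-sequence coverage percentage (0-100) to its row label."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    if percent == 100:
        return "100"
    if percent == 0:
        return "0"
    if percent < 10:
        return "0<"
    # Round down to the nearest multiple of 10 (assumed bin lower bound).
    return f"{int(percent // 10) * 10}≤"


def tabulate(percents: Iterable[float]) -> Counter:
    """Count how many genes fall into each coverage class."""
    return Counter(coverage_class(p) for p in percents)


# Hypothetical per-gene coverage values.
counts = tabulate([100, 99.5, 93.0, 81.2, 9.9, 0, 0])
for label in CLASS_LABELS:
    print(label, counts.get(label, 0))
```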
Densities of SNPs and InDels on individual chromosomes detected between the various cultivars and the Nipponbare genome IRGSP1.0.
| Chr. | Omachi | Yamadanishiki | Kameji | Gohyakumangoku | Koshihikari | Norin 8 | Moroberekan | |
| SNPs | ||||||||
| 1 | 14,683 | 14,460 | 6,217 | 7,059 | 18,138 | 2,780 | 82,017 | |
| 2 | 10,198 | 9,322 | 7,203 | 8,876 | 6,603 | 5,893 | 41,374 | |
| 3 | 6,111 | 6,074 | 5,428 | 4,138 | 4,499 | 2,920 | 50,560 | |
| 4 | 26,042 | 16,178 | 14,865 | 25,723 | 17,242 | 22,211 | 84,996 | |
| 5 | 5,706 | 6,119 | 2,677 | 5,801 | 2,053 | 1,184 | 61,350 | |
| 6 | 15,461 | 15,329 | 3,307 | 13,571 | 2,019 | 6,778 | 94,539 | |
| 7 | 12,938 | 10,644 | 10,752 | 10,431 | 11,829 | 3,136 | 61,282 | |
| 8 | 10,751 | 15,596 | 4,995 | 6,727 | 11,820 | 2,137 | 85,159 | |
| 9 | 2,953 | 1,379 | 1,522 | 17,471 | 547 | 2,700 | 38,306 | |
| 10 | 4,460 | 6,663 | 3,627 | 4,789 | 6,237 | 7,999 | 92,007 | |
| 11 | 17,834 | 17,610 | 12,215 | 27,182 | 21,715 | 17,399 | 61,069 | |
| 12 | 5,662 | 4,283 | 4,159 | 10,319 | 12,753 | 10,154 | 74,789 | |
| Total | 132,799 | 123,657 | 76,967 | 142,087 | 115,455 | 85,291 | 827,448 | |
| Insertions | ||||||||
| 1 | 1,900 | 2,037 | 2,236 | 1,328 | 2,312 | 619 | 8,752 | |
| 2 | 1,469 | 1,423 | 1,950 | 1,344 | 1,065 | 952 | 4,797 | |
| 3 | 893 | 937 | 1,041 | 875 | 795 | 522 | 5,492 | |
| 4 | 3,183 | 2,672 | 3,192 | 3,361 | 2,657 | 2,901 | 7,057 | |
| 5 | 921 | 942 | 753 | 968 | 391 | 327 | 5,684 | |
| 6 | 1,526 | 1,618 | 1,038 | 1,640 | 373 | 858 | 8,267 | |
| 7 | 1,472 | 1,345 | 1,979 | 1,247 | 1,518 | 473 | 5,981 | |
| 8 | 1,248 | 1,586 | 1,482 | 938 | 1,088 | 337 | 7,017 | |
| 9 | 403 | 309 | 724 | 1,628 | 180 | 473 | 3,425 | |
| 10 | 645 | 823 | 2,419 | 663 | 848 | 893 | 6,637 | |
| 11 | 1,784 | 1,855 | 2,835 | 2,718 | 2,277 | 1,812 | 5,924 | |
| 12 | 852 | 725 | 1,002 | 1,284 | 1,618 | 1,321 | 6,031 | |
| Total | 16,296 | 16,272 | 20,651 | 17,994 | 15,122 | 11,488 | 75,064 | |
| Deletions | ||||||||
| 1 | 1,956 | 2,045 | 2,333 | 1,440 | 2,476 | 679 | 9,700 | |
| 2 | 1,618 | 1,508 | 2,085 | 1,371 | 1,138 | 978 | 5,454 | |
| 3 | 1,035 | 1,025 | 1,057 | 829 | 884 | 509 | 6,129 | |
| 4 | 4,065 | 3,404 | 4,085 | 4,248 | 3,477 | 3,772 | 8,586 | |
| 5 | 1,060 | 1,083 | 787 | 1,127 | 427 | 411 | 6,457 | |
| 6 | 1,731 | 1,814 | 1,078 | 1,712 | 380 | 967 | 9,479 | |
| 7 | 1,698 | 1,494 | 2,211 | 1,315 | 1,668 | 505 | 6,642 | |
| 8 | 1,448 | 1,741 | 1,597 | 1,007 | 1,126 | 282 | 7,634 | |
| 9 | 479 | 351 | 795 | 1,806 | 151 | 487 | 3,835 | |
| 10 | 773 | 900 | 2,688 | 743 | 883 | 945 | 7,541 | |
| 11 | 2,112 | 2,119 | 3,183 | 3,146 | 2,608 | 2,181 | 6,304 | |
| 12 | 1,095 | 897 | 1,158 | 1,577 | 1,844 | 1,477 | 6,772 | |
| Total | 19,070 | 18,381 | 23,057 | 20,321 | 17,062 | 13,193 | 84,533 |
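The SNP totals in this table allow a quick check of how much more polymorphic Moroberekan is than each temperate japonica cultivar. The sketch below only divides totals copied from the SNP "Total" row; the variable names are illustrative.

```python
# SNP totals copied from the "Total" row of the SNP block above.
snp_totals = {
    "Omachi": 132_799,
    "Yamadanishiki": 123_657,
    "Kameji": 76_967,
    "Gohyakumangoku": 142_087,
    "Koshihikari": 115_455,
    "Norin 8": 85_291,
    "Moroberekan": 827_448,
}

reference = "Moroberekan"
# Ratio of the Moroberekan total to each other cultivar's total.
fold = {
    name: snp_totals[reference] / total
    for name, total in snp_totals.items()
    if name != reference
}
for name, ratio in sorted(fold.items(), key=lambda kv: kv[1]):
    print(f"{name}: {ratio:.1f}-fold")
```

With these totals, every ratio exceeds 5, ranging from about 5.8-fold (Gohyakumangoku) to about 10.8-fold (Kameji).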
Annotation of SNPs, insertions, and deletions.
| Category | Omachi | Yamadanishiki | Kameji | Gohyakumangoku | Koshihikari | Norin 8 | Moroberekan |
| SNP | |||||||
| Intergenic | 107,927 | 100,279 | 62,598 | 119,362 | 95,469 | 69,833 | 700,701 |
| Genic | 24,872 | 23,378 | 14,369 | 22,725 | 19,986 | 15,458 | 126,747 |
| Intron | 12,351 | 11,932 | 7,354 | 11,248 | 9,641 | 7,667 | 68,116 |
| UTRs | 4,665 | 4,255 | 2,387 | 4,367 | 3,737 | 2,929 | 24,131 |
| CDS | 7,856 | 7,191 | 4,628 | 7,110 | 6,608 | 4,862 | 34,500 |
| Synonymous | 3,668 | 3,355 | 2,100 | 3,251 | 3,008 | 2,142 | 15,876 |
| Nonsynonymous | 4,188 | 3,836 | 2,528 | 3,859 | 3,600 | 2,720 | 18,624 |
| Insertion, Deletion | |||||||
| Intergenic | 29,041 | 28,302 | 36,091 | 32,085 | 26,584 | 20,481 | 131,783 |
| Genic | 6,325 | 6,351 | 7,617 | 6,230 | 5,600 | 4,200 | 27,814 |
| Intron | 3,971 | 4,019 | 4,907 | 3,869 | 3,345 | 2,551 | 18,236 |
| UTRs | 1,515 | 1,488 | 1,652 | 1,488 | 1,381 | 958 | 6,616 |
| CDS | 839 | 844 | 1,058 | 873 | 874 | 691 | 2,962 |
SNPs, insertions, and deletions on the IRGSP rice pseudomolecules were classified as genic and intergenic, and locations within gene models were annotated. The number of SNPs, insertions, and deletions in each class is shown.
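The classification described in this note can be sketched as an interval lookup against a gene model. The example assumes a single gene with 1-based inclusive coordinates and treats genic positions outside the CDS and UTR intervals as intronic; the function name `classify_position` and all coordinates are hypothetical and do not reflect the annotation software used in the study.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # 1-based, inclusive (start, end)


def classify_position(pos: int, gene: Interval,
                      features: Dict[str, List[Interval]]) -> str:
    """Classify one position relative to a single gene model.

    features maps "CDS" and "UTR" to interval lists; genic positions
    outside both are treated as intronic.
    """
    if not gene[0] <= pos <= gene[1]:
        return "intergenic"
    for label in ("CDS", "UTR"):
        if any(start <= pos <= end for start, end in features.get(label, [])):
            return label
    return "intron"


# Hypothetical gene model on one chromosome.
gene = (100, 500)
features = {"UTR": [(100, 120), (480, 500)], "CDS": [(121, 200), (300, 479)]}
for pos in (50, 110, 150, 250, 490):
    print(pos, classify_position(pos, gene, features))
```

Under this scheme the genic count is the sum of the intron, UTR, and CDS counts, which matches the table (for example, 12,351 + 4,665 + 7,856 = 24,872 genic SNPs in Omachi).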
Annotation of contig sequences assembled from unmapped reads.
| Cultivar | Total | Nipponbare chromosomes 1–12: unique hits | (%) | Nipponbare chromosomes 1–12: multi-hits | (%) | Nipponbare unanchored | (%) | No hit | (%) |
| Omachi | 6,903 | 1,834 | 26.6 | 3,250 | 47.1 | 231 | 3.3 | 1,588 | 23.0 |
| Yamadanishiki | 4,020 | 827 | 20.6 | 2,785 | 69.3 | 63 | 1.6 | 345 | 8.6 |
| Kameji | 5,415 | 1,088 | 20.1 | 3,189 | 58.9 | 101 | 1.9 | 1,037 | 19.2 |
| Gohyakumangoku | 4,555 | 821 | 18.0 | 2,949 | 64.7 | 78 | 1.7 | 707 | 15.5 |
| Koshihikari | 3,773 | 809 | 21.4 | 2,482 | 65.8 | 60 | 1.6 | 422 | 11.2 |
| Norin 8 | 2,554 | 434 | 17.0 | 1,802 | 70.6 | 42 | 1.6 | 276 | 10.8 |
| Moroberekan | 23,706 | 6,717 | 28.3 | 14,292 | 60.3 | 358 | 1.5 | 2,339 | 9.9 |
A similarity search of the contig sequences was conducted against the Nipponbare genome chromosomes 1–12 and unanchored sequences. The numbers in the categories of unique, multi-hits, and no hits show the total number of contigs classified.
Figure 2. Configuration comparison of unique hit contigs among chromosomes.
The numbers in the bars show the number of classified contigs.
Number of newly mapped genes by contigs in the Nipponbare genome.
| Cultivar | Number of contigs: total | Number of contigs: newly mapped | Number of genes in newly mapped regions |
| Omachi | 547 | 75 | 28 |
| Yamadanishiki | 245 | 21 | 9 |
| Kameji | 362 | 15 | 4 |
| Gohyakumangoku | 225 | 13 | 7 |
| Koshihikari | 298 | 19 | 5 |
| Norin 8 | 115 | 7 | 2 |
| Moroberekan | 1,926 | 165 | 64 |
The newly mapped contigs contained newly mapped sequences of Nipponbare that had not been covered by any short-reads.
Numbers of contigs showing sequence similarity to Oryza sativa japonica and Oryza sativa indica.
| Cultivar | Hit: Oryza sativa japonica | Hit: Oryza sativa indica | No hit |
| Omachi | 214 | 76 | 1,298 |
| Yamadanishiki | 57 | 11 | 277 |
| Kameji | 98 | 71 | 868 |
| Gohyakumangoku | 53 | 5 | 649 |
| Koshihikari | 75 | 6 | 341 |
| Norin 8 | 22 | 10 | 244 |
| Moroberekan | 249 | 82 | 2,008 |
A similarity search of the contig sequences that were unaligned to regions of Nipponbare sequence was sequentially conducted against Oryza sativa japonica and Oryza sativa indica by using NCBI BLASTn search. The numbers in the categories of hit to japonica or indica, and no hit show the total number of contigs classified.