HyeRan Kim, Bonnie Hurwitz, Yeisoo Yu, Kristi Collura, Navdeep Gill, Phillip SanMiguel, James C Mullikin, Christopher Maher, William Nelson, Marina Wissotski, Michele Braidotti, David Kudrna, José Luis Goicoechea, Lincoln Stein, Doreen Ware, Scott A Jackson, Carol Soderlund, Rod A Wing.
Abstract
We describe the establishment and analysis of a genus-wide comparative framework composed of 12 bacterial artificial chromosome fingerprint and end-sequenced physical maps representing the 10 genome types of Oryza aligned to the O. sativa ssp. japonica reference genome sequence. Over 932 Mb of end sequence was analyzed for repeats, simple sequence repeats, miRNA and single nucleotide variations, providing the most extensive analysis of Oryza sequence to date.
Year: 2008 PMID: 18304353 PMCID: PMC2374706 DOI: 10.1186/gb-2008-9-2-r45
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Table 1. Summary of BAC end sequences of 12 Oryza species
| Species | Genome type | Genome size (Mb)* | No. of GenBank submissions | Average length after trim (in GenBank) | Total sequenced length (in GenBank) | Genome coverage | No. of forward reads | No. of reverse reads | No. of clones with paired reads (% of total BES) |
| | AA | 448 | 106,124 | 665 bp | ~ 71 Mb | 16% | 53,450 | 52,674 | 51,820 (98%) |
| | AA | 439 | 70,982 | 704 bp | ~ 50 Mb | 11% | 35,473 | 35,509 | 34,747 (98%) |
| | AA | 357 | 66,821 | 590 bp | ~ 39 Mb | 11% | 33,456 | 33,365 | 30,885 (92%) |
| | BB | 425 | 68,384 | 710 bp | ~ 49 Mb | 11% | 35,051 | 33,333 | 32,284 (94%) |
| | CC | 651 | 101,091 | 717 bp | ~ 72 Mb | 11% | 49,972 | 49,052 | 47,197 (93%) |
| | BBCC | 1,124 | 169,460 | 559 bp | ~ 95 Mb | 8% | 85,466 | 83,994 | 82,248 (97%) |
| | CCDD | 1,008 | 128,732 | 586 bp | ~ 75 Mb | 7% | 64,365 | 64,367 | 58,217 (90%) |
| | EE | 965 | 135,769 | 625 bp | ~ 85 Mb | 9% | 67,245 | 68,524 | 64,081 (94%) |
| | FF | 362 | 67,364 | 672 bp | ~ 45 Mb | 13% | 34,009 | 33,355 | 32,258 (96%) |
| | GG | 882 | 138,171 | 674 bp | ~ 93 Mb | 11% | 69,962 | 68,209 | 66,434 (96%) |
| | HHJJ | 1,283 | 204,729 | 632 bp | ~ 129 Mb | 10% | 102,640 | 102,089 | 98,160 (96%) |
| | HHKK | 771† | 195,285 | 661 bp | ~ 129 Mb | 17% | 100,341 | 94,944 | 91,853 (94%) |
| Total/Average | | | 1,452,912 | 650 bp | ~ 932 Mb | 11% | 731,430 | 719,415 | 690,184 (95%) |
*From [30]. †By K Arumuganathan using flow cytometric methods as previously described [30].
Table 2. Summary of phase I FPC physical maps of 12 Oryza species
| Species | Average insert size | No. of total attempted | No. of clones FPCed | Genome coverage by all FPCed clones | Success ratio | Total organellar contam + no. inserts containing clones* | No. of FPC clones with paired BES reads | No. of singletons (%) | No. of contigs | Total CB units | Average no. of bands/clone | Deduced size of 1 CB unit† (kb) | Deduced genome size (Mb)‡ (coverage) |
| | 161 | 55,296 | 51,056 | 18 | 92% | 4.1% | 48,032 (94%) | 2,356 (4.6%) | 456 | 384,733 | 121.6 | 1.32 | 509 (114%) |
| | 134 | 35,712 | 33,023 | 10 | 92% | 3.9% | 31,202 (94%) | 1,305 (4.0%) | 637 | 406,768 | 106.2 | 1.26 | 513 (117%) |
| | 130 | 36,864 | 33,065 | 12 | 90% | 3.7% | 27,661 (84%) | 2,098 (6.3%) | 905 | 309,740 | 103.9 | 1.25 | 388 (109%) |
| | 142 | 36,864 | 34,224 | 11 | 93% | 1.8% | 30,228 (88%) | 1,482 (4.3%) | 490 | 340,240 | 117.4 | 1.21 | 412 (97%) |
| | 141 | 54,144 | 45,856 | 10 | 85% | 6.7% | 40,549 (88%) | 1,133 (2.5%) | 703 | 462,126 | 115.5 | 1.22 | 564 (87%) |
| | 125 | 92,160 | 86,861 | 10 | 94% | 1.2% | 77,999 (90%) | 9,576 (11.0%) | 3,962 | 1,004,697 | 96.6 | 1.29 | 1,300 (116%) |
| | 133 | 73,728 | 63,860 | 8 | 87% | 0.4% | 50,842 (80%) | 3,111 (4.9%) | 2,492 | 870,411 | 109.3 | 1.22 | 1,059 (105%) |
| | 152 | 73,728 | 67,416 | 11 | 91% | 2.7% | 58,992 (88%) | 3,144 (4.7%) | 1,003 | 691,155 | 125.1 | 1.22 | 840 (87%) |
| | 131 | 36,864 | 33,424 | 12 | 91% | 1.7% | 29,523 (88%) | 2,354 (7.0%) | 422 | 215,208 | 100.9 | 1.30 | 279 (77%) |
| | 134 | 73,728 | 64,836 | 10 | 88% | 3.4% | 58,818 (91%) | 3,032 (4.7%) | 2,358 | 1,078,237 | 120.8 | 1.11 | 1,196 (136%) |
| | 127 | 110,592 | 104,393 | 10 | 94% | 2.1% | 93,192 (89%) | 5,169 (5.0%) | 2,190 | 1,216,461 | 111.2 | 1.14 | 1,389 (108%) |
| | 123 | 100,224 | 92,522 | 15 | 92% | 2.1% | 82,180 (89%) | 1,810 (2.0%) | 1,250 | 611,953 | 100.0 | 1.23 | 753 (98%) |
| Total/Average | 136 | 779,904 | 710,536 | 11 | 91% | 2.8% | 629,218 (89%) | 5.1% | | | 110.7 | 1.23 | |
*From [30]. †Deduced size of one CB unit = average insert size/average number of bands. ‡Deduced genome size = deduced size of one CB unit × total CB units.
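The two footnote formulas can be checked with a few lines of arithmetic. The sketch below is only an illustration, not the authors' pipeline: the function names are hypothetical, the conversion assumes 1 Mb = 1,000 kb, and the inputs are the first data row of the table above (average insert size 161 kb, 121.6 bands per clone, 384,733 total CB units).

```python
def cb_unit_size_kb(avg_insert_kb: float, avg_bands_per_clone: float) -> float:
    """Deduced size of one CB unit (kb) = average insert size / average number of bands."""
    return avg_insert_kb / avg_bands_per_clone


def deduced_genome_size_mb(cb_unit_kb: float, total_cb_units: int) -> float:
    """Deduced genome size (Mb) = CB unit size (kb) x total CB units / 1000."""
    return cb_unit_kb * total_cb_units / 1000.0


# Values from the first data row of the FPC summary table.
unit_kb = cb_unit_size_kb(161, 121.6)
genome_mb = deduced_genome_size_mb(unit_kb, 384_733)

print(f"{unit_kb:.2f} kb per CB unit, {genome_mb:.0f} Mb deduced genome size")
```

With the unrounded ratio this reproduces the tabulated 1.32 kb and 509 Mb; multiplying by the rounded 1.32 kb instead gives roughly 508 Mb.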
Table 3. Summary of alignments of 12 OMAP Phase I FPC maps to the O. sativa RefSeq
| Species | No. of contigs aligned | Total CB aligned | Total size aligned* (Mb) | No. of clones in aligned contig | No. of BESs aligned | Average identity of BES alignments | E-value of BES alignments | Not aligned contig: CB/contig (average) | Not aligned contig: no. of clones/contig (average) |
| | 368 (81%) | 371,561 (97%) | 490 (109%) | 48,424 (99%) | 63,293 (60%) | 97% | e-237 to e-33 | 150 | 3 |
| | 581 (91%) | 399,346 (98%) | 503 (115%) | 31,398 (99%) | 36,725 (52%) | 97% | e-224 to e-57 | 133 | 6 |
| | 860 (95%) | 302,696 (98%) | 378 (106%) | 30,708 (99%) | 33,356 (50%) | 97% | e-213 to e-42 | 157 | 6 |
| | 462 (94%) | 332,224 (98%) | 402 (95%) | 31,674 (97%) | 25,842 (38%) | 93% | e-183 to e-30 | 286 | 38 |
| | 617 (88%) | 448,477 (97%) | 547 (84%) | 44,194 (99%) | 25,699 (25%) | 93% | e-204 to e-16 | 159 | 6 |
| | 3,073 (78%) | 833,710 (83%) | 1,075 (96%) | 66,277 (86%) | 46,227 (27%) | 93% | e-173 to e-07 | 192 | 12 |
| | 1,571 (63%) | 671,625 (77%) | 819 (81%) | 50,281 (83%) | 10,770 (8%) | 93% | e-193 to e-02 | 216 | 11 |
| | 787 (78%) | 630,836 (91%) | 770 (80%) | 60,538 (94%) | 20,429 (15%) | 93% | e-182 to e-03 | 279 | 17 |
| | 351 (83%) | 205,608 (96%) | 267 (74%) | 30,717 (99%) | 25,387 (38%) | 91% | e-175 to e-18 | 135 | 5 |
| | 1,444 (61%) | 814,383 (76%) | 904 (102%) | 50,117 (81%) | 18,785 (14%) | 91% | e-179 to e-05 | 289 | 13 |
| | 1,756 (80%) | 1,123,902 (92%) | 1,281 (100%) | 94,565 (95%) | 37,300 (18%) | 92% | e-170 to e-06 | 213 | 11 |
| | 1,002 (80%) | 579,809 (95%) | 713 (92%) | 89,775 (99%) | 53,920 (28%) | 92% | e-184 to e-07 | 130 | 4 |
*Deduced from [1 CB unit size from Table 2 × Total CB unit aligned].
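As a worked instance of the footnote, the aligned size and the aligned share of CB units can be recomputed for one row. This is a hedged sketch: the variable names are hypothetical, and it assumes that the first data row of this table and the first data row of Table 2 describe the same species (the species labels are missing from both), so that the rounded CB unit size of 1.32 kb and the total of 384,733 CB units apply here.

```python
# Assumed inputs: first data rows of Table 2 and of the alignment table.
CB_UNIT_KB = 1.32          # deduced size of 1 CB unit (kb), as tabulated (rounded)
TOTAL_CB_UNITS = 384_733   # total CB units in the physical map
ALIGNED_CB_UNITS = 371_561 # total CB aligned to the O. sativa RefSeq

# Total size aligned (Mb) = CB unit size (kb) x aligned CB units / 1000.
aligned_mb = CB_UNIT_KB * ALIGNED_CB_UNITS / 1000.0

# Share of the map's CB units that fall in aligned contigs.
aligned_fraction = ALIGNED_CB_UNITS / TOTAL_CB_UNITS

print(f"{aligned_mb:.0f} Mb aligned, {aligned_fraction:.0%} of CB units")
```

Under these assumptions the result matches the tabulated 490 Mb and 97%.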
Figure 1. SyMAP view of unedited physical maps of chromosome 1 from eight diploid Oryza species aligned to the O. sativa ssp. japonica chromosome 1 RefSeq. The numbers in the small rectangles on the left are contig numbers of the OMAP phase I physical maps. Beige bars on the right represent the O. sativa RefSeq (IRGSP V.4 assembly), and the red crosses on the beige bars mark the CentO position of O. sativa [18]. Purple lines represent BESs aligned to the O. sativa RefSeq. The species, from left to right, are: O. nivara [AA], O. rufipogon [AA], O. glaberrima [AA], O. punctata [BB], O. officinalis [CC], O. australiensis [EE], O. brachyantha [FF], and O. granulata [GG].
Table 4. Repeat content analysis of 12 Oryza species using RepeatMasker and RECON
| Species | Repeat content by RepeatMasker | RECON: total | RECON: overlap with RepeatMasker (% of total) | RECON: unique (% of total) | Total repeat content |
| | 29.4% | 37.2% | 23.8% (64%) | 13.4% (36%) | 42.8% |
| | 37.0% | 47.6% | 29.5% (62%) | 18.1% (38%) | 55.1% |
| | 36.6% | 49.7% | 34.6% (70%) | 15.2% (30%) | 51.7% |
| | 28.5% | 30.9% | 19.8% (64%) | 11.2% (36%) | 39.7% |
| | 40.8% | 41.4% | 30.7% (74%) | 10.7% (26%) | 51.5% |
| | 47.2% | 56.5% | 37.6% (67%) | 18.9% (33%) | 66.1% |
| | 44.0% | 46.3% | 32.2% (70%) | 14.1% (30%) | 58.1% |
| | 43.2% | 47.9% | 33.2% (69%) | 14.7% (31%) | 57.9% |
| | 55.0% | 66.4% | 45.4% (68%) | 21.0% (32%) | 76.0% |
| | 20.9% | 27.5% | 11.6% (42%) | 15.9% (58%) | 36.8% |
| | 30.1% | 30.9% | 20.5% (66%) | 10.4% (34%) | 40.5% |
| | 35.6% | 42.7% | 23.0% (54%) | 19.7% (46%) | 55.2% |
| | 19.7% | 24.5% | 7.7% (32%) | 16.7% (68%) | 36.4% |
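The columns of this table appear to be related by simple arithmetic: the total repeat content matches the RepeatMasker figure plus the RECON-only (unique) fraction, and the bracketed percentages match shares of the RECON total. That relation is inferred from the numbers, not stated in the source, so the sketch below is an assumption-labeled check with a hypothetical function name, using the first data row (29.4%, 37.2%, 23.8%).

```python
def summarize_repeats(repeatmasker_pct: float, recon_total_pct: float,
                      recon_overlap_pct: float) -> dict:
    """Recompute derived columns under the assumed relation
    total = RepeatMasker + (RECON total - RECON overlap)."""
    recon_unique_pct = recon_total_pct - recon_overlap_pct
    return {
        "recon_unique_pct": round(recon_unique_pct, 1),
        "overlap_share_of_recon": round(100 * recon_overlap_pct / recon_total_pct),
        "unique_share_of_recon": round(100 * recon_unique_pct / recon_total_pct),
        "total_repeat_pct": round(repeatmasker_pct + recon_unique_pct, 1),
    }


# First data row of the repeat content table.
row = summarize_repeats(29.4, 37.2, 23.8)
print(row)
```

Some rows differ from this relation by 0.1 percentage point, presumably because the published values were rounded independently.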
Figure 2. Comparison of classified repeat composition in the genomes of 12 Oryza species and the O. sativa RefSeq.
Table 5. Distribution of OMAP non-redundant SSRs by motif type
| Species | Ploidy | Total no. of SSRs containing BESs (%) | Di: no. | Di: % | Tri: no. | Tri: % | Tetra: no. | Tetra: % | Penta: no. | Penta: % | Hexa: no. | Hexa: % | Total non-redundant SSRs: no. | SSR density (no. of SSRs/Mb) |
| | 2 | - | 8,636 | 41.81 | 6,259 | 30.31 | 2,439 | 11.81 | 2,628 | 12.72 | 691 | 3.35 | 20,653 | 55.8 |
| | 2 | - | 8,317 | 42.94 | 6,059 | 31.28 | 2,285 | 11.80 | 2,142 | 11.06 | 567 | 2.93 | 19,370 | 47.2 |
| | 2 | 2,129 (2.01) | 739 | 46.33 | 451 | 28.28 | 189 | 11.85 | 170 | 10.66 | 46 | 2.88 | 1,595 | 22.5 |
| | 2 | 1,599 (2.25) | 696 | 49.40 | 375 | 26.61 | 169 | 11.99 | 137 | 9.72 | 32 | 2.27 | 1,409 | 28.2 |
| | 2 | 1,289 (1.93) | 518 | 45.00 | 319 | 27.72 | 124 | 10.77 | 153 | 13.29 | 37 | 3.21 | 1,151 | 29.5 |
| | 2 | 1,043 (1.53) | 408 | 44.54 | 268 | 29.26 | 105 | 11.46 | 100 | 10.92 | 35 | 3.82 | 916 | 18.7 |
| | 2 | 1,397 (1.35) | 545 | 47.43 | 308 | 26.81 | 111 | 9.66 | 146 | 12.71 | 39 | 3.39 | 1,149 | 16.0 |
| | 2 | 1,306 (0.96) | 597 | 54.17 | 230 | 20.87 | 94 | 8.53 | 128 | 11.62 | 53 | 4.81 | 1,102 | 13.0 |
| | 2 | 1,630 (2.42) | 590 | 48.68 | 287 | 23.68 | 135 | 11.14 | 166 | 13.70 | 34 | 2.81 | 1,212 | 26.9 |
| | 2 | 1,174 (0.85) | 603 | 55.99 | 313 | 29.06 | 64 | 5.94 | 73 | 6.78 | 24 | 2.23 | 1,077 | 11.6 |
| | 4 | 1,939 (1.14) | 921 | 54.27 | 386 | 22.75 | 176 | 10.37 | 168 | 9.90 | 46 | 2.71 | 1,697 | 17.9 |
| | 4 | 1,582 (1.23) | 812 | 52.83 | 363 | 23.62 | 182 | 11.84 | 141 | 9.17 | 39 | 2.54 | 1,537 | 20.5 |
| | 4 | 2,305 (1.13) | 709 | 35.36 | 741 | 36.96 | 238 | 11.87 | 257 | 12.82 | 60 | 2.99 | 2,005 | 15.5 |
| | 4 | 2,671 (1.37) | 1,108 | 52.02 | 500 | 23.47 | 257 | 12.07 | 198 | 9.30 | 67 | 3.15 | 2,130 | 16.5 |
| OMAP total | | 20,064 (1.38) | 8,246 | 48.56 | 4,541 | 26.74 | 1,844 | 10.86 | 1,837 | 10.82 | 512 | 3.02 | 16,980 | 18.2 |
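The percentage and density columns in the OMAP total row can be recomputed from the motif counts. The sketch below is illustrative only; it assumes that the density denominator is the approximately 932 Mb of total end sequence reported in the BAC end sequence summary, which the source does not state explicitly, and the variable names are hypothetical.

```python
# Non-redundant SSR counts by motif type, from the OMAP total row.
motif_counts = {"Di": 8246, "Tri": 4541, "Tetra": 1844, "Penta": 1837, "Hexa": 512}

# Assumed denominator: total sequenced BES length (~932 Mb).
TOTAL_BES_MB = 932

total_ssrs = sum(motif_counts.values())

# Share of each motif type among all non-redundant SSRs (%).
motif_share_pct = {
    motif: round(100 * count / total_ssrs, 2)
    for motif, count in motif_counts.items()
}

# SSR density in SSRs per Mb of end sequence.
ssr_density = round(total_ssrs / TOTAL_BES_MB, 1)

print(total_ssrs, motif_share_pct, ssr_density)
```

Under that assumption the counts sum to 16,980 and the recomputed shares and density agree with the tabulated values (for example 48.56% di-nucleotide repeats and 18.2 SSRs/Mb).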
Figure 3. Comparison of the ten most frequent SSR motif compositions in 14 Oryza species (japonica, O. sativa ssp. japonica; indica, O. sativa ssp. indica; OMAP, average of 12 OMAP species).
Table 6. Distribution of SSR length by the repeat types
| Repeat | OMAP SSR: no. | OMAP SSR: average length | OMAP SSR: standard deviation | No. | Average length | Standard deviation | No. | Average length | Standard deviation |
| Di | 8,246 | 44.51 | 45.15 | 8,636 | 43.32 | 24.94 | 8,317 | 40.78 | 22.2 |
| Tri | 4,541 | 28.75 | 13.61 | 6,259 | 29.26 | 17.59 | 6,059 | 28.89 | 15.71 |
| Tetra | 1,844 | 27.69 | 12.69 | 2,439 | 34.25 | 28.81 | 2,285 | 29.9 | 16.51 |
| Penta | 1,837 | 25.11 | 7.73 | 2,628 | 25.24 | 6.39 | 2,142 | 25.52 | 9 |
| Hexa | 512 | 27.46 | 10.5 | 691 | 29.96 | 41.33 | 567 | 28.04 | 20.58 |
| [C/G]A DNR* | 3,151 | 30.29 | 11.08 | 3,617 | 32.31 | 14.76 | 3,722 | 31.55 | 12.45 |
| TA DNR* | 5,086 | 53.32 | 55 | 4,996 | 51.38 | 27.62 | 4,573 | 48.38 | 25.32 |
| G/C rich TNR† | 2,442 | 24.56 | 4.51 | 4,673 | 24.85 | 5.59 | 4,448 | 24.79 | 5.56 |
| A/T rich TNR† | 2,099 | 33.64 | 18.24 | 1,586 | 42.25 | 30.07 | 1,611 | 40.19 | 25.86 |
*Di-nucleotide repeat. †Tri-nucleotide repeat.
Table 7. miRNA conservation between O. sativa and four wild rice species detected from their BES datasets
| Target mRNA class | Total no. of loci | |||||
| miR156 | ND | d | 1 | |||
| miR159 | MYB and TCP TFs | e | b | 2 | ||
| miR160 | ND | d | 1 | |||
| miR162 | DICER-LIKE 1 | b | 1 | |||
| miR164 | ND | a | 1 | |||
| miR166 | HD-Zip TFs | f, h, k | h, k, n | h, n | c | 9 |
| miR167 | Auxin response factors TFs | a, d, f | j | d | a | 6 |
| miR168 | ARGONAUTE | b | 1 | |||
| miR169 | CCAAT binding factor and HAP2-like TFs | n, p | c | 3 | ||
| miR171 | SCARECROW-like TFs | c, e, f | c | c, d | a | 7 |
| miR319 | ND | a | 1 | |||
| miR395 | ATP sulphurylases | i, j, k | 3 | |||
| miR396 | GRF TFs, rhodenase-like, and kinesin-like protein B | b | 1 | |||
| miR397 | Laccases and beta-6 tubulin | a | a | 2 | ||
| miR399 | Phosphatase TFs | e | b, e | 3 | ||
| miR418 | ND | x | x | 2 | ||
| miR420 | ND | x | x | x | 3 | |
| miR441 | ND | b | a, b, c | c | 5 | |
| miR442 | ND | x | x | x | 3 | |
| miR446 | ND | x | x | x | 3 | |
| miR531 | ND | x | x | 2 | ||
| miR535 | ND | x | x | x | x | 4 |
| Total no. of loci | 20 | 22 | 15 | 7 | 64 | |
*Named miRNA loci ID [69] are shown by species (for example, a putative miR156d locus ID is noted using a 'd' in the O. rufipogon column). For miRNA without a locus ID (for example, miR408), an 'x' is shown. ND, not determined; TF, transcription factor.