Yong-Bi Fu, Gregory W Peterson, Yibo Dong.
Abstract
Genotyping-by-sequencing (GBS) has emerged as a useful genomic approach for exploring genome-wide genetic variation. However, GBS commonly samples a genome unevenly and can generate a substantial amount of missing data. These technical features can limit the power of various GBS-based genetic and genomic analyses. Here we present software called IgCoverage for in silico evaluation of genomic coverage through GBS with an individual restriction enzyme or a pair of restriction enzymes on one sequenced genome, and report a new set of 21 restriction enzyme combinations that can be applied to enhance GBS applications. These enzyme combinations were developed by applying IgCoverage to 22 plant, animal, and fungus species with sequenced genomes, and some of them were empirically evaluated with different runs of Illumina MiSeq sequencing in 12 plant species. The in silico analysis of 22 organisms revealed up to eight times more genome coverage for the new combinations, which pair four- or five-cutter restriction enzymes, than for the commonly used enzyme combination PstI + MspI. The empirical evaluation of the new enzyme combination (HinfI + HpyCH4IV) in 12 plant species showed 1.7 to 6 times more genome coverage than PstI + MspI, and 2.3 times more genome coverage in dicots than in monocots. Also, SNP genotyping in 12 Arabidopsis and 12 rice plants revealed that HinfI + HpyCH4IV generated 7 and 1.3 times more SNPs (with 0 to 16.7% missing observations) than PstI + MspI, respectively. These findings demonstrate that these novel enzyme combinations can be utilized to increase genome sampling and improve SNP genotyping in various GBS applications.
Keywords: SNP genotyping; genome coverage; genotyping-by-sequencing; in silico analysis; restriction enzyme combination
Year: 2016 PMID: 26818077 PMCID: PMC4825655 DOI: 10.1534/g3.115.025775
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Figure 1. Procedures used to explore new REs for GBS application. A flow chart for exploring restriction enzyme (RE) pairs for a genotyping-by-sequencing (GBS) application to increase genome coverage of a species through in silico analysis and empirical validation. Genome coverage is measured as the proportion of the genome covered by a selected set of DNA fragments digested with an RE or RE pair. IgC and EgC are the genome coverages of a species estimated from in silico analysis and empirical validation, respectively. Two shell scripts (IgCoverage1RE.sh and IgCoverage2RE.sh) are part of the IgCoverage software developed for this study.
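The coverage measure described in this caption can be illustrated with a minimal Python sketch. It is not the IgCoverage implementation: the function names, the toy sequence, and the simplification of treating each recognition-site start as the cut position are assumptions made here for illustration. The recognition sequences are the standard ones for these enzymes, and the 100–600 bp size window follows the table captions.

```python
import re

# Recognition sequences in IUPAC notation (N = any base, W = A or T).
SITES = {
    "PstI": "CTGCAG", "MspI": "CCGG",
    "HinfI": "GANTC", "HpyCH4IV": "ACGT",
    "AvaII": "GGWCC", "BfaI": "CTAG",
}
IUPAC = {"N": "[ACGT]", "W": "[AT]"}

def cut_positions(seq, site):
    """Start positions of every (possibly overlapping) occurrence of a site."""
    regex = "".join(IUPAC.get(base, base) for base in site)
    return [m.start() for m in re.finditer("(?=" + regex + ")", seq)]

def in_silico_coverage(seq, enzymes, min_len=100, max_len=600):
    """Return (fragment count, coverage %) for size-selected fragments.

    Assumption: a fragment is the stretch between two adjacent cut sites,
    and every fragment in the size window is kept regardless of its ends.
    """
    seq = seq.upper()
    cuts = sorted({p for e in enzymes for p in cut_positions(seq, SITES[e])})
    lengths = [b - a for a, b in zip(cuts, cuts[1:])]
    kept = [n for n in lengths if min_len <= n <= max_len]
    return len(kept), 100.0 * sum(kept) / len(seq)

# Toy 930 bp sequence: MspI sites at 50 and 906, one PstI site at 200.
toy = "A" * 50 + "CCGG" + "A" * 146 + "CTGCAG" + "A" * 700 + "CCGG" + "A" * 20
count, coverage = in_silico_coverage(toy, ["PstI", "MspI"])
print(count, round(coverage, 1))
```

In the toy sequence, the enzyme pair yields one 150 bp fragment inside the size window, whereas MspI alone yields a single 856 bp fragment that is filtered out. This is the kind of gain from pairing enzymes that the in silico screen is meant to expose.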
The in silico genome coverages (IgC; %) of four plant species by DNA fragments of different lengths (100–600 bp) obtained from in silico digestions with 60 individual restriction enzymes
| | | | | Soybean | | Rice | | Maize | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Enzyme | SL | Count | IgC | Count | IgC | Count | IgC | Count | IgC | Mean | SD |
| | 4 | 249,612 | 56.7 | 2,266,324 | 60.7 | 1,030,613 | 79.0 | 4,817,820 | 59.6 | 64.0 | 10.1 |
| | 4 | 376,984 | 67.3 | 2,648,781 | 61.9 | 1,150,523 | 58.9 | 6,059,914 | 53.9 | 60.5 | 5.6 |
| | 4 | 249,639 | 56.7 | 2,266,718 | 60.8 | 945,109 | 62.2 | 4,818,313 | 59.6 | 59.8 | 2.3 |
| | 4 | 360,972 | 64.7 | 2,790,706 | 57.9 | 969,836 | 56.2 | 4,546,748 | 54.1 | 58.2 | 4.6 |
| | 4 | 244,853 | 54.9 | 2,194,161 | 58.9 | 914,234 | 60.1 | 4,668,840 | 58.6 | 58.1 | 2.2 |
| | 4 | 349,553 | 65.9 | 2,583,144 | 58.7 | 821,137 | 51.2 | 3,442,333 | 43.4 | 54.8 | 9.7 |
| | 4 | 253,157 | 55.7 | 1,410,267 | 41.0 | 765,561 | 53.7 | 4,481,014 | 56.5 | 51.7 | 7.2 |
| | 4 | 230,319 | 52.6 | 1,256,292 | 38.6 | 680,477 | 49.4 | 3,833,020 | 51.2 | 47.9 | 6.4 |
| | 5 | 247,146 | 55.7 | 1,432,238 | 43.6 | 526,686 | 39.7 | 3,730,642 | 50.4 | 47.4 | 7.1 |
| | 5 | 224,589 | 51.4 | 1,181,936 | 35.8 | 574,679 | 42.6 | 3,713,268 | 48.5 | 44.6 | 6.9 |
| | 5 | 187,752 | 44.6 | 1,304,502 | 39.3 | 491,752 | 37.8 | 3,621,546 | 49.2 | 42.7 | 5.2 |
| | 4 | 143,863 | 36.0 | 1,200,979 | 36.5 | 585,600 | 43.5 | 3,499,202 | 47.8 | 41.0 | 5.7 |
| | 4 | 172,656 | 41.6 | 638,273 | 19.9 | 510,683 | 37.5 | 3,315,820 | 44.3 | 35.8 | 11.0 |
| | 5 | 150,724 | 37.7 | 782,320 | 25.0 | 530,653 | 40.1 | 2,520,324 | 36.1 | 34.7 | 6.7 |
| | 4 | 117,822 | 30.2 | 785,561 | 25.3 | 528,208 | 39.5 | 2,434,929 | 35.5 | 32.6 | 6.2 |
| | 5 | 165,612 | 40.1 | 822,062 | 26.9 | 291,725 | 23.2 | 1,693,916 | 25.2 | 28.8 | 7.6 |
| | 4 | 41,199 | 11.0 | 498,808 | 15.1 | 464,330 | 33.0 | 3,238,709 | 40.6 | 24.9 | 14.2 |
| | 5 | 65,979 | 16.8 | 368,604 | 11.5 | 479,530 | 32.9 | 2,810,234 | 35.0 | 24.1 | 11.6 |
| | 4 | 107,461 | 27.1 | 544,678 | 16.9 | 342,538 | 26.3 | 1,565,649 | 22.7 | 23.3 | 4.6 |
| | 5 | 39,880 | 10.8 | 483,707 | 15.0 | 355,928 | 26.9 | 2,817,258 | 37.3 | 22.5 | 12.0 |
| | 5 | 38,190 | 10.0 | 289,281 | 9.2 | 350,643 | 25.9 | 2,680,165 | 34.3 | 19.9 | 12.3 |
| | 4 | 46,464 | 11.6 | 187,972 | 5.8 | 378,091 | 25.9 | 2,464,695 | 30.8 | 18.5 | 11.8 |
| | 5 | 45,865 | 12.0 | 257,226 | 8.3 | 338,090 | 25.2 | 1,771,780 | 24.6 | 17.5 | 8.6 |
| | 5 | 50,386 | 13.5 | 385,719 | 12.7 | 200,604 | 16.6 | 1,249,590 | 18.8 | 15.4 | 2.8 |
| | 4 | 12,350 | 3.4 | 145,999 | 4.3 | 285,864 | 19.5 | 2,179,751 | 26.9 | 13.5 | 11.6 |
| | 5 | 21,241 | 5.7 | 183,781 | 5.9 | 183,039 | 14.8 | 1,582,998 | 22.1 | 12.1 | 7.9 |
| | 5 | 22,727 | 6.3 | 241,494 | 7.8 | 144,992 | 11.8 | 1,395,284 | 20.0 | 11.5 | 6.2 |
| | 4 | 13,101 | 3.4 | 114,394 | 3.2 | 247,293 | 16.2 | 1,539,695 | 19.4 | 10.6 | 8.5 |
| | 5 | 19,497 | 4.8 | 55,184 | 1.6 | 207,880 | 13.8 | 1,401,069 | 18.2 | 9.6 | 7.7 |
| | 6 | 44,863 | 11.4 | 493,702 | 15.5 | 92,294 | 7.3 | 213,496 | 3.3 | 9.4 | 5.3 |
| | 6 | 41,760 | 11.1 | 189,122 | 6.3 | 105,736 | 8.7 | 621,055 | 9.9 | 9.0 | 2.0 |
| | 5 | 6272 | 1.7 | 55,716 | 1.9 | 125,439 | 9.4 | 973,572 | 13.3 | 6.5 | 5.7 |
| | 6 | 30,444 | 8.0 | 333,631 | 10.6 | 54,716 | 4.3 | 127,778 | 2.0 | 6.2 | 3.8 |
| | 6 | 5528 | 1.4 | 16,387 | 0.5 | 122,510 | 8.9 | 413,772 | 6.2 | 4.3 | 4.0 |
| | 6 | 9536 | 2.6 | 141,415 | 4.6 | 46,432 | 3.9 | 163,998 | 2.4 | 3.4 | 1.0 |
| | 6 | 9536 | 2.6 | 141,415 | 4.6 | 46,432 | 3.9 | 163,998 | 2.4 | 3.4 | 1.0 |
| | 6 | 17,751 | 4.7 | 97,975 | 3.2 | 17,513 | 1.6 | 198,699 | 3.3 | 3.2 | 1.3 |
| | 6 | 8161 | 2.2 | 30,497 | 1.0 | 17,778 | 1.5 | 63,336 | 1.0 | 1.4 | 0.6 |
| | 6 | 5555 | 1.6 | 45,234 | 1.5 | 11,684 | 1.0 | 65,745 | 1.1 | 1.3 | 0.3 |
| | 6 | 166 | 0.0 | 1436 | 0.0 | 48,090 | 3.4 | 93,246 | 1.3 | 1.2 | 1.6 |
| | 6 | 2501 | 0.7 | 8413 | 0.3 | 20,000 | 1.6 | 141,985 | 2.1 | 1.2 | 0.8 |
| | 6 | 4329 | 1.2 | 36,162 | 1.3 | 12,426 | 1.1 | 69,100 | 1.1 | 1.2 | 0.1 |
| | 11 | 206 | 0.1 | 4433 | 0.2 | 27,650 | 2.1 | 152,478 | 2.3 | 1.2 | 1.2 |
| | 6 | 3047 | 0.9 | 13,079 | 0.4 | 16,785 | 1.5 | 85,561 | 1.4 | 1.0 | 0.5 |
| | 6 | 124 | 0.0 | 3898 | 0.1 | 32,765 | 2.5 | 100,460 | 1.4 | 1.0 | 1.2 |
| | 6 | 3637 | 1.0 | 36,561 | 1.3 | 5090 | 0.4 | 61,670 | 1.1 | 1.0 | 0.4 |
| | 6 | 853 | 0.2 | 21,649 | 0.8 | 18,406 | 1.6 | 67,646 | 1.1 | 0.9 | 0.5 |
| | 9 | 2709 | 0.8 | 11,631 | 0.4 | 8836 | 0.8 | 95,178 | 1.6 | 0.9 | 0.5 |
| | 6 | 12 | 0.0 | 1815 | 0.0 | 24,187 | 1.7 | 109,573 | 1.6 | 0.9 | 1.0 |
| | 6 | 4015 | 1.1 | 19,768 | 0.7 | 9299 | 0.8 | 41,023 | 0.7 | 0.8 | 0.2 |
| | 6 | 1611 | 0.5 | 8640 | 0.3 | 8925 | 0.7 | 48,772 | 0.8 | 0.6 | 0.3 |
| | 6 | 182 | 0.0 | 862 | 0.0 | 15,281 | 1.1 | 60,642 | 0.9 | 0.5 | 0.6 |
| | 6 | 1615 | 0.5 | 8977 | 0.3 | 6676 | 0.5 | 43,112 | 0.7 | 0.5 | 0.2 |
| | 6 | 530 | 0.1 | 5811 | 0.2 | 9303 | 0.7 | 53,675 | 0.7 | 0.5 | 0.3 |
| | 6 | 148 | 0.0 | 1524 | 0.1 | 6748 | 0.6 | 56,135 | 0.9 | 0.4 | 0.4 |
| | 8 | 96 | 0.0 | 390 | 0.0 | 14,315 | 1.1 | 23,916 | 0.3 | 0.4 | 0.5 |
| | 6 | 429 | 0.1 | 7655 | 0.2 | 2887 | 0.3 | 28,522 | 0.5 | 0.3 | 0.1 |
| | 8 | 0 | 0.0 | 5 | 0.0 | 992 | 0.1 | 2278 | 0.0 | 0.0 | 0.0 |
| | 8 | 3 | 0.0 | 59 | 0.0 | 749 | 0.1 | 1855 | 0.0 | 0.0 | 0.0 |
| | 8 | 2 | 0.0 | 22 | 0.0 | 88 | 0.0 | 721 | 0.0 | 0.0 | 0.0 |
SL, site length; Count, the number of DNA fragments of different lengths (100–600 bp).
The empirical genomic coverages (EgC) for three restriction enzyme combinations (PM = PstI + MspI, AB = AvaII + BfaI, and HH = HinfI + HpyCH4IV) and the ratio of the EgC relative to PM (Ratio to PM) in six dicot and six monocot species
| | | PM | | AB | | | HH | | |
|---|---|---|---|---|---|---|---|---|---|
| Plant Species | GS | Contig × Length | EgC (%) | Contig × Length | EgC (%) | Ratio to PM | Contig × Length | EgC (%) | Ratio to PM |
| Dicot | | | | | | | | | |
| | 156 | 17,352 × 208 | 0.74 | 51,231 × 218 | 2.28 | 3.09 | 111,175 × 196 | 4.45 | 6.03 |
| | 4768 | 27,703 × 175 | 0.10 | 355,620 × 127 | 0.95 | 9.35 | 203,041 × 130 | 0.55 | 5.45 |
| | 685 | 53,747 × 173 | 1.36 | 122,437 × 142 | 2.55 | 1.87 | 202,992 × 147 | 4.37 | 3.21 |
| | 1364 | 22,686 × 194 | 0.32 | 166,661 × 150 | 1.83 | 5.67 | 159,252 × 148 | 1.72 | 5.34 |
| | 1100 | 45,759 × 171 | 0.71 | 222,148 × 165 | 3.33 | 4.67 | 224,215 × 163 | 3.33 | 4.67 |
| | 489 | 39,423 × 191 | 1.54 | 141,589 × 167 | 4.83 | 3.13 | 182,384 × 172 | 6.41 | 4.16 |
| Mean | | | 0.80 | | 2.63 | 4.63 | | 3.47 | 4.81 |
| Monocot | | | | | | | | | |
| | 4939 | 128,544 × 164 | 0.43 | 294,115 × 126 | 0.75 | 1.76 | 311,323 × 148 | 0.93 | 2.19 |
| | 4450 | 186,910 × 159 | 0.67 | 466,001 × 129 | 1.35 | 2.02 | 365,274 × 151 | 1.24 | 1.85 |
| | 6969 | 264,949 × 147 | 0.56 | 549,794 × 126 | 1.00 | 1.78 | 439,060 × 147 | 0.92 | 1.65 |
| | 2665 | 84,632 × 182 | 0.58 | 295,130 × 136 | 1.51 | 2.62 | 293,728 × 147 | 1.62 | 2.81 |
| | 8240 | 275,761 × 157 | 0.52 | 368,636 × 131 | 0.59 | 1.12 | 416,130 × 145 | 0.73 | 1.39 |
| | 489 | 75,742 × 195 | 3.02 | 164,171 × 190 | 6.36 | 2.11 | 209,403 × 184 | 7.90 | 2.61 |
| Mean | | | 0.96 | | 1.93 | 1.90 | | 2.22 | 2.08 |
| Overall mean | | | 0.88 | | 2.28 | 3.27 | | 2.85 | 3.45 |
GS, genome size in Mb obtained from the Royal Botanic Gardens, Kew Plant DNA C-values Database; GS is not available for L. grandiflorum, so the GS of the related flax species (L. usitatissimum) was used. Contig × Length = number of contigs × average length (bp) per contig.
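As a worked illustration of how the columns in this table relate, the short Python sketch below assumes that EgC is the total contig length (number of contigs × average length) expressed as a percentage of the genome size. That formula is not stated explicitly here, but it is consistent with most rows of the table. The function name is hypothetical, and the inputs are the PM and HH entries of the dicot row with GS = 685 Mb.

```python
def empirical_coverage(n_contigs, mean_len_bp, genome_size_mb):
    """EgC (%) = total contig bases / genome size, under the stated assumption."""
    return 100.0 * n_contigs * mean_len_bp / (genome_size_mb * 1e6)

# Dicot row with GS = 685 Mb: PM = 53,747 x 173 and HH = 202,992 x 147.
egc_pm = empirical_coverage(53_747, 173, 685)
egc_hh = empirical_coverage(202_992, 147, 685)
ratio_to_pm = egc_hh / egc_pm

print(round(egc_pm, 2), round(egc_hh, 2), round(ratio_to_pm, 2))
```

Values recomputed this way can differ from the tabulated ones in the second decimal place, presumably because the average contig lengths in the table are themselves rounded to whole base pairs.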
Contig counts and mean contig lengths per sample obtained for two restriction enzyme combinations (PM = PstI + MspI; HH = HinfI + HpyCH4IV) in combined runs of 12 Arabidopsis and 12 rice samples
| | PM | | HH | | |
|---|---|---|---|---|---|
| Sample | Contig Count | Mean Length | Contig Count | Mean Length | Length Ratio |
| Arabidopsis | |||||
| Bur-0 | 59,354 | 204 | 120,602 | 208 | 2.1 |
| Col-0 | 79,750 | 199 | 154,377 | 206 | 2.0 |
| Col-1 | 25,865 | 198 | 89,198 | 215 | 3.8 |
| Col-2 | 24,647 | 201 | 84,160 | 217 | 3.7 |
| Col-3 | 25,891 | 200 | 109,742 | 218 | 4.6 |
| Col-4 | 62,244 | 202 | 134,119 | 212 | 2.3 |
| Col-5 | 27,325 | 201 | 99,377 | 223 | 4.0 |
| Col-6 | 35,066 | 200 | 105,837 | 216 | 3.3 |
| Col-7 | 38,725 | 203 | 95,451 | 219 | 2.7 |
| LER | 73,058 | 197 | 131,169 | 213 | 1.9 |
| Tsu-1 | 75,930 | 198 | 123,256 | 210 | 1.7 |
| WS4 | 63,640 | 199 | 113,102 | 214 | 1.9 |
| Mean | 49,291 | 200 | 113,366 | 214 | 2.8 |
| Rice | |||||
| R163 | 66,984 | 220 | 176,323 | 213 | 2.5 |
| R237 | 63,211 | 223 | 206,916 | 214 | 3.1 |
| R242 | 62,994 | 221 | 220,699 | 209 | 3.3 |
| R286 | 64,506 | 221 | 158,090 | 210 | 2.3 |
| R423 | 61,157 | 219 | 181,826 | 212 | 2.9 |
| R614 | 70,942 | 213 | 176,441 | 210 | 2.4 |
| R735 | 56,315 | 209 | 140,580 | 205 | 2.4 |
| R971 | 73,663 | 221 | 199,219 | 211 | 2.6 |
| R1120 | 63,652 | 220 | 146,569 | 211 | 2.2 |
| R1409 | 58,519 | 220 | 170,706 | 209 | 2.8 |
| R1570 | 69,318 | 218 | 166,983 | 211 | 2.3 |
| R1662 | 5690 | 246 | 164,098 | 214 | 25.0 |
| Mean | 64,660 | 219 | 176,759 | 210 | 2.6 |
Length ratio, the ratio of the total base pairs obtained for HH over those for PM.
Rice sample R1662 had a sequencing issue with PM and was excluded from the mean calculations.
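The Length Ratio column and the handling of the excluded sample can be reproduced with a few lines of Python. This is only a sketch: the function name is hypothetical, total base pairs are approximated as contig count × mean length, and the rice ratios are copied from the table as rounded values.

```python
def length_ratio(hh_contigs, hh_mean_len, pm_contigs, pm_mean_len):
    """Total base pairs for HH divided by total base pairs for PM."""
    return (hh_contigs * hh_mean_len) / (pm_contigs * pm_mean_len)

# Arabidopsis sample Bur-0, values taken from the table.
bur0 = length_ratio(120_602, 208, 59_354, 204)

# Rice length ratios as tabulated; R1662 is dropped before averaging.
rice = {
    "R163": 2.5, "R237": 3.1, "R242": 3.3, "R286": 2.3, "R423": 2.9,
    "R614": 2.4, "R735": 2.4, "R971": 2.6, "R1120": 2.2, "R1409": 2.8,
    "R1570": 2.3, "R1662": 25.0,
}
kept = [r for name, r in rice.items() if name != "R1662"]
rice_mean = sum(kept) / len(kept)

print(round(bur0, 1), round(rice_mean, 1))
```

Leaving R1662 in the average would pull the rice mean up to about 4.5, which shows why the sample with the PM sequencing issue was excluded.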
The in silico genome coverages (IgC; %) of 22 species with sequenced genomes (eight plant, 13 animal, and one fungus) by DNA fragments of different ends and lengths (100–600 bp) obtained from in silico digestions with the top 21 restriction enzyme pairs and the GBS reference pair PstI + MspI
| | Plant | | | | | | | | Animal | | | | | | | | | | | | | Fungi | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Enzyme Pair | Arabidopsis | Cottonwood | Medicago | Winegrape | Soybean | Rice | Sorghum | Maize | C. elegans | Fruit fly | Honey bee | Stickleback | Pike | Zebra fish | Turkey | Zebra finch | Dog | Housecat | House mouse | Pigmy chimp | Opossum | Baker’s yeast | Mean_all | SD_all | Mean_plants |
| Genome size (Mb) | 120 | 379 | 297 | 426 | 950 | 382 | 659 | 2060 | 83 | 158 | 220 | 401 | 377 | 1340 | 1040 | 1021 | 2328 | 2419 | 2726 | 3152 | 2998 | 12 | | | |
| | 35.1 | 32.5 | 27.9 | 31.4 | 32.5 | 31.1 | 30.8 | 33.8 | 31.9 | 30.1 | 24.6 | 35.0 | 31.0 | 31.0 | 30.8 | 34.5 | 35.9 | 35.5 | 34.2 | 30.0 | 35.2 | 36.5 | 32.3 | 2.9 | 31.9 |
| | 34.4 | 30.8 | 26.6 | 30.6 | 30.9 | 30.3 | 31.8 | 32.9 | 27.0 | 29.7 | 16.1 | 35.1 | 32.9 | 33.3 | 31.9 | 35.9 | 35.3 | 34.9 | 33.1 | 29.3 | 34.5 | 35.9 | 31.5 | 4.4 | 31.0 |
| | 32.5 | 32.5 | 26.0 | 30.9 | 31.4 | 32.0 | 31.8 | 34.6 | 27.9 | 24.5 | 17.5 | 24.9 | 30.2 | 27.9 | 29.3 | 31.8 | 36.3 | 35.2 | 34.6 | 30.7 | 35.3 | 34.8 | 30.6 | 4.5 | 31.5 |
| | 32.5 | 31.4 | 25.0 | 30.4 | 30.4 | 31.2 | 32.3 | 34.2 | 23.8 | 24.0 | 11.8 | 25.1 | 30.8 | 29.1 | 28.7 | 30.9 | 33.6 | 32.9 | 32.4 | 29.0 | 33.6 | 34.4 | 29.4 | 5.1 | 30.9 |
| | 33.4 | 27.9 | 24.3 | 27.2 | 27.7 | 25.8 | 26.2 | 27.6 | 29.3 | 25.8 | 24.5 | 26.6 | 23.2 | 23.7 | 24.7 | 26.6 | 28.4 | 29.7 | 25.9 | 24.0 | 30.8 | 33.9 | 27.2 | 2.9 | 27.5 |
| | 31.8 | 29.1 | 23.6 | 27.9 | 28.2 | 27.0 | 27.3 | 29.1 | 25.9 | 21.4 | 17.8 | 20.5 | 24.0 | 22.4 | 24.4 | 25.8 | 30.0 | 31.8 | 28.0 | 25.6 | 30.9 | 32.7 | 26.6 | 3.9 | 28.0 |
| | 25.3 | 18.0 | 14.1 | 19.8 | 19.2 | 24.9 | 28.1 | 29.1 | 13.5 | 20.7 | 14.1 | 33.0 | 28.4 | 24.2 | 25.1 | 26.6 | 29.6 | 29.4 | 30.1 | 23.5 | 24.5 | 24.2 | 23.9 | 5.5 | 22.3 |
| | 24.4 | 17.3 | 13.6 | 19.4 | 18.6 | 24.0 | 27.5 | 29.0 | 10.2 | 19.6 | 7.4 | 33.3 | 30.1 | 26.0 | 26.5 | 28.4 | 28.6 | 29.4 | 29.2 | 23.5 | 24.1 | 22.3 | 23.3 | 6.7 | 21.7 |
| | 19.3 | 17.2 | 12.9 | 13.4 | 14.2 | 25.3 | 22.6 | 23.8 | 18.5 | 27.7 | 11.0 | 33.7 | 25.6 | 26.6 | 28.1 | 31.8 | 24.9 | 23.9 | 25.9 | 21.0 | 20.9 | 25.4 | 22.4 | 6.0 | 18.6 |
| | 28.1 | 22.0 | 18.7 | 24.7 | 24.3 | 28.0 | 28.8 | 30.6 | 27.0 | 7.9 | 4.3 | 10.4 | 9.8 | 21.6 | 8.4 | 28.3 | 28.8 | 28.7 | 11.2 | 26.1 | 7.0 | 31.6 | 20.7 | 9.2 | 25.6 |
| | 29.4 | 22.7 | 21.6 | 18.0 | 23.1 | 27.9 | 27.2 | 27.9 | 30.5 | 14.3 | 11.6 | 16.1 | 14.8 | 24.8 | 11.7 | 16.0 | 19.4 | 22.7 | 9.3 | 15.8 | 5.6 | 35.4 | 20.3 | 7.7 | 24.7 |
| | 27.1 | 20.8 | 17.9 | 24.1 | 23.2 | 26.9 | 29.3 | 30.3 | 21.5 | 8.3 | 3.7 | 10.2 | 9.3 | 22.6 | 6.0 | 29.5 | 29.0 | 29.1 | 7.5 | 25.1 | 5.5 | 30.0 | 19.9 | 9.5 | 25.0 |
| | 23.7 | 15.0 | 11.8 | 16.6 | 15.9 | 19.9 | 23.3 | 24.6 | 11.8 | 16.9 | 13.1 | 24.8 | 21.0 | 18.4 | 19.6 | 19.8 | 22.7 | 25.3 | 22.6 | 18.5 | 20.4 | 20.6 | 19.4 | 4.1 | 18.9 |
| | 29.2 | 22.2 | 20.8 | 17.5 | 22.6 | 27.2 | 27.2 | 27.5 | 26.3 | 14.9 | 9.3 | 15.9 | 13.8 | 26.0 | 8.6 | 14.6 | 18.0 | 20.7 | 6.5 | 14.2 | 4.7 | 35.2 | 19.2 | 8.1 | 24.3 |
| | 19.5 | 18.3 | 13.1 | 14.9 | 15.1 | 26.3 | 23.3 | 24.3 | 16.8 | 12.6 | 5.9 | 13.8 | 14.7 | 25.2 | 13.5 | 29.8 | 26.6 | 23.9 | 14.1 | 22.8 | 10.5 | 24.3 | 18.6 | 6.3 | 19.4 |
| | 29.0 | 20.7 | 19.7 | 16.8 | 21.5 | 24.2 | 24.2 | 23.7 | 28.3 | 13.9 | 12.3 | 15.2 | 13.2 | 20.2 | 10.9 | 13.3 | 16.8 | 20.7 | 9.1 | 13.8 | 6.1 | 33.3 | 18.5 | 6.9 | 22.5 |
| | 25.7 | 19.4 | 16.0 | 20.2 | 19.6 | 25.4 | 27.0 | 29.0 | 13.4 | 5.2 | 2.4 | 10.8 | 7.7 | 27.1 | 6.3 | 28.7 | 29.3 | 30.2 | 8.0 | 24.3 | 5.4 | 24.3 | 18.4 | 9.4 | 22.8 |
| | 26.5 | 19.0 | 16.3 | 21.5 | 20.9 | 23.6 | 24.3 | 26.3 | 24.5 | 7.8 | 5.0 | 9.7 | 8.5 | 16.0 | 7.7 | 21.4 | 22.2 | 23.9 | 10.1 | 21.0 | 7.1 | 28.3 | 17.8 | 7.5 | 22.3 |
| | 15.6 | 14.0 | 12.3 | 14.5 | 14.6 | 18.7 | 20.8 | 24.7 | 12.4 | 15.9 | 7.0 | 22.9 | 19.6 | 13.4 | 14.5 | 17.7 | 20.4 | 20.9 | 22.3 | 14.9 | 20.0 | 18.1 | 17.1 | 4.2 | 16.9 |
| | 17.2 | 14.1 | 9.9 | 13.2 | 12.8 | 25.1 | 22.9 | 23.2 | 15.8 | 7.2 | 3.8 | 10.0 | 9.5 | 19.5 | 7.0 | 26.7 | 21.8 | 20.4 | 10.8 | 19.2 | 6.2 | 21.3 | 15.3 | 6.8 | 17.3 |
| | 16.1 | 15.3 | 12.3 | 15.8 | 15.5 | 20.6 | 21.8 | 26.2 | 11.3 | 9.0 | 3.9 | 11.6 | 12.3 | 13.0 | 9.0 | 17.3 | 22.5 | 21.4 | 13.4 | 16.2 | 10.0 | 18.1 | 15.1 | 5.3 | 17.9 |
| | 3.5 | 2.3 | 1.6 | 1.8 | 1.5 | 5.6 | 6.0 | 5.5 | 3.1 | 4.5 | 0.6 | 8.9 | 4.8 | 6.2 | 3.3 | 5.3 | 5.4 | 4.6 | 3.0 | 5.0 | 0.8 | 4.4 | 4.0 | 2.0 | 3.5 |
IgC values for an additional 48 restriction enzyme pairs are shown in Table S3. Three enzyme pairs selected for empirical evaluations in plants are highlighted in red. The number below the organism is its genome size in Mbp (1 × 10^6 bp). More genome information for these species is shown in Table S2.
The organisms listed are the same as in Table S2.
Figure 2. Fragment distributions detected in silico on selected chromosomes of Arabidopsis and rice. Distributions of DNA fragments generated by in silico digestions with three restriction enzyme combinations on two chromosomes of Arabidopsis thaliana (A, B) and Oryza sativa (C, D). The number of DNA fragments and the average enzyme-cutting position are calculated in a 100 kb sliding window along a given chromosome and shown as a colored line for each enzyme combination. The corresponding horizontal line represents the average fragment count for an enzyme combination on the chromosome. More digestions were found for HinfI + HpyCH4IV than for the other two enzyme combinations.
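The windowed fragment counts plotted in this figure can be approximated with the following Python sketch. The function name, the toy positions, and the choice of step size are assumptions; the caption specifies only the 100 kb window. A fragment is assigned to a window by its start coordinate.

```python
from bisect import bisect_left

def window_counts(fragment_starts, chrom_len, window=100_000, step=100_000):
    """Count fragments whose start lies in each window [lo, lo + window)."""
    starts = sorted(fragment_starts)
    counts = []
    for lo in range(0, max(chrom_len - window, 0) + 1, step):
        hi = lo + window
        n = bisect_left(starts, hi) - bisect_left(starts, lo)
        counts.append((lo, n))
    return counts

# Toy chromosome of 300 kb with five fragment start positions.
toy_starts = [10, 50_000, 99_999, 100_000, 250_000]
per_window = window_counts(toy_starts, chrom_len=300_000)
chrom_mean = sum(n for _, n in per_window) / len(per_window)

print(per_window, chrom_mean)
```

Setting `step` smaller than `window` gives overlapping windows and a smoother profile. The chromosome-wide mean of the window counts plays the role of the horizontal line in each panel.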
Statistics of contigs and single nucleotide polymorphisms obtained for two restriction enzyme combinations (PM = PstI + MspI; HH = HinfI + HpyCH4IV) in combined runs of 12 Arabidopsis and 12 rice samples
| | Arabidopsis | | Rice | |
|---|---|---|---|---|
| Statistic | PM | HH | PM | HH |
| Contig statistic | ||||
| Total contigs | 10,498 | 42,355 | 28,611 | 36,808 |
| Mean contig length (SD) in bp | 243 (18) | 242 (20) | 239 (18) | 238 (19) |
| Total reads | 6,334,545 | 5,431,330 | 5,514,707 | 3,547,509 |
| Mean reads/contig | 603.4 | 128.2 | 192.7 | 96.4 |
| Mean reads/contig/sample | 50.3 | 10.7 | 16.1 | 8.0 |
| Contigs with SNP0 (%) | 239 (2.3) | 473 (1.1) | 1308 (4.6) | 852 (2.3) |
| Contigs with SNP0 + SNP1 (%) | 368 (3.5) | 1623 (3.8) | 1960 (6.9) | 1900 (5.2) |
| Contigs with SNP0 + SNP1 + SNP2 (%) | 453 (4.3) | 2915 (6.9) | 2319 (8.1) | 2834 (7.7) |
| Contigs with SNPwt (%) | 672 (6.4) | 5405 (12.8) | 3687 (12.9) | 5096 (13.8) |
| SNP statistic | ||||
| Total SNPs | 1343 | 11,489 | 7886 | 11,526 |
| Total reads | 350,269 | 925,468 | 1,412,233 | 1,052,072 |
| Mean reads/SNP | 260.8 | 80.6 | 179.1 | 91.3 |
| Mean reads/SNP/sample | 21.7 | 6.7 | 14.9 | 7.6 |
| SNP0 (%) | 423 (31.5) | 1122 (9.8) | 2325 (29.5) | 1995 (17.3) |
| SNP0 + SNP1 (%) | 688 (51.2) | 3417 (29.7) | 3823 (48.5) | 4216 (36.6) |
| SNP0 + SNP1 + SNP2 (%) | 884 (65.8) | 6168 (53.7) | 4753 (60.3) | 6261 (54.3) |
The percentage of the total contigs or total SNPs (single nucleotide polymorphisms) is shown in parentheses. SNP0, SNPs having no missing observations across the 12 samples; SNP1, SNPs having 8.3% of observations missing (absent in one of the 12 samples); SNP2, SNPs having 16.7% of observations missing (absent in two of the 12 samples); SNPwt, SNPs with or without missing observations across the 12 samples.
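The missing-data classes defined in this footnote can be made concrete with a small Python sketch. The data layout is an assumption: each SNP maps to a list of 12 genotype calls, with `None` marking a sample in which the SNP was not observed. The function names and toy calls are hypothetical. The summary is cumulative, mirroring the SNP0, SNP0 + SNP1, and SNP0 + SNP1 + SNP2 rows of the table.

```python
from collections import Counter

def n_missing(calls):
    """Number of samples with no genotype call for one SNP."""
    return sum(call is None for call in calls)

def cumulative_summary(snps):
    """Rows of (label, cumulative count, % of all SNPs) for 0, 1, 2 missing."""
    per_class = Counter(n_missing(calls) for calls in snps.values())
    total = len(snps)
    rows, running, label = [], 0, ""
    for k in (0, 1, 2):
        running += per_class[k]
        label = f"SNP{k}" if k == 0 else f"{label} + SNP{k}"
        rows.append((label, running, 100.0 * running / total))
    return rows

# Toy data: four SNPs genotyped across 12 samples.
toy_snps = {
    "snp_a": ["A"] * 12,               # no missing calls (SNP0)
    "snp_b": ["A"] * 11 + [None],      # one missing call, 1/12 = 8.3% (SNP1)
    "snp_c": ["A"] * 10 + [None] * 2,  # two missing calls, 2/12 = 16.7% (SNP2)
    "snp_d": ["A"] * 7 + [None] * 5,   # counted only in SNPwt
}
for row in cumulative_summary(toy_snps):
    print(row)
```

The percentages in the SNP statistic rows follow the same pattern. For example, 423 SNP0 out of 1343 total SNPs gives the 31.5% reported in the first PM column.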
Figure 3. SNP distributions for two RE pairs in Arabidopsis and rice. Distribution of total single nucleotide polymorphisms detected with respect to the number of samples present, and average number of reads per sample, for two restriction enzyme combinations (PM = PstI + MspI; HH = HinfI + HpyCH4IV) in combined runs of 12 Arabidopsis and 12 rice samples.