Mai Tsuda, Akito Kaga, Toyoaki Anai, Takehiko Shimizu, Takashi Sayama, Kyoko Takagi, Kayo Machita, Satoshi Watanabe, Minoru Nishimura, Naohiro Yamada, Satomi Mori, Harumi Sasaki, Hiroyuki Kanamori, Yuichi Katayose, Masao Ishimoto.
Abstract
BACKGROUND: The functions of most genes predicted in the soybean genome have not been clarified. A mutant library with a high mutation density would be helpful for functional studies and for identifying novel alleles useful for breeding. Development of cost-effective, high-throughput protocols using next generation sequencing (NGS) technologies is expected to simplify the retrieval of mutants with mutations in genes of interest.
Year: 2015 PMID: 26610706 PMCID: PMC4662035 DOI: 10.1186/s12864-015-2079-y
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Outline of the construction of the EMS-induced mutant library. Seeds were treated with a chemical mutagen (0.35 % EMS). To increase the mutation density, bulk M2 seeds from approximately 2000 M1 plants were treated again with 0.35 % EMS, and 8000 M2 seeds were used to grow M1' plants; the generation after the second mutagen treatment was called M1' to distinguish it from the initial M1 generation. M2' seeds were collected from each of 1762 M1' plants. A total of 1762 M2' plants were grown and used to extract DNA. Twenty-six potential out-crossing plants were removed, leaving 1736 M2' plants. The resultant library consisted of DNA derived from 1536 M2' plants, 1437 of which produced seeds. Seeds from each line were stored with their corresponding DNA
Fig. 2 Mutant phenotypes observed in the M2' plants. Wild type: (b), (e). Mutant phenotypes: (a) early maturity, (c) long internodes, (d) many root nodules, (f) albino
Frequency of typical mutant phenotypes detected in the library
| Phenotype description | Number of plants | Frequency (%)** |
|---|---|---|
| Albino (medium - heavy)* | 76 | 4.4 |
| Rugose leaves, dwarf* | 11 | 0.6 |
| Rugose leaves, semi-dwarf* | 20 | 1.2 |
| Rugose leaves, normal | 31 | 1.8 |
| Dwarf | 17 | 1.0 |
| Semi-dwarf | 110 | 6.3 |
| Early flowering | 46 | 2.6 |
| Early maturity | 29 | 1.7 |
| White flowers | 6 | 0.3 |
| Violet flowers | 8 | 0.5 |
| Short internodes | 2 | 0.1 |
| Long internodes | 3 | 0.2 |
| Narrow leaves | 8 | 0.5 |
| Low-density pubescence | 3 | 0.2 |
| Big primary leaves | 6 | 0.3 |
| Early defoliation | 1 | 0.1 |
| Easy lodging | 1 | 0.1 |
| Many leaves | 1 | 0.1 |
| Long peduncles | 1 | 0.1 |
| Many rootlets | 2 | 0.1 |
| Many root nodules | 10 | 0.6 |
| No root nodules | 1 | 0.1 |
| Many pods and early pod maturity | 1 | 0.1 |
| Wide pods and early maturity | 2 | 0.1 |
| Large seeds | 27 | 1.6 |
| Deep yellow seed coat | 4 | 0.2 |
| Light brown seed coat | 2 | 0.1 |
*Most plants produced no or few seeds
**Frequency (%) was calculated as the number of mutant plants showing each phenotype divided by the 1736 M2' plants
Fig. 3 Variation in protein, oil, and sugar content among seeds harvested from M2' plants and their progeny. The histograms on the left and the bar graphs on the right show the variation in protein, oil, and sugar content of seeds harvested from M2' plants and from their progeny (M3' plants), respectively. High-content plants (red) and low-content plants (blue) in the M2' population were re-evaluated in the M3' generation. In the histograms, the mean and the variation for wild-type (WT) plants are indicated by green ellipses and double-headed arrows, respectively
Mutations in 12 lines detected by using whole-genome re-sequencing analysis
| Line name | Depth of coverage | Genome coverage (%) | Number of base changes | G > A | C > T | T > A | A > T | T > C | A > G | G > T | C > A | A > C | T > G | C > G | G > C | Amino acid substitutions | Missense mutations | Nonsense mutations | Distance between base changes (kb)** |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| EnT-0263 | 16.8 | 94.2 | 9934 | 3776 | 3481 | 635 | 718 | 266 | 300 | 202 | 206 | 135 | 117 | 60 | 38 | 573 | 543 | 30 | 95.7 |
| EnT-0394 | 14.3 | 94.0 | 15579 | 5494 | 6493 | 1023 | 985 | 298 | 266 | 262 | 245 | 170 | 194 | 85 | 64 | 950 | 899 | 51 | 61.0 |
| EnT-0442 | 17.0 | 94.1 | 8970 | 3612 | 3179 | 541 | 541 | 231 | 229 | 146 | 159 | 129 | 107 | 56 | 40 | 564 | 535 | 29 | 106.0 |
| EnT-0541 | 23.1 | 93.9 | 15627 | 5666 | 5638 | 1120 | 966 | 459 | 452 | 338 | 334 | 237 | 220 | 103 | 94 | 823 | 785 | 38 | 60.8 |
| EnT-0790 | 16.9 | 93.9 | 9366 | 3391 | 3366 | 661 | 609 | 262 | 252 | 171 | 212 | 155 | 160 | 65 | 62 | 514 | 485 | 29 | 101.5 |
| EnT-0964 | 19.3 | 94.3 | 21861 | 8099 | 7688 | 1980 | 1972 | 397 | 373 | 374 | 324 | 265 | 263 | 61 | 65 | 1303 | 1246 | 57 | 43.5 |
| EnT-1045 | 21.5 | 93.9 | 9172 | 2428 | 2926 | 1217 | 996 | 349 | 272 | 205 | 235 | 195 | 212 | 84 | 53 | 442 | 426 | 16 | 103.6 |
| EnT-1079 | 18.0 | 94.0 | 14074 | 5376 | 5167 | 1057 | 919 | 292 | 297 | 271 | 247 | 183 | 150 | 69 | 46 | 902 | 865 | 37 | 67.5 |
| EnT-1197 | 15.2 | 94.0 | 13782 | 4516 | 4877 | 1383 | 1234 | 353 | 295 | 269 | 289 | 194 | 227 | 79 | 66 | 796 | 756 | 40 | 69.0 |
| EnT-1610 | 25.0 | 93.8 | 9481 | 2536 | 2874 | 1220 | 1048 | 388 | 338 | 250 | 226 | 220 | 227 | 91 | 63 | 439 | 425 | 14 | 100.3 |
| EnT-1619 | 19.4 | 93.8 | 13274 | 5187 | 4122 | 1077 | 1269 | 273 | 291 | 318 | 250 | 191 | 170 | 66 | 60 | 729 | 686 | 43 | 71.6 |
| EnT-1634 | 20.5 | 93.6 | 12434 | 3306 | 3282 | 1997 | 2138 | 331 | 362 | 228 | 220 | 233 | 224 | 63 | 50 | 684 | 646 | 38 | 76.5 |
| Average | 18.9 | 94.0 | 12796 | 4449 | 4424 | 1159 | 1116 | 325 | 311 | 253 | 246 | 192 | 189 | 74 | 58 | 727 | 691 | 35 | 74.3 |
| Percentage of each type of base change* | | | | 34.8 % | 34.6 % | 9.1 % | 8.7 % | 2.5 % | 2.4 % | 2.0 % | 1.9 % | 1.5 % | 1.5 % | 0.6 % | 0.5 % | | | | |
*The percentage of each type of base change was calculated as the total number of base changes of that type in the 12 mutants divided by the total number of all base changes
**The distance between base changes was calculated from the size of the chromosome-scale assembly of the soybean genome (950,068,807 bp) and the total number of base changes per plant
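The distance column can be approximated by dividing the assembly size by the number of base changes in each line. The following is a minimal Python sketch of that arithmetic, not the authors' pipeline; the helper name `mean_distance_kb` is hypothetical, and the input values are copied from the table above. Some rows may differ from the tabulated value in the last decimal place (for example, EnT-0263 gives 95.6 rather than 95.7), presumably because of rounding or a slightly different genome size in the original calculation.

```python
# Size of the chromosome-scale soybean assembly quoted in the footnote (bp)
GENOME_SIZE_BP = 950_068_807

def mean_distance_kb(n_base_changes: int, genome_size_bp: int = GENOME_SIZE_BP) -> float:
    """Average spacing between base changes in kb (hypothetical helper)."""
    return genome_size_bp / n_base_changes / 1000

# Number of base changes per line, copied from the table
base_changes = {"EnT-0394": 15579, "EnT-0964": 21861}

# Round to one decimal place, as in the table
distances = {line: round(mean_distance_kb(n), 1) for line, n in base_changes.items()}

for line, d in distances.items():
    print(f"{line}: {d} kb between base changes")
```

A higher number of base changes gives a shorter average distance, which is why EnT-0964, the most heavily mutated line, has the smallest value in the column.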
Fig. 4 Distribution of mutations affecting the amino acid sequence on the 20 chromosomes of two mutant soybean lines. Blue and yellow lines on the chromosomes of the two mutant lines, EnT-1634 (a) and EnT-0964 (b), indicate missense and nonsense mutations, respectively. The black line on the left of each chromosome indicates the pericentromeric region, which has a lower gene density than the surrounding euchromatic regions
Fig. 5 Mutant discovery by using HRM and indexed amplicon sequencing. DNA extracted from M2' plants was preserved as the original DNA stock in 96-well plates. The DNA pool in a 384-well plate (four samples per pool) was used for both methods. After a mutation was detected by HRM analysis, base changes in the four original DNA samples were confirmed by direct sequencing. If the mutation was found to be silent, HRM analysis and direct sequencing of other regions were performed. In indexed amplicon sequencing, 7 target gene regions (1.3–7.5 kb, 30.3 kb in total) were amplified by long-range PCR. The amplicons of four pools were then pooled again, so that each of the 96 resulting samples represented 16 plants. The 96 samples were indexed by using a transposome-based Nextera XT Index kit. Bulk read data for all 96 DNA pools were obtained from a MiSeq run, sorted by index, and mapped onto the reference sequences of the target genes. Base changes found at high frequency in many reads were treated as mutations and were filtered by using the Glyma_189 gene annotation to exclude mutations that did not lead to amino acid substitutions. Based on the index of the DNA pool, the base change and the plant in which it occurred could be determined by direct sequencing of each of the 16 original M2' DNA samples. Amplicon sequencing using NGS allows rapid and effective detection of DNA pools containing mutations that cause desirable functional amino acid substitutions
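The two-level pooling in this caption (four samples per well of the 384-well plate, then four wells per index) can be sketched as simple integer arithmetic. The Python below is only an illustration: the caption does not say which samples shared a pool, so the sequential assignment used here, together with the names `locate` and `candidates`, is an assumption made for the example.

```python
PLANTS = 1536          # M2' DNA samples in the library (from the Fig. 1 caption)
SAMPLES_PER_POOL = 4   # samples combined in each well of the 384-well plate
POOLS_PER_INDEX = 4    # amplicon pools combined before indexing

def locate(plant: int) -> tuple[int, int]:
    """Map a 0-based plant number to (pool, index).

    ASSUMPTION: plants are pooled in sequential order; the real plate
    layout is not given in the source.
    """
    pool = plant // SAMPLES_PER_POOL
    index = pool // POOLS_PER_INDEX
    return pool, index

def candidates(index: int) -> range:
    """Original samples to re-check by direct sequencing for one indexed pool."""
    per_index = SAMPLES_PER_POOL * POOLS_PER_INDEX
    return range(index * per_index, (index + 1) * per_index)

n_pools = PLANTS // SAMPLES_PER_POOL      # wells in the 384-well plate
n_indices = n_pools // POOLS_PER_INDEX    # indexed samples on the sequencer

print(n_pools, n_indices, len(candidates(0)))
```

Under this layout, a mutation called in one indexed pool narrows the search to 16 original DNA samples, which matches the number of samples sequenced directly in the caption.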
Mutations in Glyma20g25000 detected by high resolution melting analysis
| Line name | Target region | Amplicon size (bp)* | Base change | Mode of mutation | Chromosome | Position (bp) | Amino acid substitution |
|---|---|---|---|---|---|---|---|
| EnT-0541 | Ln ex1 | 332 | G > A | hetero | Gm20 | 34688627 | Met1Ile |
| EnT-0685 | Ln ex1 | 332 | C > T | hetero | Gm20 | 34688652 | Leu10Phe |
| EnT-1168 | Ln ex1 | 332 | G > A | homo** | Gm20 | 34688672 | Syn |
| EnT-0112 | Ln ex1 | 332 | G > A | homo** | Gm20 | 34688682a | Asp20Asn |
| EnT-0044 | Ln ex1 | 332 | G > A | hetero | Gm20 | 34688682a | Asp20Asn |
| EnT-1589 | Ln ex1 | 332 | G > A | hetero | Gm20 | 34688682a | Asp20Asn |
| EnT-1376 | Ln ex1 | 332 | G > A | hetero | Gm20 | 34688686b | Gly21Asp |
| EnT-0621 | Ln ex1 | 332 | G > A | hetero | Gm20 | 34688686b | Gly21Asp |
| EnT-1048 | Ln ex1 | 332 | C > T | homo** | Gm20 | 34688696c | Syn |
| EnT-1306 | Ln ex1 | 332 | C > T | homo** | Gm20 | 34688696c | Syn |
| EnT-0987 | Ln ex1 | 332 | C > T | homo** | Gm20 | 34688713 | Ser30Phe |
| EnT-0160 | Ln ex1 | 332 | C > T | hetero | Gm20 | 34688719 | Ser32Phe |
| EnT-0634 | Ln ex2 | 231 | G > A | homo** | Gm20 | 34689275d | Gly40Ser |
| EnT-0749 | Ln ex2 | 231 | G > A | homo** | Gm20 | 34689275d | Gly40Ser |
| EnT-0439 | Ln ex2 | 231 | C > T | hetero | Gm20 | 34689313 | Syn |
| EnT-1084 | Ln ex2 | 231 | G > A | hetero | Gm20 | 34689360 | Gly68Glu |
| EnT-1281 | Ln ex3 | 114 | C > T | hetero | Gm20 | 34689704 | Ala102Val |
| EnT-1312 | Ln ex3 | 114 | A > T | hetero | Gm20 | 34689710 | His104Leu |
| EnT-0601 | Ln ex4 | 467 | G > A | homo** | Gm20 | 34690049 | Syn |
| EnT-0155 | Ln ex4 | 467 | G > A | hetero | Gm20 | 34690058 | Syn |
| EnT-0687 | Ln ex4 | 467 | C > T | homo** | Gm20 | 34690175e | Syn |
| EnT-0862 | Ln ex4 | 467 | C > T | homo** | Gm20 | 34690175e | Syn |
| EnT-1619 | Ln ex4 | 467 | G > A | hetero | Gm20 | 34690246 | Gly213Asp |
| EnT-1265 | Ln ex4 | 467 | C > T | homo** | Gm20 | 34690247f | Syn |
| EnT-0510 | Ln ex4 | 467 | C > T | hetero*** | Gm20 | 34690247f | Syn |
| EnT-0383 | Ln ex4 | 467 | C > T | hetero | Gm20 | 34690247f | Syn |
*The amplicon size does not include primer sequences
**Base changes in a homozygous state probably occurred in M1 plants
***Base changes in a heterozygous state are probably derived from the same M1 plants as those labeled with **
Syn indicates a synonymous base substitution, which does not cause an amino acid substitution
Letters a to f appended to a position (originally superscripts) indicate that the same mutation was found in every plant labeled with that letter
Read coverage and mutations in seven genes identified by using indexed amplicon sequencing
| Gene locus* | Amplicon size (bp) | Consensus length (bp) | Total read counts for 96 DNA pools | Read coverage per sample (average) | Read coverage per sample (minimum) | Read coverage per sample (maximum) | Base changes | G > A | C > T | T > A | A > T | Others | Amino acid substitutions | Missense mutations | Nonsense mutations | Total number of base changes expected per M2' plant** | Distance between base changes (kb)*** |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Glyma06g19820 | 7539 | 7519 | 3476540 | 54 | 13 | 137 | 88 | 46 | 30 | 1 | 4 | 7 | 30 | 29 | 1 | 7224 | 132 |
| Glyma05g01770 | 5643 | 5625 | 3844251 | 80 | 5 | 172 | 67 | 26 | 35 | 1 | 2 | 3 | 26 | 25 | 1 | 7348 | 129 |
| Glyma08g46520 | 2531 | 2514 | 1476507 | 69 | 18 | 166 | 77 | 31 | 36 | 2 | 6 | 2 | 44 | 43 | 1 | 18828 | 50 |
| Glyma06g23026 | 1304 | 1294 | 4156094 | 381 | 33 | 1188 | 32 | 17 | 12 | 1 | 0 | 2 | 12 | 10 | 2 | 15188 | 63 |
| Glyma20g22160 | 6390 | 6370 | 6977332 | 130 | 35 | 276 | 132 | 61 | 56 | 1 | 3 | 11 | 72 | 69 | 3 | 12785 | 74 |
| Glyma11g15580 | 4214 | 4190 | 1155194 | 32 | 8 | 124 | 101 | 36 | 35 | 0 | 3 | 27 | 28 | 27 | 1 | 14833 | 64 |
| Glyma20g25000 | 2648 | 2623 | 1044416 | 47 | 10 | 137 | 64 | 26 | 26 | 2 | 9 | 1 | 21 | 21 | 0 | 14958 | 64 |
| Total | 30269 | 30135 | 22130334 | | | | 561 | | | | | | 233 | 224 | 9 | (av.) 14269 | (av.) 67.0 |
| Percentage of each type of base change**** | | | | | | | | 43.3 % | 41.0 % | 1.4 % | 4.8 % | 9.4 % | | | | | |
*Gene locus names are indicated according to gene models in the Glyma_189 assembly (v1.1)
**The total number of base changes per M2' plant was calculated from a total number of base changes in the library estimated from the amplicon size and the size of chromosome-scale assembly of the soybean genome (950,068,807 bp) and then divided by the total number of plants (1536)
***The distance between base changes was calculated from the size of the amplicon divided by the total number of base changes found in the library
****The percentage of each type of base change was calculated as the total number of base changes of that type in the seven genes divided by the total number of all base changes
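The tabulated distances (in kb) are consistent with scaling each amplicon size by the number of plants screened (1536) before dividing by the number of base changes found in the library. This reading is an inference from the numbers in the table, not a statement by the authors. A minimal Python sketch with values copied from the table (the helper name `distance_kb` is hypothetical):

```python
N_PLANTS = 1536  # M2' plants represented in the library

# gene locus -> (amplicon size in bp, base changes found in the library)
genes = {
    "Glyma06g19820": (7539, 88),
    "Glyma05g01770": (5643, 67),
    "Glyma08g46520": (2531, 77),
    "Glyma06g23026": (1304, 32),
    "Glyma20g22160": (6390, 132),
    "Glyma11g15580": (4214, 101),
    "Glyma20g25000": (2648, 64),
}

def distance_kb(amplicon_bp: int, base_changes: int, n_plants: int = N_PLANTS) -> float:
    """Bases screened (amplicon size x plants) per base change, in kb."""
    return amplicon_bp * n_plants / base_changes / 1000

# Round to whole kb, as in the table
distances = {gene: round(distance_kb(*values)) for gene, values in genes.items()}

for gene, d in distances.items():
    print(f"{gene}: one base change per {d} kb")
```

The total number of bases screened for a gene is the amplicon size times the number of plants, so the ratio gives the average spacing of base changes within a single plant.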
Fig. 6 Distribution of read coverage across the seven amplicons and observed base changes in the DNA pool. Base change positions called as mutations are shown by red circles. Red arrows indicate base changes confirmed by Sanger sequencing
Comparison of mutation detection by high resolution melting analysis and indexed amplicon sequencing
| Gene locus | Length compared (bp) | All base changes obtained from the two methods | Base changes common to both methods | HRM analysis only | Indexed amplicon sequencing only | Percentage of mutations detected by HRM analysis (%) | Percentage of mutations detected by indexed amplicon sequencing (%) |
|---|---|---|---|---|---|---|---|
| Glyma20g25000 | 1144 | 37 | 14 | 4 | 19 | 48.6 | 89.2 |
| Glyma08g46520 | 487 | 22 | 13 | 1 | 8 | 63.6 | 95.5 |
| Glyma06g23026 | 305 | 12 | 7 | 0 | 5 | 58.3 | 100.0 |
| Glyma20g22160 | 1187 | 24 | 12 | 4 | 8 | 66.7 | 83.3 |
| Glyma11g15580 | 380 | 12 | 7 | 2 | 3 | 75.0 | 83.3 |
| Total | 3503 | 107 | 53 | 11 | 43 | 62.5 | 90.3 |
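The per-gene percentages in this table follow from the counts: each method's detection rate is the number of base changes it found (those common to both methods plus those unique to it) divided by all base changes found by either method. The Python sketch below recomputes them from the tabulated counts; the variable and function names are illustrative only. It also shows that the percentages in the Total row (62.5 and 90.3) match the mean of the five per-gene percentages, whereas pooling the counts across genes gives roughly 59.8 and 89.7.

```python
# gene locus -> (all base changes, common to both, HRM only, amplicon sequencing only)
rows = {
    "Glyma20g25000": (37, 14, 4, 19),
    "Glyma08g46520": (22, 13, 1, 8),
    "Glyma06g23026": (12, 7, 0, 5),
    "Glyma20g22160": (24, 12, 4, 8),
    "Glyma11g15580": (12, 7, 2, 3),
}

def detection_rates(total, common, hrm_only, seq_only):
    """Return (HRM %, indexed amplicon sequencing %) for one gene."""
    hrm = 100 * (common + hrm_only) / total
    seq = 100 * (common + seq_only) / total
    return hrm, seq

per_gene = {gene: detection_rates(*counts) for gene, counts in rows.items()}

# Mean of the per-gene percentages (matches the Total row of the table)
mean_hrm = sum(h for h, _ in per_gene.values()) / len(per_gene)
mean_seq = sum(s for _, s in per_gene.values()) / len(per_gene)

# Pooled percentages computed from the summed counts
total, common, hrm_only, seq_only = (sum(col) for col in zip(*rows.values()))
pooled_hrm, pooled_seq = detection_rates(total, common, hrm_only, seq_only)

print(round(mean_hrm, 1), round(mean_seq, 1))
print(round(pooled_hrm, 1), round(pooled_seq, 1))
```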