Wei Yan, Zhufeng Chen, Jiawei Lu, Chunjue Xu, Gang Xie, Yiqi Li, Xing Wang Deng, Hang He, Xiaoyan Tang.
Abstract
Next-generation sequencing technologies (NGST) are being used to discover causal mutations in ethyl methanesulfonate (EMS)-mutagenized plant populations. However, the published protocols often deliver too many candidate sites and sometimes fail to find the mutant gene of interest. Accurately identifying the causal mutation against a massive background of polymorphisms and sequencing deficiencies remains challenging. Here we describe an NGST-based method, named SIMM, that can simultaneously identify the causal mutations in multiple independent mutants. Multiple rice mutants derived from the same parental line were back-crossed, and for each mutant, the derived F2 individuals showing the recessive mutant phenotype were pooled and sequenced. The resulting sequences were aligned to the Nipponbare reference genome, and single nucleotide polymorphisms (SNPs) were subsequently compared among the mutants. Allele index (AI) and Euclidean distance (ED) were incorporated into the analysis to reduce noise caused by background polymorphisms and re-sequencing errors. Sequence bias against GC- and AT-rich sequences in the candidate region was corrected when necessary. Using this method, we successfully identified seven new mutant alleles from Huanghuazhan (HHZ), an elite indica rice cultivar in China. All mutant alleles were validated by phenotype association assay. A pipeline based on Perl scripts for SIMM is publicly available at https://sourceforge.net/projects/simm/.
Keywords: Euclidean distance; SIMM; SNP index; allele index; next-generation sequencing technology; sequence correction; single nucleotide polymorphism
Year: 2017 PMID: 28144247 PMCID: PMC5239786 DOI: 10.3389/fpls.2016.02055
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
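The cross-mutant comparison described in the abstract can be illustrated with a small sketch. It is not the SIMM Perl pipeline: it assumes the SNP index is the fraction of reads at a site that carry the non-reference allele (the usual MutMap-style definition), and it treats any SNP also called in a sibling mutant pool from the same parent as background. The function names, the data layout, the `min_index` and `min_depth` thresholds, and the toy read counts are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# A site is (chromosome, position, alternate base); the value is
# (reads supporting the alternate base, total reads covering the site).
Site = Tuple[str, int, str]
SnpCalls = Dict[Site, Tuple[int, int]]


def snp_index(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads at a site that carry the non-reference allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def private_candidates(
    pools: Dict[str, SnpCalls],
    target: str,
    min_index: float = 0.9,  # hypothetical cutoff, not taken from the paper
    min_depth: int = 5,      # hypothetical cutoff, not taken from the paper
) -> List[Tuple[Site, float]]:
    """Return SNPs called only in `target` that also have a high SNP index."""
    # SNPs seen in any other pool are assumed to be background polymorphisms
    # inherited from the shared parental line.
    background = set()
    for name, calls in pools.items():
        if name != target:
            background.update(calls)

    candidates = []
    for site, (alt, total) in sorted(pools[target].items()):
        if site in background or total < min_depth:
            continue
        index = snp_index(alt, total)
        if index >= min_index:
            candidates.append((site, index))
    return candidates


# Toy data (invented for this example): two mutant pools from one parent.
toy_pools: Dict[str, SnpCalls] = {
    "mutA": {
        ("Chr1", 50, "T"): (30, 30),   # shared with mutB -> background
        ("Chr2", 70, "T"): (10, 30),   # private but low SNP index
        ("Chr3", 100, "A"): (29, 30),  # private and near-fixed -> candidate
    },
    "mutB": {
        ("Chr1", 50, "T"): (28, 28),
        ("Chr7", 200, "A"): (27, 28),
    },
}

if __name__ == "__main__":
    for site, index in private_candidates(toy_pools, "mutA"):
        print(site, round(index, 4))
```

Sequencing several mutants from the same parent gives each pool a built-in set of controls, which is why the sketch subtracts sites seen in sibling pools before applying any index threshold. The allele index and Euclidean distance filters named in the abstract are not reproduced here.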
Comparison of results from MutMap, NIKS and SIMM analyses on seven published mutants.
| Sample | MutMap1 | NIKS2 | SIMM3 | ED4 | Chr | Locus | Genotype/aa | Candidate gene |
|---|---|---|---|---|---|---|---|---|
| Hit0746-sd | 121 (13) | 16 | 8 (8) | 3 | Chr8 | 27,272,776 | G to A (D to N) | LOC_Os08g43120b,c |
| Hit5243-sm | 229 (21) | 19 | 32 (14) | 15 | Chr8 | 11,552,593 | C to T (R to C) | LOC_Os08g19310a,b,c |
| Hit5500-sd | 131 (9) | 12 | 10 (4) | 8 | Chr9 | 20,062,177 | T to A (L to stop) | LOC_Os09g33980a,b,c |
| Hit1917-pl1 | 101 (24) | 21 | 19 (18) | 7 | Chr10 | 22,484,681 | C to T (L to F) | LOC_Os10g41780a,b,c |
| | | | | | | 22,705,386 | C to T (A to V) | LOC_Os10g42196b,c |
| Hit1917-sd | 139 (10) | 7 | 12 (12) | 9 | Chr12 | 23,278,301 | G to A (splicing site) | LOC_Os12g37870c |
| | | | | | | 23,285,822 | G to A (S to N) | LOC_Os12g37890b,c |
| Hit5814-sd | 121 (4) | 10 | 6 (4) | 3 | Chr4 | 23,555,382 | A to T (R to stop) | LOC_Os04g39560a,b,c |
| | | | | | | 24,669,920 | C to T (L to F) | LOC_Os04g41580a,b,c |
| | | | | | | 25,537,772 | C to T (R to C) | LOC_Os04g43140a,b,c |
| Hit0813-pl2 | 238 (10) | 20 | 6 (6) | 4 | Chr1 | 42,235,284 | G to A (intron) | LOC_Os01g72800 b,c |
| | | | | | | 42,282,527 | C to T (intergenic) | – |
| | | | | | | 42,843,286 | C to T (intergenic) | – |
| | | | | | | 43,121,603 | G to A (intron) | LOC_Os01g74460c |
Summary of SIMM analysis of HHZ mutants.
| Samples | Total reads | Aligned reads | Depth | Coverage (%) | #SNPs1 | #SNPs2 | #SNPs3 | #SNP4 | Candidate gene | Locus / genotype (aa) | SNP index | ED6 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| WT HHZ | 121,453,146 | 107,052,411 | 10X | 83.41 | 1,004,213 | / | / | / | / | / | / | / |
| H-224 | 141,304,146 | 110,556,122 | 29X | 88.53 | 1,564,462 | 6,063 (1,546) | 1,109 (41) | 9 | LOC_Os03g43670 ( | 24,417,520 G to A (G to D) | 0.9778 | 6.9913 |
| HT5763 | 127,424,446 | 103,501,527 | 28X | 88.18 | 1,308,965 | 9,230 (1,035) | 6,486 (97) | 4 | LOC_Os04g39470 ( | 23,511,644 G to A (E to K) | 0.9231 | 4.9506 |
| H-190 | 144,014,516 | 111,948,542 | 35X | 87.36 | 1,210,418 | 5,108 (1,238) | 330 (7) | 7 | LOC_Os03g58600 ( | 33,373,677 T to G (Y to D) | 1.0000 | 7.2970 |
| H-174 | 154,449,224 | 118,039,787 | 37X | 87.38 | 1,328,455 | 5,505 (1,303) | 573 (16) | 13 | LOC_Os03g58600 ( | 33,372,744 G to A (E to K) | 1.0000 | 7.9995 |
| HM2-S61 | 173,294,984 | 131,445,408 | 41X | 87.57 | 1,385,235 | 5,337 (1,399) | 435 (10) | 4 | LOC_Os07g32480 ( | 19,329,232 G to A (G to S) | 0.9655 | 6.4798 |
| HE-47 | 145,091,780 | 110,757,680 | 29X | 89.16 | 1,609,185 | 5,217 (955) | 1,051 (52) | 8 | LOC_Os03g06410 ( | 3,206,988 T to C (S to F) | 1.0000 | 7.9995 |
| H-212 | 253,627,768 | 114,774,504 | 28X | 90.17 | 1,286,823 | 7,012 (1,230) | 3,109 (233) | 20 | LOC_Os07g04670 ( | 2,068,093 C to T (Q to stop) | 1.0000 | 7.9995 |
Validation of causal mutations identified in HHZ mutants via co-segregation assay.
| Sample | Phenotype | WT/Mutant | χ2 (3:1) | Candidate gene | Primers | WT (Homo) | WT (Hetero) | Mutant (Homo) |
|---|---|---|---|---|---|---|---|---|
| H-224 | Open hull and brownish palea/lemma | 299:85 | 0.0188 (S) | LOC_Os03g43670 | F: 5′-CAGCTGGTTCTGTGTTCAATTGTGGC-3′ | 27 | 63 | 26 |
| HT5763 | Male sterile | 232:80 | 0.0008 (S) | LOC_Os04g39470 | F: 5′-CCACTCGTTGATGCCTACCTGTTGC-3′ | 39 | 72 | 30 |
| H-190 | Male sterile | 512:264 | 0.1444 (NS) | LOC_Os03g58600 | F: 5′-CATGTAGGTTGTGGCATC-3′ | 42 | 76 | 35 |
| H-174 | Male sterile | 217:67 | 0.0043 (S) | LOC_Os03g58600 | F: 5′-TGTGCTCTTGACATTCCT-3′ | 21 | 58 | 25 |
| HM2-S61 | Male sterile | 251:93 | 0.0083 (S) | LOC_Os07g32480 | F: 5′-GCACCAAGGTGGTAGAAAG-3′ | 24 | 55 | 32 |
| HE-47 | Red lesions on leaves | 97:39 | 0.0257 (S) | LOC_Os03g06410 | F: 5′-TGGAGTTGAATAGAGTGT-3′ | 36 | 90 | 42 |
| H-212 | Long sterile lemma | 132:44 | 0.0000 (S) | LOC_Os07g04670 | F: 5′-TGCTCGCCGGCGGAGCTG-3′ | 30 | 85 | 45 |