Yan Du, Shanwei Luo, Xin Li, Jiangyan Yang, Tao Cui, Wenjian Li, Lixia Yu, Hui Feng, Yuze Chen, Jinhu Mu, Xia Chen, Qingyao Shu, Tao Guo, Wenlong Luo, Libin Zhou.
Abstract
Heavy-ion beam irradiation is one of the principal methods used to create mutants in plants. Research on the mutagenic effects and molecular mechanisms of radiation is an important multi-disciplinary subject. Here, we re-sequenced 11 mutagenesis progeny (M3) lines of Arabidopsis thaliana derived from carbon-ion beam (CIB) irradiation and focused on substitutions and small insertions and deletions (INDELs). We found that CIB induced more substitutions (320) than INDELs (124), and that single-base INDELs were more prevalent than larger ones (≥2 bp). In detail, the detected substitutions showed an obvious bias toward C > T transitions, induced through the formation of covalent linkages between neighboring pyrimidine residues in the DNA sequence. An A and T bias was observed among the single-base INDELs, most of which were induced by replication slippage at either homopolymer or polynucleotide repeat regions. The mutation rate of 200-Gy CIB irradiation was estimated as 3.37 × 10⁻⁷ per site. Unlike previous studies, which mainly focused on phenotype, chromosome aberration, genetic polymorphism, or sequencing analysis of specific genes only, our study revealed the genome-wide molecular profile and rate of mutations induced by CIB irradiation. We hope our data provide valuable clues for explaining the potential mechanism of plant mutation breeding by CIB irradiation.
Keywords: Arabidopsis thaliana; carbon-ion beam (CIB) irradiation; molecular spectrum; mutation; small INDELs; substitutions; whole genome-wide re-sequencing
Year: 2017 PMID: 29163581 PMCID: PMC5665000 DOI: 10.3389/fpls.2017.01851
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
Figure 1. Phenotypes of the nine re-sequenced lines that had stable mutation traits induced by CIB irradiation. (a): wild type (ecotype Col); (b–j): mutagenesis progeny lines in the corresponding order of C7, C197, C352, C357, C600, C828, C941, C116 and C541.
Figure 2. Background mutations shared by the 12 re-sequenced Arabidopsis thaliana lines. Shown are the density and variant rate of the single-base substitutions and small INDELs shared by the 12 re-sequenced lines across each chromosome.
Figure 3. Verification of background mutations by Sanger sequencing. REF denotes the Col-0 reference genome. Sequence alignments for the different primers (P1–P7) from C116, C172, C197, C352, C357, Lab-WT and the reference are shown in different colors according to their sequence homology. Pink highlights sequences that differ from the reference genome; dots indicate deletions at the corresponding positions.
Figure 4. Distributions of the substitutions and small INDELs across chromosomes in the genomes of 11 mutagenesis progeny lines (M3) of Arabidopsis thaliana. Single-base INDELs are indicated by base-designating letters with a preceding minus sign (deletion) or plus sign (insertion). Multiple-base INDELs are indicated by a minus or plus sign with the number of deleted or inserted bases. Individual colors distinguish the possible mutation effects: missense, frame shift, in-frame deletion, stop gained/lost (red); synonymous (green); UTR (blue); splice site region, intron (orange); intergenic region (gray); upstream/downstream regions (black); non-coding exon (purple).
Figure 5. Annotation of mutations induced by CIB irradiation in the genomes of 11 mutagenesis progeny lines (M3) of Arabidopsis thaliana. (A) Mutations comprehensively inferred by BWA, SAMtools, and VarScan 2, and their distributions among functional classes and mutation effects in each CIB-irradiated M3 plant line. (B) Overall distribution of mutations induced by CIB irradiation at the whole-genome level in Arabidopsis thaliana.
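Variant rate of the single-base substitutions and small INDELs across each chromosome in the 11 re-sequenced lines.

| Line | Chr 1 variants | Chr 1 rate | Chr 2 variants | Chr 2 rate | Chr 3 variants | Chr 3 rate | Chr 4 variants | Chr 4 rate | Chr 5 variants | Chr 5 rate |
|---|---|---|---|---|---|---|---|---|---|---|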
| C7 | 17 | 1789863 | 9 | 2188699 | 10 | 2345983 | 8 | 2323132 | 13 | 2075039 |
| C116 | 7 | 4346810 | 10 | 1969829 | 9 | 2606648 | 10 | 1858506 | 10 | 2697550 |
| C197 | 6 | 5071279 | 9 | 2188699 | 8 | 2932479 | 4 | 4646264 | 8 | 3371938 |
| C352 | 5 | 6085534 | 3 | 6566096 | 3 | 7819943 | 8 | 2323132 | 1 | 26975502 |
| C357 | 14 | 2173405 | 6 | 3283048 | 11 | 2132712 | 3 | 6195019 | 9 | 2997278 |
| C541 | 7 | 4346810 | 5 | 3939658 | 3 | 7819943 | 0 | / | 11 | 2452318 |
| C600 | 10 | 3042767 | 10 | 1969829 | 9 | 2606648 | 12 | 1548755 | 18 | 1498639 |
| C828 | 2 | 15213836 | 5 | 3939658 | 2 | 11729915 | 1 | 18585056 | 3 | 8991834 |
| C941 | 13 | 2340590 | 9 | 2188699 | 8 | 2932479 | 16 | 1161566 | 11 | 2452318 |
| C1001 | 8 | 3803459 | 3 | 6566096 | 2 | 11729915 | 8 | 2323132 | 5 | 5395100 |
| C1322 | 10 | 3042767 | 10 | 1969829 | 9 | 2606648 | 11 | 1689551 | 22 | 1226159 |
Variants indicates the number of mutations on each chromosome.
Rate equals the chromosome length divided by the number of variants. The lengths (bp) of chromosomes 1 to 5 are 30427671, 19698289, 23459830, 18585056, and 26975502, respectively.
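The rate definition in the footnote can be checked directly against the table. The following is a minimal sketch, assuming the tabulated rates are rounded to the nearest integer; the function name `variant_rate` and the dictionary layout are illustrative and not part of the original record.

```python
# Chromosome lengths (bp) as given in the table footnote.
CHROMOSOME_LENGTHS = {
    1: 30427671,
    2: 19698289,
    3: 23459830,
    4: 18585056,
    5: 26975502,
}


def variant_rate(chromosome, variants):
    """Return chromosome length / variants (bp per variant), rounded.

    Returns None when no variants were found, which the table prints as "/".
    Rounding to the nearest integer is an assumption about the table.
    """
    if variants == 0:
        return None
    return round(CHROMOSOME_LENGTHS[chromosome] / variants)


# Variant counts for line C7, copied from the first table row.
c7_variants = {1: 17, 2: 9, 3: 10, 4: 8, 5: 13}
c7_rates = {chrom: variant_rate(chrom, n) for chrom, n in c7_variants.items()}
print(c7_rates)
```

For line C7 this reproduces the tabulated values, for example 1789863 bp per variant on chromosome 1.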
Numbers of genes predicted to undergo functional changes in the 11 re-sequenced lines.
| C7 | 2 | 4 | 1 | 0 | 2 | 3 | 3 | 1 | 13 |
| C116 | 0 | 5 | 2 | 1 | 1 | 0 | 0 | 0 | 9 |
| C197 | 0 | 0 | 0 | 0 | 2 | 4 | 0 | 1 | 7 |
| C352 | 0 | 2 | 2 | 0 | 0 | 0 | 0 | 0 | 4 |
| C357 | 1 | 6 | 3 | 0 | 0 | 3 | 1 | 1 | 15 |
| C541 | 0 | 1 | 1 | 0 | 1 | 2 | 1 | 0 | 6 |
| C600 | 1 | 1 | 0 | 0 | 2 | 6 | 5 | 0 | 15 |
| C828 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 3 |
| C941 | 2 | 3 | 1 | 0 | 1 | 4 | 2 | 0 | 13 |
| C1001 | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 4 |
| C1322 | 0 | 0 | 0 | 0 | 2 | 6 | 1 | 0 | 9 |
Non-synonymous mutations include mutations with missense, stop-gained, stop-lost, and initiator codon variant effects.
Disruptive inframe_del indicates a mutation in which one codon is changed and one or more codons are deleted.
When multiple mutations are located in the same gene in one re-sequenced line, they are counted as a single mutated gene.
Figure 6. Single nucleotide mutation rates of CIB irradiation on Arabidopsis thaliana.
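As a rough cross-check of the per-site mutation rate, the counts in the abstract can be pooled over the 11 lines and divided by the number of sites. This is only a sketch under stated assumptions: the record does not give the authors' exact procedure, the genome length used here is simply the sum of the five nuclear chromosome lengths listed in the table footnote, and `pooled_mutation_rate` is a hypothetical helper name.

```python
# Nuclear chromosome lengths (bp) from the table footnote.
CHROMOSOME_LENGTHS = [30427671, 19698289, 23459830, 18585056, 26975502]

# Counts reported in the abstract.
SUBSTITUTIONS = 320
INDELS = 124
LINES = 11


def pooled_mutation_rate(mutations, lines, genome_length):
    """Mutations per site, pooled over all lines (assumed estimator)."""
    return mutations / (lines * genome_length)


nuclear_genome = sum(CHROMOSOME_LENGTHS)
rate = pooled_mutation_rate(SUBSTITUTIONS + INDELS, LINES, nuclear_genome)
print(f"{rate:.2e} mutations per site")
```

Under these assumptions the pooled estimate comes out at about 3.4 × 10⁻⁷ per site, the same order as the reported 3.37 × 10⁻⁷; the small difference may reflect a different genome length or averaging scheme in the original analysis.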
Figure 7. Similarity between the mutant lines with similar phenotypes at the genomic level. The homozygous mutations in M3 were used to detect whether there were any common mutations at the genomic level between the re-sequenced lines sharing similar phenotypes.
Figure 8. Molecular spectrum of CIB irradiation-induced substitutions and small INDELs in the genomes of 11 M3 lines. (A) The ratio of transitions to transversions (Ti/Tv). (B,C) Size distributions of the small INDELs. (D) Base bias of CIB irradiation-induced single-base mutations.
Pyrimidine dinucleotide analyses at C > T substitution sites in 11 CIB-irradiated M3 lines.
| C7 | 5 | 1496355 | C > T | AATT | T |
| C7 | 5 | 1454369 | C > T | CCTT | T |
| C7 | 5 | 3829172 | C > T | TCAA | |
| C7 | 5 | 16650819 | C > T | GACT | T |
| C116 | 2 | 6339827 | C > T | GGCT | T |
| C116 | 3 | 8857172 | C > T | AGTA | |
| C116 | 3 | 13445052 | C > T | CAGT | T |
| C116 | 3 | 21948981 | C > T | TTCT | T |
| C116 | 5 | 8571804 | C > T | CCTT | T |
| C116 | 5 | 14749784 | C > T | ACAA | |
| C197 | 1 | 27664800 | C > T | TTTG | |
| C197 | 2 | 7777170 | C > T | CTCT | T |
| C197 | 2 | 3980109 | C > T | CGGG | |
| C197 | 3 | 5423074 | C > T | TTAA | |
| C197 | 3 | 20612671 | C > T | AATA | |
| C197 | 4 | 4989792 | C > T | ACGC | C |
| C197 | 5 | 12293490 | C > T | CTCT | T |
| C197 | 5 | 5133036 | C > T | ATTG | |
| C352 | 1 | 26491174 | C > T | TTTT | T |
| C352 | 4 | 17139849 | C > T | TCTA | |
| C357 | 1 | 5147165 | C > T | ATAT | T |
| C357 | 1 | 13896252 | C > T | TGTT | T |
| C357 | 2 | 2382455 | C > T | ACAC | C |
| C357 | 2 | 9045207 | C > T | TGCA | |
| C357 | 3 | 959295 | C > T | AGTT | T |
| C357 | 3 | 21754406 | C > T | GAAG | |
| C357 | 4 | 3608407 | C > T | ACTA | |
| C357 | 5 | 11079083 | C > T | TGTT | T |
| C541 | 1 | 27821135 | C > T | TAGG | |
| C541 | 3 | 13945744 | C > T | TGTT | T |
| C541 | 5 | 16565496 | C > T | GTTT | T |
| C600 | 1 | 15506551 | C > T | ACTG | |
| C600 | 2 | 629920 | C > T | CACT | T |
| C600 | 3 | 7132293 | C > T | GAAA | |
| C600 | 4 | 12005191 | C > T | AACA | |
| C600 | 5 | 7896135 | C > T | CTGA | |
| C828 | 5 | 9262872 | C > T | AGCA | |
| C941 | 1 | 1654865 | C > T | AAAT | T |
| C941 | 1 | 16595744 | C > T | AACT | T |
| C941 | 1 | 18884415 | C > T | GAAA | |
| C941 | 2 | 12645936 | C > T | TCTT | T |
| C941 | 4 | 4481201 | C > T | TTAA | |
| C941 | 4 | 6924288 | C > T | AGCA | |
| C941 | 4 | 13051872 | C > T | ATTA | |
| C941 | 4 | 14525988 | C > T | GATT | T |
| C941 | 4 | 17299998 | C > T | TAAA | |
| C941 | 5 | 14679245 | C > T | CCAA | |
| C1001 | 1 | 17132595 | C > T | TTAT | T |
| C1001 | 3 | 20688685 | C > T | AAAA | |
| C1001 | 5 | 10126078 | C > T | ATAC | C |
| C1322 | 1 | 23839536 | C > T | AATG | |
| C1322 | 2 | 8613063 | C > T | CTCT | T |
| C1322 | 4 | 980039 | C > T | GAAT | T |
| C1322 | 5 | 1431634 | C > T | TTTC | C |
| C1322 | 5 | 6887544 | C > T | GGTT | T |
The C bases at the substitution sites are in bold and underlined.
Flanking sequence analysis of the small INDELs in six randomly selected re-sequenced lines.

| Line | Chromosome | Position | INDEL size (bp) | Flanking sequence |
|---|---|---|---|---|
| C7 | 1 | 16803045 | +1 | AAAAATAATT |
| C7 | 1 | 17871566 | +1 | GATGAGTCTC |
| C7 | 2 | 2330971 | +1 | CTAATCTCTG |
| C7 | 3 | 14218519 | +1 | AAAAACCGAT |
| C7 | 3 | 10963641 | +1 | AACATGTGGC |
| C7 | 5 | 15938548 | +1 | TAAAGTTAGA |
| C116 | 4 | 16566783 | +1 | TTTTTTTTTC |
| C352 | 4 | 12840961 | +1 | TTT |
| C357 | 3 | 16396219 | +1 | CAAACTTTCT |
| C7 | 1 | 2500882 | −1 | AAACCTAATA |
| C7 | 1 | 7060076 | −1 | TGCGGCCTTG |
| C7 | 1 | 23165210 | −1 | ATCAATGGCT |
| C7 | 1 | 30364445 | −1 | AATGCAAGAG |
| C7 | 2 | 3736597 | −1 | GTGTATGACC |
| C7 | 2 | 16776720 | −1 | ATCCATAACA |
| C7 | 3 | 12880551 | −1 | AGAATGCTCA |
| C7 | 3 | 15247466 | −1 | AAATCAT |
| C7 | 4 | 13256395 | −1 | TGCTCTTGCC |
| C7 | 5 | 10287067 | −1 | CCAAGATCCG |
| C7 | 5 | 15120725 | −1 | AATAGATTTC |
| C7 | 5 | 15424664 | −1 | TGTTTTTTT |
| C116 | 1 | 6787418 | −1 | CAAGAGCGT |
| C116 | 1 | 899668 | −1 | TGGAGCTGC |
| C116 | 2 | 14954774 | −1 | AAGCGAAAC |
| C116 | 3 | 3201387 | −1 | AGCGGTT |
| C116 | 5 | 14227661 | −1 | TTTTTTTTTC |
| C116 | 5 | 11032451 | −1 | CAATTACA |
| C197 | 2 | 9064276 | −1 | GAGAACCAAT |
| C197 | 2 | 17412340 | −1 | GGAAAGTAAT |
| C197 | 3 | 15116009 | −1 | TAGATAAGTA |
| C197 | 4 | 11372899 | −1 | AAGAAAATTG |
| C352 | 2 | 13175805 | −1 | GTTGCTGCAA |
| C352 | 3 | 20903968 | −1 | GCGCTTGGAA |
| C352 | 3 | 22734373 | −1 | AGTGGTTAAA |
| C352 | 4 | 5256307 | −1 | CTAAATTCAG |
| C357 | 1 | 24487663 | −1 | ATGTTACAGT |
| C357 | 2 | 6789742 | −1 | ACTAAAG |
| C357 | 3 | 8430932 | −1 | ATTCTACTTG |
| C357 | 5 | 3418139 | −1 | CACCTCTAAC |
| C357 | 5 | 7522732 | −1 | TGAGAATCTC |
| C357 | 5 | 14755177 | −1 | AAACAATTTC |
| C541 | 1 | 2611420 | −1 | CCACGT |
| C541 | 3 | 5203766 | −1 | GAAAACTGGT |
| C541 | 5 | 17816447 | −1 | GACTTGAGAG |
| C7 | 3 | 4723240 | −2 | GTAGTCATTT |
| C7 | 4 | 1435510 | −2 | AAAACCCCAA |
| C116 | 3 | 4315154 | −2 | TTTATAACTC |
| C7 | 2 | 3072553 | −4 | AAACGCTCGG |
| C7 | 3 | 5701019 | −5 | TATTAAAAAG |
| C7 | 2 | 8793345 | −5 | AGCCCATGGA |
| C116 | 3 | 16824813 | −5 | GGGTGTG |
| C541 | 1 | 18202184 | −5 | TCCTCTCAGT |
| C7 | 4 | 10587368 | −6 | AGAATTAATC |
| C7 | 1 | 22655118 | −6 | ACTCCGAAGC |
| C116 | 1 | 27885672 | −6 | CATTTTTATA |
| C197 | 1 | 10625807 | −6 | TCGGTTGATG |
| C197 | 4 | 15198376 | −7 | TTTATCTCCA |
| C357 | 3 | 442084 | −7 | CTGAACCTGTG |
| C357 | 5 | 20249679 | −8 | ACTTTAAGTT |
| C197 | 2 | 11198490 | −9 | TTTACTTATG |
| C357 | 1 | 10405785 | −9 | TACAGTACA |
| C116 | 3 | 11700393 | −15 | CTGCCTT |
| C116 | 1 | 29852923 | −18 | GGCCTAGGTA |
Homopolymeric and polynucleotide stretches are underlined; the inserted or deleted base is in bold.
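Because the underlining that marked homopolymeric stretches is lost in this plain-text record, a small script can recover a comparable signal from the flanking sequences. The sketch below is illustrative only: the helper names and the `min_length=3` threshold are assumptions, not the criterion used in the study, and polynucleotide (multi-base) repeats are not handled.

```python
from itertools import groupby


def longest_homopolymer(sequence):
    """Return (base, length) for the longest run of a single repeated base."""
    best_base, best_length = "", 0
    for base, run in groupby(sequence.upper()):
        length = len(list(run))
        if length > best_length:
            best_base, best_length = base, length
    return best_base, best_length


def has_homopolymer(sequence, min_length=3):
    """True if the sequence contains a run of at least min_length bases.

    The threshold of 3 is an illustrative assumption.
    """
    return longest_homopolymer(sequence)[1] >= min_length


# Flanking sequences copied from the table above.
for flank in ("TTTTTTTTTC", "AAAAATAATT", "GATGAGTCTC"):
    print(flank, longest_homopolymer(flank), has_homopolymer(flank))
```

Applied to the table, this flags rows such as `TTTTTTTTTC` (a run of nine T bases), where replication slippage at a homopolymer is a plausible origin of a single-base INDEL.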