A re-sequencing based assessment of genomic heterogeneity and fast neutron-induced deletions in a common bean cultivar.

Jamie A O'Rourke, Luis P Iniguez, Bruna Bucciarelli, Jeffrey Roessler, Jeremy Schmutz, Phillip E McClean, Scott A Jackson, Georgina Hernandez, Michelle A Graham, Robert M Stupar, Carroll P Vance.

Abstract

A small fast neutron (FN) mutant population has been established from Phaseolus vulgaris cv. Red Hawk. We leveraged the available P. vulgaris genome sequence and high throughput next generation DNA sequencing to examine the genomic structure of five P. vulgaris cv. Red Hawk FN mutants with striking visual phenotypes. Analysis of these genomes identified three classes of structural variation (SV); between cultivar variation, natural variation within the FN mutant population, and FN induced mutagenesis. Our analyses focused on the latter two classes. We identified 23 large deletions (>40 bp) common to multiple individuals, illustrating residual heterogeneity and regions of SV within the common bean cv. Red Hawk. An additional 18 large deletions were identified in individual mutant plants. These deletions, ranging in size from 40 bp to 43,000 bp, are potentially the result of FN mutagenesis. Six of the 18 deletions lie near or within gene coding regions, identifying potential candidate genes causing the mutant phenotype.

Keywords:  DNA-Seq; Phaseolus vulgaris; common bean; fast neutron mutation; natural variation; structural variation

Year:  2013        PMID: 23805147      PMCID: PMC3691542          DOI: 10.3389/fpls.2013.00210

Source DB:  PubMed          Journal:  Front Plant Sci        ISSN: 1664-462X            Impact factor:   5.753


Introduction

Common bean, Phaseolus vulgaris L., is an important source of proteins and carbohydrates for over three million people worldwide (Broughton et al., 2003). Despite its dietary importance, genetic resources for common bean have lagged behind those of the “model legumes” soybean, Medicago truncatula, and Lotus japonicus. However, next generation sequencing (NGS) technologies now make genomic studies applicable to any species of interest.
Mutants are important tools in deciphering gene functions. In common bean, individual mutants can be created for a gene of interest through plant transformation (Aragao et al., 1996; Kwapata et al., 2012). Expression of genes in common bean can also be knocked out or knocked down through the use of virus induced gene silencing (Diaz-Camino et al., 2011; Zhang et al., 2013). Both methods require prior knowledge of the genes of interest. In contrast, mutant populations can be screened for phenotypes of interest and the genes responsible identified through various approaches. The analysis of traits in various species, including those related to plant architecture, yield, and stress response genes, has been improved by utilizing mutant screens (Papdi et al., 2010; Bolon et al., 2011; Ma et al., 2013).
Structural variation (SV), including presence-absence variation, inter- and intra-chromosomal translocations, insertions, and deletions, is believed to be an important component of phenotypic diversity in both plants and animals (Lai et al., 2010; Stankiewicz and Lupski, 2010; Cao et al., 2011; Eichten et al., 2011; Wang et al., 2011; McHale et al., 2012). Genomic variation both between and within cultivars has been identified in soybean (Bolon et al., 2011; Haun et al., 2011; McHale et al., 2012), maize (Lai et al., 2010), rice (Huang et al., 2012), and Arabidopsis (Cao et al., 2011; Belfield et al., 2012). Two studies in soybean examining SV between cultivars (McHale et al., 2012) and within the Williams 82 cultivar (Haun et al., 2011) determined that the genomic regions most enriched for SV were gene rich regions, particularly regions containing resistance genes.
Here, we present a small fast neutron (FN) mutant population for common bean and demonstrate how NGS technologies, such as DNA-seq, provide for fast, high quality analysis of genomic variation to identify potential candidate genes for observed phenotypes. Additionally, the DNA-seq data allowed us to examine the natural variation existing within the Red Hawk cultivar. Such natural variation is a rich source of genomic diversity that can be utilized in future cultivar development.

Materials and methods

Development of Phaseolus vulgaris fast neutron population

Ten thousand P. vulgaris cv. Red Hawk seeds (Kelly et al., 1998), an Andean cultivar adapted for growth in the upper Midwest, were sent to the McClellan Nuclear Radiation Center at the University of California-Davis for irradiation. Five thousand seeds were treated with 16 Gy of FN radiation and five thousand with 32 Gy. Treated seeds were sent to the Illinois Crop Improvement Association (ICIA) facility in Puerto Rico in November 2009 along with 200 wild type seeds from the same seed lot. Approximately 70% of the 5000 seeds treated with 16 Gy germinated, while none of the seeds treated with 32 Gy germinated. Seedlings were allowed to mature at the ICIA facility, where plants were phenotyped and seeds were collected from all mature plants. Seeds from ~88 plants with striking mutant phenotypes, such as developmental delays and variations in plant stature, pod set, and pod size, were harvested individually. Remaining mutant plants were bulk harvested. Wild type plants grown at ICIA were also bulk harvested. Seeds from individually collected plants were planted at the University of Minnesota Experiment Station in St. Paul in 2010. Approximately 10,000 seeds from the bulk collection of mutants were also planted. Phenotyping was performed throughout the growth season, complemented by photographs. Selected individuals with visible and/or maturity phenotype variations were harvested. In 2011, seeds from selected 2010 M2 individuals were planted in 10 ft rows (~20 seeds). Phenotypes observed throughout the 2011 growth season were compared to documented phenotypes from previous years to determine if trait expression was consistent. Additionally, segregation among the ~20 plants per mutant line was noted. Three to four individuals in each row with visible/stable traits were tagged and photographed, and their seed was harvested.

DNA-seq analysis of fast neutron mutants

Five FN mutant plants with different, stable, obvious phenotypes (Figure 1) were chosen for paired end sequence analysis. The following FN mutant plants were chosen: 1R5C01r5CPVMN11, a plant with decorative chlorotic leaves early in the growing season (Figure 1A); 1R19C15r28CPVMN11, a small plant with lanceolate leaves (Figure 1B); 1R22C04r31CPVMN11, an upright plant with rugose leaves (Figure 1C); 2R29C12r78CPVMN11, which phenotypically resembled the wild type plant but was delayed in maturity (Figure 1D); and 3R5C25r87CPVMN11, which exhibited interveinal chlorosis (Figure 1E). The mutant plants will respectively be referred to as decorative, lanceolate, rugose, maturity, and chlorotic throughout the rest of the manuscript. M3 seeds of the plants chosen for sequencing were collected and planted at the University of Minnesota Experiment Station in St. Paul in 2012 to ensure the phenotype was maintained through the M3 generation. Leaf tissue from a representative wild type plant and from each of the chosen mutant plants at the M2 generation was collected from 2011 field-grown plants early in the morning and immediately placed at −80°C to inhibit DNA degradation. DNA from all six plants (WT and five mutants) was extracted using the phenol:chloroform method as described (Liu et al., 1997). Each DNA sample was visually inspected on a 1% agarose gel to ensure that the samples were not degraded. DNA concentration and purity were assessed using an Agilent 2100® Bioanalyzer™ (Agilent®, Santa Clara, CA). DNA samples were submitted to the molecular biology core at the Mayo Clinic, Rochester, MN for paired end sequencing on an Illumina HiSeq 2000. To reduce variability, DNA from all samples was multiplexed and run in a single lane. Low quality reads and adaptor sequences were removed, resulting in 31 million paired end reads per sample.
Figure 1

Visual phenotype of five fast neutron mutants. All plants are from the M2 generation and were grown at the University of Minnesota Experiment Station in St. Paul, MN in 2011. (A) Mutant 1R5C01r5CPVMN11 referred to as decorative due to the chlorotic patterning on leaves early in the growing season. (B) Mutant 1R19C15r28CPVMN11 referred to as lanceolate due to the elongated shape of the leaf. This mutant also appeared shorter than the WT plants in the field. (C) Mutant 1R22C04r31CPVMN11 referred to as rugose due to the crinkled leaf texture. The petioles of this plant also appeared shorter and more upright than the WT. (D) Mutant 2R29C12r78CPVMN11 referred to as the maturity mutant. This plant is phenotypically identical to the WT except for a delay in maturity. (E) Mutant 3R5C25r87CPVMN11 referred to as the chlorotic mutant due to the interveinal chlorosis patterning observed in the leaves. (F) Wild-type Phaseolus vulgaris cv. Red Hawk for comparison.

Paired-end genome sequences were mapped to the P. vulgaris G 19833 genome sequence available at www.phytozome.net using BWA (Li and Durbin, 2009) with default parameters. The resulting mapping files were further sorted, indexed, and translated to binary format (BAM files) using samtools (Li et al., 2009). The sequence alignments were visualized using IGV (Robinson et al., 2011). This approach aligned 70% of all Red Hawk DNA sequences to 88% of each of the 11 P. vulgaris chromosomes with 12X sequence depth. Regions of genomic deletions were identified using custom perl scripts. To confirm these deletions, the program, CREST (Wang et al., 2011) was also used to screen the FN mutant plants. This program identifies genomic deletions, insertions, inversions, and translocations by identifying soft clipped reads and the read coverage at potential breakpoints to calculate if the probability of observing the number of soft clipped reads at a given location is >0.05 based on a binomial distribution.
Statistically significant (P < 0.05) SV are retained for further consideration. CREST analysis was performed for each mutant compared to the wild type control. Genomic deletions resulting from differences between cultivars were masked using the –g function. We chose to focus our analysis on genomic deletions as these are called with the greatest confidence and can be confirmed by visual screening of genomic alignments. Deletions >40 bp identified by both perl scripts and CREST analysis were further characterized by genic location: intergenic, promoter, exon, intron, and 3′UTR. Genomic regions of natural variance within the cultivar were identified by identifying deletions common to multiple, but not all, FN mutant plants (Figures 2A, 3). Single nucleotide polymorphisms (SNPs), small (<40 bp) insertions and deletions (INDELs) (either unique to a single plant or common to multiple plants) were identified using the pileup function in SAMtools (Li et al., 2009). Unique or common SNPs and INDELs were identified using the compareBed function of BEDtools (Quinlan and Hall, 2010) and custom perl scripts. SNPs and INDELs with a read depth <6 (half the average genome coverage) were removed from further analysis. Custom perl scripts were used to characterize SNPs and INDELs by genic locations as described above.
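The read-depth filter described above (calls with depth below 6, half the 12X average coverage, are discarded) can be illustrated with a minimal Python sketch. The record layout and the function name `filter_by_depth` are assumptions made for illustration; the authors' custom perl scripts are not reproduced here.

```python
# Minimal sketch of a read-depth filter for SNP/INDEL calls.
# Assumption: each call is a (chromosome, position, read_depth) tuple;
# the real pipeline's file formats and scripts are not shown in the text.

MEAN_COVERAGE = 12                 # average sequence depth reported in the text
MIN_DEPTH = MEAN_COVERAGE // 2     # 6, half the average genome coverage


def filter_by_depth(variants, min_depth=MIN_DEPTH):
    """Keep calls whose read depth is at least min_depth."""
    return [v for v in variants if v[2] >= min_depth]


# Hypothetical calls, not data from the study.
calls = [("Chr01", 1000, 3), ("Chr01", 2000, 6), ("Chr04", 3000, 15)]
kept = filter_by_depth(calls)
```

Calls at exactly depth 6 are retained here because the text removes only those with depth below 6.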
Figure 2

Example of sequence alignments illustrating cultivar heterogeneity and fast neutron induced deletions. Paired end sequences from six Red Hawk individuals were aligned to the G 19833 genome sequence available at www.phytozome.net using BWA (Li and Durbin, 2009) with default parameters. Using CREST and custom perl scripts, we identified statistically significant deletions illustrating three classes of SV. SV was visualized using IGV (Robinson et al., 2011). These regions of SV were all identified on chromosome 1, from the two regions highlighted by filled black boxes above the alignments. The double vertical line within the alignments represents chromosomal region between the two genomic regions depicted. For each individual, a histogram plot illustrates the read depth with individual reads plotted below. Black boxes highlight the three classes of SV. (A) Class 2, regions of genomic heterogeneity within the Red Hawk cultivar. These deletions were identified in two or more Red Hawk individuals and represent the residual heterogeneity in the fast neutron mutant population. This deletion spans 3408 bp on chromosome 1 and is only evident in the lanceolate and chlorotic individuals. (B) Class 1, sequences present in the reference genome, but missing in all Red Hawk individuals. These regions illustrate the differences between the common bean cultivars. This particular region spans 5000 bp of the reference genome. (C) Class 3, sequences missing in a single individual but present in all other lines. This class is most likely the result of fast neutron mutagenesis and may be responsible for the mutant phenotype. This particular deletion is approximately 1500 bp long in the chlorotic mutant and is immediately downstream of the predicted gene Phvul.001G128600 (shown in blue below alignments), which encodes a RecA protein.

Figure 3

CREST analysis identifies SV belonging to Class 2 and Class 3. The number of large genomic deletions in each class identified per 500,000 bp on each chromosome of P. vulgaris was counted and plotted as a vertical line. The height of the line indicates the number of SV identified per 500,000 bp region. The largest number of SV per chromosomal region is noted to the left of each chromosome. The mutant containing the deletion is represented by color: blue, chlorotic; purple, maturity; green, decorative; red, rugose; orange, lanceolate. Deletions belonging to Class 3 are likely a result of FN mutagenesis and are highlighted by an asterisk (*). Deletions potentially impacting gene expression are highlighted with an E. Note the region of natural variation (SV Class 2) shared by three mutant plants on chromosome 2.
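The per-region counts plotted in Figure 3 amount to binning deletions into 500,000 bp windows. The following Python sketch shows one way to do this; assigning each deletion to a window by its start coordinate is an assumption, since the caption does not say how deletions spanning a window boundary were handled, and the function name `count_per_bin` is hypothetical.

```python
from collections import Counter

BIN_SIZE = 500_000  # window size stated in the Figure 3 caption


def count_per_bin(deletions, bin_size=BIN_SIZE):
    """Count deletions per (chromosome, window index).

    Assumption: a deletion is assigned to the window containing its start.
    """
    counts = Counter()
    for chrom, start, _stop in deletions:
        counts[(chrom, start // bin_size)] += 1
    return counts


# Three Class 2 deletions on chromosome 2, coordinates taken from Table 2.
deletions = [
    ("Chr02", 24500720, 24504943),
    ("Chr02", 24718765, 24718893),
    ("Chr02", 24728318, 24728663),
]
bins = count_per_bin(deletions)
```

All three example deletions start between 24,500,000 and 25,000,000 bp, so they fall in the same window.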

Results

Mutant collection

FN mutant populations have proven to be a valuable asset for genetic studies in a variety of crop species (Starker et al., 2006; Bolon et al., 2011; Xiao et al., 2011). We have established a small FN mutant population for common bean using P. vulgaris cv. Red Hawk. Seeds from 88 plants with stable, visual phenotypes are available for public use (Table 1) by contacting Dr. Carroll Vance at vance004@umn.edu or Jeff Roessler at roess001@umn.edu. Phenotypes observed in the field include varying degrees and types of chlorosis and altered maturity (usually delayed). Additionally, bulk seed from M2 and M3 plants is available for researchers wishing to screen mutants for a particular trait of interest.
Table 1

| Plant ID | 2010 Phenotype | 2011 Phenotype | Mat | M3 Seg | Generation |
|---|---|---|---|---|---|
| 1R04C18r4CPVMN11 | Erect growth | Erect growth | N | N | M3 |
| 1R05C01r5CPVMN11 | Stunted dwarf, slt chlorotic | Stunted dwarf, slt chlorotic | L | N | M3 |
| 1R06C05r6CPVMN11 | Stunted, delayed | Stunted | N | N | M3 |
| 1R06C09r7CPVMN11 | Large leaves, chlorotic | WT pheno | E | N | M3 |
| 1R07C04r8CPVMN11 | Stunted, delayed | Stunted, delayed | L | N | M3 |
| 1R07C11r9CPVMN11 | Stunted, delayed | Stunted, delayed | L | N | M3 |
| 1R07C14r11CPVMN11 | Stunted, delayed | Delayed growth | L | N | M3 |
| 1R09C01r13CPVMN11 | Few leaves, flowering | WT pheno | N | N | M3 |
| 1R10C08r14CPVMN11 | Delayed growth | Chlorotic | N | N | M3 |
| 1R11C15r16CPVMN11 | Lanceolate leaves | Lanceolate leaves | L | N | M3 |
| 1R13C06r18CPVMN11 | Few leaves, flowering | WT pheno | N | N | M3 |
| 1R14C12r20CPVMN11 | Late maturity | Small plant | L | 1:1 | M3 |
| 1R14C33r21CPVMN11 | Light green leaves | Light green leaves | N | N | M3 |
| 1R15C05r22CPVMN11 | Stunted | Stunted | L | 3:1 | M3 |
| 1R15C10r23CPVMN11 | Few leaves | WT pheno, note maturity | E | N | M3 |
| 1R16C03r24CPVMN10 | Erect growth | WT pheno | N | N | M3 |
| 1R16C09r25CPVMN11 | Few leaves | WT pheno | N | N | M3 |
| 1R17C13r26CPVMN11 | Erect, large leaves | Erect, large leaves | N | N | M3 |
| 1R18C18r27CPVMN11 | Tall, bushy, erect | WT pheno | N | N | M3 |
| 1R19C15r28CPVMN11 | Lanceolate leaves | Lanceolate leaves, long petioles, some chlorosis | L | N | M3 |
| 1R20C06r29CPVMN11 | Few leaves, flowering | WT pheno | N | N | M3 |
| 1R20C08r30CPVMN11 | Delayed growth | WT pheno, note maturity | E | N | M3 |
| 1R22C04r31CPVMN11 | Delayed growth | Rugose, stunted | L | N | M3 |
| 1R22C15r32CPVMN11 | Leaf size and shape | Large leaves, long petioles | N | N | M3 |
| 1R22C27r33CPVMN11 | Large leaves | Bushy, lots of leaves | N | N | M3 |
| 1R22C37r34CPVMN11 | Leaf size and shape | WT pheno | N | N | M3 |
| 1R23C21r35CPVMN11 | Small leaves | Lanceolate leaves, long petioles, some chlorosis | N | N | M3 |
| 1R24C13r37CPVMN11 | Large leaves | Large leaves | E | N | M3 |
| 1R25C20r39CPVMN11 | Lacks apical dominance, sprawling | Lacks apical dominance, sprawling | N | 1:5 | M3 |
| 1R26C07r40CPVMN11 | Bushy | Bushy | N | N | M3 |
| 1R26C24r41CPVMN11 | Lanceolate leaves | WT pheno, note maturity | E | N | M3 |
| 1R26C26r42CPVMN11 | Stunted | Stunted | L | N | M3 |
| 1R29C03r46CPVMN11 | Bushy | Bushy | N | 1:1 | M3 |
| 1R30C02r47CPVMN11 | Chlorotic | Chlorotic | L | N | M3 |
| 1R30C06r48CPVMN11 | Large leaves | Large leaves | N | N | M3 |
| 1R31C13r49CPVMN11 | Few leaves, flowering | WT pheno, note maturity | E | N | M3 |
| 1R31C28r50CPVMN11 | Few leaves, flowering | Few Leaves | N | N | M3 |
| 1R32C06r51CPVMN11 | Late maturity | Late maturity | L | N | M3 |
| 1R32C11r52CPVMN11 | Short plant | Short plant | N | N | M3 |
| 1R32C17r53CPVMN11 | Bushy | Bushy | N | N | M3 |
| 1R33C17r54CPVMN11 | Large, lanceolate leaves | Large, lanceolate leaves | N | N | M3 |
| 1R33C32r55CPVMN11 | Short, bushy and slightly rugose | Short, bushy | N | N | M3 |
| 1R35C27r56CPVMN11 | Tall and bushy | Bushy | N | 1:1 | M3 |
| 1R37C06r57CPVMN11 | Bushy and leaf size varies | Bushy | L | N | M3 |
| 1R37C19r58CPVMN11 | Many leaves fused to form unifoliate | Long petioles | N | N | M3 |
| 1R41C06r60CPVMN11 | Rugose | Fewer pods | L | N | M3 |
| 1R41C22r61CPVMN11 | Erect growth, light green chlorotic leaves | Erect growth, light green chlorotic leaves | N | N | M3 |
| 1R42C02r62CPVMN11 | Small chlorotic plant | Small chlorotic plant | N | 1:2 | M3 |
| 1R43C24r63CPVMN11 | Wavy curled leaves | WT pheno, note maturity | L | N | M3 |
| 1R44C20r64CPVMN11 | Bushy, wavy and curled leaves | Bushy, wavy and curled leaves | E | N | M3 |
| 2R14C01r66CPVMN11 | Stunted, cupped leaves and few flowers | Stunted, cupped leaves and few flowers | L | N | M3 |
| 2R14C13r67CPVMN11 | Stunted, rugose curled leaves, short petioles | WT pheno | N | N | M3 |
| 2R18C06r69CPVMN11 | Short, stunted, delayed | Erect | L | 2:1 | M3 |
| 2R24C27r73CPVMN10 | Short compact plant | Short compact plant | N | N | M3 |
| 2R25C11r74CPVMN11 | Slight chlorotic | Erect growth | N | N | M3 |
| 2R26C31r75CPVMN11 | Slight chlorotic | Slight chlorotic | N | N | M3 |
| 2R27C18r76CPVMN11 | Viney, spindly, few leaves, large cupped leaves | Viney, spindly, few leaves, large cupped leaves | L | N | M3 |
| 2R27C20r77CPVMN11 | Lanceolate leaves | Bushy | N | N | M3 |
| 2R29C12r78CPVMN11 | Few large leaves | WT pheno, note maturity | L | 1:2 | M3 |
| 2R33C02r80CPVMN11 | Lanceolate curled leaves, very few flowers, late maturity | Curled leaves | L | N | M3 |
| 2R43C07r83CPVMN11 | Large leaves, some fused trifoliates | Bushy | N | N | M3 |
| 2R43C38r84CPVMN11 | Large dark green rugose leaves | WT pheno, note maturity | L | N | M3 |
| 3R04C09r85CPVMN11 | Large leaves | Long petioles, a bit bushy | N | N | M3 |
| 3R05C13r86CPVMN11 | Late maturity | Late maturity | L | N | M3 |
| 3R05C25r87CPVMN11 | Slight chlorotic | Interveinal chlorosis | L | N | M3 |
| 3R06C13r89CPVMN11 | Large cupped leaves | Slightly curled leaves | L | N | M3 |
| 3R07C22r90CPVMN11 | Tall, erect growth | fewer pods, note maturity | L | N | M3 |
| 3R11C14r91CPVMN11 | Large rugose leaves | rugose, spindly, very few pods | N | N | M3 |
| 3R16C03r92CPVMN11 | Curled lanceolate leaves | Curled lanceolate leaves | L | N | M3 |
| 3R17C22r93CPVMN11 | rugose, short petioles | rugose, short petioles | L | N | M3 |
| R06C05CPVMN11 | N/A | Short, curled leaves, spotty chlorosis | N/A | N/A | M2 |
| R09C05CPVMN11 | N/A | Few lateral branches, erect growth | N/A | N/A | M2 |
| R10C09CPVMN11 | N/A | Spindly, small leaves | N/A | N/A | M2 |
| R11C21CPVMN11 | N/A | Large leaves, odd nodes, rugose, many small branches | N/A | N/A | M2 |
| R13C09CPVMN11 | N/A | Short, rugose curled leaves | N/A | N/A | M2 |
| R15C12CPVMN11 | N/A | Short, pointed leaves | N/A | N/A | M2 |
| R16C12CPVMN11 | N/A | Short, bushy, lacks apical dominance, many small branches, small leaves | N/A | N/A | M2 |
| R19C22CPVMN11 | N/A | Spotty chlorosis like row 5 in M3 line | N/A | N/A | M2 |
| R21C05CPVMN11 | N/A | Short petioles, large rugose curled leaves | N/A | N/A | M2 |
| R22C06CPVMN11 | N/A | Tall, very long petioles | N/A | N/A | M2 |
| R24C05CPVMN11 | N/A | Few lateral branches and flowers | N/A | N/A | M2 |
| R24C18CPVMN11 | N/A | Stunted, small curled leaves | N/A | N/A | M2 |
| R25C15CPVMN11 | N/A | Very stunted mini-plant | N/A | N/A | M2 |
| R28C12CPVMN11 | N/A | Spindly, erect growth, small leaves | N/A | N/A | M2 |
| R30C08CPVMN11 | N/A | Stunted, few branches, no flowers | N/A | N/A | M2 |
| R40C08CPVMN11 | N/A | Stunted, small pointed leaves | N/A | N/A | M2 |
| R47C24CPVMN11 | N/A | Short, short petioles, small leaves | N/A | N/A | M2 |
| R47C25CPVMN11 | N/A | Short, chlorotic, pointed leaves | N/A | N/A | M2 |

Mat, Maturity; E, earlier than WT; N, normal; L, later than WT; M3 Seg, if the M3 row planting is segregating, the ratio is noted; Generation, mutant generation of the seeds available.

Use of DNA-seq to identify regions of structural variation

The DNA-seq reads from the five FN mutant lines and one wild-type cv. Red Hawk individual were mapped to the common bean reference genome sequence (accession G 19833). This analysis allowed us to identify three classes of SV (Figure 2): (1) sequences present in the reference genome sequence, but absent from all six of the Red Hawk individuals (Figure 2B); (2) sequences present in both the reference genome sequence and at least one Red Hawk individual, but absent in two or more Red Hawk individuals (Figure 2A); (3) sequences absent from one FN line, but present in all other samples (Figure 2C). The Class 1 group primarily represents intra-specific structural genomic differences between accession G 19833 and cv. Red Hawk. We identified over one-thousand of these regions, in which all six Red Hawk individuals were missing >100 bp that is present in the reference genome. This is an interesting group of SV to catalog and may have important implications for understanding inter-cultivar phenotypic variation. However, these features do not inform our understanding of the mutant phenotypes in the FN lines. Therefore, we chose to focus the data analysis on the Class 2 and Class 3 groups, which exhibited structural polymorphism among the FN individuals. The Class 2 group consists of DNA segments present in some FN individuals, but missing in at least two others (Figure 2A). It is highly unlikely the same genomic regions would be deleted in multiple plants by FN irradiation. Our analysis identified 24 genomic deletions belonging to Class 2, which illustrates the natural variation within the common bean cultivar Red Hawk. 
Specifically, ten unique deletions (P < 0.05) are shared by the lanceolate and chlorotic mutants on chromosome 1, nine genomic deletions were identified on chromosomes 2 and 11 in three mutant plants (lanceolate, maturity, and chlorotic), four deletions on chromosome 4 are common to the maturity and chlorotic mutants, and one deletion on chromosome 11 is shared by the rugose and maturity mutants (Figure 3, Table 2). Class 2 deletions range in size from 41 base pairs (bp) to 12,111 bp. Of all of these deletions, two are within gene introns and two are within 1000 bp upstream of the start codon or 1000 bp downstream of the stop codon, regions involved in regulating gene expression patterns (Table 2). All Class 2 deletions on chromosome 2 lie within 2.7 million bps of each other. This is a region exhibiting high heterogeneity (Figure 3). Within this region three individual FN lines share eight Class 2 deletions. The region is unaffected in the remaining two FN lines and the wild type plant.
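The three-class scheme described above can be expressed as a small decision rule over the set of individuals lacking a given sequence. The Python sketch below is illustrative only: the function name `classify_deletion` and the sample labels are assumptions, and the case of a sequence missing only from the wild-type plant is not covered by the definitions in the text, so the sketch returns `None` for it.

```python
# Minimal sketch of the Class 1/2/3 assignment for a deletion called
# against the reference genome. Sample labels are illustrative.

WILD_TYPE = "WT"
FN_LINES = ["decorative", "lanceolate", "rugose", "maturity", "chlorotic"]
SAMPLES = [WILD_TYPE] + FN_LINES


def classify_deletion(absent_in, samples=SAMPLES, wild_type=WILD_TYPE):
    """Return 1, 2, or 3 for the SV class, or None if no class applies.

    absent_in: set of sample names in which the sequence is missing.
    """
    absent = set(absent_in) & set(samples)
    if len(absent) == len(samples):
        return 1   # missing in all Red Hawk individuals: between-cultivar SV
    if len(absent) >= 2:
        return 2   # missing in two or more, but not all: heterogeneity within Red Hawk
    if len(absent) == 1 and wild_type not in absent:
        return 3   # missing in a single FN line: putative FN-induced deletion
    return None    # present everywhere, or missing only in WT (not defined in the text)


example_class = classify_deletion({"lanceolate", "chlorotic"})
```

The example corresponds to the chromosome 1 deletions shared by the lanceolate and chlorotic mutants, which the text assigns to Class 2.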
Table 2

Regions of natural structural variation identified within the cultivar, Red Hawk.

| Chr | Deletion start (bp) | Deletion stop (bp) | Deletion size (bp) | Mutants with deletion | Genes potentially affected | Deletion relative to gene |
|---|---|---|---|---|---|---|
| Chr01 | 18253164 | 18253570 | 406 | Lanceolate, Chlorotic | | |
| Chr01 | 19119759 | 19130873 | 11,114 | Lanceolate, Chlorotic | | |
| Chr01 | 21478079 | 21478553 | 76 | Lanceolate, Chlorotic | | |
| Chr01 | 21604603 | 21604792 | 189 | Lanceolate, Chlorotic | | |
| Chr01 | 22370255 | 22370296 | 41 | Lanceolate, Chlorotic | | |
| Chr01 | 24740621 | 24744029 | 3,408 | Lanceolate, Chlorotic | | |
| Chr01 | 25792230 | 25792287 | 57 | Lanceolate, Chlorotic | | |
| Chr01 | 25873578 | 25880798 | 7,220 | Lanceolate, Chlorotic | | |
| Chr01 | 30779938 | 30780112 | 174 | Lanceolate, Chlorotic | Phvul.001G111700 | P |
| Chr01 | 35937306 | 35938111 | 805 | Lanceolate, Chlorotic | | |
| Chr02 | 23462357 | 23462602 | 245 | Lanceolate, Maturity, Chlorotic | | |
| Chr02 | 23744872 | 23744518 | 646 | Lanceolate, Maturity, Chlorotic | | |
| Chr02 | 24500720 | 24504943 | 4,223 | Lanceolate, Maturity, Chlorotic | | |
| Chr02 | 24718765 | 24718893 | 128 | Lanceolate, Maturity, Chlorotic | | |
| Chr02 | 24728318 | 24728663 | 345 | Lanceolate, Maturity, Chlorotic | | |
| Chr02 | 25130420 | 25130510 | 90 | Lanceolate, Maturity, Chlorotic | Phvul.002G125100 | I |
| Chr02 | 25611051 | 25613798 | 2,747 | Lanceolate, Maturity, Chlorotic | | |
| Chr02 | 26229269 | 26229410 | 141 | Lanceolate, Maturity, Chlorotic | | |
| Chr04 | 3745720 | 3746721 | 1,001 | Maturity, Chlorotic | Phvul.004G33700 | I |
| Chr04 | 4160322 | 4160726 | 404 | Maturity, Chlorotic | | |
| Chr04 | 4374139 | 4386350 | 12,111 | Maturity, Chlorotic | Phvul.004G039400, Phvul.004G039500 | P, D |
| Chr04 | 4571494 | 4575913 | 4,419 | Maturity, Chlorotic | | |
| Chr11 | 5406621 | 5407835 | 1,214 | Lanceolate, Maturity, Chlorotic | | |
| Chr11 | 19639745 | 19639812 | 67 | Rugose, Maturity | | |

Genomic deletions identified by the program CREST that belong to SV Class 2. All SV were visually verified using IGV. This class of SV represents regions of heterogeneity within Red Hawk.

P, Deletion is within 1000 bp of start codon; I, Deletion is in an intron; D, Deletion is within 1000 bp of stop codon.

The remaining 18 deletions identified by CREST (P < 0.05) belong to Class 3. This class is composed of sequences absent from a single FN line, but present in all other samples (Figure 2C, Table 3). Deletions belonging to Class 3 range in size from 41 to over 43,000 bp and are found on chromosomes 1, 4, 5, 7, 8, and 9. Twelve of the eighteen Class 3 deletions are in intergenic regions of the genome, though as genome annotation improves these regions may contain as of yet unpredicted genes. Six deletions belonging to Class 3 have the potential to alter gene expression patterns. Regions immediately upstream or downstream of gene coding regions are likely involved in regulating gene expression. Three deletions are located within 1000 bp of gene start or stop codons of Phvul.001G128600, Phvul.004G029200, Phvul.004G031900, and Phvul.004G032000 in the chlorotic and maturity mutants. The latter two genes are tightly linked in the genome, so a single deletion may affect the expression of multiple genes. Two Class 3 deletions shorten the introns of Phvul.001G050100 in the lanceolate mutant and Phvul.004G031200 in the maturity mutant. Finally, a 43,034 bp deletion on chromosome 8 in the chlorotic mutant removes the entire sequence for Phvul.008G141500. In summary, we were able to identify statistically significant SV belonging to Class 3 within either the coding region or the potential regulatory region of predicted genes for three of the five mutant plants (lanceolate, maturity, and chlorotic).
Genes likely impacted by these deletions represent the most likely candidates responsible for the mutant phenotype of the lanceolate, maturity, and chlorotic mutants. For decorative and rugose, our analysis pipeline failed to identify Class 3 deletions within the regulatory or coding region of genes. These phenotypes may be a result of a heterozygous deletion, a small (<40 bp) INDEL, or a SNP.
Table 3

Putative fast neutron induced genomic deletions.

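| Chromosome | Deletion start (bp) | Deletion stop (bp) | Deletion size (bp) | Mutant with deletion | Genes potentially affected | Deletion relative to gene |
|---|---|---|---|---|---|---|
| Chr01 | 1478816 | 1481247 | 2,431 | Lanceolate | | |
| Chr01 | 5557900 | 5558034 | 134 | Lanceolate | Phvul.001G050100 | I |
| Chr01 | 22818248 | 22818444 | 196 | Decorative | | |
| Chr01 | 36310535 | 36310624 | 89 | Chlorotic | | |
| Chr01 | 36442517 | 36440305 | 1,518 | Chlorotic | Phvul.001G128600 | D |
| Chr01 | 36556650 | 36556691 | 41 | Chlorotic | | |
| Chr04 | 2782039 | 2782294 | 255 | Maturity | | |
| Chr04 | 2897805 | 2898208 | 403 | Maturity | | |
| Chr04 | 3115668 | 3116070 | 402 | Maturity | | |
| Chr04 | 3167395 | 3168878 | 1,483 | Maturity | Phvul.004G029200 | P |
| Chr04 | 3474839 | 3481382 | 6,543 | Maturity | | |
| Chr04 | 3489741 | 3490354 | 613 | Maturity | Phvul.004G031300 | I |
| Chr04 | 3568924 | 3569058 | 134 | Maturity | Phvul.004G031900, Phvul.004G032000 | D, D |
| Chr04 | 3679845 | 3684349 | 4,504 | Maturity | | |
| Chr05 | 25018705 | 25018771 | 66 | Lanceolate | | |
| Chr07 | 6354277 | 6354449 | 2,762 | Lanceolate | | |
| Chr08 | 24523869 | 24566903 | 43,034 | Chlorotic | Phvul.008G141500 | G |
| Chr09 | 431942 | 434704 | 2,762 | Lanceolate | | |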

Class 3 SV identified by the program CREST and visually verified using IGV. This SV class identifies sequences absent from one FN line, but present in all other samples. This class of SV is most likely a result of FN mutagenesis.

P, Deletion is within 1000 bp of start codon; I, Deletion is in an intron; D, Deletion is within 1000 bp of stop codon; G, whole gene deleted.
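
The P/I/D/G codes in the legend can be read as a small decision rule. The sketch below is one possible reading of that legend, not the authors' pipeline. It assumes a gene is described by its start-codon and stop-codon positions plus a list of intron intervals, treats "within 1000 bp" as any overlap with a window of 1000 bp on either side of the codon, and checks for whole-gene loss first. The `Gene` class, the `classify_deletion` function, and every coordinate used are hypothetical.

```python
from dataclasses import dataclass, field

FLANK = 1000  # bp window around the start/stop codon, taken from the table legend


@dataclass
class Gene:
    # Hypothetical gene model; on the minus strand start_codon > stop_codon.
    start_codon: int
    stop_codon: int
    introns: list = field(default_factory=list)  # list of (start, end) tuples


def overlaps(a_start, a_end, b_start, b_end):
    """True if the closed intervals [a_start, a_end] and [b_start, b_end] intersect."""
    return a_start <= b_end and b_start <= a_end


def classify_deletion(del_start, del_end, gene):
    """Return 'G', 'I', 'P', 'D', or None for a deletion relative to one gene."""
    lo, hi = sorted((gene.start_codon, gene.stop_codon))
    # G: the deletion spans the whole coding interval.
    if del_start <= lo and del_end >= hi:
        return "G"
    # I: the deletion lies entirely inside one intron.
    for i_start, i_end in gene.introns:
        if i_start <= del_start and del_end <= i_end:
            return "I"
    # P: the deletion touches the window around the start codon.
    if overlaps(del_start, del_end, gene.start_codon - FLANK, gene.start_codon + FLANK):
        return "P"
    # D: the deletion touches the window around the stop codon.
    if overlaps(del_start, del_end, gene.stop_codon - FLANK, gene.stop_codon + FLANK):
        return "D"
    return None


# Toy plus-strand gene with a single intron (coordinates are made up).
gene = Gene(start_codon=10_000, stop_codon=14_000, introns=[(11_000, 12_000)])
for start, end in [(9_000, 15_000), (11_200, 11_334), (9_200, 9_500), (14_500, 14_900)]:
    print(start, end, classify_deletion(start, end, gene))
```

Because the rule is written in terms of the start and stop codon rather than left and right coordinates, the same function handles minus-strand genes without a separate strand flag.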

DNA-seq permits genome analyses on a base pair scale. Using the approach described in the Materials and Methods, we estimated that there are 32,499 SNPs and 20,363 INDELs shared by multiple FN lines, most likely representing the natural variation caused by genetic heterogeneity within the Red Hawk cultivar (Class 2). We also estimated 92,205 SNPs and 20,340 INDELs unique to a single FN line (Class 3). As described earlier, Class 3 SNPs and INDELs are most likely a result of the FN mutagenesis. Both Class 2 and Class 3 SNPs and INDELs were mapped to the available P. vulgaris genome to determine whether the change in genomic architecture corresponded to a genic region (Tables 4, 5). As was observed with the larger deletions, the majority of the INDELs and SNPs mapped to intergenic regions. Confirmation by PCR will be necessary to determine the false discovery rate of the SNP and INDEL identification.
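
The split between shared (Class 2) and line-specific (Class 3) variants comes down to counting how many FN lines carry each variant. The following is a minimal sketch of that counting step, assuming each line's calls are already available as a set of variant keys. The function name `classify_by_carriers`, the key layout `(chromosome, position, change)`, and the toy calls are illustrative assumptions and are not taken from the paper's pipeline.

```python
from collections import Counter


def classify_by_carriers(calls):
    """Map each variant to Class 3 (one carrier) or Class 2 (several carriers).

    calls: dict of line name -> iterable of hashable variant keys.
    """
    carriers = Counter()
    for variants in calls.values():
        carriers.update(set(variants))  # count each line at most once per variant
    return {variant: (3 if n == 1 else 2) for variant, n in carriers.items()}


# Toy calls for three of the FN lines (all positions are made up).
calls = {
    "lanceolate": {("Chr01", 100, "A>G"), ("Chr02", 500, "del")},
    "maturity": {("Chr02", 500, "del"), ("Chr04", 900, "C>T")},
    "chlorotic": {("Chr02", 500, "del")},
}
classes = classify_by_carriers(calls)
for variant, sv_class in sorted(classes.items()):
    print(variant, "Class", sv_class)
```

With only a handful of sequenced lines, a variant counted as Class 3 here could still be a rare natural variant, so the labels are provisional in the same sense as the classes in Tables 4 and 5.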
Table 4

Identification and classification of SNPs in the five mutant plants.

| Number of mutants with SNP | SV class | SNPs in intergenic regions | SNPs within gene (promoter, exon, intron) |
|---|---|---|---|
| SNPs unique to 1 plant | 3 | 74,534 | 17,761 |
| SNPs shared by 2 plants | 2 | 14,515 | 2,376 |
| SNPs shared by 3 plants | 2 | 7,190 | 1,073 |
| SNPs shared by 4 plants | 2 | 4,159 | 744 |
| SNPs shared by 5 plants | 2 | 2,033 | 409 |

SNPs shared by more than one mutant plant (Class 2) represent natural variation existing in the Red Hawk cultivar. SNPs unique to a single mutant (Class 3) are likely caused by FN mutagenesis. SNPs within a gene region are more likely to impact gene expression patterns than those in intergenic regions.
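
The counts in Table 4 can be tallied per class to see how strongly the SNPs skew toward intergenic sequence. The snippet below is a small worked calculation using only the numbers in the table; the names `SNP_COUNTS` and `summarize` are illustrative. The rows for SNPs shared by two to five plants add up to the 32,499 shared SNPs quoted in the text, and in both classes more than four in five SNPs fall in intergenic regions.

```python
# (intergenic, genic) SNP counts from Table 4, keyed by the number of plants carrying the SNP
SNP_COUNTS = {
    1: (74_534, 17_761),
    2: (14_515, 2_376),
    3: (7_190, 1_073),
    4: (4_159, 744),
    5: (2_033, 409),
}


def summarize(counts, carrier_numbers):
    """Return (total SNPs, intergenic fraction) over the selected table rows."""
    intergenic = sum(counts[n][0] for n in carrier_numbers)
    genic = sum(counts[n][1] for n in carrier_numbers)
    total = intergenic + genic
    return total, intergenic / total


class2_total, class2_frac = summarize(SNP_COUNTS, [2, 3, 4, 5])  # shared SNPs
class3_total, class3_frac = summarize(SNP_COUNTS, [1])           # unique SNPs

print(f"Class 2: {class2_total:,} SNPs, {class2_frac:.1%} intergenic")
print(f"Class 3: {class3_total:,} SNPs, {class3_frac:.1%} intergenic")
```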

Table 5

Identification and classification of INDELs in the five mutant plants.

| Number of mutants with INDEL | SV class | INDELs in intergenic regions | INDELs within a gene (promoter, exon, intron) |
|---|---|---|---|
| INDELs unique to 1 plant | 3 | 17,606 | 2,704 |
| INDELs shared by 2 plants | 2 | 8,042 | 1,204 |
| INDELs shared by 3 plants | 2 | 4,771 | 740 |
| INDELs shared by 4 plants | 2 | 2,877 | 445 |
| INDELs shared by 5 plants | 2 | 1,977 | 271 |

INDELs shared by more than one mutant plant likely represent natural variation existing in the Red Hawk cultivar (Class 2), while INDELs unique to a single mutant (Class 3) are likely a result of FN mutagenesis. INDELs within a gene region are more likely to impact gene expression patterns than those in intergenic regions.


Discussion

We demonstrate the feasibility of utilizing high throughput DNA sequencing to analyze FN mutant plants. The use of high throughput sequencing allows the scale of our analysis to be reduced to a base-pair level, providing for the identification and analysis of SNPs, INDELs, and larger genomic deletions using a single experimental platform. Our analysis identified SV in each Red Hawk individual (wild type and FN mutants) in comparison to the P. vulgaris (accession G 19833) reference genome sequence (available at www.phytozome.net). These SV are inferred to be sequences lost (deleted) from the respective Red Hawk individuals or sequences recently gained by G 19833. (Identifying sequences missing in G 19833 that are present in Red Hawk requires a de novo assembly of the Red Hawk genome, which is beyond the scope of this analysis.) One may assume that the majority of the SV are either natural or FN-induced deletions in Red Hawk; therefore these events will be referred to as “deletions” throughout the discussion. Our analysis identified three classes of SV (Figure 2). Class 1 represents the putative inter-cultivar SV, large sequence segments that are missing from all sequenced Red Hawk individuals in this study, but are present in the current G 19833 assembly (Figure 2B). Class 2 represents the intra-cultivar SV exhibiting differences among Red Hawk individuals (Figure 2A). Class 3 represents SV specific to a single Red Hawk individual, which were potentially generated by the FN irradiation (Figure 2C). We focused our analysis on the 23 Class 2 and 18 Class 3 SV that were identified in this analysis. However, it is important to recognize that these classifications are tentative, as a deeper sampling of Red Hawk individuals may re-classify some variants. 
For example, the limited sampling of one wild-type and five mutated Red Hawk individuals suggests that some of our Class 3 SV may be low frequency natural variants that would have been identified in more than one Red Hawk individual (and thereby be a Class 2 SV) if a larger number of genotypes had been sampled. Similarly, some Class 1 SV may be present at low frequency in Red Hawk, suggesting that these would be re-classified as Class 2 SV in a deeper sampling of genotypes. It is probable that increasing the number of sequenced Red Hawk and/or G 19833 individuals would identify many additional Class 1 and Class 2 SV, and strengthen the confidence of the Class 3 calls. A particularly interesting heterogeneous mixture of deletions was identified in a 2.7 million bp region on chromosome 2 of Red Hawk (Figure 3), and represents an intriguing cluster of Class 2 SV. Sequence analysis of the 159 genes within this region revealed that 21 are involved in disease resistance response (one dirigent-like protein, six leucine-rich repeat proteins, and 14 NB-ARC-LRR domain containing disease resistance proteins). This is consistent with previous studies which identified an over-abundance of disease resistance genes located within regions of natural variation (McHale et al., 2012). Additional regions of Class 2 SV are apparent on chromosomes 1, 4, and 11. Analysis of the DNA-seq data for the five individual FN lines suggests the level of natural variation present within the Red Hawk cultivar is higher than that induced by FN mutagenesis. However, aside from the region on chromosome 2, most of the Class 2 SV is in intergenic regions, not likely impacting gene expression or function. The Class 3 deletions are particularly interesting, as they may be associated with the mutant phenotypes found in the FN irradiated individuals. 
For two of the FN lines, no unique deletions >40 bp were found within or near gene coding regions (although it is still possible that these lines carry heterozygous deletions within genic regions). For the remaining three FN lines, however, we identified candidate deletions within gene-encoding regions that may be associated with the resulting phenotype. The lanceolate mutant (1R19C15r28CPVMN11) exhibited a shorter stature than wild type, with elongated, or lanceolate, shaped leaves (Figure 1B). CREST analysis identified a single Class 3 SV potentially altering the expression of Phvul.001G050100 (Table 3, Figure 3). The deletion is entirely contained within the last intron of the gene. The gene is a member of the glycosyltransferase family 47 subgroup C. Proteins encoded by members of this family are bound to the Golgi and are involved in cell wall biosynthesis (Jensen et al., 2008). Altering the expression of a member of this family/subgroup would impact the physical properties of the pectin matrix (Jensen et al., 2008). However, there are no reports of altered leaf phenotypes or plant height in Arabidopsis knockout populations. Three regions of SV belonging to Class 3 potentially affecting gene expression in the maturity mutant (2R29C12r78CPVMN11) were identified by CREST analysis. In the field, this plant showed no phenotypic deviation from the wild type, except a delay in maturity (Figure 1D). These three deletions, all on chromosome 4, potentially affect the expression pattern of four candidate genes (Figure 3, Table 3). The first deletion lies immediately upstream of Phvul.004G029200, which encodes a 60S ribosomal protein. This is an unlikely candidate gene for the phenotype observed. The second deletion is in the intron of Phvul.004G031300. Sequence analysis of this gene failed to identify any associated functional annotation. The final deletion is immediately downstream of both Phvul.004G032000 and Phvul.004G031900. 
The Arabidopsis homologs of these genes (At3g01090 and At5g19790) encode a protein kinase and an AP2 transcription factor, respectively. AP2 transcription factors are involved in regulating flowering and fruit ripening (Huijser and Schmid, 2011). In Arabidopsis, the over-expression of AP2 results in delayed flowering and maturity (Wollmann et al., 2010). It is possible that the deletion immediately downstream of Phvul.004G031900 increases AP2 protein accumulation, resulting in delayed flowering and maturity. Finally, the mutant plant 3R05C25r87CPVMN11 exhibited interveinal chlorosis under standard field conditions (Figure 1E). Interveinal chlorosis is often an indicator of nutrient deficiencies. However, DNA-seq analysis of this mutant revealed two SV belonging to Class 3 potentially affecting gene expression patterns (Table 3). The first is a 1518 bp deletion immediately downstream of Phvul.001G128600. The homolog of this gene in Arabidopsis thaliana (At3g10140) encodes a RecA protein, involved in DNA repair by binding ssDNA (Miller-Messmer et al., 2012). The second Class 3 SV is a large deletion spanning 43,034 bp on chromosome 8, encompassing the entire sequence for Phvul.008G141500 (Table 3). Sequence analysis of this gene determined it is a member of the SNF2 helicase family. The Arabidopsis homolog of this gene, At2g44980, is the only member of the ALC1 SNF2 subfamily (Knizewski et al., 2008). There are no reported data on the phenotype of an ALC1 knockout in Arabidopsis. However, in mammalian systems, ALC1 is essential for repairing DNA damage (Ryan and Hughes, 2011). Down-regulation of ALC1 protein results in hypersensitivity to damaging agents (Ryan and Hughes, 2011). We hypothesize a similar function is conserved in common bean. Based on the interveinal chlorotic leaf patterning observed when this gene is completely excised, it is possible that Phvul.008G141500 is involved in repairing damage to DNA in the leaves, possibly caused by UV radiation. 
We identified Class 3 SV, likely a result of FN mutagenesis, ranging from 1 bp substitutions to 43,034 bp deletions, though changes <40 bp have not been visually verified. In the mutated plants, the level of FN irradiation used (16 Gy) induced far more small (<50 bp) deletions, including single base pair substitutions and deletions, than large genomic deletions. These results are consistent with FN induced deletions identified in a recent paper in Arabidopsis (Belfield et al., 2012). The lack of large SV regions belonging to Class 3, as seen in the well-characterized soybean population (Bolon et al., 2011), may be due to our analysis pipeline requiring the deletion to be complete (i.e., homozygous). It is possible some larger deletions are present in these lines, but are not yet homozygous in the M2 generation and, as such, were not identified by our analysis. In the soybean FN population, 52 and 38% of the mutants with visual phenotypes were identified from seeds treated with 16 and 32 Gy, respectively (Bolon et al., 2011). None of the P. vulgaris plants irradiated at 32 Gy germinated in the field, suggesting the common bean genome may be less resilient to interruption than the soybean genome. The soybean genome may accommodate larger genomic deletions by compensating for gene loss through altered expression patterns of duplicated genes.
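
The distinction between complete (homozygous) and not-yet-homozygous deletions can be illustrated with a generic read-depth heuristic. This is not the method used in this study, which relied on CREST and visual inspection in IGV. It is a hedged sketch assuming that mean read depth across a candidate region and a genome-wide baseline depth are already known. The function name `deletion_zygosity` and all thresholds are hypothetical and would need tuning on real data.

```python
def deletion_zygosity(region_depth, baseline_depth,
                      hom_max=0.1, het_min=0.3, het_max=0.7):
    """Label a candidate region from its depth relative to the genome baseline.

    A ratio near 0 suggests both copies are missing, a ratio near 0.5 suggests
    one copy is missing, and a ratio near 1 suggests no deletion. The cutoffs
    are illustrative assumptions, not values from the paper.
    """
    if baseline_depth <= 0:
        raise ValueError("baseline_depth must be positive")
    ratio = region_depth / baseline_depth
    if ratio <= hom_max:
        return "homozygous deletion"
    if het_min <= ratio <= het_max:
        return "possible heterozygous deletion"
    if ratio > het_max:
        return "no deletion"
    return "ambiguous"


# Toy depths against a hypothetical 30x baseline.
for depth in (0.5, 15, 29, 6):
    print(depth, deletion_zygosity(depth, 30))
```

A pipeline that only accepts regions with essentially no read support would report the first case and miss the second, which is the situation described above for deletions still segregating in the M2 generation.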

Summary

An FN population with 88 individuals or bulk seed from M2 and M3 generations is available upon request. We have illustrated the utility of DNA-seq to identify three classes of SV in P. vulgaris individuals. These analyses were greatly facilitated by the availability of the P. vulgaris genome sequence. In the Red Hawk cultivar, natural variation is clustered in regions throughout the genome. These regions of natural variation illustrate the existing genetic potential of common bean germplasm. Our analyses also identified genomic deletions resulting from FN mutagenesis and candidate genes potentially responsible for the altered phenotype in three of the plants selected.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
References

1. Heike Wollmann; Erica Mica; Marco Todesco; Jeff A Long; Detlef Weigel. On reconciling the interactions between APETALA2, miR172 and AGAMOUS with the ABC model of flower development. Development. 2010-09-28.

2. F J Aragão; L M Barros; A C Brasileiro; S G Ribeiro; F D Smith; J C Sanford; J C Faria; E L Rech. Inheritance of foreign genes in transgenic bean (Phaseolus vulgaris L.) co-transformed via particle bombardment. Theor Appl Genet. 1996-07.

3. Paweł Stankiewicz; James R Lupski. Structural variation in the human genome and its role in disease. Annu Rev Med. 2010.

4. Peter Huijser; Markus Schmid. The control of developmental phase transitions in plants. Development. 2011-10.

5. Csaba Papdi; Jeffrey Leung; Mary Prathiba Joseph; Imma Perez Salamó; László Szabados. Genetic screens to identify plant stress genes. Methods Mol Biol. 2010.

6. Xuehui Huang; Tingting Lu; Bin Han. Resequencing rice genomes: an emerging new era of rice genomics. Trends Genet. 2013-01-04.

7. Jun Cao; Korbinian Schneeberger; Stephan Ossowski; Torsten Günther; Sebastian Bender; Joffrey Fitz; Daniel Koenig; Christa Lanz; Oliver Stegle; Christoph Lippert; Xi Wang; Felix Ott; Jonas Müller; Carlos Alonso-Blanco; Karsten Borgwardt; Karl J Schmid; Detlef Weigel. Whole-genome sequencing of multiple Arabidopsis thaliana populations. Nat Genet. 2011-08-28.

8. Leah K McHale; William J Haun; Wayne W Xu; Pudota B Bhaskar; Justin E Anderson; David L Hyten; Daniel J Gerhardt; Jeffrey A Jeddeloh; Robert M Stupar. Structural variants in the soybean genome localize to clusters of biotic stress-response genes. Plant Physiol. 2012-06-13.

9. Jacob Krüger Jensen; Susanne Oxenbøll Sørensen; Jesper Harholt; Naomi Geshi; Yumiko Sakuragi; Isabel Møller; Joris Zandleven; Adriana J Bernal; Niels Bjerg Jensen; Charlotte Sørensen; Markus Pauly; Gerrit Beldman; William G T Willats; Henrik Vibe Scheller. Identification of a xylogalacturonan xylosyltransferase involved in pectin biosynthesis in Arabidopsis. Plant Cell. 2008-05-06.

10. Eric J Belfield; Xiangchao Gan; Aziz Mithani; Carly Brown; Caifu Jiang; Keara Franklin; Elizabeth Alvey; Anjar Wibowo; Marko Jung; Kit Bailey; Sharan Kalwani; Jiannis Ragoussis; Richard Mott; Nicholas P Harberd. Genome-wide analysis of mutations in mutant lineages selected following fast-neutron irradiation mutagenesis of Arabidopsis thaliana. Genome Res. 2012-04-12.