Literature DB >> 20624746

Copy number variation shapes genome diversity in Arabidopsis over immediate family generational scales.

Seth DeBolt1.   

Abstract

Arabidopsis thaliana is the model plant and is grown worldwide by plant biologists seeking to dissect the molecular underpinning of plant growth and development. Gene copy number variation (CNV) is a common form of genome natural diversity that is currently poorly studied in plants and may have broad implications for model organism research, evolutionary biology, and crop science. Herein, comparative genomic hybridization (CGH) was used to identify and interrogate regions of gene CNV across the A. thaliana genome. A common temperature condition used for growth of A. thaliana in our laboratory and many around the globe is 22 degrees C. The current study sought to test whether A. thaliana, grown under different temperature (16 and 28 degrees C) and stress regimes (salicylic acid spray) for five generations, selecting for fecundity at each generation, displayed any differences in CNV relative to a plant lineage growing under normal conditions. Three siblings from each alternative temperature or stress lineage were also compared with the reference genome (22 degrees C) by CGH to determine repetitive and nonrepetitive CNVs. Findings document exceptional rates of CNV in the genome of A. thaliana over immediate family generational scales. A propensity for duplication and nonrepetitive CNVs was documented in 28 degrees C CGH, which was correlated with the greatest plant stress and infers a potential CNV-environmental interaction. A broad diversity of gene species were observed within CNVs, but transposable elements and biotic stress response genes were notably overrepresented as a proportion of total genes and genes initiating CNVs. Results support a model whereby segmental CNV and the genes encoded within these regions contribute to adaptive capacity of plants through natural genome variation.

Entities:  

Mesh:

Substances:

Year:  2010        PMID: 20624746      PMCID: PMC2997553          DOI: 10.1093/gbe/evq033

Source DB:  PubMed          Journal:  Genome Biol Evol        ISSN: 1759-6653            Impact factor:   3.416


Introduction

Angiosperms rapidly radiated into diverse biomes via both vicariance, which drove adaptation to various ecological niches due to the slow breaking up of land masses and also by biotic and abiotic dispersal mechanisms, which was aided by the advanced seed dispersal mechanisms that were evolving (Lidgard and Crane 1988; Knapp et al. 2005). Consequently, the genome size of angiospermophyta have diversified markedly since their origin, at a rate beyond that of most other taxa (Gaut and Ross-Ibarra 2008). Genome size has been correlated with organism growth and ecology (Gregory 2002), and extremely large genomes may be limited both ecologically and evolutionarily (Lynch and Conery 2003; Morse et al. 2009). Forces of selection based on environmental conditions may be the major components that contribute to genome evolution and plasticity (Huynen and Bork 1998; Barrick et al. 2009), yet the relationship between genome reshuffling and natural selection remains poorly understood. Regions of genome plasticity and rates of genetic segment movement are not well characterized and may have important biological consequences, particularly in annual flowering plants where high natural diversity rates among siblings could provide adaptive advantage. Recently, structural shifts in the entire genome by gene copy number variants (CNVs) have been identified as major genetic variables in the human genome (Sebat et al. 2004) accounting for human disease etiology (McCarroll and Altshuler 2007; Png et al. 2008) and phenotypic variability between individuals (Estivill and Armengol 2007). CNVs are microduplications and deletions, as opposed to whole-genome duplications (Sebat et al. 2004). Plants are the most prolific genome duplicators with examples such as oats (Avena sativa), which has a hexaploid genome structure indicating six polyploid copies of its genome (Linares et al. 1998). Despite the clear flexibility of the plant genome and the sequenced and well-annotated genome of Arabidopsis thaliana, high-resolution study of microduplications and microdeletions by CNV in plants lags behind its mammalian counterpart and may represent a valuable component of the genomic framework (Kliebenstein 2008) that delimits natural phenotypic variation. A study was released during preparation of this manuscript that assayed CNV in maize (Zea mays) (Springer et al. 2009) and showed its potential contribution to the heterosis of this crop during domestication. The rationale for this study was that laboratories all around the world use the model plant, A. thaliana, of the same ecotype (in this case Columbia-0) to dissect the molecular underpinnings of plant growth and development. As an R strategist (MacArthur and Wilson 1967), A. thaliana produce a lot of seed, have a short life cycle, and thus lean to reproduction rather than stability. An overarching goal of the current study sought to test whether A. thaliana, grown under different temperature (16 and 28 °C) regimes and stress regimes (salicylic acid [SA] spray) for five generations and selected for fecundity at each generation, displayed any differences in gene CNV relative to a plant lineage growing under normal (22 °C) conditions. Three siblings from each evolved temperature or stress lineage were compared with the genome of a plant lineage grown under normal growth conditions by comparative genomic hybridization (CGH) to determine repetitive and nonrepetitive CNVs (fig. 1). Ultimately, it was hoped that this experiment would reveal whether CNVs appeared in a lineage-specific manner after force and selection were applied. Force and selection are the overarching rules of evolution and are also common themes of molecular genetics experimental design using A. thaliana.
F

A schematic representation of the experimental design. Force and selection were made at five generational points stemming from a common ancestor. After which, three siblings were selected for interlineage comparison with a reference growth condition by CGH.

A schematic representation of the experimental design. Force and selection were made at five generational points stemming from a common ancestor. After which, three siblings were selected for interlineage comparison with a reference growth condition by CGH.

Materials and Methods

Experimental Design

In this article, five lineages were established from a single parent genotype, and these were grown for five generations under different environmental conditions. The treatments were temperature (16, 22, 28 °C) plus a SA line and a mock treatment line. Selection among these lineages was made between each of the five generations based on fecundity (seed production) whereby the plant with the greatest seed production was chosen for the next generation. Once evolved, three siblings of each lineage were chosen randomly and assayed by whole-genome CGH in an attempt to find regions of repetitive microdeletion and microduplication (fig. 1). The tiling array interval for the CGH chip corresponding to the level of resolution for deletion/duplication was 300 bp.

Growth Conditions and Multigenerational Study Conditions

All A. thaliana lines used in this study were of the Columbia (Col)-O ecotype. For growth of plants in the multigeneration study, plants were stratified for 3 days in 0.15% agar at 4 °C and then seeded directly into soil and grown in continuous light (200 mmol/m2/s) in an Adaptis A1000 environmental growth chamber (Conviron) set to continuous temperature of 16, 22, or 28 °C. Multigeneration studies utilized seed collected from a single plant (the Col-0 seed stocks were obtained from the laboratory of Chris Somerville). For measurements of germination rate and radical elongation extent experiments, seed were germinated in sterile conditions on plates. Plating conditions used surface-sterilized seed that were stratified for 3 days in 0.15% agar at 4 °C prior to plating and were grown in the darkness at the identical temperature regime described above. Plates contained autoclaved 0.5 × Murashige and Skoog (MS) mineral salts (Sigma) and 1% agar. SA treatments were applied with an aqueous mixture of 0.3 mM SA (Sigma Aldrich, catalog number S7401) with 0.025% Silwet-L77 surfactant. Plants were sprayed with a handheld mist sprayer, and approximately 200 ml of SA solution was applied to pots containing 4 individual plants every 14 days. The mock SA treatment contained the Silwet-L77 but no SA and was applied in the same manner as the SA treatment using a different mist sprayer.

Isolation of Plant Genomic DNA from Arabidopsis

Genomic DNA (gDNA) was extracted from leaves of Arabidopsis plants using the DNeasy Plant Kits (QIAGEN). A NANODROP spectrophotometer (Thermo Scientific) was used to check DNA quality. Samples had an A260/A280 ≥1.8 and A260/A230 ≥1.9 for optimal labeling yields. A total of 2 mg of gDNA per sample was used for labeling and hybridization of the CGH array.

Comparative Genome Hybridization and Data Mining

Microarray Design.

CGH measures DNA copy number differences between a test and reference genome. The Arabidopsis whole-genome CGH array was designed and performed as a full service fee-for-service product by Roche Nimblegen on request and is commercially available at Roche Nimblegen (http://www.nimblegen.com/). Initially, probe selection was constrained to have one match within the genome, which resulted in large gaps in coverage. The probe match requirements and selection were relaxed to allow up to five matches and were able to fill in the gaps. The average tiling interval was 300 bases and the median interval was 280 bases (supplementary table S4, Supplementary Material online). The format of CGH array was the 385K design format whereby the single array contains 385,000 probes.

Hybridization and CGH Analysis

DNA extraction and purification were performed according to Nimblegen criteria and samples sent to Roche Nimblegen for CGH sample preparation, labeling, and hybridization. Each gDNA sample comparison used for CGH was labeled using a NimbleGen Dual-Color DNA Labeling Kit. Pairs of samples intended for hybridization to the same array were labeled in parallel using Cy3-Random and Cy5-Random Nonamers from the same kit (Roche NimbleGen). The test samples were labeled with Cy3 (16, SA, and 28 °C treatment) and reference samples with Cy5 (22 °C treatment and Mock SA). Data were normalized by spatial correction of the raw data. Spatial correction reduced artifacts observed in CGH data from and resulted in minimal impact on overall noise and log2-ratio values in regions of CNV. Spatial correction was applied to correct position-dependent nonuniformity of signals across the array. Specifically, locally weighted polynomial regression (LOESS) was used to adjust signal intensities based on X,Y feature position. A qspline fit normalization (Workman et al. 2002) was applied to the data prior to segmentation analysis. By default, normalization is applied and compensated for inherent differences in signal between the two dyes. These statistical processes were performed by Roche Nimblegen as part of the full service product (Roche Nimblegen). Segmental analysis was performed using the SignalMAP software (Roche Nimblegen) to identify regions of CNV that were greater than 1 gene copy. Using the same software, all three biological replicates were overlaid to identify repetitive CNV events. These data were then overlaid with a GIFF annotation file that contains all genes in the Arabidopsis genome fitted to the segmental output in SignalMap (commercially available from Roche Nimblegen). Cross-referencing the annotation file provided the gene identities for each CNV. These were then cross-referenced with The Arabidopsis Information Resource mapping tool (www.arabidopsis.org) to provide putative annotations for each gene.

Results

Genome-Wide Distribution and Size of Repetitive CNVs

All five chromosomes in the A. thaliana genome contained repetitive gene CNV events (table 1). Repetitive gene CNV events were CNVs that were present in at least 2 out of the 3 sibling level CGH assays for each interlineage comparison (figs. 2–4). Chromosomes 1 and 2 were most prone to segmental duplication or deletion, based on the number of genes exposed to CNV (table 1). Size of CNV regions, which by definition are greater than 1 kb in size (The Copy Number Variation Project, Wellcome Trust Sanger Institute), ranged from 2 genes to 128 genes in length (3 to 300 kb). In the 16/22 CGH, there were 14 CNV events that contained a total of 400 gene CNVs that were all microdeletion events. SA spray/Mock CGH assay identified 13 CNV deletion events that contained 402 genes. By contrast, 28/22 CGH assay documented 11 CNV events, which comprised seven microduplications and four microdeletion events and a total of 292 genes were exposed to CNV (table 1). The contribution of CNV to genome variation among CNV events that are repetitive among siblings with a lineage showed that 1.4%, 1.1%, and 1.4% of genes in the genome were exposed to repetitive CNV in the 16/22, 28/22, and SA/Mock assays, respectively (27,549 genes according to EMBL:EBI INTEGR8, A. thaliana genome statistics). Overall, the 16/22 and SA/Mock treatments resulted in a loss of genome size, whereas the 28/22 resulted in a net gain of genome size (supplementary table S1, Supplementary Material online). The largest single CNV event was present in all interlineage comparisons and was a deletion event located on chromosome 2 between 3,244,499 and 3,526,499 bp (supplementary table S1, Supplementary Material online and figs. 2–4).
Table 1

Repetitive CNV That Occurred in At Least 2 Out of 3 Siblings within an Interlineage Comparison (28/22, 16/22, and SA/Mock)

Event NumberChromosomePhysical LocationCNV Type
28/22 CGH
    119019499–9046499Duplication
    2115085499–15151499Deletion
    3115322499–15379499Deletion
    4117248499–17266499Duplication
    521499–67499Duplication
    628971499–9139499Duplication
    7213750499–13843499Duplication
    823241499–3508499Deletion
    941945499–1951499Duplication
    1043193499–3259499Deletion
    1153322499–3455499Duplication
Total genesAll292
Total TEAll32
16/22 CGH
    118767499–8836499Deletion
    2121751499–21850499Deletion
    3127412499–27427499Deletion
    422593499–2674499Deletion
    523121499–3511499Deletion
    627604999–7610999Deletion
    7212457499–12469499Deletion
    8312667499–13147499Deletion
    9316246499–16267499Deletion
    1041699499–1741499Deletion
    1145857499–5920499Deletion
    12413621499–13636499Deletion
    13511509499–11593499Deletion
    14515250499–15262499Deletion
Total genesAll400
Total TEAll159
SA/M CGH
    118767499–8836499Deletion
    2111482499–11524499Deletion
    3121751499–21850499Deletion
    4127412499–27427499Deletion
    522593499–2674499Deletion
    623121499–3514499Deletion
    7212457499–12469499Deletion
    8312667499–13141499Deletion
    9316246499–16267499Deletion
    1041699499–1741499Deletion
    1145857499–5920499Deletion
    12413621499–13636499Deletion
    13511509499–11593499Deletion
Total genesAll402
Total TEAll159

NOTE.—All data represent normalized CNV data, and the initiation and termination site of each CNV segment were documented as a physical location. Each event was defined as CNV event 1–11, 1–14, and 1–13 for 28/22, 16/22, and SA/Mock CGH, respectively. Chromosome number and whether the CNV was a deletion or duplication were documented for each interlineage comparison.

F

CGH analysis of CNV in the Arabidopsis genome in a plant lineage grown at 28 °C compared with a lineage grown at 22 °C for five generations. Both lineages were derived from a common ancestor plant. Right panel is an averaged rainbow view of the CGH analysis with the entire genome broken into color-coded panels. CNV events on each chromosome and each replicate are enlarged, itemized, and presented with functional annotation of the genes occurring in each event. Refer to supplementary table S1 (Supplementary Material online) for the exact annotation and metadata for each gene.

F

CGH analysis of CNV in the Arabidopsis genome in a plant lineage grown at 16 °C compared with a lineage grown at 22 °C for five generations. Both lineages were derived from a common ancestor plant. Right panel is an averaged rainbow view of the CGH analysis. CNV events on each chromosome and each biological replicate are itemized and presented with functional annotation of the genes occurring in each event. Refer to supplementary table S1 (Supplementary Material online) for the exact annotation and metadata for each gene.

F

CGH analysis of CNV in the Arabidopsis genome in a plant lineage grown under conditions whereby plants were exogenously sprayed with SA every 14 days compared with a lineage grown under conditions whereby plants were exogenously sprayed with a mock SA mixture for five generations at 22 °C. Both lineages were derived from a common ancestor plant. Right panel is an averaged rainbow view of the CGH analysis. CNV events on each chromosome and each biological replicate are itemized and presented with functional annotation of the genes occurring in each event. Refer to supplementary table S1 (Supplementary Material online) for the exact annotation and metadata for each gene.

Repetitive CNV That Occurred in At Least 2 Out of 3 Siblings within an Interlineage Comparison (28/22, 16/22, and SA/Mock) NOTE.—All data represent normalized CNV data, and the initiation and termination site of each CNV segment were documented as a physical location. Each event was defined as CNV event 1–11, 1–14, and 1–13 for 28/22, 16/22, and SA/Mock CGH, respectively. Chromosome number and whether the CNV was a deletion or duplication were documented for each interlineage comparison. CGH analysis of CNV in the Arabidopsis genome in a plant lineage grown at 28 °C compared with a lineage grown at 22 °C for five generations. Both lineages were derived from a common ancestor plant. Right panel is an averaged rainbow view of the CGH analysis with the entire genome broken into color-coded panels. CNV events on each chromosome and each replicate are enlarged, itemized, and presented with functional annotation of the genes occurring in each event. Refer to supplementary table S1 (Supplementary Material online) for the exact annotation and metadata for each gene. CGH analysis of CNV in the Arabidopsis genome in a plant lineage grown at 16 °C compared with a lineage grown at 22 °C for five generations. Both lineages were derived from a common ancestor plant. Right panel is an averaged rainbow view of the CGH analysis. CNV events on each chromosome and each biological replicate are itemized and presented with functional annotation of the genes occurring in each event. Refer to supplementary table S1 (Supplementary Material online) for the exact annotation and metadata for each gene. CGH analysis of CNV in the Arabidopsis genome in a plant lineage grown under conditions whereby plants were exogenously sprayed with SA every 14 days compared with a lineage grown under conditions whereby plants were exogenously sprayed with a mock SA mixture for five generations at 22 °C. Both lineages were derived from a common ancestor plant. Right panel is an averaged rainbow view of the CGH analysis. CNV events on each chromosome and each biological replicate are itemized and presented with functional annotation of the genes occurring in each event. Refer to supplementary table S1 (Supplementary Material online) for the exact annotation and metadata for each gene.

Repetitive and Nonrepetitive CNV Events Assayed by Intersibling Comparison

Aimed at pinpointing the consistency and genome-wide distribution of CNV events within a single generation, three siblings from the fifth generation of each lineage were assayed via CGH (fig. 1), and the number of both repetitive and nonrepetitive CNV events counted (table 2). CNV events in siblings from the 16/22 and SA/Mock assays were highly uniform in their distribution among siblings, and no nonrepetitive CNVs were documented. By contrast, the number of CNV events among the three replicates varied substantially in the 28/22 assay (table 2). For instance, chromosome 3 contained no repetitive CNV events but among the siblings there were 6, 5, and 2 nonrepetitive CNV events. Moreover, the number of CNV events on chromosome 5 among siblings varied 2-fold from four in siblings 1 and 2 to eight events in sibling 3. The number of genes exposed to intersibling CNV variation was highest in the 28/22 CGH lineage with 0.38% of genes in the genome exposed to nonrepetitive CNV in an intersibling comparison.
Table 2

CNV Events Per Chromosome Per Sibling Were Analyzed to Establish a Combined Number for Repetitive and Nonrepetitive CNV Events Per Chromosome in a Lineage

CNV Events
Sibling 1Sibling 2Sibling 3
28/22
    Chromosome-1445
    Chromosome-2443
    Chromosome-3652
    Chromosome-4556
    Chromosome-5448
16/22
    Chromosome-1333
    Chromosome-2444
    Chromosome-3222
    Chromosome-4333
    Chromosome-5221
SA/M
    Chromosome-1333
    Chromosome-2444
    Chromosome-3222
    Chromosome-4333
    Chromosome-5211

NOTE.—CNV events that were not common among 2 out 3 biological replicates were considered outlier events within this study.

CNV Events Per Chromosome Per Sibling Were Analyzed to Establish a Combined Number for Repetitive and Nonrepetitive CNV Events Per Chromosome in a Lineage NOTE.—CNV events that were not common among 2 out 3 biological replicates were considered outlier events within this study.

Border Initiation–Termination Sites for Repetitive CNV Events among Siblings

To explore the physical location whereby CNV events began and ended, the CGH probes corresponding to CNVs were mapped against the genome and compared. In the 28/22 CGH assay, 8 out of the 11 CNV events displayed variable initiation or termination sites of the CNV segment among siblings. Alternatively, the 16/22 CGH assay had variable border regions in 3 out of 14 CNV (table 3). SA/Mock also had 3 variable border CNV regions among siblings out of 13 CNV events. Therefore, in addition to displaying greater duplication rate and nonrepetitive CNV events, the 28/22 assay displayed far greater variability of border regions than 16/22 or SA/Mock (table 3). Among duplication events in the 28/22, 5 out of 7 displayed variable initiation–termination sites among siblings, whereas out of 17 different deletion events documented in all lineages, only 5 had variable initiation–termination sites (table 3).
Table 3

Initiation and Termination Sites for Repetitive CNV Events in Each CGH Interlineage Comparison Was Assessed for Each Sibling within a Lineage

Sibling 1Sibling 2Sibling 3
Physical Location of Initiation and Termination of CNV Event
28/22
    Event 18935499–90794998905499–92324999025499–9076499
    Event 215091499–1513949915091499–1515149915085499–15151499
    Event 315325499–1537649915322499–1537949915322499–15379499
    Event 417251499–1726949917248499–1726649917258999–17261999
    Event 51–674991–674991–67499
    Event 63241499–35084993241499–35084993241499–3511499
    Event 78971499–91394998938499–91544998947499–9016499
    Event 813750499–1384349913753499–13939499NA
    Event 91945499–19514991942499–19514991945499–1951499
    Event 103193499–32594993196499–32624993196499–3262499
    Event 113322499–34454993316499–3445999NA
16/22
    Event 18767499–88364998767499–88364998767499–8836499
    Event 221751499–2185049921751499–2185049921751499–21850499
    Event 327412499–2742749927412499–2742749927412499–27427499
    Event 42593499–26744992593499–26744992593499–2674499
    Event 53121499–35114993121499–35114993121499–3235499
    Event 67604999–76109997604999–76109997604999–7610999
    Event 712457499–1246949912457499–1246949912457499–12469499
    Event 812667499–1314149912667499–1314149912667499–13186499
    Event 916246499–1626749916246499–1626749916246499–16267499
    Event 101699499–17414991699499–17414991699499–1741499
    Event 115857499–59204995857499–59204995857499–5920499
    Event 1213621499–1363649913621499–1363649913621499–13636499
    Event 1311509499–1159349911509499–1159349911509499–11626499
    Event 1415250499–1526249915250499–1526249915250499–15262499
SA/M
    Event 18767499–88364998767499–88364998767499–8836499
    Event 211482499–1152449911482499–1152449911482499–11524499
    Event 321751499–2185049921751499–2185049921751499–21850499
    Event 427412499–2742749927412499–2742749927412499–27427499
    Event 52593499–26744992593499–26744992593499–2674499
    Event 63121499–35114993121499–35144993121499–3514499
    Event 712457499–1246949912457499–1246949912457499–12469499
    Event 812667499–1314149912667499–1314149912667499–13141499
    Event 916246499–1626749916246499–1626749916246499–16267499
    Event 101699499–17414991699499–17414991636499–1741499
    Event 115857499–59204995857499–59204995857499–5920499
    Event 1213621499–1363649913621499–1363649913621499–13636499
    Event 1311509499–1159349911509499–1159349911509499–11626499

NOTE.—Repetitive CNV events were defined as occurring in 2 out of 3 biological replicates or siblings among lineage-specific CGH comparisons. The current table documents the physical location where each overlapping CNV event was initiated and terminated on in order to determine conservation among siblings. Refer to table 1 for the corresponding chromosome for each event.

Initiation and Termination Sites for Repetitive CNV Events in Each CGH Interlineage Comparison Was Assessed for Each Sibling within a Lineage NOTE.—Repetitive CNV events were defined as occurring in 2 out of 3 biological replicates or siblings among lineage-specific CGH comparisons. The current table documents the physical location where each overlapping CNV event was initiated and terminated on in order to determine conservation among siblings. Refer to table 1 for the corresponding chromosome for each event. An additional question was whether a nonrandom pattern of initiation gene species existed among CNV events. Five of the initiation genes among the 11 repetitive CNV events in the 28/22 assay were transposable elements (TEs) (supplementary table S1, Supplementary Material online). Others included a secretory membrane protein, other RNA, maturase, nodulin, protein kinase, and Grl5. In the 16/22 assay, initiation genes among the 14 replicated CNV regions included a UDP-3-O-acyl N-acetylglucosamine deacetylase family protein/F-box protein, mRNA splicing, and a protease inhibitor, maturase, five TEs, two unknown proteins, three disease resistance loci including an leucine-rich repeat (LRR) protein kinase, disease resistance protein (nucleotide-binding site [NBS]–LRR), and a resistance locus o (MLO) protein. In the SA/Mock assay, 13 CNVs were apparent and the initiation genes included a UDP-3-O-acyl N-acetylglucosamine deacetylase family protein/F-box protein, maturase, isoamylase, six TEs, three disease resistance loci including an LRR protein kinase, disease resistance protein (NBS-LRR), and a stress responsive suppressor (supplementary table S1, Supplementary Material online). Hence, TEs were the most representative initiating gene in CNV events occurring 42% of the time across all repetitive CNVs (supplementary table S1, Supplementary Material online) but it is important to note that this number was similar to the ratio of TEs among total genes in CNV events.

Variant Gene Identity and Function

16/22 °C

TEs, retro-TEs, and transposases are well documented to represent the largest contributor to genome size and flexibility (Ma et al. 2004). TEs comprised approximately 40% of all genes among repetitive CNV events the 16/22 assay. The most abundant classes were COPIA, GYPSY, CACTA, and non-LTR TEs. Additional TEs identified in the CGH analysis were MUTATOR TE, hAT-like TE, HELICASE TE, Sadhu noncoding TE, and retroelement pol polyproteins. CNVs that were comprised primarily of TEs were located on chromosome 3, 4, and 5 in the 16/22 assay (supplementary table S1, Supplementary Material online). The size of these segments varied, with the largest containing 117 genes and second largest contained 25 genes occurring on chromosome 3 (12,667,227–13,139,009 bp) and chromosome 5 (11,510,033–11,610,717 bp), respectively. In addition to TEs, there were defined transcripts that were highly represented in certain deletion event. Chromosome 1 had three main deletion events, the first was comprised of 36 genes, 6 of which were natural antisense genes and 12 of which were annotated as UDP-3-O-[3-hydroxymyristoyl] N-acetylglucosamine deacetylase genes. The second deletion on chromosome 1 contained 25 genes, and there were six adenosine triphosphate-binding protein and four disease resistance (CC-NBS-LRR class) genes. The third deletion event was 6 genes long and contained 3 disease resistance protein (CC-NBS-LRR class) genes, 2 acid phosphatase survival (SurE) genes, and an antisense gene. Hence, it was documented that disease resistance genes were prone to CNV in these experimental conditions (supplementary table S1, Supplementary Material online). Chromosome 2 also had three deletion events, the first of which was comprised primarily of TEs, the second was conserved across all treatments, and the third deletion event contained a string of 3 LRR-receptor like kinases (RLK) (supplementary table S1, Supplementary Material online). Of particular interest was the second and highly conserved event among all experimental lineages. The CNV event was 128 genes in length and began with a maturase and ended with a pre-tRNA gene. Unlike other CNVs documented in this study, it comprised an overrepresentation of pre-tRNAs (16% of genes in CNV segment). In additional to pre-tRNAs, there were 62 genes of unknown function (48% of genes in CNV segment), 16 TEs (12.5% of genes in CNV segment), and several genes annotated as cytochromes (3%), adenosine triphosphatase (3%), or genes associated with primary metabolism (3%) (supplementary table S1, Supplementary Material online). Chromosome 3 had two deletion events, the first of which was a large block of 117 genes that were largely TEs. The second region was 7 genes in length, and 5 were unknown genes, 1 a TE, and 1 a thionin gene. Chromosome 4 contained three small deletion events, the first contained an unknown gene and a Methyl-CpG–binding domain containing gene. The second contained an mRNA splicing, theoredoxin, pre-tRNA, and a DJ-1 gene. The third was comprised solely of TEs. A single CNV occurred on chromosome 5 and was largely composed of TEs. Despite the most abundant genes located within CNV deletion events being TEs, the identification of CNV regions that contained only NBS-LRR genes was unexpected (supplementary table S1, Supplementary Material online).

SA/Mock

Repetitive CNV events in the SA/Mock lineages were identical to those described above for 16/22 with the exception three CNVs. One additional deletion event occurred on chromosome 1 between 11,503,499 and 11,524,499 bp (At1g31993–At1g32045) and comprised stress response suppressor STRS1, three TEs, an F-BOX protein, a MYOSIN heavy chain related protein, two unknown, and two pseudogenes (supplementary table S1, Supplementary Material online). Two deletion events that occurred in the 16/22 lineage where absent in the SA/Mock lineage, and these occurred on chromosome 2 between 7,604,999 and 7,610,999 and chromosome 5 between 15,250,499 and 15,262,499 bp (supplementary table S1, Supplementary Material online).

28/22 °C

Distinct from both 16/22 and SA/Mock assays, repetitive microduplications among siblings were prominent in the 28/22 assays. No CNV events occurred on chromosome 3 and single CNV events occurred on chromosome 4 and 5 that contained a total of eight genes. Moreover, only a single deletion event was common to either of the other CGH experiments. Hence, 97% of genes exposed to CNV in 28/22 occurred on chromosome 1 and 2. On chromosome 1, two deletion (four genes—all TEs) and two duplication (17 genes) CNV segments were present. The first duplication event was initiated by a MUTATOR transposon (At1g25784) followed by a Ulp protease, which was homologous to the SUMO protease. Of the remaining ten genes, five were unknown and a cytochrome B561, extensin, esterase, haloacid dealogenase, and phosphorylase were present (supplementary table S1, Supplementary Material online). The second microduplicated segment contained two TEs, an other RNA gene, DREB ERF/AP2, and genes encoding proteins of unknown function. Chromosome 2 contained 265 of the 290 CNV genes, in total 91% of all genes among four CNV segments. The first event occurred at the initiation of the chromosome and contained several rRNAs, unknown proteins, and a suite of TEs (supplementary table S1, Supplementary Material online). The second event was a deletion that was common to all treatments and contained 128 genes. This event was comprised of a large number of genes annotated as pseudogenes many of which were pseudogenes associated with primary metabolism, pre-tRNAs, and unknown proteins. The third CNV event on chromosome 2 was a large duplication. This CNV comprised only two TEs out of 89 genes. Genes within this intriguing segment included a fructose biphosphate aldolase, mannose-6-phosphate reductase, triosephosphate isomerase, lipoic acid synthase, diacylglycerol kinase, and peptidyl-prolyl cis-trans isomerase (supplementary table S1, Supplementary Material online). Moreover, regulatory genes such as bZIP transcription factors, auxin response genes, and MCM10 (involved in the initiation of DNA replication) were duplicated. Several genes in this segment were annotated as light or temperature response genes, such as the chloroplast-localized THYLAKOID FORMATION 1 gene involved in vesicle-mediated formation of thylakoid membranes. Genes associated with circadium rhythm were also present in this segment such as FIONA1 (a central oscillator-associated component, two copies were duplicated) and XAP5, which is involved in light regulation of the circadian clock and photomorphogenesis. Other genes such as SYT1 affect temperature tolerance. The fourth duplicated CNV segment that occurred on chromosome 2 contained a diverse range of gene annotations. Regulatory genes such as 2 AXR1 genes (auxin homeostasis), two 3'-5' exonuclease-nucleic acid–binding proteins, 2 F-BOX, MEKK, calcium-binding EF hand family protein, 2 R2R3 factor genes, and RCD1 were duplicated. This duplicated region also had cell wall-associated genes, such as four cellulose synthase like genes, a galactosyltransferase family protein, and a hydroxyproline-rich glycoprotein. Chromosome 3 contained no repetitive CNV events among siblings that could be considered lineage stable CNVs. Chromosome 4 contained two small CNV events. The first was a deletion event, which contained a tandem duplicate of a NODULIN-like gene and six genes of unknown function. The second event on chromosome 4 was a duplication comprised of a tandem duplicate GYPSY TE (supplementary table S1, Supplementary Material online). A single repetitive CNV event occurred on chromosome 5 and was initiated by a protein kinase, followed by a gene encoding a pathogenesis-related protein, C2H2 ring zinc finger domain containing gene, C3HC4 ring zinc finger containing gene, gene of unknown function, and an LRR-RLK terminated the CNV event.

Regions of Tandem-Duplicated Genes Were Common in CNV Events

Genes subject to tandem duplication were highly abundant and comprised 52% of all genes in the 28/22 assay (supplementary table S1, Supplementary Material online). Furthermore, the 16/22 and SA/Mock also had 52% tandem duplication rate among genes exposed to CNV. Considering that tandem-duplicated genes are highly abundant in Arabidopsis and represent approximately 15% of the genome (Vision et al. 2000; Zhang et al. 2002; Blanc et al. 2003), it was noteworthy that a substantial overrepresentation of tandem duplicates occurred among CNV segments (supplementary table S1, Supplementary Material online). Patterns of past gene duplication defined as back to back copies of genes that share nucleotide sequence similarity and retail the same functional annotation occurred frequently within regions of CNV (supplementary table S1, Supplementary Material online; regions of similarity are colored). In both SA/Mock and 16/22 CGH comparisons, the first deletion on chromosome 1 was a good example of where 12 out of 36 genes in this CNV were encoded by UDP-3-O-acyl N-acetylglucosamine deacetylase family protein/F-BOX protein. Moreover, a pattern of gene duplication appeared in this CNV and involved a Cyclin-like F-BOX, anthranilate synthase beta subunit and a potential natural antisense gene (supplementary table S1, Supplementary Material online). TEs were frequently associated with tandem duplication within CNV events (supplementary table S1, Supplementary Material online). Furthermore, CNV event seven (table 1) in the 16/22 and SA/Mock assays revealed that three LRR-RLKs were deleted and comprised the entire CNV event. Other examples of entire CNV events comprising duplications were demonstrated by events two and three on chromosome 1 in the 28/22 CGH, in this case both gene duplicates were TEs (CACTA and GYPSY).

Plant Growth among Lineages

Plants grown at 16, 22, and 28 °C displayed different rates of growth measured from seedling to mature phase. The lifecycle of A. thaliana at 22 °C was 6 weeks in constant light conditions, whereas plants grown at 16 °C required between 8 and 9 weeks to complete their lifecycle. Growth form of plants grown at 16 versus 22 °C was not markedly different, apart from a greater degree of anthocyanin accumulation in leaves, siliques, and stems of the 16 °C grown plants (data not presented, previously documented by Leyva et al. 1995). Plants grown at 28 °C displayed low germination rates (supplementary table S2, Supplementary Material online) consistent with previous reports (Kurek et al. 2007). Although initial growth rates were not significantly different to 22 °C as measured for hypocotyl elongation rates (supplementary table S3, Supplementary Material online), the plants became dwarfed and accumulated 40–50% less biomass than plants grown at 22 °C, and seed set was reduced under high temperature treatment (data not presented). These phenotypes were not detailed since the impact of high temperature on photosynthesis, biomass production and seed production have been previously documented (e.g., Feller et al. 1998; Salvucci and Crafts-Brandner 2004; Kurek et al. 2007). Low temperature regimes (16 °C) have also been shown to have no significant impact on germination rates but some effect on flowering time (Nordborg and Bergelson 1999). Plants grown under conditions whereby SA was applied by exogenous misting every 14 days also had an impact on plant growth, which has been previously documented (Nawrath and Métraux 1999) and therefore not represented. Plants were selected for fecundity at each generation, and seed from a single individual propagated in the proceeding generation (fig. 1). Seed from plants evolved at 22, 16, and 28 °C for five generations were grown on sterile, half strength MS agar plates for five days in the darkness to induce etiolation at each of the selection temperatures. The aim of this experiment was to determine whether germination and/or elongation rates were different in plants acclimated to 16 °C when grown at 28 °C relative to plants acclimated at 28 °C. Results showed that greater germination rates occurred in plants acclimated to 28 °C than those acclimated at 22 °C and 16 °C when growth at 28 °C after five generations (supplementary table S3, Supplementary Material online). Plants that were grown at 16 °C or 22 °C did not differ with respect to their germination rate (supplementary table S3, Supplementary Material online), hypocotyl or root elongation capacity.

Discussion

Life scientists, particularly those studying model organisms often assume isogenic properties of their “wild-type” organism. However, these same experimental biologists are often posing constant selection forces on model organisms through simple environmental changes between or within laboratories, mutagenesis, and/or seeking phenotypes of interest. Hence, we as scientists are playing a role in natural selection with unknown consequences because natural variation and subsequent phenotypic selection are the driving force behind evolution (McClintock 1984). Natural variation among a population of individuals is likely to arise from a complex interplay of genetic variability, such as single nucleotide polymorphisms (Borevitz et al. 2007; Clark et al. 2007; Nordborg and Weigel 2008) and gene CNV (Sebat et al. 2004; Conrad et al. 2006; Mileyko et al. 2008; Springer et al. 2009). CNV has already been shown to be a genomic polymorphism of enormous importance to human biology (Sebat et al. 2004) and yet is poorly studied in plants. The overarching goal of this experiment was to take initial steps to determine whether CNV could be detected among individuals in a population of A. thaliana over immediate family generation scale and secondly, if CNV did occur, what were the genomic features of CNV events. It was shown herein that the genome of A. thaliana displayed unexpectedly high rates of physical change by CNV among closely related lineages, affecting not tens but hundreds of genes between individuals separated by only five generations. Because force and selection were applied to individual lineages and multiple siblings within a lineage sustained the same CNV polymorphisms, results presented herein document an unprecedented level of genome divergence by CNV. As mentioned above, findings of this experiment showed that CNV events were consistent among siblings in interlineage CGH assays. Therefore, these assays infer stability of CNVs into the lineage. The instance where this premise did not always hold true was in the case of the highly stressed lineage of plants evolved at 28 °C. Here, CNV instability was apparent, and numerous nonrepetitive CNVs were documented between siblings (approximately 0.38% of all 27,549 genes in the A. thaliana genome). Association between stress and genome flexibility can occur in A. thaliana by increased rates of recombination under stress conditions (Lucht et al. 2002). This theory has been long established as a mechanism by which an organism may improve its evolutionary advantage (McClintock 1984). Because both allelic recombination (Abu Bakar et al. 2009) and nonallelic recombination (Cheeseman et al. 2009) have been proposed to generate CNV, a plausible explanation for the high rates of nonrepetitive CNV between siblings in the 28/22 assays is that an environment–genome variation interaction exists via CNV. A mechanism by which stress is sensed and transferred to stimulate CNV (via recombination?) remains unclear. Documenting the genes that are highly abundant in CNV, particularly in plants, may have important rational basis to explain steady-state changes in the genome. It was found that approximately 52% of CNVs were comprised of tandem-duplicated genes (supplementary table S1, Supplementary Material online) suggesting that these regions of the genome are prone to CNV. Past research has suggested that tandem duplicates can display divergence in expression in A. thaliana (Ganko et al. 2007). Divergence of preserved duplicates has the potential to lead to differential gene expression based on tissue or cellular compartmentalization (Innan and Kondrashov 2010). Variant deletion or duplication of already tandem-duplicated genes, as documented here, may be a means to attempt to specialize or adapt the genome while shielding against deleterious single gene copy polymorphisms. Such a postulate would be consistent with Hanada et al. (2009), who suggest that duplicate genes contribute to genome robustness and protect the plant from severe phenotypes. Another form of common genome diversification is mediated by TEs (Bennetzen 2000). Cataloging the identity of genes that occurred in the repetitive CNV segments among siblings (figs. 2–4; supplementary table S1, Supplementary Material online) showed that TEs were highly abundant components of both duplicated and deleted regions (supplementary table S1, Supplementary Material online) consistent with other studies (Bennetzen 2000; Feschotte et al. 2002). A major discrepancy in this pattern was that while TEs comprised 40% of all genes represented in repetitive CNV events among siblings in the 16/22 lineage CGH, they comprised only 10% of genes in repetitive CNV events among siblings in the 28/22 lineage. These data suggested that variants in plants that were successively being selected with a strong environmental force had reduced frequency of TEs in CNV events. The biological consequence of this disparity was unclear, but it is tempting to speculate that an association between selection and functional gene species would be correlated with adaptation. However, this fails to explain why fewer TEs were associated with stable CNVs in the 28/22 lineage because this would require a mechanism that was capable of preferential CNV selection. An important element of this study was to ask whether nonrandom patterns appeared in the whole-genome distribution of CNV events. Regional CNV clusters at the whole-genome scale were most prevalent on chromosome 2. For instance, 265 out of 290 genes exposed to repetitive CNVs among siblings were located on chromosome 2 from the 28/22 CGH assay. Also, the largest CNV event occurred on chromosome 2 and was a conserved deletion among all treatments of 128 genes (table 1). Patterns of CNV were not predominantly associated with centromeres and could not be physically associated with any region of genome. Without assaying hundreds rather than three interlineage comparisons, which is economically challenging, it is not feasible to draw any conclusions as to the distribution patterns of CNVs at the genome scale in A. thaliana at the current time. Moreover, because selection was based on fecundity and not phenotypic abnormality, it was not an aim of this experiment to delineate between phenotype and CNV. Taking a broad view of the genomic landscape over interlineage selection with a strong forcing function such as temperature (a reality under current climate change scenarios, Tester and Langridge 2010), a systematic view of polymorphisms in interlineage evolution with at least 20+ individuals compared back with a common ancestor genome is needed to acquire a more elaborate picture of genome evolution.

Conclusions

The current study establishes several important paradigms related to CNV in plants for consideration in plant genome evolution and natural diversity: 1) A. thaliana lineages separated by five generations from a common genotype displayed a substantial rate of retained CNVs among siblings (up to 402 genes); 2) CNVs are comprised of a diverse range of gene species but where overrepresented by TEs, biotic stress response genes, and one conserved CNV was heavily comprised of pre-tRNAs (chromosome 2 position 3,121,499–3,511,499 bp); 3) environmental stress was correlated with duplication CNV events and a higher rate of nonrepetitive CNVs among siblings; and 4) experimental biologists using the model plant A. thaliana are dealing with an organism that undergoes unexpectedly high rates of CNV. To broadly conclude, the experiment described herein aimed to take some first steps to understand how CNV contributes to interlineage natural variation in A. thaliana and found that it represented a significant genomic factor underlying natural diversity.

Supplementary Material

Supplementary tables S1–S3 are available at Genome Biology and Evolution online (http://www.oxfordjournals.org/our_journals/gbe/).
  39 in total

1.  The effect of seed and rosette cold treatment on germination and flowering time in some Arabidopsis thaliana (Brassicaceae) ecotypes.

Authors:  M Nordborg; J Bergelson
Journal:  Am J Bot       Date:  1999-04       Impact factor: 3.844

Review 2.  Genome size and developmental complexity.

Authors:  T Ryan Gregory
Journal:  Genetica       Date:  2002-05       Impact factor: 1.082

3.  Divergence in expression between duplicated genes in Arabidopsis.

Authors:  Eric W Ganko; Blake C Meyers; Todd J Vision
Journal:  Mol Biol Evol       Date:  2007-08-01       Impact factor: 16.240

4.  The significance of responses of the genome to challenge.

Authors:  B McClintock
Journal:  Science       Date:  1984-11-16       Impact factor: 47.728

5.  Genome evolution and adaptation in a long-term experiment with Escherichia coli.

Authors:  Jeffrey E Barrick; Dong Su Yu; Sung Ho Yoon; Haeyoung Jeong; Tae Kwang Oh; Dominique Schneider; Richard E Lenski; Jihyun F Kim
Journal:  Nature       Date:  2009-10-18       Impact factor: 49.962

6.  Salicylic acid induction-deficient mutants of Arabidopsis express PR-2 and PR-5 and accumulate high levels of camalexin after pathogen inoculation.

Authors:  C Nawrath; J P Métraux
Journal:  Plant Cell       Date:  1999-08       Impact factor: 11.277

7.  Moderately High Temperatures Inhibit Ribulose-1,5-Bisphosphate Carboxylase/Oxygenase (Rubisco) Activase-Mediated Activation of Rubisco

Authors: 
Journal:  Plant Physiol       Date:  1998-02-01       Impact factor: 8.340

8.  Relationship between the heat tolerance of photosynthesis and the thermal stability of rubisco activase in plants from contrasting thermal environments.

Authors:  Michael E Salvucci; Steven J Crafts-Brandner
Journal:  Plant Physiol       Date:  2004-04       Impact factor: 8.340

9.  Discrimination of the closely related A and D genomes of the hexaploid oat Avena sativa L.

Authors:  C Linares; E Ferrer; A Fominaya
Journal:  Proc Natl Acad Sci U S A       Date:  1998-10-13       Impact factor: 11.205

10.  Genome-wide patterns of single-feature polymorphism in Arabidopsis thaliana.

Authors:  Justin O Borevitz; Samuel P Hazen; Todd P Michael; Geoffrey P Morris; Ivan R Baxter; Tina T Hu; Huaming Chen; Jonathan D Werner; Magnus Nordborg; David E Salt; Steve A Kay; Joanne Chory; Detlef Weigel; Jonathan D G Jones; Joseph R Ecker
Journal:  Proc Natl Acad Sci U S A       Date:  2007-07-12       Impact factor: 11.205

View more
  54 in total

1.  Intron-length polymorphism identifies a Y2K4 dehydrin variant linked to superior freezing tolerance in alfalfa.

Authors:  Yves Castonguay; Marie-Pier Dubé; Jean Cloutier; Réal Michaud; Annick Bertrand; Serge Laberge
Journal:  Theor Appl Genet       Date:  2011-11-09       Impact factor: 5.699

Review 2.  Life at the extreme: lessons from the genome.

Authors:  Dong-Ha Oh; Maheshi Dassanayake; Hans J Bohnert; John M Cheeseman
Journal:  Genome Biol       Date:  2012       Impact factor: 13.583

3.  Copy number variation in transcriptionally active regions of sexual and apomictic Boechera demonstrates independently derived apomictic lineages.

Authors:  Olawale M Aliyu; Michael Seifert; José M Corral; Joerg Fuchs; Timothy F Sharbel
Journal:  Plant Cell       Date:  2013-10-29       Impact factor: 11.277

4.  The genome of the extremophile crucifer Thellungiella parvula.

Authors:  Maheshi Dassanayake; Dong-Ha Oh; Jeffrey S Haas; Alvaro Hernandez; Hyewon Hong; Shahjahan Ali; Dae-Jin Yun; Ray A Bressan; Jian-Kang Zhu; Hans J Bohnert; John M Cheeseman
Journal:  Nat Genet       Date:  2011-08-07       Impact factor: 38.330

Review 5.  Halophytism: What Have We Learnt From Arabidopsis thaliana Relative Model Systems?

Authors:  Yana Kazachkova; Gil Eshel; Pramod Pantha; John M Cheeseman; Maheshi Dassanayake; Simon Barak
Journal:  Plant Physiol       Date:  2018-09-20       Impact factor: 8.340

6.  hsp90 and hsp47 appear to play an important role in minnow Puntius sophore for surviving in the hot spring run-off aquatic ecosystem.

Authors:  Arabinda Mahanty; Gopal Krishna Purohit; Ravi Prakash Yadav; Sasmita Mohanty; Bimal Prasanna Mohanty
Journal:  Fish Physiol Biochem       Date:  2016-08-13       Impact factor: 2.794

Review 7.  Copy number variation and disease resistance in plants.

Authors:  Aria Dolatabadian; Dhwani Apurva Patel; David Edwards; Jacqueline Batley
Journal:  Theor Appl Genet       Date:  2017-10-17       Impact factor: 5.699

8.  The composition and origins of genomic variation among individuals of the soybean reference cultivar Williams 82.

Authors:  William J Haun; David L Hyten; Wayne W Xu; Daniel J Gerhardt; Thomas J Albert; Todd Richmond; Jeffrey A Jeddeloh; Gaofeng Jia; Nathan M Springer; Carroll P Vance; Robert M Stupar
Journal:  Plant Physiol       Date:  2010-11-29       Impact factor: 8.340

9.  Genome Reduction Uncovers a Large Dispensable Genome and Adaptive Role for Copy Number Variation in Asexually Propagated Solanum tuberosum.

Authors:  Michael A Hardigan; Emily Crisovan; John P Hamilton; Jeongwoon Kim; Parker Laimbeer; Courtney P Leisner; Norma C Manrique-Carpintero; Linsey Newton; Gina M Pham; Brieanne Vaillancourt; Xueming Yang; Zixian Zeng; David S Douches; Jiming Jiang; Richard E Veilleux; C Robin Buell
Journal:  Plant Cell       Date:  2016-01-15       Impact factor: 11.277

10.  Cbf14 copy number variation in the A, B, and D genomes of diploid and polyploid wheat.

Authors:  Taniya Dhillon; Eric J Stockinger
Journal:  Theor Appl Genet       Date:  2013-08-06       Impact factor: 5.699

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.