Vasantika Suryawanshi, Ina N Talke, Michael Weber, Roland Eils, Benedikt Brors, Stephan Clemens, Ute Krämer.
Abstract
BACKGROUND: Gene copy number divergence between species is a form of genetic polymorphism that contributes significantly to both genome size and phenotypic variation. In plants, copy number expansions of single genes were implicated in cultivar- or species-specific tolerance of high levels of soil boron, aluminium or calamine-type heavy metals, respectively. Arabidopsis halleri is a zinc- and cadmium-hyperaccumulating extremophile species capable of growing on heavy-metal contaminated, toxic soils. In contrast, its non-accumulating sister species A. lyrata and the closely related reference model species A. thaliana exhibit merely basal metal tolerance.
Keywords: Arabidopsis halleri; Array-CGH; CNV; Cross-species; Metal hyperaccumulation; Resistance genes (R genes); Toll-Interleukin Receptor-Nucleotide Binding Site-Leucine Rich Repeat (TIR-NBS-LRR) protein family
Year: 2016 PMID: 28155655 PMCID: PMC5259951 DOI: 10.1186/s12864-016-3319-5
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Comparison of the metal hyperaccumulator species Arabidopsis halleri to the closely related non-hyperaccumulator species A. lyrata and A. thaliana. A representative photograph is shown for each species, together with the estimated evolutionary distances separating them, given as the divergence times from a common ancestor [96]. Listed below are key phenotypic and genomic characteristics. Mya, million years ago
Fig. 2 Overview of the data analysis workflow. Flowchart summarizing our two-step normalization approach for the processing of cross-species genomic hybridization data, consisting of within-species normalization and global scaling of signals through species-specific signal correction factors, followed by the final prediction of copy number divergent genes. Grey arrows and backgrounds mark the auxiliary steps taken for the determination of species-specific global scaling factors with the aid of reference gene datasets
Fig. 3 Frequency distribution of mismatch occurrence between microarray probe sequences and heterologous target gene sequences. Shown is the percentage of A. thaliana probes on the ATH1 array that display between 0 and 11 mismatches (observed maximum) when hybridized to non-A. thaliana genomic DNA from either A. halleri (black bars) or A. lyrata (white bars). The expected frequency distribution (binomial) is shown by the grey line, and was calculated based on the average coding sequence identity (94%) within transcribed regions between A. halleri and A. thaliana
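The expected (binomial) curve described in the Fig. 3 caption can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that each of the 25 positions of a probe mismatches independently with probability 1 - 0.94 = 0.06, and the names `PROBE_LENGTH`, `IDENTITY` and `mismatch_pmf` are hypothetical.

```python
from math import comb

PROBE_LENGTH = 25   # ATH1 probes are 25-mers (see the Fig. 4 caption)
IDENTITY = 0.94     # average coding sequence identity quoted in the caption


def mismatch_pmf(k, n=PROBE_LENGTH, identity=IDENTITY):
    """Binomial probability of exactly k mismatches among n probe positions.

    Assumption: positions mismatch independently, each with
    probability (1 - identity).
    """
    p = 1.0 - identity
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


# Expected share of probes for every possible mismatch count (0..25).
expected = [mismatch_pmf(k) for k in range(PROBE_LENGTH + 1)]

# Print the range shown in the figure (0 to 11 mismatches).
for k in range(12):
    print(f"{k:2d} mismatches: {100 * expected[k]:6.2f}% of probes")
```

Under these assumptions the single most likely outcome is one mismatch per probe, which gives a reference point for comparing the observed bars with the grey line.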
Fig. 4 Dependence of hybridization signal intensity on number and position of mismatches with respect to the probe sequence on the ATH1 array. (a) Values are arithmetic means (± SD; n = 8 to 94) of background-corrected raw probe signal intensity ratios for non-A. thaliana gDNA relative to A. thaliana gDNA hybridizations, shown as a function of the total number of mismatches of the heterologous target sequence compared to the corresponding A. thaliana 25-mer probe sequence. (b) Independence of hybridization signal intensity from the position of a single mismatch with respect to the probe sequence. Values are arithmetic means (± SD; n = 2 to 6) of background-corrected raw probe signal intensity ratios for non-A. thaliana gDNA relative to A. thaliana gDNA hybridizations, shown as a function of mismatch position in the heterologous target sequence compared to the corresponding A. thaliana probe sequence. Black circles represent the representative A. halleri reference dataset; white diamonds represent the representative A. lyrata reference dataset
Fig. 5 Distribution of signal intensities before and after normalization and scaling. Boxplot of (a) background-corrected raw hybridization signal intensities, (b) normalized signal intensities after VSN normalization of the replicate arrays of each species, respectively, and (c) signal intensities after the application of species-specific global scaling factors to the normalized data. Boxes show median, and upper and lower quartiles, of log2 probe signal intensities for each gDNA hybridization. Upper and lower horizontal bars mark all values lying within 1.5 times the inter-quartile range. Replicate hybridizations are denoted 1 and 2 and grouped by species
Validation of array-CGH results against highly conserved^a genes predicted to be copy number expanded (CNEs) in A. lyrata
| Array-CGH prediction method | Total number of CNEs detected | No. of CNEs detected, E (of 117)^b | No. of CNEs detected, E-A (of 98)^b | Sensitivity, E (%) | Sensitivity, E-A (%) | Specificity, E (%) | Specificity, E-A (%) | Precision, E (%)^c | Precision, E-A (%)^c |
|---|---|---|---|---|---|---|---|---|---|
| Present manuscript | 217 | 12 | 12 | 10.3 | 12.2 | 99.5 | 99.6 | 5.3 | 5.3 |
| Darby et al. 2011^d | 949 | 10 | 8 | 8.5 | 8.2 | 99.5 | 99.6 | 1.3 | 1.1 |
| Machado et al. 2010^e | 298 | 7 | 5 | 5.9 | 5.1 | 99.5 | 99.6 | 2.9 | 2.1 |
Sensitivity: % positives detected out of predicted positives. Specificity: % negatives detected out of predicted negatives. Precision: % true positives out of total no. detected.
^a A. lyrata genes sharing ≥ 95% sequence identity with their closest A. thaliana homologue [68] are termed highly conserved (compare Additional file 7)
^b E and E-A refer to the CNEs predicted by Ensembl Plants alone (E; Vilella et al. 2009) or additionally by A. lyrata genome analysis (E-A; Hu et al. 2011), respectively, with the total number of predicted CNEs in each group given in parentheses. The "No. of CNEs detected" columns give the commonalities with these two groups of genes; the columns to their right give data referring to these two groups of genes
^c True positive is a CNE detected based on array-CGH that was previously predicted to be a CNE by Ensembl Plants [68] alone, or additionally by A. lyrata genome [35] analysis
^d [42]
^e [43]
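The sensitivity values for the present manuscript can be checked directly from the counts given in the table. The sketch below is only an arithmetic illustration; the function name `sensitivity_percent` is hypothetical. It stops at sensitivity because the table does not list the counts (for example true negatives) that would be needed to recompute specificity or precision.

```python
def sensitivity_percent(detected_positives, predicted_positives):
    """Sensitivity as defined in the table header:
    % positives detected out of predicted positives."""
    return 100.0 * detected_positives / predicted_positives


# Counts taken from the "Present manuscript" row and the header parentheses:
# 12 of the 117 Ensembl Plants CNEs (E) and 12 of the 98 CNEs (E-A) detected.
sens_e = sensitivity_percent(12, 117)
sens_ea = sensitivity_percent(12, 98)

print(f"Sensitivity (E):   {sens_e:.1f}%")
print(f"Sensitivity (E-A): {sens_ea:.1f}%")
```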
Genes identified to be altered in copy number in A. halleri through cross-species hybridization of gDNA onto A. thaliana microarrays
| Affymetrix probeset ID | AGI locus ID | Short gene name^a | Gene description | | | | |
|---|---|---|---|---|---|---|---|
| (A) | | | | | | | |
| Copy number expanded in | | | | | | | |
| 256055_at | At1g07030 | | Mitoferrin-related, mitochondrial solute carrier (MSC) family | 1.23 | 0.01 | -0.05 | 0.43 |
| 262832_s_at | At1g14870; At1g14880 | | Plant cadmium resistance 1/2 | | 0.00 | | 0.00 |
| 247824_at | At5g58460 | | Member of Na+/H+ antiporter family | 2.50 | 0.00 | 0.24 | 0.63 |
| 261448_at^c | At1g21140 | | Vacuolar iron transporter 1 (VIT1)-related | 1.27 | 0.00 | 0.76 | 0.00 |
| 262958_at | At1g54410 | | Ricinus iron transport protein 2-related | | 0.08 | | 0.00 |
| 262936_at | At1g79400 | | Cation/H+ exchanger 2 | 1.13 | 0.03 | 0.53 | 0.16 |
| 267304_at^d | At2g30080 | | ZRT-, IRT-like protein 6 | 1.12 | 0.08 | 0.46 | 0.55 |
| 263831_at | At2g40300 | | Ferritin 4 | 1.22 | 0.00 | 0.5 | 0.37 |
| 266718_at^c,d | At2g46800 | | Metal transport/tolerance protein 1 | 2.5 | 0.00 | 0.24 | 0.63 |
| 259008_at | At3g09390 | | Metallothionein 2a | 1.22 | 0.00 | 0.57 | 0.00 |
| 252697_at | At3g43660 | | Vacuolar iron transporter 1 (VIT1)-related | 1.16 | 0.08 | 0.78 | 0.02 |
| 251735_at | At3g56090 | | Ferritin 3 | 1.3 | 0.00 | 0.2 | 0.63 |
| 251733_at | At3g56240 | | Copper chaperone | 1.12 | 0.00 | 0.44 | 0.19 |
| 255552_at^d | At4g01850 | | S-adenosylmethionine synthetase 2 | 1.38 | 0.02 | 0.51 | 0.42 |
| 254604_at | At4g19070 | | Cadmium-induced protein AS8 | 1.13 | 0.00 | 0.49 | 0.05 |
| 253658_at^c,d | At4g30120 | | Heavy metal ATPase 3 | 2 | 0.00 | 0.63 | 0.01 |
| 252864_at | At4g39740 | | Homologue of yeast copper chaperone Sco1 | 1.06 | 0.01 | 0.34 | 0.66 |
| 250991_at | At5g02380 | | Metallothionein 2b | 1.32 | 0.05 | 0.43 | 0.34 |
| 248048_at^c | At5g56080 | | Nicotianamine synthase 2 | | 0.00 | | 0.00 |
| Putative copy number expanded in | | | | | | | |
| 249334_at | At5g41000 | | Yellow stripe like transporter 4 | 0.99 | 0.43 | 0.00 | 0.08 |
| 266336_at | At2g32270 | | ZRT-, IRT-like$ protein 3 | 0.97 | 0.28 | 0.09 | 0.58 |
| 260283_at | At1g80480 | | Plastid transcriptionally active 17, putative Zn metallochaperone | 0.96 | 0.46 | 0.00 | 0.05 |
| 260551_at | At2g43510 | | Trypsin inhibitor 1, defensin-like protein family | 0.95 | 0.24 | 0.00 | 0.74 |
| 250944_at^d | At5g03380 | | Putative metallochaperone-like protein | 0.95 | 0.67 | 0.00 | 0.00 |
| 258745_at | At3g05920 | | Putative metallochaperone-like protein | 0.93 | 0.45 | 0.00 | 0.06 |
| 253964_at | At4g26480 | | Protein with NAS domain and KH domain | 0.93 | 0.37 | 0.01 | 0.14 |
| 247331_at | At5g63530 | | Farnesylated protein 3, metal-binding | 0.92 | 0.45 | 0.00 | 0.09 |
| 266091_at | At2g37920 | | Copper transporter 4, embryo defective 1513 | 0.92 | 0.68 | 0.01 | 0.01 |
| 259871_at^c | At1g76800 | | Vacuolar iron transporter 1 (VIT1)-related | 0.91 | 0.18 | 0.02 | 0.85 |
| 256930_at^d | At3g22460 | | O-acetylserine thiol lyase isoform A2 | 0.90 | 0.42 | 0.06 | 0.14 |
| 253413_at^c | At4g33020 | | ZRT-, IRT-like$ protein 9 | 0.89 | 0.07 | 0.03 | 0.79 |
| 266115_at | At2g02140 | | Plant defensin 2.6 | 0.86 | 0.46 | 0.00 | 0.01 |
| 258415_at | At3g17390 | | S-adenosylmethionine synthetase 3 | 0.86 | 0.09 | 0.00 | 0.95 |
| 260489_at | At1g51610 | | Metal transport protein 7 | 0.86 | 0.75 | 0.00 | 0.00 |
| 260913_at^d | At1g02500 | | S-adenosylmethionine synthetase 3 | 0.85 | 0.44 | 0.00 | 0.03 |
| 262324_at | At1g64170 | | Cation/H+ exchanger, CPA2 family | 0.84 | 0.38 | 0.07 | 0.36 |
| 257365_x_at^c,d | At2g26020 | | Plant defensin 1.2b | 0.83 | 0.05 | 0.02 | 0.69 |
| 249255_at^c | At5g41610 | | Cation/H+ exchanger, CPA2 family | 0.77 | 0.63 | 0.00 | 0.01 |
| 263838_at^c,d | At2g36880 | | Methionine adenosyltransferase 3 | 0.74 | 0.45 | 0.00 | 0.01 |
| 247128_at | At5g66110 | | Putative metallochaperone-like protein | 0.73 | 0.35 | 0.00 | 0.02 |
| 264644_at | At1g08960 | | Ca2+/H+ exchanger 11 | 0.67 | 0.60 | 0.00 | 0.00 |
| 252694_at^c | At3g43630 | | Vacuolar iron transporter 1 (VIT1)-related | 0.67 | 0.60 | 0.02 | 0.00 |
| 261135_at | At1g19610 | | Plant defensin 1.4 | 0.67 | 0.42 | 0.00 | 0.01 |
| 260601_at | At1g55910 | | ZRT-, IRT-like$ protein 11 | 0.65 | 0.02 | 0.02 | 0.57 |
| 257054_at | At3g15353 | | Metallothionein 3 | 0.62 | 0.30 | 0.01 | 0.12 |
| 258987_at^d | At3g08950 | | Homologue of yeast copper chaperone Sco1 | 0.61 | 0.23 | 0.00 | 0.25 |
| (B) | | | | | | | |
| Affymetrix probeset ID | AGI locus ID | Short gene name | Gene description | | | | |
| 264097_s_at | At1g16700; At1g79010 | —; | Complex I & 23 kDa subunit; | | 0.00 | | 0.00 |
| 261489_at | At1g14450 | | Complex I & B12 subunit | | 0.00 | | 0.00 |
| 254120_at | At4g24570 | | Dicarboxylate carrier 2 | | 0.00 | | 0.00 |
| 245715_s_at | At5g08670; At5g08690 | | ATP synthase | 1.78 | 0.00 | 0.64 | 0.01 |
| 266512_at | At2g47690 | | Complex I & 14 kDa subunit; Fe-S subunit 5 | | 0.00 | | 0.00 |
| 258847_at | At3g03100 | | Complex I & 17.2 kDa subunit | | 0.06 | | 0.07 |
| 260767_s_at | At1g49140; At3g18410 | | Complex I & 12 kDa subunit NDUFS6; PDSW subunit | 1.34 | 0.00 | 0.76 | 0.00 |
| 256679_at | At3g52300 | | ATP synthase D chain | 1.26 | 0.00 | 0.03 | 0.66 |
| 262397_at | At1g49380 | | Cytochrome | 1.21 | 0.01 | 0.48 | 0.21 |
| 249627_at^d | At5g37510 | | Complex I & subunit of the 400 kDa subcomplex; Embryo defective 1467 | 1.18 | 0.00 | 0.76 | 0.00 |
| 246309_at | At3g51790 | | Orthologue of | 1.10 | 0.01 | 0.60 | 0.06 |
| 252864_at | At4g39740 | | Homologue of yeast copper chaperone Sco1 | 1.06 | 0.01 | 0.34 | 0.66 |
| 263375_s_at | At2g20530; At4g28510 | | Complex I & Prohibitin 6; Prohibitin 1 | 1.02 | 0.00 | 0.36 | 0.04 |
| 256267_at | At3g12260 | | Complex I & LYR family of Fe/S cluster biogenesis protein | 1.02 | 0.01 | 0.31 | 0.32 |
| (C) | | | | | | | |
| Affymetrix probeset ID | AGI locus ID | Short gene name^a | Gene description | | | | |
| 245219_at | At1g58807; At1g59124 | | Disease resistance protein (CC-NBS-LRR# class) family | | 0.00 | | 0.00 |
| 248851_s_at | At5g46260; At5g46490 | | Disease resistance protein (TIR-NBS-LRR# class) family | | 0.00 | | 0.00 |
| 262374_s_at | At1g72910; At1g72930 | | TIR domain-containing protein | –1.20 | 0.00 | –0.70 | 0.00 |
| 255060_at | At4g09430 | | TIR-NBS-LRR# class | –1.20 | 0.00 | –0.84 | 0.00 |
| 250069_at | At5g17970 | | TIR-NBS-LRR# class | –1.20 | 0.00 | –0.70 | 0.00 |
| 250771_at | At5g05400 | | LRR and NB-ARC+ domains-containing disease resistance protein | –1.17 | 0.00 | –0.59 | 0.00 |
| 262362_at | At1g72840 | | TIR-NBS-LRR# class | –1.12 | 0.00 | –0.76 | 0.00 |
| 245454_at | At4g16920 | | TIR-NBS-LRR# class | –1.12 | 0.00 | –0.59 | 0.00 |
| 257099_s_at | At3g24982; At3g25020 | | Receptor like protein 40/42 | –1.11 | 0.00 | –0.77 | 0.00 |
| 252648_at | At3g44630 | | TIR-NBS-LRR# class | –1.11 | 0.00 | –0.59 | 0.00 |
| 250039_at | At5g18370 | | TIR-NBS-LRR# class | –1.11 | 0.00 | –0.74 | 0.00 |
| 251438_s_at | At3g59930; At5g33355 | | Defensin-like (DEFL) family | –1.10 | 0.00 | –0.54 | 0.00 |
| 252489_at | At3g46710 | | NB-ARC+ domain-containing disease resistance protein | –1.10 | 0.00 | –0.70 | 0.00 |
| 262363_at | At1g72850 | | TIR-NBS# class | –1.09 | 0.00 | –0.67 | 0.00 |
| 248973_at | At5g45050 | | Tolerant to tobacco ringspot nepovirus | –1.09 | 0.00 | –0.24 | 0.01 |
| 256431_s_at | At3g11010; At5g27060 | | Receptor like protein 34/53 | –1.06 | 0.00 | –0.60 | 0.00 |
| 245674_at | At1g56680 | | Chitinase family protein | –1.06 | 0.00 | –0.83 | 0.00 |
| 249320_at | At5g40910 | | TIR-NBS-LRR# class | –1.06 | 0.00 | –0.37 | 0.00 |
| 262364_at | At1g72860 | | TIR-NBS-LRR# class | –1.05 | 0.00 | –0.64 | 0.00 |
| 255059_at | At4g09420 | | TIR-NBS-LRR# class | –1.04 | 0.00 | –0.62 | 0.00 |
| 265723_at | At2g32140 | | TIR domain transmembrane protein | –1.03 | 0.00 | –0.74 | 0.00 |
| 252684_at | At3g44400 | | TIR-NBS-LRR# class | –1.03 | 0.00 | –0.66 | 0.00 |
| 263572_at | At2g17060 | | TIR-NBS-LRR# class | –1.02 | 0.00 | 0.00 | 0.33 |
| 245457_s_at | At4g16940; At4g16960 | | TIR-NBS-LRR# class | –1.01 | 0.00 | –0.68 | 0.00 |
MapMan functional classes: (A) transition metal homeostasis; (B) mitochondrial electron transport/ATP synthesis; (C) biotic stress
^a Short names of genes previously reported to be copy-number expanded in A. halleri based on other experimental approaches are given in bold type (see also Additional file 5)
^b Log2 fold-change (FC) is underlined for genes that are concordantly copy number expanded/reduced in both A. halleri and A. lyrata
^c Higher transcript abundance in roots of A. halleri vs. A. thaliana (Supplemental Table 1 of Talke et al. 2006)
^d Higher transcript abundance in shoots of A. halleri vs. A. thaliana (Supplemental Table 2 of Talke et al. 2006)
Synthesis of cytochrome c oxidase;
$ Zinc-regulated transporter, iron-regulated transporter-like; * At3g17390 was annotated as SAMS4 in Talke et al. 2006; ** At2g36880 was annotated as SAMS3 in Talke et al. 2006; & NADH-ubiquinone oxidoreductase, mitochondrial respiratory chain complex I; # Coiled Coil (CC) or Toll-Interleukin Receptor (TIR)-Nucleotide Binding Site (NBS)-Leucine Rich Repeat (LRR); + nucleotide-binding (NB) motif-containing domain shared by human APAF-1, certain plant R gene products, and Caenorhabditis elegans CED-4 (ARC)
Fig. 6 Significantly enriched functional classes among (a) copy number expanded and (b) copy number reduced genes of A. halleri. Bars/brackets represent the proportion/number of genes in each functional class among all copy number expanded or reduced genes of A. halleri (black) or A. lyrata (white), and among all nuclear genes represented on the array (grey) for comparison. Asterisks mark statistically significant enrichment by comparison to the set of nuclear genes represented on the ATH1 microarray (Fisher's exact test, P ≤ 0.05, Benjamini-Hochberg correction for multiple testing). The analysis was based on MapMan [51] functional categories (see Additional file 8). Shown are only top-level categories of the ontological hierarchy (see text)
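The enrichment test named in the Fig. 6 caption can be illustrated with a short standard-library sketch. It is not the authors' implementation: it assumes a one-sided (over-representation) form of the exact test, which the caption does not specify, and all function names and gene counts below are hypothetical placeholders rather than values from the paper.

```python
from math import comb


def enrichment_p(k, n, K, N):
    """One-sided exact test for over-representation.

    N: genes on the array, K: array genes in the functional class,
    n: copy number altered genes, k: altered genes in the class.
    Returns the hypergeometric tail probability P(X >= k).
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical counts: (altered genes in class, array genes in class).
ARRAY_GENES, ALTERED_GENES = 1000, 50
classes = {"class_1": (10, 40), "class_2": (3, 60), "class_3": (1, 30)}

raw = [enrichment_p(k, ALTERED_GENES, K, ARRAY_GENES) for k, K in classes.values()]
adj = benjamini_hochberg(raw)
for name, p_adj in zip(classes, adj):
    flag = "*" if p_adj <= 0.05 else ""
    print(f"{name}: adjusted P = {p_adj:.3g} {flag}")
```

The asterisk mirrors the caption's convention of marking classes with an adjusted P ≤ 0.05.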
Fig. 7 Genomic distribution of copy number alterations in A. halleri and A. lyrata, relative to the A. thaliana genome. The five chromosomes of A. thaliana are represented in different colours, with centromeric regions shaded in grey and radial axis labels indicating genomic position in Mbp. Round symbols mark the positions of genes found to be copy number expanded (red) and reduced (blue), respectively, in A. halleri (outer concentric ring) and A. lyrata (inner concentric ring), with radial positions scaled to the log2 signal ratio of the heterologous species vs. A. thaliana. Line symbols mark the positions of 20-kb genomic regions, within which a minimum of 5 kb region is found to be copy number expanded (red) or reduced (blue), respectively. Black line symbols mark positions of at least 3 consecutive concordant red or blue lines, corresponding to 40-kb genomic regions, in which a minimum of 10 kb is found to be copy number expanded or reduced. The centre of the radial plot shows a zoomed 0.1-Mbp region (chromosome 3, 20.8 to 20.9 Mbp), with red circles representing copy number expanded genes in A. halleri (filled circle) and A. lyrata (open circle), labelled by short gene names or AGI codes. Genes (grey) and intergenic regions (pale yellow) are marked; tickmarks are spaced by 1 kb