Timothy Paape, Benjamin Heiniger, Miguel Santo Domingo, Michael R Clear, M Mercedes Lucas, José J Pueyo.
Abstract
Heavy metals are an increasing problem due to contamination from human sources, and they can enter the food chain by being taken up by plants. Understanding the genetic basis of accumulation and tolerance in plants is important for reducing the uptake of toxic metals in crops and crop relatives, as well as for removing heavy metals from soils by means of phytoremediation. Following exposure of Medicago truncatula seedlings to cadmium (Cd) and mercury (Hg), we conducted a genome-wide association study using relative root growth (RRG) and leaf accumulation measurements. Cd and Hg accumulation and RRG had heritability ranging from 0.44 to 0.72, indicating high genetic diversity for these traits. The Cd and Hg trait associations were broadly distributed throughout the genome, indicating that the traits are polygenic and involve several quantitative loci. For all traits, candidate genes included several membrane-associated ATP-binding cassette transporters, P-type ATPase transporters, oxidative stress response genes, and stress-related UDP-glycosyltransferases. The P-type ATPase transporters and ATP-binding cassette protein families have roles in vacuole transport of heavy metals, and our findings support their wide use in physiological plant responses to heavy metals and abiotic stresses. We also found associations between Cd RRG and the genes CAX3 and PDR3, two linked adjacent genes, and between leaf accumulation of Hg and the genes NRAMP6 and CAX9. When plant genotypes with the most extreme phenotypes were compared using population genomics methods, we found significant divergence in genomic regions that contained metal transport and stress response gene ontologies. Several of these genomic regions show high linkage disequilibrium (LD) among candidate genes, suggesting they have evolved together. Minor allele frequency (MAF) and effect size of the most significant SNPs were negatively correlated, with large-effect alleles being the rarest. This is consistent with purifying selection against alleles that increase toxicity and abiotic stress. Conversely, the large-effect alleles that had higher frequencies were associated with the exclusion of Cd and Hg. Overall, macroevolutionary conservation of heavy metal and stress response genes is important for improvement of forage crops by harnessing wild genetic variants in gene banks such as the Medicago HapMap collection.
Keywords: Medicago truncatula (Medicago); cadmium; genetic architecture; mercury; polygenic; standing variation
Year: 2022 PMID: 35154199 PMCID: PMC8832151 DOI: 10.3389/fpls.2021.806949
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
FIGURE 1. Correlation matrix of Cd leaf accumulation, Cd relative root growth (RRG), Hg leaf accumulation, and Hg relative root growth (RRG) measured in the M. truncatula HapMap panel. Distributions of the four phenotypes are shown in the diagonals. The upper off-diagonal panels contain pairwise Pearson’s correlation coefficients (r). The lower off-diagonal panels show the plotted trait values of each M. truncatula genotype measured in our experiment. The red line shows the slope of the correlation between any pair of traits. The x- and y-axes are values of the phenotype measurements [in μg–1 dry weight for leaf, and percent for relative root growth (RRG)].
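The pairwise coefficients reported in a matrix like this one can be reproduced with a few lines of code. The following is a minimal sketch, assuming each trait is stored as a list with one value per genotype; the trait names and numbers are hypothetical placeholders, not measurements from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Return Pearson's r for two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (>= 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("a constant trait has no defined correlation")
    return cov / sqrt(ss_x * ss_y)


def correlation_matrix(traits):
    """Map every ordered pair of trait names to its Pearson's r."""
    names = list(traits)
    return {(a, b): pearson_r(traits[a], traits[b]) for a in names for b in names}


# Hypothetical measurements for five genotypes (not data from the study).
example_traits = {
    "Cd_leaf": [1.2, 2.3, 1.8, 3.1, 2.7],
    "Cd_RRG": [80.0, 55.0, 70.0, 40.0, 52.0],
    "Hg_leaf": [0.4, 0.9, 0.5, 1.1, 1.0],
    "Hg_RRG": [60.0, 45.0, 58.0, 30.0, 41.0],
}

matrix = correlation_matrix(example_traits)
for (a, b), r in matrix.items():
    if a < b:  # print each unordered pair once
        print(f"{a} vs {b}: r = {r:.2f}")
```

Genotypes with a missing value for either trait would need to be dropped pairwise before calling `pearson_r`; that step is omitted here.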
FIGURE 2. Genome-wide association analysis of heavy metal accumulation and tolerance traits (Manhattan plots shown for each trait; output from GAPIT). Each point represents a SNP at each position in the genome. The number on the x-axis is the chromosome (each color corresponds to one of the eight Mt4.0 chromosomes). The y-axis is the negative log10 of the p-values. The higher y-axis values indicate smaller p-values. Next to each Manhattan plot are quantile-quantile (Q-Q) plots which show the model fit with population structure covariance included (orange points), and without population structure covariance (green points). The x-axis of the Q-Q plots is the expected distribution, the y-axis is the empirical distribution.
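The two transformations behind these plots are simple to state in code. The sketch below is illustrative only: it converts p-values to the –log10 scale and pairs each observed value with its expected value under a uniform null. The plotting position (i + 0.5)/n is a common convention assumed here, not necessarily the one GAPIT uses, and the p-values are hypothetical.

```python
from math import log10


def neg_log10(p):
    """Return -log10(p); smaller p-values give larger values."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p-value must be in (0, 1]")
    return -log10(p)


def qq_points(p_values):
    """Return (expected, observed) -log10(p) pairs, most significant first.

    Expected values assume uniformly distributed p-values under the null,
    using the plotting position (i + 0.5) / n (an assumed convention).
    """
    n = len(p_values)
    observed = sorted(p_values)
    return [(neg_log10((i + 0.5) / n), neg_log10(p)) for i, p in enumerate(observed)]


# Hypothetical p-values for five SNPs (not results from the study).
for expected, observed in qq_points([0.8, 0.03, 0.4, 1e-6, 0.2]):
    print(f"expected {expected:.2f}  observed {observed:.2f}")
```

Points that rise well above the diagonal (observed greater than expected) across most of the range usually indicate inflation, which is what the comparison with and without population structure covariance is meant to reveal.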
TABLE 1. Candidate genes for each trait within 1 kb proximity to the top 50 SNPs and a subset of the top 1000 SNPs identified by GWAS.
| Rank | Chr | Gene Name | Position | P-value | Location | Gene Description |
| Cd leaf accumulation | | | | | | |
| 5 | 8 | Medtr8g064490 | 27050814 | 6.72E-08 | intergenic | 23S rRNA m2A2503 methyltransferase |
| 6 | 8 | Medtr8g469310 | 25178759 | 7.84E-08 | intergenic | peptide/nitrate transporter plant |
| 10 | 7 | Medtr7g009740 | 2221211 | 1.68E-07 | intron | translation factor GUF1-like protein |
| 22 | 3 | Medtr3g036200 | 13449844 | 4.18E-07 | downstream | transmembrane protein |
| 26 | 8 | Medtr8g465310 | 23246354 | 4.94E-07 | synonymous | transmembrane amino acid transporter family protein |
| 29 | 3 | Medtr3g093290 | 42636439 | 5.37E-07 | downstream | MFS transporter |
| 32 | 3 | Medtr3g031940 | 10032134 | 6.01E-07 | synonymous | LRR and NB-ARC domain disease resistance protein |
| 40 | 6 | Medtr6g016640 | 6335718 | 9.40E-07 | intron | proline dehydrogenase |
| 43 | 1 | Medtr1g033940 | 12271020 | 1.07E-06 | intergenic | transmembrane protein |
| 47 | 8 | Medtr8g090295 | 37943857 | 1.48E-06 | intergenic | HCC2, HOMOLOGUE OF COPPER CHAPERONE SCO1 2 |
| 52 | 7 | Medtr7g092070 | 36458974 | 1.58E-06 | downstream | OXS2, OXIDATIVE STRESS 2 |
| 605 | 5 | Medtr5g094830 | 41452129 | 3.34E-05 | synonymous | ABCC3; transmembrane ATPase activity |
| Cd relative root growth | | | | | | |
| 4 | 1 | Medtr1g103200 | 46692340 | 3.17E-09 | downstream | transmembrane protein, putative |
| 5 | 6 | Medtr6g090260 | 34282023 | 4.18E-09 | missense | Fe(II)-dependent oxygenase superfamily protein |
| 7 | 3 | Medtr3g015500 | 4499608 | 9.09E-09 | downstream | NBS-LRR type disease resistance protein |
| 11 | 1 | Medtr1g100787 | 45717235 | 1.75E-08 | intergenic | LRR receptor-like kinase family protein |
| 24 | 7 | Medtr7g033125 | 11718492 | 3.78E-08 | intron | 2-isopropylmalate synthase |
| 25 | 4 | Medtr4g104990 | 43495277 | 4.00E-08 | intergenic | PB1 domain protein |
| 26 | 5 | Medtr5g026510 | 10889054 | 4.47E-08 | downstream | LRR receptor-like kinase |
| 27 | 2 | Medtr2g438720 | 15656689 | 5.61E-08 | missense | ankyrin repeat plant-like protein |
| 28 | 3 | Medtr3g089940 | 40881882 | 5.83E-08 | intron | zinc-binding alcohol dehydrogenase family protein |
| 36 | 3 | Medtr3g088845 | 40664333 | 7.88E-08 | intron | 2-oxoisovalerate dehydrogenase subunit alpha |
| 41 | 3 | Medtr3g088955 | 40721568 | 7.88E-08 | synonymous | flavin containing monooxygenase YUCCA10-like protein |
| 45 | 3 | Medtr3g015500 | 4499656 | 8.43E-08 | downstream | NBS-LRR type disease resistance protein |
| 148 | 8 | Medtr8g015980 | 5289479 | 4.56E-07 | missense | ABCC8; transmembrane ATPase activity |
| 192 | 5 | Medtr5g070320 | 29773843 | 6.99E-07 | intron | PDR3; pleiotropic drug resistance 3 |
| 340 | 5 | Medtr5g070330 | 29783407 | 1.96E-06 | intron | CAX3; cation exchanger 3 |
| 363 | 2 | Medtr2g069090 | 28697944 | 2.08E-06 | downstream | Fe(II)-dependent oxygenase superfamily protein |
| 612 | 3 | Medtr3g090170 | 40983523 | 4.49E-06 | intron | ATVPS45; vacuolar protein sorting 45 |
| 797 | 2 | Medtr2g019020 | 6102461 | 7.32E-06 | synonymous | ABCC2; transmembrane ATPase activity |
| 944 | 2 | Medtr2g095480 | 40788857 | 9.99E-06 | upstream | HIPP3; Heavy metal transport/detoxification superfamily protein |
| Hg leaf accumulation | | | | | | |
| 1 | 4 | Medtr4g080140 | 31078608 | 3.47E-10 | intergenic | ASP1; aspartate aminotransferase |
| 3 | 6 | Medtr6g017165 | 6799845 | 1.05E-08 | intron | allene oxide cyclase |
| 5 | 7 | Medtr7g080880 | 30810684 | 2.68E-08 | downstream | SPO22/ZIP4-like meiosis protein |
| 19 | 3 | Medtr3g088460 | 40127485 | 1.15E-07 | intron | NRAMP metal ion transporter 6 |
| 25 | 2 | Medtr2g015690 | 4673621 | 2.16E-07 | downstream | phospholipase A1 |
| 27 | 7 | Medtr7g105570 | 42817932 | 2.28E-07 | intergenic | enoyl-(acyl carrier) reductase |
| 33 | 8 | Medtr8g056790 | 18970685 | 2.81E-07 | intergenic | photosystem II CP43 chlorophyll apoprotein |
| 35 | 8 | Medtr8te073680 | 31194688 | 2.96E-07 | downstream | Serine/Threonine-kinase Cx32, related protein |
| 43 | 8 | Medtr8g469600 | 25322097 | 3.80E-07 | downstream | LRR receptor-like kinase family protein |
| 513 | 2 | Medtr2g036380 | 15751342 | 8.27E-06 | intron | HMA2; heavy metal atpase 2 |
| 917 | 5 | Medtr5g033320 | 14374444 | 1.96E-05 | intron | ABCC3; transmembrane ATPase activity |
| Hg relative root growth | | | | | | |
| 1 | 2 | Medtr2g083420 | 35018740 | 8.10E-10 | missense | UDP-glucosyltransferase family protein |
| 4 | 3 | Medtr3g087150 | 39520330 | 1.15E-07 | intergenic | plant/T5J17-70 protein, putative |
| 7 | 8 | Medtr8g465580 | 23392640 | 6.07E-07 | synonymous | S-locus lectin kinase family protein |
| 10 | 8 | Medtr8g091330 | 38065096 | 7.59E-07 | downstream | phosphoglycerate/bisphosphoglycerate mutase family protein |
| 12 | 4 | Medtr4g020850 | 6714547 | 9.31E-07 | missense | disease resistance protein (TIR-NBS-LRR class) |
| 15 | 2 | Medtr2te036520 | 15806363 | 1.04E-06 | downstream | galactose oxidase/kelch repeat protein |
| 18 | 3 | Medtr3g040680 | 14343459 | 1.08E-06 | intergenic | non-specific phospholipase C4 |
| 31 | 3 | Medtr3g064650 | 29135942 | 1.64E-06 | synonymous | transmembrane protein, putative |
| 34 | 7 | Medtr7g094500 | 37617280 | 1.69E-06 | intron | u6 snRNA-associated-like-Sm protein |
| 58 | 4 | Medtr4g116010 | 47960277 | 2.51E-06 | intergenic | Fe(II)-dependent oxygenase superfamily protein |
| 81 | 2 | Medtr2te008720 | 1586136 | 3.51E-06 | downstream | Peroxidase superfamily protein |
| 108 | 5 | Medtr5g073020 | 31072945 | 6.78E-06 | intron | Cytochrome P450, family 71, subfamily B, polypeptide 23 |
| 270 | 3 | Medtr3g014705 | 4234570 | 1.27E-05 | intron | Zinc finger (CCCH-type) family protein / RNA recognition motif |
| 307 | 1 | Medtr1g084780 | 37746862 | 1.47E-05 | downstream | H+ ATPase 2 |
| 351 | 1 | Medtr1g063920 | 28084622 | 1.66E-05 | intron | non-intrinsic ABC protein 12 |
| 357 | 4 | Medtr4g076900 | 29426248 | 1.67E-05 | intron | ABC-2 type transporter family protein |
Columns from left to right are Rank (rank order assigned based on the top 50 and top 1000 SNPs), chromosome (Chr), Gene Name (Mt4.0 gene ID), Position on the chromosome, P-value, Location in proximity of the gene, and Gene Description. Within each trait, SNPs are sorted smallest to largest p-value and rank.
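The 1 kb proximity rule used to assign candidate genes can be expressed as an interval check. The sketch below is one plausible reading of that rule (a SNP counts as near a gene if it falls within the gene body extended by 1,000 bp on each side); the function name and data layout are hypothetical. The SNP position is taken from the Medtr2g438720 row above, and the gene coordinates are the Mt4.0 locations listed for the chromosome 2 ankyrin repeat genes.

```python
def genes_near_snp(snp_chrom, snp_pos, genes, window=1000):
    """Return IDs of genes whose span, padded by `window` bp, contains the SNP."""
    hits = []
    for gene_id, chrom, start, end in genes:
        if chrom != snp_chrom:
            continue
        if start - window <= snp_pos <= end + window:
            hits.append(gene_id)
    return hits


# (gene ID, chromosome, start, end) using Mt4.0 coordinates.
GENES = [
    ("Medtr2g438700", "chr2", 15644925, 15649215),
    ("Medtr2g438720", "chr2", 15654113, 15660513),
    ("Medtr2g438740", "chr2", 15661862, 15664982),
]

# Missense SNP at chr2:15656689 reported for Cd relative root growth.
print(genes_near_snp("chr2", 15656689, GENES))  # ['Medtr2g438720']
```

With the default 1 kb window only the gene containing the SNP is returned; the neighboring ankyrin repeat genes lie more than 5 kb away.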
FIGURE 3. The number of genes with annotations in categories for ATPase, ion transport, metal ion, and stress-related genes within 1 kb of the 1000 most significant SNPs using GAPIT for each of the four traits (A). The number of genes in these four categories in the upper 2% of the highest Fst sliding windows (B), and the number of genes in these four categories in the 2% of sliding windows with the highest XP-CLR values (C). The y-axis represents the number of genes in each category.
FIGURE 4. Manhattan plot of Cd relative root growth (RRG) associations on chromosome 2 using GAPIT (A). The highlighted cluster of SNPs (horizontal black bars) contains multiple ankyrin repeat genes (see Table 2). Each point represents a SNP; the x-axis signifies the position on the chromosome, while the y-axis is the –log10 of the p-value. Pairwise linkage disequilibrium (LD) among the 100 most significant SNPs in the peak is shown as a heatmap on the right, with the most significant SNPs and Mt4.0 genes in this region shown above the heatmap. The black lines mark the position of each SNP in the peak; dark blue signifies no LD, and green to yellow signifies high to maximum LD. Weighted Fst statistics from sliding window analysis (100 kb windows) on chromosome 2 with the GWAS peak marked in gray (B). Higher Fst indicates genomic differentiation based on the groups of plant genotypes with the highest vs. lowest Cd RRG values. XP-CLR is a two-group comparison of selective sweeps using plant genotypes with the highest vs. lowest Cd RRG values (C). The dashed black line represents the mean Fst or CLR values for the chromosome, the dashed red line represents windows ≥ 5% of the highest Fst or CLR values for the chromosome.
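Each cell of an LD heatmap like the one described for this peak holds a pairwise statistic between two SNPs. A minimal sketch of the widely used r² measure follows, assuming genotypes are coded 0/1 per accession (reasonable for largely homozygous inbred lines, but an assumption here) and that missing calls have already been removed. The source does not state which LD statistic or software produced the heatmap.

```python
def ld_r2(snp_a, snp_b):
    """Return r^2 between two SNPs given 0/1 allele calls per accession."""
    if len(snp_a) != len(snp_b) or not snp_a:
        raise ValueError("need two non-empty genotype vectors of equal length")
    n = len(snp_a)
    p_a = sum(snp_a) / n                      # frequency of allele 1 at SNP A
    p_b = sum(snp_b) / n                      # frequency of allele 1 at SNP B
    p_ab = sum(1 for x, y in zip(snp_a, snp_b) if x == 1 and y == 1) / n
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined for a monomorphic SNP")
    return d * d / denom


# Hypothetical genotype vectors for four accessions.
print(ld_r2([0, 0, 1, 1], [0, 0, 1, 1]))  # 1.0 (complete LD)
print(ld_r2([0, 0, 1, 1], [0, 1, 0, 1]))  # 0.0 (no LD)
```

Applying `ld_r2` to every pair among the top SNPs in a peak gives the symmetric matrix that a heatmap visualizes.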
TABLE 2. Genes in chromosome 2 genomic regions associated with Cd relative root growth (Cd RRG) that were identified using GWAS, Fst and XP-CLR.
| Mt4.0 gene ID | BLAST gene function | Mt4.0 location |
| GWAS | | |
| Medtr2g438720 | ankyrin repeat plant-like protein | chr2:15654113..15660513 |
| Medtr2g438740 | ankyrin repeat plant-like protein | chr2:15661862..15664982 |
| Medtr2g438760 | ankyrin repeat plant-like protein | chr2:15678476..15683145 |
| Medtr2g438700 | ankyrin repeat plant-like protein | chr2:15644925..15649215 |
| Medtr2g438560 | hypothetical protein | chr2:15587523..15589330 |
| Medtr2g438580 | ankyrin repeat protein | chr2:15601177..15601431 |
| Medtr2g438670 | hypothetical protein | chr2:15639667..15639882 |
| Fst | | |
| Medtr2g436680 | ABC transporter protein (ABCC14, AT3G62700.1) | chr2:14273804..14283871 |
| Medtr2g436710 | ABC transporter protein (ABCC14, AT3G62700.1) | chr2:14295748..14305736 |
| Medtr2g436730 | ABC transporter protein (ABCC14, AT3G62700.1) | chr2:14312635..14321998 |
| Medtr2g436830 | heavy metal transport (ATHMP10, AT1G56210.1) | chr2:14380972..14382489 |
| Medtr2g437380 | 2OG-Fe(II) oxygenase family oxidoreductase | chr2:14380972..14382489 |
| Medtr2g035840 | heavy metal P-type ATPase (HMA7, AT5G44790.1) | chr2:15206300..15212503 |
| Medtr2g438160 | potassium transporter 12 (AT1G31120.1) | chr2:15394059..15398083 |
| XP-CLR | | |
| Medtr2g436680 | ABC transporter protein (ABCC14, AT3G62700.1) | chr2:14273804..14283871 |
| Medtr2g436710 | ABC transporter protein (ABCC14, AT3G62700.1) | chr2:14295748..14305736 |
| Medtr2g436730 | ABC transporter protein (ABCC14, AT3G62700.1) | chr2:14312635..14321998 |
| Medtr2g436830 | heavy metal transport (ATHMP10, AT1G56210.1) | chr2:14380972..14382489 |
| Medtr2g437380 | 2OG-Fe(II) oxygenase family oxidoreductase | chr2:14380972..14382489 |
Columns from left to right are Mt4.0 gene ID, gene function identified by BLAST to A. thaliana, and gene location with start and stop coordinates based on the Mt4.0 genome.
FIGURE 5. Manhattan plot of Cd relative root growth (RRG) associations for chromosome 5 using GAPIT (A). The region on chromosome 5 associated with Cd RRG contained MtCAX3 (Medtr5g070330), MtPDR3 (Medtr5g070320), MtCPT7 (Medtr5g070270) and MtDDB2 (Medtr5g070310). The SNPs in genic and intergenic regions are in high LD [pairwise linkage disequilibrium (LD) among the 100 most significant SNPs in the peak]. Each point represents a SNP; the x-axis signifies the position on the chromosome, while the y-axis is the –log10 of the p-value. The Cd RRG phenotype was log2 transformed to normalize the data prior to running the GWAS. The GWAS region is highlighted in gray in the Fst (B) and XP-CLR (C) sliding window (100 kb windows) analyses. The x-axis represents the chromosomal position, the y-axis the Fst or XP-CLR value of the window at the corresponding position. Higher Fst indicates population differentiation, higher XP-CLR indicates the presence of a selective sweep. The dashed black line represents the mean Fst or CLR values for the chromosome, the dashed red line represents windows ≥ 5% of the highest Fst or CLR values for the chromosome.
FIGURE 6. Manhattan plot of Hg leaf accumulation for chromosome 8 using GEMMA (A). Each point represents a SNP; the x-axis signifies the position on the chromosome, while the y-axis is the –log10 of the p-value. The Cd RRG phenotype was log2 transformed to normalize the data prior to running the GWAS. The GWAS region is highlighted in gray in the Fst (B) and XP-CLR (C) sliding window (100 kb windows) analyses. The x-axis represents the chromosomal position, the y-axis the Fst or XP-CLR value of the window at the corresponding position. Higher Fst indicates population differentiation, higher XP-CLR indicates the presence of a selective sweep. The dashed black line represents the mean Fst or CLR values for the chromosome, the dashed red line represents windows ≥ 5% of the highest Fst or CLR values for the chromosome.
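The two reference lines in the sliding-window panels (the chromosome mean and a cutoff for the most extreme windows) can be derived from the per-window statistics alone. The sketch below assumes the red line marks the top 5% of windows by value, which is one reading of the caption; the window values are hypothetical and the function names are illustrative.

```python
from math import ceil


def chromosome_mean(values):
    """Mean of the per-window statistic (Fst or XP-CLR) across a chromosome."""
    if not values:
        raise ValueError("no windows supplied")
    return sum(values) / len(values)


def top_windows(values, fraction=0.05):
    """Return indices of the windows in the top `fraction` by statistic value."""
    if not values or not 0 < fraction <= 1:
        raise ValueError("need windows and a fraction in (0, 1]")
    k = ceil(fraction * len(values))          # number of windows to keep
    ranked = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    return sorted(ranked[:k])                 # report in chromosomal order


# Hypothetical Fst values for twenty consecutive 100 kb windows.
fst = [0.02, 0.05, 0.01, 0.40, 0.03, 0.07, 0.02, 0.04, 0.06, 0.01,
       0.25, 0.03, 0.02, 0.05, 0.04, 0.01, 0.02, 0.03, 0.06, 0.02]

print(chromosome_mean(fst))
print(top_windows(fst, fraction=0.05))   # index of the single most extreme window
print(top_windows(fst, fraction=0.10))
```

A GWAS peak supported by these analyses is one whose coordinates overlap a window returned by `top_windows`.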
FIGURE 7. Correlations between minor allele frequency (MAF; x-axis) and SNP effect size (y-axis) in the top 100 (left four panels) and top 1000 (right four panels) most significant SNPs (effect sizes estimated from GAPIT). Pearson’s correlation coefficients are shown in the upper right corners.
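The quantities compared in these panels can be computed directly from genotype calls and GWAS effect estimates. The following sketch assumes 0/1 allele coding with `None` for missing calls and uses absolute effect sizes; both choices, and all numbers shown, are illustrative assumptions rather than details taken from the study.

```python
from math import sqrt


def minor_allele_frequency(calls):
    """Return the MAF for one SNP from 0/1 allele calls (None = missing)."""
    called = [c for c in calls if c is not None]
    if not called:
        raise ValueError("no called genotypes")
    freq = sum(called) / len(called)
    return min(freq, 1 - freq)


def pearson_r(x, y):
    """Pearson's r between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


# Hypothetical genotype calls for four SNPs across ten accessions.
snps = [
    [1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 0, 0, 0, 0, None],
]
abs_effects = [3.0, 2.0, 1.2, 0.5]  # hypothetical |effect size| per SNP

mafs = [minor_allele_frequency(s) for s in snps]
print(mafs)
print(pearson_r(mafs, abs_effects))  # negative: rarer alleles, larger effects
```

A negative coefficient in this toy example mirrors the pattern described for the most significant SNPs, where the rarest alleles carry the largest effects.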
FIGURE 8. Comparison of Tajima’s D from the genomic background and regions containing SNPs from the GWAS for the traits Cd leaf accumulation (A), Cd relative root growth (RRG) (B), Hg leaf accumulation (C), and Hg RRG (D). Tajima’s D was estimated using 10 bp regions surrounding focal SNPs.
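Tajima's D compares two estimates of diversity in a region: the mean number of pairwise differences and the number of segregating sites scaled by sample size. The sketch below implements the standard formula (Tajima, 1989) for a single region; it illustrates the statistic and is not the software used in the study. The helper for pairwise diversity assumes biallelic sites summarized as counts of one allele, and the example counts are hypothetical.

```python
from math import sqrt


def pairwise_diversity(allele_counts, n):
    """Mean pairwise differences (pi) from per-site counts of one allele among n sequences."""
    return sum(2 * k * (n - k) / (n * (n - 1)) for k in allele_counts)


def tajimas_d(n, seg_sites, pi):
    """Tajima's D for n sequences, seg_sites segregating sites, and diversity pi."""
    if n < 4 or seg_sites < 1:
        raise ValueError("need at least 4 sequences and 1 segregating site")
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Hypothetical region: 10 sequences, 16 segregating sites, all singletons.
counts = [1] * 16
pi = pairwise_diversity(counts, n=10)
print(tajimas_d(10, len(counts), pi))  # negative: excess of rare variants
```

Negative values indicate an excess of rare variants relative to neutral expectation, the pattern expected under purifying selection or recent population expansion, while positive values indicate an excess of intermediate-frequency variants.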