Qingzhang Du, Baohua Xu, Wei Pan, Chenrui Gong, Qingshi Wang, Jiaxing Tian, Bailian Li, Deqiang Zhang.
Abstract
Lignocellulosic biomass from trees provides a renewable feedstock for biofuels, lumber, pulp, paper, and other uses. Dissecting the mechanism underlying natural variation of the complex traits controlling growth and lignocellulose biosynthesis in trees can enable marker-assisted breeding to improve wood quality and yield. Here, we combined linkage disequilibrium (LD)-based association analysis with traditional linkage analysis to detect the genetic effect of a Populus tomentosa cellulose synthase gene, PtoCesA4. PtoCesA4 is strongly expressed in developing xylem and leaves. Nucleotide diversity and LD in PtoCesA4, sampled from the P. tomentosa natural distribution, revealed that PtoCesA4 harbors high single nucleotide polymorphism (SNP) diversity (πT = 0.0080 and θw = 0.0098) and low LD (r² ≥ 0.1, within 1400 bp), demonstrating the potential of a candidate-gene-based LD approach for understanding the molecular basis of quantitative variation in this species. By combining single SNP, multi-SNP, and haplotype-based associations in an association population of 460 individuals with single SNP linkage analysis in a family-based linkage population (1200 individuals), we identified three strong associations (false discovery rate Q < 0.05) in both populations. These include two nonsynonymous markers (SNP49 associated with α-cellulose content and SNP59 associated with fiber width) and a noncoding marker (SNP18 associated with α-cellulose content). Variation in RNA transcript abundance among genotypic classes of SNP49 was confirmed in these two populations. Therefore, combining different methods allowed us to examine functional PtoCesA4 allelic variation underlying natural variation in complex quantitative traits related to growth and lignocellulosic biosynthesis.
Keywords: Populus tomentosa; RNA transcript analysis; linkage analysis; linkage disequilibrium; multi-locus association models; single nucleotide polymorphism
Year: 2013 PMID: 24048648 PMCID: PMC3815066 DOI: 10.1534/g3.113.007724
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Figure 1. PtoCesA4 gene structure and the positions of common SNPs (minor allele frequencies > 0.10). All common SNPs are represented by dark spots; putative transcription factor binding sites around SNPs in the PtoCesA4 promoter were predicted, and numbers above the promoter region indicate the positions of putative transcription factor binding sites in base pairs relative to the predicted transcription start site. (A) Zinc-binding domain. (B–I) Two transmembrane helices in the N-terminal region and six in the C-terminal region.
Figure 2. Relative transcript levels of PtoCesA4 in Populus tomentosa tissues and organs. The error bars represent ±SD.
Nucleotide polymorphism at the PtoCesA4 locus
| Region | Length (bp) | No. of Polymorphic Sites | Frequency (bp⁻¹) | Transitions/Transversions | Nucleotide Diversity (π) | Nucleotide Diversity (θw) |
|---|---|---|---|---|---|---|
| Promoter | 1111 | 86 | 13 | 1.676 | 0.0230 | 0.0246 |
| 5′UTR | 297 | 9 | 33 | 3.500 | 0.0053 | 0.0104 |
| Exon1 | 51 | 0 | — | 0 | 0 | 0 |
| Intron1 | 93 | 5 | 19 | 1.500 | 0.0060 | 0.0126 |
| Exon2 | 202 | 6 | 34 | 0.500 | 0.0097 | 0.0105 |
| Intron2 | 82 | 6 | 14 | 1.000 | 0.0083 | 0.0172 |
| Exon3 | 125 | 2 | 63 | 1.000 | 0.0047 | 0.0056 |
| Intron3 | 299 | 10 | 30 | 1.500 | 0.0058 | 0.0080 |
| Exon4 | 67 | 1 | 67 | — | 0.0072 | 0.0035 |
| Intron4 | 96 | 3 | 32 | — | 0.0016 | 0.0074 |
| Exon5 | 151 | 1 | 151 | — | 0.0033 | 0.0016 |
| Intron5 | 79 | 6 | 13 | 5.000 | 0.0160 | 0.0179 |
| Exon6 | 613 | 11 | 56 | 4.500 | 0.0042 | 0.0058 |
| Intron6 | 173 | 9 | 19 | 8.000 | 0.0075 | 0.0164 |
| Exon7 | 138 | 3 | 46 | 2.000 | 0.0041 | 0.0051 |
| Intron7 | 92 | 0 | — | — | 0 | 0 |
| Exon8 | 126 | 4 | 32 | 3.000 | 0.0016 | 0.0075 |
| Intron8 | 88 | 4 | 22 | — | 0.0028 | 0.0134 |
| Exon9 | 213 | 5 | 43 | 1.500 | 0.0024 | 0.0055 |
| Intron9 | 125 | 4 | 31 | 0.333 | 0.0061 | 0.0076 |
| Exon10 | 510 | 11 | 46 | 10.000 | 0.0065 | 0.0065 |
| Intron10 | 294 | 8 | 37 | 1.667 | 0.0084 | 0.0059 |
| Exon11 | 351 | 4 | 88 | 3.000 | 0.0008 | 0.0027 |
| Intron11 | 132 | 5 | 26 | 4.000 | 0.0086 | 0.0089 |
| Exon12 | 582 | 7 | 83 | 6.000 | 0.0031 | 0.0036 |
| 3′UTR | 331 | 8 | 41 | 1.667 | 0.0064 | 0.0057 |
| Total silent | 3856.32 | 210 | 18 | 2.060 | 0.0124 | 0.0152 |
| Synonymous | 714.32 | 47 | 15 | 4.875 | 0.0149 | 0.0194 |
| Nonsynonymous | 2411.68 | 8 | 301 | 3.000 | 0.0009 | 0.0010 |
| Total exon | 3129 | 55 | 57 | 2.500 | 0.0040 | 0.0048 |
| Total intron | 1553 | 60 | 26 | 2.588 | 0.0065 | 0.0105 |
| Total | 6421 | 218 | 29 | 2.169 | 0.0080 | 0.0098 |
Regions containing indels are excluded from the calculation.
Total silent indicates synonymous plus noncoding sites.
Total indicates silent sites plus nonsynonymous sites.
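As a rough illustration of the two diversity statistics reported in this table, the sketch below computes π (average pairwise differences per site) and Watterson's θw from a small aligned sample. The function names and the toy alignment are hypothetical and not taken from the study; the code assumes equal-length, gap-free sequences, and it does not reproduce the study's handling of indel regions or of silent versus nonsynonymous sites.

```python
from itertools import combinations


def segregating_sites(seqs):
    """Count alignment columns that carry more than one distinct base."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def nucleotide_diversity(seqs):
    """pi: mean number of pairwise differences, divided by alignment length."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * length)


def watterson_theta(seqs):
    """theta_w per site: S / (a_n * L), with a_n = sum of 1/i for i = 1..n-1."""
    n = len(seqs)
    a_n = sum(1.0 / i for i in range(1, n))
    return segregating_sites(seqs) / (a_n * len(seqs[0]))


# Hypothetical toy alignment: 4 sequences, 10 sites, 2 segregating sites.
toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC", "ACGTACGTAC"]
pi = nucleotide_diversity(toy)
theta_w = watterson_theta(toy)
```

When π falls below θw, as it does for most regions in the table, the sample carries an excess of low-frequency variants relative to neutral expectation, which is also what a negative Tajima's D reflects.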
Summary of nucleotide variations for PtoCesA4 in Populus tomentosa natural populations from three climatic regions
| Population | N | S | Sxl | πtot | πsil | πs | πn | Tajima’s D | Fu and Li’s D |
|---|---|---|---|---|---|---|---|---|---|
| Northeastern region | 14 | 166 | 26 | 0.0074 | 0.0114 | 0.0130 | 0.0008 | −0.5013 | −0.4495 |
| Southern region | 13 | 149 | 5 | 0.0085 | 0.0131 | 0.0166 | 0.0010 | 0.4973 | −0.1296 |
| Northwestern region | 13 | 144 | 5 | 0.0082 | 0.0127 | 0.0141 | 0.0010 | 0.5355 | 0.0091 |
| Total | 40 | 218 | — | 0.0080 | 0.0124 | 0.0149 | 0.0009 | −0.6498 | −2.2043 |
N, number of sequences sampled; S, number of segregating sites; Sxl, polymorphic exclusive biallelic mutations in the studied group; πtot, average nucleotide diversity in full gene; πsil, average nucleotide diversity in synonymous and noncoding sites; πs, average nucleotide diversity of synonymous mutation; πn, average nucleotide diversity of nonsynonymous mutation.
Figure 3. The decay of short-range linkage disequilibrium within PtoCesA4 for all samples and each climatic region. Pairwise correlations between SNPs are plotted against the physical distance between the SNPs in base pairs. The curves describe the nonlinear regressions of r² (E(r²)) onto the physical distance in base pairs.
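The pairwise correlation plotted in this figure is the squared allelic correlation r² between two SNPs. A minimal sketch of that statistic follows, assuming phased haplotypes with alleles coded 0/1; the function name and the example data are hypothetical, and the nonlinear regression used to draw the decay curves is not reproduced here.

```python
def r_squared(hap_a, hap_b):
    """r^2 between two biallelic SNPs, given per-haplotype alleles coded 0/1."""
    n = len(hap_a)
    p_a = sum(hap_a) / n                      # frequency of allele 1 at SNP A
    p_b = sum(hap_b) / n                      # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    return d * d / denom


# Hypothetical haplotypes (8 chromosomes) at two SNPs.
snp_a = [0, 0, 1, 1, 1, 1, 0, 0]
snp_b = [0, 0, 1, 1, 1, 0, 0, 1]
ld = r_squared(snp_a, snp_b)
```

Computing this value for every SNP pair and plotting it against the distance between the pair gives the scatter that the decay curves summarize.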
Summary of significant SNP marker–trait pairs identified in the Populus tomentosa association population using the mixed linear model after a correction for multiple testing
| Trait | Locus | Position | Mutation | Fst | P-value | Q-value | R² (%) | 2a | d | d/a | 2a/sp | Frequency | Additive effect |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Lignin | SNP44 | Intron 1 | [G:T] | 0.057 | 0.0012 | 0.0551 | 2.1 | 3.42 | −0.98 | −0.5725 | 3.8320 | 0.15 (T) | 0.8880 |
| | SNP49 | Exon 3 | [C:A] ns | 0.114 | 0.0025 | 0.0810 | 3.0 | 1.58 | −0.35 | −0.4381 | 1.7746 | 0.14 (A) | −0.387 |
| α-Cellulose | SNP3 | Promoter | [G:A] | 0.077 | 0.0015 | 0.0629 | 2.2 | 4.79 | 0.84 | 0.3512 | 1.8635 | 0.11 (A) | 2.7436 |
| | SNP18 | Promoter | [A:T] | 0.039 | 0.0011 | 0.0551 | 2.5 | 0.44 | 1.81 | 8.1825 | 0.1720 | 0.45 (T) | −0.1896 |
| | SNP41 | 5′UTR | [C:T] | 0.100 | 3.02E-05 | 0.0035 | 1.6 | 0.31 | −0.60 | −3.8387 | 0.1206 | 0.17 (T) | 0.5131 |
| | SNP49 | Exon 3 | [C:A] ns | 0.114 | 0.0031 | 0.0948 | 5.3 | 2.02 | −0.19 | −0.1881 | 0.7857 | 0.14 (A) | 0.0466 |
| Holocellulose | SNP45 | Exon 2 | [C:A] s | 0.050 | 0.0002 | 0.0142 | 4.0 | 1.31 | 0.61 | 0.9309 | 0.3429 | 0.16 (A) | 0.8728 |
| | SNP81 | Intron 10 | [T:C] | 0.052 | 0.0002 | 0.0142 | 3.0 | 0.61 | 0.06 | 0.1803 | 0.1597 | 0.46 (C) | −0.0412 |
| Fiber width | SNP59 | Exon 6 | [A:C] ns | 0.130 | 0.0008 | 0.0440 | 2.6 | 0.71 | −0.12 | −0.3426 | 0.3567 | 0.38 (A) | −0.2106 |
| Diameter at breast height (D) | SNP48 | Intron 2 | [A:T] | 0.091 | 0.0009 | 0.0454 | 1.9 | 2.12 | −3.05 | −2.8770 | 0.3695 | 0.44 (T) | −0.1966 |
| | SNP75 | Exon 10 | [T:C] s | 0.036 | 3.15E-05 | 0.0035 | 3.2 | 0.83 | 0.13 | 0.3206 | 0.1452 | 0.43 (C) | 0.2125 |
| | SNP81 | Intron 10 | [T:C] | 0.052 | 0.0003 | 0.0195 | 2.0 | 2.05 | 0.14 | 0.1388 | 0.3567 | 0.46 (C) | −0.5694 |
| Tree height (H) | SNP49 | Exon 3 | [C:A] ns | 0.114 | 0.0012 | 0.0551 | 2.3 | 0.89 | −0.09 | −0.191 | 0.3053 | 0.14 (A) | 0.0198 |
| Stem volume (V) | SNP75 | Exon 10 | [T:C] s | 0.036 | 3.02E-05 | 0.0035 | 2.6 | 0.06 | 0.01 | 0.1453 | 0.1479 | 0.43 (C) | 0.0164 |
Fst indicates variation attributable to differentiation among subpopulations. R² indicates percentage of the phenotypic variance explained. P-value indicates significance level for association (significance is P ≤ 0.05). Q-value indicates a correction for multiple testing (false discovery rate (Q) ≤ 0.10). ns, nonsynonymous polymorphism; s, synonymous polymorphism.
2a: calculated as the difference between the phenotypic means observed within each homozygous class ($2a = |G_{BB} - G_{bb}|$, where $G_{ij}$ is the trait mean in the ijth genotypic class).
d: calculated as the difference between the phenotypic mean observed within the heterozygous class and the average phenotypic mean across both homozygous classes [$d = G_{Bb} - 0.5(G_{BB} + G_{bb})$, where $G_{ij}$ is the trait mean in the ijth genotypic class].
$s_p$: SD for the phenotypic trait under consideration.
Frequency: allele frequency of either the derived or the minor allele. Single nucleotide polymorphism (SNP) alleles corresponding to the frequency listed are given in parentheses.
Additive effect: calculated as $a = p_B(G_{BB}) + p_b(G_{Bb}) - G$, where $G$ is the overall trait mean, $G_{ij}$ is the trait mean in the ijth genotypic class, and $p_i$ is the frequency of the ith marker allele. These values were always calculated with respect to the minor allele.
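The gene-action quantities defined in these footnotes can be sketched directly from the three genotypic class means. The function name and the trait means below are hypothetical values chosen for illustration, not data from the study; the ratio d/a is taken with a equal to half of 2a, which is how the tabulated values appear to relate to one another.

```python
def gene_action(mean_BB, mean_Bb, mean_bb, trait_sd):
    """Return 2a, d, d/a, and 2a/sp from genotypic class means and the trait SD."""
    two_a = abs(mean_BB - mean_bb)               # 2a = |G_BB - G_bb|
    d = mean_Bb - 0.5 * (mean_BB + mean_bb)      # d = G_Bb - 0.5 (G_BB + G_bb)
    a = two_a / 2.0
    if a == 0:
        raise ValueError("homozygous class means are equal; d/a is undefined")
    return {"2a": two_a, "d": d, "d/a": d / a, "2a/sp": two_a / trait_sd}


# Hypothetical class means for a trait with SD 2.0.
effects = gene_action(mean_BB=40.0, mean_Bb=38.5, mean_bb=36.0, trait_sd=2.0)
```

A d/a ratio near zero means the heterozygote sits midway between the homozygotes (additive action), while a magnitude well above 1 means the heterozygote falls outside the range of the two homozygotes.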
Figure 4. Genotypic effects of the significant single nucleotide polymorphisms (SNPs) in PtoCesA4 on the same phenotypic trait in association and linkage populations. The marker SNP49 in exon 3 of PtoCesA4, a nonsynonymous mutation that changes the encoded amino acid from His to Asn, was significantly associated with α-cellulose content in both the association and linkage populations. AA homozygotes were associated with higher α-cellulose values, CC homozygotes with lower values, and AC heterozygotes with intermediate mean values in both populations, consistent with an additive effect of SNP49 on cellulose content. The nonsynonymous marker SNP59 in exon 6 of PtoCesA4 was significantly associated with fiber width in both populations and shows patterns of gene action consistent with additive effects on fiber width. The A allele at SNP59 causes a Ser-to-Tyr amino acid substitution. (d) SNP18, in the promoter of PtoCesA4, showed significant association with α-cellulose content in both populations. The differences in α-cellulose content among the three genotypes of this marker indicate that patterns of gene action are consistent with overdominance effects. P1 represents the female clone YX01 (Populus alba × Populus glandulosa), P2 represents the male clone LM 50 (Populus tomentosa), and F1 represents the hybrid progeny.
Figure 5. Multi-locus single nucleotide polymorphism (SNP) models explain a large percentage of the phenotypic variance for growth and wood properties in the Populus tomentosa association population. The gray line and points denote the numbers of SNPs identified for each trait and the marker effects (R²) explained by the list of SNPs identified using the Bayesian mixed linear model in the Bayesian association with missing data (BAMD) program in R (http://cran.r-project.org/package=BAMD).
List of haplotypes with significant associations with wood quality and growth traits in the P. tomentosa association population (n = 460) after a correction for multiple testing
| Trait | P-value | Q-value | R² (%) | Significant Haplotypes | Haplotype Frequency | Single-Marker Associations |
|---|---|---|---|---|---|---|
| Lignin | 0.0012 | 0.0487 | 3.4 | SNPs 2-4 | | — |
| | | | | T-G-T | 0.28 | |
| | | | | G-A-C | 0.18 | |
| | 0.0052 | 0.0760 | 3.7 | SNPs 44-46 | | SNP44 (lignin) |
| | | | | G-A-T | 0.05 | |
| | | | | T-A-T | 0.27 | |
| α-Cellulose | 0.0019 | 0.0532 | 3.8 | SNPs 1-3 | | SNP3 (α-cellulose) |
| | | | | T-G-A | 0.25 | |
| | | | | T-T-G | 0.43 | |
| | 0.0015 | 0.0487 | 2.8 | SNPs 16-18 | | SNP18 (α-cellulose) |
| | | | | T-A-A | 0.08 | |
| | 0.0063 | 0.0922 | 5.6 | SNPs 48-50 | | SNP49 (α-cellulose) |
| | | | | T-A-C | 0.16 | |
| | | | | T-C-A | 0.17 | |
| Holocellulose | 0.0040 | 0.0713 | 3.0 | SNPs 38-40 | | — |
| | | | | T-G-A | 0.12 | |
| | 0.0023 | 0.0579 | 5.1 | SNPs 44-46 | | SNP45 (holocellulose) |
| | | | | T-A-T | 0.09 | |
| | | | | G-C-C | 0.13 | |
| | | | | G-A-C | 0.20 | |
| Fiber length | 0.0051 | 0.0760 | 4.0 | SNPs 89-91 | | — |
| | | | | C-A-A | 0.05 | |
| | | | | G-G-A | 0.29 | |
| Fiber width | 0.0022 | 0.0579 | 3.2 | SNPs 57-59 | | SNP59 (fiber width) |
| | | | | C-T-A | 0.05 | |
| D | 0.0005 | 0.0187 | 2.7 | SNPs 47-49 | | SNP48 (D) |
| | | | | C-A-C | 0.21 | |
| | | | | T-T-A | 0.17 | |
| | 0.0035 | 0.0673 | 2.6 | SNPs 81-83 | | SNP81 (D) |
| | | | | C-T-A | 0.30 | |
| V | 0.0030 | 0.0611 | 3.9 | SNPs 75-77 | | SNP75 (V) |
| | | | | T-T-A | 0.11 | |
| | | | | C-T-G | 0.08 | |
R² indicates percentage of the phenotypic variance explained. P-value indicates the significance level for haplotype-based association (significance is P ≤ 0.05). Q-value indicates a correction for multiple testing (false discovery rate (Q) ≤ 0.10). D, diameter at breast height; V, stem volume.
Single-Marker Associations lists the significant single-marker association with the lowest Q value (FDR Q ≤ 0.10) relating to the significant haplotype–trait association; —, no data were identified in this study.
Summary of significant SNP marker–trait pairs identified in PtoCesA4, using a linkage population, after correction for multiple testing errors
| Trait | Locus | Position | Alleles of Parents (Female:Male) | P-value | Q-value | R² (%) |
|---|---|---|---|---|---|---|
| α-Cellulose | SNP18 | Promoter | [TT:AT] | 0.0036 | 0.0693 | 2.8 |
| | SNP49 | Exon 3 | [AC:AC] | 0.0015 | 0.0490 | 3.6 |
| | SNP75 | Exon 10 | [CT:CT] | 0.0019 | 0.0532 | 1.5 |
| Holocellulose | SNP88 | Exon 12 | [AG:AG] | 0.0034 | 0.0693 | 1.9 |
| Fiber length | SNP70 | Intron 9 | [AT:AT] | 0.0013 | 0.0490 | 3.0 |
| Fiber width | SNP59 | Exon 6 | [AA:AC] | 0.0044 | 0.0693 | 2.5 |
| Tree height (H) | SNP51 | Intron 3 | [AC:AC] | 2.55E-05 | 0.0050 | 3.0 |
R² indicates percentage of the phenotypic variance explained. P-value indicates significance level for association (significance is P ≤ 0.05). Q-value indicates a correction for multiple testing (false discovery rate (Q) ≤ 0.10).
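The Q-values reported alongside each P-value are false-discovery-rate corrections for testing many marker–trait pairs. This page does not say which FDR procedure the study used, so the sketch below uses the Benjamini-Hochberg adjustment purely as a common stand-in; the function name and the input P-values are hypothetical.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical P-values for four marker-trait tests.
q_like = bh_adjust([0.01, 0.04, 0.03, 0.005])
significant = [q <= 0.10 for q in q_like]
```

Under this stand-in, a test is retained when its adjusted value is at or below the chosen threshold, mirroring the Q ≤ 0.10 cutoff quoted in the footnotes.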
Figure 6. PtoCesA4 transcript abundance varies among genotypic classes for significant SNP associations. (A) Transcript abundance variation of three genotypic classes for SNP49 in both association and linkage populations. The black and gray lines represent the transcript levels among three genotypic classes in association and linkage populations, respectively. (B) The relative mRNA transcript levels of PtoCesA4 among three genotypic classes for SNP41, a significant noncoding marker in the 5′UTR region of PtoCesA4. The error bars represent ±SD.