Chengguang Dong1, Juan Wang1, Yu Yu1, Longzhen Ju2, Xiaofeng Zhou1, Xiaomei Ma1, Gaofu Mei2, Zegang Han2, Zhanfeng Si3, Baocheng Li1, Hong Chen1, Tianzhen Zhang3.
Abstract
Fiber quality is an important economic index and a major breeding goal in cotton, but direct phenotypic selection is often hindered by environmental influences and linkage with yield traits. A genome-wide association study (GWAS) is a powerful tool to identify genes associated with phenotypic traits. In this study, we identified fiber quality genes in upland cotton (Gossypium hirsutum L.) using GWAS based on a high-density CottonSNP80K array and multiple environment tests. Across the 408 cotton accessions, 30 significant single nucleotide polymorphisms (SNPs) associated with five fiber quality traits were identified in six environments and 23 were identified in the best linear unbiased predictions (BLUPs). Seven loci were shared between these two sets of SNPs, and 128 candidate genes were predicted in a 1-Mb region (±500 kb of the peak SNP). Furthermore, two major genome regions (GR1 and GR2) associated with multiple fiber quality traits in multiple environments were identified on chromosomes A07 and A13, and within them, 22 candidate genes were annotated. Of these, 11 genes were expressed [log2(1 + FPKM) > 1] at the fiber development stages (5, 10, 20, and 25 days post-anthesis, dpa) according to RNA-Seq data. This study provides fundamental insight relevant to the identification of genes associated with fiber quality and will accelerate future efforts toward improving the fiber quality of upland cotton.
Keywords: SNP genotyping array; candidate genes; fiber quality; genome-wide association study; upland cotton
Year: 2019 PMID: 30687363 PMCID: PMC6334163 DOI: 10.3389/fpls.2018.01968
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
FIGURE 1. Description of SNPs and genetic diversity of 408 cotton accessions. (A) Genetic diversity of 408 cotton genomes. The serial numbers of the 26 chromosomes are represented by different colors. From the outer to the inner circle, the curves depict SNP density (the number of SNPs per 100-kb window) and the level of genetic differentiation (the FST values per 100-kb window) between G1 and G2, respectively. (B) Proportion of the 48,072 SNPs categorized by the distance between adjacent SNPs. “d” represents the distance between two adjacent SNPs.
FIGURE 2. Analysis of population structure of 408 upland cotton accessions. (A) Mean LnP(D) values plotted as the number of subgroups; (B) ΔK values plotted as the number of subgroups; (C) population structure based on STRUCTURE when K = 2; (D) distribution of accessions in different cotton regions of two subgroups. Red and green represent G1 and G2, respectively; (E) NJ tree based on Nei’s genetic distances; (F) principal component analysis.
FIGURE 3. GWAS of the FL (A), FS (B), FM (C), FU (D), and FE (E) in the BLUPs using EMMAX. The horizontal line indicates the threshold (4.68). Seven SNPs (the loci shared between the 30 significant SNPs and the 23 SNPs across the BLUPs) are indicated by red arrows in the Manhattan plots.
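The threshold of 4.68 is consistent with a cutoff of P = 1/n on the −log10(P) scale for the n = 48,072 SNPs mentioned in the Figure 1 caption. This record does not state how the threshold was derived, so the 1/n rule in the following sketch is an assumption, and the function name is illustrative only.

```python
import math


def neg_log10_threshold(n_snps: int) -> float:
    """Return -log10(1 / n_snps).

    Assumption: the significance cutoff is P = 1/n, a convention often
    used in GWAS; the source record does not confirm this derivation.
    """
    return -math.log10(1.0 / n_snps)


# 48,072 is the SNP count given in the Figure 1 caption.
threshold = neg_log10_threshold(48072)
print(round(threshold, 2))  # 4.68
```

A SNP would then be called significant when its −log10(P) value exceeds this number.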
Summary of seven significant peak SNPs associated with fiber quality traits and candidate genes within 500 kb either side of the SNP locus.
| SNPs | Chromosome | Site | Traits | Environment | No. of candidate genes |
|---|---|---|---|---|---|
| TM21110 | A07 | 70483507 | FL | BLUPs, KRL13, KRL14 | 8 |
| | | | FS | BLUPs, KRL13, KRL14 | |
| TM21111 | A07 | 70492663 | FL | BLUPs, KRL13, KRL14 | 1 |
| | | | FS | BLUPs, KRL13, KRL14 | |
| TM47488 | A13 | 75014691 | FL | BLUPs, SHZ13, SHZ14, KRL13, KRL14 | 43 |
| | | | FS | BLUPs, SHZ13, SHZ14, KRL13, KRL14 | |
| | | | FE | BLUPs, SHZ13 | |
| TM21292 | A07 | 72067994 | FS | BLUPs, SHZ13, SHZ14, SHZ15, KRL13, KRL14 | 16 |
| TM77174 | D11 | 64945961 | FS | BLUPs, SHZ14, KRL13 | 19 |
| TM81816 | D13 | 52852792 | FS | BLUPs, SHZ14 | 18 |
| TM53502 | D03 | 2481487 | FM | BLUPs, SHZ13 | 23 |
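
The ±500 kb candidate window can be illustrated with a short sketch. It uses the position of TM47488 from the table above and the coordinates of the six A13 genes listed in the candidate gene table of this record, plus one A7 gene from that table to show the chromosome filter. The `Gene` type, the function name, and the rule that a gene counts if it overlaps the window at all are assumptions for illustration, not the authors' pipeline.

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    chrom: str
    start: int
    end: int
    annotation: str


def genes_in_window(genes: List[Gene], chrom: str, snp_pos: int,
                    flank: int = 500_000) -> List[Gene]:
    """Return genes on `chrom` overlapping [snp_pos - flank, snp_pos + flank].

    Assumption: any overlap with the window counts as a hit.
    """
    lo, hi = snp_pos - flank, snp_pos + flank
    return [g for g in genes
            if g.chrom == chrom and g.end >= lo and g.start <= hi]


# Coordinates copied from the candidate gene table in this record.
genes = [
    Gene("A13", 74974260, 74978836, "Cyclic nucleotide gated channel 5"),
    Gene("A13", 74979629, 74982299, "Endomembrane protein 70 protein family"),
    Gene("A13", 74984317, 74990193, "Cytosol aminopeptidase family protein"),
    Gene("A13", 75012308, 75016124, "Fatty acid desaturase 6"),
    Gene("A13", 75029006, 75040535, "Actin binding"),
    Gene("A13", 75042666, 75046219, "Reticulon family protein"),
    Gene("A7", 70380260, 70385380, "Heat shock protein 90-like"),
]

# TM47488 sits at position 75014691 on chromosome A13.
hits = genes_in_window(genes, "A13", 75014691)
print(len(hits))  # 6
```

All six listed A13 genes fall inside the window; the table reports 43 candidate genes for this SNP in total, so the list here is only a subset.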
Summary of elite alleles and phenotypic effects.
| Trait | SNP | Chromosome | Genotype | Elite allele | ANOVA | Phenotypic effect |
|---|---|---|---|---|---|---|
| FL | TM21110 | A07 | A/G | GG | 5.03E-13∗∗∗ | 1.44 |
| | TM21111 | A07 | T/C | CC | 2.51E-12∗∗∗ | 1.35 |
| | TM47488 | A13 | T/C | CC | 1.37E-12∗∗∗ | 0.96 |
| FS | TM21110 | A07 | A/G | GG | 2.93E-15∗∗∗ | 2.65 |
| | TM21111 | A07 | T/C | CC | 3.18E-15∗∗∗ | 2.53 |
| | TM21292 | A07 | T/C | CC | 3.07E-13∗∗∗ | 3.01 |
| | TM47488 | A13 | T/C | CC | 9.75E-18∗∗∗ | 1.91 |
| | TM77174 | D11 | A/G | GG | 0.0031∗∗ | 3.73 |
| | TM81816 | D13 | A/G | GG | 1.54E-15∗∗∗ | 0.85 |
| FM | TM53502 | D03 | A/C | AA | 4.05E-09∗∗∗ | -0.05 |
| FE | TM47488 | A13 | T/C | CC | 2.06E-11∗∗∗ | 0.06 |
FIGURE 4. Build-up effect analysis for different numbers of elite alleles. (A) FL, fiber length; (B) FS, fiber strength. The X-axis represents the number of elite alleles carried by the accessions and the Y-axis represents trait mean value. Different lowercase letters above the plots represent Duncan’s multiple comparison at P < 0.05.
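Counting the elite alleles carried by an accession, as on the X-axis of Figure 4, can be sketched as follows. The elite genotypes for FL are taken from the elite allele table in this record; the accession names and their genotypes are hypothetical, and counting one elite allele per locus that is homozygous for the elite genotype is an assumption about how the tally was made.

```python
from typing import Dict

# Elite genotypes for fiber length (FL), from the elite allele table.
FL_ELITE: Dict[str, str] = {
    "TM21110": "GG",
    "TM21111": "CC",
    "TM47488": "CC",
}


def count_elite(genotypes: Dict[str, str],
                elite: Dict[str, str] = FL_ELITE) -> int:
    """Count loci at which an accession carries the elite genotype."""
    return sum(1 for snp, elite_gt in elite.items()
               if genotypes.get(snp) == elite_gt)


# Hypothetical accessions, for illustration only (not data from the study).
accessions = {
    "acc_1": {"TM21110": "GG", "TM21111": "CC", "TM47488": "CC"},
    "acc_2": {"TM21110": "AA", "TM21111": "CC", "TM47488": "TT"},
    "acc_3": {"TM21110": "AA", "TM21111": "TT", "TM47488": "TT"},
}

counts = {name: count_elite(gt) for name, gt in accessions.items()}
print(counts)  # {'acc_1': 3, 'acc_2': 1, 'acc_3': 0}
```

Grouping accessions by this count and comparing the trait means of the groups would give the kind of build-up plot the caption describes.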
FIGURE 5. Physical map of SSR markers identified by electronic PCR based on the reference genome sequence (Zhang T. et al., 2015). The SSR markers are linked to fiber quality QTLs from previous studies. The unit of physical distance for the chromosomes is Mb; the gray columns represent SSR markers or QTLs from previous studies in the region adjacent to the associated SNP loci identified in this study. (A) The first major genome region (GR1) on chromosome A07; (B) The second major genome region (GR2) on chromosome A13.
The 22 candidate genes associated with fiber quality traits.
| Chromosome | Gene name | Gene_start | Gene_end | Annotation | Expression level |
|---|---|---|---|---|---|
| A7 | | 70380260 | 70385380 | Heat shock protein 90-like | >1 |
| A7 | | 70401196 | 70401555 | Unknown | |
| A7 | | 70458716 | 70466147 | Nucleoporin, Nup133/Nup155-like | >1 |
| A7 | | 70665067 | 70669318 | Transducin/WD40 repeat-like superfamily protein | |
| A7 | | 70711596 | 70712168 | Unknown | >1 |
| A7 | | 70778523 | 70779569 | Snf1-related protein kinase regulatory subunit beta-2 | |
| A7 | | 70985378 | 70986640 | FASCICLIN-like arabinogalactan 2 | >1 |
| A7 | | 71156295 | 71158111 | Cytochrome p450 79a2 | |
| A7 | | 71430100 | 71431232 | emp24/gp25L/p24 family/GOLD family protein | |
| A7 | | 71570464 | 71571897 | Receptor like protein 45 | |
| A7 | | 71574475 | 71580347 | Nudix hydrolase homolog 23 | >1 |
| A7 | | 71640549 | 71640998 | RING/U-box superfamily protein | >1 |
| A7 | | 71682050 | 71682979 | Plant protein of unknown function (DUF868) | |
| A7 | | 71683734 | 71690864 | Unknown | |
| A7 | | 71951926 | 71957589 | Subtilisin-like serine endopeptidase family protein | |
| A7 | | 72009668 | 72010690 | Unknown | |
| A13 | | 74974260 | 74978836 | Cyclic nucleotide gated channel 5 | >1 |
| A13 | | 74979629 | 74982299 | Endomembrane protein 70 protein family | >1 |
| A13 | | 74984317 | 74990193 | Cytosol aminopeptidase family protein | >1 |
| A13 | | 75012308 | 75016124 | Fatty acid desaturase 6 | >1 |
| A13 | | 75029006 | 75040535 | Actin binding | |
| A13 | | 75042666 | 75046219 | Reticulon family protein | >1 |
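
The ">1" entries refer to the expression criterion log2(1 + FPKM) > 1 stated in the abstract, which is equivalent to FPKM > 1. A minimal sketch of that filter follows; the FPKM values are hypothetical, and the function name is illustrative. The record does not say whether a gene had to pass at every stage or at any one stage, so the sketch only reports which stages pass.

```python
import math


def is_expressed(fpkm: float, cutoff: float = 1.0) -> bool:
    """Apply the criterion log2(1 + FPKM) > cutoff from the abstract."""
    return math.log2(1.0 + fpkm) > cutoff


# Hypothetical FPKM values keyed by fiber development stage (dpa);
# these are illustrative numbers, not measurements from the study.
stage_fpkm = {5: 0.4, 10: 2.3, 20: 0.9, 25: 1.0}

expressed_stages = [dpa for dpa, fpkm in stage_fpkm.items()
                    if is_expressed(fpkm)]
print(expressed_stages)  # [10]
```

Note that an FPKM of exactly 1.0 gives log2(2) = 1, which does not satisfy the strict inequality.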