Kelly Houston, Rachel A Burton, Beata Sznajder, Antoni J Rafalski, Kanwarpal S Dhugga, Diane E Mather, Jillian Taylor, Brian J Steffenson, Robbie Waugh, Geoffrey B Fincher.
Abstract
Cellulose is a fundamentally important component of the cell walls of higher plants. It provides a scaffold that allows the development and growth of the plant to occur in an ordered fashion. Cellulose also provides mechanical strength, which is crucial both for normal development and for enabling the plant to withstand abiotic and biotic stresses. We quantified the cellulose concentration in the culm of 288 two-rowed and 288 six-rowed spring-type barley accessions that were part of the USDA-funded barley Coordinated Agricultural Project (CAP) program in the USA. When the population structure of these accessions was analysed, we identified six distinct populations, four of which we considered to comprise a sufficient number of accessions to be suitable for genome-wide association studies (GWAS). These lines had been genotyped with 3072 SNPs, so we combined the trait and genetic data to carry out GWAS. The analysis allowed us to identify regions of the genome containing significant associations between molecular markers and cellulose concentration data, including one region cross-validated in multiple populations. To identify candidate genes, we assembled the gene content of these regions and used it to query a comprehensive RNA-seq based gene expression atlas. This provided us with gene annotations and associated expression data across multiple tissues, which allowed us to formulate a supported list of candidate genes that regulate cellulose biosynthesis. Several regions identified by our analysis contain genes that are co-expressed with cellulose synthase A (HvCesA) across a range of tissues and developmental stages. These genes are involved in both primary and secondary cell wall development. In addition, genes that have previously been linked with cellulose synthesis by biochemical methods, such as HvCOBRA, a gene of unknown function, were also associated with cellulose levels in the association panel.
Our analyses provide new insights into the genes that contribute to cellulose content in cereal culms and a greater understanding of the interactions between them.
Year: 2015 PMID: 26154104 PMCID: PMC4496100 DOI: 10.1371/journal.pone.0130890
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Cellulose quantities by breeding program and subpopulation in barley culms.
| Abbreviation | Breeding program | sample size | mean cellulose concentration | range of cellulose concentration |
|---|---|---|---|---|
| WA | Washington | 94 | 0.394 | 0.222–0.464 |
| MT | Montana | 96 | 0.433 | 0.340–0.497 |
| N2 | North Dakota (2-row) | 84 | 0.456 | 0.264–0.648 |
| UT | Utah | 96 | 0.409 | 0.292–0.498 |
| N6 | North Dakota (6-row) | 94 | 0.361 | 0.035–0.477 |
| MN | Minnesota | 96 | 0.402 | 0.179–0.459 |
| All | | 560 | 0.408 | 0.035–0.648 |
Sample size, mean cellulose concentration, and range of cellulose concentration in mg cellulose / mg dry weight.
Populations are based on Barley CAP breeding programs.
Cellulose quantities by subpopulation defined by STRUCTURE in barley culms.
| Subpopulation | sample size | mean cellulose concentration | range of cellulose concentration |
|---|---|---|---|
| 1 | 18 | 0.418 | 0.364–0.468 |
| 2 | 84 | 0.356 | 0.035–0.427 |
| 3 | 87 | 0.401 | 0.264–0.648 |
| 4 | 98 | 0.401 | 0.179–0.459 |
| 5 | 183 | 0.414 | 0.222–0.597 |
| 6 | 58 | 0.405 | 0.292–0.498 |
| All | 528 | 0.399 | 0.035–0.648 |
Sample size, mean cellulose concentration, and range of cellulose concentration in mg cellulose / mg dry weight.
Populations are the subpopulations to which lines were assigned by STRUCTURE analysis.
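The "All" rows in the two tables above can be checked against the per-group rows. The following is a minimal sketch that recomputes the total sample size and the sample-size-weighted mean from the tabulated values. Because the group means are rounded to three decimals, the pooled mean is only expected to agree to about that precision. The function and variable names are illustrative and not taken from the paper.

```python
# (sample size, mean cellulose concentration) pairs copied from the tables above.
breeding_programs = {
    "WA": (94, 0.394), "MT": (96, 0.433), "N2": (84, 0.456),
    "UT": (96, 0.409), "N6": (94, 0.361), "MN": (96, 0.402),
}
structure_subpops = {
    1: (18, 0.418), 2: (84, 0.356), 3: (87, 0.401),
    4: (98, 0.401), 5: (183, 0.414), 6: (58, 0.405),
}


def pooled_mean(groups):
    """Return (total n, sample-size-weighted mean) for a dict of (n, mean) pairs."""
    total_n = sum(n for n, _ in groups.values())
    weighted_sum = sum(n * m for n, m in groups.values())
    return total_n, weighted_sum / total_n


for label, groups in [("breeding programs", breeding_programs),
                      ("STRUCTURE subpopulations", structure_subpops)]:
    n, m = pooled_mean(groups)
    print(f"{label}: n = {n}, pooled mean = {m:.3f} mg cellulose / mg dry weight")
```

Both recomputed totals and pooled means agree with the tabulated "All" rows (560 and 0.408; 528 and 0.399).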
Regions of the barley genome identified by GWAS as having significant associations with cellulose content.
| Marker | i-select cM | POPSEQ | IBSC | Zipper MxBk | Pop 3 (2-row) −log10(p-value) | Pop 3 (2-row) R2 | Pop 5 (2-row) −log10(p-value) | Pop 5 (2-row) R2 | Barley Gene ID (MLOC) | cM (IBSC) | Morex Contig | 3rd internode FPKM |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 11_10516 | 65.5 | 59.1 | 61 | 60.8 | 0.8 | 0 | 3.6 * | 0.04 | ? | n/a | n/a | n/a |
| 11_11037 | 84.7 | n/a | n/a | n/a | 0 | 0.01 | 6.1 *** | 0.1 | MLOC_66574 | 72 | contig_51769 | 611.2 |
| 11_20097 | 100.3 | 80.3 | 80.3 | n/a | 1 | 0.04 | 5.7 *** | 0.08 | MLOC_66740 | 83.5 | contig_52038 | 20.7 |
| | | | | | | | | | AK249826 | 80.4 | n/a | n/a |
| 11_20134 | 106.2 | n/a | 92.4 | n/a | 0.1 | 0 | 3.9 * | 0.05 | AK252806 | 88.2 | contig_230478 | 5.6 |
| | | | | | | | | | AK373680 | 97.3 | contig_137207 | 0.1 |
| 11_10094 | 122.4 | 95.5 | n/a | n/a | 0 | 0.01 | 3.8 * | 0.07 | MLOC_1996 | 97.7 | contig_117772 | 20.9 |
| 11_20487 | 134.6 | n/a | n/a | n/a | 2.3 | 0.03 | 4.1 * | 0.06 | MLOC_67725 | 122.4 | contig_53512 | 3 |
| | | | | | | | | | MLOC_75585 | 122.4 | contig_68278 | 0.2 |
| | | | | | | | | | MLOC_56792 | 126.1 | contig_41156 | 0 |
| | | | | | | | | | MLOC_55526 | 125.8 | contig_40045 | 482.2 |
| 12_30502 | 0 | 169.4 | n/a | n/a | 0 | 0 | 5.9 *** | 0.08 | MLOC_11345 | 166.67 | contig_1560890 | 71.5 |
| 11_10427 | 34.4 | 38 | n/a | 37.4 | 0.1 | 0 | 4.2 * | 0.06 | AK367031 | 37.4 | contig_39637 | 777.1 |
| 11_10239 | 112.3 | 105.1 | 104.8 | n/a | 0 | 0 | 3.6 * | 0.05 | ? | n/a | n/a | n/a |
| 11_10069 | 83.4 | 75.2 | n/a | 74.4 | 4.7 * | 0.04 | 0 | 0 | MLOC_57200 | 72.5 | contig_41513 | 244.1 |
| | | | | | | | | | MLOC_19933 | 73.2 | contig_1587220 | 7.3 |
This table includes only Pop 3 and Pop 5, as these were the two subpopulations in which significant associations with culm cellulose content were identified. Significant (−log10 p ≥ 3) marker-trait associations that pass the False Discovery Rate (FDR) adjusted p-value threshold of ≤ 0.05 are indicated by *, ≤ 0.01 by **, and ≤ 0.001 by ***. R-squared values for each marker are also included. The marker name and location of the SNP with the highest LOD score are provided for each QTL. Germplasm included in the analysis was a subset of 2-rowed and 6-rowed spring accessions from the Barley CAP project. Subpopulations, referred to here as "Pop", are those determined by STRUCTURE analysis based on degree of shared genetic ancestry. Barley Gene ID (MLOC), Morex Contig, and 3rd internode FPKM (fragments per kilobase of exon per million fragments mapped, from [39]) are provided for each candidate gene identified under each peak where information is available. Four different versions of the barley map were used to provide information for each locus: 9K i-Select [35], Barley Genome Zipper [37], POPSEQ [38] and the IBSC [39]. Question marks indicate regions where no obvious candidate was identified based on the available annotations.
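The star notation above maps FDR-adjusted p-values onto three significance tiers. The sketch below illustrates that mapping under one assumption: the Benjamini-Hochberg procedure is used as the FDR adjustment, which the text here does not specify. The p-values are hypothetical and the function names are illustrative.

```python
import math


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Assumption: BH is used as the FDR adjustment; the source text does not
    name the procedure.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def stars(adj_p):
    """Map an FDR-adjusted p-value to the table's star notation."""
    if adj_p <= 0.001:
        return "***"
    if adj_p <= 0.01:
        return "**"
    if adj_p <= 0.05:
        return "*"
    return ""


# Hypothetical raw p-values for five markers (not data from the study).
raw_p = [1e-6, 0.0005, 0.02, 0.2, 0.8]
adj_p = benjamini_hochberg(raw_p)
for p, q in zip(raw_p, adj_p):
    print(f"-log10(p) = {-math.log10(p):.1f}  adjusted p = {q:.4g}  {stars(q)}")
```

Note that the table applies two criteria together: a marker must have −log10 p ≥ 3 and also pass the adjusted threshold. The sketch shows only the adjustment and the star mapping.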
Fig 1. Genotypic data and population structure analysis.
(A) Principal coordinates analysis of all lines phenotyped, colour coded by breeding program. (B) STRUCTURE bar plot for K = 6 based on bOPA 1&2 genotyping data for spring barley lines, ordered by breeding program but colour coded by K value. Note that the colours in (A) are independent of those in (B). Breeding program 1 = Washington (WA), 2 = Montana (MT), 3 = 2-row North Dakota (N2), 4 = Utah (UT), 5 = 6-row North Dakota (N6), and 6 = Minnesota (MN). Colours represent subpopulations defined by shared genetic ancestry. The Q value represents the proportion of ancestry assigned to a given subpopulation. (C) Output from Structure Harvester showing K as calculated by the ΔK method, in this case K = 6. L(K) represents the likelihood distribution of K, and L''(K) represents the second-order rate of change of L(K).
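The ΔK criterion mentioned in panel (C) can be illustrated with a short sketch. It uses one common formulation: ΔK is the absolute second-order difference of the mean L(K) across replicate runs, divided by the standard deviation of L(K) at that K. The log-likelihood values are hypothetical and chosen only so that the peak falls at K = 6; they are not the study's STRUCTURE output, and the function name is illustrative.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Compute delta-K for each interior K.

    lnp_by_k maps K -> list of replicate log-likelihoods L(K).
    delta-K(K) = |mean L(K+1) - 2 * mean L(K) + mean L(K-1)| / sd(L(K)).
    """
    means = {k: mean(v) for k, v in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        # Only K values with both neighbours present have a second difference.
        if k - 1 in means and k + 1 in means:
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            result[k] = second_diff / stdev(lnp_by_k[k])
    return result


# Hypothetical replicate log-likelihoods (three runs per K).
runs = {
    4: [-5200, -5210, -5190],
    5: [-5000, -5010, -4990],
    6: [-4700, -4702, -4698],
    7: [-4690, -4720, -4660],
    8: [-4685, -4700, -4670],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print({k: round(v, 2) for k, v in dk.items()}, "best K =", best_k)
```

In this toy data set the likelihood plateaus after K = 6 and the replicate variance there is small, so ΔK peaks sharply at K = 6.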
Fig 2. Phenotypic data used to carry out a genome-wide association study (GWAS).
(A) Mean culm cellulose content for 2-row and 6-row spring barley accessions by breeding program. (B) Mean culm cellulose content for all lines, illustrating the distribution of this trait in the barley CAP programs included in this analysis.
Fig 3. Manhattan plots of genome-wide association scans (GWAS) for culm cellulose content (mg cellulose / mg dry weight) using the Kinship relationship model.
The −log10 (p-values) from a genome-wide scan are plotted against position on each of the 7 barley chromosomes. Manhattan plots are displayed for those populations that had associations passing the significance threshold set by FDR p < 0.05 (−log10 p > 3.0), i.e. populations 3 and 5. Manhattan plots of the null models are provided for comparison. (A) Population 3 Kinship model. (B) Population 5 Kinship model. (C) Population 3 Null model. (D) Population 5 Null model.
Fig 4. Co-expression of two groups of genes (HvCesA9, HvCobra1, HvCslF6, and HvChitinase, HvGT1) identified by GWAS as putatively linked to culm cellulose content with HvCesA genes known to be involved in primary and secondary cell wall development.
Transcript abundance across a range of tissues is shown in fragments per kilobase of exon per million fragments mapped (FPKM) for (A) group 1, primary cell wall, with HvCesA1, HvCesA2 and HvCesA6 included for reference, and (B) group 2, secondary cell wall, with HvCesA4, HvCesA7 and HvCesA8 included for reference. Abbreviations for tissues and developmental stages are as follows: EMB = Embryo tissues (germinating), ROO = Root (10 cm seedlings), LEA = Shoot (10 cm seedlings), INF1 = Inflorescence (0.5 cm), INF2 = Inflorescence (1–1.5 cm), NOD = Tillers (3rd internode), CAR5 = Grain (5 days post anthesis, DPA), CAR15 = Grain (15 DPA), ETI = Etiolated (10 day seedlings), LEM = Lemma (42 days after planting, DAP), LOD = Lodicule (42 DAP), PAL = Palea (42 DAP), EPI = Epidermis (28 DAP), RAC = Rachis (35 DAP), ROO2 = Root (28 DAP seedling), SEN = Senescing leaf (63 DAP).
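FPKM, the unit used for transcript abundance above, normalises a gene's fragment count by both exon length and sequencing depth. The following is a minimal sketch of the standard definition. The counts in the example are hypothetical and the function name is illustrative; it is not the pipeline used to build the expression atlas.

```python
def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon per million fragments mapped.

    FPKM = fragments / (exon length in kb) / (mapped fragments in millions)
         = fragments * 1e9 / (exon_length_bp * total_mapped_fragments)
    """
    if exon_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("exon length and library size must be positive")
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


# Hypothetical gene: 500 fragments on 2,000 bp of exon in a 25-million-fragment library.
value = fpkm(500, 2_000, 25_000_000)
print(f"FPKM = {value:.1f}")
```

Because of the two normalisations, FPKM values are comparable between genes of different length and between libraries of different depth, which is what allows the cross-tissue comparison in this figure.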