Arnis Druka, Ilze Druka, Arthur G Centeno, Hongqiang Li, Zhaohui Sun, William T B Thomas, Nicola Bonar, Brian J Steffenson, Steven E Ullrich, Andris Kleinhofs, Roger P Wise, Timothy J Close, Elena Potokina, Zewei Luo, Carola Wagner, Günther F Schweizer, David F Marshall, Michael J Kearsey, Robert W Williams, Robbie Waugh.
Abstract
BACKGROUND: A typical genetical genomics experiment results in four separate data sets: genotype, gene expression, higher-order phenotypic data, and metadata that describe the protocols, processing and the array platform. Used in concert, these data sets provide the opportunity to perform genetic analysis at a systems level. Their predictive power is largely determined by the gene expression data set, where tens of millions of data points can be generated using currently available mRNA profiling technologies. Such large, multidimensional data sets often have value beyond that extracted during their initial analysis and interpretation, particularly if the experiment is conducted on widely distributed reference genetic materials. Besides quality and scale, access to the data is of primary importance, because accessibility allows the wider research community to extract considerable added value from the same primary data set. Although the number of genetical genomics experiments in different plant species is rapidly increasing, none to date has been presented in a form that allows an entire research community to test quickly and efficiently on-line for possible associations between genes, loci and traits of interest.

DESCRIPTION: Using a reference population of 150 recombinant doubled haploid barley lines, we generated novel phenotypic, mRNA abundance and SNP-based genotyping data sets, added them to a considerable volume of legacy trait data and entered them into GeneNetwork (http://www.genenetwork.org). GeneNetwork is a unified on-line analytical environment that enables the user to test genetic hypotheses about how component traits, such as mRNA abundance, may interact to condition more complex biological phenotypes (higher-order traits). Here we describe these barley data sets and demonstrate some of the functionalities GeneNetwork provides as an easily accessible and integrated analytical environment for exploring them.
Year: 2008 PMID: 19017390 PMCID: PMC2630324 DOI: 10.1186/1471-2156-9-73
Source DB: PubMed Journal: BMC Genet ISSN: 1471-2156 Impact factor: 2.797
Condensed list of barley traits that have been measured using the Steptoe × Morex DHL population and are available for analysis through GeneNetwork.
| Trait | Description or method | Environments | Source |
| Germination | Frequency of the germinating grains | 10 | cp |
| Emergence of the second leaf (ESL) | Single-leaf frequency (ESL-f) and length ratio of the second and the first leaf (ESL-r) | 3 | cp |
| Heading date | Time interval to heading or anthesis | 1–16, 17–20 | cp, [ |
| Height | Distance from ground to collar at maturity | 1–16, 17–20 | cp, [ |
| Lodging | Stems < 45 degree angle to ground (1–9) | 1–16, 17–20 | cp, [ |
| Maturity | Maturity of the plot (1–9) | 17–20 | cp |
| Normalized difference vegetation index (NDVI) | GREENSEEKER | 17–20 | cp |
| Head length | Distance from peduncle to the awn tip | 17–20 | cp |
| Post harvest sprouting | Frequency of the germinating grains | 10 | cp |
| Head loss | Frequency of the tillers with no heads | 3 | cp |
| Grain loss | Frequency of the heads with no spikelets | 3 | cp |
| Thousand grain weight (TGW) | TGW = 1000 × weight/seed number | 17–20, 25 | cp |
| Grain morphometrics | MARVIN and ImageJ | 17–20, 25 | cp |
| Endosperm cell wall modification | Calcofluor staining, ImageJ analysis | 20 | cp |
| Nitrogen content or grain protein | Micromalting | 1–20 | cp, [ |
| Malt extract | Micromalting | 1–22 | cp,[ |
| Fermentability | Micromalting | 1–3 | cp |
| Hot water extract | Micromalting | 1–3 | cp |
| Milling energy (J) | COMPARAMILL | 1–3 | cp |
| Predicted spirit yield | Micromalting | 1–3 | cp |
| Grain moisture content | Moisture content of sample % | 1–3 | cp |
| Diastatic power | Micromalting | 7–22 | [ |
| Alpha amylase | Micromalting | 7–22 | [ |
| Leaf rust (Puccinia hordei) | Relative Latency Period | 24 | [ |
| Net blotch (Pyrenophora teres) | Frequency of the infection types | 6 | [ |
| Scald (Rhynchosporium secalis) | Disease severity | 25 | cp |
| Spot blotch (Cochliobolus sativus) | Frequency of the infection types | 6 | [ |
| Stem rust (Puccinia graminis) | Frequency of the infection types | 6 | [ |
Detailed descriptions of the traits are here:
Environments/years/source:
1 – SCRI, 2002; 2 – SCRI, 2003; 3 – SCRI, 2004; 4 – SCRI, 2005; 5 – SCRI, 2006; 6 – Minnesota; 7 – Crookston, Minnesota; 8 – Ithaca, New York; 9 – Guelph, Ontario; 10 – Pullman, Washington; 11 – Brandon, Manitoba; 12 – Outlook, Saskatchewan; 13 – Goodale, Saskatchewan; 14 – Saskatoon, Saskatchewan; 15 – Tetonia, Idaho; 16 – Bozeman, Montana (irrigated); 17 – Bozeman, Montana (dryland); 18 – Aberdeen, Idaho; 19 – Klamath Falls, Oregon; 20 – Pullman, Washington; 21 – Bozeman, Montana (irrigated); 22 – Bozeman, Montana (dryland); 23 – Japan (Kazuhiro Sato, seeds for morphometrics); 24 – Wageningen; 25 – Giessen. cp – current paper.
Barley expression data sets available for analysis in GeneNetwork.
| Affymetrix CEL files generated using the MAS 5.0 Suite (Affymetrix, Santa Clara, CA) were imported into GeneSpring GX 7.3 (Agilent Technologies, Palo Alto, CA) and processed using the RMA algorithm. | |
| The MAS 5.0 values were calculated from the DAT files using the Affymetrix MAS 5.0 Suite (Affymetrix, Santa Clara, CA). | |
| Affymetrix CEL files were imported into GeneSpring GX 7.3 (Agilent Technologies, Palo Alto, CA) and processed using the RMA algorithm. Per-chip and per-gene normalization followed the standard GeneSpring procedure: values below 0.01 were set to 0.01, and each measurement was then divided by the 50th percentile of all measurements in that sample. Additionally, each gene was divided by the median of its measurements across all samples. |
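The per-chip and per-gene normalization described for the last data set can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not GeneSpring's own implementation: the function name `normalize_expression`, the list-of-lists data layout and the use of the plain median as the 50th percentile are all choices made for this example.

```python
from statistics import median


def normalize_expression(samples, floor=0.01):
    """Floor, per-chip and per-gene normalization (illustrative sketch).

    `samples` is a list of chips; each chip is a list of expression
    values, with genes in the same order on every chip (assumed layout).
    """
    # Step 1: set every value below the floor (0.01) to the floor.
    floored = [[max(value, floor) for value in chip] for chip in samples]

    # Step 2 (per chip): divide each measurement by the 50th percentile
    # (taken here as the median) of all measurements on that chip.
    per_chip = []
    for chip in floored:
        chip_median = median(chip)
        per_chip.append([value / chip_median for value in chip])

    # Step 3 (per gene): divide each gene by the median of its
    # measurements across all chips.
    n_genes = len(per_chip[0])
    gene_medians = [median(chip[g] for chip in per_chip) for g in range(n_genes)]
    return [[chip[g] / gene_medians[g] for g in range(n_genes)] for chip in per_chip]


# Made-up values for three chips and three genes, for illustration only.
example = [[1.0, 2.0, 4.0], [2.0, 4.0, 8.0], [0.001, 3.0, 6.0]]
normalized = normalize_expression(example)
```

After these steps a value of 1.0 means "typical for this gene on a typical chip", which is what makes values comparable across both chips and genes.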
Figure 1. A – Generalized schematic representation of the functions in GeneNetwork and their relationships, as they relate to three types of data: gene expression, phenotype and genotype. B–E – Examples of typical graphical outputs generated by GeneNetwork. B – Profile of a QTL scan using the interval mapping function. Blue line graph – Likelihood Ratio Statistic (LRS) profile; green and red line graphs – allelic effects (in our case green = Morex, red = Steptoe); yellow bars – confidence intervals determined using 1000 bootstrap tests; red and grey horizontal lines – upper and lower significance LRS thresholds determined by 1000 permutation tests. C – Any pairwise correlation can be visualized as a scatter plot, allowing the correlation structure to be determined. In this case, mRNA abundance values (reported by the GeneChip probe set Contig8601_s_at) were plotted against grain yield values from one of the trials. 'N of cases' – number of segregating lines. Pearson's and Spearman's correlation coefficients and associated p-values (P) are shown in the top right corner. The linear regression line is shown in green. D – Selected correlates can also be visualized as a QTL Cluster map, which is a genetically ordered heat-map representation of the QTLs from multiple traits that were calculated using single marker linkage analysis. Significant QTLs are shown in a different colour from loci that have no association, and allelic effects are shown in contrasting colours (red and blue in key). E – Association network of 10 correlated genes. The mRNA abundance of the HLH DNA-binding protein gene (Contig20506_at) was used as a 'seed'. The threshold on the absolute value of Pearson's correlation coefficient was 0.8 in this case. Line colours show correlation strength (more intense – higher correlation) and whether it is positive (orange – red) or negative (green – blue).
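The permutation-based LRS thresholds mentioned for panel B can be illustrated with a small sketch. It is not GeneNetwork's code: it assumes single-marker regression with genotypes coded -1/+1, uses the identity LRS = -n ln(1 - r²) for simple regression, and takes genome-wide levels of 0.63 and 0.05 for the lower and upper thresholds as an assumption. The names `marker_lrs` and `permutation_thresholds` are hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation; returns 0.0 if either variable is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def marker_lrs(genotypes, trait):
    """LRS for regressing the trait on one marker: -n * ln(1 - r^2)."""
    r2 = min(pearson_r(genotypes, trait) ** 2, 1.0 - 1e-12)  # avoid log(0)
    return -len(trait) * math.log(1.0 - r2)


def permutation_thresholds(markers, trait, n_perm=1000, levels=(0.63, 0.05), seed=0):
    """Genome-wide LRS thresholds from permuted trait values.

    Each permutation shuffles the trait across lines, which breaks any
    genotype-trait association, and records the largest LRS over all
    markers. The threshold at level p is the (1 - p) quantile of maxima.
    """
    rng = random.Random(seed)
    shuffled = list(trait)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        maxima.append(max(marker_lrs(m, shuffled) for m in markers))
    maxima.sort()
    return {p: maxima[min(n_perm - 1, int((1.0 - p) * n_perm))] for p in levels}
```

Recording the maximum LRS per permutation, rather than the LRS at each marker, is what makes the resulting thresholds genome-wide and not per-marker.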
Figure 2. Prediction of barley gene position based on linkage analysis of mRNA abundance. A – Scattergram of the LRS value distributions of 324 eQTLs with genetic positions of the underlying genes determined using SNP- or RFLP-based linkage mapping. B – Cumulative (%) distribution of the LRS values for cis- (blue line graph) and trans- (red line graph) eQTLs. C – Scatterplots showing the distribution of high (> 30) and low (< 30) LRS class eQTLs across the barley genetic map (x-axes) relative to the position of their putative rice orthologs. Each diagram shows only the comparison to rice chromosome 1, which exhibits considerable conservation of synteny with barley 3H (y-axes). On the x-axis, the eQTL positions of barley orthologs of genes on rice chromosome 1 are ordered according to their location on the barley genetic map (tip of barley 1HS to bottom of 7HL), but barley map distances are not taken into account. As expected, barley 3H exhibits strong synteny with rice 1. This is particularly obvious when considering the eQTLs with LRS > 30, suggesting that this class of eQTLs is generally cis-acting. The eQTLs with LRS < 30 show a less obvious (but still apparent) association between rice 1 and barley 3H. In these comparisons, all genes reported by 22,840 Barley1 GeneChip probe sets were analysed.
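The comparison in panels A and B amounts to labelling each eQTL as cis or trans and then tabulating LRS values cumulatively per class. The sketch below shows one way to do that. The 10 cM window used to call an eQTL cis is an assumption of this example, not a value given in the text, and the function names and the three eQTL records are hypothetical.

```python
def classify_eqtl(gene_pos, eqtl_pos, window_cm=10.0):
    """Label an eQTL 'cis' if it maps near the gene it regulates.

    Positions are (chromosome, cM) pairs. The 10 cM window is an
    illustrative assumption, not a value taken from the paper.
    """
    gene_chr, gene_cm = gene_pos
    eqtl_chr, eqtl_cm = eqtl_pos
    same_region = gene_chr == eqtl_chr and abs(gene_cm - eqtl_cm) <= window_cm
    return "cis" if same_region else "trans"


def cumulative_percent(lrs_values, cutoffs):
    """Percentage of eQTLs with LRS at or below each cutoff."""
    n = len(lrs_values)
    return [100.0 * sum(1 for v in lrs_values if v <= c) / n for c in cutoffs]


# Made-up records: (mapped gene position, eQTL peak position, LRS).
eqtls = [
    (("3H", 50.0), ("3H", 52.0), 85.0),
    (("3H", 50.0), ("5H", 10.0), 14.0),
    (("1H", 20.0), ("1H", 95.0), 22.0),
]
by_class = {"cis": [], "trans": []}
for gene_pos, eqtl_pos, lrs in eqtls:
    by_class[classify_eqtl(gene_pos, eqtl_pos)].append(lrs)

# Cumulative distribution per class at a few LRS cutoffs, including 30.
curves = {label: cumulative_percent(values, [15.0, 30.0, 100.0])
          for label, values in by_class.items() if values}
```

If cis-eQTLs dominate the high-LRS class, as the caption suggests, the cis curve rises more slowly than the trans curve at low cutoffs.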
Figure 3. Results of principal component analysis (A) and association network (B) show the relationships between the major barley phenotypic traits integrated into GeneNetwork. The network was built using scores of the first four principal components (c1–c4) calculated by combining data from a single trait measured in different locations and years, or from related (component) traits underlying a higher-order trait (e.g. malt quality data). Concerning the latter, principal component scores for malting quality traits were calculated from combined alpha amylase, diastatic power, grain protein and malt extract trait values. Principal component node colouring: c1 – black background, c2 – grey, c3 and c4 – white. Double-lined links – positive correlations; bold, thick links – negative correlations. For clarity, the network was re-drawn using GeneNetwork's output.
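Combining one trait measured in several environments into a principal component score can be sketched as follows. This is a minimal illustration that extracts only the first component by power iteration on the correlation matrix; it is not the procedure GeneNetwork runs. It assumes complete data (no missing values) and no constant columns, and the function names are hypothetical.

```python
import math
import random


def standardize(columns):
    """Scale each column (one trait-by-environment series) to mean 0, sd 1."""
    scaled = []
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        scaled.append([(v - mean) / sd for v in col])
    return scaled


def first_principal_component(columns, n_iter=200, seed=0):
    """Return (loadings, per-line scores, fraction of variance explained).

    `columns` holds one list per environment, each with one value per
    line, in the same line order. Assumes complete, non-constant data.
    """
    z = standardize(columns)
    k, n = len(z), len(z[0])
    # Correlation matrix between environments.
    corr = [[sum(a * b for a, b in zip(z[i], z[j])) / (n - 1) for j in range(k)]
            for i in range(k)]
    # Power iteration from a random start converges to the leading eigenvector.
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in range(k)]
    for _ in range(n_iter):
        w = [sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(corr[i][j] * v[j] for j in range(k)) for i in range(k))
    # Score of each line: projection of its standardized values on the loadings.
    scores = [sum(v[i] * z[i][line] for i in range(k)) for line in range(n)]
    # The trace of a correlation matrix is k, so eigenvalue / k is the share explained.
    return v, scores, eigenvalue / k
```

The sign of a principal component is arbitrary, so scores from such a calculation may need to be flipped before positive and negative links in a network are interpreted.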