Qian Du, Malachy Campbell, Huihui Yu, Kan Liu, Harkamal Walia, Qi Zhang, Chi Zhang.
Abstract
Rice, an important food resource, is highly sensitive to salt stress, which makes its salt tolerance directly relevant to food security. Although many studies have identified physiological mechanisms that confer tolerance to the osmotic effects of salinity, the link between rice genotype and salt tolerance remains unclear. Associating a gene co-expression network with rice phenotypic data under stress has the potential to identify stress-responsive genes, but there is no standard method for associating a stress phenotype with a gene co-expression network. A novel method for integrating a gene co-expression network with stress phenotype data was developed to conduct a system analysis that links genotype to phenotype. We applied a LASSO-based method to the gene co-expression network of rice under salt stress to discover key genes and their interactions for salt tolerance-related phenotypes. Submodules within the gene modules identified from the co-expression network were selected by LASSO regression, which establishes a linear relationship between gene expression profiles and physiological responses, that is, sodium/potassium content under salt stress. Genes in these submodules have functions related to ion transport, osmotic adjustment, and oxidative tolerance. We argue that the genes in these submodules are biologically meaningful and useful for studies on rice salt tolerance. This method can be applied to other studies to efficiently and reliably integrate co-expression networks and phenotypic data.
Keywords: LASSO regression; co‐expression network; data integration; gene modules; linkage between genome to phenotype
Year: 2019 PMID: 31417977 PMCID: PMC6689793 DOI: 10.1002/pld3.154
Source DB: PubMed Journal: Plant Direct ISSN: 2475-4455
Figure 1. Flowchart of the algorithm to link phenotyping data to submodules in the gene co‐expression network
Figure 2. The WGCNA clustering result for the gene co‐expression network, shown as a heatmap. The heatmap shows the topological overlap matrix among all genes in different clusters; blocks of darker colors along the diagonal correspond to genes from the same module
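The topological overlap matrix shown in the heatmap can be illustrated with a minimal sketch. It assumes the commonly used unsigned definition, TOM_ij = (Σ_u a_iu·a_uj + a_ij) / (min(k_i, k_j) + 1 − a_ij), where k_i is the connectivity of gene i. The four-gene adjacency matrix below is made up for illustration, and the source does not state which TOM variant or soft-threshold power was used.

```python
def topological_overlap(adj):
    """Unsigned topological overlap from a symmetric adjacency matrix
    whose off-diagonal entries lie in [0, 1]."""
    n = len(adj)
    # Connectivity k_i: summed adjacency of gene i to every other gene.
    k = [sum(adj[i][u] for u in range(n) if u != i) for i in range(n)]
    tom = [[0.0] * n for _ in range(n)]
    for i in range(n):
        tom[i][i] = 1.0
        for j in range(i + 1, n):
            # Shared neighbours: paths i -> u -> j through any third gene u.
            shared = sum(adj[i][u] * adj[u][j]
                         for u in range(n) if u != i and u != j)
            value = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
            tom[i][j] = tom[j][i] = value
    return tom


# Hypothetical adjacency: genes 0-2 are tightly connected, gene 3 is peripheral.
toy_adj = [
    [0.0, 0.9, 0.8, 0.1],
    [0.9, 0.0, 0.7, 0.1],
    [0.8, 0.7, 0.0, 0.1],
    [0.1, 0.1, 0.1, 0.0],
]
toy_tom = topological_overlap(toy_adj)
for row in toy_tom:
    print(" ".join(f"{v:.3f}" for v in row))
```

Genes that share many strong neighbours get a high overlap, which is why genes from the same module form dark blocks along the diagonal once rows and columns are ordered by cluster.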
Figure 3. LASSO training result: cross‐validation errors plotted against log(λ) values over the search range
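The idea of choosing λ by cross‐validation and keeping only the predictors with nonzero coefficients can be sketched as follows. This is a minimal pure‐Python coordinate‐descent LASSO, not the authors' implementation. The predictors are random stand‐ins for module principal components, the response is a synthetic phenotype, and the λ grid, fold count, and all function names are assumptions made for illustration.

```python
import math
import random


def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values inside [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_fit(X, y, lam, n_iter=100):
    """Coordinate descent for (1/2n)*||y - b0 - X b||^2 + lam*||b||_1."""
    n, p = len(X), len(X[0])
    b0 = sum(y) / n
    beta = [0.0] * p
    resid = [yi - b0 for yi in y]
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
        # Re-centre the (unpenalised) intercept.
        shift = sum(resid) / n
        b0 += shift
        resid = [r - shift for r in resid]
    return b0, beta


def cv_error(X, y, lam, k=5):
    """Mean squared prediction error over k interleaved folds."""
    n = len(X)
    total = 0.0
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        b0, beta = lasso_fit([X[i] for i in train], [y[i] for i in train], lam)
        for i in test:
            pred = b0 + sum(x * b for x, b in zip(X[i], beta))
            total += (y[i] - pred) ** 2
    return total / n


# Synthetic data: 8 stand-in "module PC" predictors, only 0 and 3 matter.
random.seed(0)
X = [[random.gauss(0, 1) for _ in range(8)] for _ in range(60)]
y = [1.5 * row[0] - 2.0 * row[3] + random.gauss(0, 0.5) for row in X]

lambdas = [math.exp(v) for v in range(-6, 2)]  # log(lambda) from -6 to 1
cv_errors = [cv_error(X, y, lam) for lam in lambdas]
best_lam = lambdas[cv_errors.index(min(cv_errors))]
_, beta = lasso_fit(X, y, best_lam)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print("best log(lambda):", math.log(best_lam), "selected predictors:", selected)
```

Plotting `cv_errors` against the logarithm of each entry in `lambdas` gives a curve of the kind described in the caption; the predictors left in `selected` play the role of the significant submodule PCs.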
Significant submodules after LASSO selection based on their coefficient values
| Module # | 4 | 6 | 7 | 11 | 14 | 15 | 16 |
|---|---|---|---|---|---|---|---|
| 1st PC | | | | −.0325 | | −.0111 | |
| 2nd PC | | | | | | .0862 | .0869 |
| 3rd PC | −.0275 | .01257 | .0463 | | .0889 | | |
Figure 4. The contribution of genes to PC2 in Module #14 with the background
Overview of all significant modules
| Module #‐PC | No. of genes in module | No. of genes in submodule | Enriched GO term (annotated genes/submodule genes) | Adj. p‐value |
|---|---|---|---|---|
| 4‐3 | 891 | 67 | Transport (20/67) | 1.9 × 10⁻⁵ |
| 6‐3 | 313 | 53 | Response to stress (16/53) | 1.58 × 10⁻⁵ |
| 7‐3 | 184 | 24 | | |
| 11‐1 | 110 | 110 | Response to stress | 7.86 × 10⁻⁹ |
| 14‐3 | 46 | 3 | | |
| 15‐1 | 46 | 18 | Response to abiotic stimulus | .0089 |
| 15‐2 | 46 | 8 | | |
| 16‐2 | 34 | 7 | Cellular homeostasis (3/7) | 4.46 × 10⁻¹² |
Figure 5. The distributions of genes with respect to their correlation with each specific PC.
Figure 6. Simulation results comparing PR AUC between the LASSO and correlation methods. The x‐axis represents the different multiplying factors. The box plot displays the 25th and 75th percentiles around the median value. Magenta boxes represent the LASSO method, whereas cyan boxes represent the correlation method. Significance was calculated with the Wilcoxon signed‐rank test: p < .05 is labeled as *, p < .01 as **, and p < .001 as ***
Figure 7. Simulation results comparing PR AUC between the LASSO and correlation methods. The x‐axis represents the different effect sizes. The box plot displays the 25th and 75th percentiles around the median value. Magenta boxes represent the LASSO method, whereas cyan boxes represent the correlation method. Significance was calculated with the Wilcoxon signed‐rank test: p < .05 is labeled as *, p < .01 as **, and p < .001 as ***
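The evaluation described in the two simulation captions, a paired comparison of precision‐recall performance across replicates with a Wilcoxon signed‐rank test, can be sketched as below. Several things here are assumptions: average precision is used as a step‐wise summary of the PR curve rather than an exact PR AUC, the p‐value uses the normal approximation without tie correction, and `scores_a` and `scores_b` are synthetic stand‐ins for two ranking methods, not the LASSO and correlation methods themselves.

```python
import math
import random


def average_precision(scores, labels):
    """Mean of precision values at the rank of each true positive."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def wilcoxon_signed_rank(a, b):
    """Return (W+, two-sided p) for paired samples, normal approximation."""
    diffs = [x - y for x, y in zip(a, b) if x != y]  # zero differences dropped
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Tied absolute differences share the average of their ranks.
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        for t in range(i, j + 1):
            ranks[order[t]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4.0
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    z = (w_plus - mean) / sd
    return w_plus, math.erfc(abs(z) / math.sqrt(2.0))


# Synthetic replicates: 6 true genes among 30; method A has less noise than B.
random.seed(1)
labels = [1] * 6 + [0] * 24
ap_a, ap_b = [], []
for _ in range(50):
    scores_a = [2.0 * l + random.gauss(0, 1.0) for l in labels]
    scores_b = [2.0 * l + random.gauss(0, 3.0) for l in labels]
    ap_a.append(average_precision(scores_a, labels))
    ap_b.append(average_precision(scores_b, labels))

w_plus, p_value = wilcoxon_signed_rank(ap_a, ap_b)
print("mean AP (A):", sum(ap_a) / len(ap_a))
print("mean AP (B):", sum(ap_b) / len(ap_b))
print("Wilcoxon W+:", w_plus, "p:", p_value)
```

Because both methods are scored on the same replicate, a paired test such as the signed‐rank test is the natural choice; the star labels in the captions would then follow from thresholding `p_value` at .05, .01, and .001.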