Zakhar S Mustafin1,2, Vladimir I Zamyatin1,2,3, Dmitrii K Konstantinov1,3, Aleksej V Doroshkov1,3, Sergey A Lashin1,2,3, Dmitry A Afonnikov1,2,3.
Abstract
Plants constantly face stress factors such as high or low temperature, drought, soil salinity, and flooding. They have evolved a set of stress response mechanisms that involve physiological and biochemical changes and result in adaptive or morphological changes. At the molecular level, the stress response in plants is carried out by genetic networks, which themselves change in the course of evolution. Studying the structure and evolution of these networks may highlight the mechanisms of plant adaptation to adverse conditions and of the response to stress, and may help in the discovery and functional characterization of stress-related genes. We performed an analysis of Arabidopsis thaliana genes associated with several types of abiotic stress (heat, cold, water-related, light, osmotic, salt, and oxidative) at the network level using a phylostratigraphic approach. Our results show that a substantial fraction of the genes associated with various types of abiotic stress is of ancient origin and evolves under strong purifying selection. The interaction networks of genes associated with stress response have a modular structure, with a regulatory component being one of the largest for five of the seven stress types. We demonstrated a positive relationship between the number of interactions of a gene in the stress gene network and its age. Moreover, genes of the same age tend to be connected in stress gene networks. We also demonstrated that old stress-related genes usually participate in the response to various types of stress and are involved in numerous biological processes unrelated to stress. Our results demonstrate that the stress response genes represent one of the ancient and fundamental molecular systems in plants.
Keywords: A. thaliana; abiotic stress; divergence; gene family evolution; gene network; multifunctional genes; network structure; phylostratigraphic analysis
Year: 2019 PMID: 31766757 PMCID: PMC6947294 DOI: 10.3390/genes10120963
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.096
Figure 1. The phylostratigraphic map of A. thaliana and the phylogeny used in the search for the evolutionary origin of A. thaliana genes: 18 genomic phylostrata that correspond to the phylogenetic internodes. An asterisk (*) indicates three phylostrata that were excluded from the statistical analysis.
Number of GO terms and genes that have identified associations with studied stress types.
| Stress Type | Number of GO Terms | Number of Genes | KEGG Number of Genes |
|---|---|---|---|
| Cold | 4 | 150 | 144 |
| Heat | 14 | 102 | 102 |
| Light | 48 | 155 | 141 |
| Osmotic | 23 | 116 | 114 |
| Oxidative | 28 | 154 | 152 |
| Salt | 17 | 231 | 230 |
| Water | 27 | 215 | 211 |
The number of common genes between pairs of stress-related gene sets. Each cell in the table represents the number (and fraction, in parentheses) of genes from the set of the row that are shared with the set of the column. The last column represents the number of unique genes for the stress in the row.
| | Salt | Heat | Light | Water | Cold | Osmotic | Oxidative | Unique Genes |
|---|---|---|---|---|---|---|---|---|
| Salt | 232 | 13 (0.06) | 7 (0.03) | | 18 (0.08) | | 18 (0.08) | 126 (0.54) |
| Heat | | 102 | 8 (0.08) | | 8 (0.08) | 8 (0.08) | 6 (0.06) | 72 (0.71) |
| Light | 7 (0.05) | 8 (0.05) | 155 | 12 (0.08) | 9 (0.06) | 3 (0.02) | 6 (0.04) | 120 (0.77) |
| Water | | 11 (0.05) | 12 (0.06) | 216 | 18 (0.09) | | 11 (0.05) | 124 (0.59) |
| Cold | | 8 (0.05) | 9 (0.06) | | 150 | | 6 (0.04) | 93 (0.64) |
| Osmotic | | 8 (0.07) | 3 (0.03) | | | 117 | 12 (0.1) | 35 (0.3) |
| Oxidative | | 6 (0.04) | 6 (0.04) | 11 (0.07) | 6 (0.04) | 12 (0.08) | 154 | 118 (0.77) |
* The cells with fraction of genes larger than 0.1 are shown in bold.
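The pairwise comparison summarized in the table above can be illustrated with a minimal sketch. It assumes that each stress type is represented by a plain set of gene identifiers and that a "unique" gene is one found in no other stress set; the function name `overlap_table` and the toy gene IDs are hypothetical and are not taken from the study.

```python
def overlap_table(gene_sets):
    """For every ordered pair (row, column) return (shared count, fraction of the row set).

    Assumption: a gene is "unique" to a stress type if it occurs in no other set.
    """
    table = {}
    for row, row_genes in gene_sets.items():
        # Union of all gene sets except the row's own set
        others = set().union(*(g for name, g in gene_sets.items() if name != row))
        for col, col_genes in gene_sets.items():
            shared = len(row_genes & col_genes)
            table[(row, col)] = (shared, shared / len(row_genes))
        unique = len(row_genes - others)
        table[(row, "unique")] = (unique, unique / len(row_genes))
    return table


# Hypothetical toy data, not real A. thaliana gene sets
toy_sets = {
    "salt": {"g1", "g2", "g3", "g4"},
    "heat": {"g3", "g5"},
}
result = overlap_table(toy_sets)
print(result[("salt", "heat")])    # shared genes as (count, fraction of the salt set)
print(result[("heat", "unique")])  # genes found only in the heat set
```

Because the fraction is taken relative to the row set, the resulting table is not symmetric: the same shared count gives different fractions for sets of different size.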
Figure 2. The distribution of frequencies of A. thaliana protein-coding genes (y-axis) by PAI (x-axis) is shown as grey bars. Solid lines indicate the difference between the frequencies of occurrence of PAI values in the stress dataset and in all A. thaliana genes (dfPAI). The correspondence between line color and stress type is shown in the box in the upper right corner.
Comparison of the PAI distribution of genes in the gene networks of A. thaliana stress response with the corresponding distribution for the complete set of A. thaliana genes, according to the results of the permutation test. First row: types of stress. Second row: the proportion of random samples for which the average PAIrand value for a set of genes of the same size as the stress network exceeds the PAIstress value for the corresponding stress network. Third row: the fraction of random samples of genes for which the quadratic deviation ChiSqrand of the age distribution from the distribution for all genes is higher than in the corresponding gene network (ChiSqstress). Rows for individual phylostrata: the fraction of random samples of genes for which the difference in the proportion of genes of the i-th phylostratum (dfPAI) among stress genes exceeds the corresponding value for a random sample drawn from the whole gene set. All values in the cells must be multiplied by 10⁻⁵. PAI is calculated at the sequence similarity level ID = 0.5.
| Stress | Cold | Heat | Light | Osmotic | Oxidative | Salt | Water | All Stresses nr |
|---|---|---|---|---|---|---|---|---|
| PAI | | | | | | | | |
| ChiSq | | | | | | | | |
| 00_Cellular organisms | | | | | | | | |
| 01_Eukaryota | 11,218 | 35,449 | 34,137 | | 75,372 | | | |
| 02_Viridiplantae | 15,729 | 11,571 | 6603 | 32,387 | 18,710 | 83,664 | | 6394 |
| 04_Embryophyta | 47,489 | | 12,708 | 51,099 | 11,320 | 87,444 | 26,046 | 25,790 |
| 05_Tracheophyta | | 65,732 | 32,104 | 51,865 | 99,386 | 22,936 | 39,481 | 53,618 |
| 07_Magnoliophyta | 38,170 | | 26,294 | 23,881 | 49,013 | 45,825 | 82,207 | |
| 08_eudicotyledons | 94,830 | 62,021 | 78,520 | 8542 | 60,328 | 69,323 | 80,882 | |
| 10_Pentapetalae | 69,619 | 72,175 | 68,242 | 31,311 | 52,357 | 50,365 | 60,197 | |
| 11_rosids | 91,840 | 79,304 | 91,447 | 83,906 | 81,256 | | 72,339 | |
| 12_malvids | 15,610 | 39,144 | 49,460 | 42,420 | 16,775 | 67,429 | 63,996 | 78,703 |
| 13_Brassicales | 89,388 | 94,844 | 61,746 | 89,433 | | | | |
| 14_Brassicaceae | | 89,314 | | | | | | |
| 15_Camelineae | 89,271 | 74,850 | | 80,248 | 56,525 | | | |
| 16_Arabidopsis | 74,084 | 61,724 | 73,454 | 65,695 | 41,687 | 88,476 | 86,494 | |
| 17_A. thaliana | | | | | | | | |
* The values with p < 0.05 are bold and underlined; the values with p > 0.95 are underlined. ** p < 10⁻⁵.
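The permutation test summarized in the table above can be sketched roughly as follows. This is a minimal illustration, assuming that random samples are drawn without replacement from the full gene list and compared by their mean PAI; the function name `permutation_fraction`, the number of samples, and the toy PAI values are assumptions made for the example, not details confirmed by the source.

```python
import random


def permutation_fraction(all_values, stress_values, n_samples=2000, seed=0):
    """Fraction of random samples whose mean exceeds the mean of the stress set.

    Each random sample has the same size as the stress set and is drawn
    without replacement from all_values (an assumption of this sketch).
    """
    rng = random.Random(seed)
    size = len(stress_values)
    observed = sum(stress_values) / size
    exceed = 0
    for _ in range(n_samples):
        sample = rng.sample(all_values, size)
        if sum(sample) / size > observed:
            exceed += 1
    return exceed / n_samples


# Hypothetical PAI values: half of the genome is old (PAI 0), half is young (PAI 10)
genome_pai = [0] * 50 + [10] * 50
frac_old_set = permutation_fraction(genome_pai, [0] * 5)     # stress set of old genes
frac_young_set = permutation_fraction(genome_pai, [10] * 5)  # stress set of young genes
print(frac_old_set, frac_young_set)
```

The same scheme applies to any per-gene statistic, so the mean DI or a per-phylostratum proportion could be substituted for the mean PAI by changing the quantity that is compared.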
Figure 3. The distribution of frequencies of A. thaliana protein-coding genes (y-axis) by the DI value (x-axis) is shown by grey columns. Solid lines show the difference between the frequencies of occurrence of DI values in the stress dataset and in all A. thaliana genes. The correspondence between line color and stress type is shown in the box in the upper right corner.
Comparison of the divergence index (DI) distribution of genes in the gene networks of A. thaliana stress response with the corresponding distribution for the complete set of A. thaliana genes, according to the results of the permutation test. First row: types of stress. Second row: the proportion of random samples for which the average DIrand value for a set of genes of the same size as the stress network exceeds the DIstress value for the corresponding stress network. Third row: the fraction of random samples of genes for which the quadratic deviation ChiSqrand of the DI distribution from the distribution for all genes is higher than in the corresponding gene network (ChiSqstress). Rows for individual DI intervals: the fraction of random samples of genes for which the difference in the proportion of genes of the i-th DI interval (dfDIi) among stress genes exceeds the corresponding value for a random sample drawn from the whole gene set. All values in the cells must be multiplied by 10⁻⁵.
| Stress | Cold | Heat | Light | Osmotic | Oxidative | Salt | Water | All Stresses nr |
|---|---|---|---|---|---|---|---|---|
| DI | | | | | | | | |
| ChiSq | 7010 | 43,398 | | | 52,740 | | | |
| [0, 0.1] | 93,375 | 28,064 | 6137 | 53,230 | 38,498 | 10,701 | 38,527 | |
| (0.1, 0.2] | | | | | | | | |
| (0.2, 0.3] | 15,832 | 38,676 | 66,383 | | 64,490 | 32,404 | 9731 | 18,452 |
| (0.3, 0.4] | 49,562 | 40,493 | 90,569 | | 69,027 | | | |
| (0.4, 0.5] | 71,783 | 75,719 | 58,004 | 92,193 | 30,113 | | 94,603 | |
| (0.5, 0.6] | 70,023 | 73,560 | 68,603 | 92,023 | 86,198 | | | |
| (0.6, 0.7] | 95,009 | 84,956 | 99,071 | 88,993 | 22,528 | 93,999 | 82,333 | |
| (0.7, 0.8] | 76,327 | 58,418 | 51,318 | 88,964 | 55,802 | 93,241 | 77,350 | |
| (0.8, 0.9] | 47,813 | 67,878 | 79,538 | 72,465 | 50,554 | 92,166 | 90,674 | |
| (0.9, 1] | 61,938 | 49,902 | 61,515 | 53,707 | 64,025 | 78,219 | 75,853 | |
| (1, +∞) | 62,703 | 78,089 | 87,708 | 81,801 | 89,300 | | 82,213 | |
* Values with p < 0.05 are bold and underlined; values with p > 0.95 are underlined. ** p < 10⁻⁵.
Figure 4. Gene network reconstructed for the heat-associated gene set using the STRING tool. Node color corresponds to the PAI index of the gene, from 0 (dark blue) to 17 (red). Nodes added to the gene set by the STRING procedure of network reconstruction are outlined in red. The four clusters of genes are shown by rounded rectangles and numbered.
Pearson correlation coefficients r (k, PAI) between the node degree k and its PAI value in gene networks of different stresses. The second column shows the value of the correlation coefficient, the third one shows the significance level of its difference from 0.
| Stress | r (k, PAI) | p |
|---|---|---|
| Cold | 0.004 | 0.974 |
| Heat | | |
| Light | −0.125 | 0.248 |
| Osmotic | | |
| Oxidative | 0.019 | 0.875 |
| Salt | | |
| Water | −0.061 | 0.524 |
* Values for p < 0.05 are shown in bold.
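The correlation between node degree k and PAI reported in the table above can be illustrated with a short sketch. It assumes the network is available as a list of undirected edges and the PAI values as a dictionary keyed by gene; the helper names `node_degrees` and `pearson`, as well as the toy network, are hypothetical. The significance level is not computed here.

```python
import math
from collections import Counter


def node_degrees(edges):
    """Count the number of interactions (degree k) of every node in an undirected edge list."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return degree


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical toy network: the hub "a" is the oldest gene (PAI 0)
toy_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
toy_pai = {"a": 0, "b": 1, "c": 1, "d": 5}

degree = node_degrees(toy_edges)
genes = sorted(degree)
r_k_pai = pearson([degree[g] for g in genes], [toy_pai[g] for g in genes])
print(round(r_k_pai, 3))
```

In this toy network the coefficient is negative because the best-connected gene has the lowest PAI, that is, the oldest origin.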
Figure 5. The PAI versus k scatterplots for osmotic (A) and oxidative (B) stress gene networks. The x-axis represents the PAI; the y-axis shows the node degree k.
The coefficients of assortativity r for gene ages in stress-related gene networks and estimates of their standard deviation σ(r). Np: the number of pairs of interacting genes.
| Stress | Np | r | σ(r) |
|---|---|---|---|
| Cold | 148 | 0.251 | 0.345 |
| Heat | 196 | 0.126 | 0.294 |
| Light | 178 | 0.026 | 0.348 |
| Osmotic | 148 | 0.192 | 0.327 |
| Oxidative | 111 | 0.567 | 0.344 |
| Salt | 203 | 0.143 | 0.293 |
| Water | 213 | 0.031 | 0.307 |
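An assortativity coefficient for gene age can be sketched as the Pearson correlation between the ages at the two ends of each interaction, with every undirected pair counted in both directions. This is one common definition for a scalar node attribute; the source does not state which estimator was used, so the function `age_assortativity` and the toy data below are assumptions, and the standard deviation σ(r) is not estimated here.

```python
def age_assortativity(edges, age):
    """Assortativity of a scalar attribute over undirected edges.

    Each edge contributes both (u, v) and (v, u), so the two end-point
    samples share the same mean and variance. The result is undefined
    (division by zero) if all end points have the same age.
    """
    xs, ys = [], []
    for u, v in edges:
        xs += [age[u], age[v]]
        ys += [age[v], age[u]]
    mean = sum(xs) / len(xs)
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys))
    var = sum((x - mean) ** 2 for x in xs)
    return cov / var


# Hypothetical ages (PAI values) for four genes
toy_age = {"a": 0, "b": 0, "c": 10, "d": 10}

r_same_age = age_assortativity([("a", "b"), ("c", "d")], toy_age)   # same-age genes interact
r_mixed_age = age_assortativity([("a", "c"), ("b", "d")], toy_age)  # old genes pair with young
print(r_same_age, r_mixed_age)
```

Positive values indicate that genes of similar age tend to interact, which is the pattern the table reports to varying degrees for the stress networks.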
Figure 6. PAI distribution for genes associated with heat stress and annotated by various GO terms: (A) GO terms for “biological process”; (B) GO terms for “cellular component”; (C) GO terms for “molecular function”. The frequency of PAI occurrence is normalized to 100% for each GO term and shown by circles. The scale of the circles is shown on the right.