Construction and validation of a gene co-expression network in grapevine (Vitis vinifera L.).

Ying-Hai Liang1, Bin Cai2, Fei Chen2, Gang Wang2, Min Wang2, Yan Zhong2, Zong-Ming Max Cheng3.   

Abstract

Gene co-expression analysis has been widely used for predicting gene functions because genes within modules of a co-expression network may be involved in similar biological processes and exhibit similar biological functions. To detect gene relationships in the grapevine genome, we constructed a grapevine gene co-expression network (GGCN) by compiling a total of 374 publicly available grapevine microarray datasets. The GGCN consisted of 557 modules containing a total of 3834 nodes with 13 479 edges. The functions of the subnetwork modules were inferred by Gene Ontology (GO) enrichment analysis. In 127 of the 557 modules containing two or more GO terms, 38 modules exhibited the most significantly enriched GO terms, including 'protein catabolism process', 'photosynthesis', 'cell biosynthesis process', 'biosynthesis of plant cell wall', 'stress response' and other important biological processes. The 'response to heat' GO term was highly represented in module 17, which is composed of many heat shock proteins. To further determine the potential functions of genes in module 17, we performed a Pearson correlation coefficient test, analyzed orthologous relationships with Arabidopsis genes and established gene expression correlations with real-time quantitative reverse transcriptase PCR (qRT-PCR). Our results indicated that many genes in module 17 were upregulated during the heat shock and recovery processes and downregulated in response to low temperature. Furthermore, two putative genes, Vit_07s0185g00040 and Vit_02s0025g04060, were highly expressed in response to heat shock and recovery. This study provides insight into GGCN gene modules and offers important references for gene functions and the discovery of new genes at the module level.

Year:  2014        PMID: 26504546      PMCID: PMC4596334          DOI: 10.1038/hortres.2014.40

Source DB:  PubMed          Journal:  Hortic Res        ISSN: 2052-7276            Impact factor:   6.793


Introduction

The rapid accumulation of genome sequences and high-throughput microarray data provides rich materials for research on gene function and regulation at the system level.[1] However, integrating and exploiting these data sets has been challenging. Biological networks constructed by bioinformatic methods can help ‘put the function in genomics’[2] and allow researchers to understand how biomolecules interact with one another at the system level to perform specific biological functions in living plant cells.[3,4] The molecular interaction network is a type of biological network in which a node represents a gene, gene product or metabolite, and a link or edge refers to an interaction between them.[4] A gene co-expression network, in which nodes represent genes and links indicate their co-expression relationships, can exhibit topological properties such as small-world, hierarchically modular and scale-free structure.[5] A gene co-expression network can be divided into several substructures, including motifs, modules and pathways. Its substructure exhibits topological properties described by specific terms, such as network density, degree distribution, clustering coefficient and betweenness.[3] Co-expression network analysis is a powerful method to extract functional modules of co-expressed genes, analyze their biological meanings and identify important novel genes. In recent studies, several plant gene co-expression networks have been built and many functional modules have been inferred or identified.[6-13] For instance, Mao and colleagues[7] constructed an Arabidopsis gene-expression network and identified many functional modules associated with photosynthesis, protein biosynthesis, cell cycle, defense response and others, and these modules revealed new insights into gene function organization.
The expression of genes related to the same metabolic function may show co-expression patterns.[14] Wang and colleagues employed co-expression network analysis to identify related cell wall genes in Arabidopsis.[11] Gene modules were extracted in response to drought in rice by network-based analysis, and many hub genes clustered in some rice chromosomes have been found to significantly associate with quantitative trait loci (QTLs) for drought tolerance.[12] Microarray datasets and genome sequences provide an excellent opportunity to understand gene relationships and biological functions in the grapevine.[15,16] In this report, we constructed a GGCN by using 374 high-quality microarrays (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GPL1320). Qcut,[17] a graph partitioning algorithm, was applied to identify subnetwork modules from the gene co-expression network. The functions represented by the extracted modules were evaluated by GO enrichment analysis.[18] Next, we validated module 17 by examining gene expression by qRT-PCR and inferred that two putative uncharacterized proteins might be related to heat stress.

Materials and methods

Raw expression data

The grapevine microarray data set for the construction of the co-expression network was obtained from Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GPL1320) (platform accession number GPL1320). The platform consists of experimental samples using Affymetrix GeneChip Grapevine Genome Array. A total of 374 CEL files of samples from platform GPL1320 were used to construct the network and involved three treatment types (biotic stress, development, abiotic stress) and 13 series. The grapevine and Arabidopsis genome sequences were downloaded from Phytozome (http://www.phytozome.net).[15]

Annotation of probe sets and homolog search

A total of 16 436 probe sets from the Affymetrix Grapevine GeneChip were mapped to the grapevine gene loci in CRIBI (http://genomes.cribi.unipd.it/) using BlastN. If more than six probes from the set aligned perfectly to a gene, the probe set was assigned to that gene. Arabidopsis protein sequences and gene information were obtained from the Arabidopsis Information Resource release 10 (http://www.arabidopsis.org/). Grapevine protein sequences were used to search complete Arabidopsis protein sequences using BlastP with an e-value cutoff of 1e−4, and the best hits were selected as Arabidopsis orthologs.
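The probe-set assignment rule described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the input format (a list of perfect BlastN alignments as `(probe_set_id, probe_id, gene_id)` tuples), the function name `assign_probe_sets` and the toy hits are all assumptions made for the example.

```python
from collections import defaultdict

# "More than six" perfectly aligned probes, i.e. at least seven.
MIN_PERFECT_PROBES = 7


def assign_probe_sets(perfect_hits):
    """Assign probe sets to genes.

    perfect_hits: iterable of (probe_set_id, probe_id, gene_id) tuples,
    one per probe that aligned perfectly to a gene (hypothetical format).
    Returns {probe_set_id: [gene_id, ...]} for pairs passing the rule.
    """
    matched = defaultdict(set)
    for probe_set, probe, gene in perfect_hits:
        matched[(probe_set, gene)].add(probe)  # count distinct probes only

    assignment = defaultdict(list)
    for (probe_set, gene), probes in matched.items():
        if len(probes) >= MIN_PERFECT_PROBES:
            assignment[probe_set].append(gene)
    return dict(assignment)


# Toy input: eight probes hit one gene perfectly, three hit another.
hits = [("1616811_at", i, "Vit_10s0003g00260") for i in range(8)]
hits += [("1616811_at", i, "hypothetical_gene_B") for i in range(3)]
print(assign_probe_sets(hits))
```

Counting distinct probes per (probe set, gene) pair keeps repeated alignments of the same probe from inflating the count.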

Construction of GGCN

The construction of a gene co-expression network involves measuring gene expression similarity, visualizing gene expression data and identifying modular structures. To measure the similarity of gene expression, we utilized the Pearson correlation coefficient (PCC) between pairwise genes. The 374 arrays from Gene Expression Omnibus were normalized by the justRMA function in R/BioConductor.[19] Gene co-expression data were calculated in ATTED-II and applied to the PCC calculation (http://atted.jp/help/coex_cal.shtml). To determine the PCC cutoff threshold for network construction, the numbers of probe sets and edges and the network density (ND) were calculated over a range of PCC cutoffs. The network density was calculated from m and n, where m was the observed number of edges in the network and n was the number of nodes in the network. Co-expressed genes were selected at the chosen PCC cutoff threshold, and a co-expression network was constructed and visualized with Cytoscape software[20] (http://www.cytoscape.org/). The algorithm Qcut, which identifies statistically significant graph partitions in a biological network,[17] was applied to identify subnetwork modules from the co-expression network (http://www.mybiosoftware.com/pathway-analysis/12211).
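The thresholding step can be illustrated with a small pure-Python sketch. Two points are assumptions rather than statements from the text: an edge is drawn when the PCC is at or above the cutoff (positive correlation only), and the density uses the standard definition for an undirected simple graph, ND = 2m / (n(n − 1)). The exact formula the authors used is not reproduced in the text, so values may differ slightly from the reported ones. The function names and toy expression profiles are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # flat profile: correlation undefined, treated as 0 here
    return cov / math.sqrt(var_x * var_y)


def build_network(expression, cutoff):
    """Return (nodes, edges) for gene pairs with PCC >= cutoff (assumed rule)."""
    edges = [
        (g1, g2)
        for g1, g2 in combinations(sorted(expression), 2)
        if pearson(expression[g1], expression[g2]) >= cutoff
    ]
    nodes = {gene for edge in edges for gene in edge}
    return nodes, edges


def network_density(n_nodes, n_edges):
    """Standard undirected-graph density 2m / (n(n - 1)) (assumed definition)."""
    if n_nodes < 2:
        return 0.0
    return 2 * n_edges / (n_nodes * (n_nodes - 1))


# Toy expression profiles (hypothetical values, four arrays per gene).
toy = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8],
    "g3": [4, 3, 2, 1],
    "g4": [1, 3, 2, 4],
}
for cutoff in (0.5, 0.78, 0.9):
    nodes, edges = build_network(toy, cutoff)
    print(cutoff, len(nodes), len(edges), network_density(len(nodes), len(edges)))
```

Scanning several cutoffs and recording the node count, edge count and density at each one mirrors the procedure used to choose the threshold.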

GO enrichment analysis of modules in GGCN

GO annotations of grapevine genes were downloaded from agriGO (http://bioinfo.cau.edu.cn/agriGO/download.php). The GO enrichment was performed within each module using BiNGO 2.4.[18] The statistical significance of GO term enrichment was measured by a hypergeometric test[21] using the genes in the whole co-expression network as the background. A Bonferroni correction[22] was used to control the false positive rate in multiple testing, and a GO term was considered significantly enriched in a given module if the family-wise error rate (FWER) corrected p value was less than 0.05.
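The enrichment test can be sketched in a few lines of standard-library Python. This is an illustrative re-implementation of a one-sided hypergeometric test with Bonferroni correction, not BiNGO's code; the function names and the toy counts are hypothetical.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """One-sided enrichment p value P(X >= k).

    N: genes in the background (whole network)
    K: background genes annotated with the GO term
    n: genes in the module
    k: module genes annotated with the GO term
    """
    total = comb(N, n)
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def bonferroni(pvalues):
    """Bonferroni-adjusted p values: multiply by the number of tests, cap at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


# Toy example: background of 10 genes, 4 carry the term,
# module of 3 genes, 2 of which carry the term.
p = hypergeom_pvalue(k=2, n=3, K=4, N=10)
print(p)
print(bonferroni([p, 0.01, 0.5]))
```

A term would then be called enriched when its adjusted p value falls below 0.05, following the threshold stated above.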

Validation of gene expression in module 17 by qRT-PCR

Pinot Noir PN40024 (the genotype from which the reference genome sequence was derived) was subcultured in vitro on 3/4 Murashige and Skoog medium[23] at 22 °C with a 16-h/8-h photoperiod and an illumination intensity of 150 μmol m−2 s−1 for 6 weeks. Young leaves, including second and third expanding leaves, were sampled for gene expression analysis. To analyze the response of module 17 genes to continuous heat shock stress, whole plants were treated at 40 °C for 0.5, 1, 2, 3 or 6 h in the plant growth chamber. Meanwhile, to analyze the heat shock recovery response, a fraction of the plants that were heat-shocked for 1 h was returned to the original temperature (22 °C) for 2 h and 5 h (the third hour or sixth hour from the beginning of heat shock). Plants without heat shock treatment were used as controls and handled in an identical manner. To analyze responses to low temperature, a set of plants was placed in a plant growth chamber at 4 °C for 1 h. All the plant samples were then frozen in liquid nitrogen before total RNA extraction and first strand cDNA synthesis by the reported method.[24] We designed 29 pairs of oligonucleotide primers (Supplementary Table 1) for the genes in module 17 with Primer 5.0 (http://www.premierbiosoft.com/crm/jsp/com/pbi/crm/clientside/ProductList.jsp) according to the putative cDNA sequences of the grapevine genome. PCR amplification was carried out in a 25 μL reaction solution consisting of 20 ng template cDNA, 2.0 mM MgCl2, 2.5 μL 10× PCR buffer, 200 μM dNTP, 0.2 pM of each primer and 0.25 U Taq DNA polymerase. To validate the specificity of PCR products, the amplicons were cloned into a pMD19-T vector (Takara, Dalian, China), sequenced at Shanghai Invitrogen Biotechnology Co., Ltd (2715 Longwu Road, Shanghai 200231, China) according to the protocol[24] and aligned onto the grapevine reference genome.
The qRT-PCR oligonucleotide primers (Table 1) targeting the expressed grapevine genes in module 17 (response to environmental stress) were designed with Beacon Designer 7.0 (http://www.premierbiosoft.com/molecular_beacons/). Because of high homology and some unknown gene information, all primers were blasted against the grapevine reference genome sequences. Each primer differs from non-target genes by at least three nucleotides, and at least one nucleotide at the 3′-end.[25]
Table 1

qRT-PCR primer sequences of genes in module 17

Gene number | Grapevine gene | Forward primer followed by reverse primer (5′ to 3′)
1 | Vit_10s0003g00260 | TCAACATCAAGTTTCCAACAAGGACAGTCGCACATCATTAGCC
2 | Vit_07s0185g00040 | AGGATGCGAGAGGATGAGACACAAGAGAAACACCAGACAAGG
3 | Vit_13s0019g03160 | AGTTCCTTCGTCGGTTCAGGCCTTCACCTCAGCCTTC
4 | Vit_18s0041g01230 | GTCAACAACCCAAACTATCAAGGGCACCATCATATCATATACACTCC
5 | Vit_02s0025g04060 | TTGATAGTATGTCTGAGTTATGGAGCCTTGGGTGTGAAACAAATGG
6 | Vit_04s0008g01590 | TTGAGGTGAAGGTTGCTTGAGCATACTGACTTGGGAGACATCG
7 | Vit_06s0004g04470 | CATAAGAAGGATATTAGCGGAAGTGTTGTGTAGAAATCAATACCATCGA
9 | Vit_16s0050g01150 | GACCTTGTGATGCTCCTATATGATCTTGCTCTCCTCATTGCC
11 | Vit_01s0010g02290 | GTATGACCAAGGATGATGTGAAGACTCCATCTTTGACCTCTGC
12 | Vit_16s0098g01060 | TGGAGGATGACTTGCTTGTGCTCTACCTTGGTCTTAGGAATGG
13 | Vit_11s0016g04080 | GTGAACAAGGCTATCCGGTCTCATCTTCTTCTCCAACCTCG
14 | Vit_07s0005g01980 | GGGGTTTGTCACGGTTAGGTATGACTGGAAGTAATTTGCC
15 | Vit_17s0000g07190 | TAGATGCGGGAGTGTCAGGCCTCTTCGTCTTCTATTTCTTCG
19 | Vit_19s0085g01050 | GAGTTCAAGAGTCAAGACACAGACCTCCAGTTTCACCTCATTC
20 | Vit_06s0004g06010 | GCTATTATAGAAGGCGGCATTACGACCCAGGAGTGAGAGACC
22 | Vit_13s0019g00860 | AAGGTGGAGATAGAAGATGGAAACTGGAACAACGATGGTGAGAAC
23 | Vit_08s0007g00130 | GATTGAGGATGCCATTGAGCTCTTTGCTATGATGGGGTTG
24 | Vit_16s0022g00510 | AGATACAGCAGCAGAATTGATTTGTCAGTCCTCTCCTCTTCCTTCAG
26 | Vit_06s0004g05770 | GTTCTTACTGTTACTGTTCCTAAGAAGCGCTGATATATGATATGATGGTCTC

There were 41 nodes (probes) in module 17. Among them, 29 probes were matched with grapevine genes annotated by CRIBI Genomics, University of Padua (http://genomes.cribi.unipd.it/). However, the genes numbered 8, 10, 16, 17, 18, 21, 25, 27, 28 and 29 in module 17 did not express in response to heat shock or cold treatment stress and were therefore not cloned (listed in Table 1).

The qRT-PCR reaction was carried out in a 20 μL reaction solution consisting of 10 μL SYBR (Takara), 8.7 μL ddH2O, 1 μL cDNA diluted 10-fold and 0.15 μL of each specific primer. qRT-PCR amplifications were performed with the following procedure: 94 °C for 4 min and 40 cycles of 94 °C for 20 s, 60 °C for 20 s and 72 °C for 43 s. The qRT-PCR data were analyzed as previously described.[25] Each treatment data point represents three biological replicates (individual plants) with three technical replicates each. The actin-101-like gene (VIT_12S0178g00200) was used as an internal reference. The expression ratio was calculated as previously described.[16,25]
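As an illustration of how an expression ratio is commonly derived from qRT-PCR data, the sketch below uses the widely used 2^(−ΔΔCt) calculation with a single reference gene. That this is the formula applied in the study is an assumption (the figure legends report −ΔΔCt on the Y-axis, which is consistent with it); the function names and Ct values are hypothetical.

```python
def delta_delta_ct(ct_target_treated, ct_ref_treated,
                   ct_target_control, ct_ref_control):
    """ΔΔCt = (Ct_target - Ct_ref) in treated minus the same in control."""
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return delta_treated - delta_control


def expression_ratio(ddct):
    """Fold change relative to control, assuming 100% PCR efficiency."""
    return 2.0 ** (-ddct)


# Hypothetical mean Ct values (target gene vs. actin reference).
ddct = delta_delta_ct(22.0, 18.0, 25.0, 18.0)
print(-ddct, expression_ratio(ddct))
```

In practice each Ct would be the mean of the technical replicates for one biological replicate, and the ratios would then be summarized across biological replicates.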

Goodness of fit test of gene expression in module 17

To test the goodness of fit of all gene expression values between each pair of time points during heat shock and recovery, we employed ‘LOESS’ (locally weighted scatterplot smoothing)[26] and ‘Linear’ (a simple linear regression) to add a fit line and calculate R2, the coefficient of determination,[27] with SPSS 19.0 software.[28] First, a matrix scatterplot was created between the variables ‘gene expression value’ and ‘treatment time point’ following the steps Graphs→Legacy Dialogs→Scatter/Dot→Matrix Scatter, and a fit line was added to the matrix scatterplot by ‘LOESS’ with the parameters 95% individual confidence intervals, 30% of points to fit and the Epanechnikov kernel function. Second, ‘Linear’ was performed with 95% individual confidence intervals following the steps Graphs→Legacy Dialogs→Scatter/Dot→Matrix Scatter→Linear. The R2 values between the dependent and independent variables ‘gene expression value’ and ‘treatment time point’ in the linear regression were obtained for goodness of fit analysis.[27,28]
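The linear part of this analysis can be approximated outside SPSS. The sketch below fits an ordinary least-squares line between the expression values of the same genes at two time points and returns R2; it is an illustrative re-implementation under that reading of the procedure, it does not cover the LOESS fit, and the function name and toy values are hypothetical.

```python
def r_squared(x, y):
    """Coefficient of determination of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical expression values of four genes at two time points.
time_point_a = [1.0, 2.0, 3.0, 4.0]
time_point_b = [1.0, 3.0, 2.0, 4.0]
print(r_squared(time_point_a, time_point_b))
```

For a simple linear regression, R2 equals the squared Pearson correlation, so the value is the same whichever time point is treated as the independent variable, which is why a matrix of such values is symmetric.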

Results

The raw microarray data could be divided into the following three categories: biotic stress, development and abiotic stress. The array accessions and the experimental conditions are listed in Table 2. After normalization of gene expression values, the PCC was calculated between each pair of the 16 436 probe sets. An appropriate PCC cutoff value is necessary to construct a co-expression network. Figure 1 reveals a negative correlation between the network density and PCC cutoff values. At approximately 0.78, the network density approached the minimal value and then increased gradually. The PCC cutoff value of 0.78 was therefore chosen to screen significant co-expression correlations from the large-scale expression data set (Figure 1). At the PCC cutoff value of 0.78, the network contained 3834 nodes (probe sets) with 13 479 edges (Figure 2 and Supplementary Table 2) and a network density of 0.001856078. The GGCN view was created by the Cytoscape software package.[20]
Table 2

Microarray data used to construct the grapevine co-expression network

Condition | Series ID | Number of gene chips | Experimental conditions
Biotic stress | GSE6404 | 72 | Erysiphe necator conidiospores infection
Biotic stress | GSE11857 | 12 | Downy mildew infection
Biotic stress | GSE12842 | 10 | Bois noir infection
Biotic stress | GSE31660 | 14 | Viral diseases in berry
Development | GSE31674 | 27 | Berry transcriptome during ripening
Development | GSE31664 | 12 | Skin transcriptome in the berries
Development | GSE31662 | 8 | Grape skin transcriptome in the berries
Development | GSE11406 | 32 | Berries during ripening initiation
Development | GSE17502 | 84 | Photoperiod regulation of bud dormancy
Abiotic stress | GSE31677 | 39 | Salt and water stress
Abiotic stress | GSE31675 | 12 | High temperature
Abiotic stress | GSE31594 | 48 | Short term abiotic stress
Abiotic stress | GSE27180 | 4 | Micropropagated plants were transferred to ex vitro conditions
Figure 1

Relationship between network densities and PCC cutoff values.

Figure 2

The co-expression network of grapevine genes. A red dot represents a node, and a blue line connecting two nodes represents an edge.

Modules in GGCN

A partitioning analysis of the 3834 nodes detected 557 modules with a Q value of 0.78, demonstrating a strong modular structure. The modular structure, one of the important features of a biological network, indicates the interaction of biomolecules at the system level. All modules in the GGCN were completely independent and differed in size (Figure 2 and Supplementary Table 2). For instance, the two largest modules, module 1 and module 2, each contained 312 nodes, but with 1521 and 2284 edges, respectively, and the smallest modules had only two nodes (Supplementary Table 2). BiNGO 2.4,[18] a Cytoscape plugin, was used to perform GO term enrichment analysis of biological processes. A total of 127 modules that contained more than two nodes were analyzed using the 1256 probes with a biological process GO term as the custom reference set. As a result, 38 modules were identified with significantly over-represented GO terms with a FWER-adjusted p<0.05 from the hypergeometric test.[21] Table 3 lists the most significantly enriched functional category of each module and the number of occurrences of that GO term in the module and in the grapevine gene co-expression network. Because the biotic or abiotic stress response and its regulation are important biological processes in plants, we highlight the details of one interesting module here, module 17, which responds to environmental stresses (Figure 3 and Table 4).
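To make the Q value concrete, the sketch below evaluates the Newman-Girvan modularity of a given partition of an undirected, unweighted network. It only scores a partition and does not implement the Qcut optimization itself; treating the reported Q as this standard modularity is an assumption, and the function name and toy graph are hypothetical.

```python
from collections import defaultdict


def modularity(edges, module_of):
    """Newman-Girvan modularity Q = sum_c [ e_c/m - (d_c/(2m))^2 ].

    edges: list of (u, v) undirected edges without duplicates
    module_of: dict mapping each node to its module label
    e_c: edges inside module c; d_c: sum of node degrees in module c.
    """
    m = len(edges)
    intra_edges = defaultdict(int)
    degree_sum = defaultdict(int)
    for u, v in edges:
        cu, cv = module_of[u], module_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            intra_edges[cu] += 1
    return sum(
        intra_edges[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Toy network: two disconnected triangles, each treated as one module.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f")]
partition = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 2}
print(modularity(edges, partition))
```

Values of Q near zero indicate a partition no better than random, while values approaching one indicate densely connected modules with few links between them.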
Table 3

Significantly enriched GO terms in 38 modules

Module | GO term description | GO term | p value
1 | Protein catabolic process | 13/30 | 2.1×10−5
2 | Ribonucleoprotein complex biogenesis | 152/207 | 3.0×10−90
3 | Photosynthesis | 54/69 | 1.0×10−40
4 | Cellular amine metabolic process | 18/82 | 2.6×10−2
5 | Response to salicylic acid stimulus | 5/8 | 2.1×10−4
7 | Carbohydrate metabolic process | 18/102 | 2.4×10−5
11 | DNA metabolic process | 21/40 | 5.7×10−19
12 | ATP synthesis coupled electron transport | 9/16 | 1.5×10−8
15 | Cellular biosynthetic process | 34/408 | 4.4×10−7
17 | Response to heat | 11/31 | 3.5×10−10
20 | Plant-type cell wall biogenesis | 6/7 | 1.5×10−9
24 | Response to auxin stimulus | 3/10 | 2.8×10−2
25 | Phenylpropanoid biosynthetic process | 9/28 | 6.7×10−11
26 | ATP metabolic process | 5/14 | 1.6×10−5
29 | Protein folding | 6/57 | 1.0×10−5
30 | Lipid transport | 3/14 | 2.1×10−2
31 | Flavonoid biosynthetic process | 6/8 | 6.2×10−11
34 | Response to wounding | 3/10 | 3.5×10−5
35 | Carboxylic acid metabolic process | 6/141 | 3.4×10−4
36 | Response to biotic stimulus | 5/37 | 6.1×10−6
37 | Protein ubiquitination | 2/14 | 5.9×10−3
38 | Acyl-carrier-protein biosynthetic process | 4/25 | 1.1×10−4
42 | Metal ion transport | 3/18 | 9.9×10−5
48 | Modification-dependent protein catabolic process | 4/24 | 2.1×10−6
51 | Nucleic acid metabolic process | 4/96 | 2.5×10−3
57 | Cell redox homeostasis | 3/15 | 1.3×10−4
75 | Fatty acid biosynthetic process | 3/21 | 8.9×10−5
79 | Water homeostasis | 1/1 | 2.1×10−2
83 | One-carbon metabolic process | 3/9 | 7.9×10−6
87 | Xylulose metabolic process | 1/1 | 3.6×10−2
96 | Regulation of cell cycle | 2/6 | 1.6×10−3
101 | Nucleosome assembly | 2/25 | 4.6×10−2
105 | D-xylose metabolic process | 3/3 | 9.1×10−8
107 | Oligosaccharide metabolic process | 2/29 | 3.4×10−2
112 | Ketone biosynthetic process | 3/13 | 3.1×10−5
115 | Chitin catabolic process | 3/9 | 5.1×10−6
124 | Lipid transport | 3/14 | 1.8×10−5
139 | Response to chlorate | 3/3 | 5.5×10−8

The ‘GO term’ column gives the number of occurrences of the GO term in the module/in the whole grapevine gene co-expression network.

Figure 3

The fraction of module 17 enriched with the GO term ‘in response to heat stress’. Red circles represent nodes, the blue lines represent edges, and the numbers in the red circles represent gene chip probes.

Table 4

Gene ontology enrichment analysis in module 17

GO ID | p value (FWER corrected) | Number of GO terms in module 17/in GGCN | Description
6950 | 4.0537×10−18 | 26/183 | Response to stress
50896 | 1.0848×10−13 | 26/267 | Response to stimulus
9408 | 3.5017×10−10 | 11/31 | Response to heat
9266 | 4.5005×10−8 | 11/46 | Response to temperature stimulus
9644 | 3.2480×10−7 | 6/9 | Response to high light intensity
9642 | 3.4062×10−6 | 6/12 | Response to light intensity
9628 | 9.9960×10−6 | 12/92 | Response to abiotic stimulus
42542 | 1.7589×10−5 | 6/15 | Response to hydrogen peroxide
10035 | 2.7093×10−5 | 7/25 | Response to inorganic substance
302 | 1.2576×10−4 | 20/29 | Response to reactive oxygen species
6979 | 3.4874×10−3 | 6/34 | Response to oxidative stress
9416 | 6.7133×10−3 | 6/38 | Response to light stimulus
9314 | 6.7133×10−3 | 6/38 | Response to radiation
6986 | 2.3696×10−2 | 2/2 | Response to unfolded protein
43335 | 2.3696×10−2 | 2/2 | Protein unfolding
35966 | 2.3696×10−2 | 2/2 | Response to topologically incorrect protein

Module 17, a module in response to environmental stresses

We examined module 17 in detail because of our interest in stress responses and because it was enriched with GO terms related to environmental stresses. Module 17 contained 41 nodes (genes) and 89 edges and was significantly enriched with 16 GO terms (p≤2.3696×10−2) (Figure 3 and Table 4). The over-represented GO terms include ‘response to stimulus’, ‘response to high light intensity’, ‘response to abiotic stimulus’, ‘response to oxidative stress’, ‘response to hydrogen peroxide’ and particularly ‘response to heat’ (GO: 0009408) (p=3.5017×10−10). A total of 19 genes in module 17 encode heat shock proteins (HSPs), including members of the HSP20, HSP40, HSP70, HSP90 and HSP100 families (Table 5).
Table 5

Homologous genes between 29 grapevine genes in module 17 and those in Arabidopsis thaliana

Gene number | Grapevine gene | Probe number | Homologs in Arabidopsis thaliana | Information of gene classification and function
1 | Vit_10s0003g00260 | 1616811_at | AT2G20560 | DNAJ heat shock protein
2 | Vit_07s0185g00040 | 1621759_s_at | AT3G07150 | Unknown protein
3 | Vit_13s0019g03160 | 1616145_a_at | AT1G53540 | HSP17.6C-CI
4 | Vit_18s0041g01230 | 1616369_at | AT5G49910 | Chloroplast HSP70-2; ATP binding
5 | Vit_02s0025g04060 | 1611927_at | AT4G11740 | Unknown protein
6 | Vit_04s0008g01590 | 1611192_at | AT5G12020 | HSP17.6II
7 | Vit_06s0004g04470 | 1621357_s_at | AT5G02500 | HSC70-1; ATP binding
8 | Vit_04s0008g01490 | 1614330_at | AT5G12020 | HSP17.6II
9 | Vit_16s0050g01150 | 1618066_a_at | AT5G52640 | HSP90.1; ATP binding
10 | Vit_08s0007g00740 | 1613948_at | AT3G09350 | Armadillo/beta-catenin repeat family protein
11 | Vit_01s0010g02290 | 1608828_at | AT4G27670 | HSP21
12 | Vit_16s0098g01060 | 1620985_at | AT4G27670 | HSP21
13 | Vit_11s0016g04080 | 1621552_at | AT3G24500 | MBF1C
14 | Vit_07s0005g01980 | 1609808_at | AT2G47180 | GolS1
15 | Vit_17s0000g07190 | 1615503_at | AT1G74310 | HSP101; ATP binding
16 | Vit_17s0000g00070 | 1611931_at | AT5G07330 | Unknown protein
17 | Vit_13s0047g00110 | 1606746_a_at | AT4G02450 | Glycine-rich protein
18 | Vit_11s0078g00260 | 1608348_a_at | AT5G35320 | Unknown protein
19 | Vit_19s0085g01050 | 1616538_at | AT1G53540 | HSP17.6C-CI
20 | Vit_06s0004g06010 | 1615761_at | AT1G07350 | Arginine-rich ribonucleoprotein
21 | Vit_05s0020g03330 | 1621709_at | AT2G32120 | HSP70T-2; ATP binding
22 | Vit_13s0019g00860 | 1622489_at | AT5G37670 | HSP15.7-CI
23 | Vit_08s0007g00130 | 1609949_at | AT3G12580 | HSP70; ATP binding
24 | Vit_16s0022g00510 | 1616889_at | AT4G25200 | Mitochondrion-localized HSP23.6
25 | Vit_08s0217g00090 | 1611195_at | AT3G08970 | Endoplasmic reticulum-localized J protein
26 | Vit_06s0004g05770 | 1621652_at | AT1G07400 | HSP17.8-CI
27 | Vit_02s0154g00480 | 1620348_at | AT4G25200 | Mitochondrion-localized HSP23.6
28 | Vit_12s0035g01910 | 1613858_at | AT4G10250 | HSP22.0
29 | Vit_18s0089g01270 | 1609222_at | AT4G10250 | HSP22.0

Module 17 contains 41 nodes (probes). Among them, 12 probe sets were not matched with grapevine genes annotated by CRIBI Genomics, University of Padua (http://genomes.cribi.unipd.it/) (listed in Supplementary Table 2). These probe sets were 1609554_at, 1615503_at, 1607291_at, 1610779_at, 1613154_at, 1622489_at, 1616706_at, 1611195_at, 1621902_at, 1610122_at, 1616049_at and 1618545_a_at. Therefore, 29 grapevine genes are listed in this table.

Plants respond to various stresses in a similar manner, by producing HSPs that protect cells against many stresses.[29] The accumulation of HSPs plays a key role in acquired heat tolerance during heat stress.[30] MBF1C (Vit_11s0016g04080) is an important transcription factor that responds to stresses,[31] and as a key regulator of heat tolerance in Arabidopsis thaliana, the MBF1C protein accumulates rapidly during heat stress. The inositol galactoside (GolS2) enzyme (Vit_07s0005g01980) is a key synthase that regulates the drought and cold responses.[32] Liu et al.[33] inferred that galactinol synthase may be important for grapevine heat tolerance. The endoplasmic reticulum-localized J protein Vit_08s0217g00090 is an important molecular chaperone of HSP70.[34] In addition, four putative uncharacterized proteins in module 17, Vit_07s0185g00040, Vit_02s0025g04060, Vit_17s0000g00070 and Vit_11s0078g00260, are clearly connected to other nodes involved in the stress response, but no information about their domains and homologous alignments is available. Therefore, we considered these four putative genes to have unknown functions in the stress response.

Expression patterns of genes in module 17 at different time points after heat shock and recovery

We tested module 17 in response to heat shock, one environmental stress. When grapevine plants were treated with heat shock at 40 °C for 6 h, 19 of 29 genes in module 17 were upregulated and their expression quantities exhibited variable regulation from low-level to high-level, ranging from 1.86- to 11.63-fold (Figure 4a−4e). However, some gene expression quantities maintained a high level from 0.5 h to 6 h, ranging from 6.85- to 11.63-fold (p<0.01). These included Vit_13s0019g03160, Vit_04s0008g01590, Vit_16s0098g01060, Vit_07s0005g01980 and Vit_19s0085g01050, which encode HSP17.6, HSP17.6, HSP21, galactinol synthase 1 and HSP17.6, respectively, in which galactinol synthase 1 (GolS1) is a heat shock factor target gene responsible for the heat-induced synthesis of the raffinose family of oligosaccharides in Arabidopsis.[35]
Figure 4

Gene expression patterns in module 17 treated with heat shock and recovery at different time points. a–e: heat shock for 0.5, 1, 2, 3 and 6 h, respectively. f–g: heat shock recovery for 2 and 5 h after plants were treated at 40 °C for 1 h, respectively. The value in the Y-axis is −ΔΔCt. The expression ratio of a gene was considered significant if *p<0.05. Expression ratio of genes was significant if **p<0.01. The numbers from 1 to 26 on the X-axis represent the grapevine genes listed under ‘gene number and grapevine gene’ in Table 1.

Moreover, 12 of 19 genes were still upregulated significantly (p<0.01) after 2 h and 5 h of recovery. After 2 h of recovery, 6 of 19 genes were downregulated significantly up to 3.02-fold (p<0.01) (Figure 4f), including Vit_08s0007g00130, Vit_16s0022g00510 and Vit_11s0016g04080. After 5 h of recovery, only two genes among them were downregulated significantly (p<0.01) (Figure 4g), and the other four genes recovered from their downregulated states. However, 3 out of 19 genes, Vit_04s0008g01590, Vit_16s0098g01060 and Vit_19s0085g01050, which expressed highly at 40 °C for 6 h, still maintained high-level expression after 2 h and 5 h of recovery, ranging from 4.49- to 8.49-fold (p<0.01). Therefore, our results indicate that genes in module 17 have different gene functions, and their mechanisms during heat shock and transient states may be complex. The expression of two putative uncharacterized genes, Vit_07s0185g00040 (ranging from 1.12- to 4.72-fold) and Vit_02s0025g04060 (ranging from 0.47- to 5.66-fold), was also detected during heat shock and recovery. Based on the GGCN analysis, no homologous alignment or annotation information is available about their sequences, domains or gene expression in NCBI (http://www.ncbi.nlm.nih.gov/cdd) or in CRIBI Genomics, University of Padua (http://genomes.cribi.unipd.it/). Expression values in response to heat shock and recovery between each two time points were plotted together for the 19 genes in module 17 using the SPSS program[28] and treated with LOESS[26] (Figure 5). The best goodness-of-fit values were those at adjacent time points. Moreover, most R2 between the dependent and independent variables ‘gene expression value’ and ‘treatment time point’ were close to 1.0 at adjacent time points[36] (Table 6), which indicated a strong linear relationship between compared variables. 
The goodness-of-fit analysis indicated that under the same tempospatial conditions, as a whole network, these genes display a clear co-expression relationship.
Figure 5

The goodness of fit test of 19 gene expression values in module 17 between each two time points treated with heat shock and subsequent recovery. The fit lines were added by using LOESS in the matrix scatterplot. ‘HS’ represents heat shock treatment. ‘HS_R’ represents recovery after heat shock treatment.

Table 6

‘Goodness-of-fit’ test of 19 gene expression values in module 17 between each ‘two time points’ treated with heat shock and recovery

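R2 | HS_0.5 h | HS_1 h | HS_2 h | HS_3 h | HS_6 h | HS_R_2 h | HS_R_5 h
HS_0.5 h | - | 0.961 | 0.880 | 0.825 | 0.829 | 0.659 | 0.591
HS_1 h | 0.961 | - | 0.944 | 0.882 | 0.849 | 0.679 | 0.597
HS_2 h | 0.880 | 0.944 | - | 0.916 | 0.925 | 0.809 | 0.725
HS_3 h | 0.825 | 0.882 | 0.916 | - | 0.905 | 0.754 | 0.727
HS_6 h | 0.829 | 0.849 | 0.925 | 0.905 | - | 0.799 | 0.838
HS_R_2 h | 0.659 | 0.679 | 0.809 | 0.754 | 0.799 | - | 0.835
HS_R_5 h | 0.591 | 0.597 | 0.725 | 0.727 | 0.838 | 0.835 | -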

R2 represents the coefficient of determination between the dependent and independent variables ‘gene expression value’ and ‘treatment time point’ in the linear regression. ‘HS’ represents heat shock treatment. ‘HS_R’ represents recovery after heat shock treatment.

The PCC values of gene expression were significantly greater than 0.78 (Supplementary Table 3). Similarly, across the different time points of heat shock and the recovery process, most PCC values were also greater than 0.78, which indicates that most genes were significantly co-expressed (Supplementary Table 3). Therefore, gene co-expression ‘in response to heat’ represented by module 17 was validated experimentally by qRT-PCR and by PCC analysis of gene expression, given that most genes were upregulated together very significantly (p<0.01) and most PCC values were greater than the PCC cutoff value, 0.78, which was used to screen significant co-expression correlations from the large-scale expression data set. Among the 29 genes in module 17 that corresponded to ‘responses to heat stress’, 10 genes showed no response to heat shock, which suggests that these genes may be co-expressed under other tempospatial heat stress conditions or in response to other environmental stresses, such as ‘response to high light intensity’, ‘response to oxidative stress’ or ‘response to hydrogen peroxide’, because expression of these genes might be regulated depending on time, space and environmental conditions.[37] This regulation may act at many levels, such as chromatin structure, transcription, transcript stability or localization, and translation. The homologous gene comparison for ‘response to heat’ matched quite well between module 17 grapevine genes and those involved in the heat stress response in A. thaliana (Table 5).

Expression patterns of genes in module 17 after low temperature treatment

In contrast to their upregulation under heat shock, most of the 19 genes were downregulated in response to low temperature (4 °C) treatment, by 1.05- to 4.55-fold (Figure 6). To further test the co-expression relationship between these genes, the PCC values of the 19 gene expression values were calculated. Supplementary Table 4 shows that 45.91% of them were greater than 0.78; thus, judged by PCC values, the co-expression relationship of these genes was less pronounced than after heat shock treatment.
Figure 6

Gene expression patterns in module 17 after treatment with low temperature at 4 °C for 1 h. The value on the Y-axis is −ΔΔCt. Expression ratio of genes was considered significant if **p<0.01.

Discussion

Plant growth, development and adaptation to the environment are complex, yet highly coordinated, processes. One way to understand these complex processes is to establish gene co-expression networks from which we can predict putative functions of genes in the network because genes sharing a module in a co-expression network are likely involved in similar biological processes.[3,7] In this study, we constructed a GGCN at the genome-wide level with publicly available microarray data using the efficient heuristic algorithm Qcut, which is based on the optimization of a modularity function (Q) and combines spectral graph partitioning and local search to optimize Q.[17] Moreover, nodes were densely linked with each other within a subnetwork module, but connections between subnetwork modules were sparse or absent. The gene-to-gene PCC derived from gene expression data in Gene Expression Omnibus allowed us to partition these co-expressing genes into network modules across various experimental conditions. The goodness of fit, coefficient of determination and PCC statistical tests of module 17 have confirmed that genes in the same module show co-expression relationships under the same tempospatial conditions, which may be associated with the same biological function, one of the important features of a co-expression network.[38,39] The homologous gene comparison of ‘response to heat’ between module 17 in grapevine and A. thaliana also demonstrated that partitioning genes into modules from the co-expression network was reliable.
HSPs and chaperones are crucial components of the heat shock regulatory network in plants[40] and play a crucial role in the response to multiple environmental insults.[41,42] These HSPs are also involved in the response to cold[43] and to non-thermal stress treatments, such as salinity,[44] drought,[45,46] high light stress,[47] oxidative stress[48] and heavy metal stress.[49] Therefore, the biological functions represented by module 17, a module that responds to environmental stresses, could be tested under multiple stresses in the future. The reliability and biological relevance of the network were further verified experimentally. The same set of genes in module 17 of the co-expression network exhibited two co-expression patterns: upregulation in response to heat shock treatment and downregulation in response to cold treatment. The differential response patterns between the heat shock and low-temperature treatments suggest that other regulatory factors may be involved, which requires additional investigation. These covarying patterns could also reflect the complexity of cellular transcriptional activities.[14] The co-expression network and its partition into modules may also help to identify new genes that are putatively involved in certain biological processes.[3] In this research, two putative uncharacterized genes, for which no gene function information, gene annotation, expressed sequence tag (EST), transcriptome data or protein domain prediction was available, were found to respond to heat shock. These genes are worthy of further investigation. Overall, this study provides new insight into the modular organization of grapevine gene functions, which should facilitate module-level research on gene functions and the discovery of new genes.
  38 in total

Review 1.  Gene networks: how to put the function in genomics.

Authors:  Paul Brazhnik; Alberto de la Fuente; Pedro Mendes
Journal:  Trends Biotechnol       Date:  2002-11       Impact factor: 19.536

2.  Cytoscape: a software environment for integrated models of biomolecular interaction networks.

Authors:  Paul Shannon; Andrew Markiel; Owen Ozier; Nitin S Baliga; Jonathan T Wang; Daniel Ramage; Nada Amin; Benno Schwikowski; Trey Ideker
Journal:  Genome Res       Date:  2003-11       Impact factor: 9.043

Review 3.  How do plants feel the heat?

Authors:  Ron Mittler; Andrija Finka; Pierre Goloubinoff
Journal:  Trends Biochem Sci       Date:  2012-01-09       Impact factor: 13.807

4.  Comparative transcriptomic profiling of Vitis vinifera under high light using a custom-made array and the Affymetrix GeneChip.

Authors:  Luísa C Carvalho; Belmiro J Vilela; Phil M Mullineaux; Sara Amâncio
Journal:  Mol Plant       Date:  2011-04-15       Impact factor: 13.164

Review 5.  Role of the major heat shock proteins as molecular chaperones.

Authors:  C Georgopoulos; W J Welch
Journal:  Annu Rev Cell Biol       Date:  1993

6.  Galactinol synthase1. A novel heat shock factor target gene responsible for heat-induced synthesis of raffinose family oligosaccharides in Arabidopsis.

Authors:  Tressa Jacob Panikulangara; Gabriele Eggers-Schumacher; Markus Wunderlich; Harald Stransky; Fritz Schöffl
Journal:  Plant Physiol       Date:  2004-10-01       Impact factor: 8.340

7.  Genome-scale identification of cell-wall related genes in Arabidopsis based on co-expression network analysis.

Authors:  Shan Wang; Yanbin Yin; Qin Ma; Xiaojia Tang; Dongyun Hao; Ying Xu
Journal:  BMC Plant Biol       Date:  2012-08-09       Impact factor: 4.215

8.  Expression of pathogenesis related genes in response to salicylic acid, methyl jasmonate and 1-aminocyclopropane-1-carboxylic acid in Malus hupehensis (Pamp.) Rehd.

Authors:  Jiyu Zhang; Xiaoli Du; Qingju Wang; Xiukong Chen; Dong Lv; Kuanyong Xu; Shenchun Qu; Zhen Zhang
Journal:  BMC Res Notes       Date:  2010-07-27

9.  Arabidopsis gene co-expression network and its functional modules.

Authors:  Linyong Mao; John L Van Hemert; Sudhansu Dash; Julie A Dickerson
Journal:  BMC Bioinformatics       Date:  2009-10-21       Impact factor: 3.169

10.  Computational discovery of regulatory elements in a continuous expression space.

Authors:  Mathieu Lajoie; Olivier Gascuel; Vincent Lefort; Laurent Bréhélin
Journal:  Genome Biol       Date:  2012-11-27       Impact factor: 13.583
