Literature DB >> 22197885

Reference genes for the normalization of gene expression in eucalyptus species.

Luisa Abruzzi de Oliveira1, Michèle Claire Breton, Fernanda Macedo Bastolla, Sandro da Silva Camargo, Rogério Margis, Jeverson Frazzon, Giancarlo Pasquali.   

Abstract

Gene expression analysis is increasingly important in biological research, with reverse transcription-quantitative PCR (RT-qPCR) becoming the method of choice for high-throughput and accurate expression profiling of selected genes. Considering the increased sensitivity, reproducibility and large dynamic range of this method, the requirements for proper internal reference gene(s) for relative expression normalization have become much more stringent. Given the increasing interest in the functional genomics of Eucalyptus, we sought to identify and experimentally verify suitable reference genes for the normalization of gene expression associated with the flower, leaf and xylem of six species of the genus. We selected 50 genes that exhibited the least variation in microarrays of E. grandis leaves and xylem, and E. globulus xylem. We further performed the experimental analysis using RT-qPCR for six Eucalyptus species and three different organs/tissues. Employing algorithms geNorm and NormFinder, we assessed the gene expression stability of eight candidate new reference genes. Classic housekeeping genes were also included in the analysis. The stability profiles of candidate genes were in very good agreement. PCR results proved that the expression of novel Eucons04, Eucons08 and Eucons21 genes was the most stable in all Eucalyptus organs/tissues and species studied. We showed that the combination of these genes as references when measuring the expression of a test gene results in more reliable patterns of expression than traditional housekeeping genes. Hence, novel Eucons04, Eucons08 and Eucons21 genes are the best suitable references for the normalization of expression studies in the Eucalyptus genus.

Entities:  

Mesh:

Year:  2011        PMID: 22197885      PMCID: PMC7107212          DOI: 10.1093/pcp/pcr187

Source DB:  PubMed          Journal:  Plant Cell Physiol        ISSN: 0032-0781            Impact factor:   4.927


The Eucalyptus nucleotide sequences reported in this paper have been submitted to GenBank under accession numbers HO763666–HO769458 and HS047685–HS075494 (Dario Grattapaglia et al., Embrapa Recursos Genéticos e Biotecnologia, Final W5 Norte, Cx.P. 02.372, Brasília, DF, 70.770-900, Brazil).

Introduction

The genus Eucalyptus, with >700 species, is one of the main sources of hardwood worldwide and the most widely employed tree in industrial-oriented plantations. Many Eucalyptus species are renowned for their fast growth rate, the straight shape of their trunks, valuable wood properties, wide adaptability to soils and climates, resistance to biotic stresses, and ease of management through coppicing, seed or clonal propagation. Especially in Brazil, Chile, South Africa, Portugal and India, Eucalyptus timber is widely used for cellulose pulp and paper production. The Eucalyptus species and hybrids most employed in commercial plantations and breeding programs in the tropics include E. grandis, E. urophylla, E. globulus and E. saligna (FAO 2001). Despite the high wood productivity of Eucalyptus plantations, reaching 45–60 m3 ha−1 year−1, the increasing demand for cellulose pulp has resulted in wood shortages in recent years (Steane et al. 2002, Foucart et al. 2006, Grattapaglia and Kirst 2008). Hence efforts in many fields of research are being made to improve forest productivity including molecular approaches such as whole-genome sequencing and high-throughput analysis of gene expression. With such objectives in mind, the Eucalyptus Genome Network (EUCAGEN) was created (http://www.ieugc.up.ac.za), representing one example of a valuable database platform for genome research in E. grandis and other species (Rengel et al. 2009). With the recent availability of Eucalyptus genome and transcriptome data, many efforts are and will be made to assess Eucalyptus gene expression with conventional or high-throughput techniques. Independently of the method employed, the use of reference genes as internal controls for gene expression measurements is absolutely essential. Such validated reference genes for Eucalyptus are still scarce. DNA macro- and microarray hybridizations and partial or whole transcriptome sequencing linked to digital transcript counting (RNA-Seq), among other techniques, allow the expression analysis of thousands of genes simultaneously, employing differentially labeled RNA or cDNA populations. These techniques have the advantage of speed, high-throughput and a high degree of potential automation compared with conventional quantification methods such as Northern blot analysis, RNase protection assays, or competitive reverse trnascription–PCR (RT–PCR; Rajeevan et al. 2001, Kim and Kim 2003, Czechowski et al. 2005). Reverse transcription followed by real-time, quantitative PCR (RT–qPCR) is the most sensitive and specific technique commonly used to assess gene expression levels (Aerts et al. 2004). It allows more in-depth studies of smaller sets of genes across many individuals, treatments or cell/tissue types to be performed. RT–qPCR is the technique of choice to validate gene expression results derived from the above-mentioned high-throughput methods (Rajeevan et al. 2001, Kim and Kim 2003, Czechowski et al. 2005). As mentioned previously, only good internal reference genes will allow confident comparison of gene expression results. Internal control genes are used to normalize mRNA fractions and are often referred to as housekeeping genes which should not vary their expression during development, among tissues or cells under investigation, or in response to experimental treatments. The most common housekeeping genes employed in plant gene expression studies are those encoding actin (Bas et al. 2004, Barsalobres-Cavallari et al. 2009), tubulin (Schmidt and Delaney 2010, Yang et al. 2010), glyceraldehyde-3-phosphate dehydrogenase (GAPDH) (Tong et al. 2009, Maroufi et al. 2010), rRNA (Guénin et al. 2009, Schimidt and Delaney 2010, Yang et al. 2010), polyubiquitin (Libault et al. 2008, Barsalobres-Cavallari et al. 2009) and elongation factor 1α (Silveira et al. 2009, Tong et al. 2009). Many studies make use of these housekeeping genes without proper validation of their presumed stability, based on the assumption that they would be constitutively expressed due to their role in basic cellular processes. Considerable amounts of data show that most studied housekeeping genes have expression that can vary considerably depending on the cell type or experimental condition (Thellin et al. 1999, Hruz et al. 2011). With the increased sensitivity, reproducibility and large dynamic range of the RT–qPCR methods, the requirements for proper internal control genes have become increasingly stringent. In recent years, large numbers of reference gene validation attempts have been reported for plants, most of them covering model, crop or ornamental species such as rice (Kim et al. 2003, Jain et al. 2006), Arabidopsis thaliana (Remans et al. 2008, Hong et al. 2010, Dekkers et al. 2012), tobacco (Schmidt and Delaney 2010), sugarcane (Iskandar et al. 2004), potato (Nicot et al. 2005), Brachypodium sp. (Hong et al. 2008), soybean (Jian et al. 2008, Libault et al. 2008, Hu et al. 2009, Kulcheski et al. 2010), tomato (Coker and Davies 2003, Expósito-Rodríguez et al. 2008, Løvdal and Lillo 2009, Dekkers et al. 2012), Brachiaria sp. (Silveira et al. 2009), coffee (Barsalobres-Cavallari et al. 2009), peach (Tong et al. 2009), wheat (Paolacci et al. 2009), chicory (Maroufi et al. 2010), cotton (Artico et al. 2010), cucumber (Wan et al. 2010), Lolium sp. (Martin et al. 2008), Orobanche sp. (González-Verdejo et al. 2008) and Cyclamen sp. (Hoenemann and Hohe 2011). Few studies have focused on woody plants such as poplar (Brunner et al. 2004, Gutierrez et al. 2008), grape (Reid et al. 2006) and longan tree (Lin and Lai 2010). Reference genes for gene expression studies in Eucalyptus have been recently presented. de Almeida et al. (2010), working with E. globulus microccuttings rooted in vitro, have indicated histone H2B and α-tubulin as the most suitable reference genes during in vitro adventitious rooting, in the presence or absence of auxin. Boava et al. (2010), working with clonal seedlings of the hybrid plant (E. grandis×E. urophylla) exposed to biotic (Puccinia psidii) or abiotic (acibenzolar-S-methyl) stresses, concluded that genes encoding the eukaryotic elongation factor 2 (eEF2) and ubiquitin were the most stable, and ideal as internal controls. Both studies tested a small number of genes (11 and 13, respectively) selected according to literature data concerning other plant systems and experimental conditions. Given the increasing interest in the functional genomics of Eucalyptus and the need for validated reference genes for a broader set of species and experimental conditions, we sought to identify the most stably expressed genes in a set of 21,432 genes assayed by microarray developed to compare stem vascular (xylem) and leaf tissues of E. grandis and E. globulus adult trees. Best candidate genes were then validated by RT–qPCR in assays with RNAs from xylem and leaves of six Eucalyptus species and flowers of E. grandis. Seven traditional housekeeping genes most employed in expression studies in plants were also included in our analysis. The Eucalyptus species selected in the present study are among the most planted trees in the tropics and the most employed in breeding programs in Brazil (E. grandis, E. urophylla, E. globulus, E. saligna, E. dunnii and E. pellita). Most importantly, they exhibit highly contrasting phenotypes, especially in growth rate, biotic and abiotic resistance and wood quality (FAO 2001, Coppen 2002), which, in principle, would make the search for general reference genes for the genus difficult. As a result, genes selected as the least variable among all conditions tested have not yet been described in the literature. This set of genes may represent an important molecular tool to analyze accurately the expression of Eucalyptus genes in different tissues/organs and in different species via array hybridization or RT–qPCR.

Results

Selection of Eucalyptus reference genes via microarray analysis

Data from microarray hybridizations conducted within the project ‘Genolyptus: The Brazilian Research Network on the Eucalyptus Genome’ (http://genoma.embrapa.br/genoma/genolyptus) were analyzed in order to select the most stably expressed Eucalyptus genes. The microarray study was conceived with nine 50-mer oligoprobes covering the length of each one of the 21,432 unique sequences derived from the Genolyptus expressed sequence tag (EST) data set (GenBank accession Nos. HO763666–HO769458 and HS047685–HS075494). Nine oligoprobes were also designed for 10 cDNAs encoding known human proteins as negative controls. Oligoprobes were synthesized ‘on-chip’ in duplicate, randomly distributed in two blocks of 10 identical slides. Leaf blades and vascular (xylem) tissue samples were taken from two E. grandis clonal trees, i.e. derived from the same matrix tree and harboring the same genotype. Two additional xylem samples were collected from two other E. grandis clonal trees of a different genotype and from two E. globulus clonal trees. Therefore, 10 Cy3-labeled cDNA samples and 10 identical chips were produced at Roche NimbleGen for the microarray assays, with a total number of 385,956 features per slide [microarray results were submitted to Gene Expression Omnibus (GEO) under accession Nos. GSM786737–GSM786746]. The most stably expressed genes were mined in the microarray data by employing two statistical algorithms named Significance Analysis of Microarrays (SAM) (Tusher et al. 2001) and Standard Deviation Microarray Analysis (SDMA; see Materials and Methods), that allow the representation of results in three-dimensional (3D) graphs. The input data to SAM were gene expression measurements from the set of microarray experiments, as well as the response variable from each experiment. According to Tusher et al. (2001), SAM computes a statistic d(i) for each gene ‘i’, measuring the strength of the relationship between gene expression and the response variable. It uses repeated permutations of the data to determine if the expression of any gene is significantly related to the response. The cut-off for significance is determined by a tuning parameter delta, chosen by the user based on the false-positive rate. One can also choose a fold change parameter, to ensure that called genes change at least a pre-specified amount. In the present study, the value of delta was set to 0.2 so that we could mine the genes whose expression exhibited the lowest variation among the three conditions assessed in the microarrays, i.e. E. grandis leaves and xylem and E. globulus xylem (Fig. 1A). A ranking of 50 genes whose fold change values were approximately equal to 1 were selected as reference candidate genes, since they presented the lowest variation of expression among the leaves and xylem of E. grandis and xylem of E. globulus (Table 1).
Fig. 1

Expression of 21,432 Eucalyptus genes in E. grandis leaves and xylem and in E. globulus xylem evaluated by microarray hybridization analysis. (A) Scatter plot of the observed relative difference d(i) (observed score) vs. the expected relative difference dE(i) (expected score) built with the Significance Analysis of Microarray (SAM) method. The solid black line indicates the line for d(i) = dE(i), where the observed relative difference is identical to the expected relative difference with a delta set to 0.2. Solid and dotted red and green lines represent genes whose observed relative differences were lower or higher than the expected relative differences, i.e. whose expression varied among tissues tested. (B) Three-dimensional graph generated with the Standard Deviation Microarray Analysis (SDMA) method showing genes (open circles) expressed in positions equivalent to their overall average expression among the three conditions analyzed in the microarrays, i.e. leaves (EgrL, z-axis) and xylem (EgrX, x-axis) of E. grandis and xylem of E. globulus (EglX, y-axis). The higher concentration of circles around the main diagonal line proved that most genes exhibited very similar expression values in the analyzed tissues. The most differentially expressed genes appeared proportionally far from the main diagonal line. (C) SDMA 3D graph representing the 50 most invariable Eucalyptus genes according to microarray data. Points representing selected genes tend to form a straight line since their means of expression are similar to the global average, with a standard deviation tending to zero.

Table 1

The 50 most stable Eucalyptus genes selected from microarray data analysis employing the SDMA and the SAM statistical algorithms

Gene nameEUCAGEN scaffoldGene IDBLAST annotatione-ValueSDMA
SAM
SDRankingFold changeScore (d)Ranking
Eucons01 4emb|CAY47298.1|Serine transporter (Pseudomonas fluorescens SBW25)2e-940.00008210.9999956850.0066147348
Eucons02 1,599No hit0.0002921.0000053630.00771650223
Eucons03 7gb|EEY18801.1|DNA damage checkpoint protein rad24 (Verticillium alboatrum VaMs.102)7e-440.00036530.9999938410.00895865441
Eucons04 88gb|EEF43392.1|Cdk8, putative (Ricinus communis)2e-490.00046141.0000067550.00926121221
Eucons05 332No hit0.00050951.0000089730.01025913246
Eucons06 134gb|EEF44719.1|Plastidic ATP/ADP-transporter, putative (R. communis)5e-200.00054961.0000098760.01232245640
Eucons07 515gb|EEF03117.1|ABC transporter family protein (Populus trichocarpa)7e-290.00056171.0000101640.012416578
Eucons08 2gb|EEF33688.1|Transcription elongation factor s-II, putative (R. communis)1e-250.00058281.0000134160.01255606415
Eucons09 6gb|EEF42371.1|Nucleic acid binding protein, putative (R. communis)1e-180.00066591.0000156370.0135532349
Eucons10 369gb|ACG37397.1|Anther-specific proline-rich protein APG (Zea mays)3e-420.000678101.0000189890.0155743812
Eucons11 1No hit0.000694111.00002960.01633079644
Eucons12 2,755No hit0.000694120.9999730050.0164377147
Eucons13 288No hit0.000732130.9999660.01680714817
Eucons14 899gb|AAM52237.1|Senescence/dehydration-associated protein-related (Arabidopsis thaliana)8e-140.000734141.0000358750.01711815336
Eucons15 2emb|CAA65477.1|Lipid transfer protein (Prunus dulcis)7e-350.000785151.0000390150.0180247645
Eucons16 6,792gb|EEF44560.1|F-box and wd40 domain protein, putative (R. communis)5e-190.000791160.9999603010.0181815535
Eucons17 62No hit0.000794171.0000457670.02061782534
Eucons18 6No hit0.000800180.9999608510.0213286324
Eucons19 9No hit0.000821191.0000465870.0214450111
Eucons20 175gb|EEE97842.1|Chromatin remodeling complex subunit (P. trichocarpa)6e-470.000879201.0000543020.02197811214
Eucons21 1,753gb|EEF48129.1|Aspartyl-tRNA synthetase, putative (R. communis)2e-420.000913210.999947590.0225006916
Eucons22 584No hit0.000931221.0000524140.0233104828
Eucons23 180dbj|BAB02414.1|Chloroplast nucleoid DNA binding protein-like (A. thaliana)2e-530.000970231.0000550760.023310483
Eucons24 15gb|EEF45372.1|Conserved hypothetical protein (R. communis)5e-440.000992240.9999487130.02472893210
Eucons25 531No hit0.001028250.9999413480.02487488819
Eucons26 95gb|EEF45384.1|Vacuole membrane protein, putative (R. communis)3e-150.001030261.0000688190.02498329711
Eucons27 743gb|EEF44734.1|Peroxisome biogenesis factor, putative (R. communis)2e-510.001063271.0000595990.02518355716
Eucons28 87No hit0.001085280.9999314810.02525058913
Eucons29 6gb|EER99842.1|Hypothetical protein SORBIDRAFT_02g041780 (Sorghum bicolor)4e-130.001158290.9999146590.02527861612
Eucons30 173gb|EEF45541.1|Sentrin/sumo-specific protease, putative (R. communis)4e-460.001175301.0000625430.02563820920
Eucons31 842emb|CAP42856.1|SsrA-binding protein (Bordetella petrii)8e-430.001175310.9999315920.0265404950
Eucons32 8gb|EDS90429.1|Nitrogen regulation protein NR(II) (Escherichia albertii TW07627)6e-340.001181321.0000799570.02654745918
Eucons33 1,056No hit0.001189331.0000810120.02666952326
Eucons34 1,034gb|EEE90904.1|Predicted protein (P. trichocarpa)5e-180.001256341.0000785580.02684126430
Eucons35 349No hit0.001257351.0000791920.02711474937
Eucons36 52ref|NP_565080.1|Mitochondrial transcription termination factor-related/mTERF-related (A. thaliana)3e-450.001263360.9999118380.02738806838
Eucons37 229No hit0.001273370.9999154380.02786181924
Eucons38 447emb|CAN64407.1|Hypothetical protein (Vitis vinifera)8e-250.001276380.9998910430.02787274145
Eucons39 1,208No hit0.001286390.999907120.02807060325
Eucons40 570No hit0.001293400.9999081550.02819575833
Eucons41 720No hit0.001303411.0000989790.02855023332
Eucons42 4No hit0.001361420.9998932490.02860504729
Eucons43 482gb|EEF46905.1|Serine/threonine-protein kinase PBS1, putative (R. communis)6e-350.001365431.0001057410.02868261227
Eucons44 228gb|EEF48108.1|Pollen specific protein sf21, putative (R. communis)3e-420.001378441.0001175130.03066115943
Eucons45 10,041No hit0.001385450.99990550.03078286142
Eucons46 485gb|ACM45716.1|Class IV chitinase (Pyrus pyrifolia)2e-610.001428461.0001041530.03126411749
Eucons47 175gb|EEE86166.1|F-box family protein (P. trichocarpa)4e-310.001432471.000102790.03151494222
Eucons48 873No hit0.001613481.000124330.03151608439
Eucons49 359gb|EEF03628.1|Predicted protein (P. trichocarpa)9e-250.001848490.9998868580.03162653347
Eucons50 30No hit0.001303501.0000989790.03198751731

Gene names and the identity of E. grandis genomic (EUCAGEN) scaffolds where sequences are located, as well as the EMBL or GenBank accession codes (Gene ID) and putative functional identity of sequences based on BLAST analysis are indicated along with the estimated (e) value. Results of the statistical analysis performed are indicated: standard deviations (SD) for the SDMA method, and fold change and final score (d) of the SAM method. Genes were ranked from highest to lowest stability for both methods. Lines shaded in gray represent genes selected for validation via RT–qPCR analysis.

Expression of 21,432 Eucalyptus genes in E. grandis leaves and xylem and in E. globulus xylem evaluated by microarray hybridization analysis. (A) Scatter plot of the observed relative difference d(i) (observed score) vs. the expected relative difference dE(i) (expected score) built with the Significance Analysis of Microarray (SAM) method. The solid black line indicates the line for d(i) = dE(i), where the observed relative difference is identical to the expected relative difference with a delta set to 0.2. Solid and dotted red and green lines represent genes whose observed relative differences were lower or higher than the expected relative differences, i.e. whose expression varied among tissues tested. (B) Three-dimensional graph generated with the Standard Deviation Microarray Analysis (SDMA) method showing genes (open circles) expressed in positions equivalent to their overall average expression among the three conditions analyzed in the microarrays, i.e. leaves (EgrL, z-axis) and xylem (EgrX, x-axis) of E. grandis and xylem of E. globulus (EglX, y-axis). The higher concentration of circles around the main diagonal line proved that most genes exhibited very similar expression values in the analyzed tissues. The most differentially expressed genes appeared proportionally far from the main diagonal line. (C) SDMA 3D graph representing the 50 most invariable Eucalyptus genes according to microarray data. Points representing selected genes tend to form a straight line since their means of expression are similar to the global average, with a standard deviation tending to zero. The 50 most stable Eucalyptus genes selected from microarray data analysis employing the SDMA and the SAM statistical algorithms Gene names and the identity of E. grandis genomic (EUCAGEN) scaffolds where sequences are located, as well as the EMBL or GenBank accession codes (Gene ID) and putative functional identity of sequences based on BLAST analysis are indicated along with the estimated (e) value. Results of the statistical analysis performed are indicated: standard deviations (SD) for the SDMA method, and fold change and final score (d) of the SAM method. Genes were ranked from highest to lowest stability for both methods. Lines shaded in gray represent genes selected for validation via RT–qPCR analysis. SDMA is a novel and simpler algorithm based on the comparison of the average gene expression in relation to the global average of expressed genes in microarrays and the overall standard deviation, allowing the presentation of results in graphical mode (see Materials and Methods). SDMA allowed us to generate a 3D graph that evidenced genes expressed in a position equivalent to their overall average expression among the three conditions analyzed in the microarrays (Fig. 1B). The average value of gene expression by SDMA should be as similar as possible to the global average of expression, and the overall standard deviation should tend to zero when the scope of the analysis is the selection of genes whose expression is stable. Using the same criteria applied in SAM, we selected 50 genes whose standard deviations were close to zero, indicating the similarity between the values of mean and mean global gene expression (Table 1). An SDMA 3D graphic is presented in Fig. 1C where the mean expression values of the 50 most stable genes selected under the conditions studied are plotted. Note that points representing selected genes tend to form a straight line, indicating that their means of expression when compared with the global average have a standard deviation tending to zero. We were pleased to note that the employment of either SDMA or SAM allowed us to identify the same group of 50 genes as the most stably expressed, confirming the robustness of the analysis performed by both algorithms. Nevertheless, the ranking of the two methods differed, as shown in Table 1. Since none of the sequences selected presented molecular or biochemical identities similar to previously described Eucalyptus genes or proteins, we named them Eucons01 to Eucons50, according to the ranking generated by SDMA, as stated in the first column of Table 1. Selected sequences were annotated using BLASTx (Altschul et al. 1997) against the available non-redundant protein sequences (nr), and their functional categories were determined by the Blast2GO software (Conesa et al. 2005). Although some sequences exhibited expected (e) values too high for a confident annotation, approximately half of them (48%) showed similarity to known proteins. The other half of the sequences corresponded to hypothetical proteins (10%) or returned a ‘no hit’ (42%) result (Table 1). The gene ontology analysis of the 50 selected genes by Blast2GO allowed the functional classification of 35 (70%) of the sequences, as represented in Fig. 2. Most of the sequences were classified in functions related to cellular (12) or metabolic (12) processes, among six other functional categories. The remaining 15 (30%) sequences were classified as ‘no hit’, and were not represented in Fig. 2.
Fig. 2

Functional classification of the 50 most stable Eucalyptus genes according to microarray hybridization data analysis through SAM and SDMA. Gene Ontology hits registered for the 50 most constitutive genes that could be assigned a putative function based on Swiss-Prot query. Only known genes are shown.

Functional classification of the 50 most stable Eucalyptus genes according to microarray hybridization data analysis through SAM and SDMA. Gene Ontology hits registered for the 50 most constitutive genes that could be assigned a putative function based on Swiss-Prot query. Only known genes are shown. In order to validate the results further by RT–qPCR, we selected 10 candidate genes with the least variation in expression and whose annotation matched a known plant protein according to SDMA (Eucons01, 04, 06, 07 and 08) and SAM (Eucons15, 21, 27, 32 and 43). Selected genes are highlighted in gray in Table 1.

Validation of Eucalyptus reference genes by RT–qPCR

In order to check their true expression stability, primers for RT–qPCR validation of the 10 Eucalyptus sequences selected as potential reference genes were designed and are presented in Table 2. In addition to them, we also designed primers for five genes traditionally employed as references, based on their housekeeping function, including a SAND family protein (Remans et al. 2008), GAPDH (Dambrauskas et al. 2003), histone H2B, ribosomal protein L23A (RibL23A) and tubulin (TUA2) (Czechowski et al. 2005), as presented in Table 2. Reference genes previously recommended for the analysis of gene expression during E. globulus rooting in vitro and named Euc10 and Euc12 (de Almeida et al. 2010) were also evaluated, and the primers employed in RT–qPCR are also presented in Table 2.
Table 2

Primer sequences (5′–3′) employed in RT–qPCR analysis of candidate reference genes for Eucalyptus including genes traditionally employed as references in plant gene expression studies

Gene nameGene IDBLAST annotationEUCAGEN scaffoldForward and reverse primers (5′–3′)Amplicon (bp)
Eucons01 emb|CAY47298.1|Serine transporter (P. fluorescens SBW25)4TGGTGCTGACGGTGATGTTCTTCT178
AAGGATTTGGTGATCGCCACCAGT
Eucons04 gb|EEF43392.1|Cdk8, putative (R. communis)88TACAAGCGCTGTTGATATGTGGGC196
TTGCCAATGAGGCGGATTCACAAG
Eucons06 gb|EEF44719.1|Plastidic ATP/ADP-transporter, putative (R. communis)134TCCTCTGTCCACAAATGGGTTCCA141
TCACCAAAGACAGGCTGACCATCA
Eucons07 gb|EEF03117.1|ABC transporter family protein (P. trichocarpa)515AAGCCTCATTGGCTGGCTCACATA153
TCAGCACAAGAGCTCCACCATCAT
Eucons08 gb|EEF33688.1|Transcription elongation factor s-II, putative (R. communis)2TCCAATCCGAGTCGCTGTCATTGT152
TGATGAGCCTCTCTGGTTTGACCT
Eucons15 emb|CAA65477.1|Lipid transfer protein (P. dulcis)2AAGTGAGAGCAAAGATGGAGCGCA154
GACCATATTACACGACGCATCGCA
Eucons21 gb|EEF48129.1|Aspartyl-tRNA synthetase, putative (R. communis)1,753AGAGGTGAAATTCCAGAAGCCCGT155
CTTCCCTTTGGCTTCCGCCAATTA
Eucons27 gb|EEF44734.1|Peroxisome biogenesis factor, putative (R. communis)743CATTTCATGCTGCTGTTGGCCGTT184
AGTCCACCAACATCATCCCATCCA
Eucons32 gb|EDS90429.1|Nitrogen regulation protein NR(II) (E. albertii TW07627)8GACAACGTGCGGTTGATTCGTGAT144
ACGCAGAATGATTTCACCGCCTTC
Eucons43 gb|EEF46905.1|Serine/threonine-protein kinase PBS1, putative (R. communis)482TATTTCTCCTGTTTCGCTCCGGGT166
TACCATCTCTTTGTGCTCTGCGCT
At2g28390AT2G28390.1SAND family protein (A. thaliana)3,502CCATTCAACACTCTCCGACA143
TGTGTGACCCAGCAGAGTAAT
GAPDHAT1G13440.1Glyceraldehyde-3-phosphate dehydrogenase (A. thaliana)4,044TTGGCATTGTTGAGGGTCTA107
AAGCAGCTCTTCCACCTCTC
H2BAT5G02570.1Histone H2B, putative (A. thaliana)90GAGCGTGGAGACGTACAAGA127
GGCGAGTTTCTCGAAGATGT
RibL23AAT2G39460.2Ribosomal protein L23A (A. thaliana)317AAGGACCCTGAAGAAGGACA128
CCTCAATCTTCTTCATCGCA
TUA2AT1G50010.1TUA2; structural constituent of cytoskeleton (A. thaliana)687GCAAGTACATGGCTTGCTGT132
CACACTTGAATCCTGTTGGG
Euc10 AT3G07640.1Unknown protein (A. thaliana)62AGGAGTCCTTCGAGCTTCC110
CAGCACGGACACCTGATAAA
Euc12 AT1G32790.1RNA-binding protein, putative (A. thaliana)107GCGTGGTTCTTGGATCACTA114
TGGTGACAAAGTCAGGTGCT

Gene name abbreviations, GenBank, EMBL or TAIR accession codes and the putative functional identity of genes based on BLAST analysis are indicated. The EUCAGEN scaffolds containing the genome sequences of the referred genes are presented. Based on the Eucalyptus sequences, primers were designed as shown along with amplicon lengths (bp).

Primer sequences (5′–3′) employed in RT–qPCR analysis of candidate reference genes for Eucalyptus including genes traditionally employed as references in plant gene expression studies Gene name abbreviations, GenBank, EMBL or TAIR accession codes and the putative functional identity of genes based on BLAST analysis are indicated. The EUCAGEN scaffolds containing the genome sequences of the referred genes are presented. Based on the Eucalyptus sequences, primers were designed as shown along with amplicon lengths (bp). Total RNA samples were prepared from six Eucalyptus species, distributed as follows: flower, leaf and xylem of E. grandis, leaf and xylem of E. dunnii, E. pellita, E. saligna and E. urophylla, and xylem of E. globulus. RT–qPCR evaluations were conducted with biological duplicates and experimental quadruplicates. Results were analyzed using the software geNorm 3.5 (Vandesompele et al. 2002) and NormFinder (Andersen et al. 2004) in order to generate comparable rankings of genes based on their stability of expression. The Cq data collected for all samples were transformed to relative quantities using the 2−ΔΔCt method developed by Livak and Schmittgen (2001). We did not succeed in obtaining satisfactory single-peak dissociation curves after RT–qPCR with primers designed for Eucons01 and Eucons15 (data not show), and both candidate genes were discarded from the analysis. With geNorm, the average expression stability (M-value) of all genes was first calculated. M-values are defined as the mean variation of a certain gene related to all of the others. The geNorm software recommends an M-value below the threshold of 1.5 in order to identify genes with stable expression, although 0.5 has been used as the threshold limit by many authors (Radonić et al. 2005, Allen et al. 2008, Coll et al. 2010, de Almeida et al. 2010, Taylor et al. 2010). As shown in Fig. 3A, all 15 candidate genes examined showed a very high stability of expression, with thresholds <0.12, independently of the tissues/organs evaluated. According to the geNorm analysis, Eucons04, Eucons08 and Eucons21 were the most stably expressed genes and should be considered as the best reference genes for RT–qPCR normalizations.
Fig. 3

Expression stability of candidate genes evaluated by RT–qPCR in all tissues/organs and species of Eucalyptus analyzed. RT–qPCR results were analyzed according to the algorithms geNorm and NormFinder and represent the general average expression values. The lower the values, the more stable the gene expression. (A) Average M stability values of the expression of candidate genes calculated with the geNorm algorithm. As indicated, genes positioned to the right are the most stable in expression among the variables assayed. (B) Pairwise variation (V) values calculated with the geNorm algorithm in order to estimate the optimal number of reference genes necessary for accurate normalization of the expression of genes of interest. Values lower than 0.15 indicate that no additional genes are required for the normalization of expression in organs/tissues studied. (C) Stability values of gene expression calculated with the NormFinder algorithm.

Expression stability of candidate genes evaluated by RT–qPCR in all tissues/organs and species of Eucalyptus analyzed. RT–qPCR results were analyzed according to the algorithms geNorm and NormFinder and represent the general average expression values. The lower the values, the more stable the gene expression. (A) Average M stability values of the expression of candidate genes calculated with the geNorm algorithm. As indicated, genes positioned to the right are the most stable in expression among the variables assayed. (B) Pairwise variation (V) values calculated with the geNorm algorithm in order to estimate the optimal number of reference genes necessary for accurate normalization of the expression of genes of interest. Values lower than 0.15 indicate that no additional genes are required for the normalization of expression in organs/tissues studied. (C) Stability values of gene expression calculated with the NormFinder algorithm. In order to evaluate the optimal number of reference genes for reliable normalization, geNorm allows calculation of the pairwise variation Vn/Vn + 1 between the sequential ranked normalization factors NFn and NFn + 1 to determine the effect of adding the next reference gene in normalization. The normalization factor is calculated based on the geometric average among the two most stable gene relative quantities and the stepwise inclusion of the other genes in the order of their expression stability. A large pairwise variation implies that the added reference gene has a significant effect on normalization and should be included for calculation of a reliable normalization factor. Considering the cut-off value of 0.15, below which the inclusion of an additional reference gene is not necessary (Vandesompele et al. 2002), the use of the two most stably expressed genes, Eucons08 and Eucons21, was sufficient for accurate normalization (<0.02) in all organs studied (flower, leaves and xylem) from the six Eucalyptus species (Fig. 3A, B). The same applies when analyzing xylem and leaves separately, with Eucons04 and RibL23A genes (leaves) and Eucons06 and Eucons08 genes (xylem; data not show). In addition to geNorm, the expression stability of candidate reference genes assayed by RT–qPCR was also analyzed with the NormFinder software. This program takes into account the intra- and intergroup variations for normalization factor calculation and the results are not affected by occasionally co-regulated genes. The best candidate will be the one with the intergroup variation as close to zero as possible and, at the same time, having the smallest error bar possible. Hence values are inversely correlated to gene expression stability, which avoids artificial selection of co-regulated genes (Andersen et al. 2004). According to the NormFinder analysis of gene expression in leaves, xylem tissues and among species, the stability values of the 15 genes studied were <0.138, with error bars no greater than 0.044 (Fig. 3C; Table 3). When we analyzed the gene expression in all tissues/organs and species, the stability value was in the range between 0.017 and 0.106, proving again that all genes elected are good references for RT–qPCR studies in Eucalyptus. The ranking of the genes and their respective stability values are shown in Table 3. According to the NormFinder analysis and in agreement with the results of geNorm, the three most stable genes were Eucons04, Eucons08 and Eucons21 when considering all tissues/organs and species. When the expression in leaves is considered separately, the stability values were in the range between 0.008 and 0.086, and the three most stable genes in these organs were Eucons04, Eucons08 and Eucons32. In xylem vascular tissues, the stability values were in the range of 0.01–0.138, and the genes Eucons27, Eucons07 and Eucons06 were the most stable (Table 3). The algorithm ranked Eucons04 as the most stably expressed gene in all samples regardless of whether the samples were collected into one main group or divided into two groups. Nonetheless, just one housekeeping gene is determined to all samples using NormFinder when no groups are defined. So, a different group was created to analyze the most stable couple (Table 3). When leaves and xylem were tested as different groups, the stability values were in the range between 0.011 and 0.094. Eucons04 exhibited the lowest stability value. NormFinder identified Eucons04 and Eucons08 as the most appropriate combination of genes, showing a stability value of 0.009.
Table 3

Expression stability values (SV) and standard deviations (SD) of Eucalyptus reference genes calculated by the NormFinder software

1 Group
2 Groups
Leaf
Xylem
All organs
Leaf + xylem
RankingSV ± SDRankingSV ± SDRankingSV ± SDRankingSV
Eucons04 0.008 ± 0.010 Eucons27 0.010 ± 0.010 Eucons04 0.017 ± 0.007 Eucons04 0.011
Eucons08 0.018 ± 0.009 Eucons07 0.017 ± 0.010 Eucons08 0.021 ± 0.007 Eucons08 0.016
Eucons32 0.023 ± 0.010 Eucons06 0.023 ± 0.010 Eucons21 0.022 ± 0.008 Eucons21 0.019
Eucons21 0.027 ± 0.011H2B0.024 ± 0.011RibL23A0.032 ± 0.009RibL23A0.020
RibL23A0.033 ± 0.012 Eucons21 0.025 ± 0.011 Eucons06 0.036 ± 0.009 Eucons06 0.036
GAPDH0.035 ± 0.013RibL23A0.027 ± 0.011H2B0.037 ± 0.010H2B0.037
Eucons06 0.038 ± 0.014 Eucons04 0.028 ± 0.012 Eucons32 0.047 ± 0.011 Eucons27 0.045
H2B0.045 ± 0.016 Eucons08 0.029 ± 0.012GAPDH0.052 ± 0.012 Eucons32 0.046
Eucons07 0.051 ± 0.017 Euc12 0.036 ± 0.013 Eucons27 0.052 ± 0.012GAPDH0.048
At2g283900.054 ± 0.018 Eucons32 0.050 ± 0.017 Eucons07 0.060 ± 0.014 Eucons07 0.058
Eucons27 0.059 ± 0.020GAPDH0.056 ± 0.019At2g283900.073 ± 0.016 Euc10 0.062
TUA20.059 ± 0.020At2g283900.058 ± 0.020 Euc12 0.074 ± 0.016At2g283900.067
Euc12 0.069 ± 0.023TUA20.071 ± 0.023 Eucons43 0.101 ± 0.022 Euc12 0.068
Euc10 0.074 ± 0.024 Eucons43 0.088 ± 0.029TUA20.103 ± 0.022 Eucons43 0.083
Eucons43 0.086 ± 0.028 Euc10 0.138 ± 0.044 Euc10 0.106 ± 0.023TUA20.094
Best combination of two genes Eucons04 Eucons08 0.009

Complementary DNAs from leaf and xylem (E. grandis, E. dunnii, E. pellita, E. saligna and E. urophylla), xylem (E. globulus) and flower (E. grandis) were subjected to RT–qPCR and results were analyzed with the NormFinder software in single groups of leaf or xylem, or all tissues/organs together. Results for the groups of leaf and xylem together are also presented. The best two reference genes for all the analyses performed are indicated at the bottom.

Expression stability values (SV) and standard deviations (SD) of Eucalyptus reference genes calculated by the NormFinder software Complementary DNAs from leaf and xylem (E. grandis, E. dunnii, E. pellita, E. saligna and E. urophylla), xylem (E. globulus) and flower (E. grandis) were subjected to RT–qPCR and results were analyzed with the NormFinder software in single groups of leaf or xylem, or all tissues/organs together. Results for the groups of leaf and xylem together are also presented. The best two reference genes for all the analyses performed are indicated at the bottom.

Validation of the stability of Eucalyptus reference genes via dxr differential gene expression analysis

Terpenoids are all derived from two common precursors, isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP). In higher plants, IPP and DMAPP are synthesized through two distinct pathways in separate cellular compartments, the cytosolic mevalonate (MVA) pathway and the 2-C-methyl-d-erythritol 4-phosphate (MEP) pathway that takes place in plastids. The MEP pathway, through which diterpenes are synthesized, has two important initial steps: (i) the formation of 1-deoxy-d-xylulose 5-phosphate (DXP) from pyruvate and glyceraldehyde 3-phosphate through the action of the DXP synthase (DXS), followed by the conversion of DXP into MEP by the action of the DXP reductoisomerase (DXR). As DXS and DXR are key enzymes catalyzing the two committed steps for isoprenoid biosynthesis, genes coding for DXS and DXR may play important roles in controlling the plastidic synthesis of isoprenoids and the downstream diterpene products (Carretero-Paulet et al. 2002, Liao et al. 2007, Wu et al. 2009, Yan et al. 2009). It is known that the expression patterns of the DXR enzyme and its encoding gene vary quite consistently according to the plant and organ being assessed. This enzyme and its encoding mRNA show increased expression in inflorescences and leaves of A. thaliana (Carretero-Paulet et al. 2002) and Salvia miltiorrhiza (Yan et al. 2009), but decreased levels were reported in stems and roots. In Rauvolfia veticillata, on the other hand, higher levels of dxr mRNA were reported in fruits and roots, with the lowest levels in flowers (Liao et al. 2007). In order to confirm the constitutive expression of the three best Eucalyptus genes selected as references (Eucons04, Eucons08 and Eucons21), we tested them by normalizing the patterns of dxr gene expression in Eucalyptus and compared the results with those normalized by the traditional reference genes RibL23A and GAPDH. Therefore, the expression of dxr and the reference genes was measured by RT–qPCR in the same set of tissues/organs and Eucalyptus species previously tested. dxr expression values were then normalized against the expression values of two reference genes, as shown in Fig. 4.
Fig. 4

Relative expression of the isoprenoid biosynthetic gene dxr in different tissues/organs of Eucalyptus by RT–qPCR and normalization with different reference gene pairs. Gene pairs employed as references are indicated at the bottom of the graphics. Average values of the relative expression of the reference genes in the different tissues were set to 1 in order to normalize the expression of dxr. (A) Expression patterns of the dxr gene in flowers, leaves and xylem tissues of E. grandis. (B) Expression patterns of the dxr gene in xylem tissues of E. grandis, E. globulus and E. pellita.

Relative expression of the isoprenoid biosynthetic gene dxr in different tissues/organs of Eucalyptus by RT–qPCR and normalization with different reference gene pairs. Gene pairs employed as references are indicated at the bottom of the graphics. Average values of the relative expression of the reference genes in the different tissues were set to 1 in order to normalize the expression of dxr. (A) Expression patterns of the dxr gene in flowers, leaves and xylem tissues of E. grandis. (B) Expression patterns of the dxr gene in xylem tissues of E. grandis, E. globulus and E. pellita. In order to allow comparisons among reference genes, the average value of the pairwise reference gene relative expression in the different organs/tissues tested was set to 1 and taken to normalize the dxr relative expression. As expected, steady-state mRNA levels for the dxr gene were much higher in leaves, followed by flowers, with lower values observed in xylem tissues of E. grandis (Fig. 4A). As shown in Fig. 4, the pairwise combination of Eucons04, Eucons08 or Eucons21 allowed more confident results than the RibL23A/GAPDH pair. The relative expression of the dxr gene was much less variable when normalized with Eucons genes and, most importantly, much more concordant if compared with results normalized by the RibL23A/GAPDH pair. This is more evident in Fig. 4B where no statistical difference was observed in dxr relative expression values among the xylem from E. grandis, E. globulus and E. pellita when RibL23A/GAPDH were used as references, but was quite different when normalized with any two of the Eucons genes. Essentially the same conclusions were assumed by the analysis of dxr relative expression obtained with leaves and xylem tissues from E. dunnii, E. pellita, E. saligna and E. urophylla (results not shown).

Discussion

Real-time qPCR and cDNA microarray measurements are highly reproducible techniques to assess gene expression at the steady-state mRNA level (Yue et al. 2001, Stankovic and Corfas 2003, Stahlberg et al. 2004). However, in comparison with classical RT–PCR, the main advantages of RT–qPCR are its higher sensitivity, specificity of measurements, and broad quantification range of up to seven orders of magnitude (Bustin 2002, Gachon et al. 2004), besides being a great aid to study expression in genes whose transcript levels are known to be very low (Higuchi et al. 1993). The analysis by RT–qPCR has become the most common method for validating whole-genome microarray data or of a smaller set of genes, and molecular diagnostics (Giulietti et al. 2001, Chuaqui et al. 2002). Accurate normalization is an absolute prerequisite for correct measurement of gene expression, and the most commonly used normalization strategy involves standardization to a single constitutively expressed control gene. Therefore, the ideal reference gene should exhibit invariable expression levels among all different cell types, tissues, organs, developmental stages or treatments that are submitted to the test organism (Vandesompele et al. 2002, Andersen et al. 2004). However, it has become clear that no single gene is constitutively expressed in all cell types and under all experimental conditions. It has been shown extensively that the expression of the so-called ‘housekeeping’ genes, although constant under some experimental conditions, can vary quite considerably in other cases, implying that the expression stability of the intended control gene has to be verified before each experiment (Thellin et al. 1999, Volkov et al. 2003, Czechowski et al. 2005, Gutierrez et al. 2008, Guénin et al. 2009, Hruz et al. 2011). Normalization with multiple reference genes is becoming the gold standard for the technique, but reports that identify such genes in plant research are limited, especially for woody species (Rajeevan et al. 2001, Coker and Davies 2003, Brunner et al. 2004, Iskandar et al. 2004, Nicot et al. 2005, Jain et al. 2006, Reid et al. 2006, Expósito-Rodriguez et al. 2008, González-Verdejo et al. 2008, Gutierrez et al. 2008, Hong et al. 2008, Jian et al. 2008, Libault et al. 2008, Martin et al. 2008, Remans et al. 2008, Hu et al. 2009, Løvdal and Lillo 2009, Paolacci et al. 2009, Silveira et al. 2009, Tong et al. 2009, de Almeida et al. 2010, Artico et al. 2010, Boava et al., 2010, Hong et al. 2010, Kulcheski et al. 2010, Lin and Lai 2010, Maroufi et al., 2010, Schmidt and Delaney 2010, Wan et al. 2010, Yang et al. 2010, Hoenemann and Hohe 2011). In the present work, we evaluated the results of microarray data concerning 21,442 Eucalyptus genes and selected the 50 most stably expressed genes in leaves of E. grandis and the xylem of E. grandis and E. globulus. To do so, two statistical algorithms were employed, SAM and SDMA, and the same 50 candidate genes were pointed out as the most invariably expressed genes in microarrays, although the ranking of the genes was different (Table 1). While SAM is a well established and popular program to analyze microarray data, with almost 6,500 citations in PubMed (Tusher et al. 2001), SDMA is here presented as a novel algorithm developed to better represent, in 3D graphs, the results of the most stably expressed genes in microarrays. It is based on the principle that gene expression values with lower standard deviations are supposed to be the most similarly expressed among the samples being tested. By RT–qPCR, the expression stability of eight of the 50 best candidate genes selected by SAM and SDMA was addressed in different organs (leaves and flowers) and vascular tissues (xylem) derived from six species of Eucalyptus. Besides these eight novel genes, seven other genes previously tested as references in Eucalyptus or other plants were also evaluated, including classic housekeeping genes such as those encoding tubulin (TUA2), histone H2B, GAPDH and the ribosomal protein L23A (RibL23A). Genes encoding TUA2, GAPDH, histones, ribosomal proteins and RNAs are the most employed and tested housekeeping genes in plants (Thellin et al. 1999, Rajeevan et al. 2001, Volkov et al. 2003, Iskandar et al. 2004, Czechowski et al. 2005, Barsalobres-Cavallari et al. 2009, de Almeida et al. 2010, Lin and Lai 2010, Maroufi et al. 2010). According to our RT–qPCR results and data analysis by geNorm and NormFinder, all these housekeeping genes showed quite consistent stability in expression in Eucalyptus, especially RibL23A (Fig. 3, Table 3). Nevertheless, the employment of the pair GAPDH/RibL23A to normalize the expression of the isoprenoid biosynthetic gene dxr proved that, at least together, these genes are not suitable as references for Eucalyptus gene expression studies. Indeed, the combination of any pair of the three best reference genes here presented, Eucons04, 08 and 21, to normalize the results of dxr gene expression was intrinsically consistent, leading to a quite different interpretation of the dxr gene expression in xylem tissues of three Eucalyptus species, as shown in Fig. 4B. The stability of the TUA2 gene has often been used to normalize RT–qPCR expression data (Brunner et al. 2004, González-Verdejo et al. 2008, de Almeida et al. 2010). Analysis of the Eucalyptus microarray and RT–qPCR data revealed that, indeed, it has a quite stable expression. Nevertheless, this gene is far from being the best reference for Eucalyptus among those tested (Fig. 3, Table 3). TUA2 has been shown to be a suitable normalization gene during plant development in Orobanche ramosa (González-Verdejo et al. 2008), for comparison of gene expressions among species of Populus (Brunner et al. 2004), and during E. globulus adventitious rooting in vitro (de Almeida et al. 2010), but it was unstable during seedling development in A. thaliana (Volkov et al. 2003, Hong et al. 2010, Zhou et al. 2010), in different tissues or under biotic and abiotic stresses in potato (Nicot et al. 2005) and cucumber (Wan et al. 2010). Similar results were also obtained by Artico et al. (2010) in cotton, Silveira et al. (2009) in Brachyaria brizantha, and Expósito-Rodríguez et al. (2008) and Dekkers et al. (2012) for A. thaliana and tomato seeds. Like TUA2, the GAPDH gene was shown to reach stability values consistent enough to be considered a reference in our studies with Eucalyptus (Fig. 3, Table 3), although much better candidates were pointed out. According to Kim et al. (2003), the relative expression of GAPDH in rice varied up to 2-fold. In Brachypodium distachyon, results of RT–qPCR showed that the GAPDH gene was stably expressed under various abiotic stresses, without considerable variation in response to growth hormones, although it exhibited less stability according to the tissue type being evaluated (Hong et al. 2008). In tomato, GAPDH was poorly ranked as a good reference gene based on the analysis of EST data (Coker and Davies 2003) and RT–qPCR assays during plant development (Expósito-Rodríguez et al. 2008) or under abiotic stress (Løvdal and Lillo 2009). Similar results were obtained with peach, where GAPDH was not among the best reference genes in the experimental groups (Tong et al. 2009). According to Tong et al. (2009), the reasons for the observed discrepancies may be that GAPDH not only acts as a component of the glycolytic pathway but also takes part in other processes. Therefore, the expression profile of GAPDH might fluctuate according to the corresponding experimental conditions. The gene encoding histone H2B also exhibited levels of steady-state mRNA quite constant in the different Eucalyptus organs/tissues evaluated in this study (Fig. 3, Table 3). However, like the other traditional housekeeping genes, its stability was overcome by the novel Eucons genes discussed later. During E. globulus adventitious rooting in vitro, H2B, along with TUA2, was among the most stably expressed genes (de Almeida et al. 2010). The gene encoding histone H3 in chicory was also indicated as a good reference for RT–qPCR assay normalization (Maroufi et al. 2010). Nevertheless, Czechowski et al. (2005), based on the analysis of a large amount of data derived from microarray studies, showed that genes encoding histones were not among the best reference genes for A. thaliana. Similar results were obtained by Lin and Lai (2010) when studying synchronized longan tree embryogenic cultures at different developmental stages and temperatures. Genes encoding ribosomal proteins and rRNAs are often viewed as a homogeneous collection of housekeeping genes and were employed as references in many works (Thellin et al. 1999, Volkov et al. 2003, Iskandar et al. 2004, Barsalobres-Cavallari et al. 2009, de Almeida et al. 2010). Nevertheless, members of this gene family were shown to have extraribosomal functions with strong variations in the pattern of their expression (Wool 1996, McIntosh et al. 2005). For instance, these genes were shown to be specifically induced or repressed in particular tissues during different stages such as tuber (Taylor et al. 1992) and root (Williams and Sussex 1995) development; or in response to stresses such as genotoxicity (Revenkova et al. 1999) and cold (Saez-Vasquez et al. 2000, Kim et al. 2004). Volkov et al. (2003) specifically evaluated the tissue-specific changes in the RibL23A mRNA levels in different organs of A. thaliana. Compared with leaves, the level of RibL23A mRNA was increased in flowers and reduced in stems and siliques. These observations are in accordance with the idea that ribosomal protein genes in plants are transcriptionally up-regulated in actively growing tissues and down-regulated in metabolically inactive tissues (Marty et al. 1993, Moran 2000). Interestingly, among the traditional housekeeping genes tested in the present work, RibL23A was the most stable, only outperformed by the Eucons genes discussed later. One of the Eucons genes tested in the present work is orthologous to At2g28390.1, originally identified as one of the best reference genes for A. thaliana gene expression analysis by Czechowski et al. (2005), both by microarray and by RT–qPCR analysis. The orthologous Eucalyptus sequence was tested by de Almeida et al. (2010) during in vitro adventitious rooting, proving it to be one of the best reference genes for RT–qPCR under the conditions assayed. The At2g28390.1 sequence putatively encodes a SAND family protein member, a membrane protein related to vesicle traffic (Cottage et al. 2004, Czechowski et al. 2005). Considering Eucalyptus leaves, xylem and flowers tested in the present work, the At2g28390.1 gene was among the least stable genes (Fig. 3, Table 3). Similar results were obtained for Euc10 and EuC12 genes. Both sequences were previously identified as strong reference gene candidates for E. grandis vs. E. globulus xylem and leaf gene expression studies (unpublished results). de Almeida et al. (2010) proved that Euc12 is indeed a good reference gene for RT–qPCR studies during E. globulus in vitro rooting. In the present work, both genes exhibited acceptable stability values (Fig. 3, Table 3), but were outperformed by the Eucons genes. Ecualyptus grandis sequences for Euc10 and Euc12 were derived from At3g07640.1 (encoding an unknown protein) and At1g32790.1 (encoding a putative RNA-binding protein) from A. thaliana, also pointed out by Czechowski et al. (2005) as the best reference genes based on both microarrays and RT–qPCR. The analysis of the Eucalyptus microarray data allowed us to identify the 50 most stable genes in the xylem of E. grandis and E. globulus and leaves of E. grandis (Table 1). We named these potential reference genes Eucons after ‘Eucalyptus constitutives’. RT–qPCR analysis of eight of the selected genes proved that these genes were indeed very reliable references for the normalization of gene expression in different Eucalyptus organs and tissues, especially those named Eucons04, 08 and 21. Analysis of the function of the putative encoded proteins revealed that these genes may also belong to the so-called housekeeping class of genes. Eucons04 putatively encodes a protein highly similar to cyclin-dependent protein kinases (CDKs) such as R. communis CDK8 and CDKs from A. thaliana (Menges et al. 2005). These types of proteins are able to phosphorylate protein target amino acids in different metabolic pathways and, most notably, in cell cycle control (Umeda et al. 2005). Eucons08 is similar to R. communis and A. thaliana genes possibly encoding the transcription elongation factor SII (TFIIS). SII is considered one of the numerous elongation factors that enable RNA polymerase II to transcribe faster and/or more efficiently. It engages transcribing RNA polymerase II and assists it in bypassing blocks to elongation by stimulating a cryptic, nascent RNA cleavage activity intrinsic to RNA polymerase (Wind and Reines 2000). Eucons21 encodes a protein with significant sequence similarity to a putative R. communis aspartyl-tRNA synthetase. Aminoacyl-tRNA synthetases catalyze the addition of amino acids to their cognate tRNAs. In the case of aspartyl-tRNA synthetase, the amino acid bound to tRNAs is aspartate. In plants, all aminoacyl-tRNA synthetases are nuclear encoded and are post-translationally targeted to the compartments where protein synthesis takes place, i.e. the cytoplasm, mitochondria or plastids (Duchêne et al. 2005). According to the analysis of the RT–qPCR data performed with the software NormFinder, Eucons04 and Eucons08 are the best reference genes pairwise when assessing test gene expression exclusively in leaves, or in leaves along with xylem tissues. If only xylem tissues are analyzed, Eucons07 and Eucons27 would be the best references (Table 3). Eucons07 encodes a protein similar to a member of the ABC transporter family from Arabidopsis lyrata while Eucons27 putatively encodes a factor related to peroxisome biogenesis. Interestingly, an ABC transporter ATPase-encoding gene was indicated as one of the best reference genes for RT–qPCR analysis of embryogenic cell cultures of Cyclamen persicum (Hoenemann and Hohe 2011). Kamada et al. (2003), analyzing expression profiles of genes encoding peroxisomal proteins in A. thaliana, showed that these genes are expressed in all plant organs, suggesting that they play a role in metabolic pathways of unidentified plant peroxisomes and may have a constitutive expression in plants. It is important to mention that, when considering all organs/tissues of all Eucalyptus species evaluated by RT–qPCR, the stability values of Eucons04, 08 and 21 are not statistically different from those observed for H2B, RibL23A and Eucons06 according to the NormFinder analysis, as can be observed in Fig. 3C and Table 3. Nevertheless, the algorithm geNorm also indicated Eucons04, 08 and 21 as the best reference genes for the group of variables evaluated. Although outperformed by Eucons04 and 08 as reference genes, the remaining Eucons06, 32 and 43 genes also presented consistently constant stability values in our analysis. Eucons06 putatively encodes a plastidic ATP/ADP-transporter, while Eucons32 and 43 encode a nitrogen regulatory protein and a serine/threonine-protein kinase, respectively. To our knowledge, none of these sequences was previously indicated as a potential reference to normalize studies of gene expression by RT–qPCR. Based on the microarray expression analysis of >21,000 Eucalyptus genes, we identified the 50 most stably expressed genes in leaves (E. grandis) and xylem tissues (E. grandis and E. globulus). We proved by RT–qPCR that eight representatives of these reference genes are indeed very stable in different organs/tissues and species of Eucalyptus, outperforming traditional housekeeping genes. Considering that two statistical programs allowed us to reach similar interpretations of the microarray results, and that potential discrepancies should be expected, the good agreement of our results with the independent approaches strongly suggested that Eucons04, Eucons08 and Eucons21 should be regarded as tne most suitable reference genes for normalization of gene expression studies in Eucalyptus species. Although the selected reference genes were tested only in six species of a genus with >700 species, these six species represent some of the most widely planted trees in the tropics (FAO 2001) and exhibit quite a large variation in growth rate, stress resistance and wood quality (Coppen 2002). To our knowledge, the present work represents the widest in-depth study developed to validate optimal reference genes for the evaluation of transcript levels in different eucalypt organs and species. In summary, these findings provide useful tools for the normalization of RT–qPCR experiments and will enable more accurate and reliable gene expression studies related to functional genomics in Eucalyptus.

Materials and Methods

Plant material

For microarray studies, xylem tissues were collected from 4-year-old, field-grown E. grandis and E. globulus trees located at Hortoflorestal Barba Negra (Aracruz Celulose S.A., today's Fibria) in Barra do Ribeiro, RS, Brazil. Xylem was collected by scraping the exposed vascular tissue after the removal of the 0.5–1 cm thick stem bark. Two lines of genetically unrelated matrixes were chosen and each line was represented by two clones (biological duplicates), therefore totalling eight xylem samples. From both clones of one of the E. grandis lines, mature leaves were also collected. To minimize the proportion of primary xylem mainly located in the main veins of leaves, only leaf blades without the central vein were used for this study. Tissue samples were immediately frozen in liquid nitrogen and stored at −80°C. For RT–qPCR studies, the same E. grandis and E. globulus trees were sampled, along with xylem and leaves of field-grown E. dunnii, E. pellita, E. saligna and E. urophylla. Eucalyptus grandis flowers were also collected under the same conditions. Harvested organs/tissues were immediately frozen in liquid nitrogen and stored at −80°C until further analysis.

RNA extraction

Total RNA was extracted using the PureLink Plant RNA Purification (Invitrogen) reagent according to the manufacturer's instructions for small-scale RNA isolation. About 20 μg of total RNA was sent to NimbleGen Systems Inc. (Reykjavik, Iceland) for cDNA synthesis and microarray hybridizations.

Oligonucleotide microarray hybridization

Microarray experiments were carried out by Roche NimbleGen. In total, 21,432 unigenes were selected from the Genolyptus EST data set to make up a basic chip. Ten cDNAs encoding known human proteins were also included in chips as negative controls. Nine oligonucleotides, 50 bp long, distributed throughout each sequence and with close melting temperatures were designed and synthesized for each sequence consensus or singleton. Probes were randomly distributed on two blocks of each chip in duplicated form, adding up to 385,856 features per chip. Therefore, each chip was composed of two blocks (technical replicates) containing the same collection of randomly distributed probes, and 18 hybridization values were collected for each gene from every chip. A total of 10 identical chips were produced. Two chips were hybridized with cDNA samples from E. grandis mature leaves, and eight chips were destined to xylem cDNA hybridizations. After submission of total RNA samples to NimbleGen, prepared as described above, cDNAs were labeled with Cy3, and hybridizations, washing, scanning, data collection and initial data normalization were performed according to NimbleGen's standard protocols.

Data processing and statistical analysis

Microarray expression data were normalized into log2 intensity values. Afterwards, we carried out three distinct analyses. In the first one, we compared hybridizations from E. grandis leaf and xylem. In the second one, we compared E. grandis xylem and E. globulus xylem. In both previous analyses, the aim was to find the most similarly and the most differentially expressed genes. In the third analysis, we looked for the most similarly expressed genes in hybridizations from the three organs/tissues. In each analysis, data were mean-centered as follows. A reference set was generated by averaging the expression of each gene over all hybridizations. Each piece of hybridization data was subtracted from the reference data set, generating new mean-centered data. In the next step, the ‘relative difference’ in gene expression was computed. The relative difference score was used to identify the most similarly and the most differentially expressed genes. We have performed the two class unpaired SAM (Tusher et al. 2001) when comparing two tissues, and a multiclass SAM when comparing the three organs/tissues. In order to perform the experiments, we have used SAM Version 3.09 and R 2.9.2 tools and the SDMA V1.0 tool as described next.

Standard deviation microarray analysis (SDMA)

In this paper we propose a new approach, called SDMA, for finding the most similarly expressed genes in microarray studies. SDMA is the acronym for Standard Deviation Microarray Analysis. The formal statement of SDMA can be defined as follows: let G = {g1, g2, g3,…, gm} be a set of genes. Let H = {h1, h2, h3,…, ho} be a set of hybridizations, where o ≥ 2. Let M = {T1, T2, T3,…, Tn} be a set of tissues, where n ≥ 2 and each element T contains a set of hybridizations such that T ⊂ H, and Tx ⋂ Ty = Ø for any x ≠ y. Let Ehp = {Ehp_g1, Ehp_g2, Ehp_g3,…, Ehp_gm} be a set of expressions levels of m genes in hp hybridization, where p ≤ o.Let Avg(Tpgq) be the average of expression levels of gene q over all hybridizations from tissue p. Let Sd(gq) be the standard deviation of gene q considering Avg(T1gq), Avg(T2gq), Avg(T3gq),… and Avg(Tq). Sdg can assume any value from 0 until ∞. Sdgq is equal to zero when Avg(T1gq) = Avg(T2gq) = Avg(T3gq) = … = Avg(T), i.e. the gene q has exactly the same expression level in all tissues. The value of Sdgq increases proportionally to growth of difference among Avg(T1gq), Avg(T2gq), Avg(T3gq),… and Avg(Tq). So, SDMA can rank the n most similarly expressed, i.e. those n genes for which Sdg is closer to zero. When viewing SDMA graphs, a main diagonal line is supposed to exist since it contains every possible data point where Avg(T1g) = Avg(T2g) = Avg(T3g) = … = Avg(T). Although it is rare to find a gene obeying this restriction when comparing similar tissues, a concentration of data points around the main diagonal line is supposed to exist. Otherwise, when comparing very dissimilar tissues, data points are supposed to be dispersed in space. Regarding the Eucalyptus microarray analysis, a set of 21,442 genes was considered, i.e. G = {g1, g2, g3,…, g21,442}. There were three tissues evaluated, i.e. M = {T1, T2, T3}, where T1 represents E. grandis leaf, T2 represents E. grandis xylem and T3 represents E. globulus xylem. There was a set of 20 hybridizations, i.e. H = {h1, h2, h3,…, h20}, where T1 = {h1, h2, h3, h4}, T2 = {h5, h6,…, h12} and T3 = {h13, h14,…, h20}. We also considered SDg as the standard deviation of expression levels of a gene g in T1, T2 and T3. Moreover, microarray expression data were scaled into log10 intensity values. It resulted in values for expression levels from 4.66 to 5.20. The SDMA approach ranked the genes according to their similarities in expression levels in the three distinct tissues. So, genes with minor standard deviation are supposed to be the most similarly expressed gennes. Otherwise, genes with higher standard deviation are supposed to be the most differentially expressed ones.

RT–qPCR

Primer pairs for RT–qPCR were designed using the program PrimerQuest (http://www.idtdna.com/Scitools/Applications/Primerquest) and are listed in Table 2. The relative transcript abundance was detected by SYBR Green, and PCRs were carried out in a total volume of 20 μl using a thermocycler 7500 Real Time PCR System (Applied Biosystems). Reaction conditions included one initial cycle of denaturation at 95°C for 5 min followed by 40 cycles of 95°C for 15 s (denaturation), 60°C for 10 s (annealing) and 72°C for 15 s (elongation). PCRs were followed by a melting curve program (60–95°C with a heating rate of 0.1°C s−1 and a continuous fluorescence measurement). A negative control was run without a cDNA template in all assays to assess the overall amplification specificity.

Nucleotide sequence annotation

The Gene Ontology Functional Annotation Tool Blast2GO (Conesa et al. 2005) was used to assign GO identities and enzyme commission numbers. This tool also enabled statistical analysis related to over-representation of functional categories based on a Fisher exact statistic methodology.

Funding

This work was supported by Financiadora de Estudos e Projetos (FINEP, Brazilian Ministry of Science and Technology-MCT) [grant No. 2101063500]; Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq, MCT) [grant Nos. 50.6348/04-0, 578632/08-0, 311361/09-9]; Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES, Brazilian Ministry of Education).
  79 in total

1.  Validation of array-based gene expression profiles by real-time (kinetic) RT-PCR.

Authors:  M S Rajeevan; S D Vernon; N Taysavang; E R Unger
Journal:  J Mol Diagn       Date:  2001-02       Impact factor: 5.568

2.  Accumulation and nuclear targeting of BnC24, a Brassica napus ribosomal protein corresponding to a mRNA accumulating in response to cold treatment.

Authors: 
Journal:  Plant Sci       Date:  2000-07-14       Impact factor: 4.729

3.  Characterization of the structure and expression of a highly conserved ribosomal protein gene, L9, from pea.

Authors:  D L Moran
Journal:  Gene       Date:  2000-07-25       Impact factor: 3.688

4.  Validation of housekeeping genes as internal control for studying gene expression in rice by quantitative real-time PCR.

Authors:  Mukesh Jain; Aashima Nijhawan; Akhilesh K Tyagi; Jitendra P Khurana
Journal:  Biochem Biophys Res Commun       Date:  2006-05-03       Impact factor: 3.575

5.  Towards a systematic validation of references in real-time rt-PCR.

Authors:  Laurent Gutierrez; Mélanie Mauriat; Jérôme Pelloux; Catherine Bellini; Olivier Van Wuytswinkel
Journal:  Plant Cell       Date:  2008-07-29       Impact factor: 11.277

6.  Identification of reference genes for RT-qPCR expression analysis in Arabidopsis and tomato seeds.

Authors:  Bas J W Dekkers; Leo Willems; George W Bassel; R P Marieke van Bolderen-Veldkamp; Wilco Ligterink; Henk W M Hilhorst; Leónie Bentsink
Journal:  Plant Cell Physiol       Date:  2011-08-18       Impact factor: 4.927

7.  A new 1-deoxy-D-xylulose 5-phosphate reductoisomerase gene encoding the committed-step enzyme in the MEP pathway from Rauvolfia verticillata.

Authors:  Zhihua Liao; Rong Chen; Min Chen; Chunxian Yang; Qiang Wang; Yifu Gong
Journal:  Z Naturforsch C J Biosci       Date:  2007 Mar-Apr

8.  RefGenes: identification of reliable and condition specific reference genes for RT-qPCR data normalization.

Authors:  Tomas Hruz; Markus Wyss; Mylene Docquier; Michael W Pfaffl; Sabine Masanetz; Lorenzo Borghi; Phebe Verbrugghe; Luba Kalaydjieva; Stefan Bleuler; Oliver Laule; Patrick Descombes; Wilhelm Gruissem; Philip Zimmermann
Journal:  BMC Genomics       Date:  2011-03-21       Impact factor: 3.969

9.  Selection of reference genes for quantitative real-time PCR expression studies in the apomictic and sexual grass Brachiaria brizantha.

Authors:  Erica Duarte Silveira; Márcio Alves-Ferreira; Larissa Arrais Guimarães; Felipe Rodrigues da Silva; Vera Tavares de Campos Carneiro
Journal:  BMC Plant Biol       Date:  2009-07-02       Impact factor: 4.215

10.  Selection of reliable reference genes for gene expression studies in peach using real-time PCR.

Authors:  Zhaoguo Tong; Zhihong Gao; Fei Wang; Jun Zhou; Zhen Zhang
Journal:  BMC Mol Biol       Date:  2009-07-20       Impact factor: 2.946

View more
  22 in total

1.  Identification and validation of reference genes for expression studies in human keratinocyte cell lines treated with and without interferon-γ - a method for qRT-PCR reference gene determination.

Authors:  Angelika B Riemer; Derin B Keskin; Ellis L Reinherz
Journal:  Exp Dermatol       Date:  2012-08       Impact factor: 3.960

2.  Evaluation of stability and validation of reference genes for RT-qPCR expression studies in rice plants under water deficit.

Authors:  Priscila Ariane Auler; Letícia Carvalho Benitez; Marcelo Nogueira do Amaral; Isabel Lopes Vighi; Gabriela Dos Santos Rodrigues; Luciano Carlos da Maia; Eugenia Jacira Bolacel Braga
Journal:  J Appl Genet       Date:  2016-11-23       Impact factor: 3.240

3.  Exogenous GA3 application altered morphology, anatomic and transcriptional regulatory networks of hormones in Eucalyptus grandis.

Authors:  Qian-Yu Liu; Guang-Sheng Guo; Zhen-Fei Qiu; Xiao-Dan Li; Bing-Shan Zeng; Chun-Jie Fan
Journal:  Protoplasma       Date:  2018-02-08       Impact factor: 3.356

4.  Characterization of Brassinazole resistant (BZR) gene family and stress induced expression in Eucalyptus grandis.

Authors:  Chunjie Fan; Guangsheng Guo; Huifang Yan; Zhenfei Qiu; Qianyu Liu; Bingshan Zeng
Journal:  Physiol Mol Biol Plants       Date:  2018-06-18

5.  Screening of stable internal reference gene of Quinoa under hormone treatment and abiotic stress.

Authors:  Xiaolin Zhu; Baoqiang Wang; Xian Wang; Xiaohong Wei
Journal:  Physiol Mol Biol Plants       Date:  2021-11-16

6.  The Eucalyptus Tonoplast Intrinsic Protein (TIP) Gene Subfamily: Genomic Organization, Structural Features, and Expression Profiles.

Authors:  Marcela I Rodrigues; Agnes A S Takeda; Juliana P Bravo; Ivan G Maia
Journal:  Front Plant Sci       Date:  2016-11-30       Impact factor: 5.753

7.  Reference gene selection for real-time quantitative PCR normalization in Hemarthria compressa and Hemarthria altissima leaf tissue.

Authors:  Yao Lin; Ailing Zhang; Shengting Yang; Linkai Huang
Journal:  Mol Biol Rep       Date:  2019-06-21       Impact factor: 2.316

8.  Validation of reference genes from Eucalyptus spp. under different stress conditions.

Authors:  Jullyana Cristina Magalhães Silva Moura; Pedro Araújo; Michael dos Santos Brito; Uiara Romero Souza; Juliana de Oliveira Fernandes Viana; Paulo Mazzafera
Journal:  BMC Res Notes       Date:  2012-11-14

9.  Temporal regulation of terpene synthase gene expression in Eucalyptus globulus leaves upon ozone and wounding stresses: relationships with stomatal ozone uptake and emission responses.

Authors:  Arooran Kanagendran; Leila Pazouki; Rudolf Bichele; Carsten Külheim; Ülo Niinemets
Journal:  Environ Exp Bot       Date:  2018-08-14       Impact factor: 5.545

10.  Xylem transcription profiles indicate potential metabolic responses for economically relevant characteristics of Eucalyptus species.

Authors:  Marcela Mendes Salazar; Leandro Costa Nascimento; Eduardo Leal Oliveira Camargo; Danieli Cristina Gonçalves; Jorge Lepikson Neto; Wesley Leoricy Marques; Paulo José Pereira Lima Teixeira; Piotr Mieczkowski; Jorge Maurício Costa Mondego; Marcelo Falsarella Carazzolle; Ana Carolina Deckmann; Gonçalo Amarante Guimarães Pereira
Journal:  BMC Genomics       Date:  2013-03-22       Impact factor: 3.969

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.