Literature DB >> 16397297

A new method to remove hybridization bias for interspecies comparison of global gene expression profiles uncovers an association between mRNA sequence divergence and differential gene expression in Xenopus.

Maureen A Sartor1, Aaron M Zorn, Jennifer A Schwanekamp, Danielle Halbleib, Saikumar Karyala, Michael L Howell, Gary E Dean, Mario Medvedovic, Craig R Tomlinson.   

Abstract

The recent sequencing of a large number of Xenopus tropicalis expressed sequences has allowed development of a high-throughput approach to study Xenopus global RNA gene expression. We examined the global gene expression similarities and differences between the historically significant Xenopus laevis model system and the increasingly used X.tropicalis model system and assessed whether an X.tropicalis microarray platform can be used for X.laevis. These closely related species were also used to investigate a more general question: is there an association between mRNA sequence divergence and differences in gene expression levels? We carried out a comprehensive comparison of global gene expression profiles using microarrays of different tissues and developmental stages of X.laevis and X.tropicalis. We (i) show that the X.tropicalis probes provide an efficacious microarray platform for X.laevis, (ii) describe methods to compare interspecies mRNA profiles that correct differences in hybridization efficiency and (iii) show independently of hybridization bias that as mRNA sequence divergence increases between X.laevis and X.tropicalis differences in mRNA expression levels also increase.

Entities:  

Mesh:

Substances:

Year:  2006        PMID: 16397297      PMCID: PMC1325202          DOI: 10.1093/nar/gkj413

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Xenopus has played a prominent role in many seminal discoveries in biology (1–3). The eggs and embryos of Xenopus in comparison with most vertebrates are larger, more plentiful, simpler to obtain and easier to manipulate. These virtues led researchers of the last century to use Xenopus laevis as a model system of choice to investigate countless questions in developmental and cellular biology. However, X.laevis has some shortcomings that the closely related species Xenopus tropicalis does not. A major advantage of X.tropicalis is that it has nearly one-half the genome content of X.laevis because the X.tropicalis genome is diploid while the X.laevis genome is allotetraploid, which for the most part precludes genetic and gene expression knockdown manipulations in X.laevis that can be readily carried out in X.tropicalis. Furthermore, X.tropicalis develops in one-third the time and requires one-fifth the housing space, yet is genetically and embryologically very similar to X.laevis. To take full advantage of the Xenopus model system for studies in basic and medical science, genomic information regarding the relative RNA expression levels of X.laevis and X.tropicalis must be made available. With the advent of the X.tropicalis system for genetic and genomic studies (4,5), it would be highly valuable to the research community to ascertain the similarities and differences that exist between the historically significant X.laevis system and the increasingly used and genetically amenable X.tropicalis system. Thus, a primary intent of the work described here was to provide a comprehensive comparison of global gene expression profiles of multiple tissues and developmental stages of the two Xenopus species using microarrays and to make the data accessible to the Xenopus research community. Microarray studies with Xenopus were first carried out using a cDNA-based platform to examine the temporal regulation of global gene expression during development and neural induction (6,7). More recently, cDNA microarray platforms were used to analyse the global effect on gene expression by VegT (8) and the global gene expression profiles of different X.laevis tissues and developmental stages (9). In published work closely relevant to the work described here, mRNA expression levels of 96 genes between X.laevis and X.tropicalis were compared using a long oligonucleotide (70mer) platform based on X.tropicalis gene sequence (10). The authors found that the X.tropicalis-based microarray worked well using mRNA from X.laevis and that the two species produced similar gene expression profiles. Here, we provide a greatly expanded comparative analysis of X.laevis and X.tropicalis mRNA expression levels and cross-species hybridization using a 70mer oligonucleotide library based on the expressed gene sequences of X.tropicalis (11), in which nearly 11 000 annotated genes are examined. We investigated four questions. What is the range of hybridization efficiencies between X.laevis and X.tropicalis? Second, is there a correlation between mRNA sequence divergence and mRNA expression levels independent of hybridization efficiency? Third, how do X.laevis and X.tropicalis compare in their global gene expression profiles for selected tissues and developmental stages, i.e. which genes are expressed similarly and differently between the corresponding tissues and stages? Last, do the two Xenopus species express similar genes during development, i.e. does temporal gene expression during development correspond similarly for the two species? An investigation of the above questions allowed us to evaluate the overall efficacy of using X.laevis mRNA on the X.tropicalis microarray platform.

MATERIALS AND METHODS

Xenopus cultures and RNA isolation

X.laevis and X.tropicalis embryos were generated by in vitro fertilization as previously described (12), and the embryos were staged (13). Three separate matings were performed. Each biological replicate was from a separate mating, and 50 sibling embryos from the same mating were pooled to generate the RNA. Embryos from the different matings were always kept separate and not pooled. Total RNA was extracted from pooled embryos or 200 mg of ovary or liver tissue using Trizol Reagent (Invitrogen, Carlsbad, CA) according to the manufacturer's protocol. The total RNA was further purified by phenolchloroform extraction and ethanol precipitation.

Microarray hybridization

Total RNA from the tissues and developmental stages was amplified two rounds using the Amino Allyl MessageAmp II aRNA Amplification Kit (Ambion, Austin, TX; catalog no. 1753) according to the accompanying protocol. The amplified RNA (aRNA) samples were concentrated to 2 µg/µl by vacuum drying. The aRNA was labeled with reactive monofunctional cyanine-3 (Cy3) and cyanine-5 (Cy5) dyes (Amersham, Piscataway, NJ; catalog no. RPN5661) by an indirect amino allyl labeling method as described in Guo et al. (14) with the following exception. The labeling reaction was initiated by adding 5 µl (10 µg of aRNA) to 5 µl of coupling buffer, which in turn was used to suspend the Cy3 and Cy5 dyes. The X.tropicalis 70mer oligonucleotide library (Operon Biotechnologies, Inc., Huntsville, AL), representing 10 898 mRNA transcripts was suspended in 3× SSC at 30 µM and printed at 22°C and 65% relative humidity on aminosilane-coated slides (Cel Associates, Inc., Pearland, TX; VSA-25C) using a high-speed robotic OmniGrid machine (GeneMachines, San Carlos, CA) with Stealth SMP3 pins (Telechem, Sunnyvale, CA) (14,15). A pre-hybridization step was carried out by placing slides in slide rack and immersing them in a staining dish containing 5× SSC, 0.1% SDS and 1% BSA, stirring at 48°C for 1 h. Following pre-hybridization, the slides were dipped ∼10 times in two dishes containing deionized water at room temperature and excess water was gently shaken off. The slides were dipped 10 times in isopropanol at room temperature and spun dried. The vacuum dried Cy3 and Cy5 labeled aRNAs were suspended in 9 µl water, and the mixture heated at 95°C for 3 min and centrifuged at 10 000× g for 1 min. A 2× hybridization buffer was prepared containing 50% formamide, 10× SSC and 0.2% SDS and preheated to 48°C. To the denatured Cy3/Cy5 target mixture, 21 µl of the 2× hybridization buffer preheated to 48°C was added. To block non-specific hybridization, 8 µl of calf thymus DNA (1 µg/µl) (Sigma, St Louis, MO; catalog no. D8661), 2 µl poly(A)-DNA (10 µg/µl) (Sigma, catalog no. P9403), 2 µl yeast tRNA (4 µg/µl) (Sigma, catalog no. R8759) were added to a total volume of 42 µl. The hybridization mixture was applied to the pre-hybridized microarray slide and covered with a 22 × 60 coverslip (Fisher, Pittsburgh, PA; catalog no. 12-545-J). The slide was placed in a CMT Hybridization Chamber (Corning, Acton, MA; catalog no. 2551) and 12 µl water was added to the small reservoirs at each end of the chamber. The sealed hybridization chambers were placed in a water bath at 48°C for 66 h (16). After hybridization, the slides were placed in a slide rack, set in a staining dish containing 1× SSC with 0.1% SDS preheated to 48°C, the coverslips were removed, and the slides were washed for 15 min at 48°C with agitation. The slides were washed further with agitation for 5 min at 48°C three times in 0.1× SSC and 0.1% SDS. The slides were transferred to a staining jar containing 0.1× SSC and washed twice at room temperature for 5 min with agitation. The slides were spun dried immediately after washing, and imaging and data analyses were carried out as described (16).

Data normalization and analysis

The data representing raw spot intensities generated by GenePix® Pro version 5.0 was analysed to identify differentially expressed genes. Data normalization was performed in three steps for each microarray separately (16). Channel specific local background intensities were subtracted from the median intensity of each channel (Cy3 and Cy5). Second, background adjusted intensities were log-transformed and the differences (R) and averages (A) of log-transformed values were calculated as R = log2(X1) − log2(X2) and A = [log2(X1) + log2(X2)]/2, where X1 and X2 denote the Cy5 and Cy3 intensities after subtracting local backgrounds, respectively. Third, data centering was performed by fitting the array-specific local regression model of R as a function of A. The difference between the observed log-ratio and the corresponding fitted value represented the normalized log-transformed gene expression ratio. Normalized log-intensities for the two channels were then calculated by adding half of the normalized ratio to A for the Cy5 channel and subtracting half of the normalized ratio from A for the Cy3 channel. A statistical analysis was performed for each gene and for each Xenopus species separately by fitting the following mixed effects linear model (9). Y = μ + A + S + C + μ, where Y corresponds to the normalized log-intensity on the i-th array (i = 1, …, 15), with the j-th tissue/sample type (j = 1,…, 6) and labeled with the k-th dye (k = 1 for Cy5 and 2 for Cy3). μ is the overall mean log-intensity, A is the effect of the i-th array, S is the effect of the j-th tissue/sample type and C is the effect of the k-th dye. Assumptions about model parameters were the same as described by Wolfinger et al. (17), with array effects assumed to be random and treatment and dye effects assumed to be fixed. Additionally, a similar statistical analysis was performed for interspecies comparisons using the data from both Xenopus species. Statistical significance of differential expression among RNA samples, after adjusting for array and dye effects, was assessed by calculating P-values and estimates of fold-change were calculated. Multiple hypotheses testing adjustment was performed for the full analysis by calculating false discovery rates (FDR) (18,19) and Bonferroni adjusted P-values. Data normalization and statistical analyses were performed using SAS statistical software package (SAS Institute Inc., Cary, NC).

Cluster analysis

Clustering was performed using Bayesian infinite mixture (BIM) model based clustering (20) using normalized expression values for each comparison. BIM model based clustering allowed for the fitting of the statistical mixture model without knowing the number of clusters in the data (21). The statistical model was fitted using the Gibbs sampler, and hierarchical clustering was produced by treating pairwise posterior probabilities as the similarity measure and applying the traditional complete-linkage principle. The clustering results were displayed using the TreeView program () (22).

Sequence Analysis

Measures of sequence similarity were obtained from Basic Local Alignment Search Tool (BLAST) searches of each of the 70mer oligonucleotides against an X.laevis EST database performed by Operon Biotechnologies, Inc. Measures of whole gene sequence similarity were obtained by similarly blasting the X.tropicalis sequences from which the 70mer oligonucleotides were derived against all X.laevis complete coding sequences available from the NCBI database. In both cases, the value used as the measure of sequence similarity was the highest ‘bit’ score among the significant matches for each BLAST result. We chose to use bits rather than E-scores because we were searching against only one database. Bits is a commonly used BLAST outcome score and is defined as follows: Score (bits) = [λ * Score (raw) − ln K] / ln 2, where λ and K are Karlin-Altschul parameters (23).

Real-time quantitative polymerase chain reaction

Quantitative polymerase chain reaction (QPCR) analysis using SYBR green and designed primers was carried out following a described protocol (14,15) to confirm the microarray results. Ribosomal protein 60S L4 (RPL4) was selected for use as the reference RNA because there was little difference in RPL4 mRNA levels among the three developmental stages and two tissues of X.tropicalis to X.laevis. The forward and reverse primer sequences for RPL4 were 5′CCAGAATCCTGAAAAGCCAGGAG3′ and 5′TCTCAAGCTGCTGCAGGATAGC3′, respectively (product size 167 bp). The average cycle threshold (CT) value for the reference RNA was used to normalize the tested gene hypoxia inducible factor 1α (HIF1α). The forward and reverse primer sequences for HIF1α were 5′GTAGTTCAAGGCTTTGATGC3′ and 5′GCATGAAATCAAATACCAAGC3′, respectively (product size ∼150 bp). The primer sequences for RPL4 and HIF1α were 100% complementary to the corresponding gene regions in both Xenopus species. All the PCR products produced single bands of the predicted sizes. For the negative control, no template was added to the reference RNA primers from which no PCR products were detected after 40 cycles. Approximately 20–30 µg of aRNA used in the microarray analysis (see earlier description) was used as template for cDNA synthesis using random primers. The QPCR was performed two times using 125 ng (25 ng/µl) and 60 ng (20 ng/µl) of starting cDNA template. The average CT values for the PCR amplifications and the reference RNAs were determined by carrying out a QPCR measurements from three biological replicates for each gene in each experimental condition (two tissues and three developmental stages). The results from the two QPCR runs were combined for statistical analysis. Differential gene expression ratios were calculated based on CT values of the reference RNAs using the following calculations. ANOVA was used to calculate the for each tissue/developmental stage and P-values. The values were then transformed back to the original scale, where normalized RNA expression levels were 2Δ.

RESULTS

We used a high-throughput microarray approach to investigate genomics of the Xenopus model system. The microarray experiments were carried out using a printed library of 70mer oligonucleotides representing 10 898 mRNA transcripts. The sequences were derived from the X.tropicalis expressed sequence tag (EST) project carried out at the Wellcome Trust/Cancer Research Gurdon Institute in Cambridge, UK. The entire experimental design for the microarray experiments carried out for the studies discussed in this paper is shown in Figure 1.
Figure 1

Experimental design for the microarray studies. mRNA expression levels from three corresponding biological replicates of X.laevis (Xl) and X.tropicalis (Xt) tissues (ovary and liver) and developmental stages (egg; stage 10, St. 10 and stage 40, St. 40) were compared to each other and with a reference RNA (Ref.). The reference RNA was composed of equal amounts of total RNA of the above tissues and developmental stages for a given Xenopus species. Each double pointed arrow represents three microarray slides, one slide per biological replicate, in which one of the three slides was ‘dye flipped’.

Hybridization efficiencies of X.laevis transcript sequences to X.tropicalis probes

Before additional questions could be investigated, we first needed to determine whether X.laevis transcript sequences could efficiently hybridize to X.tropicalis probes. The studies were carried out to ascertain a quantitative measure of how well X.laevis transcripts bound to the X.tropicalis microarray platform and to determine transcript-specific hybridization efficiency correction factors for direct interspecies comparisons. Hybridization efficiencies were determined by directly comparing the transcript levels of the corresponding tissues and developmental stages for the two Xenopus species, and the experimental design used for this portion of the study is shown in Figure 2A. Hybridization conditions for the microarray studies were relatively stringent (48°C, 25% formamide) to minimize as much as possible any experimental variability owing to non-specific binding.
Figure 2

A measurement of microarray hybridization efficiencies for X.laevis versus X.tropicalis. (A) The experimental design to determine the hybridization efficiencies of X.laevis transcripts from ovary, liver, egg; stage 10 (St. 10) and stage 40 (St. 40) to the X.tropicalis DNA microarray probes. Corresponding RNA samples from three biological replicates of the tissues and developmental stages for a given Xenopus species were directly compared to each other by microarray analysis. (B) A plot of log average spot intensities from X.laevis (Y-axis) versus X.tropicalis–X.laevis probe sequence similarity (X-axis). (C) A plot of the log-ratio hybridization bias (Y-axis) versus X.tropicalis–X.laevis probe sequence similarity (X-axis). The Y-axis represents the expected offset from actual relative mRNA expression levels (X.tropicalis/X.laevis).

The calculation of hybridization efficiencies was carried out using two methods. The first method is the method of choice for single channel arrays, and the other method is the choice for dual channel arrays. The methods included a local regression of X.laevis log spot intensity (single channel arrays, Figure 2B) and a local regression of the log of the ratios of the expression levels for each gene (dual channel arrays, Figure 2C) (plotted on the Y-axis) versus values that are a measure of sequence similarity in the corresponding 70mer probe (X-axis). In other words, the X-axis represents the degree to which the 70mer oligonucleotide probe contained significant sequence matches to the corresponding mRNA sequence (see Materials and Methods for a more detailed description). By assuming that for any sequence similarity level, the mRNA levels for an approximately equal number of genes increased or decreased, we could define the predicted values form local regression as the correction factors to use for hybridization efficiency. Local regression was chosen for two reasons: it is readily available in microarray analysis software because it is commonly used for the normalization of microarray data, and it avoids defining a specific parametric function for the dependence of hybridization efficiencies on sequence similarity. For the 9346 probes tested, i.e. those probes that produced signal over background levels, the Spearman rank correlation coefficient for spot intensity versus sequence similarity was R = 0.4091 (P-value < 0.0001). These results indicated that overall there was a reasonable increase in spot intensity as sequence similarity increased (Figure 2B). A primary intent of our studies was to determine whether X.laevis gene expression profiles can be determined on an X.tropicalis microarray platform. Thus, we asked whether hybridization efficiency seemed to be a factor in differential gene expression measurements (Figure 2C). A measurement of the log ratios versus sequence similarity showed that for the ∼4000 probes tested from the direct comparison of the different tissues and developmental stages, as sequence similarity decreased, differential gene expression increased between the two species (Figure 2C). The negative correlation was greatest for transcripts from stage 40 (R = −0.5328) and least from liver (R = −0.3479) with P-values at <0.0001 for all the RNA samples. The predicted values obtained from the local regression analysis were used as the correction factors for hybridization analysis. The magnitude of correction needed showed the likely need of using a correction factor in microarray interspecies comparisons, and although our correction factors are experiment-specific and therefore not listed here, this method for determining correction factors can be used for any interspecies microarray comparisons. In conclusion, using methods that directly compared X.laevis and X.tropicalis transcript sequences with the X.tropicalis platform, we showed that (i) the greater the divergence of RNA transcripts, the greater the difference in spot intensities and differential mRNA levels and (ii) correction factors were calculated for the X.laevis sequences.

Differential mRNA expression levels increase as gene sequence divergence increases

A reasonable argument to explain the results in the previous section is that the observed association between sequence divergence versus spot intensities and differential gene expression is due to the inability of divergent X.laevis sequences to efficiently hybridize to the X.tropicalis probes on the microarray. That is, of course spot intensities will decrease and differences in mRNA levels will increase if one of the sets of transcripts has greater sequence divergence to the arrayed probes. Therefore, we developed a new method to pursue the intriguing question of whether there is an association between RNA transcript divergence and differential gene expression in a way that was free of bias owing to differences in interspecies hybridization efficiencies. We hypothesized that on the whole, as mRNA sequence divergence increased, differences in gene expression between X.laevis and X.tropicalis would also increase. The hypothesis was based on the premise that the more similar an mRNA is expressed in time and space, the more probable the gene sequence and (therefore the encoded protein) will be similar. To test our hypothesis, we designed an experimental approach that was independent of differences in hybridization efficiencies (Figure 3A), in which mRNA levels from liver, ovary, egg, stage 10 and stage 40 were compared with a corresponding X.laevis or X.tropicalis reference RNA. Each reference RNA was composed of equal masses of total RNA from the two tissues and three developmental stages from the respective Xenopus species. The separate comparisons with a commonly composed reference RNA allowed the determination of relative gene expression changes for each Xenopus species without the interceding bias owing to differences in hybridization efficiencies of the two Xenopus targets to X.tropicalis probes. The gene expression ratio values between the reference RNA and each stage and tissue for corresponding genes were next compared with each other to derive a normalized differential expression value for each gene and can be summarized by the formula: log [(Xt / Xt Ref) / (Xl / Xl Ref)], where Xl is X.laevis mRNA, Xt is X.tropicalis mRNA and Ref is the reference RNA. Of the 10 898 RNA transcripts screened on the microarray slide, 1681 were identified as significantly different from the corresponding reference RNA for at least one tissue or developmental stage, using the relatively strict criteria of ≥2-fold difference in mRNA levels and FDR (18) of ≤0.05.
Figure 3

Differential mRNA expression levels for X.laevis (Xl) and X.tropicalis (Xt) are correlated with mRNA transcript sequence divergence. (A) The experimental design to determine whether X.laevis and X.tropicalis differential gene expression for ovary, liver, egg, stage 10 (St. 10) and stage 40 (St. 40) is associated with sequence divergence. mRNA from three biological replicates of the tissues and developmental stages for a given Xenopus species were compared with a corresponding reference RNA (Ref.), in turn, the corresponding ratios from each Xenopus species were compared to each other. (B) Predicted relative gene expression levels of X.laevis to X.tropicalis determined from local regression analysis (Y-axis) versus probe sequence similarity (X-axis). Only the 1681 genes described in the text that were significantly changed relative to the corresponding reference RNA were plotted. (C) A plot of the squared correlation coefficient values (Y-axis) versus probe sequence similarity (X-axis). The Y-axis represents the square of the correlation coefficients for the 1681 genes.

Three approaches were used to examine mRNA expression levels versus sequence divergence (Figure 3B and C, Table 1). The first approach was an investigation into the relationship between the difference in mRNA expression levels for each tissue and stage normalized to their respective reference RNAs (Y-axis) versus sequence similarity (X-axis) (Figure 3B). Of the five tissues and developmental stages tested, all but liver (P-value 0.203) showed a significant negative correlation (the remaining: P-value < 0.0001) between mRNA expression levels normalized to reference and sequence similarity (Figure 3B). The range of the correlation values were R = −0.0785 for egg to R = −0.1260 for ovary. These results showed that as sequence divergence increased so did the divergence in relative mRNA expression levels.
Table 1

Differential mRNA expression levels between X.laevis and X.tropicalis increase as mRNA sequence divergence increases

Gene GroupaMedians of absolute normalized log differences b,c
EggdStage 10Stage 40LiverOvarydBit score range (median)
Group 1 (Higher similarity scores)0.460.510.610.870.32735–4230 (1201)
Group 2 (Lower similarity scores)0.670.620.641.080.4734–722 (348)

aThere were 77 genes in each group.

bOverall P-value = 0.007 from Wilcoxon non-parametric test.

cAbsolute normalized log difference calculation: |log [(Xt / Xt Ref) / (Xl / Xl Ref)]|.

dSignificant difference: P-value <0.05.

The second approach (Figure 3C) was an examination of the relationship between the square of the correlation coefficients (data from Figure 4B) for all the genes that were significantly different from the reference RNA (Y-axis) versus the sequence similarity measure (X-axis). For each gene, the square of the correlation coefficient is a measure of mRNA expression similarity of each gene across tissues in X.laevis versus X.tropicalis. The spearman rank correlation coefficient was R = 0.1234 (P-value < 0.0001) showing that as sequence similarity decreased, the correlation between X.laevis and X.tropicalis tissues and developmental stages also decreased. Again, these results are consistent with the contention that the more similar a gene is between X.laevis and X.tropicalis the more likely the genes are expressed similarly.
Figure 4

The gene expression profiles for X.laevis (Xl) and X.tropicalis (Xt) are similar. (A) The experimental design to determine how X.laevis and X.tropicalis compare in their global gene expression profiles for selected tissues (ovary and liver) and developmental stages (egg, stage 10 and stage 40). mRNA levels from three biological replicates from the two tissues and three developmental stages were compared to a reference RNA (Ref.) for a given Xenopus species. The estimates of log [(Xt/Xt Ref)/(Xl/Xl Ref)] for each gene were compared between X.laevis and X.tropicalis to determine correlation coefficients. (B) Histogram representing the correlation coefficients between X.laevis and X.tropicalis using 1681 transcript levels that changed significantly among the different tissues and stages.

The above two approaches provided global quantitative measures in the relationship between sequence divergence and gene expression. To totally rule out hybridization efficiency as a confounding factor, the third method we used was an examination of the sequence similarities of specific RNA transcripts from homologous genes of X.laevis and X.tropicalis (Table 1) and was performed to eliminate the possibility that the correlation seen above was due to variability in physical binding properties. In this case, RNA transcript sequence divergence was determined for those genes in which the 70mer probe sequence on the microarray matched at least 69 of 70 nt within the corresponding X.laevis mRNA sequence. This approach allowed a direct measure of sequence divergence independent of hybridization efficiencies because both the X.laevis and X.tropicalis sequences shared nearly identical regions for hybridization to the microarray probes. A total of 154 genes were found that met this condition as well as being present in all tissues and in developmental stages. The sequence similarities of the 154 genes were then compared with the corresponding differential gene expression levels (Table 1). The results in Table 1 confirmed the results from the global approaches (Figure 3B and C). The 154 genes were evenly divided into two groups in which Group 1 contained those genes that were more similar in sequence outside the complementary 70mer region and Group 2 contained those genes that were less similar outside the 70mer region. The two gene groups were then compared for the five tissues and developmental stages, in which the normalized log difference of the differential mRNA levels were calculated by the formula log [(Xt/Xt Ref)/(Xl/Xl Ref)] described earlier. Differential mRNA expression levels between the two gene groups were significantly different when the data for all the stages and tissues were combined (P-value < 0.007). Additionally, each developmental stage and tissue showed a trend in which differential gene expression widened as sequence divergence increased, and for egg and ovary, the difference was significant (P-value < 0.05). We note that our measure of sequence similarity is dependent on the database used for BLAST, and as the relevant sequence databases are updated, the analysis will be more precise. Nevertheless, we were able to detect significance even in the midst of the noise created by incomplete sequence data. In conclusion, as shown by different approaches, as sequence divergence in mRNA transcripts increased between X.laevis and X.tropicalis, differential gene expression also increased.

The global gene expression profiles for X.laevis and X.tropicalis are generally similar

The third question we asked was in regard to how X.laevis and X.tropicalis compare in their global gene expression profiles for selected tissues and developmental stages. For this part of the study, we used those data derived from the experimental design shown in Figure 4A, in which mRNA levels from liver, ovary, egg, stage 10 and stage 40 were compared with the corresponding X.laevis or X.tropicalis reference RNA described earlier. Again, separate comparisons with a commonly composed reference RNA allowed the determination of relative gene expression changes for each Xenopus species independent of differences in hybridization efficiencies. As described earlier, 1681 genes were identified as significantly different in mRNA expression levels from the corresponding reference RNA for at least one tissue or developmental stage. Use of the significantly changed genes allowed us to compare ratio-based correlation coefficients for like genes between X.laevis and X.tropicalis among tissues (Figure 4B). Of the 10 898 total transcripts, 2510 transcripts (23%) were present in all tissues and met the signal to noise criterion for analysis of X.laevis versus the corresponding reference RNA, and 8317 transcripts (76%) were similarly analysed for X.tropicalis versus its reference RNA. Of the 1681 significantly changed genes ∼24% had a correlation of 0.90 or greater, and 68% of the genes for the two Xenopus species had correlation values of 0.5 or greater. Thus, the global gene expression levels for X.laevis and X.tropicalis are generally similar for the examined tissues and developmental stages. The EASE program (24), was used to functionally group the most positively (≥0.90) and negatively (≤−0.35) correlated genes between X.laevis and X.tropicalis based on the Gene Ontology (GO, ) database, in which gene products are described in the context of biological processes, cellular components and molecular functions in a species-independent manner (24,25) (Table 2). Using Fisher's Z-score in EASE, the two gene lists were used to test for GO categories significantly enriched with either the most highly, or negatively, correlated genes. The results revealed that the most highly correlated expressed genes were involved primarily in protein synthesis. The lowly correlated genes did not yield any significant results. The 20 most positively and negatively correlated genes are shown in Table 3. The merging of all 1681 significantly changed genes into the GO program (Table 4) showed that the categories whose genes correlated most highly between X.laevis and X.tropicalis, on average were related to development, growth, reproduction, cell death, biosynthesis and response to abiotic/external stimulus.
Table 2

The biological process, molecular function, or cellular component involving those genes with the most highly correlated (≥0.90) gene expression levels between X.laevis and X.tropicalis

GO Gene CategoryaList hitsList totalP-valueBonferroni P-valueBenjamini FDRb
Structural molecule activity522110.0000.0000.000
Cytosolic ribosome292080.0000.0000.000
Protein biosynthesis462110.0000.1350.012
Nucleic acid binding882110.0000.1420.012
Ribonucleoprotein complex462080.0000.1490.012
Protein metabolism852110.0000.2630.020
Cytosol362080.0010.7250.051
Binding1412110.0010.7720.051
Biosynthesis532110.0091.0000.446

aGene Ontology (GO) program ().

bFalse Discovery Rate (18,19).

Table 3

The 20 genes of the highest (upper tier) and lowest (lower tier) correlation values

NCBI clone IDGene product descriptionPearson correlation
NP_000970.1Ribosomal protein L18; 60S ribosomal protein L18 [Homo sapiens]0.997
NP_009089.2NRAS-related gene; upstream of NRAS [H.sapiens]0.997
NP_003398.1Zinc finger protein 360.997
NP_115684.1Hypothetical protein MGC4189 [H.sapiens]0.995
NP_497827.1Cytosolic juvenile hormone binding protein subunit like (32.1 kD) (3F243)0.995
NP_001001.2Ribosomal protein S6; 40S ribosomal protein S6; phosphoprotein NP330.995
NP_004692.1Cyclin B2 [H.sapiens]0.995
AAH61315.1Unknown (protein for MGC:75795) [Silurana tropicalis]0.994
BAC98186.1mKIAA1504 protein [Mus musculus]0.994
NP_114172.1Cyclin B1; G2/mitotic-specific cyclin B1 [H.sapiens]0.994
P30985Transcription factor 12 (Class A helix–loop–helix transcription factor GE1)0.993
NP_001924.2Dihydrolipoamide S-succinyltransferase0.992
NP_057735.2DAPPER1 [H.sapiens]0.992
NP_057018.1Nucleolar protein NOP5/NOP58 [H.sapiens]0.992
NP_014936.1Homology to rat S10; Rps10ap [Saccharomyces cerevisiae]0.992
P98199Potential phospholipid-transporting ATPase (ATPase class I type 8B member 2)0.992
NP_000995.1Ribosomal protein P2; 60S acidic ribosomal protein P20.991
NP_001025.1Ribonucleotide reductase M2 polypeptide [H.sapiens]0.991
NP_002257.1Karyopherin alpha 2; RAG cohort 1; importin alpha 1 [H.sapiens]0.991
XP_232671.2Similar to Probable chromodomain-helicase-DNA-binding protein KIAA14160.991

NP_060695.1Homolog of Caenorhabditis elegans smu-1; ortholog of rat brain-enriched WD-repeat protein−0.774
AAA70336.1LATS−0.784
AAH60352.1Unknown (protein for MGC:68448) [Xenopus laevis]−0.787
NP_733366.1CG2139-PB [Drosophila melanogaster]−0.790
NP_077287.1Hypothetical protein ET [H.sapiens]−0.796
NP_150597.1Mitochondrial ribosomal protein S36 [H.sapiens]−0.802
NP_006295.1Deleted in split-hand/split-foot 1 region [H.sapiens]−0.829
NP_057124.2CGI-100 protein [H.sapiens]−0.831
BAC04638.1Unnamed protein product [H.sapiens]−0.837
NP_005889.2Membrane component, chromosome 11, surface marker 1 [H.sapiens]−0.841
XP_358556.1Hypothetical protein XP_358556 [M.musculus]−0.859
NP_001780.1Cell division cycle 25A; Cdc25A; protein-tyrosine-phosphatase [H.sapiens]−0.865
NP_060590.1Armadillo repeat-containing protein [H.sapiens]−0.869
NP_733778.1Muscle-specific beta 1 integrin binding protein [H.sapiens]−0.890
NP_056299.1GCIP-interacting protein p29 [H.sapiens]−0.896
NP_780399.1RIKEN cDNA 2900010J23 [M.musculus]−0.905
XP_215053.2Similar to D7Wsu128e protein [Rattus norvegicus]−0.908
NP_068681.1Quaking protein [M.musculus]−0.912
NP_001800.1Centromere protein A; centromere protein A (17kD) [H.sapiens]−0.935
Table 4

The physiological processes involving the genes (1681 total) that were significantly differentially expressed between X.laevis and X.tropicalis

LevelaGO termaGO descriptionaCorrelation
GenesMeanbMediancSD
1GO:0007275Development1160.610.800.45
1GO:0050789Regulation of biological process2030.600.740.41
1GO:0009987Cellular process7970.580.740.43
1GO:0007582Physiological process7930.570.730.43

2GO:0040007Growth180.700.850.32
2GO:0050793Regulation of development130.690.840.29
2GO:0000003Reproduction170.540.830.57
2GO:0016265Death360.670.820.35
2GO:0009605Response to external stimulus510.650.810.40
2GO:0042592Homeostasis110.590.800.45
2GO:0050794Regulation of cellular process1740.610.770.41
2GO:0009653Morphogenesis710.590.750.45
2GO:0050791Regulation of physiological process1930.600.750.42
2GO:0030154Cell differentiation220.640.740.42
2GO:0008152Metabolism6220.580.740.42
2GO:0007154Cell communication1110.570.730.41
2GO:0006950Response to stress590.590.700.38
2GO:0050790Regulation of enzyme activity130.530.700.53
2GO:0006928Cell motility200.540.670.41
2GO:0009719Response to endogenous stimulus150.650.600.25
3GO:0016049Cell growth120.720.890.34
3GO:0009628Response to abiotic stimulus170.630.850.49
3GO:0040008Regulation of growth110.680.840.31
3GO:0009581Detection of external stimulus160.570.830.53
3GO:0008219Cell death360.670.820.35
3GO:0009058Biosynthesis1720.630.810.42
3GO:0019953Sexual reproduction160.520.810.58
3GO:0019538Protein metabolism2980.620.800.42
3GO:0016043Cell organization and biogenesis780.600.790.44
3GO:0009607Response to biotic stimulus450.640.750.34
3GO:0007267Cell–cell signaling110.550.740.46
3GO:0008283Cell proliferation470.560.740.46
3GO:0006793Phosphorus metabolism640.620.730.37
3GO:0006118Electron transport460.550.730.38
3GO:0007155Cell adhesion210.650.720.29
3GO:0009887Organogenesis530.600.720.41
3GO:0019222Regulation of metabolism1230.570.700.42
3GO:0006139Nucleobase, nucleoside, nucleotide and nucle2380.550.690.44
3GO:0007165Signal transduction890.560.690.41
3GO:0009308Amine metabolism300.520.690.42
3GO:0006519Amino acid and derivative metabolism260.510.680.44
3GO:0005975Carbohydrate metabolism390.570.670.40
3GO:0040011Locomotion200.540.670.41
3GO:0006810Transport1620.530.660.45
3GO:0006119Oxidative phosphorylation230.630.660.27
3GO:0009056Catabolism740.560.650.42
3GO:0009611Response to wounding170.560.630.41
3GO:0006091Generation of precursor metabolites and energy830.530.610.39
3GO:0006066Alcohol metabolism260.560.610.36
3GO:0006082Organic acid metabolism370.460.600.44
3GO:0006974Response to DNA damage stimulus140.630.600.24

aGene Ontology (GO) program ().

bOverall mean = 0.56.

cOverall median = 0.73.

QPCR assays were carried out to confirm the microarray results. The HIF1α gene was selected for the QPCR assays based on the criteria that it (i) was expressed at relatively high intensities; (ii) was showed relatively large fold changes for at least some of the different tissues and developmental stages and (iii) was a gene that encoded a product of known function. The fold-change increase in HIF1α RNA expression levels in X.tropicalis relative to X.laevis from the microarray studies were: egg, 5.2; stage 10, 9.4; stage 40, 8.5; liver, 27.3 and ovary, 6.3 (P-values ≤ 0.001). Using RPL4 as a reference, the fold-change increase in HIF1α RNA expression levels in X.tropicalis relative to X.laevis from the QPCR results were: egg, 7.4; stage 10, 1.2; stage 40, 3.3; liver, 2.8 and ovary, 1.5 (egg, stage 40, and liver P-value ≤ 0.001; ovary P-value ≤ 0.05). Thus, because the QPCR results for the five tissues/developmental stages all followed the same trend as the microarray results and at statistically significant levels (except for stage 10), the QPCR results validated the microarray results.

The global gene expression profiles for X.laevis and X.tropicalis follow parallel temporal developmental programs

The last question posed was how the two Xenopus species compare in their gene expression profiles during development. That is, do the temporal gene expression profiles correspond similarly to the observed embryological stages for the two species? Identifying the genes that behave similarly and differently at the different developmental stages may lead to the identification of determinants. To carry out this portion of the study, mRNA levels from egg, stage 10 and stage 40 from a given Xenopus species was compared with the same corresponding reference RNAs described earlier (Figure 5A). In addition, for each Xenopus species, mRNA from egg was compared with stage 10 mRNA and with stage 40 mRNA relative to the reference RNA. By comparing gene expression levels in relation with the corresponding reference RNA, the experimental design again allowed for the detection of differentially expressed genes during the course of development without the skewing effects from differences in hybridization efficiencies for the two Xenopus species on the X.tropicalis-based microarray platform.
Figure 5

The global gene expression profiles for X.laevis (Xl) and X.tropicalis (Xt) follow parallel temporally-regulated developmental programs. (A) The part of the experimental design used to determine how X.laevis and X.tropicalis compare in their global gene expression profiles for selected developmental stages (egg; stage 10, St. 10 and stage 40, St. 40). mRNA levels from three biological replicates from the three developmental stages for a given Xenopus species were compared with a reference RNA (Ref.) and mRNA from egg was compared with stage 10 and to stage 40 via the reference RNA. (B) Hierarchical tree of genes and heat map of the developmental stages, in which corresponding stage 10 and stage 40 mRNA levels were compared to egg mRNA levels for each Xenopus species. The top 200 ranked genes in each comparison that were at least 50% changed were included. The heat map columns left to right are: X laevis stage 10 versus X.laevis egg, X.tropicalis stage 10 versus X.tropicalis egg, X.laevis stage 40 versus X.laevis egg, and X.tropicalis stage 40 versus X.tropicalis egg. The brackets to the right of the heat map numbered 1–3, designate groups of genes that are contrary to the overall clustering trend and are described in the text.

The heat map in Figure 5B shows a gene cluster diagram for stage 10 and stage 40 mRNA levels versus corresponding X.laevis and X.tropicalis egg mRNA levels and includes the top ranked most significantly changed 200 genes in each comparison that changed at least 50%. There were overlapping genes that were top ranked in more than one comparison, thus, a total of 547 different genes were included. The heat map shows that for the vast majority of genes, a given gene that either increased (red), decreased (green) or did not change (black) relative to the corresponding egg mRNA, also increased, decreased or did not change accordingly in the other Xenopus species. Thus, the heat map is generally a block of genes with decreased mRNA levels (green upper half) over a block of genes with increased mRNA expression levels (lower red half) and indicates that the gene expression profiles during development for X.laevis and X.tropicalis are very similar. Nonetheless, there were several small clusters of genes that were contrary to the general trend and are designated to the right of the heat map by the numbers 1–3 (Figure 5B). For example, the cluster of genes designated number one shows a group of genes that decreased at stages 10 and 40 in X.tropicalis, whereas, in X.laevis, the same genes remained relatively unchanged. However, in each of the groups 1–3, no functional relationship among the genes could be discerned and suggests that there are no major differences in any specific developmental program. In conclusion, for the great majority of the genes, mRNA levels for the same genes rose and declined similarly during development in both Xenopus species. Additional evidence that the developmental programs for X.laevis and X.tropicalis are similarly regulated is shown in Table 5. Table 5 lists the correlation coefficients between the two Xenopus species for the 116 genes identified as significantly changed among tissues or developmental stages, i.e. those genes involved in development according to the GO database (first row in Table 4, GO term GO:0007275). Of the 116 genes listed, 88 (76%) had a correlation coefficient of 0.5 or greater (above the black dividing line) suggesting that many of the key genes that direct or participate in development are regulated similarly in the two Xenopus species.
Table 5

Correlation coefficients of genes known to be involved in Xenopus development

Gene IDDescription of development genesPearson correlation
7812NRAS-related gene; upstream of NRAS1.00
51602Nucleolar protein NOP5/NOP580.99
4678Nuclear autoantigenic sperm protein isoform 10.99
21974Topoisomerase (DNA) II beta0.98
1786DNA (cytosine-5-)-methyltransferase 1; DNA methyltransferase 1; DNAmethyltransferase0.98
1466Cysteine and glycine-rich protein 2; SmLIM; LIM domain only 5, smooth muscle0.98
175621EMB-5, abnormal EMBryogenesis EMB-5 (175.8 kD) (emb-5)0.98
4624Myosin heavy chain 6; myosin heavy chain, cardiac muscle alpha isoform0.97
851389Required for Start B in mitosis and for meiosis I spindle pole body separation;Cdc36p0.97
1277Alpha 1 type I collagen preproprotein0.97
1459Casein kinase 2, alpha prime polypeptide0.97
7273Titin isoform N2-B; connectin; CMH9, included; cardiomyopathy, dilated 1G0.97
13822Unnamed protein product0.96
30096Zic10.95
5351Lysyl hydroxylase precursor; lysine hydroxylase0.95
3149High-mobility group box 3; high-mobility group (nonhistone chromosomal) protein40.95
172244Cytoplasmic Polyadenylation Element-Binding protein (cpb-3)0.95
655Bone morphogenetic protein 7 precursor; osteogenic protein 10.95
3399Inhibitor of DNA binding 30.94
326340Zygote arrest 10.94
70Actin, alpha, cardiac muscle precursor0.94
3622Inhibitor of growth 1-like0.94
7020Transcription factor AP-2 alpha0.94
855568Membrane-bound casein kinase I homolog; Yck2p0.93
54766B-cell translocation gene 4; putative transcriptional regulator0.93
1301Alpha 1 type XI collagen isoform B preproprotein; collagen XI, alpha-1polypeptide0.93
8651Suppressor of cytokine signaling 1; STAT induced SH3 protein 10.93
4116Mago-nashi homolog0.92
23411Sirtuin 1; sir2-like 1; sirtuin type 10.92
51654CDK5 regulatory subunit associated protein 1 isoform a0.92
7784Zona pellucida glycoprotein 3 preproprotein0.91
6520Solute carrier family 3 (activators of dibasic and neutral amino acidtransport), member 20.91
86BAF53a; hArpN beta; actin-related protein; BAF complex 53 kDa subunit;BRG1-associated factor0.91
57829Zona pellucida glycoprotein 4 preproprotein; zona pellucida B protein0.90
3475Interferon-related developmental regulator 10.90
854549Homolog of chicken calponin, thus the name S.cerevisiae CalPonin; Scp1p0.89
70Actin, alpha, cardiac muscle precursor0.89
5757Prothymosin, alpha (gene sequence 28)0.89
8857Fc fragment of IgG binding protein; IgG Fc binding protein0.89
12505CD44 antigen precursor (Phagocytic glycoprotein I) (PGP-1) (HUTCH-I)0.88
323630Similar to dishevelled 2, dsh homolog0.87
64603T-box transcription factor eomesodermin0.87
6678Secreted protein, acidic, cysteine-rich (osteonectin)0.87
4729NADH dehydrogenase (ubiquinone) flavoprotein 224 kDa0.87
984Cell division cycle 2-like 1 (PITSLRE proteins); Cell division cycle 2-like 10.86
3491Cysteine-rich, angiogenic inducer, 610.85
2266Fibrinogen, gamma chain isoform gamma-A precursor0.85
20687trans-acting transcription factor 30.85
23481Pescadillo homolog 1, containing BRCT domain0.85
174044SMAll body size SMA-6, Serine-threonine kinase, transforming growth factor betatype I receptor0.84
10361Nucleoplasmin 20.83
180357ForKHead transcription factor family member, defective PHArynx development0.82
6159Ribosomal protein L29; 60S ribosomal protein L29; heparin/heparansulfate-interacting protein0.82
176688Serine/arginine rich splicing factor SF2, substrate of the SR protein kinaseSPK-1 (28.7 kDa)0.82
1278Alpha 2 type I collagen; Collagen I, alpha-2 polypeptide; Collagen of skin,tendon and bone, alpha-2 chain0.81
5292Pim-1 oncogene; Oncogene PIM10.80
54993Zinc finger protein 290.80
51399Synbindin; TRS23 homolog; hematopoietic stem/progenitor cell protein 1720.80
70Actin, alpha, cardiac muscle precursor0.79
6658SRY (sex determining region Y)-box 3; transcription factor SOX-30.78
3398Inhibitor of DNA binding 2; inhibitor of differentiation 2; DNA-binding proteininhibitor ID20.77
54514DEAD (Asp-Glu-Ala-Asp) box polypeptide 4; VASA protein0.76
26578Osteoclast stimulating factor 10.75
10643IGF-II mRNA-binding protein 3; KH domain containing protein overexpressed incancer0.75
5743Prostaglandin-endoperoxide synthase 2 precursor; prostaglandin G/H synthase andcyclooxygenase0.75
6227Ribosomal protein S21; 40S ribosomal protein S210.74
8943Adaptor-related protein complex 3, delta 1 subunit; adaptin, delta0.73
5515Protein phosphatase 2 (formerly 2A), catalytic subunit, alpha isoform0.72
6223Ribosomal protein S19; 40S ribosomal protein S190.72
5052Peroxiredoxin 1; natural killer-enhancing factor A; proliferation-associatedgene A0.71
2010Emerin0.71
2147Coagulation factor II precursor; prothrombin0.70
1209Cleft lip and palate associated transmembrane protein 10.68
1281Alpha 1 type III collagen; Collagen III, alpha-1 polypeptide; collagen, fetal0.67
27289GTP-binding protein RHO60.67
80781Alpha 1 type XVIII collagen isoform 2 precursor; endostatin0.65
652Bone morphogenetic protein 4 preproprotein; bone morphogenetic protein 2B0.63
856856Suppressor of Choline SynthesisLikely to be involved in regulating INO1expression0.62
859Caveolin 3; M-caveolin; caveolin-30.61
851676Brain Modulosignalin Homolog; Bmh2p [S.cerevisiae]0.60
176702Human Mortality factor-Related Gene related (38.3 kDa) (mrg-1)0.59
35070Cadherin-N CG7100-PH0.58
43510Kayak CG15509-PB0.54
2195FAT gene product0.53
6665SRY (sex determining region Y)-box 15; SRY (sex determining region Y)-box 200.51
1994ELAV-like 1; embryonic lethal, abnormal vision, Drosophila, homolog-like 1; Huantigen R0.50
5274Serine (or cysteine) proteinase inhibitor, clade I (neuroserpin), member 1;protease inhibitor 12 (neuroserpin)0.50
4733Developmentally regulated GTP binding protein 110.50

6997Teratocarcinoma-derived growth factor 10.45
11146FKBP-associated protein isoform FAP68; FK506-binding protein-associated protein;glomulin0.44
1634Decorin isoform b precursor; dermatan sulphate proteoglycans II0.41
694B-cell translocation protein 10.30
10856RuvB-like 2; erythrocyte cytosolic protein, 51-KD; TBP-interacting protein,48-KD; Reptin520.24
3852Keratin 5; Keratin-5; 58 kda cytokeratin; keratin, type II cytoskeletal 5;cytokeratin 50.23
7125Troponin C2, fast0.19
173233UNCoordinated locomotion UNC-59, septin (52.9 kDa) (unc-59)0.12
7092Tolloid-like 10.09
2879Glutathione peroxidase 4; phospholipid hydroperoxidase; sperm nucleusglutathione peroxidase0.03
1655DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 50.00
224Aldehyde dehydrogenase 3A2; aldehyde dehydrogenase 10; fatty aldehydedehydrogenase0.00
928CD9 antigen; motility related protein; leukocyte antigen MIC3−0.01
43383Fork head CG10002-PA [Drosophila melanogaster]−0.01
8861LIM domain binding 1; carboxy terminal LIM domain protein 2; LIM domain-bindingfactor-1−0.03
41062Polychaetoid CG31349-PA−0.05
5931Retinoblastoma binding protein 7−0.07
4637Smooth muscle and non-muscle myosin alkali light chain isoform 1−0.08
7171Tropomyosin 4−0.10
7168Tropomyosin 1 (alpha)−0.30
8324Frizzled 7; frizzled (Drosophila) homolog 7−0.51
4869Nucleophosmin (nucleolar phosphoprotein B23, numatrin); Nucleophosmin 1−0.51
4738Neural precursor cell expressed, developmentally down-regulated 8−0.52
179788Cadherin protein like−0.53
51588Protein inhibitor of activated STAT protein PIASy−0.64
7979Deleted in split-hand/split-foot 1 region−0.83
19317Quaking protein−0.91

DISCUSSION

Sequence divergence and gene expression

We introduced four questions to investigate. The first question was whether the association between gene sequence divergence and hybridization efficiency can be effectively measured and used to generate a correction factor when mRNA levels are directly compared (Figure 2), and the second question was whether there is an association between sequence divergence and differences in gene expression levels for X.laevis and X.tropicalis when hybridization bias is removed (Figure 3). The prediction for the first question is obvious, i.e. that there would be a direct relationship between sequence similarity and hybridization efficiency, and our intent was to quantify sequence divergence and hybridization efficiency between the two Xenopus species. By plotting spot intensity (Figure 2B) and the relative mRNA expression levels for each gene (Figure 2C) as measures of hybridization efficiency versus sequence matches in the 70mer probe, we showed, not surprisingly, that overall, as sequence similarity for the genes from the two Xenopus species increased, so did hybridization efficiency. Moreover, we obtained a hybridization efficiency measure for each Xenopus gene based on its sequence similarity measure. Such information may be useful as a correction factor when directly comparing X.laevis and X.tropicalis mRNA levels or the mRNA levels of any species. For the second question, our hypothesis was that as sequence divergence increased so would the difference in gene expression levels (Figure 3, Table 1). The hypothesis was based on the premise that as sequence divergence increased for a given transcript between two species, the greater the likelihood that the transcript encoded a different protein and that the encoded protein carried out a different function, and therefore, the greater the likelihood the transcript would be expressed at a different time and/or level for the encoded protein to carry out the different function. That is, across tissues and developmental stages, the more similar orthologous genes are expressed in time and space, then the more likely the gene sequence and therefore the encoded protein sequence will be similar. To test that hypothesis, we designed a method to compare interspecies transcriptomes free of hybridization bias using a same species reference RNA. It should be noted that the method in theory can be used to compare any two transcriptomes free of binding bias on any given array of probes, although to increase sensitivity and specificity, the sequence identity of the arrayed probes should be as close as possible to the target sequences. Indeed, we observed that for the transcripts on the whole, as mRNA sequence divergence increased, so did mRNA expression levels, which raises a cause and effect question. In the case where sequence divergence in the mRNA sequence could be a cause for differential gene expression, a gene with a non-lethal, non-synonymous mutation would encode a different protein that would either be better or worse at carrying out its function than would the formerly encoded protein. Because no mutation would have yet occurred in the regulatory regions for the gene in question, the transcript for the new protein would probably be expressed at the same time and level of the previous transcript. A mutation in the regulatory region of the gene would be required to alter transcriptional timing or transcript levels such that the new protein could function better, serving as a positive selective mechanism for the gene. It could also be the case that a change in gene expression could be the selective force for mRNA sequence divergence. A mutation that first occurred in the regulatory region that caused a change in transcript levels and/or timing could be a means for selecting those mutations in the corresponding coding region that encoded a better functioning protein under the new expression conditions. For example, a mutation in the promoter that decreased the transcription rate of the gene may positively select for a mutation in the coding region that increased the activity of the translated protein. When more sequence data for both Xenopus species become available, we would predict that the genes in Xenopus with the greatest sequence divergence in the coding region would as a means for compensation, also have the greatest sequence divergence in the regulatory regions. Furthermore, it will prove interesting to tease out whether certain groups of genes diverged more rapidly than others and whether these genes play roles in developmental programs. Our new method for interspecies transcriptome comparisons may also prove useful in the testing of the neutral theory of evolution (26,27) to determine whether the mRNA sequence divergence between species is primarily stochastic and neutral or the result of natural selection. Our results would support the latter because the mRNA sequence divergence we observed did have an effect on mRNA expression levels, i.e. if the transcript sequence divergence we observed was neutral, little or no change in gene expression would be predicted. An apparent complication that would not be faced in most interspecies comparisons is that X.laevis is an allotetraploid (28), in which ∼50% of the genes in X.laevis are duplicated (29), which arose from the fusion of two nuclei from different species (30). These facts raise the question of (i) which orthologous transcript sequence (which may vary considerably for a given duplicated gene) may have hybridized to a given X.tropicalis probe and (ii) the larger question of whether a given X.laevis transcript sequence may have hybridized specifically to the corresponding orthologous probe. In regard to which X.laevis orthologous transcript may have hybridized to a given X.tropicalis probe, it is impossible with the current microarray probes to determine to what degree dissimilar RNA sequences of a duplicated X.laevis gene pair would have hybridized. The binding of two different labeled targets to a probe would have provided a sum amount of fluorescence signal. In our analysis when applicable (Table 1), we used only the more similar sequence of an X.laevis gene pair because it could not be ascertained how much signal the less similar sequence provided. Thus, for Table 1 the degree of sequence divergence may be underestimated. In regard to whether a given X.laevis sequence hybridized specifically to the corresponding orthologous probe, because hybridization efficiency is a function of sequence similarity, owing to the stringent hybridization conditions employed for these studies and the divergence of the X.laevis sequences, it is probable that only the orthologous transcripts would have hybridized to the corresponding probe. Furthermore, non-specific hybridization would be expected to be less in closely related interspecies comparisons than with an intraspecies comparison, i.e. if somewhat divergent sequences are expected to hybridize less well to the orthologous probes then they would also be expected to bind less well to the non-orthologous probes. These observations may account for our results described earlier in which only 23% of the genes met the signal to noise criterion for analysis of X.laevis versus its reference RNA while 76% of the X.tropicalis genes were analysed versus its reference RNA. Other studies have addressed the problem of sequence mismatches for multi-species comparisons on microarrays by using multi-probe arrays representing the different target species (31), by disregarding all data except those representing identical target/probe sequences (27), or by incorporating a non-specific, general normalization procedure (32). Interspecies hybridization approaches in which mRNA levels were directly compared include canine sequences on human probes (33), bovine sequences on human probes (34), porcine sequences on human probes (35) and non-human primate sequences on human probes (36). In each study, orthologous genes were identified with similar and dissimilar gene expression levels. Our results showed that the orthologous genes between X.laevis and X.tropicalis produced generally similar expression profiles. However, to our knowledge, no other published work has attempted to quantify interspecies comparisons in which the bias owing to differences in hybridization efficiencies was removed and in which differential gene expression levels and transcript sequence divergence were correlated.

Interspecies comparisons

Third, we asked how differential gene expression patterns compared between corresponding tissues and developmental stages of X.laevis and X.tropicalis. The expressed RNA from the tissues and stages were each compared with a same-species reference RNA, a method that allowed the comparison of interspecies expression patterns free of bias owing to sequence differences (Figure 4A). The homologous Xenopus genes that were significantly differentially expressed relative to their reference RNA in at least one of the tested tissues and stages (1681 genes of ±2-fold or greater, P-value ≤ 0.001, FDR ≤ 0.05) were then compared across species. We assumed that the differentially expressed genes were representative of all genes present in all tissues and there was no evidence to suggest otherwise. We suggest that this approach will be useful in comparing gene expression levels of other related species on a given microarray platform. We found that among the differentially expressed genes, gene expression levels were quite similar between the two species, and our conclusion was that the genes between the two species were generally expressed similarly (Figure 4B). However, that conclusion was based on our findings that 23% of the 10 898 total transcripts present in the examined tissues and developmental stages were successfully analysed for X.laevis, yet, 76% of the transcripts were similarly analysed for X.tropicalis. These results suggest that (i) the overall RNA transcript sequence divergence between the Xenopus species may be such that only one-third of the X.laevis transcript sequences relative to X.tropicalis can bind sufficiently to the X.tropicalis platform and/or (ii) the very unlikely prospect that nearly two-thirds of the transcripts expressed in the examined tissues and developmental stages in X.tropicalis are not expressed in X.laevis. Therefore, our conclusion that the two Xenopus species generally have similar global gene expression patterns is somewhat guarded because it is based on approximately one-quarter of the total available Xenopus transcripts.

Global gene expression during X.laevis and X.tropicalis development

The fourth question dealt with how well temporal gene expression changes across the observed embryological stages correlated for the two Xenopus species. We found that gene expression profiles between two given developmental stages were generally similar for each Xenopus species (Figure 5). The results for this part of the study were also free of any bias due to sequence differences in that changes in gene expression over time (egg to stage 10 to stage 40) were determined relative to the like species reference RNA. In this case, all the homologous Xenopus genes that were differentially expressed in any of the three developmental stages relative to the reference RNA were compared with each other (±2-fold or greater, P-value ≤ 0.001, FDR ≤ 0.05). These results led to a total of 547 genes included in the clustering analysis. The data presented in Figure 5B show that X.laevis and X.tropicalis express the majority of genes at the same developmental time. Although it was beyond the scope of this paper to speculate what implications the differences in temporal expression may mean in the two developmental programs, it will undoubtedly bear out that the genes that are not expressed similarly will prove more interesting.
  31 in total

1.  Gene ontology: tool for the unification of biology. The Gene Ontology Consortium.

Authors:  M Ashburner; C A Ball; J A Blake; D Botstein; H Butler; J M Cherry; A P Davis; K Dolinski; S S Dwight; J T Eppig; M A Harris; D P Hill; L Issel-Tarver; A Kasarskis; S Lewis; J C Matese; J E Richardson; M Ringwald; G M Rubin; G Sherlock
Journal:  Nat Genet       Date:  2000-05       Impact factor: 38.330

2.  Microarray-based analysis of early development in Xenopus laevis.

Authors:  C R Altmann; E Bell; A Sczyrba; J Pun; S Bekiranov; T Gaasterland; A H Brivanlou
Journal:  Dev Biol       Date:  2001-08-01       Impact factor: 3.582

3.  Identifying differentially expressed genes using false discovery rate controlling procedures.

Authors:  Anat Reiner; Daniel Yekutieli; Yoav Benjamini
Journal:  Bioinformatics       Date:  2003-02-12       Impact factor: 6.937

4.  Perspective on the Xenopus field.

Authors:  J B Gurdon
Journal:  Dev Dyn       Date:  2002-12       Impact factor: 3.780

Review 5.  Splitting pairs: the diverging fates of duplicated genes.

Authors:  Victoria E Prince; F Bryan Pickett
Journal:  Nat Rev Genet       Date:  2002-11       Impact factor: 53.242

6.  Bayesian infinite mixture model based clustering of gene expression profiles.

Authors:  Mario Medvedovic; Siva Sivaganesan
Journal:  Bioinformatics       Date:  2002-09       Impact factor: 6.937

7.  Gene profiling during neural induction in Xenopus laevis: regulation of BMP signaling by post-transcriptional mechanisms and TAB3, a novel TAK1-binding protein.

Authors:  Ignacio Muñoz-Sanjuán; Esther Bell; Curtis R Altmann; Alin Vonica; Ali H Brivanlou
Journal:  Development       Date:  2002-12       Impact factor: 6.868

Review 8.  Xenopus, the next generation: X. tropicalis genetics and genomics.

Authors:  Nicolas Hirsch; Lyle B Zimmerman; Robert M Grainger
Journal:  Dev Dyn       Date:  2002-12       Impact factor: 3.780

9.  Techniques and probes for the study of Xenopus tropicalis development.

Authors:  Mustafa K Khokha; Christina Chung; Erika L Bustamante; Lisa W K Gaw; Kristin A Trott; Joanna Yeh; Nancy Lim; Jennifer C Y Lin; Nicola Taverner; Enrique Amaya; Nancy Papalopulu; James C Smith; Aaron M Zorn; Richard M Harland; Timothy C Grammer
Journal:  Dev Dyn       Date:  2002-12       Impact factor: 3.780

10.  Sex-dependent gene expression and evolution of the Drosophila transcriptome.

Authors:  José M Ranz; Cristian I Castillo-Davis; Colin D Meiklejohn; Daniel L Hartl
Journal:  Science       Date:  2003-06-13       Impact factor: 47.728

View more
  24 in total

Review 1.  Toward an unbiased evolutionary platform for unraveling Xenopus developmental gene networks.

Authors:  Ronny Beer; Florian Wagner; Vladislav Grishkevich; Leonid Peshkin; Itai Yanai
Journal:  Genesis       Date:  2012-01-05       Impact factor: 2.487

Review 2.  Cross species analysis of microarray expression data.

Authors:  Yong Lu; Peter Huggins; Ziv Bar-Joseph
Journal:  Bioinformatics       Date:  2009-04-08       Impact factor: 6.937

3.  Segmental duplications contribute to gene expression differences between humans and chimpanzees.

Authors:  Ran Blekhman; Alicia Oshlack; Yoav Gilad
Journal:  Genetics       Date:  2009-03-30       Impact factor: 4.562

4.  Evolution of sex-dependent gene expression in three recently diverged species of Drosophila.

Authors:  Zi-Feng Jiang; Carlos A Machado
Journal:  Genetics       Date:  2009-08-31       Impact factor: 4.562

5.  Comparative network analysis reveals that tissue specificity and gene function are important factors influencing the mode of expression evolution in Arabidopsis and rice.

Authors:  Sara Movahedi; Yves Van de Peer; Klaas Vandepoele
Journal:  Plant Physiol       Date:  2011-05-13       Impact factor: 8.340

6.  Anxa4 Genes are Expressed in Distinct Organ Systems in Xenopus laevis and tropicalis But are Functionally Conserved.

Authors:  Karine L Massé; Robert J Collins; Surinder Bhamra; Rachel A Seville; Elizabeth A Jones
Journal:  Organogenesis       Date:  2007-10       Impact factor: 2.500

7.  Genomic profile of matrix and vasculature remodeling in TGF-alpha induced pulmonary fibrosis.

Authors:  William D Hardie; Thomas R Korfhagen; Maureen A Sartor; Adrienne Prestridge; Mario Medvedovic; Timothy D Le Cras; Machiko Ikegami; Scott C Wesselkamper; Cynthia Davidson; Maggie Dietsch; William Nichols; Jeffrey A Whitsett; George D Leikauf
Journal:  Am J Respir Cell Mol Biol       Date:  2007-05-11       Impact factor: 6.914

8.  TBP paralogs accommodate metazoan- and vertebrate-specific developmental gene regulation.

Authors:  Ulrike G Jacobi; Robert C Akkers; Elisabeth S Pierson; Daniel L Weeks; John M Dagle; Gert Jan C Veenstra
Journal:  EMBO J       Date:  2007-08-16       Impact factor: 11.598

Review 9.  Promoter architecture and the evolvability of gene expression.

Authors:  Itay Tirosh; Naama Barkai; Kevin J Verstrepen
Journal:  J Biol       Date:  2009-12-14

10.  Comparative gene expression between two yeast species.

Authors:  Yuanfang Guan; Maitreya J Dunham; Olga G Troyanskaya; Amy A Caudy
Journal:  BMC Genomics       Date:  2013-01-16       Impact factor: 3.969

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.