Wye-Lup Kong1,2, Ryuji J Machida3. 1. Biodiversity Program, Taiwan International Graduate Program, Academia Sinica and National Taiwan Normal University, Taipei, Taiwan. 2. Department of Life Science, National Taiwan Normal University, Taipei, Taiwan. 3. Biodiversity Research Center, Academia Sinica, Taipei, Taiwan.
Abstract
Growth rate estimation is important to understand the flow of energy and nutrient elements in an ecosystem, but it has remained challenging, especially on microscopic organisms. In this study, we propose four growth rate indices that use mRNA abundance ratios between nuclear and mitochondrial genes: (1) total nuclear and mitochondrial mRNA ratio (Nuc:Mito-TmRNA); (2) nuclear and mitochondrial ribosomal protein mRNA ratio (Nuc:Mito-RPmRNA); (3) gene ontology (GO) terms and total mitochondrial mRNA ratios; and (4) nuclear and mitochondrial specific gene mRNA ratio. We examine these proposed ratios using RNA-Seq datasets of Daphnia magna, and Saccharomyces cerevisiae retrieved from the NCBI Short Read Archive. The results showed that both Nuc:Mito-TmRNA and Nuc:Mito-RPmRNA ratio indices showed significant correlations with the growth rate for both species. A large number of GO terms mRNA ratios showed significant correlations with the growth rate of S. cerevisiae. Lastly, we identified mRNA ratios of several specific nuclear and mitochondrial gene pairs that showed significant correlations. We foresee future implications for the proposed mRNA ratios used in metatranscriptome analyses to estimate the growth rate of communities and species.
Growth rate estimation is important to understand the flow of energy and nutrient elements in an ecosystem, but it has remained challenging, especially on microscopic organisms. In this study, we propose four growth rate indices that use mRNA abundance ratios between nuclear and mitochondrial genes: (1) total nuclear and mitochondrial mRNA ratio (Nuc:Mito-TmRNA); (2) nuclear and mitochondrial ribosomal protein mRNA ratio (Nuc:Mito-RPmRNA); (3) gene ontology (GO) terms and total mitochondrial mRNA ratios; and (4) nuclear and mitochondrial specific gene mRNA ratio. We examine these proposed ratios using RNA-Seq datasets of Daphnia magna, and Saccharomyces cerevisiae retrieved from the NCBI Short Read Archive. The results showed that both Nuc:Mito-TmRNA and Nuc:Mito-RPmRNA ratio indices showed significant correlations with the growth rate for both species. A large number of GO terms mRNA ratios showed significant correlations with the growth rate of S. cerevisiae. Lastly, we identified mRNA ratios of several specific nuclear and mitochondrial gene pairs that showed significant correlations. We foresee future implications for the proposed mRNA ratios used in metatranscriptome analyses to estimate the growth rate of communities and species.
Organisms' growth rates are important traits that can be influenced by environmental conditions of their habitat, such as temperature and availability of nutrients (Logan et al., 1976; Urabe et al., 1997; McCarthy et al., 1998). By estimating the growth rate of the organisms in an ecosystem, we can determine the efficiency of energy and nutrient element flow to higher trophic levels in a food web system (Cloern et al., 1995; Granados et al., 2017). Growth rate varies between species and populations, and changes according to the spatial and temporal variation in environmental conditions. Hence, an effective growth rate estimation method is essential to understand ecosystems.Growth rate estimation remains challenging, especially for microscopic organisms, because individual tracing and direct measurement methods are not feasible. Hence, approaches such as biochemical indices are favoured in growth rate estimation. Over the decades, many nucleic acid‐based indices have been introduced to estimate the growth rate of an animal population, many of which use RNA content as the proxy for growth rate (Yebra et al., 2017). There are three main reasons the nucleic acid indices have gained popularity in growth rate estimation. First, the approach does not require the incubation of animals, where the samples can be collected from the field and the nucleic acid quantified to estimate the growth rate. Second, the method is applicable across different taxonomic groups due to the conservative feature that mRNA is being used as the blueprint of protein production (Crick, 1970). Third, molecular techniques are easy to access because of the availability of commercial kits. These kits can reduce the time required for training and improve the quality of results.Initially, RNA concentration was considered as the proxy for animal growth rate that includes some molluscs and arthropods species (Sutcliffe, 1965). After two decades, Ota and Landry (1984) followed up with animals' conservative feature that uses the RNA:DNA ratio indices with DNA concentration as the denominator. The RNA:DNA ratio gained popularity and was widely incorporated in many studies across different taxonomic groups due to its simple and rapid procedure (Wagner et al., 1998). However, Ota and Landry (1984) also pointed out the limitation of the ratio used in field‐collected samples. One of the reasons might be the size of the genome, which can vary drastically among eukaryote species (Gregory et al., 2007). Another novel nucleic acid index, the nuclear to mitochondrial ribosomal ratio (Nuc:Mito ribosomal ratio), uses quantities of cytosolic and mitochondrial ribosomal RNA (rRNA) as numerator and denominator, respectively, for the ratio (Kong et al., 2019). The significant correlation between the ribosomal ratio and growth rate is due to the allocation of assimilated nutrient elements to the mitochondrial and nuclear ribosome, responsible for energy production and growth‐related genes, respectively.Over the past decades, high‐throughput transcriptomics technologies have gradually been incorporated into ecological studies (Alvarez et al., 2015; Brauer et al., 2017; Caron et al., 2017). Through transcriptomics analyses, researchers can capture the gene expression responses towards environmental changes and compare different locations with various conditions. Furthermore, the genetic responses of multiple individuals can be studied simultaneously with metatranscriptomics analyses. Metatranscriptomics is still in its infancy, and the application is limited to diversity estimation and functional activities assessment (Mukherjee & Reddy, 2020; Vorobev et al., 2020). However, we believe that with the correct approaches and careful calibration, more organismal information can be extracted from metatranscriptomics data, such as growth rate.In this study, we investigated the potential of four mRNA ratio indices: (1) total nuclear and mitochondrial mRNA ratio; (2) nuclear and mitochondrial ribosomal protein mRNA ratio; (3) gene ontology (GO) terms and total mitochondrial mRNA ratio and (4) nuclear and mitochondrial specific gene mRNA ratio. Transcriptome data sets of Daphnia magna and Saccharomyces cerevisiae retrieved from the NCBI (National Centre for Biotechnology Information); sequence read archive (SRA) were used for the study. The results demonstrated that the proposed mRNA ratios are useful growth rate indices which might be used for individuals and communities collected in the field.
MATERIALS AND METHODS
This study included two different transcriptome datasets to evaluate the proposed nuclear and mitochondrial mRNA ratios. (1) The first data set is the Daphnia magna transcriptome, cultured at different temperatures and food concentrations. (2) The second data set is the Saccharomyces cerevisiae transcriptome, cultured with different phosphorus‐level treatments, which was retrieved from the NCBI; SRA (BioProject accession number: PRJNA395936, Gurvich et al., 2017).
Daphnia magna acquisition and preparation
Daphnia magna samples were obtained from the Freshwater Bioresources Center, National Chiayi University, Chiayi, Taiwan. The samples used in this study were cultured in laboratory incubators under 16:8 h light and dark cycle. A single individual D. magna was kept individually in a 50 ml tube filled with dechlorinated and filtered tap water. Samples were fed with laboratory‐grown green alga Chlorella vulgaris as a food source and fed daily by replacing the culturing water with the green alga. Different levels of temperature (high: 25, medium: 20 and low: 15°C) and food concentration (high: 5 × 105, medium: 5 × 104 and low: 5 × 103 cells/ml) were used to manipulate the growth of D. magna. Each treatment group consisted of five replicates, which resulted in a total of 45 individuals. The body length was recorded twice, newly hatched (start of the incubation) and immediately before the development of the embryo (end of the incubation: ovary becomes darker and swollen). The duration of culture is between 4–6 days due to different treatments affect D. magna development time. Dry weights of the individuals were estimated based on a length–weight regression formula (Kawabata & Urabe, 1998; Weight = e3.05 + 2.16ln[Length]). The somatic growth rate of D. magna was calculated based on the formula provided by Lampert and Trubetskova (1996):
where W
t is the weight (μg) of the sample collected for RNA extraction, W
0 is the sample's weight on the first day of culture and t is the time culture duration (day).
Ribonucleic acid extraction
Total RNA was extracted from each individual without pooling using the RNeasy Plus Micro kit (Product ID: 74034; Qiagen) and followed the standard protocol provided by the manufacturer.
High‐throughput transcriptome
Transcriptome library preparation and sequencing of Daphnia magna
mRNA
Twenty ng of the total RNA were used for each library preparation. The mRNA was isolated from the total RNA using the NEBNext Poly(A) mRNA magnetic isolation module (Product ID: E7490; New England Biolabs), following the standard protocol provided by the manufacturer. Transcriptome library preparation procedures were performed following the standard protocol of NEBNext Ultra II RNA Library Prep Kit for Illumina (Product ID: E7770; New England Biolabs) with some modifications. The enrichment step for the library preparation was increased from 20 to 25 cycles to ensure an adequate amount of cDNA for the sequencing. The Illumina MiSeq (Illumina) with 150 cycles single read was used for the sequencing. The NGS Core Laboratory in the Biodiversity Research Center, Academia Sinica, Taiwan conducted all sequencing procedures.
Sequence data analysis
Standard low‐quality read filtering and quality checks were done using the program cutadapt (Martin, 2011) and fastqc (Andrews, 2010), respectively, with a minimum sequence length of 50 bp and the partial sum quality threshold of 28. Filtered high‐quality sequences were mapped to the reference genome using the program Tophat (Trapnell et al., 2009). Reference genome (dmagna‐version 2.4‐20,100,422‐assembly.fna) and general feature format (GFF) file (dmagset7finloc9c.puban.gff) for D. magna were retrieved from wFleaBase (wfleabase.org; Colbourne et al., 2005). A maximum of five base pair mismatches was allowed during the mapping process. HTseq (htseq‐count; Anders et al., 2015) was used to acquire sequence read counts for both nuclear and mitochondrial protein‐coding genes. During the read counting process, the GFF reference file (dmagset7finloc9c.puban.gff) was referred to select reads that mapped to reference sequence of “mRNA” feature type. This ensured that only sequences mapped to protein‐coding genes were included. Ratio indices, correlation coefficient (function: cor.test) and adjusted coefficient of determination (function: lm) were calculated using the R platform (R Core Team, 2020). All the commands used in the process are provided in the supporting information (Appendix S1).
Ratio indices calculations
There are four mRNA ratio indices proposed in this study: (1) Total nuclear and mitochondrial mRNA ratio (Nuc:Mito‐TmRNA). (2) Nuclear and mitochondrial ribosomal protein mRNA ratio (Nuc:Mito‐RPmRNA). (3) GO terms and the total mitochondrial mRNA ratio, and (4) Nuclear and mitochondrial specific gene mRNA ratio.The first ratio, Nuc:Mito‐TmRNA, uses the abundances of mRNA reads mapped to the reference of the nuclear genome as the numerator and the reads mapped to the mitochondrial genome as the denominator. The second ratio, Nuc:Mito‐RPmRNA, uses the number of mRNA reads mapped to nuclear and mitochondrial ribosomal protein‐coding genes. The third ratio is the GO term and mitochondrial mRNA ratio, which uses the reads numbers mapped on the GO terms of both lowest and highest hierarchical terms and mitochondrial mRNAs. GO information from the UniProt database (uniprot.org, 2021) for the D. magna dataset was utilized for the annotation. Manual annotations were performed by pairing the protein name of genes in our dataset with the protein list in databases. Genes were grouped based on the annotated GO terms and each GO read abundance was calculated. In the ratio, total mitochondrial mRNA read abundance was used as the denominator. The last mRNA ratio uses read abundances of a single specific nuclear and a mitochondrial gene read abundance. Every nuclear and mitochondrial gene combination available in the transcriptome dataset was calculated in this study. All the mRNA ratios were temperature corrected using the van't Hoff–Arrhenius relation because most of the biological processes rates are influenced exponentially by temperature (Brown et al., 2004),
where E is the activation energy (0.63 eV), k is Boltzmann's constant (8.62 × 10−5 eV·K−1) and T is the temperature in kelvins. The value of activation energy was derived from the temperature‐correction of metabolic rate (Brown et al., 2004). All the calculations and model descriptions were performed using R (R Core Team, 2020). The R scripts for the calculation procedure are uploaded into the Zenado Repository (https://doi.org/10.5281/zenodo.5506660).
Real‐time quantitative PCR
To validate the nuclear and mitochondrial specific gene mRNA ratio, we performed the absolute quantification approach of real‐time qPCR on the top 10 nuclear and mitochondrial gene pairs that have the highest correlation coefficients with growth rate. A new batch of D. magna was cultured following the same protocol and treatments as mentioned before. The total RNA extraction protocol was also as before. The reason for using a new sample set is due to the insufficient RNA left after the RNA‐Seq library preparations.
RNA extraction and reverse transcription for real‐time quantitative PCR
RNA extraction was performed using PureLink RNA Minikit (Product ID: 12183020; Invitrogen). Reverse transcription was performed using Maxima reverse transcriptase enzyme (Product ID: EP0741; Thermo Fisher Scientific) together with random hexamer primers (Product ID: N8080127; Invitrogen).
Primer design and standard curves preparation
Two types of primers were designed for the real‐time qPCR experiment. The first primers, known as the precursor primers, amplified the longer sequence product of the target genes (ca. 1,000 bp). The purpose of these amplified products is to construct standard curves that can be used in absolute qPCR. The second set of primers is designed based on the precursor primers' product sequences. These primer sets are used in the real‐time qPCR. All the primers used in this study were designed using the program Primer3 (Untergasser et al., 2012) and listed in the Table S1.The standard curves for real‐time qPCR were constructed by amplifying cDNA of spare D. magna samples using precursor primers set. The copy number for the amplified cDNA samples were determined using the formula (Whelan et al., 2003):
where Avogadro's number is 6.022 × 1023 (Johnston, 1939), while the units of cDNA amount in nanogram (ng), and length in base pair (bp). The cDNA samples were diluted to obtain the desired copy numbers for the standard curves.
Real‐time quantitative PCR
Absolute real‐time qPCR was conducted using SYBR‐green PowerUp master mix (Product ID: A25776; Applied Biosystems) following the standard protocol. QuantStudio 3 Real‐Time PCR system (Applied Biosystems) was used in the PCR. All cDNA template concentrations were standardized to 1 ng/μl, and primers' concentrations were diluted to 3 mM. Each qPCR plate consisted of the test cDNA samples, a 5‐point standard curve (1 × 109 to 1 × 105 copy numbers), and a no‐template control for each primer set. Every sample was loaded into three wells as triplicates during the qPCR process. The standard curve of each qPCR plate was used as a quality filter with the criteria of primer efficiency ranging from 110% to 90% and R
2 value higher than 0.98.The absolute quantities of mRNA copy numbers for each gene were calculated using the qPCR standard curve application provided by the QuantStudio system (Thermo Fisher Scientific).
Saccharomyces cerevisiae transcriptome data acquisition and analysis
To test the potential utility of the indices in other species, a total of 32 S. cerevisiae S288C strain transcriptome data sets were retrieved from the NCBI (ncbi.nlm.nih.gov/sra; BioProject accession: PRJNA395936 and the same analyses as for D. magna were performed. Growth rate of S. cerevisiae was measured based on the change in its optical density (OD600) after 30 min of incubation, then normalized by its own cell density to control for density‐dependent effects (Gurvich et al., 2017). Accession number list of transcriptome data downloaded is available in the Table S2. The downstream analysis of this transcriptome dataset followed the same pipeline as the D. magna analysis. The reference genome, gff format file (reference genome version: S288C_reference_genome_R64‐2‐1_20150113), and the GO information terms of S. cerevisiae were retrieved from the Saccharomyces Genome Database (yeastgenome.org; Cherry et al., 2012).
Statistical analysis and data visualization
All the statistical analyses were performed using R (R Core Team, 2020). Pearson's correlation tests were conducted to see the relationship between various mRNA ratios and growth rate using the cor.test function of the R stats package. The analysis of variance (ANOVA) was performed to determine the effect of treatments on somatic growth rate and mRNA ratios of D. magna using the aov function of R stats package. The statistical models were constructed using lm function of R stats package. Figures were plotted using the R package ggplot2 (Wickham, 2011) and the R script for the plots is provided in Zenodo Repository (https://doi.org/10.5281/zenodo.5547206).
RESULTS
Somatic growth rate of Daphnia magna
Both temperature and food concentration treatments had significant effects on the somatic growth rate of D. magna (Temperature: F
2,36 = 88.71, p < .01**; Food concentration: F
2,36 = 154.41, p < .01**; Interaction: F
4,36 = 4.29, p < .01**; Figure 1a).
FIGURE 1
(a) Boxplots show the somatic growth rate (μg/day) of Daphnia magna under different temperature and food concentration treatments. (b) The boxplot shows the total nuclear and mitochondrial mRNA ratios under different treatments. (c)The boxplot shows the nuclear and mitochondrial ribosomal protein mRNA ratios under different treatments. Both ratios were temperature corrected by the exponential (e) of negative activation energy (−E) divided by Boltzmann's constant (k) and temperature in kelvins (T)
(a) Boxplots show the somatic growth rate (μg/day) of Daphnia magna under different temperature and food concentration treatments. (b) The boxplot shows the total nuclear and mitochondrial mRNA ratios under different treatments. (c)The boxplot shows the nuclear and mitochondrial ribosomal protein mRNA ratios under different treatments. Both ratios were temperature corrected by the exponential (e) of negative activation energy (−E) divided by Boltzmann's constant (k) and temperature in kelvins (T)
Daphnia magna transcriptome data output
A total of 45 transcriptome libraries were constructed. Each library yielded about 1,000,000 single‐end mRNA reads (Table S3). Two of the libraries with small reads output, and a high level of nontarget species contamination were removed from the further analysis (a sample treated with high temperature and low food, and another sample treated with medium temperature and medium food concentration). On average, 87% of mRNA reads were assigned to protein‐coding gene regions. The mapping percentages for the libraries were 84%–93% and 2%–16% for the nuclear and the mitochondrial genomes, respectively (Table S3).
Relationship between the growth rate and the Daphnia transcriptome estimated ratios
Total nuclear and mitochondrial mRNA ratio
The boxplot of the temperature‐corrected total nuclear and mitochondrial mRNA ratio (Nuc:Mito‐TmRNA) across different temperature and food concentration treatments (Figure 1b) showed an inconsistent pattern when compared to the boxplot of somatic growth rate (Figure 1a), especially in 25°C temperature treatment. The analysis of variance (ANOVA) showed that temperature had a significant effect on the Nuc:Mito‐TmRNA (F
2,34 = 3.25, p < .01**), but food concentration and the interaction effect between these two treatments do not show any significant effect on the mRNA ratio (food concentration: F
2,34 = 2.29, p = .12; Interaction effect: F
4,34 = 0.77, p = .55; Figure 1b). Using Pearson's correlation test, we found a significant correlation between the somatic growth rate of the samples and temperature‐corrected Nuc:Mito‐TmRNA (r
41 = 0.42, p < .01; Figure 2a).
FIGURE 2
Scatterplots show the relationship between the somatic growth rate (μg/day) of Daphnia magna against (a) the total nuclear and mitochondrial mRNA ratio and (b) the nuclear and mitochondrial ribosomal protein mRNA ratio. The black lines are regression estimated with Pearson's correlation
Scatterplots show the relationship between the somatic growth rate (μg/day) of Daphnia magna against (a) the total nuclear and mitochondrial mRNA ratio and (b) the nuclear and mitochondrial ribosomal protein mRNA ratio. The black lines are regression estimated with Pearson's correlation
Nuclear and mitochondrial ribosomal protein mRNA ratio
Both temperature and food concentration treatments had significant effects on the mRNA ratio (Nuc:Mito‐RPmRNA), (Temperature: F
2 = 37.09, p < .01**; Food concentration: F
2,34 = 6.38, p < .01**; Interaction: F
4,34 = 7.00, p < .01**; Figure 1c). There is a significant correlation between the temperature‐corrected Nuc:Mito‐RPmRNA and somatic growth rate (r
41 = 0.70, p < .01**; Figure 2b).
Gene ontology terms and the total mitochondrial mRNA ratio
A total of 721 GO child terms and 26 highest GO parent terms were used for the ratio estimations for the D. magna RNA‐Seq dataset. However, the correlation coefficient between the ratios and somatic growth rate were relatively low (<0.49) (Table S4).
Nuclear and mitochondrial specific gene mRNA ratio
The mRNA read abundances of a total of 213,240 gene pairs from 26,655 nuclear and eight mitochondrial genes, of which at least one read mapped to the gene reference sequence, were used to estimate the ratio. Raw mapped read abundances were used in the mRNA ratio calculation with no reference sequence length correction. Out of all the nuclear and mitochondrial gene pair combinations, the ratios of 31,171 gene pair combinations had a significant correlation with the growth rate. Overall, the gene pairs with the cytochrome c oxidase subunit I (COX1) have a better correlation with growth rate than other mitochondrial genes (Figure S1). Therefore, the rest of the calculation will be performed using the mitochondrial COX1 as the denominator.Using the mitochondrial COX1 gene read abundance as the denominator, we extracted the 10 nuclear genes with high correlation coefficient for further examination (Table 1, Figure S2a–j).
TABLE 1
The 10 nuclear and mitochondrial gene pairs that show significant correlation with somatic growth rate in the Daphnia magna transcriptome data set. The information about the genes can be retrieved from wfleabase.org (“wfleabase.org/lucegene/search”, find in “daphmag7genexml” library) using the accession number
Nuclear and mitochondrial gene pair
Nuclear genes' accession number
r
p‐value
Aldose reductase and COX1 (AR:COX1)
Dapma7bEVm010418t1
0.719
<.01
Glutamate‐cysteine ligase regulatory subunit and COX1 (GCLM:COX1)
Dapma7bEVm006091t1
0.775
<.01
Ethylmalonic encephalopathy protein 1 and COX1 (ETHE1:COX1)
Dapma7bEVm002027t1
0.684
<.01
Receptor expression‐enhancing protein 5 and COX1 (REEP5:COX1)
Dapma7bEVm006154t1
0.642
<.01
Peptide transporter family 1 and COX1 (PTF1:COX1)
Dapma7bEVm015155t1
0.660
<.01
Saccharopine dehydrogenase and COX1 (SDH:COX1)
Dapma7bEVm004750t1
0.694
<.01
Carbonyl reductase 1 and COX1 (CBR1:COX1)
Dapma7bEVm006355t1
0.589
<.01
Eukaryotic translation initiation factor 4E transporter and COX1 (EIF4ET:COX1)
Dapma7bEVm010030t1
0.633
<.01
Peritrophic matrix protein and COX1 (PMP:COX1)
Dapma7bEVm001252t1
0.598
<.01
α, α‐trehalose‐phosphate synthase A and COX1 (OtsA:COX1)
Dapma7bEVm009716t1
0.701
<.01
The 10 nuclear and mitochondrial gene pairs that show significant correlation with somatic growth rate in the Daphnia magna transcriptome data set. The information about the genes can be retrieved from wfleabase.org (“wfleabase.org/lucegene/search”, find in “daphmag7genexml” library) using the accession number
Statistical models of growth rate prediction using the Daphnia magna specific gene mRNA ratios
Based on the results, we selected three specific gene mRNA ratios as potential growth rate indicators: AR:COX1‐mRNA (aldose reductase), OtsA:COX1‐mRNA (α, α‐trehalose‐phosphate synthase A), and ETHE1:COX1‐mRNA (ethylmalonic encephalopathy protein 1) ratios (Figure 3). We investigate these three mRNA ratios by constructing statistical models that can be used to estimate growth rate. Initially, all the relations between different types of mRNA ratio and somatic growth rate were tested with a simple linear model. However, residual plots of the mRNA ratio against growth rate showed high residuals with low mRNA ratios. This shifted our analytical strategy to use the log‐linear model, which has a better fit at the lower end of the scale (Figure S3). Using the RNA‐Seq result of the D. magna sample, we described log‐linear models for the five mRNA ratios proposed in this study to predict growth rate (Table 2).
FIGURE 3
The three gene‐specific mRNA ratios, which showed high correlation with growth rate in Daphnia magna: AR:COX1 (top), ETHE1:COX1 (middle) and OtsA:COX1 (bottom). The boxplots show the mRNA ratios under different temperature and food concentration treatments. The scatterplot show the relationship between the mRNA ratios and somatic growth rate. The blue lines are regression estimated with Pearson's correlation
TABLE 2
Three models (log‐linear) of Daphnia magna somatic growth rate (SGR) with the mRNA ratios. Adjusted coefficient of determination (adj. R
2) and standard error of estimate (SE) are listed in this table
Statistical model
adj. R2
SE
SGR = 2.58 + 0.07·ln(OtsA:COX1‐mRNA·e–E/kT)
0.593
0.078
SGR = 3.32 + 0.09·ln(AR:COX1‐mRNA·e–E/kT)
0.659
0.071
SGR = 3.54 + 0.11·ln(ETHE1:COX1‐mRNA·e–E/kT)
0.668
0.070
The three gene‐specific mRNA ratios, which showed high correlation with growth rate in Daphnia magna: AR:COX1 (top), ETHE1:COX1 (middle) and OtsA:COX1 (bottom). The boxplots show the mRNA ratios under different temperature and food concentration treatments. The scatterplot show the relationship between the mRNA ratios and somatic growth rate. The blue lines are regression estimated with Pearson's correlationThree models (log‐linear) of Daphnia magna somatic growth rate (SGR) with the mRNA ratios. Adjusted coefficient of determination (adj. R
2) and standard error of estimate (SE) are listed in this table
Validation of nuclear and mitochondrial specific gene mRNA ratios using real‐time qPCR
The specific gene mRNA ratios selected from the transcriptome data were validated using absolute real‐time qPCR. On top of the selected three genes from the statistical models (AR, OtsA and ETHE1), seven more genes with a high correlation coefficient between mRNA ratio and somatic growth rate: glutamate‐cysteine ligase regulatory subunit (GCLM), receptor expression‐enhancing protein 5 (REEP5), peptide transporter family 1 (PTF1), saccharopine dehydrogenase (SDH), carbonyl reductase (CBR1), peritrophic matrix protein (PMP), and eukaryotic translation initial factor 4E transporter (EIF4ET), were also included in the qPCR validation. We designed multiple primers for the 10 nuclear genes. However, only six genes: AR, ETHE1, REEP5, PTF1, CBR1 and OtsA, were successfully amplified due to the strict requirements for the real‐time qPCR experiment (Taylor et al., 2019). The average cDNA copy numbers for each gene that was detected and quantified using the real‐time qPCR are 1,428,089, 17, 2, 6, 79, 13 and 6494 copies for COX1, REEP5, PTF1, CBR1, OtsA, AR and ETHE1, respectively. The ratios with OtsA, AR and ETHE1 showed significant correlation with somatic growth rate (OtsA: n = 27, r = 0.854, p < .01**, Figure 4d; AR: n = 27, r = 0.702, p < .01**, Figure 4e; ETHE1: n = 19, r = 0.761, p < .01**, Figure 4f).
FIGURE 4
Scatterplots of relationship between somatic growth rate (μg/day) against nuclear and mitochondrial specific gene mRNA ratio of Daphnia magna ((a) REEP5, (b) PTF1, (c) CBR1, (d) OtsA, (e) AR and (f) ETHE1). The ratios were estimated using real‐time qPCR. The black lines are estimated with Pearson's correlation
Scatterplots of relationship between somatic growth rate (μg/day) against nuclear and mitochondrial specific gene mRNA ratio of Daphnia magna ((a) REEP5, (b) PTF1, (c) CBR1, (d) OtsA, (e) AR and (f) ETHE1). The ratios were estimated using real‐time qPCR. The black lines are estimated with Pearson's correlation
Messenger RNA ratios using the Saccharomyces cerevisiae transcriptome data set
Pearson's correlation test showed that both Nuc:Mito‐TmRNA and Nuc:Mito‐RPmRNA ratios correlated significantly with growth rate for S. cerevisiae (Nuc:Mito‐TmRNA: r
26 = 0.785, p < .01**, Figure 5a; Nuc:Mito‐RPmRNA: r
26 = 0.823, p < .01**, Figure 5b). The analysis of variance test showed that phosphorus levels have a significant effect on both Nuc:Mito‐TmRNA (F
3,24 = 5.83, p < .01**) and Nuc:Mito‐RPmRNA (F
3,24 = 4.42, p = .01*). The Tukey ad hoc test showed that the mRNA ratios from the no‐phosphorus treatment group have significant differences from other groups. However, the correlation is only significant in the low‐phosphorus‐level groups, namely, S. cerevisiae that are treated with 0 mM and 0.06 mM phosphorus (0 mM: r
5 = 0.947, p < .01**; 0.06 mM: r
5 = 0.846, p = .02*) for Nuc:Mito‐TmRNA. For Nuc:Mito‐RPmRNA, only the 0.06 mM phosphorus‐level treatment group had a significant correlation between the mRNA ratio and growth rate (r
5 = 0.916, p < .01**).
FIGURE 5
The scatterplots show the relationship between the ratios and the growth rate of Saccharomyces cerevisiae. The scatterplots show the relationships of the (a) total and (b) ribosomal protein mRNA ratios. The black lines are regression estimated with Pearson's correlation
The scatterplots show the relationship between the ratios and the growth rate of Saccharomyces cerevisiae. The scatterplots show the relationships of the (a) total and (b) ribosomal protein mRNA ratios. The black lines are regression estimated with Pearson's correlationA large number of GO term ratios (each GO term read number and the total mitochondrial mRNA) had significant correlations with the growth rate. The GO term with the highest correlation coefficient is the cytoplasmic translation (GO: 0002181, r
26 = 0.887, p < .01**). Other GO terms, including organelle assembly (GO: 0070925, r
26 = 0.884, p < .01**), ribosomal small subunit biogenesis (GO: 0042274, r
26 = 0.880, p < .01**), ribosome assembly (GO: 0042255, r
26 = 0.874, p < .01**) and ribosomal large subunit biogenesis (GO: 0042273), also correlated with growth rate.Among the 10 nuclear candidate genes suggested from D. magna data set, five nuclear homologous genes were also found in S. cerevisiae. Out of the five, one gene—saccharopine dehydrogenase (YNR050C: r
26 = 0.45, p = .02)—showed significant correlation with their growth rate. The remaining four genes did not show significant correlations (aldose reductase: YHR104W: r
26 = −0.10, p = .62; peptide transporter family: YPR194C: r
26 = −0.29, p = .14; carbonyl reductase: YMR226C: r
26 = 0.33, p = .08; α, α‐trehalose‐phosphate synthase A: YBR126C: r
26 = −0.07, p = .72; Table 3).
TABLE 3
Comparison between the mRNA ratios of Daphnia magna and Saccharomyces cerevisiae RNA‐Seq data sets. The correlation coefficients (r) and p‐value (p) were obtained using Pearson's correlation test
mRNA ratio
Daphnia magna
Saccharomyces cerevisiae
Nuc:Mito‐TmRNA
r41 = 0.42, p < .01
r26 = 0.79, p < .01
Nuc:Mito‐RPmRNA
r41 = 0.70, p < .01
r26 = 0.82, p < .01
GO:Mito‐mRNA
r41 = 0.49, p < .01
Gama‐tubulin complex localization (GO: 0033566)
r26 = 0.89, p < .01
Cytoplasmic translation (GO: 0002181)
AR:COX1‐mRNA
r41 = 0.74, p < .01
r26 = −0.10, p = .62
OtsA:COX1‐mRNA
r41 = 0.71, p < .01
NA
ETHE1:COX1‐mRNA
r41 = 0.73, p < .01
NA
Comparison between the mRNA ratios of Daphnia magna and Saccharomyces cerevisiae RNA‐Seq data sets. The correlation coefficients (r) and p‐value (p) were obtained using Pearson's correlation testr
41 = 0.49, p < .01Gama‐tubulin complex localization (GO: 0033566)r
26 = 0.89, p < .01Cytoplasmic translation (GO: 0002181)
DISCUSSION
According to endosymbiotic theory, the origin of mitochondria was a free‐roaming nucleus free cell that was engulfed by another cell and gradually evolved into the current state of mitochondria in eukaryotic cells (Margulis & Bermudes, 1985). During the evolutionary process, fragments of mitochondrial genomes migrated into the cell nucleus and integrated into the nuclear genome (Lopez et al., 1994). The migration resulted in the compacting of mitochondrial genome size, and the remaining genes are mainly involved in energy production for cellular processes (Adams & Palmer, 2003). Thus, the rest of the biological processes such as cell growth and division rely on the genes expressed in the nuclear genome. All of the mRNA‐based growth indices proposed in this study use these function differences as the ratios.
Nuc:Mito‐TmRNA and Nuc:Mito‐RPmRNA as the growth rate indices
Both Nuc:Mito‐TmRNA and Nuc:Mito‐RPmRNA showed significant correlations with the growth rate of D. magna and S. cerevisiae. We also found that temperature and food concentration treatments significantly affect Nuc:Mito‐RPmRNA ratios, whereas the Nuc:Mito‐TmRNA ratio is only significantly affected by temperature in D. magna. This might be one of the reasons for the weaker correlation of Nuc:Mito‐TmRNA with somatic growth rate compared with Nuc:Mito‐RPmRNA. The rationales of those two indices are very similar: separating cell growth and division (measured by nuclear‐encoded gene expression, which consists of a portion of gene involved in that cellular biological process) and energy production (measured by the mitochondrial gene expression) components from the total mRNA abundance and using them as the growth indices. The difference between the two ratios is that the Nuc:Mito‐TmRNA includes all characterized protein‐coding gene of nuclear and mitochondrial genomes, whereas the Nuc:Mito‐RPmRNA only considers the projected abundance of ribosome molecules, which approximated by the quantity of ribosomal protein mRNA in this case. The synthesis of ribosomes requires a large amount of environmentally limited nutrients such as nitrogen and phosphorus (Warner, 1999). Hence, the change of Nuc:Mito‐RPmRNA is expected to be slow. In contrast, less energy and nutrients are required for the dynamics of the Nuc:Mito‐TmRNA. Because of these differences, we expect that Nuc:Mito‐TmRNA has higher sensitivity than Nuc:Mito‐RPmRNA.Overall, these two mRNA ratios showed significant correlations with the growth rate, although the correlation coefficients of the relationships was not particularly high (Figure 2). The lack of a stronger correlation could be explained by the fact that protein‐coding genes that are not involved in cell growth and divisions, such as genes that maintain cellular homeostasis are included in the mRNA ratio calculations, highlighting an area for future development.
GO terms and total mitochondrial mRNA ratio
There are many GO terms in the biological process domain that are directly or indirectly involved in growth. For example, the genes that are annotated to cytoplasmic translation (GO: 0002181) are responsible for the protein translation process in the cytoplasm. In the S. cerevisiae data set, the mRNA ratio using this GO term showed the strongest correlation with growth rate. The uniqueness of the GO terms and mitochondrial mRNA ratio is that they can minimize the gene expression noise caused by other genes that do not contribute to the organism's growth, such as secondary metabolites. However, we did not observe similar patterns in the D. magna transcriptome dataset. The results of our D. magna data set showed that only 11.4% (3052 out of 26,655 genes) of the protein‐coding genes were annotated to the GO terms, whereas the estimate for S. cerevisiae was 89.2% (5882 out of 6572 genes). This result indicates that the majority of the genes of D. magna (88.6%) do not have a known function or require thorough functional annotation to assign proper GO terms. This suggests that a thorough transcriptome functional annotation is required for this mRNA ratio index, which indicates that the GO terms and total mitochondrial mRNA ratio might be less effective for species that lack information on the annotated genes' function.
Specific nuclear gene and COX1 mRNA ratio
The correlation coefficients estimated from the mitochondrial COX1 gene were the best among the mitochondrial genes (Figure S1). This result demonstrated that the mitochondrial COX1 was the best gene to be used as the denominator.First, we estimated candidate nuclear genes from the transcriptome data set. Next, we validated the combination with real‐time qPCR. Our results demonstrated that three pairs of nuclear and mitochondrial gene mRNA ratios, namely, OtsA:COX1‐mRNA, AR:COX1‐mRNA and ETHE1:COX1‐mRNA, are good candidates for estimating growth rate due to consistent results in both transcriptome and real‐time qPCR. The statistical models of these three mRNAs also showed relatively high predictability of somatic growth rate. In the present study, we focused only on the three genes. However, it is expected that more genes can be found with more extensive data sets.For real‐time qPCR, a region of different gene sequences were amplified using different primers. In contrast, for RNA‐Seq, the whole sequence of the target gene, without using gene specific primers, was determined. This potentially resulted in the difference in mRNA ratio despite the sample experiencing the same treatments and growth rate. Hence, the mRNA ratios generated by the two approaches are not directly comparable.
Advantages and disadvantages of mRNA ratios as growth rate indices
This is the first study that proposed growth rate indices involving protein‐coding gene mRNA read abundance from transcriptome high‐throughput sequencing data. The four types of proposed nuclear and mitochondrial mRNA ratios each come with their advantages and disadvantages (Table 4). First, Nuc:Mito‐TmRNA can be calculated by aligning transcriptome mRNA reads to a mitochondrial reference, whereas the rest of the reads will be the nuclear mRNA reads. Hence, this method can be utilized by non‐model species due to the availability of mitochondrial reference databases (Machida et al., 2017; Leray et al., 2018; Sato et al., 2018). However, this approach includes all protein‐coding genes, some of which might not contribute to growth. Besides, this approach requires high‐throughput transcriptome sequencing technology.
TABLE 4
List of advantages and disadvantages of each proposed mRNA ratio index
Advantages
Disadvantages
Total nuclear and mitochondrial mRNA ratio (Nuc:Mito‐TmRNA)
Requires only the mitochondrial genome databases to perform the calculation
Potentially applicable across different taxonomic groups
Lacks specificity—all protein‐coding genes are included in the calculation
Requires high‐throughput sequencing technology
Nuclear and mitochondrial ribosomal protein mRNA ratio (Nuc:Mito‐RPmRNA)
Potentially applicable across different taxonomic groups
Requires both cytosolic and mitochondrial ribosomal protein sequence databases
Requires high‐throughput sequencing technology
GO term and total mitochondrial mRNA ratio
Specific on the function of the genes included—potentially less noise
Requires GO annotation.
Requires high‐throughput sequencing technology
Nuclear and mitochondrial specific gene mRNA ratio
Can be conducted through the qPCR platform
Homologue could not be found in some species for certain genes
List of advantages and disadvantages of each proposed mRNA ratio indexRequires only the mitochondrial genome databases to perform the calculationPotentially applicable across different taxonomic groupsLacks specificity—all protein‐coding genes are included in the calculationRequires high‐throughput sequencing technologyPotentially applicable across different taxonomic groupsRequires both cytosolic and mitochondrial ribosomal protein sequence databasesRequires high‐throughput sequencing technologySpecific on the function of the genes included—potentially less noiseRequires GO annotation.Requires high‐throughput sequencing technologyCan be conducted through the qPCR platformHomologue could not be found in some species for certain genesFor the nuclear and mitochondrial ribosomal protein mRNA ratio (Nuc:Mito‐RPmRNA), the main advantage is the high potential of this mRNA ratio in predicting the somatic growth rate due to the high correlation coefficient between these two variables. A reference database of ribosomal protein is also available (Nakao et al., 2004), but the species coverage is not as widely available as the mitochondrial reference genome database. Similar to Nuc:Mito‐TmRNA, the Nuc:Mito‐RPmRNA approach also requires a high‐throughput sequencing platform.Compared with Nuc:Mito‐TmRNA and Nuc:Mito‐RPmRNA, the GO terms and total mitochondrial mRNA ratio provide higher specificity, which minimizes potential noise generated by protein‐coding genes, which is not related to growth. However, the method requires a well‐established GO terms reference. Currently, such information is mainly available only for the major model organisms. A high‐throughput sequencing platform is also needed for this approach.Lastly, the nuclear and mitochondrial specific gene mRNA ratio has a unique advantage because this approach can be performed using the real‐time qPCR platform. The method also has very high sensitivity. However, not all species have the homolog genes of the nuclear gene proposed in this study, such as the homologue of ETHE1, which is yet to be discovered in S. cerevisiae.Currently, D. magna and S. cerevisiae are the only species that we have tested for the utility of the proposed mRNA ratios to estimate growth rate. Validation of the ratios with different taxonomic groups, life histories and traits will be an important next step to achieve the universality of the indices across disparate eukaryotic organisms. Cautions should also be exercised when the ratios are extrapolated to nonmetazoan eukaryote species. Enrichment of mRNA using polyadenylation (Oligo dT purification) is the standard strategy in transcriptome library preparations. However, the polyadenylation of nonmetazoan eukaryote mitochondrial mRNA is known to promote degradation (Chang & Tong, 2012). Therefore, it is likely that the data generated from the Oligo dT‐enriched libraries will not provide good mitochondrial mRNA abundance estimates. Instead, other mRNA enrichment, such as ribosomal RNA depletion, is recommended.
Future perspective of the proposed mRNA indices
Prospect of the proposed mRNA indices with metatranscriptomics
Metatranscriptomics analysis has gained popularity in the past few years. Aside from the microbial community, eukaryotes such as zooplanktons have also pursued the same approach recently (Damon et al., 2012; Lenz et al., 2021; Lopez et al., 2021; Machida et al., 2021). All of the ratios proposed in this study can be estimated from metatranscriptome data sets with the availability of reference databases. For example, once the sequence databases of the genes required for specific genes mRNA ratio, such as ETHE1 and COX1 are available, the sequence reads of these target genes from metatranscriptome dataset can be extracted by matching with the sequence databases. This will allow us to estimate diversity indices, species composition, and growth indices from a single metatranscriptome sequence dataset. However, reference sequences of some of the genes proposed in this study are less available. Therefore, deposition of those functional gene sequences will be required before fully utilizing some of the methods.
Future studies to address unanswered questions
This study showed the potential of several mRNA ratios that might be good growth rate indices using D. magna and S. cerevisiae. However, the universality of the ratios as broader eukaryote growth rate indices was not well addressed in this study. For example, it is still unclear if the ratio is effective in organisms of varying sizes, different life history stages, complexities, and taxonomically distant organisms. Furthermore, the test of nonoptimal environmental conditions, which will influence the expression of the nongrowth‐related genes, is another layer of requirement to derive universal indices. Validating the relationship between mRNA ratios and growth rate in broader ecological, taxonomic and ontogenetic contexts will therefore be the prime focus for future studies.
AUTHOR CONTRIBUTIONS
The experiment was conceived by R.J.M. and W.‐L.K.; Animal culturing and molecular experiments were performed by W.‐L.K. Statistical analyses were performed by W.‐L.K. with guidance from R.J.M.; W.‐L.K. wrote most of the manuscript with help from R.J.M. R.J.M. edited and approved the final version of the manuscript.
CONFLICT OF INTEREST
The authors declare no conflict of interest.
OPEN RESEARCH BADGES
This article has earned an Open Data badge for making publicly available the digitally‐shareable data necessary to reproduce the reported results. The data is available at https://doi.org/10.5281/zenodo.5588252; https://doi.org/10.5281/zenodo.5506660; https://doi.org/10.5281/zenodo.5547206.Figure S1–S3Click here for additional data file.
Authors: J Michael Cherry; Eurie L Hong; Craig Amundsen; Rama Balakrishnan; Gail Binkley; Esther T Chan; Karen R Christie; Maria C Costanzo; Selina S Dwight; Stacia R Engel; Dianna G Fisk; Jodi E Hirschman; Benjamin C Hitz; Kalpana Karra; Cynthia J Krieger; Stuart R Miyasato; Rob S Nash; Julie Park; Marek S Skrzypek; Matt Simison; Shuai Weng; Edith D Wong Journal: Nucleic Acids Res Date: 2011-11-21 Impact factor: 16.971