Literature DB >> 31798006

Species-specific transcriptional profiles of the gut and gut microbiome of Ceratitis quilicii and Ceratitis rosa sensu stricto.

Fathiya M Khamis1, Paul O Mireji2,3, Fidelis L O Ombura4, Anna R Malacrida5, Erick O Awuoche2,6, Martin Rono3, Samira A Mohamed4, Chrysantus M Tanga4, Sunday Ekesi4.   

Abstract

The fruit fly species, Ceratitis rosa sensu stricto and Ceratitis quilicii, are sibling species restricted to the lowland and highland regions, respectively. Until recently, these sibling species were considered as allopatric populations of C. rosa with distinct bionomics. We used deep Next Generation Sequencing (NGS) technology on intact guts of individuals from the two sibling species to compare their transcriptional profiles and simultaneously understand gut microbiome and host molecular processes and identify distinguishing genetic differences between the two species. Since the genomes of both species had not been published previously, the transcriptomes were assembled de novo into transcripts. Microbe-specific transcript orthologs were separated from the assembly by filtering searches of the transcripts against microbe databases using OrthoMCL. We then used differential expression analysis of host-specific transcripts (i.e. those remaining after the microbe-specific transcripts had been removed) and microbe-specific transcripts from the two-sibling species to identify defining species-specific transcripts that were present in only one fruit fly species or the other, but not in both. In C. quilicii females, bacterial transcripts of Pectobacterium spp., Enterobacterium buttiauxella, Enterobacter cloacae and Klebsiella variicola were upregulated compared to the C. rosa s.s. females. Comparison of expression levels of the host transcripts revealed a heavier investment by C. quilicii (compared with C. rosa s.s.) in: immunity; energy production; cell proliferation; insecticide resistance; reproduction and proliferation; and redox reactions that are usually associated with responses to stress and degradation of fruit metabolites.

Entities:  

Mesh:

Year:  2019        PMID: 31798006      PMCID: PMC6892911          DOI: 10.1038/s41598-019-54989-z

Source DB:  PubMed          Journal:  Sci Rep        ISSN: 2045-2322            Impact factor:   4.379


Introduction

The Natal fruit fly, Ceratitis rosa Karsch, is a polyphagous Afro-tropical pest species with a host range of over 100 wild and cultivated plants[1]. In Africa, C. rosa is found in southern and eastern Africa[2,3], including the Indian Ocean islands of Mauritius and La Réunion, where it is an invasive and devastating quarantine pest[4-6]. Ceratitis rosa occurs in R1 and R2 morphotypes[7] which have recently been described as the sibling species C. rosa sensu stricto and Ceratitis quilicii[8]. In Kenya, C. rosa is confined to coastal areas[2,3] while C. quilicii is confined to central highland regions (1,533–1,771 m above sea level)[9]. The spatial geographic segregation of these sibling species appears to be defined by differential thermal developmental requirements of the two species[6,10,11]. Physiologically, the midgut of organisms, including C. rosa sensu lato represents the physiological contact between the organism and its environment; it is possible that microbial (gut microbiome) and host transcript expression profiles of the midgut would be informed by the environments in which these flies exist. Despite our limited knowledge of the gut microbiome of these insects[12], available information suggests a wide range of insect-microbe interactions (from symbionts to facultative bacteria) in nature[13]. While symbionts and their hosts are interdependent, vertically or environmentally-acquired facultative bacteria[14] are not essential for host survival, but can influence host fitness and ecological adaptation to the environment[15]. Amongst fruit flies, the gut microbiome of Ceratitis capitata is largely dominated by free-living Pectobacterium, Enterobacter and Klebsiella species from the Enterobacteriaceae family, that are stable throughout the life cycle of the fly and between geographical regions[16-20]. Unlike vertically-transmitted endosymbionts, these extracellular bacteria are transmitted horizontally[21], and have only been shown recently to have clear impact on the germ line and reproduction via modulation of oogenesis, maternal-to-zygotic-transition in offspring and phenotypic variation in mutants mediated by fly molecular factors[22]. Many chronic infectious agents have subtle deleterious effects on hosts[23]. Also, there is potential for host transcriptional profiles to be influenced by the gut microbiome as well as other environmental factors, which can thus define species-specific adaptation to the local environment. The recent delineation of the C. rosa morphotypes into distinct sibling species necessitates further characterisation of their genotypes to identify inherent molecular differences in transcripts of both the gut microbiome and the host that will allow the two morphologically identical species to be distinguished from each other. In this study, we conducted deep Next Generation Sequencing (NGS) of intact guts from the sibling species, C. rosa s.s and C. quilicii. We isolated host and microbial transcripts from the transcriptome and identified microbe-specific and host-specific transcripts that were differentially expressed in the two species.

Results

De novo assembly and mapping statistics of the transcriptome from C. rosa s.s and C. quilicii transcripts

High quality (83–103 million) reads were obtained with at least 93.46 and 41.49% Q30 and GC contents from C. rosa s.s. and C. quilicii libraries (male and female combined), respectively (Fig. 1). We successfully assembled the reads, de novo, into 36170 transcripts with an overall N100, median and average length of 738, 336 and 552 nucleotides, respectively. Our TransDecoder ORF analysis[24] identified 15463 transcripts as the best candidates (with eclipsed ORFs removed) (Fig. 2). Most of the ORFs were internal Coding Domain Sequences (CDS) without isoforms. Complete CDs constituted about 18% of the transcripts (Fig. 2). Approximately 41.0–71.3% of the reads were successfully mapped to our transcripts and at least 43.6% of our read mappings were unique to specific transcripts (Fig. 3).
Figure 1

Quality matrices of the de novo assembly of RNA-seq reads using the short reads Trinity assembly program[24,39]. GC = Guanine-Cytosine content, AT = Adenine-Thymine content, Q20 = PHRED quality score threshold of 20, Q30 = PHRED quality score threshold of 30.

Figure 2

Nature of open reading frames (ORFs) isolated from the assembly using the TransDecoder program[24].

Figure 3

Mapping statistics of the RNA-seq reads to transcripts.

Quality matrices of the de novo assembly of RNA-seq reads using the short reads Trinity assembly program[24,39]. GC = Guanine-Cytosine content, AT = Adenine-Thymine content, Q20 = PHRED quality score threshold of 20, Q30 = PHRED quality score threshold of 30. Nature of open reading frames (ORFs) isolated from the assembly using the TransDecoder program[24]. Mapping statistics of the RNA-seq reads to transcripts.

OrthoMCL separation of host and microbiome transcripts

When results from the OrthoMCL analysis[25] of the de novo assembled gut transcripts were mapped against the orthoMCL database[26] var 5 it was revealed that most of our gut transcripts had fruit fly orthologs with fewer orthologs from bacteria; most orthologs were complete CDS (Fig. 4). On further analysis, 72.3% of the host transcripts were homologous to C. capitata fruit flies (Table 1) while the bacterial transcripts were homologous to 13 species of bacteria (Table 2). RNA-Seq differential analysis amongst the bacterial transcripts from C. rosa s.s and C. quilicii and from male vs. female flies revealed significant upregulation of transcripts associated with eight species of bacteria in female C. quilicii compared with corresponding C. rosa s.s females (Fig. 5). Induction of bacteria-associated transcripts was similar amongst the remaining comparison categories (P > 0.05).
Figure 4

Proportional representation of taxa that had at least 98% sequence identity and matched specific C. rosa transcripts in the orthoMCL database[26]; and proportional assembly of their de novo ORFs using the short reads Trinity assembly program[24,39].

Table 1

Eukaryotic taxa identified in the orthoMCL database[26], that had orthologs in the C. rosa s.s. and C. quilicii transcripts, and their corresponding homologs in the nr NCBI database.

OrthoMCL Orthologs*Ceratitis rosa s.s. and Ceratitis quilicii transcriptsNCBI BLAST homologs**
Best BLAST HitsBest BLAST Hits (nr database)
TaxonE- valueNLength (aa)SpeciesE- value
Aedes aegypti5E-921161Bactrocera dorsalis8E-112
6E-581103Bactrocera oleae4E-52
1E-105 -1E-712129–183Drosophila melanogaster2E-87 - 5E-130
1E-1471248Homalodisca liturata1E-170
Anopheles gambiae8E-561100Cuerna arida2E-56
1E-641112Monomorium pharaonis3E-77
Apis0E + 001377Ceratitis capitata0
5E-852148Drosophila melanogaster2E-104
6E-571101Operophtera brumata4E-62
4E-821150Trichinella pseudospiralis1E-101
Culex pipiens0E + 003100–493Ceratitis capitata0 - 8E - 60
2E-851141Drosophila affinis5E-98
0E + 001304Drosophila melanogaster0
Drosophila melanogaster1E-1031181Drosophila pseudoobscura3E-128
7E-741128Anopheles gambiae4E-87
2E-661124Drosophila biarmipes5E-59
3E-581109Drosophila elegans6E-69
1E-641117Drosophila erecta7E-71
3E-981155Drosophila ficusphila5E-117
6E-881148Drosophila miranda4E-102
01329Drosophila montana0
3E-591108Drosophila navojoa2E-65
2E-721129Drosophila suzukii5E-88
6E-831144Drosophila virilis1E-97
1E-180 - 2E-582102–305Drosophila busckii0 - 2E-69
1E-86 - 7E-712120–156Drosophila persimilis8E-105 - 2E-80
1E-111-2E-992173–191Drosophila willistoni1E-115- 3E-138
5E-82 - 6E-683106–139Drosophila simulans8E-94 - 2E-79
1E-55 - 1E-643102–113Musca domestica4E-75 - 2E-67
0 - 6E-825116–606Bactrocera oleae0 - 3E-101
0 - 1E-1506101–479Bactrocera dorsalis0 - 1E-168
0 - 1E-597105–586Bactrocera cucurbitae0 - 5E-113
0 - 3E-937161–381Rhagoletis zephyria0 - 3E-121
0 - 1E-1127184Rhagoletis zephyria0 - 1E-124
0 - 1E-5210105–409Drosophila melanogaster0 - 1E-55
0 - 7E-5450100–1046Ceratitis capitata0 - 5E-60
Ixodes scapularis8E-751137Felis catus1E-91
4E-741141Nothobranchius furzeri7E-91
Nematostella vectensis5E-761137Culex quinquefasciatus4E-92
Pediculus humanus3E-611111Ceratitis capitata3E-75
Tetrahymena thermophila3E-561104Acipenser persicus4E-67

* = Orthologs of C. rosa s.s. and C. quilicii transcripts with at least 98% identity with matches in the orthoMCL database.

** = Species in the nr NCBI database with genes orthologous to the C. rosa s.s. and C. quilicii orthologs in the orthoMCL database, and with at least 98% sequence identity and coverage (without gaps) in the NCBI database.

Table 2

Bacterial taxa identified in the orthoMCL database[26], with orthologs in the C. rosa s.s. and C. quilicii transcripts, and their corresponding homologs in the NCBI nr database.

OrthoMCL Orthologs*Ceratitis rosa transcriptsNCBI BLAST homologs**
Best BLAST HitsBest BLAST Hits (nr database)
TaxonE- valueIDLength (aa)SpeciesE- value
Escherichia coli7E-67C.r_14811125Buttiauxella3E-82
1E-90C.r_8435161Pectobacterium5E-97
1E-105C.r_8944187Enterobacteriaceae9E-132
Salmonella enterica9E-58C.r_14824106Salmonella enterica2E-68
Salmonella enterica5E-73C.r_16126131Enterobacter cloacae complex3E-89
1E-56C.r_16175143Klebsiella variicola9E-70
1E-79C.r_16852141Raoultella planticola9E-95
1E-110C.r_17300186Salmonella enterica2E-131
1E-107C.r_4220190Escherichia coli2E-132
7E-76C.r_4222136Enterobacter cloacae6E-90
2E-66C.r_6036124Enterobacterales4E-81
2E-60C.r_6303112Escherichia coli3E-73
Shigella flexneri7E-66C.r_6038124Enterobacterales3E-81
1E-117C.r_7072210Proteobacteria8E-146
4E-71C.r_7640126Enterobacter cloacae3E-85
Yersinia enterocolitica1E-167C.r_8942290Enterobacter hormaechei0

* = Orthologs of C. rosa s.s. and C. quilicii transcripts with at least 98% identity with matches in the orthoMCL database.

** = Species in the nr NCBI database with genes orthologous to the C. rosa s.s. and C. quilicii orthologs in the orthoMCL database, and with at least 98% sequence identity and coverage (without gaps) in the NCBI database.

Figure 5

Volcano plot of RNA-Seq of bacterial transcripts that were differentially expressed in female C. rosa s.s and C. quilicii and used as a proxy measurement of relative abundance of the respective taxa.

Proportional representation of taxa that had at least 98% sequence identity and matched specific C. rosa transcripts in the orthoMCL database[26]; and proportional assembly of their de novo ORFs using the short reads Trinity assembly program[24,39]. Eukaryotic taxa identified in the orthoMCL database[26], that had orthologs in the C. rosa s.s. and C. quilicii transcripts, and their corresponding homologs in the nr NCBI database. * = Orthologs of C. rosa s.s. and C. quilicii transcripts with at least 98% identity with matches in the orthoMCL database. ** = Species in the nr NCBI database with genes orthologous to the C. rosa s.s. and C. quilicii orthologs in the orthoMCL database, and with at least 98% sequence identity and coverage (without gaps) in the NCBI database. Bacterial taxa identified in the orthoMCL database[26], with orthologs in the C. rosa s.s. and C. quilicii transcripts, and their corresponding homologs in the NCBI nr database. * = Orthologs of C. rosa s.s. and C. quilicii transcripts with at least 98% identity with matches in the orthoMCL database. ** = Species in the nr NCBI database with genes orthologous to the C. rosa s.s. and C. quilicii orthologs in the orthoMCL database, and with at least 98% sequence identity and coverage (without gaps) in the NCBI database. Volcano plot of RNA-Seq of bacterial transcripts that were differentially expressed in female C. rosa s.s and C. quilicii and used as a proxy measurement of relative abundance of the respective taxa.

Differential expression of host gut transcripts between male and female C. rosa s.s. and C. quilicii libraries

RNA-Seq differential analysis of host gut-specific transcripts revealed High proportion of significantly induced host transcripts in C. quilicii compared with C. rosa s.s for both genders (Fig. 6). In male C. quilicii (compared with male C. rosa s.s.), GSEA GO term enrichment analysis[27-29] revealed significant induction of putative pathways associated with: innate immune responses against microbes; apoptosis-related cellular stress responses linked to endoplasmic reticulum (ER); negative regulation of immunity; catalytic transfer of hexose sugar across membranes; and intra–organismal carbohydrate catabolism (Table 3). Further to this, STRING database enriched pathway analysis[30] revealed induction of putative pathways associated with: phagocytosis; ER protein processing; glycolysis; folate biosynthesis (crucial for DNA replication and cell division); plant terpenoids; and limonene and pinene degradation (Table 4). Volcano plot analysis of individual gene sequences associated with these observations identified further transcripts that were upregulated in male C. quilicii compared to male C. rosa s.s (Fig. 6) and that the majority of these were primarily associated with: energy production (mitochondrial Acyl-CoA synthetase, glycogen phosphorylase, L-lactate dehydrogenase, glucosidase, NADH dehydrogenase and nuclear valosin); immunity (CD109, protein TsetseEP, proclotting enzyme antigen, tectonin-2, toll, streptogramin A acetyltransferase); cell growth and proliferations (carboxypeptidase D, signal transducer and transcription activator, transposon, selenide, replication polyprotein, BAG domain-containing protein samui, myc, replicase polyprotein, and protein P1); synthesis and transport of protein and other macromolecules (solute carrier organic anion transporter, RNA1, polyprotein, 50 S ribosomal protein, protein translocase, odorant-binding protein, alkaline phosphatase, B-cell receptor-associated protein, nuclear pore complex protein, inorganic phosphate cotransporter, trehalose transporter and protein transport protein); and resistance to insecticides (cytochrome P450). Only transcripts of pathways associated with urine nucleotide biosynthesis (major energy carriers) were more highly expressed in male C. rosa s.s. compared with male C. quilicii (Tables 3 and 4). Volcano plot analysis also revealed that most of the transcripts that were more highly expressed in male C. rosa s.s were of viral (capsid protein alpha, RNA-directed RNA polymerase, threonine-tRNA ligase and microtubule-associated protein) or bacterial (cytidylate kinase) origin. The rest of the transcripts induced in male C. rosa s.s were largely associated with energy metabolism (NADPH, carbamoyl-phosphate synthase, protein melted, fructose-1,6-bisphosphatase).
Figure 6

Volcano plot showing host transcripts that were differentially expressed in the gut tissues of C. rosa s.s and C. quilicii. The transcripts expression level data were supported by at least 100 reads in each category (i.e. in C. rosa s.s or C. quilicii) or five CPM (Counts Per Million) by edge R analysis[44,45]. The red dots indicate points-of-interest, i.e. the best BLAST hits[53] on the manually annotated and reviewed Swiss-Prot database[54], that displayed both large magnitude fold-changes (x axis) and high statistical significance (−log10 of p value, y axis). Points with a fold-change less than 2 (log2 = 1) and/ or a False Detection Rate (FDR) corrected p value of less than 0.05 are shown in black, and indicate transcripts that did not change significantly in expression between the two species. Panel A, B, C = the transcripts that had the most, medium and least difference in their expression between the two species.

Table 3

Canonical gene set enrichment analysis (GSEA) of transcripts that were significantly differentially expressed transcripts in the gut tissues of either male or female C. rosa s.s and C. quilicii. Enrichment profiles established using the WEB-based GEne SeT AnaLysis Toolkit (WebGestalt)[46]. Non-redundant enriched Gene Ontology (GO) categories.

SpeciesCategoryPathway IDDescription of Pathway#Ref#ObservedExpectedRatioP-ValueAdjusted P-Value
Ceratitis quilicii MaleBiological processGO:0019730Antimicrobial humoral response10650.647.810.00044.93E-02
GO:0030968Endoplasmic reticulum unfolded protein response820.0541.410.0014.93E-02
GO:0015149Hexose transmembrane transporter activity1820.1118.10.00531.44E-01
GO:0050777Negative regulation of immune response1120.0730.110.00194.93E-02
GO:0044724Single-organism carbohydrate catabolic process3930.2412.740.00164.93E-02
Molecular functionGO:0042803Protein homodimerization activity7230.446.790.00971.44E-01
Cellular componentGO:0030662Coated vesicle membrane2320.1513.670.00926.58E-02
GO:0009897External side of plasma membrane820.0539.30.00112.69E-02
GO:0005811Lipid particle24971.584.420.00092.69E-02
GO:0045298Tubulin complex1020.0631.440.00172.69E-02
GO:0030018Z disc2120.1314.970.00776.58E-02
Ceratitis rosa MaleBiological processGO:0006164Purine nucleotide biosynthetic process7220.1117.480.00562.37E-01
Molecular functionGO:0000166Nucleotide binding122531.911.570.29735.79E-01
GO:0045735Nutrient reservoir activity520.01256.292.27E-055.00E-04
GO:0016491Oxidoreductase activity63020.982.030.25775.79E-01
GO:0022891Substrate-specific transmembrane transporter activity64921.011.970.2695.79E-01
GO:0008270Zinc ion binding87521.371.460.40235.79E-01
Cellular componentGO:0005616Larval serum protein complex530.01437.291.92E-083.84E-07
GO:0005811Lipid particle24930.348.780.00411.64E-02
GO:0005886Plasma membrane60320.832.420.19834.67E-01
Ceratitis quilicii FemaleBiological processGO:0045454Cell redox homeostasis5020.238.780.02164.67E-01
GO:0008340Determination of adult lifespan12830.585.150.02034.67E-01
GO:0008593Regulation of Notch signalling pathway6220.287.080.03225.51E-01
GO:0001666Response to hypoxia4520.29.760.01774.49E-01
GO:0007296Vitellogenesis820.0454.890.00061.12E-01
Molecular functionGO:0015036Disulfide oxidoreductase activity3020.1314.90.00797.24E-02
GO:0009055Electron carrier activity14530.654.620.02671.47E-01
GO:0051287NAD binding3920.1711.460.0131.02E-01
GO:0050661NADP binding1220.0537.250.00137.15E-02
GO:0016651Oxidoreductase activity, acting on NADH or NADPH5120.238.770.02171.33E-01
GO:0005198Structural molecule activity48772.183.210.00547.24E-02
Cellular componentGO:0005576Extracellular region81473.861.810.08474.49E-01
GO:0015934Large ribosomal subunit10220.484.140.08394.49E-01
GO:0005811Lipid particle24971.185.930.00015.30E-03
GO:0005875Microtubule associated complex36251.722.910.02694.49E-01
GO:0005700Polytene chromosome12130.575.230.01934.49E-01
Ceratitis rosa FemaleBiological processGO:030154Cell differentiation179920.762.620.16643.78E-01
Molecular functionGO:0045735Nutrient reservoir activity520961.11.30E-062.60E-06
Cellular componentGO:0005616Larval serum protein complex530962.041.16E-091.62E-08
GO:0005811Lipid particle24930.1619.320.00032.10E-03
Table 4

Enriched KEGG pathways of transcripts that were significantly differentially expressed transcripts in the gut tissues of either male or female C. rosa s.s. and C. quilicii. Enrichment profiles established using the WEB-based GEne SeT AnaLysis Toolkit (WebGestalt)[46].

SpeciesPathway Name#Ref#ObservedExpectedRatioP-ValueAdjusted P-Value
Ceratitis quilicii MaleMetabolic pathways892124.32.790.0010.0076
Phagosome6420.316.490.03810.0478
Propanoate metabolism2220.1118.870.0050.009
Protein processing in endoplasmic reticulum11940.576.980.00260.0078
Ribosome biogenesis in eukaryotes7820.385.320.05430.0543
Terpenoid backbone biosynthesis1320.0631.930.00170.0076
Folate biosynthesis2120.119.770.00450.009
Glycolysis/Gluconeogenesis4920.248.470.02320.0348
Limonene and pinene degradation6820.336.10.04250.0478
Ceratitis rosa MaleMetabolic pathways89231.152.60.10460.1046
Ceratitis quilicii FemaleLimonene and pinene degradation6820.238.70.02210.0418
Metabolic pathways89263.011.990.07830.0783
Pyrimidine metabolism7720.267.690.02790.0418
Ceratitis rosa Female-

#Ref: the number of reference genes in the category; # Observed: the number of genes in the gene set and in the category; Expected: the expected number in the category; Ratio: ratio of enrichment; P-Value: p value from hypergeometric multiple Test Adjustment test; Adjusted P-value: p value adjusted by the multiple test adjustment.

Volcano plot showing host transcripts that were differentially expressed in the gut tissues of C. rosa s.s and C. quilicii. The transcripts expression level data were supported by at least 100 reads in each category (i.e. in C. rosa s.s or C. quilicii) or five CPM (Counts Per Million) by edge R analysis[44,45]. The red dots indicate points-of-interest, i.e. the best BLAST hits[53] on the manually annotated and reviewed Swiss-Prot database[54], that displayed both large magnitude fold-changes (x axis) and high statistical significance (−log10 of p value, y axis). Points with a fold-change less than 2 (log2 = 1) and/ or a False Detection Rate (FDR) corrected p value of less than 0.05 are shown in black, and indicate transcripts that did not change significantly in expression between the two species. Panel A, B, C = the transcripts that had the most, medium and least difference in their expression between the two species. Canonical gene set enrichment analysis (GSEA) of transcripts that were significantly differentially expressed transcripts in the gut tissues of either male or female C. rosa s.s and C. quilicii. Enrichment profiles established using the WEB-based GEne SeT AnaLysis Toolkit (WebGestalt)[46]. Non-redundant enriched Gene Ontology (GO) categories. Enriched KEGG pathways of transcripts that were significantly differentially expressed transcripts in the gut tissues of either male or female C. rosa s.s. and C. quilicii. Enrichment profiles established using the WEB-based GEne SeT AnaLysis Toolkit (WebGestalt)[46]. #Ref: the number of reference genes in the category; # Observed: the number of genes in the gene set and in the category; Expected: the expected number in the category; Ratio: ratio of enrichment; P-Value: p value from hypergeometric multiple Test Adjustment test; Adjusted P-value: p value adjusted by the multiple test adjustment. When female C. rosa s.s and C. quilicii were compared, the pathways induced in C. quilicii females compared with C. rosa s.s females were associated with: yolk formation; maintenance of the cellular redox environment; control of viability and duration of the adult phase of the life cycle; and regulation of notch signaling pathways that affect cell differentiation and responses to insufficient oxygen (hypoxia) (Fig. 6). In addition to the upregulation of the general metabolic pathways, and limonene and pinene degradation pathways that were observed in male C. quilicii, pyrimidine metabolism pathways were also upregulated in female C. quilicii (Table 4). Analysis of the volcano plot of individual gene expression profiles (Fig. 6) of female C. quilicii (compared with female C. rosa s.s.) revealed upregulation of pathways associated with: some common energy production regulators (glucosidase and NADH dehydrogenase); immunity (CD109, protein TsetseEP, proclotting enzyme antigen and streptogramin A acetyltransferase); cell growth and proliferations (replication polyprotein, myc, replicase polyprotein, and protein P1); synthesis and transport of protein and other macromolecules (solute carrier organic anion transporter, protein translocase and inorganic phosphate cotransporter and trehalose transporter); and resistance to insecticide (cytochrome P450). Transcripts that were more highly expressed in female C. rosa s.s (compared with female C. quilicii) included pathways associated with: energy metabolism (glyceraldehyde-3-phosphate and glucosylceramidase dehydrogenase); immunity (peptidoglycan-recognition protein); transcription (eukaryotic translation initiation factor, RNA polymerase and coatomer); proteolysis (trypsin theta, proteolysis); and egg formation/reproduction (chorion protein, outer membrane protein A, vitellogenin and vigilin). Only transcripts for pathways associated with urine nucleotides biosynthesis (major energy carriers) were more highly expressed in male C. rosa s.s compared with male C. quilicii (Table 4). Volcano plot analysis revealed that most of the transcripts that were more highly expressed in male C. rosa s.s. were of viral (capsid protein alpha, RNA-directed RNA polymerase, threonine-tRNA ligase and microtubule-associated protein) or bacterial (cytidylate kinase) origin. The rest were associated with energy metabolism (NADPH, carbamoyl-phosphate synthase, protein melted, fructose-1,6-bisphosphatase). Up-regulation of bacteria/virus-specific (cytidylate kinase and RNA-directed RNA polymerase) transcripts was also common in female C. rosa s.s. The comparison made in this study of expression profiles of 11 randomly selected DE genes from male and female datasets, revealed a Pearson correlation coefficient of R = 0.8146 and R = 0.9338 (Text S1) for the genes evaluated, which is indicative of a valid transcriptome.

Discussion

The goals of this study were (1) to establish the identity of naturally-occurring gut microbes (gut microbiome) associated with the sibling species, C. rosa s.s. and C. quilicii and (2) to use host transcriptional profiles to identify host species-specific transcripts that define the identity of C. rosa s.s. and C. quilicii sibling species. The C. rosa s.s. (ex Kibarani, Msambweni district) and C. quilicii (ex Kithoka, Imenti North district) colonies used in this study were reared on carrot-based artificial diets with no antibiotics. Oviposition was done on apple mango domes and the insects reared in rooms with similar conditions of their places of origin (28 ± 1 °C, 50 ± 8% RH for C. rosa s.s. and 23 ± 1 °C, 65 ± 5% RH for C. quilicii). This was essential so as to minimize the effects of environmental differences on the colonies. Ideally, sample collections were to be of similar generation age, but we realized that the two colonies had different reproductive rates. The fecundity rate of C. quilicii was way higher than that of C. rosa s.s. hence C. quilicii colony was at the 59th generation while the C. rosa s.s. colony was at the 36th generation. Using OrthoMCL analyses we identified, transcripts of 13 distinct extracellular bacterial species, some of which (Pectobacterium, Enterobacter and Klebsiella species from the family Enterobacteriaceae) have previously been identified as stable free-living bacteria dominating the gut microbiome of C. capitata across its geographical distribution[16-20]. Others like Enterobacterium buttauxella are known to proliferate in a stable way throughout the life cycle of C. capitata[31]. Surprisingly, we also detected phytopathogenic Pectobacterium spp. in C. quilicii, which may have colonized and ultimately evolved alternative associations, perhaps involving insect mediation of plant pathogen dissemination[32]. Another plausible explanation would be it could have originated from the carrot-based diet since the diet was made devoid of antibiotics. Kahala et al.[55] demonstrated that carrots infected with this bacterium appeared mainly in shop samples during autumn, winter, and spring, causing spoilage gradually. Others, such as Enterobacter cloacae and Klebsiella variicola have also been reported colonizing the gut of C. capitata and Zeugodacus cucurbitae in nature[33,34]. The reason for the presence of some bacterial species in C. quilicii, and not in C. rosa s.s, and the reverse scenario for the presence of virus transcripts in C. rosa s.s. and not C. quillicii, is not obvious but may reflect the prevalence of these bacteria/virus species in the original environments of the two species, respectively. It may also relate to the original host plant from which the flies were collected (although they had both been in laboratory culture for > 30 generations), the relative robustness of their immune responses to particular pathogens, or endemic/mutually beneficial host-parasite stability. These hypotheses are possible because they are based on documented evidence for the influence of environmental and host factors on extracellular gut bacteria and a broad range of somatic host functions in D. melanogaster[35,36]. Furthermore, common chronic, noninvasive associations with particular bacterial lineages are often beneficial, or even required, for the development and reproduction of hosts[37]. Beneficial effects include developmental interactions that prime the immune system and improve tolerance to thermal stress; these benefits might account for the differences we have observed in C. rosa s.s and C. quilicii which are geographically and thermally divergent. Our transcriptomic analyses of transcripts revealed a greater investment in physiological processes associated with immunity, energy production, cell proliferation, insecticide resistance and degradation of the fruit metabolites in male C. quilicii compared with male C. rosa s.s. Given that our microbial analysis (above) revealed the presence of various bacteria within the guts of C. quilicii, it is possible that the immune responses observed in C. quilicii might be in response to these bacteria which could, in turn, reflect the relative abundance of these bacteria in the environmental biota, including within host plants. Higher demand for energy and proliferation in C. quilicii than in C. rosa s.s., suggests a higher intrinsic rate of growth and thus fitness. The presence of insecticide resistance genes shows that C. quilicii also has enhanced resistance to insecticides (and probably other xenobiotics) and/or it is under constant selection pressure for insecticide resistance. In contrast, C. rosa s.s harbours viral transcripts that may again reflect the abundance of these viruses in their environmental biota and/or a weaker immunity against viruses. In addition to the differences observed in male flies, comparisons of female C. quilicii and C. rosa s.s also revealed upregulation of pathways and genes essentially associated with reproduction and proliferation, but also with redox which is usually associated with responses to stress. Additionally, the presence of viral transcripts in female C. rosa s.s similar to those in male C. rosa s.s suggests either a potential environmental source for the viruses or their vertical transmission. Despite the discernible differences drawn from this study on the two sibling species in gene expression levels and microbiota of the mid-gut tissues and contents, the study is based on only laboratory colonies. Consequently, colony reared insects can only provide limited, but acceptable insight on phenotype of the flies in their natural habitats that should later be validated downstream. Therefore, the differentially expressed genes can be used as diagnostic markers for the two sibling species. Further studies with wild populations should be undertaken.

Conclusion

In conclusion, this study identified potential natural gut microbes and host transcriptional profiles that could be used to define and differentiate between the sibling species, C. rosa s.s. and C. quilicii. We identified stable, free-living bacterial species (and others) in C. quilicii with potential for stable proliferation throughout the life cycle of the fruit fly. In C. quilicii we also identified greater investment in several physiological processes that distinguished this species from C. rosa s.s. which, in contrast, harboured several viruses that were not present in C. quilicii. Whether the extracellular gut microbiome we identified in C. quilicii are responsible for a potential reduction in reproductive fitness remains to be determined. Recent findings suggest that horizontally transmitted extracellular gut Acetobacter species can influence the D. melanogaster germ line by enhancing oogenesis linked to aldehyde dehydrogenase in the host[22].

Material and Methods

Test insects

The C. rosa s.s. colony used in this study was established from individuals collected from guava fruits from Kibarani, Msambweni district (S 04°19'62.8”; E 039°32'41.1”; 34 m a. s. l), a coastal region of Kenya. The flies were reared in Plexiglass cages (65 cm × 35 cm × 65 cm) and the colony maintained at 28 ± 1 °C, 50 ± 8% RH and photoperiod of L12: D12. The C. quilicii colony was established from individuals collected from mango fruits from Kithoka, a highland region in Imenti North district (N 00°05'58.9”; E 037°40'39.5”; 1,425 m a. s. l) of the central region in Kenya. The C. quilicii was reared in Plexiglass cages and the colony maintained at at 23 ± 1 °C, 65 ± 5% RH and photoperiod of L12: D12. Both species were reared in a sterile environment and on carrot-based artificial diets[38] with no antibiotic, in the icipe animal rearing and quarantine unit, Nairobi, Kenya. At the time of the bioassay, the C. quilicii colony was at the 59th generation and the C. rosa s.s. colony was at the 36th generation. Noteworthy, the establishment of the two colonies was initiated concurrently, however, the C. rosa s. s. colony took way much longer to be established at the icipe laboratories because the species is adapted to lowland temperate conditions. Because of this, the reproduction rates of C. quilicii was way higher and faster facilitating more generations as compared to its counterpart C. rosa s. s. which was slow.

Extraction of C. rosa s.s. and C. quilicii RNA

Experimental insects were collected two days post emergence and maintained exclusively on water. The entire gut and its contents were dissected from individual male and female C. rosa s.s and C. quilicii in 1X phosphate buffered saline (PBS) (pH 8.0) under a Leica (EZ4HD) stereoscope (n = 100 for each sex of each species); the guts for each sex/ species combination were then pooled and stored in liquid nitrogen until required for downstream analyses. Total RNA was extracted using the Isolate II RNA Mini Kit (Bioline, London, UK), following the manufacturer’s instructions, and the resulting RNA immediately stored at −80 °C. RNA yield was determined using a Nanodrop 2000/2000c Spectrophotometer (Thermo Fisher Scientific, Waltham, Massachusetts, USA), and the size and preliminary integrity of the RNA determined by gel electrophoresis on 1.2% non-denaturing agarose followed by ethidium bromide staining. Visualization and documentation of the gel image was done using a KETA GL imaging system (Wealtec Corp, Meadowvale Way Sparks, Nevada, USA). The RNA (50 uL) was subsequently lyophilized and preserved in RNAstable tubes (Biomatrica, Inc, USA) as per the manufacturer’s instructions until the RNA was required for NGS.

RNA sequencing of the C. rosa s.s. and C. quilicii transcriptomes

The integrity of RNA from each sample was assessed individually using a Bioanalyzer (Agilent Technologies, USA), and cDNA was generated using an Illumina TruSeq RNA Sample Preparation Kit (Illumina, Hayward, CA, USA). This technique is based on rRNA depletion and not polyA + selection which means that both prokaryotic and eukaryotic transcripts are retained in the libraries. Forward-Reverse (FR) strand-specific libraries were prepared from each sample and were sequenced by Macrogen Inc. (Seoul, South Korea) using an Illumina HiSeq. 2000 (Illumina, Hayward, CA, USA) platform with two 101 nt paired-end reads. Low quality reads, reads with less than 101 base pairs, and adapter sequences were removed using Illumina build software (Illumina, Hayward, CA, USA) as part of the sequence clean-up. This generated final fastq formatted raw data for each library. Overall, eight fastq files were generated, a pair each for each sibling species and sex (i.e. (C. rosa s.s and C. quilicii, male or female) × 2).

Identification and validation of transcripts that were differentially expressed (DE) in C. rosa s.s. and C. quilicii

The quality of the RNA sequence reads in each file was assessed using FastQC software (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) and the high quality data used to clean reads using fastx quality trimmer (http://hannonlab.cshl.edu/fastx_toolkit/) software. Since both the C. rosa s.s. and C. quilicii reference genomes were not available (unpublished), we could not employ established genome-guided and gene-set based strategies to evaluate the expression profiles of the transcripts. We thus employed a de novo transcript assembly-based strategy to generate the transcripts from the transcriptomes. We then separated them into microbial and host transcripts and then identified which transcripts (microbial or host) were differentially expressed in the different libraries. Finally, we established which metabolic pathways were predominant and functionally associated with the differential transcript expression profiles using geneset enrichment analysis. Briefly, all the transcriptomes were combined, assembled de novo into transcripts and the quality of the assembled transcripts assessed using the short reads Trinity assembly program[24,39]. We isolated the longest transcripts (most representative of the respective genes) with open reading frames that could be translated into peptides that were at least 100 amino acids long using the TransDecoder program[24]. To separate the microbial transcripts from the host transcripts amongst these long transcript sets, we mapped the transcripts using the OrthoMCL[25] ortholog against the orthologs (OrthoMCL) database var 5[26]. All sources of known transcripts from available microbe databases were identified using OrthoMCL. This separated the transcripts into those that represented microbial species and those that were host transcripts based on differences in their codon usage bias. The database contained 810, 686 ortholog groups clustered from 1, 398, 546 proteins obtained from 150 genomes. The genomes consisted of four Amoebozoan, six Firmicutes, nine Euglenozoan, 11 Viridiplantae, 15 Alveolates, 16 Archaea, 19 Proteobacteria, 24 fungi and 29 Metazoan taxonomic groups, in addition to six and 11 other eukaryote and bacteria taxonomic groups respectively[27]. Furthermore, the identification of orthologous groups in prokaryotic genomes has permitted cross-referencing of genes from multiple species, facilitating genome annotation, protein family classification and studies on bacterial evolution[40-43]. The transcripts were then queried against the BLAST non-redundant database which served to; (i) confirm the microbe ID’s and (ii) identified putative ID’s of the novel transcripts that could not be identified in the microbial database. To identify transcriptional expression profiles for both the microbial transcripts and the host transcripts that were differentially expressed between the libraries (sibling species), the cleaned reads from individual transcriptomes in the microbial or host transcript sets were individually mapped using CLC genomic workbench version 9.5 (CLC Bio, Aarhus, Denmark). The resultant mapping read counts for each library on each transcript set were used to determine differential expression of the respective microbial or host (male or female) transcripts using edgeR analysis[44,45] software. To minimize detection of false positives for differential expression, a conservative regime was adopted against both libraries. Within this regime, we considered transcripts to be differentially expressed (DE) between treatments if: 1) they had at least a two-fold difference in expression level; 2) the false detection rate (FDR) was corrected p < 0.05; 3) they were supported by at least 100 read mappings in either categories (i.e. in C. rosa s.s, C. quilicii or microbial transcripts); and 4) they had five CPM (Counts Per Million). Fold changes as a ratio of CPM values between microbial or host transcripts from the C. rosa s.s and C. quilicii libraries were defined. Moreover, the metabolic pathways and networks that underwent differential functional enrichment (expression) between the two hosts (C. rosa s.s vs. C. quilicii) or between genders were identified using secondary canonical gene set enrichment analysis (GSEA) approaches within the WEB-based GEne SeT AnaLysis Toolkit (WebGestalt)[46].

Validation of transcriptome expression profiles using qRT-PCR

Since our data were derived from a single transcriptome for each treatment (i.e. species and gender), we employed a quantitative reverse transcription Polymerase Chain Reaction (qRT-PCR) technique as an independent tool to determine whether our RNA-Seq analysis results could potentially be independently replicated; we compared fold changes in randomly selected DE host transcripts that were achieved using the two approaches. This strategy has been employed successfully for validation of RNA-seq expression levels from single transcriptomes in other studies[47-50]. We did the validation using eight independent biological replicates (n = 8 for each of the different species and gender combinations) obtained from dissected flies generated under similar experimental conditions to those described for the transcriptome samples. Briefly, we prepared cDNAs (from 1 µg Total RNA) from the midgut of each replicate using High Capacity cDNA reverse transcription kit (Applied Biosystems, Carlsbad, CA) according to the manufacturer’s protocol. Quantitative RT-PCR was done on each replicate and specifically on 11 randomly selected DE genes using gene-specific primers (as detailed in Supplementary Information S1). The reaction mix consisted of 3 µg cDNA template that was amplified from each biological replicate; there were three independent technical replicates each with 7.5 µL of Fast SYBR® Green master mix (Applied Biosystems, Carlsbad, CA) and 0.5 picomoles of each of the specific primers for the various genes of interest. Reactions were made in the Strategene MX3005P real time qPCR machine (Agilent Technologies, California, USA). All qRT-PCR results were normalized to two genes that were expressed by Bestkeeper[51] in the two sibling species, which were also quantified from each biological replicate. REST software[52] was used for pairwise gene expression analysis, including an internal multiple-tests correction. Fold changes in transcript expression levels were established by comparing levels of expression of the transcripts in the midguts of C. rosa s.s with those in the midguts of C. quilicii. We conducted Pearson correlation analysis of the fold changes that were obtained from qRT-PCR with those that had been obtained previously from the RNA-seq data to determine the validity of our transcriptomes. Separate validation of the microbial transcriptomes was not performed since they were inherently intertwined with the host transcriptomes and generally had lower expression levels which would be technically difficult to detect using qRT-PCR.

Ethics approval

All insect rearing, handling and experiments were performed using standard operating procedures at the ICIPE Animal Rearing and Quarantine Unit as approved by the National Commission of Science, Technology and Innovations, Kenya. Supplementary Info Table S1 Table S2
  43 in total

1.  Relative expression software tool (REST) for group-wise comparison and statistical analysis of relative expression results in real-time PCR.

Authors:  Michael W Pfaffl; Graham W Horgan; Leo Dempfle
Journal:  Nucleic Acids Res       Date:  2002-05-01       Impact factor: 16.971

2.  Heritable endosymbionts of Drosophila.

Authors:  Mariana Mateos; Sergio J Castrezana; Becky J Nankivell; Anne M Estes; Therese A Markow; Nancy A Moran
Journal:  Genetics       Date:  2006-06-18       Impact factor: 4.562

3.  Molecular characterization of spoilage bacteria as a means to observe the microbiological quality of carrot.

Authors:  Minna Kahala; Lucia Blasco; Vesa Joutsjoki
Journal:  J Food Prot       Date:  2012-03       Impact factor: 2.077

4.  Enterobacteria-mediated nitrogen fixation in natural populations of the fruit fly Ceratitis capitata.

Authors:  A Behar; B Yuval; E Jurkevitch
Journal:  Mol Ecol       Date:  2005-08       Impact factor: 6.185

5.  Bringing back the fruit into fruit fly-bacteria interactions.

Authors:  A Behar; E Jurkevitch; B Yuval
Journal:  Mol Ecol       Date:  2008-03       Impact factor: 6.185

6.  Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research.

Authors:  Ana Conesa; Stefan Götz; Juan Miguel García-Gómez; Javier Terol; Manuel Talón; Montserrat Robles
Journal:  Bioinformatics       Date:  2005-08-04       Impact factor: 6.937

7.  The homeodomain protein ladybird late regulates synthesis of milk proteins during pregnancy in the tsetse fly (Glossina morsitans).

Authors:  Geoffrey M Attardo; Joshua B Benoit; Veronika Michalkova; Kevin R Patrick; Tyler B Krause; Serap Aksoy
Journal:  PLoS Negl Trop Dis       Date:  2014-04-24

8.  A novel highly divergent protein family identified from a viviparous insect by RNA-seq analysis: a potential target for tsetse fly-specific abortifacients.

Authors:  Joshua B Benoit; Geoffrey M Attardo; Veronika Michalkova; Tyler B Krause; Jana Bohova; Qirui Zhang; Aaron A Baumann; Paul O Mireji; Peter Takáč; David L Denlinger; Jose M Ribeiro; Serap Aksoy
Journal:  PLoS Genet       Date:  2014-04-24       Impact factor: 5.917

9.  Are we underestimating the diversity and incidence of insect bacterial symbionts? A case study in ladybird beetles.

Authors:  Lucy A Weinert; Matthew C Tinsley; Matilda Temperley; Francis M Jiggins
Journal:  Biol Lett       Date:  2007-12-22       Impact factor: 3.703

10.  Blast2GO: A comprehensive suite for functional analysis in plant genomics.

Authors:  Ana Conesa; Stefan Götz
Journal:  Int J Plant Genomics       Date:  2008
View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.