Fathiya M Khamis1, Paul O Mireji2,3, Fidelis L O Ombura4, Anna R Malacrida5, Erick O Awuoche2,6, Martin Rono3, Samira A Mohamed4, Chrysantus M Tanga4, Sunday Ekesi4.
Abstract
The fruit fly species Ceratitis rosa sensu stricto and Ceratitis quilicii are sibling species restricted to lowland and highland regions, respectively. Until recently, they were considered allopatric populations of C. rosa with distinct bionomics. We used deep next-generation sequencing (NGS) on intact guts of individuals from the two sibling species to compare their transcriptional profiles, to understand the gut microbiome and host molecular processes simultaneously, and to identify genetic differences that distinguish the two species. Because the genomes of both species had not been published previously, the transcriptomes were assembled de novo into transcripts. Microbe-specific transcript orthologs were separated from the assembly by filtering searches of the transcripts against microbe databases using OrthoMCL. We then ran differential expression analysis on host-specific transcripts (i.e. those remaining after the microbe-specific transcripts had been removed) and on microbe-specific transcripts from the two sibling species to identify defining species-specific transcripts, i.e. those present in one fruit fly species but not in the other. In C. quilicii females, bacterial transcripts of Pectobacterium spp., Enterobacterium buttiauxella, Enterobacter cloacae and Klebsiella variicola were upregulated compared with C. rosa s.s. females. Comparison of host transcript expression levels revealed a heavier investment by C. quilicii (compared with C. rosa s.s.) in immunity; energy production; cell proliferation; insecticide resistance; reproduction and proliferation; and redox reactions that are usually associated with responses to stress and degradation of fruit metabolites.
Year: 2019 PMID: 31798006 PMCID: PMC6892911 DOI: 10.1038/s41598-019-54989-z
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Quality matrices of the de novo assembly of RNA-seq reads using the short reads Trinity assembly program[24,39]. GC = Guanine-Cytosine content, AT = Adenine-Thymine content, Q20 = PHRED quality score threshold of 20, Q30 = PHRED quality score threshold of 30.
Figure 2. Nature of open reading frames (ORFs) isolated from the assembly using the TransDecoder program[24].
Figure 3. Mapping statistics of the RNA-seq reads to transcripts.
Figure 4. Proportional representation of taxa that had at least 98% sequence identity and matched specific C. rosa transcripts in the orthoMCL database[26]; and proportional assembly of their de novo ORFs using the short reads Trinity assembly program[24,39].
Eukaryotic taxa identified in the orthoMCL database[26] that had orthologs in the C. rosa s.s. and C. quilicii transcripts, and their corresponding homologs in the NCBI nr database.
| OrthoMCL Orthologs* | NCBI BLAST homologs** | ||||
|---|---|---|---|---|---|
| Best BLAST Hits | Best BLAST Hits (nr database) | ||||
| 5E-92 | 1 | 161 | 8E-112 | ||
| 6E-58 | 1 | 103 | 4E-52 | ||
| 1E-105 - 1E-71 | 2 | 129–183 | 2E-87 - 5E-130 | ||
| 1E-147 | 1 | 248 | 1E-170 | ||
| 8E-56 | 1 | 100 | 2E-56 | ||
| 1E-64 | 1 | 112 | 3E-77 | ||
| 0E+00 | 1 | 377 | 0 | ||
| 5E-85 | 2 | 148 | 2E-104 | ||
| 6E-57 | 1 | 101 | 4E-62 | ||
| 4E-82 | 1 | 150 | 1E-101 | ||
| 0E+00 | 3 | 100–493 | 0 - 8E-60 | ||
| 2E-85 | 1 | 141 | 5E-98 | ||
| 0E+00 | 1 | 304 | 0 | ||
| 1E-103 | 1 | 181 | 3E-128 | ||
| 7E-74 | 1 | 128 | 4E-87 | ||
| 2E-66 | 1 | 124 | 5E-59 | ||
| 3E-58 | 1 | 109 | 6E-69 | ||
| 1E-64 | 1 | 117 | 7E-71 | ||
| 3E-98 | 1 | 155 | 5E-117 | ||
| 6E-88 | 1 | 148 | 4E-102 | ||
| 0 | 1 | 329 | 0 | ||
| 3E-59 | 1 | 108 | 2E-65 | ||
| 2E-72 | 1 | 129 | 5E-88 | ||
| 6E-83 | 1 | 144 | 1E-97 | ||
| 1E-180 - 2E-58 | 2 | 102–305 | 0 - 2E-69 | ||
| 1E-86 - 7E-71 | 2 | 120–156 | 8E-105 - 2E-80 | ||
| 1E-111 - 2E-99 | 2 | 173–191 | 1E-115 - 3E-138 | ||
| 5E-82 - 6E-68 | 3 | 106–139 | 8E-94 - 2E-79 | ||
| 1E-55 - 1E-64 | 3 | 102–113 | 4E-75 - 2E-67 | ||
| 0 - 6E-82 | 5 | 116–606 | 0 - 3E-101 | ||
| 0 - 1E-150 | 6 | 101–479 | 0 - 1E-168 | ||
| 0 - 1E-59 | 7 | 105–586 | 0 - 5E-113 | ||
| 0 - 3E-93 | 7 | 161–381 | 0 - 3E-121 | ||
| 0 - 1E-112 | 7 | 184 | 0 - 1E-124 | ||
| 0 - 1E-52 | 10 | 105–409 | 0 - 1E-55 | ||
| 0 - 7E-54 | 50 | 100–1046 | 0 - 5E-60 | ||
| 8E-75 | 1 | 137 | 1E-91 | ||
| 4E-74 | 1 | 141 | 7E-91 | ||
| 5E-76 | 1 | 137 | 4E-92 | ||
| 3E-61 | 1 | 111 | 3E-75 | ||
| 3E-56 | 1 | 104 | 4E-67 | ||
* = Orthologs of C. rosa s.s. and C. quilicii transcripts with at least 98% identity with matches in the orthoMCL database.
** = Species in the nr NCBI database with genes orthologous to the C. rosa s.s. and C. quilicii orthologs in the orthoMCL database, and with at least 98% sequence identity and coverage (without gaps) in the NCBI database.
Bacterial taxa identified in the orthoMCL database[26], with orthologs in the C. rosa s.s. and C. quilicii transcripts, and their corresponding homologs in the NCBI nr database.
| OrthoMCL Orthologs* | NCBI BLAST homologs** | ||||
|---|---|---|---|---|---|
| Best BLAST Hits | Best BLAST Hits (nr database) | ||||
| Taxon | E-value | ID | Length (aa) | Species | E-value |
| | 7E-67 | C.r_14811 | 125 | | 3E-82 |
| | 1E-90 | C.r_8435 | 161 | | 5E-97 |
| | 1E-105 | C.r_8944 | 187 | | 9E-132 |
| | 9E-58 | C.r_14824 | 106 | | 2E-68 |
| | 5E-73 | C.r_16126 | 131 | | 3E-89 |
| | 1E-56 | C.r_16175 | 143 | | 9E-70 |
| | 1E-79 | C.r_16852 | 141 | | 9E-95 |
| | 1E-110 | C.r_17300 | 186 | | 2E-131 |
| | 1E-107 | C.r_4220 | 190 | | 2E-132 |
| | 7E-76 | C.r_4222 | 136 | | 6E-90 |
| | 2E-66 | C.r_6036 | 124 | | 4E-81 |
| | 2E-60 | C.r_6303 | 112 | | 3E-73 |
| | 7E-66 | C.r_6038 | 124 | | 3E-81 |
| | 1E-117 | C.r_7072 | 210 | | 8E-146 |
| | 4E-71 | C.r_7640 | 126 | | 3E-85 |
| | 1E-167 | C.r_8942 | 290 | | 0 |
* = Orthologs of C. rosa s.s. and C. quilicii transcripts with at least 98% identity with matches in the orthoMCL database.
** = Species in the nr NCBI database with genes orthologous to the C. rosa s.s. and C. quilicii orthologs in the orthoMCL database, and with at least 98% sequence identity and coverage (without gaps) in the NCBI database.
Figure 5. Volcano plot of RNA-Seq of bacterial transcripts that were differentially expressed in female C. rosa s.s. and C. quilicii and used as a proxy measurement of relative abundance of the respective taxa.
Figure 6. Volcano plot showing host transcripts that were differentially expressed in the gut tissues of C. rosa s.s. and C. quilicii. The transcript expression level data were supported by at least 100 reads in each category (i.e. in C. rosa s.s. or C. quilicii) or five CPM (Counts Per Million) by edgeR analysis[44,45]. The red dots indicate points of interest, i.e. the best BLAST hits[53] on the manually annotated and reviewed Swiss-Prot database[54] that displayed both large-magnitude fold-changes (x axis) and high statistical significance (−log10 of p value, y axis). Points with a fold-change less than 2 (log2 fold-change = 1) and/or a False Discovery Rate (FDR) corrected p value greater than 0.05 are shown in black and indicate transcripts that did not change significantly in expression between the two species. Panels A, B and C show the transcripts that had the most, medium and least difference in expression between the two species, respectively.
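The cut-offs described for this volcano plot can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' edgeR workflow. It assumes the conventional reading in which a transcript is highlighted only when its fold-change is at least 2 (absolute log2 fold-change of at least 1) and its FDR-corrected p value is below 0.05. The function names and default parameters are hypothetical.

```python
import math


def counts_per_million(count, library_size):
    """Scale a raw read count to counts per million (CPM) mapped reads."""
    return count * 1_000_000 / library_size


def volcano_y(p_value):
    """y-axis value of a volcano plot: -log10 of the p value."""
    return -math.log10(p_value)


def classify_transcript(log2_fc, fdr, min_abs_log2_fc=1.0, max_fdr=0.05):
    """Label a transcript using the two volcano-plot cut-offs.

    Assumption: 'significant' requires BOTH a fold-change of at least 2
    (|log2 FC| >= 1) and an FDR-corrected p value below 0.05.
    """
    if abs(log2_fc) >= min_abs_log2_fc and fdr < max_fdr:
        return "significant"
    return "not significant"


if __name__ == "__main__":
    # Hypothetical values, not taken from the study.
    print(classify_transcript(log2_fc=1.5, fdr=0.01))
    print(classify_transcript(log2_fc=0.5, fdr=0.01))
    print(counts_per_million(50, 10_000_000))
```

Under these assumptions, a transcript with a large fold-change but an FDR above the cut-off is still drawn in black, because both conditions must hold.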
Canonical gene set enrichment analysis (GSEA) of transcripts that were significantly differentially expressed in the gut tissues of either male or female C. rosa s.s. and C. quilicii. Enrichment profiles were established using the WEB-based GEne SeT AnaLysis Toolkit (WebGestalt)[46]. Only non-redundant enriched Gene Ontology (GO) categories are shown.
| Species | Category | Pathway ID | Description of Pathway | #Ref | #Observed | Expected | Ratio | P-Value | Adjusted P-Value |
|---|---|---|---|---|---|---|---|---|---|
| | Biological process | GO:0019730 | Antimicrobial humoral response | 106 | 5 | 0.64 | 7.81 | 0.0004 | 4.93E-02 |
| | | GO:0030968 | Endoplasmic reticulum unfolded protein response | 8 | 2 | 0.05 | 41.41 | 0.001 | 4.93E-02 |
| | | GO:0015149 | Hexose transmembrane transporter activity | 18 | 2 | 0.11 | 18.1 | 0.0053 | 1.44E-01 |
| | | GO:0050777 | Negative regulation of immune response | 11 | 2 | 0.07 | 30.11 | 0.0019 | 4.93E-02 |
| | | GO:0044724 | Single-organism carbohydrate catabolic process | 39 | 3 | 0.24 | 12.74 | 0.0016 | 4.93E-02 |
| | Molecular function | GO:0042803 | Protein homodimerization activity | 72 | 3 | 0.44 | 6.79 | 0.0097 | 1.44E-01 |
| | Cellular component | GO:0030662 | Coated vesicle membrane | 23 | 2 | 0.15 | 13.67 | 0.0092 | 6.58E-02 |
| | | GO:0009897 | External side of plasma membrane | 8 | 2 | 0.05 | 39.3 | 0.0011 | 2.69E-02 |
| | | GO:0005811 | Lipid particle | 249 | 7 | 1.58 | 4.42 | 0.0009 | 2.69E-02 |
| | | GO:0045298 | Tubulin complex | 10 | 2 | 0.06 | 31.44 | 0.0017 | 2.69E-02 |
| | | GO:0030018 | Z disc | 21 | 2 | 0.13 | 14.97 | 0.0077 | 6.58E-02 |
| | Biological process | GO:0006164 | Purine nucleotide biosynthetic process | 72 | 2 | 0.11 | 17.48 | 0.0056 | 2.37E-01 |
| | Molecular function | GO:0000166 | Nucleotide binding | 1225 | 3 | 1.91 | 1.57 | 0.2973 | 5.79E-01 |
| | | GO:0045735 | Nutrient reservoir activity | 5 | 2 | 0.01 | 256.29 | 2.27E-05 | 5.00E-04 |
| | | GO:0016491 | Oxidoreductase activity | 630 | 2 | 0.98 | 2.03 | 0.2577 | 5.79E-01 |
| | | GO:0022891 | Substrate-specific transmembrane transporter activity | 649 | 2 | 1.01 | 1.97 | 0.269 | 5.79E-01 |
| | | GO:0008270 | Zinc ion binding | 875 | 2 | 1.37 | 1.46 | 0.4023 | 5.79E-01 |
| | Cellular component | GO:0005616 | Larval serum protein complex | 5 | 3 | 0.01 | 437.29 | 1.92E-08 | 3.84E-07 |
| | | GO:0005811 | Lipid particle | 249 | 3 | 0.34 | 8.78 | 0.0041 | 1.64E-02 |
| | | GO:0005886 | Plasma membrane | 603 | 2 | 0.83 | 2.42 | 0.1983 | 4.67E-01 |
| | Biological process | GO:0045454 | Cell redox homeostasis | 50 | 2 | 0.23 | 8.78 | 0.0216 | 4.67E-01 |
| | | GO:0008340 | Determination of adult lifespan | 128 | 3 | 0.58 | 5.15 | 0.0203 | 4.67E-01 |
| | | GO:0008593 | Regulation of Notch signalling pathway | 62 | 2 | 0.28 | 7.08 | 0.0322 | 5.51E-01 |
| | | GO:0001666 | Response to hypoxia | 45 | 2 | 0.2 | 9.76 | 0.0177 | 4.49E-01 |
| | | GO:0007296 | Vitellogenesis | 8 | 2 | 0.04 | 54.89 | 0.0006 | 1.12E-01 |
| | Molecular function | GO:0015036 | Disulfide oxidoreductase activity | 30 | 2 | 0.13 | 14.9 | 0.0079 | 7.24E-02 |
| | | GO:0009055 | Electron carrier activity | 145 | 3 | 0.65 | 4.62 | 0.0267 | 1.47E-01 |
| | | GO:0051287 | NAD binding | 39 | 2 | 0.17 | 11.46 | 0.013 | 1.02E-01 |
| | | GO:0050661 | NADP binding | 12 | 2 | 0.05 | 37.25 | 0.0013 | 7.15E-02 |
| | | GO:0016651 | Oxidoreductase activity, acting on NADH or NADPH | 51 | 2 | 0.23 | 8.77 | 0.0217 | 1.33E-01 |
| | | GO:0005198 | Structural molecule activity | 487 | 7 | 2.18 | 3.21 | 0.0054 | 7.24E-02 |
| | Cellular component | GO:0005576 | Extracellular region | 814 | 7 | 3.86 | 1.81 | 0.0847 | 4.49E-01 |
| | | GO:0015934 | Large ribosomal subunit | 102 | 2 | 0.48 | 4.14 | 0.0839 | 4.49E-01 |
| | | GO:0005811 | Lipid particle | 249 | 7 | 1.18 | 5.93 | 0.0001 | 5.30E-03 |
| | | GO:0005875 | Microtubule associated complex | 362 | 5 | 1.72 | 2.91 | 0.0269 | 4.49E-01 |
| | | GO:0005700 | Polytene chromosome | 121 | 3 | 0.57 | 5.23 | 0.0193 | 4.49E-01 |
| | Biological process | GO:0030154 | Cell differentiation | 1799 | 2 | 0.76 | 2.62 | 0.1664 | 3.78E-01 |
| | Molecular function | GO:0045735 | Nutrient reservoir activity | 5 | 2 | 0 | 961.1 | 1.30E-06 | 2.60E-06 |
| | Cellular component | GO:0005616 | Larval serum protein complex | 5 | 3 | 0 | 962.04 | 1.16E-09 | 1.62E-08 |
| | | GO:0005811 | Lipid particle | 249 | 3 | 0.16 | 19.32 | 0.0003 | 2.10E-03 |
Enriched KEGG pathways of transcripts that were significantly differentially expressed in the gut tissues of either male or female C. rosa s.s. and C. quilicii. Enrichment profiles were established using the WEB-based GEne SeT AnaLysis Toolkit (WebGestalt)[46].
| Species | Pathway Name | #Ref | #Observed | Expected | Ratio | P-Value | Adjusted P-Value |
|---|---|---|---|---|---|---|---|
| | Metabolic pathways | 892 | 12 | 4.3 | 2.79 | 0.001 | 0.0076 |
| | Phagosome | 64 | 2 | 0.31 | 6.49 | 0.0381 | 0.0478 |
| | Propanoate metabolism | 22 | 2 | 0.11 | 18.87 | 0.005 | 0.009 |
| | Protein processing in endoplasmic reticulum | 119 | 4 | 0.57 | 6.98 | 0.0026 | 0.0078 |
| | Ribosome biogenesis in eukaryotes | 78 | 2 | 0.38 | 5.32 | 0.0543 | 0.0543 |
| | Terpenoid backbone biosynthesis | 13 | 2 | 0.06 | 31.93 | 0.0017 | 0.0076 |
| | Folate biosynthesis | 21 | 2 | 0.1 | 19.77 | 0.0045 | 0.009 |
| | Glycolysis/Gluconeogenesis | 49 | 2 | 0.24 | 8.47 | 0.0232 | 0.0348 |
| | Limonene and pinene degradation | 68 | 2 | 0.33 | 6.1 | 0.0425 | 0.0478 |
| | Metabolic pathways | 892 | 3 | 1.15 | 2.6 | 0.1046 | 0.1046 |
| | Limonene and pinene degradation | 68 | 2 | 0.23 | 8.7 | 0.0221 | 0.0418 |
| | Metabolic pathways | 892 | 6 | 3.01 | 1.99 | 0.0783 | 0.0783 |
| | Pyrimidine metabolism | 77 | 2 | 0.26 | 7.69 | 0.0279 | 0.0418 |
| | - | — | — | — | — | — | — |
#Ref: number of reference genes in the category; #Observed: number of genes in the gene set that are also in the category; Expected: number of genes expected in the category; Ratio: enrichment ratio (#Observed divided by Expected); P-Value: p value from the hypergeometric test; Adjusted P-Value: p value adjusted for multiple testing.
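The columns defined above can be reproduced with a short calculation. The sketch below is a minimal illustration, not WebGestalt's own code, and the function names are hypothetical. It computes the enrichment ratio, an upper-tail hypergeometric p value, and a Benjamini-Hochberg adjustment. The source does not name the adjustment method, so Benjamini-Hochberg is an assumption here, although the adjusted values in the first nine KEGG rows agree with it to the precision shown. As a check on the ratio, the first GO row gives 5 / 0.64, which is about 7.8.

```python
from math import comb


def enrichment_ratio(observed, expected):
    """Ratio column: observed gene count divided by expected count."""
    return observed / expected


def expected_count(n_ref_in_category, n_gene_set, n_reference):
    """Genes expected in the category if the gene set were drawn at random."""
    return n_ref_in_category * n_gene_set / n_reference


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn from N, of which K are in the category."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return hits / total


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p values (assumed adjustment method)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # P-Value column of the first nine KEGG rows, in table order.
    kegg_p = [0.001, 0.0381, 0.005, 0.0026, 0.0543, 0.0017, 0.0045, 0.0232, 0.0425]
    for p, adj in zip(kegg_p, benjamini_hochberg(kegg_p)):
        print(p, adj)
    print(enrichment_ratio(5, 0.64))
```

The size of the reference gene set is not given in the tables, so `expected_count` and `hypergeom_upper_tail` cannot be checked against the published rows without that number.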