Julia Canitz, Frank Kirschbaum, Ralph Tiedemann.
Abstract
African weakly electric fish of the mormyrid genus Campylomormyrus generate pulse-type electric organ discharges (EODs) for orientation and communication. Their pulse durations are species-specific, and elongated EODs are a derived trait. So far, differential gene expression among tissue-specific transcriptomes across species with different pulses and point mutations in single ion channel genes indicate a relation between pulse duration and electrocyte geometry/excitability. However, a comprehensive assessment of expressed single nucleotide polymorphisms (SNPs) throughout the entire transcriptome of African weakly electric fish, with the potential to identify further genes influencing EOD duration, is still lacking. Such an assessment is of particular value, as discharge duration is likely based on multiple cellular mechanisms and various genes. Here we provide the first transcriptome-wide SNP analysis of African weakly electric fish species (genus Campylomormyrus) differing in EOD duration to identify candidate genes and cellular mechanisms potentially involved in determining the elongated discharge of C. tshokwe. Non-synonymous substitutions specific to C. tshokwe were found in 27 candidate genes with inferred positive selection among Campylomormyrus species. These candidate genes mainly had functions linked to transcriptional regulation, cell proliferation and cell differentiation. Further, by comparing gene annotations between C. compressirostris (ancestral short EOD) and C. tshokwe (derived elongated EOD), we identified 27 GO terms and 2 KEGG pathway categories for which C. tshokwe exhibited a species-specific expressed substitution significantly more frequently than C. compressirostris. The results indicate that transcriptional regulation as well as cell proliferation and differentiation take part in determining elongated pulse durations in C. tshokwe.
Those cellular processes are pivotal for tissue morphogenesis and might determine the shape of electric organs, supporting the observed correlation between electrocyte geometry/tissue structure and discharge duration. The inferred expressed SNPs and their functional implications are a valuable resource for future investigations on EOD durations.
Year: 2020 PMID: 33108393 PMCID: PMC7591079 DOI: 10.1371/journal.pone.0240812
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Appearance and EOD waveforms of four mormyrid species used for this study.
The morphological shape of each target species is shown, while body sizes may vary. Species-specific EOD waveforms are represented in relation to a 1 ms time scale.
Assembly statistics of the four transcriptomes.
| Statistic | | | | |
|---|---|---|---|---|
| Number of processed reads | 66986804 | 117994270 | 41018678 | 99097520 |
| Number of contigs | 160665 | 218372 | 141384 | 176155 |
| N50 | 1393 | 1873 | 1881 | 1643 |
| Number of transcripts with LORF | 50241 | 65330 | 48929 | 52226 |
| BUSCO completeness (Actinopterygii core gene set) | 62.5% | 76.9% | 64.4% | 68.0% |
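
The N50 row above summarizes the contig length distribution of each assembly. As a minimal sketch of how such a value is commonly derived (the function name `n50` and the toy contig lengths are illustrative assumptions, not the authors' pipeline or data), N50 is the length of the shortest contig among the longest contigs that together cover at least half of the total assembly length:

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length L such that contigs of length >= L together
    account for at least half of the summed contig length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contigs given")


# Hypothetical toy assembly, not data from the study.
toy_contigs = [100, 200, 300, 400, 500]
print(n50(toy_contigs))  # 400: the contigs of 500 and 400 cover 900 of 1500 bases
```

A higher N50 indicates that more of the assembly sits in long contigs, which is why it is reported alongside contig counts and BUSCO completeness.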
List of 27 candidate genes potentially related to the EOD elongation of C. tshokwe.
| Gene name | SNP position | Amino acid change | PAML/codeml ω value | General function |
|---|---|---|---|---|
| TSR2, Ribosome Maturation Factor | A 432 G | K 138 E | 1.058 | transcriptional regulation; apoptosis |
| Zinc Finger Protein 32 | T 110 C; T 452 C | V 37 A; V 151 A | 3.352 | transcriptional regulation |
| Mitogen-Activated Protein Kinase Binding Protein 1-like | G 122 A | S 40 N | 1.689 | immune system response; regulatory function |
| Tumor Necrosis Factor (TNF) Receptor superfamily member 5 | T 330 G | D 110 E | 1.295 | immune system response |
| SprT-like N-Terminal Domain (Spartan) | T 919 C | S 307 P | 1.215 | DNA damage response |
| Tripartite Motif Containing 56 | G 271 A; G 413 A | V 91 I; G 138 E | 1.874 | immune system response |
| NFU1 iron-sulfur cluster scaffold homolog | A 454 G | I 152 V | 2.144 | iron-sulfur cluster biogenesis |
| WDYHV Motif Containing 1 | G 305 C | R 102 P | 1.613 | cellular protein modification process |
| Corepressor Interacting With RBPJ, 1 | T 1000 C; A 1238 G | F 334 L; D 413 G | 2.714 | transcriptional regulation; signal transduction |
| DnaJ Heat Shock Protein Family (Hsp40) Member A1 | T 838 A; T 1034 C | S 280 T; V 345 A | 4.105 | protein folding; regulation of androgen receptor activity |
| Cystatin-like | G 92 A | G 31 E | 1.769 | regulatory function |
| EWS RNA-binding protein 1 | A 598 G | T 200 A | 2.042 | neuron development; transcriptional regulation |
| Annexin A3b | A 544 G | N 182 D/E | 1.092 | cell morphogenesis; membrane permeability |
| REV3 Like, DNA Directed Polymerase Zeta Catalytic Subunit | A 400 G | T 134 A | 2.601 | DNA repair; cell proliferation |
| Adaptor Related Protein Complex 4 Subunit Beta 1 | G 404 A; C 806 T | G 135 D; A 269 V | 1.242 | localization |
| Vascular Cell Adhesion Molecule 1 | G 77 A | A 26 N | 1.090 | cell-cell recognition |
| Zinc finger, NFX1-type containing 1 | T 173 C | V 58 A | 1.776 | DNA-binding; transcription factor activity |
| Hemoglobin, beta adult 1 | T 57 G | F 19 L | 2.190 | oxygen transport |
| SLP Adaptor and CSK Interacting Membrane Protein | T 256 C | S 87 P | 1.716 | immune synapse formation; signal transduction |
| Zinc Finger Protein 678 | T 491 C | V 164 T/A | 1.120 | transcriptional regulation |
| Immunoglobulin superfamily DCC subclass member 4 | A 428 G | N 143 S | 1.553 | binding |
| ATP Synthase Membrane Subunit F | A 275 G | D 92 G/S | 1.294 | ATP production |
| Repulsive Guidance Molecule BMP co-receptor b (3'UTR) | G 8 C | W 3 S | 1.453 | development of nervous system |
| Chromosome unknown C2orf42 homolog | A 52 G; T 97 C | K 18 E; S 33 P | 1.522 | integral component of membrane |
| uncharacterized LOC111853234 transcript variant X2 | G 209 A | G 70 E | 1.595 | - |
| uncharacterized protein LOC109871595 | A 268 G | T 90 A | 1.248 | - |
| unknown gene | T 65 C; C 150 G; T 166 C | V 22 A; D 50 E; W 56 R | 1.801 | - |
* Amino acid substitutions predicted to impair/alter protein function.
** SNP position refers to the SCO alignments.
Fig 2. Proportional ratio of candidate SCO-sequences between C. tshokwe and C. compressirostris for any annotated GO-term.
The figure illustrates the proportional ratio of candidate SCO-sequences between the two Campylomormyrus species (y-axis) relative to the total candidate SCO-sequence number for any annotated GO term (x-axis). The 95% confidence interval for an equal ratio (50:50) is depicted as the gray shaded area, rendering dots (i.e., GO terms) outside of the area significant (red circles).
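
The shaded region described in this caption widens for GO terms with few candidate sequences and narrows as the total grows. The following is a minimal sketch of one way to build such a band, assuming an exact two-sided binomial test against a 50:50 ratio; the caption does not state which method the authors used, and the function names and counts below are hypothetical:

```python
from math import comb


def binom_two_sided_p(k, n):
    """Exact two-sided binomial p-value for k of n under an equal ratio (p = 0.5).

    With p = 0.5 the distribution is symmetric, so doubling the
    smaller tail gives the two-sided p-value.
    """
    tail = min(k, n - k)
    smaller_tail = sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    return min(1.0, 2 * smaller_tail)


def equal_ratio_band(n, alpha=0.05):
    """Range of proportions k/n that is NOT significantly different from 50:50."""
    inside = [k for k in range(n + 1) if binom_two_sided_p(k, n) >= alpha]
    return min(inside) / n, max(inside) / n


# Hypothetical GO term: 18 of 20 candidate sequences come from one species.
p_value = binom_two_sided_p(18, 20)
low, high = equal_ratio_band(20)
print(p_value < 0.05, (low, high))
```

Under these assumptions, a GO term plotted at a proportion outside `(low, high)` for its total count would fall outside the shaded area.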
Comparison of proportional assignment of candidate SCO data among C. tshokwe and C. compressirostris for KEGG level A categories.
| KEGG level A category | Fisher exact test¹ (p-value) | Chi-square test² (p-value) |
|---|---|---|
| Metabolism | 0.691 | 0.090 |
| Genetic Information Processing | 0.877 | 0.009** |
| Environmental Information Processing | 0.034* | 0.578 |
| Cellular Processes | 0.031* | < 0.001** |
| Organismal Systems | < 0.001** | 0.007** |
| Human Diseases | 0.691 | 0.008** |
1 Species-wise comparison of the total number of candidate SCO-sequences assigned to the respective level A category.
2 Species-wise comparison of candidate SCO-sequence distributions in a KEGG level A category according to the KEGG level B assignment.
* significant (p < 0.05)
** highly significant (p < 0.01).
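
For the species-wise comparison of category totals, a Fisher exact test operates on a 2x2 table (species by in-category versus not-in-category). The sketch below is a dependency-free illustration of the two-sided test, not the authors' implementation; the function name and the example counts are hypothetical, and the chi-square comparison of level B distributions is not covered here:

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    # The small tolerance guards against floating-point ties.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: sequences in / not in one KEGG level A category.
species_a = (30, 170)  # assumed values for illustration only
species_b = (12, 198)
p = fisher_exact_two_sided(*species_a, *species_b)
print(f"p = {p:.4f}")
```

A p-value below 0.05 would correspond to a single asterisk in the table above, and below 0.01 to a double asterisk.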
Fig 3. Proportional ratio of candidate SCO-sequences among C. tshokwe and C. compressirostris with annotated KEGG pathways.
The figure represents the proportional ratio of candidate SCO-sequence counts among both species in each KEGG level B category found (y-axis), relative to the total number of candidate SCO-sequences in the respective category (x-axis). The 95% confidence interval for a 50:50 ratio is depicted as the gray shaded area, rendering dots (i.e., KEGG level B categories) outside of the area significant.
Fig 4. KEGG level A category assignments among the candidate SCO sequences, compared to the entire transcriptome.
The bar chart shows the sequence percentage (y-axis) with an annotated KEGG Level A category (x-axis) in the SCO data set and entire transcriptomes of C. tshokwe (red) and C. compressirostris (blue). The error bars indicate the 95% confidence interval, taking the total absolute number of candidate SCO-sequences into account (confidence limits of proportions). Asterisks (*) depict significance at p < 0.05.
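
Error bars of this kind are confidence limits around an observed proportion, and they shrink as the number of sequences grows. The caption does not say which formula was applied; the sketch below assumes the Wilson score interval as one common choice, with a hypothetical function name and hypothetical counts:

```python
from math import sqrt


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z ** 2 / n
    centre = (p_hat + z ** 2 / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical example: 40 of 250 candidate sequences fall in one category.
low, high = wilson_interval(40, 250)
print(f"{40 / 250:.3f} [{low:.3f}, {high:.3f}]")
```

If the interval for the candidate SCO data set does not contain the corresponding whole-transcriptome percentage, the difference would be flagged in a plot of this type.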
Fig 5. Overview of the workflow of the applied data-analytical approaches.
Shown are the major bioinformatic steps to create an input data set (A), steps for potential candidate gene identification (B), and the computational steps to create the candidate SCO data sets as well as their three annotation comparisons (C).