Bijendra Khadka, Tonuka Chatterjee, Bhagwati P Gupta, Radhey S Gupta.
Abstract
The phylum Nematoda encompasses numerous free-living as well as parasitic members, including the widely used animal model Caenorhabditis elegans, with significant impact on human health, agriculture, and the environment. In view of the importance of nematodes, it is of much interest to identify novel molecular characteristics that are distinctive features of this phylum, or of specific taxonomic groups/clades within it, thereby providing innovative means for diagnostics as well as genetic and biochemical studies. Using genome sequences for 52 available nematodes, a robust phylogenetic tree was constructed based on concatenated sequences of 17 conserved proteins. The branching of species in this tree provides important insights into the evolutionary relationships among the studied nematode species. In parallel, detailed comparative analyses of protein sequences from nematode (Caenorhabditis) species reported here have identified 52 novel molecular signatures (or synapomorphies) consisting of conserved signature indels (CSIs) in different proteins, which are uniquely shared by the homologs from either all genome-sequenced Caenorhabditis species or a number of higher taxonomic clades of nematodes encompassing this genus. Of these molecular signatures, 39 CSIs in proteins involved in diverse functions are uniquely present in all Caenorhabditis species, providing reliable means for distinguishing this group of nematodes in molecular terms. The remainder of the CSIs are specific for a number of higher clades of nematodes and offer important insights into the evolutionary relationships among these species. The structural locations of some of the nematode-specific CSIs were also mapped in the structural models of the corresponding proteins. All of the studied CSIs are localized within the surface-exposed loops of the proteins, suggesting that they may potentially be involved in mediating novel protein-protein or protein-ligand interactions, which are specific for these groups of nematodes. The identified CSIs, due to their exclusivity for the indicated groups, provide reliable means for the identification of species within these nematode groups in molecular terms. Further, due to the predicted roles of these CSIs in cellular functions, they provide important tools for genetic and biochemical studies in Caenorhabditis and other nematodes.
Keywords: Caenorhabditis elegans; Chromadorea; conserved signature indels; evolutionary relationships among nematodes; genome sequences; molecular markers (synapomorphies); phylogenetic trees; structural analysis of Caenorhabditis/nematodes-specific indels
Year: 2019 PMID: 31554175 PMCID: PMC6826867 DOI: 10.3390/genes10100739
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.096
Figure 1. Maximum-likelihood tree for 52 genome-sequenced nematode species. The tree was constructed based on the concatenated alignment of 17 orthologous proteins present in a single copy in these genomes as described in the Methods. Bootstrap scores for each node are indicated at the branch points. The bar indicates 0.2 changes per position. The major nematode groups at different phylogenetic levels are labeled. The tree was rooted using the outgroup species shown.
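The concatenation step mentioned in this legend can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `concatenate_alignments`, the dictionary layout (species name mapped to aligned sequence) and the toy sequences are assumptions made for illustration only, and the gap padding for a missing species is an added convenience that the single-copy data set described here would not need.

```python
from typing import Dict, List


def concatenate_alignments(alignments: List[Dict[str, str]]) -> Dict[str, str]:
    """Join per-protein alignments, species by species, into one long alignment."""
    # Collect every species name that appears in any of the alignments.
    species = sorted(set().union(*(aln.keys() for aln in alignments)))
    parts: Dict[str, List[str]] = {sp: [] for sp in species}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("all sequences in one alignment must have equal length")
        width = lengths.pop()
        for sp in species:
            # Assumption: a species absent from one protein is padded with gaps.
            parts[sp].append(aln.get(sp, "-" * width))
    return {sp: "".join(chunks) for sp, chunks in parts.items()}


# Toy input: two made-up protein alignments for two hypothetical species labels.
toy = [
    {"species_A": "MKT", "species_B": "MRT"},
    {"species_A": "AL-G", "species_B": "ALSG"},
]
print(concatenate_alignments(toy))
```

The resulting dictionary could then be written out in FASTA or PHYLIP format and passed to a tree-building program of choice.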
Figure 2. Partial sequence alignments of the proteins (A) Rab44 and (B) poly ADP-ribose glycohydrolase showing two CSIs (boxed) that are specific for the genus Caenorhabditis. Dashes (-) in these as well as all other alignments denote identity with the amino acid shown in the top sequence. Sequence information for only limited numbers of species is presented in this figure. More detailed alignments for these CSIs are shown in Figure S2. Sequence information for 37 additional CSIs, which are also specific for the genus Caenorhabditis, is provided in Figures S3–S39 and a summary of these CSIs is provided in Table 1.
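The dash convention described in this legend can be reproduced with a short Python sketch. The function name `render_identity_dashes` and the toy sequences are hypothetical. Because the dash is reserved for identity, the sketch assumes that alignment gaps arrive as `-` in the input and are displayed as a blank space in the output; the legend itself does not state which gap symbol the figures use.

```python
from typing import List


def render_identity_dashes(aligned: List[str], gap_in: str = "-", gap_out: str = " ") -> List[str]:
    """Show rows below the top sequence with '-' wherever they match the top row."""
    top = aligned[0]
    # The top row is printed in full; only its gap symbol is swapped.
    rows = [top.replace(gap_in, gap_out)]
    for seq in aligned[1:]:
        row = []
        for top_char, char in zip(top, seq):
            if char == gap_in:
                row.append(gap_out)   # assumption: gaps are drawn as blanks
            elif char == top_char:
                row.append("-")       # identical to the top sequence
            else:
                row.append(char)      # substitution is spelled out
        rows.append("".join(row))
    return rows


# Toy alignment (made-up residues); the third row lacks one residue.
toy_rows = ["MKTAYLG", "MKSAYLG", "MKT-YLG"]
for line in render_identity_dashes(toy_rows):
    print(line)
```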
Table 1. Characteristics of the CSIs specific for the genus Caenorhabditis.
| Protein Name | Gene Name | Accession No. | Figure No. | Indel Size | Indel Position |
|---|---|---|---|---|---|
| Rab44 | 4R79.2 | AFP33163 | | 1 aa ins | 233–263 |
| Poly ADP-ribose Glycohydrolase | parg-1 | NP_001255324 | | 5 aa ins | 411–454 |
| Poly (ADP-ribose) polymerase 2 | parp-2 | NP_001022057 | | 2 aa del | 389–420 |
| DnaJ-domain containing chaperone protein | dnj-16 | OZF80352 | | 1 aa del | 186–207 |
| Cyclin-dependent kinase 12 | cdk-12 | NP_001254914 | | 1 aa del | 456–487 |
| CRAL-TRIO domain-containing Sec14 protein | T23G5.2 | NP_001040875 | | 2 aa ins | 448–487 |
| Mammalian ZAK kinase homolog | zak-1 | NP_001254942 | | 1 aa ins | 80–109 |
| Probable 3',5'-cyclic phosphodiesterase | pde-2 | NP_001022706 | | 2 aa ins | 448–495 |
| Nuclear Hormone Receptor | nhr-68 | NP_001256335 | | 1 aa del | 1–35 |
| SMA2-like | sma-1 | NP_001256383 | | 2 aa ins | 1353–1393 |
| Glutathione transferase omega-1 * | C02D5.4 | NP_001254962 | | 1 aa ins | 65–103 |
| Probable 26S proteasome regulatory subunit | rpn-6.2 | NP_001254973 | | 1 aa ins | 46–90 |
| Serine/Threonine protein phosphatase 2A Regulatory Subunit | pptr-2 | NP_001256283 | | 1 aa ins | 92–130 |
| Failed axon connections-like protein * | F53G12.9 | NP_001293265 | | 1 aa ins | 176–211 |
| NADH dehydrogenase [ubiquinone] 1 alpha subcomplex assembly factor 2 | Y116A8C.30 | XP_002632399 | | 13 aa ins | 62–97 |
| Disorganized muscle protein 1 | Cbn-dim-1 | EGT45899 | | 1 aa del | 135–170 |
| ETS (E26 transformation-specific) class transcription factor | ets-9 | NP_001024482 | | 1 aa ins | 54–78 |
| Glycine-rich domain-containing protein | F32B5.7 | EGT38541 | | 1 aa ins | 430–466 |
| Heat shock protein 70 | F11F1.1 | NP_001255199 | | 2 aa del | 364–399 |
| Heat shock protein 70 | F11F1.1 | NP_001255199 | | 1 aa del | 437–481 |
| Abnormal cell migration protein 13 | mig-13 | NP_001024661 | | 1 aa del | 123–151 |
| Regulatory-associated protein of mTOR-like protein | daf-15 | XP_003089575 | | 1 aa ins | 143–175 |
| Abnormal cell migration protein 13 | mig-13 | NP_001024661 | | 3 aa del | 141–170 |
| Abnormal cell migration protein 13 | mig-13 | NP_001024660 | | 1 aa del | 220–251 |
| Plexin | plx-1 | NP_500018 | | 1 aa ins | 1460–1497 |
| Piwi-like protein * | ergo-1 | NP_503362 | | 1 aa ins | 1020–1070 |
| Stomatin * | sto-1 | NP_001123124 | | 1 aa del | 70–99 |
| Ral guanine nucleotide dissociation stimulator | rgl-1 | NP_001123140 | | 1 aa del | 257–290 |
| Transglutaminase/protease homolog | ltd-1 | NP_001309573 | | 1 aa del | 261–290 |
| Vacuolar protein sorting-associated protein 41 homolog | vps-41 | NP_001033544 | | 1 aa ins | 209–242 |
| Serine/arginine-rich splicing factor | rsp-1 | NP_001317731 | | 1 aa del | 13–36 |
| Serine/Threonine-protein phosphatase PP1 | Cni-W03D8.2 | PIC40784 | | 1 aa ins | 159–191 |
| NEPrilysin metallopeptidase * | nep-20 | NP_001317749 | | 1 aa del | 761–804 |
| DNA PRImase homolog | pri-2 | NP_001251923 | | 1 aa ins | 224–262 |
| Probable maleylacetoacetate isomerase | Y105E8A.21 | NP_001252372 | | 3 aa del | 56–91 |
| Glutathione S-transferase * | C25H3.7 | NP_001254102 | | 1 aa ins | 39–61 |
| CTD nuclear envelope phosphatase 1 homolog | cnep-1 | NP_001254124 | | 1 aa ins | 32–52 |
| Kelch-domain protein | F53E4.1 | NP_506895 | | 6 aa ins | 206–248 |
| Intermediate filament protein * | ifc-2 | NP_741705 | | 2 aa del | 946–983 |
* Two isoforms of this protein are present in Caenorhabditis species.
Figure 3. Partial sequence alignment of a conserved region from a protein annotated as abnormal cell migration protein 13 (MIG-13) containing a 2 aa insertion (boxed) which is specific for the family Rhabditoidea. This insertion is not present in the homologous proteins from other nematodes as well as other eukaryotic species. Sequence information for three additional CSIs, which are also specific for the family Rhabditoidea, is provided in Figures S41–S43 and a summary of these CSIs is provided in Table 2. Other details are the same as in the legend to Figure 2.
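The exclusivity criterion stated in this legend (an insertion present in one group and absent from all other homologs) can be expressed as a small Python check. This is only a sketch under stated assumptions: the function names, the `-` gap symbol, the column indices and the toy sequences are all hypothetical, and the check covers exclusivity alone. It does not test whether the flanking regions are conserved, which the term "conserved signature indel" also implies.

```python
from typing import Dict, Set


def residues_in_columns(seq: str, start: int, end: int, gap: str = "-") -> int:
    """Count non-gap characters in alignment columns [start, end)."""
    return sum(1 for ch in seq[start:end] if ch != gap)


def is_group_specific_insertion(
    alignment: Dict[str, str], ingroup: Set[str], start: int, end: int
) -> bool:
    """True if all in-group sequences fill the indel columns and all others are gapped there."""
    inside = [residues_in_columns(s, start, end) for name, s in alignment.items() if name in ingroup]
    outside = [residues_in_columns(s, start, end) for name, s in alignment.items() if name not in ingroup]
    if not inside or not outside:
        return False  # both groups must be represented for the test to mean anything
    return all(n == end - start for n in inside) and all(n == 0 for n in outside)


# Toy alignment with a 2 aa insertion at columns 4-5 in the hypothetical in-group.
toy_alignment = {
    "in_1": "MKTAGGYLG",
    "in_2": "MKSAGGYLG",
    "out_1": "MKTA--YLG",
    "out_2": "MRTA--YLG",
}
print(is_group_specific_insertion(toy_alignment, {"in_1", "in_2"}, 4, 6))
```

A deletion specific to a group could be tested in the same way by swapping the roles of the in-group and the remaining sequences.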
Table 2. Characteristics of the CSIs specific for the nematode suborder Rhabditoidea and class Chromadorea.
| Protein Name | Gene Name | Accession No. | Figure (Fig. Sup) No. | Indel Size | Indel Position | Specificity |
|---|---|---|---|---|---|---|
| Cleavage Factor Im homolog | cfim-2 | NP_001255355 | | 2 aa ins | 87–130 | Rhabditoidea |
| Methyl-CpG-binding protein | mbd-2 | NP_001021012 | | 2 aa ins | 158–200 | Rhabditoidea |
| Abnormal cell migration protein 13 | mig-13 | NP_001024660 | | 2 aa ins | 71–105 | Rhabditoidea |
| PAX3- and PAX7 binding protein 1 | F43G9.12 | NP_001250840 | | 1 aa del | 126–164 | Rhabditoidea |
| tRNA (guanine-N(1)-)-methyltransferase | F46F11.10 | NP_491647 | | 4 aa ins | 632–669 | Chromadorea |
| Palmitoyltransferase a | spe-10 | KHJ83757 | | 1 aa del | 234–270 | Chromadorea |
| Palmitoyltransferase | spe-10 | KHJ83757 | | 2 aa del | 255–282 | Chromadorea |
| Battenin | cln-3.3 | EGT30700 | | 3 aa ins | 162–194 | Chromadorea |
| ETS (E26 transformation-specific) class transcription factor | ets-5 | KJH47557 | | 1 aa ins | 122–155 | Chromadorea |
| Heterogeneous nuclear ribonucleoprotein A1 * | H28G03.1 | KJH46562 | | 1 aa ins | 93–122 | Chromadorea |
| Heterogeneous nuclear ribonucleoprotein A1 * | H28G03.1 | XP_013302959 | | 5 aa del | 139–171 | Chromadorea |
| Regulator of G-protein signaling 7 a | Cbn-rgs-7 | EGT30339 | | 1 aa ins | 221–252 | Chromadorea |
| Na(+)/H(+) Exchange Regulatory Factor * | nrfl-1 | NP_001294068 | | 1 aa ins | 210–245 | Nematoda |
* Two isoforms of this protein are present in Rhabditida species.
a These CSIs are not found in Strongyloides ratti, which branches deeply in comparison to the other Chromadorea species.
Figure 4. Excerpts from the sequence alignment of a conserved region of the tRNA (guanine-N(1)-)-methyltransferase protein containing a 4 aa CSI (boxed) which is specifically found in the homologs from the class Chromadorea. Sequence information for seven additional CSIs, which are also specific for the class Chromadorea, is provided in Figures S44–S50 and a summary of these CSIs is provided in Table 2.
Figure 5. Partial sequence alignment from a conserved region of a Na(+)/H(+) exchange regulatory factor protein (NRFL-1) harboring a 1 aa insertion (boxed) which is specific for the phylum Nematoda. Most nematode species contain two homologs of this protein and this CSI is specifically present in one of these two homologs. More detailed information regarding the species distribution of this CSI is provided in Figure S51.
Figure 6. Homology models of the C. elegans proteins (A) Rab-44, (B) poly ADP-ribose glycohydrolase and (C) tRNA (guanine-N(1)-)-methyltransferase showing the locations of the CSIs in the structures of these proteins. The CSIs are shown in red in these figures. As seen from the presented structural overlap, the CSIs in all three studied proteins are localized within the surface-exposed loops of these proteins. More details regarding modeling of these structures are provided in the Methods section.
Figure 7. A conceptual diagram summarizing the species specificities of different nematodes-specific CSIs identified in this work and the evolutionary relationships inferred from them and the constructed phylogenetic tree. The numbers of CSIs that are specific for different clades or species-groupings are noted on the respective nodes.