
The genome and transcriptome of the zoonotic hookworm Ancylostoma ceylanicum identify infection-specific gene families.

Erich M Schwarz1, Yan Hu2, Igor Antoshechkin3, Melanie M Miller4, Paul W Sternberg5, Raffi V Aroian2.   

Abstract

Hookworms infect over 400 million people, stunting and impoverishing them. Sequencing hookworm genomes and finding which genes they express during infection should help in devising new drugs or vaccines against hookworms. Unlike other hookworms, Ancylostoma ceylanicum infects both humans and other mammals, providing a laboratory model for hookworm disease. We determined an A. ceylanicum genome sequence of 313 Mb, with transcriptomic data throughout infection showing expression of 30,738 genes. Approximately 900 genes were upregulated during early infection in vivo, including ASPRs, a cryptic subfamily of activation-associated secreted proteins (ASPs). Genes downregulated during early infection included ion channels and G protein-coupled receptors; this downregulation was observed in both parasitic and free-living nematodes. Later, at the onset of heavy blood feeding, C-lectin genes were upregulated along with genes for secreted clade V proteins (SCVPs), encoding a previously undescribed protein family. These findings provide new drug and vaccine targets and should help elucidate hookworm pathogenesis.

Year: 2015; PMID: 25730766; PMCID: PMC4617383; DOI: 10.1038/ng.3237

Source DB: PubMed; Journal: Nat Genet; ISSN: 1061-4036; Impact factor: 38.330


The two hookworm species causing the most infections are Necator americanus and Ancylostoma duodenale, which are generally restricted to human hosts[1,9]. Hookworms are free living during part of their life cycle, with eggs hatching in soil and larvae feeding on bacteria through the first and second larval stages. At the infectious third-stage larval phase (L3i), hookworms cease feeding and wait until they encounter a human host. They generally enter their host by burrowing into skin, although Ancylostoma can alternatively enter by being swallowed. Hookworms then pass through the bloodstream, lungs and digestive tract to the small intestine, where they affix themselves, mature to adulthood, mate and lay eggs that are excreted by the host[1]. The ability to culture A. ceylanicum in golden hamster allows it to be used as a model system for the human-specific hookworms N. americanus and A. duodenale, upon which new drug and vaccine candidates can be tested (Fig. 1)[6,10,11]. Human-specific hookworms belong to a class of parasitic nematodes, strongylids, that are more closely related to the free-living Caenorhabditis elegans than is the free-living Pristionchus pacificus (Fig. 2)[12-15]. Treatments effective against A. ceylanicum might thus also prove useful against other strongylids, such as Haemonchus contortus, that infect farm animals and depress agricultural productivity[16]. Characterizing the genome and transcriptome of A. ceylanicum is a key step toward such comparative analysis.
Figure 1

Life cycle of A. ceylanicum. A. ceylanicum hatch in feces and grow as free-living first- to third-stage (L1–L3) larvae. Before exiting the third larval stage, they mature into infectious third-stage (L3i) larvae, arresting further development until they are inside a host. In 24 h after gavage into golden hamsters, A. ceylanicum are still in the stomach but have exited the L3i stage (24.PI). A standard model for parasite infection is to incubate L3i larvae for 24 h in hookworm culture medium (24.HCM), which evokes changes in larval shape and behavior thought to mimic those of 24.PI larvae in vivo. By 5 d after infection, larvae have migrated further to the intestine, affixed themselves there and grown into early fourth-stage (L4) larvae (5.D; female shown) with visible sexual differentiation. By 12 d (12.D; female shown), they start heavy blood feeding and become young adults with mature males and a few gravid females, with little or no egg laying. By 17 d (17.D; male shown), they are fully mature adults. They begin laying many eggs that are deposited outside the host during defecation, renewing the life cycle. From 19 d onward (19.D; male shown), they remain fertile adults for weeks in hamsters. Scale bars: 100 μm (L3i through 5.D), 500 μm (12.D) and 1 mm (17.D and 19.D).

Figure 2

Evolutionary relatedness of A. ceylanicum to other nematodes. The phylogeny is derived from van Megen et al.[13] and Kiontke et al.[14]. N. americanus and H. contortus are strongylid parasites[15] and the closest relatives of A. ceylanicum. C. elegans, C. briggsae and P. pacificus are free-living, non-parasitic nematodes. Nematodes from distinct groups (clades)[12] within the phylum are color-coded: black, A. ceylanicum and close relatives, clade V; green, plant parasites, clade IV; pink, ascarid and filarial animal parasites, clade III; orange, Trichinella, an animal parasite from clade I. To the right are the numbers of strictly orthologous genes for A. ceylanicum or C. elegans and other species. Self-comparisons (bold) list all strictly defined orthologs within a genome. A. ceylanicum and C. elegans have similar orthology to diverse nematode species.

We assembled an initial A. ceylanicum genome sequence of 313 Mb and a scaffold N50 of 668 kb, estimated to cover ~95% of the genome, with Illumina sequencing and RNA scaffolding[17,18] (Supplementary Tables 1–3). The genome size was comparable to those of Ancylostoma caninum (347 Mb)[19] and H. contortus (320–370 Mb)[20,21] but larger than those of N. americanus, C. elegans and P. pacificus (100–244 Mb)[22-24]. We found that 40.5% of the genomic DNA was repetitive, twice as much as in N. americanus, C. elegans or P. pacificus (17–24%). We predicted 26,966 protein-coding genes[25] with products of ≥100 residues (Supplementary Table 4). We also predicted 10,050 genes with products of 30–99 residues, to uncover smaller proteins that might aid in parasitism[26]. With RNA sequencing (RNA-seq), we detected expression of 23,855 (88.5%) and 6,883 (68.5%) of these genes, respectively (Fig. 3).
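For readers unfamiliar with the scaffold N50 statistic quoted above, the following is a minimal sketch of how it is conventionally computed: the length of the scaffold at which the running total, taken from longest to shortest, first reaches half of the assembly size. The function name and the toy lengths are illustrative assumptions and are not taken from the assembly pipeline used in the paper.

```python
def scaffold_n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bp).

    N50 is the length L such that scaffolds of length >= L together
    contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example (hypothetical lengths, total 300): 80 + 70 = 150 reaches half.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(scaffold_n50(toy_lengths))  # prints 70
```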
Figure 3

RNA expression levels for 30,738 A. ceylanicum genes. Gene activity during infection is shown in log2-transformed transcripts per million (TPM), with k partitioning of the genes into 20 groups. Genes in yellow and blue are up- and downregulated, respectively; TPM values are shown ranging from ≤2^−3 to ≥2^3. Developmental stages are as in Figure 1. Changes in gene expression after 24 h of growth in HCM (24.HCM) are relatively minor, as opposed to the far-reaching changes in gene expression seen after 24 h of infection in vivo (24.PI).
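The display range in the caption can be read as a log2 transform of TPM with values saturating at both ends of the color scale. The sketch below shows one plausible way to do this; the function name and the choice to clamp before taking the logarithm (which also avoids log2 of zero) are assumptions, not a description of how the figure was actually drawn.

```python
import math


def clipped_log2_tpm(tpm, lo_exp=-3, hi_exp=3):
    """Return log2(TPM), saturated to the range [lo_exp, hi_exp].

    Clamping TPM to [2**lo_exp, 2**hi_exp] first means that genes with
    zero reads map to the bottom of the scale instead of raising an error.
    """
    floor = 2.0 ** lo_exp
    ceiling = 2.0 ** hi_exp
    bounded = min(max(tpm, floor), ceiling)
    return math.log2(bounded)


# Hypothetical TPM values for one gene across stages.
for tpm in (0.0, 1.0, 4.0, 1000.0):
    print(tpm, clipped_log2_tpm(tpm))
```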

The genomes of plant-parasitic, necromenic and animal-parasitic nematodes have all acquired bacterial genes through horizontal gene transfer (HGT)[27,28]. We detected one instance of bacterial HGT in A. ceylanicum: Acey_s0012.g1873, a homolog of the N-acetylmuramoyl-L-alanine amidase amiD, which encodes a protein that may help bacteria recycle their murein[29]. Acey_s0012.g1873 was strongly expressed in L3i and then downregulated in all later stages of infection. It has nine predicted introns, presumably acquired after HGT; it has only one homolog in the entire nematode phylum (NECAME_15163 from N. americanus) but many bacterial homologs (Supplementary Fig. 1 and Supplementary Table 5). The sap-feeding insects Acyrthosiphon pisum and Planococcus citri also have amiD genes, acquired by HGT, that may promote bacterial lysis[30,31].

To find genes acting at specific points of infection, we carried out RNA-seq on specimens collected at developmental stages spanning the onset and establishment of infection by A. ceylanicum in golden hamster (Figs. 1 and 3, and Supplementary Table 6), beginning at L3i and followed by 24 h either of incubation in hookworm culture medium (24.HCM), a standard model for early hookworm infection[32], or infection in the hamster stomach (24.PI). We found 942 genes to be significantly upregulated from L3i after 24 h of infection in vivo (Supplementary Table 7). In contrast, we observed only 240 genes significantly upregulated from L3i after 24 h of incubation in HCM, of which 141 were also upregulated with in vivo infection. This lower number matches previous observations[32] and shows that infection in vivo has stronger effects on gene activity than its in vitro model.

We linked known or probable gene functions to steps of infection by assigning gene ontology (GO) terms to A. ceylanicum genes[33] and computing which GO terms were over-represented among genes upregulated or downregulated in developmental transitions (Supplementary Tables 8 and 9)[34]. We also analyzed homologous gene families for disproportionate upregulation or downregulation; in particular, gene families identified by orthology of A. ceylanicum with N. americanus or other nematodes might encode previously undescribed components of infection (Supplementary Table 10).

Proteases, protease inhibitors, nucleases and protein synthesis were upregulated during early infection (L3i to 24.PI; Supplementary Tables 9a and 11a); proteases and protease inhibitors were also upregulated after L3i in N. americanus[24], as were proteases in H. contortus[21]. Secreted proteases could allow hookworms to digest host proteins in blood and intestinal mucosa[6,11,35-37]. Secreted proteases might also digest and inactivate proteins of the host’s immune system[37,38]. Conversely, secreted protease inhibitors could also suppress host immunity[39-41].

G protein–coupled receptors (GPCRs), receptor-gated ion channels and neurotransmission-related functions in general were downregulated during early infection (L3i to 24.PI), along with transcription factors (Supplementary Tables 9b and 11b). We observed the same pattern among genes downregulated in the transition from L3 to fourth-stage (L4) larvae both in H. contortus[21] and C. elegans[42] (Supplementary Table 8). This finding is consistent with downregulation after L3 of sensory perception and transcription genes in both C. elegans[43] and N. americanus[24] and of ion channel genes in A. caninum and Brugia malayi[32,44]. Such downregulation might thus be conserved in both parasitic and free-living nematodes.

Among gene families upregulated during early infection, we found some already known from other parasitic nematodes, such as ASPs (Supplementary Table 12a)[21,24,45,46]. ASP genes encode a diverse set of secreted cysteine-rich proteins, whose functions probably include blocking immune responses and blood clotting[8]. However, we also found a family of 92 genes collectively upregulated during early infection in vivo (24.PI; q value = 0.003) that had no obvious similarity to known gene families (Supplementary Tables 4 and 12a). By contrast, upregulation of these genes after 24 h of simulated infection in vitro was insignificant (24.HCM; q value = 0.93). These homologs were distantly related to ASPs, so we termed them ASP-related genes (ASPRs; Fig. 4 and Supplementary Fig. 2). We found other ASPRs in some strongylids (for example, N. americanus; Supplementary Tables 13 and 14) but not all (for example, H. contortus). Most ASPR proteins were predicted to be secreted (Supplementary Table 4), and one ASPR in Heligmosomoides bakeri is secreted by parasitic adults[46]. Thus, like ASPs, ASPRs might comprise an important element of hookworm infection in vivo.
Figure 4

Domain-based phylogeny of ASP and ASPR genes from A. ceylanicum and N. americanus and ASPR genes from other nematodes. The tree shows a maximum-likelihood phylogeny of protein domains rather than full-length proteins at the tips (as ASP genes sometimes encode two or more tandem ASP domains). All ASP domains and most ASPR domains are from A. ceylanicum or N. americanus. Almost all domains from ASPRs fall within a single branch, labeled in blue. ASPR genes are labeled blue (A. ceylanicum), green (N. americanus), purple (Oesophagostomum dentatum) or magenta (Heligmosomoides bakeri). ASP domains from orthologs of known ASP genes are labeled in gold, with their branches. N- and C-terminal domains from two-domain proteins are noted as “N” or “C.” Domains from other, less familiar ASP genes are labeled in gray. Confidence values are given as decimal fractions (Supplementary Fig. 2). Identities of the corresponding genes and domains are given in Supplementary Tables 4, 13 and 14.

A. ceylanicum had 432 ASP genes, noticeably more than the related parasites N. americanus (128 genes) and H. contortus (161 genes) and remarkably more than the non-parasitic C. elegans and P. pacificus (35 and 33 genes, respectively). A. ceylanicum and N. americanus also had 92 and 25 ASPR genes, respectively, which were missing entirely from the other species. One explanation for this diversity is the ‘gray pawn’ hypothesis: members of a large gene family might have little individual effect on phenotypic fitness yet be collectively needed for robust fitness under variable conditions[47]. For parasites, a relevant variable condition might be diverse host immune systems, which might favor continually diversifying sequences and expression profiles of ASPs and ASPRs.

For development from 24 h to 5 d after infection (24.PI to 5.D), genes encoding structural components of cuticle and genes whose products bind cytoskeletal proteins such as actin were prominently upregulated (Supplementary Table 11e). This period in the life cycle corresponds with the start of parasitic feeding, molting into L4 larvae and overt sexual differentiation (Fig. 1)[6,10]. We also observed a new protein family upregulated at this stage, with homologs in the strongylids A. ceylanicum, N. americanus, H. contortus and Angiostrongylus cantonensis (Supplementary Fig. 3 and Supplementary Tables 4, 12b and 15); the corresponding genes in A. cantonensis are expressed in L4 larvae infecting brain tissue[48]. We thus named this family strongylid L4 proteins (SL4Ps). In A. ceylanicum, 24 SL4P genes encoded proteins of ~200 residues, of which 21 were predicted to be non-classically secreted[49] without a leader sequence (Supplementary Table 16); notably, parasitic nematodes often use non-classical rather than classical secretion to export proteins into their hosts[50].

From 5 to 12 d after infection (5.D to 12.D), genes encoding protein tyrosine phosphatases, serine/threonine kinases and C-lectins were prominently upregulated (Supplementary Tables 11g and 12c). This period in the life cycle corresponds with maturation from late-L4 larvae to young adults with incipient fertility and the onset of heavy blood feeding, which exposes A. ceylanicum to the host’s immune system (Fig. 1)[10,11]. Among 22 C-lectin genes upregulated by 12 d, we detected 6 whose products had greater apparent similarity to mammalian than to nematode lectins (Supplementary Tables 4 and 17). Two of these genes encoded structural mimics of mammalian mannose receptor[51], with five tandem C-lectin domains that had arisen through intragenic duplication (Supplementary Fig. 4a). The other four C-lectin genes resembled mammalian asialoglycoprotein receptors and neurocans[51] but arose phylogenetically from nematode lectins (Supplementary Fig. 4b,c). Lectin genes with similarities to mammalian rather than nematode lectins have also been observed in the parasitic nematodes Ascaris suum and Toxocara canis and might help suppress host immune responses[52,53]. We also observed a previously undescribed gene family upregulated at 12 d after infection, with members not only in strongylid parasites (A. ceylanicum, N. americanus, H. contortus and Heterorhabditis bacteriophora) but also in related non-parasitic clade V species (C. elegans, Caenorhabditis briggsae and P. pacificus; Fig. 5, Supplementary Fig. 5 and Supplementary Tables 12c and 18). We thus named this family secreted clade V proteins (SCVPs). In A. ceylanicum, 53 SCVP genes encoded ~150-residue proteins, of which 48 were predicted to be classically secreted (Supplementary Table 4). Whereas N. americanus and H. contortus had 11 to 101 SCVP genes, other nematodes had only 1 to 6, suggesting an expansion of SCVP genes in mammalian-parasitic nematodes analogous to those observed for ASP and ASPR genes.
Figure 5

Phylogeny of SCVPs from A. ceylanicum and other nematodes. A maximum-likelihood phylogeny of SCVPs (Supplementary Fig. 5 and Supplementary Tables 4 and 18) is shown. Species are indicated by color: the hookworms A. ceylanicum and N. americanus are shown in green and olive green, respectively; H. contortus is shown in orange; the free-living Caenorhabditis nematodes (C. elegans and C. briggsae) and P. pacificus are shown in blue and light blue; and H. bacteriophora, an insect parasite, is shown in purple. Confidence values are given as decimal fractions (Supplementary Fig. 5b). The SCVP phylogeny falls into five branches: two large, independent gene expansions in hookworms (green); two more branches in H. contortus (orange); and one small branch for non-parasitic nematodes (blue). Like ASPs, SCVPs appear to have existed as a small gene family in free-living nematodes but then to have expanded greatly in both hookworms and other mammalian parasites.

A key motivation for parasite genomics is to identify targets for drugs or vaccines. Because drug development often fails[54], it is essential to identify as many targets as possible. Four drug targets (adenylosuccinate lyase, carnitine O-palmitoyltransferase, dTDP-4-dehydrorhamnose 3,5-epimerase and trehalose-6-phosphatase) have recently been identified in H. contortus and N. americanus[20,24,55-59]. All four are encoded by genes with A. ceylanicum orthologs (Supplementary Table 4). To identify additional drug targets across the genome, we searched for genes that were conserved by diverse parasites but absent from mammals, might be essential for survival in the host (determined on the basis of C. elegans loss-of-function phenotypes), had homologs with known three-dimensional protein structures and had at least one homolog bound by a known small molecule (Supplementary Fig. 6). This screen yielded 72 genes in A. ceylanicum, one of which (Acey_s0015.g2804) encoded trehalose-6-phosphatase (Table 1 and Supplementary Tables 4, 19 and 20).
Table 1

Summary of predicted drug targets in A. ceylanicum

Protein class | A. ceylanicum genes | Key C. elegans genes | Drug data
4-coumarate:coenzyme A ligase, class I | 10 | acs-10 | NA
Ammonium/urea transporter | 5 | amt-2 | NA
Cofactor-independent phosphoglycerate mutase | 1 | ipgm-1 | Limited druggability
Fumarate reductase | 1 | F48E8.3 | NA
Glutamate-gated chloride channel | 10 | avr-14, avr-15, glc-2 | avr-14 observed
Glutamate synthase | 1 | W07E11.1 | NA
Glutamine-fructose 6-phosphate aminotransferase | 3 | gfat-1, gfat-2 | NA
Isocitrate lyase/malate synthase | 2 | icl-1 | NA
KH-domain RNA binding | 5 | asd-2, gld-1, K07H8.9 | NA
Malate/L-lactate dehydrogenase, YlbC type | 4 | F36A2.3 | NA
NADH:flavin oxidoreductase, Oye2/3 type | 14 | F17A9.4 | NA
Nematode prostaglandin F synthase | 3 | C35D10.6 | NA
O-acetylserine sulfhydrylase | 2 | cysl-1 | NA
Secreted lipase | 6 | lips-8, lips-9 | NA
Trehalose-6-phosphate synthase | 5 | gob-1, tps-1, tps-2 | gob-1 predicted

Predicted drug targets, encoded by 72 genes in A. ceylanicum (Supplementary Table 4), are listed by their protein class. For each class, the number of A. ceylanicum genes encoding it is listed, along with C. elegans homologs that have mutant or RNA interference (RNAi) phenotypes and data indicating whether the drug target is likely to work. avr-14 was recently shown to be a drug target of nitazoxanide[67]; ipgm-1, previously detected as a promising target, was found to encode a poorly druggable protein[68]; and gob-1 encodes trehalose-6-phosphatase, a predicted drug target in H. contortus[20]. “NA” indicates protein classes for which we are not aware of pertinent data. References for all drug target classes (and their drug data, if any) are given in Supplementary Table 19.
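The genome-wide drug-target screen summarized in Table 1 is a conjunction of criteria: conserved in diverse parasites, absent from mammals, likely essential (judged from C. elegans loss-of-function phenotypes), with a homolog of known three-dimensional structure and a homolog bound by a known small molecule. The sketch below expresses that logic as a simple filter. The record type, field names and toy gene identifiers are hypothetical; how each criterion was actually evaluated is described in the Supplementary Note, not here.

```python
from dataclasses import dataclass


@dataclass
class GeneEvidence:
    """Per-gene evidence flags (hypothetical field names)."""
    gene_id: str
    conserved_in_parasites: bool
    has_mammalian_homolog: bool
    essential_in_c_elegans: bool        # loss-of-function phenotype in C. elegans
    homolog_has_3d_structure: bool
    homolog_binds_small_molecule: bool


def passes_drug_target_screen(gene):
    """True only if the gene satisfies every criterion of the screen."""
    return (
        gene.conserved_in_parasites
        and not gene.has_mammalian_homolog
        and gene.essential_in_c_elegans
        and gene.homolog_has_3d_structure
        and gene.homolog_binds_small_molecule
    )


def screen(genes):
    """Return the IDs of all genes that pass the screen, in input order."""
    return [g.gene_id for g in genes if passes_drug_target_screen(g)]


# Toy records with made-up identifiers and flags.
toy_genes = [
    GeneEvidence("toy_gene_1", True, False, True, True, True),
    GeneEvidence("toy_gene_2", True, True, True, True, True),    # mammalian homolog
    GeneEvidence("toy_gene_3", True, False, True, True, False),  # no ligand data
]
print(screen(toy_genes))  # prints ['toy_gene_1']
```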

Vaccine targets should be both immunologically accessible and crucial for survival. Proteases meet these requirements, as they are expressed in the intestine (and thus exposed to the host’s immune system) and because, without them, hookworms cannot digest host proteins such as hemoglobin[36]. We thus selected genes encoding proteases that were permanently upregulated by 5 d after infection and that lacked mammalian orthologs but had H. contortus homologs that are also upregulated during infection[21]. This screen yielded 12 cathepsin B–like protease genes, with 4 orthologs in H. contortus; by 19 d after infection, 5 of these 12 genes generated 1% of all transcripts (Supplementary Table 4). Because protease inhibitors were also upregulated during early infection, we searched for ones meeting our criteria; this screen yielded a previously undescribed protease inhibitor predicted to be a 79-residue secreted protein with consistently strong expression (~0.1% of all adult transcripts) and one H. contortus homolog upregulated during infection.

The sequencing of A. ceylanicum adds to a growing number of genomes for parasitic nematodes that, collectively, infect over 1 billion humans[60]. Practically, these genomes will be crucial for inventing new drugs and vaccines against nematodes that rapidly evolve drug resistance[61] and that have been parasitizing vertebrates since the Cretaceous[62]. Understanding immunosuppression by parasitic nematodes might also help alleviate autoimmune disorders, which may be partly due to improved hygiene ridding humans of chronic worm infections[63]. Intellectually, understanding these genomes may illuminate remarkable evolutionary changes. Parasitism allows adult nematodes to grow larger and live longer than their free-living relatives (N. americanus adults are ~1 cm long and live for 3–10 years, whereas C. elegans adults are ~1 mm long and live for 3 weeks), but the genomic changes underlying these adaptations are essentially unknown[1,64-66]. The genome and transcriptome of A. ceylanicum should provide lasting benefits for biology and medicine.

URLs. FigTree, http://tree.bio.ed.ac.uk/software/figtree/; Gene Ontology term tables, http://archive.geneontology.org/full/; modENCODE, http://www.modencode.org/; NCoils, http://www.russell.embl-heidelberg.de/coils/coils.tar.gz; protocols by S. Kumar for running Blast2GO, InterProScan and MAKER2, https://github.com/sujaikumar/assemblage/blob/master/README-annotation.md; RepBase, http://www.girinst.org/server/RepBase/protected/RepBase19.02.fasta.tar.gz.

ONLINE METHODS

General summary

Culture and infection of A. ceylanicum in golden hamster (Mesocricetus auratus) were carried out as described[69]. All housing and care of laboratory animals used in this study conformed with the US National Institutes of Health Guide for the Care and Use of Laboratory Animals in Research (see 18-F22) and all requirements and all regulations issued by the US Department of Agriculture (USDA), including regulations implementing the Animal Welfare Act (Public Law 89-544, US Statutes at Large) as amended (see 18-F23). Stages of A. ceylanicum selected for developmental RNA-seq are shown in Figure 1 and listed in Supplementary Table 6; they are based on previously described stages of growth in golden hamster[10]. Genomic sequencing and RNA-seq were carried out largely as described[70]. The numbers of A. ceylanicum and hamsters used for A. ceylanicum RNA-seq are listed in Supplementary Table 21. The A. ceylanicum genomic sequence was assembled from paired Illumina 100-nt reads (550 nt and 6 kb apart) with Velvet (1.2.05)[18], gaps were closed after assembly with BGI GapCloser 1.12 (release_2011)[71] and the sequence was reduced in possible heterozygosity[72] with HaploMerger (20111230)[73]. Genomic RNA scaffolding was performed by filtering RNA-seq reads with khmer[74] and then scaffolding with ERANGE (3.2)[17]. RNA-seq reads were assembled into cDNA with Oases (0.2.07)[75]. Assembled cDNAs (Supplementary Table 2) were used both to assess genome completeness and to aid in the prediction of protein-coding genes. The true genomic size of A. ceylanicum was estimated by counting 31-mer frequencies with SOAPdenovo (V1.05)[71], by CEGMA (v2.4.010312) (Supplementary Table 3)[76] and by mapping cDNAs to genomic DNA with BLAT (v. 34)[77]. Repetitive DNA elements in the final genome assembly were identified with RepeatScout (1.0.5)[78]. 
We predicted protein-coding genes for our final genomic assembly with AUGUSTUS (2.6.1)[25], after generating species-specific parameters with one round of MAKER2 (2.26-beta)[79] (see URLs for the protocol by S. Kumar for running MAKER2) and using hints from cDNA that had been mapped to the genome assembly with BLAT. For predicted A. ceylanicum proteins, we annotated signal and transmembrane sequences with Phobius[80], low-complexity regions with SEG[81], coiled-coil domains with NCoils[82], Pfam 26.0 domains (from both Pfam-A and Pfam-B)[83] with HMMER 3.0/hmmsearch[84], InterPro domains with InterProScan (4.8)[85] and GO terms with Blast2GO 2.5 (build 23092011)[33] (see URLs for protocols for running Blast2GO and InterProScan). We also assigned GO terms to C. elegans and H. contortus genes with Blast2GO so that comparisons of GO terms between different nematode species would be based on equivalent GO term assignments. We performed InterProScan and Blast2GO for both A. ceylanicum and C. elegans. We computed orthologies with OrthoMCL (1.3)[86]. Strict orthologies between genes from two or more species were defined as those orthology groups that contained only one predicted gene for each of the species. Annotations for protein-coding genes are listed in Supplementary Table 4.

For RNA-seq analysis of C. elegans, we used published developmental data from the modENCODE consortium (Supplementary Table 22)[42]. For H. contortus, we used published developmental RNA-seq data[21]. We mapped RNA-seq reads to genes with Bowtie 2 (ref. 87) and quantified gene expression with RSEM (1.2.0)[88]. For individual genes, we computed the significance for changes in gene activity between stages or biological conditions (Supplementary Tables 7 and 23) with NOISeq-sim (2.13)[89]. Because we had only one biological replicate per condition, we sampled five random subsets of RNA-seq data per condition to estimate the significance of changes in gene activity. For A. ceylanicum, C. elegans and H. contortus, we used FUNC 0.4.5 with Wilcoxon rank-sum statistics[34] to compute which GO terms were significantly associated with genes up- or downregulated between developmental stages or environmental conditions (for example, changes of drug treatment). For A. ceylanicum, we also used rank-sum statistics to compute such associations for protein families.

For phylogenetic analyses, sequences homologous to a protein or single domain were extracted with psi-BLAST[90] or HMMER/jackhmmer. Protein sequences were aligned with MUSCLE (3.8.31)[91] or MAFFT (v7.158b)[92]; alignments were edited with Trimal (v1.4.rev15)[93] and visualized with JalView (2.8)[94]. Protein maximum-likelihood phylogenies and their branch confidence levels were computed with FastTree (2.1.7)[95] and visualized with FigTree 1.4 (see URLs). Some details of these methods are provided below; considerably more extensive details are provided in the Supplementary Note.

Assessing the completeness of genomic DNA

We estimated the assembly’s completeness as 98% by computing the frequencies of 31-mers[71], as 91–99% by searching for conserved eukaryotic genes (Supplementary Table 3)[76] and as 93% by mapping cDNA (assembled independently from RNA-seq reads) to genomic DNA: these calculations supported a consensus value of 95%. The average number of orthologs observed for full-length core eukaryotic genes[76] was 1.13, which matched averages of 1.11–1.15 in C. elegans, C. briggsae and Caenorhabditis tropicalis (all of which are hermaphrodites and thus are completely homozygous), suggesting that the assembly was largely free of unresolved heterozygosity. We searched the genome for tRNA genes with tRNAscan-SE-1.3.1 (ref. 96); this analysis detected a full complement of 426 tRNAs decoding all 20 standard amino acids and one selenocysteine tRNA (Supplementary Table 24).

Examining repetitive elements for possible horizontal gene transfer

In A. caninum, the repetitive element bandit resembles the HSMAR1 mariner-like transposon of humans and has been postulated to arise from a mammalian host by HGT[97]. To determine whether a bandit homolog also existed in A. ceylanicum, we searched our library of A. ceylanicum repetitive elements with the DNA sequence for bandit via BLASTN (2.2.26+)[90] (arguments: “-task blastn -evalue 1e-03”). This analysis yielded two hits, with E values of 0.0 and 7 × 10^−170 (Supplementary Table 25). Phylogenetic analysis (Supplementary Fig. 7) and domain analysis with HMMER/hmmsearch indicated that the higher-scoring hit represented an A. ceylanicum homolog of bandit, whereas the lower-scoring hit represented a partial homolog of bandit that did not encode a transposase domain (Transposase_1/PF01359.13 in Pfam). To examine whether more evidence for lateral acquisition of repetitive elements existed in human hookworms, we used the DFAM database[98] to identify repetitive DNA elements in A. ceylanicum and N. americanus with similarity to human repetitive elements. This analysis identified two classes of elements with mammalian similarities, L3/Plat_L3-like retrotransposons and HSMAR1/2-like mariner elements (Supplementary Table 25). To determine whether these similarities were adventitious or real, we computed maximum-likelihood phylogenies for reverse-transcriptase domains (for L3/Plat_L3-like elements) and transposase domains (for HSMAR1/2-like elements). These phylogenies included all of the L3/Plat_L3-like and HSMAR1/2-like repetitive elements that we could detect in A. ceylanicum and N. americanus, in a diverse set of other published genome sequences from nematodes, vertebrates, arthropods, lophotrochozoans and deuterostomes (Supplementary Table 26) and in a curated collection of eukaryotic elements from RepBase[99] (see URLs for source).
We extracted well-aligned, full-length protein domains from repetitive elements by requiring that they match the Pfam domains Transposase_1/PF01359.13 (for HSMAR1/2-like elements) or RVT_1 (reverse transcriptase)/PF00078.22 (for L3/Plat_L3-like elements) and also by excluding the shortest 10% of domain matches. These criteria led us to select 988 Plat_L3/L3-like RVT_1/PF00078.22 peptides (Supplementary Table 27a) and 168 HSMAR1/2-like Transposase_1/PF01359.13 peptides (Supplementary Table 27b), which we subjected to multiple-sequence alignment and phylogenetic analysis.
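The length filter described above (excluding the shortest 10% of domain matches) can be sketched as follows. This is an illustration only: the function name, the tuple layout, rounding the number of dropped matches down, and breaking length ties by input order are all assumptions, since the text does not specify them.

```python
def drop_shortest(matches, percent=10):
    """Remove the shortest `percent`% of domain matches.

    `matches` is a list of (name, length) tuples. The number removed is
    rounded down, ties are broken by input order, and the surviving
    matches are returned in their original order.
    """
    n_drop = len(matches) * percent // 100
    # Indices sorted by match length (sorted() is stable, so ties keep order).
    by_length = sorted(range(len(matches)), key=lambda i: matches[i][1])
    dropped = set(by_length[:n_drop])
    return [m for i, m in enumerate(matches) if i not in dropped]


# Toy example with hypothetical domain lengths: one truncated match among ten.
toy = [("dom%d" % i, length)
       for i, length in enumerate([50, 200, 210, 190, 205, 220, 198, 215, 202, 207])]
kept = drop_shortest(toy)
print(len(kept), kept[0])  # prints 9 ('dom1', 200)
```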

Analyzing protein-coding genes

For motif searches or OrthoMCL analyses of protein sequences, we used nematode and mammalian proteomes from genomic sequences and partial nematode proteomes from translated ESTs. All proteomes and their sources are listed in Supplementary Table 28. We classified A. ceylanicum, H. contortus and C. elegans genes both by known protein motifs (through HMMER 3.0/Pfam-A 26 and InterProScan 4.8)[83-85] and evolutionary relationships to genes in different species (through OrthoMCL 1.3)[86]. Pfam-A domains were detected at a threshold of E ≤ 1 × 10^−5; InterProScan and OrthoMCL were run with default parameters. We used Pfam-A and InterPro motifs, in turn, to assign GO terms to each gene with Blast2GO 2.5 (build 23092011)[33]. We performed InterProScan and Blast2GO according to available protocols (see URLs); for Blast2GO, we used both InterProScan predictions and BLASTP results against an animal-specific subset of the NCBI nr database (NCBI-nr)[100]. We computed orthology groups for our A. ceylanicum genes with OrthoMCL (1.3)[86], for numbers of species ranging from 4 to 14 (Supplementary Tables 4 and 28). Strict orthologies between genes of two or more species were defined as those orthology groups that contained only one predicted gene for each of those species (Fig. 2). Strict orthologies allowed us to compare transcriptional profiles between A. ceylanicum and C. elegans and to thereby identify a set of 406 A. ceylanicum genes that were strongly expressed under all conditions for which we had RNA-seq data from either A. ceylanicum or C. elegans.
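The definition of strict orthology given above (an orthology group containing only one predicted gene for each species being compared) can be sketched as a filter over precomputed groups. The input layout, species labels and gene names below are hypothetical and do not reflect OrthoMCL's output format; the sketch also assumes that genes from species outside the comparison are ignored, which the text does not state explicitly.

```python
def strict_orthology_groups(groups, species):
    """Select groups with exactly one gene from each species of interest.

    `groups` maps a group ID to a list of (species, gene) pairs.
    Returns {group_id: {species: gene}} for the groups that qualify.
    Genes from species not listed in `species` are ignored (an assumption).
    """
    wanted = set(species)
    strict = {}
    for group_id, members in groups.items():
        per_species = {}
        for sp, gene in members:
            if sp in wanted:
                per_species.setdefault(sp, []).append(gene)
        # Every species of interest must contribute exactly one gene.
        if all(len(per_species.get(sp, [])) == 1 for sp in wanted):
            strict[group_id] = {sp: per_species[sp][0] for sp in wanted}
    return strict


# Toy orthology groups with made-up identifiers.
toy_groups = {
    "OG1": [("acey", "a1"), ("cele", "c1")],
    "OG2": [("acey", "a2"), ("acey", "a3"), ("cele", "c2")],  # paralogs: not strict
    "OG3": [("acey", "a4")],                                  # no cele member
    "OG4": [("acey", "a5"), ("cele", "c3"), ("hcon", "h1"), ("hcon", "h2")],
}
print(sorted(strict_orthology_groups(toy_groups, ["acey", "cele"])))
# prints ['OG1', 'OG4']
```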

Searching for horizontal gene transfer of protein-coding genes

To find possible cases of HGT of protein-coding genes from non-nematodes to A. ceylanicum, we used both orthologies (strict and non-strict) and Pfam-A domains (computed for all proteomes as with A. ceylanicum). Orthologies were considered to represent possible instances of HGT if they included A. ceylanicum, Homo sapiens and Mus musculus but did not include C. elegans, C. briggsae, P. pacificus, Bursaphelenchus xylophilus or Meloidogyne hapla. Sets of genes encoding a shared Pfam-A domain were likewise considered to contain possible instances of HGT if the domains were present in A. ceylanicum and mammals (at E ≤ 1 × 10⁻⁶) but absent in C. elegans, C. briggsae, P. pacificus, B. xylophilus and M. hapla (at E ≤ 1 × 10⁻⁵). Out of 33,243 orthology groups and 3,545 Pfam-A domains, we found 52 and 15 (respectively) that were instances of possible HGT. Each possible instance of HGT in A. ceylanicum was individually checked by BLASTP searches of NCBI-nr. In most cases, BLASTP showed similarities to C. elegans and other nematodes, which marked the putative HGTs as false positives. However, we also identified (through Pfam-A domains) one A. ceylanicum gene with strong similarity to bacterial amiD, Acey_s0012.g1873. To search for other such homologs, we reran our motif searches without the requirement for mammalian hits, but, on further testing with BLASTP against NCBI-nr, no other bacterial sequences were found.
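The presence/absence criterion for orthology groups amounts to a set test, sketched below. The short taxon codes are hypothetical labels for the species named above, and the function covers only the initial screen; the follow-up BLASTP checks against NCBI-nr are not modeled.

```python
from __future__ import annotations

# Hypothetical taxon codes for the species named in the text.
REQUIRED = {"acey", "hsap", "mmus"}  # A. ceylanicum, H. sapiens, M. musculus
EXCLUDED = {"cele", "cbri", "ppac", "bxyl", "mhap"}  # the five other nematodes


def is_hgt_candidate(group_species: set[str]) -> bool:
    """True if a group contains all required species and none of the excluded ones."""
    has_all_required = REQUIRED <= group_species
    has_any_excluded = bool(EXCLUDED & group_species)
    return has_all_required and not has_any_excluded


def screen_groups(groups: dict[str, set[str]]) -> list[str]:
    """Return the IDs of groups that pass the screen, in sorted order."""
    return sorted(gid for gid, spp in groups.items() if is_hgt_candidate(spp))
```

Species outside both sets (for example, other strongylids) do not affect the outcome in this sketch, which matches the criterion as worded but remains an interpretation.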

Phylogenetic analysis of lectin homologs from metazoa

In addition to the amiD homolog Acey_s0012.g1873, we also observed eight A. ceylanicum genes that were more similar to vertebrate lectins than to nematode ones (Supplementary Table 17): these fell into three classes, showing similarity to mannose receptor (MRC), asialoglycoprotein receptor (ASGR) or neurocan (NCAN). We phylogenetically compared their domains to nematode, arthropod, deuterostome and lophotrochozoan proteomes, along with a small number of added individual nematode lectins that had been characterized because of their previously reported similarities to mammalian proteins (species listed in Supplementary Table 29; sources of proteome sequences listed in Supplementary Table 28). To avoid misalignments and spurious similarities between multidomain proteins, we analyzed individual C-lectin domains rather than full-length lectin proteins; to identify coherent sets of homologs, we searched the custom proteome database with single-domain query sequences via psi-BLAST (2.2.26+)[90,101], run for either three or four rounds at an inclusion threshold of E ≤ 1 × 10⁻²⁰. The query sequences used, with the corresponding numbers of psi-BLAST rounds, are listed in Supplementary Table 30. The resulting single-domain matches were realigned with MUSCLE (3.8.31) and phylogenetically analyzed as above. For each lectin class, the sequences in each resulting phylogeny are listed in Supplementary Table 31.
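As a rough illustration of the search settings, the sketch below assembles a BLAST+ `psiblast` command line with the stated number of rounds and inclusion threshold. The file and database names are hypothetical, the command is only constructed and not run, and the authors' actual invocation is not given in the text, so any other options they used are unknown.

```python
from __future__ import annotations


def psiblast_command(
    query_fasta: str,
    database: str,
    rounds: int,
    inclusion_evalue: float = 1e-20,
    out_path: str = "psiblast_hits.txt",  # hypothetical output file name
) -> list[str]:
    """Build an argument list for BLAST+ psiblast (not executed here)."""
    if rounds not in (3, 4):
        raise ValueError("the text describes searches of three or four rounds")
    return [
        "psiblast",
        "-query", query_fasta,
        "-db", database,
        "-num_iterations", str(rounds),
        "-inclusion_ethresh", f"{inclusion_evalue:g}",
        "-out", out_path,
    ]
```

The resulting list could be passed to `subprocess.run` on a system with BLAST+ installed and a formatted protein database available.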

Phylogenetic analysis of amiD homologs from metazoa and bacteria

We first characterized non-bacterial and bacterial homologs of Acey_s0012.g1873 with BLASTP searches of NCBI-nr. This analysis yielded matches to sequences from the hookworms A. ceylanicum (our own data, deposited into GenBank) and N. americanus; it also gave nine matches to non-bacterial sequences from arthropods and basal animals (Supplementary Table 32). To more rigorously determine the phylogenetic origin of the amiD genes in the hookworms A. ceylanicum and N. americanus, we generated a phylogeny for the entire Amidase_2 superfamily (N-acetylmuramoyl-L-alanine amidase; Pfam 27.0 motif PF01510.20), of which bacterial amiD genes represent one of four major subdivisions[102]. We searched all of the proteomes listed in Supplementary Table 29, along with all of the individual metazoan amiD homologs listed in Supplementary Table 32, additional proteomes from arthropods, two metagenomes (from human stool and cow rumen) and the entire 9 July 2014 release of UniProt[103]. Species and data sources for additional proteomes are listed in Supplementary Table 33; source files are listed in Supplementary Table 28. We extracted subsequences matching the Amidase_2/PF01510.20 domain, realigned them with MAFFT v7.158b and phylogenetically analyzed them as above.
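Extracting the subsequence that matches a domain can be sketched as a coordinate slice followed by FASTA formatting. The sketch assumes 1-based, inclusive match coordinates (as reported by common domain-search tools) and an `id/start-end` naming scheme for the extracted records; both are illustrative choices rather than details from the text.

```python
from __future__ import annotations


def extract_domain(sequence: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) of a protein sequence."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("coordinates fall outside the sequence")
    return sequence[start - 1:end]


def domain_fasta(seq_id: str, sequence: str, start: int, end: int) -> str:
    """Format one extracted domain as a FASTA record named 'id/start-end'."""
    subsequence = extract_domain(sequence, start, end)
    return f">{seq_id}/{start}-{end}\n{subsequence}\n"
```

Records produced this way can be concatenated into a single file and given to an aligner such as MAFFT.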