Evolutionary genomics and population structure of Entamoeba histolytica.

Koushik Das, Sandipan Ganguly.

Abstract

Amoebiasis, caused by the gastrointestinal parasite Entamoeba histolytica, has diverse disease outcomes. Studying the genome and evolution of this parasite will help us understand the basis of its virulence and explain why, when and how it causes disease. In this review, we summarize current knowledge of the evolutionary genomics of E. histolytica and discuss its association with parasite phenotypes and differential pathogenic behavior. We also discuss how genetic diversity reveals parasite population structure, and we highlight open questions about the evolution and population structure of the parasite that remain to be addressed. The large body of genomic data now available will improve our knowledge of this pathogenic species of Entamoeba.

Keywords:  Disease outcome; Genetic polymorphism; Genetic recombination; Genotyping; Short tandem repeat loci; Single nucleotide polymorphism

Year:  2014        PMID: 25505504      PMCID: PMC4262060          DOI: 10.1016/j.csbj.2014.10.001

Source DB:  PubMed          Journal:  Comput Struct Biotechnol J        ISSN: 2001-0370            Impact factor:   7.271


Introduction

Amoebiasis, caused by the gastrointestinal parasite Entamoeba histolytica, is one of the major parasitic diseases after malaria and is responsible for approximately 100,000 human deaths per annum [1]. The parasite has a two-stage life cycle that alternates between an infective cyst form and a motile, pathogenic trophozoite form. Infection is endemic in many developing countries where poor sanitation and malnutrition are common. In some developed countries, infection can be restricted to a particular population (for example, the male homosexual population in Japan) [2,3]. A global prevalence estimate made in 1986 suggested that 10% of the world population was infected by this parasite [4]. E. histolytica infection has variable outcomes: about 90% of infected individuals remain asymptomatic, while only 10% develop symptoms of invasive amoebiasis [5,6]. However, the global prevalence was estimated before E. histolytica was differentiated from its non-pathogenic sibling species Entamoeba dispar in 1993 [7]. Even allowing for this epidemiological revision, invasive amoebiasis is still a relatively rare outcome of E. histolytica infection. The specific determinants of these diverse outcomes remain obscure, but host genetics and parasite genotype are two possible factors [8,9]. The search for parasite genetic traits that are directly linked to virulence or associated with disease outcome motivates a substantial area of Entamoeba research. Intra- and inter-specific genomic comparisons have been conducted to identify parasite genetic factors linked to virulence or associated with differential disease-causing abilities [10-13]. These studies also provide valuable information about the evolution and population structure of the parasite. This review discusses recent information on the evolutionary genomics of E. histolytica and its association with parasite phenotype and virulence, describes how genetic diversity reveals parasite population structure, and emphasizes open questions about the evolution and population structure of the parasite.

Whole-genome sequences of Entamoeba species

Several species of Entamoeba infect a wide range of hosts [14]. Simple morphological characteristics, such as the number of nuclei per cyst, have been used to distinguish between species [15]. However, morphological variation does not always reflect species-level differences, and significant genetic diversity exists among morphologically indistinguishable organisms [15]. Some species, such as the oral parasite Entamoeba gingivalis, do not produce cysts [14]. Phylogenetic analysis of SSU rRNA gene sequences suggested that E. dispar, Entamoeba nuttalli and Entamoeba moshkovskii are closely related to E. histolytica, while Entamoeba invadens and Entamoeba coli are distantly related [15]. E. dispar, which is morphologically identical to E. histolytica, is usually considered an avirulent commensal of the human gut [14]. However, a recent study suggested that one strain of E. dispar (ICB-ADO), isolated from a Brazilian patient, can cause amoebic liver abscess (ALA) in hamsters [16]. E. moshkovskii is microscopically indistinguishable from E. histolytica and E. dispar in both its cyst and trophozoite forms. It was initially thought to be a free-living protozoan species [17], but a recent study suggested that E. moshkovskii infects humans and causes diarrhea and colitis in infants [17]. E. dispar infection is, in general, much more common than E. histolytica infection worldwide [18]. Because the worldwide prevalence of E. histolytica infection [4] was estimated before E. histolytica was genetically discriminated from E. dispar, the prevalence value may be seriously erroneous, and E. dispar could contribute to the prevalence figures in endemic areas [19]. E. moshkovskii is found more frequently in regions where amoebiasis is highly prevalent [19,20]. Entamoeba bangladeshi, recently discovered in Bangladesh, grouped clearly with the clade of Entamoeba species that infect humans, including E. histolytica [21]. E. invadens is a reptilian parasite and an important model for the encystation process: it can be induced to encyst in axenic laboratory culture, whereas encystation has not yet been achieved in axenically grown E. histolytica trophozoites [14].
The genome sequence of E. histolytica strain HM1:IMSS was published and analyzed in 2005 [22-24]. The genome assembly contains 20,800,560 bp of DNA in 1496 scaffolds. The genome has a high AT content (approximately 75%). Approximately half of the assembled sequence is predicted to be coding, with 8333 annotated genes [14]. The genome assembly of E. dispar strain SAW760 is similar in size to that of E. histolytica strain HM1:IMSS: it consists of 22,955,291 bp of DNA in 3312 scaffolds. Its AT content (approximately 76.5%) is also similar to that of E. histolytica strain HM1:IMSS, and 50% of the assembled sequence is predicted to be coding, with 8749 annotated genes [14]. The genome assembly of E. invadens strain IP1 is larger than those of E. histolytica strain HM1:IMSS and E. dispar strain SAW760. It contains 40,888,805 bp of DNA in 1149 scaffolds. Its AT content is lower (approximately 70%), and approximately 38% of the assembled sequence is predicted to be coding, with 11,549 annotated genes. According to AmoebaDB version 4.1 [25, www.amoebadb.org], the genome assembly of E. moshkovskii strain Laredo consists of 25,250,000 bp of DNA in 1147 scaffolds, has an AT content of approximately 64%, and contains 12,518 annotated genes. The same AmoebaDB release lists the genome assembly of E. nuttalli strain P19 as 14,399,953 bp of DNA in 5233 scaffolds, with an AT content of approximately 75% and 6187 annotated genes.
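The assembly figures quoted above are easier to compare once they are normalised by genome size. The following is a minimal Python sketch, not part of the original review, that uses only the numbers given in the text. The function names and the derived quantities (genes per Mb and mean scaffold length) are illustrative assumptions, not statistics reported by the authors.

```python
# Assembly statistics as quoted in the text:
# (assembly size in bp, number of scaffolds, approximate AT %, annotated genes)
ASSEMBLIES = {
    "E. histolytica HM1:IMSS": (20_800_560, 1496, 75.0, 8333),
    "E. dispar SAW760": (22_955_291, 3312, 76.5, 8749),
    "E. invadens IP1": (40_888_805, 1149, 70.0, 11_549),
    "E. moshkovskii Laredo": (25_250_000, 1147, 64.0, 12_518),
    "E. nuttalli P19": (14_399_953, 5233, 75.0, 6187),
}


def genes_per_mb(size_bp, n_genes):
    """Annotated genes per megabase of assembled sequence."""
    return n_genes / (size_bp / 1_000_000)


def mean_scaffold_kb(size_bp, n_scaffolds):
    """Mean scaffold length in kb, a rough indicator of assembly contiguity."""
    return size_bp / n_scaffolds / 1000


for name, (size_bp, n_scaffolds, at_percent, n_genes) in ASSEMBLIES.items():
    print(
        f"{name:26s} {genes_per_mb(size_bp, n_genes):6.1f} genes/Mb  "
        f"{mean_scaffold_kb(size_bp, n_scaffolds):6.1f} kb/scaffold  AT ~{at_percent}%"
    )
```

Read this way, the larger E. invadens assembly carries fewer annotated genes per Mb than the E. histolytica assembly, which agrees with the lower coding fraction reported for it (38% versus about half).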

Structure and organization of genome

The structure of the E. histolytica genome has been extensively reviewed by Clark et al. [24], who highlighted many interesting evolutionary features. E. histolytica has gained a significant number of metabolic genes (at least 68) through horizontal gene transfer from bacteria [14,22,24]. Orthologues of these genes are found in both E. histolytica and the evolutionarily distant species E. invadens [15], which indicates that the gene transfer is ancient [14]. The haploid genome of E. histolytica strain HK9 is 3 × 10^7 bp in size, based on renaturation kinetics experiments [26]. Hybridization of gene markers to pulsed-field gels identified 14 linkage groups, with 1–4 chromosomes per linkage group per nucleus [27]. Each nucleus of the tetra-nucleated E. histolytica cyst must contain at least one to two genome copies (1n–2n) [28]. However, karyotype analysis of E. histolytica trophozoites revealed at least 4 functional copies of many structural genes, and therefore probably a ploidy that is a multiple of four [28]. Ploidy can vary even within a cell lineage under different growth conditions [28]; this phenomenon has only been studied in vitro, and whether it occurs in nature is not known. The rRNA genes occur on circular DNA molecules that exist in multiple copies per nucleus [29]. These circular structures could be important for determining parasite phenotypes. The rDNA episome varies in size from 15 kb to 25 kb depending on the E. histolytica strain. In the virulent strain HM1:IMSS the episome carries two rDNA units per circle, whereas in the avirulent strain Rahman it carries only a single rDNA unit [30]. Moreover, Jansson et al. reported that structural genes for hemolysins are present within the ribosomal RNA repeat on the extra-chromosomal DNA element of E. histolytica [31]. Initial characterization of the E. histolytica genome revealed some unusual features of its organization. The genome is highly repetitive (about 40% of the sequence is assigned to repetitive elements). Among these elements, tRNA genes are exceptionally abundant, with an estimated 4500 copies (about 10 times the number in the human genome). Moreover, most of these tRNA genes are clustered and organized into 25 distinct arrays. The tRNA arrays are composed of tandemly repeated units encoding between 1 and 5 tRNA acceptor types [32]. The intergenic regions between these tRNA genes comprise short tandemly repeated sequences (STRs), which resemble the micro/mini satellites of other eukaryotic genomes. The only difference is that, unlike randomly dispersed micro/mini satellites, the STRs form part of a larger unit that is itself tandemly arrayed [32]. tRNA genes are thought to be “hotspots” for recombination and mutation because of their unique structural organization [32]. The arrangement of tRNA genes shows inter-specific variation. E. histolytica has 2 versions of the tRNA array containing the AsnGTT and LysCTT genes [i.e. (N-K1) and (N-K2)], while the E. dispar genome contains only 1 type of [N-K] array. E. moshkovskii array units are significantly smaller than their homologs in E. histolytica and E. dispar, and their intergenic regions do not contain any STRs [32]. The STR regions between these tRNA array units show a high degree of intra-specific variation in repeat number, type and arrangement pattern [13]. These features make them very useful as population genetic markers for quantifying the evolutionary divergence of this parasite. The only proposed function of the tRNA array units is nuclear matrix binding [33]. Moreover, circumstantial evidence suggests that they may be located in subtelomeric regions or at chromosome ends and could be functional replacements for traditional telomere repeats [32].
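Because the STR regions vary between isolates in repeat number, one of the simplest quantities to extract from a sequenced locus is the number of consecutive copies of a repeat unit. The sketch below shows one way this could be done in Python. The repeat unit, the flanking sequences and the function name are hypothetical and chosen only for illustration; they are not taken from any real E. histolytica locus, and the typing schemes cited in the text also consider repeat type and arrangement, which this sketch ignores.

```python
import re


def longest_tandem_run(seq: str, unit: str) -> int:
    """Largest number of back-to-back copies of `unit` found anywhere in `seq`."""
    if not unit:
        raise ValueError("repeat unit must be non-empty")
    # The lookahead lets a run start at every position of the sequence,
    # so candidate runs that overlap an earlier match are not skipped.
    pattern = re.compile(f"(?=((?:{re.escape(unit)})+))")
    runs = (len(m.group(1)) // len(unit) for m in pattern.finditer(seq))
    return max(runs, default=0)


# Hypothetical repeat unit and flanking sequence, for illustration only.
UNIT = "TATTCTAA"
isolate_a = "GGCA" + UNIT * 5 + "CCGT"
isolate_b = "GGCA" + UNIT * 10 + "CCGT"

print(longest_tandem_run(isolate_a, UNIT))
print(longest_tandem_run(isolate_b, UNIT))
```

Two isolates that differ only in this count would be scored as different genotypes at the locus, which is the kind of intra-specific variation the markers exploit.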

Genomic rearrangements and transposable elements

Unlike Plasmodium, which has a stable genomic organization even among distantly related species, Entamoeba exhibits a high degree of genomic plasticity and instability [14]. Genome rearrangement associated with tissue invasion and organ tropism has been proposed as one possible explanation for the different tRNA STR genotypes identified in liver abscess-derived and stool-derived parasites from the same infected person [34]. Transposons and repetitive DNA, which are abundant in the Entamoeba genome, may be responsible for genome reorganization [14]. Transposable elements are organized in clusters and are frequently found at syntenic break points, which provides insight into their contribution to chromosome instability and therefore to genomic variation and speciation in these parasites [35]. Investigation of repetitive elements in the genomes of three Entamoeba species identified hundreds of copies of LINEs (long interspersed elements) and SINEs (short interspersed elements) and a large proportion of Entamoeba-specific repeats (ERE1 and ERE2). ERE1 is spread across the three genomes and is associated with different repeats in a species-specific manner [35]. The ERE2 sequence is present exclusively in E. histolytica [14]. LINEs and SINEs are class I transposons, propagated by reverse transcription [36]. Each EhLINE (LINE of E. histolytica) has a single open reading frame with a putative nucleic acid binding motif (CCHC) and a restriction enzyme-like endonuclease domain located downstream of the reverse transcriptase (RT) domain. Phylogenetic analysis of the RT domain placed the EhLINEs in the R4 clade of non-LTR elements, a mixed clade that includes members from nematodes, insects and vertebrates [36]. EhLINEs share a common 3′ end sequence with EhSINEs (SINEs of E. histolytica), which indicates that they are involved in the retrotransposition of EhSINEs. EhSINEs also have a conserved 5′ end, which is involved in the regulation of their transcription [36].
A genome-wide comparison of the locations of LINEs and SINEs in the E. histolytica and E. dispar genomes suggested that SINE expansion took place after the divergence of the two species. However, the basic retrotransposition machinery is conserved in both species [37]. Because LINEs and SINEs can profoundly influence the expression of neighboring genes, their genomic location can affect parasite phenotype [37]. Moreover, a recent study by Yadav et al. [38] suggested that E. histolytica can form recombinant SINEs at high frequency during induced retrotransposition in vivo. DNA transposons (class II transposons) are rare in E. histolytica and E. dispar but much more prevalent in E. invadens and E. moshkovskii [14]. Representatives of three DNA transposase superfamilies (hobo/Activator/Tam3, Mutator, and piggyBac) were identified in Entamoeba, in addition to a variety of members of a fourth superfamily (Tc1/mariner) that had previously been reported only from ciliates and Trichomonas vaginalis among protozoans [39]. Genomic rearrangement might be responsible for the variation in the number of transposable elements in different lineages [14].

Large gene families and their diversities

The genome of E. histolytica contains a number of large multi-gene families [14]. One such family encodes a group of AIG1-like proteins [23]. The AIG1 protein family comprises 29 members distributed in 3 clusters [23]. Eighteen of them lie near transposons, but whether their duplication and subsequent expansion were promoted by the proximity of transposons remains to be explored [23]. AIG1 proteins are associated with resistance to bacteria [40]. Another gene family encodes a group of leucine-rich-repeat (LRR)-containing proteins homologous to the bacterial fibronectin-binding protein BspA of Bacteroides forsythus [41,23]. Lorenzi et al. identified 114 genes encoding BspA-like proteins in the genome of E. histolytica strain HM1:IMSS, 41 of which are associated with transposable elements [23]. Proteins of this family contain a conserved N-terminal domain but no classic membrane-targeting signal [23]. It is therefore tempting to speculate that the conserved N-terminal domain functions as an export signal or as a membrane-anchor domain, or that export involves a non-classical transport mechanism independent of the ER–Golgi pathway, similar to those detected in yeast and mammalian cells [42]. At least one member of this family is expressed on the external surface of the parasite [41]. A genome survey of E. invadens identified multiple copies of these LRR-containing genes, and differential gene expression within gene families has also been reported [43]. However, it is unknown whether gene expression is controlled so that only a single member of a gene family is expressed at any one time, as observed in other parasites such as Trypanosoma and Plasmodium [14]. Entamoeba also encodes a large number of Rab GTPases (as does another protozoan parasite, T. vaginalis), which are involved in vesicular trafficking in the cell [44,45].
A total of 102 Rab GTPases distributed in 16 subfamilies have been annotated in the genome of E. histolytica [44,45]. Most show only moderate similarity to Rab proteins from other organisms; only 22 amoebic Rab proteins, including EhRab1, EhRab2, EhRab5, EhRab7, EhRab8, EhRab11 and EhRab21, show significant similarity [44]. E. invadens, like E. histolytica, has over 100 Rab genes [45]. A comparison of Rab GTPases from E. histolytica and E. invadens revealed that most Rab subfamilies are conserved between these two Entamoeba species [45]. This indicates that the Rab GTPase-controlled vesicular trafficking machinery is well conserved between them and that expansion of the gene family largely occurred before the divergence of the two species [45]. Rab GTPases have been implicated in the regulation of cysteine protease secretion and transport [46,47]. E. histolytica differentially expresses its RabB protein (EhRabB) during phagocytosis of target cells, suggesting a potential role for EhRabB in phagocytosis [48]. When EhRabB was experimentally mutated at amino acid position 118, the resulting protein (RabBN118I) was unable to bind guanine nucleotides and became constitutively inactive [49]. Over-expression of this mutant RabB protein in E. histolytica trophozoites significantly reduced parasite phagocytosis, cytopathic activity and the ability to produce liver abscesses in hamsters [49]. Hence, Rab-regulated vesicular trafficking is important for parasite biology and pathogenesis. Gene families encoding the heavy (hgl) and light (lgl) chain subunits of the virulence determinant Gal/GalNAc lectin are present in multiple Entamoeba species, but genes for the intermediate chain subunit (igl) have been detected only in E. histolytica and E. dispar [24]. Bioinformatic comparison of members of this gene family from E. histolytica and E. dispar found evidence of gene conversion within the lineages, which may play an important role in the molecular evolution of these parasites [50]. Cysteine protease-5, the key virulence factor of E. histolytica, is present as a pseudogene in E. dispar [14]. Over-expression of specific cysteine protease genes (ehcp-b8, ehcp-b9 and ehcp-c13) within parasite cells also confers pathogenicity on the non-pathogenic E. histolytica clone A1 [51]. Southern blot analysis indicates that the ariel surface proteins of E. histolytica are either absent or highly divergent in E. dispar [14].

Genetic diversity and population structure

Since the E. histolytica genome does not appear to contain any microsatellite-like elements, measurement of genetic diversity and estimation of population structure rely heavily on other genetic markers, such as the serine-rich E. histolytica protein (SREHP) gene and the chitinase gene [14]. SREHP is an immunodominant surface antigen involved in the phagocytosis of apoptotic host cells, which prevents an inflammatory response by the host [52], whereas chitinase is expressed only during encystation of the amoeba [53]. Both genes contain tandem repeats that show a high degree of inter-isolate diversity in repeat type and arrangement pattern [2,3,54]. However, the SREHP gene is more polymorphic than chitinase [3]. Since SREHP is highly immunogenic, such high genetic diversity within the SREHP gene may suggest a biological role such as immune evasion [55]. However, PCR amplification of the SREHP gene often produces multiple, mixed PCR bands from a single strain because of allelic variation [18]. Direct sequencing of such mixed PCR products (without cloning the PCR product into a vector before sequencing) gives a chromatogram with multiple peaks at a single nucleotide position. Several variants of a single sequence can be read from such a chromatogram, and these can be misinterpreted as genetic diversity. The tRNA-linked STR loci of E. histolytica have proved to be useful population genetic markers and have been used to identify parasite genotypes associated with different disease outcomes [56]. Studies of genetic diversity based on 6 tRNA-linked STR loci (i.e. D-A, S, N-K2, R-R, A-L and S-Q) have identified a few parasite genotypes associated with disease outcomes [8,13,57,12,58]. For example, 5RR of the R-R locus was associated with asymptomatic outcome, while 10RR was associated with symptomatic outcome [59]. J1DA and VEN2DA of the D-A locus were associated with asymptomatic and symptomatic outcomes, respectively [60].
Although tRNA-linked STR loci show a few associations with disease outcome, they are surrogate markers and their variation is not directly linked to parasite virulence [59]. Moreover, these loci mutate frequently to form new genotypes, so any significant association of a parasite genotype with disease outcome would be lost over time [18]. However, patterns of polymorphism within this repetitive DNA sometimes reflect the population structure of the parasite [14]. For example, in Japan, diversity was high among parasites infecting homosexual men but much more limited among parasites infecting residents of institutions [2]. Similarly, low diversity among parasites infecting residents of institutions was seen in the Philippines, where clear population structure was observed within and between locations [54]. In South Africa, genotypes clustered within households but showed extensive diversity among households [61]. Recently, Zermeno et al. proposed a worldwide genealogy and population structure of E. histolytica based on two tRNA-linked STR loci (i.e. D-A and N-K2) [60]. The majority of these genotypes were exclusive to a particular country, and only a few were shared by isolates from different countries. For example, 18NK, 17NK, 10NK and 11NK of the N-K2 locus and 5DA and 6DA of the D-A locus were the only genotypes distributed across many regions. Among them, 18NK and 6DA, which correspond to the genotype of E. histolytica strain HM1:IMSS, were the most abundant and were widely distributed in countries such as Mexico, Bangladesh, Japan, China and the USA. However, genealogies based on the two individual loci (i.e. D-A and N-K2) suggested that no parasite lineage was related to a particular geographic region. Moreover, concatenated analysis of the two loci revealed the possibility of genetic recombination among the population studied [60].
The genetic organization of E. histolytica populations from stool and liver abscess samples of the same patients has also been studied [34]. The study revealed that the stool and liver abscess populations were genetically distinct [34]. However, a few contrasting but interesting scenarios have also been reported. E. histolytica populations isolated from amoebic liver abscess (ALA) patients were genetically identical to those isolated from asymptomatic patients [57]. This finding was further supported by a recent STR-based genotyping study of E. histolytica from India, in which isolates from asymptomatic infections were genetically closer to those causing liver abscess than to diarrheal isolates (Fig. 1) [12]. Repetitive DNA markers appear to be stable enough to link closely related parasites recently transmitted among members of a household, residents of an institution or recent sexual partners [14]. However, extensive population diversity in limited geographic regions and the frequent occurrence of novel genotypes limit the usefulness of repetitive loci for probing the large-scale, long-term population structure of E. histolytica [14]. SNP (single nucleotide polymorphism) markers may be preferable in these situations.
Fig. 1

E. histolytica isolates remaining asymptomatic are genetically closer to those causing liver abscess, as depicted by (A) a phylogenetic tree and (B) a graphical representation. The phylogeny was based on the tRNA-linked N-K2 (STR) locus. The sequences of all 22 representative STR patterns from the N-K2 locus, obtained from the genetic analysis of 51 study isolates, were aligned using the ClustalW multiple alignment program in MEGA version 4. The phylogenetic tree was constructed from the alignment using a maximum likelihood algorithm under the “General Time Reversible (GTR) + gamma” substitution model in the SeaView graphical interface, version 4. One distinct “D” group, one distinct “LA” group and one mixed “AS + LA” group can be assigned. The “D” group contains STR patterns found exclusively in diarrheal outcome. The “LA” group contains STR patterns found only in liver abscess outcome. The “AS + LA” group contains STR patterns found only in asymptomatic (AS) and liver abscess (LA) outcomes.
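The comparisons of population structure discussed in this section (for example, high genotype diversity in one host group and limited diversity in another) can be made quantitative with a diversity index. The sketch below uses Nei's unbiased gene diversity, a standard population-genetic measure chosen here for illustration; it is not claimed to be the statistic used in the cited studies. The genotype labels and sample compositions are hypothetical.

```python
from collections import Counter


def gene_diversity(genotypes):
    """Nei's unbiased gene diversity: n / (n - 1) * (1 - sum of squared frequencies)."""
    n = len(genotypes)
    if n < 2:
        raise ValueError("need at least two isolates")
    freqs = [count / n for count in Counter(genotypes).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))


# Hypothetical samples: one dominated by a single genotype, one highly mixed.
institution = ["G1"] * 9 + ["G2"]
community = ["G1", "G2", "G3", "G4", "G5", "G6", "G7", "G8", "G1", "G2"]

print(round(gene_diversity(institution), 3))
print(round(gene_diversity(community), 3))
```

The index is 0 when every isolate shares one genotype and 1 when every isolate is distinct, so a sample dominated by one genotype scores far lower than a mixed sample.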

SNPs within non-repetitive loci, whether they arise under neutral, positive or negative selection pressure, are genetically stable and are inherited by descendants [11]. SNP analysis could therefore be a successful strategy for identifying potential parasite virulence markers linked to infection outcome [11,62]. Comparison of the genome sequences of various E. histolytica strains deposited in AmoebaDB version 4.1 [25, www.amoebadb.org] identified a total of 2613 genes that contain intra-species SNPs. Most of the proteins encoded by these genes are hypothetical, while the functions of some are known. A few of these genes, with known and hypothetical functions, are listed in Table 1. A large number of SNPs have been identified in the genes for serine threonine isoleucine rich protein (EHI_073630), Gal/GalNAc lectin lgl2 (EHI_065330), heat shock protein 70 (EHI_159140), tyrosine kinase (EHI_124500), AIG1 family protein (EHI_144270) and Rab family GTPase (EHI_059670), among others. Gal/GalNAc lectin is a surface antigen of Entamoeba involved in parasite adhesion to the intestinal epithelium [50]. AIG1 proteins are associated with resistance to bacteria [40]. Rab GTPases are involved in the vesicular trafficking machinery of the parasite [44,45]. However, further investigation is required to determine the precise functions of the hypothetical proteins listed in Table 1. Homologs of some of these genes are also found in AmoebaDB. A few E. histolytica genes and their homologs are listed in Table 2. A high degree of inter-species genetic variability is also observed between E. histolytica genes and their homologs. A total of 326 SNPs have been identified within the E. dispar hsp70 gene (EDI_012650) in comparison with that of E. histolytica (EHI_159140). Similarly, 520 inter-species SNPs have been detected in the lgl2 homolog (EDI_244250) of E. dispar strain SAW760. The homolog of the AIG1 family protein gene (EDI_001050) contains 144 inter-species SNPs. A total of 103 SNPs were identified in the homolog of the actinin-like protein gene (EDI_207850) in E. dispar strain SAW760. The homolog of the elongation factor alpha 1 gene (EDI_134610) contains a total of 90 SNPs, and 254 SNPs have been identified in the homolog of the inositol polyphosphate-5-phosphatase gene (EDI_159070). Another important virulence factor of E. histolytica is the lysine and glutamic acid rich protein (KERP1). KERP1 is a surface-associated protein that has been shown to be involved in parasite adherence to human enterocytes, and it is also an important virulence factor in liver abscess pathogenesis [63,64]. The kerp1 gene has been found in both E. histolytica (AmoebaDB ID EHI_098210) and E. nuttalli (AmoebaDB ID ENU1_189420) but not in E. dispar [64,65]. Analysis of AmoebaDB version 4.1 [25, www.amoebadb.org] revealed inter-species genetic variability within the kerp1 gene between E. histolytica and E. nuttalli: a total of 10 SNPs were identified within the E. nuttalli kerp1 sequence (ENU1_189420) in comparison with that of E. histolytica (EHI_098210). However, no intra-species genetic variability has been observed within the gene (EHI_098210). Genome sequencing of E. histolytica clinical isolates has also identified SNPs within the cyclicin-2 gene that are significantly associated with asymptomatic and liver abscess outcomes [62]. This indicates that cyclicin-2 could be an important virulence determinant of E. histolytica. Comparative genomic hybridization studies of E. histolytica and E. dispar strains suggested relatively low genomic diversity among E. histolytica [10]. A recent study by Weedall et al. also identified a low level of single nucleotide diversity within E. histolytica populations [66]. Sequence analysis of defined regions suggests similar conclusions [11,67]. Such a low level of genetic diversity suggests a relatively recent common ancestor for E. histolytica [14]. However, this observation is at odds with a recent report by Gilchrist et al. [62]. Multilocus sequence typing of E. histolytica clinical isolates identified extensive population diversity, suggesting that the genotypes of individual parasites do not contain consistent phylogenetic signals. The authors attributed this result to genetic recombination, which can break down the linkage between target loci and so produce loci with different genealogies [62]. Hence, an important question regarding the population structure of Entamoeba is whether the parasite populations are predominantly clonal or sexual.
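The idea that recombination breaks linkage between loci can be illustrated with the classical four-gamete test: under an infinite-sites model without recombination, two biallelic sites can show at most three of the four possible allele combinations, so observing all four suggests recombination (or recurrent mutation). The sketch below is an illustration of that general test, not the analysis performed in the cited study. It assumes phased haplotypes coded as strings of "0" and "1", which real data from a parasite of uncertain ploidy may not readily provide; the example haplotypes are hypothetical.

```python
from itertools import combinations


def four_gamete_pairs(haplotypes):
    """Return (i, j) site pairs at which all four allele combinations are observed.

    Assumes every site is biallelic and every haplotype has the same length.
    """
    if not haplotypes:
        return []
    n_sites = len(haplotypes[0])
    if any(len(h) != n_sites for h in haplotypes):
        raise ValueError("all haplotypes must have the same length")
    pairs = []
    for i, j in combinations(range(n_sites), 2):
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            pairs.append((i, j))
    return pairs


# Hypothetical haplotype sets, for illustration only.
clonal = ["000", "001", "011", "111"]  # consistent with a single tree
mixed = ["000", "011", "101", "110"]   # every site pair shows all four gametes

print(four_gamete_pairs(clonal))
print(four_gamete_pairs(mixed))
```

A predominantly clonal population would be expected to yield few or no such site pairs, whereas frequent recombination would yield many.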
Table 1

Genes of E. histolytica that contain intra-species single nucleotide polymorphisms (SNPs).

AmoebaDB ID | Protein product for this gene | Total SNPs | Non-synonymous SNPs | Synonymous SNPs | Non-sense SNPs | Non-coding SNPs | Non-synonymous SNP/synonymous SNP ratio | SNPs per kb (CDS)
EHI_073630 | Serine threonine isoleucine rich protein, putative | 70 | 46 | 24 | 0 | 0 | 1.92 | 4.6
EHI_065330 | Gal/Gal NAc lectin lgl2 | 27 | 13 | 14 | 0 | 0 | 0.93 | 8.14
EHI_159140 | Heat shock protein 70, putative | 14 | 12 | 2 | 0 | 0 | 6 | 6.94
EHI_006980 | Gal/Gal NAc lectin lgl1 | 13 | 9 | 4 | 0 | 0 | 2.25 | 3.89
EHI_124500 | Tyrosine kinase, putative | 13 | 9 | 4 | 0 | 0 | 2.25 | 1.68
EHI_164190 | DNA polymerase, putative | 12 | 6 | 6 | 0 | 0 | 1 | 3.13
EHI_144270 | AIG1 family protein | 11 | 6 | 5 | 0 | 0 | 1.2 | 12.75
EHI_164440 | Actinin like protein, putative | 9 | 0 | 9 | 0 | 0 | 0 | 5.74
EHI_135220 | Phospholipid transporting p-type ATPase, putative | 9 | 3 | 6 | 0 | 0 | 0.5 | 3.05
EHI_023050 | Protein kinase domain containing protein | 9 | 6 | 3 | 0 | 0 | 2 | 2.55
EHI_035690 | Galactose inhibitable lectin 35 kDa subunit precursor | 8 | 4 | 4 | 0 | 0 | 1 | 8.57
EHI_011210 | Elongation factor alpha 1 | 8 | 0 | 8 | 0 | 0 | 0 | 5.99
EHI_139430 | Leucine rich repeat protein BspA family | 8 | 3 | 5 | 0 | 0 | 0.6 | 3.96
EHI_023430 | Glycosyl hydrolase family 31 protein | 8 | 4 | 4 | 0 | 0 | 1 | 3.06
EHI_042370 | Galactose specific adhesin 170 kDa subunit, putative | 8 | 6 | 2 | 0 | 0 | 3 | 2.05
EHI_013980 | Phosphatidylinositol 3 kinase, putative | 8 | 3 | 5 | 0 | 0 | 0.6 | 1.86
EHI_119600 | Ubiquitin carboxyl terminal hydrolase domain containing protein | 7 | 0 | 7 | 0 | 0 | 0 | 1.8
EHI_059670 | Rab family GTPase | 6 | 2 | 4 | 0 | 0 | 0.5 | 2.66
EHI_160860 | Inositol polyphosphate 5 phosphatase, putative | 6 | 5 | 1 | 0 | 0 | 5 | 2.06
EHI_012270 | Gal/Gal NAc lectin heavy subunit | 6 | 4 | 2 | 0 | 0 | 2 | 1.55
EHI_045170 | U5 SnRNP specific 200 kDa protein, putative | 6 | 4 | 2 | 0 | 0 | 2 | 1.11
EHI_164520 | Iron sulfur flavoprotein pseudogene | 5 | 4 | 1 | 0 | 0 | 4 | 11.36
EHI_001160 | Plasma membrane calcium transporting ATPase, putative | 5 | 1 | 4 | 0 | 0 | 0.25 | 4.42
EHI_000430 | Rap/Ran GTPase activating protein, putative | 5 | 1 | 4 | 0 | 0 | 0.25 | 2.92
EHI_188590 | Long chain fatty acid CoA ligase, putative | 5 | 1 | 4 | 0 | 0 | 0.25 | 2.57
EHI_190880 | Thioredoxin domain containing protein 2, putative | 5 | 4 | 1 | 0 | 0 | 4 | 2.42
EHI_061870 | Hypothetical protein | 22 | 13 | 9 | 0 | 0 | 1.44 | 3.4
EHI_033550 | Hypothetical protein | 17 | 11 | 6 | 0 | 0 | 1.83 | 6.93
EHI_023320 | Hypothetical protein | 17 | 12 | 5 | 0 | 0 | 2.4 | 5.51
EHI_072500 | Hypothetical protein | 15 | 3 | 12 | 0 | 0 | 0.25 | 8.93
EHI_013060 | Hypothetical protein | 15 | 10 | 5 | 0 | 0 | 2 | 3.11
EHI_077290 | Hypothetical protein | 14 | 8 | 6 | 0 | 0 | 1.33 | 5.03
EHI_018390 | Hypothetical protein | 13 | 8 | 5 | 0 | 0 | 1.6 | 4.73
EHI_121060 | Hypothetical protein | 12 | 10 | 2 | 0 | 0 | 5 | 5.75
EHI_059870 | Hypothetical protein | 12 | 8 | 4 | 0 | 0 | 2 | 4.54
EHI_172000 | Hypothetical protein | 12 | 3 | 9 | 0 | 0 | 0.33 | 3.57
EHI_050660 | Hypothetical protein | 12 | 5 | 7 | 0 | 0 | 0.71 | 2.28
EHI_174540 | Hypothetical protein | 11 | 7 | 1 | 0 | 3 | 7 | 2.65
EHI_196760 | Hypothetical protein | 10 | 9 | 1 | 0 | 0 | 9 | 8.71
EHI_174560 | Hypothetical protein | 10 | 8 | 2 | 0 | 0 | 4 | 7.38
EHI_111770 | Hypothetical protein | 10 | 9 | 1 | 0 | 0 | 9 | 5.45
EHI_006990 | Hypothetical protein | 9 | 6 | 3 | 0 | 0 | 2 | 7.27
EHI_025310 | Hypothetical protein | 9 | 4 | 5 | 0 | 0 | 0.8 | 4.34
EHI_077750 | Hypothetical protein | 9 | 7 | 2 | 0 | 0 | 3.5 | 3.64
EHI_103400 | Hypothetical protein | 9 | 0 | 3 | 0 | 6 | 0 | 3.62
EHI_114110 | Hypothetical protein | 9 | 5 | 4 | 0 | 0 | 1.25 | 2.3
EHI_016900 | Hypothetical protein | 8 | 5 | 3 | 0 | 0 | 1.67 | 6.79
EHI_004180 | Hypothetical protein | 8 | 2 | 6 | 0 | 0 | 0.33 | 1.52
EHI_119790 | Hypothetical protein | 7 | 3 | 4 | 0 | 0 | 0.75 | 21.28
EHI_145460 | Hypothetical protein | 7 | 1 | 6 | 0 | 0 | 0.17 | 13.75
EHI_106320 | Hypothetical protein | 7 | 4 | 3 | 0 | 0 | 1.33 | 11.69
EHI_107040 | Hypothetical protein | 7 | 0 | 6 | 0 | 1 | 0 | 7.99
EHI_144390 | Hypothetical protein | 7 | 6 | 1 | 0 | 0 | 6 | 7.39
EHI_017780 | Hypothetical protein | 7 | 2 | 5 | 0 | 0 | 0.4 | 6.56
Table 2

Genes of E. histolytica and their homologs present in AmoebaDB database.

E. histolytica HM1:IMSS AmoebaDB ID | E. histolytica HM1:IMSS protein product | E. dispar SAW760 AmoebaDB ID | E. dispar SAW760 protein product | E. invadens IP1 AmoebaDB ID | E. invadens IP1 protein product | E. moshkovskii Laredo AmoebaDB ID | E. moshkovskii Laredo protein product
EHI_073630 | Serine threonine isoleucine rich protein, putative | EDI_083900 | Hypothetical protein | EIN_092260 | Hypothetical protein | EMO_033950 | Serine threonine isoleucine rich protein, putative
EHI_065330 | Gal/Gal NAc lectin lgl2 | EDI_244250 | Furin repeat containing protein, putative | EIN_065850 | Furin repeat containing protein, putative | EMO_010790 | Gal/Gal Nac lectin lgl2
EHI_159140 | Heat shock protein 70, putative | EDI_012650 | Heat shock protein 70 kDa, putative | a | a | EMO_060560 | Heat shock protein 70, putative
EHI_006980 | Gal/Gal Nac lectin lgl1 | EDI_244250 | Furin repeat containing protein, putative | EIN_065850 | Furin repeat containing protein, putative | EMO_010790 | Gal/Gal Nac lectin subunit lgl2
EHI_124500 | Tyrosine kinase, putative | EDI_004150 | Serine/threonine protein kinase HT1, putative | EIN_000210 | Protein serine/threonine kinase, putative | EMO_009220 | Tyrosine kinase, putative
EHI_164190 | DNA polymerase, putative | EDI_056410 | Hypothetical protein, conserved | EIN_032840 | Hypothetical protein, conserved | EMO_057600 | DNA polymerase, putative
EHI_144270 | AIG1 family protein | EDI_001050 | Hypothetical protein, conserved | a | a | a | a
EHI_164440 | Actinin like protein, putative | EDI_207850 | Grainin, putative | EIN_037840 | Grainin, putative | EMO_010570 | Actinin like protein, putative
EHI_135220 | Phospholipid transporting p-type ATPase, putative | EDI_018000 | Phospholipid transporting ATPase, putative | EIN_038730 | Phospholipid transporting ATPase, putative | EMO_035200 | Phospholipid transporting p-type ATPase, putative
EHI_023050 | Protein kinase domain containing protein | EDI_012370 | Serine–threonine protein kinase, putative | EIN_016310 | Serine–threonine protein kinase, putative | EMO_012200 | Protein kinase domain containing protein
EHI_035690 | Galactose inhibitable lectin 35 kDa subunit precursor | EDI_023210 | Galactose-inhibitable lectin 35 kDa subunit precursor, putative | a | a | EMO_050130 | Galactose-inhibitable lectin 35 kDa subunit precursor
EHI_011210 | Elongation factor alpha 1 | EDI_134610 | Elongation factor 1-alpha | EIN_146970 | Elongation factor 1-alpha, putative | EMO_123750 | Elongation factor 1-alpha 1
EHI_139430 | Leucine rich repeat protein BspA family | EDI_284090 | Hypothetical protein, conserved | EIN_054420 | Hypothetical protein, conserved | EMO_007680 | Leucine rich repeat protein BspA family
EHI_023430 | Glycosyl hydrolase family 31 protein | EDI_137800 | Neutral alpha-glucosidase AB precursor, putative | EIN_108320 | Neutral alpha-glucosidase AB precursor, putative | EMO_112400 | Glycosyl hydrolase, family 31 protein
EHI_042370 | Galactose specific adhesin 170 kDa subunit, putative | EDI_213670 | 170 kDa surface lectin precursor, putative | EIN_068210 | 170 kDa surface lectin precursor, putative | EMO_066770 | Gal/GalNAc lectin heavy subunit
EHI_013980 | Phosphatidylinositol 3 kinase, putative | EDI_147070 | Phosphatidylinositol 3-kinase catalytic subunit gamma, putative | EIN_020710 | Phosphatidylinositol 3-kinase catalytic subunit gamma, putative | EMO_071620 | Phosphatidylinositol 3-kinase, putative
EHI_119600 | Ubiquitin carboxyl terminal hydrolase domain containing protein | EDI_023410 | Ubiquitin specific protease, putative | EIN_200010 | Hypothetical protein | EMO_025900 | Ubiquitin carboxyl-terminal hydrolase domain containing protein
EHI_059670 | Rab family GTPase | EDI_156940 | Trichohyalin, putative | EIN_157460 | Trichohyalin, putative | EMO_059660 | Rab family GTPase
EHI_160860 | Inositol polyphosphate 5 phosphatase, putative | EDI_159070 | Type II inositol-1,4,5-trisphosphate 5-phosphatase precursor, putative | EIN_020640 | Type II inositol-1,4,5-trisphosphate 5-phosphatase precursor, putative | EMO_012640 | Inositol polyphosphate-5-phosphatase, putative
EHI_012270 | Gal/Gal Nac lectin heavy subunit | EDI_213670 | 170 kDa surface lectin precursor, putative | EIN_068210 | 170 kDa surface lectin precursor, putative | EMO_066770 | Gal/GalNAc lectin heavy subunit
EHI_045170 | U5 SnRNP specific 200 kDa protein, putative | EDI_076220 | U5 small nuclear ribonucleoprotein 200 kDa helicase, putative | EIN_093940 | U5 small nuclear ribonucleoprotein 200 kDa helicase, putative | EMO_014940 | U5 snRNP-specific 200 kDa protein, putative
EHI_164520 | Iron sulfur flavoprotein pseudogene | EDI_064980 | Hypothetical protein, conserved | EIN_091700 | Hypothetical protein, conserved | EMO_098730 | Iron–sulfur flavoprotein, putative
EHI_001160 | Plasma membrane calcium transporting ATPase, putative | EDI_013570 | Plasma membrane calcium-transporting ATPase, putative | EIN_222480 | Plasma membrane calcium-transporting ATPase, putative | EMO_006020 | Plasma membrane calcium-transporting ATPase, putative
EHI_000430 | Rap/Ran GTPase activating protein, putative | EDI_026850 | Rap GTPase-activating protein, putative | EIN_033200 | Rap GTPase-activating protein, putative | EMO_022230 | Rap/Ran GTPase-activating protein, putative
EHI_188590 | Long chain fatty acid CoA ligase, putative | EDI_093250 | Long-chain-fatty-acid--CoA ligase, putative | EIN_016090 | Long-chain-fatty-acid--CoA ligase, putative | EMO_002990 | Long-chain-fatty-acid--CoA ligase, putative
EHI_190880 | Thioredoxin domain containing protein 2, putative | EDI_197960 | Hypothetical protein, conserved | EIN_163620 | Hypothetical protein | EMO_099010 | Thioredoxin domain-containing protein 2, putative

a: Homolog of the corresponding gene was not found in the particular Entamoeba species [as per the AmoebaDB database (www.AmoebaDB.org)].

Sexual reproduction can help parasites improve the fitness of their progeny [68]. Parasitic protists are continuously exposed to exogenous environmental factors and host immune pressure, which can alter the chemical structure and stability of their genomes [68]. Parasites must repair such structural alterations, since they can lead to mutations, deletions, insertions, translocations and loss of essential genetic information [68]. Parasites remove DNA damage through recombinational DNA repair, which allows greater survival of offspring with undamaged DNA [68]. Recombination is also an important mechanism for generating the genetic diversity that parasites use to evade the host immune response [68]. This feature is important because sexual reproduction can exchange genes responsible for drug resistance and parasite virulence, which could generate selectively advantageous genotypes that spread very rapidly through the host population [14]. Sexual reproduction can also help remove deleterious genes: deleterious mutations brought together by sexual reproduction create unfit individuals that are eliminated from the population [68]. The genome of E. histolytica contains meiotic genes such as SPO11, DMC1 and MND1, and many homologous recombination (HR)-specific genes such as MLH1, MSH2, RAD21 and RAD51 [22,68,69]. Moreover, ploidy changes and unscheduled gene amplification, which indicate the possibility of recombination, have also been reported in Entamoeba [68]. The E. histolytica genome contains a large number of retrotransposons, which also suggests an ability to reproduce sexually, since organisms that reproduce solely by asexual means would eventually lose these retrotransposons from their genomes [68]. Beyond this indirect evidence, Singh et al. recently provided the first direct demonstration of HR in Entamoeba using a construct with inverted repeats, which undergoes sequence inversion upon recombination.
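
The logic of an inverted-repeat reporter can be illustrated with a toy model: when recombination occurs between a repeat and its inverted copy, the segment lying between them is flipped to its reverse complement, and the flipped orientation is what reveals the event. The sketch below is only an illustration of that principle; the sequences and the function names `reverse_complement` and `invert_between_repeats` are hypothetical and do not describe the actual construct used by Singh et al.

```python
# Toy model of sequence inversion by recombination between inverted repeats.
# All sequences and names here are hypothetical illustrations.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def invert_between_repeats(construct: str, repeat: str) -> str:
    """Flip the segment lying between `repeat` and its inverted copy."""
    inverted_copy = reverse_complement(repeat)
    left = construct.find(repeat)
    if left == -1:
        raise ValueError("repeat not found in construct")
    start = left + len(repeat)
    right = construct.find(inverted_copy, start)
    if right == -1:
        raise ValueError("inverted copy of repeat not found downstream")
    middle = construct[start:right]
    # Recombination between the two repeat copies inverts the middle segment.
    return construct[:start] + reverse_complement(middle) + construct[right:]


repeat = "GATTACA"    # hypothetical repeat unit
reporter = "ATGCCC"   # hypothetical reporter segment
construct = "TT" + repeat + reporter + reverse_complement(repeat) + "AA"

recombined = invert_between_repeats(construct, repeat)
print(construct)
print(recombined)
```

In this model a second recombination event restores the original orientation, so the two forms can interconvert; whether that holds for a real construct depends on its design.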
An increased rate of genetic recombination has been reported in Entamoeba under stress conditions and during encystation [68]. Stage inter-conversion between cyst and trophozoite is crucial for disease transmission and pathogenesis in E. histolytica [68]. In addition, some indirect evidence of genetic recombination has been identified in Entamoeba through population genetic studies. Complete genome sequencing of 10 axenic E. histolytica cell lines identified patterns of polymorphism indicating that recombination has occurred in the history of the population studied [66]. A concatenated genealogy based on repetitive loci (i.e. D-A and N-K2) also revealed the possibility of genetic recombination within the E. histolytica population [60]. Bioinformatic comparison of the Gal/GalNAc lectin between E. histolytica and its non-pathogenic sibling E. dispar also found evidence of gene conversion within the lineages [50]. Transposable elements constitute a significant portion of the E. histolytica genome and can affect the expression of adjacent genes [37]. The phenotypic characteristics of the parasite are also influenced by the genomic location of these elements [37]. Variability in the genomic distribution of SINE1 and SINE2 among E. histolytica clinical isolates has recently been studied by Kumari et al., who identified several loci with extensive polymorphism of SINE occupancy among E. histolytica strains [70].
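
To make concrete how a pattern of polymorphism can point to past recombination, the following sketch implements the classical four-gamete test. This is offered as a generic textbook illustration and is not claimed to be the analysis used in the studies cited above. It assumes phased haplotypes, biallelic sites and no recurrent mutation, which real Entamoeba data may not satisfy; the function name and example haplotypes are hypothetical.

```python
# Four-gamete test: under the assumption of no recurrent mutation, observing
# all four allele combinations at a pair of biallelic sites implies that at
# least one recombination event occurred between them.
# The haplotypes below are made-up examples, not real E. histolytica data.
from itertools import combinations


def four_gamete_violations(haplotypes):
    """Return index pairs of biallelic sites showing all four allele combinations."""
    n_sites = len(haplotypes[0])
    violations = []
    for i, j in combinations(range(n_sites), 2):
        alleles_i = {h[i] for h in haplotypes}
        alleles_j = {h[j] for h in haplotypes}
        if len(alleles_i) != 2 or len(alleles_j) != 2:
            continue  # the test is only defined for pairs of biallelic sites
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            violations.append((i, j))
    return violations


# Sites 0 and 1 show all four combinations (AA, AG, CA, CG).
recombining = ["AAT", "AGT", "CAT", "CGC"]
# Here alleles at sites 0 and 1 are always inherited together.
clonal = ["AAT", "AAT", "CGT", "CGC"]

print(four_gamete_violations(recombining))
print(four_gamete_violations(clonal))
```

A population with no such violations is compatible with strict clonality under these assumptions, although the absence of violations does not prove that recombination never occurs.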

Conclusion

Questions related to the evolution and population structure of E. histolytica remain to be investigated. One open issue is whether the E. histolytica population is sexual or clonal. Circumstantial evidence suggests that Entamoeba might engage in genetic recombination at some stage of its life cycle. However, further detailed investigations of Entamoeba and other early-branching protists are required to understand the origin of their sexual reproduction and to determine the variety of mechanisms by which these organisms exchange DNA. Another major question is whether the E. histolytica population from ALA patients is genetically close to that from asymptomatic individuals. If they are close (as a few studies have suggested), then individuals with persistent asymptomatic E. histolytica infection may be at high risk of developing ALA in the future, and prompt preventive measures should be undertaken for such individuals. Whole genome sequencing of E. histolytica clinical isolates can help to address this question.
  70 in total

1.  DNA vaccination with the serine rich Entamoeba histolytica protein (SREHP) prevents amebic liver abscess in rodent models of disease.

Authors:  T Zhang; S L Stanley
Journal:  Vaccine       Date:  1999-12-10       Impact factor: 3.641

2.  A retromerlike complex is a novel Rab7 effector that is involved in the transport of the virulence factor cysteine protease in the enteric protozoan parasite Entamoeba histolytica.

Authors:  Kumiko Nakada-Tsukui; Yumiko Saito-Nakano; Vahab Ali; Tomoyoshi Nozaki
Journal:  Mol Biol Cell       Date:  2005-08-24       Impact factor: 4.138

3.  Coding and noncoding genomic regions of Entamoeba histolytica have significantly different rates of sequence polymorphisms: implications for epidemiological studies.

Authors:  Dhruva Bhattacharya; Rashidul Haque; Upinder Singh
Journal:  J Clin Microbiol       Date:  2005-09       Impact factor: 5.948

4.  Unique organisation of tRNA genes in Entamoeba histolytica.

Authors:  C Graham Clark; Ibne Karim M Ali; Mehreen Zaki; Brendan J Loftus; Neil Hall
Journal:  Mol Biochem Parasitol       Date:  2005-10-28       Impact factor: 1.759

5.  The small GTPase EhRabB of Entamoeba histolytica is differentially expressed during phagocytosis.

Authors:  Mario Hernandes-Alejandro; Mercedes Calixto-Gálvez; Israel López-Reyes; Andrés Salas-Casas; Javier Cázares-Ápatiga; Esther Orozco; Mario A Rodríguez
Journal:  Parasitol Res       Date:  2013-02-12       Impact factor: 2.289

6.  The genome of the protist parasite Entamoeba histolytica.

Authors:  Brendan Loftus; Iain Anderson; Rob Davies; U Cecilia M Alsmark; John Samuelson; Paolo Amedeo; Paola Roncaglia; Matt Berriman; Robert P Hirt; Barbara J Mann; Tomo Nozaki; Bernard Suh; Mihai Pop; Michael Duchene; John Ackers; Egbert Tannich; Matthias Leippe; Margit Hofer; Iris Bruchhaus; Ute Willhoeft; Alok Bhattacharya; Tracey Chillingworth; Carol Churcher; Zahra Hance; Barbara Harris; David Harris; Kay Jagels; Sharon Moule; Karen Mungall; Doug Ormond; Rob Squares; Sally Whitehead; Michael A Quail; Ester Rabbinowitsch; Halina Norbertczak; Claire Price; Zheng Wang; Nancy Guillén; Carol Gilchrist; Suzanne E Stroup; Sudha Bhattacharya; Anuradha Lohia; Peter G Foster; Thomas Sicheritz-Ponten; Christian Weber; Upinder Singh; Chandrama Mukherjee; Najib M El-Sayed; William A Petri; C Graham Clark; T Martin Embley; Bart Barrell; Claire M Fraser; Neil Hall
Journal:  Nature       Date:  2005-02-24       Impact factor: 49.962

Review 7.  Amoebiasis.

Authors:  Samuel L Stanley
Journal:  Lancet       Date:  2003-03-22       Impact factor: 79.321

8.  Influence of human leukocyte antigen class II alleles on susceptibility to Entamoeba histolytica infection in Bangladeshi children.

Authors:  Priya Duggal; Rashidul Haque; Shantanu Roy; Dinesh Mondal; R Bradley Sack; Barry M Farr; Terri H Beaty; William A Petri
Journal:  J Infect Dis       Date:  2004-01-20       Impact factor: 5.226

9.  Genotyping of Entamoeba species in South Africa: diversity, stability, and transmission patterns within families.

Authors:  Mehreen Zaki; Selvan G Reddy; Terry F H G Jackson; Jonathan I Ravdin; C Graham Clark
Journal:  J Infect Dis       Date:  2003-06-04       Impact factor: 5.226

10.  Patterns of evolution in the unique tRNA gene arrays of the genus Entamoeba.

Authors:  Blessing Tawari; Ibne Karim M Ali; Claire Scott; Michael A Quail; Matthew Berriman; Neil Hall; C Graham Clark
Journal:  Mol Biol Evol       Date:  2007-11-01       Impact factor: 16.240

