Koushik Das, Sandipan Ganguly.
Abstract
Amoebiasis, caused by the gastrointestinal parasite Entamoeba histolytica, has diverse disease outcomes. Studying the genome and evolution of this parasite will help us understand the basis of its virulence and explain why, when and how it causes disease. In this review, we summarize current knowledge of the evolutionary genomics of E. histolytica and discuss its association with parasite phenotypes and differential pathogenic behavior. We also discuss how genetic diversity reveals parasite population structure, and we highlight questions about the parasite's evolution and population structure that remain to be addressed. This large body of genomic data will improve our knowledge of this pathogenic species of Entamoeba.
Keywords: Disease outcome; Genetic polymorphism; Genetic recombination; Genotyping; Short tandem repeat loci; Single nucleotide polymorphism
Year: 2014 PMID: 25505504 PMCID: PMC4262060 DOI: 10.1016/j.csbj.2014.10.001
Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 7.271
Fig. 1. E. histolytica isolates that remain asymptomatic are genetically closer to those causing liver abscess, as depicted by (A) a phylogenetic tree and (B) a graphical representation. The phylogeny was based on the tRNA-linked N-K2 short tandem repeat (STR) locus. The sequences of all 22 representative STR patterns from the N-K2 locus, obtained from the genetic analysis of 51 study isolates, were aligned using the ClustalW multiple alignment program of MEGA version 4. The phylogenetic tree was constructed from the alignment under the “General Time Reversible (GTR) + gamma” substitution model in SeaView version 4, using a maximum likelihood algorithm. One distinct “D” group, one distinct “LA” group and one mixed “AS + LA” group can be assigned. The “D” group contains STR patterns found exclusively in the diarrheal outcome. The “LA” group contains STR patterns found only in the liver abscess outcome. The “AS + LA” group contains STR patterns found only in the asymptomatic (AS) and liver abscess (LA) outcomes.
Genes of E. histolytica that contain intra-species single nucleotide polymorphisms (SNPs).
| AmoebaDB ID | Protein product for this gene | Total SNPs | Non-synonymous SNPs | Synonymous SNPs | Non-sense SNPs | Non-coding SNPs | Non-synonymous SNP/synonymous SNP ratio | SNPs per kb (CDS) |
|---|---|---|---|---|---|---|---|---|
| EHI_073630 | Serine threonine isoleucine rich protein, putative | 70 | 46 | 24 | 0 | 0 | 1.92 | 4.6 |
| EHI_065330 | Gal/Gal NAc lectin lgl2 | 27 | 13 | 14 | 0 | 0 | 0.93 | 8.14 |
| EHI_159140 | Heat shock protein 70, putative | 14 | 12 | 2 | 0 | 0 | 6 | 6.94 |
| EHI_006980 | Gal/Gal NAc lectin lgl1 | 13 | 9 | 4 | 0 | 0 | 2.25 | 3.89 |
| EHI_124500 | Tyrosine kinase, putative | 13 | 9 | 4 | 0 | 0 | 2.25 | 1.68 |
| EHI_164190 | DNA polymerase, putative | 12 | 6 | 6 | 0 | 0 | 1 | 3.13 |
| EHI_144270 | AIG1 family protein | 11 | 6 | 5 | 0 | 0 | 1.2 | 12.75 |
| EHI_164440 | Actinin like protein, putative | 9 | 0 | 9 | 0 | 0 | 0 | 5.74 |
| EHI_135220 | Phospholipid transporting p-type ATPase, putative | 9 | 3 | 6 | 0 | 0 | 0.5 | 3.05 |
| EHI_023050 | Protein kinase domain containing protein | 9 | 6 | 3 | 0 | 0 | 2 | 2.55 |
| EHI_035690 | Galactose inhibitable lectin 35 kDa subunit precursor | 8 | 4 | 4 | 0 | 0 | 1 | 8.57 |
| EHI_011210 | Elongation factor alpha 1 | 8 | 0 | 8 | 0 | 0 | 0 | 5.99 |
| EHI_139430 | Leucine rich repeat protein BspA family | 8 | 3 | 5 | 0 | 0 | 0.6 | 3.96 |
| EHI_023430 | Glycosyl hydrolase family 31 protein | 8 | 4 | 4 | 0 | 0 | 1 | 3.06 |
| EHI_042370 | Galactose specific adhesin 170 kDa subunit, putative | 8 | 6 | 2 | 0 | 0 | 3 | 2.05 |
| EHI_013980 | Phosphatidylinositol 3 kinase, putative | 8 | 3 | 5 | 0 | 0 | 0.6 | 1.86 |
| EHI_119600 | Ubiquitin carboxyl terminal hydrolase domain containing protein | 7 | 0 | 7 | 0 | 0 | 0 | 1.8 |
| EHI_059670 | Rab family GTPase | 6 | 2 | 4 | 0 | 0 | 0.5 | 2.66 |
| EHI_160860 | Inositol polyphosphate 5 phosphatase, putative | 6 | 5 | 1 | 0 | 0 | 5 | 2.06 |
| EHI_012270 | Gal/Gal NAc lectin heavy subunit | 6 | 4 | 2 | 0 | 0 | 2 | 1.55 |
| EHI_045170 | U5 SnRNP specific 200 kDa protein, putative | 6 | 4 | 2 | 0 | 0 | 2 | 1.11 |
| EHI_164520 | Iron sulfur flavoprotein pseudogene | 5 | 4 | 1 | 0 | 0 | 4 | 11.36 |
| EHI_001160 | Plasma membrane calcium transporting ATPase, putative | 5 | 1 | 4 | 0 | 0 | 0.25 | 4.42 |
| EHI_000430 | Rap/Ran GTPase activating protein, putative | 5 | 1 | 4 | 0 | 0 | 0.25 | 2.92 |
| EHI_188590 | Long chain fatty acid CoA ligase, putative | 5 | 1 | 4 | 0 | 0 | 0.25 | 2.57 |
| EHI_190880 | Thioredoxin domain containing protein 2, putative | 5 | 4 | 1 | 0 | 0 | 4 | 2.42 |
| EHI_061870 | Hypothetical protein | 22 | 13 | 9 | 0 | 0 | 1.44 | 3.4 |
| EHI_033550 | Hypothetical protein | 17 | 11 | 6 | 0 | 0 | 1.83 | 6.93 |
| EHI_023320 | Hypothetical protein | 17 | 12 | 5 | 0 | 0 | 2.4 | 5.51 |
| EHI_072500 | Hypothetical protein | 15 | 3 | 12 | 0 | 0 | 0.25 | 8.93 |
| EHI_013060 | Hypothetical protein | 15 | 10 | 5 | 0 | 0 | 2 | 3.11 |
| EHI_077290 | Hypothetical protein | 14 | 8 | 6 | 0 | 0 | 1.33 | 5.03 |
| EHI_018390 | Hypothetical protein | 13 | 8 | 5 | 0 | 0 | 1.6 | 4.73 |
| EHI_121060 | Hypothetical protein | 12 | 10 | 2 | 0 | 0 | 5 | 5.75 |
| EHI_059870 | Hypothetical protein | 12 | 8 | 4 | 0 | 0 | 2 | 4.54 |
| EHI_172000 | Hypothetical protein | 12 | 3 | 9 | 0 | 0 | 0.33 | 3.57 |
| EHI_050660 | Hypothetical protein | 12 | 5 | 7 | 0 | 0 | 0.71 | 2.28 |
| EHI_174540 | Hypothetical protein | 11 | 7 | 1 | 0 | 3 | 7 | 2.65 |
| EHI_196760 | Hypothetical protein | 10 | 9 | 1 | 0 | 0 | 9 | 8.71 |
| EHI_174560 | Hypothetical protein | 10 | 8 | 2 | 0 | 0 | 4 | 7.38 |
| EHI_111770 | Hypothetical protein | 10 | 9 | 1 | 0 | 0 | 9 | 5.45 |
| EHI_006990 | Hypothetical protein | 9 | 6 | 3 | 0 | 0 | 2 | 7.27 |
| EHI_025310 | Hypothetical protein | 9 | 4 | 5 | 0 | 0 | 0.8 | 4.34 |
| EHI_077750 | Hypothetical protein | 9 | 7 | 2 | 0 | 0 | 3.5 | 3.64 |
| EHI_103400 | Hypothetical protein | 9 | 0 | 3 | 0 | 6 | 0 | 3.62 |
| EHI_114110 | Hypothetical protein | 9 | 5 | 4 | 0 | 0 | 1.25 | 2.3 |
| EHI_016900 | Hypothetical protein | 8 | 5 | 3 | 0 | 0 | 1.67 | 6.79 |
| EHI_004180 | Hypothetical protein | 8 | 2 | 6 | 0 | 0 | 0.33 | 1.52 |
| EHI_119790 | Hypothetical protein | 7 | 3 | 4 | 0 | 0 | 0.75 | 21.28 |
| EHI_145460 | Hypothetical protein | 7 | 1 | 6 | 0 | 0 | 0.17 | 13.75 |
| EHI_106320 | Hypothetical protein | 7 | 4 | 3 | 0 | 0 | 1.33 | 11.69 |
| EHI_107040 | Hypothetical protein | 7 | 0 | 6 | 0 | 1 | 0 | 7.99 |
| EHI_144390 | Hypothetical protein | 7 | 6 | 1 | 0 | 0 | 6 | 7.39 |
| EHI_017780 | Hypothetical protein | 7 | 2 | 5 | 0 | 0 | 0.4 | 6.56 |
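
The "Non-synonymous SNP/synonymous SNP ratio" column appears to be the plain quotient of the two SNP counts in each row, rounded to two decimals. The following minimal sketch checks that reading against a few rows copied from the table. The function name `ns_to_s_ratio` and the handling of a zero synonymous count are assumptions made for illustration. This is a raw count ratio, not a per-site dN/dS estimate.

```python
from typing import Optional


def ns_to_s_ratio(non_synonymous: int, synonymous: int) -> Optional[float]:
    """Ratio of non-synonymous to synonymous SNP counts, rounded to 2 decimals.

    Returns None when there are no synonymous SNPs, because the ratio is
    undefined in that case (assumption: no such row occurs in the table).
    """
    if synonymous == 0:
        return None
    return round(non_synonymous / synonymous, 2)


# (non-synonymous, synonymous) SNP counts copied from rows of the table above.
snp_counts = {
    "EHI_073630": (46, 24),
    "EHI_065330": (13, 14),
    "EHI_159140": (12, 2),
    "EHI_164440": (0, 9),
    "EHI_174540": (7, 1),
}

ratios = {gene: ns_to_s_ratio(n, s) for gene, (n, s) in snp_counts.items()}

for gene, ratio in ratios.items():
    print(gene, ratio)
```

A ratio above 1 means a gene carries more non-synonymous than synonymous SNPs among the compared isolates. A ratio of 0, as for EHI_164440, means every coding SNP recorded for that gene is synonymous.
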
Genes of E. histolytica and their homologs present in the AmoebaDB database.
| E. histolytica AmoebaDB ID | E. histolytica protein product | E. dispar AmoebaDB ID | E. dispar protein product | E. invadens AmoebaDB ID | E. invadens protein product | E. moshkovskii AmoebaDB ID | E. moshkovskii protein product |
|---|---|---|---|---|---|---|---|
| EHI_073630 | Serine threonine isoleucine rich protein, putative | EDI_083900 | Hypothetical protein | EIN_092260 | Hypothetical protein | EMO_033950 | Serine threonine isoleucine rich protein, putative |
| EHI_065330 | Gal/Gal NAc lectin lgl2 | EDI_244250 | Furin repeat containing protein, putative | EIN_065850 | Furin repeat containing protein, putative | EMO_010790 | Gal/Gal Nac lectin lgl2 |
| EHI_159140 | Heat shock protein 70, putative | EDI_012650 | Heat shock protein 70 kDa, putative | – | – | EMO_060560 | Heat shock protein 70, putative |
| EHI_006980 | Gal/Gal Nac lectin lgl1 | EDI_244250 | Furin repeat containing protein, putative | EIN_065850 | Furin repeat containing protein, putative | EMO_010790 | Gal/Gal Nac lectin subunit lgl2 |
| EHI_124500 | Tyrosine kinase, putative | EDI_004150 | Serine/threonine protein kinase HT1, putative | EIN_000210 | Protein serine/threonine kinase, putative | EMO_009220 | Tyrosine kinase, putative |
| EHI_164190 | DNA polymerase, putative | EDI_056410 | Hypothetical protein, conserved | EIN_032840 | Hypothetical protein, conserved | EMO_057600 | DNA polymerase, putative |
| EHI_144270 | AIG1 family protein | EDI_001050 | Hypothetical protein, conserved | – | – | – | – |
| EHI_164440 | Actinin like protein, putative | EDI_207850 | Grainin, putative | EIN_037840 | Grainin, putative | EMO_010570 | Actinin like protein, putative |
| EHI_135220 | Phospholipid transporting p-type ATPase, putative | EDI_018000 | Phospholipid transporting ATPase, putative | EIN_038730 | Phospholipid transporting ATPase, putative | EMO_035200 | Phospholipid transporting p-type ATPase, putative |
| EHI_023050 | Protein kinase domain containing protein | EDI_012370 | Serine–threonine protein kinase, putative | EIN_016310 | Serine–threonine protein kinase, putative | EMO_012200 | Protein kinase domain containing protein |
| EHI_035690 | Galactose inhibitable lectin 35 kDa subunit precursor | EDI_023210 | Galactose-inhibitable lectin 35 kDa subunit precursor, putative | – | – | EMO_050130 | Galactose-inhibitable lectin 35 kDa subunit precursor |
| EHI_011210 | Elongation factor alpha 1 | EDI_134610 | Elongation factor 1-alpha | EIN_146970 | Elongation factor 1-alpha, putative | EMO_123750 | Elongation factor 1-alpha 1 |
| EHI_139430 | Leucine rich repeat protein BspA family | EDI_284090 | Hypothetical protein, conserved | EIN_054420 | Hypothetical protein, conserved | EMO_007680 | Leucine rich repeat protein BspA family |
| EHI_023430 | Glycosyl hydrolase family 31 protein | EDI_137800 | Neutral alpha-glucosidase AB precursor, putative | EIN_108320 | Neutral alpha-glucosidase AB precursor, putative | EMO_112400 | Glycosyl hydrolase, family 31 protein |
| EHI_042370 | Galactose specific adhesin 170 kDa subunit, putative | EDI_213670 | 170 kDa surface lectin precursor, putative | EIN_068210 | 170 kDa surface lectin precursor, putative | EMO_066770 | Gal/GalNAc lectin heavy subunit |
| EHI_013980 | Phosphatidylinositol 3 kinase, putative | EDI_147070 | Phosphatidylinositol 3-kinase catalytic subunit gamma, putative | EIN_020710 | Phosphatidylinositol 3-kinase catalytic subunit gamma, putative | EMO_071620 | Phosphatidylinositol 3-kinase, putative |
| EHI_119600 | Ubiquitin carboxyl terminal hydrolase domain containing protein | EDI_023410 | Ubiquitin specific protease, putative | EIN_200010 | Hypothetical protein | EMO_025900 | Ubiquitin carboxyl-terminal hydrolase domain containing protein |
| EHI_059670 | Rab family GTPase | EDI_156940 | Trichohyalin, putative | EIN_157460 | Trichohyalin, putative | EMO_059660 | Rab family GTPase |
| EHI_160860 | Inositol polyphosphate 5 phosphatase, putative | EDI_159070 | Type II inositol-1,4,5-trisphosphate 5-phosphatase precursor, putative | EIN_020640 | Type II inositol-1,4,5-trisphosphate 5-phosphatase precursor, putative | EMO_012640 | Inositol polyphosphate-5-phosphatase, putative |
| EHI_012270 | Gal/Gal Nac lectin heavy subunit | EDI_213670 | 170 kDa surface lectin precursor, putative | EIN_068210 | 170 kDa surface lectin precursor, putative | EMO_066770 | Gal/GalNAc lectin heavy subunit |
| EHI_045170 | U5 SnRNP specific 200 kDa protein, putative | EDI_076220 | U5 small nuclear ribonucleoprotein 200 kDa helicase, putative | EIN_093940 | U5 small nuclear ribonucleoprotein 200 kDa helicase, putative | EMO_014940 | U5 snRNP-specific 200 kDa protein, putative |
| EHI_164520 | Iron sulfur flavoprotein pseudogene | EDI_064980 | Hypothetical protein, conserved | EIN_091700 | Hypothetical protein, conserved | EMO_098730 | Iron–sulfur flavoprotein, putative |
| EHI_001160 | Plasma membrane calcium transporting ATPase, putative | EDI_013570 | Plasma membrane calcium-transporting ATPase, putative | EIN_222480 | Plasma membrane calcium-transporting ATPase, putative | EMO_006020 | Plasma membrane calcium-transporting ATPase, putative |
| EHI_000430 | Rap/Ran GTPase activating protein, putative | EDI_026850 | Rap GTPase-activating protein, putative | EIN_033200 | Rap GTPase-activating protein, putative | EMO_022230 | Rap/Ran GTPase-activating protein, putative |
| EHI_188590 | Long chain fatty acid CoA ligase, putative | EDI_093250 | Long-chain-fatty-acid—CoA ligase, putative | EIN_016090 | Long-chain-fatty-acid—CoA ligase, putative | EMO_002990 | Long-chain-fatty-acid—CoA ligase, putative |
| EHI_190880 | Thioredoxin domain containing protein 2, putative | EDI_197960 | Hypothetical protein, conserved | EIN_163620 | Hypothetical protein | EMO_099010 | Thioredoxin domain-containing protein 2, putative |
–: No homolog of the corresponding gene was found in that Entamoeba species [per the AmoebaDB database (www.AmoebaDB.org)].
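
As a small illustration of how the homolog table can be queried, the sketch below lists the E. histolytica genes that lack a homolog in at least one of the other ID-prefix groups (EDI, EIN, EMO). The rows are copied from the table, with `None` standing in for a "–" entry. The dictionary layout and the function name `missing_homologs` are assumptions made for this example, not part of AmoebaDB.

```python
from typing import Dict, List, Optional

# A subset of rows from the homolog table; None marks a "–" (no homolog found).
homologs: Dict[str, Dict[str, Optional[str]]] = {
    "EHI_073630": {"EDI": "EDI_083900", "EIN": "EIN_092260", "EMO": "EMO_033950"},
    "EHI_159140": {"EDI": "EDI_012650", "EIN": None, "EMO": "EMO_060560"},
    "EHI_144270": {"EDI": "EDI_001050", "EIN": None, "EMO": None},
    "EHI_035690": {"EDI": "EDI_023210", "EIN": None, "EMO": "EMO_050130"},
}


def missing_homologs(
    table: Dict[str, Dict[str, Optional[str]]]
) -> Dict[str, List[str]]:
    """Map each gene to the sorted ID prefixes for which no homolog is listed."""
    result: Dict[str, List[str]] = {}
    for gene, by_prefix in table.items():
        absent = sorted(p for p, hit in by_prefix.items() if hit is None)
        if absent:  # keep only genes with at least one missing homolog
            result[gene] = absent
    return result


missing = missing_homologs(homologs)
for gene, prefixes in missing.items():
    print(gene, "has no listed homolog for:", ", ".join(prefixes))
```

In the full table, the only "–" entries are the three genes flagged here: EHI_159140 and EHI_035690 lack an EIN entry, and EHI_144270 lacks both EIN and EMO entries.
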