Carol A Gilchrist, Ibne Karim M Ali, Mamun Kabir, Faisal Alam, Sana Scherbakova, Eric Ferlanti, Gareth D Weedall, Neil Hall, Rashidul Haque, William A Petri, Elisabet Caler.
Abstract
BACKGROUND: The outcome of an Entamoeba histolytica infection is variable and can result in asymptomatic carriage or in immediate or latent disease (diarrhea, dysentery, or amebic liver abscess). An E. histolytica multilocus genotyping system based on tRNA gene-linked arrays has shown that genetic differences exist among parasites isolated from patients with different symptoms; however, the tRNA gene-linked arrays are highly variable and cannot be located in the current assembly of the E. histolytica reference genome (strain HM-1:IMSS).
Year: 2012 PMID: 22839995 PMCID: PMC3438053 DOI: 10.1186/1471-2180-12-151
Source DB: PubMed Journal: BMC Microbiol ISSN: 1471-2180 Impact factor: 3.605
Genome sequencing projects
Genomes sequenced by the Genomic Sequencing Center for Infectious Diseases (GSCID) and the Institute of Integrative Biology
| Strain | ID | Source or reference |
|---|---|---|
| GSCID | | |
| MS96-3382 | 885314 | R. Haque, unpublished data, ICDDR,B |
| DS4-868 | 885310 | Ali |
| KU 27 | 885311 | Escueta-de Cadiz |
| KU 50 | 885313 | Escueta-de Cadiz |
| KU 48 | 885312 | Escueta-de Cadiz |
| University of Liverpool | | |
| HK-9 | | Ungar |
| PVBM08B | | University of Liverpool genome resequencing project |
| PVBM08F | | University of Liverpool genome resequencing project |
| 2592100 | | R. Haque, unpublished data, ICDDR,B |
| Rahman | | Diamond and Clark, 1993 |
| MS84-1373 | | R. Haque, unpublished data, ICDDR,B |
| MS27-5030 | | R. Haque, unpublished data, ICDDR,B |
Verification, by Sanger sequencing, of 12 polymorphic loci identified by Next Generation Sequencing (NGS) of genomes
| XM_644365 | EHI_103540 | 63883C | C | C | C | C | C/A | C/A |
| XM_645788 | EHI_069570 | 120673G | G | G | A | A | A | A |
| XM_647032 | EHI_134740 | 54882G | G | G | G | G | A | A |
| XM_651435 | EHI_041950 | 9878A | A | A | A | A | C | C |
| XM_647310 | EHI_065250 | 10296C 10297T | CT | CT | TC | TC | TC | TC |
| XM_647310 | EHI_046600 | 6048A | A | A | C | C | C | C |
| XM_647170 | EHI_166490 | 28371G | G | G/A | G | G | G/A | G/A |
| XM_652055 | EHI_049680 | 91356A | A | A | A | A | C | C |
| XM_648588 | EHI_188130 | 32841C | C | C | T | T | T | T |
| XM_001914355 | EHI_083760 | 807T 784G | T-x-G | T-x-G | T-x-G | T-x-G | T-x-A | T-x-A |
| XM_647392 | EHI_126120 | 105607A | A | A | A | A | G | G |
| XM_001913688 | EHI_168860 | 11109G | G | G | A | A | A | A |
Verification of SNPs identified during Next Generation Sequencing of E. histolytica genomes.
Figure 1. Amplicon sequencing efficiency for individual samples. A) Number of reads obtained from the Illumina libraries prepared from different sample sources; x-axis: libraries prepared from different sample sources; y-axis: number of reads (log2 scale). B) Average coverage of the reads when mapped to the concatenated amplicon reference; x-axis: libraries prepared from different sample sources; y-axis: average coverage of mapped reads (log2 scale). The line indicates the median number of reads.
Figure 2. Similarity of diversity in Bangladeshi and whole-genome-sequenced strains. The y-axis shows the calculated heterozygosity (H), obtained by subtracting the sum of the squared allele frequencies from 1; the x-axis shows the loci containing the SNPs genotyped by MSLT (■, value in the Bangladeshi samples genotyped during this study; □, value in the sequenced genomes described in Table 1).
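The heterozygosity in the Figure 2 legend can be written as H = 1 - Σ p_i², where p_i is the frequency of allele i at a locus. The following is a minimal Python sketch of that calculation; the function name and the allele calls are hypothetical and are not taken from the study data.

```python
from collections import Counter


def heterozygosity(alleles):
    """Return H = 1 - sum(p_i ** 2) for the allele calls at one locus."""
    counts = Counter(alleles)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one allele call is required")
    # p_i is the observed frequency of each distinct allele
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


# Hypothetical allele calls at a single SNP position (not study data)
calls = ["C", "C", "T", "T"]
print(heterozygosity(calls))  # two alleles at equal frequency -> 0.5
```

A locus with a single allele gives H = 0, and for a biallelic SNP the maximum is 0.5, reached when both alleles are equally frequent.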
Figure 3. Lack of consistent patterns of descent among SNP markers from Bangladeshi isolates suggests they segregate independently. Consensus phylogeny inferred from 100 bootstrap replicates of polymorphic SNP markers, constructed using the MEGA 5 program and the Maximum Likelihood method based on the Tamura-Nei model, using the sequences shown in Additional file 1: Table 8 [42]. Branches produced in fewer than 50% of the bootstrap phylogenies were collapsed. Sequences from stool have the suffix s and those from culture the suffix c; monthly survey stool samples begin with MS or CMS, diarrheal samples with DS or CDS, and amebic liver abscess samples with RUF.
Figure 4. Amebic culture effect on the EHI_065250 genotype. Distribution of the EHI_065250 SNP at the 10296 location in field isolates or cultured strains established from asymptomatic disease (p = 0.0166). The distribution of the individual SNPs, which were either Reference (Ref), Non-Reference (Non-Ref) or heterologous, is shown on the x-axis. The number of samples with each genotype isolated from patients with asymptomatic disease is shown on the y-axis.
Association of SNPs with disease phenotype
| Accession | Locus | Amino acid change | SNP | SNP no. | p value | q value |
|---|---|---|---|---|---|---|
| XM_647889.1& | EHI_080100 | Pro361Leu | 2725C/T | 1 | 0.002** | 0.032** |
| XM_647310.1& | EHI_065250 | Ser399Asp | 10296A/G | 3 | 0.05** | 0.3 |
| | | | 10297G/A | 4 | | |
| XM_644633.2 | EHI_200030 | Leu60Ile | 16181C/A | 8 | 0.08 | 0.31 |
| XM_646031.2 | EHI_120270 | Pro21Ser | 7994C/T | 9 | 0.10 | 0.31 |
| XM_647889.1 | EHI_008810 | Leu326Ile | 73463C/A | 10 | 0.24 | 0.44 |
| XM_643253.1 | EHI_040810 | Ala197Glu | 1216C/A | 11 | 0.31 | 0.46 |
| XM_645270.1 | EHI_105150 | Ile282Met | 27395T/G | 12 | 0.42 | 0.56 |
| XM_001913781.1 | EHI_138990 | Val1288Leu | 30231G/T | 13 | 0.52 | 0.64 |
| XM_651449.1 | EHI_042210 | Pro58Leu | 39051C/T | 14 | 0.92 | 1.00 |
| XM_648423.2& | EHI_016380 | Tyr702His | 17795T/C | 15 | 0.97 | 1.00 |
# Only loci with a diversity (H) value over 0.25 are shown.
** Value < 0.05.
& Representative SNP chosen from a set of linked SNPs.
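The table lists two probability values per locus, and the Figure 5 legend refers to them as p and q, but this section does not say how the q values were derived. One widely used way to obtain false discovery rate adjusted values from a set of p values is the Benjamini-Hochberg procedure. The sketch below assumes that procedure purely for illustration; it is not confirmed as the authors' method, and the function name and input p values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted values, in the input order.

    Assumption: BH is used here only as an illustration; the source does
    not state which correction produced its q values.
    """
    m = len(pvalues)
    # Indices of the p values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so that adjusted values
    # never increase as the rank decreases
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p values for four tests (not the values from the table)
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.5]))
```

Because the table shows only loci with H over 0.25, the full set of tests behind the reported values is not visible here, so the table's p values alone cannot be used to reproduce its q values.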
Figure 5. SNPs 1&2 in the EHI_080100 locus segregate with disease. The distribution of SNP1, which was either Reference (Ref; □, MS) or Non-Reference (Non-Ref; ■, ALA), is shown on the x-axis. The number of samples with each genotype isolated from patients with amebic liver abscesses, diarrhea/dysentery (D/D), or asymptomatic disease (COL) is shown on the y-axis. Fisher's pairwise comparison between asymptomatic and diarrhea/dysentery samples: p = 0.0182 (*); between amebic liver abscess and diarrhea/dysentery samples: p = 0.0003, q = 0.0144 (**); Chi-squared contingency analysis of all phenotypes: p = 0.002, q = 0.032 (**).
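The pairwise comparisons in the Figure 5 legend use Fisher's test on genotype counts from two phenotype groups. A minimal pure-Python sketch of a two-sided Fisher's exact test on a 2x2 table is given below. The function name and the counts are hypothetical (the per-group counts are not given in this section), and the two-sided rule used here, summing all tables that are no more probable than the observed one, is one common convention and is assumed rather than taken from the paper.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of every table that is no more likely than
    # the observed one (small tolerance guards against float rounding)
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts, not study data:
# rows = phenotype group (e.g. asymptomatic, diarrhea/dysentery)
# columns = genotype (Ref, Non-Ref)
print(fisher_exact_two_sided(8, 2, 1, 5))
```

An exact test suits tables like these because some genotype and phenotype combinations may contain only a few samples, where the chi-squared approximation is less reliable.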
Figure 6. The location of SNPs 1&2 in the EHI_080100 and EHI_065250 genes. Mapping of the informative SNPs within the coding sequences of the A) EHI_065250 and B) EHI_080100 genes. The nucleotide positions and approximate locations of the amplicon 5' (green) and 3' (red) bases are shown, and the positions and number of the targeted SNPs are indicated by vertical lines. The bases involved are bracketed in the nucleotide sequence of this region (shown above). The amino acid sequence, with changed residues in red, is also shown.
Locations of informative SNPs
| EHI_080100 | DS571720 | 5179 | 2725-2730 |
| EHI_065250 | DS571302 | 38246 | 10296-10318 |
Genomic location of the SNPs in the EHI_080100 and EHI_065250 genes.