Genome-Wide Association Study of Cryptosporidiosis in Infants Implicates PRKCA.

Genevieve L Wojcik, Poonum Korpe, Chelsea Marie, Alexander J Mentzer, Tommy Carstensen, Josyf Mychaleckyj, Beth D Kirkpatrick, Stephen S Rich, Patrick Concannon, A S G Faruque, Rashidul Haque, William A Petri, Priya Duggal.

Abstract

Diarrhea is a major cause of both morbidity and mortality worldwide, especially among young children. Cryptosporidiosis is a leading cause of diarrhea in children, particularly in South Asia and sub-Saharan Africa, where it is responsible for over 200,000 deaths per year. Beyond the initial clinical presentation of diarrhea, it is associated with long-term sequelae such as malnutrition and neurocognitive developmental deficits. Risk factors include poverty and overcrowding, and yet not all children with these risk factors and exposure are infected, nor do all infected children develop symptomatic disease. One potential risk factor to explain these differences is their human genome. To identify genetic variants associated with symptomatic cryptosporidiosis, we conducted a genome-wide association study (GWAS) examining 6.5 million single nucleotide polymorphisms (SNPs) in 873 children from three independent cohorts in Dhaka, Bangladesh, namely, the Dhaka Birth Cohort (DBC), the Performance of Rotavirus and Oral Polio Vaccines in Developing Countries (PROVIDE) study, and the Cryptosporidiosis Birth Cohort (CBC). Associations were estimated separately for each cohort under an additive model, adjusting for length-for-age Z-score at 12 months of age, the first two principal components to account for population substructure, and genotyping batch. The strongest meta-analytic association was with rs58296998 (P = 3.73 × 10−8), an intronic SNP and expression quantitative trait locus (eQTL) of protein kinase C alpha (PRKCA). Each additional risk allele conferred 2.4 times the odds of Cryptosporidium-associated diarrhea in the first year of life. This genetic association suggests a role for protein kinase C alpha in pediatric cryptosporidiosis and warrants further investigation.

IMPORTANCE

Globally, diarrhea remains one of the major causes of pediatric morbidity and mortality. The initial symptoms of diarrhea can often lead to long-term consequences for the health of young children, such as malnutrition and neurocognitive developmental deficits. Despite many children having similar exposures to infectious causes of diarrhea, not all develop symptomatic disease, indicating a possible role for human genetic variation. Here, we conducted a genetic study of susceptibility to symptomatic disease associated with Cryptosporidium infection (a leading cause of diarrhea) in three independent cohorts of infants from Dhaka, Bangladesh. We identified a genetic variant within protein kinase C alpha (PRKCA) associated with higher risk of cryptosporidiosis in the first year of life. These results indicate a role for human genetics in susceptibility to cryptosporidiosis and warrant further research to elucidate the mechanism.
Copyright © 2020 Wojcik et al.

Keywords:  Cryptosporidium; genetics; genome analysis

Year:  2020        PMID: 32019797      PMCID: PMC7002356          DOI: 10.1128/mBio.03343-19

Source DB:  PubMed          Journal:  mBio            Impact factor:   7.867


INTRODUCTION

Cryptosporidiosis is a leading cause of diarrhea and is estimated to be responsible for greater than 200,000 deaths in young children in South Asia and sub-Saharan Africa each year (1). Beyond the immediate infection, cryptosporidiosis is also associated with long-term sequelae, including malnutrition and neurocognitive developmental deficits (2–5). The majority of human infections are caused by the Cryptosporidium hominis, C. meleagridis, and C. parvum species (4, 6, 7). As cryptosporidiosis is transmitted fecal-orally, contact with any reservoir with possible fecal contamination could serve as a point of transmission. In the developed world, cryptosporidia represent an important cause of diarrhea in individuals living with HIV and are the most common pathogens causing waterborne outbreaks (7). In regions of endemicity, cryptosporidiosis mostly impacts young children, and risk factors for infection include poverty and overcrowding (4, 8–10). Livestock serve as an environmental reservoir for C. parvum, and transmission after contact with infected animals or with drinking water contaminated by human or animal waste has been reported previously (11). In regions where Cryptosporidium infection is endemic, there is heterogeneity in clinical courses and outcomes. In an eight-site multicenter international study of enteric infection and malnutrition (MAL-ED), the rate of Cryptosporidium infection, age of onset, number of repeat infections, and clinical manifestation differed significantly by site (9). In a recent study in Dhaka, Bangladesh, we found that two-thirds of children living in an urban slum were infected with Cryptosporidium by 2 years of age and that one-fourth had had more than one episode of cryptosporidiosis. Fully three-fourths of the infections were subclinical, but, regardless of the symptoms, children with cryptosporidiosis were more likely to become malnourished by 2 years of age (4). 
Potential explanations for the Cryptosporidium infection heterogeneity include differences in the pathogenicity of various Cryptosporidium species or genotypes (12) and in host genetic susceptibility. Candidate gene studies identified an increased risk of Cryptosporidium infection associated with specific alleles in HLA class I and II genes and with single nucleotide polymorphisms (SNPs) in the mannose binding lectin (MBL) gene (13–15). Bangladeshi preschool children with multiple Cryptosporidium infections (≥2 infections) were more likely to carry the -221 MBL2 promoter variant (rs7906206; odds ratio [OR] = 4.02, P = 0.025) and to have the YO/XA haplotype (OR = 4.91), as well as to be deficient in their MBL serum levels (OR = 10.45) (14). Since the findings with respect to the MBL and HLA alleles explained Cryptosporidium susceptibility only partially, we conducted a genome-wide association study (GWAS) of cryptosporidiosis occurring in the first year of life using three existing birth cohorts of children in Dhaka, Bangladesh: the Performance of Rotavirus and Oral Polio Vaccines in Developing Countries (PROVIDE) study, the Dhaka Birth Cohort (DBC), and the Cryptosporidiosis Birth Cohort (CBC). (This article was submitted to an online preprint archive [16].)

RESULTS

Across these three cohorts, there were a total of 183 children with at least one symptomatic (diarrheal) sample that tested positive for Cryptosporidium within the first year of life (“cases”) (Table 1). A total of 873 children did not test positive for Cryptosporidium in either symptomatic (diarrheal) or surveillance samples within the first year of life (“controls”). There were no significant differences in length-for-age Z-score (LAZ) at birth (LAZbirth), the number of days exclusively breastfed, or sex between cases and controls (P > 0.05). To control for a possible role of malnutrition affecting susceptibility to infection, we compared the LAZ at 12 months of age (LAZ12) between cases and controls. We observed increased levels of stunting in cases (lower LAZ12) versus controls within PROVIDE (P = 0.007) and CBC (P = 0.02), while no differences were observed in stunting between cases and controls in DBC (P = 0.97). Additionally, there was no statistically significant evidence of heterogeneity in LAZ12, number of days exclusively breastfed, or sex between the three studies (heterogeneity P [Phet], >0.05).
TABLE 1

Demographics of study populations

Cohort sizes: Dhaka Birth Cohort (DBC), 267 controls and 46 cases; PROVIDE, 354 controls and 60 cases; Cryptosporidiosis Birth Cohort (CBC), 252 controls and 77 cases.

Parameter                                Cohort    Mean for controls   Mean for cases   P             Phet
LAZ at 12 mos                            DBC       −1.75               −1.74            0.97          0.12
                                         PROVIDE   −1.40               −1.79            7.28 × 10−3
                                         CBC       −1.34               −1.63            0.02
Exclusive breast feeding (no. of days)   DBC       130.2               114.6            0.16          0.74
                                         PROVIDE   127.2               112.1            0.06
                                         CBC       110.9               103.7            0.42
Sex (% female subjects)                  DBC       46.3                34.8             0.15          0.28
                                         PROVIDE   45.9                46.7             0.91
                                         CBC       52.8                57.7             0.45

GWAS of cryptosporidiosis within the first year of life.

We tested the association between 6.5 million SNPs across the human genome and symptomatic Cryptosporidium infection in the first year of life. Effects were estimated separately for the three birth cohorts and subsequently combined using a fixed-effects meta-analysis, filtered for heterogeneity (Phet), minor allele frequency (MAF) (>5%), and imputation quality (INFO; score, >0.6) (Fig. 1; see also Fig. S1 in the supplemental material). A total of 6 SNPs in an intron of PRKCA (protein kinase C alpha) were significantly associated with Cryptosporidium infection (P < 5 × 10−8) (Fig. 2A). For the SNP most strongly associated with Cryptosporidium infection (rs58296998), each copy of the risk allele (T) conferred 2.4 times the odds of cryptosporidiosis within the first year of life (P = 3.73 × 10−8). The effect size and risk allele were consistent across all three studies (Phet value of 0.11) (Fig. 2B). After conditioning on rs58296998 (by including this SNP in the logistic regression model as a covariate), the evidence for association with the remaining SNPs in the region was no longer statistically significant, suggesting that the observed association in PRKCA is explained by a single SNP (rs58296998) or by one highly correlated with this SNP (Fig. S2A). Among the 26 children homozygous for the risk allele (TT) at rs58296998, 46% developed symptomatic cryptosporidiosis during the first year of life. This proportion decreased to 24% for children heterozygous (CT) for this risk allele (n = 272) and to 13% for children homozygous (CC) for the nonrisk allele (n = 745).
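To make the pooling step concrete, the following is a minimal sketch of a fixed-effects meta-analysis of per-cohort log odds ratios. It assumes inverse-variance weighting and Cochran's Q for heterogeneity, which this section does not spell out; the function name and the per-cohort inputs are hypothetical illustrations, not the study's estimates.

```python
import math


def fixed_effects_meta(betas, ses):
    """Inverse-variance fixed-effects pooling of per-cohort log odds ratios."""
    weights = [1.0 / se ** 2 for se in ses]
    total_w = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_w
    se = math.sqrt(1.0 / total_w)
    z = beta / se
    # Two-sided P value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    # Cochran's Q: weighted squared deviations of cohort effects from the pooled effect.
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))
    # With exactly three cohorts Q has 2 degrees of freedom, for which the
    # chi-square survival function has the closed form exp(-Q / 2).
    p_het = math.exp(-q / 2.0) if len(betas) == 3 else None
    return {"beta": beta, "se": se, "or": math.exp(beta), "z": z,
            "p": p, "q": q, "p_het": p_het}


# Hypothetical per-cohort log odds ratios and standard errors (NOT study data).
betas = [0.90, 0.75, 1.00]
ses = [0.30, 0.28, 0.25]
result = fixed_effects_meta(betas, ses)
print(f"pooled OR = {result['or']:.2f}, P = {result['p']:.2e}, Phet = {result['p_het']:.2f}")
```

The pooled estimate always lies between the smallest and largest cohort estimate, and its standard error is smaller than that of any single cohort, which is why a variant can reach genome-wide significance in the meta-analysis without doing so in any one cohort.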
FIG 1

Manhattan plot for cryptosporidiosis within the first year of life. Each dot indicates the association of a single SNP with cryptosporidiosis in the first year of life. SNPs are sorted by chromosome (each color) and position along the x axis. The y axis is the -log10 P value for the SNP association in the meta-analysis of study-specific logistic regressions adjusting for length-for-age Z-score at 12 months, the first two study-specific principal components, and the genotyping batch for the Dhaka Birth Cohort (DBC). Genome-wide significance (5 × 10−8) is denoted by the dashed line. This plot is limited to associations with a P value below 0.01.

FIG 2

Association between variants in PRKCA and cryptosporidiosis. (A) Regional association on chromosome 17 between variants in PRKCA and cryptosporidiosis. Fill denotes linkage disequilibrium (r2) between the top SNP (rs58296998) and surrounding SNPs. cM/Mb, centimorgan/megabase. (B) Forest plot of odds ratios and 95% confidence intervals for top signal rs58296998 by individual cohort and meta-analysis. Crypto Birth Cohort, Cryptosporidiosis Birth Cohort. (C) Survival analysis of first episode of Cryptosporidium-associated diarrhea among all participants by rs58296998 genotype within the first year of life.

FIG S1

Quality control workflow for all three cohorts.

FIG S2

Characteristics of PRKCA region and top SNP. (A) Regional association in PRKCA region after conditioning on top signal rs58296998, showing significant diminishment between recombination peaks. (B) Survival analysis of the first episode of cryptosporidiosis associated with the PRKCA rs58296998 genotype within the first year of life among cases. Adjusting for the study, we saw no relationship between the number of copies of the risk allele (T) (none, one, or two, under an additive model) and earlier infection (P = 0.095). (C) Relationship between genotype for PRKCA SNP rs58296998 and severity of diarrhea as determined by Ruuska score within PROVIDE. Under an additive model, we saw a statistically significant relationship between PRKCA genotypes and diarrhea severity (P = 0.028).

The rs58296998 T allele frequencies (15.0% to 16.7%) for all three cohorts in this region are consistent with the Bangladeshi reference population (1000 Genomes phase 3) frequency of 18% and the overall South Asian frequency of 15% (17). Globally, the highest frequencies of rs58296998 T allele are found in East Asian populations, with the highest T allele frequency of 34% of the Chinese Dai in Xishuangbanna, China. The rs58296998 T allele is at lower frequencies within Africa, at 9% within the Luhya in Kenya, and is even less frequent in West Africa (3.5% to 5.5%) (Fig. 3).
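As a quick consistency check, the pooled T allele frequency can be recomputed from the rs58296998 genotype counts given in the GWAS results (TT = 26, CT = 272, CC = 745). The sketch below is illustrative only; the helper name is hypothetical and not part of the study's pipeline.

```python
def allele_frequency(n_hom_alt, n_het, n_hom_ref):
    """Frequency of the alternate allele from diploid genotype counts."""
    n_people = n_hom_alt + n_het + n_hom_ref
    if n_people == 0:
        raise ValueError("no genotyped individuals")
    # Each homozygote carries two copies of the allele, each heterozygote one.
    return (2 * n_hom_alt + n_het) / (2 * n_people)


# Genotype counts for rs58296998 as reported in the text: TT = 26, CT = 272, CC = 745.
t_freq = allele_frequency(26, 272, 745)
print(f"T allele frequency: {t_freq:.3f}")  # prints "T allele frequency: 0.155"
```

The resulting pooled frequency of about 15.5% falls within the 15.0% to 16.7% range reported for the three cohorts.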
FIG 3

Allele frequencies for allele T at top signal rs58296998 as determined by analysis of 1000 Genomes phase 3 data, as well as by analysis of case/control status in the three cohorts combined. Each pie chart on the map shows the frequency of the T allele with the black wedge. The remainder of each pie chart is colored in accordance with that T allele frequency. The inset provides the T allele frequency for children without any symptomatic cryptosporidiosis in the first year of life (controls; MAF = 13.6%) and for those with at least one diarrheal episode (cases; MAF = 25.0%).

Cases had their first diarrheal episode positive for Cryptosporidia at a mean of 242 days of age. We confirmed the GWAS results in a time-to-event analysis: the dosage of rs58296998 risk alleles was significantly associated with time to first diarrheal sample positive for Cryptosporidia among cases versus right-censored controls (up to the child’s first birthday) (P = 6.37 × 10−8). All children homozygous for the risk allele (TT) had their first episode in the first year of life (Fig. 2C). Among cases, however, there was no statistically significant association between rs58296998 genotype and time to infection (P = 0.095) (Fig. S2B). In PROVIDE, the rs58296998 genotype was associated with severity of diarrhea as determined by the Ruuska score (P = 0.028) (Fig. S2C).

Suggestive SNP associations with Cryptosporidium (P < 10−6) were also identified on chromosome 11 (chr11) and chr16. The strongest association on chromosome 11 (rs4758351) was found within an intergenic region of a cluster of olfactory receptor genes. Each copy of the rs4758351 A allele (MAF of 14%) conferred 2.39 times the odds of Cryptosporidium within the first year of life (P = 3.78 × 10−7) (Fig. S3A). Multiple SNPs in this region of chr11 (position 6015194 to position 6024551) had similar magnitudes and strengths of association with Cryptosporidium (OR, 2.13 to 2.39). The strongest association on chromosome 16 was with the rs9937140 SNP, located upstream of apolipoprotein O pseudogene 5 (APOOP5). Each copy of the rs9937140 G allele (MAF, 23%) conferred 1.99 times the odds of cryptosporidiosis (P = 7.75 × 10−7) (Fig. S3B).

FIG S3

LocusZoom plots of suggestive signals for GWAS. (A) LocusZoom plot of suggestive signal on chromosome 11 (rs4758351). Each dot represents a single SNP association from the meta-analysis of the three study-specific logistic regressions adjusting for HAZ (11), the first two study-specific principal components, and the batch for the DBC. The x axis represents the physical position along chromosome 11 with the gene locations indicated below. The y axis represents the log P value from the single SNP association. The fill represents the level of linkage disequilibrium (r2) between the top signal (rs4758351) and the surrounding SNPs. (B) LocusZoom of suggestive signal on chromosome 16 (rs9937140). Each dot represents a single SNP association from the meta-analysis of the three study-specific logistic regressions adjusting for HAZ (11), the first two study-specific principal components, and the batch for DBC. The x axis represents the physical position along chromosome 16 with the gene locations below. The y axis represents the log P value from the single SNP association. The fill represents the level of linkage disequilibrium (r2) between the top signal (rs9937140) and surrounding SNPs.

Expression and PrediXcan.

We used a publicly available resource, the Genotype-Tissue Expression (GTEx) Project, to estimate the influence of human genetic variation on human gene expression in multiple tissues (18, 19). The associated rs58296998 SNP, located in the PRKCA gene, is also associated with PRKCA expression. This expression quantitative trait locus (eQTL), i.e., a genetic variant previously shown to influence the expression of a gene, showed decreasing expression of PRKCA with each T allele in the esophageal muscularis (P = 3.12 × 10−5), the sigmoid colon (P = 4.61 × 10−4), and the esophageal mucosa (P = 7.50 × 10−4) (19). These expression data, coupled with the GWAS result, suggest that decreased expression of PRKCA is correlated with increased risk of symptomatic Cryptosporidium infection within the first year of life.

Additional genome-wide expression and gene set analyses.

In the absence of direct gene expression measurement, we relied on previously estimated tissue-specific associations between genome-wide SNPs and gene expression, which quantify the genetic component of gene expression. We estimated predicted patterns of genome-wide differential gene expression between cases and controls by weighting the summary statistics from our GWAS of cryptosporidiosis in the first year of life by the use of tissue-specific PredictDB weights. These SNP-level estimates were then combined for each gene to infer association between imputed gene expression and cryptosporidiosis (20, 21). No association of predicted gene expression with cryptosporidiosis reached statistical significance. A total of 13 genes showed a nominally significant (P < 0.001) association in more than one tissue-specific model (see Table S1 in the supplemental material; see also Fig. S4). Variants in the gene OTUD3 (OTU deubiquitinase 3) (chr1; position 20208356 to position 20239438) were associated with cryptosporidiosis in 18 different tissue-specific models (P < 0.001). In all tissue-specific models, individuals with predicted increased expression of OTUD3 had an increased risk of cryptosporidiosis within the first year of life (OR, 1.68 to 6.63; P = 8.46 × 10−5 to 8.97 × 10−4) (Fig. 4).
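The way SNP-level summary statistics are combined into a gene-level test can be sketched as follows. This section does not give the formula, so the sketch assumes the summary-statistic approximation commonly associated with S-PrediXcan/MetaXcan, in which the gene-level z-score is a weighted sum of SNP z-scores scaled by SNP and predicted-expression standard deviations from a reference panel. The function name and all inputs are hypothetical.

```python
import math


def gene_level_z(weights, snp_z, snp_cov):
    """Approximate gene-level z-score from SNP-level GWAS z-scores.

    weights: prediction weights for each SNP in one tissue-specific model
    snp_z:   GWAS z-scores (beta / standard error) for the same SNPs
    snp_cov: SNP genotype covariance matrix from a reference panel (list of lists)
    """
    n = len(weights)
    # Per-SNP genotype standard deviations come from the covariance diagonal.
    sigma_l = [math.sqrt(snp_cov[i][i]) for i in range(n)]
    # Variance of predicted expression: w' * Cov * w.
    var_g = sum(weights[i] * snp_cov[i][j] * weights[j]
                for i in range(n) for j in range(n))
    if var_g <= 0:
        raise ValueError("predicted expression has zero variance")
    numerator = sum(weights[i] * sigma_l[i] * snp_z[i] for i in range(n))
    return numerator / math.sqrt(var_g)


# Hypothetical two-SNP model with uncorrelated SNPs (illustrative values only).
z_gene = gene_level_z(weights=[0.4, -0.2],
                      snp_z=[3.1, -1.5],
                      snp_cov=[[0.25, 0.0], [0.0, 0.30]])
print(f"gene-level z = {z_gene:.2f}")
```

Because both the weights and the covariance matrix come from a reference panel, the result is only as good as the match between that panel and the study population, which is the limitation raised in the Discussion.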
FIG 4

OTUD3 region showing association with cryptosporidiosis in the first year of life. (A) Association of SNPs on chromosome 1 region, colored by linkage disequilibrium (r2) with index SNP (black diamond). (B) Association of case status with imputed gene expression in all tissues with P value of <0.001 and predicted expression performance of r2 > 0.1.

FIG S4

Shared associations for predicted gene expression, filtered for gene-tissue pairs with P values of <0.001.

TABLE S1

Results for metaXcan analysis evaluating association of predicted gene expression with cryptosporidiosis in the first year of life.

We also performed gene set enrichment analysis using MSigDB hallmark gene sets (n = 50), KEGG (n = 186) and BioCarta (n = 217) by combining gene-level summary statistics to examine aggregate signals within biological pathways. No pathways reached statistical significance after adjusting for multiple comparisons; however, data from several gene sets were suggestive (Table S2). The two top-ranked gene sets are among the hedgehog signaling pathways, namely, the hallmark hedgehog signaling pathway (empirical P value [Pemp] = 5.04 × 10−4) (Bayes factor [BF] = 515.65) and KEGG hedgehog signaling pathway (Pemp = 1.47 × 10−3) (BF = 235.59).

TABLE S2

Data from gene-set analysis determined on the basis of metaXcan results for association of predicted gene expression with cryptosporidiosis in the first year of life.

DISCUSSION

Here, we present the results of the first genome-wide association study of symptomatic Cryptosporidium infection. Specifically, we tested the role of host genetics in susceptibility to Cryptosporidium infection associated with diarrhea within the first year of life. A region on chromosome 17 was identified, with each additional T allele of rs58296998, an intronic SNP in PRKCA, conferring 2.4 times the odds of cryptosporidiosis within the first year of life. Additionally, this SNP was previously identified as an eQTL of PRKCA, with decreased expression of PRKCA associated with the T allele. This suggests that this SNP may influence Cryptosporidium infection through decreased expression of PRKCA. PRKCA encodes protein kinase C alpha (PKCα), an isotype of the protein kinase C (PKC) family, whose members are serine- and threonine-specific kinases known to be involved in diverse cellular signaling pathways. Specifically, PKCs have numerous roles in the development and function of the gastrointestinal tract (22) and in the immune response (23). The role in the immune response was confirmed with knockout experiments, in which PKCα was shown to be a positive regulator of Th17 cell effector functions. PKCα-deficient [Prkca(−/−)] cells failed to produce the appropriate levels of interleukin-17A (IL-17A) in vitro (23). An analysis of Cryptosporidium parvum-infected mice demonstrated the importance of the Th17 response to infection, showing increased levels of IL-17 mRNA and Th17 cell-related cytokines in gut tissue after infection (24). Additionally, both pharmacological inhibition and genetic PKCα inhibition have been shown to prevent NHE3 internalization, Na+ malabsorption, and tumor necrosis factor (TNF)-mediated diarrhea, despite continued barrier dysfunction (25), supporting the idea of a role for PRKCA in symptomatic cryptosporidiosis. This link between PRKCA and Th17 may be critical to gut infections and, specifically, to infection with Cryptosporidium in the developing infant gut. 
We identified a SNP that was associated with decreased expression of PRKCA, which may in turn leave the host less able to mediate the IL-17 immune response during Cryptosporidium infection. PRKCA has also been shown to be associated with numerous other infections, including infections by Staphylococcus aureus (26); with progression of sepsis (27) and toxoplasmosis (28); with Burkholderia cenocepacia infections in cystic fibrosis patients (29); and with hepatitis E virus replication (30). As an obligate intracellular parasite, Cryptosporidium relies on host cells to complete its life cycle in the human host; thus, it is also plausible that PRKCA directly mediates susceptibility via impacts on parasite invasion. Sporozoites invade brush border intestinal epithelial cells by inducing volume increases (31) and cytoskeletal remodeling at the site of host cell attachment (32), leading to engulfment via host membrane protrusions. Studies have shown that inhibition of host factors, including actin remodeling proteins and PKC enzymes, is sufficient to inhibit sporozoite invasion in vitro (32). Interestingly, PKCα has been shown to play an important role in Escherichia coli pathogenesis (33). Like Cryptosporidium, E. coli induces host actin condensation at the site of host cell invasion, and immunocytochemical studies indicate that activated PKCα colocalized with actin condensation at the bacterial entry site (34). While our top SNP within PRKCA has previously been shown to influence the expression of PRKCA in GTEx, our imputed gene expression analysis using PrediXcan did not reveal a significant difference in predicted levels of PRKCA expression between cases and controls. This was likely due to the difference between a single SNP being examined in GTEx and the combined effects of multiple eQTLs estimated from a European descent reference population in PrediXcan. A major limitation of predicted gene expression analyses is the lack of population specificity for non-European groups (35). 
The PrediXcan models were derived from individuals of European descent, as were the covariance structures used to infer correlations between eQTLs. We saw a direct relationship between population differences in allele frequencies for the weighted SNPs and impaired performance. Specifically, we observed the lowest predictive performance in tissues for which the informative SNPs had large differences in allele frequencies between European and South Asian populations in the 1000 Genomes Project phase 3 data (17) (see Fig. S5 in the supplemental material). These included two tissues, namely, esophageal mucosa and the colon sigmoid tissue, in which rs58296998 was identified as an eQTL for PRKCA. These trends highlight the importance of reference populations representative of global populations to ensure that tools are useful in non-European populations, such as ours.

We also identified an association of increased expression of OTUD3 with increased odds of cryptosporidiosis within the first year of life. This gene is associated with ulcerative colitis (36–42) and inflammatory bowel disease (43, 44). This finding is consistent with the hypothesis of a pathway shared between enteric infection and autoimmune intestinal disease, as indicated in a previous genetic analysis of Entamoeba histolytica infection in the same study population (45).

FIG S5

Gene expression prediction characteristics of PRKCA. (A) Correlation per tissue of differences in allele frequencies between European and South Asian populations with prediXcan weights for PRKCA. We saw that there is a statistically significant correlation (P < 0.05) in the tissues of interest: colon sigmoid and esophagus mucosa. (B) Correlation per tissue between weights and frequency differences versus the predictive performance in our participants. The tissues of interest (colon sigmoid, esophagus mucosa) show high correlation and low predictive performance r2. Fill indicates the log P value for correlation. (C) Difference per SNP between European and South Asian allele frequencies (EUR-SAS) versus the prediXcan weight. Fill indicates the South Asian allele frequencies. We note that many of the highest-weighted alleles are of low frequency or absent in South Asia.

Collapsing the predicted patterns of differentially expressed genes into gene sets, we found enrichment in the hedgehog signaling pathway. A previous study examined the gene expression profiles of long noncoding RNA (lncRNA) and mRNA in HCT-8 cells infected with C. parvum subtype IId (46). Of note, PRKCA was the most significantly differentially expressed gene in infected HCT-8 cells 24 h postinfection (2.24-fold decreased expression in infected cells; P = 3.82 × 10−5). Pathway analysis of the differentially expressed mRNAs found that genes in the hedgehog signaling pathway were significantly enriched during Cryptosporidium infection. This finding, in combination with our identification of hedgehog signaling in imputed gene expression profiles, is suggestive of a potential link between decreased PRKCA expression and hedgehog signaling; however, further research to confirm these findings and to elucidate the role of PRKCA genetic variation in gene expression and hedgehog pathway perturbation is needed.

A potential limitation of our study was that, due to the use of sensitive molecular diagnostics, multiple enteropathogens were frequently detected in each diarrheal sample. However, for Cryptosporidium we did not detect the same genetic signatures as those seen in our previous study of Entamoeba histolytica in this same study population (45). Further, coinfection with multiple pathogens would dilute the statistical signal for any one pathogen, and yet we found a statistically significant result for Cryptosporidium. Therefore, we are confident that our results are specific to cryptosporidiosis, despite cooccurrence with other enteric pathogens. 
Through a GWAS meta-analysis of three separate birth cohorts, we identified a region in PRKCA on chromosome 17 as being associated with increased risk of symptomatic cryptosporidiosis in the first year of life among Bangladeshi infants. This gene has previously been implicated in other infectious outcomes, indicating pleiotropy with the immune system’s reaction to numerous pathogens. Publicly available data support a link between our top SNP and expression of PRKCA, suggesting a mechanism operating via Th17 inflammatory control. Clinical trials are currently proposed for PKC isotypes, including PKC-alpha, for treatment of autoimmune disease (47). These treatments may also be important for cryptosporidiosis, which lacks treatment for young children, due to an underlying shared pathway identified in this study. Identifying host genetic variations associated with cryptosporidiosis, such as those in PRKCA, can help us identify viable drug targets to improve treatment and prevention of this major cause of morbidity and mortality. Further research is needed to elucidate the mechanism underlying this relationship and to better understand the complex interplay of genetic susceptibility and environmental influences in the development of intestinal disease.

MATERIALS AND METHODS

Study protocol.

The study protocol was approved by the Research and Ethical Review Committee of the International Center for Diarrheal Disease Research, Bangladesh, and by the Institutional Review Board of the University of Virginia and the Institutional Review Board of the Johns Hopkins Bloomberg School of Public Health. The parents or guardians of all individuals provided informed consent.

Dhaka Birth Cohort study design.

Designed to study the influence of malnutrition in child development, the Dhaka Birth Cohort (DBC) is a subset of a larger birth cohort recruited from the urban slum in the Mirpur Thana in Dhaka, Bangladesh. Children were enrolled within the first week after birth and followed up biweekly with household visits by trained field research assistants (FRAs) for the first year of life. Anthropometric measurements were collected at the time of enrollment and every 3 months thereafter. Length-for-age adjusted Z-scores (LAZ) were calculated by comparing the lengths and weights of study subjects with those of the World Health Organization (WHO) reference population, adjusting for age and sex, using WHO Anthro software, version 3.0.1. FRAs collected diarrheal stool samples from the home or study field clinic every time that the mother of the child reported diarrhea. The samples were transported, with a cold chain maintained, to the International Centre for Diarrheal Disease Research, Bangladesh (ICDDR,B) parasitology laboratory. The presence of Cryptosporidium was determined using enzyme-linked immunosorbent assay (ELISA). More details can be found in previously published reports by Steiner et al. (4) and Korpe et al. (9). We used a nested case-control design, where children with at least one diarrheal sample positive for Cryptosporidium within the first year were defined as “cases.” Children with diarrheal samples that were not positive for Cryptosporidium were defined as “controls.”

PROVIDE study design.

The “Performance of Rotavirus and Oral Polio Vaccines in Developing Countries” (PROVIDE) Study consists of a randomized controlled clinical trial and birth cohort from the same urban slum in the Mirpur Thana in Dhaka, Bangladesh, as the DBC and Cryptosporidiosis Birth Cohort (CBC) (see below). PROVIDE was specifically designed to assess the influence of various factors on oral vaccine efficacy among children in areas with high poverty, urban overcrowding, and poor sanitation. The 2-by-2 factorial design looked specifically at the efficacy of the 2-dose Rotarix oral rotavirus vaccine and oral polio vaccine (OPV) with an inactivated polio vaccine (IPV) boost over the first 2 years of life. All participants were from the Mirpur area of Dhaka, Bangladesh, with pregnant mothers recruited from the community by female Bangladeshi FRAs. Each participant had 15 scheduled follow-up clinic visits, as well as biweekly diarrhea surveillance through home visits by FRAs. The presence of Cryptosporidium in diarrheal samples was determined by ELISA. Consistent with the DBC phenotype definition, cases had at least one diarrheal sample positive for Cryptosporidium within the first year of life. Controls had at least one diarrheal sample available for testing, but none were positive for Cryptosporidium. Severity of diarrhea was determined with the Ruuska score, which assesses severity as a function of diarrhea length, clinical symptoms, and other clinical features (48).

Cryptosporidiosis Birth Cohort study design.

The Cryptosporidiosis Birth Cohort (“Cryptosporidiosis and Enteropathogens in Bangladesh”; ClinicalTrials.gov registration no. NCT02764918) is a prospective longitudinal birth cohort study in two sites in Bangladesh. The first site is in an urban, economically depressed neighborhood of Mirpur, and the second is in Mirzapur, a rural subdistrict 60 km northwest of Dhaka. The two birth cohorts were established in parallel, with the objective of understanding the incidence of cryptosporidiosis, the acquired immune response, and host genetic susceptibility to cryptosporidiosis in Bangladeshi children. Pregnant women were recruited and screened, and infants were enrolled at birth. Participants were followed twice-weekly with in-home visits to monitor for child morbidity and diarrhea for 24 months. Infant length and weight were measured every 3 months, and weight-for-age and length-for-age adjusted Z-scores were determined using World Health Organization Anthro software (version 3.2.2). Stool samples were collected during diarrheal illness and once per month for surveillance. Stool was tested for Cryptosporidium by quantitative PCR (qPCR) assay modified from a method reported previously by Liu et al. (49). A cycle threshold value of 40 was used. The pan-Cryptosporidium primers and probes target the 18S gene in multiple species known to infect humans (4).

Genotype data.

DNA for all three cohorts was extracted from blood samples collected in the first few months of follow-up. The Dhaka Birth Cohort (DBC) and PROVIDE Study data were generated and cleaned as described previously (45). A summary of quality control (QC) procedures is provided in Fig. S1 in the supplemental material. Briefly, a total of 396 children in the DBC were genotyped on three different Illumina arrays. Imputation to 1000Genomes phase 3 data was performed for all individuals. After postimputation QC, which included additional filtering for relatedness and for poorly imputed variants, a total of 396 individuals and 10.2 million SNPs were included in the DBC data freeze. For PROVIDE, a total of 541 individuals were genotyped on a Multi-Ethnic Genotyping Array (MEGA) (Illumina). After standard quality control measures were applied (retaining variants with minor allele frequency values of >0.5% and missingness values of <5%) and first-degree-related individuals were removed, a total of 499 individuals remained. After imputation to 1000Genomes and subsequent postimputation QC, a total of 499 individuals and 10.8 million genetic variants remained. For CBC, a total of 630 individuals were genotyped on a Multi-Ethnic Global Array (MEGA) (Illumina). One individual was removed for first-degree relatedness (PI_HAT > 0.2), 31 individuals were removed as PCA outliers, and 3 individuals were removed for heterozygosity. No individuals or SNPs were removed for missingness (>5%). Additional SNP-level filters removed variants with minor allele frequency (MAF) values of <0.5% (M = 751,869) and Hardy-Weinberg equilibrium P values of <10^-5 (M = 85). After all QC steps, CBC genotype data included 594 individuals and 826,228 SNPs. Phasing with SHAPEIT2 (50) was followed by imputation to 1000 Genomes phase 3 data (1000Genomes) (17) performed with IMPUTE2 (51, 52). All three studies were separately imputed to 1000Genomes.
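The SNP-level filters described for the CBC (MAF and Hardy-Weinberg equilibrium) can be illustrated with a minimal Python sketch. The function names are hypothetical, and the 1-degree-of-freedom chi-square test is a stand-in chosen for brevity; the text does not say which HWE test or software produced the reported counts.

```python
import math

def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Minor allele frequency from genotype counts (AA, AB, BB)."""
    n_alleles = 2 * (n_aa + n_ab + n_bb)
    freq_a = (2 * n_aa + n_ab) / n_alleles
    return min(freq_a, 1.0 - freq_a)

def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """Chi-square (1 df) goodness-of-fit P value for Hardy-Weinberg equilibrium.

    Assumption: an asymptotic test is used here; an exact test may be
    preferable for rare variants.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:
        return 1.0  # monomorphic site: the test is undefined, so do not flag it
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom
    return math.erfc(math.sqrt(chi2 / 2.0))

def passes_snp_qc(n_aa, n_ab, n_bb, min_maf=0.005, min_hwe_p=1e-5):
    """Keep a SNP only if MAF >= 0.5% and HWE P >= 1e-5 (thresholds from the text)."""
    return (minor_allele_frequency(n_aa, n_ab, n_bb) >= min_maf
            and hwe_chi2_pvalue(n_aa, n_ab, n_bb) >= min_hwe_p)
```

With these thresholds, a variant with genotype counts (81, 18, 1) is retained, whereas one with no heterozygotes at 50% allele frequency, such as (50, 0, 50), fails the HWE filter.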

Cross-study genetic data harmonization.

After imputation, all three data sets (DBC, PROVIDE, and CBC) were rechecked for relatedness (both within each study and between studies) to ensure independence. One individual was dropped from each pair related at the first or second degree (PI_HAT > 0.2). Individual outliers for heterozygosity (F > 5 standard deviations from the mean) were also excluded from further analysis. A total of 85 individuals were dropped from DBC, 9 from PROVIDE, and 34 from CBC. Only the top principal component from the combined data set was found to be significantly associated with outcome (Fig. S6).

FIG S6 Quality control metrics for combined GWAS. (A) Distribution of three studies for principal components 1 to 5, colored by study. Crypto (red), Cryptosporidiosis Birth Cohort (CBC); DBC (green), Dhaka Birth Cohort; provide (blue), PROVIDE Study. (B) Distribution of three studies for principal components 1 to 5, colored by case status. Cases are shown in blue and controls in red. Only the first principal component was significantly associated with case status. (C) Histogram of heterozygosity distribution by cohort. crypto, Cryptosporidiosis Birth Cohort; dbc, Dhaka Birth Cohort; provide, PROVIDE Study. The data on the x axis represent F, or the coefficient of heterozygosity.
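The heterozygosity exclusion used during harmonization can be sketched as follows. This is a minimal illustration and not the authors' code: the function name is hypothetical, the F values are assumed to have been computed beforehand, and "more than 5 standard deviations from the mean" is read here as a two-sided rule using the sample standard deviation.

```python
import statistics

def heterozygosity_outliers(f_values, n_sd=5.0):
    """Return indices of individuals whose F lies more than n_sd SDs from the mean.

    Assumption: the rule is two-sided (either excess or deficit of
    heterozygosity is flagged).
    """
    mean_f = statistics.mean(f_values)
    sd_f = statistics.stdev(f_values)
    if sd_f == 0:
        return []  # no variation, so nothing can be an outlier
    return [i for i, f in enumerate(f_values)
            if abs(f - mean_f) > n_sd * sd_f]
```

Because an extreme value inflates the standard deviation it is judged against, a rule like this only flags individuals when the sample is reasonably large, which holds for cohorts of several hundred children.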

Statistical analysis.

All three studies (DBC, PROVIDE, and CBC) were analyzed separately using logistic regression with an additive model accounting for imputed genotype weights in SNPTEST (51, 53, 54). All three analyses were adjusted for length-for-age Z-score (LAZ) at 1 year of age, for sex, and for the first two principal components. The Dhaka Birth Cohort was additionally conditioned on the genotyping array to account for batch effects. We combined the three analyses in a fixed-effects meta-analysis within META. Results were filtered for Phet values of >0.05, minor allele frequency (MAF) of >5%, and INFO score of >0.6 in all three studies, resulting in 6,504,706 SNPs. The conditional analyses were run separately by cohort for the PRKCA region, with each analysis being conditioned on rs58296998 in addition to the original covariates with SNPTEST. Results were again filtered for heterogeneity or Phet values of >0.05, MAF of >5%, and INFO score of >0.6 in all three studies.
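For readers unfamiliar with the meta-analysis step, the sketch below shows a standard inverse-variance-weighted fixed-effects combination of three per-cohort log odds ratios, with Cochran's Q as the heterogeneity statistic behind Phet. It is an illustration under stated assumptions rather than a reimplementation of META: the function names are hypothetical, and the filter reads "in all three studies" as applying to both MAF and INFO.

```python
import math

def fixed_effects_meta(betas, ses):
    """Inverse-variance fixed-effects meta-analysis of three log odds ratios."""
    if len(betas) != 3 or len(ses) != 3:
        raise ValueError("this sketch assumes exactly three cohorts")
    weights = [1.0 / se ** 2 for se in ses]
    w_sum = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / w_sum
    se = math.sqrt(1.0 / w_sum)
    # Two-sided P value from the standard normal distribution
    p_value = math.erfc(abs(beta / se) / math.sqrt(2.0))
    # Cochran's Q; with three studies it is chi-square with 2 df under homogeneity
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))
    p_het = math.exp(-q / 2.0)  # chi-square survival function for 2 df only
    return {"beta": beta, "se": se, "p": p_value, "q": q, "p_het": p_het}

def passes_meta_filters(p_het, mafs, infos):
    """Apply the reported filters: Phet > 0.05, MAF > 5%, and INFO > 0.6 per study."""
    return (p_het > 0.05
            and all(m > 0.05 for m in mafs)
            and all(i > 0.6 for i in infos))
```

As a usage example with made-up inputs, three cohort estimates of 0.80, 0.90, and 0.85 (each with a standard error of 0.30) combine to a pooled log odds ratio of 0.85, that is, an odds ratio of about exp(0.85) ≈ 2.3 per allele, with little evidence of heterogeneity.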

Allele frequencies.

The allele frequencies were derived from the 1000 Genomes Project phase 3 data, v5a (17). Individuals were stratified by their denoted population, with first-degree-related individuals removed.

GTEx and eQTL overlap GWAS results.

Expression quantitative trait loci (eQTLs) were identified through the use of the GTEx Portal (https://www.gtexportal.org/home/) on 6 August 2018 (19). The top SNP was identified as an eQTL for PRKCA with P values of <0.001 for multiple tissues. GTEx measured gene expression in 48 tissues and subsequently mapped genetic variation across the human genome to tissue-specific gene expression levels. Therefore, eQTLs are identified in a tissue-specific manner and annotated as such on the GTEx Portal.

MetaXcan imputation and association analysis.

To impute gene expression and its association with outcome from our GWAS summary statistics, we applied MetaXcan (S-PrediXcan and packaged best practices) (21). Weights were previously derived with GTEx v7 data in a population of subjects of European descent, with accompanying European-descent linkage disequilibrium metrics for the SNP covariance matrices (PredictDB Data Repository; http://predictdb.org/). MetaXcan was used instead of the original PrediXcan to ensure consistency in models with our GWAS. All 48 tissues were run separately for the meta-analysis results previously described. Following imputation and estimation of the association of gene expression with outcome, we calculated weights for each gene-tissue pair as the ratio of the number of SNPs used in the model to the total number prespecified in the model, multiplied by predicted expression performance. To determine associations across many tissues, a P value threshold of 0.001 was utilized. A strict Bonferroni correction for the 242,686 comparisons gave a P value threshold of 0.05/242,686 = 2.06 × 10^-7, under which no comparison was statistically significant. The relationships of allele frequencies in European and South Asian populations with PrediXcan weights were examined to assess prediction capacity (Fig. S5 and S7).
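The two calculations in this paragraph, the gene-tissue weight and the Bonferroni threshold, are simple enough to write out. The sketch below uses hypothetical function and parameter names, and the example inputs for the weight are invented for illustration.

```python
def gene_tissue_weight(n_snps_used, n_snps_in_model, predicted_performance):
    """Weight = (SNPs used / SNPs prespecified in the model) x predicted performance."""
    if n_snps_in_model <= 0:
        raise ValueError("the model must contain at least one SNP")
    return (n_snps_used / n_snps_in_model) * predicted_performance

def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison significance threshold under a Bonferroni correction."""
    return alpha / n_comparisons

# Threshold reported in the text: 0.05 / 242,686, approximately 2.06e-07
threshold = bonferroni_threshold(0.05, 242_686)
```

For instance, a model in which 45 of 50 prespecified SNPs were available and whose predicted expression performance is 0.2 would receive a weight of 0.9 × 0.2 = 0.18.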

Gene set enrichment analysis.

Gene set enrichment analysis was conducted on the previously described MetaXcan summary statistics for imputed gene expression. For each gene, we selected the tissue corresponding to the smallest P value. Using the program GIGSEA (Genotype Imputed Gene Set Enrichment Analysis [55]), we tested for associations of 453 curated gene sets defined by MSigDB hallmark gene sets (56), as well as KEGG (Kyoto Encyclopedia of Genes and Genomes; https://www.kegg.jp) and BioCarta (57) gene sets (58). To account for redundancy among overlapping gene sets, we utilized the weighted multiple linear regression model, using the matrix operation to increase speed, with a total of 1,000 permutations. A false-discovery rate of 0.05 was calculated on the ranked results.
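The text does not name the false-discovery-rate procedure applied to the ranked gene set results. As one common possibility, and purely as an assumption, the Benjamini-Hochberg step-up adjustment is sketched below; the function name is hypothetical and this is not GIGSEA's own code.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone adjusted values
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def significant_at_fdr(pvalues, fdr=0.05):
    """Indices of tests whose adjusted P value is at or below the FDR level."""
    return [i for i, q in enumerate(benjamini_hochberg(pvalues)) if q <= fdr]
```

Gene sets whose adjusted value falls at or below 0.05 would then be reported as enriched at that false-discovery rate.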

Data availability.

Data are publicly available from the NIH, via dbGAP, phs001478.v1.p1 (Exploration of the Biologic Basis for Underperformance of Oral Polio and Rotavirus Vaccines in Bangladesh), or by request from us. All analysis programs used are detailed above, but the actual code in R for each analysis is also available by request from us.

FIG S7 Gene expression prediction characteristics of OTUD3. Data represent correlation per tissue of differences in allele frequencies between European and South Asian populations with prediXcan weights for OTUD3 and correlation per tissue between weights and frequency differences versus the predictive performance in our participants. The tissues of interest (colon sigmoid, esophagus mucosa) show high correlation and low predictive performance (r2). Fill indicates the log P value for correlation. Data represent difference per SNP between European and South Asian allele frequencies (EUR-SAS) versus the prediXcan weight. Fill indicates the South Asian allele frequencies. We note that many of the highest-weighted alleles are of low frequency or absent in South Asia.