Carrie B Moore, Anurag Verma, Sarah Pendergrass, Shefali S Verma, Daniel H Johnson, Eric S Daar, Roy M Gulick, Richard Haubrich, Gregory K Robbins, Marylyn D Ritchie, David W Haas.
Abstract
Background. Phenome-Wide Association Studies (PheWAS) identify genetic associations across multiple phenotypes. Clinical trials offer opportunities for PheWAS to identify pharmacogenomic associations. We describe the first PheWAS to use genome-wide genotypic data and to utilize human immunodeficiency virus (HIV) clinical trials data. As proof-of-concept, we focused on baseline laboratory phenotypes from antiretroviral therapy-naive individuals.
Methods. Data from 4 AIDS Clinical Trials Group (ACTG) studies were split into 2 datasets: Dataset I (1181 individuals from protocol A5202) and Dataset II (1366 from protocols A5095, ACTG 384, and A5142). Final analyses involved 2547 individuals and 5 954 294 imputed polymorphisms. We calculated comprehensive associations between these polymorphisms and 27 baseline laboratory phenotypes.
Results. A total of 10 584 (0.17%) polymorphisms had associations with P < .01 in both datasets and with the same direction of association. Twenty polymorphisms replicated associations with identical or related phenotypes reported in the Catalog of Published Genome-Wide Association Studies, including several not previously reported in HIV-positive cohorts. We also identified several possibly novel associations.
Conclusions. These analyses define PheWAS properties and principles with baseline laboratory data from HIV clinical trials. This approach may be useful for evaluating on-treatment HIV clinical trials data for associations with various clinical phenotypes.
Keywords: HIV-1; PheWAS; antiretroviral therapy; clinical trials; pharmacogenomics
Year: 2015 PMID: 25884002 PMCID: PMC4396430 DOI: 10.1093/ofid/ofu113
Source DB: PubMed Journal: Open Forum Infect Dis ISSN: 2328-8957 Impact factor: 3.835
Information Regarding ACTG Protocols
| Genotyping Phase | PheWAS Dataset | Study | Number of Subjects | Self-Reported Race/Ethnicity | % Provided DNA | References |
|---|---|---|---|---|---|---|
| I | I | A5095 | 1147 | 40% White, 37% Black, 21% Hispanic | 88 | [ |
| II | I | ACTG 384 | 898 | 46% White, 35% Black, 17% Hispanic | 63a | [ |
| II | I | A5142 | 757 | 36% White, 42% Black, 19% Hispanic | 87 | [ |
| III | II | A5202 | 1864 | 47% White, 26% Black, 25% Hispanic | 87 | [ |
Abbreviations: ACTG, AIDS Clinical Trials Group; PheWAS, Phenome-Wide Association Studies.
a The proportion of participants providing genetic consent and DNA in protocol ACTG 384 is lower than in other protocols because the ACTG's genetic consent protocol A5128 became available in 2002, as follow-up of ACTG 384 participants was ending.
Figure 1. Study flowchart for genotypic and phenotypic data and analyses. The graphic illustrates steps used for quality control of genotypic and phenotypic data, imputation of genotypic data, criteria for passing the filtering threshold for associations across 2 datasets, and software tools used for result interpretation. Abbreviations: GWAS, Genome-Wide Association Studies; MAF, minor allele frequency; PheWAS, Phenome-Wide Association Studies; SNP, single-nucleotide polymorphism.
Data for 27 Pretreatment Laboratory Phenotypesa and Summary Statistics
| Phenotypes | Sample Size | Median | Min | Max | Transformation |
|---|---|---|---|---|---|
| Absolute basophil countb | 2739 | – | – | – | Binary |
| Absolute eosinophil count | 2809 | 2.06 | 0 | 3.51 | Natural log |
| Absolute lymphocyte count | 2847 | 3.13 | 0 | 4.83 | Natural log |
| Absolute monocyte count | 2823 | 2.6 | 0 | 4.36 | Natural log |
| Absolute neutrophil count | 2957 | 3.32 | 2.54 | 4.03 | Natural log |
| ALT | 2960 | 1.51 | 0.3 | 2.29 | Natural log |
| Alkaline phosphatase | 2966 | 1.89 | 0.78 | 2.72 | Natural log |
| AST | 2964 | 1.48 | 1.04 | 2.27 | Natural log |
| Blood urea nitrogen | 2954 | 1.11 | 0.2 | 2.17 | Natural log |
| Carbon dioxide/bicarbonate | 2664 | 26 | 12 | 35 | |
| CD4 T-cell countb | 3286 | 224 | 0 | 1336 | Square root |
| CD8 T-cell count | 3286 | 2.88 | 1.54 | 3.76 | Natural log |
| Chloride | 2773 | 103 | 89 | 116 | |
| Creatinine | 2986 | 0.9 | 0 | 2.5 | |
| Glucose (fasting) | 1761 | 1.93 | 1.53 | 2.64 | Natural log |
| Triglycerides (fasting) | 2023 | 2.06 | 1.11 | 3.45 | Natural log |
| Glucose (nonfasting) | 1175 | 1.93 | 1.48 | 2.6 | Natural log |
| Hematocrit | 3010 | 40 | 18 | 57.5 | |
| Hemoglobin | 3026 | 13.9 | 6 | 20 | |
| HDL-C | 2456 | 1.56 | 0.7 | 2.17 | Natural log |
| LDL-C | 2235 | 94 | 12 | 262 | |
| Platelet count | 3000 | 202 | 36 | 648 | |
| Potassium | 2773 | 4.1 | 2.2 | 5.7 | |
| HIV-1 RNA | 3269 | 4.61 | 0.95 | 7.27 | Natural log |
| Sodium | 2776 | 139 | 127 | 151 | |
| Total bilirubin | 2925 | 0.5 | 0.1 | 2.3 | |
| Total cholesterol | 2852 | 158 | 6 | 350 | |
Abbreviations: ALT, alanine amino transferase; AST, aspartate amino transferase; HDL-C, high-density lipoprotein cholesterol; HIV, human immunodeficiency virus; LDL-C, low-density lipoprotein cholesterol; Max, maximum; Min, minimum.
a Original units (before transformations) for each phenotype were as follows: cells × 10^3/µL for absolute basophil count, absolute eosinophil count, absolute lymphocyte count, absolute monocyte count, absolute neutrophil count, and platelet count; U/L for ALT, alkaline phosphatase, and AST; mg/dL for blood urea nitrogen, creatinine, glucose (fasting), glucose (nonfasting), triglycerides (fasting), HDL-C, LDL-C, total bilirubin, and total cholesterol; mmol/L for carbon dioxide/bicarbonate, chloride, potassium, and sodium; cells/µL for CD4 T-cell count and CD8 T-cell count; % for hematocrit; g/dL for hemoglobin; copies/mL for HIV-1 RNA.
b Absolute basophil count measurements were used in regressions both after natural log transformation and after dichotomization into a binary variable. CD4 T-cell counts were used only as a covariate, not as a dependent variable.
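The Transformation column above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `transform` and the binary cutoff (`value > 0`) are hypothetical, since the source does not state how basophil counts were dichotomized or how zero values were handled before log transformation.

```python
import math

def transform(value: float, kind: str) -> float:
    """Apply one of the transformations named in the table (illustrative only)."""
    if kind == "natural log":
        # math.log raises ValueError for value <= 0; the source does not
        # say how zeros were treated, so no offset is assumed here.
        return math.log(value)
    if kind == "square root":
        return math.sqrt(value)
    if kind == "binary":
        # Hypothetical cutoff: any detectable count maps to 1.
        return 1.0 if value > 0 else 0.0
    if kind == "":
        # Blank Transformation cell: value used on its original scale.
        return value
    raise ValueError(f"unknown transformation: {kind!r}")

# Example: square-root transform of a CD4 T-cell count of 224 cells/µL.
cd4_sqrt = transform(224.0, "square root")
```

A blank entry in the Transformation column is treated here as "no transformation", which matches the untransformed ranges reported for phenotypes such as sodium and platelet count.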
Genotyping Summary by Dataset and Phase
| Dataset | Phase | Study | N | Number of SNPs | Assay |
|---|---|---|---|---|---|
| Dataset 1 | Phase I | A5095 | 798 | 631 476 | Illumina 650Y |
| Dataset 1 | Phase II | A384, A5142 | 898 | 1 199 187 | Illumina 1M Duo |
| Dataset 2 | Phase III | A5202 | 1221 | 1 199 187 | Illumina 1M Duo |
Abbreviation: SNP, single-nucleotide polymorphism.
Figure 2. A Manhattan plot representing phenotype-single-nucleotide polymorphism (SNP) pairs that meet the P-value threshold. Each marker represents a phenotype-SNP pair with P < .01 in both datasets, with the same direction of association. Red markers represent Dataset I, and blue markers represent Dataset II. The peak on chromosome 2 is for total bilirubin with rs887829 in the UGT1A locus (Dataset I, β = 0.149 and P value = 4.04 × 10^−35; Dataset II, β = 0.115 and P value = 7.05 × 10^−31).
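The filtering criterion in this caption (P < .01 in both datasets, same direction of association) can be sketched in Python as follows. This is an illustration under stated assumptions, not the study's software: the `Assoc` record, the `replicated_pairs` function, and the two `rs_hypothetical_*` entries are invented for the example. Only the rs887829 values come from the caption.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Assoc:
    snp: str
    phenotype: str
    beta: float  # regression coefficient (direction of association)
    p: float     # association P value

def replicated_pairs(ds1, ds2, alpha=0.01):
    """Return (snp, phenotype) pairs with P < alpha in both datasets
    and betas of the same sign."""
    by_key = {(a.snp, a.phenotype): a for a in ds2}
    hits = []
    for a in ds1:
        b = by_key.get((a.snp, a.phenotype))
        if b is None:
            continue  # pair was not tested in the second dataset
        same_direction = a.beta * b.beta > 0
        if a.p < alpha and b.p < alpha and same_direction:
            hits.append((a.snp, a.phenotype))
    return hits

dataset_1 = [
    Assoc("rs887829", "total bilirubin", 0.149, 4.04e-35),  # from the caption
    Assoc("rs_hypothetical_1", "LDL-C", 0.05, 0.004),       # invented
    Assoc("rs_hypothetical_2", "sodium", 0.02, 0.2),        # invented
]
dataset_2 = [
    Assoc("rs887829", "total bilirubin", 0.115, 7.05e-31),  # from the caption
    Assoc("rs_hypothetical_1", "LDL-C", -0.04, 0.008),      # opposite direction
    Assoc("rs_hypothetical_2", "sodium", 0.03, 0.005),      # fails P in dataset 1
]

hits = replicated_pairs(dataset_1, dataset_2)
```

In this toy input only the rs887829 and total bilirubin pair passes: the first invented pair has opposite signs across datasets, and the second misses the P-value threshold in one dataset.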
Figure 3. Results for single-nucleotide polymorphisms (SNPs) also in the Genome-Wide Association Studies (GWAS) Catalog, regardless of catalog phenotype. The track on the left indicates the chromosomal location of each SNP; the next track indicates the SNP, the associated phenotype in our study, and (in parentheses) the GWAS Catalog phenotype. The next track indicates whether our association was an “exact” match with the GWAS Catalog phenotype, “related” with similarity to the GWAS Catalog phenotype, or “novel” with no apparent similarity to the GWAS Catalog phenotype. All P values less than 1 × 10^−10 are represented by a larger triangle. Triangles point to the right if beta is positive and to the left if beta is negative. Abbreviations: HDL-C, high-density lipoprotein cholesterol; HIV, human immunodeficiency virus; LDL-C, low-density lipoprotein cholesterol.
Figure 4. The graphic illustrates study associations replicating previously reported genotype-phenotype associations. The left-most track indicates chromosome and coordinate position. The single-nucleotide polymorphism (SNP) for each association is listed, with the associated clinical laboratory measurement. Each phenotype for which we replicated a previously reported result is listed in the next track, with boxes to the right indicating the phenotype for the previously reported SNP-phenotype association: green, total bilirubin levels; brown, high-density lipoprotein cholesterol (HDL-C); blue, absolute neutrophil count; black, total cholesterol levels. Dark green triangles and dark blue triangles represent P values from Dataset I and Dataset II, respectively. Right-pointing triangles indicate positive direction of association, and left-pointing triangles indicate negative direction of association. Abbreviations: HDL, high-density lipoprotein; HIV, human immunodeficiency virus.