Ruth Johnson, Yi Ding, Vidhya Venkateswaran, Arjun Bhattacharya, Kristin Boulier, Alec Chiu, Sergey Knyazev, Tommer Schwarz, Malika Freund, Lingyu Zhan, Kathryn S Burch, Christa Caggiano, Brian Hill, Nadav Rakocz, Brunilda Balliu, Christopher T Denny, Jae Hoon Sul, Noah Zaitlen, Valerie A Arboleda, Eran Halperin, Sriram Sankararaman, Manish J Butte, Clara Lajonchere, Daniel H Geschwind, Bogdan Pasaniuc.
Abstract
BACKGROUND: Large medical centers in urban areas such as Los Angeles care for a diverse patient population and offer the potential to study the interplay between genetic ancestry and social determinants of health. Here, we explore the implications of genetic ancestry within the University of California, Los Angeles (UCLA) ATLAS Community Health Initiative, an ancestrally diverse biobank of genomic data linked with de-identified electronic health records (EHRs) of UCLA Health patients (N = 36,736).
Keywords: Biobank; Electronic health records; Genetic ancestry; Genome-wide association studies; Phenome-wide association studies
Year: 2022 PMID: 36085083 PMCID: PMC9461263 DOI: 10.1186/s13073-022-01106-x
Source DB: PubMed Journal: Genome Med ISSN: 1756-994X Impact factor: 15.266
Fig. 1. Self-identified race/ethnicity (SIRE) and genetically inferred ancestry (GIA) are not analogous. A Sankey diagram shows the sample-size breakdown of individuals across GIA groups and SIRE groups for all individuals in ATLAS (N = 36,736).
Fig. 2. Global PCA reflects self-identified race/ethnicity and language of ATLAS participants. (A) Genetic PCs 1 and 2 of individuals in ATLAS (N = 36,736), shaded by continental GIA as inferred from 1000 Genomes. (B, C) The first two genetic PCs of the ATLAS participants, shaded by SIRE and preferred language, respectively. To improve visualization in (C), only languages with >10 responses were assigned a color.
Fig. 3. PCA of individuals with inferred East Asian American, European American, and Hispanic Latino American genetic ancestry in ATLAS captures fine-scale subcontinental ancestry groupings. PCA was performed separately within each continental GIA in ATLAS with the corresponding subcontinental ancestry samples from 1000 Genomes: (A) East Asian American, (B) European American, (C) Hispanic Latino American. Cluster annotation labels were determined using a combination of samples from 1000 Genomes and self-identified race, ethnicity, and language information from the EHR.
Fig. 4. IBD sharing between ATLAS participants. InfoMap community membership is indicated by color for all communities with >100 individuals (20 communities total) and individuals with a degree >30. Community membership indicates elevated shared IBD within that community. Community identity is labeled adjacent to the network plot in the corresponding color.
Fig. 5. Disease associations vary across continental genetically inferred ancestry groups in ATLAS. We show the odds ratio computed from associating each phenotype with individuals’ genetically inferred ancestry in ATLAS (N = 36,736) under a logistic regression model. Error bars represent 95% confidence intervals.
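To make the plotted quantities concrete, the following is a minimal sketch of an odds ratio with a Wald 95% confidence interval. It assumes the simplest possible case, a single binary predictor (membership in one GIA group) and no covariates, where the logistic regression estimate reduces to the 2×2 cross-product ratio. The function name and the counts are hypothetical, and the model used for the figure may include adjustments that this sketch omits.

```python
import math

def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: cases in the group,      b: non-cases in the group
    c: cases outside the group, d: non-cases outside the group
    All four counts must be positive.
    """
    log_or = math.log((a * d) / (b * c))
    # Standard error of the log odds ratio (Woolf's formula)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return (math.exp(log_or),
            math.exp(log_or - z * se),
            math.exp(log_or + z * se))

# Made-up counts, for illustration only (not ATLAS data)
or_, lo, hi = odds_ratio_wald_ci(30, 70, 20, 80)
print(f"OR = {or_:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

The interval is symmetric on the log-odds scale, which is why error bars of this kind look asymmetric when odds ratios are plotted on a linear axis.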
Fig. 6. Global ancestry correlates with disease prevalence in admixed individuals. Individuals by SIRE who have had a diagnosis of (A) chronic nonalcoholic liver disease, (B) uterine leiomyoma, or (C) liver/intrahepatic bile duct cancer are binned by their proportions of either European, African, Native American, or East Asian ancestry estimated using ADMIXTURE. Within each bin, we plot the prevalence of the diagnoses, with error bars of ±1.96 standard errors (SE) of the computed frequencies.
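The binning described in this caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function name, the equal-width bins, and the binomial standard error sqrt(p(1 − p)/n) are all assumptions, and the input values are made up.

```python
import math

def binned_prevalence(proportions, diagnosed, n_bins=5, z=1.96):
    """Prevalence of a diagnosis within equal-width ancestry-proportion bins.

    proportions: ancestry proportion per individual, each in [0, 1]
    diagnosed:   True/False (or 1/0) diagnosis flag per individual
    Returns one (prevalence, lower, upper) tuple per bin, or None for
    an empty bin. Bounds are prevalence -/+ z * binomial SE (assumed).
    """
    counts = [[0, 0] for _ in range(n_bins)]  # [cases, total] per bin
    for q, dx in zip(proportions, diagnosed):
        i = min(int(q * n_bins), n_bins - 1)  # q == 1.0 goes in the last bin
        counts[i][0] += int(dx)
        counts[i][1] += 1
    result = []
    for cases, total in counts:
        if total == 0:
            result.append(None)
            continue
        p = cases / total
        se = math.sqrt(p * (1 - p) / total)
        result.append((p, p - z * se, p + z * se))
    return result

# Made-up toy input, for illustration only
bins = binned_prevalence([0.05, 0.10, 0.15, 0.90, 0.95],
                         [1, 0, 0, 1, 1], n_bins=2)
```

With very small bins the normal approximation behind ±1.96 SE is poor (a bin where everyone is diagnosed gets a zero-width interval), which is one reason to keep bins reasonably populated.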
Fig. 7. Recapitulating known associations for chronic nonalcoholic liver disease in ancestry-specific and multi-ancestry meta-analyses in ATLAS. GWAS Manhattan plots for chronic nonalcoholic liver disease in the (A) European American, (B) Hispanic Latino American, (C) African American, (D) East Asian American GIA groups in ATLAS, and (E) the meta-analysis performed across all 4 GIA groups. The red dashed line denotes genome-wide significance (p-value < 5×10⁻⁸). We recapitulate a known association at the 22q13.31 locus.
Fig. 8. Identifying correlated phenotypes at rs2294915 in both the Hispanic Latino American and European American GIA groups in ATLAS. We show a PheWAS plot at rs2294915 for the Hispanic Latino American (top) and European American (bottom) GIA groups. The red dashed line denotes p-value = 4.09×10⁻⁵, the significance threshold after adjusting for the number of tested phenotypes. The red dotted line denotes the significance threshold after correcting for both genome-wide significance and the number of tested phenotypes (p-value = 4.09×10⁻¹¹).
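The two thresholds in this caption are related by simple arithmetic, sketched below. The family-wise error rate of 0.05 and the Bonferroni-style division are assumptions; the caption does not state either, nor the number of tested phenotypes, so the test count here is only back-calculated from the reported threshold.

```python
ALPHA = 0.05               # assumed family-wise error rate (not stated in the caption)
PHEWAS_THRESHOLD = 4.09e-5 # reported: adjusted for the number of tested phenotypes
GENOME_WIDE = 5e-8         # conventional genome-wide significance level

# Number of tests implied by a Bonferroni correction at ALPHA
implied_tests = ALPHA / PHEWAS_THRESHOLD

# Correcting the genome-wide level for the same number of tests
combined_threshold = GENOME_WIDE / implied_tests

print(f"implied number of tests ~ {implied_tests:.1f}")
print(f"combined threshold ~ {combined_threshold:.3g}")
```

Under these assumptions the combined value agrees with the dotted-line threshold reported in the caption, and the implied test count is on the order of 1,200 phenotypes.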