Amy R Bentley1, Yun J Sung2, Michael R Brown3, Thomas W Winkler4, Aldi T Kraja5, Ioanna Ntalla6, Kenneth Rice7, Patricia B Munroe6,8, Alanna C Morrison3, Dabeeru C Rao2, Charles N Rotimi9, L Adrienne Cupples10,11, Karen Schwander2, Daniel I Chasman12,13, Elise Lim14, Xuan Deng14, Xiuqing Guo15, Jingmin Liu16, Yingchang Lu17, Ching-Yu Cheng18,19,20, Xueling Sim21, Dina Vojinovic22, Jennifer E Huffman23, Solomon K Musani24, Changwei Li25, Mary F Feitosa5, Melissa A Richard26, Raymond Noordam27, Jenna Baker28, Guanjie Chen28, Hugues Aschard29,30, Traci M Bartz31,32, Jingzhong Ding33, Rajkumar Dorajoo34, Alisa K Manning35,36, Tuomo Rankinen37, Albert V Smith38,39, Salman M Tajuddin40, Wei Zhao41, Mariaelisa Graff42, Maris Alver43, Mathilde Boissel44, Jin Fang Chai21, Xu Chen45, Jasmin Divers46, Evangelos Evangelou47,48, Chuan Gao49, Anuj Goel50,51, Yanick Hagemeijer52, Sarah E Harris53,54, Fernando P Hartwig55,56, Meian He57, Andrea R V R Horimoto58, Fang-Chi Hsu46, Yi-Jen Hung59,60, Anne U Jackson61, Anuradhani Kasturiratne62, Pirjo Komulainen63, Brigitte Kühnel64,65, Karin Leander66, Keng-Hung Lin67, Jian'an Luan68, Leo-Pekka Lyytikäinen69,70, Nana Matoba71, Ilja M Nolte72, Maik Pietzner73,74, Bram Prins75, Muhammad Riaz76,77, Antonietta Robino78, M Abdullah Said52, Nicole Schupf79, Robert A Scott68, Tamar Sofer36,80, Alena Stancáková81, Fumihiko Takeuchi82, Bamidele O Tayo83, Peter J van der Most72, Tibor V Varga84, Tzung-Dau Wang85,86,87, Yajuan Wang88, Erin B Ware89, Wanqing Wen90, Yong-Bing Xiang91, Lisa R Yanek92, Weihua Zhang93,94, Jing Hua Zhao68, Adebowale Adeyemo28, Saima Afaq93, Najaf Amin22, Marzyeh Amini72, Dan E Arking95, Zorayr Arzumanyan15, Tin Aung18,20,96, Christie Ballantyne97,98, R Graham Barr99, Lawrence F Bielak41, Eric Boerwinkle3,100, Erwin P Bottinger17, Ulrich Broeckel101, Morris Brown6,8, Brian E Cade80, Archie Campbell102, Mickaël Canouil44, Sabanayagam Charumathi18,19, Yii-Der Ida Chen15, Kaare Christensen103, Maria Pina Concas78, John M 
Connell104, Lisa de Las Fuentes2,105, H Janaka de Silva106, Paul S de Vries3, Ayo Doumatey28, Qing Duan107, Charles B Eaton108, Ruben N Eppinga52, Jessica D Faul89, James S Floyd32,109, Nita G Forouhi68, Terrence Forrester110, Yechiel Friedlander111, Ilaria Gandin112, He Gao47, Mohsen Ghanbari22,113, Sina A Gharib114, Bruna Gigante66, Franco Giulianini12, Hans J Grabe115, C Charles Gu2, Tamara B Harris116, Sami Heikkinen81,117, Chew-Kiat Heng118,119, Makoto Hirata120, James E Hixson3, M Arfan Ikram22,121,122, Yucheng Jia15, Roby Joehanes123,124, Craig Johnson125, Jost Bruno Jonas126,127, Anne E Justice42, Tomohiro Katsuya128,129, Chiea Chuen Khor34, Tuomas O Kilpeläinen130,131, Woon-Puay Koh21,132, Ivana Kolcic133, Charles Kooperberg134, Jose E Krieger58, Stephen B Kritchevsky135, Michiaki Kubo136, Johanna Kuusisto81, Timo A Lakka63,117,137, Carl D Langefeld46, Claudia Langenberg68, Lenore J Launer116, Benjamin Lehne138, Cora E Lewis139, Yize Li2, Jingjing Liang88, Shiow Lin5, Ching-Ti Liu14, Jianjun Liu34,140, Kiang Liu141, Marie Loh93,142,143,144, Kurt K Lohman46, Tin Louie7, Anna Luzzi15, Reedik Mägi43, Anubha Mahajan51, Ani W Manichaikul145, Colin A McKenzie146, Thomas Meitinger147,148,149, Andres Metspalu43, Yuri Milaneschi150, Lili Milani43, Karen L Mohlke107, Yukihide Momozawa151, Andrew P Morris51,152, Alison D Murray153, Mike A Nalls154,155, Matthias Nauck73,74, Christopher P Nelson76,77, Kari E North42, Jeffrey R O'Connell156,157, Nicholette D Palmer158, George J Papanicolau159, Nancy L Pedersen45, Annette Peters65,160, Patricia A Peyser41, Ozren Polasek133,161,162, Neil Poulter163, Olli T Raitakari164,165, Alex P Reiner134, Frida Renström84,166, Treva K Rice2, Stephen S Rich145, Jennifer G Robinson167, Lynda M Rose12, Frits R Rosendaal168, Igor Rudan169, Carsten O Schmidt170, Pamela J Schreiner171, William R Scott138,172, Peter Sever172, Yuan Shi18, Stephen Sidney173, Mario Sims24, Jennifer A Smith41,89, Harold Snieder72, John M Starr53,174, Konstantin 
Strauch175,176, Heather M Stringham61, Nicholas Y Q Tan18, Hua Tang177, Kent D Taylor15, Yik Ying Teo21,34,178,179,180, Yih Chung Tham18, Henning Tiemeier22,181, Stephen T Turner182, André G Uitterlinden22,183, Diana van Heemst27, Melanie Waldenberger64,65, Heming Wang36,80, Lan Wang14, Lihua Wang5, Wen Bin Wei184, Christine A Williams5, Gregory Wilson185, Mary K Wojczynski5, Jie Yao15, Kristin Young42, Caizheng Yu57, Jian-Min Yuan186,187, Jie Zhou28, Alan B Zonderman188, Diane M Becker92, Michael Boehnke61, Donald W Bowden158, John C Chambers47,93,94,144,189, Richard S Cooper83, Ulf de Faire66, Ian J Deary53,190, Paul Elliott93, Tõnu Esko43,191, Martin Farrall50,51, Paul W Franks84,192,193,194, Barry I Freedman195, Philippe Froguel44,196, Paolo Gasparini78,112, Christian Gieger65,197, Bernardo L Horta55, Jyh-Ming Jimmy Juang86,87, Yoichiro Kamatani71, Candace M Kammerer198, Norihiro Kato82, Jaspal S Kooner93,94,172,189, Markku Laakso81, Cathy C Laurie7, I-Te Lee199,200,201, Terho Lehtimäki69,70, Patrik K E Magnusson45, Albertine J Oldehinkel202, Brenda W J H Penninx150, Alexandre C Pereira58, Rainer Rauramaa63, Susan Redline80, Nilesh J Samani76,77, James Scott172, Xiao-Ou Shu90, Pim van der Harst52,203, Lynne E Wagenknecht204, Jun-Sing Wang199,201, Ya Xing Wang127, Nicholas J Wareham68, Hugh Watkins50,51, David R Weir89, Ananda R Wickremasinghe62, Tangchun Wu57, Eleftheria Zeggini75,205, Wei Zheng90, Claude Bouchard37, Michele K Evans40, Vilmundur Gudnason38,39, Sharon L R Kardia41, Yongmei Liu206, Bruce M Psaty32,109,207,208, Paul M Ridker12,13, Rob M van Dam21,140, Dennis O Mook-Kanamori168,209, Myriam Fornage3,26, Michael A Province5, Tanika N Kelly210, Ervin R Fox211, Caroline Hayward23, Cornelia M van Duijn22,212, E Shyong Tai21,132,140, Tien Yin Wong18,20,96, Ruth J F Loos17,213, Nora Franceschini42, Jerome I Rotter15, Xiaofeng Zhu88, Laura J Bierut214, W James Gauderman215. 1. 
Center for Research on Genomics and Global Health, National Human Genome Research Institute, US National Institutes of Health, Bethesda, MD, USA. amy.bentley@nih.gov. 2. Division of Biostatistics, Washington University School of Medicine, St. Louis, MO, USA. 3. Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, University of Texas Health Science Center at Houston, Houston, TX, USA. 4. Department of Genetic Epidemiology, University of Regensburg, Regensburg, Germany. 5. Division of Statistical Genomics, Department of Genetics, Washington University School of Medicine, St. Louis, MO, USA. 6. Clinical Pharmacology, William Harvey Research Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London, UK. 7. Department of Biostatistics, University of Washington, Seattle, WA, USA. 8. NIHR Barts Cardiovascular Biomedical Research Centre, Queen Mary University of London, London, UK. 9. Center for Research on Genomics and Global Health, National Human Genome Research Institute, US National Institutes of Health, Bethesda, MD, USA. rotimic@mail.nih.gov. 10. Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA. adrienne@bu.edu. 11. Framingham Heart Study, National Heart, Lung, and Blood Institute, US National Institutes of Health, Bethesda, MD, USA. adrienne@bu.edu. 12. Division of Preventive Medicine, Brigham and Women's Hospital, Boston, MA, USA. 13. Harvard Medical School, Boston, MA, USA. 14. Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA. 15. Institute for Translational Genomics and Population Sciences, Department of Pediatrics, Los Angeles Biomedical Research Institute at Harbor-UCLA Medical Center, Torrance, CA, USA. 16. Women's Health Initiative Clinical Coordinating Center, Fred Hutchinson Cancer Research Center, Seattle, WA, USA. 17. 
Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA. 18. Singapore Eye Research Institute, Singapore National Eye Centre, Singapore, Singapore. 19. Centre for Quantitative Medicine, Academic Medicine Research Institute, Ophthalmology and Visual Sciences Academic Clinical Program (Eye ACP), Duke-NUS Medical School, Singapore, Singapore. 20. Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore. 21. Saw Swee Hock School of Public Health, National University of Singapore and National University Health System, Singapore, Singapore. 22. Department of Epidemiology, Erasmus University Medical Center, Rotterdam, the Netherlands. 23. Medical Research Council Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, UK. 24. Jackson Heart Study, Department of Medicine, University of Mississippi Medical Center, Jackson, MS, USA. 25. Epidemiology and Biostatistics, University of Georgia at Athens College of Public Health, Athens, GA, USA. 26. Brown Foundation Institute of Molecular Medicine, University of Texas Health Science Center at Houston, Houston, TX, USA. 27. Internal Medicine, Gerontology and Geriatrics, Leiden University Medical Center, Leiden, the Netherlands. 28. Center for Research on Genomics and Global Health, National Human Genome Research Institute, US National Institutes of Health, Bethesda, MD, USA. 29. Centre de Bioinformatique, Biostatistique, et Biologie Intégrative (C3BI), Institut Pasteur, Paris, France. 30. Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, USA. 31. Cardiovascular Health Research Unit, Department of Biostatistics, University of Washington, Seattle, WA, USA. 32. Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA. 33. 
Center on Diabetes, Obesity, and Metabolism, Gerontology and Geriatric Medicine, Wake Forest University Health Sciences, Winston-Salem, NC, USA. 34. Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore, Singapore. 35. Clinical and Translational Epidemiology Unit, Massachusetts General Hospital, Boston, MA, USA. 36. Department of Medicine, Harvard Medical School, Boston, MA, USA. 37. Human Genomics Laboratory, Pennington Biomedical Research Center, Baton Rouge, LA, USA. 38. Icelandic Heart Association, Kopavogur, Iceland. 39. Faculty of Medicine, University of Iceland, Reykjavik, Iceland. 40. Health Disparities Research Section, Laboratory of Epidemiology and Population Sciences, National Institute on Aging, US National Institutes of Health, Baltimore, MD, USA. 41. Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI, USA. 42. Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA. 43. Estonian Genome Center, Institute of Genomics, University of Tartu, Tartu, Estonia. 44. CNRS UMR 8199, European Genomic Institute for Diabetes (EGID), Institut Pasteur de Lille, University of Lille, Lille, France. 45. Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden. 46. Department of Biostatistics and Data Science, Division of Public Health Sciences, Wake Forest School of Medicine, Winston-Salem, NC, USA. 47. Department of Epidemiology and Biostatistics, School of Public Health, Imperial College, London, UK. 48. Department of Hygiene and Epidemiology, University of Ioannina Medical School, Ioannina, Greece. 49. Molecular Genetics and Genomics Program, Wake Forest School of Medicine, Winston-Salem, NC, USA. 50. Division of Cardiovascular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford, UK. 51. Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK. 52. 
University of Groningen, University Medical Center Groningen, Department of Cardiology, Groningen, the Netherlands. 53. Centre for Cognitive Ageing and Cognitive Epidemiology, University of Edinburgh, Edinburgh, UK. 54. Medical Genetics Section, Centre for Genomic and Experimental Medicine, University of Edinburgh, Edinburgh, UK. 55. Postgraduate Programme in Epidemiology, Federal University of Pelotas, Pelotas, Brazil. 56. Medical Research Council Integrative Epidemiology Unit, University of Bristol, Bristol, UK. 57. Department of Occupational and Environmental Health and State Key Laboratory of Environmental Health for Incubating, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China. 58. Laboratory of Genetics and Molecular Cardiology, Heart Institute (InCor), University of São Paulo Medical School, São Paulo, Brazil. 59. Endocrinology and Metabolism, Tri-Service General Hospital, Taipei, Taiwan. 60. School of Medicine, National Defense Medical Center, Taipei, Taiwan. 61. Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, USA. 62. Department of Public Health, Faculty of Medicine, University of Kelaniya, Ragama, Sri Lanka. 63. Foundation for Research in Health, Exercise, and Nutrition, Kuopio Research Institute of Exercise Medicine, Kuopio, Finland. 64. Research Unit of Molecular Epidemiology, Helmholtz Zentrum München-German Research Center for Environmental Health, Neuherberg, Germany. 65. Institute of Epidemiology, Helmholtz Zentrum München-German Research Center for Environmental Health, Neuherberg, Germany. 66. Unit of Cardiovascular Epidemiology, Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden. 67. Ophthalmology, Taichung Veterans General Hospital, Taichung, Taiwan. 68. Medical Research Council Epidemiology Unit, University of Cambridge, Cambridge, UK. 69. Department of Clinical Chemistry, Fimlab Laboratories, Tampere, Finland. 70. 
Department of Clinical Chemistry, Finnish Cardiovascular Research Center-Tampere, Faculty of Medicine and Technology, Tampere University, Tampere, Finland. 71. Laboratory for Statistical Analysis, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan. 72. University of Groningen, University Medical Center Groningen, Department of Epidemiology, Groningen, the Netherlands. 73. DZHK (German Centre for Cardiovascular Health), Partner Site Greifswald, Greifswald, Germany. 74. Institute of Clinical Chemistry and Laboratory Medicine, University Medicine Greifswald, Greifswald, Germany. 75. Human Genetics, Wellcome Trust Sanger Institute, Hinxton, UK. 76. Department of Cardiovascular Sciences, University of Leicester, Leicester, UK. 77. NIHR Leicester Biomedical Research Centre, Glenfield Hospital, Leicester, UK. 78. Institute for Maternal and Child Health, IRCCS 'Burlo Garofolo', Trieste, Italy. 79. Taub Institute for Research on Alzheimer's Disease and the Aging Brain, Columbia University Medical Center, New York, NY, USA. 80. Division of Sleep and Circadian Disorders, Brigham and Women's Hospital, Boston, MA, USA. 81. Institute of Clinical Medicine, Internal Medicine, University of Eastern Finland, Kuopio, Finland. 82. Department of Gene Diagnostics and Therapeutics, Research Institute, National Center for Global Health and Medicine, Tokyo, Japan. 83. Department of Public Health Sciences, Loyola University Chicago, Maywood, IL, USA. 84. Department of Clinical Sciences, Genetic and Molecular Epidemiology Unit, Lund University Diabetes Centre, Skåne University Hospital, Malmö, Sweden. 85. Cardiovascular Center, National Taiwan University Hospital, Taipei, Taiwan. 86. National Taiwan University College of Medicine, Taipei, Taiwan. 87. Division of Cardiology, Department of Internal Medicine, National Taiwan University Hospital, Taipei, Taiwan. 88. Department of Population Quantitative and Health Sciences, Case Western Reserve University, Cleveland, OH, USA. 89. 
Survey Research Center, Institute for Social Research, University of Michigan, Ann Arbor, MI, USA. 90. Division of Epidemiology, Department of Medicine, Vanderbilt University School of Medicine, Nashville, TN, USA. 91. SKLORG and Department of Epidemiology, Shanghai Cancer Institute, Renji Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China. 92. Division of General Internal Medicine, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA. 93. MRC-PHE Centre for Environment and Health, School of Public Health, Imperial College, London, UK. 94. Department of Cardiology, Ealing Hospital, Middlesex, UK. 95. McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA. 96. Ophthalmology and Visual Sciences Academic Clinical Program (Eye ACP), Duke-NUS Medical School, Singapore, Singapore. 97. Section of Cardiovascular Research, Baylor College of Medicine, Houston, TX, USA. 98. Houston Methodist Debakey Heart and Vascular Center, Houston, TX, USA. 99. Departments of Medicine and Epidemiology, Columbia University Medical Center, New York, NY, USA. 100. Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA. 101. Section of Genomic Pediatrics, Departments of Pediatrics, Medicine, and Physiology, Medical College of Wisconsin, Milwaukee, WI, USA. 102. Centre for Genomic and Experimental Medicine, Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, UK. 103. Danish Aging Research Center, Institute of Public Health, University of Southern Denmark, Odense, Denmark. 104. Ninewells Hospital and Medical School, University of Dundee, Dundee, UK. 105. Cardiovascular Division, Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA. 106. Department of Medicine, Faculty of Medicine, University of Kelaniya, Ragama, Sri Lanka. 107. 
Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA. 108. Department of Family Medicine and Epidemiology, Alpert Medical School of Brown University, Providence, RI, USA. 109. Department of Epidemiology, University of Washington, Seattle, WA, USA. 110. UWI Solutions for Developing Countries, University of the West Indies, Kingston, Jamaica. 111. Braun School of Public Health, Hebrew University-Hadassah Medical Center, Jerusalem, Israel. 112. Department of Medical Sciences, University of Trieste, Trieste, Italy. 113. Department of Genetics, School of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran. 114. Computational Medicine Core, Center for Lung Biology, UW Medicine Sleep Center, Department of Medicine, University of Washington, Seattle, WA, USA. 115. Department of Psychiatry and Psychotherapy, University Medicine Greifswald, Greifswald, Germany. 116. Laboratory of Epidemiology and Population Sciences, National Institute on Aging, US National Institutes of Health, Bethesda, MD, USA. 117. Institute of Biomedicine, School of Medicine, University of Eastern Finland, Kuopio, Finland. 118. Department of Paediatrics, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore. 119. Khoo Teck Puat-National University Children's Medical Institute, National University Health System, Singapore, Singapore. 120. Laboratory of Genome Technology, Human Genome Center, Institute of Medical Science, University of Tokyo, Minato-ku, Japan. 121. Department of Radiology and Nuclear Medicine, Erasmus University Medical Center, Rotterdam, the Netherlands. 122. Department of Neurology, Erasmus University Medical Center, Rotterdam, the Netherlands. 123. Hebrew SeniorLife, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA. 124. Framingham Heart Study, National Heart, Lung, and Blood Institute, US National Institutes of Health, Bethesda, MD, USA. 125. 
Collaborative Health Studies Coordinating Center, University of Washington, Seattle, WA, USA. 126. Department of Ophthalmology, Medical Faculty Mannheim, University of Heidelberg, Mannheim, Germany. 127. Beijing Institute of Ophthalmology, Beijing Ophthalmology and Visual Science Key Laboratory, Beijing Tongren Eye Center, Capital Medical University, Beijing, China. 128. Department of Clinical Gene Therapy, Osaka University Graduate School of Medicine, Suita, Japan. 129. Department of Geriatric and General Medicine, Osaka University Graduate School of Medicine, Suita, Japan. 130. Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark. 131. Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA. 132. Duke-NUS Medical School, Singapore, Singapore. 133. Department of Public Health, Department of Medicine, University of Split, Split, Croatia. 134. Fred Hutchinson Cancer Research Center, University of Washington School of Public Health, Seattle, WA, USA. 135. Sticht Center for Healthy Aging and Alzheimer's Prevention, Department of Internal Medicine, Wake Forest School of Medicine, Winston-Salem, NC, USA. 136. RIKEN Center for Integrative Medical Sciences, Yokohama, Japan. 137. Department of Clinical Physiology and Nuclear Medicine, Kuopio University Hospital, Kuopio, Finland. 138. Institute of Clinical Sciences, Department of Molecular Sciences, Imperial College, London, UK. 139. Department of Medicine, University of Alabama at Birmingham, Birmingham, AL, USA. 140. Department of Medicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore. 141. Epidemiology, Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, USA. 142. Translational Laboratory in Genetic Medicine, Agency for Science, Technology, and Research, Singapore, Singapore. 143. 
Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore. 144. Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, Singapore. 145. Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA. 146. Tropical Metabolism Research Unit, Caribbean Institute for Health Research, University of the West Indies, Mona, Jamaica. 147. Institute of Human Genetics, Helmholtz Zentrum München-German Research Center for Environmental Health, Neuherberg, Germany. 148. Institute of Human Genetics, Technische Universität München, Munich, Germany. 149. Technische Universität München, Munich, Germany. 150. Department of Psychiatry, Amsterdam Neuroscience and Amsterdam Public Health Research Institute, Amsterdam University Medical Center, Vrije Universiteit, Amsterdam, the Netherlands. 151. Laboratory for Genotyping Development, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan. 152. Department of Biostatistics, University of Liverpool, Liverpool, UK. 153. Institute of Medical Sciences, Aberdeen Biomedical Imaging Centre, University of Aberdeen, Aberdeen, UK. 154. Data Tecnica International, Glen Echo, MD, USA. 155. Laboratory of Neurogenetics, National Institute on Aging, Bethesda, MD, USA. 156. Division of Endocrinology, Diabetes, and Nutrition, University of Maryland School of Medicine, Baltimore, MD, USA. 157. Program for Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD, USA. 158. Biochemistry, Wake Forest School of Medicine, Winston-Salem, NC, USA. 159. Epidemiology Branch, Division of Cardiovascular Sciences, National Heart, Lung, and Blood Institute, US National Institutes of Health, Bethesda, MD, USA. 160. Epidemiology, Faculty of Medicine, Institute for Medical Information Processing, Biometry, and Epidemiology, Ludwig Maximilian University, Munich, Germany. 161. Psychiatric Hospital 'Sveti Ivan', Zagreb, Croatia. 
162. Gen-info, Ltd, Zagreb, Croatia. 163. School of Public Health, Imperial College, London, UK. 164. Department of Clinical Physiology and Nuclear Medicine, Turku University Hospital, Turku, Finland. 165. Research Centre of Applied and Preventive Cardiovascular Medicine, University of Turku, Turku, Finland. 166. Department of Biobank Research, Umeå University, Umeå, Sweden. 167. Department of Epidemiology and Medicine, University of Iowa, Iowa City, IA, USA. 168. Clinical Epidemiology, Leiden University Medical Center, Leiden, the Netherlands. 169. Centre for Global Health Research, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, UK. 170. Institute for Community Medicine, University Medicine Greifswald, Greifswald, Germany. 171. Division of Epidemiology and Community Health, School of Public Health, University of Minnesota, Minneapolis, MN, USA. 172. National Heart and Lung Institute, Imperial College, London, UK. 173. Division of Research, Kaiser Permanente Northern California, Oakland, CA, USA. 174. Alzheimer Scotland Dementia Research Centre, University of Edinburgh, Edinburgh, UK. 175. Genetic Epidemiology, Faculty of Medicine, Institute for Medical Information Processing, Biometry, and Epidemiology, Ludwig Maximilian University, Munich, Germany. 176. Institute of Genetic Epidemiology, Helmholtz Zentrum München-German Research Center for Environmental Health, Neuherberg, Germany. 177. Department of Genetics, Stanford University, Stanford, CA, USA. 178. Department of Statistics and Applied Probability, National University of Singapore, Singapore, Singapore. 179. Life Sciences Institute, National University of Singapore, Singapore, Singapore. 180. NUS Graduate School for Integrative Science and Engineering, National University of Singapore, Singapore, Singapore. 181. Department of Social and Behavioral Sciences, Harvard T. H. Chan School of Public Health, Boston, MA, USA. 182. 
Division of Nephrology and Hypertension, Mayo Clinic, Rochester, MN, USA. 183. Department of Internal Medicine, Erasmus University Medical Center, Rotterdam, the Netherlands. 184. Beijing Tongren Eye Center, Beijing Tongren Hospital, Capital Medical University, Beijing, China. 185. Jackson Heart Study, School of Public Health, Jackson State University, Jackson, MS, USA. 186. Department of Epidemiology, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA. 187. Division of Cancer Control and Population Sciences, UPMC Hillman Cancer, University of Pittsburgh, Pittsburgh, PA, USA. 188. Behavioral Epidemiology Section, Laboratory of Epidemiology and Population Sciences, National Institute on Aging, US National Institutes of Health, Baltimore, MD, USA. 189. Imperial College Healthcare NHS Trust, London, UK. 190. Psychology, University of Edinburgh, Edinburgh, UK. 191. Broad Institute of MIT and Harvard, Boston, MA, USA. 192. Department of Nutrition, Harvard T. H. Chan School of Public Health, Harvard University, Boston, MA, USA. 193. Department of Public Health and Clinical Medicine, Umeå University, Umeå, Sweden. 194. OCDEM, Radcliffe Department of Medicine, University of Oxford, Oxford, UK. 195. Nephrology, Internal Medicine, Wake Forest School of Medicine, Winston-Salem, NC, USA. 196. Department of Genomics of Common Disease, Imperial College, London, UK. 197. German Center for Diabetes Research (DZD), Neuherberg, Germany. 198. Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA. 199. Endocrinology and Metabolism, Internal Medicine, Taichung Veterans General Hospital, Taichung, Taiwan. 200. School of Medicine, Chung Shan Medical University, Taichung, Taiwan. 201. School of Medicine, National Yang-Ming University, Taipei, Taiwan. 202. University of Groningen, University Medical Center Groningen, Department of Psychiatry, Groningen, the Netherlands. 203. 
University of Groningen, University Medical Center Groningen, Department of Genetics, Groningen, the Netherlands. 204. Public Health Sciences, Wake Forest School of Medicine, Winston-Salem, NC, USA. 205. Institute of Translational Genomics, Helmholtz Zentrum München-German Research Center for Environmental Health, Neuherberg, Germany. 206. Public Health Sciences, Epidemiology and Prevention, Wake Forest University Health Sciences, Winston-Salem, NC, USA. 207. Department of Health Services, University of Washington, Seattle, WA, USA. 208. Kaiser Permanente Washington Health Research Institute, Seattle, WA, USA. 209. Public Health and Primary Care, Leiden University Medical Center, Leiden, the Netherlands. 210. Epidemiology, Tulane University School of Public Health and Tropical Medicine, New Orleans, LA, USA. 211. Cardiology, Medicine, University of Mississippi Medical Center, Jackson, MS, USA. 212. Nuffield Department of Population Health, University of Oxford, Oxford, UK. 213. Mindich Child Health Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA. 214. Psychiatry, Washington University School of Medicine, St. Louis, MO, USA. 215. Biostatistics, Department of Preventive Medicine, University of Southern California, Los Angeles, CA, USA.
Abstract
The concentrations of high- and low-density-lipoprotein cholesterol and triglycerides are influenced by smoking, but it is unknown whether genetic associations with lipids may be modified by smoking. We conducted a multi-ancestry genome-wide gene-smoking interaction study in 133,805 individuals with follow-up in an additional 253,467 individuals. Combined meta-analyses identified 13 new loci associated with lipids, some of which were detected only because association differed by smoking status. Additionally, we demonstrate the importance of including diverse populations, particularly in studies of interactions with lifestyle factors, where genomic and lifestyle differences by ancestry may contribute to novel findings.
Serum lipids, such as triglycerides and high- and low-density lipoprotein cholesterol (HDL and LDL), are influenced by both genetic and lifestyle factors. Over 250 lipid loci have been identified,[1-6] yet it is unclear to what extent lifestyle factors modify the effects of these variants, or of those yet to be identified. Smoking is associated with an unfavorable lipid profile,[7,8] warranting its investigation as a lifestyle factor that potentially modifies genetic associations with lipids. Identifying interactions using traditional 1 degree of freedom (1df) tests of SNP × smoking terms may have low power, except in very large samples. To enhance power, a 2 degree of freedom (2df) test that jointly evaluates the interaction and main effects was developed.[9]
The Gene-Lifestyle Interactions Working Group, under the aegis of the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium,[10] was formed to conduct analyses of lifestyle interactions in the genetic basis of cardiovascular traits. As both genetic and lifestyle factors differ across populations with different ancestry backgrounds, and to address the underrepresentation of non-European populations in genomic research, great effort went into creating a large, multi-ancestry resource for these investigations.[11] Here, we report a genome-wide interaction study that uses both the 1df test of interaction and the 2df joint test of main and interaction effects to test the hypothesis that genetic associations of serum lipids differ by smoking status.
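To make the distinction between the two tests concrete, the following is a minimal sketch, assuming Wald-type statistics computed from a fitted SNP main effect, a SNP × smoking interaction effect, and their variances and covariance. The function names and the numeric inputs are hypothetical and chosen only for illustration; they are not the software or the estimates used in this study.

```python
import math


def wald_1df(beta_int, se_int):
    """1df Wald test of the SNP x smoking interaction term alone."""
    chi2 = (beta_int / se_int) ** 2
    # Survival function of a chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def wald_2df(beta_g, beta_int, var_g, var_int, cov):
    """2df Wald test of the SNP main effect and interaction effect jointly."""
    det = var_g * var_int - cov ** 2
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    # Quadratic form b' V^-1 b for the 2 x 2 covariance matrix V.
    chi2 = (beta_g ** 2 * var_int
            - 2.0 * beta_g * beta_int * cov
            + beta_int ** 2 * var_g) / det
    # Survival function of a chi-square with 2 df: exp(-x / 2).
    p_value = math.exp(-chi2 / 2.0)
    return chi2, p_value


# Hypothetical estimates (illustrative only, not taken from the study).
beta_g, se_g = 0.02, 0.01
beta_int, se_int = -0.10, 0.02
cov_g_int = -0.0001  # assumed covariance between the two estimates

chi2_1df, p_1df = wald_1df(beta_int, se_int)
chi2_2df, p_2df = wald_2df(beta_g, beta_int, se_g ** 2, se_int ** 2, cov_g_int)
print(f"1df interaction test: chi2={chi2_1df:.2f}, p={p_1df:.2e}")
print(f"2df joint test:       chi2={chi2_2df:.2f}, p={p_2df:.2e}")
```

The 2df statistic needs the covariance between the main and interaction estimates, which summary tables rarely report, so a joint P-value generally cannot be recomputed exactly from published effects and standard errors alone.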
Results
Novel Loci
We conducted genome-wide interaction meta-analyses for current and ever-smoking status in up to 133,805 individuals of European (EUR), African (AFR), Asian (ASN) and Hispanic (HISP) ancestries (Supplementary Tables 1–3), with follow-up of 17,921 variants with p ≤ 10⁻⁶ (not pruned for linkage disequilibrium [LD]) in an additional 253,467 individuals of EUR, AFR, ASN, HISP, and Brazilian (BR) ancestries (Supplementary Tables 4–6), as described in Figure 1. Of these, 16,389 variants (487 loci, defined by ±1 Mb) passed filters and were included in stage 2 analyses. Ninety percent of variants (14,733) and 22% of loci (109) replicated in stage 2 (variants: p < 0.05/16,389; loci: p < 0.05/487). We conducted meta-analyses of stage 1 and 2 results (Manhattan plots, Supplementary Figure 1; QQ plots, Supplementary Figure 2) and identified 13 novel loci with p < 5 × 10⁻⁸ that were at least 1 Mb away from previously reported lipid loci (Table 1; results by stage: Supplementary Table 7; forest plots: Supplementary Figures 3 and 4; regional association plots: Supplementary Figure 5). These loci had low false discovery rate (FDR) q-values (all q < 3 × 10⁻⁴; Supplementary Table 8). We report novel loci with p < 5 × 10⁻⁸ as well as those passing a more stringent threshold (p < 6.25 × 10⁻⁹), adjusting for 2 smoking exposures, 2 interaction tests, and ancestry-specific and trans-ancestry tests. The patterns observed in these results are described below and illustrated using output from stage 1 meta-analyses, where results from a main effect model (in all and stratified by smoking exposure) and a smoking-adjusted main effect model were also available (Figure 1; Supplementary Table 9).
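The thresholds and replication proportions quoted above follow from simple arithmetic, sketched below. Reading the stringent threshold as the genome-wide threshold divided by 2 × 2 × 2 = 8 is our interpretation of the stated adjustment, and the variable names are illustrative assumptions.

```python
# Counts reported in the text.
n_variants_stage2 = 16_389
n_loci_stage2 = 487
n_variants_replicated = 14_733
n_loci_replicated = 109

# Bonferroni-corrected replication thresholds at alpha = 0.05.
alpha = 0.05
variant_threshold = alpha / n_variants_stage2  # about 3.05e-6
locus_threshold = alpha / n_loci_stage2        # about 1.03e-4

# Stringent discovery threshold: the genome-wide threshold divided by the
# number of analysis combinations (assumed here to be 2 smoking exposures
# x 2 interaction tests x 2 ancestry settings).
genome_wide_threshold = 5e-8
n_combinations = 2 * 2 * 2
stringent_threshold = genome_wide_threshold / n_combinations

# Share of variants and loci that replicated in stage 2.
pct_variants = 100 * n_variants_replicated / n_variants_stage2
pct_loci = 100 * n_loci_replicated / n_loci_stage2

print(f"variant threshold:   {variant_threshold:.3g}")
print(f"locus threshold:     {locus_threshold:.3g}")
print(f"stringent threshold: {stringent_threshold:.3g}")
print(f"replicated: {pct_variants:.0f}% of variants, {pct_loci:.0f}% of loci")
```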
Figure 1.
Study Overview:
Summary of data included in this study. [1] 16,389 variants passed filtering criteria and were included in stage 2 analyses. [2] Trans-ancestry (TRANS) stage 1 and 2 combined meta-analyses were meta-analyses of stage 1 TRANS and stage 2 TRANS meta-analyses, and not meta-analyses of ancestry-specific stage 1 and stage 2 combined meta-analyses.
Table 1:
Statistically Significant (p < 5×10−8) Results in Stage 1 and 2 Meta-Analysis
Index Variant (Nearest Gene)[1] | Bld 37 Chr:Position | 1000 Genomes Freq[2] AFR/AMR/ASN/EUR | Tested Allele: Freq | Ancestry | Trait/Exposure[3] | Stage 1 + 2: n | Stage 1 + 2: Effect | Stage 1 + 2: SE | Stage 1 + 2: Int. Effect | Stage 1 + 2: SE | Stage 1 + 2: 1df Int. P-value[4] | Stage 1 + 2: 2df Joint P-value[4] | Stage 1: n | Stage 1: Adj. Main Effect P-value[5]
Loci with Evidence for Interaction
rs12740061 (LOC105378783) | 1:69407810 | 0.01/0.17/0.02/0.22 | T: 0.05 | AFR | HDL/CS | 16,606 | 0.02 | 0.0082 | −0.11 | 0.019 | 7.40E-09 | 2.4E-08 | 15,499 | 0.98
rs77810251 (PTPRZ1) | 7:121504149 | 0.02/0.22/0.34/0.11 | A: 0.04 | AFR | HDL/ES | 24,253 | 0.052 | 0.0083 | −0.06 | 0.012 | 9.50E-07 | 1.2E-9* | 23,146 | 1.60E-04
rs73453125 (CNTNAP2) | 7:146084573 | 0.09/0.02/0/0 | A: 0.07 | TRANS, AFR | LDL/CS | 40,566 | 1.9 | 0.69 | −8.3 | 1.4 | 1.70E-07 | 2.0E-08 | 24,668 | 0.76
rs56167574 (PRKAG2) | 7:151245975 | 0.13/0.01/0/0 | A: 0.12 | AFR | LDL/ES | 25,778 | 1.9 | 0.8 | −6.1 | 1.1 | 1.50E-08 | 8.4E-08 | 23,353 | 0.08
rs79950627 (MIR4686) | 11:2233790 | 0.06/0.01/0/0 | A: 0.05 | TRANS, AFR | LDL/CS | 38,272 | −0.1 | 0.79 | −8.4 | 1.6 | 1.40E-06 | 7.2E-09 | 23,348 | 0.25
rs60029395 (ZNF729) | 19:22446748 | 0.15/0.01/0.03/0 | A: 0.13 | AFR | TRIG/CS | 19,048 | 0.041 | 0.0092 | −0.097 | 0.018 | 3.30E-08 | 8.2E-08 | 15,747 | 0.17
rs7364132 (DGCR8) | 22:20096172 | 0.19/0.02/0/0 | A: 0.16 | AFR, TRANS | TRIG/ES | 23,935 | 0.012 | 0.0091 | −0.066 | 0.013 | 8.80E-07 | 2.5E-08 | 21,834 | 0.0055
Probable Main Effect Loci (No Evidence of Interaction)
rs12144063 (EYA3) | 1:28406047 | 0.35/0.28/0.53/0.30 | T: 0.37 | TRANS | HDL/CS, ES | 375,418 | −0.004 | 0.00069 | −0.00033 | 0.0016 | 0.75 | 1.3E-10* | 131,057 | 4.70E-07
rs10937241 (ETV5) | 3:185822774 | 0.30/0.31/0.58/0.19 | A: 0.17 | EA, TRANS | HDL/CS, ES | 230,919 | −0.008 | 0.0012 | 0.0021 | 0.0026 | 0.65 | 4.2E-12* | 90,266 | 4.50E-07
rs34311866 (TMEM175) | 4:951947 | 0.01/0.07/0.12/0.20 | C: 0.17 | TRANS, EA | HDL, TRIG/CS | 351,489 | −0.006 | 0.00097 | 0.0014 | 0.0022 | 0.61 | 1.6E-9* | 115,640 | 2.10E-06
rs73729083 (CREB3L2) | 7:137559799 | 0.11/0.04/0.02/0 | C: 0.05 | TRANS, AFR | LDL/ES, CS | 84,091 | −3.7 | 0.66 | −0.37 | 0.95 | 0.53 | 1.3E-14* | 35,909 | 2.00E-10
rs10101067 (EYA1) | 8:72407374 | 0.04/0.07/0.13/0.06 | C: 0.08 | TRANS | TRIG/CS | 317,809 | 0.014 | 0.0025 | −0.0092 | 0.0053 | 0.069 | 4.1E-08 | 102,263 | 2.10E-06
rs4758675 (B3GNT4) | 12:122691738 | 0.02/0/0/0 | C: 0.02 | AFR | TRIG/CS | 12,982 | −0.13 | 0.025 | −0.029 | 0.057 | 0.85 | 1.3E-08 | 11,875 | 3.60E-08
Abbreviations: African ancestry (AFR), Current Smoking (CS), European ancestry (EUR), Ever-Smoking (ES), Trans-ancestry (TRANS), Triglycerides (TRIG).
Listed variants represent the lead associations within 1 MB region for the 2 and 1 degree of freedom tests of the variant × smoking interaction after excluding variants within 1 MB of known lipids loci. If variant is in/within 2 KB of a gene, that gene name is listed;
Frequency of the tested allele in 1000 Genomes data by ancestry: Asian (ASN), Americas (AMR), African (AFR), and European (EUR)
If the region was associated with the trait in more than one meta-analysis, the most statistically significant result is listed first and described in table;
P-values in this column come from a smoking-adjusted main effect model (available in Stage 1 cohorts only, see Figure 1);
Findings with an asterisk are statistically significant using a stricter p-value threshold, after Bonferroni correction for 2 smoking traits, 2 interaction tests, and ethnic and trans-ethnic testing (p < 5 × 10−8/8=6.25 × 10−9).
Notably, many novel loci were statistically significant only in AFR meta-analyses. For 7 of the 13 novel loci, the minor allele frequencies (MAF) of the index variants were highest in AFR, and inter-ancestry differences in MAF and/or LD may explain the failure to detect similar associations in other ancestries. However, some AFR-only associations were unlikely to be due to diminished power in non-AFR meta-analyses. For instance, the effect of rs12740061 (NC_000001.10:g.69407810C>T; LOC105378783) on HDL was significantly modified by current smoking status among AFR (p1df = 7.4 × 10−9; Figure 2, Table 1), such that the genetic effect was stronger among current smokers than non-smokers (Supplementary Table 9). In contrast, there was virtually no evidence for association in any other ancestry, despite higher MAF (Figure 2). The potential influence of under-adjustment for principal components (PCs) on these results was evaluated by excluding the 6 studies adjusting for only 1 PC (the average number of PCs among AFR studies was 4.2); effect estimates were similar and p-values were increased or similar, consistent with a ~20% reduction in sample size (Supplementary Table 10).
Figure 2.
Interaction of rs12740061 (LOC105378783) and Current Smoking (1df). A forest plot showing the betas (95% confidence intervals) and p values (1df) for the rs12740061 × Current Smoking interaction term in linear regression models of HDL adjusted for age, sex, study-specific covariates (if applicable), smoking status, and principal components. Results for each AFR study are shown, as well as the ancestry-specific combined stage 1 and 2 meta-analyses.
We observed interactions where notable associations were only found among current or ever-smokers, with effect sizes close to zero among non- or never-smokers, including a statistically significant association for the 2df joint test of main and interaction effects for rs7364132 (NC_000022.10:g.20096172G>A; DGCR8) × ever-smoking on triglycerides (p2df = 2.5 × 10−8; Table 1). Main effect models stratified by smoking status showed a strong genetic association with triglycerides among ever-smokers (difference in mean ln triglycerides per A allele β = −0.05, p = 7.9 × 10−8), with a negligible association among never-smokers (β = 0.01, p = 0.19; Figure 3a). This association was not significant in a non-stratified main effect model (Table 1; Supplementary Table 9), and was only detectable when modeling permitted a different association across smoking strata. Similar results were observed for rs79950627 (NC_000011.9:g.2233790G>A; MIR4686) × current smoking on LDL (Figure 3b), and rs56167574 (NC_000007.13:g.151245975G>A; PRKAG2) × ever-smoking on LDL (Figure 3c, Supplementary Table 9).
Figure 3.
Associations Observed Primarily Among One Smoking Stratum. For selected variants for which an association was primarily observed only in one smoking stratum, a comparison of the p values for stage 1 linear association models, including a main effect model adjusted for age, sex, principal components, and study-specific covariates (as appropriate) in all individuals and stratified by smoking exposure; a model additionally adjusted for smoking exposure; and a model that also includes a smoking exposure × SNP interaction term, from which a 1df test of interaction and a 2df joint test of main effect and interaction were calculated. a.) rs7364132 (DGCR8) × ever-smoking on triglycerides (n = 21,834 [11,113 never smokers; 10,725 ever-smokers]), b.) rs79950627 (MIR4686) × current smoking on LDL (n = 23,348 [18,384 non-smokers; 4,973 current smokers]), c.) rs56167574 (PRKAG2) × ever smoking on LDL (n = 23,353 [11,700 never smokers; 11,649 ever-smokers]), and d.) rs77810251 (PTPRZ1) × ever smoking on HDL (n = 23,146 [11,560 never smokers; 11,592 ever-smokers]).
We also observed interactions where the association was in opposite directions in the exposed vs. unexposed stratum, with a larger, more statistically significant association among smokers. For instance, current smoking modified the association between rs73453125 (NC_000007.13:g.146084573G>A; CNTNAP2) and LDL (Table 1). In stratified main effect models, the A allele was associated with lower LDL among current smokers (β = −8.1 mg/dL, p = 2.2 × 10−7), but higher LDL among non-smokers (β = 2.18 mg/dL, p = 0.01; Figure 4a, Supplementary Table 9). In a non-stratified smoking-adjusted main effects model, no association between rs73453125 and LDL was detected (β = 0.3 mg/dL, p = 0.98). Similar results were observed for rs12740061 (LOC105378783) (Supplementary Table 9).
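The pattern described above, with opposite-direction effects in the two smoking strata that cancel in a pooled analysis, can be illustrated with a short calculation. This is a sketch under stated assumptions, not the study's analysis: the stratum betas are taken from the text, the standard errors are assumed values chosen only for illustration, the inverse-variance weighted average is a rough stand-in for the non-stratified model, and the function names are hypothetical.

```python
import math

def interaction_from_strata(beta_exposed, se_exposed, beta_unexposed, se_unexposed):
    """Difference between stratum-specific effects, with a two-sided Wald p-value.

    Assumes the two strata contain different individuals, so the estimates
    are independent and their variances add.
    """
    beta_int = beta_exposed - beta_unexposed
    se_int = math.sqrt(se_exposed ** 2 + se_unexposed ** 2)
    p_value = math.erfc(abs(beta_int / se_int) / math.sqrt(2.0))
    return beta_int, se_int, p_value

def inverse_variance_pooled(betas, ses):
    """Inverse-variance weighted average of stratum-specific effects."""
    weights = [1.0 / se ** 2 for se in ses]
    return sum(w * b for w, b in zip(weights, betas)) / sum(weights)

# rs73453125 and LDL (mg/dL): betas from the text; standard errors are ASSUMED.
beta_cs, se_cs = -8.1, 1.6    # current smokers
beta_ns, se_ns = 2.18, 0.85   # non-smokers

beta_int, se_int, p_int = interaction_from_strata(beta_cs, se_cs, beta_ns, se_ns)
pooled = inverse_variance_pooled([beta_cs, beta_ns], [se_cs, se_ns])
print(f"implied interaction effect: {beta_int:.2f} mg/dL (p = {p_int:.1e})")
print(f"pooled effect across strata: {pooled:.2f} mg/dL")
```

Because the larger non-smoking stratum carries more weight, its small positive effect offsets the large negative effect among smokers, and the pooled estimate sits near zero even though the difference between strata is large.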
Figure 4.
Forest Plots of Selected Associations. (a.) Plot showing the association between rs73453125 and LDL among AFR in stage 1 (where a series of models were available). Variant betas (95% confidence intervals) and p values are drawn from main effect linear regression models of Non-Smokers, Smokers, all individuals, and all individuals with adjustment for smoking status. (b.) Plot showing the association between rs10101067 (EYA1) and triglycerides in ancestry-specific and combined analysis from stages 1 and 2. Variant main and interaction betas (95% confidence intervals) are drawn from linear regression models that include a current smoking × SNP term and p values are for the 2df joint test of main effect and interaction.
Although many interactions manifested as associations significant only, or more strongly, in smokers, for rs10937241 (NC_000003.11:g.185822774A>G; ETV5), rs34311866 (NC_000004.11:g.951947T>C; TMEM175), rs10101067 (NC_000008.10:g.72407374G>C; EYA1), and rs77810251 (NC_000007.13:g.121504149G>A; PTPRZ1), the associations observed among non- or never-smokers were more statistically significant. Notably, in stratified main effect models, rs77810251 was associated with increased HDL among never-smokers (β = 0.05 lnHDL, p = 6.3 × 10−11) with no significant association among ever-smokers (β = −0.005 lnHDL, p = 0.56; Figure 3d; Supplementary Table 9). In a smoking-adjusted main effect model of never- and ever-smokers together, the association was markedly reduced (β = 0.02 lnHDL, p = 1.6 × 10−4).
The 2df joint test simultaneously evaluates main and smoking interaction effects; some of our results appear to capture a main effect of the variant. For instance, the 2df test for rs12144063 (EYA3) detected an association (p = 1.3 × 10−10), while the 1df test of interaction did not (p = 0.75). The minor alleles for this and three other variants (rs10937241 [ETV5], rs34311866 [TMEM175], and rs10101067 [EYA1]) were common across populations, and their effects were small in magnitude and yet reached genome-wide statistical significance (rs10101067 [EYA1]; Figure 4b), consistent with expectations for novel main effect loci in well-studied populations. There are two findings, however, for which the relatively large sample size in the AFR meta-analyses appeared to facilitate detection. The MAF for rs73729083 (NC_000007.13:g.137559799T>C; CREB3L2) was much greater among AFR than in HISP and ASN (not present among EUR), and the variant effect estimates were large and consistent across ancestries, while the interaction effect estimates were inconsistent, with wide confidence intervals (Supplementary Figure 3f).
The minor allele for rs4758675 (NC_000012.11:g.122691738C>A; B3GNT4) was only present in AFR (Supplementary Figure 3k), but variant effect estimates were consistent across AFR studies, with interaction effect estimates approaching the null (Supplementary Figure 4e). In total, 6 of the 13 novel loci that we identified appear to be driven by main effects of the variant while the remainder show some evidence of interaction.
There were 16 additional novel loci identified in stage 1 meta-analyses (p1df or p2df < 5 × 10−8) for which the variants were unavailable for analysis in stage 2 cohorts. These loci were identified only in AFR meta-analyses (many were AFR-specific variants; Table 2). Due to the relatively small number and size of available AFR cohorts in stage 2 (total n = 7,217; n < 2,000 per cohort), these relatively low frequency variants did not pass filters for minor allele count within exposure groups. Nevertheless, these associations had low FDR q-values (all q < 2.4 × 10−4) in stage 1, and some appear worthy of further investigation. One particularly interesting candidate is rs17150980 (NC_000007.13:g.78173734T>C; MAGI2) × ever-smoking on triglycerides (p2df = 1.4 × 10−9), for which consistent effects for both the variant and the interaction were observed across AFR studies, but not in other ancestries (Supplementary Figure 6).
Table 2:
Statistically Significant (p < 5×10−8) Results in Stage 1 Meta-Analysis Unavailable in Stage 2[1]
All loci have some evidence for interaction (p<0.05 for 1df test of interaction); thus, results not categorized into “Loci with Evidence for Interaction” or “Probable Main Effects (without evidence for interaction)”;
Listed variants represent the lead associations within 1 MB region for the 2 and 1 degree of freedom tests of the variant × smoking interaction after excluding variants within 1 MB of known lipids loci. If variant is in/within 2 KB of a gene, that gene name is listed;
Frequency of the tested allele in 1000 Genomes data by ancestry: Asian (ASN), Americas (AMR), African (AFR), and European (EUR);
P-values in this column come from a smoking-adjusted main effect model (available in Stage 1 cohorts only, see Figure 1).
Findings with an asterisk indicate statistical significance using a stricter p-value threshold, after Bonferroni correction for 2 smoking traits, 2 interaction tests, and ethnic and trans-ethnic testing (5 × 10−8/8 = 6.25 × 10−9).
As we ran analyses for both current and ever-smoking status, we evaluated novel associations across smoking exposures to further characterize those loci (Supplementary Table 11). For the 6 probable main effect loci (EYA3, ETV5, TMEM175, CREB3L2, EYA1, B3GNT4), an association of similar statistical significance was observed across smoking status definitions for the 2df joint test, with similar lack of effect for the 1df test of the interaction, consistent with the interpretation that smoking status was unimportant, with the main effect driving the association. For the locus in which a stronger association was observed among non-smokers (PTPRZ1), the statistical significance of the 1df interaction was dramatically reduced (p1df from 9.5 × 10−7 for ever-smoking to 0.011 for current smoking), consistent with any smoke exposure altering the association between this variant and HDL, and with the inclusion of former smokers among the unexposed (as in the current smoking analysis) diluting the association observed among never smokers. For the reported interactions with current smoking, all the effect estimates were greatly reduced in the ever-smoking analysis, suggesting that active smoking is the relevant exposure. For the reported interactions with ever-smoking, markedly reduced statistical significance was observed in the current smoking analysis, likely reflecting a drop in power from excluding former smokers from the exposed group.
We conducted a secondary analysis of smoking dose in two of our AFR cohorts with measured cigarettes per day for four interaction loci (see methods for selection criteria): rs12740061 (LOC105378783), rs73453125 (CNTNAP2), rs79950627 (MIR4686), and rs7364132 (DGCR8). For each of these variants, a stronger association was observed with increasing smoking dose (Supplementary Table 12), and the interaction was statistically significant for all variants but rs7364132, which was just over our threshold for statistical significance (p = 0.0035 vs. p < 0.0021).
Conditional analysis showed no evidence that the novel associations were driven by variants at known lipids loci (Supplementary Table 13). Imputation quality for novel variants was high (minimum 0.75), with sample-size weighted average imputation quality of 0.90 and minor allele frequencies that match publicly-available datasets (Supplementary Table 14).
Interactions at Known Loci
We examined interactions at known lipid loci. Since results for the 2df test at known lipid loci are expected to predominantly reflect previously identified main effects, we exclusively evaluated the 1df test of interaction. No interactions within known loci were statistically significant (p1df < 0.05/269 known loci in our data). To evaluate whether the proportion of known variants with p1df < 0.05 was higher than would be expected by chance (5%), we conducted binomial tests for each trait-exposure combination (p-values Bonferroni-corrected for multiple tests). There was significant enrichment of known variants with 1df interaction p < 0.05: HDL-current smoking p = 9.6 × 10−12, HDL-ever smoking p = 5.9 × 10−7, LDL-current smoking p = 8.4 × 10−15, LDL-ever smoking p = 3.1 × 10−5, triglycerides-current smoking p = 4.0 × 10−3, triglycerides-ever smoking p = 3.1 × 10−4. We conducted power calculations under different interaction scenarios to determine the conditions under which an interaction analysis and a main effect analysis would both be sufficiently powered to detect the same locus (i.e. when an interaction could be detected in a locus previously identified in a main effect analysis; Supplementary Table 15). At current trans-ancestry meta-analysis sample sizes and assuming a large effect size, at least one of the two analyses had limited power when an association was larger or only present among smokers (main effect <1%; interaction 77%), or when associations differed in magnitude but not direction (main effect >99%; interaction <1%), making it unlikely that an interaction would be detected at a known locus. We were well-powered for both interaction and main effect analyses to detect smoking interactions for which smoking eliminates or drastically reduces a large association among non- or never-smokers.
We identified one such interaction in our data, for PTPRZ1 in AFR only, which may not have been previously identified in a main effect analysis because of limited power of AFR main effect analyses thus far.
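The enrichment test described above asks whether more known variants reach p1df < 0.05 than the 5% expected by chance. A minimal sketch of such a test is given below as an exact one-sided binomial tail with a Bonferroni correction. The one-sided form, the function names, and the example count of 40 significant variants out of 269 are assumptions for illustration; the observed counts are not given in the text.

```python
import math

def binomial_upper_tail(k, n, p0=0.05):
    """Exact P(X >= k) for X ~ Binomial(n, p0)."""
    return sum(math.comb(n, i) * p0 ** i * (1.0 - p0) ** (n - i)
               for i in range(k, n + 1))

def bonferroni(p_value, n_tests):
    """Bonferroni-corrected p-value, capped at 1."""
    return min(1.0, p_value * n_tests)

# HYPOTHETICAL example: 40 of 269 known variants with p1df < 0.05.
# Six tests correspond to 3 lipid traits x 2 smoking exposures.
n_known, n_significant, n_tests = 269, 40, 6
p_raw = binomial_upper_tail(n_significant, n_known)
p_corrected = bonferroni(p_raw, n_tests)
print(f"expected by chance: {0.05 * n_known:.1f} variants")
print(f"raw p = {p_raw:.2e}, Bonferroni-corrected p = {p_corrected:.2e}")
```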
Proportion Variance Explained by Identified Loci
Ten studies from four ancestries were used to calculate the proportion of the variance in lipid traits explained by the genome-wide statistically significant novel loci: 13 loci from stage 1 and 2 combined meta-analyses (Table 1), and 16 loci from stage 1 that were not available in stage 2 analyses (Table 2). Two different methods were used (Online Methods), and the range of findings across these methods is presented (Supplementary Table 16). In AFR, novel variants and their interactions explained 1.0–2.7% of HDL, 0.7–2.6% of LDL, and 1.3–3.2% of triglycerides. The proportion explained was smaller among EUR (0.06–0.14% of HDL, 0.01–0.07% of LDL, and 0.10–0.19% of triglycerides), ASN (0.27–0.86% of HDL, 0.09–0.82% of LDL, and 0.8–1.5% of triglycerides), and HISP (0.2–0.4% of HDL, 0.2–0.5% of LDL, and 0.2–0.4% of triglycerides). These results should be considered in the context of the inter-ancestry MAF differences: the proportion of novel variants that could be evaluated varied by ancestry, with 94–97% among the AFR cohorts, but only 32–39% among the EUR and ASN cohorts, and 55% in the HISP cohort. In contrast, each of the cohorts investigated had similar proportions of the requested known variants (83–96%).
Reproducing Known Lipids Associations
We evaluated the degree to which our data reproduce previously reported lipid loci. Given that approximately 81% of cohorts in stage 1 were included both in this and in previous efforts, this analysis is not a formal replication. For comparability with traditional GWAS, we evaluated results from stage 1 main effect models. Of the 356 previously reported associations for 279 variants (compiled from[1-6,12]), there were 236 associations for 189 variants that were confirmed in our data (consistent direction and p < 0.05/356), for a 66.3% concordance rate (Supplementary Table 17).
Bioinformatics
To characterize the potential impact of our novel associations for chronic disease risk and to investigate biological mechanisms, we conducted a series of follow-up analyses and annotations. We performed extensive bioinformatics annotation on variants within the 29 novel loci (Tables 1 and 2). These loci included 78 associated variants that were in or near 33 unique genes (Supplementary Table 18). We conducted look-up of these variants in previously conducted GWAS for related traits (Supplementary Tables 19–24), the Genotype-Tissue Expression (GTEx v7.0) portal and RegulomeDB (Supplementary Table 25), HaploReg v4.1 (Supplementary Table 26), and an analysis of cis- and trans- expression quantitative trait loci (eQTL) in whole blood from Framingham Heart Study participants (Supplementary Table 27). Additionally, for each trait we performed DEPICT gene prioritization (Supplementary Tables 28–30), gene set enrichment (Supplementary Tables 31–33), and tissue or cell type enrichment analyses[13] (Supplementary Tables 34–37), using both novel and known loci. Notable findings from these follow-up analyses are summarized below by locus.
Consistent with our observations of an association of the C allele for rs10101067 (EYA1) with higher triglycerides, this allele was associated with increased risk of coronary artery disease (β = 0.036, p = 0.03; Supplementary Table 19), ischemic stroke (β = 0.11, p = 0.04; Supplementary Table 20), and higher waist to hip ratio adjusted for BMI (β = 0.029 units, p = 6.5 × 10−4, with similar results observed for waist circumference adjusted for BMI; Supplementary Table 21).
We found an association of the T allele of rs12144063 (NC_000001.10:g.28406047G>T; EYA3) with lower HDL. This allele was associated with increased risk of all stroke types (β = 0.05, p = 0.04), as well as stroke subtypes (Supplementary Table 20).
rs7529792 (NC_000001.10:g.28306250C>T), a variant in LD with rs12144063 (r2 = 0.97), regulates gene expression of EYA3 and has a high RegulomeDB score (1b; Supplementary Table 25). HaploReg also shows regulatory features for rs12144063, including being in a promoter location expressed in liver and brain, in enhancer histone marks, and at DNAse marks for EYA3 (Supplementary Table 26). DEPICT predicted a role for these variants in regulating EYA3 and XKR8 (Supplementary Table 28); XKR8 encodes a phospholipid scramblase important in apoptotic signaling[14].
We report an interaction between smoking and rs77810251 (PTPRZ1) with the minor allele associated with higher HDL only among never-smokers. While this variant was not available in look-up data for GIANT, a variant in this locus with a similar association, rs740965 (NC_000007.13:g.121513561T>G), was associated with lower BMI among EUR (β = −0.01 kg/m2, p = 0.01, similar results for trans-ancestry analysis). This variant was also associated with lower waist circumference adjusted for BMI among EUR women (β = −0.016, p = 0.04; Supplementary Table 21). PTPRZ1 was shown to be downregulated in cells treated with an acute dose of nicotine[15], which supports our observation of a lack of an association of PTPRZ1 variants among ever-smokers.
We report a main effect of rs34311866 on HDL and triglycerides. rs34311866 is a missense variant in TMEM175, which has been associated with Parkinson’s disease[16] and type 2 diabetes[17]. This variant contributes to the regulation of DGKQ (p = 5.3 × 10−21) and is an eQTL of DGKQ in adipose, artery, lung, nerve and thyroid tissue (Supplementary Table 25). The expression of DGKQ is more strongly regulated by another significantly associated variant in this locus, rs4690220 (NC_000004.11:g.980464A>G), which is located upstream of IDUA and in an intron of SLC26A1. This variant had a high score in RegulomeDB (1f), supporting a potential functional effect (Supplementary Table 25).
Importantly, DGKQ has been implicated in studies of cholesterol metabolism[18], bile acid signaling, glucose homoeostasis in hepatocytes[19], primary biliary cirrhosis[20], and Parkinson’s disease[21-24]. DGKQ interacts with the key lipid enzymes LPL, LIPG, and PNPLA3 (Supplementary Figure 7). These results suggest that the observed association with HDL and triglycerides could act on cholesterol metabolism through regulation of DGKQ. Also, rs34311866 is a trans-eQTL for GNPDA1 (Supplementary Table 27); expression of this gene has been associated with a set of traits, including hyperlipidemia[25].
In our data, there was a significant rs12740061 (LOC105378783) × smoking interaction, such that the minor allele was associated with decreased HDL only among current smokers. This variant is a trans-eQTL for TAS1R1 (Supplementary Table 27). Variants in this gene have been found to influence taste receptors, notably affecting cigarette smoking habits[26].
Discussion
In this study, we evaluated gene-smoking interactions in large, multi-ancestry meta-analyses of serum lipids, using varying associations among smoking subgroups to improve the ability to detect novel lipid loci. We report 13 novel loci for serum lipids from stage 1 and 2 meta-analyses. Sixteen additional statistically significant novel loci were found in stage 1 but were unavailable in stage 2. All 29 novel associations had a low q-value (q < 3 × 10−4). Using both the 1df test of interaction and the 2df joint test of main and interaction effects in this study allowed us to improve our inferences based on the results: the 2df test bolstered the power to detect interactions, while the 1df test could discriminate between associations that predominantly reflected main effects vs. interactions.
Our results provide support for future efforts to evaluate lifestyle interactions with complex traits. We identified loci for which an association with serum lipids was only observed in one smoking stratum. In main effect models at these loci, the signal from one subgroup was not detected when all individuals were evaluated together (regardless of adjusting for smoking). These loci could only be observed by an analysis that was either smoking-stratified or contained an interaction term, highlighting the importance of considering potential effect modification in association studies. Additionally, through use of the joint 2df test, we identified six loci that appear to show novel main effects. Consistent with this characterization, five of these loci were within 500 KB of variants identified in recent large-scale association studies using main effect models: ETV5[27-29], TMEM175[28], EYA1[28], EYA3[28], and B3GNT4[28].
With 23,753 AFR individuals in the Stage 1 analyses and 30,970 AFR individuals overall, this work represents one of the largest studies of serum lipids in AFR.
It is therefore unsurprising that two of our novel lipid loci (CREB3L2 and B3GNT4) appear to be driven primarily by genetic main effects. Importantly, these associations could not have been detected in EUR, as the tested alleles for both rs4758675 (B3GNT4) and rs73729083 (CREB3L2) are absent in EUR.
In addition to these probable main effect loci, the prominence of novel loci that were statistically significant only in AFR meta-analyses deserves further discussion. Some findings could not be effectively evaluated in other ancestry groups because of inter-ancestry MAF differences: the minor alleles for half of the variants were much more frequent in AFR. More puzzling, however, is the discovery of loci with evidence of strong interactions in AFR but not in meta-analyses in other ancestries, despite comparable or higher allele frequencies, such as were observed with rs12740061 (LOC105378783; Figure 2) or rs17150980 (MAGI2; Supplementary Figure 6). This phenomenon suggests inter-ancestry differences in either genomic or environmental context. There are variants in LD (r2 > 0.2) among AFR for rs12740061 (LOC105378783) and rs17150980 (MAGI2) that are not in LD with these variants in other ancestries[30], but these variants were directly tested in our study with no evidence of an association in non-AFR analyses. Thus, it is unlikely that inter-ancestry LD differences explain these results, although unmeasured causal variants are a possibility. Inter-ancestry differences in smoking are also a potential explanation. In addition to known differences in smoking patterns[31], there are pronounced ancestry differences in preferred cigarette type, with over 85% of AFR smokers using menthol cigarettes compared to 29% of EUR smokers (in the US)[32]. Menthol cigarettes are thought to facilitate greater absorption of harmful chemicals because of deeper inhalation[31,33] through desensitization of nicotinic acetylcholine receptors that cause nicotine-induced irritation[34].
Evidence for an excess risk of cardiovascular disease associated with mentholated cigarettes, however, is equivocal[35-39]. Ancestry differences in smoking-related metabolites and carcinogens have been reported[40-43], and differential metabolism of key compounds may underlie observed differences by ancestry. Some behaviors/conditions that co-occur with smoking may also differ by ancestry, and this additional factor may modify the observed genetic associations with serum lipids.
The biological mechanisms through which smoking influences the observed genetic associations will require further investigation, as the myriad components of cigarette smoke and their downstream consequences (including oxidative stress and inflammation) affect pathways throughout the body[44]. However, there is evidence for differential expression of PTPRZ1[15], LPL[15] and LDLR[45] in cells exposed to an acute dose of nicotine. Also, concentrations of CETP[46], ApoB[47], and LPL[48] are associated with smoking status.
The sample size attained for diverse ancestries is a key strength of our study, particularly among AFR. As a result, we were able to identify loci that had not been previously detected in meta-analyses of ancestries that are better represented in genomic research. Additionally, our use of nested models in our stage 1 analyses allowed us to more fully characterize loci. Despite these strengths, however, a smaller number of AFR studies were available for stage 2, resulting in an inability to follow up on some of our stage 1 low frequency findings.
In conclusion, this large, multi-ancestry genome-wide study of gene-smoking interactions on serum lipids identified 13 novel loci based on combined analysis of stages 1 and 2, and an additional 16 novel loci based on stage 1 that were unavailable in stage 2.
Some loci were detected only in analyses stratified by smoking status or with a smoking interaction term, thus motivating further study of gene × environment interactions with other lifestyle factors to identify new loci for lipids and other complex traits. We demonstrate the importance of including diverse populations, reaching a sufficient sample size in these analyses for discovery of novel main effect lipid loci for AFR. Careful consideration of ancestry may be of particular importance for gene × environment interactions, as ancestry may be a proxy for both genomic and environmental context.
Details regarding motivation and methodology of this and other projects of the CHARGE Gene-Lifestyle Interactions Working Group are available in our recently published methods paper[11], and detailed information on study design can be found in the Life Sciences Reporting Summary.
Participants
Analyses included men and women between 18 and 80 years of age of European (EUR), African (AFR), Asian (ASN), Hispanic (HISP), and (in stage 2 only) Brazilian (BR) ancestry. Participating studies are described in Supplementary Materials, with further details of sample sizes, trait distribution, and data preparation available in Supplementary Tables 1–6. Considerable effort was expended to engage as many studies of diverse ancestry as possible. This work was approved by the Washington University in St. Louis Institutional Review Board and complies with all relevant ethical regulations. Each study obtained informed consent from participants and received approval from the appropriate institutional review boards.
Phenotypes
Analyses evaluated the concentrations of high-density lipoprotein cholesterol (HDL), low-density lipoprotein cholesterol (LDL), and triglycerides. LDL could be either directly assayed or derived using the Friedewald equation (if triglycerides ≤ 400 mg/dL and individuals were fasting for at least 8 hours). Lipid-lowering drug use was defined as any use of a statin drug or any unspecified lipid-lowering drug after 1994 (when statin use became common). If LDL was directly assayed, adjustment for lipid-lowering drug was performed by dividing the LDL value by 0.7. If LDL was derived using the Friedewald equation, total cholesterol was first adjusted for lipid-lowering drug use (total cholesterol/0.8) before calculation of LDL by the Friedewald equation. No adjustments were made for any other lipid medication, nor were adjustments made to HDL or triglycerides for medication use. If samples were from individuals who were non-fasting (fasting < 8 hours), then neither triglycerides nor calculated LDL were used. Both HDL and triglycerides were natural log-transformed, while LDL remained untransformed. In the event that multiple measurements of lipids were available (i.e. in a longitudinal study), analysts selected the visit for which data were available for the largest number of participants, and the measurement from that visit was included in analyses.
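The LDL derivation and medication adjustment rules above can be sketched as a single function. This is an illustrative reading of the rules, not code from the study: the function and parameter names are hypothetical, all concentrations are assumed to be in mg/dL, and the Friedewald equation is taken in its standard form (LDL = total cholesterol − HDL − triglycerides/5).

```python
def adjusted_ldl(total_chol=None, hdl=None, trig=None, direct_ldl=None,
                 fasting_hours=None, on_lipid_drug=False):
    """Return medication-adjusted LDL in mg/dL, or None if it cannot be derived."""
    # Directly assayed LDL: divide by 0.7 for lipid-lowering drug users.
    if direct_ldl is not None:
        return direct_ldl / 0.7 if on_lipid_drug else direct_ldl

    # Calculated LDL needs all inputs, fasting of at least 8 hours,
    # and triglycerides of at most 400 mg/dL.
    if None in (total_chol, hdl, trig, fasting_hours):
        return None
    if trig > 400 or fasting_hours < 8:
        return None

    # Adjust total cholesterol for drug use before applying Friedewald.
    if on_lipid_drug:
        total_chol = total_chol / 0.8
    return total_chol - hdl - trig / 5.0

# Example with made-up values: a fasting participant on a statin.
print(adjusted_ldl(total_chol=200, hdl=50, trig=150,
                   fasting_hours=10, on_lipid_drug=True))
```

Returning None for non-fasting samples or high triglycerides mirrors the rule that calculated LDL is not used in those cases.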
Environmental Exposure Status
Smoking variables evaluated were current smoking status (yes/no) and ever smoking status (yes/no). Current smokers were included in the exposed group for both of these variables, and never smokers were included in the unexposed group for both of these variables. Former smokers were included in the unexposed group for the current smoking variable and the exposed group for the ever-smoking variable. Smoking variables were coded as 0/1 for unexposed/exposed groups.
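The coding of the two exposure variables can be illustrated as follows. This is a minimal sketch; the function name `smoking_exposures` and the status labels are hypothetical and were not part of the study protocol.

```python
def smoking_exposures(status: str) -> dict:
    """Map a smoking status to the two 0/1 exposure variables.

    `status` must be "never", "former", or "current" (labels are assumptions).
    """
    codes = {
        "never": (0, 0),    # unexposed for both variables
        "former": (0, 1),   # unexposed for current, exposed for ever
        "current": (1, 1),  # exposed for both variables
    }
    if status not in codes:
        raise ValueError(f"unknown smoking status: {status!r}")
    current, ever = codes[status]
    return {"current_smoking": current, "ever_smoking": ever}
```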
Genotype Data
Genotyping was performed by each participating study using genotyping arrays from either Illumina (San Diego, CA, USA) or Affymetrix (Santa Clara, CA, USA). Each study conducted imputation using its own choice of software. The cosmopolitan reference panel from the 1000 Genomes Project Phase I Integrated Release Version 3 Haplotypes (2010–11 data freeze, 2012-03-14 haplotypes) was specified for imputation and used by most studies, with some using the HapMap Phase II reference panel instead. Only autosomal variants with MAF of at least 0.01 were considered. Specific details of each participating study’s genotyping platform and imputation software are described in Supplementary Tables 3 and 6. Genotype was coded as the dosage of the imputed genetic variant, coded additively (0, 1, 2).
Stage 1 Analysis
Stage 1 genome-wide interaction analyses included 29 cohorts contributing data from 51 study/ancestry groups and up to 133,805 individuals of EUR, AFR, ASN, and HISP ancestry (Supplementary Tables 1–3). All cohorts ran three models in all individuals: a main effect model, a model adjusted for smoking, and an interaction model that included a multiplicative interaction term between the variant and smoking status (Figure 1). Additionally, the main effect model was run stratified by smoking exposure (in exposed and unexposed individuals separately). All models were run for 3 lipid traits (HDL, LDL, and triglycerides) and 2 smoking exposures (current smoking and ever smoking). Thus, each study/ancestry group completed 30 GWAS (5 models × 3 traits × 2 exposures).
All models were adjusted for age, sex, and field center (as appropriate). Principal components (PCs) derived using genotyped SNPs were included at the study analyst’s discretion. All AFR cohorts were requested to include at least the first principal component, and 71% of AFR cohorts used multiple PCs (with 25% using 10 PCs). The average number of PCs used was 4.2. Additional cohort-specific covariates could be included if necessary to control for other potential confounding factors. Studies including participants from multiple ancestry groups conducted and reported analyses separately by ancestry. Participating studies provided the estimated genetic main effect and robust estimates of standard error for all requested models. In addition, for the models with an interaction term, studies also reported the interaction effects and robust estimates of their standard errors, and a robust estimate of the corresponding covariance matrix between the main and interaction effects. To obtain robust estimates of covariance matrices and robust standard errors, studies with only unrelated participants used either the R package sandwich or ProbABEL. If the study included related individuals, either generalized estimating equations (R package geepack) or linear mixed models (GenABEL, MMAP, or R) were used. Sample code provided to studies to generate these data has been previously published (see Supplementary Materials[11]).
Extensive quality control (QC) was performed using EasyQC[49] at the study level (examining the results of each study individually), and then at the ancestry level (examining all studies within each ancestry group together). Study-level QC consisted of exclusion of all variants with MAF < 0.01, extensive harmonization of alleles, and comparison of allele frequencies with ancestry-appropriate 1000 Genomes reference data. Ancestry-level QC included the compilation of summary statistics on all effect estimates, standard errors, and p-values across studies to identify potential outliers, and production of SE-N and QQ plots to identify analytical problems (such as improper trait transformations)[50]. Variants were excluded from ancestry-specific meta-analyses for an imputation score < 0.5; the same threshold was implemented regardless of imputation software, as imputation quality measures have been shown to be similar across software[51]. Additionally, variants were excluded if the product of the imputation score and the smaller of the minor allele counts in the exposed and unexposed groups was less than 20. To be included in meta-analyses, each variant had to be available from at least 3 studies or 5,000 individuals contributing data.
Meta-analyses were conducted for all models using the inverse variance-weighted fixed effects method as implemented in METAL. We evaluated both a 1 degree of freedom test of interaction effect (1df) and a 2 degree of freedom joint test of main and interaction effects (2df), following previously published methods[9]. A 1df Wald test was used to evaluate the 1df interaction, as well as the main effect and the smoking-adjusted main effect in models without an interaction term. A 2df Wald test was used to jointly test the effects of both the variant and the variant × smoking interaction[52]. Meta-analyses were conducted within each ancestry separately, and then trans-ancestry meta-analyses were conducted on all ancestry-specific meta-analyses. Genomic control correction was applied before all meta-analyses.
Variants that were associated in any analysis at p ≤ 10⁻⁶ were carried forward for analysis in Stage 2. A total of 17,921 variants from 519 loci (defined by physical distance ±1 Mb) were selected for Stage 2 analyses.
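To make the two test statistics concrete, the following Python sketch computes a 1df Wald test from one estimate and its standard error, and a 2df joint Wald test from the main and interaction estimates together with their 2 × 2 covariance matrix. It illustrates the general form of these tests for a single set of estimates and is not the METAL implementation; the function names `wald_1df` and `wald_2df` are hypothetical.

```python
import math


def wald_1df(beta: float, se: float) -> tuple:
    """1df Wald test: (beta / se)^2 compared with chi-square on 1 df."""
    stat = (beta / se) ** 2
    # Upper tail of chi-square(1) is erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2.0))


def wald_2df(b_g: float, b_gxe: float,
             var_g: float, var_gxe: float, cov: float) -> tuple:
    """2df joint Wald test of the main (b_g) and interaction (b_gxe) effects.

    The statistic is b' V^-1 b, where V is the 2x2 covariance matrix
    [[var_g, cov], [cov, var_gxe]].
    """
    det = var_g * var_gxe - cov ** 2
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    # Closed-form quadratic form using the inverse of a 2x2 matrix.
    stat = (b_g ** 2 * var_gxe - 2.0 * b_g * b_gxe * cov
            + b_gxe ** 2 * var_g) / det
    # Upper tail of chi-square(2) is exp(-x / 2).
    return stat, math.exp(-stat / 2.0)
```

The covariance term matters: ignoring it (setting `cov` to 0) would generally misstate the joint statistic, which is why studies were asked to report the robust covariance between the main and interaction effects.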
Stage 2 Analysis
Variants selected for Stage 2 were evaluated in 50 cohorts, with data from 75 separate ancestry/study groups totaling up to 253,467 individuals (Supplementary Tables 4–6). In addition to the 4 ancestry groups listed above, stage 2 analyses also included studies of Brazilian (BR) individuals. BR were considered only in the trans-ancestry meta-analyses, since there were no stage 1 BR results for meta-analysis. In stage 2, variants were evaluated only in a model with the interaction term (Figure 1).
Study- and ancestry-level QC was carried out as in stage 1. In contrast to stage 1, no additional filters were included for the number of studies or individuals contributing data to stage 2 meta-analyses, as these filters were implemented to reduce the probability of false positives, and were less relevant in stage 2. Stage 2 variants were evaluated in all ancestry groups and for all traits, regardless of which specific meta-analysis met the p-value threshold in the stage 1 analysis. Genomic control was not applied to stage 2 meta-analyses, given the expectation of association. To ensure quality of analyses, all quality control and meta-analyses of replication data were completed independently by analysts at two different institutions (ARB and JLB [NIH], EL, XD, and CTL [Boston University]), with differences resolved through consultation.
Meta-Analyses of Stages 1 and 2
Given the increased power of combined meta-analysis of stage 1 and 2 results compared with a discovery and replication strategy[53], combined stage 1 and 2 meta-analyses were carried out for all the selected variants. We report variants significant at 5 × 10⁻⁸ as well as those significant after Bonferroni correction for 2 smoking traits, 2 interaction tests, and ancestry-specific and trans-ancestry testing, giving a p-value threshold of 6.25 × 10⁻⁹ (5 × 10⁻⁸/8). Loci that are significant at the stricter p-value are identified in the main tables. Loci were defined based on physical distance (±1 Mb) and are described by the index variant (i.e. the most statistically significant variant within each locus). Novelty was determined by physical distance (±1 Mb) from known lipids loci compiled from large meta-analyses[1-5,12]. False discovery rate q values were determined using EasyStrata to implement the Benjamini-Hochberg method of calculation. Results were visualized using R 3.1.0, including the package ‘forestplot’ (Supplementary Figures 3 and 4), and LocusZoom v1.4 (Supplementary Figure 5) for regional association plots.
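The stricter threshold and the Benjamini-Hochberg calculation can be sketched as below. This is a generic illustration of the method, not the EasyStrata implementation; the names `STRICT_THRESHOLD` and `bh_q_values` are hypothetical.

```python
# Bonferroni: 2 smoking traits x 2 interaction tests x 2 levels of
# ancestry testing (ancestry-specific and trans-ancestry) = 8 tests.
STRICT_THRESHOLD = 5e-8 / (2 * 2 * 2)  # 6.25e-9


def bh_q_values(p_values: list) -> list:
    """Benjamini-Hochberg q values, returned in the input order."""
    n = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    q = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        q[i] = running_min
    return q
```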
Smoking Dose Analysis
To further characterize these associations, we evaluated an interaction between smoking dose and a subset of the observed novel loci. Because smoking dose data were not available for many of the included studies, we conducted this secondary analysis in our two largest AFR studies: WHI-SHARE and ARIC. We identified 4 loci from our main results (LOC105378783, CNTNAP2, MIR4686, DGCR8) for follow-up based on the following criteria: an interaction locus (as opposed to a probable main effect); a stronger association observed among smokers compared to non-/never-smokers; and the presence of contributing cohort(s) with smoking dose variables available and with p < 0.05 for the reported result (to ensure sufficient power for analysis). We investigated these 4 loci using 3 methods of characterizing cigarettes per day: a quantitative variable, a categorical variable based on meaningful dose levels (less than half a pack, between half a pack and a pack, and more than a pack per day), and a binary variable defined by the median of cigarettes per day in that cohort. Dose variables were defined separately by smoking status, such that cigarettes per day for former smokers were set to 0 for variables defined for current smokers, while the cigarettes per day for both current and former smokers were quantified when defined for ever smokers. Statistical significance was set at p < 0.0021, a Bonferroni correction for investigation of 4 loci, 3 smoking dose variables, and 2 smoking status exposures (0.05/24).
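The dose definitions and the significance threshold can be sketched as follows. Several details are assumptions made for illustration and are not stated in the text: a pack is taken as 20 cigarettes, the handling of values exactly at half a pack or one pack is arbitrary, never smokers are assigned a dose of 0, and all names are hypothetical.

```python
# Bonferroni: 4 loci x 3 dose variables x 2 smoking status exposures.
DOSE_ALPHA = 0.05 / (4 * 3 * 2)  # about 0.0021

CIGS_PER_PACK = 20  # assumption: pack size is not given in the text


def dose_category(cigs_per_day: float) -> int:
    """0: under half a pack, 1: half a pack to a pack, 2: over a pack."""
    if cigs_per_day < CIGS_PER_PACK / 2:
        return 0
    if cigs_per_day <= CIGS_PER_PACK:  # boundary handling is an assumption
        return 1
    return 2


def dose_for_exposure(status: str, cigs_per_day: float, exposure: str) -> float:
    """Cigarettes per day as used for the 'current' or 'ever' exposure."""
    if status == "never":
        return 0.0  # assumption: never smokers contribute a dose of 0
    if exposure == "current" and status == "former":
        return 0.0  # former smokers are set to 0 for the current variable
    return float(cigs_per_day)
```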
Conditional Analyses
To assess independence of novel loci from established lipids loci, we conducted conditional analyses using GCTA. GCTA’s conditional and joint analysis option (COJO) performs approximate conditional and joint association analyses based on summary statistics from a GWAS meta-analysis and individual genotype data from an ancestry-appropriate reference sample (for LD estimation). For novel loci from predominantly AFR meta-analyses, the LD reference set included unrelated AFR from HUFS, CFS, JHS, ARIC, and MESA (total N = 8,425). For novel loci from predominantly EUR meta-analyses, the LD reference set included unrelated EUR from ARIC (total N = 9,770). With the exception of HUFS, these data were accessed through dbGaP (ARIC phs000280.v2.p1, phs000090.v2.p1; CFS phs000284.v1.p1; JHS phs000286.v4.p1, phs000499.v2.p1; and MESA phs000209.v13.p1, phs000420.v6.p3) and imputed to 1000 Genomes phase 1 v. 3 using the Michigan Imputation Server[54]. For loci with p < 5 × 10⁻⁸ for the 1df test of interaction, results from stage 1 and 2 meta-analyses were adjusted for all known lipids loci. Because a method for running conditional analyses for 2df tests has not been implemented within GCTA, we evaluated loci with p < 5 × 10⁻⁸ for the 2df joint test of main and interaction effects by conditioning stage 1 stratified analyses on known lipids loci (stratified analyses were not conducted in stage 2 studies). The conditioned 2df joint test of main and interaction effects was then calculated using EasyStrata[50] on the conditioned stratified results.
Power Calculations for Detecting Interactions at Known Lipids Loci
To better contextualize our lack of detection of an interaction at a known locus, we conducted power calculations under a variety of scenarios. We explored the power to detect both an interaction and a main effect, making assumptions based on our data, as the sample sizes achieved in this project are comparable to the largest main effect GWAS for lipids[1,5]. Using previously developed analytical power formulas[55], we evaluated three interaction scenarios: a pure interaction effect (no effect in non-smokers and a positive effect in current smokers), a quantitative interaction (effects in the same direction across strata, but of different magnitude), and a qualitative interaction (effects in opposite directions and of different magnitude). We assumed stage 1 + 2 sample sizes and 19% prevalence of smoking (as in our data). For the purposes of illustration, we assumed relatively large effects which explain 0.06% of the variance in the lipid trait; the median variance explained from known lipid loci, as estimated from a previous publication (their Supplemental Table 1)[2], is 0.04%.
Proportion of Variance Explained
To evaluate the proportion of the variance explained by our novel associations, we conducted additional analyses of our variants of interest in cohorts of diverse ancestries (Supplementary Table 16). In each of 10 studies from 4 ancestries (EUR, AFR, ASN, and HISP), we ran a series of nested regression models to determine the relative contribution of each set of additional variables. The first model included only standard covariates (age, sex, center, principal components, etc.). The second model additionally included smoking status (both current and ever smoking). The third added known variants[1-5,12]. The fourth model added all novel variants, and the last model also included interaction terms for novel variants. For the purposes of this analysis, novel variants included the lead variant for each genome-wide significant locus in the meta-analyses of stages 1 and 2 (Table 1) and those that were significant but only available in stage 1 meta-analyses (Table 2). By subtracting the r² values of successive nested regression models, the proportion of variance explained by each additional set of variables was determined. We conducted these analyses using two approaches. In Approach 1, all variants with MAF ≥ 0.01 and imputation quality ≥ 0.3 were included in regression models. While the imputation quality threshold used for the main analyses (≥ 0.5) was higher in order to reduce the risk of spurious associations, we selected a lower threshold for this secondary analysis to maximize the number of variants of interest included. In Approach 2, to avoid possible overfitting, stepwise regression was used for variant selection, such that only variants that were associated (p < 0.05) were retained in the model. All variants were considered in models for each trait and ancestry, regardless of the trait or ancestry in which the association was identified.
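The subtraction of r² values across nested models amounts to the following. The function name `incremental_r2` is hypothetical, and the numbers in the usage example are made up purely to show the arithmetic; they are not results from this study.

```python
def incremental_r2(nested_models: list) -> dict:
    """Variance explained by each added set of variables.

    `nested_models` is a list of (label, r2) pairs ordered from the
    smallest model to the largest; each model contains the previous one.
    """
    increments = {}
    previous_r2 = None
    for label, r2 in nested_models:
        if previous_r2 is not None:
            # Gain in r2 attributable to the variables added at this step.
            increments[label] = r2 - previous_r2
        previous_r2 = r2
    return increments


# Illustrative values only (not study results).
example = [
    ("covariates", 0.100),
    ("smoking", 0.110),
    ("known variants", 0.180),
    ("novel variants", 0.185),
    ("novel interactions", 0.190),
]
gains = incremental_r2(example)
```

The first model serves only as the baseline, so it has no entry in the returned dictionary.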
To evaluate the degree to which our data confirmed previous associations, we evaluated statistically significant associations reported from recent large meta-analyses[1-5,12]. In the event of overlap between reports, the most statistically significant variant-trait association was considered, for a total of 346 unique associations for 269 variants. Output from our main effect models (stage 1) was extracted for all ancestries for each previously reported variant-trait combination. Reproducibility was determined by p < 0.05 in any ancestry and a consistent direction of effect (Supplementary Table 17).
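The reproducibility criterion can be expressed as a small check. This sketch adopts one reading of the criterion, namely that the nominal p-value and the consistent direction must occur in the same ancestry; that reading, the function name `is_reproduced`, and the input layout are assumptions.

```python
def is_reproduced(reported_beta: float, results_by_ancestry: dict) -> bool:
    """True if any ancestry shows p < 0.05 with the reported direction.

    `results_by_ancestry` maps an ancestry label to a (beta, p) pair
    taken from the stage 1 main effect model, with beta aligned to the
    same effect allele as the previously reported association.
    """
    for beta, p in results_by_ancestry.values():
        same_direction = (beta > 0) == (reported_beta > 0)
        if p < 0.05 and same_direction:
            return True
    return False
```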
Functional Inference
To evaluate the degree to which our novel variants might influence other cardiometabolic traits, we extracted our novel variants (Tables 1 and 2) from previous studies. Supplementary Tables 19–24 present the association of these variants with coronary artery disease and myocardial infarction, using data from the CARDIoGRAM consortium[56]; neurological traits, using data from the Neurology Working Group of the CHARGE Consortium; anthropometry, using data from the GIANT consortium[57-59]; adoptive smoking interaction, using data from the GIANT consortium[60]; diabetes and related traits, using data from MAGIC[61], AAGILE[62], and DIAGRAM[63,64]; and kidney outcomes, using data from the COGENT-Kidney consortium[65].
To conduct functional annotation of our novel variants (Supplementary Tables 18, 25–27), we used NCBI Entrez gene (see URLs) for gene information, dbSNP to translate positions to human genome build 38, HaploReg (v4.1) and RegulomeDB for gene expression and regulation data from ENCODE and RoadMap projects, and GTEx v7.0 for additional gene expression information. We also investigated our novel variants in cis- and trans-eQTL data based on analysis of the whole blood of Framingham Heart Study participants[66].
Pathway and Gene Set Enrichment Analyses
We conducted DEPICT analyses[13] based on genome-wide significant (p < 5 × 10⁻⁸) variants separately for the three traits HDL, LDL, and triglycerides (Supplementary Tables 28–37). To obtain input for the prioritization and enrichment analyses, DEPICT first created a list of non-overlapping loci by applying a combined distance and LD based threshold (500 kb flanking regions and LD r² > 0.1) between the associated variants and the 1000 Genomes reference data. DEPICT then obtained lists of overlapping genes by applying an LD based threshold (r² > 0.5) between the non-overlapping variants and known functional coding or cis-acting regulatory variants for the respective genes. Finally, the major histocompatibility complex region on chromosome 6 (base position 25,000,000–35,000,000) was removed from further analyses. First, DEPICT prioritized genes at associated regions by comparing functional similarity of genes across associated loci using a gene score that was adjusted for several confounders, such as gene length. Utilizing lead variants from 500 pre-compiled null GWAS, the scoring step was repeated 50 times to obtain an experiment-wide FDR for the gene prioritization. Second, DEPICT conducted gene-set enrichment analyses based on a total of 14,461 pre-compiled reconstituted gene sets. The reconstituted gene sets involve 737 Reactome database pathways, 2,473 phenotypic gene sets (derived from the Mouse Genetics Initiative)[67], 184 Kyoto Encyclopedia of Genes and Genomes (KEGG) database pathways, 5,083 Gene Ontology database terms, and 5,984 protein molecular pathways (derived from protein-protein interactions[68]). Third, DEPICT conducted tissue and cell type enrichment analyses based on expression data in any of the 209 MeSH annotations for 37,427 microarrays of the Affymetrix U133 Plus 2.0 Array platform. In addition, we used the STRING database to identify protein-protein interactions.
Data Availability
All summary results will be made available in dbGaP (phs000930.v7.p1).