
Fine-mapping of 150 breast cancer risk regions identifies 191 likely target genes.

Hugues Aschard^1,2,3, Jonathan Beesley⁴, Laura Fachal⁵, Daniel R Barnes⁶, Jamie Allen⁶, Siddhartha Kar⁵, Karen A Pooley⁶, Joe Dennis⁶, Kyriaki Michailidou^6,7, Constance Turman³, Penny Soucy⁸, Audrey Lemaçon⁸, Michael Lush⁶, Jonathan P Tyrer⁵, Maya Ghoussaini⁵, Mahdi Moradi Marjaneh^4,9, Xia Jiang², Simona Agata¹⁰, Kristiina Aittomäki¹¹, M Rosario Alonso¹², Irene L Andrulis^13,14, Hoda Anton-Culver¹⁵, Natalia N Antonenkova¹⁶, Adalgeir Arason^17,18, Volker Arndt¹⁹, Kristan J Aronson²⁰, Banu K Arun²¹, Bernd Auber²², Paul L Auer^23,24, Jacopo Azzollini²⁵, Judith Balmaña^26,27, Rosa B Barkardottir^17,18, Daniel Barrowdale⁶, Alicia Beeghly-Fadiel²⁸, Javier Benitez^29,30, Marina Bermisheva³¹, Katarzyna Białkowska³², Amie M Blanco³³, Carl Blomqvist^34,35, William Blot^28,36, Natalia V Bogdanova^16,37,38, Stig E Bojesen^39,40,41, Manjeet K Bolla⁶, Bernardo Bonanni⁴², Ake Borg⁴³, Kristin Bosse⁴⁴, Hiltrud Brauch^45,46,47, Hermann Brenner^19,47,48, Ignacio Briceno^49,50, Ian W Brock⁵¹, Angela Brooks-Wilson^52,53, Thomas Brüning⁵⁴, Barbara Burwinkel^55,56, Saundra S Buys⁵⁷, Qiuyin Cai²⁸, Trinidad Caldés⁵⁸, Maria A Caligo⁵⁹, Nicola J Camp⁶⁰, Ian Campbell^61,62, Federico Canzian⁶³, Jason S Carroll⁶⁴, Brian D Carter⁶⁵, Jose E Castelao⁶⁶, Jocelyne Chiquette⁶⁷, Hans Christiansen³⁷, Wendy K Chung⁶⁸, Kathleen B M Claes⁶⁹, Christine L Clarke⁷⁰, J Margriet Collée⁷¹, Sten Cornelissen⁷², Fergus J Couch⁷³, Angela Cox⁵¹, Simon S Cross⁷⁴, Cezary Cybulski³², Kamila Czene⁷⁵, Mary B Daly⁷⁶, Miguel de la Hoya⁵⁸, Peter Devilee^77,78, Orland Diez^79,80, Yuan Chun Ding⁸¹, Gillian S Dite⁸², Susan M Domchek⁸³, Thilo Dörk³⁸, Isabel Dos-Santos-Silva⁸⁴, Arnaud Droit^8,85, Stéphane Dubois⁸, Martine Dumont⁸, Mercedes Duran⁸⁶, Lorraine Durcan^87,88, Miriam Dwek⁸⁹, Diana M Eccles⁹⁰, Christoph Engel⁹¹, Mikael Eriksson⁷⁵, D Gareth Evans^92,93, Peter A Fasching^94,95, Olivia Fletcher⁹⁶, Giuseppe Floris⁹⁷, Henrik Flyger⁹⁸, Lenka Foretova⁹⁹, William D Foulkes¹⁰⁰, Eitan Friedman^101,102, Lin Fritschi¹⁰³, Debra 
Frost⁶, Marike Gabrielson⁷⁵, Manuela Gago-Dominguez^104,105, Gaetana Gambino⁵⁹, Patricia A Ganz¹⁰⁶, Susan M Gapstur⁶⁵, Judy Garber¹⁰⁷, José A García-Sáenz¹⁰⁸, Mia M Gaudet⁶⁵, Vassilios Georgoulias¹⁰⁹, Graham G Giles^82,110,111, Gord Glendon¹³, Andrew K Godwin¹¹², Mark S Goldberg^113,114, David E Goldgar¹¹⁵, Anna González-Neira³⁰, Maria Grazia Tibiletti¹¹⁶, Mark H Greene¹¹⁷, Mervi Grip¹¹⁸, Jacek Gronwald³², Anne Grundy¹¹⁹, Pascal Guénel¹²⁰, Eric Hahnen^121,122, Christopher A Haiman¹²³, Niclas Håkansson¹²⁴, Per Hall^75,125, Ute Hamann¹²⁶, Patricia A Harrington⁵, Jaana M Hartikainen^127,128,129, Mikael Hartman^130,131, Wei He⁷⁵, Catherine S Healey⁵, Bernadette A M Heemskerk-Gerritsen¹³², Jane Heyworth¹³³, Peter Hillemanns³⁸, Frans B L Hogervorst¹³⁴, Antoinette Hollestelle¹³², Maartje J Hooning¹³², John L Hopper⁸², Anthony Howell¹³⁵, Guanmengqian Huang¹²⁶, Peter J Hulick^136,137, Evgeny N Imyanitov¹³⁸, Claudine Isaacs¹³⁹, Motoki Iwasaki¹⁴⁰, Agnes Jager¹³², Milena Jakimovska¹⁴¹, Anna Jakubowska^32,142, Paul A James^62,143, Ramunas Janavicius^144,145, Rachel C Jankowitz¹⁴⁶, Esther M John¹⁴⁷, Nichola Johnson⁹⁶, Michael E Jones¹⁴⁸, Arja Jukkola-Vuorinen¹⁴⁹, Audrey Jung¹⁵⁰, Rudolf Kaaks¹⁵⁰, Daehee Kang^151,152,153, Pooja Middha Kapoor^150,154, Beth Y Karlan^155,156, Renske Keeman⁷², Michael J Kerin¹⁵⁷, Elza Khusnutdinova^31,158, Johanna I Kiiski¹⁵⁹, Judy Kirk¹⁶⁰, Cari M Kitahara¹⁶¹, Yon-Dschun Ko¹⁶², Irene Konstantopoulou¹⁶³, Veli-Matti Kosma^127,128,129, Stella Koutros¹⁶⁴, Katerina Kubelka-Sabit¹⁶⁵, Ava Kwong^166,167,168, Kyriacos Kyriacou⁷, Yael Laitman¹⁰¹, Diether Lambrechts^169,170, Eunjung Lee¹²³, Goska Leslie⁶, Jenny Lester^155,156, Fabienne Lesueur^171,172,173, Annika Lindblom^174,175, Wing-Yee Lo⁴⁵, Jirong Long²⁸, Artitaya Lophatananon¹⁷⁶, Jennifer T Loud¹¹⁷, Jan Lubiński³², Robert J MacInnis^82,110, Tom Maishman^87,88, Enes Makalic⁸², Arto Mannermaa^127,128,129, Mehdi Manoochehri¹²⁶, Siranoush Manoukian²⁵, Sara Margolin^125,177, Maria Elena Martinez^105,178, 
Keitaro Matsuo^179,180, Tabea Maurer¹⁸¹, Dimitrios Mavroudis¹⁰⁹, Rebecca Mayes⁵, Lesley McGuffog⁶, Catriona McLean¹⁸², Noura Mebirouk^171,172,183, Alfons Meindl¹⁸⁴, Austin Miller¹⁸⁵, Nicola Miller¹⁵⁷, Marco Montagna¹⁰, Fernando Moreno¹⁰⁸, Kenneth Muir¹⁷⁶, Anna Marie Mulligan^186,187, Victor M Muñoz-Garzon¹⁸⁸, Taru A Muranen¹⁵⁹, Steven A Narod¹⁸⁹, Rami Nassir¹⁹⁰, Katherine L Nathanson⁸³, Susan L Neuhausen⁸¹, Heli Nevanlinna¹⁵⁹, Patrick Neven⁹⁷, Finn C Nielsen¹⁹¹, Liene Nikitina-Zake¹⁹², Aaron Norman¹⁹³, Kenneth Offit^194,195, Edith Olah¹⁹⁶, Olufunmilayo I Olopade¹⁹⁷, Håkan Olsson¹⁹⁸, Nick Orr¹⁹⁹, Ana Osorio^29,30, V Shane Pankratz²⁰⁰, Janos Papp¹⁹⁶, Sue K Park^151,152,153, Tjoung-Won Park-Simon³⁸, Michael T Parsons⁴, James Paul²⁰¹, Inge Sokilde Pedersen^202,203,204, Bernard Peissel²⁵, Beth Peshkin¹³⁹, Paolo Peterlongo²⁰⁵, Julian Peto⁸⁴, Dijana Plaseska-Karanfilska¹⁴¹, Karolina Prajzendanc³², Ross Prentice²³, Nadege Presneau⁸⁹, Darya Prokofyeva¹⁵⁸, Miquel Angel Pujana²⁰⁶, Katri Pylkäs^207,208, Paolo Radice²⁰⁹, Susan J Ramus^210,211, Johanna Rantala²¹², Rohini Rau-Murthy¹⁹⁵, Gad Rennert²¹³, Harvey A Risch²¹⁴, Mark Robson¹⁹⁵, Atocha Romero²¹⁵, Maria Rossing¹⁹¹, Emmanouil Saloustros²¹⁶, Estela Sánchez-Herrero²¹⁵, Dale P Sandler²¹⁷, Marta Santamariña^29,218,219, Christobel Saunders²²⁰, Elinor J Sawyer²²¹, Maren T Scheuner³³, Daniel F Schmidt^82,222, Rita K Schmutzler^121,122, Andreas Schneeweiss^56,223, Minouk J Schoemaker¹⁴⁸, Ben Schöttker^19,224, Peter Schürmann³⁸, Christopher Scott¹⁹³, Rodney J Scott^225,226,227, Leigha Senter²²⁸, Caroline M Seynaeve¹³², Mitul Shah⁵, Priyanka Sharma²²⁹, Chen-Yang Shen^230,231, Xiao-Ou Shu²⁸, Christian F Singer²³², Thomas P Slavin²³³, Snezhana Smichkoska²³⁴, Melissa C Southey^111,235, John J Spinelli^236,237, Amanda B Spurdle⁴, Jennifer Stone^82,238, Dominique Stoppa-Lyonnet^183,239,240, Christian Sutter²⁴¹, Anthony J Swerdlow^148,242, Rulla M Tamimi^2,3,243, Yen Yen Tan²⁴⁴, William J Tapper⁹⁰, Jack A Taylor^217,245, Manuel R 
Teixeira^246,247, Maria Tengström^127,248,249, Soo Hwang Teo^250,251, Mary Beth Terry²⁵², Alex Teulé²⁵³, Mads Thomassen²⁵⁴, Darcy L Thull²⁵⁵, Marc Tischkowitz^100,256, Amanda E Toland²⁵⁷, Rob A E M Tollenaar²⁵⁸, Ian Tomlinson^259,260, Diana Torres^49,126, Gabriela Torres-Mejía²⁶¹, Melissa A Troester²⁶², Thérèse Truong¹²⁰, Nadine Tung²⁶³, Maria Tzardi²⁶⁴, Hans-Ulrich Ulmer²⁶⁵, Celine M Vachon²⁶⁶, Christi J van Asperen²⁶⁷, Lizet E van der Kolk¹³⁴, Elizabeth J van Rensburg²⁶⁸, Ana Vega²⁶⁹, Alessandra Viel²⁷⁰, Joseph Vijai^194,195, Maartje J Vogel¹³⁴, Qin Wang⁶, Barbara Wappenschmidt^121,122, Clarice R Weinberg²⁷¹, Jeffrey N Weitzel²³³, Camilla Wendt¹⁷⁷, Hans Wildiers⁹⁷, Robert Winqvist^207,208, Alicja Wolk^124,272, Anna H Wu¹²³, Drakoulis Yannoukakos¹⁶³, Yan Zhang^19,47, Wei Zheng²⁸, David Hunter²⁷³, Paul D P Pharoah^5,6, Jenny Chang-Claude^150,181, Montserrat García-Closas^164,274, Marjanka K Schmidt^72,275, Roger L Milne^82,110,111, Vessela N Kristensen^{276,277,278,279}, Juliet D French⁴, Stacey L Edwards⁴, Antonis C Antoniou⁶, Georgia Chenevix-Trench⁴, Jacques Simard⁸, Douglas F Easton^5,6, Peter Kraft^280,281, Alison M Dunning²⁸².

Abstract

Genome-wide association studies have identified breast cancer risk variants in over 150 genomic regions, but the mechanisms underlying risk remain largely unknown. These regions were explored by combining association analysis with in silico genomic feature annotations. We defined 205 independent risk-associated signals with the set of credible causal variants in each one. In parallel, we used a Bayesian approach (PAINTOR) that combines genetic association, linkage disequilibrium and enriched genomic features to determine variants with high posterior probabilities of being causal. Potentially causal variants were significantly over-represented in active gene regulatory regions and transcription factor binding sites. We applied our INQUISIT pipeline for prioritizing genes as targets of those potentially causal variants, using gene expression (expression quantitative trait loci), chromatin interaction and functional annotations. Known cancer drivers, transcription factors and genes in the developmental, apoptosis, immune system and DNA integrity checkpoint gene ontology pathways were over-represented among the highest-confidence target genes.


Year: 2020 PMID: 31911677 PMCID: PMC6974400 DOI: 10.1038/s41588-019-0537-1

Source DB: PubMed Journal: Nat Genet ISSN: 1061-4036 Impact factor: 38.330

Introduction

Genome-wide association studies (GWAS) have identified genetic variants associated with breast cancer risk in more than 150 genomic regions [1,2]. However, the variants and genes driving these associations are mostly unknown, with fewer than 20 regions studied in detail [3-20]. Here, we aimed to fine-map all known breast cancer susceptibility regions using dense genotype data on > 217K subjects participating in the Breast Cancer Association Consortium (BCAC) and the Consortium of Investigators of Modifiers of BRCA1/2 (CIMBA). All samples were genotyped using the OncoArray™ [1,2,21] or the iCOGS chip [22,23]. Stepwise multinomial logistic regression was used to identify independent association signals in each region and define credible causal variants (CCVs) within each signal. We found genomic features significantly overlapping the CCVs. We then used a Bayesian approach, integrating genomic features and genetic associations, to refine the set of likely causal variants and calculate their posterior probabilities. Finally, we integrated genetic and in silico epigenetic, expression and chromatin conformation data to infer the likely target genes of each signal.

Results

Most breast cancer genomic regions contain multiple independent risk-associated signals

We included 109,900 breast cancer cases and 88,937 controls, all of European ancestry, from 75 studies in the BCAC. Genotypes (directly observed or imputed) were available for 639,118 single nucleotide polymorphisms (SNPs), deletions/insertions, and copy number variants (CNVs) with minor allele frequency (MAF) ≥ 0.1% within 152 previously defined risk-associated regions (Supplementary Table 1; Figure 1). Multivariate logistic regression confirmed associations for 150/152 regions at a p-value < 10^-4 significance threshold (Supplementary Table 2A). To determine the number of independent risk signals within each region, we applied stepwise multinomial logistic regression, deriving the association of each variant, conditional on the more significant ones, in order of statistical significance. Finally, we defined CCVs in each signal as variants with conditional p-values within two orders of magnitude of the index variant [24]. We classified the evidence for each independent signal, and its CCVs, as either strong (conditional p-values < 10^-6) or moderate (10^-6 < conditional p-values < 10^-4).
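The CCV rule and the strong/moderate classification described above can be illustrated with a small sketch. This is not the study's code: the function names, the variant labels and the p-values are hypothetical, and the sketch assumes the conditional p-values for one signal have already been computed by the stepwise regression.

```python
from typing import Dict, List


def credible_causal_variants(cond_pvalues: Dict[str, float],
                             fold: float = 100.0) -> List[str]:
    """Return variants whose conditional p-value lies within `fold`
    (two orders of magnitude by default) of the index variant's p-value."""
    if not cond_pvalues:
        return []
    index_p = min(cond_pvalues.values())  # the index variant has the smallest p-value
    cutoff = index_p * fold
    return sorted(v for v, p in cond_pvalues.items() if p <= cutoff)


def signal_strength(index_p: float) -> str:
    """Classify a signal by its index conditional p-value, using the
    thresholds quoted in the text (strong < 10^-6, moderate < 10^-4)."""
    if index_p < 1e-6:
        return "strong"
    if index_p < 1e-4:
        return "moderate"
    return "none"


# Hypothetical conditional p-values for one signal (illustrative only).
signal = {"var_a": 2e-12, "var_b": 5e-11, "var_c": 3e-9, "var_d": 1e-5}
ccvs = credible_causal_variants(signal)
print(ccvs, signal_strength(min(signal.values())))
```

In this toy signal only the two variants within a factor of 100 of the index p-value are retained as CCVs.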

Figure 1

Flowchart summarizing the study design.

Logistic regression summary statistics were used to select the final set of variants to run stepwise multinomial regression. These results were meta-analysed with CIMBA to provide the final set of strong independent signals and their CCVs. Through a case-only analysis we identified significant differences in effect sizes between ER-positive and ER-negative breast cancer and used this to classify the phenotype for each independent signal. With these strong CCVs, we ran the bio-features enrichment analysis, which identified the features to be included in the PAINTOR models, together with the OncoArray logistic regression summary statistics, and the OncoArray LD. Both multinomial regression CCVs and PAINTOR high Posterior Probability variants were analyzed with INQUISIT to determine high confidence target genes. Finally, we used the set of high confidence target genes to identify enriched pathways.

[a] Conditional on the index variants from BCAC strong signals.

From the 150 genomic regions we identified 352 independent risk signals containing 13,367 CCVs; 7,394 of these were within the 196 strong-evidence signals across 129 regions (Figures 2A-B). The number of signals per region ranged from 1 to 11, with 79 (53%) containing multiple signals. We noted a wide range of CCVs per signal, but in 42 signals there was only a single CCV: for these signals, the simplest hypothesis is that the CCV is causal (Figures 2C-D, Table 1). Furthermore, within signals with few CCVs (<10), the mean scaled CADD score was higher than in signals with more CCVs (13.1 vs 6.7 for CCVs in exons; P(t-test) = 2.7x10^-4), suggesting that these are more likely to be functional.

Figure 2

Determining independent risk signals and credible candidate variants (CCVs).

(a) Number of independent signals per region identified through multinomial stepwise logistic regression. (b) Signal classification according to their confidence into strong and moderate confidence signals. (c) Number of CCVs per signal at strong confidence signals identified through multinomial stepwise logistic regression. (d) Number of CCVs per signal at moderate confidence signals identified through multinomial stepwise regression. (e) Subtype classification of strong signals into ER-positive, ER-negative and signals equally associated with both phenotypes (ER-neutral) from BCAC analysis. (f) Subtype classification from the meta-analysis of BCAC and CIMBA. Between brackets, number of CCVs from the meta-analysis of BCAC and CIMBA. (g) Number of variants at different posterior probability thresholds. 15 variants reach a PP ≥ 80% by at least one of the three models (ER-all, ER-positive, ER-negative).

Table 1

Signals with single CCVs and variants with PP > 80%

								ER-negative		ER-positive
Fine-mapping region[a]	Variant [b]	Ref/Alt [c]	EAF[d]	PP[e]	Model[f]	Signal[g]	N CCV[h]	OR[i]	(95%CI)	OR[i]	(95%CI)	P-value[i]	FP[j]	Predicted target gene(s)[k]	Confidence[l]
chr1:120723447-121780613	rs11249433	A/G	0.42	0.57	ERALL	Signal 1	1	1.02	(0.99-1.04)	1.13	(1.11-1.15)	8.11x10^-60	na	na
chr1:200937832-201937832	rs35383942	C/T	0.06	0.96	ERALL	Signal 1	2	1.10	(1.05-1.16)	1.09	(1.06-1.13)	1.14x10^-7	D	TNNI1	Level 1
chr2:201681247-202681247	rs3769821	C/T	0.66	0.40	ERALL	Signal 1	1	0.94	(0.92-0.97)	0.95	(0.93-0.96)	1.46x10^-12	D	ALS2CR12	Level 1
chr2:217405832-218796508	rs4442975 [n]	G/T	0.48	0.84	ERALL	Signal 1	1	0.94	(0.92-0.97)	0.86	(0.85-0.87)	2.50x10^-90	D	IGFBP5[m]	Level 2
chr4:105569013-106856761	esv3601665	-/Alu	0.07	0.95	ERPOS			1.01	(0.95-1.08)	1.10	(1.06-1.14)	3.27x10^-6	D	ARHGEF38, AC004066.3	Level 1
chr5:779790-1797488	rs10069690	C/T	0.27	0.58	ERNEG	Signal 1	1	1.18	(1.15-1.21)	1.03	(1.01-1.05)	1.20x10^-34	D	SLC6A18, TERT[m]	Level 2
chr5:44013304-45206498	rs10941679	A/G	0.26	0.00	ERPOS	Signal 1	1	1.04	(1.02-1.07)	1.17	(1.15-1.19)	1.50x10^-77	D	MRPS30	Level 2
chr5:44013304-45206498	rs5867671	A/-	0.77	0.01	ERPOS	Signal 2	1	0.91	(0.89-0.94)	0.99	(0.97-1.01)	2.25x10^-9	na	na
chr5:44013304-45206498	rs190443933	T/C	0.01	0.00	ERALL	Signal 4	1	1.30	(1.14-1.48)	1.26	(1.16-1.37)	2.32x10^-8	na	na
chr5:55531884-56587883	rs984113	G/C	0.61	0.81	ERPOS	Signal 2	1	0.96	(0.93-0.98)	0.96	(0.94-0.97)	3.51x10^-8	D	MAP3K1[m]	Level 2
chr5:55531884-56587883	rs889310	C/T	0.56	0.84	ERPOS	(Signal 6)	15	1.03	(1.00-1.05)	1.05	(1.03-1.06)	1.75x10^-7	D	MAP3K1[m]	Level 1
chr6:15899557-16899557	rs3819405	C/T	0.32	0.96	ERALL	Signal 1	1	0.97	(0.95-1.00)	0.95	(0.94-0.97)	1.14x10^-7	D	ATXN1, RP1-151F17.1, RP1-151F17.2	Level 2
chr6:151418856-152937016	rs12173562	C/T	0.08	0.10	ERNEG	Signal 1	1	1.30	(1.25-1.36)	1.14	(1.11-1.18)	3.98x10^-40	D	ESR1[m]	Level 1
	rs34133739	-/C	0.53	0.25	ERALL	Signal 2	1	1.11	(1.09-1.14)	1.05	(1.04-1.07)	2.36x10^-22	D	ESR1[m]	Level 1
	rs851984	G/A	0.40	0.73	ERALL	Signal 3	1	1.07	(1.04-1.09)	1.05	(1.04-1.07)	3.69x10^-13	D	ESR1[m]	Level 1
chr7:130167121-131167121	rs68056147	G/A	0.30	0.84	ERALL			1.04	(1.01-1.07)	1.05	(1.03-1.06)	3.07x10^-7	D	MKLN1	Level 2
chr8:127424659-130041931	rs35961416	-/A	0.41	0.68	ERALL	Signal 3	1	0.97	(0.94-0.99)	0.95	(0.93-0.96)	9.97x10^-11	D	MYC[m]	Level 1
chr9:21247803-22624477	rs539723051	AAAA/-	0.33	0.43	ERALL	Signal 1	1	1.08	(1.05-1.11)	1.06	(1.04-1.08)	1.81x10^-15	na	na
chr9:109803808-111395353	rs10816625	A/G	0.07	0.95	ERPOS	Signal 3	1	1.06	(1.01-1.11)	1.13	(1.10-1.16)	3.62x10^-15	D	KLF4[m]	Level 2
chr9:109803808-111395353	rs13294895	C/T	0.18	0.93	ERPOS	Signal 4	1	1.01	(0.98-1.05)	1.09	(1.07-1.11)	4.00x10^-17	D	KLF4[m]	Level 1
chr9:109803808-111395353	rs60037937	AA/-	0.22	0.68	ERPOS	Signal 2	1	1.02	(0.99-1.06)	1.11	(1.09-1.13)	3.17x10^-26	D	KLF4[m], RAD23B	Level 2
chr10:63758684-65063702	rs10995201	A/G	0.15	0.31	ERALL	Signal 1	1	0.91	(0.88-0.94)	0.87	(0.85-0.89)	1.40x10^-37	na	na
chr10:122593901-123849324	rs35054928	C/-	0.56	0.60	ERALL	Signal 1	1	0.96	(0.94-0.98)	0.74	(0.73-0.76)	6.55x10^-342	D	FGFR2[m]	Level 1
	rs45631563 [n]	A/T	0.04	0.93	ERPOS	Signal 3	1	0.97	(0.92-1.03)	0.76	(0.73-0.79)	4.84x10^-44	C	FGFR2[m]	Level 2
	rs7899765	T/C	0.06	0.02	ERALL	Signal 5	1	1.01	(0.97-1.06)	0.87	(0.84-0.90)	2.21x10^-18	D	FGFR2[m]	Level 1
chr11:68831418-69879161	rs78540526	C/T	0.09	0.91	ERPOS	Signal 1	1	1.01	(0.97-1.06)	1.40	(1.36-1.44)	2.77x10^-145	D	CCND1[m], MYEOV	Level 1
chr12:27639846-29034415	rs7297051	C/T	0.23	0.23	ERALL	Signal 1	1	0.87	(0.85-0.90)	0.89	(0.88-0.91)	3.12x10^-43	D	CCDC91[m], PTHLH[m], RP11-967K21.1	Level 2
chr12:115336522-116336522	rs35422	G/A	0.57	0.58	ERPOS	Signal 2	1	0.98	(0.96-1.01)	1.05	(1.03-1.07)	4.85x10^-10	D	TBX3	Level 1
chr14:91341069-92368623	rs7153397	C/T	0.70	0.81	ERPOS	Signal 1	3	1.01	(0.99-1.04)	1.06	(1.04-1.08)	3.25x10^-11	D,C	CCDC88C, CTD-2547L24.4, C14orf159, GPR68, RPS6KA5, RP11-73M18.7, RP11-895M11.3	Level 2
chr16:52038825-53038825	rs4784227	C/T	0.27	0.95	ERPOS	Signal 1	1	1.15	(1.12-1.18)	1.26	(1.24-1.28)	4.63x10^-160	D	TOX3[m]	Level 1
chr18:23832476-25075396	rs180952292	T/C	0.01	0.01	ERNEG	Signal 4	1	1.24	(1.12-1.37)	0.98	(0.92-1.05)	2.07x10^-5	na	na
chr18:41899590-42899590	rs9952980	T/C	0.34	0.95	ERALL	Signal 2	3	0.97	(0.94-0.99)	0.95	(0.93-0.96)	7.43x10^-12	D	SLC14A2	Level 2
chr20:5448227-6448227	rs16991615	G/A	0.07	0.97	ERALL	Signal 1	1	1.09	(1.04-1.15)	1.07	(1.04-1.11)	7.89x10^-7	D, C	GPCPD1, MCM8	Level 2
chr22:45783297-46783297	rs184070480	C/T	0.01	0.00	ERALL	Signal 2	1	1.40	(1.20-1.64)	1.01	(0.91-1.12)	5.02x10^-5	D	ATXN10, WNT7B	Level 2

[a] GRCh37/hg19, bp

[b] Current reference ID

[c] Reference (Ref) versus Alternative (Alt) Allele

[d] Effect allele (Alt allele) frequency in OncoArray

[e] PP: Posterior probability. Largest posterior probability in all evaluated models

[f] Model where the variant reaches the largest posterior probability

[g] Signal where the variant is included. Between brackets, moderate confidence signals.

[h] Number of CCVs in the signal

[i] Multinomial logistic regression summary statistics, χ2 single variant analysis p-value, estimated using 67,136 ER-positive and 17,506 ER-negative cases, together with 88,937 controls.

[j] D: Distal regulation, P: proximal regulation, C: coding; na: prediction not available

[k] Predicted target genes with the largest confidence level for each variant. Between brackets, largest confidence level. na: prediction not available

[l] INQUISIT level of confidence

[m] Target genes with functional follow up

[n] Two variants reach PP > 0.8 in both the ERall and ERpos models; rs4442975: ERpos PP = 0.83, ERall PP = 0.84; rs45631563: ERpos PP = 0.93, ERall PP = 0.92

The majority of breast tumors express the estrogen receptor (ER-positive), but ~20% do not (ER-negative); these two tumor types have distinct biological and clinical characteristics [25]. Using a case-only analysis for the 196 strong-evidence signals, we found 66 signals (34%; containing 1,238 CCVs) where the lead variant conferred a greater relative risk of developing ER-positive tumors (false discovery rate, FDR 5%), and 29 (15%; 646 CCVs) where the lead variant conferred a greater risk of ER-negative tumors (FDR 5%) (Supplementary Table 2B, Figure 2E). The remaining 101 signals (51%; 5,510 CCVs) showed no difference by ER status (referred to as ER-neutral). Patients with BRCA1 mutations are more likely to develop ER-negative tumors [26]. Hence, to increase our power to identify ER-negative signals, we performed a fixed-effects meta-analysis, combining association results from BRCA1 mutation carriers in CIMBA with the BCAC ER-negative association results. This meta-analysis identified ten additional signals, seven ER-negative and three ER-neutral, making 206 strong-evidence signals (17% ER-negative) containing 7,652 CCVs in total (Figure 2F). More than one quarter of the CCVs (2,277) were accounted for by one signal, resulting from strong linkage disequilibrium with a copy number variant. The remaining analyses focused on the other 205 strong signals across 128 regions (Supplementary Table 2C). The proportion of the familial relative risk of breast cancer (FRR) explained by all 206 strong signals was 20.6%, compared with 17.6% when only the lead SNP for each region was considered. The proportion of the FRR explained increased by a further 3% (to 23.6%) when all 352 signals were considered (Supplementary Table 2D).
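For readers unfamiliar with the fixed-effects meta-analysis used to combine the CIMBA and BCAC ER-negative results, the following is a minimal sketch. It assumes the standard inverse-variance weighting, which the text does not spell out; the function name and the effect sizes are hypothetical and are not taken from the study.

```python
import math
from typing import Sequence, Tuple


def fixed_effects_meta(betas: Sequence[float],
                       ses: Sequence[float]) -> Tuple[float, float, float]:
    """Inverse-variance weighted fixed-effects meta-analysis.

    betas: per-study log effect sizes (e.g. log odds ratios)
    ses:   their standard errors
    Returns the pooled estimate, its standard error and the z-score.
    """
    weights = [1.0 / se ** 2 for se in ses]      # weight = 1 / variance
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    return beta, se, beta / se


# Hypothetical log effect sizes for one variant in two data sets.
beta, se, z = fixed_effects_meta(betas=[0.10, 0.16], ses=[0.02, 0.04])
print(f"pooled beta={beta:.3f}, se={se:.4f}, z={z:.2f}")
```

The more precise study dominates the pooled estimate, which is why adding a second data set can raise power for signals that are weak in either one alone.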

CCVs are over-represented in active gene-regulatory regions and transcription factor binding sites

We constructed a database of mapped genomic features in seven primary cells derived from normal breast and 19 breast cell lines using publicly available data, resulting in 811 annotation tracks in total. These ranged from general features, such as whether a variant was in an exon or in open chromatin, to more specific features, such as cell-specific TF binding or histone marks (determined through ChIP-Seq experiments) in breast-derived cells or cell lines. Using logistic regression, we examined the overlap of these genomic features with the positions of 5,117 CCVs in the 195 strong-evidence BCAC signals versus the positions of 622,903 variants excluded as credible candidates in the same regions (Supplementary Figure 1A, Supplementary Table 3). We found significant enrichment of CCVs (FDR 5%) in the following genomic features: Open chromatin (determined by DNase-seq and FAIRE-seq) in ER-positive breast cancer cell lines and normal breast (Figure 3A). Conversely, we found depletion of CCVs within heterochromatin (determined by the H3K9me3 mark in normal breast, and by chromatin state in ER-positive cells [27]).

Figure 3

Overlap of CCVs with gene regulatory regions gene bodies and transcription factor binding sites.

(a) Breast cancer CCVs overlap with chromatin states and broad breast cells epigenetic marks. (b) Breast cancer CCVs overlap with breast cells epigenetic marks. (c) Autoimmune CCVs overlap with breast cells epigenetic marks. (d) Breast cancer CCVs overlap with autoimmune-related epigenetic marks. (e) Autoimmune CCVs overlap with autoimmune-related epigenetic marks. (f) Significant ER-positive CCVs overlap with transcription factors binding sites. TFBSs found significant for ER-positive CCVs are highlighted in red (x axis labels). (g) Significant ER-negative CCVs overlap with transcription factors binding sites. (h) Significant ER-neutral CCVs overlap with transcription factors binding sites. Strong column: analysis with all CCVs at strong signals. ER-positive, ER-negative, ER-neutral: analysis of CCVs at strong signals stratified by phenotype. Logistic regression robust variance estimation for clustered observations, Wald test Χ2 p-values estimated using 67,136 ER-positive and 17,506 ER-negative cases, together with 88,937 controls.

Non-significant p-values are noted as dark grey. Significance defined as FDR 5%, which corresponds to the following P-value thresholds: Strong signals P-value = 1.66x10^-2; ER-positive P-value = 2.42x10^-2; ER-negative P-value = 3.02x10^-3; ER-neutral P-value = 1.76x10^-3.

Actively transcribed genes in normal breast and ER-positive cell lines (defined by H3K36me3 or H3K79me2 histone marks, Figure 3A). Enrichment was larger for ER-neutral CCVs than for those affecting either ER-positive or ER-negative tumors.

Gene regulatory regions. CCVs overlapped distal gene regulatory elements in ER-positive breast cancer cell lines (defined by H3K4me1 or H3K27ac marks, Figure 3B). This was confirmed using the ENCODE definition of active enhancers in MCF-7 cells (enhancer-like regions defined by combining DNase and H3K27ac marks), as well as the definitions of [28] and [27] (Supplementary Table 3). Under these more stringent definitions, enrichment among ER-positive CCVs was significantly larger than among ER-negative or ER-neutral CCVs. Data from [27] showed that 73% of active enhancer regions overlapped by ER-positive CCVs in ER-positive cells (MCF-7) are inactive in the normal HMEC breast cell line; thus, these enhancers appear to be MCF-7-specific. We also detected significant enrichment of CCVs in active promoters in ER-positive cells (defined by H3K4me3 marks in T-47D), although the evidence for this effect was weaker than for distal regulatory elements (defined by H3K27ac marks in MCF-7, Figure 3B). Only ER-positive CCVs were significantly enriched in T-47D active promoters. Conversely, CCVs were depleted among repressed gene-regulatory elements (defined by H3K27me3 marks) in normal breast (Figure 3B).

As a control, we performed similar analyses with autoimmune disease CCVs [29] (Methods) and relevant B and T cells (Figures 3B-E). The strongest evidence of enrichment of breast cancer CCVs was found at regulatory regions active in ER-positive cells (Figure 3B), whereas enrichment of autoimmune CCVs was in regulatory regions active in B and T cells (Figure 3E). We also compared the enrichment of our CCVs in enhancer-like and promoter-like regions (defined by ENCODE; Supplementary Figure 1B). The strongest evidence of enrichment of ER-positive CCVs in enhancer-like regions was found in MCF-7 cells, the only ER-positive cell line in ENCODE (Supplementary Figure 1B). These results highlight both the tissue- and disease-specificity of these histone-marked gene regulatory regions.

We observed significant enrichment of CCVs in the binding sites of 40 transcription factors (TFBSs) determined by ChIP-Seq (Figures 3F-H). The majority of the experiments were performed in ER-positive cell lines (90 TFBSs, 20 with data in ER-negative cell lines, 76 in ER-positive cell lines, and 16 in normal breast). These TFBSs overlap each other and histone marks of active regulatory regions (Supplementary Figure 2). Enrichment in five TFBSs (ESR1, FOXA1, GATA3, TCF7L2, E2F1) has been previously reported [2,30]. All 40 TFBSs were significantly enriched in ER-positive CCVs (Figure 3F), seven were also enriched in ER-negative CCVs and nine in ER-neutral CCVs (Figures 3G-H). ESR1, FOXA1, GATA3 and EP300 TFBSs were enriched in all CCV ER-subtypes. However, the enrichment for ESR1, FOXA1 or GATA3 was stronger for ER-positive CCVs than for ER-negative or ER-neutral.
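The enrichment comparisons in this section contrast how often CCVs and excluded variants overlap a given feature. As a simplified illustration only, the sketch below computes an odds ratio with a Wald 95% confidence interval from a 2x2 table. The study itself used logistic regression with robust variance estimation for clustered observations, which this sketch does not reproduce. The group totals (5,117 CCVs and 622,903 excluded variants) come from the text, but the split of each group by feature overlap is invented for the example.

```python
import math
from typing import Tuple


def enrichment_odds_ratio(ccv_in: int, ccv_out: int,
                          non_in: int, non_out: int) -> Tuple[float, float, float]:
    """Odds ratio (with Wald 95% CI) for CCVs overlapping a feature,
    relative to non-CCVs. All four counts must be non-zero."""
    odds_ratio = (ccv_in * non_out) / (ccv_out * non_in)
    # Standard error of the log odds ratio for a 2x2 table.
    se_log = math.sqrt(1 / ccv_in + 1 / ccv_out + 1 / non_in + 1 / non_out)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log)
    return odds_ratio, lower, upper


# Hypothetical overlap counts: 300 of 5,117 CCVs and 20,000 of 622,903
# excluded variants fall inside the feature (the splits are made up).
or_, lo, hi = enrichment_odds_ratio(300, 5117 - 300, 20000, 622903 - 20000)
print(f"OR={or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

An odds ratio above 1 with an interval excluding 1 would indicate enrichment; values below 1 correspond to depletion, as reported for heterochromatin.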

CCVs significantly overlap consensus transcription factor binding motifs

We investigated whether CCVs were also enriched within consensus transcription factor binding motifs by conducting a motif search within active regulatory regions (ER-positive CCVs at H3K4me1 marks in MCF-7). We identified 30 motifs, from eight transcription factor families, with enrichment in ER-positive CCVs (FDR 10%, Supplementary Table 4A) and a further five motifs depleted among ER-positive CCVs. To assess whether the motifs appeared more frequently than by chance at active regulatory regions overlapped by our ER-positive CCVs, we compared motif presence in a set of randomized control sequences (Methods). Thirteen of 30 motifs were more frequent at active regulatory regions with ER-positive CCV enrichment; these included seven homeodomain motifs and two forkhead factors (Supplementary Table 4B). When we looked at the change in predicted binding affinity, 57 ER-positive signals (86%) included at least one CCV predicted to modify the binding affinity of the enriched TFBSs (≥2-fold, Supplementary Table 4C). Forty-eight ER-positive signals (73%) had at least one CCV predicted to modify the binding affinity >10-fold. This analysis validates previous reports of breast cancer causal variants that alter DNA binding affinity for FOXA1 [3,30].

Bayesian fine-mapping incorporating functional annotations and linkage disequilibrium

As an alternative statistical approach for inferring likely causal variants, we applied PAINTOR [31] to the same 128 regions (Figure 1). In brief, PAINTOR integrates genetic association results, linkage disequilibrium (LD) structure, and enriched genomic features in an empirical Bayes framework and derives the posterior probability of each variant being causal, conditional on available data. To eliminate artifacts due to differences in genotyping and imputation across platforms, we restricted PAINTOR analyses to cases and controls typed using the OncoArray (61% of the total). We identified seven variants with high posterior probability (HPP ≥ 80%) of being causal for overall breast cancer and ten for the ER-positive subtype (Table 1); two of these had HPP > 80% for both ER-positive and overall breast cancer. These 15 HPP variants (HPPVs; ≥ 80%) were distributed across 13 regions. We also identified an additional 35 variants in 25 regions with HPP (≥ 50% and < 80%) for ER-positive, ER-negative, or overall breast cancer (Figure 2G). Consistent with the CCV analysis, we found evidence that most regions contained multiple HPPVs; the sum of posterior probabilities across all variants in a region (an estimate of the number of distinct causal variants in the region) was > 2.0 for 84/86 regions analyzed for overall breast cancer, with a maximum of 16.1 and a mean of 6.4. For ER-positive cancer, 46/47 regions had total posterior probability > 2.0 (maximum 18.3, mean 6.5) and for ER-negative, 17/23 regions had total posterior probability > 2.0 (maximum 9.1, mean 3.2). Although for many regions we were not able to identify HPP variants, we were able to reduce the proportion of variants needed to account for 80% of the total posterior probability in a region to under 5% for 65 regions for overall, 43 for ER-positive, and 18 for ER-negative breast cancer (Supplementary Figure 3A-C). PAINTOR analyses were also able to reduce the set of likely causal variants in many cases. 
After summing the posterior probabilities for CCVs in each of the overall breast cancer signals, 39/100 strong-evidence signals had a total posterior probability > 1.0. The number of CCVs in these signals ranged from 1 to 375 (median 24), but the number of variants needed to capture 95% of the total PP in each signal ranged from 1 to 115 (median 12), representing an average reduction of 43% in the number of variants needed to capture the signal. PAINTOR and CCV analyses were generally consistent, yet complementary. Only 3.3% of variants outside of the set of strong-signal CCVs for overall breast cancer had posterior probability > 1%, and only 48 (0.013%) of these had posterior probability > 30% (Supplementary Figure 3D). At ER-positive and ER-negative signals respectively, 3.1% and 1.6% of the non-CCVs at strong signals had posterior probability > 1%, and 40 (0.019%) and 3 (0.003%) of these had posterior probability > 30% (Supplementary Figures 3E-F). For the non-CCVs at strong-evidence signals with posterior probability > 30%, the relatively high posterior probability may be driven by the addition of functional annotation. Indeed, the incorporation of functional annotations more than doubled the posterior probability for 64/88 variants when compared to a PAINTOR model with no functional annotations.
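Two summaries used above can be made concrete with a short sketch: the sum of posterior probabilities as an estimate of the number of causal variants, and the smallest set of variants needed to capture a given fraction of the total posterior probability. The code does not call PAINTOR; it assumes posterior probabilities are already available, and the function names and values are hypothetical.

```python
from typing import Dict, List


def expected_causal_count(pp: Dict[str, float]) -> float:
    """Sum of posterior probabilities: an estimate of the number of
    distinct causal variants among the variants supplied."""
    return sum(pp.values())


def variants_to_capture(pp: Dict[str, float], fraction: float = 0.95) -> List[str]:
    """Smallest set of variants, taken in decreasing order of posterior
    probability, whose cumulative PP reaches `fraction` of the total."""
    target = fraction * sum(pp.values())
    chosen: List[str] = []
    running = 0.0
    for variant, p in sorted(pp.items(), key=lambda kv: kv[1], reverse=True):
        if running >= target:
            break
        chosen.append(variant)
        running += p
    return chosen


# Hypothetical posterior probabilities for the variants in one signal.
pp = {"v1": 0.84, "v2": 0.10, "v3": 0.04, "v4": 0.02}
print(expected_causal_count(pp), variants_to_capture(pp))
```

Here three of the four variants are enough to capture 95% of the total posterior probability, the kind of reduction the text reports at the signal level.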

CCVs co-localize with variants controlling local gene expression

We used four breast-specific expression quantitative trait loci (eQTL) data sets to identify a credible set of variants associated with differences in gene expression (eVariants): tumor tissue from the Nurses’ Health Study (NHS) [32] and The Cancer Genome Atlas (TCGA) [33], and normal breast tissue from the NHS and the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) [34]. We then examined the overlap of eVariants with CCVs (Methods); for each gene, eVariants were defined as those variants that had a p-value within two orders of magnitude of the variant most significantly associated with that gene’s expression. There was significant overlap of CCVs with eVariants from both the NHS normal and breast cancer tissue studies (normal breast OR = 2.70, p-value = 1.7×10^-5; tumor tissue OR = 2.34, p-value = 2.6×10^-4; Supplementary Table 3). ER-neutral CCVs overlapped with eVariants in normal tissue more frequently than did ER-positive and ER-negative CCVs (OR for ER-neutral = 3.51, p-value = 1.3×10^-5). Cancer risk CCVs overlapped credible eVariants in 128/205 (62%) signals in at least one of the datasets (Supplementary Table 5A-B). Sixteen additional variants with PP ≥ 30%, not included among the CCVs, also overlapped with a credible eVariant (Supplementary Table 5A-B).
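The eVariant definition and the overlap with CCVs can be sketched as follows. This is an illustration under stated assumptions, not the study's pipeline: the gene names, variant labels and p-values are hypothetical, and the eQTL p-values are assumed to be supplied per gene.

```python
from typing import Dict, Set


def credible_evariants(eqtl_pvalues: Dict[str, Dict[str, float]],
                       fold: float = 100.0) -> Dict[str, Set[str]]:
    """For each gene, keep variants whose eQTL p-value is within `fold`
    (two orders of magnitude) of the gene's most significant variant."""
    credible: Dict[str, Set[str]] = {}
    for gene, pvals in eqtl_pvalues.items():
        if not pvals:
            continue
        cutoff = min(pvals.values()) * fold
        credible[gene] = {v for v, p in pvals.items() if p <= cutoff}
    return credible


def genes_sharing_variants(ccvs: Set[str],
                           evariants: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Genes whose credible eVariants include at least one CCV,
    mapped to the shared variants."""
    return {gene: ccvs & ev for gene, ev in evariants.items() if ccvs & ev}


# Hypothetical eQTL p-values for two genes and one signal's CCVs.
eqtl = {
    "GENE_X": {"var_a": 1e-9, "var_b": 4e-8, "var_e": 1e-3},
    "GENE_Y": {"var_f": 1e-6, "var_g": 2e-6},
}
shared = genes_sharing_variants({"var_a", "var_b"}, credible_evariants(eqtl))
print(shared)
```

A signal counts as overlapping when at least one gene shares a variant with its CCV set, as GENE_X does in this toy input.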

Transcription factors and known somatic breast cancer drivers are overrepresented among prioritized target genes

We assumed that causal variants function by affecting the behavior of a local target gene. However, it is challenging to define target genes or to determine how they may be affected by the causal variant. Few potentially causal variants directly affect protein coding: we observed 67/5,375 CCVs, and 19/137 HPPVs (≥ 30%) in protein-coding regions. Of these, 33 (0.61%) were predicted to create a missense change, one a frameshift, and another a stop-gain, while 30 were synonymous (0.59%, Supplementary Table 5C). Four hundred and ninety-nine CCVs at 94 signals, and four additional HPPVs (≥ 30%), are predicted to create new splice sites or activate cryptic splice sites in 126 genes (Supplementary Table 5D). These results are consistent with previous observations that the majority of common susceptibility variants are regulatory. We applied an updated version of our pipeline INQUISIT (integrated expression quantitative trait and prediction of GWAS targets) [2] to prioritize potential target genes from 5,375 CCVs in strong signals and all 138 HPPVs (≥ 30%; Supplementary Table 2C). The pipeline predicted 1,204 target genes from 124/128 genomic regions examined. As a validation we examined the overlap between INQUISIT predictions and 278 established breast cancer driver genes [35-39]. Cancer driver genes were over-represented among high-confidence (Level 1) targets (a 5-fold increase over expected from CCVs and 15-fold from HPPVs; p-value = 1×10⁻⁶; Supplementary Figure 4A). Notably, thirteen cancer driver genes (ATAD2, CASP8, CCND1, CHEK2, ESR1, FGFR2, GATA3, MAP3K1, MYC, SETBP1, TBX3, XBP1 and ZFP36L1) were predicted from the HPPVs derived from PAINTOR. Cancer driver gene status was consequently included as an additional weighting factor in the INQUISIT pipeline. Transcription factor (TF) genes [40] were also enriched amongst high-confidence targets predicted from both CCVs (2-fold, p-value = 4.6×10⁻⁴) and HPPVs (2.5-fold, p-value = 1.8×10⁻², Supplementary Figure 4A).
In total, INQUISIT identified 191 target genes supported by strong evidence (Supplementary Table 6). Significantly more genes were targeted by multiple independent signals (N = 165) than expected by chance (p-value = 4.3×10⁻⁸, Supplementary Figure 4B, Figure 4). Six high-confidence predictions came only from HPPVs, although three of these (IGFBP5, POMGNT1 and WDYHV1) had been predicted at lower confidence from CCVs. Target genes included 20 that were prioritized via potential coding/splicing changes (Supplementary Table 7), ten via promoter variants (Supplementary Table 8), and 180 via distal regulatory variants (Supplementary Table 9). We illustrate genes prioritized via multiple lines of evidence in Figure 4A.

Figure 4

Predicted target genes are enriched in known breast cancer driver genes and transcription factors.

79 target genes that fulfil at least one of the following criteria: are targeted by more than one independent signal, are known driver genes, transcription factor genes, or their binding sites (ChIP-Seq BS) or consensus motif (TF Motif) are significantly overlapped by CCVs. *Genes with published functional follow up.

Three examples illustrate how INQUISIT uses genomic features to predict target genes. Based on capture Hi-C and ChIA-PET chromatin interaction data, NRIP1 is a predicted target of intergenic CCVs and HPPVs at chr21q21 (Supplementary Figure 5A). Multiple target genes were predicted at chr22q12, including the driver genes CHEK2 and XBP1 (Supplementary Figure 5B). A third example at chr12q24.31 is a more complicated scenario with two Level 1 targets: RPLP0 [41] and a modulator of mammary progenitor cell expansion, MSI1 [42] (Supplementary Figure 5C).

Target gene pathways include DNA integrity-checkpoint, apoptosis, developmental processes and the immune system

We performed pathway analysis to identify common processes using INQUISIT high-confidence target protein-coding genes (Figure 5A) and identified 488 Gene Ontology terms and 307 pathways at an FDR of 5% (Supplementary Table 10). These were grouped into 98 themes by common ancestor Gene Ontology terms, pathways, or transcription factor classes (Figure 5B). We found that 23% (14/60) of the ER-positive target genes were classified within developmental process pathways (including mammary development), 18% in immune system pathways and a further 17% in nuclear receptor pathways. Of genes targeted by ER-neutral signals, 21% (18/87) were classified in developmental process pathways, 19% in immune system pathways, and a further 18% in apoptotic processes. The top themes of genes targeted by ER-negative signals were DNA integrity checkpoint and immune system, each containing 19% (7/37) of genes, and apoptotic processes (16%).

Figure 5

Predicted target genes by phenotype and significantly enriched pathways.

(a) Venn diagram showing the associated phenotype (ER-positive, ER-negative, ER-neutral) for the Level 1 target genes, predicted by the CCVs and HPPVs. * ER-positive or ER-negative target genes also targeted by ER-neutral signals. (b) Heatmap showing clustering of pathway themes over-represented by INQUISIT Level 1 target genes. Color represents the relative number of genes per phenotype within enriched pathways, grouped by common themes. ER-positive, ER-negative, ER-neutral, and all phenotypes together (strong).

Novel pathways revealed by this study include TNF-related apoptosis-inducing ligand (TRAIL) signaling, the AP-2 transcription factors pathway, and regulation of IκB kinase/NF-κB signaling. Of note, the last of these is specifically overrepresented among ER-negative target genes. We also found significant overrepresentation of additional carcinogenesis-linked pathways, including cAMP, NOTCH, PI3K, RAS and WNT/Beta-catenin, and of receptor tyrosine kinase signaling, including FGFR, EGFR and TGFBR [43-47]. Finally, our target genes are also significantly overrepresented in DNA damage checkpoint and DNA repair pathways, as well as in programmed cell death pathways such as apoptotic process, regulated necrosis, and death receptor signaling-related pathways.

Discussion

We have performed multiple, complementary analyses on 150 breast cancer-associated regions, originally found by GWAS, and identified 362 independent risk signals, 205 of these with high confidence (p-value < 10⁻⁶). The inclusion of these new variants increases the explained proportion of familial risk by 6% when compared to that explained by the lead signals alone. We observed that most regions contain multiple independent signals, with the greatest number (nine) in the region surrounding ESR1 and its co-regulated genes, and on 2q35, where IGFBP5 appears to be a key target. We have used two complementary approaches to identify likely causal variants within each region: the more traditional multinomial regression approach, and a Bayesian approach, PAINTOR, which integrated genetic associations, LD and informative genomic features. PAINTOR provided complementary evidence supporting most associations found by multinomial regression, and also identified additional variants. Specifically, the Bayesian method highlighted 15 variants that are highly likely to be causal (HPP ≥ 80%). From these approaches we have identified a single variant, likely to be causal, at each of 34 signals (Table 1). Of these, only rs16991615 (MCM8 NP_115874.3:p.E341K) and rs7153397 (CCDC88C NM_001080414.2:c.5058+1342G>A, a cryptic splice-donor site) were predicted to affect protein-coding sequences. However, in other signals we also identified four coding changes previously recognized as deleterious, including the stop-gain rs11571833 (BRCA2 NP_000050.2:p.K3326*, Meeks et al., 2016)[48] and two CHEK2 coding variants: the frameshift rs555607708 [49,50] and a missense variant, rs17879961 [51,52].
In addition, a splicing variant, rs10069690, in TERT results in the truncated protein INS1b [19], decreased telomerase activity, telomere shortening, and increased DNA damage response [53]. Having identified potential causal variants within each signal, we aimed to uncover their functions at the DNA level and to predict their target gene(s). Looking across all 150 regions, a notable feature is that many likely causal variants implicated in ER-positive cancer risk lie in gene-regulatory regions marked as open and active in ER-positive breast cells, but not in other cell types. Moreover, a significant proportion of potential causal variants overlap the binding sites for transcription factor proteins (n=40 from ChIP-Seq) and co-regulators (n=64 with addition of computationally derived motifs). Furthermore, nine proteins also appear in the list of high-confidence target genes, hence the following genes and their products have been implicated by two different approaches: CREBBP, EP300, ESR1, FOXI1, GATA3, MEF2B, MYC, NRIP1 and TCF7L2. Most proteins encoded by these genes already have established roles in estrogen signaling. CREBBP, EP300, ESR1, GATA3, and MYC are also known cancer driver genes that are frequently somatically mutated in breast tumors. In contrast to ER-positive signals, we identified fewer genomic features enriched in ER-negative signals. This may reflect the common molecular mechanisms underlying their development, but the power of this study was limited, despite including as many patients with ER-negative tumors as possible from the BCAC and CIMBA consortia. Less than 20% of genomic signals confer a greater risk of ER-negative cancer and there is little publicly available ChIP-Seq data on ER-negative breast cancer cell lines. The heterogeneity of ER-negative tumors may also have limited our power. Nevertheless, we have identified 35 target genes for ER-negative likely causal variants.
Some of these, including CASP8 [54] and MDM4 [55], already had functional evidence supporting their role. Most targets, however, currently have no reported function in ER-negative breast cancer development. Finally, we examined the gene-ontology pathways in which target genes most often lie. Of note, 14% (25/180) of all high-confidence target genes and 19% of ER-negative target predictions are in immune system pathways. Among the significantly enriched pathways were T cell activation, interleukin signaling, Toll-like receptor cascades, and I-κB kinase/NF-κB signaling, as well as processes leading to activation and perpetuation of the innate immune system. The link between immunity, inflammation and tumorigenesis has been extensively studied [56], although not primarily in the context of susceptibility. Five ER-negative high-confidence target genes (ALK, CASP8, CFLAR, ESR1, TNFSF10) lie in the I-κB kinase/NF-κB signaling pathway. Interestingly, ER-negative cells have high levels of NF-κB activity when compared to ER-positive cells [57]. A recent expression–methylation analysis on breast cancer tumor tissue also identified clusters of genes correlated with DNA methylation levels, one enriched in ER signaling genes, and a second in immune pathway genes [58]. These analyses provide strong evidence for more than 200 independent breast cancer risk signals, identify the plausible cancer variants and define likely target genes for the majority of these. However, notwithstanding the enrichment of certain pathways and transcription factors, the biological basis underlying most of these signals remains poorly understood. Our analyses provide a rational basis for future studies into the biology underlying breast cancer susceptibility.

Methods

Study samples

Epidemiological data for European women were obtained from 75 breast cancer case-control studies participating in the Breast Cancer Association Consortium (BCAC) (cases: 40,285 iCOGS, 69,615 OncoArray; cases with ER status available: 29,561 iCOGS, 55,081 OncoArray; controls: 38,058 iCOGS, 50,879 OncoArray). Details of the participating studies, genotype calling and quality control are given in [2,22,23], respectively. Epidemiological data for BRCA1 mutation carriers were obtained from 60 studies providing data to the Consortium of Investigators of Modifiers of BRCA1 and BRCA2 (CIMBA) (affected: 1,591 iCOGS, 7,772 OncoArray; unaffected: 1,665 iCOGS, 7,780 OncoArray). This dataset has been described in detail previously [1,59,60]. All studies provided samples of European ancestry. Any non-European samples were excluded from analyses.

Variant selection and genotyping

Similar approaches were used to select variants for inclusion on the iCOGS and OncoArray, which are described in detail elsewhere [2,21]. Both arrays included dense coverage of variants across known susceptibility regions (at the time of their design), with sparser coverage of the rest of the genome. Twenty-one known susceptibility regions were selected for dense genotyping using iCOGS and 73 regions using the OncoArray: the regions were 1 Mb intervals centred on the published lead GWAS hit (combined into larger intervals where these overlapped). For iCOGS, all known variants from the March 2010 release of the 1000 Genomes Project with MAF > 0.02 in Europeans were identified, and all those correlated with the published GWAS variants at r² > 0.1, together with a set of variants designed to tag all remaining variants at r² > 0.9, were selected to be included in the array (http://ccge.medschl.cam.ac.uk/files/2014/03/iCOGS_detailed_lists_ALL1.pdf). For OncoArray, all designable variants correlated with the known hits at r² > 0.6, plus all variants from lists of potentially functional variants on RegulomeDB, and a set of variants designed to tag all remaining variants at r² > 0.9 were selected. In total, across the 152 regions considered here, 26,978 iCOGS and 58,339 OncoArray genotyped variants passed QC criteria. We imputed genotypes for all remaining variants using IMPUTE2 [61] and the October 2014 release of the 1000 Genomes Project as a reference. Imputation was conducted independently in the iCOGS and OncoArray subsets. To improve accuracy at low-frequency variants, we used the standard IMPUTE2 MCMC algorithm for follow-up imputation, without pre-phasing of the genotypes, and increased both the buffer regions and the number of haplotypes used as templates (a more detailed description of the parameters used can be found in [21]).
We thus genotyped or successfully imputed 639,118 variants (all with imputation info score ≥ 0.3 and minor allele frequency (MAF) ≥ 0.001 in both iCOGS and OncoArray datasets). Imputation summaries, and coverage for each of the analyzed regions stratified by allele frequency can be found in Supplementary Table 1B.

BCAC Statistical analyses

Per-allele odds ratios (OR) and standard errors (SE) were estimated for each variant using logistic regression. We ran this analysis separately for iCOGS and OncoArray, and for overall, ER-positive and ER-negative breast cancer. The association between each variant and breast cancer risk was adjusted by study (iCOGS) or country (OncoArray), and eight (iCOGS) or ten (OncoArray) ancestry-informative principal components. The statistical significance for each variant was derived using a Wald test.
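The Wald test mentioned above can be sketched in a few lines of Python. This is only an illustration under the assumption that the logistic regression returns a per-allele log odds ratio (`beta`) and its standard error (`se`); the function names and the example numbers are hypothetical and do not come from the study.

```python
import math


def wald_test(beta, se):
    """Two-sided Wald test for a log odds ratio with standard error `se`."""
    z = beta / se
    # For a standard normal statistic, the two-sided p-value 2 * (1 - Phi(|z|))
    # equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


def odds_ratio_ci(beta, se, z_crit=1.959964):
    """Per-allele odds ratio with an approximate 95% confidence interval."""
    return (math.exp(beta),
            math.exp(beta - z_crit * se),
            math.exp(beta + z_crit * se))


# Made-up estimate: log(OR) = 0.10 with SE = 0.02.
z, p = wald_test(0.10, 0.02)
or_, lower, upper = odds_ratio_ci(0.10, 0.02)
print(f"z = {z:.2f}, p = {p:.2e}, OR = {or_:.3f} ({lower:.3f}-{upper:.3f})")
```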

Defining appropriate significance thresholds for association signals

To establish an appropriate significance threshold for independent signals, all variants evaluated in the meta-analysis were included in logistic forward selection regression analyses for overall breast cancer risk in iCOGS, run independently for each region. We evaluated five p-value thresholds for inclusion: < 1×10⁻⁴, < 1×10⁻⁵, < 1×10⁻⁶, < 1×10⁻⁷, and < 1×10⁻⁸. The most parsimonious iCOGS models were tested in OncoArray, and the false discovery rate (FDR) at 1% level for each threshold estimated using the Benjamini-Hochberg procedure. At a 1% FDR threshold: 72% of associations, significant at p < 10⁻⁴, were replicated on iCOGS and 94% of associations, significant at p < 10⁻⁶, were replicated on OncoArray. Based on these results, two categories were defined: strong-evidence signals (conditional p-values < 10⁻⁶ in the final model), and moderate-evidence signals (conditional p-values < 10⁻⁴ and ≥ 10⁻⁶ in the final model).
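For readers unfamiliar with the Benjamini-Hochberg procedure, a minimal Python sketch of its step-up rule follows. It is a generic illustration, not the code used in the study; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.01):
    """Return one boolean per test: True if it is declared significant at `fdr`."""
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= fdr * k / m.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= fdr * rank / m:
            max_rank = rank
    # Step-up rule: every test ranked at or below max_rank is rejected.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Made-up replication p-values for five candidate signals.
replication_p = [0.001, 0.008, 0.039, 0.041, 0.6]
print(benjamini_hochberg(replication_p, fdr=0.05))
```

Note that the rule is "step-up": a test whose own p-value misses its rank-specific threshold is still rejected if a test with a larger p-value passes its threshold.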

Identification of independent signals

To identify independent signals, we ran multinomial stepwise regression analyses, separately in iCOGS and OncoArray, for all variants displaying evidence of association (N = 202,749 variants). We selected two sets of well-imputed variants (imputation info score ≥ 0.3 in both iCOGS and OncoArray): (a) common and low-frequency variants (MAF ≥ 0.01) with logistic regression p-value inclusion threshold ≤ 0.05 in either the iCOGS or OncoArray datasets for at least one of the three phenotypes: overall, ER-positive and ER-negative breast cancer; and (b) rarer variants (MAF ≥ 0.001 and < 0.01), with logistic regression inclusion p-value ≤ 0.0001. The same parameters used for adjustment in logistic regression were used in the multinomial regression analysis (R function multinom). The multinomial regression estimates were combined using a fixed-effects meta-analysis weighted by the inverse variance. Variants with the lowest conditional p-value from the meta-analysis of both European cohorts at each step were included into the multinomial regression model. However, if the new variant to be included in the model caused collinearity problems due to high correlation with an already selected variant, or showed high heterogeneity (p-value < 10⁻⁴) between iCOGS and OncoArray after being conditioned on the variant(s) in the model, we dropped the new variant and repeated this process. At 105 of 152 evaluated regions the main signal demonstrated genome-wide significance, while 44 were marginally significant (9.89×10⁻⁵ ≥ p-value > 5×10⁻⁸). For two regions there were no variants significant at p < 10⁻⁴ (chr14:104712261-105712261, rs10623258, multinomial regression p-value = 2.32×10⁻⁴; chr19:10923703-11923703, rs322144, multinomial regression p-value = 3.90×10⁻³).
Four main differences in the datasets used here and in the previous paper may account for this: (i) our previous paper [2] included data from 11 additional GWAS (14,910 cases and 17,588 controls) that have not been included in the present analysis in order to minimize differences in array coverage, and because ER-status data were substantially incomplete and individual-level data were not available for all GWAS; (ii) the present analysis was based on estimating separate risks for ER-positive and ER-negative disease, whereas in our previous paper the outcome was overall breast cancer risk, and ER status was available for only 73% of the iCOGS and 79% of the OncoArray breast cancer cases; (iii) for the set of samples genotyped with both arrays, [2] used the iCOGS genotypes, while this study includes OncoArray genotypes to maximize the number of samples genotyped with a larger coverage; and (iv) the imputation procedure was modified (in particular using one-step imputation without pre-phasing) to improve the imputation accuracy of less frequent variants. We used a forward stepwise approach to define the number of independent signals within each associated genomic region. We first identified the index variant of the main signal in the region, and then ran multinomial logistic regression for all other variants, adjusted by the index variant, to identify additional variants that remained independently significant within the model. We repeated this process, adjusting for identified index variants, until no more additional variants could be added. In this way we found between 1 and 11 independent signals within the 150 regions containing a genome-wide significant main signal.
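The inverse-variance-weighted fixed-effects combination of the iCOGS and OncoArray estimates described above can be sketched as follows. This is a minimal illustration for a single coefficient; the heterogeneity check uses Cochran's Q for two studies, which is an assumption, since the text does not name the heterogeneity test. Function names and numbers are hypothetical.

```python
import math


def fixed_effects_meta(betas, ses):
    """Inverse-variance-weighted fixed-effects estimate and its standard error."""
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    return beta, math.sqrt(1.0 / total_weight)


def heterogeneity_p_two_studies(betas, ses):
    """P-value of Cochran's Q for exactly two studies (chi-square, 1 df)."""
    if len(betas) != 2:
        raise ValueError("this shortcut is only valid for two studies")
    beta, _ = fixed_effects_meta(betas, ses)
    q = sum(((b - beta) / se) ** 2 for b, se in zip(betas, ses))
    # Survival function of a chi-square with 1 df: erfc(sqrt(q / 2)).
    return math.erfc(math.sqrt(q / 2.0))


# Made-up per-allele log odds ratios and standard errors from two arrays.
combined_beta, combined_se = fixed_effects_meta([0.10, 0.20], [0.05, 0.05])
het_p = heterogeneity_p_two_studies([0.10, 0.20], [0.05, 0.05])
print(combined_beta, combined_se, het_p)
```

Under this sketch, a variant would be dropped from the stepwise procedure when `het_p` falls below the 10⁻⁴ threshold quoted in the text.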

Selection of a set of credible causal variants (CCVs)

For each independently associated signal, we first defined credible candidate variants (CCVs), likely to drive its association, as those variants with p-values within two orders of magnitude of the most significant variant for that signal, after adjusting for the index variant of other signals within that region (as identified in the forward stepwise regression above, Supplementary Figure 6A)[24]. For each region, we then attempted to obtain the best fitting model by successively fitting models in which the index variant for each signal was replaced by other CCVs for that signal, adjusting for the index variants for the other signals (Supplementary Figure 6B). Where a model with a higher chi-square was obtained, the index variant was replaced by the CCV in the best model (Supplementary Figure 6C-D). This process was repeated until the model (i.e. the set of index variants) did not change further (Supplementary Figure 6G). This procedure was performed first for the set of strong signals (i.e. considering models including only the strong signals). Once a final model had been obtained for the strong signals, the index variants for the strong signals were considered fixed and the process was repeated for all signals, allowing the index variants for the weak signals (but not the strong signals) to vary. Using this procedure we could define the best model for 140/150 regions, but for ten regions this approach did not converge (chr4:175328036-176346426, chr5:55531884-56587883, chr6:151418856-152937016, chr8:75730301-76917937, chr10:80341148-81387721, chr10:122593901-123849324, chr12:115336522-116336522, chr14:36632769-37635752, chr16:3606788-4606788, chr22:38068833-39859355). For these ten regions, we defined the best model, from among all possible combinations of credible variants, as that with the largest chi-square value. Finally, we redefined the set of CCVs for each signal using the conditional p-values, after adjusting for the revised set of index variants.
Again, for the strong signals we conditioned on the index variants for the other strong signals, while for the weak signals we conditioned on the index variants for all other signals.
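The "within two orders of magnitude" rule used to define CCVs can be illustrated with a short Python sketch. It assumes the conditional p-values for one signal are already available, and it treats a variant exactly 100-fold less significant than the index variant as included, which is an assumption about the boundary. The variant labels and p-values are made up.

```python
import math


def credible_variants(conditional_p, orders_of_magnitude=2.0):
    """Select variants whose p-value is within the given number of orders of
    magnitude of the smallest conditional p-value for the signal."""
    # Work on the log10 scale so that very small p-values are compared stably.
    log_p = {variant: math.log10(p) for variant, p in conditional_p.items()}
    best = min(log_p.values())
    return sorted(v for v, lp in log_p.items() if lp - best <= orders_of_magnitude)


# Hypothetical conditional p-values for four variants in one signal.
signal_p = {"var1": 3e-12, "var2": 1e-10, "var3": 5e-10, "var4": 2e-7}
ccvs = credible_variants(signal_p)
print(ccvs)  # var1 is the most significant; var2 is about 33-fold less significant
```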

Case-only analysis

Differences in the effect size between ER-positive and ER-negative disease for each independent index variant were assessed using a case-only analysis. We performed logistic regression with ER status as the dependent variable, and the lead variant at each strong signal in the fine-mapping region as the independent variable. We used an FDR of 5% to adjust for multiple testing.

OncoArray-only stepwise analysis

To evaluate whether the lower coverage in iCOGS could affect the identification of independent signals, we ran stepwise multinomial regression using only the OncoArray dataset. We identified 249 independent signals. Ninety-two signals, in 67 fine-mapping regions, achieved a genome-wide significance level (conditional p-value < 5×10⁻⁸). Two hundred and five of these signals were also identified in the meta-analysis with iCOGS. Nine independent variants across ten regions were not evaluated in the combined analysis due to their low imputation info score in iCOGS. Out of these nine signals, two signals would be classified as main primary signals: rs114709821 at region chr1:145144984-146144984 (OncoArray imputation info score = 0.72), and rs540848673 at region chr1:149406413-150420734 (OncoArray imputation info score = 0.33). Given the low number of additional signals identified in the OncoArray dataset alone, all analyses were based on the combined iCOGS/OncoArray dataset.

CIMBA statistical analysis

CIMBA provided data from 60 retrospective cohort studies consisting of 9,445 unaffected and 9,363 affected female BRCA1 mutation carriers of European ancestry. Unconditional (i.e. single variant) analyses were performed using a score test based on the retrospective likelihood of observing the genotype conditional on the disease phenotype [62,63]. Conditional analyses, where more than one variant is analyzed simultaneously, cannot be performed in this score test framework. Therefore, conditional analyses were performed by Cox regression, allowing for adjustment for the conditionally independent variants identified by the BCAC/DRIVE analyses. All models were stratified by country and birth cohort, and adjusted for relatedness (unconditional models used kinship-adjusted standard errors based on the estimated kinship matrix; conditional models used cluster-robust standard errors based on phenotypic family data). Data from the iCOGS array and the OncoArray were analyzed separately and combined to give an overall BRCA1 association by fixed-effects meta-analysis. Variants were excluded from further analyses if they exhibited evidence of heterogeneity (heterogeneity p-value < 1×10⁻⁴) between iCOGS and OncoArray, had MAF < 0.005, were poorly imputed (imputation info score < 0.3) or were imputed to iCOGS only (i.e. must have been imputed to OncoArray or iCOGS and OncoArray).

Meta-analysis of ER-negative cases in BCAC with BRCA1 mutation carriers from CIMBA

BRCA1 mutation carrier association results were combined with the BCAC multinomial regression ER-negative association results in a fixed-effects meta-analysis. Variants considered for analysis must have passed all prior QC steps and have had MAF ≥ 0.005. All meta-analyses were performed using the METAL software [64]. Instances where spurious associations might occur were investigated by assessing the LD between a possible spurious association and the conditionally independent variants. High LD between a variant and a conditionally independent variant within its region causes model instability through collinearity, and the convergence of the model likelihood maximization may not be reliable. Where the association appeared to be driven by collinearity, the signals were excluded.

Heritability Estimation

To estimate the frailty-scale heritability due to all fine-mapping signals, we used the formula: here where p is a vector of allele frequencies, γ are the estimated per-allele odds ratios and τ the corresponding standard errors, and R is the correlation matrix of genotype frequencies. To adjust for the overestimation resulting from only including signals passing a given significance threshold, we adapted the approach of [65], based on maximizing the likelihood conditional on the test statistic passing the relevant threshold. Since our analyses were based on estimating ER-negative and ER-positive odds ratios simultaneously, the method needed to be adapted to maximise a conditional bivariate normal likelihood. Following [65] we then estimated mean square error estimates based on a weighted mean of the maximum likelihood estimates and the naïve estimates, which they show to be close to unbiased in the 1df case. The estimated effect sizes for overall breast cancer were computed as a weighted mean of the ER-negative and ER-positive estimates, based on the proportions of each subtype in the whole study (weights 0.21 and 0.79). The results were then expressed in terms of the proportion of the familial relative risk (FRR) of breast cancer to first-degree relatives of affected women, using the formula h² / (2 log λ), where the FRR λ was assumed to be 2 [2].
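The last two steps above (the weighted mean of subtype-specific estimates and the conversion of heritability to a proportion of the FRR) are simple enough to sketch in Python. The sketch assumes that effect sizes are combined on the log odds ratio scale and that "log" denotes the natural logarithm; neither is stated explicitly in the text. The heritability value in the example is made up and is not a result of the study.

```python
import math


def overall_effect(beta_er_neg, beta_er_pos, w_neg=0.21, w_pos=0.79):
    """Weighted mean of ER-negative and ER-positive effect estimates, using the
    subtype proportions quoted in the text as weights."""
    return w_neg * beta_er_neg + w_pos * beta_er_pos


def proportion_of_frr(h2, frr=2.0):
    """Proportion of the familial relative risk explained: h2 / (2 * log(lambda))."""
    return h2 / (2.0 * math.log(frr))


# Made-up inputs for illustration only.
beta_overall = overall_effect(beta_er_neg=0.10, beta_er_pos=0.20)
explained = proportion_of_frr(h2=0.25)
print(f"overall effect = {beta_overall:.3f}, proportion of FRR = {explained:.3f}")
```

With λ = 2 the denominator is 2 log 2 ≈ 1.386, so a frailty-scale heritability of that size would correspond to the whole of the familial relative risk.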

eQTL analysis

Total RNA was extracted from normal breast tissue in formalin-fixed paraffin-embedded breast cancer tissue blocks from 264 Nurses’ Health Study (NHS) participants [32]. Transcript expression levels were measured using the Glue Grant Human Transcriptome Array version 3.0 at the Molecular Biology Core Facilities, Dana-Farber Cancer Institute. Gene expression was normalized and summarized into log2 values using RMA (Affymetrix Power Tools v1.18.0); quality control was performed using GlueQC and arrayQualityMetrics v3.24.0. Genome-wide data on variants were generated using the Illumina HumanHap 550 BeadChip as part of the Cancer Genetic Markers of Susceptibility initiative [66]. Imputation to the 1000 Genomes Project Phase 3 v5 ALL reference panel was performed using MACH to pre-phase measured genotypes and minimac to impute. Expression analyses were performed using data from The Cancer Genome Atlas (TCGA) and Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) projects [34,38]. The TCGA eQTL analysis was based on 458 breast tumors that had matched gene expression, copy number and methylation profiles together with the corresponding germline genotypes available. All 458 individuals were of European ancestry as ascertained using the genotype data and the Local Ancestry in admixed Populations (LAMP) software package (LAMP estimate cut-off > 95% European)[67]. Germline genotypes were imputed into the 1000 Genomes Project reference panel (October 2014 release) using IMPUTE version 2 [68,69]. Gene expression had been measured on the Illumina HiSeq 2000 RNA-Seq platform (gene-level RSEM normalized counts [70]), copy-number estimates were derived from the Affymetrix SNP 6.0 array (somatic copy-number alteration minus germline copy-number variation called using the GISTIC2 algorithm [71]), and methylation beta values were measured on the Illumina Infinium HumanMethylation450.
Expression QTL analysis focused on all variants within each of the 152 genomic intervals that had been subjected to fine-mapping for their association with breast cancer susceptibility. Each of these variants was evaluated for its association with the expression of every gene within 2 Mb that had been profiled for each of the three data types. The effects of tumor copy number and methylation on gene expression were first regressed out using a method described previously [72]. eQTL analysis was performed by linear regression, with residual gene expression as outcome, germline SNP genotype dosage as the covariate of interest and ESR1 expression and age as additional covariates, using the R package Matrix eQTL [73]. The METABRIC eQTL analysis was based on 138 normal breast tissue samples resected from breast cancer patients of European ancestry. Germline genotyping for the METABRIC study was also done on the Affymetrix SNP 6.0 array, and gene expression in the METABRIC study was measured using the Illumina HT12 microarray platform (probe-level estimates). No adjustment was implemented for somatic copy number and methylation status since we were evaluating eQTLs in normal breast tissue. All other steps were identical to the TCGA eQTL analysis described above.

Genomic feature enrichment

We explored the overlap of CCVs and excluded variants with 90 transcription factors, 10 histone marks, and DNase hypersensitivity sites in in 15 breast cell lines, and eight normal human breast tissues. We analysed data from the Encyclopedia of DNA Elements (ENCODE) Project [74,75], Roadmap Epigenomics Projects [76], the International Human Epigenome Consortium [77, 27], Pellacani et al. [78], The Cancer Genome Atlas (TCGA) [33], the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) [34], ReMap database (We included 241 TF annotations from ReMap (of 2825 total) which showed at least 2% overlap for any of the phenotype SNP sets) [79], and other data obtained through the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO). Promoters were defined following the procedure defined in [78], that is +/- 2Kb from a gene transcription start site, using an updated version of the RefSeq genes (refGene, version updated 2017-04-11)[80]. Transcribed regions were defined using the same version of refSeq genes. lncRNA annotation was obtained from Gencode (v19)[81] To include eQTL results in the enrichment analysis we (i) identified all the genes for which summary statistics were available; (ii) defined the most significant eQTL variant for each gene (index eQTL variant, p-value threshold ≤ 5×10-4); (iii) classified variants with p-values within two orders of magnitude of the index eVariant as the credible set of eQTL variants; ie. the best candidates to drive expression of the gene. Variants within at least one eQTL credible set were defined as eVariants. We evaluated the overlap between eQTL credible sets and CCVs (risk variants credible set). We evaluated the enrichment of CCVs for genomic feature using logistic regression, with CCV (vs non-CCV variants) being the outcome. 
To adjust for the correlation among variants in the same fine-mapping region, we used robust variance estimation for clustered observations (R package multiwayvcov). The associated variants at FDR 5% were included in a stepwise forward logistic regression procedure to select the most parsimonious model. A likelihood ratio test was used to compare multinomial logistic regression models with and without equality effect constraints to evaluate whether there was heterogeneity among the effect sizes for ER-positive signals, ER-negative signals or signals equally associated with both phenotypes (ER-neutral). To validate the disease specificity of the regulatory regions identified through this analysis we followed the same approach for the autoimmune-related CCVs from [29] (N = 4,192). Variants excluded as candidate causal variants and located within 500 kb upstream or downstream of the index variant for each signal were classified as excluded variants (N = 1,686,484). We then tested the enrichment of both the breast cancer and autoimmune CCVs in breast, T cell and B cell enhancers. We also evaluated the overlap of our CCVs with ENCODE enhancer-like and promoter-like regions for 111 tissues, primary cells, immortalized cell lines, and in vitro differentiated cells. Of these, 73 had available data for both enhancer- and promoter-like regions.
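
The eQTL credible-set rule in steps (ii) and (iii) above can be sketched as follows. The sketch assumes that "within two orders of magnitude" means a p-value no larger than 100 times the index p-value; the function name and the dictionary input (variant identifier mapped to p-value for one gene) are hypothetical.

```python
from typing import Dict, List


def eqtl_credible_set(pvalues: Dict[str, float],
                      index_threshold: float = 5e-4,
                      orders_of_magnitude: float = 2.0) -> List[str]:
    """Return the eQTL credible set for one gene, or [] if no index variant qualifies."""
    if not pvalues:
        return []
    # Step (ii): the index eQTL variant is the most significant variant for the gene.
    index_variant = min(pvalues, key=pvalues.get)
    index_p = pvalues[index_variant]
    if index_p > index_threshold:
        return []
    # Step (iii): keep variants whose p-value is within the stated range of the index
    # (assumed here to mean p <= index_p * 10**orders_of_magnitude).
    cutoff = index_p * 10 ** orders_of_magnitude
    return sorted(v for v, p in pvalues.items() if p <= cutoff)
```

Under this reading, a variant in the credible set of at least one gene would count as an eVariant.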

Transcription binding site motif analysis

We conducted a search to find motif occurrences for the transcription factors significantly enriched in the genomic features analysis. For this we used two publicly available databases, Factorbook [82] and JASPAR 2016 [83]. For the search using Factorbook we included the motifs for the transcription factors discovered in the cell lines where a significant enrichment was found in our genomic features analysis. We also searched for all the available motifs for Homo sapiens in the JASPAR database (JASPAR CORE 2016, TFBSTools [84]). Using the UCSC sequence (BSgenome.Hsapiens.UCSC.hg19) as reference, we created fasta sequences with the reference and alternative alleles for all the variants included in our analysis plus 20 bp flanking each variant. We used FIMO (version 4.11.2, Grant et al., 2011) [85] to scan all the fasta sequences for the JASPAR and Factorbook motifs and identify any overlap with either allele of each variant (setting the p-value threshold to 10⁻³). We subsequently determined whether our CCVs overlapped a particular TF binding motif more frequently than the excluded variants. We ran these analyses for all the strong signals, and also for strong signals stratified by ER status. We also restricted this analysis to the variants located in regulatory regions in an ER-positive cell line (MCF-7, marked by H3K4me1, ENCODE id: ENCFF674BKS) and evaluated whether the ER-positive CCVs overlapped any of the motifs more frequently than the excluded variants. We also evaluated the change in total binding affinity caused by the ER-positive CCV alternative allele for all but one (2:217955891:T::0) of the ER-positive CCVs (MatrixRider [86]). Subsequently, we evaluated whether the MCF-7 regions demarcated by H3K4me1 (ENCODE id: ENCFF674BKS), and overlapped by ER-positive CCVs, were enriched in known TFBS motifs. 
We first subset the ENCODE bed file ENCFF674BKS to identify MCF-7 H3K4me1 peaks overlapped by the ER-positive CCVs (N = 107), as well as peaks only overlapped by excluded variants (N = 11,099), using BEDTools [87]. We created fasta format sequences using genomic coordinate data from the intersected bed files. To create a control sequence set, we used the script included with the MEME Suite (fasta-shuffle-letters) to create 10 shuffled copies of each sequence overlapped by ER-positive CCVs (N = 1,070). We then used AME [88] to interrogate whether the 107 MCF-7 H3K4me1 genomic regions overlapped by ER-positive CCVs were enriched in known TFBS consensus motifs when compared to the shuffled control sequences, or to the MCF-7 H3K4me1 genomic regions overlapped only by excluded variants. We used the command line version of AME (version 4.12.0), selecting as scoring method the total number of positions in the sequence whose motif score p-value is less than 10⁻³, and using a one-tailed Fisher’s exact test as the association test.
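
The construction of reference-allele and alternative-allele sequences for motif scanning, described at the start of this section, can be sketched as below. This is an illustration only: it assumes 20 bp on each side of the variant, a 1-based position within an in-memory reference string, and hypothetical function names. It does not reproduce the BSgenome-based workflow.

```python
from typing import Tuple


def allele_sequences(reference: str, pos: int, ref: str, alt: str,
                     flank: int = 20) -> Tuple[str, str]:
    """Return (ref_seq, alt_seq): each allele with `flank` bases on either side.

    `pos` is the 1-based position of the first reference-allele base.
    """
    start = pos - 1
    end = start + len(ref)
    if reference[start:end].upper() != ref.upper():
        raise ValueError("reference allele does not match the reference sequence")
    left = reference[max(0, start - flank):start]
    right = reference[end:end + flank]
    return left + ref + right, left + alt + right


def to_fasta(variant_id: str, ref_seq: str, alt_seq: str) -> str:
    """Format both allele sequences as two FASTA records."""
    return f">{variant_id}_ref\n{ref_seq}\n>{variant_id}_alt\n{alt_seq}\n"
```

Scanning both records of a variant lets a motif match be attributed to either allele, which is what the overlap comparison between CCVs and excluded variants relies on.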

PAINTOR analysis

To further refine the set of CCVs, we performed empirical Bayes fine-mapping using PAINTOR to integrate marginal genetic association summary statistics, linkage disequilibrium patterns, and biological features [31,89]. PAINTOR jointly derives the posterior probability of causality for all variants along with the respective contribution of genomic features, in order to maximize the log likelihood of the data across all regions. PAINTOR does not assume a fixed number of causal variants in each region, although it implicitly penalizes non-parsimonious causal models. We applied PAINTOR separately to association results for overall breast cancer (in 85 regions determined to have at least one ER-neutral association or ER-positive and ER-negative association), ER-positive breast cancer (in 48 regions determined to have at least one ER-positive-specific association), and ER-negative breast cancer (in 22 regions determined to have at least one ER-negative-specific association). To avoid artifacts due to mismatches between the LD in study samples and the LD matrix supplied to PAINTOR, we used logistic regression association summary statistics from OncoArray data only and estimated the LD structure in the OncoArray sample. For each endpoint we fit four models with increasing numbers of genomic features selected from the stepwise enrichment analyses described above: Model 0 (with no genomic features, which assumes each variant is equally likely to be causal a priori); Model 1 (with those genomic features selected with stopping rule p<0.001); Model 2 (with those genomic features selected with stopping rule p<0.01); and Model 3 (with those genomic features selected with stopping rule p<0.05). We used the Bayesian Information Criterion (BIC) to choose the best-fitting model for each outcome. 
As PAINTOR estimates the marginal log likelihood of the observed Z scores using Gibbs sampling, we used a shrunk mean BIC across multiple Gibbs chains to account for the stochasticity in the log-likelihood estimates. We ran PAINTOR four times to generate four independent Gibbs chains and estimated the BIC difference between model i and model j as a shrunk mean of the differences across chains. This assumes a N(0,100) prior on the difference, or roughly a 16% chance that model i would be decisively better than model j (i.e. |BIC_i - BIC_j| > 10). We then proceeded to choose the best-fitting model in a stepwise fashion: starting with a model with no annotations, we selected a model with more annotations in favor of a model with fewer if the larger model was a considerably better fit. Model 1 was the best fit according to this process for overall and ER-positive breast cancer; Model 0 was the best fit for ER-negative breast cancer. Differences between the PAINTOR and CCV outputs may be due to several factors. By considering functional enrichment and joint LD among all SNPs, PAINTOR may refine the set of likely causal variants; rather than imposing a hard threshold, PAINTOR allows for a gradient of evidence supporting causality; and the two sets of calculations are based on different summary statistics: CCV analyses used both iCOGS and OncoArray genotypes, while PAINTOR used only OncoArray data (Figure 1, Methods).
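
The 16% figure follows from the prior alone: under N(0,100) the standard deviation is 10, so a difference below -10 lies one standard deviation below zero. The sketch below checks that number and shows one possible form of a shrunk mean difference. The shrinkage formula is an assumption (a normal-normal posterior mean that uses the between-chain sample variance as the sampling variance) and is not taken from the text; function names are hypothetical.

```python
import math
from statistics import mean, variance
from typing import Sequence


def normal_cdf(x: float, mu: float = 0.0, sd: float = 1.0) -> float:
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sd * math.sqrt(2.0))))


def prior_prob_decisive(prior_var: float = 100.0, threshold: float = 10.0) -> float:
    """Prior probability that the BIC difference is below -threshold."""
    return normal_cdf(-threshold, 0.0, math.sqrt(prior_var))


def shrunk_bic_difference(bic_i: Sequence[float], bic_j: Sequence[float],
                          prior_var: float = 100.0) -> float:
    """Shrink the mean per-chain BIC difference toward zero (assumed form).

    Requires at least two chains so that a sample variance can be computed.
    """
    diffs = [a - b for a, b in zip(bic_i, bic_j)]
    # Assumed sampling variance of the mean difference across chains.
    sampling_var = variance(diffs) / len(diffs)
    weight = prior_var / (prior_var + sampling_var)
    return weight * mean(diffs)
```

With four chains, noisier chains pull the estimate further toward zero, which makes a switch to the larger model harder to justify on a single lucky run.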

Variant annotation

Variant genome coordinates were converted to assembly GRCh38 with liftOver and uploaded to Variant Effect Predictor [90] to determine their effect on genes, transcripts, and protein sequence. The commercial software Alamut® Batch v1.6 was also used to annotate coding and splicing variants. PolyPhen-2 [91], SIFT [92], and MAPP [93] were used to predict the consequence of missense coding variants. MaxEntScan [94], Splice-Site Finder, and Human Splicing Finder [95] were used to predict splicing effects.

INQUISIT analysis

Logic underlying INQUISIT predictions

Briefly, genes were considered to be potential targets of candidate causal variants through effects on: (1) distal gene regulation, (2) proximal regulation, or (3) a gene's coding sequence. We intersected CCV positions with multiple sources of genomic information including chromatin interactions from capture Hi-C experiments performed in a panel of six breast cell lines [96], chromatin interaction analysis by paired-end tag sequencing (ChIA-PET; [97]) and genome-wide chromosome conformation capture from HMECs (Hi-C; Rao et al., 2014). We used computational enhancer–promoter correlations (PreSTIGE [98], IM-PET (He et al., 2014), FANTOM5 [99] and super-enhancers [28]), results for breast tissue-specific expression variants (eVariants) from multiple independent studies (TCGA, METABRIC, NHS, Methods), allele-specific imbalance in gene expression [100], transcription factor and histone modification chromatin immunoprecipitation followed by sequencing (ChIP-Seq) from the ENCODE and Roadmap Epigenomics Projects together with the genomic features found to be significantly enriched as described above, gene expression RNA-seq from several breast cancer lines and normal samples, and topologically associated domain (TAD) boundaries from T47D cells (ENCODE, [101], Methods and Key Resources Table). To assess the impact of intragenic variants, we evaluated their potential to alter splicing using Alamut® Batch to identify new and cryptic donors and acceptors, and used several tools to predict effects of coding sequence changes (see Variant Annotation section). Variants potentially affecting post-translational modifications were downloaded from the "A Website Exhibits SNP On Modification Event" database (http://www.awesome-hust.com/) [102]. The output from each tool was converted to a binary measure to indicate deleterious or tolerated predictions.

Scoring hierarchy

Each target gene prediction category (distal, promoter or coding) was scored according to different criteria. Genes predicted to be distally regulated targets of CCVs were awarded points based on physical links (e.g. CHi-C), computational prediction methods, allele-specific expression, or eVariant associations. All CCVs and HPPVs were considered as potentially involved in distal regulation. Intersection of a putative distal enhancer with genomic features found to be significantly enriched (see ‘Genomic feature enrichment’ for details) was further upweighted. Multiple independent interactions were awarded an additional point. CCVs and HPPVs in gene proximal regulatory regions were intersected with histone ChIP-Seq peaks characteristic of promoters and assigned to the overlapping transcription start sites (defined as -1.0 kb to +0.1 kb). Further points were awarded to such genes if there was evidence for eVariant association or allele-specific expression, while a lack of expression resulted in down-weighting as potential targets. Potential coding changes including missense, nonsense and predicted splicing alterations resulted in addition of one point to the encoded gene for each type of change, while lack of expression reduced the score. We added an additional point for predicted target genes that were also breast cancer drivers. For each category, scores ranged from 0-7 (distal), 0-3 (promoter) or 0-2 (coding). We converted these scores into 'confidence levels': Level 1 (highest confidence) when distal score > 4, promoter score >= 3 or coding score > 1; Level 2 when distal score <= 4 and >= 1, promoter score = 1 or = 2, or coding score = 1; and Level 3 when distal score < 1 and > 0, promoter score < 1 and > 0, or coding score < 1 and > 0. For genes with multiple scores (for example, predicted as targets from multiple independent risk signals or predicted to be impacted in several categories), we recorded the highest score. 
Driver and transcription factor gene enrichment analysis was carried out using INQUISIT scores prior to adding a point for driver gene status. Modifications to the pipeline since original publication [2] include: (i) TAD boundary definitions from ENCODE T47D Hi-C analysis (previously, we used regions from Rao, Cell 2013); (ii) eQTL: addition of NHS normal and tumor samples; (iii) allele-specific imbalance using TCGA and GTEx RNA-seq data [100]; (iv) capture Hi-C data from six breast cell lines [103]; (v) additional biofeatures derived from global enrichment in this study; and (vi) variants affecting sites of post-translational modification [102].
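
The conversion from category scores to confidence levels described in the scoring hierarchy can be written as a small lookup. This is a sketch, not the INQUISIT code: the function name is hypothetical, and fractional scores between the stated integer boundaries (for example a promoter score of 2.5) are assigned to Level 2 as an assumption.

```python
from typing import Optional


def confidence_level(category: str, score: float) -> Optional[int]:
    """Map a category score to confidence Level 1, 2 or 3 (None if score is 0 or less)."""
    if score <= 0:
        return None
    if category == "distal":
        if score > 4:
            return 1
        return 2 if score >= 1 else 3
    if category == "promoter":
        if score >= 3:
            return 1
        return 2 if score >= 1 else 3
    if category == "coding":
        if score > 1:
            return 1
        return 2 if score == 1 else 3
    raise ValueError(f"unknown category: {category}")
```

The thresholds differ by category because the score ranges differ (0-7, 0-3 and 0-2), so the same numeric score can map to different levels.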

Multi-signal targets

To test if more genes were targeted by multiple signals than expected by chance, we modelled the number of signals per gene by negative binomial regression (R function glm.nb, package MASS) and Poisson regression (R function glm, package stats) with ChIA-PET interactions as a covariate and adjusted by fine mapping region. Likelihood ratio tests were used to compare goodness of fit. Rootograms were created using the R function rootogram (package vcd).

Pathway analysis

The pathway gene set database dated 1 September 2018 was used [104] (http://download.baderlab.org/EM_Genesets/current_release/Human/symbol/). This database contains pathways from Reactome [105], NCI Pathway Interaction Database [106], GO (Gene Ontology) [107], HumanCyc [108], MSigdb [109], NetPath [110], and Panther [111]. All duplicated pathways, defined in two or more databases, were included. To provide more biologically meaningful results, only pathways that contained ≤ 200 genes were used. We interrogated the pathway annotation sets with the list of high-confidence (Level 1) INQUISIT genes. The significance of over-representation of the INQUISIT genes within each pathway was assessed with a hypergeometric test using the R function phyper, where x is the number of Level 1 genes that overlap with any of the genes in the pathway, n is the number of genes in the pathway, m is the number of Level 1 genes that overlap with any of the genes in the pathway data set (GO: m_strong = 145, m_ER-positive = 50, m_ER-negative = 27, m_ER-neutral = 73; Pathways: m_strong = 121, m_ER-positive = 38, m_ER-negative = 21, m_ER-neutral = 68), and N is the number of genes in the pathway data set (GO: N = 14,252; Pathways: N = 10,915). We only included pathways that overlapped with at least two Level 1 genes. We used the Benjamini-Hochberg false discovery rate (FDR) [112] at the 5% level.
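
A hypergeometric over-representation test with the symbols defined above (x, n, m, N), followed by Benjamini-Hochberg adjustment, can be sketched as follows. The upper-tail form P(X >= x) is an assumption about how the test was parameterised; in R it would correspond to `phyper(x - 1, m, N - m, n, lower.tail = FALSE)`. Function names are hypothetical.

```python
from math import comb
from typing import List, Sequence


def hypergeom_upper_tail(x: int, n: int, m: int, N: int) -> float:
    """P(X >= x) when n pathway genes are drawn from N genes, m of which are Level 1."""
    total = comb(N, n)
    upper = min(n, m)
    # Exact integer arithmetic; math.comb returns 0 when the draw is impossible.
    hits = sum(comb(m, k) * comb(N - m, n - k) for k in range(x, upper + 1))
    return hits / total


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted
```

Pathways whose adjusted p-value is at most 0.05 would then pass the 5% FDR threshold.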

Journal: Sci Rep Date: 2016-09-07 Impact factor: 4.379

8. Identification of ten variants associated with risk of estrogen-receptor-negative breast cancer.

Authors: Roger L Milne; Karoline B Kuchenbaecker; Kyriaki Michailidou; Jonathan Beesley; Siddhartha Kar; Sara Lindström; Shirley Hui; Audrey Lemaçon; Penny Soucy; Joe Dennis; Xia Jiang; Asha Rostamianfar; Hilary Finucane; Manjeet K Bolla; Lesley McGuffog; Qin Wang; Cora M Aalfs; Marcia Adams; Julian Adlard; Simona Agata; Shahana Ahmed; Habibul Ahsan; Kristiina Aittomäki; Fares Al-Ejeh; Jamie Allen; Christine B Ambrosone; Christopher I Amos; Irene L Andrulis; Hoda Anton-Culver; Natalia N Antonenkova; Volker Arndt; Norbert Arnold; Kristan J Aronson; Bernd Auber; Paul L Auer; Margreet G E M Ausems; Jacopo Azzollini; François Bacot; Judith Balmaña; Monica Barile; Laure Barjhoux; Rosa B Barkardottir; Myrto Barrdahl; Daniel Barnes; Daniel Barrowdale; Caroline Baynes; Matthias W Beckmann; Javier Benitez; Marina Bermisheva; Leslie Bernstein; Yves-Jean Bignon; Kathleen R Blazer; Marinus J Blok; Carl Blomqvist; William Blot; Kristie Bobolis; Bram Boeckx; Natalia V Bogdanova; Anders Bojesen; Stig E Bojesen; Bernardo Bonanni; Anne-Lise Børresen-Dale; Aniko Bozsik; Angela R Bradbury; Judith S Brand; Hiltrud Brauch; Hermann Brenner; Brigitte Bressac-de Paillerets; Carole Brewer; Louise Brinton; Per Broberg; Angela Brooks-Wilson; Joan Brunet; Thomas Brüning; Barbara Burwinkel; Saundra S Buys; Jinyoung Byun; Qiuyin Cai; Trinidad Caldés; Maria A Caligo; Ian Campbell; Federico Canzian; Olivier Caron; Angel Carracedo; Brian D Carter; J Esteban Castelao; Laurent Castera; Virginie Caux-Moncoutier; Salina B Chan; Jenny Chang-Claude; Stephen J Chanock; Xiaoqing Chen; Ting-Yuan David Cheng; Jocelyne Chiquette; Hans Christiansen; Kathleen B M Claes; Christine L Clarke; Thomas Conner; Don M Conroy; Jackie Cook; Emilie Cordina-Duverger; Sten Cornelissen; Isabelle Coupier; Angela Cox; David G Cox; Simon S Cross; Katarina Cuk; Julie M Cunningham; Kamila Czene; Mary B Daly; Francesca Damiola; Hatef Darabi; Rosemarie Davidson; Kim De Leeneer; Peter Devilee; Ed Dicks; Orland Diez; Yuan Chun Ding; 
Nina Ditsch; Kimberly F Doheny; Susan M Domchek; Cecilia M Dorfling; Thilo Dörk; Isabel Dos-Santos-Silva; Stéphane Dubois; Pierre-Antoine Dugué; Martine Dumont; Alison M Dunning; Lorraine Durcan; Miriam Dwek; Bernd Dworniczak; Diana Eccles; Ros Eeles; Hans Ehrencrona; Ursula Eilber; Bent Ejlertsen; Arif B Ekici; A Heather Eliassen; Christoph Engel; Mikael Eriksson; Laura Fachal; Laurence Faivre; Peter A Fasching; Ulrike Faust; Jonine Figueroa; Dieter Flesch-Janys; Olivia Fletcher; Henrik Flyger; William D Foulkes; Eitan Friedman; Lin Fritschi; Debra Frost; Marike Gabrielson; Pragna Gaddam; Marilie D Gammon; Patricia A Ganz; Susan M Gapstur; Judy Garber; Vanesa Garcia-Barberan; José A García-Sáenz; Mia M Gaudet; Marion Gauthier-Villars; Andrea Gehrig; Vassilios Georgoulias; Anne-Marie Gerdes; Graham G Giles; Gord Glendon; Andrew K Godwin; Mark S Goldberg; David E Goldgar; Anna González-Neira; Paul Goodfellow; Mark H Greene; Grethe I Grenaker Alnæs; Mervi Grip; Jacek Gronwald; Anne Grundy; Daphne Gschwantler-Kaulich; Pascal Guénel; Qi Guo; Lothar Haeberle; Eric Hahnen; Christopher A Haiman; Niclas Håkansson; Emily Hallberg; Ute Hamann; Nathalie Hamel; Susan Hankinson; Thomas V O Hansen; Patricia Harrington; Steven N Hart; Jaana M Hartikainen; Catherine S Healey; Alexander Hein; Sonja Helbig; Alex Henderson; Jane Heyworth; Belynda Hicks; Peter Hillemanns; Shirley Hodgson; Frans B Hogervorst; Antoinette Hollestelle; Maartje J Hooning; Bob Hoover; John L Hopper; Chunling Hu; Guanmengqian Huang; Peter J Hulick; Keith Humphreys; David J Hunter; Evgeny N Imyanitov; Claudine Isaacs; Motoki Iwasaki; Louise Izatt; Anna Jakubowska; Paul James; Ramunas Janavicius; Wolfgang Janni; Uffe Birk Jensen; Esther M John; Nichola Johnson; Kristine Jones; Michael Jones; Arja Jukkola-Vuorinen; Rudolf Kaaks; Maria Kabisch; Katarzyna Kaczmarek; Daehee Kang; Karin Kast; Renske Keeman; Michael J Kerin; Carolien M Kets; Machteld Keupers; Sofia Khan; Elza Khusnutdinova; Johanna I Kiiski; 
Sung-Won Kim; Julia A Knight; Irene Konstantopoulou; Veli-Matti Kosma; Vessela N Kristensen; Torben A Kruse; Ava Kwong; Anne-Vibeke Lænkholm; Yael Laitman; Fiona Lalloo; Diether Lambrechts; Keren Landsman; Christine Lasset; Conxi Lazaro; Loic Le Marchand; Julie Lecarpentier; Andrew Lee; Eunjung Lee; Jong Won Lee; Min Hyuk Lee; Flavio Lejbkowicz; Fabienne Lesueur; Jingmei Li; Jenna Lilyquist; Anne Lincoln; Annika Lindblom; Jolanta Lissowska; Wing-Yee Lo; Sibylle Loibl; Jirong Long; Jennifer T Loud; Jan Lubinski; Craig Luccarini; Michael Lush; Robert J MacInnis; Tom Maishman; Enes Makalic; Ivana Maleva Kostovska; Kathleen E Malone; Siranoush Manoukian; JoAnn E Manson; Sara Margolin; John W M Martens; Maria Elena Martinez; Keitaro Matsuo; Dimitrios Mavroudis; Sylvie Mazoyer; Catriona McLean; Hanne Meijers-Heijboer; Primitiva Menéndez; Jeffery Meyer; Hui Miao; Austin Miller; Nicola Miller; Gillian Mitchell; Marco Montagna; Kenneth Muir; Anna Marie Mulligan; Claire Mulot; Sue Nadesan; Katherine L Nathanson; Susan L Neuhausen; Heli Nevanlinna; Ines Nevelsteen; Dieter Niederacher; Sune F Nielsen; Børge G Nordestgaard; Aaron Norman; Robert L Nussbaum; Edith Olah; Olufunmilayo I Olopade; Janet E Olson; Curtis Olswold; Kai-Ren Ong; Jan C Oosterwijk; Nick Orr; Ana Osorio; V Shane Pankratz; Laura Papi; Tjoung-Won Park-Simon; Ylva Paulsson-Karlsson; Rachel Lloyd; Inge Søkilde Pedersen; Bernard Peissel; Ana Peixoto; Jose I A Perez; Paolo Peterlongo; Julian Peto; Georg Pfeiler; Catherine M Phelan; Mila Pinchev; Dijana Plaseska-Karanfilska; Bruce Poppe; Mary E Porteous; Ross Prentice; Nadege Presneau; Darya Prokofieva; Elizabeth Pugh; Miquel Angel Pujana; Katri Pylkäs; Brigitte Rack; Paolo Radice; Nazneen Rahman; Johanna Rantala; Christine Rappaport-Fuerhauser; Gad Rennert; Hedy S Rennert; Valerie Rhenius; Kerstin Rhiem; Andrea Richardson; Gustavo C Rodriguez; Atocha Romero; Jane Romm; Matti A Rookus; Anja Rudolph; Thomas Ruediger; Emmanouil Saloustros; Joyce Sanders; Dale P 
Sandler; Suleeporn Sangrajrang; Elinor J Sawyer; Daniel F Schmidt; Minouk J Schoemaker; Fredrick Schumacher; Peter Schürmann; Lukas Schwentner; Christopher Scott; Rodney J Scott; Sheila Seal; Leigha Senter; Caroline Seynaeve; Mitul Shah; Priyanka Sharma; Chen-Yang Shen; Xin Sheng; Hermela Shimelis; Martha J Shrubsole; Xiao-Ou Shu; Lucy E Side; Christian F Singer; Christof Sohn; Melissa C Southey; John J Spinelli; Amanda B Spurdle; Christa Stegmaier; Dominique Stoppa-Lyonnet; Grzegorz Sukiennicki; Harald Surowy; Christian Sutter; Anthony Swerdlow; Csilla I Szabo; Rulla M Tamimi; Yen Y Tan; Jack A Taylor; Maria-Isabel Tejada; Maria Tengström; Soo H Teo; Mary B Terry; Daniel C Tessier; Alex Teulé; Kathrin Thöne; Darcy L Thull; Maria Grazia Tibiletti; Laima Tihomirova; Marc Tischkowitz; Amanda E Toland; Rob A E M Tollenaar; Ian Tomlinson; Ling Tong; Diana Torres; Martine Tranchant; Thérèse Truong; Kathy Tucker; Nadine Tung; Jonathan Tyrer; Hans-Ulrich Ulmer; Celine Vachon; Christi J van Asperen; David Van Den Berg; Ans M W van den Ouweland; Elizabeth J van Rensburg; Liliana Varesco; Raymonda Varon-Mateeva; Ana Vega; Alessandra Viel; Joseph Vijai; Daniel Vincent; Jason Vollenweider; Lisa Walker; Zhaoming Wang; Shan Wang-Gohrke; Barbara Wappenschmidt; Clarice R Weinberg; Jeffrey N Weitzel; Camilla Wendt; Jelle Wesseling; Alice S Whittemore; Juul T Wijnen; Walter Willett; Robert Winqvist; Alicja Wolk; Anna H Wu; Lucy Xia; Xiaohong R Yang; Drakoulis Yannoukakos; Daniela Zaffaroni; Wei Zheng; Bin Zhu; Argyrios Ziogas; Elad Ziv; Kristin K Zorn; Manuela Gago-Dominguez; Arto Mannermaa; Håkan Olsson; Manuel R Teixeira; Jennifer Stone; Kenneth Offit; Laura Ottini; Sue K Park; Mads Thomassen; Per Hall; Alfons Meindl; Rita K Schmutzler; Arnaud Droit; Gary D Bader; Paul D P Pharoah; Fergus J Couch; Douglas F Easton; Peter Kraft; Georgia Chenevix-Trench; Montserrat García-Closas; Marjanka K Schmidt; Antonis C Antoniou; Jacques Simard
Journal: Nat Genet Date: 2017-10-23 Impact factor: 38.330

9. Association analysis identifies 65 new breast cancer risk loci.

Authors: Kyriaki Michailidou; Sara Lindström; Joe Dennis; Jonathan Beesley; Shirley Hui; Siddhartha Kar; Audrey Lemaçon; Penny Soucy; Dylan Glubb; Asha Rostamianfar; Manjeet K Bolla; Qin Wang; Jonathan Tyrer; Ed Dicks; Andrew Lee; Zhaoming Wang; Jamie Allen; Renske Keeman; Ursula Eilber; Juliet D French; Xiao Qing Chen; Laura Fachal; Karen McCue; Amy E McCart Reed; Maya Ghoussaini; Jason S Carroll; Xia Jiang; Hilary Finucane; Marcia Adams; Muriel A Adank; Habibul Ahsan; Kristiina Aittomäki; Hoda Anton-Culver; Natalia N Antonenkova; Volker Arndt; Kristan J Aronson; Banu Arun; Paul L Auer; François Bacot; Myrto Barrdahl; Caroline Baynes; Matthias W Beckmann; Sabine Behrens; Javier Benitez; Marina Bermisheva; Leslie Bernstein; Carl Blomqvist; Natalia V Bogdanova; Stig E Bojesen; Bernardo Bonanni; Anne-Lise Børresen-Dale; Judith S Brand; Hiltrud Brauch; Paul Brennan; Hermann Brenner; Louise Brinton; Per Broberg; Ian W Brock; Annegien Broeks; Angela Brooks-Wilson; Sara Y Brucker; Thomas Brüning; Barbara Burwinkel; Katja Butterbach; Qiuyin Cai; Hui Cai; Trinidad Caldés; Federico Canzian; Angel Carracedo; Brian D Carter; Jose E Castelao; Tsun L Chan; Ting-Yuan David Cheng; Kee Seng Chia; Ji-Yeob Choi; Hans Christiansen; Christine L Clarke; Margriet Collée; Don M Conroy; Emilie Cordina-Duverger; Sten Cornelissen; David G Cox; Angela Cox; Simon S Cross; Julie M Cunningham; Kamila Czene; Mary B Daly; Peter Devilee; Kimberly F Doheny; Thilo Dörk; Isabel Dos-Santos-Silva; Martine Dumont; Lorraine Durcan; Miriam Dwek; Diana M Eccles; Arif B Ekici; A Heather Eliassen; Carolina Ellberg; Mingajeva Elvira; Christoph Engel; Mikael Eriksson; Peter A Fasching; Jonine Figueroa; Dieter Flesch-Janys; Olivia Fletcher; Henrik Flyger; Lin Fritschi; Valerie Gaborieau; Marike Gabrielson; Manuela Gago-Dominguez; Yu-Tang Gao; Susan M Gapstur; José A García-Sáenz; Mia M Gaudet; Vassilios Georgoulias; Graham G Giles; Gord Glendon; Mark S Goldberg; David E Goldgar; Anna González-Neira; Grethe I 
Grenaker Alnæs; Mervi Grip; Jacek Gronwald; Anne Grundy; Pascal Guénel; Lothar Haeberle; Eric Hahnen; Christopher A Haiman; Niclas Håkansson; Ute Hamann; Nathalie Hamel; Susan Hankinson; Patricia Harrington; Steven N Hart; Jaana M Hartikainen; Mikael Hartman; Alexander Hein; Jane Heyworth; Belynda Hicks; Peter Hillemanns; Dona N Ho; Antoinette Hollestelle; Maartje J Hooning; Robert N Hoover; John L Hopper; Ming-Feng Hou; Chia-Ni Hsiung; Guanmengqian Huang; Keith Humphreys; Junko Ishiguro; Hidemi Ito; Motoki Iwasaki; Hiroji Iwata; Anna Jakubowska; Wolfgang Janni; Esther M John; Nichola Johnson; Kristine Jones; Michael Jones; Arja Jukkola-Vuorinen; Rudolf Kaaks; Maria Kabisch; Katarzyna Kaczmarek; Daehee Kang; Yoshio Kasuga; Michael J Kerin; Sofia Khan; Elza Khusnutdinova; Johanna I Kiiski; Sung-Won Kim; Julia A Knight; Veli-Matti Kosma; Vessela N Kristensen; Ute Krüger; Ava Kwong; Diether Lambrechts; Loic Le Marchand; Eunjung Lee; Min Hyuk Lee; Jong Won Lee; Chuen Neng Lee; Flavio Lejbkowicz; Jingmei Li; Jenna Lilyquist; Annika Lindblom; Jolanta Lissowska; Wing-Yee Lo; Sibylle Loibl; Jirong Long; Artitaya Lophatananon; Jan Lubinski; Craig Luccarini; Michael P Lux; Edmond S K Ma; Robert J MacInnis; Tom Maishman; Enes Makalic; Kathleen E Malone; Ivana Maleva Kostovska; Arto Mannermaa; Siranoush Manoukian; JoAnn E Manson; Sara Margolin; Shivaani Mariapun; Maria Elena Martinez; Keitaro Matsuo; Dimitrios Mavroudis; James McKay; Catriona McLean; Hanne Meijers-Heijboer; Alfons Meindl; Primitiva Menéndez; Usha Menon; Jeffery Meyer; Hui Miao; Nicola Miller; Nur Aishah Mohd Taib; Kenneth Muir; Anna Marie Mulligan; Claire Mulot; Susan L Neuhausen; Heli Nevanlinna; Patrick Neven; Sune F Nielsen; Dong-Young Noh; Børge G Nordestgaard; Aaron Norman; Olufunmilayo I Olopade; Janet E Olson; Håkan Olsson; Curtis Olswold; Nick Orr; V Shane Pankratz; Sue K Park; Tjoung-Won Park-Simon; Rachel Lloyd; Jose I A Perez; Paolo Peterlongo; Julian Peto; Kelly-Anne Phillips; Mila Pinchev; Dijana 
Plaseska-Karanfilska; Ross Prentice; Nadege Presneau; Darya Prokofyeva; Elizabeth Pugh; Katri Pylkäs; Brigitte Rack; Paolo Radice; Nazneen Rahman; Gadi Rennert; Hedy S Rennert; Valerie Rhenius; Atocha Romero; Jane Romm; Kathryn J Ruddy; Thomas Rüdiger; Anja Rudolph; Matthias Ruebner; Emiel J T Rutgers; Emmanouil Saloustros; Dale P Sandler; Suleeporn Sangrajrang; Elinor J Sawyer; Daniel F Schmidt; Rita K Schmutzler; Andreas Schneeweiss; Minouk J Schoemaker; Fredrick Schumacher; Peter Schürmann; Rodney J Scott; Christopher Scott; Sheila Seal; Caroline Seynaeve; Mitul Shah; Priyanka Sharma; Chen-Yang Shen; Grace Sheng; Mark E Sherman; Martha J Shrubsole; Xiao-Ou Shu; Ann Smeets; Christof Sohn; Melissa C Southey; John J Spinelli; Christa Stegmaier; Sarah Stewart-Brown; Jennifer Stone; Daniel O Stram; Harald Surowy; Anthony Swerdlow; Rulla Tamimi; Jack A Taylor; Maria Tengström; Soo H Teo; Mary Beth Terry; Daniel C Tessier; Somchai Thanasitthichai; Kathrin Thöne; Rob A E M Tollenaar; Ian Tomlinson; Ling Tong; Diana Torres; Thérèse Truong; Chiu-Chen Tseng; Shoichiro Tsugane; Hans-Ulrich Ulmer; Giske Ursin; Michael Untch; Celine Vachon; Christi J van Asperen; David Van Den Berg; Ans M W van den Ouweland; Lizet van der Kolk; Rob B van der Luijt; Daniel Vincent; Jason Vollenweider; Quinten Waisfisz; Shan Wang-Gohrke; Clarice R Weinberg; Camilla Wendt; Alice S Whittemore; Hans Wildiers; Walter Willett; Robert Winqvist; Alicja Wolk; Anna H Wu; Lucy Xia; Taiki Yamaji; Xiaohong R Yang; Cheng Har Yip; Keun-Young Yoo; Jyh-Cherng Yu; Wei Zheng; Ying Zheng; Bin Zhu; Argyrios Ziogas; Elad Ziv; Sunil R Lakhani; Antonis C Antoniou; Arnaud Droit; Irene L Andrulis; Christopher I Amos; Fergus J Couch; Paul D P Pharoah; Jenny Chang-Claude; Per Hall; David J Hunter; Roger L Milne; Montserrat García-Closas; Marjanka K Schmidt; Stephen J Chanock; Alison M Dunning; Stacey L Edwards; Gary D Bader; Georgia Chenevix-Trench; Jacques Simard; Peter Kraft; Douglas F Easton
Journal: Nature Date: 2017-10-23 Impact factor: 49.962

10. Evidence that breast cancer risk at the 2q35 locus is mediated through IGFBP5 regulation.

Authors: Maya Ghoussaini; Stacey L Edwards; Kyriaki Michailidou; Silje Nord; Richard Cowper-Sal Lari; Kinjal Desai; Siddhartha Kar; Kristine M Hillman; Susanne Kaufmann; Dylan M Glubb; Jonathan Beesley; Joe Dennis; Manjeet K Bolla; Qin Wang; Ed Dicks; Qi Guo; Marjanka K Schmidt; Mitul Shah; Robert Luben; Judith Brown; Kamila Czene; Hatef Darabi; Mikael Eriksson; Daniel Klevebring; Stig E Bojesen; Børge G Nordestgaard; Sune F Nielsen; Henrik Flyger; Diether Lambrechts; Bernard Thienpont; Patrick Neven; Hans Wildiers; Annegien Broeks; Laura J Van't Veer; Emiel J Th Rutgers; Fergus J Couch; Janet E Olson; Emily Hallberg; Celine Vachon; Jenny Chang-Claude; Anja Rudolph; Petra Seibold; Dieter Flesch-Janys; Julian Peto; Isabel Dos-Santos-Silva; Lorna Gibson; Heli Nevanlinna; Taru A Muranen; Kristiina Aittomäki; Carl Blomqvist; Per Hall; Jingmei Li; Jianjun Liu; Keith Humphreys; Daehee Kang; Ji-Yeob Choi; Sue K Park; Dong-Young Noh; Keitaro Matsuo; Hidemi Ito; Hiroji Iwata; Yasushi Yatabe; Pascal Guénel; Thérèse Truong; Florence Menegaux; Marie Sanchez; Barbara Burwinkel; Frederik Marme; Andreas Schneeweiss; Christof Sohn; Anna H Wu; Chiu-Chen Tseng; David Van Den Berg; Daniel O Stram; Javier Benitez; M Pilar Zamora; Jose Ignacio Arias Perez; Primitiva Menéndez; Xiao-Ou Shu; Wei Lu; Yu-Tang Gao; Qiuyin Cai; Angela Cox; Simon S Cross; Malcolm W R Reed; Irene L Andrulis; Julia A Knight; Gord Glendon; Sandrine Tchatchou; Elinor J Sawyer; Ian Tomlinson; Michael J Kerin; Nicola Miller; Christopher A Haiman; Brian E Henderson; Fredrick Schumacher; Loic Le Marchand; Annika Lindblom; Sara Margolin; Soo Hwang Teo; Cheng Har Yip; Daphne S C Lee; Tien Y Wong; Maartje J Hooning; John W M Martens; J Margriet Collée; Carolien H M van Deurzen; John L Hopper; Melissa C Southey; Helen Tsimiklis; Miroslav K Kapuscinski; Chen-Yang Shen; Pei-Ei Wu; Jyh-Cherng Yu; Shou-Tung Chen; Grethe Grenaker Alnæs; Anne-Lise Borresen-Dale; Graham G Giles; Roger L Milne; Catriona McLean; Kenneth Muir; 
Artitaya Lophatananon; Sarah Stewart-Brown; Pornthep Siriwanarangsan; Mikael Hartman; Hui Miao; Shaik Ahmad Bin Syed Buhari; Yik Ying Teo; Peter A Fasching; Lothar Haeberle; Arif B Ekici; Matthias W Beckmann; Hermann Brenner; Aida Karina Dieffenbach; Volker Arndt; Christa Stegmaier; Anthony Swerdlow; Alan Ashworth; Nick Orr; Minouk J Schoemaker; Montserrat García-Closas; Jonine Figueroa; Stephen J Chanock; Jolanta Lissowska; Jacques Simard; Mark S Goldberg; France Labrèche; Martine Dumont; Robert Winqvist; Katri Pylkäs; Arja Jukkola-Vuorinen; Hiltrud Brauch; Thomas Brüning; Yon-Dschun Koto; Paolo Radice; Paolo Peterlongo; Bernardo Bonanni; Sara Volorio; Thilo Dörk; Natalia V Bogdanova; Sonja Helbig; Arto Mannermaa; Vesa Kataja; Veli-Matti Kosma; Jaana M Hartikainen; Peter Devilee; Robert A E M Tollenaar; Caroline Seynaeve; Christi J Van Asperen; Anna Jakubowska; Jan Lubinski; Katarzyna Jaworska-Bieniek; Katarzyna Durda; Susan Slager; Amanda E Toland; Christine B Ambrosone; Drakoulis Yannoukakos; Suleeporn Sangrajrang; Valerie Gaborieau; Paul Brennan; James McKay; Ute Hamann; Diana Torres; Wei Zheng; Jirong Long; Hoda Anton-Culver; Susan L Neuhausen; Craig Luccarini; Caroline Baynes; Shahana Ahmed; Mel Maranian; Catherine S Healey; Anna González-Neira; Guillermo Pita; M Rosario Alonso; Nuria Alvarez; Daniel Herrero; Daniel C Tessier; Daniel Vincent; Francois Bacot; Ines de Santiago; Jason Carroll; Carlos Caldas; Melissa A Brown; Mathieu Lupien; Vessela N Kristensen; Paul D P Pharoah; Georgia Chenevix-Trench; Juliet D French; Douglas F Easton; Alison M Dunning
Journal: Nat Commun Date: 2014-09-23 Impact factor: 14.919

Cited by 36 in total

1. Two distinct mechanisms underlie estrogen-receptor-negative breast cancer susceptibility at the 2p23.2 locus.

Authors: Gustavo Mendoza-Fandiño; Paulo Cilas M Lyra; Thales C Nepomuceno; Carly M Harro; Nicholas T Woods; Xueli Li; Leticia B Rangel; Marcelo A Carvalho; Fergus J Couch; Alvaro N A Monteiro
Journal: Eur J Hum Genet Date: 2021-11-22 Impact factor: 4.246

2. eQTL Colocalization Analyses Identify NTN4 as a Candidate Breast Cancer Risk Gene.

Authors: Jonathan Beesley; Haran Sivakumaran; Mahdi Moradi Marjaneh; Wei Shi; Kristine M Hillman; Susanne Kaufmann; Nehal Hussein; Siddhartha Kar; Luize G Lima; Sunyoung Ham; Andreas Möller; Georgia Chenevix-Trench; Stacey L Edwards; Juliet D French
Journal: Am J Hum Genet Date: 2020-08-31 Impact factor: 11.025

3. Common variants in signaling transcription-factor-binding sites drive phenotypic variability in red blood cell traits.

Authors: Avik Choudhuri; Eirini Trompouki; Brian J Abraham; Leandro M Colli; Kian Hong Kock; William Mallard; Min-Lee Yang; Divya S Vinjamur; Alireza Ghamari; Audrey Sporrij; Karen Hoi; Barbara Hummel; Sonja Boatman; Victoria Chan; Sierra Tseng; Satish K Nandakumar; Song Yang; Asher Lichtig; Michael Superdock; Seraj N Grimes; Teresa V Bowman; Yi Zhou; Shinichiro Takahashi; Roby Joehanes; Alan B Cantor; Daniel E Bauer; Santhi K Ganesh; John Rinn; Paul S Albert; Martha L Bulyk; Stephen J Chanock; Richard A Young; Leonard I Zon
Journal: Nat Genet Date: 2020-11-23 Impact factor: 38.330

4. Potential functional variants of KIAA genes are associated with breast cancer risk in a case control study.

Authors: Jing Zhou; Congcong Chen; Sijun Liu; Wen Zhou; Jiangbo Du; Yue Jiang; Juncheng Dai; Guangfu Jin; Hongxia Ma; Zhibin Hu; Jiaping Chen; Hongbing Shen
Journal: Ann Transl Med Date: 2021-04

5. A Role for TGFβ Signaling in Preclinical Osteolytic Estrogen Receptor-Positive Breast Cancer Bone Metastases Progression.

Authors: Julia N Cheng; Jennifer B Frye; Susan A Whitman; Andrew G Kunihiro; Ritu Pandey; Janet L Funk
Journal: Int J Mol Sci Date: 2021-04-24 Impact factor: 5.923

6. Rare germline copy number variants (CNVs) and breast cancer risk.

Authors: Joe Dennis; Jonathan P Tyrer; Logan C Walker; Kyriaki Michailidou; Leila Dorling; Manjeet K Bolla; Qin Wang; Thomas U Ahearn; Irene L Andrulis; Hoda Anton-Culver; Natalia N Antonenkova; Volker Arndt; Kristan J Aronson; Laura E Beane Freeman; Matthias W Beckmann; Sabine Behrens; Javier Benitez; Marina Bermisheva; Natalia V Bogdanova; Stig E Bojesen; Hermann Brenner; Jose E Castelao; Jenny Chang-Claude; Georgia Chenevix-Trench; Christine L Clarke; J Margriet Collée; Fergus J Couch; Angela Cox; Simon S Cross; Kamila Czene; Peter Devilee; Thilo Dörk; Laure Dossus; A Heather Eliassen; Mikael Eriksson; D Gareth Evans; Peter A Fasching; Jonine Figueroa; Olivia Fletcher; Henrik Flyger; Lin Fritschi; Marike Gabrielson; Manuela Gago-Dominguez; Montserrat García-Closas; Graham G Giles; Anna González-Neira; Pascal Guénel; Eric Hahnen; Christopher A Haiman; Per Hall; Antoinette Hollestelle; Reiner Hoppe; John L Hopper; Anthony Howell; Agnes Jager; Anna Jakubowska; Esther M John; Nichola Johnson; Michael E Jones; Audrey Jung; Rudolf Kaaks; Renske Keeman; Elza Khusnutdinova; Cari M Kitahara; Yon-Dschun Ko; Veli-Matti Kosma; Stella Koutros; Peter Kraft; Vessela N Kristensen; Katerina Kubelka-Sabit; Allison W Kurian; James V Lacey; Diether Lambrechts; Nicole L Larson; Martha Linet; Alicja Ogrodniczak; Arto Mannermaa; Siranoush Manoukian; Sara Margolin; Dimitrios Mavroudis; Roger L Milne; Taru A Muranen; Rachel A Murphy; Heli Nevanlinna; Janet E Olson; Håkan Olsson; Tjoung-Won Park-Simon; Charles M Perou; Paolo Peterlongo; Dijana Plaseska-Karanfilska; Katri Pylkäs; Gad Rennert; Emmanouil Saloustros; Dale P Sandler; Elinor J Sawyer; Marjanka K Schmidt; Rita K Schmutzler; Rana Shibli; Ann Smeets; Penny Soucy; Melissa C Southey; Anthony J Swerdlow; Rulla M Tamimi; Jack A Taylor; Lauren R Teras; Mary Beth Terry; Ian Tomlinson; Melissa A Troester; Thérèse Truong; Celine M Vachon; Camilla Wendt; Robert Winqvist; Alicja Wolk; Xiaohong R Yang; Wei Zheng; Argyrios Ziogas; Jacques 
Simard; Alison M Dunning; Paul D P Pharoah; Douglas F Easton
Journal: Commun Biol Date: 2022-01-18

7. Fine-mapping genetic associations. (Review)

Authors: Anna Hutchinson; Jennifer Asimit; Chris Wallace
Journal: Hum Mol Genet Date: 2020-09-30 Impact factor: 6.150

8. Molecular Spectra and Frequency Patterns of Somatic Mutations in Arab Women with Breast Cancer.

Authors: Humaid O Al-Shamsi; Ibrahim Abu-Gheida; Ahmed S Abdulsamad; Aydah AlAwadhi; Sadir Alrawi; Khaled M Musallam; Banu Arun; Nuhad K Ibrahim
Journal: Oncologist Date: 2021-08-14

9. Quality and Quantity: How to Organize a Countrywide Genetic Counseling and Testing. (Review)

Authors: Rita Katharina Schmutzler
Journal: Breast Care (Basel) Date: 2021-05-07 Impact factor: 2.860

10. A search for modifying genetic factors in CHEK2:c.1100delC breast cancer patients.

Authors: Camilla Wendt; Taru A Muranen; Lotta Mielikäinen; Jessada Thutkawkorapin; Carl Blomqvist; Xiang Jiao; Hans Ehrencrona; Emma Tham; Brita Arver; Beatrice Melin; Ekaterina Kuchinskaya; Marie Stenmark Askmalm; Ylva Paulsson-Karlsson; Zakaria Einbeigi; Anna von Wachenfeldt Väppling; Eija Kalso; Tiina Tasmuth; Anne Kallioniemi; Kristiina Aittomäki; Heli Nevanlinna; Åke Borg; Annika Lindblom
Journal: Sci Rep Date: 2021-07-20 Impact factor: 4.379