Literature DB >> 29879995

Family specific genetic predisposition to breast cancer: results from Tunisian whole exome sequenced breast cancer cases.

Yosr Hamdi1, Maroua Boujemaa2, Mariem Ben Rekaya2, Cherif Ben Hamda3,4, Najah Mighri2, Houda El Benna5, Nesrine Mejri5, Soumaya Labidi5, Nouha Daoud5, Chokri Naouali2, Olfa Messaoud2, Mariem Chargui2, Kais Ghedira3, Mohamed Samir Boubaker2, Ridha Mrad6, Hamouda Boussen5, Sonia Abdelhak2.   

Abstract

BACKGROUND: A family history of breast cancer has long been thought to indicate the presence of inherited genetic events that predispose to this disease. In North Africa, many specific epidemio-genetic characteristics have been observed in breast cancer families when compared to Western populations. Despite these specificities, the majority of breast cancer genetics studies performed in North Africa remain restricted to the investigation of the BRCA1 and BRCA2 genes. Thus, comprehensive data at a whole exome or whole genome level from local patients are lacking.
METHODS: A whole exome sequencing (WES) of seven breast cancer Tunisian families have been performed using a family-based approach. We focused our analysis on BC-TN-F001 family that included two affected members that have been sequenced using WES. Relevant variants identified in BC-TN-F001 have been confirmed using Sanger sequencing. Then, we conducted an integrative analysis by combining our results with those from other WES studies in order to figure out the genetic transmission model of the newly identified genes. Biological network construction and protein-protein interactions analyses have been performed to decipher the molecular mechanisms likely accounting for the role of these genes in breast cancer risk.
RESULTS: Sequencing, filtering strategies, and validation analysis have been achieved. For BC-TN-F001, no deleterious mutations have been identified on known breast cancer genes. However, 373 heterozygous, exonic and rare variants have been identified on other candidate genes. After applying several filters, 12 relevant high-risk variants have been selected. Our results showed that these variants seem to be inherited in a family specific model. This hypothesis has been confirmed following a thorough analysis of the reported WES studies. Enriched biological process and protein-protein interaction networks resulted in the identification of four novel breast cancer candidate genes namely MMS19, DNAH3, POLK and KATB6.
CONCLUSIONS: In this first WES application on Tunisian breast cancer patients, we highlighted the impact of next generation sequencing technologies in the identification of novel breast cancer candidate genes which may bring new insights into the biological mechanisms of breast carcinogenesis. Our findings showed that the breast cancer predisposition in non-BRCA families may be ethnic and/or family specific.

Entities:  

Keywords:  Breast cancer; Exome sequencing; Family specific predisposition; Non BRCA Tunisian families

Mesh:

Year:  2018        PMID: 29879995      PMCID: PMC5992876          DOI: 10.1186/s12967-018-1504-9

Source DB:  PubMed          Journal:  J Transl Med        ISSN: 1479-5876            Impact factor:   5.531


Background

A range of genetic and non-genetic risk factors contribute to the development of breast cancer [1]. So far, several genetic variants of high, moderate and low penetrance have been identified as impacting on breast cancer risk using familial linkage, DNA resequencing and genome wide association analysis, respectively [2]. The identification of additional breast cancer associated genes is crucial to explain the missing breast cancer heritability. Recent studies showed that breast cancer susceptibility may be explained by a polygenic risk model of inheritance in which a large number of common SNPs contribute multiplicatively towards risk [3]. With the introduction of next generation sequencing (NGS) technologies [4, 5] many studies suggested that a large rate of the remaining breast cancer heritability can be attributed to new rare risk alleles that segregate in an autosomal-dominant pattern of inheritance. To date, two different whole exome sequencing study designs are used: case/control association studies and the family-based approach. The case/control design is considered as the major promising tool to detect significant associations between genetic variations and breast cancer disease [6]. However, due to the extreme rarity of certain variants, this approach requires large-size cohorts to confirm the association between these variants and breast cancer risk. The second WES design is the family-based approach [7] where breast cancer family members are exome-sequenced and the shared variants between affected individuals presumably include the familial breast cancer risk allele. Thus, focusing on the family segregation of relevant variants is expected to better detect novel susceptibility variants than the screening of pooled unrelated cases and controls. Several WES studies have been performed on hereditary breast cancer [7, 8]. Almost, 108 breast cancer families have been whole exome sequenced using the family-based approach and reported many relevant variants present in related affected individuals and absent in unaffected ones. So far, five new genes have been identified by WES as associated with breast cancer risk, among them four genes identified using the family-based approach, namely: XRCC2 [9], MAPKAP1 [10], FANCM [11] and RINT1 [12] while only one gene, REQCL, was identified using the case/control approach [13]. Mutations on known breast cancer susceptibility genes were reported in only four families [10-14]. In Tunisia, breast cancer is the most common and the most deadly form of cancer among females [15]. Several epidemiological, genetic and clinical breast cancer characteristics have been observed to be unique to Tunisian and North African population. Indeed, breast cancer shows a lower incidence rate but a younger age of disease onset, when compared to Western populations, with a relative high frequency of the aggressive breast cancer forms such as inflammatory and triple negative breast cancers [16]. Thus, a genetic predisposition specific to this ethnic group is plausible, [8, 17, 18]. Moreover, it is possible that breast cancer risk variants are so rare that they are “family specific” meaning that a genetic predisposition can be detected within a disease-prone family, but not necessarily shared with other genetically unrelated families with the same disease [19-21]. So far, genetic studies performed on Tunisian breast cancer patients mostly focused on the BRCA genes using the traditional Sanger technique. Therefore, the use of next generation sequencing technologies in the genetic investigation of these under-exploited populations may help identifying novel breast cancer risk allele and explain the remaining unresolved breast cancer genetic heritability. In the present study, we performed a whole exome sequencing of seven BRCAx breast cancer Tunisian families with strong family history in order to identify genetic variations that may be associated with breast cancer risk. Using the family-based approach, we focused our analysis on a non BRCA family by sequencing two out of three affected sisters. After comparing our results to those identified in previous WES studies and by performing biological network analysis, we identified a set of novel breast cancer candidate genes that seems to be inherited in a family specific manner.

Methods

Patients

Seven Tunisian breast cancer families were selected for WES based on the following criteria: (1) Presence of at least three related first or second-degree breast cancer cases; (2) Breast cancer in young patients aged less than 35 years, (3) Presence of at least two cases of breast or ovarian cancer, regardless of age, and at least one case of pancreatic cancer or prostate cancer in a related first or second degree patient. Blood samples have been collected from the affected family members and have been sampled in the Medical oncology department, Abderrahman Mami Hospital, Ariana, Tunisia. Written informed consents were obtained from all participants. Ethical approval according to the Declaration of Helsinki Principles was obtained from the biomedical ethics committee of Institut Pasteur de Tunis (2017/16/E/Hôpital a-m/V1). Two out of three affected sisters from BC-TN-F001 have been whole exome sequenced. The proband was diagnosed with a primary breast cancer at age 43 and contralateral invasive ductal breast carcinoma at age 48. The second family member involved in this study was diagnosed with an invasive breast cancer at age 56. Phenotypic characteristics of the affected family members are described in Table 1.
Table 1

Epidemiological and clinical data of affected family members

Family MemberDiagnosis ageHistological subtypeSBR gradeTumor size (mm)Hormone receptors statusHER2 statusDisease evolutionMedical history
BC-TN-F001-143Invasive ductal carcinomaII22ER +/PR+NDCBC within 5 years, grade III triple negative carcinoma3 miscarriages
BC-TN-F001-256Invasive ductal carcinomaNDNDER +/PR+NDIn remissionNo medical history
BC-TN-F001-347Bifocal invasive ductal carcinomaI7ER +/PR+HER2−In remissionPrimary infertility (IVF)

CBC contralateral breast cancer; ER estrogen receptor; PR progesterone receptor; ND not determined; IVF in vitro fertilization

Epidemiological and clinical data of affected family members CBC contralateral breast cancer; ER estrogen receptor; PR progesterone receptor; ND not determined; IVF in vitro fertilization

Whole exome sequencing and data analysis

For each participant, total genomic DNA was isolated from peripheral blood using the salting out method or the DNeasy blood Kit from Qiagen according to the manufacturer’s instructions. DNA purity and concentration were measured using a NanoDrop™ spectrophotometer. Samples were prepared according to Agilent’s SureSelect Protocol version 1.2 and enrichment was carried out according to Agilent SureSelect protocols. Enriched samples were sequenced on the Illumina HiSeq 2000 platform using TruSeq v3 chemistry with paired-end (2 × 100pb). Exome DNA sequences were mapped to their location in the build of the human genome (hg19/b37) using the Burrows–Wheeler Aligner (BWA) package. The subsequent SAM files were converted to BAM files using Samtools. Duplicate reads were removed using Picard. GATK was then used to recalibrate the base quality scores as well as for SNP and short INDEL calling. Annotation and prioritization of potential disease-causing variants were performed using VarAFT (Variant Annotation and Filtering Tool) (http://varaft.eu). To annotate variants, VarAFT uses ANNOVAR, a command line tool. INDELs and SNPs annotated were filtered according to several criteria: (1) considering breast cancer as autosomal dominant disease and removing variants that were found in a homozygous state, (2) variants identified as intronic, intergenic, and none coding or synonymous were discarded, (3) assuming that causal variants are rare, we removed all variants with an allele frequency > 1% either in Exac [22], 1000 genomes [23] or ESP6500 (http://evs.gs.washington.edu/EVS/), (4) benign or tolerated variants, according to different in silico prediction tools were also removed. Finally, significant candidate variants were obtained after filtering against their phenotypic relevance.

Sanger sequencing

The Sanger sequencing technique was first used to test the BRCA status of affected family members, then to validate the identified variants resulting from the whole exome sequencing. PCR reactions were performed on genomic DNA (gDNA), following standard protocols, followed by Sanger sequencing using an automated sequencer (ABI 3500; Applied Biosystems, Foster City, CA) using a cycle sequencing reaction kit (Big Dye Terminator kit, Applied Biosystems). Data were analyzed using BioEdit Sequence Alignment Editor Version 7.2.5.

In silico prediction tools

We selected four in silico prediction tools to assess the functional effects of the candidate variants: Sorting Intolerant From Tolerant (SIFT) (http://sift.jcvi.org/) to examine the degree of conservation for amino acid residues across species and to find changes in protein structure and function; PolyPhen-2 (http://genetics.bwh.harvard.edu/pph2/) and Mutation Taster (http://www.mutationtaster.org/) to assess the impact of mutations on protein function and to look at effects on splicing or mRNA expression and Align GVGD (http://agvgd.iarc.fr) that classifies missense variants in a query sequence into seven grades, from the most deleterious C65 to the least deleterious C0, with the intermediate grades C15, C25, C35, C45 and C55 [24]. The program is based on Grantham calculation, a combination of Grantham Variation (GV) which measures the amount of observed biochemical evolutionary variation at a specific position of the alignment, and Grantham Deviation (GD) which measures the biochemical difference between the missense residue and the range of variation observed at this position in the alignment.

Functional annotation and biological network construction

To discern the implication of the candidate breast cancer genes, several bioinformatics tools have been used to explore their biological pathways and the possible protein–protein interactions. We first performed a functional analysis using the EnrichR platform [25], a bioinformatics web-based tool that includes more than 60 gene-set libraries, such as Gene ontology [26], KEGG, Wikipathways, as well as Jensen-diseases. The selection criteria for significantly enriched pathways and ontology term were a p value less than 0.05 (Additional file 1: Table S1). For a better visualization and interpretation of the biological processes associated with selected breast cancer candidate genes and their upstream regulator, we used ClueGO [27], a user friendly Cytoscape plug-into analyze interrelations of terms and functional groups in biological networks [28]. In brief, we used enrichment (right-sided) hyper-geometric distribution tests, with a p value significance level ≤ 0.05, followed by the Bonferroni adjustment for the terms and the groups with Kappa-statistics score threshold set to 0.5, and leading term groups were selected based on the highest significance. Protein–protein interaction network including physical and functional association across our set of genes was sorted out using string db 10.0 [29] with confidence score 0.4.

Results

Eight affected individuals from seven BRCAx Tunisian families at high risk of breast cancer were analyzed using whole exome sequencing. Results including number of reads, sample coverage and sequencing depth of the whole exome sequenced patients have been summarized in Additional file 1: Table S2. We focused our current analysis on the first BRCA negative family; BC-TN-F001 (Fig. 1). Two out of three affected family members have been selected for whole exome sequencing.
Fig. 1

The familial pedigree of the breast cancer whole exome sequenced family

The familial pedigree of the breast cancer whole exome sequenced family

Analysis of variants located on the known breast cancer susceptibility genes

Before applying the filter, steps described in the methods section, we first investigated the following 29 genes known to be associated with hereditary breast and ovarian cancer: ATM, BARD1, BRCA1, BRCA2, BLM, BRIP1, CDH1, CHEK2, FAM175A, FANCC, FANCM, MAPKAP1, MLH1, MRE11A, MSH2, NBN, NF1, PALB2, PMS2, PTEN, RAD50, RAD51B, RAD51C, RAD51D, RECQL, RINT1, STK11, TP53 and XRCC2 (Table 2). 59 shared heterozygous variants have been identified on these genes of which, 51 (86.4%) common non-coding variants, five exonic variants and 3 splicing SNPs. The exonic variations include a BRCA2 rare variant (rs4987047, MAF = 0.0089), three common exonic polymorphisms on BARD1 (rs2070094, rs2229571 and rs1048108), and one variant on MAPKAP1 (rs1201689). None of the heterozygous variants that have been found on BRCA1, BLM, FAM175A, FANCM, PTEN, RAD50, RINT1, STK11, TP53 and XRCC2 were shared between the two sequenced family members.
Table 2

Variants on hereditary breast and ovarian cancer genes shared by the two sequenced family members

GenesPositionVariant IDSequence variationFrequency (1000 genomes)LocalizationClinVar
ATM 108137775rs642496c.2467−123T > A0.681909Intronic
108225661rs664143c.640+30986T>C0.628195Intronic
10815,707rs3218681c.3403−15_3403−14insA0.542133IntronicBenign
BARD1 215632255rs2070094c.1462G>A0.366214ExonicLikely benign
215645464rs2229571c.1077G>C0.459265ExonicLikely benign
215674224rs1048108c.70C>T0.33127ExonicLikely benign
215632155rs5031009c.1568+51A>G0.366214Intronic
215632126rs398048293c.1511+78_1511+79delAA0.366214Intronic
215634055rs6704780c.1315−19G>A0.365216IntronicBenign
21532192rs5031011c.1568+14C>T0.352236IntronicLikely benign
215595645rs16852600c.1904−413G>A0.275359Intronic
BLM No detected variants
BRCA1 No detected variants
BRCA2 32953529rs4987047c.8830A>T0.00898562ExonicBenign
BRIP1 No detected variants
CDH1 68857277rs201760019c.1754−25C>A0.000599042Intronic
68857544rs34939176c.1981+17_1981+18insA0.0459265IntronicBenign
68868148rs140240766c.*746C>A0.000599042UTR3Likely benign
CHEK2 29137944rs2236142c.−194C>G0.560304Upstream
FAM175A No detected variants
FANCC 9,873957rs4647534c.1155−38T>C0.541334IntronicBenign
97873435rs2404457c.1329+310C>T0.411142UTR3
97888730rs4647512c.896+81G>A0.0313498Intronic
FANCM No detected variants
MAPKAP1 128321827rs146481224c.848+85T>A0.0163738Intronic
42103822rs1197672c.328−333C>T0.239816Intronic
42105918rs1201689c.937C>G0.305112Exonic
42111933rs890497c.2499+85G>A0.0884585Intronic
MLH1 37070437rs41562513c.1558+14G>A0.0501198IntronicBenign
MRE11A 94179125rs1014666c.1784−69A>G0.517173Intronic
94212048rs535801c.403−6G>A0.313099SplicingBenign
94197568rs640627c.1099−163G>A0.314896Intronic
94225807rs496797c.20+141G>A0.552915Splicing
94225920rs497763c.20+28G>A0.457268IntronicBenign
94212154rs680695c.403−112T>C0.313099Intronic
MSH2 47656801rs2347794c.1077−80G>A0.59365IntronicBenign
47630550rs2303426c.211+9C>A0.628395IntronicBenign
47693959rs3732183c.1661+12G>A0.483427IntronicBenign
47693706rs3732182c.1511−91G>T0.483027IntronicBenign
47739551rs2303424c.2744A>G0.527955Intergenic
NBN 90983317rs104895036c.456+84G>C0.00139776Intronic
NF1 29685905rs34513299c.8051−82A>G0.00199681Intronic
PALB2 23640467rs249954c.2586+58C>T0.35004IntronicBenign
23652525rs8053188c.−339C>T0.0662939UTR5Benign
PMS2 6037058rs549498051c.706−5delT0.453075SplicingBenign
PTEN No detected variants
RAD50 131927748rs10520116c.1793+22T>C0.0129792Intronic
131944964rs2066742c.2923−11_2923−10insT0.0734824IntronicLikely benign
131928652rs2706366c.1793+926A>G0.123003Intronic
131892979rs4526098c.−38A>G0.92492UTR5Benign
RAD51B 68290372rs17783124c.84+28T>G0.250399Intronic
68290464rs28623567c.84+120G>A0.2498Intronic
68937054rs142879847c.1036+2087A>G0.00798722Intronic
68758575rs10129646c.757−26T>C0.138379Intronic
68301767rs34564590c.199−29_199−28insA0.319489Intronic
68290426rs28604984c.84+82T>C0.2498Intronic
68934860rs34436700c.958−29A>G0.00778754IntronicLikely benign
69117512rs8023214c.1037−32142T>C0.528554Intergenic
69117387rs8021657c.1037−32267A>G0.527556Intergenic
RAD51C 56798207rs28363318c.904+34T>C0.205272Intronic
56769979rs12946397c.−681G>A0.158347UTR5Likely benign
RAD51D No detected variants
RECQL 21629993rs397718052c.868−68_868−67insG0.488818Intronic
21628320rs10841831c.1216+82G>A0.486821Intronic
21628791rs3752648c.950−33A>G0.48742Intronic
21628336rs10841832c.1216+66C>T0.486821Intronic
RINT1 No detected variants
STK11 No detected variants
TP53 No detected variants
XRCC2 No detected variants
Variants on hereditary breast and ovarian cancer genes shared by the two sequenced family members Based on breast cancer information core (BIC) and ClinVar databases, none of the 59 variants identified on these classical breast and ovarian cancer genes was classified as pathogenic. Thus, we suggested that breast cancer genetic predisposition in this family might be due to new variants on novel breast cancer candidate genes.

Identification of novel candidate variants

A total of 32,212 heterozygous variants shared by both cases have been identified (Fig. 2). Among them, 4593 heterozygous, exonic, splicing and non-synonymous SNPs were called. Variants with MAF > 1% have been excluded. Therefore, 373 rare variations have been selected for further investigations including 39 variations that have not been previously reported. In fact, as the Tunisian population is not represented in public databases, reported variants have not been excluded.
Fig. 2

Number of variants filtered using several criteria determining high risk alleles

Number of variants filtered using several criteria determining high risk alleles In order to select the most relevant SNPs, SIFT (score < 0.05), PolyPhen (score > 0.909), Mutation Taster (disease-causing prediction) and Align GVGD (score > C55) have been used as in silico prediction tools to assess the functional effect of the 373 variants. A list of 12 high risk variants have been selected based on interesting in silico predictions (Table 3) of which seven nonsynonymous variants on HSD3B1, PBK, ITIH2, MMS19, PPL, DNAH3 and RASSF2, 1 splicing variation on CFTR, 2 stop-gain variants on CALCOCO2 and LRRC29, 1 frameshift deletion on PABPC3 and 1 frameshift insertion on ZNF677. None of these variants have been listed in the ClinVar database, except CFTR-rs1057516216 variant that seems to be “likely pathogenic”.
Table 3

Damaging variations identified in the affected individuals and selected using different functional prediction tools

Chromosome-PositionaLocusGeneReference sequenceVariant typeCoding changeProtein variationVariant IdFrequencyPrediction of variant effectConservation score PhastConsbClinVar
dbSNPExACSIFTPolyphen2Mutation tasterAlign-GVGD
Chr1: 1200566301p12 HSD3B1 NM_000862Nonsynonymousc.484G > Tp.A162Srs997216232N/ADamagingProbably DamagingDisease causingC650.995N/A
Chr7:1172327137q31 CFTR NM_000492Splicingc.2490 + 2T > Crs1057516216N/ADisease causing0.998Likely Pathogenic
Chr8: 276685338p21 PBK NM_018492Nonsynonymousc.714G > Cp.W238Crs7744988348.265e−06DamagingProbably DamagingDisease causingC651N/A
Chr10:775102810p14 ITIH2 NM_002216Nonsynonymousc.236C > Ap.S79Yrs7491496209.884e−05DamagingProbably DamagingDisease causingC651N/A
Chr10:9923811710q24 MMS19 NM_001289403Nonsynonymousc.292C > Tp.R98Wrs290012800.0015DamagingProbably DamagingDisease causingC651N/A
Chr13:2567131113q12 PABPC3 NM_030979Frameshift deletionc.975_979delp V325 fsrs3711307688.237e−06Disease causing1N/A
Chr16:493453216p13 PPL NM_002705Nonsynonymousc.4124T > Gp.I1375SN/AN/ADamagingProbably DamagingDisease causingC651N/A
Chr16:2101174416p12 DNAH3 NM_017539Nonsynonymousc.6223C > Tp.P2075SN/AN/ADamagingProbably DamagingDisease causingC651N/A
Chr16:6724186716q22 LRRC29 NM_001004055Stopgainc.412C > Tp.R138Xrs7767217998.582e−06Disease causing0.259N/A
Chr17:4694029217q21 CALCOCO2 NM_005831Stopgainc.1266T > Ap.C422XN/AN/ADisease causing0.999N/A
Chr19: 5374040619q13 ZNF677 NM_182609Frameshift insertionc.1573dupAp.T525 fsrs5667140890.0038Disease causingN/A
Chr20: 476690220p13 RASSF2 NM_170774Nonsynonymousc.886C > Tp.R296 Wrs7564861848.238e−06DamagingProbably DamagingDisease causingC650.998N/A

aGRCh37/hg19; b PhastCons values vary between 0 and 1 and reflect the probability that each nucleotide belongs to a conserved element, based on the multiple alignment of genome sequences of 46 different species (the closer the value is to 1, the more probable the nucleotide is conserved)

Damaging variations identified in the affected individuals and selected using different functional prediction tools aGRCh37/hg19; b PhastCons values vary between 0 and 1 and reflect the probability that each nucleotide belongs to a conserved element, based on the multiple alignment of genome sequences of 46 different species (the closer the value is to 1, the more probable the nucleotide is conserved)

The family specific hypothesis

We first filtered this list of candidate genes and variants against the additional six BRCAx exome sequenced breast cancer families (BC-TN-F002_BC-TN-F007). All identified variants have been only found in BC-TN-F001, expect the PABPC3 variant that was found in other Tunisian BRCAx families. Then, we compared the list of variants identified in this family to results from other WES studies on BRCAx families. Again, variants identified in this study were only found in BC-TN-F001, suggesting a family specific predisposition to breast cancer. This family specific hypothesis has been suggested to explain the breast cancer predisposition in 4 other WES studies [8, 19–21]. We therefore performed a literature curation based on the results of the 4 family specific WES studies and the current one in order to explore this family specific predisposition to breast cancer. Additional file 1: Table S3 summarizes the list of 54 genes identified through these studies as new potential breast cancer candidate genes inherited in a family specific model. We observed that each exome sequenced family showed a specific genetic pattern with a different set of candidate genes. Only KAT6B has been reported in two different families from two separate studies [19, 20]. In a recent WES study performed on five BRCAx Egyptian families [8], four genes namely LOC100129697, NPIPB1, NBPF10 and PABPC3 have been identified in more than one family. PABPC3 is also found to be shared between three Egyptian families and the four Tunisian families sequenced in this current study.

Gene set enrichment analysis

As most of the breast cancer candidate genes identified through family specific predisposition studies lack functional evidence of their involvement in breast carcinogenesis, we pooled the 54 candidate genes identified in separate WES studies (Additional file 1: Table S3) and we performed functional annotation analysis to explore if there is any biological interaction between these genes which may strengthen their association with breast cancer (Additional file1: Table S1; Additional file 2: Figure S1). Moreover, a comprehensive gene set enrichment combined with a protein–protein interaction analysis was performed using both of EnrichR and Stringdb webtools. Results showed that MMS19 and POLK genes are involved in the DNA repair pathway (Fig. 3). The remaining genes are a part of several pathways involved in cancer etiology such as: Negative regulation of stress activated MAPK cascade (PBK and PINK1), intracellular signal transduction and regulation of autophagosome assembly (LRRK2 and PINK1) and RNA degradation (PABC3 and DDX6). NOTCH2 and ZNF677 are highly predicted to be co-expressed with PBK and LRRK2 (Fig. 3).
Fig. 3

Protein-Protein interactions of novel breast cancer candidate genes identified in four WES breast cancer studies. Genes are clustered in four pathways related to cancer etiology. The lines represent the levels of evidence as indicated in the color legend

Protein-Protein interactions of novel breast cancer candidate genes identified in four WES breast cancer studies. Genes are clustered in four pathways related to cancer etiology. The lines represent the levels of evidence as indicated in the color legend Finally, we performed a disease genes association analysis using Jensen disease database (PMID: 25484339) by clustering the candidates genes into subgroups involved in a same disease. We, therefore, examined the overlap between these sub-clusters and different cancers namely, breast, ovarian, liver and endometrial cancers (Fig. 4). The results obtained show five top significant genes involved in breast cancers that are DNHA3, KATB6, PDE4DIP, MXRA5 and NBPF10. Of note, NBPF10 is also linked to endometrial cancer and DNHA3 is the only candidate that is involved in all these cancers.
Fig. 4

Venn diagram representing the involvement of the identified breast cancer candidate genes in several cancers

Venn diagram representing the involvement of the identified breast cancer candidate genes in several cancers

Discussion

The majority of BRCAx patients with familial breast cancer lack evidence for their genetic predisposition. Multiple models have been proposed to explain the missing heritability. First, recessive and polygenic models of transmission have been proposed to resolve a part of breast cancer remaining heritability [30]. Another class of genetic variations that contributes to familial breast cancer risk includes large deletions and copy number variation [31]. Interactions between genetic variants and environmental risk factors remain an interesting model to explain breast cancer predisposition in multiple families. However, this model is largely unexplored because most of association studies that could address this model are underpowered [32]. Finally, NGS application using family-based approach represents an appropriate modality to identify additional genes with autosomal dominant mechanism of inheritance and thus explains an additional part of the breast cancer familial component [7]. In the present study, two affected sisters from a non BRCA Tunisian breast cancer family have been explored using whole exome sequencing. We excluded unaffected members in our sequenced individuals since they could be non-penetrant carriers. Thousands of heterozygous variants shared between the two sequenced family members have been identified. However, no deleterious variants have been found within known breast cancer genes. BRCA2-rs4987047 is the only rare exonic variant identified on the known breast cancer susceptibility genes. Despite its potential functional effect [33], the ClinVar predictions classify this variant as benign. Of note, among 108 exome sequenced families previously reported in 10 breast cancer WES studies, mutations on known breast cancer genes have been reported in only four families because BRCA tests are usually performed before using the whole exome sequencing approach [10-14]. Moreover, the high rate of consanguinity in the Tunisian population, may decrease the prevalence of breast cancer by decreasing the frequency of high penetrant mutations [34]. However, several common variants located on known breast cancer susceptibility genes have been identified in BC-TN-F001 (Table 2). Some of these variants have been previously reported as associated with different cancers as low penetrant polymorphisms. Indeed, two common exonic variants identified on BARD1 gene (rs2229571 and rs1048108) have been identified as low penetrant breast cancer loci in the Chinese population [35]. Moreover, PALB2-rs249954 has been reported to be associated with breast cancer risk [36], CHEK2-rs2236142 is likely associated with a decreased risk of esophageal cancer and lymph node metastasis in a Chinese population [37], RAD51C-rs12946397 is known to be associated with the risk of head and neck cancer [38] and ATM-rs664143 has been reported to be associated with lung cancer [39]. Given the fact that multiple family members are affected by other cancers such as lung carcinoma and small bowel lymphoma (Fig. 1), the involvement of these variants in this family predisposition to cancer is possible. Therefore, we cannot discard the polygenic model of breast cancer predisposition in this Tunisian breast cancer family. Despite the fact that these variants have been reported as common low penetrant variants in Caucasians, we cannot estimate their penetrance in the Tunisian population. Indeed, because of different genetic architectures and differences in allele frequencies between populations, variant penetrance may differ from one population to another and a low penetrant variant in one population may be of high penetrance in another population. Further association studies in large Tunisian cohorts are needed to assess the penetrance of these variants in the Tunisian population. After investigating known breast cancer genes, we explored other genes not yet reported as associated with the breast disease. Twelve high risk variants, predicted as deleterious by four different in silico prediction tools and showing a phenotypic relevance have been selected on the following genes: HSD3B1, CFTR, PBK, ITIH2, MMS19, PABPC3, PPL, DNAH3, LRRC29, CALCOCO2, ZNF677 and RASSF2. None of the variants identified within these genes have been listed in the ClinVar database, except for the CFTR-rs1057516216 variant that seems to be “likely pathogenic”. CFTR (Cystic Fibrosis Transmembrane Conductance Regulator) is a gene that encodes a member of the ATP-binding cassette (ABC) transporter superfamily [40]. Mutations in this gene cause cystic fibrosis, the most common lethal genetic disorder in populations of Northern European descent [41]. However, CFTR is potentially recurrently mutated by chance because of its large size and its involvement in breast carcinogenesis is controversial, thus, it cannot be considered as a potential breast cancer candidate gene. Indeed, it has been proposed that a CFTR mutation may protect against breast cancer [42], however, in another study that correlated the expression level of CFTR and breast cancer histological grading, it was shown that high serum levels of CFTR were associated with a high grade and poorly differentiated tumors [43]. When comparing the identified set of genes with other genes reported in other breast cancer WES studies, we showed that each exome sequenced family has a specific genetic pattern with a different set of candidate genes. Except PABPC3, genes identified in this Tunisian breast cancer family have not been reported in other breast cancer exome sequenced families, suggesting a family specific genetic predisposition to the disease. PABPC3 was shared between four Tunisian families and three Egyptian whole exome sequenced families. Moreover, LOC100129697, NPIPB1, NBPF10 have been found in three whole exome sequenced Egyptian families [8]. These genes shared between families from a particular ethnic group (Tunisians and Egyptians) suggest that in populations with high consanguinity and endogamy rates, the ethnic specific breast cancer predisposition model is also plausible. PABPC3 acts in a cytoplasmic regulatory processes of mRNA metabolism [44]. The involvement of PABPC3 in the RNA degradation pathway has been confirmed by the analysis of the biological process and protein–protein networks that we performed in this study (Additional file 2: Figure S1, Fig. 3). We also showed that the remaining genes are also linked to interesting new pathways such as: negative regulation of stress activated MAPK cascade and intracellular signal transduction and regulation of autophagosome assembly. Only two genes (MMS19 and POLK) are involved in DNA repair pathway, considered as the traditional pathway in which breast cancer genes are involved [45]. MMS19 acts as an adapter between early-acting cytosolic iron-sulfur assembly components and a subset of cellular target iron-sulfur proteins such as ERCC2/XPD, FANCJ and RTEL1, thereby playing a key role in nucleotide excision repair (NER) and RNA polymerase II (POL II) transcription [46]. Of note, the human MMS19 also interacts with estrogen receptors in a ligand-independent manner [47]. POLK is a member of Y family DNA polymerases, and functions by repairing the replication fork passing through DNA lesions [48]. Recently, POLK have been reported as a new ovarian cancer susceptibility gene [49]. Additional functional annotation analysis using the Jensen disease library, showed that the top significant genes involved in breast cancer are KATB6, PDE4DIP, MXRA5, DNHA3 and NBPF10. KAT6B—a histone acetyl transferase involved in DNA replication, gene expression and regulation, and epigenetic modification of chromosomal structure [50] has been reported as associated with breast cancer in two separate WES studies [19, 20]. Consistently with our results, it has been reported that DNHA3 is involved in different cancers including breast cancer [51-53]. DNHA3 (Dynein Axonemal Heavy Chain 3) gene belongs to the dynein family, whose members encode large proteins that are constituents of the microtubule-associated motor protein complex [54]. Among its related pathways we denotes the respiratory electron transport, ATP synthesis by chemiosmosis coupling, and heat production by uncoupling proteins. However, little evidence exist on the roles of PDE4DIP, MXRA5, and NBPF10 in breast carcinogenesis. In summary, these WES studies results and the functional annotation performed in the present study, altogether showed that MMS19, DNHA3, POLK and KATB6 are interesting breast cancer candidate genes. Variants located on these genes seem to be inherited in a family specific model. PABPC3 seems to be another interesting breast cancer candidate gene that may be associated with breast cancer in an ethnic specific manner as it has been reported in another North African population [8]. Although NGS represents an unprecedented approach to decipher the genetic predisposition to different hereditary diseases, it comes with numerous challenges. Indeed, the different lists of genes that resulted from different breast cancer WES studies may be explained in part by the different pipelines and bioinformatics tools used to analyze these data. In addition, NGS data users apply different filters to help prioritize variants such as the in silico prediction tools that may mis-classify some variants and thus causes erroneous inclusion or exclusion of some variations. Therefore, in order to assess how much the family specific hypothesis is plausible, we suggest to pool raw data from all breast cancer whole exome sequenced families and re-analyze the resulting data using a common and consensual strategy. Efforts made by the COMPLEXO group in identifying the missing breast cancer heritability via Next generation collaborations represent an excellent initiative to overcome these NGS data analysis challenges [55].

Conclusions

In the present study we reported a list of new breast cancer candidate genes that seem to be inherited in a family specific and ethnic specific models. Further WES studies on BRCAx Tunisian families and further in vitro or in vivo functional assays are needed to understand their effects and to confirm their association with breast cancer risk. For a better interpretation of NGS data, the scientific community should first overcome NGS data analysis challenges in order to generate more meaningful NGS data and more clinically actionable variants. Additional file 1: Table S1. Gene set enrichment analysis. Table S2. Summary of SNPs and Indels identified in the 7 BRCAx sequenced Tunisian breast cancer families. Table S3. Putative predisposition family-specific genes in several WES studies using the family-based approach. Additional file 2: Figure S1. Biological networks and Enriched gene ontology pathways identified by the functional annotation analysis. Enrichment network of the shared candidate disease genes and their upstream regulator based on biological processes using ClueGO Cytoscape plugin. Hyper-geometric (right-handed) enrichment distribution tests, with a p-value significance level of ≤ 0.05, followed by the Bonferroni adjustment for the terms and leading term groups were selected based on the highest significance. The node size and deeper color indicates greater significance of the enrichment.
  55 in total

1.  Germline RECQL mutations are associated with breast cancer susceptibility.

Authors:  Cezary Cybulski; Jian Carrot-Zhang; Wojciech Kluźniak; Barbara Rivera; Aniruddh Kashyap; Dominika Wokołorczyk; Sylvie Giroux; Javad Nadaf; Nancy Hamel; Shiyu Zhang; Tomasz Huzarski; Jacek Gronwald; Tomasz Byrski; Marek Szwiec; Anna Jakubowska; Helena Rudnicka; Marcin Lener; Bartłomiej Masojć; Patrica N Tonin; Francois Rousseau; Bohdan Górski; Tadeusz Dębniak; Jacek Majewski; Jan Lubiński; William D Foulkes; Steven A Narod; Mohammad R Akbari
Journal:  Nat Genet       Date:  2015-04-27       Impact factor: 38.330

2.  Can unknown predisposition in familial breast cancer be family-specific?

Authors:  Henry Lynch; Hongxiu Wen; Yeong C Kim; Carrie Snyder; Yulia Kinarsky; Pei Xian Chen; Fengxia Xiao; David Goldgar; Kenneth H Cowan; San Ming Wang
Journal:  Breast J       Date:  2013-06-26       Impact factor: 2.431

3.  Cystic fibrosis transmembrane conductance regulator gene mutation and lung cancer risk.

Authors:  Yafei Li; Zhifu Sun; Yanhong Wu; Dusica Babovic-Vuksanovic; Yan Li; Julie M Cunningham; Vernon S Pankratz; Ping Yang
Journal:  Lung Cancer       Date:  2010-02-08       Impact factor: 5.705

Review 4.  Identification of novel hereditary cancer genes by whole exome sequencing.

Authors:  Anna P Sokolenko; Evgeny N Suspitsin; Ekatherina Sh Kuligina; Ilya V Bizin; Dmitrij Frishman; Evgeny N Imyanitov
Journal:  Cancer Lett       Date:  2015-09-30       Impact factor: 8.679

5.  Rare mutations in XRCC2 increase the risk of breast cancer.

Authors:  D J Park; F Lesueur; T Nguyen-Dumont; M Pertesi; F Odefrey; F Hammet; S L Neuhausen; E M John; I L Andrulis; M B Terry; M Daly; S Buys; F Le Calvez-Kelm; A Lonie; B J Pope; H Tsimiklis; C Voegele; F M Hilbers; N Hoogerbrugge; A Barroso; A Osorio; G G Giles; P Devilee; J Benitez; J L Hopper; S V Tavtigian; D E Goldgar; M C Southey
Journal:  Am J Hum Genet       Date:  2012-03-29       Impact factor: 11.025

6.  The poly(A)-binding protein genes, EPAB, PABPC1, and PABPC3 are differentially expressed in infertile men with non-obstructive azoospermia.

Authors:  Saffet Ozturk; Berna Sozen; Fatma Uysal; Ibrahim C Bassorgun; Mustafa F Usta; Gokhan Akkoyunlu; Necdet Demir
Journal:  J Assist Reprod Genet       Date:  2016-02-03       Impact factor: 3.412

7.  COMPLEXO: identifying the missing heritability of breast cancer via next generation collaboration.

Authors:  Melissa C Southey; Daniel J Park; Tu Nguyen-Dumont; Ian Campbell; Ella Thompson; Alison H Trainer; Georgia Chenevix-Trench; Jacques Simard; Martine Dumont; Penny Soucy; Mads Thomassen; Lars Jønson; Inge S Pedersen; Thomas Vo Hansen; Heli Nevanlinna; Sofia Khan; Olga Sinilnikova; Sylvie Mazoyer; Fabienne Lesueur; Francesca Damiola; Rita Schmutzler; Alfons Meindl; Eric Hahnen; Michael R Dufault; Tl Chris Chan; Ava Kwong; Rosa Barkardóttir; Paolo Radice; Paolo Peterlongo; Peter Devilee; Florentine Hilbers; Javier Benitez; Anders Kvist; Therese Törngren; Douglas Easton; David Hunter; Sara Lindstrom; Peter Kraft; Wei Zheng; Yu-Tang Gao; Jirong Long; Susan Ramus; Bing-Jian Feng; Jeffrey N Weitzel; Katherine Nathanson; Kenneth Offit; Vijai Joseph; Mark Robson; Kasmintan Schrader; San Wang; Yeong C Kim; Henry Lynch; Carrie Snyder; Sean Tavtigian; Susan Neuhausen; Fergus J Couch; David E Goldgar
Journal:  Breast Cancer Res       Date:  2013-06-21       Impact factor: 6.466

Review 8.  Common breast cancer risk variants in the post-COGS era: a comprehensive review.

Authors:  Kara N Maxwell; Katherine L Nathanson
Journal:  Breast Cancer Res       Date:  2013-12-20       Impact factor: 6.466

9.  A global reference for human genetic variation.

Authors:  Adam Auton; Lisa D Brooks; Richard M Durbin; Erik P Garrison; Hyun Min Kang; Jan O Korbel; Jonathan L Marchini; Shane McCarthy; Gil A McVean; Gonçalo R Abecasis
Journal:  Nature       Date:  2015-10-01       Impact factor: 49.962

10.  Unique Features of Germline Variation in Five Egyptian Familial Breast Cancer Families Revealed by Exome Sequencing.

Authors:  Yeong C Kim; Amr S Soliman; Jian Cui; Mohamed Ramadan; Ahmed Hablas; Mohamed Abouelhoda; Nehal Hussien; Ola Ahmed; Abdel-Rahman Nabawy Zekri; Ibrahim A Seifeldin; San Ming Wang
Journal:  PLoS One       Date:  2017-01-11       Impact factor: 3.240

View more
  18 in total

1.  The third international hackathon for applying insights into large-scale genomic composition to use cases in a wide range of organisms.

Authors:  Kimberly Walker; Divya Kalra; Rebecca Lowdon; Guangyi Chen; David Molik; Daniela C Soto; Fawaz Dabbaghie; Ahmad Al Khleifat; Medhat Mahmoud; Luis F Paulin; Muhammad Sohail Raza; Susanne P Pfeifer; Daniel Paiva Agustinho; Elbay Aliyev; Pavel Avdeyev; Enrico R Barrozo; Sairam Behera; Kimberley Billingsley; Li Chuin Chong; Deepak Choubey; Wouter De Coster; Yilei Fu; Alejandro R Gener; Timothy Hefferon; David Morgan Henke; Wolfram Höps; Anastasia Illarionova; Michael D Jochum; Maria Jose; Rupesh K Kesharwani; Sree Rohit Raj Kolora; Jędrzej Kubica; Priya Lakra; Damaris Lattimer; Chia-Sin Liew; Bai-Wei Lo; Chunhsuan Lo; Anneri Lötter; Sina Majidian; Suresh Kumar Mendem; Rajarshi Mondal; Hiroko Ohmiya; Nasrin Parvin; Carolina Peralta; Chi-Lam Poon; Ramanandan Prabhakaran; Marie Saitou; Aditi Sammi; Philippe Sanio; Nicolae Sapoval; Najeeb Syed; Todd Treangen; Gaojianyong Wang; Tiancheng Xu; Jianzhi Yang; Shangzhe Zhang; Weiyu Zhou; Fritz J Sedlazeck; Ben Busby
Journal:  F1000Res       Date:  2022-05-16

Review 2.  A Systematic Literature Review of Whole Exome and Genome Sequencing Population Studies of Genetic Susceptibility to Cancer.

Authors:  Alisa M Goldstein; Elizabeth M Gillanders; Melissa Rotunno; Rolando Barajas; Mindy Clyne; Elise Hoover; Naoko I Simonds; Tram Kim Lam; Leah E Mechanic
Journal:  Cancer Epidemiol Biomarkers Prev       Date:  2020-05-28       Impact factor: 4.254

Review 3.  The Effects of Genetic and Epigenetic Alterations of BARD1 on the Development of Non-Breast and Non-Gynecological Cancers.

Authors:  Andrea K Watters; Emily S Seltzer; Danny MacKenzie; Melody Young; Jonathan Muratori; Rama Hussein; Andrej M Sodoma; Julie To; Manrose Singh; Dong Zhang
Journal:  Genes (Basel)       Date:  2020-07-21       Impact factor: 4.096

Review 4.  Applications of Next Generation Sequencing to the Analysis of Familial Breast/Ovarian Cancer.

Authors:  Veronica Zelli; Chiara Compagnoni; Katia Cannita; Roberta Capelli; Carlo Capalbo; Mauro Di Vito Nolfi; Edoardo Alesse; Francesca Zazzeroni; Alessandra Tessitore
Journal:  High Throughput       Date:  2020-01-10

5.  Germline copy number variations in BRCA1/2 negative families: Role in the molecular etiology of hereditary breast cancer in Tunisia.

Authors:  Maroua Boujemaa; Yosr Hamdi; Nesrine Mejri; Lilia Romdhane; Kais Ghedira; Hanen Bouaziz; Houda El Benna; Soumaya Labidi; Hamza Dallali; Olfa Jaidane; Sonia Ben Nasr; Abderrazek Haddaoui; Khaled Rahal; Sonia Abdelhak; Hamouda Boussen; Mohamed Samir Boubaker
Journal:  PLoS One       Date:  2021-01-27       Impact factor: 3.240

6.  Genetic predisposition and prediction protocol for epithelial neoplasms in disease-free individuals: A systematic review.

Authors:  J Gowthami; N Gururaj; V Mahalakshmi; R Sathya; T R Sabarinath; Daffney Mano Doss
Journal:  J Oral Maxillofac Pathol       Date:  2020-09-09

Review 7.  A Review of Cancer Genetics and Genomics Studies in Africa.

Authors:  Solomon O Rotimi; Oluwakemi A Rotimi; Bodour Salhia
Journal:  Front Oncol       Date:  2021-02-15       Impact factor: 5.738

8.  Comprehensive Cohort Analysis of Mutational Spectrum in Early Onset Breast Cancer Patients.

Authors:  Mohit K Midha; Yu-Feng Huang; Hsiao-Hsiang Yang; Tan-Chi Fan; Nai-Chuan Chang; Tzu-Han Chen; Yu-Tai Wang; Wen-Hung Kuo; King-Jen Chang; Chen-Yang Shen; Alice L Yu; Kuo-Ping Chiu; Chien-Jen Chen
Journal:  Cancers (Basel)       Date:  2020-07-28       Impact factor: 6.639

9.  BRCA mutation screening and patterns among high-risk Lebanese subjects.

Authors:  Chantal Farra; Christelle Dagher; Rebecca Badra; Miza Salim Hammoud; Raafat Alameddine; Johnny Awwad; Muhieddine Seoud; Jaber Abbas; Fouad Boulos; Nagi El Saghir; Deborah Mukherji
Journal:  Hered Cancer Clin Pract       Date:  2019-01-18       Impact factor: 2.857

10.  Expanding cancer predisposition genes with ultra-rare cancer-exclusive human variations.

Authors:  Roni Rasnic; Nathan Linial; Michal Linial
Journal:  Sci Rep       Date:  2020-08-10       Impact factor: 4.379

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.