Jan Benada1, Jayashree Vijay Thatte1, Muthiah Bose1,2, Savvas Kinalis2, Bent Ejlertsen3, Finn Cilius Nielsen2, Claus Storgaard Sørensen1, Maria Rossing4,5.
Abstract
PURPOSE: Decades of research have identified multiple genetic variants associated with breast cancer etiology. However, there is no database that archives breast cancer genes and variants responsible for predisposition. We set out to build a dynamic repository of curated breast cancer genes.
Keywords: Breast cancer; Common-polygenic variants; DNA repair pathways; Database; Genetic predisposition; Rare-monogenic variants
Year: 2021 PMID: 34755241 PMCID: PMC8763822 DOI: 10.1007/s10549-021-06441-y
Source DB: PubMed Journal: Breast Cancer Res Treat ISSN: 0167-6806 Impact factor: 4.872
Fig. 1 Flow chart outlining multiple steps involved in the database design such as literature search, data extraction, data annotation, and data harmonization
Fig. 2 (a) Pie chart outlining the distribution of 652 breast cancer-associated loci across the genome. (b) Pie chart outlining the distribution of variants that either predispose to breast cancer (Disease; OR > 1) or confer protection against breast cancer (Protective; OR < 1)
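The Disease/Protective split in Fig. 2b follows directly from the odds ratio (OR) threshold stated in the caption. A minimal sketch of that rule is given below; the function name, the variant names, and the example odds ratios are hypothetical and are not taken from the database, and the handling of OR = 1 is an assumption because the caption does not cover that case.

```python
def classify_by_odds_ratio(odds_ratio: float) -> str:
    """Label a variant from its odds ratio, following the Fig. 2b legend."""
    if odds_ratio <= 0:
        raise ValueError("an odds ratio must be positive")
    if odds_ratio > 1:
        return "Disease"      # variant predisposes to breast cancer
    if odds_ratio < 1:
        return "Protective"   # variant confers protection
    # OR == 1 is not addressed by the caption; this label is an assumption.
    return "Neutral"


# Hypothetical odds ratios for illustration only, not values from the database.
example_odds_ratios = {"variant_A": 1.25, "variant_B": 0.5, "variant_C": 1.0}
labels = {
    name: classify_by_odds_ratio(value)
    for name, value in example_odds_ratios.items()
}
```

Counting the resulting labels would give the slice sizes of a pie chart like the one described.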
Fig. 3 Chromosomal ideogram illustrating the distribution of 652 breast cancer-associated loci across the chromosomes. The ideogram was constructed using the PhenoGram software tool [15], with each dot representing one gene or variant
Fig. 4 (a) Scatter plot illustrating the number of breast cancer-associated loci on each chromosome relative to the length of that chromosome. The length of each chromosome was retrieved from Ensembl under Chromosome Statistics. (b) Scatter plot illustrating the number of breast cancer-associated loci relative to the total number of genes present on each chromosome. The total number of genes for each chromosome was calculated from Ensembl (Chromosome Statistics) by adding the number of coding genes, non-coding genes, and pseudogenes. The thick continuous line depicts the trendline for the number of breast cancer-associated loci on each chromosome compared to its length (a) or to the total number of genes on that chromosome (b). In both panels, the thin dotted line is an imaginary trendline illustrating a perfect positive correlation
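The caption does not state how the trendline in Fig. 4 was fitted. As a rough sketch, assuming an ordinary least-squares line, the slope and intercept for loci count against chromosome length could be computed as follows. The function name and the toy numbers are hypothetical and do not come from Ensembl or from the database.

```python
def fit_trendline(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line y = slope*x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy values for illustration only (not real chromosome lengths or loci counts).
chromosome_length_mb = [50, 100, 150, 200]
loci_count = [10, 20, 30, 40]
slope, intercept = fit_trendline(chromosome_length_mb, loci_count)
```

The same function applies to panel b by passing the total gene count per chromosome as the x values instead of the chromosome length.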
Fig. 5 Flow chart outlining the criteria used to annotate and collate the breast cancer genes that contain rare-monogenic variants. Out of the 459 breast cancer genes, our manual curation effort identified 39 genes that contain disease-causing monogenic variants
Fig. 6 Protein network analysis of the 459 breast cancer genes revealed a major cluster enriched in DNA repair pathways. Breast cancer genes containing rare-monogenic variants (red dots) were mainly present within this cluster. The protein–protein interaction network was constructed using the STRING database [16] and graphically adjusted in Cytoscape [17]