Arkarachai Fungtammasan, Erin Walsh, Francesca Chiaromonte, Kristin A Eckert, Kateryna D Makova.
Abstract
Chromosomal common fragile sites (CFSs) are unstable genomic regions that break under replication stress and are involved in structural variation. They frequently are sites of chromosomal rearrangements in cancer and of viral integration. However, CFSs are undercharacterized at the molecular level and thus difficult to predict computationally. Newly available genome-wide profiling studies provide us with an unprecedented opportunity to associate CFSs with features of their local genomic contexts. Here, we contrasted the genomic landscape of cytogenetically defined aphidicolin-induced CFSs (aCFSs) to that of nonfragile sites, using multiple logistic regression. We also analyzed aCFS breakage frequencies as a function of their genomic landscape, using standard multiple regression. We show that local genomic features are effective predictors both of regions harboring aCFSs (explaining ∼77% of the deviance in logistic regression models) and of aCFS breakage frequencies (explaining ∼45% of the variance in standard regression models). In our optimal models (having highest explanatory power), aCFSs are predominantly located in G-negative chromosomal bands and away from centromeres, are enriched in Alu repeats, and have high DNA flexibility. In alternative models, CpG island density, transcription start site density, H3K4me1 coverage, and mononucleotide microsatellite coverage are significant predictors. Also, aCFSs have high fragility when colocated with evolutionarily conserved chromosomal breakpoints. Our models are predictive of the fragility of aCFSs mapped at a higher resolution. Importantly, the genomic features we identified here as significant predictors of fragility allow us to draw valuable inferences on the molecular mechanisms underlying aCFSs.
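The "deviance explained" figure quoted for the logistic regression models can be illustrated with a small sketch. It assumes one common definition, 1 minus the ratio of model deviance to null deviance, where the null model predicts the overall fraction of fragile regions; the abstract does not state the exact formula the authors used. The function names, labels, and fitted probabilities below are hypothetical and are not data from the study.

```python
import math

def binomial_deviance(y, p):
    """Deviance of a binary-outcome model: -2 * log-likelihood."""
    eps = 1e-12  # guard against log(0)
    total = 0.0
    for yi, pi in zip(y, p):
        pi = min(max(pi, eps), 1.0 - eps)
        total += yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
    return -2.0 * total

def deviance_explained(y, p):
    """1 - D_model / D_null, where the null model predicts the mean of y."""
    base_rate = sum(y) / len(y)
    null_dev = binomial_deviance(y, [base_rate] * len(y))
    return 1.0 - binomial_deviance(y, p) / null_dev

# Hypothetical labels (1 = fragile, 0 = nonfragile) and fitted probabilities.
y = [1, 1, 1, 0, 0, 0]
p = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
print(round(deviance_explained(y, p), 3))
```

A model that only reproduces the base rate scores 0 under this definition, and a model that assigns probability close to 1 to every observed outcome approaches 1.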
Year: 2012 PMID: 22456607 PMCID: PMC3371707 DOI: 10.1101/gr.134395.111
Source DB: PubMed Journal: Genome Res ISSN: 1088-9051 Impact factor: 9.043
Figure 1. Locations of 76 APH-induced common fragile sites (aCFSs) and 131 nonfragile control regions (NFRs) used in this study. (Red) aCFSs (pink is used to differentiate among three fragile sites on chromosome 1). (Blue) NFRs. Gray regions were excluded from the analysis because they are either rare fragile sites or background breakage regions. (Black regions) Centromeres.
Figure 2. Statistical workflow. (A) Potential predictor selection (prescreening). (B) Regression analysis (box “Regression Analysis” in A).
Figure 3. Hierarchical clustering of predictors using their Spearman correlation coefficients computed across all 73 aCFSs + 124 NFRs. (Y-axis) 1−|correlation coefficient|. The lower predictors merge in the dendrogram, the higher their correlation. Predictors in black boxes were selected as potential predictors to run our regression analysis (which includes further predictor selection steps).
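The dissimilarity on the Y-axis of Figure 3, 1−|Spearman correlation coefficient|, can be sketched for a pair of predictors as follows. This is a minimal illustration using only the standard library. The helper names and the toy predictor values are hypothetical, and the clustering step itself is omitted because the caption does not state which linkage method was used.

```python
import math

def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            r[order[k]] = mean_rank
        i = j + 1
    return r

def pearson(a, b):
    """Pearson correlation; assumes neither input is constant."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def spearman_dissimilarity(a, b):
    """1 - |rho|: 0 for perfectly (anti)correlated predictors."""
    return 1.0 - abs(pearson(ranks(a), ranks(b)))

# Hypothetical predictor values measured across five regions.
feature_a = [1, 2, 3, 4, 5]
feature_b = [10, 8, 6, 4, 1]   # perfectly anticorrelated with feature_a
feature_c = [2, 1, 4, 3, 5]    # mostly, but not perfectly, concordant
print(spearman_dissimilarity(feature_a, feature_b))
print(spearman_dissimilarity(feature_a, feature_c))
```

Because the absolute value is taken, strongly anticorrelated predictors are treated as being as redundant as strongly correlated ones, which matches the caption's note that predictors merging lower in the dendrogram are more highly correlated.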
The 19 genomic features (after prescreening) used as potential predictors in regression analyses
Optimal multiple logistic regression model contrasting autosomal aCFSs with NFRs
Alternative multiple logistic regression models contrasting autosomal aCFSs with NFRs
Optimal multiple standard regression model for breakage frequency of autosomal aCFSs