Lucas Lochovsky, Jing Zhang, Mark Gerstein.
Abstract
Summary: Identifying genomic regions with higher-than-expected mutation counts is useful for cancer driver detection. Previous parametric approaches require numerous cell-type-matched covariates for accurate background mutation rate (BMR) estimation, which is impractical in many situations. Non-parametric, permutation-based approaches avoid this issue but usually incur considerable compute-time cost. Hence, we introduce the Mutations Overburdening Annotations Tool (MOAT), a non-parametric scheme that makes no assumptions about the mutation process except that the BMR changes smoothly with genomic features. MOAT randomly permutes single-nucleotide variants, or target regions, on a relatively large scale to provide robust burden analysis. Furthermore, we show how permutations can be performed efficiently using graphics processing unit acceleration, speeding up the calculation by a factor of ∼250.
Availability and implementation: MOAT is available at moat.gersteinlab.org.
Contact: mark@gersteinlab.org.
Supplementary information: Supplementary data are available at Bioinformatics online.
Year: 2018 PMID: 29121169 PMCID: PMC5860157 DOI: 10.1093/bioinformatics/btx700
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. (a) MOAT-a shuffles each annotation to a new location within the local genome context bounded by user-defined parameters d_min and d_max, producing n permutations. (b) In MOAT-v, the whole genome is divided into bins of user-defined width W, within which variants are moved to new coordinates, thereby preserving the local mutation context. As with MOAT-a, MOAT-v produces n permutations. (c) MOAT-s bins the entire genome, whereupon it calculates the covariate values for each bin. The program then clusters bins with similar covariate values, represented here as bins with the same color (we refer to these clusters as equivalence classes). The input variants that fall within each cluster are then permuted to new locations chosen from the bins within the same cluster, honoring trinucleotide context preservation if requested.
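
The variant-permutation idea in panel (b) can be illustrated with a minimal sketch. This is not MOAT's implementation: the function names, the 0-based half-open coordinates, the single-chromosome simplification, and the add-one empirical p-value are all assumptions made for illustration, and trinucleotide context is ignored. Each variant is moved to a uniformly random coordinate inside its own bin of width W, and the observed count in a target region is compared with the counts obtained over n permutations.

```python
import random
from bisect import bisect_left


def permute_variants_within_bins(positions, bin_width, rng):
    """Move each variant to a random coordinate inside its own bin.

    Hypothetical helper: bins are [k * bin_width, (k + 1) * bin_width),
    so the per-bin variant count is preserved by construction.
    """
    permuted = []
    for pos in positions:
        bin_start = (pos // bin_width) * bin_width
        permuted.append(bin_start + rng.randrange(bin_width))
    return sorted(permuted)


def count_in_region(sorted_positions, start, end):
    """Count variants with start <= position < end (input must be sorted)."""
    return bisect_left(sorted_positions, end) - bisect_left(sorted_positions, start)


def empirical_p_value(positions, region, bin_width, n_permutations, seed=0):
    """Return (observed count, empirical p-value) for one target region.

    The add-one correction is an assumption of this sketch; it keeps the
    estimate from being exactly zero with a finite number of permutations.
    """
    rng = random.Random(seed)
    start, end = region
    observed = count_in_region(sorted(positions), start, end)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        shuffled = permute_variants_within_bins(positions, bin_width, rng)
        if count_in_region(shuffled, start, end) >= observed:
            at_least_as_extreme += 1
    return observed, (at_least_as_extreme + 1) / (n_permutations + 1)


# Toy input: ten variants packed into a 10 bp region of a 10 kb bin.
variants = list(range(1000, 1010))
observed, p_value = empirical_p_value(
    variants, region=(1000, 1010), bin_width=10_000, n_permutations=99
)
```

Because every variant stays inside its original bin, the local mutation density is held fixed while the exact coordinates vary, which is the property the caption describes as preserving the local mutation context. Each permutation is independent of the others, which is what makes this kind of workload a natural candidate for parallel execution.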