Empirical null estimation using zero-inflated discrete mixture distributions and its application to protein domain data.

Iris Ivy M Gauran¹,², Junyong Park¹, Johan Lim³, DoHwan Park¹, John Zylstra¹, Thomas Peterson⁴, Maricel Kann⁴, John L Spouge⁵.

Abstract

In recent mutation studies, analyses based on protein domain positions are gaining popularity over gene-centric approaches, since the latter are limited in how well they capture the functional context that the position of a mutation provides. This presents a large-scale simultaneous inference problem, with hundreds of hypothesis tests to consider at the same time. This article aims to select significant mutation counts while controlling a given level of Type I error via False Discovery Rate (FDR) procedures. One main assumption is that the mutation counts follow a zero-inflated model, which accounts for both the true zeros in the count model and the excess zeros. The class of models considered is the Zero-inflated Generalized Poisson (ZIGP) distribution. Furthermore, we assume that there exists a cut-off value such that counts smaller than this value are generated from the null distribution. We present several data-dependent methods to determine the cut-off value. We also consider a two-stage procedure based on a screening process, so that mutation counts exceeding a certain value are considered significant. Simulated and protein domain data sets are used to illustrate this procedure in estimating the empirical null using a mixture of discrete distributions. Overall, while maintaining control of the FDR, the proposed two-stage testing procedure has superior empirical power.
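To make the model class concrete, the following is a minimal Python sketch of a ZIGP probability mass function and a plug-in local false discovery rate. It is an illustration under stated assumptions, not the authors' implementation: the generalized Poisson part uses Consul's common parametrization (theta > 0, 0 <= lam < 1), which may differ from the parametrization in the paper, and the names gp_pmf, zigp_pmf, local_fdr, phi, theta, lam and pi0 are hypothetical. The abstract does not say how the null parameters or the null proportion are estimated, so they are simply passed in here.

```python
import math


def gp_pmf(y, theta, lam):
    """Generalized Poisson pmf in Consul's form (assumed parametrization).

    P(Y = y) = theta * (theta + lam*y)**(y-1) * exp(-theta - lam*y) / y!
    for y = 0, 1, 2, ..., with theta > 0 and 0 <= lam < 1.
    """
    if y < 0:
        return 0.0
    # Log scale avoids overflow in the power and factorial terms.
    log_p = (math.log(theta)
             + (y - 1) * math.log(theta + lam * y)
             - theta - lam * y
             - math.lgamma(y + 1))
    return math.exp(log_p)


def zigp_pmf(y, phi, theta, lam):
    """Zero-inflated GP: extra point mass phi at zero, GP otherwise."""
    p = (1.0 - phi) * gp_pmf(y, theta, lam)
    return phi + p if y == 0 else p


def local_fdr(y, counts, pi0, phi, theta, lam):
    """Plug-in local fdr: pi0 * f0(y) / f_hat(y), capped at 1.

    f0 is the ZIGP null with given (assumed known) parameters and
    f_hat is the empirical frequency of y in the observed counts.
    """
    f_hat = counts.count(y) / len(counts)
    if f_hat == 0.0:
        raise ValueError("y does not occur in counts")
    return min(1.0, pi0 * zigp_pmf(y, phi, theta, lam) / f_hat)
```

In this sketch a small local_fdr value for a count suggests that the count is unlikely under the assumed null; with lam = 0 the generalized Poisson part reduces to an ordinary Poisson distribution.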

Keywords: Local false discovery rate; Protein domain; Zero-inflated generalized Poisson

Year: 2017 PMID: 28940296 PMCID: PMC5862774 DOI: 10.1111/biom.12779

Source DB: PubMed Journal: Biometrics ISSN: 0006-341X Impact factor: 2.571
