InfiniumPurify: An R package for estimating and accounting for tumor purity in cancer methylation research.

Yufang Qin1,2, Hao Feng3, Ming Chen1,2, Hao Wu3, Xiaoqi Zheng4.   

Abstract

The proportion of cancer cells in a tumor sample, known as tumor purity, is an intrinsic property of the sample and can strongly influence a variety of analyses, including differential methylation, subclonal deconvolution and subtype clustering. InfiniumPurify is an integrated R package for estimating and accounting for tumor purity based on DNA methylation Infinium 450 k array data. It has three main functions, getPurity, InfiniumDMC and InfiniumClust, which respectively estimate tumor purity, perform differential methylation analysis, and cluster tumor samples while accounting for estimated or user-provided tumor purities. The InfiniumPurify package provides a comprehensive analysis of tumor purity in cancer methylation research.

Keywords:  Cancer subtype classification; DNA methylation; Differential methylation analysis; InfiniumPurify; Tumor purity

Year:  2018        PMID: 30258934      PMCID: PMC6147081          DOI: 10.1016/j.gendis.2018.02.003

Source DB:  PubMed          Journal:  Genes Dis        ISSN: 2352-3042


Availability

The R package InfiniumPurify is available from http://cran.r-project.org/web/packages.

Introduction

Tumor purity, defined as the percentage of cancer cells in a solid tumor sample, is an important characteristic that cannot be ignored in cancer genomics or epigenomics data analysis.1, 2, 3, 4 Because tumor tissue is contaminated with normal cells, high-throughput data obtained from tumor samples are mixed signals from cancer and normal cells. The purity effect must therefore be accounted for in data analyses such as sample clustering/classification and differential expression/methylation.5, 6 To date, a few methods and software tools are available for tumor purity estimation, mainly based on gene expression or copy number variation data. A comprehensive review has been provided elsewhere. Here we present InfiniumPurify, a comprehensive R package to estimate and account for tumor purity in cancer methylation studies based on Infinium 450 k array data. It includes the following functions: getPurity, which estimates tumor purities from beta value matrices of tumor and normal samples; InfiniumDMC, which performs differential methylation analysis accounting for tumor purities such as those estimated by getPurity; InfiniumPurify, which infers purified tumor methylomes from tumor samples, normal samples and purities; and InfiniumClust, which classifies tumor samples into methylation subtypes while correcting for tumor purity.

Methods

InfiniumPurify takes beta value matrices of tumor and normal samples as input, which can be obtained from ChAMP, DMRcate, minfi or related R packages. Note that when starting from the raw intensity data (IDAT files) of the Infinium 450 k array, a normalization step is essential for data preparation. Specifically, two types of probes (type-I and type-II) are used on the Infinium 450 k chip, and they may have different beta value distributions. Moreover, tumor samples show a globally different methylation pattern from normal samples, i.e., hyper-methylation in promoter regions and hypo-methylation across the genome as a whole. We therefore recommend functional normalization for data preparation.

getPurity: estimate tumor purity from DNA methylation Infinium 450 k array data

The function getPurity estimates the tumor purity of each tumor sample. It takes a methylation beta value matrix of tumor (and optionally normal) samples and the tumor type as inputs, and outputs a vector of tumor purities for all tumor samples. If normal data are available and the numbers of tumor and normal samples are both sufficiently large (≥20), the function first identifies a set of informative differentially methylated CpG sites (iDMCs) by comparing the methylation differences between tumor and normal samples and the variation within tumor samples. The methylation levels of the selected iDMCs are then used to estimate the purity of each tumor sample by Gaussian kernel density estimation. When normal samples are unavailable, or when tumor/normal samples are too few to yield reliable iDMCs, getPurity loads pre-selected iDMCs identified from public TCGA data to infer tumor purities. In that case, the tumor type must be specified by the user. As an application, we calculated tumor purities for all TCGA tumor samples with methylation 450 k array data; the results are available from https://doi.org/10.5281/zenodo.253193. These estimates correlate well with purities estimated by other tools.13, 14
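
The kernel density step can be illustrated with a minimal sketch in Python (this is not the package's R code). It assumes, as a simplification not stated above, that at an iDMC hyper-methylated in tumor the observed beta value roughly tracks purity, while at a hypo-methylated iDMC one minus the beta value does; purity is then taken as the location of the highest Gaussian kernel density peak. The names `kde_mode` and `estimate_purity`, the bandwidth rule and the grid size are hypothetical choices made for illustration.

```python
import math

def kde_mode(values, grid_size=401):
    """Location on [0, 1] of the highest Gaussian kernel density peak."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    # Rule-of-thumb bandwidth (an assumption); the floor avoids a zero bandwidth
    h = max(1.06 * sd * n ** (-0.2), 1e-3)
    best_x, best_density = 0.0, -1.0
    for g in range(grid_size):
        x = g / (grid_size - 1)
        # Unnormalised density: constant factors do not move the peak
        density = sum(math.exp(-0.5 * ((x - v) / h) ** 2) for v in values)
        if density > best_density:
            best_x, best_density = x, density
    return best_x

def estimate_purity(betas, hyper_in_tumor):
    """betas: one tumor sample's beta values at the iDMCs.
    hyper_in_tumor: True where the iDMC is hyper-methylated in tumor."""
    transformed = [b if hyper else 1.0 - b
                   for b, hyper in zip(betas, hyper_in_tumor)]
    return kde_mode(transformed)

# Toy sample: hyper-methylated sites near 0.7, hypo-methylated sites near 0.3
betas = [0.70, 0.72, 0.68, 0.31, 0.29, 0.30, 0.71]
hyper = [True, True, True, False, False, False, True]
purity = estimate_purity(betas, hyper)
```

Taking the density peak rather than the mean makes the estimate less sensitive to a minority of iDMCs that behave atypically in a given sample.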

InfiniumDMC: differential methylation analysis accounting for tumor purity

Tumor purity can seriously bias or weaken differential methylation analysis if it is not correctly accounted for. There have been a few discussions of differential expression analysis that consider tumor purity, and most of them simply add tumor purity as a covariate in a regression model. However, as shown in our work through rigorous data modeling, tumor purity has a multiplicative rather than additive effect on differential methylation (as well as differential expression). InfiniumDMC takes a beta value matrix of tumor (and optionally normal) samples and the purities of all tumor samples as inputs. Note that the purities can come from getPurity or from other tools. Differential methylation (DM) calling is performed under one of two scenarios. When there are more than 20 normal samples, InfiniumDMC tests the significance of differential methylation between tumor and normal data using a generalized least squares procedure. Otherwise, when normal samples are too few or unavailable, InfiniumDMC uses data from tumor samples alone and tests the association between tumor beta values and tumor purities. This control-free DM calling method provides an alternative when normal controls are unavailable or of low quality.
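
To see why the effect is multiplicative, assume a simple linear mixture (an assumption made here for illustration): observed beta = purity × tumor level + (1 − purity) × normal level = normal level + purity × (tumor level − normal level). The tumor-normal difference then enters scaled by purity, so in tumor-only data the slope of beta on purity estimates that difference. The sketch below uses ordinary least squares as a stand-in for the control-free association test; it does not reproduce the package's actual statistics, and `purity_slope_test` is a hypothetical name.

```python
import math

def purity_slope_test(betas, purities):
    """Ordinary least squares of beta on purity; returns (slope, t statistic)."""
    n = len(betas)
    mean_x = sum(purities) / n
    mean_y = sum(betas) / n
    sxx = sum((x - mean_x) ** 2 for x in purities)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(purities, betas))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - intercept - slope * x) ** 2
              for x, y in zip(purities, betas))
    # Standard error of the slope under the usual OLS assumptions
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, slope / se

# Toy CpG site: normal level 0.2, pure tumor level 0.8, small fixed noise
normal_mean, tumor_mean = 0.2, 0.8
purities = [0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
noise = [0.01, -0.01, 0.01, -0.01, 0.01, -0.01, 0.01]
betas = [normal_mean + lam * (tumor_mean - normal_mean) + e
         for lam, e in zip(purities, noise)]
slope, t_stat = purity_slope_test(betas, purities)
```

In this toy example the fitted slope recovers the tumor-normal difference of 0.6, which is the quantity a control-free test is after when no normal samples are available.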

InfiniumPurify: deconvolute pure tumor methylomes

InfiniumPurify deconvolutes pure tumor cell methylomes from tumor samples, normal samples and tumor purities through a linear regression model. Intuitively, a CpG site is likely to be differentially methylated if its methylation level is highly correlated with tumor purity. Figure 1 shows a CpG site for which minfi detects no significant methylation difference between tumor and normal samples. However, the high correlation between its tumor methylation level and purity indicates that the tumor methylation levels are strongly affected by tumor purity. After the purity effect is corrected by InfiniumPurify, the difference between the purified tumor and normal methylomes is highly significant.
Figure 1

An example showing DMCs that are only detected by InfiniumPurify. Left panel shows their methylation level distributions in tumor and normal samples. Middle panel shows correlation between purities and methylation levels. Right panel shows methylation levels of normal and tumor samples after correcting for tumor purities.
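
The idea behind purity correction can be sketched for a single CpG site. Assuming the linear mixture observed = purity × tumor + (1 − purity) × normal (an illustrative assumption; the package itself fits a linear regression across samples), the pure tumor level follows by solving for the tumor term. The function `purify_beta` below is hypothetical and written in Python rather than R.

```python
def purify_beta(observed, purity, normal_mean):
    """Solve observed = purity * tumor + (1 - purity) * normal_mean for tumor.

    The result is clipped to [0, 1] so it remains a valid beta value.
    """
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in (0, 1]")
    tumor = (observed - (1.0 - purity) * normal_mean) / purity
    return min(1.0, max(0.0, tumor))

# A sample that is 50% tumor, observed at 0.5, with a normal level of 0.2
purified = purify_beta(0.5, 0.5, 0.2)
```

Because the correction divides by purity, low-purity samples amplify measurement noise, which is one reason to pool information across samples by regression rather than correcting each value on its own.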

InfiniumClust: cluster tumor sample accounting for tumor purity

DNA methylation plays an important role in tumorigenesis, so clustering tumor samples into epigenetic subtypes is helpful for identifying diagnostic biomarkers and therapeutic targets in clinical practice. InfiniumClust is the first attempt to assign tumor samples to subtypes after correcting for the tumor purity effect. It assumes that the pure normal methylome and the tumor methylomes of the different subtypes follow normal distributions after arcsine transformation. The cluster membership of a tumor sample is treated as a latent variable and optimized by the Expectation-Maximization (EM) algorithm under the tumor-normal mixture model. InfiniumClust takes a beta value matrix and purities for a set of tumor samples and reports the probabilities of cluster membership. Given a user-specified number of clusters K, the function returns a list consisting of the likelihood and a membership matrix, whose rows correspond to tumor samples and whose columns correspond to the K clusters.
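
A heavily simplified Python sketch of purity-aware EM clustering follows. It is not the InfiniumClust implementation. The assumptions are: the arcsine transform is asin(2β − 1); each transformed tumor value is normally distributed around purity × subtype mean + (1 − purity) × normal mean; the standard deviation `sigma` is fixed and shared; and the start is deterministic, seeded from evenly spaced samples. The names `asin_transform` and `purity_em` are hypothetical.

```python
import math

def asin_transform(beta):
    # Assumed form of the arcsine transformation
    return math.asin(2.0 * beta - 1.0)

def purity_em(Y, purity, normal_mean, K, n_iter=30, sigma=0.1):
    """Y: samples x CpGs (transformed); returns (log-likelihood up to a
    constant, membership matrix with one row per sample and K columns)."""
    n, p = len(Y), len(normal_mean)
    # Tumor part of the signal after removing the normal contribution
    T = [[Y[i][j] - (1.0 - purity[i]) * normal_mean[j] for j in range(p)]
         for i in range(n)]
    seeds = [(k * n) // K for k in range(K)]
    mu = [[T[i][j] / purity[i] for j in range(p)] for i in seeds]
    pi = [1.0 / K] * K
    R, loglik = [], float("-inf")
    for _ in range(n_iter):
        # E-step: posterior membership probabilities via log-sum-exp
        R, loglik = [], 0.0
        for i in range(n):
            logw = []
            for k in range(K):
                sq = sum((T[i][j] - purity[i] * mu[k][j]) ** 2
                         for j in range(p))
                logw.append(math.log(max(pi[k], 1e-12))
                            - sq / (2.0 * sigma ** 2))
            m = max(logw)
            w = [math.exp(v - m) for v in logw]
            s = sum(w)
            R.append([v / s for v in w])
            loglik += m + math.log(s)
        # M-step: weighted least squares for each subtype mean
        for k in range(K):
            pi[k] = sum(R[i][k] for i in range(n)) / n
            den = sum(R[i][k] * purity[i] ** 2 for i in range(n))
            if den > 0.0:
                for j in range(p):
                    num = sum(R[i][k] * purity[i] * T[i][j] for i in range(n))
                    mu[k][j] = num / den
    return loglik, R

# Toy data: two subtypes mixed with normal tissue at different purities
subtype_a, subtype_b, normal = [0.9, 0.1, 0.6], [0.1, 0.9, 0.6], [0.5, 0.5, 0.5]
purities = [0.5, 0.7, 0.9, 0.6, 0.8, 0.4]
profiles = [subtype_a] * 3 + [subtype_b] * 3
Y = [[asin_transform(lam * t + (1.0 - lam) * nm) for t, nm in zip(prof, normal)]
     for lam, prof in zip(purities, profiles)]
loglik, membership = purity_em(Y, purities,
                               [asin_transform(b) for b in normal], K=2)
```

Each row of `membership` sums to one, and the column with the largest value gives the most probable subtype for that sample.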

Conclusion

The R package InfiniumPurify contains a series of functions for DNA methylation analysis in cancer research accounting for tumor purity.

Conflict of interest

None declared.

References (15 in total)

1.  Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays.

Authors:  Martin J Aryee; Andrew E Jaffe; Hector Corrada-Bravo; Christine Ladd-Acosta; Andrew P Feinberg; Kasper D Hansen; Rafael A Irizarry
Journal:  Bioinformatics       Date:  2014-01-28       Impact factor: 6.937

2.  Predicting tumor purity from methylation microarray data.

Authors:  Naiqian Zhang; Hua-Jun Wu; Weiwei Zhang; Jun Wang; Hao Wu; Xiaoqi Zheng
Journal:  Bioinformatics       Date:  2015-06-25       Impact factor: 6.937

Review 3.  Tumor purity and differential methylation in cancer epigenomics.

Authors:  Fayou Wang; Naiqian Zhang; Jun Wang; Hao Wu; Xiaoqi Zheng
Journal:  Brief Funct Genomics       Date:  2016-05-19       Impact factor: 4.241

4.  Accounting for tumor purity improves cancer subtype classification from DNA methylation data.

Authors:  Weiwei Zhang; Hao Feng; Hao Wu; Xiaoqi Zheng
Journal:  Bioinformatics       Date:  2017-09-01       Impact factor: 6.937

5.  De novo identification of differentially methylated regions in the human genome.

Authors:  Timothy J Peters; Michael J Buckley; Aaron L Statham; Ruth Pidsley; Katherine Samaras; Reginald V Lord; Susan J Clark; Peter L Molloy
Journal:  Epigenetics Chromatin       Date:  2015-01-27       Impact factor: 4.954

6.  Inferring tumour purity and stromal and immune cell admixture from expression data.

Authors:  Kosuke Yoshihara; Maria Shahmoradgoli; Emmanuel Martínez; Rahulsimham Vegesna; Hoon Kim; Wandaliz Torres-Garcia; Victor Treviño; Hui Shen; Peter W Laird; Douglas A Levine; Scott L Carter; Gad Getz; Katherine Stemke-Hale; Gordon B Mills; Roel G W Verhaak
Journal:  Nat Commun       Date:  2013       Impact factor: 14.919

Review 7.  A comprehensive overview of Infinium HumanMethylation450 data processing.

Authors:  Sarah Dedeurwaerder; Matthieu Defrance; Martin Bizet; Emilie Calonne; Gianluca Bontempi; François Fuks
Journal:  Brief Bioinform       Date:  2013-08-29       Impact factor: 11.622

8.  Functional normalization of 450k methylation array data improves replication in large cancer studies.

Authors:  Jean-Philippe Fortin; Aurélie Labbe; Mathieu Lemire; Brent W Zanke; Thomas J Hudson; Elana J Fertig; Celia Mt Greenwood; Kasper D Hansen
Journal:  Genome Biol       Date:  2014-12-03       Impact factor: 13.583

9.  Systematic pan-cancer analysis of tumour purity.

Authors:  Dvir Aran; Marina Sirota; Atul J Butte
Journal:  Nat Commun       Date:  2015-12-04       Impact factor: 14.919

10.  Accounting for cellular heterogeneity is critical in epigenome-wide association studies.

Authors:  Andrew E Jaffe; Rafael A Irizarry
Journal:  Genome Biol       Date:  2014-02-04       Impact factor: 13.583
