Literature DB >> 23468659

Confounding by repetitive elements and CpG islands does not explain the association between hypomethylation and genomic instability.

R Alan Harris1, Chad Shaw, Jian Li, Sau Wai Cheung, Cristian Coarfa, Mira Jeong, Margaret A Goodell, Lisa D White, Ankita Patel, Sung-Hae Kang, A Craig Chinault, Tomasz Gambin, Anna Gambin, James R Lupski, Aleksandar Milosavljevic.   

Abstract

Entities:  

Mesh:

Year:  2013        PMID: 23468659      PMCID: PMC3585018          DOI: 10.1371/journal.pgen.1003333

Source DB:  PubMed          Journal:  PLoS Genet        ISSN: 1553-7390            Impact factor:   5.917


× No keyword cloud information.
In our recent article [1], we reported an association between hypomethylation and genomic instability. A comment by Watson et al. [2] re-analyzes the data and claims that our findings may represent an artifact. We extend the methodological framework for analyzing copy number variants (CNVs) in the context of potential confounding factors to address the issues raised in the comment and to further research in this growing area of genomic science. Watson et al. argue that the association we reported between hypomethylation of genomic DNA—determined from sperm methylomes [3]—and the density of CNVs can be explained by a combined confounding effect of known correlates of CNVs, namely repetitive elements and CpG islands. To support their argument, the authors eliminate many genomic regions providing a variety of justifications for why these regions create “spurious association”. Once the regions have been removed from the genome, Watson et al. claim the association between hypomethylation and genomic instability disappears. We would first like to point out that the goal of our study was not to ignore potential relevance of these other factors—such as repetitive elements and CpG islands—but instead to broaden the scope of inquiry to examine possible additional explanatory power of hypomethylation. In deference to Watson et al., we initiated a re-analysis that systematically examined the purported confounding factors. Rather than pursuing the data exclusion approach used by Watson et al., we applied the standard multiple regression approach. The regression methods control for confounding without discarding data or otherwise biasing the inquiry to particular genomic sub-regions. Specifically, we first asked if the variables brought up by Watson et al. (LINE, SINE, LTR, Satellite, and CpG island content) individually or in combination explain the association between hypomethylation and CNV counts within 100-Kbp windows tiling the genome. In addition, using the Akaike information criterion (AIC) we measured the explanatory power of each of the six variables beyond the explanatory power of the other five. We first applied the Negative Binomial regression model [4], because it is commonly used for overdispersed variables and is a well-accepted and robust method for count data such as CNV counts. We applied the same method to all five sample sets brought up by Watson et al.—HapMap270 [5], HapMap450 [6], WTCCC [7], Schizophrenia Cases, and Schizophrenia Controls [8]. As documented in Tables S1 and S2, in all five sample sets hypomethylation remained highly significantly associated with CNV density after correction for all of the “confounders” individually and in combination. As illustrated in Figure 1, as measured by AIC, methylation was more predictive of CNV counts per 100-Kbp window by an order of magnitude than any other factor.
Figure 1

Predictive power of methylation and other genomic factors for CNV counts.

Predictive power of methylation, CpG island content, and repetitive element content (LINE, SINE, LTR, and Satellites) was measured using Akaike information criterion (AIC). For all five datasets, negative binomial regression was performed using all six factors and all six combinations of five factors (one factor being removed at a time). The y-axis represents the predictive power of a factor, as measured by the improvement of the AIC score based on all six factors relative to the AIC score without the factor. Note that this method measures predictive power of a factor after correction for any potential confounding due to other factors. (The detailed calculations and input data are in Supporting Information.)

Predictive power of methylation and other genomic factors for CNV counts.

Predictive power of methylation, CpG island content, and repetitive element content (LINE, SINE, LTR, and Satellites) was measured using Akaike information criterion (AIC). For all five datasets, negative binomial regression was performed using all six factors and all six combinations of five factors (one factor being removed at a time). The y-axis represents the predictive power of a factor, as measured by the improvement of the AIC score based on all six factors relative to the AIC score without the factor. Note that this method measures predictive power of a factor after correction for any potential confounding due to other factors. (The detailed calculations and input data are in Supporting Information.) To examine robustness of our analyses with respect to modeling assumptions, we repeated the same analysis by applying the more widely used Poisson and linear regression models. All three models gave consistent results for all confounders in all five sample sets. (Table S2 describes regression models, output, and the input data extracted from our original paper sufficient to run a statistical program such as R to obtain the output.) We also employed zero-inflated negative binomial and Poisson regression models, and found completely concordant results. We therefore conclude that the assertions regarding “confounding” are not consistent with the data available. This is likely because Watson et al. resorted to an idiosyncratic selective data elimination procedure rather than pursuing a more standard statistical approach. Second, Watson et al. bring up mappability of reads as a confounding factor while failing to mention that the original paper [1] considered and—using bisulfite sequencing data from embryonic stem cell H1 as a control—ruled out ascertainment biases due to read mappability: “We next examined the difference in methylation levels between sperm and H1. As illustrated in Figure S16, the difference shows even stronger association with structural mutability than the absolute methylation levels in sperm. This result rules out possible ascertainment biases due to low mappability of sequencing reads in potentially unstable and repetitive hypomethylated regions. It also suggests that structural mutability is associated with germline-specific hypomethylation.” Also quoting from our article: “We found significant negative correlation between the methylation scores in sperm and the heterozygosity rates (CNVs from 400 MGL samples: r≈−0.15, p≈10−9; CNVs from 270 HapMap samples: r≈−0.20, p≈10−10). In contrast, no significant correlation between the H1 methylation scores and the CNV heterozygosity rates was detected” [1]. Third, Watson et al. argue that the higher mode with zero scores for the Methylation Index (MI = 0) is likely an artifact due to small SNP and CpG counts. In this context it is surprising that Watson et al. fail to mention that our article considered, examined, and ruled out this possibility: “One could expect that if the windows with MI = 0 were due to low probing density, the windows within the higher mode would have fewer SNPs or CpGs. However, we examined potential biases in MI estimation due to variations in the number of SNPs, CpGs, read coverage (Figure S6CD), or sampling events (Figure S7BD) and found no significant difference between the two modes, ruling out the possibility that the two modes may be explained by variation in mappability or shallow sampling. In addition, a simulation experiment showed that the statistical variance of methylation estimates due to CpG sampling of windows with MI = 0 was a relatively small fraction of biological variance in methylation observed between the two sperm methylomes (Figure S8). We therefore hypothesize that the higher mode may either indicate hypomethylation specific to the female germline, given that male and female germline methylation patterns are highly dimorphic [47], or may be due to other germline hypomethylation detected by MI that is absent from sperm.” Fourth, the “confounders” brought up by Watson et al. do not influence genomic instability independently of the methylation state and therefore do not meet the common definition of confounding [9]. In the specific case of CpG islands, the striking pattern where hypomethylated CpG islands are enriched in unstable regions (Figure 2A and 2B in Watson et al. [2]) would in fact be expected to occur if hypomethylation were mechanistically linked to genomic instability. Watson et al. ignore this possibility without sound justification while claiming that this pattern somehow provides evidence against any connection of hypomethylation and genomic instability. Fifth, contrary to what Watson et al. claim, our article does not state that hypomethylation plays a causative role in genomic instability. Specifically, in the discussion section of our article we state three possible mechanistic explanations for the observed association: DNA break–inducing germline-specific demethylation during embryogenesis; mutagenic effects of germline-specific gene expression in hypomethylated loci; and mutagenic effects of transcription factor binding to hypomethylated loci. In summary, we thank Watson et al. for their efforts and further examination of our reported observations. Nevertheless, we find that the arguments put forward in the comment do not diminish the strength of our reported findings. Specifically, our analyses of the confounding factors suggested by Watson et al. do not diminish the contention that genomic correlates may provide only a partial explanation for the hotspots of genomic instability. Thus, broadening inquiry to also include the epigenome may be warranted. CNV counts and values for methylation and other genomic factors for 100-Kbp windows tiling the hg18 assembly of the human genome. (ZIP) Click here for additional data file. Regression analysis of the predictive power of methylation and other genomic factors for CNV counts. (XLS) Click here for additional data file.
  7 in total

1.  Integrated detection and population-genetic analysis of SNPs and copy number variation.

Authors:  Steven A McCarroll; Finny G Kuruvilla; Joshua M Korn; Simon Cawley; James Nemesh; Alec Wysoker; Michael H Shapero; Paul I W de Bakker; Julian B Maller; Andrew Kirby; Amanda L Elliott; Melissa Parkin; Earl Hubbell; Teresa Webster; Rui Mei; James Veitch; Patrick J Collins; Robert Handsaker; Steve Lincoln; Marcia Nizzari; John Blume; Keith W Jones; Rich Rava; Mark J Daly; Stacey B Gabriel; David Altshuler
Journal:  Nat Genet       Date:  2008-09-07       Impact factor: 38.330

2.  Sperm methylation profiles reveal features of epigenetic inheritance and evolution in primates.

Authors:  Antoine Molaro; Emily Hodges; Fang Fang; Qiang Song; W Richard McCombie; Gregory J Hannon; Andrew D Smith
Journal:  Cell       Date:  2011-09-16       Impact factor: 41.582

3.  Rare chromosomal deletions and duplications increase risk of schizophrenia.

Authors: 
Journal:  Nature       Date:  2008-07-30       Impact factor: 49.962

4.  Genome-wide association study of CNVs in 16,000 cases of eight common diseases and 3,000 shared controls.

Authors:  Nick Craddock; Matthew E Hurles; Niall Cardin; Richard D Pearson; Vincent Plagnol; Samuel Robson; Damjan Vukcevic; Chris Barnes; Donald F Conrad; Eleni Giannoulatou; Chris Holmes; Jonathan L Marchini; Kathy Stirrups; Martin D Tobin; Louise V Wain; Chris Yau; Jan Aerts; Tariq Ahmad; T Daniel Andrews; Hazel Arbury; Anthony Attwood; Adam Auton; Stephen G Ball; Anthony J Balmforth; Jeffrey C Barrett; Inês Barroso; Anne Barton; Amanda J Bennett; Sanjeev Bhaskar; Katarzyna Blaszczyk; John Bowes; Oliver J Brand; Peter S Braund; Francesca Bredin; Gerome Breen; Morris J Brown; Ian N Bruce; Jaswinder Bull; Oliver S Burren; John Burton; Jake Byrnes; Sian Caesar; Chris M Clee; Alison J Coffey; John M C Connell; Jason D Cooper; Anna F Dominiczak; Kate Downes; Hazel E Drummond; Darshna Dudakia; Andrew Dunham; Bernadette Ebbs; Diana Eccles; Sarah Edkins; Cathryn Edwards; Anna Elliot; Paul Emery; David M Evans; Gareth Evans; Steve Eyre; Anne Farmer; I Nicol Ferrier; Lars Feuk; Tomas Fitzgerald; Edward Flynn; Alistair Forbes; Liz Forty; Jayne A Franklyn; Rachel M Freathy; Polly Gibbs; Paul Gilbert; Omer Gokumen; Katherine Gordon-Smith; Emma Gray; Elaine Green; Chris J Groves; Detelina Grozeva; Rhian Gwilliam; Anita Hall; Naomi Hammond; Matt Hardy; Pile Harrison; Neelam Hassanali; Husam Hebaishi; Sarah Hines; Anne Hinks; Graham A Hitman; Lynne Hocking; Eleanor Howard; Philip Howard; Joanna M M Howson; Debbie Hughes; Sarah Hunt; John D Isaacs; Mahim Jain; Derek P Jewell; Toby Johnson; Jennifer D Jolley; Ian R Jones; Lisa A Jones; George Kirov; Cordelia F Langford; Hana Lango-Allen; G Mark Lathrop; James Lee; Kate L Lee; Charlie Lees; Kevin Lewis; Cecilia M Lindgren; Meeta Maisuria-Armer; Julian Maller; John Mansfield; Paul Martin; Dunecan C O Massey; Wendy L McArdle; Peter McGuffin; Kirsten E McLay; Alex Mentzer; Michael L Mimmack; Ann E Morgan; Andrew P Morris; Craig Mowat; Simon Myers; William Newman; Elaine R Nimmo; Michael C O'Donovan; Abiodun Onipinla; Ifejinelo Onyiah; Nigel R Ovington; Michael J Owen; Kimmo Palin; Kirstie Parnell; David Pernet; John R B Perry; Anne Phillips; Dalila Pinto; Natalie J Prescott; Inga Prokopenko; Michael A Quail; Suzanne Rafelt; Nigel W Rayner; Richard Redon; David M Reid; Susan M Ring; Neil Robertson; Ellie Russell; David St Clair; Jennifer G Sambrook; Jeremy D Sanderson; Helen Schuilenburg; Carol E Scott; Richard Scott; Sheila Seal; Sue Shaw-Hawkins; Beverley M Shields; Matthew J Simmonds; Debbie J Smyth; Elilan Somaskantharajah; Katarina Spanova; Sophia Steer; Jonathan Stephens; Helen E Stevens; Millicent A Stone; Zhan Su; Deborah P M Symmons; John R Thompson; Wendy Thomson; Mary E Travers; Clare Turnbull; Armand Valsesia; Mark Walker; Neil M Walker; Chris Wallace; Margaret Warren-Perry; Nicholas A Watkins; John Webster; Michael N Weedon; Anthony G Wilson; Matthew Woodburn; B Paul Wordsworth; Allan H Young; Eleftheria Zeggini; Nigel P Carter; Timothy M Frayling; Charles Lee; Gil McVean; Patricia B Munroe; Aarno Palotie; Stephen J Sawcer; Stephen W Scherer; David P Strachan; Chris Tyler-Smith; Matthew A Brown; Paul R Burton; Mark J Caulfield; Alastair Compston; Martin Farrall; Stephen C L Gough; Alistair S Hall; Andrew T Hattersley; Adrian V S Hill; Christopher G Mathew; Marcus Pembrey; Jack Satsangi; Michael R Stratton; Jane Worthington; Panos Deloukas; Audrey Duncanson; Dominic P Kwiatkowski; Mark I McCarthy; Willem Ouwehand; Miles Parkes; Nazneen Rahman; John A Todd; Nilesh J Samani; Peter Donnelly
Journal:  Nature       Date:  2010-04-01       Impact factor: 49.962

5.  Origins and functional impact of copy number variation in the human genome.

Authors:  Donald F Conrad; Dalila Pinto; Richard Redon; Lars Feuk; Omer Gokcumen; Yujun Zhang; Jan Aerts; T Daniel Andrews; Chris Barnes; Peter Campbell; Tomas Fitzgerald; Min Hu; Chun Hwa Ihm; Kati Kristiansson; Daniel G Macarthur; Jeffrey R Macdonald; Ifejinelo Onyiah; Andy Wing Chun Pang; Sam Robson; Kathy Stirrups; Armand Valsesia; Klaudia Walter; John Wei; Chris Tyler-Smith; Nigel P Carter; Charles Lee; Stephen W Scherer; Matthew E Hurles
Journal:  Nature       Date:  2009-10-07       Impact factor: 49.962

6.  Genomic hypomethylation in the human germline associates with selective structural mutability in the human genome.

Authors:  Jian Li; R Alan Harris; Sau Wai Cheung; Cristian Coarfa; Mira Jeong; Margaret A Goodell; Lisa D White; Ankita Patel; Sung-Hae Kang; Chad Shaw; A Craig Chinault; Tomasz Gambin; Anna Gambin; James R Lupski; Aleksandar Milosavljevic
Journal:  PLoS Genet       Date:  2012-05-17       Impact factor: 5.917

7.  Comment on "genomic hypomethylation in the human germline associates with selective structural mutability in the human genome".

Authors:  Corey T Watson; Paras Garg; Andrew J Sharp
Journal:  PLoS Genet       Date:  2013-02-28       Impact factor: 5.917

  7 in total
  1 in total

1.  CNVs are associated with genomic architecture in a songbird.

Authors:  Vinicius H da Silva; Veronika N Laine; Mirte Bosse; Kees van Oers; Bert Dibbits; Marcel E Visser; Richard P M A Crooijmans; Martien A M Groenen
Journal:  BMC Genomics       Date:  2018-03-13       Impact factor: 3.969

  1 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.