
In Silico Restriction Enzyme Digests to Minimize Mapping Bias in Genomic Sequencing.

Jason Roszik1,2, György Fenyőfalvi3, László Halász3,4, Zsolt Karányi3, Lóránt Székvölgyi3,4.   


Keywords:  DRIP; Hi-C; bias; genome fragmentation; restriction enzyme digestion

Year:  2017        PMID: 28695155      PMCID: PMC5485759          DOI: 10.1016/j.omtm.2017.06.003

Source DB:  PubMed          Journal:  Mol Ther Methods Clin Dev        ISSN: 2329-0501            Impact factor:   6.698



Main Text

Restriction endonuclease (RE) digestion, a commonly used genome fragmentation method in next-generation sequencing (NGS), may severely compromise genomic mapping resolution and prevent the functional annotation of certain chromosomal regions unless REs are applied in correct combinations that sample all genomic regions with equal probability. Genome fragmentation by REs is routinely used in multiple genomic mapping technologies, including RNA-DNA hybrid (R-loop) immunoprecipitation sequencing (DRIP-seq),1, 2 chromosome conformation capture (4C/5C, Hi-C), reduced-representation bisulfite sequencing (RRBS), and restriction site associated DNA (RAD) marker genotyping. The performance of these approaches depends on (1) the length distribution of the restriction fragments, which determines the spatial resolution of the assay, and (2) the randomness of RE digestion, which ensures that all genomic regions are sampled with equal probability. Selecting the proper combination of REs for genome fragmentation is therefore crucial for obtaining representative NGS libraries and for assigning clear biological functions to the mapped regions.

Using the DRIP-seq technique, we have recently shown that this technology contains inherent biases related to RE digestion that might prevent functional annotation of a significant fraction of R-loops. R-loops, nucleic acid structures composed of an RNA-DNA hybrid and a single-stranded DNA, are involved in multiple cellular processes and may also mediate genomic instability in a pathological context. The DRIP method uses an anti-RNA-DNA hybrid antibody to capture RNA-DNA hybrids associated with RE-fragmented DNA or chromatin, followed by fragment mapping to the genome. The main reason for the over-representation of lengthy DRIP fragments may be that restriction enzyme cutting sites are not randomly distributed in the human genome. This bias is especially pronounced over first exons.

The over-representation of first exons in RE-fragmented samples may also be an issue in other species and sequencing methods. For instance, the mouse genome also contains long intronic sequences that may cause similar biases. As with the DRIP method, suboptimal RE fragment size distribution and first exon bias might affect the outcome and interpretation of other frequently used genomic technologies (e.g., all C-based methods [4C, 5C, and Hi-C]), potentially introducing false-positive spatial contacts that fall proximal to open reading frames (ORFs), especially to first exons. Finally, estimating the evolutionary conservation of R-loop binding sites between species is potentially also problematic, because such estimates reflect the sequence homology/divergence of exonic DRIP fragments rather than of precisely located R-loop binding sites.

Superimposed on the RE bias, multiple other genome characteristics can affect the efficacy of RE digestion. DNA methylation is present in higher organisms, and the majority of REs do not cut at methylated cytosines. Furthermore, most REs do not cut DNA-RNA hybrids, which are prevalent over the chromosomes (constituting 5%–8% of the eukaryote genome). Restriction enzyme accessibility is also limited by the chromatin (nucleosome) structure, which inherently favors the cutting of linker DNA sequences.

The randomness and uniformity of restriction fragment length distributions can be tested for any combination of REs using in silico restriction endonuclease digests (Figure 1), and RE cocktails with theoretically justified cutting parameters can then be selected for use in experiments. We recommend using the DECIPHER R package, which is available in Bioconductor. To predict the expected DNA fragments, the "DigestDNA" function can be used to perform in silico restriction digestion of given DNA sequences.

Issues related to CpG methylation can be experimentally addressed by using methylation-insensitive REs that cleave methylated DNA. RNase H1 digestion of nucleic acid preps can also be applied to remove RNA from DNA-RNA hybrids. Furthermore, short treatment of live cells with chromatin decompaction agents (e.g., HDAC inhibitors) may provide increased RE accessibility in experiments involving in situ RE fragmentation (e.g., Hi-C). Collectively, the above recommendations can help identify RE cocktails and experimental conditions that result in proper DNA fragment size distributions and optimal resolution in genomic sequencing technologies.
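
The idea of an in silico digest can be illustrated with a minimal, self-contained Python sketch. It is not the DECIPHER implementation recommended in the text, and it does not reproduce the analysis behind Figure 1. The function names (cut_positions, fragment_lengths, summarize) are hypothetical, and the demo input is a random toy sequence rather than a real genome. The recognition sites and cut offsets are the standard ones for the five enzymes of the RE cocktail; because all five sites are palindromic, scanning one strand is assumed to be sufficient.

```python
import random
import statistics

# Recognition site and top-strand cut offset for each enzyme of the RE cocktail.
# All five sites are palindromic, so only one strand is scanned.
ENZYMES = {
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
    "EcoRI": ("GAATTC", 1),    # G^AATTC
    "BsrGI": ("TGTACA", 1),    # T^GTACA
    "XbaI": ("TCTAGA", 1),     # T^CTAGA
    "SspI": ("AATATT", 3),     # AAT^ATT (blunt)
}


def cut_positions(seq, enzymes):
    """Return sorted, de-duplicated cut coordinates for the given enzymes."""
    seq = seq.upper()
    cuts = set()
    for name in enzymes:
        site, offset = ENZYMES[name]
        start = seq.find(site)
        while start != -1:
            cuts.add(start + offset)
            # Step by one base so overlapping sites are also found.
            start = seq.find(site, start + 1)
    return sorted(cuts)


def fragment_lengths(seq, enzymes):
    """Lengths of the fragments of a linear sequence after a complete digest."""
    bounds = [0] + cut_positions(seq, enzymes) + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


def summarize(lengths):
    """Fragment count, mean, median and interquartile range (IQR)."""
    if len(lengths) < 2:
        # statistics.quantiles needs at least two data points.
        only = float(lengths[0])
        return {"n": len(lengths), "mean": only, "median": only, "iqr": 0.0}
    q1, median, q3 = statistics.quantiles(lengths, n=4)
    return {
        "n": len(lengths),
        "mean": statistics.mean(lengths),
        "median": median,
        "iqr": q3 - q1,
    }


if __name__ == "__main__":
    # Toy input (assumption): 100 kb of uniformly random bases, not a genome.
    rng = random.Random(0)
    toy = "".join(rng.choice("ACGT") for _ in range(100_000))
    for label, combo in [("HindIII alone", ["HindIII"]),
                         ("RE cocktail", list(ENZYMES))]:
        print(label, summarize(fragment_lengths(toy, combo)))
```

On a real chromosome sequence, comparing these summaries between single enzymes and the cocktail gives a first impression of how much a combination shortens and evens out the fragment length distribution.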
Figure 1

Genome Fragmentation by In Silico Restriction Enzyme Digestion in Species That Were Analyzed by DRIP-seq or Hi-C

The absolute number of restriction fragments is shown in terms of the average fragment lengths (mean + interquartile range [IQR]) obtained by the indicated restriction enzymes applied alone or in combination. RE cocktail denotes the HindIII, EcoRI, BsrGI, XbaI, and SspI enzymes.

  8 in total

1.  Rapid and cost-effective polymorphism identification and genotyping using restriction site associated DNA (RAD) markers.

Authors:  Michael R Miller; Joseph P Dunham; Angel Amores; William A Cresko; Eric A Johnson
Journal:  Genome Res       Date:  2006-12-22       Impact factor: 9.043

2.  R-loop formation is a distinctive characteristic of unmethylated human CpG island promoters.

Authors:  Paul A Ginno; Paul L Lott; Holly C Christensen; Ian Korf; Frédéric Chédin
Journal:  Mol Cell       Date:  2012-03-01       Impact factor: 17.970

3.  Comprehensive mapping of long-range interactions reveals folding principles of the human genome.

Authors:  Erez Lieberman-Aiden; Nynke L van Berkum; Louise Williams; Maxim Imakaev; Tobias Ragoczy; Agnes Telling; Ido Amit; Bryan R Lajoie; Peter J Sabo; Michael O Dorschner; Richard Sandstrom; Bradley Bernstein; M A Bender; Mark Groudine; Andreas Gnirke; John Stamatoyannopoulos; Leonid A Mirny; Eric S Lander; Job Dekker
Journal:  Science       Date:  2009-10-09       Impact factor: 47.728

4.  Prevalent, Dynamic, and Conserved R-Loop Structures Associate with Specific Epigenomic Signatures in Mammals.

Authors:  Lionel A Sanz; Stella R Hartono; Yoong Wearn Lim; Sandra Steyaert; Aparna Rajpurkar; Paul A Ginno; Xiaoqin Xu; Frédéric Chédin
Journal:  Mol Cell       Date:  2016-06-30       Impact factor: 17.970

5.  Reduced representation bisulfite sequencing for comparative high-resolution DNA methylation analysis.

Authors:  Alexander Meissner; Andreas Gnirke; George W Bell; Bernard Ramsahoye; Eric S Lander; Rudolf Jaenisch
Journal:  Nucleic Acids Res       Date:  2005-10-13       Impact factor: 16.971

6.  RNA-DNA hybrid (R-loop) immunoprecipitation mapping: an analytical workflow to evaluate inherent biases.

Authors:  László Halász; Zsolt Karányi; Beáta Boros-Oláh; Tímea Kuik-Rózsa; Éva Sipos; Éva Nagy; Ágnes Mosolygó-L; Anett Mázló; Éva Rajnavölgyi; Gábor Halmos; Lóránt Székvölgyi
Journal:  Genome Res       Date:  2017-03-24       Impact factor: 9.043

7.  A combinatorial approach to the restriction of a mouse genome.

Authors:  Leonid V Bystrykh
Journal:  BMC Res Notes       Date:  2013-07-22

8.  GC skew is a conserved property of unmethylated CpG island promoters across vertebrates.

Authors:  Stella R Hartono; Ian F Korf; Frédéric Chédin
Journal:  Nucleic Acids Res       Date:  2015-08-07       Impact factor: 16.971

