P H Reyes-Herrera (1), C A Speck-Hernandez (2), C A Sierra (2), S Herrera (3). (1) Colombian Corporation for Agricultural Research (CORPOICA), 250047 Bogotá, Colombia. (2) Universidad Antonio Nariño, 110311 Bogotá, Colombia. (3) Woods Hole Oceanographic Institution, 02543 Massachusetts, USA and Massachusetts Institute of Technology, 02139 Massachusetts, USA.
Abstract
MOTIVATION: PAR-CLIP, a CLIP-seq protocol, derives a transcriptome-wide set of binding sites for RNA-binding proteins. Even though the protocol uses stringent washing to remove experimental noise, some of it remains. A recent study measured three sets of non-specific RNA backgrounds that are present in several PAR-CLIP datasets. However, a tool to identify the presence of common background in PAR-CLIP datasets is not yet available.
RESULTS: We used the measured sets of non-specific RNA backgrounds to build a common background set. Each element of the common background set has a score that reflects its presence in several PAR-CLIP datasets. We present a tool that uses this score to identify the amount of common background present in a PAR-CLIP dataset, and we give the user the option to use or remove it. We applied the proposed strategy to 30 PAR-CLIP datasets from nine proteins. It is possible to identify the presence of common backgrounds in a dataset and to identify differences between datasets for the same protein. This method is the first step in the process of completely removing such backgrounds.
AVAILABILITY: The tool was implemented in Python. The common background set and the supplementary data are available at https://github.com/phrh/BackCLIP.
CONTACT: phreyes@gmail.com
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
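The scoring idea described in the abstract can be illustrated with a minimal sketch. This is not the BackCLIP implementation or its interface. The function names, the string identifiers for elements, the score scale in [0, 1] and the 0.5 threshold are all assumptions made for illustration. The sketch assumes each background element carries a score, measures what fraction of a dataset's sites match high-scoring background elements, and lets the caller separate those sites from the rest.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: element identifier -> background score.
# The [0, 1] scale (higher = seen in more PAR-CLIP datasets) is an assumption.
BackgroundScores = Dict[str, float]


def background_fraction(sites: List[str],
                        background: BackgroundScores,
                        min_score: float = 0.5) -> float:
    """Return the fraction of sites whose background score is >= min_score."""
    if not sites:
        return 0.0
    flagged = sum(1 for s in sites if background.get(s, 0.0) >= min_score)
    return flagged / len(sites)


def split_sites(sites: List[str],
                background: BackgroundScores,
                min_score: float = 0.5) -> Tuple[List[str], List[str]]:
    """Split sites into (kept, removed); removed sites match the background."""
    kept: List[str] = []
    removed: List[str] = []
    for s in sites:
        if background.get(s, 0.0) >= min_score:
            removed.append(s)
        else:
            kept.append(s)
    return kept, removed


# Toy data, invented purely for illustration.
example_background = {"elemA": 0.9, "elemB": 0.2}
example_sites = ["elemA", "elemB", "elemC", "elemA"]

fraction = background_fraction(example_sites, example_background)
kept_sites, removed_sites = split_sites(example_sites, example_background)
```

Returning both lists, rather than filtering in place, mirrors the option mentioned in the abstract of either using or removing the common background.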
Authors: Philip J Uren; Emad Bahrami-Samani; Patricia Rosa de Araujo; Christine Vogel; Mei Qiao; Suzanne C Burns; Andrew D Smith; Luiz O F Penalva Journal: RNA Biol Date: 2016-01-13 Impact factor: 4.652
Authors: Eric L Van Nostrand; Gabriel A Pratt; Alexander A Shishkin; Chelsea Gelboin-Burkhart; Mark Y Fang; Balaji Sundararaman; Steven M Blue; Thai B Nguyen; Christine Surka; Keri Elkins; Rebecca Stanton; Frank Rigo; Mitchell Guttman; Gene W Yeo Journal: Nat Methods Date: 2016-03-28 Impact factor: 28.547