Yuri Kravatsky, Vladimir Chechetkin, Daria Fedoseeva, Maria Gorbacheva, Galina Kravatskaya, Olga Kretova, Nickolai Tchurikov.
Abstract
The efficient development of antiviral drugs, including antiviral small interfering RNAs (siRNAs), requires continuous monitoring of how closely a drug matches its highly variable viral DNA/RNA target(s). Deep sequencing can assess both the overall conservation of a target and the frequency of particular mutations at its individual sites. The aim of this study was to develop a reliable bioinformatic pipeline for analyzing millions of short deep-sequencing reads that correspond to selected highly variable viral sequences serving as drug targets. The proposed pipeline combines available programs with ad hoc scripts based on an original algorithm that searches for conserved targets in deep-sequencing data. We also present statistical criteria for the threshold of reliable mutation detection and for assessing variations between corresponding data sets. These criteria are robust against possible sequencing errors in the reads. As an example, the pipeline is applied to study the conservation of RNA interference (RNAi) targets in human immunodeficiency virus 1 (HIV-1) subtype A. The developed pipeline is freely available for download at http://virmut.eimb.ru/. Brief comments on VirMut and comparisons with other pipelines are also presented.
Keywords: bioinformatic pipeline; data processing; deep-sequencing; drug targets; mutations; viruses
Year: 2017 PMID: 29168754 PMCID: PMC5744132 DOI: 10.3390/v9120357
Source DB: PubMed Journal: Viruses ISSN: 1999-4915 Impact factor: 5.048
Figure 1. The positions of RNA interference (RNAi) targets on the genome of human immunodeficiency virus 1 (HIV-1) and their conservation within two independent cohorts of patients from Russia. (a) Schematic presentation of RNAi targets within the HIV-1 physical map; (b) The profiles of total mutation frequencies over the RNAi targets. The broken horizontal lines correspond to the thresholds of reliable mutation detection as determined by Equation (3).
Figure 2. Scheme of the bioinformatics pipeline for deep-sequence monitoring of viral drug targets. The resulting output options are marked in red. The particular scripts correspond to the blocks of programs outlined in blue. The typical information related to the filtering and times for read processing is presented in Table S2. Links to all software used in the pipeline are given in Section 2.2.
Figure 3. The z-criterion profiles (Equation (4)) characterizing the differences between mutation frequencies in the corresponding target sites for two independent cohorts of patients from Russia. For presentation purposes, the maximum absolute values of z were taken to be |z| = 5.5 (Pr = 3.9 × 10⁻⁸). The horizontal broken lines (z = ±1.96) correspond to the thresholds of statistical significance (Pr = 0.05).
Figure 4. The conservation of RNAi targets, as defined by Equation (6), for two independent cohorts of patients from Russia. A difference in conservation of about 1% should be considered statistically significant according to the Gaussian z-criterion.
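The Gaussian z-criterion referred to in the captions for Figures 3 and 4 can be illustrated with a minimal Python sketch. Equation (4) is not reproduced in this record, so the code below assumes a standard pooled two-proportion z-test rather than the authors' exact formula; the function names, the example counts, and the clipping helper are hypothetical and serve only to show how the thresholds z = ±1.96 (Pr = 0.05) and the display cap |z| = 5.5 relate to a two-sided Gaussian probability.

```python
import math


def two_proportion_z(mut_a, depth_a, mut_b, depth_b):
    """Pooled two-proportion z statistic (an assumption, not the paper's Equation (4)).

    mut_*   : number of reads carrying a mutation at a site in each cohort
    depth_* : total number of reads covering that site in each cohort
    """
    p_a = mut_a / depth_a
    p_b = mut_b / depth_b
    pooled = (mut_a + mut_b) / (depth_a + depth_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / depth_a + 1.0 / depth_b))
    if se == 0.0:
        # Both cohorts are entirely mutated or entirely unmutated at this site.
        return 0.0
    return (p_a - p_b) / se


def two_sided_p(z):
    """Two-sided Gaussian probability of observing a value at least as extreme as |z|."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def clip_z(z, z_max=5.5):
    """Cap |z| for plotting, mirroring the display limit mentioned for Figure 3."""
    return max(-z_max, min(z_max, z))


if __name__ == "__main__":
    # Hypothetical counts: 120 vs 80 mutated reads out of 10,000 at one site.
    z = two_proportion_z(120, 10_000, 80, 10_000)
    print(f"z = {z:.3f}, two-sided Pr = {two_sided_p(z):.4f}")
    print(f"Pr at |z| = 1.96: {two_sided_p(1.96):.4f}")
```

With these hypothetical counts the statistic exceeds 1.96, so the difference would be flagged as significant at the 0.05 level under the assumed test. The published analysis may weight or normalize frequencies differently.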