Hyerin Kim1, NaNa Kang2, KyuHyeon An1, Doyun Kim2, JaeHyung Koo3, Min-Soo Kim4.
Abstract
Many infectious diseases are caused by viral infections, and in particular by RNA viruses such as MERS, Ebola and Zika. To understand viral disease, detection and identification of these viruses are essential. Although PCR is widely used for rapid virus identification due to its low cost and high sensitivity and specificity, very few online databases have compiled PCR primers for RNA viruses. To support effective virus detection, the MRPrimerV database (http://MRPrimerV.com) contains 152 380 247 PCR primer pairs for detection of 1818 viruses, covering 7144 coding sequences (CDSs) and representing 100% of the RNA viruses in the most up-to-date NCBI RefSeq database. Due to rigorous similarity testing against all human and viral sequences, every primer in MRPrimerV is highly target-specific. Because MRPrimerV ranks CDSs by the penalty scores of their best primer, users need only use the first primer pair for a single-phase PCR or the first two primer pairs for two-phase PCR. Moreover, MRPrimerV provides the list of genome neighbors that can be detected using each primer pair, covering 22 192 variants of 532 RefSeq RNA viruses. We believe that the public availability of MRPrimerV will facilitate viral metagenomics studies aimed at evaluating the variability of viruses, as well as other scientific tasks.
Year: 2016 PMID: 27899620 PMCID: PMC5210568 DOI: 10.1093/nar/gkw1095
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Schematic overview of MRPrimerV. (1) Checking the filtering constraints and performing large-scale rigorous similarity testing based on the MRPrimer technology on a supercomputer for the entire human sequence and RNA virus sequences. (2) Aligning the primer pairs with the genome neighbors to check whether each primer pair can amplify each genome neighbor. (3) Converting the results of processing along with information from the GenBank annotation database into the MRPrimerV database of 152,380,247 high-quality primer pairs. (4) Relaxing filtering constraints to cover 100% of 1,818 RNA viruses. (5) Updating the database with experimental validation results for some viruses.
List of filtering constraints used in MRPrimerV
| Category | Parameter | Default value range | Relaxed value range |
|---|---|---|---|
| Single primer | primer length | 19–23 bp | 19–23 bp |
| | melting temperature (TM)* | 58–62°C | 57–62°C |
| | GC content | 40–60% | 35–65% |
| | self-complementarity | <5-mer | <9-mer |
| | 3′ self-complementarity | <4-mer | <4-mer |
| | contiguous residue | <6-mer | <6-mer |
| | 3′ end stability (ΔG) | ≥-9 | ≥-9 |
| | hairpin | <4-mer | <9-mer |
| Primer pair | length difference | ≤3-mer | ≤5-mer |
| | TM difference | ≤5°C | ≤5°C |
| | product size | 100–500 bp | 70–500 bp |
| | pair-complementarity | <5-mer | <9-mer |
| | 3′ pair-complementarity | <4-mer | <4-mer |
*The nearest-neighbor thermodynamic model (14) was used to calculate the melting temperature.
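As a rough illustration of how such constraints act as a filter, the sketch below checks three of the single-primer parameters from the table (primer length, GC content and contiguous residue) under both the default and relaxed settings. It is a simplified sketch, not the MRPrimer implementation: the names `SinglePrimerLimits`, `gc_percent`, `longest_run` and `passes` are hypothetical, "<6-mer" is interpreted here as a longest single-base run of at most 5, and melting temperature, complementarity, hairpin and ΔG checks are deliberately left out.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class SinglePrimerLimits:
    """Subset of the single-primer constraints (hypothetical container)."""
    min_len: int
    max_len: int
    min_gc: float  # percent
    max_gc: float  # percent
    max_run: int   # longest allowed run of one base; "<6-mer" read as 5


# Values copied from the Default and Relaxed columns of the table.
DEFAULT = SinglePrimerLimits(19, 23, 40.0, 60.0, 5)
RELAXED = SinglePrimerLimits(19, 23, 35.0, 65.0, 5)


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a non-empty sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def longest_run(seq: str) -> int:
    """Length of the longest stretch of one repeated base."""
    return max(len(list(group)) for _, group in groupby(seq.upper()))


def passes(seq: str, limits: SinglePrimerLimits) -> bool:
    """True if the primer satisfies the three constraints modeled here."""
    return (
        limits.min_len <= len(seq) <= limits.max_len
        and limits.min_gc <= gc_percent(seq) <= limits.max_gc
        and longest_run(seq) <= limits.max_run
    )
```

For example, a 20-mer with 35% GC content fails `passes(seq, DEFAULT)` but is accepted by `passes(seq, RELAXED)`, which mirrors how relaxing the constraints admits additional primers.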
Statistics of RNA viruses covered by primer pairs in MRPrimerV
| Filtering constraints | Default: number | Default: ratio | Relaxed: number | Relaxed: ratio |
|---|---|---|---|---|
| Total number of non-segmented genomes or segments | 2972 | 100% | 2972 | 100% |
| Number of non-segmented genomes or segments covered by CDS-specific primers | 2944 | 99.1% | 2960 | 99.6% |
| Number of non-segmented genomes or segments covered by both CDS- and virus-specific primers | 2955 | 99.4% | 2963 | 99.7% |
| Total number of RNA viruses | 1818 | 100% | 1818 | 100% |
| Number of viruses covered by MRPrimerV | 1817 | 99.9% | 1818 | 100% |
Figure 2. Distribution of the number of RefSeq sequences for each range of numbers of genome neighbors. The majority of RefSeq sequences (379/532, 71.24%) have fewer than ten genome neighbors, and only 0.03% of RefSeq sequences have more than 1000 genome neighbors.
Statistics of genome neighbors covered by the top-50 primers in MRPrimerV
| | All 532 RefSeq sequences | HIV-1 | Rotavirus C segment 5 |
|---|---|---|---|
| Total number of genome neighbors | 44 653 | 762 | 109 |
| Number of genome neighbors covered by top-50 primers of MRPrimerV | 22 192 | 760 | 79 |
| Ratio (%) | 49.69 | 99.73 | 72.47 |
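The ratio row can be reproduced from the two count rows above it. The short sketch below does that arithmetic; the function name `coverage_percent` is hypothetical, and truncation to two decimals (rather than rounding) is an assumption inferred from the tabulated values, not something the source states.

```python
def coverage_percent(covered: int, total: int) -> float:
    """Return covered/total as a percentage truncated to two decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    # Integer arithmetic avoids floating-point surprises before truncation.
    return (covered * 10_000 // total) / 100


# (covered by top-50 primers, total genome neighbors), copied from the table.
rows = {
    "All 532 RefSeq sequences": (22_192, 44_653),
    "HIV-1": (760, 762),
    "Rotavirus C segment 5": (79, 109),
}

ratios = {name: coverage_percent(c, t) for name, (c, t) in rows.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}%")
```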
Figure 3. Output interface of MRPrimerV. Respiratory syncytial virus (RSV) has 10 coding sequences (CDSs); the output page shows first the top primer pair (penalty score: 5.027) for gene symbol G and then the top primer pair (penalty score: 5.483) for gene symbol NS1. The bottom of the figure shows mumps virus with the list of 78 genome neighbors for gene symbol L. The head of the table provides information including a description of the virus, GenBank accession number (with a link to detailed gene information from GenBank), and organism information. The body of the table provides information about coding genes (gene symbol and gene ID) and primer information. If users click the ‘Validation results’ button, experimental validation data are shown, including specimen information, agarose gel data, qPCR amplification and melting curves, and sequencing data of the qPCR amplicon.
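The ordering described in the Figure 3 caption, where CDSs are listed by the penalty score of their best primer pair, can be sketched as follows. This is only an illustration under stated assumptions: the function `rank_cds_by_best_pair` and the dictionary layout are hypothetical, a lower penalty score is assumed to be better, and the second score in each list is a made-up placeholder rather than data from MRPrimerV.

```python
def rank_cds_by_best_pair(pairs_by_cds: dict[str, list[float]]) -> list[tuple[str, float]]:
    """Order CDSs by the lowest penalty score among their primer pairs."""
    best = {cds: min(scores) for cds, scores in pairs_by_cds.items() if scores}
    return sorted(best.items(), key=lambda item: item[1])


# Best scores (5.027, 5.483) come from the Figure 3 caption;
# the other values are hypothetical placeholders.
rsv = {
    "NS1": [5.483, 6.9],
    "G": [5.027, 7.2],
}

ranked = rank_cds_by_best_pair(rsv)
single_phase = ranked[:1]  # first primer pair only
two_phase = ranked[:2]     # first two primer pairs
print(ranked)
```

With this input, G is listed before NS1, matching the order reported for the RSV output page.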