Linda Zheng, Paul J Wayper, Adrian J Gibbs, Mathieu Fourment, Brendan C Rodoni, Mark J Gibbs.
Abstract
Unknown and foreign viruses can be detected using degenerate primers targeted at conserved sites in known viral gene sequences. Conserved sites are found by comparing sequences, so the usefulness of a set of primers depends crucially on how well the known sequences represent the target group, including unknown sequences.
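To make the idea of a degenerate primer concrete, here is a minimal Python sketch, not the authors' software. It uses the standard IUPAC nucleotide ambiguity codes to count how many distinct sequences a degenerate site stands for and to test whether a given sequence is covered by it. The function names `degeneracy` and `matches` and the test target sequence are illustrative assumptions; the site sequence is the rank 1 NIb site from the table of conserved sites.

```python
# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def degeneracy(primer):
    """Number of distinct sequences a degenerate primer stands for."""
    total = 1
    for code in primer.upper():
        total *= len(IUPAC[code])
    return total

def matches(primer, target):
    """True if an equal-length target is one of the sequences the primer covers."""
    if len(primer) != len(target):
        return False
    return all(base in IUPAC[code]
               for code, base in zip(primer.upper(), target.upper()))

# Rank 1 site (NIb, location 7587) from the table of conserved sites.
site = "TGYGTNGAYGAYYTYAAYAA"
print(degeneracy(site))                        # 256 (six Y positions, one N)
print(matches(site, "TGTGTAGATGATTTTAATAA"))  # True (hypothetical target)
```

A higher degeneracy means the primer covers more variants but is diluted across more individual sequences, which is why well-conserved sites are preferred as targets.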
Year: 2008 PMID: 18270574 PMCID: PMC2217591 DOI: 10.1371/journal.pone.0001586
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Rank and N scores of 17 conserved sites in potyvirus genomes, with their locations, sequences, and the method used to identify each one.
| rank | location | Gene/ORF | nucleotide sequence of sites (5′-3′) | amino acid sequence | identified by | N scores |
| 1 | 7587 | NIb | TGYGTNGAYGAYYTYAAYAA | CVD | Eye/E1/R1/C2/V3 | 0.55 |
| 2 | 9237 | CP | GARRAYACDGARMGNCAYRC | EN/D | R4/C5/V1(9240) | 0.60 |
| 3 | 4545 | CI | RAYATHRTNGARAAYGGNGT | | Nicholas | 0.65 |
| 4 | 4539 | CI | GCNMSNRAYATHRTNGARAAYGG | A | Eye | 0.78 |
| 5 | 9162 | CP | CARATRAARRCNKCNSV | QMKAAA | Langeveld/Pappu | 0.82 |
| 6 | 8278 | NIb | SNATNDTNGADKCNTGGGG | | Eye | 0.84 |
| 7 | 4458 | CI | AARRTNGAYGGNMGNWCNAT | KV/IDGRT/SM | R5 | 0.85 |
| 7 | 7899 | NIb | GKNAAYAAYWSBGGNCARCC | | Gibbs | 0.85 |
| 8 | 9099 | CP | WWHGSNTTYGAYTTHTWHVR | | E3/C3 | 0.90 |
| 9 | 7911 | NIb | GGNCARCCNTCNACNGTNGTNG | G | Eye | 0.95 |
| 10 | 7545 | NIb | TTYACNGCNGCNCCNNTNGRNAC | FTAAP | Eye | 1.00 |
| 10 | 7722 | NIb | GAYGGNWSNMRVYTYGAYWS | DG | E4 | 1.00 |
| 11 | 3888 | CI | GTNGGNWSNGGNAARTSNWC | | Eye | 1.05 |
| 12 | 8853 | CP | WKGRTNHGGKNHHTNGDNAR | | C1/E2/V4/Langeveld/Pappu | 1.10 |
| 13 | 8901 | CP | TGGNHNWTSRTRVAHRRNVR | W | C4/V5 | 1.20 |
| 13 | 9042 | CP | NNNTAYATDSCNVGNTRYGS | | V2 | 1.20 |
| 14 | 4396 | CP | CNRGYYRYRRHGANRTNGA | | Eye | 1.21 |
Sites with equal N scores (see Materials and Methods) were given equal rank (e.g. 7th for the sites at positions 4458 and 7899).
Nucleotide positions refer to the 5′ residue of the site in the sequence of the Tobacco etch virus genome [Accession code M11458].
CI (cylindrical inclusion), a multifunctional protein that acts as an RNA helicase and is involved in genome replication and cell-to-cell movement; NIb (large nuclear inclusion protein), RdRp (RNA-dependent RNA polymerase); CP (coat protein), RNA encapsidation.
Only the most abundant variants, those present in 8 or more potyvirus genomes, are shown. Italicised amino acids have other variants present at the same site in fewer than 8 potyvirus genomes.
Regions identified using the conservation measures implemented in the program NCSF are coded C1 to C5, E1 to E5, R1 to R5, or V1 to V5, where the codes represent the ranking according to the dominant base count (C), Shannon entropy score (E), redundancy score (R) or sub-sequence variants count (V).
R4 and C5 both start from position 9237, whereas V1 starts from position 9240.
Nicholas and Laliberte (1991) identified part of this motif and designed primers that targeted NIIENG. This is part of a conserved region 69 nucleotides long (with one nucleotide gap) that encodes: KKH/FKG/VNNSGQPSTVVDNTLMVV/II. Langeveld et al. (1991) identified part of the region and designed primers that targeted TVVDNTLMV. Gibbs and Mackenzie (1997) designed primers that targeted G/VNNSGQ in this region.
One sequence (Accession code AJ310102) was removed from the analysis of this site because it has a one-codon (3-nucleotide) insertion at position 9165.
Langeveld et al. (1991) identified part of this motif and designed primers that targeted AHFQMKTA. Pappu et al. (1993) modified Langeveld's primer and designed a primer that targeted QMKAAA.
Langeveld et al. (1991) identified this motif and designed primers to target MVCIENG. Pappu et al. (1993) designed primers to target WCIEN.
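Three of the measures named in the footnotes above (dominant base count, Shannon entropy and sub-sequence variants count) can be illustrated on a toy alignment. The Python sketch below is a generic illustration only: the exact definitions used by NCSF are not given in this record, so the per-window averaging and the names `column_stats` and `window_scores` are assumptions. The redundancy score is left out because its definition is not stated here, and the alignment is hypothetical data.

```python
import math
from collections import Counter

def column_stats(column):
    """Return (dominant base count, Shannon entropy in bits) for one alignment column."""
    counts = Counter(column)
    total = len(column)
    dominant = max(counts.values())
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return dominant, entropy

def window_scores(alignment, width):
    """Score every window of `width` columns in an ungapped alignment.

    Returns tuples of (start, mean dominant count, mean entropy, distinct sub-sequences).
    """
    length = len(alignment[0])
    stats = [column_stats("".join(seq[i] for seq in alignment)) for i in range(length)]
    scores = []
    for start in range(length - width + 1):
        window = stats[start:start + width]
        mean_dominant = sum(d for d, _ in window) / width
        mean_entropy = sum(e for _, e in window) / width
        variants = len({seq[start:start + width] for seq in alignment})
        scores.append((start, mean_dominant, mean_entropy, variants))
    return scores

# Hypothetical toy alignment (not data from the paper).
alignment = ["ACGTAC", "ACGTTC", "ACGAGC", "ACGCCC"]
# The most conserved window has the lowest mean entropy.
best = min(window_scores(alignment, 3), key=lambda s: s[2])
print(best)
```

Ranking windows by each measure separately would give lists analogous to the C, E and V codes, and the different measures need not agree on the order.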
Figure 1. The annual release and the accumulated total of all potyvirus sequences in GenBank from 1985 to 2005.
Figure 2. The annual release and the accumulated total of all full length potyvirus sequences in GenBank from 1985 to 2005.
Figure 3. Average nucleotide variant score (N score) of the top 3 and 6 other conserved sites compared with the minimum, maximum and average N scores of all 17 conserved sites in the potyvirus genomes.
Average nucleotide variant counts (N scores) of 17 conserved sites compared to the mean of 20 random sites in all potyvirus genomes from 1985 to 2005.
| conserved sites | 85 | 86 | 87 | 88 | 89 | 90 | 91 | 92 | 93 | 94 | 95 | 96 | 97 | 98 | 99 | 00 | 01 | 02 | 03 | 04 | 05 |
| | 0.00 | 0.00 | 0.00 | 0.20 | 0.20 | 0.20 | 0.20 | 0.30 | 0.40 | 0.40 | 0.40 | 0.40 | 0.45 | 0.45 | 0.45 | 0.45 | 0.50 | 0.50 | 0.50 | 0.50 | 0.55 |
| | 0.00 | 0.00 | 0.00 | 0.25 | 0.35 | 0.35 | 0.40 | 0.50 | 0.50 | 0.50 | 0.55 | 0.60 | 0.60 | 0.60 | 0.60 | 0.60 | 0.60 | 0.60 | 0.60 | 0.60 | 0.60 |
| | 0.00 | 0.00 | 0.00 | 0.10 | 0.25 | 0.25 | 0.30 | 0.45 | 0.45 | 0.50 | 0.50 | 0.50 | 0.50 | 0.50 | 0.50 | 0.50 | 0.55 | 0.55 | 0.65 | 0.65 | 0.65 |
| | 0.00 | 0.00 | 0.00 | 0.09 | 0.26 | 0.26 | 0.30 | 0.43 | 0.43 | 0.43 | 0.57 | 0.57 | 0.57 | 0.57 | 0.61 | 0.65 | 0.70 | 0.70 | 0.78 | 0.78 | 0.78 |
| | 0.00 | 0.00 | 0.00 | 0.18 | 0.29 | 0.29 | 0.29 | 0.35 | 0.35 | 0.47 | 0.53 | 0.53 | 0.53 | 0.53 | 0.53 | 0.65 | 0.71 | 0.71 | 0.76 | 0.76 | 0.82 |
| | 0.00 | 0.00 | 0.00 | 0.32 | 0.42 | 0.42 | 0.47 | 0.53 | 0.68 | 0.74 | 0.74 | 0.74 | 0.74 | 0.74 | 0.74 | 0.74 | 0.79 | 0.84 | 0.84 | 0.84 | 0.84 |
| | 0.00 | 0.00 | 0.00 | 0.10 | 0.35 | 0.35 | 0.45 | 0.60 | 0.75 | 0.75 | 0.80 | 0.85 | 0.85 | 0.85 | 0.85 | 0.85 | 0.85 | 0.85 | 0.85 | 0.85 | 0.85 |
| | 0.00 | 0.00 | 0.00 | 0.20 | 0.30 | 0.30 | 0.35 | 0.45 | 0.45 | 0.45 | 0.50 | 0.55 | 0.55 | 0.55 | 0.70 | 0.70 | 0.70 | 0.75 | 0.75 | 0.75 | 0.85 |
| | 0.00 | 0.00 | 0.00 | 0.15 | 0.30 | 0.30 | 0.35 | 0.45 | 0.45 | 0.45 | 0.60 | 0.75 | 0.85 | 0.85 | 0.90 | 0.90 | 0.90 | 0.90 | 0.90 | 0.90 | 0.90 |
| | 0.00 | 0.00 | 0.00 | 0.18 | 0.36 | 0.36 | 0.41 | 0.64 | 0.68 | 0.68 | 0.86 | 0.86 | 0.86 | 0.86 | 0.86 | 0.86 | 0.86 | 0.86 | 0.86 | 0.86 | 0.95 |
| | 0.00 | 0.00 | 0.00 | 0.09 | 0.30 | 0.30 | 0.35 | 0.48 | 0.70 | 0.74 | 0.83 | 0.87 | 0.87 | 0.91 | 0.91 | 0.96 | 0.96 | 0.96 | 1.00 | 1.00 | 1.00 |
| | 0.00 | 0.00 | 0.00 | 0.25 | 0.30 | 0.30 | 0.40 | 0.45 | 0.50 | 0.50 | 0.60 | 0.65 | 0.70 | 0.70 | 0.80 | 0.85 | 0.85 | 0.85 | 0.85 | 0.90 | 1.00 |
| | 0.00 | 0.00 | 0.00 | 0.15 | 0.40 | 0.40 | 0.50 | 0.65 | 0.70 | 0.70 | 0.75 | 0.90 | 0.90 | 0.90 | 0.90 | 0.95 | 0.95 | 1.00 | 1.00 | 1.00 | 1.05 |
| | 0.00 | 0.00 | 0.00 | 0.10 | 0.25 | 0.25 | 0.25 | 0.45 | 0.45 | 0.50 | 0.55 | 0.75 | 0.75 | 0.75 | 0.75 | 0.75 | 0.85 | 0.90 | 1.00 | 1.00 | 1.10 |
| | 0.00 | 0.00 | 0.00 | 0.10 | 0.20 | 0.20 | 0.40 | 0.50 | 0.65 | 0.65 | 0.80 | 0.80 | 0.85 | 0.85 | 0.85 | 0.85 | 0.85 | 1.15 | 1.20 | 1.20 | 1.20 |
| | 0.00 | 0.00 | 0.00 | 0.20 | 0.30 | 0.30 | 0.55 | 0.70 | 0.70 | 0.80 | 0.80 | 0.90 | 0.95 | 1.05 | 1.05 | 1.05 | 1.05 | 1.15 | 1.15 | 1.15 | 1.20 |
| | 0.00 | 0.00 | 0.00 | 0.11 | 0.26 | 0.26 | 0.32 | 0.47 | 0.47 | 0.53 | 0.58 | 0.63 | 0.63 | 0.68 | 0.74 | 0.74 | 0.79 | 0.84 | 0.95 | 1.11 | 1.21 |
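The yearly N scores above start at zero and never decrease as sequences accumulate. The Python sketch below reproduces that behaviour under an assumed definition: the N score of a site is taken to be the number of nucleotide variants beyond the first observed at each position, averaged over the site's positions. The definition in the paper's Materials and Methods is not reproduced in this record and may differ; the function names and the records are hypothetical.

```python
def n_score(site_sequences):
    """Mean number of additional nucleotide variants per position (assumed definition)."""
    length = len(site_sequences[0])
    extra = 0
    for i in range(length):
        # Distinct nucleotides seen at position i, minus the first one.
        extra += len({seq[i] for seq in site_sequences}) - 1
    return extra / length

def cumulative_n_scores(records):
    """Map each release year to the N score of all sequences released up to that year.

    `records` is an iterable of (year, site_sequence) pairs.
    """
    seen = []
    scores = {}
    for year, seq in sorted(records):
        seen.append(seq)
        scores[year] = n_score(seen)
    return scores

# Hypothetical records for a 4-nucleotide site (not data from the paper).
records = [(1985, "ACGT"), (1988, "ACGA"), (1988, "ACGT"), (1990, "TCGA")]
print(cumulative_n_scores(records))  # {1985: 0.0, 1988: 0.25, 1990: 0.5}
```

Under this assumption a single sequence always scores 0.00, and every score is a multiple of one over the site length.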
Figure 4. Comparison of the maximum and mean N scores for all conserved sites in the genome, representative and species datasets.
Correlations between the N score time series from the genome dataset and the N score time series from other datasets for each site.
| | correlation coefficients between datasets | | | |
| conserved sites | representative | PPV | TuMV | PVY |
| | 0.98 | 0.90 | 0.89 | 0.88 |
| | 1.00 | N/A | 0.65 | 0.58 |
| | 1.00 | 0.80 | 0.86 | 0.92 |
| | 0.98 | 0.92 | 0.87 | 0.62 |
| | 1.00 | 0.86 | 0.80 | 0.80 |
| | 1.00 | 0.92 | 0.67 | 0.83 |
| | 1.00 | 0.80 | 0.58 | 0.73 |
| | 1.00 | 0.76 | 0.33 | 0.65 |
| | 1.00 | 0.89 | 0.61 | 0.50 |
| | 0.99 | 0.93 | 0.71 | 0.93 |
| | 1.00 | 0.75 | 0.78 | 0.52 |
| | 0.99 | 0.76 | 0.71 | 0.92 |
| | 0.98 | 0.42 | 0.72 | 0.84 |
| | 0.98 | 0.89 | 0.89 | 0.92 |
| | 0.97 | 0.76 | 0.89 | 0.84 |
| | 1.00 | 0.84 | 0.35 | 0.80 |
| | 0.99 | 0.83 | 0.72 | 0.91 |
N/A: no variants occurred at this site in the PPV genomes.
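A correlation between two N score time series can be sketched as follows. This record does not say which coefficient was used, so the Pearson coefficient here is an assumption, as is the function name `pearson`. The genome series is the first data row of the yearly N score table; the second series is hypothetical. A series with no variants is constant, which leaves the coefficient undefined and corresponds to the N/A entry.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length series, or None if either is constant."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # undefined for a constant series (e.g. no variants ever seen)
    return cov / math.sqrt(var_x * var_y)

# First data row of the yearly N score table (genome dataset, 1985 to 2005).
genome = [0.00, 0.00, 0.00, 0.20, 0.20, 0.20, 0.20, 0.30, 0.40, 0.40, 0.40,
          0.40, 0.45, 0.45, 0.45, 0.45, 0.50, 0.50, 0.50, 0.50, 0.55]
# Hypothetical species-level series of the same length (not data from the paper).
species = [0.00] * 8 + [0.10] * 6 + [0.20] * 7

print(pearson(genome, species))
print(pearson(genome, [0.0] * len(genome)))  # None
```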