Cheng-Chung Chou, Te-Tsui Lee, Chun-Houh Chen, Hsiang-Yun Hsiao, Yi-Ling Lin, Mei-Shang Ho, Pan-Chyr Yang, Konan Peck.
Abstract
BACKGROUND: Most virus detection methods are geared towards the detection of specific single viruses or just a few known targets, and lack the capability to uncover the novel viruses that cause emerging viral infections. To address this issue, we developed a computational method that identifies the conserved viral sequences at the genus level for all viral genomes available in GenBank, and established a virus probe library. The virus probes are used not only to identify known viruses but also for discerning the genera of emerging or uncharacterized ones.
Year: 2006 PMID: 16643672 PMCID: PMC1523220 DOI: 10.1186/1471-2105-7-232
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Algorithm for designing conserved probes. Schematic illustration of the conserved sequence design method. (A) The sequence segments (I1 to I5) of a given virus v are aligned with the sequence segments of the other five viruses within the same viral genus by the BLASTN program to find the regions with high similarity hits. (B) Graph of the numbers of similarity hits, S, obtained by the procedure illustrated in panel A. The largest number of hits as illustrated is max(S), which equals 4. The two regions with the highest similarity hits are marked by vertical dotted lines. (C) A case study of locating the conserved sequence segments in the coronavirus genus. The genus contains the following 11 fully sequenced viral genomes: NC_002645, human coronavirus 229E; NC_001451, avian infectious bronchitis virus; NC_003436, porcine epidemic diarrhea virus; AF201929, murine hepatitis virus strain 2; AF208067, murine hepatitis virus strain ML-10; AF208066, murine hepatitis virus strain Penn 97-1; NC_001846, murine hepatitis virus strain A59; NC_003045, bovine coronavirus; AF220295, bovine coronavirus strain Quebec; NC_002306, transmissible gastroenteritis virus; and SARS-CoV, SARS coronavirus.
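The hit-counting idea in panels A and B can be illustrated with a minimal Python sketch. It assumes the BLASTN hits have already been parsed into half-open intervals on the genome of virus v, one list per other virus in the genus. The function names, the interval representation, and the toy coordinates are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on the genome of virus v


def hit_profile(genome_length: int, hits: Dict[str, List[Interval]]) -> List[int]:
    """S[p] = number of other viruses with at least one hit covering position p."""
    profile = [0] * genome_length
    for intervals in hits.values():
        # Track coverage per virus so overlapping hits from one virus count once.
        covered = [False] * genome_length
        for start, end in intervals:
            for p in range(max(start, 0), min(end, genome_length)):
                covered[p] = True
        for p, flag in enumerate(covered):
            if flag:
                profile[p] += 1
    return profile


def peak_regions(profile: List[int]) -> Tuple[int, List[Interval]]:
    """Return max(S) and the maximal runs of positions where S equals max(S)."""
    if not profile:
        return 0, []
    peak = max(profile)
    regions: List[Interval] = []
    start = None
    for p, value in enumerate(profile):
        if value == peak and start is None:
            start = p
        elif value != peak and start is not None:
            regions.append((start, p))
            start = None
    if start is not None:
        regions.append((start, len(profile)))
    return peak, regions


# Toy data (hypothetical): hits of five other viruses on a 20-nt genome.
toy_hits = {
    "A": [(2, 8), (14, 17)],
    "B": [(4, 10), (14, 18)],
    "C": [(5, 9), (15, 17)],
    "D": [(5, 7), (15, 19)],
    "E": [(0, 3)],
}
peak, regions = peak_regions(hit_profile(20, toy_hits))
print(peak, regions)
```

With these toy intervals the profile peaks at 4 in two separate regions, mirroring the situation drawn in panel B. The per-position loop favors readability over speed; a sweep over interval endpoints would scale better for real genomes.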
Figure 2. Table 1: Example of searching for the minimum number of probes that generate similarity hits with all the virus members within a viral genus.
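The search named in this caption is an instance of the set-cover problem: pick the fewest probes whose hits together include every virus in the genus. The record does not say how the authors carried out the search, so the following Python sketch is only one possible approach, an exhaustive search by increasing subset size. The function name, the probe and virus labels, and the hit table are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Optional, Set


def minimum_probe_set(hits: Dict[str, Set[str]], viruses: Set[str]) -> Optional[List[str]]:
    """Return a smallest list of probes whose hits cover every virus, or None.

    hits maps a probe name to the set of viruses it generates similarity hits with.
    Subsets are tried in order of increasing size, so the first cover found is minimal.
    """
    probes = sorted(hits)
    for size in range(len(probes) + 1):
        for subset in combinations(probes, size):
            covered: Set[str] = set()
            for probe in subset:
                covered |= hits[probe]
            if viruses <= covered:
                return list(subset)
    return None  # no combination of probes covers all viruses


# Toy hit table (hypothetical): five probes, four viruses.
toy_hits = {
    "p1": {"v1", "v2"},
    "p2": {"v2", "v3"},
    "p3": {"v3", "v4"},
    "p4": {"v1", "v4"},
    "p5": {"v1", "v2", "v3"},
}
print(minimum_probe_set(toy_hits, {"v1", "v2", "v3", "v4"}))
```

Exhaustive search guarantees a minimum but its cost grows exponentially with the number of candidate probes, so it is only practical for small genera; a greedy heuristic would be faster but does not always return the smallest set.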
Figure 3. All of the fully sequenced viral genomes (except that of SARS-CoV) in the coronavirus genus were used to determine the conserved sequences. The design method was validated by testing if the computed conserved sequences could successfully detect the novel SARS-CoV. (A) The redundant conserved segments as computed by steps 1 to 4 of the probe design algorithm were reduced to five nonredundant sequence groups by the BLASTN sequence alignment in step 5 of the probe design algorithm. The longest sequence (underlined sequence) in each group was used to represent the group. (B) BLASTN results for the comparison of the group 1 conserved sequence with the SARS-CoV genome. The BLASTN alignment outputs are shown in the figure. The sequence alignment with the SARS-CoV genome shows that the group 1 conserved sequence is located between nucleotides 15732 and 15829 of the SARS-CoV genome. (C) Larger-scale cross-validation of 14 viral genera consisting of 333 nonredundant sets of viral genomes.
Figure 4. Viral genus classification and specific virus identification. Hybridization and identification results for seven test viral samples. The probes eligible for identity determination were prescreened using the hybridization intensity criteria described in the Methods section. (A) Viral genus determination for the seven viral samples, all of which belong to three viral genera. The genus signal strength is defined in the Methods section. (B) Determination of the identities of specific viruses. The numbers on the Y-axis are the ID numbers of the virus targets as listed in Additional File 2. Note that there are 46 species-specific probes for 26 viruses in the flavivirus and coronavirus genera, and that there are no enterovirus-specific probes.