
Design of a set of probes with high potential for influenza virus epidemiological surveillance.

Luis R Carreño-Durán, V Larios-Serrato, Hueman Jaimes-Díaz, Hilda Pérez-Cervantes, Héctor Zepeda-López, Carlos Javier Sánchez-Vallejo, Gabriela Edith Olguín-Ruiz, Rogelio Maldonado-Rodríguez, Alfonso Méndez-Tenorio.

Abstract

An Influenza Probe Set (IPS) consisting of 1,249 9-mer probes for genomic fingerprinting of closely and distantly related influenza virus strains was designed and tested in silico. The IPS was derived from alignments of influenza genomes. The RNA segments of 5,133 influenza strains with diverse degrees of relatedness were concatenated and aligned. After alignment, 9-mer sites with high Shannon entropy were searched for. Additional criteria were applied to select probes with high sequence entropy: G+C content between 35 and 65%, absence of consecutive dimer or trimer repeats, a minimum of 2 differences between 9-mers, and Tm values between 34.5 and 36.5 °C. Virtual Hybridization was used to predict genomic fingerprints and to assess the capability of the IPS to discriminate between influenza and related strains. Distance scores between pairs of influenza genomic fingerprints were calculated and used to estimate taxonomic trees. Visual examination of both the genomic fingerprints and the taxonomic trees suggests that the IPS is able to discriminate between distant and closely related influenza strains. It is proposed that the IPS can be used to investigate, by virtual or experimental hybridization, any new and potentially virulent strain.


Keywords:  IPS; Influenza virus; Microarray; Shannon Entropy; Virtual Hybridization; fingerprinting

Year:  2013        PMID: 23750091      PMCID: PMC3670124          DOI: 10.6026/97320630009414

Source DB:  PubMed          Journal:  Bioinformation        ISSN: 0973-2063


Background

Influenza viruses belong to the family Orthomyxoviridae and possess segmented genomes consisting of seven or eight separate RNA molecules, each coding for one or more viral proteins. The viruses can exchange segments, producing a diversity of reassortant strains. Together with the accumulation of point mutations, segment reassortment is the basis for the evolution and maintenance of diversity of these viruses. It allows them to adapt rapidly to the pressure of the host immune system and leads to the continuous emergence of new virus variants that cause seasonal and pandemic outbreaks of influenza. Because of this ability, segmented viruses can exist in numerous genotypes and serotypes, presenting a challenge to the creation of protective vaccines and detection methods [1, 2]. For these reasons, early detection and diagnostic confirmation of influenza virus infections are fundamental for appropriate control of the disease. Several molecular biology techniques, most of them based on PCR amplification, have contributed to the diagnosis of the different types and subtypes of influenza virus. However, PCR techniques are frequently unable to detect new, potentially virulent strains. Other techniques such as sequencing can identify such strains precisely but are still not widely available for routine diagnosis [3, 4]. The design of a microarray is complicated when genomic structures are similar, and probe selection is further complicated when the number of known sequences is very large; in that case the probe selection strategy becomes critical [5]. There are several methods [6-12] for the selection of specific probes for influenza virus detection. Direct search for probes based on traditional computational methods is labor-intensive and time-consuming.
Shannon entropy (H) is a measure that has been used in bioinformatics to classify influenza viruses, to analyze the evolution of influenza [13], to facilitate the development of an anti-influenza vaccine [14], and to profile areas of high variation, revealing characteristic patterns for each subtype [15]. In the present approach we designed and tested in silico an Influenza Probe Set (IPS) consisting of 1,249 9-mer probes, extracted from the zones of maximum entropy in a sequence alignment of the full viral genomes of over 5,000 reported viruses, covering almost all subtypes of Influenza A. Using Virtual Hybridization (VH), in silico genomic fingerprints were generated and then compared to estimate a phylogeny based on pairwise fingerprint distances. Other studies have used VH to create genomic fingerprints for the in silico classification of microorganisms such as human papillomaviruses [16] and bacterial genomes [17].

Methodology

Shannon entropy is a measure of the lack of predictability of an element [19], such as a given base, at a particular position of an alignment. Highly variable columns in an alignment yield maximum entropy values.
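As an illustration of the column-wise entropy described above, the following is a minimal Python sketch rather than the authors' Java program. The function names, the toy alignment, and the decisions to ignore gap characters and to read T as U are assumptions made for this example.

```python
import math
from collections import Counter

def column_entropy(column):
    """Shannon entropy (natural log) of one alignment column.

    Assumption for this sketch: gaps and ambiguous symbols are ignored,
    and T is read as U so that DNA-style records can also be used.
    """
    bases = [b for b in column.upper().replace("T", "U") if b in "ACGU"]
    if not bases:
        return 0.0
    total = len(bases)
    counts = Counter(bases)
    return -sum((c / total) * math.log(c / total) for c in counts.values())

def alignment_entropy(aligned_seqs):
    """Entropy of every column of a list of equal-length aligned sequences."""
    return [column_entropy("".join(col)) for col in zip(*aligned_seqs)]

# Toy alignment (hypothetical data): column 3 contains all four bases.
alignment = ["AUGCA", "AUGGA", "AUCUA", "AUAAA"]
profile = alignment_entropy(alignment)
most_variable = max(range(len(profile)), key=profile.__getitem__)
```

A fully conserved column gives an entropy of 0, and a column in which the four bases are equally frequent gives the maximum, ln 4 (about 1.386).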

SearchProbe:

This program, developed in Java, calculates the Shannon entropy of aligned sequences. It finds the positions of maximum entropy and then selects 9-mer sequences (the size can be modified by the user), using each point of maximum entropy as the center of the 9-mer. The equation used by SearchProbe to calculate the Shannon entropy is $H(n) = -\sum_{i} f(i,n) \ln f(i,n)$, where H(n) is the entropy at position n, i represents a residue (here there are only four options: A, C, G and U) and f(i, n) is the frequency of residue i at position n. The information content at position n is then defined as the decrease in uncertainty, or entropy, at that position. In our case, SearchProbe seeks regions with maximum entropy values [18].

CalcProbes:

This Perl script refines the probe search using the 9-mer sequences provided by SearchProbe. The sequences are subjected to the following restrictions: i) select only sequences with 35-65% G+C (4 or 5 G or C bases per 9-mer), ii) eliminate 9-mers with tandem repeats of 2 or 3 nucleotides, iii) select sequences with a minimum of 2 differences between them and iv) choose 9-mer sequences with Tm values of 34 to 36 °C. Tm values were calculated with the thermodynamic Nearest-Neighbor (NN) model using SantaLucia parameters [19]. The final set of 1,249 9-mer probes selected by this procedure is the IPS (Influenza Probe Set).
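The sequence-based restrictions i) to iii) can be sketched in Python as follows. This is an illustrative approximation, not the CalcProbes Perl script: the function names and example 9-mers are hypothetical, "tandem repeat" is interpreted here as a 2- or 3-nucleotide unit immediately followed by a copy of itself, and the "minimum of 2 differences" is implemented as a greedy Hamming-distance check. The Tm restriction iv) is left out because it requires the nearest-neighbor parameter table.

```python
import re

# Assumed reading of "tandem repeat": a 2- or 3-nt unit directly repeated.
TANDEM = re.compile(r"(..)\1|(...)\2")

def gc_ok(probe, low=0.35, high=0.65):
    """True when the G+C fraction lies within [low, high]."""
    gc = (probe.count("G") + probe.count("C")) / len(probe)
    return low <= gc <= high

def has_tandem_repeat(probe):
    """True when the probe contains a directly repeated 2- or 3-mer."""
    return TANDEM.search(probe) is not None

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def filter_probes(candidates, min_diff=2):
    """Keep candidates passing the G+C and repeat filters that also differ
    from every previously kept probe in at least min_diff positions."""
    selected = []
    for probe in candidates:
        if not gc_ok(probe) or has_tandem_repeat(probe):
            continue
        if all(hamming(probe, kept) >= min_diff for kept in selected):
            selected.append(probe)
    return selected

# Hypothetical candidates, in the order SearchProbe might deliver them.
candidates = ["ACGUACGUA", "ACGUACGUC", "AAAAUUUUA", "GCACACUGA", "GGCAUCUGA"]
kept = filter_probes(candidates)
```

In this toy run the second candidate is dropped for being one substitution away from the first, the third for its G+C content, and the fourth for the repeated dinucleotide CACA. Because the selection is greedy, the result depends on the order of the candidates.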

Virtual Hybridization (VH):

Virtual Hybridization is a computer program able to predict perfect and mismatched target/probe hybridizations under a selected Tm cutoff value. The stability of target/probe duplexes is calculated with the NN model. This program was used to determine all the hybridizations occurring between each Influenza virus genome, or control strain, and the IPS. The group of hybridization signals produced by each viral genome corresponds to its particular fingerprint [20].
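The idea of scanning a genome for perfect and mismatched probe sites can be sketched as below. This is a simplified stand-in for the VH program: it counts mismatches against the probe's reverse complement instead of computing duplex stability with the NN model, and the function names, the `max_mismatches` parameter and the toy sequences are assumptions made for the example.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")

def reverse_complement(seq):
    """Reverse complement of an RNA sequence written with A, C, G, U."""
    return seq.translate(COMPLEMENT)[::-1]

def hybridization_sites(genome, probe, max_mismatches=1):
    """List (start, mismatches) for every genome window that pairs with the
    probe with at most max_mismatches mismatches.

    Assumption: a mismatch count replaces the thermodynamic cutoff used by
    the real program.
    """
    site = reverse_complement(probe)
    k = len(site)
    hits = []
    for start in range(len(genome) - k + 1):
        window = genome[start:start + k]
        mismatches = sum(a != b for a, b in zip(window, site))
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits

def fingerprint(genome, probes, max_mismatches=1):
    """Set of probe indices giving at least one hybridization signal."""
    return {i for i, p in enumerate(probes)
            if hybridization_sites(genome, p, max_mismatches)}

# Toy data (hypothetical): the first probe pairs perfectly at position 2.
genome = "AAGGCAUCUGAUU"
probes = [reverse_complement("GGCAUCUGA"), "AAAAAAAAA"]
sites = hybridization_sites(genome, probes[0])
fp = fingerprint(genome, probes)
```

The returned set plays the role of the fingerprint: one entry per probe that finds at least one site in the target.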

Genomic Fingerprinting Analysis with UFA software:

Universal Fingerprinting Analysis (UFA) software transforms the genomic fingerprints produced by Virtual Hybridization, under any chosen stringency condition, into images. It also allows visual comparison of any selected pair of fingerprints, producing spots with specific colors for distinctive and for shared hybridization signals. In addition, this tool calculates pairwise distances between genomic fingerprints. From a table of such distances, taxonomic trees were built using the Neighbor-Joining method with the program MEGA 5 [21].
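A pairwise comparison of fingerprints can be illustrated with the short Python sketch below. The distance actually used by UFA is not specified here, so the Jaccard distance is swapped in purely as an assumption. The function names and the toy fingerprints are hypothetical, and tree building (Neighbor-Joining in MEGA 5) is not reproduced.

```python
def compare(fp_a, fp_b):
    """Split two fingerprints (sets of probe identifiers) into shared and
    strain-specific signals, as in an overlapped two-color image."""
    return {
        "shared": fp_a & fp_b,
        "only_a": fp_a - fp_b,
        "only_b": fp_b - fp_a,
    }

def jaccard_distance(fp_a, fp_b):
    """1 - |intersection| / |union|; an assumed stand-in for UFA's score."""
    union = fp_a | fp_b
    if not union:
        return 0.0
    return 1.0 - len(fp_a & fp_b) / len(union)

def distance_matrix(fingerprints):
    """Square table of pairwise distances, rows and columns sorted by name."""
    names = sorted(fingerprints)
    table = [[jaccard_distance(fingerprints[a], fingerprints[b]) for b in names]
             for a in names]
    return names, table

# Toy fingerprints (hypothetical probe indices).
fingerprints = {
    "strainA": {1, 2, 3, 4},
    "strainB": {3, 4, 5},
    "strainC": {9},
}
overlap = compare(fingerprints["strainA"], fingerprints["strainB"])
names, table = distance_matrix(fingerprints)
```

A table of this kind is the sort of input a Neighbor-Joining program expects. Strains sharing most signals get distances near 0, and strains sharing none get 1.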

Distinction of Influenza strains with the IPS:

Two types of analysis were performed: I) A taxonomic tree, based on distances between IPS genomic Virtual Hybridization fingerprints, was built to compare several types of influenza virus and other viruses. II) Overlapped images were made from selected pairs of genomic fingerprints for strains with a low, medium or high degree of relatedness. Influenza A/mallard duck/New York/170/1982(H1N2) and Influenza A/Mexico/InDRE4487/200 were used as references.

Results & Discussion

In the first step, an average of 550,500 non-unique probe sequences were selected from the alignment. The probe sequences were then clustered to remove repeats and to keep only those with entropy above a convenient threshold (SearchProbe). CalcProbes then applied the design parameters explained in the Methodology. Finally, a third selection removed the probes with the lowest entropy values and kept probes with Tm values between 34.5 and 36.5 °C and free energy values between -9.00 and -13.5 kcal/mol.

Virtual Hybridization:

A database of the target viral genomes used for the in silico experiments was created. The VH program conducts a rigorous analysis to find and track all the sites in each viral genome where the probe sequences can hybridize, taking into account the degree of complementarity between the probe and the recognized site in the target (allowing at least one mismatch) and the thermodynamic stability of the duplex. The generated information constitutes an in silico genomic fingerprint. It lists the specific sites in each target where hybridization occurred, the number and sequence of the probe that hybridized, the free energy value of each hybridization and the sequence of the target site recognized by each probe. A free energy cutoff value of -9 kcal/mol was used for the 9-mer probes.

Evaluation of the genomic fingerprint:

To analyze how well our set of probes distinguishes between viral sequences, the in silico evaluation was divided into several steps. A) Viral Family Test: two kinds of trees were calculated, one derived from the alignment of the concatenated fragments, calculated with Clustal X 2.0, and the other from the comparison of genomic fingerprints obtained with the IPS and VH. Both trees are derived from 12 different viral genome families, including Paramyxoviridae, Coronaviridae, Picornaviridae and Adenoviridae, which cause respiratory symptoms similar to influenza, and several viruses of the family Orthomyxoviridae: influenza B, influenza C, influenza A (H1N1, H1N2, H3N2), Thogotovirus and Isavirus (Figure 1).
Figure 1

Taxonomic trees of 12 viral families including Paramyxoviridae, Orthomyxoviridae, Coronaviridae, Picornaviridae, Adenoviridae, Influenza A (H1N1, H1N2, H3N2), B and C, and two other orthomyxoviruses, Thogotovirus and Isavirus (in red). (A) Fingerprinting tree; (B) alignment tree. All the Influenza A virus subtypes clustered into a single group.

Comparison of the two trees shows a close correspondence. The tree from genomic fingerprints groups influenza A (H1N1, H1N2 and H3N2) on one branch, influenza B, C and Isavirus on another branch, and the other viral families in other clusters. It is noteworthy that Human respiratory syncytial virus (a paramyxovirus) was grouped with Human rhinovirus B, together with Thogotovirus, in the tree based on genomic fingerprints. Thogotovirus has been classified as a member of the family Orthomyxoviridae, although comparisons of Orthomyxoviridae PB1 proteins in other studies show low percentages of amino acid identity with influenza viruses and Isavirus [22, 23]. Likewise, influenza virus types A, B and C yielded characteristic patterns, so the IPS probes produced a distinctive fingerprint for each virus (Figure 2).
Figure 2

A) Genomic fingerprints of different influenza viruses and other viral families, using Influenza A virus A/mallard duck/New York/170/1982(H1N2) as reference (in red) and comparing it with Infectious salmon anemia virus (Isavirus), Thogotovirus, Human respiratory syncytial virus (Paramyxoviridae), Human rhinovirus B (Picornaviridae), SARS coronavirus (Coronaviridae) and Human adenovirus D (Adenoviridae) (in green); and genomic fingerprints of different types of influenza virus, using the same reference (in red) and comparing it with Influenza B virus B/Mexico/84/2000 and Influenza C virus C/Ann Arbor/1/50 (in green). B) Genomic fingerprints of different influenza viruses, using A/New York/18/2006/H1N1 as reference (in red) and A/Swine/Wisconsin/1915/1988/H1N1 (in green). C) Genomic fingerprints of different influenza viruses, using A/Mexico/InDRE4487/2009(H1N1) as reference (in red) and A/California/04/2009 H1N1 (in green).

B) Hybridization of viral genomes of the same subtype was carried out on two Influenza A H1N1 viruses that infect different hosts (human and swine), to check whether the IPS is able to generate distinctive genomic fingerprints. The virus A/New York/18/2006/H1N1 causes seasonal influenza, whereas the virus A/Swine/Wisconsin/1915/1988/H1N1 infects swine. The IPS probes generated a specific genomic fingerprint for each one. This result is relevant because it shows that the IPS is capable of appropriate identification when outbreaks in humans are caused by strains from animals such as pigs (Figure 2).

C) Comparison of genomic fingerprints of two genomes with very high similarity. Overlapping genomic fingerprints of the Influenza A H1N1 viruses A/Mexico/InDRE4487/2009 and A/California/04/2009, from the 2009 pandemic, are shown in Figure 2. Both viruses are clearly very similar, with only minor mutations, as expected for viruses from the same outbreak. However, the IPS genomic fingerprints show seven differences between them, with five probes specific for the A/Mexico/InDRE4487/2009 H1N1 virus and two for A/California/04/2009. This is important for molecular studies of influenza because the IPS is sensitive enough to separate even very closely related viruses; this will help in the management of influenza epidemiology without depending on previous sequencing.

Conclusions

Following the established parameters, the set of 1,249 highly specific probes (IPS) allowed correct typing and subtyping of influenza viruses, including human and animal strains as well as very similar strains. The IPS design, based on the construction of probes from regions of the viral genome with maximum entropy, allows highly sensitive discrimination. Through in silico hybridization, the performance of the IPS microarray was simulated, which allowed us to anticipate the behavior of the probes and to predict the genomic fingerprints of these viruses. The prediction is based on experimentally supported thermodynamic models, which suggests that the IPS microarray would be a valuable tool for influenza diagnosis.
References

1.  In silico evaluation of a novel DNA chip based fingerprinting technology for viral identification.

Authors:  Alfonso Méndez-Tenorio; Perla Flores-Cortés; Armando Guerra-Trejo; Hueman Jaimes-Díaz; Emma Reyes-Rosales; Arcadio Maldonado-Rodríguez; Mercedes Espinosa-Lara; Rogelio Maldonado-Rodríguez; Loren Beattie Kenneth
Journal:  Rev Latinoam Microbiol       Date:  2006 Apr-Jun

2.  MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods.

Authors:  Koichiro Tamura; Daniel Peterson; Nicholas Peterson; Glen Stecher; Masatoshi Nei; Sudhir Kumar
Journal:  Mol Biol Evol       Date:  2011-05-04       Impact factor: 16.240

3.  Bioinformatic analysis of the genome of infectious salmon anemia virus associated with outbreaks with high mortality in Chile.

Authors:  L Cottet; M Cortez-San Martin; M Tello; E Olivares; A Rivas-Aravena; E Vallejos; A M Sandino; E Spencer
Journal:  J Virol       Date:  2010-09-01       Impact factor: 5.103

4.  Molecular detection and identification of influenza viruses by oligonucleotide microarray hybridization.

Authors:  Srikumar Sengupta; Kenji Onodera; Alexander Lai; Ulrich Melcher
Journal:  J Clin Microbiol       Date:  2003-10       Impact factor: 5.948

5.  Simultaneous detection and high-throughput identification of a panel of RNA viruses causing respiratory tract infections.

Authors:  Haijing Li; Melinda A McCormac; R Wray Estes; Susan E Sefers; Ryan K Dare; James D Chappell; Dean D Erdman; Peter F Wright; Yi-Wei Tang
Journal:  J Clin Microbiol       Date:  2007-05-16       Impact factor: 5.948

6.  Universal oligonucleotide microarray for sub-typing of Influenza A virus.

Authors:  Vladimir A Ryabinin; Elena V Kostina; Galiya A Maksakova; Alexander A Neverov; Konstantin M Chumakov; Alexander N Sinyakov
Journal:  PLoS One       Date:  2011-04-29       Impact factor: 3.240

7.  Universal fingerprinting chip server.

Authors:  Janet Casique-Almazán; Violeta Larios-Serrato; Gabriela Edith Olguín-Ruíz; Carlos Javier Sánchez-Vallejo; Rogelio Maldonado-Rodríguez; Alfonso Méndez-Tenorio
Journal:  Bioinformation       Date:  2012-06-28

8.  The role of genomics in tracking the evolution of influenza A virus.

Authors:  Alice Carolyn McHardy; Ben Adams
Journal:  PLoS Pathog       Date:  2009-10-26       Impact factor: 6.823

9.  Evolutionarily conserved protein sequences of influenza A viruses, avian and human, as vaccine targets.

Authors:  A T Heiny; Olivo Miotto; Kellathur N Srinivasan; Asif M Khan; G L Zhang; Vladimir Brusic; Tin Wee Tan; J Thomas August
Journal:  PLoS One       Date:  2007-11-21       Impact factor: 3.240

10.  Oligonucleotide microchip for subtyping of influenza A virus.

Authors:  Eugeny E Fesenko; Dmitry E Kireyev; Dmitry A Gryadunov; Vladimir M Mikhailovich; Tatyana V Grebennikova; Dmitry K L'vov; Alexander S Zasedatelev
Journal:  Influenza Other Respir Viruses       Date:  2007-05       Impact factor: 4.380
