BACKGROUND: Although the majority of bacteria are innocuous or even beneficial for their host, others are highly infectious pathogens that can cause widespread and deadly diseases. When investigating the relationships between bacteria and other living organisms, it is therefore essential to be able to separate pathogenic organisms from non-pathogenic ones. Using traditional experimental methods for this purpose can be very costly and time-consuming, and also uncertain, since animal models are not always good predictors of pathogenicity in humans. Bioinformatics-based methods are therefore strongly needed to mine the fast-growing number of genome sequences and to assess the pathogenicity of novel bacteria rapidly and reliably.

METHODOLOGY/PRINCIPAL FINDINGS: We describe a new in silico method for the prediction of bacterial pathogenicity, based on the identification in microbial genomes of features that appear to correlate with virulence. The method does not rely on identifying genes known to be involved in pathogenicity (for instance virulence factors); rather, it builds families of proteins that, irrespective of their function, are consistently present in only one of the two kinds of organisms, pathogens or non-pathogens. Whether a new bacterium carries proteins contained in these families determines its prediction as pathogenic or non-pathogenic. Applied to a set of known genomes, the method correctly classified the virulence potential of 86% of the organisms tested. An additional validation on an independent test set correctly assigned 22 out of 24 bacteria.

CONCLUSIONS: The proposed approach was demonstrated to go beyond the species bias imposed by evolutionary relatedness, and performs better than predictors based solely on taxonomy or sequence similarity. A set of protein families that differentiate pathogenic and non-pathogenic strains was identified, including families of as yet uncharacterized proteins that are suggested to be involved in bacterial pathogenicity.
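
The classification idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each genome has already been reduced to a set of protein-family identifiers, and the function names, the `min_fraction` threshold, the simple count-comparison rule, and the toy family IDs are all hypothetical choices made for illustration.

```python
from collections import Counter


def discriminating_families(genomes, labels, min_fraction=0.5):
    """Find families seen in only one class of training genomes.

    genomes: list of sets of protein-family IDs (one set per genome).
    labels:  list of bools, True for pathogen, False for non-pathogen.
    min_fraction: hypothetical threshold, the minimum share of genomes
                  in a class that must carry a family for it to count.
    """
    n_path = sum(labels)
    n_non = len(labels) - n_path
    path_counts, non_counts = Counter(), Counter()
    for families, is_pathogen in zip(genomes, labels):
        # Each genome contributes at most one count per family.
        (path_counts if is_pathogen else non_counts).update(families)

    pathogen_only = {
        f for f, c in path_counts.items()
        if non_counts[f] == 0 and c >= min_fraction * n_path
    }
    nonpathogen_only = {
        f for f, c in non_counts.items()
        if path_counts[f] == 0 and c >= min_fraction * n_non
    }
    return pathogen_only, nonpathogen_only


def predict(families, pathogen_only, nonpathogen_only):
    """Classify a genome by which class-specific families it carries."""
    p = len(families & pathogen_only)
    n = len(families & nonpathogen_only)
    if p == n:
        return "undetermined"  # no evidence, or evidence is balanced
    return "pathogenic" if p > n else "non-pathogenic"


# Toy training data (made-up family IDs, not real protein families).
training = [
    {"F1", "F2", "F3"},
    {"F1", "F2", "F4"},
    {"F3", "F5"},
    {"F4", "F5", "F6"},
]
labels = [True, True, False, False]

pathogen_only, nonpathogen_only = discriminating_families(training, labels)
print(predict({"F2", "F7"}, pathogen_only, nonpathogen_only))
print(predict({"F3", "F5"}, pathogen_only, nonpathogen_only))
```

In this toy run, families shared by both classes (`F3`, `F4`) are discarded, so only class-specific families influence the call. A real pipeline would also need a protein clustering step to define the families and a way to control for closely related genomes, neither of which is shown here.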