Legana C H W Fingerhut1,2,3, David J Miller1,2,3, Jan M Strugnell4, Norelle L Daly1,5, Ira R Cooke1,2.
1. Department of Molecular and Cell Biology, Centre for Tropical Bioinformatics and Molecular Biology, Townsville, Qld 4811, Australia.
2. Department of Molecular and Cell Biology, Townsville, Qld 4811, Australia.
3. ARC Centre of Excellence for Coral Reef Studies, James Cook University, Townsville, Qld 4811, Australia.
4. Centre for Sustainable Tropical Fisheries and Aquaculture, College of Science and Engineering, James Cook University, Townsville, Qld 4811, Australia.
5. Centre for Molecular Therapeutics, Australian Institute of Tropical Health and Medicine, James Cook University, Cairns, Qld 4870, Australia.
Abstract
SUMMARY: Antimicrobial peptides (AMPs) are key components of the innate immune system that protect against pathogens and regulate the microbiome, and they are promising targets for pharmaceutical research. Computational tools based on machine learning have the potential to aid the discovery of genes encoding novel AMPs, but existing approaches are not designed for genome-wide scans. To facilitate genome-wide discovery of AMPs, we developed a fast and accurate AMP classification framework, ampir. ampir is designed for high throughput, integrates well with existing bioinformatics pipelines, and has much higher classification accuracy than existing methods when applied to whole-genome data.
AVAILABILITY AND IMPLEMENTATION: ampir is implemented primarily in R, with core feature calculation methods written in C++. Release versions are available via CRAN and work on all major operating systems. The development version is maintained at https://github.com/legana/ampir.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.