Anton E Shikov, Yury V Malovichko, Rostislav K Skitchenko, Anton A Nizhnikov, Kirill S Antonets.
Abstract
Bacillus thuringiensis (Bt) is a natural pathogen of insects and some other groups of invertebrates that produces three-domain Cry (3d-Cry) toxins, which are highly host-specific pesticidal proteins. These proteins are the most commonly used bioinsecticides in the world and are sold commercially on the insecticide market, in line with the paradigm of sustainable growth and ecological development. Emerging resistance to known toxins in pests stresses the need to expand the list of known toxins and thereby broaden the range of insecticidal approaches. For this purpose, we have developed a fast and user-friendly tool called CryProcessor, which allows productive and precise mining of 3d-Cry toxins. The only existing tool for mining Cry toxins, BtToxin_scanner, has significant limitations such as limited query size, lack of accuracy and an outdated database. To address these problems, we have developed a robust pipeline capable of precise 3d-Cry toxin mining. A unique feature of the pipeline is the ability to search for Cry toxin sequences directly on assembly graphs, which makes it possible to analyze raw sequencing data and overcomes the problem of fragmented assemblies. Moreover, CryProcessor can precisely predict the domain layout in arbitrary sequences, allowing the retrieval of sequences of specific domains beyond the limited number of toxins presented in CryGetter. Our algorithm has proved efficient in all its work modes and outperformed its analogues on large amounts of data. Here, we describe its main features and provide information on its benchmarking against existing analogues. CryProcessor is a novel, fast, convenient, open source (https://github.com/lab7arriam/cry_processor), platform-independent, and precise instrument with a console version and an elaborate web interface (https://lab7.arriam.ru/tools/cry_processor). Its major merits could make it possible to carry out massive screening for novel 3d-Cry toxins and obtain sequences of specific domains for further comprehensive in silico experiments in constructing artificial toxins.
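To make the notion of a "domain layout" more concrete, the following minimal Python sketch shows how the sequences of individual domains could be cut out of a protein once domain boundaries are known. It is an illustration only and does not reproduce CryProcessor's code or interface: the function name `extract_domains`, the labels `domain1` to `domain3`, and the 0-based half-open coordinate convention are all assumptions made for this example.

```python
from typing import Dict, Tuple

# Hypothetical labels for the three domains, listed in N-to-C terminal order.
DOMAIN_ORDER = ("domain1", "domain2", "domain3")


def extract_domains(seq: str, coords: Dict[str, Tuple[int, int]]) -> Dict[str, str]:
    """Return the subsequence of each domain.

    coords maps a domain label to (start, end), 0-based and half-open.
    A ValueError is raised if a domain is missing, or if the domains
    overlap, are out of order, or fall outside the sequence.
    """
    missing = [name for name in DOMAIN_ORDER if name not in coords]
    if missing:
        raise ValueError(f"missing domains: {missing}")

    prev_end = 0
    domains = {}
    for name in DOMAIN_ORDER:
        start, end = coords[name]
        # Each domain must start at or after the end of the previous one.
        if not (prev_end <= start < end <= len(seq)):
            raise ValueError(f"inconsistent coordinates for {name}: {(start, end)}")
        domains[name] = seq[start:end]
        prev_end = end
    return domains


# Toy sequence and made-up coordinates, not real toxin data.
toy_seq = "M" + "A" * 10 + "C" * 10 + "D" * 10 + "K"
toy_coords = {"domain1": (1, 11), "domain2": (11, 21), "domain3": (21, 31)}
toy_domains = extract_domains(toy_seq, toy_coords)
print(toy_domains)
```

Rejecting overlapping or out-of-order coordinates is a design choice of this sketch; it reflects the idea that a complete three-domain toxin carries all three domains in a fixed order.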
Keywords: Bacillus thuringiensis; Bt; Cry toxins; CryProcessor; biopesticide; insecticide; pathogen
Year: 2020 PMID: 32210056 PMCID: PMC7150774 DOI: 10.3390/toxins12030204
Source DB: PubMed Journal: Toxins (Basel) ISSN: 2072-6651 Impact factor: 4.546
Benchmarking of CryProcessor and BtToxin_scanner on the protein sequences of 511 Bt assemblies in FASTA format.
| Tool | System Time | User Time | Real Time | Number of Toxins Found |
|---|---|---|---|---|
| CryProcessor | 67.20 s | 42611.29 s | 2902.15 s | 602 (3d-Cry) |
| CryProcessor | 39.23 s | 22186.64 s | 1379.77 s | 590 (3d-Cry) |
| BtToxin_scanner | 241.79 s | 27007.7 s | 15822.07 s | 419 (3d-Cry) |
The test dataset comprised all Bt entries from NCBI Assembly (511 genomes, 2,810,060 FASTA protein entries). * Only 128 of these proteins were marked as new after running CryProcessor on the BtToxin_scanner output.
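The timings in this table can be compared directly with a few lines of arithmetic. The Python sketch below copies the reported values and derives the wall-clock speedup of each CryProcessor run over BtToxin_scanner, the ratio of user CPU time to real time (only a rough indicator of how many cores were kept busy), and the difference in the number of 3d-Cry toxins found. The labels "row 1" and "row 2" are ours, since the two CryProcessor rows are not distinguished in the table as reproduced here.

```python
# Values copied from the table above; all times are in seconds.
runs = {
    "CryProcessor (row 1)": {"user": 42611.29, "real": 2902.15, "toxins": 602},
    "CryProcessor (row 2)": {"user": 22186.64, "real": 1379.77, "toxins": 590},
    "BtToxin_scanner": {"user": 27007.70, "real": 15822.07, "toxins": 419},
}

baseline = runs["BtToxin_scanner"]

summary = {}
for name, run in runs.items():
    summary[name] = {
        # How many times faster than BtToxin_scanner in wall-clock time.
        "speedup": baseline["real"] / run["real"],
        # User CPU seconds per wall-clock second (rough core utilisation).
        "cpu_per_real": run["user"] / run["real"],
        # Extra 3d-Cry toxins reported relative to BtToxin_scanner.
        "extra_toxins": run["toxins"] - baseline["toxins"],
    }

for name, stats in summary.items():
    print(
        f"{name}: speedup x{stats['speedup']:.2f}, "
        f"user/real {stats['cpu_per_real']:.2f}, "
        f"extra toxins {stats['extra_toxins']:+d}"
    )
```

On these figures, the two CryProcessor runs finish roughly 5.5 and 11.5 times faster in real time than BtToxin_scanner while reporting 183 and 171 more 3d-Cry toxins, respectively.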
Benchmarking of CryProcessor and BtToxin_scanner on the FASTA input.
| Tool | System Time | User Time | Real Time | Number of Toxins Found |
|---|---|---|---|---|
| CryProcessor | 1636.47 s | 31369.38 s | 7723.37 s | 5 (3d-Cry)* |
| BtToxin_scanner | 26301.08 s | 23038.92 s | 4561.89 s | no toxins found |
For the test, the SRX2330733 submission from the NCBI Sequence Read Archive (SRA) was used (BioProject: PRJNA352636; shotgun sequencing data for Bt strain HD-133; 3.2 Gb). * Detected 3d-Cry toxins: Cry9Ea1, Cry1Ia14, Cry1Ca5, Cry1Da1, Cry2Ab16.
Negative control of CryProcessor and BtToxin_scanner.
| Tool | System Time | User Time | Real Time | Number of Toxins Found |
|---|---|---|---|---|
| CryProcessor | 1490.93 s | 6232.63 s | 4491.51 s | no toxins found |
| BtToxin_scanner | 22333.75 s | 18020.30 s | 2075.42 s | no toxins found |
Figure 1. The detailed pipeline implemented in CryProcessor. All available modes of work are shown. Ovals denote data blocks, parallelograms denote processing steps, and the rhombus indicates the decision-making step.