Ségolène Caboche1,2, Gaël Even3,4, Alexandre Loywick3,4, Christophe Audebert3,4, David Hot5,4.
Abstract
The increase in available sequence data has advanced the field of microbiology; however, making sense of these data without bioinformatics skills is still problematic. We describe MICRA, an automatic pipeline, available as a web interface, for microbial identification and characterization through reads analysis. MICRA uses iterative mapping against reference genomes to identify genes and variations. Additional modules allow prediction of antibiotic susceptibility and resistance and comparison of results across several samples. MICRA is fast and produces fewer false-positive annotations and variant calls than current methods, making it a tool of great interest for fully exploiting sequencing data.
Keywords: Bioinformatics pipeline; Comparative genomics; High-throughput sequencing; Microbial genome characterization
Year: 2017 PMID: 29258574 PMCID: PMC5738152 DOI: 10.1186/s13059-017-1367-z
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Fig. 1 The MICRA pipeline. Ovals represent the input files. Black boxes represent the four main parts of the MICRA pipeline and blue boxes show the constitutive modules. Dashed lines are used for optional steps. CDS, coding sequence
Comparison of results obtained with MICRA, MIRA and IonGAP
| Metric | MICRA with DH10B | MICRA without DH10B | MIRA | IonGAP |
|---|---|---|---|---|
| Time | 9 min | 11 min | 3 h | >3 h |
| Number of contigs > 500 bp | 1 | 41 | 267 | 197 |
| N50 | 4686138 | 2836109 | 88618 | 121061 |
| Genome fraction (%) | 100 | 97.195 | 96.314 | 96.752 |
| Number of Ns | 2 | 91 | 89 | 4 |
| Number of mismatches | 1 | 73 | 147 | 116 |
| Number of short indels | 1 | 17 | 63 | 64 |
| Number of long indels | 0 | 5 | 5 | 0 |
| Number of genes | 4127 + 0 part | 4019 + 7 part | 3919 + 106 part | 3927 + 95 part |
| Number of misassemblies | 0 | 19 | 5 | 9 |
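
The N50 row above summarizes assembly contiguity. As a minimal sketch, assuming the conventional definition (the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length), it could be computed as follows. The source does not say which tool produced these values, and the function name `n50` and the example lengths are hypothetical.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (conventional definition)."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest contig down until half of the assembly is covered.
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("contig_lengths must contain at least one contig")


# Hypothetical contig lengths, not taken from the paper.
example_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(example_lengths))
```

Under this definition a single contig spanning the whole genome has an N50 equal to its own length, which matches the pattern in the first column of the table (one contig, N50 close to the genome size).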
Fig. 2 CGH array results comparing the P134S strain gene content to BPZE1 gene content. a Obtained log ratios from the CGH array experiment are represented as a MA plot. Genes within the clusters predicted by MICRA to be absent in P134S are indicated in red. Genes known to be absent in BPZE1 are indicated in green. The dashed red line represents log ratio = −1. b QQ plot of t-statistic for P134S strain gene content compared to BPZE1 gene content as determined by CGH array. Quantiles of the t-statistic distribution for all ORFs were plotted against a standard normal distribution. Genes highlighted in panel a are marked with red and green stars and do not appear normally distributed, indicating a highly divergent log ratio
Fig. 3 Examples of MICRA results highlighting the biological features. a Extracted lines of the CSV annotation file from mapping. The yellow highlighted lines show the shiga toxin 2 genes and several tellurite resistance genes characteristic of the strain. b The comparative genome picture produced with CGView against the E. coli 55989 strain. The additional ROD elements correspond to the deleted regions identified in Rhode et al. [32]. c Extracted results of MICRA de novo contig annotation showing components of the microcin gene cluster, the tellurite resistance gene cluster, and the mercury resistance gene cluster. d Venn diagram showing the comparison of CDSs obtained with MICRA between the strains EL2009-2050, EL2009-2071, and TY2482
Precision, recall, and F-measure values for five sequence annotation approaches
| Method | Number of CDSs | True positives | False negatives | False positives | Precision | Recall | F-measure |
|---|---|---|---|---|---|---|---|
| MICRA | 4913 | 4658 | 695 | 255 | 0.95 | 0.87 | 0.91 |
| BG7 | 6190 | 4846 | 507 | 1344 | 0.78 | 0.91 | 0.84 |
| RAST | 7958 | 4778 | 575 | 3180 | 0.6 | 0.89 | 0.72 |
| PROKKA | 7482 | 4840 | 513 | 2642 | 0.65 | 0.9 | 0.75 |
| IonGAP | 6514 | 4411 | 942 | 2103 | 0.68 | 0.82 | 0.74 |
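
The three score columns above can be reproduced from the count columns. The sketch below assumes the standard definitions, which the source does not spell out: precision = TP / (TP + FP), recall = TP / (TP + FN), and F-measure as the harmonic mean of the two. The function name `annotation_scores` is hypothetical; the counts are copied from the table.

```python
def annotation_scores(true_pos, false_neg, false_pos):
    """Return (precision, recall, F-measure) under the standard definitions."""
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# (true positives, false negatives, false positives) from the table above.
counts = {
    "MICRA": (4658, 695, 255),
    "BG7": (4846, 507, 1344),
    "RAST": (4778, 575, 3180),
    "PROKKA": (4840, 513, 2642),
    "IonGAP": (4411, 942, 2103),
}

for method, (tp, fn, fp) in counts.items():
    p, r, f = annotation_scores(tp, fn, fp)
    print(f"{method}: precision={p:.2f} recall={r:.2f} F-measure={f:.2f}")
```

In every row the number of CDSs equals true positives plus false positives, and true positives plus false negatives sum to 5353, so each method is scored against the same reference set.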
Comparison between experimentally measured and computationally predicted antibiotic susceptibility profiles
| Antibiotic | Class | Exp. | MICRA | Kuznetsov et al. | ResFinder |
|---|---|---|---|---|---|
| Ampicillin | Beta-lactam, penicillins | R | R* | ND | R** |
| Amoxicillin | Beta-lactam, penicillins | R | R* | ND | R** |
| Piperacillin | Beta-lactam, penicillins | R | R* | | R** |
| Cefuroxim | Beta-lactam, cephalosporins | R | R* | ND | R** |
| Cefoxitin | Beta-lactam, cephalosporins | R | R | R | R** |
| Cefotaxim | Beta-lactam, cephalosporins | R | R* | ND | R** |
| Ceftazidim | Beta-lactam, cephalosporins | R | R | R | R** |
| Cefpodoxime | Beta-lactam, cephalosporins | R | R | R | R** |
| Imipenem | Beta-lactam, carbapenems | S | S | S | |
| Meropenem | Beta-lactam, carbapenems | S | S | S | |
| Amikacin | Aminoglycoside | S | S | S | ND |
| Gentamicin | Aminoglycoside | S | S | S | ND |
| Kanamycin | Aminoglycoside | S | S | S | ND |
| Tobramycin | Aminoglycoside | S | S | S | ND |
| Streptomycin | Aminoglycoside | R | R | R | R |
| Ciprofloxacin | Fluoroquinolone | S | S | ND | S |
| Norfloxacin | Fluoroquinolone | S | | ND | S |
| Tetracyclin | Polyketide | R | R | R | R |
| Nitrofurantoin | Furans | S | S | S | ND |
| Trimethoprim/sulfamethoxazole | Aminopyrimidine | R | R | R | R |
| Chloramphenicol | Phenicol | S | | ND | S |
| Fosfomycin | Phosphonic acids | S | S | | S |
Note that ResFinder mainly makes predictions for antibiotic classes rather than individual drugs. R resistant, S sensitive, R* not sensitive considered as a particular resistance, R** prediction for beta-lactam class, not individual drugs, ND not determined. Incorrect predictions appear in bold text