Yuki Matsumoto1, Takeshi Kinjo2, Daisuke Motooka1,3, Daijiro Nabeya2, Nicolas Jung1, Kohei Uechi2,4, Toshihiro Horii1, Tetsuya Iida1, Jiro Fujita2, Shota Nakamura1,3,5.
Abstract
The prevalence of nontuberculous mycobacteria (NTM) pulmonary diseases has been increasing worldwide. NTM consist of approximately 200 species, and distinguishing between them at the subspecies level is critical to treatment. In this study, we sequenced 63 NTM genomes, 27 of which were newly determined, by hybrid assembly using sequencers from Illumina and Oxford Nanopore Technologies (ONT). This analysis expanded the available genomic data to 175 NTM species and redefined their subgenus classification. We also developed a novel multi-locus sequence typing (MLST) database based on 184 genes from 7547 assemblies, together with identification software, mlstverse, which can also be used for detecting other bacteria given a suitable MLST database. This method showed the highest sensitivity and specificity among the methods compared and demonstrated the capacity for rapid detection of NTM, with 10 min of sequencing on the ONT MinION being sufficient. Application of this methodology could improve disease epidemiology and increase the cure rates of NTM diseases.
Keywords: Nontuberculous mycobacteria; comparative genomics; multi-locus sequence typing; next-generation sequencing; pulmonary diseases
Year: 2019 PMID: 31287781 PMCID: PMC6691804 DOI: 10.1080/22221751.2019.1637702
Source DB: PubMed Journal: Emerg Microbes Infect ISSN: 2222-1751 Impact factor: 7.163
Figure 1. Taxonomic analysis using all A) Assembly quality. The bar plot indicates the N50 length for each species. Grey bars show assemblies available from NCBI, and blue dots show assemblies obtained in this work. The x-axis is sorted in ascending order of the y-axis value. The black and red dashed lines show the median N50s calculated from the NCBI assemblies and from our assemblies, respectively (also see Table S1). Growth type was defined according to the time required to grow bacterial colonies [i.e. rapid (3–7 days) and slow (>7 days)]. B) Core and pan genome analysis. Gene clusters were discriminated based on differences in percent identity and length. The numbers of core (bottom) and pan (top) genes are shown as a function of the number of NTM species. The error bars were calculated from 1000 replicates of randomly selected species. C) Comparative analysis. A phylogenetic tree was constructed using the 80% core genome (also see Figure S1). Filled colours correspond to the subgenus, growth type, pathogenicity, and assembly availability.
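The N50 length reported in panel A is a standard assembly-contiguity statistic: the contig length at which the contigs of that length or longer together cover at least half of the assembly. The following is a minimal Python sketch of that definition; the function name `n50` and the example contig lengths are illustrative assumptions and are not taken from the study's pipeline.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths (in bp).

    Illustrative helper (hypothetical name): contigs are sorted from longest
    to shortest and accumulated until at least half of the total assembly
    length is covered; the length of the contig reached at that point is N50.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Made-up contig lengths, for illustration only.
example_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(example_contigs))
```

A single closed chromosome yields an N50 equal to the genome length, which is why a higher N50 generally indicates a more contiguous assembly.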
Figure 2. Pipeline workflow. mlstverse takes a FASTA or FASTQ file as input and, using a given database, outputs an identification result as a table.
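The general idea behind MLST-based identification, comparing the alleles called in a sample against stored per-strain allele profiles and reporting a ranked table, can be sketched as follows. This is a simplified illustration under stated assumptions, not the mlstverse implementation or its API: the function `rank_profiles`, the scoring rule (fraction of profile loci with a matching allele), and the toy database are all hypothetical.

```python
def rank_profiles(sample_alleles, profile_db):
    """Rank database profiles by agreement with the alleles called in a sample.

    sample_alleles: dict mapping locus name -> allele id, or None if the
                    locus could not be typed (e.g. insufficient coverage).
    profile_db:     dict mapping profile name -> {locus name: allele id}.

    Returns rows of (profile name, matching loci, typed loci, score), best
    first. The score used here is an assumption for illustration: the
    fraction of the profile's loci whose allele matches the sample.
    """
    rows = []
    for name, profile in profile_db.items():
        typed = [locus for locus in profile
                 if sample_alleles.get(locus) is not None]
        matches = sum(1 for locus in typed
                      if sample_alleles[locus] == profile[locus])
        score = matches / len(profile) if profile else 0.0
        rows.append((name, matches, len(typed), score))
    # Highest score first; ties broken alphabetically for a stable table.
    rows.sort(key=lambda row: (-row[3], row[0]))
    return rows


# Toy database and sample (made-up loci and allele ids).
toy_db = {
    "species_A": {"gene1": 1, "gene2": 2, "gene3": 3},
    "species_B": {"gene1": 1, "gene2": 5, "gene3": 6},
}
toy_sample = {"gene1": 1, "gene2": 2, "gene3": None}

for row in rank_profiles(toy_sample, toy_db):
    print(row)
```

Reporting the number of typed loci alongside the score makes it visible when a high-ranking match rests on only a few covered genes, which matters for low-coverage runs.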
Comparison of NTM-identification methods. Analysis with the WGS, MS, 16S rRNA sequence analysis, and MLST-based metaMLST, pubmlst, and mlstverse methods is shown (also see Supplementary Tables 2 and 3).
| WGS | 16S | VITEK MS | metaMLST (Zolfo et al. 2017) | pubmlst (Jolley & Maiden 2010) | mlstverse (this study) | sample# |
|---|---|---|---|---|---|---|
Figure 3. Distribution of mapped reads in 29 clinical samples. The coverage of each gene in each sample is shown as a heat map for 53 ribosomal protein genes and 131 species-specific genes. The coverage distribution is also shown on the right, with a fitted binomial distribution.
Figure 4. Sensitivity and specificity. A) Species-level evaluation of sequence homologies of 16S rRNA genes obtained from SILVA (left) and profile similarity in the mlstverse database (right). Species were ordered based on the results of hierarchical clustering with complete linkage. The profile shown on the right was sorted in the same order. B) Subspecies-level evaluation of M. abscessus profile similarities in the mlstverse database. The colour bar under the dendrogram indicates the subspecies recorded in the assembly metadata: not yet classified (grey), subsp. abscessus (red), subsp. bolletii (blue), or subsp. massiliense (yellow) (also see Figure S3).
Figure 5. Rapid detection of A and B) MLST scores vs. coverage with Illumina short reads (A) and ONT long reads (B). Views based on a wide range of coverage (top of A and B) or focused on the low-coverage range (bottom of A and B) are shown. C) Amount of output data vs. time using an ONT MinION sequencer. Data obtained during the first 48 h (2,880 min; top) or the initial 30 min (bottom) are shown.