Jaeyres Jani1, Zainal Arifin Mustapha2, Norfazirah Binti Jamal3, Cheronie Shely Stanis3, Chin Kai Ling4, Richard Avoi5, Naing Oo Tha5, Valentine Gantul6, Daisuke Mori3, Kamruddin Ahmed1,3.
Abstract
Mycobacterium tuberculosis strain SBH162 was isolated from a 49-year-old male with pulmonary tuberculosis. GeneXpert MDR/RIF identified the strain as rifampicin-resistant M. tuberculosis. Whole genome sequencing was performed on the Illumina HiSeq 4000 system to investigate further and verify the mutation sites of the strain by variant calling with bioinformatics tools. De novo assembly of the genome generated 100 contigs with an N50 of 156,381 bp. The genome size was 4,343,911 bp, with a G + C content of 65.58% and 4,306 predicted genes. The S450L mutation associated with rifampicin resistance was detected in the rpoB gene. Phylogenetic analysis using the Maximum Likelihood method placed the strain in the Europe America Africa lineage (Lineage 4). The genome dataset has been deposited at DDBJ/ENA/GenBank under the accession number SMOE00000000.
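The S450L label in the abstract follows the usual substitution notation: reference amino acid, codon position, variant amino acid. The sketch below shows how such a label relates to a codon-level change and to nucleotide coordinates within a coding sequence. The codons TCG and TTG are an illustrative assumption (the abstract does not state the nucleotide change), and the helper names are hypothetical rather than part of the authors' pipeline.

```python
import re
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def parse_substitution(label):
    """Split a label such as 'S450L' into (ref_aa, codon_position, alt_aa)."""
    match = re.fullmatch(r"([A-Z*])(\d+)([A-Z*])", label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def codon_span(position):
    """1-based nucleotide coordinates of a codon within its coding sequence."""
    start = 3 * (position - 1) + 1
    return start, start + 2


def describe_change(ref_codon, alt_codon, position):
    """Translate both codons and build the substitution label."""
    return f"{CODON_TABLE[ref_codon]}{position}{CODON_TABLE[alt_codon]}"


if __name__ == "__main__":
    # Assumed codons for illustration only; not taken from the source data.
    print(parse_substitution("S450L"))
    print(codon_span(450))
    print(describe_change("TCG", "TTG", 450))
```

A variant caller reports the nucleotide change; translating the reference and variant codons, as `describe_change` does, is what turns that call into a protein-level label.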
Keywords: M. tuberculosis; Malaysia; Next generation sequencing; Rifampicin resistant; Sabah; Whole genome sequencing
Year: 2019 PMID: 31534995 PMCID: PMC6743026 DOI: 10.1016/j.dib.2019.104445
Source DB: PubMed Journal: Data Brief ISSN: 2352-3409
Fig. 1. Comparative phylogenetic analysis of strain SBH162. This strain belongs to Lineage 4 and clusters with other strains from the LAM family. Malaysian strains are also found in Lineage 4, where they belong to the T2 family, while other Malaysian strains belong to Lineages 1 and 2. The phylogenetic tree was constructed from SNP data extracted from the genome sequences and inferred using the Maximum Likelihood method with the General Time Reversible model. The tree is rooted with M. bovis SP38 as the outgroup.
Specifications Table
| Subject area | Environmental Science |
| Specific subject area | Immunology and Microbiology |
| Type of data | Whole genome sequence with gene annotation and comparative genomic of |
| Data acquisition | |
| Data format | Raw and analyzed data of whole genome sequences |
| Experimental factors | Isolated and cultured in Middlebrook 7H9 medium and incubated in the BACTEC MGIT 320 system; extraction of genomic DNA from a pure culture, library preparation for sequencing, and Illumina sequencing |
| Experimental features | DNA extraction was performed using the MasterPure Complete DNA and RNA Purification Kit; the library was prepared using the NEBNext® Ultra™ DNA Library Prep Kit for Illumina®; sequencing was performed on the Illumina HiSeq 4000 system. The genome was assembled using SPAdes, variants were called with GATK tools, annotation was performed with the NCBI Prokaryotic Genome Annotation Pipeline, and comparative genomics was carried out with kSNP3. |
| Data source location | Kota Kinabalu, Sabah, Malaysia |
| Data accessibility | Data are publicly available at NCBI GenBank from the following links: |
The data will shed light on the molecular biology of a
The data will give insight into drug resistance in
The data will help to understand the relation between
| Sequencing depth | 332X |
| Total length of sequences (bp) | 4,343,911 |
| Total number of contigs | 100 |
| N50 (bp) | 156,381 |
| GC (%) | 65.58 |
| CDSs | 4,306 |
| tRNAs | 45 |
| 5S, 16S, 23S rRNA | 1, 1, 1 |
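The assembly metrics in the table above (number of contigs, total length, N50 and GC percentage) can be recomputed from any contig FASTA file. The following is a minimal sketch using the usual definitions; it is not the authors' tooling, and the function names and the toy sequences are hypothetical, chosen only so the example runs on its own.

```python
from typing import Dict, Iterable, List


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into {contig_name: sequence}."""
    chunks: Dict[str, List[str]] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line.upper())
    return {key: "".join(parts) for key, parts in chunks.items()}


def n50(lengths: Iterable[int]) -> int:
    """Contig length at which the running total, taken from the longest
    contig downwards, first reaches half of the assembly length."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0


def gc_percent(seqs: Iterable[str]) -> float:
    """G + C bases as a percentage of all A, C, G and T bases."""
    gc = at = 0
    for seq in seqs:
        gc += seq.count("G") + seq.count("C")
        at += seq.count("A") + seq.count("T")
    total = gc + at
    return 100.0 * gc / total if total else 0.0


def assembly_stats(contigs: Dict[str, str]) -> Dict[str, float]:
    """Summarise an assembly with the metrics listed in the table."""
    lengths = [len(seq) for seq in contigs.values()]
    return {
        "contigs": len(lengths),
        "total_bp": sum(lengths),
        "n50_bp": n50(lengths),
        "gc_percent": round(gc_percent(contigs.values()), 2),
    }


# Toy input for illustration only; not data from strain SBH162.
TOY_FASTA = """>c1
GGCCGGCCAT
>c2
GGCCAT
>c3
ATGC
"""

if __name__ == "__main__":
    print(assembly_stats(parse_fasta(TOY_FASTA.splitlines())))
```

For a real assembly, the same functions can be applied to the lines of the contig file, for example `parse_fasta(open(path))` with `path` pointing at the downloaded FASTA.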
Supplementary data 1, 2 and 3.