Seung Jung Han, Taeksun Song, Yong-Joon Cho, Jong-Seok Kim, Soo Young Choi, Hye-Eun Bang, Jongsik Chun, Gill-Han Bai, Sang-Nae Cho, Sung Jae Shin.
Abstract
Mycobacterium tuberculosis K, a member of the Beijing family, was first identified in 1999 as the most prevalent genotype in South Korea among clinical isolates of M. tuberculosis from high school outbreaks. M. tuberculosis K is an aerobic, non-motile, Gram-positive, and non-spore-forming rod-shaped bacillus. A transmission electron microscopy analysis displayed an abundance of lipid bodies in the cytosol. The genome of the M. tuberculosis K strain was sequenced using two independent sequencing methods (Sanger and Illumina). Here, we present the genomic features of the 4,385,518-bp-long complete genome sequence of M. tuberculosis K (one chromosome, no plasmid, and 65.59 % G + C content) and its annotation, which consists of 4194 genes (3447 genes with predicted functions), 48 RNA genes (3 rRNA and 45 tRNA) and 261 genes with peptide signals.
Keywords: Korean Beijing strain; M. tuberculosis K complete genome; Mycobacterium tuberculosis; Outbreak; TB Beijing family; TB clinical strain
Year: 2015 PMID: 26473025 PMCID: PMC4606834 DOI: 10.1186/s40793-015-0071-4
Source DB: PubMed Journal: Stand Genomic Sci ISSN: 1944-3277
Classification and general features of M. tuberculosis K according to the MIGS recommendation [16]
| MIGS ID | Property | Term | Evidence code |
|---|---|---|---|
| | Classification | Domain | TAS |
| | | Phylum | TAS |
| | | Class | TAS |
| | | Order | TAS |
| | | Family | TAS |
| | | Genus | TAS |
| | | Species | TAS |
| | | Strain K (CP007803.1) | |
| | Gram stain | Weakly positive | TAS |
| | Cell shape | Irregular rods | TAS |
| | Motility | Non motile | TAS |
| | Sporulation | Nonsporulating | NAS |
| | Temperature range | Mesophile | TAS |
| | Optimum temperature | 37 °C | TAS |
| | pH range; Optimum | 5.5–8; 7 | IDA |
| | Carbon source | Asparagine, Oleic acid, Potato starch | TAS |
| MIGS-6 | Habitat | Human-associated: Human lung | TAS |
| MIGS-6.3 | Salinity | Normal | TAS |
| MIGS-22 | Oxygen | Aerobic | TAS |
| MIGS-15 | Biotic relationship | Free-living | NAS |
| MIGS-14 | Pathogenicity | Hypervirulent | TAS |
| | Biosafety level | 3 | NAS |
| | Isolation | Sputum of TB patient | TAS |
| MIGS-4 | Geographic location | High schools in Kyunggi Province, Republic of Korea | TAS |
| MIGS-5 | Sample collection time | 1999 | TAS |
| MIGS-4.1 | Latitude | 37.274377 | NAS |
| MIGS-4.2 | Longitude | 127.009442 | NAS |
| MIGS-4.4 | Altitude | Not reported | NAS |
Evidence codes - IDA: Inferred from Direct Assay; TAS: Traceable Author Statement (i.e., a direct report exists in the literature); NAS: Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [36]
Fig. 1 Phylogenetic tree showing the relationships of M. tuberculosis K with other Mycobacterium species based on aligned sequences of the rpoB gene. A 711 bp internal region was used for the phylogenetic analysis. All sites were informative and there were no gap-containing sites. The tree was built in MEGA using the Maximum-Likelihood method with the Tamura-Nei model. Bootstrap analysis [37] was performed with 500 replicates to assess the support of the clusters. Bootstrap values over 50 are shown at each node. The bar represents 0.02 substitutions per site
Fig. 2 Colony morphology of Mycobacterium tuberculosis K on 7H10-OADC solid medium
Fig. 3 Transmission electron microscopy of Mycobacterium tuberculosis K
Project information
| MIGS ID | Property | Term |
|---|---|---|
| MIGS-31 | Finishing quality | Finished |
| MIGS-28 | Libraries used | Three genomic libraries: two Sanger libraries (a 2 kb shotgun library and a fosmid library) and one Illumina library |
| MIGS-29 | Sequencing platforms | Sanger, Illumina MiSeq 250 bp paired-end |
| MIGS-31.2 | Fold coverage | 8.3x (Sanger), 551.66x (Illumina) |
| MIGS-30 | Assemblers | Phred/Phrap/Consed, CLC genomics workbench v6.5, CodonCode Aligner v3.7 |
| MIGS-32 | Gene calling method | Glimmer v 3.02 |
| | Locus Tag | MTBK |
| | Genbank ID | CP007803.1 |
| | Genbank Date of Release | June 05, 2014 |
| | GOLD ID | Gp0032286 |
| | BIOPROJECT | PRJNA178919 |
| MIGS-13 | Source Material Identifier | The Korean Institution of Tuberculosis |
| | Project relevance | Human-associated pathogen |
Fig. 4 Graphical circular map of the M. tuberculosis K strain genome. From the outside to the center: RNA features (ribosomal RNAs in blue, transfer RNAs in red), then genes on the forward strand and on the reverse strand (colored according to COG category). The inner two circles show the GC ratio and the GC skew. Positive values of the GC ratio and GC skew are shown in orange and red, and negative values in blue and green, respectively
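Tracks like the two inner circles of Fig. 4 are usually computed over sliding windows along the chromosome. The following is a minimal sketch under stated assumptions, not the authors' pipeline: the function name `gc_windows`, the window and step sizes, and the toy sequence are all illustrative. GC skew uses the common definition (G − C)/(G + C), and GC content is (G + C)/window length. The figure's "GC ratio" may instead be plotted as a deviation from the genome-wide mean, which this sketch does not do.

```python
from typing import List, Tuple


def gc_windows(seq: str, window: int = 10, step: int = 10) -> List[Tuple[int, float, float]]:
    """Return (start, gc_content, gc_skew) for each full window of seq.

    gc_content = (G + C) / window
    gc_skew    = (G - C) / (G + C), defined as 0.0 when the window has no G or C
    """
    seq = seq.upper()
    results = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        gc_content = (g + c) / window
        gc_skew = (g - c) / (g + c) if (g + c) else 0.0
        results.append((start, gc_content, gc_skew))
    return results


# Toy sequence (hypothetical): a G-rich window followed by a C-rich window.
toy = "GGGGGCCATA" + "CCCCGATATA"
for start, content, skew in gc_windows(toy):
    print(start, round(content, 2), round(skew, 2))
```

On a real genome the window would be on the order of kilobases. A sign change in cumulative GC skew is often used as a rough indicator of the replication origin and terminus.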
Nucleotide content and gene count levels of the genome
| Attribute | Value | % of total |
|---|---|---|
| Genome size (bp) | 4,385,518 | 100.00 |
| DNA coding (bp) | 3,954,282 | 90.17 |
| DNA G + C (bp) | 2,876,511 | 65.59 |
| DNA scaffolds | 1 | 100.00 |
| Total genes | 4,194 | 100.00 |
| Protein coding genes | 4,146 | 98.86 |
| Pseudo genes | 2 | 0.05 |
| RNA genes | 48 | 1.14 |
| Genes in internal clusters | NA | NA |
| Genes with function prediction | 2,885 | 68.79 |
| Genes assigned to COGs | 2,892 | 69.74 |
| Genes with Pfam domains | 3,347 | 79.80 |
| Genes with signal peptides | 233 | 5.56 |
| Genes coding transmembrane helices | 810 | 19.31 |
| CRISPR repeats | 4 | 0.10 |
The total is based on either the size of the genome in base pairs or the total number of protein coding genes in the annotated genome
Also includes 1 pseudogene
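The "% of total" column can be re-derived from the counts in the table. This is a minimal sketch, assuming that base-pair rows are divided by the genome size and gene rows by the total gene count, which is what the listed figures suggest; the helper name `pct` and the choice of rows are illustrative.

```python
GENOME_BP = 4_385_518   # genome size from the table
TOTAL_GENES = 4_194     # total genes from the table


def pct(part: int, whole: int) -> float:
    """Percentage of `whole`, rounded to two decimals as in the table."""
    return round(100 * part / whole, 2)


# Rows measured in base pairs (denominator assumed: genome size).
bp_rows = {
    "DNA coding (bp)": 3_954_282,
    "DNA G + C (bp)": 2_876_511,
}

# Rows measured in genes (denominator assumed: total genes).
gene_rows = {
    "Protein coding genes": 4_146,
    "RNA genes": 48,
    "Genes with Pfam domains": 3_347,
    "Genes coding transmembrane helices": 810,
}

for name, count in bp_rows.items():
    print(f"{name}: {pct(count, GENOME_BP):.2f}")
for name, count in gene_rows.items():
    print(f"{name}: {pct(count, TOTAL_GENES):.2f}")
```

Under these assumptions the six rows above reproduce the tabulated percentages (90.17, 65.59, 98.86, 1.14, 79.80 and 19.31).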
Summary of genome: one chromosome and no plasmids
| Label | Size (bp) | Topology | INSDC identifier | RefSeq ID (Optional) |
|---|---|---|---|---|
| Chromosome | 4,385,518 | Circular | GenBank | CP007803.1 |
Number of genes associated with the 25 general COG functional categories
| Code | Value | Percentage | COG category |
|---|---|---|---|
| J | 193 | 4.66 | Translation, ribosomal structure and biogenesis |
| A | 10 | 0.24 | RNA processing and modification |
| K | 195 | 4.70 | Transcription |
| L | 106 | 2.56 | Replication, recombination and repair |
| B | – | – | Chromatin structure and dynamics |
| D | 37 | 0.89 | Cell cycle control, cell division, chromosome partitioning |
| Y | – | – | Nuclear structure |
| V | 84 | 2.03 | Defense mechanisms |
| T | 112 | 2.70 | Signal transduction mechanisms |
| M | 157 | 3.79 | Cell wall/membrane/envelope biogenesis |
| N | 8 | 0.19 | Cell motility |
| Z | – | – | Cytoskeleton |
| W | 1 | 0.02 | Extracellular structures |
| U | 23 | 0.55 | Intracellular trafficking, secretion, and vesicular transport |
| O | 117 | 2.82 | Posttranslational modification, protein turnover, chaperones |
| C | 195 | 4.70 | Energy production and conversion |
| G | 139 | 3.35 | Carbohydrate transport and metabolism |
| E | 186 | 4.49 | Amino acid transport and metabolism |
| F | 83 | 2.00 | Nucleotide transport and metabolism |
| H | 225 | 5.43 | Coenzyme transport and metabolism |
| I | 275 | 6.63 | Lipid transport and metabolism |
| P | 127 | 3.06 | Inorganic ion transport and metabolism |
| Q | 107 | 2.58 | Secondary metabolite biosynthesis, transport and catabolism |
| R | 246 | 5.93 | General function prediction only |
| S | 266 | 6.42 | Function unknown |
| – | 1254 | 30.26 | Not in COGs |
The total is based on the total number of protein coding genes in the annotated genome