Tanmoy Roychowdhury, Saurav Mandal, Alok Bhattacharya.
Abstract
Insertion sequence (IS) 6110 is found at multiple sites in the Mycobacterium tuberculosis genome and displays a high degree of polymorphism with respect to copy number and insertion sites. IS6110 is therefore considered a useful molecular marker for diagnosis and strain typing of M. tuberculosis. IS6110 elements are generally identified using experimental methods, which are practical only for a limited number of isolates. Since short-read genome sequences generated on next-generation sequencing (NGS) platforms are available for a large number of isolates, a computational pipeline was developed to identify IS6110 elements from these datasets. This study reports results from the analysis of NGS data of 1377 M. tuberculosis isolates, which represent all seven major global lineages of M. tuberculosis. Lineage-specific copy number patterns and preferential insertion regions were observed. Intra-lineage differences were further analyzed to identify spoligotype-specific variations. The copy number distribution and preferential locations of IS6110 in different lineages imply independent evolution of IS6110, governed mainly by ancestral insertion, fitness (gene truncation, promoter activity) and recombinational loss of some copies. A phylogenetic tree based on the IS6110 insertion data of different isolates was constructed in order to understand genome-level variations of different markers across different lineages.
Year: 2015 PMID: 26215170 PMCID: PMC4517164 DOI: 10.1038/srep12567
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
M. tuberculosis isolates used for this analysis.
| Lineage | Number of isolates |
| Indo-Oceanic (L1) | 116 |
| East Asian (L2) | 375 |
| Indian-East African (L3) | 182 |
| Euro-American (L4) | 666 |
| West African I (L5) | 18 |
| West African II (L6) | 16 |
| Ethiopian (L7) | 4 |
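As a quick consistency check, the per-lineage counts above can be summed and compared with the 1377 isolates stated in the abstract. The snippet below is only a small tally sketch; the dictionary and variable names are illustrative and not part of the original study.

```python
# Isolate counts per lineage, copied from the table above.
isolates = {
    "Indo-Oceanic (L1)": 116,
    "East Asian (L2)": 375,
    "Indian-East African (L3)": 182,
    "Euro-American (L4)": 666,
    "West African I (L5)": 18,
    "West African II (L6)": 16,
    "Ethiopian (L7)": 4,
}

# Total number of isolates and each lineage's share of the dataset.
total = sum(isolates.values())
shares = {name: 100 * n / total for name, n in isolates.items()}

for name, n in isolates.items():
    print(f"{name:26s} {n:4d} {shares[name]:5.1f}%")
print(f"{'Total':26s} {total:4d}")
```

The total matches the 1377 isolates reported in the abstract, and the shares show that the dataset is dominated by L4 and L2, which is worth keeping in mind when comparing percentages across lineages.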
Figure 1. Lineage-based IS6110 copy number distribution in six major global lineages of M. tuberculosis.
Each lineage is represented by a different color, as indicated in the legend box.
Lineage-specific preferential insertion regions of IS6110.
| 1500–1600 | 97.84 | 0.75 | Intergenic (Rv0001-Rv0002) |
| 1262900–1263000 | 84.68 | 0.46 | Rv1135c (PPE16) |
| 1543900–1544000 | 95.93 | 0.46 | Rv1371 (hypothetical) |
| 1998600–1998700 | 94.25 | 4.60 | Intergenic (Rv1765-Rv1765A) |
| 2263600–2263700 | 94.01 | 0.46 | Rv2016 (hypothetical) |
| 2634000–2634100 | 86.12 | 7.04 | Rv2352c (PPE38) |
| 3378500–3378600 | 91.62 | 0.28 | Intergenic (Rv3018-Rv3019) |
| 3549100–3549200 | 81.10 | 3.00 | Intergenic (Rv3179-Rv3180) |
| 3797800–3797900 | 93.30 | 0.84 | Rv3383c (idsB) |
| 3844600–3844700 | 88.51 | 0.46 | Rv3427c (transposase) |
| 475200–475300 | 98.48 | 0.46 | Rv0395 (hypothetical) |
| 850000–850100 | 92.92 | 0.77 | Rv0755c (PPE12) |
| 1694600–1694700 | 98.98 | 0 | Rv1504c (hypothetical) |
| 2166300–2166400 | 84.34 | 0.07 | Rv1917c (PPE34) |
| 3048400–3048500 | 92.92 | 0.15 | Rv2735c (hypothetical) |
| 4320100–4320200 | 98.48 | 0.07 | Intergenic (Rv3845-Rv3846) |
| 1075900–1076000 | 92.30 | 4.73 | Rv0963c (hypothetical) |
| 1907000–1907100 | 92.30 | 0.48 | Rv1682 (coiled-coil structural protein) |
| 2555700–2555800 | 96.15 | 0.82 | Rv2282c (LysR family transcriptional regulator) |
| 4185700–4185800 | 96.15 | 0 | Rv3734c (hypothetical) |
| 1300100–1300200 | 95.65 | 0 | Rv1169c (PE11) |
| 3709600–3709700 | 95.65 | 0.54 | Rv3323c (moaX) |
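The regions in this table are 100 bp genome windows. The column headers are not preserved here, so the following is only a sketch under an explicit assumption: that the two numeric columns give the percentage of isolates carrying an IS6110 insertion in the window within one lineage and within all other lineages. The function names, isolate identifiers and input layout are hypothetical and do not reproduce the authors' pipeline.

```python
from collections import defaultdict

WINDOW = 100  # bp; matches the width of the regions listed in the table


def window_of(pos, size=WINDOW):
    """Return the (start, end) window that contains a genome position."""
    start = (pos // size) * size
    return (start, start + size)


def window_frequencies(insertions, lineage_of, target):
    """Percentage of isolates with an insertion in each window.

    insertions: dict isolate -> iterable of insertion positions (assumed input)
    lineage_of: dict isolate -> lineage label (assumed input)
    target:     lineage of interest
    Returns dict window -> (pct in target lineage, pct in other lineages).
    """
    in_target = defaultdict(set)
    in_other = defaultdict(set)
    for isolate, positions in insertions.items():
        bucket = in_target if lineage_of[isolate] == target else in_other
        for pos in positions:
            # Sets make sure an isolate is counted once per window.
            bucket[window_of(pos)].add(isolate)

    n_target = sum(1 for lin in lineage_of.values() if lin == target)
    n_other = len(lineage_of) - n_target

    result = {}
    for win in set(in_target) | set(in_other):
        pct_t = 100 * len(in_target[win]) / n_target if n_target else 0.0
        pct_o = 100 * len(in_other[win]) / n_other if n_other else 0.0
        result[win] = (pct_t, pct_o)
    return result


# Toy data with made-up isolates; positions are chosen to fall in table windows.
insertions = {
    "iso1": [1543950],
    "iso2": [1543910, 888800],
    "iso3": [888850],
    "iso4": [],
}
lineage_of = {"iso1": "L2", "iso2": "L2", "iso3": "L4", "iso4": "L4"}

freqs = window_frequencies(insertions, lineage_of, "L2")
print(freqs[(1543900, 1544000)])  # window present in both L2 toy isolates only
```

Under this reading, a window is a candidate lineage-specific region when the first percentage is high and the second is low. Any threshold for "high" and "low" would be a further assumption.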
Spoligotype-specific preferential insertion regions of IS6110 in L1 isolates.
| 888800–888900 | 100 | 2.46 | Intergenic (Rv0794c-Rv0795) |
| 932100–932200 | 88.88 | 0 | Intergenic (Rv0835-Rv0836c) |
| 1721300–1721400 | 100 | 0 | Rv1526c (glycosyltransferase) |
| 1998600–1998700 | 88.88 | 0 | Intergenic (Rv1765c-Rv1765A) |
| 2038800–2038900 | 88.88 | 0 | Intergenic (Rv1798-Rv1799) |
| 888700–888800 | 97.87 | 3.92 | Intergenic (Rv0794c-Rv0795) |
| 1880600–1880700 | 97.87 | 3.92 | Rv1661 (pks7) |
| 1987600–1987700 | 97.87 | 3.92 | Rv1755c (plcD) |
| 2559500–2559600 | 97.87 | 3.92 | Rv2286c (hypothetical) |
| 2627900–2628000 | 97.87 | 3.92 | Rv2349c (plcC) |
| 3030100–3030200 | 97.87 | 3.92 | Rv2717c (hypothetical) |
| 3096500–3096600 | 97.87 | 3.92 | Rv2787 (hypothetical) |
| 3491500–3491600 | 97.87 | 3.92 | Rv3125c (PPE49) |
Spoligotype-specific preferential insertion regions of IS6110 in L4 isolates.
| 932200–932300 | 86.89 | 3.25 | Intergenic (Rv0835-Rv0836c) |
| 1481500–1481600 | 76.21 | 0.25 | Rv1319c (adenylate cyclase) |
| 3480300–3480400 | 88.34 | 3.5 | Rv3113 (phosphatase) |
| 80400–80500 | 100 | 0.16 | Intergenic (Rv0071-Rv0072) |
| 888900–889000 | 100 | 3.37 | Intergenic (Rv0794c-Rv0795) |
| 1889000–1889100 | 100 | 0.16 | Rv1664 (pks9) |
| 1987400–1987500 | 100 | 3.37 | Rv1755c (plcD) |
| 2166500–2166600 | 100 | 1.01 | Rv1917c (PPE34) |
| 483200–483300 | 100 | 16.07 | Rv0402c (mmpL1); Rv0403c (mmpS1) |
Probable IS6110 mediated large sequence polymorphisms in different isolates.
| 888755–889021 (D) | 889021–890375 | L4(15) |
| 888762–889021 (D) | | L4(19) |
| 888785–889021 (D) | | L1(65), L2(278), L3(4), L4(9) |
| 1543306–1543969 (D) | 1541952–1543306 | L2(11) |
| 1989057–1989078 (D) | 1987703–1989057 | L3(126) |
| 1989057–1989080 (D) | | L4(46) |
| 1996100–1998623 (I) | 1996100–1997455 | L2(352), L3(1), L4(2) |
| 1996100–1998750 (I) | | L3(23) |
| 1996100–1998792 (I) | | L2(14) |
| 1996100–1998810 (I) | | L4(146) |
| 1996100–1998838 (I) | | L4(44) |
| 1997455–1998658 (D) | | L4(12) |
| 2365413–2367206 (I) | 2365414–2366768 | L2(14), L4(2) |
| 2366766–2367209 (D) | | L2(12) |
| 2628292–2635577 (D) | 2635577–2636931 | L3(36) |
| 2634026–2635577 (D) | | L2(22) |
| 2635041–2635577 (D) | | L2(14) |
| 2635577–2636956 (I) | | L3(44) |
| 2636931–2636955 (D) | | L2(12), L3(43) |
| 2636931–2639361 (D) | | L4(14) |
| 3121878–3122030 (D) | 3120523–3121897 | L1(41) |
| 3121878–3121986 (D) | | L3(48) |
| 3549196–3551230 (D) | 3551230–3552584 | L2(18) |
Figure 2. Global phylogeny of 1377 M. tuberculosis isolates based on IS6110 insertion sites.
Different colors represent isolates from different lineages. L1: green; L2: cyan; L3: blue; L4: red; L5: purple; L6: violet; L7: orange.
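The record does not state how the tree was built from the insertion data. As one possible illustration, and not the authors' method, insertion sites can be treated as presence/absence characters and compared between isolates with a Jaccard distance; the resulting matrix could then be passed to any distance-based tree-building tool. The function names and the toy isolates below are hypothetical.

```python
def jaccard_distance(sites_a, sites_b):
    """1 - |A ∩ B| / |A ∪ B| for two collections of insertion sites."""
    a, b = set(sites_a), set(sites_b)
    union = a | b
    if not union:
        return 0.0  # two isolates without any insertion are treated as identical
    return 1.0 - len(a & b) / len(union)


def distance_matrix(sites_by_isolate):
    """Return sorted isolate names and their pairwise Jaccard distance matrix."""
    names = sorted(sites_by_isolate)
    matrix = [
        [jaccard_distance(sites_by_isolate[x], sites_by_isolate[y]) for y in names]
        for x in names
    ]
    return names, matrix


# Toy isolates (made up); each site is the start of a 100 bp window.
sites = {
    "A": {1500, 888800},
    "B": {1500, 888800, 1998600},
    "C": {475200},
}

names, matrix = distance_matrix(sites)
for name, row in zip(names, matrix):
    print(name, [round(d, 2) for d in row])
```

In this toy example, isolates A and B share two of three sites and end up close together, while C shares none and is maximally distant from both.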