Literature DB >> 30036392

Whole genome SNP analysis suggests unique virulence factor differences of the Beijing and Manila families of Mycobacterium tuberculosis found in Hawaii.

Kent Koster1, Angela Largen2, Jeffrey T Foster3, Kevin P Drees4, Lishi Qian1, Edward P Desmond5, Xuehua Wan6, Shaobin Hou6, James T Douglas1.   

Abstract

While tuberculosis (TB) remains a global disease, the WHO estimates that 62% of the incident TB cases in 2016 occurred in the WHO South-East Asia and Western Pacific regions. TB in the Pacific is composed predominantly of two genetic families of Mycobacterium tuberculosis (Mtb): Beijing and Manila. The Manila family is historically under-studied relative to the families that comprise the majority of TB in Europe and North America (e.g. lineage 4), and it remains unclear why this lineage has persisted in Filipino populations despite the predominance of more globally successful Mtb lineages in most of the world. The Beijing family is of particular interest as it is increasingly associated with drug resistance throughout the world. Both of these lineages are important to the State of Hawaii, where they comprise over two-thirds of TB cases. Here, we performed whole genome sequencing on 82 Beijing family, Manila family, and outgroup clinical Mtb isolates from Hawaii to identify lineage-specific SNPs (SNPs found in all isolates from their respective families, and exclusively in those families) in established virulence factor genes. Six non-silent lineage-specific virulence factor SNPs were found in the Beijing family, including mutations in alternative sigma factor sigG and polyketide synthases pks5 and pks7. The Manila family displayed more than eleven non-silent lineage-specific and characteristic virulence factor mutations, including in genes coding for MCE-family protein Mce1B, two mutations in fatty-acid-AMP ligase FadD26, and virulence-regulating transcriptional regulator VirS. This study further identified an ancient clade that shared some virulence factor mutations with the Manila family, and investigated the relationship of those and other "Manila-like" spoligotypes to the Manila family with this SNP dataset. This work identified a set of virulence genes that are worth pursuing to determine potential differences in transmission or virulence displayed by these two Mtb families.

Entities:  

Mesh:

Substances:

Year:  2018        PMID: 30036392      PMCID: PMC6056056          DOI: 10.1371/journal.pone.0201146

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Despite sustained control efforts, tuberculosis (TB) remains the ninth leading cause of death worldwide [1]. The World Health Organization (WHO) estimated that 62% of the incident TB cases in 2016 occurred in the WHO South-East Asia and Western Pacific Regions [1]. Within those regions, TB in both China and the Philippines in characterized by distinct families of TB–the Beijing family in China and the Manila family in the Philippines [2, 3]. These families fall into global genetic lineages 1 (the Manila family) and 2 (the Beijing family), and are distinct from the lineage 4 clades that are dominant in Europe and North America [4, 5]. The WHO’s Global Tuberculosis Report 2017 indicates these regions and Mtb families as areas of concern [1]. The Philippines was only surpassed in number of incident cases by India, Indonesia, and China. As noted in the report, the Fourth National Survey of the Prevalence of TB Disease in the Philippines was conducted from March to December of 2016, and the survey revealed that tuberculosis was not well controlled in the country [1]. Comparing the prevalence of smear-positive TB among participants aged 15 years or more to the case notification rate for smear-positive TB cases in 2016 (142 per 100,000 population) in the same age group gave a prevalence-to-notification ratio of 3.1. From that, the group’s adjusted prevalence of culture positive TB became 463 per 100,000 population (95% CI: 333−592) in 2007 and 512 per 100,000 population (95% CI: 420−603) from that study in 2016. The probability that prevalence did not decline over the period 2007–2016 was estimated at 75%. Following the survey, the estimate of TB incidence in the Philippines was revised to 554 cases per 100,000 (95% CI: 311–866), or 573,000 cases. That was well above the incidence previously estimated by the WHO (322 cases per 100,000). In contrast to the steady or increasing incidence of TB in the Philippines, TB incidence in China is thought to be declining, with the WHO’s estimate being 895,000 cases (64 per 100,000) in 2016. However, as also noted in the WHO’s report, 4.1% of new TB cases and 19% of re-treated TB cases globally are multidrug-resistant (MDR) TB cases. China is joined by India and the Russian Federation to account for 47% of the world’s total incident MDR TB cases. China saw an estimated 73,000 incident MDR TB cases in 2016, compared to 30,000 for the Philippines. Considerably more concerning is China’s estimated percentage of new TB cases that are MDR in 2016, at 7.1%,compared to 2.6% for the Philippines, though the Philippines’ total MDR rate is expected to continue to increase [6]. Approximately 70% of TB cases in China are of the Beijing family, which also displayed the highest MDR rate of all lineages in the country [7]. Beyond China, it has further been proposed that the population structure of Mtb in areas of high drug resistance is shifting towards Beijing family strains [8]. While the Beijing family has been studied with increasing intensity, especially in the context of drug resistance, considerably less work has focused on the Manila family. Both families are of importance to TB control in the United States, where immigration from the WHO South-East Asia and Western Pacific Regions continually presents TB controllers with new TB cases. The State of Hawaii, centrally located in the Pacific and experiencing a high level of immigration from throughout both of those WHO Regions, accordingly has a composition of TB lineages that resembles the United States Affiliated Pacific Islands more than the continental United States [9]. As expected, immigration from Asia imports primarily Beijing family strains to Hawaii, while immigration from the Philippines brings primarily Manila family strains [2, 3]. Those two families combined are responsible for over two-thirds of the TB cases in Hawaii, a ratio that has remained constant since the introduction of the use of spoligotyping to type Hawaii’s Mtb in 1997 [10, 11]. Furthermore, high levels of immigration from throughout the Pacific into Hawaii result in the state perennially displaying either the highest or second-highest rates of TB in the United States, with Hawaii’s TB rate remaining steady even as TB rates throughout the US continue to decrease [12]. Thus, the need to further understand the characteristics of the Beijing and Manila families of Mtb is of considerable importance not only to countries throughout the Pacific, but also to the United States. Investigating unique differences in virulence factors that are characteristic of each of these families may help explain those families’ success in their originating environments, and may aid in developing improved control strategies if differences affecting latency, reactivation, or host preference are identified. The Manila family, for example, continues to predominate in Filipino hosts, where other, more globally successful lineages have been unable to displace the Manila family and rise to high incidence in the Philippines. Considerably less research has been performed on the Manila family than on the Beijing family or on lineage 4, thus which attributes may contribute to the Manila family’s persistence remains an open question. This study seeks to utilize next-generation Illumina whole-genome sequencing on a diverse set of Beijing and Manila family-focused clinical Mtb isolates from Hawaii in order to identify lineage-specific virulence factor differences of these two families. Here, we examine the Mtb genomes to find non-silent (i.e. non-synonymous), single-nucleotide polymorphisms (SNPs) specific to each family in virulence genes as well as other genes of interest. These SNPs are alleles that are unique to either the Beijing or Manila family isolates among the 82 that we studied. In order for a SNP to be described here as a Beijing family-specific SNP, that SNP must have been found in all Beijing family isolates from this study, but not have been found in any of this study’s non-Beijing family isolates. Similarly, SNPs described as Manila family-specific must have been found in all Manila family isolates from this study, but not have been found in any of this study’s non-Manila family isolates. Additionally, Hawaii and the Pacific region frequently see Mtb isolates with spoligotyping fingerprints that resemble the Manila family, but are less commonly encountered than isolates possessing spoligotyping patterns that have been established as “Manila family.” This study will utilize phylogenetics from full-genome sequences to better describe the relationship of such patterns and isolates to the Manila family.

Materials and methods

Selection of Mtb isolates for study

A comprehensive database of all 1,076 M. tuberculosis isolates processed for genetic fingerprinting for the State of Hawaii from 2002 through 2016 (and including a partial set from 2017) was obtained from the State of Hawaii Department of Health Tuberculosis Control Program for this study. Spoligotype octal codes are presented as recorded by the Centers for Disease Control and Prevention (CDC) contracted reference laboratory that performed the typing. Lineage or family names were assigned to spoligotype octal codes using the SpolDB4 database, with “EAI2_MANILLA” entries being considered “Manila family.” [13]. Eighty-two Mtb isolates from this set were selected for use in this study (Table 1). Additionally, a Mycobacterium bovis BCG P3 isolate was used as an outgroup for phylogenetic analysis, as M. bovis has been shown to have diverged from the M. tuberculosis ancestor prior to the differentiation of the TB lineages examined in this study [5].
Table 1

Full list of sequenced Mycobacterium tuberculosis isolates.

#AccessionNumberSpoligotypeNameSpoligotypeOctal CodeMIRU-VNTR Loci 1–12MIRU-VNTR Loci 13–24Comments
709L1688Beijing000000000003771223315175533444444423348Beijing Diversity Selection
1009L2178Beijing000000000003771222325173533445644423328Beijing Clustered
1610L8643Beijing000000000003771222325173533445644423326Beijing Clustered
1810L9505Beijing000000000003771223325173432345544423539Beijing Diversity Selection
2211L1883Beijing000000000003771222325173533445644423326Beijing Clustered
2811L0426Beijing000000000003771222325173533445644423328Beijing Clustered
2911L0427Beijing000000000003771222325173533445644423328Beijing Clustered
3512L6290Beijing000000000003771222325173533445644423326Beijing Clustered
3812L7610Beijing000000000003771223325173533545534223328Beijing Diversity Selection
3912L7621Beijing000000000003771223325173521343544423339Beijing Diversity Selection
4012L7991Beijing000000000003771222325173533445644423328Beijing Clustered
5215RF4481Beijing000000000003771223325163533448544423328Beijing Clustered
5615RF6753Beijing000000000003771222325173533445644423328Beijing Clustered
5816RF0102Beijing000000000003771222325173533445644423328Beijing Clustered
6316RF2356Beijing000000000003771222325173533445644423328Beijing Clustered
6416RF4738Beijing000000000003771223325163533448544423328Beijing Clustered
7508L7125Beijing000000000003771222325173533-Beijing Clustered
7910L8839Beijing000000000003771222325173533445644423326Beijing Clustered
8010L9812Beijing000000000003771222325173533445644423326Beijing Clustered
8111L3224Beijing000000000003771222325173533445644423326Beijing Clustered
8212L6284Beijing000000000003771222325173533445644423326Beijing Clustered
8513L2956Beijing000000000003771222325173533445644423328Beijing Clustered
8616RF7084Beijing000000000003771222325173533445644423328Beijing Clustered
7408L0161Beijing000000000003751222325173533-Uncommon Beijing
7709L2141Beijing000000000003751222325173533445644423328Uncommon Beijing
205L3477Manila67777747741377125432622343214a843263217Manila Clustered
305L3487Manila67777747741377125432622343214a943263217Manila Clustered
408L7789Manila67777747741377125432622343214a923273217Manila Diversity Selection
609L1025Manila677777477413771265326223432148443263216Manila Diversity Selection
909L1898Manila67777747741377125432622343214a843263217Manila Clustered
1109L2431Manila67777747741375125431622343214a743263217Manila Diversity Selection
1410L0115Manila677777477413771254326223432149843263217Manila Diversity Selection
1510L5331Manila677777477413671254326223432147743263217Manila Diversity Selection
1710L9503Manila677777477413771244326223432149443263217Manila Diversity Selection
2011L1563Manila677777477413771254326222432245843263217Manila Diversity Selection
2311L3658Manila677777477413771254326223332148943273217Manila Diversity Selection
2411L4599Manila67777747741377125432622343214a843243215Manila Diversity Selection
3412L6289Manila677777477413771234326223432176843263217Manila Diversity Selection
4112L9260Manila67777747741377125432622343213A843263217Manila Diversity Selection
4313L1313Manila67777747741377124432622344214A943263215Manila Diversity Selection
4513L2435Manila67777747741377125432622343214A943263217Manila Clustered
4613L2950Manila67777747741377125432622343214A843263217Manila Clustered
4814RF3136Manila67777747741377125432622343214a943263217Manila Clustered
4914RF3267Manila677777477000771254326223432149743263217Manila Clustered
5014RF5521Manila67777747741377125432622343214a843253217Manila Clustered
5115RF0460Manila677777477413771255326223432147843263217Manila Clustered
5315RF5637Manila67777747741377125432622343214a923263217Manila Clustered
5916RF0108Manila677777477413771255326223432147843263217Manila Clustered
6016RF1671Manila67777747741377125432622343214a943263217Manila Clustered
6116RF1679Manila67777747741377125432622343214a943253217Manila Clustered
6216RF1689Manila67777747741370025432622343214a943263217Manila Clustered
6516RF5423Manila677777477413771254326223422247843263217Manila Clustered
6909L0735Manila677777477413771--Singleton, Deletions Study
7009L1111Manila677777477413771--Singleton, Deletions Study
8996128Manila677777477413701--Uncommon Manila
9020034Manila677777477413751--Uncommon Manila
509L0897EAI5777777776413771364225223533245273243416For Manila-like Comparison
3311L5411EAI5777777776413771275225223532245273243415For Manila-like Comparison
8413L2584no match*67777447741377124432622343214A843263217Manila-like Clustered
2111L1565no match**600777477413771254326223432146743263217Manila-like Cluster 1
3011L2683no match**67777740200377125132622343214b843233217Manila-like Cluster 2
3712L7424no match**67777740200377125132622343214B843233217Manila-like Cluster 2
4413L1569no match**600777477413771254326223432146743263217Manila-like Cluster 1
809L1897no match777777607360371123326143326342224123235No Match Diversity Selection
4213L0560no match7037177400037712254251b3533524244223345No Match Diversity Selection
7104L0177no match737777377413771264225223532-No Match Diversity Selection
7206L6898no match737777377413771264225223532-No Match Diversity Selection
7306L7696no match737777377413771264225223532-No Match Diversity Selection
1309L4214H3777777777720771225325153323232634423337Lineage 4 Diversity Selection
2710L8137H3777777770020771222225153322143334213328Lineage 4 Diversity Selection
7609L2140H3777777770020771122325143323233334213326Lineage 4 Diversity Selection
7810L6616H3777777770020771122325143323233334213326Lineage 4 Diversity Selection
104L0176LAM9777777607760771123326153326-Lineage 4 Diversity Selection
1911L0415T1777777777760771242325142222241544223124Lineage 4 Diversity Selection
3111L2831T17777777777607712233151433–3343544223326Lineage 4 Diversity Selection
3612L7116T1777777777760771233326153311232334222523Lineage 4 Diversity Selection
1209L3451T3377737777760771223326153323242334223425Lineage 4 Diversity Selection
4713RF4795U777777760000000223325143322242324223422U Diversity Selection
5415RF5640U777777760000000223325143322242324223422U Diversity Selection
5515RF6749U777777760000000223325143322242334223322U Diversity Selection
5715RF6755U777777760000000223325143322242324223422U Diversity Selection
8313L2363U777777760000000223325143322242324223422U Diversity Selection
66BCG P3BOVIS1_BCG676773777777600--M. bovis outgroup
-ReferenceH37Rv7777774777607712x3226133321242534233525Reference for alignment

Complete list of Mycobacterium tuberculosis isolates selected for sequencing for this study. Isolates are identified by one or two digit University of Hawaii DNA extraction numbers and Centers for Disease Control and Prevention (CDC) genome accession numbers. Spoligotypes are named as presented in SpolDB4.

*No match in SpolDB4, but strongly suspected to be Manila family based on similarity.

**Manila-like.

Complete list of Mycobacterium tuberculosis isolates selected for sequencing for this study. Isolates are identified by one or two digit University of Hawaii DNA extraction numbers and Centers for Disease Control and Prevention (CDC) genome accession numbers. Spoligotypes are named as presented in SpolDB4. *No match in SpolDB4, but strongly suspected to be Manila family based on similarity. **Manila-like.

Selection of Mtb isolates for identification of lineage-specific virulence factor differences

From the database, 25 Beijing family (spoligotype 000000000003771) isolates covering a chronological range from 2005 to 2016 were selected for whole genome sequencing. Of those, 19 isolates were selected from putative transmission clusters, four were selected to increase our represented Beijing family diversity (based on their MIRU-VNTR [Mycobacterial Interspersed Repetitive Unit-Variable Number Tandem Repeat] fingerprints), and two were selected due to possessing an uncommon Beijing family spoligotype (000000000003751). Thirty-one Manila family (representative spoligotyping pattern: 677777477413771) isolates were selected, of which 15 were selected from putative transmission clusters and 12 were selected based on uniqueness of MIRU-VNTR fingerprints to maximize this study’s Manila family diversity. To further increase the diversity of Manila family isolates examined in this study, two additional Manila family isolates with uncommon Manila family spoligotyping patterns (isolates 96128 and 20034) and two Manila family isolates with unique large-sequence-polymorphism (LSP) deletion patterns from a prior study (09L0735 and 09L1111) were selected from our laboratory’s Mycobacterial DNA Bank at the University of Hawaii, where their extracted DNA was archived [14]. Fourteen outgroup isolates from non-Beijing, non-Manila lineages from Hawaii during the 2002–2016 time period were selected for comparison, primarily on the basis of uniqueness of spoligotyping and MIRU-VNTR patterns. These isolates included four with the H3 spoligotype, one LAM9, three T1, one T3, and five from a suspected outbreak cluster with an undefined spoligotyping pattern (777777760000000). An additional three-isolate suspected transmission cluster with no SpolDB4 spoligotype match (737777377413771) was also selected.

Selection of Mtb isolates for comparison of “Manila-like” strains

Two putative two-isolate transmission clusters whose spoligotypes had no match in SpolDB4, but whose spoligotyping patterns otherwise exhibited substantial similarity to the Manila family, were selected for whole-genome comparison to the Manila family (Manila-like Cluster 1: isolates 21 and 44; Manila-like Cluster 2: isolates 30 and 37). Two EAI5 isolates were also selected (isolates 5 and 33) for sequencing due to the EAI5 clade’s similarity to the Manila family. An additional clustered isolate with no spoligotype match in SpolDB4 (isolate 84) that we considered to be Manila family based on the similarity of its spoligotype and MIRU-VNTR fingerprints was also included. Finally, two isolates with no spoligotype match in SpolDB4 and no near-matches with MIRU-VNTR patterns from other isolates (isolates 8 and 42) were selected to represent lineage-agnostic outgroups.

Whole genome sequencing

Recall of state of Hawaii Mtb isolates

All required isolates were requested from the laboratories where they had been previously sent by the State of Hawaii for contracted fingerprinting, and where they were archived (California Department of Public Health State Laboratory and Michigan Department of Community Health). Spoligotyping and 24-loci MIRU-VNTR fingerprints are displayed as provided by those laboratories. The Michigan Department of Community Health returned extracted DNA from their isolates, while isolates from the California Department of Public Health State Laboratory were returned as “double-killed” sample preps using a treatment of immersion in 70% ethanol and then heating at 80°C for one hour.

DNA extraction and whole genome sequencing

DNA extraction was performed as previously described by the National Institute of Public Health and Environmental Protection (RIVM), Bilthoven, The Netherlands (Isolation of Genomic DNA from Mycobacteria Protocol), or according to the source state laboratory’s internal protocol. DNA was quantified with the Qubit 2.0 dsDNA Broad Range Assay. Genomic libraries were prepared using the Illumina Nextera XT DNA Library Kit using manual normalization and sequenced on the Illumina MiSeq Platform with v3 Chemistry for 300 bp paired-end reads.

Data analysis

SNP matrices were produced using a modification of the NASP pipeline, with Bowtie2 used for alignment, GATK used for SNP-calling, and SNPs being filtered for a minimum of ten-fold read coverage and 90% read consensus [15, 16, 17]. Repetitive regions were removed by the NASP pipeline. A M. tuberculosis type strain H37Rv genome (GenBank NC_000962.3) was used as the scaffold for read alignment for identification of virulence factor SNPs, and our de novo sequenced representative Manila family genome “96121” (GenBank CP009427) was used as the scaffold for read alignment for construction of phylogenies and comparison of “Manila-like” strains. The number and percentage of mapped reads was determined using SAMtools flagstat. Percent coverage of the 4,411,532 base pair H37Rv reference genome NC_000962.3 was determined using SAMtools mpileup.

Identification of lineage-specific virulence factor differences

Output matrices from the NASP pipeline were imported into a Microsoft Access database and SQL queries were utilized to identify SNPs relative to H37Rv that were shared by all members of either the Beijing or Manila families that we sequenced. SQL queries were further used to search all non-Beijing isolates for those SNPs in order to exclude them as non-specific to the Beijing family if found. For the Manila family, SNPs that were shared by all Manila family isolates were searched for in all non-Manila family isolates–however, their presence in Manila-like isolates or isolates with no spoligotype match in SpolDB4 did not automatically exclude them, as the relationships of those isolates to the Manila family was examined separately and individually. SNPs in genes described in Forrellad et al.’s 2013 review of Mtb virulence were designated as “virulence factor SNPs” [18]. The strength of the association between lineage-specific and lineage-characteristic SNPs and their respective groups versus outgroups was calculated using the Fisher Exact method using R Statistics version 3.4.1 [19].

Exploring the relationship of Manila-like isolates to the Manila family

Spoligotypes of isolates (presented in this work as their CDC octal codes) were compared in order to identify the number of gained or lost spacers relative to the representative Manila family spoligotype (677777477413771). All isolates examined were individually aligned against Manila family do novo sequenced genome 96121 using the NASP pipeline to determine the numbers of SNPs separating those isolates from 96121. Putative Manila family-specific non-silent virulence factor SNPs were examined for all Manila-like, EAI5, and no-match isolates, with Mtb type strain H37Rv and the Beijing family serving as known outgroups. A phylogenetic tree was constructed using MEGA7 utilizing maximum parsimony, obtaining the most parsimonious tree using the Subtree-Pruning-Regrafting algorithm with search level 2 where the initial trees were obtained by the random addition of sequences with ten replicates, and branch lengths were calculated using the average pathway method [20]. The maximum parsimony phylogeny was supported by 1,000 bootstrap repetitions, and rooted with Mycobacterium bovis BCG P3. To further support that phylogeny, a model averaged phylogeny was generated using jModelTest 2.1.10 with likelihood scores computed using a maximum likelihood optimized base tree with NNI base tree search and 11 substitution schemes [21]. Akaike Information Criterion (AIC), Akaike Information Criterion corrected for small sample sizes (AICc), and Bayesian Information Criterion (BIC) (all with model averaging and 100% confidence intervals) were evaluated to find the best-fit. The model-averaged phylogeny was created using AICc as the criterion for tree weights and with majority rule consensus and a 100% confidence interval. Isolates were designated as “ancestral” or “modern” on the basis of the presence or absence of the TbD1 deletion region [22]. The presence of the TbD1 deletion was identified in the sequenced isolates by a gap in the bases aligned to de novo sequenced genome 96121 (GenBank CP009427) in the range of nucleotide positions 1,759,200 to 1,761,320.

Results

Of the 82 sequenced and analyzed Mtb genomes, 71 provided 90% or greater coverage of the H37Rv reference genome at 10X depth (Table 2). Eight more showed intermediate (40–90%) coverage, and three showed low coverage. Remarkably, despite showing less than 10% coverage of the reference genome, Manila family Isolate 53/15RF5637 still demonstrated 200-300X coverage depth at all of the Manila specific and characteristic SNP loci. Two isolates from a putative outbreak cluster with an undefined “U” spoligotype displayed extremely low coverage. However, the three other identically fingerprinting isolates from the same putative cluster displayed >95% coverage, ensuring that the putative cluster was still able to provide data for this study.
Table 2

Whole genome sequencing and SNP data.

#Accession NumberSpoligo-typeTotal SNPsUniqueSNPsBases with 10X CoveragePercentCoverageReads Passed QCReads Mapped% Reads Mapped
709L1688Beijing1214150436674999.0%8000000786423298.3%
1009L2178Beijing12595435669998.8%2157898212291398.4%
1610L8643Beijing12630437704299.2%8000000761662395.2%
1810L9505Beijing1199106436970999.1%1607256154847396.3%
2211L1883Beijing12630429766297.4%91205288682197.2%
2811L0426Beijing12540428700397.2%91296689781998.3%
2911L0427Beijing12540437029999.1%3370522326847997.0%
3512L6290Beijing12630434216898.4%1049434101543396.8%
3812L7610Beijing123692433815298.3%1302504127402597.8%
3912L7621Beijing1219126432937398.1%100176896794596.6%
4012L7991Beijing12365434889298.6%1587788155931398.2%
5215RF4481Beijing12240429550397.4%101925898763096.9%
5615RF6753Beijing12398426955996.8%86808083890096.6%
5816RF0102Beijing12301429831797.4%1066562103329596.9%
6316RF2356Beijing12370421680095.6%1316206127614797.0%
6416RF4738Beijing12240431160097.7%1271118121111395.3%
7508L7125Beijing12310408309492.6%1234036118214895.8%
7910L8839Beijing12420428432697.1%1887444181110596.0%
8010L9812Beijing12420434191798.4%2928112280698695.9%
8111L3224Beijing12420403878391.6%1208578115942695.9%
8212L6284Beijing12640429870997.4%2068176198876196.2%
8513L2956Beijing12532433900498.4%3256308315813097.0%
8616RF7084Beijing12601434078198.4%2406912231284296.1%
7408L0161Beijing12561430826997.7%1918020185814096.9%
7709L2141Beijing12550432534598.0%3012910290373996.4%
205L3477Manila189842438206399.3%1391250136136197.9%
305L3487Manila188663439974199.7%8000000740137492.5%
408L7789Manila187361438148199.3%1868494182671997.8%
609L1025Manila1898105439224899.6%1694468165663197.8%
909L1898Manila188528437534299.2%1906012185421997.3%
1109L2431Manila191495438392599.4%2261522221648698.0%
1410L0115Manila187752437464299.2%2365498232580198.3%
1510L5331Manila189624436802999.0%2555006250589198.1%
1710L9503Manila188695439622199.7%8000000757687194.7%
2011L1563Manila189741435589498.7%1132140104723592.5%
2311L3658Manila174072438674499.4%3066566300661698.0%
2411L4599Manila172358437354499.1%2375178232329697.8%
3412L6289Manila17240309648270.2%50626649703298.2%
4112L9260Manila172148435262298.7%2009858197733898.4%
4313L1313Manila1770135436641599.0%2166864212640598.1%
4513L2435Manila173069434644598.5%1431348140643698.3%
4613L2950Manila173864195203444.2%39594834805687.9%
4814RF3136Manila173549406496892.1%58769256357495.9%
4914RF3267Manila175374421090295.5%71903668296195.0%
5014RF5521Manila175570238127054.0%89716627909631.1%
5115RF0460Manila11040425522496.5%74793472027796.3%
5315RF5637Manila1125374013409.1%14242041401539.8%
5916RF0108Manila11040427409396.9%87653283812695.6%
6016RF1671Manila112132387998788.0%55184653161196.3%
6116RF1679Manila113143417582994.7%68613265848396.0%
6216RF1689Manila113364427563896.9%89947686581096.3%
6516RF5423Manila113288204803046.4%85988831107336.2%
6909L0735Manila111030435600098.7%1655260160876697.2%
7009L1111Manila111439275757262.5%32361231078796.0%
8996128Manila179052437075299.1%4774208463240797.0%
9020034Manila179058436082898.9%2170986208584796.1%
509L0897EAI51880190434206598.4%1479306142818396.5%
3311L5411EAI518658306857969.6%50729449777498.1%
8413L2584no match*1803112433562698.3%2260038217805496.4%
2111L1565no match**18150436121798.9%1318272127992897.1%
3011L2683no match**17920436595899.0%6147928599029197.4%
3712L7424no match**17975435219798.7%2665532235239088.3%
4413L1569no match**18183438927799.5%8000000658322282.3%
809L1897no match73849437237499.1%2008844197775798.5%
4213L0560no match1242574439615799.7%6411876588800791.8%
7104L0177no match17840227310151.5%55031252588195.6%
7206L6898no match17850428504397.1%1880038179928595.7%
7306L7696no match17850422939895.9%1585002151807795.8%
1309L4214H3820348436142398.9%2175416214128398.4%
2710L8137H3757139409769092.9%61579059898297.3%
7609L2140H37670428672197.2%1786962172562396.6%
7810L6616H37702433621398.3%2394498232166197.0%
104L0176LAM980876437667999.2%2094096206234298.5%
1911L0415T1751325435830698.8%1089628106364097.6%
3111L2831T1725310438541599.4%8000000719768390.0%
3612L7116T1325123439376899.6%1499892118638079.1%
1209L3451T3507191438270399.3%2509032246918798.4%
4713RF4795U620429477797.4%89167286674297.2%
5415RF5640U86558516513.3%112849614688713.0%
5515RF6749U570422455295.8%79696875638694.9%
5715RF6755U181101921612.1%904078557346.2%
8313L2363U602435881598.8%2559816247327596.6%

Isolates are identified by one or two digit University of Hawaii DNA extraction numbers and Centers for Disease Control and Prevention (CDC) genome accession numbers. Spoligotypes are named as presented in SpolDB4. Total SNPs are those identified versus the Mtb type strain genome H37Rv GenBank NC_000962.3 (length: 4,411,532 base pairs). Unique SNPs are alleles not found at that locus in any other isolate in this study.

*No match in SpolDB4, but strongly suspected to be Manila family based on similarity.

**Manila-like.

Isolates are identified by one or two digit University of Hawaii DNA extraction numbers and Centers for Disease Control and Prevention (CDC) genome accession numbers. Spoligotypes are named as presented in SpolDB4. Total SNPs are those identified versus the Mtb type strain genome H37Rv GenBank NC_000962.3 (length: 4,411,532 base pairs). Unique SNPs are alleles not found at that locus in any other isolate in this study. *No match in SpolDB4, but strongly suspected to be Manila family based on similarity. **Manila-like. The numbers of SNPs displayed by each isolate are generally determined by the isolates’ families, with ancestral Manila family isolates, as expected, displaying the most SNPs against the modern H37Rv genome. The number of unique SNPs for each isolate was mostly consistent with that isolate’s clustering. Isolates from direct transmission clusters displayed zero to six unique SNPs, while isolates that shared a MIRU-VNTR fingerprint, but were not actually transmission linked (common for Beijing and Manila family isolates) displayed dozens or more.

Identification of lineage-specific virulence factor differences

Multiple Beijing family specific virulence factor SNPs were identified in this study. In addition to identifying virulence factor differences specific to the Manila family, we also identified “lineage characteristic” virulence factor differences that were also found in a relatively uncommon non-Manila clade, but which may regardless be important for charactering the Manila family’s virulence.

Beijing family-specific virulence factor SNPs

Six SNPs in five previously described virulence factor genes were identified as lineage-specific non-silent virulence factor SNPs for the Beijing family (Table 3). Four additional Beijing-specific mutations in four genes possibly contributing to virulence or drug resistance, or otherwise of interest, were also identified.
Table 3

Beijing family-specific virulence factor SNP mutations, non-silent.

Published Virulence Factor Mutations
Nucleotide LocusReferenceVariantRv #Nucleotide #StrandOld CodonOld AANew CodonNew AAGene Titlep-value
213147CARv0182c994RGACDTACFalternative RNA polymerase sigma factor SigG<0.005
213281CTRv0182c860RGGCGGACDalternative RNA polymerase sigma factor SigG<0.005
1722228ACRv1527c6182RCTGLCGGRpolyketide synthase Pks5<0.005
1877744ACRv16612441FGAAEGCAApolyketide synthase Pks7<0.005
2673818CGRv2383c2020RGTGVCTGLphenyloxazoline synthase MbtB<0.005
3979990CGRv3540c670RGTGVCTGLlipid-transfer protein or keto acyl-CoA thiolase Ltp2<0.005
Possible Virulence Factor Mutations and Genes of Interest
Nucleotide LocusReferenceVariantRv #Nucleotide #StrandOld CodonOld AANew CodonNew AAGene Titlep-value
1361285CTRv1217c517RGCTAAGTSantibiotic transporter permease<0.005
1643864TCRv1458c397RACCTGCCAantibiotic transporter ATP-binding protein<0.005
1955941GCRv1730c1305RGACDGAGEpenicillin-binding protein<0.005
3127931TARv2820342RAAAKAATNCRISPR type III-a/mtube-associated ramp protein csm4<0.005

List of non-silent SNPS in possible virulence factor genes or genes of interest that were exclusively found in all Beijing family isolates. None of these alleles were found in any non-Beijing-family isolates in this study. Rv numbers and gene titles are those provided by TubercuList [23]. Nucleotide loci are genomic nucleotide positions in Mtb H37Rv genome NC_000962.3. P-values were calculated using the Fisher Exact method. F: forward strand. R: reverse strand. AA: amino acid.

List of non-silent SNPS in possible virulence factor genes or genes of interest that were exclusively found in all Beijing family isolates. None of these alleles were found in any non-Beijing-family isolates in this study. Rv numbers and gene titles are those provided by TubercuList [23]. Nucleotide loci are genomic nucleotide positions in Mtb H37Rv genome NC_000962.3. P-values were calculated using the Fisher Exact method. F: forward strand. R: reverse strand. AA: amino acid.

Manila family-specific virulence factor SNPs

Seven SNPs in six previously described virulence factor genes were identified as non-silent lineage-specific virulence factor SNPs for the Manila family (Table 4). Two of those mutations were located several bases upstream of the start codons, possibly in the promotor region or ribosome binding site. Four additional virulence factor mutations found in all Manila family isolates, but also in one epidemiological cluster with no SpoldDB4 spoligotype match and in one phylogenetically distinct EAI5 isolate, are also listed in Table 4.
Table 4

Family specific and characteristic virulence factor SNP mutations of the Manila family of Mtb.Family specific virulence factor mutations of the Manila family, non-silent.

Nucleotide LocusReferenceVariantRv #Nucleotide #StrandOld CodonOld AANew CodonNew AAGene Titlep-value# Non-Manila
200379CTRv0170485FGCCAGTCVMCE-family protein Mce1B<0.005-
236512CTRv0198c -5-5-----5 bases upstream Rv0198c zinc metalloprotease Zmp1, possibly in the ribosome binding site<0.005-
235681GARv0191c827RGGCAGTGVzinc metalloprotease Zmp1<0.005-
495198AGRv0410c2117RTTCFTCCSserine/threonine-protein kinase PknG<0.005-
1038500*TGRv0931c1415RCAGQCCGPtransmembrane serine/threonine-protein kinase D PknD<0.005-
1875295*CTPks7–9-9-----9 bases upstream of Rv1661 polyketide synthase Pks7, possibly in the promotor<0.005-
3244126GARv2930430FGTTVATTIfatty-acid-AMP ligase FadD26<0.005-
Family characteristic virulence factor mutations of the Manila family, non-silent
Nucleotide LocusReferenceVariantRv #Nucleotide #StrandOld CodonOld AANew CodonNew AAGene Titlep-value# Non-Manila
2745739GARv2445c -15-15-----15 bases upstream of Rv2445c nucleoside diphosphate kinase NdkA, possibly in the promotor<0.0052 (4)
3244414AGRv2930718FATGMGTGVfatty-acid-AMP ligase FadD26<0.0052 (4)
3447480ACRv3082c947RCTCLCGCRvirulence-regulating transcriptional regulator VirS<0.0052 (4)
4290827CGRv3823c703RGGGGCGGRmembrane transporter MmpL8<0.0052 (4)

List of family-specific and family-characteristic non-silent SNPs in virulence factor genes. All family-specific SNPs were exclusively found in all Manila family isolates. None of these Manila-specific SNPs were found in any non-Manila-family isolates in this study. Family-characteristic SNPs are non-silent SNPs in virulence factor genes that were found in all Manila family isolates, as well as a limited set of similar isolates in this study.

# Non-Manila is the number of clusters that are neither Manila family nor Manila-like that possess this allele (number of contained isolates is in parenthesis). Rv numbers and gene titles are those provided by TubercuList [23]. Nucleotide loci are genomic nucleotide positions in Mtb H37Rv genome NC_000962.3. P-values were calculated using the Fisher Exact method. F: forward strand. R: reverse strand. AA: amino acid. *One two-isolate Manila-like cluster did not display either of these two SNPs.

List of family-specific and family-characteristic non-silent SNPs in virulence factor genes. All family-specific SNPs were exclusively found in all Manila family isolates. None of these Manila-specific SNPs were found in any non-Manila-family isolates in this study. Family-characteristic SNPs are non-silent SNPs in virulence factor genes that were found in all Manila family isolates, as well as a limited set of similar isolates in this study. # Non-Manila is the number of clusters that are neither Manila family nor Manila-like that possess this allele (number of contained isolates is in parenthesis). Rv numbers and gene titles are those provided by TubercuList [23]. Nucleotide loci are genomic nucleotide positions in Mtb H37Rv genome NC_000962.3. P-values were calculated using the Fisher Exact method. F: forward strand. R: reverse strand. AA: amino acid. *One two-isolate Manila-like cluster did not display either of these two SNPs.

Manila family-specific possible virulence factor or gene-of-interest SNPs

Five Manila-specific SNPs in genes possibly contributing to virulence or drug resistance, or otherwise of interest, were identified (Table 5). Six additional such mutations found in all Manila family isolates, but also in one epidemiological cluster with no SpoldDB4 spoligotype match and in one phylogenetically distinct EAI5 isolate, are also listed in Table 5.
Table 5

Manila family specific and characteristic SNP mutations in possible virulence factor genes and genes of interest.

Family specific possible virulence factor mutations and genes of interest of the Manila family, non-silent
Nucleotide LocusReferenceVariantRv #Nucleotide #StrandOld CodonOld AANew CodonNew AAGene Titlep-value# Non-Manila
27199GARv0022c244RCAGQTAGSTOPtranscriptional regulator WhiB-like WhiB5<0.005-
2239160GCRv1996157FGGGGCGGRuniversal stress protein<0.005-
2654696GARv2374c398RTCGSTTGLheat shock protein transcriptional repressor HrcA<0.005-
3129675CTRv2823c2099RCGCRCACHCRISPR-associated protein cas10/csm1, subtype III-a/mtube<0.005-
27473GTWhiB5–31-31-----31 bp upstream of transcriptional regulator WhiB-like WhiB5, possibly in the promotor<0.005-
Family characteristic possible virulence factor mutations and genes of interest related to the Manila family, non-silent
Nucleotide LocusReferenceVariantRv #Nucleotide #StrandOld CodonOld AANew CodonNew AAGene Titlep-value# Non-Manila
412280TGRv03421443FCATHCAGQisoniazid inducible protein IniA<0.0052 (4)
1097023GARv0981202FGGCGAGCSmycobacterial persistence regulator MRPA<0.0052 (4)
2239055CTRv199652FCCCPTCCSuniversal stress protein<0.0052 (4)
2574598CTRv2303c422RAGCSAACNantibiotic-resistance protein<0.0052 (4)
2726051GARv2427ac37-----pseudogene OxyR protein<0.0052 (4)
27469AGWhiB5–27-27-----27 bp upstream of transcriptional regulator WhiB-like WhiB5, possibly in the promotor<0.0052 (4)

List of family-specific and family-characteristic non-silent SNPs in possible virulence factor genes or genes of interest. All family-specific SNPs were exclusively found in all Manila family isolates. None of these Manila-specific SNPs were found in any non-Manila-family isolates in this study. Family-characteristic SNPs are non-silent SNPs in virulence factor genes that were found in all Manila family isolates, as well as a limited set of similar isolates in this study.

# Non-Manila is the number of clusters that are neither Manila family nor Manila-like that possess this allele (number of contained isolates is in parenthesis). Rv numbers and gene titles are those listed in TubercuList [23]. Nucleotide loci are genomic nucleotide positions in Mtb H37Rv genome NC_000962.3. P-values were calculated using the Fisher Exact method. F: forward strand. R: reverse strand. AA: amino acid

List of family-specific and family-characteristic non-silent SNPs in possible virulence factor genes or genes of interest. All family-specific SNPs were exclusively found in all Manila family isolates. None of these Manila-specific SNPs were found in any non-Manila-family isolates in this study. Family-characteristic SNPs are non-silent SNPs in virulence factor genes that were found in all Manila family isolates, as well as a limited set of similar isolates in this study. # Non-Manila is the number of clusters that are neither Manila family nor Manila-like that possess this allele (number of contained isolates is in parenthesis). Rv numbers and gene titles are those listed in TubercuList [23]. Nucleotide loci are genomic nucleotide positions in Mtb H37Rv genome NC_000962.3. P-values were calculated using the Fisher Exact method. F: forward strand. R: reverse strand. AA: amino acid

Relationship of Manila-like isolates to the Manila family

Phylogenetic analysis based on all genomic SNPs among selected isolates was conducted to further investigate the relationships of “Manila-like” and other strains to the Manila family (Fig 1). This phylogeny showed that the Manila family isolates that were selected due to their uncommon (but still Manila family) spoligotyping patterns (isolates 89 and 90) grouped within Manila family isolates that displayed the common Manila pattern. Likewise, the isolates selected based on previous deletion analysis to maximize the Manila family diversity in this study (isolates 69 and 70) grouped with other Manila family isolates as expected.
Fig 1

Maximum parsimony phylogeny of Manila-like isolates.

Evolutionary history inferred by MEGA7 using the maximum parsimony method and rooted by Mycobacterium bovis BCG P3. Branch support was determined by 1000 bootstrap repetitions (bootstrap values shown above each node). A model averaged phylogeny further supported this result (data not shown).

Maximum parsimony phylogeny of Manila-like isolates.

Evolutionary history inferred by MEGA7 using the maximum parsimony method and rooted by Mycobacterium bovis BCG P3. Branch support was determined by 1000 bootstrap repetitions (bootstrap values shown above each node). A model averaged phylogeny further supported this result (data not shown). Other isolates displayed diverse relationships with the Manila family. One cluster of “Manila-like” isolates whose spoligotypes had no match in SpolDB4 (isolates 21 and 44) were shown to group inside the Manila family branch. However, one no-match Manila-like cluster (isolates 30 and 37), which was also observed to exhibit pigmentation when grown on Löwenstein–Jensen medium and differed from the Manila family at only two virulence factor SNP loci, occupies a branch of the phylogeny immediately divergent to the Manila family. Interestingly, one EAI5 isolate (5) and a three-isolate no SpolDB4 match epidemiological cluster (71–73) occupy a distinct branch between lineage 1 (Manila family and other EAI lineages) and the modern lineages that include lineages 2 (including the Beijing family) and 4 (including the LAM lineages, H lineages, and others). This was also an expected result, as those isolates matched the Manila-characteristic virulence factor SNP alleles and deviated from them in roughly equal number (Table 6). Other isolates with no SpolDB4 match (isolates 8 and 42, selected for diversity due to appearing to belong to no obvious lineage) grouped with lineage 4 isolates.
Table 6

Virulence factor SNPs specific to Manila and Manila-like isolates.

Spoligotype NameManila ReferenceUncommon Manila SpoligotypeEAI5Uncommon Manila SpoligotypeManila-like*Manila-like**Manila-like** No-SpolDB4 MatchEAI5No SpolDB4 MatchBeijing (Outgroup)No SpolDB4 MatchMtb Type Strain
Isolate #9612189339021843771581042H37Rv
Spacers Lost-31162721826133
Spacer Gained--3----336015
SNPs vs. 96121-2392392782812994378721020192019641970NA
27,469GGGGGGGGGAAAA
412,280GGGGGGGGGTTTT
1,097,023AAAAAAAAAGGGG
2,239,055TTTTTTTTTCCCC
2,574,598TTTTTTTTTCCCC
2,726,051AAAAAAAAAGGGG
2,745,739AAAAAAAAAGGGG
3,244,414GGGGGGGGGAAAA
3,447,480CCCCCCCCCAAAA
4,290,827GGGGGGGGGCCCC
27,199AAAAAAAGGGGGG
27,473TTTTTTTGGGGGG
200,379TTTTTTTCCCCCC
235,681AAAAAAAGGGGGG
236,512TTTTTTTCCCCCC
495,198GGGGGGGAAAAAA
2,239,160CCCCCCCGGGGGG
2,654,696AAAAAAAGGGGGG
2,673,701TTTTTTTCCCCCC
3,129,675TTTTTTTCCCCCC
3,244,126AAAAAAAGGGGGG
1,038,500GGGGGGTTTTTTT
1,875,295TTTTTTCCCCCCC

Comparison of SNP alleles between Manila family, Manila-like, unknown, and outgroup isolates. Shaded alleles match those displayed by Mtb type strain H37Rv, while non-shaded alleles match representative Manila family isolate 96121. Loci displayed in the lower two groups are found in Tables 4 and 5 as family-specific SNPs. Loci in the upper group are found in Tables 4 and 5 as family-characteristic SNPs; although not exclusive to the Manila family, these SNPs shared only with isolates from outside of existing major spoligotyping lineages may be important for defining virulence of the Manila family. All Manila family diversity selection isolates present the same alleles, as do two Manila-like clusters and one EAI5 isolate. One Manila-like cluster differed from the Manila family at two of these selected loci. One other EAI5 isolate and one cluster with no SpolDB4 match diverged from the Manila family at thirteen selected loci. These differences suggest possible differences in virulence for this small clade.

Nucleotide loci are genomic nucleotide positions in Mtb H37Rv genome NC_000962.3.

Spacers gained or lost are relative to the Manila family reference spoligotype 677777477413771.

* No match in SpolDB4, but Manila-like; representative of a two-isolate epidemiological cluster with both isolates sharing the same alleles at these loci.

**No match in SpolDB4, but almost certainly Manila family; representative of a two-isolate epidemiological cluster with both isolates sharing the same alleles at these loci.

† Representative of a two-isolate epidemiological cluster with both isolates sharing the same alleles at these loci.

‡ Representative of a three-isolate epidemiological cluster with all isolates sharing the same alleles at these loci.

Comparison of SNP alleles between Manila family, Manila-like, unknown, and outgroup isolates. Shaded alleles match those displayed by Mtb type strain H37Rv, while non-shaded alleles match representative Manila family isolate 96121. Loci displayed in the lower two groups are found in Tables 4 and 5 as family-specific SNPs. Loci in the upper group are found in Tables 4 and 5 as family-characteristic SNPs; although not exclusive to the Manila family, these SNPs shared only with isolates from outside of existing major spoligotyping lineages may be important for defining virulence of the Manila family. All Manila family diversity selection isolates present the same alleles, as do two Manila-like clusters and one EAI5 isolate. One Manila-like cluster differed from the Manila family at two of these selected loci. One other EAI5 isolate and one cluster with no SpolDB4 match diverged from the Manila family at thirteen selected loci. These differences suggest possible differences in virulence for this small clade. Nucleotide loci are genomic nucleotide positions in Mtb H37Rv genome NC_000962.3. Spacers gained or lost are relative to the Manila family reference spoligotype 677777477413771. * No match in SpolDB4, but Manila-like; representative of a two-isolate epidemiological cluster with both isolates sharing the same alleles at these loci. **No match in SpolDB4, but almost certainly Manila family; representative of a two-isolate epidemiological cluster with both isolates sharing the same alleles at these loci. † Representative of a two-isolate epidemiological cluster with both isolates sharing the same alleles at these loci. ‡ Representative of a three-isolate epidemiological cluster with all isolates sharing the same alleles at these loci. Table 6 further illustrates the distribution of loci that were designated as either Manila-specific (found only in Manila family isolates) or Manila-characteristic (found in all Manila family isolates as well as certain related groups). Notably, two SNPs (1,038,500 and 1,875,295) that were designated as Manila-specific were not shared by one two-isolate Manila-like cluster (isolates 30 and 37), though the cluster matched the Manila family at the other virulence loci. Additionally, since octal coding might not intuitively convey how many spacers are gained or lost with a change from one digit to another, Table 7 presents the numbers of spacers gained or lost by each isolate relative to Manila family reference genome 96121. This comparison of SNPs versus 96121 and spacers gained or lost clearly illustrates that the number of spacers lost by an isolate relative to 96121 fails to predict an isolate or lineage’s closeness to the Manila family.
Table 7

Clustering and spoligotype details of selected Manila and Manila-like isolates.

96121 vs isolate #SNPs vs. 96121Isolate ClusterSpoligotyping Pattern(96121: 677777477413771)Spacers LostSpacersGained
70193Manila Diversity Selection (Deletions)677777477413771--
69210Manila Diversity Selection (Deletions)677777477413771--
89239Manila Diversity Selection6777774774137013-
33239EAI577777777641377113
2278Manila Cluster A677777477413771--
90278Manila Diversity Selection6777774774137511-
21281Manila-like Cluster 16007774774137716-
44293Manila-like Cluster16007774774137716-
84299Manila-like Cluster B6777744774137712-
37437Manila-like Cluster 2 (Pigmented)6777774020037717-
30438Manila-like Cluster 2 (Pigmented)6777774020037717-
7187202–06 No SpolDB4 Match73777737741377123
7287302–06 No SpolDB4 Match73777737741377123
7387302–06 No SpolDB4 Match73777737741377123
51020EAI577777777641377113
81920Diversity Selection77777760736037186
801925Beijing Family (Outgroup)000000000003771260
101964Beijing Family (Outgroup)000000000003771260
421970Diversity Selection703717740003771131

Details of the clusters represented by isolates in Fig 1. Spoligotyping octal elements that differ from the standard Manila family spoligotyping pattern 677777477413771 are marked in bold and underlined, with the numbers of CRISPR spacers gained or lost (versus the standard Manila family pattern) also presented. Isolates are ordered by the total number of genomic SNPs against de novo sequenced representative Manila family isolate 96121. Note that the number of spacers lost relative to the Manila family reference fails to indicate the closeness of the isolate’s relationship to 96121, as determined by the number of SNPs between them.

Details of the clusters represented by isolates in Fig 1. Spoligotyping octal elements that differ from the standard Manila family spoligotyping pattern 677777477413771 are marked in bold and underlined, with the numbers of CRISPR spacers gained or lost (versus the standard Manila family pattern) also presented. Isolates are ordered by the total number of genomic SNPs against de novo sequenced representative Manila family isolate 96121. Note that the number of spacers lost relative to the Manila family reference fails to indicate the closeness of the isolate’s relationship to 96121, as determined by the number of SNPs between them.

Discussion

Lineage-specific SNPs in virulence factor genes were identified for both the Beijing and Manila families. These were defined as non-silent mutations that were shared by all isolates in that family, but could not be found outside of that family in this study.

Lineage-specific Beijing family virulence factor SNPs

This study identified several virulence factor mutations specific to the Beijing family. Two mutations were found in alternative RNA polymerase sigma factor gene sigG, which has been shown to be upregulated during macrophage infection as well as by DNA damage, despite not regulating DNA repair genes. Accordingly, its deletion has been shown to impair survival in a macrophage infection model [24, 25]. Mutations were also found in polyketide synthase genes pks5 and pks7. The contribution of these genes to Mtb virulence has been investigated by multiple studies. For example, disruption of Pks5 showed no difference in cell envelope lipid composition, but resulted in severe growth defects in a mouse infection model. In contrast, disruption of Pks7 produced a strain that was deficient in the production of phthiocerol dimycocerosates (PDIMs), which has been known to attenuate growth during in vivo infection, and in this case attenuated growth in mice infected by respiratory inoculation [18, 26, 27]. In addition, W-Beijing family strains have been shown to produce a polyketide synthase-derived phenolic glycolipid and demonstrated hypervirulence in a murine model, associated with a reduction in production of pro-inflammatory cytokines TNF-α, IL-6, and IL-12 [28]. One mutation was found in phenyloxazoline synthase MbtB, which is involved in iron acquisition, an important virulence factor for growth inside a macrophage, where iron levels can be 1/1000th those outside the leukocyte [18]. The final Beijing family virulence factor found to display a lineage-specific mutation was lipid-transfer protein or keto acyl-CoA thiolase Ltp2, which may be involved in cholesterol transport [18]. The Beijing family’s lineage-specific mutations in other genes of interest included one in antibiotic transporter permease Rv1217c, which has been shown to actively transport the ionophore antibiotic tetronasin across the cell membrane, and one in antibiotic transporter ATP-binding protein Rv1458c, which is similar to other described antibiotic transporters [29]. A mutation was also found in a penicillin-binding protein (Rv1730c). Penicillin is widely regarded as ineffective against Mtb, but the rise of extensively drug-resistant TB has renewed interest in the use of beta-lactam drugs against it [30]. Additionally, studies combining beta lactam drugs with beta lactamase inhibitors have shown promise [31]. Finally, CRISPR type III-a/mtube-associated ramp protein Csm4 showed one mutation. Although the exact role of Csm4 is still being deduced, CRISPR-related changes in the Beijing family are of interest due to its lack of many of the CRISPR spacers from the set of 43 utilized by spoligotyping that are present in many other lineages. The Beijing family has been shown to be split into ancient (atypical) and modern (typical) sublineages on the basis of the presence or absence of one to two IS6110 insertion sequences in the NTF region, with the modern sublineage being shown to display increased virulence relative to the ancestral sublineage [32]. Depending on their sublineage, Beijing family isolates display mutations in two DNA repair genes (mutT4 and mutT2) as well as ogg, which are thought to result in an increased mutation rate, allowing for quicker adaptation to new host pressures or environments [32, 33]. The genetic difference between ancient and modern Beijing strains has been further explored, and while mutations have been identified, their phenotypic results still require investigation [34]. Our present study, however, did not seek to include isolates of the less clinically important ancient (atypical) clade. A previous study identified a set of 81 SNPs that characterized all modern Beijing family strains [35]. That study identified virulence factor SNPs in the mce and vapBC gene families, but our study indicated that those SNPs are not specific to the Beijing family (data not shown). Another study that sought to characterize the modern/typical Beijing lineage identified 51 SNPs that defined the lineage and were not found in a search of 29 non-Beijing strains [36]. Notably, their set of SNPs found regulatory network mutations to be over-represented. As our study only searched for mutations in established virulence factor genes, mutations in regulatory genes that might be responsible for major differences in virulence would not have been captured by this study unless those regulatory genes had already been characterized as virulence factors.

Lineage-specific Manila family virulence factor SNPs

The Manila family exhibited a similar number of lineage-specific virulence factor mutations as the Beijing family. However, we also identified a group of isolates (EAI5 isolate 5 and a cluster with no SpolDB4 match composed of isolates 71–73) that showed virulence factor SNPs matching the Manila family at some loci, but matching an outgroup at others. Those SNPs that were found only in the Manila family and in this small, ancient clade were presented as Manila family characteristic SNPs rather than Manila family specific SNPs. While not 100% specific to the family, these mutations may still be important for understanding the virulence or transmission characteristics of the Manila family. Among the notable virulence factor genes hosting the Manila family’s lineage specific SNPs, one mutation was found in mce1B. One study showed that when the mce1 operon was knocked out, the resulting infection was unable to enter a state of persistent infection in mouse lungs, instead exhibiting hypervirulence and killing the mice more rapidly than wild-type Mtb. From ex vivo infection of murine macrophages, this was proposed to have resulted from the mutant failing to stimulate a Th-1 mediated immune response that would have induced protective granuloma formation [37]. One Manila family-specific mutation was found five bases upstream of the zinc metalloprotease zmp1 start codon, which might place it within the ribosome binding site (RBS). Mycobacterial ribosome binding sites are typically G and A-rich regions of six to eight base pairs in length, located four to seven base pairs upstream of the start codon [38]. Zmp1 is further of interest as it contains a second Manila family specific mutation, and zmp1’s deletion has been shown to result in hypervirulence in a C57BL/6 mouse model [39]. One mutation was found in the gene encoding PknG, which is secreted into the macrophage cytosol upon mycobacterial infection, where it is required for inhibition of phago-lysosomal fusion and mediates persistence under stressful conditions [40, 41]. Another mutation was found in pknD, which is required for invasion of brain endothelia, but not lung epithelia or macrophages [42]. While the Beijing family displayed a family-specific mutation in pks7, the Manila family displayed a family-specific mutation nine bases upstream of the gene, possibly affecting the promotor or ribosome binding site. Finally, the Manila family exhibits a mutation in fadD26, whose disruption results in impaired synthesis of phthiocerol dimycocerosates and attenuation in a mouse model [43]. Interestingly, one potential Mtb vaccine that has now completed a phase I clinical trial, MTBVAC, was produced from a double deletion mutant that removed phoP and fadD26 [44]. Manila family characteristic mutations were found in several additional genes. Nucleoside diphosphate kinase NdkA inactivates GTPase Rac1 in macrophages (which reduces production of reactive oxygen species), and knock down of ndkA resulted in increased susceptibility to intracellular killing and reduced persistence in the lungs of infected mice [45]. Another mutation was also observed for fadD26 (in addition to the family-specific mutation), suggesting that FadD26 may be important in directing the virulence of the Manila family. VirS is a transcriptional regulator controlling the mymA operon, which is induced by infection in macrophages, and may be involved in remodeling Mtb’s cellular envelope under acidic, intracellular conditions [46]. Finally, a mutation was found in mmpL8, disruption of which resulted in attenuation presenting as significantly increased time to death in a mouse aerosol infection model, corresponding to less efficient suppression of cytokines directing a Th1-type immune response [47]. SNPs in genes of interest were likewise split between lineage-specific and lineage-characteristic for the Manila family. The most notable lineage-specific gene was transcriptional regulator whiB5, which influences the expression of 58 genes, and which our study revealed is truncated due to a premature stop codon in the Manila family [48]. Both chronic and progressive infections have been found to be attenuated in a mouse model when whiB5 was deleted. The mutant was also shown to be unable to resume growth after reactivation from chronic infection. However, as previously noted, human infections with Manila family strains do not appear to display any deficiency in reactivation [49]. Furthermore, two mutations were observed upstream of whiB5’s start codon (-27 and -31). These mutations may be located in the -35 promoter sequence, but that is difficult to determine as mycobacteria have little homology in their -35 promoter sequences [38]. Although studies have shown that the -35 region is not essential for promoter function in mycobacteria, manipulation of the region can vary the wild-type promotor activity from as little as 6% to as much as 200% of its original activity [38]. Regardless, it is interesting that while the whiB5–31 mutation was observed only in Manila family isolates, a whiB5–27 mutation was observed in the Manila family, both EAI5 clades, and the ancient three-isolate epidemiological cluster with no SpolDB4 match. Another gene of interest is the oxyR pseudogene. oxyR nucleotide 37 T has been used as a marker for the Manila family and EAI5 lineage [50]. However, this study revealed that oxyR 37 T is not fully exclusive to the Manila family, as the newly identified ancient clade with an EIA5 isolate and a no SpolDB4 match cluster also exhibited the mutation. Finally, a Manila family characteristic mutation was observed in mycobacterial persistence regulator MRPA, which of Mtb’s ~214 transcriptional regulators, was one of seven shown to be upregulated during nonreplicating persistent stage stage 2 (NRP2) [51]. MRPA had been previously shown to be required for entrance into and maintenance of persistent infection [52]. Full genome sequencing provides insight here into the connection of isolates with putative “Manila-like” spoligotypes to the Manila family. A phylogenetic tree determined from those full genome SNP sets has further established their relationships (Fig 1). In general, the number of spoligotyping/CRISPR spacers lost by an isolate relative to the Manila family did not serve as a predictor of the number of total SNP differences between that isolate and reference Manila family isolate 96121 (Table 7). Multiple examples demonstrating this are presented below. Manila family isolate 89, with its three missing spacers relative to the Manila family’s prototypical pattern, displayed the same number of SNPs from 96121 as EAI5 isolate 33 (which lost one spacer and gained three). Furthermore, both of those isolates had fewer SNPs versus 96121 than isolate 2 (chosen to represent a large Manila family epidemiological cluster). The epidemiological cluster represented by isolates 21 and 41 was missing six spacers, but did not display many more SNPs versus 96121. However, other than for EAI5, gain of spacers relative to the Manila family appeared to be a better indication of distance from the Manila family than loss of spacers. EAI5 isolate 5 and a three-isolate no SpolDB4 match epidemiological cluster (isolates 71–73) occupy a branch together, with both gaining the same three spacers, but the one spacer lost by isolate 5 and the two spacers lost by the 71–73 cluster do not overlap. Other isolates whose spoligotypes had no match in SpolDB4 differed from Manila by the largest numbers of SNPs, similar to the Beijing family outgroup isolates. Of those no-match isolates, isolate 8 was selected due to having a large number of spacers gained versus 96121, and it clustered closest to LAM9 isolate 1, while isolate 42, selected to represent a large number of spacers lost, clustered closest to the Beijing family isolates (despite the apparent difference in spoligotypes). The numbers of SNPs between individual isolates and our Manila family reference genome further clarify the relationships of isolates displaying various spoligotyping patterns to the Manila family. Isolates 69 and 70, selected from a prior study (which used deletion-based analysis) in order to maximize our present study’s diversity within the Manila family, expectedly clustered within the other Manila family isolates, and displayed the fewest SNPs versus 96121 (210 and 193 SNPs, respectively). Manila family isolate 89, selected due to losing three of the Manila family’s characteristic spoligotyping spacers, likewise clustered with the rest of the Manila family, while Manila family isolate 90 (which lost only a single spacer) was more distinct, and clustered closer to isolate 84 (which lost two spacers and had no SpolDB4 match). Another un-matched, but seemingly Manila-like spoligotype displayed by epidemiologically clustered isolates 21 and 44 (missing seven spacers) clustered within the Manila family. An additional un-matched, but seemingly Manila-like, epidemiological cluster of isolates 30 and 37 (missing seven spacers) clustered furthest from the Manila family, yet still on a distinct branch from the Manila family’s known outgroups. Despite this, it was the only “Manila-like” group not to match the Manila family in all of the virulence factor mutations that characterized the family in this study, differing in a SNP at transmembrane serine/threonine-protein kinase D pknD and a SNP 9 bases upstream of Rv1661 polyketide synthase pks7, possibly in the promotor. Furthermore, isolates from this cluster were also observed to exhibit pigmentation when grown on Löwenstein–Jensen medium. The EAI5 spoligotype revealed interesting results. We have previously identified in other EAI5 isolates that this spoligotype covers both Principle Genetic Groups 1 and 2 [53]. Thus, this EAI5 spoligotype spans evolutionarily distinct clades, possibly as the result of convergent evolution of CRISPR regions due to phage pressure. Both EAI5 isolates in this study were found to be ancient, despite both phylogenetics and virulence SNPs showing them to be quite distinct. While EAI5 isolate 33 clustered with the Manila family, EAI5 isolate 5 clustered with a three-isolate epidemiological cluster with no SpolDB4 match (isolates 71–73). This sub-clade of ancient isolates was roughly split between matching the Manila family and matching outgroup H37Rv at the Manila family’s lineage specific and characteristic SNP loci (10 to 13). Some notable virulence factor differences for those isolates include losing the Manila family’s truncated transcriptional regulator WhiB5 and one of its two upstream mutations, SNPs in MCE-family protein Mce1B, zinc metalloprotease Zmp1 and its upstream mutation, and one of the two fatty-acid-AMP ligase FadD26 mutations, among others (Table 6). These differences suggest that this ancient sub-clade displays virulence characteristics different from the Manila family.

Conclusions

This study identified distinct sets of non-silent virulence factor SNPs that were specific to the Beijing and Manila families of Mtb. Further study of these mutations could lead to greater understanding of differences in virulence in these two families. Future work investigating the effects of these SNPs in animal or macrophage infection models, which is beyond the scope of this present study, will be required to fully determine their phenotypes. However, future studies should not be limited to investigating only single SNPs. Mtb lacks singular genomic features that account for host preference, and epistatic interactions and coordination of transcriptional regulation must be considered together to account for Mtb’s transmission and virulence [5]. Thus, investigating only single SNPs from a specific family alone (in the absence of the family’s other mutations) may be insufficient to produce a phenotype with altered virulence. Additionally, this study utilized only a small (82-isolate) set of Hawaii-focused Mtb isolates. Future work will further establish the lineage-specificity of these mutations by significantly expanding the number of genomes investigated. Although repositories such as the National Center for Biotechnology Information (NCBI) GenBank already hold a large number of Mtb genomes, the lack of paired spoligotyping data attached to those genomes limits their use in this study. A tool for in silico spoligotyping of Mtb genomes, SpolPred, enables identification of spoligotypes from FASTQ sequencing read files, allowing data from repositories such as the NCBI Sequence Read Archive (SRA) to be investigated [54]. However, while Beijing family sequencing data is abundant, Manila family sequencing data currently remains uncommon.
  47 in total

1.  Mutations in the regulatory network underlie the recent clonal expansion of a dominant subclone of the Mycobacterium tuberculosis Beijing genotype.

Authors:  Anita C Schürch; Kristin Kremer; Robin M Warren; Nguyen V Hung; Yanlin Zhao; Kanglin Wan; Martin J Boeree; Roland J Siezen; Noel H Smith; Dick van Soolingen
Journal:  Infect Genet Evol       Date:  2011-01-28       Impact factor: 3.342

2.  Mycobacterium tuberculosis Rv0198c, a putative matrix metalloprotease is involved in pathogenicity.

Authors:  D G Niranjala Muttucumaru; Debbie A Smith; Elizabeth J McMinn; Valerie Reese; Rhea N Coler; Tanya Parish
Journal:  Tuberculosis (Edinb)       Date:  2011-01-08       Impact factor: 3.131

Review 3.  The complex architecture of mycobacterial promoters.

Authors:  Mae Newton-Foot; Nicolaas C Gey van Pittius
Journal:  Tuberculosis (Edinb)       Date:  2012-09-25       Impact factor: 3.131

4.  Fast gapped-read alignment with Bowtie 2.

Authors:  Ben Langmead; Steven L Salzberg
Journal:  Nat Methods       Date:  2012-03-04       Impact factor: 28.547

5.  Rapid and spontaneous loss of phthiocerol dimycocerosate (PDIM) from Mycobacterium tuberculosis grown in vitro: implications for virulence studies.

Authors:  Pilar Domenech; Michael B Reed
Journal:  Microbiology (Reading)       Date:  2009-08-06       Impact factor: 2.777

6.  Hypervirulent mutant of Mycobacterium tuberculosis resulting from disruption of the mce1 operon.

Authors:  Nobuyuki Shimono; Lisa Morici; Nicola Casali; Sally Cantrell; Ben Sidders; Sabine Ehrt; Lee W Riley
Journal:  Proc Natl Acad Sci U S A       Date:  2003-12-08       Impact factor: 11.205

7.  Virulence attenuation of two Mas-like polyketide synthase mutants of Mycobacterium tuberculosis.

Authors:  Cécile Rousseau; Tatiana D Sirakova; Vinod S Dubey; Yann Bordat; Pappachan E Kolattukudy; Brigitte Gicquel; Mary Jackson
Journal:  Microbiology       Date:  2003-07       Impact factor: 2.777

8.  The role of MmpL8 in sulfatide biogenesis and virulence of Mycobacterium tuberculosis.

Authors:  Pilar Domenech; Michael B Reed; Cynthia S Dowd; Claudia Manca; Gilla Kaplan; Clifton E Barry
Journal:  J Biol Chem       Date:  2004-03-04       Impact factor: 5.157

9.  Predominance of a single genotype of Mycobacterium tuberculosis in countries of east Asia.

Authors:  D van Soolingen; L Qian; P E de Haas; J T Douglas; H Traore; F Portaels; H Z Qing; D Enkhsaikan; P Nymadawa; J D van Embden
Journal:  J Clin Microbiol       Date:  1995-12       Impact factor: 5.948

10.  Mycobacterium tuberculosis nucleoside diphosphate kinase inactivates small GTPases leading to evasion of innate immunity.

Authors:  Jim Sun; Vijender Singh; Alice Lau; Richard W Stokes; Andrés Obregón-Henao; Ian M Orme; Dennis Wong; Yossef Av-Gay; Zakaria Hmama
Journal:  PLoS Pathog       Date:  2013-07-18       Impact factor: 6.823

View more
  6 in total

1.  Development of small-molecule inhibitors of fatty acyl-AMP and fatty acyl-CoA ligases in Mycobacterium tuberculosis.

Authors:  Marzena Baran; Kimberly D Grimes; Paul A Sibbald; Peng Fu; Helena I M Boshoff; Daniel J Wilson; Courtney C Aldrich
Journal:  Eur J Med Chem       Date:  2020-06-13       Impact factor: 6.514

2.  On the Identification of Clinically Relevant Bacterial Amino Acid Changes at the Whole Genome Level Using Auto-PSS-Genome.

Authors:  Hugo López-Fernández; Cristina P Vieira; Pedro Ferreira; Paula Gouveia; Florentino Fdez-Riverola; Miguel Reboiro-Jato; Jorge Vieira
Journal:  Interdiscip Sci       Date:  2021-05-19       Impact factor: 2.233

3.  Virulence of Mycobacterium tuberculosis Clinical Isolates Is Associated With Sputum Pre-treatment Bacterial Load, Lineage, Survival in Macrophages, and Cytokine Response.

Authors:  Trinh T B Tram; Hoang N Nhung; Srinivasan Vijay; Hoang T Hai; Do D A Thu; Vu T N Ha; Tran D Dinh; Philip M Ashton; Nguyen T Hanh; Nguyen H Phu; Guy E Thwaites; Nguyen T T Thuong
Journal:  Front Cell Infect Microbiol       Date:  2018-11-27       Impact factor: 5.293

4.  Genomic sequencing is required for identification of tuberculosis transmission in Hawaii.

Authors:  Kent J Koster; Angela Largen; Jeffrey T Foster; Kevin P Drees; Lishi Qian; Ed Desmond; Xuehua Wan; Shaobin Hou; James T Douglas
Journal:  BMC Infect Dis       Date:  2018-12-03       Impact factor: 3.090

5.  Reporting practices for genomic epidemiology of tuberculosis: a systematic review of the literature using STROME-ID guidelines as a benchmark.

Authors:  Brianna Cheng; Marcel A Behr; Benjamin P Howden; Theodore Cohen; Robyn S Lee
Journal:  Lancet Microbe       Date:  2021-03-02

6.  Connection between two historical tuberculosis outbreak sites in Japan, Honshu, by a new ancestral Mycobacterium tuberculosis L2 sublineage.

Authors:  Christophe Guyeux; Gaetan Senelle; Guislaine Refrégier; Florence Bretelle-Establet; Emmanuelle Cambau; Christophe Sola
Journal:  Epidemiol Infect       Date:  2022-01-19       Impact factor: 2.451

  6 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.