Comparison of the complete genome sequences of four γ-hexachlorocyclohexane-degrading bacterial strains: insights into the evolution of bacteria able to degrade a recalcitrant man-made pesticide.

Michiro Tabata¹, Satoshi Ohhata¹, Yuki Nikawadori¹, Kouhei Kishida¹, Takuya Sato¹, Toru Kawasumi¹, Hiromi Kato¹, Yoshiyuki Ohtsubo¹, Masataka Tsuda¹, Yuji Nagata².

Abstract

γ-Hexachlorocyclohexane (γ-HCH) is a recalcitrant man-made chlorinated pesticide. Here, the complete genome sequences of four γ-HCH-degrading sphingomonad strains, which are most unlikely to have been derived from one ancestral γ-HCH degrader, were compared. Combining this comparison with supporting experimental data, we showed that (i) all four strains carry almost identical linA to linE genes for the conversion of γ-HCH to maleylacetate (designated "specific" lin genes), (ii) considerably different genes are used for the metabolism of maleylacetate in one of the four strains, and (iii) the linKLMN genes for the putative ABC transporter necessary for γ-HCH utilization exhibit structural divergence, which reflects the phylogenetic relationship of their hosts. Replicon organization and the location of the lin genes differ significantly among the four genomes, and most of the specific lin genes are located on multiple sphingomonad-unique plasmids. Copies of IS6100, the most abundant insertion sequence in the four strains, are often located in close proximity to the specific lin genes. Analysis of the footprints of target duplication upon IS6100 transposition, together with the experimental detection of IS6100 transposition, strongly suggested that IS6100 transposition has caused dynamic genome rearrangements and the diversification of lin-flanking regions in the four strains.

Keywords: evolution; genome; mobile genetic elements; sphingomonads; xenobiotics

Year: 2016 PMID: 27581378 PMCID: PMC5144681 DOI: 10.1093/dnares/dsw041

Source DB: PubMed Journal: DNA Res ISSN: 1340-2838 Impact factor: 4.458

1. Introduction

In the early 20th century, rapid development in the chemical industry led to the production and wide use of numerous anthropogenic chemicals. Because they are often recalcitrant in the environment and toxic to humans and ecosystems, they have caused serious environmental problems. Many bacterial strains capable of degrading man-made xenobiotic compounds have been isolated and characterized. Such strains are thought to have evolved to degrade xenobiotics within relatively short periods. However, with the exception of a few speculative examples, the evolutionary processes of these bacterial strains remain largely unknown. γ-Hexachlorocyclohexane (γ-HCH; also known as γ-BHC or lindane) is a completely man-made chlorinated pesticide that has caused serious environmental problems due to its toxicity and long persistence in upland soils. Only 60 years after the first release of γ-HCH into the environment, a number of bacterial strains that aerobically degrade γ-HCH have been isolated from geographically distant locations around the world. An archetypal γ-HCH-degrading strain, Sphingobium japonicum UT26, was isolated from an upland experimental field to which γ-HCH had been applied once a year for 12 years, and its aerobic γ-HCH degradation pathway (Fig. 1) and genome organization have been intensively studied. Recently, many draft genome sequences of other degraders of HCH (including not only γ-HCH but also other HCH isomers) and of related non-HCH-degrading strains were determined, and their comparative analyses have been published. These studies provided important preliminary information on the evolution of HCH degraders, including the involvement of plasmids and insertion sequences (ISs). However, more detailed information (e.g. which plasmids and ISs are involved, and how) remains unclear.

Figure 1

Degradation pathway of γ-HCH in UT26. Compounds: 1, γ-HCH; 2, γ-pentachlorocyclohexene; 3, 1,3,4,6-tetrachloro-1,4-cyclohexadiene; 4, 1,2,4-trichlorobenzene; 5, 2,4,5-trichloro-2,5-cyclohexadiene-1-ol; 6, 2,5-dichlorophenol; 7, 2,5-dichloro-2,5-cyclohexadiene-1,4-diol; 8, 2,5-dichlorohydroquinone; 9, chlorohydroquinone; 10, acylchloride; 11, hydroquinone; 12, γ-hydroxymuconic semialdehyde; 13, maleylacetate; 14, β-ketoadipate; 15, 3-oxoadipyl-CoA; 16, succinyl-CoA; 17, acetyl-CoA. TCA, citrate/tricarboxylic acid cycle. Note that compounds 4 and 6 are dead-end products. Spontaneous, non-enzymatic reactions are indicated by grey arrows. The reaction marked with ‘?’ has not been identified. Compounds in parentheses have not been directly detected.

In this study, to gain further insight into the functional evolution of bacterial genomes, the complete genome sequences of three other γ-HCH degraders, Sphingomonas sp. MM-1, Sphingobium sp. MI1205, and Sphingobium sp. TKS, were determined. On the basis of our comparison of the complete genome sequences of these three strains and UT26, in association with several supporting experimental data, the evolution of γ-HCH-degrading bacterial strains is discussed.

2. Materials and methods

2.1. Bacterial strains, plasmids, and culture conditions

Strains and plasmids used in this study are listed in Supplementary Table S1. All sphingomonad (the collective name of Sphingomonas, Sphingobium, Novosphingobium, Sphingopyxis, and their related genera) strains were cultured at 30°C in 1/3LB medium or 1/10W minimal medium. If needed, antibiotics were added at the following concentrations: tetracycline (Tc) 20 μg/ml, nalidixic acid (Nal) 100 μg/ml, and kanamycin (Km) 50 μg/ml for UT26; Tc 5 μg/ml, Nal 100 μg/ml, gentamycin (Gm) 2 μg/ml, and Km 50 μg/ml for MM-1; Tc 2 μg/ml, Nal 100 μg/ml, and Km 50 μg/ml for TKS; and Tc 2 μg/ml, Nal 100 μg/ml, and Km 50 μg/ml for MI1205. Escherichia coli DH5α for genetic manipulation was grown at 37°C in LB. If needed, antibiotics were added at the following concentrations: Tc 20 μg/ml, Gm 20 μg/ml, and Km 50 μg/ml. The solid media were prepared by the addition of 1.5% agar.

2.2. Construction of plasmids and strains

The 3.6-kb region containing the MM-1 linKbLbMbNb genes was amplified by PCR using the primer set of MM_linKLMN_F and MM_linKLMN_R (Supplementary Table S2), and the amplified fragment was cloned into a broad-host-range vector, pKS13P, under the linA constitutive promoter to generate pKSR1020. The whole region of the linFb gene of strain TKS was amplified by PCR using the primer set of TKS_MAR_F_Hind and TKS_MAR_R_Bam (Supplementary Table S2), digested with BamHI and HindIII, and cloned into the corresponding sites of pBBR1-MCS2 to generate pBLFb. To disrupt the genomic linFb gene in TKS, an internal part of linFb was amplified by PCR using the primer set of TKS_MAR_single_F_Eco and TKS_MAR_single_R_Hind (Supplementary Table S2), digested with EcoRI and HindIII, and cloned into the corresponding sites of pEX18Gm. The resultant plasmid pEDLFb was introduced into TKS by electroporation to select the Gm-resistant transformants. Subsequent PCR analysis of one such transformant (TKSdLFb) confirmed that its genomic linFb gene was disrupted by reciprocal homologous recombination through the single crossover-mediated integration of pEDLFb (Supplementary Fig. S1).

2.3. DNA manipulations and Sanger sequencing

Established methods were employed for the preparation of plasmids and genomic DNAs, their digestion with restriction endonucleases, ligation, and agarose gel electrophoresis, and the transformation of E. coli cells. Electroporation of sphingomonad strains was performed as described previously. PCR for cloning was performed with KOD-Plus DNA polymerase (TOYOBO, Osaka, Japan). The primers used are listed in Supplementary Table S2. Sanger sequencing was performed using an ABI PRISM 3130xl sequencer and the ABI Prism Big Dye Terminator v3.1 Kit (Applied Biosystems).

2.4. Genome sequencing and annotation analyses

Fragment reads of genomic DNA of MM-1, MI1205, and TKS were obtained by the Roche 454 and Illumina HiSeq 2000 sequencing systems, and were assembled using the Newbler programme (Roche). The GenoFinisher and AceFileViewer programmes were used to finish the sequences. More detailed information on these procedures has been published elsewhere. Sequencing gap regions were amplified by PCR using KOD FX (TOYOBO) or Ex Taq (TaKaRa) with the total DNA of the respective strains as templates, and the resultant DNA fragments were sequenced using the primers for PCR amplification. Pulsed-field gel electrophoresis was also performed as described previously to support the whole genome sequencing data of MI1205 and TKS (data not shown). The annotation data of the complete genome sequences were obtained by PGAAP (Prokaryotic Genome Annotation Pipeline, http://www.ncbi.nlm.nih.gov/genome/annotation_prok/; 22 August 2016, date last accessed), and were curated with the dedicated software bundled in the GenomeMatcher programme (http://www.ige.tohoku.ac.jp/joho/gmProject/gmhome.html; 22 August 2016, date last accessed) as well as by consulting the MiGAP auto-annotation system (Microbial Genome Annotation Pipeline, http://www.migap.org/; 22 August 2016, date last accessed).

2.5. Computational analyses of sequence data

The nucleotide and protein sequences were analysed using the Genetyx programme version 13–18 (Genetyx Corp., Tokyo, Japan). Homology searches were performed using the BLAST programmes available at the National Center for Biotechnology Information website (http://www.ncbi.nlm.nih.gov/BLAST/; 22 August 2016, date last accessed) with the default parameters. The Venn diagram was drawn from the result of a BLASTClust analysis (ftp://ftp.ncbi.nlm.nih.gov/blast/executables/; 22 August 2016, date last accessed) of all predicted open reading frames (ORFs) of the four γ-HCH degraders with the parameter set ‘-p T -L .6 -b T -S 60 -a 24’. Comparative analyses of DNA sequences were performed by GenomeMatcher using BLASTN with the parameter set ‘-F F -W 21 -e 0.01’. Conserved motifs and repeat sequences were searched for with GenomeMatcher. All neighbour-joining phylogenetic trees shown in this study were constructed using the MAFFT programme (http://mafft.cbrc.jp/alignment/software/; 22 August 2016, date last accessed) and visualized with the NJplot software (http://pbil.univ-lyon1.fr/software/njplot.html; 22 August 2016, date last accessed). Transposable elements were identified by analyses of regions containing putative transposase genes, i.e. mutual BLASTN analysis with the parameter set ‘-F F -W 21 -e 0.01’ and searches for inverted and direct repeats using the Dot Match mode of GenomeMatcher. To detect candidate ORFs relevant to the metabolism of aromatic compounds, all ORFs obtained from each genome were BLASTP-searched against an in-house database of enzymes for the degradation of aromatic compounds with the parameters ‘-e 1e-5 -b 5 -F F’, applying thresholds of ≥50% identity and ≥50% coverage of both the query and the reference sequence.
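The thresholds used for the aromatic-compound ORF search can be illustrated with a small filter over BLASTP hits. This is only a sketch under stated assumptions: the `Hit` record, its field names, and the definition of coverage as alignment length divided by sequence length are hypothetical choices made for illustration, and the paper does not say how coverage was computed or how the hits were post-processed.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One BLASTP hit (hypothetical record layout, not the authors' format)."""
    query_id: str
    subject_id: str
    percent_identity: float
    alignment_length: int
    query_length: int
    subject_length: int
    evalue: float


def passes_thresholds(hit, min_identity=50.0, min_coverage=0.5, max_evalue=1e-5):
    """Apply the identity, coverage, and E-value cut-offs quoted in the text.

    Assumption: coverage = alignment length / full sequence length.
    """
    query_cov = hit.alignment_length / hit.query_length
    subject_cov = hit.alignment_length / hit.subject_length
    return (
        hit.evalue <= max_evalue
        and hit.percent_identity >= min_identity
        and query_cov >= min_coverage
        and subject_cov >= min_coverage
    )


def candidate_orfs(hits):
    """Return the sorted IDs of query ORFs with at least one passing hit."""
    return sorted({h.query_id for h in hits if passes_thresholds(h)})


# Made-up hits, for illustration only.
example_hits = [
    Hit("orf1", "enzA", 62.0, 300, 350, 320, 1e-40),  # passes all cut-offs
    Hit("orf2", "enzB", 45.0, 300, 350, 320, 1e-30),  # identity too low
    Hit("orf3", "enzC", 70.0, 100, 400, 150, 1e-10),  # query coverage too low
]
print(candidate_orfs(example_hits))
```

Requiring coverage on both the query and the reference side, as the text does, avoids counting an ORF that merely shares one short domain with a much longer catabolic enzyme.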

2.6. Entrapment of transposable elements

pGEN500, an entrapment plasmid vector of transposable elements, was introduced into UT26, MM-1, TKS, and MI1205 by electroporation or mating using E. coli S17-1, and the cells carrying pGEN500 were selected on the 1/3LB agar plate containing Tc. The pGEN500-containing strains were thereafter plated on 1/3LB agar containing Tc and 10% sucrose (w/v) at 30°C. The colonies formed on the plates were analysed for their resident pGEN500 derivatives. Such derivatives with enlarged sizes were postulated to be formed by the insertion of endogenous transposable elements in the sacB gene on pGEN500. The insertion event in each derivative was investigated by PCR using the multiple pairs of primers listed in Supplementary Table S2, and the insert was sequenced by the Sanger method.

3. Results and discussion

3.1. Complete genome sequences of three γ-HCH-degrading sphingomonad strains

Most of the aerobic γ-HCH-degrading bacterial strains that have been critically analysed at the genetic level are sphingomonads, which belong to the Alphaproteobacteria. We have previously published a detailed analysis of the UT26 genome. In this study, the complete genome sequences of three other γ-HCH-degrading strains, Sphingomonas sp. MM-1, Sphingobium sp. MI1205, and Sphingobium sp. TKS, were determined. The basic genome organizations of the four γ-HCH-degrading strains, which were isolated from different geographical areas, are summarized in Table 1. Comparative analysis of the 16S rRNA genes indicated that these four γ-HCH-degrading strains are phylogenetically diverse among related sphingomonad strains (Fig. 2). All predicted ORFs of these four strains (19,312) were clustered into 10,325 ORF clusters. Figure 3 summarizes the result as a Venn diagram that shows the numbers of shared and unique ORF clusters among the four strains; each strain has 1,190–2,346 unique ORF clusters, but the four strains share only 1,288. This observation supports the phylogenetic divergence of the four γ-HCH degraders. These results also strongly suggest that the four degraders independently acquired γ-HCH-degradation ability, and thus it is unlikely that the four strains have been derived from one ancestral γ-HCH degrader.
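The shared and unique counts in a Venn diagram of this kind follow directly from which strains contribute ORFs to each cluster. The sketch below shows that bookkeeping on toy data; the function names and the example clusters are hypothetical, and it does not reproduce the BLASTClust run or the actual cluster memberships.

```python
from collections import Counter


def venn_counts(clusters):
    """Count clusters per exact strain combination.

    `clusters` is an iterable of sets, each holding the names of the strains
    that contribute at least one ORF to that cluster.
    """
    return Counter(frozenset(c) for c in clusters if c)


def core_and_unique(clusters, strains):
    """Return (number of clusters shared by all strains, unique counts per strain)."""
    counts = venn_counts(clusters)
    core = counts[frozenset(strains)]
    unique = {s: counts[frozenset([s])] for s in strains}
    return core, unique


STRAINS = ["UT26", "MM-1", "TKS", "MI1205"]

# Toy cluster memberships, for illustration only.
toy_clusters = [
    {"UT26", "MM-1", "TKS", "MI1205"},
    {"UT26", "MM-1", "TKS", "MI1205"},
    {"UT26"},
    {"UT26", "TKS"},
    {"MM-1"},
]

core, unique = core_and_unique(toy_clusters, STRAINS)
print(core, unique)
```

Keying the counter on the exact strain combination gives every region of the four-set diagram at once, so the core and strain-specific regions are simply two lookups.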

Table 1.

Genome organization of four γ-HCH-degrading sphingomonad strains

Strain name	Isolated site	Source type	Replicon	Length (bp)	No. of rrn operons	No. of ORFs	No. of IS6100 copies	Acc no.	Replicon type^a	lin genes^b	Reference for strain
S. japonicum UT26	Tokyo, Japan	Soil artificially polluted with γ-HCH	Chromosome 1	3,514,822	1	3,529	5	AP010803	Chr	linA, linB, linC, linKLMN	18
			Chromosome 2	681,892	2	589	2	AP010804	UT26_Chr 2	linF, linGHIJ, linEb
			pCHQ1	190,974	0	224	4	AP010805	pCHQ1	linRED
			pUT1	31,776	0	44	2	AP010806	pUT1	—
			pUT2	5,398	0	8	0	AP010807	pUT2	—

			total	4,424,862	3	4,394	13

Sphingomonas sp. MM-1	Lucknow, India	Soil polluted with HCH isomers	Chromosome	4,054,833	2	3,801	0	CP004036	Chr	linKbLbMbNb	22
			pISP0	275,840	0	251	1	CP004037	pCHQ1	linF, linGHIJ
			pISP1^c	172,140	0	174	7	CP004038	UT26_Chr 2/pISP4	linA, linC, linF'
			pISP2	53,841	0	52	2	CP004039	pUT1	—
			pISP3	43,776	0	44	1	CP004040	pISP3	linRED
			pISP4	33,183	0	39	4	CP004041	pISP4	linB, linC, linF'

			Total	4,633,613	2	4,361	15

Sphingobium sp. TKS	Kyushu, Japan	Sediment polluted with HCH isomers	Chromosome 1^c	4,249,857	1	4,172	7	CP005083	Chr/UT26_Chr 2	linB, linC, linF', linFb, linKLMN	This study
			Chromosome 2	989,120	2	843	0	CP005084	UT26_Chr 2	linGHIJ_homologue
			pTK1^c	520,614	0	470	6	CP005085	UT26_Chr 2/pCHQ1	—
			pTK2	195,308	0	182	1	CP005086	pTK2	—
			pTK3^c	87,635	0	92	8	CP005087	pISP4/pTK3_type 1/pTK3_type 2	linB, linC, linF'
			pTK4^c	75,938	0	86	6	CP005088	pUT1/pISP4	linA, linC
			pTK5	53,908	0	76	0	CP005089	pLB1	—
			pTK6	34,300	0	35	1	CP005090	pISP3	linRED
			pTK7	9,585	0	12	0	CP005091	pTK7	—
			pTK8	7,223	0	11	0	CP005092	pTK8	—
			pTK9	5,391	0	8	0	CP005093	pUT2	—

			total	6,228,879	3	5,987	29

Sphingobium sp. MI1205	Miyagi, Japan	Soil polluted with HCH isomers	Chromosome 1	3,351,250	1	3,285	0	CP005188	Chr	linKLMN	24
			Chromosome 2	567,154	1	516	0	CP005189	UT26_Chr 2	—
			pMI1	292,135	0	299	7	CP005190	UT26_Chr 2	linB, linC, linRED, linEb, linF, linF'', linGHIJ
			pMI2^c	287,488	0	323	7	CP005191	pUT1/UT26_Chr 2	linA, linRD
			pMI3^c	88,374	0	102	7	CP005192	pLB1/pISP4	linB, linC, linF''
			pMI4^c	32,974	0	45	3	CP005193	pISP3/pISP4	linRED

			total	4,619,375	2	4,570	24

^a See Table 3.

^b linF′ and linF″ are probably pseudogenes (see Supplementary Fig. S2).

^c Replicons having more than one rep gene.
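As a quick consistency check on Table 1, the per-replicon values can be summed and compared with the "total" rows. The sketch below does this for UT26 and for the ORF totals of the four strains; all numbers are copied from Table 1, while the variable and function names are illustrative choices.

```python
# Replicon data for S. japonicum UT26, copied from Table 1:
# name -> (length in bp, number of ORFs, number of IS6100 copies)
UT26_REPLICONS = {
    "Chromosome 1": (3_514_822, 3_529, 5),
    "Chromosome 2": (681_892, 589, 2),
    "pCHQ1": (190_974, 224, 4),
    "pUT1": (31_776, 44, 2),
    "pUT2": (5_398, 8, 0),
}

# Per-strain ORF totals, copied from the "total" rows of Table 1.
ORF_TOTALS = {"UT26": 4_394, "MM-1": 4_361, "TKS": 5_987, "MI1205": 4_570}


def column_totals(replicons):
    """Sum length, ORF count, and IS6100 copy number over all replicons."""
    lengths, orfs, is_copies = zip(*replicons.values())
    return sum(lengths), sum(orfs), sum(is_copies)


ut26_totals = column_totals(UT26_REPLICONS)
all_orfs = sum(ORF_TOTALS.values())

print(ut26_totals)  # genome size (bp), ORFs, IS6100 copies for UT26
print(all_orfs)     # ORFs across the four strains
```

The summed ORF count across the four strains matches the 19,312 ORFs that were clustered for the Venn diagram.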

Figure 2

Phylogenetic tree of 16S rRNA genes of sphingomonad strains. A neighbour-joining phylogenetic tree was constructed from the conserved sites in the 16S rRNA genes of 13 sphingomonad strains: S. japonicum UT26S (UT26_1, SJA_C1-r0010; UT26_2, SJA_C2-r0010; UT26_3, SJA_C2-r0040), Sphingobium indicum B90A (B90A, NR_042943), Sphingobium francense Sp+ (Sp+, NR_042944), Sphingobium sp. TKS (TKS_1, Chr1_62351_63846; TKS_2, Chr2_117006_118503; and TKS_3, Chr2_376042_377539_c), Sphingobium chlorophenolicum L-1 (L-1_1, Sphch_R0043; L-2_2, Sphch_R0058; L-1_3, Sphch_R0067), Sphingomonas sp. SKA58 (SKA58_1, SKA58_r00366; SKA58_2, SKA58_r18278), Sphingobium sp. MI1205 (MI1205_1, Chr1_64638_66133; MI1205_2, Chr2_561355_562850_c), Sphingobium sp. SYK-6 (SYK6_1, SLG_r0030; SYK6_2, SLG_r0060), Sphingomonas wittichii RW1 (RW1_1, Swit_R0031; RW1_2, Swit_R0040), Sphingomonas sp. MM-1 (MM-1_1, Chr_1791835_1793331_c; MM-1_2, Chr_2084177_2085673_c), Sphingopyxis alaskensis RB2256 (RB2256, Sala_R0048), Novosphingobium sp. PP1Y (PPY_1, PP1Y_AR03; PPY_2, PP1Y_AR23; PPY_3, PP1Y_AR65), and N. aromaticivorans DSM 12444 (DSM_1, Saro_R0065; DSM_2, Saro_R0059; DSM_3, Saro_R0053). The 16S rRNA gene (rrsE: gene ID 7437018) of E. coli str. K-12 substr. W3110 (E. coli) was used as an outgroup sequence. Bootstrap values calculated from 1,000 resamplings using neighbour-joining are shown at the respective nodes. Line lengths reflect relative evolutionary distances among the sequences. Sphingomonas sp. SKA58 should be Sphingobium sp. SKA58 on the basis of comprehensive 16S rRNA gene analysis. However, we used ‘Sphingomonas’ for the strain, in accordance with the database, in order to avoid confusion. γ-HCH degraders are shown in bold.

Figure 3

Venn diagram showing the number of shared and unique ORF clusters of four γ-HCH-degrading sphingomonad strains. Total ORFs (19,312) of the four strains were clustered into 10,325 ORF clusters by BLASTClust analysis. Numbers of total ORF clusters of these four strains are shown in parentheses under the strain names.

Table 3.

Classification of sphingomonad plasmids and plasmid-type replicons

Replicon^a	Size (bp)	Host	Host feature	RepA location	RepA length (A.A.)	RepA identity (%) to representative^a	Type	Gene cluster for conjugation	Genes for degradation	Accession number	Reference
UT26_Chr2	681,892	S. japonicum UT26	γ-HCH degradation	1_1203	400	—	repABC		linF, linGHIJ	AP010804	18
L-1_Chr2	1,368,670	Sphingobium chlorophenolicum L-1	PCP degradation	321340_322506	388	89 (359/400)	repABC		pcpBDR, pcpEMAC	CP002799	11
TKS_Chr2	989,120	Sphingobium sp. TKS	γ-HCH degradation	1_1206	401	89 (360/401)	repABC		linGHIJ_homologue	CP005084	26
MI_Chr2	567,154	Sphingobium sp. MI1205	γ-HCH degradation	1_1209	402	83 (338/405)	repABC			CP005189	25
pNL2	487,268	Novosphingobium aromaticivorans DSM 12444	aromatic compounds degradation	209439_210644_c	401	70 (284/401)	repABC			CP000677	unpublished
Lpl	192,103	Novosphingobium sp. PP1Y	aromatic compounds degradation	88922_90199	425	67 (288/425)	repABC			FR856860	80
pISP1^b	172,140	Sphingomonas sp. MM-1	γ-HCH degradation	114750_115880_c	376	55 (201/363)	repABC	F	linA, linC, linF'	CP004038	23
TKS_Chr1^b	4,249,857	Sphingobium sp. TKS	γ-HCH degradation	252492_253622_c	376	55 (200/363)^d	repABC	F	linB, linC, linF', linFb, linKLMN	CP005083	26
pMI2^b	287,488	Sphingobium sp. MI1205	γ-HCH degradation	183355_184485	376	55 (200/363)^d	repABC	F	linA, linRD	CP005191	25
pMI1	292,135	Sphingobium sp. MI1205	γ-HCH degradation	1_1224	407	42 (174/409)	repABC	F	linB, linC, linRED, linEb, linF, linF'', linGHIJ	CP005190	25
pTK1^b	520,614	Sphingobium sp. TKS	γ-HCH degradation	312340_313539	399	42 (171/401)	repABC	Ti, F		CP005085	26

pCHQ1	190,974	S. japonicum UT26	γ-HCH degradation	1_1164	387	—	repABC	Ti, F	linRED	AP010805	18
pTK1^b	520,614	Sphingobium sp. TKS	γ-HCH degradation	1_1164	387	identical	repABC	Ti, F		CP005085	26
pSLGP	148,801	Sphingobium sp. SYK-6	lignin degradation	1_1164	387	identical	repABC	Ti		AP012223	81
pSPHCH01	123,733	Sphingobium chlorophenolicum L-1	PCP degradation	47083_48171_c	362^c	identical	repABC	Ti		CP002800	11
pISP0	275,840	Sphingomonas sp. MM-1	γ-HCH degradation	4081_5244	387	98 (381/387)	repABC	Ti	linF, linGHIJ	CP004037	23

pUT1	31,776	S. japonicum UT26	γ-HCH degradation	1_1104	367	—	iteron			AP010806	18
pMI2^b	287,488	Sphingobium sp. MI1205	γ-HCH degradation	1_1104	367	identical	iteron	F	linA, linRD	CP005191	25
pISP2	53,841	Sphingomonas sp. MM-1	γ-HCH degradation	1_1104	367	identical	iteron			CP004039	23
pTK4^b	75,938	Sphingobium sp. TKS	γ-HCH degradation	1_1104	367	identical	iteron		linA, linC	CP005088	26

pISP3	43,776	Sphingomonas sp. MM-1	γ-HCH degradation	37217_38329	370	—	iteron		linRED	CP004040	23
pMI4^b	32,974	Sphingobium sp. MI1205	γ-HCH degradation	1_1113	370	identical	iteron		linRED	CP005193	25
pTK6	34,300	Sphingobium sp. TKS	γ-HCH degradation	1_1113	370	identical	iteron		linRED	CP005090	26

pTK3_1^b	87,635	Sphingobium sp. TKS	γ-HCH degradation	1_960	319	—	iteron		linB, linC, linF'	CP005087	26

pTK3_2^b	87,635	Sphingobium sp. TKS	γ-HCH degradation	52874_53821_c	315	—	iteron		linB, linC, linF'	CP005087	26

pISP4	33,183	Sphingomonas sp. MM-1	γ-HCH degradation	101_1012	303	—	iteron		linB, linC, linF'	CP004041	23
pISP1^b	172,140	Sphingomonas sp. MM-1	γ-HCH degradation	158477_159388_c	303	identical	iteron	F	linA, linC, linF'	CP004038	23
pTK3^b	87,635	Sphingobium sp. TKS	γ-HCH degradation	19263_20174_c	303	identical	iteron		linB, linC, linF'	CP005087	26
pTK4^b	75,938	Sphingobium sp. TKS	γ-HCH degradation	56463_57374	303	identical	iteron		linA, linC	CP005088	26
pMI3^b	88,374	Sphingobium sp. MI1205	γ-HCH degradation	30095_31006_c	303	identical	iteron	Ti	linB, linC, linF''	CP005192	25
pMI4^b	32,974	Sphingobium sp. MI1205	γ-HCH degradation	18420_19331	303	identical	iteron		linRED	CP005193	25

pTK2	195,308	Sphingobium sp. TKS	γ-HCH degradation	1_1122	373	—	iteron	Ti		CP005086	26

pTK7	9,585	Sphingobium sp. TKS	γ-HCH degradation	1_1107	368	—	iteron			CP005091	26

pTK8	7,223	Sphingobium sp. TKS	γ-HCH degradation	1_1032	343	—	iteron			CP005092	26

pNL1	184,462	Novosphingobium aromaticivorans DSM 12444	aromatic compounds degradation	86362_87666	434	—	iteron			CP000676	unpublished
pCAR3	254,797	Novosphingobium sp. KA1	carbazole degradation	200594_201898_c	434	91 (398/433)	iteron			AB270530	82
Mpl	1,161,602	Novosphingobium sp. PP1Y	aromatic compounds degradation	513635_514939_c	434	85 (372/434)	iteron			FR856861	80
pSWIT02	222,757	Sphingomonas wittichii RW1	dioxin degradation	63589_64893	434	82 (356/429)	iteron			CP000701	83

pLB1	65,998	unidentified soil bacterium (S. japonicum UT26)	γ-HCH degradation	1_783	260	—		Ti	linB	AB244976	74
pMI3^b	88,374	Sphingobium sp. MI1205	γ-HCH degradation	1_783	260	identical		Ti	linB, linC, linF''	CP005192	25
pLA2	62,341	Novosphingobium pentaromativorans US6-1	benzo(a)pyrene degradation	40543_41325	260	98 (256/260)		Ti		AGFM01000123	84
pTK5	53,908	Sphingobium sp. TKS	γ-HCH degradation	1_783	260	97 (253/260)		Ti		CP005089	26

pUT2	5,398	S. japonicum UT26	γ-HCH degradation	1_654	217	—	iteron			AP010807	18
pTK9	5,391	Sphingobium sp. TKS	γ-HCH degradation	1_654	217	identical	iteron			CP005093	26

^a The representative replicon of each type is indicated in bold in the first column.

^b Replicons having more than one rep gene.

^c The start codon is annotated differently for the same DNA region.

^d Identical to each other.

In addition to the genes for γ-HCH degradation (see below), putative genes for the degradation of various aromatic compounds (toluene/phenol, chlorophenol, anthranilate, and homogentisate) reside in the UT26 genome. These genes constitute four clusters, one for each compound, and each cluster contains all the genes necessary to convert its compound to metabolites of the central metabolic pathway, strongly suggesting that UT26 is able to utilize these compounds. Similarly, several putative genes for the degradation of aromatic compounds were found in the MM-1, MI1205, and TKS genomes (Supplementary Table S3). The potential of the four γ-HCH degraders for the degradation of aromatic compounds was estimated more comprehensively by a BLASTP search of all their ORFs against our previously constructed in-house database of enzymes for the degradation of aromatic compounds, and the results are summarized in Supplementary Table S4. The numbers of ORFs potentially involved in the degradation of aromatic compounds in the four strains (62, 46, 27, and 25 for TKS, UT26, MI1205, and MM-1, respectively) are much smaller than those in the versatile recalcitrant-pollutant degraders Cupriavidus necator JMP134 and Burkholderia xenovorans LB400 (149 and 135, respectively). In particular, the numbers in UT26, MI1205, and MM-1 are even smaller than those in the typical metabolically versatile soil bacterial strains Pseudomonas putida KT2440 and Burkholderia multivorans ATCC 17616 (73 and 62, respectively). These results indicate that our sphingomonad strains are ‘specialists’ for γ-HCH degradation rather than ‘generalists’ for the degradation of many recalcitrant compounds.

3.2. The lin genes for γ-HCH utilization

UT26 converts γ-HCH to β-ketoadipate via reactions catalysed by dehydrochlorinase (LinA), haloalkane dehalogenase (LinB), dehydrogenase (LinC), reductive dechlorinase (LinD), ring-cleavage dioxygenase (LinE), and maleylacetate reductase (MAR) (LinF); β-ketoadipate is thereafter converted to succinyl-CoA and acetyl-CoA by succinyl-CoA:3-oxoadipate CoA transferase (LinGH) and β-ketoadipyl CoA thiolase (LinJ), respectively (Fig. 1). In addition to the genes for these catabolic enzymes and their regulatory genes (linR for linDE and linI for linGHJ), the linKLMN genes encoding a putative ABC-transporter system are necessary for γ-HCH utilization in UT26. The linA, linB, linC, and linF genes, and the linRED, linGHIJ, and linKLMN clusters are dispersed on the UT26 genome. Since the β-ketoadipate pathway is often used by environmental bacterial strains, it is the lin genes for the conversion of γ-HCH to β-ketoadipate (linA to linF) that are peculiar to the γ-HCH-degrading pathway. In particular, the linA gene is unique because it does not show significant similarity to any sequences in the databases except for the almost identical (>90% identity) linA genes from other bacterial strains and metagenomes. The MM-1, MI1205, and TKS genomes carry linA, linB, and linC genes and a linRED cluster that are almost identical (>98% identity at the DNA level) to those of UT26 (Table 2). MM-1 and MI1205 additionally carry a linF gene and a linGHIJ cluster that are almost identical (>98% identity at the DNA level) to those in UT26, strongly suggesting that γ-HCH is degraded in these two strains by the same pathway as in UT26 (Fig. 1). TKS, in contrast, lacks the UT26-type linF gene and linGHIJ cluster for maleylacetate metabolism (Fig. 1). Although two copies of a truncated version of linF (named linF′) (Supplementary Fig. S2) are present in the TKS genome, linF′ was assumed not to encode a functional MAR, since it lacks more than half of the intact linF gene (Supplementary Fig. S2).
TKS instead carries another putative MAR gene, designated linFb, on Chr1. Although LinFb showed only 49% identity to LinF of UT26 (Table 2), it showed much higher similarity to other known MAR proteins, such as TfdF from Bordetella petrii, Achromobacter denitrificans, Burkholderia sp. M701 (Acc no. YP_008864525), and Comamonas testosteroni (83, 78, 78, and 71%, respectively). Interestingly, Sphingobium sp. HDIPO4, a recently isolated HCH degrader, has the identical linFb gene, although its start codon is annotated at a position different from that in TKS. To clarify the function of LinFb, its gene in TKS was disrupted (Supplementary Fig. S1). The resultant strain did not grow on a minimal agar plate supplemented with γ-HCH as the sole source of carbon and energy, and this growth defect was reversed by the supply of the intact linFb gene (Supplementary Fig. S3). In addition, the γ-HCH utilization defect of UT1023d, a linF mutant of UT26, was reversed by the supply of the linFb gene (data not shown). These results clearly demonstrated that linFb encodes a MAR that is functional for γ-HCH utilization. Although the UT26-type linGHIJ cluster was not found in the TKS genome, many homologues of linG, linH, and linJ were found (Supplementary Table S5). Among these homologues, one set of linGH homologues is located just downstream of linFb on Chr1 of TKS (Supplementary Fig. S4A), and only one set of linGHIJ homologues exists as a cluster, on Chr2 of TKS (Supplementary Fig. S4B). Some of these homologues may be functional for β-ketoadipate metabolism, although experimental confirmation is necessary. These results strongly suggested that γ-HCH is also degraded in TKS by the same pathway as in UT26 (Fig. 1).

Table 2.

lin genes of four γ-HCH-degrading sphingomonad strains

Gene^b	Function	UT26 A.A. residues	UT26 location^a	MM-1 A.A. residues	MM-1 location^a	MM-1 A.A. identity (%) to UT26	MM-1 nucleotide identity (%) to UT26	TKS A.A. residues	TKS location^a	TKS A.A. identity (%) to UT26	TKS nucleotide identity (%) to UT26	MI1205 A.A. residues	MI1205 location^a	MI1205 A.A. identity (%) to UT26	MI1205 nucleotide identity (%) to UT26
linA	Dehydrochlorinase	156	Chr1_1860686-1861156_c	156	pISP1_13547_14017	98 (153/156)	98 (466/471)	156	pTK4_18583_19053	identical		156	pMI2_260065_260535_c	identical

linB	Halidohydrolase	296	Chr1_1966541-1967431	296	pISP4_18957_19847_c	98 (293/296)	99 (883/891)	296	Chr1_230419_231309	97 (290/296)	99 (883/891)	296	pMI1_179020_179910_c	98 (292/296)	99 (886/891)
								296	pTK3_34952_35842	97 (290/296)	99 (883/891)	296	pMI3_39099_39989	97 (289/296)	99 (883/891)
												296	pMI3_41282_42172	97 (289/296)	99 (883/891)

linC	Dehydrogenase	250	Chr1_566609-567361_c	250	pISP1_147367_148119	99 (249/250)	99 (752/753)	250	Chr1_469607_470359_c	99 (249/250)	99 (752/753)	250	pMI1_169764_170516_c	99 (249/250)	99 (752/753)
				250	pISP4_11370_12122	identical		250	pTK3_74827_75579	99 (249/250)	99 (752/753)	250	pMI3_15466_16218	99 (249/250)	99 (752/753)
								250	pTK4_21888_22640	99 (249/250)	99 (752/753)

linD	Reductive dechlorinase	346	pCHQ1_110947-111987_c	346	pISP3_15908_16948	identical		346	pTK6_21982_23022	identical		346	pMI1_148490_149530_c	identical
												346	pMI2_214796_215836	identical
												346	pMI4_28637_29677	identical

linE	Ring-cleavage dioxygenase	321	pCHQ1_114235-115200_c	321	pISP3_12695_13660	identical		321	pTK6_18769_19734	identical		321	pMI1_151777_152742_c	identical
												321	pMI4_25425_26390	identical	99 (965/966)
linEb^c		320	Chr2_564928_565890									320	pMI1_141726_142688_c	99 (319/320)	99 (961/963)

linR	LysR-family transcriptional regulator	303	pCHQ1_115332-116243	303	pISP3_11652_12563_c	99 (302/303)	99 (911/912)	303	pTK6_17726_18637_c	identical		303	pMI1_152874_153785	identical
												303	pMI2_210541_211452_c	identical
												303	pMI4_24382_25293_c	identical

linF	Maleylacetate reductase	352	Chr2_562332-563390_c	352	pISP0_110260_111318_c	identical	98 (1046/1059)					352	pMI1_144226_145284	identical	99 (1049/1059)
linF'				180	pISP1_150100_150642	98 (176/178)	99 (526/531)	180	Chr1_466905_467447_c	98 (176/178)	99 (526/531)
				180	pISP4_8847_9389_c	98 (176/178)	99 (526/531)	180	pTK3_77560_78102	98 (176/178)	99 (526/531)
linF''												127	pMI1_167400_167783_c	100 (122/122)	98 (366/371)
												127	pMI3_18199_18582	100 (122/122)	98 (366/371)
linFb								357	Chr1_438688_439761_c	49 (174/350)

linG	Acyl-CoA transferase, alpha subunit	215	Chr2_603108-603755	215	pISP0_152221_152868	identical	99 (647/648)					239	pMI1_102652_103371_c	100 (215/215)	99 (647/648)

linH	Acyl-CoA transferase, beta subunit	212	Chr2_603755-604393	212	pISP0_152868_153506	identical	99 (637/639)					212	pMI1_102014_102652_c	identical	99 (637/639)

linI	IclR-family transcriptional regulator	267	Chr2_602168-602971_c	267	pISP0_151281_152084_c	99 (265/267)	99 (798/804)					265	pMI1_103442_104239	99 (263/265)	99 (798/804)

linJ	Thiolase	403	Chr2_600921-602132_c	403	pISP0_150034_151245_c	identical	99 (1202/1212)					401	pMI1_104281_105486	99 (400/401)	99 (1202/1212)

linK	Putative ABC transporter system, inner membrane protein	376	Chr1_19347-20477					316	Chr1_40913_41863_c	95 (301/316)	88 (1004/1131)	369	Chr1_40248_41357_c	86 (320/369)	87 (739/840)
linKb	Putative ABC transporter system, inner membrane protein			366	Chr_2358426_2359526	65 (239/364)	83 (526/627)

linL	Putative ABC transporter system, ATPase	282	Chr1_20477-21325					282	Chr1_40065_40913_c	95 (268/282)	91 (721/786)	290	Chr1_39376_40248_c	90 (252/280)	85 (627/734)
linLb	Putative ABC transporter system, ATPase			268	Chr_2359526_2360332	75 (194/258)	83 (141/168)

linM	Putative ABC transporter system, periplasmic protein	320	Chr1_21329-22291					320	Chr1_39099_40061_c	98 (314/320)	93 (899/963)	320	Chr1_38410_39372_c	93 (297/319)	86 (827/958)
linMb	Putative ABC transporter system, periplasmic protein			317	Chr_2360339_2361292	63 (201/319)

linN	Putative ABC transporter system, lipoprotein	202	Chr1_22299-22907					204	Chr1_38477_39091_c	93 (190/204)	88 (538/611)	203	Chr1_37784_38395_c	81 (168/205)	82 (384/464)
linNb	Putative ABC transporter system, lipoprotein			193	Chr_2361298_2361879	42 (80/190)

a _c, encoded on the complementary strand.

b linF′ and linF″ are probably pseudogenes.

c linEb is a homologue of PcpA (96% identity) and is partially involved in γ-HCH degradation in UT26.

lin genes of four γ-HCH-degrading sphingomonad strains

The linKLMNUT26 homologues have been found in various bacterial strains, and were also found in the MM-1, MI1205, and TKS genomes (Table 2). However, their similarities to linKLMNUT26 are lower (<93% identity at the DNA level; Table 2) than those of the other lin genes mentioned earlier, and this divergence roughly reflects the phylogenetic relationship of their hosts (Fig. 2 and Supplementary Fig. S5), suggesting that the linKLMN system is one of the inherent functions necessary for γ-HCH utilization in sphingomonads. In particular, the linKLMN homologues of MM-1, which is phylogenetically the most distant strain from UT26 (Fig. 2), show a relatively low similarity to linKLMNUT26 (42–75% identity at the amino acid level; Table 2 and Supplementary Fig. S5), and they were designated linKbLbMbNb. To confirm their function in γ-HCH utilization, we attempted to disrupt the linKbLbMbNb gene cluster in MM-1. However, the expected disruptant could not be constructed because unknown DNA rearrangements often occurred in MM-1. As an alternative confirmation, a plasmid containing the linKbLbMbNbMM-1 gene cluster was introduced into RE1, a linKLMNUT26 disruptant of UT26. The growth of RE1 in 1/3LB medium was inhibited by the addition of γ-HCH (Supplementary Fig. S6A), and this inhibition was suppressed by the supply of the linKbLbMbNb cluster (Supplementary Fig. S6B) as well as the linKLMNUT26 one (Supplementary Fig. S6C), strongly suggesting that both clusters function in the same way in γ-HCH utilization. These results supported our hypothesis that the linKLMN system is one of the inherent functions necessary for γ-HCH utilization in sphingomonads. However, the possibility cannot be excluded that other functional homologue(s) of the linKLMN system exist in MM-1.

All four strains carry almost identical linA to linE genes (designated ‘specific’ lin genes) (Table 2), suggesting that they acquired these genes by lateral gene transfer. However, the specific lin genes are dispersed on multiple replicons in the four strains (Table 2). In UT26, linA to linC are located on Chr1, and only the linRED cluster is located on a plasmid. On the other hand, all the specific lin genes are dispersed on multiple plasmids in various combinations in the other three strains, although additional copies of linB and linC are also located on Chr1 in TKS (Table 1). Furthermore, the replicon types of the plasmids carrying the specific lin genes vary (Table 3). These observations indicate that these four strains did not simply acquire all the specific lin genes at once as a cluster. This contrasts with other aromatic compound-degrading strains, which can acquire a whole set of responsible genes by the conjugative transfer of plasmids and/or integrative and conjugative elements.

Classification of sphingomonad plasmids and plasmid-type replicons

a The representative replicons of each type are indicated in bold in the first column. b Replicons having more than one rep gene. c Start codon is differently annotated for the same DNA region. d Identical with each other.

Although, in the present study, we only described the overall genetic repertoire of the lin genes for γ-HCH utilization, the composition of genes for LinA and LinB variants and their copy numbers and expression levels are important for the degradation performance of host strains toward HCH isomers, since the LinA and LinB variants show different levels of enzymatic activity toward different HCH isomers and their metabolites. However, in order to properly discuss this point from a genomic viewpoint, additional fundamental biochemical and experimental data will be needed.
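The identity entries in Table 2 have the form "percent (matches/aligned length)", so the quoted 42–75% amino acid range for the MM-1 homologues can be rechecked directly from the table cells. The sketch below is only an illustration: the helper names are hypothetical, and the rounding rule (truncation to a whole percent) is an assumption inferred from the table values, not a procedure stated by the authors.

```python
import re

# Matches cells such as "95 (301/316)": reported percent, matches, aligned length.
ENTRY = re.compile(r"^\s*(\d+)\s*\((\d+)/(\d+)\)\s*$")


def parse_identity(cell):
    """Return (reported_percent, matches, aligned_length) for one table cell."""
    m = ENTRY.match(cell)
    if m is None:
        raise ValueError(f"not an identity entry: {cell!r}")
    reported, matches, length = (int(g) for g in m.groups())
    return reported, matches, length


def truncated_percent(matches, length):
    """Percent identity rounded down to a whole number (assumed convention)."""
    return (100 * matches) // length


# Amino acid identities of the MM-1 homologues versus UT26, copied from Table 2.
MM1_VS_UT26 = {
    "linKb": "65 (239/364)",
    "linLb": "75 (194/258)",
    "linMb": "63 (201/319)",
    "linNb": "42 (80/190)",
}


def recompute(entries):
    """Recompute the whole-number percent identity for every gene in `entries`."""
    result = {}
    for gene, cell in entries.items():
        _, matches, length = parse_identity(cell)
        result[gene] = truncated_percent(matches, length)
    return result
```

Under the truncation assumption, the recomputed values agree with the reported ones and span the 42–75% range given in the text.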

3.3. Replication/partition-encoding regions of plasmids and plasmid-type replicons in sphingomonad strains

The putative replication origins (oriCs) of the main chromosomes (Chr1s) of TKS, MI1205, and MM-1 were, as in the case of Chr1UT26, found to be of the alphaproteobacterial-chromosome type; these oriCs were located upstream of the uroporphyrinogen decarboxylase gene (hemE) with multiple DnaA boxes [TT(A/T)TNCACA] (Supplementary Fig. S7). On the other hand, as in the case of Chr2UT26, both Chr2TKS and Chr2MI1205 have plasmid-type replication and active partition systems. These three plasmid-type chromosomes, the 21 plasmids in our four strains, and the plasmids from other sphingomonads were, on the basis of the similarities of their RepA (DNA replication initiator) proteins, classified into 13 types (Table 3). Although the importance of plasmids in sphingomonads has been recognized, no detailed analysis of their fundamental machineries has been reported. In addition, RepA proteins of plasmids in sphingomonads show a very low level of similarity to those of well-studied plasmids (e.g. IncP-1, F, IncP-7, and IncP-9 plasmids), and thus we compared only plasmids in sphingomonads in this study. Since the RepA proteins of the 13 types are very divergent, they were further categorized into three major groups, in each of which the RepA proteins exhibit 22–60% identity: (i) the Chr2UT26- and pCHQ1-types (Fig. 4A), (ii) the pUT1-, pISP3-, pTK3_1-, and pTK3_2-types (Fig. 4B), and (iii) the pISP4-, pTK2-, pTK7-, and pTK8-types (Fig. 4C). Based on our BLASTP analysis, the RepA proteins of pNL1, pLB1, and pUT2 did not show similarity to those of any of the other types of plasmids listed in Table 3, although the RepA of pUT2 was similar to those of the IncP-9 family of plasmids. 
Figure 4D schematically shows the organizations of the repA-flanking regions in the 12 representative plasmids (note that pTK3 has three types of repA genes), and many of these regions also carry the putative replication origin (oriV) sequences as well as the putative genes for the active partition systems, each with the putative parS (cis-acting centromeric) sequences, the parB gene encoding the parS-binding protein, and the parA gene encoding the NTPase that is capable of binding the parS-ParB complex. The Chr2UT26- and pCHQ1-type plasmids belong to the repABC-type plasmids, and putative palindromic parS sequences were found (Fig. 4D and Supplementary Table S6). The Chr2UT26-type plasmids have the parA-parB-repA cluster, and the order of these three genes is conserved in other typical repABC-type plasmids, although the RepA proteins from the Chr2UT26-type plasmids are divergent in their sizes and similarities (Table 3 and Fig. 4A). In contrast, the pCHQ1-type plasmids have a repA-parA-parB cluster (Fig. 4D) with nearly identical RepA proteins (Table 3). Other types of plasmids except the pLB1-type were categorized as iteron-type plasmids (Table 3), and direct repeats (iterons), DnaA box, and parS sequences were found in their repA-flanking regions (Fig. 4D and Supplementary Table S6). Each of the pISP1, Chr1TKS, pTK1, pTK3, pTK4, pMI2, pMI3, and pMI4 replicons appears to carry at least two repA genes, which are of different types (Tables 1 and 3). This observation suggested the frequent occurrence of fusions of ancestral plasmids (see below). Similar mosaic replicons carrying more than one repA gene have been reported in various bacterial strains. Interestingly, six pISP4-type plasmids carry identical repA and parA genes, and five of them also have other types of repA genes (Table 3), suggesting a prevalent fusion event of replicons in the pISP4-type plasmids (see below). 
It is noteworthy that all six pISP4-type plasmids contain the lin genes (Table 3), indicating that this type of plasmid plays an important role in dissemination of the lin genes.
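The reasoning above, that a replicon carrying repA genes of two or more different types is a candidate product of replicon fusion, can be written as a small lookup. This is a minimal sketch, not the authors' procedure: the function names are hypothetical, and the dictionary lists only three replicons whose repA content is stated in the text (pCHQ1, the archetypal pISP4, and pMI3, which is described as a pLB1/pISP4 fusion), not the full Table 3.

```python
# repA types per replicon; only combinations stated in the text are listed.
REPA_TYPES = {
    "pCHQ1": ["pCHQ1"],
    "pISP4": ["pISP4"],          # archetypal pISP4: only the common repA
    "pMI3": ["pLB1", "pISP4"],   # described as a fusion of pLB1- and pISP4-type plasmids
}


def fusion_candidates(repa_types):
    """Return replicons carrying repA genes of at least two different types."""
    return sorted(
        name for name, types in repa_types.items() if len(set(types)) >= 2
    )


def replicons_by_type(repa_types):
    """Invert the mapping: repA type -> sorted list of replicons carrying it."""
    by_type = {}
    for name, types in repa_types.items():
        for repa_type in set(types):
            by_type.setdefault(repa_type, []).append(name)
    return {repa_type: sorted(names) for repa_type, names in by_type.items()}
```

With these three entries, `fusion_candidates(REPA_TYPES)` flags only pMI3, and `replicons_by_type(REPA_TYPES)["pISP4"]` lists both pISP4 and pMI3 as carriers of the pISP4-type repA.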

Figure 4

Phylogenetic trees of putative RepA proteins of Chr2UT26- and pCHQ1- (A), pUT1- (B), and pISP4- (C) type plasmids and organizations of repA-flanking regions of 12 representative sphingomonad plasmids (D). Classification of 13 types of plasmids and information on RepA protein sequences are summarized in Table 3. Neighbour-joining phylogenetic trees of the conserved sites, 260 aa (A), 262 aa (B), and 257 aa (C), respectively, were constructed. Bootstrap values calculated from 1,000 resamplings using neighbour-joining are shown at the respective nodes. The lengths of the lines reflect relative evolutionary distances among the sequences. RepA proteins of the representatives of the plasmid types (Table 3) are shown in bold. In panel D, the repA-flanking regions of plasmids whose putative RepA proteins show significant similarity are boxed. Pentagons indicate the size and direction of each ORF. Putative ORFs involved in replication and partition are filled with dark and light gray, respectively. Putative parS (palindromic TTN4CG N4AA) and DnaA box [TT(A or T)TNCACA] sequences are shown as red bars and red diamonds, respectively. Inverted repeats and repeat sequences are shown as blue and green bars, respectively. See Supplementary Table S6 for their sequences.
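The DnaA box consensus quoted above, TT(A or T)TNCACA, translates directly into a regular expression. The following is a minimal sketch of a two-strand scan for that 9-bp consensus; the function names are hypothetical, N is assumed to mean any of A, C, G, or T, and this is not the tool the authors used to annotate the boxes.

```python
import re

# TT(A/T)TNCACA with N = any base; the lookahead lets overlapping hits be reported.
DNAA_BOX = re.compile(r"(?=(TT[AT]T[ACGT]CACA))")
BOX_LENGTH = 9

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string containing only A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_dnaa_boxes(seq):
    """Return sorted (start, strand, match) tuples; start is 0-based on the given strand."""
    seq = seq.upper()
    hits = []
    for m in DNAA_BOX.finditer(seq):
        hits.append((m.start(1), "+", m.group(1)))
    rc = reverse_complement(seq)
    for m in DNAA_BOX.finditer(rc):
        # Convert the position on the reverse strand back to forward coordinates.
        start = len(seq) - (m.start(1) + BOX_LENGTH)
        hits.append((start, "-", m.group(1)))
    return sorted(hits)
```

For example, `find_dnaa_boxes("GGTTATCCACAGG")` reports one plus-strand box starting at position 2. The example sequences used here are made up for illustration and are not taken from the genomes.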

3.4. Highly conserved regions of replicons in sphingomonad strains

Although multiple members in each of the pCHQ1-, pUT1-, pLB1-, pISP3-, pUT2-, and pISP4-type plasmids have almost identical repA-containing regions, the sizes and gene contents of the plasmid members in each type are diverse (Table 3). Therefore, we compared the overall structures in the six types of plasmids (Fig. 5). The 6.4-kb repA-parA-parB-containing region and the 10-kb repA-parA-containing region are conserved in all members of the pCHQ1- and pUT1-type plasmids, respectively (Fig. 5A and B). The 25.5-kb region containing the pLB1-type repA and parB genes is conserved in pMI3 and pTK5 (Fig. 5C). Most of this region is also conserved in pLA2, although the region is divided into two parts and the parB gene is lacking (Fig. 5C). pMI3 is a fusion plasmid of the pLB1- and pISP4-type plasmids because the 9.1-kb repA-parA-containing region commonly conserved in the pISP4-type plasmids (Fig. 5F) is, together with the linC- and linF″-containing region, inserted into the continuous region on pLB1 (Fig. 5C). A part of the 9.1-kb conserved region is also present in pTK4 (Fig. 5B). The 4.5-kb region containing the pISP3-type repA and parA genes is conserved in pTK6 and pMI4 (Fig. 5D). Two 1,645-bp plasmids, pUT2 and pTK9, differ by only 9 bp (Fig. 5E). As mentioned above, all the pISP4-type plasmids except for the archetypal pISP4 have, in addition to the common repA gene, other distinct repA genes (Table 3), and are probably fusion plasmids. IS6100 or the Tn3-type transposon is located at the junctions of highly conserved regions of pISP4-type plasmids (Fig. 5F), suggesting that the pISP4-type plasmids are easily fused with other replicons via the transposition of IS6100 and/or the Tn3-type transposon. 
The repA, parA, and parB genes of the Chr2UT26-type replication machineries were also located on Chr1TKS, pISP1, pMI2, Chr2TKS, pTK1, Chr2MI1205, and pMI1 (Table 3), and the former three replicons carry an almost identical 20.3-kb region that covers the Chr2UT26-type repA, parA, and parB genes (Supplementary Fig. S8). Our findings in this section clearly indicate that the replicons having highly conserved replication/partition genes are distributed among sphingomonad strains with frequent recombination events including replicon fusion.

Figure 5

Structures of plasmids that have highly conserved regions in the four γ-HCH-degrading sphingomonad strains. The highly conserved regions in pCHQ1- (A), pUT1- (B), pLB1- (C), pISP3- (D), pUT2- (E), and pISP4- (F) type plasmids are schematically shown. See Table 3 for the classification of plasmids. ORFs shown by pentagons are coloured as follows: red, rep and par genes; green, transposase gene of IS6100; cyan, putative genes for conjugal transfer; yellow, lin genes; and gray with gradient, transposition-related genes in other putative transposons. Almost identical regions are shown by the same background colours.

3.5. Genes for conjugal transfer of plasmids in HCH-degrading sphingomonads

The genes for conjugal transfer consist of those encoding proteins involved in mating pair formation (Mpf) and DNA transfer and replication (Dtr). The mpf genes encode proteins that assemble into a large macromolecular structure called the Type IV secretion system (T4SS), whereas the dtr genes encode proteins that bind to the DNA at the origin of transfer region, oriT, forming a structure called a relaxosome. This modular gene organization is shared by most conjugative systems, showing a high degree of gene synteny conservation. Among the sphingomonad plasmids listed in Table 3, conjugal transferability of pCHQ1 and pLB1 has been experimentally confirmed, and these two plasmids have putative gene clusters for conjugal transfer similar to the vir gene cluster of the Agrobacterium tumefaciens Ti plasmid, consisting of genes for Mpf (VirB1 to VirB11) and Dtr (relaxase VirD2 and coupling protein VirD4) (Supplementary Fig. S9A). Putative vir gene clusters were also found on pISP0, pTK1, pTK2, pTK5, and pMI3 (Supplementary Fig. S9A), indicating the potential self-transferability of these plasmids. However, the level of similarity of each component to its counterpart in the Ti plasmid is relatively low, and only the phylogenetic relationship of the putative cytoplasmic ATPase component (VirB4), which is the essential and most conserved component of T4SS, encoded by these clusters is shown (Supplementary Fig. S9B). pCHQ1 has another putative gene cluster for conjugal transfer similar to the tra gene cluster of the F plasmid, and gene clusters similar to the tra gene cluster were also found on Chr1TKS, pMI1, pMI2, pTK1, and pISP1 (Supplementary Fig. S9C). As in the case of the gene clusters homologous to the vir gene cluster, the level of similarity of each component to its counterpart in the F plasmid is relatively low, and only the phylogenetic relationship of the cytoplasmic ATPase component (TraC: VirB4 homologue in function) encoded by these clusters is shown (Supplementary Fig. S9D). 
However, traI and traD, which encode the relaxase and coupling protein, respectively, were not found on pCHQ1, and traD and traG, which encode the coupling protein and inner membrane platform component, respectively, are missing on pISP1 (Supplementary Fig. S9C). Further experimental confirmation is necessary to demonstrate the self-transferability of these plasmids having putative gene clusters for conjugal transfer.
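The missing components reported above can be tabulated against the two functional modules (Mpf and Dtr) to make clear why self-transferability remains unconfirmed. The sketch below is illustrative only: the names are hypothetical, it covers just the four tra genes named in the text, and the assignment of traC and traG to Mpf and of traI and traD to Dtr follows the roles given in the text (ATPase and inner membrane platform of the T4SS; relaxase and coupling protein), which is an interpretive assumption rather than an annotation from the paper.

```python
# Module assignment for the tra genes named in the text (assumed, see lead-in).
TRA_MODULE = {
    "traC": "Mpf",  # cytoplasmic ATPase, VirB4 homologue in function
    "traG": "Mpf",  # inner membrane platform component
    "traI": "Dtr",  # relaxase
    "traD": "Dtr",  # coupling protein
}

# tra genes reported as not found, taken from the text.
REPORTED_MISSING = {
    "pCHQ1": {"traI", "traD"},
    "pISP1": {"traD", "traG"},
}


def missing_by_module(replicon):
    """Group the genes reported missing on `replicon` by functional module."""
    summary = {"Mpf": [], "Dtr": []}
    for gene in sorted(REPORTED_MISSING.get(replicon, set())):
        summary[TRA_MODULE[gene]].append(gene)
    return summary


def lacks_dtr_component(replicon):
    """True if a relaxase or coupling-protein gene is reported missing."""
    return bool(missing_by_module(replicon)["Dtr"])
```

In this tabulation both pCHQ1 and pISP1 lack at least one Dtr gene in their tra-type clusters. For pCHQ1 this does not contradict its confirmed transferability, since the text reports a separate vir-type cluster on that plasmid.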

3.6. Transposable elements in four γ-HCH degraders

Many putative transposable elements including IS elements and Tn3-type transposons were found in the genomes of the four γ-HCH degraders (Supplementary Table S7). Although most of the IS elements are present as a single copy in the four strains (Supplementary Table S7), IS6100 is, as in the case of UT26, the most abundant element in the MM-1, TKS, and MI1205 genomes (15, 29, and 24 copies, respectively) (Table 1 and Supplementary Table S7). This suggests that IS6100 can transpose and increase its copy number in these γ-HCH degraders. To investigate the transposition activity of IS6100 and other transposable elements, the IS entrapment methodology using pGEN500 was applied to the four γ-HCH degraders. We conducted several independent analyses for each strain, and detected the successful transposition of IS6100, ISsp1, ISSj02, ISSj12, and Tn6134 in UT26; IS6100, ISSj02, ISTks12, Tn6268, and Tn6269 in TKS; and IS6100, ISsp1, ISMi02, ISMi08, Tn6137, and Tn6274 in MI1205 (Supplementary Table S7). On the other hand, this IS-entrapment system did not work well in MM-1 because of the high-frequency generation of spontaneous sucrose-resistant mutants without the insertion of transposable elements into the sacB gene on pGEN500 (data not shown).

3.7. Inference of the past genome rearrangements via IS6100

IS6100 is often located in close proximity to the lin genes in the HCH-degrading strains and the metagenomic sequences from HCH-contaminated sites. IS6100, with a size of 880 bp, is a ‘replicative’ IS element (Supplementary Fig. S10), and its transposition, which shows no apparent target specificity, causes the duplication of IS6100 together with an 8-bp duplication of the target sequence. Therefore, the IS6100 transposition can generate three types of DNA rearrangements (Supplementary Fig. S11): intra-molecular transposition with a deletion/resolution (intra-replicon 1) or inversion (intra-replicon 2) event, and inter-molecular transposition with a fusion (inter-replicon) event. Our comparison of the regions just upstream and downstream of the 13 copies of IS6100 present on Chr1, Chr2, pCHQ1, and pUT1 of UT26 revealed five pairs of 8-bp sequences (Supplementary Table S8). On the basis of the IS6100 transposition mechanism (Supplementary Fig. S11), the most plausible past events caused by transposition of IS6100 can be inferred (Fig. 6A); it is indicated that not only simple transposition with inversion but also transposition accompanied by the fusion and resolution of replicons must have occurred. In a similar manner, seven, one, and four pairs of 8-bp sequences were found just upstream or downstream of IS6100 in TKS, MM-1, and MI1205, respectively (Supplementary Table S8), and the plausible past events mediated by transposition of IS6100 in these strains are depicted in Figure 6B–D.
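The footprint search described above, comparing the 8-bp sequences immediately upstream and downstream of every IS6100 copy and looking for matching pairs, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the input layout are hypothetical, the flank sequences in the usage note are invented, and reporting reverse-complement matches as "inverted" is an assumption about how inversion footprints would appear, not a rule taken from the paper.

```python
from itertools import combinations

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_flank_pairs(copies):
    """Pair up 8-bp flanks that are identical or reverse complements.

    `copies` is a list of dicts with keys "name", "left", and "right",
    where "left" and "right" are the 8-bp sequences flanking one IS copy.
    Returns a list of ((name, side), (name, side), relation) tuples.
    """
    flanks = []
    for copy in copies:
        for side in ("left", "right"):
            seq = copy[side].upper()
            if len(seq) != 8:
                raise ValueError(f"{copy['name']} {side}: expected 8 bp")
            flanks.append((copy["name"], side, seq))

    pairs = []
    for (n1, s1, f1), (n2, s2, f2) in combinations(flanks, 2):
        if f1 == f2:
            pairs.append(((n1, s1), (n2, s2), "direct"))
        elif f1 == revcomp(f2):
            pairs.append(((n1, s1), (n2, s2), "inverted"))
    return pairs
```

As a usage example with invented flanks, two copies given as `{"name": "c1", "left": "AACCGGTA", "right": "GATTACAG"}` and `{"name": "c2", "left": "CTTGACCA", "right": "AACCGGTA"}` yield a single direct pair linking the left flank of c1 to the right flank of c2. A pair found on the two sides of the same copy would correspond to a simple insertion, whereas pairs shared between different copies are the kind of footprint from which past fusion, resolution, or inversion events are inferred.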

Figure 6

Inference of the past genome rearrangements via IS6100 in UT26 (A), TKS (B), MM-1 (C), and MI1205 (D). Blue pentagons, triangles with letters, and red pentagons indicate IS6100, 8-bp target sites, and lin genes, respectively. Triangles with the same letter indicate identical sequence and direction (see Supplementary Table S8 and Supplementary Fig. S11 for details; note that the sequences shown in Supplementary Table S8 are the cyan strands of the 8-bp targets in Supplementary Fig. S11). Blue pentagons marked with an internal white circle and triangle indicate the IS6100 elements that transposed and mediated homologous recombination, respectively. IS6100 is a ‘replicative’ IS element, and it increases its copy number with each transposition (Supplementary Fig. S11). Only replicons carrying IS6100 are illustrated, and the relative positions and directions of IS6100 and lin genes in each replicon are schematically shown. The IS6100 elements involved in the proposed past genome rearrangements are shown in larger size. Numbers in the current forms of the four strains indicate the locations of IS6100 copies and lin genes in each replicon.

The specific lin-flanking regions in the four strains were compared (Fig. 7). Not only the lin genes themselves (Table 2) but also their flanking regions are highly conserved (Fig. 7). Interestingly, such conserved regions are located very close to IS6100, and the distances between the IS6100 copies and the lin genes vary (Fig. 7). This means that IS6100 is likely to play a crucial ‘editing’ role in the ‘trimming’ of ‘unnecessary regions’ for HCH utilization and the ‘gathering’ of the specific lin genes. At the least, it is most plausible that the transposition of IS6100 led to the diversification of the distribution and organization of the lin genes in the genomes. The distance between IS6100 and linA is the longest in UT26 (Fig. 7A), and the linB gene in UT26 has no IS6100 element in its flanking regions (Fig. 7B). Moreover, IS6100 is located at only one side of linC (Fig. 7C) and the linRED cluster (Fig. 7D) in UT26. These results suggested that UT26 is the closest to the prototype of the γ-HCH degrader, at least among the four strains examined in this study.

Figure 7

Comparison of regions containing the specific lin genes in the four γ-HCH-degrading sphingomonad strains. The regions containing linA (A), linB (B), linC (C), and the linRED cluster (D) were compared. The regions homologous to each other are coloured in a gradient depending on the level of similarity, as shown in the explanatory note. The lin genes, the transposase gene of IS6100, and other ORFs are shown by pentagons in red, blue, and orange, respectively. The pseudo ‘linE’ gene exists in pMI2 of MI1205 at the region corresponding to the linE gene in other plasmids.

3.8. Conclusions and perspectives

Our comparison of the complete genome sequences of four γ-HCH-degrading sphingomonad strains and gathering of experimental data in this study demonstrate or strongly suggest the following points: (i) the gene repertoires and genomic organizations of the four γ-HCH-degrading strains, which are phylogenetically dispersed among related sphingomonad strains (Fig. 2), are relatively different from one another (Table 1 and Fig. 3); (ii) all four strains carry almost identical linA to linE genes for the conversion of γ-HCH to maleylacetate (Fig. 1 and Table 2); (iii) considerably different genes are used for the metabolism of maleylacetate in TKS (Fig. 1, Table 2, and Supplementary Table S5); (iv) the linKLMN genes for the putative ABC transporter necessary for γ-HCH utilization are structurally divergent, and such divergence reflects the phylogenetic relationship of their hosts (Fig. 2 and Supplementary Fig. S5 and Table 2); (v) most of the linA to linJ genes for the catabolic enzymes are located on several replicons whose replication/partition systems are highly conserved among sphingomonad plasmids (Tables 1 and 3); and (vi) the transposition of IS6100 has caused dynamic genome rearrangements including the fusion and resolution of replicons and the diversification of lin-flanking regions in the four strains (Figs. 6 and 7). Based on our results in this study, we propose that these γ-HCH-degraders were formed independently in different geographic regions through the recruitment of specific lin genes and genes for the metabolism of maleylacetate into ancestral strains that had the core functions (including the linKLMN-encoded one) of sphingomonads. Multiple plasmids whose replication/partition machineries are highly conserved in sphingomonads might have played important roles in the recruitment of the specific lin genes by their horizontal transfer. In addition, IS6100 likely plays a crucial ‘editing’ role in the distribution and organization of the lin genes in genomes. 
In the future, our hypothesis may be confirmed in experiments using the four HCH degraders and their related but non-HCH-degrading and/or IS6100-free sphingomonad strains.

4. Data availability

The sequences with the annotation of replicons in MM-1, MI1205, and TKS have been deposited in DDBJ/EMBL/GenBank databases under the accession numbers shown in Table 1. Nucleotide sequences of the Tn3-type transposons, Tn6268 to Tn6278, were deposited in the DDBJ/EMBL/GenBank databases under accession numbers LC102249 to LC102259, respectively.

Conflict of interest

None declared.

Accession number

CP004036 to CP004041, CP005083 to CP005093, CP005188 to CP005193, and LC102249 to LC102259.

Supplementary data

Supplementary data are available at www.dnaresearch.oxfordjournals.org.

Funding

This work was supported by Japan Society for the Promotion of Science (JSPS) KAKENHI grant numbers (22380047 and 25292043).
