Matthew S Fullmer, Shannon M Soucy, Kristen S Swithers, Andrea M Makkay, Ryan Wheeler, Antonio Ventosa, J Peter Gogarten, R Thane Papke.
Abstract
The Halobacteria are known to engage in frequent gene transfer and homologous recombination. For stably diverged lineages to persist, some checks on the rate of between-lineage recombination must exist. We surveyed a group of isolates from the Aran-Bidgol endorheic lake in Iran and sequenced a selection of them. Multilocus Sequence Analysis (MLSA) and Average Nucleotide Identity (ANI) revealed multiple clusters (phylogroups) of organisms present in the lake. Patterns of presence/absence and sequence similarity of inteins and Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs), together with GC usage, ANI, and the identities of the genes used in the MLSA, revealed that two of these clusters share an exchange bias toward others in their phylogroup while showing reduced rates of exchange with other organisms in the environment. However, a third cluster, composed in part of named species from other areas of central Asia, displayed many indications of variability in exchange partners, both within and outside the lake. We conclude that barriers to gene exchange exist between the two purely Aran-Bidgol phylogroups, and that the third cluster, with members from other regions, is not a single population and likely reflects an amalgamation of several populations.
Keywords: Average Nucleotide Identity (ANI); CRISPR; Halobacteria; Multilocus Sequence Analysis (MLSA); intein
Year: 2014 PMID: 24782836 PMCID: PMC3990103 DOI: 10.3389/fmicb.2014.00140
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
Degenerate primers used to PCR amplify and sequence the genes for MLSA.
| Locus | Forward primer | Reverse primer |
| atpB | tgt aaa acg acg gcc agt aac ggt gag scv ats aac cc | cag gaa aca gct atg act tca ggt cvg trt aca tgt a |
| ef-2 | tgt aaa acg acg gcc agt atc cgc gct bta yaa stg g | cag gaa aca gct atg act ggt cga tgg wyt cga ahg g |
| glnA | tgt aaa acg acg gcc agt cag gta cgg gtt aca sga cgg | cag gaa aca gct atg acc ctc gcs ccg aar gac ctc gc |
| ppsA | tgt aaa acg acg gcc agt ccg cgg tar ccv agc atc gg | cag gaa aca gct atg aca tcg tca ccg acg arg gyg g |
| rpoB | tgt aaa acg acg gcc agt tcg aag agc cgg acg aca tgg | cag gaa aca gct atg acc ggt cag cac ctg bac cgg ncc |
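The primers in this table use IUPAC ambiguity codes (for example `s`, `v`, `r`, `y`), and every forward and reverse primer starts with the same 5′ prefix. A minimal sketch of how one might separate that prefix from the locus-specific part and count how many distinct oligonucleotides a degenerate primer represents is given here. Treating the shared prefixes as M13 universal sequencing tails is an assumption based on their sequence, not something the table states, and the helper names are hypothetical.

```python
# IUPAC nucleotide ambiguity codes and the bases each one can stand for.
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "ag", "y": "ct", "s": "cg", "w": "at", "k": "gt", "m": "ac",
    "b": "cgt", "d": "agt", "h": "act", "v": "acg", "n": "acgt",
}

# Assumption: the shared 5' prefixes in the table are M13 sequencing tails.
M13_FORWARD_TAIL = "tgtaaaacgacggccagt"
M13_REVERSE_TAIL = "caggaaacagctatgac"


def clean(primer: str) -> str:
    """Remove the triplet spacing used in the table and lowercase the sequence."""
    return primer.replace(" ", "").lower()


def strip_tail(primer: str, tail: str) -> str:
    """Return the locus-specific part of a tailed primer."""
    seq = clean(primer)
    if not seq.startswith(tail):
        raise ValueError("primer does not start with the expected tail")
    return seq[len(tail):]


def degeneracy(primer: str) -> int:
    """Count the distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in clean(primer):
        count *= len(IUPAC[base])
    return count


# atpB primers copied from the table above.
atpb_forward = "tgt aaa acg acg gcc agt aac ggt gag scv ats aac cc"
atpb_reverse = "cag gaa aca gct atg act tca ggt cvg trt aca tgt a"

atpb_forward_core = strip_tail(atpb_forward, M13_FORWARD_TAIL)
atpb_reverse_core = strip_tail(atpb_reverse, M13_REVERSE_TAIL)

print(atpb_forward_core, degeneracy(atpb_forward_core))
print(atpb_reverse_core, degeneracy(atpb_reverse_core))
```

The degeneracy is the product of the number of bases allowed at each position, so a primer with no ambiguity codes has a degeneracy of 1.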
PCR conditions for each locus.
| Component | atpB | ef-2 | glnA | ppsA | rpoB |
| Water (μl) | 11.6 | 8.2 | 11.8 | 7.9 | 11.9 |
| 5× Phire reaction buffer (μl) | 4.0 | 4.0 | 4.0 | 4.0 | 4.0 |
| DMSO (μl) | 0.6 | 0 | 0.4 | 0.6 | 0.6 |
| Acetamide (25%, μl) | 0 | 4.0 | 0 | 4.0 | 0 |
| dNTP mix (10 mM, μl) | 0.4 | 0.4 | 0.4 | 0.4 | 0.4 |
| Forward primer (10 mM, μl) | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| Reverse primer (10 mM, μl) | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| Phire II DNA polymerase (μl) | 0.4 | 0.4 | 0.4 | 0.4 | 0.4 |
| Template DNA (20 ng/μl, μl) | 1.0 | 1.0 | 1.0 | 0.7 | 0.7 |
| Annealing temperature (°C) | 60.0 | 61.0 | 69.6 | 66.0 | 63.7 |
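As a quick consistency check on the recipe, the component volumes in each column can be summed. The sketch below copies the values from the table and refers to the five columns by position only. The 20 μl total is not stated in the source; it is simply what the listed volumes add up to.

```python
# Component volumes (in microlitres) copied from the table, one list entry
# per column, in the order the columns appear.
volumes_ul = {
    "Water": [11.6, 8.2, 11.8, 7.9, 11.9],
    "5x Phire reaction buffer": [4.0, 4.0, 4.0, 4.0, 4.0],
    "DMSO": [0.6, 0.0, 0.4, 0.6, 0.6],
    "Acetamide (25%)": [0.0, 4.0, 0.0, 4.0, 0.0],
    "dNTP mix": [0.4, 0.4, 0.4, 0.4, 0.4],
    "Forward primer": [1.0, 1.0, 1.0, 1.0, 1.0],
    "Reverse primer": [1.0, 1.0, 1.0, 1.0, 1.0],
    "Phire II DNA polymerase": [0.4, 0.4, 0.4, 0.4, 0.4],
    "Template DNA": [1.0, 1.0, 1.0, 0.7, 0.7],
}

# Sum down each column; rounding hides floating-point noise.
totals = [round(sum(column), 1) for column in zip(*volumes_ul.values())]

for index, total in enumerate(totals, start=1):
    print(f"column {index}: {total} ul")
```

Every column sums to the same reaction volume, which suggests the water volume was adjusted to compensate for the additives and the reduced template volume.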
Assembly statistics for the genomes sequenced in this study.
| N75 (kb) | 18.9 | 2.3 | 23.2 | 24.7 | 1.1 | 1.3 | 30.0 | 25.1 | 25.4 | 42.7 | 25.3 | 27.2 | 41.1 | 23.8 | 32.1 | 23.2 | 21.4 | 8.4 |
| N50 (kb) | 54.9 | 4.4 | 56.3 | 42.9 | 1.9 | 2.3 | 43.8 | 51.6 | 51.6 | 80.3 | 42.7 | 68.1 | 74.9 | 51.2 | 64.4 | 43.4 | 39.6 | 32.1 |
| N25 (kb) | 97.3 | 7.8 | 99.8 | 73.4 | 3.5 | 4.0 | 77.5 | 95.4 | 95.7 | 131.8 | 90.3 | 118.4 | 118.9 | 91.9 | 83.0 | 68.2 | 76.0 | 67.9 |
| Minimum (kb) | 0.5 | 0.4 | 0.5 | 0.5 | 0.4 | 0.4 | 0.5 | 0.5 | 0.5 | 0.6 | 0.5 | 0.5 | 0.3 | 0.5 | 0.5 | 0.5 | 0.5 | 0.4 |
| Maximum (kb) | 180.2 | 40.5 | 183.6 | 123.4 | 26.7 | 25.0 | 203.3 | 169.6 | 268.1 | 412.4 | 174.7 | 230.0 | 246.3 | 145.6 | 122.0 | 190.3 | 145.8 | 153.4 |
| Average (kb) | 16.6 | 2.9 | 22.5 | 23.1 | 1.5 | 1.8 | 24.7 | 22.6 | 23.3 | 44.3 | 20.6 | 25.7 | 40.3 | 21.0 | 27.9 | 19.6 | 17.5 | 4.4 |
| Contig count | 233 | 1165 | 159 | 145 | 2764 | 1278 | 159 | 166 | 156 | 74 | 176 | 138 | 83 | 160 | 137 | 189 | 213 | 1090 |
| Length (Mb) | 3.87 | 3.33 | 3.58 | 3.35 | 4.21 | 2.26 | 3.93 | 3.75 | 3.63 | 3.28 | 3.63 | 3.55 | 3.35 | 3.36 | 3.82 | 3.70 | 3.73 | 4.79 |
| Base composition (GC%) | 66.0 | 65.8 | 65.8 | 67.6 | 65.5 | 66.3 | 67.0 | 67.6 | 67.5 | 67.6 | 66.6 | 67.1 | 67.8 | 67.7 | 67.6 | 67.6 | 66.2 | 66.0 |
| Number of coding sequences | 3908 | 3379 | 3529 | 3323 | 4147 | 2187 | 3977 | 3672 | 3544 | 3245 | 3600 | 3617 | 3400 | 3382 | 3718 | 3612 | 3724 | 4615 |
| Number of RNAs | 57 | 37 | 49 | 54 | 51 | 31 | 50 | 49 | 48 | 47 | 65 | 48 | 49 | 47 | 51 | 48 | 56 | 69 |
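The N25, N50, and N75 rows follow the usual contiguity convention: Nx is the length of the shortest contig in the smallest set of longest contigs that together cover at least x% of the assembly, which is why N75 ≤ N50 ≤ N25 in every column. A minimal sketch of that calculation follows. The contig lengths are made-up illustration values, not data from this study, and `nx` is a hypothetical helper name.

```python
def nx(contig_lengths, x):
    """Return the Nx statistic (e.g. x=50 for N50) for a list of contig lengths."""
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig down until x% of the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 100 >= total * x:
            return length
    raise ValueError("contig_lengths must not be empty")


# Hypothetical contig lengths in kb, for illustration only.
toy_contigs_kb = [100, 80, 60, 40, 20]

n25 = nx(toy_contigs_kb, 25)
n50 = nx(toy_contigs_kb, 50)
n75 = nx(toy_contigs_kb, 75)
print(n25, n50, n75)
```

The Average row in the table is consistent with assembly length divided by contig count; for the first column, 3.87 Mb over 233 contigs gives about 16.6 kb.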
List of genomes used in this study.
| Organism | Accession | Source | Isolation site | Environment | Assembly status |
| | PRJNA72475 | NCBI | Alicante, Spain | Solar saltern | Complete |
| | PRJNA57719 | NCBI | Dead Sea, Israel | Saline lake/sea | Complete |
| | PRJNA167315 | NCBI | Alicante, Spain | Solar saltern | Complete |
| | PRJNA46845 | NCBI | Dead Sea, Israel | Saline lake/sea | Complete |
| | PRJNA199598 | NCBI | Yunnan, China | Solar saltern | Draft |
| | PRJNA188616 | NCBI | Xin-Jiang, China | Saline lake | Draft |
| | PRJNA188617 | NCBI | Xin-Jiang, China | Saline lake | Draft |
| | PRJNA188618 | NCBI | California, United States | Solar saltern | Draft |
| | PRJNA188619 | NCBI | Geelong, Australia | Solar saltern | Draft |
| | PRJNA188621 | NCBI | Turkmenistan | Saline soils | Draft |
| | PRJNA188620 | NCBI | Turkmenistan | Saline soils | Draft |
| | PRJNA188622 | NCBI | California, United States | Solar saltern | Draft |
| | PRJNA188615 | NCBI | Inner Mongolia, China | Saline lake | Draft |
| | PRJNA58807 | NCBI | Deep Lake, Antarctica | Saline lake | Complete |
| | PRJNA188614 | NCBI | Xin-Jiang, China | Saline lake | Draft |
| | PRJNA188613 | NCBI | Fujian, China | Solar saltern | Draft |
| | PRJNA188612 | NCBI | California, United States | Solar saltern | Draft |
| | PRJNA188611 | NCBI | Atacama, Chile | Solar saltern | Draft |
| | PRJNA188610 | NCBI | Turkmenistan | Saline soils | Draft |
| Hrr. Cb34 | PRJNA232799 (in submission) | This study | Aran-Bidgol, Iran | Saline lake | Draft |
| Hrr. C49 | PRJNA232799 (in submission) | This study | Aran-Bidgol, Iran | Saline lake | Draft |
| Hrr. Ea1 | PRJNA232799 (in submission) | This study | Aran-Bidgol, Iran | Saline lake | Draft |
| Hrr. Eb13 | PRJNA232799 (in submission) | This study | Aran-Bidgol, Iran | Saline lake | Draft |
| Hrr. Ib24 | PRJNA232799 (in submission) | This study | Aran-Bidgol, Iran | Saline lake | Draft |
| Hrr. Ea8 | PRJNA232799 (in submission) | This study | Aran-Bidgol, Iran | Saline lake | Draft |
| Hrr. Hd13 | PRJNA232799 (in submission) | This study | Aran-Bidgol, Iran | Saline lake | Draft |
| Hrr. C3 | PRJNA232799 (in submission) | This study | Aran-Bidgol, Iran | Saline lake | Draft |
| Hrr. E8 | PRJNA232799 (in submission) | This study | Aran-Bidgol, Iran | Saline lake | Draft |
| Hrr. E3 | PRJNA232799 (in submission) | This study | Aran-Bidgol, Iran | Saline lake | Draft |
| Hrr. LG1 | PRJNA232799 (in submission) | This study | Aran-Bidgol, Iran | Saline lake | Draft |
| Hrr. Fb21 | PRJNA232799 (in submission) | This study | Aran-Bidgol, Iran | Saline lake | Draft |
| Hrr. Ga2p | PRJNA232799 (in submission) | This study | Aran-Bidgol, Iran | Saline lake | Draft |
| Hrr. G37 | PRJNA232799 (in submission) | This study | Aran-Bidgol, Iran | Saline lake | Draft |
| Hrr. LD3 | PRJNA232799 (in submission) | This study | Aran-Bidgol, Iran | Saline lake | Draft |
| Hrr. Ec15 | PRJNA232799 (in submission) | This study | Aran-Bidgol, Iran | Saline lake | Draft |
| Hrr. Ga36 | PRJNA232799 (in submission) | This study | Aran-Bidgol, Iran | Saline lake | Draft |
Figure 1. Maximum-likelihood gene trees made from the DNA sequences of . Support values on branches are bootstrap replicates. Bootstrap values below 70 are not displayed.
Figure 2. Maximum-likelihood tree made from the concatenated DNA sequences of five housekeeping genes (atpB, ef-2, glnA, ppsA, and rpoB). Support values on branches are bootstrap replicates. Bootstrap values below 70 are not displayed.
Pairwise distances of the concatenated alignment of the five MLSA genes.
Figure 3. Average Nucleotide Identity (ANI) and tetramer frequency correlation analysis. Color coding reflects three described ANI cutoffs for species delineation: red squares represent ANI values of 96% or greater, orange squares 95% or greater, and yellow squares 94% or greater. Vertical stripes indicate tetramer regression coefficients lower than 0.9900.
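The color scheme in this caption amounts to a small threshold rule, sketched below. The function names are hypothetical, the example values are invented for illustration, and returning `None` for ANI values below 94% is an assumption, since the caption does not say how those cells are drawn.

```python
def ani_color(ani_percent):
    """Map an ANI value (in percent) to the color named in the caption."""
    if ani_percent >= 96.0:
        return "red"
    if ani_percent >= 95.0:
        return "orange"
    if ani_percent >= 94.0:
        return "yellow"
    # Assumption: pairs below the lowest cutoff are left uncolored.
    return None


def is_striped(tetramer_coefficient):
    """Vertical stripes mark tetramer regression coefficients below 0.9900."""
    return tetramer_coefficient < 0.99


# Invented example pairs: (ANI in percent, tetramer regression coefficient).
examples = [(97.3, 0.998), (95.4, 0.991), (94.1, 0.987), (88.0, 0.950)]
for ani, tetra in examples:
    print(ani, ani_color(ani), "striped" if is_striped(tetra) else "plain")
```

Checking the cutoffs from highest to lowest means each pair receives the color of the strictest threshold it meets.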
Figure 4. GC usage of all annotated ORFs within and between phylogroups.
Figure 5. Assessment of the presence of inteins and Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs). For inteins, purple boxes indicate the presence of an intein allele, white boxes indicate its absence, and black boxes indicate an undetermined result. For CRISPRs, a (+) indicates the presence of one or more CRISPRs and a (–) indicates the absence of CRISPRs.
Figure 6. Bayesian tree made from presence-absence of intein alleles and protein sequences of present alleles. Support values on branches are posterior probabilities. Posteriors below 0.8 are not displayed.