Carolina González, Marcelo Lazcano, Jorge Valdés, David S Holmes.
Abstract
Using phylogenomic and gene compositional analyses, five highly conserved gene families have been detected in the core genome of the phylogenetically coherent genus Acidithiobacillus of the class Acidithiobacillia. These core gene families are absent in the closest extant genus Thermithiobacillus tepidarius that subtends the Acidithiobacillus genus and roots the deepest in this class. The predicted proteins encoded by these core gene families are not detected by a BLAST search in the NCBI non-redundant database of more than 90 million proteins using a relaxed cut-off of 1.0e-5. None of the five families has a clear functional prediction. However, bioinformatic scrutiny, using pI prediction, motif/domain searches, cellular location predictions, genomic context analyses, and chromosome topology studies together with previously published transcriptomic and proteomic data, suggests that some may have functions associated with membrane remodeling during cell division, perhaps in response to pH stress. Despite the high level of amino acid sequence conservation within each family, there is sufficient nucleotide variation in the respective genes to permit the use of the DNA sequences to distinguish different species of Acidithiobacillus, making them useful additions to the armamentarium of tools for phylogenetic analysis. Since the protein families are unique to the Acidithiobacillus genus, they can also be used as probes to detect the genus in environmental metagenomes and metatranscriptomes, including those from industrial biomining operations and acid mine drainage (AMD).
Keywords: Acidithiobacillus; Orphan (ORFan) genes; Thermithiobacillus; acid resistance; biomining bioleaching and acid mine drainage (AMD); extreme acidophile; horizontal gene transfer (HGT); metagenome and metatranscriptome
Year: 2016 PMID: 28082953 PMCID: PMC5186765 DOI: 10.3389/fmicb.2016.02035
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
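The orphan criterion stated in the abstract (no database hit at an e-value cut-off of 1.0e-5) can be illustrated with a minimal sketch. It assumes BLAST results in the standard 12-column tabular layout (`-outfmt 6`), where column 11 holds the e-value; the function name `orphan_queries`, the `ingroup_subjects` parameter, and the identifiers `AFE_9999`, `subjA`, and `subjB` are hypothetical and not taken from the study.

```python
from typing import AbstractSet, Iterable, Set

EVALUE_CUTOFF = 1.0e-5  # relaxed cut-off quoted in the abstract


def orphan_queries(query_ids: Iterable[str],
                   blast_tab_lines: Iterable[str],
                   ingroup_subjects: AbstractSet[str] = frozenset(),
                   cutoff: float = EVALUE_CUTOFF) -> Set[str]:
    """Return queries with no out-of-group hit at or below the e-value cutoff.

    Assumes tab-separated BLAST output with the default 12 columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send
    evalue bitscore.
    """
    with_hit = set()
    for line in blast_tab_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        qseqid, sseqid, evalue = fields[0], fields[1], float(fields[10])
        if sseqid in ingroup_subjects:
            continue  # hits to the genus itself do not count against orphan status
        if evalue <= cutoff:
            with_hit.add(qseqid)
    return set(query_ids) - with_hit


# Toy input: the first hit is too weak to count, the second is significant.
example = [
    "AFE_0294\tsubjA\t35.0\t100\t60\t5\t1\t100\t1\t100\t2e-3\t40.1",
    "AFE_9999\tsubjB\t80.0\t200\t40\t0\t1\t200\t1\t200\t1e-50\t300",
]
print(sorted(orphan_queries(["AFE_0294", "AFE_9999"], example)))
```

A query that produces no rows at all is also reported as an orphan, which is why the function takes the full list of query identifiers rather than inferring it from the hit table.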
Table 1. Genomes used in this study.
| 2.98 | 3147 | 58.8 | Valdés et al., | ||
| 2.88 | 2826 | 58.9 | Lucas et al., 2008, Unpublished | ||
| 3.2 | 3093 | 56.6 | Liljeqvist et al., | ||
| 3.42 | 3854 | 56.4 | Talla et al., | ||
| 3.82 | 3826 | 53.1 | Yin et al., | ||
| 3.01 | 3041 | 53.1 | Valdés et al., | ||
| 3.93 | 4191 | 52.8 | Travisany et al., | ||
| 2.77 | 2681 (0.21) | 61.4 | Valdes et al., | ||
| 2.93 | 2881 (0.31) | 61.3 | You et al., | ||
| 2.96 | 2750 | 66.8 | Kelly and Wood, | ||
| 2.97 | 2822 | 56.8 | Zhou, 2016, Unpublished | ||
| 4.23 | 5600 | 57.6 | Chen et al., | ||
| 3.11 | 2949 | 58.6 | Yan et al., | ||
| 3.11 | 2939 | 58.6 | Schopf, 2016, Unpublished | ||
| 2.87 | 2646 | 61.4 | Mi et al., 2006, Unpublished | ||
| 3.85 | 3768 | 53.1 | Zhang et al., |
denotes type strain;
denotes plasmid information.
Denotes JGI accession number.
Table 2. Predicted properties of the proteins of families I–V.
| Family I | AFE_0294 | 8.06 | 250 | 5 | – | IM | – | |
| Lferr_0470 | 8.06 | 251 | 5 | – | IM | – | ||
| Acife_2737 | 9.47 | 259 | 5 | – | IM | – | ||
| CDQ10770.1 | 9.26 | 259 | 5 | – | IM | – | ||
| AFOH01000117 | 8.21 | 261 | 5 | – | IM | – | ||
| AZMO01000067 | 8.06 | 263 | 5 | – | IM | – | ||
| JMEB01000250 | 8.21 | 261 | 5 | – | IM | – | ||
| Atc_0578 | 9.25 | 257 | 5 | – | IM | – | ||
| Acaty_c0588 | 8.85 | 249 | 5 | – | IM | – | ||
| Family II | AFE_2894 | 9.52 | 103 | 1 | – | IM/P/C | – | |
| Lferr_2514 | 9.52 | 103 | 1 | – | IM | – | ||
| Acife_0262 | 10.26 | 103 | 1 | – | IM/P/C | – | ||
| CDQ10832.1 | 9.98 | 103 | 1 | – | IM/P/C | – | ||
| AFOH01000056 | 10.94 | 103 | 1 | – | IM/P/C | – | ||
| AZMO01000007 | 10.63 | 103 | 1 | – | IM/P/C | – | ||
| JMEB01000152 | 10.90 | 103 | 1 | – | IM/P/C | – | ||
| Atc_0665 | 10.37 | 103 | 1 | – | IM/P/C | – | ||
| Acaty_c0696 | 9.97 | 91 | 1 | – | IM/P/C | – | ||
| Family III | AFE_2918 | 6.82 | 128 | 1 | Yes | P | Yes | |
| Lferr_2533 | 6.82 | 128 | 1 | Yes | P | Yes | ||
| Acife_0237 | 8.79 | 128 | 1 | Yes | P/C | Yes | ||
| CDQ10857.1 | 7.88 | 128 | 1 | Yes | P | Yes | ||
| AFOH01000056 | 8.76 | 128 | 1 | Yes | P | Yes | ||
| AZMO01000007 | 8.07 | 128 | 1 | Yes | P | Yes | ||
| JMEB01000332 | 8.76 | 128 | 1 | Yes | P | Yes | ||
| Atc_2682 | 8.58 | 129 | 1 | Yes | P/C | Yes | ||
| Acaty_c2529 | 8.59 | 129 | 1 | Yes | P/C | Yes | ||
| Family IV | AFE_3261 | 6.33 | 172 | – | Yes | P/IM | Yes | |
| Lferr_2861 | 6.48 | 172 | – | Yes | P | Yes | ||
| Acife_0197 | 8.80 | 170 | – | Yes | P/E | Yes | ||
| CDQ11656.1 | 8.80 | 170 | – | Yes | P/E | Yes | ||
| AFOH01000137 | 6.33 | 172 | – | Yes | P | Yes | ||
| AZMO01000008 | 8.21 | 171 | – | Yes | P | Yes | ||
| JMEB01000258 | 8.22 | 171 | – | Yes | P | Yes | ||
| Atc_0064 | 8.80 | 170 | – | Yes | P/IM | Yes | ||
| Acaty_c0059 | 8.80 | 170 | – | Yes | P | Yes | ||
| Family V | AFE_2816 | 9.30 | 146 | 1 | – | P/IM | – | |
| Lferr_2439 | 9.31 | 146 | 1 | – | P/IM | – | ||
| Acife_0333 | 9.75 | 145 | 1 | – | P | – | ||
| CDQ09308.1 | 9.70 | 145 | 1 | – | P | – | ||
| AFOH01000029 | 9.52 | 86 | 1 | – | C/P | – | ||
| AZMO01000004 | 9.56 | 119 | 1 | – | P | – | ||
| JMEB01000081 | 9.40 | 119 | 1 | Yes | P | – | ||
| Atc_0233 | 9.21 | 128 | 1 | – | P | – | ||
| Acaty_c0260 | 9.21 | 128 | 1 | – | P | – |
IM, inner membrane; C, cytoplasm; P, periplasm.
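The pI values listed above are predictions from amino acid sequence. This section does not say which tool produced them, so the following is only a sketch of the general idea: sum the Henderson-Hasselbalch charges of the ionizable groups and search for the pH at which the net charge is zero. The pKa values are illustrative assumptions (prediction tools use differing sets), and the toy peptides are hypothetical.

```python
# Illustrative pKa values (an assumption; different tools use different sets).
POSITIVE_PKA = {"K": 10.8, "R": 12.5, "H": 6.5}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
N_TERM_PKA, C_TERM_PKA = 8.6, 3.6


def net_charge(seq: str, ph: float) -> float:
    """Net charge of a peptide at a given pH (Henderson-Hasselbalch)."""
    seq = seq.upper()
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))    # free amino terminus
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))   # free carboxyl terminus
    for aa, pka in POSITIVE_PKA.items():
        charge += seq.count(aa) / (1.0 + 10 ** (ph - pka))
    for aa, pka in NEGATIVE_PKA.items():
        charge -= seq.count(aa) / (1.0 + 10 ** (pka - ph))
    return charge


def isoelectric_point(seq: str) -> float:
    """Bisection over pH 0-14; net charge decreases monotonically with pH."""
    lo, hi = 0.0, 14.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if net_charge(seq, mid) > 0:
            lo = mid
        else:
            hi = mid
    return round((lo + hi) / 2.0, 2)


# Toy peptides (hypothetical): a basic one and an acidic one.
basic_pi = isoelectric_point("KKKKRR")
acidic_pi = isoelectric_point("DDDDEE")
print(basic_pi, acidic_pi)
```

Because the result depends on the chosen pKa set, values from a sketch like this should not be expected to reproduce the table entries exactly.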
Figure 1. Workflow. (A) Phylogenetic tree of the class Acidithiobacillia (within the dotted line) showing the clustering of the acidophilic Acidithiobacillus genus (Acidithiobacilli) subtended by the neutrophilic Thermithiobacillus tepidarius. The tree is based on a genome-scale maximum-likelihood analysis of 98 universal (housekeeping) protein families conserved in the Zeta-, Gamma-, and Betaproteobacteria and the class Acidithiobacillia, following Williams and Kelly (2013) and Hudson et al. (2014). (B) Pipeline for the identification and recovery of five protein families (termed I–V) unique to the genus Acidithiobacillus.
Table 3. Gene expression evidence.
| Family I | AFE_0294 | ND | ND | Family I | AFE sp. Yes | |
| Lferr_0470 | ND | ND | ||||
| Acife_2737 | Yes | ND | ||||
| CDQ10770.1 | ND | ND | AFV sp. Yes | |||
| AFOH01000117 | ND | ND | ||||
| AZMO01000067 | ND | ND | ||||
| JMEB01000250 | ND | ND | ATHIO sp. Yes | |||
| Atc_0578 | ND | ND | ||||
| Acaty_c0588 | Yes | Up at pH 1 | ||||
| Family II | AFE_2894 | ND | ND | Family II | AFE sp. Yes | |
| Lferr_2514 | ND | ND | ||||
| Acife_0262 | Yes | ND | ||||
| CDQ10832.1 | ND | ND | AFV sp. Yes | |||
| AFOH01000056 | ND | ND | ||||
| AZMO01000007 | ND | ND | ||||
| JMEB01000152 | ND | ND | ATHIO sp. Yes | |||
| Atc_0665 | ND | ND | ||||
| Acaty_c0696 | Yes | No change | ||||
| Family III | AFE_2918 | Yes | ND | Family III | AFE sp. Yes | |
| Lferr_2533 | ND | ND | ||||
| Acife_0237 | Yes | ND | ||||
| CDQ10857.1 | ND | ND | AFV sp. Yes | |||
| AFOH01000056 | ND | ND | ||||
| AZMO01000007 | ND | ND | ||||
| JMEB01000332 | ND | ND | ATHIO sp. Yes | |||
| Atc_2682 | ND | ND | ||||
| Acaty_c2529 | Yes | Up at pH 1 | ||||
| Family IV | AFE_3261 | ND | ND | Family IV | AFE sp. Yes | |
| Lferr_2861 | ND | ND | ||||
| Acife_0197 | Yes | ND | ||||
| CDQ11656.1 | ND | ND | AFV sp. Yes | |||
| AFOH01000137 | ND | ND | ||||
| AZMO01000008 | ND | ND | ||||
| JMEB01000258 | ND | ND | ATHIO sp. Yes | |||
| Atc_0064 | ND | ND | ||||
| Acaty_c0059 | Yes | Up at pH 1 | ||||
| Family V | AFE_2816 | ND | ND | Family V | AFE sp. Yes | |
| Lferr_2439 | ND | ND | ||||
| Acife_0333 | Yes | ND | ||||
| CDQ09308.1 | ND | ND | AFV sp. Yes | |||
| AFOH01000029 | ND | ND | ||||
| AZMO01000004 | ND | ND | ||||
| JMEB01000081 | ND | ND | ATHIO sp. Yes | |||
| Atc_0233 | ND | ND | ||||
| Acaty_c0260 | Yes | Up at pH 4 |
Expression of members of the five orphan families in different environmental conditions. Locus tags for the five families are provided.
Gene expression for families I–V was extracted from Christel et al.
Information regarding protein abundance levels when A. caldus was subjected to growth at pH 1, 2, or 4 was taken from Mangold et al.
RNA transcript expression as determined by examination of published metatranscriptomics data (Chen et al.).
AFE, Acidithiobacillus ferrooxidans; AFV, Acidithiobacillus ferrivorans; ATHIO, Acidithiobacillus thiooxidans; ND, Not detected.
Figure 2. Example of functional prediction based on multiple lines of bioinformatic and genome-based evidence for members of family II. (A) Bioinformatic analysis of members of family II based on secondary structure prediction, hydrophobicity profiles and transmembrane segment prediction, and multiple alignments and conservation profiles used to generate consensus and profile sequences for comparison with specific substrate-binding protein profiles found in public databases. (B) Genomic context analysis of members of family II, including functional annotations of the closest neighboring genes for functional association. gstA, Glutathione S-transferase; ntrC, Nitrogen assimilation regulatory protein; ispB, Octaprenyl-diphosphate synthase; rfaL, O-antigen ligase; ftsI, Cell division protein; Hyp family II, Hypothetical protein; abcA, ABC transporter A family; app, Amino acid permease; Hyp (1–8), Hypothetical protein (1–8). Table 2 provides a complete overview of the properties predicted from the amino acid sequences for members of the five families.
Figure 3. Schematic summary of functional associations found in families I–V. (A) Multiple alignments, conservation profiles and consensus sequences. (B) Transmembrane topology predictions. (C) Predicted protein localization and deduced general functions. (D) Expression data. TD, RNA transcript detected; EP, protein expression profile.
Figure 4. Location of the genes encoding families I–V (red arrows) in the genomes of (A) A. ferrooxidans ATCC 23270, (B) A. ferrivorans SS3, (C) A. caldus ATCC 51756, and (D) A. caldus SM-1. The outer two circles show the genes on both strands of the chromosomal DNA. The inner blue circle indicates the G+C content. The green two-headed arrow indicates the predicted origin of replication of the chromosome.
Figure 5. Heat map showing the percent nucleotide similarity (from 100% to <70%, see color key) between families I–V genes, concatenated for each . ferro: A. ferrooxidans; ferri: A. ferrivorans; thio: A. thiooxidans and cald: A. caldus.
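A pairwise similarity matrix of the kind summarized in the Figure 5 heat map can be sketched as follows. This is not the authors' procedure, which is not described in this section: the sketch assumes the concatenated gene sequences have already been aligned to equal length, counts identical bases, and skips any column where either sequence has a gap. The short sequences and the function names are hypothetical.

```python
from typing import Dict, Tuple


def percent_identity(a: str, b: str) -> float:
    """Percent identical positions between two aligned sequences.

    Columns with a gap ('-') in either sequence are ignored (an assumption).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        matches += x == y
    return 100.0 * matches / compared if compared else 0.0


def identity_matrix(aligned: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """All-against-all percent identity, keyed by (row label, column label)."""
    names = list(aligned)
    return {(m, n): percent_identity(aligned[m], aligned[n])
            for m in names for n in names}


# Toy alignment using the figure's label style; the bases are made up.
toy = {"ferro": "ATGCCGTA", "ferri": "ATGCAGTA", "cald": "ATG-TGTT"}
matrix = identity_matrix(toy)
print(matrix[("ferro", "ferri")])  # 7 of 8 compared columns match -> 87.5
```

The resulting dictionary can be passed to any plotting library to draw the color-coded grid.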
Figure 6. Heat map illustrating the percent nucleotide similarity (from 100% to <50%, see color key) between families I–V genes and the best BLAST hit of four newly identified .
Figure 7. 16S rRNA gene tree of selected . The tree was constructed using Bayesian inference with MrBayes (Huelsenbeck and Ronquist, 2001). The posterior probability node support is given for all nodes.
Table 4. Detection of .
| Kristineberg Mine | P | Malå, Sweden | 2.5–2.7 | NCBI nr | AOMQ00000000 | AFV, AFE, ATHIO, ACAL (Liljeqvist et al., | AFV, AFE, ATHIO, ACAL |
| Kristineberg Mine | B | Malå, Sweden | 2.5–2.7 | NCBI nr | AOMP00000000 | AFV, AFE, ATHIO, ACAL (Liljeqvist et al., | AFV, AFE, ATHIO, ACAL |
| Pink biofilm Richmond Mine | AMD | California, USA | 0.83 | NCBI nr | AADL00000000 | None (Tyson et al., | Not detected |
| Carnoulès Mine (bin 5) | AMD | Gard, France | 3.5–3.8 | NCBI nr | PRJNA62261 | AFE (Bertin et al., | AFE, ATHIO, ACAL |
| Snottites in Frasassi Cave | AMD | Ancona, Italy | 0–1 | NCBI nr | SRP006444 | ATHIO, AT (Jones et al., | ATHIO |
| Acquasanta Terme AS5 | SB | Grotta Nuova di Rio Garrafo, Italy | 0–1.5 | IMG/M | 3300000825 | ATHIO (Jones et al., | ATHIO |
| Black Soudan Mine | AMD | Minnesota, USA | 6.7 | NCBI nr | ABLV00000000 | None (Edwards et al., | Not detected |
| Black smokers (Tui Malila) | HVP | Lau Basin, Pacific Ocean | 3.8–5.7 | IMG/M | 3300001676 | None (Sheik et al., | Not detected |
| Hydrothermal vent (Guaymas Basin) | HVP | Guaymas Basin, Pacific Ocean | 6.5–8 | IMG/M | 3300003086 | None (Li et al., | Not detected |
| Marine Microbial communities (Loihi) | HVP | Loihi Seamount, Hawaii | 8 | IMG/M | 3300000327 | None (Singer et al., | Not detected |
| Deep Oceanic Microbial Communities (Juan de Fuca) | HVP | Juan de Fuca, Pacific Ocean | 4.2 | IMG/M | 3300002481 | None (Jungbluth et al., | Not detected |
| Marine Microbial communities (Lost City) | HVP | Lost City, Atlantic Ocean | 9–11 | IMG/M | 3300003136 | None (Anantharaman et al., | Not detected |
| Dabaoshan Mine | AMD | Guangdong, China | 1.9–2.3 | MG-RAST | 4481316.3 | AFE, AFV (Chen et al., | AFE, AFV, ATHIO |
| Yunfu Mine | AMD | Guangdong, China | 2.5 | MG-RAST | 4481318.3 | AFE, AFV (Chen et al., | AFE, AFV, ATHIO |
AMD, Acid Mine Drainage; ACAL, A. caldus; AFV, A. ferrivorans; AFE, A. ferrooxidans; ATHIO, A. thiooxidans; AT, Acidithiobacillus genus; P, Planktonic; B, Biofilm; SB, Subaerial biofilm; HVP, Hydrothermal vent plume; NCBI nr, National Center for Biotechnology Information non-redundant database; IMG/M, Integrated Microbial Genomes/Metagenomes; MG-RAST, Metagenomics Rapid Annotation using Subsystem Technology.
Figure 8. Heat map indicating the percent nucleotide identity (top number in respective cells) and sequence coverage (lower number in respective cells) between families I–V and environmental metagenomes and metatranscriptomes as assayed by BLASTX. The figure also shows (leftmost column, ♢ = concatenated probe) the presence or absence of the Acidithiobacillus genus in the metagenomes and metatranscriptomes determined by BLASTX, using as a probe the concatenated sequences of all five families of all Acidithiobacilli used in the study (5 families × 9 Acidithiobacilli species = 45 concatenated sequences), where positive matching is indicated with a “yes.” The letters A to C refer to specific cases described in the text. The * refers to sequences that are truncated in the respective metagenome/transcriptome databases.
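The two numbers reported per cell in Figure 8 (identity and coverage) and the yes/no genus call can be sketched as below. This is a hedged illustration only: the `Hit` record, the function names, and the `min_identity` and `min_coverage` thresholds are hypothetical, since this section does not state the thresholds used for a positive match.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass
class Hit:
    family: str      # probe family label, e.g. "I" to "V"
    pident: float    # percent identity reported for the alignment
    qstart: int      # alignment start on the probe (1-based)
    qend: int        # alignment end on the probe (1-based, inclusive)
    qlen: int        # full length of the probe


def coverage(hit: Hit) -> float:
    """Percent of the probe length spanned by the alignment."""
    return 100.0 * (abs(hit.qend - hit.qstart) + 1) / hit.qlen


def best_hits(hits: Iterable[Hit],
              min_identity: float = 70.0,    # hypothetical threshold
              min_coverage: float = 50.0     # hypothetical threshold
              ) -> Dict[str, Tuple[float, float]]:
    """Best passing hit per family as (identity, coverage)."""
    best: Dict[str, Hit] = {}
    for h in hits:
        if h.pident < min_identity or coverage(h) < min_coverage:
            continue
        if h.family not in best or h.pident > best[h.family].pident:
            best[h.family] = h
    return {fam: (h.pident, round(coverage(h), 1)) for fam, h in best.items()}


# Toy hits against one metagenome (all values made up).
toy_hits = [
    Hit("I", 95.0, 1, 250, 250),
    Hit("I", 80.0, 1, 200, 250),
    Hit("II", 60.0, 1, 103, 103),   # fails the identity threshold
    Hit("III", 90.0, 1, 32, 128),   # fails the coverage threshold
]
cells = best_hits(toy_hits)
genus_detected = "yes" if cells else "no"
print(cells, genus_detected)
```

Reporting identity and coverage together, as the figure does, guards against short, high-identity fragments being read as full-length matches.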