Evaluative profiling of arsenic sensing and regulatory systems in the human microbiome project genomes.

Raphael D Isokpehi1, Udensi K Udensi2, Shaneka S Simmons3, Antoinesha L Hollman4, Antia E Cain5, Samson A Olofinsae6, Oluwabukola A Hassan6, Zainab A Kashim6, Ojochenemi A Enejoh6, Deborah E Fasesan6, Oyekanmi Nashiru6.   

Abstract

The influence of environmental chemicals including arsenic, a type 1 carcinogen, on the composition and function of the human-associated microbiota is of significance in human health and disease. We have developed a suite of bioinformatics and visual analytics methods to evaluate the availability (presence or absence) and abundance of functional annotations in a microbial genome for seven Pfam protein families: As(III)-responsive transcriptional repressor (ArsR), anion-transporting ATPase (ArsA), arsenical pump membrane protein (ArsB), arsenate reductase (ArsC), arsenical resistance operon trans-acting repressor (ArsD), water/glycerol transport protein (aquaporins), and universal stress protein (USP). The genes for these families encode functions for sensing and/or regulating arsenic content in the bacterial cell. The evaluative profiling strategy was applied to 3,274 genomes, from which 62 genomes from 18 genera were identified as containing genes for all seven protein families. Our list included 12 genomes in the Human Microbiome Project (HMP) from the following genera: Citrobacter, Escherichia, Lactobacillus, Providencia, Rhodococcus, and Staphylococcus. Gene neighborhood analysis of the arsenic resistance operon in the genome of Bacteroides thetaiotaomicron VPI-5482, a human gut symbiont, revealed the adjacent arrangement of genes for arsenite binding/transfer (ArsD) and cytochrome c biosynthesis (DsbD_2). Visual analytics-facilitated evaluation of protein annotations in 367 genomes in the phylum Bacteroidetes identified multiple genomes in which genes for ArsD and DsbD_2 were adjacently arranged. Cytochrome c, produced by a posttranslational process, consists of heme-containing proteins important for cellular energy production and signaling. Further research is needed to elucidate arsenic resistance and arsenic-mediated cellular energy production in the Bacteroidetes.

Keywords:  Bacteroides; Bacteroidetes; Human Microbiome Project; arsenate; arsenic; arsenite; bioinformatics; genomes; gut microbiota; heavy metal transport; human symbiont; mercuric transport; secondary data analysis; visual analytics

Year:  2014        PMID: 25452698      PMCID: PMC4230230          DOI: 10.4137/MBI.S18076

Source DB:  PubMed          Journal:  Microbiol Insights        ISSN: 1178-6361


Introduction

High-throughput technologies for assaying biological macromolecules and metabolites are providing a wealth of data on the structure, function, and condition-induced changes within host-associated microbial communities.1 The influence of environmental chemicals including arsenic, a type 1 carcinogen, on the composition and function of the human-associated microbiota is of significance in human health and disease.2 The data from the Human Microbiome Project (HMP) include genome sequences and functional annotations for over 1,000 microbial isolates obtained from diverse body sites of healthy adults.3–5 There is an urgent need for data analytics (modeling and simulation, statistical analysis, and visual analytics) of the wealth of data on the human microbiome for new types of treatment as well as mechanisms of chronic diseases.6–8 The results from data analytics of microbiome data hold promise to advance knowledge of how the human microbiota at body sites respond to ubiquitous environmental chemicals such as arsenic. The overall theme of our research is to identify and evaluate microbial gene clusters that are equipped for stress response.9 In this article, we report the integration of data on the availability and abundance of genes for arsenic stress response in microbial genomes. We reason that the integration of availability (presence or absence) and abundance of genes for functions can be informative on the microbe’s potential to perform the functions. In the case of influences of arsenic on the human microbiome, knowledge of the availability and abundance of arsenic stress response genes will guide further research on the pre-systemic metabolism of arsenic by the microbiota at the body site.
Arsenic is a naturally occurring environmental chemical, and drinking water and dietary intake are the two main routes through which human beings are exposed to it.10,11 Pentavalent (arsenate) and trivalent (arsenite) inorganic arsenic species perturb normal cell function.12 The ubiquitous natural occurrence of arsenic means that cells from all domains of life must develop molecular and phenotypic mechanisms to respond to arsenic-induced stress.13–15 Ingested arsenic is a cause of cancers of the skin, lungs, bladder, and kidneys.16 Gut microbial metabolism of arsenate produces the more absorbable and toxic arsenite. The genome sequences of single isolates and microbial communities encode mechanisms by which gut microbiota transform ingested arsenic to more toxic trivalent methylated and thiolated arsenicals prior to their metabolism in human cells.17,18 Therefore, to make progress on elucidating pre-systemic metabolism of arsenic, it is necessary to identify microbes of the human microbiota with genes for sensing and regulating arsenic. Exposure of the human microbiota to arsenic presents an unfavorable environment to microbial cells. In microbial genomes, several genes function in the sensing and regulation of inorganic arsenic. We are interested in the genes encoding the arsenic resistance operon, the aquaporins, and the universal stress proteins (USPs).
These genes encode functions for sensing and/or regulating (resistance) arsenic content in the bacterial cell.19 The best-characterized arsenic genes include As(III)-responsive transcriptional repressor (arsR), anion-transporting ATPase (arsA), arsenical pump membrane protein (arsB), arsenate reductase and related proteins, glutaredoxin family (arsC), arsenical resistance operon trans-acting repressor (arsD), arylsulfatase family, member H (arsH), putative membrane permease (arsP), and As(III)-S-adenosylmethionine methyltransferase (arsM).20 The genes for conversion of arsenate to arsenite and arsenite extrusion from the cell are typically organized as operons, such as arsRBC, arsRABC, and arsRDABC, but the genes can also exist alone.20 Proteins for water and/or glycerol transport across cellular membranes, termed aquaporins, can also function in arsenic transport.21 The USP family is a protein family known to enable bacteria, archaea, fungi, viridiplantae, and certain metazoans to respond to stresses.22–24 The USP family includes proteins that contain the 140–160 amino acid (aa) USP domain [PF00582 (or Pfam00582) in the Pfam database].22–24 The domain architecture of USPs can be (i) one USP domain, (ii) two USP domains in tandem, or (iii) one or two USP domains together with other functional domains including transporters, kinases, permeases, transferases, and bacterial sensors.24,25 In Exiguobacterium sp. PS, a Gram-positive bacterium that lacks arsenic reductase activity, a USP was induced by arsenate stress.26 The uspA of Escherichia coli has been evaluated as a sensor to detect chemical toxicants.27 The availability of diverse data from HMP allows for secondary data analytics including constructing profiles of functional annotations for genes involved in arsenic sensing and regulation in the HMP genomes collection.
Therefore, we report the development of a genome profiling scheme based on the availability of functional annotations for seven Pfam protein families, including known arsenic resistance operon proteins, aquaporins, and USPs. A list of 62 genomes from 18 genera was identified, including 12 genomes in the HMP genomes collection. Several noteworthy findings could be a basis for further investigations. For example, in multiple Bacteroides genomes, a gene for arsenic binding and transfer (arsD) is adjacent to a gene for a cytochrome c biogenesis protein. Cytochrome c, produced by a posttranslational process, consists of heme-containing proteins important for cellular energy production and signaling.28,29 Previously reported research on Bacteroides and arsenic appears to be limited to phenotypic characterization of susceptibilities to arsenate, in which 25% of strains in the Bacteroides fragilis group, which included Bacteroides thetaiotaomicron, were resistant to 0.01 M arsenate.30 Further research is needed to elucidate arsenic resistance and arsenic-mediated cellular energy production in the Bacteroidetes.

Methods

Construction of functional annotation profiles for arsenic stress response of microbial genomes

We assembled a list of protein family functions (Pfam) that are known to participate in the metabolism of arsenic and in stress response in bacteria and archaea. The Pfam identifiers, names, and common abbreviation of the proteins are Pfam02374 [anion-transporting ATPase (ArsA)]; Pfam02040 [arsenical pump membrane protein (ArsB)]; Pfam03960 [arsenate reductase and related proteins, glutaredoxin family (ArsC)]; Pfam06953 [arsenical resistance operon trans-acting repressor (ArsD)]; Pfam01022 [As(III)-responsive transcriptional repressor (ArsR)]; Pfam00230 [major intrinsic protein family (MIP/AQP)]; and Pfam00582 [universal stress protein domain (Usp)]. Genomes with a profile of interest were further grouped into relevance annotation (eg, agriculture, biotechnology, human pathogen, and medical) provided by the Integrated Microbial Genomes (IMG) system. When relevance is not annotated, we tagged the genome as “Not_Reported.” A binary matrix that encodes the presence (1) or absence (0) of a relevance annotation for selected genomes was constructed. The binary matrix was visualized with matrix2png.31
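
The relevance matrix described above can be sketched in a few lines. This is a minimal illustration rather than the authors' pipeline: the function name `relevance_matrix`, the genome names, and the relevance tags in the example are hypothetical, and only the presence (1) / absence (0) encoding and the "Not_Reported" placeholder come from the text.

```python
def relevance_matrix(genome_relevance):
    """Build a presence (1) / absence (0) matrix of relevance annotations.

    genome_relevance maps a genome name to a list of relevance tags.
    Genomes with no tag receive the placeholder "Not_Reported".
    """
    tagged = {
        genome: (set(tags) if tags else {"Not_Reported"})
        for genome, tags in genome_relevance.items()
    }
    # One column per distinct relevance tag, in a stable (sorted) order.
    columns = sorted(set().union(*tagged.values()))
    rows = {
        genome: [1 if column in tags else 0 for column in columns]
        for genome, tags in tagged.items()
    }
    return columns, rows


# Hypothetical genomes and tags, for illustration only.
example = {
    "Genome_A": ["Human Pathogen", "Medical"],
    "Genome_B": ["Biotechnology"],
    "Genome_C": [],
}
columns, rows = relevance_matrix(example)
for genome, bits in rows.items():
    print(genome, "".join(str(bit) for bit in bits))
```

Each row of the result is the binary signature of one genome; a matrix in this form could then be passed to a heat-map style viewer.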

Integration of availability and abundance of genes for arsenic metabolism and USPs

Genes annotated for presence of the above seven protein families in microbial genomes were initially retrieved from the IMG system (http://img.jgi.doe.gov/) in December 2011 (IMG version 3.5).32 The datasets were downloaded as Excel spreadsheets and integrated in a visual analytics software (Tableau Desktop Professional, Seattle, WA). The dataset includes several fields including the genome name (Genome) and the gene object identifier (Gene Object ID). The visual analytics tool displayed the availability (presence or absence) and abundance (number of genes annotated with the Pfam function) of each microbial genome. A list of reference genomes sequenced by the HMP was obtained from the HMP catalog (http://www.hmpdacc.org/catalog/). Because B. thetaiotaomicron is a dominant symbiont of the gut of humans and other mammals,33 we decided to determine the abundance of genes for the arsenic resistance operon (arsRABCD) in the Bacteroides genomes in the dataset. A visualization was also generated to provide an overview of the abundance of the arsenic resistance genes in Bacteroides genomes.
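
To make the distinction between availability and abundance concrete, the sketch below counts distinct genes per genome and Pfam family from rows shaped like the exported fields (Genome, Gene Object ID, Pfam). It is an assumption-laden illustration: the function names and the example rows are hypothetical, and the authors performed this step interactively in Tableau, not in code.

```python
from collections import Counter, defaultdict

# The seven Pfam families profiled in the text.
PFAMS = ("Pfam02374", "Pfam02040", "Pfam03960", "Pfam06953",
         "Pfam01022", "Pfam00230", "Pfam00582")


def count_genes(rows):
    """Return {genome: Counter({pfam: number of distinct genes})}."""
    genes = defaultdict(set)
    for genome, gene_id, pfam in rows:
        if pfam in PFAMS:
            genes[(genome, pfam)].add(gene_id)  # a set ignores duplicated rows
    counts = defaultdict(Counter)
    for (genome, pfam), ids in genes.items():
        counts[genome][pfam] = len(ids)
    return counts


def availability(counter):
    """Presence (True) or absence (False) for each of the seven families."""
    return {pfam: counter[pfam] > 0 for pfam in PFAMS}


# Hypothetical rows: (Genome, Gene Object ID, Pfam).
rows = [
    ("Genome_A", 101, "Pfam02374"),
    ("Genome_A", 102, "Pfam02374"),
    ("Genome_A", 102, "Pfam02374"),  # duplicated row, counted once
    ("Genome_A", 103, "Pfam06953"),
    ("Genome_B", 201, "Pfam00582"),
]
counts = count_genes(rows)
print(counts["Genome_A"]["Pfam02374"])                # abundance of ArsA annotations
print(availability(counts["Genome_A"])["Pfam02040"])  # availability of ArsB
```

Abundance is the gene count itself, while availability collapses that count to a yes/no value per family.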

Functional associations of ArsD encoded transcription units in B. thetaiotaomicron

The gene content of the transcription unit with arsenic-associated genes was determined for B. thetaiotaomicron using the BioCyc database collection of pathway/genome databases (PGDBs).34 In BioCyc, a transcription unit is defined as a set of one or more genes that are transcribed to produce a single messenger RNA. Our particular interest is in multigene transcription units. Based on the observed functions in the transcription units, the presence of two Pfam functions in a chromosomal cassette was determined with the Cassette search tool of the IMG system.32 In the IMG system, a chromosomal cassette is defined as a stretch of protein-coding genes with intergenic distances less than or equal to 300 base pairs. Known and predicted functional associations of proteins encoded in a transcription unit were retrieved from the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database.35 With this approach, we expected the STRING database to provide a score for gene neighborhood evidence for genes in a transcription unit. Other types of evidence that are used in the STRING database to calculate a combined score are gene fusion, co-occurrence, co-expression, experiment, databases, text mining, and homology.35
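
The cassette definition quoted above (neighboring protein-coding genes separated by at most 300 base pairs) can be illustrated with a short grouping routine. This is a sketch under stated assumptions, not the IMG implementation: the function name, the gene names, and the coordinates are hypothetical, all genes are assumed to lie on one scaffold, and the intergenic distance is taken here as the next gene's start minus the previous gene's end.

```python
def chromosomal_cassettes(genes, max_gap=300):
    """Group genes on one scaffold into cassettes.

    genes: iterable of (locus_tag, start, end) tuples.
    Consecutive genes stay in the same cassette when the gap between
    them (next start minus previous end) is <= max_gap base pairs.
    """
    cassettes = []
    prev_end = 0  # only read once the first cassette exists
    for locus, start, end in sorted(genes, key=lambda gene: gene[1]):
        if cassettes and start - prev_end <= max_gap:
            cassettes[-1].append(locus)
            prev_end = max(prev_end, end)  # tolerate overlapping genes
        else:
            cassettes.append([locus])
            prev_end = end
    return cassettes


# Hypothetical coordinates, for illustration only.
genes = [
    ("gene_a", 100, 900),
    ("gene_b", 950, 1500),   # gap of 50 bp: same cassette
    ("gene_c", 1800, 2400),  # gap of 300 bp: still the same cassette
    ("gene_d", 2701, 3000),  # gap of 301 bp: starts a new cassette
]
cassettes = chromosomal_cassettes(genes)
print(cassettes)
```

A two-Pfam cassette search then reduces to asking whether any single cassette contains genes annotated with both families.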

Results

Arsenic stress response profiles for genomes

A seven-digit binary code was assigned to 3,274 genomes (119 archaea, 3,033 bacteria, and 122 eukaryota) obtained from the IMG system. The order of the seven Pfam families in the binary code is ArsA, ArsB, ArsC, ArsD, ArsR, Aqp, and Usp (Table 1). As functional annotations of genes could change, we selected only genomes that have the complete profile (1111111). A total of 62 microbial genomes from 18 genera had a binary code in which all seven Pfam families were present. We further grouped the genomes according to their relevance (eg, agriculture, biotechnology, human pathogen, and medical) to help direct further research. A subset of 57 genomes with the complete profile was mapped to 22 assignments of relevance. Additionally, five genomes did not have an assignment of relevance and their assignment was designated “Not_Reported”. A visualization of the matrix of 23-digit binary signatures was constructed for the 62 genomes (Fig. 1).
Table 1

Position of Pfam protein family annotation in genome binary profile.

| Pfam ID | Pfam NAME AND ABBREVIATION | POSITION IN BINARY DIGIT |
|---|---|---|
| Pfam02374 | Anion-transporting ATPase [ArsA] | 1 |
| Pfam02040 | Arsenical pump membrane protein [ArsB] | 2 |
| Pfam03960 | Arsenate reductase and related proteins, glutaredoxin family [ArsC] | 3 |
| Pfam06953 | Arsenical resistance operon trans-acting repressor [ArsD] | 4 |
| Pfam01022 | As(III)-responsive transcriptional repressor [ArsR] | 5 |
| Pfam00230 | Major intrinsic protein family [MIP/AQP] | 6 |
| Pfam00582 | Universal stress protein domain [Usp] | 7 |
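
The positions in Table 1 translate directly into a seven-digit string. The sketch below is illustrative only: the function names and the per-genome gene counts are hypothetical, while the family order and the complete profile (all seven digits set to 1) follow the text.

```python
# Pfam families in the binary-digit order given in Table 1.
PFAM_POSITIONS = [
    ("Pfam02374", "ArsA"),
    ("Pfam02040", "ArsB"),
    ("Pfam03960", "ArsC"),
    ("Pfam06953", "ArsD"),
    ("Pfam01022", "ArsR"),
    ("Pfam00230", "Aqp"),
    ("Pfam00582", "Usp"),
]
COMPLETE_PROFILE = "1" * len(PFAM_POSITIONS)


def binary_profile(gene_counts):
    """Encode presence (1) or absence (0) of each family as a string."""
    return "".join(
        "1" if gene_counts.get(pfam, 0) > 0 else "0"
        for pfam, _abbreviation in PFAM_POSITIONS
    )


def complete_genomes(counts_by_genome):
    """Return the genomes whose profile has all seven families present."""
    return [
        genome
        for genome, counts in counts_by_genome.items()
        if binary_profile(counts) == COMPLETE_PROFILE
    ]


# Hypothetical gene counts per genome, for illustration only.
counts_by_genome = {
    "Genome_X": {"Pfam02374": 2, "Pfam02040": 1, "Pfam03960": 3, "Pfam06953": 1,
                 "Pfam01022": 4, "Pfam00230": 1, "Pfam00582": 6},
    "Genome_Y": {"Pfam02374": 3, "Pfam03960": 1, "Pfam01022": 2,
                 "Pfam00230": 1, "Pfam00582": 2},  # no ArsB, no ArsD
}
for genome, counts in counts_by_genome.items():
    print(genome, binary_profile(counts))
print(complete_genomes(counts_by_genome))
```

Only presence matters for the code, so a genome with one annotated gene and a genome with ten receive the same digit at that position.
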
Figure 1

Visualization of binary-encoded matrix for relevance of genomes with genes for arsenic operon, aquaporin, and universal stress protein. Data were obtained from the Integrated Microbial Genomes. Black square, relevance annotated for genome; white square, relevance not annotated for genome. When relevance data were not available, we entered a “Not_Reported” for data processing.

The 12 HMP reference genomes with the complete profile were grouped in the body sites: gastrointestinal tract, skin, and skin wound. The genomes with the complete binary profile from the human gastrointestinal tract were Citrobacter sp. 30_2, E. coli MS 115-1, E. coli MS 198-1, Lactobacillus fermentum ATCC 14931, and Providencia stuartii ATCC 25827. Additionally, the selected genomes based on the complete seven-digit profile and isolated from the skin were five strains of Staphylococcus epidermidis, Staphylococcus hominis SK119, and Rhodococcus erythropolis SK121. The locus tags for genes encoding the protein families are presented in Table 2. The locus tags are presented in the sequence of the arsRDABC operon (Table 2). The arrangement of the genes with reference to their function is presented in Table 3. In Citrobacter sp. 30_2, two arsenic resistance gene clusters were identified with the same gene order of arsRDABC. Overall, a pattern in Table 3 is that arsA and arsD when present are adjacent for all the genomes from the four Gram-negative bacteria. However, in L. fermentum ATCC 14931, the gene cluster order was arsRABDC.
Table 2

Locus tags for genes encoding selected arsenic-associated protein families in five Human Microbiome Project reference genomes.

| Pfam FAMILY | Citrobacter sp. 30_2 | Escherichia coli MS 115-1 | Escherichia coli MS 198-1 | Lactobacillus fermentum ATCC 14931 | Providencia stuartii ATCC 25827 |
|---|---|---|---|---|---|
| arsR (Pfam01022) | CSAG_00049 | HMPREF9540_00434 | HMPREF9552_00168 | HMPREF0511_0214 | PstuA_020100015920 |
| | CSAG_00058 | HMPREF9540_00675 | HMPREF9552_02803 | HMPREF0511_1131 | PstuA_020100016320 |
| | CSAG_00761 | HMPREF9540_01104 | HMPREF9552_02903 | HMPREF0511_1475 | PstuA_020100017025 |
| | CSAG_02502 | HMPREF9540_04804 | HMPREF9552_02908 | | |
| | CSAG_04185 | HMPREF9540_04813 | | | |
| | CSAG_04189 | | | | |
| | CSAG_04238 | | | | |
| | CSAG_04297 | | | | |
| arsD (Pfam06953) | CSAG_00050 | HMPREF9540_04807 | HMPREF9552_02907 | HMPREF0511_1134 | PstuA_020100016315 |
| | CSAG_00055 | HMPREF9540_04812 | | | |
| | CSAG_04239 | | | | |
| arsA (Pfam02374) | CSAG_00051 | HMPREF9540_04808 | HMPREF9552_02906 | HMPREF0511_1132 | PstuA_020100007500 |
| | CSAG_00054 | HMPREF9540_04811 | | | PstuA_020100016310 |
| | CSAG_04240 | | | | |
| | CSAG_04243 | | | | |
| arsB (Pfam02040) | CSAG_00052 | HMPREF9540_00433 | HMPREF9552_02905 | HMPREF0511_1133 | PstuA_020100016305 |
| | CSAG_04241 | HMPREF9540_04810 | | | |
| arsC (Pfam03960) | CSAG_00053 | HMPREF9540_00432 | HMPREF9552_01835 | HMPREF0511_0280 | PstuA_020100013100 |
| | CSAG_02267 | HMPREF9540_04809 | HMPREF9552_01863 | HMPREF0511_0923 | PstuA_020100013225 |
| | CSAG_02283 | HMPREF9540_05016 | HMPREF9552_02904 | HMPREF0511_0962 | PstuA_020100016300 |
| | CSAG_04242 | HMPREF9540_05044 | | | |
| Aqp (Pfam00230) | CSAG_01847 | HMPREF9540_02867 | HMPREF9552_00087 | HMPREF0511_1378 | PstuA_020100002768 |
| | CSAG_01948 | HMPREF9540_03890 | HMPREF9552_03971 | | |
| | CSAG_04569 | HMPREF9540_04727 | | | |
| Usp (Pfam00582) | CSAG_00404 | HMPREF9540_00348 | HMPREF9552_01620 | HMPREF0511_0613 | PstuA_020100008480 |
| | CSAG_00475 | HMPREF9540_00443 | HMPREF9552_02920 | HMPREF0511_1339 | PstuA_020100010015 |
| | CSAG_01459 | HMPREF9540_00492 | HMPREF9552_03271 | HMPREF0511_1387 | PstuA_020100010755 |
| | CSAG_01471 | HMPREF9540_01169 | HMPREF9552_03967 | HMPREF0511_1569 | PstuA_020100010765 |
| | CSAG_01741 | HMPREF9540_02863 | HMPREF9552_04168 | HMPREF0511_1702 | PstuA_020100011480 |
| | CSAG_03714 | HMPREF9540_04247 | HMPREF9552_05110 | | PstuA_020100015380 |
| | CSAG_03977 | HMPREF9540_04414 | HMPREF9552_05202 | | PstuA_020100019714 |
| | CSAG_04126 | | | | |
| | CSAG_00328 | | | | |
Table 3

Comparison of gene function order in arsenic resistance operon in selected Human Microbiome Project reference genomes.

| GENOME | GENE CLUSTER IDENTIFIER* | GENE ORDER (1 to 5) |
|---|---|---|
| Citrobacter sp. 30_2 | CSAG_00049-CSAG_00053 | arsR, arsD, arsA, arsB, arsC |
| | CSAG_00054-CSAG_00055 | arsA, arsD |
| | CSAG_04238-CSAG_04242 | arsR, arsD, arsA, arsB, arsC |
| | CSAG_04243-CSAG_04243 | arsA |
| Escherichia coli MS 115-1 | HMPREF9540_00432-HMPREF9540_00434 | arsC, arsB, arsR |
| | HMPREF9540_04807-HMPREF9540_04810 | arsD, arsA, arsC, arsB |
| | HMPREF9540_04811-HMPREF9540_04813 | arsA, arsD, arsR |
| Escherichia coli MS 198-1 | HMPREF9552_02903-HMPREF9552_02907 | arsR, arsC, arsB, arsA, arsD |
| | HMPREF9552_02908-HMPREF9552_02908 | arsR |
| Lactobacillus fermentum ATCC 14931 | HMPREF0511_1131-HMPREF0511_1134 | arsR, arsA, arsB, arsD |
| Providencia stuartii ATCC 25827 | PstuA_020100016300-PstuA_020100016320 | arsC, arsB, arsA, arsD, arsR |

Note: *Start and end genes are used to identify gene clusters.

Integration of availability and abundance of genes for arsenic sensing and regulation

The integration of data fields from several data sources was accomplished through visual analytics tasks. Figure 2 is a visualization that integrates the data on binary code; body site; genome; and the availability of Pfam annotations and abundance (number of genes) for 12 reference genomes sequenced by the HMP. Several patterns can be identified from Figure 2. As gastrointestinal pre-systemic metabolism is an essential step of arsenic metabolism in humans, we further investigated the abundance of arsenic resistance genes in 43 Bacteroides genomes (Fig. 3). There were several noteworthy findings from the visualization: (i) multiple copies of arsA, arsD, and arsR were observed in the genomes of Bacteroides intestinalis 341, DSM 17393, and B. thetaiotaomicron VPI-5482; (ii) none of the Bacteroides genomes included the annotation for ArsB (Pfam02040, arsB); and (iii) in 21 Bacteroides genomes, only one arsC gene per genome was annotated. The B. intestinalis and B. thetaiotaomicron strains have multiple arsA, arsD, and arsR genes, which is indicative of the presence of at least two arsenic resistance operons. The genomic context of the genes in the arsenic operons of B. thetaiotaomicron is presented in Figure 4. Further analysis of the Pfam domain composition of the three genes annotated with the Pfam for the arsA gene revealed that two genes (BT_0116 and BT_0802) had only the Pfam02374 annotation, whereas BT_3895 had a protein domain annotation of Pfam02374 (arsA) and Pfam10609 (ParA/MinD ATPase like).
Figure 2

Genomes in the Human Microbiome Project (HMP) genomes collection with genes for arsenic operon, aquaporin, and universal stress protein.

Notes: Binary code is based on the presence of seven protein families: Pfam02374 [anion-transporting ATPase (ArsA)]; Pfam02040 [arsenical pump membrane protein (ArsB)]; Pfam03960 [arsenate reductase and related proteins, glutaredoxin family (ArsC)]; Pfam06953 [arsenical resistance operon trans-acting repressor (ArsD)]; Pfam01022 [As(III)-responsive transcriptional repressor (ArsR)]; Pfam00230 [major intrinsic protein family (MIP/AQP)]; and Pfam00582 [universal stress protein domain (Usp)].

Figure 3

Abundance of genes for arsenic-associated genes in genomes of Bacteroides species.

Notes: The horizontal axis has the count for arsenic-associated genes in the genome. The scale for each Pfam family varies according to the maximum abundance observed in the genomes evaluated. The genes and their Pfam encoding are arsA, Pfam02374 (anion-transporting ATPase); arsB, Pfam02040 (arsenical pump membrane protein); arsC, Pfam03960 (arsenate reductase and related proteins, glutaredoxin family); arsD, Pfam06953 (arsenical resistance operon trans-acting repressor); and arsR, Pfam01022 (As(III)-responsive transcriptional repressor).

Figure 4

Transcription units and functional associations of arsenic resistance operon in Bacteroides thetaiotaomicron VPI-5482. The web pages for the transcription units are http://biocyc.org/BTHE226186/NEW-IMAGE?type=OPERON&object=TUJXV-83 and http://biocyc.org/BTHE226186/NEW-IMAGE?type=OPERON&object=TUJXV-442.

Functional associations of ArsD-encoded transcription units in B. thetaiotaomicron

Figure 4 provides an overview of the two transcription units, the predicted function of proteins, and the predicted protein–protein networks involving arsenic-associated genes for B. thetaiotaomicron VPI-5482. The two arsD genes (BT_0117 and BT_0801) in B. thetaiotaomicron VPI-5482 are located on two transcription units, which are labeled TUJXV-83 and TUJXV-442, respectively, in BioCyc. These transcription units have nine and six genes, respectively (Fig. 4A and B). As shown in Figure 4C, both transcription units have genes for homologs of arsA (BT_0116; BT_0802), acr3 [homologous to arsB] (BT_0114; BT_0803), and arsD (BT_0117; BT_0801). Unique to TUJXV-83 are genes for a permease (BT_0112), a mercuric transport protein (BT_0113), and an arsenic reductase (BT_0115). Three additional proteins encoded in both transcription units are cytochrome c biogenesis protein (BT_0118; BT_0800), protein with thioredoxin-like fold (BT_0119; BT_0799), and redox-active disulfide protein (BT_0120; BT_0798). In both transcription units, the arsD gene was flanked by the gene for a cytochrome c biogenesis protein, DsbD_2 (Pfam13386), and the gene for an anion-transporting ATPase, arsA (Pfam02374). Using the IMG system, a search of chromosomal cassettes with Pfam06953 and Pfam13386 in 2,841 finished bacterial genomes identified cassettes in the following nine genomes: B. thetaiotaomicron VPI-5482; Bacteroides vulgatus ATCC 8482; Bacteroides xylanisolvens XB1A; Porphyromonas asaccharolytica VPI 4198, DSM 20707; Prevotella melaninogenica ATCC 25845; Shewanella putrefaciens 200; S. putrefaciens CN-32; Shewanella sp. ANA-3; and Shewanella sp. W3-18-1. It was only in the Bacteroidetes genomes (Bacteroides, Porphyromonas, and Prevotella) that genes for ArsD and DsbD_2 were adjacent. The ArsD proteins (BT_0117 and BT_0801) of B. thetaiotaomicron VPI-5482 were selected as input proteins for generation of protein–protein interaction networks. As expected, the generated networks (Fig. 4D) include the genes in the transcription units that had the neighborhood evidence (green line). The interaction between BT_0117 (arsD) and BT_0116 (arsA) has multiple types of evidence and is expected. A predicted interaction of BT_0120 (redox-active disulfide protein) had co-occurrence and fusion evidence types with BT_0112 (a permease) and BT_0110 (hypothetical protein), respectively. There was experimental evidence for the interaction between homologs of BT_0802 (arsA) and ileS (BT_0806; isoleucyl-tRNA synthetase).

Discussion

We have provided an integrated view of relevance of 62 genomes from 18 genera with genes for arsenic operon, aquaporin, and USPs (Fig. 1). A set of 12 bacterial genomes in the HMP collection5 was identified to have genes encoding seven protein families defined in this research as relevant to arsenic sensing and regulation (Table 1). The following bacteria in the HMP reference genomes are associated with the gastrointestinal tract: Citrobacter sp. 30_2, E. coli MS 115-1, E. coli MS 198-1, L. fermentum ATCC 14931, and P. stuartii ATCC 25827. Citrobacter sp. 30_2 is a Gram-negative isolate from an intestinal biopsy specimen of a patient with Crohn’s disease.36 Members of the genus Citrobacter are rod-shaped, motile, non-spore-forming members of the family Enterobacteriaceae that use citrate as their sole source of carbon.37,38 In terms of arsenic metabolism, Citrobacter sp. NC-1, isolated from soil contaminated with arsenic at levels as high as 5,000 mg As kg−1, was able to reduce 20 mM arsenate within 24 h.39 A comparison of Citrobacter sp. 30_2 and Citrobacter UC1CIT strains from a premature infant revealed the presence of an arsenic operon unique to strain 30_2.36 The two E. coli strains identified are part of the project on human gut microbiota and Crohn’s disease (http://genome.wustl.edu/projects/). An indication that the genomic composition of E. coli MS 115-1 is equipped for environmental fitness is the presence of the clpK gene for thermal resistance in Klebsiella pneumoniae.36 E. coli MS 115-1 is one of the two E. coli strains with the clpK gene. In Table 3, we observed that arsA and arsD were adjacent for all four Gram-negative bacteria. However, in L. fermentum ATCC 14931, arsB and arsD were adjacent. A systematic evaluation of more than 19,000 bacterial genomes could provide additional examples of this gene adjacency. Functional analysis with molecular techniques could also elucidate the impact of the adjacency on arsenic extrusion.
We have included the USPs as markers for arsenic exposure because of (i) prior research that supports induction of USPs by arsenic and (ii) proximity of arsenic-associated genes and genes for USPs. Early exposure to arsenite (15 minutes) induced the transcription of two usp genes in Herminiimonas arsenicoxydans, a bacterium isolated from arsenic-contaminated sludge.40 A usp gene and an arsenic resistance operon are located on an antibiotic-resistant island in the genome of Acinetobacter baumannii, an opportunistic pathogen that causes nosocomial infections.41 We have observed that the genome of Bacillus cereus Q1 contains a usp gene that is adjacent to, and in the same transcription direction as, a gene with predicted function for HTH ArsR-type DNA-binding domain (InterPro Database Identifier: IPR001845).9 Further research is needed to better define the relationship between expression of USP genes and the level of arsenic exposure. Additionally, investigations are needed in the context of arsenic sensing to compare the speed of expression of USP genes and arsenic resistance operon (ars) genes. In the Gram-negative colon-inhabiting B. thetaiotaomicron VPI-5482, two transcription units include arsenic resistance genes (ars) (Fig. 4). Investigations to confirm these genes would help define mechanisms for arsenic sensing and regulation by B. thetaiotaomicron VPI-5482, which is able to acquire and utilize indigestible dietary polysaccharides.42 Multiple copies of ars genes in B. thetaiotaomicron VPI-5482 are consistent with expansion of paralogous genes and the species’ environmental sensing abilities needed to adapt to changing ecosystems.33 Only genomes categorized under the phylum Bacteroidetes contain a gene encoding the cytochrome c biogenesis protein adjacent to the arsD gene (Fig. 3).
Cytochrome c, produced by a posttranslational process, consists of heme-containing proteins important for cellular energy production and signaling.28,29 ArsD controls the maximal expression of the arsenic resistance operon (arsRDABC).43 The ArsD metallochaperone protein delivers arsenite to the ArsA efflux pump.44,45 The significance of the adjacency of the cytochrome c biogenesis protein and the metallochaperone needs further investigation. In Shewanella putrefaciens strain CN-32, a subunit of c-type cytochrome (CymA) that is present in anaerobic conditions functions in conjunction with a known respiratory arsenate reductase.46 Through additional functional annotation data curation, we noted the presence of a gene for mercuric transport protein (Locus Tag: BT_0113; UniProt Accession: Q8ABJ7) in one of the BioCyc transcription units (TUJXV-83) of B. thetaiotaomicron VPI-5482 (Fig. 4). Predictions available at OrthoDB indicate that the gene encodes a mercuric transport protein (http://cegg.unige.ch/orthodb/results?searchtext=Q8ABJ7).47 Genes BT_0112 and BT_0114 encode transport functions. The proteins encoded by the genes BT_0112, BT_0113, and BT_0114 could be investigated for mechanisms of heavy metal transport in B. thetaiotaomicron. The focus of the evaluative profiling scheme was limited to seven Pfam annotations. Thus, certain functions that are arsenic associated would not be evaluated. For example, in the Bacteroides genomes, annotation for ArsB (Pfam02040; arsB) was not observed (Fig. 3). The annotation available in the genomes was the ACR3 form. Furthermore, in L. fermentum ATCC 14931, arsC (HMPREF0511_1135) in the arsenic resistance operon (HMPREF0511_1131 to HMPREF0511_1135) was not annotated with the Pfam family Pfam03960; the annotation observed was Pfam01451. Our evaluative scheme assessed only the arsenic reductase genes annotated with Pfam03960.
Further development of the evaluative profiling for arsenic sensing and regulation would be more comprehensive using the arsenic-related gene families: cytoplasmic AsV reduction (ars), periplasmic AsV reduction (arr), arsenite oxidation (aio), and arsenite methylation (arsM).48 Finally, the evaluative profiles will account for instances where multiple Pfam families map to a gene as with arsC and arsB.

Conclusion

In conclusion, we have developed a suite of bioinformatics and visual analytics methods to evaluate the availability (presence or absence) and abundance of functional annotations in a microbial genome for seven Pfam protein families: As(III)-responsive transcriptional repressor (ArsR); anion-transporting ATPase (ArsA); arsenical pump membrane protein (ArsB); arsenate reductase (ArsC); arsenical resistance operon trans-acting repressor (ArsD); water/glycerol transport protein (aquaporins); and USP. We identified 62 genomes from 18 genera that have genes for all seven protein families. Our list included 12 genomes in the HMP reference genomes from the following genera: Citrobacter, Escherichia, Lactobacillus, Providencia, Rhodococcus, and Staphylococcus. The use of visual analytics methods makes it possible to include additional arsenic-associated protein families in the profiling scheme. Finally, investigations are needed on the arsenic sensing and regulatory systems in members of the Bacteroidetes phylum.
  45 in total

Review 1.  From structure to function: the ecology of host-associated microbial communities.

Authors:  Courtney J Robinson; Brendan J M Bohannan; Vincent B Young
Journal:  Microbiol Mol Biol Rev       Date:  2010-09       Impact factor: 11.056

2.  The cymA gene, encoding a tetraheme c-type cytochrome, is required for arsenate respiration in Shewanella species.

Authors:  Julie N Murphy; Chad W Saltikov
Journal:  J Bacteriol       Date:  2007-01-05       Impact factor: 3.490

3.  Metalloregulatory properties of the ArsD repressor.

Authors:  Y Chen; B P Rosen
Journal:  J Biol Chem       Date:  1997-05-30       Impact factor: 5.157

4.  Strain-resolved community genomic analysis of gut microbial colonization in a premature infant.

Authors:  Michael J Morowitz; Vincent J Denef; Elizabeth K Costello; Brian C Thomas; Valeriy Poroyko; David A Relman; Jillian F Banfield
Journal:  Proc Natl Acad Sci U S A       Date:  2010-12-29       Impact factor: 11.205

5.  Ability of commercial identification systems to identify newly recognized species of Citrobacter.

Authors:  C M O'Hara; S B Roman; J M Miller
Journal:  J Clin Microbiol       Date:  1995-01       Impact factor: 5.948

6.  Responses to toxicants of an Escherichia coli strain carrying a uspA'::lux genetic fusion and an E. coli strain carrying a grpE'::lux fusion are similar.

Authors:  T K Van Dyk; D R Smulski; T R Reed; S Belkin; A C Vollmer; R A LaRossa
Journal:  Appl Environ Microbiol       Date:  1995-11       Impact factor: 4.792

7.  Susceptibility of Bacteroides spp. to heavy metals.

Authors:  T V Riley; B J Mee
Journal:  Antimicrob Agents Chemother       Date:  1982-11       Impact factor: 5.191

8.  Identification of drought-responsive universal stress proteins in viridiplantae.

Authors:  Raphael D Isokpehi; Shaneka S Simmons; Hari H P Cohly; Stephen I N Ekunwe; Gregorio B Begonia; Wellington K Ayensu
Journal:  Bioinform Biol Insights       Date:  2011-02-07

9.  Inferences on the biochemical and environmental regulation of universal stress proteins from Schistosomiasis parasites.

Authors:  Andreas N Mbah; Ousman Mahmud; Omotayo R Awofolu; Raphael D Isokpehi
Journal:  Adv Appl Bioinform Chem       Date:  2013-05-10

10.  Gut microbiome phenotypes driven by host genetics affect arsenic metabolism.

Authors:  Kun Lu; Ridwan Mahbub; Peter Hans Cable; Hongyu Ru; Nicola M A Parry; Wanda M Bodnar; John S Wishnok; Miroslav Styblo; James A Swenberg; James G Fox; Steven R Tannenbaum
Journal:  Chem Res Toxicol       Date:  2014-02-03       Impact factor: 3.739

