Christian Woehle1, Alexandra-Sophie Roy2, Nicolaas Glock3, Tanita Wein4, Julia Weissenbach4, Philip Rosenstiel5, Claas Hiebenthal3, Jan Michels6, Joachim Schönfeld3, Tal Dagan4. 1. Institute of Microbiology, Kiel University, Am Botanischen Garten 11, Kiel 24118, Germany. Electronic address: cwoehle@ifam.uni-kiel.de. 2. Institute of Microbiology, Kiel University, Am Botanischen Garten 11, Kiel 24118, Germany. Electronic address: sroy@ifam.uni-kiel.de. 3. GEOMAR Helmholtz Centre for Ocean Research Kiel, Wischhofstrasse, Kiel 24148, Germany. 4. Institute of Microbiology, Kiel University, Am Botanischen Garten 11, Kiel 24118, Germany. 5. Institute of Clinical Molecular Biology, Kiel University, Am Botanischen Garten 11, Kiel 24118, Germany. 6. Institute of Zoology, Kiel University, Am Botanischen Garten 1-9, Kiel 24118, Germany.
Abstract
Benthic foraminifera are unicellular eukaryotes inhabiting sediments of aquatic environments. Several species were shown to store and use nitrate for complete denitrification, a unique energy metabolism among eukaryotes. The population of benthic foraminifera reaches high densities in oxygen-depleted marine habitats, where they play a key role in the marine nitrogen cycle. However, the mechanisms of denitrification in foraminifera are still unknown, and the possibility of a contribution of associated bacteria is debated. Here, we present evidence for a novel eukaryotic denitrification pathway that is encoded in foraminiferal genomes. Large-scale genome and transcriptomes analyses reveal the presence of a denitrification pathway in foraminifera species of the genus Globobulimina. This includes the enzymes nitrite reductase (NirK) and nitric oxide reductase (Nor) as well as a wide range of nitrate transporters (Nrt). A phylogenetic reconstruction of the enzymes' evolutionary history uncovers evidence for an ancient acquisition of the foraminiferal denitrification pathway from prokaryotes. We propose a model for denitrification in foraminifera, where a common electron transport chain is used for anaerobic and aerobic respiration. The evolution of hybrid respiration in foraminifera likely contributed to their ecological success, which is well documented in palaeontological records since the Cambrian period.
Benthic foraminifera are unicellular eukaryotes inhabiting sediments of aquatic environments. Several species were shown to store and use nitrate for complete denitrification, a unique energy metabolism among eukaryotes. The population of benthic foraminifera reaches high densities in oxygen-depleted marine habitats, where they play a key role in the marine nitrogen cycle. However, the mechanisms of denitrification in foraminifera are still unknown, and the possibility of a contribution of associated bacteria is debated. Here, we present evidence for a novel eukaryotic denitrification pathway that is encoded in foraminiferal genomes. Large-scale genome and transcriptomes analyses reveal the presence of a denitrification pathway in foraminifera species of the genus Globobulimina. This includes the enzymes nitrite reductase (NirK) and nitric oxide reductase (Nor) as well as a wide range of nitrate transporters (Nrt). A phylogenetic reconstruction of the enzymes' evolutionary history uncovers evidence for an ancient acquisition of the foraminiferal denitrification pathway from prokaryotes. We propose a model for denitrification in foraminifera, where a common electron transport chain is used for anaerobic and aerobic respiration. The evolution of hybrid respiration in foraminifera likely contributed to their ecological success, which is well documented in palaeontological records since the Cambrian period.
Production of biologically inaccessible dinitrogen (N2) is
attributed to anaerobic oxidation of ammonium (NH4+) and to
the anaerobic respiration of nitrate (NO3-) to N2,
named denitrification[1]. These processes in
the oceans are considered of major importance in the global nitrogen cycle[2]. Indeed, oxygen-depleted environments that
are densely populated by denitrifying organisms constitute major sinks for
bioavailable nitrogen-species [1]. While the
production of N2 by denitrification is widespread among prokaryotes[3, 4], in
eukaryotes it has been reported only in foraminifera[5].Foraminifera are known to colonise a wide range of marine habitats where
abiotic factors, especially fluctuation or depletion of oxygen availability, are key
to species diversity and success[6]. Several
benthic species of the order Rotaliida show denitrification activity[5, 7].
Recent studies predicted that denitrifying foraminifera contribute up to 100 % of
total benthic denitrification in the Peruvian oxygen minimum zone[8], where foraminifera reach abundances of
>500 individuals per cm2 [8,
9]. Furthermore, foraminifera were shown
to have a NO3- storage that is suggested to accumulate in
intracellular vacuoles[10] and sustains
denitrification activity for months[11].Several foraminifera species (e.g., Buliminella tenuata, and
Virgulinella fragilis) were shown to harbour intracellular
bacteria[12-15], and it has been suggested that these bacteria perform the
denitrification reported in those species[14,
16]. In contrast, the denitrifying
foraminifera Globobulimina turgida[17] (previously determined as G. pseudospinescens[18]) harbours intracellular bacteria in a low
abundance, such that the denitrification rates measured for this species cannot be
accounted for a substantial bacterial contribution[5]. A eukaryotic denitrification pathway has previously been described
in fungi[19, 20]. However, the fungal denitrification is incomplete, where the end
product is nitrous oxide (N2O) rather than N2. So far, only
prokaryotes are known to encode the genetic repertoire to perform complete
denitrification.The commonly known denitrification pathway includes four enzymatic steps that
catalyse the reactions NO3- →
NO2- → NO → N2O →
N2. Dissimilatory reduction of NO3- to
NO2- is facilitated by periplasmic or membrane-bound
nitrate reductase (NapA or NarG, respectively). The second denitrification step is
catalysed by cd1-containing (NirS) or copper-containing
(NirK) nitrite reductase. While NirS is exclusively found in prokaryotes, NirK
homologs are found in a few protists and fungi[21]. Further reduction of NO to the greenhouse gas N2O is
catalysed in prokaryotes by nitric oxide reductase (Nor), while an alternative
enzyme (P450Nor) is documented in fungi. The last step of denitrification in
prokaryotes is catalysed by nitrous oxide reductase (NosZ). Therefore, all
denitrifying organisms share a similar gene repertoire of the denitrification
pathway. Nonetheless, the denitrification gene set in foraminifera remains unknown.
Here we investigate the genetic repertoire of the denitrification pathway in
foraminifera. We analysed the whole genomes and transcriptomes of two
Globobulimina species from the Gullmar Fjord (Sweden). We find
eukaryotic genes encoding for the denitrification enzymes in foraminifera and
investigate their evolutionary origin.
Results
Homologs of denitrification enzymes in Globobulimina
spp
Foraminifera were obtained from sediment sampled in the Gullmar Fjord,
Sweden. A total of 3,360 viable individuals of Globobulimina
turgida and Globobulimina auriculata[17] were manually picked based on their
morphological characteristics (Figure 1).
Denitrification rates for both species were calculated from N2O
production measurements following protocols previously established for
foraminifera[5, 7, 22]. The
denitrification rate measured for G. turgida ranges between 28
and 1712 pmol individual-1 day-1 (Table S1); and are within
the range of previously measured rates for this species [5, 11]. The rate
calculated for G. auriculata ranges between 29 and 124 pmol
individual-1 day-1.
Figure 1
Globobulimina turgida and G.
auriculata.
See also . A) Scanning electron micrograph of a
G. turgida test. This species is characterised by spines at
the proloculus (arrow) and a serrated toothplate margin. B)
Scanning electron micrograph of a G. auriculata test
demonstrating the species’ lack of spines (arrow). C, D)
Stereo micrographs showing numerous living G. turgida
(C) and G. auriculata
(D) individuals highlighting the differences in cytoplasm colour
and test transparency. E) A species phylogenetic tree based on 18S
rRNA gene sequences. GloG15 and GloT14/GloT15 prefixes correspond to genomics
and transcriptomics data from our study, respectively (in red and green/blue).
18S rRNA gene sequences from specific individuals are shown in brown.
Identifiers of sequences derived from assemblies of the current study include
the contig identifier, sequence position and strand orientation. The tree was
rooted with 18S rRNA gene sequences from other foraminifera species as outgroup.
Bootstrap support values (> 50) are given at the branches, and the scale
bar refers to substitutions per site. The tree supports the taxonomic
classification of the species in our study and demonstrates a clear divergence
of the two species into distinct clades.
The genome and the eukaryotic transcriptome were sequenced from batches
of pooled individuals of both species. Genomic sequences were sorted into
genomic bins based on coverage and tetra-nucleotide frequencies. Six bins
containing substantial genomic information of Globobulimina, as
validated by the transcriptome information, were classified as
Globobulimina draft genome. The draft genome includes
48,370 contigs covering a total length of ~70 megabases. The number of
duplicates per gene shows that the Globobulimina draft genome
is enriched for a single species hence the species heterogeneity is low (see
STAR★METHODS). Additional 26
bins were classified as bacterial taxa, and their coverage was sufficient to
assemble draft genomes, while the remaining contigs (~2,260,000 contigs)
were classified as unassigned bins. The presence of genes involved in the
denitrification pathway in Globobulimina was tested by sequence
similarity search using prokaryotic and eukaryotic query sequences. This
revealed Globobulimina homologs of nitrate/nitrite transporters
(Nrt) and two enzymes in the denitrification pathway: copper-containing nitrite
reductase (NirK) and nitric oxide reductase (Nor).
Evidence for functional conservation of Globobulimina spp.
Nrt, NirK & Nor protein sequences
The utilization of NO3- in denitrification
suggests that foraminifera harbour mechanisms to transport
NO3- or NO2- The search for Nrt
homologs in the Globobulimina draft genome revealed multiple
protein coding sequences related to the NarK protein superfamily (Data S1A). A phylogenetic
analysis of Nrt protein sequences and identified homologs show two distinct
Globobulimina clades that likely correspond to two diverged
protein families (Clades I and II; Figure
2). Within each clade we observe a multitude of diversified homologs
(Data S2A). Clade I
clusters with bacterial encoded Nrt, which would suggest an origin of the
Globobulimina Nrt that is independent from other
eukaryotes. To validate the bacterial neighbourhood we tested an alternative
topology where we constrained a monophyletic clade including the eukaryotic
homologs and Globobulimina clade I. This revealed that the
likelihood of eukaryotic monophyly is not significantly different from the
bacterial neighbourhood position (p-value 0.352, using
AU-test). The second Globobulimina clade (clade II) has a
highly supported eukaryotic nearest neighbour (Micromonas
commoda); from this we conclude that the Nrt is of eukaryotic
ancestry. This and a massive diversification of homologs in clade I and II
indicate that NO3-/NO2- transport is
an important property of Globobulimina.
Figure 2
Phylogeny of the Globobulimina nitrate/nitrite transporter
(Nrt).
See also . Colour coding indicates taxonomic classification where
Globobulimina clades are highlighted in purple. The tree
was rooted with homologs of the DHA14 and ACS protein family as outgroup. The
number of sequences included in collapsed branches is shown next to the
corresponding triangles (the number of sequences derived from the current study
is indicated in brackets). Additional sequences are from the genome (GloG15) and
transcriptome assembly (GloT15). Bootstrap support values using 100 replicates
(> 50) are denoted at the branches, and the scale bar refers to
substitutions per site (see complete phylogeny in Data S2A).
Our analysis identified homologs of NirK in
Globobulimina, which contain cupredoxin domains and
conserved copper binding sites that are typical for that enzyme, supporting
their functional annotation as a nitrite reductase (Figure 3A; Data S1B). Notably, the Globobulimina homologs
contain introns that are flanked by the canonical eukaryotic splice sites[23](Figure
3B; Data
S1B). A phylogenetic reconstruction of NirK homologs reveals two highly
supported sister clades (Figure 3C)
including homologs having two (clade I) or three (clade II) introns (Figure 3B). Exon sequences of the
NirK-encoding gene from both clades were further validated experimentally (Figure S1A-B). The
phylogenetic tree further shows that the Globobulimina NirK
homologs are neighbours to a prokaryotic clade (Figure 3C). To further validate the prokaryotic neighbourhood of the
Globobulimina NirK, we tested the likelihood of alternative
tree topologies. A eukaryotic neighbourhood of the
Globobulimina clade could be rejected, albeit at a marginal
confidence level (p-value 0.043, using AU-test). This shows
that the Globobulimina clade is deeply branching in the tree,
rather than grouping with a specific NirK sub-clade. Consequently, we conclude
that the NirK origin in Globobulimina is ancient. Indeed,
multiple topologies of deeper branching positions could not be statistically
rejected (Figure 3C). Further search for
the alternative nitrite reductase NirS reveals the presence of homologs, which
are found in genomes of associated bacteria in our dataset but not in the
Globobulimina transcriptome or draft genome (Data S2C).
Figure 3
Sequence and phylogeny of the Globobulimina nitrite
reductase (NirK).
See also . A) Sequence
conservation of copper ligand binding sites in Globobulimina
NirK. Copper ligand binding sites are highlighted in green. Type 1 or 2 copper
binding sites are indicated. Column numbers refer to the corresponding sequence
positions in the Geobacillus thermodenitrificans sequence.
‘Seq id%’ refers to the local alignment sequence identity compared
to Geobacillus thermodenitrificans (100 %). For details and the
complete alignment see Data
S1B. B) Gene structure in Globobulimina
nirK clades based on genomic contigs. Dashed lines indicate regions
that are supported by transcriptome assembly but were not covered by genomic
contigs. Although the automatic annotation of both genomic representatives of
Globobulimina NirK was predicted to encode complete
N-termini, homology to predicted transcriptome contigs indicate an incomplete
gene start. C) Collapsed phylogenetic tree of
Globobulimina NirK. Colour coding indicates taxonomic
classification where two Globobulimina clades (clade I and II)
are highlighted and form a monophyletic clade. The number of sequences
represented by collapsed branches is shown next to the corresponding triangles.
(The number of sequences derived from the current study is indicated in
brackets.) Alternative tested topologies of the Globobulimina
clades are indicated by circled numbers (① Grouping of the
Globobulimina clade with other eukaryotes in contrast to
all remaining sequences in the tree. ② The Globobulimina
clade groups independently from NirK class II clade. ③ Grouping of the
Globobulimina clade with a eukaryote-containing subclade of
NirK class II). For more details and the complete phylogeny see Data S2B. The tree was
rooted with the MAD method. Bootstrap support values using 100 replicates
(>50) are denoted at the branches. The scale bar refers to substitutions
per site.
A search for Nor homologs revealed that the
Globobulimina genome encodes the prokaryotic Nor whose
functional annotation is supported by the presence of a conserved cytochrome
oxidase I domain and a conserved catalytic site (Data S1C). One gene
encoding a Nor homolog is located adjacent to a eukaryote-specific gene, and the
Nor-encoding gene sequence could be validated experimentally (Figure S1C). These
findings provide further support for the conclusion that Nor is indeed encoded
in the Globobulimina genome. A phylogenetic reconstruction
shows that the Globobulimina Nor homologs are monophyletic and
nested within the bacterial NorZ clade (Figure
4A). Notably, the Globobulimina clade is neighbour
to a clade containing two ‘Candidatus Methylomirabilis
oxyfera’ Nor homologs. Denitrification in ‘Ca. M.
oxyfera’ was reported to lack the intermediary
N2O where N2 and O2 are released[24]. Consequently, it has been proposed
that the ‘Ca. M. oxyfera’ Nor
homolog is, in fact, a nitric oxide dismutase (Nod) that catalyses the enzymatic
reaction 2NO → N2 + O2[25]. Amino acid residues of the
Globobulimina Nor enzyme catalytic site are similar to Nod
where two of the eight sites are identical (Figure
4B). However, since we measured N2O production in both
foraminifera species in our study we can conclude that the
Globobulimina Nor function is not Nod-like.
Figure 4
Phylogeny and conserved amino acid residues of the
Globobulimina nitric oxide reductase (Nor).
See also . A) Colour-coding
indicates for taxonomic classification. Two Globobulimina
clades (clade I and II) are monophyletic and highlighted in purple colour. The
number of sequences represented by collapsed branches is shown next to the
corresponding triangles (The number of sequences derived from the current study
is indicated in brackets). The tree was rooted with two cbb3 oxidases as
outgroup. Nod candidates as in Ettwig et al. 2012[24, 25] are
highlighted by black circles. Uncollapsed sequences represent protein
predictions from the genome (GloG15) and transcriptome assembly (GloT15).
Bootstrap support values using 100 replicates are denoted at the branches
(>50). The scale bar refers to substitutions per site. For more details
and the complete phylogeny see Data S2D. B) Sequence conservation of catalytic sites.
Amino acid residues of the catalytic site are highlighted in green and red. Two
different sequences found in ‘Candidatus
Methylomirabilis oxyfera’ are indicated by ‘(I)’ and
‘(II)’. ‘Seq id%’ refers to the local alignment
sequence identity compared to Geobacillus thermodenitrificans
(100 %), and column numbers refer to the corresponding sequence positions in
G. thermodenitrificans. Details and the complete alignment
are shown in Data
S1C.
Missing enzymatic reactions and bacterial contribution
Our search for known denitrification genes did not detect
Globobulimina homologs to enzymes catalysing the first and
last steps of denitrification. Homologs of the dissimilatory nitrate reductases
(NapA and NarG) as well as NosZ were found in the associated bacteria draft
genomes (Figure S2; Data
S2E-G). However, the possibility of bacterial contribution to the
Globobulimina denitrification has been previously discarded
due to their low abundance [5].
Considering the presence of nitrate storage in Globobulimina
[5] it is likely that nitrate
reduction occurs within the cell. Therefore, if Globobulimina
encodes a dissimilatory nitrate reductase, it has to be a yet uncharacterised
gene. Notably, we found homologs of the eukaryotic assimilatory nitrate
reductase (Nr) in the Globobulimina draft genome. This enzyme
has been shown to catalyse denitrification in the fungus Cylindrocarpon
tonkinense under specific conditions[26]. The phylogenetic reconstruction and assessment of
functional domains reveal that the Globobulimina Nr homologs
are more closely related to sulfite oxidases than to Nr; therefore, they are
better described as sulfite oxidases (Data S2H; Figure S3). Previous studies of sulfite oxidases
in animals showed that these enzymes can acquire nitrate reducing activity
following a replacement of only two amino acid residues (e.g., in human and
chicken[27]). Thus, it is possible
that the Globobulimina Nr homolog has an assimilatory nitrate
reductase function.To further validate the presence of denitrification enzymes in
foraminifera, we searched the public databases for homologs to the
Globobulimina enzymes. Foraminifera sequencing data is
scarce in public databases yet we could find several representatives (Table S2). Only one
highly fragmented draft genome of the distant species Reticulomyxa
filosa [28] was found in
databases, and no genomes or transcriptomes of species reported to denitrify
have been published. Homologs to NirK were found in the rotaliids
Brizalina sp. and Rosalina sp., while Nor
homologs were found in Brizalina only. Homologs of Nrt were
found in the intertidal rotaliids Elphidium margaritaceum and
Ammonia spp., the miliolid foraminifera
Sorites sp. as well as Rosalina sp.,
Nonionellina sp. and Bulimina marginata.
Most of those homologous sequences are short gene fragments; nonetheless, their
phylogenetic reconstruction shows a robust clustering with
Globobulimina (Data S2I-K).
Discussion
Our results demonstrate that the main denitrification enzymes are encoded in
the genome of Globobulimina, revealing a so far undescribed
eukaryotic denitrification pathway. The pathway origin is independent of known
eukaryotic enzymes and has a prokaryotic ancestry. Most of the eukaryotic genes of
prokaryotic ancestry were shown to have been acquired by endosymbiotic gene transfer
(EGT) from the mitochondrion ancestor at the origins of eukaryotes[29, 30].
This process is still ongoing[31], and the
mechanisms involved are beginning to be unravelled[32]. Eukaryotic gene acquisition from prokaryotic donors by lateral gene
transfer (LGT, as opposed to EGT) is frequently reported; however, these are often
hampered by signatures of bacterial contamination, and, therefore, their
interpretation is often complicated[33, 34]. We note that in our study we controlled
for possible prokaryotic contamination by analysing the genome and transcriptome of
Globobulimina in parallel. Furthermore, gene origin in our
pipeline was determined by genomic binning approach, genomic context and
phylogenetics.The phylogeny of Nrt supports an ancient origin of the nitrate metabolism in
foraminifera. The foraminiferal NirK phylogeny and the absence of eukaryotic
homologs to the Nor found in foraminifera further indicate that the origin of the
foraminiferal denitrification pathway is independent from the known fungal pathway.
The fungal NirK gene origin is likely an EGT[21], and indeed, it is clustering with other eukaryotic taxa in our
phylogenetic analysis as expected for genes of endosymbiotic origin (Figure 3C; Data S2B). The
Globobulimina NirK and Nor are deep branching in all
phylogenies, and the enzymes’ monophyly in foraminifera is supported by the
presence of homologs in other rotaliids (Data S2I-K). These indicate an ancient origin of
denitrification enzymes in foraminifera by LGT from a prokaryotic donor.Our findings demonstrate that denitrification is performed by foraminifera
rather than associated bacteria. Based on our results, we propose a denitrification
model for foraminifera that draws upon known denitrification properties of fungi
(Figure 5). Nitrate transport is a
necessary step preceding denitrification that can be facilitated by the Nrt. The Nrt
function must not be limited to transport into the cell, it can also be integrated
into NO3- storage vacuoles and NO3-
transport into the mitochondria. In certain foraminifera species (family
Bolivinidae), mitochondria were observed to cluster near the tests’ pores in
oxygen-depleted environments[35, 36]. Consequently, denitrification can be
performed inside the mitochondrion, as previously reported for fungi[37] and suggested for the foraminifera
Bolivina spissa[38]. We
hypothesise that enzymatic components of foraminifera localise similarly to their
fungal and bacterial homologs. Therefore, NirK can be localised in the mitochondrial
intermembrane space and Nor in the mitochondrial inner membrane[37, 39].
Our study is lacking evidence for foraminiferal genes performing the first and last
denitrification reactions. The Globobulimina sulfite oxidases
homologous to Nr (Figure
S3) could, hypothetically, catalyse the nitrate reduction step. However, it
is also possible that so far uncharacterised or unrecognised proteins are catalysing
the missing denitrification reactions. Considering a tight association of
denitrifying enzymes with the respiratory chain, the putative dissimilatory nitrate
reductase (dNr) and nitrous oxide reductase (Nos) enzymes are likely localised
inside the mitochondrion.
Figure 5
A model for denitrification in foraminifera.
Related to . A schematic representation of a foraminiferal cell
containing cytoplasm, a vacuole and a mitochondrion is shown. Arrows indicate
the suggested flow of nitrogen compounds in foraminiferal denitrification from
NO3- to N2. A dashed arrow indicates for
the diffusion of gaseous N2 to the outside of the cell. Hypothesised
enzymatic reactions for the first and the last step of denitrification are
highlighted by dark grey colour and surrounded by a dashed line. Putative
bacterial contribution is shown as well. Abbreviations: MT: mitochondrion; CP:
cytoplasm; Nrt: nitrate/nitrite transporter; dNr: dissimilatory nitrate
reductase; Nr: eukaryotic nitrate reductase; NirK: copper-containing nitrite
reductase; Nor: nitric oxide reductase; Nos: nitrous oxide reductase.
The proposed foraminiferal model suggests sharing of a common electron
transport chain between aerobic respiration and denitrification, permitting the use
of both electron acceptors in parallel without the need of assembling new protein
complexes as reported in the fungus Fusarium oxysporum[37]. We speculate that this ability lends
foraminifera a substantial ecological advantage when exposed to hypoxia or in
response to fluctuating oxygen levels, and explains their success in populating a
wide range of marine habitats.
Star★Methods
Contact for Reagent and Resource Sharing
Further information and requests for resources and reagents should be
directed to and will be fulfilled by the Lead Contact, Christian Woehle
(cwoehle@ifam.uni-kiel.de).
Experimental Model and Subject Details
Sites description and sampling
Living foraminifera (Globobulimina turgida and
Globobulimina auriculata) were sampled during two
consecutive expeditions (2014 & 2015) to the Gullmar Fjord, Sweden.
Sediment samples were obtained from the Alsbäck Deep
(58°19.38’N, 11°32.74’E) at a depth of ca. 117 m
using a 4-tube sediment interface corer (Mini Muc K/MT410). The top 3 cm of
the sediments were sampled and wet sieved directly upon return to the
station. Foraminifera were individually picked from the 125 to 2000
µm size fraction and cleaned in sterile artificial seawater (ASW)
with 30 psu and a nitrate concentration of 20 µmol l-1 (see below,
modified from Kester et al.[40]).
Batches of foraminifera were frozen in liquid nitrogen (ambient condition)
or incubated in culturing vessels filled with sterile ASW (see next
paragraph) and flash frozen after ca. 45 h (Figure S4). The
culturing vessels were incorporated to one of two culturing systems
assembled to reproduce different Alsbäck Deep environmental
conditions. One system was dedicated to natural oxygen concentration of the
Alsbäck Deep (≈125 μmol/l) during summer whereas the
second system was completely drawn down of oxygen (<10
μmol/l). Each system was filled with sterile ASW and sparged to
oxygen concentration of interests with N2, CO2 and
O2 pre-mixed gas (AGA Gas A; gas entries indicated by red
arrows) before and during the length of the experiments. Frozen samples were
stored at - 80 °C.The sterile ASW used for rinsing and incubations was based on the
natural salinity and nitrate concentration based on CTD and nutrients
measurements. The water was obtained by dissolving the following salts in
3/5 of the end volume. Here we present weights and molecular weight (MW) for
a 1 liter end volume ASW. A total of 23.38 g of NaCL (MW = 58.44 g/mol),
4.93 g MGSO4·7H2O (MW = 246.48 g/mol), 1.11 g
CaCl2·2H2O (MW = 147.02 g/mol), 0.2 g KBr
(MW = 119.01 g/mol), 0.75 g KCL (MW = 74.56 g/mol), 4.06 g
MgCL2·6H20 (MW = 203.3 g/mol), and 1 ml
H3BO3 (of 61.83 g/mol) were added to
double-distilled-water. After the addition of the previous salts 0.023 ml of
NaNO3 was added, the pH adjusted to 8 and the media
autoclaved. It was important to dissolve each salt individually before the
addition of a new one to decrease chance of salt precipitation. The
nutrients necessary to produce the adequate ASW were autoclave separately in
10 ml glass vials. In our case 1 ml of
NaH2PO4·H2O, 1 ml
Na2SiO3·9H2O, and 1 ml of trace
metal mix (see below) containing 100 μl selenium
(H2SeO3, MW= 128,97) were added to the medium
after sterilisation. The carbonate system was also added after sterilisation
of the medium by sterile filtering (0.22 um) 0.17 g of NaHCO3
dissolved in 30 ml double-distilled-water and 500 μl of the vitamins
directly to the medium.The trace metal mix used for the ASW consists of 11,65 μM
FeCl3·6H2O (MW = 270.3 g/mol), 11.71
μM of Na2EDTA·2H2O (MW=372.24 g/mol),
0.039 μM of CuSO4·5H2O (MW = 249.7
g/mol), 0.026 μM of Na2MoO4·2H2O (MW =
241.9 g/mol), 0.077 μM of ZnSO4·7H2O (MW
= 287.54 g/mol), 0.042 μM of CoCl2·6H2O
(MW = 237.93 g/mol), and 0.91 μM of
MnCl2·4H2O (MW = 197.9 g/mol). Vitamins
were made from 0.002 μM of Biotin (MW = 244.32 g/mol), 0.0004
μM of B12 (MW=1355.38 g/mol) and 0.30 μM of Thiamine-HCL (MW =
337.28 g/mol). The vitamins mix needed to be sterile filtered and kept in
the dark.
Method Details
Nucleic acid isolation and sequencing
Foraminifera lysis and disruption (including lysozyme buffer) were done
via immersion in liquid nitrogen followed by pestle-crashing. Genomic DNA and
total RNA were simultaneously isolated using the AllPrep DNA/RNA Micro Kit
(Qiagen) or innuPREP DNA/RNA Mini Kit (AnalytikJena). Transcriptome libraries of
two consecutives years (2 independent biological replicates) were synthesised
from cDNA and enrichment for eukaryotic transcripts. Libraries were prepared
with TotalScript™ RNA-Seq kit (Epicentre®) with Oligo (dT) primers
or NEBNext® Ultra™ RNA Library Prep Kit for Illumina® (NEB)
with mRNA isolation performed with poly-A mRNA beads. Transcriptomics paired end
libraries (2x 100 bp and 2x 125 bp) were sequenced on an Illumina HiSeq 2000
platform. Genomic libraries were prepared with the NEBNext® Ultra™
II DNA Library Prep Kit for Illumina® and were directly used for whole
genome shotgun sequencing on an Illumina HiSeq 2000 platform (2x 100 bp). All
samples were quantified and qualified using a Qubit® fluorometer
(Invitrogen by Life Technologies™) and a Bioanalyzer & TapeStation
(Agilent technology). Sequence data were deposited in NCBI (BioSamples
SAMN07821823, SAMN07821824).
Validation of gene sequences
The gene sequence of the genes encoding NirK and Nor was confirmed by
amplifying exon fragments using RT-PCR with cDNA synthesised with Oligo (dT)
primers (NEB) as template. The reactions were performed with iTaq Universal
SYBR® on the Bio-Rad CFX connect Real-Time System (Bio-Rad Laboratories).
The reaction was composed of 10 μl iTaq Universal SYBR® master mix
(Bio-Rad), 0.5 μl of 10 μM forward and reverse primers each (Table S3), 8 μl
PCR-H2O and 1 μl of poly-A cDNA template. RT-PCR cycling
conditions were 3 min at 95 °C (once), 39 cycles of 0:10 min at 95
°C, 0:30 min at 60 °C, followed by a melt curve from 58 °C
to 98 °C (increment 0.2 °C 0:50). Duplicates of one biological
sample per gene and no template control (NTC) were run for each primer pair.
Taxonomy
Six foraminifera-specific hypervariability 18S rRNA gene loci were
amplified as described by Pawlowski and Lecroq 2010[41] and used to separate closely related species. The
foraminiferal 18S rRNA gene sequence was amplified from 25–44 individuals
using Phusion polymerase (NEB) and the primers 14F1 and B (Table S3). This yielded a
fragment of about 1200 bp for the Globobulimina species. The
PCR products were evaluated by electrophoresis, and positive samples were
purified (GeneJet Gel extraction and DNA Cleanup Micro Kit; Thermo Scientific)
for Sanger sequencing. Duplicates are shown in the 18S phylogenetic tree (Figure 1E). Additional 18S rRNA gene homologs
were detected in NCBI public databases and the genomes and transcriptomes
presented in the current study by BLAST[42]. All 18S rRNA gene sequences were aligned using MAFFT[43] 7.123b (‘linsi’ option),
and a phylogeny was reconstructed using PhyML[44] ver. 20131022 (‘-b 100 -m HKY85’; Figure 1E).
Foraminiferal denitrification rate
The foraminiferal denitrification rates were calculated from linear
steady state gradients of N2O in microtubes with 3–5
individuals (at least quadruplicates measurements per species). Nitrous oxide
reductase was inhibited with acetylene, thereby altering the complete
denitrification by making N2O the final product[45]. The steady state diffusion fluxes in the tubes
corresponding to the respiration rates were calculated by Fick's first
law of diffusion: Eq1: J = -D.dC/dx where J = flux; dC/dx = the measured
concentration gradient; D = free diffusion coefficient of N2O. All
measurements were performed in a cooling room at a constant temperature of 9
°C. However, since the temperature could not be continuously recorded
inside the microchamber medium, sporadic temperature measurements were taken
within a water bath next to the incubation chamber (average temperature 9.2
± 0.1 °C; N = 19). The production of N2O was measured
with a N2O microsensor[46]
(Table S1),
following methods previously established for foraminifera[5, 7, 11, 22]. N2O production microprofiles of ambient foraminifera
were performed with freshly picked foraminifera directly after wet sieving and
rinsing twice with nitrate free ASW (red sea salt) prior to the transfer to a
microchamber.
Microscopic visualizations of tests and living individuals of
Globobulimina spp
Specimens of the two Globobulimina species were removed
from sediment samples, washed with clean seawater, dehydrated in a graded
ethanol series (70 %, 80 %, 90 %, 96 % and two times 100 %; 15 min each),
air-dried for 12 h in a desiccator and mounted on aluminium stubs (PLANO GmbH)
using conductive and adhesive carbon pads (PLANO GmbH). Subsequently, the
preparations were sputter-coated with a 10-nm-thick gold-palladium (80/20) layer
using a high vacuum sputter coater Leica EM SCD500 (Leica Microsystems GmbH) and
visualized with a Hitachi S-4800 field emission scanning electron microscope
(Hitachi High-Technologies Corporation) at an acceleration voltage of 3 kV and
an emission current of 10 mA applying a combination of the upper detector and
the lower detector.
Processing of sequencing data
Sequencing resulted in total in 1.3 billion paired-end reads. This
includes transcriptome datasets of two consecutive years (SRA accessions:
SRR6202056, SRR6202059–SRR6202078) as well as genome data of
Globobulimina (SRA accessions:
SRR6202052–SRR6202055,
SRR6202057–SRR6202058). Reads were quality-checked by FastQC
ver. 0.11.5 (http://www.bioinformatics.babraham.ac.uk/projects/fastqc; Aug
2016). Filtering and trimming of the reads was performed using Trimmomatic[47] ver. 0.36 (Parameters:
ILLUMINACLIP:primers.fa:2:30:10 LEADING:5 TRAILING:5 SLIDINGWINDOW:4:5
MINLEN:21; the file ‘primers.fa’ contains adaptor and contaminant
sequences provided by Trimmomatic and FastQC). Processed paired-end reads of the
transcriptome samples were assembled into contigs using SPAdes[48] ver. 3.9.1 (‘--rna’
option), which yielded 161,222 (termed ‘GloT14’, NCBI accession:
GGCE00000000) and 906,588 (termed ‘GloT15’, NCBI accession:
GGCD00000000) transcripts. The contigs shorter than 200 nucleotides and
contamination identified by NCBI Transcriptome Shotgun Assembly (TSA) submission
pipeline were excluded. Protein sequences were translated from transcripts as
the longest open reading frame (ORF) using TransDecoder[49] ver. 3.0.1 (‘-m 30’ option). Transcript
abundances of individual transcriptome datasets refer to Transcripts Per Million
(TPM) determined by the Trinity pipeline[49] 2.4.0 (Trinity script
‘align_and_estimate_abundance.pl’) via RSEM[50] ver. 1.2.30 and Bowtie[51] 2.1.0 using paired-end reads. Paired-end reads of
Globobulimina spp. genome datasets were assembled using
IDBA-UD[52] ver. 1.1.1 (parameters:
--pre_correction --mink 20 --maxk 120). Resulting contig sequences (termed
‘GloG15’) were classified into 241 genomic bins by MaxBin[53] ver. 2.2 (parameter: -min_contig_length
500) providing reads of individual sequencing samples separately. We assessed
the quality of prokaryotic genomic bins, their completeness, coverage and first
prediction of protein sequences on binned contigs with checkM[54] ver. 1.0.5. Thresholds of completeness
≥ 80 % and contamination ≤ 15 % were applied to classify bacterial
bins as draft genomes (26 in total; Figure S2). The protein sequences obtained by checkM were
combined with those predictions for unclassified contigs using
MetaProdigal[55] ver. 2.6.2, and
additional protein sequences were predicted based on genome mapping of
transcriptome paired-end reads using HISAT2[56] ver. 2.0.5 (‘--non-deterministic’ option) and
BRAKER[57] ver. 1.9. Similar
sequences were clustered with CD-HIT[58]
ver. 4.6 (‘-c 0.98’ option) in order to reduce redundancy of the
complete protein catalogue. The final protein names consist of the contig IDs
followed by the sequence positions covered by the CDS and an indicator for the
forwards (+) or reverse (-) strand. In case of multiple exons, individual
regions are separated by comma. For a high resolution taxonomic classification
of prokaryotic genomic bins, DIAMOND[59]
tool 0.7.11.60 (‘-k 10’ option) similarity searches of proteins
predicted via checkM were applied. The first best hit (by score) per protein
sequence to the NCBI non-redundant database (NR; January 2017;
e-value ≤ 1e-10) was obtained, and the corresponding
NCBI taxonomy assignment was further used. Identical protein taxonomy
assignments per genomic bin were counted and sorted accordingly. Beginning with
the most abundant taxonomy, the lowest taxonomic rank was searched that was
supported by > 50 % of bin protein hits and accepted as taxonomic
classification of the corresponding draft genome. Taxonomic assignments
containing ‘environmental samples’ and the rank ‘Cellular
organisms’ were not considered. Contigs that were not classified as
bacterial or Globobulimina draft genome represent the
unclassified metagenome associated with Globobulimina (NCBI
accession: PJEL00000000). Six genomic bins (No. 179, 190–194) were
classified as the Globobulimina draft genome (NCBI accession:
PIVH00000000). All of them were annotated as eukaryotic, based on the previously
described approach. Additionally, each of these bins covers more than 5 % of
transcriptome read pairs mapped to binned contigs, and their coverage over
different samples is highly correlated (Pairwise Pearson correlations ≥
0.99, FDR adjusted p-values < 10-7). Overall, 93.3 % of
transcriptome read pairs mapped to the Globobulimina draft
genome. The corresponding assembly consists of 48,370 contigs with a GC-content
of 34.25 % and a N50 of 1,797 nucleotides. The assessment of genome completeness
by Benchmarking Universal Single-Copy Orthologs[60] (BUSCO v3; lineage ‘eukaryota’) method was done
using all 132,080 protein predictions for Globobulimina genome
contigs and recovered 63.7 % of BUSCOs completely, which shows that more than
half of the genome is covered. We analysed pooled samples of two species of the
genus Globobulimina (G. turgida &
G. auriculata) and therefore expected our sequencing
results to contain sequences derived from both genomes. As BUSCOs represent
universal single-copy genes, the presence of multiple BUSCOs of the same type
indicates multiple organisms in the analysis (i.e., species heterogeneity within
the dataset). In the described BUSCO analysis, 44.1 % of recovered BUSCOs were
inferred as multi-copy orthologs. However, this still includes duplicated
predictions for the same gene locus that resulted by differing predictions based
on prodigal or BRAKER. Focussing only on the 27,238 protein sequences predicted
via BRAKER reduced the species heterogeneity such that only 7.12 % of complete
BUSCOs were found multiple times. From this, we conclude that the
Globobulimina draft genome is enriched for a single
species, either G. turgida or G. auriculata.
The BUSCO analysis of the transcriptome assemblies revealed a higher level of
heterogeneity where 37.5 % (GloT14) or 86 % (GloT15) of the complete orthologs
represented multiple copies.
Gene identification and phylogenies
To identify homologs of enzymes in the denitrification pathway, we
collected query protein sequences of known enzymes from Swiss-Prot[61] and the literature (Table S4). The search for
denitrification enzymes homologs in our (and external) datasets was performed
with BLASTP[42] applying an
e-value ≤ 1e-5 threshold. Protein sequences of hits
with query coverage ≥ 40 % and sequence identity ≥ 20 % were
extracted to obtain a first set of homologs. With this preliminary set, we
reiterated the search in NR and RefSeq 79 database[62] using GHOSTZ[63]
1.0.0 applying a threshold of e-value ≤ 1e-5. Hit
sequences were obtained and clustered with CD-HIT 4.6 (option: ‘-c
0.98’) to reduce sequence redundancy. Protein sequences were aligned with
MAFFT (‘linsi’ option), and phylogenetic trees were reconstructed
with FastTree[64] 2.1.7. Putative
paralogs were excluded as following: i) the trees were rooted using outgroups
(See Table S4), if
available, and/or the Minimal Ancestor Deviation (MAD) method[65]; ii) the resulting root separated the
tree into two main clades. Only one of these clades, containing orthologs, was
retained, which was defined by sequence annotation and sequence representation
of proteins from literature (Table S4). Further sequences that exhibit long branches or increase
redundancy of the phylogenetic information were as well excluded. For a complete
list of homologous sequences that were excluded from the final phylogenies see
Table S5. The
refined sequence set was used for the reconstruction of phylogenetic trees using
PhyML (‘-m LG -b 100’ options). The resulting trees were rooted
using outgroups or MAD. Clades of protein sequences encoded by the
Globobulimina spp. were defined by representation in the
Globobulimina genome data and transcriptome assemblies.
Statistical support of alternative topologies was estimated using CONSEL[66] ver. 1.20. Homologs for comparison of
functional domains and conserved amino acid residues of copper binding and
catalytic sites were chosen based on available literature[25, 67, 68] (Table S4). For each
enzyme, sequences were aligned using MAFFT (‘linsi’ option). The
protein domains (shown in Data
S1 & Figure S3) were determined using the CD-Search webserver
at NCBI[69] (Aug–Nov
2017), and regions spanned by hits to specific protein domains were extracted.
Pairwise local sequence identities were obtained using ‘water’
tool in the EMBOSS[70] package.In the search for homologs of Globobulimina NirK, Nor
and Nrt in further foraminifera species and related taxa,
Globobulimina protein sequences were used as queries for
BLAST (with a threshold of e-value ≤ 1e-5). Datasets in
use include NCBI genome data (Reticulomyxa filosa,
GCA_000512085.1; Astrammina rara, GCA_000211355.2), NCBI
transcriptomic reads (Nonionellina sp., SRR2003403;
Bulimina marginata, SRR2003397; Brizalina
sp., SRR2003388; Ammonia sp., SRR2003283) and MMETSP[71] transcriptomes
(Rosalina sp., MMETSP0190; Sorites sp.,
MMETSP0191; Ammonia sp., MMETSP1384; Elphidium
margaritaceum, MMETSP1385). Nrt homologs with high sequence
identity (≥ 50 %) were identified in protein sequences of MMETSP datasets
and in sequencing reads of NCBI transcriptomes. Protein sequences of hits were
used without modification, and the obtained sequencing reads were assembled into
contigs using CAP3[72]. Final Nrt protein
sequences were determined by predicting longest ORF using TransDecoder on
contigs and remaining unassembled reads. Hits for Globobulimina
NirK and Nor with high sequence identity (≥ 50 %) in the same dataset
were only observed for Brizalina sp. sequencing reads. Further
24 reads of Rosalina were covered by hits to NirK using the
same cut-off. Read sequences of individual hits to NirK and Nor were obtained
and individually assembled using CAP3. Rosalina sp. associated
reads were short, and the parameters recommended for assembly of short sequences
(‘-i 30 -j 31 -o 18 -s 300’ options) were applied. Resulting
contigs were translated into the reading frame that exhibited most sequence
similarity to corresponding proteins. The sequence similarity of
Brizalina sp. sequences to NirK was limited to a short
region on one read (accession: SRR2003388.68811). The homologous region was
extracted and translated. If represented, stop codons were replaced by
‘X’s. For protein sequences of gene fragments obtained see Table S2. Predicted
protein sequences that did not show sequence similarity to corresponding query
protein sequences were discarded. Protein sequences obtained in this way were
added to the alignments of the denitrification proteins described before (MAFFT;
‘--addfragments’ option) for subsequent phylogenetic
reconstruction via PhyML (Data
S2I).
Quantification and Statistical Analysis
The standard deviation (σsd), standard of the mean
(σSEM) and the number of individuals (N) for the
denitrification measurements are presented in Table S1. An Approximate
Unbiased (AU) test was used to compare optimal phylogenetic trees to alternative
topologies without application of further testing procedure. Alternative topologies
were inferred by constraining individual clades for tree reconstruction and their
tree likelihoods were compared to the unconstrained phylogeny using AU-test via
CONSEL (See Figure 2, eukaryotic monophyly of
clade I: p-value 0.352; See Figure
3, Grouping of the Globobulimina clade with other
eukaryotes: p-value 0.043, The Globobulimina clade
groups independently from NirK class II clade: p-value 0.373,
Grouping of the Globobulimina clade with a eukaryote-containing
subclade of NirK class II: p-value 0.581). An α of 0.05 was
applied to determine a significant difference of the alternative topology in
contrast to the optimal tree.
Data and Software Availability
Data availability
Sequencing reads are available from the single read archive (SRA)
accessions SRR6202052 to SRR6202078. Transcriptome assemblies were deposited at
the transcriptome sequencing archive (TSA) accessions GGCE00000000 and
GGCD00000000. The genome sequencing assembly is available at NCBI with the
accessions PIVH00000000 to PIWH00000000 representing draft genomes and
PJEL00000000 the unassigned contigs. Individually amplified 18S rRNA gene
sequences of G. turgida and G. auriculata were
submitted to GenBank (accessions: MG800664 to MG800667). All other information
on accessing data analysed in this study is included in the manuscript or in the
supplemental information. All additional datasets generated during the current
study are available from the corresponding authors upon request.
Authors: Gernot Glöckner; Norbert Hülsmann; Michael Schleicher; Angelika A Noegel; Ludwig Eichinger; Christoph Gallinger; Jan Pawlowski; Roberto Sierra; Ursula Euteneuer; Loic Pillet; Ahmed Moustafa; Matthias Platzer; Marco Groth; Karol Szafranski; Manfred Schliwa Journal: Curr Biol Date: 2013-12-12 Impact factor: 10.834
Authors: S F Altschul; T L Madden; A A Schäffer; J Zhang; Z Zhang; W Miller; D J Lipman Journal: Nucleic Acids Res Date: 1997-09-01 Impact factor: 16.971
Authors: Katharina F Ettwig; Daan R Speth; Joachim Reimann; Ming L Wu; Mike S M Jetten; Jan T Keltjens Journal: Front Microbiol Date: 2012-08-07 Impact factor: 5.640
Authors: Patrick J Keeling; Fabien Burki; Heather M Wilcox; Bassem Allam; Eric E Allen; Linda A Amaral-Zettler; E Virginia Armbrust; John M Archibald; Arvind K Bharti; Callum J Bell; Bank Beszteri; Kay D Bidle; Connor T Cameron; Lisa Campbell; David A Caron; Rose Ann Cattolico; Jackie L Collier; Kathryn Coyne; Simon K Davy; Phillipe Deschamps; Sonya T Dyhrman; Bente Edvardsen; Ruth D Gates; Christopher J Gobler; Spencer J Greenwood; Stephanie M Guida; Jennifer L Jacobi; Kjetill S Jakobsen; Erick R James; Bethany Jenkins; Uwe John; Matthew D Johnson; Andrew R Juhl; Anja Kamp; Laura A Katz; Ronald Kiene; Alexander Kudryavtsev; Brian S Leander; Senjie Lin; Connie Lovejoy; Denis Lynn; Adrian Marchetti; George McManus; Aurora M Nedelcu; Susanne Menden-Deuer; Cristina Miceli; Thomas Mock; Marina Montresor; Mary Ann Moran; Shauna Murray; Govind Nadathur; Satoshi Nagai; Peter B Ngam; Brian Palenik; Jan Pawlowski; Giulio Petroni; Gwenael Piganeau; Matthew C Posewitz; Karin Rengefors; Giovanna Romano; Mary E Rumpho; Tatiana Rynearson; Kelly B Schilling; Declan C Schroeder; Alastair G B Simpson; Claudio H Slamovits; David R Smith; G Jason Smith; Sarah R Smith; Heidi M Sosik; Peter Stief; Edward Theriot; Scott N Twary; Pooja E Umale; Daniel Vaulot; Boris Wawrik; Glen L Wheeler; William H Wilson; Yan Xu; Adriana Zingone; Alexandra Z Worden Journal: PLoS Biol Date: 2014-06-24 Impact factor: 8.029
Authors: Nicolaas Glock; Alexandra-Sophie Roy; Dennis Romero; Tanita Wein; Julia Weissenbach; Niels Peter Revsbech; Signe Høgslund; David Clemens; Stefan Sommer; Tal Dagan Journal: Proc Natl Acad Sci U S A Date: 2019-02-06 Impact factor: 11.205
Authors: Christian Woehle; Alexandra-Sophie Roy; Nicolaas Glock; Jan Michels; Tanita Wein; Julia Weissenbach; Dennis Romero; Claas Hiebenthal; Stanislav N Gorb; Joachim Schönfeld; Tal Dagan Journal: Proc Natl Acad Sci U S A Date: 2022-06-15 Impact factor: 12.779
Authors: William D Orsi; Raphaël Morard; Aurele Vuillemin; Michael Eitel; Gert Wörheide; Jana Milucka; Michal Kucera Journal: ISME J Date: 2020-07-08 Impact factor: 10.302
Authors: Clare Bird; Charlotte LeKieffre; Thierry Jauffrais; Anders Meibom; Emmanuelle Geslin; Helena L Filipsson; Olivier Maire; Ann D Russell; Jennifer S Fehrenbacher Journal: Front Microbiol Date: 2020-12-03 Impact factor: 5.640