A comprehensive phylogenetic analysis of deadenylases.

Athanasia Pavlopoulou¹, Dimitrios Vlachakis¹, Nikolaos A A Balatsos², Sophia Kossida¹.

Abstract

Deadenylases catalyze the shortening of the poly(A) tail at the messenger ribonucleic acid (mRNA) 3'-end in eukaryotes. Therefore, these enzymes influence mRNA decay, and constitute a major emerging group of promising anti-cancer pharmacological targets. Herein, we conducted full phylogenetic analyses of the deadenylase homologs in all available genomes in an effort to investigate evolutionary relationships between the deadenylase families and to identify invariant residues, which probably play key roles in the function of deadenylation across species. Our study includes both major deadenylase superfamilies, Asp-Glu-Asp-Asp (DEDD) and exonuclease-endonuclease-phosphatase (EEP). The phylogenetic analysis has provided us with important information regarding conserved and invariant deadenylase amino acids across species. Knowledge of the phylogenetic properties and evolution of the deadenylase domain provides a foundation for targeted drug design in the pharmaceutical industry and for modern exonuclease-focused anti-cancer research.

Keywords: deadenylases; evolution; molecular modelling; phylogenesis

Year: 2013 PMID: 24348009 PMCID: PMC3859875 DOI: 10.4137/EBO.S12746

Source DB: PubMed Journal: Evol Bioinform Online ISSN: 1176-9343 Impact factor: 1.625

Introduction

Shortening of the polyadenylated (poly(A)) tail at the mRNA 3′-end, referred to as deadenylation, is a key step in mRNA decay in eukaryotes.1,2 This process is catalyzed by the deadenylase enzymes. Poly(A) tails are the preferred substrates of deadenylases, although in some instances they are capable of degrading non-adenosine ribopolymers in vitro with reduced efficiency.3–6 According to Goldstrohm and Wickens,7 the known deadenylases are classified into 2 superfamilies, DEDD and exonuclease-endonuclease-phosphatase (EEP), which are defined by conserved exonuclease sequence motifs required for catalysis. Members of the EEP superfamily of deadenylases use a conserved glutamic acid (E) and a histidine (H) for catalysis.7,8 This superfamily includes the families carbon catabolite repressor 4 (CCR4), Nocturnin, ANGEL and 2′ phosphodiesterase (2′PDE).7 The DEDD superfamily of deadenylases is named after the invariant catalytic acidic residues aspartic acid (D) and glutamic acid (E), which are distributed in 3 exonuclease motifs.7,9 The DEDD superfamily includes the families POP2, Poly(A)-specific ribonuclease (PARN), CAF1Z and PAB-dependent poly(A)-specific ribonuclease subunit 2 (PAN2).7,9 In the present study, we focus on the molecular evolution of these families, thus providing insights into the amino acid conservation patterns that may subsequently be used for further study of deadenylases as promising, emerging anti-cancer pharmacological targets.

Methods

Identification of deadenylase homologues

To identify homologous deadenylase protein sequences, the accession numbers of the characterized deadenylases reported in the literature7 were used to retrieve their corresponding amino acid sequences from the publicly available databases UniProtKB10 and GenBank.11 These sequences were subsequently used as probes to search the sequence databases by applying reciprocal BLASTp and tBLASTn.12 This process was repeated until no new putative deadenylase homologues could be found.
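The iterative search described above amounts to growing a set of sequences until a round of searching adds nothing new. A minimal Python sketch of that loop follows, under stated assumptions: `search_hits` is a hypothetical callable standing in for one BLASTp or tBLASTn run (it is not a real BLAST interface), the identifiers in the toy hit table are invented, and the reciprocal check is reduced to requiring that a hit recovers its query when used as a probe.

```python
from typing import Callable, Iterable, Set


def expand_homologs(seeds: Iterable[str],
                    search_hits: Callable[[str], Set[str]]) -> Set[str]:
    """Grow a homolog set until a search round adds nothing new.

    `search_hits` is a hypothetical stand-in for one database search;
    it returns the identifiers of the significant hits for a query.
    """
    found = set(seeds)
    frontier = set(seeds)
    while frontier:
        new_hits = set()
        for query in frontier:
            for hit in search_hits(query):
                # Reciprocal check: keep the hit only if searching with
                # the hit recovers the original query.
                if hit not in found and query in search_hits(hit):
                    new_hits.add(hit)
        found |= new_hits
        # Only the newly added sequences are used as probes next round.
        frontier = new_hits
    return found


# Toy hit table (invented identifiers) standing in for database searches.
TOY_HITS = {
    "seedA": {"hom1", "noise"},
    "hom1": {"seedA", "hom2"},
    "hom2": {"hom1"},
    "noise": set(),
}

homologs = expand_homologs(["seedA"], lambda q: TOY_HITS.get(q, set()))
print(sorted(homologs))
```

In this toy run, `noise` is rejected because it does not recover `seedA`, while `hom2` is only reached in the second round through `hom1`.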

Motifs construction

Representative DEDD and EEP peptide sequences were aligned and edited with Utopia suite’s CINEMA alignment editor.13 Sequence motifs were collected from the alignments, manually edited for insertions or gaps, and submitted to WebLogo314 to generate consensus sequences.
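To illustrate what a consensus derived from an ungapped motif alignment looks like, the sketch below computes, for each column, the most frequent residue, its frequency, and an information content in bits. It is an illustration only, not the procedure used by the logo software: the four-sequence motif is invented, and the information content is taken as log2(20) minus the Shannon entropy of the column, without the small-sample correction that logo tools typically apply.

```python
import math
from collections import Counter
from typing import List, Tuple


def column_profile(motif_seqs: List[str]) -> List[Tuple[str, float, float]]:
    """Return (consensus residue, its frequency, information in bits) per column.

    Information is log2(20) minus the Shannon entropy of the column;
    no small-sample correction is applied in this sketch.
    """
    length = len(motif_seqs[0])
    if any(len(s) != length for s in motif_seqs):
        raise ValueError("motif sequences must have equal length")
    profile = []
    for col in zip(*motif_seqs):
        counts = Counter(col)
        total = len(col)
        entropy = -sum((n / total) * math.log2(n / total)
                       for n in counts.values())
        residue, n_top = counts.most_common(1)[0]
        profile.append((residue, n_top / total, math.log2(20) - entropy))
    return profile


# Invented four-sequence motif, for illustration only.
toy_motif = ["DFEF", "DMEF", "DFEY", "DLEF"]
profile = column_profile(toy_motif)
consensus = "".join(residue for residue, _, _ in profile)
print(consensus)
for residue, freq, bits in profile:
    print(residue, round(freq, 2), round(bits, 2))
```

Invariant columns reach the maximum of log2(20) bits, which is why fully conserved catalytic residues appear as the tallest stacks in a logo.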

Phylogenetic analyses

The deadenylase sequences under study were searched against InterPro15 in order to identify the boundaries of the core nuclease domain. In order to optimize the alignment and avoid unreasonable gap penalties, the amino acid sequences that correspond to the nuclease domain were extracted from the full-length deadenylase peptide sequences and aligned using CLUSTALW.16 The resulting multiple sequence alignment was first trimmed for gaps using Gblocks17,18 and then manually edited. The trimmed alignment was used to reconstruct phylogenetic trees by employing 2 different methods. The first one is the maximum-likelihood method implemented in PhyML,19 where an initial distance-based tree (BIONJ) is optimized using a hill-climbing algorithm. In our study, the nearest-neighbor-interchange (NNI) heuristic was used with 4 substitution-rate categories; the proportion of invariable sites and the gamma shape parameter were estimated from the data. The number of amino acid substitutions per site was estimated with the LG20 model. The second one is the Neighbor-net method21 implemented in SplitsTree4,22 a distance-based method which detects conflicting phylogenetic signals, presented in the form of reticulations; the Uncorrected P substitution model was used. Bootstrap analyses (1000 pseudo-replicates) were conducted in order to assess the robustness of the reconstructed trees. The inferred phylogenetic trees were visualized with the program Dendroscope.23
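Two ingredients of this workflow are simple enough to sketch directly: the uncorrected p-distance between two aligned sequences, and the column resampling behind one bootstrap pseudo-replicate. The Python sketch below is illustrative only and does not reproduce PhyML or SplitsTree4; the toy alignment is invented, the function names are hypothetical, and skipping gapped columns in the distance is an assumption of the sketch. Tree inference itself is not shown.

```python
import random
from typing import Dict, Tuple


def p_distance(a: str, b: str) -> float:
    """Uncorrected p-distance: fraction of compared columns that differ.

    Columns with a gap ('-') in either sequence are skipped; this gap
    handling is an assumption of the sketch.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable columns")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(aln: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """Pairwise p-distances, keyed by alphabetically ordered name pairs."""
    names = sorted(aln)
    return {(p, q): p_distance(aln[p], aln[q])
            for i, p in enumerate(names) for q in names[i + 1:]}


def bootstrap_replicate(aln: Dict[str, str],
                        rng: random.Random) -> Dict[str, str]:
    """Resample alignment columns with replacement (one pseudo-replicate)."""
    length = len(next(iter(aln.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in cols) for name, seq in aln.items()}


# Invented toy alignment, for illustration only.
toy = {"sp1": "MKDLEA", "sp2": "MKDIEA", "sp3": "MRDIQA"}
dist = distance_matrix(toy)
print(dist)

rng = random.Random(0)  # fixed seed so the resampling is repeatable
replicates = [bootstrap_replicate(toy, rng) for _ in range(1000)]
print(len(replicates), len(replicates[0]["sp1"]))
```

Each pseudo-replicate has the same number of columns as the input alignment; a tree would be rebuilt from every replicate, and the support for a branch is the percentage of replicate trees in which it appears.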

Evolutionary rate shift analysis

A maximum-likelihood method24 was employed for the identification of evolutionary rate differences at specific protein sites in the DEDD families. To this end, a set of 19 protein sequences from the four DEDD families was analyzed in order to identify amino acid positions with significant rate differences among the DEDD families, as described in Knudsen and Miyamoto (2001).24 The alignment was based on the core nuclease domain, and it was carried out using CLUSTALW.16

Results/Discussion

Phylogenetic analyses of deadenylases

In the present study, we performed comprehensive and updated phylogenetic analyses of the deadenylase homologs in all available genomes (Figs. 1, 2, S1 and S2, Table S1). Collectively, 114 DEDD and 97 EEP homologous protein sequences were identified in the genomes of 38 and 37 species, respectively, which represent major eukaryotic taxonomic divisions (according to the NCBI taxonomy database; Table S1).25

Figure 1

Phylogenetic tree of DEDD deadenylases. Bootstrap values (>50%) are shown at the nodes. The length of the tree branches reflects evolutionary distance. The scale bar at the upper left represents the length of amino acid substitutions per position. To minimize confusion, we used the protein names as described in Goldstrohm and Wickens;7 the UniProt 5-letter codes were used for the species names. The proteins derived from metazoa are shown in red, from viridiplantae in green, from fungi in orange and from protozoa in yellow.

Figure 2

Phylogenetic tree of EEP deadenylases. Bootstrap values above 50% are shown at the nodes. The length of the tree branches depicts evolutionary distance. The scale bar at the upper left represents the length of amino acid substitutions per site. To minimize confusion, we used the protein names as described in Goldstrohm and Wickens;7 the UniProt 5-letter codes were used for the species names. The proteins derived from metazoa are shown in red, from viridiplantae in green, from fungi in orange and from protozoa in yellow.

In order to better resolve the evolutionary relationships between the deadenylase families, we applied 2 different methods for phylogenetic tree reconstruction. The phylogenetic trees reconstructed with both methods are congruent, since the overall topology is similar, and all main branches are supported by high bootstrap values (Figs. 1, 2, S1 and S2). Eight coherent, well-supported monophyletic branches are distinguished, corresponding to the 4 families of the DEDD superfamily (Figs. 1 and S1) and the 4 families that comprise the EEP superfamily (Figs. 2 and S2). Based on our analysis, putative members of the families POP2, PARN, PAN2 and CCR4 were identified in the major eukaryotic taxonomic divisions, ranging from metazoa to protozoa. POP2 appears to be the largest family, with a wide distribution among taxa (Figs. 1 and S1). Based on the phylogenetic analyses (Figs. 1, 2, S1 and S2), the deadenylase families POP2, PARN and CCR4 appear to have undergone gene duplications in metazoa, giving rise to the metazoan-specific subfamilies CNOT8, PARNL and CNOT6L, respectively. In the POP2 and CCR4 families, the gene duplications appear to have occurred after the emergence of teleosts (bony fishes) (Figs. 1, S1, 2 and S2), since teleost (DANRE) homologs were detected in the corresponding subfamilies CNOT8 and CNOT6L. In PARN, a duplication event has presumably followed the radiation of arthropods (Figs. 1 and S1), as arthropod (SOLIN and TRICA) homologs were identified in the PARNL subfamily. However, neither frog (XENTR) nor fish (DANRE) PARNL homologs were identified; we suggest that frog and fish PARNL genes might have existed but were probably deleted in the course of evolution. Notably, PARN homologs were not detected in the fungus Saccharomyces cerevisiae (YEAST) and the arthropod Drosophila melanogaster (DROME), whereas putative PARN homologs were detected in other fungi (BATDJ and SCHPO) and arthropods (DAPPU, ANOGA and SOLIN) (Figs. 1 and S1).
This leads to the suggestion that alternative metabolic pathways might exist in yeast and Drosophila that might compensate for PARN’s function. Furthermore, a series of species-specific gene duplications in the green plant Arabidopsis thaliana (ARATH) yielded 12 POP2 and 6 CCR4 paralogs (Figs. 1, 2, S1 and S2). However, the deadenylase families CCR4-associated factor 1Z (CAF1Z), ANGEL, Nocturnin and 2′PDE are restricted to certain eukaryotic taxa (Figs. 1, S1, 2 and S2). CAF1Z is restricted to metazoa and protozoa (Figs. 1 and S1). Moreover, a putative CAF1Z homolog was detected in the chytrid fungus Batrachochytrium dendrobatidis (BATDJ), which infects frogs causing chytridiomycosis26,27 (Figs. 1 and S1), leading to the suggestion that this fungal parasite has presumably acquired CAF1Z from its amphibian host by horizontal gene transfer. Furthermore, members of the Nocturnin and 2′PDE families were detected in the main eukaryotic taxa, except fungi (Figs. 2 and S2). We suggest that either alternative metabolic pathways may exist in fungi for poly(A) degradation, or differences in the lifestyle or physiology of fungi led to the loss of the Nocturnin and 2′PDE families. The ANGEL family is restricted to opisthokonts (metazoa and fungi; Figs. 2 and S2). Based on the reconstructed phylogeny, the ANGEL1/ANGEL2 duplication should have occurred after the vertebrate-invertebrate separation (Figs. 2 and S2). This is supported by the observation that a single ANGEL1/2-like homologue was detected in the invertebrate chordate Branchiostoma floridae (BRAFL Angel), a species that lies at the vertebrate-invertebrate evolutionary boundary, and that this homologue appears to be basal to the ANGEL1 and ANGEL2 clades in the trees reconstructed with both methods (Figs. 2 and S2). In the ANGEL family, the fungal Ngl homologues form a separate, highly supported clade. Three S. cerevisiae homologues (Ngl1, Ngl2 and Ngl3) and one S. pombe homologue (Ngl1) were identified. Based on the phylogenetic tree (Figs. 2 and S2), we suggest that tandem duplication events, apparently after the S. cerevisiae and S. pombe divergence, may have given rise to Ngl2 and Ngl3.

Based on the rate shift analysis, a total of 153 sites, distributed across the core domain, were detected with significant evolutionary rate differences. Among them, 29 (19%) sites were detected with significant rate shifts between the PAN2 family and the other DEDD families (Fig. 3, both blue and red highlighting). This is in agreement with the results of the phylogenetic analyses, where PAN2 appears to be more distantly related to the other DEDD families. Moreover, 29 (19%) conserved sites were detected in all DEDD families, exhibiting slower evolutionary rates compared to the average of all proteins under investigation (Fig. 3, blue highlighting). This leads to the suggestion that there must have been evolutionary pressure on these sites to evolve slowly because they have a critical role in the function or structure of DEDD enzymes; as expected, the 4 catalytic residues that define the DEDD superfamily are also included in this category (Fig. 3). Also, 95 sites (62%), a high proportion, were detected with faster evolutionary rates compared to the average of all DEDD proteins (Fig. 3, red highlighting).
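The site counts reported for the rate shift analysis can be checked with a few lines of arithmetic. The Python sketch below only restates the numbers given in the text (153 sites in total, split into 29, 29 and 95); the category labels are paraphrases introduced for illustration.

```python
total_sites = 153

# Counts as reported in the text; labels are illustrative paraphrases.
categories = {
    "rate shift between PAN2 and other DEDD families": 29,
    "slower than average in all DEDD families": 29,
    "faster than average in all DEDD families": 95,
}

# The three categories account for every site with a rate difference.
assert sum(categories.values()) == total_sites

shares = {label: round(100 * n / total_sites)
          for label, n in categories.items()}
for label, pct in shares.items():
    print(f"{label}: {pct}%")
```

The rounded shares come out as 19%, 19% and 62%, matching the percentages quoted above.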

Figure 3

Results of the rate shift analysis for the 19 DEDD proteins. Sites with blue and red highlight correspond to those with slower and faster evolutionary rate, respectively. Sites with entirely blue or red highlight represent amino acid sites with the same evolutionary rate in all families, but with significantly slower or faster rates compared to the average of all sites, respectively.

Furthermore, sequence logos were generated in order to determine the consensus sequence of each of the conserved motifs that were deduced from the alignment of representative deadenylase sequences from both superfamilies (Fig. 4A and B). In this way, a set of structurally conserved residues was identified in both the DEDD and EEP deadenylases. More specifically, 3 major motifs were identified in DEDD deadenylases and 7 major motifs in EEP deadenylases (Fig. 4A and B).
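The notion of an invariant residue can be made concrete with a short sketch: a column of the alignment is invariant when every sequence carries the same residue there. The Python example below is a simplified illustration under stated assumptions; the toy alignment and its labels are invented, the function name is hypothetical, and columns containing gaps are never reported as invariant.

```python
from typing import Dict, List, Tuple


def invariant_columns(aln: Dict[str, str]) -> List[Tuple[int, str]]:
    """Return (1-based column, residue) for columns identical in all sequences."""
    seqs = list(aln.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("aligned sequences must have equal length")
    result = []
    for position, column in enumerate(zip(*seqs), start=1):
        residues = set(column)
        # A single distinct, non-gap residue means the column is invariant.
        if len(residues) == 1 and "-" not in residues:
            result.append((position, column[0]))
    return result


# Invented toy alignment, for illustration only.
toy_alignment = {
    "famA": "MDAEKLD",
    "famB": "LDGEKVD",
    "famC": "MDSERID",
}
invariant = invariant_columns(toy_alignment)
print(invariant)
```

Positions are reported relative to the alignment; mapping them onto a reference protein, as done for the figure numbering, would require an additional lookup that is not shown here.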

Figure 4

Sequence logos of the motifs identified in deadenylase protein sequences. (A) DEDD, numbered according to the human PARN nuclease domain (PDB code 2A1R) and (B) EEP, numbered according to the human CNOT6 nuclease domain. The height of each letter is proportional to the frequency of the corresponding residue at that position, and the letters are ordered such that the most frequent is on top. The invariant catalytic residues that define each superfamily are indicated with dots.

Importantly, apart from the known catalytic residues (Fig. 4A and B), several other residues were found to be evolutionarily conserved across species in all the deadenylases examined. Therefore, these amino acids may serve important functional roles in the deadenylation mechanism. They could also represent potential drug targets.

Phylogenetic tree of DEDD. Support values above 50% are indicated at the nodes within the major clades. The scale bar at the upper left denotes the length of amino acid substitutions per site.

Phylogenetic tree of EEP. Support values (>50%) are indicated at the nodes within the major clades. The scale bar at the upper left indicates the length of amino acid substitutions per position.

Phylogenetic distribution of the deadenylases analyzed in the present study.

27 in total

1. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis.

Authors: J Castresana
Journal: Mol Biol Evol Date: 2000-04 Impact factor: 16.240

2. Functionally unrelated signalling proteins contain a fold similar to Mg2+-dependent endonucleases.

Authors: M Dlakić
Journal: Trends Biochem Sci Date: 2000-06 Impact factor: 13.807

3. A likelihood ratio test for evolutionary rate shifts and functional divergence among proteins.

Authors: B Knudsen; M M Miyamoto
Journal: Proc Natl Acad Sci U S A Date: 2001-12-04 Impact factor: 11.205

Review 4. Mechanisms of deadenylation-dependent decay.

Authors: Chyi-Ying A Chen; Ann-Bin Shyu
Journal: Wiley Interdiscip Rev RNA Date: 2010-09-15 Impact factor: 9.957

5. Multiple sequence alignment using ClustalW and ClustalX.

Authors: Julie D Thompson; Toby J Gibson; Des G Higgins
Journal: Curr Protoc Bioinformatics Date: 2002-08

6. Chytridiomycosis causes amphibian mortality associated with population declines in the rain forests of Australia and Central America.

Authors: L Berger; R Speare; P Daszak; D E Green; A A Cunningham; C L Goggin; R Slocombe; M A Ragan; A D Hyatt; K R McDonald; H B Hines; K R Lips; G Marantelli; H Parkes
Journal: Proc Natl Acad Sci U S A Date: 1998-07-21 Impact factor: 11.205

7. The transcription factor associated Ccr4 and Caf1 proteins are components of the major cytoplasmic mRNA deadenylase in Saccharomyces cerevisiae.

Authors: M Tucker; M A Valencia-Sanchez; R R Staples; J Chen; C L Denis; R Parker
Journal: Cell Date: 2001-02-09 Impact factor: 41.582

8. Conservation of the deadenylase activity of proteins of the Caf1 family in human.

Authors: Claire Bianchin; Fabienne Mauxion; Stéphanie Sentis; Bertrand Séraphin; Laura Corbo
Journal: RNA Date: 2005-04 Impact factor: 4.942

9. UniProt Knowledgebase: a hub of integrated protein data.

Authors: Michele Magrane
Journal: Database (Oxford) Date: 2011-03-29 Impact factor: 3.451

10. Dendroscope: An interactive viewer for large phylogenetic trees.

Authors: Daniel H Huson; Daniel C Richter; Christian Rausch; Tobias Dezulian; Markus Franz; Regula Rupp
Journal: BMC Bioinformatics Date: 2007-11-22 Impact factor: 3.169
