
Sequence and structural analysis of 4SNc-Tudor domain protein from Takifugu rubripes.

Jianzhou Zheng, Jian Lu, Haijun Liu, Jun Li, Keping Chen.

Abstract

The fugu SN4TDR protein belongs to an evolutionarily conserved family and consists of four repeated staphylococcal nuclease-like domains (SN1-SN4) at the N-terminus followed by a Tudor and SN-like domain region (TSN). Sequence analysis showed that the C-terminal TSN domain is composed of a complete SN-like domain interdigitated with a Tudor domain. Despite the low level of sequence identity, the five SN-like domains share a few conserved amino acids that may play essential roles in the function of the protein. Computer modeling and secondary structure prediction of the SN-like domains revealed a shared arrangement of structural elements, beta1-beta2-beta3-alpha1-beta4-beta5-alpha2-alpha3, which provides a structural basis for oligonucleotide binding. The loop region L(3alpha), which lies between beta3 and alpha1 of the SN-like domains and contains binding sites, differs from that of human p100, implying divergence in the structures of the binding sites. These results indicate that fugu SN4TDR may bind methylated ligands and/or oligonucleotides through its distinct domains.


Keywords: SN-like domain; SN4TDR; Takifugu rubripes; Tudor domain

Year: 2009 PMID: 20198186 PMCID: PMC2828898 DOI: 10.6026/97320630004127

Source DB: PubMed Journal: Bioinformation ISSN: 0973-2063

Background

4SNc-Tudor domain proteins (SN4TDR) are highly conserved across species but have not been found in bacteria [1, 2]. They typically contain four repeated domains with homology to Staphylococcus aureus nuclease (SN) at the N-terminus, followed by a Tudor domain and an SN-like domain at the C-terminus; these last two together are defined as the TSN domain. In eukaryotes, SN4TDR is a key regulator of gene expression and plays an important role in both transcription and pre-mRNA splicing. The two distinct domain types of SN4TDR, the SN-like domain and the Tudor domain, mediate interactions with nucleic acids and proteins, respectively [3]. The protein was first identified as a coactivator of the Epstein-Barr virus nuclear antigen 2 (EBNA-2) [4] and was later found to interact with other transcription factors such as c-Myb [5], Stat5 [6] and Stat6 [7]. Notably, SN4TDR promotes cell proliferation by activating certain transcription factors and is linked to autosomal dominant polycystic kidney disease (ADPKD). SN4TDR was also found to be a component of the RNA-induced silencing complex (RISC) that can bind double-stranded RNA and ultimately degrade it [8], especially hyper-edited double-stranded RNA containing multiple I·U and U·I pairs [3]. Recently, SN4TDR was shown to promote both in vitro spliceosome complex formation and the first step of pre-mRNA splicing by interacting with U5 snRNP (small nuclear ribonucleoprotein) [9]. SN4TDR proteins from eukaryotic organisms have been classified into five categories based on the fifth SN-like domain [10]. In contrast, the overall length and sequences of the proteins from fish are highly conserved, suggesting an essential role in these species. Conservation is also observed between the human and fish SN4TDR proteins. Although human SN4TDR has been studied intensively, the structure and function of fugu (Takifugu rubripes) SN4TDR remain unknown.
To investigate the function of fugu SN4TDR, we performed a systematic bioinformatics analysis with a novel HCA (hydrophobic cluster analysis) method. The analysis reveals that the protein has a modular architecture comprising four repeated SN-like domains at the N-terminus, followed by a Tudor domain and a complete SN-like domain. In addition, several conserved amino acids were detected within the fugu SN-like domains, suggesting that they play important roles in the function of fugu SN4TDR.

Methodology

Database search for sequences

The sequences of fugu SN4TDR (BAD32626), human SN4TDR (NP_055205), Staphylococcus aureus nuclease (2SNS_A) and SMN (survival of motor neurons; 1MHN_A) used for sequence analysis were obtained directly from the database of the National Center for Biotechnology Information (NCBI).

HCA plot

Because of the low level of sequence identity, the HCA program at the Mobyle server (http://mobyle.rpbs.univ-paris-diderot.fr/cgibin/portal.py?form=HCA), based on the methods described by Callebaut et al. [11], was used to detect similar plots among fugu SN4TDR, SNase and SMN, which can indicate the presence of similar three-dimensional folds. The five SN-like domains of fugu SN4TDR detected by HCA were then aligned with the CLUSTAL W program (http://www.ebi.ac.uk/clustalw/) according to the methods described by Thompson et al. [12]. The resulting alignments were used as input to BOXSHADE to produce shaded multiple alignment output for direct viewing.
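The "low level of sequence identity" that motivates the use of HCA can be quantified from any gapped multiple alignment. The following is a minimal sketch, not the procedure used in this study: the function names and the toy aligned fragments are hypothetical and are not the fugu sequences, and identity is computed here over columns in which neither sequence has a gap, which is only one of several common conventions.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def pairwise_identity(alignment):
    """Return {(name1, name2): percent identity} for every pair in the alignment."""
    return {
        (m, n): percent_identity(alignment[m], alignment[n])
        for m, n in combinations(alignment, 2)
    }


# Toy aligned fragments invented for illustration only (NOT the fugu SN-like domains).
toy_alignment = {
    "domA": "MKV-LTGA",
    "domB": "MRVALT-A",
    "domC": "-KVALSGA",
}

if __name__ == "__main__":
    for pair, pid in pairwise_identity(toy_alignment).items():
        print(pair, round(pid, 1))
```

Applied to the rows of a CLUSTAL W alignment, such a table gives one identity value per pair of domains.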

Modeling of fugu SN-like domains

The five SN-like domains of fugu SN4TDR were modeled with the automatic web server on the Geno3D home page (http://geno3dpbil.ibcp.fr/) according to the methods described by Combet et al. [13]. The crystal structure of SNase (PDB 1SNC) was used as the template for modeling the four N-terminal SN-like domains (SN1-SN4), and the crystal structure of the p100 co-activator Tudor domain (PDB 2HQE) was used for modeling the C-terminal SN-like domain (SN5). Secondary structure was predicted with the PredictProtein program at the ExPASy Molecular Biology Server (http://cubic.bioc.columbia.edu/) according to the methods described by Rost et al. [14].

Results

The detection of five SN-like domains and Tudor domain based on HCA plot

A search of the SMART database for conserved domains showed that the fugu SN4TDR sequence contains two distinct regions: four repeated SN-like domains (SN1 to SN4) and a TSN domain. The latter comprises the fifth SN-like domain (SN5) and a Tudor domain. Comparison of the five SN-like domains shows that they have low sequence identity at the amino acid level. To understand the structure of fugu SN4TDR, the four SN-like domains and the TSN domain were analyzed by hydrophobic cluster analysis (HCA), which compares protein sequences using a two-dimensional (2D) representation of the sequences rather than sequence similarity. HCA plots from Mobyle were used to compare fugu SN4TDR with the Staphylococcus aureus nuclease and SMN (survival of motor neurons) protein sequences in order to detect similar motifs. The results indicate that fugu SN4TDR has a modular architecture. At the N-terminus, four complete SN-like domains are located at residues 22-164, 198-326, 346-495 and 531-659, and each SN-like domain includes four similar repetitive hydrophobic domains. At the C-terminus, the SN-like domain is split into two segments, residues 684-704 and residues 795-896, which interdigitate with the Tudor domain (residues 705-794) (Figure 1).
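The domain boundaries reported above can be written down as a small data structure, which makes the interdigitated arrangement of SN5 and the Tudor domain easy to check. This is a minimal sketch: the residue ranges are taken from the text, while the dictionary layout and the helper names are illustrative assumptions.

```python
# Residue ranges (1-based, inclusive) as reported in the text above.
# SN5 is listed as two segments because the Tudor domain lies between them.
DOMAINS = {
    "SN1": [(22, 164)],
    "SN2": [(198, 326)],
    "SN3": [(346, 495)],
    "SN4": [(531, 659)],
    "SN5": [(684, 704), (795, 896)],
    "Tudor": [(705, 794)],
}


def domain_length(segments):
    """Total number of residues covered by a list of inclusive segments."""
    return sum(end - start + 1 for start, end in segments)


def fills_gap(inner, outer):
    """True if the single-segment domain `inner` exactly fills the gap
    between the two segments of the split domain `outer`."""
    ((inner_start, inner_end),) = inner
    (_, first_end), (second_start, _) = outer
    return first_end + 1 == inner_start and inner_end + 1 == second_start


if __name__ == "__main__":
    for name, segments in DOMAINS.items():
        print(name, segments, domain_length(segments))
    print("Tudor inserted within SN5:", fills_gap(DOMAINS["Tudor"], DOMAINS["SN5"]))
```

With these ranges, the Tudor domain (90 residues) exactly fills the gap between the two SN5 segments (21 and 102 residues), which is the arrangement described as interdigitation.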

Figure 1

HCA plots of SN-like domains, SNase and SMN showing their structural homology. In the HCA plots, the residues are shown on a duplicated α-helical net and the clusters of hydrophobic residues are automatically drawn. Each letter corresponds to one amino acid, except for proline, glycine, serine and threonine, which are represented by the symbols ★, ♦, ⊡ and □, respectively. The catalytic sites (D21, F34, D40, E43, K84, Y85 and R87) of SNase described in Christopher [17] and its secondary structure predicted by PredictProtein (β1-β2-β3-α1-β4-β5-α2-α3) are shown in the HCA plot of SNase. SNase and the SN-like domains are divided into subdomain A and subdomain B. Hydrophobic clusters with similar characteristics between SNase and the SN-like domains are designated as motifs C1 to C8, while similar clusters between SMN and the Tudor domain are denoted as motifs C9 and C10. Vertical lines indicate the putative correspondences between sequences; clusters with identical characteristics are darkly shaded. SNase = staphylococcal nuclease; SMN = survival of motor neurons.

Similar to SNase, all five SN-like domains of the fugu SN4TDR protein contain a similar pattern of hydrophobic clusters comprising eight motifs (designated motifs C1 to C8, Figure 1). Relative to the other motifs, C1, C3 and C6 are well conserved in all SN-like domains (Figure 1). Motifs C1 and C2 in the SN-like domains are linked by a loop (L12), which contributes a conserved glycine necessary for nuclease binding. An exception is the SN3 domain, where the glycine is replaced by an alanine. For loop L3α (linking β3 and the α1 helix), both sequence and length differ between the five modeled SN-like domains and SNase. The HCA analysis showed that the Tudor domain of the TSN domain (residues 705-794) (Figure 1) is similar to that of SMN (1MHN), which contains a typical β-barrel domain with four β-sheets. Furthermore, the hydrophobic clusters of the Tudor domain (designated motifs C9 and C10) are also similar to those of the SMN Tudor domain (Figure 1). In motifs C9 and C10, the residues Phe741, Tyr747, Tyr764 and Tyr767 correspond to the residues Trp102, Tyr109, Tyr127 and Tyr130 in the SMN Tudor domain. These four residues of the SMN Tudor domain have been shown to form an aromatic cage that mediates protein-protein interactions by enclosing a dimethylated arginine ligand [15]. Previous studies described a similar mechanism for recognition and binding of methylated amino acid residues in the Tudor domains of JMJD2A [16] and 53BP1 [8].

Secondary structure and modeling of fugu SN-like domains

The secondary structure elements of all SN-like domains were predicted by PredictProtein. The results indicate that the five SN-like domains closely resemble the overall structure of staphylococcal nuclease, which mainly consists of a five-stranded β-barrel capped by an α-helix between β3 and β4, followed by two α-helices (α2 and α3, termed subdomain B). Notably, two amino acids required for catalysis in SNase (Asp21 and Asp40) are missing from all five SN-like domains. Despite the similarity of the secondary structure elements, the surface residues of the SN-like domains differ from those of SNase. Analysis of the models of the five SN-like domains indicates that the SN1-SN4 domains have positively charged, solvent-accessible surfaces, whereas SN5 has a negatively charged surface that is unable to interact with nucleic acids (Figure 3).

Figure 3

Structure of fugu SN-like domains modeled using Geno3D. (A) Cartoon illustration of the structures of the five SN-like domains. Red, α-helix; yellow, β-sheet. (B) Surface electrostatic potential calculated by pdbviewer, showing positively charged surfaces in blue and negatively charged surfaces in red. The classification of residues is as follows: positively charged, Arg and Lys; negatively charged, Asp and Glu.
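The residue classification used in the caption (Arg and Lys positive; Asp and Glu negative) also allows a crude sequence-level charge count. The sketch below is an illustration under that classification only and is not a substitute for the surface electrostatic potential computed from the three-dimensional models; the function names and the example peptides are hypothetical and are not taken from fugu SN4TDR.

```python
POSITIVE = set("RK")  # Arg, Lys
NEGATIVE = set("DE")  # Asp, Glu


def net_charge(seq):
    """Count of positive residues minus count of negative residues."""
    seq = seq.upper()
    positives = sum(1 for residue in seq if residue in POSITIVE)
    negatives = sum(1 for residue in seq if residue in NEGATIVE)
    return positives - negatives


def basic_fraction(seq):
    """Fraction of residues that are Arg or Lys (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(1 for residue in seq if residue in POSITIVE) / len(seq)


if __name__ == "__main__":
    # Hypothetical peptides for illustration only.
    for peptide in ("GKRSKLAR", "DEEGLADK"):
        print(peptide, net_charge(peptide), round(basic_fraction(peptide), 2))
```

The same count applied to a loop segment such as L3α gives a quick indication of whether it is rich in Arg and Lys.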

Discussion

4SNc-Tudor domain protein (SN4TDR) was first identified as a transcriptional coactivator of the Epstein-Barr virus nuclear antigen 2, coactivating gene expression by interacting with the EBNA-2 acidic domain [4]. Previous studies indicated that the modular structure of SN4TDR is conserved among a wide variety of organisms. It contains four repeated SN-like domains at the N-terminus, followed by a Tudor domain and a variable SN-like domain (TSN). The high degree of structural conservation in many eukaryotic organisms may imply that SN4TDR plays an indispensable role in eukaryotic cells. In contrast to the modular structure, the amino acid sequence of SN4TDR is less conserved (below 30%). Therefore, HCA analysis was used to detect conserved structure within the N-terminal SN-like domains and the TSN domain. Owing to the diversity of the fifth SN-like domain (SN5), SN4TDR is classified into five categories based on homology and the status of the SN5 domain [10]. The fugu SN5 domain contains a full-length secondary structure formed by residues 687-704 (β1 and β2) and residues 795-896 (β3, α1, β4, β5, α2 and α3). Multiple sequence alignment showed low identity between the five SN-like domains and SNase. However, when compared across diverse eukaryotes, the domains of fugu SN4TDR are conserved, suggesting that they are functionally important [17]. In addition, the secondary structure analysis by PredictProtein (Figure 2) indicated that the SN-like domains contain the typical OB-fold, implying a function in binding oligonucleotides or oligosaccharides [18]. It is notable that subdomain B (α2 and α3) shows more similarity than subdomain A (β1, β2, β3, α1, β4 and β5) (Figure 2). The conserved residues in subdomain B are considered necessary for its stability [17], and subdomain A appears to contribute to oligonucleotide binding owing to the presence of the typical OB-fold [18].
In subdomain A, the L3α loop regions of fugu SN4TDR between β3 and α1 are rich in Arg and Lys, implying a potential to bind phosphates. Interestingly, fugu and human SN4TDR are conserved overall in sequence and length, except in L3α, where the residues differ, suggesting that the binding sites recognize different substrates (Figure 4). Furthermore, except for the SN5 domain, the charge properties of L3α in the SN-like domains do not differ significantly between fugu and human.

Figure 2

Multiple sequence alignment of SNase and the five SN-like domains based on HCA analysis. The protein sequence alignment was performed with the CLUSTAL W program using default parameters and then rendered using BOXSHADE. The secondary structure elements of the SNase protein sequence predicted by PredictProtein are denoted as β1, β2, β3, α1, β4, β5, α2 and α3. The corresponding secondary structures of the SN-like domains are underlined at the bottom of the sequences. The region β3 between β2 and α1 in the SN5 domain was predicted by modeling SN5 and is denoted by a dashed line. The region β3 of the SN5 domain was analyzed according to the crystal structure of the human SN4TDR protein.

Figure 4

Comparison of fugu and human SN4TDR. The protein sequence comparison was performed with the CLUSTAL W program using default parameters and then rendered using BOXSHADE. The SN-like domains are denoted by dashed lines. The secondary structures of the five SN-like domains are indicated by lines above the sequences.

Tudor domains have been found in many eukaryotic organisms and are involved in protein-protein interactions. A recent mutagenesis study of the p100 TSN domain revealed that methylated ligands are trapped inside a cage composed of at least three aromatic amino acid residues [19]. The Tudor domain of the survival of motor neurons (SMN) protein was demonstrated to bind dimethylated arginines in arginine-glycine (RG) rich sequences at the C-terminus of Sm proteins. Here, we show that four amino acids (Phe741, Tyr747, Tyr764 and Tyr767) in the Tudor domain are invariant and likely form a conserved aromatic cage that binds methylated ligands. Analysis of the models of the five SN-like domains shows that their secondary structures are highly conserved, but that the N-terminal SN1-SN4 domains differ from the C-terminal SN5 domain. The similar secondary structures and positively charged surfaces of the fugu and human SN-like domains suggest that fugu SN1-SN4 may bind DNA and/or double-stranded RNA in the same way as human SN1-SN4 [20]. In contrast, the SN5 domain is unable to bind phosphate groups because of its negatively charged surface. In addition, it has been demonstrated that the TSN domain of p100 can interact with snRNP complexes and promote pre-mRNA splicing [19]. Thus, the charged surface of the SN5 domain may compromise the binding of methylated ligands by the Tudor domain.

20 in total

1. Geno3D: automatic comparative molecular modelling of protein.

Authors: Christophe Combet; Martin Jambon; Gilbert Deléage; Christophe Geourjon
Journal: Bioinformatics Date: 2002-01 Impact factor: 6.937

2. The multifunctional human p100 protein 'hooks' methylated ligands.

Authors: Neil Shaw; Min Zhao; Chongyun Cheng; Hao Xu; Juha Saarikettu; Yang Li; Yurong Da; Zhi Yao; Olli Silvennoinen; Jie Yang; Zhi-Jie Liu; Bi-Cheng Wang; Zihe Rao
Journal: Nat Struct Mol Biol Date: 2007-07-15 Impact factor: 15.369

Review 3. Deciphering protein sequence information through hydrophobic cluster analysis (HCA): current status and perspectives.

Authors: I Callebaut; G Labesse; P Durand; A Poupon; L Canard; J Chomilier; B Henrissat; J P Mornon
Journal: Cell Mol Life Sci Date: 1997-08 Impact factor: 9.261

4. P100, a transcriptional coactivator, is a human homologue of staphylococcal nuclease.

Authors: C P Ponting
Journal: Protein Sci Date: 1997-02 Impact factor: 6.725

5. The Epstein-Barr virus nuclear protein 2 acidic domain forms a complex with a novel cellular coactivator that can interact with TFIIE.

Authors: X Tong; R Drapkin; R Yalamanchili; G Mosialos; E Kieff
Journal: Mol Cell Biol Date: 1995-09 Impact factor: 4.272

6. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice.

Authors: J D Thompson; D G Higgins; T J Gibson
Journal: Nucleic Acids Res Date: 1994-11-11 Impact factor: 16.971

7. Tudor nuclease genes and programmed DNA rearrangements in Tetrahymena thermophila.

Authors: Rachel A Howard-Till; Meng-Chao Yao
Journal: Eukaryot Cell Date: 2007-08-22

8. OB(oligonucleotide/oligosaccharide binding)-fold: common structural and functional solution for non-homologous sequences.

Authors: A G Murzin
Journal: EMBO J Date: 1993-03 Impact factor: 11.598

9. Transcriptional co-activator protein p100 interacts with snRNP proteins and facilitates the assembly of the spliceosome.

Authors: Jie Yang; Tuuli Välineva; Jingxin Hong; Tianxu Bu; Zhi Yao; Ole N Jensen; Mikko J Frilander; Olli Silvennoinen
Journal: Nucleic Acids Res Date: 2007-06-18 Impact factor: 16.971

10. Structural and functional insights into human Tudor-SN, a key component linking RNA interference and editing.

Authors: Chia-Lung Li; Wei-Zen Yang; Yi-Ping Chen; Hanna S Yuan
Journal: Nucleic Acids Res Date: 2008-05-03 Impact factor: 16.971
