Dorota Matelska1, Kamil Steczkiewicz1, Krzysztof Ginalski1.
Abstract
PIN-like domains constitute a widespread superfamily of nucleases, diverse in terms of the reaction mechanism, substrate specificity, biological function and taxonomic distribution. Proteins with PIN-like domains are involved in central cellular processes, such as DNA replication and repair, mRNA degradation, transcription regulation and ncRNA maturation. In this work, we identify and classify the most complete set of PIN-like domains to provide the first comprehensive analysis of sequence-structure-function relationships within the whole PIN domain-like superfamily. Transitive sequence searches using highly sensitive methods for remote homology detection led to the identification of several new families, including representatives of Pfam (DUF1308, DUF4935) and CDD (COG2454), and 23 other families not classified in the public domain databases. Further sequence clustering revealed relationships between individual sequence clusters and showed heterogeneity within some families, suggesting a possible functional divergence. With five structural groups, 70 defined clusters, over 100,000 proteins, and broad biological functions, the PIN domain-like superfamily constitutes one of the largest and most diverse nuclease superfamilies. Detailed analyses of sequences and structures, domain architectures, and genomic contexts allowed us to predict biological function of several new families, including new toxin-antitoxin components, proteins involved in tRNA/rRNA maturation and transcription/translation regulation.
Year: 2017 PMID: 28575517 PMCID: PMC5499597 DOI: 10.1093/nar/gkx494
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Crystal structures of the major variants of the PIN domain-like fold (top panels) with zoom-in views of their active sites (bottom panels). (A) Structure-specific human FEN-1 nuclease (PDB ID: 3q8k). A ‘hydrophobic wedge’ between S1 and S2 is shown in dark violet, a ‘helical arch’ between S2 and H3 is shown in dark green, and a C-terminal helix-2-turn-helix motif (H2TH) in orange. (B) A canonical PIN domain of VapC15 from Mycobacterium tuberculosis (PDB ID: 4chg). Helix 2΄ (H2΄) specific for VapC-like domains is shown in dark green. (C) VPA0982 from Vibrio parahaemolyticus (PDB ID: 2qip). (D) PRORP1 from Arabidopsis thaliana (PDB ID: 4g24). Short helix 2΄ (H2΄) is shown in green. β-strands forming the core β-sheet are labeled S1–S5 and shown in yellow, active site residues are shown as red sticks, and metal ions as pink spheres.
Figure 2. Sequence similarity network of the PIN-like domains. Nodes correspond to the PIN-like domain sequences representing 40% sequence-identity clusters, whereas edges correspond to BLAST E-values < 10^-5. Sequences are colored according to the defined clusters; newly identified PIN-like domains are marked in red. The weighted graph, with weights transformed by –log(E-value), was visualized in Cytoscape 3.2 with 10,000 iterations of the Prefuse Force Directed Layout (45).
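The edge definition in the Figure 2 caption can be sketched in a few lines of Python. This is only an illustration, not the authors' pipeline: it assumes BLAST tabular output (`-outfmt 6`, E-value in the 11th column), assumes a base-10 logarithm (the caption says only "–log(E-value)"), and introduces a hypothetical cap `MAX_WEIGHT` for E-values reported as 0. The function name `blast_edges` is likewise made up for this example.

```python
import math

E_VALUE_CUTOFF = 1e-5   # threshold stated in the Figure 2 caption
MAX_WEIGHT = 200.0      # assumption: cap used when BLAST reports an E-value of 0


def blast_edges(lines, cutoff=E_VALUE_CUTOFF):
    """Build undirected weighted edges from BLAST tabular (outfmt 6) lines.

    Returns a dict mapping a sorted (node_a, node_b) pair to -log10(E-value).
    When a pair is hit in both directions, the stronger weight is kept.
    """
    edges = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip malformed or comment lines
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if query == subject or evalue >= cutoff:
            continue  # ignore self-hits and hits above the cutoff
        if evalue == 0.0:
            weight = MAX_WEIGHT
        else:
            weight = min(-math.log10(evalue), MAX_WEIGHT)
        key = tuple(sorted((query, subject)))
        edges[key] = max(weight, edges.get(key, 0.0))
    return edges
```

The resulting dictionary can be written out as a three-column edge table (source, target, weight) for import into a network viewer such as Cytoscape.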
PIN-like domain clusters defined in this study
| Cluster name | Structural group | Pfam and COG/KOG domains | Description | Number of active-site residues | PDB IDs | Number of sequences | Assignment to PIN domain-like superfamily | Phyletic distribution |
|---|---|---|---|---|---|---|---|---|
| 5_3_exonuc_N | FEN | PF02739 (5_3_exonuc_N), COG0258, KOG2519 | N-terminal domain of type A DNA polymerases, with 5΄–3΄ exonuclease and structure-specific endonuclease activities. It functions in DNA replication and repair, cleaves flap structures, including Okazaki fragments. | 4–9 (+) | 3zd8, 2ihn, 1bgx, 1xo1, 1taq, 1ut5, 1exn, 3h8s, 1tfr | 18405 / 238 | ( | Bacteria [17107], Eukaryota [380], dsDNA viruses [243], Archaea [2] |
| Pox_G5 | FEN | PF04599 (Pox_G5) | FEN1-like nucleases conserved in poxviruses, involved in DNA replication and double-strand break repair by homologous recombination ( | 6 (+) | - | 98 / 7 | ( | Viruses: Poxviridae [98] |
| XPG_I | FEN | PF00867 (XPG_I), PF00752 (XPG_N), COG0258, KOG2519, KOG2520, KOG2518, COG5366 | Internal domain, which together with the N-terminal domain (XPG_N, PF00752) forms the catalytic domain of the FEN-like structure that contains the active site. In eukaryotes, Rad2/XPG proteins are responsible for a key step of the nucleotide excision DNA repair (NER) pathway, cleaving DNA duplex-containing bubble or loop structures during DNA replication, repair and recombination ( | 7 (+) | 1b43, 4wa8, 1ul1, 4q0r, 3ory, 3q8m, 2izo, 1mc8, 3q8k, 3qe9, 1a76, 3qea, 1rxv, 4q0w, 5cnq | 6119 / 524 | Pfam | Eukaryota [5510], Archaea [363], dsDNA viruses [169], Bacteria [2], unclassified viruses [1] |
| XPG_I_2 | FEN | PF12813 (XPG_I_2), KOG2518 | Function unknown. Eukaryotic family of asteroid homologs, in Drosophila possibly functioning in EGFR signaling ( | 6–7 (+) | - | 1056 / 123 | Pfam | Eukaryota [1056] |
| XRN_N | FEN | PF03159 (XRN_N), COG5049, KOG2044, KOG2045 | In eukaryotes, major 5΄–3΄ exoribonucleases involved in mRNA decay ( | 7 (+) | 2y35, 3pie, 3fqd | 2886 / 167 | Pfam | Eukaryota [2835], dsDNA viruses [32], Bacteria [6] |
| COG2405 | VapC | PF11848 (DUF3368), COG2405 | Potential toxins from toxin-antitoxin systems, including new three-component systems. | 2–5 (+/-, e.g. PDB ID: 2mdt) | 2mdt | 1675 / 235 | Pfam | Bacteria [1248], Archaea [357] |
| DUF1308 | VapC | PF07000 (DUF1308), KOG4529 | Function unknown. Present in some mimiviruses, eukaryotes and some cyanobacteria, so probably of chloroplast origin in eukaryotes. The Pfam definition comprises two domains: an N-terminal domain distantly related to the PD-(D/E)XK nucleases and a C-terminal PIN-like domain, which probably lacks acidic residues from the active site. | 2–4 (-/+) | - | 762 / 78 | This work | Eukaryota [744], Bacteria [10], dsDNA viruses [7] |
| DUF4411.1 | VapC | PF14367 (DUF4411) | Potential toxins from three-component toxin-antitoxin systems, together with HTH DNA-binding domains and DUF955 proteases. | 4–5 (+) | - | 643 / 91 | Pfam | Bacteria [616], Archaea [13] |
| DUF4411.2 | VapC | PF14367 (DUF4411) | Potential toxins from toxin-antitoxin systems. | 4–6 (+) | - | 14 / 6 | Pfam | Bacteria: Bacillales [14] |
| DUF4935 | VapC | PF16289 (DUF4935) | Function unknown. The Pfam definition comprises two regions: the N-terminal one is a PIN-like domain, whereas the α/β C-terminal region does not show homology to any protein of known structure. | 4–5 (+) | - | 320 / 84 | This work | Bacteria [310], Archaea [2] |
| Fcf1 | VapC | PF04900 (Fcf1), COG1412, KOG3164, KOG3165 | Maturation of rRNA. In human and yeasts, Utp24 is an essential endoribonuclease processing the 18S rRNA precursor at site A1 and A2 ( | 2–4 (+, e.g. Utp24 / -, e.g. Utp23) | 4mj7 | 1868 / 90 | Pfam | Eukaryota [1859], Bacteria [4], Archaea [2] |
| PIN.COG1487 | VapC | PF01850 (PIN), COG1487 | VapC-like toxins from toxin-antitoxin systems. | 2–4 (+/-) | 3zvk, 4chg, 4xgq, 2bsq, 3tnd, 3dbo, 2h1c, 1v96, 3h87, 2h1o, 1y82 | 10532 / 660 | Pfam | Bacteria [9852], Archaea [413], Eukaryota [3] |
| PIN.COG1848 | VapC | PF01850 (PIN), COG1848 | Toxins from toxin-antitoxin systems. | 3/4 (+/-) | - | 938 / 81 | Pfam | Bacteria [902] |
| PIN.COG2402 | VapC | PF01850 (PIN), COG2402 | Toxins from toxin-antitoxin systems. | 3–4 (+/-) | 1w8i | 1620 / 169 | Pfam | Bacteria [1266], Archaea [278], Eukaryota [11] |
| PIN.COG3742.COG1848.COG4374 | VapC | PF01850 (PIN), COG3742, COG4374, COG1848 | Toxins from toxin-antitoxin systems. | 3–4 (+/-) | - | 3265 / 469 | Pfam | Bacteria [2421], Archaea [739], Eukaryota [3] |
| PIN.COG3744 | VapC | PF01850 (PIN), COG3744 | Toxins from toxin-antitoxin systems. | 3–4 (+/-) | - | 2506 / 155 | Pfam | Bacteria [2429], Archaea [18], Eukaryota [2] |
| PIN.COG4113 | VapC | PF01850 (PIN), COG4113 | Toxins from toxin-antitoxin systems. | 4 (+) | 2fe1, 1v8o, 1v8p | 1611 / 243 | Pfam | Bacteria [1248], Archaea [301], Eukaryota [1] |
| PIN.COG4956 | VapC | PF01850 (PIN), COG4956 | Function unknown. In addition to the PIN domain, the proteins possess TRAM, a putative RNA-binding domain ( | 3–4 (+/-) | 3ix7 | 2002 / 20 | Pfam | Bacteria [1946], Archaea [6], Eukaryota [2] |
| PIN.COG5573 | VapC | PF01850 (PIN), COG5573 | Toxins from toxin-antitoxin systems. | 4 (+) | - | 695 / 82 | Pfam | Bacteria [674], Archaea [6], Eukaryota [1] |
| PIN.COG5611 | VapC | PF01850 (PIN), COG5611 | Toxins from toxin-antitoxin systems. | 3–4 (+/-) | - | 1139 / 140 | Pfam | Bacteria [1066], Archaea [36] |
| PIN.1 | VapC | PF01850 (PIN) | Toxins from toxin-antitoxin systems. | 3–4 (+/-) | - | 1163 / 155 | Pfam | Bacteria [1085], Archaea [39] |
| PIN.2 | VapC | PF01850 (PIN) | Toxins from toxin-antitoxin systems. | 4–6 (+) | - | 671 / 19 | Pfam | Bacteria: Actinobacteria [663], Proteobacteria [1] |
| PIN.3 | VapC | PF01850 (PIN) | Toxins from toxin-antitoxin systems. | 2–4 (+/-) | - | 324 / 32 | Pfam | Bacteria [303], Archaea [2] |
| PIN.4 | VapC | PF01850 (PIN) | Function unknown. Transcriptionally coupled to DUF4325-encoding genes. | 5–6 (+) | - | 119 / 45 | Pfam | Bacteria [108], Archaea [3] |
| PIN.5 | VapC | PF01850 (PIN) | Function unknown. | 4 (+) | - | 104 / 28 | Pfam | Bacteria [104] |
| PIN.6 | VapC | PF01850 (PIN) | Function unknown. | 3–5 (+/-) | - | 64 / 12 | Pfam | Bacteria [58] |
| PIN.7 | VapC | PF01850 (PIN) | Function unknown. | 4 (+) | - | 26 / 5 | Pfam | Bacteria: Cyanobacteria [22], Proteobacteria [2], Bacteroidetes/Chlorobi group [1] |
| PIN_2 | VapC | PF10130 (PIN_2), COG5378 | Toxins from toxin-antitoxin systems. | 3–4 (+/-) | - | 222 / 51 | Pfam | Bacteria [144], Archaea [71] |
| PIN_3.COG1569 | VapC | PF13470 (PIN_3), COG1569 | Typically, toxins from toxin-antitoxin systems. Some representatives are encoded in operons comprising ATP-grasp ligase, ATPase and HNH nuclease, which were proposed to constitute a novel conflict system, where RNA ligase would neutralize toxic behavior of the nucleases ( | 4–6 (+) | - | 3548 / 331 | Pfam | Bacteria [3384], Archaea [52], Eukaryota [5] |
| PIN_3.1 | VapC | PF13470 (PIN_3) | Toxins from toxin-antitoxin systems. | 4–7 (+) | - | 963 / 161 | Pfam | Bacteria [906], Archaea [6] |
| PIN_4.COG1875 | VapC | PF13638 (PIN_4), COG1875 | In bacteria, toxins from toxin-antitoxin systems. In eukaryota, SMG5 and SMG6 are components of nonsense-mediated mRNA decay (NMD) machinery ( | 3–4 (+/-, e.g. SMG5, PDB ID: 2hwy) | 2hwx, 2hwy, 2hww, 2dok | 6518 / 423 | Pfam | Bacteria [4328], Eukaryota [2041], dsDNA viruses [22] |
| PIN_4.1 | VapC | PF13638 (PIN_4) | Function unknown. | 4 (+) | - | 25 / 5 | Pfam | Eukaryota [21], Bacteria [4] |
| PIN_4.2 | VapC | PF13638 (PIN_4) | Function unknown. | 3–4 (+/-) | - | 27 / 10 | Pfam | Bacteria [23], Eukaryota [4] |
| PIN_5 | VapC | PF08745 (PIN_5), COG1458 | Function unknown. The gene co-occurrence patterns suggest that it may interact with RNA ligase from TIGR01209 family, tRNA methyltransferase and tRNA-synthetase. | 4 (+) | - | 197 / 9 | Pfam | Archaea [128], Bacteria [65] |
| PIN_6 | VapC | PF17146 (PIN_6), COG1439 | In eukaryotes, Nob1 proteins are endoribonucleases involved in 18S rRNA maturation ( | 3–4 (+/-) | 2lcq | 1316 / 141 | ( | Eukaryota [989], Archaea [280], Bacteria [2] |
| Rrp44 | VapC | KOG2102 | RNA degradation within exosome. Rrp44 (DIS3) acts as an Mn-dependent endoribonuclease from the exosome core ( | 3–5 (+/-) | 2wp8, 4ifd, 4pmw, 5c0w | 167 / 60 | ( | Eukaryota [167] |
| PIN_8 | VapC | - | Function unknown. | 4–5 (+) | - | 458 / 139 | This work | Bacteria [441], Archaea [5], dsDNA viruses [1] |
| PIN_9 | VapC | COG1412 | Function unknown. Archaea-specific Fcf1-like domains, not matching the Fcf1 Pfam model. | 4–5 (+) | 1o4w | 353 / 56 | ( | Archaea [311], Bacteria [1] |
| PIN_12 | VapC | - | Function unknown. Related to DUF4935. | 4–5 (+) | - | 240 / 79 | This work | Bacteria [240] |
| PIN_13 | VapC | - | Potential toxins from toxin-antitoxin systems. | 1–3 (-) | - | 218 / 13 | This work | Bacteria: Actinomycetales [218] |
| PIN_14 | VapC | - | Potential toxins from three-component toxin-antitoxin systems, together with HTH DNA-binding domains and DUF955 proteases. | 3–4 (+/-) | - | 213 / 41 | This work | Bacteria [204], Archaea [3] |
| PIN_15 | VapC | - | Mainly potential toxins from toxin-antitoxin systems. The domains are fused with GCN5-related acetyltransferases or potential RNA-binding ( | 3–5 (+/-) | - | 182 / 22 | This work | Bacteria [161], Archaea [12], Eukaryota [1], dsDNA viruses [1] |
| PIN_17 | VapC | - | Potential toxins from toxin-antitoxin systems. The PIN-like domain is fused with an acetyltransferase domain, and encoded upstream to HTH-ASCH fusion protein genes (COG4933). | 3–5 (+/-) | - | 117 / 35 | This work | Bacteria [117] |
| PIN_18 | VapC | - | Function unknown. | 4–6 (+) | - | 97 / 1 | This work | Archaea: Euryarchaeota [93], unclassified Archaea [1] |
| PIN_19 | VapC | - | Function unknown. | 3–4 (+/-) | - | 54 / 22 | This work | Bacteria [51], Archaea [1] |
| PIN_20 | VapC | - | Potential toxins from three-component toxin-antitoxin systems, together with HTH DNA-binding domains and DUF955 proteases. | 4–6 (+) | - | 50 / 11 | This work | Bacteria: Actinomycetales [50] |
| PIN_21 | VapC | - | Function unknown. | 4 (+) | - | 42 / 8 | This work | Archaea [39], Bacteria [2] |
| PIN_22 | VapC | - | Function unknown. | 4–5 (+) | - | 29 / 4 | This work | Bacteria: Clostridium [29] |
| PIN_23 | VapC | - | Function unknown. | 3–4 (+/-) | - | 25 / 7 | This work | Bacteria [20], Archaea [4] |
| PIN_24 | VapC | - | Function unknown. | 3–4 (-/+) | - | 23 / 3 | This work | Bacteria: Cyanobacteria [23] |
| PIN_25 | VapC | - | Potential toxins from toxin-antitoxin systems. | 3–4 (+/-) | - | 20 / 17 | This work | Bacteria [17] |
| PIN_26 | VapC | - | Potential toxins from toxin-antitoxin systems. | 4–5 (+) | - | 20 / 6 | This work | Bacteria: Firmicutes [20] |
| PIN_27 | VapC | - | Function unknown. | 4 (+) | - | 8 / 5 | This work | Archaea: Euryarchaeota [7], unclassified Archaea [1] |
| PIN_28 | VapC | - | Function unknown. | 4 (+) | - | 8 / 2 | This work | Archaea: Sulfolobaceae [8] |
| COG2454 | NYN | COG2454 | Function unknown. Fused N-terminally with alpha-helical DUF434. In some archaea, located within ribosomal or tRNA operons. | 4–5 (+) | - | 291 / 30 | This work | Bacteria [200], Archaea [86] |
| DUF188 | NYN | PF02639 (DUF188), COG1671 | Function unknown. | 5 (+) | - | 4252 / 42 | Pfam | Bacteria [4125], Eukaryota [3] |
| NYN.COG1432 | NYN | PF01936 (NYN), COG1432 | Function unknown. The cluster comprises LabA-like proteins, which in | 1–5 (+/-) | 2qip | 8754 / 460 | Pfam | Bacteria [7883], Archaea [360], Eukaryota [260], dsDNA viruses [1] |
| NYN.1 | NYN | PF01936 (NYN) | Function unknown. | 3–6 (+/-) | - | 1813 / 277 | Pfam | Eukaryota: Viridiplantae [1151], Opisthokonta [638], other [17] |
| NYN.2 | NYN | PF01936 (NYN) | Function unknown. Majority encoded downstream to the genes encoding putative tRNA methyltransferases TrmB. | 2–6 (+/-) | - | 682 / 19 | Pfam | Bacteria [509], Eukaryota [167] |
| NYN.3 | NYN | PF01936 (NYN) | Function unknown. | 2–4 (+/-) | - | 225 / 35 | Pfam | Eukaryota [214], Bacteria [10] |
| NYN_YacP | NYN | PF05991 (NYN_YacP), COG3688 | Function unknown. | 4–7 (+) | - | 2959 / 120 | Pfam | Bacteria [2588], Eukaryota [168] |
| PIN_7 | NYN | - | Function unknown. | 3–4 (+/-) | - | 657 / 54 | This work | Bacteria [603], Eukaryota [1] |
| PIN_11 | NYN | - | Function unknown. C-terminal domain of bilaterian ZNF451 proteins, comprising the 887–1002 region in isoform 1 of human ZNF451 (UniProt ID: Q9Y4E5-1). In higher eukaryotes, fused with zinc-finger motifs. | 3–4 (+/-) | - | 284 / 3 | This work | Eukaryota: Eumetazoa [282] |
| PRORP | PRORP | PF16953 (PRORP) | Processing of pre-tRNA at the 5΄-end in mitochondria and chloroplasts ( | 4–5 (+) | 4g23, 4g24, 4xgl, 5diz | 830 / 79 | ( | Eukaryota [820], dsDNA viruses [1] |
| RNase_Zc3h12a | PRORP | PF11977 (RNase_Zc3h12a), KOG3777 | Two evolutionary separated groups. In higher eukaryotes, MCPIP1 (Zc3h12a) is involved in regulation of mRNA decay ( | 0–5 (+/-) | 3v32, 3v33 | 2122 / 130 | Pfam | Eukaryota [1925], Bacteria [154], Archaea [23] |
| RNase_Zc3h12a_2 | PRORP | PF14626 (RNase_Zc3h12a_2) | Function unknown. | 2–5 (+/-) | - | 26 / 8 | Pfam | Eukaryota: Chromadorea [26] |
| COG4634 | Mut7-C | COG4634 | In bacteria, potential toxins from toxin-antitoxin systems ( | 3–5 (+/-) | - | 1732 / 204 | ( | Bacteria [1536], Archaea [117], Eukaryota [4], dsDNA viruses [1] |
| Mut7-C | Mut7-C | PF01927 (Mut7-C), COG1656 | Function unknown. PIN domain-like fold with an inserted zinc ribbon at the C terminus. In eukaryotes, the Mut7-C domain is fused N-terminally to the 3΄–5΄ exonuclease RNase D family domain, whereas in archaea, it is a standalone module and in bacteria, it is fused with a ubiquitin member of potential RNA-binding function ( | 2–4 (+/-) | - | 1869 / 125 | ( | Bacteria [1055], Eukaryota [556], Archaea [210] |
| PIN_10 | Mut7-C | - | Potential toxins from toxin-antitoxin systems, related to COG4634. Recently, a crystal structure of its DUF433-containing antitoxin VapB45 (Rv2018) from | 3–4 (+/-) | - | 297 / 43 | This work | Bacteria [297] |
| PIN_16 | Mut7-C | - | Potential toxins from toxin-antitoxin systems. | 3–4 (+/-) | - | 150 / 26 | This work | Bacteria [150] |
Clusters are named according to the corresponding Pfam families, COG/KOG groups, or after the best-characterized representative. A name with a dot denotes a cluster within a Pfam family (‘Pfam’.‘subfamily’). Matches to Pfam and CDD were computed with HMMER (41) and RPS-BLAST (35), respectively, at an E-value cutoff of 10^-5. The number of active site residues was predicted based on the conservation of acidic residues at the positions corresponding to known active sites. ‘+’/‘-’ in parentheses denotes the presence of predicted active/inactive nucleases. ‘PDB IDs’ refer to PDB IDs of solved structures within PDB90 (proteins with known structure clustered at 90% sequence identity). ‘Number of sequences’ is based on the NCBI NR database (40). Numbers following the slash refer to the number of representatives at 40% identity based on clustering of the corresponding sequence sets with CD-HIT (33). In ‘Assignment to PIN domain-like superfamily’, ‘Pfam’ refers to the clan CL0280 (PIN) in the Pfam database (31). Taxonomic lineages of organisms were assigned according to the NCBI Taxonomy database (40). Numbers in square brackets in ‘Phyletic distribution’ refer to numbers of sequences from the NCBI NR database.
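As a rough illustration of how the ‘Number of active-site residues’ column could be derived, the sketch below counts Asp/Glu residues at alignment columns that correspond to known active-site positions. It is not the authors' procedure: the column indices, the function names, and especially the `min_active` threshold used to label a sequence as a predicted active nuclease are assumptions made for this example, since the footnote does not state the exact rule.

```python
ACIDIC = {"D", "E"}


def count_acidic(aligned_seq, site_columns):
    """Count Asp/Glu residues at the given 0-based alignment columns."""
    return sum(
        1
        for col in site_columns
        if col < len(aligned_seq) and aligned_seq[col].upper() in ACIDIC
    )


def summarize_cluster(aligned_seqs, site_columns, min_active=4):
    """Summarize a cluster as 'low-high (label)'.

    min_active is a hypothetical threshold: sequences with at least this many
    acidic residues at the reference positions count as predicted active.
    Expects a non-empty list of aligned sequences.
    """
    counts = [count_acidic(seq, site_columns) for seq in aligned_seqs]
    active = [c >= min_active for c in counts]
    if all(active):
        label = "+"
    elif not any(active):
        label = "-"
    else:
        label = "+/-"
    low, high = min(counts), max(counts)
    span = str(low) if low == high else f"{low}-{high}"
    return f"{span} ({label})"
```

For example, two toy aligned sequences `"ADKEAD"` and `"ANKEAD"` checked at columns 1, 3 and 5 with `min_active=3` give counts of 3 and 2, which the function reports as a mixed cluster.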
Figure 3. Multiple sequence alignment of the conserved core elements of the PIN domain-like superfamily. The sequence blocks (VapC-like, FEN-like, NYN-like, PRORP and Mut7-C) correspond to the defined structural groups. Each defined cluster is represented by one or more sequences, labeled with NCBI accession numbers or, for proteins of known structure, PDB codes. The numbers of excluded residues are specified in parentheses. Residue conservation is denoted with the following scheme: uncharged, highlighted in yellow; polar, highlighted in gray; known or potential active site residues, highlighted in black. Secondary structure elements (E, β-strand; H, α-helix) are shown above the corresponding alignment blocks. Abbreviated species are defined in Supplementary Dataset S2.
Figure 4. Gene co-occurrence of the PIN-domain toxin and major antitoxin families. For a given PIN-like domain family, percentages correspond to the number of prokaryotic genes located on the same strand and in close proximity (separated by less than 100 nt) to the genes that encode an antitoxin of a given family (AbrB, HTH, RHH, UPF0175 or YefM), in reference to the total number of prokaryotic genes belonging to the family. The calculations are based on the KEGG GENOME database (March 2016) (57). Shown are the PIN-like families that include at least 10% prokaryotic genes encoded in potential toxin-antitoxin operons.
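The percentage described in the Figure 4 caption can be illustrated with a small Python sketch. It is a simplified reading of the caption, not the authors' code: the `Gene` record, the 1-based inclusive coordinates, and the function names are assumptions for this example, and overlapping genes are treated as being in close proximity.

```python
from collections import Counter, namedtuple

# Hypothetical gene record; coordinates are assumed 1-based and inclusive.
Gene = namedtuple("Gene", "contig strand start end family")

MAX_GAP = 100  # "separated by less than 100 nt" (Figure 4 caption)


def gap(a, b):
    """Intergenic distance in nt between two genes; negative if they overlap."""
    return max(a.start, b.start) - min(a.end, b.end) - 1


def cooccurrence_percent(genes, toxin_family, antitoxin_families, max_gap=MAX_GAP):
    """Percentage of toxin-family genes with a same-strand antitoxin neighbor.

    The denominator is the total number of genes in the toxin family, as in
    the caption. Returns {antitoxin_family: percent}, or {} if no toxin genes.
    """
    toxins = [g for g in genes if g.family == toxin_family]
    hits = Counter()
    for tox in toxins:
        partners = set()
        for g in genes:
            if g is tox or g.family not in antitoxin_families:
                continue
            same_locus = g.contig == tox.contig and g.strand == tox.strand
            if same_locus and gap(tox, g) < max_gap:
                partners.add(g.family)
        hits.update(partners)  # each toxin counts once per antitoxin family
    if not toxins:
        return {}
    return {fam: 100.0 * hits[fam] / len(toxins) for fam in antitoxin_families}
```

The nested loop is quadratic in the number of genes, which is acceptable for a sketch; a genome-scale run would sort genes by contig and position and compare only adjacent entries.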
Functions and catalytic strategies of selected PIN-like domains
| Enzyme | PIN-like cluster | Endonucleolytic activity | 5΄–3΄ exonucleolytic activity | Biological function | Active site | Metal ions in tertiary structures |
|---|---|---|---|---|---|---|
|  | 5_3_exonuc_N | DNA, RNA and RNA-DNA (preferentially cleaves on the junction between a 5΄ single-strand and duplex, i.e., 5΄ flap) ( | DNA (single-stranded ( | DNA replication: removal of the RNA primers from lagging strand fragments. DNA repair: mediation of the nick translation. | Two metal binding sites, A and B (7 x Asp/Glu: D13, D63, E113, D115, D116, D138, D140) | One Zn2+ bound in the crystal structure of |
|  | 5_3_exonuc_N | DNA (5΄ flap and pseudo flap-like structures) ( | — ( | Unknown biological function. As mutations of this gene are synthetically lethal with those in polymerase I, the protein has been implied in Okazaki fragment maturation ( | Metal binding site A (5 x Asp/Glu: D9, D50, E102, D104, D127) | Two Mg2+ 2.5 Å apart bound in the crystal structure of the complex with DNA (PDB ID: 3zd8). One K+ bound at an interface between the H3TH domain and DNA. Ca2+ has inhibitory effect ( |
| Bacteriophage T4 RNase H (rnh) | 5_3_exonuc_N | DNA (junction between a single-strand and duplex) ( | DNA (dsDNA), RNA-DNA ( | DNA replication: removal of the RNA primers from lagging strand fragments ( | Sites A and B (7 x Asp/Glu: D19, D71, E130, D132, D155, D157, D200) | Two Mg2+ 7 Å apart bound in the crystal structure (PDB ID: 1tfr). No metal ions in the complex with fork DNA (PDB ID: 2ihn). |
| Human exonuclease 1 (EXO1) | XPG_I | DNA (5΄ flap and pseudo flap-like structure-specific), RNA-DNA (RNA primer removal from Okazaki fragments) ( | DNA (dsDNA, low activity on ssDNA) ( | DNA replication: removal of the RNA primers from lagging strand fragments ( | Sites A and B (7 x Asp/Glu: D30, D78, E150, D152, D171, D173, D225) | Two Mn2+ 4.1 Å apart bound in the crystal structure of the complex with DNA (PDB ID: 3qeb). |
| Human flap endonuclease 1 (FEN1) | XPG_I | DNA (5΄ flap and pseudo flap-like structure-specific, gapped DNA duplex, not ssDNA and dsDNA) ( | DNA (nicked or gapped dsDNA ( | DNA replication: removal of the RNA primers from lagging strand fragments, resolution of stalled DNA replication forks. DNA repair: long-patch base excision repair ( | Sites A and B (7 x Asp/Glu: D34, D86, E158, E160, D179, D181, D233) | Two Mg2+ 3.4 Å apart bound in the crystal structure of the complex with PCNA (PDB ID: 1ul1). |
| Human gap endonuclease 1 (GEN1) | XPG_I | DNA (5΄ flap, replication fork, Holliday junction) ( | — ( | Homologous recombination: Holliday junction resolution ( | Sites A and B (7 x Asp/Glu: D30, E75, E134, E136, D155, D157, D208) | One Mg2+ bound in the crystal structure of the complex with DNA (PDB ID: 5t9j). |
| Human DNA repair protein complementing XP-G cells (ERCC5) | XPG_I | DNA (single-stranded structure-specific, including bubble and splayed arm substrates) ( | ? | DNA repair: nucleotide excision repair (NER) ( | Sites A and B (7 x Asp/Glu: D30, D77, E789, E791, D810, D812, D861) | — |
| Virion host shutoff protein (UL41) | XPG_I | RNA (mRNA) ( | ? | Decay of host mRNAs ( | Sites A and B (7 x Asp/Glu: D34, D82, E192, D194, D213, D125, D261) ( | — |
| Human exoribonuclease 1 (XRN1) | XRN_N | ? | RNA (5΄ monophosphorylated single-stranded or duplex substrates ( | RNA decay: major 5΄–3΄ exoribonuclease in mRNA decay ( | Sites A and B (7 x Asp/Glu: D35, D86, E176, E178, D206, D208, D292) | One Mg2+ bound in the crystal structure of |
|  | XRN_N | ? | RNA (5΄ monophosphorylated single-stranded substrates ( | Transcription termination ( | Sites A and B (7 x Asp/Glu: D55, D104, E205, E207, D235, D237, D336) | One Mg2+ bound in the crystal structure of the complex with Rai1 (PDB ID: 3fqd). |
| Arabidopsis thaliana PRORP1 | PRORP | RNA (tRNA or tRNA-like structures) ( | ? | tRNA maturation: 5΄ maturation of tRNA precursors ( | Site A (5 x Asp: D399, D493, D497, D474, D475) ( | Two Mn2+ bound in the crystal structure (PDB ID: 4g24). |
| Human endoribonuclease ZC3H12A (MCPIP1) | RNase_Zc3h12a | RNA (preferentially cleaves a stem-loop structure) ( | ? | mRNA decay ( | Site A (5 x Asp: D141, D225, D226, D244, D248) ( | One Mg2+ bound in the crystal structure (PDB ID: 3v33). Nuclease active with Mg2+ and Mn2+, but not with Fe2+, Zn2+ or Ca2+ ( |
|  | PIN_6 | RNA (single-stranded region of a hairpin, i.e. site D of 18S rRNA) ( | ? | rRNA maturation ( | Site A (4 x Asp/Glu: D15, E43, D92, D110; but D110 is not essential for function) ( | NMR structure of |
| Human telomerase-binding protein EST1A (SMG6) | PIN_4.COG1875 | RNA (single-stranded, not double-stranded, preferentially cleaves within a degenerate pentameric motif ( | ? | Nonsense-mediated mRNA decay ( | Site A (4 x Asp/Glu: D1251, E1282, D1353, D1392) | No metal ions in the crystal structure (PDB ID: 2hww). Nuclease active with Mn2+ and, to a much lesser extent, Mg2+ ( |
| Human exosome complex exonuclease RRP44 (DIS3) | Rrp44 | RNA (single-stranded, preferentially 5΄ monophosphorylated, as shown for yeast homolog) ( | ? | Exosome-mediated mRNA decay ( | Site A (4 x Asp/Glu: D69, E97, D146, D177) | No metal ions in the crystal structure of the exosome complex (PDB ID: 4ifd). Nuclease active with Mn2+, Mg2+ and Zn2+. |
| Yeast rRNA-processing protein UTP24 | Fcf1 | RNA (sequence-specific, cleaves sites A1 and A2 of 18S pre-rRNA) ( | ? | rRNA maturation ( | Site A (4 x Asp/Glu: D68, E105, D139, D157) | No metal ions in the PIN-like domain in the crystal structure (PDB ID: 4mj7). Nuclease active with Mn2+ and Mg2+. |
| Mycobacterium tuberculosis VapC15 | PIN.COG1487 | RNA (cleaves tRNA3Leu-CAG) ( | ? | Toxin-antitoxin (with VapB15 as an antitoxin) ( | Site A (5 x Asp/Glu: D4, E42, D96, D114, D116). | Mn2+-Mg2+ pair bound in the crystal structure of the heterotrimeric complex (VapBC2) with antitoxin (PDB ID: 4chg). Both metals are shared by the toxin-antitoxin pair. |
|  | PIN.COG1487 | RNA (low activity on double-stranded RNA), no activity on dsDNA ( | ? | Potential toxin-antitoxin (with VapB5 as an antitoxin). | Site A (4 x Asp/Glu: D26, E57, D115, D135) | No metal ions bound in the crystal structure of the complex with antitoxin (PDB ID: 3dbo). Nuclease active with Mg2+. |
|  | PIN.COG1487 | RNA (structure- and sequence-specific, cleaves initiator tRNA between the anticodon stem and loop, but does not cleave mRNA, rRNA or tmRNA), no activity on ssDNA or dsDNA ( | ? | Toxin-antitoxin: inhibition of translation initiation and translation activation at elongated codons ( | Site A (4 x Asp/Glu/Asn: D6, E43, D99, E120 or N117; a polar residue is required at this position ( | Crystal structure not available, only a 3D model ( |
|  | PIN.COG1487 | RNA (cleaves single-stranded RNA) ( | ? | Toxin-antitoxin: pathogenesis ( | Site A (4 x Asp/Glu: D6, E43, D99, E120) | No metal ions in the crystal structure of the complex with VapB2 and its promoter DNA (PDB ID: 3zvk). |
|  | PIN.COG4113 | RNA (cleaves single-stranded, G-rich RNA) ( | ? | Potential toxin-antitoxin (with PAE2755 as an antitoxin). | Site A (4 x Asp/Glu: D8, E39, D92, D1180) ( | No metal ions in the crystal structure of the dimer (PDB ID: 1v8p). Nuclease active with Mn2+ and Mg2+. |
Further description of the terms used in the table can be found in the main text.