Kamil Steczkiewicz, Anna Muszewska, Lukasz Knizewski, Leszek Rychlewski, Krzysztof Ginalski.
Abstract
Proteins belonging to PD-(D/E)XK phosphodiesterases constitute a functionally diverse superfamily with representatives involved in replication, restriction, DNA repair and tRNA-intron splicing. Their malfunction in humans triggers severe diseases, such as Fanconi anemia and xeroderma pigmentosum. To date, there have been several attempts to identify and classify new PD-(D/E)XK phosphodiesterases using remote homology detection methods. Such efforts are complicated because the superfamily exhibits extreme sequence and structural divergence. Using advanced homology detection methods supported with superfamily-wide domain architecture and horizontal gene transfer analyses, we provide a comprehensive reclassification of proteins containing a PD-(D/E)XK domain. The PD-(D/E)XK phosphodiesterases span over 21,900 proteins, which can be classified into 121 groups of various families. Eleven of them, including DUF4420, DUF3883, DUF4263, COG5482, COG1395, Tsp45I, HaeII, Eco47II, ScaI, HpaII and Replic_Relax, are newly assigned to the PD-(D/E)XK superfamily. Some groups of PD-(D/E)XK proteins are present in all domains of life, whereas others occur within small numbers of organisms. We observed multiple horizontal gene transfers, even between human pathogenic bacteria or from Prokaryota to Eukaryota. Uncommon domain arrangements greatly elaborate the PD-(D/E)XK world. These include domain architectures suggesting regulatory roles in Eukaryotes, like stress sensing and cell-cycle regulation. Our results may inspire further experimental studies aimed at identification of exact biological functions, specific substrates and molecular mechanisms of reactions performed by these highly diverse proteins.
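The signature that names the superfamily can be illustrated with a naive pattern scan. The sketch below is not the method used in the paper, which relies on remote homology detection precisely because the motif is too divergent for simple matching. The function name `find_signature` and the spacer bounds `min_gap` and `max_gap` are hypothetical choices made for illustration only.

```python
import re

def find_signature(seq, min_gap=5, max_gap=40):
    """Return (start, end, text) for each naive 'PD ... (D/E)XK' hit.

    min_gap and max_gap bound the spacer between the 'PD' and '(D/E)XK'
    halves; both values are illustrative assumptions, not from the paper.
    """
    # 'PD', a lazily matched spacer, then D or E, any residue, then K.
    pattern = re.compile(r"PD.{%d,%d}?[DE].K" % (min_gap, max_gap))
    return [(m.start(), m.end(), m.group()) for m in pattern.finditer(seq.upper())]

# Toy sequence invented for this example.
print(find_signature("MKTPDAAAAAAAEVKLL"))  # [(3, 15, 'PDAAAAAAAEVK')]
```

A scan like this would miss the active site variants and circular permutations shown in the figures, which is one way to see why sequence-profile and structure-based methods are needed here.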
Year: 2012 PMID: 22638584 PMCID: PMC3424549 DOI: 10.1093/nar/gks382
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. The commonly conserved core of the PD-(D/E)XK nuclease fold. Critical active site residues are shown as red sticks and marked in the corresponding sequence logo. The sequence logo was derived from a multiple sequence alignment of the PD-(D/E)XK phosphodiesterase superfamily using WebLogo (20).
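As a rough illustration of what a sequence logo encodes, the sketch below computes the per-column information content (in bits) that determines the stack height at each alignment position. It is a simplified version that omits the small-sample correction applied by logo tools, and the toy alignment and function names are assumptions made for this example, not data or code from the paper.

```python
import math
from collections import Counter

def column_information(column, alphabet_size=20):
    """Information content of one alignment column, in bits.

    Gaps ('-') are ignored. No small-sample correction is applied,
    which is a simplification relative to real logo software.
    """
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    # Shannon entropy of the observed residue frequencies.
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(alphabet_size) - entropy

def logo_heights(alignment):
    """Per-column information content for equal-length aligned sequences."""
    return [column_information(col) for col in zip(*alignment)]

# Toy alignment invented for illustration.
heights = logo_heights(["PDK", "PEK", "PDR", "PEK"])
```

In this toy case the fully conserved first column reaches the maximum of log2(20) bits, while the D/E column scores one bit lower, mirroring how a conserved active site position stands out in a logo.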
Figure 2. Multiple sequence alignment for the conserved core regions of the PD-(D/E)XK superfamily. Each group of closely related Pfam, COG, KOG families and PDB90 structures (detectable with PSI-BLAST) is represented by an available PDB90 sequence or by a selected representative if the cluster does not contain a solved structure. Sequences are labeled according to the group number followed by the NCBI gene identification number or PDB code. The first residue numbers are indicated before each sequence, while the numbers of excluded residues are specified in parentheses. The sequence given in italics corresponds to a circularly permuted α-helix. Residue conservation is denoted with the following scheme: uncharged, highlighted in yellow; polar, highlighted in grey; active site PD-(D/E)XK signature residues, highlighted in black; other conserved polar/charged residues augmenting the active site, highlighted in red. Locations of secondary structure elements are shown above the corresponding alignment blocks.
One hundred and twenty-one groups of proteins retaining PD-(D/E)XK nuclease fold
| No. | Name | Biological function | Taxonomy | HGTs | ||||||
|---|---|---|---|---|---|---|---|---|---|---|
| Detailed distribution | ||||||||||
| 1 | NaeI | PF09126 | ( | Type II Restriction Endonuclease ( | + | Bacteria | ||||
| 1ev7 | ||||||||||
| 2 | BglI | 1dmu | ( | Type II Restriction Endonuclease ( | + | Bacteria | Only four sequences from distant taxa: | |||
| 3 | HpaII | PF09561 | New | Type II Restriction Endonuclease ( | + | Bacteria ( | ||||
| 4 | NgoBV, NlaIV | PF09564 | ( | Type II Restriction Endonuclease ( | + | Bacteria | Multiple transfers, animal related bacteria. Single representatives of: Spirochaetes, Fusobacteria, Tenericutes, ε-proteobacteria, Clostridia, Bacilli. | |||
| 5 | ScaI | PF09569 | New | Type II Restriction Endonuclease ( | + | Bacteria | Multiple transfers. Ecologically and taxonomically unrelated bacteria from Bacilli, Proteobacteria, Cyanobacteria, Bacteroidetes. | |||
| 6 | LlaMI, ScrFI | PF09562 | ( | Type II Restriction Endonuclease ( | + | Bacteria ( | One clade grouping: Lachnospiraceae bacterium (Clostridiales), | |||
| 7 | PvuII | PF09225 | ( | Type II Restriction Endonuclease ( | + | Bacteria | ||||
| 3ksk | ||||||||||
| 8 | XamI | PF09572 | ( | Type II Restriction Endonuclease ( | + | {1} | Bacteria | Patchy distribution including a Haloarcheon— | ||
| 9 | XhoI | PF04555 | ( | Type II Restriction Endonuclease ( | + | {1} | Bacteria ( | |||
| 10 | ApaLI | PF09499 | ( | Type II Restriction Endonuclease ( | + | Bacteria | Multiple transfers, | |||
| 11 | BamHI | PF02923 | ( | Type II Restriction Endonuclease ( | + | Bacteria | Multiple transfers, extremophilic and/or aquatic bacteria. | |||
| 1bam, 3odh | ||||||||||
| 12 | BstYI, BglII | PF09195 | ( | Type II Restriction Endonuclease ( | {1} | + | Bacteria | Multiple transfers for example | ||
| 1sdo, 1d2i | ||||||||||
| 13 | SacI | PF09566 | ( | Type II Restriction Endonuclease ( | + | Bacteria ( | Multiple transfers. Patchy distribution: single sequences Bacteroides, Actinobacteria, γ-proteobacteria, ε-proteobacteria. | |||
| 14 | Eco47II | PF09553 | New | Type II Restriction Endonuclease ( | {1} | + | Bacteria | |||
| 15 | HaeII | PF09554 | New | Type II Restriction Endonuclease ( | + | Bacteria | Cyanobacteria sequences not grouped. Single sequences from Cyanobacteria, Bacteroidetes. | |||
| 16 | HindIII | PF09518 | ( | Type II Restriction Endonuclease ( | + | Bacteria | Multiple transfers: | |||
| 3a4k | ||||||||||
| 17 | FokI | PF09254 | ( | Type II Restriction Endonuclease ( | + | Bacteria | ||||
| 2fok | ||||||||||
| 18 | EcoO109I | 1wtd | ( | Type II Restriction Endonuclease ( | + | Bacteria | No HGT observed | |||
| 19 | EcoRV | PF09233 | ( | Type II Restriction Endonuclease ( | + | {2} | Bacteria | |||
| 1eo3 | ||||||||||
| 20 | EcoRI | PF02963 | ( | Type II Restriction Endonuclease ( | + | {1} | Bacteria ( | |||
| 2oxv | ||||||||||
| 21 | XcyI | PF09571 | ( | Type II Restriction Endonuclease ( | + | Bacteria | ||||
| 22 | BsoBI | PF09194 | ( | Type II Restriction Endonuclease ( | + | Bacteria | ||||
| 1dc1 | ||||||||||
| 23 | HincII | PF09226 | ( | Type II Restriction Endonuclease ( | + | Bacteria ( | Oral bacterium | |||
| 3ebc | ||||||||||
| 24 | SinI, AvaII | PF09570 | ( | Type II Restriction Endonuclease ( | + | Bacteria | Patchy distribution | |||
| 25 | NgoPII | PF09521 | ( | Type II Restriction Endonuclease ( | + | + | Prokaryota | Patchy distribution, possible transfer between | ||
| 26 | Tsp45I | PF06300 | New | Type II Restriction Endonuclease ( | + | Bacteria | Possible transfer between | |||
| 27 | MspI | PF09208 | ( | Type II Restriction Endonuclease ( | + | Bacteria | Two γ-proteobacteria ( | |||
| 1sa3 | ||||||||||
| 28 | MjaII | PF09520 | ( | Type II Restriction Endonuclease ( | + | + | Prokaryota | Possible transfer between Archaea and Bacteria. Patchy distribution | ||
| 29 | MunI | PF11407 | ( | Type II Restriction Endonuclease ( | + | {1} | Bacteria | |||
| 1d02 | ||||||||||
| 30 | CfrBI | PF09516 | ( | Type II Restriction Endonuclease ( | + | Bacteria ( | Anaerobic ammonium-oxidizing candidatus | |||
| 31 | NgoMIV | PF09015 | ( | Type II Restriction Endonuclease ( | + | Bacteria | ||||
| 1fiu | ||||||||||
| 32 | Cfr10I, Bse634I, SgrAI | PF07832 | ( | Type II Restriction Endonuclease ( | + | Bacteria | ||||
| 1cfr, 1knv | ||||||||||
| 3dpg | ||||||||||
| 33 | Bpu10I | PF09549 | ( | Type II Restriction Endonuclease ( | + | Bacteria | Multiple transfer events. One clade encompasses representatives of Cyanobacteria (Cyanothece and Nodularia), Proteobacteria ( | |||
| 34 | BspD6I, AlwI, MlyI | PF09491 2ewf, 2p14 | ( | Type II Restriction Endonuclease | + | {1} | Bacteria | |||
| Restriction Endonuclease ( | ||||||||||
| 35 | LlaJI, McrBC | PF09563 | ( | Type II Restriction Endonuclease ( | + | + | {1} | Prokaryota | ||
| PF10117 | ||||||||||
| COG4268 | ||||||||||
| 36 | SdaI, BsuBI | PF06616 | ( | Type II Restriction Endonuclease ( | + | {1} | Bacteria | |||
| 2ixs | ||||||||||
| 37 | DpnII, MboI | PF04556 | ( | Type II Restriction Endonuclease ( | + | + | Prokaryota | |||
| 38 | Ecl18kI, EcoRII, PspGI | PF09019 | ( | Type II Restriction Endonuclease ( | {2} | + | {1} | Bacteria | ||
| 2fqz, 1na6 | ||||||||||
| 3bm3 | ||||||||||
| 39 | HinP1I | PF11463 | ( | Type II Restriction Endonuclease ( | + | Bacteria | ||||
| 1ynm | ||||||||||
| 40 | NotI | PF12183 | ( | Type II Restriction Endonuclease ( | + | Bacteria | ||||
| 3bvq | ||||||||||
| 41 | Bsp6I | PF09504 | ( | Type II Restriction Endonuclease ( | {1} | + | Bacteria | |||
| 42 | HindVP, HgiDI, BsaHI | PF09519 | ( | Type II Restriction Endonuclease ( | + | Bacteria | Patchy taxonomic distribution | |||
| 43 | MjaI | PF09568 | ( | Type II Restriction Endonuclease | {1} | + | + | Prokaryota | ||
| 44 | TaqI | PF09573 | ( | Type II Restriction Endonuclease ( | + | Bacteria | ||||
| 45 | SfiI | PF11487 | ( | Type II Restriction Endonuclease ( | + | Bacteria | No HGT observed, the phylogeny could not be resolved with reliable confidence | |||
| 2ezv | ||||||||||
| 46 | MvaI, BcnI | 2odh, 2oa9 | ( | Type II Restriction Endonuclease ( | + | {2} | Bacteria | |||
| 47 | ThaI | 3ndh | ( | Type II Restriction Endonuclease ( | + | Archaea | No HGT observed | |||
| 48 | HSDR_N, HSDR_N_2, EcoR124I | PF04313 PF13588 COG4748 COG2810 COG0610 2w00, 3h1t | ( | Type I Restriction Endonuclease ( | + | + | {1} | Prokaryota | ||
| Type IV Restriction Endonuclease (predicted, found mostly in Archaea) | ||||||||||
| 49 | HindVIP, EcoPI | COG4889 COG4096 COG3421 COG3587 3s1s | ( | Type I Restriction Endonuclease Type II Restriction Endonuclease ( Type III Restriction Endonuclease ( Broad sequence and function profile due to wide, multidomain definitions of COG entities | + | + | + | Prokaryota & phages | ||
| 50 | Mrr_cat, DUF2034 | PF04471 PF10356 COG4127 COG1715 COG1787 1y88 | ( | Mrr restriction endonuclease ( DUF2034 function is unknown. | {2} | + | + | + | Eukaryota ( Bacteria Archaea Phages | No HGT observed, the phylogeny could not be resolved with reliable confidence |
| 51 | Archaeal HJC | PF01870 COG1591 1hh1, 1gef, 1ob8, 2wcw, 2eo0 | ( | HJC resolvase ( | + | + | + | Prokaryota | A handful of unrelated bacteria: | |
| 52 | ERCC4, XPF, Mus81 | PF02732 KOG0442 KOG2379 COG1948 1j22, 2bgw, 2ziu, 2zix, 2ziv | ( | HJC resolvase ( DNA repair, structure specific endonuclease | + | + | Archaea & Eukaryota | No HGT observed | ||
| 53 | RecU, HJC Resolvase, Penicillin-binding protein-related factor A | PF03838 COG3331 1zp7, 1y1o | ( | HJC resolvase ( | + | Bacteria | ||||
| 54 | Bacteriophage T7 endonuclease I, Phage_endo_I | PF05367 | ( | HJC resolvase ( | + | + | + | Prokaryota & phages | ||
| 2pfj | ||||||||||
| 55 | tRNA intron endonuclease | PF01974 KOG4133 KOG4685 COG1676 1a79, 2cv8, 2gjw, 2zyz, 2ohe, 3iey, 3if0, 3ajv, 3p1y | ( | tRNA intron endonuclease, in the proximity of various tRNA synthases in archaeal genomes. | + | + | Archaea & Eukaryota | No HGT observed | ||
| 56 | Sen15 | PF09631 PF12858 2gw6 | ( | A structural subunit of eukaryotic tRNA intron endonuclease ( | + | Eukaryota | No HGT observed | |||
| 57 | MutH | PF02976 COG3066 1azo, 2aoq | ( | Mismatch repairing enzyme ( | + | Bacteria | ||||
| 58 | VSR, DUF559, DUF2726 | PF04480 PF03852 COG3727 COG2852 1cw0, 3hrl, 3r3p | ( | Very short patch repair (Vsr) endonuclease that specifically removes T/G mismatches in DNA sequences targeted to cytosine methyltransferase ( Group I intron homing endonuclease ( | {1} | + | + | Prokaryota | No HGT observed | |
| 59 | TnsA | PF08722 | ( | Transposase ( | + | {1} | Bacteria | |||
| 1t0f | ||||||||||
| 60 | XisH | PF08814 | Pfam | fdxN element excision controlling factor ( | + | Bacteria | ||||
| 2inb, 2okf | ||||||||||
| 61 | DUF83, Cas_Cas4 | PF01930 | ( | Cas1 protein (YgbT) has nuclease activity against single-stranded and branched DNAs including HJC, replication forks and 5′-flaps ( | + | + | {1} | Prokaryota | Phylogeny not resolved. | |
| COG1468 | ||||||||||
| COG2251 | ||||||||||
| 62 | RecBCD, Exonuclease V | PF04257 COG1330 COG3857 COG1074 1w36 | ( | Exonuclease/helicase, a component of the RecBCD complex that handles double-strand breaks (DSB) ( | + | {1} | Bacteria | |||
| 63 | DUF2800, PDDEXK_1 | PF10926 PF12705 COG2887 | ( | RecB-like, probable prophage proteins | + | + | Bacteria phages | |||
| 64 | Viral alkaline exonuclease | PF01771 | ( | Exonuclease processing viral genome during recombination ( | + | Herpesvirales | No HGT observed | |||
| 2w45, 3fhd | ||||||||||
| 65 | YqaJ, lambda-exonuclease | PF09588 COG5377 1avq, 3k93, 3slp | ( | Exonuclease facilitating phage DNA recombination ( | + | + | + | Bacteria Eukaryota phages | No HGT observed | |
| 66 | RecE, DUF3799 | PF12684 | ( | Exonuclease from RecET recombination system ( | + | + | Bacteria phage | No HGT observed, the phylogeny could not be resolved with reliable confidence | ||
| 3h4r, 3l0a | ||||||||||
| 67 | DEM1, EXO5 | PF09810 | Pfam | Mitochondrial, single-strand-specific 5′-exonuclease releasing dinucleotides as the main products of catalysis. EXO5 binds to 5′-RNA termini of chimeric DNA–RNA molecules and, after sliding across the RNA substrate, cuts the DNA 2 nt from the RNA–DNA junction ( | {1} | + | + | + | Archaea | |
| KOG4760 | Eukaryota | |||||||||
| 68 | ssp6803i | PF11645 | ( | Homing endonuclease with a specificity profile extending over a long (17-bp) target site ( | + | + | Prokaryota | Patchy distribution, including 5 haloarchaeal and 2 Ktedonobacter sequences, as well as Bacillus forming a sister clade to 5 cyanobacterial sequences, suggests an HGT history | ||
| 2ost | ||||||||||
| 69 | Rpb5 N-terminal domain | PF03871 | ( | RNA Polymerase ( | + | Eukaryota | No HGT observed | |||
| KOG3218 | ||||||||||
| 1dzf, 3h0g | ||||||||||
| 70 | Arenavirus RNA polymerase N-terminal domain, virus L-Protein | PF06317 | ( | RNA Polymerase N-terminal domain that utilizes ‘cap snatching’ mechanism for viral mRNA transcription ( | + | Arenavirus | No HGT observed | |||
| 3jsb | ||||||||||
| 71 | RecB, DUF91 | PF01939 | ( | DNA endonuclease specialized in cleavage at double-stranded DNA (dsDNA)/ssDNA junctions on branched DNA substrates ( | + | + | Prokaryota | All 3 sequences from Deinococcus-Thermus are located within the Archaea clade. The Proteobacteria sequences are close to the root, this topology is not well resolved | ||
| COG1637 | ||||||||||
| 2vld | ||||||||||
| 72 | ERCC1-XPF, Swi10, Rad10 | PF03834 | ( | Nuclease of NER system incising oligonucleotide from damaged DNA strand ( | + | Eukaryota | No HGT observed | |||
| KOG2841 | ||||||||||
| COG5241 | ||||||||||
| 2a1i | ||||||||||
| 73 | La Crosse virus L-protein | 2xi5 | ( | Cap-snatching Endonuclease; cleaves short and capped host primers that are subsequently used by viral RNA-dependent RNA polymerase to transcribe viral mRNAs ( | + | Bunyaviridae | No HGT observed | |||
| 74 | Viral L-protein | PF00603 | ( | Cap-snatching Endonuclease, mechanism identical to that described above ( | + | Influenza A virus | Phylogeny not resolved | |||
| 3hw3 | ||||||||||
| 75 | D212 | PF12187 | ( | Uncharacterized nuclease suggested to take part in DNA replication, repair, or recombination ( | + | + | Archaea ( | Phages and prophages of Sulfolobus, together form one coherent clade | ||
| 2w8m | ||||||||||
| 76 | Archaea bacterial proteins of unknown function, DUF234 | PF03008 | ( | DEXX-box ATPase belonging to AAA+ superfamily; DEXX-box ATPases act to transduce the energy of ATP-hydrolysis into a conformational stress required for the remodeling of nucleic acid or protein–nucleic acid structure ( | + | + | Prokaryota | Two | ||
| COG1672 | ||||||||||
| 77 | RAI1-like, Dom-3z | PF08652 | ( | Exoribonuclease. Has a pyrophosphohydrolase activity towards 5′-triphosphorylated RNA ( | + | Eukaryota | No HGT observed | |||
| KOG1982 | ||||||||||
| 3fqg, 3fqi | ||||||||||
| 78 | NARG2 | PF10505 | ( | Nuclear protein involved in regulating the thickness of the brain’s cortical gray matter ( | + | Eukaryota | No HGT observed | |||
| 79 | DUF911, Dna2 | PF06023 PF08696 KOG1805 COG4343 | ( | Dna2 processes common structural intermediates that occur during diverse DNA processing (e.g. lagging strand synthesis and telomere maintenance) ( | + | + | + | Prokaryota & Eukaryota | Very long branches, dubious positioning of various taxa | |
| 80 | YhgA-like | PF04754 | ( | Putative transposase ( | + | Bacteria | Three | |||
| COG5464 | ||||||||||
| 81 | CoiA-like | PF06054 | ( | Negative regulator of competence. CoiA is probably involved after DNA uptake, either in DNA processing or recombination ( | + | Bacteria | No HGT observed | |||
| COG4469 | ||||||||||
| 82 | DUF524 | PF04411 | ( | Predicted restriction endonuclease ( | + | + | Bacteria & Euryarchaeota | Mixed clades like: | ||
| COG1700 | ||||||||||
| 83 | Mitochondrial protein Pet127 | PF08634 | ( | 5′-exonuclease responsible for processing the precursor to the mature form ( | + | Alveolata Fungi Myxomycota Excavata | Distribution limited to different unicellular eukaryotes; not enough sequencing data for an HGT hypothesis | |||
| 84 | Eukaryotic translation initiation factor 3 subunit 7, eIF-3-zeta, eIF3 p66, moe1 | PF05091 | ( | eIF3 p66 is the major RNA-binding subunit of the eIF3 complex; Cdc48, Yin6 and Moe1 act in the same protein complex to concertedly control ERAD and chromosome segregation ( | + | Eukaryota | No HGT observed | |||
| KOG2479 | ||||||||||
| 85 | Secreted endonuclease distantly related to HJC resolvase | PF10107 | ( | Predicted secreted endonuclease distantly related to archaeal HJC resolvase | + | + | {1} | Prokaryota | A sequence from a bacteria-feeding nematode | |
| COG4741 | ||||||||||
| 86 | DUF1064 | PF06356 | ( | Unknown. In Firmicutes, co-occurs with RecT, DnaC, DnaB and SSB, which suggests a role in recombination. In Proteobacteria, phage proteins are also present. | + | + | Bacteria phages | |||
| 87 | DUF790 | PF05626 | ( | Unknown. Co-occurs with ResIII and helicase domains. | + | + | Prokaryota | A single sequence of | ||
| COG3372 | ||||||||||
| 88 | VRR-NUC | PF08774 | ( | A DNA repair nuclease recruited to DNA damage by monoubiquitinated FANCD2 ( | + | + | + | Bacteria & Eukaryota & phages | No HGT observed | |
| KOG2143 | ||||||||||
| 89 | RmuC | PF02646 | ( | Molecular function unknown. Involved in DNA recombination ( | + | Bacteria | ||||
| COG1322 | ||||||||||
| 90 | Uncharacterized conserved protein | COG5482 | New | Unknown | {2} | + | {1} | Bacteria | ||
| 91 | Predicted transcriptional regulator | COG1395 | New | The function is unknown but it likely binds nucleic acids. Harbors a HTH motif, co-occurs with a two-domain protein consisting of DUF1743 and tRNA_anti (PF01336) nucleic acid-binding OB-fold domain. | + | Archaea | No HGT observed | |||
| 92 | DUF1052 | PF06319 | Pfam | Co-occurs with HisKA and Lactamase_B or YkuD (PF03734) which also gives β-lactam resistance. | {1} | + | Bacteria | An uncultured Acidobacterium within a Rhizobiales clade with | ||
| COG5321 | ||||||||||
| 3dnx | ||||||||||
| 93 | Sugar fermentation stimulation protein SfsA | PF03749 | ( | Unknown, SfsA protein binds to DNA non-specifically ( | + | Bacteria | ||||
| COG1489 | ||||||||||
| 94 | NERD | PF08378 | ( | Unknown, described as nuclease-related ( | + | {2} | Bacteria | |||
| 95 | DUF1626 | PF07788 | ( | Unknown | + | + | Prokaryota | |||
| COG5493 | ||||||||||
| 96 | UPF0102, RPA0323 | PF02021 | Pfam | Is often found with a TP_methylase (PF00590) domain. Tetrapyrrole (Corrin/Porphyrin) Methylases use S-AdoMet in the methylation of diverse substrates. The genomic context is well conserved for each bacterial class. | + | + | Prokaryota | |||
| COG0792 | ||||||||||
| COG4998 | ||||||||||
| 3fov | ||||||||||
| 97 | DUF1887 | PF09002 | Pfam | Occasionally co-occurs with phosphorylase superfamily PNP_UDP_1 (PF01048) (uridine phosphorylase) and zinc/cadmium/mercury/lead-transporting ATPase. | + | + | Prokaryota | Three | ||
| 1xmx | ||||||||||
| 98 | DUF1016 | PF06250 | ( | Co-occurs with restriction MTase, ResIII and ResI S domains, and mobile element domains (phage integrase, DDE). Might act as nucleic acid-binding element in restriction enzymes. | {1} | + | {3} | {2} | Bacteria | |
| COG4804 | ||||||||||
| 99 | DUF1703 | PF08011 | ( | There are 9 DUF1703 proteins in | + | {1} | Bacteria | Nine sequences from | ||
| 100 | DUF4143 | COG1373, PF13635 | Pfam | Unknown | + | + | Prokaryota | |||
| 101 | DUF511 | PF04373 | ( | Unknown | + | Bacteria | Unrelated sequences from Fibrobacterales, Chlorobiales, Clostridiales, Flavobacteriales and Bacteroidales on a Proteobacteria tree | |||
| COG2958 | ||||||||||
| 102 | DUF2887 | PF11103 | ( | Unknown. Co-occurs with transport related proteins. | + | Bacteria | ||||
| 103 | Restriction endonuclease-like fold superfamily protein | 3ijm | PDB | Unknown | + | No HGT observed | ||||
| 104 | DUF1853 | PF08907 | ( | Unknown. The genomic context is conserved within bacterial families. | + | Bacteria ( | ||||
| COG3782 | ||||||||||
| 105 | UL24 | PF01646 | ( | The molecular mechanism is unknown; however, the UL24 protein is able to induce G2 cell-cycle arrest ( | + | + | Herpesvirales | No HGT observed | ||
| 106 | DUF506 | PF04720 | ( | Unknown | Plantae | No HGT observed | ||||
| Green algae | ||||||||||
| 107 | TT1808, DUF820, Uma2 | PF05685 | ( | Predicted endonuclease. In Cyanobacteria the genomic context is well conserved. In γ-proteobacteria the context is not conserved and involves mobile elements, suggesting recent mobility and/or acquisition. | + | Bacteria | Proteobacteria sequences within Firmicutes or Cyanobacteria clades. Very long branches. Multiple transfers | |||
| COG4636 | ||||||||||
| 1wdj, 3ot2 | ||||||||||
| 108 | DUF1780 | PF08682 | SCOP | Unknown. Well conserved context | + | Bacteria | No HGT observed | |||
| 1y0k | ||||||||||
| 109 | DUF2130 | PF09903 | Pfam | Unknown | + | {1} | Bacteria | |||
| COG4487 | ||||||||||
| 110 | DUF2726 | PF10881 | Pfam | Unknown. In Fusobacteria DUF2726 proteins are surrounded by mobile elements. This feature is less pronounced in other bacteria. | + | + | Bacteria | Multiple transfers. | ||
| 111 | RAP domain | PF08373 | Pfam | Unknown. Initially claimed to bind RNA and abundant in Apicomplexans, present in proteins involved in mitochondrial stress sensing ( | {1} | + | Eukaryota | |||
| 112 | YaeQ | PF07152 | ( | Located with bleomycin resistance (Glyoxalase) and Acetyltransf_1 (GNAT). In | + | Bacteria | ||||
| COG4681 | ||||||||||
| 2ot9, 2g3w | ||||||||||
| 3c0u | ||||||||||
| 113 | PDDEXK_2 | PF12784 | Pfam | Putative transposase | + | {1} | Bacteria | Phylogeny not resolved | ||
| 114 | PDDEXK_3 | PF13366 | Pfam | Unknown | + | + | + | Prokaryota & Viruses | Multiple transfers, mixed clades for Bacteria and Archaea or different Bacterial divisions | |
| 115 | PDDEXK_4 | PF14281 | Pfam | Unknown | + | + | {1} | Prokaryota | ||
| 116 | DUF4263 | PF14082 | New | Unknown | {1} | + | {2} | {1} | Bacteria | |
| 117 | DUF3883 | PF13020 | New | Unknown | + | + | + | Eukaryota & Prokaryota | Phylogeny not well resolved | |
| 118 | DUF4420 | PF14390 | New | Putative transposase | + | {2} | Bacteria | |||
| 119 | Replic_Relax | PF13814 | New | Plasmid replication ( | {1} | + | Bacteria ( | |||
| 120 | Dam-replacing protein | PF06044 | ( | DNA adenine methyltransferase replacing protein (DRP), a restriction endonuclease ( | {2} | + | {3} | Bacteria | Patchy distribution possibly due to multiple transfers | |
| 121 | TBP-interacting protein | 2czr | ( | A family of proteins that interact with TATA-binding protein (TBP) ( | + | Archaea | No HGT observed | |||
Groups include closely related families and structures that share relatively high sequence similarity detectable with PSI-BLAST and RPS-BLAST.
a The tree was not rooted due to the dubious position of the rooting sequence.
The curly brackets in the taxonomy columns indicate the number of sequences if a kingdom is represented by only a few sequences.
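The taxonomy notation described in these footnotes can be read mechanically. The sketch below parses a single taxonomy cell, treating a curly-bracket entry as a sequence count as stated above; reading `+` as "present" and an empty cell as "absent" is an assumption about the table's notation, and the function name `parse_presence` is hypothetical.

```python
import re

def parse_presence(cell):
    """Interpret one taxonomy cell from the table.

    '{n}' -> ('few', n)         n sequences, per the table footnote
    '+'   -> ('present', None)  assumed meaning of the plus sign
    ''    -> ('absent', None)   assumed meaning of an empty cell
    """
    cell = cell.strip()
    if cell == "+":
        return ("present", None)
    match = re.fullmatch(r"\{(\d+)\}", cell)
    if match:
        return ("few", int(match.group(1)))
    if cell == "":
        return ("absent", None)
    raise ValueError("unrecognized taxonomy cell: %r" % cell)
```

For example, the cells `+` and `{1}` in one row would parse to `("present", None)` and `("few", 1)`.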
Figure 3. Examples of structural diversity in the PD-(D/E)XK phosphodiesterase superfamily. (A) typical PD-(D/E)XK enzyme (Holliday junction resolvase, Pyrococcus furiosus, pdb|1gef); (B) highly diverged structure with short first β-strand and perpendicular orientation of core α-helices (Pa4535 protein, P. aeruginosa, pdb|1y0k); (C) structure deterioration and the loss of active site (RecC, E. coli, pdb|1w36C); (D) circular permutation of the first core α-helix (Hef endonuclease, Pyrococcus furiosus, pdb|1j22); (E) domain swapping (endonuclease I, Enterobacteria phage T7, pdb|2pfj). Active site PD-(D/E)XK signature residues are shown as red sticks.
Figure 4. Active site variations observed in the PD-(D/E)XK phosphodiesterase superfamily structures. The observed variant of the ‘PD-(D/E)XK’ signature motif is given below each structure, with residue migration denoted in blue. (A) archaeal HJC resolvase (P. furiosus, pdb|1gef); (B) BamHI restriction endonuclease (Oceanobacter kriegii, pdb|3odh); (C) BstYI restriction endonuclease (Geobacillus stearothermophilus, pdb|1sdo); (D) EcoO109I restriction endonuclease (E. coli, pdb|1wtd); (E) Bse634I restriction endonuclease (Geobacillus stearothermophilus, pdb|1knv); (F) tRNA splicing endonuclease (Methanocaldococcus jannaschii, pdb|1a79); (G) Vsr repair endonuclease (E. coli, pdb|1cw0); (H) a putative endonuclease-like protein (Neisseria gonorrhoeae, pdb|3hrl); (I) Pa4535 protein (P. aeruginosa, pdb|1y0k).
Figure 5. CLANS clustering of 21 911 sequences belonging to 121 clades of the PD-(D/E)XK superfamily. The image was drawn with an in-house script based on CLANS run files. (A) illustrates the taxonomic distribution of analyzed sequences and (B) summarizes their functional annotation.