Rob Van Houdt, Raphael Leplae, Gipsi Lima-Mendez, Max Mergeay, Ariane Toussaint.
Abstract
BACKGROUND: Tyrosine-based site-specific recombinases (TBSSRs) are DNA breaking-rejoining enzymes. In bacterial genomes, they play a major role in the comings and goings of mobile genetic elements (MGEs), such as temperate phage genomes, integrated conjugative elements (ICEs) or integron cassettes. TBSSRs are also involved in the segregation of plasmids and chromosomes, and in the resolution of plasmid dimers and of co-integrates resulting from the replicative transposition of transposons. The annotation of TBSSR genes in genomic sequences and databases remains far from robust. To improve it, we built a set of over 1,300 TBSSR protein sequences tagged with their genome of origin. We organized them in families to investigate: i) whether TBSSRs tend to be more conserved within than between classes of MGE types and ii) whether the (sub)families may help in understanding more about the function of TBSSRs found in tandem pairs or in trios on plasmids and chromosomes.
Year: 2012 PMID: 22502997 PMCID: PMC3414803 DOI: 10.1186/1759-8753-3-6
Source DB: PubMed Journal: Mob DNA
Figure 1. Size distribution of TBSSR families. Size distribution of the Famint families generated by MCL clustering at IF = 1.8 and E-value 0.01.
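As a rough illustration of the clustering step named in the Figure 1 caption, the sketch below runs a textbook Markov clustering (MCL) loop of expansion and inflation on a toy similarity matrix. It assumes that "IF" denotes the MCL inflation parameter and that graph edges come from pairwise sequence hits passing the E-value cutoff; neither the input matrix nor the function names come from the paper, and the authors presumably used the standard MCL program rather than code like this.

```python
def normalize_columns(m):
    """Scale each column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def expand(m):
    """Expansion step: square the matrix (two-step random walks)."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, r):
    """Inflation step: raise entries to the power r, then renormalize."""
    return normalize_columns([[x ** r for x in row] for row in m])


def mcl(similarity, inflation=1.8, iterations=50, eps=1e-6):
    """Toy MCL: returns clusters as sorted lists of node indices.

    `similarity` is a hypothetical symmetric matrix in which a positive
    entry stands for a pairwise hit that passed the E-value threshold.
    """
    n = len(similarity)
    # Add self-loops, as is customary before running MCL.
    m = [[similarity[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        m = inflate(expand(m), inflation)
    # Each non-empty row lists the members attracted to that node.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > eps)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Two groups of three mutually similar sequences, with no hits between groups.
toy_similarity = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
toy_families = mcl(toy_similarity, inflation=1.8)
```

Higher inflation values generally split the graph into more, smaller families, which is why the chosen value affects the size distribution shown in the figure.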
TBSSR family analysis
| Famint ID | MGE type | No. prot | No. GI | No. phages | No. prophages | No. plasmids | Putative catalytic motif |
|---|---|---|---|---|---|---|---|
| 0 | intG | 210 | 153 | 10(vir2) | 38(pro2) | 9(plas226) | no obvious one |
| 1 | RIT(A) | 64(4) | ND | 0 | 3 | 33(plas10) | RH-Y |
| 2 | RIT(C) | 63(4) | ND | 0 | 0 | 34(plas10) | RH-Y |
| 3 | mix | 61 | 0 | 5(vir2) | 15(fam202) | 41(plas170) | RK-Y |
| 4 | mix | 58 | 0 | 27(vir2) | 29(pro2) | 2(plas10) | RH-Y |
| 5 | RIT(B) | 54(4) | ND | 0 | 0 | 27(plas10) | RH-Y(1) |
| 6 | (pro)phage | 52 | 0 | 16(vir2) | 36(pro2) | 0 | RH/R-Y |
| 7 | (pro)phage | 47 | 0 | 18(vir2) | 29(pro2) | 0 | RHT/S-Y |
| 8 | IntI | 44 | 0 | 0 | 1(pro2) | 43(plas10) | RH-Y |
| 9 | plasmid | 43 | 0 | 0 | 0 | 43(plas10) | RHS-Y(1) |
| 10 | plasmid | 39 | 0 | 0 | 0 | 39(plas10) | RH-Y |
| 11 | plasmid | 37 | 0 | 0 | 0 | 37(plas101) | R-Y |
| 12 | (pro)phage | 35 | 0 | 8(vir2) | 27(pro2) | 0 | RHT-Y |
| 13 | (pro)phage | 30 | 0 | 11(vir2) | 19(pro2) | 0 | RH-Y |
| 14 | Tn | 25 | 13 | 0 | 5(pro2) | 7(plas226) | RH-Y |
| 15 | (pro)phage | 22 | 0 | 3(418) | 19(pro76) | 0 | RSL or RLY-Y |
| 16 | (pro)phage | 22 | 1 | 12(vir2) | 9(pro2) | 0 | RHT-Y |
| 17 | (pro)phage | 20 | 0 | 13(vir2) | 7(pro2) | 0 | RHS-Y |
| 18 | plasmid | 16 | 0 | 0 | 0 | 16(plas101) | RSG-Y |
| 19 | BIM(A)(4) | 14 | ND | 0 | 0 | 8(plas10) | RH-Y |
| 20 | mix | 14 | 0 | 1(vir2) | 9(pro2) | 4(plas226) | R-Y |
| 21 | prophage, plasmid | 12 | 0 | 0 | 3(pro2) | 9(plas226) | RRT-Y |
| 22 | plasmid | 11 | 0 | 0 | 0 | 11(plas10) | RRTF-Y |
| 23 | plasmid | 11 | 0 | 0 | 0 | 11(plas10) | R-Y |
| 24 | prophage | 10 | 0 | 0 | 10(pro2) | 0 | RK-Y |
| 25 | (pro)phage | 10 | 0 | 4(vir2) | 6(pro2) | 0 | RH-Y |
| 26 | (pro)phage | 9 | 0 | 3(vir2) | 6(pro2) | 0 | RH-Y |
| 27 | plasmid(5) | 9 | 0 | 0 | 0 | 9(plas454) | RHT-Y(2) |
| 28 | phage, plasmid | 9 | 0 | 0 | 4(pro2) | 5(plas10) | RH-Y |
| 29 | mix | 9 | 0 | 1(vir418) | 7(pro76) | 1(plas226) | R-Y |
| 30 | plasmid | 9 | 0 | 0 | 0 | 9(plas688) | R-Y |
| 31 | plasmid | 8 | 0 | 0 | 0 | 8(plas10) | RH-Y |
| 32 | plasmid | 8 | 0 | 0 | 0 | 8(plas589) | R-Y |
| 33 | plasmid | 8 | 0 | 0 | 0 | 8(plas10) | R-Y(2) |
| 34 | plasmid | 8 | 0 | 0 | 0 | 8(plas10) | RAT-Y |
| 35 | (pro)phage | 8 | 0 | 1(vir418) | 7(pro76) | 0 | No R at expected distance from Y |
| 36 | plasmid | 8 | 0 | 0 | 0 | 8(plas10) | RH-Y, partner 41, 90 |
| 37 | mix | 7 | 0 | 0 | 6(pro2) | 1(plas10) | RH-Y |
| 38 | plasmid | 7 | 0 | 0 | 0 | 7(plas101) | RR-Y |
| 39 | plasmid | 7 | 0 | 0 | 0 | 7(plas589) | RH-Y(3) |
| 40 | plasmid | 7 | 0 | 0 | 0 | 7(plas10) | RHS-Y |
| 41 | plasmid | 7 | 0 | 0 | 0 | 7(plas454) | RR-Y |
| 42 | mix | 6 | 0 | 0 | 1(pro2) | 5(plas10) | RH-Y(2) |
| 43 | plasmid | 6 | 0 | 0 | 0 | 6(plas10) | RR-Y(2) |
| 44 | plasmid | 6 | 0 | 0 | 0 | 6(plas10) | RH-Y |
| 46 | plasmid | 6 | 0 | 0 | 0 | 6(plas10) | RHT-Y |
| 47 | mix | 5 | 0 | 3(vir2) | 1(pro2) | 1(plas226) | RH-Y |
| 48 | mix | 5 | 0 | 3(vir2) | 0 | 2(plas226) | R-Y |
| 49 | plasmid | 5 | 0 | 0 | 0 | 5(plas10) | RRTAL-Y |
| 50 | (pro)phage | 5 | 0 | 4(vir2) | 1(pro2) | 0 | RHT-Y |
| 51 | (pro)phage | 4 | 0 | 3(vir2) | 1(pro2) | 0 | RHT-Y |
| 52 | prophage | 4 | 0 | 0 | 4(pro76) | 0 | RK-Y(2) |
| 53 | plasmid(5) | 4 | 0 | 0 | 0 | 4(plas454) | RH-Y, partner 62, one has no partner |
| 54 | plasmid | 4 | 0 | 0 | 0 | 4(plas688) | RHTF-Y |
| 55 | plasmid | 4 | 0 | 0 | 0 | 4(plas170) | R-Y |
| 57-68 | | 3 | | | | | |
| 69-81 | | 2 | | | | | |
| 82-102 | | 1 | | | | | |
The origins of the proteins can be traced through the tags "vir", "plas" and "pro", which abbreviate ACLAME family IDs (vir2 is family:vir:2, plas10 is family:plasmids:10, pro2 is family:proph:2, and so on).
(pro)phage: the family contains proteins from both phages and prophages.
Only the most conserved R and its adjacent amino acids, together with the potential catalytic Y residue, are shown.
(1) Not all proteins in the family have the conserved residues.
(2) A putative catalytic motif is discernible when two shorter sequences are removed from the alignment.
(3) Distance between RH and Y is around 50 residues.
(4) Some proteins in the family do not originate from ACLAME protein families, which is why the numbers of GIs, phages, prophages and plasmids do not add up to the number of proteins.
(5) Proteins in the family have a long N-terminal extension and are over 700 amino acids long.
ND: Some proteins in the family could be part of a GI.
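The tag convention described in the table notes (vir2 for family:vir:2, plas10 for family:plasmids:10, pro2 for family:proph:2) can be expanded mechanically. The following is a minimal sketch; the function names `expand_aclame_tag` and `parse_cell` are hypothetical and only the three prefixes named in the notes are handled. Cells whose tag does not follow the pattern, such as `3(418)` or `15(fam202)`, are returned with the raw tag unexpanded rather than guessed at.

```python
import re

# Prefix mapping taken from the table notes.
ACLAME_PREFIXES = {"vir": "vir", "plas": "plasmids", "pro": "proph"}


def expand_aclame_tag(tag):
    """Turn a short tag such as 'plas10' into 'family:plasmids:10'."""
    match = re.fullmatch(r"(vir|plas|pro)(\d+)", tag)
    if match is None:
        raise ValueError(f"unrecognized ACLAME tag: {tag!r}")
    prefix, number = match.groups()
    return f"family:{ACLAME_PREFIXES[prefix]}:{number}"


def parse_cell(cell):
    """Split a table cell such as '10(vir2)' into (count, family ID).

    The family ID is None when the cell holds only a count, and the raw
    tag when the tag does not match the documented pattern.
    """
    match = re.fullmatch(r"\s*(\d+)\s*(?:\(([^)]*)\))?\s*", cell)
    if match is None:
        raise ValueError(f"unrecognized cell: {cell!r}")
    count = int(match.group(1))
    tag = match.group(2)
    if tag is None:
        return count, None
    try:
        return count, expand_aclame_tag(tag)
    except ValueError:
        return count, tag
```

For example, `parse_cell("38(pro2)")` yields the count 38 paired with the full ID family:proph:2, while a bare `0` yields a count with no family.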
Tn554-like elements in plasmids and chromosomes
| Genbank | Plasmids | TnpB CLSK No. | TnpB ID | TnpA CLSK No. | TnpA ID | TnpC CLSK No. |
|---|---|---|---|---|---|---|
| | | 636892 | 36 | 636891 | 41 | none |
| | | 636892 | 36 | 636891 | 41 | none |
| | | 636892 | 36 | 636891 | 41 | none |
| | | 636892 | 36 | 636891 | 41 | none |
| | | 636892 | 36 | 636891 | 41 | none |
| | | 636892 | 36 | 636891 | 41 | none |
| | | 523288 | 33 | 2316403 | 27 | 2316404 |
| | | 2526550 | 33 | none | 27 | 2526549 |
| | | 647580 | 33 | 647579 | 27 | 647578 |
| | | 826662 | 33 | none | 27 | 647587 |
| | | no cluster | 33 | 2462320 | 53 | 925958 |
| | | 782394 | 62 | 782395 | 53 | 782393 |
| | | 782394 | 62 | 782395 | 53 | 782393 |
| | | no cluster | 33 | no cluster | 27 | no cluster |
| | | 647579 | 33 | 730723 | 27 | 647580 |
| | | 776375 | 33 | 776376 | 27 | 772808 |
| | | 776375 | 33 | 776376 | 27 | 772808 |
| | | no cluster | 33 | none | _ | none |
| | | 776375 | _ | 776376 | _ | 772808 |
| | | 776375 | _ | 776376 | _ | 772808 |
| | | 776375 | _ | 776376 | _ | 772808 |
| | | 776375 | _ | 776376 | _ | 772808 |
| | | 776375 | _ | 776376 | _ | 772808 |
| | | 647580 | _ | 647579 | _ | 647578 |
| | | 647580 | _ | 647579 | _ | 730718 |
| | | 647580 | _ | 647579 | _ | 647578 |
| | | 647580 | _ | 647579 | _ | 730718 |
| | | 647580 | _ | 647579 | _ | 647578 |
| | | 647580 | _ | 647579 | _ | |
| | | 647580 | _ | 647579 | _ | none |
| | | 647580 | _ | 647579 | _ | none |
| | | 647580 | _ | 647579 | _ | none |
CLSK No., Protein Cluster Database number; ID, Famint number; none, there is no equivalent annotated protein at that position; no cluster, the protein is annotated but not part of a cluster; *, plasmid SAP1 has a truncated version of the Famint33 protein (246 aa only) and has no associated partners; _, not in the set of analyzed TBSSR proteins; s, shorter; L, longer.
RIT elements classified according to NCBI Protein Clusters
| RIT type | RitA | RitB | RitC |
|---|---|---|---|
| RIT1 | CLSK521097 | CLSK2306416 | CLSK2306415 |
| RIT2 | CLSK2407259 | CLSK2314503 | CLSK458345 |
| RIT3A | CLSK382373 | CLSK747445 | CLSK502077 |
| RIT3B | CLSK2525360 | CLSK747445 | CLSK502077 |
| RIT3C | CLSK382373 | CLSK747445 | CLSK739704 |
| RIT4A | CLSK516123 | CLSK809014 | CLSK893868 |
| RIT4B | CLSK778800 | CLSK809014 | CLSK893868 |
| RIT5A | CLSK891968 | CLSK891969 | CLSK891970 |
| RIT5B | CLSK954479 | CLSK2468126 | CLSK891970 |
| RIT6 | CLSK953941 | CLSK953942 | CLSK953943 |
| RIT7A | CLSK2477616 | CLSK923804 | CLSK2477617 |
| RIT7B | CLSK962675 | CLSK923804 | CLSK2332435 |
| RIT7C | CLSK2809109 | CLSK923804 | CLSK2809110 |
| RIT8 | CLSK971115 | CLSK971116 | CLSK971117 |
| RIT9 | CLSK502016 | CLSK537353 | CLSK864421 |
| RIT10 | CLSK2491249 | CLSK2321302 | CLSK923805 |
| RIT11 | CLSK2471156 | CLSK962653 | CLSK962652 |
RITs with the same number include at least one common protein cluster.
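The naming rule stated in the note above can be checked directly against the table. The sketch below copies the RIT3, RIT5 and RIT7 rows from the table and computes, for each RIT number, the protein clusters common to all of its lettered variants. The helper names `group_by_number` and `shared_clusters` are hypothetical.

```python
import re
from collections import defaultdict

# (RitA, RitB, RitC) cluster IDs, copied from the RIT3, RIT5 and RIT7 rows.
RIT_CLUSTERS = {
    "RIT3A": ("CLSK382373", "CLSK747445", "CLSK502077"),
    "RIT3B": ("CLSK2525360", "CLSK747445", "CLSK502077"),
    "RIT3C": ("CLSK382373", "CLSK747445", "CLSK739704"),
    "RIT5A": ("CLSK891968", "CLSK891969", "CLSK891970"),
    "RIT5B": ("CLSK954479", "CLSK2468126", "CLSK891970"),
    "RIT7A": ("CLSK2477616", "CLSK923804", "CLSK2477617"),
    "RIT7B": ("CLSK962675", "CLSK923804", "CLSK2332435"),
    "RIT7C": ("CLSK2809109", "CLSK923804", "CLSK2809110"),
}


def group_by_number(rit_table):
    """Group RIT variants (RIT3A, RIT3B, ...) under their shared number."""
    groups = defaultdict(dict)
    for name, clusters in rit_table.items():
        number = int(re.match(r"RIT(\d+)", name).group(1))
        groups[number][name] = set(clusters)
    return dict(groups)


def shared_clusters(variants):
    """Return the protein clusters present in every variant of one RIT number."""
    return set.intersection(*variants.values())


common = {number: shared_clusters(variants)
          for number, variants in group_by_number(RIT_CLUSTERS).items()}
```

For these rows the common cluster is always the RitB or RitC one, while the RitA cluster varies between variants.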
Figure 2. Weighted graphical representation of Famint families shared between bacterial hosts. Bacterial strains were grouped at the genus level unless there was a single representative at the strain or the species level. These groups of bacteria were represented in terms of the Famint families they contain. The graph was built as described in Methods. Nodes are bacterial genera, species or strains. They are linked by an edge if they share Famint families. The thickness of the edges is proportional to the number of families shared by linked nodes. Note the tight grouping of Enterobacteria and Firmicutes.
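The graph described in the Figure 2 caption can be sketched as follows: each host group is represented by the set of Famint families it contains, and two groups are linked by an edge whose weight is the number of families they share. The Methods are not reproduced here, so this is only one plausible reading of the caption; the host names and family sets in the example are invented placeholders, not data from the paper.

```python
from itertools import combinations


def shared_family_graph(host_families):
    """Build weighted edges between hosts that share Famint families.

    `host_families` maps a host group (genus, species or strain) to the
    set of Famint family IDs found in it. The result maps each pair of
    hosts to the number of families they have in common; pairs sharing
    nothing get no edge.
    """
    edges = {}
    for host_a, host_b in combinations(sorted(host_families), 2):
        shared = host_families[host_a] & host_families[host_b]
        if shared:
            edges[(host_a, host_b)] = len(shared)
    return edges


# Hypothetical input, for illustration only.
example_hosts = {
    "GenusA": {0, 4, 6},
    "GenusB": {4, 6, 8},
    "GenusC": {8},
    "GenusD": {11},
}
example_edges = shared_family_graph(example_hosts)
```

In a drawing of such a graph, the edge weight would set the line thickness, so host groups sharing many families appear tightly connected.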