Fotis E Psomopoulos, Victoria I Siarkou, Nikolas Papanikolaou, Ioannis Iliopoulos, Athanasios S Tsaftaris, Vasilis J Promponas, Christos A Ouzounis.
Abstract
The entire publicly available set of 37 genome sequences from the bacterial order Chlamydiales has been subjected to comparative analysis in order to reveal the salient features of this pangenome and its evolutionary history. Over 2,000 protein families are detected across multiple species, with a distribution consistent with that of other studied pangenomes. Of these, there are 180 protein families with multiple members, 312 families with exactly 37 members corresponding to core genes, 428 families with peripheral genes with varying taxonomic distribution and finally 1,125 smaller families. The fact that core genes represent over a quarter of the average protein complement, even for the smaller genomes of Chlamydiales, signifies a certain degree of structural stability, given the wide range of phylogenetic relationships within the group. In addition, the propagation of a corpus of manually curated annotations within the discovered core families reveals key functional properties, reflecting a coherent repertoire of cellular capabilities for Chlamydiales. We further investigate over 2,000 genes without homologs in the pangenome and discover two new protein sequence domains. Our results, supported by the genome-based phylogeny for this group, are fully consistent with previous analyses and current knowledge, and point to future research directions towards a better understanding of the structural and functional properties of Chlamydiales.
Year: 2012; PMID: 24704919; PMCID: PMC3899948; DOI: 10.3390/genes3020291
Source DB: PubMed; Journal: Genes (Basel); ISSN: 2073-4425; Impact factor: 4.096
Table 1. List of Chlamydiales genome sequences used in this study.
| ## | Species and Strain Name/Codes | Internal Identifier | Protein-Coding Genes |
|---|---|---|---|
| 01 | | CPRO-UWE-01 | 2,031 |
| 02 | | CMUR-NIG-01 | 911 |
| 03 | | CTRA-434-01 | 874 |
| 04 | | CTRA-AHA-01 | 919 |
| 05 | | CTRA-BJA-01 | 875 |
| 06 | | CTRA-BTZ-01 | 880 |
| 07 | | CTRA-DUW-01 | 895 |
| 08 | | CTRA-L2B-01 | 874 |
| 09 | | CABO-S26-01 | 932 |
| 10 | | CCAV-GPI-01 | 1,005 |
| 11 | | CFEL-FEC-01 | 1,013 |
| 12 | | CPNE-AR3-01 | 1,112 |
| 13 | | CPNE-CWL-01 | 1,052 |
| 14 | | CPNE-J13-01 | 1,069 |
| 15 | | CPNE-TW1-01 | 1,113 |
| 16 | | WCHO-WSU-01 | 1,956 |
| 17 | | CTRA-E15-01 | 927 |
| 18 | | CPEC-E58-01 | 988 |
| 19 | | CPSI-6BC-01 | 975 |
| 20 | | CABO-LLG-01 | 925 |
| 21 | | CPNE-LPC-01 | 1,105 |
| 22 | | CPSI-CAL-01 | 1,005 |
| 23 | | PACA-UV7-01 | 2,788 |
| 24 | | PACA-HAL-01 | 2,809 |
| 25 | | SNEG-ZXX-01 | 2,518 |
| 26 | | WCHO-203-01 | 2,015 |
| 27 | | CPSI-01D-01 | 975 |
| 28 | | CPSI-02D-01 | 978 |
| 29 | | CPSI-08D-01 | 973 |
| 30 | | CTRA-DEC-01 | 878 |
| 31 | | CTRA-DLC-01 | 878 |
| 32 | | CTRA-E11-01 | 926 |
| 33 | | CTRA-G74-01 | 919 |
| 34 | | CTRA-G22-01 | 927 |
| 35 | | CTRA-G93-01 | 921 |
| 36 | | CTRA-G97-01 | 920 |
| 37 | | CTRA-SWE-01 | 875 |
| Total | | | 43,736 |
The first column signifies the inclusion order into the genome collection and does not reflect any other relationship. The second column lists the species and strain name, the third column the COGENT-style identifier and the last column the number of protein-coding genes.
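As a quick arithmetic check on the table, the sketch below sums the per-genome protein-coding gene counts and relates the 312 core families reported in the abstract to the mean genome size. It assumes one core gene per core family per genome, which is only one possible reading of the "over a quarter" statement; the variable names are illustrative and not taken from the study.

```python
# Protein-coding gene counts copied from the table, in inclusion order (01..37).
GENE_COUNTS = [
    2031, 911, 874, 919, 875, 880, 895, 874, 932, 1005,
    1013, 1112, 1052, 1069, 1113, 1956, 927, 988, 975, 925,
    1105, 1005, 2788, 2809, 2518, 2015, 975, 978, 973, 878,
    878, 926, 919, 927, 921, 920, 875,
]

# Families with exactly 37 members, as stated in the abstract.
CORE_FAMILIES = 312

total = sum(GENE_COUNTS)
mean_genes = total / len(GENE_COUNTS)

# Assumption: each core family contributes one gene to each genome.
core_fraction = CORE_FAMILIES / mean_genes

print(f"genomes: {len(GENE_COUNTS)}")
print(f"total protein-coding genes: {total}")
print(f"mean genes per genome: {mean_genes:.1f}")
print(f"core families relative to mean genome size: {core_fraction:.1%}")
```

Under that assumption the ratio comes out slightly above one quarter, in line with the abstract, and the computed total agrees with the Total row of the table.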
Core gene and protein families in the Chlamydiales.
| Cluster ID | Lead Sequence | Master Sequence | Function Annotation from Master Sequence |
|---|---|---|---|
| 181 | CABO-LLG-01-000000 | CTRA-DUW-01-000647 | NA |
| 182 | CABO-LLG-01-000003 | CTRA-DUW-01-000644 | ENZYME UDP- |
| 183 | CABO-LLG-01-000004 | CTRA-DLC-01-000248 | FUNCTION PhoB-like protein |
| 184 | CABO-LLG-01-000015 | CTRA-DLC-01-000228 | FUNCTION RecA protein |
| 185 | PACA-UV7-01-001616 | CTRA-DUW-01-000487 | NA |
| 186 | CABO-LLG-01-000023 | CTRA-DUW-01-000658 | NA |
| 187 | CABO-LLG-01-000024 | CTRA-DUW-01-000624 | FUNCTION RNA Polymerase Sigma-54 factor RpoN |
| 188 | CABO-LLG-01-000026 | CTRA-DUW-01-000622 | ENZYME Uracil DNA glycosylase [EC] 3.2.2.- |
| 189 | CABO-LLG-01-000028 | CTRA-DUW-01-000620 | SIMILAR-TO NTPase HAM1 homolog [EC] 3.6.1.15 |
| 190 | CABO-LLG-01-000029 | CTRA-DLC-01-000273 | NA |
| 191 | CABO-LLG-01-000032 | CTRA-DUW-01-000616 | NA |
| 192 | CABO-LLG-01-000034 | CTRA-DLC-01-000278 | FUNCTION Peptidoglycan-associated lipoprotein |
| 193 | CABO-LLG-01-000035 | CTRA-DLC-01-000279 | FUNCTION TolB macromolecule uptake homolog |
| 194 | CABO-LLG-01-000037 | CTRA-DUW-01-000611 | FUNCTION TolR/ExbD macromolecule uptake homolog |
| 195 | CABO-LLG-01-000040 | CTRA-DLC-01-000284 | FUNCTION protein translocase TatD/MttC homolog |
| 196 | CABO-LLG-01-000047 | CTRA-DUW-01-000600 | ENZYME enolase [EC] 4.2.1.11 |
| 197 | CABO-LLG-01-000048 | CTRA-DUW-01-000599 | FUNCTION Excinuclease ABC subunit B |
| 198 | CABO-LLG-01-000049 | CTRA-DUW-01-000598 | ENZYME Tryptophanyl-tRNA Synthetase [EC] 6.1.1.2 |
| 199 | CTRA-G22-01-000161 | CTRA-DUW-01-000746 | ENZYME Seryl-tRNA Synthetase [EC] 6.1.1.11 |
| 200 | CABO-LLG-01-000054 | CTRA-DUW-01-000593 | FUNCTION Nickel transporter CnrT homolog |
| 201 | CABO-LLG-01-000061 | CTRA-DUW-01-000586 | NA |
| 202 | CABO-LLG-01-000062 | CTRA-DUW-01-000585 | FUNCTION type II secretion system protein D homolog |
| 203 | CABO-LLG-01-000063 | CTRA-DUW-01-000584 | FUNCTION type II secretion system protein E homolog |
| 204 | CABO-LLG-01-000064 | CTRA-DLC-01-000308 | FUNCTION type II secretion system protein F homolog |
| 205 | CABO-LLG-01-000065 | CTRA-DLC-01-000309 | NA |
| 206 | CABO-LLG-01-000070 | CTRA-DUW-01-000577 | FUNCTION protein secretion system YscT homolog |
| 207 | CABO-LLG-01-000072 | CTRA-DLC-01-000316 | FUNCTION protein secretion system YscR homolog |
| 208 | CABO-LLG-01-000073 | CTRA-DEC-01-000317 | FUNCTION protein secretion system YscL homolog |
| 209 | CABO-LLG-01-000074 | CTRA-DUW-01-000573 | NA |
| 210 | CABO-LLG-01-000076 | CTRA-DUW-01-000571 | ENZYME lipoate synthase [EC] 2.8.1.- |
| 211 | CABO-LLG-01-000081 | CTRA-DUW-01-000714 | ENZYME Endonuclease III [EC] 4.2.99.18 |
| 212 | CABO-LLG-01-000083 | CTRA-DUW-01-000716 | ENZYME Phosphatidylserine decarboxylase [EC] 4.1.1.65 |
| 213 | CABO-LLG-01-000085 | CTRA-DUW-01-000718 | FUNCTION preprotein translocase subunit SecA |
| 214 | CABO-LLG-01-000089 | CTRA-DUW-01-000722 | ENZYME ATP-dependent Clp protease ATP-binding subunit ClpX [EC] |
| 215 | CABO-LLG-01-000091 | CTRA-DUW-01-000724 | FUNCTION Trigger factor |
| 216 | CABO-LLG-01-000093 | CTRA-DUW-01-000726 | FUNCTION Rod shape-determining protein MreB |
| 217 | CABO-LLG-01-000094 | CTRA-DUW-01-000727 | ENZYME Phosphoenolpyruvate carboxykinase (GTP) [EC] 4.1.1.32 |
| 218 | CABO-LLG-01-000098 | CTRA-DUW-01-000731 | ENZYME Glycerol-3-phosphate dehydrogenase [NAD+] [EC] 1.1.1.8 |
| 219 | CABO-LLG-01-000099 | CTRA-DUW-01-000732 | ENZYME UDP-N-acetylhexosamine pyrophosphorylase [EC] 2.7.7.- |
| 220 | CCAV-GPI-01-000128 | CTRA-DUW-01-000503 | FUNCTION Transcription termination factor Rho |
| 221 | CABO-LLG-01-000104 | CTRA-DUW-01-000737 | DOMAIN NifU |
| 222 | PACA-HAL-01-002518 | CTRA-DUW-01-000261 | ENZYME NifS aminotransferase [EC] -.-.-.- |
| 223 | CABO-LLG-01-000109 | CTRA-DUW-01-000742 | ENZYME Biotin-[acetyl-CoA-carboxylase] synthetase [EC] 6.3.4.15 |
| 224 * | CABO-LLG-01-000121 | CTRA-DUW-01-000754 | DOMAIN SET |
| 225 | CABO-LLG-01-000122 | CTRA-DUW-01-000755 | SIMILAR-TO metallo-beta-lactamase [EC] 3.5.-.- |
| 226 | CABO-LLG-01-000123 | CTRA-DUW-01-000756 | FUNCTION Cell division protein FtsK C-terminus |
| 227 | CABO-LLG-01-000125 | CTRA-DUW-01-000757 | NA |
| 228 | CABO-LLG-01-000126 | CTRA-DUW-01-000758 | FUNCTION preprotein translocase complex subunit YajC |
| 229 | CABO-LLG-01-000130 | CTRA-DUW-01-000762 | ENZYME Protoporphyrinogen oxidase HemY [EC] 1.3.3.4 |
| 230 | CABO-LLG-01-000132 | CTRA-DUW-01-000764 | ENZYME Uroporphyrinogen decarboxylase HemE [EC] 4.1.1.37 |
| 231 | CABO-LLG-01-000134 | CTRA-DLC-01-000129 | ENZYME Alanyl-tRNA Synthetase [EC] 6.1.1.7 |
| 232 | CABO-LLG-01-000135 | CTRA-DUW-01-000767 | ENZYME Transketolase [EC] 2.2.1.1 |
| 233 | CABO-LLG-01-000136 | CTRA-DUW-01-000768 | SIMILAR-TO AMP nucleosidase [EC] 3.2.2.4 |
| 234 | CABO-LLG-01-000142 | CTRA-DUW-01-000774 | ENZYME Phospho- |
| 235 | CABO-LLG-01-000143 | CTRA-DUW-01-000775 | ENZYME UDP- |
| 236 | CABO-LLG-01-000144 | CTRA-DUW-01-000776 | SIMILAR-TO |
| 237 | CABO-LLG-01-000146 | CTRA-DLC-01-000117 | ENZYME UDP- |
| 238 | CABO-S26-01-000517 | CTRA-DUW-01-000125 | ENZYME Biotin carboxylase [EC] 6.3.4.14 |
| 239 | CABO-LLG-01-000150 | CTRA-DUW-01-000781 | NA |
| 240 | CABO-LLG-01-000155 | CTRA-DUW-01-000786 | NA |
| 241 | CABO-LLG-01-000157 | CTRA-DUW-01-000788 | ENZYME bis(5'-nucleosyl)-tetraphosphatase [EC] 3.6.1.17 |
| 242 | CABO-LLG-01-000168 | CTRA-DLC-01-000098 | ENZYME Cysteinyl-tRNA Synthetase [EC] 6.1.1.16 |
| 243 | CABO-LLG-01-000173 | CTRA-DUW-01-000804 | FUNCTION Ribosomal protein S14 |
| 244 | CABO-LLG-01-000174 | CTRA-DUW-01-000805 | NA |
| 245 | CABO-LLG-01-000176 | CTRA-DUW-01-000808 | ENZYME Excinuclease ABC subunit C [EC] -.-.-.- |
| 246 | CABO-LLG-01-000177 | CTRA-DUW-01-000809 | FUNCTION DNA mismatch repair protein MutS |
| 247 | CABO-LLG-01-000184 | CTRA-DUW-01-000815 | ENZYME CDP-diacylglycerol-glycerol-3-phosphate |
| 248 | CABO-LLG-01-000185 | CTRA-DUW-01-000816 | ENZYME Glycogen synthase [EC] 2.4.1.21 2 |
| 249 | CABO-LLG-01-000186 | CTRA-DUW-01-000817 | FUNCTION Ribosomal protein L25 |
| 250 | CABO-LLG-01-000187 | CTRA-DUW-01-000818 | ENZYME Peptidyl-tRNA hydrolase [EC] 3.1.1.29 |
| 251 | CABO-LLG-01-000188 | CTRA-DUW-01-000819 | FUNCTION Ribosomal protein S6 |
| 252 | CABO-LLG-01-000189 | CTRA-DUW-01-000820 | FUNCTION Ribosomal protein S18 |
| 253 | CABO-LLG-01-000190 | CTRA-DUW-01-000821 | FUNCTION Ribosomal protein L9 |
| 254 * | CABO-LLG-01-000193 | CTRA-DUW-01-000823 | NA |
| 255 * | CABO-LLG-01-000194 | CTRA-DUW-01-000824 | SIMILAR-TO Small-peptide endopeptidase [EC] 3.4.24.55 |
| 256 | CABO-LLG-01-000195 | CTRA-DLC-01-000073 | ENZYME Glycerol-3-phosphate acyltransferase [EC] 2.3.1.15 |
| 257 | CABO-LLG-01-000196 | CTRA-DLC-01-000072 | ENZYME Ribonuclease E [EC] 3.1.4.- |
| 258 | CABO-LLG-01-000197 | CTRA-DLC-01-000071 | NA |
| 259 | CABO-LLG-01-000214 | CTRA-DLC-01-000063 | ENZYME Glucosamine-fructose-6-phosphate aminotransferase [EC] |
| 260 | CABO-LLG-01-000218 | CTRA-DUW-01-000840 | ENZYME Succinyl-CoA synthetase beta chain [EC] 6.2.1.5 |
| 261 | CABO-LLG-01-000222 | CTRA-DUW-01-000843 | SIMILAR-TO Small-peptide endopeptidase [EC] 3.4.24.55 |
| 262 | CABO-LLG-01-000224 | CTRA-DUW-01-000845 | ENZYME CDP-diacylglycerol-serine |
| 263 | CABO-LLG-01-000229 | CTRA-DUW-01-000850 | ENZYME UDP- |
| 264 | CABO-LLG-01-000230 | CTRA-DLC-01-000047 | FUNCTION Transcription termination protein NusB |
| 265 | CABO-LLG-01-000231 | CTRA-DLC-01-000046 | NA |
| 266 | CABO-LLG-01-000233 | CTRA-DUW-01-000854 | FUNCTION Ribosomal protein L20 |
| 267 | CABO-LLG-01-000234 | CTRA-DUW-01-000855 | ENZYME Phenylalanyl-tRNA Synthetase alpha chain [EC] 6.1.1.20 |
| 268 | CABO-LLG-01-000236 | CTRA-DUW-01-000857 | NA |
| 269 | CABO-LLG-01-000237 | CTRA-DUW-01-000858 | NA |
| 270 | CABO-LLG-01-000240 | CTRA-DUW-01-000861 | ENZYME Polynucleotide phosphorylase [EC] 2.7.7.8 |
| 271 | CABO-LLG-01-000241 | CTRA-DUW-01-000862 | NA |
| 272 * | CABO-LLG-01-000254 | CTRA-DUW-01-000874 | FUNCTION ABC transporter, ATP-binding protein |
| 273 | CABO-LLG-01-000267 | CTRA-DUW-01-000385 | ENZYME Glucose-6-phosphate isomerase [EC] 5.3.1.9 |
| 274 | CABO-LLG-01-000269 | CTRA-DLC-01-000502 | ENZYME Malate dehydrogenase [EC] 1.1.1.82 |
| 275 | CABO-LLG-01-000271 | CTRA-DUW-01-000382 | SIMILAR-TO D-Amino Acid Dehydrogenase [EC] 1.-.-.- |
| 276 * | CABO-LLG-01-000276 | CTRA-DLC-01-000508 | ENZYME 3-dehydroquinate dehydratase [EC] 4.2.1.10 |
| 277 | CPRO-UWE-01-000881 | CTRA-DUW-01-000373 | ENZYME 3-phosphoshikimate 1-carboxyvinyltransferase [EC] |
| 278 | CABO-LLG-01-000277 | CTRA-DUW-01-000376 | ENZYME 3-dehydroquinate synthase [EC] 4.6.1.3 |
| 279 | CABO-LLG-01-000278 | CTRA-DUW-01-000375 | ENZYME Chorismate synthase [EC] 4.6.1.4 |
| 280 | CABO-LLG-01-000288 | CTRA-DUW-01-000371 | ENZYME Dihydrodipicolinate reductase [EC] 1.3.1.26 |
| 281 | CABO-LLG-01-000290 | CTRA-DUW-01-000369 | ENZYME Aspartokinase [EC] 2.7.2.4 |
| 282 | CABO-LLG-01-000298 | CTRA-DUW-01-000328 | NA |
| 283 | SNEG-ZXX-01-000625 | CTRA-DLC-01-000783 | FUNCTION Translation initiation factor IF-2 |
| 284 | CABO-LLG-01-000304 | CTRA-DUW-01-000322 | FUNCTION Ribosomal protein L11 |
| 285 | CABO-LLG-01-000305 | CTRA-DUW-01-000321 | FUNCTION Ribosomal protein L1 |
| 286 | CABO-LLG-01-000306 | CTRA-DUW-01-000320 | FUNCTION Ribosomal protein L10 |
| 287 | CABO-LLG-01-000308 | CTRA-DUW-01-000318 | ENZYME DNA-directed RNA polymerase beta subunit [EC] 2.7.7.6 |
| 288 | CABO-LLG-01-000309 | CTRA-DUW-01-000317 | ENZYME DNA-directed RNA polymerase beta prime subunit [EC] |
| 289 | CABO-LLG-01-000312 | CTRA-DUW-01-000314 | NA |
| 290 | CABO-LLG-01-000313 | CTRA-DLC-01-000569 | ENZYME vacuolar ATPase proteolipid subunit E [EC] 3.6.1.34 |
| 291 | CABO-LLG-01-000314 | CTRA-DUW-01-000312 | NA |
| 292 | CABO-LLG-01-000317 | CTRA-DUW-01-000309 | ENZYME vacuolar ATPase proteolipid subunit D [EC] 3.6.1.34 |
| 293 | CABO-LLG-01-000320 | CTRA-DLC-01-000576 | NA |
| 294 | CABO-LLG-01-000324 | CTRA-DUW-01-000337 | ENZYME Pyruvate kinase [EC] 2.7.1.40 |
| 295 | CPRO-UWE-01-001632 | CTRA-DUW-01-000012 | NA |
| 296 | CABO-LLG-01-000328 | CTRA-DUW-01-000013 | ENZYME Cytochrome Oxidase D subunit I [EC] 1.10.3.- |
| 297 | CABO-LLG-01-000329 | CTRA-DUW-01-000014 | ENZYME Cytochrome Oxidase D subunit II [EC] 1.10.3.- |
| 298 | CABO-LLG-01-000331 | CTRA-DLC-01-000860 | NA |
| 299 | CABO-LLG-01-000332 | CTRA-DLC-01-000861 | NA |
| 300 | CABO-LLG-01-000333 | CTRA-DUW-01-000015 | FUNCTION PhoH-like protein |
| 301 | CABO-LLG-01-000337 | CTRA-DLC-01-000856 | NA |
| 302 | CABO-LLG-01-000338 | CTRA-DUW-01-000022 | FUNCTION Ribosomal protein L31 |
| 303 | CABO-LLG-01-000342 | CTRA-DUW-01-000026 | FUNCTION Ribosomal protein S16 |
| 304 | CABO-LLG-01-000343 | CTRA-DUW-01-000027 | ENZYME tRNA (guanine |
| 305 | CABO-LLG-01-000344 | CTRA-DUW-01-000028 | FUNCTION Ribosomal protein L19 |
| 306 | CABO-LLG-01-000345 | CTRA-DUW-01-000029 | ENZYME Ribonuclease HII [EC] 3.1.26.4 |
| 307 | CABO-LLG-01-000346 | CTRA-DUW-01-000030 | ENZYME Guanylate kinase [EC] 2.7.4.8 |
| 308 | CABO-LLG-01-000358 | CTRA-DUW-01-000215 | ENZYME Ribose 5-phosphate isomerase A [EC] 5.3.1.6 |
| 309 | CABO-LLG-01-000359 | CTRA-DLC-01-000666 | NA |
| 310 | CABO-LLG-01-000360 | CTRA-DUW-01-000213 | NA |
| 311 | CABO-LLG-01-000368 | CTRA-DLC-01-000765 | NA |
| 312 | CABO-LLG-01-000374 | CTRA-DUW-01-000147 | ENZYME DNA ligase (NAD+) [EC] 6.5.1.2 |
| 313 | CABO-LLG-01-000379 | CTRA-DUW-01-000210 | ENZYME 3-deoxy-d-manno-octulosonic-acid transferase [EC] 2.-.-.- |
| 314 | CABO-LLG-01-000392 | CTRA-DUW-01-000186 | NA |
| 315 | CABO-LLG-01-000393 | CTRA-DUW-01-000185 | ENZYME CTP synthetase [EC] 6.3.4.2 |
| 316 | CABO-LLG-01-000404 | CTRA-DUW-01-000195 | ENZYME Queuine tRNA-ribosyltransferase [EC] 2.4.2.29 |
| 317 | CABO-LLG-01-000420 | CTRA-DUW-01-000132 | NA |
| 318 | CABO-LLG-01-000423 | CTRA-DUW-01-000199 | ENZYME |
| 319 | CABO-LLG-01-000425 | CTRA-DUW-01-000187 | NA |
| 320 | CABO-LLG-01-000426 | CTRA-DUW-01-000188 | ENZYME Glucose-6-phosphate dehydrogenase [EC] -.-.-.- |
| 321 | CABO-LLG-01-000432 | CTRA-DLC-01-000753 | FUNCTION Ribosomal protein S9 |
| 322 | CABO-LLG-01-000433 | CTRA-DUW-01-000126 | FUNCTION Ribosomal protein L13 |
| 323 | CABO-LLG-01-000435 | CTRA-DUW-01-000152 | NA |
| 324 | CABO-LLG-01-000448 | CTRA-DUW-01-000138 | FUNCTION Sua5 homolog |
| 325 | CABO-LLG-01-000451 | CTRA-DUW-01-000190 | ENZYME Thymidylate kinase (dTMP kinase) [EC] 2.7.4.9 |
| 326 | CABO-LLG-01-000459 | CTRA-DUW-01-000217 | ENZYME Fructose-bisphosphate aldolase class I [EC] 4.1.2.13 |
| 327 | CABO-LLG-01-000471 | CTRA-DUW-01-000239 | FUNCTION acyl carrier protein ACP |
| 328 | PACA-UV7-01-000731 | CTRA-DUW-01-000105 | ENZYME Enoyl-[acyl-carrier protein] reductase (NADH) [EC] |
| 329 | CABO-LLG-01-000473 | CTRA-DLC-01-000640 | ENZYME Malonyl CoA-acyl carrier protein transacylase [EC] |
| 330 | CABO-LLG-01-000474 | CTRA-DLC-01-000639 | ENZYME 3-oxoacyl-[acyl-carrier-protein] synthase III [EC] |
| 331 | CABO-LLG-01-000475 | CTRA-DUW-01-000243 | FUNCTION Recombination protein RecR homolog |
| 332 | CABO-LLG-01-000477 | CTRA-DUW-01-000245 | NA |
| 333 | CPNE-TW1-01-000387 | CTRA-DUW-01-000055 | ENZYME 2-oxoglutarate dehydrogenase E1 component [EC] 1.2.4.2 |
| 334 | CABO-LLG-01-000486 | CTRA-DUW-01-000254 | FUNCTION Inner-membrane protein YidC |
| 335 | CABO-LLG-01-000489 | CTRA-DUW-01-000101 | ENZYME holo-[acyl-carrier protein] synthase [EC] 2.7.8.7 |
| 336 | CABO-LLG-01-000490 | CTRA-DUW-01-000100 | ENZYME Thioredoxin reductase (NADPH) [EC] 1.6.4.5 |
| 337 | CABO-LLG-01-000494 | CTRA-DUW-01-000096 | FUNCTION Ribosome-binding factor A RbfA |
| 338 | CABO-LLG-01-000496 | CTRA-DUW-01-000094 | ENZYME Riboflavin kinase [EC] 2.7.1.26 |
| 339 | CABO-LLG-01-000499 | CTRA-DUW-01-000090 | NA |
| 340 | CABO-LLG-01-000500 | CTRA-DUW-01-000089 | NA |
| 341 | CABO-LLG-01-000501 | CTRA-DLC-01-000793 | FUNCTION Ribosomal protein L28 |
| 342 | CABO-LLG-01-000508 | CTRA-DLC-01-000801 | ENZYME Methylenetetrahydrofolate dehydrogenase [EC] 1.5.1.15 |
| 343 | CABO-LLG-01-000509 | CTRA-DUW-01-000078 | FUNCTION Thiamine biosynthesis lipoprotein ApbE precursor |
| 344 | CABO-LLG-01-000510 | CTRA-DUW-01-000077 | FUNCTION Small protein B SmpB homolog |
| 345 | CABO-LLG-01-000511 | CTRA-DUW-01-000076 | ENZYME DNA polymerase III beta chain [EC] 2.7.7.7 |
| 346 | CABO-LLG-01-000514 | CTRA-DUW-01-000073 | SIMILAR-TO zinc protease [EC] -.-.-.- |
| 347 | CABO-LLG-01-000516 | CTRA-DUW-01-000071 | FUNCTION ABC transporter, permease protein TroD |
| 348 | CABO-LLG-01-000519 | CTRA-DUW-01-000068 | FUNCTION periplasmic substrate binding protein TroA |
| 349 | CPSI-CAL-01-000799 | CTRA-DUW-01-000423 | FUNCTION high-affinity ZnuA homolog |
| 350 | CABO-LLG-01-000520 | CTRA-DLC-01-000812 | NA |
| 351 | CABO-LLG-01-000523 | CTRA-DUW-01-000064 | ENZYME 6-phosphogluconate dehydrogenase [EC] 1.1.1.44 |
| 352 | CABO-LLG-01-000524 | CTRA-DUW-01-000063 | ENZYME Tyrosyl-tRNA Synthetase [EC] 6.1.1.1 |
| 353 | CABO-LLG-01-000535 | CTRA-DLC-01-000825 | NA |
| 354 | CABO-LLG-01-000541 | CTRA-DLC-01-000831 | NA |
| 355 | CABO-LLG-01-000544 | CTRA-DUW-01-000045 | FUNCTION single-stranded DNA-binding protein SSB |
| 356 | CABO-LLG-01-000545 | CTRA-DUW-01-000044 | NA |
| 357 | CABO-LLG-01-000547 | CTRA-DUW-01-000042 | NA |
| 358 | CABO-LLG-01-000554 | CTRA-DLC-01-000619 | ENZYME Protein phosphatase 2C [EC] 3.1.3.16 |
| 359 | CABO-LLG-01-000558 | CTRA-DUW-01-000257 | NA |
| 360 | CABO-LLG-01-000560 | CTRA-DUW-01-000108 | ENZYME A/G-specific adenine glycosylase [EC] 3.2.2.- |
| 361 | CABO-LLG-01-000564 | CTRA-DUW-01-000104 | NA |
| 362 | CABO-LLG-01-000571 | CTRA-DUW-01-000268 | ENZYME Acetyl-coenzyme A carboxylase carboxyl transferase |
| 363 | CABO-LLG-01-000574 | CTRA-DUW-01-000271 | ENZYME N-acetylmuramoyl- |
| 364 | CABO-LLG-01-000577 | CTRA-DUW-01-000273 | FUNCTION Penicillin-binding protein 3 |
| 365 | CABO-LLG-01-000578 | CTRA-DUW-01-000274 | NA |
| 366 | CABO-LLG-01-000581 | CTRA-DUW-01-000277 | DOMAIN TPR |
| 367 | CABO-LLG-01-000585 | CTRA-DLC-01-000601 | NA |
| 368 | CABO-LLG-01-000586 | CTRA-DUW-01-000282 | NA |
| 369 | CABO-LLG-01-000587 | CTRA-DLC-01-000599 | NA |
| 370 | CPEC-E58-01-000614 | CTRA-DUW-01-000284 | NA |
| 371 | CABO-LLG-01-000591 | CTRA-DLC-01-000597 | FUNCTION Glycine cleavage system H protein |
| 372 | CABO-LLG-01-000594 | CTRA-DUW-01-000288 | SIMILAR-TO Lipoate-protein ligase A [EC] 6.3.4.- |
| 373 | CABO-LLG-01-000596 | CTRA-DUW-01-000290 | ENZYME tRNA (5-methylaminomethyl-2-thiouridylate)-methyltransferase |
| 374 | CABO-LLG-01-000601 | CTRA-DUW-01-000293 | ENZYME Nitrogen regulatory IIA protein A component [EC] 2.7.1.69 |
| 375 | WCHO-WSU-01-000243 | CTRA-DUW-01-000294 | ENZYME Nitrogen regulatory IIA protein A component [EC] 2.7.1.69 |
| 376 | CABO-LLG-01-000603 | CTRA-DUW-01-000295 | ENZYME dUTP pyrophosphatase [EC] 3.6.1.23 |
| 377 | CABO-LLG-01-000608 | CTRA-DUW-01-000300 | ENZYME Ribonuclease III [EC] 3.1.26.3 |
| 378 | CABO-LLG-01-000609 | CTRA-DLC-01-000581 | FUNCTION DNA repair protein RadA |
| 379 | CABO-LLG-01-000610 | CTRA-DUW-01-000302 | ENZYME Porphobilinogen deaminase [EC] 4.3.1.8 |
| 380 | CABO-LLG-01-000616 | CTRA-DUW-01-000340 | NA |
| 381 | CABO-LLG-01-000623 | CTRA-DUW-01-000346 | DOMAIN DnaJ |
| 382 | CABO-LLG-01-000624 | CTRA-DLC-01-000536 | FUNCTION Ribosomal protein S21 |
| 383 | CABO-LLG-01-000628 | CTRA-DUW-01-000351 | ENZYME Aryl-sulfate sulphohydrolase [EC] 3.1.6.1 |
| 384 | CABO-LLG-01-000631 | CTRA-DUW-01-000354 | FUNCTION Septum formation protein Maf homolog |
| 385 | CABO-LLG-01-000632 | CTRA-DUW-01-000355 | NA |
| 386 | WCHO-WSU-01-000567 | CTRA-DUW-01-000392 | NA |
| 387 | CABO-LLG-01-000633 | CTRA-DUW-01-000356 | NA |
| 388 | CABO-LLG-01-000636 | CTRA-DUW-01-000333 | ENZYME Triosephosphate isomerase [EC] 5.3.1.1 |
| 389 | CABO-LLG-01-000637 | CTRA-DUW-01-000334 | ENZYME Exonuclease VII large subunit [EC] 3.1.11.6 |
| 390 | CABO-LLG-01-000641 | CTRA-DUW-01-000360 | ENZYME Dimethyladenosine transferase [EC] 2.1.1.- |
| 391 | CABO-LLG-01-000642 | CTRA-DUW-01-000361 | NA |
| 392 | CABO-LLG-01-000643 | CTRA-DUW-01-000362 | DOMAIN Thioredoxin |
| 393 | CABO-LLG-01-000646 | CTRA-DLC-01-000868 | NA |
| 394 | CABO-LLG-01-000647 | CTRA-DLC-01-000869 | ENZYME Ribonuclease HII [EC] 3.1.26.4 |
| 395 | CABO-LLG-01-000651 | CTRA-DUW-01-000004 | ENZYME glutamyl-tRNA (Gln) amidotransferase, subunit B [EC] |
| 396 | CABO-LLG-01-000670 | CTRA-DUW-01-000386 | NA |
| 397 | CABO-LLG-01-000671 | CTRA-DUW-01-000387 | SIMILAR-TO metallo-beta-lactamase [EC] 3.5.-.- |
| 398 | CABO-LLG-01-000684 | CTRA-DLC-01-000488 | NA |
| 399 | CABO-LLG-01-000686 | CTRA-DUW-01-000396 | NA |
| 400 | CABO-LLG-01-000690 | CTRA-DLC-01-000484 | FUNCTION Heat-inducible transcription repressor HrcA |
| 401 | CABO-LLG-01-000691 | CTRA-DUW-01-000403 | FUNCTION GrpE protein |
| 402 | CABO-LLG-01-000698 | CTRA-DUW-01-000432 | NA |
| 403 | CABO-LLG-01-000701 | CTRA-DUW-01-000435 | NA |
| 404 | CABO-LLG-01-000705 | CTRA-DUW-01-000438 | ENZYME ubiquinone/menaquinone biosynthesis methyltransferase |
| 405 | CABO-LLG-01-000706 | CTRA-DLC-01-000449 | NA |
| 406 | CABO-LLG-01-000707 | CTRA-DUW-01-000440 | ENZYME Diaminopimelate epimerase [EC] 5.1.1.7 |
| 407 | CABO-LLG-01-000709 | CTRA-DLC-01-000446 | ENZYME Serine hydroxymethyltransferase [EC] 2.1.2.1 |
| 408 | CABO-LLG-01-000713 | CTRA-DUW-01-000406 | NA |
| 409 | CABO-LLG-01-000714 | CTRA-DLC-01-000479 | NA |
| 410 | CABO-LLG-01-000717 | CTRA-DUW-01-000410 | ENZYME Lipid A 4'-kinase [EC] 2.7.1.130 |
| 411 | CABO-LLG-01-000722 | CTRA-DLC-01-000471 | FUNCTION DnaK suppressor protein |
| 412 | CABO-LLG-01-000723 | CTRA-DUW-01-000416 | ENZYME Lipoprotein signal peptidase [EC] 3.4.23.36 |
| 413 | CABO-LLG-01-000735 | CTRA-DUW-01-000427 | FUNCTION Ribosomal protein L27 |
| 414 | CABO-LLG-01-000736 | CTRA-DLC-01-000459 | FUNCTION Ribosomal protein L21 |
| 415 | CABO-LLG-01-000738 | CTRA-DUW-01-000444 | NA |
| 416 | CABO-LLG-01-000739 | CTRA-DUW-01-000445 | ENZYME Sulfite reductase (NADPH) flavoprotein alpha-component |
| 417 | CABO-LLG-01-000740 | CTRA-DLC-01-000442 | FUNCTION Ribosomal protein S10 |
| 418 | CABO-LLG-01-000751 | CTRA-DUW-01-000456 | ENZYME Glutamyl-tRNA Synthetase [EC] 6.1.1.17 |
| 419 | CABO-LLG-01-000752 | CTRA-DLC-01-000431 | NA |
| 420 * | CABO-LLG-01-000753 | NA | NA |
| 421 | CABO-LLG-01-000754 | CTRA-DUW-01-000458 | ENZYME Single-stranded-DNA-specific exonuclease RecJ [EC] |
| 422 | CABO-LLG-01-000759 | CTRA-DUW-01-000463 | ENZYME Cytidylate kinase [EC] 2.7.4.14 |
| 423 | CABO-LLG-01-000761 | CTRA-DUW-01-000465 | ENZYME Arginyl-tRNA Synthetase [EC] 6.1.1.19 |
| 424 | CABO-LLG-01-000762 | CTRA-DUW-01-000466 | ENZYME UDP- |
| 425 | CABO-LLG-01-000764 | CTRA-DUW-01-000468 | NA |
| 426 | CABO-LLG-01-000778 | CTRA-DUW-01-000480 | NA |
| 427 | CABO-LLG-01-000779 | CTRA-DUW-01-000481 | NA |
| 428 | CABO-LLG-01-000784 | CTRA-DUW-01-000486 | ENZYME Phenylalanyl-tRNA Synthetase beta chain [EC] 6.1.1.20 |
| 429 * | CABO-LLG-01-000789 | CTRA-DUW-01-000491 | FUNCTION Dipeptide binding protein DppA |
| 430 | CABO-LLG-01-000792 | CTRA-DUW-01-000496 | NA |
| 431 | CABO-LLG-01-000793 | CTRA-DUW-01-000497 | ENZYME Protoheme ferro-lyase [EC] 4.99.1.1 |
| 432 | CABO-LLG-01-000794 | CTRA-DUW-01-000498 | FUNCTION Aminoacid-binding periplasmic protein precursor |
| 433 | CABO-LLG-01-000795 | CTRA-DUW-01-000499 | ENZYME HemK modification methylase homolog [EC] -.-.-.- |
| 434 | CABO-LLG-01-000796 | CTRA-DUW-01-000500 | NA |
| 435 | CABO-LLG-01-000801 | CTRA-DLC-01-000386 | DOMAIN ATP-binding |
| 436 | CABO-LLG-01-000802 | CTRA-DUW-01-000505 | ENZYME DNA polymerase I [EC] 2.7.7.7 |
| 437 | CABO-LLG-01-000803 | CTRA-DLC-01-000384 | NA |
| 438 | CABO-LLG-01-000805 | CTRA-DUW-01-000508 | ENZYME CDP-diacylglycerol-glycerol-3-phosphate |
| 439 | CABO-LLG-01-000807 | CTRA-DUW-01-000511 | FUNCTION Glucose inhibited division protein A GidA |
| 440 | CABO-LLG-01-000808 | CTRA-DUW-01-000512 | ENZYME Lipoate-protein ligase A [EC] 6.3.4.- |
| 441 | CABO-LLG-01-000810 | CTRA-DUW-01-000514 | ENZYME Holliday Junction DNA Helicase RuvA [EC] -.-.-.- |
| 442 | CABO-LLG-01-000811 | CTRA-DUW-01-000515 | ENZYME Holliday Junction DNA Helicase RuvC [EC] 3.1.22.4 |
| 443 | CABO-LLG-01-000813 | CTRA-DUW-01-000517 | NA |
| 444 | CABO-LLG-01-000814 | CTRA-DUW-01-000518 | ENZYME Glyceraldehyde 3-phosphate dehydrogenase [EC] 1.2.1.12 |
| 445 | CABO-LLG-01-000820 | CTRA-DUW-01-000524 | FUNCTION Ribosomal protein L15 |
| 446 | CABO-LLG-01-000821 | CTRA-DUW-01-000525 | FUNCTION Ribosomal protein S5 |
| 447 | CABO-LLG-01-000822 | CTRA-DUW-01-000526 | FUNCTION Ribosomal protein L18 |
| 448 | CABO-LLG-01-000824 | CTRA-DUW-01-000528 | FUNCTION Ribosomal protein S8 |
| 449 | CABO-LLG-01-000825 | CTRA-DUW-01-000529 | FUNCTION Ribosomal protein L5 |
| 450 | CABO-LLG-01-000826 | CTRA-DUW-01-000530 | FUNCTION Ribosomal protein L24 |
| 451 | CABO-LLG-01-000827 | CTRA-DUW-01-000531 | FUNCTION Ribosomal protein L14 |
| 452 | CABO-LLG-01-000828 | CTRA-DUW-01-000532 | FUNCTION Ribosomal protein S17 |
| 453 | CABO-LLG-01-000830 | CTRA-DUW-01-000534 | FUNCTION Ribosomal protein L16 |
| 454 | CABO-LLG-01-000831 | CTRA-DUW-01-000535 | FUNCTION Ribosomal protein S3 |
| 455 | CABO-LLG-01-000833 | CTRA-DUW-01-000537 | FUNCTION Ribosomal protein S19 |
| 456 | CABO-LLG-01-000834 | CTRA-DUW-01-000538 | FUNCTION Ribosomal protein L2 |
| 457 | CABO-LLG-01-000835 | CTRA-DUW-01-000539 | FUNCTION Ribosomal protein L23 |
| 458 | CABO-LLG-01-000836 | CTRA-DLC-01-000351 | FUNCTION Ribosomal protein L4 |
| 459 | CABO-LLG-01-000837 | CTRA-DUW-01-000541 | FUNCTION Ribosomal protein L3 |
| 460 * | CABO-LLG-01-000839 | CTRA-DUW-01-000543 | ENZYME Methionyl-tRNA formyltransferase [EC] 2.1.2.9 |
| 461 | CABO-LLG-01-000841 | CTRA-DUW-01-000545 | ENZYME (3R)-hydroxymyristoyl-[acyl carrier protein] dehydratase |
| 462 | CABO-LLG-01-000842 | CTRA-DLC-01-000345 | ENZYME UDP-3- |
| 463 | CABO-LLG-01-000843 | CTRA-DUW-01-000547 | ENZYME apolipoprotein N-acyltransferase [EC] 2.3.1. |
| 464 | CABO-LLG-01-000846 | CTRA-DUW-01-000550 | DOMAIN ATP-binding |
| 465 | CABO-LLG-01-000847 | CTRA-DUW-01-000551 | NA |
| 466 | CABO-LLG-01-000849 | CTRA-DUW-01-000553 | ENZYME rRNA methyltransferase SpoU homolog [EC] -.-.-.- |
| 467 | CABO-LLG-01-000852 | CTRA-DUW-01-000556 | ENZYME Histidyl-tRNA Synthetase [EC] 6.1.1.21 |
| 468 | CABO-LLG-01-000855 | CTRA-DUW-01-000558 | ENZYME DNA polymerase III alpha chain [EC] 2.7.7.7 |
| 469 | CABO-LLG-01-000856 | CTRA-DUW-01-000559 | NA |
| 470 | CABO-LLG-01-000857 | CTRA-DLC-01-000331 | NA |
| 471 | CABO-LLG-01-000858 | CTRA-DUW-01-000561 | NA |
| 472 | CABO-LLG-01-000860 | CTRA-DLC-01-000327 | ENZYME D-alanyl-D-alanine carboxypeptidase DacF [EC] 3.4.16.4 |
| 473 | CABO-LLG-01-000865 | CTRA-DUW-01-000710 | ENZYME Phosphoglycerate kinase [EC] 2.7.2.3 |
| 474 | CABO-LLG-01-000867 | CTRA-DUW-01-000708 | FUNCTION Phosphate transport system protein PhoU |
| 475 | CABO-LLG-01-000874 | CTRA-DUW-01-000703 | FUNCTION ABC transporter, ATP-binding protein |
| 476 | SNEG-ZXX-01-002117 | CTRA-DUW-01-000701 | FUNCTION ABC transporter, ATP-binding protein |
| 477 | CABO-LLG-01-000880 | CTRA-DUW-01-000697 | FUNCTION Ribosomal protein S2 |
| 478 | CABO-LLG-01-000881 | CTRA-DLC-01-000198 | FUNCTION Translation elongation factor EF-TS |
| 479 | CABO-LLG-01-000882 | CTRA-DUW-01-000695 | ENZYME Uridylate kinase [EC] 2.7.4. |
| 480 | CABO-LLG-01-000892 | CTRA-DUW-01-000684 | NA |
| 481 | CABO-LLG-01-000895 | CTRA-DUW-01-000681 | DOMAIN FHA |
| 482 | CABO-LLG-01-000897 | CTRA-DUW-01-000679 | ENZYME glutamyl-tRNA reductase [EC] 1.2.1. |
| 483 | CABO-LLG-01-000904 | CTRA-DUW-01-000672 | ENZYME KDO-8-phosphate synthetase [EC] 4.1.2.16 |
| 484 | CABO-LLG-01-000912 | CTRA-DLC-01-000254 | NA |
| 485 | CABO-LLG-01-000913 | CTRA-DUW-01-000640 | SIMILAR-TO Endonuclease IV [EC] 3.1.21.2 |
| 486 | CABO-LLG-01-000914 | CTRA-DUW-01-000641 | FUNCTION Ribosomal protein S4 |
| 487 | CABO-LLG-01-000916 | CTRA-DUW-01-000657 | FUNCTION Multidrug-efflux transporter |
| 488 | CABO-LLG-01-000917 | CTRA-DLC-01-000238 | ENZYME Exodeoxyribonuclease V gamma subunit [EC] 3.1.11.5 |
| 489 | CABO-LLG-01-000920 | CTRA-DUW-01-000652 | SIMILAR-TO Amino-acid aminotransferase class I [EC] 2.6.1. |
| 490 | CABO-LLG-01-000921 | CTRA-DUW-01-000651 | FUNCTION Transcription Elongation Factor GreA |
| 491 | CABO-LLG-01-000923 | CTRA-DUW-01-000649 | NA |
| 492 | CABO-LLG-01-000924 | CTRA-DLC-01-000245 | ENZYME Porphobilinogen synthase [EC] 4.2.1.24 |
* Eight clusters, marked with an asterisk, do not contain exactly one member per genome; these include cluster 420, which does not contain a master sequence from the original C. trachomatis annotation dataset. NA: not available. The lead sequence is the first sequence found in the cluster; the master sequence is the sequence from which the annotation is drawn (see Experimental Section for details).
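The identifiers in the table appear to combine a genome code from the genome list (for example `CTRA-DUW-01`) with a six-digit sequence number, and each annotation starts with a keyword such as ENZYME, FUNCTION, DOMAIN or SIMILAR-TO. The following minimal sketch, under that assumption, splits identifiers and tallies annotation keywords for a handful of rows copied from the table. The helper names `parse_seq_id` and `annotation_tag` are hypothetical and not part of the original pipeline.

```python
from collections import Counter
from typing import NamedTuple


class SeqId(NamedTuple):
    genome: str  # genome code, e.g. "CTRA-DUW-01"
    index: int   # per-genome sequence number


def parse_seq_id(identifier: str) -> SeqId:
    """Split 'CTRA-DUW-01-000647' at the last hyphen (assumed layout)."""
    genome, _, number = identifier.rpartition("-")
    return SeqId(genome, int(number))


def annotation_tag(annotation: str) -> str:
    """Return the leading keyword of an annotation, or 'NA' if it is empty."""
    text = annotation.strip()
    return text.split()[0] if text else "NA"


# A few rows copied from the table: (cluster, lead, master, annotation).
ROWS = [
    (181, "CABO-LLG-01-000000", "CTRA-DUW-01-000647", "NA"),
    (184, "CABO-LLG-01-000015", "CTRA-DLC-01-000228", "FUNCTION RecA protein"),
    (196, "CABO-LLG-01-000047", "CTRA-DUW-01-000600", "ENZYME enolase [EC] 4.2.1.11"),
    (221, "CABO-LLG-01-000104", "CTRA-DUW-01-000737", "DOMAIN NifU"),
    (233, "CABO-LLG-01-000136", "CTRA-DUW-01-000768",
     "SIMILAR-TO AMP nucleosidase [EC] 3.2.2.4"),
]

# Count annotation keywords and the genomes supplying master sequences.
tag_counts = Counter(annotation_tag(row[3]) for row in ROWS)
master_genomes = Counter(parse_seq_id(row[2]).genome for row in ROWS)

print(tag_counts)
print(master_genomes)
```

Applied to the full table, the same tally would show how many core families carry no annotation (NA) and how the master sequences are split between the two C. trachomatis genomes that supply them.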
Figure 1. Pangenome protein family size distribution. Cluster size is displayed on the x-axis (all bins up to 50 are shown; above 50, bins are shown for each ten counts, with labels for every five bin sizes); absolute frequency of clusters is shown on the left y-axis (bars, green curve); cumulative count of clusters is shown on the right y-axis (orange curve). Families are defined as those clusters with at least three members (see text); all cluster frequencies are shown here for completeness. The bimodal nature of the distribution is visible in the two peaks, one at low cluster sizes and one at 37; above 37 there are multi-member and multi-species protein families (see text).
Figure 2. Top ten multi-member families within the pangenome. Genomes (with full COGENT-like codes) are shown on the x-axis, sorted by total protein-coding gene count (see also Table 1). Absolute cumulative counts of multi-member families are shown on the y-axis (displayed in the figure legend from left to right and then top to bottom, e.g., ABC transporter permeases, POMPs, type III secretion system ATPases, etc. according to size, see text), color coded according to figure legend.
Figure 3. Correlation between genome size and unique genes. Genome size is given as the number of protein-coding genes (shown on the x-axis) against the count of unique genes (genes without homologs within the pangenome, shown on the y-axis on a logarithmic scale). The six points in the upper right part of the graph are the genomes with the largest gene counts, all outside the Chlamydiaceae family (see Table 1 and text). The pattern observed is primarily due to the sampling of taxonomic space of the Chlamydiales and will vary as more genomes from this group become available.
Figure 4. Alignment of the PCY domain. The PCY motif is centered around position 15 of the multiple alignment. The domain was discovered following five iterations of PSI-BLAST, run until convergence with CP0988 as the query sequence (GI:16752158) and an e-value threshold of 0.005. In total, 70 sequences were recovered; redundancy was removed at 95% with Jalview [34], resulting in the 32 sequences shown here. The length of the domain is just 30 residues; boxes signify sequence identity at 50% or above (darker color: more conserved). GI labels are provided, along with sequence coordinates on the left of the alignment (see text for more details and discussion).
Figure 5. Alignment of a unique leucyl aminopeptidase family. The domain was discovered following five iterations of PSI-BLAST with pc0506 as the query sequence (YP_007505.1). Display conventions as in Figure 4.
Figure 6. Protein family contributions from genome projects. Genome codes are sorted according to their original publication date (and/or release date, x-axis); the absolute number of “novel” protein families within the pangenome is given (left y-axis, blue curve and square symbols); the cumulative sum of protein families (up to 5,260, excluding those without self-hits, see text) is also shown, defined as a “pangenome saturation curve” (right y-axis, green curve and square symbols).
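The construction behind a saturation curve of this kind can be sketched as follows. The sketch assumes that a family counts as "novel" for the earliest genome, in publication order, in which it occurs; the text available here does not confirm that definition. The genome labels and family memberships are toy values for illustration, not data from the study, and `novel_family_counts` is a hypothetical helper.

```python
from itertools import accumulate


def novel_family_counts(genome_order, family_members):
    """Credit each family to the earliest genome (in the given order) that contains it."""
    rank = {genome: i for i, genome in enumerate(genome_order)}
    counts = [0] * len(genome_order)
    for genomes in family_members.values():
        first = min(rank[g] for g in genomes)
        counts[first] += 1
    return counts


# Toy input: three genomes in publication order and four families (hypothetical).
order = ["G1", "G2", "G3"]
families = {
    "famA": {"G1", "G2", "G3"},  # present everywhere, credited to G1
    "famB": {"G2", "G3"},        # first seen in G2
    "famC": {"G3"},              # first seen in G3
    "famD": {"G1"},              # first seen in G1
}

novel = novel_family_counts(order, families)  # per-genome "novel" families
curve = list(accumulate(novel))               # cumulative saturation curve

print(novel)
print(curve)
```

A curve that flattens as genomes are added would suggest that new genomes contribute few previously unseen families, whereas sharp steps would mark genomes from sparsely sampled lineages.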
Figure 7. Genome tree of the Chlamydiales. Dendrogram representing phylogenetic relationships of the 37 Chlamydiales genomes analyzed, based on sharing of phylogenetic profiles (see Experimental Section for details). Genome codes are given as labels.