Stefan Simm, Mario Keller, Mario Selymesi, Enrico Schleiff.
Abstract
Cyanobacteria are photosynthetic prokaryotes that are important for many ecosystems and have a high potential for biotechnological use, e.g., in the production of bioactive molecules. Both aspects call for a deep understanding of the functionality of cyanobacteria and of their interaction with the environment. This can in part be inferred from the analysis of their genomes or proteomes. Today, many cyanobacterial genomes have been sequenced and annotated. This information can be used to identify biological pathways present in all cyanobacteria, as proteins involved in such processes are encoded by a so-called core-genome. However, besides the identification of fundamental processes, genes specific for certain cyanobacterial features can be identified by a holistic genome analysis as well. We identified 559 genes that define the core-genome of 58 analyzed cyanobacteria, as well as three genes likely to be signature genes for thermophilic and 57 genes likely to be signature genes for heterocyst-forming cyanobacteria. To gain insight into the cyanobacterial systems for interaction with the environment, we also inspected the diversity of the outer membrane proteome with a focus on β-barrel proteins. We observed that most of the transporting outer membrane β-barrel proteins are not globally conserved in the cyanobacterial phylum. Instead, the occurrence of β-barrel proteins shows high strain specificity. The core set of outer membrane proteins globally conserved in cyanobacteria comprises only three proteins, namely the outer membrane β-barrel assembly protein Omp85, the lipid A transfer protein LptD, and an OprB-type porin. Thus, we conclude that cyanobacteria have developed individual strategies for the interaction with the environment, while other intracellular processes like the regulation of protein homeostasis are globally conserved.
Keywords: Anabaena sp. PCC 7120; comparative genomics; core-genome; cyanobacteria; genotypic and phenotypic differences; ortholog search
Year: 2015 PMID: 25852675 PMCID: PMC4365693 DOI: 10.3389/fmicb.2015.00219
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
Phenotypical, ecological and physiological features analyzed.
| No. | Feature | Categories | CWI |
| --- | --- | --- | --- |
| 1 | Habitat | Sea/Ground/Fresh water/Salt meadow/Host/Water surface/Coast/Mud/Hot spring | 56 |
| 2 | Occurrence | Lab/Nature | 42 |
| 3 | Nitrogen fixation | Yes or No | 29 |
| 4 | Toxin production and export | Yes or No | 14 |
| 5 | Trichome | Yes or No | 52 |
| 6 | Cell composition | Unicellular/Filament/Chain/Pairs | 56 |
| 7 | Cell shape | Spherical/Filamentous/Helical/Coccoid/Rod shaped/Oval | 52 |
| 8 | Heterocyst | Yes or No | 54 |
| 9 | Hormogonia | Yes or No | 6 |
| 10 | Akinete | Yes or No | 7 |
| 11 | Temperature range | Mesophilic/Thermophilic | 56 |
| 12 | Oxygen demand | Aerobic/Anaerobic/Facultative aerobic | 47 |
| 13 | Motility | Mobile/Immobile | 51 |
Given is the number (column 1) and name of the feature analyzed (column 2), the categories of the feature (column 3), and the number of cyanobacteria with known information on the specific feature (CWI, column 4). Detailed information is given in Additional File 1 in Supplementary Material.
β-Barrel probability categorization.
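| Category | TMBp | pHMM | Pfam | CLOG | Anabaena sp. PCC 7120 | All cyanobacteria |
| --- | --- | --- | --- | --- | --- | --- |
| (a) | Probable | Probable | Potential | Detected | 39 | 703 |
| (b) | Probable | Probable | One of the two criteria | | 7 | 179 |
| (c) | Probable | Probable | – | – | 4 | 78 |
| (d) | Others | | | | 6089 | 228,326 |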
Shown is the category of the β-barrel prediction (column 1), the major criteria based on TMBp (column 2) and pHMM (column 3) analysis, the minor criteria based on Pfam search for non-β-barrel domains (column 4) or analysis of the CLOG composition (column 5), and the number of identified genes in Anabaena sp. PCC 7120 (column 6) or in all cyanobacteria (column 7).
Before and after structural prediction by Phyre2 and manual inspection.
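The categorization in the table above can be read as a small decision rule. The following is a minimal sketch, assuming that category (a) means both minor criteria (Pfam and CLOG composition) are satisfied; the function name and the boolean inputs are hypothetical and are not taken from the original analysis pipeline.

```python
def barrel_category(tmbp_probable: bool, phmm_probable: bool,
                    pfam_ok: bool, clog_ok: bool) -> str:
    """Return the category letter 'a', 'b', 'c' or 'd' for one gene.

    Assumed reading of the table: both major criteria (TMBp, pHMM) must be
    'probable'; the letter then depends on how many of the two minor
    criteria (Pfam, CLOG composition) are met.
    """
    if not (tmbp_probable and phmm_probable):
        return "d"  # 'Others': at least one major criterion fails
    minor_hits = int(pfam_ok) + int(clog_ok)
    return {2: "a", 1: "b", 0: "c"}[minor_hits]
```

For example, a gene that passes both major criteria but only the Pfam check would fall into category (b) under this reading.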
Classification and genome size of the analyzed 58 cyanobacterial strains.
| Order | Species | Strain | Abbreviation | Genome size (Mb) | ORFs | Putative/hypothetical ORFs (%) |
| --- | --- | --- | --- | --- | --- | --- |
| Chroococcales | Acaryochloris marina | | Aca1 | 8.36 | 8383 | 52.75 |
| | | | Cro1 | 6.24 | 5958 | 44.56 |
| | | | Cya1 | 5.46 | 5304 | 56.73 |
| | | | Cya2 | 6.55 | 5710 | 36.18 |
| | | | Cya3 | 5.79 | 5327 | 33.40 |
| | | | Cya4 | 4.79 | 4367 | 29.86 |
| | | | Cya5 | 5.43 | 5109 | 31.45 |
| | | | Cya6 | 5.88 | 6475 | 61.64 |
| | | | Cya7 | 7.84 | 6981 | 46.48 |
| | | | Cya8 | 4.80 | 4648 | 34.47 |
| | | | Cyn1 | 2.83 | 2771 | 32.52 |
| | | | Mic1 | 5.84 | 6311 | 53.40 |
| | | | Syn2 | 2.70 | 2525 | 44.40 |
| | | | Syn7 | 2.74 | 2662 | 38.92 |
| | | | Syc1 | 3.95 | 3672 | 50.03 |
| | | | Syn1 | 2.43 | 2526 | 46.00 |
| | | | Syn3 | 2.61 | 2892 | 38.00 |
| | | | Syn4 | 3.41 | 3186 | 31.17 |
| | | | Syn5 | 2.23 | 2304 | 39.67 |
| | | | Syn6 | 2.51 | 2638 | 45.94 |
| | | | Syn8 | 3.05 | 2862 | 32.29 |
| | | | Syn9 | 2.93 | 2760 | 31.88 |
| | | | SynA | 2.22 | 2535 | 36.25 |
| | | | SynB | 2.37 | 2533 | 33.48 |
| | | | SynC | 2.29 | 2507 | 44.28 |
| | | | SynD | 2.43 | 2719 | 42.18 |
| | | | SynF | 5.97 | 5586 | 45.95 |
| | | | SynG | 2.63 | 2883 | 49.60 |
| | | | SynH | 2.69 | 2990 | 35.55 |
| | | | SynI | 2.12 | 2577 | 39.74 |
| | | | The1 | 2.59 | 2476 | 42.37 |
| Gloeobacterales | | | Glo1 | 4.66 | 4431 | 57.98 |
| Nostocales | | | Ana1 | 7.21 | 6135 | 56.95 |
| | | | Ana2 | 7.11 | 5661 | 34.98 |
| | | | Nod1 | 5.32 | 4860 | 50.41 |
| | | | Nos2 | 5.49 | 5321 | 60.42 |
| | | | Nos3 | 9.06 | 6690 | 39.07 |
| Oscillatoriales | | | Lyn1 | 7.04 | 6142 | 53.61 |
| | | | Mil1 | 8.68 | 8294 | 57.14 |
| | | | Art1 | 6.79 | 6630 | 61.70 |
| | | | Art2 | 6.00 | 5690 | 36.50 |
| | | | Art3 | 6.17 | 5675 | 46.70 |
| | | | Osc1 | 6.68 | 5822 | 53.98 |
| | | | Tri1 | 7.75 | 4451 | 39.00 |
| Prochlorales | | | Pro1 | 1.75 | 1882 | 27.52 |
| | | | Pro2 | 1.66 | 1713 | 29.83 |
| | | | Pro3 | 2.41 | 2267 | 32.91 |
| | | | Pro4 | 1.84 | 2163 | 40.41 |
| | | | Pro5 | 1.71 | 1962 | 35.68 |
| | | | Pro6 | 1.67 | 1921 | 35.97 |
| | | | Pro7 | 1.70 | 1906 | 36.41 |
| | | | Pro8 | 2.68 | 2997 | 50.75 |
| | | | Pro9 | 1.86 | 2193 | 46.69 |
| | | | ProA | 1.64 | 1907 | 35.19 |
| | | | ProB | 1.74 | 1983 | 37.17 |
| | | | ProC | 1.69 | 1855 | 37.20 |
| | | | ProF | 1.69 | 1890 | 33.17 |
| Stigonematales | | | Fis1 | 5.38 | 4627 | 27.34 |
Given is the order (column 1), the species according to NCBI and PATRIC taxonomy (column 2; Wattam et al., 2014), and the strain if not identical with the species (column 3) for each cyanobacterium included in this study. Column 4 gives the abbreviation used here, column 5 gives the combined genome size of chromosomes and plasmids in megabases (Mb), and column 6 gives the number of protein-coding open reading frames (ORFs) on the chromosomes and plasmids. Column 7 gives the percentage of ORFs annotated only as putative/hypothetical.
Figure 1. CLOG distribution of the 58 cyanobacteria. (A) The number of cyanobacterial strains from which sequences are included in an individual CLOG was determined. The number of CLOGs containing genes from a given number of strains is shown. (B) CLOGs representing unique, dispensable (dispens), or CORE-genes (core) were determined by OrthoMCL for all genomes or by PGAP for the genomes of the cyanobacterial orders Chroococcales, Nostocales, Oscillatoriales, and Prochlorales. Shown is the frequency with which genes of a certain CLOG category detected by OrthoMCL are assigned to another CLOG category by PGAP (CORE-gene: black; dispensable gene: red; unique gene: green). (C) Shown is the number of sequences of the individual cyanobacteria represented by a CLOG of the CORE-GENOME (black), by a CLOG of the dispensable-genome (red; dispens), and by a CLOG of unique genes (green).
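The distinction between unique, dispensable, and core CLOGs can be illustrated with a short sketch. It assumes the common pan-genome convention that a core CLOG contains sequences from every strain, a unique CLOG from exactly one strain, and a dispensable CLOG from any number in between; the function names and the input mapping (CLOG identifier to set of strain abbreviations) are hypothetical and do not reproduce the OrthoMCL or PGAP output formats.

```python
from collections import Counter


def classify_clogs(clog_strains: dict[str, set[str]], n_strains: int) -> dict[str, str]:
    """Label each CLOG as 'core', 'unique' or 'dispensable'.

    Assumed convention: core = present in all n_strains strains,
    unique = present in exactly one strain, dispensable = everything else.
    """
    labels = {}
    for clog, strains in clog_strains.items():
        if len(strains) == n_strains:
            labels[clog] = "core"
        elif len(strains) == 1:
            labels[clog] = "unique"
        else:
            labels[clog] = "dispensable"
    return labels


def strain_count_histogram(clog_strains: dict[str, set[str]]) -> Counter:
    """Count how many CLOGs contain sequences from 1, 2, ... strains (as in panel A)."""
    return Counter(len(strains) for strains in clog_strains.values())


# Toy input with three strains; the CLOG names are made up for illustration.
toy_clogs = {
    "clog1": {"Ana1", "Nos2", "Pro1"},
    "clog2": {"Ana1", "Nos2"},
    "clog3": {"Pro1"},
}
toy_labels = classify_clogs(toy_clogs, n_strains=3)
```

With real data, `n_strains` would be 58 and the mapping would be built from the clustering result.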
Figure 2. The cyanobacterial core- and pan-genome. (A,B) The number of CLOGs of the cyanobacterial (A) core-genome or (B) pan-genome for a given number of organisms is shown. The box plots were created for the results of 1000 different random selections of different cyanobacterial strains. Further simulations are shown in Additional File 6 in Supplementary Material.
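The random-selection procedure behind such box plots can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the core-genome of a subset is taken as the intersection of the CLOG sets of the selected strains and the pan-genome as their union, and the function name, the input mapping (strain to set of CLOG identifiers), and the fixed seed are hypothetical.

```python
import random


def core_pan_sizes(strain_clogs: dict[str, set[str]], k: int,
                   n_draws: int = 1000, seed: int = 0) -> tuple[list[int], list[int]]:
    """Sample k strains n_draws times and record core- and pan-genome sizes.

    Core size = number of CLOGs shared by all sampled strains (intersection);
    pan size = number of CLOGs found in at least one sampled strain (union).
    """
    rng = random.Random(seed)
    strains = sorted(strain_clogs)
    core_sizes, pan_sizes = [], []
    for _ in range(n_draws):
        subset = rng.sample(strains, k)  # k distinct strains, k >= 1
        sets = [strain_clogs[s] for s in subset]
        core_sizes.append(len(set.intersection(*sets)))
        pan_sizes.append(len(set.union(*sets)))
    return core_sizes, pan_sizes


# Toy input: three strains and four made-up CLOG identifiers.
toy_strain_clogs = {
    "A": {"c1", "c2", "c3"},
    "B": {"c1", "c2"},
    "C": {"c1", "c4"},
}
```

Calling the function for every `k` from 1 to the number of strains yields one distribution per `k`, which is the data a box plot per number of organisms would summarize.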
Functional categories and processes according to COG.
| Functional category | Process | Code | Proteins |
| --- | --- | --- | --- |
| Information storage and processing | Translation, ribosomal structure and biogenesis | J | 90 |
| | Transcription | K | 11 (3) |
| | Replication, recombination and repair | L | 37 (3) |
| | TOTAL | | 141 |
| Cellular processes and signaling | Cell cycle control, cell division, chromosome partitioning | D | 11 |
| | Defense mechanisms | V | 1 |
| | Signal transduction mechanisms | T | 8 |
| | Cell wall/membrane/envelope biogenesis | M | 27 |
| | Cell motility | N | – |
| | Intracellular trafficking, secretion, and vesicular transport | U | 10 (1) |
| | Posttranslational modification, protein turnover, chaperones | O | 40 (1) |
| | TOTAL | | 103 |
| Metabolism | Energy production and conversion | C | 45 (2) |
| | Carbohydrate transport and metabolism | G | 22 (1) |
| | Amino acid transport and metabolism | E | 49 (10) |
| | Nucleotide transport and metabolism | F | 23 (4) |
| | Coenzyme transport and metabolism | H | 46 (6) |
| | Lipid transport and metabolism | I | 15 (3) |
| | Inorganic ion transport and metabolism | P | 13 (2) |
| | Secondary metabolites biosynthesis, transport and catabolism | Q | 3 (2) |
| | TOTAL | | 213 |
| Poorly characterized | General function prediction only | R | 35 |
| | Function unknown | S | 77 |
| | Mixed process | X | 17 |
| | TOTAL | | 129 |
Given is the global functional category (column 1), the functional process (column 2), the one-letter code for the functional process (column 3), and the number of proteins per functional assignment (column 4) for all proteins encoded by the CORE-GENOME of Anabaena sp. PCC 7120. As an example, the CLOG annotation for “Energy production and conversion” is compared to the KEGG annotation (Additional File 7 in Supplementary Material).
The number of proteins in brackets is the count of proteins assigned to two processes (e.g., translation, ribosomal structure and biogenesis, and transcription); such a protein is counted for each of the processes.
The number of proteins assigned to more than two processes.
Figure 3. CLOGs of genes involved in heterocyst formation. (A) A scheme of the localization of the 17 selected heterocyst-specific proteins is shown: the penicillin-binding protein 2 (alr5101 in Anabaena PCC 7120, PbpB; Lazaro et al., 2001), the pentapeptide-repeat protein HglK (all0813; Black et al., 1995), the oxidoreductase HgdA (all5345; Nicolaisen et al., 2009), the glycolipid deposition proteins HgdB and HgdC (all5347 and all5346; Nicolaisen et al., 2009), the HstK family proteins with two-component sensor domain Pkn44 and Pkn30 (all1625 and all3691; Shi et al., 2007), the sensory protein-histidine kinase of a two-component regulatory system (all4496; HepK; Golden and Yoon, 2003), the ketoreductase HetN (alr5358; Corrales-Guerrero et al., 2014), the heterocyst differentiation control proteins HetR (alr2339; Du et al., 2012) and HetF (alr3546; Ionescu et al., 2010), the polypeptides controlling heterocyst pattern formation PatA (all0521; Zhang et al., 2007) and PatS (asl2301; Nicolaisen et al., 2009), the heterocyst envelope polysaccharide synthesis factor HepB (alr3698; Wang et al., 2007), and the heterocyst glycolipid synthases HglC, HglD, and HglE (alr5355, alr5354, and alr5351; Fan et al., 2005). (B) CLOGs including the Anabaena sp. PCC 7120 sequences mentioned in the text were analyzed with respect to the cyanobacteria from which the sequences originated. The number of detected proteins known to be involved in heterocyst formation/function is shown for cyanobacteria known to form heterocysts (red), known not to form heterocysts (yellow), or for which information about heterocyst formation is not available (blue). (C) The inset on the right shows the number of CLOGs containing sequences of the six heterocyst-forming cyanobacteria only.
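The selection of CLOGs that contain sequences of heterocyst-forming strains only can be sketched as a simple set filter. The sketch below is an assumption-laden illustration: the function name and the `max_missing` parameter are hypothetical, the latter mirroring the tables that list one strain without a sequence in a given CLOG, and the list of six strain abbreviations is inferred from the strain table (Nostocales plus Stigonematales) rather than stated explicitly in the text.

```python
def signature_clogs(clog_strains: dict[str, set[str]], trait_strains: set[str],
                    max_missing: int = 0) -> dict[str, list[str]]:
    """Find CLOGs restricted to strains that show a given trait.

    A CLOG qualifies if it has no sequence from a strain lacking the trait
    and misses at most max_missing of the trait-positive strains.
    Returns a mapping: CLOG id -> sorted list of missing trait-positive strains.
    """
    trait = set(trait_strains)
    hits = {}
    for clog, strains in clog_strains.items():
        if strains - trait:
            continue  # contains a sequence from a strain without the trait
        missing = trait - strains
        if len(missing) <= max_missing:
            hits[clog] = sorted(missing)
    return hits


# Assumed set of heterocyst-forming strains (abbreviations from the strain table).
heterocyst_formers = {"Ana1", "Ana2", "Nod1", "Nos2", "Nos3", "Fis1"}
```

Setting `max_missing=1` would also report CLOGs in which a single heterocyst-forming strain has no detected sequence.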
Genes with known or putative function in heterocyst-specific CLOGs.
| Gene | Name | Function | FC 12 h | FC 21 h | CA | V/B | Reference |
| --- | --- | --- | --- | --- | --- | --- | --- |
| all0521 | PatA | Heterocyst formation regulating two-component response regulator | 1.6 | 1.4 | | 0/0 | Liang et al., |
| all1866 | TrxA2 | Thioredoxin A2 | 2.8 | 3.7 | Fis1 | 391/499 | Ehira and Ohmori, |
| all2356 | PhnE | Phosphonate ABC transport permease | 5.9 | 6.1 | Nos2 | 0/490 | Pernil et al., |
| alr2392 | FraC/SepJ | Filament integrity protein | −1.7 | 1.9 | | 0/0 | Bauer et al., |
| alr2834 | HepC | Glycosyl transferase | 47.3 | 19.2 | | 0/0 | Zhu et al., |
| alr2837 | | Glycosyl transferase of group 2 | up | up | | 0/27 | Huang et al., |
| alr3234 | | Similar to heterocyst formation protein HetP | −1.2 | −1.3 | Fis1 | 0/0 | Higa and Callahan, |
| alr3287 | NrtB | Nitrate transport protein | 1.1 | 1.9 | Nod1 | 0/479 | Herrero et al., |
| alr3732 | PknE | Protein serine-threonine kinase | 3.8 | 1.2 | | 0/0 | Zhang et al., |
| alr4368 | PknD | Serine/threonine kinase | 3.0 | 1.4 | | 0/0 | Zhang and Libs, |
| all5341 | HglT | Glycosyl transferase of group 1 | up | up | | 48/485 | Awai and Wolk, |
| all5344 | | Unknown | not | up | | 0/141 | Fan et al., |
| all5346 | HgdC | Membrane spanning subunit of heterocyst specific ABC-transporter | not | 34.6 | | 0/85 | Fan et al., |
| all5347 | HgdB | Membrane fusion protein of heterocyst specific ABC-transporter | 2.3 | 115.8 | | 0/62 | Fan et al., |
| all0059 | | Lipopolysaccharide biosynthesis protein | 53.6 | 19.2 | | 0/71 | None |
| all1345 | | Probable glycosyl transferase | −1.2 | −1.3 | | 0/185 | None |
| all1862 | | Putative peptidase | 22.2 | 9.6 | Fis1 | 0/0 | None |
| all2008 | | Serine proteinase | 1.2 | 1.2 | | 6/198 | None |
| all2068 | | Alpha/beta hydrolase fold protein | 1.3 | 1.0 | | 59/482 | None |
| all2357 | | Phosphonate ABC transport ATP-binding component | 4.9 | 3.3 | Nos2 | 485/497 | None |
| all2358 | | Periplasmic phosphonate binding protein | 6.3 | 2.9 | | 0/148 | None |
| alr2463 | | Aminoglycoside phosphotransferase | 9.8 | 3.6 | | 0/1 | None |
| alr3125 | | Heme oxygenase | −2.5 | 2.4 | Nod1 | 0/385 | None |
| alr3235 | TrpC | Indole-3-glycerol phosphate synthase | up | up | Fis1 | 89/498 | None |
| alr3246 | | Pyridoxamine 5′ phosphate oxidase related protein | up | up | Fis1 | 0/429 | None |
| all3306 | | Pentapeptide repeat containing protein | up | up | Fis1 | 0/21 | None |
| all3559 | | Putative peptidase | −1.7 | 1.5 | Nod1 | 0/0 | None |
| alr3774 | | Rhomboid like protein | 3.5 | 2.4 | | 0/419 | None |
| alr3931 | | Rhomboid family protein | 1.1 | −1.0 | Nos2 | 0/485 | None |
| alr3948 | CbiQ | Cobalt transport protein | 6.8 | 4.2 | | 0/1 | None |
| all3984 | | Predicted ATP-dependent protease | 2.1 | 1.0 | | 0/0 | None |
| all4051 | | Prc barrel domain containing protein | 2.3 | 2.7 | | 0/30 | None |
| all4538 | | Mannose-6-phosphate isomerase | 1.5 | −1.2 | | 0/107 | None |
| all4729 | | Putative metalloprotein | −1.0 | 100.8 | | 0/1 | None |
| asl4754 | PetM | Cytochrome b6f complex subunit | −2.5 | −1.8 | | 0/0 | None |
| all4768 | | ErfK/YbiS/YcfS/YnhG family protein | 2.7 | 7.5 | Nod1 | 0/11 | None |
| alr4812 | PatN | Heterocyst differentiation related protein | 1.3 | 1.4 | Fis1 | 0/0 | None |
| alr4877 | | WD40-repeat protein | 2.5 | 2.7 | Nod1 | 0/0 | None |
| alr4898 | | Transcriptional regulator | 2.1 | 1.6 | Fis1 | 3/90 | None |
| alr4984 | | Peptidoglycan binding domain 1 containing protein | 25.4 | 5.7 | | 0/1 | None |
| asr5289 | | Similar to subunit X of photosystem I | 1.2 | 1.0 | | 0/0 | None |
| all5304 | | Secretion protein HlyD family protein | 6.0 | 3.2 | | 0/491 | None |
| ava0606 | | Transmembrane protein | not | not | Ana1 | 0/0 | None |
Shown is the accession number in Anabaena sp. PCC 7120 or Anabaena variabilis ATCC 29413 (column 1), the name and function of the gene if assigned (columns 2, 3), the fold change (FC) of expression after 12 and 21 h of nitrogen starvation compared to 0 h (Flaherty et al., 2011; columns 4, 5; up, infinite; not, not expressed), the cyanobacteria for which no sequence is identified in the corresponding CLOG (CA, column 6), the number of sequences found in the genomes of Viridiplantae or bacteria (V/B, column 7), and a reference for functional relevance for heterocyst function or development (column 8).
Candidatus Solibacter usitatus.
Thalassospira profundimaris.
Rhodopseudomonas palustris.
Paenibacillus mucilaginosus.
Genes of unknown function in heterocyst-specific CLOGs.
| Gene | FC 12 h | FC 21 h | CA | V/B |
| --- | --- | --- | --- | --- |
| asl0176 | 1.9 | 4.8 | | 0/0 |
| alr0255 | 8.5 | 4.9 | | 0/0 |
| all0307 | 5.8 | 3.0 | Fis1 | 0/0 |
| asr0460 | 1.6 | not | Nos2 | 0/0 |
| asr0461 | −1.0 | −1.9 | Nos2 | 0/0 |
| all0463 | 7.7 | 10.6 | Nos2 | 0/0 |
| asr0680 | −1.6 | −1.9 | Fis1 | 0/19 |
| alr0805 | 1.4 | 1.2 | | 2/0 |
| asl0842 | −1.5 | −1.6 | Fis1 | 0/0 |
| all0997 | −4.9 | −1.8 | Fis1 | 0/0 |
| alr1137 | −1.6 | −2.7 | | 0/0 |
| alr1146 | 9.1 | 5.3 | | 0/1 |
| alr1147 | 2.5 | 1.8 | Nos2 | 0/2 |
| alr1148 | 8.7 | 7.7 | | 0/0 |
| asr1289 | −2.7 | −2.7 | Fis1 | 0/0 |
| all1395 | up | up | Nos2 | 0/0 |
| asl1412 | 3.6 | 3.4 | | 0/0 |
| asr1775 | 1.9 | 2.2 | Nos2 | 0/0 |
| all1814 | 15.5 | 5.9 | | 0/0 |
| asl1933 | not | not | Fis1 | 0/0 |
| all2003 | 4.0 | 1.8 | | 0/1 |
| all2089 | 1.8 | 1.3 | Nos2 | 0/0 |
| all2344 | 1.5 | −1.1 | | 0/0 |
| alr2366 | −1.1 | −1.1 | Nos2 | 0/0 |
| alr2374 | 3.3 | 2.3 | | 0/0 |
| alr2522 | up | up | | 0/0 |
| asr3134 | −1.7 | −2.6 | Nod1 | 0/0 |
| asr3279 | 4.8 | 7.7 | Nos2 | 0/0 |
| all3520 | 2.5 | 2.4 | Fis1 | 0/0 |
| alr3562 | 1.3 | −1.7 | | 0/0 |
| all3568 | 1.1 | −1.0 | | 0/445 |
| all3696 | 13.2 | 6.1 | | 0/243 |
| alr3720 | 9.3 | 3.6 | | 0/0 |
| all3745 | −1.6 | −1.6 | Fis1 | 0/0 |
| alr3910 | −2.1 | −1.5 | Nos2 | 0/0 |
| all4073 | 4.6 | 8.5 | | 0/0 |
| asl4098 | 1.6 | 1.3 | | 0/0 |
| all4117 | 2.4 | 1.8 | | 0/0 |
| all4381 | 5.5 | 5.1 | | 0/0 |
| alr4534 | 1.5 | 1.2 | | 0/0 |
| all4555 | 2.2 | 1.5 | Nod1 | 0/0 |
| asl4565 | 1.1 | 2.5 | | 0/0 |
| alr4684 | 1.6 | 4.2 | Nos2 | 0/0 |
| alr4714 | 2.6 | 1.8 | | 0/0 |
| asl4743 | 4.1 | 1.2 | | 0/0 |
| alr4788 | 1.7 | 1.7 | | 0/0 |
| asl4860 | −1.2 | 1.7 | | 0/0 |
| all4962 | 9.7 | 5.6 | Nos2 | 0/0 |
| alr5005 | 1.3 | 1.1 | | 0/0 |
| asr5071 | −1.4 | 1.1 | | 0/0 |
Shown is the accession number in Anabaena sp. PCC 7120 (column 1), the fold change of expression after 12 and 21 h of nitrogen starvation compared to 0 h (Flaherty et al., 2011; columns 2, 3; up, infinite; not, not expressed), the cyanobacteria for which no sequence is identified in the CLOG (CA, column 4), and the number of sequences found in the genomes of Viridiplantae or bacteria (V/B, column 5).
Glycine max, Solanum lycopersicum.
Streptomyces aurantiacus.
Frankia sp. EUN1f, Streptomyces aurantiacus.
Nitrosococcus halophilus.
Figure 4. Feature and shared CLOG correlation tree. (A,B) The neighbor-joining tree of the 58 cyanobacteria was calculated based (A) on pair-wise shared CLOGs as distances or (B) on the similarities in the 13 selected features as distances. The roots of the different branches from the deepest root (CORE-GENOME) to Anabaena sp. PCC 7120 are marked by letters in (A) (F–A) or Roman numerals in (B) (I–VI), and the number of CLOGs defining the core-genome for the branch with this root is given. The ratio of the core-genomes of the branches with different roots to the average size of the core-genome expected for this number of strains (Figure 2) is indicated on the bottom left. For simplicity, only the branches discussed are shown, while all strains of the remaining part of the tree are clustered in the box on top. The full tree is shown in Additional File 4 in Supplementary Material. (C) Each core-genome with the root indicated in (A,B) was determined, and the number of proteins of a specific category/process (Table 4) found in addition to the core-genome of the deeper roots was counted and is deposited in Additional Files 8, 9 in Supplementary Material. Shown is the occurrence of unique proteins (in percent of all identified proteins) assigned to the four categories “Information storage and processing” (I, S, and P), “Cellular processes and signaling” (C, P, and S), “Metabolism” (METAB), and unknown (UNKN) in the different clade-specific core-genomes defined for the CLOG tree (top) and feature tree (bottom). (D) Shown is the occurrence of unique proteins assigned to the individual processes (indicated by the one-letter code shown in Table 4). The distribution of proteins for each process is shown as a color code indicated on the right (Scale). For each distribution, the profile was analyzed with an inverted Gaussian distribution, and the position of the minimum was used to assign the process as CLADE-specific defined, CLADE and CORE-GENOME defined, or CORE-GENOME defined (the scale is shown on the right; the position of the minimum is given in percent: 0% = exclusive detection in the core-genome of CLADE A or I, 100% = exclusive detection in the CORE-GENOME). The results for equally distributed (CORE and CLADE) genes are shown in Additional File 10 in Supplementary Material.
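A distance matrix of the kind that could feed a neighbor-joining tree in panel (A) can be sketched from pair-wise shared CLOGs. The caption does not state how the number of shared CLOGs was converted into a distance, so the Jaccard distance used below (one minus shared CLOGs divided by the union) is an assumption chosen for illustration; the function name and the input mapping are hypothetical as well.

```python
from itertools import combinations


def shared_clog_distances(strain_clogs: dict[str, set[str]]):
    """Compute a symmetric pair-wise distance table from shared CLOGs.

    Assumed distance: Jaccard distance, 1 - |shared CLOGs| / |union of CLOGs|.
    Returns the sorted strain names and a nested dict dist[a][b].
    """
    names = sorted(strain_clogs)
    dist = {a: {b: 0.0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        union = strain_clogs[a] | strain_clogs[b]
        shared = strain_clogs[a] & strain_clogs[b]
        d = 1.0 - len(shared) / len(union) if union else 0.0
        dist[a][b] = dist[b][a] = d
    return names, dist


# Toy input: three strains and four made-up CLOG identifiers.
toy_sets = {
    "A": {"c1", "c2", "c3"},
    "B": {"c1", "c2"},
    "C": {"c1", "c4"},
}
toy_names, toy_dist = shared_clog_distances(toy_sets)
```

The resulting table can be passed to any neighbor-joining implementation; strains sharing more CLOGs end up closer together in the tree.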
Clusters of β-barrel representing sequences.
| Family (Pfam domain) | Cluster | Strains (cluster) | Strains (family) | Orders (cluster) | Orders (family) | Sequences (cluster) | Sequences (family) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| (Glucose selective) OprB | 9 | 7 | 58 | 2 | 6 | 8 | 295 |
| | 20 | 9 | | 2 | | 15 | |
| | 21 | 58 | | 6 | | 274 | |
| Omp85 | 10 | 58 | | 6 | | 155 | |
| LptD (DUF3769) | 7 | 56 | | 6 | | 56 | |
| TonB_dep_Rec/TBDT | 6 | 22 | | 6 | | 124 | |
| OmpA_Pfam/OMPdb | 11 | 14 | 20 | 5 | 5 | 15 | 27 |
| | 14 | 3 | | 3 | | 5 | |
| | 16 | 7 | | 2 | | 7 | |
| Omp_β-brl | 13 | 7 | 17 | 4 | 5 | 10 | 27 |
| | 15 | 9 | | 1 | | 11 | |
| | 18 | 5 | | 4 | | 6 | |
| DUF3442 Intimin/Invasin | 3 | 5 | 16 | 2 | 3 | 6 | 32 |
| | 4 | 3 | | 3 | | 6 | |
| | 5 | 10 | | 2 | | 20 | |
| Fasciclin | 17 | 5 | | 3 | | 5 | |
| DUF481 | 1 | 4 | 15 | 1 | 2 | 6 | 17 |
| | 8 | 11 | | 2 | | 11 | |
| OmpW | 19 | 5 | | 2 | | 5 | |
| Cellulose synthesis complex barrel/BcsC | 2 | 5 | | 2 | | 5 | |
| Autotransporter | 12 | 4 | | 1 | | 5 | |
Shown are the names of the Pfam domains characteristic for the β-barrel families (column 1), the number of the cluster according to Additional File 11 in Supplementary Material (column 2), number of strains of which a sequence is present in the cluster (column 3) or in all clusters of the same family (column 4), the number of orders of which sequences are in the cluster (column 5) or in all clusters of the same family (column 6), and the number of different sequences in the cluster (column 7) or in all clusters of the same family (column 8).
Orders: Chroococcales, Gloeobacterales, Nostocales, Oscillatoriales, Prochlorales, Stigonematales.
DUF, domain of unknown function.
Figure 5. β-Barrel proteins in various core-genomes. Given are the numbers of OMPs characterized by the indicated domains (Table 7) found in Anabaena sp. PCC 7120 that are present in the indicated core-genome of the feature or CLOG tree (Figure 4). T indicates the total number of identified sequences.