Nikola Palevich1,2, William J Kelly1, Sinead C Leahy1, Eric Altermann1, Jasna Rakonjac2, Graeme T Attwood1.
Abstract
Butyrivibrio hungatei MB2003 was isolated from the plant-adherent fraction of rumen contents from a pasture-grazed New Zealand dairy cow, and was selected for genome sequencing in order to examine its ability to degrade plant polysaccharides. The genome of MB2003 is 3.39 Mb and consists of four replicons: a chromosome, a secondary chromosome or chromid, a megaplasmid and a small plasmid. The genome has an average G + C content of 39.7%, and encodes 2983 putative protein-coding genes. MB2003 is able to use a variety of monosaccharide substrates for growth, with acetate, butyrate and formate as the principal fermentation end-products, and the genes encoding these metabolic pathways have been identified. MB2003 is predicted to encode an extensive repertoire of CAZymes with 78 GHs, 7 CEs, 1 PL and 78 GTs. MB2003 is unable to grow on xylan or pectin, and its role in the rumen appears to be as a utilizer of monosaccharides, disaccharides and oligosaccharides made available by the degradative activities of other bacterial species.
Keywords: Bacteria; Butyrivibrio; Degradation; Genome; Hemicellulose; Pectin; Rumen
Year: 2017 PMID: 29225728 PMCID: PMC5716241 DOI: 10.1186/s40793-017-0285-8
Source DB: PubMed Journal: Stand Genomic Sci ISSN: 1944-3277
Fig. 1 Morphology of B. hungatei MB2003. Micrograph of Gram stained B. hungatei MB2003 cells at 100 × magnification. Bar represents 10 μm
Fig. 2 Transmission electron micrograph of B. hungatei MB2003. Micrograph of negatively stained B. hungatei MB2003 cells at 10,000 × magnification
Fig. 3 Phylogenetic tree highlighting the relationship of B. hungatei MB2003 relative to the type strains of the other species within the genus Butyrivibrio. The evolutionary history was inferred using the Maximum Likelihood method based on the General Time Reversible model [55]. The tree with the highest log likelihood (−3712.3329) is shown. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (10,000 replicates) is shown next to the branches [56]. Initial tree(s) for the heuristic search were obtained automatically by applying Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using the Maximum Composite Likelihood (MCL) approach, and then selecting the topology with superior log likelihood value. A discrete Gamma distribution was used to model evolutionary rate differences among sites (5 categories (+G, parameter = 0.3950)). The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. The analysis involved six nucleotide sequences. All positions with less than 95% site coverage were eliminated. There were a total of 1509 positions in the final dataset. Evolutionary analyses were conducted in MEGA6 [55]. GenBank accession numbers of the 16S rRNA gene sequences are shown in parentheses. Bar, 0.02 nucleotide substitutions per site. T, indicates type strain. All the type strains have their genome sequencing projects registered in the Genomes Online Database (GOLD) [57]
Classification and general features of the rumen bacterium B. hungatei MB2003 in accordance with the MIGS recommendations [58]
| MIGS ID | Property | Term | Evidence codea |
|---|---|---|---|
| | Current classification | Domain: | TAS [ |
| | | Phylum: | TAS [ |
| | | Class: | TAS [ |
| | | Order: | TAS [ |
| | | Family: | TAS [ |
| | | Genus: | TAS [ |
| | | Species: | TAS [ |
| | | Type strain: No | |
| | | Strain: MB2003 | TAS [ |
| | Gram stain | Positive | TAS [ |
| | Cell shape | Rod | TAS [ |
| | Motility | Non-motile | IDA |
| | Sporulation | Not reported | NAS |
| | Temperature range | 37–39 °C | IDA |
| | Optimum temperature | 39 °C | IDA |
| | pH range; Optimum | 6.0–7.0; 6.4 | IDA |
| | Carbon source | Variety of carbohydrates | IDA |
| | Energy metabolism | Fermentative metabolism | IDA |
| MIGS-6 | Habitat | Bovine rumen | TAS [ |
| MIGS-6.3 | Salinity | Not reported | |
| MIGS-22 | Oxygen requirement | Anaerobic | IDA |
| MIGS-15 | Biotic relationship | Symbiont of ruminants | TAS [ |
| MIGS-14 | Pathogenicity | Non-pathogen | NAS |
| MIGS-4 | Geographic location | Ruakura, Hamilton, New Zealand | TAS [ |
| MIGS-5 | Sample collection time | May 2009 | TAS [ |
| MIGS-4.1 | Latitude | −37.77 (37°46′28″S) | IDA |
| MIGS-4.2 | Longitude | +175.31 (175°18′31″E) | IDA |
| MIGS-4.4 | Altitude | 40 m | IDA |
aEvidence codes - IDA: Inferred from Direct Assay; TAS: Traceable Author Statement (i.e., a direct report exists in the literature); NAS: Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). Evidence codes are from the Gene Ontology project [65]
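The latitude and longitude rows above give each coordinate in both decimal degrees and degrees/minutes/seconds. The following is a minimal sketch of that conversion; the function name `dms_to_decimal` is illustrative and not part of the original report.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus a hemisphere letter to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # South and West are negative by convention.
    return -value if hemisphere in ("S", "W") else value


# Coordinates as listed for MIGS-4.1 and MIGS-4.2.
latitude = dms_to_decimal(37, 46, 28, "S")
longitude = dms_to_decimal(175, 18, 31, "E")
print(f"{latitude:.2f}, {longitude:+.2f}")  # -37.77, +175.31
```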
Carbon source utilization of the Butyrivibrio strains
| Substrate | | MB2003 | JK615T | B316T |
|---|---|---|---|---|
| Monosaccharides | Arabinose | ++ | ++ | ++ |
| | Fructose | – | – | ++ |
| | Galactose | ++ | – | ++ |
| | Glucose | ++ | ++ | ++ |
| | Mannose | – | ++ | ++ |
| | Rhamnose | – | – | ++ |
| | Ribose | – | – | – |
| | Xylose | ++ | ++ | ++ |
| Disaccharides | Cellobiose | ++ | ++ | ++ |
| | Lactose | ++ | ++ | ++ |
| | Maltose | ++ | ++ | ++ |
| | Melibiose | – | – | + |
| | Sucrose | ++ | ++ | ++ |
| Trisaccharides | Melezitose | – | – | ++ |
| | Raffinose | – | ++ | ++ |
| | Trehalose | – | – | ++ |
| Sugar Alcohols | myo-Inositol | – | – | – |
| | Mannitol | – | – | + |
| | Sorbitol | – | – | – |
| Glycosides | Amygdalin | + | – | ++ |
| | Esculin | – | ++ | ++ |
| | Rutin | – | ++ | ++ |
| | Salicin | ++ | ++ | ++ |
| Insoluble substrates | Cellulose | – | – | – |
| | Dextrin | – | – | ++ |
| | Inulin | + | – | ++ |
| | Starch | – | – | ++ |
| | Pectin | – | – | ++ |
| | Xylan | – | – | ++ |
ΔOD600nm readings of 0.5–1.0 were scored as ++, 0.2–0.5 scored as +, and 0–0.2 scored as -. Results for B. hungatei JK615T and B. proteoclasticus B316T are adapted from Kopečný et al. [19] and Moon et al. [6], respectively
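The scoring rule in the footnote above can be written as a small function. This is only a sketch: the footnote's ranges share their boundary values (0.2 and 0.5), so assigning a boundary reading to the higher score is an assumption here, as is scoring any reading above 1.0 as ++. The name `score_growth` is illustrative.

```python
def score_growth(delta_od600):
    """Map a change in OD600 to the growth score used in the carbon source table."""
    # Assumption: boundary values (0.2, 0.5) fall into the higher category.
    if delta_od600 >= 0.5:
        return "++"
    if delta_od600 >= 0.2:
        return "+"
    return "-"


for reading in (0.72, 0.31, 0.05):  # hypothetical readings, not data from the report
    print(reading, score_growth(reading))
```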
Fig. 4 Culture density achieved in 24 h by MB2003 growing in media with cellobiose as the sole substrate. Points indicate means of three replicates, and the error bars represent ± one standard error
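For readers unfamiliar with how such points and error bars are derived, the sketch below computes a mean and one standard error of the mean from three replicates. The readings are hypothetical placeholders, not values from the figure, and the function name is illustrative.

```python
import math
import statistics


def mean_and_standard_error(replicates):
    """Return (mean, standard error of the mean) for a list of replicate readings."""
    n = len(replicates)
    mean = statistics.mean(replicates)
    # Standard error = sample standard deviation / sqrt(number of replicates).
    standard_error = statistics.stdev(replicates) / math.sqrt(n)
    return mean, standard_error


# Hypothetical OD600 readings for one time point (three replicates).
mean, se = mean_and_standard_error([0.9, 1.0, 1.1])
print(f"mean = {mean:.3f}, SE = {se:.3f}")
```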
MB2003 genome project information
| MIGS ID | Property | Term |
|---|---|---|
| MIGS-31 | Finishing quality | High-quality, closed genome |
| MIGS-28 | Libraries used | 454 3 kb mate paired-end library |
| MIGS-29 | Sequencing platforms | 454 GS FLX Titanium chemistry |
| MIGS-31.2 | Fold coverage | 234× |
| MIGS-30 | Assemblers | Newbler version 2.3 |
| MIGS-32 | Gene calling method | Glimmer and BLASTX |
| | Locus Tag | bhn and bhn_RS |
| | Genbank ID | CP017830, CP017831, CP017832, CP017833 |
| | Genbank Date of Release | 31 October 2016 |
| | GOLD ID | Ga0074201 |
| | BIOPROJECT ID | PRJNA349214 and PRJNA224116 |
| | BIOSAMPLE ID | SAMN05928573 |
| MIGS-13 | Source Material Identifier | |
| | Project relevance | Ruminant plant-fibre degradation |
Summary of MB2003 genome replicon features
| Replicon type | Size (bp) | Topology | INSDC identifier | RefSeq ID |
|---|---|---|---|---|
| Chromosome | 3,143,784 | circular | CP017831 | NZ_CP017831 |
| Chromid_BhuII | 91,776 | circular | CP017830 | NZ_CP017830 |
| Megaplasmid_pNP144 | 144,470 | circular | CP017832 | NZ_CP017832 |
| Plasmid_pNP6 | 6284 | circular | CP017833 | NZ_CP017833 |
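As a quick consistency check, the four replicon sizes in the table above can be summed and compared with the total genome size reported elsewhere in this record (3,386,314 bp, or 3.39 Mb). A minimal sketch:

```python
# Replicon sizes in base pairs, copied from the replicon features table.
replicon_sizes_bp = {
    "Chromosome": 3_143_784,
    "Chromid_BhuII": 91_776,
    "Megaplasmid_pNP144": 144_470,
    "Plasmid_pNP6": 6_284,
}

total_bp = sum(replicon_sizes_bp.values())
print(total_bp)                    # 3386314
print(f"{total_bp / 1e6:.2f} Mb")  # 3.39 Mb
```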
MB2003 genome statistics
| Attribute | Value | % of totala |
|---|---|---|
| Genome size (bp) | 3,386,314 | 100 |
| DNA coding (bp) | 3,064,986 | 90.51 |
| DNA G + C (bp) | 1,344,683 | 39.71 |
| DNA scaffolds | 4 | 100 |
| Total genes | 3064 | 100 |
| Protein coding genes | 2983 | 97.36 |
| RNA genes | 60 | 1.96 |
| Pseudogenes | 17 | 0.56 |
| Genes in internal clusters | 160 | 5.22 |
| Genes with function predicted | 2247 | 73.34 |
| Genes assigned to COGs | 1842 | 61.34 |
| Genes with Pfam domains | 2350 | 78.26 |
| Genes with signal peptides | 148 | 4.93 |
| Genes with transmembrane helices | 881 | 29.34 |
| CRISPR repeats | 2 | |
aThe total is based on either the size of the genome in base pairs or the total number of genes or protein-coding genes in the annotated genome
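The footnote describes two kinds of denominator. The sketch below reproduces a few of the tabulated percentages, using the genome size for base-pair attributes and the total gene count for gene attributes. Only rows that were checked against these two denominators are shown; the helper name `percent` is illustrative.

```python
genome_size_bp = 3_386_314  # "Genome size (bp)" row
total_genes = 3_064         # "Total genes" row


def percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to two decimals."""
    return round(100 * part / whole, 2)


print(percent(3_064_986, genome_size_bp))  # DNA coding      -> 90.51
print(percent(1_344_683, genome_size_bp))  # DNA G + C       -> 39.71
print(percent(2_983, total_genes))         # Protein coding  -> 97.36
print(percent(60, total_genes))            # RNA genes       -> 1.96
```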
Number of genes associated with the general COG functional categories
| Code | Value | % of totala | Description |
|---|---|---|---|
| J | 194 | 9.52 | Translation, ribosomal structure and biogenesis |
| A | 0 | 0 | RNA processing and modification |
| K | 149 | 7.31 | Transcription |
| L | 88 | 4.32 | Replication, recombination and repair |
| B | 0 | 0 | Chromatin structure and dynamics |
| D | 32 | 1.57 | Cell cycle control, Cell division, chromosome partitioning |
| V | 65 | 3.19 | Defense mechanisms |
| T | 139 | 6.82 | Signal transduction mechanisms |
| M | 155 | 7.61 | Cell wall/membrane biogenesis |
| N | 61 | 2.99 | Cell motility |
| U | 22 | 1.08 | Intracellular trafficking and secretion |
| O | 78 | 3.83 | Posttranslational modification, protein turnover, chaperones |
| C | 69 | 3.39 | Energy production and conversion |
| G | 243 | 11.93 | Carbohydrate transport and metabolism |
| E | 177 | 8.69 | Amino acid transport and metabolism |
| F | 80 | 3.93 | Nucleotide transport and metabolism |
| H | 79 | 3.88 | Coenzyme transport and metabolism |
| I | 72 | 3.53 | Lipid transport and metabolism |
| P | 79 | 3.88 | Inorganic ion transport and metabolism |
| Q | 16 | 0.79 | Secondary metabolites biosynthesis, transport and catabolism |
| R | 158 | 7.76 | General function prediction only |
| S | 73 | 3.58 | Function unknown |
| – | 1245 | 40.33 | Not in COGs |
aThe total is based on the total number of protein coding genes in the genome
Fig. 5 Genome atlas for B. hungatei MB2003. The figure represents a circular view of the four replicons that make up the B. hungatei MB2003 genome. The key at the right describes the concentric circles within each replicon in the outermost to innermost direction. The diagram was created using GENEWIZ [66] and custom-developed software. The innermost circle 1 shows GC-skew; Circle 2 shows COG classification: predicted ORFs were analysed using the COG database and grouped into the five major categories: yellow, information storage and processing; red, cellular processes and signalling; green, metabolism; blue, poorly characterised; and uncoloured, ORFs with uncharacterized COGs or no COG assignment. Circle 3 shows transmembrane helices (TMH) and SignalP domains: the four categories represent, uncoloured, absent; red, TMH; blue, SignalP; and black, both TMH and SignalP present. Circle 4 shows ORF orientation: ORFs in sense orientation (ORF+) are shown in blue; ORFs oriented in antisense direction (ORF-) are shown in red. Circle 5 shows ribosomal machinery: tRNAs and rRNAs are shown as green or red lines, respectively. Clusters are represented as coloured boxes to maintain readability. Circle 6 shows G + C content deviation from the average: GC-content is shown in either green (low GC spike) or orange (high GC spike). A box filter was applied to visualize contiguous regions of low or high GC deviations. Circle 7 shows BLAST similarities: deduced amino acid sequences were compared against the nonredundant (nr) database using gapped BLASTP [67]. Regions in blue represent unique proteins in MB2003, whereas highly conserved features relative to sequences in the nr database are shown in red. The degree of colour saturation corresponds to the level of similarity. The predicted origin and terminus of DNA replication are indicated
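Two of the tracks in the atlas, GC-skew and G + C deviation from the average, are simple per-window statistics. The sketch below uses the standard definition of GC-skew, (G − C)/(G + C), over non-overlapping windows. It is an illustration only: the window scheme and the function name `window_stats` are assumptions, and it does not reproduce the box filter or the software used for the figure.

```python
def window_stats(sequence, window):
    """Return (start, gc_skew, gc_deviation) for each non-overlapping window.

    gc_skew      = (G - C) / (G + C) within the window
    gc_deviation = window G+C fraction minus the whole-sequence G+C fraction
    """
    seq = sequence.upper()
    overall_gc = (seq.count("G") + seq.count("C")) / len(seq)
    results = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        skew = (g - c) / (g + c) if (g + c) else 0.0
        deviation = (g + c) / window - overall_gc
        results.append((start, skew, deviation))
    return results


# Toy sequence, not MB2003 data.
for start, skew, deviation in window_stats("GGGGAAAACCCC", 4):
    print(start, round(skew, 2), round(deviation, 2))
```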
Genome statistics of MB2003, JK615T and B316T
| Attribute | MB2003 Value | MB2003 % of totala | JK615T Value | JK615T % of totala | B316T Value | B316T % of totala |
|---|---|---|---|---|---|---|
| Status | Complete | | Draft | | Complete | |
| Isolation source | Bovine rumen | | Ovine rumen | | Bovine rumen | |
| Genome size (bp) | 3,386,314 | 100 | 3,394,947 | 100 | 4,404,886 | 100 |
| DNA coding (bp) | 3,064,986 | 90.51 | 3,108,180 | 91.55 | 3,954,077 | 89.77 |
| DNA G + C (bp) | 1,344,683 | 39.71 | 1,353,252 | 39.86 | 1,762,323 | 40.01 |
| Number of replicons | 4 | | NA | | 4 | |
| DNA scaffolds | 4 | 100 | 22 | 100 | 4 | 100 |
| Total genes | 3064 | 100 | 3104 | 100 | 3863 | 100 |
| Protein coding genes | 2983 | 97.36 | 2996 | 96.52 | 3739 | 96.79 |
| RNA genes | 60 | 1.96 | 55 | 1.78 | 68 | 1.75 |
| rRNA operons | 4 | | 4 | | 6 | |
| tRNA genes | 48 | 1.57 | 46 | 1.49 | 50 | 1.29 |
| Pseudo genes | 17 | 0.56 | 49 | | 54 | 1.39 |
| Genes in internal clusters | 160 | 5.22 | 211 | 6.82 | 327 | 8.43 |
| Genes with function prediction | 2225 | 72.62 | 2314 | 74.55 | 2505 | 64.85 |
| Genes assigned to COGs | 1842 | 61.34 | 1861 | 60.17 | 2075 | 53.49 |
| Genes with Pfam domains | 2350 | 78.26 | 2407 | 77.82 | 2784 | 71.77 |
| Genes with signal peptides | 148 | 4.93 | 137 | 4.43 | 269 | 6.93 |
| Genes with transmembrane helices | 881 | 29.34 | 847 | 27.38 | 1061 | 27.35 |
| CRISPR repeats | 2 | | NA | | NA | |
| Reference | This report | | [ | | [ | |
aThe total is based on either the size of the genome in base pairs or the total number of genes or protein-coding genes in the annotated genome. bIndicates draft genome sequence
Genes encoding predicted polysaccharide degrading enzymes in the MB2003 genome
| Locus tag | Name | Annotation | Size (aa) | CAZya | Binding domains |
|---|---|---|---|---|---|
| bhn_I2518 | | β-galactosidaseb | 1034 | GH2 | |
| bhn_I0827 | | β-galactosidaseb | 714 | GH2 | |
| bhn_I1587 | | β-galactosidaseb | 825 | GH2 | |
| bhn_I0200 | | glycoside hydrolase family 2b | 641 | GH2 | |
| bhn_I1127 | | glycoside hydrolase family 2b | 912 | GH2 | |
| bhn_I1849 | | glycoside hydrolase family 2b | 776 | GH2 | |
| bhn_III062 | | β-glucosidaseb | 803 | GH3 | |
| bhn_I0707 | | β-glucosidaseb | 808 | GH3 | |
| bhn_I0180 | | β-glucosidaseb | 671 | GH3 | |
| bhn_I0189 | | β-xylosidaseb | 707 | GH3 | |
| bhn_I1756 | | reducing end xylose-releasing exo-oligoxylanaseb | 383 | GH8 | |
| bhn_I0834 | | cellodextrinaseb | 552 | GH9 | CelD |
| bhn_I1458 | | 1,4-α-glucan branching enzymeb | 824 | GH13 | CBM48 |
| bhn_I0053 | | 1,4-α-glucan branching enzymeb | 663 | GH13 | CBM48 |
| bhn_I2702 | | α-amylaseb | 697 | GH13 | CBM34 |
| bhn_I1680 | | α-amylaseb | 434 | GH13 | |
| bhn_I0669 | | α-amylaseb | 511 | GH13 | |
| bhn_I1153 | | glycogen debranching enzymeb | 726 | GH13 | CBM48 |
| bhn_I1315 | | glycogen debranching enzymeb | 648 | GH13 | |
| bhn_I0652 | | sucrose phosphorylaseb | 553 | GH13 | |
| bhn_I0527 | | lysozymeb | 1213 | GH25 | Big2 (×2) |
| bhn_I1287 | | α-galactosidaseb | 577 | GH27 | |
| bhn_I0082 | | glycoside hydrolase family 27b | 442 | GH27 | |
| bhn_I1952 | | polygalacturonaseb | 531 | GH28 | |
| bhn_I2679 | | polygalacturonaseb | 519 | GH28 | |
| bhn_I1087 | | α-L-fucosidaseb | 475 | GH29 | |
| bhn_I1581 | | glycoside hydrolase family 31b | 756 | GH31 | |
| bhn_I2191 | | glycoside hydrolase family 31b | 674 | GH31 | |
| bhn_I0283 | | glycoside hydrolase family 31b | 635 | GH31 | |
| bhn_I0582 | | sucrose-6-phosphate hydrolaseb | 493 | GH32 | |
| bhn_I0826 | | β-galactosidaseb | 622 | GH35 | |
| bhn_I1817 | | β-galactosidaseb | 735 | GH35 | |
| bhn_I0644 | | α-galactosidaseb | 782 | GH36 | |
| bhn_I1583 | | α-galactosidaseb | 620 | GH36 | |
| bhn_I1945 | | α-galactosidaseb | 730 | GH36 | |
| bhn_I0086 | | α-mannosidaseb | 1053 | GH38 | |
| bhn_III010 | | β-galactosidaseb | 673 | GH42 | |
| bhn_I0981 | | xylosidase/arabinofuranosidaseb | 301 | GH43 | |
| bhn_I2037 | | xylosidase/arabinofuranosidaseb | 302 | GH43 | |
| bhn_I2111 | | xylosidase/arabinofuranosidaseb | 517 | GH43 | |
| bhn_I2735 | | xylosidase/arabinofuranosidaseb | 352 | GH43 | |
| bhn_I0032 | | xylosidase/arabinofuranosidaseb | 312 | GH43 | |
| bhn_I0164 | | xylosidase/arabinofuranosidase and esteraseb | 925 | GH43 | |
| bhn_I1509 | | α-L-arabinofuranosidaseb | 630 | GH51 | |
| bhn_I2228 | | α-L-arabinofuranosidaseb | 502 | GH51 | |
| bhn_I0010 | | α-L-arabinofuranosidaseb | 504 | GH51 | |
| bhn_I0183 | | α-D-glucuronidaseb | 662 | GH67 | |
| bhn_I0697 | | unsaturated glucuronyl hydrolaseb | 385 | GH88 | |
| bhn_I2381 | | unsaturated glucuronyl hydrolaseb | 383 | GH88 | |
| bhn_I2196 | | cellobiose phosphorylaseb | 814 | GH94 | |
| bhn_I1582 | | glycoside hydrolase family 95b | 734 | GH95 | |
| bhn_I2548 | | unsaturated rhamnogalacturonyl hydrolaseb | 349 | GH105 | |
| bhn_I0090 | | unsaturated rhamnogalacturonyl hydrolaseb | 363 | GH105 | |
| bhn_I2549 | | D-galactosyl-β-1-4-L-rhamnose phosphorylaseb | 722 | GH112 | |
| bhn_I0185 | | α-glucuronidaseb | 947 | GH115 | |
| bhn_I1083 | | xylosidaseb | 861 | GH120 | |
| bhn_I1738 | | xylosidaseb | 664 | GH120 | |
| bhn_I1244 | | acetyl-xylan esteraseb | 372 | CE2 | |
| bhn_III070 | | polysaccharide deacetylaseb | 207 | CE4 | |
| bhn_I0666 | | N-acetylglucosamine-6-phosphate deacetylaseb | 371 | CE9 | |
| bhn_I1609 | | carbohydrate esterase family 12b | 584 | CE12 | |
| bhn_I1927 | | carbohydrate esterase family 12b | 244 | CE12 | |
| bhn_I1926 | | polysaccharide lyaseb | 746 | PL11 | |
| bhn_I0657 | | glycogen phosphorylaseb | 769 | GT35 | |
| bhn_I2673 | | glycogen phosphorylaseb | 824 | GT35 | |
aCAZy descriptions and classifications compiled from the CAZy database [68]. bIndicates homologues in the B. hungatei JK615T draft genome. Genes encoding predicted secreted polysaccharide degrading enzymes are in bold
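CAZy family labels such as GH13 or CE12 combine an enzyme class prefix (GH, CE, PL, GT) with a family number, which is how per-class totals like those in the abstract are tallied. The sketch below shows that tally on a handful of family labels taken from the table above. It is a deliberately small subset, so the counts it prints are not the genome-wide totals; the function name `cazy_class` is illustrative.

```python
import re
from collections import Counter


def cazy_class(family):
    """Return the class prefix of a CAZy family label, e.g. 'GH13' -> 'GH'."""
    match = re.match(r"[A-Z]+", family)
    if match is None:
        raise ValueError(f"not a CAZy family label: {family!r}")
    return match.group(0)


# A few family labels from the table (subset only, for illustration).
families = ["GH2", "GH3", "GH13", "GH43", "CE2", "CE12", "PL11", "GT35"]
counts = Counter(cazy_class(f) for f in families)
print(dict(counts))  # {'GH': 4, 'CE': 2, 'PL': 1, 'GT': 1}
```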
Comparison of MB2003, JK615T and B316T protein coding gene percentages to COG functional categories
| Code | MB2003 (% of totala) | JK615T (% of totala) | B316T (% of totala) | Description |
|---|---|---|---|---|
| J | 9.52 | 9.33 | 8.96 | Translation |
| A | | | | RNA processing and modification |
| K | 7.31 | 7.59 | 7.30 | Transcription |
| L | 4.32 | 4.59 | 4.63 | Replication, recombination and repair |
| B | | | | Chromatin structure and dynamics |
| D | 1.57 | 1.60 | 1.44 | Cell cycle control, mitosis and meiosis |
| V | 3.19 | 2.90 | 3.19 | Defense mechanisms |
| T | 6.82 | 6.72 | 7.47 | Signal transduction mechanisms |
| M | 7.61 | 7.45 | 8.52 | Cell wall/membrane biogenesis |
| N | 2.99 | 3.29 | 2.75 | Cell motility |
| U | 1.08 | 1.26 | 1.14 | Intracellular trafficking and secretion |
| O | 3.83 | 3.63 | 3.89 | Posttranslational modification, protein turnover, chaperones |
| C | 3.39 | 3.63 | 3.72 | Energy production and conversion |
| G | 11.93 | 11.99 | 12.15 | Carbohydrate transport and metabolism |
| E | 8.69 | 8.85 | 7.91 | Amino acid transport and metabolism |
| F | 3.93 | 3.82 | 3.98 | Nucleotide transport and metabolism |
| H | 3.88 | 3.77 | 3.23 | Coenzyme transport and metabolism |
| I | 3.53 | 3.19 | 2.80 | Lipid transport and metabolism |
| P | 3.88 | 4.06 | 2.75 | Inorganic ion transport and metabolism |
| Q | 0.79 | 0.68 | 0.79 | Secondary metabolites biosynthesis, transport and catabolism |
| R | 7.76 | 6.91 | 7.43 | General function prediction only |
| S | 3.58 | 3.53 | 4.11 | Function unknown |
| – | 40.33 | 39.83 | 46.51 | Not in COGs |
aThe percentage is based on the total number of protein coding genes in the genome
Fig. 6 Genome synteny analysis. Alignment of the B. hungatei MB2003 genome against the draft genome of B. hungatei JK615T (a) and the complete genome of B. proteoclasticus B316T (b). Whenever the two sequences agree, a colored line or dot is plotted. If the two sequences were perfectly identical, a single line would go from the bottom left to the top right. Units displayed in base-pairs. Color codes: blue, forward sequence, red, reverse sequence
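The idea behind such a dot plot can be illustrated with exact k-mer matches: a forward match at positions (i, j) would be drawn in one colour, and a match to the reverse complement in another. The sketch below is a toy version under that assumption; it is not the alignment tool used to produce the figure, and the function names are illustrative.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string containing only A, C, G, T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def dot_plot_matches(seq_a, seq_b, k):
    """Return (forward, reverse) lists of (i, j) positions where k-mers agree.

    forward: seq_a[i:i+k] equals seq_b[j:j+k]
    reverse: the reverse complement of seq_a[i:i+k] equals seq_b[j:j+k]
    """
    # Index every k-mer of seq_b by its start positions.
    index = {}
    for j in range(len(seq_b) - k + 1):
        index.setdefault(seq_b[j:j + k], []).append(j)

    forward, reverse = [], []
    for i in range(len(seq_a) - k + 1):
        kmer = seq_a[i:i + k]
        forward.extend((i, j) for j in index.get(kmer, []))
        reverse.extend((i, j) for j in index.get(reverse_complement(kmer), []))
    return forward, reverse


toy = "AAACCCGTGA"  # toy sequence, not genome data
# Identical sequences give a single forward diagonal.
print(dot_plot_matches(toy, toy, 4))
# A reverse-complemented copy gives an anti-diagonal of reverse matches.
print(dot_plot_matches(toy, reverse_complement(toy), 4))
```

With identical inputs every forward match lies on the main diagonal, which corresponds to the single bottom-left to top-right line described in the caption.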