Luen-Luen Li, Sean R McCorkle, Sebastien Monchy, Safiyh Taghavi, Daniel van der Lelie.
Abstract
Over immeasurable time, microorganisms have evolved and accumulated remarkable physiological and functional heterogeneity, and they now constitute the major reserve of genetic diversity on earth. Metagenomics, the analysis of genetic material recovered directly from environmental samples, gives access to this diversity without the need to cultivate cells. Accordingly, microbial communities and their metagenomes, isolated from biotopes with high turnover rates of recalcitrant biomass such as lignocellulosic plant cell walls, have become a major resource for bioprospecting and a major asset in the search for new biocatalysts (enzymes) for various industrial processes, including the production of biofuels from plant feedstocks. However, although metagenomics technologies have contributed to the discovery of novel enzymes, this relatively new enterprise still requires major improvements. In this review, we compare function-based metagenome screening and sequence-based metagenome data mining, discussing the advantages and limitations of both methods. We also describe the unusual enzymes discovered via metagenomics approaches, and discuss the future prospects for metagenome technologies.
Year: 2009 PMID: 19450243 PMCID: PMC2694162 DOI: 10.1186/1754-6834-2-10
Source DB: PubMed Journal: Biotechnol Biofuels ISSN: 1754-6834 Impact factor: 6.040
Plant biomass-degrading enzymes recently identified through metagenomic approaches (screening of metagenome libraries for enzyme activity)
| Enzyme name | Metagenome DNA source | Library vector | Insert size | Number of clones screened | Positive clones | Reference |
| Agarase | Soil from an unplanted field | Cosmid | 25–40 kb | 1,523 | 12 clones (belonging to six genes) | [ |
| Amylase | Environmental (US patent number 5,958,672) | Lambda | | 50,000 | 15 clones (belonging to three enzymes) | [ |
| Amylase | Soil from an unplanted field | Cosmid | 25–40 kb | 1,523 | 1 clone | [ |
| Amylase | Soil from the junction of the groundwater table | Plasmid | 2–7 kb | 30,000 | 1 clone | [ |
| Amylase | Soil and compost from the surface layer of a private garden | Plasmid | 1.4–6.5 kb | 31,967 | 38 clones | [ |
| Cellulase | Various lake water samples from East Africa | Lambda | 2–10 kb | 114,000 | 4 clones | [ |
| Cellulase | Soil from an unplanted field | Cosmid | 25–40 kb | 1,523 | 1 clone | [ |
| Cellulase | Soda lake sediments from Wadi el Natrun, Egypt | Lambda | 2.0–5.5 kb | 35,000 | 1 clone | [ |
| Cellulase | A soda lake (Wadi el Natrun, Egypt) alkaline microcrystalline cellulose medium enrichment | Lambda | 2–6 kb | 37,000 | 1 clone | [ |
| Cellulase | Rabbit cecum contents | Cosmid | 22–47 kb | 32,500 | 11 clones (representing six genes) | [ |
| Chitinase | Coastal seawater outside the Delaware Bay | Lambda | 1.8–4.2 kb | 75,000 | 2 clones | [ |
| Chitinase | Estuarine water inside the Delaware Bay | Lambda | 5.0–6.1 kb | 75,000 | 9 clones | [ |
| Cyclodextrinase | Bovine rumen microflora | Lambda | Average 5.5 kb | 14,000 | 1 clone | [ |
| Endo-β-1,4-glucanase | Bovine rumen microflora | Lambda | Average 5.5 kb | 14,000 | 9 clones | [ |
| Esterase | Various lake water samples from East Africa | Lambda | 2–10 kb | 130,000 | 2 clones | [ |
| Esterase | Bovine rumen microflora | Lambda | Average 5.5 kb | 14,000 | 12 clones | [ |
| Esterase | Crude oil springs contaminated soil | Cosmid | 25–40 kb | 2,500 | 1 clone | [ |
| Esterase | Biofilms growing within a drinking water network | Cosmid | 25–40 kb | 1,600 | 1 clone | [ |
| Esterase | Pools of various environmental soils | Fosmid | 30–40 kb | 60,000 | 1 clone | [ |
| Pectate lyase | Soil from an unplanted field | Cosmid | 25–40 kb | 1,523 | 2 clones | [ |
| Xylanase | Insect gut (insects collected from various locations) | Lambda | 3–6 kb | 1,000,000 | 4 clones | [ |
| Xylanase | Manure waste water from a dairy farm | Lambda | 4–10 kb | 5,000,000 | 1 clone | [ |
| 1,4-α-glucan branching enzyme | Soil from an unplanted field | Cosmid | 25–40 kb | 1,523 | 1 clone | [ |
kb = kilobase
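One way to read the table above is to ask how many clones were screened for each positive clone. The following is a minimal sketch of that arithmetic, not a calculation from the review itself: the function name `clones_per_hit` and the choice of rows are assumptions made for illustration, while the counts are copied from the table.

```python
# Rows copied from the table above: (description, clones screened, positive clones).
# The selection of rows is arbitrary and only meant to illustrate the calculation.
SCREENS = [
    ("Agarase (soil, cosmid)", 1_523, 12),
    ("Cellulase (rabbit cecum, cosmid)", 32_500, 11),
    ("Xylanase (insect gut, lambda)", 1_000_000, 4),
    ("Xylanase (dairy manure waste water, lambda)", 5_000_000, 1),
]


def clones_per_hit(screened: int, positives: int) -> float:
    """Average number of clones screened for each positive clone (hypothetical helper)."""
    if positives <= 0:
        raise ValueError("positives must be greater than zero")
    return screened / positives


for name, screened, positives in SCREENS:
    ratio = clones_per_hit(screened, positives)
    print(f"{name}: about 1 positive per {ratio:,.0f} clones screened")
```

Such ratios are only a rough guide, because the libraries differ in vector, insert size, and the activity assay used.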
Glycosyl hydrolase homologues found in metagenome samples
| Metagenome source (a) | Genome size (bp) | Gene count (b) | Glycosyl hydrolase matches (c) | Glycosyl hydrolase matches/total genes (%) (d) | Matches/million base pairs (d) |
| Marine archaeal anaerobic methane oxidation community (methane oxidation, sulfate reducer) [ | 2,116,255 | 2,332 | 3 | 0.13 | 1.42 |
| Acid mine drainage (acidic, metal tolerance, pink biofilm) [ | 10,830,886 | 12,559 | 73 | 0.58 | 6.74 |
| Human gut community (gut microbiome of human) [ | 36,304,498 | 46,503 | 705 | 1.52 | 19.42 |
| Hypersaline mat (marine microbial communities) [ | 84,253,870 | 135,922 | 786 | 0.58 | 9.33 |
| Lake Washington formaldehyde enrichment (13C-labeled formaldehyde; 13C-labeled DNA isolated by CsCl purification) [ | 57,622,063 | 89,729 | 397 | 0.44 | 6.89 |
| Lake Washington formate enrichment (13C-labeled formate; 13C-labeled DNA isolated by CsCl purification) [ | 17,570,569 | 28,700 | 114 | 0.40 | 6.49 |
| Lake Washington methane enrichment (13C-labeled methane; 13C-labeled DNA isolated by CsCl purification) [ | 52,164,993 | 81,076 | 428 | 0.53 | 8.20 |
| Lake Washington methanol enrichment (13C-labeled methanol; 13C-labeled DNA isolated by CsCl purification) [ | 50,245,961 | 77,229 | 373 | 0.49 | 7.42 |
| Lake Washington methylamine enrichment (13C-labeled methylamine; 13C-labeled DNA isolated by CsCl purification) [ | 37,225,208 | 54,340 | 285 | 0.52 | 7.66 |
| Mouse gut community (lean mouse) [ | 6,511,633 | 8,510 | 119 | 1.40 | 18.27 |
| Mouse gut community (obese mouse) [ | 4,200,364 | 5,382 | 58 | 1.08 | 13.81 |
| | 19,918,898 | 15,092 | 41 | 0.27 | 2.06 |
| | 9,964,793 | 6,026 | 16 | 0.27 | 1.61 |
| Singapore air sample [ | 75,598,288 | 91,635 | 514 | 0.56 | 6.80 |
| Sludge Australian Phrap assembly (phosphate removal) [ | 53,048,954 | 30,590 | 177 | 0.58 | 3.34 |
| Sludge US Jazz assembly (phosphate removal) [ | 41,128,538 | 16,840 | 126 | 0.75 | 3.06 |
| Sludge US Phrap assembly (phosphate removal) [ | 56,608,360 | 34,254 | 260 | 0.76 | 4.59 |
| Soil diversa silage (farm silage surface soil) [ | 152,406,385 | 184,374 | 1,078 | 0.58 | 7.07 |
| TM7 (human oral microflora) [ | 3,451,819 | 3,908 | 14 | 0.36 | 4.06 |
| Termite gut (cellulolytic, cellulose degrader, lignin degrader, symbiont) [ | 61,992,778 | 83,225 | 1,267 | 1.52 | 20.44 |
| Uranium-contaminated groundwater (acidophile) | 9,554,544 | 12,335 | 71 | 0.58 | 7.43 |
| Whalefall sample (barophile) [ | 94,937,484 | 122,145 | 433 | 0.35 | 4.56 |
| Total (or average for glycosyl hydrolase matches/total genes and matches/million base pairs) | 937,657,141 | 1,142,706 | 7,338 | 0.64 | 7.83 |
(a) Data source: IMG/M [18]; (b) only protein-coding sequences were included (no RNA genes); (c) translated sequences from all 43 environmental metagenome projects were BLAST-searched against the CAZy sequences for homologues of glycosyl hydrolases, using an e-value < 10^-40 as the cut-off threshold; (d) glycosyl hydrolase matches/total genes and matches per million base pairs provide an indication of the relative abundance of glycosyl hydrolases in the microbial community.
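The two relative-abundance columns can be recomputed from the genome size, gene count, and number of matches in each row. The sketch below shows the arithmetic under the assumption that both values are simply rounded to two decimals; the function name `gh_abundance` is hypothetical, and the input numbers are taken from the termite gut and marine archaeal rows of the table.

```python
def gh_abundance(genome_bp: int, gene_count: int, gh_matches: int) -> tuple[float, float]:
    """Return (matches as % of total genes, matches per million base pairs).

    Hypothetical helper; rounding to two decimals is an assumption chosen
    to mirror the precision shown in the table.
    """
    percent_of_genes = 100.0 * gh_matches / gene_count
    matches_per_mbp = gh_matches / (genome_bp / 1_000_000)
    return round(percent_of_genes, 2), round(matches_per_mbp, 2)


# Termite gut row: 61,992,778 bp, 83,225 genes, 1,267 matches.
print(gh_abundance(61_992_778, 83_225, 1_267))  # (1.52, 20.44)

# Marine archaeal anaerobic methane oxidation row: 2,116,255 bp, 2,332 genes, 3 matches.
print(gh_abundance(2_116_255, 2_332, 3))  # (0.13, 1.42)
```

Normalizing by gene count and by sequence length gives two views of the same signal, which helps when metagenomes differ widely in size.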
Most abundant glycosyl hydrolase families found in different metagenome samples
| Metagenome source (a) | Glycosyl hydrolase matches (b) | Most abundant glycosyl hydrolase family (c) |
| Marine archaeal anaerobic methane oxidation community (methane oxidation, sulfate reducer) [ | 3 | GH16 (33%), GH2 (33%), GH38 (33%) |
| Acid mine drainage (acidic, metal tolerance, pink biofilm) [ | 73 | GH13 (26%), GH15 (21%), GH57 (12%), GH28 (9%), GH18 (7%) |
| Human gut community (gut microbiome of human) [ | 705 | GH13 (20%), GH3 (11%), GH2 (9%), GH1 (7%), GH31 (5%) |
| Hypersaline mat (marine microbial communities) [ | 786 | GH13 (24%), GH2 (9%), GH3 (7%), GH65 (4%), GH57 (4%) |
| Lake Washington formaldehyde enrichment (13C-labeled formaldehyde; 13C-labeled DNA isolated by CsCl purification) [ | 397 | GH13 (15%), GH23 (9%), GH3 (8%), GH2 (8%), GH94 (8%) |
| Lake Washington formate enrichment (13C-labeled formate; 13C-labeled DNA isolated by CsCl purification) [ | 114 | GH13 (28%), GH23 (10%), GH2 (7%), GH94 (6%), GH28 (4%), GH8 (4%) |
| Lake Washington methane enrichment (13C-labeled methane; 13C-labeled DNA isolated by CsCl purification) [ | 428 | GH13 (17%), GH94 (10%), GH23 (8%), GH3 (8%), GH57 (5%) |
| Lake Washington methanol enrichment (13C-labeled methanol; 13C-labeled DNA isolated by CsCl purification) [ | 373 | GH13 (16%), GH23 (9%), GH3 (8%), GH94 (8%), GH2 (5%) |
| Lake Washington methylamine enrichment (13C-labeled methylamine; 13C-labeled DNA isolated by CsCl purification) [ | 285 | GH23 (20%), GH13 (13%), GH57 (8%), GH17 (7%), GH3 (6%) |
| Mouse gut community (lean mouse) [ | 119 | GH3 (10%), GH43 (9%), GH94 (9%), GH2 (8%), GH13 (8%) |
| Mouse gut community (obese mouse) [ | 58 | GH68 (12%), GH13 (10%), GH43 (10%), GH94 (10%), GH2 (10%), GH3 (10%) |
| | 41 | GH23 (41%), GH13 (12%), GH57 (7%), GH2 (5%), GH3 (5%), GH5 (5%), GH28 (5%), GH43 (5%) |
| | 16 | GH23 (31%), GH13 (19%), GH2 (6%), GH3 (6%), GH28 (6%), GH31 (6%), GH57 (6%), GH73 (6%), GH77 (6%), GH103 (6%) |
| Singapore air sample [ | 514 | GH13 (13%), GH3 (9%), GH23 (9%), GH15 (8%), GH28 (5%) |
| Sludge Australian Phrap assembly (phosphate removal) [ | 177 | GH13 (28%), GH23 (15%), GH16 (8%), GH103 (7%), GH3 (7%) |
| Sludge US Jazz assembly (phosphate removal) [ | 126 | GH13 (20%), GH23 (12%), GH3 (9%), GH16 (7%), GH2 (6%) |
| Sludge US Phrap assembly (phosphate removal) [ | 260 | GH13 (17%), GH23 (13%), GH3 (8%), GH16 (7%), GH94 (7%) |
| Soil diversa silage (farm silage surface soil) [ | 1,078 | GH13 (22%), GH3 (9%), GH94 (8%), GH43 (5%), GH15 (5%), GH2 (5%) |
| TM7 (human oral microflora) [ | 14 | GH13 (29%), GH1 (21%), GH57 (21%), GH4 (7%), GH25 (7%), GH28 (7%), GH73 (7%) |
| Termite gut (cellulolytic, cellulose degrader, lignin degrader, symbiont) [ | 1,267 | GH13 (12%), GH94 (12%), GH5 (9%), GH3 (8%), GH2 (6%) |
| Uranium-contaminated groundwater (acidophile) | 71 | GH23 (18%), GH13 (10%), GH94 (7%), GH17 (6%), GH28 (6%), GH3 (6%) |
| Whalefall sample (barophile) [ | 433 | GH23 (17%), GH13 (13%), GH3 (12%), GH2 (6%), GH103 (5%) |
(a) Data source: IMG/M [18]; (b) the total number of glycosyl hydrolase matches in each metagenome is given; translated sequences from all 43 environmental metagenome projects were BLAST-searched against the CAZy sequences for homologues of glycosyl hydrolases, using an e-value < 10^-40 as the cut-off threshold; (c) the five most abundant glycosyl hydrolase families are listed. GHX is short for glycosyl hydrolase family X; the percentage for each glycosyl hydrolase family is given in parentheses.
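The tallying step described in the footnote (apply the e-value cut-off, then rank families by share of matches) might look roughly like the sketch below. It is an illustration under stated assumptions rather than the authors' pipeline: the hit format `(query_id, gh_family, e_value)`, the function name `top_families`, the rule of keeping only the best hit per query, and the example hits are all hypothetical. Only the 10^-40 threshold comes from the footnote.

```python
from collections import Counter

E_VALUE_CUTOFF = 1e-40  # threshold stated in the table footnote


def top_families(hits, n=5, cutoff=E_VALUE_CUTOFF):
    """Count glycosyl hydrolase families among hits that pass the cut-off.

    hits: iterable of (query_id, gh_family, e_value) tuples (assumed format).
    Returns (number of matching genes, [(family, whole-number percent), ...]).
    """
    best = {}  # query_id -> (family, e_value) of the lowest e-value hit
    for query_id, family, e_value in hits:
        if e_value >= cutoff:
            continue  # hit is too weak to count as a match
        if query_id not in best or e_value < best[query_id][1]:
            best[query_id] = (family, e_value)

    counts = Counter(family for family, _ in best.values())
    total = sum(counts.values())
    ranked = [(fam, round(100 * c / total)) for fam, c in counts.most_common(n)]
    return total, ranked


# Made-up hits for illustration only; they are not data from the review.
example_hits = [
    ("gene_1", "GH16", 1e-55),
    ("gene_2", "GH2", 3e-48),
    ("gene_2", "GH2", 1e-60),   # second hit for the same gene, counted once
    ("gene_3", "GH38", 2e-41),
    ("gene_4", "GH13", 5e-12),  # fails the cut-off and is ignored
]

total, families = top_families(example_hits)
print(total, families)  # 3 [('GH16', 33), ('GH2', 33), ('GH38', 33)]
```

Note that `most_common(n)` truncates at exactly `n` families, whereas several rows in the table list more than five entries where percentages tie; handling ties that way would need an extra step.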