
REPK: an analytical web server to select restriction endonucleases for terminal restriction fragment length polymorphism analysis.

Abstract

Terminal restriction fragment length polymorphism (T-RFLP) analysis is a widespread technique for rapidly fingerprinting microbial communities. Users of T-RFLP frequently overlook the resolving power of well-chosen restriction endonucleases and often fail to report how they chose their enzymes. REPK (Restriction Endonuclease Picker) assists in the rational choice of restriction endonucleases for T-RFLP by finding sets of four restriction endonucleases that together uniquely differentiate user-designated sequence groups. With REPK, users can provide their own sequences (of any gene, not just 16S rRNA), specify the taxonomic rank of interest and choose from a number of filtering options to further narrow down the enzyme selection. Bug tracking is provided, and the source code is open and accessible under the GNU Public License v.2, at http://code.google.com/p/repk. The web server is available without access restrictions at http://rocaplab.ocean.washington.edu/tools/repk.


Year: 2007 PMID: 17631616 PMCID: PMC1933217 DOI: 10.1093/nar/gkm384

Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971

INTRODUCTION

Terminal restriction fragment length polymorphism (T-RFLP) analysis is a microbial fingerprinting technique capable of discriminating microbial communities quickly and relatively inexpensively (1–3). T-RFLP is increasingly used in high-throughput studies of microbial communities in combination with, or even in lieu of, clone library analysis (4,5). Briefly, the method involves PCR amplification of a gene of interest (often 16S rRNA genes) with fluorescent dye-labeled primers, followed by multiple single restriction digests done in parallel. The resulting fragments are then separated by capillary electrophoresis with an internal size standard to determine the lengths of the terminal (fluorescently labeled) fragments. Each distinct terminal restriction fragment is considered an operational taxonomic unit (OTU), so the choice of restriction enzymes can affect the number of OTUs observed in each sample and the calculation of diversity statistics. When analyzing uncharacterized and very diverse bacterial communities, sufficient community discrimination can often be accomplished with multiple randomly chosen tetrameric restriction enzymes (6). However, a brief review of the literature indicates that there is still no standard in even this simplified case. We examined 26 papers (1–5,7–26) that were published between 1997 and 2007 and used T-RFLP. Of those papers, 38% (10 papers) used universal bacterial primers combined with a single restriction enzyme, but the choice of enzyme was not consistent. MspI was used most frequently (four studies), followed by TaqI (two studies), and one study each used AluI, CfoI, HhaI and HaeIII. Overall, only three of the 26 papers included a rationale for their enzyme selection (1,2,17). An alternate approach to T-RFLP can be taken if the microbial community has been characterized (by clone library analysis or by prediction from previous studies) or if a particular taxonomic group is being targeted with specific primers.
In this case, a more reasoned choice of restriction enzymes can be made. In particular, specific species or microbial taxa of interest to the researcher (especially closely related taxa that may share some restriction sites) can often be differentiated if the proper restriction enzymes are selected. There are, however, few resources available to narrow down the selection process. Over 600 Type II restriction enzymes are commercially available, accounting for 262 distinct specificities (27). Existing computer programs for assisting in the choice of restriction enzymes include TAP-TRFLP (28), MiCA Enzyme Resolving Power Analysis (http://mica.ibest.uidaho.edu) and TRF-CUT (29). These programs perform in silico restriction digestions of a predefined sequence database or user-provided sequences, but the results must still be examined manually to determine which enzymes are best suited to discriminate that set of sequences. CLEAVER (30), a stand-alone program, provides the above features as well as the ability to assign sequences to taxonomic groups at multiple levels and to search for enzymes that cut one group but not another. However, it is limited to comparing only two groups at once. Restriction Endonuclease Picker (REPK) addresses this gap by finding enzymes that are able to discriminate an unlimited number of user-designated sequence groups on the basis of their terminal restriction fragment lengths. If no single enzyme can discriminate all groups, REPK reports sets of four restriction enzymes that together are able to differentiate the groups of interest. An important component of REPK is the ability to specify the taxonomic rank of the sequences to be differentiated, which is particularly useful when a diverse microbial community has been characterized by clone library analysis or when there is an existing database of several subgroups of sequences that amplify with the same specific primers.

SITE USAGE AND EXAMPLES

A complete manual and example input files are provided on the REPK website (http://rocaplab.ocean.washington.edu/tools/repk). The example shown in Figure 1 was prepared using REPK v. 1.0, with the following operating parameters (also the defaults): example sequence file (alignment5.txt), all commercially available Type IIP enzymes (REBASE Version 704), taxonomic rank = 1, cut-off = 5, min. fragment length = 75, max. fragment length = 900, stringency = ‘automatic’, max. missing groups = 0, max. matches returned = 100.

Figure 1. Schematic summarizing the processing steps performed by REPK using program options detailed in the text, as well as subsets of example input and output files.

User input

The user must provide a trimmed FASTA-formatted file with nucleotide sequences beginning at the 5′-end of the labeled primer used for PCR amplification and ending at the 5′-end of the unlabeled primer. Sequence groups can be designated in the description line of the FASTA file by using a delimiter to separate taxonomic rank terms; optionally, taxonomic identifications can be prepended to the description line using an output file from RDP-Classifier (31). Figure 1A shows a subset of the example sequence file provided on the website, alignment5.txt. In this file the taxonomic rank terms are separated by a single underscore, and in this example ‘taxonomic rank 1’ was chosen, corresponding to the genus of these Archaea. A selectable list of commercially available enzymes from the latest REBASE database (27) is available and is automatically updated on the first day of each month. The enzymes available for selection include primarily Type IIP enzymes, which have symmetric recognition sequences and cleavage sites. Restriction enzymes of Type IIA (having asymmetric recognition sequences) and Type IIB (cleaving both sides of the recognition sequence on both strands) are not currently supported by REPK, although some are included in a separate enzyme file for advanced users willing to perform some manual processing. Users should be aware that some enzymes in the REBASE database may not be suitable for T-RFLP because of methylation specificities or because they require multiple restriction sites to be present for effective digestion. Finally, users can define their own custom enzymes if they are not included in the standard list. The default (all standard enzymes) was used for the example in Figure 1. For computational efficiency, isoschizomers are grouped by cleavage site.

The final output is refined by setting several options. Some of these (the minimum and maximum allowable fragment lengths, and the maximum difference in size between two fragments that will still be considered the ‘same’ fragment) depend on the specifications and resolving power of the particular capillary electrophoresis system. Users can also set the minimum threshold for the number of groups each enzyme must be able to discriminate on its own (the enzyme stringency), and the number of groups allowed to remain undifferentiated in the case that no ‘perfect’ enzyme groups are discovered.
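To illustrate how sequence groups might be read from FASTA description lines, the following minimal Python sketch splits each description on a delimiter and takes the term at the chosen taxonomic rank. The function names, the header layout (genus first, followed by an arbitrary clone label) and the toy sequences are assumptions made for this example; they are not taken from the REPK source code or from alignment5.txt.

```python
def group_label(description, rank=1, delimiter="_"):
    """Return the term at 1-based `rank` from a FASTA description line.

    Assumption: rank 1 is the first delimiter-separated term.
    """
    terms = description.lstrip(">").strip().split(delimiter)
    if not 1 <= rank <= len(terms):
        raise ValueError(f"rank {rank} not present in {description!r}")
    return terms[rank - 1]


def read_groups(fasta_text, rank=1, delimiter="_"):
    """Map each group label to the list of sequences assigned to it."""
    groups = {}
    label = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: open an empty sequence under its group.
            label = group_label(line, rank, delimiter)
            groups.setdefault(label, []).append("")
        elif label is not None:
            # Sequence data may span several lines; join them.
            groups[label][-1] += line.upper()
    return groups


# Hypothetical headers and toy sequences, for illustration only.
example = """>Sulfolobus_cloneA
ACGTACGT
ACGT
>Sulfolobus_cloneB
TTGACC
>Thermofilum_cloneC
GGCCAA
"""

groups = read_groups(example, rank=1)
```

With rank 1 the two Sulfolobus records fall into one group; choosing rank 2 in this toy file would instead treat every clone label as its own group.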

Program operations

Sequences are first digested in both orientations by all selected enzymes to find the shortest labeled restriction fragment; these lengths are output as a table (and a downloadable tab-delimited text file, fragfile.csv), a subset of which is shown in Figure 1B. In this example, the sequences were cut by every enzyme except AasI, which resulted in full-length fragments. Next, all terminal fragment lengths are binned within the chosen cut-off (here 5 bp) and a binary matrix of pairwise group differentiations is created. Bins containing a single sequence group yield a ‘1’, while bins containing more than one sequence group yield a ‘0’, indicating no differentiation between those groups. In the example in Figure 1, BanII failed to distinguish between sequence groups Sulfurisphaera and Thermofilum because the difference between their fragment lengths (1 bp) was less than the chosen cutoff of 5 bp (Figure 1B). However, AspLEI did distinguish between those groups because the difference in fragment lengths was 188 bp. It is not necessary for sequences from the same sequence group to have similar fragment lengths (e.g. Sulfolobus). Fragment lengths outside the boundaries set by the minimum and maximum fragment length options are binned together without regard for their actual lengths, decreasing the number of sequence groups discriminated by those enzymes (e.g. BmiI). The enzyme stringency filter is then applied to this matrix, allowing only enzymes that discriminate at least the specified fraction of sequence groups to proceed. The passing enzymes are output as a table (and a downloadable tab-delimited text file, enzmatrix.csv), a subset of which is shown in Figure 1C. For computational efficiency, the enzymes are then sorted into ‘enzyme bins’ that produce identical differentiation patterns, although they may not produce the same terminal fragment lengths. 
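The binning and stringency steps described above can be sketched in Python as follows. This is a simplified illustration, not the REPK implementation: it compares fragment lengths pairwise rather than building explicit bins, it treats a difference strictly smaller than the cut-off as the ‘same’ fragment (the behavior at exactly the cut-off is an assumption), and all function names and fragment lengths are invented for the example. Only the 1 bp and 188 bp differences mirror the BanII and AspLEI cases in the text.

```python
OUT_OF_RANGE = None  # single shared bin for fragments outside the size window


def clamp(length, min_len=75, max_len=900):
    """Collapse lengths outside [min_len, max_len] into one shared bin."""
    return length if min_len <= length <= max_len else OUT_OF_RANGE


def same_fragment(a, b, cutoff=5):
    """True if two (clamped) lengths would be considered the same fragment."""
    if a is OUT_OF_RANGE or b is OUT_OF_RANGE:
        return a is b
    return abs(a - b) < cutoff


def differentiation_vector(frags_by_group, cutoff=5, min_len=75, max_len=900):
    """Return {group: 1 or 0} for one enzyme.

    A group scores 1 only if none of its fragments coincides with a
    fragment from any other group.
    """
    clamped = {
        group: [clamp(x, min_len, max_len) for x in lengths]
        for group, lengths in frags_by_group.items()
    }
    vector = {}
    for group, lengths in clamped.items():
        collides = any(
            same_fragment(a, b, cutoff)
            for other, other_lengths in clamped.items()
            if other != group
            for a in lengths
            for b in other_lengths
        )
        vector[group] = 0 if collides else 1
    return vector


def passes_stringency(vector, stringency):
    """Keep an enzyme only if it discriminates at least this fraction of groups."""
    return sum(vector.values()) / len(vector) >= stringency


# Invented terminal fragment lengths (bp) per enzyme and sequence group.
fragments = {
    "BanII": {"Sulfurisphaera": [301], "Thermofilum": [302], "Sulfolobus": [120, 450]},
    "AspLEI": {"Sulfurisphaera": [212], "Thermofilum": [400], "Sulfolobus": [95, 610]},
}

vectors = {enzyme: differentiation_vector(f) for enzyme, f in fragments.items()}
```

In this toy data the Sulfolobus group has two dissimilar fragment lengths per enzyme and is still scored as discriminated, because members of one group are never compared with each other.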
In the Figure 1 example, neoschizomers AspLEI and GlaI produce different fragment lengths but the same differentiation pattern, so they were grouped together for the final analysis. It is important to note that the enzyme bins depend on the particular sequence file and taxonomic rank selected for the analysis. That is, two enzymes may have equal discriminatory power for one set of sequence groups and so share a bin, whereas for a different set of sequences one enzyme may be much better and the two would be placed in separate bins. Finally, groups of four enzymes (a ‘set’) are logically summed (e.g. 101 + 011 = 111) to determine the coverage of the set, i.e. the number of sequence groups discriminated by at least one enzyme in the set. If this number is at least the total number of sequence groups less the max. missing groups (here 0), then the set is saved. A score is calculated for each saved set and all saved sets are sorted before the highest-scoring sets are output to a text file, finalout.txt, a subset of which is shown in Figure 1D. If more than 10 000 sets are found and the enzyme stringency is set to ‘automatic’, the stringency is incremented by 10% (decreasing the number of passing enzymes and thus enzyme sets) and the analysis is repeated. The final output reports and summarizes those enzyme sets that best discriminated the sequence groups. It consists of three parts: ‘successful enzyme sets’, ‘enzyme picker key’ and ‘quick overview’. The successful enzyme sets (Figure 1D.1) consist of a list of enzyme groups in each set, and a score indicating the frequency with which each set discriminated the sequence groups. A perfect enzyme (one that discriminates 100% of the sequence groups) contributes a score of 1, so four perfect enzymes would produce the maximum score of 4. The enzyme picker key (Figure 1D.2) lists the members of each enzyme group, with neoschizomers separated by brackets.
Each member of an enzyme group produces the same sequence group differentiation pattern but may differ in recognition site, terminal fragment lengths, etc. The quick overview (Figure 1D.3) histogram summarizes the frequency with which each enzyme group appears in the printed results. After submission the program generally takes less than 1 min to complete, depending most heavily on, in decreasing order of importance, the number of sequence groups, the number of enzymes selected and the server load. The final choice of restriction enzymes is left to the researcher, and is likely to be based on practical factors such as cost, availability, reaction conditions, methylation sensitivity or requirements, star activity and other specifics that are detailed at REBASE. An online manual detailing usage and options, bug tracking and the source code (open and accessible under the GNU Public License v.2) are available at http://code.google.com/p/repk.
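The grouping of enzymes by differentiation pattern and the logical summation of enzyme sets can be illustrated with a short Python sketch. It rests on several assumptions: the enzyme names and patterns are invented, a set is saved when its coverage is at least the number of sequence groups minus the allowed missing groups, and the score is taken to be the sum of the fractions of groups that each member discriminates (so a perfect enzyme contributes 1). The automatic stringency adjustment is not modeled, and none of the names come from the REPK source code.

```python
from itertools import combinations


def group_by_pattern(patterns):
    """Collect enzymes with identical differentiation patterns into one bin."""
    bins = {}
    for enzyme, pattern in patterns.items():
        bins.setdefault(pattern, []).append(enzyme)
    return bins


def find_sets(patterns, set_size=4, max_missing=0):
    """Return (score, enzyme bins) for every saved set, best score first."""
    bins = group_by_pattern(patterns)
    if not bins:
        return []
    n_groups = len(next(iter(bins)))
    saved = []
    for combo in combinations(sorted(bins), set_size):
        # Logical sum (OR) across the set, e.g. 101 + 011 = 111.
        union = tuple(int(any(bits)) for bits in zip(*combo))
        coverage = sum(union)
        if coverage >= n_groups - max_missing:
            # Assumed scoring: each member adds its fraction of groups resolved.
            score = sum(sum(pattern) / n_groups for pattern in combo)
            saved.append((score, [bins[pattern] for pattern in combo]))
    saved.sort(key=lambda item: item[0], reverse=True)
    return saved


# Hypothetical enzymes; each tuple holds one 0/1 entry per sequence group.
patterns = {
    "enzA": (1, 0, 1),
    "enzB": (0, 1, 1),
    "enzC": (1, 0, 1),  # same pattern as enzA, so the two share a bin
    "enzD": (1, 0, 0),
    "enzE": (0, 0, 1),
    "enzF": (1, 1, 1),
}

sets_of_four = find_sets(patterns, set_size=4, max_missing=0)
pair = find_sets({"enzA": (1, 0, 1), "enzB": (0, 1, 1)}, set_size=2)
```

Working on one representative pattern per bin, rather than on every enzyme, keeps the number of combinations small; the members of each bin are only expanded again when the saved sets are reported.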

CONCLUSIONS

We found that researchers often failed to report their rationale in choosing a particular set of restriction enzymes for T-RFLP analysis, yet this choice is crucial for resolving the microbial community and interpreting the results. We provide REPK in the hope that it will allow microbial ecologists to maximize their ability to discriminate terminal restriction fragments obtained during T-RFLP and thereby take greater advantage of this powerful community fingerprinting technique.

30 in total

1. Structure and function of the methanogenic archaeal community in stable cellulose-degrading enrichment cultures at two different temperatures (15 and 30 degrees C).

Authors:
Journal: FEMS Microbiol Ecol Date: 1999-12-01 Impact factor: 4.194

2. Characterization of depth-related population variation in microbial communities of a coastal marine sediment using 16S rDNA-based approaches and quinone profiling.

Authors: H Urakawa; T Yoshida; M Nishimura; K Ohwada
Journal: Environ Microbiol Date: 2000-10 Impact factor: 5.491

3. Fidelity of select restriction endonucleases in determining microbial diversity by terminal-restriction fragment length polymorphism.

Authors: Jeff J Engebretson; Craig L Moyer
Journal: Appl Environ Microbiol Date: 2003-08 Impact factor: 4.792

4. Semi-automated genetic analyses of soil microbial communities: comparison of T-RFLP and RISA based on descriptive and discriminative statistical approaches.

Authors: Martin Hartmann; Beat Frey; Roland Kölliker; Franco Widmer
Journal: J Microbiol Methods Date: 2005-01-15 Impact factor: 2.363

5. Correlation of functional instability and community dynamics in denitrifying dispersed-growth reactors.

Authors: M E Gentile; C M Jessup; J L Nyman; C S Criddle
Journal: Appl Environ Microbiol Date: 2006-12-01 Impact factor: 4.792

6. Comparison of two fingerprinting techniques, terminal restriction fragment length polymorphism and automated ribosomal intergenic spacer analysis, for determination of bacterial diversity in aquatic environments.

Authors: R Danovaro; G M Luna; A Dell'anno; B Pietrangeli
Journal: Appl Environ Microbiol Date: 2006-09 Impact factor: 4.792

7. Archaeal diversity in Icelandic hot springs.

Authors: Thomas Kvist; Birgitte K Ahring; Peter Westermann
Journal: FEMS Microbiol Ecol Date: 2006-10-02 Impact factor: 4.194

8. Diversity study of nitrifying bacteria in full-scale municipal wastewater treatment plants.

Authors: Slil Siripong; Bruce E Rittmann
Journal: Water Res Date: 2007-01-24 Impact factor: 11.236

9. Community structure of actively growing bacterial populations in plant pathogen suppressive soil.

Authors: Karin Hjort; Antje Lembke; Arjen Speksnijder; Kornelia Smalla; Janet K Jansson
Journal: Microb Ecol Date: 2006-08-31 Impact factor: 4.192

10. The Ribosomal Database Project (RDP-II): sequences and tools for high-throughput rRNA analysis.

Authors: J R Cole; B Chai; R J Farris; Q Wang; S A Kulam; D M McGarrell; G M Garrity; J M Tiedje
Journal: Nucleic Acids Res Date: 2005-01-01 Impact factor: 16.971
