Shoko Iwai, Benli Chai, Woo Jun Sul, James R Cole, Syed A Hashsham, James M Tiedje.
Abstract
Understanding the relationship between gene diversity and function for important environmental processes is a major ecological research goal. We applied gene-targeted metagenomics and pyrosequencing to aromatic dioxygenase genes to obtain greater sequence depth than possible by other methods. A polymerase chain reaction (PCR) primer set designed to target a 524-bp region that confers substrate specificity of biphenyl dioxygenases yielded 2000 and 604 sequences from the 5' and 3' ends of PCR products, respectively, which passed our validity criteria. Sequence alignment showed three known conserved residues, as well as another seven conserved residues not reported earlier. Of the valid sequences, 95% and 41% were assigned to 22 and 3 novel clusters in that they did not include any earlier reported sequences at 0.6 distance by complete linkage clustering for sequenced regions. The greater diversity revealed by this gene-targeted approach provides deeper insights into genes potentially important in environmental processes to better understand their ecology, functional differences and evolutionary origins. We also provide criteria for primer design for this approach, as well as guidance for data processing of diverse functional genes, as gene databases for most genes of environmental relevance are limited.
Year: 2009 PMID: 19776767 PMCID: PMC2808446 DOI: 10.1038/ismej.2009.104
Source DB: PubMed Journal: ISME J ISSN: 1751-7362 Impact factor: 10.302
Pyrosequencing information.
| | Sequencing primer BPHD-f3 | Sequencing primer BPHD-r1 | Sum |
|---|---|---|---|
| Number of raw sequences | 2486 | 835 | 3321 |
| Total raw sequence length | 556 744 bp | 198 152 bp | 754 896 bp |
| Average raw sequence length | 224 bp | 237 bp | |
| Obtained sequence | 175 bp (58 aa) | 200 bp (66 aa) | |
| Number of obtained sequences | 2024 | 608 | 2632 |
| Number of valid sequences | 2000 | 604 | |
| Unique nucleotide sequences in valid sequences | 743 | 339 | |
Sequences that passed position of error and frameshift analysis.
The number of sequences that have 230D, 233H and 239H in the BPHD-f3 alignment and 344P in the BPHD-r1 alignment. (The numbers correspond to positions in bphA1 from B. xenovorans LB400.)
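The derived rows of the table can be checked against the raw counts. The short sketch below recomputes the average raw read length from the reported totals and, as an extra illustration not given in the source, the fraction of raw reads that ended up as valid sequences. The variable names are illustrative only.

```python
# Values copied from the pyrosequencing table above.
raw_counts = {"BPHD-f3": 2486, "BPHD-r1": 835}
raw_bp = {"BPHD-f3": 556_744, "BPHD-r1": 198_152}
valid_counts = {"BPHD-f3": 2000, "BPHD-r1": 604}

# Average raw read length = total raw length / number of raw reads.
avg_len = {p: round(raw_bp[p] / raw_counts[p]) for p in raw_counts}

# Fraction of raw reads retained as valid sequences (derived here,
# not a figure reported in the source).
valid_fraction = {p: valid_counts[p] / raw_counts[p] for p in raw_counts}

for primer in raw_counts:
    print(primer, avg_len[primer], f"{valid_fraction[primer]:.1%}")
```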
Figure 1. Shannon entropy (H′) at each alignment position and conserved residues among obtained sequences and/or reference sequences for (a) BPHD-f3 sequences and (b) BPHD-r1 sequences. Open circles (○) indicate the entropy of reference sequences and filled circles (●) indicate the entropy of obtained sequences. The corresponding position numbers and residues of bphA1 from B. xenovorans LB400 are indicated. The ratios of residues conserved at >95% in either set are shown. Residues highly conserved only among obtained sequences are indicated with [*].
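As a rough illustration of the per-position entropy plotted in Figure 1, the sketch below computes Shannon entropy for each column of an amino acid alignment. It assumes the natural logarithm and treats every character, including any gap symbol, as an ordinary state; the source does not specify either choice. The toy alignment and the function names are hypothetical.

```python
import math
from collections import Counter

def column_entropy(column):
    """Shannon entropy H' = sum(p * ln(1/p)) over residues in one column."""
    counts = Counter(column)
    total = sum(counts.values())
    return sum((n / total) * math.log(total / n) for n in counts.values())

def alignment_entropy(aligned_seqs):
    """Entropy at each position of equal-length aligned sequences."""
    return [column_entropy(col) for col in zip(*aligned_seqs)]

# Hypothetical toy alignment: the first two columns are fully conserved,
# the third column has one substitution in four sequences.
toy_alignment = ["DHH", "DHH", "DHQ", "DHH"]
entropies = alignment_entropy(toy_alignment)
print([round(h, 3) for h in entropies])
```

A fully conserved column gives an entropy of zero, so conserved residues such as those named in the table footnote would appear as points on the baseline of such a plot.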
Figure 2. Clustering of valid sequences at different distance levels by complete linkage clustering based on amino acid sequences. The numbers of OTUs of valid sequences (solid line, BPHD-f3 sequences; dashed line, BPHD-r1 sequences) at each distance are shown. The arrow indicates the distance level used for the distribution analysis.
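The OTU counts in Figure 2 come from complete linkage clustering at a series of distance cutoffs. The sketch below is a naive pure-Python version of that idea, not the authors' pipeline. It assumes a simple p-distance (fraction of differing aligned positions) and merges clusters while the largest pairwise distance between their members is at most the cutoff; both are assumptions, and the toy sequences are hypothetical.

```python
def p_distance(a, b):
    """Fraction of aligned positions at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def complete_linkage_otus(seqs, cutoff):
    """Group sequence indices into OTUs by complete linkage at a cutoff."""
    clusters = [[i] for i in range(len(seqs))]

    def linkage(c1, c2):
        # Complete linkage: distance between the two farthest members.
        return max(p_distance(seqs[i], seqs[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        # Find the closest pair of clusters under complete linkage.
        d, a, b = min(
            (linkage(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        if d > cutoff:
            break  # no remaining pair can be merged at this cutoff
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

# Hypothetical toy sequences forming two well-separated groups.
toy_seqs = ["AAAAA", "AAAAT", "TTTTT", "TTTTA"]
for cutoff in (0.1, 0.6, 1.0):
    print(cutoff, len(complete_linkage_otus(toy_seqs, cutoff)))
```

Sweeping the cutoff in this way yields a curve of OTU count against distance, which is the kind of curve the figure reports. This naive loop recomputes linkages at every step and is only practical for small inputs.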
Figure 3. Pairwise distance to the closest reference sequence(s) for each valid sequence and the number of sequences in each cluster. Symbols are the median of the distances in the cluster. Error bars indicate the range of the distances.
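The per-cluster summary in Figure 3 can be sketched as follows: for each sequence, take the smallest distance to any reference sequence, then report the cluster size, median, and range. The p-distance measure, the function names, and the toy data are assumptions for illustration; the source does not state which distance measure was used.

```python
from statistics import median

def p_distance(a, b):
    """Fraction of aligned positions at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def nearest_reference_distance(seq, references):
    """Distance from one sequence to its closest reference sequence."""
    return min(p_distance(seq, ref) for ref in references)

def summarize_clusters(clusters, references):
    """Per-cluster size, median, and range of nearest-reference distances."""
    summary = {}
    for name, members in clusters.items():
        dists = [nearest_reference_distance(s, references) for s in members]
        summary[name] = {
            "n": len(dists),
            "median": median(dists),
            "min": min(dists),
            "max": max(dists),
        }
    return summary

# Hypothetical references and clusters of aligned sequences.
toy_refs = ["AAAAA", "TTTTT"]
toy_clusters = {
    "cluster_1": ["AAAAA", "AAAAT", "AAATT"],
    "cluster_2": ["GGGGG"],
}
summary = summarize_clusters(toy_clusters, toy_refs)
for name, stats in summary.items():
    print(name, stats)
```

A cluster whose minimum distance stays large, like `cluster_2` in the toy data, contains no sequence close to any reference, which matches the sense in which the abstract describes clusters as novel.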