Florent Angly, Beltran Rodriguez-Brito, David Bangor, Pat McNairnie, Mya Breitbart, Peter Salamon, Ben Felts, James Nulton, Joseph Mahaffy, Forest Rohwer.
Abstract
BACKGROUND: Phages, viruses that infect prokaryotes, are the most abundant microbes in the world. A major limitation to studying these viruses is the difficulty of cultivating the appropriate prokaryotic hosts. One way around this limitation is to directly clone and sequence shotgun libraries of uncultured viral communities (i.e., metagenomic analyses). PHACCS (Phage Communities from Contig Spectrum, http://phage.sdsu.edu/phaccs) is an online bioinformatic tool to assess the biodiversity of uncultured viral communities. PHACCS uses the contig spectrum from shotgun DNA sequence assemblies to mathematically model the structure of viral communities and make predictions about diversity.
Year: 2005 PMID: 15743531 PMCID: PMC555943 DOI: 10.1186/1471-2105-6-41
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Flowchart of PHACCS. *The rank-abundance functions and the range of genotypes to use can be defined by the user. **This parameter represents b for the power law, logarithmic, lognormal and exponential distributions and k for the niche preemption. This parameter is not applicable to the broken stick.
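To make the role of the shape parameter concrete, here is a minimal sketch of three of the rank-abundance forms named in the caption. It uses the standard textbook definitions (power law, geometric-series niche preemption, MacArthur's broken stick); the function names and the normalization to relative abundances are assumptions for illustration and may differ from the parameterization PHACCS actually uses.

```python
def power_law(richness, b):
    """Relative abundance proportional to rank**(-b) (textbook form)."""
    weights = [rank ** -b for rank in range(1, richness + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def niche_preemption(richness, k):
    """Geometric series: each genotype takes a fraction k of what remains."""
    weights = [k * (1 - k) ** (rank - 1) for rank in range(1, richness + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def broken_stick(richness):
    """MacArthur's broken stick; note that it takes no shape parameter."""
    return [
        sum(1 / j for j in range(rank, richness + 1)) / richness
        for rank in range(1, richness + 1)
    ]
```

Each function returns relative abundances ordered by rank, with rank one being the most abundant genotype. The broken stick has only the number of genotypes as input, which is consistent with the caption's remark that the parameter does not apply to it.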
Test data used for the study of the phage communities with PHACCS [10-12]. The average fragment sequence length was determined using Sequencher after sequence trimming (maximum one ambiguity on 99 bp at each extremity). A 98% identity for a minimal overlap length of 20 bp was used for sequence assembly with Sequencher to obtain the contig spectra. The average genome length was determined by pulsed-field gel electrophoresis [12, 22].
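| Community | SP | MB | MBSED | FEC |
| Contig spectrum* | 1021 17 2 0 ... | 841 13 2 0 ... | 1152 2 0 ... | 482 18 2 2 0 ... |
| Average genome length | 50 kb | 50 kb | 50 kb | 30 kb |
| Average fragment sequence length | 663 bp | 663 bp | 570 bp | 699 bp |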
* The number of trailing zeros was set to 10 for each contig spectrum.
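As a small illustration of how a contig spectrum can be handled, the sketch below pads a spectrum with trailing zeros and recovers the number of sequences it accounts for. It assumes the usual reading of a contig spectrum, in which entry q is the number of contigs built from q sequences; the helper names are hypothetical and not part of PHACCS.

```python
def pad_spectrum(spectrum, trailing_zeros=10):
    """Strip existing trailing zeros, then append a fixed number of them."""
    trimmed = list(spectrum)
    while trimmed and trimmed[-1] == 0:
        trimmed.pop()
    return trimmed + [0] * trailing_zeros


def total_fragments(spectrum):
    """Number of sequences: entry q counts contigs made of q sequences."""
    return sum(size * count for size, count in enumerate(spectrum, start=1))


# First contig spectrum listed in the table above: 1021 17 2 0 ...
first = pad_spectrum([1021, 17, 2, 0])
n_fragments = total_fragments(first)  # 1021*1 + 17*2 + 2*3 = 1061
```

Padding does not change the fragment count, since the added entries are all zero.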
Best descriptive rank-abundance form for the viral communities as determined by PHACCS. The error represents the variance-weighted sum of squared deviations between the experimental and the predicted contig spectra. For each community, the best descriptive function is the one that minimizes the error. The best fit obtained for each rank-abundance form was ranked according to the error in ascending order.
| SP | | | MB | | | MBSED | | | FEC | | |
| Rank | Form | Error | Rank | Form | Error | Rank | Form | Error | Rank | Form | Error |
| 1 | | | 1 | | | 1 | | | 1 | | |
| 2 | Lognormal | 1.93 | 2 | Lognormal | 2.36 | 1 | | | 2 | Lognormal | 10.2 |
| 3 | Logarithmic | 2.57 | 3 | Logarithmic | 2.88 | 1 | | | 3 | Logarithmic | 10.3 |
| 4 | Broken stick | 10.7 | 4 | Broken stick | 14.6 | 1 | | | 4 | Broken stick | 52.2 |
| 5 | Exponential | 12.0 | 5 | Exponential | 16.2 | 5 | Niche preemption | 0.0139 | 5 | Exponential | 60.0 |
| 5 | Niche preemption | 12.0 | 5 | Niche preemption | 16.2 | 6 | Broken stick | 0.0157 | 5 | Exponential | 60.0 |
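The error described in the caption above can be sketched as follows. This is only an illustration of a variance-weighted sum of squared deviations and of picking the form with the smallest error; the predicted spectrum, the variances, the form names and the handling of zero variances are all assumptions, not values or behavior taken from PHACCS.

```python
def weighted_error(observed, predicted, variance):
    """Sum of squared deviations, each divided by its variance."""
    if not (len(observed) == len(predicted) == len(variance)):
        raise ValueError("all three sequences must have the same length")
    return sum(
        (obs - pred) ** 2 / var
        for obs, pred, var in zip(observed, predicted, variance)
        if var > 0  # assumption: entries with zero variance are skipped
    )


def best_form(errors):
    """Name of the rank-abundance form with the smallest error."""
    return min(errors, key=errors.get)


# Observed values are the leading entries of a contig spectrum;
# predicted values and variances are hypothetical.
observed = [1021, 17, 2]
predicted = [1020, 18, 2]
variance = [4.0, 2.0, 1.0]
error = weighted_error(observed, predicted, variance)  # 1/4 + 1/2 + 0

# Hypothetical errors for two unnamed candidate forms.
winner = best_form({"form_a": 2.0, "form_b": 0.5})
```

Dividing by the variance means that contig-spectrum entries with large expected scatter contribute less to the error than tightly constrained ones.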
Figure 2. Comparison of the structure and diversity of the different viral communities using PHACCS. The graphs show rank-abundance curves, where the abundance of each genotype is plotted against its abundance rank, the genotype of rank one being the most abundant. The curves were obtained by plotting the PHACCS rank-abundance values of the different communities on the same axes. *The predicted community structure for MBSED was the same for the lognormal, logarithmic, power and exponential rank-abundance forms. As a consequence, the diversity predictions were also the same.
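To show why identical predicted structures lead to identical diversity predictions, here is a minimal sketch that derives two common diversity summaries from a list of rank-abundance values. The Shannon index and evenness are used as familiar examples; this excerpt does not state which diversity measures PHACCS reports, so the choice of measures and the function names are assumptions.

```python
import math


def shannon_index(abundances):
    """Shannon index H' = -sum(p * ln p) over genotypes with p > 0."""
    total = sum(abundances)
    return -sum(
        (a / total) * math.log(a / total) for a in abundances if a > 0
    )


def evenness(abundances):
    """H' divided by ln(richness); defined here as 0.0 for one genotype."""
    richness = sum(1 for a in abundances if a > 0)
    if richness <= 1:
        return 0.0  # assumption: convention chosen for this sketch
    return shannon_index(abundances) / math.log(richness)
```

Both functions depend only on the abundance values, so two rank-abundance forms that produce the same curve necessarily give the same index.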
Figure 3. Screenshot of PHACCS' advanced web interface.