Ludovic Mallet, Tristan Bitard-Feildel, Franck Cerutti, Hélène Chiapello.
Abstract
MOTIVATION: Genome sequencing projects sometimes uncover more organisms than expected, especially for complex and/or non-model organisms. It is therefore useful to develop software that identifies mixtures of organisms in genome sequence assemblies.
Year: 2017 PMID: 28637232 PMCID: PMC5860033 DOI: 10.1093/bioinformatics/btx396
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Visualization and interactive exploration of assemblies. (A) Pairwise compositional divergence of contigs produced by PhylOligo. Contigs are reordered by hierarchical clustering. (B) Contig tree produced by PhylOligo on the tardigrade genome. The clade in red is the user's current selection. (C) Contigs clustered by HDBSCAN on oligonucleotide frequencies. Data from Magnaporthe oryzae. Red and blue are predicted clusters; grey contigs are unclassified. The hyperspace is reduced to two dimensions with t-SNE. (D) Determination of the untargeted threshold in ContaLocate based on the distribution of distances between the untargeted clade and the scanning windows over the whole assembly. (Color version of this figure is available at Bioinformatics online.)
Impact of k-mer pattern on the hybrid score (best computed value of the product of cluster specificity and sensitivity) for 10 pairs of simulated data
| Species mix (rows) / K-mer pattern (columns) | 111 | 1111 | 11111 | 11001 | 110101 | 111001 |
|---|---|---|---|---|---|---|
| | 0.39 | 0.79 | 0.94 | 0.45 | 0.93 | 0.97 |
| | 0.99 | 0.99 | 0.98 | 0.99 | 0.98 | 0.99 |
| | 1.00 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 |
| | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 |
| | 0.96 | 0.96 | 0.93 | 0.95 | 0.95 | 0.95 |
| | 0.95 | 0.99 | 0.95 | 0.99 | 0.96 | 0.98 |
| | 0.41 | 0.50 | 0.70 | 0.72 | 0.71 | 0.72 |
| | 0.73 | 0.73 | 0.71 | 0.72 | 0.71 | 0.72 |
| | 0.60 | 0.56 | 0.49 | 0.57 | 0.51 | 0.58 |
| | 0.65 | 0.61 | 0.43 | 0.58 | 0.47 | 0.58 |
| Mean | 0.69 | 0.74 | 0.75 | 0.81 | 0.83 | 0.78 |
| Median | 0.73 | 0.79 | 0.93 | 0.95 | 0.95 | 0.95 |
| Min | 0.01 | 0.01 | 0.15 | 0.45 | 0.47 | 0.05 |
| Max | 1.00 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 |
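
To make the table's vocabulary concrete, here is a minimal Python sketch of how a k-mer pattern such as 11001 could be applied to a contig and how the hybrid score is formed. It is not PhylOligo's implementation. The function names are hypothetical, and the reading of the pattern (1 keeps a position, 0 skips it, as in the usual spaced-seed convention) is an assumption that this record does not confirm. Only the definition of the hybrid score as the product of cluster specificity and sensitivity comes from the table caption. How specificity and sensitivity are measured per cluster is not given here, so they are plain inputs.

```python
from collections import Counter


def spaced_kmer_counts(sequence: str, pattern: str) -> Counter:
    """Count spaced k-mers in `sequence` under a binary `pattern`.

    Assumption: '1' marks a position that is kept and '0' a position
    that is skipped (usual spaced-seed convention).
    """
    if not pattern or set(pattern) - {"0", "1"}:
        raise ValueError("pattern must be a non-empty string of 0s and 1s")
    kept = [i for i, flag in enumerate(pattern) if flag == "1"]
    span = len(pattern)
    seq = sequence.upper()
    counts = Counter()
    for start in range(len(seq) - span + 1):
        window = seq[start:start + span]
        word = "".join(window[i] for i in kept)
        # Skip words that contain ambiguous bases such as N.
        if set(word) <= set("ACGT"):
            counts[word] += 1
    return counts


def frequencies(counts: Counter) -> dict:
    """Turn raw counts into relative frequencies (empty dict if no counts)."""
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()} if total else {}


def hybrid_score(specificity: float, sensitivity: float) -> float:
    """Product of cluster specificity and sensitivity (table caption)."""
    return specificity * sensitivity


if __name__ == "__main__":
    profile = frequencies(spaced_kmer_counts("ACGTACGTTA", "11001"))
    print(sorted(profile.items()))
    print(hybrid_score(0.9, 0.5))
```

Under this reading, the contiguous patterns 111, 1111 and 11111 reduce to ordinary 3-, 4- and 5-mers, while 11001 samples three positions across a five-base window.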