Edward Vitkin, Oz Solomon, Sharon Sultan, Zohar Yakhini.
Abstract
BACKGROUND: Synthetic biology and related techniques enable genome-scale, high-throughput investigation of how different gene knock-downs/outs and other modifications of the genomic sequence affect organism fitness.
Keywords: Co-expression; Co-fitness; Fitness data; Flux balance analysis (FBA); Metabolic modelling; Orphan reactions
Year: 2018 PMID: 30305012 PMCID: PMC6180484 DOI: 10.1186/s12859-018-2341-9
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 Analyses and methods used in the current study. a The general workflow. b Example of orphan and non-orphan reactions
Fig. 2 a First enriched motif in D-Glucose C. vs. Casaminos C. (the second enriched motif is found in Figure S3A). b Enriched motif detected for Sucrose C. vs. Casaminos C. A comparison of this motif to the known metJ motif is found in Additional file 1: Figure S3B. Red points are promoters with high PSSM values with respect to the given motif. Shaded black points are all the promoters analyzed. Corrected mmHG p-value < 0.01 for both panels
Fig. 3 The distributions of co-fitness using different functional classes to group gene pairs. a Different classes from annotation databases (Methods). b Grouping the gene pairs according to genomic position bins. c The fraction of gene pairs with co-fitness ≥ x divided by the corresponding fraction of gene pairs in the background distribution (total gene pairs, null). For further details see Methods
Fig. 4 Enrichment/depletion scores (one-tailed Wilcoxon, -log10(p-value)) for annotated gene pairs grouped according to functional classes. Orange: co-expression. Blue: co-fitness
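The ratio plotted in panel c of Fig. 3 can be illustrated with a minimal sketch. It assumes co-fitness values are available as plain lists of numbers; the function name `fold_enrichment` and the toy values are hypothetical and are not taken from the study.

```python
def fold_enrichment(class_scores, background_scores, x):
    """Fraction of class pairs with score >= x, divided by the
    same fraction computed over the background (all gene pairs)."""
    if not class_scores or not background_scores:
        raise ValueError("both score lists must be non-empty")
    class_frac = sum(s >= x for s in class_scores) / len(class_scores)
    background_frac = sum(s >= x for s in background_scores) / len(background_scores)
    if background_frac == 0:
        # No background pair reaches the threshold; the ratio is undefined.
        return float("nan")
    return class_frac / background_frac


# Toy co-fitness values (illustrative only, not data from the paper).
same_operon = [0.9, 0.7, 0.2, 0.8]
all_pairs = [0.9, 0.7, 0.2, 0.8, 0.1, 0.0, -0.3, 0.05]
ratio = fold_enrichment(same_operon, all_pairs, 0.5)
print(ratio)
```

A ratio above 1 at a given threshold x means that the functional class contains proportionally more high co-fitness pairs than the background does.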
Correlation and mmHG results for comparing co-fitness to co-expression. Empirical p-values for the mmHG tests were calculated based on shuffled data, where each list, used as functional class for the gene pairs, was shuffled (100 instances), preserving the original partition structure
| | Spearman's R | Corrected mmHG statistics | # of pairs (N) | B | n* | b* | Empirical p-value |
|---|---|---|---|---|---|---|---|
| All pairs | 0.007 | 0.1 | 6,485,401 | 43 | 368 | 2 | N/A |
| Same operons | 0.186 | 1.18 × 10⁻¹³ | 2453 | 251 | 966 | 165 | 0 |
| Paralog genes | 0.048 | 1 | 4584 | 5 | 46 | 2 | 0.32 |
| Same Pfam | 0.051 | 1 | 10,626 | 948 | 155 | 31 | 0.47 |
| Within 5 kb | 0.058 | 2.53 × 10⁻⁴ | 7443 | 221 | 991 | 61 | 0 |
| Within 10 kb | 0.046 | 2.49 × 10⁻⁴ | 14,202 | 57 | 481 | 14 | 0 |
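The empirical p-values in the last column come from shuffling class labels while keeping class sizes fixed. The sketch below shows that general idea only. It is not the authors' implementation: the mmHG statistic is replaced by a hypothetical stand-in (`mean_difference`, where larger means more extreme), and the names `empirical_p_value`, `scores` and `labels` are assumptions made for illustration.

```python
import random


def empirical_p_value(scores, labels, statistic, n_shuffles=100, seed=0):
    """Fraction of label shuffles whose statistic is at least as large
    as the observed one. Shuffling permutes the labels among the pairs,
    so the number of pairs in each class is preserved."""
    rng = random.Random(seed)
    observed = statistic(scores, labels)
    shuffled = list(labels)  # copy, so the caller's labels stay untouched
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if statistic(scores, shuffled) >= observed:
            hits += 1
    return hits / n_shuffles


def mean_difference(scores, labels):
    """Stand-in statistic (assumption): mean score inside the class
    minus mean score outside it."""
    inside = [s for s, flag in zip(scores, labels) if flag]
    outside = [s for s, flag in zip(scores, labels) if not flag]
    return sum(inside) / len(inside) - sum(outside) / len(outside)


# Toy data (illustrative only): three high-scoring pairs form the class.
scores = [0.9, 0.8, 0.85, 0.1, 0.0, -0.1, 0.05, 0.2, 0.15, -0.2]
labels = [True, True, True] + [False] * 7
p = empirical_p_value(scores, labels, mean_difference)
print(p)
```

Because the estimate is a plain fraction over 100 shuffles, its resolution is 0.01 and it can be exactly 0, as in several rows of the table.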
Fig. 5 Validation. Comparative accuracy of Association Likelihood Score (ALS, Methods) values. For every k (on the x-axis) we indicate the fraction of reactions for which the true gene is within the predicted top k. ALS computed based on Spearman correlation to the two best neighbors
Fig. 6 Prediction performance depends on the number of conditions measured
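The top-k accuracy on the y-axis of Fig. 5 can be sketched as follows, assuming each reaction already has a list of candidate genes ranked from best to worst score. The function name `top_k_accuracy`, the reaction identifiers and the gene names are hypothetical; how the ALS ranking itself is produced is described in Methods and is not reproduced here.

```python
def top_k_accuracy(ranked_predictions, true_genes, k):
    """Fraction of reactions whose true gene appears among the
    first k candidates of its ranked prediction list."""
    if not true_genes:
        raise ValueError("true_genes must be non-empty")
    hits = 0
    for reaction, true_gene in true_genes.items():
        candidates = ranked_predictions.get(reaction, [])
        if true_gene in candidates[:k]:
            hits += 1
    return hits / len(true_genes)


# Toy rankings (illustrative only), best candidate first.
ranked = {
    "R1": ["g1", "g2", "g3"],
    "R2": ["g5", "g4", "g6"],
    "R3": ["g9", "g8", "g7"],
}
truth = {"R1": "g1", "R2": "g4", "R3": "g7"}

curve = [top_k_accuracy(ranked, truth, k) for k in (1, 2, 3)]
print(curve)
```

Evaluating the function over a range of k gives a non-decreasing curve of the kind shown in the figure.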