Andrew J Borchert, Alissa Bleem, Gregg T Beckham.
Abstract
Randomly barcoded transposon insertion sequencing (RB-TnSeq) is an efficient, multiplexed method to determine microbial gene function during growth under a selection condition of interest. This technique applies to growth, tolerance, and persistence studies in a variety of hosts, but the wealth of data generated can complicate the identification of the most critical gene targets. Experimental and analytical methods for improving the resolution of RB-TnSeq are proposed, using Pseudomonas putida KT2440 as an example organism. Several key parameters, such as baseline media selection, substantially influence the determination of gene fitness. We also present options to increase statistical confidence in gene fitness, including increasing the number of biological replicates and passaging the baseline culture in parallel with selection conditions. These considerations provide practitioners with several options to identify genes of importance in TnSeq data sets, thereby streamlining metabolic characterization.
Keywords: Pseudomonas putida; baseline selection; data resolution; gene function; transposon insertion sequencing
Year: 2022 PMID: 35657709 PMCID: PMC9208016 DOI: 10.1021/acssynbio.2c00119
Source DB: PubMed Journal: ACS Synth Biol ISSN: 2161-5063 Impact factor: 5.249
Figure 1. Summary of experimental and analytical considerations for RB-TnSeq. (A) Time-zero (T = 0) cultures are used as the “baseline” reference for fitness calculations and may either be grown in rich medium (e.g., LB) or minimal medium (e.g., M9 + glucose), the selection of which influences the fitness calculation between enrichment cultures and reference (T = 0) cultures. Enrichment cultures consist of either a passaged medium reference culture (top) or a desired selection condition (bottom). Fitness comparisons may be made between reference and selection enrichment cultures by calculating average fitness scores for each of the two groups. (B) If a practitioner prioritizes high throughput, a single sample may be used for each of the T = 0 and enrichment cultures, but this limits the gene fitness confidence metric to only the t-like statistic, which can vary between replicates. For increased statistical confidence, biological triplicates enable calculation of mean fitness and a two-sample t-score for each gene. (C) In theory, transposon insertions (blue bands) occur randomly along the length of a gene. Some analytical approaches discard genes from analysis if the transposon insertion lies within the first or last 10% of the gene coding sequence length.
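The trimming rule in panel (C) can be illustrated with a minimal sketch. It assumes gene coordinates are given as start and end positions on the genome and that an insertion is "central" when it falls within the middle 80% of the coding sequence. The `Gene` class, the function names, and the coordinates are hypothetical and are not taken from the authors' pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record; assumes end > start."""
    start: int  # first coordinate of the coding sequence
    end: int    # last coordinate of the coding sequence


def relative_position(gene: Gene, insertion_pos: int) -> float:
    """Fractional position of an insertion along the gene (0 = start, 1 = end)."""
    return (insertion_pos - gene.start) / (gene.end - gene.start)


def in_central_region(gene: Gene, insertion_pos: int, trim: float = 0.10) -> bool:
    """True if the insertion lies outside the first and last `trim` fraction.

    The rule is symmetric, so the gene's strand does not change the result.
    """
    frac = relative_position(gene, insertion_pos)
    return trim <= frac <= 1.0 - trim


# Illustrative coordinates only (not real KT2440 annotations).
example_gene = Gene(start=1000, end=2000)
kept = [pos for pos in (1020, 1500, 1985) if in_central_region(example_gene, pos)]
```

With the illustrative gene above, only the insertion at position 1500 survives trimming; the other two fall in the first or last 10% of the coding sequence.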
Figure 2. Baseline considerations for fitness calculations during RB-TnSeq experiments. (A) Triplicate T = 0 cultures were prepared in M9 + 20 mM glucose and used to directly inoculate M9 + 20 mM glucose enrichment cultures. Another set of triplicate T = 0 cultures was prepared in LB and washed once in M9 salts prior to inoculation of M9 + 20 mM glucose enrichment cultures. The 20 genes with lowest mean fitness scores from the M9 T = 0 data set are shown with corresponding fitness scores from the LB T = 0 data set. (B) Normalized fitness scores and t-like statistics plotted for (left) a single M9 + 10 mM ferulate enrichment culture using either the M9 + 20 mM glucose T = 0 condition as the baseline or (right) a parallel M9 + 20 mM glucose enrichment culture as the baseline. Significant (t-like statistic >5) negative and positive fitness values marked in red or blue, respectively. (C) Normalized fitness values for triplicate M9 + 20 mM glucose or M9 + 10 mM ferulate cultures were calculated using the M9 + 20 mM glucose T = 0 condition as the baseline and shown for members of the ferulate catabolism pathway in KT2440. (D) Average normalized fitness values for M9 + 20 mM glucose (reference medium) cultures were compared to those for M9 + 10 mM ferulate (enrichment medium) cultures using the M9 + 20 mM glucose T = 0 condition as the baseline. Red dashed lines indicate genes with an average fitness score difference >1.5. Significance was determined with a two-sample t-test, where a q value <0.1 (stars) denotes a significant fitness disparity between the two conditions.
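The comparison in panel (D) can be sketched as follows: for each gene, take the mean fitness across replicates in the reference and selection conditions, compute the difference, and compute a two-sample t-score. This sketch assumes the unequal-variance (Welch) form of the t-score, which the caption does not specify. The function names, gene labels, and fitness values are invented for illustration. Converting t-scores to q values requires p-values and a multiple-testing correction, which are not shown here.

```python
from math import nan, sqrt
from statistics import mean, variance


def two_sample_t(a: list[float], b: list[float]) -> float:
    """Two-sample t-score, assuming unequal variances (Welch's form)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0.0:
        return nan  # replicates are identical; the score is undefined
    return (mean(a) - mean(b)) / se


def compare_conditions(reference, selection, min_diff=1.5):
    """Per-gene mean fitness difference (selection minus reference) and t-score.

    `min_diff` mirrors the >1.5 average fitness difference marked in the figure.
    """
    results = {}
    for gene in sorted(reference.keys() & selection.keys()):
        ref, sel = reference[gene], selection[gene]
        diff = mean(sel) - mean(ref)
        results[gene] = {
            "diff": diff,
            "t": two_sample_t(sel, ref),
            "large_shift": abs(diff) > min_diff,
        }
    return results


# Made-up triplicate fitness values for two hypothetical genes.
reference_fitness = {"gene_A": [0.1, -0.1, 0.0], "gene_B": [0.2, 0.1, 0.0]}
selection_fitness = {"gene_A": [-3.0, -3.2, -2.8], "gene_B": [0.1, 0.2, 0.0]}
summary = compare_conditions(reference_fitness, selection_fitness)
```

In this toy input, `gene_A` shows a large negative shift under selection, while `gene_B` shows none. Triplicates are the minimum that makes the per-condition variance meaningful, which is consistent with the replicate recommendation in Figure 1B.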
Effect of Trimming on Fitness Calculations
| | nontrimmed |
|---|---|
| genes analyzed | 4937 |
Genes analyzed contain fitness data from all three biological replicates.
Genes from the nontrimmed data set that contained counts in the trimmed region.