Punita Manga, Dawn M Klingeman, Tse-Yuan S Lu, Tonia L Mehlhorn, Dale A Pelletier, Loren J Hauser, Charlotte M Wilson, Steven D Brown.
Abstract
RNA-seq is being used increasingly for gene expression studies and is revolutionizing the fields of genomics and transcriptomics. However, the field of RNA-seq analysis is still evolving. We therefore designed this study to contain large numbers of reads and four biological replicates per condition so that we could alter these parameters and assess their impact on differential expression results. Bacillus thuringiensis strains ATCC10792 and CT43 were grown in two Luria broth medium lots on four dates, and transcriptomics data were generated using one lane of sequence output from an Illumina HiSeq2000 instrument for each of the 32 samples, which were then analyzed using DESeq2. Genome coverages across samples ranged from 87 to 465X, with medium lots and culture dates identified as major sources of variation. Significantly differentially expressed genes (5% FDR, two-fold change) were detected between cultures grown using different medium lots and between different dates. The highly differentially expressed iron acquisition and metabolism genes were a likely consequence of differing amounts of iron in the two medium lots. Indeed, in this study RNA-seq was a tool for predictive biology, since we hypothesized and confirmed that the two LB medium lots had different iron contents (~two-fold difference). This study shows that noise in the data can be controlled and minimized with appropriate experimental design and by having the appropriate number of replicates and reads for the system being studied. We outline parameters for an efficient and cost-effective microbial transcriptomics study.
Keywords: DESeq2; Illumina; coverage; negative binomial; normalization; replicates
Year: 2016 PMID: 27303383 PMCID: PMC4886094 DOI: 10.3389/fmicb.2016.00794
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
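The significance criterion quoted in the abstract (5% FDR, two-fold change) can be sketched as a simple filter. This is a minimal illustration, not the authors' pipeline: the gene names and values below are hypothetical, and the exact inequality conventions (strict versus non-strict) are assumptions.

```python
import math

# Thresholds stated in the abstract: 5% FDR and a two-fold change.
FDR_CUTOFF = 0.05
MIN_ABS_LOG2FC = math.log2(2.0)  # a two-fold change corresponds to |log2FC| >= 1


def is_significant(log2_fold_change, padj):
    """Return True when a gene passes both the FDR and the fold-change cutoff."""
    if padj is None:  # assumption: genes without an adjusted p-value are skipped
        return False
    return padj < FDR_CUTOFF and abs(log2_fold_change) >= MIN_ABS_LOG2FC


# Hypothetical rows (gene, log2 fold change, adjusted p-value); not study data.
results = [
    ("geneA", 2.67, 0.0004),
    ("geneB", -1.60, 0.0008),
    ("geneC", 0.80, 0.0001),  # small adjusted p-value but below two-fold
    ("geneD", 3.10, 0.2000),  # large change but fails the FDR cutoff
    ("geneE", 1.50, None),    # no adjusted p-value available
]

significant = [name for name, lfc, padj in results if is_significant(lfc, padj)]
print(significant)
```

Both up- and down-regulated genes pass, because the fold-change test uses the absolute value of the log2 ratio.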
Figure 1. Sampling and major sources of variation. Strains CT43 and ATCC10792 were grown in two medium lots (#1091744 and #7220443) prepared with water taken from buildings 1520 and 1610. Bacteria were cultured on four different dates, and four biological replicates were grown to mid-log phase for each date and harvested; RNA-seq data were then generated using an Illumina HiSeq 2000 instrument.
Summary of trimmed and mapped reads for strain ATCC10792.
| 1A | 0.394 | 26,986,606 | 202x | 19,762,858 | 19,708,701 |
| 2A | 0.398 | 11,787,714 | 87x | 7,579,068 | 7,534,508 |
| 3A | 0.384 | 27,315,600 | 203x | 19,174,331 | 19,098,676 |
| 4A | 0.404 | 53,643,496 | 400x | 37,447,561 | 37,326,511 |
| 5A | 0.398 | 51,109,286 | 380x | 37,229,691 | 37,112,202 |
| 6A | 0.420 | 57,636,652 | 430x | 41,566,889 | 41,466,848 |
| 7A | 0.406 | 49,915,906 | 370x | 34,855,294 | 34,747,534 |
| 8A | 0.452 | 52,689,519 | 392x | 37,104,818 | 36,999,659 |
| 9A | 0.384 | 20,291,318 | 160x | 9,640,962 | 9,590,311 |
| 10A | 0.386 | 13,356,476 | 105x | 6,185,482 | 6,154,691 |
| 11A | 0.398 | 22,487,034 | 177x | 9,762,915 | 9,698,237 |
| 12A | 0.392 | 25,052,676 | 197x | 10,408,386 | 10,361,349 |
| 13A | 0.482 | 21,857,603 | 172x | 9,475,191 | 9,444,912 |
| 14A | 0.476 | 27,286,565 | 215x | 10,871,957 | 10,828,524 |
| 15A | 0.504 | 26,104,043 | 206x | 9,558,505 | 9,484,938 |
| 16A | 0.468 | 30,818,722 | 243x | 21,037,052 | 20,987,988 |
See Data Sheet .
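Fold-coverage values such as those in the table are conventionally estimated as the number of sequenced bases divided by the genome size. The sketch below shows that arithmetic only. The read length and genome size are hypothetical placeholders, since neither is given in this record, and the authors' exact calculation may differ.

```python
def fold_coverage(n_reads, read_length_bp, genome_size_bp):
    """Estimate fold coverage as total sequenced bases over genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return n_reads * read_length_bp / genome_size_bp


# Hypothetical inputs chosen for easy arithmetic; not values from the study.
example = fold_coverage(n_reads=1_000_000, read_length_bp=50, genome_size_bp=5_000_000)
print(f"{example:.0f}x")
```

Because the estimate is linear in the number of reads, halving the reads for a sample halves its expected coverage, which is the lever the read-subsampling comparisons rely on.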
Figure 2. Variation analysis of raw read count data for strain ATCC10792 and strain CT43. (A) Principal Component Analysis (PCA) for ATCC10792 using a Pearson correlation coefficient and colored by media, (B) Hierarchical cluster analysis of the same data for strain ATCC10792, (C) PCA for CT43 and (D) CT43 cluster analysis.
Genes related to iron acquisition and metabolism differentially expressed in strain ATCC10792 grown in medium lot #1091744 over #7220443.
| BTHUR0008_RS01670 | Iron ABC transporter permease | 2.67 | <0.001 |
| BTHUR0008_RS01675 | Ferrichrome ABC transporter permease | 2.70 | <0.001 |
| BTHUR0008_RS01680 | ABC transporter substrate-binding protein | 2.94 | <0.001 |
| BTHUR0008_RS01685 | Ferredoxin–NADP reductase | 2.56 | <0.001 |
| BTHUR0008_RS02820 | Iron-enterobactin transporter ATP-binding protein | 1.30 | <0.001 |
| BTHUR0008_RS02825 | Iron ABC transporter permease | 1.45 | <0.001 |
| BTHUR0008_RS02835 | Iron siderophore-binding protein | 1.23 | <0.001 |
| BTHUR0008_RS03465 | Iron transporter FeoA | −1.23 | <0.001 |
| BTHUR0008_RS06975 | Ferredoxin | −1.04 | <0.001 |
| BTHUR0008_RS10095 | Fe-S oxidoreductase | −1.43 | <0.001 |
| BTHUR0008_RS10345 | Iron(III) dicitrate-binding protein | 1.96 | <0.001 |
| BTHUR0008_RS15775 | Ferrichrome ABC transporter permease | 2.50 | <0.001 |
| BTHUR0008_RS15780 | Iron ABC transporter permease | 2.34 | <0.001 |
| BTHUR0008_RS15785 | Iron-hydroxamate ABC transporter substrate-binding protein | 2.51 | <0.001 |
| BTHUR0008_RS17445 | Iron-uptake system-binding protein | 3.45 | <0.001 |
| BTHUR0008_RS17450 | Ferrichrome ABC transporter permease | 2.88 | <0.001 |
| BTHUR0008_RS17455 | Iron ABC transporter permease | 3.72 | <0.001 |
| BTHUR0008_RS17460 | ABC transporter ATP-binding protein | 3.40 | <0.001 |
| BTHUR0008_RS17465 | IroE protein | 2.50 | <0.001 |
| BTHUR0008_RS20850 | Iron ABC transporter ATP-binding protein | 3.70 | <0.001 |
| BTHUR0008_RS20855 | Iron ABC transporter permease | 3.24 | <0.001 |
| BTHUR0008_RS20860 | Iron-hydroxamate ABC transporter substrate-binding protein | 4.02 | <0.001 |
| BTHUR0008_RS21120 | Ferrichrome ABC transporter substrate-binding protein | 2.70 | <0.001 |
| BTHUR0008_RS21675 | Ferrichrome ABC transporter substrate-binding protein | 1.60 | <0.001 |
| BTHUR0008_RS21745 | Heme-degrading monooxygenase IsdG | 2.41 | <0.001 |
| BTHUR0008_RS21760 | ABC transporter permease | 1.40 | <0.001 |
| BTHUR0008_RS21765 | Heme ABC transporter substrate-binding protein | 2.31 | <0.001 |
| BTHUR0008_RS23110 | Iron transporter FeoA | 1.13 | <0.001 |
| BTHUR0008_RS23575 | Ferritin | −1.60 | <0.001 |
| BTHUR0008_RS25020 | Iron ABC transporter substrate-binding protein | 2.17 | <0.001 |
| BTHUR0008_RS25025 | Iron ABC transporter permease | 1.70 | <0.001 |
| BTHUR0008_RS25030 | Iron ABC transporter permease | 1.31 | <0.001 |
| BTHUR0008_RS25035 | Iron ABC transporter ATP-binding protein | 1.10 | <0.001 |
| BTHUR0008_RS25920 | Ferrichrome ABC transporter permease | 1.74 | <0.001 |
| BTHUR0008_RS25930 | Iron-dicitrate ABC transporter ATP-binding protein | 1.30 | <0.001 |
| BTHUR0008_RS25935 | Ferrichrome ABC transporter substrate-binding protein | 2.72 | <0.001 |
Elemental analysis of the two media lots and water sources.
| Lot #1091744 1520 | 0.15 ± 0.01 | 0.02 ± 0.02 |
| Lot #7220443 1060 | 0.30 ± 0.01 | 0.07 ± 0.02 |
| 1520 | 0.01 ± 0.01 | |
| 1060 | 0.01 ± 0.00 | |
Figure 3. RNA-seq data validation: correlation between RNA-seq and RT-qPCR results for differential gene expression. The log2-transformed expression ratios from RNA-seq (x-axis) and RT-qPCR (y-axis) were plotted against each other and correlation coefficient (R²) values were calculated. Seven genes were plotted for the medium effect: BTHUR0008_RS06920, BTHUR0008_RS03645, BTHUR0008_RS15085, BTHUR0008_RS17455, BTHUR0008_RS20850, BTHUR0008_RS17460 and BTHUR0008_RS19345. Six genes were plotted for the date effect in samples from medium Lot #744 (2/23/12 vs. 3/6/12): BTHUR0008_RS30620, BTHUR0008_RS19140, BTHUR0008_RS01820, BTHUR0008_RS21040, BTHUR0008_RS26070 and BTHUR0008_RS08955.
Effect of decreasing number of replicates on significantly differentially expressed genes while maintaining 100 and 25% of the reads.
| 4 (1A, 2A, 3A, 4A)/(9A, 10A, 11A, 12A) | 887 | 696 | 100% (887) | 100% (696) |
| 3 (1A, 3A, 4A)/(9A, 11A, 12A) | 885 | 689 | 83.5% (741) | 84.9% (591) |
| 2 (1A, 4A)/(9A,12A) | 720 | 501 | 68.4% (607) | 59.3% (413) |
All combinations of available replicates were tested. Results for replicates with most similar read numbers are shown.
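The percentages in the last two columns can be reproduced from the counts in the table. The sketch below assumes that each parenthesised count is the number of genes shared with the four-replicate set at the same read depth. That is an interpretation, since the column headers are not present in this record.

```python
# DE genes detected with all four replicates at each read depth (from the table).
baseline = {"100% reads": 887, "25% reads": 696}

# Parenthesised counts from the last two columns, keyed by number of replicates.
# Assumption: these are genes also detected with four replicates.
shared = {
    3: {"100% reads": 741, "25% reads": 591},
    2: {"100% reads": 607, "25% reads": 413},
}


def percent_shared(n_shared, n_baseline):
    """Share of the four-replicate DE set recovered, in percent (one decimal)."""
    return round(100.0 * n_shared / n_baseline, 1)


overlap = {
    reps: {depth: percent_shared(n, baseline[depth]) for depth, n in row.items()}
    for reps, row in shared.items()
}

for reps, row in overlap.items():
    print(reps, row)
```

Under this reading, three replicates recover roughly 84 to 85% of the four-replicate gene set at either read depth, while two replicates recover about 68% at full depth and 59% at 25% of the reads.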
Figure 4. Venn analysis of DE genes detected with varying replicate numbers. (A) Venn diagram depicting the effect of 2–4 replicates while maintaining 100% of the reads, on significantly differentially expressed genes and the genes commonly detected within sets of varying replicates for strain ATCC10792. (B) Venn diagram for differentially expressed genes detected with 25% reads (~5–10 M) and 2–4 replicates for strain ATCC10792.
Figure 5. Venn diagram depicting the effect of reducing the number of reads on DE gene numbers. The effect of decreasing read numbers on significantly differentially expressed genes and the number of genes commonly detected within sets of 100, 50, 25, 10, and 5% of reads for strain ATCC10792.
Effect of decreasing number of reads on significantly differentially expressed genes while maintaining all four replicates.
| 100 | 887 | 100% (887) |
| 75 | 843 | 95% (803) |
| 50 | 793 | 89.4% (722) |
| 25 | 696 | 78.5% (611) |
| 10 | 574 | 64.7% (487) |
| 5 | 449 | 50.6% (376) |
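The percentage column of this table appears to equal each row's DE gene count divided by the count at 100% of the reads (887). The sketch below checks that arithmetic using the counts from the table. The parenthesised counts are not used, because their definition is not stated in this record.

```python
# (percent of reads kept, DE genes detected) copied from the table above.
de_by_depth = [(100, 887), (75, 843), (50, 793), (25, 696), (10, 574), (5, 449)]

full_depth_count = de_by_depth[0][1]  # DE genes detected with 100% of the reads

# Percent of the full-depth DE gene count still detected at each read depth.
retained = {
    pct: round(100.0 * n_genes / full_depth_count, 1)
    for pct, n_genes in de_by_depth
}

for pct, value in retained.items():
    print(f"{pct:>3}% reads -> {value}% of full-depth DE gene count")
```

The decline is gradual rather than proportional: cutting the reads to a quarter still leaves about 78% of the full-depth DE gene count, and even 5% of the reads leaves about half.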