Anthony O Olarerin-George, John B Hogenesch.
Abstract
Mycoplasmas are notorious contaminants of cell culture and can have profound effects on host cell biology by depriving cells of nutrients and inducing global changes in gene expression. Over the last two decades, sentinel testing has revealed wide-ranging contamination rates in mammalian culture. To obtain an unbiased assessment from hundreds of labs, we analyzed sequence data from 9395 rodent and primate samples from 884 series in the NCBI Sequence Read Archive. We found 11% of these series were contaminated (defined as ≥100 reads/million mapping to mycoplasma in one or more samples). Ninety percent of mycoplasma-mapped reads aligned to ribosomal RNA. This was unexpected given that 37% of contaminated series used poly(A)-selection for mRNA enrichment. Lastly, we examined the relationship between mycoplasma contamination and host gene expression in a single-cell RNA-seq dataset and found 61 host genes (P < 0.001) were significantly associated with mycoplasma-mapped read counts. In all, this study suggests mycoplasma contamination is still prevalent today and poses substantial risk to research quality.
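The contamination criterion in the abstract (a series counts as contaminated when one or more of its samples has ≥100 mycoplasma-mapped reads per million) can be illustrated with a minimal sketch. The function names, the tuple layout of `samples`, and the choice of total reads per sample as the denominator are assumptions made for illustration; this is not the authors' pipeline.

```python
from collections import defaultdict

# Threshold stated in the abstract: >=100 mycoplasma-mapped reads per million.
RPM_CUTOFF = 100.0


def reads_per_million(myco_reads, total_reads):
    """Scale a mycoplasma-mapped read count to reads per million (RPM).

    Assumption: the denominator is the total number of reads in the sample.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    # Multiply before dividing so exact cases (e.g. 100 in 1e6) stay exact.
    return myco_reads * 1_000_000 / total_reads


def contaminated_series(samples, cutoff=RPM_CUTOFF):
    """Flag each series whose highest per-sample RPM reaches the cutoff.

    `samples` is a hypothetical iterable of
    (series_id, myco_reads, total_reads) tuples.
    """
    max_rpm = defaultdict(float)
    for series_id, myco_reads, total_reads in samples:
        rpm = reads_per_million(myco_reads, total_reads)
        max_rpm[series_id] = max(max_rpm[series_id], rpm)
    return {sid: rpm >= cutoff for sid, rpm in max_rpm.items()}
```

Taking the maximum per series mirrors the "one or more samples" wording: a single sample at or above the cutoff is enough to flag the whole series.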
Year: 2015 PMID: 25712092 PMCID: PMC4357728 DOI: 10.1093/nar/gkv136
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Gene breakdown of mycoplasma-mapped reads. RNA-seq reads were aligned to four mycoplasma genomes using Bowtie. Non-specific reads were filtered with BLAST. Of the resulting 472 219 mycoplasma-mapped reads, 90% mapped to mycoplasma ribosomal RNA.
Figure 2. Mycoplasma contamination in cultured versus non-cultured samples and series. (A) Fraction of series that are contaminated (containing cultured samples or not) at various cutoffs of mycoplasma-mapped reads per million (column graphs; primary y-axis). Red stars indicate the P-values of the respective comparisons (secondary y-axis; Fisher's exact test). Fraction of contaminated (B) cultured or (C) non-cultured samples for various cutoffs of mycoplasma-mapped reads per million, broken down by the indicated mycoplasma species.
Figure 3. Association between host gene expression and mycoplasma-mapped reads from single-cell RNA-seq. (A) Scatter plots of mycoplasma-mapped reads per million and host cell gene expression in single-cell (DG-75) RNA-seq. Gene symbols are indicated in the upper right corner of the respective plots. P-values from the association test are indicated in the bottom left. (B) To assess the likelihood of obtaining significant genes by chance, the mycoplasma counts were permuted 1000 times. The analysis was repeated with each permutation. The expected number of significant genes is plotted in red. The observed number of significant genes is in black. Error bars are standard deviations.
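The permutation procedure described for Figure 3B can be sketched as follows. The caption does not name the association test, so this sketch substitutes a stand-in: a gene is called "significant" when the absolute Pearson correlation between its expression and the mycoplasma values reaches a user-chosen `r_cutoff`. That criterion, the function names, and the dictionary layout of `expression` are all assumptions for illustration.

```python
import random
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def count_significant(myco_rpm, expression, r_cutoff):
    """Count genes passing the stand-in criterion |r| >= r_cutoff."""
    return sum(1 for values in expression.values()
               if abs(pearson_r(myco_rpm, values)) >= r_cutoff)


def permutation_null(myco_rpm, expression, r_cutoff, n_perm=1000, seed=0):
    """Return (observed, expected, sd) counts of significant genes.

    `expression` is a hypothetical dict mapping gene symbol to a list of
    per-cell expression values, in the same cell order as `myco_rpm`.
    """
    rng = random.Random(seed)
    observed = count_significant(myco_rpm, expression, r_cutoff)
    shuffled = list(myco_rpm)
    null_counts = []
    for _ in range(n_perm):
        # Shuffling the mycoplasma values across cells breaks any real
        # link to host expression while keeping both distributions intact.
        rng.shuffle(shuffled)
        null_counts.append(count_significant(shuffled, expression, r_cutoff))
    return observed, mean(null_counts), stdev(null_counts)
```

The mean of the permuted counts corresponds to the expected number of significant genes, and the standard deviation corresponds to the error bars; an observed count well above that range suggests the associations are unlikely to arise by chance.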
Publication status of some of the most contaminated series
| GEO series ID | Mycoplasma-mapped RPM | Field of study | Journal | Year of publication | Citations |
|---|---|---|---|---|---|
| GSE25183 | 144 281 | Prostate cancer | Nat Biotechnol | 2011 | 271 |
| GSE30772 | 96 083 | Mitochondria biology | Cell | 2011 | 137 |
| GSE45982 | 84 759 | B-cell cancer | Cancer Cell | 2013 | 74 |
| GSE40948 | 66 905 | Embryonic stem cells, chromatin structure | Nat Struct Mol Biol | 2012 | 55 |
| GSE27823 | 51 752 | Enhancers, prostate cancer | Nature | 2011 | 269 |
| GSE49321 | 36 092 | Single cell RNA-seq method | Nature Methods | 2013 | 47 |
| GSE48159 | 34 902 | Estrogen receptor | N/A | N/A | N/A |
| GSE50429 | 34 179 | miRNAs, breast cancer | N/A | N/A | N/A |
| GSE24447 | 14 510 | Enhancers, developmental biology | Nature | 2011 | 580 |
| GSE45202 | 13 568 | Androgen signaling | Genes Dev | 2013 | 18 |
| GSE36695 | 12 312 | Stem cell biology | Stem Cells Transl Med | 2013 | 2 |
| GSE37003 | 10 200 | RNA modification | Nature | 2012 | 150 |
| GSE48514 | 10 095 | B-cell cancer | Proc Natl Acad Sci | 2012 | 4 |
| GSE43167 | 9234 | microRNA processing | Cell | 2013 | 40 |
| GSE16579 | 8641 | 3′UTR, cancer biology | Cell | 2009 | 560 |
| GSE40778 | 8629 | Exon junction complex | Nat Struct Mol Biol | 2012 | 35 |
| GSE15780 | 8291 | Cell survival and apoptosis | Nucleic Acids Res | 2011 | 12 |
| GSE25450 | 7180 | RNA polyadenylation | RNA | 2011 | 115 |
| GSE32340 | 6573 | Viral oncogenes, epigenomics | Genome Research | 2012 | 15 |
| GSE41292 | 6035 | microRNA processing | Cell | 2012 | 45 |
We looked up the publication information for the indicated series including the journal name, year of publication and the number of citations (according to Google Scholar). N/A denotes the study is not published. Field of study was obtained from analyzing the GEO descriptions and/or paper abstracts.