Paul Simion, Khalid Belkhir, Clémentine François, Julien Veyssier, Jochen C Rink, Michaël Manuel, Hervé Philippe, Maximilian J Telford.
Abstract
BACKGROUND: Multiple RNA samples are frequently processed together and often mixed before multiplex sequencing in the same sequencing run. While different samples can be separated post sequencing using sample barcodes, the possibility of cross contamination between biological samples from different species that have been processed or sequenced in parallel has the potential to be extremely deleterious for downstream analyses.
Keywords: Contamination; Ctenophora; NGS; Phylogenomics
Year: 2018 PMID: 29506533 PMCID: PMC5838952 DOI: 10.1186/s12915-018-0486-7
Source DB: PubMed Journal: BMC Biol ISSN: 1741-7007 Impact factor: 7.431
Fig. 1 Pervasive cross contaminations observed in recent transcriptomic datasets from six different labs. For each transcriptome, three columns indicate the percentage of transcripts categorised as low coverage (grey bars), dubious (orange bars) and cross contamination (red bars) as detected by CroCo (using default parameters). For the content of each dataset, see Additional file 1: Table S1; references [16, 22, 34, 41]
Fig. 2 Dramatic effect of cross contaminations on reconstructing ctenophore relationships using a phylogenomic dataset. a Cross contamination network for dataset A as reconstructed by CroCo. Nodes and links represent transcriptomes and cross contaminations, respectively. Only transcripts strictly categorised as cross contaminants are taken into account here. Node sizes are proportional to the number of times the node is the source of cross contamination, and node colours represent the percentage of contaminated transcripts in the transcriptome. For clarity, weak links, defined as less than 2% of the strongest link in the network, are not shown. b, c Ctenophore phylogenetic relationships reconstructed with 114 genes using (b) untreated transcriptomes and (c) transcriptomes cleaned using CroCo (see details in Methods). In (b) the placement of lineages highlighted in orange disrupts the monophyly of the clade ‘Lobata’ (here represented by Mnemiopsis leidyi and Bolinopsis infundibulum). With the cleaned dataset (c), the same lineages, in blue, are placed in agreement with recent studies of ctenophore relationships [16, 31–33]. The two dotted arrows and their corresponding numbers indicate two major cross contamination events that can be observed on the cross contamination network (a)
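The link filtering and node sizing described for the network in Fig. 2a can be illustrated with a minimal Python sketch. This is not CroCo's code: the function names, the dictionary layout (a `(source, target)` pair mapped to a count of cross-contaminant transcripts) and the species labels are all hypothetical, and only the 2% threshold comes from the caption. Whether CroCo sizes nodes before or after hiding weak links is not stated, so the sketch computes source totals over all links.

```python
def filter_weak_links(links, min_fraction=0.02):
    """Hide links weaker than min_fraction of the strongest link.

    links: dict mapping (source, target) -> number of transcripts
    categorised as cross contaminants (hypothetical layout).
    """
    if not links:
        return {}
    strongest = max(links.values())
    cutoff = min_fraction * strongest
    # Keep only links at or above 2% (by default) of the strongest link.
    return {pair: weight for pair, weight in links.items() if weight >= cutoff}


def source_counts(links):
    """Total outgoing contamination per source, usable as a node size."""
    counts = {}
    for (source, _target), weight in links.items():
        counts[source] = counts.get(source, 0) + weight
    return counts


# Made-up illustrative counts, not data from the paper.
example_links = {
    ("sp1", "sp2"): 1000,
    ("sp2", "sp3"): 15,   # below 2% of 1000, so hidden in the drawing
    ("sp3", "sp1"): 300,
}
visible_links = filter_weak_links(example_links)
node_sizes = source_counts(example_links)
```

With these made-up counts, the `sp2` to `sp3` link falls under the cutoff of 20 and is dropped from the drawing, while `sp1` remains the largest node because it is the most frequent source.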