Zhi-Yong Tao, Xu Sui, Cao Jun, Richard Culleton, Qiang Fang, Hui Xia, Qi Gao.
Abstract
We found a 47 aa protein sequence that occurs 17 times in the Plasmodium vivax nucleotide database published on PlasmoDB. Analysis of the coding sequence showed multiple restriction enzyme sites within the 141 bp nucleotide sequence and a His6 tag attached to the 3' end, suggesting that the sequence originated from a cloning vector. Sequences with vector contamination were submitted to NCBI, and BLASTN was used to cross-examine whole-genome shotgun (WGS) contigs from four recently deposited P. vivax whole genome sequencing projects. At least 26 genes listed in the PlasmoDB database incorporate this cloning vector sequence into their predicted provisional protein products.
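The two signals named in the abstract, a cluster of restriction enzyme sites and an in-frame His6 tag, can be screened for with a few lines of Python. The following is only a minimal sketch under stated assumptions, not the authors' pipeline: the enzyme panel, the `min_sites` threshold, the function names, and the toy sequence are all hypothetical, and the actual 141 bp sequence is not reproduced here.

```python
import re

# A small, illustrative panel of common restriction enzyme recognition sites.
# The choice of enzymes is an assumption made for this sketch.
RESTRICTION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "XhoI": "CTCGAG",
    "SalI": "GTCGAC",
    "NotI": "GCGGCCGC",
}

# Six consecutive histidine codons (CAT or CAC) encode a His6 tag.
# The lookahead lets finditer report overlapping start positions.
HIS6_LOOKAHEAD = re.compile(r"(?=(?:CA[TC]){6})")


def find_restriction_sites(seq):
    """Return {enzyme: [0-based start positions]} for sites present in seq."""
    seq = seq.upper()
    hits = {}
    for name, site in RESTRICTION_SITES.items():
        positions = [m.start() for m in re.finditer(f"(?={site})", seq)]
        if positions:
            hits[name] = positions
    return hits


def has_in_frame_his6(cds):
    """True if the CDS contains six consecutive His codons in reading frame 0."""
    return any(m.start() % 3 == 0 for m in HIS6_LOOKAHEAD.finditer(cds.upper()))


def looks_vector_derived(cds, min_sites=3):
    """Heuristic flag: in-frame His6 plus several distinct restriction sites.

    min_sites is a hypothetical threshold, not a value from the paper.
    """
    return has_in_frame_his6(cds) and len(find_restriction_sites(cds)) >= min_sites


# Toy sequence built for illustration only (not real P. vivax data).
toy_cds = (
    "ATGGCT"                # start codon + Ala
    "GGATCCGAATTC"          # BamHI, EcoRI
    "AAGCTTCTCGAG"          # HindIII, XhoI
    "CACCACCACCACCACCAC"    # His6
    "TGA"                   # stop codon
)

print(find_restriction_sites(toy_cds))
print(looks_vector_derived(toy_cds))
```

A heuristic like this only raises a flag; confirming the vector of origin still calls for a similarity search such as the VecScreen and BLASTN searches described in the text.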
Year: 2015 PMID: 26062606 PMCID: PMC4464627 DOI: 10.1186/s13071-015-0927-x
Source DB: PubMed Journal: Parasit Vectors ISSN: 1756-3305 Impact factor: 3.876
Fig. 1. Cloning vector source sequence contamination in PlasmoDB. a: A 141 bp vector source sequence with a His6 tag occurred repeatedly in the Plasmodium vivax nucleotide database. b: Dozens of restriction enzyme sites are present in the sequence. c: A VecScreen search showed that the contaminating sequence strongly matches pMQ354. d: Typical errors in Sal-1 strain sequencing results due to the contaminating sequence. The missing ends are marked in yellow, and contaminating vector sequences are underlined.
Correction of 26 genes affected by a contaminated cloning vector sequence in PlasmoDB
| ID | PlasmoDB ID | GenBank accession number | Length before correction (bp) | Length after correction (bp) |
|---|---|---|---|---|
| 1 | PVX_253300 | XM_001612328 | 1,086 | 945 |
| 2 | PVX_250300 | XM_001612323 | 1,047 | 906 |
| 3 | PVX_211290a | XM_001612311 | 945 | 807 |
| 4 | PVX_226290a | XM_001612298 | 792 | 741 |
| 5 | PVX_214290a | XM_001612318 | 861 | 792 |
| 6 | PVX_215290a | XM_001612317 | 861 | 793 |
| 7 | PVX_220290 | XM_001612333 | 654 | 513 |
| 8 | PVX_252300 | XM_001612332 | 1,149 | 1,008 |
| 9 | PVX_222290b | XM_001612349 | 1,233 | 1,098 |
| 10 | PVX_196290b | XM_001612337 | 1,173 | 1,101 |
| 11 | PVX_195290 | XM_001612373 | 1,893 | 1,752 |
| 12 | PVX_231290 | XM_001612334 | 639 | 498 |
| 13 | PVX_213290 | XM_001612274 | 513 | 441 |
| 14 | PVX_249300 | XM_001612331 | 1,113 | 972 |
| 15 | PVX_227290 | XM_001612370 | 1,902 | 1,761 |
| 16 | PVX_240290c | XM_001612308 | 942 | 801 |
| 17 | PVX_235290c | XM_001612320 | 717 | 576 |
| 18 | PVX_254300 | XM_001612327 | 1,062 | 921 |
| 19 | PVX_200290d | XM_001612305 | 921 | 876 |
| 20 | PVX_201290d | XM_001612303 | 828 | 780 |
| 21 | PVX_206290d | XM_001612319 | 924 | 876 |
| 22 | PVX_208290d | XM_001612329 | 1,017 | 876 |
| 23 | PVX_216290e | XM_001612279 | 711 | 570 |
| 24 | PVX_218290e | XM_001612281 | 711 | 570 |
| 25 | PVX_237290e | XM_001612314 | 711 | 570 |
| 26 | PVX_217290e | XM_001612282 | 621 | 570 |
a, b, c, d, e: Entries marked with the same letter represent duplicated sequences.
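As a quick arithmetic check on the table, the number of base pairs removed from each gene can be tallied and compared with the 141 bp vector-derived sequence. The sketch below copies the lengths from the table and keys them by GenBank accession number; the variable names are illustrative.

```python
from collections import Counter

# (GenBank accession, length before correction, length after correction),
# copied from the table above.
ROWS = [
    ("XM_001612328", 1086, 945), ("XM_001612323", 1047, 906),
    ("XM_001612311", 945, 807), ("XM_001612298", 792, 741),
    ("XM_001612318", 861, 792), ("XM_001612317", 861, 793),
    ("XM_001612333", 654, 513), ("XM_001612332", 1149, 1008),
    ("XM_001612349", 1233, 1098), ("XM_001612337", 1173, 1101),
    ("XM_001612373", 1893, 1752), ("XM_001612334", 639, 498),
    ("XM_001612274", 513, 441), ("XM_001612331", 1113, 972),
    ("XM_001612370", 1902, 1761), ("XM_001612308", 942, 801),
    ("XM_001612320", 717, 576), ("XM_001612327", 1062, 921),
    ("XM_001612305", 921, 876), ("XM_001612303", 828, 780),
    ("XM_001612319", 924, 876), ("XM_001612329", 1017, 876),
    ("XM_001612279", 711, 570), ("XM_001612281", 711, 570),
    ("XM_001612314", 711, 570), ("XM_001612282", 621, 570),
]

# Base pairs removed from each gene by the correction.
removed = {acc: before - after for acc, before, after in ROWS}

# How many genes lost each distinct number of base pairs.
removed_counts = Counter(removed.values())

for bp, n_genes in sorted(removed_counts.items(), reverse=True):
    print(f"{bp:>4} bp removed: {n_genes} gene(s)")
```

By this tally, 15 of the 26 entries shrink by exactly 141 bp, the full length of the contaminating sequence; the remaining entries lose between 45 and 138 bp.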