Abstract
Palindromati, the massive host-edited synthetic palindromic contamination found in GenBank, is illustrated and exemplified. Millions of contaminated sequences containing portions, or tandems of portions, derived from the ZAP adaptor or related linkers are shown (1) by the 12-bp sequence reported elsewhere, exon Xb, 5' CCCGAATTCGGG 3', (2) by a related 22-bp sequence, 5' CTCGTGCCGAATTCGGCACGAG 3', and (3) by a longer related 44-bp sequence, 5' CTCGTGCCGAATTCGGCACGAGCTCGTGCCGAATTCGGCACGAG 3'. Possible reasons why those long contaminating sequences persist in the databases are presented here: (1) the recognition site on the plus strand (+) is single-strand self-annealed; (2) the recognition site on the minus strand (-) is not only single-strand self-annealed but also located far away from the single-strand self-annealed plus strand, making it impossible for the active EcoRI enzyme dimer to form and cut its target sequence, 5' G/AATTC 3'. As a possible solution, it is suggested to rely on at least two or three independent results, such as sequences obtained by independent laboratories, preferably using independent sequencing methodologies. This information may help to develop bioinformatics tools capable of detecting and removing these contaminants, and to infer why some damaged sequences that cause genetic diseases escape detection by the molecular quality control mechanisms of cells and organisms and are thus passed unchecked through the generations.
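The kind of detection tool the abstract calls for could start from a plain substring scan. The following is a minimal sketch, not the authors' method: it assumes exact, uppercase-normalised matching of the three sequences quoted above, and the names `MOTIFS`, `reverse_complement`, `is_palindromic`, and `find_motifs` are hypothetical helpers introduced only for illustration.

```python
# Contaminant sequences exactly as quoted in the abstract (5' -> 3').
# The 44-bp sequence is the 22-bp sequence repeated in tandem.
MOTIFS = {
    "12-bp": "CCCGAATTCGGG",
    "22-bp": "CTCGTGCCGAATTCGGCACGAG",
    "44-bp": "CTCGTGCCGAATTCGGCACGAG" * 2,
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_palindromic(seq):
    """True if the sequence equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


def find_motifs(seq):
    """Map each motif name to the 0-based start positions of its exact hits.

    Overlapping hits are reported; motifs with no hit are omitted.
    """
    seq = seq.upper()
    hits = {}
    for name, motif in MOTIFS.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            start = seq.find(motif, start + 1)
        if positions:
            hits[name] = positions
    return hits
```

Because each motif is its own reverse complement, scanning one strand is enough for exact matches. A practical screen would probably also need to tolerate partial or mutated copies, which this sketch does not attempt.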
Year: 2011 PMID: 21895510 PMCID: PMC3272245 DOI: 10.1089/dna.2011.1339
Source DB: PubMed Journal: DNA Cell Biol ISSN: 1044-5498 Impact factor: 3.311