Yong Wang, On On Lee, Jiang Ke Yang, Tie Gang Li, Pei Yuan Qian.
Abstract
The Multiple Displacement Amplification (MDA) protocol is reported to introduce different artifacts into DNA samples with impurities. In this study, we report an artifactual effect of MDA with sediment DNA samples from a deep-sea brine basin in the Red Sea. In the metagenomes, we showed the presence of abundant artifactual 454 pyrosequencing reads over sizes of 50 to 220 bp. Gene fragments translocated from neighboring gene regions were identified in these reads. Occasionally, the translocation occurred between the gene fragments from different species. Reads containing these gene fragments could form a strong stem-loop structure. More than 60% of the artifactual reads could fit the structural models. MDA amplification is probably responsible for the massive generation of the artifactual reads with the secondary structure in the metagenomes. Possible sources of the translocations and structures are discussed.
Keywords: Artifactual 454 reads; Gene fragments; MDA; Metagenome
Year: 2013 PMID: 23646288 PMCID: PMC3642703 DOI: 10.7717/peerj.69
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Figure 1. MDA protocol and flow chart of experiment. (A) The normal MDA reaction on the DNA template; (B) The plasmid is amplified by MDA; (C–D) Two DNA fragments with a complex secondary structure to be amplified by MDA in an unknown manner.
Figure 2. Length range of the reads for five sediment samples. The control was the metagenome from the overlying Atlantis II brine water.
Layer-specific overabundance of short reads for some genes. The KEGG genes in the table were overrepresented among short reads (<220 bp); the number of such reads is shown for each gene, together with the total number of reads for that gene and the percentage that were shorter than 220 bp. Average lengths and dG values are given with the standard deviation (±) in parentheses.
| Sample | KEGG id | No. <220 bp | Total | % | Clusters | Average length (SD) | Average dG (SD) |
|---|---|---|---|---|---|---|---|
| Sed12 | K04567 | 161 | 294 | 55 | 237 | 162(33) | −28.5(8.3) |
| Sed63 | K06988 | 2208 | 2435 | 91 | 648 | 162(35) | −42.7(14.3) |
| | K07115 | 2056 | 2076 | 99 | 493 | 120(29) | −28.0(10.3) |
| | K00859 | 140 | 192 | 73 | 63 | 161(41) | −40.0(14.1) |
| | K02652 | 68 | 90 | 76 | 69 | 147(37) | −38.3(13.5) |
| | K01061 | 60 | 79 | 76 | 63 | 147(37) | −31.7(10.9) |
| Sed105 | K01440 | 2511 | 2528 | 99 | 462 | 109(23) | −28.6(13.1) |
| | K01409 | 566 | 894 | 63 | 493 | 159(37) | −38.5(12.9) |
| | K09800 | 405 | 461 | 88 | 277 | 149(37) | −32.4(13.4) |
| | K01207 | 124 | 221 | 56 | 188 | 143(42) | −24.8(9.2) |
| | K07788 | 68 | 100 | 68 | 85 | 163(40) | −35.4(11.7) |
| Sed183 | K00257 | 662 | 836 | 79 | 442 | 104(27) | −26.3(8.7) |
| | K09705 | 289 | 301 | 96 | 135 | 115(21) | −35.5(10.7) |
| | K00162 | 177 | 190 | 93 | 123 | 112(26) | −28.0(7.8) |
| | K07506 | 121 | 155 | 78 | 109 | 143(27) | −31.1(6.0) |
| Sed222 | K01627 | 6068 | 6180 | 98 | 1189 | 143(38) | −34.4(116.7) |
| | K01589 | 61 | 113 | 54 | 108 | 105(31) | −25.6(11.4) |
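The percentage column in the table is the share of a gene's reads that fall below the 220 bp cutoff. A minimal Python sketch of that arithmetic follows; the function names `percent_short` and `short_read_fraction` are hypothetical and not taken from the study, and rounding to the nearest integer is an assumption that happens to match the tabulated values.

```python
def percent_short(n_short, n_total):
    """Percentage of reads below the cutoff, rounded to the nearest integer."""
    if n_total == 0:
        return 0
    return round(100 * n_short / n_total)


def short_read_fraction(lengths, cutoff=220):
    """Count reads shorter than `cutoff` bp in a list of read lengths.

    Returns (n_short, n_total, percent). The 220 bp default mirrors the
    threshold used in the table; the input list itself is hypothetical.
    """
    n_total = len(lengths)
    n_short = sum(1 for n in lengths if n < cutoff)
    return n_short, n_total, percent_short(n_short, n_total)


# Reproduce two rows of the table from their counts.
print(percent_short(161, 294))    # Sed12, K04567
print(percent_short(6068, 6180))  # Sed222, K01627
```

Applied to per-gene lists of read lengths, the same function would flag genes whose reads are dominated by short fragments.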
Figure 3. Length distribution of the reads for the genes with abundant short reads. Alignment positions of the reads on proteins were based on BLASTX results. The numbers in parentheses following the sample names are those of the short reads (<220 bp) and total reads.
Figure 4. Secondary structure of three representative reads for the K06988 gene. Protein positions of the K06988 gene are marked on the reads. Read A is 152 nt long, and nt 8–151 of this read aligned to aa 117–164 of the K06988 protein. Read B is 210 nt long; nt 2–85 of this read aligned to aa 159–186 of the protein, and nt 84–206 aligned to aa 117–157. Read C is 179 nt long, and nt 10–177 of this read aligned to aa 131–186. The protein positions are indicated by arrows on the reads.
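The coordinates in this caption can be checked with simple arithmetic: each aligned nucleotide span should be three times the length of its protein span, and a read whose segments appear in a different order than on the protein is a candidate for the translocation described in the abstract. The sketch below is an illustration only; the `Segment` type and both helper functions are hypothetical, and it assumes all segments are reported on the same strand, which the caption does not state.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    nt_start: int  # first aligned nucleotide on the read (1-based)
    nt_end: int    # last aligned nucleotide on the read
    aa_start: int  # first aligned residue on the protein
    aa_end: int    # last aligned residue on the protein


def lengths_consistent(seg: Segment) -> bool:
    """True if the nucleotide span is exactly three times the protein span."""
    nt_len = seg.nt_end - seg.nt_start + 1
    aa_len = seg.aa_end - seg.aa_start + 1
    return nt_len == 3 * aa_len


def is_rearranged(segments: List[Segment]) -> bool:
    """True if segments ordered along the read are out of order on the protein.

    Assumes every segment lies on the same strand (an assumption).
    """
    ordered = sorted(segments, key=lambda s: s.nt_start)
    return any(a.aa_start > b.aa_start for a, b in zip(ordered, ordered[1:]))


# Coordinates taken from the Figure 4 caption.
read_a = [Segment(8, 151, 117, 164)]
read_b = [Segment(2, 85, 159, 186), Segment(84, 206, 117, 157)]
read_c = [Segment(10, 177, 131, 186)]

for name, read in (("A", read_a), ("B", read_b), ("C", read_c)):
    print(name, all(lengths_consistent(s) for s in read), is_rearranged(read))
```

Under these assumptions only read B is flagged: its first segment maps to a later part of the protein than its second segment.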
Figure 5. dG values of randomly-trimmed short reads from the metagenomes and those for selected KEGG genes. Gene names are shown beside the symbol of the sample in which the average free energy was calculated for their reads. Symbols for the randomly-trimmed short reads in sizes of about 100, 120, 140 and 160 aa have no gene name beside them and are circled.
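The randomly-trimmed reads serve as a length-matched baseline for the free-energy comparison. The record does not describe how the trimming was done, so the following Python sketch is only one plausible approach: it cuts a window of a target size at a random offset from a longer read. The function name `random_trim`, the fixed seed, and the treatment of the target size as a count of sequence characters are all assumptions; computing dG for the trimmed reads would require a separate folding tool and is not shown.

```python
import random


def random_trim(seq, target_len, rng):
    """Return a random contiguous window of `target_len` characters from `seq`.

    Sequences no longer than `target_len` are returned unchanged.
    """
    if len(seq) <= target_len:
        return seq
    start = rng.randrange(len(seq) - target_len + 1)
    return seq[start:start + target_len]


# Hypothetical example: trim one synthetic 400-character read to four sizes.
rng = random.Random(0)  # fixed seed so the baseline is reproducible
read = "ACGT" * 100
baseline = {n: random_trim(read, n, rng) for n in (100, 120, 140, 160)}
print({n: len(s) for n, s in baseline.items()})
```

Drawing the window offset at random avoids systematically sampling read starts or ends, which could differ in composition from read interiors.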