Colin D Veal, Hang Xu, Katherine Reekie, Robert Free, Robert J Hardwick, David McVey, Anthony J Brookes, Edward J Hollox, Christopher J Talbot.
Abstract
MOTIVATION: Genomic copy number variation (CNV) can influence susceptibility to common diseases. High-throughput measurement of gene copy number on large numbers of samples is a challenging, yet critical, stage in confirming observations from sequencing or array Comparative Genome Hybridization (CGH). The paralogue ratio test (PRT) is a simple, cost-effective method of accurately determining copy number by quantifying the amplification ratio between a target and reference amplicon. PRT has been successfully applied to several studies analyzing common CNV. However, its use has not been widespread because of difficulties in assay design.
Year: 2013 PMID: 23742985 PMCID: PMC3722521 DOI: 10.1093/bioinformatics/btt330
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Overview of PRTPrimer. The target region on chromosome 3 for which PRTs are required is shown in black. The software first splits the region into overlapping segments to ensure an even distribution of PRTs. A large number of amplicons are designed for each of these segments irrespective of the type of sequence they lie in (segmental duplication, SINE and so forth). Each amplicon is then aligned to the human genome, allowing for mismatches with the primers. Only amplicons that have exact priming matches twice in the genome are selected for filtering. The final stage filters the results to only those amplicons that meet adjustable criteria, such as size difference between target and reference, no SNPs at primer positions and CNVs spanning the reference amplicon. In this example, the target amplicon is 100 bp and the reference amplicon on a different chromosome is 120 bp.
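The segmentation and selection steps described in the Fig. 1 caption can be illustrated with a minimal Python sketch. It is not PRTPrimer's code: the function names, the `Hit` record, the window size and step, and the `chr7` coordinate used in the usage note are all hypothetical, and primer design and genome alignment are assumed to have been done elsewhere.

```python
from dataclasses import dataclass


def overlapping_segments(start, end, size, step):
    """Split [start, end) into windows of `size` bp whose starts are `step` bp apart.

    Windows overlap whenever step < size. Both values are illustrative
    assumptions, not PRTPrimer defaults.
    """
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    segments = []
    pos = start
    while pos < end:
        segments.append((pos, min(pos + size, end)))
        if pos + size >= end:
            break  # the last window already reaches the end of the region
        pos += step
    return segments


@dataclass
class Hit:
    """One exact genomic match of an amplicon (hypothetical record type)."""
    chrom: str
    start: int
    end: int

    @property
    def size(self):
        return self.end - self.start


def select_prt(target, exact_hits, min_size_diff=1):
    """Return the reference hit if the amplicon is a PRT candidate, else None.

    An amplicon qualifies when its primers match exactly twice in the genome
    (the target plus one paralogue) and the two products differ in length by
    at least `min_size_diff` bp.
    """
    if len(exact_hits) != 2 or target not in exact_hits:
        return None
    reference = exact_hits[0] if exact_hits[1] == target else exact_hits[1]
    if abs(reference.size - target.size) < min_size_diff:
        return None
    return reference
```

For the example in the caption, a 100 bp target `Hit("chr3", 1000, 1100)` paired with a 120 bp paralogue on another chromosome, say `Hit("chr7", 5000, 5120)`, would pass with the default threshold because the products differ by 20 bp. The SNP and CNV filters mentioned in the caption are omitted here.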
Fig. 2. The output of the alignment procedure is stored in a text file [BED format (genome.ucsc.edu/FAQ/FAQformat.html)] with the PRT ID, genomic location and alignment score. Because these files can be very large, SQLite databases are used to access them efficiently without prohibitive memory requirements. The original output from Primer3 is stored in a single database, and the BED files are processed to store at most the first five exact genomic matches for each amplicon. In addition, the total number of exact matches and the number of matches with an alignment score >900 (indicating up to four mismatches) are counted for each amplicon. The characteristics of amplicons that match just twice in the genome and their corresponding Primer3 data can be rapidly extracted from these databases. The results can then be filtered for various output settings and checked for existing CNVs, indels and SNPs. The final output for a single PRT is shown at the bottom: ID = unique ID for PRT; Chr, Start, End = genomic location; Size = amplicon length in bp; Misprime = potential number of genomic locations that may be amplified with a small number of mismatches in the primers; DGV = whether the amplicon coincides with any reported CNV or indel; Forward, Reverse = amplicon primers; SizeDiff = length difference between target and reference amplicons; FSNP, RSNP = number of SNPs within the primers.
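The storage and counting step in the Fig. 2 caption could look roughly like the following Python sketch using the standard `sqlite3` module. The table layout, column names and function names are hypothetical, and the score of 1000 for an exact match is an assumption (the caption only states that scores above 900 indicate up to four mismatches). The cap of five stored exact matches per amplicon is not implemented.

```python
import sqlite3

EXACT_SCORE = 1000  # assumption: score given to an exact match
NEAR_SCORE = 900    # from the caption: scores above 900 mean up to four mismatches


def build_db(bed_lines):
    """Load BED-like lines (chrom, start, end, name, score) into an in-memory table."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE hits ("
        "prt_id TEXT, chrom TEXT, start INTEGER, end_pos INTEGER, score INTEGER)"
    )
    rows = []
    for line in bed_lines:
        chrom, start, end, prt_id, score = line.rstrip("\n").split("\t")[:5]
        rows.append((prt_id, chrom, int(start), int(end), int(score)))
    con.executemany("INSERT INTO hits VALUES (?, ?, ?, ?, ?)", rows)
    # An index on the PRT ID keeps per-amplicon lookups fast on large files.
    con.execute("CREATE INDEX idx_hits_prt ON hits (prt_id)")
    return con


def candidate_prts(con):
    """Return (prt_id, exact_count, near_count) for amplicons with two exact matches.

    near_count counts every hit scoring above NEAR_SCORE, so it includes
    the exact matches as well.
    """
    query = """
        SELECT prt_id,
               SUM(score = ?) AS exact_count,
               SUM(score > ?) AS near_count
        FROM hits
        GROUP BY prt_id
        HAVING SUM(score = ?) = 2
        ORDER BY prt_id
    """
    return con.execute(query, (EXACT_SCORE, NEAR_SCORE, EXACT_SCORE)).fetchall()
```

Keeping the hits in SQLite rather than in Python lists means the grouping and counting run inside the database engine, which is the memory argument the caption makes. Passing a file path instead of `":memory:"` to `sqlite3.connect` would persist the table on disk.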
Fig. 3. A diagram from the UCSC Genome Browser of PRT assays designed for the SOD2 gene region in two runs of PRTPrimer with different parameters, aligned against RepeatMasker and self-chain output for the interval. For Track A, primers were allowed to be designed in SINEs and other genome regions. For Track B, SINEs were excluded from the design process. Track B resulted in PRT assays that are more likely to succeed in the laboratory, whereas Track A has a higher density of coverage.
Counts of PRTPrimer designed assays for different criteria in two example chromosomes: chr13, which is gene-poor, and chr19, which is gene-rich
| PRT Criteria | Chromosome 19 | Chromosome 13 |
|---|---|---|
| Sequence difference | 1 729 709 | 2 060 532 |
| Paralogue <500 kb from target | 1 051 948 | 391 476 |
| Paralogue >500 kb from target | 677 761 | 1 669 056 |
| Paralogue >500 kb from target and size difference between amplicons >0 bp | 359 020 | 863 971 |
| Paralogue >500 kb from target and size difference between amplicons >2 bp | 204 030 | 502 948 |
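
The counts in the table can be cross-checked with a few lines of Python. In both chromosomes the first row equals the sum of the "<500 kb" and ">500 kb" rows, which suggests (as a reading of the table, not a statement from the authors) that it is the total over both distance classes. The dictionary keys below are hypothetical labels for the table rows.

```python
# Counts copied from the table; key names are illustrative labels only.
counts = {
    "chr19": {
        "total": 1_729_709,
        "near": 1_051_948,        # paralogue <500 kb from target
        "far": 677_761,           # paralogue >500 kb from target
        "far_diff_gt0": 359_020,  # >500 kb and size difference >0 bp
        "far_diff_gt2": 204_030,  # >500 kb and size difference >2 bp
    },
    "chr13": {
        "total": 2_060_532,
        "near": 391_476,
        "far": 1_669_056,
        "far_diff_gt0": 863_971,
        "far_diff_gt2": 502_948,
    },
}


def is_partition(row):
    """True if the two distance classes add up to the first-row count."""
    return row["near"] + row["far"] == row["total"]


def fraction_far(row):
    """Share of assays whose paralogue lies more than 500 kb from the target."""
    return row["far"] / row["total"]


if __name__ == "__main__":
    for chrom, row in counts.items():
        print(chrom, is_partition(row), f"{fraction_far(row):.2%}")
```

By this calculation, distant paralogues account for about 39% of candidate assays on chromosome 19 and about 81% on chromosome 13.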