Bart J G Broeckx, Frank Coopman, Geert E C Verhoeven, Valérie Bavegems, Sarah De Keulenaer, Ellen De Meester, Filip Van Niewerburgh, Dieter Deforce.
Abstract
Whole exome sequencing is a technique that aims to selectively sequence all exons of protein-coding genes. A canine whole exome sequencing enrichment kit was designed based on the latest canine reference genome (build 3.1.72). Its performance was tested by sequencing 2 exome captures, each consisting of 4 pre-capture pooled, barcoded Illumina libraries on an Illumina HiSeq 2500. At an average sequencing depth of 102x, 83 to 86% of the target regions were completely sequenced with a minimum coverage of five and 90% of the reads mapped on the target regions. Additionally, it is shown that the reproducibility within and between captures is high and that pooling four samples per capture is a valid option. Overall, we have demonstrated the strong performance of this WES enrichment kit and are confident it will be a valuable tool in future disease association studies.
Year: 2014 PMID: 24998260 PMCID: PMC4083258 DOI: 10.1038/srep05597
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Statistics for exome sequencing eight dogs
| Sample | Total reads | Mapped reads | Duplicate reads | Remaining reads | Remaining reads (%) | Sequencing depth (x) |
|---|---|---|---|---|---|---|
| 1 | 82,574,410 | 77,392,469 | 4,820,648 | 72,571,821 | 87.9 | 93.0 |
| 2 | 74,657,388 | 69,542,653 | 4,518,820 | 65,023,833 | 87.1 | 82.6 |
| 3 | 90,534,096 | 83,841,822 | 4,680,806 | 79,161,016 | 87.4 | 102.0 |
| 4 | 77,786,110 | 72,147,586 | 4,457,341 | 67,690,245 | 87.0 | 87.1 |
| 5 | 111,624,766 | 108,781,536 | 9,882,797 | 98,898,739 | 88.6 | 125.1 |
| 6 | 96,041,166 | 93,261,066 | 8,278,081 | 84,982,985 | 88.5 | 106.9 |
| 7 | 103,290,412 | 100,440,603 | 8,653,736 | 91,786,867 | 88.9 | 116.7 |
| 8 | 86,094,438 | 83,226,207 | 5,926,249 | 77,299,958 | 89.8 | 99.3 |
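The columns above appear to be related by simple arithmetic: the tabulated values are consistent with the remaining reads being the mapped reads minus the duplicates, and the remaining percentage being taken relative to the total reads. This relation is inferred from the numbers, not stated in the record. A minimal sketch that checks it for sample 1 (the function name `remaining_reads` is hypothetical):

```python
def remaining_reads(total, mapped, duplicate):
    """Return (remaining reads, remaining reads as % of total reads).

    Assumes remaining = mapped - duplicate, which is an inference from the
    tabulated values rather than a definition given in the source.
    """
    remaining = mapped - duplicate
    return remaining, 100 * remaining / total


# Sample 1 values copied from the table above.
remaining, pct = remaining_reads(
    total=82_574_410, mapped=77_392_469, duplicate=4_820_648
)
print(f"{remaining:,} remaining reads ({pct:.1f}% of total)")
```

The same call can be repeated for the other seven rows to confirm that the table is internally consistent.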
Figure 1. Relation between the proportion of each region sequenced and the total number of regions sequenced (%).
For each region in each sample, the percentage of the region sequenced at a minimum coverage of 5 was calculated. On average, 85% of the regions were completely sequenced. About 87% of the regions were sequenced over at least 90% of their length, and around 90% of the regions were sequenced over at least 60% of their length.
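The per-region calculation described in the caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the per-base depth lists are toy data, and the names `fraction_covered` and `share_of_regions` are hypothetical.

```python
def fraction_covered(depths, min_coverage=5):
    """Fraction of positions in one region with depth >= min_coverage."""
    if not depths:
        return 0.0
    return sum(d >= min_coverage for d in depths) / len(depths)


def share_of_regions(fractions, threshold):
    """Percentage of regions whose covered fraction is >= threshold."""
    return 100 * sum(f >= threshold for f in fractions) / len(fractions)


# Hypothetical per-base depths for four toy regions (not data from the study).
regions = {
    "region_a": [12, 9, 7, 6, 5],   # every base at depth >= 5
    "region_b": [8, 6, 4, 2, 0],    # two of five bases at depth >= 5
    "region_c": [0, 0, 1, 3, 4],    # no base at depth >= 5
    "region_d": [10] * 9 + [3],     # nine of ten bases at depth >= 5
}

fractions = [fraction_covered(d) for d in regions.values()]
for threshold in (1.0, 0.9, 0.6):
    share = share_of_regions(fractions, threshold)
    print(f"regions covered for at least {threshold:.0%}: {share:.0f}%")
```

Plotting the share of regions against the threshold gives a curve of the kind the figure describes.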
Regions with a coverage below 5
| Sample | Regions with minimum coverage <5 (%) | Regions with maximum coverage <5 (%) |
|---|---|---|
| 1 | 31,604 (15.56) | 16,330 (8.04) |
| 2 | 33,167 (16.33) | 17,042 (8.39) |
| 3 | 30,122 (14.83) | 15,800 (7.78) |
| 4 | 34,655 (17.07) | 17,831 (8.78) |
| 5 | 28,250 (13.91) | 14,733 (7.26) |
| 6 | 28,979 (14.27) | 14,824 (7.30) |
| 7 | 28,487 (14.03) | 14,696 (7.24) |
| 8 | 30,224 (14.88) | 15,465 (7.62) |
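A region whose minimum coverage is below 5 is not completely sequenced at that threshold, so the complement of the first percentage column should give the share of completely sequenced regions. The sketch below applies that reading (an interpretation, not something the record states) to the eight tabulated percentages; the variable names are hypothetical.

```python
# Percentage of regions with minimum coverage < 5, samples 1 to 8,
# copied from the table above.
min_cov_below_5_pct = [15.56, 16.33, 14.83, 17.07, 13.91, 14.27, 14.03, 14.88]

# Assumed reading: every other region is completely sequenced at coverage >= 5.
completely_sequenced_pct = [100 - p for p in min_cov_below_5_pct]

lowest = min(completely_sequenced_pct)
highest = max(completely_sequenced_pct)
mean = sum(completely_sequenced_pct) / len(completely_sequenced_pct)

print(f"completely sequenced: {lowest:.2f}% to {highest:.2f}%, mean {mean:.2f}%")
```

Under this reading the range and mean round to the 83 to 86% and the average of 85% quoted in the abstract and the figure caption.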
Coverage of targeted base pairs
| Sample | % of target bp covered (≥1x) | % of target bp covered (≥5x) |
|---|---|---|
| 1 | 93.15 | 89.96 |
| 2 | 92.90 | 89.54 |
| 3 | 93.22 | 90.24 |
| 4 | 92.82 | 89.15 |
| 5 | 93.52 | 90.63 |
| 6 | 93.66 | 90.71 |
| 7 | 93.53 | 90.63 |
| 8 | 93.56 | 90.53 |
The second and third columns show the percentage of base pairs, out of the 52,876,195 base pairs in the design, with a coverage of at least one and at least five, respectively, within each sample.
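To put these percentages in absolute terms, they can be multiplied by the design size. A minimal sketch for sample 1 follows; the helper name `covered_bp` is hypothetical, and because the tabulated percentages are rounded to two decimals the results are approximate.

```python
DESIGN_SIZE_BP = 52_876_195  # design size stated in the table note


def covered_bp(percent_covered, design_size=DESIGN_SIZE_BP):
    """Approximate number of target base pairs at or above a coverage level."""
    return percent_covered / 100 * design_size


# Sample 1 percentages copied from the table above.
at_least_1x = covered_bp(93.15)
at_least_5x = covered_bp(89.96)

print(f"coverage of at least 1: ~{at_least_1x:,.0f} bp")
print(f"coverage of at least 5: ~{at_least_5x:,.0f} bp")
print(f"covered, but below 5:   ~{at_least_1x - at_least_5x:,.0f} bp")
```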