Rowida Almomani, Jaap van der Heijden, Yavuz Ariyurek, Yuching Lai, Egbert Bakker, Michiel van Galen, Martijn H Breuning, Johan T den Dunnen.
Abstract
Although sequencing of a complete human genome is gradually becoming an option, zooming in on the region of interest remains attractive and cost saving. We performed array-based sequence capture using 385K Roche NimbleGen, Inc. arrays to zoom in on the protein-coding and immediate intron-flanking sequences of 112 genes potentially involved in mental retardation and congenital malformation. Captured material was sequenced using Illumina technology. A data analysis pipeline was built that detects sequence variants, positions them in relation to the gene, checks for their presence in databases (e.g., dbSNP, the single-nucleotide polymorphism (SNP) database) and predicts the potential consequences at the level of RNA splicing and protein translation. In the samples analyzed, all known variants were reliably detected, including pathogenic variants from control cases and SNPs derived from array experiments. Although overall coverage varied considerably, it was reproducible per region and facilitated the detection of large deletions and duplications (copy number variations), including a partial deletion in the B3GALTL gene from a patient sample. For ultimate diagnostic application, overall results need to be improved. Future arrays should contain probes from both DNA strands, and to obtain a more even coverage, one could add fewer probes from densely covered regions and more probes from sparsely covered regions.
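As an illustration of the step that positions variants in relation to the gene, the following is a minimal sketch, not the authors' pipeline. The function name `locate_variant`, the coordinate layout, and the 20-nucleotide flank are assumptions made for the example; the abstract does not state the flank size that was captured, and strand orientation is ignored here.

```python
def locate_variant(pos, exons, flank=20):
    """Classify a genomic position relative to a list of exons.

    exons: list of (start, end) tuples, inclusive, in genomic order.
    flank: hypothetical size of the intron-flanking window (assumption).
    """
    # First pass: exonic positions take priority over flanking windows.
    for number, (start, end) in enumerate(exons, start=1):
        if start <= pos <= end:
            return f"exon {number}"
    # Second pass: positions just outside an exon boundary.
    for number, (start, end) in enumerate(exons, start=1):
        if start - flank <= pos < start or end < pos <= end + flank:
            return f"intron flank of exon {number}"
    return "outside captured region"


# Made-up coordinates, for illustration only.
example_exons = [(100, 200), (500, 600)]
labels = [locate_variant(p, example_exons) for p in (150, 490, 300)]
```

With these made-up coordinates, position 150 falls in the first exon, 490 in the flank upstream of the second exon, and 300 outside any captured window.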
Year: 2010 PMID: 21102627 PMCID: PMC3039511 DOI: 10.1038/ejhg.2010.145
Source DB: PubMed Journal: Eur J Hum Genet ISSN: 1018-4813 Impact factor: 4.246
Figure 1. Detection of sequence variants. NGS reads of 32 nucleotides (top, sequence mismatches in red) aligned with the genomic reference sequence (bottom). The center of the alignment shows a variant present in the heterozygous state. '× n' behind a read indicates how many identical reads were obtained.
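The idea behind this figure, identical reads collapsed with a multiplicity and a variant seen in roughly half of them, can be sketched as follows. This is an illustrative assumption, not the published method: the read representation `(start, sequence, count)`, the function names, and the 0.2 to 0.8 window used to call a heterozygous state are all hypothetical, and the reads are invented.

```python
from collections import Counter


def allele_fractions(reads, position):
    """Fraction of each base observed at a reference position.

    reads: list of (start, sequence, count), where count is the
    number of identical reads (the 'x n' multiplicity).
    """
    counts = Counter()
    for start, seq, n in reads:
        offset = position - start
        if 0 <= offset < len(seq):  # read covers the position
            counts[seq[offset]] += n
    total = sum(counts.values())
    return {base: c / total for base, c in counts.items()} if total else {}


def call_genotype(fractions, ref_base, het_range=(0.2, 0.8)):
    """Crude genotype call from the non-reference fraction (assumed thresholds)."""
    if not fractions:
        return "no coverage"
    alt = 1.0 - fractions.get(ref_base, 0.0)
    low, high = het_range
    if alt < low:
        return "reference"
    if alt > high:
        return "homozygous variant"
    return "heterozygous"


# Invented reads: the reference base at position 104 is taken to be 'A'.
example_reads = [
    (100, "ACGTACGT", 3),  # carries 'A' at 104, seen 3 times
    (102, "GTGCGT", 2),    # carries 'G' at 104, seen 2 times
    (104, "GCGT", 1),      # carries 'G' at 104, seen once
    (110, "AAAA", 4),      # does not cover 104
]
fractions = allele_fractions(example_reads, 104)
genotype = call_genotype(fractions, "A")
```

Counting multiplicities rather than distinct reads matters here: three distinct read sequences cover the position, but six reads in total contribute to the allele fractions.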
Sequence summary results of the different array-capture experiments performed
| | | | | | | | | % | % | | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| S-2, F | 6.744 | 4.804 | 2.428 | 1.359 | 691 | 378 | 138 | 87.11 | 6.22 | 50 | No |
| S-3, M | 7.305 | 5.354 | 2.176 | 1.225 | 618 | 333 | 100 | 90.71 | 4.49 | 50 | No |
| S-5, M | 10.43 | 7.237 | 5.576 | 4.935 | 499 | 142 | 120 | 92.42 | 2.09 | 32 | No |
| S-7, M | 15.771 | 6.112 | 4.719 | 3.885 | 638 | 196 | 100 | 91.13 | 2.70 | 32 | No |
| S-6, M | 12.154 | 6.575 | 6.575 | 5.914 | 486 | 174 | 99 | 99.24 | 7.08 | 32 | Yes, 2nd time |
| S-8, F | 11.077 | 3.531 | 3.531 | 2.301 | 736 | 485 | 44 | 85.38 | 4.43 | 49 | Yes, 3rd time |
Abbreviations: F, female; M, male; MM# reads, number of reads with # mismatches to the reference sequence; QC, quality control.
Figure 2. Average coverage obtained for different genes in four different samples. (a) Average coverage of 69 autosomal genes in four different samples. (b) Average coverage of 39 genes located on the X chromosome and one gene (NLGN4Y) located on the Y chromosome; the female sample showed no hybridization to the NLGN4Y probes on the capture array and therefore no coverage of the corresponding regions. The female sample shows a higher average coverage per gene than the male samples for all genes located on the X chromosome. (c) Lower average coverage of the B3GALTL gene in a male patient sample with a known large deletion compared with three wild-type male samples. (d) Coverage per nucleotide position for the whole B3GALTL gene: the patient sample shows lower coverage for the second half (exons 8–15) compared with the wild-type samples. del=deletion, wt=wild type.
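Because coverage is reported to be reproducible per region, a deletion such as the one in panels (c) and (d) can in principle be spotted by comparing a patient's normalised coverage with that of reference samples. The sketch below is an assumption about how such a comparison might look, not the analysis used in the paper: the function names, the 0.7 ratio threshold, the region labels, and all coverage values are invented for illustration.

```python
from statistics import mean


def normalise(coverage):
    """Scale per-region coverage by the sample's total coverage."""
    total = sum(coverage.values())
    return {region: value / total for region, value in coverage.items()}


def coverage_ratios(patient, controls):
    """Ratio of patient to mean control coverage, per region, after normalisation."""
    patient_norm = normalise(patient)
    controls_norm = [normalise(c) for c in controls]
    ratios = {}
    for region, value in patient_norm.items():
        reference = mean(c[region] for c in controls_norm)
        ratios[region] = value / reference if reference > 0 else float("nan")
    return ratios


def flag_low(ratios, threshold=0.7):
    """Regions whose ratio falls below an assumed threshold."""
    return [region for region, ratio in ratios.items() if ratio < threshold]


# Invented coverage values; region names are placeholders.
control_a = {"other_1": 100, "other_2": 200, "exon_7": 100, "exon_8": 100}
control_b = {"other_1": 100, "other_2": 200, "exon_7": 100, "exon_8": 100}
patient = {"other_1": 100, "other_2": 200, "exon_7": 100, "exon_8": 50}

ratios = coverage_ratios(patient, [control_a, control_b])
flagged = flag_low(ratios)
```

Normalising by total coverage compensates for differences in sequencing depth between runs, but it also means a large deletion slightly inflates the ratios of the remaining regions, so a real analysis would likely normalise against regions known to be unaffected.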