Yu Lu1, Xueya Zhou2, Zhanguo Jin3, Jing Cheng1, Weidong Shen1, Fei Ji1, Liyang Liu4, Xuegong Zhang4, Michael Zhang5, Ye Cao6, Dongyi Han1, KwongWai Choy6, Huijun Yuan1.
Abstract
Here, we report an unconventional Chinese pedigree consisting of three branches, all segregating prelingual hearing loss (HL) with an unclear inheritance pattern. After identifying the cause in one branch as maternally inherited aminoglycoside-induced HL, we applied targeted next-generation sequencing (NGS) to identify the genetic causes in the other two branches. One affected subject from each branch underwent targeted NGS, with genomic DNA enriched either by whole-exome capture (Agilent SureSelect All Exon 50 Mb) or by candidate-gene capture (Agilent SureSelect custom kit). By NGS analysis, we found that patients from Branch A were compound heterozygous for p.E1006K and p.D1663V in the CDH23 (DFNB12) gene, and patients from Branch B were homozygous for IVS7-2A>G in the SLC26A4 (DFNB4) gene. Both CDH23 mutations altered conserved calcium-binding sites of the extracellular cadherin domains. The co-occurrence of three different genetic causes in this family is exceedingly rare but fully compatible with the mutation spectrum of HL. Our study also raises several technical and analytical issues in applying NGS to genetic testing.
Year: 2014 PMID: 25231367 PMCID: PMC4521291 DOI: 10.1038/jhg.2014.78
Source DB: PubMed Journal: J Hum Genet ISSN: 1434-5161 Impact factor: 3.172
Figure 1. The pedigree and typical audiograms of patients. (a) The four-generation pedigree of the Chinese family presenting prelingual HL comprises three branches (labeled A to C). Affected individuals were found only in the third generation. Individuals with available DNA in the second and third generations were genotyped for the four pathogenic mutations. Two affected members (III-4 and III-13) from the third generation were selected for sequencing. (b) Typical audiograms of selected patients from each branch are shown. While patients in Branches A and B showed bilateral severe-to-profound hearing loss across all frequencies, patients in Branch C all showed severe hearing loss with down-sloping audiograms (only III-29 is shown).
Comparing the design differences between the CUHK-HL V1 and the SureSelect 50 Mb target enrichment kits
| Design feature | CUHK-HL V1 | SureSelect 50 Mb |
|---|---|---|
| Target region length (bp) | 2 062 107 | 51 646 629 |
| Number of baits | 52 049 | 556 569 |
| Baits layout | Overlapping baits, fourfold tiling across the target intervals | Immediately adjacent, head-to-tail anchored baits across the target intervals |
| Targeted gene groups | 78 known HL genes+174 candidates+entire mtDNA | GENCODE coding genes+noncoding RNAs from miRBase and Rfam |
| Targeted regions for each gene | All exons and UTRs+50 bp flanking sequences | All coding exons+10 bp flanking sequences |
| Proportion of the NSHL gene regions | 0.992 | 0.942 |
Abbreviations: bp, base pair; UTR, untranslated region.
Gene regions are defined as all coding exons plus 10 bp flanking sequences at intron–exon junctions.
Figure 2. Comparing the design and performance of the two target enrichment (TGE) kits.
(a) The targeted regions, bait layouts, GC percent and depth of coverage at the GJB2 gene locus. The CUHK-HL V1 kit targeted both coding sequences and untranslated regions using fourfold tiling baits, whereas the commercial SureSelect 50 Mb kit was designed to capture only the protein-coding part of the gene using baits laid head to tail, adjacent to each other. The influence of local GC percent on read depth is more evident with the CUHK-HL V1 kit: exon 1 of GJB2 co-localizes with a CpG island on which no reads were mapped, and across exon 2 the read depth tended to decrease with increasing GC percent.
(b) Enrichment efficiency and the mtDNA effect. Enrichment efficiency can be measured by the proportion of total mapped bases that overlap the designed target regions (on-target proportion). Although the CUHK-HL V1 kit showed a higher on-target proportion than the SureSelect 50 Mb kit (~75% vs ~60%), nearly two-thirds of its on-target bases were mapped onto mtDNA, which is designed as a single target.
(c) Comparing the uniformity of read depths across all NSHL genes. To account for the differences in the designed targets of the two TGE kits, the comparison is restricted to the genomic intervals that encompass all coding exons plus 10 bp intron–exon boundaries of the NSHL genes (exonic intervals) and that overlap the target regions in both TGE kits. To account for the differences in total sequence amounts, the depth per interval is then normalized by dividing by the average depth over the exonic intervals under comparison. The cumulative distributions of the normalized depth on the exonic intervals are shown. The curve can be interpreted as the achieved coverage proportions (y axis) at different normalized depths (x axis). For normalized depths ranging from 0 to 0.5, the SureSelect 50 Mb kit consistently but slightly outperformed the CUHK-HL V1 kit on the coverage proportions.
(d) The effect of GC content on read depths. Similar to (c), we compared the two TGE kits by using normalized depth over all targeted exonic regions of the NSHL genes. The pattern is quantitatively similar when using all targets. For both kits, regions with very high GC content (>0.7) had very low depths. While the SureSelect 50 Mb kit shows a parabolic relationship between read depth and GC content, the depths on the CUHK-HL V1 kit targets decrease monotonically with GC content. The difference can most likely be explained by the differences in the bait design.
(e) The effect of repeat elements on coverage depths. Because the SureSelect TGE technology tends to avoid placing baits over repeat elements, target regions with low bait density should have higher densities of repeat elements. For the CUHK-HL V1 kit, under the fourfold tiling of 120 bp baits, the expected density should be 1/30; low bait density was defined as <1/50 based on the empirical bait density distribution. After accounting for the GC effect, read depths at targets of low bait density tend to be shallower than at targets with normal bait density. The influence of bait density on target depth can also be observed for the SureSelect 50 Mb kit (see Table 6).
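The depth normalization in panel (c) and the expected bait density in panel (e) can be illustrated with a minimal Python sketch. The function names and the toy depths are hypothetical, and the coverage proportion is assumed here to mean the fraction of intervals whose normalized depth reaches a given threshold; the caption does not spell out the exact definition.

```python
from typing import List


def normalize_depths(depths: List[float]) -> List[float]:
    """Divide each per-interval depth by the mean depth over all intervals."""
    mean_depth = sum(depths) / len(depths)
    if mean_depth == 0:
        raise ValueError("mean depth is zero; nothing to normalize")
    return [d / mean_depth for d in depths]


def coverage_proportion(normalized: List[float], threshold: float) -> float:
    """Fraction of intervals with normalized depth >= threshold (assumed definition)."""
    return sum(1 for d in normalized if d >= threshold) / len(normalized)


def expected_bait_density(bait_length: int, tiling: int) -> float:
    """Baits per base pair when baits of bait_length bp overlap tiling-fold."""
    return tiling / bait_length


# Toy per-interval depths, not data from the study.
toy_depths = [10.0, 40.0, 50.0, 100.0]
toy_normalized = normalize_depths(toy_depths)

# Fourfold tiling of 120 bp baits starts a new bait every 30 bp.
density_fourfold_120 = expected_bait_density(120, 4)
```

With fourfold tiling of 120 bp baits, the sketch gives one bait start per 30 bp, which matches the expected density of 1/30 quoted in the caption.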
Summary of clinical data of affected individuals of family JX-H016
| III-1 | Female | — | Prelingual | — | — | — | — | — | — | — |
| III-2 | Female | 36 | Prelingual | No | — | — | — | — | — | — |
| III-4 | Female | 35 | Prelingual | No | 97 | 100 | Flat | No | No | No |
| III-10 | Male | — | Prelingual | — | — | — | — | — | — | — |
| III-11 | Female | 35 | Prelingual | No | 100 | 90 | Sloping | No | No | No |
| III-13 | Female | 27 | Prelingual | No | 98 | 98 | Flat | No | No | No |
| III-15 | Female | 24 | Prelingual | No | 100 | 100 | Flat | No | No | No |
| III-17 | Male | 29 | Prelingual | No | 88 | 90 | Flat | No | No | No |
| III-20 | Female | — | Prelingual | Yes | — | — | — | — | — | — |
| III-22 | Female | — | Prelingual | Yes | — | — | — | — | — | — |
| III-24 | Female | 28 | Prelingual | Yes | 82 | 83 | Sloping | No | No | No |
| III-29 | Male | 17 | Prelingual | Yes | 77 | 82 | Sloping | No | No | No |
PTA, pure-tone average.
The number of high-quality variants after each step of filtering
| All high-quality variants | 1492 | 220 | 35 728 | 2416 |
| After filtering against public databases | 50 | 131 | 1172 | 1129 |
| After filtering against in-house exome database | 36 | 22 | 719 | 58 |
| After functional effects filtering | 11 | 1 | 196 | 14 |
| Variants disrupting known nonsyndromic hearing loss genes | 4 | 0 | 4 | 0 |
| Genes harboring homozygous variants or >=2 heterozygous variants | 1 | 2 | ||
| Recessive hearing loss genes harboring homozygous variants or >=2 heterozygotes | 1 | 1 | ||
Abbreviations: indels, insertions-deletions; SNV, single-nucleotide variants.
Excluding variants having alternative allele frequencies >0.01 in any one of the populations in dbSNP and the 1000 Genomes Project.
Excluding variants having alternative allele frequencies >0.01 in 170 other unrelated in-house exomes.
Keeping SNVs that are evolutionarily conserved and cause missense or nonsense changes or potentially disrupt splice sites; and keeping indels that result in in-frame or frameshift alterations.
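The three filtering steps described in these footnotes can be sketched as a small pipeline. This is an illustrative sketch only: the `Variant` record, its field names, the effect labels and the toy variants are all hypothetical, and the authors' actual software is not described here. Only the 0.01 frequency cutoff and the kept effect classes come from the footnotes.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

MAX_AF = 0.01  # frequency cutoff stated in the footnotes
KEPT_SNV_EFFECTS = {"missense", "nonsense", "splice_site"}
KEPT_INDEL_EFFECTS = {"inframe", "frameshift"}


@dataclass
class Variant:
    gene: str
    kind: str                    # "snv" or "indel"
    effect: str                  # hypothetical effect label
    conserved: bool              # evolutionarily conserved site
    public_af: Dict[str, float]  # population name -> alternative allele frequency
    inhouse_af: float            # frequency in the in-house exomes


def passes_public(v: Variant) -> bool:
    # Exclude if the frequency exceeds the cutoff in ANY population.
    return all(af <= MAX_AF for af in v.public_af.values())


def passes_inhouse(v: Variant) -> bool:
    return v.inhouse_af <= MAX_AF


def passes_functional(v: Variant) -> bool:
    if v.kind == "snv":
        return v.conserved and v.effect in KEPT_SNV_EFFECTS
    return v.effect in KEPT_INDEL_EFFECTS


def filter_variants(variants: List[Variant]) -> Tuple[List[Variant], List[int]]:
    """Apply the filters in order; return survivors and the count after each step."""
    counts = [len(variants)]
    for step in (passes_public, passes_inhouse, passes_functional):
        variants = [v for v in variants if step(v)]
        counts.append(len(variants))
    return variants, counts


# Toy variants, not data from the study.
toy_variants = [
    Variant("GENE_A", "snv", "missense", True, {"pop1": 0.0, "pop2": 0.002}, 0.0),
    Variant("GENE_B", "snv", "missense", True, {"pop1": 0.05}, 0.0),
    Variant("GENE_C", "snv", "synonymous", True, {"pop1": 0.0}, 0.0),
    Variant("GENE_D", "indel", "frameshift", False, {}, 0.02),
    Variant("GENE_E", "indel", "inframe", False, {}, 0.0),
]
toy_survivors, toy_counts = filter_variants(toy_variants)
```

Recording the count after every step mirrors the layout of the table above, where each row reports the variants remaining after one more filter.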
All rare variants that disrupt known nonsyndromic hearing loss genes discovered from two sequenced patients
| III-4 | 1:109472416 C>T (rs189033496) | Het | 0.0080 | NM_013296:exon15: c.1909C>T | p.R637W | 101 | +++ + | 3.05 | 5.05 | |
| III-4 | 6:76618264 T>C (rs144816202) | Het | 0 | NM_004999:exon32: c.3332T>C | p.V1111A | 64 | +++ + | 4.87 | 6.07 | |
| III-4 | 10:73466716 G>A | Het | 0 | NM_022124:exon25: c.3016G>A | p.E1006K | 56 | ? +?? | 5.98 | 5.28 | |
| III-4, III-13 | 10:73537579 A>T | Het | 0 | NM_022124:exon37: c.4988A>T | p.D1663V | 152 | ? +?? | 5.12 | 2.18 | |
| III-13 | 2:26699868 T>A | Het | 0.0011 | NM_004802:exon5: c.326A>T | p.N109I | 149 | +++ + | 3.16 | 3.76 | |
| III-13 | 7:107323898 A>G (rs111033313) | Hom | 0.0080 | NM_000441:exon8: c.919-2A>G | p.T307Vfs*5 | n.a. | 4.66 | 5.62 | ||
| III-13 | 5:68715804 G>A | Het | 0.0023 | NM_001038603:exon2: c.592G>A | p.V198M | 21 | ++− − | 2.28 | 5.16 |
Allele frequencies were calculated from the in-house exome database of 440 unrelated samples (Feb 2013; X. Zhou unpublished data).
Results from four non-synonymous SNV effect prediction programs, from left to right: PolyPhen2, SIFT, LRT, MutationTaster. '+', deleterious or damaging; '−', benign; '?', not available.
Measures of evolutionary constraint. PhyloP score is the −log10 of P-value for testing the null hypothesis of neutral evolution, based on 46-way whole-genome alignment of vertebrates. GERP score can be interpreted as the substitutions expected under neutrality minus the number of substitutions observed at the position, which was based on 35-way whole-genome alignment of mammals and had theoretical maximum value of 6.18.
This variant was predicted to abolish the splice acceptor site and cause skipping of exon 8.
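The recessive-model screen (a gene carrying one homozygous variant or at least two heterozygous variants in the same patient) can be illustrated on the rows of this table. The sketch below is a minimal illustration rather than the authors' pipeline: the function name is hypothetical, and transcripts are used as stand-ins for genes. From the abstract, NM_022124 corresponds to CDH23 and NM_000441 to SLC26A4.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# (patient, transcript, zygosity) taken from the table rows above.
TABLE_ROWS: List[Tuple[str, str, str]] = [
    ("III-4", "NM_013296", "Het"),
    ("III-4", "NM_004999", "Het"),
    ("III-4", "NM_022124", "Het"),    # p.E1006K
    ("III-4", "NM_022124", "Het"),    # p.D1663V
    ("III-13", "NM_022124", "Het"),   # p.D1663V
    ("III-13", "NM_004802", "Het"),
    ("III-13", "NM_000441", "Hom"),   # c.919-2A>G
    ("III-13", "NM_001038603", "Het"),
]


def recessive_candidates(rows: List[Tuple[str, str, str]], patient: str) -> List[str]:
    """Transcripts with a homozygous variant or >=2 heterozygous variants."""
    homozygous: Set[str] = set()
    het_counts: Dict[str, int] = defaultdict(int)
    for row_patient, transcript, zygosity in rows:
        if row_patient != patient:
            continue
        if zygosity == "Hom":
            homozygous.add(transcript)
        else:
            het_counts[transcript] += 1
    multi_het = {t for t, n in het_counts.items() if n >= 2}
    return sorted(homozygous | multi_het)


candidates_iii_4 = recessive_candidates(TABLE_ROWS, "III-4")
candidates_iii_13 = recessive_candidates(TABLE_ROWS, "III-13")
```

Two heterozygous variants in one gene only suggest compound heterozygosity; confirming that they lie on different parental alleles requires segregation or phasing data, which this sketch does not model.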
The summary statistics for two sequenced affected subjects
| Target enrichment kit | CUHK-HL V1 | SureSelect 50Mb |
| Total yield of raw sequence reads (Gbp) | 0.837 | 19.58 |
| Percent of aligned reads | 98.7% | 96.3% |
| Percent of duplicated reads | 15.8% | 30.7% |
| Mean depth of coverage on the targeted regions | 95.6 × | 110.86 × |
| Mean depth of coverage on the mtDNA | 14 479.2 | 131.3 |
| Percent of target bases covered at >=1 × | 92.8% | 96.3% |
| Percent of target bases covered at >=10 × | 83.1% | 89.9% |
| Percent of target bases covered at >= 20 × | 76.3% | 81.0% |
| Percent of target bases covered at >= 30 × | 71.0% | 72.5% |
| Mean depth of coverage on the NSHL gene regions | 83.5 × | 119.9 × |
| Percent of the NSHL gene regions covered at >=10 × | 85.3% | 90.6% |
| Percent of the NSHL gene regions covered at >=20 × | 74.9% | 83.6% |
| Percent of the NSHL gene regions covered at >=30 × | 67.3% | 76.2% |
Abbreviations: Gbp, giga base pair.
We did not include the mtDNA in calculating the average depth. If the mtDNA target of the CUHK-HL V1 kit were included, the mean target depth for B-3 would be 213.6 × ; and the coverage proportions at 10 × , 20 × and 30 × would be slightly increased to 83.3%, 76.5% and 71.3%, respectively.
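The effect of including the mtDNA target on the mean depth is a length-weighted average, sketched below. This is a rough illustration under two assumptions that the text does not confirm: that the CUHK-HL V1 target length of 2 062 107 bp from the design table includes the mtDNA target, and that the mtDNA target spans the full 16 569 bp human mitochondrial reference sequence. The function name is hypothetical.

```python
def combined_mean_depth(
    nuclear_depth: float, nuclear_len: int, mt_depth: float, mt_len: int
) -> float:
    """Length-weighted mean depth over nuclear and mitochondrial targets."""
    total_bases = nuclear_depth * nuclear_len + mt_depth * mt_len
    return total_bases / (nuclear_len + mt_len)


TARGET_LEN = 2_062_107  # CUHK-HL V1 target length; ASSUMED to include mtDNA
MT_LEN = 16_569         # ASSUMED mtDNA target length (full reference genome)

depth_with_mt = combined_mean_depth(
    nuclear_depth=95.6,                # mean target depth excluding mtDNA
    nuclear_len=TARGET_LEN - MT_LEN,
    mt_depth=14_479.2,                 # mean depth on the mtDNA
    mt_len=MT_LEN,
)
```

Under these assumptions the sketch gives roughly 211 ×, close to but not identical to the 213.6 × reported above; the gap presumably reflects the exact interval definitions, which are not given here. Either way, a target covering under 1% of the design more than doubles the apparent mean depth, which is why the mtDNA was excluded from the averages.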
Evaluating the influence of the genomic features on the per-target depth
| Genomic feature | Coefficient (CUHK-HL V1) | P | Coefficient (SureSelect 50 Mb) | P |
|---|---|---|---|---|
| GC percent | −0.514 | <2e-16 | −0.245 | <2e-16 |
| Squared GC percent | 0.00238 | 0.564 | −0.218 | <2e-16 |
| Bait density | 17.77 | <2e-16 | 15.05 | <2e-16 |
Bait density reflects the density of repeat elements; target regions rich in repeat elements tend to have fewer designed baits.
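A model of this kind, with per-target depth regressed on GC content, squared GC content and bait density, can be fitted by ordinary least squares. The sketch below is an illustration only: the response scaling, the use of GC as a fraction, the inclusion of an intercept, and all names and toy values are assumptions, and it does not compute the P-values shown in the table.

```python
from typing import List, Sequence


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_depth_model(
    gc: Sequence[float], bait_density: Sequence[float], depth: Sequence[float]
) -> List[float]:
    """Least-squares fit of depth ~ 1 + gc + gc^2 + bait_density.

    Returns [intercept, gc, gc_squared, bait_density] coefficients.
    """
    rows = [[1.0, g, g * g, d] for g, d in zip(gc, bait_density)]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, depth)) for i in range(k)]
    return solve_linear(xtx, xty)  # normal equations


# Noise-free toy data generated from known coefficients (not study data).
toy_gc = [0.30, 0.35, 0.40, 0.45, 0.50, 0.55, 0.60, 0.70]
toy_density = [0.033, 0.020, 0.030, 0.015, 0.033, 0.025, 0.018, 0.030]
true_coef = [2.0, -1.0, -0.5, 15.0]
toy_depth = [
    true_coef[0] + true_coef[1] * g + true_coef[2] * g * g + true_coef[3] * d
    for g, d in zip(toy_gc, toy_density)
]
fitted_coef = fit_depth_model(toy_gc, toy_density, toy_depth)
```

In such a model, a squared-GC coefficient near zero corresponds to a monotonic depth trend, while a clearly negative one corresponds to a parabolic trend, which is the contrast between the two kits described in the Figure 2 caption.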