Barbara Jenko Bizjan1, Theodora Katsila2, Tine Tesovnik1, Robert Šket1, Maruša Debeljak1, Minos Timotheos Matsoukas3, Jernej Kovač1.
Abstract
Genomic structural variations, previously considered rare events, are now widely recognized as a major source of inter-individual variability and hence a major hurdle to optimal patient stratification and disease management. Herein, we focus on large complex germline structural variations and present the challenges towards targeted treatment via the synergy of state-of-the-art approaches and information technology tools. Detecting complex structural variations remains challenging, as there is no gold standard for identifying such genomic variations with long reads, especially when the chromosomal rearrangement in question is a few Mb in length. A clinical case with a large complex chromosomal rearrangement serves as a paradigm. We feel that functional validation and data interpretation are of utmost importance for information growth to be translated into knowledge growth; hence, new working practices are highlighted.
Keywords: Human genetics; Long reads sequencing; Structural variations; Theranostics
Year: 2019 PMID: 32099591 PMCID: PMC7026727 DOI: 10.1016/j.csbj.2019.11.008
Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 7.271
Bioinformatics methods for the discovery and identification of structural variants by long read sequencing.
| Bioinformatics analysis | Selected methods | References | Sequencing technology |
|---|---|---|---|
| **Reference-based alignment of reads with structural variation calling** | | | |
| Reads alignment | NGMLR | Sedlazeck, Rescheneder, et al. | ONT or PacBio |
| Reads alignment | Minimap2 | Li | ONT or PacBio |
| Variant calling | Sniffles | Sedlazeck, Rescheneder, et al. | ONT or PacBio |
| Variant calling | SVIM | Heller and Vingron | ONT or PacBio |
| Variant calling | SMRT-SV | Huddleston et al. | PacBio |
| Variant calling | PBHoney | English et al. | PacBio |
| Visualization | IGV | Robinson et al. | ONT or PacBio |
| Visualization | Ribbon | Nattestad, Chin, et al. | ONT or PacBio |
| Assembly | Canu | Koren et al. | ONT or PacBio |
| Assembly | Wtdbg2 | Ruan and Li | ONT or PacBio |
| Assembly | FALCON | Chin et al. | PacBio |
| Assembly alignment and visualization | MUMmer | Marcais et al. | ONT or PacBio |
| Assembly alignment and visualization | QUAST | Gurevich et al. | ONT or PacBio |
| Assembly-based SV detection | Assemblytics | Nattestad and Schatz | ONT or PacBio |
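Variant callers such as those listed in the table typically report their calls as VCF records. As an illustration of a downstream filtering step, the following minimal sketch extracts SV type and length from VCF lines and keeps only large events. It assumes the caller writes the standard `SVTYPE`, `END` and `SVLEN` INFO keys; the function names, the 10 kb threshold and the example records are hypothetical and are not taken from the article.

```python
from typing import Dict, Iterator, List, NamedTuple


class SVCall(NamedTuple):
    chrom: str
    start: int
    end: int
    svtype: str
    length: int


def parse_info(info: str) -> Dict[str, str]:
    """Split a VCF INFO column into a key -> value dict (flags map to '')."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


def iter_sv_calls(vcf_lines: List[str]) -> Iterator[SVCall]:
    """Yield SV calls from VCF lines, skipping headers and non-SV records."""
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        chrom, pos, info = cols[0], int(cols[1]), parse_info(cols[7])
        if "SVTYPE" not in info:
            continue
        end = int(info.get("END", pos))
        if "SVLEN" in info:
            # SVLEN may be negative (deletions) or comma-separated; use the first value.
            length = abs(int(info["SVLEN"].split(",")[0]))
        else:
            # Fall back to the span between POS and END.
            length = end - pos
        yield SVCall(chrom, pos, end, info["SVTYPE"], length)


def large_svs(vcf_lines: List[str], min_length: int = 10_000) -> List[SVCall]:
    """Keep only calls at least min_length bases long (threshold is an assumption)."""
    return [sv for sv in iter_sv_calls(vcf_lines) if sv.length >= min_length]


# Made-up records for illustration only (not data from the article).
EXAMPLE_VCF = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr7\t1000000\tsv1\tN\t<INV>\t.\tPASS\tSVTYPE=INV;END=1015800;SVLEN=15800",
    "chr7\t2000000\tsv2\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=2000300;SVLEN=-300",
    "chr7\t3000000\tsv3\tN\t<DUP>\t.\tPASS\tIMPRECISE;SVTYPE=DUP;END=3050000",
]

if __name__ == "__main__":
    for sv in large_svs(EXAMPLE_VCF):
        print(sv)
```

With the example records, the 300 bp deletion is filtered out while the inversion and the duplication are retained. Real caller output may carry additional or differently named INFO keys, so the parsing would need to be checked against the actual files.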
Fig. 1. A 13.2 Mb large deletion on chromosome 4. (A) Detection and identification by CNV analysis of the target clinical exome short read sequencing data. (B) Identification by long reads sequencing.
Long range sequencing and mapping platforms.
| Platform | General characteristic | Key features for the determination of SVs | Limitations for the determination of SVs |
|---|---|---|---|
| Long reads sequencing (Oxford Nanopore Sequencing, PacBio SMRT sequencing) | Single-molecule long read sequencing averaging ∼10 kb | Single reads spanning a whole SV or its breakpoints | Requires large quantities of high molecular weight DNA |
| BioNano Genomics optical mapping | Optical mapping of long DNA molecules ∼250 kb or longer | Single molecule spanning structural variants > 10 kb | Does not provide nucleotide-level resolution of breakpoints |
| 10X Genomics Chromium | Linked short reads spanning ∼100 kb | Linked reads spanning ∼100 kb can detect large SVs > 10 kb | Unable to identify complex inversions |
| Hi-C based analysis | Pairs of short reads formed from crosslinking chromatin interactions | Chromatin contact maps determining large SVs with reads spanning breakpoints and reads located near the breakpoints | Limited in detecting SVs within the 1 Mb scale |
| Strand-Seq | Single-cell/single-strand genome sequencing | Possible to identify haplotypes and genomic rearrangements, including complex inversions | High cost and demanding procedure (the protocol requires viable mitotic cells) |
PacBio SMRT, Pacific Biosciences single-molecule real time.
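The key feature listed for long reads, a single read spanning a whole SV or one of its breakpoints, can be expressed as a simple interval check. The sketch below is only an illustration: the function names, the 500 bp minimum flank and the coordinates are hypothetical.

```python
def spans_breakpoint(read_start: int, read_end: int, bp: int, min_flank: int = 500) -> bool:
    """True if the read covers position bp with at least min_flank bases on each side."""
    return read_start + min_flank <= bp <= read_end - min_flank


def spans_whole_sv(read_start: int, read_end: int, sv_start: int, sv_end: int,
                   min_flank: int = 500) -> bool:
    """True if one read covers both breakpoints of the SV with flanking sequence."""
    return (spans_breakpoint(read_start, read_end, sv_start, min_flank)
            and spans_breakpoint(read_start, read_end, sv_end, min_flank))


if __name__ == "__main__":
    # Hypothetical 15.8 kb event and two hypothetical reads.
    sv_start, sv_end = 102_000, 117_800
    short_read = (100_000, 110_000)   # about 10 kb
    long_read = (100_000, 120_000)    # about 20 kb
    print(spans_breakpoint(*short_read, sv_start))        # covers the left breakpoint
    print(spans_whole_sv(*short_read, sv_start, sv_end))  # too short for the whole SV
    print(spans_whole_sv(*long_read, sv_start, sv_end))   # covers both breakpoints
```

A read of about 10 kb can anchor one breakpoint of such an event but cannot cover both, which is why events much longer than the typical read length have to be reconstructed from breakpoint-spanning reads rather than from a single read.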
Fig. 2. Identification of a large complex SV by a synergy of cytogenetic approaches. (A) The observed G-banded karyotype. (B) The result of aCGH indicating the 1.93 Mb duplication of segment 7q11.21, the 5.27 Mb triplication of segment 7q11.21q11.22 and the 1.33 Mb duplication of segment 7q11.23. (C) The 7q11.22 triplication is detected by FISH. The regions 7q11.21 (orange), 7q11.22 (green) and 7q11.23 (orange) are shown by combined FISH. (D) The result of FISH inversion detection using distal (RP11-409J21) and proximal (CEP 7, Agilent) probes. (E) Two different possible scenarios: I. if the colour pattern oscillates between red and green, the inversion is not present; II. the presence of the inversion is confirmed. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Fig. 3. The bioinformatics pipeline set up herein for the analysis of the large complex SV in question using Nanopore MinION technology. The best identification results were obtained via the synergy of reference alignment and de novo assembly approaches.
Fig. 4. Visualization of the large complex SV in question, following the application of long reads sequencing. (A) The gain in read coverage of 7q11.21q11.23q21.11 obtained with the Minimap2 aligner and the NGMLR mapper, and the detected SVs. (B) The Ribbon view of the reads spanning the 7.7 Mb large inversion. (C.1) The Ribbon and (C.2) IGV views of the reads spanning the 15.8 kb large inversion.
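A gain in read coverage of the kind shown in panel (A) is commonly translated into a copy-number estimate by comparing the depth in the region with the depth of a diploid baseline. The following minimal sketch shows that arithmetic. The depths are invented, the function names are hypothetical, and it assumes a diploid baseline in which a duplication on one homologue gives three copies in total and a triplication on one homologue gives four.

```python
import math


def estimate_copy_number(region_depth: float, baseline_depth: float, ploidy: int = 2) -> float:
    """Scale the depth ratio by the baseline ploidy to get an estimated copy number."""
    if baseline_depth <= 0:
        raise ValueError("baseline_depth must be positive")
    return ploidy * region_depth / baseline_depth


def log2_ratio(region_depth: float, baseline_depth: float) -> float:
    """log2 of region depth over baseline depth, the scale used for aCGH-style plots."""
    if region_depth <= 0 or baseline_depth <= 0:
        raise ValueError("depths must be positive")
    return math.log2(region_depth / baseline_depth)


if __name__ == "__main__":
    baseline = 20.0  # hypothetical mean depth over copy-neutral sequence
    for label, depth in [("duplication", 30.0), ("triplication", 40.0)]:
        cn = estimate_copy_number(depth, baseline)
        print(f"{label}: copy number ~{cn:.1f}, log2 ratio {log2_ratio(depth, baseline):.3f}")
```

Under these assumptions a duplication corresponds to a depth ratio of 1.5 and a triplication to a ratio of 2.0. Real coverage is noisy and affected by mappability and GC content, so depth is usually averaged over windows before such a ratio is computed.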