Kijong Yi1, Young Seok Ju2. 1. Graduate School of Medical Science and Engineering, Korea Advanced Institute of Science and Technology, Daejeon, 34141, Korea. 2. Graduate School of Medical Science and Engineering, Korea Advanced Institute of Science and Technology, Daejeon, 34141, Korea. ysju@kaist.ac.kr.
Abstract
Next-generation sequencing technology has enabled the comprehensive detection of genomic alterations in human somatic cells, including point mutations, chromosomal rearrangements, and structural variations (SVs). Using sophisticated bioinformatics algorithms, unbiased catalogs of SVs are emerging from thousands of human cancer genomes for the first time. Via careful examination of SV breakpoints at single-nucleotide resolution as well as local DNA copy number changes, diverse patterns of genomic rearrangements are being revealed. These "SV signatures" provide deep insight into the mutational processes that have shaped genome changes in human somatic cells. This review summarizes the characteristics of recently identified complex SVs, including chromothripsis, chromoplexy, microhomology-mediated break-induced replication (MMBIR), and others, to provide a holistic snapshot of the current knowledge on genomic rearrangements in somatic cells.
Cancer genomics has contributed to medical oncology by providing the genomic landscape and catalog of somatic mutations of human cancers. This information holds clinically actionable targets that may be used for personalized oncology and the development of new therapeutics. In addition, because the catalog of somatic mutations is a cumulative archeological record of all the mutational processes a cancer cell has experienced throughout the lifetime of a patient, it provides a rich source of information for biologists to understand the DNA damage and repair mechanisms that function in human somatic cells[1].
Genomic alterations in cancer cells fall into two major categories: (1) small variations that include single-nucleotide variants and short indels, and (2) large variations known as chromosomal rearrangements or structural variations (SVs). SVs are rearrangements of large DNA segments (for example, chromosomal translocations), occasionally accompanied by DNA copy number alterations (CNAs). Although no rule clearly distinguishes the “small” and “large” variation categories, researchers currently regard 50 bp as the tentative cutoff criterion[2]. Before the era of whole-genome sequencing (WGS), tentatively regarded as prior to 2010, the comprehensive detection of SV “breakpoints” (qualitative changes) was not feasible in cancer genomes. CNAs (quantitative changes) were relatively easier to assess using classical technologies, such as comparative genomic hybridization (CGH) and genotyping microarrays[3].
Because high-throughput DNA sequencing technologies produce unbiased sequences from whole genomes within a reasonable timeframe and at a reasonable cost (i.e., < 2000 USD and < 1 week for the production of 30 × WGS data from a tumor and paired normal tissue, as of Nov 2017), many research groups, in particular two large international consortia (The International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA)), have produced large-scale WGS data sets from a variety of common and rare tumor types during the last decade[4,5]. Various computational algorithms and tools have been developed for the sensitive and precise detection of SVs from the WGS data (reviewed in ref. [6,7]). These efforts have enabled the identification of driver SV events with remarkable functional consequences[8-14] and of mechanistic patterns of SVs, neither of which could be identified by classical technologies. For example, chromothripsis[15], a pattern exhibiting a massive number of localized SV breakpoints with extensive oscillation between two DNA copy number states, was observed in cancer genome sequences, and elucidation of its molecular mechanisms followed[16-20]. However, many features remain unexplored, such as the frequency, activating conditions, and molecular machineries that are associated with this complex event. Understanding the diverse patterns of SVs observed in genome sequences is the first step to answering these questions.
Historical overview of SVs in cancers: from cytogenetics to array CGH
The first insights into SVs in cancer cells were provided by Theodor Boveri in the early twentieth century[21] (Fig. 1). By examining dividing cancer cells under a microscope, he observed the presence of scrambled chromosomes associated with uncontrolled cell division. Following the discovery of the double helix DNA structure (1953)[22], abnormalities of the genome were proposed to cause many human diseases. For example, the trisomy of chromosome 21 in Down syndrome (1959)[23] and the recurrent translocation between chromosomes 9 and 22 (known as the Philadelphia chromosome; 1960) in chronic myelogenous leukemia (CML) were found using cytogenetics technologies[24]. As the resolution of fluorescence in situ hybridization (FISH) technology improved, the CML-causing BCR-ABL1 fusion gene in the Philadelphia chromosome was identified[25]. In parallel, quantitative FISH analyses showed that some genetic loci are markedly amplified from the normal two copies in cancer cells[26,27]. Further technical improvements, such as CGH[28], array CGH[29] and genotyping microarray[30,31], enabled genome-wide screening of CNAs in the 1990s and 2000s[27,28,32,33]. Many cancer genes have been found to be frequently amplified (e.g., MCL1, EGFR, MYC, and ERBB2) or deleted (e.g., CDKN2A/B, RB1, and PTEN) in cancer cells[34,35]. Indeed, genomic instability is one of the hallmarks of cancers[36,37].
Fig. 1
The history of structural variation research
Advances in hybridization technologies increased the resolution of CNA detection to ~ 1000 base pairs. However, regardless of the resolution, these methods only approximate the genomic locations of CNAs without giving an accurate determination of the breakpoint sequences. Moreover, detection of novel copy number-neutral SVs (for example, balanced inversions and translocations) is fundamentally impossible when using array technologies. In addition, hybridization technologies are not adequate for exploring repetitive genome sequences (e.g., transposable elements)[3]. In the 2010s, advances in sequencing technologies finally enabled comprehensive, fine-scaled SV detection[4,38,39].
Patterns and mechanisms of SVs
Conventionally, cytogenetic technologies categorized SVs into four simple types: (large) deletions, duplications (amplifications), translocations, and inversions (Fig. 2). By definition, deletions and duplications are accompanied by CNAs. By contrast, inversions and translocations can be copy number neutral (balanced inversion or translocation). However, whole-genome analysis has shown that many SVs are not independent events but are acquired by a “single-hit” event and are therefore complex genome rearrangements. In this section, we introduce typical patterns of complex rearrangements found in cancers.
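The four simple types described above can be told apart from the chromosome and orientation of the two ends joined at a breakpoint junction. The following minimal sketch illustrates that idea only. The `Junction` record and the strand convention ('+' meaning the retained DNA lies to the left of the breakpoint) are assumptions made for this example and are not taken from any specific SV caller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Junction:
    """One breakpoint junction (hypothetical record, for illustration only)."""
    chrom1: str
    pos1: int
    strand1: str  # '+': retained DNA lies left of pos1; '-': right of pos1
    chrom2: str
    pos2: int
    strand2: str  # same convention as strand1


def classify_junction(j: Junction) -> str:
    """Assign a simple rearrangement type from chromosome and orientation."""
    for strand in (j.strand1, j.strand2):
        if strand not in ("+", "-"):
            raise ValueError(f"unexpected strand: {strand!r}")
    if j.chrom1 != j.chrom2:
        return "translocation"
    # Order the two ends by coordinate so the orientation pair is unambiguous.
    (_, s_low), (_, s_high) = sorted([(j.pos1, j.strand1), (j.pos2, j.strand2)])
    if s_low == "+" and s_high == "-":
        return "deletion"            # DNA between the two ends is lost
    if s_low == "-" and s_high == "+":
        return "tandem_duplication"  # end of a segment joined to its own start
    if s_low == "+":
        return "head_to_head_inversion"
    return "tail_to_tail_inversion"


print(classify_junction(Junction("chr1", 100, "+", "chr1", 500, "-")))
```

Note that a single inversion-type junction does not by itself establish a balanced inversion; that requires both a head-to-head and a tail-to-tail junction at matching positions.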
Fig. 2
Types of basic genomic variations.
a Small mutations, base substitution and indels. b Simple structural variations, deletion, amplification, inversion, and interchromosomal translocation
Chromothripsis
Chromothripsis is a pattern of complex chromosomal rearrangement characterized by a massive number of SV breakpoints, sometimes > 100, which are densely clustered in mostly one or a few chromosomal arms[40] (Fig. 3a). The term chromothripsis means “chromosome shattering into pieces”, and the phenomenon was first described in 2011[15]. Overall, chromothripsis is found in ~ 3% of all tumors and is particularly frequent in bone tumors (osteosarcoma and chordoma; 25%) and brain tumors (10%)[15]. However, an accurate description of its prevalence and cancer type specificity remains largely elusive.
Fig. 3
Patterns and proposed mechanisms of structural variations.
a Chromothripsis, showing a shattering and subsequent repair process. Telomere crisis and/or micronuclei by chromosome mis-segregation may induce chromothripsis. b Chromoplexy, showing a “closed chain” (upper) in the Circos plot. This is a multi-chromosomal translocation (lower). c MMBIR by template switching of the replication machineries. d BFB cycle, showing subtelomeric copy number increases and fold-back inversions. Proposed mechanisms are shown below. e Different patterns of SVs in BRCA1- and BRCA2-mutant breast cancers. f Patterns and formation of DMs and neochromosomes. DNA fragments can self-ligate, forming a ring structure, and are amplified (DMs). Fragments capturing centromeres and telomeres become neochromosomes. g Patterns and processes of L1 retrotransposition in the cancer genome. h HPV integration and regional rolling-circle amplification
In the typical case of chromothripsis localized in a chromosome arm, a massive number of SV elements (breakpoints) consist of similar proportions of all intrachromosomal rearrangement types (i.e., deletion type, tandem duplication type, and head-to-head and tail-to-tail inversion types). The copy number of the involved chromosome arm usually oscillates between the normal and deleted copy number states. In addition, loss-of-heterozygosity (LOH) is frequently observed in the low-DNA copy number regions. The simplest model for explaining the chromothripsis pattern is that a single “catastrophic hit” shatters one or a few chromosome arms into hundreds of DNA segments simultaneously in an ancestral region of cancer cells, and DNA repair pathways (presumably non-homologous end-joining) reassemble the fragments in an incorrect order and orientation[15]. DNA segments that are not rejoined during the repair process result in deletions. Although such a scenario explains the features of chromothripsis, the nature of the catastrophic hit is not fully understood. 
At present, two non-mutually exclusive mechanisms have been experimentally shown: (1) telomere crisis with telomere shortening and end-to-end chromosomal fusions followed by the formation of a chromatin bridge[41], and (2) micronuclei formation due to mis-segregated chromosomes during mitosis[18].
A telomere is the DNA sequence region at the end of a chromosome that protects the chromosome. When telomeres are shortened, the ends of chromosomes (chromatids) can be fused, forming a dicentric chromosome that fails to segregate into daughter cells during mitosis. The fused sites are then stretched during the anaphase of mitosis[41], forming a chromatin bridge. Under certain circumstances, the bridge induces a partial rupture of the nuclear membrane in anaphase, and the nuclease activity of the 3′ repair exonuclease 1 (TREX1) generates extensive single-strand DNA and bridge breakage[42]. The frequently observed SV spectra in the daughter cells are genomic rearrangements recapitulating known features of chromothripsis combined with localized hyper-point mutations (kataegis)[42]. This mechanism explains why chromothripsis frequently occurs in the vicinity of telomeric regions.
Alternatively, a physical isolation of chromosomes in aberrant nuclear structures (micronuclei) was proposed as a possible mechanism of chromothripsis[18,19]. Micronuclei are frequently caused by errors in cell division, such as mis-segregation of intact chromosomes during mitosis[43] and acentric genome fragments from abnormal DNA replication/repair processes[19,20,44]. Molecular processes in micronuclei are known to be error prone; thus, isolated genetic materials are massively broken into pieces and reassembled[18,19,45]. The rejoined DNA fragments, showing chromothripsis-like features, can be fixed in a daughter cell.
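The hallmark features of chromothripsis described in this section (many clustered breakpoints, copy number oscillating between two states, and similar proportions of the intrachromosomal rearrangement types) can be expressed as a simple screening heuristic. The sketch below is illustrative only: the function names and all thresholds (`min_junctions`, `min_oscillations`, `max_type_share`) are hypothetical choices for this example, not published criteria, and a real analysis would also need to test breakpoint clustering and LOH.

```python
from collections import Counter
from typing import Sequence


def count_oscillations(copy_numbers: Sequence[int]) -> int:
    """Count copy number changes between adjacent segments along one arm."""
    return sum(1 for a, b in zip(copy_numbers, copy_numbers[1:]) if a != b)


def is_chromothripsis_like(
    copy_numbers: Sequence[int],
    junction_types: Sequence[str],
    min_junctions: int = 10,       # hypothetical threshold
    min_oscillations: int = 10,    # hypothetical threshold
    max_type_share: float = 0.4,   # hypothetical threshold
) -> bool:
    """Screen one chromosome arm for the features listed in the text."""
    if len(junction_types) < min_junctions:
        return False
    # Copy number should alternate between exactly two states.
    if len(set(copy_numbers)) != 2:
        return False
    if count_oscillations(copy_numbers) < min_oscillations:
        return False
    # No single rearrangement type should dominate the junctions.
    top_share = max(Counter(junction_types).values()) / len(junction_types)
    return top_share <= max_type_share


segments = [2, 1] * 6  # twelve segments alternating between two states
types = ["deletion", "tandem_duplication",
         "head_to_head_inversion", "tail_to_tail_inversion"] * 3
print(is_chromothripsis_like(segments, types))
```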
Chromoplexy
Chromoplexy is another pattern of complex rearrangements that has many interdependent SV breakpoints (mostly interchromosomal translocations) but usually fewer than chromothripsis. This phenomenon was identified in prostate cancer genomes[46]. Chromoplexy frequently disrupts tumor suppressor genes (e.g., PTEN, TP53, and CHEK2) and activates oncogenes by the formation of fusion genes (e.g., TMPRSS2-ERG) in this cancer type. The prevalence in prostate cancer is ~ 90%, but its prevalence in other cancer types has not yet been well characterized. Conceptually, chromoplexy is an extended version of balanced translocation that reshuffles multiple chromosomes (rather than two chromosomes, as in balanced translocations) in a new scrambled configuration (Fig. 3b). Therefore, SVs in a chromoplexy event usually involve multiple chromosomes (usually > 3), and its rearrangement pattern resembles a “closed chain”. Although small deletions can occasionally be combined in the vicinity of the breakpoints as a form of “deletion bridge”, a large fraction of SVs in a chromoplexy event is copy number neutral. Like chromothripsis, chromoplexy is readily explained by the presence of a catastrophic hit that produces multiple DNA double-strand breaks (DSBs). Unlike chromothripsis, multiple DSBs in chromoplexy are not confined to a chromosome arm but are rather distributed across many chromosomes[46].
Although the phenomenon is found in many common cancers (including prostate cancers, non-small cell lung cancers, head and neck cancers, and melanomas[46]) and rare solid cancers[47], the molecular basis of the catastrophic hit is unclear. The genome-wide distribution of DSBs in a chromoplexy event is not random but is enriched in actively transcribed and open chromatin regions[48-50]. This suggests that a nuclear transcription hub wherein many co-regulated genomic regions are spatially aggregated is fragmented by the catastrophic blow in chromoplexy[46].
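The “closed chain” pattern can be pictured as a cycle in a graph whose nodes are chromosomes and whose edges are translocation junctions. The following sketch looks for such cycles under strong simplifying assumptions: it works at whole-chromosome resolution, ignores breakpoint positions and deletion bridges, and uses a hypothetical `min_chromosomes` parameter. It is an illustration of the concept, not the statistical procedure used in the cited studies.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple


def find_closed_chains(
    translocations: Iterable[Tuple[str, str]],
    min_chromosomes: int = 3,  # hypothetical cutoff for calling a chain
) -> List[List[str]]:
    """Return groups of chromosomes linked into a single closed chain."""
    neighbours = defaultdict(list)
    for chrom_a, chrom_b in translocations:
        neighbours[chrom_a].append(chrom_b)
        neighbours[chrom_b].append(chrom_a)

    seen = set()
    chains = []
    for start in neighbours:
        if start in seen:
            continue
        # Collect the connected component containing `start`.
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(neighbours[node])
        seen |= component
        # A connected component in which every chromosome takes part in
        # exactly two junctions forms one closed chain (a simple cycle).
        closed = all(len(neighbours[c]) == 2 for c in component)
        if closed and len(component) >= min_chromosomes:
            chains.append(sorted(component))
    return chains


print(find_closed_chains([("chr1", "chr5"), ("chr5", "chr12"), ("chr12", "chr1")]))
```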
Microhomology-mediated break-induced replication
The basic mechanisms of chromothripsis and chromoplexy are massive “shatter-and-stitch” processes of the genome. In these mechanisms, copy number gains of DNA segments are rarely observed. Cancer genomes frequently harbor another pattern of complex rearrangements, demonstrating a massive number of interspersed copy number gains (amplifications) of one parental allele without evidence of LOH. These amplicons are directly interconnected with frequent templated insertions and common microhomologies (2–15 bp) at breakpoint junctions. These features suggest a replication-based mechanism for the acquisition of extra DNA copies, with frequent template switching of the DNA replication complex for the rearrangement (Fig. 3c). The replication-based model, termed microhomology-mediated break-induced replication (MMBIR), was initially suggested to explain the patterns of germline CNAs[51,52]. Presumably, translesion DNA polymerases, such as Polζ and Rev1, are responsible for MMBIR[53].
The cellular conditions that induce MMBIR are not fully understood. Presumably, collapse of a replication fork due to a single-strand DNA break and/or a bulky DNA adduct in the template DNA (collectively referred to as replication stress) interferes with normal DNA replication and stimulates template switching[54,55]. Normally, template switching contributes to the repair of broken replication forks using a sister chromatid. However, the process is a double-edged sword that may lead to chromosomal rearrangements when non-allelic chromosomal regions are selected as the template. A lack of Rec/RAD proteins (e.g., RAD51) due to persistent replication stress has been reported to trigger MMBIR[51,56].
Breakage-fusion-bridge cycle
The breakage-fusion-bridge (BFB) cycle, first discovered by Barbara McClintock[57] in 1939, is a recursive cycle in which a dicentric chromosome is generated by telomere fusion and then breaks when the two centromeres are pulled apart in anaphase (Fig. 3d). As multiple DSBs occur at random positions between the two centromeres over a few cell cycles, the BFB cycle leaves typical patterns of rearrangements, including (1) a stair-like increase in copy number in subtelomeric regions[58] (reviewed in ref. [41]) and (2) an enrichment of fold-back inversions at the breakpoints. BFB cycle-mediated SVs have been well demonstrated in a subtype of acute lymphoblastic leukemia, which exhibits intrachromosomal amplification of chromosome 21 involving RUNX1 gene alteration[59,60].
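The two footprints of the BFB cycle named above lend themselves to simple checks. In the sketch below, a fold-back inversion is treated as an inversion-type junction (both ends with the same orientation) whose two breakpoints lie close together on one chromosome, and a staircase is a copy number profile that never decreases and ends higher than it starts. The function names and the `max_span_bp` value are hypothetical, and the caller is assumed to supply segments already ordered in the direction of interest.

```python
from typing import Sequence


def is_fold_back_inversion(chrom1: str, pos1: int, strand1: str,
                           chrom2: str, pos2: int, strand2: str,
                           max_span_bp: int = 5_000) -> bool:
    """Inversion-type junction whose two breakpoints are nearly adjacent.

    max_span_bp is a hypothetical cutoff chosen for illustration.
    """
    same_chromosome = chrom1 == chrom2
    inversion_type = strand1 == strand2  # '++' or '--' orientation
    return same_chromosome and inversion_type and abs(pos1 - pos2) <= max_span_bp


def is_staircase_increase(copy_numbers: Sequence[int]) -> bool:
    """True if copy number never drops and rises overall along the segments."""
    steps = list(zip(copy_numbers, copy_numbers[1:]))
    if not steps:
        return False
    return all(b >= a for a, b in steps) and copy_numbers[-1] > copy_numbers[0]


print(is_fold_back_inversion("chr21", 1_000, "+", "chr21", 3_500, "+"))
print(is_staircase_increase([2, 3, 3, 5]))
```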
Homologous recombination repair defect
Homologous recombination (HR) is a basic cellular mechanism to repair DSBs using identical or similar DNA sequences[61]. The basic steps of HR are (1) resection of the 5′ extremes of DSBs, (2) invasion of overhanging 3′ ends to a similar or identical DNA segment, and (3) DNA repair using one of two pathways: double-Holliday junction (reviewed in ref. [62]) or synthesis-dependent strand annealing (reviewed in ref. [62,63]).
A defect in HR (for example, BRCA1 and BRCA2 inactivation) causes genomic instability and increases the incidence of breast and ovarian cancers[64,65]. Complete inactivation of the BRCA1 and/or BRCA2 genes is found in 7% of all breast cancers[66], with an enrichment in the triple-negative breast cancer subtype[67]. BRCA gene-mutant breast cancers have a much higher burden of genome-wide SVs compared to ordinary breast cancers[68]. Interestingly, specific patterns of SVs are found according to the inactivated genes (Fig. 3e). For example, BRCA1-inactive cancers dominantly harbor short (< 10 kb) tandem duplications, but BRCA2-mutant cancers primarily show deletions[68]. Generally, BRCA1 recognizes DNA double-strand breaks along with ATM, TP53, and CHEK2 in the HR pathway. BRCA2 has an important role in the loading of RAD51[69,70], which is necessary for strand invasion after 5′-end resection[71].
The HR defect has been of interest in clinical research fields because HR-defective cancers are susceptible to targeted therapies (PARP inhibitors) that inhibit the base excision repair pathway. This strategy aims to trigger additional genomic instability in HR-defective cancer cells (but not in normal cells), which leads to cancer cell death[72]. Breast cancer patients with germline BRCA1/BRCA2 mutations respond well to PARP inhibitor therapy[73,74].
Double-minute chromosome and neochromosome
Double-minute chromosomes (DMs) are aberrant genomic segments in a small circular form that are self-replicable but lack a centromere (Fig. 3f). DMs are often massively amplified in various solid and hematologic cancer cells[75]. DMs are detected in ~ 40% of glioblastomas, and some oncogenes, such as CDK4, MDM2, and EGFR, are frequently co-amplified in DMs[76,77]. DMs are important in tumorigenesis and tumor clonal evolution[78,79]. DM segments can be derived from DNA fragments that fail to be reassembled during chromothripsis[15].
Neochromosomes are aberrant genomic segments in either circular or linear forms. Unlike DMs, neochromosomes harbor a centromeric structure and (if linear) telomeric regions (Fig. 3f). Neochromosomes are observed in ~ 3% of all cancers and are especially frequent in a subset of mesenchymal tumors, including parosteal osteosarcomas (90%), atypical lipomatous tumors (85%), dedifferentiated liposarcomas (82%), and dermatofibrosarcoma protuberans (67%)[80]. The formation process of neochromosomes has been elucidated in detail from liposarcoma genomes[81]. Like DMs, neochromosomes begin as circular DNA structures. The intermediate structures subsequently capture centromeres and are finally linearized by the acquisition of telomeres at both ends due to concurrent rearrangements, including chromothripsis- and BFB cycle-like processes.
Transposition of mobile elements
Transposable elements (TEs) are repetitive DNA sequences that occupy 45% of the human genome[82]. In the human genome, these elements are successful parasitic units that have important roles in genome evolution by generating SVs via “cutting-and-pasting” (DNA transposons) or “copying-and-pasting” themselves (retrotransposons)[83]. Most of the TEs in human genomes are now truncated and inactive in both germline and somatic lineages. For example, of the 500,000 copies of the L1 retrotransposons[84,85] in the human genome, only ~ 100 L1 copies have intact open reading frames and are potentially capable of retrotransposition. In cancer cells, retrotranspositions of L1 are frequently observed (Fig. 3g)[86,87] in ~ 50% of pan-cancer tissues[86,88], with a high enrichment in esophageal cancers (> 90%), colon cancers (> 90%) and squamous cell lung cancers (> 90%)[86,88]. L1 retrotransposition is carried out by transcription, processing, reverse transcription, and novel insertion[89]. In some cases, hundreds of somatic retrotranspositions are observed in a cancer cell. In addition, L1 retrotranspositions occasionally carry adjacent non-repetitive DNA sequences (termed transduction), which can widely scatter genes, exons and regulatory elements across the genome[86]. The functional impacts of retrotranspositions in the pathogenesis of cancers are emerging[90]. The retrotranspositional insertion sites are enriched in the heterochromatin and hypomethylated regions[91], and cancer-related genes are sometimes affected[87,90,92-94].
Insertion of external DNA sequences
In addition to reshuffling of the nuclear genomes mentioned above, cancer cells may acquire completely new extranuclear DNA sequences from viruses, mitochondria[95,96] and bacteria[97,98]. For example, the vast majority of uterine cervical cancers (> 95%) and a substantial fraction of head and neck cancers (12%) contain human papillomavirus (HPV) DNA sequences in their genome[99]. HPV genome integration is involved in direct tumorigenesis (e.g., inhibition of the p53 pathway by the HPV oncoprotein E6[100]) and in the induction of genomic instability[101]. For example, the insertional sites of HPV are frequently amplified[102] by the “loop-mediated mechanism”[101] (Fig. 3h). In brief, the insertional regions tend to form a loop structure, which is susceptible to amplification during DNA replication. As a result, genomic DNA segments flanked by viral insertions can be massively amplified, occasionally by > 50 copies, which leads to upregulation of the viral oncoprotein and co-amplified adjacent gene products[101].
Intracellular nuclear transfers of full or partial mitochondrial DNA sequences are also observed in cancer genomes[95,103-105]. The prevalence of this event is ~ 2% of all cancers, with an enrichment in skin, lung, and breast cancers[96]. However, the molecular mechanism by which mitochondrial DNA is mobilized and inserted into nuclear genomes has not been fully elucidated. Most somatic nuclear integrations of mitochondrial DNA do not occur alone but are frequently combined with other complex rearrangements, suggesting that mitochondrial DNA fragments could be used as a “filler material” or a string for weaving broken nuclear DNA segments together during the DNA repair processes in somatic cells[106].
Comprehensive signatures of SVs
Beyond the rearrangement patterns mentioned above, additional mechanisms presumably remain undetermined. Many ongoing efforts are being carried out to reveal comprehensive SV mutational signatures in cancer genomes. For example, > 30 mutational signatures have been revealed for point mutations from the statistical analysis of large catalogs of mutations[107]. Similar concepts have been applied to SVs in breast cancer genomes by clustering genome-wide SVs according to their features, such as local proximities, rearrangement class (tandem duplication, deletion, inversion, and translocation), and rearrangement size[68]. The analysis yielded six rearrangement signatures: (1) large (> 100 kb) tandem duplication, (2) dispersed translocation, (3) small tandem duplication, (4) clustered translocation, (5) deletion, and (6) other clustered rearrangements. Among these signatures, tandem duplications (SV signatures 1 and 3) are thought to occur due to HR deficiency[108]. In a similar manner, Li et al.[109] identified nine SV signatures from a cohort of > 2500 cancer genomes. Using this classification, they inferred that a considerable proportion of rearrangements are caused by replication-based mechanisms.
Large-scale genome studies have revealed that SVs are not evenly distributed across the genome. The density of SVs is affected by local genome and epigenome features as well as by 3D genome conformation[110-112]. For example, local rearrangement rates are affected by replication time, transcription rate, GC content, methylation status[113,114], and chromosomal fragile sites[115,116], including chromosome loop anchor sites[117]. More systematic analyses combining genome and epigenome features from a larger cohort will likely yield a better definition of the structural variation signatures and additional mutational processes in human cancers.
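Signature analyses of this kind start from a per-sample catalog in which each SV is assigned to a category defined by the features listed above: clustered or dispersed, rearrangement class, and size. The sketch below builds such a catalog. The size bin edges, the category labels, and the assumption that each SV arrives with a precomputed `clustered` flag are illustrative choices for this example; they are not claimed to reproduce the exact scheme of the cited studies, and the later decomposition of catalogs into signatures is not shown.

```python
import bisect
from collections import Counter
from typing import Iterable, Optional, Tuple

# Illustrative size bins (bp); the edges are assumptions for this sketch.
SIZE_EDGES_BP = [10_000, 100_000, 1_000_000, 10_000_000]
SIZE_LABELS = ["<10kb", "10-100kb", "100kb-1Mb", "1-10Mb", ">10Mb"]


def sv_category(sv_class: str, size_bp: Optional[int], clustered: bool) -> str:
    """Map one SV to a category label such as 'dispersed:deletion:<10kb'."""
    prefix = "clustered" if clustered else "dispersed"
    if sv_class == "translocation":
        # Interchromosomal events have no meaningful size.
        return f"{prefix}:translocation"
    if size_bp is None:
        raise ValueError("size_bp is required for intrachromosomal SVs")
    label = SIZE_LABELS[bisect.bisect_right(SIZE_EDGES_BP, size_bp)]
    return f"{prefix}:{sv_class}:{label}"


def build_catalog(svs: Iterable[Tuple[str, Optional[int], bool]]) -> Counter:
    """Count SVs per category for one tumor sample."""
    return Counter(sv_category(*sv) for sv in svs)


sample = [
    ("tandem_duplication", 150_000, False),
    ("tandem_duplication", 4_000, False),
    ("deletion", 2_500, False),
    ("translocation", None, True),
]
for category, count in sorted(build_catalog(sample).items()):
    print(category, count)
```

Stacking such counters across many tumors gives the sample-by-category matrix on which signature extraction operates.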
Functional consequences of SVs
SVs have functional consequences in tumorigenesis and clonal evolution via at least four direct mechanisms (Table 1): (1) truncation of genes (for example, deletion or gene disruption)[8,118], (2) amplification of whole genes, which raises their expression levels by the “dosage effect”, (3) fusion gene formation (for example, BCR-ABL in CML and EML4-ALK in lung cancers) and (4) rearrangement of gene-regulatory elements (‘enhancer hijacking’)[2]. The first three mechanisms are conventional, and evidence for the fourth mechanism is actively emerging. Examples of enhancer hijacking, which alters the expression of cancer genes, including IRS4, SMARCA1, and TERT, have been reported[119]. In breast cancers, breast tissue-specific regulatory regions are recurrently duplicated[120], suggesting that positive selection pressures are strongly present. Similarly, many non-coding SVs may affect the expression of adjacent or distant genes by mobilizing regulatory regions or expression quantitative trait loci[121,122]. More specifically, an experiment has shown that rearrangements involving a topologically associating domain boundary can alter gene expression by altering the 3D genome structures that are involved in regulating gene expression[123].
Table 1
Selected examples of genes altered by structural variation in cancers
Type | Malignancy | Affected gene (prevalence)
Deletion | Retinoblastoma | RB (~ 100%)[131]
Deletion | Renal cell carcinoma | VHL (90% of clear cell type)
Amplification | Invasive breast carcinoma | ERBB2 (18–25%)
Amplification | Neuroblastoma | MYCN (20–25%)
Amplification | Diffuse large B cell lymphoma | BCL2 (31%)
Amplification | Acute myeloid leukemia | MLL, ALL1 (5–10%)
Amplification | Gastric adenocarcinoma | FGF4 (7%)
Fusion gene | Lung adenocarcinoma | ALK (3.5%)
Fusion gene | Prostatic adenocarcinoma | TMPRSS2-ETS family (29–43%)
Fusion gene | Chronic myelogenous leukemia | ABL-BCR (~ 100%)
Fusion gene | Burkitt lymphoma | MYC-IGH (~ 100%)
Fusion gene | Ewing sarcoma | EWSR1-FLI1 (~ 100%)
Enhancer hijacking | Medulloblastoma | GFI1 family gene activation[132]
Enhancer hijacking | Salivary gland adenoid cystic carcinoma | MYB gene overexpression[133]
Enhancer hijacking | T-lymphoblastic leukemia/lymphoma | TAL1 overexpression (30%)[134,135]
Enhancer hijacking | Lung squamous cell carcinoma | IRS4 overexpression[119]
Future direction and conclusion
The revolution of WGS provides an unbiased and comprehensive catalog of SVs in human cancer cells. Via a systematic, in-depth analysis of SV breakpoints, unique patterns and their underlying mutational processes are now emerging. However, current predominant WGS platforms producing short reads (< 500 bp) provide fragmented data that are of limited use for the direct phasing of SV breakpoints. Despite many bioinformatic and statistical algorithms, the seamless reconstruction of final reassembled chromosomes is sometimes impossible with short read sequences, especially when the SVs are highly complex. In addition, SVs involving highly repetitive regions (for example, telomeres, centromeres, and simple repeats) cannot be fully explored using these technologies. To this end, the combination of long read sequences (for example, from the PacBio platform) and high-resolution cytogenetics data will be helpful. Alternatively, Hi-C can be used to detect SVs in a high-throughput manner[124], although the cost efficiency could be an issue. If culturing somatic cells in vitro is possible, the Strand-seq[125] technique can provide fully phased data even if the subjects are not diploid. Single-cell genome sequencing is also a promising technology. For example, single-cell whole-genome sequencing could determine the exact timing of an SV per cell cycle[126].
Apart from the technical limitations of DNA sequencing, the accurate molecular mechanisms of SVs are difficult to elucidate because tissue sequencing primarily reflects only the terminal results of SVs. Although we can observe DSBs under a microscope[127] or with special sequencing technology (e.g., END-seq[128]), observing the DSBs and the final rearrangements (outcome) at the sequence level in the same cell is currently impossible. Well-designed experiments and analyses are needed to bridge this gap.
Understanding the functional consequences of SVs and their association with drug efficacy is important for precision medicine. For accurate functional analyses, the “genome sequencing-only” approach is limited, and the integration of multiomics data, such as genome, transcriptome, and epigenome data, is needed. Data representing the association between gene expression and variation in the genome are being collected in the GTEx project[121]. Information on the regulatory regions of the genome and the genomic regions interacting with them is actively accumulating in the ENCODE[129] and FANTOM projects[130,131]. By integrating these data sets, we will be able to comprehensively interpret the functional consequences of genome SVs and further advance precision oncology in the near future.