Chao Xu, Jigang Zhang, Yu-Ping Wang, Hong-Wen Deng, Jian Li.
Abstract
As an important subtype of structural variation, chromosomal translocation is associated with various diseases, especially cancers, because it disrupts gene structure and function. Traditional methods for identifying translocations are time-consuming and have limited resolution. Recently, a few studies have employed next-generation sequencing (NGS) technology to characterize chromosomal translocations in the human genome, obtaining high-throughput results at high resolution. However, these studies focused mainly on mechanism-specific or site-specific translocation mapping. In this study, we conducted a comprehensive genome-wide analysis of human chromosomal material exchange with regard to chromosome translocations. Using NGS data of 1,481 subjects from the 1000 Genomes Project, we identified 15,349,092 translocated DNA fragment pairs, ranging from 65 to 1,886 bp, with an average size of approximately 102 bp. On average, each individual genome carried about 10,364 pairs, covering approximately 0.069% of the genome. We identified 16 translocation hot regions, two of which did not contain repetitive fragments. Our results overlapped with a majority of previous results, covering approximately 79% of the roughly 2,340 translocations characterized in three available translocation databases. In addition, our study identified five novel potential recurrent chromosomal material exchange regions with detection rates greater than 20%. Our results will be helpful for an accurate characterization of translocations in human genomes and will serve as a resource for future studies of the roles of translocations in human disease etiology and mechanisms.
Keywords: chromosomal translocation; next-generation sequencing; recurrent translocation; structural variation
Year: 2014 PMID: 25349267 PMCID: PMC4255766 DOI: 10.1093/gbe/evu234
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
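The per-individual figures quoted in the abstract can be cross-checked with simple arithmetic. The sketch below is only a rough consistency check, not the authors' calculation: the genome length `ASSUMED_GENOME_BP` is an assumption that does not come from the source, and the coverage estimate further assumes that each pair contributes two non-overlapping fragments of average size.

```python
# Figures quoted in the abstract.
total_pairs = 15_349_092      # translocated DNA fragment pairs across all subjects
subjects = 1_481              # subjects from the 1000 Genomes Project
mean_fragment_bp = 102        # approximate average fragment size

# ASSUMPTION (not from the source): approximate length of the human genome.
ASSUMED_GENOME_BP = 3.1e9

# Average number of fragment pairs carried by one individual.
pairs_per_subject = total_pairs / subjects

# ASSUMPTION: two fragments per pair, average size each, no overlap between fragments.
covered_bp = pairs_per_subject * 2 * mean_fragment_bp
coverage_percent = 100 * covered_bp / ASSUMED_GENOME_BP

print(f"pairs per subject: {pairs_per_subject:.0f}")
print(f"estimated genome coverage: {coverage_percent:.3f}%")
```

Under these assumptions the pair count per subject matches the reported value of about 10,364, and the coverage estimate lands near the reported figure of approximately 0.069%; the exact percentage depends on the genome length chosen.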
Figure: Size distribution of the identified translocated DNA fragments. X axis indicates the size groups of the identified translocated DNA fragments. Y axis represents the corresponding frequencies.
Figure: The numbers of translocated DNA fragment pairs in individual study subjects. X axis indicates the number of translocated DNA fragment pairs per subject. Y axis indicates the number of subjects.
Detection Rates of Chromosomal Material Exchanges in Known Recurrent Constitutional Translocations in Different Ethnic Groups
| Ethnic Group | t(11;22) (q23;q11) | t(8;22) (q24.13;q11.21) | t(4;8) (p16;p23) | t(4;11) (p16.2;p15.4) | t(4;8) (p16.2;p23.1) | t(8;12) (p23.1;p13.31) |
|---|---|---|---|---|---|---|
| AFR (%) | 18.07 | 6.72 | 53.36 | 1.68 | 2.94 | 15.55 |
| AMR (%) | 20.54 | 4.46 | 45.98 | 2.68 | 1.79 | 14.29 |
| ASN (%) | 15.88 | 2.65 | 36.76 | 2.35 | 2.35 | 11.47 |
| EUR (%) | 16.61 | 4.15 | 36.71 | 3.32 | 1.50 | 12.46 |
| SAN (%) | 18.18 | 5.19 | 35.06 | 0.00 | 6.49 | 15.58 |
| All (%) | 17.35 | 4.32 | 40.72 | 2.57 | 2.23 | 13.17 |
| P value | 0.1279 | 7.3 × 10⁻⁴ | 3.64 × 10⁻¹² | 1.86 × 10⁻⁶ | 4.55 × 10⁻¹¹ | 0.0429 |
Note.—AFR, Africa; AMR, admixed American; ASN, East Asian; EUR, European; SAN, South Asian; “All” is the sample with AFR, AMR, ASN, EUR and SAN combined; “P value” is the chi-square test P value for the rate differences among different populations.
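The table note says the P values come from a chi-square test for rate differences among populations. As a minimal sketch of that kind of test, the code below computes the Pearson chi-square statistic for a groups-by-carrier-status contingency table in pure Python. The carrier counts and group sizes are hypothetical, because the source reports only percentages, and the function names are illustrative. The closed-form tail probability used here holds only for even degrees of freedom, which covers the five-group case (4 degrees of freedom).

```python
import math


def chi_square_homogeneity(carriers, totals):
    """Pearson chi-square statistic for equal carrier rates across groups.

    carriers[i] is the number of subjects in group i with the event,
    totals[i] is the size of group i. Returns (statistic, degrees of freedom).
    """
    pooled_rate = sum(carriers) / sum(totals)
    stat = 0.0
    for c, n in zip(carriers, totals):
        expected_carriers = n * pooled_rate
        expected_non_carriers = n * (1 - pooled_rate)
        stat += (c - expected_carriers) ** 2 / expected_carriers
        stat += ((n - c) - expected_non_carriers) ** 2 / expected_non_carriers
    return stat, len(totals) - 1


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))


# HYPOTHETICAL counts for five groups (not taken from the paper).
carriers = [30, 25, 20, 22, 28]
totals = [200, 200, 200, 200, 200]

stat, df = chi_square_homogeneity(carriers, totals)
print(f"chi-square = {stat:.3f}, df = {df}, P = {chi2_sf_even_df(stat, df):.4f}")
```

With real data, the per-population carrier counts and sample sizes would replace the hypothetical lists; a statistics library would be the usual choice when the degrees of freedom are odd.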
Top 5 Novel Regions of Recurrent Chromosomal Material Exchange
| Region | Chr | Begin | End | gieStain | Gene | Chr | Begin | End | gieStain | Gene | Detection Rate (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 12 | 82958601 | 82958800 | gpos100 | | 16 | 14580801 | 14581000 | gpos50 | | 28.6 |
| 2 | 6 | 382201 | 382400 | gneg | | 16 | 33428201 | 33428600 | gneg | | 26.4 |
| 3 | 3 | 49783601 | 49783800 | gneg | | 5 | 87153001 | 87153200 | gpos100 | | 25.7 |
| 4 | 21 | 11022401 | 11022600 | acen | | 21 | 45991601 | 45991800 | gneg | | 23.0 |
| 5 | 11 | 61841601 | 61841800 | gpos25 | | 14 | 81786801 | 81787000 | gpos100 | | 21.1 |
Note.—Chr, chromosome; gieStain, Giemsa stain results: acen, pericentromeric region; gpos100 class consists of the darkest staining bands, with gpos75, gpos50 and gpos25 classes containing progressively lighter staining G-positive bands; gneg class consists of the nonstaining G-negative light bands (Furey and Haussler 2003).
Annotation of the Top 16 Translocation Hot Regions
| Hot Region | Chromosome | Begin | End | Size | gieStain | CDS | Gene | Repeats | Occurrence |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 12 | 66451361 | 66451530 | 170 | gpos50 | | | SINE, Simple | 8627 |
| 2 | 7 | 105741881 | 105741990 | 110 | gneg | CCDS47685.1 | | Simple | 3911 |
| 3 | 2 | 33141301 | 33141770 | 470 | gpos75 | | | Simple | 2833 |
| 4 | 6 | 160521751 | 160521890 | 140 | gneg | CCDS5273.1 | | SINE, LINE | 2700 |
| 5 | 1 | 231007791 | 231007920 | 130 | gpos50 | | | SINE, LTR | 2683 |
| 6 | 2 | 88206871 | 88207030 | 160 | gneg | | | LINE, Simple | 1936 |
| 7 | 2 | 238252121 | 238252260 | 140 | gneg | CCDS33412.1 | | SINE, Simple | 1642 |
| 8 | 9 | 140785301 | 140785680 | 380 | gneg | CCDS59523.1 | | | 1513 |
| 9 | 6 | 382041 | 382470 | 430 | gneg | | | | 1371 |
| 10 | 3 | 64682161 | 64682300 | 140 | gpos50 | | | SINE | 1262 |
| 11 | 4 | 1708921 | 1709060 | 140 | gneg | CCDS3350.1 | | SINE | 1206 |
| 12 | 1 | 62390831 | 62390970 | 140 | gpos50 | CCDS617.2 | | SINE, Simple | 1193 |
| 13 | 17 | 30276681 | 30276790 | 110 | gneg | CCDS11270.1 | | SINE | 1127 |
| 14 | 16 | 33428141 | 33428570 | 430 | gneg | | | LINE | 1116 |
| 15 | 5 | 141379101 | 141380040 | 940 | gneg | | | SINE, LINE | 1083 |
| 16 | 7 | 107410631 | 107410760 | 130 | gpos75 | CCDS5748.1 | | SINE, Simple | 1067 |
Note.—CDS, coding sequence ID in NCBI; Repeats, repetitive elements contained in the region, including simple repeats (Simple); SINE, short interspersed nuclear elements; LINE, long interspersed nuclear elements; LTR, long terminal repeat elements; Occurrence, the number of observations of translocations in the region.
GO Analysis for the Genes with Top 5% Translocation Occurrences
| Category | GO Term | GO ID | C | O | E | R | rawP | adjP |
|---|---|---|---|---|---|---|---|---|
| Biological Process | Translational initiation | GO:0006413 | 152 | 12 | 2.79 | 4.29 | 2.42 × 10⁻⁵ | 0.0142 |
| | Cellular macromolecular complex disassembly | GO:0034623 | 177 | 13 | 3.25 | 3.99 | 2.41 × 10⁻⁵ | 0.0142 |
| | Macromolecular complex disassembly | GO:0032984 | 182 | 13 | 3.35 | 3.88 | 3.23 × 10⁻⁵ | 0.0142 |
| | Nuclear-transcribed mRNA catabolic process, nonsense-mediated decay | GO:0000184 | 119 | 10 | 2.19 | 4.57 | 6.84 × 10⁻⁵ | 0.0226 |
| | Cellular protein complex disassembly | GO:0043624 | 156 | 11 | 2.87 | 3.83 | 0.0001 | 0.0264 |
| | Serine family amino acid metabolic process | GO:0009069 | 31 | 5 | 0.57 | 8.77 | 0.0002 | 0.0378 |
| | Protein complex disassembly | GO:0043241 | 161 | 11 | 2.96 | 3.72 | 0.0002 | 0.0378 |
| Molecular Function | Structural constituent of ribosome | GO:0003735 | 157 | 14 | 3.04 | 4.6 | 2.24 × 10⁻⁶ | 0.0006 |
| | Methyltransferase activity | GO:0008168 | 188 | 12 | 3.64 | 3.3 | 0.0003 | 0.0207 |
| | Immunoglobulin binding | GO:0019865 | 19 | 4 | 0.37 | 10.87 | 0.0004 | 0.0207 |
| | Transferase activity, transferring one-carbon groups | GO:0016741 | 194 | 12 | 3.76 | 3.19 | 0.0004 | 0.0207 |
| | mRNA binding | GO:0003729 | 91 | 7 | 1.76 | 3.97 | 0.0019 | 0.0492 |
Note.—C, the number of reference genes in the GO term; O, the number of genes in the gene set and also in the GO term; E, expected number in the GO term; R, ratio of enrichment (O/E); rawP, P value from hypergeometric test; adjP, P value adjusted by the multiple test adjustment.
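The note defines R as O/E and rawP as a hypergeometric test P value. The sketch below illustrates both quantities in pure Python. It is not the tool used in the study: the function names are illustrative, and the reference-set size and gene-set size are hypothetical values (the source does not report them), chosen only so that the expected count is close to the tabulated E for GO:0003735.

```python
from math import comb


def enrichment_ratio(observed, expected):
    """Ratio of enrichment R = O / E, as defined in the table note."""
    return observed / expected


def hypergeom_upper_tail(observed, reference_size, category_size, gene_set_size):
    """P(X >= observed) for X = number of category genes in the gene set,
    when gene_set_size genes are drawn without replacement from the reference."""
    max_overlap = min(category_size, gene_set_size)
    favourable = sum(
        comb(category_size, k) * comb(reference_size - category_size, gene_set_size - k)
        for k in range(observed, max_overlap + 1)
    )
    return favourable / comb(reference_size, gene_set_size)


# C and O for "Structural constituent of ribosome" (GO:0003735) from the table.
category_size = 157
observed = 14

# HYPOTHETICAL sizes, not reported in the source.
reference_size = 20_000
gene_set_size = 387

expected = gene_set_size * category_size / reference_size
print(f"E = {expected:.2f}, R = {enrichment_ratio(observed, expected):.2f}")
print(f"rawP = {hypergeom_upper_tail(observed, reference_size, category_size, gene_set_size):.2e}")
```

Because the two sizes are guesses, the printed P value should not be expected to reproduce the tabulated rawP; only the ratio O/E can be checked directly against the table.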
Figure: Heat map of the interchromosomal translocated fragment pairs. The color key ranges from light yellow (low enrichment scores) to red (high scores). The upper left panel is the histogram of the enrichment scores. Each cell in the heat map shows the enrichment score of interchromosomal translocated fragments between the corresponding chromosomes.