Abstract
Long inverted repeats (LIRs) have been shown to induce genomic deletions in yeast. In this study, LIRs were investigated within ±10 kb of each breakpoint from 109 human gross deletions, using Inverted Repeat Finder (IRF) software. LIR number was significantly higher at the breakpoint regions than in control segments (P < 0.001). In addition, a strong correlation was found between 5' and 3' LIR numbers, suggesting a contribution to DNA sequence evolution (r = 0.85, P < 0.001). A total of 138 LIR features within ±3 kb of the breakpoints in 89 (81%) of the 109 gross deletions were evaluated. Significant correlations were found between distance from breakpoint and loop length (r = -0.18, P < 0.05) and stem length (r = -0.18, P < 0.05), suggesting that DNA strands are potentially broken at locations closer to bigger LIRs. In addition, bigger loops cause larger deletions (r = 0.19, P < 0.05). Moreover, loop length (r = 0.29, P < 0.02) and identity between stem copies (r = 0.30, P < 0.05) of 3' LIRs were more important in larger deletions. Consequently, DNA breaks may form via LIR-induced cruciform structures during replication. DNA ends may later be repaired by non-homologous end-joining (NHEJ), resulting in deletion.
Year: 2015 PMID: 25657065 PMCID: PMC4319165 DOI: 10.1038/srep08300
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. LIR identification and selection were performed in 218 genomic regions including 5′ and 3′ breakpoints from 109 gross deletions involving 63 human genes using IRF software.
(a) DNA sequences from 5′ and 3′ deletion BPs were obtained from HGMD and GRaBD. Each BP DNA sequence was compared with the corresponding gene using NCBI BLAST, and deletion BP locations were determined in the related genes. The 3′ BP sequence from the BRCA1 gene is shown to illustrate the BLAST comparison process. (b) LIR identification was done within the ±10 kb flanking sequences of each 5′ and 3′ deletion BP and in 20 kb DNA fragments from control groups using IRF. LIRs with SL > 20 bp, IS of 0–10 kb, and SID ≥ 70% were included for comparing total LIR number between deletion and control groups using the Mann-Whitney U test. (c) LIR selection was made within the ±3 kb flanking sequences of each 5′ and 3′ BP in the deletion group. LIRs with SL > 20 bp, IS of 0–2.5 kb, and SID ≥ 70% were selected for analysing correlations between LIR features, distance from breakpoint and deletion size using Pearson's coefficient. Abbreviations: Bp, base pair; BP, breakpoint; GRaBD, gross rearrangement breakpoint database; HGMD, human gene mutation database; IRF, inverted repeat finder; IS, internal spacer; kb, kilobase; LIR, long inverted repeat; SID, stem identity; SL, stem length.
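The two sets of inclusion criteria in panels (b) and (c) can be written as a simple filter. The following is a minimal sketch, not the authors' pipeline: the `LIR` and `Criteria` types, the preset names, and the way distance to the breakpoint is represented (a single non-negative number of bp) are assumptions made for illustration. Only the thresholds come from the caption, and the two example records reuse the PINK1 and ATM values reported in Figure 2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LIR:
    """One long inverted repeat (hypothetical record layout)."""
    stem_length: int       # SL, in bp
    internal_spacer: int   # IS (loop length), in bp
    stem_identity: float   # SID, in percent
    distance_to_bp: int    # assumed: non-negative distance from the breakpoint, in bp


@dataclass(frozen=True)
class Criteria:
    """Inclusion thresholds for one analysis step."""
    min_stem_exclusive: int  # SL must be strictly greater than this
    max_spacer: int          # IS must lie in 0..max_spacer
    min_identity: float      # SID must be at least this
    window: int              # half-width of the flanking region, in bp


# Panel (b): LIR counts, deletion vs. control (±10 kb, IS 0-10 kb).
COMPARISON = Criteria(min_stem_exclusive=20, max_spacer=10_000,
                      min_identity=70.0, window=10_000)
# Panel (c): correlation analysis (±3 kb, IS 0-2.5 kb).
CORRELATION = Criteria(min_stem_exclusive=20, max_spacer=2_500,
                       min_identity=70.0, window=3_000)


def passes(lir: LIR, criteria: Criteria) -> bool:
    """Return True if the LIR satisfies every threshold in `criteria`."""
    return (lir.stem_length > criteria.min_stem_exclusive
            and 0 <= lir.internal_spacer <= criteria.max_spacer
            and lir.stem_identity >= criteria.min_identity
            and lir.distance_to_bp <= criteria.window)


def select(lirs: list[LIR], criteria: Criteria) -> list[LIR]:
    """Keep only the LIRs that pass, preserving input order."""
    return [lir for lir in lirs if passes(lir, criteria)]


# Feature values taken from the Figure 2 caption.
pink1 = LIR(stem_length=292, internal_spacer=1736,
            stem_identity=82.65, distance_to_bp=1785)
atm = LIR(stem_length=291, internal_spacer=207,
          stem_identity=75.07, distance_to_bp=222)
selected_for_correlation = select([pink1, atm], CORRELATION)
```

Keeping the thresholds in named presets makes the difference between the two steps explicit: the correlation step uses the same SL and SID cut-offs but a narrower window and a shorter maximum spacer.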
5′ and 3′ breakpoint locations from gross gene deletions and detection of long inverted repeats*
| GENE/DELETION SIZE | BREAKPOINT LOCATION/NCBI ACCESSION NUMBER | 5′LIR/3′LIR | GENE/DELETION SIZE | BREAKPOINT LOCATION/NCBI ACCESSION NUMBER | 5′LIR/3′LIR |
|---|---|---|---|---|---|
| 1- | (3′) 16223-30238 (5′)/NT_010755.15 | +/+ | 32- | (5′) 475951-1138023 (3′)/NT_030059.12 | +/− |
| 2- | (3′) 59697-63533 (5′)/NT_010755.15 | +/+ | 33- | (5′) 607242-620845 (3′)/NT_030059.12 | +/− |
| 3- | (5′) 29197-65577 (3′)/L78833.1 | +/+ | 34- | (5′) 612328-912640 (3′)/NT_030059.12 | −/− |
| 4- | (5′) 51483-71369 (3′)/L78833.1 | +/+ | 35- | (3′) 131486-199212 (5′)/AC000134.14 | +/+ |
| 5- | (5′) 56990-75331 (3′)/L78833.1 | +/+ | 36- | (5′) 62499-68391 (3′)/U78027.1 | +/+ |
| 6- | (5′) 62115-66940 (3′)/L78833.1 | +/+ | 37- | (5′) 53257-62324 (3′)/U01317.1 | +/− |
| 7- | (5′) 3806-8872 (3′)/NC_000013.9 | +/+ | 38- | (5′) 73863-77312 (3′)/U82828.1 | +/+ |
| 8- | (5′) 12323-26644 (3′)/AY436640.1 | +/− | 39- | (5′) 99957-136583 (3′)/AC079775.6 | +/+ |
| 9- | (5′) 27566-35513 (3′)/AY436640.1 | −/+ | 40- | (5′) 104712-106890 (3′)/AC079775.6 | +/+ |
| 10- | (5′) 45138-55975 (3′)/AY436640.1 | +/+ | 41- | (5′) 105900-121280 (3′)/AC079775.6 | +/+ |
| 11- | (5′) 56447-61399 (3′)/AY436640.1 | +/+ | 42- | (5′) 124511-131175 (3′)/AC079775.6 | +/+ |
| 12- | (3′) 176468-210312 (5′)/NT_008413.17 | −/+ | 43- | (3′) 86839-92892 (5′)/AC006583.31 | +/+ |
| 13- | (5′) 238391-338000 (3′)/NT_010799.14 | −/+ | 44- | (3′) 93192-96718 (5′)/AC006583.31 | −/+ |
| 14- | (5′) 347607-359613 (3′)/NT_010799.14 | +/− | 45- | (5′) 145570-159901 (3′)/AC011816.17 | +/+ |
| 15- | (5′) 305422-357770 (3′)/NT_024524.13 | −/− | 46- | (5′) 147927-160607 (3′)/AC011816.17 | +/+ |
| 16- | (5′) 312336-490300 (3′)/NT_024524.13 | +/− | 47- | (5′) 5698337-5709648 (3′)/NT_034483.3 | +/+ |
| 17- | (5′) 356866-558723 (3′)/NT_024524.13 | +/− | 48- | (5′) 5699487-5721160 (3′)/NT_034483.3 | +/+ |
| 18- | (5′) 365612-405655 (3′)/NT_024524.13 | −/+ | 49- | (5′) 53291-54449 (3′)/AC005995.3 | +/+ |
| 19- | (5′) 378370-382275 (3′)/NT_024524.13 | +/− | 50- | (3′) 449987-457933 (5′)/NT_010194.16 | −/+ |
| 20- | (5′) 462638-465032 (3′)/NT_024524.13 | +/− | 51- | (5′) 297453-310733 (3′)/NT_011786.15 | +/+ |
| 21- | (5′) 384224-819689 (3′)/NT_034772.5 | −/+ | 52- | (5′) 314410-314569 (3′)/NT_011786.15 | −/− |
| 22- | (5′) 529476-1266975 (3′)/NT_034772.5 | −/− | 53- | (5′) 191462-198534 (3′)/NT_011295.10 | +/+ |
| 23- | (5′) 417616-426044 (3′)/NT_037887.4 | +/− | 54- | (3′) 750100-1752166 (5′)/NT_009237.17 | −/+ |
| 24- | (5′) 432071-442186 (3′)/NT_037887.4 | +/− | 55- | (3′) 905528-1741689 (5′)/NT_009237.17 | −/− |
| 25- | (5′) 437300-438216 (3′)/NT_037887.4 | +/+ | 56- | (3′) 204471-936631 (5′)/NT_026437.11 | −/− |
| 26- | (3′) 177985-185621 (5′)/NT_035014.4 | −/+ | 57- | (5′) 9489-10630 (3′)/AY316591.1 | +/− |
| 27- | (3′) 500551-503808 (5′)/NT_011362.9 | +/+ | 58- | (5′) 63641-69726 (3′)/NG_008193.1 | −/+ |
| 28- | (3′) 32504-39823 (5′)/NC_000011.8 | −/+ | 59-F 8/19.9 kb | (5′) 105125-125067 (3′)/AY769950.1 | +/− |
| 29-DMD/7.8 kb | (3′) 152815-160602 (5′)/NT_011757.15 | +/− | 60-F 8/29.2 kb | (5′) 29349-58524 (3′)/AY769950.1 | −/− |
| 30-DMD/3.8 kb | (3′) 155384-159231 (5′)/NT_011757.15 | −/+ | 61-STS/40.1 kb | (5′) 32594-72652 (3′)/NT_011757.15 | −/− |
| 31-PTEN/453 kb | (5′) 452645-905607 (3′)/NT_030059.12 | +/+ | 62-WAS/1.9 kb | (5′) 2932-4803 (3′)/AF115549.2 | +/+ |
| 63- | (5′) 20561-81606 (3′)/NG_009072.1 | +/+ | 87- | (5′) 184647-186177 (3′)/NC_000007.12 | −/− |
| 64- | (5′) 146608-148929 (3′)/NG_009072.1 | −/+ | 88- | (5′) 18350-39431 (3′)/NC_000007.12 | +/+ |
| 65- | (5′) 8039-9822 (3′)/NG_007098.2 | +/+ | 89- | (3′) 1335845-1337134 (5′)/NT_010823.11 | +/− |
| 66- | (5′) 23935-30628 (3′)/NG_008691.1 | +/+ | 90- | (5′) 11200-32195 (3′)/NG_007069.1 | −/− |
| 67- | (5′) 101289-131606 (3′)/AC013717.8 | +/+ | 91- | (5′) 37250-62577 (3′)/NT_079573.3 | −/− |
| 68- | (5′) 4337-7226 (3′)/X54486.1 | +/+ | 92- | (3′) 20800-22859 (5′)/NT_010542.15 | +/+ |
| 69- | (5′) 85491-93861 (3′)/AL592295.25 | +/+ | 93- | (3′) 32415-76544 (5′)/NT_010542.15 | +/+ |
| 70- | (3′) 23922-32811 (5′)/AL034420.16 | +/− | 94- | (5′) 187965-194400 (3′)/NG_008805.1 | −/− |
| 71- | (3′) 89888-94322 (5′)/NC_000003.10 | +/+ | 95- | (5′) 197759-204893 (3′)/NG_008805.1 | −/+ |
| 72- | (3′) 120275-144148 (5′)/AC013717.8 | −/+ | 96- | (5′) 70250-74168 (3′)/AL118505.17 | +/+ |
| 73- | (3′) 123903-162024 (5′)/AC013717.8 | +/+ | 97- | (3′) 47644-62873 (5′)/AC107385.4 | −/− |
| 74- | (5′) 332887-340206 (3′)/AY129465.1 | −/− | 98- | (3′) 60868-69094 (5′)/AC092947.12 | −/− |
| 75- | (5′) 337078-350281 (3′)/AY129465.1 | −/+ | 99- | (5′) 97036-105323 (3′)/NT_024871.11 | −/− |
| 76- | (5′) 18177-21076 (3′)/L39891.1 | −/− | 100- | (3′) 119019-224705 (5′)/NC_000003.10 | −/+ |
| 77- | (5′) 18515-23118 (3′)/NG_008164.1 | +/+ | 101- | (5′) 437912-442010 (3′)/NT_006576.15 | −/− |
| 78- | (5′) 484956-641159 (3′)/NG_008289.1 | +/− | 102- | (3′) 30733-35251 (5′)/NT_011651.16 | +/+ |
| 79- | (3′) 131887-254458 (5′)/NW_925783.1 | +/− | 103- | (3′) 36143-40797 (5′)/NT_011651.16 | +/+ |
| 80- | (5′) 15361-39242 (3′)/NT_023133.12 | +/+ | 104- | (3′) 15719-53929 (5′)/AC005026.2-AC005158.3 | +/− |
| 81- | (5′) 170994-178991 (3′)/NT_023133.12 | +/+ | 105- | (3′) 64730-77461 (5′)/AC005026.2-AC005158.3 | −/− |
| 82- | (5′) 9586-11738 (3′)/NG_009162.1 | +/+ | 106- | (3′) 154476-127754 (5′)/AC012596.4-AC099798.4 | −/− |
| 83- | (3′) 12883-28093 (5′)/Z94801.1 | −/+ | 107- | (3′) 149442-238453 (5′)/AC012596.4-AC005158.3 | +/− |
| 84- | (3′) 19564-33298 (5′)/Z94801.1 | +/+ | 108- | (3′) 7206-147487 (5′)/AC004844.1-AC005483.1 | −/+ |
| 85- | (3′) 31126-44864 (5′)/Z94801.1 | +/+ | 109- | (5′) (31695-31724)-(42846-42867) (3′)/NG_000006.1 | +/+ |
| 86- | (5′) 54052-75566 (3′)/U52112.2 | +/− | | | |
*5′ and 3′ deletion breakpoint sequences were obtained from HGMD and GRaBD. DNA sequences of gene contigs were downloaded from NCBI. DNA sequences of the 5′ and 3′ breakpoints and the related gene contigs were compared using NCBI BLAST, and the breakpoint locations of the gross gene deletions were determined. LIRs within genomic regions that included gene breakpoint sequences were investigated. Abbreviations: Bp, base pair; GRaBD, gross rearrangement breakpoint database; HGMD, human gene mutation database; kb, kilobase; LIR, long inverted repeat; Mb, megabase.
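For readers who want to work with the table programmatically, the sketch below parses one "breakpoint location/accession" cell, derives the deletion size from the two coordinates, and computes a ±10 kb flanking window around a breakpoint. The function and type names are hypothetical, and two points are assumptions rather than statements from the source: the deletion size is taken as the plain difference between the coordinates (which agrees with the sizes listed for the rows that report one, such as DMD/7.8 kb), and windows are clamped at position 1.

```python
import re
from typing import NamedTuple


class Deletion(NamedTuple):
    """Parsed table cell (hypothetical record layout)."""
    first: int               # first coordinate as printed
    second: int              # second coordinate as printed
    accession: str           # NCBI accession string, kept verbatim
    five_prime_first: bool   # True if the cell starts with (5′)


# Matches cells such as "(3′) 152815-160602 (5′)/NT_011757.15".
# Both the prime character (′) and a plain apostrophe are accepted.
_CELL = re.compile(
    r"\((?P<a>[35])[′']\)\s*(?P<first>\d+)-(?P<second>\d+)\s*"
    r"\((?P<b>[35])[′']\)/(?P<acc>\S+)"
)


def parse_cell(text: str) -> Deletion:
    """Parse one breakpoint-location cell; raise ValueError if it does not fit."""
    match = _CELL.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognised breakpoint cell: {text!r}")
    return Deletion(
        first=int(match["first"]),
        second=int(match["second"]),
        accession=match["acc"],
        five_prime_first=match["a"] == "5",
    )


def size_bp(deletion: Deletion) -> int:
    """Deletion size as the absolute coordinate difference (assumption)."""
    return abs(deletion.second - deletion.first)


def flank(position: int, half_width: int = 10_000) -> tuple[int, int]:
    """Window of ±half_width around a breakpoint, clamped at position 1."""
    return (max(1, position - half_width), position + half_width)


dmd = parse_cell("(3′) 152815-160602 (5′)/NT_011757.15")  # row 29, DMD/7.8 kb
dmd_size_kb = round(size_bp(dmd) / 1000, 1)
```

Cells with a non-standard layout, such as row 109 with its parenthesised coordinate ranges, are deliberately rejected with a `ValueError` rather than guessed at.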
Figure 2. Breakpoint regions of PINK1, ATM, PTEN and BRCA1 genes.
Sizes of LIR features, e.g. stem length, stem identity and internal spacer (loop length), are shown. NCBI accession numbers of each gene are provided. Coordinates correspond to GenBank sequences. (a) The 3′ breakpoint sequence of the PINK1 deletion is downstream of the gene. The PINK1 LIR is located 1785 bp upstream of the 3′ breakpoint and has a stem length of 292 bp, an internal spacer of 1736 bp, and a stem identity of 82.65%. (b) The 3′ breakpoint sequence of the ATM deletion is within the gene. The ATM LIR is located 222 bp downstream of the 3′ breakpoint and has a stem length of 291 bp, an internal spacer of 207 bp, and a stem identity of 75.07%. (c) The 5′ breakpoint sequence of the PTEN deletion is upstream of the gene. The PTEN LIR includes the 5′ breakpoint and has a stem length of 220 bp, an internal spacer of 83 bp, and a stem identity of 87.94%. (d) The 5′ breakpoint sequence of the BRCA1 deletion is within the gene. The BRCA1 LIR is located 632 bp upstream of the 5′ breakpoint and has a stem length of 299 bp, an internal spacer of 454 bp, and a stem identity of 84.38%. Abbreviations: Bp, base pair; BP, breakpoint; LIR, long inverted repeat.
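The stem identity values above describe how closely one arm of an inverted repeat matches the reverse complement of the other arm. As a rough illustration of that idea, the sketch below computes an ungapped percent identity between two equal-length arms. This is a simplification and an assumption on our part: IRF reports identity from its own alignment of the two arms, so this code is not a reimplementation of it, and the toy sequences in the usage lines are invented for the example.

```python
# Translation table mapping each base to its Watson-Crick complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def stem_identity(left_arm: str, right_arm: str) -> float:
    """Percent of positions at which the left arm equals the reverse
    complement of the right arm, compared position by position (no gaps)."""
    if not left_arm or len(left_arm) != len(right_arm):
        raise ValueError("arms must be non-empty and of equal length")
    target = reverse_complement(right_arm)
    matches = sum(a == b for a, b in zip(left_arm.upper(), target))
    return 100.0 * matches / len(left_arm)


# Toy arms (hypothetical): a perfect inverted repeat and one with a mismatch.
perfect = stem_identity("GATTACA", "TGTAATC")
one_mismatch = stem_identity("GATTACA", "TGTAATG")
```

Under this definition a perfect inverted repeat scores 100%, and the 70% cut-off used in the study would admit arms that differ at up to roughly three positions in ten.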
Figure 3. Distribution of 138 LIRs at the 5′ and 3′ BP regions of 109 gross gene deletions.
(a) LIRs were detected in 89 (81%) gene deletions. (b) In 49 of these deletions, LIRs were located at both 5′ and 3′ BP regions. (c) Among the 40 deletions with LIRs at one of the breakpoint regions, in 21, the LIRs were at the 5′ BP region, and in 19 deletions, at the 3′ BP region. Abbreviations: BP, breakpoint; LIR, long inverted repeat.
Figure 4. Identification of new LIRs between the breakpoint regions of gross gene deletions that had an LIR at only one of the 5′ and 3′ BPs.
New LIRs were detected between the genomic sequences flanking the breakpoints in 24 of the 40 gross deletions that had an LIR at only the 5′ or the 3′ BP. Abbreviations: BP, breakpoint; LIR, long inverted repeat.
Figure 5. Stem identity, stem length and distance from breakpoint of the newly identified LIRs in 24 gross deletions.
Black bars indicate stem length, stem identity and distance from breakpoint of LIRs found between distant sites. Abbreviations: Bp, base pair; LIR, long inverted repeat.
Figure 6. A model mechanism for single LIR-mediated gene deletion.
An LIR forming a cruciform structure in single-stranded DNA near the 3′ breakpoint of the gross gene deletion during replication is shown. After the first break occurs in the vicinity of the 3′ LIR, a second break is induced by back-folded stem-loop structures formed between homologous sequences at the distant 5′ and 3′ breakpoint sites. Free DNA ends may be joined via 53BP1-mediated NHEJ. Abbreviations: LIR, long inverted repeat; NHEJ, non-homologous end-joining.
Figure 7. A model mechanism for 5′ and 3′ LIR-mediated gene deletion.
Cruciform structures of LIRs are formed on DNA strands during replication, with breaks potentially occurring inside the LIR or at nearby locations. LIR-induced breaks at the 5′ and 3′ breakpoint sequences may cause gene deletion by enabling free DNA ends to be joined via 53BP1-mediated NHEJ. Abbreviations: LIR, long inverted repeat; NHEJ, non-homologous end-joining.