Anna L Paterson, Jamie M J Weaver, Matthew D Eldridge, Simon Tavaré, Rebecca C Fitzgerald, Paul A W Edwards.
Abstract
BACKGROUND: Mobile elements are active in the human genome, both in the germline and in cancers, where they can mutate driver genes.
Year: 2015 PMID: 26159513 PMCID: PMC4498532 DOI: 10.1186/s12864-015-1685-z
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Inserts produced by L1 activity and how they are treated by paired-end sequencing. a, b Generation of inserts, showing truncation and transduction (not to scale). mRNA (orange) is transcribed from an L1 in the germline (or from a newly inserted L1, if complete). The target site is nicked, and the end at the nick is used to prime reverse transcription of the mRNA to cDNA (green and red), which is often incomplete (dotted line). The cDNA is subsequently integrated, flanked by a short duplication of the target site. Some inserts have 5’ inversions, perhaps due to additional priming in the opposite direction [19]. a Simple L1 insert. b Transduction of 3’ unique sequence. Transcription of an L1 sometimes reads through the L1 polyA addition site (asterisk) into 3’ unique sequence (red) until a polyA addition site (asterisk) is encountered. The resulting cDNA and insert include a variable amount of the unique sequence and upstream L1 sequence. c Examples of inserts with transduced unique sequence, and the resulting paired-end sequence reads. Reverse transcription of the mobile element RNA is often incomplete, resulting in 5’-truncated inserts. These may or may not contain any L1 sequence, and the most-truncated inserts may contain little more than polyA. Examples of possible read pairs are shown as black solid lines if the aligner can map (align) them uniquely to the reference genome; these will usually appear to be translocation junctions (Fig. 2). Many read pairs will not align (dashed lines), either because one read falls in a repeat or because the sequence is not present in the reference (fine dots). Yellow boxes are target site duplications. d Example of an insert of an L1 that does not transduce 3’ sequence but may nevertheless be mapped as a translocation junction. The parent L1 may have a unique sequence difference, e.g. a single base pair deviation (T > A) from the consensus, that identifies reads uniquely and maps them to the parent L1. Other reads (red) are aligned to an L1 (or Alu) in the reference sequence that has a polyA tail, e.g. the ‘element’ on chromosome 15 in Table 1. In some cases the alignment may be generated by the polyA alone. Such inserts are not, in general, from the element the read is mapped to. e Apparent junction that is not even a junction. Occasionally a read pair within an insert may be aligned to two different loci, appearing to report a rearrangement junction. For example, one read may map to a transduced sequence, while its pair contains polyA and is mapped to one of the polyA runs in the reference genome.
Fig. 2 Mobile element inserts mimic multiple translocations. Circos plot of mobile element inserts detected by discordant read pairs in tumour 7409, which had the highest number detected. The genome is displayed as a circle, chromosomes 1 to Y, with curved lines representing the apparent rearrangements detected. For example, the many apparent junctions between chromosome 14 and other chromosomes (green) represent copies of a chromosome 14 sequence that have been transduced and inserted all over the genome.
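The pattern described in this caption, one locus apparently joined to many other chromosomes, can be sketched in a few lines of Python. This is only an illustration, not the pipeline used in the study: the function name `candidate_source_loci`, the 2 kb bin size and the threshold of three partner loci are hypothetical choices, and the partner coordinates in the example are invented (only the chromosome 14 positions fall within the transduced interval listed for the Chr 14 element).

```python
from collections import defaultdict


def candidate_source_loci(pairs, window=2000, min_partners=3):
    """Find loci whose mates map to many different places on other chromosomes.

    pairs: iterable of ((chrom_a, pos_a), (chrom_b, pos_b)) for read pairs
    whose two reads were aligned to different loci.
    window and min_partners are illustrative thresholds, not published values.
    """
    partners = defaultdict(set)
    for (chrom_a, pos_a), (chrom_b, pos_b) in pairs:
        if chrom_a == chrom_b:
            continue  # keep only interchromosomal, translocation-like pairs
        bin_a = (chrom_a, pos_a // window)
        bin_b = (chrom_b, pos_b // window)
        partners[bin_a].add(bin_b)
        partners[bin_b].add(bin_a)
    # Report the start coordinate of each bin with enough distinct partner loci
    return {
        (chrom, index * window): len(mates)
        for (chrom, index), mates in partners.items()
        if len(mates) >= min_partners
    }


# Invented example: three pairs share a chr14 locus, one unrelated pair does not
pairs = [
    (("chr14", 59220500), ("chr1", 1_000_000)),
    (("chr14", 59220900), ("chr5", 2_500_000)),
    (("chr14", 59221000), ("chr8", 120_000_000)),
    (("chr2", 10_000), ("chr3", 20_000)),
]
hubs = candidate_source_loci(pairs)
print(hubs)
```

A locus flagged this way is only a candidate: as the Fig. 1 caption notes, reads containing polyA can be aligned to an unrelated element in the reference, so a cluster does not by itself identify the source L1.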
Table 1 ‘Elements’ in the reference genome that mobile element inserts align to
| ID | Chromosome | Transduced sequence start | Transduced sequence end | L1, Alu etc. | Start | End | +/− | Gene | Total inserts | Tumours | Method | Max transduced (bp) | Hot L1 list | Tubio list |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Reference L1s that transduce unique sequence | | | | | | | | | | | | | | |
| Chr 4 | chr4 | 137213864 | in L1 | L1HS | 137214650 | 137220701 | - | none | 3 | 2 | L1 | 786 | 14 | No |
| Chr 8 | chr8 | 135082457 | 135082642 | L1HS | 135082987 | 135089016 | - | none | 6 | 3 | L1 | 530 | 3 | 4 |
| Chr 12_A | chr12 | 3606945 | in L1 | L1HS | 3608362 | 3614394 | - | PRMT8 | 5 | 3 | L1 | 1417 | 35 | 6 |
| Chr 20 | chr20 | 23412922 | 23413624 | L1HS | 23406746 | 23412777 | + | none | 11 | 5 | Cl | 847 | Not listed | 8 |
| Chr 22 | chr22 | 29065365 | 29066424 | L1HS | 29059272 | 29065303 | + | TTC28 | 129 | 36 | Cl, L1 | 1121 | 7 | 137 |
| Chr X_A | chrX | 11731785 | 11732702 | L1HS | 11725369 | 11731399 | + | none | 80 | 19 | Cl, L1 | 1303 | 19 | 7 |
| Chr X_B | chrX | 11952984 | in L1 | L1HS | 11953208 | 11959433 | - | none | 23 | 6 | L1 | 224 | 1 | 20 |
| Chr 3 | chr3 | | | Polymorphic L1HS | | | + | MYLK | 29 | 7 | Cl | >4000 | N/A | 40 |
| Chr 14 | chr14 | 59220410 | 59221078 | Polymorphic L1HS | | | + | none | 179 | 11 | Cl | | N/A | 40 |
| Reference genome elements that inserts align to | | | | | | | | | | | | | | |
| Chr 6 | chr6 | | | L1HS | 24811907 | 24817934 | - | FAM65B | 3 | 3 | L1 | Nil | 2 | N/A |
| Chr 7 | chr7 | | | L1HS plus 4 bp and polyA | 30478859 | 30484914 | + | NOD1 | 84 | 9 | L1 | Nil | 18 | N/A |
| Chr 8 | see footnote | | | L1HS | | | | | | | | | | |
| Chr 10 | chr10 | | | L1HS with polyA | 111572121 | 111578215 | - | none | 6 | 6 | L1 | Nil | 9 | N/A |
| Chr 12_B | chr12 | | | AluSx1 with polyA | 66451373 | 66451739 | - | none | 98 | 21 | Cl | Nil | N/A | N/A |
| Chr 15 | chr15 | | | AluYa5 with polyA | 77910868 | 77911236 | - | LINGO1 | 19 | 17 | Cl | Nil | N/A | N/A |
Positions refer to reference genome GRCh37/hg19. The mobile ‘elements’ are identified by chromosome. At least two inserts were verified by PCR for each ‘element’ except the Chr 6 element (only one insert verified) and the Chr 3 element (not verified but described by Tubio et al. [13]). The Chr 3 and Chr 14 elements are polymorphic L1s that are not in the reference genome but were shown to transduce sequence by Tubio et al. [13]; their insertion point in the reference genome is given in italics. The polyA addition site for the Chr 14 element is at the end of a 36 bp fragment of an L2a element. The Chr 8 element showed both transduction and mapping to the native L1 insert: of the two inserts verified, one had 3’ unique sequence transduced and one was pure L1 3’ terminus.
Transduced sequence, maximum extent of unique sequence observed in inserts verified by PCR (Additional file 2), except for the Chr 3 element, where the read map position is given. +/−, strand bearing polyA, which is the same as the orientation of the L1 or Alu if present. Tumours, number of tumours with inserts. Method, method of identification: Cl, cluster of ‘translocation’ breakpoints from discordant reads; L1, cluster of breakpoints 3’ to a known active L1 (not exhaustive). Max transduced, maximum unique sequence transduced in cloned insert. Hot L1 list, rank in list of active L1s of Brouha et al. [24]. Tubio list, whether the elements that transduce 3’ sequence were listed by Tubio et al. [13] and how many inserts were reported; N/A, not applicable as these were not transduction events.
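As a worked check on how the coordinate columns relate to each other, the short Python sketch below recomputes the ‘Max transduced’ figure for the seven reference L1s from the positions listed in the table. The footnote does not state a formula; the rule used here (distance from the 3’ end of the L1 to the far end of the transduced sequence, taking strand into account) is an assumption that happens to reproduce every listed value. The function name `extent_beyond_l1` is illustrative.

```python
# Rows copied from the table (GRCh37/hg19):
# (ID, strand, transduced start, transduced end or None when listed as "in L1",
#  L1 start, L1 end, listed max transduced in bp)
ROWS = [
    ("Chr 4", "-", 137213864, None, 137214650, 137220701, 786),
    ("Chr 8", "-", 135082457, 135082642, 135082987, 135089016, 530),
    ("Chr 12_A", "-", 3606945, None, 3608362, 3614394, 1417),
    ("Chr 20", "+", 23412922, 23413624, 23406746, 23412777, 847),
    ("Chr 22", "+", 29065365, 29066424, 29059272, 29065303, 1121),
    ("Chr X_A", "+", 11731785, 11732702, 11725369, 11731399, 1303),
    ("Chr X_B", "-", 11952984, None, 11953208, 11959433, 224),
]


def extent_beyond_l1(strand, tr_start, tr_end, l1_start, l1_end):
    """Distance from the L1 3' end to the far end of the transduced sequence.

    Assumed rule, not stated in the table: on the + strand the 3' end is the
    higher L1 coordinate; on the - strand it is the lower one.
    """
    if strand == "+":
        return tr_end - l1_end
    return l1_start - tr_start


computed = {row[0]: extent_beyond_l1(*row[1:6]) for row in ROWS}
mismatches = [row[0] for row in ROWS if computed[row[0]] != row[6]]

for name, value in computed.items():
    print(name, value)
print("rows that disagree with the table:", mismatches)
```

For example, for Chr 22 (+ strand) the calculation is 29066424 − 29065303 = 1121 bp, and for Chr 8 (− strand) it is 135082987 − 135082457 = 530 bp.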
Fig. 3 Verification of representative inserts. a Flanking junctions of an insert from chromosome 22 into ENPP2 on chromosome 8 in Tumour 7396. b PCR across four representative inserts, of elements denoted Chr X_B, Chr 6, Chr 7 and Chr 14 respectively. Note that, in addition to the larger, tumour-specific band, in most cases there is an intermediate-size band, most clearly in the last example. c Sequence of the insert in a, showing target site duplication (TSD), insert of chromosome 22 material and polyA tail.
Active elements in individual tumours
| Tumour | Inserts per active element (columns 3, 4, 8, 12_A, 14, 20, 22, X_A, X_B, ?) | Elements active | Inserts by discordant reads | Candidate polyA inserts | TP53 mutation |
|---|---|---|---|---|---|
| 3109 | 3, 1, …, 1, 1 | 5 | 32 | 150 | MUT |
| 3111† | 1 | 1 | 1 | 96 | MUT |
| 3113 | 2, 2, 2 | 3 | 6 | 13 | MUT |
| 3115 | 3, 1 | 2 | 4 | 33 | MUT |
| 3117 | | 0 | 0 | 10 | MUT |
| 3119 | | 0 | 0 | 8 | − |
| 3121 | 4, 3, 1 | 3 | 8 | 34 | MUT |
| 3125 | | 0 | 0 | 7 | − |
| 3129 | 6 | 1 | 6 | 129 | MUT |
| 3131 | 1, 1 | 2 | 2 | 12 | MUT |
| 3133 | 1, 1 | 2 | 2 | 11 | − |
| 3135 | 6, 1 | 2 | 7 | 16 | MUT |
| 3137 | 1 | 1 | 1 | 5 | MUT |
| 3149 | 5, 3, 7 | 3 | 15 | 19 | MUT |
| 3302 | 9, 8, 3, … | 4 | 41 | 389 | MUT |
| 3305 | 5, 2 | 2 | 7 | 16 | MUT |
| 3308 | 6, 8 | 2 | 14 | 194 | MUT |
| 3311 | 1, 5, 1 | 3 | 7 | 95 | MUT |
| 3314 | 4, 1 | 2 | 5 | 22 | MUT |
| 3317 | …, 4, 1 | 3 | 19 | 70 | MUT |
| 3320 | 2, 4, 2 | 3 | 8 | 97 | MUT |
| 3323 | 3, 1 | 2 | 4 | 27 | MUT |
| 7394 | 4, 2, 7, … | 4 | 27 | 281 | MUT |
| 7396 | 2, …, 8, … | 4 | 55 | 830 | MUT |
| 7398 | 2, …, 4, 6, … | 5 | 35 | 231 | MUT |
| 7401 | 1, 2, 2, 2, 1, 3 | 6 | 11 | 138 | MUT |
| 7404† | 2 | 1 | 2 | 13 | MUT |
| 7407 | …, 1, 7, 3 | 4 | 33 | 196 | MUT |
| 7409 | …, 3, …, 1, …, … | 6 | 149 | 416 | MUT |
| 7414 | 1, 1, 3 | 3 | 5 | 33 | MUT |
| 7416 | 0, 4, … | 2 | 15 | 196 | MUT |
| 7418 | 4, 3, 2 | 3 | 9 | 73 | MUT |
| 7420 | 1, 2, 6, 2, 1, 1 | 6 | 13 | 100 | MUT |
| 7422 | 1, 1, 1, 4 | 4 | 7 | 62 | − |
| 7424 | 8, 2, 5, … | 4 | 47 | 352 | MUT |
| 7427 | 2, 1, 2 | 3 | 5 | 133 | MUT |
| 7430 | 1, 5, 2 | 3 | 8 | 30 | MUT |
| 7432 | 8, 3, 10 | 3 | 21 | 133 | MUT |
| 7434 | 3, 2, 2 | 3 | 7 | 125 | MUT |
| 7436 | 3, 2, 1 | 3 | 6 | 39 | MUT |
| 7438 | 2, 2, 1 | 3 | 5 | 64 | MUT |
| 7440 | 1, 5, 4, 3 | 4 | 13 | 174 | − |
| 7442 | 2, 8, 3 | 3 | 13 | 198 | MUT |
| Total tumours | 3: 7; 4: 2; 8: 3; 12_A: 3; 14: 11; 20: 5; 22: 36; X_A: 19; X_B: 6; ?: 31 | | | | |
| Total inserts | 3: 29; 4: 3; 8: 6; 12_A: 5; 14: 179; 20: 11; 22: 129; X_A: 80; X_B: 23; ?: 210 | | 675 | 5270 | |
| Average per tumour | | | 16 | 123 | |
The elements that transduce unique sequence are listed individually, while all inserts mapped to L1s or polyA in the genome are combined in the column marked ‘?’, because it is not certain that they were copied from any specific L1. Elements active, number of different elements active in a given tumour. Inserts by discordant reads, total inserts found from discordant paired reads. Candidate polyA inserts, candidate inserts found by searching for tumour-specific polyA. TP53 mutation, mutations in TP53 called from Illumina sequencing. MUT, mutated; −, no mutation detected. † Note that tumours 3111 and 7404 had verified tumour-specific inserts of the Chr 12_A element, so definitely had some mobile-element activity.
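The summary rows of this table can be cross-checked against the per-tumour columns. The Python sketch below is only an arithmetic check on the numbers printed in the table, with variable names chosen for illustration: it sums the two right-hand count columns over all tumours, compares the result with the per-element totals, and recomputes the per-tumour averages.

```python
# tumour: (inserts by discordant reads, candidate polyA inserts), copied from the table
COUNTS = {
    "3109": (32, 150), "3111": (1, 96), "3113": (6, 13), "3115": (4, 33),
    "3117": (0, 10), "3119": (0, 8), "3121": (8, 34), "3125": (0, 7),
    "3129": (6, 129), "3131": (2, 12), "3133": (2, 11), "3135": (7, 16),
    "3137": (1, 5), "3149": (15, 19), "3302": (41, 389), "3305": (7, 16),
    "3308": (14, 194), "3311": (7, 95), "3314": (5, 22), "3317": (19, 70),
    "3320": (8, 97), "3323": (4, 27), "7394": (27, 281), "7396": (55, 830),
    "7398": (35, 231), "7401": (11, 138), "7404": (2, 13), "7407": (33, 196),
    "7409": (149, 416), "7414": (5, 33), "7416": (15, 196), "7418": (9, 73),
    "7420": (13, 100), "7422": (7, 62), "7424": (47, 352), "7427": (5, 133),
    "7430": (8, 30), "7432": (21, 133), "7434": (7, 125), "7436": (6, 39),
    "7438": (5, 64), "7440": (13, 174), "7442": (13, 198),
}

# "Total inserts" row, one value per element column
ELEMENT_TOTALS = {
    "3": 29, "4": 3, "8": 6, "12_A": 5, "14": 179,
    "20": 11, "22": 129, "X_A": 80, "X_B": 23, "?": 210,
}

n_tumours = len(COUNTS)
total_inserts = sum(inserts for inserts, _ in COUNTS.values())
total_polya = sum(polya for _, polya in COUNTS.values())
mean_inserts = round(total_inserts / n_tumours)
mean_polya = round(total_polya / n_tumours)

print(n_tumours, total_inserts, total_polya, mean_inserts, mean_polya)
print(sum(ELEMENT_TOTALS.values()) == total_inserts)
```

The per-tumour counts sum to the 675 inserts and 5270 candidate polyA inserts given in the totals row, the ten per-element totals also sum to 675, and dividing by the 43 tumours gives the rounded averages of 16 and 123.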