
Dynamic Methylation of an L1 Transduction Family during Reprogramming and Neurodifferentiation.

Carmen Salvador-Palomeque, Francisco J Sanchez-Luque, Patrick R J Fortuna, Adam D Ewing, Ernst J Wolvetang, Sandra R Richardson, Geoffrey J Faulkner.

Abstract

The retrotransposon LINE-1 (L1) is a significant source of endogenous mutagenesis in humans. In each individual genome, a few retrotransposition-competent L1s (RC-L1s) can generate new heritable L1 insertions in the early embryo, primordial germ line, and germ cells. L1 retrotransposition can also occur in the neuronal lineage and cause somatic mosaicism. Although DNA methylation mediates L1 promoter repression, the temporal pattern of methylation applied to individual RC-L1s during neurogenesis is unclear. Here, we identified a de novo L1 insertion in a human induced pluripotent stem cell (hiPSC) line via retrotransposon capture sequencing (RC-seq). The L1 insertion was full-length and carried 5' and 3' transductions. The corresponding donor RC-L1 was part of a large and recently active L1 transduction family and was highly mobile in a cultured-cell L1 retrotransposition reporter assay. Notably, we observed distinct and dynamic DNA methylation profiles for the de novo L1 and members of its extended transduction family during neuronal differentiation. These experiments reveal how a de novo L1 insertion in a pluripotent stem cell is rapidly recognized and repressed, albeit incompletely, by the host genome during neurodifferentiation, while retaining potential for further retrotransposition.
Copyright © 2019 Salvador-Palomeque et al.

Keywords:  L1; LINE-1; methylation; neurogenesis; retrotransposon

Year:  2019        PMID: 30692270      PMCID: PMC6425141          DOI: 10.1128/MCB.00499-18

Source DB:  PubMed          Journal:  Mol Cell Biol        ISSN: 0270-7306            Impact factor:   4.272


INTRODUCTION

LINE-1 (L1) retrotransposons are mobile genetic elements that occupy nearly 20% of the human genome (1) and are an endogenous source of mutagenesis (2). Approximately 100 retrotransposition-competent L1s (RC-L1s) are found in each individual, while the remaining ∼500,000 L1 copies are immobile due to 5′ truncations, inversions, deletions, and other mutations (3, 4). Almost all RC-L1s belong to the L1-Ta subfamily (5). L1 retrotransposition is a copy-and-paste process involving an RNA intermediate and an L1-encoded protein machinery (6–8) that orchestrates L1 integration via a molecular process termed target-primed reverse transcription (TPRT) (9). An RC-L1 is 6 kb in length and contains a 5′ untranslated region (UTR), an antisense promoter and open reading frame (ORF0) (10, 11), two nonoverlapping sense open reading frames (ORF1 and ORF2), and a 3′ UTR that is punctuated by a poly(A) tract (12–14). Critically, ORF2p possesses endonuclease (EN) and reverse transcriptase (RT) activities required for L1 mobility (8, 15, 16), while new L1 insertions usually integrate at a degenerate L1 EN recognition motif (5ʹ-TT/AAAA, where “/” represents the position cut by the L1 EN) (17) and are flanked by variable-length target site duplications (TSDs) (18, 19), which are hallmarks of TPRT. L1 is the only active autonomous human retrotransposon (5, 15) although other polyadenylated RNAs, including mRNAs and those of the Alu and SINE-VNTR-Alu (SVA) retrotransposon families, can be mobilized in trans by the L1 machinery (13, 20–24). The L1 5′ UTR has an internal RNA polymerase II promoter that directs L1 mRNA transcription (25) and is regulated by DNA methylation of a CpG island located nearby in the 5′ UTR (26–31). The host genome also restricts L1 activity through mechanisms limiting L1 mRNA production or otherwise hindering retrotransposition (32–34). 
A minor fraction of RC-L1s in the human population are thought to generate the majority of new germ line L1 insertions and are highly mobile, or “hot,” when tested in cultured cell L1 retrotransposition assays (3, 8, 35–37). These experiments largely measure the enzymatic efficiency of L1s introduced in episomal vectors, and, importantly, a particular L1 locus may present multiple alleles with different retrotransposition efficiencies (38, 39). The endogenous regulation of a given RC-L1 may therefore be most clearly resolved in the spatiotemporal contexts where it produces new L1 insertions. An RC-L1 can be identified as the donor element for an L1 insertion through shared unique internal single nucleotide variants, or transductions (37, 40). 5ʹ transductions are thought to accompany <0.1% of L1-Ta insertions and likely arise when the L1 promoter, or another nearby promoter, initiates L1 mRNA transcription upstream of the canonical L1 transcription start site (1, 32, 41–43). In contrast, 3ʹ transductions are found alongside ∼20% of new germ line L1 insertions and occur when L1 mRNA transcription bypasses the canonical L1 polyadenylation signal and terminates at an alternative downstream polyadenylation signal (1, 13, 44–48). Transductions have been used to trace RC-L1s responsible for pathogenic L1 insertions (32, 35, 49–53) and to reconstruct closely related RC-L1 lineages, or transduction families, in human populations (35, 44, 54). Early embryogenesis provides a major developmental niche for heritable L1 retrotransposition events in mammals (55–57). Cultivated human embryonic stem cells (hESCs) and human induced pluripotent stem cells (hiPSCs) resembling the cells of the embryonic inner cell mass also express L1 mRNAs and support engineered and endogenous L1 retrotransposition (58–63). De novo L1 insertions arising during embryogenesis or later development can cause somatic mosaicism (55, 64–66). 
In particular, somatic L1 insertions have been reported in brain tissue (65, 67–74), while engineered L1 reporter genes mobilize during neurogenesis and in postmitotic neurons (63, 67, 75). Importantly, the L1-Ta subfamily is hypomethylated in hESCs and hiPSCs compared to neurons and other differentiated cells, suggesting that genome-wide L1 promoter methylation is enforced during development (58, 61, 63, 67). However, the likely related temporal profiles of DNA methylation and somatic retrotransposition for individual RC-L1s that mobilize during neurogenesis are unresolved. Here, we identified a reprogramming-associated de novo L1 insertion in a cultivated hiPSC line. This insertion was traced to a hot donor RC-L1 that was part of an extended and recently active transduction family. We then measured locus-specific DNA methylation among de novo, donor, and transduction family L1 promoters, as well as the L1-Ta subfamily genome-wide, at multiple points of neurodifferentiation. These experiments elucidate the dynamic temporal profile of epigenetic L1 repression applied to new and extant L1 insertions during neurogenesis.

RESULTS

A de novo L1 insertion arising during reprogramming.

To study endogenous retrotransposition during neurogenesis, we obtained two hiPSC lines (hiPSC-CRL1502 and hiPSC-CRL2429) generated via delivery of defined reprogramming factors to healthy human dermal fibroblasts (58, 76). We then differentiated each hiPSC line toward a neuronal phenotype for 156 days in culture (Fig. 1A) and applied retrotransposon capture sequencing (RC-seq) (58, 69, 77) to genomic DNA sampled from the parental fibroblasts (time point 0 [T0]), hiPSCs (T1), and several time points of differentiation (T2 to T6) (Table 1). Two earlier passages of each hiPSC line were also analyzed by RC-seq to better distinguish L1 insertions arising during reprogramming or cell cultivation (Table 1). Cells from each point of neurodifferentiation were characterized by immunocytochemistry (Fig. 1A) and included neural epithelium (T2), neural rosettes denoting immature neurons (T3), and three stages of prolonged neuronal maturation (T4 to T6). Endogenous L1 insertions detected by RC-seq and absent from the reference genome were annotated as either polymorphic (previously published or present at T0) or de novo (only present at T1 or later in one time course). Two potential de novo L1 insertions were identified (see Table S1 in the supplemental material). We then performed insertion site-specific PCR validation for each event (Fig. 1B and Table 2) and found that one insertion, on chromosome 1 (Chr1), was de novo in hiPSC-CRL2429 cells at time point T1, was carried through neurodifferentiation (Fig. 1C), and was absent from hiPSC-CRL1502 (Fig. 1C). PCR indicated that the other putative de novo event was polymorphic because it was found in the matched parental fibroblast population (Table 2 and Table S1).
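The annotation rule described above (polymorphic if previously published or present at T0, de novo if present only at T1 or later in a single time course) can be sketched as a small classifier. This is a minimal illustration and not the authors' pipeline: the `Insertion` record, its field names, and the example detection set are hypothetical, and any candidate it labels as de novo would still need the PCR validation described in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insertion:
    """A non-reference L1 insertion call (hypothetical record layout)."""
    locus: str
    previously_published: bool
    detections: frozenset  # {(cell_line, time_point), ...}


def classify(ins: Insertion) -> str:
    """Apply the polymorphic / de novo annotation rule described in the text."""
    lines = {line for line, _ in ins.detections}
    seen_at_t0 = any(tp == "T0" for _, tp in ins.detections)
    if ins.previously_published or seen_at_t0:
        return "polymorphic"
    if len(lines) == 1:
        # Only a candidate: PCR may still find it in the parental fibroblasts.
        return "putative de novo"
    return "unresolved"


# Illustrative detection set (assumed): seen in one line, T1 through T6 only.
chr1 = Insertion(
    locus="Chr1",
    previously_published=False,
    detections=frozenset(("hiPSC-CRL2429", f"T{i}") for i in range(1, 7)),
)
print(classify(chr1))  # putative de novo
```

The "putative" label reflects the outcome reported here: one of the two RC-seq candidates turned out to be polymorphic once PCR detected it in the matched fibroblasts.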
FIG 1

Characterization of a reprogramming-associated de novo L1 insertion carried through neurodifferentiation in vitro. (A) Schematic timeline of experimental approach. Fibroblasts (time point 0 [T0]) were reprogrammed to obtain hiPSCs (T1), which were then sampled at 5 points (T2 to T6) of neuronal differentiation in extended cell culture. Immunocytochemistry was used to characterize expression of marker genes (OCT4, NANOG, PAX6, TUJ1, CUX1, and GFAP) and histone 3 phosphorylation (PH3), as associated with various stages of neural cell maturation, with Hoechst staining of DNA. (B) L1 insertion PCR validation strategies. Green and blue arrows, respectively, represent primers targeting the 5′ and 3′ genomic flanks of an L1 insertion (rectangle). Black arrows represent primers specific to the L1 sequence. Combinations of these primers are used to generate the following amplicons (arranged top to bottom): 5′ L1-genome junction, 3′ L1-genome junction, L1 insertion (filled site), and empty site. (C) PCR validation results for a de novo L1 insertion detected in cell line hiPSC-CRL2429. An empty/filled PCR was also performed with cell line hiPSC-CRL1502 as a negative control. Red and black arrowheads indicate the expected filled and empty site band sizes, respectively. NTC, nontemplate control. (D) De novo L1 insertion sequence structure. In addition to TSDs (triangles), the full-length L1-Ta insertion was flanked by 5′ (orange) and 3′ transductions (purple). (E) The same experiments as described for panel C except that they were performed for the donor L1 responsible for the de novo L1 insertion (left) and its lineage progenitor L1 (right), using CRL2429 fibroblast genomic DNA.

TABLE 1

RC-seq library information

| hiPSC line and library DNA input (c) | Time point | RC-seq reads (b): count | RC-seq reads (b): aligned % | RC-seq data source |
|---|---|---|---|---|
| CRL1502 | | | | |
| Fibroblasts | T0 | 44,033,582 | 99.95 | Klawitter et al. |
| hiPSCs p76 (a) | T1 | 42,151,994 | 99.74 | This study |
| Neural epithelium | T2 | 33,972,001 | 99.77 | This study |
| Immature neurons | T3 | 39,766,940 | 99.77 | This study |
| Neurons I | T4 | 47,155,514 | 99.78 | This study |
| Neurons II | T5 | 44,381,111 | 97.91 | This study |
| Neurons III | T6 | 36,222,610 | 99.77 | This study |
| hiPSCs p15 | Earlier hiPSC passage | 24,385,022 | 99.88 | Klawitter et al. |
| hiPSCs p40 | Earlier hiPSC passage | 63,130,772 | 99.88 | Klawitter et al. |
| CRL2429 | | | | |
| Fibroblasts | T0 | 24,386,590 | 99.91 | Klawitter et al. |
| hiPSCs p70 | T1 | 38,460,241 | 99.63 | This study |
| Neural epithelium | T2 | 40,174,554 | 99.79 | This study |
| Immature neurons | T3 | 46,646,999 | 99.78 | This study |
| Neurons I | T4 | 27,279,492 | 99.79 | This study |
| Neurons II | T5 | 46,018,310 | 99.77 | This study |
| Neurons III | T6 | 36,033,944 | 99.54 | This study |
| hiPSCs p11 | Earlier hiPSC passage | 64,534,189 | 99.40 | Klawitter et al. |
| hiPSCs p40 | Earlier hiPSC passage | 27,447,967 | 99.39 | Klawitter et al. |

(a) p, passage.

(b) Data from 2- by 150-mer reads.

(c) Neurons I, II, and III were harvested after 72, 112, and 156 days of differentiation in vitro, respectively.

TABLE 2

PCR primers used for validation and bisulfite sequencing

| Primer function and name | Sequence |
|---|---|
| Genomic primers for empty/filled L1 validation reactions | |
| LineageProgenitor_Chr11_fwd | AGGAAACAGTGAGGGGAAGC |
| LineageProgenitor_Chr11_rev | TGAGGCCCAGGAGTCATATC |
| Donor_Chr3_fwd | TGTATGACAGTAAAATAATGGGTAGATGA |
| Donor_Chr3_rev | CTGGCCTCTTCACTGCATTT |
| DeNovo_Chr1_fwd | CTGGTAACCCCAGAATGACG |
| DeNovo_Chr1_rev | ATCCTGCCTCAGCGAACTTA |
| Non-ref_Chr3_fwd | TTGTGGGAAGGCAAAATGAT |
| Non-ref_Chr3_rev | TATTCAATCCCAACCCAGGA |
| L1-specific primers for validation of 5′ and 3′ L1-genome junctions | |
| hL1_273_rev | ACCCGATTTTCCAGGTGCGT |
| hL1_ACshort_fwd | AGATATACCTAATGCTAGATGACAC |
| NotI/L1-genome junction-spanning primers for cloning full-length L1s | |
| LineageProgenitor_Chr11_NotI_fwd | CAAGCGGCCGCTTACATTTTTAAAGAATTGTAGGGGAG |
| Donor_Chr3_NotI_fwd | TAAAGCGGCCGCAACAGAATGAGTAAATAATGGAGGG |
| DeNovo_Chr1_NotI_fwd | TTCGCGGCCGCATTAAAGAAATGACATCTGAAATAATGGA |
| Non-ref_Chr3_NotI_fwd | CAACGCGGCCGCTTAAAGTTAAAGACACGG |
| L1-specific primers for sequencing full-length L1s | |
| L1_452_fwd | GCCCAGGCTTGCTTAGGTA |
| L1_1020_fwd | TGATTTTGACGAGCTGAGAGAA |
| L1_1532_fwd | CCTCGAGAAGAGCAACTCCA |
| L1_1966_fwd | GCAAAATCACCAGCTAACATCA |
| L1_2494_fwd | AACTCAGCTCTGCACCAAGC |
| L1_3014_fwd | AAATCAGAGCAGAACTGAAGGAAA |
| L1_3502_fwd | GAGGCCAGCATCATTCTGATA |
| L1_4022_fwd | CAATCAGGCAGGAGAAGGAA |
| L1_4472_fwd | TCCCCATCAAGCTACCAATG |
| L1_4973_fwd | TGTCCAAAACACCAAAAGCA |
| L1_5492_fwd | TACCATTTGACCCAGCCATC |
| Primers for amplification of L1 promoters from bisulfite converted DNA | |
| L1_Bis-LP | GATTTGTTTTTGGATTGTAAAATGGTT |
| L1_Bis-Donor | TGGGTAGATGAACAGATAAGTAAA |
| L1_BiS-DN | GTTATTTGATAGTATTTTAATGAAGATT |
| L1_Bis-F | TAGGGAGTGTTAGATAGTGG |
| L1_Bis-R | ACTATAATAAACTCCACCCAAT |
We then cloned and capillary sequenced the entire de novo L1 insertion (Fig. 1D) and manually inspected the integration site for hallmarks of TPRT (8, 9, 16, 17). The L1 was full length, belonged to the L1-Ta subfamily, carried 5ʹ and 3ʹ transductions, was flanked by 16-nucleotide (nt) TSDs, inserted at a degenerate L1 endonuclease motif (5ʹ-TT/AAAG), and terminated with a 33-nt poly(A) tract. The 5ʹ and 3ʹ transductions were 10 nt and 44 nt in length, respectively, and the 3ʹ transduction was preceded by an internal 17-nt poly(A) tract (Fig. 1D). These features were consistent with endogenous retrotransposition mediated via TPRT and, as confirmed by insertion site-specific PCR, showed that the de novo L1 insertion represented a bona fide retrotransposition event occurring during reprogramming, or very early in hiPSC-CRL2429 cultivation.
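Two of the TPRT hallmarks listed above can be checked mechanically. The sketch below is illustrative only: it compares the reported integration motif with the consensus L1 EN motif (5ʹ-TT/AAAA) and checks the reported TSD against that motif. The function names are hypothetical, and the code assumes that the TSD begins at the EN cut position, which the sequences reported here are consistent with. The TSD sequence is taken from Table 3.

```python
CONSENSUS_EN_MOTIF = "TT/AAAA"  # "/" marks the position cut by the L1 EN


def motif_mismatches(observed: str, consensus: str = CONSENSUS_EN_MOTIF) -> int:
    """Count positions at which an observed motif differs from the consensus."""
    obs = observed.replace("/", "")
    con = consensus.replace("/", "")
    if len(obs) != len(con):
        raise ValueError("motifs must have the same length")
    return sum(a != b for a, b in zip(obs, con))


def tsd_starts_at_cut(tsd: str, observed: str) -> bool:
    """Check that the TSD begins with the bases just downstream of the cut."""
    downstream = observed.split("/")[1]
    return tsd.startswith(downstream)


de_novo_motif = "TT/AAAG"           # reported for the de novo insertion
de_novo_tsd = "AAAGAAATGACATCTG"    # Table 3, de novo L1

print(motif_mismatches(de_novo_motif))                # 1
print(len(de_novo_tsd))                               # 16
print(tsd_starts_at_cut(de_novo_tsd, de_novo_motif))  # True
```

A single mismatch from the consensus is what "degenerate" means in this context, and the 16-nt length agrees with the TSD size stated in the text.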

An extended human RC-L1 transduction family.

The de novo L1 insertion was the first example found in hiPSCs of an endogenous L1 insertion carrying both 5ʹ and 3ʹ transductions. These transductions uniquely indicated a donor L1 sequence on chromosome 3 that was heterozygous in the hiPSC-CRL2429 parental fibroblast population (Fig. 1E). The donor L1 was absent from the reference genome and was polymorphic in humans; it was previously shown to mobilize efficiently in vitro (35). To identify any other germ line L1 insertions closely related to the donor L1, we aligned the 3ʹ transduced sequence to the reference genome and to the annotated 3ʹ L1-genome junction sequences of polymorphic L1s carried by hiPSC-CRL2429 or hiPSC-CRL1502 (Table S1) or those annotated by previous studies (52, 58, 69, 70, 77, 78). We further annotated this list with results obtained by previous studies of L1 mobilization in the germ line, tumors, and cancer cell lines (3, 35, 37, 49, 77, 79–87). From this analysis, we reconstructed an extended L1 transduction family comprising 14 members (Table 3), including a plausible founder, or lineage progenitor (44), element for the family, which was homozygous in hiPSC-CRL2429 and located on chromosome 11 (Fig. 1E).
TABLE 3

Transduction family members

| Element | Genomic coordinate (hg19) | TSD | Full-length (a) | Identification source and/or reference(s) |
|---|---|---|---|---|
| Lineage progenitor L1 | Chr11: 95169381 | AAAGAATTGTA | Y | Reference genome; 3 |
| Donor L1 | Chr3: 38626082 | AGAATGAGTAAATAATG | Y | 35, 49, 71, 77, 79, 82–87 |
| De novo L1 | Chr1: 231719316 | AAAGAAATGACATCTG | Y | This study |
| Ref_Chr7_q21.3 | Chr7: 96475963 | GAAAGTTCCAGTTGC | Y | Reference genome |
| Non-ref_Chr3_p24.3 | Chr3: 20748904 | TAAAGACAC | Y | 35, 49, 71, 77, 79, 82, 83, 87 |
| Ref_Chr1_p31.1_a | Chr1: 84518060 | AGAAAAACAAATCA | Y | Reference genome |
| Ref_Chr1_p31.1_b | Chr1: 83125969 | AAAAAAAATGGTTCATGC | N | Reference genome |
| Ref_Chr9_p23 | Chr9: 12556931 | GAAAAGTATTGTATTG | N | Reference genome |
| Non-ref_Chr3_p12.2_a | Chr3: 80590176 | GAAAATGGAATGGG | Y | 35, 37, 49, 77, 79, 82, 83 |
| Non-ref_Chr3_p12.2_b | Chr3: 82144869 | AGAAATAATAATTTCC | Y | 49, 71, 77, 79, 83, 85, 86 |
| Non-ref_ChrX_p11.4 | ChrX: 38097551 | AAAAGCGATATG | Y | 49, 86 |
| Non-ref_Chr17_q12 | Chr17: 32813609 | AAGAAGGTAAGATGG | N | 71, 77, 79, 82–84, 87 |
| Non-ref_Chr1_p22.2 | Chr1: 90914512 | AAAAAGCTCTTTCAG | N | 49, 71, 77, 79, 85, 86 |
| Non-ref_Chr4_q12 | Chr4: 53628490 | TAAATTACAGGTTA | N | 49, 71, 77, 79, 82, 83, 86 |

(a) Y, yes; N, no.

To further characterize the transduction family, we analyzed the complete internal sequence of eight of its members found in either hiPSC-CRL1502 or hiPSC-CRL2429, including the de novo L1 insertion. A consensus sequence was obtained for the lineage progenitor, donor, and de novo L1s, as well as for another L1 nonreference (Non-ref) element, named Non-ref_Chr3_p24.3, via capillary sequencing of multiple full-length amplicons derived from independent PCRs (Fig. 2). Internal and flanking sequences for four additional reference (Ref) elements (Ref_Chr7_q21.3, Ref_Chr1_p31.1_a, Ref_Chr1_p31.1_b, and Ref_Chr9_p23) were obtained from the reference genome assembly. The 5′ and 3′ L1-genome junctions of the remaining six nonreference elements (Non-ref_Chr3_p12.2_a, Non-ref_Chr3_p12.2_b, Non-ref_ChrX_p11.4, Non-ref_Chr17_q12, Non-ref_Chr1_p22.2, and Non-ref_Chr4_q12) were provided by previous studies (Table 3).
FIG 2

The reprogramming-associated de novo L1 insertion belonged to an extended L1 transduction family. The diagram shows 14 members of this family, including the de novo L1 insertion. Two alleles of the lineage progenitor L1 were characterized. TSDs flanking each L1 are represented by blue arrows. The 5ʹ UTR and 3ʹ UTR sequences are shown in dark gray, while ORFs with known and unknown sequences are shown in white and light gray, respectively. Transduction colors match their source L1 locus: donor L1 → de novo L1 (orange), Ref_Chr7_q21.3 → Non-ref_ChrX_p11.4 (pink), lineage progenitor L1 → all other family members (purple). Letter and number combinations within L1s correspond to L1.3 nucleotide (lowercase) and ORF1 and ORF2 amino acid (purple uppercase) positions (88). Nucleotide changes versus L1.3 and present in all, some, or one of the sequenced members of the transduction family are shown in gray, black, and blue, respectively. Nucleotide changes unique to the two alleles of the lineage progenitor are shown in green, and nucleotide changes unique to the donor L1 and de novo L1 are shown in pink.

Notably, the homozygous lineage progenitor L1 had two allelic variants in hiPSC-CRL2429 cells, which were distinguished by four single nucleotide variants. Allele 1 contained a nonsynonymous change (D523H) in the ORF2p RT domain, which was not found in allele 2. Further analysis of the remaining family members relative to the sequence of L1.3 (88) indicated that each contained internal single nucleotide variants common to both progenitor element alleles, in addition to shared 3ʹ transduced sequences (Fig. 2). The de novo and donor elements were identical in their L1 sequences, and the 5ʹ transduced sequence carried by the de novo insertion exactly matched the 10 nt directly upstream of the donor element.
Surprisingly, in addition to the de novo L1 insertion, two other elements, Ref_Chr1_p31.1_a and Non-ref_ChrX_p11.4, each carried both 5ʹ and 3ʹ transductions, enabling us to unambiguously identify their respective donor L1 sequences (the lineage progenitor and Ref_Chr7_q21.3, respectively), which were also members of the transduction family (Fig. 2). Interestingly, the 539-nt 5ʹ transduction carried by Ref_Chr1_p31.1_a was preceded by a single untemplated guanine, suggesting that the template mRNA was capped (18, 89), and utilized a transcription start site in the 5ʹ long terminal repeat (LTR7Y) sequence of a human endogenous retrovirus type H (HERV-H) provirus integrated ∼126 kb upstream of the lineage progenitor L1 (Fig. 2). This mRNA template incorporated two exons upstream of the lineage progenitor L1, which were spliced together and to the L1 via sites strongly resembling consensus mammalian splice donor and acceptor sequences (Fig. 3). Another element, Non-ref_Chr3_p24.3, incorporated a nonsense mutation predicted to truncate ORF2 prior to its RT domain. In sum, these experiments characterized relationships among members of a transduction family, which, in many cases, remain potentially capable of retrotransposition in the germ line, in tumors (37, 49), and, as shown here, in hiPSCs.
FIG 3

A spliced RNA initiated from an HERV-H LTR7Y transcription start site upstream of the lineage progenitor L1 provided the proposed intermediate template for the 5′ transduction carried by Ref_Chr1_p31.1_a. The proposed RNA structure and splice donor (SD) and acceptor (SA) sequences are provided, as are the consensus splice donor and acceptor motifs. An untemplated guanine was present at the 5′ end of Ref_Chr1_p31.1_a, suggesting that the proposed intermediate template RNA was capped (18, 89).

Transduction family mobilization in vitro.

To assess the retrotransposition competence of several members of the transduction family, we employed a cultured-cell engineered L1 retrotransposition reporter assay (8) in HeLa cells. Briefly, in this assay, an L1 sequence is cloned into a vector containing an antibiotic resistance cassette oriented antisense to the L1 copy. The resistance gene contains an intron oriented in sense to the L1, so antibiotic resistance arises only after splicing and retrotransposition of the reporter cassette (8, 90) (Fig. 4A). Through this approach, we tested the following elements: a known hot RC-L1 (L1.3) as a positive control (88, 91), an RT mutant L1 (L1.3 RT−) as a negative control (6), both detected alleles of the lineage progenitor L1, the donor L1 (identical in sequence to the de novo L1), and Non-ref_Chr3_p24.3, which contained an ORF2 stop codon in its RT domain (Fig. 2). Each element was tested in triplicate experiments under the control of its native L1 promoter (Fig. 4B).
FIG 4

L1 transduction family members are retrotransposition competent in vitro. (A) At top is a schematic of the cultured-cell L1 retrotransposition reporter assay (8). An L1 driven by its native promoter is tagged with an intron-containing (SD, splice donor; SA, splice acceptor) G418 antibiotic resistance gene cassette (Neo) oriented antisense to the L1 (black circle, polyadenylation signal). Transcription, splicing, and retrotransposition of the L1 reporter generate newly integrated engineered L1 insertions with a functional and expressed Neo cassette. The bottom diagram is a summary of the retrotransposition assay protocol. (B) L1 transduction family members were tested via the L1 retrotransposition reporter assay in cultivated HeLa cells. Elements included positive (L1.3) (88, 91) and negative (L1.3 RT−) (6) controls, two identified alleles of the lineage progenitor L1, the donor L1 (identical in sequence to the de novo L1), and Non-ref_Chr3_p24.3, which encoded an ORF2 stop codon. The assay was repeated three times (biological replicates) with similar results. Values represent the means ± standard deviations of colonies counted in each of three technical replicates, normalized to the value for L1.3. Representative images matching each element, tested in six-well plates, are shown below.

Among the tested elements, the lineage progenitor L1 allele 2 exhibited the highest retrotransposition activity, at 135% of L1.3 (Fig. 4B). Consistent with the progenitor L1 allele 1 carrying two nonsynonymous mutations in ORF2 not found in allele 2, resulting in Q159H and D523H amino acid changes (Fig. 2), we found that allele 1 retrotransposed at ∼74% of the efficiency observed for allele 2 and at an efficiency similar to that of L1.3 (Fig. 4B). Each progenitor L1 allele jumped at >10% of the efficiency of L1.3 and therefore met the definition of a hot RC-L1 (35).
Notably, an allele of the progenitor L1 had previously been tested, albeit in an osteosarcoma cell line and with a different reporter system, and was found to present much more limited mobilization potential in vitro (3). The most likely explanation for this difference is that the prior study tested an allele of the progenitor L1 not assayed here. This result further highlights the impact of allelic variation upon the retrotransposition efficiency of a given genomic RC-L1 copy (38, 39). The donor L1 was sequenced from a line (hiPSC-CRL2429) established from a Caucasian individual. Apart from a single nucleotide mutation in its 3ʹ UTR, this L1 was identical to one identified in a Japanese individual by a previous study, which reported its retrotransposition efficiency as 101% of L1.3 in the same reporter assay (35). Here, the donor L1 jumped at 117% of L1.3, corroborating the prior experimental results and confirming that retrotransposition-competent alleles of this L1 exist in multiple human populations. Finally, L1.3 RT− and Non-ref_Chr3_p24.3 did not retrotranspose, consistent with disabled ORF2 RT activity in each case (Fig. 4B). Overall, these results demonstrate that the de novo L1, its donor sequence, and the progenitor element of the transduction family were all hot RC-L1s in vitro.
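The relative efficiencies reported in this section can be cross-checked with a few lines of arithmetic. The sketch below is illustrative rather than a reproduction of the authors' analysis: the helper names are hypothetical, the percentages are those quoted in the text, and the hot RC-L1 cutoff is the >10% of L1.3 criterion used above.

```python
HOT_THRESHOLD_PCT = 10.0  # ">10% of the efficiency of L1.3", as stated in the text


def relative_to_reference(colonies: float, reference_colonies: float) -> float:
    """Express a colony count as a percentage of the reference (L1.3) count."""
    if reference_colonies <= 0:
        raise ValueError("reference colony count must be positive")
    return 100.0 * colonies / reference_colonies


def is_hot(pct_of_l13: float) -> bool:
    """Apply the hot RC-L1 criterion to a normalized efficiency."""
    return pct_of_l13 > HOT_THRESHOLD_PCT


# Efficiencies quoted in the text, as a percentage of L1.3.
reported = {
    "L1.3": 100.0,
    "Lineage progenitor allele 2": 135.0,
    "Donor L1": 117.0,
}

# Allele 1 is reported at ~74% of allele 2; convert that to the L1.3 scale.
allele1_pct = 0.74 * reported["Lineage progenitor allele 2"]
print(round(allele1_pct, 1))  # 99.9

for name, pct in reported.items():
    print(name, is_hot(pct))
```

The converted value of about 100% agrees with the statement that allele 1 mobilized at an efficiency similar to that of L1.3.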

L1 promoter methylation is dynamic during neurodifferentiation.

Full-length L1 mRNA transcription is a prerequisite for L1 retrotransposition in cis and is directed by an internal promoter located in the L1 5ʹ UTR (25). DNA methylation of an adjacent CpG island mediates repression of the L1 promoter (26, 31). Genome-wide, the L1-Ta subfamily is thought to be broadly hypomethylated in pluripotent cells and then methylated during differentiation, including in mature neurons (40, 49, 58, 61, 63, 67). However, the temporal methylation patterns for the L1-Ta subfamily and individual L1-Ta promoters during the various stages of neurodifferentiation to date have not been resolved. It is also unknown how quickly methylation is established upon new L1 insertions that arise in pluripotent cells. To address these questions, we applied a multiplexed L1 locus-specific bisulfite sequencing approach (52, 78) (Fig. 5A and Table 2) to assess DNA methylation among the de novo, donor, and progenitor L1 5ʹ UTR sequences, as well as the L1-Ta subfamily genome wide. This analysis was performed for both hiPSC lines and their parental fibroblasts and derivative neuronal cell populations, as surveyed by RC-seq, with the exception of the de novo L1, which was present only in hiPSC-CRL2429 (Fig. 5B and 6).
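To make the bisulfite readout concrete, the following minimal sketch scores CpG methylation in reads that are assumed to be already aligned to the unconverted reference amplicon, on the same strand and at the same length. After bisulfite conversion, a C retained at a CpG is scored as methylated, a T as unmethylated, and anything else as a mutated CpG. The function names and the toy sequences are invented for illustration and are not taken from the study.

```python
from statistics import mean, stdev


def cpg_positions(reference: str) -> list:
    """Indices of the C in every CpG dinucleotide of the reference."""
    return [i for i in range(len(reference) - 1) if reference[i:i + 2] == "CG"]


def call_read(reference: str, read: str) -> list:
    """Per-CpG calls for one read: 'M' methylated, 'U' unmethylated, 'x' mutated."""
    if len(read) != len(reference):
        raise ValueError("read must be aligned to the full reference")
    calls = []
    for i in cpg_positions(reference):
        if read[i + 1] != "G":
            calls.append("x")  # the CpG itself is disrupted
        elif read[i] == "C":
            calls.append("M")  # protected from conversion
        elif read[i] == "T":
            calls.append("U")  # converted C -> T
        else:
            calls.append("x")
    return calls


def percent_methylated(calls: list) -> float:
    """Percentage of informative CpGs called methylated."""
    informative = [c for c in calls if c != "x"]
    if not informative:
        raise ValueError("no informative CpGs")
    return 100.0 * informative.count("M") / len(informative)


def summarize(reference: str, reads: list) -> tuple:
    """Mean and standard deviation of per-read methylation percentages."""
    values = [percent_methylated(call_read(reference, r)) for r in reads]
    return mean(values), stdev(values)


# Toy amplicon with three CpGs (made-up sequence, not an L1 5' UTR).
ref = "ACGTTCGACG"
print(call_read(ref, "ACGTTTGATG"))  # ['M', 'U', 'U']
```

Summarizing per read and then averaging mirrors the mean ± standard deviation presentation used in the figure legends, where 50 reads per amplicon are displayed.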
FIG 5

L1 promoter DNA methylation is dynamic during hiPSC-CRL2429 reprogramming and neurodifferentiation. (A) L1 bisulfite sequencing analysis design. CpG dinucleotides are indicated by circles above the L1 5ʹ UTR, and their nucleotide positions are provided below. A common reverse primer (black) is combined with either an L1-Ta subfamily forward primer (purple) or an L1 locus-specific forward primer (pink) to generate PCR amplicons for multiplexed paired-end Illumina 2- by 300-mer sequencing, resolving each amplicon in full. (B) L1 CpG methylation patterns in hiPSC-CRL2429 fibroblasts, hiPSCs, and neural cells derived in vitro. Each cartoon panel corresponds to an amplicon (L1-Ta subfamily or specific L1 locus) and displays 50 random, nonidentical sequences (black circle, methylated CpG; white circle, unmethylated CpG; ×, mutated CpG). The percentage of methylated CpG is indicated in the lower right corner of each cartoon. (C) L1 promoter CpG methylation levels for the hiPSC-CRL2429 neurodifferentiation time course. Values represent the means ± standard deviations of CpG methylation of the corresponding 50 reads for each amplicon, as presented in panel B. Statistical analyses involved paired t tests, with a Bonferroni multiple-testing correction where appropriate. *, P < 0.01; **, P < 0.001; ***, P < 0.0001.

FIG 6

L1 CpG methylation patterns in hiPSC-CRL1502 fibroblasts, hiPSCs, and neural cells derived in vitro. (A) Each cartoon panel corresponds to an amplicon (L1-Ta subfamily or specific L1 locus) and displays 50 random, nonidentical sequences (black circle, methylated CpG; white circle, unmethylated CpG; ×, mutated CpG). The percentage of methylated CpG is indicated in the lower right corner of each cartoon. (B) L1 promoter CpG methylation levels for the hiPSC-CRL1502 neurodifferentiation time course. Values represent the means ± standard deviations of CpG methylation of the corresponding 50 reads for each amplicon, as presented in panel A. Statistical analyses involved paired t tests, with a Bonferroni multiple-testing correction where appropriate. *, P < 0.01; **, P < 0.001; ***, P < 0.0001.

Considering general trends observed in both hiPSC lines, the L1-Ta subfamily and individual L1 promoters were most methylated in fibroblasts and differentiated neurons and least methylated in hiPSCs and the earliest stages of neurodifferentiation (Fig. 5B and 6A). For example, 66.6%, 31.1%, and 61.0% of CpG dinucleotides surveyed in the donor L1 were methylated, on average, in hiPSC-CRL2429 fibroblasts, hiPSCs, and mature neurons, respectively. Among the two hiPSC lines, the highly significant (P < 0.0001, paired t test with Bonferroni correction) reductions in methylation observed for the donor L1 during hiPSC derivation (25.0% on average) far exceeded that seen for the lineage progenitor (12.5%) and L1-Ta subfamily (2.9%) (Fig. 5C and 6B). The lineage progenitor L1 was significantly (P < 0.001, paired t test) more methylated than the donor L1 at all time points in each hiPSC line, with the L1-Ta subfamily being methylated to a level between that of the lineage progenitor L1 and donor L1 at most time points (Fig. 5C and 6B). Notably, we observed a significant (P < 0.001, paired t test with Bonferroni correction) reduction in methylation (23.1% average decrease) for all amplicons at T5 in hiPSC-CRL2429, followed by a significant (P < 0.01) increase in methylation at T6 (20.1% average increase) (Fig. 5C). This trend was also observed at T5 for hiPSC-CRL1502, except for the donor L1 (Fig. 6B). The reasons for this pattern are presently unclear (see Discussion). Overall, these results demonstrate that DNA methylation is far more dynamic during reprogramming and differentiation for a donor L1 that can mobilize during or shortly after reprogramming than is seen for the vast majority of L1-Ta subfamily elements.
The de novo L1, which arose in hiPSC-CRL2429, could be detected at its 5ʹ L1-genome junction by site-specific PCR at time points T1 through T6 (Fig. 1C). However, as assessed by the number of unique sequencing reads generated, the PCR amplicon pool for the de novo L1 was very low in complexity at T1, perhaps due to a low percentage of cells carrying the mutation, and we therefore excluded T1 from further analysis. The de novo L1 was nonetheless consistently less methylated than its donor L1 in hiPSC-CRL2429 time points T2 through T6, with average values across these stages of 41.6% and 53.8%, respectively (Fig. 5B). Methylation ultimately increased upon the de novo L1 during neurodifferentiation, but even in neurons we observed a significant number of cells in which the de novo L1 promoter was fully demethylated. For the donor L1 and the L1-Ta subfamily, we also observed instances of cells in which these promoters were fully demethylated at various points of neuronal differentiation (Fig. 5B, Fig. 6A). These results suggest that the de novo L1 was only partially methylated subsequent to its integration into the hiPSC-CRL2429 genome and remained incompletely methylated in mature neurons. Given the disparate methylation levels observed for the de novo and donor L1 promoter regions compared to the level of the lineage progenitor L1, we examined predicted DNA-binding protein motifs (92) affected by sequence variation among these elements (Fig. 2). The 10-nt 5′ transduction carried by the de novo L1 insertion incorporated a perfect FOX (forkhead box) protein binding motif (93). Members of the FOX protein family can act as “pioneer” factors in the developmental activation of promoters located in heterochromatin (94). In addition, the T708C nucleotide mutation present in the de novo and donor L1 copies greatly increased the predicted binding affinity for retinoid X receptor (RXR) proteins to this site. 
RXR proteins are known to respond to vitamin A (95), which is a component of the B-27 medium used here for neurodifferentiation. Conversely, the C581A nucleotide mutation carried by the lineage progenitor L1, and not by the de novo or donor L1 sequences or any other member of the transduction family, removed a key nucleotide mismatch from the core of a predicted PU.1 binding motif. PU.1 is established to recruit DNA methyltransferases to genomic loci and to form a repressor complex with MeCP2, which is a key mediator of L1 silencing (96–98). These in silico analyses suggested that differential DNA-binding protein activity as a result of sequence variation may impact the methylation and transcriptional state of members of the transduction family.

DISCUSSION

The L1 transduction family identified here is the largest found to date and adds to other such families characterized by previous studies (35, 44, 54). Although the extent of the transduction family is revealed here, it is likely that additional members will be identified in the future. It should also be noted that each transduction family member, aside from the de novo L1, was either present in the reference genome or identified by earlier works (Table 3). Unusually, in addition to 3ʹ transduced sequences, 3 of the 14 family members carried 5ʹ transductions. This 5ʹ transduction frequency (21.4%) is exceptionally high, given how rarely such events are found in the human germ line (1). Two of the 5ʹ transductions were relatively short (10 nt, de novo L1; 18 nt, Non-ref_ChrX_p11.4) and likely resulted from the L1 promoter directing mRNA transcriptional initiation upstream of L1 position +1. The third 5ʹ transduction identified was significantly longer (539 nt, Ref_Chr1_p31.1_a) and resulted from transcription initiated by the 5ʹ LTR of an upstream HERV-H proviral sequence, followed by splicing of this mRNA into a site adjacent to the donor L1. The inclusion of both LTR and internal HERV-H sequences in an L1 5ʹ transduction was an intriguing result as most heritable L1 insertions appear to arise early in mammalian embryogenesis (55, 56), and HERV-H elements are highly expressed in pluripotent cells (99–103). To speculate, this example demonstrates how HERV-H activation in the early embryo could lead to L1 mobilization. Nonetheless, it remains unclear why 5ʹ transductions are generally so frequent in this family and not in other transduction families (35, 44, 54). One possibility, an ORF2p amino acid change supporting elevated RT processivity and therefore increased average L1 insertion length, was excluded by an inspection of nonsynonymous sequence variants in this region (Fig. 2). 
Also excluded was the more likely possibility of mutations in known YY1, RUNX3, or SOX transcription factor binding sites (41, 104, 105) in the lineage progenitor L1 5ʹUTR or in alternative predicted sites located in the immediate 100 nt of its 5′ genomic flank, which may alter the accuracy of RNA polymerase II transcriptional initiation (Fig. 2). Otherwise, the family exhibited extensive variation in 3ʹ transduction and poly(A) tail length, as reported elsewhere for L1 insertions arising from a common donor L1 in the human population and cancer genomes (32, 37, 44, 49, 52, 78). The discovery of a de novo L1 insertion in hiPSC-CRL2429 corroborates previous reports of endogenous and engineered L1 retrotransposition associated with reprogramming and hiPSC cultivation (58, 61). L1-mediated mutagenesis is potentially an important consideration for the use of hiPSCs in biomedical applications and as models of disease because the phenotypic properties of hiPSCs and their cellular derivatives could be compromised as a result of de novo L1 insertions (58, 106). We demonstrate here that an endogenous L1 insertion arising in an hiPSC line is maintained during neurodifferentiation, indicating that such events can be present in differentiated cell lines derived from hiPSCs. In this case, the L1 was intergenic, and the accompanying transductions did not include protein-coding exons or regulatory elements (47), lessening the probability of a functional impact in neurons carrying the L1 insertion. Although endogenous L1 retrotransposition is established to occur in the neuronal lineage (65), we did not identify any additional de novo L1 insertions that were restricted to neural cells. These events were likely to each be carried by very few cells, meaning that they may not accrue sufficient RC-seq read depth to meet the detection thresholds used here. 
Nonetheless, it is plausible that de novo L1 insertions that impact the phenotype of hiPSC-derived cells will be identified in the future, especially as gene expression changes have been observed coincident with intronic L1 insertions arising during hiPSC generation (58). DNA methylation is thought to be established on L1 sequences very early in mammalian embryogenesis (27, 28, 58, 61, 63, 67) and maintained in mature neurons. To our knowledge, L1 promoter methylation has not been explored for the various multipotent and immature neuronal cell types that arise during neurogenesis. Using in vitro hiPSC neurodifferentiation to represent neuronal development and maturation in vivo, we found that L1 promoter methylation was highly dynamic and increased as neurons matured. In each hiPSC line studied, we observed cells at multiple stages of neurodifferentiation, including mature neurons, where the donor L1 and other L1-Ta promoters were fully demethylated. Although the donor L1 was demethylated in hiPSCs compared to the methylation level of the matching parental fibroblasts, the absolute magnitudes of this change were dissimilar in the two lines (35.5% and 14.4% for hiPSC-CRL2429 and hiPSC-CRL1502, respectively). This perhaps reflected natural variation in the cohort of RC-L1s hypomethylated in each individual, before and after reprogramming. At time point T5, which follows a gliogenic switch (107–109) during neural differentiation, we also observed a consistent reduction in L1 promoter methylation. This phenomenon could reflect a genome-wide reduction in DNA methylation specific to this stage of neurodifferentiation, perhaps due to a shift in the proportion of glial and neuronal cells present in culture, and warrants further study. The de novo L1 insertion appeared to be rapidly targeted for repression by the host genome. 
During neurodifferentiation, similar transitions in methylation were observed for the de novo, donor and lineage progenitor L1s, and the L1-Ta subfamily even if the absolute methylation levels were very different among these elements. This result was consistent with epigenomic remodeling during reprogramming and neurodifferentiation (110, 111) impacting the ground state of L1 methylation genome-wide. It also suggested that the de novo L1 insertion was quickly identified and regulated by the same pathways acting upon extant L1 copies on the genome even if the degree of methylation upon the de novo L1 was significantly lower than that applied to the transduction family and its ancestral L1-Ta subfamily. L1 5' UTR sequence variants, for example the C581A nucleotide mutation carried by the lineage progenitor L1 and predicted to increase DNA methylation mediated by PU.1, could contribute to differential methylation patterns among members of the transduction family. It is also notable that the de novo L1 remained retrotransposition competent, as do many other L1 insertions occurring in hiPSCs or arising during human embryogenesis (57, 58). To speculate, if hiPSCs are taken as a model of very early development, a milieu where most heritable L1 insertions arise (55), it is plausible that RC-L1 insertions arising de novo in this context will be incompletely methylated during later development and therefore possess a disproportionate capacity for further mobilization in the soma. Ultimately, hiPSCs and hESCs present accessible models to predict how L1 subfamilies and individual L1 loci are regulated. Additional work is required to test whether these patterns are observed during mammalian development in vivo.

MATERIALS AND METHODS

hiPSC generation and neuronal differentiation.

Human induced pluripotent stem cell lines were episomally derived as previously described (76). Neuronal differentiation was performed as described previously (112) with slight modifications. Prior to neuronal differentiation, feeder-free hiPSCs were cultured in murine embryonic fibroblast (MEF)-conditioned knockout serum replacement (KOSR) medium supplemented with 100 ng/ml basic fibroblast growth factor (b-FGF). Neuronal differentiation was initiated by supplementing KOSR medium with the dual SMAD inhibitors SB431542 (10 μM) and dorsomorphin (1 μM). This medium was gradually exchanged for 3 N medium (1:1 medium mix of N-2- and B-27-containing medium comprised of 1:1 neurobasal/Dulbecco’s modified Eagle’s medium [DMEM]–F-12 supplemented with 2% B-27, 1% N-2, 2 mM GlutaMax, 2.5 μg/ml insulin, 0.05 mM nonessential amino acids [NEAA], 0.05 mM beta-mercaptoethanol [all from Life Technologies]) in 25% incremental steps on days 4, 6, 8, and 10. Neural rosettes were selectively harvested, plated on Matrigel-coated tissue culture (TC) dishes, and expanded in 3 N medium supplemented with 20 ng/ml b-FGF. Around day 30, early neuronal progenitors were harvested with Accutase, seeded onto poly-L-ornithine/laminin-coated dishes (0.01% weight/volume and 20 μg/ml, respectively), and maintained in 3 N medium for the remainder of neurodifferentiation.

Immunocytochemistry.

Neural cultures were grown on Matrigel-coated plastic coverslips in 3 N medium and were fixed in 4% paraformaldehyde (Sigma) in phosphate-buffered saline (PBS) for 15 min at room temperature and permeabilized in 0.01% Triton X-100 (Ajax Finechem) in PBS for 15 min at room temperature. All cells were blocked for 1 h with 10% goat serum (Invitrogen) in PBS. Primary antibodies used were OCT4 (1:100; Millipore), NANOG (1:100; Millipore), CUX1 (1:100; Abcam), glial fibrillary acidic protein (GFAP) (1:250; Dako), TUBB3/TUJ1 (1:1,000; Covance), BRN2 (1:100; Abcam), PAX6 (1:1,000; Developmental Studies Hybridoma Bank [DSHB]), anti-phospho-histone H3 (Ser10) (1:200; Cell Signaling Technology) and were applied for 3 to 4 h at room temperature or overnight at 4°C. Isotype- and species-matched Alexa Fluor-conjugated secondary antibodies (1:1,000; Invitrogen) were applied for 1 h at room temperature. Cells were washed in PBS and mounted on glass slides with ProLong Gold antifade containing 4′,6′-diamidino-2-phenylindole (DAPI; Invitrogen) and imaged using an Olympus IX51 (Olympus) fluorescence microscope equipped with a MicroPublisher, version 3.3, real-time viewing (RTV) charge-coupled-device (CCD) camera (QImaging) using Q-Capture Pro, version 6.0, software.

Nucleic acid extraction.

A total of approximately 500,000 cells per time point were pelleted (1,000 rpm for 5 min) and then washed with Dulbecco’s phosphate-buffered saline (DPBS) (14190144; Gibco) and pelleted again (1,000 rpm for 5 min) and resuspended in 100 μl of UltraPure DNase/RNase-free distilled water (10977023; Gibco). Cells were lysed in 10 mM Tris, pH 9.0, and 1 mM EDTA, with 2% SDS and 100 μg/ml proteinase K at 65°C. A final concentration of 10 μg/ml RNase A was added to each sample and incubated at 37°C for 30 min. DNA was extracted using phenol-chloroform-isoamyl alcohol (25:24:1) and chloroform-isoamyl alcohol (24:1). DNA was precipitated with 0.1 volume of 3 M sodium acetate and 2.5 volumes of 100% isopropanol. Precipitated DNA was washed in 0.8 ml of 75% ethanol (EtOH), slightly air dried, and resuspended in 50 μl of UltraPure DNase/RNase-free distilled water (10977023; Gibco). The quality and quantity of DNA were assessed by NanoDrop (Thermo Fisher Scientific).

RC-seq.

Genomic DNA from time points T1 to T6 for each hiPSC line was analyzed by retrotransposon capture sequencing (RC-seq), as described previously (69). Each library was constructed from 2 μg of input genomic DNA (gDNA) and sequenced in multiplex on an Illumina HiSeq 2500 instrument (Macrogen, South Korea). Fibroblast samples (time point T0) were previously analyzed by RC-seq (58). A total of 726,181,832 paired-end 2- by 150-mer reads were generated across 18 libraries (Table 1). RC-seq data were analyzed with TEBreak (https://github.com/adamewing/tebreak). Reads were aligned to the hg19 reference genome sequence using Burrows-Wheeler Aligner maximal exact match (BWA-MEM) (113) with parameters -Y and -M. Duplicate reads were marked with Picard MarkDuplicates (http://broadinstitute.github.io/picard). Candidate nonreference genome L1 insertions that (i) were detected in only one of the two hiPSC lines analyzed, (ii) were absent from the matching parental fibroblasts, and (iii) did not correspond to a known nonreference germ line transposable element insertion (35, 49, 77, 79–87, 114–116) were annotated as putatively de novo (see Table S1 in the supplemental material). The remaining nonreference L1 insertions were annotated as polymorphic.
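The three annotation criteria above can be expressed as a simple filter. The following is a minimal sketch only, not TEBreak code: the `Candidate` record, its field names, and the example insertions are hypothetical and introduced purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A candidate nonreference L1 insertion (hypothetical record layout)."""
    name: str
    lines_detected: frozenset      # hiPSC lines in which the insertion was called
    in_parental_fibroblasts: bool  # detected in the matching T0 fibroblasts?
    known_polymorphism: bool       # matches a published nonreference insertion?


def annotate(candidate: Candidate) -> str:
    """Apply criteria (i) to (iii); anything failing a criterion is polymorphic."""
    private_to_one_line = len(candidate.lines_detected) == 1        # criterion (i)
    absent_from_parent = not candidate.in_parental_fibroblasts      # criterion (ii)
    not_previously_reported = not candidate.known_polymorphism      # criterion (iii)
    if private_to_one_line and absent_from_parent and not_previously_reported:
        return "putative_de_novo"
    return "polymorphic"


# Illustrative inputs only; these are not calls from the study.
examples = [
    Candidate("insertion_A", frozenset({"hiPSC-CRL2429"}), False, False),
    Candidate("insertion_B", frozenset({"hiPSC-CRL2429", "hiPSC-CRL1502"}), False, False),
    Candidate("insertion_C", frozenset({"hiPSC-CRL1502"}), True, False),
]
for c in examples:
    print(c.name, annotate(c))
```

Putatively de novo calls produced by a filter of this kind still require manual inspection and PCR validation, as described in the next section.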

PCR validation of L1 insertions.

RC-seq reads indicating putative de novo L1 insertions were manually inspected, and primers (Table 2) were designed to PCR amplify integration sites and identify the hallmarks of bona fide L1 retrotransposition events (117). Empty/filled-site, 5ʹ L1-genome junction, and 3ʹ L1-genome junction PCRs were performed. Primers were situated within flanking genomic DNA sequences for empty/filled-site PCRs. The same flanking primers were paired with appropriate L1-specific primers for L1-genome junction assays. Empty/filled-site PCRs were performed using 1.75 U of Expand Long Template enzyme (04829069001; Roche), 5 μl of 5× buffer with 12.5 mM MgCl2, 1.25 μl of 100% dimethyl sulfoxide (DMSO), 1.25 μl of 10 mM deoxynucleoside triphosphates (dNTPs), 1 μl of primer mix (25 μM each primer), 4 ng of genomic DNA template, and molecular-grade water in a final volume of 25 μl under the following cycling conditions: 92°C for 2 min, followed first by 10 cycles at 92°C for 10 s, 59°C for 15 s, and 68°C for 6.5 min and then by 30 cycles at 92°C for 2 min, 59°C for 15 s, and 68°C for 6.5 min plus 20 s of extension time per cycle, with a single extension step at 68°C for 10 min. The 5ʹ and 3ʹ L1-genome junction PCRs were performed using 2 U of MyTaq hot-start DNA polymerase (BIO-21112; Bioline), 1× PCR buffer, 1 μM each primer, 5 ng of genomic DNA template, and molecular-grade water in a final volume of 25 μl. Cycling conditions were as follows: 95°C for 2 min, followed by 35 cycles at 95°C for 30 s, 58°C for 30 s, and 72°C for 3 min, with a single extension step of 72°C for 5 min. Amplified fragments were resolved on 1% and 2% agarose gels (1× Tris-acetate-EDTA [TAE] buffer) stained with SYBR Safe (Life Technologies) for empty/filled-site and 5ʹ and 3ʹ junction PCR assays, respectively, and imaged using a Typhoon FLA 9500 (GE Healthcare Life Sciences, USA). 
Amplicons of the expected size were excised from the gels, and DNA was extracted using a QIAquick gel extraction kit (28704; Qiagen), followed by capillary sequencing to confirm and characterize L1 insertion structural features.

L1 genotyping and cloning.

To facilitate cloning of full-length L1 insertions, a NotI restriction enzyme sequence (5ʹ-GC/GGCCGC) was introduced at the 5ʹ end of each forward primer close to the L1-genome junction. Purified PCR products (500 ng) approximately 6 kbp in size were digested with NotI and BstZ17I (R3138; New England Biolabs) in 1× CutSmart buffer at 37°C for 1 h. Digestion reactions were run in 2% agarose gels (1× TAE buffer), purified by phenol-chloroform extraction, and cloned using the TOPO-XL PCR cloning kit (K4700-20; Life Technologies) according to the manufacturer’s instructions. Five microliters of the ligation product was used to transform One Shot TOP10 electrocompetent bacteria as per the manufacturer’s instructions. LB agar containing 0.5 μg/ml of kanamycin was used to plate bacteria, which were incubated at 37°C overnight. Single colonies were picked and transferred to 5 ml of LB liquid containing 0.5 μg/ml of kanamycin for Miniprep plasmid purification (12143; Qiagen). To filter induced PCR mutations and distinguish possible allelic variants, at least four independent PCR products and clones from each L1 transduction family member were capillary sequenced using 12 overlapping primer pairs (Table 2) distributed at ∼500-bp intervals covering the entire L1 sequence. Each independent clone sequence was then manually assembled and aligned with the other clones of the same element using Clustal Omega (https://www.ebi.ac.uk/Tools/msa/clustalo/). For each L1, a consensus sequence was obtained, and a mutation-free construct was reconstructed by performing multiple restriction enzyme digestions. The desired fragments were resolved in a 2% agarose gel (1× TAE buffer), purified, and ligated into a pCEP4 vector using T4 ligase in a 5:1 (insert/vector) ratio. Five microliters of the ligation product was used to transform One Shot TOP10 chemically competent bacteria (C404010; Invitrogen) as per the manufacturer’s instructions. 
LB agar containing 1 μg/ml of ampicillin was used to plate the bacteria, and these were incubated at 37°C overnight. Single colonies were picked and transferred to 5 ml of LB liquid containing ampicillin for Miniprep plasmid purification. To verify the fidelity of the resultant clones, these were capillary sequenced, as described above, using 12 different primers covering the entire L1 sequence. Retrotransposition indicator plasmids termed L1.3 and L1.3 RT− were generated through modification of the pCEP4 backbone of pJM101/L1.3 (14, 91) and pJM105/L1.3 (118) by removing a BglII fragment containing the cytomegalovirus (CMV) promoter. The full L1.3 3ʹ UTR, except for a point mutation disrupting the native L1 polyadenylation signal, was reintroduced, and a PacI site was incorporated between the L1.3 3ʹ UTR and the Neo cassette (F. J. Sanchez-Luque and G. J. Faulkner, unpublished data). The mutation-free full-length transduction family members described above were then introduced into this retrotransposition indicator backbone. DNA-binding protein motif analyses of the lineage progenitor, donor, and de novo L1 sequences were performed using the Catalog of Inferred Sequence Binding Preferences (CIS-BP) database (92).

Retrotransposition assay.

HeLa-JVM cells were grown in a humidified, 5% CO2 incubator at 37°C in high-glucose Dulbecco’s modified Eagle’s medium (DMEM) without pyruvate (11965-092; Gibco), supplemented with 10% fetal bovine serum (26400-044; Gibco), 2 mM L-glutamine, 100 U/ml penicillin, and 100 μg/ml streptomycin (10378-016; Gibco) (DMEM complete). Plasmid DNA was purified using a Midi kit (13343; Qiagen) and diluted in sterile water to 0.5 μg/μl. Cells were transfected and seeded at 5 × 10³ cells/well in six-well plates using FuGENE HD transfection reagent (Promega) at a ratio of 4 μl to 1 μg of plasmid DNA. Selection with G418 began 72 h after transfection and continued every 48 h for 14 days (6). Transfection efficiency assays were performed in parallel by cotransfection of pCAG-enhanced green fluorescent protein (EGFP) with L1 reporter plasmids, as described above, with 0.5 μg of each construct and 0.5 μg of pCAG-EGFP. Cells were analyzed by flow cytometry 48 h posttransfection on a Cytoflex flow cytometer (Beckman-Coulter) at the Translational Research Institute Flow Cytometry Core. The results were used to normalize the G418-resistant colony counts with the percentage of EGFP-positive cells for each L1 reporter construct obtained in the retrotransposition assay, as performed previously (118).
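One way to carry out the normalization described above is to divide each colony count by the EGFP-positive fraction measured for the same construct. This is a sketch under that assumption; the exact formula used previously (118) is not restated here, and the function names, the optional scaling to a reference construct, and all numbers are hypothetical.

```python
def normalize_colonies(colony_count: float, egfp_positive_pct: float) -> float:
    """Scale a G418-resistant colony count by transfection efficiency.

    egfp_positive_pct is the percentage (0-100] of EGFP-positive cells
    measured for the same construct in the parallel cotransfection.
    """
    if not 0 < egfp_positive_pct <= 100:
        raise ValueError("EGFP-positive percentage must be in (0, 100]")
    return colony_count / (egfp_positive_pct / 100.0)


def relative_activity(test_normalized: float, reference_normalized: float) -> float:
    """Express a construct's normalized count as a percentage of a reference construct."""
    if reference_normalized <= 0:
        raise ValueError("reference value must be positive")
    return 100.0 * test_normalized / reference_normalized


# Hypothetical counts and efficiencies, for illustration only.
reference = normalize_colonies(colony_count=120, egfp_positive_pct=50.0)
test = normalize_colonies(colony_count=90, egfp_positive_pct=25.0)
print(reference, test, relative_activity(test, reference))
```

Normalizing in this way prevents a construct that simply transfected more efficiently from appearing more retrotransposition competent.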

L1 CpG methylation analyses.

L1-Ta subfamily-wide and L1 locus-specific bisulfite sequencing for each time point in hiPSC-CRL1502 and hiPSC-CRL2429 was performed as described previously (52). Briefly, 500 ng of gDNA was bisulfite treated using an EZ DNA Methylation Lightning kit (Zymo Research), allowing 20 min desulfonation time and eluting in a 25-μl volume. Primers L1_Bis-F and L1_Bis-R were used to amplify the L1-Ta 5ʹ UTR region containing a CpG island (Table 2), while for the L1 locus-specific reactions, L1_Bis-R was combined with one of three forward primers placed in the genomic flank of the lineage progenitor, donor, and de novo L1 insertions (L1_Bis-LP, L1_Bis-Donor, and L1_Bis-DN, respectively). PCRs incorporated 1 U of MyTaq hot-start DNA polymerase (BIO-21112; Bioline), 2 μl of bisulfite-treated gDNA from each sample, 1× reaction buffer, and 2 μM each primer, in a 20-μl final volume. PCR cycling conditions were as follows: 95°C for 2 min, followed by 40 cycles of 95°C for 30 s, 54°C for 30 s, and 72°C for 30 s, with a single extension step at 72°C for 5 min. Barcoded libraries were prepared from amplicons pooled by time point and sample using a TruSeq DNA PCR-free library preparation kit (FC-121-3001/2; Illumina) and subjected to multiplexed paired-end 2- by 300-mer sequencing using an Illumina MiSeq platform. Data were processed as described previously (52) and visualized using QUMA (119) with default parameters.
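The per-amplicon methylation percentages can be illustrated with a short sketch. It is not the QUMA implementation: it assumes reads are already aligned gap-free to the unconverted reference amplicon on the same strand, so that each CpG cytosine is read as C if methylated and as T if unmethylated after bisulfite conversion. The toy reference, the reads, and all function names are hypothetical.

```python
import random


def cpg_positions(reference: str) -> list[int]:
    """0-based index of the C in every CpG dinucleotide of the reference."""
    ref = reference.upper()
    return [i for i in range(len(ref) - 1) if ref[i:i + 2] == "CG"]


def call_cpgs(read: str, positions: list[int]) -> str:
    """Per-CpG calls: M = methylated (C), U = unmethylated (T), X = mutated."""
    calls = []
    for i in positions:
        base = read[i].upper()
        calls.append("M" if base == "C" else "U" if base == "T" else "X")
    return "".join(calls)


def percent_methylated(calls: list[str]) -> float:
    """Methylated CpGs as a percentage of all informative (M or U) CpGs."""
    methylated = sum(c.count("M") for c in calls)
    unmethylated = sum(c.count("U") for c in calls)
    informative = methylated + unmethylated
    return 100.0 * methylated / informative if informative else float("nan")


def sample_unique_reads(reads: list[str], n: int = 50, seed: int = 0) -> list[str]:
    """Pick up to n random, nonidentical reads, as displayed per cartoon panel."""
    unique = sorted(set(reads))
    return unique if len(unique) <= n else random.Random(seed).sample(unique, n)


# Toy amplicon with three CpGs and three toy reads (illustrative only).
reference = "ACGTTCGACGA"
reads = ["ACGTTCGACGA", "ATGTTTGATGA", "ACGTTTGAAGA"]
positions = cpg_positions(reference)
calls = [call_cpgs(r, positions) for r in sample_unique_reads(reads)]
print(positions, sorted(calls), percent_methylated(calls))
```

Mutated CpGs (X) are excluded from the denominator here; whether to count them otherwise is a design choice that should match the analysis tool actually used.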

Accession number(s).

RC-seq FASTQ files were deposited in the European Nucleotide Archive under accession number PRJEB27103.
  118 in total

1.  Isolation of an active human transposable element.

Authors:  B A Dombroski; S L Mathias; E Nanthakumar; A F Scott; H H Kazazian
Journal:  Science       Date:  1991-12-20       Impact factor: 47.728

2.  Primate-specific ORF0 contributes to retrotransposon-mediated diversity.

Authors:  Ahmet M Denli; Iñigo Narvaiza; Bilal E Kerman; Monique Pena; Christopher Benner; Maria C N Marchetto; Jolene K Diedrich; Aaron Aslanian; Jiao Ma; James J Moresco; Lynne Moore; Tony Hunter; Alan Saghatelian; Fred H Gage
Journal:  Cell       Date:  2015-10-22       Impact factor: 41.582

3.  Undermethylation of specific LINE-1 sequences in human cells producing a LINE-1-encoded protein.

Authors:  R E Thayer; M F Singer; T G Fanning
Journal:  Gene       Date:  1993-11-15       Impact factor: 3.688

4.  The primate-specific noncoding RNA HPAT5 regulates pluripotency during human preimplantation development and nuclear reprogramming.

Authors:  Jens Durruthy-Durruthy; Vittorio Sebastiano; Mark Wossidlo; Diana Cepeda; Jun Cui; Edward J Grow; Jonathan Davila; Moritz Mall; Wing H Wong; Joanna Wysocka; Kin Fai Au; Renee A Reijo Pera
Journal:  Nat Genet       Date:  2015-11-23       Impact factor: 38.330

5.  Retinoic acid receptors and retinoid X receptors: interactions with endogenous retinoic acids.

Authors:  G Allenby; M T Bocquel; M Saunders; S Kazmer; J Speck; M Rosenberger; A Lovey; P Kastner; J F Grippo; P Chambon
Journal:  Proc Natl Acad Sci U S A       Date:  1993-01-01       Impact factor: 11.205

6.  dbRIP: a highly integrated database of retrotransposon insertion polymorphisms in humans.

Authors:  Jianxin Wang; Lei Song; Deepak Grover; Sami Azrak; Mark A Batzer; Ping Liang
Journal:  Hum Mutat       Date:  2006-04       Impact factor: 4.878

7.  A comprehensive map of mobile element insertion polymorphisms in humans.

Authors:  Chip Stewart; Deniz Kural; Michael P Strömberg; Jerilyn A Walker; Miriam K Konkel; Adrian M Stütz; Alexander E Urban; Fabian Grubert; Hugo Y K Lam; Wan-Ping Lee; Michele Busby; Amit R Indap; Erik Garrison; Chad Huff; Jinchuan Xing; Michael P Snyder; Lynn B Jorde; Mark A Batzer; Jan O Korbel; Gabor T Marth
Journal:  PLoS Genet       Date:  2011-08-18       Impact factor: 5.917

8.  Endogenous retrotransposition activates oncogenic pathways in hepatocellular carcinoma.

Authors:  Ruchi Shukla; Kyle R Upton; Martin Muñoz-Lopez; Daniel J Gerhardt; Malcolm E Fisher; Thu Nguyen; Paul M Brennan; J Kenneth Baillie; Agnese Collino; Serena Ghisletti; Shruti Sinha; Fabio Iannelli; Enrico Radaelli; Alexandre Dos Santos; Delphine Rapoud; Catherine Guettier; Didier Samuel; Gioacchino Natoli; Piero Carninci; Francesca D Ciccarelli; Jose Luis Garcia-Perez; Jamila Faivre; Geoffrey J Faulkner
Journal:  Cell       Date:  2013-03-28       Impact factor: 41.582

Review 9.  Transposable elements in the mammalian embryo: pioneers surviving through stealth and service.

Authors:  Patricia Gerdes; Sandra R Richardson; Dixie L Mager; Geoffrey J Faulkner
Journal:  Genome Biol       Date:  2016-05-09       Impact factor: 13.583

Review 10.  L1 retrotransposition in the soma: a field jumping ahead.

Authors:  Geoffrey J Faulkner; Victor Billon
Journal:  Mob DNA       Date:  2018-07-07
