The evidence for increased L1 activity in the site of human adult brain neurogenesis.

Alexey A Kurnosov1, Svetlana V Ustyugova2, Vadim I Nazarov3, Anastasia A Minervina1, Alexander Yu Komkov1, Mikhail Shugay1, Mikhail V Pogorelyy1, Konstantin V Khodosevich4, Ilgar Z Mamedov1, Yuri B Lebedev1.   

Abstract

Retroelement activity is a common source of polymorphisms in the human genome. The mechanism whereby retroelements contribute to intraindividual genetic heterogeneity by inserting into the DNA of somatic cells is gaining increasing attention. Brain tissues are suspected to accumulate genetic heterogeneity as a result of the somatic activity of retroelements. This study aims to expand our understanding of the role retroelements play in generating somatic mosaicism of neural tissues. Whole-genome Alu and L1 profiling of genomic DNA extracted from the cerebellum, frontal cortex, subventricular zone, dentate gyrus, and the myocardium revealed hundreds of somatic insertions in each of the analyzed tissues. Interestingly, the highest concentration of such insertions was detected in the dentate gyrus, the hotspot of adult neurogenesis. Insertions of retroelements and their activity could produce genetically diverse neuronal subsets, which can be involved in hippocampal-dependent learning and memory.

Year:  2015        PMID: 25689626      PMCID: PMC4331437          DOI: 10.1371/journal.pone.0117854

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Approximately 40% of the human genome consists of multiple copies of retroelements (REs), a legacy of their successful expansion in the course of mammalian evolution [1]. The major groups of retroelements present in the human genome are the Long Terminal Repeat (LTR) retrotransposons, the Long Interspersed Nuclear Elements (LINEs), the Short Interspersed Nuclear Elements (SINEs) (most of which belong to the Alu family), and the SINE-R/VNTR/Alu (SVA) elements. The only family of autonomous non-LTR REs known to be currently active is L1. The mobility of non-LTR REs from the other two groups (Alu and SVA) relies on the L1 retrotransposition machinery. Polymorphic insertions found in the genomes of only part of the human population usually result from the recent retrotranspositional activity of REs which have retained the capacity to amplify themselves. These insertions belong to the evolutionarily young RE groups. Around 60–80 LINE insertions in the human genome considered to be capable of retrotransposing belong to the L1Hs subfamily [2]. Most of the active Alu copies are attributed to the AluYa5 and AluYb8 subfamilies [3-5]. Insertions of retroelements can reshape the genome structure and alter the expression of nearby genes [6, 7]. Some such insertions have been reported to cause various diseases [8, 9]. It has long been considered that the majority of retrotranspositions occur in the germ line, while the mobility of REs in somatic cells is strictly suppressed. Mammalian cells possess multiple mechanisms of inhibiting RE expression, including siRNA [10], miRNA [11], piRNA-induced L1 methylation [12], and repression of the methylated L1 promoters by methyl-CpG-binding protein 2 (MeCP2) [13]. Recently, however, sufficient data have accumulated to suggest that RE activity may be a common property of cells in somatic tissues [14, 15]. For example, somatic L1 and Alu insertions have been found in cancerous cells [16-19].
Furthermore, studies with the retrotransposition-reporter L1-EGFP plasmid in human embryonic stem cell lines [20] and in rat and mouse models [21] provided evidence for mobilization of L1 retroelements in embryonic tissues. An increased frequency of retrotransposition events in neurogenesis was demonstrated by introducing the L1-retroposition-reporter construct into rat [22] and human [23] neural progenitor cells. An increase in the number of L1 copies in human brain tissues in comparison with non-neural tissues was found by quantitative PCR (qPCR) [23]. Somatic RE insertions were also detected in the human brain and induced pluripotent stem cells by next generation sequencing of DNA libraries enriched in RE-containing sequences [24, 25]. This approach revealed thousands of somatic L1, Alu, and SVA integrations in the hippocampi and caudate nuclei of the donors. However, whole-genome L1 profiling in single neurons from the human caudate nucleus and cerebral cortex estimated the frequency of somatic insertions at less than 0.6 L1 insertions per neuron [26]. Whole-genome sequencing was also applied to identify somatic L1 retrotranspositions in the brains of patients with schizophrenia, where an increased number of mobile element insertions was predicted by qPCR analysis [27].
Here we applied a high-throughput approach to directly compare the number of autonomous (L1) and non-autonomous (Alu) retroelement somatic insertions in various human adult brain regions and a control non-nervous tissue. Whole-genome Alu and L1 profiling was performed for four brain regions including the dentate gyrus, the only region of the adult human brain that was shown to retain significant neurogenic capacity.

Results

Sample choice: neurogenic and non-neurogenic tissues

Previously, activation of L1 retrotranspositions was associated with a switch from neural stem cells to fast proliferating neural progenitor cells in vitro and in vivo in the mouse dentate gyrus [22, 28]. The subgranular zone (SGZ) of the dentate gyrus (DG) together with the subventricular zone (SVZ) of the lateral ventricles are the only regions in the mammalian brain that continue to generate new neurons throughout the animal's life [29]. In the adult human brain, substantial neurogenesis was described in the dentate gyrus [30, 31], whereas the data regarding SVZ neurogenesis are controversial [32-35]. To analyze whether continuous proliferation of precursor cells affects somatic retrotranspositions, we chose five samples taken from different tissues of a single individual for the whole-genome L1 and Alu profiling. The tissues represented neurogenic brain regions (the DG and the potentially neurogenic SVZ), non-neurogenic brain regions (the cerebellum and the frontal cortex), and a non-neural control tissue (the myocardium).

Retrieving somatic insertions by next generation sequencing

For library preparation, we modified the suppression PCR-based method of retrieving RE terminal and flanking sequences from genomic DNA [36, 37]. Briefly, our protocol of somatic insertion identification included the following steps (Fig. 1; see Materials and Methods for more details): 1) Ligation of the suppression adapters to the restricted genomic DNA. 2) Two steps of suppression PCR selectively amplifying REs of the AluYa5 or the L1Hs subfamily. L1 libraries were constructed from the L1 3’-termini, as LINEs are often 5’-truncated. In contrast, we chose the 5’-flanking sequences of the Alu repeats for library construction so that the 3’-polyA sequences of Alu would not be included and the informative read length would be preserved. DNA molecules in the resulting libraries consisted of a short retroelement fragment, its flanking sequence, which served to identify the insertion coordinate, and an adapter sequence. The libraries comprised DNA fragments representing two types of insertions: fixed and germline insertions (coming from all cells of the samples) and somatic insertions (coming from a few cells or even a single cell). 3) Illumina sequencing of the DNA libraries. 4) Mapping of the reads to the reference human genome and analysis of the mapping results to reveal the coordinates of potentially somatic insertions. We defined an insertion as potentially somatic if its coordinate did not match the known RE insertion coordinates in the reference genome and if it was found in the library originating from only one of the studied tissue samples. A further constraint on the reads potentially representing somatic retrotranspositions was that the insertion coordinate could be established with single-base resolution. Thus, only those Illumina read pairs that had a mapping-informative part in the retroelement-containing read were considered to represent potentially somatic insertions.
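The selection rule described above (a coordinate absent from the reference annotation and present in exactly one tissue library) can be sketched in Python as follows. This is a minimal illustration, not the authors' pipeline: the function name `potentially_somatic`, the `(chromosome, position)` tuples, and the toy coordinates are all hypothetical, and the single-base-resolution and false-positive filters are omitted.

```python
from collections import Counter

def potentially_somatic(tissue_calls, reference_sites):
    """Return, per tissue, insertion coordinates that are absent from the
    reference annotation and seen in exactly one tissue library.

    tissue_calls: dict mapping tissue name -> set of (chrom, position)
    reference_sites: set of (chrom, position) for known RE insertions
    """
    # Count in how many tissue libraries each coordinate was observed.
    seen_in = Counter(site for calls in tissue_calls.values() for site in calls)
    return {
        tissue: {
            site for site in calls
            if site not in reference_sites and seen_in[site] == 1
        }
        for tissue, calls in tissue_calls.items()
    }

# Toy data (hypothetical coordinates, for illustration only).
reference = {("chr1", 1000)}
calls = {
    "DG": {("chr1", 1000), ("chr2", 500), ("chr3", 42)},
    "cerebellum": {("chr1", 1000), ("chr2", 500), ("chr4", 77)},
    "myocardium": {("chr1", 1000), ("chr2", 500)},
}
candidates = potentially_somatic(calls, reference)
print(candidates)
```

In this toy input the non-reference coordinate on chr2 occurs in every tissue, so it is excluded from the somatic candidates, in the same way that non-reference insertions shared by all tissues are treated in the text as polymorphic or germline.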
Fig 1

Retroelements flanking sequences library preparation.

Small vertical arrows show the restriction sites. Horizontal arrows show PCR primers.

A total of 10,709,681 and 61,213,133 high-quality reads were obtained for L1 and Alu libraries, respectively. The detailed information on the number of reads and somatic insertions detected in each library is shown in Table 1. 817 out of 1,528 (53.5%) reference L1Hs and 2,933 out of 3,918 (74.9%) reference AluYa5 insertions were detected in the libraries. We have also found the coordinates of 167 non-reference L1 and 653 non-reference Alu retroelements which were detected in all tissues and therefore can be regarded as polymorphic or novel germline insertions.
Table 1

The number of potentially somatic L1 and Alu insertions detected in different tissue samples and the data on their distribution in genome.

Parameter | cerebellum | frontal cortex | SVZ | DG | myocardium
L1
Number of high-quality reads | 2723127 | 825363 | 1845367 | 3435529 | 1880295
Number of reads representing potentially somatic insertions | 1712 | 475 | 1161 | 3211 | 1170
Number of potentially somatic insertions | 1651 | 462 | 1133 | 3100 | 1151
% of reads representing potentially somatic insertions | 0.0629 | 0.0576 | 0.0629 | 0.0935 | 0.0622
Number (%) of somatic L1 detected in genes | 842 (51.00) | 236 (51.08) | 584 (51.54) | 1558 (50.26) | 578 (50.22)
Number (%) of somatic L1 detected in 5 kb upstream of genes | 92 (5.57) | 31 (6.71) | 74 (6.53) | 177 (5.71) | 62 (5.39)
Alu
Number of high-quality reads | 11978540 | 11962901 | 10921385 | 13339041 | 13011266
Number of reads representing potentially somatic insertions | 1376 | 2217 | 1353 | 3079 | 1275
Number of potentially somatic insertions | 1317 | 2138 | 1308 | 2984 | 1243
% of reads representing potentially somatic insertions | 0.0115 | 0.0185 | 0.0124 | 0.0231 | 0.0098
Number (%) of somatic Alu detected in genes | 623 (47.30) | 1028 (48.08) | 609 (46.56) | 1465 (49.10) | 589 (47.39)
Number (%) of somatic Alu detected in 5 kb upstream of genes | 67 (5.09) | 105 (4.91) | 61 (4.66) | 105 (3.52) | 55 (4.42)
Combined set of L1 and Alu
% of somatic retroelements detected in genes | 49.36 | 48.62 | 48.87 | 49.69 | 48.75
% of somatic retroelements detected in 5 kb upstream of genes | 5.36 | 5.23 | 5.53 | 4.64 | 4.89
A total of 7,497 potentially somatic L1 and 8,990 potentially somatic Alu insertions were identified in the tissue libraries (Table 1). All the detected insertions were represented by a very low number of reads (no more than 4 reads for L1 and 5 for Alu, and overwhelmingly by a single read). As the number of detected somatic insertions obviously depends on the total number of reads in a library, we normalized the number of insertions by dividing it by the number of reads. The percentage of the reads derived from potentially somatic insertions differed among the libraries (see Fig. 2). Strikingly, whereas the percentage of somatic L1 insertions was approximately equal for the cerebellum, cortex, SVZ and myocardium (0.058–0.063%), the dentate gyrus cells exhibited a significantly higher percentage of L1 retrotranspositional events (0.093%) (Fig. 2A; p<0.0001, Poisson test). Pair-wise Poisson tests also confirmed that only the dentate gyrus exhibited a higher rate of L1 integration compared to the other brain regions and the myocardium, while the latter samples did not differ from one another (p<0.0001 for the dentate gyrus and p>0.05 for all other samples, Poisson tests). The highest percentage of somatic Alu insertions (0.023%) was observed in the DG (Fig. 2B). However, in contrast to the L1 data, the frontal cortex also exhibited a comparatively high percentage of somatic Alu insertions (0.018%). The percentage of somatic Alu insertions for the cerebellum, SVZ, and myocardium was much lower (0.011, 0.012 and 0.010%). Nevertheless, using the pair-wise Poisson tests we found that not only the dentate gyrus and the frontal cortex, but also the cerebellum, SVZ, and myocardium samples were different from all the other samples (except cerebellum vs SVZ, where p = 0.0506). Thus, whereas somatic L1 insertions were equally distributed in the analyzed samples (~0.06%), with the exception of the dentate gyrus, the percentage of somatic Alu insertions varied among different brain regions.
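The normalization and the comparison of the dentate gyrus with the remaining samples can be reproduced from the L1 read counts in Table 1. The sketch below is an illustration only: it uses a normal approximation to the conditional test for two Poisson rates, which is an assumption and not necessarily the exact Poisson test procedure the authors ran, and the function names are hypothetical.

```python
import math

# L1 libraries, from Table 1: (high-quality reads,
# reads representing potentially somatic insertions).
L1_READS = {
    "cerebellum": (2723127, 1712),
    "frontal cortex": (825363, 475),
    "SVZ": (1845367, 1161),
    "DG": (3435529, 3211),
    "myocardium": (1880295, 1170),
}

def percent_somatic(total_reads, somatic_reads):
    """Normalized number of somatic insertions, as a percentage of reads."""
    return 100.0 * somatic_reads / total_reads

def poisson_rate_z_test(k1, n1, k2, n2):
    """Compare two Poisson rates k1/n1 and k2/n2 (normal approximation).

    Conditional on k1 + k2, k1 is binomial with success probability
    n1 / (n1 + n2) if both samples share the same rate.
    Returns (z, two-sided p-value).
    """
    total = k1 + k2
    p0 = n1 / (n1 + n2)
    z = (k1 - total * p0) / math.sqrt(total * p0 * (1.0 - p0))
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

percentages = {t: round(percent_somatic(n, k), 4) for t, (n, k) in L1_READS.items()}

# DG versus all other samples pooled.
dg_total, dg_somatic = L1_READS["DG"]
rest_total = sum(n for t, (n, k) in L1_READS.items() if t != "DG")
rest_somatic = sum(k for t, (n, k) in L1_READS.items() if t != "DG")
z, p = poisson_rate_z_test(dg_somatic, dg_total, rest_somatic, rest_total)
print(percentages, z, p)
```

The computed percentages match the "% of reads" row of Table 1, and the pooled comparison gives a p-value far below 0.0001 for the dentate gyrus, in line with the result reported in the text.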
Fig 2

Normalized number of the somatic insertions (number of reads representing somatic insertions divided by the total number of reads) in the DNA of the studied samples.

(A)—L1 insertions; (B)—Alu insertions. See also Table 1.

A total of 3,798 somatic L1 and 4,314 somatic Alu elements from all libraries integrated into genes (overwhelmingly into introns). Another portion of somatic insertions (436 L1 and 393 Alu) was detected within 5,000 bp upstream of the gene transcription start sites (Table 1). In order to test whether the distribution of the detected insertions in the genome was random, we generated simulated coordinate datasets of the same sizes as the experimental datasets. Repeating the simulation 1000 times for each set gave ranges of the number of insertions expected to be found in genes or promoter regions. These ranges were compared to our experimental data. In all samples, the number of L1 insertions was significantly higher than predicted both in genes (with the p-values < 0.001; = 0.02; = 0.001; < 0.001; = 0.01 for the cerebellum, the frontal cortex, the SVZ, the DG, and the myocardium, respectively; Monte-Carlo test, 1000 permutations) and promoters (with the p-values = 0.004; = 0.002; = 0.002; < 0.001; = 0.032 for the cerebellum, the frontal cortex, the SVZ, the DG, and the myocardium, respectively; Monte-Carlo test, 1000 permutations) (Fig. 3A and 3B). The Alu insertion rate matched the predicted rate in genes for all samples except the DG (Fig. 3C), where the number of Alu was significantly higher than predicted (p = 0.013; Monte-Carlo test, 1000 permutations). Additionally, the DG was the only tissue which demonstrated a lower than predicted Alu insertion rate in promoter regions (p = 0.021; Monte-Carlo test, 1000 permutations) (Fig. 3D).
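The simulation idea can be sketched as follows: place the same number of insertions at random, count how many land in the annotated intervals, repeat many times, and compare the observed count with the simulated range. This is a hedged illustration rather than the authors' procedure: uniform placement on a single linear toy genome, the one-sided add-one p-value, and all names (`count_in_intervals`, `monte_carlo_p_value`, `toy_genes`) are assumptions.

```python
import bisect
import random

def count_in_intervals(positions, starts, ends):
    """Count positions inside sorted, non-overlapping [start, end) intervals."""
    hits = 0
    for pos in positions:
        i = bisect.bisect_right(starts, pos) - 1
        if i >= 0 and pos < ends[i]:
            hits += 1
    return hits

def monte_carlo_p_value(observed_hits, n_insertions, intervals, genome_length,
                        n_simulations=1000, seed=0):
    """One-sided empirical p-value that random placement yields at least
    `observed_hits` insertions inside `intervals`."""
    rng = random.Random(seed)
    intervals = sorted(intervals)
    starts = [s for s, _ in intervals]
    ends = [e for _, e in intervals]
    at_least = 0
    for _ in range(n_simulations):
        simulated = [rng.randrange(genome_length) for _ in range(n_insertions)]
        if count_in_intervals(simulated, starts, ends) >= observed_hits:
            at_least += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (at_least + 1) / (n_simulations + 1)

# Toy genome (hypothetical): 1,000,000 bp with "genes" covering 40% of it.
toy_genes = [(i * 100_000, i * 100_000 + 40_000) for i in range(10)]
p_enriched = monte_carlo_p_value(260, 500, toy_genes, 1_000_000, n_simulations=200)
p_expected = monte_carlo_p_value(200, 500, toy_genes, 1_000_000, n_simulations=200)
print(p_enriched, p_expected)
```

With 500 random insertions and 40% coverage, about 200 hits are expected, so an observed count of 260 yields a small p-value while an observed count of 200 does not. A test for depletion, as reported for Alu in DG promoters, would count simulations with at most the observed number of hits instead.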
Fig 3

Number of the retroelement insertions detected within genes and promoters (for each library, predictions are derived from 1000 simulations of coordinates sample sets).

Error bars show 1 SD. (A)—L1 in genes; (B)—L1 in promoters; (C)—Alu in genes; (D)—Alu in promoters.

Interestingly, across all samples analyzed, the distribution of somatic L1 or Alu insertions was very similar: the percentages of somatic insertions that integrated into genes or into the 5 kb regions upstream of genes were 50–51% and 5.5–6.5% for L1 and 47–49% and 3.5–5% for Alu, respectively (Table 1). The same held true for the combined set of somatic L1+Alu (Table 1). The lack of variability was confirmed by an overdispersion test (in all cases p > 0.98). Finally, we analyzed the orientation of somatic retroelement insertions relative to genes (for those retroelements that integrated into introns or the 5 kb regions upstream of genes). Strikingly, intronic L1 elements preferentially integrated in the opposite orientation relative to the gene (Table 2): 40.96 ± 0.31% (mean ± SD) of L1 were co-oriented and 59.05 ± 0.29% (mean ± SD) of L1 were counter-oriented (analysis of distribution across different brain regions and myocardium: p = 0.9999, overdispersion test; analysis of distribution in a single brain region or myocardium: p < 0.0001 for all samples, binomial test, the null hypothesis was no preference in orientation). We observed no preference in the orientation of L1 integrations in the promoter regions (binomial test). The integration of Alu in both promoter and intronic regions also exhibited no preference in orientation (p > 0.05), with the exception of Alu in the dentate gyrus (p = 0.032, binomial test).
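The orientation analysis tests the null hypothesis of no preference, that is, a co-oriented fraction of 0.5. A minimal sketch of an exact two-sided binomial test in pure Python is given below, applied to the intronic L1 counts for the cerebellum and the dentate gyrus from Table 2. The function name is hypothetical, and the two-sided convention (twice the smaller tail) is an assumption that is valid for the symmetric p = 0.5 case; the authors' software may define it differently.

```python
from math import comb

def binomial_two_sided_p(k, n):
    """Exact two-sided binomial test of H0: p = 0.5 (no orientation preference).

    For p = 0.5 the distribution is symmetric, so the two-sided p-value is
    twice the smaller tail, capped at 1.
    """
    smaller = min(k, n - k)
    tail = sum(comb(n, i) for i in range(smaller + 1))  # exact integer sum
    return min(1.0, 2 * tail / 2 ** n)

# Intronic L1 counts from Table 2: (co-oriented, total detected in genes).
p_cerebellum = binomial_two_sided_p(345, 842)
p_dg = binomial_two_sided_p(644, 1558)
print(p_cerebellum, p_dg)
```

Because the tail is summed with exact integers before the final division, the calculation does not underflow even for the larger DG sample.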
Table 2

The orientation of somatic L1 and Alu insertions relative to nearby genes.

Parameter | cerebellum | frontal cortex | SVZ | DG | myocardium
L1
Number of somatic L1 detected in genes (p-value*) | 842 (<0.001) | 236 (0.02) | 584 (0.001) | 1558 (<0.001) | 578 (0.01)
Number (%) of CO-oriented somatic L1 | 345 (40.97) | 97 (41.10) | 239 (40.92) | 644 (41.34) | 234 (40.48)
Number (%) of COUNTER-oriented somatic L1 | 497 (59.03) | 139 (58.90) | 345 (59.08) | 914 (58.66) | 344 (59.52)
Number of somatic L1 detected in 5 kb upstream of genes (p-value*) | 92 (0.004) | 31 (0.002) | 74 (0.002) | 177 (<0.001) | 62 (0.032)
Number (%) of CO-oriented somatic L1 | 38 (41.30) | 20 (64.52) | 31 (41.89) | 88 (49.72) | 34 (54.84)
Number (%) of COUNTER-oriented somatic L1 | 54 (58.70) | 11 (35.48) | 43 (58.11) | 89 (50.28) | 28 (45.16)
Alu
Number of somatic Alu detected in genes (p-value*) | 623 (NS) | 1028 (NS) | 609 (NS) | 1465 (0.013) | 589 (NS)
Number (%) of CO-oriented somatic Alu | 323 (51.85) | 497 (48.35) | 304 (49.92) | 691 (47.17) | 279 (47.37)
Number (%) of COUNTER-oriented somatic Alu | 301 (48.15) | 531 (51.65) | 305 (50.08) | 774 (52.83) | 310 (52.63)
Number of somatic Alu detected in 5 kb upstream of genes (p-value*) | 67 (NS) | 105 (NS) | 61 (NS) | 105 (0.021) | 55 (NS)
Number (%) of CO-oriented somatic Alu | 29 (43.28) | 43 (40.95) | 32 (52.46) | 45 (42.86) | 25 (45.45)
Number (%) of COUNTER-oriented somatic Alu | 38 (56.72) | 62 (59.05) | 29 (47.54) | 60 (57.14) | 30 (54.55)

*—p-value based on Monte-Carlo test, 1000 permutations (see Materials and Methods for details), NS—non-significant (p>0.05)

Validation by PCR and Sanger sequencing

Nested PCR and Sanger sequencing of the PCR products were carried out to validate the potentially somatic insertions. We selected 34 L1 and 26 Alu integration events for validation (S1 Table). We confirmed the presence of 15 of the 34 selected L1 insertions and 17 of the 26 selected Alu insertions in the DNA samples produced by the first step of suppression PCR and validated them as somatic (see Fig. 4 for the validation scheme). Sanger sequencing revealed single nucleotide substitutions in the sequences of several newly found REs. Comparing these RE sequences with the reference genome allowed us to identify a limited number of ancestral elements or even the exact master copy (S1 Table). Although all of the detected somatic insertions were expected to be represented by one or very few molecules in the initial sample, we attempted to amplify their full-length copies from the genomic DNA. We succeeded in retrieving one Alu insertion previously validated in the first step of suppression PCR. However, the proper PCR product for this insertion accumulated in only one of the 12 nested reactions with 40 ng input DNA, indicating that the concentration of the template molecules in the DNA sample was indeed very low.
Fig 4

Validation of the potentially somatic retroelement insertions.

Black arrows show the primers. GP primers are complementary to the flanking sequences, RE primers are complementary to the retroelement sequence (RE).

Discussion

In this study we showed for the first time, by directly sequencing and mapping somatic insertions, an increased number of somatic L1 retrotranspositions in the dentate gyrus of the human brain in comparison with other brain regions and the myocardium. Several recent studies proposed that activation of L1 retroelements coincides with neuronal differentiation [22, 23, 28]. Our data demonstrate that the dentate gyrus is a “hotspot” of retrotranspositional activity in the adult human brain. The increase in the number of retrotranspositions can be associated with the presence of proliferating precursor cells. Each new somatic retrotransposition could alter gene expression and hence underlie neuronal individuality. Retroelement activity in proliferating precursor cells can potentially produce subpopulations of dentate gyrus granule cells that possess unique properties distinguishing them from neighboring neurons. Considering that the adult human dentate gyrus was shown to generate around 700 neurons per day [31], the portion of neurons carrying somatic L1 retrotranspositions could be high enough to affect the local neuronal circuits. Notably, we did not find any increase in retrotranspositions in another putative neurogenic zone of the adult mammalian brain, the SVZ. However, several recent studies indicated that in humans, unlike rodents, neurogenesis in the adult SVZ is negligible [33, 35]. Robust neurogenesis in the human SVZ was shown to persist for up to 18 months after birth [33]. Thus, the observed differences in the L1 retrotransposition rate between the SVZ and the DG might be associated with a significantly larger precursor cell pool in the DG in comparison with the SVZ. Interestingly, we showed that the number of somatic L1 retrotranspositions in promoters and genes is higher than expected for all brain samples and the myocardium.
This observation differs from the data obtained by Ewing and Kazazian [38], in which the authors demonstrated that recent insertions are less abundant in intronic regions. This can be explained by different modes of selection acting upon somatic and germ-line insertions. Unlike somatic insertions, germ-line insertions are present in every cell and can potentially affect the functioning of the whole organism. Thus, they are subject to selection at the population level, while brain somatic insertions are subject to completely different selective pressures. However, the uneven distribution of the discovered insertions in the genome can also result from mapping bias: sequencing reads map better to the unique genomic regions, which comprise the actively transcribed chromatin, than to the highly repetitive non-transcribed sequences. The orientation of genic L1 insertions was biased towards the counter-oriented state in all brain regions and the myocardium. This can be caused by a stronger negative effect of co-oriented insertions on gene expression (e.g. knock-down of gene expression) and is similar to the results obtained by Ewing and Kazazian [38]. However, Ewing and Kazazian demonstrated the orientation bias for insertions that had become established in the population, while our results indicate a possibility of negative selection against co-oriented L1 insertions at the level of individual cells.
Similar to the somatic L1 insertions, the Alu insertions showed the greatest expansion in the dentate gyrus. However, the rates of Alu propagation in the rest of the samples were not as uniform as those of L1. The percentage of somatic Alu elements was variable and significantly different between the regions analyzed.
This can probably be explained by the differences between the mechanisms of Alu and L1 suppression [13, 39] or by the difference in the tissue specificity of the mechanisms regulating RE expression [28, 40, 41]. Presumably, the difference in the number of detected somatic L1 and Alu insertions can result from the cis-preference of the L1 reverse transcriptase for its own RNA [42]. Interestingly, the number of somatic Alu integration events in genes and promoters was similar to the expected values for all regions analyzed except the DG. Furthermore, in the DG, Alu elements integrated preferentially into genes but avoided promoter regions, which is the opposite of L1. Moreover, the DG was the only brain region in which an orientation bias of the somatic Alu insertions was observed. Altogether, these facts suggest that cells of the DG carrying new Alu insertions are subject to selection. The data on both L1 and Alu somatic insertions obtained in the current work indicate an increased retrotranspositional activity in the DG.
Since our results are based on the analysis of a single donor, we cannot exclude the possibility that the observed pattern of somatic insertion distribution in brain tissues is unique and can vary in other members of the population. However, our results match the data obtained by qPCR [23], indicating that our conclusions may be generally applicable. We did not observe any insertions that were represented by many sequencing reads and could at the same time be considered somatic. This indicates that retrotransposition events occur mainly in adult tissues or at the stage of late development, but not during embryogenesis. This corresponds to the results of Baillie and colleagues [24], who also identified no somatic insertions represented by many reads.
The vanishingly small concentrations of the somatic insertions left little chance of retrieving their sequences from the initial samples and prompted us to validate them using the product of the first step of suppression PCR as a template. However, we managed to validate one Alu insertion in the genomic DNA, which is an exceptional case of direct confirmation of an endogenous retroelement somatic integration. Our results are consistent with the studies that were performed in mice. Although L1 can retrotranspose in many regions of the mouse brain [22], an injection of a lentivirus expressing L1 ORF2 fused with EGFP into the DG resulted in L1 expression restricted to neuronal progenitor and newborn granule cells [28]. Moreover, in the hippocampi of the L1-EGFP transgenic mice, somatic L1 retrotranspositions were mainly found in cells that reside in the neurogenic niche, i.e. the SGZ of the DG [22]. Thus, L1 retrotransposition may be stimulated in proliferating progenitor cells of the mouse DG, leading, as in humans, to an increase in the number of retrotranspositions in the DG. Since mice and humans are evolutionarily quite distant mammalian species, it is tempting to speculate that activation of L1 retrotransposition in the postnatal dentate gyrus might be a trait that is preserved in different mammals. Furthermore, since L1 retrotranspositions can be stimulated simply by running [43], activation of retrotransposition could be used by the DG as a fast response to some external (environmental) stimuli, resulting in the generation of neuronal pools different from the existing neurons. Taking into consideration the importance of the dentate gyrus in the formation of hippocampal-dependent memory and learning, we can speculate that the activation of L1 retrotransposition could have a significant effect on animal behavior.
This could be addressed in the future by analyzing L1 retrotransposition in the DG of different mammalian species, and by stimulating L1 retrotransposition coupled with the analysis of animal behavior, e.g. pattern separation that was shown to be modulated by adult hippocampal neurogenesis [44, 45].

Materials and Methods

Contamination precautions

A three-room standard was applied for sample preparation: all procedures with genomic DNA were performed in room 1; PCR preparation was performed in room 2; electrophoresis and other post-PCR activities were performed in room 3. Primers for validating potential somatic insertions lay outside the sequences that made up the resulting Illumina libraries, in order to avoid amplifying contaminating molecules from these libraries.

Tissue sources and sample preparation

Tissue samples were taken within 12 hours after death from a 72-year-old male individual. Tissue samples were frozen in liquid nitrogen immediately after sectioning and then stored at -80°C. The study was approved by the local ethics committee of the Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences and conducted in accordance with the Declaration of Helsinki. As no consent could be obtained from the donor, written consent was provided by the next of kin. Genomic DNA was extracted from five frozen tissue samples of a single individual (cerebellum, frontal cortex, subventricular zone, dentate gyrus and myocardium) using the standard phenol-chloroform extraction method.

Library construction and sequencing

The library construction protocol included the step of genomic DNA digestion by restriction enzymes (AluI and HaeIII for L1, AluI and RsaI for Alu), adapter ligation and two steps of suppression PCR selectively amplifying the insertion sites of the most active RE subfamilies (L1Hs and AluYa5). Sequences of the oligonucleotides used for library preparation are shown in Table 3. The libraries were sequenced on Illumina HiSeq 2000 and Illumina Genome Analyzer IIx platforms using 101 bp reads.
Table 3

The oligonucleotides used for the preparation of the DNA libraries.

Oligonucleotide | Sequence (5'-3')
RE-specific primers
AY107 | TCACCGTTTTAGCCGGGA
AY24 | AGGCGTGAGCCACCGCGC
AY18 | GAGCCACCGCGCCCGGC
3-L1HS | GAGATATACCTAATGCTAGATGACAC
3-end-L1 | GCACATGTACCCTAAAACTTAGAGTA
Suppression PCR primers and adapters
Na21st19 | TGTAGCGTGAAGACGACAGAAAGGGCGTGGTGCGGAGGGCGGT
st20 | ACCGCCCTCC
Na15Na21 | AGCAGCGAACTCAGTACAACATGTAGCGTGAAGACGACAGAA
Na15 | AGCAGCGAACTCAGTACAACA
st19 | AGGGCGTGGTGCGGAGGGCGGT
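The names of the suppression PCR oligonucleotides suggest that the longer ones are concatenations of the shorter ones. The short Python check below tests this on the sequences listed in Table 3. It is only an illustrative consistency check: the Na21 segment is not listed on its own and is inferred here, and reading st20 as the strand that pairs with the 3' end of st19 is an assumption not stated in the text.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence (uppercase A/C/G/T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

# Sequences copied from Table 3 (5'-3').
OLIGOS = {
    "Na21st19": "TGTAGCGTGAAGACGACAGAAAGGGCGTGGTGCGGAGGGCGGT",
    "st20": "ACCGCCCTCC",
    "Na15Na21": "AGCAGCGAACTCAGTACAACATGTAGCGTGAAGACGACAGAA",
    "Na15": "AGCAGCGAACTCAGTACAACA",
    "st19": "AGGGCGTGGTGCGGAGGGCGGT",
}

# Na21 is not listed separately; infer it as Na15Na21 minus the Na15 prefix.
na21 = OLIGOS["Na15Na21"][len(OLIGOS["Na15"]):]

checks = {
    "Na15Na21 starts with Na15": OLIGOS["Na15Na21"].startswith(OLIGOS["Na15"]),
    "Na21st19 == Na21 + st19": OLIGOS["Na21st19"] == na21 + OLIGOS["st19"],
    "st20 pairs with 3' end of st19": OLIGOS["st19"].endswith(
        reverse_complement(OLIGOS["st20"])
    ),
}
print(checks)
```

A nested primer layout of this kind (outer Na15, inner Na21, innermost st19) is what one would expect for two successive rounds of suppression PCR, although the exact primer pairing in each round is described only in the supplementary methods.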

Sequence mapping and analysis

Data analysis included the use of standard tools: Bowtie2 [46, 47] and Galaxy [48-50], Perl and Python scripts. The raw data processing protocol included mapping the reads to the reference genome (UCSC hg19), identifying the coordinates of RE insertions, retrieving the coordinates of potentially somatic insertions, and several steps of filtering out false-positive results.

Analysis of the distribution of potential somatic insertions in genome

The coordinates of the potential somatic insertions were intersected with the coordinates of the known genes and gene promoter regions (5,000 bp upstream of the gene transcription start sites) present in hg19 using the Galaxy tool “Join”. Statistical analysis was done using R software [51], including the qcc package [52].
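The intersection step was performed with Galaxy, but its logic can be illustrated in a few lines of Python. The sketch below is not the Galaxy “Join” tool: the function `classify_insertion`, the dictionary layout, the closed-interval coordinate convention, and the example genes are hypothetical. It classifies one insertion relative to one gene as genic or within the 5,000 bp upstream window, taking the gene strand into account, and reports whether the insertion is co- or counter-oriented.

```python
PROMOTER_BP = 5000  # upstream window used in the text

def classify_insertion(pos, strand, gene):
    """Classify one insertion relative to one gene on the same chromosome.

    gene: dict with 'start', 'end' (start <= end) and 'strand' ('+' or '-').
    Returns (region, orientation); region is 'gene', 'promoter' or None.
    """
    if gene["strand"] == "+":
        tss = gene["start"]  # transcription start site on the plus strand
        in_promoter = tss - PROMOTER_BP <= pos < tss
    else:
        tss = gene["end"]    # transcription start site on the minus strand
        in_promoter = tss < pos <= tss + PROMOTER_BP
    if gene["start"] <= pos <= gene["end"]:
        region = "gene"
    elif in_promoter:
        region = "promoter"
    else:
        return None, None
    orientation = "co" if strand == gene["strand"] else "counter"
    return region, orientation

# Hypothetical genes and insertions for illustration.
plus_gene = {"start": 10_000, "end": 20_000, "strand": "+"}
minus_gene = {"start": 50_000, "end": 60_000, "strand": "-"}
results = [
    classify_insertion(15_000, "-", plus_gene),   # inside gene, opposite strand
    classify_insertion(7_000, "+", plus_gene),    # within 5 kb upstream of TSS
    classify_insertion(62_000, "+", minus_gene),  # upstream of a minus-strand gene
    classify_insertion(30_000, "+", plus_gene),   # neither
]
print(results)
```

For minus-strand genes the upstream window lies at higher coordinates than the gene, which is why the transcription start site is taken from the gene end in that branch.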

Statistical data analysis

The analysis of Alu and L1 distributions in different brain areas and of the genomic distribution of the somatic L1 and Alu insertions was performed using an overdispersion test. The distribution of Alu and L1 in the dentate gyrus was compared with the distribution in all other samples combined using the Poisson test. Binomial tests were applied for the analysis of somatic L1 and Alu orientation relative to nearby genes. Monte Carlo simulations of random retroelement distributions throughout the genome were performed to analyze the randomness of the Alu and L1 distributions in promoters and genes.

Validation of the somatic insertions

Nested PCR was performed for the validation of the selected somatic retroelement insertions. PCR products were Sanger sequenced. Primer structures and sequences are given in S1 Table.

PCR amplification of somatic insertions from genomic DNA

Nested PCR was performed to amplify the RE insertion flanking sequences from gDNA. Primer structures are given in S1 Table. For more details see

Accession Numbers

The European Nucleotide Archive accession number for the Illumina sequences reported in this paper is PRJEB4787.

The L1 and Alu insertions chosen for confirmation by PCR and Sanger sequencing, primers used for validation and Sanger sequences.

(XLSX)

Power analysis for statistical tests.

(PDF)

Supplementary materials and methods.

(PDF)
  50 in total

1.  Most recent AluY insertions in human gene introns reduce the content of the primary transcripts in a cell type specific manner.

Authors:  Yuri B Lebedev; Anna L Amosova; Ilgar Z Mamedov; Gleb Y Fisunov; Eugene D Sverdlov
Journal:  Gene       Date:  2006-10-17       Impact factor: 3.688

2.  Galaxy: a platform for interactive large-scale genome analysis.

Authors:  Belinda Giardine; Cathy Riemer; Ross C Hardison; Richard Burhans; Laura Elnitski; Prachi Shah; Yi Zhang; Daniel Blankenberg; Istvan Albert; James Taylor; Webb Miller; W James Kent; Anton Nekrutenko
Journal:  Genome Res       Date:  2005-09-16       Impact factor: 9.043

3.  Somatic mosaicism in neuronal precursor cells mediated by L1 retrotransposition.

Authors:  Alysson R Muotri; Vi T Chu; Maria C N Marchetto; Wei Deng; John V Moran; Fred H Gage
Journal:  Nature       Date:  2005-06-16       Impact factor: 49.962

4.  Alu elements within human mRNAs are probable microRNA targets.

Authors:  Neil R Smalheiser; Vetle I Torvik
Journal:  Trends Genet       Date:  2006-08-17       Impact factor: 11.639

5.  LINE-1 retrotransposition in human embryonic stem cells.

Authors:  Jose L Garcia-Perez; Maria C N Marchetto; Alysson R Muotri; Nicole G Coufal; Fred H Gage; K Sue O'Shea; John V Moran
Journal:  Hum Mol Genet       Date:  2007-04-27       Impact factor: 6.150

6.  L1 retrotransposition is suppressed by endogenously encoded small interfering RNAs in human cultured cells.

Authors:  Nuo Yang; Haig H Kazazian
Journal:  Nat Struct Mol Biol       Date:  2006-08-27       Impact factor: 15.369

7.  Human neuroblasts migrate to the olfactory bulb via a lateral ventricular extension.

Authors:  Maurice A Curtis; Monica Kam; Ulf Nannmark; Michelle F Anderson; Mathilda Zetterstrom Axell; Carsten Wikkelso; Stig Holtås; Willeke M C van Roon-Mom; Thomas Björk-Eriksson; Claes Nordborg; Jonas Frisén; Michael Dragunow; Richard L M Faull; Peter S Eriksson
Journal:  Science       Date:  2007-02-15       Impact factor: 47.728

8.  MIWI2 is essential for spermatogenesis and repression of transposons in the mouse male germline.

Authors:  Michelle A Carmell; Angélique Girard; Henk J G van de Kant; Deborah Bourc'his; Timothy H Bestor; Dirk G de Rooij; Gregory J Hannon
Journal:  Dev Cell       Date:  2007-03-29       Impact factor: 12.270

9.  Developmentally regulated piRNA clusters implicate MILI in transposon control.

Authors:  Alexei A Aravin; Ravi Sachidanandam; Angelique Girard; Katalin Fejes-Toth; Gregory J Hannon
Journal:  Science       Date:  2007-04-19       Impact factor: 47.728

10.  Alu elements contain many binding sites for transcription factors and may play a role in regulation of developmental processes.

Authors:  Paz Polak; Eytan Domany
Journal:  BMC Genomics       Date:  2006-06-01       Impact factor: 3.969

