Literature DB >> 24195013

Identification and characterization of MGEs and their insertion sites in the gorilla genome.

Kamal Rawal¹, Sangey Dorji, Amit Kumar, Anwesha Ganguly, Ankit Singh Grewal.

Abstract

Recently published gorilla genome has offered an opportunity to study human evolution through variety of approaches. Mobile genetic elements (MGEs) insert non randomly in genome through mechanisms such as retrotransposition and may cause gene inactivation, transduction, regulation of gene expression and genome expansion. Here we report that majority of gorilla genome is occupied with MGEs (> 36%) with presence of LTRs and Non-LTRs such as Alus and L1s. Other types of MGEs such as MIRs, retrovirus like elements ERVs and DNA transposons are also found using repeatmasker and ELAN pipeline. The distribution is similar to Humans and Macaca genome. Using DNA Scanner we also scanned preinsertion loci for number of different properties such as DNA denaturation, energy measures, potential for protein interactions and sequence based features. We also predicted preinsertion loci with > 70% accuracy using a machine learning tool called insertion site finder (ISF) based upon support vector machines.

Entities: CellLine Chemical Disease Gene Species

Keywords: Alu; L1; LINEs; SINEs; mobile genetic elements; physiochemical properties; primates; truncation points

Year: 2013 PMID： 24195013 PMCID： PMC3812790 DOI： 10.4161/mge.25675

Source DB: PubMed Journal: Mob Genet Elements ISSN： 2159-2543

Introduction

Gorillas (genus Gorilla) are the largest living primates and closest relatives of humans after chimpanzee. They are primarily herbivores, found in the African forests and important to study human origins. There are two species of gorilla-eastern lowland gorilla and western lowland gorilla. To date four surviving hominids- humans, orangutan, chimpanzee and gorilla have been sequenced. There is great deal of interest in understanding genomic differences among these organisms. Latest discoveries reveal the fact that primates like gorilla and human share common ancestry 5–8 million years ago. In all three species (gorilla, human and chimpanzee), genes relating to sensory perception, hearing and brain development showed accelerated evolution and particularly so in humans and gorillas. The mobile genetic elements were first discovered in maize plant and from that every newly sequenced genome is subjected to the discovery as well as study of these elements. Recent sequencing of gorilla genome has offered an opportunity to study distribution of these elements in single female western lowland gorilla, Kamilah (Gorilla gorilla gorilla). Previous studies in context of mobile genetic elements in gorilla has revealed the presence of Alu, L1, SVA, LTRs and HERV insertions in gorilla genome but the study is limited to analysis of few segments of genome. Earlier studies have also reported that retrotransposons are the most abundant MGE in mammalian genomes which affects wide functional activities such as genome evolution, gene disruption and regulation.- To date, no studies, related to the detailed analysis of presence of the mobile genetic elements in the gorilla genome, have been conducted. Here we present the detailed analysis of the occurrences of the various mobile genetic elements in the recently sequenced gorilla genome. The study reveals the presence of various categories of transposons and the retrotransposons within the genome and many interesting results have been obtained allowing the new insights and dimensions to the possibilities of study, within this genome. Mobile genetic elements (MGEs) are fragments of DNA that can move around within the genome through retrotransposition., The genomic hotspots are identified by DNA structure and, endonuclease (EN) nicking to that DNA sequence. These insertion hot spots are characterized by presence of sequence motifs and unique patterns. The present work involves the identification and study of the distribution of several MGE particularly Alu and L1 retrotransposons in the genome of Gorilla gorilla using repeatmasker and ELEFINDER., Previously, we have used ELEFINDER to perform a genome wide analysis of the MGEs in human and macaca genome. We also scanned the DNA for number of different properties such as potential for protein interactions, physicochemical properties and sequence based features using DNA SCANNER. We then used the results for computational testing of the pre-insertion loci in order to detect potential insertion sites using ISF.

Results

Elements discovery

The ~3919 Mbp (2917687013 bps) gorilla genome sequence was screened for transposable elements with repeatmasker software revealing 3025664 elements in the genome. These elements accounted for 36.96% of the gorilla genome. The TEs that were identified included the major TE classes: long terminal repeat (LTR) retrotransposons, non-LTR retrotransposons and DNA transposons. The Table 1 shows the chromosome wise distribution of different kinds of MGEs present in the gorilla genome. NonLTR transposons were the most abundant TEs in the gorilla genome, and included diverse super-families such as SINEs and LINEs. While SINEs consisted of two super-families namely, Alus and MIRs, three different kinds of LINE elements discovered in the genome were, such as LINE1, LINE2 and LINE3/CR1. Among the LINEs, the major part is covered by the L1s and that in the case of SINEs, the Alus were found to be in numbers.

Table 1. Summary of transposable elements in gorilla genome

Chromosome no.	Alus	MIRs	LINE1	LINE2	L3/CR1	ERVL	ERVLMaLRs	ERVclassI	ERVclassII	hAT-Charle	TcMar-Tigger	Unclassified
1	91871	42291	41364	26891	3063	6697	15574	7983	634	12946	6139	520
2a	38096	14943	22798	9258	1254	3280	8357	3955	226	7085	3040	186
2b	39589	15791	26554	10686	1365	4257	10258	4611	275	8033	3654	232
3	62483	29994	41408	18504	2248	6694	15493	7207	514	14484	5416	373
4	51541	22609	38952	16426	1736	7357	16305	7961	599	9508	5568	293
5	70191	27423	31568	15842	1781	4479	11247	5062	356	10542	4175	338
6	54639	20082	34897	14374	1860	5504	12690	6361	452	10123	4806	323
7	61046	17277	31824	11823	1594	4680	11114	6226	414	9072	4198	589
8	45944	20456	29423	13493	1541	4846	11889	5737	435	7747	3919	242
9	42510	20001	24463	11053	1359	3539	8655	4025	303	7403	2904	225
10	51595	19025	27478	10958	1365	3819	10141	4913	340	8233	3591	258
11	43907	25484	25990	14785	1755	3915	9339	4390	334	7388	3282	265
12	53875	21217	25459	13997	1518	4089	10637	4804	331	8414	3459	266
13	27254	9549	20480	7705	926	3359	8027	4338	225	4959	2863	147
14	33196	12630	17438	7980	953	2928	7151	3302	231	5199	2318	185
15	33979	12510	16597	7128	980	2237	5182	2505	172	5494	2201	137
16	44337	15098	13939	7880	672	2332	6541	2677	197	6227	1531	140
17	31725	9704	18919	6994	870	3340	8203	3540	250	5521	2563	183
18	23161	8703	15599	5954	942	2465	5825	2909	142	4306	2097	129
19	47025	6755	8552	4802	153	1540	2963	2744	466	2526	881	168
20	25569	11830	11180	6897	511	2253	5114	1802	82	5287	1369	154
21	11510	2974	6822	2113	228	1396	3789	1586	72	1875	859	36
22	20426	7827	5319	4023	343	837	2214	1044	101	1797	687	76
X	42894	17909	45679	12042	1651	5092	12193	6750	459	9041	3650	233

A pictorial representation can be viewed in the Figure 1, which clearly shows the differential distribution of the various MGEs in the genome.

Figure 1. The chromosome wise representation of the distribution of MGEs in the gorilla genome.

Figure 1. The chromosome wise representation of the distribution of MGEs in the gorilla genome. LTR retrotransposons were the second most abundant TEs in the gorilla genome, and the majority of these belonged to a class of mammalian repeats derived from retrovirus like elements and it was categorized into 4 subgroups namely, ERVL, ERVL-MaLRs, ERV_class I and ERV_class II. DNA transposons are rare in the gorilla genome, and are represented by only two super-families (hAT-Charlie and TcMar-Tigger). Some of the repeat elements, that could not be classified in the above mentioned families, were also identified. However, they were quite less in number. Some other repeat elements like, satellites, small RNAs, simple repeats and low complexity regions were also identified (see Table 1).

Analysis of the LINEs and SINEs

In accordance to the length of each chromosome, the analysis of the total LINEs, SINEs and LTR elements was also done. Tables 2, 3 and 4 summarize the complete data set. The total number of SINE elements was found to be 1,464,694, LINE elements were 8, 80,335 in number and the total count of LTRs was 4,33,996 in the entire genome. The average length of the inserted element was also calculated in each case along with the length occupied within the corresponding chromosome. The average length for the SINE elements was found to be within the approximate range of 208 to 220 bp and that of LINE element was approximately 408 to 658 bp.

Table 2. Chromosome wise summary of the SINE elements in gorilla genome

Chromosome no.	Total length (bp)	SINEs
		Length occupied (bp)	No. of elements	Average length of element
1	229507203	28090071	134557	208.75
2a	1113551968	11251200	53188	211.53
2b	131632457	11899664	55606	213.99
3	199944510	19414804	92794	209.22
4	201139530	15782085	74411	212.09
5	165930986	20810185	97878	212.61
6	171703152	16075800	74977	214.40
7	158137892	16975146	78557	216.08
8	145327772	14118592	66630	211.89
9	121947112	13107115	62655	209.19
10	147764049	15126370	70834	213.54
11	133470886	14310615	69572	205.69
12	133360231	15996265	75292	212.45
13	97499607	7992543	36939	216.37
14	88974843	9721051	45987	211.38
15	82026568	9885170	46623	212.02
16	80971650	12725256	59528	213.76
17	94257108	9016411	41540	217.05
18	78787515	6820885	31990	213.21
19	56181278	11866327	53843	220.38
20	62603092	7807243	37473	208.34
21	35451371	3201567	14543	220.14
22	35671106	5969251	28278	211.09
X	154045127	13000221	60999	213.12

Table 3. Chromosome wise summary of the LINE elements in gorilla genome

Chromosome no.	Total length (bp)	Length occupied	No. of elements	Average length of element
1	229507203	35719713	71763	497.74
2a	1113551968	17825260	33525	531.70
2b	131632457	21599024	38868	555.70
3	199944510	33940485	62566	542.47
4	201139530	33070071	57467	575.46
5	165930986	24894380	49469	503.23
6	171703152	28474055	51529	552.58
7	158137892	24387235	45617	534.60
8	145327772	23886020	44708	534.26
9	121947112	18867313	37128	508.16
10	147764049	20895555	40065	521.54
11	133470886	21911049	42747	512.57
12	133360231	20811356	41225	504.82
13	97499607	15886349	29288	542.41
14	88974843	13905388	26549	523.76
15	82026568	12722443	24879	511.37
16	80971650	9370538	22615	414.35
17	94257108	14593897	26941	541.69
18	78787515	12068676	22639	533.09
19	56181278	5520135	13525	408.14
20	62603092	8267354	18683	442.50
21	35451371	4834686	9215	524.65
22	35671106	4145031	9728	426.09
X	154045127	39142498	59596	656.79

Table 4. Chromosome wise summary of the LTR elements in gorilla genome

Chromosome no.	Total length (bp)	LTR elements
		Length occupied	No. of elements	Average length of element
1	229507203	16331504	31734	514.63
2a	1113551968	7938316	16285	487.46
2b	131632457	9873134	19928	495.44
3	199944510	15861956	30683	516.96
4	201139530	17614113	32882	535.67
5	165930986	10618919	21725	488.78
6	171703152	13456407	25647	524.67
7	158137892	11400774	22995	495.79
8	145327772	11703956	23430	499.52
9	121947112	8137837	16944	480.27
10	147764049	9403542	19649	478.57
11	133470886	9697311	18429	526.19
12	133360231	10222587	20364	501.99
13	97499607	8318528	16279	510.99
14	88974843	7087817	13954	507.94
15	82026568	4878339	10401	469.02
16	80971650	4955225	11971	413.93
17	94257108	7798515	15689	497.06
18	78787515	5736835	11595	494.76
19	56181278	3859077	7768	496.79
20	62603092	3862835	9450	408.76
21	35451371	3322579	6960	477.38
22	35671106	1814702	4256	426.38
X	154045127	14856258	24978	594.77

Genome wide coverage

Table 5 shows the total percentage of the genome sequence as covered by various MGEs. It was observed that although the total count of Alu within the genome is highest, yet, the maximum percentage of the genome length was covered by the LINE 1 element. The total percentage of the genome as covered by Alu was 8.5%, but L1 covers 13.3% of the total gorilla genome. This shows that within the total area of the genome, as covered by the MGEs, the L1s form the major part. Among the non LTR elements, the L3/CR1 element was present in least numbers, and the MIRs and LINE2 covered approximately the same amount of region on the genome (~2%). The LTR elements covered ~7.4% of the total genome, out of which ERVL-MaLRs were the major constituents. We compared MGE content between chromososme no.1 and X chromosome of gorilla genome (see Fig. 2). We found that there are more number of L1 on X chromosome despite being smaller than chromosome 1. On the other hand Alu appear to be distributed more randomly i.e., number of Alus on chromosome are roughly proportionate with size of the chromosomes.

Table 5. Summary of MGEs in gorilla genome

TE superfamily	Counts (copy no.)	Length (bp)	% of sequence covered
Non-LTR
Alus	1048363	248964770	8.53
MIRs	412082	61542196	2.11
LINE1	582702	387476383	13.28
LINE2	261608	71320928	2.44
L3/CR1	42953	6666746	0.22
LTR elements
ERVL	90935	44219203	1.51
ERVL-MaLRs	218901	93556837	3.20
ERV_classI	106432	71144203	2.43
ERV_classII	7610	7508372	0.26
DNA elements
hAT-Charlie	173210	34481719	1.18
TcMar-Tigger	75170	30557703	1.04
Unclassified	5698	2800558	0.09
Total	3025664	1060239618

Figure 2. The MGE content comparision between chromososme 1 and X of gorilla genome.

DNA SCANNER and ISF

We present results generated by DNA SCANNER on Alu elements insertion sites (chromosome 22) as a representative case (see Table 6 and Fig. 3). In ISF module, we trained and tested using insertion site sequences of chromosome 21 and 22. The accuracy of system (ability to identify positive example and reject negative examples) was found to be 77% for chromosome 21 and 76% for chromosome 22 (Table 7).

Table 6. DNA SCANNER output of gorilla chromosome 22 showing position and parameter values of A-rule

Position	Parameter value
0	0.275602587
1	0.275720165
2	0.277954145
3	0.279188713
4	0.278306878
5	0.279835391
6	0.282304527
7	0.282716049
8	0.28265726
9	0.282892416
10	0.282951205
11	0.28265726
12	0.283656673
13	0.283891828
14	0.28212816
15	0.282480894
16	0.28377425
17	0.284538507
18	0.284009406
19	0.283127572
20	0.282951205
21	0.284068195
22	0.287360376
23	0.289065256
24	0.288359788
25	0.289535567
26	0.291534392
27	0.292004703
28	0.291828336
29	0.292651382
30	0.29159318
31	0.289300412
32	0.288594944
33	0.28712522
34	0.285008818
35	0.283715461
36	0.281951793
37	0.281128748
38	0.281422693
39	0.281599059
40	0.281834215
41	0.282774838
42	0.28547913
43	0.288536155
44	0.290299824
45	0.292357437
46	0.294532628
47	0.293592005
48	0.292239859
49	0.293004115
50	0.295238095
51	0.295884774
52	0.293180482
53	0.291651969
54	0.293415638
55	0.296061141
56	0.298353909
57	0.300470312
58	0.301528513
59	0.300058789
60	0.29888301
61	0.300587889
62	0.303821282
63	0.304644327
64	0.302880658
65	0.303527337
66	0.306819518
67	0.312698413
68	0.321105232
69	0.328747795
70	0.335390947
71	0.340270429
72	0.345679012
73	0.353497942
74	0.362081129
75	0.371369782
76	0.378542034
77	0.385067607
78	0.392945326
79	0.398059965
80	0.400352734
81	0.400764256
82	0.396413874
83	0.38489124
84	0.367430923
85	0.351440329
86	0.333803645
87	0.310229277
88	0.287536743
89	0.268077601
90	0.254323021

Figure 3. Various signals upstream of the insertion sites of Alu in chromosome 22. The y axis represents value of the property and the x-axis gives the relative position with respect to the insertion site.

Table 7. Performance of ISF in gorilla chromosome 21 and 22 for Alu element

Chromosome	Linear kernel
21	0.7779
22	0.7663

Methods

Retrieving genome sequences

Gorilla genome (gorGor3.1; GCA_000151905.1) was retrieved from ensembl database: ftp://ftp.ensembl.org/pub/release-66/fasta/gorilla_gorilla/dna/. The total size of genome arranged in several chromosomes is of total of ~2900 million base pairs of nucleotides.

Repeat sequence retrieval

RepBase Update (RU) is a comprehensive database of repetitive element consensus sequences. Most prototypic sequences from RU are consensus sequences of large families and subfamilies of repetitive sequences. We have used sequences provided by RU for identification of transposable elements (TEs) based on their features.

RepeatMasker for screening of DNA sequences

RepeatMasker is a program that screens DNA sequences for interspersed repeats and low complexity DNA sequences based upon RU.

Whole genome distribution analysis by ELEFINDER

ELEFINDER not only identify repeats but also extracts flanking sequence at each MGE site identified as a preinsertion locus. We used it to find the insertion sites of various MGEs in the gorilla genome., This tool finds the nature, distribution, genomic location and the site of truncation for each of the MGE and performs comparative genome analysis. It is a perl based system requiring organism name, chromosome number, element name, genome file and element file as input parameters. The results files are generated after performing BLAST and parsing scripts. The output files comprise the copies of MGE generated by the program categorized into 5′ truncated, 3′ truncated and both side truncated examples (see Figs. S1–S4).

Pairwise alignment

To understand the distribution of variety of LINEs present in REPBASE we aligned them sequentially with each other to understand their relationship at sequence level. To view this large data set we generated pairwise percentage similarity matrix (see Tables. S3–S5). We also used Gene cluster 3.0 and Java TreeView to view large data set in a tree format (see Fig. S5).

DNA SCANNER

DNA SCANNER is a tool which scans DNA using sliding window mechanism for number of different properties such as biophysical parameters, energy measures, potential for protein interactions and sequence based features such as T density, AT density etc. Sequence and physico-chemical based motifs are extracted at insertion site by using this tool. Based on a choice of input parameters, the program evaluates a number of properties in moving windows along the length of the query DNA sequence. Substrings of window size w are generated from the 5′ end of input DNA sequences, and further divided into words (Di/Tri nucleotides). It screens various physicochemical properties as described below: (A) Structural Signals: DNA Bendability DNA Bendability is the ability of a DNA to deform under a specific stimulus such as protein binding. A tri-nucleotide model based on DNase-I cutting frequencies predicts that DNase I binds and cuts DNA that is bent toward major groove,, (B) Thermodynamic Signals: Stacking Energy Stacking energies are indicators of stability, both of a given DNA sequence and as well as protein interactions and thus plays a critical role in formation of local structures, (C) Duplex Stability: Free Energy Signals The relative stability of DNA duplex structure depends upon its base sequence and more specifically upon ten different types of nearest neighbor interactions namely AA/TT;AT/TA; CA/GT; GT/CA; CT/GA; GA/CT; CG/GC; GC/CG; GG/CC. Using this information, the overall stability (as a measure of G) and melting behavior of a sequence can be predicted., (D) Propeller Twist Signals, Bending Stiffness and Nucleosomal Positioning DNA must distort in order to bend around a protein: this distortion is facilitated by the deformational capacity of dinucleotide.,, This can be characterized by properties such as propeller twist. (E) Protein Interaction Signals The DNA sequence carries signals specific for its potential to deform when interacting with other molecules such as proteins and also during important biochemical processes such as transcription, replication and retro-transposition.,

Insertion Site Finder (ISF)

ISF is a machine learning tool which relies on support vector machines for learning and classification tasks. Present ISF is generic version of SVMs, wherein insertion sites from any of the genome can be used. Insertion site finder identifies and predicts insertion sites of the mobile genetic elements. Earlier work involved the study of E. histolytica by identification of signals, thereby showing the site for the mobile genetic element to insert at a particular locus. The information provided by DNA SCANNER is used as positive and negative data sets for training. We have applied ISF on Human genome and E. histolytica using bayes rule by using various signals (chemical, thermodynamic and biophysical properties) to produce the score of an insertion site. It gives the probability of a particular property Sj to get inserted at a particular location or insertion site Pi. It also computes sensitivity and specificity for the same. See Figure 4 for more explanation.

Figure 4. Accuracy (AC) is the proportion of the total number of predictions that were correct: AC = (a + d) / (a + b + c + d). Recall is the proportion of positive cases that were correctly identified: R = d / (c + d). Precision is the proportion of the predicted positive cases that were correct: p = d / (b + d). Sensitivity is the ability of the system to identify actual positives: Sn = TP/TP + FN. Specificity is the ability of the system to reject negative examples: Sp = TN/FP+TN. We generated the positive data set labeled as Class P for insertion sites of full length copies of the given elements namely Alu and L1. We also created negative data set by shuffling these insertion sequences labeled as Na. The independent graphs were generated using DNA Scanner for the two data sets. The observed extrema for the given rule was compared in both data sets. The rules were selected in case they have shown significant value when compared with negative data set as well with background values. For instance, A rule peak was selected for chromosome 22 only when it exceeded the cut off range of 2 Std Dev from the background of A rule values. In addition, there has to be statistical difference of A rule values between positive and negative data set (p < 0.05).

Discussion

The gorilla genome is one of the most recently sequenced non-human primates. The successful complete mapping of the gorilla genome has given new and fresh insights to human, chimpanzee and gorilla evolution. We used ELAN pipeline module to identify various types of mobile genetic elements in the gorilla genome and analyze physical and chemical properties as well as predict insertion sites. The results were compared with other primate genomes such as humans and macaca. It appears that gorilla MGE distribution is similar to human, macaca and mouse suggesting common mechanisms shaping spread of MGE. To compare the distribution of Alus and L1 element in gorilla and human, we extracted 1 Mb of DNA sequence of both genomes starting from chromosome 1 (Position 1–1000000) from ensembl database. We divided these sequences into 10 Kb non overlapping segments so as to observe the distribution of Alus and L1. The genomic sequences were also aligned using pairwise blast option so as to see the effect of sequence divergence on density of Alus and L1s in the given segment (Fig. 5). The GATA tool (http://gata.sourceforge.net/PlotterHelp.html) was used to display the distribution of Alus and L1 in both genomes. It appears that distribution of Alus and L1s is similar even in the areas which are dissimilar at genomic level. We are trying to extend this work at whole genome level and develop a statistical model to understand this approach at multi species level.

Figure 5. The boxes are plotted against horizontal representations of the input sequences with the reference sequence on top (human). The size of each box is determined by the start and stop positions in the sub-alignment. The shading of the boxes and connector line are scaled according to the sub-alignment score where solid black represents the highest score obtained, light gray the lowest. Lastly the color of the connecting line is used to indicate the sub-alignment orientation, black for +/+, red for +/−. Where windows overlap, those with the highest score are displayed on top. The dark pink lines represent Alu elements whereas parrot green represent L1 elements. The white portions in the beginning and in the last section of human (top) represent undetermined sequences (NNNNN etc). This section represents position numbers 1…250000 bp of human and gorilla chromosome 1. The GFF files were generated for Alus and L1s. The additional figures in the document show subsequent sections of chromosome 1. We used repeatmasker to find the different types of Alus and LINEs found in the gorilla genome. This was supplemented by ELEFINDER program to identify the distribution of mobile genetic elements in the gorilla genome (see Tables S1 and S2). DNA SCANNER was used to scan insertion sites of gorilla chromosomes for a number of different properties such as biophysical, energy, potential for protein interactions and sequence based features. Extrema present in profiles were used to predict insertions sites using machine learning systems. Potential insertion sites were detected using ISF from the result obtained by DNA SCANNER. The patterns or signals observed in gorilla genome were very similar to signals observed at insertion sites of Alus and L1 in humans. The most common transposable element was found to be Alu in gorilla and its distribution is similar to previously reported distribution of Alu element in human genome. For all chromosomes, the Alu copy number roughly correlates with chromosome length. Though Alu element superceded all the MGEs numerically but L1s were present in exceptionally high numbers in X chromosome (see Fig. 2) suggesting special role of sex chromosomes for accumulation of MGE. When these results were compared with human genome, we found characteristically similar behavior in the distribution among humans and gorilla. Different LINEs were found in different numbers in the gorilla genome hence we used various tools to identify various classes of LINEs. In future we plan to analyze MGE in context to genes showing accelerated evolution specially related to brain, sensory and speech.

11 in total

1. Whole-genome analysis of Alu repeat elements reveals complex evolutionary history.

Authors: Alkes L Price; Eleazar Eskin; Pavel A Pevzner
Journal: Genome Res Date: 2004-11 Impact factor: 9.043

Review 2. Repbase Update, a database of eukaryotic repetitive elements.

Authors: J Jurka; V V Kapitonov; A Pavlicek; P Klonowski; O Kohany; J Walichiewicz
Journal: Cytogenet Genome Res Date: 2005 Impact factor: 1.636

3. Stacking energies in DNA.

Authors: S G Delcourt; R D Blake
Journal: J Biol Chem Date: 1991-08-15 Impact factor: 5.157

4. DNA sequence-dependent deformability deduced from protein-DNA crystal complexes.

Authors: W K Olson; A A Gorin; X J Lu; L M Hock; V B Zhurkin
Journal: Proc Natl Acad Sci U S A Date: 1998-09-15 Impact factor: 11.205

5. Propeller-twisting of base-pairs and the conformational mobility of dinucleotide steps in DNA.

Authors: M A el Hassan; C R Calladine
Journal: J Mol Biol Date: 1996-05-31 Impact factor: 5.469

6. Detecting motifs and patterns at mobile genetic element insertion site.

Authors: Bhuvan Bhaskar Dev; Aman Malik; Kamal Rawal
Journal: Bioinformation Date: 2012-08-24

7. Genome-wide analysis of mobile genetic element insertion sites.

Authors: Kamal Rawal; Ram Ramaswamy
Journal: Nucleic Acids Res Date: 2011-05-23 Impact factor: 16.971

8. Insights into hominid evolution from the gorilla genome sequence.

Authors: Aylwyn Scally; Julien Y Dutheil; LaDeana W Hillier; Gregory E Jordan; Ian Goodhead; Javier Herrero; Asger Hobolth; Tuuli Lappalainen; Thomas Mailund; Tomas Marques-Bonet; Shane McCarthy; Stephen H Montgomery; Petra C Schwalie; Y Amy Tang; Michelle C Ward; Yali Xue; Bryndis Yngvadottir; Can Alkan; Lars N Andersen; Qasim Ayub; Edward V Ball; Kathryn Beal; Brenda J Bradley; Yuan Chen; Chris M Clee; Stephen Fitzgerald; Tina A Graves; Yong Gu; Paul Heath; Andreas Heger; Emre Karakoc; Anja Kolb-Kokocinski; Gavin K Laird; Gerton Lunter; Stephen Meader; Matthew Mort; James C Mullikin; Kasper Munch; Timothy D O'Connor; Andrew D Phillips; Javier Prado-Martinez; Anthony S Rogers; Saba Sajjadian; Dominic Schmidt; Katy Shaw; Jared T Simpson; Peter D Stenson; Daniel J Turner; Linda Vigilant; Albert J Vilella; Weldon Whitener; Baoli Zhu; David N Cooper; Pieter de Jong; Emmanouil T Dermitzakis; Evan E Eichler; Paul Flicek; Nick Goldman; Nicholas I Mundy; Zemin Ning; Duncan T Odom; Chris P Ponting; Michael A Quail; Oliver A Ryder; Stephen M Searle; Wesley C Warren; Richard K Wilson; Mikkel H Schierup; Jane Rogers; Chris Tyler-Smith; Richard Durbin
Journal: Nature Date: 2012-03-07 Impact factor: 49.962

9. Identification of insertion hot spots for non-LTR retrotransposons: computational and biochemical application to Entamoeba histolytica.

Authors: Prabhat K Mandal; Kamal Rawal; Ram Ramaswamy; Alok Bhattacharya; Sudha Bhattacharya
Journal: Nucleic Acids Res Date: 2006-10-13 Impact factor: 16.971

10. Distribution of MGEs and their insertion sites in the Macaca mulatta genome.

Authors: Kamal Rawal; Avantika Priya; Aman Malik; Radhika Bahl; Ram Ramaswamy
Journal: Mob Genet Elements Date: 2012-05-01