Josip Brajković, Isidoro Feliciello, Branka Bruvo-Mađarić, Durđica Ugarković.
Abstract
In the red flour beetle Tribolium castaneum, the major TCAST satellite DNA accounts for 35% of the genome and encompasses the pericentromeric regions of all chromosomes. Because of the presence of transcriptional regulatory elements and transcriptional activity in these sequences, TCAST satellite DNAs have also been proposed to be modulators of gene expression within euchromatin. Here, we analyze the distribution of TCAST homologous repeats in T. castaneum euchromatin and study their association with genes as well as their potential gene regulatory role. We identified 68 arrays composed of TCAST-like elements distributed on all chromosomes. Based on sequence characteristics, the arrays were composed of two types of TCAST-like elements. The first type consists of TCAST satellite-like elements in the form of partial monomers or tandemly arranged monomers, up to tetramers, whereas the second type consists of TCAST-like elements embedded within a complex unit that resembles a DNA transposon. TCAST-like elements were also found in the 5' untranslated region (UTR) of the CR1-3_TCa retrotransposon, and therefore retrotransposition may have contributed to their dispersion throughout the genome. No significant difference in the homogenization of dispersed TCAST-like elements was found at the level of local arrays, within chromosomes, or among different chromosomes. Of 68 TCAST-like elements, 29 were located within introns, with the remaining elements flanked by genes within a 262 to 404,270 nt range. TCAST-like elements are statistically overrepresented near genes with immunoglobulin-like domains, attesting to their nonrandom distribution and a possible gene regulatory role.
Keywords: gene regulation; immunoglobulin-like genes; repetitive DNA; satellite DNA; transposon
Year: 2012 PMID: 22908042 PMCID: PMC3411249 DOI: 10.1534/g3.112.003467
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Table 1 TCAST-like elements associated with genes within T. castaneum euchromatin
| Uniprot | Entrez | Gene Name | Chr | Sat_seq. | Position | Distance, bp | DM Homolog | FBgn | Type | Length | Copies |
|---|---|---|---|---|---|---|---|---|---|---|---|
| D6WZP1 | | Altered disjunction | 9 | 1 | 5′ | 18,773 | Q9VEH1 | | Satellite | 734 | 2.0 |
| D6WZP3 | | Ras-related protein Rab-26 | 9 | 1 | 3′ | 7795 | Q9VP48 | | Satellite | 734 | 2.0 |
| D6WZL9 | | Probable serine/threonine-protein kinase | 9 | 2 | Inside | | Q0KHT7 | | Satellite | 993 | 2.8 |
| D6X226 | | Arrest | 9 | 3 | 5′ | 99,669 | Q8IP89 | | Satellite | 716 | 2.0 |
| D6X238 | | Numb | 9 | 3 | 3′ | 115,984 | P16554 | | Satellite | 716 | 2.0 |
| no match on uniprot | | | 9 | 4 | 5′ | 1520 | | | Satellite | 517 | 1.4 |
| D6X2D0 | | Short-chain dehydrogenase | 9 | 4 | 3′ | 6704 | Q9VE80 | | Satellite | 517 | 1.4 |
| D6X1E7 | | Cytochrome P450 306A1 | 9 | 5 | 5′ | 404,270 | Q9VWR5 | | Satellite | 1058 | 2.9 |
| D6X2U7 | | Elongase | 9 | 5 | 3′ | 9947 | Q9VCY6 | | Satellite | 1058 | 2.9 |
| D6X2C4 | | Dopamine receptor 1 | 9 | 6 | Inside | | P41596 | | Satellite | 304 | 0.8 |
| D6X2U7 | | Elongase | 9 | 7 | 5′ | 7128 | Q9VCY6 | | Satellite | 394 | 1.1 |
| D6X366 | | elongation of very long chain fatty acids protein | 9 | 7 | 3′ | 50,111 | Q9VCY5 | | Satellite | 394 | 1.1 |
| D6X0D7 | | Ret oncogene | 9 | 8 | 5′ | 56,625 | Q8INU0 | | Satellite | 213 | 0.6 |
| D6X0E1 | | Dpr9 | 9 | 8 | 3′ | 62,781 | Q9VFD9 | | Satellite | 213 | 0.6 |
| D6X2H8 | | ADAM metalloprotease | 9 | 9 | Inside | | Q6QU65 | | Transposon | 1107 | |
| D6X2U7 | | Elongase | 9 | 10 | 5′ | 47,902 | Q9VCZ0 | | Transposon | 1085 | |
| D6X2V3 | | Putative uncharacterized protein | 9 | 10 | 3′ | 67,953 | Q9VDB7 | | Transposon | 1085 | |
| D6X244 | | Serine/threonine-protein kinase 32B | 9 | 11 | Inside | | Q0KID3 | | Transposon | 1062 | |
| D6X374 | | Putative uncharacterized protein | 9 | 12 | Inside | | Q9VGZ4 | | Satellite | 292 | 0.8 |
| D6X2C4 | | Dopamine receptor 1 | 9 | 13 | Inside | | P41596 | | Transposon | 900 | |
| D6X259 | | Transport and Golgi organization 13 | 9 | 14 | 5′ | 9456 | Q9VGT8 | | Satellite | 222 | 0.6 |
| D6X260 | | Protein-tyrosine sulfotransferase | 9 | 14 | 3′ | 33,523 | Q9VYB7 | | Satellite | 222 | 0.6 |
| D6X075 | | MICAL-like protein | 9 | 15 | 5′ | 39,684 | Q9VU34 | | Satellite | 203 | 0.6 |
| D6X1P2 | | tiptop | 9 | 15 | 3′ | 142,821 | Q9U3V5 | | Satellite | 203 | 0.6 |
| D6X095 | | Troponin C | 9 | 16 | 5′ | 3922 | P47947 | | Transposon | 589 | |
| D6X0I1 | | Troponin C | 9 | 16 | 3′ | 15,143 | P47947 | | Transposon | 589 | |
| D6X1J0 | | Transporter | 9 | 17 | Inside | | Q9NB97 | | Satellite | 915 | 2.5 |
| D6WF56 | | zinc finger protein 250 | 3 | 18 | 5′ | 125,685 | Q7KAH0 | | Satellite | 1208 | 3.4 |
| D6WF61 | | Transcription initiation factor TFIID subunit 7 | 3 | 18 | 3′ | 64,294 | Q9VHY5 | | Satellite | 1208 | 3.4 |
| D6WGB1 | | Mahya | 3 | 19 | 5′ | 115,537 | P20241 | | Satellite | 687 | 1.9 |
| D6WGB5 | | V-type proton ATPase subunit E | 3 | 19 | 3′ | 82,217 | P54611 | | Satellite | 687 | 1.9 |
| D6WII0 | | NADH dehydrogenase, putative | 3 | 20 | 5′ | 4278 | Q9W3N7 | | Transposon | 635 | |
| D6WII2 | | Putative uncharacterized protein | 3 | 20 | 3′ | 18,585 | Q9VIY1 | | Transposon | 635 | |
| D6WFT8 | | WD repeat-containing protein 47 | 3 | 21 | Inside | | Q960Y9 | | Satellite | 604 | 1.7 |
| D6WDY2 | | Kynurenine aminotransferase | 3 | 22 | 5′ | 10,696 | Q8SXC2 | | Transposon | 1000 | |
| D6WDY4 | | Annexin IX | 3 | 22 | 3′ | 11,599 | P22464 | | Transposon | 1000 | |
| D6WFK8 | | ankyrin 2,3/unc44 | 3 | 23 | Inside | | Q7KU95 | | Transposon | 1081 | |
| D6WFX1 | | ral guanine nucleotide exchange factor | 3 | 24 | 5′ | 64,025 | Q8MT78 | | Transposon | 888 | |
| D6WFX3 | | galactose-1-phosphate uridylyltransferase | 3 | 24 | 3′ | 3958 | Q9VMA2 | | Transposon | 888 | |
| D6WDQ4 | | Putative uncharacterized protein | 3 | 25 | 5′ | 15,051 | Q8T0R9 | | Transposon | 1016 | |
| D6WDQ6 | | coiled-coil domain containing 96 | 3 | 25 | 3′ | 9162 | A1ZA72 | | Transposon | 1016 | |
| D6WF68 | | glucose dehydrogenase | 3 | 26 | 5′ | 25,896 | Q9VY00 | | Transposon | 1067 | |
| C3XZ92 | | Mitogen-activated protein kinase kinase kinase kinase 2 | 3 | 26 | 3′ | 92,876 | Q8SYA1 | | Transposon | 1067 | |
| D6WE82 | | Putative uncharacterized protein | 3 | 27 | Inside | | Q9VDK2 | | Transposon | 314 | |
| D6WHX6 | | Putative uncharacterized protein | 3 | 28 | 5′ | 173,881 | Q1RKQ9 | | Transposon | 826 | |
| D6WI58 | | Cathepsin L | 3 | 28 | 3′ | 82,559 | Q95029 | | Transposon | 826 | |
| D6WDJ9 | | Putative uncharacterized protein | 3 | 29 | 5′ | 173,548 | Q8IPJ1 | | Transposon | 1084 | |
| D6WDN0 | | PRMT5 | 3 | 29 | 3′ | 383,809 | Q9U6Y9 | | Transposon | 1084 | |
| D6WGS3 | | Putative uncharacterized protein | 3 | 30 | 5′ | 37,572 | A0AMQ8 | | Satellite | 216 | 0.6 |
| D6WGT0 | | calpain 3 | 3 | 30 | 3′ | 226,707 | Q11002 | | Satellite | 216 | 0.6 |
| D6WDS8 | | Muscle-specific protein 300 | 3 | 31 | 5′ | 378,626 | Q4ABG9 | | Transposon | 1060 | |
| D6WDT0 | | Phosphatidylinositol-binding clathrin assembly protein | 3 | 31 | 3′ | 7855 | C1C3H4 | | Transposon | 1060 | |
| D6WHF2 | | Nephrin | 3 | 32 | Inside | | Q9W4T9 | | Transposon | 666 | |
| D6WI96 | | Heat shock protein 70 | 3 | 33 | Inside | | P11147 | | Transposon | 1058 | |
| D6WG02 | | N-acetylglucosaminyltransferase vi | 3 | 34 | Inside | | Q9VUH4 | | Transposon | 319 | |
| D6WYD1 | | Putative uncharacterized protein | 8 | 35 | 5′ | 385,712 | Q8SY79 | | Satellite | 625 | 1.7 |
| D6WYN3 | | serine-type protease inhibitor | 8 | 35 | 3′ | 58,583 | Q9VSC9 | | Satellite | 625 | 1.7 |
| D6WYA1 | | Copia protein (Gag-int-pol protein) | 8 | 36 | 3′ | 262 | B6V6Z8 | ?? | Satellite | 196 | 0.5 |
| D6WYC9 | | Cmp-n-acetylneuraminic acid synthase | 8 | 37 | Inside | | B5RJF3 | | Transposon | 831 | |
| D6WV42 | | CG5080 | 8 | 38 | Inside | | Q7K3E2 | | Transposon | 582 | |
| D6WYA0 | | Beaten path | 8 | 39 | 5′ | 7165 | Q94534 | | Transposon | 1181 | |
| D6WUX6 | | Putative uncharacterized protein | 8 | 40 | Inside | | Q7KUK9 | | Transposon | 440 | |
| D6X0E1 | | defective proboscis extension response | 7 | 41 | Inside | | Q9VFD9 | | Satellite | 722 | 2.0 |
| D6WPX8 | | Ribosome-releasing factor 2, mitochondrial | 7 | 42 | 5′ | 17,480 | Q9VCX4 | | Transposon | 905 | |
| A2AX72 | | Gustatory receptor | 7 | 42 | 3′ | 1581 | Q9VPT1 | | Transposon | 905 | |
| D6WTD1 | | similar to chitinase 6 | 7 | 43 | Inside | | Q9W2M7 | | Satellite | 1440 | 4.0 |
| D6WPE6 | | voltage-gated potassium channel | 7 | 44 | Inside | | P17970 | | Transposon | 814 | |
| D2A2C6 | | Putative uncharacterized protein | 4 | 45 | 5′ | 9489 | Q9V3S3 | | Satellite | 549 | 1.5 |
| D2A2D1 | | Putative uncharacterized protein | 4 | 45 | 3′ | 10,920 | Q9W191 | | Satellite | 549 | 1.5 |
| D2A2I0 | | Putative uncharacterized protein | 4 | 46 | 5′ | 5820 | Q8SZ28 | | Satellite | 558 | 1.6 |
| D2A2I1 | | Ribonucleoside-diphosphate reductase | 4 | 46 | 3′ | 7000 | P48591 | | Satellite | 558 | 1.6 |
| D1ZZG6 | | Kinesin-like protein | 4 | 47 | Inside | | Q9VLW2 | | Transposon | 508 | |
| D2A2P8 | | PiggyBac transposable element | 4 | 48 | Inside | | Q9VHL1 | | Transposon | 377 | |
| D6WB65 | | E74 | 2 | 49 | 5′ | 60,525 | P20105 | | Satellite | 770 | 2.1 |
| D6WB73 | | organic cation transporter | 2 | 49 | 3′ | 2638 | Q7K3M6 | FBgn0034479 | Satellite | 770 | 2.1 |
| D6WBG8 | | pre-mRNA-splicing helicase BRR2 | 2 | 50 | 3′ | 4811 | Q9VUV9 | | Satellite | 728 | |
| D6WB14 | | monophenolic amine tyramine | 2 | 51 | 5′ | 7955 | P22270 | | Transposon | 567 | |
| D6WB15 | | Cuticular protein 47Ef | 2 | 51 | 3′ | 16,173 | A1Z8H7 | FBgn0033603 | Transposon | 567 | |
| D6WB29 | | Endoprotease FURIN | 2 | 52 | Inside | | P30432 | | Transposon | 1045 | |
| A8DIV5 | | Nicotinic acetylcholine receptor subunit alpha11 | 2 | 53 | 5′ | 13,645 | P25162 | | Transposon | 1021 | |
| D6WB29 | | Endoprotease FURIN | 2 | 53 | 3′ | 4875 | P30432 | | Transposon | 1021 | |
| D6X3I9 | | Transcription initiation factor IIF | 10 | 54 | 5′ | 10,025 | Q05913 | | Satellite | 870 | 2.4 |
| D6X3J1 | | Putative uncharacterized protein | 10 | 54 | 3′ | 6607 | | | Satellite | 870 | 2.4 |
| D6X4P3 | | Neutral alpha-glucosidase ab | 10 | 55 | Inside | | Q7KMM4 | | Satellite | 694 | 1.9 |
| D6X3H5 | | Neurexin-4 | 10 | 56 | 5′ | 2234 | Q94887 | | Satellite | 224 | 0.6 |
| D6X3H7 | | Succinate semialdehyde dehydrogenase | 10 | 56 | 3′ | 14,901 | Q9VBP6 | | Satellite | 224 | 0.6 |
| D6X4V6 | | Tubby, putative | 10 | 57 | Inside | | Q9VB18 | | Transposon | 763 | |
| D6X3J6 | | Putative uncharacterized protein | 10 | 58 | 5′ | 1015 | Q9VEJ9 | | Satellite | 564 | 1.6 |
| D6X3J7 | | cdc73 domain protein | 10 | 58 | 3′ | 27,239 | Q9VHI1 | | Satellite | 564 | 1.6 |
| D2A693 | | lysine-specific demethylase 4B | 6 | 59 | Inside | | Q9V6L0 | | Satellite | 498 | 1.4 |
| D2A490 | | Facilitated trehalose transporter Tret1-2 homolog | 6 | 60 | Inside | | Q8MKK4 | | Transposon | 689 | |
| D2A6I4 | | Putative uncharacterized protein | 6 | 61 | 5′ | 116,030 | Q9W4G2 | | Transposon | 764 | |
| D2A6I6 | | Putative uncharacterized protein | 6 | 61 | 3′ | 4860 | Q9VNB4 | | Transposon | 764 | |
| D2A3V0 | | Fasciclin-3 | 6 | 62 | 5′ | 37,286 | P15278 | | Satellite | 281 | 0.8 |
| D2A3V3 | | LIM domain kinase 1 | 6 | 62 | 3′ | 21,789 | Q8IR79 | | Satellite | 281 | 0.8 |
| D6W8F4 | | Disco-related | X | 63 | Inside | | Q9VXJ5 | | Satellite | 530 | 1.5 |
| D6W8D3 | | PlexA | X | 64 | 5′ | 1973 | O96681 | | Transposon | 848 | |
| D6WGD2 | | Aldose-1-epimerase | X | 64 | 3′ | 6472 | Q9VRU1 | | Transposon | 848 | |
| B3MMG1 | | Neural-cadherin | 5 | 65 | Inside | | O15943 | | Satellite | 273 | 0.8 |
| D6WNN6 | | Transient receptor potential-gamma protein | 5 | 66 | 5′ | 2547 | Q9VJJ7 | | Transposon | 894 | |
| A3RE80 | | Cardioacceleratory peptide receptor | 5 | 66 | 3′ | 27,105 | Q868T3 | | Transposon | 894 | |
| A1JUG2 | | Ultraspiracle | 5 | 67 | Inside | | P20153 | | Satellite | 379 | 1.1 |
| D6WNB3 | | Y box protein | 5 | 68 | 5′ | 14,993 | O46173 | | Satellite | 455 | 1.3 |
| D6WNB6 | | Peptide chain release factor 1 | 5 | 68 | 3′ | 350,365 | Q9VK20 | | Satellite | 455 | 1.3 |
A list of genes with gene identity numbers, gene name, chromosomal location, position, and distance relative to the associated TCAST-like element, as well as a list of TCAST-like elements, their types (satellite or transposon-like), total length in bp, and copy number of satellite repeats within an array are shown.
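The copy numbers listed for satellite-type arrays appear consistent with dividing the array length by a monomer length of roughly 360 bp. That monomer length is an assumption inferred from the Length and Copies values in the table, not a figure stated in this record, and the function name below is hypothetical. A minimal Python sketch of the relationship:

```python
# Assumed TCAST monomer length in bp, inferred from the Length/Copies
# ratios in the table; it is not stated in this record.
ASSUMED_MONOMER_BP = 360


def estimate_copies(array_length_bp: int, monomer_bp: int = ASSUMED_MONOMER_BP) -> float:
    """Approximate number of monomer copies in an array, to one decimal place."""
    if array_length_bp <= 0 or monomer_bp <= 0:
        raise ValueError("lengths must be positive")
    return round(array_length_bp / monomer_bp, 1)


# Array lengths taken from satellite-type rows of the table (Sat_seq. 1, 2, and 43).
for length_bp in (734, 993, 1440):
    print(length_bp, estimate_copies(length_bp))
```

Transposon-type rows carry no copy number in the table, so the sketch applies only to rows of type Satellite.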
Figure 1 Bayesian/ML phylogenetic trees of: (A) TCAST satellite-like elements (subunits Tcast1a), (B) TCAST satellite-like elements (subunits Tcast1b), and (C) TCAST transposon-like elements. Sequence numbers correspond to those in Table 1. When a particular sequence is composed of several subrepeats (e.g., Tcast1a or Tcast1b), numbers indicating subrepeats are added (e.g., 43_1, 43_2, 43_3). Numbers in brackets indicate chromosomes on which the corresponding sequences are located. Numbers on branches indicate Bayesian posterior probabilities/ML bootstrap support (above 0.5/50%, respectively).
Figure 2 Organization of TCAST elements within the T. castaneum genome in the form of a TCAST transposon-like element, tandem arrays, and the CR1-3_TCa retrotransposon. Regions corresponding to the TCAST element are shown in red. The TCAST transposon-like element contains an almost complete TCAST monomer and a monomer segment of approximately 121 bp in an inverted orientation, whereas the CR1-3 retrotransposon contains a segment corresponding to 1.2 monomers. Within the TCAST transposon-like element, terminal inverted repeats (arrows), unique nonsatellite sequence (green), target-site duplication in the form of “ACT,” and the insertion point of the 925-bp sequence found within the TR 1.9 element and coding for the putative transposase are shown. Three short ORFs within the TCAST transposon-like element are also indicated. Within the non-long terminal repeat retrotransposon CR1-3_TCa, regions corresponding to the 5′UTR and to two ORFs are indicated.
Figure 3 Distribution of TCAST-like elements on T. castaneum chromosomes. The karyotype representing the haploid set of T. castaneum chromosomes, and positions of constitutive heterochromatin (dark) and euchromatin (white) are depicted based on C-banding data (Stuart and Mocelin 1995) and T. castaneum 3.0 assembly (http://www.beetlebase.org). TCAST transposon-like elements (blue) and TCAST satellite-like elements (red) are shown. Two TCAST-like elements are represented as separate lines if they are at least 100 kb distant from each other.
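The drawing rule in the Figure 3 caption (separate lines only for elements at least 100 kb apart) can be sketched in Python as a simple grouping of sorted coordinates. The coordinates below are hypothetical placeholders, not positions from the assembly, and the function name is illustrative. The sketch also assumes that distance is measured to the nearest previously placed element on the same chromosome, which the caption does not specify.

```python
# Minimum separation, in bp, for two elements to be drawn as separate lines
# (the 100 kb threshold stated in the Figure 3 caption).
MIN_SEPARATION_BP = 100_000


def group_for_display(positions_bp, min_separation_bp=MIN_SEPARATION_BP):
    """Group element positions on one chromosome into drawn lines.

    An element joins the current group when it lies closer than
    min_separation_bp to the previous element; otherwise it starts a new group.
    """
    groups = []
    for pos in sorted(positions_bp):
        if groups and pos - groups[-1][-1] < min_separation_bp:
            groups[-1].append(pos)
        else:
            groups.append([pos])
    return groups


# Hypothetical coordinates for illustration only.
example_positions = [1_200_000, 1_250_000, 1_400_000, 5_000_000]
print(group_for_display(example_positions))
```

With these placeholder values, the first two elements would share one line, and the remaining two would each get their own.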
Figure 4 Models of spreading of TCAST-like elements based on (A) retrotransposition of the CR1-3_TCa element. CR1-3_TCa was inserted within a TCAST satellite array and through recombination has acquired a part of the TCAST sequence, which could act as a promoter and become a new functional 5′UTR. Subsequent retrotransposition of CR1-3_TCa could explain the dispersion of TCAST within the euchromatin. (B) Rolling circle replication of TCAST satellite DNA sequences excised from their heterochromatin loci via intrastrand recombination, followed by reintegration into different genome locations by homologous recombination.