Abstract
The achaete-scute complex (AS-C) has been a useful paradigm for the study of pattern formation and its evolution. achaete-scute genes have duplicated and evolved distinct expression patterns during the evolution of cyclorrhaphous Diptera. Are the expression patterns in different species driven by conserved regulatory elements? If so, when did such regulatory elements arise? Here, we have sequenced most of the AS-C of the fly Calliphora vicina (including the genes achaete, scute and lethal of scute) to compare noncoding sequences with known cis-regulatory sequences in Drosophila. The organization of the complex is conserved with respect to Drosophila species. There are numerous small stretches of conserved noncoding sequence that, in spite of high sequence turnover, display binding sites for known transcription factors. Synteny of the blocks of conserved noncoding sequences is maintained, suggesting not only conservation of the position of regulatory elements but also an origin prior to the divergence between these two species. We propose that some of these enhancers originated by duplication with their target genes.
Keywords: Calliphora; Diptera; Drosophila; achaete-scute complex; gene duplication; regulatory elements; sequence evolution
Year: 2015 PMID: 26134680 PMCID: PMC4832353 DOI: 10.1111/jeb.12687
Source DB: PubMed Journal: J Evol Biol ISSN: 1010-061X Impact factor: 2.411
Figure 1. Dipteran phylogeny showing the available information about the AS‐C. Arrows indicate Dipteran coding genes. Genes are connected with a line when genome organization is known. Previously published data are shown in blue, data obtained in this work in red.
Figure 2. Map of the AS‐C region in Calliphora vicina. Sequenced BACs are shown in black and other BACs in grey. Blue arrows represent AS‐C genes and green boxes transposable elements and repeats.
Table 1. Summary of sequenced BAC clones: gene and repeat content
| BAC | Total size (bp) | Genes (number) | Genes (bp) | Genes (%) | Repeats (number) | Repeats (bp) | Repeats (%) |
|---|---|---|---|---|---|---|---|
| 113H10 | 96 426 | 1 ( | 885 | 0.92 | 61 | 23 570 | 24.44 |
| 99M22 | 102 758 | 1 ( | 885 | 0.86 | 78 | 28 879 | 28.10 |
| 97L04 | 111 044 | 1 ( | 963 | 0.87 | 38 | 38 432 | 34.61 |
| 62B24 | 90 178 | 1 ( | 963 | 1.07 | 44 | 28 013 | 31.06 |
| 16B10 | 135 393 | 1 ( | 786 | 0.58 | 66 | 27 144 | 20.05 |
| 104L14 | 115 595 | 0 | 0 | 0 | 68 | 19 608 | 16.96 |
| Total | 651 394 | 5 | 4482 | 0.69 | 355 | 165 646 | 25.43 |
| Without overlap | 530 000 | 3 | 2634 | 0.50 | |||
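
The percentage columns follow directly from the base-pair counts (feature bp divided by BAC size). As a quick consistency check, the following minimal Python sketch recomputes them from the values in the table; the variable and function names are illustrative and do not come from the source.

```python
# Values copied from the BAC summary table: BAC -> (total size, gene bp, repeat bp).
bacs = {
    "113H10": (96_426, 885, 23_570),
    "99M22": (102_758, 885, 28_879),
    "97L04": (111_044, 963, 38_432),
    "62B24": (90_178, 963, 28_013),
    "16B10": (135_393, 786, 27_144),
    "104L14": (115_595, 0, 19_608),
}


def percent(part_bp, total_bp):
    """Share of a sequence occupied by a feature class, in percent (2 decimals)."""
    return round(100 * part_bp / total_bp, 2)


# Column totals over all six BACs (overlaps between clones are counted twice,
# as in the "Total" row of the table).
total_size = sum(size for size, _, _ in bacs.values())
total_gene_bp = sum(gene_bp for _, gene_bp, _ in bacs.values())
total_repeat_bp = sum(repeat_bp for _, _, repeat_bp in bacs.values())

for name, (size, gene_bp, repeat_bp) in bacs.items():
    print(name, percent(gene_bp, size), percent(repeat_bp, size))
print("Total", total_size,
      percent(total_gene_bp, total_size),
      percent(total_repeat_bp, total_size))
```

The "Without overlap" row cannot be recomputed this way, because the extent of overlap between clones is not given in the table.
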
Table 2. Conserved noncoding sequences of the achaete‐scute complex detected by blast2seq between Calliphora and Drosophila
| D. melanogaster start | D. melanogaster end | C. vicina clone | C. vicina start | C. vicina end | % Identity | Length | Mismatches | Gap opens | E-value | Bit score | Note |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 13 423 | 13 479 | 113H10 | 18 171 | 18 211 | 71.93 | 57 | 0 | 2 | 0.066 | 37.4 | |
| 14 226 | 14 304 | 113H10 | 20 773 | 20 851 | 72.62 | 84 | 13 | 3 | 4.00E-04 | 44.6 | |
| 19 665 | 19 697 | 99M22 | 37 506 | 37 474 | 90.91 | 33 | 3 | 0 | 1.00E-04 | 46.4 | |
| 23 323 | 23 352 | 99M22 | 57 812 | 57 841 | 93.33 | 30 | 2 | 0 | 1.00E-04 | 46.4 | 7 |
| 25 810 | 25 845 | 99M22 | 73 435 | 73 400 | 100.00 | 36 | 0 | 0 | 1.00E-10 | 66.2 | 7 |
| 33 788 | 33 828 | 97L04 | 44 884 | 44 925 | 90.48 | 42 | 3 | 1 | 3.00E-07 | 55.4 | |
| 34 886 | 34 928 | 97L04 | 50 672 | 50 630 | 90.70 | 43 | 4 | 0 | 7.00E-09 | 60.8 | |
| 37 687 | 37 713 | 97L04 | 24 868 | 24 894 | 96.30 | 27 | 1 | 0 | 5.00E-04 | 44.6 | |
| 38 125 | 38 149 | 97L04 | 70 125 | 70 101 | 96.00 | 25 | 1 | 0 | 0.006 | 41.0 | |
| 40 105 | 40 129 | 97L04 | 90 093 | 90 069 | 92.00 | 25 | 2 | 0 | 0.076 | 37.4 | 1 |
| 42 356 | 42 403 | 97L04 | 99 269 | 99 316 | 89.58 | 48 | 5 | 0 | 5.00E-10 | 64.4 | 2 |
| 42 858 | 42 923 | 97L04 | 104 855 | 104 920 | 81.82 | 66 | 12 | 0 | 2.00E-10 | 66.2 | 5 |
| 46 110 | 46 141 | 62B24 | 47 973 | 47 942 | 90.62 | 32 | 3 | 0 | 4.00E-04 | 44.6 | |
| 46 525 | 46 558 | 62B24 | 46 965 | 46 931 | 91.43 | 35 | 2 | 1 | 1.00E-04 | 46.4 | |
| 51 057 | 51 103 | 62B24 | 72 470 | 72 424 | 87.23 | 47 | 6 | 0 | 2.00E-08 | 59.0 | 3 |
| 51 964 | 52 013 | 16B10 | 30 646 | 30 599 | 86.00 | 50 | 5 | 1 | 1.00E-07 | 57.2 | 3 |
| 52 755 | 52 844 | 16B10 | 24 209 | 24 116 | 82.98 | 94 | 12 | 3 | 6.00E-17 | 87.8 | 3 |
| 53 731 | 53 762 | 62B24 | 89 286 | 89 255 | 93.75 | 32 | 2 | 0 | 1.00E-05 | 50.0 | 3 |
| 57 555 | 57 671 | 16B10 | 50 662 | 50 779 | 70.25 | 121 | 29 | 4 | 6.00E-04 | 44.6 | 6 |
| 66 739 | 66 764 | 16B10 | 117 858 | 117 833 | 92.31 | 26 | 2 | 0 | 0.026 | 39.2 | 8 |
| 68 213 | 68 243 | 104L14 | 28 982 | 29 012 | 90.32 | 31 | 3 | 0 | 0.002 | 42.8 | 8 |
| 68 286 | 68 311 | 104L14 | 29 081 | 29 106 | 92.31 | 26 | 2 | 0 | 0.023 | 39.2 | 8 |
| 69 400 | 69 423 | 104L14 | 32 865 | 32 888 | 95.83 | 24 | 1 | 0 | 0.023 | 39.2 | 8 |
| 69 638 | 69 686 | 104L14 | 32 987 | 33 035 | 86.00 | 50 | 5 | 2 | 4.00E-06 | 51.8 | 8 |
| 72 901 | 72 925 | 104L14 | 26 919 | 26 895 | 100.00 | 25 | 0 | 0 | 2.00E-04 | 46.4 | 4 |
| 86 922 | 86 977 | 104L14 | 54 520 | 54 575 | 87.50 | 56 | 7 | 0 | 1.00E-11 | 69.8 | |
| 90 692 | 90 719 | 104L14 | 69 009 | 69 036 | 100.00 | 28 | 0 | 0 | 4.00E-06 | 51.8 | 9 |
| 109 968 | 109 997 | 97L04 | 107 083 | 107 054 | 86.67 | 30 | 4 | 0 | 0.076 | 37.4 | 10 |
Fragments detected when comparing the C. vicina sequence to both D. melanogaster and Drosophila virilis (see text for details). The coordinates for D. melanogaster refer to the sequence ChrX 210 000–330 000 from the whole genome sequence. The C. vicina coordinates refer to each BAC clone (italics: fragments present in more than one BAC clone). Fragments overlapping known structures in D. melanogaster are numbered as follows: wing enhancers (1) sc‐SOPE, (2) L3/TSM, (3) pTG, (4) tr1‐tr2; UTRs (5) sc, (6) l'sc; genetically inferred blastoderm enhancers (7) A, (8) C, (9) D, (10) E.
Worst hits (high e‐value in addition to short length or low % identity) are shown with a thin line in Fig. 2.
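
To make the table easier to read, here is a minimal Python sketch of two things the notes rely on: the orientation of a hit and the "worst hit" criterion. It assumes the usual BLAST convention that a start coordinate larger than the end coordinate marks a match on the reverse strand. The numeric cut-offs (`MAX_EVALUE`, `MIN_LENGTH`, `MIN_IDENTITY`) are hypothetical, chosen only for illustration, since the source gives no thresholds. The `Hit` class and `is_weak` function are likewise illustrative names.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    dmel_start: int
    dmel_end: int
    clone: str
    cvic_start: int
    cvic_end: int
    identity: float  # percent identity
    length: int      # alignment length, gaps included
    evalue: float

    @property
    def strand(self):
        # Assumed BLAST convention: start > end means a reverse-strand match.
        return "-" if self.cvic_start > self.cvic_end else "+"


# Hypothetical cut-offs; the source does not state numeric thresholds.
MAX_EVALUE = 0.01
MIN_LENGTH = 30
MIN_IDENTITY = 75.0


def is_weak(hit):
    """High e-value combined with either a short length or a low identity."""
    high_evalue = hit.evalue > MAX_EVALUE
    return high_evalue and (hit.length < MIN_LENGTH or hit.identity < MIN_IDENTITY)


# Four rows copied from the table of blast2seq hits.
hits = [
    Hit(13_423, 13_479, "113H10", 18_171, 18_211, 71.93, 57, 0.066),
    Hit(19_665, 19_697, "99M22", 37_506, 37_474, 90.91, 33, 1.0e-4),
    Hit(33_788, 33_828, "97L04", 44_884, 44_925, 90.48, 42, 3.0e-7),
    Hit(38_125, 38_149, "97L04", 70_125, 70_101, 96.00, 25, 0.006),
]

for hit in hits:
    print(hit.clone, hit.strand, "weak" if is_weak(hit) else "ok")
```

With different cut-offs the set of flagged hits changes, so the result of `is_weak` should not be read as the authors' own classification.
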
Figure 3. Comparison of AS‐C noncoding sequences between Calliphora vicina and Drosophila melanogaster. Note the different scale in each species. Arrows represent genes: AS‐C genes in blue, other genes in white. Green boxes indicate repeats and yellow boxes enhancers tested in D. melanogaster. The shadowed area in the D. melanogaster AS‐C indicates the approximate area included in the C. vicina sequence. Blue and red lines indicate conserved noncoding sequences detected between C. vicina and D. melanogaster and Drosophila virilis (details in Table 2).
Figure 4. Sequence alignment of the L3/TSM enhancer. ClustalW2 alignment of the L3/TSM enhancer region between Drosophila melanogaster (Dmel), Drosophila virilis (Dvir) and Calliphora vicina (Cvic). The blast2sequences hit (see Table 2 for details) and the TTAATTAA homeobox binding site identified by Gomez‐Skarmeta et al. (1996) are underlined. Red boxes are En/Antp binding sites and blue boxes Ara/Caup binding sites, as defined by Noyes et al. (2008).
Figure 5. Comparison of the structure of the enhancers of asense, scute and achaete between Drosophila melanogaster and Calliphora vicina. Coloured rectangles represent matches to binding sites: α‐box in blue, β‐box in green, E‐box in red, N‐box in purple and TATA‐box in black. The thick black line represents conserved fragments. Arrows indicate the transcription start site and ATG the beginning of the coding region. Note that the sc are several kilobases upstream of the coding regions. ase from Gibert and Simpson (2003).