Abstract
BACKGROUND: Gene clusters are of interest for understanding genome evolution because they provide insight into large-scale duplication events as well as patterns of individual gene loss. Vertebrates tend to have multiple copies of gene clusters that are typically present as single clusters, or not at all, in invertebrate genomes. We investigated the genomic architecture and conserved non-coding sequences of vertebrate KCNA gene clusters. KCNA genes encode shaker-related voltage-gated potassium channels and are arranged in two three-gene clusters in tetrapods. Teleost fish possess four clusters. The two tetrapod KCNA clusters are of approximately the same age as the Hox gene clusters, which arose through duplications early in vertebrate evolution. For some genes, conserved retention and arrangement in clusters are thought to be related to regulatory elements in the intergenic regions, which might prevent rearrangements and gene loss. Interestingly, this hypothesis does not appear to apply to the KCNA clusters, as too few conserved putative regulatory elements are retained.
Year: 2007 PMID: 17697377 PMCID: PMC1978502 DOI: 10.1186/1471-2148-7-139
Source DB: PubMed Journal: BMC Evol Biol ISSN: 1471-2148 Impact factor: 3.260
Figure 1. Phylogenetic scheme of . Non-connected genes indicate missing linkage/genomic data. Grey squares show hypothetical genes that most likely exist but are still missing from the current versions of genomic databases. The boxes include genes that are not part of KCNA clusters. The teleost state is hypothetical: we found duplicated KCNA4 genes in Gnathonemus petersi and duplicated KCNA7 genes in Oryzias latipes, but no teleost studied so far showed the full set of duplicated genes.
Figure 2. Maximum likelihood tree of based on 80 sequences and 364 amino acid positions. The tree was obtained using PhyML [5] with 500 bootstrap replicates; bootstrap values are shown by the first numbers. Posterior probabilities obtained with MrBayes 3.1.1 [59] (100 000 generations) are indicated with asterisks (** = 100% PP, * = 99–95% PP).
Figure 3. Proposed scenario for the evolution of . Based on our analyses, we suggest that all KCNA genes are derived from an ancestral intronless gene, as all genes included from Branchiostoma floridae are intronless, and that KCNA7 in vertebrates independently gained an intron. Two tandem duplications led to the three-gene cluster found in today's genomes, which was probably duplicated initially before the origin of the gnathostomes. This duplication is probably linked to the second genome duplication (2R) during vertebrate evolution. The four clusters in teleost fish originated through the fish-specific genome duplication (FSGD, 3R).
Pairwise comparison of KCNA 6-1-5 clusters. Above the diagonal are the numbers of shared cliques (clusters of phylogenetic footprints) based on Tracker analyses; below the diagonal are the complete lengths of shared elements. Excluded direct comparisons between pufferfish clusters are printed in bold. "a" and "b" refer to the duplicated fish clusters.
| 615 | |||||||||
| - | 45 | 25 | 7 | 7 | 7 | 5 | 5 | 5 | |
| 2959 | - | 30 | 10 | 5 | 8 | 5 | 5 | 5 | |
| 1871 | 2100 | - | 6 | 5 | 6 | 4 | 4 | 3 | |
| 528 | 622 | 345 | - | | 19 | 17 | 7 | 2 | |
| 517 | 378 | 311 | | - | 16 | 11 | 6 | 1 | |
| 571 | 513 | 370 | 2354 | 1808 | - | 16 | 7 | 0 | |
| 214 | 200 | 218 | 1717 | 1553 | 1815 | - | 7 | 0 | |
| 244 | 173 | 205 | 1245 | 1212 | 1358 | 1039 | - | 0 | |
| 195 | 221 | 140 | 102 | 40 | 0 | 0 | 0 | - |
Abbreviations stand for human (Hs), chicken (Gg), frog (Xt), zebrafish (Dr), stickleback (Ga), medaka (Ola), and the two pufferfishes (Tn, Tr).
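The combined layout of the table above (clique counts above the diagonal, element lengths below it) can be separated programmatically. The following is a minimal sketch, not part of the original analysis: it uses only the first three rows and columns as printed, refers to clusters by index because the row and column labels are not available here, and the helper name `split_triangles` is hypothetical.

```python
# First three rows/columns of the combined matrix as printed above.
# None marks the diagonal; clusters are referred to by index (assumption:
# labels are not available in this record).
combined = [
    [None, 45, 25],
    [2959, None, 30],
    [1871, 2100, None],
]


def split_triangles(matrix):
    """Return (cliques, lengths) dicts keyed by (i, j) with i < j.

    cliques holds the value above the diagonal, lengths the mirrored
    value below the diagonal.
    """
    cliques, lengths = {}, {}
    n = len(matrix)
    for i in range(n):
        for j in range(i + 1, n):
            cliques[(i, j)] = matrix[i][j]  # above the diagonal
            lengths[(i, j)] = matrix[j][i]  # below the diagonal
    return cliques, lengths


cliques, lengths = split_triangles(combined)

# Mean length of shared elements per clique, skipping pairs with no cliques.
mean_len = {
    pair: lengths[pair] / cliques[pair]
    for pair in cliques
    if cliques[pair]
}

for pair in sorted(mean_len):
    print(pair, cliques[pair], lengths[pair], round(mean_len[pair], 1))
```

The mean length per clique is only a derived convenience value for comparing cluster pairs; it is not a quantity reported in the table.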
Pairwise comparison of KCNA 3-2-10 clusters. Above the diagonal are the numbers of shared cliques (clusters of phylogenetic footprints) based on Tracker analyses; below the diagonal are the complete lengths of shared elements. Excluded direct comparisons between pufferfish clusters are printed in bold. "a" and "b" refer to the duplicated fish clusters.
| - | 28 | 24 | 5 | 6 | 5 | 5 | 8 | 6 | 13 | 10 | |
| 1788 | - | 16 | 9 | 7 | 3 | 6 | 8 | 4 | 6 | 6 | |
| 1375 | 1077 | - | 10 | 13 | 6 | 10 | 11 | 8 | 11 | 12 | |
| 450 | 514 | 507 | - | 47 | 41 | 39 | 8 | 8 | 7 | 9 | |
| 445 | 462 | 618 | 4560 | - | 37 | 34 | 4 | 5 | 4 | 7 | |
| 484 | 282 | 395 | 4129 | 3599 | - | | 3 | 6 | 4 | 5 | |
| 412 | 452 | 497 | 4270 | 3520 | | - | 6 | 8 | 9 | 6 | |
| 559 | 408 | 553 | 827 | 460 | 444 | 505 | - | | 75 | 10 | |
| 439 | 303 | 414 | 1005 | 586 | 703 | 741 | | - | 70 | 8 | |
| 730 | 370 | 550 | 803 | 460 | 493 | 647 | 5952 | 5527 | - | 8 | |
| 788 | 469 | 602 | 805 | 586 | 600 | 574 | 980 | 813 | 906 | - |
Abbreviations stand for human (Hs), chicken (Gg), frog (Xt), zebrafish (Dr), stickleback (Ga), medaka (Ola), and the two pufferfishes (Tn, Tr).
Pairwise comparisons between paralogous clusters: the number of shared PFCs (phylogenetic footprint cliques) and their complete lengths.
| KCNA 6-1-5 clusters | ||||||||||||
| Number of cliques | ||||||||||||
| 24 | 17 | 21 | 13 | 13 | 9 | 7 | 8 | 7 | 13 | 13 | ||
| 20 | 12 | 20 | 9 | 18 | 9 | 6 | 15 | 9 | 15 | 8 | ||
| 17 | 11 | 39 | 8 | 18 | 11 | 10 | 12 | 6 | 12 | 11 | ||
| 3 | 2 | 3 | 1 | 2 | 2 | 1 | 3 | 1 | 3 | 2 | ||
| 3 | 3 | 3 | 1 | 2 | 1 | 1 | 3 | 2 | 2 | 2 | ||
| 1 | 2 | 6 | 1 | 2 | 1 | 1 | 3 | 1 | 2 | 2 | ||
| 3 | 1 | 2 | 1 | 2 | 1 | 1 | 2 | 1 | 1 | 1 | ||
| 2 | 1 | 0 | 1 | 1 | 1 | 1 | 2 | 1 | 0 | 2 | ||
| 2 | 2 | 2 | 2 | 1 | 2 | 2 | 2 | 1 | 2 | 3 | ||
| Total length | ||||||||||||
| 1248 | 774 | 1177 | 564 | 584 | 407 | 340 | 644 | 338 | 830 | 556 | ||
| 980 | 508 | 946 | 420 | 815 | 382 | 293 | 869 | 381 | 987 | 330 | ||
| 801 | 450 | 3014 | 332 | 742 | 521 | 438 | 633 | 302 | 639 | 406 | ||
| 155 | 444 | 512 | 33 | 397 | 89 | 33 | 214 | 38 | 163 | 454 | ||
| 155 | 510 | 486 | 43 | 421 | 43 | 43 | 139 | 69 | 89 | 359 | ||
| 136 | 477 | 667 | 66 | 441 | 66 | 66 | 289 | 60 | 196 | 427 | ||
| 100 | 58 | 98 | 99 | 132 | 99 | 99 | 114 | 31 | 31 | 291 | ||
| 83 | 21 | 0 | 52 | 52 | 52 | 52 | 127 | 42 | 0 | 105 | ||
| 50 | 57 | 64 | 68 | 72 | 74 | 67 | 54 | 23 | 79 | 98 | ||
Length of the sequences in different organisms, counted from the start codon of the 5'-most gene to the stop codon of the 3'-most gene or until the next gene. The "Bp in cliques" column contains the complete length of FCs from one sequence, and the last column gives its percentage of the complete length of the cluster.
| Sequence | Total length (bp) | Bp in cliques | % of bp in cliques |
| | 157401 | 5555 | 3.5 |
| | 54669 | 4149 | 7.6 |
| | 154969 | 7587 | 5.9 |
| | 30201 | 6675 | 22.1 |
| | 48863 | 6405 | 13.1 |
| | 24581 | 4403 | 17.9 |
| | 26792 | 4680 | 17.5 |
| | 23872 | 6137 | 25.7 |
| | 26629 | 6815 | 25.6 |
| | 30044 | 8186 | 27.2 |
| | 88550 | 3091 | 3.5 |
| | 237607 | 7497 | 3.2 |
| | 173038 | 7279 | 4.2 |
| | 144682 | 7518 | 5.2 |
| | 8433 | 2044 | 24.2 |
| | 10347 | 2920 | 28.2 |
| | 8252 | 2835 | 34.4 |
| | 7038 | 1892 | 26.9 |
| | 9756 | 1119 | 11.5 |
| | 10689 | 1176 | 11.0 |
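The percentage column is simply the bp in cliques divided by the total sequence length. As a minimal sketch, assuming one-decimal rounding, the calculation can be reproduced for a few rows copied from the table; the function name `percent_in_cliques` and the choice of rows are illustrative only.

```python
# (total length in bp, bp in cliques, reported %) copied from selected
# rows of the table above; the selection is illustrative.
rows = [
    (157401, 5555, 3.5),
    (54669, 4149, 7.6),
    (30201, 6675, 22.1),
    (8252, 2835, 34.4),
]


def percent_in_cliques(total_bp, clique_bp):
    """Share of the sequence covered by cliques, in percent (one decimal)."""
    return round(100.0 * clique_bp / total_bp, 1)


for total_bp, clique_bp, reported in rows:
    computed = percent_in_cliques(total_bp, clique_bp)
    print(total_bp, clique_bp, computed, reported)
```

Applying the same function to every row is a quick way to check the table for transcription errors.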