Mariana Lomiento, Zhaoshi Jiang, Pietro D'Addabbo, Evan E Eichler, Mariano Rocchi.
Abstract
BACKGROUND: Evolutionary-new centromeres (ENCs) result from the seeding of a centromere at an ectopic location along the chromosome during evolution. The novel centromere rapidly acquires the complex structure typical of eukaryotic centromeres. This phenomenon has played an important role in shaping primate karyotypes. A recent study on the evolutionary-new centromere of macaque chromosome 4 (human 6) showed that, following the seeding, the evolutionary-new centromere domain was deeply restructured relative to the corresponding human region, which was assumed to be ancestral. It was also demonstrated that the region was devoid of genes. We hypothesized that these two observations were not merely coincidental and that the absence of genes in the seeding area constituted a crucial condition for the fixation of the evolutionary-new centromere in the population.
Year: 2008 PMID: 19087244 PMCID: PMC2646277 DOI: 10.1186/gb-2008-9-12-r173
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Definition of the ENC seeding region in the reference genome
| Chromosome | ENC position | Size (kb) | p arm BAC | p arm BAC position in HSA (hg17) or MMU (rheMac2) | q arm BAC | q arm BAC position in HSA (hg17) or MMU (rheMac2) | AT content (%) |
| LLA7 (HSA8) | Chr8:63,002,317-63,047,396 | 45 | RP11-953L16 | Chr8:62,816,386-63,002,317 | RP11-159F22 | Chr8:63,047,396-63,204,407 | 63.9 |
| MMU2 (HSA3) | Chr3:164,221,008-164,539,729 | 319 | RP11-449O23 | Chr3:164,054,860-164,221,008 | RP11-418B12 | Chr3:164,539,729-164,707,135 | 65.9 |
| MMU4 (HSA6) | Chr6:145,651,644-145,845,896 | 194 | | Chr6:145,651,644-145,845,896 (splitting BAC) | | | 63.2 |
| MMU12 (HSA2q) | Chr2:138,847,788-138,947,383 | 99 | RP11-343I5 | Chr2:138,777,146-138,947,383 | RP11-846E22 | Chr2:138,847,788-139,025,935 | 63.1 |
| MMU13 (HSA2p) | Chr2:86,680,785-86,885,407 | 204 | | Chr2:86,680,785-86,885,407 (splitting BAC) | | | 60.0 |
| MMU14 (HSA11) | Chr11:5,856,181-5,864,725 | 8 | RP11-625D10 | Chr11:5,667,339-5,864,725 | RP11-661M13 | Chr11:5,856,181-6,043,020 | 62.8 |
| MMU15 (HSA9) | Chr9:122,486,836-122,532,865 | 46 | RP11-64P14 | Chr9:122,344,545-122,532,865 | RP11-1069J21 | Chr9:122,486,836-122,680,563 | 62.4 |
| MMU17 (HSA13) | Chr13:61,178,154-62,520,878 | 1,343 | | Chr13:61,111,769-61,178,154 | | Chr13:62,520,878-62,699,203 | 66.2 |
| MMU18 (HSA18) | Chr18:50,313,129-50,360,135 | 47 | RP11-61D1 | Chr18:50,155,761-50,313,129 | RP11-289E15 | Chr18:50,360,135-50,526,341 | gap |
| NLE15 (HSA11) | Chr11:89,446,995-89,488,776 | 42 | RP11-529A4 | Chr11:89,286,313-89,446,995 | RP11-1129K7 | Chr11:89,488,776-89,644,713 | 63.8 |
| PPY11 (HSA11) | Chr11:20,180,424-20,332,556 | 152 | | Chr11:20,180,424-20,332,556 (splitting BAC) | | | 61.2 |
| HSA3 (MMU2) | Chr2:14,301,434-14,386,749 | 85 | CH250-111O10 | Chr2:14,301,466-14,396,994 | CH250-4J18 | Chr2:14,386,749-14,533,296 | 62.3 |
| HSA6 (MMU4) | Chr4:57,710,481-57,863,274 | 153 | | Chr4:57,710,481-57,863,274 (splitting BAC) | | | 66.1 |
| HSA11 (MMU14) | Chr14:17,109,970-17,281,610 | 171 | CH250-111J7 | Chr14:17,015,710-17,109,970 | CH250-37N19 | Chr14:17,281,610-17,299,898 | 63.4 |
Seeding regions of the studied ENCs, defined by a splitting BAC (in bold) or by overlapping BACs mapping in opposite sides of the centromere (p arm/q arm). In the latter case the overlapping portion of the two BACs was assumed as the seeding point. In MMU17 (human 13), several contiguous human BACs gave split signals. The table lists the most external ones (in italics). The human genome was used as a reference genome for non-human primate ENCs. The macaque genome was used as a reference for the three human ENCs (see text).
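The way a seeding region follows from the two flanking BACs can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `parse_region` and `seeding_region` are hypothetical, and the sketch assumes the p arm BAC sits at the lower coordinates, as it does in the rows of the table. The region is taken as the span between the inner end of the p arm BAC and the inner start of the q arm BAC, which covers both the overlapping case described in the legend and rows where the two BACs leave a gap.

```python
import re


def parse_region(text):
    """Parse a string such as 'Chr8:62,816,386-63,002,317' into (chrom, start, end)."""
    match = re.fullmatch(r"\s*(\w+):\s*([\d,]+)-([\d,]+)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized region: {text!r}")
    chrom, start, end = match.groups()
    return chrom, int(start.replace(",", "")), int(end.replace(",", ""))


def seeding_region(p_arm_bac, q_arm_bac):
    """Return (chrom, start, end) spanning the inner boundaries of the two BACs.

    Assumption: the p arm BAC lies at lower coordinates than the q arm BAC.
    If the BACs overlap, the result is the overlap; if not, it is the gap.
    """
    chrom_p, _, p_end = parse_region(p_arm_bac)
    chrom_q, q_start, _ = parse_region(q_arm_bac)
    if chrom_p != chrom_q:
        raise ValueError("BACs map to different chromosomes")
    start, end = sorted((p_end, q_start))
    return chrom_p, start, end


# MMU12 (HSA2q): overlapping BACs RP11-343I5 and RP11-846E22
print(seeding_region("Chr2:138,777,146-138,947,383", "Chr2:138,847,788-139,025,935"))
# LLA7 (HSA8): BACs RP11-953L16 and RP11-159F22 do not overlap
print(seeding_region("Chr8:62,816,386-63,002,317", "Chr8:63,047,396-63,204,407"))
```

For these two rows the sketch returns the ENC positions listed in the table (Chr2:138,847,788-138,947,383 and Chr8:63,002,317-63,047,396).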
Figure 1. The phylogenetic relationships of the species under study. Data on OWMs and Hominoidea are from Raaum et al. [22], while those on NWMs are from Schneider et al. [24].
Duplication analyses in ENC regions
| ENC | Start (HSA hg17) | End (HSA hg17) | Size (bp) | Non-redundant WSSD (bp), HSA | Non-redundant WSSD (bp), PTR | Non-redundant WSSD (bp), PPY | Non-redundant WSSD (bp), MMU |
| Narrow interval (Start to End) | | | | | | | |
| MMU2 (HSA3) | 164,221,008 | 164,539,729 | 318,722 | 0 | 0 | 0 | 0 |
| MMU4 (HSA6) | 145,651,644 | 145,845,896 | 194,253 | 0 | 0 | 0 | 104,409 |
| MMU12 (HSA2) | 138,847,788 | 138,947,383 | 99,596 | 0 | 0 | 0 | 0 |
| MMU13 (HSA2) | 86,680,785 | 86,885,407 | 204,623 | 24,002 | 0 | 0 | 0 |
| MMU14 (HSA11) | 5,856,181 | 5,864,725 | 8,545 | 0 | 0 | 0 | 0 |
| MMU15 (HSA9) | 122,486,836 | 122,532,865 | 46,030 | 0 | 0 | 0 | 0 |
| MMU17 (HSA13) | 61,178,154 | 62,520,878 | 1,342,725 | 24,879 | 15,879 | 103,912 | 85,133 |
| MMU18 (HSA18) | 50,313,129 | 50,360,135 | 47,007 | 0 | 0 | 0 | 0 |
| PPY11 (HSA11) | 20,180,424 | 20,332,556 | 152,133 | 0 | 0 | 126,135 | 0 |
| Extended interval (Start-1M to End+1M) | | | | | | | |
| MMU2 (HSA3) | 163,221,008 | 165,539,729 | 2,318,722 | 0 | 0 | 0 | 24,053 |
| MMU4 (HSA6) | 144,651,644 | 146,845,896 | 2,194,253 | 0 | 0 | 0 | 115,053 |
| MMU12 (HSA2) | 137,847,788 | 139,947,383 | 2,099,596 | 0 | 0 | 17,001 | 1,706 |
| MMU13 (HSA2) | 85,680,785 | 87,885,407 | 2,204,623 | 1,227,738 | 309,321 | 0 | 19,317 |
| MMU14 (HSA11) | 4,856,181 | 6,864,725 | 2,008,545 | 0 | 0 | 13,379 | 0 |
| MMU15 (HSA9) | 121,486,836 | 123,532,865 | 2,046,030 | 0 | 0 | 0 | 0 |
| MMU17 (HSA13) | 60,178,154 | 63,520,878 | 3,342,725 | 1,604,637 | 98,004 | 144,056 | 85,133 |
| MMU18 (HSA18) | 49,313,129 | 51,360,135 | 2,047,007 | 0 | 0 | 0 | 0 |
| PPY11 (HSA11) | 19,180,424 | 21,332,556 | 2,152,133 | 0 | 0 | 784,808 | 0 |
We estimated the number of duplicated base pairs predicted in each of the ENC intervals using the WSSD method. Duplications >10 kb in length and >94% in sequence identity were detected, except in the macaque, where a threshold of >88% was used because of the greater sequence divergence between the human and macaque genomes. The analysis was performed separately for each of the four primate species. Two different ENC intervals were considered: a narrow interval, as defined in Table 1 (upper dataset), and a larger interval obtained by adding 1 Mbp to each side of the region (lower dataset).
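The arithmetic behind the two datasets can be checked with a short Python sketch. The helper names (`interval_size`, `widen`, `duplicated_fraction`) are hypothetical, and the sketch assumes that sizes are counted inclusively (end - start + 1), which is consistent with the sizes listed in the table. It does not implement WSSD itself; it only relates the tabulated coordinates, sizes, and duplicated base-pair counts.

```python
def interval_size(start, end):
    """Size in bp of a closed interval, counting both end coordinates."""
    return end - start + 1


def widen(start, end, flank=1_000_000):
    """Add `flank` bp to each side of the interval (clamped at position 1)."""
    return max(1, start - flank), end + flank


def duplicated_fraction(dup_bp, start, end):
    """Fraction of the interval predicted to be duplicated."""
    return dup_bp / interval_size(start, end)


# MMU2 (HSA3): narrow interval and the interval widened by 1 Mbp per side
narrow = (164_221_008, 164_539_729)
wide = widen(*narrow)
print(interval_size(*narrow), wide, interval_size(*wide))

# MMU4 (HSA6): share of the narrow interval duplicated in macaque
print(round(duplicated_fraction(104_409, 145_651_644, 145_845_896), 3))
```

For MMU2 this gives 318,722 bp for the narrow interval and 163,221,008-165,539,729 (2,318,722 bp) for the widened one, matching the table.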
Species-specific BACs yielding duplicated signals around ENCs
| ENC | BAC | Position in HSA (hg17, May 2004) |
| MMU13 (HSA2p) | CH250-565F19* | Chr2:86,755,212-alphoid |
| | CH250-417O7 | Chr2:86,785,727-repeat |
| | CH250-371E19* | Chr2:86,870,586-alphoid |
| MMU12 (HSA2q) | CH250-359C1 | Chr2:138,344,201-138,510,183 |
| | CH250-158G21 | Chr2:138,478,651-138,621,067 |
| | CH250-18F12* | Chr2:138,643,711-alphoid |
| MMU14 (HSA11) | CH250-444O7* | Chr11:5,861,684-alphoid |
| | CH250-499K18* | Chr11:6,038,164-alphoid |
| MMU15 (HSA9) | CH250-221O11* | Chr9:122,220,400-alphoid |
| MMU17 (HSA13) | CH250-310C22 | Chr13:61,479,136-61,591,608 |
| | CH250-299M13 | Chr13:61,503,914-61,617,441 |
| | CH250-115C9 | Chr13:61,540,997-61,676,877 |
| MMU18 (HSA18) | CH250-322J6 | Chr18:50,437,322-repeat |
| NLE15 (HSA11) | CH271-140J13 | Chr11:89,572,864-repeat |
Species-specific BAC clones yielding duplicated signals around the ENC. Their specific pericentromeric location, confirmed by FISH, was derived from the mapping of their BAC end(s). *One BAC end of these BACs is entirely composed of alphoid repeats. The FISH signal, however, was not centromeric, indicating that the alphoid content of the BAC was marginal. See Figure 2 for examples.
Figure 2. FISH examples. (a) Examples of FISH experiments using species-specific BAC clones yielding duplicated signals around the centromere. CH250 and CH271 are BAC libraries specific for macaque and gibbon, respectively. The DAPI-stained chromosome without the signal is reported on the left to better show the morphology of the chromosome. (b) FISH experiment using the BAC clone CH250-417O7 (MMU2) on a macaque metaphase, showing pericentromeric signals on several chromosomes.
Figure 3. Gene density simulations. The observed density of (a) genes (RefSeq), (b) RefSeq exons and (c) expressed sequence tag (EST) exons within the corresponding regions of the 14 ENCs was compared against a simulated set of 10,000 regions distributed randomly within the human genome (see Materials and methods). A significant depletion of exons and genes was observed.
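The general idea of testing for depletion against randomly placed regions can be sketched as a small Monte Carlo procedure in Python. This is not the authors' procedure, which is described in their Materials and methods; it is a minimal illustration under several assumptions: a single linear toy genome instead of real chromosomes, size-matched random placement of each region, feature counts by simple interval overlap, and hypothetical names (`count_overlaps`, `empirical_depletion_p`).

```python
import random


def count_overlaps(start, end, features):
    """Count features (f_start, f_end) that overlap the closed interval [start, end]."""
    return sum(1 for f_start, f_end in features if f_start <= end and f_end >= start)


def empirical_depletion_p(regions, features, genome_length, n_sim=10_000, seed=0):
    """Return (observed count, empirical one-sided p-value for depletion).

    Each simulation re-places every region at a uniformly random position,
    keeping its size, and counts overlapping features. The p-value is the
    share of simulations with a total count <= the observed total, with a
    +1 correction so that it is never exactly zero.
    """
    rng = random.Random(seed)
    observed = sum(count_overlaps(s, e, features) for s, e in regions)
    at_most_observed = 0
    for _ in range(n_sim):
        total = 0
        for s, e in regions:
            size = e - s + 1
            new_start = rng.randint(1, genome_length - size + 1)
            total += count_overlaps(new_start, new_start + size - 1, features)
        if total <= observed:
            at_most_observed += 1
    return observed, (at_most_observed + 1) / (n_sim + 1)


# Toy data (invented for illustration): 1-kb "genes" every 2 kb over the
# first 900 kb of a 1-Mb genome, followed by a 100-kb gene desert.
toy_features = [(i, i + 999) for i in range(1, 900_001, 2_000)]
toy_regions = [(950_000, 960_000)]  # a region placed inside the desert
observed, p_value = empirical_depletion_p(toy_regions, toy_features, 1_000_000, n_sim=1_000)
print(observed, p_value)
```

With real data, the features would be RefSeq genes, RefSeq exons, or EST exons, and the placement step would need to respect chromosome boundaries and assembly gaps, which this toy version ignores.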
RefSeq genes flanking the ENCs
| ENC | Interval (Mb) | Left gene | Position in HSA (hg17) or MMU (rheMac2) | Right gene | Position in HSA (hg17) or MMU (rheMac2) |
| LLA7 (HSA8) | 0.534 | ASPH | Chr8:62,699,652-62,789,681 | FAM77D | Chr8:63,324,055-64,074,765 |
| MMU2 (HSA3) | 3.607 | C3orf57 | Chr3:162,545,283-162,572,573 | SI | Chr3:166,179,388-166,278,984 |
| MMU4 (HSA6) | 0.772 | UTRN | Chr6:144,654,566-145,215,861 | EPM2A | Chr6:145,988,141-146,098,684 |
| MMU13 (HSA2p) | 0.097 | RNF103 | Chr2:86,742,174-86,762,636 | RMD5A | Chr2:86,859,351-86,914,090 |
| MMU14 (HSA11) | 1.213 | MMP26 | Chr11:4,966,000-4,970,233 | C11orf42 | Chr11:6,183,374-6,188,935 |
| MMU15 (HSA9) | 0.423 | PTGS1 | Chr9:122,212,783-122,237,535 | PDCL | Chr9:122,660,178-122,670,394 |
| MMU12 (HSA2q) | 0.485 | HNMT | Chr2:138,555,540-138,607,665 | LOC339745 | Chr2:139,093,103-139,164,532 |
| MMU17 (HSA13) | 4.888 | PCDH20 | Chr13:60,881,822-60,887,282 | PCDH9 | Chr13:65,774,968-66,702,464 |
| MMU18 (HSA18) | 0.247 | C18orf54 | Chr18:50,139,169-50,162,379 | C18orf26 | Chr18:50,409,388-50,417,722 |
| NLE15 (HSA11) | 2.746 | CHORDC1 | Chr11:89,574,265-89,595,854 | MTNR1B | Chr11:92,342,437-92,355,596 |
| PPY11 (HSA11) | 0.203 | DBX1 | Chr11:20,134,336-20,138,446 | HTATIP2 | Chr11:20,341,924-20,361,904 |
| HSA3 (MMU2) | 0.641 | EPHA3 in HSA (not annotated in MMU) | Chr3:89,239,364-89,613,972 (MMU2:13,335,593-13,694,578) | PROS1 (L31380 in MMU) | Chr3:95,074,647-95,175,395 (Chr2:14,335,824-14,391,596) |
| HSA6 (MMU4) | 0.897 | PRIM2A in HSA (not annotated in MMU) | Chr6:57,290,381-57,621,334 (2 dup in MMU: MMU4:56,935,673-57,245,600; MMU11:20,043,342-20,044,345) | KHDRBS2 in HSA (not annotated in MMU) | Chr6:62,447,824-63,054,091 (3 dup in MMU: Chr4:58,142,819-58,698,705; Chr17:3,473,312-3,473,395; Chr8:138,072,498-138,196,040) |
| HSA11 (MMU14) | 1.280 | LRRC55 in HSA (not annotated in MMU) | Chr11:56,705,797-56,714,154 (MMU14:16,226,175-16,234,557) | PTPRJ in HSA (not annotated in MMU) | Chr11:47,958,689-48,146,246 (Chr14:23,931,871-24,124,487) |
Position of the most proximal and distal genes with respect to each ENC seeding region, calculated in the reference genome (see text). The interval size, in Mb, between the two genes is reported in column 2.
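The interval in column 2 appears to correspond to the distance from the end of the left gene to the start of the right gene. A minimal Python sketch of that calculation follows; the function names `parse_span` and `flanking_interval_mb` are hypothetical, and the reading of column 2 is an assumption inferred from the tabulated values rather than a stated definition.

```python
import re


def parse_span(text):
    """Parse 'Chr6:144,654,566-145,215,861' into integer (start, end) coordinates."""
    match = re.fullmatch(r"\s*\w+:\s*([\d,]+)-([\d,]+)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized position: {text!r}")
    start, end = (int(value.replace(",", "")) for value in match.groups())
    return start, end


def flanking_interval_mb(left_gene, right_gene):
    """Distance in Mb from the end of the left gene to the start of the right gene."""
    _, left_end = parse_span(left_gene)
    right_start, _ = parse_span(right_gene)
    return (right_start - left_end) / 1_000_000


# MMU4 (HSA6): UTRN on the left, EPM2A on the right
print(round(flanking_interval_mb("Chr6:144,654,566-145,215,861",
                                 "Chr6:145,988,141-146,098,684"), 3))
# LLA7 (HSA8): ASPH on the left, FAM77D on the right
print(round(flanking_interval_mb("Chr8:62,699,652-62,789,681",
                                 "Chr8:63,324,055-64,074,765"), 3))
```

Under this reading, the two examples give 0.772 Mb and 0.534 Mb, in agreement with the values listed for MMU4 and LLA7.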