Esther Melamed, Arthur P Arnold.
Abstract
Most avian Z genes are expressed more highly in ZZ males than ZW females, suggesting that chromosome-wide mechanisms of dosage compensation have not evolved. Nevertheless, a small percentage of Z genes are expressed at similar levels in males and females, an indication that an as yet unidentified mechanism compensates for the sex difference in copy number. Primary DNA sequences are thought to have a role in determining chromosome gene inactivation status on the mammalian X chromosome. However, it is currently unknown whether primary DNA sequences also mediate chicken Z gene compensation status. Using a combination of chicken DNA sequences and Z gene compensation profiles of 310 genes, we explored the relationship between Z gene compensation status and primary DNA sequence features. Statistical analysis of different Z chromosomal features revealed that long interspersed nuclear elements (LINEs) and CpG islands are enriched on the Z chromosome compared with 329 other DNA features. Linear support vector machine (SVM) classifiers, using primary DNA sequences, correctly predict the Z compensation status for >60% of all Z-linked genes. CpG islands appear to be the most accurate classifier and alone can correctly predict compensation of 63% of Z genes. We also show that LINE CR1 elements are enriched 2.7-fold on the chicken Z chromosome compared with autosomes and that chicken chromosomal length is highly correlated with percentage LINE content. However, the position of LINE elements is not significantly associated with dosage compensation status of Z genes. We also find a trend for a higher proportion of CpG islands in the region of the Z chromosome with the fewest dosage-compensated genes compared with the region containing the greatest concentration of compensated genes. Comparison between chicken and platypus genomes shows that LINE elements are not enriched on sex chromosomes in platypus, indicating that LINE accumulation is not a feature of all sex chromosomes.
Our results suggest that CpG islands are not randomly distributed on the Z chromosome and may influence Z gene dosage compensation status.
Year: 2009 PMID: 19672682 PMCID: PMC2759020 DOI: 10.1007/s10577-009-9068-4
Source DB: PubMed Journal: Chromosome Res ISSN: 0967-3849 Impact factor: 5.239
Fig. 1 SVM results. The SVM was trained on the compensation status of a training set of genes and then tested by predicting the compensation status of a left-out sample of genes. The tables show the percentages of correctly and incorrectly predicted compensation status of genes, based on the actual observed gene compensation status, for different features. An overall percentage of correctly predicted compensated and uncompensated genes is included, calculated by averaging the percentages of correctly predicted compensated and uncompensated genes for the feature. The search width for these results was 2 kb.
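The overall percentage described in the Fig. 1 caption is an average of the two per-class percentages rather than a pooled accuracy. A minimal sketch of that calculation follows; the function names and the six example labels are hypothetical and are not taken from the study.

```python
def per_class_percent_correct(actual, predicted):
    """Percent of genes predicted correctly, computed separately per actual class."""
    totals, correct = {}, {}
    for a, p in zip(actual, predicted):
        totals[a] = totals.get(a, 0) + 1
        if a == p:
            correct[a] = correct.get(a, 0) + 1
    return {label: 100.0 * correct.get(label, 0) / totals[label] for label in totals}


def overall_percent_correct(actual, predicted):
    """Average of the per-class percentages, as the caption describes."""
    per_class = per_class_percent_correct(actual, predicted)
    return sum(per_class.values()) / len(per_class)


# Hypothetical left-out sample: 4 compensated and 2 uncompensated genes.
actual = ["compensated"] * 4 + ["uncompensated"] * 2
predicted = ["compensated", "compensated", "compensated", "uncompensated",
             "uncompensated", "compensated"]

per_class = per_class_percent_correct(actual, predicted)
overall = overall_percent_correct(actual, predicted)
print(per_class, overall)
```

Averaging per class keeps the larger class from dominating the overall figure when the two groups differ in size.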
Fig. 2 Percentage LINE element composition on chicken and human chromosomes. In both chicken (a) and human (b), the percentage of LINE elements increases monotonically with chromosome length, but in each case the larger sex chromosome (X in human, Z in chicken) has a disproportionately high percentage of LINE elements.
Percentage of LINE elements per length of chromosome in the chicken, human, and platypus genomes. Unlike the chicken Z and human X chromosomes, both of which have a higher percentage LINE composition than autosomes, the platypus X chromosome has a percentage of LINEs similar to that of autosomes.
| Human chromosome | Length (bp) | LINE (%) | Chicken chromosome | Length (bp) | LINE (%) | Platypus chromosome | Length (bp) | LINE (%) |
|---|---|---|---|---|---|---|---|---|
| chr22 | 49691432 | 10.3 | chr25 | 2031799 | 2.44 | chr17 | 1399469 | 10.56 |
| chrY | 57772954 | 11.1 | chr22 | 3936574 | 3.63 | chr20 | 1816412 | 21.44 |
| chr19 | 63811651 | 11.6 | chr28 | 4512026 | 3.25 | chr14 | 2696122 | 18.75 |
| chr21 | 46944323 | 13.3 | chr27 | 4841970 | 5.32 | chr15 | 3786880 | 18.47 |
| chr16 | 88827254 | 13.6 | chr26 | 5102438 | 2.07 | | | |
| chr17 | 78774742 | 14.4 | chr23 | 6042217 | 2.74 | | | |
| chr15 | 100338915 | 16.4 | chr24 | 6400109 | 3.14 | chr18 | 6611290 | 16.19 |
| chr14 | 106368585 | 17.3 | chr21 | 6959642 | 3.14 | chr11 | 6809224 | 17.41 |
| chr20 | 62435964 | 17.5 | chr19 | 9939723 | 2.64 | chr10 | 11243762 | 15.72 |
| chr13 | 114142980 | 17.9 | chr18 | 10925261 | 3.14 | chr12 | 15872666 | 18.19 |
| chr9 | 140273252 | 18.2 | chr17 | 11182526 | 2.61 | chr6 | 16302927 | 17.87 |
| chr1 | 247249719 | 18.6 | chr15 | 12968165 | 2.15 | chr5 | 24609220 | 20.16 |
| chr10 | 135374737 | 19.4 | chr20 | 13986235 | 2.76 | | | |
| chr7 | 158821424 | 20.1 | chr14 | 15819469 | 2.48 | chr7 | 40039088 | 18.40 |
| chr12 | 132349534 | 20.3 | chr13 | 18911934 | 2.69 | | | |
| chr18 | 76117153 | 20.4 | chr12 | 20536687 | 2.67 | chr1 | 47594283 | 22.17 |
| chr2 | 242951149 | 20.8 | chr11 | 21928095 | 2.37 | chr2 | 54797317 | 21.78 |
| chr8 | 146274826 | 21.2 | chr10 | 22556432 | 2.37 | chr4 | 58987262 | 20.07 |
| chr11 | 134452384 | 21.4 | chr9 | 25554352 | 2.59 | chr3 | 59581953 | 21.15 |
| chr6 | 170899992 | 21.5 | chr8 | 30671729 | 3.11 | | | |
| chr3 | 199501827 | 21.7 | chr6 | 37400442 | 3.54 | | | |
| chr5 | 180857866 | 22.3 | chr7 | 38384769 | 2.99 | | | |
| chr4 | 191273063 | 22.8 | chr5 | 62238931 | 4.09 | | | |
| | | | chr4 | 94230402 | 5.06 | | | |
| | | | chr3 | 113657789 | 6.2 | | | |
| | | | chr2 | 154873767 | 7.6 | | | |
| | | | chr1 | 200994015 | 8.92 | | | |
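The abstract states that chicken chromosomal length is highly correlated with percentage LINE content. The sketch below shows one way to check that relationship against the chicken autosome values listed in the table. It assumes Pearson's r; the source does not say which correlation statistic the authors used, and the names `CHICKEN_AUTOSOMES` and `pearson_r` are illustrative only.

```python
from math import sqrt

# (length in bp, percent LINE) for the chicken autosomes listed in the table.
CHICKEN_AUTOSOMES = [
    (2031799, 2.44), (3936574, 3.63), (4512026, 3.25), (4841970, 5.32),
    (5102438, 2.07), (6042217, 2.74), (6400109, 3.14), (6959642, 3.14),
    (9939723, 2.64), (10925261, 3.14), (11182526, 2.61), (12968165, 2.15),
    (13986235, 2.76), (15819469, 2.48), (18911934, 2.69), (20536687, 2.67),
    (21928095, 2.37), (22556432, 2.37), (25554352, 2.59), (30671729, 3.11),
    (37400442, 3.54), (38384769, 2.99), (62238931, 4.09), (94230402, 5.06),
    (113657789, 6.2), (154873767, 7.6), (200994015, 8.92),
]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


lengths = [length for length, _ in CHICKEN_AUTOSOMES]
percents = [pct for _, pct in CHICKEN_AUTOSOMES]
r = pearson_r(lengths, percents)
print(f"Pearson r (length vs. percent LINE) = {r:.3f}")
```

A rank-based statistic such as Spearman's rho would be a reasonable alternative here, because the relationship is driven largely by the few longest chromosomes.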
Fig. 3 CpG islands are differentially associated with compensated and uncompensated genes. a 2 kb (p = 3.19 × 10⁻⁵), b 10 kb (p = 6 × 10⁻⁴), c 50 kb (p = 0.012), and d 100 kb (p > 0.05) genomic windows upstream of gene start (Fisher’s exact test)
CpG islands are differentially associated with compensated and uncompensated genes. Fisher’s exact test results for 2 kb, 10 kb, 50 kb, and 100 kb genomic windows upstream of gene start. In all four genomic windows, the low-CpG-score group contains a disproportionate number of compensated genes, and the high-CpG-score group contains a disproportionate number of uncompensated genes, reaching significance in the 2 kb, 10 kb, and 50 kb windows
| | Low CpG score | High CpG score | p value |
|---|---|---|---|
| 2 kb window | | | |
| Number of compensated genes | 108 | 47 | 3.19 × 10⁻⁵ |
| Number of uncompensated genes | 71 | 84 | |
| 10 kb window | | | |
| Number of compensated genes | 88 | 67 | 6.0 × 10⁻⁴ |
| Number of uncompensated genes | 57 | 98 | |
| 50 kb window | | | |
| Number of compensated genes | 101 | 54 | 0.012 |
| Number of uncompensated genes | 78 | 77 | |
| 100 kb window | | | |
| Number of compensated genes | 107 | 48 | 0.1223 |
| Number of uncompensated genes | 93 | 62 | |
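Each window in the table is a 2 × 2 contingency table, so the test can be sketched directly from the hypergeometric distribution. The version below assumes the common two-sided definition (summing the probabilities of all tables no more likely than the observed one). The source does not state which variant or software was used, so the computed values may differ slightly from those reported.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table at least as extreme (no more probable) than the observed one.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


# Counts from the table: (compensated low, compensated high,
#                         uncompensated low, uncompensated high).
WINDOWS = {
    "2 kb": (108, 47, 71, 84),
    "10 kb": (88, 67, 57, 98),
    "50 kb": (101, 54, 78, 77),
    "100 kb": (107, 48, 93, 62),
}

for window, counts in WINDOWS.items():
    print(f"{window}: p = {fisher_exact_two_sided(*counts):.3g}")
```

The small relative tolerance in the comparison guards against floating-point ties between equally probable tables.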
CpG island and LINE-CR1 element proportions on different regions of the Z chromosome. Values indicate the proportion of DNA sequence occupied by CpG islands and LINE-CR1 elements, calculated for the entire Z chromosome, inside the Zq peak, or inside the MHM valley.
| Region | CpG islands | LINE-CR1 elements |
|---|---|---|
| Entire Z chromosome | 1.07 × 10⁻⁵ | 2.34 × 10⁻⁴ |
| Zq peak | 1.21 × 10⁻⁵ | 2.31 × 10⁻⁴ |
| MHM valley | 8.80 × 10⁻⁶ | 2.32 × 10⁻⁴ |