Jingyue Ellie Duan1, Wei Shi2, Nathaniel K Jue3, Zongliang Jiang4, Lynn Kuo2, Rachel O'Neill5, Eckhard Wolf6, Hong Dong7, Xinbao Zheng7, Jingbo Chen7, Xiuchun Cindy Tian1
1. Department of Animal Science, University of Connecticut, Storrs, CT.
2. Department of Statistics, University of Connecticut, Storrs, CT.
3. School of Natural Sciences, California State University, Monterey Bay, CA.
4. School of Animal Science, Louisiana State University, Agricultural Center, Baton Rouge, LA.
5. Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT.
6. Gene Center and Department of Biochemistry, Ludwig-Maximilians-Universität München, Germany.
7. Institute of Animal Science, Xinjiang Academy of Animal Sciences, Urumqi, Xinjiang, P.R. China.
Abstract
Dosage compensation of the mammalian X chromosome (X) was proposed by Susumu Ohno as a mechanism wherein the inactivation of one X in females would lead to doubling the expression of the other. This would resolve the dosage imbalance between eutherian females (XX) and males (XY) and between a single active X and autosome pairs (A). The expression ratio of X- and A-linked genes has been relatively well studied in humans and mice, although results on the existence of upregulation of X-linked genes remain controversial. Here we report the first comprehensive test of Ohno's hypothesis in bovine preattachment embryos, germline, and somatic tissues. Overall, an incomplete dosage compensation (0.5 < X:A < 1) of expressed genes and an excess X dosage compensation (X:A > 1) of ubiquitously expressed "dosage-sensitive" genes were seen. No significant differences in X:A ratios were observed between bovine female and male somatic tissues, further supporting Ohno's hypothesis. Interestingly, preimplantation embryos manifested a unique pattern of X dosage compensation dynamics. Specifically, X dosage decreased after fertilization, indicating that the sperm brings in an inactive X to the matured oocyte. Subsequently, the activation of the bovine embryonic genome enhanced expression of X-linked genes and increased the X dosage. As a result, an excess compensation was exhibited from the 8-cell stage to the compact morula stage. The X dosage peaked at the 16-cell stage and stabilized after the blastocyst stage. Together, our findings confirm Ohno's hypothesis of X dosage compensation in the bovine and extend it by showing incomplete and over-compensation for expressed and "dosage-sensitive" genes, respectively.
Gene dosage is the number of copies of a given gene in cells of an organism and can be manifested by the amount of its products (Ercan 2015). Maintenance of proper gene dosage is essential in functional cellular networks such as those in embryogenesis and fetal development. Aneuploidy, such as monosomy or trisomy, is an abnormal change in the dosage of chromosomes and is generally detrimental to the organism (Holtzman et al. 1992). For example, aneuploidy accounts for 46.3% of spontaneous abortions in humans (Hassold et al. 1980). Small changes in the dosage of single genes can lead to many diseases (Hurles et al. 2008) and the onset of tumorigenesis (Gordon et al. 2012). However, monosomy of the X chromosome in mammalian males is well tolerated, although over a thousand genes important for both sexes are located on X (Ercan 2015). Susumu Ohno hypothesized that to prevent the deleterious effects of haploinsufficiency in males, a compensatory mechanism involving the doubling of the expression of X-linked genes must occur (Ohno et al. 1959). This, however, could at the same time cause a quadruple dosage of X in females. By transcriptionally silencing one of the two X chromosomes in females, the dosage of the X chromosome between males and females is balanced (Veitia and Potier 2015). Meanwhile, this also balances the gene dosage between the sex chromosome and autosome pairs (A) in both sexes (Ohno 1966).

Although X chromosome inactivation (XCI) has been observed in all mammalian species studied to date (Ohno et al. 1959; Lyon 1961; Heard et al. 1997), dosage compensation by doubling the expression of X-linked genes has only been studied in very few species and is still heavily debated (Nguyen and Disteche 2006; Xiong et al. 2010; Deng et al. 2011; He et al. 2011; Kharchenko et al. 2011). Dosage compensation is determined by calculating the ratio of the averaged expression value of X-linked genes to that of the autosomes (X:A ratio).
The ratio X:A ≈ 1 indicates the doubling of X gene expression, while X:A of 0.5 rejects Ohno’s hypothesis. Two previous microarray studies fully supported dosage compensation in both humans and mice (Gupta et al. 2006; Nguyen and Disteche 2006). However, the first RNA sequencing (RNA-seq) study of humans and mice (Xiong et al. 2010) claimed that microarray-based expression was not suitable for comparing expression levels of different genes, and reported X:A of approximately 0.5, that is, a lack of dosage compensation. Subsequently, the same RNA-seq data were re-analyzed with all low and nonexpressed genes removed because such genes are enriched on X chromosomes and thus could skew the comparison. Results from such filtering verified the hypothesis (Deng et al. 2011; Kharchenko et al. 2011). Since then, Ohno’s hypothesis has been tested with many different analysis approaches including comparing the ratio between the modern X to the proto XX using 1:1 orthologs between humans and chickens (Lin et al. 2012) or comparing only genes coding for large proteins as the “dosage-sensitive housekeeping genes” (Pessia, Engelstädter, et al. 2014). It has been found that different experimental platforms, analysis methods, and cut-off values all influenced the dosage compensation results. Thus, more questions than answers are presented on the evolution of dosage compensation of sex chromosomes (He and Zhang 2016).

While debates persist over dosage compensation, XCI has been observed in all mammalian species studied to date (Okamoto et al. 2011). In the bovine, XCI was proposed by De La Fuente et al. (1999), and confirmed by Xue et al. (2002). In early bovine embryos, imprinted XCI was observed at the morula stage (Ferreira et al. 2010) and random XCI occurred between the blastocyst and elongation stages (Bermejo-Alvarez et al. 2011). Although a recent study reported incomplete X dosage compensation in bovine fat, liver, muscle, and pituitary gland (Ka et al. 2016), further studies are needed for bovine early embryos and germ cells.

Here we report the first comprehensive test of Ohno’s hypothesis of compensatory upregulation of the X chromosome in bovine embryos, germline, and a vast array of somatic tissues using seven RNA-seq data sets (three from bovine preattachment embryos and four from somatic tissues). These include immature and mature oocytes, in vivo and in vitro embryos up to the blastocyst stage (days 1–7; day 0 = standing estrus), conceptuses (embryos and associated extra-embryonic membranes) from day 7 to day 19, two adult female-specific tissues, and eight adult female and male somatic tissues. Using the X:A ratio of median expression with its 95% bootstrap confidence intervals, we report incomplete compensatory upregulation of expressed X-linked genes and complete dosage compensation of “dosage-sensitive” genes. Our data thus fully support Ohno’s hypothesis in that the compensatory upregulation of X chromosome expression affects “dosage-sensitive” genes in bovine developing embryos, germline cells, and female/male tissues.
Materials and Methods
Paralog Analysis
To choose between unique and nonunique mapping strategies, we first calculated paralog enrichment on each bovine chromosome. Paralogs were identified on BioMart (Ensembl genome browser: http://useast.ensembl.org/biomart/; Ensembl Genes 85) and defined as genes with greater than 70% amino acid identity. This minimal cut-off for paralogs was a result of our previous study, which determined that 70% was the best match of the BioMart search algorithm for the identification of X-linked multigene families (Jue et al. 2013). The total number of genes on each chromosome was calculated using bovine genome reference annotation UMD3.1 (http://useast.ensembl.org/Bos_taurus/Info/Annotation?redirect=no). Enrichment of the paralog gene number on each autosome was tested against that on the X chromosome by Fisher’s exact test (table 1).
Table 1
The Enrichment of Paralogs on Bovine Autosomes and the X Chromosome

| Chromosome | Total Number of Genes | Number (%) of Paralogs | P-Value |
|---|---|---|---|
| 1 | 985 | 167 (17) | 5.65e-18 |
| 2 | 1,021 | 229 (22) | 1.88e-08 |
| 3 | 1,372 | 314 (23) | 7.24e-09 |
| 4 | 855 | 222 (26) | 0.00031 |
| 5 | 1,323 | 336 (25) | 1.51e-05 |
| 6 | 692 | 156 (23) | 6.65e-07 |
| 7 | 1,396 | 377 (27) | 0.00046 |
| 8 | 829 | 230 (28) | 0.0059 |
| 9 | 602 | 146 (24) | 6.45e-05 |
| 10 | 1,074 | 316 (29) | 0.033 |
| 11 | 1,047 | 192 (18) | 1.63e-15 |
| 12 | 414 | 98 (24) | 0.00018 |
| 13 | 850 | 185 (22) | 1.31e-08 |
| 14 | 571 | 135 (24) | 2.76e-05 |
| 15 | 1,050 | 387 (37) | 0.97 |
| 16 | 710 | 129 (18) | 6.56e-13 |
| 17 | 665 | 149 (22) | 6.55e-07 |
| 18 | 1,236 | 207 (17) | 1.20e-20 |
| 19 | 1,347 | 303 (22) | 2.13e-09 |
| 20 | 384 | 91 (24) | 0.00027 |
| 21 | 731 | 221 (30) | 0.10 |
| 22 | 608 | 110 (18) | 6.40e-12 |
| 23 | 785 | 264 (34) | 0.61 |
| 24 | 347 | 98 (28) | 0.049 |
| 25 | 766 | 102 (13) | 7.22e-24 |
| 26 | 437 | 82 (19) | 5.63e-09 |
| 27 | 274 | 73 (27) | 0.022 |
| 28 | 355 | 78 (22) | 3.08e-05 |
| 29 | 705 | 186 (26) | 0.0012 |
| X | 1,128 | 374 (33) | |
| Genome average | 819 | 199 (24) | 1.46e-05 |
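The per-chromosome comparison in table 1 can be illustrated with a small sketch of Fisher's exact test on a 2 x 2 table (paralog vs. nonparalog, X vs. one autosome). The source states only that Fisher's exact test was used; the one-sided alternative (paralog proportion on X greater than on the autosome) and the function name `fisher_exact_greater` are assumptions made for illustration, and the counts are taken from table 1.

```python
from math import comb

def fisher_exact_greater(x_par, x_total, chr_par, chr_total):
    """One-sided Fisher's exact test on a 2x2 table.

    Returns P(number of paralogs on X >= observed) under the null
    hypothesis that paralogs are spread evenly over both chromosomes.
    The one-sided alternative is an assumption of this sketch.
    """
    n_genes = x_total + chr_total      # all genes in the table
    n_par = x_par + chr_par            # all paralogs in the table
    upper = min(x_total, n_par)        # largest possible paralog count on X
    # Sum the hypergeometric numerators exactly as integers, divide once.
    tail = sum(
        comb(n_par, k) * comb(n_genes - n_par, x_total - k)
        for k in range(x_par, upper + 1)
    )
    return tail / comb(n_genes, x_total)

# Counts from table 1: X has 374 paralogs among 1,128 genes.
p_chr1 = fisher_exact_greater(374, 1128, 167, 985)    # chromosome 1
p_chr15 = fisher_exact_greater(374, 1128, 387, 1050)  # chromosome 15
print(f"chr1: {p_chr1:.3g}, chr15: {p_chr15:.3g}")
```

Exact integer arithmetic avoids floating-point underflow for the very small tail probabilities seen on most autosomes.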
RNA-seq Data Sets and Read Trimming
Raw FASTQ files were obtained from the NCBI GEO database (table 2). A total of seven data sets were used: (1) in vivo developed matured oocytes and 2-cell to blastocyst stage preimplantation embryos (Jiang et al. 2014), (2) immature oocytes, in vitro developed matured oocytes, and 4-cell, 8-cell, 16-cell, and blastocyst embryos (Graf et al. 2014), (3) conceptuses at days 7, 10, 13, 16, and 19 (embryos and associated extra-embryonic membranes) (Mamo et al. 2012), (4) female endometria and corpora lutea (CL) (Moore et al. 2016), (5) female somatic tissues of brain, liver, muscle, and kidney (Chen et al. 2015), (6) male somatic tissues of fat, muscle, hypothalamus, duodenum, liver, lung, and kidney (PRJEB6377), and (7) female and male somatic tissues of fat, liver, muscle, and pituitary gland (Seo et al. 2016).

All RNA-seq raw reads were downloaded from NCBI using sratoolkit (version 2.5.0; http://www.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?view=toolkit_doc&f=std#s-2). The sequence read archive (sra) format files were converted to fastq format by fastq-dump (version 2.5.0; http://www.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?view=toolkit_doc&f=fastq-dump). Quality trimming and control were conducted as follows before mapping to the reference genome. First, Trimmomatic (version 0.33; http://www.usadellab.org/cms/?page=trimmomatic) was applied to remove the universal sequencing adaptors of SOLiD and Illumina in the respective data sets, with a minimum Phred score of 20 and a minimal length of 30 bp. Subsequently, read quality was examined using FastQC (version 0.11.3; http://www.bioinformatics.babraham.ac.uk/projects/fastqc/). The numbers of reads in each sample after trimming are summarized in supplementary table S1, Supplementary Material online. The average number of reads input for mapping across all samples was 22,814,027.
Mapping and Transcript Assembly
RNA-seq read mapping and transcript assembly were performed using the following pipeline. Trimmed RNA-seq reads were aligned to the Ensembl bovine reference genome assembly UMD3.1.1 using the Hisat2 version 2.0.5 aligner (Pertea et al. 2016). Transcript splice site detection was used, and both uniquely and nonuniquely mapped reads were kept for the subsequent analysis. The percentages of nonuniquely mapped reads for each sample are summarized in supplementary table S1, Supplementary Material online, and the averaged overall mapping rate was 83.5%. IsoEM version 1.1.5 (Nicolae et al. 2011) was used to quantify gene expression as transcripts per million (TPM) using default parameters. For nonuniquely mapped reads, IsoEM assigns fractions of the multiply aligned reads to each location using an expectation maximization algorithm (Nicolae et al. 2011). Expressed genes were defined as those with an expression level of TPM > 1. “Dosage-sensitive” genes were selected as ubiquitously expressed genes (TPM > 1) throughout all somatic samples or all embryonic samples.
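The two gene sets defined above can be sketched as simple set operations. This is a minimal illustration, not the authors' R code: the function names, the nested-dictionary layout, and the toy gene IDs and TPM values are all hypothetical.

```python
def expressed_genes(tpm_by_gene, cutoff=1.0):
    """Genes whose TPM exceeds the cutoff in a single sample."""
    return {gene for gene, tpm in tpm_by_gene.items() if tpm > cutoff}

def ubiquitous_genes(samples, cutoff=1.0):
    """Genes expressed (TPM > cutoff) in every sample of a group."""
    per_sample = [expressed_genes(s, cutoff) for s in samples.values()]
    return set.intersection(*per_sample) if per_sample else set()

# Hypothetical toy data: sample -> {gene ID -> TPM}.
toy = {
    "liver": {"geneA": 12.0, "geneB": 0.4, "geneC": 3.1},
    "muscle": {"geneA": 8.5, "geneB": 2.2, "geneC": 1.0},
}

liver_expressed = expressed_genes(toy["liver"])
ubiquitous = ubiquitous_genes(toy)
print(sorted(liver_expressed), sorted(ubiquitous))
```

Note that the cut-off is strict, so a gene with TPM exactly equal to 1 is not counted as expressed.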
RNA-seq Data Set Overview
Matrices of gene expression TPM for the embryo (data sets 1–3) and somatic tissues (data sets 4–7) were processed separately in R to identify ubiquitous genes. Correlation plots and unsupervised hierarchical clustering were conducted in R for quality control and identification of biologically distinct subgroups. Outliers in the biological replicates were removed for the downstream analysis.

Chromosome-wide gene expression distributions were obtained by grouping genes by chromosomal location in all samples using log2-transformed TPM (TPM > 1), and boxplots of the distributions were made in R.
GO Analysis
Gene ontology enrichment analysis was performed in DAVID (Huang et al. 2009). A total of 245 X-linked and 7,603 autosomal genes were found to be ubiquitously expressed in the somatic tissue data sets. Similarly, 117 X-linked and 3,947 autosomal genes were found to be ubiquitously expressed in the embryo data sets. The P-values of the top 10 biological processes were plotted using plotly (https://plot.ly) in R. Pie charts for biological processes were generated as described by The Gene Ontology Consortium (Gene Ontology Consortium 2015).
X:A Ratio Calculation
When calculating the X:A ratio, we applied the pairwiseCI package in R (Schaarschmidt and Gerhard 2015) to obtain a 95% confidence interval for the ratio of the median of X to the median of A as in a previous study (Sangrithi et al. 2017). It is based on 1,000 bootstrap replicates where sampling from the original data was done with replacement and stratified by the group variables. Bootstrapping (Efron and Tibshirani 1994) was used because it is simple to apply and does not require any distribution assumptions.
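The calculation described above can be sketched as a stratified bootstrap of the ratio of medians. The following is an illustrative Python version, not the pairwiseCI implementation: the function name `xa_ratio_ci`, the fixed seed, and the use of a simple percentile interval are assumptions, and the interval type used by pairwiseCI may differ.

```python
import random
from statistics import median

def xa_ratio_ci(x_tpm, a_tpm, n_boot=1000, alpha=0.05, seed=0):
    """Median X:A ratio with a percentile bootstrap confidence interval.

    X-linked and autosomal genes are resampled separately (stratified),
    with replacement, keeping each group at its original size.
    Inputs are assumed to be already filtered (e.g., TPM > 1), so the
    autosomal median is never zero.
    """
    rng = random.Random(seed)
    point = median(x_tpm) / median(a_tpm)
    ratios = []
    for _ in range(n_boot):
        x_star = rng.choices(x_tpm, k=len(x_tpm))
        a_star = rng.choices(a_tpm, k=len(a_tpm))
        ratios.append(median(x_star) / median(a_star))
    ratios.sort()
    lo_idx = max(0, round((alpha / 2) * n_boot))
    hi_idx = min(n_boot - 1, round((1 - alpha / 2) * n_boot) - 1)
    return point, ratios[lo_idx], ratios[hi_idx]

# Hypothetical TPM values for a handful of X-linked and autosomal genes.
x_values = [1.5, 2.0, 3.0, 8.0, 20.0]
a_values = [2.0, 4.0, 5.0, 9.0, 40.0, 60.0]
ratio, lower, upper = xa_ratio_ci(x_values, a_values)
print(f"X:A = {ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An interval that contains 1 would be read as compatible with complete compensation, and one that contains 0.5 as compatible with no compensation.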
Results
Overview of the RNA-seq Data Sets and Paralog Analysis
We used seven bovine RNA-seq data sets, three embryonic and four somatic, generated by us (Jiang et al. 2014) and downloaded from NCBI (table 2). In total, we have 40 samples including 19 embryos and 21 tissues from all data sets. Pearson correlation and unsupervised hierarchical clustering (supplementary fig. S1, Supplementary Material online) show that replicates within each tissue or embryonic stage clustered closely to each other, suggesting that even though the data were obtained from different studies, they were replicable and reliable. Because we only compared the X:A ratio and gene expression within each data set instead of across data sets, we were able to use the data from different studies and experimental platforms after data normalization. Paralogs are homologous genes within the same genome created by gene duplication (Gevers et al. 2004). This is one potential way to achieve X dosage compensation because X lacks a homolog in males (Jue et al. 2013). We found an approximately 1.4-fold increase in the percentage of paralogs (> 70% amino acid identity [Jue et al. 2013]) on the bovine X chromosome (33%) compared with the genome average (24%; table 1). This enrichment is significantly higher (table 1; P < 0.05 by Fisher’s exact test) than that of most chromosomes, with the exception of chromosomes 15, 21, and 23 (P = 0.97, 0.10, and 0.61, respectively), suggesting potential roles of paralogs in X dosage compensation. Such paralog enrichment on X has also been observed in humans (32% vs. 17%; 1.9-fold) and mice (51% vs. 35%; 1.5-fold) (Jue et al. 2013). This information demonstrated that unique mapping as performed in a previous RNA-seq study (Xiong et al. 2010) is not appropriate because many paralogs would be excluded from the analysis, potentially skewing the X:A comparison. Thus, we applied the “nonunique” mapping strategy for read mapping. Reads that aligned to multiple locations in the reference genome (such as members of a paralogous gene family) or had alternative splice junctions were kept in all subsequent analyses. This resulted in a total of 959 X-linked and 20,316 autosomal protein coding genes.
Table 2
The Raw FASTQ Files Generated by Us and Downloaded from NCBI GEO Database

| Data Set | Tissue Type (Replicates) | Breed/Subspecies | Number of Samples | Library Type | BioProject ID | Reference |
|---|---|---|---|---|---|---|
| 1 | In vivo MII oocytes and embryos: 2-, 4-, 8-, 16-, 32-cell, CM, and BL (n = 2) | Holstein | 8 | Single-read SOLiD | PRJNA254699 | Jiang et al. (2014) |
| 2 | In vitro GV and MII oocytes and embryos: 4-, 8-, 16-cell, and BL (n = 3) | German Simmental (♀) and Brahman (♂) cross | 6 | Single-read Illumina | PRJNA228235 | Graf et al. (2014) |
| 3 | In vivo conceptuses: days 7 (n = 6), 10 (n = 7), 13 (n = 5), 16 (n = 5), 19 (n = 5) | | | | | Mamo et al. (2012) |
| 4 | Female endometria and corpora lutea | | | | | Moore et al. (2016) |
| 5 | Female somatic tissues: brain, liver, muscle, and kidney | | | | | Chen et al. (2015) |
| 6 | Male somatic tissues: fat, muscle, hypothalamus, duodenum, liver, lung, and kidney (pools of 7–14 animals) | Bos taurus | 7 | Paired-end Illumina | PRJEB6377 | PRJEB6377, 2014 |
| 7 | Female and male somatic tissues: fat, liver, muscle, and pituitary gland (n = 5) | Hanwoo (Korean cattle) | 8 | Paired-end Illumina | PRJNA273164 | Seo et al. (2016) |
Deng et al. (2011) suggested that the low and nonexpression values in RNA-seq data may result from background noise of sequencing and read mapping. Inclusion of such values would be inappropriate and strongly influence the results. Furthermore, when we calculated the percentages of X-linked and autosomal genes with low transcripts per million (TPM) values (fig. 1 and supplementary table S2, Supplementary Material online), we found that on average the X chromosome has 11.7% more genes with TPM ≤ 1 than autosomes. Somatic tissues including endometrium, fat, liver, and muscle had the greatest enrichment of lowly expressed X-linked genes (supplementary table S2, Supplementary Material online). Therefore, we defined expressed genes using a cut-off of TPM > 1 to remove this bias. After this filtering, 468 X-linked genes and 12,288 autosomal genes on average were used for X:A ratio calculation. The numbers of expressed genes (TPM > 1) on the X chromosome or autosomes in each sample are listed in supplementary table S2, Supplementary Material online.
Fig. 1. Expression ranges of genes on the X and autosome pairs in the bovine. (A) The averaged percentages of lowly expressed genes (TPM ≤ 1) on the X and autosomes in all samples. (B) A representative (in vivo matured oocytes) boxplot showing that the range and median expression levels of X-linked genes (red) (TPM > 1) were similar to those of each autosome pair (blue) in the bovine. (C) A representative (in vivo matured oocytes) kernel density plot showing that the distribution of X-linked gene (red) expression (TPM > 1) was similar to that of each autosome pair (blue) in the bovine.
Ranges of Gene Expression of All Chromosomes
We then investigated the gene expression profiles of all chromosomes to determine whether the transcriptional outputs from the X chromosome were comparable to those of each autosome pair. We performed log2 transformation of TPM to normalize the data distribution. The TPM distribution of X-linked genes was not significantly different from those of all autosome pairs in 11 out of 40 samples using all expressed genes, and in 31 out of 40 samples using ubiquitously expressed genes (P > 0.05, by two-sided Kolmogorov–Smirnov test, fig. 1B and supplementary table S3, Supplementary Material online). Such distributions were also demonstrated by kernel density estimation (fig. 1C). These observations demonstrate that regardless of the number of X chromosomes, the expression levels of ubiquitously expressed genes on X were comparable to those of each autosome pair in most samples (31 of 40), suggesting dosage compensation.
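The distribution comparison above can be sketched as a two-sided, two-sample Kolmogorov–Smirnov test on log2-transformed TPM. This is a minimal illustration under stated assumptions: the function name `ks_two_sample` and the toy values are hypothetical, and the P-value uses the standard asymptotic series with a small-sample correction, which may differ slightly from the exact P-values reported by statistical packages such as R.

```python
from math import exp, log2, sqrt

def ks_two_sample(x, y):
    """Two-sided two-sample KS test: returns (D, asymptotic P-value)."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    i = j = 0
    d = 0.0
    # Walk both sorted samples and track the largest ECDF difference.
    while i < n and j < m:
        v = min(xs[i], ys[j])
        while i < n and xs[i] == v:
            i += 1
        while j < m and ys[j] == v:
            j += 1
        d = max(d, abs(i / n - j / m))
    en = sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.2:          # series converges poorly; P is essentially 1
        return d, 1.0
    total = 0.0
    for k in range(1, 101):
        total += 2.0 * (-1) ** (k - 1) * exp(-2.0 * (k * lam) ** 2)
    return d, min(1.0, max(0.0, total))

# Hypothetical TPM values (already filtered to TPM > 1).
x_tpm = [1.2, 2.5, 3.0, 4.8, 9.1, 15.0, 22.0, 40.0]
a_tpm = [1.1, 2.0, 3.3, 5.0, 8.7, 16.0, 25.0, 38.0, 55.0]
d_stat, p_value = ks_two_sample([log2(t) for t in x_tpm],
                                [log2(t) for t in a_tpm])
print(f"D = {d_stat:.3f}, P = {p_value:.3f}")
```

Because log2 is a monotone transformation, the KS statistic is the same whether it is computed on raw or log2-transformed TPM; the transformation matters mainly for plotting.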
X Chromosome Upregulation in Adult Somatic Tissues
To determine the X chromosome dosage compensation in adult cattle tissues, we analyzed RNA-seq data sets (table 2) for two female-specific tissues, endometrium and corpus luteum, and other somatic tissues from both males and females including the brain (hypothalamus), liver, kidney, muscle, fat, pituitary gland, lung, and duodenum. Overall, the X:A ratios of these tissues were in the range of 0.5–1, suggesting upregulation of expression from the X chromosome, although the dosage compensation is incomplete (fig. 2A and B). Specifically, the liver gave the highest X:A ratio (1.01) in females and showed complete compensation, followed by the pituitary gland (0.91). These data suggest that X chromosome expression is enriched in these tissues. In contrast, fat, muscle, endometrium, and the lung gave relatively low X:A ratios (0.64–0.72), indicating incomplete compensation and less X chromosome activity. Furthermore, we compared the X chromosome expression distribution between males and females in common somatic tissues and observed no significant difference (P > 0.05, by two-sided Kolmogorov–Smirnov test, supplementary table S4, Supplementary Material online), except in muscle. The X:A ratio of common tissues between sexes was also not significantly different (P = 0.45, by paired t-test after log transformation).
Fig. 2. Median X:A ratios with 95% confidence intervals of bovine female (A, B) and male (C, D) somatic tissues. The X:A ratios were calculated using expressed (TPM > 1) genes (A, C) and ubiquitously expressed (TPM > 1 and present in all somatic data sets; B, D) genes. Blue line: X:A = 1, complete dosage compensation; red line: X:A = 0.5, no dosage compensation.
Fig. 3. Median X:A ratios with 95% confidence intervals of bovine in vivo (A, B) and in vitro (C, D) produced oocytes and pre- and postimplantation embryos. The X:A ratios were calculated using expressed (TPM > 1) genes (A, C) and ubiquitously expressed (TPM > 1 and present in all embryo data sets; B, D) genes. GV, germinal vesicle; MII, metaphase of second meiosis; CM, compact morula; BL, blastocyst. Blue line: X:A = 1, complete dosage compensation; red line: X:A = 0.5, no dosage compensation.
Although all somatic tissues we analyzed had upregulated expression of X-linked genes, which supports Ohno’s hypothesis, the confidence intervals of X:A ratios did not encompass 1 in most of the samples. As suggested in previous studies, “dosage-sensitive” genes with housekeeping functions were more likely to be affected by dosage imbalance and were upregulated (Pessia, Makino, et al. 2012). When nondosage-sensitive genes were included in the X:A calculation, the X:A ratios were likely lower (Sangrithi et al. 2017). Therefore, we further selected ubiquitously expressed genes (TPM > 1) throughout all somatic samples. Gene ontology analysis showed strong evidence that these ubiquitously expressed genes had housekeeping roles, such as translation, RNA transcription, protein transport, and cellular and metabolic processes (supplementary fig. S2, Supplementary Material online).
A total of 245 and 7,603 genes on the X chromosome and autosomes, respectively, were included as ubiquitously expressed genes in the recalculation of the X:A ratios (fig. 2C and D). The confidence intervals encompassed 1 and the medians were greater than 1 in most of the samples. Brain and its specific regions such as the pituitary gland and hypothalamus had the highest X:A ratios (1.20, 1.28, and 1.22, respectively), consistent with previous reports in other species (Nguyen and Disteche 2006; Deng et al. 2011).
X Chromosome Upregulation in Immature and Mature Oocytes
Germinal vesicle stage (immature) oocytes are arrested at the diplotene stage of the first prophase of meiosis (Pro I) (Mehlmann 2005), and contain a duplicated genome (XXXX:AAAA) and two active X chromosomes (Fukuda et al. 2015). The matured oocytes, on the other hand, are arrested at the second metaphase of meiosis (MII) (Li and Albertini 2013) and are haploid (1N), although each homologous chromosome contains two sister chromatids/complements of DNA (2C; XX:AA). Our analysis included immature and both in vivo and in vitro matured oocytes (Graf et al. 2014; Jiang et al. 2014). First, we identified expressed genes (TPM > 1) and ubiquitously expressed genes across all preimplantation samples. Fewer ubiquitously expressed genes were found in these samples than in somatic tissues, but they were enriched for similar gene ontology terms (supplementary figs. S3 and S4, Supplementary Material online). A total of 117 X-linked and 3,947 autosomal genes were used as ubiquitous genes for X:A ratio calculation. Compared with expressed genes (TPM > 1, X:A ≈ 0.75, fig. 3C), ubiquitously expressed genes had higher X:A ratios in immature and mature oocytes, at 1 and 0.87, respectively (fig. 3D). Taken together, our analyses reveal a higher X:A ratio for ubiquitous genes, with a balanced X to autosome expression in immature diploid oocytes and an incomplete balance in mature haploid oocytes.
X Chromosome Upregulation in Preimplantation Embryos
X inactivation and reactivation happen in cycles during early embryonic development. In mice, the zygote contains an inactive X from the sperm but three X chromosome reactivation events occur subsequently: (1) embryonic genome activation (EGA) at the 2-cell stage, (2) pluripotency establishment in the inner cell mass (ICM) of the blastocyst, and (3) primordial germ cell generation in the genital ridge (Ohhata and Wutz 2013). Using the bovine X-linked monoamine oxidase type A (MAOA), Ferreira et al. (2010) demonstrated that transcripts from both the maternal and paternal MAOA were present in embryos at the 4-cell, 8- to 16-cell, and blastocyst stages, while only the maternal transcripts were present in compact morula. These data revealed that XCI occurred in an imprinted fashion at the morula stage in the bovine and the paternal X was reactivated at the blastocyst stage. A more permanent random XCI was observed between the blastocyst and early elongation stages by analyzing seven X-linked genes in day 14 embryos (Bermejo-Alvarez et al. 2011). However, the expression dynamics of an individual gene cannot represent the activity of the whole X chromosome; global transcript analysis will generate a more definitive conclusion on X inactivation-reactivation dynamics.

We therefore analyzed RNA-seq data from preattachment embryos. In in vivo produced preimplantation embryos, we observed an incomplete dosage compensation (0.5 < X:A < 1) using expressed genes and an excess of compensation (X:A > 1) from the 8-cell to compact morula stages using ubiquitous genes (fig. 3A and B). X dosage slightly decreased after fertilization, with the lowest X:A ratio seen at the 4-cell stage, indicating that the sperm brought in an inactive X chromosome to the matured oocyte. The X:A ratio started to increase from the 4- to 8-cell stage, coincident with EGA for bovine in vivo embryos (Jiang et al. 2014), suggesting that EGA activates both the paternal and maternal genomes and has a more profound effect on the X chromosome. The increased X:A ratio exhibited excess compensation from the 8-cell to compact morula stage. A sharp decrease of the X:A ratio was then observed between the early (32-cell) and compact morula stages, corresponding to the first observed inactivation of the paternal X chromosome in bovine embryos. The X:A ratio subsequently stabilized from days 7 to 19 of gestation, corresponding to random XCI between the blastocyst and early elongation stages.

Using expressed genes, in vitro produced bovine preimplantation embryos presented similar X:A dynamics from fertilization to the 8-cell stage (fig. 3C and D). However, no excess dosage compensation was observed using ubiquitous genes at any stage but the blastocyst, suggesting a deviation of X compensation from in vivo embryos. These observations are consistent with recent findings of aberrant X regulation in in vitro produced human and mouse embryos (Tan et al. 2016).
Effect of PAR and Putative XCI-Escaping Genes on X Chromosome Upregulation
Pseudoautosomal regions (PARs) contain homologous genes between the X and Y chromosomes, and are important in homologous chromosome pairing and recombination during male meiosis (Das et al. 2009). A total of 20 PAR genes (supplementary table S3, Supplementary Material online) have been characterized in bovine, sheep, goats, and other ruminants (Raudsepp and Chowdhary 2015). Most PAR genes are known to escape XCI in humans (Helena Mangs and Morris 2007). Moreover, approximately 15% and 3% of non-PAR X-linked genes in humans and mice, respectively, are known to escape XCI (Berletch et al. 2011). In the bovine, 55 such X-linked genes (supplementary table S5, Supplementary Material online) were classified as candidates that escape XCI (Ka et al. 2016). To tease out the effects of these bi-allelically expressed PAR genes and XCI-escaping genes, we plotted X:A ratios in the categories of “all genes,” “expressed genes,” “genes subjected to XCI (excluding PAR genes and putative XCI-escaping genes),” and “dosage-sensitive genes” (fig. 4). The X:A ratios for “all genes” had the median value closer to 0.5, while the ratios for “expressed genes” were closer to 1. When we excluded PARs and putative XCI-escaping genes from “expressed genes,” the outliers of extremely high X:A ratios disappeared, suggesting bi-allelically expressed X-linked genes did contribute to the high X:A ratios. Moreover, “dosage-sensitive genes” maintained the highest X:A ratios (> 1), further confirming the hyperexpression nature of this subgroup of genes.
Fig. 4.—Boxplot of the X:A medians for all data sets. Genes were categorized as “all genes,” “expressed (TPM > 1) genes,” “genes subjected to XCI (excluding PAR genes and putative XCI-escaping genes),” and “dosage-sensitive (ubiquitously expressed) genes.” Blue line: X:A = 1, complete dosage compensation; red line: X:A = 0.5, no dosage compensation.
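The four gene categories compared above can be illustrated with a minimal Python sketch. It assumes that the X:A ratio of a sample is the median TPM of X-linked genes divided by the median TPM of autosomal genes; this section does not restate the exact estimator, so that choice is an assumption. The function names, gene names, and TPM values are all hypothetical and serve only to show how each filter changes the gene set.

```python
from statistics import median

# Chromosome labels that are not counted as autosomes (labels are assumed).
NON_AUTOSOMAL = {"X", "Y", "MT"}

def xa_ratio(tpm, chrom, genes):
    """Median TPM of X-linked genes over median TPM of autosomal genes."""
    x = [tpm[g] for g in genes if chrom[g] == "X"]
    a = [tpm[g] for g in genes if chrom[g] not in NON_AUTOSOMAL]
    return median(x) / median(a)

def category_ratios(samples, sample, chrom, par_genes, escape_genes, cutoff=1.0):
    """X:A ratio of one sample for each of the four gene categories."""
    tpm = samples[sample]
    all_genes = set(tpm)
    # "Expressed": TPM above the cutoff in this sample.
    expressed = {g for g in all_genes if tpm[g] > cutoff}
    # "Subjected to XCI": expressed genes minus PAR and putative escapees.
    subject_to_xci = expressed - set(par_genes) - set(escape_genes)
    # "Dosage-sensitive" proxy: TPM above the cutoff in every sample.
    ubiquitous = {g for g in all_genes
                  if all(s[g] > cutoff for s in samples.values())}
    sets = {"all": all_genes, "expressed": expressed,
            "subject_to_xci": subject_to_xci, "ubiquitous": ubiquitous}
    return {name: xa_ratio(tpm, chrom, genes) for name, genes in sets.items()}

# Hypothetical toy data: four X-linked genes (one in the PAR), four autosomal.
chrom = {"x1": "X", "x2": "X", "x3": "X", "xpar": "X",
         "a1": "1", "a2": "2", "a3": "3", "a4": "4"}
samples = {
    "s1": {"x1": 0.0, "x2": 8.0, "x3": 12.0, "xpar": 40.0,
           "a1": 0.0, "a2": 10.0, "a3": 12.0, "a4": 14.0},
    "s2": {"x1": 0.0, "x2": 9.0, "x3": 0.5, "xpar": 30.0,
           "a1": 2.0, "a2": 10.0, "a3": 11.0, "a4": 12.0},
}
ratios = category_ratios(samples, "s1", chrom,
                         par_genes={"xpar"}, escape_genes=set())
for name, value in ratios.items():
    print(f"{name}: {value:.3f}")
```

In this toy sample, removing the highly expressed PAR gene lowers the ratio for expressed genes, which mirrors the disappearance of high outliers described above; the numbers themselves carry no biological meaning.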
Discussion
In this study, we determined the X chromosome dosage profiles in four chromosome scenarios in the bovine. Immature oocytes represent a diploid germline with a duplicated genome/four complements of DNA (XXXX:AAAA); mature oocytes represent a haploid germline with a duplicated genome/two complements of DNA (XX:AA); bovine preimplantation embryos at various stages represent the gradual change from two active X chromosomes (XaXa:AA) to one inactive X chromosome (XaXi:AA); and female and male somatic tissues contain diploid cells with one already inactivated X (XaXi:AA or XY:AA). Our analyses showed incomplete compensation (0.5 < X:A < 1) of the X chromosome relative to autosome pairs in all scenarios for expressed genes (TPM > 1) and excess compensation (X:A > 1) for “dosage-sensitive” genes in somatic tissues and at certain stages in early embryos. These findings suggest that X dosage upregulation occurs in the bovine germline, preattachment embryos, and somatic tissues analyzed here.
Our results in the bovine are consistent with previous findings in other mammalian species that used similar data-filtering strategies (Deng et al. 2011; Pessia, Makino, et al. 2012). However, studies applying different threshold criteria to RNA-seq data have generated conflicting results for mammalian X dosage compensation. Xiong et al. (2010) included genes with low and no expression and reported an X:A ratio close to 0.5. In contrast, many follow-up studies that reanalyzed the same RNA-seq data after removing the “noise” concluded that the X is hyperactivated for expressed genes (Deng et al. 2011), especially dosage-sensitive ones (genes encoding components of protein complexes of seven or more members and having housekeeping roles) (Pessia, Makino, et al. 2012). Thus, dosage compensation, unlike XCI, has been proposed to be a local process with hyper-expression of only dosage-sensitive genes (Pessia, Engelstädter, et al. 2014).
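The sensitivity of the X:A ratio to the expression threshold can be illustrated with a short Python sketch. It assumes a median-based X:A ratio and uses made-up TPM values in which the X-linked set contains proportionally more silent genes than the autosomal set; the function name and the data are hypothetical and are not taken from any of the cited studies.

```python
from statistics import median

def xa_ratio_at_cutoff(x_tpm, a_tpm, cutoff=None):
    """Median X TPM over median autosomal TPM.

    With cutoff=None every gene is kept, including silent ones;
    otherwise only genes with TPM > cutoff enter either median.
    """
    if cutoff is not None:
        x_tpm = [v for v in x_tpm if v > cutoff]
        a_tpm = [v for v in a_tpm if v > cutoff]
    return median(x_tpm) / median(a_tpm)

# Hypothetical TPM values for illustration only.
x_tpm = [0.0, 0.0, 0.2, 5.0, 9.0, 10.0, 11.0]
a_tpm = [0.0, 0.5, 8.0, 10.0, 11.0, 12.0, 14.0]

print(xa_ratio_at_cutoff(x_tpm, a_tpm))       # all genes, silent ones included
print(xa_ratio_at_cutoff(x_tpm, a_tpm, 1.0))  # expressed genes only (TPM > 1)
```

With these toy values the ratio is 0.5 when all genes are kept and rises to about 0.86 once genes with TPM ≤ 1 are dropped, which is the direction of the discrepancy between the unfiltered and filtered analyses discussed above.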
In our study, we applied two gene selection methods in order to identify “dosage-sensitive” genes: expressed genes with TPM > 1, and ubiquitous genes whose TPMs are > 1 in all somatic samples or in all embryo samples. As in humans and mice, we found that dosage compensation was incomplete in most scenarios when expressed genes were used, whereas ubiquitous genes had a higher X dosage, implying that the expression of a group of X-linked bovine genes is collectively more than doubled. Furthermore, this level of upregulation could not be globally applied to all genes on the X chromosome.
The previous study by Xiong et al. (2010) filtered out reads that mapped to multiple locations of the genome and those that spanned splice junctions. Their unique mapping approach resulted in an X:A ratio close to 0.5 and refuted the X dosage compensation conclusion. Multiple groups (Kharchenko et al. 2011; Pessia, Makino, et al. 2012) have reanalyzed the data from Xiong et al. using different mapping strategies and reached opposite conclusions. Jue et al. (2013) showed that mapping strategies can significantly impact the conclusions regarding X:A ratios. The “unique” mapping strategy can remove reads from close paralogs, which affects the X chromosome substantially (Jue et al. 2013). Considering this point, we used a nonunique mapping strategy here. We used the Hisat2 software, which by default reports both uniquely mapped and multimapped reads. Hisat2 is known to have the highest proportion of correctly multimapped reads compared with other aligners (Kim et al. 2015).
Bovine female somatic tissues had a slightly higher X:A ratio than the same tissues of males. This difference, however, was not significant (P = 0.45). The distribution of X chromosome expression was also similar between sexes (P > 0.05). These results demonstrate that the expression of the X chromosome is balanced between males and females although they have different numbers of X chromosomes.
The slightly higher X:A ratio in females may be a result of genes that escape XCI (Couldrey et al. 2017). Although it is unclear how many genes escape XCI in the bovine, a previous report documented a few X-linked genes escaping XCI in a mosaic fashion in this species (Yen et al. 2007).
The brain had the highest X:A ratio among the tissues examined, consistent with previous RNA-seq studies in several mammalian species (Nguyen and Disteche 2006; Deng et al. 2011). This could be related to the fact that many genes related to brain functions, such as MAOA, are located on the X chromosome (Zechner et al. 2001). In contrast, the X dosage of other bovine somatic tissues, including fat, liver, muscle, and the pituitary gland, was previously determined to be incompletely compensated (Ka et al. 2016), which is consistent with our results and with those in humans and mice.
Germ cells and developing embryos undergo drastic epigenetic changes. We observed balanced expression of the X chromosome with that of the autosomes in diploid immature oocytes and incomplete balance in haploid matured oocytes. These data are consistent with those of Fukuda et al. (2015), who also showed higher dosage compensation of the X chromosome in immature than in matured oocytes in the human and mouse. However, the X:A ratio decreased slightly after fertilization, probably because the sperm brings in some mRNAs of autosomal origin in addition to an inactive X (Huynh and Lee 2003). Because early embryonic development is primarily dependent on stored maternal mRNAs and proteins, which gradually degrade until EGA (Memili and First 2000), a consistent decrease and then increase of X:A before and after EGA, respectively, were observed. The timing of the changes in X:A in in vivo embryos, however, was one cell cycle earlier than in their in vitro counterparts owing to the timing difference between these two types of embryos, which are the 4–8 and 8–16 cell stages, respectively (Telford et al. 1990; Misirlioglu et al. 2006; Kues et al. 2008; Graf et al. 2014; Jiang et al. 2014).
Conclusion
In conclusion, our study shows upregulation of the X chromosome in four bovine genome scenarios, supporting balanced expression between a single active X and autosome pairs. However, deviating from Ohno’s theory, dosage compensation to rescue X haploinsufficiency appears to be an incomplete process for expressed genes but a complete process for “dosage-sensitive” genes. Removal of PAR genes and those that putatively escape XCI eliminated the outliers with extremely high X:A ratios. In addition, the switch from imprinted XCI at the compact morula stage to random XCI at the blastocyst stage may happen so rapidly that a potential transient state of two active X chromosomes could not be captured in the current data set with its limited time points. Whether a transient state of two active X chromosomes occurs during blastulation requires more frequent sampling and further study. Lastly, no difference in relative X expression was observed between bovine female and male somatic tissues, suggesting that overall X expression is balanced between the sexes, as Ohno’s hypothesis predicts.
Ethical Approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Supplementary Material
Supplementary data are available at Genome Biology and Evolution online.