Literature DB >> 33028202

A comparative analysis of chromatin accessibility in cattle, pig, and mouse tissues.

Michelle M Halstead1, Colin Kern1, Perot Saelao1, Ying Wang1, Ganrea Chanthavixay1, Juan F Medrano1, Alison L Van Eenennaam1, Ian Korf1, Christopher K Tuggle2, Catherine W Ernst3, Huaijun Zhou4, Pablo J Ross5.   

Abstract

BACKGROUND: Although considerable progress has been made towards annotating the noncoding portion of the human and mouse genomes, regulatory elements in other species, such as livestock, remain poorly characterized. This lack of functional annotation poses a substantial roadblock to agricultural research and diminishes the value of these species as model organisms. As active regulatory elements are typically characterized by chromatin accessibility, we implemented the Assay for Transposase Accessible Chromatin (ATAC-seq) to annotate and characterize regulatory elements in pigs and cattle, given a set of eight adult tissues.
RESULTS: Overall, 306,304 and 273,594 active regulatory elements were identified in pig and cattle, respectively. 71,478 porcine and 47,454 bovine regulatory elements were highly tissue-specific and were correspondingly enriched for binding motifs of known tissue-specific transcription factors. However, in every tissue the most prevalent accessible motif corresponded to the insulator CTCF, suggesting pervasive involvement in 3-D chromatin organization. Taking advantage of a similar dataset in mouse, open chromatin in pig, cattle, and mice were compared, revealing that the conservation of regulatory elements, in terms of sequence identity and accessibility, was consistent with evolutionary distance; whereas pig and cattle shared about 20% of accessible sites, mice and ungulates only had about 10% of accessible sites in common. Furthermore, conservation of accessibility was more prevalent at promoters than at intergenic regions.
CONCLUSIONS: The lack of conserved accessibility at distal elements is consistent with rapid evolution of enhancers, and further emphasizes the need to annotate regulatory elements in individual species, rather than inferring elements based on homology. This atlas of chromatin accessibility in cattle and pig constitutes a substantial step towards annotating livestock genomes and dissecting the regulatory link between genome and phenome.

Entities:  

Keywords:  Cattle; Chromatin accessibility; Comparative epigenomics; Functional annotation; Pig

Mesh:

Substances:

Year:  2020        PMID: 33028202      PMCID: PMC7541309          DOI: 10.1186/s12864-020-07078-9

Source DB:  PubMed          Journal:  BMC Genomics        ISSN: 1471-2164            Impact factor:   3.969


Background

Despite considerable progress to annotate protein-coding genes in livestock species, the vast majority of these genomes is noncoding and remains poorly characterized. Epigenomics techniques, such as chromatin immunoprecipitation followed by sequencing (ChIP-seq) and DNase I hypersensitive sites sequencing (DNase-seq), have been extensively employed to catalog functional elements in humans [1] and classical model organisms [2-6]. For instance, the international human epigenome consortium has profiled thousands of epigenomes and identified millions of regulatory elements in the human genome [1, 7, 8], yielding an atlas of functional elements that has been invaluable for subsequent research in a wide variety of biological processes, including disease [9-13], pluripotency [14-16], differentiation [17-19], and morphology [20, 21]. Ultimately, genome-wide patterns of chromatin accessibility and compaction determine which genomic regions are available to cellular machinery, and are thereby intimately connected to the cell-specific gene expression patterns that determine identity and function. Controlled exposure of specific sites provides opportunities for transcription factors to bind their recognition motifs and regulate gene expression through further recruitment of proteins, such as RNA polymerases [22, 23]. Consequently, profiling open chromatin has the potential to not only identify regulatory elements, but also profile their activities in different cell types. The increasing availability of next-generation sequencing-based techniques spurred development of several alternative epigenomics assays, such as the Assay for Transposase Accessible Chromatin (ATAC-seq), which was first reported by Buenrostro et al in 2013 [24]. Following its introduction, ATAC-seq quickly became one of the leading methods for identification of open chromatin, largely due to the simplicity of the technique and low input requirements, which made it possible to study chromatin structure in rare samples. Here we implemented ATAC-seq to profile open chromatin in a set of cattle and pig tissues: subcutaneous adipose, brain (frontal brain cortex, hypothalamus, and cerebellum), liver, lung, skeletal muscle, and spleen. This set of prioritized tissues are associated with a large number of qualitative phenotypic traits relevant to animal production, such as disease resistance, growth, and feed efficiency. Overall, about 300,000 accessible regions were identified in each species, yielding an epigenomic resource that will benefit agricultural genomics research and enable cross-species comparisons that will enhance knowledge of comparative epigenomics and transcriptional regulation.

Results

ATAC-seq library quality control and preprocessing

Using a modified ATAC-seq protocol, genome-wide chromatin accessibility was profiled in eight tissues derived from two adult male Hereford cattle and two adult male Yorkshire pigs: three brain tissues (frontal brain cortex, hypothalamus, and cerebellum), liver, lung, spleen, subcutaneous adipose, and skeletal muscle (Fig. 1a). With the exception of one cattle cerebellum sample, which was lost during processing, ATAC-seq data were generated for two biological replicates per tissue (Table S1). In addition, two technical replicates were prepared for cattle cortex, pig cerebellum, and pig hypothalamus. ATAC-seq signal from technical replicates were highly correlated (Pearson R averaged 0.97), and principal components analysis (PCA) of genome-wide signal grouped biological and technical replicates together (Fig. S1).
Fig. 1

Experimental design and ATAC-seq quality metrics. a Overview of tissues collected from adult male cattle and pigs for ATAC-seq, wherein the Tn5 transposase preferentially cuts DNA at accessible sites and simultaneously inserts sequencing adapters. b Scatterplots showing Pearson correlation of normalized genome-wide ATAC-seq signal (reads per million; RPM) between biological replicates for brain cortex and lung. c Heatmaps depicting normalized ATAC-seq signal at all TSS, sorted by signal intensity. Signal shown for brain cortex and lung. d Normalized ATAC-seq signal in cattle tissues at the GAPDH locus (a housekeeping gene), as well as several genes with tissue-specific activity. Pink bars indicate that signal exceeded the viewing range (up to 20 RPM). e Normalized ATAC-seq signal in pig tissues at the same genes. f PCA of normalized ATAC-seq signal in consensus open chromatin identified in cattle. G) PCA of normalized ATAC-seq signal in consensus open chromatin identified in pig

Experimental design and ATAC-seq quality metrics. a Overview of tissues collected from adult male cattle and pigs for ATAC-seq, wherein the Tn5 transposase preferentially cuts DNA at accessible sites and simultaneously inserts sequencing adapters. b Scatterplots showing Pearson correlation of normalized genome-wide ATAC-seq signal (reads per million; RPM) between biological replicates for brain cortex and lung. c Heatmaps depicting normalized ATAC-seq signal at all TSS, sorted by signal intensity. Signal shown for brain cortex and lung. d Normalized ATAC-seq signal in cattle tissues at the GAPDH locus (a housekeeping gene), as well as several genes with tissue-specific activity. Pink bars indicate that signal exceeded the viewing range (up to 20 RPM). e Normalized ATAC-seq signal in pig tissues at the same genes. f PCA of normalized ATAC-seq signal in consensus open chromatin identified in cattle. G) PCA of normalized ATAC-seq signal in consensus open chromatin identified in pig Mapping of sequencing data resulted in 58 ± 4 million (S.E.) informative reads (uniquely mapping, non-mitochondrial, monoclonal reads) per sample (Table 1). Normalized genome-wide ATAC-seq signal was highly reproducible between biological replicates, with the Pearson correlation coefficient averaging 0.97 (Fig. 1b; Fig. S2), and increased in intensity at transcription start sites (TSS) (Fig. 1c; Fig. S3). Genes with tissue-specific functions demonstrated open chromatin that was specific to the corresponding tissue; for instance, the gene STMN4, which is involved in neuron projection development, is specifically marked by open chromatin at two sites in both cattle (Fig. 1d) and pig (Fig. 1e) in all three brain tissues, which likely correspond to alternate TSS, based on the gene annotation in pig. PCA of normalized ATAC-seq signal separated samples by tissue, with brain tissues grouping together in both pig and cattle (Fig. 1f,g, Fig. S4).
Table 1

ATAC-seq data preprocessing. Per library, total raw reads, mapped reads (excluding mitochondrial DNA), duplicate reads, informative reads (monoclonal and uniquely mapping), and percent of raw reads that were informative

SpeciesTissueReplicateRawreadsMapped readsDuplicate readsInformative reads%Raw
CattleAdiposeA152,184,070145,648,16182,972,32437,815,39224.85
AdiposeB126,920,776123,879,74925,466,93872,623,98557.22
CerebellumB158,964,554134,391,90053,303,88649,305,25031.02
Brain CortexA216,202,716189,763,599122,567,57430,997,04514.34
Brain CortexB205,363,760174,961,99596,411,12841,021,48719.98
HypothalamusA146,439,048102,771,47150,044,17635,090,14023.96
HypothalamusB68,153,35663,771,72331,495,32515,328,53522.49
LiverA175,124,072165,112,50395,001,24840,031,56022.86
LiverB194,367,992179,265,34199,942,56736,472,28918.76
LungA163,469,102159,710,29527,682,06399,691,29160.98
LungB190,030,170184,571,09236,657,121110,584,50258.19
MuscleA89,174,61886,045,85715,262,76755,794,69362.57
MuscleB97,247,28069,592,64010,539,04845,909,94547.21
SpleenA261,785,316250,911,567163,860,74137,521,24114.33
SpleenB179,854,284162,449,02551,533,28469,626,10038.71
PigAdiposeA93,118,99887,025,52015,814,10757,651,67761.91
AdiposeB71,639,95666,597,41212,368,40442,960,58359.97
CerebellumA186,388,542175,280,07090,013,90663,538,78334.09
CerebellumB125,817,350116,536,74341,557,70357,997,39546.10
Brain CortexA101,924,24098,384,41136,110,62753,096,95252.09
Brain CortexB160,871,726155,783,25777,377,04366,403,81041.28
HypothalamusA112,463,726106,966,83552,803,73039,919,89535.50
HypothalamusB170,163,006162,907,05292,303,71056,125,16032.98
LiverA171,864,386167,321,556113,588,69942,376,22024.66
LiverB169,952,062164,205,92085,279,20064,117,45837.73
LungA108,086,464104,424,55619,658,15673,018,74967.56
LungB110,690,180106,275,98117,791,03074,869,85267.64
MuscleA168,211,474165,225,30541,747,860113,058,64967.21
MuscleB141,069,686138,785,46639,577,31591,780,48665.06
SpleenA91,549,65288,698,19913,254,76964,352,89370.29
SpleenB93,747,29290,558,89512,817,04567,241,93471.73
ATAC-seq data preprocessing. Per library, total raw reads, mapped reads (excluding mitochondrial DNA), duplicate reads, informative reads (monoclonal and uniquely mapping), and percent of raw reads that were informative Several commonly used statistics were used to evaluate library quality (Table 2) [25, 26]. The non-redundant read fraction (NRF), which gauges library complexity by measuring the proportion of non-duplicate uniquely mapped reads out of all mapped reads, averaged 0.62 ± 0.03 (S.E.), indicating acceptable library complexity. The synthetic Jensen-Shannon distance (sJSD), which measures the divergence between the genome-wide ATAC-seq signal in a given sample versus a uniform distribution, averaged 0.46 ± 0.01 (S.E.), suggesting a non-random ATAC-seq signal distribution throughout the genome. Finally, the Fraction of Reads in Peaks (FRiP) was calculated to evaluate the strength of signal over background. On average, 130,712 ± 10,994 (S.E.) ATAC-seq peaks (regions of enrichment) were called per sample (Supplementary Data 1–2), and the FRiP score averaged 34.8 ± 2.6% (S.E). Notably, FRiP scores for adipose libraries were consistently lower (average 8.5 ± 2.5%) than for other tissues (average 38.7 ± 2.1%).
Table 2

Quality metrics of ATAC-seq libraries. Non-redundant read fraction (NRF) measures library complexity, Fraction of reads in peaks (FRiP) measures signal-to-noise ratio, and synthetic Jensen-Shannon distance (sJSD) measures divergence between ATAC-seq signal and a uniform distribution

SpeciesTissueReplicateNRFFRiPsJSD
CattleAdiposeA0.439.820.34
AdiposeB0.7915.040.34
CerebellumB0.6042.530.48
Brain CortexA0.3530.160.48
Brain CortexB0.4540.800.49
HypothalamusA0.5143.670.50
HypothalamusB0.5111.970.37
LiverA0.4237.990.50
LiverB0.4428.170.44
LungA0.8337.630.45
LungB0.8044.260.48
MuscleA0.8240.770.50
MuscleB0.8541.770.51
SpleenA0.3521.290.42
SpleenB0.6835.240.45
PigAdiposeA0.825.210.40
AdiposeB0.814.120.40
CerebellumA0.4943.050.46
CerebellumB0.6441.490.47
Brain CortexA0.6340.650.48
Brain CortexB0.5044.530.46
HypothalamusA0.5136.850.49
HypothalamusB0.4337.760.44
LiverA0.3255.380.52
LiverB0.4849.250.49
LungA0.8131.210.43
LungB0.8336.870.46
MuscleA0.7558.160.62
MuscleB0.7161.940.65
SpleenA0.8529.410.43
SpleenB0.8622.700.38
Quality metrics of ATAC-seq libraries. Non-redundant read fraction (NRF) measures library complexity, Fraction of reads in peaks (FRiP) measures signal-to-noise ratio, and synthetic Jensen-Shannon distance (sJSD) measures divergence between ATAC-seq signal and a uniform distribution For each tissue, peaks called from biological replicates were compared to evaluate consistency between replicates and identify accessible regions with high confidence. On average, 67.4 ± 17.8% (S.E.) of peaks from the replicate with fewer peaks called were also identified in the other replicate (Table 3). Regions that were enriched for ATAC-seq signal in both biological replicates of at least one tissue were collapsed to obtain a single comprehensive set of unique “consensus” peaks, accounting for accessibility in all eight tissues. Altogether, 306,304 and 273,594 consensus peaks were identified in pig and cattle, respectively (Table 4).
Table 3

Replicability of ATAC-seq peaks. Overlap of peak sets derived from biological replicates

SpeciesTissueRep. A PeaksRep. B PeaksRep. A Peaks overlapping Rep. B Peaks% Rep. A PeaksRep B. Peaks overlapping Rep A. peaks% Rep B. Peaks
CattleAdipose59,612133,76838,64764.838,47128.8
Brain Cortex109,395160,54675,46269.075,20646.8
Hypothalamus59,96637,03619,97033.319,99954.0
Liver102,704114,58358,44456.958,56351.1
Lung221,576248,844167,31175.5166,77767.0
Muscle107,208113,50276,78071.676,80167.7
Spleen110,852200,32379,35571.679,00739.4
PigAdipose91927645477852.0477862.5
Cerebellum220,327214,455132,12360.0132,29261.7
Brain Cortex142,658156,069103,58172.6103,38266.2
Hypothalamus103,402144,41863,38261.363,36843.9
Liver112,202136,08578,39569.978,27457.5
Lung167,298191,926114,47068.4113,86459.3
Muscle137,000105,82392,03967.292,70587.6
Spleen100,982100,86764,19263.664,20863.7
Table 4

Regions identified as ATAC-seq peaks in both biological replicates of pig, cattle, and mouse tissues. To obtain a single comprehensive set of “consensus” ATAC-seq peaks for each species, regions that were identified as peaks in both biological replicates for each tissue were collapsed, such that any overlapping peaks were merged into a single unique interval

PigCattleMouse
TissuePeaks in both replicatesAverage size (bp)Peaks in both replicatesAverage size (bp)Peaks in both replicatesAverage size (bp)
Adipose4785373.58338,745501.57564,008633.770
Cerebellum134,086555.92193,927535.28457,771618.660
Brain Cortex104,272515.20975,762465.855149,223604.979
Hypothalamus63,736514.03620,045415.089
Liver78,841498.52858,853482.56190,228679.330
Lung115,491569.537169,734619.45076,168652.975
Muscle93,550698.70877,378566.26222,003514.762
Spleen64,667560.97279,960542.83235,278601.650
Consensus (collapsed set)306,304623.608273,594616.064254,076668.062
Replicability of ATAC-seq peaks. Overlap of peak sets derived from biological replicates Regions identified as ATAC-seq peaks in both biological replicates of pig, cattle, and mouse tissues. To obtain a single comprehensive set of “consensus” ATAC-seq peaks for each species, regions that were identified as peaks in both biological replicates for each tissue were collapsed, such that any overlapping peaks were merged into a single unique interval

Global characteristics of accessible chromatin in cattle, pig, and mouse

To infer the functional significance of accessible regions that were identified in pig and cattle tissues, consensus peaks were characterized by genomic localization (positioning relative to genes), sequence content, and tissue-specificity. In cattle, consensus peaks averaged 616 bp in width (Table 4) and covered 6.2% of the genome. Similarly, pig consensus peaks were 624 bp wide (Table 4) and accounted for 7.2% of the genome. As expected, open chromatin was significantly enriched near TSS (p-value <1e-200), although most consensus peaks were intronic or intergenic (Fig. 2a). In addition, consensus peaks frequently occurred in only one tissue (54 and 58% of cattle and pig peaks, respectively) (Fig. 2b).
Fig. 2

Open chromatin localization and differential accessibility. a Distribution of cattle and pig consensus open chromatin relative to genomic features. Because peaks often span multiple features, peaks were categorized based on 1 bp overlap with features in the following order: first as TSS (±50 bp), then as promoter (2 kb upstream of TSS), transcription termination site (TTS) (±50 bp), 5′ untranslated region (UTR), 3′ UTR, coding sequence (CDS), intronic, and if no features were overlapped, peaks were considered intergenic. b Distribution of consensus peak activity, ranging from tissue-specific (accessible in only one tissue) to ubiquitous (accessible in all sampled tissues). Consensus peaks that were accessible in a single tissue were further broken down by tissue. c Distribution of consensus peak activity for regions containing CTCF motifs

Open chromatin localization and differential accessibility. a Distribution of cattle and pig consensus open chromatin relative to genomic features. Because peaks often span multiple features, peaks were categorized based on 1 bp overlap with features in the following order: first as TSS (±50 bp), then as promoter (2 kb upstream of TSS), transcription termination site (TTS) (±50 bp), 5′ untranslated region (UTR), 3′ UTR, coding sequence (CDS), intronic, and if no features were overlapped, peaks were considered intergenic. b Distribution of consensus peak activity, ranging from tissue-specific (accessible in only one tissue) to ubiquitous (accessible in all sampled tissues). Consensus peaks that were accessible in a single tissue were further broken down by tissue. c Distribution of consensus peak activity for regions containing CTCF motifs A comparable mouse ATAC-seq dataset, which included libraries from two male replicates for all tissues except hypothalamus, was downloaded from the CNGB Nucleotide Sequence Read Archive (Project ID CNP0000198) and processed in the same manner as the pig and cattle data. Similar to cattle and pig, 254,076 consensus peaks were identified from mouse tissues (Table 4). Mouse consensus peaks also covered a comparable portion of the genome (6.2%), were of similar width (average 668 bp), demonstrated enrichment at TSS (Fig. S5a), and were often tissue-specific (63% of peaks) (Fig. S5b). Surprisingly, a higher fraction of mouse consensus peaks localized to TSS (9.6%) than in cattle (5.1%) or pig (5.6%) (Fig. 2a, Fig. S5a). This discrepancy could be attributed either to the protocol, as pig and cattle ATAC-seq libraries were subjected to size-selection prior to sequencing and mouse libraries were not, or suboptimal genome annotations, as the pig and cattle annotations are relatively incomplete in comparison to the mouse. Nevertheless, the global characteristics of open chromatin were generally consistent across these three species. To interrogate the potential function of accessible regions in cattle and pig, consensus peaks were subjected to motif enrichment analysis. Overall, consensus peaks were most significantly enriched for CTCF recognition sites, with about 8% of accessible regions harboring CTCF motifs in each species (Table 5). Of note, CTCF motifs were more prevalent in consensus peaks identified in 3 or more tissues (8% of peaks in cattle and pig) than in consensus peaks identified in only 1 or 2 tissues (3 and 2% of peaks in cattle and pig, respectively). Nevertheless, in both species 30% of consensus peaks containing a CTCF motif were only accessible in a single tissue (Fig. 2c), indicating that CTCF binding could play both tissue-specific and ubiquitous regulatory roles.
Table 5

Motif enrichment in consensus open chromatin. Top ten enriched known binding motifs identified from the merged set of consensus peaks in each species

Cattle consensus peaksPig consensus peaks
MotifP-valuePeaks with motif (%)MotifP-valuePeaks with motif (%)
CTCF (Zf)1e-28918.66CTCF (Zf)1e-25878.16
BORIS (Zf)1e-169511.56BORIS (Zf)1e-158010.90
NF1(CTF)1e-50516.51Jun-AP1 (bZIP)1e-7848.25
Sp1 (Zf)1e-3898.60Fosl2 (bZIP)1e-76911.25
CEBP (bZIP)1e-33414.00Fra1 (bZIP)1e-71816.30
ETS (ETS)1e-2789.70Sp1 (Zf)1e-6398.16
NRF1 (NRF)1e-2763.09BATF (bZIP)1e-62418.32
RFX (HTH)1e-2722.75Mef2d (MADS)1e-5836.00
Rfx2 (HTH)1e-2643.07Atf3 (bZIP)1e-57619.02
Mef2d (MADS)1e-2574.82Mef2c (MADS)1e-50211.63
Motif enrichment in consensus open chromatin. Top ten enriched known binding motifs identified from the merged set of consensus peaks in each species The unique open chromatin landscapes present in different cell types are crucial for regulation of transcription, the products of which ultimately confer cell identity and function. Most consensus peaks (54% in cattle, 58% in pig, and 63% in mouse) were only present in a single tissue (Fig. 2b), suggesting that these regions were involved in tissue-specific regulatory programs. These regions were of particular interest, considering that differentially accessible regions have been associated with higher density of transcription factor (TF) binding sites, hinting at interesting regulatory roles. To stringently identify open chromatin that was specific to a given tissue, only consensus peaks that did not overlap any peaks called from either replicate in any other tissue were considered tissue-specific. In sum, 71,479 peaks in pig, 47,454 peaks in cattle, and 116,700 peaks in mouse (Fig. 3a) were identified as having highly tissue-specific ATAC-seq signal (Fig. 3b). Interestingly, tissues with the greatest number of tissue-specific peaks varied between species. Although cerebellum-specific peaks were numerous compared to the cortex in cattle and pig, the opposite trend was observed in the mouse. Of the remaining tissues, liver-specific peaks were particularly abundant in mouse, whereas lung-specific peaks were prevalent in cattle, and muscle-specific peaks were the most frequent in pig.
Fig. 3

Characterization of tissue-specific consensus open chromatin. a Number of consensus peaks demonstrating tissue-specific accessibility in each tissue and species. b Normalized ATAC-seq signal (RPM) at regions demonstrating tissue-specific accessibility in cattle and pig tissues. Tissue-specific peaks were first grouped by corresponding tissue, then ordered by signal intensity. Peaks were scaled to 500 bp, and signal is shown 500 bp upstream and downstream. c Distribution of tissue-specific open chromatin relative to gene annotations. d Enrichment of known TF binding motifs in tissue-specific open chromatin. Motifs sorted by TF family. Sets of tissue-specific open chromatin for each species are grouped by tissue. Increasing size and color intensity indicate increasing enrichment for a given motif

Characterization of tissue-specific consensus open chromatin. a Number of consensus peaks demonstrating tissue-specific accessibility in each tissue and species. b Normalized ATAC-seq signal (RPM) at regions demonstrating tissue-specific accessibility in cattle and pig tissues. Tissue-specific peaks were first grouped by corresponding tissue, then ordered by signal intensity. Peaks were scaled to 500 bp, and signal is shown 500 bp upstream and downstream. c Distribution of tissue-specific open chromatin relative to gene annotations. d Enrichment of known TF binding motifs in tissue-specific open chromatin. Motifs sorted by TF family. Sets of tissue-specific open chromatin for each species are grouped by tissue. Increasing size and color intensity indicate increasing enrichment for a given motif Whereas all consensus peaks were enriched at TSS (p-value <1e-200) (Fig. 2a, Fig. S5a), tissue-specific peaks coincided with TSS less frequently (Fig. 3c). Although TSS annotation in these species is likely to be incomplete, the lack of tissue-specific peaks near annotated TSS suggests that tissue-specific open chromatin is more likely to delineate enhancers, which are known to demonstrate highly tissue-specific activity, both spatially and temporally. Tissue-specific peaks from cerebellum, cortex, liver, lung, muscle, and spleen were evaluated for motif enrichment in cattle, pig, and mouse. Adipose- and hypothalamus-specific peaks were excluded from this analysis, due to the low number of tissue-specific peaks detected, and lack of hypothalamus data in the mouse. Several TF families demonstrated consistent motif enrichment in particular tissues, such as forkhead box family members, which were enriched in liver- and lung-specific open chromatin in all three species (Fig. 3d). Homeobox motifs also demonstrated consistent enrichment patterns in tissue-specific open chromatin across species; HNF motifs were enriched in liver, NKX3.1 motifs in lung, and MEIS1 and SIX1 motifs in muscle. Brain-specific open chromatin was consistently enriched for motifs of brain-specific TFs (ATOH1, NEUROD1, and OLIG2). Nevertheless, several discrepancies between species were noted in motif enrichment of tissue-specific open chromatin. For instance, bHLH factors MYF5, MYOD, and MYOG motifs were consistently enriched in muscle-specific open chromatin, but only demonstrated enrichment in cortex-specific open chromatin in the mouse. Across species, spleen-specific open chromatin demonstrated little motif enrichment, with the notable exception of ETS, IRF, and RUNT TF family motifs, which were almost exclusively enriched in the pig. Overall, liver-, lung-, and muscle-specific regulatory circuitry appeared to be the most highly conserved across cattle, pig and mouse, whereas brain-specific regulation was more varied between species.

Conservation of chromatin accessibility across mammals

Although extensive epigenetic divergence is expected between species, sequence similarity among cattle, pig, and mouse genomes well above the coding sequence fraction suggests that a significant portion of epigenetic control of transcription is likely under evolutionary constraint. Having observed similarities in motif enrichment in tissue-specific open chromatin between species, it was suspected that the sequence and accessibility of regulatory elements would also be constrained. Reasonably, portions of regulatory elements, such as the motifs that facilitate TF-DNA contacts, would be under selective pressure, especially those that connect tissue-specific transcription factors to conserved tissue-specific expression programs. To identify homologous regions that correspond to regulatory elements, the coordinates of consensus ATAC-seq peaks from each species were projected to the other two species using Ensembl Compara [27], which is largely based on whole-genome pairwise and multiple sequence alignments. Unsurprisingly, considering the smaller size of the mouse genome and the larger relative evolutionary distance between mice and ungulates, more consensus peaks could be mapped between pig and cattle, as opposed to between pig and mouse or cattle and mouse. Overall, about half of peaks were conserved at the sequence level between cattle and pig, whereas only about a third of peaks were conserved at the sequence level between ungulates and mice (Fig. 4a). Moreover, about 40% of accessible regions that could be mapped between pig to cattle were accessible in at least one tissue in both species, whereas only about 30% of accessible regions mapped between mouse and pig, or between mouse and cattle, demonstrated conserved accessibility in at least one tissue (Fig. 4a; Fig. S6). Overall, conservation of open chromatin at specific loci was in line with evolutionary distance. Comparing cattle and pig, which are separated by about 62 million years, about 20% of consensus peaks were conserved in terms of sequence and accessibility, whereas mouse, separated from ungulates by about 96 million years, only shared about 10% of consensus peaks, in terms of sequences and accessibility, with either species (Fig. S7a,b). Additionally, promoter accessibility at homologous regions was considerably more conserved than enhancer accessibility at homologous regions, with almost half of promoter open chromatin in the pig detected in cattle, while only a fifth of all open chromatin in pig was conserved in cattle (Fig. S7a,b).
Fig. 4

Conservation of regulatory element accessibility in cattle, pig, and mouse. a Summary of pairwise open chromatin conservation. Circles reflect number of consensus ATAC-seq peaks in each species. Arrow width reflects proportion of consensus peaks that could be projected to the other species. Lighter section of arrows reflect proportion of regions that could be mapped which demonstrated conserved accessibility in at least one tissue in both species. b Of regions that could be projected to all three species, number of regions with conserved accessibility in all three species and in ungulates laid over a phylogenetic tree reflective of evolutionary distance in millions of years (MYA). c Genomic distribution of regions with conserved accessibility in all three species, relative to the mouse gene annotation. Brief summary of enriched gene ontology (GO) terms for genes marked by conserved open chromatin at their TSS in every tissue. d Consensus peaks, consensus peaks with conserved sequence (that could be mapped to all three species), and consensus peaks with conserved sequence and conserved accessibility in all three species at the MEF2A locus. Tracks show normalized ATAC-seq signal. Conserved promoter open chromatin highlighted in green. Conserved distal (putative enhancer) open chromatin highlighted in yellow

Conservation of regulatory element accessibility in cattle, pig, and mouse. a Summary of pairwise open chromatin conservation. Circles reflect number of consensus ATAC-seq peaks in each species. Arrow width reflects proportion of consensus peaks that could be projected to the other species. Lighter section of arrows reflect proportion of regions that could be mapped which demonstrated conserved accessibility in at least one tissue in both species. b Of regions that could be projected to all three species, number of regions with conserved accessibility in all three species and in ungulates laid over a phylogenetic tree reflective of evolutionary distance in millions of years (MYA). c Genomic distribution of regions with conserved accessibility in all three species, relative to the mouse gene annotation. Brief summary of enriched gene ontology (GO) terms for genes marked by conserved open chromatin at their TSS in every tissue. d Consensus peaks, consensus peaks with conserved sequence (that could be mapped to all three species), and consensus peaks with conserved sequence and conserved accessibility in all three species at the MEF2A locus. Tracks show normalized ATAC-seq signal. Conserved promoter open chromatin highlighted in green. Conserved distal (putative enhancer) open chromatin highlighted in yellow Among the consensus peaks identified in cattle, pig and mouse, 145,801 regions could be mapped to all three species. Of these, 13,735 were consistently accessible in the same tissue (at least one) in all three species, and 30,215 were accessible in at least one tissue in both cattle and pig (Fig. 4b). Regions with conserved accessibility in all three species tended to be ubiquitously present in all sampled tissues. Whereas only 2–4% of consensus peaks were accessible in all tissues when considering a single species, 7% of regions with conserved accessibility in all species were accessible in all tissues (Fig. S7c). Furthermore, regions with conserved accessibility in all three species were heavily enriched around TSS (32% of regions), especially those which were accessible in all tissues (97% of regions), which marked TSS of housekeeping genes (Fig. 4c; Table S2). In contrast, regions that demonstrated conserved accessibility in cattle and pig, but not in mouse, were very rarely accessible in all tissues (only 23 out of 14,543 regions, or 0.2%) (Fig. S7d), and only occurred at TSS 4% of the time (Fig. S7e). Intriguingly, most regions with conserved accessibility in cattle, pig and mouse were only open in one or two tissues (62%) and were predominantly intronic and intergenic (Fig. S7e). For instance, several regions upstream of the MEF2A locus demonstrated muscle-specific accessibility in all three species (Fig. 4d). These regions could represent conserved enhancers, suggesting that even distal regulatory elements are subject to some level of evolutionary constraint. In all, 3105 intergenic loci (6%) demonstrated conserved open chromatin signatures in all three species, and further examination of the genes that were closest to these sites (within 100 kb) revealed functional enrichment for developmental processes, such as regionalization and organogenesis (Table S3). Notably, this small subset of “conserved” enhancers may underrepresent conserved activity at distal loci. For instance, accessible regions around the FOXG1 locus appear to be syntenically conserved across all three species; however, only one region – corresponding to the TSS of a human long non-coding RNA – could be mapped to all three species (Fig. S8). The remaining loci could not be mapped between two species, suggesting a pervasive lack of overall sequence identity, despite apparent functional conservation.

Discussion

Despite the intimate connection between chromatin structure and regulation of transcription, an atlas of chromatin accessibility in livestock tissues had not yet been reported. To address this gap in knowledge, ATAC-seq was used to identify regions of open chromatin in a prioritized set of pig and cattle tissues, yielding a first glimpse at the landscape of active regulatory elements in these genomes. In all, 6 to 7% of the cattle and pig genomes demonstrated accessibility in at least one tissue, which was consistent with a comparable dataset in the mouse [28]. Notably, about half of these accessible sites were intergenic, accounting for about 3% of each genome. The identification of these regulatory elements is a crucial first step towards a comprehensive annotation of the non-coding genome, which has been severely lacking in livestock species [29]. Although efforts are currently underway to further characterize these regulatory elements as enhancers, silencers, insulators, promoters, etc. [29, 30], motif enrichment analysis highlighted some of their potential regulatory roles. By far, the most enriched sequences in cattle and pig open chromatin were CTCF motifs, suggesting pervasive involvement of accessible regions in higher order chromatin organization. In particular, convergently oriented CTCF sites are known to demarcate topologically associated domain (TAD) boundaries [31], which are largely invariable across cell types [32-34] and even across species [32, 33, 35, 36]. Indeed, regions that were globally accessible in all pig and cattle tissues were particularly enriched for CTCF recognition motifs, suggesting that these regions may delineate TAD boundaries, although direct profiling of chromatin interactions will be necessary to provide more definitive annotations of 3D chromatin structure. Interestingly, out of all accessible CTCF motifs, almost a third were only accessible in a single tissue, which is consistent with CTCF binding being highly variable in different cell types [37, 38], even though TAD structure is largely consistent [32-34]. Only a fraction of CTCF binding sites (15%) actually localize to TAD boundaries [39], and most CTCF binding sites are interspersed with enhancers, stabilizing enhancer-promoter interactions [40], and forming cell type-specific chromatin loops linked to differential gene expression [40-42]. Therefore, differentially accessible CTCF motifs may reflect tissue-specific chromatin looping. Taken together, the presence of both tissue-specific and globally accessible CTCF motifs suggests a multi-tiered 3D structure that participates in both fundamental and tissue-specific regulation. Tissue-specific open chromatin was widespread and conspicuously lacking near TSS. Motif enrichment analysis revealed that tissue-specific open chromatin demonstrated conserved enrichment for tissue-specific TF binding motifs, which was expected, as expression programs in vertebrate tissues are thought to be controlled by a highly conserved set of tissue-specific TFs [43]. However, not all TF families demonstrated consistent motif enrichment in tissue-specific open chromatin. The motifs of the RUNX family, highly conserved TFs involved in cell fate determination [44], were only enriched in mouse lung-, pig cerebellum-, and spleen-specific open chromatin. Whether this points to a divergence in tissue-specific regulation remains unclear, as motif enrichment analyses rely heavily on known binding motifs in human and mouse, failing to account for any species-specific differences in TF recognition sites. Certainly, divergent chromatin structure has implications for differential transcriptional regulation, a phenomenon that has been long recognized as a significant contributor to phenotypic diversity [45-49]. As expected, the proportion of open chromatin that was conserved between species was consistent with evolutionary distance – higher concordance was observed between cattle and pig, than between mouse and either cattle or pig. By classifying loci as either proximal or distal based on their closeness to annotated genes, it was also apparent that accessibility at proximal elements, such as promoters, was significantly more conserved than accessibility at distal elements. This discrepancy in functional conservation was not altogether surprising; whereas promoters are fundamental for gene expression in any context, modulation of enhancer activity can subtly alter phenotypes without compromising viability [50, 51]. In fact, enhancers are known to evolve rapidly, and several studies have demonstrated how changes to enhancer sequences can lead to differing phenotypes between species [52-55]. Indeed, only 17% of intergenic open chromatin in cattle was also accessible in pig, and a meager 6% was accessible in mice, indicating that enhancers are largely species-specific, as has been previously demonstrated [3, 46]. Nevertheless, more than 3000 intergenic loci, relative to gene annotations in the mouse, had a conserved open chromatin signature in at least one tissue in all three species. Considering some highly conserved enhancers have been implicated in core biological processes, such as embryonic development [56], these intergenic loci are suspected to be involved in fundamental biological processes in adult tissues, which would account for their abnormal sequence constraint and functional conservation. Intriguingly, several loci appeared to share open chromatin signatures based on synteny, despite lack of sequence conservation. Several studies have demonstrated that enhancer function can be conserved even when overall sequence is not [57-59]. Instead, selective pressure may only operate on the functional components of regulatory elements: TF binding sites, which are typically short and degenerate sequences [45, 46, 52]. Although inferring sequence conservation is possible with sequences as short as 36 bp [45], detecting homologous regulatory regions based only on conserved TF binding sites (6–12 bp [60]) is problematic. This begs a pragmatic question in the field of comparative epigenomics: if orthologous regions cannot be determined based on sequence, how then can we determine whether function is conserved? If TF binding sites are all that is required for enhancer function, then most enhancers would not be conserved in the canonical sense of sequence constraint, but instead through TF binding and relative positioning.

Conclusions

To our knowledge, these data constitute the first atlas of chromatin accessibility in a common set of livestock tissues, and consequently a first look at the distribution across multiple tissues of active regulatory elements in the pig and cattle genomes. Moreover, this initial annotation of the non-coding genome will help to inform the identification of causal variants for disease and production traits. From the standpoint of comparative epigenomics, these data contribute to the ever-growing wealth of epigenomic information; the comprehensive analysis of which will undoubtedly help bridge the gap between genome and phenome, providing crucial insight into transcriptional regulation and its connection to evolution.

Methods

Tissue collection and cryopreservation

All necessary permissions were obtained for collection of tissues relevant to this study, following the Protocol for Animal Care and Use #18464, as per the University of California Davis Animal Care and Use Committee (IACUC). As described previously [61], two intact male Line 1 Herefords, provided by Fort Keogh Livestock and Range Research lab, were euthanized by captive bolt under USDA inspection at the University of California, Davis. Both cattle were 14 months of age and shared the same sire. Two castrated male Yorkshire pigs were humanely euthanized by animal electrocution followed by exsanguination, which is the standard method of euthanasia at pig slaughterhouses, under USDA inspection at Michigan State University Meat Lab. Pigs were littermates aged 6 months old, and sourced from the Michigan State University Swine Teaching and Research Center. From each animal, subcutaneous adipose, frontal cortex, cerebellum, hypothalamus, liver, lung (left lobe), longissimus dorsi muscle, and spleen were collected and promptly processed for cryopreservation. For each sample, roughly one gram of fresh tissue was minced and transferred to 10 mL of ice-cold sucrose buffer (250 mM D-Sucrose, 10 mM Tris-HCl (pH 7.5), 1 mM MgCl2; 1 protease inhibitor tablet per 50 mL solution just prior to use). Minced tissue was twice homogenized using the gentleMACS dissociator “E.01c Tube” program. Homogenate was filtered with the 100 μm Steriflip vacuum filter system, volume was brought up to 9.9 mL with sucrose buffer, and 1.1 mL DMSO was added to achieve a 10% final concentration. Preparations were aliquoted into cryovials and frozen at − 80 °C overnight in Nalgene Cryo 1 °C freezing containers, then stored at − 80 °C long-term.

ATAC-seq library construction and sequencing

A modified ATAC-seq protocol compatible with cryopreserved tissue samples was employed [62]. Cryopreserved tissue samples were thawed on ice, then centrifuged for 5 min at 500 rcf and 4 °C in a centrifuge with a swinging bucket rotor. Pellets were resuspended in 1 mL ice-cold PBS, and centrifuged again for 5 min at 500 rcf and 4 °C. Pellets were then resuspended in 1 mL ice-cold freshly-made ATAC-seq cell lysis buffer (10 mM Tris-HCl pH = 7.4, 10 mM NaCl, 3 mM MgCl2, 0.1% (v/v) IGEPAL CA-630), and centrifuged for 10 min at 500 rcf and 4 °C. Pellets were then resuspended again in ice-cold PBS for cell counting on a hemocytometer. Between 50,000 and 1,000,000 cells were aliquoted for library preparation, depending on tissue, success of previous library preparation attempts, and cell abundance in a given preparation (Table S1). Aliquoted cells were centrifuged once more for 5 min at 500 rcf and 4 °C, and pellets were resuspended in 50 μL transposition mix (22.5 μL nuclease-free H2O, 25 μL TD buffer and 2.5 μL TDE1 enzyme from Nextera DNA Library Prep Kit (Illumina, cat. no. FC-121-1030)). Nuclear pellets were incubated with transposition mix for 60 min at 37 °C, shaking at 300 rpm. Transposed DNA was purified with the MinElute PCR Purification Kit (Qiagen, cat. no. 28004) and eluted in 10 μL Buffer EB. Eluted DNA was added to 40 μL PCR master mix (25.4 μL SsoFast™ EvaGreen® Supermix, 13 μL nuclease-free H2O, 0.8 μL 25 μM Primer 1, 0.8 25 μM Primer 2 (see Table S4 for sequences)) to 10 μL eluted DNA and PCR cycled (1 x [5 min at 72 °C, 30 s at 98 °C], 10-13x [10 s at 98 °C, 30 s at 63 °C, 1 min at 72 °C]). Libraries were then purified again with the MinElute PCR Purification Kit, and eluted in 10 μL Buffer EB. Libraries were quantified by Qubit (Thermo Fisher Scientific, Inc., Waltham, MA), and checked for nucleosomal laddering using a Bioanalyzer High Sensitivity DNA Chip (Agilent Technologies, Santa Clara, CA) (Figs. S9 and S10). Details for individual library preparations, including cell input, PCR cycles, and concentrations can be found in Table S1. Finally, libraries were size selected for subnucleosomal length fragments (150–250 bp) on the PippinHT system using a 3% agarose cassette (Sage Science, Beverly, MA). Size selection and DNA concentration were evaluated with a Bioanalyzer High Sensitivity DNA Chip, pooled, and submitted for sequencing on the NextSeq 500 platform to generate 40 bp paired end reads.

ATAC-seq data preprocessing and quality evaluation

Low quality bases and residual adapter sequences were trimmed from raw sequencing data using Trim Galore! (v0.4.0), a wrapper around Cutadapt (v1.12) [63], with the options “-a CTGTCTCTTATA” and “-length 10” to retain trimmed reads at least 10 bp in length. Trimmed reads were then aligned to either the susScrofa11 (pig), ARS-UCD1.2 (cattle), or GRCm38 (mouse) genome assemblies using BWA mem with default settings (v0.7.17) [64]. Duplicate alignments were removed with Picard-Tools (v2 .9.1), and mitochondrial and low quality (q < 15) alignments were removed using SAMtools (v1.9) [65]. Finally, broad peaks were called using MACS2 (v2.1.1) [66] with options “-q 0.05 -B –broad –nomodel –shift -100 –extsize 200.” For all reported statistics, standard error is also reported. Quality metrics were calculated as follows. The non-redundant read fraction (NRF) was calculated by dividing the number of nonduplicate uniquely mapping reads out of all mapped reads. The Fraction of Reads in Peaks (FRiP) score for each sample was calculated using the plotEnrichment function from the deepTools suite (v.3.2.0), given the broad peaks identified by MACS2. Finally, the synthetic Jensen-Shannon distance (sJSD) was calculated using the deepTools plotFingerprint function. For visualization, genome-wide ATAC-seq signal was normalized by RPKM in 50 bp windows using the bamCoverage function from the deepTools suite. Other deepTools functions were used along with the resulting bigwig files to generate [1] pairwise scatter plots of genome-wide signal, including Spearman correlation coefficients (plotCorrelation –log1p –removeOutliers), [2] principal components analyses of signal in consensus peaks (plotPCA –transpose –log2), and signal at loci of interest (plotHeatmap), such as TSS (computeMatrix reference-point –beforeRegionStartLength 2000 –afterRegionStartLength 2000 –skipZeros) and peaks (computeMatrix scale-regions –beforeRegionStartLength 500 –regionBodyLength 500 –afterRegionStartLength 500 –skipZeros). Normalized signal from bigWig files was visualized at specific loci with the UCSC Genome Browser, limiting the y-axis range to 20 and displaying the mean in smoothing windows of 4 bases.

Identification of consensus and tissue-specific open chromatin

To obtain a single comprehensive set of consensus ATAC-seq peaks for each species, regions that were identified as peaks in both biological replicates of a tissue were first identified, and then collapsed. Specifically, for each tissue peaks from biological replicates were compared with BEDtools intersect (v2.26.0) [67] to identify intersecting regions that were consistently identified as peaks. In the case of cattle cerebellum, for which only one biological replicate was available, robust peaks were identified by calling broad peaks more stringently with MACS2 (−q 0.01 -B –broad –nomodel –shift − 100 –extsize 200). Regions that were called as ATAC-seq peaks in both biological replicates (or which were especially robust, in the case of cattle cerebellum) were then collapsed with BEDtools merge [67] (v2.26.0) to obtain a single comprehensive set of “consensus” peaks that accounted for accessibility in any of the eight tissues. Tissue-specific peaks were defined as consensus peaks that were [1] identified as peaks in both biological replicates of the tissue in question, and [2] not detected as accessible in either biological replicate of any other tissue. These comparisons were conducted using BEDtools intersect to compare consensus peaks with broad peaks called from individual biological replicates.

Categorization of peaks by location relative to gene annotations

Peaks were categorized by position relative to features in the Ensembl annotations (v96) for each species using BEDtools intersect with default settings. Because many peaks overlap multiple features, peaks were first classified as TSS (within 50 bp), then as promoters (within 2 kb upstream of TSS), as transcription termination sites (TTS; within 50 bp), as overlapping a 5′ untranslated region (UTR), as overlapping a 3′ UTR, as exonic, as intronic, and finally, if peaks did not overlap any of these features, they were considered to be intergenic. To determine if peaks were enriched near genomic features, peak locations were iteratively randomized 100 times using BEDtools shuffle, excluding “unmappable” genomic regions to avoid bias. Unmappable regions were empirically defined as any 500 bp window to which no read mapped in any library. Randomized peaks were also categorized by position relative to gene annotations, and compared to the localization of the actual peak set using a one-sample T-test (one-tailed).

Motif enrichment analysis

Regions of interest were evaluated for motif enrichment using the HOMER findMotifs.pl function (v4.8) [68], and the top ten enriched known motifs, based on p-values, were reported.

Conservation of open chromatin

All interspecies comparisons were based on the 46-mammalian Enredo-Pecan-Ortheus (EPO) multiple sequence alignment (MSA) available through Ensembl Compara (v99) [27]. Regions of consensus open chromatin in each species were projected onto the other two species using the Ensembl Compara Application Programming Interface (API). For simplicity, regions that mapped to multiple loci in another species were discarded prior to evaluating whether accessibility was conserved. Chromatin accessibility was considered to be conserved at homologous regions if they overlapped (by at least 1 bp) consensus open chromatin in the same tissue (at least one tissue) in all species in question.

Functional annotation enrichment analysis

Ensembl IDs were converted to external gene names using the BiomaRt package, and these were submitted to DAVID (v6.8) [69, 70] for functional annotation clustering. Mus musculus was used as background, and functional annotation clustering was conducted on medium stringency for the following terms: GOTERM_BP_5, GOTERM_CC_DIRECT, GOTERM_MF_DIRECT, BIOCARTA, and KEGG_PATHWAY. For each gene set, the top four clusters were reported. Additional file 1 Figure S1. Correlation of ATAC-seq signal in select technical replicates ATAC-seqlibraries. A) Pearson correlation of genome-wide signal (RPKM) in 500 bp windows. B) PCA of Cortex A technical replicate libraries alongside all biological replicates. C) PCA of pig technical replicate libraries alongside all biological replicates. D) Signal of cattle cortex technical and biological replicates at the STMN4 locus. E) Signal of pig cerebellum and hypothalamus technical and biological replicates at the STMN4 locus. Figure S2. Correlation of ATAC-seq signal in biological replicates. Scatterplots showing Pearson correlation of normalized genome-wide signal in 500 bp windows between biological replicates for cattle and pig tissues. Figure S3. ATAC-seq signal at TSS. Heatmaps depicting normalized ATAC-seq signal in the proximity of TSS, including 2 kb upstream and downstream, with TSS sorted by signal intensity. Figure S4. PCA of normalized ATAC-seq signal in consensus open chromatin identified in pig and cattle tissues tissues. Principal components 1, 2 and 3 are included to better visualize clustering of tissues. Figure S5. Mouse open chromatin localization and differential accessibility. A) Distribution of mouse consensus open chromatin relative to the Ensembl gene annotation (v96). B) Distribution of consensus peak activity, ranging from tissue-specific (accessible in only one tissue) to ubiquitous (accessible in all sampled tissues). Consensus peaks that were accessible in a single tissue were further broken down by tissue. Figure S6. Conservation of open chromatin in individual tissues. Titles above bar plots indicate the species that consensus peaks were identified in, followed by the species to which the consensus peak coordinates were projected to evaluate accessibility conservation in the corresponding tissue. Figure S7. Characterization of conserved open chromatin. Proportion of all consensus peaks, promoter consensus peaks (within 2 kb upstream and 50 bp downstream of TSS), and intergenic consensus peaks that were identified in (A) cattle or (B) pig that demonstrated both conserved sequence and accessibility in the other two species. Number of tissues in which consensus peaks demonstrated conserved accessibility in (C) all three species or (D) only in cattle and pig. E) Distribution of consensus peaks with conserved accessibility in cattle, pig, and mouse, relative to the mouse gene annotation (Ensembl v96). Figure S8. Positional conservation of chromatin accessibility at the FOXG1 locus. For each species, consensus peaks, consensus peaks with conserved sequence (that could be mapped to all three species), and consensus peaks with conserved accessibility are shown. Tracks show normalized ATAC-seq signal for each sample. Conserved promoter open chromatin is highlighted in green. Consensus peaks that appear to be syntenically conserved, relative to FOXG1, but which could not be mapped between species, are highlighted in grey. Figure S9. Bioanalyzer traces of cattle ATAC-seq libraries prior to size selection. Bioanalyzer traces were used to check for nucleosomal laddering. Size selection removed excess primer and fragments > 250 bp. Figure S10. Bioanalyzer traces of pig ATAC-seq libraries prior to size selection. Bioanalyzer traces were used to check for nucleosomal laddering. Size selection removed excess primer and fragments > 250 bp. Table S1. ATAC-seq library construction details. For each library constructed, rounds of PCR amplification, number of cells used as input, and concentration in the 150–250 bp range prior to size selection are indicated. Table S2. Functional annotation clustering of genes with conserved and global TSS accessibility. Genes with accessible TSS (± 50 bp) in all profiled tissues in all species were subjected to functional annotation clustering to identify enriched cellular functions. Top four clusters reported. Table S3. Functional annotation clustering of genes near conserved intergenic open chromatin. Genes that were closest (within 100 kb) to intergenic open chromatin that was conserved in all three species were subjected to functional annotation clustering to identify enriched cellular functions. Top four clusters reported. Table S4. ATAC-seq oligos used for PCR. Sequences have been previously described by Buenrostro et al., 2013. Primers 2A-2X contain variable barcodes which permit library pooling prior to sequencing, and which were used to demultiplex sequencing data. Additional file 2. Cattle ATAC-seq peaks. Genomic locations of ATAC-seq peaks that were called for each tissue in each replicate. Additional file 3. Pig ATAC-seq peaks. Genomic locations of ATAC-seq peaks that were called for each tissue in each replicate.
  69 in total

1.  Cell type specificity of chromatin organization mediated by CTCF and cohesin.

Authors:  Chunhui Hou; Ryan Dale; Ann Dean
Journal:  Proc Natl Acad Sci U S A       Date:  2010-02-02       Impact factor: 11.205

Review 2.  Introduction to the Thematic Minireview Series: Chromatin and transcription.

Authors:  Joel M Gottesfeld; Michael F Carey
Journal:  J Biol Chem       Date:  2018-08-01       Impact factor: 5.157

3.  The Sequence Alignment/Map format and SAMtools.

Authors:  Heng Li; Bob Handsaker; Alec Wysoker; Tim Fennell; Jue Ruan; Nils Homer; Gabor Marth; Goncalo Abecasis; Richard Durbin
Journal:  Bioinformatics       Date:  2009-06-08       Impact factor: 6.937

4.  Extensive evolutionary changes in regulatory element activity during human origins are associated with altered gene expression and positive selection.

Authors:  Yoichiro Shibata; Nathan C Sheffield; Olivier Fedrigo; Courtney C Babbitt; Matthew Wortham; Alok K Tewari; Darin London; Lingyun Song; Bum-Kyu Lee; Vishwanath R Iyer; Stephen C J Parker; Elliott H Margulies; Gregory A Wray; Terrence S Furey; Gregory E Crawford
Journal:  PLoS Genet       Date:  2012-06-28       Impact factor: 5.917

5.  Decoding the regulatory landscape of melanoma reveals TEADS as regulators of the invasive cell state.

Authors:  Annelien Verfaillie; Hana Imrichova; Zeynep Kalender Atak; Michael Dewaele; Florian Rambow; Gert Hulselmans; Valerie Christiaens; Dmitry Svetlichnyy; Flavie Luciani; Laura Van den Mooter; Sofie Claerhout; Mark Fiers; Fabrice Journe; Ghanem-Elias Ghanem; Carl Herrmann; Georg Halder; Jean-Christophe Marine; Stein Aerts
Journal:  Nat Commun       Date:  2015-04-09       Impact factor: 14.919

6.  Ensembl comparative genomics resources.

Authors:  Javier Herrero; Matthieu Muffato; Kathryn Beal; Stephen Fitzgerald; Leo Gordon; Miguel Pignatelli; Albert J Vilella; Stephen M J Searle; Ridwan Amode; Simon Brent; William Spooner; Eugene Kulesha; Andrew Yates; Paul Flicek
Journal:  Database (Oxford)       Date:  2016-05-02       Impact factor: 3.451

7.  Genome-wide target analysis of NEUROD2 provides new insights into regulation of cortical projection neuron migration and differentiation.

Authors:  Efil Bayam; Gulcan Semra Sahin; Gizem Guzelsoy; Gokhan Guner; Alkan Kabakcioglu; Gulayse Ince-Dunn
Journal:  BMC Genomics       Date:  2015-09-05       Impact factor: 3.969

8.  Genome-wide nucleosome specificity and function of chromatin remodellers in ES cells.

Authors:  Maud de Dieuleveult; Kuangyu Yen; Isabelle Hmitou; Arnaud Depaux; Fayçal Boussouar; Daria Bou Dargham; Sylvie Jounier; Hélène Humbertclaude; Florence Ribierre; Céline Baulard; Nina P Farrell; Bongsoo Park; Céline Keime; Lucie Carrière; Soizick Berlivet; Marta Gut; Ivo Gut; Michel Werner; Jean-François Deleuze; Robert Olaso; Jean-Christophe Aude; Sophie Chantalat; B Franklin Pugh; Matthieu Gérard
Journal:  Nature       Date:  2016-01-27       Impact factor: 49.962

9.  Dynamics of Transcription Factor Binding Site Evolution.

Authors:  Murat Tuğrul; Tiago Paixão; Nicholas H Barton; Gašper Tkačik
Journal:  PLoS Genet       Date:  2015-11-06       Impact factor: 5.917

10.  Genome-wide identification of tissue-specific long non-coding RNA in three farm animal species.

Authors:  Colin Kern; Ying Wang; James Chitwood; Ian Korf; Mary Delany; Hans Cheng; Juan F Medrano; Alison L Van Eenennaam; Catherine Ernst; Pablo Ross; Huaijun Zhou
Journal:  BMC Genomics       Date:  2018-09-18       Impact factor: 3.969

View more
  10 in total

1.  Profiling of open chromatin in developing pig (Sus scrofa) muscle to identify regulatory regions.

Authors:  Mazdak Salavati; Shernae A Woolley; Yennifer Cortés Araya; Michelle M Halstead; Claire Stenhouse; Martin Johnsson; Cheryl J Ashworth; Alan L Archibald; Francesc X Donadeu; Musa A Hassan; Emily L Clark
Journal:  G3 (Bethesda)       Date:  2022-02-04       Impact factor: 3.542

2.  The landscape of chromatin accessibility in skeletal muscle during embryonic development in pigs.

Authors:  Jingwei Yue; Xinhua Hou; Xin Liu; Ligang Wang; Hongmei Gao; Fuping Zhao; Lijun Shi; Liangyu Shi; Hua Yan; Tianyu Deng; Jianfei Gong; Lixian Wang; Longchao Zhang
Journal:  J Anim Sci Biotechnol       Date:  2021-05-03

Review 3.  Decoding the Equine Genome: Lessons from ENCODE.

Authors:  Sichong Peng; Jessica L Petersen; Rebecca R Bellone; Ted Kalbfleisch; N B Kingsley; Alexa M Barber; Eleonora Cappelletti; Elena Giulotto; Carrie J Finno
Journal:  Genes (Basel)       Date:  2021-10-27       Impact factor: 4.096

4.  Chromatin accessibility and regulatory vocabulary across indicine cattle tissues.

Authors:  Pâmela A Alexandre; Marina Naval-Sánchez; Moira Menzies; Loan T Nguyen; Laercio R Porto-Neto; Marina R S Fortes; Antonio Reverter
Journal:  Genome Biol       Date:  2021-09-21       Impact factor: 13.583

5.  Inferring mammalian tissue-specific regulatory conservation by predicting tissue-specific differences in open chromatin.

Authors:  Irene M Kaplow; Daniel E Schäffer; Morgan E Wirthlin; Alyssa J Lawler; Ashley R Brown; Michael Kleyman; Andreas R Pfenning
Journal:  BMC Genomics       Date:  2022-04-11       Impact factor: 4.547

6.  Transcriptional states and chromatin accessibility during bovine myoblasts proliferation and myogenic differentiation.

Authors:  Qian Li; Yahui Wang; Xin Hu; Yapeng Zhang; Hongwei Li; Qi Zhang; Wentao Cai; Zezhao Wang; Bo Zhu; Lingyang Xu; Xue Gao; Yan Chen; Huijiang Gao; Junya Li; Lupei Zhang
Journal:  Cell Prolif       Date:  2022-04-01       Impact factor: 8.755

7.  Butyrate Induces Modifications of the CTCF-Binding Landscape in Cattle Cells.

Authors:  Clarissa Boschiero; Yahui Gao; Ransom L Baldwin; Li Ma; Cong-Jun Li; George E Liu
Journal:  Biomolecules       Date:  2022-08-25

8.  Differentially CTCF-Binding Sites in Cattle Rumen Tissue during Weaning.

Authors:  Clarissa Boschiero; Yahui Gao; Ransom L Baldwin; Li Ma; Cong-Jun Li; George E Liu
Journal:  Int J Mol Sci       Date:  2022-08-13       Impact factor: 6.208

9.  Successful ATAC-Seq From Snap-Frozen Equine Tissues.

Authors:  Sichong Peng; Rebecca Bellone; Jessica L Petersen; Theodore S Kalbfleisch; Carrie J Finno
Journal:  Front Genet       Date:  2021-06-16       Impact factor: 4.599

10.  Characterization of Accessible Chromatin Regions in Cattle Rumen Epithelial Tissue during Weaning.

Authors:  Clarissa Boschiero; Yahui Gao; Ransom L Baldwin; Li Ma; George E Liu; Cong-Jun Li
Journal:  Genes (Basel)       Date:  2022-03-18       Impact factor: 4.096

  10 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.