Robert Landick1, Azra Krek2, Michael S Glickman3, Nicholas D Socci2, Christina L Stallings4.
Abstract
CarD is an essential mycobacterial protein that binds the RNA polymerase (RNAP) and affects the transcriptional profile of Mycobacterium smegmatis and Mycobacterium tuberculosis (6). We predicted that CarD was directly regulating RNAP function, but our prior experiments had not determined at what stage of transcription CarD was functioning or at which genes CarD interacted with the RNAP. To begin to address these open questions, we performed chromatin immunoprecipitation sequencing (ChIP-seq) to survey the distribution of CarD throughout the M. smegmatis chromosome. The distributions of RNAP subunits β and σA were also profiled. Based on work in Escherichia coli (3), we expected that RNAP β would be present throughout transcribed regions and that RNAP σA would be predominantly enriched at promoters; however, this had yet to be determined in mycobacteria. The ChIP-seq analyses revealed that CarD was never present on the genome in the absence of RNAP, was primarily associated with promoter regions, and was highly correlated with the distribution of RNAP σA. The colocalization of σA and CarD led us to propose that, in vivo, CarD associates with RNAP initiation complexes at most promoters and is therefore a global regulator of transcription initiation. Here we describe in detail the data from the ChIP-seq experiments associated with the study published by Srivastava and colleagues in the Proceedings of the National Academy of Sciences in 2013 (5), and we discuss the findings from this dataset in relation to both CarD and mycobacterial transcription as a whole. The ChIP-seq data have been deposited in the Gene Expression Omnibus (GEO) database, www.ncbi.nlm.nih.gov/geo (accession no. GSE48164).
Year: 2014 PMID: 25089258 PMCID: PMC4115788 DOI: 10.1016/j.gdata.2014.05.012
Source DB: PubMed Journal: Genom Data ISSN: 2213-5960
Number of sequencing reads for each sample from the AB SOLiD 4 high-throughput genome sequencer set to a 50 bp read length.
| Sample | # of reads | # of mapped reads | % mapped reads |
|---|---|---|---|
| CarD-HA-1 | 24,988,001 | 16,452,015 | 65.84% |
| RNAP β-1 | 24,145,461 | 16,249,329 | 67.30% |
| Unfused HA-1 | 27,153,580 | 17,194,808 | 63.32% |
| CarD-HA-2 | 9,323,217 | 7,097,095 | 76.12% |
| RNAP β-2 | 12,709,226 | 9,660,422 | 76.01% |
| RNAP σA-2 | 19,596,174 | 14,868,445 | 75.87% |
| Unfused HA-2 | 11,641,903 | 8,015,559 | 68.85% |
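The percentage column in the read-count table is the mapped read count divided by the total read count. As a minimal sketch, the arithmetic can be checked as follows; the counts are copied from the table, while the function name `percent_mapped` and the ASCII sample labels are illustrative choices, not part of the original analysis.

```python
# Read counts copied from the table above: (total reads, mapped reads).
# Sample labels spell out beta and sigmaA in ASCII (an illustrative choice).
read_counts = {
    "CarD-HA-1": (24_988_001, 16_452_015),
    "RNAP-beta-1": (24_145_461, 16_249_329),
    "Unfused-HA-1": (27_153_580, 17_194_808),
    "CarD-HA-2": (9_323_217, 7_097_095),
    "RNAP-beta-2": (12_709_226, 9_660_422),
    "RNAP-sigmaA-2": (19_596_174, 14_868_445),
    "Unfused-HA-2": (11_641_903, 8_015_559),
}


def percent_mapped(total_reads, mapped_reads):
    """Return mapped reads as a percentage of total reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * mapped_reads / total_reads


for sample, (total, mapped) in read_counts.items():
    print(f"{sample}: {percent_mapped(total, mapped):.2f}%")
```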
Pearson correlations of the genomic coverage profiles of each pair of samples. The bolded numbers show the correlation between the distributions of individual replicates for a single immunoprecipitation condition.
| | CarD-HA-1 | RNAP β-1 | Unfused HA-1 | CarD-HA-2 | RNAP β-2 | RNAP σA-2 | Unfused HA-2 |
|---|---|---|---|---|---|---|---|
| CarD-HA-1 | 1.00 | 0.71 | 0.57 | | 0.71 | 0.89 | 0.62 |
| RNAP β-1 | | 1.00 | 0.71 | 0.63 | | 0.60 | 0.76 |
| Unfused HA-1 | | | 1.00 | 0.41 | 0.65 | 0.36 | |
| CarD-HA-2 | | | | 1.00 | 0.70 | 0.95 | 0.52 |
| RNAP β-2 | | | | | 1.00 | 0.66 | 0.80 |
| RNAP σA-2 | | | | | | 1.00 | 0.47 |
| Unfused HA-2 | | | | | | | 1.00 |
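A Pearson correlation between two coverage profiles treats each profile as a vector of per-position (or per-bin) read depths. The sketch below shows the calculation on two short made-up profiles; the source does not state the bin size or the software used, so the function `pearson` and the toy values are assumptions for illustration only.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y):
        raise ValueError("profiles must have the same length")
    if len(x) < 2:
        raise ValueError("need at least two positions")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of centered products and centered squares.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-bin coverage values for two samples (not real data).
profile_a = [1, 4, 9, 3, 1, 0]
profile_b = [2, 5, 8, 4, 1, 1]
print(round(pearson(profile_a, profile_b), 3))
```

Applying such a function to every pair of samples would yield a symmetric matrix, which is why only the upper triangle is reported.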
Average Pearson correlations of the genomic coverage profiles for each immunoprecipitation condition examined. Each immunoprecipitation was performed in duplicate, except σA, which was performed once. Each value is the average of the pairwise correlations between the replicates of the two conditions. The bolded number shows the correlation between the distribution of CarD-HA and the distribution of RNAP σA.
| | HA | CarD-HA | RNAP β | RNAP σA |
|---|---|---|---|---|
| HA | 0.934 | 0.530 | 0.730 | 0.417 |
| CarD-HA | | 0.962 | 0.687 | |
| RNAP β | | | 0.954 | 0.629 |
| RNAP σA | | | | 1.000 |
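The between-condition averages can be approximately reproduced from the per-sample correlation table by averaging every replicate pair for the two conditions. The sketch below does this with the two-decimal values from that table, so results agree with the three-decimal averages only to within rounding; the dictionary layout, the ASCII sample labels, and the function `average_correlation` are illustrative assumptions.

```python
from itertools import product
from statistics import mean

# Between-condition correlations from the per-sample table (two decimals).
# Labels spell out beta and sigmaA in ASCII (an illustrative choice).
pairwise = {
    frozenset({"CarD-HA-1", "RNAP-beta-1"}): 0.71,
    frozenset({"CarD-HA-1", "HA-1"}): 0.57,
    frozenset({"CarD-HA-1", "RNAP-beta-2"}): 0.71,
    frozenset({"CarD-HA-1", "RNAP-sigmaA-2"}): 0.89,
    frozenset({"CarD-HA-1", "HA-2"}): 0.62,
    frozenset({"RNAP-beta-1", "HA-1"}): 0.71,
    frozenset({"RNAP-beta-1", "CarD-HA-2"}): 0.63,
    frozenset({"RNAP-beta-1", "RNAP-sigmaA-2"}): 0.60,
    frozenset({"RNAP-beta-1", "HA-2"}): 0.76,
    frozenset({"HA-1", "CarD-HA-2"}): 0.41,
    frozenset({"HA-1", "RNAP-beta-2"}): 0.65,
    frozenset({"HA-1", "RNAP-sigmaA-2"}): 0.36,
    frozenset({"CarD-HA-2", "RNAP-beta-2"}): 0.70,
    frozenset({"CarD-HA-2", "RNAP-sigmaA-2"}): 0.95,
    frozenset({"CarD-HA-2", "HA-2"}): 0.52,
    frozenset({"RNAP-beta-2", "RNAP-sigmaA-2"}): 0.66,
    frozenset({"RNAP-beta-2", "HA-2"}): 0.80,
    frozenset({"RNAP-sigmaA-2", "HA-2"}): 0.47,
}

# Replicates available for each immunoprecipitation condition.
replicates = {
    "HA": ["HA-1", "HA-2"],
    "CarD-HA": ["CarD-HA-1", "CarD-HA-2"],
    "RNAP-beta": ["RNAP-beta-1", "RNAP-beta-2"],
    "RNAP-sigmaA": ["RNAP-sigmaA-2"],
}


def average_correlation(cond_a, cond_b):
    """Average the correlations over all replicate pairs of two different conditions."""
    if cond_a == cond_b:
        raise ValueError("this sketch covers between-condition averages only")
    pairs = product(replicates[cond_a], replicates[cond_b])
    return mean(pairwise[frozenset(pair)] for pair in pairs)


print(f"HA vs CarD-HA: {average_correlation('HA', 'CarD-HA'):.3f}")
print(f"HA vs RNAP-beta: {average_correlation('HA', 'RNAP-beta'):.3f}")
print(f"CarD-HA vs RNAP-beta: {average_correlation('CarD-HA', 'RNAP-beta'):.3f}")
```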
Fig. 1. Normalized log2 of ChIP-seq reads from M. smegmatis DNA co-immunoprecipitated with RNAP β, RNAP σA, or CarD-HA. Protein–DNA complexes containing CarD-HA, RNAP β, and RNAP σA were immunoprecipitated from M. smegmatis lysates. The co-precipitated DNA was sequenced, and the number of sequence reads per bp was normalized to total reads per sample and expressed as a log2 value. Normalized reads per base pair from DNA precipitated from cells expressing only the HA epitope were used as background and subtracted from the other samples. Shown are the aggregate profiles averaged over 62 highly active transcription units with the 0 designating the estimated transcriptional start sites. The 62 transcription units were selected on the basis of high signal and isolation from surrounding transcription units.
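The processing described in the caption (normalize per-bp reads to total reads, take log2, subtract the HA-only background, then average windows aligned at transcriptional start sites) can be sketched as below. This is only an illustration under stated assumptions: the pseudocount, the reads-per-million scaling, the window size, the forward-strand-only handling, and all function names are hypothetical and are not specified in the source.

```python
import math


def normalized_log2(coverage, total_reads, pseudocount=1.0, scale=1e6):
    """log2 of per-bp coverage scaled by total reads.

    The pseudocount (avoids log2 of zero) and the per-million scale
    are assumptions, not values taken from the source.
    """
    return [math.log2((c + pseudocount) * scale / total_reads) for c in coverage]


def subtract_background(signal, background):
    """Subtract the background track from the signal track, position by position."""
    if len(signal) != len(background):
        raise ValueError("tracks must have the same length")
    return [s - b for s, b in zip(signal, background)]


def aggregate_profile(track, tss_positions, upstream, downstream):
    """Average the track over windows centered on each start site.

    Windows that run off either end of the track are skipped.
    Strand is ignored here (forward strand assumed).
    """
    windows = []
    for tss in tss_positions:
        start, end = tss - upstream, tss + downstream + 1
        if start < 0 or end > len(track):
            continue
        windows.append(track[start:end])
    if not windows:
        raise ValueError("no complete windows available")
    width = upstream + downstream + 1
    return [sum(w[i] for w in windows) / len(windows) for i in range(width)]


# Toy example with made-up coverage and start sites (not real data).
chip = normalized_log2([0, 3, 7, 3, 0, 1, 7, 15, 7, 1], total_reads=2_000_000)
ha_only = normalized_log2([0, 1, 1, 1, 0, 1, 1, 1, 1, 1], total_reads=2_000_000)
enrichment = subtract_background(chip, ha_only)
print(aggregate_profile(enrichment, tss_positions=[2, 7], upstream=1, downstream=1))
```

In this sketch position 0 of the figure corresponds to the middle element of each window.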
| Specifications | |
|---|---|
| Sample and organism | Genomic DNA from |
| Sequencer | AB SOLiD 4 system high-throughput genome sequencer |
| Data format | Raw data: sra files, normalized data: wig, SOFT, MINiML, and TXT files |
| Experimental factors | In the |
| Experimental features | All |