Derek M Bickhart, George E Liu.
Abstract
A resource that provides candidate transcription factor binding sites (TFBSs) does not currently exist for cattle. Such data are necessary, as predicted sites may serve as excellent starting locations for future omics studies to develop transcriptional regulation hypotheses. To generate this resource, we employed a phylogenetic footprinting approach (using sequence conservation across cattle, human and dog) together with position-specific scoring matrices to identify 379,333 putative TFBSs upstream of nearly 8000 Mammalian Gene Collection (MGC) annotated genes within the cattle genome. Comparisons of our predictions to known binding site loci within the PCK1, ACTA1 and G6PC promoter regions revealed 75% sensitivity for our method of discovery. Additionally, we intersected our predictions with known cattle SNP variants in dbSNP and on the Illumina BovineHD 770k and Bos 1 SNP chips, finding 7534, 444 and 346 overlaps, respectively. Due to our stringent filtering criteria, these results represent high quality predictions of putative TFBSs within the cattle genome. All binding site predictions are freely available at http://bfgl.anri.barc.usda.gov/BovineTFBS/ or http://199.133.54.77/BovineTFBS.
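As a rough illustration of the position-specific scoring matrix step mentioned in the abstract, the following minimal sketch scans a sequence with a log-odds matrix. It is not TFLOC and does not include the cross-species conservation filter. The function names, the pseudocount, the uniform background frequency, the score threshold and the toy count matrix are all assumptions made for this example, and the sequence is assumed to contain only A, C, G and T.

```python
import math

BASES = "ACGT"

def build_pssm(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2-odds scores."""
    pssm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + pseudocount * len(BASES)
        pssm.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return pssm

def scan(pssm, sequence, threshold):
    """Return (start, score) for every window scoring at or above threshold."""
    width = len(pssm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pssm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits

# Invented counts for a 4-bp motif with consensus ACGT (not a JASPAR matrix).
TOY_COUNTS = [
    {"A": 8, "C": 0, "G": 1, "T": 1},
    {"A": 1, "C": 8, "G": 0, "T": 1},
    {"A": 0, "C": 1, "G": 8, "T": 1},
    {"A": 1, "C": 1, "G": 0, "T": 8},
]

# Hypothetical threshold; only the exact consensus window passes it here.
hits = scan(build_pssm(TOY_COUNTS), "TTACGTAA", threshold=5.0)
```

In a footprinting workflow, a hit of this kind would typically be retained only if the aligned human and dog sequences also score above threshold at the same alignment position.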
Year: 2013 PMID: 23433959 PMCID: PMC4357788 DOI: 10.1016/j.gpb.2012.10.004
Source DB: PubMed Journal: Genomics Proteomics Bioinformatics ISSN: 1672-0229 Impact factor: 7.691
Performance of TFLOC predictions at various score thresholds.
| Threshold (×0.01%) | 1 | 2 | 3 | 4 | 5 | 6 | MultiTF |
|---|---|---|---|---|---|---|---|
| Known sites | 44 | 44 | 44 | 44 | 44 | 44 | 44 |
| Predicted sites | 24 | 31 | 32 | 33 | 33 | 33 | 29 |
| False negatives | 20 | 13 | 12 | 11 | 11 | 11 | 15 |
| 50% Overlapping predictions | 24 | 40 | 66 | 78 | 81 | 84 | 88 |
| Total predictions | 93 | 179 | 284 | 359 | 416 | 452 | 215 |
| Sensitivity (%) | 55 | 70 | 72 | 75 | 75 | 75 | 66 |
| Specificity (%) | 25 | 22 | 23 | 22 | 20 | 19 | 41 |
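The table does not state how its derived rows are calculated. The values are consistent with the relationships sketched below, which should be read as an inference from the numbers rather than the authors' stated definitions: false negatives as known sites minus predicted known sites, sensitivity as predicted known sites over known sites, and specificity as 50% overlapping predictions over total predictions (a quantity often called precision elsewhere). The function names are hypothetical.

```python
def false_negatives(known_sites, predicted_sites):
    """Known sites that no prediction recovered."""
    return known_sites - predicted_sites

def sensitivity_pct(known_sites, predicted_sites):
    """Percentage of known sites recovered by the predictions."""
    return 100.0 * predicted_sites / known_sites

def specificity_pct(overlapping_predictions, total_predictions):
    """Percentage of all predictions overlapping a known site (assumed definition)."""
    return 100.0 * overlapping_predictions / total_predictions

# Threshold 4 column: 44 known sites, 33 predicted, 78 overlapping of 359 total.
fn_t4 = false_negatives(44, 33)      # 11
sens_t4 = sensitivity_pct(44, 33)    # 75.0
spec_t4 = specificity_pct(78, 359)   # about 21.7
```

Some tabulated percentages differ by one point from conventional rounding of these ratios (for example 24/93 is about 25.8, listed as 25), so the published figures may have been truncated rather than rounded.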
Figure 1. Comparison of known and predicted sites upstream of PCK1. The chromosome position (Chr13: 59,379,179–59,379,654) is listed at the top of the diagram, with vertical gray bars serving as scale bar markers. Known PCK1 TFBSs are represented in the top track by black bars (previously identified in [11]) and blue bars (identified in [5]). TFBS predictions made by TFLOC using a 3-way alignment of human, dog and cow are depicted in the following three tracks. Predictions from JASPAR CORE, JASPAR FAM and JASPAR PHYLOFACTS are represented by red, grey and green bars, respectively. Additional UCSC tracks include gap locations, RefSeq annotated genes, cow mRNAs mapped to the reference genome, and 5-way multiz alignment and conservation.