Ariel A Aptekmann, Denys Bulavka, Alejandro D Nadra, Ignacio E Sánchez.
Abstract
We study the limits imposed by transcription factor specificity on the maximum number of binding motifs that can coexist in a gene regulatory network, using the SwissRegulon Fantom5 collection of 684 human transcription factor binding sites as a model. We describe transcription factor specificity using regular expressions and find that most human transcription factor binding site motifs are separated in sequence space by one to three motif-discriminating positions. We apply theorems based on the pigeonhole principle to calculate the maximum number of transcription factors that can coexist given this degree of specificity, which is on the order of ten thousand and would fully utilize the space of DNA subsequences. Taking into account an expanded DNA alphabet with modified bases can further raise this limit by several orders of magnitude, at a lower level of sequence space usage. Our results may guide the design of transcription factors at both the molecular and system scale.
Year: 2022 PMID: 35089985 PMCID: PMC8797260 DOI: 10.1371/journal.pone.0263307
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Known and predicted transcription factor binding site motifs.
(A) Regular expression length and number of letters allowed for TFBS motifs in the SwissRegulon Fantom5 collection. (B) Bars (left Y axis): Motif-discriminating positions for every pair of TFBS motifs in the SwissRegulon Fantom5 collection. Black circles (right Y axis): Theoretical estimation of the maximal number of coexisting TFBS motifs, as a function of the minimal requirement of motif-discriminating positions. (C) Theoretical estimation of the maximal number of coexisting TFBS motifs, as a function of alphabet size. (D) Potential occupancy of the DNA sequence space by TFBS motifs for an alphabet size of 4 as a function of the number of motif-discriminating positions.
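The pairwise comparison in panel B can be illustrated with a minimal sketch. It rests on assumptions that are not stated in this record: motifs are taken to be equal-length, pre-aligned regular expressions built only from single bases and bracketed classes, and a position is counted as motif-discriminating when the two motifs allow no base in common there. The paper's actual definition and alignment procedure may differ, and the function names `parse_motif`, `discriminating_positions`, and `matched_sequences` are hypothetical.

```python
import re
from math import prod


def parse_motif(pattern):
    """Turn a simple motif regex such as 'A[CG]TT' into one set of
    allowed bases per position. Only single bases and bracketed
    classes over A, C, G, T are handled (an assumption of this sketch)."""
    tokens = re.findall(r"\[([ACGT]+)\]|([ACGT])", pattern)
    return [frozenset(cls or single) for cls, single in tokens]


def discriminating_positions(motif_a, motif_b):
    """Count aligned positions where the two motifs share no allowed base.
    Assumed definition; motifs must already be aligned and equal in length."""
    if len(motif_a) != len(motif_b):
        raise ValueError("motifs must have the same length")
    return sum(1 for a, b in zip(motif_a, motif_b) if a.isdisjoint(b))


def matched_sequences(motif):
    """Number of distinct DNA sequences the motif matches
    (product of the allowed-base counts at each position)."""
    return prod(len(allowed) for allowed in motif)


# Two made-up motifs, not taken from the SwissRegulon collection.
m1 = parse_motif("A[CG]TT")
m2 = parse_motif("A[AT]TG")
n_discriminating = discriminating_positions(m1, m2)
n_matched = matched_sequences(m1)
```

With these two made-up motifs, positions 2 and 4 have disjoint base sets, so no single sequence can match both motifs. The count returned by `matched_sequences` is the share of sequence space a motif occupies, which relates to the occupancy idea in panel D.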