Zhaolei Zhang, Mark Gerstein.
Abstract
Phylogenetic footprinting is an approach to finding functionally important sequences in the genome that relies on detecting their high degree of conservation across different species. A new study shows how much it improves the prediction of gene-regulatory elements in the human genome.
Year: 2003 PMID: 12814519 PMCID: PMC193683 DOI: 10.1186/1475-4924-2-11
Source DB: PubMed Journal: J Biol ISSN: 1475-4924
Figure 1. An example of a position-specific weight matrix (PWM) adapted from the TRANSFAC database [5]. The sequences that have been shown experimentally to bind to the human transcription factor GATA-1 have 14 positions, among which only positions 6–10 are fully conserved. Abbreviations: R, G or A (purine); N, any; S, G or C (strong); D, G or A or T. Twelve sequences were used to build this matrix.
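As a rough illustration of how a matrix like the one in Figure 1 is assembled, the sketch below counts bases column by column across a set of aligned binding sites and then summarizes each column with an IUPAC ambiguity code. The four six-base sites are invented for the example and are not the twelve GATA-1 sequences from TRANSFAC; the function names `count_matrix` and `consensus` are likewise hypothetical. The consensus rule here (any observed base counts) is a simplification, and curated databases may apply frequency thresholds instead.

```python
from collections import Counter

# IUPAC ambiguity codes, keyed by the set of bases seen at a position.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}

# Hypothetical aligned binding sites (NOT the TRANSFAC GATA-1 data).
TOY_SITES = ["AGATAA", "TGATAG", "GGATAA", "CGATAG"]


def count_matrix(sites):
    """Return one Counter of base counts per alignment column."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    return [Counter(column) for column in zip(*sites)]


def consensus(matrix):
    """Summarize each column by the IUPAC code for its observed bases."""
    return "".join(IUPAC[frozenset(column)] for column in matrix)


matrix = count_matrix(TOY_SITES)
print(consensus(matrix))
```

In this toy set the four central positions are fully conserved, the first position shows all four bases (N), and the last shows only purines (R), which mirrors how the abbreviations in the caption arise.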
Figure 2. Using phylogenetic footprinting to detect conserved TFBSs. This schematic diagram shows a hypothetical human gene aligned with its orthologs from three other mammals. Cross-species sequence comparison reveals conserved TFBSs in each sequence. Sequence motifs of the same shape (colored in green) represent binding sites of the same class of transcription factors. TFBS1 and TFBS4 are conserved in all four mammals; TFBS3 represents a newly acquired, primate-specific binding site. TFBS2 and TFBS2' represent orthologous regulatory sites that have diverged significantly between the primate and rodent lineages. Blue rectangles represent TATA boxes.
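The idea in Figure 2 can be sketched in a few lines: given orthologous sequences that are already aligned, scan for runs of columns that are identical in every species. The alignment, species labels, function name `conserved_blocks`, and the `min_len` threshold below are all assumptions made for illustration; practical footprinting relies on real alignment tools and on scoring schemes that tolerate some mismatch, not on the strict identity used here.

```python
def conserved_blocks(aligned, min_len=5):
    """Return (start, end) half-open intervals of fully conserved columns.

    A column counts as conserved only if every sequence has the same
    base there and that base is not a gap ('-').
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")

    blocks = []
    start = None
    for i, column in enumerate(zip(*aligned)):
        same = len(set(column)) == 1 and column[0] != "-"
        if same and start is None:
            start = i                      # a conserved run begins
        elif not same and start is not None:
            if i - start >= min_len:       # keep only long enough runs
                blocks.append((start, i))
            start = None
    # Close a run that extends to the end of the alignment.
    if start is not None and length - start >= min_len:
        blocks.append((start, length))
    return blocks


# Hypothetical pre-aligned promoter fragments (invented for this sketch).
TOY_ALIGNMENT = {
    "human": "TCAGATAAGGCT",
    "chimp": "TCAGATAAGGCA",
    "mouse": "GAAGATAAGCTT",
    "rat":   "CTAGATAAGCAT",
}

blocks = conserved_blocks(list(TOY_ALIGNMENT.values()))
for start, end in blocks:
    print(start, end, TOY_ALIGNMENT["human"][start:end])
```

A site shared by all four sequences, like TFBS1 or TFBS4 in the figure, shows up as a block here. A lineage-specific site such as TFBS3 would not, because the rodent columns differ; detecting it would require comparing within the primate subset only.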