Stéphane Cruveiller, Kamel Jabbari, Oliver Clay, Giorgio Bernardi.
Abstract
The existence of a well-conserved linear relationship between the GC levels of genes' second and third codon positions (GC2, GC3) prompted us to focus on the landscape, or joint distribution, spanned by these two variables. In humans, well-curated coding sequences now cover at least 15%-30% of the estimated total gene set. Our analysis of the landscape defined by this gene set revealed not only the well-documented linear crest, but also the presence of several peaks and valleys along that crest, a property that was also indicated in two other warm-blooded vertebrates represented by large gene databases, namely mouse and chicken. GC2 is the sum of eight amino acid frequencies, whereas GC3 is linearly related to the GC level of the chromosomal region containing the gene. The landscapes therefore portray relations between proteins and the DNA environments of the genes that encode them.
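As an illustration of the two variables discussed in the abstract, the following is a minimal sketch of how GC2 and GC3 could be computed for a single coding sequence. The function name `gc_by_codon_position` and the example sequence are hypothetical and are not taken from the paper; the sketch assumes an in-frame sequence starting at the first codon position, and it counts every complete codon, including a terminal stop codon if one is present.

```python
def gc_by_codon_position(cds: str) -> tuple[float, float]:
    """Return (GC2, GC3) as fractions for an in-frame coding sequence.

    Hypothetical helper for illustration only: assumes the sequence starts
    at codon position 1 and ignores any trailing partial codon.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # drop an incomplete final codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    if not codons:
        raise ValueError("sequence is shorter than one codon")
    # Fraction of codons with G or C at the second and third positions.
    gc2 = sum(codon[1] in "GC" for codon in codons) / len(codons)
    gc3 = sum(codon[2] in "GC" for codon in codons) / len(codons)
    return gc2, gc3


# Toy sequence (made up): codons ATG GCC TGG AAA TAA
gc2, gc3 = gc_by_codon_position("ATGGCCTGGAAATAA")
print(gc2, gc3)
```

In the standard genetic code, the sense codons with G or C at the second position are exactly those encoding alanine, arginine, cysteine, glycine, proline, serine, threonine, and tryptophan, which is why GC2 over sense codons can be read as a sum of eight amino acid frequencies.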
Year: 2004 PMID: 15123586 PMCID: PMC479116 DOI: 10.1101/gr.2246704
Source DB: PubMed Journal: Genome Res ISSN: 1088-9051 Impact factor: 9.043