Dominique Belin, Pere Puigbò.
Abstract
The genome hypothesis postulates that genes in a genome tend to conform to their species' usage of the codon catalog and the GC content of the DNA. Thus, codon frequencies differ across organisms, including the frequencies of the three termination codons in the standard genetic code. Here, we analyze the frequencies of stop codons in a group of highly expressed genes from 196 prokaryotes under strong translational selection. The occurrence of the three translation termination codons is highly biased, with UAA (ochre) being the most prevalent in almost all bacteria. In contrast, UAG (amber) is the least frequent termination codon, e.g., only 321 occurrences (7.4%) in E. coli K-12 substr. W3110. Of the 253 highly expressed genes, only two end with a UAG codon. The strength of the selective bias against UAG in highly expressed genes varies among bacterial genomes, but it is not affected by the GC content of these genomes. In contrast, increased GC content results in a decrease in UAA abundance with a concomitant increase in UGA abundance. We propose that readthrough efficiency and context effects could explain the prevalence of UAA over UAG, particularly in highly expressed genes. Findings from this communication can be used to optimize gene expression.
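The stop codon frequencies and GC content discussed in the abstract can be tallied with a few lines of Python. This is a minimal sketch, not the authors' pipeline: the function names `stop_codon_frequencies` and `gc_content` are hypothetical, and the input is assumed to be a list of coding sequences (DNA or RNA alphabet) that each include their terminal stop codon.

```python
from collections import Counter

# DNA spellings of the three stop codons: UAA (ochre), UAG (amber), UGA.
STOP_CODONS = ("TAA", "TAG", "TGA")


def stop_codon_frequencies(cds_list):
    """Return the fraction of coding sequences ending in each stop codon.

    Assumption: every sequence includes its terminal stop codon.
    Sequences that do not end in a recognized stop codon are ignored.
    """
    counts = Counter()
    for cds in cds_list:
        last = cds[-3:].upper().replace("U", "T")  # accept RNA or DNA input
        if last in STOP_CODONS:
            counts[last] += 1
    total = sum(counts.values())
    return {c: (counts[c] / total if total else 0.0) for c in STOP_CODONS}


def gc_content(seq):
    """Return the fraction of G and C nucleotides in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


# Toy input, invented for illustration only.
toy_genes = ["ATGAAATAA", "ATGCCCTAA", "ATGGGGTGA", "ATGTTTUAG"]
freqs = stop_codon_frequencies(toy_genes)
```

Computing these two quantities per genome, once for all genes and once for the highly expressed subset, gives the kind of data points that a frequency-versus-GC comparison would need.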
Keywords: bacterial genomes; codon prevalence; codon usage; gene; genome; genome hypothesis; highly expressed genes; non-sense; stop codons; translation termination; translational selection
Year: 2022 PMID: 35330182 PMCID: PMC8954436 DOI: 10.3390/life12030431
Source DB: PubMed Journal: Life (Basel) ISSN: 2075-1729
Figure 1. Correlation of the frequency of the three termination codons and the guanine + cytosine (GC) content in the highly expressed genes of 196 prokaryotes under strong translational selection.
Figure 2. Frequency of nucleotides after the stop codons. (a) All genes from genomes with a %GC lower than 40%; (b) HEG from genomes with a %GC lower than 40%; (c) all genes from genomes with a %GC between 40% and 60%; (d) HEG from genomes with a %GC between 40% and 60%; (e) all genes from genomes with a %GC higher than 60%; (f) HEG from genomes with a %GC higher than 60%. Groups within stop codons are determined based on a Student's t-test (p < 0.01).
Figure 3. Frequency of nucleotides after the stop codons (+456) in the HEG. (a) Most frequent codons downstream of the stop codon. The red line indicates the chance expectation (1/64 ≈ 1.6%). Only codons occurring at or above the chance expectation are indicated. (b) Frequency of tandem stop codons.
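The comparison against the 1/64 chance expectation in Figure 3 can be illustrated with a short Python sketch. It rests on assumptions not stated in the source: the input layout (pairs of stop codon and downstream sequence) and the function name `downstream_codon_stats` are hypothetical, the downstream codon is taken to be the three nucleotides immediately after the stop codon, and a tandem stop is taken to mean that this codon is itself a stop codon.

```python
from collections import Counter

STOP_CODONS = {"TAA", "TAG", "TGA"}
CHANCE = 1 / 64  # uniform expectation over the 64 codons, about 1.6%


def downstream_codon_stats(records):
    """Summarize the codon that immediately follows each stop codon.

    records: iterable of (stop_codon, downstream_sequence) pairs
             (a hypothetical layout chosen for this sketch).
    Returns (enriched, tandem_fraction), where `enriched` maps each codon
    occurring at or above the chance expectation to its frequency, and
    `tandem_fraction` is the share of genes whose downstream codon is
    itself a stop codon.
    """
    counts = Counter()
    for _stop, downstream in records:
        codon = downstream[:3].upper().replace("U", "T")
        if len(codon) == 3:  # skip records with too little flanking sequence
            counts[codon] += 1
    total = sum(counts.values())
    freqs = {c: n / total for c, n in counts.items()} if total else {}
    enriched = {c: f for c, f in freqs.items() if f >= CHANCE}
    tandem_fraction = sum(f for c, f in freqs.items() if c in STOP_CODONS)
    return enriched, tandem_fraction


# Toy records, invented for illustration only.
toy_records = [("TAA", "TAGCCC"), ("TAA", "AAATTT"), ("TGA", "AAAGGG"), ("TAG", "TA")]
enriched, tandem_fraction = downstream_codon_stats(toy_records)
```

With real data the uniform 1/64 baseline is only a rough reference, since codon frequencies downstream of a gene also depend on the GC content of the genome.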