Mark W J van Passel, Leo H de Graaff.
Abstract
BACKGROUND: Systematic analyses of sequence features have resulted in a better characterisation of the organisation of the genome. A previous study in prokaryotes on the distribution of sequence repeats, which are notoriously variable and can disrupt the reading frame in genes, showed that these motifs are skewed towards gene termini, specifically the 5' end of genes. For eukaryotes no such intragenic analysis has been performed, though this could indicate the pervasiveness of this distribution bias, thereby helping to expose the selective pressures causing it.
Year: 2008 PMID: 19077233 PMCID: PMC2621210 DOI: 10.1186/1471-2164-9-596
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
List of the 47 analyzed species, the number of coding sequences and their scientific, industrial or biomedical merit.
| Number of coding sequences*) | Scientific, industrial or biomedical merit | |
| --- | --- | --- |
| 9110 | Animal pathogen | |
| 12434 | Phytopathogen | |
| 9884 | Opportunistic human pathogen | |
| 10474 | Model organism | |
| 6329 | Industrial strain | |
| 12063 | Industrial strain | |
| 10400 | Opportunistic human pathogen/industrial strain | |
| 8779 | Amphibian pathogen | |
| 15512 | Phytopathogenic fungus | |
| 5916 | Opportunistic human pathogen | |
| 5851 | Opportunistic human pathogen | |
| 5897 | Haploid relative of Candida albicans | |
| 5891 | Opportunistic human pathogen | |
| 5687 | Opportunistic human pathogen | |
| 6216 | Opportunistic human pathogen | |
| 10987 | Important decomposers of biomass | |
| 10480 | Human pathogen | |
| 10368 | Human pathogen | |
| 10379 | Human pathogen | |
| 10609 | Human pathogen | |
| 9932 | Human pathogen | |
| 10070 | Human pathogen | |
| 13523 | Model organism for multicellularity | |
| 7077 | Human pathogen | |
| 6101 | Cryo- and halotolerant marine yeast | |
| 13220 | Representative of an important family of phytopathogens | |
| 17202 | Phytopathogen and model organism for evolutionary research | |
| 14002 | Phytopathogen and model organism for evolutionary research | |
| 9164 | Human pathogen | |
| 5739 | Closest sexual relative to | |
| 12564 | Phytopathogen | |
| 10383 | Opportunistic human pathogen and involved in food spoilage | |
| 9795 | Model organism | |
| 9235 | Human pathogen | |
| 20462 | Phytopathogen | |
| 12092 | Phytopathogen | |
| 17074 | Representative agent of mucormycosis | |
| 5272 | Model organism | |
| 5122 | Model organism for comparative genomics study | |
| 4906 | Model organism for comparative genomics study | |
| 4991 | Model organism | |
| 13704 | Broad host range phytopathogen | |
| 15949 | Phytopathogen | |
| 7777 | Closest known relative to the pathogenic Coccidioides | |
| 6517 | Phytopathogen | |
| 10098 | Phytopathogen | |
| 10453 | Phytopathogen |
*) All transcripts, except those with annotated internal stop codons, those annotated as missing ends, and those whose length is not a multiple of three.
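The exclusion criteria in the footnote can be illustrated with a minimal sketch. It assumes the transcripts are available as plain nucleotide strings and detects internal stop codons directly from the sequence using the standard stop codons, whereas the study relied on annotations; the "missing ends" criterion is annotation-based and is not reproduced here. The names `is_usable_cds` and `filter_transcripts` are hypothetical.

```python
# Standard stop codons (assumption: the standard set applies to every genome).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_usable_cds(seq):
    """Return True if seq is a multiple of three long and has no internal stop codon."""
    seq = seq.upper()
    if len(seq) == 0 or len(seq) % 3 != 0:
        return False
    # Check every codon except the last one, which may legitimately be a stop.
    for i in range(0, len(seq) - 3, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return False
    return True


def filter_transcripts(transcripts):
    """Keep only the entries of a {name: sequence} dict that pass is_usable_cds."""
    return {name: seq for name, seq in transcripts.items() if is_usable_cds(seq)}
```

Applied to a dictionary of transcript names and sequences, `filter_transcripts` returns the subset that would enter the coding sequence count under these simplified criteria.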
Figure 1. Six examples of repeat distribution profiles in the gene quintiles of fungal gene repertoires. Represented are the deviations from the expected value per quintile (i.e., 20%). "# genes" signifies the number of genes with mononucleotide repeats (MNRs) per genome, and "Rep. length" signifies the length of the MNR in base pairs.
Figure 2. The distribution of very long repeats (>15 bp mononucleotide repeats, 298 repeats, 294 genes) in the quintiles of the predicted coding regions from all tested fungal genomes. Represented are the deviations from the expected value per quintile (i.e., 20%).
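The quintile profiles in Figures 1 and 2 can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' pipeline: each repeat is assigned to a quintile by the position of its midpoint, and `min_len`, `find_mnrs` and `quintile_deviations` are hypothetical names. If repeats were spread evenly along genes, every quintile would hold 20% of them, so the function reports each quintile's percentage minus 20.

```python
import re


def find_mnrs(seq, min_len=6):
    """Yield (start, length) for each mononucleotide run of at least min_len bases."""
    # One base followed by (min_len - 1) or more copies of the same base.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    for match in pattern.finditer(seq.upper()):
        yield match.start(), match.end() - match.start()


def quintile_deviations(genes, min_len=6):
    """Return, per gene quintile, the percentage of repeats minus the expected 20%."""
    counts = [0] * 5
    for seq in genes:
        for start, length in find_mnrs(seq, min_len):
            # Assumption: a repeat belongs to the quintile containing its midpoint.
            midpoint = start + length / 2
            quintile = min(int(5 * midpoint / len(seq)), 4)
            counts[quintile] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * 5
    return [100 * count / total - 20.0 for count in counts]
```

A positive value in the first or last position of the returned list corresponds to an excess of repeats towards the 5' or 3' end of the genes, respectively.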
Figure 3. Distribution of trinucleotide repeats GGT, TTG and ACA in the quintiles of the protein coding genes from … Represented are the deviations from the expected value (i.e., 20%).
Figure 4. Percentages of the gene repertoires that contain homopolymeric tracts in fungal species: only the three species with the highest (… Note the logarithmic scale on the y-axis. The data for all 47 strains are available in Additional File 3.
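The quantity plotted in Figure 4 is the share of a gene repertoire that contains a homopolymeric tract. A minimal sketch of that calculation is given below; it assumes genes are supplied as nucleotide strings, and the function name `percent_genes_with_tract` and the `min_len` threshold are hypothetical rather than taken from the study.

```python
import re


def percent_genes_with_tract(genes, min_len):
    """Percentage of sequences containing a homopolymeric tract of min_len bases or more."""
    if not genes:
        return 0.0
    # One base followed by (min_len - 1) or more copies of the same base.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    hits = sum(1 for seq in genes if pattern.search(seq.upper()))
    return 100.0 * hits / len(genes)
```

Calling the function with increasing values of `min_len` for one genome gives one series of the kind shown in the figure, where the percentages drop quickly as tract length grows.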
Figure 5. A) Distribution of the relative position of long repeats (15 residues or longer) in a gene vs. the gene length. Genes with a functional KOG annotation (KOG) and those without such an annotation (nKOG) are depicted with blue circles and red crosses, respectively. B) The average lengths of the transcripts (and the standard error of the means) are depicted for genes with a KOG annotation (blue) and genes without a KOG annotation (red), with respect to the position of the repeat in that transcript.