Stefan A Rensing, Dana Fritzowsky, Daniel Lang, Ralf Reski.
Abstract
BACKGROUND: The moss Physcomitrella patens is an emerging plant model system due to its high rate of homologous recombination, haploidy, simple body plan, physiological properties as well as phylogenetic position. Available EST data was clustered and assembled, and provided the basis for a genome-wide analysis of protein encoding genes.
Year: 2005 PMID: 15784153 PMCID: PMC1079823 DOI: 10.1186/1471-2164-6-43
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Comparative BLAST searches of the Arabidopsis (At, yellow), rice (Os, cyan) and Physcomitrella (Pp, green) transcriptomes. Each search was done with the respective sets once as query and once as search space (subject). The area of the circles represents the percentage of the query/subject sequence space that yielded filtered hits.
Taxonomic constitution of the taxprot dataset
| Taxonomic group | txid | Sequences |
| --- | --- | --- |
| Metazoa | 33208 | 862,420 |
| Fungi | 4751 | 184,282 |
| Viridiplantae (plants and green algae) | 33090 | 293,156 |
| Non-green algae² | | 21,889 |
| Other Eukaryotes³ | | 49,732 |
| Eubacteria (without Cyanobacteria) | 2 | 1,386,089 |
| Cyanobacteria | 1117 | 94,920 |
| Archaea | 2157 | 122,394 |
| Viruses | 10239 | 331,246 |
¹ GenBank amino acid sequences as of 2004-04-07. NCBI taxon ids are shown under "txid"; all taxonomic crown groups with at least 100 sequence members were used.
² Cercozoa [136419], Cryptophyta [3027], Euglenozoa [33682], Glaucocystophyceae [38254], Haptophyceae [2830], Rhodophyta [2763], Stramenopiles [33634].
³ Acanthamoebidae [33677], Alveolata [33630], Diplomonadida [207245], Entamoebidae [33084], Heterolobosea [5752], Jakobidae [143015], Mycetozoa [142796], Parabasalidea [5719].
Figure 2. BLAST hits. a) Absolute number of hits against different taxonomic groups. b) Amount of non-redundant hits as percentage of the respective sequence space.
Figure 3. Mapping of filtered BLAST hits (grey), paralogs (red) and orthologs (green) against the five Arabidopsis chromosomes (left to right / top to bottom). a) Hits per Mbp; error bars: average absolute deviation (AAD); column 6: mean values. b) Graphical representation using a finer granularity (100 kbp); each vertical step represents one hit.
Figure 4. Retained genes in moss: taxonomic distribution and functional categories. a) Physcomitrella transcripts whose best BLAST hit is not among plants, divided by taxonomic category and further subdivided into specific hits (unique to a single taxonomic group, yellow) and those that could be assigned a putative function by means of homology searches (green). b) Distribution of functional categories among those taxonomic groups that yielded unique hits.
Functional annotation of retained genes into broad functional categories, assembled transcripts can be retrieved via .
| Transcript | Best-hit taxon | Best-hit description | Category | Function | Remark |
| --- | --- | --- | --- | --- | --- |
| PPP_2925_C1 | bacteria | Membrane-bound lytic murein transglycosylase B | cytotoxicity | murein degradation | murein-degrading enzyme, may play a role in recycling of muropeptides during cell |
| BJ203770 | bacteria | putative protease | cytotoxicity | protease | |
| PPP_4234_C1 | metazoa | cytolysin I | cytotoxicity | cytotoxicity | involved in pore-formation |
| PPP_3510_C1 | cyano | RTX toxins and related Ca2+-binding proteins | cytotoxicity | cytotoxicity | |
| PPP_1172_C1 | bacteria | Enoyl-CoA hydratase/carnitine racemase | metabolism | fatty acid metabolism | |
| PPP_6629_C1 | bacteria | mannosylglycerate synthase | metabolism | sugar metabolism | |
| PPP_5746_C1 | metazoa | L-kynurenine 3-monooxygenase Fpk | metabolism | amino acid metabolism | |
| PPP_8479_C1 | metazoa | COMMD2 | metabolism | copper metabolism | COMM (copper metabolism MURR1) domain containing 2 |
| BJ173412 | metazoa | ubiquitin | metabolism | protein metabolism | ribosomal protein in C. elegans dehydrogenases |
| PPP_3987_C1 | fungi | MNN9 | metabolism | N-glycosylation | |
| PPP_6514_C1 | cyano | oxidoreductase | metabolism | energy metabolism | related to aryl-alcohol dehydrogenases |
| PPP_11394_C1 | bacteria | homolog of eukaryotic DNA ligase III | nucleic acid binding / modification | DNA repair | |
| BJ191550 | bacteria | formamidopyrimidine-DNA glycosylase | nucleic acid binding / modification | DNA repair | |
| BJ160862 | metazoa | Osa1 nuclear protein | nucleic acid binding / modification | DNA binding | chromatin regulation |
| BJ582496 | cyano | SAM-dependent methyltransferase | nucleic acid binding / modification | nucleic acid modification | |
| PPP_2586_C1 | bacteria | CarD protein | signal transduction | DNA binding | leucine zipper transcription factor, light- and starvation-induced response |
| PPP_3689_C1 | bacteria | serine/threonine protein kinase | signal transduction | signal transduction | |
| BJ172132 | bacteria | serine/threonine protein kinase | signal transduction | signal transduction | |
| PPP_460_C1 | metazoa | HLA-B-associated transcript | signal transduction | signal transduction | |
| PPP_1041_C1 | metazoa | calcium/calmodulin-dependent protein kinase II delta | signal transduction | signal transduction | |
| PPP_6326_C1 | metazoa | tumor suppressor tout-velu | signal transduction | signal transduction | involved in diffusion of hedgehog |
| PPP_11399_C1 | metazoa | dual-specificity tyrosine phosphatase YVH1 | signal transduction | signal transduction | Non-receptor class dual specificity subfamily |
| PPP_184_C1 | fungi | high-affinity iron permease | transport | transport | high affinity iron uptake |
| PPP_7115_C2 | fungi | uric acid-xanthine permease | transport | transport | belongs to the Xanthine/Uracil permeases family |
| PPP_11191_C1 | fungi | inorganic phosphate transporter | transport | transport | probable inorganic phosphate transporter; yeast pho99 homologue |
Figure 5. Splice site sequence logos and efficiency of splice site prediction. a) Sequence logos of Physcomitrella donor and acceptor sites. b) Prediction performance of NetPlantGene and svmsplice for Physcomitrella splice sites. TP = true positive, FN = false negative, FP = false positive, measured on the left-hand (%) axis. Recall (sensitivity) = TP/(TP+FN) and precision = TP/(TP+FP), measured on the right-hand axis.
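The recall and precision definitions in the Figure 5 caption can be illustrated with a minimal Python sketch. The function names and the example counts are hypothetical and are not taken from the study; only the two formulas come from the caption.

```python
def recall(tp: int, fn: int) -> float:
    """Recall (sensitivity) = TP / (TP + FN); returns 0.0 if there are no real sites."""
    total = tp + fn
    return tp / total if total else 0.0


def precision(tp: int, fp: int) -> float:
    """Precision = TP / (TP + FP); returns 0.0 if nothing was predicted."""
    total = tp + fp
    return tp / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical counts for one splice site predictor (not values from the paper).
    tp, fn, fp = 80, 20, 20
    print(f"recall    = {recall(tp, fn):.2f}")
    print(f"precision = {precision(tp, fp):.2f}")
```

A predictor with few false negatives but many false positives scores high on recall and low on precision, which is why the two measures are reported together.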
Figure 6. Trinucleotide frequencies and codon usage. a) The averaged Physcomitrella codon fraction usage measured as percentage of the total amount of counted codons is shown as grey diamonds, including a margin of 2× average absolute deviation (AAD, error bars), in comparison with Arabidopsis (yellow circles). Significantly deviating codons of the sequence subsets are presented as colored circles, namely retained genes (blue), paralogs (red) and orthologs (green). b) The effective number of codons (enc) for Physcomitrella (green) and Arabidopsis (yellow) as a range distribution scatter plot (y axis: % of analysed genes) and as averaged values (horizontal bar chart; error bars: standard deviation).
Codon usage of Physcomitrella retained genes, orthologs and paralogs
| At mRNAs | 10,755,859 | 56,020 | 43.32 | n.a. | n.a. | n.a. | n.a. | n.a. | n.a. |
| Pp ORFs | 7,638,122 | 39,782 | 49.94 | n.a. | n.a. | n.a. | n.a. | n.a. | n.a. |
| retained genes | 77,998 | 406 | 50.30 | 7 | 1 | 2 | 5 | Phe underrepresented | |
| paralogs | 1,115,937 | 5,812 | 50.07 | 3 | 1 | 2 | 1 | 2 | none |
| orthologs | 953,293 | 4,965 | 10 | 2 | 4 | 6 | Pro underrepresented | ||
| sum | 10 | 10 | 7 | 13 |
The predicted Physcomitrella ORFs were used as background to check for significant changes in percentage codon fraction usage in the orthologs, paralogs and retained genes (best BLAST hit not among plants). In case of significant deviation from the total set (more than two times the average absolute deviation, AAD), the direction of the change relative to the Arabidopsis codon usage was checked. Significant deviations are shown enlarged. At = Arabidopsis thaliana, Pp = Physcomitrella patens.
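The deviation check described above can be sketched roughly as follows. This is only an illustration under stated assumptions: codon percentages are computed per ORF, the AAD is taken across the background ORFs for each codon, and a codon is flagged when the subset mean differs from the background mean by more than 2× AAD. The source does not specify these details, and all function names and the toy sequences are hypothetical.

```python
from collections import Counter
from statistics import mean


def codon_percentages(orf: str) -> dict:
    """Percentage of each codon among all complete codons of one ORF."""
    usable = len(orf) - len(orf) % 3  # ignore a trailing partial codon
    codons = [orf[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    return {codon: 100.0 * n / len(codons) for codon, n in counts.items()}


def average_absolute_deviation(values) -> float:
    """Mean absolute distance of the values from their mean (AAD)."""
    values = list(values)
    centre = mean(values)
    return mean(abs(v - centre) for v in values)


def deviating_codons(background, subset, factor=2.0) -> dict:
    """Return codons whose subset mean lies outside background mean +/- factor * AAD.

    Assumption: the AAD is computed per codon across the background ORFs.
    """
    bg = [codon_percentages(orf) for orf in background]
    sub = [codon_percentages(orf) for orf in subset]
    all_codons = {c for table in bg + sub for c in table}
    flagged = {}
    for codon in sorted(all_codons):
        bg_values = [table.get(codon, 0.0) for table in bg]
        bg_mean = mean(bg_values)
        aad = average_absolute_deviation(bg_values)
        sub_mean = mean(table.get(codon, 0.0) for table in sub)
        if abs(sub_mean - bg_mean) > factor * aad:
            flagged[codon] = (bg_mean, sub_mean)
    return flagged


if __name__ == "__main__":
    # Toy sequences for illustration only, not data from the study.
    background_orfs = ["ATGAAA", "ATGAAA"]
    retained_orfs = ["ATGAAG"]
    for codon, (bg_mean, sub_mean) in deviating_codons(background_orfs, retained_orfs).items():
        print(codon, bg_mean, sub_mean)
```

Comparing the sign of the flagged change with the Arabidopsis codon fraction for the same codon would then give the direction of the shift, as described in the text.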