James F Robson, Daniel Barker.
Abstract
BACKGROUND: To demonstrate the bioinformatics capabilities of a low-cost computer, the Raspberry Pi, we present a comparison of the protein-coding gene content of two species in phylum Chlamydiae: Chlamydia trachomatis, a common sexually transmitted infection of humans, and Candidatus Protochlamydia amoebophila, a recently discovered amoebal endosymbiont. Identifying species-specific proteins and differences in protein families could provide insights into the unique phenotypes of the two species.
Year: 2015 PMID: 26462790 PMCID: PMC4604092 DOI: 10.1186/s13104-015-1476-2
Source DB: PubMed Journal: BMC Res Notes ISSN: 1756-0500
Table 1 Analysis of protein families predicted across the genome-wide protein sets of C. trachomatis and P. amoebophila
| Protein relationship | Number of protein families using BLOSUM62 | Number of protein sequences using BLOSUM62 | Number of protein families using BLOSUM45 | Number of protein sequences using BLOSUM45 |
|---|---|---|---|---|
| Single in both species | 590 | 1180 | 586 | 1172 |
| Unique to C. trachomatis | 9 | 28 | 9 | 25 |
| Unique to P. amoebophila | 132 | 448 | 132 | 443 |
| Single in C. trachomatis | 8 | 28 | 7 | 25 |
| Single in P. amoebophila | 1 | 6 | 1 | 6 |
| Multiple in both species | 1 | 6 | 1 | 6 |
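The relationship categories in this table can be read as a classification on how many proteins each species contributes to a family. The following is a minimal sketch of that reading, not the authors' pipeline: the function name `classify_family`, the input shape (one species label per family member) and the toy families are all assumptions made for illustration.

```python
from collections import Counter

SPECIES_A = "C. trachomatis"
SPECIES_B = "P. amoebophila"


def classify_family(member_species):
    """Classify one protein family by per-species copy number.

    member_species: iterable of species labels, one per protein in the
    family (hypothetical input format, assumed for this sketch).
    """
    counts = Counter(member_species)
    a, b = counts[SPECIES_A], counts[SPECIES_B]
    if a == 0 and b == 0:
        raise ValueError("family has no members from either species")
    if a == 0:
        return f"Unique to {SPECIES_B}"
    if b == 0:
        return f"Unique to {SPECIES_A}"
    if a == 1 and b == 1:
        return "Single in both species"
    if a == 1:
        # One copy in species A, several in species B
        return f"Single in {SPECIES_A}"
    if b == 1:
        # One copy in species B, several in species A
        return f"Single in {SPECIES_B}"
    return "Multiple in both species"


# Toy families, invented for illustration only (not data from the paper)
toy_families = {
    "family_x": [SPECIES_A, SPECIES_B],
    "family_y": [SPECIES_B, SPECIES_B, SPECIES_B],
    "family_z": [SPECIES_A, SPECIES_B, SPECIES_B],
}

# Number of families per category
summary = Counter(classify_family(m) for m in toy_families.values())
```

Under this reading, every family that is single in both species contributes exactly two sequences, which agrees with the table: 590 families and 1180 sequences with BLOSUM62, 586 families and 1172 sequences with BLOSUM45.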
Table 2 Differences in proteins produced, excluding shared single-copy proteins
| Protein relationship | Group number | Protein name |
|---|---|---|
| Unique to P. amoebophila *1 | 1 | F-box |
| | 2 | Transposases |
| | 3 | Putative tetratricopeptide repeat protein |
| | 4 | Sel1 repeat protein4 |
| | 5 | Transposases |
| Unique to C. trachomatis | 10 | Polymorphic outer membrane protein |
| | 16 | *2 Effector from type III secretion system |
| | 70 | Polymorphic outer membrane protein |
| | 71 | Hypothetical membrane associated protein |
| | 72 | Hypothetical membrane associated protein |
| | 148 | Deubiquitinase and deneddylase |
| | 149 | Biotin synthase |
| | 150 | *3 Threonine-rich GPI-anchored glycoprotein |
| | 151 | Outer membrane proteins |
| Single in C. trachomatis | 11 | Virulence plasmid integrases |
| | 18 | Low calcium response proteins |
| | 19 | Pb, Cd, Zn and Hg transporting ATPases |
| | 36 | Excinuclease ABC subunit A |
| | 38 | Chaperonins |
| | 39 | Putative antibiotic transporter |
| | 40 | *4 |
| | 41 | Nucleoside diphosphate kinases |
| Single in P. amoebophila | 9 | Phosphatidylcholine-hydrolyzing phospholipase D (PLD) family |
| Multiple in both species | 8 | Tyrosine-specific transport protein |
Protein families are uniquely identified by arbitrary group numbers; the accession numbers of their member proteins are given in Additional file 1. For notes numbered *2 to *4, see Table 3. *1 In this category, only the largest five groups are shown. All proteins within these five groups were putative and uncharacterised; probable protein function was obtained by finding homologs in UniProtKB with >50 % sequence identity. For group 3, no homologs were found with >50 % sequence identity, but its members may be tetratricopeptide proteins, as all showed >30 % sequence identity to various tetratricopeptide proteins.
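The two identity cut-offs in the note above can be expressed as a small decision rule. This is only a sketch of one way to read them: the function name `annotation_support` and the label "unsupported" are assumptions, while "probable" and "possible" echo the wording of the note. Both comparisons are strict, matching the ">" in the text.

```python
def annotation_support(best_identity_percent):
    """Label how strongly a homolog hit supports a putative function.

    best_identity_percent: percent sequence identity of the best
    UniProtKB homolog (hypothetical input, assumed for this sketch).
    """
    if best_identity_percent > 50:
        return "probable"      # >50 % identity, as for most groups shown
    if best_identity_percent > 30:
        return "possible"      # >30 % identity, as for group 3
    return "unsupported"       # label invented for this sketch
```

For group 3, the note applies the lower threshold to every member of the family, so a family-level call would require all members to exceed 30 % identity.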
Fig. 1 Predicting functional interactions of unannotated proteins. To further investigate the function of Group 40 (Additional file 1), the protein family whose members were all unannotated, functional interactions were examined using the STRING database. P. amoebophila Q6MEA2 (a; STRING ID pc0373) and C. trachomatis Q3KL42 (b; STRING ID CTA_0708) both interact with the (putative) exodeoxyribonuclease V alpha chain with a high confidence score. Each query protein is in the centre of the interaction web and is coloured red. Grey dots in the key represent strength of evidence (darker is stronger). The sum of each distinct evidence type was used to generate the total score.
Table 3 Putative, homology-based characterisation of proteins in Table 2
| Note (asterisk) | Possible homolog | Species | Identity (%) | Additional comments |
|---|---|---|---|---|
| 2 | Effector from type III secretion system | | 73 | 86 % positives |
| 3 | Threonine-rich GPI-anchored glycoprotein | | 80 | 84 % positives |
| 4 | Unknown | N/A | N/A | No homologs found; no secondary structure elements found; increased disorder at each terminus |