Helena G Dos Santos, David Abia, Robert Janowski, Gulnahar Mortuza, Michela G Bertero, Maïlys Boutin, Nayibe Guarín, Raúl Méndez-Giraldez, Alfonso Nuñez, Juan G Pedrero, Pilar Redondo, María Sanz, Silvia Speroni, Florian Teichert, Marta Bruix, José M Carazo, Cayetano Gonzalez, José Reina, José M Valpuesta, Isabelle Vernos, Juan C Zabala, Guillermo Montoya, Miquel Coll, Ugo Bastolla, Luis Serrano.
Abstract
Here we perform a large-scale study of the structural properties and the expression of proteins that constitute the human centrosome. Centrosomal proteins tend to be larger than generic human proteins (control set), since their genes contain on average more exons (20.3 versus 14.6). They are rich in predicted disordered regions, which cover 57% of their length, compared to 39% in the general human proteome. They also contain several regions that are simultaneously predicted to be disordered and coiled-coil: 55 proteins (15%) contain disordered and coiled-coil fragments that cover more than 20% of their length. Helices prevail over strands in regions homologous to known structures (47% of residues predicted as helical against 17% predicted as strands), and even more so in the whole centrosomal proteome (52% against 7%), while for control human proteins 34.5% of the residues are predicted as helical and 12.8% as strands. This difference is mainly due to residues predicted as both disordered and helical (30% in centrosomal and 9.4% in control proteins), which may correspond to alpha-helix-forming molecular recognition features (α-MoRFs). We performed expression assays for 120 full-length centrosomal proteins and 72 domain constructs that we predicted to be globular. These full-length proteins are often insoluble: only 39 out of 120 expressed proteins (32%) and 19 out of 72 domains (26%) were soluble. We built or retrieved structural models for 277 out of 361 human proteins whose centrosomal localization has been experimentally verified. We could not find any suitable structural template with more than 20% sequence identity for 84 centrosomal proteins (23%), for which around 74% of the residues are predicted to be disordered or coiled-coil. The three-dimensional models that we built are available at http://ub.cbm.uam.es/centrosome/models/index.php.
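The counts and percentages quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the dictionary name `reported`, the helper names, and the one-percentage-point tolerance are assumptions introduced here, not part of the study.

```python
# Counts quoted in the abstract: (numerator, denominator, percentage as reported).
reported = {
    "soluble full-length proteins": (39, 120, 32),
    "soluble domain constructs": (19, 72, 26),
    "proteins without a suitable template": (84, 361, 23),
}


def percent(numerator, denominator):
    """Return numerator/denominator as a percentage."""
    return 100.0 * numerator / denominator


def check(table, tolerance=1.0):
    """Map each label to (computed percentage, agrees within tolerance).

    The tolerance (in percentage points) is an assumed allowance for rounding.
    """
    result = {}
    for label, (numerator, denominator, quoted) in table.items():
        computed = percent(numerator, denominator)
        result[label] = (computed, abs(computed - quoted) <= tolerance)
    return result


checks = check(reported)
for label, (computed, agrees) in checks.items():
    print(f"{label}: {computed:.1f}% (agrees with abstract: {agrees})")

# Modeled proteins plus proteins without a template should give the full set.
total_proteins = 277 + 84
```

Under these assumptions, each quoted percentage agrees with its underlying count to within rounding, and the 277 modeled proteins plus the 84 without a template account for all 361 proteins.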
Year: 2013 PMID: 23671615 PMCID: PMC3650010 DOI: 10.1371/journal.pone.0062633
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Fraction of protein length that is predicted to be disordered, coiled-coil, or modeled by homology.
The plots represent the distribution of the percentage of protein length that has been (A) predicted to be disordered and coiled-coil at the same time; (B) predicted to be disordered and not coiled-coil; (C) modeled; (D) predicted to have regular secondary structure and to be neither disordered nor coiled-coil, but not modeled.
Figure 2. Number of residues and number of models for each protein.
The plots represent the distribution of the number of residues of the longest isoform of centrosomal genes (A) and the number of structural models obtained for each protein (B).
Figure 3. Summary of the structural models, either built by homology or retrieved from the PDB.
The plots represent the distribution of the length (A) and sequence identity between query and template protein (B) for the 362 modeled fragments.
Figure 4. Empirical energy functions evaluated for each model and for the corresponding region of the template show that the predicted stability decrease is moderate.
Figure 5. Frequency of the three main secondary structure classes for modeled residues (gray: DSSP of the template; yellow: PSIPRED prediction) and for all residues (pink).
The set of all residues is strongly depleted in beta structures.
Figure 6. Number of occurrences of domains predicted by SMART.
Only domains with more than 3 occurrences are shown.
Figure 7. Protein-protein interaction networks for interactions experimentally observed for human centrosomal proteins.
The color code represents betweenness centrality, a graph-theoretic measure of the centrality of a node in a network, with red representing the most central node.
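As an illustration of the measure named in this caption, the following minimal sketch computes unnormalized betweenness centrality for an undirected, unweighted graph using Brandes' algorithm. The record does not say which software or normalization the authors used, so this is an assumed textbook formulation, and the toy network with nodes `HUB`, `P1`, `P2`, and `P3` is hypothetical, not taken from the centrosomal interaction data.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Unnormalized betweenness for an undirected, unweighted graph (Brandes)."""
    centrality = {node: 0.0 for node in adjacency}
    for source in adjacency:
        # Breadth-first search: count shortest paths from the source.
        visit_order = []
        predecessors = {node: [] for node in adjacency}
        n_paths = {node: 0 for node in adjacency}
        distance = {node: -1 for node in adjacency}
        n_paths[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            visit_order.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:  # w is reached for the first time
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:  # v lies on a shortest path to w
                    n_paths[w] += n_paths[v]
                    predecessors[w].append(v)
        # Accumulate pair dependencies, farthest nodes first.
        dependency = {node: 0.0 for node in adjacency}
        while visit_order:
            w = visit_order.pop()
            for v in predecessors[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1.0 + dependency[w])
            if w != source:
                centrality[w] += dependency[w]
    # Every unordered pair was counted once from each endpoint, so halve.
    return {node: value / 2.0 for node, value in centrality.items()}


# Hypothetical toy network: one hub connected to three otherwise isolated nodes.
toy_network = {
    "HUB": ["P1", "P2", "P3"],
    "P1": ["HUB"],
    "P2": ["HUB"],
    "P3": ["HUB"],
}
scores = betweenness_centrality(toy_network)
most_central = max(scores, key=scores.get)
```

In this toy network every shortest path between two peripheral nodes passes through `HUB`, so it receives the highest score and would be the node colored red under the caption's color code.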