Francesco Oteri, Francesca Nadalin, Raphaël Champeimont, Alessandra Carbone.
Abstract
Along protein sequences, co-evolution analysis identifies residue pairs demonstrating either a specific co-adaptation, where changes in one of the residues are compensated by changes in the other during evolution, or a less specific external force that affects the evolutionary rates of both residues to a similar extent. In both cases, independently of the underlying cause, co-evolutionary signatures within or between proteins serve as markers of physical interactions and/or functional relationships. Depending on the type of protein under study, the set of available homologous sequences may greatly differ in size and amino acid variability. BIS2Analyzer, openly accessible at http://www.lcqb.upmc.fr/BIS2Analyzer/, is a web server providing the online analysis of co-evolving amino-acid pairs in protein alignments, especially designed for vertebrate and viral protein families, which typically display a small number of highly similar sequences. It is based on BIS2, a re-implemented fast version of the co-evolution analysis tool Blocks in Sequences (BIS). BIS2Analyzer provides a rich and interactive graphical interface to ease biological interpretation of the results.
Year: 2017 PMID: 28472458 PMCID: PMC5570204 DOI: 10.1093/nar/gkx336
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Table 1. BIS2Analyzer computational time on different protein families
| Protein family | # Seqs | AL (aa) | API (%) | Time* |
|---|---|---|---|---|
| Amyloid β peptide | 80 | 43 | 87 | 13′ |
| B domain/protein A | 452 | 62 | 82 | 2′49″ |
| c-KIT | 81 | 976 | 67 | 1h47′30″ |
| HCV NS5A | 40 | 451 | 94 | 3′37″ |
| HCV NS3-NS5B | 27 | 1222 | 92 | 37′40″ |
| Morbillivirus protein N | 144 | 387 | 73 | 21′39″ |
# Seqs = number of sequences; AL = alignment length; API = average pairwise identity; * executed on an Intel(R) Xeon(R) CPU E5-2440 0 @ 2.40GHz.
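The API column can be illustrated with a short sketch. The record does not say how BIS2Analyzer handles gaps or normalizes identity, so the convention below (skip any column where either sequence has a gap) is an assumption, and the function names and toy alignment are hypothetical.

```python
from itertools import combinations


def pairwise_identity(a: str, b: str, gap: str = "-") -> float:
    """Fraction of identical residues over columns where neither sequence has a gap.

    Assumption: gapped columns are ignored; other conventions exist.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not compared:
        return 0.0
    return sum(x == y for x, y in compared) / len(compared)


def average_pairwise_identity(msa: list[str]) -> float:
    """Mean identity over all unordered sequence pairs, as a percentage."""
    pairs = list(combinations(msa, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return 100.0 * sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)


# Toy alignment (hypothetical data): identities are 0.75, 0.75 and 0.50.
toy_msa = ["ACDE", "ACDF", "ACGE"]
print(round(average_pairwise_identity(toy_msa), 1))  # 66.7
```

The number of pairs grows quadratically with the number of sequences, which is manageable for the small alignments listed in the table.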
Table 2. Performance of various co-evolution analysis methods
| Method | Amyloid β peptide: C/G/P (TP) | Amyloid β peptide: Pr (TP)(R) | B domain/protein A: C/G/P (TP) | B domain/protein A: Pr (TP) |
|---|---|---|---|---|
| BIS2 | 5 (4) | 30 (26)(6) | 5 (2) | 28 (10) |
| CAPS | 3 (2) | 14 (6)(1) | 6 (0) | 22 (1) |
| ContactMap* | 50 (13) | 28 (14)(1) | 50 (16) | 25 (11) |
| DCA | 50 (8) | 29 (14)(3) | 50 (1) | 30 (4) |
| DCA† | 50 (26) | 37 (23)(5) | 50 (2) | 48 (11) |
| DCA‡ | 50 (13) | 29 (15)(3) | 50 (1) | 42 (10) |
| EVcouplings | 50 (2) | 28 (13)(1) | 50 (0) | 32 (3) |
| EVcouplings† | 50 (2) | 29 (19)(1) | 50 (0) | 26 (1) |
| EVcouplings‡ | 50 (2) | 27 (16)(1) | 50 (3) | 46 (9) |
| PSICOV | 50 (11) | 33 (20)(3) | 50 (1) | 24 (4) |
| PSICOV† | 50 (20) | 30 (26)(3) | 50 (4) | 45 (8) |
| PSICOV‡ | - | - | 50 (4) | 44 (12) |
C/G/P = predicted Clusters/Groups/Pairs (outputs: C for BIS2, G for CAPS and P for all other methods); Pr = predicted residues (that is, the total number of distinct residues in C/G/P); TP = true positives; R = experimental functional regions that are at least partially predicted (at least one pair of residues lies within the same C/G/P).
* ContactMap built an MSA of 73 sequences with 27% API for amyloid β peptide and an MSA of 56 sequences with 40% API for B domain/protein A. All other methods are run on the MSA described in Table 1, unless specified otherwise.
† MSA for amyloid β peptide: 919 sequences (NCBI PF03494 entry), 90% API; B domain/protein A: 11116 sequences (NCBI PF02216 entry), 74% API.
‡ MSA for amyloid β peptide: 273 sequences (Uniprot PF03494 entry), 87% API (PSICOV output is empty on this sequence set); B domain/protein A: 919 sequences (Uniprot PF02216 entry), 87% API.
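As a rough illustration of how the Pr, TP and R quantities defined in the table legend could be derived from a method's output, the sketch below treats each predicted cluster, group or pair as a set of residue positions. The function name, the reference set of known functional residues and the toy data are all hypothetical. The rule used for a true-positive residue (membership in a reference set) is an assumption, since this record does not spell it out.

```python
def score_predictions(groups, known_residues, regions):
    """Return (Pr, TP, R) for a list of predicted residue groups.

    groups          -- predicted clusters/groups/pairs, each a set of positions
    known_residues  -- reference set of functional residues (assumed TP criterion)
    regions         -- experimental functional regions, each a set of positions
    """
    # Pr: number of distinct residues appearing in any predicted group.
    predicted = set().union(*groups) if groups else set()
    # TP: predicted residues that are also in the reference set (assumption).
    true_positives = predicted & known_residues
    # R: regions with at least one pair of residues inside the same group.
    hit_regions = sum(
        1 for region in regions
        if any(len(region & group) >= 2 for group in groups)
    )
    return len(predicted), len(true_positives), hit_regions


# Toy data (hypothetical positions, not taken from the paper).
groups = [{1, 5, 9}, {5, 12}, {20, 21}]
known = {1, 5, 20, 30}
regions = [{1, 5, 7}, {20, 30}, {12, 40}]
print(score_predictions(groups, known, regions))  # (6, 3, 1)
```

Only the first region counts toward R here: residues 1 and 5 fall in the same predicted group, whereas the other regions share at most one residue with any single group.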
Figure 1. Visualization of c-KIT tyrosine kinase analysis on BIS2Analyzer. (A) Part of the sequence alignment where cluster 9 is localized. (B) Description of the three hits comprising cluster 9. (C) Display of cluster 9 on the c-KIT inactive form (left, 1T45) and on its active form (right, 1PKG). Note that the active form has an unfolded N-terminal region that has been partially removed in the crystal (right). (D) Plot of cluster 9 (green dots) on a multiple sequence alignment (MSA) sequence.
Figure 2. (A) Visualization of two clusters (orange and violet) obtained by BIS2Analyzer on Hepatitis C Virus (HCV) protein NS5A (1ZH1). These co-evolving residues are localized in the same regions of the protein, suggesting a conformational change (see schema in (C)). (B) The orange co-evolving residues in (A), which lie far apart in the monomer structure of NS5A, are found in close proximity in the dimer (1ZH1, see schema in (D)). (F) Co-evolving residues (green; P-value < 1.2e−5) located on the two HCV protein structures NS3 (1CU1, light blue) and NS5B (1GX6, beige) illustrate the ability of BIS2Analyzer to visualize inter-protein co-evolved residues (see schema in (E)) and inspect potential interactions.
Figure 3. (A) BIS2Analyzer detects six co-evolving residues (blue, P-value ≤ 1.2e−5) located at close distance on the surface of c-KIT tyrosine kinase (1T45). (B) Schema illustrating the distances among the six co-evolving residues in (A). (D) Three co-evolving residues (red) face opposite sides of protein N (chain A, 4XJN, right). They identify inter-protein contacts at the interface of the Mononegavirales protein N assembly (4XJN, left) (see schema in (C)).
Figure 4. (A) Clustering of the phylogenetic tree constructed from a dataset of RL3 sequences (2414 sequences; subset of the UniRef90 dataset in UniProt from which overly divergent sequences have been eliminated). Selected subtrees (shown in color) contain at least 20 sequences and at least one non-trivial co-evolution cluster (see text). (B) BIS2 co-evolution clusters on the 3D structure (PDB ID: 4U26, chain BD), colored according to the subtree they belong to; the six co-evolution clusters shown above belong to five subtrees and link the ordered extension loop with the structured region.