Ségolène Caboche, Maude Pupin, Valérie Leclère, Phillipe Jacques, Gregory Kucherov.
Abstract
BACKGROUND: Nonribosomal peptides (NRPs), bioactive secondary metabolites produced by many microorganisms, show a broad range of important biological activities (e.g. antibiotics, immunosuppressants, antitumor agents). NRPs are mainly composed of amino acids but their primary structure is not always linear and can contain cycles or branchings. Furthermore, there are several hundred different monomers that can be incorporated into NRPs. The NORINE database, the first resource entirely dedicated to NRPs, currently stores more than 700 NRPs annotated with their monomeric peptide structure encoded by undirected labeled graphs. This opens a way to a systematic analysis of structural patterns occurring in NRPs. Such studies can investigate the functional role of some monomeric chains, or analyse NRPs that have been computationally predicted from the synthetase protein sequence. A basic operation in such analyses is the search for a given structural pattern in the database.
Year: 2009 PMID: 19296847 PMCID: PMC2674430 DOI: 10.1186/1472-6807-9-15
Source DB: PubMed Journal: BMC Struct Biol ISSN: 1472-6807
Figure 1. Examples of peptide and pattern graphs. This figure shows examples of (a) peptide graphs and (b) pattern graphs. Nodes and edges represent monomers and chemical bonds, respectively. Labels are the monomer names.
Figure 2. A simple example of a compatibility graph (CG). The figure shows a pattern graph, a peptide graph and the corresponding CG. Each node of the pattern and peptide graphs has a label (for example 'Ala') and a number that uniquely identifies the node. Identifiers of pattern nodes are underlined to distinguish them from peptide nodes. A node of the CG corresponds to the association of a node of the pattern graph (underlined number) with a node of the peptide graph that has the same label. Nodes of the CG are named by letters. For example, node 'a' corresponds to the association of pattern node 0 and peptide node 1, which both carry the label 'Ala'. The edges between nodes 'a' and 'b' and between nodes 'b' and 'c' in the CG correspond to condition (1) of the definition of the CG. The edge between nodes 'a' and 'c' corresponds to condition (2). A clique of size 3 exists in the CG and corresponds to an occurrence of the pattern in the peptide.
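The construction described in this caption can be sketched in Python. This is a minimal illustration, not the authors' implementation. The caption does not spell out conditions (1) and (2), so the sketch assumes the usual compatibility-graph rules for non-induced subgraph search: two CG nodes are linked if (1) their pattern nodes are bonded and their peptide nodes are bonded, or (2) their pattern nodes are not bonded. The function names, the small example graphs and the brute-force clique search are all hypothetical.

```python
from itertools import combinations


def build_compatibility_graph(p_labels, p_edges, g_labels, g_edges):
    """Return the CG as an adjacency dict keyed by (pattern_node, peptide_node)."""
    p_bonds = {frozenset(e) for e in p_edges}
    g_bonds = {frozenset(e) for e in g_edges}
    # One CG node per (pattern node, peptide node) pair with identical labels.
    nodes = [(p, g)
             for p, p_label in enumerate(p_labels)
             for g, g_label in enumerate(g_labels)
             if p_label == g_label]
    adjacency = {node: set() for node in nodes}
    for (p1, g1), (p2, g2) in combinations(nodes, 2):
        if p1 == p2 or g1 == g2:
            continue  # a node may take part in only one association
        bonded_in_pattern = frozenset((p1, p2)) in p_bonds
        bonded_in_peptide = frozenset((g1, g2)) in g_bonds
        # Assumed condition (1): bonded in both graphs.
        # Assumed condition (2): not bonded in the pattern.
        if (bonded_in_pattern and bonded_in_peptide) or not bonded_in_pattern:
            adjacency[(p1, g1)].add((p2, g2))
            adjacency[(p2, g2)].add((p1, g1))
    return adjacency


def find_clique(adjacency, k):
    """Return one clique of size k as a list of CG nodes, or None (plain backtracking)."""
    def extend(clique, candidates):
        if len(clique) == k:
            return clique
        for node in sorted(candidates):
            found = extend(clique + [node], candidates & adjacency[node])
            if found is not None:
                return found
            candidates = candidates - {node}  # node cannot be part of a solution
        return None

    return extend([], set(adjacency))


# Hypothetical example: linear pattern Ala-Val-Leu searched in a cyclic peptide.
pattern_labels = ["Ala", "Val", "Leu"]
pattern_edges = [(0, 1), (1, 2)]
peptide_labels = ["Gly", "Ala", "Val", "Leu"]
peptide_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]

cg = build_compatibility_graph(pattern_labels, pattern_edges,
                               peptide_labels, peptide_edges)
occurrence = find_clique(cg, len(pattern_labels))
```

Here `occurrence` pairs each pattern node with the peptide node it is matched to. The exhaustive clique search is only practical for small graphs; it is used to keep the sketch short.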
Figure 3. Matrix of elementary path sizes. This figure shows the matrix of elementary path sizes for (a) P1 and (b) G1 of Figure 1.
Figure 4. Example of compatibility graphs constructed with the classical and new methods. The CG of pattern graph P1 and peptide graph G1 of Figure 1, constructed with (a) the classical and (b) the new CG building rules. Each CG node is identified by a letter and represents an association between a node of P1 and a node of G1 with compatible labels. For example, node 'a' associates node 0 of P1 with node 4 of G1, which both carry the 'D-Tyr' label. Dashed edges are those that differ between the two CGs, and bold edges form a clique of size 5 (the size of P1).
Figure 5. Structural properties of NRPs contained in NORINE. Distribution of (a) 4-patterns and (b) peptide sizes in the NORINE database.
Number of nodes and edges of the CG constructed with classical and new building rules
| pattern | peptide | # CG nodes | # CG edges |
| P1 | G1 | 13/ | 22/ |
| P2 | G1 | 16/ | 43/ |
| P3 | G1 | 35/ | 210/ |
| P3 | G2 | 25/ | 100/ |
| P4 | G2 | 10/ | 14/ |
| Ala-1 | Ala | 73/ | 1918/ |
| (X)19 | Ala | 380/ | 53010/ |
Patterns P1–P4 and peptides G1–G2 refer to Figure 1. In all examples, k is equal to the pattern size. In columns 3 and 4, values in regular and bold font correspond to the classical and modified CG building rules, respectively.
Ala-1: linear pattern of size 19 corresponding to alamethicin F50 without the last monomer.
Ala: alamethicin F50 [NORINE:NOR00007].
(X)19: linear pattern of 19 'X' monomers.
Figure 6. Search time for different complete patterns in the NORINE database. Here, k is equal to the size of the pattern. In the 2nd and 3rd columns, the first and second values correspond to the classical and new building rules, respectively. 'ND' means that no result was obtained because the running time exceeded 8 hours.