Nikhilesh K Yadav, Nidhi S Saikhedkar, Ashok P Giri.
Abstract
BACKGROUND: Serine protease inhibitors of the potato type-II inhibitor family (Pin-II type PIs) are essential plant defense molecules. They are characterized by multiple inhibitory repeat domains, a conserved disulfide bond pattern, and a tripeptide reactive center loop. These features make Pin-II type PIs promising candidates for protein engineering and for designing inhibitors for agricultural and therapeutic applications. However, the diversity of these PIs remains unexplored because the available databases lack annotated protein sequences and their functional attributes.
Keywords: Annotation; Data analysis; Database; Knowledge representation; Pin-II type protease inhibitor; Protein sequence analysis
Year: 2021 PMID: 34107869 PMCID: PMC8188708 DOI: 10.1186/s12870-021-03027-0
Source DB: PubMed Journal: BMC Plant Biol ISSN: 1471-2229 Impact factor: 4.215
Fig. 1 Development of PINIR Database. a Schematic depicting processing of Pin-II type PIs into IRDs (left), and structure of a bi-domain Pin-II type PI from tomato (PDB id: 1pju) showing the characteristic IRD, linker and RCL regions of Pin-II type PIs (right). b Conceptual Data Model (CDM) of PINIR database, showing the entities and the relations between them. All the entities are structured around two primary entities highlighted within red rectangles (Pin-II type PIs and Inhibitory Repeat Domains). The entities which capture the details about the two primary entities are shown within green rectangles, and the grey rectangles represent the entities which provide the support data. The images depicted in the figure are original images prepared by the authors
Conceptual Data Model (CDM) Entities of PINIR Database
| Primary Entities | Entities providing detail information | Entities providing support information |
|---|---|---|
| Pin-II PIs | Signal Peptides | Taxonomy |
| | Iso-electric Points | Publication |
| | Spatio-Temporal Distribution | Gene Ontology Type |
| | Cross-References | |
| | Gene Ontology | |
| Domains | Sequence Domains | Linkers |
| | Amino acid Composition | Reactive Loops |
| | Domain Biophysical Properties | Target Protease |
| | Domain Biochemical Properties | Domain Type |
| | Domain Disulfide Bonds | |
Fig. 2 Development model followed for the creation of PINIR, comprising four development stages: Data Sources, Data Preprocessing and Integration, Database Creation, and PINIR Website. The images depicted in the figure are original images prepared by the authors
Description of entities and corresponding tables in PINIR
| Entity | Table Name | Content |
|---|---|---|
| Pin-II PIs | | Stores general information about the properties and composition of Pin-II PIs, such as UniProt accession number, protein name, organism and gene name |
| Taxonomy | | Stores details of the organism to which a Pin-II PI belongs, captured from the UniProt database |
| Sequence Domains | | Captures the relationship between PIs and Domains: stores which Domains are found in a PI sequence |
| Domains | | Stores information about the Inhibitory Repeat Domain (IRD) sequences of a Pin-II PI. It includes the IRD sequence, the Domain Type, and the RCL and Linker sequences found in it |
| Domain Type | | Stores the Domain Types, which are based on Linker organization in IRD sequences: Type 1 (H–L Type), Type 2 (L–H Type) and Type 3 (H + L Type) |
| Reactive Loops | | Stores information about the reactive center loop (active site loop) appearing in a domain. It includes the Target Protease and the residues at the P1, P2 and P1' positions |
| Target Protease | | Stores the probable target protease for the Pin-II inhibitory domain, categorized as Trypsin, Chymotrypsin, Elastase or Unknown |
| Linkers | | Stores information about the Linker sequences appearing between two domains in a Pin-II type PI. It includes the Linker sequence and type |
| Cross References | | Stores details about other databases which capture information related to PINIR entries |
| Domain Biochemical Properties | | Captures the inhibitory potential of an IRD |
| Domain Biophysical Properties | | Captures biophysical binding parameters for an IRD and protease |
| Domain Disulfide Bonds | | Captures the occurrence and position of disulfide bonds in a domain |
| Iso-Electric Points | | Captures the isoelectric point of each Pin-II type PI sequence, that is, the pH at which the molecule carries no net electrical charge |
| Gene Ontology | | Stores the list of selected terms derived from the GO project, which give the functional attributes of a Pin-II type PI |
| Gene Ontology Type | | Stores the three categories of Gene Ontology: Cellular Component, Biological Process and Molecular Function |
| Signal Peptide | | Captures the localization signal for the Pin-II type PI sequences |
| Spatio Temporal Distribution | | Captures the localization and tissue-specific occurrence of a Pin-II type PI |
| Publication | | Stores references to publications related to PINIR entries |
| Amino Acid Composition | | Stores the amino acid composition of Pin-II type PI and Domain sequences, including the number and percentage of each amino acid present in a sequence |
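The amino acid composition entity described in the table (number and percentage of each amino acid in a sequence) can be illustrated with a minimal Python sketch. This is not PINIR's own code: the function name `amino_acid_composition` is hypothetical, and the sequence is a toy string invented for illustration, not a PINIR entry.

```python
from collections import Counter


def amino_acid_composition(sequence):
    """Return {residue: (count, percent)} for a one-letter protein sequence."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("sequence is empty")
    counts = Counter(seq)
    total = len(seq)
    # Sort by residue letter so the output order is stable.
    return {aa: (n, 100.0 * n / total) for aa, n in sorted(counts.items())}


# Toy sequence invented for illustration; not taken from PINIR.
toy = "CPRNCCAGKC"
composition = amino_acid_composition(toy)
for aa, (n, pct) in composition.items():
    print(f"{aa}: {n} ({pct:.1f}%)")
```

Each residue maps to a (count, percent) pair, mirroring the two quantities the table says are stored for a sequence.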
Fig. 3 Logical Data Model (LDM) of PINIR. The LDM captures entities and relationships and also specifies the Attributes, Primary Key and Foreign Keys for each entity. a and b present the LDM for the primary entities Pin-II type PIs and Inhibitory Repeat Domains, respectively. The images depicted in the figure are original images prepared by the authors
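To make the idea of primary and foreign keys concrete, the relationship between Pin-II PIs, Domains and the linking Sequence Domains entity can be sketched with Python's built-in `sqlite3` module. This is only an illustration under stated assumptions: the table names, column names and rows below are hypothetical and are not PINIR's actual schema or data, and the database engine used by PINIR is not specified here.

```python
import sqlite3

# In-memory database; all table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE pin2_pi (
    pi_id             INTEGER PRIMARY KEY,
    uniprot_accession TEXT NOT NULL UNIQUE,
    protein_name      TEXT,
    organism          TEXT
);
CREATE TABLE domain (
    domain_id    INTEGER PRIMARY KEY,
    ird_sequence TEXT NOT NULL,
    domain_type  TEXT
);
-- Linking table: which domains are found in which PI sequence.
CREATE TABLE sequence_domain (
    pi_id     INTEGER NOT NULL REFERENCES pin2_pi(pi_id),
    domain_id INTEGER NOT NULL REFERENCES domain(domain_id),
    PRIMARY KEY (pi_id, domain_id)
);
""")

# Placeholder rows invented for illustration; not PINIR data.
conn.execute(
    "INSERT INTO pin2_pi VALUES (1, 'X00001', 'toy inhibitor', 'toy organism')"
)
conn.executemany(
    "INSERT INTO domain VALUES (?, ?, ?)",
    [(1, "CPRNC", "Type 1"), (2, "CTLNC", "Type 2")],
)
conn.executemany("INSERT INTO sequence_domain VALUES (?, ?)", [(1, 1), (1, 2)])

# List the domains of each PI by following the foreign keys.
rows = conn.execute("""
    SELECT p.uniprot_accession, d.ird_sequence, d.domain_type
    FROM pin2_pi AS p
    JOIN sequence_domain AS sd ON sd.pi_id = p.pi_id
    JOIN domain AS d ON d.domain_id = sd.domain_id
    ORDER BY d.domain_id
""").fetchall()

# A link to a PI that does not exist is rejected by the foreign key.
try:
    conn.execute("INSERT INTO sequence_domain VALUES (99, 1)")
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True
```

The composite primary key on the linking table is one possible way to model a many-to-many relation between PIs and domains; the actual key design is the one shown in the figure.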
Fig. 4 Main features of PINIR website (a) Browse page of PINIR (b) The Detail page showing the detailed information of a PI (c) Data Analysis page provides a comprehensive analysis of the Pin-II type PIs (d) Search page of PINIR allows users to search Pin-II type PIs by a variety of keywords (e) Download page allows the users to download the PINIR data. The images depicted in the figure are prepared and compiled by the authors using original screenshots of PINIR website
Fig. 5 Linker regions in Pin-II type PIs. a Classification of IRD sequences based on linker regions; heavy chain (H), light chain (L), RCL and linker regions are color-coded for sequences in (b) and (c). b Example of Pin-II type PI with H + L type IRDs. c Example of Pin-II type PI with overlapping H–L and L–H type IRDs. The images depicted in the figure are original images prepared by the authors
Fig. 6 RCL regions and disulfide variants in Pin-II type PIs. a Preference of amino acids at P1, P2 and P1’ residues in the RCL regions (b) Distribution of RCL sequences across solanaceous and non-solanaceous plants. Frequently occurring RCLs are shown separately in (b’) (c) Cysteine content (number of cysteine residues per sequence) in Pin-II type PIs and IRDs (inset) (d) Major types of disulfide bond architectures in IRDs. The images depicted in the figure are original images prepared by the authors
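The two tallies behind panels (a) and (c), residue preference per RCL position and cysteine content per sequence, can be sketched in Python as follows. This is a hedged illustration, not the authors' analysis code: the function names are hypothetical, the sequences and RCL tripeptides are toy strings invented for the example, and the sketch assumes each RCL tripeptide is written in the order P2, P1, P1'.

```python
from collections import Counter


def cysteine_content(sequence):
    """Number of cysteine residues (C) in a one-letter protein sequence."""
    return sequence.upper().count("C")


def rcl_position_preferences(rcl_tripeptides):
    """Tally residues per RCL position.

    Assumption: each tripeptide is written in the order P2, P1, P1'.
    """
    positions = {"P2": Counter(), "P1": Counter(), "P1'": Counter()}
    for rcl in rcl_tripeptides:
        if len(rcl) != 3:
            raise ValueError(f"expected a tripeptide, got {rcl!r}")
        p2, p1, p1_prime = rcl.upper()
        positions["P2"][p2] += 1
        positions["P1"][p1] += 1
        positions["P1'"][p1_prime] += 1
    return positions


# Toy inputs invented for illustration; not taken from PINIR.
toy_sequences = ["CPRNCCAGKC", "CTLNCDPRIAYGVC"]
cys_counts = [cysteine_content(s) for s in toy_sequences]

toy_rcls = ["PRN", "TLN", "PRY", "PKN"]
prefs = rcl_position_preferences(toy_rcls)
most_common_p1 = prefs["P1"].most_common(1)
```

Normalizing each `Counter` by the number of tripeptides would turn the raw tallies into per-position frequencies.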
Fig. 7 Categorization of Pin-II type PIs in PINIR according to sequence characteristics. Purple boxes represent the basis of categorization, blue boxes represent the categories. The number of sequences in a particular category is mentioned in brackets. The images depicted in the figure are original images prepared by the authors