MFPPI - Multi FASTA ProtParam Interface.

Vijay Kumar Garg1, Himanshu Avashthi1, Apoorv Tiwari1, Prashant Ankur Jain1, Pramod Wasudev Ramkete2, Arvind Mohan Kayastha3, Vinay Kumar Singh3.   

Abstract

Physico-chemical properties reflect the functional and structural characteristics of a protein. A comparative study of these properties is important for understanding the role of a protein and for exploring its molecular evolution. A number of online and offline tools are available for calculating the physico-chemical properties of a single protein sequence. However, no tool is available for a comparative study of Multi-FASTA sequences with graphical visualization. Hence, we describe the development and utility of MFPPI V.1.0, a web interface developed on the JAVA platform that submits each FASTA sequence from a Multi-FASTA file to the ProtParam web server for the calculation of physico-chemical properties. MFPPI V.1.0 calculates different physico-chemical properties for a given set of proteins in a single run and saves the data in an MS Excel sheet. Furthermore, it provides a graphical representation of the physico-chemical properties for analysis and visualization of the data in a user-friendly manner. The output of the analysis therefore helps in understanding compositional changes and functional relationships in evolution among organisms. We have demonstrated the utility of MFPPI V.1.0 using 17 mtATP6 protein sequences from different mammalian species. It is available for free at http://insilicogenomics.in/mfpcalc/mfppi.html.

Year:  2016        PMID: 28104964      PMCID: PMC5237651          DOI: 10.6026/97320630012074

Source DB:  PubMed          Journal:  Bioinformation        ISSN: 0973-2063


Background

The physicochemical properties of proteins are critical for sustainability, efficiency, and stability in a biological system. Various physico-chemical parameters of proteins, such as amino acid composition, extinction coefficient [1], instability index [2,3], grand average of hydropathicity (GRAVY), aliphatic index, theoretical pI, atomic composition and molecular weight, allow us to understand the stability, activity and nature of a protein. Many web-based and standalone software tools are available that compute the physico-chemical properties of proteins. AACompIdent is a web-based tool at ExPASy that identifies proteins using amino acid composition [1]. Protein/Peptide Property Calculator [4] is a web-based tool that calculates the peptide chemical formula, molecular weight, net charge at neutral pH, hydrophilicity, hydrophobicity, isoelectric point and extinction coefficient. It also predicts the hydrophobic or hydrophilic regions, secondary structure, trans-membrane regions and flexible regions of the input protein or peptide sequence of interest. However, it is useful only for single-sequence analysis. The Molinspiration server also offers a number of chemoinformatics tools to calculate LogP (octanol/water partition coefficient), molecular polar surface area and molecular volume [5]. ProtParam [6] on the ExPASy [7] server is a reliable tool for computing physico-chemical properties. However, it accepts a single sequence per analysis through its interface. Thus, current methods do not analyze multiple sequences for comparative analysis, nor do they provide options for downloading the results for subsequent analysis. Therefore, it is of interest to develop a novel interface using ProtParam to analyze multiple sequences from a Multi-FASTA file, producing results for comparative inference with evolutionary insights. It is also of interest to develop methods to download and store the results in “.xls” format for further analysis.
Hence, we describe the development and utility of MFPPI V.1.0 on the JAVA platform, version JRE7 (simple, object-oriented, reliable, secure and portable), for this purpose.

Methodology

Sequence retrieval and construction of Multi-FASTA file

Mitochondrial protein (mtProtein) sequences of 17 different mammalian species were retrieved in FASTA format from the National Center for Biotechnology Information (NCBI), and a single Notepad file with a “.txt” extension was created from them. The description line of each FASTA record must start with “>lcl|”, followed by an accession number or description. The line must end with at least one bracketed block “[ ]”, which may contain the species name or other details, and the sequence should start after the bracket. The input FASTA file of the different mammalian proteins is illustrated in Figure 1.
Figure 1

Multi-FASTA sequence file of different mammalian members. Input file format prepared for the Multi-FASTA file to be submitted to Akriti V.1.0.
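
The header rules described above can be expressed as a pattern check. The following is a minimal sketch, not the authors' code: the class name HeaderCheck, its methods and the example header (including the placeholder EXAMPLE_ID) are hypothetical, and the regular expression is only one reading of the stated rules (a line that starts with ">lcl|", carries an accession or description, and ends with a bracketed block).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: checks a FASTA description line against the stated layout. */
final class HeaderCheck {
    // ">lcl|" + accession or description + a bracketed block at the end of the line.
    private static final Pattern HEADER =
            Pattern.compile("^>lcl\\|(.+?)\\s*\\[([^\\]]*)\\]\\s*$");

    /** Returns true if the line starts with ">lcl|" and ends with a bracketed block. */
    static boolean isValid(String line) {
        return HEADER.matcher(line).matches();
    }

    /** Returns the text inside the last bracketed block, or null if the line is invalid. */
    static String bracketText(String line) {
        Matcher m = HEADER.matcher(line);
        return m.matches() ? m.group(2).trim() : null;
    }

    public static void main(String[] args) {
        // EXAMPLE_ID is a placeholder, not a real accession number.
        String header = ">lcl|EXAMPLE_ID ATP synthase F0 subunit 6 [Homo sapiens]";
        System.out.println(isValid(header) + " " + bracketText(header));
    }
}
```

Checking headers before submission would let malformed records be reported to the user instead of failing later during splitting.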

Script Development

Java GUI programming involves two packages: the original Abstract Window Toolkit (AWT) and the newer Swing toolkit. Swing is the primary Java GUI widget toolkit. The script of the web interface was developed in four steps.

Input data

The Multi-FASTA text file of mtProteins was declared as a string that contains several sequences in FASTA format separated by the greater-than (“>”) symbol.

Splitting and storing Multi-FASTA sequence into raw sequence

Each sequence was split off and converted into raw format (without any symbol or description line) and then stored in a separate file. To separate the sequence from its description line, each FASTA record was read into a string and the split method was applied to the span that starts at the greater-than symbol “>” and ends at “]”.
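
The splitting step can be sketched as follows. This is an illustration under stated assumptions rather than the published script: the class MultiFastaSplitter and its method are hypothetical names, the records are kept in memory instead of being written to separate files, and the description is taken to end at the last “]” of each record (a raw sequence contains no brackets).

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: turns a Multi-FASTA string into raw sequences. */
final class MultiFastaSplitter {
    /** Maps each description line (without '>') to its raw sequence, in file order. */
    static Map<String, String> split(String multiFasta) {
        Map<String, String> records = new LinkedHashMap<String, String>();
        // Every record begins at a '>' symbol, so splitting there yields one chunk per record.
        for (String chunk : multiFasta.split(">")) {
            if (chunk.trim().isEmpty()) {
                continue; // text before the first '>' or empty chunks
            }
            // The description ends at the closing bracket; the sequence follows it.
            int end = chunk.lastIndexOf(']');
            if (end < 0) {
                continue; // skip records that do not follow the expected header layout
            }
            String description = chunk.substring(0, end + 1).trim();
            // Remove line breaks and spaces so that only residue letters remain.
            String raw = chunk.substring(end + 1).replaceAll("\\s+", "").toUpperCase();
            records.put(description, raw);
        }
        return records;
    }

    public static void main(String[] args) {
        String input = ">lcl|ID1 protein one [Bos taurus]\nMNEN\nLFAS\n"
                + ">lcl|ID2 protein two [Homo sapiens]\nMNENLFAS\n";
        for (Map.Entry<String, String> e : split(input).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

A LinkedHashMap is used so that the results keep the order of the input file, which matters when rows are later written to a spreadsheet.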

Fetching raw sequence into ProtParam server

To submit the sequences to the ProtParam server one by one, a connection was established with the server using the following syntax. Syntax: URL siturl = new URL("http://web.expasy.org/cgi-bin/ProtParam/ProtParam"); A redirect method was applied to move on to the next sequence, and the output condition was set to “true” so that the results were printed once the physico-chemical properties had been calculated.
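
A connection of this kind could look like the sketch below, which uses the standard java.net classes. Several points are assumptions and not taken from the paper: the class ProtParamClient is a hypothetical name, the form field name "sequence" is a guess at what the server expects, the endpoint is the address given in the text and may no longer be current, and mapping the "redirect method" and the "true" output condition to setInstanceFollowRedirects and setDoOutput is only one plausible reading.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

/** Hypothetical client: posts one raw sequence per request. */
final class ProtParamClient {
    // Endpoint as given in the text; the live address may differ today.
    static final String ENDPOINT = "http://web.expasy.org/cgi-bin/ProtParam/ProtParam";

    /** Builds a form-encoded body; the field name "sequence" is an assumption. */
    static String formBody(String rawSequence) {
        try {
            return "sequence=" + URLEncoder.encode(rawSequence, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    /** Posts one raw sequence and returns the server reply as text. */
    static String submit(String rawSequence) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(ENDPOINT).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);                // must be true before a request body is written
        conn.setInstanceFollowRedirects(true); // follow HTTP redirects sent by the server
        conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
        OutputStream out = conn.getOutputStream();
        try {
            out.write(formBody(rawSequence).getBytes(StandardCharsets.UTF_8));
        } finally {
            out.close();
        }
        InputStream in = conn.getInputStream();
        try {
            // Read the whole reply in one token.
            Scanner scanner = new Scanner(in, "UTF-8").useDelimiter("\\A");
            return scanner.hasNext() ? scanner.next() : "";
        } finally {
            in.close();
        }
    }
}
```

No network traffic occurs until submit is called. A caller would loop over the raw sequences, call submit once per sequence and then extract the reported values from the returned page.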

Saving data into MS-Excel file

After the calculated parameters had been compiled at the ProtParam server, the results were saved sequentially in an MS-Excel (.xls) file.
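
The text does not say which library wrote the .xls file. As a stand-in, the sketch below writes tab-separated text, which MS Excel can open; it is not a binary .xls writer. The class ResultSheet, its methods and the output file name are hypothetical. The two example rows are copied from Table 2.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: stores one row of calculated values per species. */
final class ResultSheet {
    /** Renders a header row followed by one tab-separated row per species. */
    static String toTabSeparated(String[] columns, Map<String, double[]> rows) {
        StringBuilder sb = new StringBuilder("Species");
        for (String column : columns) {
            sb.append('\t').append(column);
        }
        sb.append('\n');
        for (Map.Entry<String, double[]> row : rows.entrySet()) {
            sb.append(row.getKey());
            for (double value : row.getValue()) {
                sb.append('\t').append(value);
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    /** Writes the rendered text to disk (tab-separated text, not binary .xls). */
    static void write(String path, String content) throws IOException {
        Files.write(Paths.get(path), content.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        Map<String, double[]> rows = new LinkedHashMap<String, double[]>();
        rows.put("Bos taurus", new double[] {24787.9, 19480, 36.15, 135.93, 0.924});
        rows.put("Homo sapiens", new double[] {24817.2, 20970, 34.74, 144.65, 0.952});
        String[] columns = {"MW", "EC", "II", "AI", "GRAVY"};
        write("mfppi_results.xls", toTabSeparated(columns, rows)); // hypothetical file name
    }
}
```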

Graphical User Interface

The graphical user interface was kept simple and user friendly. The interface contains a text field, a browse button, a submit button and a process status indicator. The logo of the software, with its name in Hindi and English, and the logos of Banaras Hindu University, Varanasi and Sam Higginbottom Institute of Agriculture Technology & Sciences, Allahabad were also added. MFPPI V.1.0 is a fully automated web interface tool for ProtParam that calculates physico-chemical properties. We also divided the software into six different packages, each for a particular calculation.

Utility and application:

General features

The graphical user interface of MFPPI V.1.0 has only two buttons, browse and submit (Figure 2). The server is able to calculate the total number of amino acids, molecular weight, theoretical pI, the number of each amino acid residue and its percentage, the total number of negatively charged residues (D + E), instability index, aliphatic index, and grand average of hydropathicity (GRAVY) for several protein sequences in a single run.
Figure 2

Graphical User Interface of MFPPI V.1.0, the Multi FASTA ProtParam Interface.
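
MFPPI obtains these values from the ProtParam server. For orientation, three of them can also be computed locally from their standard definitions, as in the sketch below. The class SequenceProperties is a hypothetical name. The formulas used are the commonly cited ones: aliphatic index = X(Ala) + 2.9 X(Val) + 3.9 (X(Ile) + X(Leu)) with X in mole percent, GRAVY as the mean Kyte-Doolittle hydropathy value, and an extinction coefficient of 5500 per Trp plus 1490 per Tyr, assuming all cysteines are reduced.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: local versions of three standard sequence properties. */
final class SequenceProperties {
    // Kyte-Doolittle hydropathy values for the 20 standard residues.
    private static final Map<Character, Double> HYDROPATHY = new HashMap<Character, Double>();
    static {
        String residues = "ARNDCQEGHILKMFPSTWYV";
        double[] values = {1.8, -4.5, -3.5, -3.5, 2.5, -3.5, -3.5, -0.4, -3.2, 4.5,
                           3.8, -3.9, 1.9, 2.8, -1.6, -0.8, -0.7, -0.9, -1.3, 4.2};
        for (int i = 0; i < values.length; i++) {
            HYDROPATHY.put(residues.charAt(i), values[i]);
        }
    }

    /** Number of occurrences of one residue letter. */
    static int count(String seq, char residue) {
        int n = 0;
        for (int i = 0; i < seq.length(); i++) {
            if (seq.charAt(i) == residue) {
                n++;
            }
        }
        return n;
    }

    /** Percentage of positions occupied by one residue. */
    static double percent(String seq, char residue) {
        return 100.0 * count(seq, residue) / seq.length();
    }

    /** Aliphatic index from the mole percentages of Ala, Val, Ile and Leu. */
    static double aliphaticIndex(String seq) {
        return percent(seq, 'A') + 2.9 * percent(seq, 'V')
                + 3.9 * (percent(seq, 'I') + percent(seq, 'L'));
    }

    /** GRAVY: sum of hydropathy values divided by the sequence length. */
    static double gravy(String seq) {
        double sum = 0.0;
        for (int i = 0; i < seq.length(); i++) {
            Double h = HYDROPATHY.get(seq.charAt(i));
            if (h == null) {
                throw new IllegalArgumentException("Unsupported residue: " + seq.charAt(i));
            }
            sum += h;
        }
        return sum / seq.length();
    }

    /** Extinction coefficient at 280 nm, assuming all cysteines are reduced. */
    static int extinctionCoefficient(String seq) {
        return 5500 * count(seq, 'W') + 1490 * count(seq, 'Y');
    }
}
```

As a consistency check against Table 2, a protein with three Trp and two Tyr residues gives 3 × 5500 + 2 × 1490 = 19480, the extinction coefficient reported there for several species.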

Special features

Sequences in Multi-FASTA format (>lcl|Sequence ID or description of protein [sequence source or any other information]) in a single file are used as input for the analysis. The results are saved in Excel file format for further analysis and inference.

Example analysis

The results from MFPPI V.1.0 for 17 mtATP6 protein [8] sequences from different mammalian species are given in Table 1 and Table 2. A graph drawn using Table 1 is shown in Figure 3. This is an example of comparative analysis of multiple sequences. The sequences are poor in amino acid C and rich in L. D was found at low frequency across the species and was absent in Saimiri boliviensis and Gorilla gorilla gorilla. The residues R, E, K, W and Y were also present at low frequency, in comparison with the higher frequencies of N, Q, G, H, M, F, P and V. The frequencies of residues A, I, S and T were relatively high in all species.
Table 1

Amino acid composition (%) of 17 mammalian mitochondrial ATP6-encoded proteins

Species                  A     R     N     D     C     Q     E     G     H     I     L     K     M     F     P     S     T     W     Y     V
Bos taurus             6.6   1.8   4.4   0.4     0     4   1.3   4.9   2.7   9.7  19.5   1.8   5.3   5.8   5.3   7.1  11.9   1.3   0.9   5.3
Canis lupus            8.8   2.2   4.4   0.4     0     4   1.3   4.9   2.7  11.9  18.6   1.8   4.9   5.8   5.8   6.2   9.3   1.3   1.3   4.4
Cavia porcellus        7.1   1.8     4   0.4     0   3.1   1.3   4.4   3.1  12.8  19.5   2.2   6.2   4.9   6.2   5.8  10.6   1.3   1.3     4
Cricetulus griseus     6.6   2.2   3.5   0.9     0   3.1   1.3   4.9   3.5  13.3  17.7   2.7   6.6   5.8   5.8   6.2   9.7   1.3   0.9     4
Equus caballus           8   1.8   4.4   0.4     0     4   1.3   4.9   3.1  11.9  17.7   1.8   6.2   6.2   5.8   6.6   9.3   1.3   0.9   4.4
Felis catus              8   1.8   4.9   0.4     0     4   1.3   4.9   3.5  10.6    19   1.8   5.8   5.3   5.8   6.2   9.7   1.3   0.9   4.9
Gorilla gorilla        9.7   1.8   4.9     0     0   3.5   1.8   3.5   2.7  10.6  19.9   2.2   5.8   3.5   6.2   5.8  11.9   1.3   1.3   3.5
Homo sapiens           8.4   1.8   4.9   0.4     0   3.1   1.3   3.5   2.7  12.8  19.5   2.7   5.3     4   6.2   5.8  11.5   1.3   1.3   3.5
Loxodonta africana     6.8   2.3   3.6     0     0   3.2   2.3   4.1   2.7  12.2  19.4   1.8   4.1   4.1   5.4   5.9  13.1   1.8   2.3   5.4
Mus musculus           6.6   2.2     4   0.4     0   2.7   1.3   4.4     4  12.8  17.3   2.7   6.2   6.2   6.2   6.6   9.7   1.3   0.9   4.4
Ovis aries             7.1   1.8   5.3   0.4     0     4   1.3   5.3   2.7   9.7  19.5   1.8   5.8   5.8   5.3   6.2  10.6   1.3   0.9   5.3
Pan paniscus           8.8   1.8   4.4   0.4     0   3.5   1.3   3.5   3.1  11.1  19.5   2.2   4.9   4.9   6.2   5.8  11.9   1.3   0.9   4.4
Pan troglodytes        9.3   1.8   4.4   0.4     0   3.5   1.3   3.5   3.1  11.1  19.9   2.2   4.9   4.4   6.2   5.3  11.5   1.3   1.3   4.4
Pongo abelii           8.4   2.2     4   0.4     0   3.1   1.3   3.1   2.7  11.9  22.1   2.2   4.4   3.5   6.6   5.8  11.9   1.3   1.3   3.5
Rattus norvegicus      6.6   2.2   3.5   0.9     0   2.7   1.8   4.4     4  12.8  18.1   2.2   6.2   5.8   6.2   6.6   9.3   1.3   0.9   4.4
Saimiri boliviensis    5.8   1.8   5.3     0     0     4   0.9     4   2.2  11.5  21.2   1.3   5.8     4   5.3   7.1  11.9   1.3   1.8   4.9
Sus scrofa             7.5   1.8   4.9   0.4     0   4.4   1.3   4.4   2.7  11.9  17.7   2.2   5.8   6.2   5.3   5.8  11.5   1.3   1.3   3.5
Table 2

Physico-chemical properties of 17 mammalian mitochondrial ATP6-encoded proteins calculated by MFPPI V.1.0. MW: molecular weight; EC: extinction coefficient; II: instability index; AI: aliphatic index; GRAVY: grand average of hydropathicity.

Species                    MW        EC        II        AI     GRAVY
Bos taurus             24787.9     19480     36.15    135.93     0.924
Canis lupus              24789     20970     32.34    140.75     0.977
Cavia porcellus        24952.5     20970     36.95     144.6     1.025
Cricetulus griseus     25071.6     19480     35.14    138.98     0.965
Equus caballus         24866.1     19480     40.64    136.42     0.973
Felis catus              24805     19480     40.85     137.7      0.92
Gorilla gorilla        24676.9     20970      35.9    139.07     0.888
Homo sapiens           24817.2     20970     34.74    144.65     0.952
Loxodonta africana     24575.7     29450     32.01    145.41     0.963
Mus musculus           25095.5     19480     31.88    136.81     0.943
Ovis aries             24797.9     19480     34.15    136.37     0.924
Pan paniscus             24758     19480     31.82    140.75     0.939
Pan troglodytes          24770     20970     32.19    142.92     0.953
Pongo abelii           24801.2     20970     30.49    151.55     1.004
Rattus norvegicus      25075.5     19480     28.24    140.27     0.969
Saimiri boliviensis    24925.3     22460      37.5    147.57     1.019
Sus scrofa             25039.2     20970     34.58    133.41     0.881
Figure 3

The relationship between amino acids and their percent composition in mtATP6 among different species is shown. The composition graph shows that mtATP6 is rich in amino acid L and poor in C.
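
A comparison of this kind can also be made numerically once the composition values are available. The sketch below is illustrative only: the class ResidueComparison and its methods are hypothetical names, and the example uses the leucine (L) percentages of four species copied from Table 1.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: compares one residue's percentage across species. */
final class ResidueComparison {
    /** Species with the highest percentage; the first one wins on ties. */
    static String richest(Map<String, Double> percentBySpecies) {
        String best = null;
        double max = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Double> e : percentBySpecies.entrySet()) {
            if (e.getValue() > max) {
                max = e.getValue();
                best = e.getKey();
            }
        }
        return best;
    }

    /** Species with the lowest percentage; the first one wins on ties. */
    static String poorest(Map<String, Double> percentBySpecies) {
        String best = null;
        double min = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, Double> e : percentBySpecies.entrySet()) {
            if (e.getValue() < min) {
                min = e.getValue();
                best = e.getKey();
            }
        }
        return best;
    }

    /** Arithmetic mean of the percentages. */
    static double mean(Map<String, Double> percentBySpecies) {
        double sum = 0.0;
        for (double value : percentBySpecies.values()) {
            sum += value;
        }
        return sum / percentBySpecies.size();
    }

    public static void main(String[] args) {
        // Leucine (L) percentages for four of the 17 species, copied from Table 1.
        Map<String, Double> leucine = new LinkedHashMap<String, Double>();
        leucine.put("Bos taurus", 19.5);
        leucine.put("Homo sapiens", 19.5);
        leucine.put("Mus musculus", 17.3);
        leucine.put("Pongo abelii", 22.1);
        System.out.println(richest(leucine) + " " + poorest(leucine) + " " + mean(leucine));
    }
}
```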

Other features

The interface also provides values for molecular weight, extinction coefficient, instability index, aliphatic index and grand average of hydropathicity (GRAVY) [9] for the protein sequences (Table 2), allowing comparison among the 17 mammalian species. This provides insight for functional analysis and molecular evolution.

Conclusion

The added feature of the MFPPI V.1.0 interface is its ability to calculate the physico-chemical properties of multiple protein sequences, along with comparative analysis of several physicochemical parameters, using ExPASy’s ProtParam server. The interface provides output in Excel sheet format for further statistical analysis, graph generation and visualization. MFPPI V.1.0 finds utility in understanding compositional changes and functional relationships in evolution among organisms. We have demonstrated this using 17 mtATP6 protein sequences from different mammalian species.

Acknowledgment:

The authors are grateful to the Centre for Bioinformatics, Institute of Science, Banaras Hindu University, Varanasi, Bharat (India) for providing the necessary infrastructure facilities to carry out this work.

Disclosure:

The authors report no conflict of interest regarding this work.

References

Review 1.  Protein identification and analysis tools in the ExPASy server.

Authors:  M R Wilkins; E Gasteiger; A Bairoch; J C Sanchez; K L Williams; R D Appel; D F Hochstrasser
Journal:  Methods Mol Biol       Date:  1999

2.  Fast calculation of molecular polar surface area as a sum of fragment-based contributions and its application to the prediction of drug transport properties.

Authors:  P Ertl; B Rohde; P Selzer
Journal:  J Med Chem       Date:  2000-10-05       Impact factor: 7.446

3.  Correlation between stability of a protein and its dipeptide composition: a novel approach for predicting in vivo stability of a protein from its primary sequence.

Authors:  K Guruprasad; B V Reddy; M W Pandit
Journal:  Protein Eng       Date:  1990-12

4.  Calculation of protein extinction coefficients from amino acid sequence data.

Authors:  S C Gill; P H von Hippel
Journal:  Anal Biochem       Date:  1989-11-01       Impact factor: 3.365

Review 5.  Chemiosmotic coupling in oxidative and photosynthetic phosphorylation.

Authors:  P Mitchell
Journal:  Biol Rev Camb Philos Soc       Date:  1966-08

6.  Thermostability and aliphatic index of globular proteins.

Authors:  A Ikai
Journal:  J Biochem       Date:  1980-12       Impact factor: 3.387

7.  A simple method for displaying the hydropathic character of a protein.

Authors:  J Kyte; R F Doolittle
Journal:  J Mol Biol       Date:  1982-05-05       Impact factor: 5.469
