
Mining of Microbial Genomes for the Novel Sources of Nitrilases.

Nikhil Sharma, Neerja Thakur, Tilak Raj, Tek Chand Bhalla.

Abstract

Next-generation DNA sequencing (NGS) has made it feasible to sequence large numbers of microbial genomes, and advances in computational biology have opened enormous opportunities to mine genome sequence data for novel genes and enzymes or their sources. In the present communication, in silico mining of microbial genomes was carried out to find novel sources of nitrilases. The selected sequences were analyzed for homology and considered for designing motifs. Manually designed motifs based on the amino acid sequences of nitrilases were used to screen 2000 microbial genomes (translated to proteomes). This identified 138 putative/hypothetical sequences that could potentially code for nitrilase activity. Nine predicted sources of nitrilases were validated in vitro for nitrile/cyanide hydrolyzing activity. Of these nine, Gluconacetobacter diazotrophicus, Sphingopyxis alaskensis, Saccharomonospora viridis, and Shimwellia blattae were specific for aliphatic nitriles, whereas nitrilases from Geodermatophilus obscurus, Nocardiopsis dassonvillei, Runella slithyformis, and Streptomyces albus were active on aromatic nitriles. Flavobacterium indicum was specific towards potassium cyanide (KCN) and showed no activity for aliphatic, aromatic, or aryl nitriles, which revealed the presence of a nitrilase homolog, namely cyanide dihydratase. The present study reports novel sources of nitrilases and cyanide dihydratase that had not previously been reported by in silico or in vitro studies.

Year:  2017        PMID: 28497061      PMCID: PMC5405348          DOI: 10.1155/2017/7039245

Source DB:  PubMed          Journal:  Biomed Res Int            Impact factor:   3.411


1. Introduction

Advances in DNA sequencing technologies have led to the sequencing of a large number of genomes, and enormous amounts of sequence data are available in the public domain. Fourth-generation DNA sequencing has made it possible to sequence a bacterial genome within a few hours at a reasonably low cost [1-4]. As of today, 5293 prokaryotic and 22 eukaryotic genomes have been completely sequenced, and the sequence data are easily accessible in databases such as NCBI, GOLD, and IMG/ER. Previous studies show that not all the gene/protein sequences in the databases are functionally characterized, which makes these repositories a rich source for the discovery of novel genes and proteins [5, 6]. Genome mining has emerged as an alternative approach for finding novel sources of desired genes/proteins, because conventional screening methods, which involve isolating microbes and screening them for desired products, are time consuming, tedious, and cost intensive [7, 8]. Microbial nitrilases are considered the most important enzymes in the nitrilase superfamily; they find application in the synthesis of fine chemicals, the production of some important acids and drug intermediates, and green chemistry [9-13]. Despite their wide applications, nitrilases have certain limitations, for example, inactivation or inhibition by the acidic product and by extremes of pH, temperature, and organic solvent [14, 15]. These limitations are being addressed either by isolating microorganisms from extreme habitats or by enrichment techniques for specific substrates using conventional microbiological procedures [6], which are themselves subject to the drawbacks mentioned above. The present communication focuses on in silico screening of publicly available bacterial genomes for nitrilase genes and in vitro validation of the predicted novel sources of nitrilases.

2. Materials and Methods

2.1. Genome Screening Using Homology and Motif Based Approach

Primary screening of microbial genomes (listed in the Supplementary Material available online at https://doi.org/10.1155/2017/7039245) was done using a homology based approach. Tblastn and blastp were used to screen the sequenced genomes with the query sequence to identify the presence and position of similar genes in each genome. Computationally predicted proteins from the bacterial genomes with the keyword “nitrilase/cyanide dihydratase” were also downloaded using the advanced search options in the IMG/ER database. Sequences with low (30%) or high (80%) similarity were discarded. Nitrilase genes in contigs showing the presence of nitrilase homologs were downloaded from IMG/ER. The GeneMarkS tool was used to predict the ORFs in each contig, and the output was downloaded with protein sequence selected as the output option. Sequences shorter than 100 amino acids were considered false positives (FP) and were discarded. A small amino acid sequence database was created and subjected to local BLAST to confirm the presence of a nitrilase homolog in the contigs of each individual genome. In parallel, protein based manually designed motifs (MDMs) were used to screen the bacterial genomes for the presence of conserved motifs using MAST (Motif Alignment and Search Tool) from the MEME (Multiple EM for Motif Elicitation) suite. The motifs used are described in our previous communication [12]. Motifs identified in sequences shorter than 100 amino acids were rejected as false positives (FP). Sequences longer than 100 amino acids were taken as true positives (TP).
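The filtering rules described in this section can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the `Hit` record and its field names are hypothetical, and the sketch assumes that "low" and "high" similarity mean below 30% and above 80% identity, that both bounds are kept, and that a sequence of exactly 100 amino acids is retained.

```python
from typing import Iterable, List, NamedTuple

# Thresholds taken from the text; treating them as inclusive bounds is an assumption.
MIN_IDENTITY = 30.0   # percent identity below this is treated as "low similarity"
MAX_IDENTITY = 80.0   # percent identity above this is treated as "high similarity"
MIN_LENGTH_AA = 100   # shorter predicted proteins are treated as false positives


class Hit(NamedTuple):
    """Hypothetical record for one BLAST hit against a predicted protein."""
    subject_id: str
    percent_identity: float
    length_aa: int


def is_candidate(hit: Hit) -> bool:
    """Return True if the hit passes the identity window and the length cutoff."""
    in_window = MIN_IDENTITY <= hit.percent_identity <= MAX_IDENTITY
    long_enough = hit.length_aa >= MIN_LENGTH_AA
    return in_window and long_enough


def filter_hits(hits: Iterable[Hit]) -> List[Hit]:
    """Keep only hits that would be carried forward as candidate nitrilases."""
    return [hit for hit in hits if is_candidate(hit)]


if __name__ == "__main__":
    demo = [
        Hit("orf_a", 42.0, 320),  # kept
        Hit("orf_b", 25.0, 310),  # discarded: identity too low
        Hit("orf_c", 91.0, 330),  # discarded: identity too high
        Hit("orf_d", 45.0, 80),   # discarded: shorter than 100 aa
    ]
    print([hit.subject_id for hit in filter_hits(demo)])
```

In practice the `Hit` records would be populated from BLAST tabular output and the predicted protein lengths; that parsing step is left out here.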

2.2. Study of Physicochemical Properties and Phylogenetic Analysis of Predicted Nitrilases

Physicochemical data for the in silico predicted nitrilases were generated with the ProtParam tool on the ExPASy server and compared with the values deduced in the previous nitrilase study [16]. The properties calculated were the number of amino acids, molecular weight (kDa), theoretical isoelectric point (pI), atomic composition, instability index, aliphatic index, and grand average of hydropathicity (GRAVY). A comparative chart was drawn between previously characterized and predicted nitrilases. A multiple sequence alignment of both previously characterized and predicted nitrilases, produced with Clustal W, was used to generate a Neighbor Joining (NJ) tree in MEGA version 6. The phylogenetic tree was generated to classify the predicted sequences as aliphatic or aromatic by comparison with previously characterized nitrilases.
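Two of the indices listed above have simple closed forms, sketched below. GRAVY is the mean Kyte-Doolittle hydropathy value over all residues, and the aliphatic index is computed from the mole percentages of Ala, Val, Ile, and Leu using the standard weighting coefficients 2.9 and 3.9. The function names are illustrative, and the sketch assumes a sequence containing only the 20 standard amino acids; it is not the ProtParam implementation itself.

```python
# Kyte-Doolittle hydropathy scale for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathicity: mean hydropathy value per residue."""
    sequence = sequence.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def aliphatic_index(sequence: str) -> float:
    """Aliphatic index: X(Ala) + 2.9 * X(Val) + 3.9 * (X(Ile) + X(Leu)),
    where X is the mole percentage of each residue."""
    sequence = sequence.upper()
    n = len(sequence)

    def mole_percent(aa: str) -> float:
        return 100.0 * sequence.count(aa) / n

    return (
        mole_percent("A")
        + 2.9 * mole_percent("V")
        + 3.9 * (mole_percent("I") + mole_percent("L"))
    )


if __name__ == "__main__":
    toy = "MAVILKE"  # toy peptide for illustration, not a nitrilase sequence
    print(round(gravy(toy), 3), round(aliphatic_index(toy), 1))
```

A positive GRAVY value indicates a more hydrophobic protein on average, and a higher aliphatic index reflects a larger relative volume of aliphatic side chains.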

2.3. Nitrilase Activity Assay

Cultures of some of the bacteria predicted to have a nitrilase gene (Shimwellia blattae, Runella slithyformis, Geodermatophilus obscurus, Nocardiopsis dassonvillei, Streptomyces albus, Flavobacterium indicum, Saccharomonospora viridis, Sphingopyxis alaskensis, and Gluconacetobacter diazotrophicus) were procured from the Microbial Type Culture Collection (MTCC), Chandigarh. Escherichia coli BL21 (DE3) from Invitrogen was used as a negative control, as this organism does not have a nitrilase gene. These cultures were grown in the laboratory using different media (Table 1) for the production of nitrilase activity following the procedures described earlier [17-19]. Nitrilase activity was assayed in a 1.0 mL reaction mixture containing nitrile as substrate (1–10 mM) and 0.1 mL resting cells. After 30 min of incubation at 30°C, the reaction was quenched with 0.1 M HCl, and the amount of ammonia released was estimated by the modified phenate-hypochlorite method described by Dennett and Blamey [20]. One unit of nitrilase activity was defined as the amount of enzyme required to release 1 μmol of ammonia per min under the assay conditions.
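The unit definition above translates directly into a small calculation. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the ammonia amount is taken as already determined from the colorimetric assay, and specific activity is normalized to milligrams of dry cell weight (dcw), the basis used for the activities reported in Table 6.

```python
def activity_units(ammonia_umol: float, incubation_min: float) -> float:
    """Units of nitrilase activity: µmol of ammonia released per minute."""
    if incubation_min <= 0:
        raise ValueError("incubation time must be positive")
    return ammonia_umol / incubation_min


def specific_activity(ammonia_umol: float, incubation_min: float,
                      dry_cell_weight_mg: float) -> float:
    """Specific activity in units per mg dry cell weight (U/mg dcw)."""
    if dry_cell_weight_mg <= 0:
        raise ValueError("dry cell weight must be positive")
    return activity_units(ammonia_umol, incubation_min) / dry_cell_weight_mg


if __name__ == "__main__":
    # Illustrative numbers only (not measured values from the study):
    # 3.0 µmol ammonia released in the 30 min assay by 2.0 mg dcw of resting cells.
    print(specific_activity(3.0, 30.0, 2.0))
```

With the 30 min incubation used here, releasing 30 µmol of ammonia corresponds to exactly one unit.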
Table 1: Composition of various media used to cultivate procured strains for nitrilase production.

| Name of the organism | MTCC number | Composition (g L−1) | pH | Growth temperature |
|---|---|---|---|---|
| Shimwellia blattae ATCC 29907 | 4155 | Beef extract: 1.0 g; Yeast extract: 2.0 g; Peptone: 5.0 g; NaCl: 5.0 g; Agar: 15.0 g | 7.0–7.5 | 37°C |
| Runella slithyformis ATCC 29530 | 9504 | Glucose: 1.0 g; Peptone: 1.0 g; Yeast extract: 1.0 g; Agar: 15.0 g; Glucose: 4.0 g | 7.0–7.5 | 26°C |
| Geodermatophilus obscurus DSM 43160 | 4040 | Yeast extract: 4.0 g; Malt extract: 10.0 g; CaCO3: 2.0 g; Agar: 12.0 g | 7.2–7.5 | 28°C |
| Nocardiopsis dassonvillei DSM 43111 | 1411 | Yeast extract: 4.0 g; Malt extract: 1.0 g; Glucose: 4.0 g; Agar: 20.0 g | 7.2–7.4 | 28°C |
| Streptomyces albus J1074 | 1138 | Yeast extract: 4.0 g; Malt extract: 1.0 g; Glucose: 4.0 g; Agar: 20.0 g | 7.2–7.4 | 25°C |
| Flavobacterium indicum DSM 17447 | 6936 | Tryptic soy broth with agar (TSBA-100) | 7.3–7.5 | 30°C |
| Saccharomonospora viridis ATCC 15386 | 320 | Yeast extract: 4.0 g; Malt extract: 1.0 g; Glucose: 4.0 g; Agar: 20.0 g | 7.2–7.4 | 45°C |
| Sphingopyxis alaskensis DSM 13593 | 7504 | Beef extract: 1.0 g; Yeast extract: 2.0 g; Peptone: 5.0 g; NaCl: 5.0 g; Agar: 15.0 g | 7.0–7.5 | 30°C |
| Gluconacetobacter diazotrophicus ATCC 49037 | 1224 | Yeast extract: 5.0 g; Peptone: 3.0 g; Mannitol: 25.0 g; Agar: 15.0 g | 7.0–7.3 | 28°C |
| Escherichia coli BL21 (DE3)* | | Yeast extract: 5.0 g; NaCl: 10.0 g; Casein enzymatic hydrolysate: 10.0 g | 7.0–7.5 | 37°C |

*Negative control.

3. Results

3.1. Genome Screening Using Conserved Motifs and Homology Search

As many as 138 candidate sequences were identified at both the gene and protein level using the homology based approach (tblastn and blastp) at IMG/ER. To identify newer sources of nitrilases, candidate sequences with unassigned functions (hypothetical, uncharacterized, or putative) were selected from the translated genomes (Table 2). The identified sequences shared 30–50% sequence identity with the biochemically characterized Rhodococcus rhodochrous J1 nitrilase, which was used as the query sequence. Catalytic residues were conserved in all the predicted proteins. Nine predicted and translated sequences were then chosen for in silico and in vitro validation based on the manually designed motifs (MDMs) (Tables 3 and 4) identified in a previous study [12].
Table 2: Predicted ORFs in individual scaffolds and predicted coding region for nitrilase, obtained using IMG/ER.

| Name of organism | Accession number (scaffold or genome length, bp) | Total number of ORFs predicted in scaffold or complete genome | Predicted coding region for nitrilase | Number of base pairs |
|---|---|---|---|---|
| Acaryochloris marina MBIC11017 | NC_009925 (6503724 bp) | 152 | 200001–200999 | 999 |
| Acetobacter pasteurianus IFO 3283-32 | AP011157 (191443 bp) | 120 | 174107–173133 | 974 |
| Achromobacter xylosoxidans A8 | NC_014640 (7013095 bp) | 406 | 200001–200960 | 960 |
| Acidovorax avenae avenae ATCC 19860 | NC_015138 (5482170 bp) | 188 | 201035–200001 | 1035 |
| Acidothermus cellulolyticus 11B | NC_008578 (2443540 bp) | 403 | 200001–201131 | 1131 |
| Acidaminococcus fermentans VR4 | NC_013740 (2329769 bp) | 293 | 200924–200001 | 924 |
| Alcanivorax dieselolei B5 | CP003466 (4928223 bp) | 343 | 200001–200981 | 981 |
| Arthrobacter aurescens TC1 | NC_008711 (4597686 bp) | 385 | 200001–200930 | 930 |
| Azorhizobium caulinodans ORS 571 | NC_009937 (5369772 bp) | 262 | 89665–88580 | 1083 |
| Azospirillum sp. B510 | NC_013854 (3311395 bp) | 402 | 200001–200921 | 921 |
| Bacillus pumilus SAFR-032 | NC_009848 (3704465 bp) | 73 | 201026–200001 | 1026 |
| Bradyrhizobium japonicum USDA 110 | NC_004463 (9105828 bp) | 387 | 200001–200966 | 966 |
| Bradyrhizobium sp. BTAi1 | NC_009485 (8264687 bp) | 392 | 201146–200001 | 1146 |
| Bradyrhizobium sp. ORS278 | NC_009445 (7456587 bp) | 395 | 201041–200001 | 1041 |
| Brevibacillus brevis NBRC 100599 | NC_012491 (6296436 bp) | 182 | 200001–200960 | 960 |
| Flavobacterium indicum GPTSA100-9 | HE774682 (2993089 bp) | 317 | 200001–200981 | 981 |
| Saccharomonospora viridis P101 | NC_013159 (4308349 bp) | 315 | 200001–200996 | 996 |
| Sphingopyxis alaskensis DSM13593 | NC_008048 (3345170 bp) | 387 | 200001–201017 | 1017 |
| Burkholderia cenocepacia J2315 | NC_011000 (3870082 bp) | 393 | 199944–201050 | 1050 |
| Burkholderia glumae BGR1 | NC_012720 (141067 bp) | 154 | 47491–48477 | 1017 |
| Burkholderia gladioli BSR3 | NC_015376 (3700833 bp) | 338 | 200001–201014 | 1014 |
| Burkholderia phymatum | NC_010623 (2697374 bp) | 375 | 199971–201023 | 1023 |
| Burkholderia phytofirmans | NC_010681 (4467537 bp) | 357 | 200001–201035 | 1035 |
| Burkholderia sp. CCGE1002 | NC_014119 (1282816 bp) | 280 | 72013–73041 | 1020 |
| Burkholderia sp. CCGE1003 | NC_014540 (2966498 bp) | 344 | 200019–201041 | 1022 |
| Burkholderia vietnamiensis G4 | NC_009254 (1241007 bp) | 436 | 199986–201023 | 1037 |
| Burkholderia xenovorans LB400 | NC_007951 (4895836 bp) | 396 | 200001–200996 | 996 |
| Caulobacter sp. K31 | NC_010335 (233649 bp) | 219 | 180936–181871 | 935 |
| Chlorobium phaeobacteroides BS1 | NC_010831 (2736403 bp) | 382 | 200001–200936 | 936 |
| Clostridium difficile 630 | NC_009089 (4290252 bp) | 364 | 200001–200927 | 927 |
| Clostridium difficile CD196 | NC_013315 (4110554 bp) | 308 | 200001–200927 | 927 |
| Clostridium difficile R20291 | NC_013316 (4191339 bp) | 329 | 200001–200927 | 927 |
| Clostridium kluyveri NBRC 12016 | NC_011837 (3896121 bp) | 442 | 200001–200930 | 957 |
| Clostridium kluyveri ATCC 8527 | NC_009706 (3964618 bp) | 491 | 200001–200930 | 930 |
| Conexibacter woesei DSM 14684 | NC_013739 (6359369 bp) | 388 | 200001–200942 | 942 |
| Cupriavidus necator ATCC 17699 | NC_008313 (4052032 bp) | 318 | 200001–201017 | 1017 |
| Cupriavidus necator ATCC 43291 | NC_015726 (3872936 bp) | 318 | 200001–201017 | 1017 |
| Cyanobium gracile ATCC 27147 | Cyagr_Contig81 (3342364 bp) | 405 | 200001–200999 | 999 |
| Deinococcus deserti (strain VCD115) | NC_012529 (314317 bp) | 269 | 200001–200951 | 951 |
| Deinococcus peraridilitoris DSM 19664 | Deipe_Contig72.1 (3881839 bp) | 412 | 200001–200951 | 951 |
| Desulfomonile tiedjei ATCC 49306 | Desti_Contig107.1 (6500104 bp) | 379 | 200001–201029 | 1029 |
| Dickeya zeae Ech1591 | NC_012912 (4813854 bp) | 194 | 200001–200927 | 927 |
| Erwinia billingiae Eb661 | NC_014305 (169778 bp) | 194 | 87964–88965 | 1001 |
| Erythrobacter litoralis HTCC2594 | NC_007722 (3052398 bp) | 411 | 200001–200969 | 969 |
| Flavobacterium indicum DSM 17447 | HE774682 (2993089 bp) | 317 | 200001–200981 | 981 |
| Frateuria aurantia ATCC 33424 | Fraau_Contig24.1 (3603458 bp) | 366 | 200001–200924 | 924 |
| Geobacillus sp. Y4.1MC1 | NC_014650 (3840330 bp) | 434 | 200001–200966 | 966 |
| Geobacillus thermoglucosidasius C56-YS93 | NC_015660 (3893306 bp) | 446 | 200001–200966 | 966 |
| Geodermatophilus obscurus DSM 43160 | NC_013757 (5322497 bp) | 244 | 54102–54884 | 783 |
| Gluconacetobacter diazotrophicus ATCC 49037 | NC_010125 (3944163 bp) | 333 | 200001–200960 | 960 |
| Haliangium ochraceum DSM 14365 | CP002175 (2309262 bp) | 377 | 200001–200957 | 957 |
| Halanaerobium praevalens ATCC 33744 | NC_013440 (9446314 bp) | 262 | 200001–200999 | 999 |
| Hyphomicrobium sp. MC1 | NC_015717 (4757528 bp) | 392 | 200001–200984 | 984 |
| Janthinobacterium sp. Marseille | NC_009659 (4110251 bp) | 398 | 200001–201068 | 1068 |
| Jannaschia sp. CCS1 | NC_007802 (4317977 bp) | 382 | 200001–201026 | 1026 |
| Maricaulis maris MCS10 | NC_008347 (3368780 bp) | 392 | 200001–200933 | 933 |
| Methylobacterium extorquens CM4 | NC_011758 (380207 bp) | 211 | 7191–8267 | 1077 |
| Methylobacterium extorquens ATCC 14718 | NC_012811 (1261460 bp) | 436 | 200001–201077 | 1077 |
| Methylobacterium extorquens DM4 | NC_012988 (5943768 bp) | 378 | 200001–200918 | 918 |
| Methylobacterium extorquens PA1 | NC_010172 (5471154 bp) | 354 | 200001–201110 | 1110 |
| Methylomonas methanica MC09 | Contig38 (5051681 bp) | 402 | 200001–200996 | 996 |
| Methylobacterium nodulans ORS2060 | NC_011892 (487734 bp) | 425 | 200001–201116 | 1116 |
| Methylobacterium populi ATCC BAA-705 | NC_010725 (5800441 bp) | 193 | 61617–62693 | 1077 |
| Methylibium petroleiphilum PM1 | NC_008825 (4044195 bp) | 364 | 200001–201074 | 1074 |
| Methylobacterium radiotolerans ATCC 27329 | NC_010505 (6077833 bp) | 377 | 200001–201077 | 1077 |
| Methylocella silvestris BL2 | NC_011666 (4305430 bp) | 439 | 199971–201029 | 1029 |
| Mycobacterium intracellulare ATCC 13950 | CP003322 (5402402 bp) | 383 | 199938–200897 | 897 |
| Mycobacterium liflandii 128FXT | CP003899 (6208955 bp) | 405 | 200001–201059 | 1059 |
| Mycobacterium rhodesiae NBB3 | MycrhN_Contig54.1 (6415739 bp) | 267 | 200001–200957 | 957 |
| Mycobacterium smegmatis ATCC 700084 | CP001663 (6988208 bp) | 377 | 200001–200978 | 978 |
| Natranaerobius thermophilus ATCC BAA-1301 | NC_010718 (3165557 bp) | 387 | 200001–200930 | 930 |
| Nocardia farcinica IFM 10152 | NC_006361 (6021225 bp) | 390 | 198993–199811 | 818 |
| Nocardiopsis dassonvillei DSM 43111 | NC_014211 (775354 bp) | 353 | 201134–200001 | 843 |
| Oligotropha carboxidovorans ATCC 49405 | CP002826 (3595748 bp) | 372 | 200001–201065 | 1065 |
| Pantoea sp. At-9b | NC_014839 (394054 bp) | 349 | 114577–115581 | 1005 |
| Peptoniphilus duerdenii ATCC BAA-1640 | NZ_AEEH01000050 (96694 bp) | 80 | 52942–53863 | 921 |
| Photorhabdus asymbiotica ATCC 43949 | NC_012962 (5064808 bp) | 338 | 200001–201050 | 1050 |
| Pirellula staleyi ATCC 27377 | NC_013720 (6196199 bp) | 338 | 200001–200909 | 909 |
| Polaromonas naphthalenivorans CJ2 | NC_008781 (4410291 bp) | 389 | 200001–201041 | 1062 |
| Polaromonas sp. JS666 | NC_007948 (5200264 bp) | 398 | 200001–200942 | 942 |
| Pseudomonas syringae pv. lachrymans M302278PT | Lac106_115287.20 (115287 bp) | 107 | 47704–48747 | 1043 |
| Pseudoalteromonas atlantica ATCC BAA-1087 | NC_008228 (5187005 bp) | 397 | 200001–200921 | 921 |
| Pseudomonas aeruginosa P7-L633/96 | Ga0060317_132 (369634 bp) | 270 | 91986–92801 | 816 |
| Pseudomonas brassicacearum NFM421 | NC_015379 (6843248 bp) | 377 | 200001–201026 | 1026 |
| Pseudomonas sp. TJI-51 | AEWE01000051 (6502 bp) | 5 | 1482–2498 | 1017 |
| Pseudomonas fluorescens Pf-5 | NC_007492 (6438405 bp) | 349 | 200001–200924 | 924 |
| Pseudomonas fluorescens SBW25 | NC_012660 (6722539 bp) | 376 | 200043–200930 | 888 |
| Pseudomonas mendocina NK-01 | NC_015410 (5434353 bp) | 376 | 200001–200883 | 883 |
| Pseudomonas syringae pv. tomato DC 3000 | PSPTOimg_DC3000 (6397126 bp) | 377 | 200001–201011 | 1011 |
| Pseudomonas syringae pv. syringae B728a | NC_007005 (6093698 bp) | 196 | 8233–9231 | 999 |
| Pseudoxanthomonas suwonensis 11-1 | NC_014924 (3419049 bp) | 362 | 200001–200885 | 885 |
| Pseudonocardia dioxanivorans ATCC 55486 | CP002593 (7096571 bp) | 386 | 200001–201008 | 1008 |
| Ralstonia solanacearum GMI1000 | NC_003295 (3716413 bp) | 343 | 200001–201032 | 1032 |
| Rhizobium hainanense CCBAU 57015 | Ga0061100_113 (148344 bp) | 146 | 61240–62280 | 1040 |
| Rhizobium leguminosarum bv. viciae 3841 | NC_008380 (5057142 bp) | 397 | 200001–201047 | 1047 |
| Rhizobium leguminosarum bv. trifolii WSM1325 | NC_012850 (4767043 bp) | 210 | 18450–19442 | 993 |
| Rhodopseudomonas palustris TIE-1 | NC_011004 (5744041 bp) | 387 | 199980–201050 | 1070 |
| Rhodopseudomonas palustris DX-1 | NC_014834 (5404117 bp) | 390 | 200001–200954 | 954 |
| Rubrobacter xylanophilus DSM 9941 | NC_008148 (3225748 bp) | 385 | 200001–201080 | 1080 |
| Ruegeria pomeroyi ATCC 700808 | NC_006569 (491611 bp) | 308 | 118859–119893 | 1035 |
| Runella slithyformis ATCC 29530 | Unknown (6568739 bp) | 362 | 200001–200933 | 933 |
| Saccharothrix espanaensis ATCC 51144 | HE804045 (9360653 bp) | 347 | 200001–201020 | 1020 |
| Saccharomonospora viridis ATCC 15386 | NC_013159 (4308349 bp) | 315 | 200001–200996 | 996 |
| Shewanella halifaxensis HAW-EB4 | NC_010334 (5226917 bp) | 337 | 200001–200945 | 945 |
| Shewanella pealeana ATCC 700345 | NC_009901 (5174581 bp) | 333 | 200001–200945 | 945 |
| Shewanella sediminis HAW-EB3 | NC_009831 (5517674 bp) | 337 | 200001–200954 | 954 |
| Shewanella violacea JCM 1017 | NC_014012 (4962103 bp) | 307 | 200001–200936 | 936 |
| Shewanella woodyi ATCC 51908 | NC_010506 (5935403 bp) | 327 | 200001–201005 | 1005 |
| Shimwellia blattae ATCC 29907 | EBLc (4158725 bp) | 376 | 200001–201029 | 1029 |
| Singulisphaera acidiphila ATCC 1392 | Sinac_Contig49.1 (9629675 bp) | 337 | 200001–201014 | 1014 |
| Sorangium cellulosum Soce56 | NC_010162 (13033779 bp) | 329 | 200001–201029 | 1029 |
| Sphingopyxis alaskensis DSM 13593 | NC_008048 (3345170 bp) | 387 | 200001–201017 | 1016 |
| Sphaerobacter thermophilus DSM 20745 | NC_013524 (1252731 bp) | 335 | 200097–201092 | 995 |
| Sphingomonas wittichii RW1 | NC_009511 (5382261 bp) | 354 | 200001–201026 | 1026 |
| Spirosoma linguale ATCC 33905 | NC_013730 (8078757 bp) | 339 | 200001–200906 | 906 |
| Starkeya novella ATCC 8093 | NC_007604 (2695903 bp) | 402 | 200001–201005 | 1005 |
| Streptomyces albus J1074 | CP004370 (6841649 bp) | 252 | 1635309–1636256 | 948 |
| Synechococcus elongatus PCC 7942 | NC_007604 (2695903 bp) | 402 | 200001–201005 | 1005 |
| Syntrophobacter fumaroxidans DSM 10017 | NC_008554 (4990251 bp) | 337 | 200001–200987 | 987 |
| Synechococcus sp. ATCC 27264 | NC_010475 (3008047 bp) | 431 | 200001–201008 | 1008 |
| Synechococcus elongatus PCC 6301 | NC_006576 (2696255 bp) | 402 | 200001–201005 | 1005 |
| Synechococcus sp. PCC 7002 | NC_010475 (3008047 bp) | 431 | 200001–201008 | 1008 |
| Synechococcus sp. WH8102 | NC_005070 (2434428 bp) | 537 | 200001–201017 | 1017 |
| Synechocystis sp. | CP003265 (3569561 bp) | 371 | 200001–201026 | 1026 |
| Synechocystis sp. PCC 6803 | NC_017052 (3570103 bp) | 374 | 200001–201026 | 1026 |
| Terriglobus roseus KBS 63 | Terro_Contig51.1 (5227858 bp) | 354 | 200001–200873 | 873 |
| Tistrella mobilis KA081020-065 | CP003239 (1126962 bp) | 379 | 200001–201077 | 1077 |
| Variovorax paradoxus (strain EPS) | NC_014931 (6550056 bp) | 360 | 200001–201035 | 1035 |
| Variovorax paradoxus S110 | NC_012791 (5626353 bp) | 420 | 200001–201053 | 1053 |
| Verminephrobacter eiseniae EF01-2 | NC_008786 (5566749 bp) | 337 | 200001–200987 | 1020 |
| Zobellia galactanivorans DSM 12802 | FG20DRAFT (5340688 bp) | 331 | 200001–200951 | 951 |
| Zymomonas mobilis subsp. mobilis ATCC 10988 | NZ_ACQU01000006 (113352 bp) | 113 | 82520–83509 | 990 |
Table 3: Manually designed motifs (MDMs) for aliphatic and aromatic nitrilases showing the presence of the essential catalytic triad (E, K, and C).

| Nitrilases | Manually designed motif |
|---|---|
| Aliphatic | [FL]-[ILV]-[AV]-F-P-E-[VT]-[FW]-[IL]-P-[GY]-Y-P-[WY] |
| Aliphatic | R-R-K-[LI]-[KRI]-[PA]-T-[HY]-[VAH]-E-R |
| Aliphatic | C-W-E-H-[FLX]-[NQ]-[PT]-L |
| Aliphatic | [VA]-A-X-[AV]-Q-[AI]-X-P-[VA]-X-[LF]-[SD] |
| Aromatic | [ALV]-[LV]-[FLM]-P-E-[AS]-[FLV]-[LV]-[AGP]-[AG]-Y-P |
| Aromatic | [AGN]-[KR]-H-R-K-L-[MK]-P-T-[AGN]-X-E-R |
| Aromatic | C-W-E-N-[HY]-M-P-[LM]-[AL]-R-X-X-[ML]-Y |
| Aromatic | A-X-E-G-R-C-[FW]-V-[LIV] |
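The motif notation in Table 3 (bracketed alternatives, X for any residue) maps naturally onto regular expressions. The sketch below converts a motif string to a Python regex and reports where it first occurs in a protein sequence. It is a simplified stand-in, not the method used in the study: the screening described here was done with MAST, which scores matches rather than requiring an exact pattern match, so this exact-match version would miss diverged sequences. The function names are illustrative, and treating a bracket that contains X as "any residue" is an assumption.

```python
import re
from typing import Optional, Pattern


def motif_to_regex(motif: str) -> Pattern[str]:
    """Convert a hyphen-separated motif such as 'C-W-E-H-[FLX]-[NQ]-[PT]-L'
    into a compiled regular expression."""
    parts = []
    for element in motif.split("-"):
        element = element.strip()
        if not element:
            continue  # tolerate stray or trailing hyphens
        if "X" in element:
            parts.append(".")      # X (alone or inside brackets) matches any residue
        else:
            parts.append(element)  # single letter or [ABC] class is valid regex as-is
    return re.compile("".join(parts))


def find_motif(sequence: str, motif: str) -> Optional[int]:
    """Return the 0-based start of the first exact motif match, or None."""
    match = motif_to_regex(motif).search(sequence.upper())
    return match.start() if match else None


if __name__ == "__main__":
    aliphatic_motif = "C-W-E-H-[FLX]-[NQ]-[PT]-L"
    # Toy sequence for illustration: a short prefix followed by a motif instance.
    print(find_motif("MKCWEHYNPLA", aliphatic_motif))
```

A scoring-based tool remains preferable for real screening; the regex form is mainly useful for quick sanity checks of individual sequences against a motif.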
Table 4: Aliphatic and aromatic nitrilase motif patterns in the predicted nitrilases (1–9), showing the catalytic center (E, K, and C).

| Nitrilases | Manually designed motif | Patterns in predicted nitrilases |
|---|---|---|
| Aliphatic | [FL]-[ILV]-[AV]-F-P-E-[VT]-[FW]-[IL]-P-[GY]-Y-P-[WY] | A-F-P-E-V-F-V-P-A-Y-P-Y; F-P-E-L-W-L-P-G-Y-P-I-F; F-P-E-V-F-I-S-G-Y-P-Y-W-N-W; F-P-E-V-F-I-A-G-Y; F-P-E-T-F-V-P-Y-Y-P-Y |
| Aliphatic | R-R-K-[LI]-[KRI]-[PA]-T-[HY]-[VAH]-E-R | L-R-R-K-L-V-P-T-W; R-R-K-L-K-P-T-H-V-E-R; R-K-L-V-P-T-W-A-E-K-L-T; R-H-R-K-L-V-P-T-W-A-E-R; R-R-K-I-T-P-T-Y-H-E-R |
| Aliphatic | C-W-E-H-[FLX]-[NQ]-[PT]-L | C-G-E-N-T-N-T-L-A; C-A-E-N-M-Q-P-L; C-G-E-N-T-N-T-L-A; C-G-E-N-T-N-T-L-A-R-F-S; C-W-E-H-Y-N-P-L- |
| Aliphatic | [VA]-A-X-[AV]-Q-[AI]-X-P-[VA]-X-[LF]-[SD] | V-A-A-V-Q-A-A-P-V-F-L-D-P; V-A-S-V-Q-A-E; V-Q-T-A-P-V-F-L-N-V-E; A-A-V-Q-A-A-P-V-F-L; A-A-V-Q-I-S-P-V-L- |
| Aromatic | [ALV]-[LV]-[FLM]-P-E-[AS]-[FLV]-[LV]-[AGP]-[AG]-Y-P | F-Q-E-V-F-N-A; P-E-S-F-I-P-C-Y-P-R-G; F-P-E-A-F-L-G-T-Y-P; S-E-T-F-S-T-G |
| Aromatic | [AGN]-[KR]-H-R-K-L-[MK]-P-T-[AGN]-X-E-R | R-K-H-H-I-P-Q-V; H-R-K-L-K-P-T-G-L-E-R; H-R-K-V-M-P-T-G-A-E-R; R-K-L-H-P-F-T |
| Aromatic | C-W-E-N-[HY]-M-P-[LM]-[AL]-R-X-X-[ML]-Y | C-Y-D-R-H; C-W-E-N-Y-M-P-L-A-R-M; C-W-E-N-Y-M-P-L-L-R-A; C-Y-D-L-R-F-A |
| Aromatic | A-X-E-G-R-C-[FW]-V-[LIV] | A-H-L-W-K-L-E; A-L-E-G-R-C-F-V-L-A; A-L-E-G-R-C-W-V; A-I-E-N-Q-A-Y-V |

3.2. Physicochemical Parameters and Phylogenetic Analysis

The in silico identified nitrilases were analyzed for their physicochemical properties using ProtParam, an online tool at the ExPASy proteomic server. Average values deduced for aliphatic and aromatic nitrilases from earlier characterized proteins were taken as the standard for comparison with the predicted set of nitrilases. The values for the selected candidates were very similar to the data published earlier by Sharma and Bhalla [16] (Table 5). The total number of amino acids ranged from 260 (Nocardiopsis dassonvillei) to 342 (Shimwellia blattae), with differing molecular weights. The isoelectric point ranged between 4.8 and 5.8, close to the consensus value, that is, the average value from previously characterized aliphatic or aromatic nitrilases.
Table 5: Comparison of physicochemical properties of aliphatic, aromatic, and predicted nitrilases with the average consensus values reported by Sharma and Bhalla [16].

| Parameters | Average value for aliphatic | Average value for aromatic | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Number of amino acids | 352.2 | 309.8 | 338.0 | 331.0 | 326.0 | 260.0 | 280.0 | 310.0 | 315.0 | 342.0 | 319.0 |
| Molecular weight (Da) | 38274.0 | 33693.5 | 36154.9 | 36491.2 | 36364.7 | 27903.3 | 31464.1 | 34938.1 | 33821.5 | 37472.7 | 34678.7 |
| Theoretical pI | 5.5 | 5.5 | 5.0 | 4.9 | 6.2 | 5.2 | 5.6 | 5.4 | 4.8 | 5.4 | 5.8 |
| NCR | 41.7 | 35.8 | 41.0 | 44.0 | 40.0 | 32.0 | 36.0 | 43.0 | 43.0 | 41.0 | 39.0 |
| PCR | 30.3 | 29.2 | 26.0 | 25.0 | 37.0 | 21.0 | 27.0 | 34.0 | 29.0 | 30.0 | 32.0 |
| Extinction coefficients (M−1 cm−1) at 280 nm | 50213.3 | 43975.0 | 45295.0 | 33015.0 | 43890.0 | 35200.0 | 62465.0 | 53400.0 | 47900.0 | 38305.0 | 31775.0 |
| Instability index | 41.2 | 38.5 | 30.1 | 52.5 | 27.0 | 27.7 | 28.6 | 39.6 | 46.6 | 36.6 | 38.5 |
| Aliphatic index | 89.40 | 89.90 | 94.1 | 87.9 | 93.6 | 81.1 | 76.0 | 90.9 | 86.2 | 92.8 | 89.3 |
| Grand average of hydropathicity (GRAVY) | 0.100 | 0.01 | 0.027 | −0.17 | −0.14 | −0.051 | −0.283 | −0.109 | 0.045 | −0.052 | −0.002 |

NCR: negatively charged residues; PCR: positively charged residues.

The Neighbor Joining (NJ) tree built using MEGA 6 shows the phylogenetic relationship between the in silico predicted sequences from completely sequenced microbial genomes and previously characterized nitrilase sequences. The predicted sequences were distinguished as aliphatic or aromatic according to their position in the phylogenetic tree (Figure 1).

Figure 1: Neighbor Joining (NJ) tree differentiating characterized and in silico predicted nitrilases as aliphatic and aromatic.

3.3. In Vitro Validation of Some In Silico Predicted Nitrilases

To validate the nitrile transforming activity of the nine predicted novel sources of nitrilases, they were tested against common aliphatic, aromatic, and aryl nitriles and potassium cyanide (KCN). Gluconacetobacter diazotrophicus, Sphingopyxis alaskensis, Saccharomonospora viridis, and Shimwellia blattae were found to be more specific for aliphatic nitriles. In contrast, Geodermatophilus obscurus, Nocardiopsis dassonvillei, Runella slithyformis, and Streptomyces albus exhibited nitrilase activity for aromatic nitriles. Flavobacterium indicum was the only organism that showed no activity for aliphatic, aromatic, or aryl nitriles but was specific towards the degradation of potassium cyanide (KCN) (Table 6). The negative control, Escherichia coli BL21 (DE3), showed no activity for any of the substrates tested.
Table 6: Nitrilase activity* of in silico predicted microbial sources of nitrilases assayed using common aliphatic, aromatic, and aryl aliphatic nitriles and KCN as substrates.

| Organisms | Valeronitrile | Benzonitrile | Mandelonitrile | Isobutyronitrile | Adiponitrile | 2-Cyanopyridine | Propionitrile | Acrylonitrile | KCN |
|---|---|---|---|---|---|---|---|---|---|
| Streptomyces albus J1074 | 0.0015 | 0.0027 | ND | 0.0014 | ND | ND | ND | ND | ND |
| Nocardiopsis dassonvillei DSM 43111 | ND | 0.0040 | 0.0024 | ND | ND | ND | ND | ND | ND |
| Geodermatophilus obscurus DSM 43160 | ND | 0.0043 | 0.0021 | ND | ND | ND | ND | ND | ND |
| Shimwellia blattae ATCC 29907 | 0.0028 | ND | 0.0016 | 0.0019 | ND | ND | ND | ND | ND |
| Runella slithyformis ATCC 29530 | ND | 0.0297 | 0.0152 | 0.0095 | ND | ND | 0.0169 | ND | ND |
| Gluconacetobacter diazotrophicus ATCC 49037 | ND | 0.0016 | 0.0020 | 0.0051 | 0.0048 | ND | ND | ND | ND |
| Sphingopyxis alaskensis DSM 13593 | ND | 0.00073 | ND | 0.0024 | 0.00075 | ND | ND | ND | ND |
| Saccharomonospora viridis ATCC 15386 | ND | ND | ND | 0.0030 | ND | ND | ND | ND | ND |
| Flavobacterium indicum DSM 17447 | ND | ND | ND | ND | ND | ND | ND | ND | 0.25 |
| Escherichia coli BL21 (DE3)** | ND | ND | ND | ND | ND | ND | ND | ND | ND |

*Expressed as µmol of ammonia released/min/mg dcw under the assay conditions; ND = not detected; **negative control.
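As a cross-check of the substrate preferences stated in the text, the detected activities in Table 6 can be encoded and the most active substrate picked for each organism. The sketch below is illustrative: the values are transcribed from Table 6 as read here, the function name is hypothetical, and the assignment of each substrate to a class (aliphatic, aromatic, aryl aliphatic, cyanide) follows general chemistry rather than an explicit list in the paper.

```python
from typing import Dict, Optional, Tuple

# Detected activities (µmol ammonia/min/mg dcw) transcribed from Table 6; ND entries omitted.
ACTIVITY: Dict[str, Dict[str, float]] = {
    "Streptomyces albus J1074": {
        "valeronitrile": 0.0015, "benzonitrile": 0.0027, "isobutyronitrile": 0.0014},
    "Nocardiopsis dassonvillei DSM 43111": {
        "benzonitrile": 0.0040, "mandelonitrile": 0.0024},
    "Geodermatophilus obscurus DSM 43160": {
        "benzonitrile": 0.0043, "mandelonitrile": 0.0021},
    "Shimwellia blattae ATCC 29907": {
        "valeronitrile": 0.0028, "mandelonitrile": 0.0016, "isobutyronitrile": 0.0019},
    "Runella slithyformis ATCC 29530": {
        "benzonitrile": 0.0297, "mandelonitrile": 0.0152,
        "isobutyronitrile": 0.0095, "propionitrile": 0.0169},
    "Gluconacetobacter diazotrophicus ATCC 49037": {
        "benzonitrile": 0.0016, "mandelonitrile": 0.0020,
        "isobutyronitrile": 0.0051, "adiponitrile": 0.0048},
    "Sphingopyxis alaskensis DSM 13593": {
        "benzonitrile": 0.00073, "isobutyronitrile": 0.0024, "adiponitrile": 0.00075},
    "Saccharomonospora viridis ATCC 15386": {"isobutyronitrile": 0.0030},
    "Flavobacterium indicum DSM 17447": {"KCN": 0.25},
    "Escherichia coli BL21 (DE3)": {},  # negative control: nothing detected
}

# Substrate classes assigned here on chemical grounds (an assumption, not from the paper).
SUBSTRATE_CLASS: Dict[str, str] = {
    "valeronitrile": "aliphatic", "isobutyronitrile": "aliphatic",
    "adiponitrile": "aliphatic", "propionitrile": "aliphatic",
    "acrylonitrile": "aliphatic", "benzonitrile": "aromatic",
    "2-cyanopyridine": "aromatic", "mandelonitrile": "aryl aliphatic",
    "KCN": "cyanide",
}


def preferred_substrate(organism: str) -> Optional[Tuple[str, str]]:
    """Return (substrate, class) with the highest detected activity, or None."""
    detected = ACTIVITY[organism]
    if not detected:
        return None
    best = max(detected, key=detected.get)
    return best, SUBSTRATE_CLASS[best]


if __name__ == "__main__":
    for name in ACTIVITY:
        print(name, "->", preferred_substrate(name))
```

Ranking by the single highest activity is only one way to summarize the table; it ignores how many substrates of each class an organism accepts.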

4. Discussion

Annotation of sequenced genomes to identify new genes has become an integral part of research in bioinformatics [21-24]. The present investigation has revealed some novel sources of nitrilases. The homology and conserved-motif approaches were used to screen microbial genomes, and proteins were predicted as nitrilase, cyanide dihydratase, or carbon-nitrogen hydrolase in 138 bacterial genomes. The manually designed motifs (MDMs) also differentiated the in silico predicted nitrilases as aliphatic or aromatic [12], as the designed motifs are class specific. All four motifs identified were uniformly conserved throughout the two sets of aliphatic and aromatic nitrilases, as shown in Table 4. The sequences belonged to the nitrilase superfamily, with the catalytic triad Glu (E), Lys (K), and Cys (C) conserved throughout. Phylogenetic analysis using MEGA version 6.0 for the aliphatic and aromatic sets of protein sequences revealed two major clusters. The Neighbor Joining (NJ) tree showed that the in silico predicted proteins (this study) and the nitrilases previously identified as aliphatic or aromatic [16] grouped in their respective clusters (Figure 1). Aliphaticity and aromaticity of the in silico predicted and characterized nitrilases were differentiated based on their physicochemical properties. The physicochemical properties of the predicted set of nitrilases were deduced using the ProtParam tool of the Expert Protein Analysis System (ExPASy), the proteomic server of the Swiss Institute of Bioinformatics (SIB), in order to predict aromaticity or aliphaticity.
Several of the parameters listed in Table 5 (number of amino acids, molecular weight, number of negatively charged residues, extinction coefficients, and grand average of hydropathicity) are close to the consensus values reported for aromatic and aliphatic nitrilases, which supports the prediction that these nitrilases have aromatic or aliphatic substrate specificity. The in silico predictions were verified by in vitro validation of the predicted proteins. Common nitriles (aliphatic, aromatic, and aryl nitriles) and potassium cyanide (KCN) were tested to check the nitrile/cyanide transforming ability of the predicted proteins. Of the nine predicted proteins, eight were active on different nitriles, whereas Flavobacterium indicum was found to hydrolyze toxic cyanide (KCN) into a nontoxic form (Table 6). The present approach contributed to finding novel sources of desired nitrilases from microbial genome databases.

5. Conclusion

Genome mining for novel sources of nitrilases predicted 138 sources of nitrilases. In vitro validation of nine selected predicted sources for nitrile/cyanide hydrolyzing activity has furthered the scope of genome mining approaches for the discovery of novel sources of enzymes.

Supplementary Material: List of organisms with completely sequenced genomes available at NCBI and IMG/ER.
References

1.  A rapid and precise method for the determination of urea.

Authors:  J K FAWCETT; J E SCOTT
Journal:  J Clin Pathol       Date:  1960-03       Impact factor: 3.411

Review 2.  Using comparative genome analysis to identify problems in annotated microbial genomes.

Authors:  Maria S Poptsova; J Peter Gogarten
Journal:  Microbiology       Date:  2010-04-29       Impact factor: 2.777

Review 3.  Microbial genomics for the improvement of natural product discovery.

Authors:  Steven G Van Lanen; Ben Shen
Journal:  Curr Opin Microbiol       Date:  2006-05-02       Impact factor: 7.934

4.  Investigative mining of sequence data for novel enzymes: a case study with nitrilases.

Authors:  Jennifer L Seffernick; Sudip K Samanta; Tai Man Louie; Lawrence P Wackett; Mani Subramanian
Journal:  J Biotechnol       Date:  2009-06-17       Impact factor: 3.307

5.  Optimization of arylacetonitrilase production from Alcaligenes sp. MTCC 10675 and its application in mandelic acid synthesis.

Authors:  S K Bhatia; P K Mehta; R K Bhatia; T C Bhalla
Journal:  Appl Microbiol Biotechnol       Date:  2013-10-09       Impact factor: 4.813

6.  Nocardia globerula NHB-2 nitrilase catalysed biotransformation of 4-cyanopyridine to isonicotinic acid.

Authors:  Nitya Nand Sharma; Monica Sharma; Tek Chand Bhalla
Journal:  AMB Express       Date:  2012-04-26       Impact factor: 3.298

Review 7.  Nanopore-based fourth-generation DNA sequencing technology.

Authors:  Yanxiao Feng; Yuechuan Zhang; Cuifeng Ying; Deqiang Wang; Chunlei Du
Journal:  Genomics Proteomics Bioinformatics       Date:  2015-03-02       Impact factor: 7.691

Review 8.  Strategies for discovery and improvement of enzyme function: state of the art and opportunities.

Authors:  Praveen Kaul; Yasuhisa Asano
Journal:  Microb Biotechnol       Date:  2011-08-24       Impact factor: 5.813

Review 9.  Insights from 20 years of bacterial genome sequencing.

Authors:  Miriam Land; Loren Hauser; Se-Ran Jun; Intawat Nookaew; Michael R Leuze; Tae-Hyuk Ahn; Tatiana Karpinets; Ole Lund; Guruprased Kora; Trudy Wassenaar; Suresh Poudel; David W Ussery
Journal:  Funct Integr Genomics       Date:  2015-02-27       Impact factor: 3.410

Review 10.  Nitrilases in nitrile biocatalysis: recent progress and forthcoming research.

Authors:  Jin-Song Gong; Zhen-Ming Lu; Heng Li; Jin-Song Shi; Zhe-Min Zhou; Zheng-Hong Xu
Journal:  Microb Cell Fact       Date:  2012-10-30       Impact factor: 5.328
