A web server for inferring the human N-acetyltransferase-2 (NAT2) enzymatic phenotype from NAT2 genotype.

Igor B Kuznetsov, Michael McDuffie, Roxana Moslehi.

Abstract

N-acetyltransferase-2 (NAT2) is an important enzyme that catalyzes the acetylation of aromatic and heterocyclic amine carcinogens. Individuals in human populations are divided into three NAT2 acetylator phenotypes: slow, rapid and intermediate. NAT2PRED is a web server that implements a supervised pattern recognition method to infer NAT2 phenotype from SNPs found in NAT2 gene positions 282, 341, 481, 590, 803 and 857. The web server can be used for a fast determination of NAT2 phenotypes in genetic screens. AVAILABILITY: Freely available at http://nat2pred.rit.albany.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Year:  2009        PMID: 19261719      PMCID: PMC2672629          DOI: 10.1093/bioinformatics/btp121

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 INTRODUCTION

N-acetyltransferase-2 (NAT2) is an important enzyme that catalyzes the acetylation of aromatic and heterocyclic amine carcinogens (Blum et al., 1990). Based on the level of NAT2 acetylator activity, individuals in human populations are divided into three enzymatic phenotypes: rapid (normal activity), intermediate and slow (reduced activity) (Hein et al., 2000). Single nucleotide polymorphisms (SNPs) within NAT2 determine the NAT2 acetylator phenotype. A consensus has been reached on the association between NAT2 genotype and acetylator phenotype (Hein, 2006). Recently, we showed that individuals with NAT2 SNP variants associated with the slow phenotype were more susceptible to the effects of tobacco smoking with respect to the risk of developing an advanced colorectal adenoma (Moslehi et al., 2006). Several other studies have also linked NAT2 gene variants and acetylator phenotypes to the risk of several malignant and pre-malignant conditions (Brockton et al., 2000; Hein, 2006; Potter et al., 1999; Tiemersma et al., 2004). The identification of at-risk individuals is an important component of cancer prevention. Current genotyping technologies are able to determine which alleles are present at each locus, but do not provide information about the phase of the alleles at different loci (i.e. they do not show which alleles at adjacent loci occur on the same chromosome). In order to assign an acetylator phenotype to a particular individual, the NAT2 haplotypes for this individual need to be determined by inferring the phase of the alleles. After phasing, the acetylator phenotype is assigned manually based on haplotypes (Supplementary Fig. 1). Phasing of alleles (i.e. haplotype determination) is laborious. Experimental methods exist, but are time-consuming and expensive. In most studies, computational statistical methods are used, such as the algorithm implemented in PHASE (Stephens et al., 2001).
However, methods for statistical determination of phase are computationally intensive and require specific data formatting steps. The goal of the present work was to develop a web server that implements a supervised pattern recognition approach to infer NAT2 acetylator phenotype (slow, intermediate or rapid) directly from the observed combinations of NAT2 SNPs, without taking the extra step of determining the haplotypes for each individual.

2 METHODS

The dataset used in this work was obtained from the Prostate, Lung, Colorectal and Ovarian (PLCO) cancer screening trial of the National Cancer Institute (see Moslehi et al., 2006 for details). Genotyping for six NAT2 SNPs (C282T, T341C, C481T, G590A, A803G and G857A) was performed using the TaqMan® (Applied Biosystems Inc., Carlsbad, CA, USA) kit. The acetylator phenotypes were assigned in our previous study based on the haplotypes determined from SNP genotyping data for each subject (Moslehi et al., 2006). The dataset consists of 1377 subjects (see Supplementary Table 1 for details and ethnic makeup). Prediction of the acetylator phenotype from combinations of SNPs, as defined here, is a three-class classification problem that can be addressed using a supervised pattern recognition method. We used a Support Vector Machine (SVM) as the method of choice (Vapnik, 1998). We constructed a three-class SVM predictor using the one-against-one approach, which was shown to perform better than other approaches in multi-class SVMs (Hsu and Lin, 2002). We used the SVM implemented in the LIBSVM package (Chang and Lin, 2003) with the linear kernel. Each NAT2 SNP was encoded using a set of three mutually orthogonal binary vectors: homozygote for the most frequent allele (1,0,0), heterozygote (0,1,0) and homozygote for the least frequent allele (0,0,1). For a given subject, the corresponding vectors describing each of the six observed SNPs were concatenated, resulting in a final binary feature vector of dimension 18. Thus, the SNP combination of each subject was described by 18 binary variables. We used 7-fold cross-validation to test the SVM predictor of the acetylator phenotype. In this approach, the dataset is randomly partitioned into seven groups, each containing approximately 1/7 of the dataset. At each cross-validation run, one group is removed and the predictor is trained on the remaining observations and tested on the removed group.
The process is repeated seven times, so that each group is used for testing once. In order to assess different aspects of classification quality, we used the following performance measures: overall accuracy (ACC), sensitivity for class i (SN_i) and specificity for class i (SP_i) (Baldi et al., 2000):

$$\mathrm{ACC} = \frac{\sum_{i} z_{i,i}}{N} \tag{1}$$

$$\mathrm{SN}_i = \frac{z_{i,i}}{\sum_{j} z_{i,j}}, \qquad \mathrm{SP}_i = \frac{z_{i,i}}{\sum_{j} z_{j,i}} \tag{2}$$

where Z is a 3 × 3 confusion (contingency) matrix, in which an element $z_{i,j}$ represents the number of times objects from class i are predicted to be in class j; N is the total number of objects (N=1377 in this work).
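The encoding and the performance measures described above can be sketched in a few lines of Python. This is a minimal illustration, not the NAT2PRED source code: the genotype codes (0, 1, 2), the locus order, the function names and the toy confusion matrix are all assumptions made for the example. SP_i is computed here in the form used by Baldi et al. (2000), the diagonal count divided by the column total, which is also an assumption about the intended formula.

```python
from typing import Dict, List, Sequence, Tuple

# The six NAT2 SNPs named in the text, in an assumed fixed feature order.
SNP_LOCI = ["C282T", "T341C", "C481T", "G590A", "A803G", "G857A"]

# Assumed genotype codes (not from the paper): number of copies of the
# least frequent allele. 0 = homozygous major, 1 = heterozygous,
# 2 = homozygous minor.
ONE_HOT = {0: (1, 0, 0), 1: (0, 1, 0), 2: (0, 0, 1)}


def encode_genotype(genotype: Dict[str, int]) -> List[int]:
    """Concatenate one 3-bit indicator per locus into an 18-dim vector."""
    features: List[int] = []
    for locus in SNP_LOCI:
        features.extend(ONE_HOT[genotype[locus]])
    return features


def performance(
    z: Sequence[Sequence[int]],
) -> Tuple[float, List[float], List[float]]:
    """Return (ACC, [SN_i], [SP_i]) for a square confusion matrix z,
    where z[i][j] counts class-i objects predicted to be in class j."""
    k = len(z)
    n = sum(sum(row) for row in z)
    acc = sum(z[i][i] for i in range(k)) / n
    # Sensitivity: diagonal count over the row (true class) total.
    sn = [z[i][i] / sum(z[i]) for i in range(k)]
    # Specificity in the assumed Baldi et al. form: diagonal count over
    # the column (predicted class) total.
    sp = [z[i][i] / sum(z[j][i] for j in range(k)) for i in range(k)]
    return acc, sn, sp


# Usage with made-up inputs (illustrative only, not data from the study).
example = {"C282T": 0, "T341C": 1, "C481T": 1,
           "G590A": 0, "A803G": 1, "G857A": 0}
vector = encode_genotype(example)
toy_matrix = [[8, 2, 0], [1, 18, 1], [0, 0, 30]]
acc, sn, sp = performance(toy_matrix)
```

Because every locus contributes exactly one non-zero entry, each encoded subject has six ones among its 18 binary variables. The sketch does not guard against empty rows or columns in the confusion matrix, which would cause a division by zero.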

3 RESULTS

The results of the cross-validation are shown in Table 1. If all six SNPs are used, the predictor of the NAT2 acetylator phenotype achieves a nearly perfect accuracy of 99.9% (Equation 1) and nearly perfect class-specific sensitivities and specificities (Equation 2) between 99.6 and 100%. Such a well-balanced performance is observed despite the highly unbalanced nature of the dataset, meaning that the number of subjects with the slow phenotype is almost an order of magnitude larger than that of subjects with the rapid phenotype. Importantly, individuals with the slow phenotype, who are at increased risk of developing tumors, are identified with 100% SN, meaning that no at-risk individuals are missed. If data on two synonymous SNPs (C282T and C481T) are removed and only non-synonymous SNPs are used, the accuracy of the prediction drops from 99.9 to 93.2%, with similar declines in SN and SP (Table 1). We therefore conclude that all six SNPs used in the present study are required to reliably assign the acetylator phenotype.
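To make the class imbalance concrete, the ratios below use only the class sizes listed in Table 1 (84 rapid, 503 intermediate and 790 slow subjects, N = 1377); the subscripted symbols are notation introduced here for illustration.

```latex
\frac{n_{\text{slow}}}{n_{\text{rapid}}} = \frac{790}{84} \approx 9.4,
\qquad
\frac{n_{\text{slow}}}{N} = \frac{790}{1377} \approx 57.4\%
```

A trivial predictor that always outputs the slow phenotype would therefore already reach about 57% overall accuracy while having zero sensitivity for the rapid and intermediate classes, which is why class-specific measures are informative alongside ACC.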
Table 1. The performance of the NAT2PRED server

NAT2 phenotype          Sensitivity (SN)    Specificity (SP)
Rapid (n=84)            99.6% (93.4%)       100% (90.1%)
Intermediate (n=503)    100% (94.0%)        99.7% (95.4%)
Slow (n=790)            100% (92.5%)        100% (93.1%)

The total number of cases for a given phenotype is shown in the corresponding row label. The SVM penalty parameter C was set to 3 (an optimal value determined using a grid search). Numbers in parentheses show the results of prediction based on non-synonymous SNPs only.

The web server implementation of the SVM predictor of the NAT2 acetylator phenotype was trained using the data on all 1377 subjects. It has a simple, intuitive user interface. The user is asked to select a genotype for each of the six SNP loci using radio buttons (Supplementary Fig. 2). There are three possible genotypes for each SNP locus, which correspond to three radio buttons per locus. After the genotype is selected, the user can click the ‘Submit’ button and immediately obtain an inferred NAT2 acetylator phenotype. The output page displays the selected genotype and the probabilities of each of the three acetylator phenotypes (slow, intermediate and rapid) for this genotype (Supplementary Fig. 3). The final prediction is the phenotype with the highest probability. There is also an option for a batch submission of genotypes for multiple individuals. Detailed instructions and information about the methodology and output format can be found by clicking the corresponding help hyperlink located on the input page. To the best of the authors' knowledge, NAT2PRED is the only existing web server for inferring NAT2 acetylator phenotypes from genotyping data. NAT2PRED was developed on a dataset in which the majority of subjects (94%) are Caucasian. However, the prediction model utilizes generally observed linkage disequilibrium between the six NAT2 SNPs and can be applied to individuals of any ethnicity. The web server is publicly available at http://nat2pred.rit.albany.edu.
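The final decision rule described above (report the phenotype with the highest probability, for one individual or for a batch) can be sketched as follows. This is a hypothetical illustration and not the server's code or output format: the function names, the dictionary layout and the probability values are assumptions.

```python
from typing import Dict, List

PHENOTYPES = ("slow", "intermediate", "rapid")


def call_phenotype(probabilities: Dict[str, float]) -> str:
    """Return the phenotype with the highest probability.

    Ties are broken by the order of PHENOTYPES (an arbitrary choice
    made for this sketch).
    """
    missing = [p for p in PHENOTYPES if p not in probabilities]
    if missing:
        raise ValueError(f"missing probabilities for: {missing}")
    return max(PHENOTYPES, key=lambda p: probabilities[p])


def call_batch(batch: List[Dict[str, float]]) -> List[str]:
    """Apply the same rule to several individuals at once."""
    return [call_phenotype(p) for p in batch]


# Made-up probabilities for two hypothetical individuals.
calls = call_batch([
    {"slow": 0.91, "intermediate": 0.07, "rapid": 0.02},
    {"slow": 0.10, "intermediate": 0.15, "rapid": 0.75},
])
```

Keeping the three probabilities next to the final call, as the output page does, lets a user see how clear-cut each assignment is.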
REFERENCES

1. M Stephens, N J Smith, P Donnelly. A new statistical method for haplotype reconstruction from population data. Am J Hum Genet, 2001-03-09.

2. D W Hein, M A Doll, A J Fretland, M A Leff, S J Webb, G H Xiao, U S Devanaboyina, N A Nangju, Y Feng. Molecular genetics and epidemiology of the NAT1 and NAT2 acetylation polymorphisms. Cancer Epidemiol Biomarkers Prev, 2000-01.

3. P Baldi, S Brunak, Y Chauvin, C A Andersen, H Nielsen. Assessing the accuracy of prediction algorithms for classification: an overview. Bioinformatics, 2000-05.

4. Chih-Wei Hsu, Chih-Jen Lin. A comparison of methods for multiclass support vector machines. IEEE Trans Neural Netw, 2002.

5. N Brockton, J Little, L Sharp, S C Cotton. N-acetyltransferase polymorphisms and colorectal cancer: a HuGE review. Am J Epidemiol, 2000-05-01.

6. Roxana Moslehi, Nilanjan Chatterjee, Timothy R Church, Jinbo Chen, Meredith Yeager, Joel Weissfeld, David W Hein, Richard B Hayes. Cigarette smoking, N-acetyltransferase genes and the risk of advanced colorectal adenoma. Pharmacogenomics, 2006-09.

7. M Blum, D M Grant, W McBride, M Heim, U A Meyer. Human arylamine N-acetyltransferase genes: isolation, chromosomal localization, and functional expression. DNA Cell Biol, 1990-04.

8. J D Potter, J Bigler, L Fosdick, R M Bostick, E Kampman, C Chen, T A Louis, P Grambsch. Colorectal adenomatous and hyperplastic polyps: smoking and N-acetyltransferase 2 polymorphisms. Cancer Epidemiol Biomarkers Prev, 1999-01.

9. D W Hein. N-acetyltransferase 2 genetic polymorphism: effects of carcinogen and haplotype on urinary bladder cancer risk. Oncogene, 2006-03-13.

10. Edine W Tiemersma, Annelies Bunschoten, Frans J Kok, Hansruedi Glatt, Sybrand Y de Boer, Ellen Kampman. Effect of SULT1A1 and NAT2 genetic polymorphism on the association between cigarette smoking and colorectal adenomas. Int J Cancer, 2004-01-01.
