MOTIVATION: The human uridine diphosphate-glucuronosyltransferase enzyme family catalyzes glucuronidation, the transfer of the glycosyl group of a nucleotide sugar to an acceptor compound (substrate). It is the most common conjugation pathway and serves to protect the organism from the potential toxicity of xenobiotics. Glucuronidation can also affect the pharmacological profile of a drug. It is therefore important to identify the metabolically labile sites for glucuronidation. RESULTS: In the present study, we developed four in silico models to predict sites of glucuronidation, one for each of four major classes of functional groups that act as sites of metabolism: aliphatic hydroxyl, aromatic hydroxyl, carboxylic acid and amino nitrogen. Guided by the mechanism of glucuronidation, a series of 'local' and 'global' molecular descriptors characterizing atomic reactivity, bonding strength and physicochemical properties were calculated and then selected with a genetic algorithm-based feature selection approach. The resulting support vector machine classification models show good prediction performance, with balanced accuracy ranging from 0.88 to 0.96 on the test sets. As further validation, our models correctly identified 84% of the experimentally observed sites of metabolism in an external test set containing 54 molecules. AVAILABILITY AND IMPLEMENTATION: The software somugt based on our models is available at www.dddc.ac.cn/adme/jlpeng/somugt_win32.zip.
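ILLUSTRATION: The abstract reports balanced accuracy without giving its formula. The sketch below assumes the conventional binary definition, the mean of sensitivity and specificity, which is commonly preferred when non-metabolized sites greatly outnumber metabolized ones. The function name and the toy labels are hypothetical and are not taken from the somugt software or the study's data.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = site of metabolism).

    Assumed conventional definition; not taken from the somugt source code.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)  # fraction of true sites recovered
    specificity = tn / (tn + fp)  # fraction of non-sites correctly rejected
    return (sensitivity + specificity) / 2


# Hypothetical candidate atoms: 2 true sites among 10 candidates.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_accuracy)                     # 0.8
print(balanced_accuracy(y_true, y_pred))  # 0.6875
```

In this toy case plain accuracy (0.8) is inflated by the majority class, whereas balanced accuracy (0.6875) reflects that only one of the two true sites was recovered.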