
Improved candidate drug mining for Alzheimer's disease.

Yu-Huei Cheng, Li-Yeh Chuang, Hsueh-Wei Chang, Cheng-Hong Yang.

Abstract

Alzheimer's disease (AD) is the main cause of dementia in older people. Although several antidementia drugs such as donepezil, rivastigmine, galantamine, and memantine have been developed, the effectiveness of AD drug therapy is still far from satisfactory. Recently, single nucleotide polymorphisms (SNPs) have been adopted as markers for personalized medicine. Many pharmacogenomics databases have been developed to provide comprehensive information by associating SNPs with drug responses, disease incidence, and genes that are critical in choosing personalized therapy. However, we found that the information returned by different sections of pharmacogenomics databases is insufficient, which may limit their usefulness for pharmacogenomics. To address this problem, we used an approximate string matching method and a data mining approach to improve searching of a pharmacogenomics database. With this approach, we identified more genes linked to AD and AD-related drugs than the existing online search returns. These improvements may help advance the pharmacogenomics of AD for personalized medicine.

Year:  2014        PMID: 24719894      PMCID: PMC3955684          DOI: 10.1155/2014/897653

Source DB:  PubMed          Journal:  Biomed Res Int            Impact factor:   3.411


1. Introduction

Alzheimer's disease (AD), the most common form of dementia, was first reported in 1906 [1]. In 2006, there were about 26.6 million AD patients worldwide, and the disease was also common in southern Taiwan [2]. Although AD was identified long ago, most research progress has been made in the past 30 years [3]. However, no definitive cure is available, and the disease eventually leads to death. Drug discovery for Alzheimer's disease therefore remains challenging. Single nucleotide polymorphisms (SNPs) are the most common type of variation in the human genome [4]. The importance of SNPs has been reviewed in genome-wide association studies for their association with disease susceptibility and drug metabolism [5, 6]. About 60–90% of the individual variation in drug response depends on pharmacogenomic factors. Therefore, SNP genotyping of candidate genes, pharmacological research, and drug discovery may play an increasingly important role in AD treatment. Meanwhile, the increasing amount of related information requires bioinformatics support to construct suitable databases and web servers. PharmGKB (the Pharmacogenetics and Pharmacogenomics Knowledge Base) was constructed to provide a comprehensive database for pharmacogenomic studies [7]. PharmGKB supports the pharmacogenetics research network in terms of SNP discovery and drug responses [8], with fully curated knowledge of drug pathways, drug-related genes, and relationships among genes, drugs, and diseases. However, the information provided by the different functions of PharmGKB is not sufficiently cross-linked to allow convenient cross-referencing between them. To solve this problem, we propose a data mining method, based on the downloadable dataset of the PharmGKB resource, to improve searching for the pharmacogenomics of AD.

2. Materials and Methods

The workflow for PharmGKB-based pharmacogenomics of AD for personalized drug studies is shown in Figure 1. First, the AD-related drugs and genes are retrieved from the PharmGKB download data using the approximate string matching method and the data mining approach. The genes associated with AD and the genes associated with each individual Alzheimer's drug are identified and compared with the results of the PharmGKB online search. Then, the SNPs of the genes associated with AD are identified. Using SNP genotyping tools or assays, associations with AD-related drugs may then be evaluated. Finally, the resulting information may be helpful for personalized drug research.
Figure 1

The flowchart for PharmGKB-based pharmacogenomics of AD in this study.

2.1. AD-Related Drugs Using Approximate String Matching Based on PharmGKB Download Data

To study the pharmacogenomics of AD, we downloaded the PharmGKB data (http://www.pharmgkb.org/downloads/) [9, 10] and applied the approximate string matching method [11] to it to find all AD-related drug classes. The meaningful keywords associated with “Alzheimer's disease” are shown in Table 1. The drug classes found in this way are then used to find associated genes by the data mining approach. The approximate string matching problem is stated as follows. Given a pattern string $P = p_1 p_2 p_3 \cdots p_m$, that is, one of the meaningful keywords associated with “Alzheimer's disease,” and a text string $T = t_1 t_2 t_3 \cdots t_n$, that is, the description of a drug and disease retrieved from PharmGKB, find a substring $T_{i,j} = t_i t_{i+1} t_{i+2} \cdots t_j$ of $T$ that has the smallest edit distance [12] to the pattern $P$. The pseudocode for the edit distance is shown in Algorithm 1.
Table 1

The meaningful keywords associated with “Alzheimer's disease” are retrieved from PharmGKB and they are applied to discover the drug classes*.

| ID | Keywords |
|---|---|
| 1 | AD |
| 2 | Alzheimer's disease |
| 3 | AD—Alzheimer's disease |
| 4 | Acute Confusional Senile Dementia |
| 5 | Alzheimer Dementia, Presenile |
| 6 | Alzheimer Disease, Early Onset |
| 7 | Alzheimer Disease, Late Onset |
| 8 | Alzheimer Type Dementia |
| 9 | Alzheimer Type Senile Dementia |
| 10 | Alzheimer's Disease, Focal Onset |
| 11 | Alzheimer's disease, NOS |
| 12 | Dementia, Alzheimer Type |
| 13 | Dementia, Presenile |
| 14 | Dementia, Presenile Alzheimer |
| 15 | Dementia, Primary Senile Degenerative |
| 16 | Dementia, Senile |
| 17 | Dementias, Presenile |
| 18 | Dementias, Senile |
| 19 | Disease, Alzheimer |
| 20 | Disease, Alzheimer's |
| 21 | Early Onset Alzheimer Disease |
| 22 | Focal Onset Alzheimer's Disease |
| 23 | Late Onset Alzheimer Disease |
| 24 | Presenile Alzheimer Dementia |
| 25 | Presenile Dementia |
| 26 | Presenile Dementias |
| 27 | Primary Senile Degenerative Dementia |
| 28 | Senile Dementia |
| 29 | Senile Dementia, Acute Confusional |
| 30 | Senile Dementia, Alzheimer Type |
| 31 | Senile Dementias |
| 32 | MeSH: D000544 (Alzheimer Disease) |
| 33 | MedDRA: 10001896 (Alzheimer's disease) |
| 34 | NDFRT: N0000000363 (Alzheimer Disease [Disease/Finding]) |
| 35 | SnoMedCT: 26929004 (Alzheimer's disease) |
| 36 | UMLS: C0002395 (C0002395) |

*Drug class is one of the functions listed in the PharmGKB download data.

Algorithm 1

Pseudocode for the edit distance used for approximate string matching.
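
The pseudocode of Algorithm 1 is not reproduced in this text. As a rough illustration of the idea, the following Python sketch finds the substring of a text with the smallest edit distance to a pattern, using the standard dynamic-programming variant in which a match may start at any text position. It assumes unit costs for insertion, deletion, and substitution; the function name `best_substring_match` and the sample strings are hypothetical and are not taken from the study.

```python
def best_substring_match(pattern: str, text: str) -> tuple[int, int]:
    """Return (distance, end) for the substring of text closest to pattern.

    distance is the smallest edit distance (unit costs, an assumption)
    between pattern and any substring of text; end is the exclusive end
    index in text of the first substring that reaches that distance.
    """
    m = len(pattern)
    # prev[i]: distance between pattern[:i] and the best substring of text
    # ending at the previous text position. With no text read yet,
    # pattern[:i] can only be matched by deleting its i characters.
    prev = list(range(m + 1))
    best_distance, best_end = prev[m], 0
    for j, t_char in enumerate(text, start=1):
        # Row 0 stays 0 because a match may start at any text position.
        curr = [0] * (m + 1)
        for i in range(1, m + 1):
            substitution = prev[i - 1] + (pattern[i - 1] != t_char)
            skip_text_char = prev[i] + 1
            skip_pattern_char = curr[i - 1] + 1
            curr[i] = min(substitution, skip_text_char, skip_pattern_char)
        if curr[m] < best_distance:
            best_distance, best_end = curr[m], j
        prev = curr
    return best_distance, best_end


# Hypothetical usage: an exact occurrence gives distance 0.
print(best_substring_match("dementia", "senile dementia, presenile"))  # (0, 15)
```

In practice both strings would probably be normalized first (for example, lowercased), and a keyword would be accepted as a hit only when the returned distance falls below a chosen tolerance; the source does not state which tolerance was used.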

2.2. Data Mining Method for PharmGKB Download Data

In this study, we used the Apriori algorithm [13] for frequent item set mining and association rule learning over PharmGKB. The pseudocode for the Apriori algorithm as applied to PharmGKB is shown in Algorithm 2. First, the Apriori algorithm finds the frequent genes in the drug classes for “Alzheimer's disease.” A set of genes can be mined from each drug class. Apriori is a “bottom-up” approach, in which frequent gene subsets are extended one item at a time (i.e., candidate generation) and groups of candidates are tested against the data. The algorithm terminates when no further successful extensions are found.
Algorithm 2

Pseudocode for the Apriori algorithm for data mining in PharmGKB, where ε is the support threshold, L is the set of frequent gene subsets that satisfy the support threshold, k is the current iteration number, C is the candidate set, and count[gene] accesses a field of the data structure that represents the gene candidate set.
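
Since the pseudocode of Algorithm 2 is not reproduced in this text, the sketch below shows one minimal way the bottom-up procedure could look in Python. It treats each drug class as a transaction whose items are gene symbols, which is an assumption about how the data were arranged; the function name `apriori`, the parameter `min_support` (standing in for ε), and the toy gene sets are all hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset of genes: support count} for all frequent gene sets."""
    transactions = [frozenset(t) for t in transactions]

    # Pass 1: count single genes and keep those meeting the threshold.
    counts = {}
    for t in transactions:
        for gene in t:
            key = frozenset([gene])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        # Candidate generation: join frequent (k-1)-sets into k-sets and
        # prune any candidate that has an infrequent (k-1)-subset.
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in frequent
                for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        # Test the candidates against the data.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1  # stops once no candidate survives the threshold
    return result


# Hypothetical drug-class gene sets, for illustration only.
toy_classes = [
    {"ACHE", "BCHE", "APOE"},
    {"ACHE", "BCHE"},
    {"ACHE", "APOE"},
    {"CHAT"},
]
frequent_sets = apriori(toy_classes, min_support=2)
```

The loop ends when an iteration yields no frequent set, which mirrors the termination condition described above. The support threshold actually used in the study is not given in the source.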

2.3. SNP Searching for Genes Using the NCBI dbSNP

Every gene contains numerous SNPs. To find the SNPs of each gene relevant to Alzheimer's pharmacogenomics, we searched NCBI dbSNP (http://www.ncbi.nlm.nih.gov/snp).

3. Results and Discussion

3.1. AD Information Based on PharmGKB Search

In the PharmGKB online search, the SNP variants, related genes, and drugs for AD can be retrieved. For example, SNP information such as rs2066853 and rs6313 is provided (Figure 2). As shown in Figure 3, AD-related genes such as ADRB1, AHR, HTR2A, MTHFR, and PTGS2 are identified, and related drugs such as olanzapine and risperidone are returned. This information may assist researchers studying the pharmacogenomics of AD. Unfortunately, the PharmGKB online search provides only limited information, which is insufficient for the complexity of drug research for personalized Alzheimer's medicine.
Figure 2

PharmGKB-pharmacogenomics online query for the variant information (SNP rs#ID) of “Alzheimer's disease.” Retrieval source: http://www.pharmgkb.org/disease/PA443319?previousQuery=Alzheimer's%20disease.

Figure 3

Gene and drug related information of “Alzheimer's disease” online query from PharmGKB. Retrieval source: http://www.pharmgkb.org/disease/PA443319?previousQuery=Alzheimer's%20disease#tabview=table 3&subtab=33.

3.2. PharmGKB-Based Data Mining of AD Information of Drug Classes or Gene Symbols

In the current study, our proposed method is used to mine the PharmGKB download data for the keyword “Alzheimer's disease.” As shown in Table 2, 22 AD-related drug classes are identified from the “drug classes” of PharmGKB. Their corresponding PharmGKB accession IDs, PubMed PMIDs, and the numbers of genes associated with each AD-related drug class are also presented. In total, 495 genes are identified for the AD information of drug classes (see Supplementary file 1, available online at http://dx.doi.org/10.1155/2014/897653, which provides the PharmGKB accession ID, gene symbol, and publications for the genes in each drug class). Alternatively, 99 genes associated with AD are identified from the “gene symbols” of PharmGKB for the keyword “Alzheimer's disease.” These results suggest that the same keyword, for example, Alzheimer's disease, may identify different numbers of AD-associated genes depending on whether the “drug classes” or the “gene symbols” of PharmGKB are searched.
Table 2

PharmGKB-based data mining results for Alzheimer's disease: PharmGKB accession ID, drug class, publications, and number of associated genes.

| No. | PharmGKB accession ID | Drug classes | Publications∗1 (PMID) | Gene no.∗2 |
|---|---|---|---|---|
| 1 | PA164712423 | Anticholinesterases | 20644562, 20644562, 14674789 | 6 |
| 2 | PA164712308 | Ace inhibitors, plain | 17362841 | 24 |
| 3 | PA449515 | Etanercept | 19027875 | 12 |
| 4 | PA451262 | Rivastigmine | 20644562, 16323253, 17082448, 20644562, 15289797, 17522596 | 2 |
| 5 | PA450243 | Lithium | 17082448 | 13 |
| 6 | PA10384 | Anti-inflammatory and antirheumatic products, nonsteroids | 17082448, 17082448 | 11 |
| 7 | PA449760 | Glatiramer acetate | 17082448 | 4 |
| 8 | PA133950441 | Hmg coa reductase inhibitors | 17082448 | 39 |
| 9 | PA151958596 | Curcumin | 17082448 | 2 |
| 10 | PA451898 | Vitamin c | 17082448 | 16 |
| 11 | PA451900 | Vitamin e | 17082448 | 1 |
| 12 | PA452229 | Antidepressants | 17082448 | 43 |
| 13 | PA452233 | Antipsychotics | 17082448 | 46 |
| 14 | PA449726 | Galantamine | 20644562, 16323253, 17082448, 15853556, 20644562, 14674789, 12177686 | 7 |
| 15 | PA10364 | Memantine | 17082448 | 0 |
| 16 | PA451283 | Rosiglitazone | 16770341 | 34 |
| 17 | PA448031 | Acetylcholine | 15695160 | 8 |
| 18 | PA450626 | Nicotine | 15695160 | 88 |
| 19 | PA137179528 | Nimesulide | 16331303, 11810182 | 3 |
| 20 | PA449394 | Donepezil | 20859244, 20644562, 16323253, 16424819, 17082448, 20644562, 19738170, 12142731 | 9 |
| 21 | PA451576 | Tacrine | 9521254, 17082448, 10801254, 9777427, 18004213 | 6 |
| 22 | PA448976 | Choline | 8618881 | 122 |

∗1PMID: PubMed article ID number.

∗2The full gene names for each of the “drug classes” have been provided in the Supplementary file 1.

After detailed examination, 67 genes found in the gene symbols search (gene names in bold in Table 3) are absent from the genes found in the drug class search (Table 2). Furthermore, genes corresponding to the drug “memantine,” which is listed in Table 2 (drug classes), are not found in Table 3 (gene symbols). Therefore, some current drugs are linked to only a small number of AD-related genes in the drug class search; however, the remaining AD-related genes that may affect AD-related drugs may be partly discovered in the gene symbols search. These newly identified AD-related genes may be potential candidates for further AD drug development. These results suggest that our proposed data mining method may improve AD pharmacogenomics studies.
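
The comparison between the two searches amounts to a set difference. The following Python sketch illustrates it with toy gene lists; the function name `split_by_drug_class_coverage` and the inputs are hypothetical and do not reproduce the study's data.

```python
def split_by_drug_class_coverage(gene_symbol_hits, drug_class_hits):
    """Split gene-symbol hits by whether the drug-class search also found them.

    Returns two sorted lists: genes found by both searches, and genes found
    only by the gene-symbol search.
    """
    gene_symbol_hits = set(gene_symbol_hits)
    drug_class_hits = set(drug_class_hits)
    shared = gene_symbol_hits & drug_class_hits
    only_gene_symbol = gene_symbol_hits - drug_class_hits
    return sorted(shared), sorted(only_gene_symbol)


# Toy inputs for illustration only.
shared, novel = split_by_drug_class_coverage(
    ["APOE", "ACHE", "CLU", "PICALM"],
    ["APOE", "ACHE", "CHAT"],
)
print(shared)  # ['ACHE', 'APOE']
print(novel)   # ['CLU', 'PICALM']
```

Applied to the full results, the second list would correspond to the 67 of 99 gene-symbol hits that the drug class search did not return.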
Table 3

PharmGKB-based data mining results of gene symbols of Alzheimer's disease and NCBI dbSNP-based query results for SNP number for the genes of Alzheimer's disease.

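| No. | PharmGKB accession ID | Gene symbols* | SNP no. | No. | PharmGKB accession ID | Gene symbols* | SNP no. | No. | PharmGKB accession ID | Gene symbols* | SNP no. |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | PA20 | ACHE | 899 | 34 | PA37597 | **ZNF225** | 813 | 67 | PA125 | CYP2C8 | 993 |
| 2 | PA26490 | CHRNA4 | 1518 | 35 | PA38499 | **DEFB123** | 330 | 68 | PA126 | CYP2C9 | 1605 |
| 3 | PA128 | CYP2D6 | 482 | 36 | PA134902026 | **SORCS2** | 19073 | 69 | PA30864 | MME | 3323 |
| 4 | PA130 | CYP3A4 | 899 | 37 | PA134949387 | **SORCS3** | 13969 | 70 | PA142671271 | **NCSTN** | 741 |
| 5 | PA26620 | **CLU** | 644 | 38 | PA38274 | **TOMM40** | 462 | 71 | PA36153 | **SST** | 120 |
| 6 | PA26855 | **CR1** | 19859 | 39 | PA162397694 | **NLRC5** | 2297 | 72 | PA36457 | **TF** | 1501 |
| 7 | PA33287 | **PICALM** | 3169 | 40 | PA24641 | **AHR** | 991 | 73 | PA31930 | OPCML | 28437 |
| 8 | PA46 | **ALOX5** | 1992 | 41 | PA134950706 | **DNMBP** | 3312 | 74 | PA29561 | **HTR7** | 2623 |
| 9 | PA293 | PTGS2 | 579 | 42 | PA24910 | APP | 9411 | 75 | PA162393285 | **KIF20B** | 2109 |
| 10 | PA108 | CETP | 1246 | 43 | PA238 | MAPT | 4399 | 76 | PA26971 | **CSRP3** | 907 |
| 11 | PA32996 | **PCDH11X** | 15199 | 44 | PA128394579 | TMED10 | 1079 | 77 | PA231 | **LMNA** | 1486 |
| 12 | PA24507 | **ADAM12** | 10827 | 45 | PA162397475 | **NGF** | 1286 | 78 | PA27029 | **CTSD** | 460 |
| 13 | PA25165 | **ATP8A1** | 5983 | 46 | PA25232 | **BACE1** | 794 | 79 | PA29629 | **IDE** | 2755 |
| 14 | PA26243 | **CD86** | 1385 | 47 | PA36022 | **SORL1** | 4394 | 80 | PA31374 | **MYH7** | 1157 |
| 15 | PA26935 | **CSF1** | 569 | 48 | PA33796 | **PRNP** | 452 | 81 | PA272 | **PLN** | 343 |
| 16 | PA27342 | **DISC1** | 11813 | 49 | PA37302 | **VEGFA** | 561 | 82 | PA33855 | **PSEN1** | 2343 |
| 17 | PA28597 | **GBP2** | 625 | 50 | PA114 | CHRNA7 | 3714 | 83 | PA33856 | **PSEN2** | 959 |
| 18 | PA220 | **KCNMA1** | 19081 | 51 | PA37155 | **UBQLN1** | 1400 | 84 | PA304 | SCN5A | 3380 |
| 19 | PA25512 | **KCTD12** | 235 | 52 | PA26123 | **CBS** | 924 | 85 | PA36638 | TNNT2 | 739 |
| 20 | PA164724093 | NOS2 | 1820 | 53 | PA26976 | **CST3** | 233 | 86 | PA139 | ACE | 1108 |
| 21 | PA33614 | **PPP1R11** | 215 | 54 | PA25623 | C1QB | 356 | 87 | PA37935 | **SIRT1** | 1145 |
| 22 | PA143485670 | **WWC1** | 5070 | 55 | PA162380954 | **CALHM1** | 247 | 88 | PA55 | APOE | 184 |
| 23 | PA37596 | **ZNF224** | 490 | 56 | PA30748 | **MEOX2** | 2140 | 89 | PA24357 | **A2M** | 1385 |
| 24 | PA162380963 | **CALHM2** | 192 | 57 | PA26448 | CHAT | 2572 | 90 | PA192 | HTR1A | 186 |
| 25 | PA51 | APOC1 | 243 | 58 | PA38239 | **CLSTN2** | 15608 | 91 | PA182 | GSTM1 | 264 |
| 26 | PA34958 | **ATXN1** | 11910 | 59 | PA134952303 | **NMNAT3** | 39 | 92 | PA183 | GSTT1 | 200 |
| 27 | PA26210 | **CD33** | 465 | 60 | PA134904440 | **C1orf49** | 348 | 93 | PA268 | ABCB4 | 1915 |
| 28 | PA28478 | **GAB2** | 5119 | 61 | PA134864387 | **RALGPS2** | 3980 | 94 | PA115 | CHRNB2 | 698 |
| 29 | PA34052 | **PVRL2** | 1344 | 62 | PA134870196 | **RGSL1** | 3300 | 95 | PA156 | **ESR1** | 10108 |
| 30 | PA37754 | **ZNRD1** | 316 | 63 | PA25294 | BCHE | 1796 | 96 | PA134934259 | **GAPDHS** | 361 |
| 31 | PA38114 | **TRIM15** | 466 | 64 | PA120 | CRP | 977 | 97 | PA245 | MTHFR | 790 |
| 32 | PA134927803 | **MTHFD1L** | 7229 | 65 | PA127 | **CYP2C18** | 1353 | 98 | PA36458 | **TFAM** | 376 |
| 33 | PA144596420 | **INTS1** | 1820 | 66 | PA124 | CYP2C19 | 2692 | 99 | PA435 | TNF | 268 |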

*Gene names in bold fonts are not identified in Table 2.

3.3. SNP Information of AD-Related Genes

The SNP counts for the 99 AD-related genes are also provided in Table 3. The SNP count for each gene is obtained from online NCBI dbSNP queries. In general, many SNPs are found in these AD-related genes, and some have been reported to be associated with AD. For example, the APOE gene appears in Table 3, and one of its variants, the ApoE epsilon 4 allele, has been reported to be associated with AD [14]. With suitable SNP genotyping tools, these SNP candidates warrant investigation in pharmacogenomics research on AD. Many high-throughput SNP genotyping methods are currently available (as shown in Figure 1), including PCR resequencing [15], TaqMan probes [16], SNP microarrays [17], matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) [18], and others [19, 20]. Furthermore, several SNP genotyping tools and databases have been developed, such as SNP-RFLPing2 for comprehensive PCR-RFLP information based on SNPs [21-24], algorithmic PCR-RFLP primer design and restriction enzyme selection for SNP genotyping [25, 26], and primer design for PCR-confronting two-pair primers (PCR-CTPP) [27, 28]. These tools and methods can provide useful and convenient information for SNP genotyping in AD pharmacogenomics studies.

4. Conclusions

AD is the most common form of dementia in older people, and the pharmacogenomics of AD remains a challenge. In this study, we propose a PharmGKB-based data mining method to improve the discovery of genes for potential AD-related drug candidates. With the assistance of bioinformatics, this improvement can help researchers develop personalized therapeutic drugs for AD.

Supplementary file 1: gene information, including PharmGKB accession ID, gene symbol, and publications, is provided for the different drug classes.
  24 in total

1.  Variation is the spice of life.

Authors:  L Kruglyak; D A Nickerson
Journal:  Nat Genet       Date:  2001-03       Impact factor: 38.330

Review 2.  SNP genotyping with fluorescence polarization detection.

Authors:  Pui-Yan Kwok
Journal:  Hum Mutat       Date:  2002-04       Impact factor: 4.878

3.  Mutagenic primer design for mismatch PCR-RFLP SNP genotyping using a genetic algorithm.

Authors:  Cheng-Hong Yang; Yu-Huei Cheng; Cheng-Huei Yang; Li-Yeh Chuang
Journal:  IEEE/ACM Trans Comput Biol Bioinform       Date:  2012 May-Jun       Impact factor: 3.710

Review 4.  The pharmacogenetics research network: from SNP discovery to clinical drug response.

Authors:  K M Giacomini; C M Brett; R B Altman; N L Benowitz; M E Dolan; D A Flockhart; J A Johnson; D F Hayes; T Klein; R M Krauss; D L Kroetz; H L McLeod; A T Nguyen; M J Ratain; M V Relling; V Reus; D M Roden; C A Schaefer; A R Shuldiner; T Skaar; K Tantisira; R F Tyndale; L Wang; R M Weinshilboum; S T Weiss; I Zineh
Journal:  Clin Pharmacol Ther       Date:  2007-03       Impact factor: 6.875

Review 5.  Restriction enzyme mining for SNPs in genomes.

Authors:  Li-Yeh Chuang; Cheng-Hong Yang; Ke-Hung Tsui; Yu-Huei Cheng; Phei-Lang Chang; Cheng-Hao Wen; Hsueh-Wei Chang
Journal:  Anticancer Res       Date:  2008 Jul-Aug       Impact factor: 2.480

Review 6.  The importance of integrating SNP and cheminformatics resources to pharmacogenomics.

Authors:  Hsueh-Wei Chang; Li-Yeh Chuang; Ming-Tz Tsai; Cheng-Hong Yang
Journal:  Curr Drug Metab       Date:  2012-09-01       Impact factor: 3.731

7.  Associate PCR-RFLP assay design with SNPs based on genetic algorithm in appropriate parameters estimation.

Authors:  Li-Yeh Chuang; Yu-Huei Cheng; Cheng-Huei Yang; Cheng-Hong Yang
Journal:  IEEE Trans Nanobioscience       Date:  2013-06       Impact factor: 2.935

Review 8.  The Invader assay for SNP genotyping.

Authors:  Michael Olivier
Journal:  Mutat Res       Date:  2005-06-03       Impact factor: 2.433

Review 9.  SNP technologies for drug discovery: a current review.

Authors:  Joanne Voisey; Charles Phillip Morris
Journal:  Curr Drug Discov Technol       Date:  2008-09

10.  SNPdetector: a software tool for sensitive and accurate SNP detection.

Authors:  Jinghui Zhang; David A Wheeler; Imtiaz Yakub; Sharon Wei; Raman Sood; William Rowe; Paul P Liu; Richard A Gibbs; Kenneth H Buetow
Journal:  PLoS Comput Biol       Date:  2005-10-28       Impact factor: 4.475
