Analysis of AML genes in dysregulated molecular networks.

Eunjung Lee, Hyunchul Jung, Predrag Radivojac, Jong-Won Kim, Doheon Lee.

Abstract

BACKGROUND: Identifying disease causing genes and understanding their molecular mechanisms are essential to developing effective therapeutics. Thus, several computational methods have been proposed to prioritize candidate disease genes by integrating different data types, including sequence information, biomedical literature, and pathway information. Recently, molecular interaction networks have been incorporated to predict disease genes, but most of those methods do not utilize invaluable disease-specific information available in mRNA expression profiles of patient samples.
RESULTS: By integrating protein-protein interaction networks with gene expression profiles of acute myeloid leukemia (AML) patients, we identified subnetworks of interacting proteins dysregulated in AML and characterized the known mutation genes causally implicated in AML that are embedded in those subnetworks. The analysis shows that the set of extracted subnetworks is a reservoir rich in AML genes, reflecting key leukemogenic processes such as myeloid differentiation.
CONCLUSION: We showed that an integrative approach utilizing both gene expression profiles and molecular networks could identify AML-causing genes, most of which were not detectable with gene expression analysis alone because of the minor changes in their mRNA levels.

Year:  2009        PMID: 19761572      PMCID: PMC2745689          DOI: 10.1186/1471-2105-10-S9-S2

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.169


Background

Mining disease-causing genes and elucidating their pathogenic molecular mechanisms are of great importance for developing effective diagnostics and therapeutics [1-5]. Along with many genetic and genomic studies aimed at identifying disease genes (e.g. linkage analysis, cytogenetic studies, microarray experiments, proteomic studies), several computational methods have been proposed to prioritize candidate genes based on various kinds of information, including sequence similarity, literature annotation, and molecular pathways [6-11]. Given a set of genes known to be involved in a disease, these methods typically score similarities between candidate genes and known disease genes in terms of various genomic features. Recently, accumulated knowledge about molecular interaction networks in human cells, such as protein-protein and protein-DNA interactions, has been utilized to predict disease genes [6-8,10,12-14]. Previous studies have incorporated topological characteristics of known disease genes such as degrees in networks [14], the overlap between interaction partners of candidate genes and those of known disease genes [6], the probability that candidate genes participate in the same protein complexes as known disease-causing genes [10], or the distribution of distances from candidate genes to known disease genes [13]. Although these methods perform well in general, their performance is not satisfactory for some specific diseases of interest to us, such as acute myeloid leukemia (AML) (AUC = 0.55 by Radivojac et al. [13]). We hypothesized that integrating molecular networks with mRNA expression profiles from patients might help delineate disease-specifically dysregulated molecular subnetworks containing disease-causing mutation genes. Chuang et al. supported this hypothesis by showing that the subnetworks they identified were significantly enriched for known breast cancer mutation genes [15]. Mani et al. proposed another method that predicts oncogenes in B-cell lymphomas by integrating both molecular interactions and mRNA expression [16]. Here, we identified molecular subnetworks dysregulated in AML patients that were associated with key leukemogenic processes such as myeloid differentiation. We also evaluated the enrichment of known AML-causing mutation genes within the subnetworks, and found that the subnetworks contain a significant fraction of known AML genes (mostly non-differentially expressed) embedded among the interconnections of differentially expressed genes. In addition, we report several characteristics of AML genes in the subnetworks, which can be utilized to build prediction models for unknown AML genes.

Results and discussion

Identification of subnetworks perturbed in AML

The method for finding AML subnetworks is similar to that of our previous work [15] and is visualized in Figure 1. We overlaid the expression value of each gene on its corresponding protein in the protein-protein and protein-DNA interaction network and, starting from each node, searched in a greedy fashion for subnetworks whose combined activities across the patients have high perturbation scores (PS). The gene expression profiles used cDNA platforms, where the expression value of gene g in patient p is the mean log ratio of intensities from Cy5-labeled mRNA of the patient and Cy3-labeled reference mRNA. Expression values were normalized for each gene across patients to have mean 0 and standard deviation 1 (z). We took absolute values of expression levels to measure the perturbation effect regardless of the direction of change (i.e. up or down). The perturbation score was defined as the mean over the standard deviation of an activity vector across samples, where each activity value was the averaged expression level of the genes participating in subnetwork M; it is denoted as S(M) in Figure 1. Subnetworks with a higher mean and smaller variance of activity levels are considered more perturbed in AML samples.
Figure 1

Schematic overview of the subnetwork identification. The mRNA expression levels of each gene were overlaid on its corresponding protein in the network, and subnetworks whose combined activities across the patients have a high perturbation score were searched. An activity level (a) for a subnetwork M in the jth sample was defined as the mean expression level with the square root of the number of participating genes in the denominator. The perturbation score S(M) for the subnetwork was then calculated as the mean over the standard deviation of the activity levels across patients.
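The scoring and greedy search described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the use of the population standard deviation, the `max_size` cap, and the stopping rule (stop when no neighbor improves the score) are all assumptions made for illustration. The activity follows the Figure 1 description (sum of absolute z-scores divided by the square root of the number of genes).

```python
import math
import statistics


def zscore_rows(expr):
    """Normalize each gene across patients to mean 0 and standard deviation 1.

    expr maps gene symbol -> list of log-ratio values, one per patient.
    Population standard deviation is an assumption; the text does not say.
    """
    out = {}
    for gene, values in expr.items():
        mu = statistics.mean(values)
        sd = statistics.pstdev(values)
        out[gene] = [(v - mu) / sd if sd > 0 else 0.0 for v in values]
    return out


def perturbation_score(genes, z):
    """S(M): mean over standard deviation of the activity vector of M."""
    genes = list(genes)
    n_samples = len(next(iter(z.values())))
    # Activity in sample j: sum of |z| over member genes / sqrt(#genes).
    activity = [
        sum(abs(z[g][j]) for g in genes) / math.sqrt(len(genes))
        for j in range(n_samples)
    ]
    sd = statistics.pstdev(activity)
    # A constant activity vector has no defined ratio; report infinity.
    return statistics.mean(activity) / sd if sd > 0 else float("inf")


def greedy_subnetwork(seed, neighbors, z, max_size=10):
    """Grow a subnetwork from seed, adding the best-scoring neighbor each step.

    neighbors maps gene -> iterable of interacting genes (hypothetical format).
    """
    members = {seed}
    best = perturbation_score(members, z)
    while len(members) < max_size:
        frontier = {n for m in members for n in neighbors.get(m, ())} - members
        frontier = {n for n in frontier if n in z}  # need expression data
        if not frontier:
            break
        cand, cand_score = max(
            ((n, perturbation_score(members | {n}, z)) for n in sorted(frontier)),
            key=lambda pair: pair[1],
        )
        if cand_score <= best:  # assumed stopping rule: no improvement
            break
        members.add(cand)
        best = cand_score
    return members, best
```

Running `greedy_subnetwork` once per node of the interaction network would give one candidate subnetwork per seed, matching the "starting from each node" description.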

AML subnetworks associated with key leukemogenic processes

Through the search for subnetworks perturbed in AML patients, we identified 269 subnetworks (p < 0.05) comprising 859 genes whose functions are associated with AML development processes such as myeloid differentiation, cell signaling for growth and survival, cell cycle, and cell and tissue remodeling. Within the significant AML subnetworks, we found many already known AML-causing mutation genes. Figure 2 shows examples of subnetworks containing known AML genes such as JAK2, JAK3, PDGFRB, and CREBBP, together with their representative biological processes. In particular, a severe block in myeloid differentiation is known to be the hallmark of AML.
Figure 2

Examples of subnetworks containing known AML mutation genes. Nodes and links represent human proteins and protein interactions, respectively. The color of each node shows the degree of mRNA level change in AML patients. Known AML mutation genes are marked with the diamond shape.

AML subnetworks enriched for known AML causing genes

We systematically evaluated the enrichment of known AML genes in the significant subnetworks. We compiled 62 genes known to be causally mutated in AML from the Sanger Cancer Gene Census. Of the 269 subnetworks, 150 included at least one AML gene, and 49 included two or more AML genes. As shown in Figure 3, the subnetworks were much more significantly enriched for AML-causing mutation genes than the genes selected by conventional gene expression analysis alone, without considering molecular interactions (P = 7.14e-6 vs. P = 0.04).
Figure 3

The enrichment of AML mutation genes in subnetworks. 18 out of 62 AML genes (29.03%) were found in the 269 subnetworks, which include 859 genes, and their enrichment was significant (P = 7.14e-6) by the hypergeometric test (the probability that 18 of the 62 AML genes are found in the subnetworks, which include 859 of the 9142 genes in the whole network). In contrast, only two AML genes (FLT3, JAK3) (3.23%) were found among the 859 genes most differentially expressed in their mRNA levels (P = 0.04).
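The hypergeometric test described in this caption can be sketched with Python's standard library. This is an illustrative calculation, not the authors' code: the function name `hypergeom_tail` is hypothetical, and the upper tail P(X >= k) is an assumed reading of the test, so the printed value need not match the reported P = 7.14e-6 exactly.

```python
from math import comb


def hypergeom_tail(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K are disease genes."""
    upper = min(K, n)
    # Exact integer arithmetic; one division at the end avoids float overflow.
    numer = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return numer / comb(N, n)


# Counts taken from the text: 9142 network genes, 62 AML genes,
# 859 genes in the subnetworks, 18 of which are AML genes.
p = hypergeom_tail(18, 9142, 62, 859)
print(f"P(X >= 18) = {p:.3g}")
```

Under random selection the expected overlap is 859 × 62 / 9142, roughly 5.8 genes, so observing 18 lies far in the upper tail.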

Characteristics of AML genes in the subnetworks

Table 1 lists the 18 known AML genes detected in perturbed subnetworks, along with the number of subnetworks that include each gene and the magnitude of differential expression of its mRNA level (DES). JAK3, KIT, EVI1, and CREBBP appeared in more than 10 subnetworks, while most other genes were present in only one or two. JAK3, which has by far the highest frequency (110), has been reported to be of great biological importance in AML pathogenesis through gain-of-function JAK3 mutations (e.g. JAK3 A572V, JAK3 V722I, JAK3 P132T) that activate signal transduction [17]. Mutations in KIT, which has the second highest frequency (43), were also found in more than 30% of patients with de novo AML [18]. The frequency with which an AML gene appears in subnetworks was more strongly correlated with the magnitude of its DES (correlation coefficient r = 0.43) than with its number of interacting partners, i.e. the node degree (r = 0.01).
Table 1

AML mutation genes in subnetworks

Genes      Number of Subnetworks   Degree   DES+
JAK3*      110                     32       2.73
KIT        43                      54       1.97
EVI1       16                      7        -1.27
CREBBP     14                      209      1.5
EP300      7                       216      2.36
BCR        2                       32       1.41
FLT3**     2                       11       3.58
NSD1       2                       6        1.14
PTPN11     1                       109      -1.07
JAK2       1                       87       N/A
PDGFRB     1                       56       -0.33
NPM1       1                       33       -0.37
RUNX1      1                       22       0.68
GATA2      1                       20       1.58
PICALM     1                       8        0.44
FNBP1      1                       7        2.05
RPL22      1                       3        1.08
TRIP11     1                       2        -1.29

+DES is the degree of change in the mRNA level of each gene.

Genes whose absolute DES ranks within the top 5% are marked with **, and genes within the top 10% are marked with *.

We examined whether AML genes captured in subnetworks have high degrees in the network, because that property has previously been used to predict unknown disease genes in other diseases (Figure 4a). The figure shows that all known AML-causing genes (AML) and the AML genes captured in subnetworks (AML_Network) have significantly more interaction partners than all genes in the network (P = 1.22e-6 and P = 2.34e-6, respectively). AML genes found in subnetworks have slightly higher degrees than AML genes not captured in subnetworks (P = 0.02). The mean and median degrees of all genes in the network are 9 and 4, while those of the 18 AML genes are 51 and 27. Although this result supports the tendency of known AML genes to have high network degrees, low-degree AML genes such as RPL22 and TRIP11 also appeared in the subnetworks.
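As a quick consistency check, the mean and median degrees quoted for the 18 AML genes can be recomputed from the Degree column of Table 1. The sketch below is only a worked arithmetic example; the dictionary simply transcribes the table values.

```python
import statistics

# Degree column of Table 1 (18 AML genes found in subnetworks).
degrees = {
    "JAK3": 32, "KIT": 54, "EVI1": 7, "CREBBP": 209, "EP300": 216,
    "BCR": 32, "FLT3": 11, "NSD1": 6, "PTPN11": 109, "JAK2": 87,
    "PDGFRB": 56, "NPM1": 33, "RUNX1": 22, "GATA2": 20, "PICALM": 8,
    "FNBP1": 7, "RPL22": 3, "TRIP11": 2,
}

values = list(degrees.values())
mean_degree = statistics.mean(values)      # 914 / 18
median_degree = statistics.median(values)  # midpoint of 22 and 32

print(round(mean_degree), median_degree)
```

The rounded mean (51) and the median (27) agree with the values stated in the text.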
Figure 4

(a) Degrees and (b) mRNA expression changes of AML genes. Each figure shows node degrees and magnitudes of differential expression (DES) for AML-causing mutation genes found in subnetworks (AML_Network), all known AML mutation genes (AML), and all genes in the whole network. The bottom and top of each box are first and third quartiles, and the band near the middle of the box is the median. Whiskers extend to at most 1.5 times the inter-quartile range. Beyond the whiskers, all outliers are shown in open circles. The statistical significances for differences between two groups of genes (e.g. AML_Network vs. All genes) measured by non-parametric Wilcoxon rank-sum test are denoted below the labels of gene groups.

Finally, we investigated the differential expression of AML genes at the mRNA level (Figure 4b). There was no significant difference between the groups of genes, and neither the known AML genes nor those found in subnetworks, except FLT3 and JAK3, showed mRNA level aberrations. This result shows that gene expression alone does not provide enough information to predict unknown AML-causing mutation genes. However, our integrative approach could capture non-differentially expressed AML genes in subnetworks if they were entangled with differentially expressed neighbour proteins, yielding subnetworks with high perturbation scores.

Conclusion

We have demonstrated that integration of condition-independent molecular networks extracted from various types of cells and experiments under different conditions, and disease-specific mRNA expression profiles of AML patients enables the dissection of pathogenic modules of interacting proteins reflecting key leukemogenic processes. In addition, the dissected modules are enriched for AML-causing mutation genes most of which are not detectable with gene expression analysis alone due to minor changes in their mRNA levels. Identification of subnetworks perturbed in AML patients can provide novel molecular hypotheses underlying AML etiology, and investigated characteristics of known AML genes appearing in the subnetworks can be exploited to predict unknown AML-causing genes.

Methods

Protein-protein interaction networks

We downloaded the PPI network from the PhenoPred website by Radivojac et al. [19]. It consists of 41456 physical interactions among 9142 proteins assembled from Human Protein Reference Database (HPRD) [20], the Online Predicted Human Interaction Database (OPHID) [21], and studies by Rual et al. and Stelzl et al. [22,23].

mRNA expression profiles of AML patients

Gene expression profiles of 65 peripheral-blood samples and 54 bone marrow specimens from 116 adult patients with AML were downloaded from Gene Expression Omnibus (GSE425) whose expression values are log ratios (base 2) of mean intensities of patient samples vs. common reference mRNA [24]. Gene identifiers of three cDNA microarray platforms (GPL317,318,319) were mapped to gene symbols using accompanied gene annotation files from GEO yielding 6987 gene symbols with expression levels in at least one of three platforms.

Mutation genes in AML patients

We compiled two sets of AML-associated genes: 14 genes downloaded from PhenoPred web site originally collected from OMIM [25], Swiss-Prot [26], and HPRD [20] by Radivojac et al. (Disease Ontology ID: 9119) [19], and 62 genes whose somatic and germline mutations are causally implicated in AML patients downloaded from Sanger Cancer Gene Census [27], and also appearing in our PPI network.

Significance evaluation of subnetworks

To evaluate the significance of the identified subnetworks, we performed the same search procedure over 1000 random trials in which the expression vectors of individual genes are randomly permuted in the network. The p value of each real subnetwork was calculated as the fraction of random subnetworks having higher PS scores than the designated real subnetwork among all random subnetworks. We considered subnetworks with the p-value P < 0.05 significant in this work.
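The permutation procedure above can be sketched as follows. This is a simplified illustration rather than the authors' code: `permute_expression` and `empirical_p_value` are hypothetical names, and reassigning whole expression vectors among genes is one reading of "randomly permuted in the network".

```python
import random


def permute_expression(z, rng):
    """Randomly reassign whole expression vectors among the genes."""
    genes = list(z)
    shuffled = genes[:]
    rng.shuffle(shuffled)
    return {g: z[s] for g, s in zip(genes, shuffled)}


def empirical_p_value(real_score, random_scores):
    """Fraction of random subnetwork scores higher than the real score."""
    if not random_scores:
        raise ValueError("need at least one random score")
    higher = sum(1 for s in random_scores if s > real_score)
    return higher / len(random_scores)
```

In a full run, the subnetwork search would be repeated on each of the 1000 permuted networks, the resulting scores pooled into `random_scores`, and each real subnetwork kept if its p-value is below 0.05.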

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

EL designed the study, and analyzed the results. EL and HC carried out the experiments. EL wrote the initial draft of the manuscript. HC, PR, JK, and DL revised the manuscript, and gave valuable suggestions and improvements. DL supervised this work. All authors read and approved the final manuscript.
  25 in total

1.  Towards a proteome-scale map of the human protein-protein interaction network.

Authors:  Jean-François Rual; Kavitha Venkatesan; Tong Hao; Tomoko Hirozane-Kishikawa; Amélie Dricot; Ning Li; Gabriel F Berriz; Francis D Gibbons; Matija Dreze; Nono Ayivi-Guedehoussou; Niels Klitgord; Christophe Simon; Mike Boxem; Stuart Milstein; Jennifer Rosenberg; Debra S Goldberg; Lan V Zhang; Sharyl L Wong; Giovanni Franklin; Siming Li; Joanna S Albala; Janghoo Lim; Carlene Fraughton; Estelle Llamosas; Sebiha Cevik; Camille Bex; Philippe Lamesch; Robert S Sikorski; Jean Vandenhaute; Huda Y Zoghbi; Alex Smolyar; Stephanie Bosak; Reynaldo Sequerra; Lynn Doucette-Stamm; Michael E Cusick; David E Hill; Frederick P Roth; Marc Vidal
Journal:  Nature       Date:  2005-09-28       Impact factor: 49.962

2.  A human protein-protein interaction network: a resource for annotating the proteome.

Authors:  Ulrich Stelzl; Uwe Worm; Maciej Lalowski; Christian Haenig; Felix H Brembeck; Heike Goehler; Martin Stroedicke; Martina Zenkner; Anke Schoenherr; Susanne Koeppen; Jan Timm; Sascha Mintzlaff; Claudia Abraham; Nicole Bock; Silvia Kietzmann; Astrid Goedde; Engin Toksöz; Anja Droege; Sylvia Krobitsch; Bernhard Korn; Walter Birchmeier; Hans Lehrach; Erich E Wanker
Journal:  Cell       Date:  2005-09-23       Impact factor: 41.582

3.  Creation and implications of a phenome-genome network.

Authors:  Atul J Butte; Isaac S Kohane
Journal:  Nat Biotechnol       Date:  2006-01       Impact factor: 54.908

4.  Gene prioritization through genomic data fusion.

Authors:  Stein Aerts; Diether Lambrechts; Sunit Maity; Peter Van Loo; Bert Coessens; Frederik De Smet; Leon-Charles Tranchevent; Bart De Moor; Peter Marynen; Bassem Hassan; Peter Carmeliet; Yves Moreau
Journal:  Nat Biotechnol       Date:  2006-05       Impact factor: 54.908

5.  Online Mendelian Inheritance in Man (OMIM).

Authors:  A Hamosh; A F Scott; J Amberger; D Valle; V A McKusick
Journal:  Hum Mutat       Date:  2000       Impact factor: 4.878

6.  Online predicted human interaction database.

Authors:  Kevin R Brown; Igor Jurisica
Journal:  Bioinformatics       Date:  2005-01-18       Impact factor: 6.937

7.  Use of gene-expression profiling to identify prognostic subclasses in adult acute myeloid leukemia.

Authors:  Lars Bullinger; Konstanze Döhner; Eric Bair; Stefan Fröhling; Richard F Schlenk; Robert Tibshirani; Hartmut Döhner; Jonathan R Pollack
Journal:  N Engl J Med       Date:  2004-04-15       Impact factor: 91.245

8.  An integrated approach to inferring gene-disease associations in humans.

Authors:  Predrag Radivojac; Kang Peng; Wyatt T Clark; Brandon J Peters; Amrita Mohan; Sean M Boyle; Sean D Mooney
Journal:  Proteins       Date:  2008-08-15

9.  Computational disease gene identification: a concert of methods prioritizes type 2 diabetes and obesity candidate genes.

Authors:  Nicki Tiffin; Euan Adie; Frances Turner; Han G Brunner; Marc A van Driel; Martin Oti; Nuria Lopez-Bigas; Christos Ouzounis; Carolina Perez-Iratxeta; Miguel A Andrade-Navarro; Adebowale Adeyemo; Mary Elizabeth Patti; Colin A M Semple; Winston Hide
Journal:  Nucleic Acids Res       Date:  2006-06-06       Impact factor: 16.971

10.  The Universal Protein Resource (UniProt).

Authors:  Amos Bairoch; Rolf Apweiler; Cathy H Wu; Winona C Barker; Brigitte Boeckmann; Serenella Ferro; Elisabeth Gasteiger; Hongzhan Huang; Rodrigo Lopez; Michele Magrane; Maria J Martin; Darren A Natale; Claire O'Donovan; Nicole Redaschi; Lai-Su L Yeh
Journal:  Nucleic Acids Res       Date:  2005-01-01       Impact factor: 16.971
