WebPARE: web-computing for inferring genetic or transcriptional interactions.

Cheng-Long Chuang, Jia-Hong Wu, Chi-Sheng Cheng, Grace S Shieh.

Abstract

SUMMARY: Inferring genetic or transcriptional interactions, when done successfully, may provide insights into biological processes or biochemical pathways of interest. Unfortunately, most computational algorithms require a certain level of programming expertise. To provide a simple web interface for users to infer interactions from time course gene expression data, we present WebPARE, which is based on the pattern recognition algorithm (PARE). For expression data, in which each type of interaction (e.g. activator target) and the corresponding paired gene expression pattern are significantly associated, PARE uses a non-linear score to classify gene pairs of interest into a few subclasses of various time lags. In each subclass, PARE learns the parameters in the decision score using known interactions from biological experiments or published literature. Subsequently, the trained algorithm predicts interactions of a similar nature. Previously, PARE was shown to infer two sets of interactions in yeast successfully. Moreover, several predicted genetic interactions coincided with existing pathways; this indicates the potential of PARE in predicting partial pathway components. Given a list of gene pairs or genes of interest and expression data, WebPARE invokes PARE and outputs predicted interactions and their networks in directed graphs.

Year: 2009 PMID: 20007742 PMCID: PMC2820674 DOI: 10.1093/bioinformatics/btp684

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937

1 INTRODUCTION

Genetic interaction (GI) networks may reveal how a group of genes function together to carry out a biological process and may unravel cellular buffering mechanisms (Boone et al., 2007), while predicted transcriptional regulatory interactions (TIs) may reveal the regulatory mechanisms in organisms (Wang et al., 2007). Henceforth, we use 'interactions' to denote GIs or TIs. Recently, there have been a few studies on GIs (Wong and Roth, 2005). Paralogs or redundant genes are called synthetic sick or lethal (SSL) gene pairs if the combination of two mutations, neither of which is lethal by itself, causes the organism to die or malfunction. Other GIs of interest are the transcriptional compensatory and transcriptional diminishment interactions among SSL gene pairs (Chuang et al., 2008): following a gene's loss, the expression level of its partner gene increases (transcriptional compensation) or decreases (transcriptional diminishment).

With the emergence of modern biotechnologies, various computational methods have been proposed to predict interactions using gene expression data and/or other experimental data. Inferring these interactions, when done successfully, can provide insights into biological processes or biochemical pathways of interest. Unfortunately, most computational algorithms require a certain level of programming expertise. A web-computing implementation of such an algorithm provides easy access to predicting interactions that are not annotated in any database or in the literature.

We have previously published the pattern recognition algorithm PARE (Chuang et al., 2008), which can infer interactions from time course expression data, provided that each type of interaction, e.g. activator-target (AT) or repressor-target (RT), and the corresponding paired gene expression pattern are significantly associated. PARE uses a non-linear score to classify gene pairs of interest into subclasses of various time lags. In each subclass, PARE learns the parameters in the decision score using known interactions from biological experiments or published literature. Subsequently, the trained algorithm predicts interactions of a similar nature. PARE was shown to infer two sets of interactions in yeast successfully using expression data and existing knowledge such as 112 pairs of qRT-PCR validated GIs. Moreover, several of the predicted GIs coincided with existing pathways in yeast. This indicates that PARE has the potential to predict biochemical pathways, and altered pathways are likely to play key roles in cancers and other complex human diseases (Ding et al., 2008). Recently, we applied PARE to infer TIs involved in human adipogenesis, and preliminary results identified some promising transcription factors for further biological experiments (J.-D. Zucker and K. Clement, unpublished data). Furthermore, a web-computing implementation of PARE should be useful for predicting GIs from recent large-scale synthetic genetic array (SGA) results in yeast.

Here, we present a web-computing implementation of PARE (WebPARE), which provides a simple web interface for users to infer interactions from time course expression data. A graphical display of the predicted network is also provided. In the following, we outline the architecture of WebPARE and conclude with an example of inferring TIs of genes involved in the yeast cell cycle.

2 THE ARCHITECTURE OF WEBPARE

In this section, we introduce the structure of WebPARE, which consists of two main components: the web-interface unit and the PARE computing unit; see Figure 1 for the flowchart. Via the web interface, users can create new requests to infer unknown interactions by uploading a list of gene pairs or genes of interest together with their time course expression data; if a list of genes is uploaded, a list of gene pairs is formed automatically. After a new request has arrived in the queue, WebPARE distributes it to the computing unit. Integrated known interactions, e.g. the 112 pairs of qRT-PCR validated yeast GIs and known TIs in Arabidopsis, yeast, mouse and human, are used to train the parameters of PARE; alternatively, a set of default parameters can be used. More existing TIs in other species will be integrated in the near future. All gathered information is then passed to the computing unit.
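The automatic pairing of an uploaded gene list can be pictured with a short sketch. The text does not state whether WebPARE forms ordered or unordered pairs, so the hypothetical helper `form_gene_pairs` below assumes ordered pairs by default, on the grounds that the predicted networks are directed; the gene names are illustrative only.

```python
from itertools import combinations, permutations


def form_gene_pairs(genes, ordered=True):
    """Form candidate gene pairs from a list of gene names.

    Assumption (not confirmed by the source): ordered pairs are generated
    because the predicted graph is directed. Use ordered=False to get each
    unordered pair once.
    """
    unique = list(dict.fromkeys(genes))  # drop duplicates, keep input order
    if ordered:
        return list(permutations(unique, 2))
    return list(combinations(unique, 2))


# Illustrative gene names; any identifiers present in the expression file would do.
pairs = form_gene_pairs(["CLN3", "SWI4", "CLB2"])
for first, second in pairs:
    print(first, second)
```

With n distinct genes this yields n(n - 1) ordered pairs, or n(n - 1)/2 unordered pairs, so the number of queries grows quadratically with the length of the gene list.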

Fig. 1.

The flowchart of WebPARE.

The key procedures of the computing unit are outlined as follows; we refer to Chuang et al. (2008) for the details of PARE. First, WebPARE checks whether the uploaded expression data are in PreCLustering file format; see the website for an example. Next, a filtering process checks whether the uploaded data satisfy the assumptions of PARE, namely (i) that no gene expression curve of interest is too 'flat' to be predicted [i.e. each curve satisfies Equation (1) in Chuang et al., 2008, where G_i(t) denotes the expression of gene i after smoothing at time t] and (ii) that the training data pass Fisher's exact test. Users can select the value of C in Equation (1) and the percentage of the training data that must pass Fisher's test according to the guidelines (Supplementary Material), or use the default values (C = 1.4 and 50%). Once the dataset passes this filtering step, PARE classifies each gene pair into the time-lag subclass in which an interaction is most probable. In each subclass, either the particle swarm optimization algorithm is used to optimize the parameters using known interactions or the default values are used. Finally, all gene pairs are scored by PARE.

After WebPARE finishes a request, an email notifies the user to download the result, which lists the most probable time lag, the associated PARE score and the predicted interaction type for each gene pair. In addition, a directed graph of the predicted interactions (a Cytoscape session file) is reported (Supplementary Material). Each node denotes a gene and is labeled with the gene name, while each edge represents a significant predicted interaction; non-significant interactions, i.e. those whose absolute PARE scores are smaller than the threshold, are not plotted. When inferring TIs, a solid edge represents an AT interaction and a dashed edge an RT interaction; when inferring GIs, a solid edge represents a transcriptional diminishment interaction and a dashed edge a transcriptional compensatory interaction.

The web-interface unit of WebPARE is written in ASP and runs on the Microsoft Internet Information Services (IIS) web server, while the computing unit is written in MATLAB. Currently, WebPARE allows 100 thousand queries per access.
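The rule for which predictions appear in the network graph can be sketched as follows. This is a minimal illustration for the TI case and is not taken from the WebPARE source: the `Prediction` record, the `significant_edges` helper, the threshold value and the scores are all hypothetical, and the edge direction (first gene to second gene) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    source: str   # first gene of the pair (assumed to be the regulator)
    target: str   # second gene of the pair
    lag: int      # most probable time lag
    score: float  # PARE score (values below are made up)
    kind: str     # predicted interaction type: "AT" or "RT"


# Edge styles as described in the text for TIs.
EDGE_STYLE = {"AT": "solid", "RT": "dashed"}


def significant_edges(predictions, threshold):
    """Keep predictions with |score| >= threshold and attach an edge style."""
    edges = []
    for p in predictions:
        if abs(p.score) < threshold:
            continue  # non-significant: not plotted
        edges.append((p.source, p.target, EDGE_STYLE[p.kind]))
    return edges


predictions = [
    Prediction("geneA", "geneB", lag=1, score=2.3, kind="AT"),
    Prediction("geneA", "geneC", lag=0, score=-0.2, kind="RT"),
    Prediction("geneB", "geneC", lag=2, score=-1.7, kind="RT"),
]
edges = significant_edges(predictions, threshold=1.0)  # hypothetical threshold
```

Filtering on the absolute score keeps strongly negative scores as well as strongly positive ones, which matches the description that only pairs whose absolute PARE score falls below the threshold are omitted.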

3 AN EXAMPLE

Suppose that a list of 15 gene pairs involved in the cell cycle, together with expression data from cyclin-mutant yeast cells (Orlando et al., 2008), was uploaded to WebPARE, and that the TIs of these gene pairs were of interest. In the filtering step, since all 15 pairs were to be predicted, the user followed the guidelines (Supplementary Material) and relaxed the value of C to 1.1 so that all gene pairs passed the filtering process of Equation (1). Next, the 162 integrated (prestored) pairs of known TIs in yeast passed Equation (1) with C = 1.4, and 100% of them passed Fisher's exact test. Therefore, PARE was invoked. All integrated yeast TI pairs were classified into subclasses with distinct time lags based on their PARE scores with the default weights (1, 1, 3.5). In each subclass, the integrated known TIs were used to train the parameters of PARE, and the TIs of interest were predicted. After comparing the predicted results with published literature, the modified true positive rate (mTPR) was 60% (9/15), where mTPR is defined as the ratio of the number of correctly predicted interactions to the total number of known interactions among all gene pairs. If the user preferred more accurate predictions, the guidelines (Supplementary Material) recommend larger values of C. Setting C to 1.4 or 1.5 reduced the number of gene pairs to be predicted to 10 in both cases, and the mTPR was 70% in both cases. This echoes the guideline that a larger value of the parameter C in Equation (1) leads to more accurate predictions, but carries a risk of filtering out gene pairs of interest. The significant predicted network for the 10 pairs is given in the Supplementary Material. A pilot study predicting 99 gene pairs (Supplementary Material) resulted in mTPRs of 72% and 82% for C equal to 1.1 and 1.5, respectively; the experiment took ∼16 min on a PC with a 1.86 GHz Pentium Core 2 processor and 1.0 GB of RAM.
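The mTPR arithmetic in this example can be reproduced with a few lines of code. The sketch below follows the definition given above (correct predictions divided by the number of predicted gene pairs with a known interaction); the function name `modified_tpr`, the placeholder gene names and the placeholder labels are assumptions made for illustration and do not reproduce the actual 15 pairs.

```python
def modified_tpr(predicted, known):
    """Return correct predictions / gene pairs with a known interaction.

    `predicted` and `known` map a gene pair to an interaction type
    (e.g. "AT" or "RT"). Pairs without a known interaction are ignored.
    """
    evaluated = [pair for pair in predicted if pair in known]
    if not evaluated:
        raise ValueError("no predicted pair has a known interaction")
    correct = sum(1 for pair in evaluated if predicted[pair] == known[pair])
    return correct / len(evaluated)


# Placeholder data: 15 pairs with a known type, 9 of them predicted correctly.
known = {(f"reg{i}", f"tgt{i}"): "AT" for i in range(15)}
predicted = {pair: ("AT" if i < 9 else "RT") for i, pair in enumerate(known)}

mtpr = modified_tpr(predicted, known)
print(f"mTPR = {mtpr:.0%}")
```

Raising C shrinks the denominator because fewer gene pairs survive the filter, which is why the rate can rise from 60% to 70% while the number of predicted pairs drops from 15 to 10.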

REFERENCES

1. Transcriptional compensation for gene loss plays a minor role in maintaining genetic robustness in Saccharomyces cerevisiae.

Authors: Sharyl L Wong; Frederick P Roth
Journal: Genetics Date: 2005-07-05 Impact factor: 4.562

2. Inferring transcriptional regulatory networks from high-throughput data.

Authors: Rui-Sheng Wang; Yong Wang; Xiang-Sun Zhang; Luonan Chen
Journal: Bioinformatics Date: 2007-09-22 Impact factor: 6.937

3. A pattern recognition approach to infer time-lagged genetic interactions.

Authors: Cheng-Long Chuang; Chih-Hung Jen; Chung-Ming Chen; Grace S Shieh
Journal: Bioinformatics Date: 2008-03-12 Impact factor: 6.937

4. Somatic mutations affect key pathways in lung adenocarcinoma.

Authors: Li Ding; Gad Getz; David A Wheeler; Elaine R Mardis; Michael D McLellan; Kristian Cibulskis; Carrie Sougnez; Heidi Greulich; Donna M Muzny; Margaret B Morgan; Lucinda Fulton; Robert S Fulton; Qunyuan Zhang; Michael C Wendl; Michael S Lawrence; David E Larson; Ken Chen; David J Dooling; Aniko Sabo; Alicia C Hawes; Hua Shen; Shalini N Jhangiani; Lora R Lewis; Otis Hall; Yiming Zhu; Tittu Mathew; Yanru Ren; Jiqiang Yao; Steven E Scherer; Kerstin Clerc; Ginger A Metcalf; Brian Ng; Aleksandar Milosavljevic; Manuel L Gonzalez-Garay; John R Osborne; Rick Meyer; Xiaoqi Shi; Yuzhu Tang; Daniel C Koboldt; Ling Lin; Rachel Abbott; Tracie L Miner; Craig Pohl; Ginger Fewell; Carrie Haipek; Heather Schmidt; Brian H Dunford-Shore; Aldi Kraja; Seth D Crosby; Christopher S Sawyer; Tammi Vickery; Sacha Sander; Jody Robinson; Wendy Winckler; Jennifer Baldwin; Lucian R Chirieac; Amit Dutt; Tim Fennell; Megan Hanna; Bruce E Johnson; Robert C Onofrio; Roman K Thomas; Giovanni Tonon; Barbara A Weir; Xiaojun Zhao; Liuda Ziaugra; Michael C Zody; Thomas Giordano; Mark B Orringer; Jack A Roth; Margaret R Spitz; Ignacio I Wistuba; Bradley Ozenberger; Peter J Good; Andrew C Chang; David G Beer; Mark A Watson; Marc Ladanyi; Stephen Broderick; Akihiko Yoshizawa; William D Travis; William Pao; Michael A Province; George M Weinstock; Harold E Varmus; Stacey B Gabriel; Eric S Lander; Richard A Gibbs; Matthew Meyerson; Richard K Wilson
Journal: Nature Date: 2008-10-23 Impact factor: 49.962

5. Exploring genetic interactions and networks with yeast.

Authors: Charles Boone; Howard Bussey; Brenda J Andrews
Journal: Nat Rev Genet Date: 2007-06 Impact factor: 53.242

6. Global control of cell-cycle transcription by coupled CDK and network oscillators.

Authors: David A Orlando; Charles Y Lin; Allister Bernard; Jean Y Wang; Joshua E S Socolar; Edwin S Iversen; Alexander J Hartemink; Steven B Haase
Journal: Nature Date: 2008-05-07 Impact factor: 49.962
