
Gene Expression Browser: large-scale and cross-experiment microarray data integration, management, search & visualization.

Ming Zhang, Yudong Zhang, Li Liu, Lijuan Yu, Shirley Tsang, Jing Tan, Wenhua Yao, Manjit S Kang, Yongqiang An, Xingming Fan.

Abstract

BACKGROUND: In the last decade, a large amount of microarray gene expression data has been accumulated in public repositories. Integrating and analyzing high-throughput gene expression data have become key activities for exploring gene functions, gene networks and biological pathways. Effectively utilizing these invaluable microarray data remains challenging due to a lack of powerful tools to integrate large-scale gene-expression information across diverse experiments and to search and visualize a large number of gene-expression data points.
RESULTS: Gene Expression Browser is a microarray data integration, management and processing system with web-based search and visualization functions. An innovative method has been developed to define a treatment over a control for every microarray experiment to standardize and make microarray data from different experiments homogeneous. In the browser, data are pre-processed offline and the resulting data points are visualized online with a 2-layer dynamic web display. Users can view all treatments over control that affect the expression of a selected gene via Gene View, and view all genes that change in a selected treatment over control via treatment over control View. Users can also check the changes of expression profiles of a set of either the treatments over control or genes via Slide View. In addition, the relationships between genes and treatments over control are computed according to gene expression ratio and are shown as co-responsive genes and co-regulation treatments over control.
CONCLUSION: Gene Expression Browser is composed of a set of software tools, including a data extraction tool, a microarray data-management system, a data-annotation tool, a microarray data-processing pipeline, and a data search & visualization tool. The browser is deployed as a free public web service (http://www.ExpressionBrowser.com) that integrates 301 ATH1 gene microarray experiments from public data repositories (viz. the Gene Expression Omnibus repository at the National Center for Biotechnology Information and Nottingham Arabidopsis Stock Center). The set of Gene Expression Browser software tools can be easily applied to the large-scale expression data generated by other platforms and in other species.

Year:  2010        PMID: 20727159      PMCID: PMC2941691          DOI: 10.1186/1471-2105-11-433

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.169


Background

A microarray measures the expression of thousands of genes simultaneously. This experimental system has revolutionized biological research by enabling discovery of large sets of genes whose expression levels reflect a given cell type, treatment, disease or developmental stage. Since the advent of this technology more than a decade ago, a large amount of expression data has been accumulated for more than 100 species [1]. Several initiatives have been undertaken to develop public microarray data repositories and analysis tools for scientists to share and utilize these data [2]. The public data repositories, such as NASC, NCBI GEO [3], EBI ArrayExpress [4,5] and NIG CIBEX [6], have been collecting, annotating, storing and redistributing large amounts of microarray data from diverse experiments. For example, NCBI GEO (http://www.ncbi.nlm.nih.gov/geo/) has collected 366,965 samples from 14,304 experiments. These microarray data are invaluable resources for scientific research and discovery. Effective utilization of these datasets has, however, been limited by a shortage of suitable tools to integrate large-scale and diverse microarray datasets. In the most common use case, a scientist performs an experiment-based analysis: he or she downloads microarray data and sample annotations corresponding to a single experiment, inputs the data into a microarray data-analysis tool such as GeneSpring [2], HDBStat! [7], or Bioconductor packages [2], and carries out a single-experiment-centered analysis. In another common use case (e.g., for many gene-centric studies), a scientist wants to know how the expression of a given gene changes under various experimental conditions. The latter case is critically important for discovering gene functions, validating biomarkers, and developing new drugs targeted to specific genes. To answer gene-centric questions, we must have a tool that can integrate a large amount of data from different microarray experiments.
Developing such a tool presents several challenges. The first challenge is the heterogeneity of data collected from different microarray experiments. Different microarray experiments from different laboratories are usually designed independently for specific research purposes. Heterogeneity might come from differences in experimental designs, materials sampled, developmental stages, treatment levels (including controls), and so on. The second challenge is to develop an effective software tool to process such a large amount of data at an acceptable speed with currently available hardware resources (i.e., CPU, memory and network). The third challenge is related to the complexity of displaying or visualizing data in a software tool. Most software tools, when applied to large data sets, display items in an extended page or multiple display pages. Therefore, it is impossible for users to get an overall view of the data on a single page. It is also inefficient and inconvenient for users to scroll display pages to find interesting information from thousands of data items. Thus, it is important to design a data display interface that can show both an overall view of a large-scale dataset in its totality and a detailed view of individual data points. Genevestigator [8] and GeneChaser [1] are two web-based gene expression visualization tools that have successfully integrated a large number of microarray datasets and facilitated gene-centric and cross-experiment gene-expression discoveries. Genevestigator defines experiment annotation categories as Tissues/Organs, Developmental Stage, Environmental Factors (Stimulus) and Mutation. The expression data and the analysis results are organized according to these categories. The microarray experiments are discarded if they cannot be classified into one of the predefined categories. GeneChaser, on the other hand, automatically re-annotates and analyzes GDS datasets from NCBI GEO. 
It segregates all experimental conditions (treatment levels) into groups and then performs group versus group comparisons. However, the display systems of both Genevestigator and GeneChaser are limited. These two tools display data as heatmaps or bar graphs on an extended display page or across multiple display pages. Only a limited number of data points can be shown at a time. Users have to scroll down the page to find interesting data points from among hundreds or thousands of experimental conditions. Gene Expression Browser (GEB), on the other hand, efficiently displays a large number of data points simultaneously. This has been achieved by developing a set of software tools for data extraction, data management, data annotation, data processing, and gene expression profile search & visualization. This set of software tools can be applied to microarray data in both public and private data repositories. The current public GEB web service (http://www.ExpressionBrowser.com) integrates 301 ATH1 microarray experiments that were originally stored in the data repositories of NCBI and NASC [9]. Arabidopsis, as a model plant, is widely used in various microarray experiments and gene-network modeling [10-12]. The results and knowledge obtained from Arabidopsis studies can be used as a reference for corresponding research on other plants, especially field crops [13,14].

Implementation

Overall design of workflow

The GEB workflow is shown in Figure 1. Microarray data can be downloaded from public data repositories with the data extraction tool. Alternatively, data owners may upload their data directly into GEB. The data extraction tool harvests raw data files, sample annotations, and experimental designs from data repositories into the GEB data-management system. Data curators use the web-based interfaces of the data-management system to create sample sets by combining all replicated samples in each treatment level into individual groups (i.e., sample sets). Then, the data curators define a treatment-over-control pair (T/C) by selecting a treatment sample set and a control sample set. In the data-processing pipeline, the microarray data are normalized, and the log2 ratio of treatment-over-control (LOG2R) and its t-test P value are calculated. The normalized intensities of each chip, average intensities of each sample set, and LOG2Rs and P values of each T/C are loaded into the GEB database, from which the data can be queried via the web-based search & visualization tool.
Figure 1

The schema and workflow of GEB. Microarray data can be downloaded into GEB from public repositories or uploaded into GEB by data owners. GEB is composed of a set of functional components. The major components are the data extractor (a command-line program), the data management system (a web application), the data processing pipeline (a set of command-line programs piped together), a MySQL database, and a web-based search and visualization tool.
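
The ratio step of the workflow can be illustrated with a minimal sketch. It assumes log2-scale normalized intensities for one gene, and it assumes the pooled-variance (Student) form of the two-sample t statistic, since the text only states that a two-sample, two-tailed t-test is used. The function name and input layout are hypothetical, and the conversion of the t statistic into a P value (which needs a t-distribution, for example from a statistics library) is left out.

```python
from math import sqrt
from statistics import mean, variance

def log2_ratio_and_t(treatment_log2, control_log2):
    """Return (LOG2R, average log2 intensity, t statistic, degrees of freedom).

    Inputs are log2-scale normalized intensities of one gene across the
    replicates of a treatment sample set and a control sample set.
    Assumption: pooled-variance (Student) t statistic; the source does not
    say which variant of the two-sample t-test GEB uses.
    """
    n_t, n_c = len(treatment_log2), len(control_log2)
    mean_t, mean_c = mean(treatment_log2), mean(control_log2)
    log2r = mean_t - mean_c            # M: log2(treatment) - log2(control)
    avg = (mean_t + mean_c) / 2        # A: average log2 intensity
    df = n_t + n_c - 2
    pooled_var = ((n_t - 1) * variance(treatment_log2)
                  + (n_c - 1) * variance(control_log2)) / df
    t_stat = log2r / sqrt(pooled_var * (1 / n_t + 1 / n_c))
    return log2r, avg, t_stat, df

# Illustrative values only, not data from GEB.
log2r, avg, t_stat, df = log2_ratio_and_t([10.0, 10.2, 9.8], [8.0, 8.1, 7.9])
print(round(log2r, 3), round(avg, 3), round(2 ** log2r, 2), df)
```

Each sample set needs at least two replicates with non-identical values; otherwise the pooled variance is undefined or zero and the statistic cannot be computed, which matches the text's remark that experiments without replicated samples were discarded.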


Affymetrix probe set annotation

The probe sets on the Affymetrix ATH1 chip were annotated via the following procedure: (1) Arabidopsis cDNA sequences and annotations were downloaded from TAIR (http://www.arabidopsis.org/) and ATH1 probe sequences were downloaded from Affymetrix; (2) all probe sequences were BLASTed against all cDNA sequences; (3) a probe set was mapped to a cDNA when nine or more probes in the probe set had a 100% match to the cDNA sequence (each ATH1 probe set contains 11 probes); and (4) the annotation of the matched cDNA was used as the annotation of the probe set.
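
The mapping rule in step (3) can be sketched as follows. This is a minimal illustration, assuming the BLAST output has already been reduced to a list of perfect-match hits; the tuple layout, identifiers, and function name are hypothetical, and the source does not say how a probe set that passes the threshold for more than one cDNA is handled, so the sketch simply returns all of them.

```python
from collections import defaultdict

MIN_PERFECT = 9  # nine or more of the 11 probes must match at 100% (from the text)

def map_probe_sets(perfect_hits, min_perfect=MIN_PERFECT):
    """Map probe sets to cDNAs.

    perfect_hits: iterable of (probe_set_id, probe_id, cdna_id) tuples, one per
    probe that matched a cDNA at 100% identity (hypothetical input layout).
    Returns {probe_set_id: sorted list of cDNA ids with enough matching probes}.
    """
    probes_by_pair = defaultdict(set)
    for probe_set_id, probe_id, cdna_id in perfect_hits:
        probes_by_pair[(probe_set_id, cdna_id)].add(probe_id)

    mapping = defaultdict(list)
    for (probe_set_id, cdna_id), probes in probes_by_pair.items():
        if len(probes) >= min_perfect:
            mapping[probe_set_id].append(cdna_id)
    return {ps: sorted(cdnas) for ps, cdnas in mapping.items()}

# Made-up hits: ps_A has 9 perfect probes on cdna_1 but only 8 on cdna_2.
hits = [("ps_A", f"p{i}", "cdna_1") for i in range(9)]
hits += [("ps_A", f"p{i}", "cdna_2") for i in range(8)]
hits += [("ps_B", f"p{i}", "cdna_3") for i in range(11)]
print(map_probe_sets(hits))  # {'ps_A': ['cdna_1'], 'ps_B': ['cdna_3']}
```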

Data extraction and management

The data extraction tool was developed using Java with the Jakarta Commons Net Library (http://commons.apache.org/net/). The tool is a web crawler that recursively harvests raw data (such as Affymetrix CEL files), sample annotations, and experiment design descriptions from a repository website and then loads them into the GEB database. To download data from different repositories, a corresponding plug-in component was developed for each repository. So far, two data extraction plug-ins have been developed for harvesting data from GEO and NASC. The data-management system was developed for data curators to view and annotate the microarray data extracted from data repositories or submitted by data owners. Data curators annotate the data via the following steps. First, a data curator creates a sample set by grouping replicated samples from every treatment level. The user interface for defining a sample set is shown in Figure 2A. A sample set name of "Wildtype_no treatment" is entered in the Name box, and the two replicates "Wildtype_no treatment_Rep1" and "Wildtype_no treatment_Rep2" are assigned to the sample set by moving them from the left panel to the right panel. Other sample sets in the experiment are created via the same procedure.
Figure 2

Screenshots of the GEB data management system. A. The web interface used by a data curator to define sample sets for all experiment data. B. The web interface used by a data curator to choose a treatment sample set and a control sample set to create a T/C. Detailed information for all eight sample sets can be found at http://expressionbrowser.com/arab/displayExperiment.jsp?id=2202517&tab=2.

Second, a data curator creates a T/C pair by choosing a treatment sample set and the corresponding control sample set from a drop-down menu (Figure 2B). For instance, we selected "ice1_no treatment" as treatment and "Wildtype_no treatment" as control to form a T/C. Then, the curator enters the name "ICE1 mutant vs. wild type" in the Name box and detailed T/C information in the Description box in the lower panel of Figure 2B. The control sample set is selected for a given treatment sample set so that only one factor differs between the treatment and the control. Therefore, the biological effect of the T/C can be clearly attributed to the differing factor. All possible T/C pairs were created in this way. In the example shown in Figure 2, a total of 10 T/Cs are defined as follows: 3 T/Cs for cold effects in a mutant (viz. "Ice1 mutant with cold treatment for 3 hr vs. Ice1 mutant with no treatment", "Ice1 mutant with cold treatment for 6 hr vs. Ice1 mutant with no treatment", and "Ice1 mutant with cold treatment for 24 hr vs. Ice1 mutant with no treatment"); 3 T/Cs for cold effects in wild type (viz. "Wildtype with cold treatment for 3 hr vs. Wildtype with no treatment", "Wildtype with cold treatment for 6 hr vs. Wildtype with no treatment", and "Wildtype with cold treatment for 24 hr vs. Wildtype with no treatment"); 3 T/Cs for mutation effects under cold treatment (viz. "Ice1 mutant with cold treatment for 3 hr vs. Wildtype with cold treatment for 3 hr", "Ice1 mutant with cold treatment for 6 hr vs. Wildtype with cold treatment for 6 hr", and "Ice1 mutant with cold treatment for 24 hr vs. Wildtype with cold treatment for 24 hr"); and one T/C for mutation effects without cold treatment (viz. "Ice1 mutant with no treatment vs. Wildtype with no treatment"). All 10 T/Cs are shown at http://expressionbrowser.com/arab/displayExperiment.jsp?id=2202517&tab=1.
After all treatment levels in each experiment are transformed into T/Cs, different experiments have the same data structure, are comparable to one another and are, thus, easily integrated. As a result, the heterogeneity caused by differences in experimental designs is removed. The LOG2R of a T/C also removes systematic errors that affect both treatment and control. Therefore, the ratio data generated from T/Cs can be more instructive and reliable than intensity data generated from treatment levels.
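
The one-factor rule can be checked against the 10 T/Cs listed for this experiment with a small sketch. In GEB the pairs are chosen by a curator through the web interface, so the code below is only an illustration of the rule, not the system's implementation. It assumes each factor has a baseline level (wild type, no treatment) that the control must hold for the factor that differs; this baseline rule is inferred from the listed example rather than stated in the text, and all names are hypothetical.

```python
from itertools import permutations

# Assumed baseline (control) level of each factor; inferred from the example.
BASELINE = {"genotype": "Wildtype", "cold": "no treatment"}

genotypes = ["Wildtype", "Ice1 mutant"]
cold_levels = ["no treatment", "3 hr", "6 hr", "24 hr"]

# The eight sample sets of the example: 2 genotypes x 4 cold-treatment levels.
sample_sets = [{"genotype": g, "cold": c} for g in genotypes for c in cold_levels]

def is_valid_tc(treatment, control):
    """True when exactly one factor differs and the control holds its baseline."""
    differing = [f for f in BASELINE if treatment[f] != control[f]]
    return len(differing) == 1 and control[differing[0]] == BASELINE[differing[0]]

tc_pairs = [(t, c) for t, c in permutations(sample_sets, 2) if is_valid_tc(t, c)]
print(len(tc_pairs))  # 10
```

The count breaks down as in the text: six cold-effect T/Cs (three per genotype) and four mutation-effect T/Cs (one per cold-treatment level).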

Data processing and data quality monitoring

The GEB data-processing pipeline is composed of four consecutive programs. The first program normalizes the data using the Robust Multichip Average (RMA) algorithm [15] as implemented in the Bioconductor Affy package (http://www.bioconductor.org/packages/2.4/bioc/html/affy.html). The second program takes the normalized intensity data as input and computes average intensities, standard deviations, LOG2Rs, and P values of two-sample, two-tailed t-tests. The third program renders JPEG images of MA plots [16,17] with average intensity as the x-axis, LOG2R as the y-axis, and P value as the color. The images are loaded into the GEB application server (Tomcat) for display when queried by users. The fourth program computes the mean percentage coefficient of variation (%CV) of all microarray features (genes) in a sample set in two steps. First, the standard deviation, mean, and %CV of each feature (gene) in a sample set are calculated: that is, %CV = 100 * (Standard deviation/Mean intensity). Second, the mean %CV of all features in the sample set is calculated. The mean %CV of each individual sample set is computed via this procedure; the distribution of all mean %CVs is shown in Figure 3. Most sample sets have a mean %CV between 0.5 and 4.68. There is a long tail on the right side of the distribution, in which the mean %CV ranges from 4.68 to 16. This result indicates that about 10% of the sample sets have an extremely large mean %CV, and thus probably have poor data quality. The mean %CV of a sample set can be used to monitor its quality because a higher mean %CV implies larger variation among the replicated samples in the sample set. Therefore, any finding or conclusion from a sample set with a high mean %CV must be interpreted cautiously. We plan to filter out the sample sets with extremely high mean %CV in the future to safeguard the quality of the data in GEB.
Figure 3

The distribution of mean %CV of all sample sets. The mean %CV is calculated in two steps: first calculate the standard deviation, mean, and %CV of each gene in a sample set (%CV = 100 * Standard deviation/Mean intensity), and then compute the mean %CV of all genes in the sample set.

Some microarray experiments in NASC or GEO were discarded because there were no replicated samples or no suitable controls. As of now, there are a total of 301 experiments, 1450 T/Cs, and 33,074,500 LOG2R data points in the Arabidopsis GEB database. Additional data, when available, can be easily entered into GEB.
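
The two-step quality measure can be sketched as below, using the standard definition of the coefficient of variation (100 times the standard deviation divided by the mean), which is the form consistent with a higher value meaning more variation among replicates. The dictionary layout, gene names, and function name are hypothetical, and the sketch uses the sample standard deviation, which the text does not specify.

```python
from statistics import mean, stdev

def mean_percent_cv(sample_set):
    """Mean %CV over all genes of one sample set.

    sample_set: {gene_id: [intensity in replicate 1, replicate 2, ...]}
    (hypothetical layout). Each gene needs at least two replicates.
    """
    # Step 1: %CV of each gene = 100 * standard deviation / mean.
    per_gene = [100 * stdev(values) / mean(values) for values in sample_set.values()]
    # Step 2: average the per-gene %CV over the whole sample set.
    return mean(per_gene)

# Illustrative values only, not data from GEB.
example = {
    "gene_a": [10.0, 10.0, 10.0],   # identical replicates: 0 %CV
    "gene_b": [9.0, 10.0, 11.0],    # standard deviation 1, mean 10: 10 %CV
}
print(mean_percent_cv(example))
```

A curator-side filter of the kind the authors say they plan would then compare this value against a chosen threshold, for example the 4.68 boundary of the long tail mentioned above.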

Data search and visualization

The Lucene search engine (http://lucene.apache.org/) is used for full-text search. Search index files in GEB are built from the text of gene identifiers, gene symbols, gene annotations, T/C names, T/C descriptions, experiment titles, and experiment descriptions. Genes, T/Cs, and experiments are searchable by matching keywords in the index files. A 2-layer visualization display is designed to show large-scale data points as both an overall view and a detailed view. This visualization was developed using AJAX technology [18]. The first display layer is a static display (image) generated offline that contains all data points. The second layer is a real-time interactive display built with Web 2.0 technology (JavaScript/AJAX). With the 2-layer display, users not only obtain an overall expression profile from the distribution of data points on the static plot, but can also get detailed information on each data point by real-time interactive searching or highlighting. The P value of each ratio is shown by the color of its data point, so the significance level and the magnitude of a change are displayed together.
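
The idea of keyword matching against index files can be illustrated with a toy inverted index. This is not Lucene's API or implementation, and GEB's actual tokenization and ranking are not described in the text; the record identifiers, the tokenizer, and the AND semantics of multi-word queries are all assumptions made for the sketch.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens (assumed rule)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(records):
    """records: {record_id: text}. Returns {token: set of record ids}."""
    index = defaultdict(set)
    for record_id, text in records.items():
        for token in tokenize(text):
            index[token].add(record_id)
    return index

def search(index, query):
    """Return ids of records containing every query token (AND semantics)."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result

# Hypothetical records mixing the three searchable types: genes, T/Cs, experiments.
records = {
    "gene:PR1": "PR-1 pathogenesis-related gene 1",
    "tc:pseudomonas_16hr": "16 hr Pseudomonas infection",
    "tc:cold_3hr": "Wildtype with cold treatment for 3 hr vs. Wildtype with no treatment",
}
index = build_index(records)
print(search(index, "pseudomonas infection"))
print(search(index, "PR-1"))
```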

Results and Discussion

Full-text search

With full-text searching, users can easily access the information inside GEB. The full-text search method employed by GEB differs from the search in Genevestigator [8] or GeneChaser [1], in which only gene identifiers or symbols can be used. Users can obtain expression information from Genevestigator or GeneChaser only when they know the gene names or symbols. In contrast, GEB carries out full-text search on any word or string of letters in a gene symbol, gene annotation, T/C name, T/C description, experiment title or experiment description. Users can freely explore the expression data with any search term they wish. The full-text search is implemented in three places. The first is the GEB home page (http://www.ExpressionBrowser.com), where the user can enter keywords and find three types of information: genes, T/Cs and experiments. The second is the Gene View (Figure 4), where users can search T/Cs and investigate how different T/Cs affect the expression of the selected gene. The third is the T/C View (Figure 5), where users can search genes and observe how the expression of these genes is changed by the selected T/C.
Figure 4

The Gene View of PR-1. The up-regulation T/Cs were highlighted and selected. The color can be changed by right-clicking on the color icon in the lower box of the right panel in the figure. Users may test this functionality at http://expressionbrowser.com/arab/displayFeature.jsp?id=1001343.

Figure 5

The T/C View of "16 hr Pseudomonas infection". When PR1, PR2, PR3, PR4 and PR5 were searched and selected on this T/C View, the data points on the MA plot on left panel are labeled with a colored box. The color can be changed by right clicking on the color icon on the lower box of the right panel in this figure. You may test this function at http://expressionbrowser.com/arab/displayPair.jsp?id=2056966.


Gene View and co-responsive genes

The GEB backend data model is a matrix with two dimensions, genes and T/Cs. Users visualize the expression profiles as a slice along either of these two dimensions: the Gene View displays data points of all T/Cs for a selected gene, whereas the T/C View displays data points of all genes for a selected T/C. Figure 4 illustrates the Gene View. Data points from all T/Cs for a gene are displayed in the MA plot [16,17]. Here, M, the y-axis, is the log2 ratio of treatment over control (LOG2R) [log2 (treatment intensity) - log2 (control intensity)] and A, the x-axis, is the average log2 intensity of the treatment and control [(log2 (treatment intensity) + log2 (control intensity))/2]. The MA plot provides a quick overview of data points for all T/Cs affecting the selected gene. The data points located in the upper area of the MA plot are 'up-regulation' T/Cs, and those located in the lower area are 'down-regulation' T/Cs. Gene View is a cross-experiment display of the expression of a gene under all experimental conditions currently available in GEB. With the MA plot, users can get a clear overall view of a gene-expression profile without scrolling down the display page, no matter how many data points are on the plot. From a GEB MA plot, users can easily view both the LOG2R changes and the statistical significance of the LOG2R. Each data point is color coded on the basis of the t-test P value that indicates the significance level of its LOG2R. The data points are colored blue when P values are lower than 0.01, green when P values are between 0.01 and 0.05, and yellow when P values are higher than 0.05. The color-coded data points help users visually assess the significance level and reliability of the data.
For example, if a data point has both a high fold change (at the top or bottom of the display) and a high P value (P > 0.05, yellow color), there may be large systematic or experimental errors among replicates, so the result should be interpreted cautiously before conclusions are drawn from such a data point. Therefore, the location and color of the data points on the GEB MA plot give users a clear view of gene expression in both ratio scale and significance level (reliability). The MA plot is a JPEG image generated by the offline data-processing pipeline. The image is about 60 KB in size, with 480 × 480 pixel dimensions, which allows it to be loaded from the host server to the user's browser very quickly so that users can rapidly obtain an overall view of the expression profile of a gene. Most importantly, GEB is equipped with highlighting and search functions that allow users to highlight data points by dragging and dropping with the mouse and to search data by entering keywords. Figure 4 illustrates how to use the "highlighting window" to locate the up-regulation T/Cs on the MA plot. First, users move the "highlighting window" to cover the data points in the upper area of the MA plot. Users can resize the window, if needed. The two text boxes to the right of the MA plot list detailed information about the highlighted data points. Users can click the 'Select' button for any T/C in the upper text box, and the selected T/C is then moved to the lower text box. At the same time, the selected T/C is marked on the MA plot with a small rectangle. This two-layer display solution achieves both a quick overview of an expression profile and a detailed view of the selected data points. The Arabidopsis PR-1 gene, a pathogenesis-related gene [19], was used as an example of Gene View in Figure 4. The up-regulation T/Cs selected in Figure 4 are listed in Table 1.
A total of 95 T/Cs were selected when 2-fold and P < 0.05 were used as a double cutoff. Among the 95 T/Cs, 44 T/Cs are pathogen treatments, 13 T/Cs are plant defense elicitor treatments, and 14 T/Cs are plant defense-related mutants. These results suggest that the expression of PR-1 is promoted by infections, plant-defense elicitors, and plant defense-related mutations. In previous studies, PR-1 was defined as a pathogenesis-related gene that was coordinately activated by pathogen infection and functioned as an indicator of the defense reaction [20,21]. The silencing of this gene leads to an increase in extracellular β-(1→3)-glucanase activity at the onset of tobacco defense reactions [22-24]. A decrease in β-(1→3)-glucan deposition in PR-1-silenced lines [22] might reduce the deposition of callose, which is linked with β-(1→3)-glucan; callose deposition is one of the characteristics of defense reactions associated with the hypersensitive response of a plant [25]. Morris et al. [26] indicated that chemical induction of maize PR-1 genes increased resistance to downy mildew. The results for PR-1 functions revealed by GEB are consistent with these previous findings and suggest that the Gene View of GEB can be useful in gene-function discovery, biomarker validation, and bioprocess identification.
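
The color code and the double cutoff described for the Gene View can be written down as two small functions. This is a sketch for illustration, not GEB's code: the function names are hypothetical, and the handling of boundary values (a P value of exactly 0.05 shown as green, a fold change of exactly 2 counted as passing) is an assumption, because the text does not say which side the boundaries fall on.

```python
from math import log2

def point_color(p_value):
    """Color code from the text: blue below 0.01, green 0.01 to 0.05, yellow above 0.05."""
    if p_value < 0.01:
        return "blue"
    if p_value <= 0.05:   # assumption: 0.05 itself is shown as green
        return "green"
    return "yellow"

def regulation(log2r, p_value, min_fold=2.0, max_p=0.05):
    """Apply the 2-fold and P < 0.05 double cutoff to one T/C data point."""
    if p_value >= max_p or abs(log2r) < log2(min_fold):
        return "none"
    return "up" if log2r > 0 else "down"

# Two rows taken from Table 1 (fold change, P value), plus a made-up weak point.
for name, fold, p in [("Seedling, SA treatment vs. control", 338.61, 0.0024),
                      ("16 hr Pseudomonas infection", 4.58, 0.0114),
                      ("hypothetical weak response", 1.5, 0.2)]:
    print(name, point_color(p), regulation(log2(fold), p))
```

Because the y-axis of the MA plot is LOG2R, the 2-fold cutoff corresponds to an absolute LOG2R of 1, so up-regulation T/Cs sit above the line M = 1 and down-regulation T/Cs below M = -1.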
Table 1

A list of T/Cs that induce the expression of the Arabidopsis PR-1 gene.

T/C Name | Treatment type | Fold Change | P-value
Seedling, SA treatment vs. control | Plant defense elicitor | 338.61 | 0.0024
Csn5 (csn5a-2 csn5b) mutant, light vs. wild type, light | Plant defense related mutant | 230.14 | 3.83E-05
gh3.5-1D mutant Pst DC3000(avrRpt2) vs. gh3.5-1D un-inoculated control | Pathogen infection | 220.44 | 2.07E-06
Leaf, eds16 mutant, Golovinomyces orontii infection for 7 d vs. eds16 mutant, 0 d control | Pathogen infection | 160.28 | 2.21E-05
Csn4-1 mutant, light vs. wild type, light | Plant defense related mutant | 130.35 | 6.81E-04
Csn4-1 mutant vs. wild type | Plant defense related mutant | 125.67 | 0.0011
BTH Effect for 24 hr in wrky18 mutant | Plant defense elicitor | 111.62 | 0.0012
Whole plant, mkk1/mkk2 vs. WT | Plant defense elicitor | 108.59 | 5.48E-04
senescence effects in pod | Senescence | 97.64 | 2.67E-05
cpr5scv1 double mutant | Plant defense related mutant | 88.61 | 0.0385
Pst DC3000 infection (12 hr) in WT | Pathogen infection | 83.37 | 0.0181
BTH Effect for 24 hr in WT | Plant defense elicitor | 77.47 | 0.012
Whole plant, WT, 24 h BTH vs. WT control | Plant defense elicitor | 71.71 | 3.44E-07
Csn3-1 mutant, light vs. wild type, light | Plant defense related mutant | 67.19 | 7.58E-05
120 hr Erysiphe orontii infection | Pathogen infection | 64.8 | 0.0053
Whole plant, mkk2, 24 h BTH vs. mkk2 control | Plant defense elicitor | 63.45 | 0.0042
Col-0 WT, Pst DC3000 (avrRpt2) infection vs. un-inoculated control | Pathogen infection | 60.32 | 0.0182
Cold 7 days effects | Others | 58.93 | 0.0061
cpr5 mutant | Plant defense related mutant | 56.63 | 0.0354
Pst DC3000 infection (12 hr) in wrky17 mutant | Pathogen infection | 55.62 | 0.0293
siz1-3 mutant drought with treatment vs. Col-0 WT with drought treatment | Plant defense related mutant | 52.98 | 0.0034
Brm-101 mutant vs. Ler WT | Others | 52.5 | 0.0221
96 hr Erysiphe orontii infection | Pathogen infection | 49.32 | 2.50E-05
Phytophthora infection for 24 hr | Pathogen infection | 47.65 | 3.19E-05
32 hr PsES4326 infection vs 9 hr PsES4326 infection | Pathogen infection | 41.11 | 0.0267
siz1-3 mutant vs. Col-0 WT | Plant defense related mutant | 38.88 | 0.0031
Pst DC3000 infection (12 hr) in wrky11 mutant | Pathogen infection | 37.07 | 0.0093
24 hr PsES4326 infection vs 9 hr PsES4326 infection | Pathogen infection | 33.64 | 0.0297
E2Fa-DPa over-expressing | Others | 32.75 | 0.009
Cotyledon | Others | 30.2 | 8.69E-05
Chitin receptor mutant, chitooctaose treatment vs. Wild type, chitooctaose treatment | Plant defense elicitor | 29.64 | 7.04E-04
shoot vs root | Others | 29.61 | 8.02E-04
Csn5 (csn5a-2 csn5b) mutant, dark vs. wild type, dark | Plant defense related mutant | 29.26 | 0.007
Chitin receptor mutant vs. Wild type | Plant defense related mutant | 28.92 | 3.09E-05
flower stage 15, sepals | Others | 28.05 | 1.08E-04
Whole plant, mkk1, 24 h BTH vs. mkk1 control | Plant defense elicitor | 26.74 | 0.0174
Leaf, WT, Golovinomyces orontii infection for 7 d vs. 0 d control | Pathogen infection | 26.06 | 2.36E-07
BTH Effect for 8 hr in WT | Plant defense elicitor | 26.06 | 0.0201
BTH Effect for 8 hr in wrky18 mutant | Plant defense elicitor | 25.92 | 3.30E-04
camta3-2 mutant vs. wild type | Plant defense related mutant | 22.87 | 0.0358
cdpk6-yfp 4 transgene effects | Others | 20.98 | 0.0151
PsmES4326 infection for 32 hr | Pathogen infection | 19.53 | 0.0079
Leaf, WT, Golovinomyces orontii infection for 5 d vs. 0 d control | Pathogen infection | 17.73 | 2.80E-06
S15-118 mutant vs. WT | Others | 17.6 | 0.0368
PsmES4326 infection for 24 hr | Pathogen infection | 16.37 | 0.0072
flower stage 15 | Others | 14.82 | 1.09E-04
BTH treatment in WT vs. WT control | Plant defense elicitor | 14.78 | 3.70E-04
Pseudomonas syringae pv phaseolicola infiltration for 24 hr | Pathogen infection | 12.87 | 0.0015
mature leaves, 35 days after sowing vs. Average | Others | 12.58 | 0.0232
72 hr Erysiphe orontii infection | Pathogen infection | 12.55 | 0.0086
old rosette leaf vs young rosettet leaf in WT | Senescence | 10.82 | 0.0235
SPH1 knockout vs WT in young rosette leaf | Others | 10.69 | 0.0187
pmr5 pmr6 double mutant vs. WT | Plant defense related mutant | 10.46 | 0.0293
Pseudomonas syringae pv tomato avrRpm1 infiltration for 24 hr | Pathogen infection | 10.19 | 0.0036
flower stage 12 equivalent (7) | Others | 8.83 | 3.61E-04
sni1 mutant | Others | 8.59 | 0.0117
flower stage 12 equivalent (6) | Others | 8.58 | 2.48E-04
High nitrogen and glucose effects | Others | 7.64 | 0.0015
Pnp1-1, phosphate deficiency 1 wk vs. WT, phostphate deficiency 1 wk | Others | 6.92 | 0.0259
Pseudomonas syringae pv tomato DC3000 hrcC-infiltration for 24 hr | Pathogen infection | 6.78 | 1.75E-04
glucose effects | Others | 6.42 | 7.38E-04
Pnp1-1, phosphate efficiency 1 wk vs. WT, phostphate efficiency 1 wk | Others | 6.36 | 0.0173
mil4 overexpression line with BTH treatment vs. mil4 overexpression line control | Plant defense elicitor | 6.3 | 5.77E-04
flower stage 12, sepals | Others | 6.2 | 3.95E-04
arr10 arr12 double null mutant effects under cytokinin | Others | 6.03 | 0.0034
pmr5 mutant vs. WT | Plant defense related mutant | 5.88 | 0.0412
WT, INA 48 h vs. control 48 h | Others | 5.77 | 0.0437
pnp1-1 mutant, phosphate starvation for 1 wk vs. pnp1-1 mutant, 1 wk control | Others | 5.74 | 0.0402
BTH treatment in mil4 mutant vs. H2O in mil4 mutant | Plant defense elicitor | 5.52 | 0.0028
Cotyledon | Others | 5.46 | 3.86E-04
WT, phosphate starvation for 1 wk vs. 1 wk control | Others | 5.28 | 0.0166
seedling 3 vs average | Others | 5.1 | 8.30E-04
seedling 2 vs average | Others | 4.87 | 0.001
SAM SE, 35S:AGL15 vs. WT | Others | 4.64 | 0.0461
16 hr Pseudomonas infection | Pathogen infection | 4.58 | 0.0114
gl1T rosette leaf #4, 1 cm long | Others | 4.53 | 1.56E-04
Pseudomonas syringae pv phaseolicola infiltration for 6 hr | Pathogen infection | 4.33 | 0.0308
senescing leaves | Senescence | 4.32 | 3.95E-05
Botrytis cinerea infection on 48 hpi leaf | Pathogen infection | 4.17 | 0.0247
Col-0 rosette leaf #4 | Others | 4.08 | 0.0016
mil4 mutant vs. WT | Plant defense related mutant | 3.81 | 0.0031
gl1T rosette leaf #12 | Others | 3.64 | 3.24E-04
flower stage 12 equivalent (5) | Others | 3.62 | 0.0028
Leaf | Others | 3.22 | 0.0016
cauline leaves | Others | 3.13 | 0.0023
shoot under potassium starvation | Others | 3 | 0.0098
Met1-3 mutant leaf (4th generation) vs. Col-0 WT | Others | 2.9 | 0.0353
shoot under Caesium treatment | Others | 2.9 | 0.0061
Col-0 rosette leaf #4 | Others | 2.76 | 0.0018
24 hr control vs 0 hr control | Others | 2.7 | 0.0158
Ambient CO2 and Ambient Light at 96 hr vs 0 hr | Others | 2.49 | 0.0364
rosette leaf # 2 | Others | 2.45 | 0.0075
leaf 7, distal half | Others | 2.42 | 0.0023
HSP90 reduced mutant (RNAi-B1) vs. Control-3 | Others | 2.05 | 0.0063
gh3.5-1D mutant, Pst DC3000 (avrRpt2) infection vs. Col-0 WT, Pst DC3000 (avrRpt2) infection | Others | 2.03 | 0.0263

A list of T/Cs that induce the expression of the Arabidopsis PR-1 gene.

Figure 6 shows a screenshot of the "Co-responsive Genes" tab in the PR-1 Gene View (http://www.expressionbrowser.com/arab/displayFeature.jsp?id=1001343&tab=4). The co-responsive relationship between two genes is determined as follows: (1) the up- and down-regulation T/Cs of each gene are selected using a double cutoff of P < 0.05 and a 2-fold change; (2) the T/Cs in which both genes are selected (the overlapping T/Cs) are used to compute the overlap percentage; (3) the Pearson correlation coefficient is calculated from the LOG2R values of the overlapping T/Cs; and (4) a relation index is calculated as the overlap percentage multiplied by the square of the correlation coefficient. The relationship between two co-responsive genes is computed with ratio data from T/Cs in which only a single factor differs between treatment and control. Therefore, the relationship between co-responsive genes primarily reflects the effect of a biological treatment, because the variation caused by most other factors is removed. In contrast, if the relationship between co-expressed genes is computed with intensity data in which multiple factors vary (such as tissue and cell type of the sample; biological treatment; sampling methods, such as time and location; experimental methods, such as sample storage, mRNA extraction, or microarray dyeing; and systematic errors), then the relationship between co-expressed genes reflects the mixed effects of the biological treatment and these other factors. The list of PR-1 co-responsive genes (Figure 6) contains many well-known plant defense-related genes, such as EXLB3, PR-2, Chitinase, PR-5 and AGP5. Among them, PR-2 and PR-5 are considered to have a function similar to that of PR-1 in systemic acquired resistance (SAR) responses [27].
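The four steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the GEB implementation: the function names and the toy input format are hypothetical, the 2-fold cutoff is assumed to mean |LOG2R| >= 1, and the overlap percentage is assumed to follow the same formula given for T/Cs in the Table 2 footnotes, OP = 2*C/(A + B).

```python
from math import log2, sqrt

def significant(profile, fold=2.0, p_cut=0.05):
    """Step 1: keep T/Cs passing both cutoffs.
    profile maps a T/C name to (log2r, p_value); names are hypothetical."""
    min_abs = log2(fold)  # 2-fold corresponds to |LOG2R| >= 1 (assumption)
    return {tc: m for tc, (m, p) in profile.items()
            if p < p_cut and abs(m) >= min_abs}

def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def relation_index(profile_a, profile_b):
    """Steps 2-4: return (overlap percentage, correlation, relation index)."""
    a, b = significant(profile_a), significant(profile_b)
    shared = sorted(set(a) & set(b))            # overlapping T/Cs
    if len(shared) < 2:                         # correlation undefined
        return 0.0, 0.0, 0.0
    op = 2 * len(shared) / (len(a) + len(b))    # assumed OP formula
    cc = pearson([a[t] for t in shared], [b[t] for t in shared])
    return op, cc, op * cc ** 2                 # RI = OP * CC^2

# Toy, made-up data: T/C name -> (LOG2R, P value)
gene_1 = {"tc1": (2.0, 0.01), "tc2": (3.0, 0.01), "tc3": (-1.5, 0.02),
          "tc4": (0.5, 0.01), "tc5": (2.0, 0.20)}
gene_2 = {"tc1": (1.0, 0.01), "tc2": (2.0, 0.01), "tc3": (-2.5, 0.01),
          "tc6": (4.0, 0.001)}
op, cc, ri = relation_index(gene_1, gene_2)
```

Because the correlation coefficient is squared, a strongly negative correlation yields the same relation index as an equally strong positive one, so the sign of the coefficient has to be reported separately.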
According to a review on the integrated application of online data mining tools by Meier and Gehring [28], PR-1, PR-2 and PR-5 were induced by necrotrophic Botrytis cinerea pathogen. The results shown by GEB are consistent with those from previous studies. The consensus results from multiple experiments in GEB provide reliable clues for gene-expression discoveries.
Figure 6

The co-responsive genes related to PR-1. The co-responsive genes are listed in order of their relation index to the PR-1 gene. The relation index is the product of the overlap percentage (the percentage of overlapping co-regulation T/Cs between PR-1 and the selected gene) and the square of the correlation coefficient (the Pearson correlation coefficient among the overlapping T/Cs). More co-responsive genes can be found at http://expressionbrowser.com/arab/displayFeature.jsp?id=1001343&tab=4.


T/C View and co-regulation T/Cs

Figure 5 shows an example of the T/C View of "16 hr Pseudomonas infection." Each data point on the T/C View is the LOG2R of a gene. The MA plot, color codes, two-layer display design, and searching/highlighting functions on the T/C View are the same as those in the Gene View described above. The following example shows how to use the search function to locate genes in the T/C View. When the string "PR1 PR2 PR3 PR4 PR5" is used as a search keyword, all genes with any matching word in their annotation are shown in the upper right box (Figure 5). Clicking the 'Select' button for a gene moves it to the lower box and marks it on the MA plot with a small rectangle. The T/C View provides a condition-centric view of microarray data. Although different T/Cs may stimulate different sets of genes, any two T/Cs may co-regulate a set of genes such that they have similar gene-expression signatures. The co-regulation relationship between two T/Cs can be constructed from the similarity of their gene-expression signatures. Clicking the "Co-regulation T/Cs" tab in the T/C View of "16 hr Pseudomonas infection" (http://expressionbrowser.com/arab/displayPair.jsp?id=2056966&tab=4) lists a total of 199 co-regulation T/Cs in a table ordered by their "relation index" to the "16 hr Pseudomonas infection" T/C. The calculation of the relation index between two T/Cs is described in the footnotes of Table 2. The T/C of "24 hr Pseudomonas infection" has the closest relationship (with a relation index of 0.623816) to "16 hr Pseudomonas infection." This result is easily understood because the two are the same treatment with an 8-hour difference in treatment time. The top 80 (of the 199) co-regulation T/Cs of "16 hr Pseudomonas infection" are listed in Table 2: 29 belong to pathogen-infection, 16 are plant-defense elicitors, and 6 are plant defense-related mutants.
Notably, 3 T/Cs are negatively correlated with the T/C of "16 hr Pseudomonas infection" (Table 2). Two of the three T/Cs are mutants of Enhanced Disease Susceptibility 16 (EDS16) under infection conditions. EDS genes have a special function in basal disease resistance to pathogens as well as R genes [29,30]. Arabidopsis EDS mutants, such as eds1 [31] and eds5 [32], have lower PR gene-expression levels and exhibit higher susceptibility to pathogen infection. The reverse relationship between the gene-expression signatures of EDS16 mutants under infection and of "16 hr Pseudomonas infection" implies that some pathogen-related genes are either not activated or are reduced in EDS16 mutants when they are infected by pathogens. The other T/C negatively correlated with "16 hr Pseudomonas infection" is "high nitrogen effects". Hoffland et al. [33] reported that high nitrogen application caused higher N concentration in plant tissue, and that the effect of tissue N concentration on disease susceptibility was highly pathogen-dependent. They found that disease susceptibility to P. syringae and Oidium lycopersicum increased significantly with increasing N concentration in tomato tissue [34]. The results obtained from GEB are consistent with these previous independent studies, which suggests that the results generated by GEB are reliable and that the principles implemented in GEB are scientifically sound.
Table 2

The co-regulation T/Cs with expression profiles correlated to the T/C of "16 hr Pseudomonas infection" (A)¹

T/C Name | Classification | Gene Number (B)² | Overlapping Gene Number (C)³ | Overlapping Percentage % (OP)⁴ | Correlation Coefficient (CC)⁵ | Relation Index (RI)⁶
24 hr Pseudomonas infection | Pathogen infection | 477 | 277 | 64 | 0.982917 | 0.623816
Leaf, WT, Golovinomyces orontii infection for 5 d vs. 0 d control | Pathogen infection | 716 | 241 | 43 | 0.93683 | 0.385622
BTH Effect for 8 hr in wrky18 mutant | Plant defense elicitor | 1245 | 296 | 36 | 0.943639 | 0.3242
BTH Effect for 8 hr in WT | Plant defense elicitor | 1614 | 308 | 30 | 0.951557 | 0.279581
SA effect at 6 hr (Col-0) | Plant defense elicitor | 421 | 121 | 30 | 0.954483 | 0.274902
Pseudomonas syringae pv tomato DC3000 hrcC-infiltration for 24 hr | Pathogen infection | 1065 | 223 | 30 | 0.939357 | 0.272162
BTH treatment in mil4 mutant vs. H2O in mil4 mutant | Plant defense elicitor | 652 | 184 | 35 | 0.872943 | 0.271469
BTH Effect for 24 hr in wrky18 mutant | Plant defense elicitor | 1655 | 307 | 30 | 0.935343 | 0.263835
Leaf, eds16, Golovinomyces orontii infection for 5 d vs. WT, infection for 5 d | Plant defense related mutant | 495 | 169 | 38 | -0.82538 | 0.26286
siz1-3 mutant vs. Col-0 WT | Plant defense related mutant | 987 | 221 | 32 | 0.889254 | 0.255498
mil4 overexpression line with BTH treatment vs. mil4 overexpression line control | Plant defense elicitor | 574 | 181 | 37 | 0.820014 | 0.254887
SA effects at 4 hr (MT-0) | Plant defense elicitor | 702 | 158 | 29 | 0.934091 | 0.254588
BTH Effect for 24 hr in WT | Plant defense elicitor | 2062 | 341 | 27 | 0.94732 | 0.250527
pmr5 pmr6 double mutant vs. WT | Plant defense related mutant | 423 | 126 | 31 | 0.892795 | 0.249832
Leaf, WT, Golovinomyces orontii infection for 7 d vs. 0 d control | Pathogen infection | 2111 | 329 | 26 | 0.940966 | 0.23379
BTH treatment in WT vs. WT control | Plant defense elicitor | 496 | 141 | 32 | 0.847156 | 0.230768
120 hr Erysiphe orontii infection | Pathogen infection | 591 | 159 | 32 | 0.837777 | 0.229624
Whole plant, mkk2, 24 h BTH vs. mkk2 control | Plant defense elicitor | 973 | 225 | 33 | 0.828929 | 0.228364
SA effect at 4 hr (Est) | Plant defense elicitor | 259 | 78 | 24 | 0.96622 | 0.22756
PsmES4326 infection for 9 hr | Pathogen infection | 340 | 99 | 27 | 0.90638 | 0.225606
Phytophthora infection for 24 hr | Pathogen infection | 776 | 152 | 26 | 0.919965 | 0.222373
Pseudomonas syringae pv phaseolicola infiltration for 24 hr | Pathogen infection | 1667 | 256 | 25 | 0.93104 | 0.216709
upf3 mutant vs WT | Others | 635 | 123 | 24 | 0.938378 | 0.213205
Whole plant, WT, 24 h BTH vs. WT control | Plant defense elicitor | 707 | 165 | 30 | 0.834236 | 0.211088
Pst DC3118 COR-hrpS double mutant infection 10 hr | Pathogen infection | 418 | 90 | 22 | 0.963317 | 0.209057
Ozone effects | Plant defense elicitor | 1544 | 247 | 25 | 0.898834 | 0.207327
SA effect at 4 hr (Tsu-1) | Plant defense elicitor | 294 | 83 | 24 | 0.911413 | 0.204284
cpr5scv1 double mutant | Plant defense related mutant | 742 | 163 | 29 | 0.833698 | 0.20177
Leaf, eds16 mutant, Golovinomyces orontii infection for 7 d vs. eds16 mutant, 0 d control | Pathogen infection | 2644 | 341 | 22 | 0.939944 | 0.199188
Phytophthora infection for 12 hr | Pathogen infection | 877 | 152 | 24 | 0.907767 | 0.199132
shoot under Caesium treatment | Others | 187 | 64 | 22 | 0.938556 | 0.19851
E. coli TUV86-2 fliC mutant infection 7 hr | Pathogen infection | 859 | 136 | 21 | 0.941937 | 0.194622
Whole plant, mkk1, 24 h BTH vs. mkk1 control | Plant defense elicitor | 1003 | 176 | 25 | 0.867235 | 0.191285
SA effects at 4 hr (Van-0) | Plant defense elicitor | 243 | 66 | 21 | 0.938173 | 0.18619
Pst DC3000 hrpA mutant infection 7 hr | Pathogen infection | 796 | 121 | 20 | 0.947091 | 0.184426
siz1-3 mutant drought with treatment vs. Col-0 WT with drought treatment | Plant defense related mutant | 1713 | 244 | 23 | 0.884965 | 0.182514
Pseudomonas syringae pv phaseolicola infiltration for 6 hr | Pathogen infection | 1090 | 161 | 21 | 0.899435 | 0.177085
upf1 mutant vs WT | Others | 268 | 85 | 26 | 0.82047 | 0.176332
WT (Col-0) Bgh infection vs. WT control | Pathogen infection | 2489 | 296 | 20 | 0.924127 | 0.176158
6 hr control vs 0 hr control | Others | 1128 | 154 | 20 | 0.924756 | 0.174548
shoot under potassium starvation | Others | 1293 | 168 | 20 | 0.929743 | 0.173504
Pst DC3118 Coronatine infection 24 hr | Pathogen infection | 483 | 82 | 18 | 0.955476 | 0.173289
ataf1-1 mutant, Bgh infection vs. ataf1-1 mutant control | Pathogen infection | 3165 | 346 | 19 | 0.938368 | 0.171836
Phytophthora infection for 6 hr | Pathogen infection | 1920 | 237 | 20 | 0.912724 | 0.171609
S15-118 mutant vs. WT | Others | 160 | 63 | 23 | 0.854106 | 0.169901
Leaf, eds16 mutant, Golovinomyces orontii infection for 5 d vs. eds16 mutant, 0 d control | Pathogen infection | 152 | 54 | 20 | 0.905124 | 0.166002
35S::ERF104, Flg22 treatment vs. 35S::ERF104, control | Plant defense elicitor | 1388 | 160 | 18 | 0.95289 | 0.164251
pmr5 mutant vs. WT | Plant defense related mutant | 93 | 43 | 18 | 0.947634 | 0.16293
E. coli 0157:H7 infection 7 hr | Pathogen infection | 582 | 88 | 18 | 0.940332 | 0.161603
SA effects at 4 hr (Kin-0) | Plant defense elicitor | 237 | 59 | 19 | 0.919729 | 0.161515
S58-2 mutant vs. WT | Others | 175 | 56 | 20 | 0.892642 | 0.160508
Leaf, eds16, Golovinomyces orontii infection for 7 d vs. WT, infection for 7 d | Plant defense related mutant | 455 | 108 | 25 | -0.78266 | 0.158267
Rosette leaf, flu mutant vs. WT | Others | 1024 | 138 | 19 | 0.89576 | 0.157622
Triazolopyrimidine herbicide treatment vs. control | Herbicide | 1768 | 191 | 17 | 0.938601 | 0.156599
Col-0 WT, Pst DC3000 (avrRpt2) infection vs. uninoculated control | Pathogen infection | 629 | 110 | 21 | 0.827156 | 0.149031
cpr5npr1svi1 triple mutant | Plant defense related mutant | 388 | 89 | 23 | 0.799919 | 0.14811
DC3000hrpA vs WT at 14 hr pathogen treatment | Pathogen infection | 891 | 107 | 16 | 0.93812 | 0.148062
2 hr control vs 0 hr control | Others | 864 | 113 | 18 | 0.898938 | 0.146689
OGs effects for 1 hr | Plant defense elicitor | 866 | 122 | 19 | 0.86349 | 0.145894
AgNO3 | Others | 807 | 117 | 19 | 0.854076 | 0.143679
Rosette leaf, flu mutant, over-expressing tAPX vs. WT, over-expressing tAPX | Others | 1414 | 149 | 16 | 0.926977 | 0.142656
Elicitor experiment, HrpZ treatment for 2 hr vs. 2 hr control | Plant defense elicitor | 2043 | 211 | 17 | 0.905002 | 0.142587
Imidazolinone herbicide treatment vs. control | Herbicide | 1843 | 176 | 15 | 0.948063 | 0.14226
flg22 effects for 1 hr | Plant defense elicitor | 1714 | 180 | 17 | 0.908522 | 0.141837
Pseudomonas syringae pv phaseolicola infiltration for 2 hr | Pathogen infection | 319 | 65 | 18 | 0.869464 | 0.140394
Pst DC3000 infection (5 hr) in wrky17 mutant | Pathogen infection | 2918 | 271 | 16 | 0.918935 | 0.138735
Elicitor experiment, GST-NPP1 treatment for 4 hr vs. 4 hr control | Others | 2044 | 214 | 17 | 0.882373 | 0.137416
high nitrogen effects | Others | 1155 | 131 | 17 | -0.89249 | 0.135869
Pst DC3000 infection (5 hr) in WT | Pathogen infection | 2782 | 264 | 16 | 0.901735 | 0.135735
Whole plant, mkk1/mkk2 vs. WT | Plant defense related mutant | 2559 | 241 | 16 | 0.905859 | 0.134531
Whole plant, mkk2, 24 h BTH treatment vs. WT, 24 h BTH treatment | Plant defense elicitor | 262 | 55 | 17 | 0.886747 | 0.134518
Whole plant, mkk1/mkk2, 24 h BTH vs. WT 24 h BTH | Plant defense elicitor | 1887 | 197 | 17 | 0.879539 | 0.134389
gh3.5-1D mutant Pst DC3000(avrRpt2) vs. gh3.5-1D un-inoculated control | Pathogen infection | 2533 | 248 | 17 | 0.888302 | 0.134312
Elicitor experiment, Flg-22 treatment for 4 hr vs. 4 hr control | Plant defense elicitor | 1259 | 134 | 16 | 0.906282 | 0.13422
senescence effects in pod | Senescence | 1722 | 195 | 18 | 0.850641 | 0.134189
sni1 mutant | Plant defense related mutant | 170 | 72 | 26 | 0.713915 | 0.1332
Pst DC3000 infection (5 hr) in wrky11 mutant | Pathogen infection | 3067 | 276 | 16 | 0.909859 | 0.132532
Primisulfuron herbicide treatment vs. control | Herbicide | 2805 | 242 | 15 | 0.931266 | 0.131749

¹ The T/C of "16 hr Pseudomonas infection" has 1316 genes (A) that are significantly changed (2-fold and P value 0.05 as cutoffs)

² The number of genes (B) that are significantly changed (2-fold and P value 0.05 as cutoffs)

³ The number of overlapping genes (C) between A and B

⁴ The Overlap Percentage OP = 2*C/(A + B)

⁵ The Pearson Correlation Coefficient (CC) of the LOG2R of the overlapping genes

⁶ The Relation Index RI = OP * CC²

Gene network building has been a hot research topic during the past few years [10,12,34,35]. GEB is not only able to construct gene networks based on the co-responsive relationship described above (Figure 6) but is also able to construct T/C networks based on the co-regulation relationship (Table 2). Another paper will address the details about constructing gene networks and T/C networks in Arabidopsis.
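As a rough illustration of how pairwise relation indices could be turned into a T/C network, the sketch below builds an undirected adjacency list from (T/C, T/C, CC, RI) records, using four rows of Table 2. The thresholding rule, the `min_ri` value, and the function name are assumptions made for illustration only; the text does not describe how GEB actually constructs its networks.

```python
def build_network(relations, min_ri=0.25):
    """Build an undirected adjacency list from pairwise relations.

    relations: iterable of (node_a, node_b, cc, ri) tuples.
    min_ri:    hypothetical cutoff; pairs with a lower relation index
               are not connected.
    Each edge keeps the sign of the correlation, because the relation
    index itself does not distinguish positive from negative correlation.
    """
    graph = {}
    for a, b, cc, ri in relations:
        if ri < min_ri:
            continue
        sign = "+" if cc >= 0 else "-"
        graph.setdefault(a, []).append((b, ri, sign))
        graph.setdefault(b, []).append((a, ri, sign))
    return graph

QUERY = "16 hr Pseudomonas infection"
EDS16_5D = ("Leaf, eds16, Golovinomyces orontii infection for 5 d "
            "vs. WT, infection for 5 d")

# CC and RI values copied from Table 2
relations = [
    (QUERY, "24 hr Pseudomonas infection", 0.982917, 0.623816),
    (QUERY, "BTH Effect for 8 hr in WT", 0.951557, 0.279581),
    (QUERY, EDS16_5D, -0.82538, 0.26286),
    (QUERY, "high nitrogen effects", -0.89249, 0.135869),
]

network = build_network(relations)
for neighbor, ri, sign in network[QUERY]:
    print(f"{sign} {ri:.3f}  {neighbor}")
```

With this cutoff, "high nitrogen effects" is left unconnected, while the eds16 comparison is kept as a negatively signed edge.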

Slide View

The Slide View of genes or T/Cs is designed to help users discover changes in multiple genes under various T/Cs, or vice versa. Users can make a slide show to compare a set of T/Cs with multiple selected genes. For example, the user can search T/C conditions in Gene View (Figure 4) by typing "cold" in the search box and then moving three T/C conditions with cold treatments of 12 hr, 6 hr, and 3 hr from the upper right box to the lower right box. After selecting the three "cold" conditions, the user can search for another three unrelated T/Cs, such as "drought," "UV-B" and "wounding," with the same procedure. After the six T/C conditions are selected, the user can click the "[slide]" link, and the six MA plots of the T/Cs are shown as slides (Figure 7). In Figure 7A, the user highlights a group of genes by dragging, dropping and resizing the "highlighting window." A total of 51 genes with at least a 30-fold increase (LOG2R > 4.9) in the 12-hr cold condition are selected. To see how the selected genes change in other T/Cs, the user clicks the "next slide" arrow, and the next slide appears. The genes selected in the first slide remain highlighted, but their positions change from slide to slide. With this slide show, users can see how these 51 selected genes change in different T/Cs. Figures 7B to 7F show the changes in the selected genes in the T/Cs with treatments of 6-hr cold, 3-hr cold, 12-hr drought, 12-hr UV-B, and 12-hr wounding, respectively. These slides clearly demonstrate that the selected genes had the highest fold changes under the 12-hr cold treatment (Figure 7A). The fold changes decreased in the 6-hr (Figure 7B) and 3-hr (Figure 7C) cold treatments. The positions of the 51 selected genes in the drought (Figure 7D), UV-B (Figure 7E) and wounding (Figure 7F) treatments showed less similarity to those in the 12-hr cold treatment.
The Slide View is a very simple and powerful visualization tool for scientists to compare their candidate genes and see how the genes behave differently in the T/Cs across studies.
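The idea behind the slide show (select genes in one T/C, then follow the same genes through the other T/Cs) can be sketched as below. The gene identifiers and LOG2R values are made up for illustration, and the function names are hypothetical; only the LOG2R > 4.9 cutoff comes from the example above.

```python
from math import log2

def select_genes(log2r_by_gene, min_log2r):
    """Return the genes whose LOG2R exceeds the cutoff in one T/C."""
    return sorted(g for g, m in log2r_by_gene.items() if m > min_log2r)

def follow_selection(slides, reference, min_log2r):
    """Select genes in the reference T/C and look them up in every slide.

    slides maps a T/C name to {gene: LOG2R}. A gene missing from a
    slide is reported as None.
    """
    selected = select_genes(slides[reference], min_log2r)
    return {tc: {g: values.get(g) for g in selected}
            for tc, values in slides.items()}

# A 30-fold increase corresponds to LOG2R of about 4.9
cutoff = round(log2(30), 1)

# Made-up LOG2R values for three hypothetical genes
slides = {
    "cold 12 hr":    {"geneA": 5.6, "geneB": 5.1, "geneC": 1.2},
    "cold 6 hr":     {"geneA": 3.9, "geneB": 2.8, "geneC": 0.9},
    "drought 12 hr": {"geneA": 0.4, "geneB": -0.2, "geneC": 2.0},
}

tracked = follow_selection(slides, "cold 12 hr", cutoff)
for tc, values in tracked.items():
    print(tc, values)
```

Only the genes selected in the reference T/C are carried forward, which mirrors how the highlighted genes stay marked while the user moves from slide to slide.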
Figure 7

A sample Slide View for six T/Cs (3 cold treatments, 1 drought treatment, 1 UV-B treatment, and 1 wounding treatment). Users may test this function at http://www.expressionbrowser.com/arab/displayPairSlides.jsp?id=2055336&id=2055335&id=2055334&id=2055599&id=2055979&id=2056077.


Experiment View

Experiment View shows experiment title, description/design, lab information, samples, sample sets, biological replicates, and the definitions of T/Cs. This view helps users understand the data in detail. For example, the contents of the Experiment View of "Pathogen Series: Pseudomonas half leaf injection" can be seen at the following link http://expressionbrowser.com/arab/displayExperiment.jsp?id=2020113. There are three tabs in the Experiment View. The first tab, called "Details," displays experiment title, description, and other detailed information of the experiment. The second tab "T/C" contains information about the T/Cs in the experiment. The third tab "Samples and Data" contains information about all sample sets, samples and raw data files. Users can download raw microarray data files through the "Samples and Data" tab and then input these raw data into other microarray data-analysis software to analyze the data and to validate the results obtained from GEB.

Conclusions

GEB is composed of a data extraction tool, a microarray data-management system, a data annotation tool, a data-processing pipeline, and a search & visualization tool. The heterogeneity of diverse experimental designs is greatly reduced by re-organizing different experimental treatment levels into T/Cs, so that cross-experiment data integration is easily achieved. GEB separates data processing from interactive display: it pre-processes data and generates data plot images, and then displays the processed data with a Web 2.0-based interactive user interface in response to users' requests. This design allows heavy computing to be done offline, and thus allows a large number of data points to be queried quickly and displayed interactively in real time. GEB displays all data points in one view, so users do not need to scroll through display pages to see the trend or pattern of gene expression across all data points. The highlighting and searching functions in Gene View, T/C View, and Slide View make it easy to explore the data points dynamically according to users' interests. As an additional strategy to improve usability, all raw data and calculated data in GEB are accessible via a full-text search engine. GEB also computes the relations of co-regulation T/Cs and co-responsive genes. These relations are the foundation for building gene networks and T/C networks.

Availability and requirements

• Project Name: Gene Expression Browser (GEB)
• Public web service: http://www.ExpressionBrowser.com Free and no registration.
• Programming Language: Java, R
• Database: MySQL
• Software License: The software license is owned by GeneExp. GeneExp grants free licenses to non-profit organizations and general licenses to commercial organizations.
• License request: support@ExpressionBrowser.com

Abbreviations

A: Average Log2 Intensity; AJAX: Asynchronous JavaScript and XML; ATH1: Affymetrix GeneChip Arabidopsis ATH1 Genome Array; BTH: Benzothiadiazole; CC: Correlation Coefficient; CIBEX: Center for Information Biology Gene Expression Database; CPU: Central Processing Unit; EDS: Enhanced Disease Susceptibility; EBI: The European Bioinformatics Institute; GDS: Granite Data Services; GEB: Gene Expression Browser; GEO: Gene Expression Omnibus; JPEG: Joint Photographic Experts Group; LOG2R: Log2 Ratio of Treatment over Control; MA plot: a Quick Overview of Intensity-dependent Ratio of Microarray Data; M: LOG2R; N: Nitrogen; NASC: Nottingham Arabidopsis Stock Center; NCBI: The National Center for Biotechnology Information; NIG: The National Institute of Genetics; PR gene: Pathogenesis-related gene; R gene: Resistance gene; RI: Relation Index; RMA: Robust Multichip Average; SA: Salicylic Acid; SAR: Systemic Acquired Resistance; TAIR: The Arabidopsis Information Resource; T/C: Treatment over Control; %CV: Percentage Coefficient of Variation; OP: Overlapping Percentage; UV-B: Ultraviolet-B Radiation.

Authors' contributions

MZ, XF, ST proposed software requirements. MZ, YZ, XF, MSK did software specification and design. YZ developed statistical protocols. MZ designed database schema and developed computational algorithms and the software. LL, LY, ST, JT, WY, YA tested the software application and wrote the manual. MZ downloaded and processed raw microarray data from GEO and NASC. LL, LY, ST, JT, WY annotated the data and nominated the T/Cs. MZ, YZ, XF, ST, MSK, YA drafted the manuscript. MZ, YZ, XF, LL, LY, ST, JT, WY, MSK wrote different parts of the manuscript. MZ, YZ, XF, MSK, YA assembled all parts written by different authors together into this manuscript. All authors read and approved this manuscript.