MEME-LaB: motif analysis in clusters.

Paul Brown, Laura Baxter, Richard Hickman, Jim Beynon, Jonathan D Moore, Sascha Ott.

Abstract

SUMMARY: Genome-wide expression analysis can result in large numbers of clusters of co-expressed genes. Although there are tools for ab initio discovery of transcription factor-binding sites, most do not provide a quick and easy way to study large numbers of clusters. To address this, we introduce a web tool called MEME-LaB. The tool wraps MEME (an ab initio motif finder), providing an interface for users to input multiple gene clusters, retrieve promoter sequences, run motif finding and then easily browse and condense the results, facilitating better interpretation of the results from large-scale datasets. AVAILABILITY: MEME-LaB is freely accessible at: http://wsbc.warwick.ac.uk/wsbcToolsWebpage/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Year:  2013        PMID: 23681125      PMCID: PMC3694638          DOI: 10.1093/bioinformatics/btt248

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 INTRODUCTION

Cluster analysis of microarray expression data is a common task in functional genomics. Typically, a large number of clusters are produced, each containing a large number of genes (e.g. 50 clusters of 200 genes). Each cluster is predicted to contain genes that are co-expressed and would therefore be expected to share common regulatory features, such as transcription factor-binding sites (TFBSs). There are established methods and tools for predicting known TFBSs (such as Athena, O’Connor et al.), but ab initio motif discovery remains important. Several tools exist to perform this task, and MEME in particular is a well-recognized suite for motif discovery (Bailey et al.), but the MEME web suite offers limited support for running on large numbers of clusters and for subsequent navigation and post-processing of the results. Machanick and Bailey (2011) provide a web tool (MEME-ChIP) that is specifically tailored towards ChIP-seq data, providing a useful expansion in MEME’s functionality. Here, we describe a web tool called MEME-LaB (MEME Launcher and Browser), which wraps the MEME tool in ways that are well suited to ab initio motif finding in co-expressed gene clusters: (i) users can input multiple gene clusters at once; (ii) promoter sequences are automatically retrieved from a local database or from a user-specified file; (iii) MEME is run on all clusters simultaneously, and the results are presented in a condensed and navigable format; and (iv) identified motifs are compared for similarity with known TFBS motifs.

2 IMPLEMENTATION

2.1 General workflow

The MEME-LaB web service is designed to simplify the task of identifying putative TFBSs in the promoters of co-expressed gene clusters, and it provides an easy way to navigate through and filter the results.

2.2 Input

Users are not required to register for the service and are automatically logged in as a guest user. Gene clusters are uploaded as a simple tab-separated file consisting of two columns: the first for numbers identifying the clusters, and the second for gene IDs. For the Arabidopsis genome, sequences will be automatically retrieved from a local database; for all other genomes, a second file containing promoter sequences in FASTA format is uploaded by the user.
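The two-column input format described above can be illustrated with a minimal Python sketch. This is not the MEME-LaB parser; the function name `parse_clusters`, the exact checks performed and the example gene IDs are assumptions made purely for illustration.

```python
import csv
import io
from collections import defaultdict


def parse_clusters(handle):
    """Read a two-column, tab-separated file into {cluster_number: [gene_id, ...]}.

    Hypothetical helper: column 1 holds the cluster number, column 2 the gene ID.
    """
    clusters = defaultdict(list)
    for line_no, row in enumerate(csv.reader(handle, delimiter="\t"), start=1):
        if not row or all(not cell.strip() for cell in row):
            continue  # ignore blank lines
        if len(row) != 2:
            raise ValueError(f"line {line_no}: expected 2 columns, found {len(row)}")
        cluster_id, gene_id = (cell.strip() for cell in row)
        if not cluster_id.isdigit():
            raise ValueError(f"line {line_no}: cluster ID {cluster_id!r} is not a number")
        if not gene_id:
            raise ValueError(f"line {line_no}: missing gene ID")
        clusters[int(cluster_id)].append(gene_id)
    return dict(clusters)


# Illustrative input only: three genes assigned to two clusters.
example = "1\tAT1G01010\n1\tAT1G01020\n2\tAT2G01008\n"
clusters = parse_clusters(io.StringIO(example))
print(clusters)
```

Grouping genes by cluster number up front makes it straightforward to retrieve one set of promoter sequences per cluster in a later step.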

2.3 Processing

Uploaded files are checked for validity, and the user is warned of any errors detected in the input. Users specify the minimum and maximum length of the promoter regions to search (between 50 and 1000 nt) and can optionally stop the region at a neighbouring gene if one is present. Users select the number of motifs to find per cluster and the minimum and maximum motif length (6–20 nt). Details of comparisons with known motifs from JASPAR (Bryne et al.) and PLACE (Higo et al.) are provided in the output (users with a valid login for TRANSFAC can also compare motifs with TRANSFAC motifs on our server). An email is sent to the address provided when the job is complete and the results are ready for retrieval.
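One possible reading of the promoter-length options is sketched below in Python. It assumes a forward-strand gene, 0-based half-open coordinates, and that a region truncated by a neighbouring gene to less than the minimum length is discarded; none of these details is stated in the text, and the function name `promoter_interval` is hypothetical.

```python
def promoter_interval(tss, min_len, max_len, neighbour_end=None):
    """Return (start, end) of the upstream region to search, or None if too short.

    Assumptions (not from the source): forward-strand gene, 0-based half-open
    coordinates, `tss` is the first base of the gene, and `neighbour_end` is the
    end coordinate of the nearest upstream gene (None disables truncation).
    """
    if not 50 <= min_len <= max_len <= 1000:
        raise ValueError("lengths must satisfy 50 <= min_len <= max_len <= 1000")
    start = max(0, tss - max_len)  # never run off the start of the chromosome
    if neighbour_end is not None:
        # Stop at the neighbouring gene if it falls inside the requested region.
        start = max(start, min(neighbour_end, tss))
    if tss - start < min_len:
        return None  # assumed behaviour: too little sequence left to search
    return start, tss


print(promoter_interval(5000, 50, 1000))                      # full-length region
print(promoter_interval(5000, 50, 1000, neighbour_end=4600))  # truncated at neighbour
print(promoter_interval(5000, 50, 1000, neighbour_end=4980))  # shorter than min_len
```

Reverse-strand genes would need the mirrored calculation and a reverse complement of the extracted sequence, which is omitted here for brevity.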

2.4 Output

Results are provided as interactive HTML pages, which can also be downloaded. For each cluster, the specified number of motifs is identified using the MEME algorithm and displayed as motif logos. Additional information for each motif is displayed, including its distribution among the input set, positional bias, strand bias and similarity to known motifs.
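To make the per-cluster MEME run concrete, the following Python sketch assembles a MEME command line for a single cluster. The options actually passed by MEME-LaB are not given in the text; the flags shown (`-dna`, `-oc`, `-nmotifs`, `-minw`, `-maxw`) are standard options of the MEME command-line tool, whereas the function name and file paths are hypothetical.

```python
def meme_command(fasta_path, out_dir, n_motifs, min_width=6, max_width=20):
    """Build (but do not run) a MEME command line for one cluster's promoters.

    Hypothetical helper; the width bounds mirror the 6-20 nt range described above.
    """
    if n_motifs < 1:
        raise ValueError("n_motifs must be at least 1")
    if not 6 <= min_width <= max_width <= 20:
        raise ValueError("motif widths must satisfy 6 <= min_width <= max_width <= 20")
    return [
        "meme", str(fasta_path),
        "-dna",                      # promoter sequences are DNA
        "-oc", str(out_dir),         # output directory, overwritten if present
        "-nmotifs", str(n_motifs),   # number of motifs to report for this cluster
        "-minw", str(min_width),     # minimum motif width
        "-maxw", str(max_width),     # maximum motif width
    ]


# Illustrative paths only.
cmd = meme_command("cluster_4.fa", "meme_out/cluster_4", n_motifs=5)
print(" ".join(cmd))
```

On a machine with MEME installed, the returned list could be passed to `subprocess.run`, with one call per cluster.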

3 EXAMPLE RESULTS

We demonstrate the functionality offered by MEME-LaB using co-expression clusters derived from a time-course microarray experiment of Arabidopsis responses to infection with Botrytis cinerea (Windram et al.) (input files and the complete results are available as Supplementary Data S1). The tool makes it easy to browse motifs for all clusters on a single webpage and to reduce a large set of motifs to the most informative results based on motif properties (Fig. 1A and B). Additional information on each motif’s positional distribution among the input set is provided (Fig. 1C). For each ab initio motif predicted, up to five of the most similar known motifs are listed with a distance measure, and additional information and links are provided (Fig. 1A and D). In the example result view (Fig. 1), filtering to show only motifs that occur in ≥25% of the sequences in a cluster, occur at ≥20 sites and have an information content >10 reduces the 220 possible motifs to the 21 displayed. MEME identified motifs similar to the I-box (Fig. 1A, top) and G-box (Fig. 1A, bottom), which is consistent with previous findings, but also a third motif (Fig. 1A, middle) that is not closely similar to any known motif but is present in all the sequences in cluster 4. The MEME-LaB service makes existing functionality more widely and more easily applicable, enabling the identification of significant motifs in large co-expression cluster datasets.
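The filtering step in this example can be expressed as a short Python sketch. The thresholds come from the text; the `MotifSummary` record, its field names and all motif values below are invented for illustration and do not reproduce the Supplementary Data.

```python
from dataclasses import dataclass


@dataclass
class MotifSummary:
    """Hypothetical per-motif record; field names are assumptions."""
    cluster: int
    name: str
    seq_fraction: float   # fraction of the cluster's sequences containing the motif
    n_sites: int          # number of sites contributing to the motif
    info_content: float   # information content of the motif


def passes_filter(motif, min_fraction=0.25, min_sites=20, min_info=10.0):
    """Apply the thresholds from the example: >=25% of sequences, >=20 sites, IC > 10."""
    return (
        motif.seq_fraction >= min_fraction
        and motif.n_sites >= min_sites
        and motif.info_content > min_info
    )


# Invented values, chosen to exercise each threshold.
motifs = [
    MotifSummary(4, "motif_1", 0.80, 42, 12.5),  # passes all three thresholds
    MotifSummary(4, "motif_2", 0.20, 35, 14.0),  # present in too few sequences
    MotifSummary(7, "motif_1", 0.60, 18, 11.0),  # too few sites
    MotifSummary(7, "motif_2", 0.25, 20, 10.0),  # information content not above 10
    MotifSummary(9, "motif_1", 0.25, 20, 10.1),  # passes at the boundaries
]
kept = [m for m in motifs if passes_filter(m)]
print([(m.cluster, m.name) for m in kept])
```

Exposing the thresholds as parameters reflects the interactive filtering described above, in which the cut-offs are adjusted to condense the result set.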

Fig. 1. Screenshot of a typical MEME-LaB result. Motifs are displayed as motif logos, and information about each motif is shown (A). Results can be filtered on motif properties (B). Additional information on positional distribution (C) and properties of similar motifs (D) are accessed in pop-up windows.

Funding: Biotechnology and Biological Sciences Research Council (BBSRC) (BB/F005806/1 to P.B., L.B., J.D.M., J.B. and S.O.); Engineering and Physical Sciences Research Council/BBSRC-funded Warwick Systems Biology Doctoral Training Centre (to R.H.).

Conflict of Interest: none declared.

REFERENCES

1.  Plant cis-acting regulatory DNA elements (PLACE) database: 1999.

Authors:  K Higo; Y Ugawa; M Iwamoto; T Korenaga
Journal:  Nucleic Acids Res       Date:  1999-01-01       Impact factor: 16.971

2.  Athena: a resource for rapid visualization and systematic analysis of Arabidopsis promoter sequences.

Authors:  Timothy R O'Connor; Curtis Dyreson; John J Wyrick
Journal:  Bioinformatics       Date:  2005-10-13       Impact factor: 6.937

3.  Arabidopsis defense against Botrytis cinerea: chronology and regulation deciphered by high-resolution temporal transcriptomic analysis.

Authors:  Oliver Windram; Priyadharshini Madhou; Stuart McHattie; Claire Hill; Richard Hickman; Emma Cooke; Dafyd J Jenkins; Christopher A Penfold; Laura Baxter; Emily Breeze; Steven J Kiddle; Johanna Rhodes; Susanna Atwell; Daniel J Kliebenstein; Youn-Sung Kim; Oliver Stegle; Karsten Borgwardt; Cunjin Zhang; Alex Tabrett; Roxane Legaie; Jonathan Moore; Bärbel Finkenstadt; David L Wild; Andrew Mead; David Rand; Jim Beynon; Sascha Ott; Vicky Buchanan-Wollaston; Katherine J Denby
Journal:  Plant Cell       Date:  2012-09-28       Impact factor: 11.277

4.  MEME-ChIP: motif analysis of large DNA datasets.

Authors:  Philip Machanick; Timothy L Bailey
Journal:  Bioinformatics       Date:  2011-04-12       Impact factor: 6.937

5.  MEME: discovering and analyzing DNA and protein sequence motifs.

Authors:  Timothy L Bailey; Nadya Williams; Chris Misleh; Wilfred W Li
Journal:  Nucleic Acids Res       Date:  2006-07-01       Impact factor: 16.971

6.  JASPAR, the open access database of transcription factor-binding profiles: new content and tools in the 2008 update.

Authors:  Jan Christian Bryne; Eivind Valen; Man-Hung Eric Tang; Troels Marstrand; Ole Winther; Isabelle da Piedade; Anders Krogh; Boris Lenhard; Albin Sandelin
Journal:  Nucleic Acids Res       Date:  2007-11-15       Impact factor: 16.971
