
AthaMap, integrating transcriptional and post-transcriptional data.

Lorenz Bülow, Stefan Engelmann, Martin Schindler, Reinhard Hehl.

Abstract

The AthaMap database generates a map of predicted transcription factor binding sites (TFBS) for the whole Arabidopsis thaliana genome. AthaMap has now been extended to include data on post-transcriptional regulation. A total of 403,173 genomic positions of small RNAs have been mapped in the A. thaliana genome. These identify 5772 putative post-transcriptionally regulated target genes. AthaMap tools have been modified to improve the identification of common TFBS in co-regulated genes by subtracting post-transcriptionally regulated genes from such analyses. Furthermore, AthaMap was updated to the TAIR7 genome annotation, a graphic display of gene analysis results was implemented, and the TFBS data content was increased. AthaMap is freely available at http://www.athamap.de/.

Year:  2008        PMID: 18842622      PMCID: PMC2686474          DOI: 10.1093/nar/gkn709

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

A large number of databases are available for database-assisted gene-expression analysis (1). The first level of gene-expression regulation is transcription, which is controlled by the synchronized binding of transcription factors (TFs) to adjacent cis-regulatory sequences. The bioinformatic identification of cis-regulatory sequences is an important tool for predicting target genes of specific TFs (2). To this end, the AthaMap database was developed. AthaMap generates a genome-wide map of predicted transcription factor binding sites (TFBS) and cis-regulatory elements for Arabidopsis thaliana (3,4). Compared to similar databases such as AGRIS, Athena and ATTED-II (5–8), AthaMap covers the whole-genome sequence and includes predicted TFBS that were identified with positional weight matrices. Recently, plant-related contents of the transcription and promoter databases TRANSFAC and TRANSPRO (9,10) were integrated with plant proteome and pathway data into the platform BKL Plant (BIOBASE Knowledge Library). This was combined with the previously reported ExPlain tool, which screens promoter regions with positional weight matrices for TFBS and evaluates the results using the ‘Composite Module Analyst’ (CMA) as its core component (11,12). This commercial product integrates promoter and pathway analysis of gene-expression data (BIOBASE, Wolfenbüttel, Germany). In contrast, AthaMap is in the public domain and provides online tools to display TFBS in user-selected genes or at specific genomic positions (3). The detection of combinatorial elements and their target genes allows the prediction of co-regulated genes (13). The gene analysis function detects common TFBS in user-provided genes (14). A short user manual has been published recently (15), and all tools are also explained on the ‘Description’ page of the AthaMap website. AthaMap has been linked with PathoPlant, a database on plant–pathogen interactions (16).
Arabidopsis thaliana microarray experiments in PathoPlant can be screened for co-regulated genes that respond to up to three different stimuli (17). A list of co-regulated genes can be exported directly to AthaMap for identification of common TFBS. However, not all differentially expressed genes are transcriptionally regulated (18). One important factor in post-transcriptional regulation is the expression of small RNAs such as miRNA, siRNA and ta-siRNA (19). Although distinct pathways generate these types of small RNAs, the resulting molecules are very similar in size and together represent the small RNA transcriptome of the organism (20). Using a massively parallel sequencing approach, small RNA transcriptome data became available for seedlings and inflorescence tissue of A. thaliana (21). The genome-wide nature of AthaMap and the availability of small RNA data provide a unique opportunity to combine transcriptional and post-transcriptional data in a single database. This may add significantly to the quality of identifying cis-regulatory sequences involved in transcriptional regulation.

ANNOTATION OF GENOMIC POSITIONS OF SMALL RNAS

Sequence signatures (17-mers) derived from a small RNA transcriptome analysis of A. thaliana inflorescence tissue and seedlings were used to screen the genome (21). The complete lists of screening sequences (Accession numbers GSM65747 and GSM65750) were downloaded from NCBI's Gene Expression Omnibus (GEO) repository (22). Genomic positions were determined with a Perl script that screens for perfect matches of all 109 590 small RNA 17-mer screening sequences within the five chromosomes of A. thaliana. Absolute positions and orientations of small RNA matches from inflorescence tissue and seedlings were annotated to AthaMap, resulting in a total of 403 173 genomic matches. For screening sequences yielding more than one genomic match, the corresponding loci were determined. A total of 5772 genes were predicted to be post-transcriptionally regulated by small RNAs because their transcribed regions are targets of at least one small RNA in antisense orientation. A text file with the genome identifiers of the 5772 predicted target genes of small RNAs can be downloaded from the documentation page at AthaMap. Genomic positions of small RNAs are displayed in AthaMap analogously to TFBSs and are symbolized as xxxxx>. The arrow head gives the orientation of the small RNA. A tool tip box appears when the mouse pointer moves over the arrow, indicating the absolute genomic position and screening library of the small RNA. Selecting the name adjacent to this symbol opens a new window giving additional information. Figure 1 shows a partial screen shot of position 11 911 on chromosome 1 with a small RNA from the inflorescence library, the tool tip box and the associated pop-up window. This new window shows the screening sequence, the corresponding genomic positions for this particular small RNA and the reference.
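The perfect-match screening described above can be illustrated with a minimal Python sketch. It is not the Perl script used for AthaMap, whose details are not given here. The function names, the input layout (a dictionary of chromosome sequences) and the convention of reporting minus-strand matches by their leftmost forward-strand coordinate are assumptions made for illustration only.

```python
from collections import defaultdict

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def map_signatures(chromosomes, signatures, k=17):
    """Find perfect matches of k-mer signatures on both strands.

    chromosomes: {name: sequence} (hypothetical input layout)
    signatures:  iterable of k-mer strings
    Returns {signature: [(chromosome, 1-based position, strand), ...]}.
    Minus-strand matches are reported at the leftmost forward-strand
    coordinate of the match (an assumed convention).
    """
    wanted = set(signatures)
    hits = defaultdict(list)
    for name, seq in chromosomes.items():
        for i in range(len(seq) - k + 1):
            window = seq[i:i + k]
            if window in wanted:
                hits[window].append((name, i + 1, "+"))
            rc = reverse_complement(window)
            if rc in wanted:
                hits[rc].append((name, i + 1, "-"))
    return dict(hits)
```

Sliding a window over each chromosome and looking it up in a set keeps the work proportional to genome length rather than to genome length times the number of signatures, which matters for a list of roughly 100 000 screening sequences.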
Figure 1. Small RNA binding sites in the Arabidopsis thaliana genome. Partial screen shot of the sequence display window with a small RNA binding site at position 11 911 on chromosome 1. The tool tip box indicates the absolute genomic position and screening library. A pop-up window with additional information on the small RNA is also shown.

Putative post-transcriptionally regulated genes are identified within the Colocalization and Gene Analysis functions. These genes are tagged on the result pages with an italicized genome identifier. They can be subtracted in the Colocalization and Gene Analysis functions by activating the checkbox ‘exclude genes regulated by smallRNA’ in order to restrict the analyses exclusively to transcriptionally regulated genes.
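The antisense criterion for target genes and the exclusion option can be sketched as follows. This is an illustration under stated assumptions, not AthaMap's implementation: the gene and hit tuples, the gene identifiers and the requirement that the whole 17-mer lies inside the transcribed region are all hypothetical choices.

```python
def predict_targets(genes, srna_hits, k=17):
    """Return the set of genes hit by at least one antisense small RNA.

    genes:     {gene_id: (chromosome, start, end, strand)} for the
               transcribed region (hypothetical layout, 1-based inclusive)
    srna_hits: iterable of (chromosome, position, strand) matches
    A hit counts as antisense when it lies on the strand opposite to
    the gene and the whole k-mer falls inside the transcribed region
    (both are assumptions of this sketch).
    """
    targets = set()
    for gene_id, (chrom, start, end, strand) in genes.items():
        for h_chrom, h_pos, h_strand in srna_hits:
            inside = start <= h_pos and h_pos + k - 1 <= end
            if h_chrom == chrom and inside and h_strand != strand:
                targets.add(gene_id)
                break
    return targets


def exclude_srna_targets(gene_list, targets):
    """Keep only genes not predicted to be small RNA targets."""
    return [gene for gene in gene_list if gene not in targets]
```

Applying `exclude_srna_targets` to a list of co-regulated genes before searching for common TFBS mirrors the effect of the checkbox: the remaining genes are those for which no small RNA target site was predicted.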

UPDATE TO TAIR7

The recent publication of the TAIR7 A. thaliana genome release motivated the implementation of this genome annotation in AthaMap (23). The annotation of the gene structure is based on five chromosomal XML flatfiles downloaded from the TAIR web site (release 7). These files were parsed using a Perl script, and positional information for 5′- and 3′-UTRs, exons and introns was annotated to AthaMap. These regions are displayed in AthaMap with a colour code similar to the one used by TAIR. Because the number of genes with an annotated transcription start site (TSS) is significantly larger in TAIR7, the Gene Analysis and Colocalization functions of AthaMap have been changed to show positions of TFBS relative to the TSS of the nearest gene. This applies to 23 222 (73.1%) genes, while for the remaining 8540 (26.9%) genes results are still displayed relative to the translation start site. In earlier versions of AthaMap, all positions were shown relative to the translation start site. The nucleotide sequence of the A. thaliana genome in TAIR7 is unchanged from TAIR5, the version previously annotated to AthaMap. Therefore, the positional information of all previously determined TFBS remained constant, except for TATA boxes. The number of annotated TATA boxes decreased from 16 277 (13) to currently 15 955, because for genes lacking a TSS a larger upstream region was screened for putative TATA boxes than for genes with an annotated TSS (3). With more genes now having an annotated TSS, the lower number of TATA boxes results from the elimination of false positives.
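The choice of reference point can be illustrated with a short Python sketch. The gene record layout and the convention that the reference position itself has offset 0, with negative values upstream, are assumptions for illustration; the source does not state how AthaMap numbers positions internally.

```python
def relative_position(site_pos, gene):
    """Return (offset, reference type) for a binding site.

    gene: dict with 'strand' ('+' or '-'), 'translation_start' and an
          optional 'tss', all as absolute coordinates (hypothetical layout).
    The TSS is used when annotated; otherwise the translation start
    site serves as the fallback. Negative offsets are upstream.
    """
    ref = gene.get("tss")
    ref_type = "TSS"
    if ref is None:
        ref = gene["translation_start"]
        ref_type = "translation start"
    # On the minus strand, upstream corresponds to larger coordinates.
    if gene["strand"] == "+":
        offset = site_pos - ref
    else:
        offset = ref - site_pos
    return offset, ref_type
```

Returning the reference type together with the offset makes it explicit which of the two reference points a reported position refers to.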

GRAPHIC DISPLAY OF GENE ANALYSIS RESULTS

The Gene Analysis function of AthaMap generates long lists with positional information on TFBS in all genes investigated (14). Although overviews or summaries of the data can be displayed, the positional information is difficult to grasp from such lists. Therefore, a graphic display of TFBS in the analysed gene region was implemented that enables easy comparison between genes and visual identification of common binding site patterns. Every TF family, as well as the small RNAs and combinatorial elements, is identified by a different colour, and each can be selected for display individually. Figure 2 shows the web interface with the buttons to select the TF families and a graphic display of TFBS for selected TF family members in the Arabidopsis genes At2g42530 and At2g42540. Also shown is a tool tip box that opens when the mouse pointer moves over a colour-coded TFBS. The tool tip box gives additional information on the TF that identified this particular TFBS. The factor (RAV1) and factor family (AP2/EREBP) are identified, as well as the position relative to the TSS (−70). For TFBS identified with positional weight matrices, the threshold score, the maximum score and the score of the binding site are given (3).
Figure 2. Graphic display of transcription factor and small RNA binding sites. Partial screen shot of the gene analysis tool with the checkboxes for TF families included in a graphic display and the graphic display of the upstream region of the genes At2g42530 and At2g42540. A tool tip box with additional information on one of the TFBS is also shown.
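The three values reported in the tool tip box for matrix-based sites (threshold score, maximum score and score of the binding site) can be related to each other with a generic positional weight matrix sketch. The scoring scheme AthaMap actually uses is described in (3) and is not reproduced here; the additive scoring, the matrix values and the function names below are hypothetical.

```python
def max_score(pwm):
    """Highest score any site can reach: sum of the column maxima."""
    return sum(max(column.values()) for column in pwm)


def site_score(pwm, site):
    """Additive score of one site: one matrix entry per position."""
    return sum(column[base] for column, base in zip(pwm, site))


def scan(pwm, seq, threshold):
    """Return (1-based position, score) for windows scoring >= threshold."""
    width = len(pwm)
    hits = []
    for i in range(len(seq) - width + 1):
        score = site_score(pwm, seq[i:i + width])
        if score >= threshold:
            hits.append((i + 1, score))
    return hits


# Hypothetical three-column matrix, one dict of base weights per position.
EXAMPLE_PWM = [
    {"A": 5, "C": 0, "G": 1, "T": 0},
    {"A": 0, "C": 4, "G": 0, "T": 2},
    {"A": 1, "C": 0, "G": 6, "T": 0},
]
```

In this scheme a site is reported only if its score reaches the threshold, and the maximum score indicates how close the site is to the best possible match for the matrix.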

DATA INCREASE

Recently published binding sites for the Arabidopsis TFs TAC1, RAP2.2 and MYB98 were annotated to AthaMap (24–26). These factors belong to the C2H2(Zn), AP2/EREBP and MYB TF families, respectively. Detection and annotation of single binding sites were carried out as described earlier (4). Binding sites for two TFs for which positional weight matrices could be generated were annotated as well. These are the factors STF1 and SPL1, which belong to the bZIP and SBP TF families, respectively (27,28). Detection and annotation of matrix-based binding sites were carried out as described earlier (3). AthaMap now harbours 9 998 736 predicted TFBS.

FUNDING

German Federal Ministry for Education and Research through GABI-ADVANCIS (BMBF 0315037B). Funding for open access charge: Technical University of Braunschweig.

Conflict of interest statement. None declared.
  27 in total

1.  PathoPlant: a database on plant-pathogen interactions.

Authors:  Lorenz Bülow; Martin Schindler; Claudia Choi; Reinhard Hehl
Journal:  In Silico Biol       Date:  2004

Review 2.  Post-transcriptional small RNA pathways in plants: mechanisms and regulations.

Authors:  Hervé Vaucheret
Journal:  Genes Dev       Date:  2006-04-01       Impact factor: 11.361

Review 3.  MicroRNAS and their regulatory roles in plants.

Authors:  Matthew W Jones-Rhoades; David P Bartel; Bonnie Bartel
Journal:  Annu Rev Plant Biol       Date:  2006       Impact factor: 26.379

4.  MYB98 positively regulates a battery of synergid-expressed genes encoding filiform apparatus localized proteins.

Authors:  Jayson A Punwani; David S Rabiger; Gary N Drews
Journal:  Plant Cell       Date:  2007-08-10       Impact factor: 11.277

5.  Athena: a resource for rapid visualization and systematic analysis of Arabidopsis promoter sequences.

Authors:  Timothy R O'Connor; Curtis Dyreson; John J Wyrick
Journal:  Bioinformatics       Date:  2005-10-13       Impact factor: 6.937

6.  Control of gene expression during T cell activation: alternate regulation of mRNA transcription and mRNA stability.

Authors:  Chris Cheadle; Jinshui Fan; Yoon S Cho-Chung; Thomas Werner; Jill Ray; Lana Do; Myriam Gorospe; Kevin G Becker
Journal:  BMC Genomics       Date:  2005-05-20       Impact factor: 3.969

7.  AthaMap web tools for database-assisted identification of combinatorial cis-regulatory elements and the display of highly conserved transcription factor binding sites in Arabidopsis thaliana.

Authors:  Nils Ole Steffens; Claudia Galuschka; Martin Schindler; Lorenz Bülow; Reinhard Hehl
Journal:  Nucleic Acids Res       Date:  2005-07-01       Impact factor: 16.971

8.  PathoPlant: a platform for microarray expression data to analyze co-regulated genes involved in plant defense responses.

Authors:  Lorenz Bülow; Martin Schindler; Reinhard Hehl
Journal:  Nucleic Acids Res       Date:  2006-11-11       Impact factor: 16.971

9.  Internet Resources for Gene Expression Analysis in Arabidopsis thaliana.

Authors:  Reinhard Hehl; Lorenz Bülow
Journal:  Curr Genomics       Date:  2008-09       Impact factor: 2.236

10.  AGRIS: Arabidopsis gene regulatory information server, an information resource of Arabidopsis cis-regulatory elements and transcription factors.

Authors:  Ramana V Davuluri; Hao Sun; Saranyan K Palaniswamy; Nicole Matthews; Carlos Molina; Mike Kurtz; Erich Grotewold
Journal:  BMC Bioinformatics       Date:  2003-06-23       Impact factor: 3.169
