Kolja Henckel, Kai J Runte, Thomas Bekel, Michael Dondrup, Tobias Jakobi, Helge Küster, Alexander Goesmann.
Abstract
BACKGROUND: Databases for sequence, annotation, or microarray experiment data are extremely beneficial to the research community, as they centrally gather information from experiments performed by different scientists. However, data from different sources develop their full capacities only when combined. The idea of a data warehouse directly addresses this problem and solves it by integrating all required data into one single database; hence, many data warehouses are already available in genetics. For the model legume Medicago truncatula, there is currently no such single data warehouse that integrates all freely available gene sequences, the corresponding gene expression data, and annotation information. Thus, we created the data warehouse TRUNCATULIX, an integrative database of Medicago truncatula sequence and expression data.
Year: 2009 PMID: 19210766 PMCID: PMC2654896 DOI: 10.1186/1471-2229-9-19
Source DB: PubMed Journal: BMC Plant Biol ISSN: 1471-2229 Impact factor: 4.215
Figure 1. The TRUNCATULIX database schema. The main table stores the sequence data. All other information is stored in separate tables referring to the main table.
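The layout described in the caption (one main sequence table, with all other tables referring to it) can be illustrated with a minimal sketch using Python's built-in `sqlite3` module. The table and column names (`sequence`, `annotation`, `expression`, `seq_id`, and so on) and the demo rows are hypothetical and chosen for illustration only; the actual TRUNCATULIX schema is not reproduced here.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database: one main table plus two referring tables.

    All table and column names are illustrative assumptions, not the real schema.
    """
    con = sqlite3.connect(":memory:")
    con.execute("PRAGMA foreign_keys = ON")  # enforce references to the main table
    con.executescript("""
        CREATE TABLE sequence (
            seq_id   INTEGER PRIMARY KEY,
            name     TEXT NOT NULL,
            residues TEXT NOT NULL
        );
        CREATE TABLE annotation (
            ann_id      INTEGER PRIMARY KEY,
            seq_id      INTEGER NOT NULL REFERENCES sequence(seq_id),
            description TEXT
        );
        CREATE TABLE expression (
            expr_id    INTEGER PRIMARY KEY,
            seq_id     INTEGER NOT NULL REFERENCES sequence(seq_id),
            experiment TEXT,
            log_ratio  REAL
        );
    """)
    return con


def load_demo_rows(con):
    """Insert made-up placeholder rows (not real TRUNCATULIX data)."""
    con.execute("INSERT INTO sequence VALUES (1, 'demo_seq_1', 'ACGT')")
    con.execute("INSERT INTO annotation VALUES (1, 1, 'placeholder annotation')")
    con.execute("INSERT INTO expression VALUES (1, 1, 'demo_experiment', 1.5)")
    con.commit()


def annotated_expression(con, experiment):
    """Join annotation and expression data through the main sequence table."""
    rows = con.execute(
        """
        SELECT s.name, a.description, e.log_ratio
        FROM sequence AS s
        JOIN annotation AS a ON a.seq_id = s.seq_id
        JOIN expression AS e ON e.seq_id = s.seq_id
        WHERE e.experiment = ?
        ORDER BY s.name
        """,
        (experiment,),
    )
    return rows.fetchall()
```

Because every auxiliary table carries only a key into the main table, a query that combines annotation and expression data needs just one join per auxiliary table.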
Sequence data integrated into TRUNCATULIX.
| MtGI 8.0 TCs & Singletons | 36,878 | 6,174 (16.74%) | 12,746 (34.56%) | 10,268 (27.84%) |
| MtGI 9.0 TCs & Singletons | 67,463 | 11,253 (16.68%) | 20,570 (30.49%) | 19,008 (28.18%) |
| Mt Genome 2.0 | 38,759 | 5,938 (15.32%) | 3,434 (8.85%) | 10,444 (26.95%) |
| Affymetrix Medicago GeneChip® probes | 61,103 | 12,044 (19.71%) | 18,731 (30.65%) | 19,775 (32.36%) |
| Medicago 454 sequencing project | 3,619 | 911 (25.17%) | 1,798 (49.68%) | 519 (14.34%) |
The table shows the number of sequences integrated in TRUNCATULIX via the SAMS software. The annotation information was calculated using various bioinformatic tools.
Microarray expression data imported from EMMA into TRUNCATULIX.
| Nitrogen-fixing root nodules in | 4 | 10 |
| Nod-Factor response in | 9 | 13 |
| Root endosymbiosis in | 10 | 23 |
| Uromyces pathogenesis in | 3 | 4 |
| AHL treatment of | 11 | 17 |
| LMW EPS I treatment of | 6 | 8 |
| LMW EPS I treatment of | 6 | 8 |
| LMW EPS I treatment of | 24 | 32 |
| Nod-factor treatment of | 6 | 8 |
| Nod-factor treatment of | 18 | 24 |
| Seed development in | 22 | 51 |
| Early Salt Stress in | 4 | 5 |
| Cold stress in | 8 | 11 |
| | 16 | 20 |
| Response to phosphate in | 3 | 4 |
The table shows the amount of microarray expression data integrated into TRUNCATULIX via EMMA. Currently, 15 different experiments with a total of 150 microarrays (248 transformed datasets) are integrated. Some of the integrated data are unpublished: H. Küster(*), M. Hahn, N. Hohnjec, H. Küster(**), D. Hinse, A. Becker, H. Küster(***), C. Hogekamp, H. Küster(****) and F. Frugier(*****). Abbreviations: AHL: acetylhomoserine lactone, LMW: low molecular weight, EPS: exopolysaccharide, Nod-factor: nodulation factor.
GeneChip data integrated into TRUNCATULIX.
| Leaf: 4-week old trifolia were harvested without their petioles (but with their petiolule) [ | 3 |
| Petiole: Petioles from 4-week old plant [ | 3 |
| Stem: Stems of 4-week old plants (without vegetative buds) [ | 3 |
| Vegetative Bud: Vegetative buds of 4-week old plants [ | 3 |
| Root: 4-week old non-inoculated roots [ | 3 |
| Nodule: Nodules from 4-week old plants [ | 3 |
| Flower: Fully open flowers were harvested at the day of anthesis [ | 3 |
| Pod: Mix of small, medium and physiologically mature pods [ | 3 |
| Root-0d: Roots at 0 dpi (control for nodule developmental series) [ | 3 |
| Nod4d: Nodules at 4 dpi (root lumps with residual roots) [ | 3 |
| Nod10d: Developing nodules at 10 dpi [ | 3 |
| Nod14d: Mature nodules at 14 dpi [ | 3 |
| Seed10d: Developing seeds at early embryogenesis – 10 dap [ | 3 |
| Seed12d: Developing seeds at 12 dap (transition between embryogenesis and seed filling) [ | 3 |
| Seed16d: Developing seeds at 16 dap (accumulation of storage proteins) [ | 3 |
| Seed20d: Developing seeds at 20 dap (seed filling) [ | 3 |
| Seed24d: Developing seeds at 24 dap (maturation phase) [ | 3 |
| Seed36d: Developing seeds at 36 dap (physiologically mature seeds, desiccation) [ | 3 |
The table shows the experiments and the number of GeneChip® arrays directly imported into TRUNCATULIX. The three experiments address major topics: mature organs covering the whole plant, nodulation development, and seed development.
Figure 2. The first filter step – gene annotation. The screenshot shows the filter page for the gene annotation data. The annotation descriptions can be queried, as well as the gene names, sequences and EC numbers.
Figure 3. The second filter step – gene expression. This screenshot shows the filter page for the expression data. The different integrated experiments can be selected, as well as different expression values and the number of replicates.
Figure 4. Export options of TRUNCATULIX. The screenshot shows the last page of the query dialog. The user can select which data and details should be exported, receiving an Excel, HTML, or CSV file as a result.
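The query dialog described in these captions (filter by annotation, then filter by expression, then export) can be sketched as a small pipeline. This is a minimal illustration under stated assumptions: the records in `GENES`, their field names, and the functions `filter_annotation`, `filter_expression`, and `export_csv` are all hypothetical and do not reflect the TRUNCATULIX implementation. Only the CSV export is shown, and the replicate filter is omitted.

```python
import csv
import io

# Made-up records for illustration only; the field names are assumptions.
GENES = [
    {"name": "demo_gene_1", "annotation": "putative transporter",
     "expression": {"demo_experiment": 2.1}},
    {"name": "demo_gene_2", "annotation": "hypothetical kinase",
     "expression": {"demo_experiment": 3.0}},
    {"name": "demo_gene_3", "annotation": "Transporter-like protein",
     "expression": {"demo_experiment": 0.4}},
]


def filter_annotation(genes, text):
    """Step 1: keep genes whose annotation contains the text (case-insensitive)."""
    needle = text.lower()
    return [g for g in genes if needle in g["annotation"].lower()]


def filter_expression(genes, experiment, min_value):
    """Step 2: keep genes measured in the experiment with a value >= min_value."""
    return [
        g for g in genes
        if g["expression"].get(experiment, float("-inf")) >= min_value
    ]


def export_csv(genes, experiment):
    """Final step: serialize the selected genes as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["name", "annotation", experiment])
    for g in genes:
        writer.writerow([g["name"], g["annotation"],
                         g["expression"].get(experiment, "")])
    return buf.getvalue()
```

Chaining the steps, for example `export_csv(filter_expression(filter_annotation(GENES, "transporter"), "demo_experiment", 1.0), "demo_experiment")`, mirrors the order of the dialog pages: each filter narrows the result of the previous one before the export is produced.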
Figure 5. The simple search dialog. The screenshot shows the simple search dialog. The user can search for text fragments in the annotation of the genes, the gene names, the gene products, and the reporter names.
A comparison of TRUNCATULIX to other data warehouses
| Data warehouse feature | GeWare | Medicago truncatula Gene Expression Atlas | TRUNCATULIX |
| Target organism | | | |
| static/dynamic data | dynamic | static | static |
| number of microarray experiments | unknown | 18 | 18 |
| number of microarrays | unknown | 54 | 204 |
| automatic annotation information | yes | searchable, but not visible | yes |
| KEGG-mapping | no | yes | yes |
| GO-numbers | no | yes | yes |
| search | yes | yes | yes |
| blast homology search | no | yes | no, planned |
| export options | yes | yes | yes |
| free use | yes | yes | yes |
| free access | no | yes | yes |
The table shows a comparison of the three data warehouses GeWare, the Medicago truncatula Gene Expression Atlas, and TRUNCATULIX.