Sarahani Harun, Muhammad-Redha Abdullah-Zawawi, Mohd Rusman Arief A-Rahman, Nor Azlan Nor Muhammad, Zeti-Azura Mohamed-Hussein.
Abstract
Plants produce a wide range of secondary metabolites that play important roles in plant defense and immunity, in interactions with the environment and in symbiotic associations. Sulfur-containing compounds (SCCs) are an important group of secondary metabolites produced by members of the order Brassicales. SCCs comprise various groups of phytochemicals, but not much is known about them. Findings from previous studies on SCCs are scattered across the published literature, so SuCComBase was developed to store all molecular information related to the biosynthesis of SCCs. Information on the genes, proteins and compounds involved in the SCC biosynthetic pathway was manually collected from databases and the published scientific literature. Sets of co-expression data were analyzed to search for other (previously unknown) genes that might be involved in SCC biosynthesis. These genes were named potential SCC-related encoding genes. A total of 147 known and 92 putative Arabidopsis thaliana SCC-related genes from the literature were used to identify other potential SCC-related encoding genes. We identified 778 potential SCC-related encoding genes, 4026 homologs of the SCC-related encoding genes and 116 SCCs, as shown on the SuCComBase homepage. Data entries are searchable from the Main page, Search, Browse and Datasets tabs. Users can easily download all data stored in SuCComBase. All publications related to SCCs are also indexed in SuCComBase, which is currently the first and only database dedicated to plant SCCs. SuCComBase aims to become a manually curated, up-to-date knowledge-based repository for plant SCCs.
Year: 2019 PMID: 30793170 PMCID: PMC6384505 DOI: 10.1093/database/baz021
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Figure 1. SuCComBase schema, which contains 12 tables and the connections between them.
Figure 2. Organization of the SuCComBase data types. These data types are tables that can be found in the ‘Browse’ and ‘Datasets’ menus.
Figure 3. Identification of 92 putative A. thaliana SCC-related genes, 778 potential A. thaliana SCC-related genes and 4026 SCC homologs using 147 known A. thaliana SCC-related genes as queries.
Figure 4. Detailed overview of the approaches used to identify and annotate GO terms for each SCC-related gene entry. Known, potential and putative A. thaliana SCC-related genes were used to search for GO annotation information.
Number of entries in SuCComBase
| Data set | Entries |
|---|---|
| Known | 147 |
| Putative | 92 |
| KEGG putative | 3 |
| AraCyc putative | 89 |
| Potential | 778 |
| SCC homologs | 4026 |
|  | 1970 |
|  | 1319 |
|  | 737 |
| Compounds | 116 |
|  | 47 |
|  | 28 |
|  | 40 |
|  | 1 |
| Publications | 206 |
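The sub-counts listed under each total can be checked against their parent rows with simple arithmetic. The following is a minimal Python sketch; the grouping of sub-counts under "Putative", "SCC homologs" and "Compounds" is an assumption based on the order of the rows, and the dictionary layout and function name are illustrative only.

```python
# Entry counts copied from the table. Grouping the sub-counts under each
# parent row is an assumption based on row order, not stated in the source.
totals = {
    "Putative": (92, [3, 89]),                  # KEGG putative, AraCyc putative
    "SCC homologs": (4026, [1970, 1319, 737]),  # three unlabeled subsets
    "Compounds": (116, [47, 28, 40, 1]),        # four unlabeled subsets
}


def check_totals(table):
    """Return {name: bool} telling whether the sub-counts sum to the parent total."""
    return {name: sum(parts) == total for name, (total, parts) in table.items()}


if __name__ == "__main__":
    for name, ok in check_totals(totals).items():
        print(f"{name}: {'consistent' if ok else 'mismatch'}")
```

Under this grouping, each parent total equals the sum of the rows beneath it.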
Figure 5. Integration of three co-expression gene networks reveals potential SCC-related genes. Colors indicate the function of known SCC-related genes in glucosinolate (GSL) biosynthesis: yellow (transcription factor), blue (core structure synthesis), green (side-chain elongation), purple (side-chain modification) and red (GSL degradation). Known SCC-related genes were used as queries to identify co-expressed genes, which were then classified as potential SCC-related genes.
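The query-based idea in this caption can be illustrated with a small Python sketch. This is not the authors' pipeline: the source does not say how the three networks were combined, so the sketch assumes a simple union and records how many networks support each candidate. The gene identifiers (`K1`, `G1`, ...) and the function name `potential_genes` are hypothetical placeholders.

```python
from collections import defaultdict


def potential_genes(networks, known):
    """Collect genes directly co-expressed with any known gene.

    networks: list of edge lists; each edge is a (gene_a, gene_b) pair.
    known:    set of known query genes.
    Returns {candidate_gene: number_of_networks_in_which_it_was_found}.
    Assumption: networks are combined by union, which the source does not specify.
    """
    support = defaultdict(int)
    for edges in networks:
        hits = set()  # candidates found in this network, counted once each
        for a, b in edges:
            if a in known and b not in known:
                hits.add(b)
            elif b in known and a not in known:
                hits.add(a)
        for gene in hits:
            support[gene] += 1
    return dict(support)


# Toy data with hypothetical gene identifiers.
known = {"K1", "K2"}
net_a = [("K1", "G1"), ("G2", "G3")]
net_b = [("G1", "K2"), ("K1", "K2")]
net_c = [("K2", "G4"), ("G1", "K1")]
result = potential_genes([net_a, net_b, net_c], known)
print(result)
```

Keeping the per-network support count makes it easy to switch from a union to a stricter rule, for example keeping only candidates found in at least two networks.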