Abstract
To promote understanding of how organisms are related via carotenoids, either evolutionarily or symbiotically, or in food chains through natural histories, we built the Carotenoids Database. This provides chemical information on 1117 natural carotenoids with 683 source organisms. For extracting organisms closely related through the biosynthesis of carotenoids, we offer a new similarity search system 'Search similar carotenoids' using our original chemical fingerprint 'Carotenoid DB Chemical Fingerprints'. These Carotenoid DB Chemical Fingerprints describe the chemical substructure and the modification details based upon International Union of Pure and Applied Chemistry (IUPAC) semi-systematic names of the carotenoids. The fingerprints also allow (i) easier prediction of six biological functions of carotenoids: provitamin A, membrane stabilizers, odorous substances, allelochemicals, antiproliferative activity and reverse MDR activity against cancer cells, (ii) easier classification of carotenoid structures, (iii) partial and exact structure searching and (iv) easier extraction of structural isomers and stereoisomers. We believe this to be the first attempt to establish fingerprints using the IUPAC semi-systematic names. For extracting organisms with closely matching carotenoid profiles, we provide a new tool 'Search similar profiled organisms'. Our current statistics give some insights into natural history: carotenoids seem to have been spread largely by bacteria, as they produce C30, C40, C45 and C50 carotenoids with the widest range of end groups, and they share a small portion of C40 carotenoids with eukaryotes. Archaea share an even smaller portion with eukaryotes. Eukaryotes have then evolved a considerable variety of C40 carotenoids. Judging by carotenoids, and setting aside 16S rRNA lineage analysis, eukaryotes seem more closely related to bacteria than to archaea. Database URL: http://carotenoiddb.jp
Year: 2017 PMID: 28365725 PMCID: PMC5574413 DOI: 10.1093/database/bax004
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Figure 1. Growth curve of compiled carotenoid structures.
The data content of carotenoid entries (December 2016 release)
| Field | Data content |
|---|---|
| ENTRY | Accession number which begins with CA |
| HIERARCHICAL CLASSIFICATION | Classification by the number of carbon atoms, end groups and chemical modification patterns |
| NAME | Trivial name of the carotenoid |
| IUPAC NAME | Systematic name following the nomenclature of carotenoids approved by the IUPAC and the IUPAC-IUB Commission |
| FORMULA | Chemical formula calculated by Open Babel |
| MOLECULAR WEIGHT | Molecular weight calculated with Standard Atomic Weights 2015 which are defined by the Chemical Society of Japan |
| CHEMICAL STRUCTURE | PNG file and Mol file of the carotenoid structure, hand-drawn by the database authors |
| CHEMICAL FINGERPRINTS | Carotenoid DB Chemical Fingerprints investigated in this work |
| ISOMERS | Accession numbers of constitutional isomers, and stereoisomers which include |
| BIOLOGICAL FUNCTIONS AND PROPERTIES | Photosynthetic pigment, photoprotective agent, provitamin A, antioxidant, anticarcinogenic activity, colour, etc. |
| InChI | The IUPAC International Chemical Identifier converted by Open Babel |
| InChIKey | Fixed-length (27-character) condensed digital representation of an InChI converted by Open Babel |
| Canonical SMILES | Canonical Simplified Molecular Input Line Entry System converted by Open Babel |
| XLogP | Partition coefficient calculated by PaDEL-Descriptor |
| HYDROGEN BOND DONORS | Number of hydrogen bond donors (using Lipinski's definition: any OH or NH; each available hydrogen atom is counted as one hydrogen bond donor) calculated by PaDEL-Descriptor |
| HYDROGEN BOND ACCEPTORS | Number of hydrogen bond acceptors (using Lipinski's definition: any nitrogen; any oxygen) calculated by PaDEL-Descriptor |
| LIPINSKI FAILURES | Number of failures of Lipinski's Rule of Five calculated by PaDEL-Descriptor |
| COMPLEXITY OF MOLECULE | Complexity of a molecule calculated by PaDEL-Descriptor |
| NUMBER OF HEAVY ATOMS | Number of heavy atoms (i.e. not hydrogen) calculated by PaDEL-Descriptor |
| TOPOLOGICAL POLAR SURFACE AREA | Sum of solvent accessible surface areas of atoms with absolute value of partial charges greater than or equal to 0.2 calculated by PaDEL-Descriptor |
| SOURCE ORGANISMS | Scientific names of source organisms obtained from the latest available papers |
| REFERENCES | References of original papers |
| CAS | Chemical Abstract Service number |
| LINKS TO OTHER DB | Links to KEGG COMPOUND, KNApSAcK, Lipidbank and ProCarDB |
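The LIPINSKI FAILURES field can be illustrated with a minimal sketch. It assumes the commonly cited Rule of Five thresholds (molecular weight above 500, logP above 5, more than 5 hydrogen bond donors, more than 10 hydrogen bond acceptors); PaDEL-Descriptor's exact implementation may differ. The `Descriptors` class and the example values are hypothetical and are not taken from a database entry.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Hypothetical container for four fields of a carotenoid entry."""
    molecular_weight: float
    xlogp: float
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_failures(d: Descriptors) -> int:
    """Count how many Rule of Five criteria are violated.

    Thresholds are the commonly cited ones (an assumption here,
    not confirmed against PaDEL-Descriptor).
    """
    violations = [
        d.molecular_weight > 500,
        d.xlogp > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ]
    return sum(violations)


# Illustrative values for a C40 hydrocarbon carotenoid (assumed, not from the database).
example = Descriptors(molecular_weight=536.9, xlogp=13.5,
                      h_bond_donors=0, h_bond_acceptors=0)
print(lipinski_failures(example))
```

Under these assumptions, a large and highly lipophilic molecule with no OH or NH groups fails on weight and logP only.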
The data content of source organism entries (December 2016 release)
| Field | Data content |
|---|---|
| NAME | Scientific name of source organism |
| NCBI taxonomy ID | Taxonomy ID defined by NCBI |
| LINEAGE | Full lineage defined by NCBI |
| DESCRIPTION | Popular names and explanations of source organism |
| CAROTENOID PROFILE | List of CA-numbers, structures, descriptions of the carotenoids and reference numbers |
| REFERENCES | References describing the carotenoid profiles of the source organism |
Figure 2. Links in the Carotenoids Database.
Figure 3. Fucoxanthin: (3S,5R,6S,3'S,5'R,6'R)-5,6-Epoxy-3'-ethanoyloxy-3,5'-dihydroxy-6',7'-didehydro-5,6,7,8,5',6'-hexahydro-beta,beta-caroten-8-one, whose chemical fingerprints are “(3S,5R,6S,3'S,5'R,6'R), 6',7'–H, 5,6+H, 7,8+H, 5',6'+H, 3-OH, 5'-OH, 3'-Ethanoyloxy, 8=O, 5,6-Epoxy, beta,beta”.
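Because each fingerprint is a list of substructure and modification tokens, similarity and partial structure searches can be sketched as set operations. The example below is a minimal illustration only: it assumes Tanimoto (Jaccard) similarity, which the source does not name as the measure used by 'Search similar carotenoids', and the second fingerprint is a hypothetical query rather than a database entry. The functions `tokenize`, `tanimoto` and `contains_substructure` are illustrative names.

```python
from __future__ import annotations


def tokenize(fingerprint: str) -> frozenset[str]:
    """Split a fingerprint string on ', ' (comma plus space).

    Commas inside a token, as in "5,6+H" or "beta,beta", carry no
    following space, so they stay within the token.
    """
    return frozenset(tok.strip() for tok in fingerprint.split(", ") if tok.strip())


def tanimoto(a: frozenset[str], b: frozenset[str]) -> float:
    """Shared tokens divided by all distinct tokens (assumed measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def contains_substructure(fingerprint: frozenset[str], query: frozenset[str]) -> bool:
    """Partial structure search: every query token must be present."""
    return query <= fingerprint


# Fucoxanthin fingerprint from Figure 3, with the dash written as an ASCII hyphen.
FUCOXANTHIN = tokenize(
    "(3S,5R,6S,3'S,5'R,6'R), 6',7'-H, 5,6+H, 7,8+H, 5',6'+H, "
    "3-OH, 5'-OH, 3'-Ethanoyloxy, 8=O, 5,6-Epoxy, beta,beta"
)
# Hypothetical query fingerprint, not a database entry.
QUERY = tokenize("5,6+H, 3-OH, 5,6-Epoxy, beta,beta")

print(len(FUCOXANTHIN))
print(contains_substructure(FUCOXANTHIN, QUERY))
print(round(tanimoto(FUCOXANTHIN, QUERY), 3))
```

Treating the stereodescriptor prefix as a single token means stereoisomers differ in exactly one token, which is one simple way to separate them from constitutional isomers.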
Categories of carotenoids and the Carotenoid DB Chemical Fingerprints they include
| Carotenoids category | Categories of included Carotenoid DB Chemical Fingerprints | Example entry | Example name | Example fingerprints |
|---|---|---|---|---|
| | | CA00047 | Neurosporene | 7,8+H, |
| | End groups, and/or | CA00322 | β-Cryptoxanthin | (3R), 3- |
| | End groups, and/or | CA00628 | α-Carotene epoxide | 5,6+H, 5,6- |
| | End groups, and/or | CA00161 | Anhydrorhodovibrinal | 3,4–H, 1,2+H, 1-Methoxy, 20- |
| | End groups, and/or | CA00184 | Ketohydroxylycopene | 3′-OH, 4 |
| | End groups, and/or | CA00283 | Torularhodin | 3′,4′–H, 16′- |
| | End groups, and/or | CA00288 | Neurosporaxanthin | 4′-COOH, 4′- |
| | End groups, and/or | CA00572 | Actinioerythrol | (3S,3′S), 3-OH, 3′-OH, 4=O, 4′=O, 2- |
| | End groups, and/or | CA00584 | β-Carotenone | 5=O, 6=O, 5′=O, 6′=O, 5,6- |
| | End groups, and/or | CA00196 | Retrodehydro-γ-carotene | 4′,5′–H, 4,5′- |
| | End groups, and/or | CA00413 | Peridininol 5,8-furanooxide | (3S,5R,6S,3′S,5′R,6′S), 6′,7′–H, 5,6+H, 5′,6′+H, 3-OH, 3′-OH, 5′-OH, 5,8-Epoxy, 19,11- |
| | End groups, and/or | CA00341 | Trollein | (3S,5R,6R,3′R), |
| | End groups, and/or | CA00296 | Crassostreaxanthin A | (3R,3′R,5′R,6′S), |
| | End groups, and/or | CA00886 | Crocetindial | 8-al, 8′-al, 8 |
Numbers of carotenoids in the three domains of life (December 2016 release)
| Domains of life | Number of organisms | C30 carotenoids | C40 originated carotenoids | C45 carotenoids | C50 carotenoids | Total number of carotenoids |
|---|---|---|---|---|---|---|
| Archaea | 8 | 1 | 14 | 0 | 5 | 19 |
| Bacteria | 170 | 33 | 243 | 7 | 24 | 307 |
| Eukaryotes | 505 | 0 | 607 | 0 | 0 | 607 |
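The pattern stated in the abstract, that bacteria produce all four carbon classes while eukaryotes are confined to C40, can be read directly off the table. The sketch below transcribes the per-class counts and lists the classes with a non-zero count for each domain; the helper names `classes_present` and `domains_producing` are illustrative.

```python
# Per-class counts transcribed from the table above (December 2016 release).
COUNTS = {
    "Archaea":    {"C30": 1,  "C40": 14,  "C45": 0, "C50": 5},
    "Bacteria":   {"C30": 33, "C40": 243, "C45": 7, "C50": 24},
    "Eukaryotes": {"C30": 0,  "C40": 607, "C45": 0, "C50": 0},
}


def classes_present(domain: str) -> list[str]:
    """Carbon classes with at least one recorded carotenoid in the domain."""
    return [cls for cls, n in COUNTS[domain].items() if n > 0]


def domains_producing(carbon_class: str) -> list[str]:
    """Domains with at least one recorded carotenoid of the given class."""
    return [dom for dom, row in COUNTS.items() if row[carbon_class] > 0]


for domain in COUNTS:
    print(domain, classes_present(domain))
print("C45:", domains_producing("C45"))
```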
Distribution of carbon numbers and end groups of carotenoids among organisms at phylum level (December 2016 release)
C30 carotenoids
| End groups | Source organisms |
|---|---|
| ψ,ψ | Euryarchaeota, Firmicutes, Cyanobacteria, Alphaproteobacteria, Gammaproteobacteria |
C40 carotenoids
| End groups | Source organisms |
|---|---|
| ψ,ψ | Cyanobacteria, Deinococci, Alphaproteobacteria, Betaproteobacteria, Deltaproteobacteria, Gammaproteobacteria, Actinobacteria, Firmicutes, Gemmatimonadetes, Cryptophyta, Streptophyta, Chordata, Chlorophyta, Basidiomycota, Ascomycota, Porifera, Arthropoda (Insecta) |
| β,ψ | Cyanobacteria, Actinobacteria, Deinococci, Alphaproteobacteria, Gammaproteobacteria, Bacteroidetes, Chlorobi, Firmicutes, Deltaproteobacteria, Euglenida, Chlorophyta, Streptophyta, Basidiomycota, Ascomycota, Chordata, Porifera, Mollusca, Arthropoda (Insecta) |
| ε,ψ | Streptophyta, Chordata |
| γ,ψ | Arthropoda (Insecta) |
| φ,ψ | Gammaproteobacteria, Unclassified bacteria (Chlorochromatium), Chlorobi, Actinobacteria |
| χ,ψ | Gammaproteobacteria |
| β,β | Crenarchaeota, Euryarchaeota, Cyanobacteria, Deinococci, Alphaproteobacteria, Actinobacteria, Bacteroidetes, Rhodophyta, Chlorophyta, Streptophyta, Cryptophyta, Eustigmatophyceae, (Stramenopiles), Haptophyceae, Alveolata, (Raphidophyceae), Bacillariophyta, Euglenida, Unclassified chlorophyta, Phaeophyceae, Glaucocystophyceae, Basidiomycota, Porifera, Arthropoda, Mollusca, Ascomycota, Chordata, Echinodermata, Cnidaria, Arthropoda (Insecta) |
| β,ε | Euryarchaeota, Cyanobacteria, Rhodophyta, Chlorophyta, Streptophyta, Cryptophyta, Haptophyceae, Euglenida, Unclassified chlorophyta, Mollusca, Chordata, Porifera, Ascomycota, Echinodermata, Actinobacteria, Arthropoda, Cnidaria, Arthropoda (Insecta) |
| β,γ | Ascomycota, Mollusca, Streptophyta, Arthropoda (Insecta) |
| ε,ε | Cyanobacteria, Cryptophyta, Chlorophyta, (Stramenopiles), Mollusca, Streptophyta, Chordata |
| γ,ε | Chlorophyta, Unclassified chlorophyta, Porifera |
| γ,γ | Arthropoda (Insecta) |
| β,φ | Gammaproteobacteria, Porifera |
| β,χ | Gammaproteobacteria, Echinodermata, Porifera |
| β,κ | Streptophyta, Chordata, Mollusca, Porifera, Echinodermata, Ascomycota |
| κ,χ | Porifera |
| κ,κ | Streptophyta |
| φ,φ | Actinobacteria, Gammaproteobacteria, Chlorobi, Unclassified bacteria (Chlorochromatium), Mollusca, Porifera |
| χ,χ | Cyanobacteria, Gammaproteobacteria |
| φ,χ | Gammaproteobacteria, Porifera |
| ψ,- | Cyanobacteria, Streptophyta, Ascomycota |
| β,- | Streptophyta, Haptophyceae, Mollusca, Cyanobacteria, Gammaproteobacteria, Alphaproteobacteria, Chordata, Actinobacteria, Ascomycota, Echinodermata, Bacillariophyta, Arthropoda (Insecta), Porifera, Arthropoda |
| ε,- | Streptophyta, Mollusca, Chordata, Ascomycota, Arthropoda (Insecta) |
| γ,- | Streptophyta |
| κ,- | Streptophyta |
| no end group | Cyanobacteria, Gammaproteobacteria, Streptophyta, Ascomycota |
C45 carotenoids
| End groups | Source organisms |
|---|---|
| ψ,ψ | Actinobacteria |
| β,ψ | Bacteroidetes |
| ε,ψ | Actinobacteria |
| β,β | Actinobacteria |
| ε,ε | Actinobacteria |
C50 carotenoids
| End groups | Source organisms |
|---|---|
| ψ,ψ | Euryarchaeota, Actinobacteria, Unclassified bacteria (Halophilic bacteria) |
| β,ψ | Actinobacteria |
| ε,ψ | Actinobacteria |
| β,β | Actinobacteria |
| ε,ε | Actinobacteria, Gammaproteobacteria |
| γ,γ | Firmicutes, Actinobacteria |
Figure 4. Numbers of unique carotenoids and common carotenoids in the three domains of life (December 2016 release).