Dominik Kopczynski, Nils Hoffmann, Bing Peng, Gerhard Liebisch, Friedrich Spener, Robert Ahrends.
Abstract
Goslin is the first grammar-based computational library for parsing and normalizing lipid names according to the hierarchical lipid shorthand nomenclature. The new version, Goslin 2.0, implements the latest nomenclature and adds a grammar that recognizes systematic IUPAC-IUB fatty acyl names as stored, e.g., in the LIPID MAPS database. This makes it well suited to updating lipid names in the LIPID MAPS or HMDB databases to the latest nomenclature. Goslin 2.0 is available as a standalone web application with a REST API as well as C++, C#, Java, Python 3, and R libraries. Importantly, it can be easily included in lipidomics tools and scripts, providing direct access to translation functions. All implementations are open source.
Year: 2022 PMID: 35404045 PMCID: PMC9047418 DOI: 10.1021/acs.analchem.1c05430
Source DB: PubMed Journal: Anal Chem ISSN: 0003-2700 Impact factor: 6.986
Hierarchical Presentation of a Shorthand Notation for Oxygenated Phosphatidylethanolamine PE 16:1(6Z)/16:0;5OH[R],8OH[S];3oxo
| level | lipid name |
|---|---|
| category | GP |
| class | PE |
| species level | PE 32:2;O3 |
| molecular species level | PE 16:1_16:1;O3 |
| sn-position level | PE 16:1/16:1;O3 |
| structure defined level | PE 16:1(6)/16:0;(OH)2;oxo |
| full structure level | PE 16:1(6Z)/16:0;5OH,8OH;3oxo |
| complete structure level | PE 16:1(6Z)/16:0;5OH[R],8OH[S];3oxo |
From top to bottom, the structural information of the molecule increases. The species level provides information about the head group plus aggregated information on the fatty acyl chains. The molecular species level provides aggregated information about each constituent fatty acyl chain, with unknown sn-positions. The sn-position level clarifies stereo-specific numbering. Up to this level, the double bonds in the functional groups may be aggregated into the double bond equivalent. The structure defined level resolves functional groups in the constituent fatty acyl chains. The full structure level adds position information, while the complete structure level adds all stereo-chemical information.
Figure 1. Exemplary illustrations of nested patterns: left, triacylglycerol with additional O-acyl linkage;[3] right, branched fatty acyl chain (LIPID MAPS-ID LMFA01160041) can be aligned schematically into the substitution blocks [lipid class] and [chain specification]. Here, the blocks [chain specification] (blue) are substituted into their successors. Some [functional group] blocks are again substituted into a [chain specification] block (gray) to describe the attached fatty acyl (left) or alkyl (right) branches in their lipids. A chain specification appears within another chain specification (gray within blue).
Examples for Lipid Naming by IUPAC-IUB and Standardization by Shorthand Notation
| IUPAC-IUB name | LIPID MAPS ID | standardized name |
|---|---|---|
| 5-methyl-octadecanoic acid | LMFA01020216 | FA 18:0;5Me |
| 2-docosyl-3-hydroxy-28,29-epoxy-30-methyl-pentacontanoic acid | LMFA01160100 | FA 50:0;2(22:0);30Me;28Ep;3OH |
| 11 | LMFA03010032 | FA 15:0;[4-8cy5:0;7OH;5oxo];11oxo;15COOH |
| | LMFA08040030 | NAE 20:3(5 |
Number of Parsed Lipids per Database (all database snapshots were acquired in July 2021)
| | LIPID MAPS | SwissLipids | HMDB |
|---|---|---|---|
| total no. of lipids | 45 552 | 777 956 | 90 688 |
| total no. of FA, GL, GP, SP, and ST | 35 556 | 777 956 | 87 775 |
| no. of converted FA, GL, GP, SP, and ST by Goslin 2.0 | 29 098 (81.8%) | 771 287 (99.1%) | 85 179 (97.0%) |
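The percentages in the last row are taken relative to the FA, GL, GP, SP, and ST subset, not to the total number of lipids in each database. A short Python check, using only the counts from the table (the variable and function names are illustrative), recomputes them:

```python
# (converted, eligible) counts per database, copied from the table above.
counts = {
    "LIPID MAPS": (29098, 35556),
    "SwissLipids": (771287, 777956),
    "HMDB": (85179, 87775),
}


def conversion_rate(converted: int, eligible: int) -> float:
    """Percentage of eligible lipids that were converted, to one decimal."""
    return round(100 * converted / eligible, 1)


rates = {db: conversion_rate(*pair) for db, pair in counts.items()}
for db, rate in rates.items():
    print(f"{db}: {rate}%")
```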