
ChemProps: A RESTful API enabled database for composite polymer name standardization.

Bingyin Hu, Anqi Lin, L Catherine Brinson.

Abstract

Polymer names are not expressed uniformly, and the resulting inconsistency in polymer indexing is a major obstacle to widespread use of polymer-related data resources. It limits the application of materials informatics to innovation across broad classes of polymer science and polymer-based materials. The current solution, which relies on a variety of chemical identifiers, has proven insufficient to address the challenge and is not intuitive for researchers. This work proposes ChemProps, a multi-algorithm mapping methodology optimized to solve the polymer indexing issue, with a design that is easy to update in both depth and width. A RESTful API enables lightweight data exchange and easy integration across data systems. Each algorithm is assigned a weight factor, and the weighted results give a score for each candidate chemical name. The weight factors are optimized to maximize the minimum score difference between the ground-truth chemical name and the other candidate chemical names. Ten-fold validation on the 160 training data points is used to prevent overfitting. The resulting set of weight factors achieves 100% accuracy on the 54 test data points. The weight factors will evolve as ChemProps grows. With ChemProps, other polymer databases can remove duplicate entries and offer a more accurate "search by SMILES" function by using ChemProps as a common name-to-SMILES translator through API calls. Its easy-to-update design also makes ChemProps well suited to auto-populating polymer properties.
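The weight-factor objective described in the abstract (maximize the minimum score difference between the ground-truth name and every other candidate) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption for illustration: the two per-algorithm similarity scores, the toy samples, the constraint that weights sum to 1, and the coarse grid search, which stands in for whatever optimizer ChemProps actually uses. The function names `weighted_score`, `min_margin`, and `grid_search` are hypothetical and not part of ChemProps.

```python
from typing import Dict, List, Sequence, Tuple

# Toy data (invented for illustration): for each reported polymer name, every
# candidate standard name gets one similarity score per matching algorithm.
samples: List[dict] = [
    {
        "truth": "polystyrene",
        "candidates": {
            "polystyrene": (0.9, 0.6),
            "poly(styrene sulfonate)": (0.7, 0.8),
        },
    },
    {
        "truth": "poly(methyl methacrylate)",
        "candidates": {
            "poly(methyl methacrylate)": (0.5, 0.9),
            "poly(methyl acrylate)": (0.6, 0.4),
        },
    },
]


def weighted_score(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Combine per-algorithm scores into a single candidate score."""
    return sum(w * s for w, s in zip(weights, scores))


def min_margin(data: List[dict], weights: Sequence[float]) -> float:
    """Smallest gap, over all samples, between the ground-truth score and
    the best-scoring competing candidate."""
    margins = []
    for sample in data:
        cands: Dict[str, Tuple[float, ...]] = sample["candidates"]
        truth = weighted_score(cands[sample["truth"]], weights)
        best_other = max(
            weighted_score(s, weights)
            for name, s in cands.items()
            if name != sample["truth"]
        )
        margins.append(truth - best_other)
    return min(margins)


def grid_search(data: List[dict], steps: int = 10):
    """Two-algorithm case: scan weights (w, 1 - w) and keep the pair that
    maximizes the minimum margin. A stand-in for a real optimizer."""
    best_weights, best_margin = None, float("-inf")
    for i in range(steps + 1):
        w = i / steps
        weights = (w, 1.0 - w)
        margin = min_margin(data, weights)
        if margin > best_margin:
            best_weights, best_margin = weights, margin
    return best_weights, best_margin


best_weights, best_margin = grid_search(samples)
print(best_weights, best_margin)
```

A positive minimum margin means the ground-truth name outscores every competitor on every sample, so maximizing it pushes the weights toward the most robust separation rather than just the highest average score.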

Keywords:  API; Database; Materials Informatics; NanoMine; Optimization; Polymers; SMILES

Year: 2021; PMID: 33712066; PMCID: PMC7955638; DOI: 10.1186/s13321-021-00502-6

Source DB: PubMed; Journal: J Cheminform; ISSN: 1758-2946; Impact factor: 5.514


