
A Design-to-Device Pipeline for Data-Driven Materials Discovery.

Jacqueline M Cole

Abstract

The world needs new materials to stimulate the chemical industry in key sectors of our economy: environment and sustainability, information storage, optical telecommunications, and catalysis. Yet, nearly all functional materials are still discovered by "trial-and-error", of which the lack of predictability affords a major materials bottleneck to technological innovation. The average "molecule-to-market" lead time for materials discovery is currently 20 years. This is far too long for industrial needs, as highlighted by the Materials Genome Initiative, which has ambitious targets of up to 4-fold reductions in average molecule-to-market lead times. Such a large step change in progress can only be realistically achieved if one adopts an entirely new approach to materials discovery. Fortunately, a fundamentally new approach to materials discovery has been emerging, whereby data science with artificial intelligence offers a prospective solution to speed up these average molecule-to-market lead times.

This approach is known as data-driven materials discovery. Its broad prospects have only recently become a reality, given the timely and major advances in "big data", artificial intelligence, and high-performance computing (HPC). Access to massive data sets has been stimulated by government-regulated open-access requirements for data and literature. Natural-language processing (NLP) and machine-learning (ML) tools that can mine data and find patterns therein are becoming mainstream. Exascale HPC capabilities that can aid data mining and pattern recognition and also generate their own data from calculations are now within our grasp. These timely advances present an ideal opportunity to develop data-driven materials-discovery strategies to systematically design and predict new chemicals for a given device application.

This Account shows how data science can afford materials discovery via a four-step "design-to-device" pipeline that entails (1) data extraction, (2) data enrichment, (3) material prediction, and (4) experimental validation. Massive databases of cognate chemical and property information are first forged from "chemistry-aware" natural-language-processing tools, such as ChemDataExtractor, and enriched using machine-learning methods and high-throughput quantum-chemical calculations. New materials for a bespoke application can then be predicted by mining these databases with algorithmic encodings of relationships between chemical structures and physical properties that are known to deliver functional materials. These may take the form of classification, enumeration, or machine-learning algorithms. A data-mining workflow short-lists these predictions to a handful of lead candidate materials that go forward to experimental validation. This design-to-device approach is being developed to offer a roadmap for the accelerated discovery of new chemicals for functional applications. Case studies presented demonstrate its utility for photovoltaic, optical, and catalytic applications. While this Account is focused on applications in the physical sciences, the generic pipeline discussed is readily transferable to other scientific disciplines such as biology and medicine.
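
The four-step pipeline described in the abstract can be illustrated with a toy sketch. This is not the author's implementation and does not use ChemDataExtractor: the regular expression stands in for chemistry-aware NLP, a closed-form unit conversion stands in for ML or quantum-chemical enrichment, and the 1.0 to 1.8 eV screening window is an assumed, illustrative structure-property rule. The sentences, the `Record` class, and the function names `extract`, `enrich`, and `predict` are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Record:
    compound: str
    band_gap_ev: float               # step 1: value mined from text
    edge_nm: Optional[float] = None  # step 2: derived (enriched) property


# Invented sentences standing in for a mined literature corpus.
SENTENCES = [
    "The band gap of TiO2 is 3.2 eV.",
    "The band gap of CdTe is 1.5 eV.",
    "The band gap of GaAs is 1.4 eV.",
    "ZnO was annealed at 500 C.",  # contains no property, so it is skipped
]

# Toy stand-in for chemistry-aware NLP: one fixed sentence template.
PATTERN = re.compile(r"band gap of (\w+) is ([\d.]+) eV")


def extract(sentences: List[str]) -> List[Record]:
    """Step 1, data extraction: pull (compound, band gap) pairs from text."""
    records = []
    for sentence in sentences:
        match = PATTERN.search(sentence)
        if match:
            records.append(Record(match.group(1), float(match.group(2))))
    return records


def enrich(records: List[Record]) -> List[Record]:
    """Step 2, data enrichment: add a computed property to each record.

    Uses lambda (nm) = 1239.84 / E (eV) as a stand-in for the ML or
    quantum-chemical calculations mentioned in the abstract.
    """
    for record in records:
        record.edge_nm = 1239.84 / record.band_gap_ev
    return records


def predict(records: List[Record], lo: float = 1.0, hi: float = 1.8) -> List[Record]:
    """Step 3, material prediction: apply an assumed screening rule.

    The [lo, hi] band-gap window is illustrative only; the short-list it
    returns is what would go forward to step 4, experimental validation.
    """
    hits = [r for r in records if lo <= r.band_gap_ev <= hi]
    return sorted(hits, key=lambda r: r.band_gap_ev)


if __name__ == "__main__":
    shortlist = predict(enrich(extract(SENTENCES)))
    print([r.compound for r in shortlist])  # ['GaAs', 'CdTe']
```

Each stage consumes the previous stage's output, so any one of them could be swapped for a real extractor, calculator, or learned model without changing the others.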

Year:  2020        PMID: 32096410     DOI: 10.1021/acs.accounts.9b00470

Source DB:  PubMed          Journal:  Acc Chem Res        ISSN: 0001-4842            Impact factor:   22.384


