
BD2K Training Coordinating Center's ERuDIte: the Educational Resource Discovery Index for Data Science.

José Luis Ambite, Lily Fierro, Jonathan Gordon, Gully A Burns, Florian Geigl, Kristina Lerman, John D Van Horn.

Abstract

Data science is a field that has developed to enable efficient integration and analysis of increasingly large data sets in many domains. In particular, big data in genetics, neuroimaging, mobile health, and other subfields of biomedical science promises new insights but also poses challenges. To address these challenges, the National Institutes of Health launched the Big Data to Knowledge (BD2K) initiative, including a Training Coordinating Center (TCC) tasked with developing a resource for personalized data science training for biomedical researchers. The BD2K TCC web portal is powered by ERuDIte, the Educational Resource Discovery Index, which collects training resources for data science, including online courses, videos of tutorials and research talks, textbooks, and other web-based materials. While the availability of so many potential learning resources is exciting, the resources are highly heterogeneous in quality, difficulty, format, and topic, which makes the field intimidating to enter and difficult to navigate. Moreover, data science is rapidly evolving, so there is a constant influx of new materials and concepts. We leverage data science techniques to build ERuDIte itself, using data extraction, data integration, machine learning, information retrieval, and natural language processing to automatically collect, integrate, describe, and organize existing online resources for learning data science.


Keywords: H.2.0.b Database design, modeling and management; H.2.8.c Data and knowledge visualization; I.2.1.d Education; I.2.6.g Machine learning

Year: 2019; PMID: 35548703; PMCID: PMC9089329; DOI: 10.1109/tetc.2019.2903466

Source DB: PubMed; Journal: IEEE Trans Emerg Top Comput; ISSN: 2168-6750; Impact factor: 6.595


