Mining big data to extract patterns and predict real-life outcomes.

Michal Kosinski, Yilun Wang, Himabindu Lakkaraju, Jure Leskovec.

Abstract

This article aims to introduce the reader to essential tools that can be used to obtain insights and build predictive models using large data sets. Recent user proliferation in the digital environment has led to the emergence of large samples containing a wealth of traces of human behaviors, communication, and social interactions. Such samples offer the opportunity to greatly improve our understanding of individuals, groups, and societies, but their analysis presents unique methodological challenges. In this tutorial, we discuss potential sources of such data and explain how to efficiently store them. Then, we introduce two methods that are often employed to extract patterns and reduce the dimensionality of large data sets: singular value decomposition and latent Dirichlet allocation. Finally, we demonstrate how to use dimensions or clusters extracted from data to build predictive models in a cross-validated way. The text is accompanied by examples of R code and a sample data set, allowing the reader to practice the methods discussed here. A companion website (http://dataminingtutorial.com) provides additional learning resources. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

Year:  2016        PMID: 27918179     DOI: 10.1037/met0000105

Source DB:  PubMed          Journal:  Psychol Methods        ISSN: 1082-989X


  8 in total

1.  A methodology for preprocessing structured big data in the behavioral sciences.

Authors:  Paul A Brown; Ricardo A Anderson
Journal:  Behav Res Methods       Date:  2022-06-29

2.  Personalization in Australian K-12 classrooms: how might digital teaching and learning tools produce intangible consequences for teachers' workplace conditions?

Authors:  Janine Aldous Arantes
Journal:  Aust Educ Res       Date:  2022-04-28

3.  Text message content as a window into college student drinking: Development and initial validation of a dictionary of "alcohol talk".

Authors:  Michaeline Jensen; Andrea Hussong
Journal:  Int J Behav Dev       Date:  2019-11-26

4.  Evidence for Distinct Facial Signals of Reward, Affiliation, and Dominance from Both Perception and Production Tasks.

Authors:  Jared D Martin; Adrienne Wood; William T L Cox; Scott Sievert; Robert Nowak; Eva Gilboa-Schechtman; Fangyun Zhao; Zachary Witkower; Andrew T Langbehn; Paula M Niedenthal
Journal:  Affect Sci       Date:  2021-02-03

5.  The Effect of Moral Congruence of Calls to Action and Salient Social Norms on Online Charitable Donations: A Protocol Study.

Authors:  Nikola Erceg; Matthias Burghart; Alessia Cottone; Jessica Lorimer; Kiran Manku; Hannah Pütz; Denis Vlašiček; Manou Willems
Journal:  Front Psychol       Date:  2018-10-26

6.  Peace Data Standard: A Practical and Theoretical Framework for Using Technology to Examine Intergroup Interactions.

Authors:  Rosanna E Guadagno; Mark Nelson; Laurence Lock Lee
Journal:  Front Psychol       Date:  2018-05-28

7.  ASIA: Automated Social Identity Assessment using linguistic style.

Authors:  Miriam Koschate; Elahe Naserian; Luke Dickens; Avelie Stuart; Alessandra Russo; Mark Levine
Journal:  Behav Res Methods       Date:  2021-02-11

8.  Text mining of Reddit posts: Using latent Dirichlet allocation to identify common parenting issues.

Authors:  Elizabeth M Westrupp; Christopher J Greenwood; Matthew Fuller-Tyszkiewicz; Tomer S Berkowitz; Lauryn Hagg; George Youssef
Journal:  PLoS One       Date:  2022-02-02       Impact factor: 3.240
