
Evaluating named entity recognition tools for extracting social networks from novels.

Niels Dekker, Tobias Kuhn, Marieke van Erp.

Abstract

The analysis of literary works has seen a surge in computer-assisted processing. To gain insight into the community structures and social interactions portrayed in novels, the creation of social networks from novels has gained popularity. Many methods construct these networks by identifying named entities and relations, but many of the tools used were not created specifically for the literary domain. Furthermore, studies on information extraction from literature typically focus on 19th and early 20th century source material, so it is unclear whether these techniques are as suitable for modern-day literature as they are for older novels. We present a study in which we evaluate natural language processing tools for the automatic extraction of social networks from novels, and we examine the structure of the resulting networks. We find no significant differences between old and modern novels, but both are subject to a large amount of variance. Furthermore, we identify several issues that complicate named entity recognition in our set of novels and present methods to remedy them. We see this work as a step toward creating more culturally aware AI systems.
© 2019 Dekker et al.
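
The abstract describes building social networks from the named entities found in a novel. As a rough illustration only, the sketch below shows one common way such a network could be assembled: counting how often two character names appear in the same sentence. The abstract does not specify the authors' procedure, so this is not their method. The function name `build_cooccurrence_network`, the per-sentence input format, and the toy character names are all assumptions made for this example; the entity lists stand in for the output of whichever named entity recognition tool is used.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence_network(sentence_entities):
    """Count how often each pair of characters is mentioned in the same sentence.

    sentence_entities: list of lists of character names, one list per sentence
    (a hypothetical stand-in for the output of a named entity recognition step).
    Returns a Counter mapping sorted (name_a, name_b) pairs to edge weights.
    """
    edges = Counter()
    for entities in sentence_entities:
        # Deduplicate within a sentence and sort so each pair has one canonical key.
        unique = sorted(set(entities))
        for pair in combinations(unique, 2):
            edges[pair] += 1
    return edges


# Toy input: invented character mentions, not data from the study.
mentions = [
    ["Alice", "Bob"],
    ["Alice", "Bob", "Carol"],
    ["Carol"],
    ["Bob", "Alice"],
]
network = build_cooccurrence_network(mentions)
```

Each key in `network` is an undirected edge between two characters and each value is its weight. Any error in the entity recognition step (a missed name, or two aliases of one character treated as separate people) carries straight through to the nodes and edges of the network.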


Keywords:  Classic and modern literature; Cultural AI; Digital humanities; Evaluation; Named entity recognition; Social networks

Year:  2019        PMID: 33816842      PMCID: PMC7924459          DOI: 10.7717/peerj-cs.189

Source DB:  PubMed          Journal:  PeerJ Comput Sci        ISSN: 2376-5992


