Liberal Entity Extraction: Rapid Construction of Fine-Grained Entity Typing Systems.

Lifu Huang¹, Jonathan May², Xiaoman Pan¹, Heng Ji¹, Xiang Ren³, Jiawei Han³, Lin Zhao⁴, James A Hendler¹.

Abstract

Automatically recognizing and typing entities in natural language without prior knowledge (e.g., predefined entity types) is a major challenge in processing such data. Most existing entity typing systems are limited to certain domains, genres, and languages. In this article, we propose a novel unsupervised entity-typing framework that combines symbolic and distributional semantics. We start by learning three types of representations for each entity mention: a general semantic representation, a specific context representation, and a knowledge representation based on knowledge bases. Then we develop a novel joint hierarchical clustering and linking algorithm to type all mentions using these representations. This framework does not rely on any annotated data, predefined typing schema, or handcrafted features; therefore, it can be quickly adapted to a new domain, genre, and/or language. Experiments on two genres (news and discussion forum) show performance comparable to that of state-of-the-art supervised typing systems trained on a large amount of labeled data. Results on various languages (English, Chinese, Japanese, Hausa, and Yoruba) and domains (general and biomedical) demonstrate the portability of our framework.
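
As a rough illustration of the clustering idea in the abstract, the following is a minimal sketch and not the authors' algorithm. It assumes each mention already comes with three toy vectors (general, context, knowledge), concatenates them, and groups mentions with plain average-linkage agglomerative clustering over cosine distance. The knowledge-base linking step is omitted entirely. All names (`Mention`, `combine`, `agglomerate`), the vectors, and the threshold are hypothetical.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List


@dataclass
class Mention:
    text: str
    general: List[float]    # hypothetical general semantic vector
    context: List[float]    # hypothetical context-specific vector
    knowledge: List[float]  # hypothetical knowledge-base vector


def combine(m: Mention) -> List[float]:
    # Assumption: the three representations are simply concatenated.
    return m.general + m.context + m.knowledge


def cosine_distance(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def agglomerate(mentions: List[Mention], threshold: float) -> List[List[str]]:
    """Merge the closest pair of clusters until the smallest
    average-linkage distance exceeds `threshold`."""
    vectors = [combine(m) for m in mentions]
    clusters = [[i] for i in range(len(mentions))]

    def linkage(c1: List[int], c2: List[int]) -> float:
        total = sum(cosine_distance(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        dist, a, b = min(
            (linkage(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        if dist > threshold:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]

    return [[mentions[i].text for i in c] for c in clusters]


# Toy data, invented for illustration only.
mentions = [
    Mention("Paris", [1.0, 0.0], [0.9, 0.1], [1.0, 0.0]),
    Mention("Tokyo", [0.9, 0.1], [1.0, 0.0], [1.0, 0.0]),
    Mention("aspirin", [0.0, 1.0], [0.1, 0.9], [0.0, 1.0]),
    Mention("ibuprofen", [0.1, 0.9], [0.0, 1.0], [0.0, 1.0]),
]

clusters = agglomerate(mentions, threshold=0.5)
print(clusters)  # [['Paris', 'Tokyo'], ['aspirin', 'ibuprofen']]
```

In this sketch the resulting clusters carry no type labels; naming each cluster would require the linking step that the abstract mentions and that is not modeled here.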

Keywords: Liberal Information Extraction; fine-grained entity typing; multi-level entity mention and representation; unsupervised learning

Year: 2017 PMID: 28328252 PMCID: PMC5374868 DOI: 10.1089/big.2017.0012

Source DB: PubMed Journal: Big Data ISSN: 2167-6461 Impact factor: 2.128