Using frame semantics for knowledge extraction from Twitter

Anders Søgaard; Barbara Plank; Hector Martinez Alonso

Closed: Center for Sprogteknologi

8 citations (Scopus)

Abstract

Knowledge bases have the potential to advance artificial intelligence, but often suffer from recall problems, i.e., a lack of knowledge of new entities and relations. In contrast, social media such as Twitter provide an abundance of data in a timely manner: information spreads at an incredible pace and is posted long before it makes it into more commonly used resources for knowledge extraction. In this paper we address the question of whether we can exploit social media to extract new facts, which may at first seem like finding needles in haystacks. We collect tweets about 60 entities in Freebase and compare four methods for extracting binary relation candidates, based on syntactic and semantic parsing and a simple mechanism for factuality scoring. The extracted facts are manually evaluated in terms of their correctness and relevance for search. We show that moving from bottom-up syntactic or semantic dependency parsing formalisms to top-down frame-semantic processing improves the robustness of knowledge extraction, producing more intelligible fact candidates of better quality. In order to evaluate the quality of frame-semantic parsing on Twitter intrinsically, we make a multiply frame-annotated dataset of tweets publicly available.

Original language	English
Title of host publication	PROCEEDINGS OF THE TWENTY-NINTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE: AAAI 2015
Number of pages	6
Publisher	Association for the Advancement of Artificial Intelligence
Publication date	1 Jun 2015
Pages	2447-2452
Status	Published - 1 Jun 2015

Access to document

http://www.aaai.org/ocs/index.php/AAAI/AAAI15/paper/view/9349

Citation formats

Using frame semantics for knowledge extraction from Twitter. / Søgaard, Anders; Plank, Barbara; Martinez Alonso, Hector.

PROCEEDINGS OF THE TWENTY-NINTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE: AAAI 2015. Association for the Advancement of Artificial Intelligence, 2015. pp. 2447-52.

Publication: Contribution to book/anthology/report › Article in proceedings › Research › peer-review

@inproceedings{7047e61b7bad488fa2d92553636b8a6e,

title = "Using frame semantics for knowledge extraction from Twitter",

author = "Anders S{\o}gaard and Barbara Plank and {Martinez Alonso}, Hector",

year = "2015",

month = jun,

day = "1",

language = "English",

pages = "2447--52",

booktitle = "PROCEEDINGS OF THE TWENTY-NINTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE",

publisher = "Association for the Advancement of Artificial Intelligence",

}

TY - GEN

T1 - Using frame semantics for knowledge extraction from Twitter

AU - Søgaard, Anders

AU - Plank, Barbara

AU - Martinez Alonso, Hector

PY - 2015/6/1

Y1 - 2015/6/1

M3 - Article in proceedings

SP - 2447

EP - 2452

BT - PROCEEDINGS OF THE TWENTY-NINTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE

PB - Association for the Advancement of Artificial Intelligence

ER -
