Putting sarcasm detection into context: the effects of class imbalance and manual labelling on supervised machine classification of Twitter conversations.
Publication: Contribution to book/anthology/report › Conference contribution in proceedings › Research › peer-reviewed
Standard
Putting sarcasm detection into context: the effects of class imbalance and manual labelling on supervised machine classification of Twitter conversations. / Abercrombie, Gavin; Hovy, Dirk.
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics – Student Research Workshop. Stroudsburg, PA: Association for Computational Linguistics, 2016. pp. 107-113.
RIS
TY - GEN
T1 - Putting sarcasm detection into context
AU - Abercrombie, Gavin
AU - Hovy, Dirk
N1 - Conference code: 54
PY - 2016
Y1 - 2016
AB - Sarcasm can radically alter or invert a phrase's meaning. Sarcasm detection can therefore help improve natural language processing (NLP) tasks. However, the majority of prior research has treated sarcasm detection as classification, with three important limitations: 1. Balanced datasets, when sarcasm is actually rather rare. 2. Using Twitter users' self-declarations in the form of hashtags to label data, when sarcasm can take many forms. 3. While contextual features have been suggested, most works use solely linguistic features. To address these issues, we create an unbalanced corpus of manually annotated Twitter conversations. We compare human and machine ability to recognize sarcasm on this data under varying amounts of context. Results indicate that both class imbalance and labelling method affect performance, and are factors that should be considered when designing automatic sarcasm detection systems. We conclude that for progress to be made in real-world sarcasm detection, we will require a new class labelling scheme that is able to access the 'common ground' held between conversational parties.
M3 - Article in proceedings
SN - 978-1-945626-02-9
SP - 107
EP - 113
BT - Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics – Student Research Workshop
PB - Association for Computational Linguistics
CY - Stroudsburg, PA
Y2 - 7 August 2016 through 12 August 2016
ER -
ID: 167581934