L-HSAB: A Levantine Twitter Dataset for Hate Speech and Abusive Language

Mulki, Hala; Haddad, Hatem; Ali, Chedi Bechikh; Alshabani, Halima

dc.contributor.author	Mulki, Hala
dc.contributor.author	Haddad, Hatem
dc.contributor.author	Ali, Chedi Bechikh
dc.contributor.author	Alshabani, Halima
dc.date.accessioned	2025-01-21T16:33:07Z
dc.date.available	2025-01-21T16:33:07Z
dc.date.issued	2019
dc.department	Kırıkkale Üniversitesi
dc.description	3rd Workshop on Abusive Language Online -- AUG 01, 2019 -- Florence, ITALY
dc.description.abstract	Hate speech and abusive language have become common on Arabic social media. Automatic hate speech and abusive language detection systems can facilitate the prohibition of toxic textual content. The complexity, informality, and ambiguity of Arabic dialects have hindered the provision of the resources needed for Arabic abusive language and hate speech detection research. In this paper, we introduce the first publicly available Levantine Hate Speech and Abusive (L-HSAB) Twitter dataset, intended as a benchmark dataset for automatic detection of toxic Levantine content online. We further provide a detailed review of the data collection steps and of how we designed the annotation guidelines such that reliable dataset annotation is guaranteed. This was later confirmed by a comprehensive evaluation of the annotations: the annotation agreement metrics Cohen's kappa (κ) and Krippendorff's alpha (α) indicated that the annotations are consistent.
dc.description.sponsorship	UCLA,Google,Facebook,Element AI,Aylien
dc.identifier.endpage	118
dc.identifier.isbn	978-1-950737-43-7
dc.identifier.startpage	111
dc.identifier.uri	https://hdl.handle.net/20.500.12587/23730
dc.identifier.wos	WOS:000538480400012
dc.identifier.wosquality	N/A
dc.indekslendigikaynak	Web of Science
dc.language.iso	en
dc.publisher	Association for Computational Linguistics (ACL)
dc.relation.ispartof	Third Workshop on Abusive Language Online
dc.relation.publicationcategory	Conference Item - International - Institutional Faculty Member
dc.rights	info:eu-repo/semantics/openAccess
dc.snmz	KA_20241229
dc.title	L-HSAB: A Levantine Twitter Dataset for Hate Speech and Abusive Language
dc.type	Conference Object

Files

Original bundle

Name: Full Text
Size: 1.46 MB
Format: Adobe Portable Document Format

Collections

WOS Indexed Publications Collection
Conference Papers and Presentations Collection