
dc.contributor.author: Khan, Shumaila
dc.contributor.author: Qasim, Iqbal
dc.contributor.author: Khan, Wahab
dc.contributor.author: Khan, Aurangzeb
dc.contributor.author: Ali Khan, Javed
dc.contributor.author: Qahmash, Ayman
dc.contributor.author: Ghadi, Yazeed Yasin
dc.contributor.editor: Hassani, Hossein
dc.date.accessioned: 2024-12-06T12:00:02Z
dc.date.available: 2024-12-06T12:00:02Z
dc.date.issued: 2024-12
dc.identifier.citation: Khan, S, Qasim, I, Khan, W, Khan, A, Ali Khan, J, Qahmash, A, Ghadi, Y Y & Hassani, H (ed.) 2024, 'An automated approach to identify sarcasm in low-resource language', PLoS ONE, vol. 19, no. 12, e0307186, pp. 1-29. https://doi.org/10.1371/journal.pone.0307186
dc.identifier.issn: 1932-6203
dc.identifier.other: Jisc: 2477108
dc.identifier.other: publisher-id: pone-d-24-04057
dc.identifier.other: ORCID: /0000-0003-3306-1195/work/173286320
dc.identifier.uri: http://hdl.handle.net/2299/28522
dc.description: © 2024 The Author(s). This is an open access article distributed under the Creative Commons Attribution License, to view a copy of the license, see: https://creativecommons.org/licenses/by/4.0/
dc.description.abstract[en]: Sarcasm detection has emerged due to its applicability in natural language processing (NLP) but lacks substantial exploration in low-resource languages like Urdu, Arabic, Pashto, and Roman-Urdu. While fewer studies identifying sarcasm have focused on low-resource languages, most of the work is in English. This research addresses the gap by exploring the efficacy of diverse machine learning (ML) algorithms in identifying sarcasm in Urdu. The scarcity of annotated datasets for low-resource languages is a challenge. To overcome the challenge, we curated and released a comparatively large dataset named Urdu Sarcastic Tweets (UST) Dataset, comprising user-generated comments from X (formerly Twitter). Automatic sarcasm detection in text involves using computational methods to determine if a given statement is intended to be sarcastic. However, this task is challenging due to the influence of the user’s behavior and attitude and their expression of emotions. To address this challenge, we employ various baseline ML classifiers to evaluate their effectiveness in detecting sarcasm in low-resource languages. The primary models evaluated in this study are support vector machine (SVM), decision tree (DT), K-Nearest Neighbor Classifier (K-NN), linear regression (LR), random forest (RF), Naïve Bayes (NB), and XGBoost. Our study’s assessment involved validating the performance of these ML classifiers on two distinct datasets: the Tanz-Indicator and the UST dataset. The SVM classifier consistently outperformed other ML models with an accuracy of 0.85 across various experimental setups. This research underscores the importance of tailored sarcasm detection approaches to accommodate specific linguistic characteristics in low-resource languages, paving the way for future investigations. By providing open access to the UST dataset, we encourage its use as a benchmark for sarcasm detection research in similar linguistic contexts.
dc.format.extent: 29
dc.format.extent: 1600189
dc.language.iso: eng
dc.relation.ispartof: PLoS ONE
dc.subject: Algorithms
dc.subject: Decision Trees
dc.subject: Emotions
dc.subject: Humans
dc.subject: Language
dc.subject: Machine Learning
dc.subject: Natural Language Processing
dc.subject: Support Vector Machine
dc.subject: General
dc.title[en]: An automated approach to identify sarcasm in low-resource language
dc.contributor.institution: School of Physics, Engineering & Computer Science
dc.contributor.institution: Cybersecurity and Computing Systems
dc.contributor.institution: Biocomputation Research Group
dc.contributor.institution: Department of Computer Science
dc.description.status: Peer reviewed
dc.identifier.url: http://www.scopus.com/inward/record.url?scp=85211569730&partnerID=8YFLogxK
rioxxterms.versionofrecord: 10.1371/journal.pone.0307186
rioxxterms.type: Journal Article/Review
herts.preservation.rarelyaccessed: true

