Unsupervised Learning-based Anomalous Arabic Text Detection

Abouzakhar, Nasser, Allison, Ben and Guthrie, Louise (2008) Unsupervised Learning-based Anomalous Arabic Text Detection. In: Proceedings of the 6th Language Resources and Evaluation Conference: LREC 2008. UNSPECIFIED, MAR, pp. 291-296.

Modern society increasingly depends on the Web as a vital source of information and communication. However, the Web has also become an ideal channel for terrorist organisations to publish misleading information and to send unintelligible messages to their clients. The growing volume of anomalous, misleading information published on the Web has led to an increase in security threats, and existing Web security mechanisms and protocols are not designed to deal with such recently developed problems. Developing technology to detect anomalous textual information has therefore become one of the major challenges within the NLP community. This paper introduces the problem of anomalous text detection and approaches it by automatically extracting linguistic features from documents and evaluating those features for patterns of suspicious and/or inconsistent information in Arabic documents. To this end, we define specific linguistic features that characterise various Arabic writing styles. The paper also introduces the main challenges in Arabic processing and describes the proposed unsupervised learning model for detecting anomalous Arabic textual information.

Item Type	Book Section
Keywords	natural language processing, arabic text processing, anomalous text detection, unsupervised learning
Date Deposited	15 May 2025 16:31
Last Modified	30 May 2025 23:13

File	N_Abouzakhar_2.pdf (Published Version; restricted to repository staff only)
