Extraction of Arabic word roots: An Approach Based on Computational Model and Multi-Backpropagation Neural Networks
Abstract
Stemming is the process of extracting the root of a given word by stripping
off the affixes attached to it. Many attempts have been made
to address the problem of stemming Arabic words. The majority of the
existing Arabic stemming algorithms require a complete set of morphological
rules and large vocabulary lookup tables. Furthermore, many of them give
more than one potential stem or root for a given Arabic word. According to
Ahmad [11], the Arabic stemming process based on the language morphological
rules is still a very difficult task due to the nature of the language itself.
The limitations of current Arabic stemming methods motivated this
research, in which we investigate a novel approach to extracting the roots
of Arabic words, named here MUAIDI-STEMMER². This approach attempts
to exploit numerical relations between Arabic letters, avoiding the need for a list
of the root and pattern of each word in the language, and giving a single root solution.
The approach is composed of two phases. Phase I depends on basic
calculations derived from linguistic analysis of Arabic patterns and affixes.
Phase II is based on an artificial neural network trained with the backpropagation
learning rule. In this phase, we formulate the root extraction problem
as a classification problem and use the neural network as the classifier.
This study demonstrates that a neural network can be used effectively to extract the roots of Arabic words.
The stemmer was tested on 46,895 Arabic word types³. Error-counting accuracy evaluation was employed to evaluate its performance.
It was successful in producing the stems of 44,107 Arabic words
from the given test datasets, with an accuracy of 94.81%.
2. Muaidi is the author's father's name.
3. Types are distinct (unique) words.
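
The abstract frames root extraction as a classification problem solved by a backpropagation-trained network, but it does not give the features, the network layout, or the training data. The following is only a minimal sketch of that general idea under stated assumptions, not the MUAIDI-STEMMER method: it assumes a per-letter formulation (each letter is classified as root or non-root), two made-up features (whether the letter is one of the ten classical augment letters, and its relative position in the word), a single small network rather than several, and four hypothetical word/root pairs. All names (`TinyMLP`, `features`, `label_letters`, `predict_root`) are illustrative.

```python
import math
import random

# The ten classical Arabic augment letters (mnemonic: سألتمونيها).
AFFIX_LETTERS = set("سألتمونيها")

# Hypothetical toy (word, root) pairs; NOT data from the study.
PAIRS = [("كاتب", "كتب"), ("مكتوب", "كتب"), ("دارس", "درس"), ("مدروس", "درس")]

def label_letters(word, root):
    """Label a letter 1.0 if it matches the next unmatched root letter, else 0.0."""
    labels, i = [], 0
    for ch in word:
        if i < len(root) and ch == root[i]:
            labels.append(1.0)
            i += 1
        else:
            labels.append(0.0)
    return labels

def features(word, pos):
    """Assumed features: augment-letter flag and relative position in the word."""
    return [1.0 if word[pos] in AFFIX_LETTERS else 0.0, pos / (len(word) - 1)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class TinyMLP:
    """One hidden layer, sigmoid units, trained by per-sample backpropagation."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # Last weight in every row is the bias.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in self.w1]
        y = sigmoid(sum(w * v for w, v in zip(self.w2, h + [1.0])))
        return h, y

    def train_epoch(self, data, lr=0.3):
        for x, t in data:
            h, y = self.forward(x)
            # Error signal at the output, then propagated back to the hidden layer.
            delta_out = (y - t) * y * (1.0 - y)
            delta_hid = [delta_out * self.w2[j] * h[j] * (1.0 - h[j])
                         for j in range(len(h))]
            for j, v in enumerate(h + [1.0]):
                self.w2[j] -= lr * delta_out * v
            for j, row in enumerate(self.w1):
                for k, v in enumerate(x + [1.0]):
                    row[k] -= lr * delta_hid[j] * v

def mse(net, data):
    return sum((net.forward(x)[1] - t) ** 2 for x, t in data) / len(data)

def predict_root(net, word):
    """Keep the letters the network scores above 0.5 as root letters."""
    return "".join(ch for i, ch in enumerate(word)
                   if net.forward(features(word, i))[1] > 0.5)

# One training sample per letter of every toy word.
DATA = [(features(w, i), t)
        for w, r in PAIRS
        for i, t in enumerate(label_letters(w, r))]

net = TinyMLP(n_in=2, n_hidden=4)
loss_before = mse(net, DATA)
for _ in range(500):
    net.train_epoch(DATA)
loss_after = mse(net, DATA)
```

Calling `predict_root(net, "مكتوب")` returns whichever letters the trained toy network marks as root letters. With features this crude, letters such as ت and س, which can be either root letters or affixes, may well be misclassified, which hints at why the choice of numerical letter features matters for an approach of this kind.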