An intelligent use of stemmer and morphology analysis for Arabic information retrieval


Alnaied A., Elbendak M., Bulbul A.

Egyptian Informatics Journal, vol.21, no.4, pp.209-217, 2020 (Journal Indexed in SCI Expanded)

  • Publication Type: Article / Article
  • Volume: 21 Issue: 4
  • Publication Date: 2020
  • DOI Number: 10.1016/j.eij.2020.02.004
  • Title of Journal: Egyptian Informatics Journal
  • Page Numbers: pp.209-217

Abstract

Arabic Information Retrieval has gained significant attention due to the increasing use of Arabic text on the web and social media networks. This paper discusses a new approach to Arabic stemming, called Arabic Morphology Information Retrieval (AMIR), which generates/extracts stems by applying a set of rules on the relationships among Arabic letters to find the root/stem of the words used as indexing terms for text search in Arabic retrieval systems. To demonstrate the usefulness of the proposed algorithm, we highlight the benefits of the proposed rules for different Arabic information retrieval systems. Finally, we evaluate the AMIR system by comparing its performance with LUCENE, FARASA, and a no-stemmer baseline in terms of mean average precision. The results show that AMIR achieves a mean average precision of 0.34, while LUCENE, FARASA, and the no-stemmer baseline achieve 0.27, 0.28, and 0.21, respectively. This demonstrates that AMIR improves Arabic stemming and retrieval effectiveness while remaining robust across stem types.
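
The abstract compares systems by mean average precision (MAP). For readers unfamiliar with the metric, the following is a minimal sketch of the standard MAP definition, not the authors' evaluation code. The function names, the `runs` and `qrels` dictionaries, and the toy document IDs are all hypothetical, and the sketch assumes each ranked list contains no duplicate document IDs.

```python
from typing import Dict, List, Set


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average precision for one query.

    Sums precision@k at every rank k that holds a relevant document,
    then divides by the total number of relevant documents.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Dict[str, List[str]], qrels: Dict[str, Set[str]]
) -> float:
    """Mean of the per-query average precision over all queries in `runs`."""
    if not runs:
        return 0.0
    total = sum(
        average_precision(ranked, qrels.get(query_id, set()))
        for query_id, ranked in runs.items()
    )
    return total / len(runs)


# Toy data (hypothetical): ranked results per query and relevance judgments.
runs = {"q1": ["d1", "d2", "d3"], "q2": ["d4", "d5", "d6"]}
qrels = {"q1": {"d1", "d3"}, "q2": {"d5"}}

print(round(mean_average_precision(runs, qrels), 4))
```

Under this definition MAP lies between 0 and 1, so a stemmer that maps more query and document word forms to the same index term can raise the score by surfacing relevant documents at earlier ranks.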