Article

AraMAMS: Arabic Multi-Aspect, Multi-Sentiment Restaurants Reviews Corpus for Aspect-Based Sentiment Analysis

Information Technology Department, College of Computer and Information Sciences, King Saud University, Riyadh 11362, Saudi Arabia
Sustainability 2023, 15(16), 12268; https://doi.org/10.3390/su151612268
Submission received: 24 June 2023 / Revised: 7 August 2023 / Accepted: 8 August 2023 / Published: 11 August 2023

Abstract

The abundance of data on the internet makes automated analysis essential. Aspect-based sentiment analysis helps extract valuable information from textual data. Because Arabic resources are limited, this paper enriches the Arabic dataset landscape by creating AraMA, the first and largest Arabic multi-aspect corpus. AraMA comprises 10,739 Google Maps reviews of restaurants in Riyadh, Saudi Arabia. It covers four aspect categories (food, environment, service, and price) along with four sentiment polarities: positive, negative, neutral, and conflict. All AraMA reviews are labeled with at least two aspect categories. A second version, named AraMAMS, includes only reviews labeled with at least two different sentiments, making it the first Arabic multi-aspect, multi-sentiment dataset. AraMAMS has 5312 reviews covering the same four aspect categories and sentiment polarities. Both corpora were evaluated using naïve Bayes (NB), support vector classification (SVC), linear SVC, and stochastic gradient descent (SGD) models. In the AraMA corpus, the aspect category task achieved an F1 measure of 91.41% using the SVC model, while in the AraMAMS corpus, the best F1 measure for the aspect category task reached 91.70% using the linear SVC model.

1. Introduction

With the recent growth of social media usage, it has become essential to exploit online user-generated information to enhance products and services and to create more effective marketing efforts. For instance, analyzing consumers’ feelings and opinions in reviews on e-commerce platforms is very important, as it reveals customers’ satisfaction levels. This type of analysis can provide businesses with valuable insights into customer sentiment, brand perception, market trends, and investment opportunities. By leveraging these insights, businesses can enhance customer satisfaction, brand reputation, market competitiveness, and financial performance. Overall, assessing customer reviews is a critical component of constructing a strong and equitable infrastructure that promotes economic growth and improves quality of life and wellbeing.
However, analyzing this opinion data manually would be impossible, given the enormous volume of textual content. As a result, the field of sentiment analysis (SA) has emerged as an AI tool that allows automatic extraction of the knowledge about opinions, emotions, and attitudes concealed within unstructured texts. Yet, SA only provides a view of what people like or dislike. It basically classifies a given text into positive, negative, and neutral sentiments [1]. Aspect-based sentiment analysis (ABSA) is a field of SA that goes one step further than SA by automatically assigning sentiments to certain features or aspects in the text. The primary goal of ABSA is to extract the relevant aspects and then classify them into different sentiment polarities [1]. This entails breaking down text data into smaller fragments in order to obtain deeper, more granular insights. As such, all relationships between the entities involved must be appropriately identified and linked to the conveyed sentiment. Thus, the main challenge of this task is to distinguish between different opinion contexts for different aspects or targets. Of late, ABSA has become one of the most important tasks of SA, since it can extract a deeper insight from text to ensure that the right decisions are made, and provide a clearer image of weaknesses.
ABSA can play a significant role in supporting the Sustainable Development Goals (SDGs) of the 2030 Agenda, which were adopted by the United Nations to address global challenges and promote sustainable development [2]. This can be accomplished by providing insights into the sustainability performance of businesses. For instance, ABSA for customer reviews can help in assessing the sustainability efforts of restaurants, hotels, retail stores, or service providers by analyzing sentiments related to specific aspects of sustainability, and this allows for a comprehensive assessment. Analyzing restaurant customer reviews is critical for constructing high-quality, long-lasting, and robust infrastructure to promote economic development and human wellbeing. If restaurant owners analyze customer reviews, they will be more aware of their needs, enabling them to direct efforts and money to improve these aspects more quickly and with less effort. As a result, client happiness and loyalty rise, resulting in higher revenue and economic progress. Furthermore, restaurant owners can focus more on enabling cheap and equitable access to high-quality food and improve eating experiences for all consumers in order to obtain more positive comments which, in turn, will increase the number of visitors to the restaurant. Overall, assessing customer reviews is a critical component of constructing a strong and equitable infrastructure that promotes economic growth and the wellbeing of all people.
In recent years, researchers have carried out a great deal of work on SA and its applications. However, ABSA studies are still scarce compared to SA research, especially for the Arabic language. There are two main reasons for this: the lack of labelled dataset resources in Arabic, and the complexity of the Arabic language [3]. There are three varieties of the Arabic language: Classical Arabic (CA), which is used in the Holy Qur’an of Islam; Modern Standard Arabic (MSA), which is used in official contexts such as newspapers and education; and Dialectal Arabic (DA), which is used in daily conversation and in most social media content. DA also differs from one Arab nation to the next, and does not have a standard orthography [4].
Researchers have become more interested in ABSA in the past few years. There are different studies in the literature regarding ABSA in the English language; however, there is a lack of Arabic research in this field. In addition, it is clear that the field of Arabic ABSA suffers from the small number of available well-created corpora that would help the Arabic research community. Therefore, to bridge this gap, we aim to enhance the Arabic dataset resources available for serving ABSA studies. In this study, we create two versions of Arabic ABSA corpora, and provide an in-depth analysis of restaurant reviews in the city of Riyadh by identifying the most important aspects, such as price, environment, food, and service quality, that affect restaurants. This will make it easier to pinpoint exactly what customers like and dislike, and thus improve the business in question. We collected 21,330 Arabic reviews from Google Maps about restaurants in Riyadh, Saudi Arabia. These reviews have been manually annotated by Arab annotators who can understand Saudi DA. To make data more reliable, annotation guidelines were created. The process was carried out in two rounds. To our knowledge, this work is the first study targeting the field of restaurant ABSA in Arabic.
In summary, this paper makes the following contributions:
  • Creating the first and the largest Arabic multi-aspect corpus (AraMA) with a total of 10,739 reviews from Google Maps related to restaurants in Riyadh, and analyzing its suitability for the ABSA task.
  • The authors of [5] showed that if an ABSA corpus contains sentences whose aspects are all labeled with the same sentiment, ABSA is reduced to the sentence level, and classifiers can obtain good results without considering aspects (i.e., the task returns to classical sentiment analysis). We therefore investigated this effect on Arabic data by generating a second version of AraMA, which includes only reviews labeled with different sentiments. This is the first Arabic corpus of Google Maps reviews of Riyadh restaurants with multi-aspect, multi-sentiment properties; it is named AraMAMS.
  • The annotation guidelines are highlighted to help researchers in future studies.
The rest of the paper is organized as follows. Section 2 reviews the related work. Section 3 describes the process of collecting and cleaning the data. Section 4 explains the workflow of the annotation process. Section 5 contains an exploratory data analysis of both corpora. Section 6 concerns corpus evaluation. Section 7 concludes the paper and discusses potential future work. Section 8 contains corpus availability details.

2. Related Work

There are several studies in the Arabic language for ABSA. We reviewed the Arabic corpora that have been created and labeled for ABSA in the literature. Table 1 provides a summary of the datasets used regarding their domain, size, Arabic language type, publicity, predefined aspect categories, sentiment polarity, and, if applicable, the platform used during annotation. For a comprehensive review of the available Arabic ABSA research, we refer the reader to [3].
In 2015, Ref. [6] provided the first research benchmark dataset for ABSA. The authors created the human-annotated Arabic dataset for book reviews (HAAD) [15]. It consists of 1513 annotated book reviews taken from the large Arabic book review corpus (LABR) dataset that was essentially created for SA [16]. The authors annotated aspect terms, aspect categories, and sentiment polarities. In Ref. [7], published in the same year, the authors collected 200 reviews from forums, Facebook, YouTube, and Google search. Then, they extracted aspects using part-of-speech tagging (POS) and manually annotated sentiment polarity.
In 2016, Al-Sarhan et al. [8] collected 2265 Arabic news posts related to the Gaza conflict, associated comments from Al Jazeera and Al Arabiya (well-known Arabic news networks), and related posts on Facebook. They annotated the posts’ aspect categories, aspect terms, and sentiment polarities. Additionally, they annotated comment categories and sentiment polarities. For both posts and comments, only the most dominant aspect category was chosen. The BRAT tool was used in this study to ease annotation. In the same year, the Semantic Evaluation (SemEval) workshop released an Arabic hotel reviews dataset for ABSA, which has been used as a benchmark ever since. The dataset includes a total of 2291 annotated review sentences, of which 1839 were used for training and 452 for testing. The sentences were gathered from hotel booking websites such as Booking.com and TripAdvisor.com (both accessed on 23 June 2023). The selected reviews belong to hotels in different Arab cities such as Dubai, Mecca, Amman, and Beirut. The annotators labeled aspect terms, aspect categories, and sentiment polarities. The aspect category annotations were more detailed than in earlier corpora; annotators were required to identify entities and attributes, and the category field had to be completed using the syntax Entity#Attribute. Entities were predefined as hotel, rooms, room_amenities, facilities, service, location, and food&drinks. Attributes were defined as general, prices, design&features, cleanliness, comfort, quality, style&options, and miscellaneous. For example, in the sentence “the rooms are comfortable”, the entity is the room, and the attribute is comfort. Thus, the category field would take the value room#comfort [17].
A much simpler corpus was subsequently created. The authors of [10] compiled a total of 5000 tweets related to the service of Saudi airlines. They annotated aspect categories and sentiment polarities. Additionally, in Ref. [11], customers’ sentiments were extracted, using machine learning and deep learning approaches, from 1098 tweets that the authors collected regarding the Saudi telecommunication companies STC, Mobily, and Zain. The paper was part of an ongoing project. The authors extracted the available aspects, such as internet, customer services, network, billing, packages, and general. Sentiment polarities were annotated manually using the DataTracking website.
In 2020, a total of 7934 tweets related to Qassim University were collected by the authors of [12]. Annotators labelled aspect categories and sentiment polarities. In [13], 1000 Arabic book reviews were selected from the LABR dataset and annotated. Annotators labelled aspect terms and sentiment polarity terms. Although a review can contain more than one aspect, they assigned a single sentiment to the entire review, much like SA. Additionally, in [14], a total of 2071 Arabic reviews were selected from the Apple Store and Google Play. These reviews cover 60 different mobile apps created by the United Arab Emirates’ government. Annotation was carried out with a specially designed computer application named “GARSA”. Annotators labelled aspect terms, sentiment words, and aspect categories.
In our review of the literature, we noted that most of the datasets were collected from multiple sources such as Twitter, YouTube, Facebook, application reviews, and various websites. The datasets were mostly annotated manually by researchers, and most of them are not publicly available. Currently, the SemEval 2016 Arabic hotel reviews dataset is the published dataset that best represents a benchmark for ABSA in the Arabic language. However, it contains multiple records that have only one aspect category. Thus, in using it, we would lose the advantage of ABSA, since the task is reduced to sentence-level sentiment analysis, according to the study of Jiang et al. [5]. That study showed that if a dataset consists of such sentence-level reviews, classifiers can still achieve competitive results without considering aspects. Further, advanced ABSA methods trained on these datasets can hardly distinguish the sentiment polarities of the various aspects in sentences that contain multiple aspects and multiple sentiments. This encourages us to enrich the field with a well-created Arabic ABSA corpus for Riyadh restaurants. The creation process is described in detail in the following sections.

3. Dataset Creation

In this paper, the aim is to create an Arabic multi-aspect (AraMA) corpus of Riyadh restaurants for ABSA. We decided to remove sentences with one aspect category from the original dataset to prevent the reduction of ABSA to the sentence level, as proven in [5]. All sentences in AraMA will have at least two aspect categories. After that, we will create an Arabic multi-aspect, multi-sentiment (AraMAMS) version, which contains only sentences with different sentiment polarities. Both datasets have sentences with at least two aspect categories, but the difference is in their sentiments. AraMA may contain sentences with the same sentiments, while AraMAMS only contains sentences with different sentiments. Figure 1 summarizes the creation workflow.

3.1. Data Collection

We wanted to target the dialectal Arabic that Saudi people use in daily life. Thus, we decided to collect Google Maps reviews of Riyadh restaurants. We used the Instant Data Scraper extension for Google Chrome to collect the reviews in an Excel sheet [18]. We collected the most recent reviews of well-known restaurants located in highly visited areas of Riyadh from the Google Maps website. A total of 21,330 reviews were collected from 61 restaurants. The Arabic reviews were clearly mostly in DA, and there were few English reviews. The data included the reviewer’s username, number of reviews, user title, review, an image link if available, review date, thumbs up, and a reply from the restaurant owner if available. We were only interested in the review text. Figure 2 illustrates one comment section in Google Maps, with the user review highlighted in red.

3.2. Corpus Cleaning and Preprocessing

To increase the accuracy of the opinion-mining process and to prevent excessive processing overhead, we first removed empty reviews from the Excel sheet. Following that, a regular expression tool (regex [19]) in Python was used to preprocess the reviews on the Google Colab platform [20]. Regex was used to check whether a string contained a specified search pattern. The preprocessing steps included multiple tasks, e.g., the removal of unnecessary characters such as punctuation, diacritics, numbers, emojis, and all English letters. After that, stemming was performed to remove repeated characters, and normalization was carried out. Table 2 shows examples of parts of the reviews before and after the preprocessing task. At this step, we removed about 2678 reviews that had become empty after preprocessing.
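The cleaning steps listed above can be sketched with Python's built-in `re` module. This is only an illustration under stated assumptions: the Unicode ranges, the three-or-more threshold for repeated characters, and the alef-only normalization are our choices for the example, and the function name `preprocess` is hypothetical; the exact patterns used to build the corpus are not given in the text.

```python
import re

# Assumed character sets (not taken from the paper).
# Arabic diacritics (tashkeel), superscript alef, and tatweel.
DIACRITICS = re.compile(r"[\u064B-\u0652\u0670\u0640]")
# Anything that is not an Arabic letter or whitespace: punctuation,
# digits, emojis, and Latin letters are all replaced in one pass.
NON_ARABIC = re.compile(r"[^\u0621-\u064A\s]")
# The same character repeated three or more times in a row.
REPEATS = re.compile(r"(.)\1{2,}")
# Alef with madda / hamza above / hamza below.
ALEF_VARIANTS = re.compile(r"[\u0622\u0623\u0625]")


def preprocess(review: str) -> str:
    """Clean one review; returns an empty string if nothing Arabic remains."""
    text = DIACRITICS.sub("", review)       # drop diacritics without splitting words
    text = NON_ARABIC.sub(" ", text)        # drop punctuation, digits, emojis, Latin
    text = REPEATS.sub(r"\1", text)         # collapse elongated characters
    text = ALEF_VARIANTS.sub("\u0627", text)  # normalize alef variants to bare alef
    return " ".join(text.split())           # squeeze whitespace


reviews = ["الأكل ممتاااز!!! very good 10/10", "Great food 10/10"]
cleaned = [preprocess(r) for r in reviews]
# Reviews that become empty are dropped, as described in the text.
kept = [c for c in cleaned if c]
```

Removing diacritics before the non-Arabic filter matters here, because replacing a diacritic with a space would split the word it sits in.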

4. Annotation

Although manual annotation consumes a great deal of time and resources, we performed it to ensure a more accurate data-labeling process [14]. Figure 3 illustrates the annotation process.
This section is divided into three parts. The first covers the aspect-based approach; in this part, we explain how we defined categories and sentiments before starting annotation. The second part describes the annotation platform and how the platform we created was used during annotation. The third part covers annotator recruitment, wherein we explain in detail all the steps we went through in choosing the annotators.

4.1. Aspect-Based Approach

Before starting the annotation process, the aspect categories must be identified. The research in [21] aimed to identify the most important factors affecting restaurant guests’ satisfaction, in order to help restaurant owners address them. The study concluded that the quality dimensions can be summarized as food quality, service quality, physical environment, and price fairness. Based on this, we selected our aspect categories (food, service, environment, and price). Table 3 shows topics related to each aspect category.
For sentiment annotation, we added a conflict polarity alongside positive, negative, and neutral, because of the nature of restaurant review topics. A user frequently liked one dish but disliked another. In these cases, where both positive and negative sentiment polarities applied to the same aspect, the aspect was marked with the conflict sentiment. Examples of each aspect category in each sentiment case are displayed in Table 4.

4.2. Annotation Platform

Within this step, we collected all reviews in an Excel sheet. In some other SA types (i.e., emotion detection, stance detection), annotation is easy to manage in Excel. In these SA types, each review needs one tag only; for example, in stance detection, the annotator will label a review with one of the following tags: with, against, or neutral. On the other hand, ABSA datasets are very difficult to maintain in an Excel sheet file, because annotators need to identify all opinions in the sentence, the categories that they fall under, and their sentiments. For example, the sentence “الاكل ممتاز لكن الخدمة متوسطة” has two categories: food and service. The food category shows a positive sentiment, while the service category shows a neutral sentiment. Thus, to start our annotation, we converted the Excel file containing reviews into an XML file using the Python programming language.
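As an illustration of why a nested format suits ABSA annotations better than a flat sheet, the sketch below builds an XML record for the example sentence using Python's standard `xml.etree.ElementTree` module. The element and attribute names (`reviews`, `review`, `text`, `aspects`, `aspect`, `category`, `polarity`) and both helper functions are hypothetical; the actual schema of the corpus files is the one shown in Figure 7, and reading the Excel sheet itself is omitted here.

```python
import xml.etree.ElementTree as ET


def reviews_to_xml(reviews):
    """Wrap plain review strings in an XML tree ready for annotation."""
    root = ET.Element("reviews")
    for i, text in enumerate(reviews, start=1):
        review = ET.SubElement(root, "review", id=str(i))
        ET.SubElement(review, "text").text = text
        # Empty container; one <aspect> child is added per annotated opinion.
        ET.SubElement(review, "aspects")
    return root


def add_annotation(review, category, polarity):
    """Attach one (aspect category, sentiment polarity) pair to a review."""
    ET.SubElement(review.find("aspects"), "aspect",
                  category=category, polarity=polarity)


root = reviews_to_xml(["الاكل ممتاز لكن الخدمة متوسطة"])
first = root.find("review")
add_annotation(first, "food", "positive")
add_annotation(first, "service", "neutral")

# encoding="unicode" returns a str and keeps the Arabic text readable.
xml_text = ET.tostring(root, encoding="unicode")
```

A single review can hold any number of `aspect` children, which is the one-to-many relation that is awkward to maintain in spreadsheet columns.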
For the convenience of the annotators, and to obtain the best results in the annotation process in terms of accuracy and time, we built a website to start the annotations. The website interface is very simple, as shown in Figure 4. The website allowed the annotators to sequentially obtain reviews from a database that contains all reviews. Then, each review was read, analyzed, and annotated by marking checkboxes that represent the categories included in the sentence. Then, sentiments were selected from the corresponding drop-down list. The review background color was red, but after the annotators clicked “save”, the background color changed to grey, to ensure that annotators would not miss any review.

4.3. Annotator Recruitment

Since all the reviews we collected were in Arabic, the annotators were required to be Arab, with a mother tongue of Arabic and a wide knowledge of the Arabic language and dialects. Due to the complexity of the reviews and users’ opinions, the annotators were required to be above 20 years old. As such, we recruited four Arab annotators whose mother tongue was Arabic and who understood the dialects in reviews well. Their ages ranged from 26 to 32. All of them had a computer-related background (information technology and software engineering). Firstly, an overview and explanation of the aim of the research were given to the annotators. Then, the annotation guidelines (Supplementary Material) were given to them. Following that, we assessed their understanding of the task by reviewing their annotations for 20 review sentences. After confirming that the annotations were correct, we sent them the link for the annotation website. The reviews were divided evenly; each annotator was responsible for completing about 5330 reviews. They were told that they could make contact at any time that they had doubts about their work. After the annotations were completed, we exchanged sets between annotators to ensure that each sentence was reviewed by at least two individuals.

5. Exploratory Data Analysis

During the annotation process, sentences with one aspect and unrelated reviews were excluded. Thus, from the 21,330 collected reviews, we obtained 10,739 annotated reviews to create the AraMA dataset (the original dataset). Annotators identified 25,653 aspect categories mentioned in sentences, each with a corresponding sentiment polarity. The dataset contains a total of 16,551 positive, 6154 negative, 1439 neutral, and 1509 conflict sentiment tags. By aspect category, it contains a total of 9539 reviews in the food category, 6395 in the environment category, 5660 in the service category, and 4059 in the price category. Table 5 provides more statistics of the dataset. Figure 5 shows the percentage of the total number of reviews in each aspect category.
Following that, we used Python code to extract the reviews with multiple sentiments into a separate file to create the AraMAMS corpus, which is another version of AraMA. AraMAMS contains 5312 extracted reviews, with 13,387 annotated aspect category and sentiment polarity tags. It contains a total of 6483 positive, 4056 negative, 1403 neutral, and 1445 conflict sentiment tags. By aspect category, it contains a total of 4791 reviews in the food category, 3183 in the environment category, 2282 in the service category, and 3131 in the price category. Table 6 provides more statistics of the AraMAMS dataset. Figure 6 shows the percentages of the total number of reviews in each aspect category.
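The two selection criteria can be expressed compactly. The following is a minimal sketch, assuming each annotated review is available as a list of (category, polarity) pairs; the record layout, the sample data, and the function names are hypothetical and are not the extraction script used for the corpus.

```python
def is_multi_aspect(aspects):
    """True if the review is tagged with at least two aspect categories."""
    return len({category for category, _ in aspects}) >= 2


def is_multi_sentiment(aspects):
    """True if a multi-aspect review carries at least two different polarities."""
    return is_multi_aspect(aspects) and len({polarity for _, polarity in aspects}) >= 2


# Hypothetical annotated records: (review id, [(category, polarity), ...]).
annotated = [
    (1, [("food", "positive"), ("service", "neutral")]),
    (2, [("food", "positive"), ("price", "positive")]),
    (3, [("environment", "negative")]),
]

# AraMA-style selection: at least two aspect categories.
arama = [rid for rid, aspects in annotated if is_multi_aspect(aspects)]
# AraMAMS-style selection: additionally, at least two different sentiments.
aramams = [rid for rid, aspects in annotated if is_multi_sentiment(aspects)]
```

In this toy sample, review 2 stays in the multi-aspect set but is left out of the multi-sentiment set because both of its aspects are positive, and review 3 is excluded from both.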
AraMA and AraMAMS are in XML format, containing a record for each review. The record information includes the user review, aspect category, and the corresponding sentiment. Figure 7 shows an example of review records from both datasets.
In both datasets, positive sentiments were the most frequent, followed by negative, conflict, and neutral sentiments, in that order. On the category side, food aspects dominated in both datasets, followed by the environment category. In AraMA, the service category ranked third and the price category fourth, while in AraMAMS, the price category ranked third and the service category fourth.
When comparing the datasets (results of AraMA are shown in Table 5 and those of AraMAMS in Table 6), the greatest difference is seen in the positive sentiment tags. The total number of positive sentiment tags in Table 5 is 16,551, while the total in Table 6 is 6483, a difference of 10,068. Further, there are 6154 negative sentiment tags in Table 5 and 4056 in Table 6, a difference of 2098. On the other hand, there was only a small difference between the two datasets in the total numbers of neutral and conflict sentiment tags.

6. Corpus Validation

In order to validate both corpora, we applied supervised ML classifiers to provide baseline results. Using Python on the Google Colab platform, we ran four different classifiers: naïve Bayes (NB); support vector classification (SVC) with a linear kernel; linear SVC; and stochastic gradient descent (SGD). In this study, we treated the task as a multi-class, multi-label text classification problem. Thus, we calculated the micro-averages of the precision, recall, and F1 measures.
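For readers unfamiliar with micro-averaging in the multi-label setting, the sketch below pools true positives, false positives, and false negatives over all reviews and labels before computing the three measures. It is a plain-Python illustration with made-up gold and predicted label sets, and the function name `micro_prf` is hypothetical; it is not the evaluation code used to produce the reported results.

```python
def micro_prf(gold, pred):
    """Micro-averaged precision, recall, and F1 over multi-label sets.

    gold and pred are equally long lists of label sets, one per review.
    """
    tp = sum(len(g & p) for g, p in zip(gold, pred))  # labels correctly predicted
    fp = sum(len(p - g) for g, p in zip(gold, pred))  # predicted but not in gold
    fn = sum(len(g - p) for g, p in zip(gold, pred))  # in gold but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Made-up example: three reviews with aspect-category label sets.
gold = [{"food", "service"}, {"food", "price"}, {"environment", "price"}]
pred = [{"food", "service"}, {"food"}, {"environment", "service"}]

precision, recall, f1 = micro_prf(gold, pred)
# Here tp = 4, fp = 1, fn = 2, so precision = 4/5, recall = 4/6, F1 = 8/11.
```

Because the counts are pooled before dividing, frequent labels such as food weigh more heavily in the micro-average than rare ones.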
Both corpora were divided into training (70%) and testing (30%) data. Table 7 shows the number of reviews in the two corpora after splitting. After that, four data frames were created: one for aspect categories, and three for sentiments. The positive and negative sentiments each had individual data frames, while we gathered neutral and conflict sentiments in the same data frame, because they have almost the same sentimental meaning, and fewer tags.
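One possible reading of the four data frames is sketched below: for each review, one multi-hot vector over the four aspect categories marks which categories are mentioned, and three further vectors mark which categories carry a positive, a negative, or a neutral-or-conflict sentiment. This layout, the category ordering, and the function names are assumptions made for illustration; the text does not specify how the frames were encoded.

```python
# Assumed fixed column order for the four aspect categories.
CATEGORIES = ["food", "environment", "service", "price"]


def multi_hot(aspects, polarities=None):
    """1 for each category present in aspects (optionally restricted to polarities)."""
    return [
        int(any(c == cat and (polarities is None or p in polarities)
                for c, p in aspects))
        for cat in CATEGORIES
    ]


def label_frames(aspects):
    """Build the four label vectors for one review's (category, polarity) pairs."""
    return {
        "aspect": multi_hot(aspects),
        "positive": multi_hot(aspects, {"positive"}),
        "negative": multi_hot(aspects, {"negative"}),
        # Neutral and conflict are merged, as described in the text.
        "neutral_conflict": multi_hot(aspects, {"neutral", "conflict"}),
    }


frames = label_frames([("food", "positive"), ("service", "neutral")])
```

Stacking these vectors over all reviews would give one label matrix per target, each of which can be fed to a multi-label classifier.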
The evaluation results of the classifiers in the AraMA corpus (original) and AraMAMS corpus (second version) are provided in Table 8. The results can be viewed on a color scale between green and red for easier reading; green represents good results, while red represents bad results.
Starting with the results of the AraMA corpus, it can be seen from Table 8 that SVC with a linear kernel achieved the best performance in terms of aspect categories and negative, neutral, and conflict sentiments. The highest F1 measure result was 91.41%, found for the aspect category. On the other hand, in the AraMAMS corpus, the best F1 measure results in all categories were obtained using the linear SVC model (an F1 measure value of 91.70% for the aspect category).
Overall, the NB model achieved the worst results in both corpora. In addition, the results of linear SVC and SVC with a linear kernel were similar in all categories, except for the negative sentiment category, wherein there was a minor difference of 0.15%.
When comparing the results of the two corpora, the best F1 measure result in AraMA was 91.41% for the aspect category, while the best F1 measure result in AraMAMS was 91.70% for the same category. Therefore, we can say that there was a slight improvement in the F1 measure result in AraMAMS, yet the difference is not significant. In addition, the results of precision, recall, and F1 measure were the worst in the neutral and conflict sentiments in both corpora. This is due to the small number of tags available in both corpora.

7. Conclusions and Future Work

In this paper, the aim was to create the largest multi-aspect (AraMA) and the first multi-aspect, multi-sentiment (AraMAMS) Arabic corpora. We collected 21,330 Google Maps reviews of restaurants in the city of Riyadh, Saudi Arabia, of which 10,739 annotated reviews form the AraMA corpus. Subsequently, we extracted multi-sentiment reviews to create the AraMAMS corpus. Both corpora were annotated with four aspects (food, environment, service, and price), each of which was assigned one of four sentiment polarities: positive, negative, neutral, or conflict. AraMA and AraMAMS were labeled manually by four annotators. The annotation process passed through two rounds: annotation and reviewing. The annotation methodology is presented in detail in this paper so that it can be used as a reference in future research on ABSA.
Both corpora were evaluated using four ML models: naïve Bayes (NB), support vector classification (SVC), linear SVC, and stochastic gradient descent (SGD). By comparing the F1 measure results, we observed that, in the AraMA corpus, the best result for aspect categories was 91.41%, found using the SVC model. In the AraMAMS corpus, the best result for aspect categories was 91.70%, found using the linear SVC model. In future work, we will perform extensive experiments using deep learning models on the created datasets.

8. Corpus Availability

Neither of the corpora established in this paper, AraMA and AraMAMS, is currently published. With the aim of enriching Arabic resources, we will make them available for research purposes upon request.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/su151612268/s1.

Author Contributions

Conceptualization, H.H.A.-B.; methodology, H.H.A.-B.; software, A.A.; validation, A.A. and H.H.A.-B.; formal analysis, A.A.; investigation, A.A.; resources, A.A.; writing—original draft preparation, A.A.; writing—review and editing, H.H.A.-B.; supervision, H.H.A.-B.; project administration, H.H.A.-B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Both corpora are available upon request.

Acknowledgments

The authors would like to acknowledge the Researchers Supporting Project Number (RSP2023R287), King Saud University, Riyadh, Saudi Arabia, for their support in this work.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Zhang, W.; Li, X.; Deng, Y.; Bing, L.; Lam, W. A Survey on Aspect-Based Sentiment Analysis: Tasks, Methods, and Challenges. arXiv 2022, arXiv:2203.01054. [Google Scholar] [CrossRef]
  2. THE 17 GOALS. Sustainable Development. Available online: https://sdgs.un.org/goals (accessed on 23 June 2023).
  3. Obiedat, R.; Al-Darras, D.; Alzaghoul, E.; Harfoushi, O. Arabic Aspect-Based Sentiment Analysis: A Systematic Literature Review. IEEE Access 2021, 9, 152628–152645.
  4. Shaalan, K. Nizar Y. Habash, Introduction to Arabic Natural Language Processing (Synthesis Lectures on Human Language Technologies). Mach. Transl. 2010, 24, 285–289.
  5. Jiang, Q.; Chen, L.; Xu, R.; Ao, X.; Yang, M. A Challenge Dataset and Effective Models for Aspect-Based Sentiment Analysis. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, 3–7 November 2019; Association for Computational Linguistics: Hong Kong, China, 2019; pp. 6279–6284.
  6. Al-Smadi, M.; Qawasmeh, O.; Talafha, B.; Quwaider, M. Human Annotated Arabic Dataset of Book Reviews for Aspect Based Sentiment Analysis. In Proceedings of the 2015 3rd International Conference on Future Internet of Things and Cloud, Rome, Italy, 24–26 August 2015; IEEE: Rome, Italy, 2015; pp. 726–730.
  7. Abd-Elhamid, L.; Elzanfaly, D.; Eldin, A.S. Feature-Based Sentiment Analysis in Online Arabic Reviews. In Proceedings of the 2016 11th International Conference on Computer Engineering & Systems (ICCES), Cairo, Egypt, 20–21 December 2016; IEEE: Cairo, Egypt, 2016; pp. 260–265.
  8. Al-Sarhan, H.; Al-So’ud, M.; Al-Smadi, M.; Al-Ayyoub, M.; Jararweh, Y. Framework for Affective News Analysis of Arabic News: 2014 Gaza Attacks Case Study. In Proceedings of the 2016 7th International Conference on Information and Communication Systems (ICICS), Irbid, Jordan, 5–7 April 2016; IEEE: Irbid, Jordan, 2016; pp. 327–332.
  9. AL-Smadi, M.; Qwasmeh, O.; Talafha, B.; Al-Ayyoub, M.; Jararweh, Y.; Benkhelifa, E. An Enhanced Framework for Aspect-Based Sentiment Analysis of Hotels’ Reviews: Arabic Reviews Case Study. In Proceedings of the 2016 11th International Conference for Internet Technology and Secured Transactions (ICITST), Barcelona, Spain, 5–7 December 2016.
  10. Ashi, M.M.; Siddiqui, M.A.; Nadeem, F. Pre-Trained Word Embeddings for Arabic Aspect-Based Sentiment Analysis of Airline Tweets. In Proceedings of the International Conference on Advanced Intelligent Systems and Informatics 2018, Cairo, Egypt, 3–5 September 2018; Hassanien, A.E., Tolba, M.F., Shaalan, K., Azar, A.T., Eds.; Advances in Intelligent Systems and Computing. Springer International Publishing: Cham, Switzerland, 2019; Volume 845, pp. 241–251.
  11. Alshammari, N.F.; AlMansour, A.A. Aspect-Based Sentiment Analysis for Arabic Content in Social Media. In Proceedings of the 2020 International Conference on Electrical, Communication, and Computer Engineering (ICECCE), Istanbul, Turkey, 12–13 June 2020; IEEE: Istanbul, Turkey, 2020; pp. 1–6.
  12. Alassaf, M.; Qamar, A.M. Aspect-Based Sentiment Analysis of Arabic Tweets in the Education Sector Using a Hybrid Feature Selection Method. In Proceedings of the 2020 14th International Conference on Innovations in Information Technology (IIT), Al Ain, United Arab Emirates, 17–18 November 2020; IEEE: Al Ain, United Arab Emirates, 2020; pp. 178–185.
  13. Masadeh, R. A Hybrid Approach of Lexicon-Based and Corpus-Based Techniques for Arabic Book Aspect and Review Polarity Detection. Int. J. Adv. Trends Comput. Sci. Eng. 2020, 9, 4336–4340.
  14. Areed, S.; Alqaryouti, O.; Siyam, B.; Shaalan, K. Aspect-Based Sentiment Analysis for Arabic Government Reviews. In Recent Advances in NLP: The Case of Arabic Language; Abd Elaziz, M., Al-qaness, M.A.A., Ewees, A.A., Dahou, A., Eds.; Studies in Computational Intelligence; Springer International Publishing: Cham, Switzerland, 2020; Volume 874, pp. 143–162.
  15. msmadi. HAAD. 2022. Available online: https://github.com/msmadi/HAAD (accessed on 3 May 2022).
  16. Papers with Code-LABR Dataset. Available online: https://paperswithcode.com/dataset/labr (accessed on 23 June 2023).
  17. Al-Smadi, M.; Al-Ayyoub, M.; Jararweh, Y.; Qawasmeh, O. Enhancing Aspect-Based Sentiment Analysis of Arabic Hotels’ Reviews Using Morphological, Syntactic and Semantic Features. Inf. Process. Manag. 2019, 56, 308–319.
  18. Instant Data Scraper. Available online: https://chrome.google.com/webstore/detail/instant-data-scraper/ofaokhiedipichpaobibbnahnkdoiiah (accessed on 9 May 2022).
  19. Python RegEx. Available online: https://www.w3schools.com/python/python_regex.asp (accessed on 23 June 2023).
  20. Google Colaboratory. Available online: https://colab.research.google.com/ (accessed on 23 June 2023).
  21. Gagic, S.; Tesanovic, D.; Jovicic, A. The Vital Components of Restaurant Quality That Affect Guest Satisfaction. Turizam 2013, 17, 166–176.
Figure 1. Workflow describing the corpus-building process. Each step is described in its own section.
Figure 2. A preview of a Google Maps review text. The English translation of the text inside the red box is “A Turkish restaurant.. very good.. baking and potteries are good.. restaurant environment is good.. we came to taste beef and cheese pottery, but it was less than expected.”
Figure 3. Workflow depicting the main steps of the annotation process.
Figure 4. The website interface, which includes review sentences, checkboxes for aspect categories, a drop-down list of the corresponding sentiments, and a save button.
Figure 5. Visual representation of the tag percentages of each aspect category for the AraMA corpus.
Figure 6. Visual representation of the tag percentages of the aspect categories within the AraMAMS corpus.
Figure 7. A snapshot of AraMA dataset records in XML format, which contains the user review inside text tags and aspect category with the corresponding sentiment in opinion tags.
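The record layout described in the Figure 7 caption can be read with a few lines of Python. The following is only a sketch: the element and attribute names (`reviews`, `review`, `text`, `Opinions`, `Opinion`, `category`, `polarity`) are assumptions modeled on common ABSA corpora, not names confirmed by the paper, and the sample record is a hypothetical one assembled from two Table 4 examples.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: tag and attribute names are assumptions, and the
# review text is stitched together from two Table 4 examples.
SAMPLE = """<reviews>
  <review id="1">
    <text>الديكور جميل جدا البيتزا سيئه</text>
    <Opinions>
      <Opinion category="environment" polarity="positive"/>
      <Opinion category="food" polarity="negative"/>
    </Opinions>
  </review>
</reviews>"""


def parse_reviews(xml_string):
    """Return a list of (review_text, [(category, polarity), ...]) tuples."""
    root = ET.fromstring(xml_string)
    records = []
    for review in root.iter("review"):
        # The review text sits in a <text> child element.
        text = review.findtext("text", default="").strip()
        # Each <Opinion> pairs one aspect category with one sentiment.
        opinions = [
            (op.get("category"), op.get("polarity"))
            for op in review.iter("Opinion")
        ]
        records.append((text, opinions))
    return records
```

Calling `parse_reviews(SAMPLE)` yields one record whose opinion list holds two (category, polarity) pairs, which matches the multi-aspect labeling the corpus is built around.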
Table 1. Summary of Arabic datasets proposed in previous work for ABSA.
Paper | Domain | Size | Language | Published | Platform | Polarities
Al-Smadi [6] | Book | 1513 Arabic book reviews | MSA | Yes | N/A | positive, negative, neutral, or conflict
Abd-Elhamid [7] | Multiple domains | 200 reviews | DA | No | N/A | positive, negative, or neutral
Al-Smadi, Al-Ayyoub [8] | News posts related to the Gaza conflict | 2265 news posts | MSA | No | BRAT | positive, negative, or neutral
Al-Smadi, Qwasmeh [9] | Hotel | 2291 hotel reviews | MSA, DA | Yes | N/A | positive, negative, or neutral
Ashi et al. [10] | Saudi Airline service-related | 5000 tweets | MSA, DA (Saudi dialect) | No | N/A | positive or negative
Alshammari and AlMansour [11] | Telecommunication companies in Saudi Arabia | 1098 tweets | DA | No | DataTracking | positive, negative, and neutral
Alassaf and Qamar [12] | Educational | 7934 tweets | DA | No | N/A | negative and not negative
Masadeh and Saad AlAzzam [13] | Book | 1000 Arabic book reviews | MSA | No | N/A | positive or negative
Areed [14] | Government | 2071 Arabic reviews | MSA, DA | No | GARSA | ranging from −1.0 to 1.0
Note: The table lists each paper’s authors, the dataset domain, size, and language (MSA: Modern Standard Arabic; DA: dialectal Arabic), whether the dataset is published, the platform used during annotation, and the sentiment polarity tags.
Table 2. The effects of pre-processing on texts.
Task | Before | After
Diacritics | جداً | جدا
Punctuation | الطعام جيد .. انصح به | الطعام جيد انصح به
Stemming | نظييييييف | نظيف
Normalization | الخدمة | الخدمه
Normalization | أفضل | افضل
Emojis | [two emoji images] | (empty)
Numbers | 100% | (empty)
English Letters | Good food | (empty)
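The transformations in Table 2 can be approximated with Python regular expressions. The sketch below is an assumption about how such a pipeline might look, not the authors' code: the character ranges, the order of steps, the rule of collapsing only runs of three or more identical characters, and the function name `preprocess` are all illustrative choices.

```python
import re

# Assumed character classes; the paper does not list its exact patterns.
DIACRITICS = re.compile(r"[\u064B-\u0652\u0670\u0640]")  # tashkeel, dagger alef, tatweel
ALEF_VARIANTS = re.compile(r"[\u0622\u0623\u0625]")      # alef with madda or hamza
NON_ARABIC = re.compile(r"[^\u0621-\u064A\s]")           # anything but Arabic letters/space
ELONGATION = re.compile(r"(.)\1{2,}")                    # 3+ repeats of one character
WHITESPACE = re.compile(r"\s+")


def preprocess(text):
    """Apply Table 2 style cleaning to one review string (illustrative)."""
    text = DIACRITICS.sub("", text)            # remove diacritics first
    text = ALEF_VARIANTS.sub("\u0627", text)   # normalize alef variants to bare alef
    text = text.replace("\u0629", "\u0647")    # teh marbuta -> heh
    text = NON_ARABIC.sub(" ", text)           # drop punctuation, emojis, digits, Latin
    text = ELONGATION.sub(r"\1", text)         # collapse elongated letters
    return WHITESPACE.sub(" ", text).strip()   # tidy leftover spaces
```

Diacritics are removed before the non-Arabic filter so that a stripped mark does not leave a space inside a word. Collapsing only runs of three or more characters is a conservative choice that keeps legitimately doubled letters intact.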
Table 3. Aspect categories and related topics.
Aspect Category | Topics
Food | Food taste, food temperature, serving size, diversity of dishes on the menu, dish appearance
Service | Speed of service, employee discipline, respect for customers, hospitality of visitors, how staff dealt with unexpected situations, opening time
Environment | How visitors felt about the weather, attractive decor, comfortable seats and tables, parking availability, congestion levels
Price | Customer satisfaction with prices, how prices have changed from a previous visit, availability of payment methods
Table 4. Examples of reviews for each sentiment in every aspect category.
Category | Sentiment | Example 1 | Example 2
Food | Positive | جيد جدا من ناحيه جوده الطعام “Very good in terms of food quality” | بصراحه الاكل خرافي “Honestly, the food is fabulous”
Food | Negative | الكميات قليله “Quantities are small” | البيتزا سيئه “Pizza is bad”
Food | Neutral | تجربتي مع الاكل لاباس بها “My experience with food is okay” | الطعم عادي “The taste is normal”
Food | Conflict | الاكل لذيذ الاصناف في المنيو قليله “The food is delicious, but there are few items on the menu” | المكرونة لذيذه لكن السلطه فيها طعم غريب “The pasta is delicious, but the salad has a strange taste”
Environment | Positive | الديكور جميل جدا “The décor is very nice” | المكان نظيف “The place is clean”
Environment | Negative | للأسف لايصلح للعوائل مافيه خصوصيه ابد “Unfortunately, it is not suitable for families, as there is no privacy at all” | الأثاث قديم جدا مهترئ “The furniture is very old and worn out”
Environment | Neutral | الجلسات الداخليه والخارجيه عاديه “Indoor and outdoor tables are normal” | المكان والديكور لاباس به “The place and decor are okay”
Environment | Conflict | المكان راقي صوت الاغاني مزعج “The place is cool, the music is annoying” | الجلسات الداخليه سيئه الخارجيه حلوه ومرتبه “The inside tables are bad; the outside tables are nice and tidy”
Price | Positive | اسعارهم رخيصه مقارنه بالمطاعم المنافسه له “Their prices are cheap compared to competing restaurants” | اسعارهم حلوه مره للعوايل الكبيره “Their prices are very good for large families”
Price | Negative | ماعندهم دفع شبكه “They do not have card payment” | الأسعار ارفعت كثير عن قبل “Prices are much higher than before”
Price | Neutral | الاسعار مناسبه “Prices are suitable” | البيتزا سعرها متوسط “The pizza price is average”
Price | Conflict | المويه سعرها غالي مره الاطباق حلوه أسعارها “Water is expensive, the dish prices are good” | الاكل سعره مقبول لكن المشروبات ليه كذا غاليه “The food is reasonably priced, but the drinks are so expensive”
Service | Positive | العاملين بشوشين اول ما تدخل يرحبون بك “The staff are friendly, as soon as you enter, they welcome you” | سريعين في الخدمه “Quick serving”
Service | Negative | العاملين غير محترمين ابدا “Staff are not respectful at all” | الويتر ما جاء يستلم طلباتنا الا بعد ربع ساعه “The waiter did not come to receive our orders until after a quarter of an hour”
Service | Neutral | الويترز مثل أي مطعم مافيه شي مميز “The waiters are like those in any other restaurant, nothing special” | الخدمه عاديه “Service is normal”
Service | Conflict | العاملين متعاونين بس فرق اللغه يصعب الامور “The staff is cooperative, but the language difference makes things difficult” | الخدمه سريعه لكن ياليت يفعلون الطلب بالهاتف او الموقع “The service is fast, but I wish that I could order by phone or website”
Table 5. Number of sentiment tags in the AraMA corpus for each aspect.
Aspect Category | Positive | Negative | Neutral | Conflict | Total
Food | 6991 | 1253 | 425 | 870 | 9539
Environment | 4540 | 1288 | 91 | 476 | 6395
Service | 4411 | 1101 | 56 | 92 | 5660
Price | 609 | 2512 | 867 | 71 | 4059
Total | 16,551 | 6154 | 1439 | 1509 | 25,653
Table 6. The number of sentiment tags in the AraMAMS corpus for each aspect.
Aspect Category | Positive | Negative | Neutral | Conflict | Total
Food | 3034 | 510 | 409 | 838 | 4791
Environment | 1793 | 856 | 78 | 456 | 3183
Service | 1526 | 619 | 52 | 85 | 2282
Price | 130 | 2071 | 864 | 66 | 3131
Total | 6483 | 4056 | 1403 | 1445 | 13,387
Table 7. The number of reviews in training and testing datasets for each corpus.
Corpus | Test | Train | Total
AraMA | 7517 | 3222 | 10,739
AraMAMS | 3718 | 1594 | 5312
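Table 8. A color-scale representation of both corpora results produced using machine learning models.
Task | Model | AraMA Accuracy | AraMA Precision | AraMA Recall | AraMA F1-Measure | AraMA MCC | AraMAMS Accuracy | AraMAMS Precision | AraMAMS Recall | AraMAMS F1-Measure | AraMAMS MCC
Aspect Category | NB | 54.13% | 84.97% | 94.07% | 89.29% | 55.73% | 51.07% | 83.71% | 95.19% | 89.08% | 56.24%
Aspect Category | SVC KL | 66.67% | 91.87% | 90.95% | 91.41% | 62.68% | 66.50% | 92.80% | 90.60% | 91.68% | 66.65%
Aspect Category | Lin SVC | 65.89% | 91.93% | 90.60% | 91.24% | 62.65% | 66.19% | 92.91% | 90.52% | 91.70% | 67.11%
Aspect Category | SGD | 65.18% | 91.50% | 90.65% | 91.07% | 62.31% | 64.12% | 92.14% | 90.37% | 91.25% | 65.82%
Positive Sentiment | NB | 48.79% | 78.38% | 78.04% | 78.21% | 40.53% | 42.10% | 71.88% | 52.16% | 60.45% | 34.45%
Positive Sentiment | SVC KL | 56.30% | 82.97% | 80.43% | 81.69% | 50.75% | 54.77% | 77.67% | 73.14% | 75.34% | 45.49%
Positive Sentiment | Lin SVC | 56.46% | 83.15% | 80.57% | 81.84% | 51.43% | 55.08% | 77.94% | 73.19% | 75.49% | 45.76%
Positive Sentiment | SGD | 56.49% | 84.23% | 78.10% | 81.05% | 48.38% | 45.29% | 81.34% | 52.63% | 63.91% | 46.21%
Negative Sentiment | NB | 63.04% | 96.11% | 14.52% | 25.23% | 35.71% | 54.20% | 93.72% | 36.25% | 52.28% | 51.25%
Negative Sentiment | SVC KL | 75.70% | 79.51% | 65.24% | 71.67% | 65.07% | 66.31% | 81.71% | 63.27% | 71.32% | 63.35%
Negative Sentiment | Lin SVC | 75.67% | 80.08% | 64.39% | 71.38% | 65.18% | 66.50% | 82.68% | 62.94% | 71.47% | 64.06%
Negative Sentiment | SGD | 74.58% | 87.27% | 53.07% | 66.00% | 64.81% | 64.99% | 79.75% | 62.46% | 70.05% | 62.50%
Conflict and Neutral Sentiment | NB | 74.55% | 60.00% | 0.33% | 0.66% | 4.28% | 51.19% | 91.67% | 2.58% | 5.02% | 13.19%
Conflict and Neutral Sentiment | SVC KL | 79.33% | 64.05% | 38.96% | 48.45% | 49.49% | 67.00% | 72.14% | 49.47% | 58.69% | 54.96%
Conflict and Neutral Sentiment | Lin SVC | 79.33% | 64.30% | 38.18% | 47.91% | 48.78% | 66.75% | 72.57% | 49.00% | 58.50% | 55.32%
Conflict and Neutral Sentiment | SGD | 78.52% | 62.76% | 36.29% | 45.99% | 44.69% | 63.68% | 63.76% | 50.53% | 56.38% | 49.50%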
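The F1-measure column in Table 8 follows from the precision and recall columns, since F1 is their harmonic mean. The short sketch below recomputes it for a few rows as a consistency check. The function name `f1_from_pr` and the dictionary layout are illustrative assumptions; the numbers are copied from Table 8.

```python
def f1_from_pr(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) taken from Table 8, in percent.
ROWS = {
    "AraMA aspect category, SVC KL": (91.87, 90.95, 91.41),
    "AraMAMS aspect category, Lin SVC": (92.91, 90.52, 91.70),
    "AraMA negative sentiment, NB": (96.11, 14.52, 25.23),
}

for name, (p, r, reported) in ROWS.items():
    computed = round(f1_from_pr(p, r), 2)
    # Each recomputed value should agree with the table to two decimals.
    print(f"{name}: computed {computed:.2f}, reported {reported:.2f}")
```

The naïve Bayes row for negative sentiment shows why accuracy alone can mislead: precision is high, but the very low recall pulls the F1-measure down sharply.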
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

AlMasaud, A.; Al-Baity, H.H. AraMAMS: Arabic Multi-Aspect, Multi-Sentiment Restaurants Reviews Corpus for Aspect-Based Sentiment Analysis. Sustainability 2023, 15, 12268. https://doi.org/10.3390/su151612268

AMA Style

AlMasaud A, Al-Baity HH. AraMAMS: Arabic Multi-Aspect, Multi-Sentiment Restaurants Reviews Corpus for Aspect-Based Sentiment Analysis. Sustainability. 2023; 15(16):12268. https://doi.org/10.3390/su151612268

Chicago/Turabian Style

AlMasaud, Alanod, and Heyam H. Al-Baity. 2023. "AraMAMS: Arabic Multi-Aspect, Multi-Sentiment Restaurants Reviews Corpus for Aspect-Based Sentiment Analysis" Sustainability 15, no. 16: 12268. https://doi.org/10.3390/su151612268

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.
