Social Sciences
  • Article
  • Open Access

23 February 2023

Computational Techniques for Analyzing Women’s Social Change in Saudi Newspapers

and
1
Department of English, College of Arts, King Faisal University, Hofuf 31982, Saudi Arabia
2
Department of Arabic, College of Humanities and Social Sciences, King Saud University, Riyadh 11362, Saudi Arabia
*
Authors to whom correspondence should be addressed.

Abstract

This study utilized computational techniques for a reliable analysis of discourse. These techniques were adopted to analyze the progress of Saudi social change in terms of women’s empowerment within the Saudi transformation program. The data from open source 2021–2022 Saudi newspaper archives were automatically crawled using cutting-edge computational techniques and structured according to the sections of the Saudi newspapers: front page, economy, international, sports, society, culture and religion. The analysis was based on computing the minimally uneven distribution of the relative frequencies of the occurrence of the central word (the Arabic forms of woman and women) from the years 2021 to 2022. This produced two samples of text data, each of which represented the respective years. Calculating the normalized and adjusted frequencies of the central word from each section in the data from each year was important to avoid unbalanced absolute frequencies in the qualitative analysis stage. In addition, dispersion measures showed that the amount of variance in terms of the lexical dispersion of the central word was not high. The observable facts from the quantitative analysis produced a more accurate observational sample of citations, which we qualitatively analyzed. The results of the latter showed a considerable ascending change in favor of empowering women as a consequence of Saudi Vision 2030.

1. Introduction

The media has a tremendous impact on creating and circulating social constructs within any context; thus, analyzing women’s empowerment representations as portrayed by newspaper sections may provide useful insight into deconstructing gender symmetry and asymmetry. In recent years, Saudi Arabia has carried out many transformative measures aimed at women’s empowerment in the country as a result of its Saudi Vision 2030. Consequently, these measures resulted in a noticeable increase in women in previously male-dominated professions and within socio–cultural and eco–political domains. The media, whether in the traditional sense or its more modernized digital forms, has a tremendous and reciprocal impact on regulating discursive practices in any social and eco–political context. Consequently, the media plays a major role in producing and maintaining hegemonic ideologies in any given society (Sriwimon and Zilli 2017). Such ideologies encompass many social constructs, including identity constructions and gender representations. Considering the key role of the media in negotiating these, there is a constantly growing need in social science research to explore how media discourses interact within diverse contexts and which constructions are created, reproduced and circulated based upon such interactions.
The current paper adds to this rich body of research by highlighting how gender constructions are portrayed in Saudi newspapers from socio–cultural and eco–political perspectives. It does so keeping in mind the transformative measures taken in Saudi Arabia toward women’s empowerment in the last decade and the progressive nature of the sustainable 2030 vision, which was launched by the Saudi government in 2016. Originally, Saudi Vision 2020 had been proposed, but later it was expanded and incorporated within a broader nationwide scheme aimed at a national transformation program and has been officially referred to as Saudi Vision 2030 (SV2030 2022). As a comprehensive transformation program, Saudi Vision 2030 operates on a sustainable agenda that targets the strategic reduction of the Kingdom’s dependence on oil, diversification of economic sources and improving the quality of life across diverse sectors. Due to such motivation, women’s empowerment has emerged as a prominent theme within this vision, and in particular, women’s contribution to the Saudi 2030 vision has been tremendously encouraged. Consequently, this has stirred up many changes, not only on an economic level but also on political, social and cultural grounds.
With this in mind, a discourse study in such a direction was needed, and this has led us to investigate Saudi societal interactions toward women-related themes. Our focus was on the Saudi newspaper articles published electronically in the two years after Saudi Vision 2020 as an initial stage of the wider 2030 Vision. This study explored woman-related themes by using corpus linguistic methods, which are an understudied area of research in Arabic discourse analysis. The sample text data from 2021 and 2022 Saudi newspapers were automatically crawled and structured from the Saudi newspapers archived on www.sauress.com (accessed on 1 December 2022). The main purpose of this study was twofold: to contribute to the critical discourse analysis (CDA) research by elucidating computational techniques that help to gather a homoscedastic variance of large-scale linguistic data and to utilize these techniques in investigating the progress of women’s social change in Saudi Arabia.

3. Data and Computational Techniques for Corpus Analysis

The data we collected represented two months from each of the years 2021 and 2022. The data used in this paper were crawled from the Saudi archived newspapers website www.sauress.com (accessed on 1 December 2022), and we selected two months from each year, largely because these were the two months in which the intended central words, the nodes المرأة (woman) and النساء (women), appeared more frequently. This was detected using an advanced search we conducted on the website itself. Thus, our sample was from February and March for the year 2021, and from January and February for the year 2022. The Saudi archived newspapers included 43 local newspapers. This archived platform publishes daily news and articles, which helped with the task of simultaneous crawling and structuring.
The relevant literature proposes many linguistic manifestations to underpin the intricate interplay between discourse on the one hand and gender representations on the other; to name a few: semantic macro structures (Al-Hejin 2015), metaphor analysis (Al Maghlouth 2021), titles (Alkhammash and Al-Nofaie 2020), social actor representation (Almaghlouth 2022), multimodality (Alkhammash 2022) and process type analysis (Koller 2012). The current study was designed based on minute examination of collocations. Collocations are pairs or groups of words that often come together within the same near-linguistic context (Baker 2006). Due to such linguistic proximity, collocations are often examined as evidence of Moscovici’s (2000) mental representations since their co-occurrence often suggests that they tend to correlate in cognition as well. Against this backdrop, it is hypothesized in the current study that the aforementioned measures taken by the Saudi government in support of women’s empowerment can be traced back to the newspapers’ corpus at hand in the form of more empowered/positive women’s representations. In that sense, utilizing collocations as an inbuilt tool in corpus processing has been quite informative within many gender studies (see, for instance, Almujaiwel (2017), within a gender-based Saudi context). The reviewed literature, especially from non-Saudi newspapers, has confirmed the negative representations attached to Saudi women within such discourse. By investigating the collocational behaviors of the intended central words, this study might be able to detect a corresponding linguistic change of such representations, especially keeping in mind the reformative changes made in support of more women’s empowerment.
As the focus of our analysis was on the collocational behaviors of the intended central word: woman (and its plural form: women), in the sections of the newspaper structures, information regarding the number of texts in each local newspaper was unnecessary. According to our linguistic raw data, Table 1 and Table 2 show the number of texts (files), each of which contained a specific article, and the number of types (unique word forms) and tokens (all running words) across the respective years. The tables show the numbers of texts, types and tokens in the seven sections.
Table 1. Basic statistical information about the 2021 Saudi newspapers.
Table 2. Basic statistical information about the 2022 Saudi newspapers.
The computational techniques adopted for the corpus/data analysis were as follows: first, the raw and normalized frequency analysis of the central word between the two years; second, the raw and normalized frequency analysis of the intended lexical bundles (5n-grams collocation window) between the two years; third, the raw and normalized frequency analysis of the intended lexical bundles between the two years and the newspaper section data; and fourth, the raw and normalized frequency analysis of the central word between the multiple newspaper sections data. The terms and their concepts needed to be well-defined for the sake of clarity. Such clarity paved the way for explaining the computational techniques used for our data analysis. These statistical corpus linguistics terms were analyzed in terms of the normalized frequency—corpus size n, relative frequency rf and normalization base nf (McEnery and Hardie 2012, pp. 50–51)—and dispersion—the standard deviation SD, coefficient of variance CV and Juilland’s D (Brezina 2018, pp. 46–53).
The corpus size n is the total number of all running tokens/words in a given corpus. The raw frequency f of a word is the absolute number of its occurrences in a given corpus. The relative frequency rf is the number of times a word occurs divided by n. The normalized frequency nf is simply the result of $nf = \frac{f}{n} \times \text{normalization base}$. The normalization base is a number chosen by the analyst, and it always follows a numerical pattern that starts with 1 followed by zeros. For example, if n equals any number in a format of tens of thousands (27,000, 76,938, 89,002 and so on), the normalization base will be set to 10,000 or less (1000 or 100) according to the preference of the human analyst. The corpus-to-corpus ratio (nf1/nf2) is computed to provide the number of times the word occurs in a corpus compared with another corpus. It gives the difference ratio between the nf over the multiple corpora or sub-corpora. The benefit of using the nf measure is that it avoids being misled by a high raw frequency f for a given word when its nf is in fact low. For example, in our data, the f values for the central words woman and women between the culture and religion sections for the sample of the year 2021 were 220,740 and 79,510, respectively, but the nf (×10,000) values were 6.025 and 8.427, which means that the nf in the religion section was more than that in the culture section.
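The normalized frequency and the corpus-to-corpus ratio defined above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the counts are invented for the example rather than taken from the study's tables.

```python
def normalized_frequency(f: int, n: int, base: int = 10_000) -> float:
    """Occurrences of a word per `base` running tokens: nf = f / n * base."""
    if n <= 0:
        raise ValueError("corpus size n must be positive")
    # Multiplying before dividing is algebraically the same as (f / n) * base.
    return f * base / n


def frequency_ratio(nf1: float, nf2: float) -> float:
    """Corpus-to-corpus ratio nf1 / nf2."""
    if nf2 == 0:
        raise ValueError("nf2 must be non-zero")
    return nf1 / nf2


# Hypothetical counts (assumption): a large part with more raw hits
# and a smaller part with fewer raw hits.
large_part = normalized_frequency(f=120, n=200_000)
small_part = normalized_frequency(f=80, n=100_000)

# Ratio of the smaller part's nf to the larger part's nf.
ratio = frequency_ratio(small_part, large_part)
```

In this toy case the larger part has the higher raw frequency (120 against 80), yet the smaller part has the higher normalized frequency, which is the kind of reversal that raw counts alone would hide.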
Dispersion is different and more accurate than distribution when it comes to the whole corpus. The parts of a given corpus are simply the nf (normalization base). Distribution is usually lacking in denseness, while dispersion tells us how a word’s relative frequencies per normalization base (×100, ×1000, ×10,000 and so on) are agglomerated between the parts of the intended corpus and on the whole. The merit of dispersion is that it is a set of measures that output the variation within different parts of the corpus. The measures utilized are the coefficient of variation CV and Juilland’s D. The coefficient of variation CV is simply calculated as follows: $CV(\text{word}) = \frac{\text{standard deviation}}{\text{mean}}$. The result of CV is then divided by the square root of the number of parts minus 1: $CV = \frac{CV(\text{word})}{\sqrt{\text{number of the corpus parts} - 1}}$. The final result comes out between 0 and 1. When the CV is closer to 0, the given word is more evenly distributed throughout the parts of the corpus. As for Juilland’s D, it is a measure that depends on CV and is based on the following formula: $\text{Juilland's } D = 1 - \frac{CV(\text{word})}{\sqrt{\text{number of the corpus parts} - 1}}$, where $CV(\text{word})$ is the unscaled coefficient of variation. The result of Juilland’s D is also between 0 and 1. When it is close to 1, the distribution of the given word is perfect over the parts of the corpus. This means that CV and Juilland’s D are opposite in terms of reporting a perfectly or imperfectly even distribution.
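The two dispersion measures can be illustrated with a short Python sketch. It assumes the population standard deviation (dividing by the number of parts), which the text does not state explicitly; the function names and the seven nf values are hypothetical.

```python
import math
import statistics


def coefficient_of_variation(values):
    """Unscaled CV(word): standard deviation of the per-part frequencies / their mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("the word does not occur in any part")
    # Assumption: population standard deviation (pstdev), not the sample form.
    return statistics.pstdev(values) / mean


def juillands_d(values):
    """Juilland's D = 1 - CV(word) / sqrt(number of parts - 1)."""
    parts = len(values)
    if parts < 2:
        raise ValueError("at least two corpus parts are needed")
    return 1 - coefficient_of_variation(values) / math.sqrt(parts - 1)


# Hypothetical nf values (per 100 words) for seven newspaper sections.
section_nf = [0.05, 0.08, 0.04, 0.03, 0.03, 0.07, 0.09]

cv_word = coefficient_of_variation(section_nf)
scaled_cv = cv_word / math.sqrt(len(section_nf) - 1)  # CV scaled to the 0..1 range
d = juillands_d(section_nf)
```

Under these definitions the scaled CV and Juilland's D always sum to 1, which is why one approaches 0 as the other approaches 1.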

4. Results

In this section, the observable facts are described. The data were analyzed using two main computational techniques. First, the normalized frequency nf was used to adjust the frequencies of the central word across two years and among the newspaper sections over the two years. Second, the dispersion measures were used to demonstrate the fairly even distributions of the relative frequencies of the central word across the two years and over the multiple newspaper sections to increase the confidence of how good our sample was in terms of homoscedasticity (the equality of variances in two groups) for qualitatively analyzing the discursive practices of women’s social change within the contexts of the collocations in the 2021 and 2022 Saudi newspapers. Utilizing these techniques allowed us to recognize the scales of the women-related themes and perform qualitative analyses of the intended examples from the data.

4.1. Normalized Frequency and Dispersion Measures

As explained earlier regarding nf, comparing the data sizes of the years 2021 and 2022 showed that the text data of the former were larger than the latter (Table 3), even in terms of the f and nf of the central word. However, comparing the nf of the sections of each year was the touchstone.
Table 3. Raw vs. normalized frequency of the central word for the two years.
Table 4 and Table 5 include the n, f and nf (rf × 100) of the central word across the seven sections. The notation (rf × 100) denotes the relative frequency of the word occurring per 100 words in all seven sections of the data from each year. This was applied in each section to show the more realistic proportional occurrences of lexical items from each part of the corpus. The nf of the central word for the sampled data of the year 2021 showed that it was higher in the religion section than in the remaining sections. As for the year 2022, the nf was found to be higher in the same section than in the remaining sections, except for the front page section. The nf, therefore, was the pivot around which the adjusted nf evenly redistributed the central word over the multiple sections, and it allowed us to avoid becoming confused by the n and f. Table 6 reports the section-to-section ratios and Table 7 shows the descending order of the adjusted frequencies of the central word for the sections according to size. Regarding the section-to-section ratios of the two sets of sampled data for the respective years, the difference in nf sizes varied, but the overall section-to-section ratio of the sampled data for the years 2021 (N = 1,480,584) and 2022 (N = 1,200,333) was 0.175. This meant that we were confident regarding the qualitative analysis of the contexts of the central word (women-related themes) for the two years.
Table 4. Raw vs. normalized frequencies of the central word in multiple newspaper sections for the year 2021.
Table 5. Raw vs. normalized frequencies of the central word in multiple newspaper sections for the year 2022.
Table 6. Corpus-to-corpus (section-to-section) ratio for the 2021 and 2022 samples.
Table 7. Descending order of nf (×100) of multiple sections for the 2021 and 2022 samples.
Regardless of the n and f of the central word for the sampled data of the two years, the nf (per 100 words) provided the relative distributions of the central word for the sections of the two years, as visualized in Figure 1, where the boxplot shows the nf of the central word for the sections, and the red line inside the box is the median of the nf (per 100 words). The nf values of the central word in sections E-2, C-6 and R-7 were larger. Smaller values can be seen for the international, front page, society and sports sections. These adjusted frequencies were essential when analyzing the discursive practices and constructs of the women-related themes in the sections for the two years, and they helped to avoid the bias that occurs when relying solely on the raw frequencies. By looking at Table 4 (the raw frequencies F of the central word for the seven sections in the 2021 data), we can state that if we relied solely on F, we would conclude that the central word occurred from higher to lower frequencies in the economy, international, culture, religion, sports, society and front page sections. This made it difficult to judge the examples extracted for the qualitative analysis when we looked at the co-occurrences of the central word in terms of the social practices in the discourses from the data, even if the examples representing the raw frequencies were small in number. The order of sections in terms of the adjusted frequencies nf was different, as it shows that the central word occurred from higher to lower adjusted frequencies in the religion, economy, culture, international, front page, sports and society sections. The same operation was conducted for the seven sections in the 2022 data (Table 5). The nf order of sections from higher to lower for the two years is given in Table 7.
Figure 1. The adjusted frequencies nf (rf × 100) for the different sections.
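The difference between ranking sections by raw frequency and by adjusted frequency, as discussed above, can be shown with a small sketch. The section sizes and counts below are invented for illustration and are not the values in Table 4 or Table 5; `nf_per_100` is a hypothetical helper.

```python
# Hypothetical (section, corpus size n, raw frequency f) rows.
sections = [
    ("economy", 400_000, 1200),
    ("international", 350_000, 700),
    ("religion", 80_000, 400),
    ("sports", 150_000, 150),
]


def nf_per_100(f, n):
    """Adjusted frequency: occurrences per 100 running words."""
    return f * 100 / n


# Order sections from higher to lower raw frequency f.
by_raw = [name for name, n, f in sorted(sections, key=lambda row: row[2], reverse=True)]

# Order sections from higher to lower adjusted frequency nf.
by_nf = [
    name
    for name, n, f in sorted(sections, key=lambda row: nf_per_100(row[2], row[1]), reverse=True)
]
```

With these toy numbers the religion section moves from third place under raw counts to first place under adjusted frequencies, because its few hundred hits come from a much smaller body of text.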
Dispersion measures demonstrate the degree to which the data are scattered. This shows whether the data are homogenous or heterogeneous. The dispersion measures used herein were the relative measures of dispersion: the mean of the relative frequency rf × 100 (the nf of the central word for the seven sections in the data from each year), the standard deviation SD, the coefficient of variance and the powerful measure of dispersion known as Juilland’s D. The outputs of these measures are given in Table 8 and Table 9.
Table 8. Dispersion analysis of the central word among the sections for the year 2021.
Table 9. Dispersion analysis of the central word among the sections for the year 2022.
There were seven data parts, which corresponded to the number of sections in the data from each year. The interpretation of the sampled data was based on the values (between 0 and 1) of the coefficient of variance CV and Juilland’s D. In Table 8 and Table 9, the CV values were both closer to 0 than to 1, meaning that the amount of variance was small. This can be further demonstrated by calculating Juilland’s D, the value of which was close to 1 for both years, meaning that the central word was distributed fairly evenly.

4.2. Normalized Frequencies of Co-Occurrences

Newspaper articles (opinions and editorials) and news are an instance of discourse, where discursive events are affected by social practice, and the latter has an impact on shaping the discursive practice (Fairclough 1992). We obtained the complete excerpts/examples of the co-occurrences concerning women from our quantitative data, and the social practices were excavated from all those examples to unveil the progress of women’s empowerment.
The application of the 2n-grams of the central word, excluding the grammatical items to ensure that all the co-occurrences appeared correctly, was the first step for the qualitative observations of the co-occurrences and their broad contexts. Not excluding the grammatical items produced a long list of co-occurrences and removing them reduced the size of the list. This is feasible as long as such items will appear when extracting the citations of the co-occurrences for further contextual interpretations. What we processed reproduced the instantiations of social practices toward women-related themes and rhemes. What we arrived at afterward are examples of the co-occurrences. Table 10 presents the highest absolute frequency collocates associated with the central word, in addition to the sections in which they appeared. It is noticeable that the absolute frequency of the co-occurrences was higher in 2021 than in 2022. However, as we were following the perspective of the nf base, and rf in particular, some co-occurrences were more frequent in 2022 than in 2021. That is, the normalized frequencies of تمكين, إنجاز, حقوق, عمل and دور were higher in 2022.
Table 10. Absolute frequencies (f) of the co-occurrences for 2021 and 2022.
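The extraction of co-occurrences around the central word, with grammatical items excluded, might look roughly like the following sketch. It is not the authors' pipeline: the stop list is a tiny illustrative sample, a window of one token on each side is an assumed reading of "2n-grams", and whitespace splitting stands in for proper Arabic tokenisation.

```python
from collections import Counter

# The two central word forms named in the study.
NODES = {"المرأة", "النساء"}
# Tiny illustrative list of grammatical items (assumption, not the study's list).
STOPWORDS = {"في", "من", "على", "إلى", "عن"}


def collocate_counts(tokens, window=1):
    """Count non-grammatical words within `window` tokens of a node word."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token not in NODES:
            continue
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            neighbour = tokens[j]
            if j != i and neighbour not in STOPWORDS and neighbour not in NODES:
                counts[neighbour] += 1
    return counts


def relative_frequencies(counts, n, base=100):
    """Convert absolute collocate counts to frequencies per `base` tokens."""
    return {word: count * base / n for word, count in counts.items()}


# Toy token list; a real corpus would hold one article per file.
tokens = "تمكين المرأة في سوق العمل و حقوق المرأة".split()
counts = collocate_counts(tokens, window=1)
rf = relative_frequencies(counts, n=len(tokens))
```

Excluded grammatical items remain in the token list itself, so they are still visible when the full citation of a co-occurrence is pulled for contextual reading.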
Noticeably, the practicality of the nf base of each co-occurrence reflected in rf (×100) is an orthodox method for undertaking a reasonable analysis of the co-occurrences that reflect the progress of the social change toward women-related themes. Some negative co-occurrences from opponents terminated in the year 2022, namely, ضد (against) and جسد (body), after they occurred in the year 2021 in the economy (E-2), international (I-3), culture (C-6), religion (R-7) and society (O-5) sections. These two collocates were the only ones found to be negative in the year 2021. The remaining 13 collocates in their co-occurrences were positive. This indicated swift progress in social practices constructed toward women-related themes, keeping in mind the theoretical and methodological grounds highlighted earlier in this study.
The same method was applied to the verbs (Table 11) associated with the central word and the genitive/adjective construction of the co-occurrences for the residuals of the examples (Figure 2 and Figure 3). The verbs found to be negative from their co-occurrences posed by opponents in the sampled data of the years 2021 and 2022 were تستهلك (she consumed), تغار (she is jealous) and أغرى (he tempted her to), which were found in the front page (F-1), international (I-3), culture (C-6) and religion (R-7) sections.
Table 11. Verbal co-occurrences rf = (>0.00001 × 100) for 2021 and 2022.
Figure 2. Bar plot of the genitive/adjective construction co-occurrences rf = (>0.000001 × 100) for 2021.
Figure 3. Bar plot of the genitive/adjective construction co-occurrences rf = (>0.000001 × 100) for 2022.
The results of the remaining co-occurrences are shown in Figure 2 and Figure 3 and are given according to the relative frequencies in scientific notation due to them being too small. For example, the number 200 means that its rf is expressed in units of $1 \times 10^{-6}$ (0.000001), and as a result, the number 200 corresponds to a value of 0.000200. The nf (rf × 100) for the collocate مسؤولية, for instance, was exactly 0.000278 in the output file after the automatic calculation. The collocates that were found to be negative in the remaining co-occurrences were conveyed by ادعاءات (allegations), استهداف (targeting), تشبيهه (image/likening), مغرورة (arrogant), تهميش (marginalization), عورة (grooming), السيارة (car), حرام (forbidden), بتصوير (portrayal), شعرها (her hair), مصافحة (shaking hands) and المتورطات (who are implicated).

4.3. Instantiations of Social Practices

In this section, we present the results in a broader context, meaning that the examples extracted herein were analyzed in their contexts by considering the sections they were mentioned in. The women-related themes and their progress after Saudi Vision 2020 and the constructivist social changes that took place in favor of or against women-related themes were manually extracted. We grouped the co-occurrences and their broad contexts into two facets of social practice constructions, namely, proponents (progressive supporters) and opponents. The former denoted further support and real positive achievements in empowering women, while the latter represented opponents of women’s empowerment. Table 12, Table 13 and Table 14 together contain 74 positive examples of co-occurrences (supported by proponents) and 17 negative ones (raised by opponents) according to their broad contexts.
Table 12. Broad contexts of the genitive construction co-occurrences (rf = >0.00001 × 100).
Table 13. Broad contexts of the verbal construction co-occurrences (rf = >0.00001 × 100).
Table 14. Broad contexts of the genitive/adjectival construction co-occurrences (nf = <0.00001 × 100).
Table 12 contains the examples and the broad contexts of the genitive co-occurrences whose nf values were more than 0.0001 per 100 words, and Table 13 contains the examples and the broad contexts of the verbal co-occurrences whose nf values were more than 0.00001 per 100 words. Table 12 and Table 13 show that a few negative examples were categorized under opponents. At a glance, the examples of positive genitive and verbal co-occurrences (proponents) were much more frequent than negative ones (opponents). Table 14 shows the examples and the broad contexts of the co-occurrences whose rf values were less than 0.000001 per 100 words. As for the co-occurrences exemplified therein, negativity increased.
A few examples of negativity from opponents were found in the contexts of the international (I-3), culture (C-6), society (O-5) and religion (R-7) sections in Table 12, and in the contexts of the culture (C-6), front page (F-1), international (I-3) and religion (R-7) sections in Table 13. The remaining co-occurrences did not match between the 2021 and 2022 data, which resulted in them being presented separately in Table 14. Therein, a few examples of negativity can be seen in the contexts of the international (I-3), society (O-5), economy (E-2) and religion (R-7) sections in the 2021 data and in the contexts of the front page (F-1) and international (I-3) sections in the 2022 data. No negativity was detected in the context of the sports (S-4) section in either year or in the context of the economy (E-2) section in the year 2022.

5. Discussion

All the observable facts about the normalized frequencies nf (relative frequencies rf × 100) of the co-occurrences of the central word from the data parts of the 2021 and 2022 samples and about the dispersion measures gave us grounds to look through the citations/examples of the co-occurrences. The next step was to capture the broad contexts of the collocations in their given citations to categorize the discursive practices. In Table 12, Table 13 and Table 14, we numbered the collocates associated with the central word and provided the broad context constructs and the contextualized co-occurrences. The quantitative method relating to the measuring of the normalized frequency base of the co-occurrences was used to express the frequencies of the co-occurrences relative to all the study data. This method is powerful as it is based on the average. In addition, such computational techniques utilized for text data are of high importance in discourse analyses, especially critical discourse analysis (CDA), since the media and newspapers have become digitized. The technique of scraping textual data and organizing it in terms of metadata is feasible using different techniques and the requests library in Python. Moreover, the nf and dispersion measures that tackle the unbalanced linguistic data parts provided us with the confidence to analyze and report the progress of women’s empowerment in Saudi newspapers after Saudi Vision 2020.
Recall that the study question was the following: to what extent do co-occurrences (collocate + central word) and their broad contexts indicate the discursive and social practices of the women-related themes that were introduced, promoted or modified over the two years and the multiple sections? The findings given in Table 12, Table 13 and Table 14 included all the examples and their broad contexts with reference to their sections. The number of cases in which the progress of women’s empowerment was supported was 74. On the other hand, progress was impeded in only 17 cases. Therefore, opponents were far fewer in number than supporters. Opponents were found in the international, culture, society and religion (Table 12), front page and religion (Table 13) and economy sections in the year 2021, and the international, society, religion and front page sections in both years (Table 14).
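As a quick arithmetic check on the counts just reported, the shares of supporting and impeding cases among the 91 categorized examples can be worked out as follows. The proportions are derived here from the counts of 74 and 17 and are not figures reported in the study itself.

```latex
\[
\frac{74}{74 + 17} = \frac{74}{91} \approx 0.813,
\qquad
\frac{17}{74 + 17} = \frac{17}{91} \approx 0.187
\]
```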
In light of this, the quantitative findings could be taken as empirical evidence of the reciprocal and constructionist link between discourse and context that was established earlier in this paper and is often highlighted in the relevant literature. In particular, what can be inferred from the above discussion is that the reformative measures taken by the Saudi government in accordance with its 2020 and, prospectively, 2030 visions are starting to materialize within discursive practices at various levels. This has not only taken place as these changes have been gradually implemented but has also manifested at a linguistic level through the national media. Women’s empowerment emerges within this context as a dominant theme that extends across various fields, and opposition to this theme, which used to be quite strong in previous decades, is starting to fade.
In that sense, it is possible to see the link between our results and what Fairclough (1992, p. 201) refers to as the ‘democratization’ and ‘technologisation’ of discourse. Democratization of discourse can be defined as ‘the removal of inequalities and asymmetries in the discursive and linguistic rights, obligations and prestige of groups of people’, while technologization refers to the opposite, in which there is a deliberate or subconscious intervention in discourse to maintain the status quo by ensuring that a given discursive hegemony is discursively introduced and distributed. As many of the examined collocations established a shift toward more positive and women-empowering representations (be it through the positive polarity of the noun collocates or the grammatically active verb collocates), some of the linguistic asymmetries disfavoring women are also starting to decline.
Taking into consideration the aforementioned discussion of positive discourse analysis and the socially constituting inherent feature of discourse, one might be able to highlight the potential of such linguistic transformation in pushing forward more gender equality. Since it has been established in the analysis that such reformative measures were correspondingly translated in linguistic circles, it is only fair to predict further social change to be initiated by discourse on its linguistic end. In particular, this should be seen as operating in a circular fashion rather than a linear one. As change is repeatedly introduced and distributed discursively vis-à-vis the status quo within a particular context, it begins to transform gradually into the status quo per se (Al Maghlouth 2017). This usually takes place through several means, one of which is the normalization of such change linguistically and, in consequence, cognitively and socially.
This could also be linked to the role of governing and policymaking as primarily societal factors within a given discourse, as proposed by Bracher (1993, p. 53). By the same token and drawing on Gramsci’s structure of power, Fairclough (2013) highlighted the significant role of political power in the domination of certain ideologies, which would gradually inform and be informed by discourse. This is consistent with what was highlighted in another local study (Al Maghlouth 2017), in which social change was advocated for and pushed forward by decision makers and the associated policy in the country. Interestingly, in her study, policies in support of women’s empowerment faced opposition more than a decade ago; nevertheless, they persisted and paved the way for far more reformative measures to materialize. However, the present study documented far less opposition, which is a finding that highlights the role of awareness in promoting, accepting and maintaining the desired change alongside governing and policymaking.
Moreover, our study data add to the rich research on the us–them representation spectrum, which is a recurrent theme in social psychology. In particular, a detailed examination of the relevant literature revealed a distinction between how Saudi women are constructed locally by the Saudi media and how they are portrayed internationally by the foreign media. For instance, othering and negative representations often seem to be reinforced in the Western media (Al-Hejin 2015; Elyas and Aljabri 2020; Karimullah 2020; Ruby 2013; Saleh 2016), and sincere changes or reforms in support of women are often overlooked. However, media sources produced, distributed and examined locally appear to be more consistent at reporting such changes and even portray a more progressive construction of Saudi women. This should be approached from a perspective that validates such findings while acknowledging that this might not always be the case. To elaborate, Ndambuki and Janks (2010), for instance, reported a mismatch: Kenyan political leaders portrayed Kenyan women discursively as lacking in agency, whereas in reality these women were quite agentive despite being surrounded by dominating discourses of patriarchy and rurality in Kenya. What this signifies is that the constructionist link between discourse and context in this Kenyan study was not analyzed based on broad discursive patterns of varying representative data, and thus, such a link should not be taken for granted across different discourses and contexts.

6. Conclusions

In brief, this study used computational techniques to analyze women’s social change in Saudi newspapers after the implementation of Saudi Vision 2030 as an initial stage of the national transformation program. One of the techniques used to assess the data from the years 2021 to 2022 was based on the adjusted frequencies of the central word (woman, pl. women) across seven data parts from the datasets from each year. This method is used when the data parts are unbalanced and the raw frequencies of the lexical unit are highly skewed.
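The reasoning behind normalizing counts across unbalanced data parts can be illustrated with a minimal Python sketch. It is not the authors' implementation: the section counts below are invented for illustration, the per-million basis is an assumption, and Gries's deviation of proportions (DP) stands in for whichever adjusted-frequency and dispersion measures the study actually applied.

```python
from typing import Dict, Tuple

# Hypothetical (hits, tokens) per newspaper section; NOT the study's data.
SECTIONS: Dict[str, Tuple[int, int]] = {
    "front page": (120, 400_000),
    "economy": (90, 600_000),
    "international": (60, 500_000),
    "sports": (30, 700_000),
    "society": (300, 300_000),
    "culture": (150, 250_000),
    "religion": (50, 250_000),
}


def relative_frequency(hits: int, tokens: int, basis: int = 1_000_000) -> float:
    """Return occurrences per `basis` tokens (per million is an assumed basis)."""
    if tokens <= 0:
        raise ValueError("tokens must be positive")
    return hits / tokens * basis


def deviation_of_proportions(parts: Dict[str, Tuple[int, int]]) -> float:
    """Gries's DP: 0 means the word is spread in proportion to part sizes;
    values approaching 1 mean it is concentrated in few parts."""
    total_hits = sum(hits for hits, _ in parts.values())
    total_tokens = sum(tokens for _, tokens in parts.values())
    if total_hits == 0 or total_tokens == 0:
        raise ValueError("corpus must contain tokens and at least one hit")
    # Half the summed gap between observed hit share and expected token share.
    return 0.5 * sum(
        abs(hits / total_hits - tokens / total_tokens)
        for hits, tokens in parts.values()
    )


for name, (hits, tokens) in SECTIONS.items():
    print(f"{name:>13}: {relative_frequency(hits, tokens):8.1f} per million")
print(f"DP = {deviation_of_proportions(SECTIONS):.3f}")
```

In this sketch, the relative frequencies make sections of different sizes comparable, while a low DP value would indicate that the central word is distributed across the sections roughly in proportion to their size.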
The discursive social practices found in the broad contexts of the central word’s co-occurrences were extracted from the study data. The data were amassed from archived Saudi newspaper articles. As such, we used quantitative data obtained by applying statistical tools to our self-built corpus of Saudi newspapers published in 2021 and 2022. In doing so, the rationale and procedures for data collection and analysis were articulated in detail. The findings clearly indicated a more progressive tone in terms of women’s empowerment in the Kingdom, which was consistent with the transformative measures that have been implemented by the Saudi government over the last decade. However, despite the considerable ascending change in favor of empowering women detected in the examples of discursive practices in the 2022 data, some examples conveying negativity were found in the religion and society sections.
As such, the study offers another insight into the constructionist perspective of discourse, highlighting various linguistic manifestations of social change that are embedded within discursive practices. The study also demonstrated the need to investigate representations of Saudi women from within the Saudi context rather than from representations stemming from foreign, mostly Western, media. In addition, a number of theoretical and methodological implications can be drawn from the approach taken in this study, especially if one considers the very scarce literature on Arabic corpora in international journals.

Author Contributions

The authors contributed equally to this work. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by The Deanship of Scientific Research, King Faisal University, Hofuf, Saudi Arabia (GRANT1355).

Data Availability Statement

Data available on request.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Al Maghlouth, Shrouq. 2017. A Critical Discourse Analysis of Social Change in Women-Related Posts on Saudi English-Language Blogs Posted between 2009 and 2012. Ph.D. thesis, Lancaster University, Lancaster, UK.
  2. Al Maghlouth, Shrouq. 2021. Metaphorical Analysis of Discourse on Early Saudi Attempts to Include Women in Unconventional Work Environments. GATR Global Journal of Business and Social Science Review 9: 1–9.
  3. Al-Hejin, Bandar. 2015. Covering Muslim women: Semantic macrostructures in BBC News. Discourse & Communication 9: 19–46.
  4. Al-Munajjed, Muna. 2010. Women’s employment in Saudi Arabia. A Major Challenge. In Middle East and North Africa Business. Report. Riyadh: Booz & Company’s Ideation Center.
  5. Alemdaroglu, Ayça. 2015. Escaping femininity, claiming respectability: Culture, class and young women in Turkey. Women’s Studies International Forum 53: 53–62.
  6. Alkhammash, Reem. 2022. Multimodal metaphors and sexism in Arabic cartoons depicting gender and gender relations during COVID-19. Multimodal Communication 11: 235–46.
  7. Alkhammash, Reem, and Haifa Al-Nofaie. 2020. Do Saudi academic women use more feminised speech to describe their professional titles? An evidence from corpus. Training, Language and Culture 4: 9–20.
  8. Almaghlouth, Shrouq. 2022. Mourning the lost: A social actor analysis of gender representation in the @FacesofCovid’s tweets. Frontiers in Psychology 13: 7614.
  9. Almujaiwel, Sultan. 2017. Discursive patterns of anti-feminism and pro-feminism in Arabic newspapers of the KACST corpus. Discourse & Communication 11: 441–66.
  10. Bakar, Kesumawati. 2014. Attitude and identity categorizations: A corpus-based study of gender representation. Procedia, Social and Behavioral Sciences 112: 747–56.
  11. Baker, Paul. 2006. Using Corpora in Discourse Analysis. London: A&C Black.
  12. Baker, Paul. 2014. Using Corpora to Analyze Gender. London: Bloomsbury Publishing.
  13. Baker, Carolyn D., and Peter Freebody. 1989. Children’s First School Books: Introduction to the Culture of Literacy. Cambridge: Cambridge University Press.
  14. Bashatah, Nahid. 2017. Framing Analysis of British Newspaper Representation of Saudi Women from 2005–2013. Ph.D. thesis, University of Salford, Manchester, UK.
  15. Biber, Doug, Randi Reppen, and Eric Friginal. 2012. Research in Corpus Linguistics. In The Oxford Handbook of Applied Linguistics, 2nd ed. Edited by Robert Kaplan. Oxford: Oxford University Press, pp. 548–70.
  16. Bracher, Mark. 1993. Lacan, Discourse and Social Change: A Psychoanalytic Cultural Criticism. Ithaca: Cornell University.
  17. Brezina, Vaclav. 2018. Statistics in Corpus Linguistics. Cambridge: Cambridge University Press.
  18. Brooke, Mark. 2020. “Feminist” in the sociology of sport: An analysis using legitimation code theory and corpus linguistics. Ampersand 7: 1–8.
  19. Brown, Jane Delano, Carl Bybee, Stanley Wearden, and Dulcie Murdock Straughan. 1987. Invisible power: Newspaper news sources and the limits of diversity. Journalism & Mass Communication Quarterly 64: 45–54.
  20. Brun-Mercer, Nicole. 2021. Women and men in the United Nations: A corpus analysis of general debate addresses. Discourse & Society 32: 443–62.
  21. Butler, Judith. 1999. Gender Trouble: Feminism and the Subversion of Identity. New York: Routledge.
  22. Caldas-Coulthard, Carmen Rosa, and Rosamund Moon. 2010. “Curvy, hunky, kinky”: Using corpora as tools for critical analysis. Discourse & Society 21: 99–133.
  23. Coimbra-Gomes, Elvis, and Heiko Motschenbacher. 2019. Language, normativity, and sexual orientation obsessive-compulsive disorder (SO-OCD): A corpus-assisted discourse analysis. Language in Society 48: 565–84.
  24. Eberhardt, Maeve. 2017. Gendered representations through speech: The case of the Harry Potter series. Language and Literature 26: 227–46.
  25. Elliott, Carolyn. 2008. Introduction: Markets, communities and empowerment. In Global Empowerment of Women: Responses to Globalization and Politicized Religions. Edited by Carolyn Elliott. New York: Routledge.
  26. Elyas, Tariq, Kholoud Ali Al-Zhrani, Abrar Mujaddadi, and Alaa Almohammadi. 2021. The representation(s) of Saudi women pre-driving era in local newspapers and magazines: A critical discourse analysis. British Journal of Middle Eastern Studies 48: 1033–52.
  27. Elyas, Tariq, and Abdulrahman Aljabri. 2020. Representations of Saudi male’s guardianship system and women’s freedom to travel in Western newspapers: A critical discourse analysis. Contemporary Review of the Middle East (Online) 7: 339–57.
  28. Fairclough, Norman. 1992. Discourse and Social Change. Cambridge: Polity Press.
  29. Fairclough, Norman. 2013. Critical Discourse Analysis: The Critical Study of Language, 2nd ed. London: Pearson.
  30. Fairclough, Norman, and Ruth Wodak. 1997. Critical discourse analysis. In Discourse as Social Interaction. Discourse Studies: A Multidisciplinary Introduction, 2nd ed. Edited by Teun A. Van Dijk. London: Sage, pp. 258–84.
  31. Grunenfelder, Julia. 2013. Discourses of gender identities and gender roles in Pakistan: Women and non-domestic work in political representations. Women’s Studies International Forum 40: 68–77.
  32. HRDF. 2023. Human Resources Development Fund. Available online: www.hrdf.org.sa (accessed on 1 December 2022).
  33. Kabeer, Naila. 1999. Resources, agency, achievements: Reflections on the measurement of women’s empowerment. Development and Change 30: 435–64.
  34. Kabeer, Naila. 2005. Gender equality and women’s empowerment: A critical analysis of the third millennium development goal. Gender and Development 13: 13–24.
  35. Kahf, Mohja. 1999. Western Representations of the Muslim Woman from Termagant to Odalisque, 1st ed. Austin: University of Texas Press.
  36. Karimullah, Kamran. 2020. Sketching women: A corpus-based approach to representations of women’s agency in political Internet corpora in Arabic and English. Corpora 15: 21–53.
  37. Khumalo, Kathryn, Kimber Haddix McKay, and Wayne Freimund. 2015. Who is a “real woman”? Empowerment and the discourse of respectability in Namibia’s Zambezi region. Women’s Studies International Forum 48: 47–56.
  38. Kjellmer, Göran. 1986. ‘The lesser man’: Observations on the role of women in modern English writings. In Corpus Linguistics II. Edited by Jan Aarts and Willem Meijs. Leiden: Brill, pp. 163–76.
  39. Kleemans, Mariska, Gabi Schaap, and Liesbeth Hermans. 2017. Citizen sources in the news: Above and beyond the vox pop? Journalism 18: 464–81.
  40. Lee, Jackie F. K. 2018. Gender representation in Japanese EFL textbooks: A corpus study. Gender and Education 30: 379–95.
  41. Martin, James. 2004. Positive discourse analysis: Solidarity and change. Revista Canaria de Estudios Ingleses 49: 179–200.
  42. Mayoux, Linda. 1998. Participatory learning for women’s empowerment in micro-finance programmes: Negotiating complexity, conflict and change. IDS Bulletin 29: 39–50.
  43. McEnery, Tony, and Andrew Hardie. 2012. Corpus Linguistics. Cambridge: Cambridge University Press.
  44. Mishra, Smeeta. 2007. “Liberation” vs. “purity”: Representations of Saudi women in the American press and American women in the Saudi press. The Howard Journal of Communications 18: 259–76.
  45. Moscovici, Serge. 2000. Social Representations: Explorations in Social Psychology. Oxford: Blackwell.
  46. Ndambuki, Jacinta, and Hilary Janks. 2010. Political discourses, women’s voices: Mismatches in representation. CADAAD Journal 4: 73–92.
  47. Partington, Alan, Alison Duguid, and Charlotte Taylor. 2013. Patterns and Meanings in Discourse: Theory and Practice in Corpus-assisted Discourse Studies (CADS). Amsterdam: John Benjamins.
  48. Pearce, Michael. 2008. Investigating the collocational behaviour of man and woman in the BNC using Sketch Engine. Corpora 3: 1–29.
  49. Popa, Dorin, and Delia Gavriliu. 2015. Gender representations and digital media. Procedia, Social and Behavioral Sciences 180: 1199–206.
  50. Koller, Veronika. 2012. How to analyse collective identity in discourse–textual and contextual parameters. Critical Approaches to Discourse Analysis across Disciplines 5: 19–38.
  51. Romaine, Suzanne. 1999. Communicating Gender. Mahwah: L. Erlbaum Associates.
  52. Romano, Manuela. 2021. Creating new discourses for new feminisms: A critical socio-cognitive approach. Language & Communication 78: 88–99.
  53. Ross, Karen, Elizabeth Evans, Lisa Harrison, Mary Shears, and Wadia Khursheed. 2013. The gender of news and news of gender: A study of sex, politics, and press coverage of the 2010 British general election. The International Journal of Press/Politics 18: 3–20.
  54. Ruby, Tabassum Fahim. 2013. Muslim women and the Ontario Shari’ah tribunals: Discourses of race and imperial hegemony in the name of gender equality. Women’s Studies International Forum 38: 32–42.
  55. Saleh, Layla. 2016. (Muslim) woman in need of empowerment: US foreign policy discourses in the Arab spring. International Feminist Journal of Politics 18: 80–98.
  56. Sarfo-Kantankah, Kwabena Sarfo. 2021. The discursive construction of men and women in Ghanaian parliamentary discourse: A corpus-based study. Ampersand 8: 1–10.
  57. Sjøvaag, Helle, and Truls Pedersen. 2019. Female voices in the news: Structural conditions of gender representations in Norwegian newspapers. Journalism & Mass Communication Quarterly 96: 215–38.
  58. Sriwimon, Lanchukorn, and Pattamawan Jimarkon Zilli. 2017. Applying critical discourse analysis as a conceptual framework for investigating gender stereotypes in political media discourse. Kasetsart Journal of Social Sciences 38: 136–42.
  59. Saudi Vision 2030. 2022. Kingdom of Saudi Arabia: SV2030. Available online: https://www.vision2030.gov.sa/ (accessed on 14 December 2022).
  60. Termine, Paola, and Monika Percic. 2015. Rural women’s empowerment through employment from the Beijing Platform for Action Onwards. IDS Bulletin 46: 33–40.
  61. Van Dijk, Teun A. 2015. Critical Discourse Analysis. In The Handbook of Discourse Analysis, 2nd ed. Edited by Deborah Tannen, Heidi Hamilton and Deborah Schiffrin. Hoboken: John Wiley & Sons, pp. 466–85.
  62. Wharton, Sue. 2005. Invisible females, incapable males: Gender construction in a children’s reading scheme. Language and Education 19: 238–51.
  63. Wolfsfeld, Gadi, and Tamir Sheafer. 2006. Competing actors and the construction of political news: The contest over waves in Israel. Political Communication 23: 333–54.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.