Legitimization or Delegitimization? A Multimodal Critical Discourse Analysis of the 2025 Los Angeles Protests in CNN and Fox News

Fang, Xinyu; Dong, Fangfeng

doi:10.3390/journalmedia7010030


by Xinyu Fang and Fangfeng Dong *

School of Foreign Languages, Central China Normal University, Wuhan 430079, China

* Author to whom correspondence should be addressed.

Journal. Media 2026, 7(1), 30; https://doi.org/10.3390/journalmedia7010030

Submission received: 31 December 2025 / Revised: 4 February 2026 / Accepted: 7 February 2026 / Published: 11 February 2026


Abstract

In the context of polarized media discourse, this study examines how outlets with distinct political leanings constructed multimodal representations of the 2025 Los Angeles protests. Adopting a corpus-assisted Multimodal Critical Discourse Analysis (MCDA) framework, this study integrates Entman’s framing theory with Kress and van Leeuwen’s visual grammar to analyze news coverage of the protests. The results reveal a divergence in multimodal strategies. Fox News employs a delegitimization frame through a dominant strategy of reinforcement, where images serve as direct evidence for textual claims. Conversely, CNN constructs a conditional legitimacy frame via a more nuanced strategy, through which the outlet strategically utilizes multimodal contradiction to negotiate with the “protest paradigm” and mitigate the visual reality of disorder. The findings demonstrate how partisan media leverage distinct multimodal strategies to reconstruct opposing social realities. The study contributes to political discourse research by going beyond textual bias to reveal how multimodal strategies function in media polarization environments.

Keywords:

multimodal critical discourse analysis; framing; media polarization; Los Angeles protests; (de)legitimization

1. Introduction

In June 2025, Los Angeles in California witnessed large-scale protests triggered by workplace raids conducted by U.S. Immigration and Customs Enforcement (ICE) targeting the local Latino community. The move led to the arrest of 118 immigrants, sparking widespread public outrage. Protesters condemned ICE’s aggressive enforcement tactics and criticized the Trump administration for further escalating tensions by deploying the National Guard and Marine Corps. While the demonstrations were initially peaceful, clashes intensified as some radical protesters engaged in violent confrontations with law enforcement, including throwing stones, burning autonomous vehicles, and destroying public property. Los Angeles Mayor Karen Bass called for calm, emphasizing the need for peaceful protests while condemning the federal government’s excessive militarized intervention. However, the Trump administration maintained a hardline stance, labeling the protests as “riots” and authorizing the use of force to restore order. The incident not only highlighted deep divisions over U.S. immigration policy but also reignited debates about federal power versus state autonomy. California Governor Gavin Newsom and other local officials strongly opposed federal intervention, regarding it as an infringement on state sovereignty. Beyond its political significance, the protest also became an arena of discursive struggle, where competing narratives sought to define legitimacy.

This event has become the subject of massive media coverage. In this context, journalism bears the responsibility not merely to report events but to construct their public meaning (Linge & Bangstad, 2024). Through framing and rhetorical selection, the media determine what aspects of an event are highlighted and how they are interpreted (Rasoulikolamaki et al., 2025). Accordingly, the representation of LA protests in news discourse is not a static reflection of reality but an active process of meaning-making, which further influences public perception.

Furthermore, discursive differences in media coverage underscore the power of language in reconstructing political realities. In the United States, news discourse is deeply shaped by partisan polarization, especially on politically contentious issues (Iyengar & Hahn, 2009) such as immigration and protest. Specifically, liberal-leaning outlets such as CNN and conservative-leaning outlets such as Fox News frequently construct sharply divergent accounts of the same incident. Moreover, audiences are easily influenced by mainstream outlets (F. L. F. Lee, 2014) and favor those aligning with their ideological preferences (Chong & Druckman, 2007). Understanding outlets’ rhetorical strategies therefore helps citizens form their own opinions.

Notably, the contemporary media landscape is characterized by the interplay of multiple modes, including text, image, and video (Xu & Loffelholz, 2021). The analysis of media discourse is incomplete without considering other modalities, particularly visual resources which have been shown to exert a “picture superiority effect” (Childers & Houston, 1984; Geise & Baden, 2015; Mosallaei & Porpora, 2024; Nelson et al., 1976). Therefore, media discourse, as an integrated ensemble of verbal and visual resources, should be analyzed through the lens of multimodality.

Based on the perspective above, this study aims to investigate how CNN and Fox News framed the 2025 LA protests through multimodal critical discourse analysis (MCDA), further revealing how text and images jointly contribute to ideological polarization in U.S. media discourse.

2. Literature Review

Academic research has devoted considerable attention to political discourse. Within this field, topics related to immigration and social protests occupy a critical position because they saliently reveal how power, identity, and legitimacy are negotiated through language (Fairclough, 1995; van Dijk, 1988). Research on these topics has shown that news discourse is not a neutral mirror of events but a meaning-making process which consistently frames them as threats to national identity and social security (Cap, 2018; Ekström et al., 2025).

Two key research trends can be identified in the literature. The first concerns the evolution of the “protest paradigm”. Previous research has long been dominated by the “protest paradigm”, which refers to the tendency of mainstream media to marginalize and delegitimize protests (Chan & Lee, 1984; McLeod & Hertog, 1999). However, recent studies find that protests are portrayed in a more humanizing and legitimizing way (Ondimu et al., 2025; Rasoulikolamaki et al., 2025). This shift reflects the context-dependent nature of protest coverage, which may vary with political and ideological positions. For instance, when the reporting subject is an “official enemy” of a Western regime, the Western media shows a clear “paradigm deviation” (Mosallaei & Porpora, 2024). Similarly, in domestic contexts, coverage of contentious issues such as the LA protests is significantly influenced by the political stance of the outlet (Baum & Groeling, 2008; Doufesh & Briel, 2021; Feldman et al., 2012). The second major trend highlights the ideological functions of the media in constructing public meaning. Recognizing this tendency, researchers have applied diverse analytical approaches, including framing theory, Critical Discourse Analysis, and social semiotics, to reveal how news discourse shapes ideology (Entman, 2004; Machin, 2013; Mosallaei & Porpora, 2024; van Leeuwen, 2008). In addition, studies have shown that news discourse is deeply influenced by institutional and political alignment, reflecting the interests of ownership and ideology (Fairclough, 1995; Fowler, 1991; Richardson, 2007). In polarized contexts, framing offers an analytical lens for examining whether the protest is legitimized as democratic expression or delegitimized as a threat to social order (Doufesh & Briel, 2021; Mosallaei & Porpora, 2024).

Although previous research has confirmed media bias in crisis reporting (Iyengar & Hahn, 2009; C. Lee, 2016; Teo, 2000) and the great role of framing (Entman, 1993; Chong & Druckman, 2007), three critical gaps remain.

First, existing studies have largely focused on international issues such as the Arab Spring or humanitarian crises (Abbas & Kadim, 2024; Entman, 2004; Rodríguez & Dimitrova, 2011), while domestic federal-state conflicts within a single country have received limited attention despite their discursive salience.

Second, although a large body of research has recognized the role of Systemic Functional Linguistics or framing as theoretical foundations (Abbas & Kadim, 2024; Brookes, 2023; Dai & Hyun, 2010), most studies over-rely on textual resources, such as keywords, syntactic patterns, or thematic emphases, while overlooking the multimodal nature of contemporary journalism (Matthes, 2009; Coleman, 2010). Visual elements are not simply supplements but central components of meaning-making (Kress & van Leeuwen, 2006; Machin, 2013).

Third, while researchers have observed that text–image relations are complex and systematic (Martinec & Salway, 2005), empirical studies often reduce their relations to a binary of congruence and incongruence (Jungblut & Zakareviciute, 2019; Mosallaei & Porpora, 2024). The research often presupposes that inconsistent relationships will inevitably lead to competitive or dissolving effects and links them to specific negative frameworks such as the “protest paradigm”. While it reveals deep ideological operation mechanisms, it also simplifies the complexity of meaning-making, overlooking how different modes interact dynamically to construct ideology.

To address these gaps, this study aims to adopt an integrated approach from the perspective of MCDA to examine how outlets with distinct political leanings construct the same event. By focusing on the 2025 Los Angeles protests, the study moves beyond the international scope of previous studies to investigate how domestic federal-state conflicts are discursively constructed. Furthermore, the framework of MCDA allows the research to overcome the over-reliance on textual analysis, positioning visual elements as central components of meaning-making as well. Crucially, this study transcends the simplistic binary of congruence and incongruence by refining the categories of text-image relations to offer a more precise lens to investigate their collaboration. By doing so, it contributes to understanding domestic media discourse, shedding light on how multimodal representations reconstruct meaning.

3. Theoretical Framework

The study adopts Multimodal Critical Discourse Analysis (MCDA) as the analytical tool. This framework, developed by Machin and Mayr (2012), integrates multiple dimensions including critical discourse analysis and semiotic resources. It holds that semiotic choices are never arbitrary but are ideologically motivated to naturalize specific power relations. Therefore, emphasis is placed on investigating why certain semiotic resources are chosen to conceal agency or background specific social actors. Central to this approach is the concept of recontextualization, which can be used to examine “the discursive process of transforming social practices” (Machin, 2013). In news discourse, textual frames act as a filter, selecting specific meanings from visual images to support the outlet’s preferred narrative (Machin & Mayr, 2012). This perspective provides a lens for revealing how the media construct reality and disseminate ideology through multiple kinds of communicative modes.

Crucially, to investigate whether the frame is one of legitimization or delegitimization, this study draws on van Leeuwen’s (2007, 2008) framework. According to van Leeuwen, legitimization refers to the discursive strategies used to justify or explain social practices and institutional orders. Specifically, this study focuses on three key categories of legitimation: (1) authorization, legitimizing via reference to authority, such as laws and authoritative figures; (2) moral evaluation, legitimizing via reference to value systems; and (3) rationalization, legitimizing via reference to the utility of social action. Following this classification, the study can precisely identify how multimodal frames function to grant or revoke political legitimacy.

Within the framework of MCDA, this paper combines Entman’s (1993) framing theory with Kress and van Leeuwen’s (2006) visual grammar to explore the use and integration of these two modes. Although the two models originate from different traditions, framing and social semiotics, respectively, they share a common concern about the underlying meanings behind representations. In this sense, integrating them gives an insight into the role of multimodal resources in news discourse.

Framing theory (Entman, 1993) offers an analytical framework for examining how the media construct social meaning by selecting and emphasizing particular aspects of reality. According to Entman (1993), framing refers to the process of selecting certain aspects of reality and making them more salient. Specifically, it involves four interrelated functions: (1) problem definition: identifying what an event is about; (2) causal attribution: diagnosing the sources of the problem; (3) moral evaluation: assigning judgment and values; and (4) treatment recommendation: suggesting possible solutions. Framing thus embodies rhetorical choices that reflect ideological values, legitimizing certain actors and delegitimizing others (Entman, 2004; Chong & Druckman, 2007; Benford & Snow, 2000).

Visual grammar (Kress & van Leeuwen, 2006), in turn, offers a lens for understanding the semiotics of images. Drawing on Halliday’s Systemic Functional Linguistics, Kress and van Leeuwen argue that images realize meaning through three metafunctions: (1) representational meaning: how participants and actions are depicted; (2) interactive meaning: how viewers are positioned toward the image; and (3) compositional meaning: how visual elements are arranged to create emphasis and coherence (Table 1). These metafunctions enable researchers to systematically decode how images communicate ideologies.

Overall, the analysis proceeds within the broader framework of MCDA. Specifically, the paper operationalizes Entman’s framing theory to examine the text, focusing on how lexical choices and syntactic structures function to define problems, attribute causes, evaluate morality, and suggest remedies. Simultaneously, Kress and van Leeuwen’s visual grammar is applied to analyze visual representations, examining how semiotic resources such as angle, social distance, and narrative vectors construct representational, interactive and compositional meanings. Based on this examination, the study then explores the multimodal interaction between text and image, classified as reinforcement, complementarity, or contradiction. Following this analytical procedure, the study aims to uncover how and why conservative and liberal media outlets employ distinct framing strategies in reporting the same protest. To this end, the study addresses three research questions:

(1) How do CNN and Fox News define, attribute, evaluate, and propose solutions to the protests through textual rhetoric, respectively?
(2) How do visual strategies interact with the textual frames to co-construct meaning?
(3) What ideological positions are conveyed through these multimodal strategies?

4. Data and Method

This study examines the 2025 LA protests, a representative domestic conflict in the U.S. To explore how partisan outlets construct contrasting framings and convey ideology, this study compares CNN and Fox News, which are widely recognized as liberal and conservative news organizations, respectively (Stroud, 2011).

Methodologically, this study combines textual framing analysis (Entman, 1993) and visual grammar analysis (Kress & van Leeuwen, 2006) within the framework of MCDA. Yet, considering that frame identification often involves observers’ subjective perception, the research integrates qualitative discourse interpretation with quantitative corpus evidence. This mixed-method design avoids the subjective limitations of purely qualitative analysis while overcoming the contextual insensitivity of purely quantitative research, aligning with current discourse studies’ emphasis on methodological synergy (Baker et al., 2008; Wodak & Meyer, 2001).

4.1. Data Collection and Sampling

CNN and Fox News were selected as analytical subjects, representing the two dominant ideological poles in the U.S. media landscape (Gabrielatos & Baker, 2008; Iyengar & Hahn, 2009). This contrastive design effectively captures how political stances mediate representations of the same social conflict.

To investigate how different media outlets responded to the LA protests, this study retrieved all relevant online reports published by CNN and Fox News during the first month of the event (1–30 June 2025). The analysis is confined to this period for two reasons. First, theoretically, the earliest media coverage of an event plays a decisive role in shaping the dominant frame because it represents a formative phase where competing narratives struggle most intensely to establish the dominant definition of the event before public opinion solidifies (Entman, 2004). Second, empirically, this selection is supported by a longitudinal preliminary search of the two outlets (from June to October 2025), which indicated that over 60% of the relevant coverage was concentrated in the first month, with a sharp decline in frequency over the following four months. To focus on the most relevant news, a search was conducted on the official websites of CNN (https://www.cnn.com/) and Fox News (https://www.foxnews.com/), using the keywords “LA (or Los Angeles or California) protest (or demonstration or riot or rally or march or unrest)” and “immigration (or migrant or immigrant rights)” in headlines. Then, to ensure that the remaining articles were relevant to the protests and suitable for multimodal discourse analysis, two criteria were applied: first, articles had to contain at least one image; second, duplicates and purely video-based reports were manually excluded. To ensure statistical balance, photo galleries and animated content were limited to their static cover images. After cleaning the data with PyCharm (version 2022), 27 articles (30,169 word tokens) and 56 images were collected from CNN, while 46 articles (37,224 word tokens) and 148 images were obtained from Fox News. This disparity suggests a differential multimodal emphasis, which reflects Fox News’s more intensive coverage and potentially greater reliance on visual storytelling for this event. Moreover, despite the discrepancy in the number of articles, the comparable token counts of the two outlets suggest that CNN may have employed longer, more detailed narratives in its coverage.
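The corpus figures above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch that uses only the counts reported in this section; the dictionary layout and the helper name `per_article` are illustrative assumptions and are not taken from the cleaning script used in the study.

```python
# Counts as reported in Section 4.1; the data layout is an illustrative assumption.
CORPUS = {
    "CNN": {"articles": 27, "tokens": 30_169, "images": 56},
    "Fox News": {"articles": 46, "tokens": 37_224, "images": 148},
}


def per_article(outlet: str, field: str) -> float:
    """Average of a reported count per article for one outlet."""
    stats = CORPUS[outlet]
    return stats[field] / stats["articles"]


for outlet in CORPUS:
    print(
        f"{outlet}: {per_article(outlet, 'tokens'):.0f} tokens/article, "
        f"{per_article(outlet, 'images'):.2f} images/article"
    )

# Size of the combined corpus, obtained by adding the two outlet corpora.
total_tokens = sum(stats["tokens"] for stats in CORPUS.values())
print(f"Combined corpus: {total_tokens} tokens")
```

On these counts, CNN averages roughly 1,117 tokens and 2.1 images per article, against roughly 809 tokens and 3.2 images per article for Fox News, which is consistent with the reading that CNN relied on longer narratives and Fox News on denser visual coverage.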

For the subsequent qualitative investigation, a purposeful sampling strategy (Miles et al., 2014) was employed. To ensure analytical rigor, three criteria were established. First, the selected articles had to contain high-frequency keywords identified in the quantitative phase. Second, priority was given to articles fully embodying Entman’s four framing functions. Third, images with high semiotic salience were chosen. Following these criteria, paradigmatic multimodal ensembles covering headlines and textual excerpts with their accompanying images were selected. These samples were chosen to reveal the rhetorical strategies used by each outlet, ensuring that the qualitative analysis captures the deep structure of their ideological construction rather than merely offering a random overview.

4.2. Data Analysis

The study combines quantitative and qualitative analysis.

To place the analysis on a solid empirical footing, the study first employed AntConc 4.3.1 (Anthony, 2024) to conduct the quantitative analysis of the textual data. Specifically, word frequency lists and keyword lists were generated separately for CNN and Fox News articles. The study constructed a self-built reference corpus by combining the datasets of both CNN and Fox News (total tokens = 67,393). Keyword extraction was conducted using log-likelihood tests (p < 0.001) with a minimum frequency of five. In addition, the concordance lines (KWIC) were qualitatively examined to observe the immediate context and semantic associations of the keywords. These statistics facilitated the identification of salient lexical items and recurrent co-occurrence structures, which in turn supported the interpretation of the frames, the selection of samples, and the refinement of the qualitative coding. The images were coded manually along several key dimensions, including focus, distance, angle, and contact. Based on the coding, we calculated the proportion of different visual strategies, thereby revealing the dominant frame of each outlet. Combined with the textual data, the coding also reveals the distribution of different types of text–image relations. Such quantitative statistics provided a lens for investigating CNN’s and Fox News’ multimodal strategies.
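The keyword criterion described above can be illustrated with a short sketch of the two-corpus log-likelihood statistic in its commonly used form. This is an assumption-laden illustration, not a reproduction of AntConc’s internals: the function names are hypothetical, the threshold of 10.83 is the chi-square critical value for p < 0.001 with one degree of freedom, the corpus sizes are those reported in this section, and the word counts in the usage example are made up for demonstration only.

```python
import math

CRITICAL_P001 = 10.83  # chi-square critical value, 1 degree of freedom, p < 0.001
MIN_FREQ = 5           # minimum frequency reported in the text


def log_likelihood(freq_target, size_target, freq_ref, size_ref):
    """Two-corpus log-likelihood keyness score for a single word."""
    total_freq = freq_target + freq_ref
    total_size = size_target + size_ref
    expected_target = size_target * total_freq / total_size
    expected_ref = size_ref * total_freq / total_size

    def term(observed, expected):
        # A zero observed count contributes nothing (0 * ln 0 is taken as 0).
        return observed * math.log(observed / expected) if observed > 0 else 0.0

    return 2 * (term(freq_target, expected_target) + term(freq_ref, expected_ref))


def is_keyword(freq_target, size_target, freq_ref, size_ref):
    """True if the word is significantly over-represented in the target corpus."""
    over_represented = freq_target / size_target > freq_ref / size_ref
    score = log_likelihood(freq_target, size_target, freq_ref, size_ref)
    return freq_target >= MIN_FREQ and over_represented and score > CRITICAL_P001


# Corpus sizes come from the text; the word counts below are HYPOTHETICAL.
FOX_TOKENS, COMBINED_TOKENS = 37_224, 67_393
print(is_keyword(120, FOX_TOKENS, 125, COMBINED_TOKENS))
```

Because the reference corpus here contains the target corpus, words shared by both outlets score lower than they would against an independent reference corpus, so the keyword lists are best read as markers of contrast between the two outlets.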

In the qualitative analysis, the study examined each multimodal ensemble in three interrelated phases: textual analysis, visual analysis, and multimodal integration. First, the study operationalized Entman’s (1993) framing theory within van Leeuwen’s (2007) legitimization framework. Specifically, the study examined how problems are defined, attributed, judged, and resolved through three key strategies: (1) Authorization: referencing personal, impersonal, and expert authority; (2) Moral evaluation: using evaluative lexis or visual distance; and (3) Rationalization: justifying actions by causal explanations. Second, the study adopted Kress and van Leeuwen’s (2006) model of visual grammar to conduct the visual analysis, focusing on three meanings: (1) Representational meaning: using narrative vectors to identify the depiction of action; (2) Interactive meaning: using distance and angle to analyze how the audience empathizes with or objectifies the actors; and (3) Compositional meaning: using salience to foreground certain visual elements. Based on both the theory and the data, the operational coding scheme is presented in Table 2.

Finally, to explore the multimodal synergy, the study adapted Martinec and Salway’s (2005) classification and examined the integration between textual and visual resources based on three distinct intersemiotic relations (Table 3). Specifically, the text-image relationship was coded as: (1) Reinforcement, where the visual content provided direct evidence for the text, such as a textual description of “violence” paired with images of “fire”; (2) Complementarity, where images offered new visual information to extend the textual narrative, such as a textual mention of “protesters” paired with images of slogans; and (3) Contradiction, where the visual representation diverged from the text, such as an image of chaos accompanying a text that justifies the protests. Through this operational framework, the study investigated how textual and visual modes reinforced, complemented, or conflicted with each other in constructing specific ideological meanings.
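Once each text-image pair has received one of the three relation codes, their distribution can be tallied mechanically. The sketch below is illustrative only: the function name `relation_shares` and the list of example codes are hypothetical, and the example codes are made up rather than drawn from the study’s dataset.

```python
from collections import Counter

RELATIONS = ("reinforcement", "complementarity", "contradiction")


def relation_shares(codes):
    """Return the percentage share of each text-image relation in a list of codes."""
    unknown = set(codes) - set(RELATIONS)
    if unknown:
        raise ValueError(f"Unrecognized relation codes: {sorted(unknown)}")
    counts = Counter(codes)
    total = len(codes)
    # An empty list yields 0.0 for every relation instead of dividing by zero.
    return {rel: (100 * counts[rel] / total if total else 0.0) for rel in RELATIONS}


# HYPOTHETICAL codes for five text-image pairs (not the study's data).
example_codes = [
    "reinforcement",
    "reinforcement",
    "complementarity",
    "reinforcement",
    "contradiction",
]
print(relation_shares(example_codes))
```

Rejecting unrecognized labels up front keeps typos in the coding sheet from silently shrinking the share of a legitimate category.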

To ensure objectivity and consistency, two researchers coded the sample. Prior to formal analysis, both authors received training, reached a consensus on the codebook, and clarified the inclusion criteria. Specifically, a preliminary coding scheme was first established based on the theoretical frameworks. To ensure the scheme’s applicability to the specific dataset, both authors conducted a pilot coding on a random sample of 10 articles. During this phase, the researchers engaged in iterative discussions to refine the operational criteria and resolve ambiguities, combining theoretical categories with data-driven insights. This process continued until the final codebook was established. Based on the codebook, the first researcher coded the full dataset, while the second author acted as an independent verifier. To assess inter-coder reliability, a random subset of 20% of the data was independently coded by the second author. Crucially, the verifier strictly adhered to the coding scheme without accessing the primary coder’s decisions. The two sets of coding decisions agreed in 93.6% of cases, exceeding 90%. Any discrepancies that emerged were resolved through iterative discussions until 100% consensus was reached.
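The reliability check described above can be sketched as a simple agreement calculation. The text reports a consistency rate without naming the statistic, so the sketch assumes plain percent agreement; Cohen’s kappa is added only as a common chance-corrected complement and is not a figure reported in this study. The function names and the four example codes are hypothetical.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Share of items on which two coders assigned the same code."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must code the same non-empty set of items.")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


def cohens_kappa(coder_a, coder_b):
    """Chance-corrected agreement between two coders (Cohen's kappa)."""
    n = len(coder_a)
    observed = percent_agreement(coder_a, coder_b)
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    # Agreement expected by chance, from each coder's marginal code frequencies.
    expected = sum(counts_a[code] * counts_b[code] for code in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical category throughout
    return (observed - expected) / (1 - expected)


# HYPOTHETICAL codes for four items (not the study's data).
primary = ["delegit", "delegit", "legit", "legit"]
verifier = ["delegit", "delegit", "legit", "delegit"]
print(percent_agreement(primary, verifier), cohens_kappa(primary, verifier))
```

Percent agreement is easy to interpret but does not discount matches that would occur by chance, which is why a chance-corrected coefficient is often reported alongside it when one category dominates the data.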

5. Results and Discussion

5.1. Quantitative Patterns

To investigate how ideologically opposed media outlets constructed the reality of the 2025 Los Angeles protests, this study first employed corpus analysis software to identify the dominant lexical frames, followed by an in-depth qualitative multimodal analysis of the samples.

The quantitative keyword analysis, conducted with AntConc 4.3.1, reveals distinct salience patterns between the two outlets. The analysis compares the keywords of the CNN and Fox News corpora, respectively, against a combined reference corpus (67,393 tokens in total) to reveal how the two outlets discursively defined the LA protests. The results point to different lexical patterns that reflect divergent frames, which lays a foundation for the following content analysis.

In the Fox News corpus (Figure 1), the keywords fall into two categories. First, the list is dominated by terms associated with conflict and disorder that describe the event and its participants. Salient keywords such as “riots”, “anti”, “rioters”, and “illegal” reveal a delegitimization strategy. The repeated labeling of the event as a riot and the participants as rioters functions rhetorically to deny their political legitimacy and portray them as threats to public order. This aligns with the classic “protest paradigm”, where media routines systematically marginalize dissent by emphasizing spectacle and violence over political substance (McLeod & Hertog, 1999). Second, other prominent keywords include “ICE (Immigration and Customs Enforcement)”, “Newsom”, “U.S.” and “Trump(s)”, which foreground the enforcement institution and political figures. In this way, Fox News not only frames the protests as violent disruptions but also highlights the role of immigration control and state authority, thereby legitimizing coercive state deployment while marginalizing protesters’ grievances.

Conversely, CNN defines the protests through a rights-based narrative. As evidenced in Figure 2, the CNN corpus shows a starkly different set of keywords. High-frequency items include “Huerta” (president of Service Employees International Union California, who was arrested for protesting), “court”, “Breyer” (US District Judge), “people”, “justice”, and “husband”. Rather than emphasizing “riots” like Fox News, CNN highlights words related to district leaders, institutions and citizens, thus foregrounding the human and legal dimensions of the event. This lexical choice suggests a framing more aligned with the upholding of rights, portraying the protest as a struggle for immigrant justice rather than mere disorder and rioting.

Based on the keyword results, the study manually examined the text and calculated the distribution of textual legitimization and delegitimization frames, as shown in Table 4. The statistics reveal two divergent frames, which in turn reflect different problem definitions. Specifically, Fox News portrays the event as a riot that threatens public safety and order, while CNN defines it as a moral and just struggle rooted in human rights and legitimacy.

In addition, a quantitative coding of visual patterns was conducted to examine how the protests were visually framed. As shown in Table 4, the statistical distribution supports the textual findings, revealing a stark polarization in visual strategies. Fox News shows a markedly higher proportion of delegitimizing visual elements, such as the foregrounding of authority, chaotic scenes of fire or destruction, and the objectification of protesters through masking. By contrast, CNN’s images involve more legitimizing elements, including the salience of protest signs and slogans, closer social distance, and visible human expressions. Taken together, the textual and visual frames are consistent within each outlet and exhibit distinct ideological tendencies.

5.2. Delegitimization Frame

The quantitative analysis of the Fox News corpus (N = 46) reveals a dominance of delegitimization strategies. As indicated in Table 4, the visual coding demonstrates a strong reliance on authority focus, objectification, and chaotic description. Specifically, visual codes related to images of police, military deployment and political figures account for 50.3% of all images, while the chaos of fire, destruction and looting accounts for 36.7%. Furthermore, 24.5% of the images employ “objectification”, stripping protesters of their individuality. This aligns with the textual finding that 93.5% (n = 43) of the articles use negative moral evaluation, authorization, and rationalization of state force, while only 6.5% (n = 3) offer space for legitimizing discourses. This disproportionate distribution indicates that delegitimization is not merely a rhetorical choice but a systemic strategy, constructed through the interplay of textual blame and visual alienation.

As detailed in Table 3, the reinforcement strategy is overwhelmingly dominant in the Fox News corpus. Therefore, the following analysis focuses on how the text aligns with the images.

Textually, Fox News directly puts the blame on the immigrants and protesters themselves, who are portrayed as instigators of violence and disruption from the very beginning of the event. The outlet avoids mentioning the original trigger, the ICE raids, but focuses on the destructive actions of the crowd, framing them as inherently violent and irrational. As evidenced in example (1), both the headline and the text repeatedly label protesters as rioters and agitators, which directly shifts responsibility from political issues to the intentional disruption of outsiders. In this way, the protests are portrayed as malicious chaos rather than spontaneous grievance. In contrast, when describing the police’s attack, Fox News justifies it with an attributive clause, “that officials deployed downtown as agitators clashed with authorities in the city”, positioning state violence not as aggression, but as a necessary reactive measure to maintain social order and security.

(1): Rioters Smash Windows at LAPD Headquarters as Anti-ICE Agitators Clash with Authorities

Rioters smashed windows of the Los Angeles Police Department’s (LAPD’s) headquarters on West 1st Street and faced tear gas that officials deployed downtown as agitators clashed with authorities in the city… “Agitators have splintered into and through out [sic] the Downtown Area,” the LAPD’s Central Division wrote on X. (Fox News, 9 June 2025).

Correspondingly, the images accompanying this article reinforce the criminalization narrative. Fox News relies heavily on narrative representation, particularly actional processes, which foreground dramatic confrontation, including flames, smoke, and flag-waving, and thereby emphasize tension and instability. As illustrated in Figure 3, the pictures collectively establish a binary opposition between the chaotic “rioters” and the ordered state. Two of the images focus entirely on narrative processes of destruction. Specifically, in the first picture, a masked man pointing toward the police with fire functions as the Actor, while the police car positioned in the lower frame becomes the Goal of the actional process. The gesture forms a clear vector towards the police, which visually encodes aggression and confrontation. This transactional process attributes responsibility for the conflict entirely to the protesters, portraying them as provocateurs, while the police are depicted as passive. Simultaneously, conceptual representations connect visual elements with abstract meanings or cultural values. The burning flag serves as a cultural signifier of defiance and anti-American sentiment, symbolically reinforcing the moral illegitimacy of the protesters. Furthermore, the bright color and movement of the burning flag enhance its salience. By foregrounding this symbolic desecration, the visual frame strips the protesters of their status as legal citizens and recasts them as ideological enemies of the state. In addition, the participants are depicted with masked or obscured faces, which increases social distance and reinforces their criminalized image. As noted by Machin and Mayr (2012), such “collectivization” strips social actors of their individual agency, portraying them as threats rather than specific human beings. In stark contrast, the other two images employ a distinct compositional structure to legitimize law enforcement. Both images (top-right and bottom-left) place the officer in the foreground, occupying a position of stability and order, while the burning scene sits in the background as a symbol of chaos. The police are portrayed as restorers of order amidst a chaotic landscape, implying that coercive force is the only remedy for the incident.

Beyond the clash with state authority, Fox News also frames protesters as a violent mob from the perspective of ordinary citizens, centering on their economic suffering. Reports grant an authoritative voice to the public, whose testimonies serve as hard evidence that the event has devolved from a political protest into purely opportunistic criminality. As illustrated in example (2), the headline directly labels the protesters as “stupid” and conveys the public’s attitudes through emotionally charged words. Such dehumanizing lexicalization contributes to moral exclusion (van Dijk, 1998). In addition, the report quotes victims such as Monty extensively, while deliberately retaining his non-standard grammar. By preserving linguistic markers of “broken English”, Fox News identifies Monty as a genuine, hard-working member of the immigrant community, which creates a powerful narrative of differentiation between the decent, law-abiding immigrant and the violent rioters. The quote rebuts the protesters’ condemnation of law enforcement and instead suggests that the disorder stems from the protesters’ own criminality. Through these accounts, the event is framed not as a struggle for justice but as a social problem caused by irrational, violent, and criminal actors.

(2): Los Angeles Business Owners “Sick and Tired” of “Stupid” Anti-ICE Rioters Looting Their Stores. (Fox News)

“This is so ridiculous. This doesn’t look like they’re protesting for ICE or anything. They are doing [it] just for the looting of the stores. I saw they’re breaking into the Apple store. They’re breaking into the Adidas store. This is not them doing protest [sic],” Monty told NewsNation. (Fox News, 11 June 2025).

This criminalizing narrative is visually reinforced by a set of images (Figure 4). The images use narrative representations to depict protesters purely as perpetrators of crime. In the convenience store (right), the participants are captured in the middle of material processes, exiting through shattered glass and carrying stolen beverages with their faces hidden. The focus on non-essential items like alcohol may further evoke associations with hedonism and vice. Similarly, the street scene (left) depicts a man throwing boxes into the fire, with an abandoned shopping cart nearby, pointing to the disruption of business. Crucially, both images strip the actors of political identity. There are no protest signs visible in the frame. Instead, protesters turn their backs to the camera or cover their faces with helmets or hoods, which emphasizes their identity as anonymous looters. Through these multimodal strategies, Fox News presents a consistent moral condemnation of the protesters.

Building on the construction of a violent mob and victimized citizens, Fox News points to a treatment recommendation of control and coercive enforcement. As evidenced in example (3), the outlet supports the deployment of the National Guard, regarding it as a restoration of “law and order”. The text strategically links “Mexican flag-wielding rioters” with “burning cars and assaulting police”, fusing ethnic identity with criminality. This rhetorical strategy validates the President’s hardline immigration policies by presenting them as the only viable solution to the chaos. Additionally, by emphasizing the “inability to maintain law and order in blue states”, the discourse places the blame on the failure of local governance, thereby legitimizing federal intervention as a moral imperative to save the city from chaos and disorder. Such framing foregrounds order restoration while backgrounding policy reform, thereby legitimizing state authority as the only effective solution.

(3): Trump Deployment of Troops to Quell LA Rioters Latest Page in President’s Political Playbook

“Images splashed across the media of Mexican flag-wielding rioters burning cars and assaulting police officers validate President Trump’s call for greater immigration enforcement and border control. It also puts Democrats on the defensive by highlighting their inability to maintain law and order in blue states,” veteran Republican strategist and communicator Ryan Williams told Fox News. (Fox News, 11 June 2025).

This narrative is visually supported by the images in Figure 5. The left image visually instantiates the text’s description of “Mexican flag-wielding rioters”. In the chaotic scene, the Mexican flag is displayed alongside the American flag. In this context, the presence of the foreign flag is not interpreted as cultural pride, but as a visual signifier of divided loyalty or even invasion. Moreover, Fox News seeks to maximize the social distance between the audience and the protesters, thereby constructing a distant relationship devoid of empathy. First, the outlet avoids images with direct eye contact when presenting the protest. This creates an “offer” image that does not invite the audience to engage. As shown here, protesters’ faces are covered with masks, obscuring individual identity and emotion. This visual anonymity prevents emotional connection and reinforces their portrayal as an indistinct collective. Second, the frequent use of medium and long shots increases social distance, visually excluding the viewer from the scene. The camera angle is typically at eye level or slightly high, which positions the audience as detached onlookers observing chaos rather than participants sharing a common experience. This detached perspective depicts the protesters as “others”, namely, a homogeneous, irrational mob rather than citizens deserving of sympathy.

By contrast, when presenting the confrontation between political figures who represent federal and state authority, respectively, the outlet chooses a close shot to draw the audience into the scene. As shown in the right image, Trump is depicted in profile, wearing his “Make America Great Again” cap and accompanied by military personnel in the background, while Newsom stands alone, pointing a finger towards Trump and appearing angry. This juxtaposition symbolizes the disparity in power, positioning Trump as the ultimate arbiter and solver of the event.

Taken together, Fox News constructs a coherent delegitimization frame through the interplay of textual and visual resources. Textually, the outlet reframes the event as deliberate disruption and criminal opportunism while portraying law enforcement as a righteous protector. This narrative is validated by the visual evidence. Crucially, the text–image relationship throughout this framing is one of mutual reinforcement. By visually presenting the anonymous, chaotic mob and the disciplined federal authority, Fox News systematically delegitimizes the LA protests, constructing them as a security crisis that necessitates coercive state intervention.

5.3. Legitimization Frame

In contrast, CNN constructs a legitimization frame. This narrative shifts the focus from disorder to democratic expression, employing multimodal resources to humanize the participants and position the protests as a justifiable response to systemic injustice (Entman, 1993; Mosallaei & Porpora, 2024). Notably, while Table 3 indicates that CNN also relies heavily on reinforcement, it exhibits a more diverse range of relations compared to Fox News. Thus, the following analysis illustrates how these strategies jointly construct a legitimization frame.

While Fox News linguistically assigns agency to protesters, CNN reverses the direction of causality. It attributes responsibility to state institutions and immigration enforcement. The outlet highlights injustice through the voices of political figures. As evidenced in example (4), the causal chain is clear: aggressive federal immigration enforcement led to public outrage, which in turn led to protests. This framing aligns with Entman’s (1993) function of “causal attribution,” where the diagnosis of the problem determines the moral judgment. The causal interpretation is first suggested in the headline, which syntactically positions “Immigration Enforcement” as the active agent. The verb “draws” establishes a direct causal link, framing the protests as a reactive consequence triggered by state action. Further in the text, through the quoted words of a political figure, the protesters are portrayed not as “rioters” but as “hard-working people” and “family members”. This discursive strategy of “personalization” (van Leeuwen, 2008) invites audience empathy to legitimize the protesters’ grievances against state enforcement, much like the multimodal humanization of victims documented in anti-police brutality activism (Ondimu et al., 2025). In addition, by condemning the enforcement’s “madness” and “injustice”, the text recasts the protesters’ response as a collective objection to being “treated like criminals”. In this way, the event is defined not as a breakdown of order, but as a legitimate struggle for dignity and rights.

(4): Clashes Resume in Los Angeles Area as Immigration Enforcement Draws New Protests

“Hard-working people, and members of our family and our community, are being treated like criminals,” he (Huerta) said. “We all collectively have to object to this madness because this is not justice. This is injustice. And we all have to stand on the right side of justice.” (CNN, 7 June 2025).

The textual frame is visually substantiated by a pair of images in Figure 6. The left picture captures a dramatic narrative process of conflict. In the center of the image, the police officers are the active Actors of violence, wielding batons and weapons against a subdued person on the ground. The vector of one officer’s baton is directed downward at the protester, visually aligning with the “madness” and “injustice” mentioned in the text. This image exhibits the asymmetry of force, namely, heavily armed state authority versus a vulnerable individual. This visual framing validates Huerta’s claim that the community is being unjustly “treated like criminals”. It also aligns with findings that visual asymmetry reduces moral distance and increases viewer identification with victims (Doufesh & Briel, 2021).

Moreover, the onlookers in the background function as Reactors, observing the Phenomenon of police violence. Many of them are captured holding up smartphones to record the incident and are thus visually constructed as civic witnesses of the enforcement’s violence. The right image presents the protesters as a disciplined, unified collective. It uses a frontal angle and medium shot to facilitate engagement, allowing viewers to see the protesters as a group of ordinary citizens claiming equal rights. Such “demand” images establish an imaginary relation of social affinity, inviting viewers to acknowledge the participants’ humanity (Kress & van Leeuwen, 2006). The protesters hold standardized signs reading “ICE OUT OF LA!” and “EDUCATION NOT DEPORTATION”, which function as conceptual representations of rational political claims. In addition, the presence of various flags in the background suggests a coalition of solidarity rather than foreign invasion. This visual evidence suggests that these protesters are fighting for justice.

This synergy reinforces CNN’s legitimization frame, showing the audience that the clash is a necessary struggle for rights against unjust enforcement, thus constructing moral virtue to legitimize collective resistance.

(5): Trump Deploys National Guard to Stop LA Immigration Protests, Defying California’s Governor. Why Experts Call the Move Dangerous

CNN senior national security analyst and former DHS official Juliette Kayyem called the Trump administration’s response to this weekend’s protests an extreme overreaction and said it is “not rational given the threat we’re seeing.” (CNN, 9 June 2025).

Since CNN defines the event as a moral struggle for rights, it recommends policy reform and negotiation. As example (5) shows, by covering interviews with community leaders and politicians, CNN constructs a reform-oriented frame emphasizing dialogue, inclusion, and recognition of immigrants’ rights. The headline describes Trump as “defying” the state’s governor, suggesting that the federal intervention is a violation of state sovereignty. The text quotes an expert authority to label the deployment of the National Guard as an “extreme overreaction” and “not rational”. Further, by diagnosing the state’s response as improper with the word “dangerous”, CNN shifts the moral blame from the protesters to the administration.

Visually, the selected images present a twofold frame in which solidarity and chaos coexist. As shown in Figure 7, on the one hand, conceptual representation is used to symbolize unity and shared purpose. The bottom-left image depicts a group of demonstrators with their arms linked, which conveys collective identity and mutual protection, visually encoding democratic participation rather than disorder. In this image, they look like an organized and united community rather than a mob.

On the other hand, the images do not shy away from depicting disorder; rather, they serve to illustrate the asymmetry of the conflict. Specifically, the visual resources operate through a relationship of “complementarity” (Martinec & Salway, 2005) to validate the textual claim of “state overreaction”. For instance, the bottom-right image displays a group of heavily armed soldiers standing idly. Their relaxed posture contrasts sharply with the heavy militarization, creating a visual irony that implies a lack of imminent threat (Martinec & Salway, 2005; Geise & Baden, 2015). This visual evidence complements and supports Kayyem’s textual claim that this level of force is “not rational given the threat”. Here, the two modes operate through a semiotic division of labor: the text delivers the political judgment, while the image supplies the evidence from reality.

However, unlike Fox News’s closed reinforcement loop, CNN employs a more complex strategy involving “contradiction”. As demonstrated by Mosallaei and Porpora (2024), liberal media often exhibit incongruency, where text justifies the protest while imagery delegitimizes it. The top-left image presents a high-intensity scene of conflict: a protester on a motorcycle is captured waving a Mexican flag against thick smoke and the police. This encodes chaos and threat, aligning more closely with the “criminalization” narrative. Crucially, such a chaotic image stands in sharp contradiction to the textual frame analyzed above (Mosallaei & Porpora, 2024; Gibson & Zillmann, 2000; Powell et al., 2015). While the image captures the unavoidable spectacle of disorder, the text actively mitigates the delegitimizing potential of the visual chaos by diagnosing the scene as a “backfire” of policy failure.

Overall, CNN constructs a frame of “conditional legitimacy” (Rasoulikolamaki et al., 2025) through a more diverse interplay between textual and visual modes. By combining reinforcement and complementarity, the outlet validates the protesters’ moral identity. Simultaneously, CNN negotiates with the constraints of the traditional “protest paradigm” (Chan & Lee, 1984; McLeod & Hertog, 1999) by balancing textual justification against visual disorder (contradiction), ultimately shifting the moral blame to the administration despite the messy visual reality.

5.4. Distinct Ideology Conveyed Through Multimodal Rhetoric

The previous textual and visual analyses have jointly demonstrated that CNN and Fox News do not merely report the 2025 Los Angeles protests but actively reframe the event to construct two ideologically distinct social realities. This polarization is not accidental but driven by their opposing ideological stances and achieved through distinct rhetorical strategies.

First, the analysis reveals an ideological divergence between the two outlets. Fox News, aligning with a conservative “law and order” agenda (Groeling & Baum, 2007), constructs a “delegitimization frame”. It portrays the protests as a threat to social stability. For instance, by frequently using labels such as “rioters” and “agitators”, along with violent images, to reinforce the narrative of “cultural threat” (Vultee, 2009), the outlet places protesters in an “us vs. them” binary, ultimately delegitimizing the protests (Rasoulikolamaki et al., 2025). In contrast, CNN, reflecting a liberal-leaning stance, constructs a “conditional legitimacy frame”. Rather than offering consistent justification, CNN grants legitimacy only when the protests align with civil norms such as humanized expression. It tends to frame the unrest as a reaction to systemic injustice or government overreach, focusing on the structural causes rather than the immediate symptoms of disorder (Entman, 1993; Geise & Baden, 2015).

Second, instead of simply offering different opinions, the two outlets employ distinct multimodal rhetorical strategies. Visually, Fox News deliberately amplifies scenes of destruction, fire, and confrontation, leveraging the “picture superiority effect” to emphasize chaos (Geise & Baden, 2015). Crucially, this outlet relies on a dominant strategy of reinforcement, where visual instantiation supports the textual frame. Specifically, the textual narrative, using labels like “rioters”, anchors the meaning of these chaotic images, creating a closed semiotic loop and thus leaving no space for alternative interpretations (Kress & van Leeuwen, 2006). In contrast, CNN uses a more complex strategy involving reinforcement, complementarity, and contradiction. While it also uses visual reinforcement to highlight police aggression or humanizing moments of solidarity, it cannot fully avoid the visual reality of chaos and violence because of the constraints of the “protest paradigm” (Chan & Lee, 1984; McLeod & Hertog, 1999). Consequently, a strategic contradiction emerges. Although the visuals sometimes capture chaotic scenes (e.g., smoke, clashes), the text mitigates the delegitimizing potential of these images by diagnosing them as symptoms of state overreaction rather than mob criminality. Through this textual recontextualization, CNN maintains its support for the protesters’ grievance.

These findings suggest that media polarization has evolved from a mere divergence of political stances into a clash between competing reconstructions of reality (Iyengar & Hahn, 2009). When partisan outlets deploy distinct multimodal strategies to interpret the same image, they position their audiences in two incompatible semiotic worlds, one confirming the necessity of suppressing disorder and the other attributing the chaos to the failure of governance. Consequently, the multimodal rhetoric reinforces ideological polarization, leaving the American public with two irreconcilable versions of “truth” and making consensus on domestic conflicts increasingly unattainable.

6. Conclusions

This study has examined how CNN and Fox News framed the 2025 Los Angeles protests differently through multimodal rhetorical strategies. Working within a corpus-assisted MCDA approach, it integrated Entman’s (1993) framing theory with Kress and van Leeuwen’s (2006) visual grammar. The findings demonstrate that the textual and visual modes do not merely co-exist but jointly construct two polarized realities of the same protests.

The findings reveal a stark contrast. Fox News mainly constructed a delegitimization frame that defined the event as a criminal riot and a threat to social order. Through a strategy of multimodal reinforcement, the visual elements, which emphasized destruction, anonymous crowds, and anti-American symbols, served as direct forensic evidence to validate the textual criminalization. This closed semiotic loop effectively constrained the scope of interpretation and proposed suppression as the necessary remedy. By contrast, CNN mainly constructed a nuanced frame of “conditional legitimacy” that defined the issue as a just struggle, attributed responsibility to the ICE raid, and affirmed the moral legitimacy of the protests through a strategic synergy of reinforcement and complementarity. Notably, it also negotiated with the unavoidable visual reality of conflict, displaying contradiction between the text and images. Specifically, while reinforcement was used to validate the protesters’ identity as common citizens and rights defenders, complementarity and contradiction were employed to mitigate the visual reality of disorder. By textually diagnosing the chaos as a “backfire” of state overreaction, CNN discursively recontextualized the visual violence as a symptom of policy failure, thereby redirecting moral blame to aggressive enforcement and advocating for negotiation and reform.

The combined effect of these strategies confirms that news discourse functions as a powerful practice in ideological meaning-making. Partisan outlets leverage the division of semiotic labor between text and image to recontextualize the protests. These multimodal strategies reflect and reinforce broader ideological divides in U.S. political discourse, particularly surrounding immigration and national power.

This study contributes to the field of media and discourse studies in four ways. First, methodologically, it advances the rigor of MCDA by establishing a “data-driven triangulation” model. By anchoring qualitative multimodal sampling in quantitative corpus patterns via AntConc, this study addresses the common critique of “cherry-picking” in qualitative research and the “contextual blindness” of pure corpus analysis. This synergy provides a more robust pathway for selecting paradigmatic multimodal ensembles. Second, theoretically, it refines the analytical dimensions of text–image relations by conceptualizing “contradiction” not as an error, but as a strategic mechanism of negotiating with the traditional “protest paradigm”. By distinguishing between visual instantiation (reinforcement), textual recontextualization (complementarity), and intersemiotic divergence (contradiction), the study offers a precise framework to decode how media rhetoric operates under the constraints of journalistic routines and ideological framing. Third, thematically, this research extends the scope of media polarization from general political debates to the specific domain of domestic federal-state conflicts. It reveals how outlets with different political orientations serve as arenas for ideological dissemination. Lastly, practically, it highlights the urgency of critical media literacy in decoding how multimodal rhetoric shapes political bias in today’s polarized digital media landscape (Chong & Druckman, 2007; Iyengar & Hahn, 2009).

However, this study has two main limitations. First, due to time constraints, the corpus is limited in size and scope, focusing only on selected articles from two outlets within a narrow time frame. Second, this research focused on the production of news content without investigating audience reception. Exploring how these different frames were actually interpreted by viewers with varying political predispositions could further reveal the impact of media.

Considering these merits and limitations, future research could extend this study in two main directions. First, expanding the corpus to include more media sources or a longer time span would increase generalizability and make it possible to track diachronic changes in coverage of the event. Second, audience reception studies could investigate the real-world effects of these opposing multimodal frames on public opinion and political attitudes. This would deepen understanding of how multimodal framing influences public perception and the reproduction of ideology in mediated communication.

Author Contributions

Conceptualization, X.F.; methodology, X.F. and F.D.; software, X.F.; validation, X.F. and F.D.; formal analysis, X.F.; investigation, X.F.; resources, X.F.; data curation, X.F.; writing—original draft preparation, X.F.; writing—review and editing, X.F. and F.D.; visualization, X.F.; supervision, F.D.; project administration, F.D. All authors have read and agreed to the published version of the manuscript.

Funding

The research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

No additional data are available. The analysis is based exclusively on published sources and materials cited in the text.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Abbas, A. H., & Kadim, E. N. (2024). A corpus—Critical discourse analysis of the representation of the Yemeni violent crisis in the press. International Journal for the Semiotics of Law-Revue Internationale De Sémiotique Juridique, 38, 811–833.
Anthony, L. (2024). AntConc. (Version 4.3.1) [Computer software]. Waseda University. Available online: https://www.laurenceanthony.net/software/AntConc (accessed on 7 December 2025).
Baker, P., Gabrielatos, C., & McEnery, T. (2008). A useful methodological synergy? Combining critical discourse analysis and corpus linguistics to examine discourses of refugees and asylum seekers in the UK press. Discourse & Society, 19(3), 273–306.
Baum, M. A., & Groeling, T. (2008). New media and the polarization of American political discourse. Political Communication, 25(4), 345–365.
Benford, R. D., & Snow, D. A. (2000). Framing processes and social movements: An overview and assessment. Annual Review of Sociology, 26, 611–639.
Brookes, G. (2023). Killer, thief or companion? A corpus-based study of dementia metaphors in UK tabloids. Metaphor and Symbol, 38(3), 213–230.
Cap, P. (2018). From ‘cultural unbelonging’ to ‘terrorist risk’: Communicating threat in the Polish anti-immigration discourse. Critical Discourse Studies, 15(3), 285–302.
Chan, J. M., & Lee, C. (1984). The journalistic paradigm on civil protests: A case study of Hong Kong. In A. Arno, & W. Dissanayake (Eds.), The news media in national and international conflict (pp. 193–202). Westview Press.
Childers, T. L., & Houston, M. J. (1984). Conditions for a picture-superiority effect on consumer memory. Journal of Consumer Research, 11(2), 643–654.
Chong, D., & Druckman, J. N. (2007). Framing theory. Annual Review of Political Science, 10(1), 103–126.
Coleman, R. (2010). Framing the pictures in our heads: Exploring the framing and agenda-setting effects of visual images. In P. D’Angelo, & J. A. Kuypers (Eds.), Doing news framing analysis: Empirical and theoretical perspectives (pp. 233–261). Routledge.
Dai, J., & Hyun, K. (2010). Global risk, domestic framing: Coverage of the North Korean nuclear test by US, Chinese, and South Korean news agencies. Asian Journal of Communication, 20(3), 299–317.
Doufesh, B., & Briel, H. (2021). Ethnocentrism in conflict news coverage: A multimodal framing analysis of the 2018 Gaza protests in the Times of Israel and Al Jazeera. International Journal of Communication, 15, 4230–4251.
Ekström, H., Krzyżanowski, M., & Johnson, D. (2025). Saying ‘criminality’, meaning ‘immigration’? Proxy discourses and public implicatures in the normalisation of the politics of exclusion. Critical Discourse Studies, 22(2), 183–209.
Entman, R. M. (1993). Framing: Toward clarification of a fractured paradigm. Journal of Communication, 43(4), 51–58.
Entman, R. M. (2004). Projections of power: Framing news, public opinion, and U.S. foreign policy. University of Chicago Press.
Fairclough, N. (1995). Media discourse. Edward Arnold.
Feldman, L., Maibach, E. W., Roser-Renouf, C., & Leiserowitz, A. (2012). Climate on cable: The nature and impact of climate change coverage on Fox News, CNN, and MSNBC. The International Journal of Press/Politics, 17(1), 3–31.
Fowler, R. (1991). Language in the news: Discourse and ideology in the press. Routledge.
Gabrielatos, C., & Baker, P. (2008). Fleeing, sneaking, flooding: A corpus analysis of discursive constructions of refugees and asylum seekers in the UK press, 1996–2005. Journal of English Linguistics, 36(1), 5–38.
Geise, S., & Baden, C. (2015). Putting the image back into the frame: Modeling the linkage between visual communication and frame-processing theory. Communication Theory, 25(1), 46–69.
Gibson, R., & Zillmann, D. (2000). Reading between the photographs: The influence of incidental pictorial information on issue perception. Journalism & Mass Communication Quarterly, 77(2), 355–366.
Groeling, T., & Baum, M. A. (2007, August 30–September 2). Barbarians inside the gates: Partisan news media and the polarization of American political discourse. Annual Meeting of the American Political Science Association, Chicago, IL, USA.
Iyengar, S., & Hahn, K. S. (2009). Red media, blue media: Evidence of ideological selectivity in media use. Journal of Communication, 59(1), 19–39.
Jungblut, M., & Zakareviciute, I. (2019). Do pictures tell a different story? A multimodal frame analysis of the 2014 Israel–Gaza conflict. Journalism Practice, 13(2), 206–228.
Kress, G., & van Leeuwen, T. (2006). Reading images: The grammar of visual design (2nd ed.). Routledge.
Lee, C. (2016). A corpus-based approach to transitivity analysis at grammatical and conceptual levels. International Journal of Corpus Linguistics, 21(4), 465–498.
Lee, F. L. F. (2014). Triggering the protest paradigm: Examining factors affecting news coverage of protests. International Journal of Communication, 8, 2725–2746.
Linge, M., & Bangstad, S. (2024). The Qur’an burnings of SIAN: Far-right fringe actors and the staging of conflictual media events in Norway. Temenos-Nordic Journal for the Study of Religion, 60(1), 83–103.
Machin, D. (2013). What is multimodal critical discourse studies? Critical Discourse Studies, 10(4), 347–355.
Machin, D., & Mayr, A. (2012). How to do critical discourse analysis: A multimodal introduction. Sage.
Martinec, R., & Salway, A. (2005). A system for image–text relations in new (and old) media. Visual Communication, 4(3), 337–371.
Matthes, J. (2009). What’s in a frame? A content analysis of media framing studies in the world’s leading communication journals, 1990–2005. Journalism & Mass Communication Quarterly, 86(2), 349–367.
McLeod, D. M., & Hertog, J. K. (1999). Social control, social change and the mass media’s role in the regulation of protest groups. In D. Demers, & K. Viswanath (Eds.), Mass media, social control and social change: A macrosocial perspective (pp. 305–330). Iowa State University Press.
Miles, M. B., Huberman, A. M., & Saldaña, J. (2014). Qualitative data analysis: A methods sourcebook (3rd ed.). SAGE Publications.
Mosallaei, A., & Porpora, D. (2024). Image-text congruency in legacy press coverage of Iran’s 2019 bloody November: A shift away from the protest paradigm? International Journal of Communication, 18, 5269–5295.
Nelson, D. L., Reed, V. S., & Walling, J. R. (1976). Pictorial superiority effect. Journal of Experimental Psychology: Human Learning and Memory, 2(5), 523–528.
Ondimu, J., Yieke, F., & Mwithi, F. (2025). Multimodal representation and ideological framing of social actors in Kenya’s anti-police brutality Twitter activism. Social Semiotics, 1–19.
Powell, T. E., Boomgaarden, H. G., De Swert, K., & de Vreese, C. H. (2015). A clearer picture: The contribution of visuals and text to framing effects. Journal of Communication, 65(6), 997–1017.
Rasoulikolamaki, S., Mat Isa, N. A. N., & Kaur, S. (2025). From delegitimisation to conditional legitimacy: Media slant and multimodal framing of Quran-burning protests. Journalism Studies, 27(2), 180–206.
Richardson, J. E. (2007). Analyzing newspapers: An approach from critical discourse analysis. Palgrave Macmillan.
Rodríguez, L., & Dimitrova, D. V. (2011). The levels of visual framing. Journal of Visual Literacy, 30(1), 48–65.
Stroud, N. J. (2011). Niche news: The politics of news choice. Oxford University Press.
Teo, P. (2000). Racism in the news: A critical discourse analysis of news reporting in two Australian newspapers. Discourse & Society, 11(1), 7–49.
van Dijk, T. A. (1988). News as discourse. Lawrence Erlbaum.
van Dijk, T. A. (1998). Ideology: A multidisciplinary approach. SAGE Publications.
van Leeuwen, T. (2007). Legitimation in discourse and communication. Discourse & Communication, 1(1), 91–112.
van Leeuwen, T. (2008). Discourse and practice: New tools for critical discourse analysis. Oxford University Press.
Vultee, F. (2009). Jump back Jack, Mohammed’s here: Fox News and the construction of Islamic peril. Journalism Studies, 10(5), 623–638.
Wodak, R., & Meyer, M. (Eds.). (2001). Methods of critical discourse analysis. SAGE Publications. [Google Scholar]
Xu, Y., & Loffelholz, M. (2021). Multimodal framing of Germany’s national image: Comparing news on Twitter (USA) and Weibo (China). Journalism Studies, 22(16), 2256–2278. [Google Scholar] [CrossRef]

Figure 1. Keywords in Fox News Corpus.

Figure 2. Keywords in CNN Corpus.

Figure 3. The police vs. protesters (Fox News, 9 June 2025).

Figure 4. Innocent victims vs. greedy protesters (Fox News, 11 June 2025).

Figure 5. Crisis and cure (Fox News, 11 June 2025).

Figure 6. Injustice and solidarity (CNN, 7 June 2025).

Figure 7. Chaos vs. solidarity (CNN, 9 June 2025).

Table 1. Three Meanings of Visual Grammar (Kress & van Leeuwen, 2006).

Representational Meaning	Interactive Meaning	Compositional Meaning
Narrative representations; Conceptual representations	Contact; Social distance; Attitude; Modality	Information value; Salience; Framing

Table 2. (De)legitimization framing coding scheme.

Frame	Categories	Textual Element	Visual Element
Delegitimization	Authorization	Personal and impersonal authority (police, ICE, Trump, law enforcement)	Authority focus; Angle
	Moral evaluation	Keywords: riots, rioters; Evaluative statement	Distance; Objectification
	Rationalization	Maintaining order and security	Chaos focus
Legitimization	Authorization	Personal, impersonal and expert authority (Huerta, court, experts)	Civil focus
	Moral evaluation	Keywords: justice	Distance; Contact; Angle
	Rationalization	Struggling for justice and rights	Humanization

Table 3. The text-image relations in CNN and Fox News.

Relation	Explanation	Percentage in CNN	Percentage in Fox News
Reinforcement	The image provides direct evidence for the textual frame.	70.4% (19 articles)	89.1% (41 articles)
Complementarity	The image offers new information to expand upon the textual frame.	18.5% (5 articles)	8.7% (4 articles)
Contradiction	The image conflicts with the textual frame.	11.1% (3 articles)	2.2% (1 article)
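
The percentages in Table 3 can be recomputed from the article counts. The following is a minimal sketch, assuming each article is assigned exactly one text-image relation, so that the per-outlet totals are the sums of the counts shown (27 for CNN, 46 for Fox News). The names `counts` and `relation_shares` are illustrative and are not taken from the study.

```python
# Article counts per text-image relation, copied from Table 3.
counts = {
    "CNN": {"Reinforcement": 19, "Complementarity": 5, "Contradiction": 3},
    "Fox News": {"Reinforcement": 41, "Complementarity": 4, "Contradiction": 1},
}


def relation_shares(outlet_counts):
    """Return each relation's share of the outlet's articles, in percent (1 decimal)."""
    # Assumption: every article carries exactly one relation, so the sum is the corpus size.
    total = sum(outlet_counts.values())
    return {rel: round(100 * n / total, 1) for rel, n in outlet_counts.items()}


shares = {outlet: relation_shares(c) for outlet, c in counts.items()}

for outlet, outlet_shares in shares.items():
    total = sum(counts[outlet].values())
    print(f"{outlet} (n={total}): {outlet_shares}")
```

Under that assumption the recomputed shares match the table, and the same totals are consistent with the textual frame counts reported in Table 4 (43 + 3 = 46 for Fox News, 8 + 19 = 27 for CNN).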

Table 4. The distribution of textual and visual (de)legitimization frames.

Frame	Elements	Fox News	CNN
Textual delegitimization		43 articles (93.5%)	8 articles (29.6%)
Textual legitimization		3 articles (6.5%)	19 articles (70.4%)
Visual delegitimization	Authority focus: foregrounding police/military equipment/political figures	50.3%	41.1%
	Angle: high angle (viewers look down at protesters); low angle (viewers look up at the police)	8.2%	1.8%
	Distance: long shots of the protesters	5.4%	5.4%
	Objectification: masked faces, back views	24.5%	17.9%
	Chaos focus: salience of destruction, fire, looting, smoke	36.7%	17.9%
Visual legitimization	Civil focus: foregrounding signs, slogans, marching	28.6%	55.4%
	Distance: close or medium shots of the protesters	12.9%	33.9%
	Contact: demand	8.8%	14.3%
	Angle: frontal and eye-level angle	9.5%	30.4%
	Humanization	12.9%	32.1%
