Search Results (31)

Search Parameters:
Keywords = audiovisual texts

9 pages, 2704 KB  
Article
The Machined Human and the Digital Unconscious
by Guillaume Soulez
Arts 2025, 14(6), 158; https://doi.org/10.3390/arts14060158 - 1 Dec 2025
Viewed by 422
Abstract
Reflecting on the digital unconscious may mean proposing a reflection on non-mastery in a field (digital creation of images and sounds, or the use of the digital in audiovisual creation) where the idea prevails that digital machinery gives immense power to the artist who can now, thanks to calculation and data storage, surpass the usual limitations that human capacities have otherwise imposed on creation. On the contrary, we should take into account not only what digital machines reveal about us or from which unconscious patterns our work with them emerges, but how we deal with them as machines. Are we so aware of what we expect from technologies, or of what we project onto them? Pierre Schaeffer (the inventor of musique concrète but also a media theorist in his own right), who wrote on that topic 50 years ago, can be of help here. This paper mainly relies on his text “Le machinisme artistique” (“Artistic Machinism”), published as a chapter at the beginning of Machines à communiquer in 1970 (his book on media theory and practice, not yet translated into English) and proposes, with this approach in mind, an examination of several uses and conceptions of the digital image today, with particular reference to the movie Oppenheimer. Full article
(This article belongs to the Special Issue Film and Visual Studies: The Digital Unconscious)

25 pages, 1475 KB  
Article
The Design of Informational and Promotional Messages by Cooperative Banks and Their Perception Among Young Consumers—An Eye-Tracking Analysis Versus Conscious Identification Based on Empirical Research
by Przemysław Pluskota, Kamila Słupińska, Agata Wawrzyniak and Barbara Wąsikowska
Appl. Sci. 2025, 15(17), 9635; https://doi.org/10.3390/app15179635 - 1 Sep 2025
Cited by 1 | Viewed by 835
Abstract
The article explores how the design of informational and promotional messages from financial institutions influences their reception by young people. The study combined eye tracking, individual in-depth interviews (IDIs), and text mining analysis to examine both visual attention and participants’ conscious reactions. The aim was to identify young users’ preferences, determine factors influencing content perception, and assess the effectiveness of visual and audiovisual communication strategies. The main hypothesis proposed that minimalistic and visually attractive messages, enhanced with dynamic graphics, more effectively shape attitudes and elicit positive emotions. Specific aspects examined included the role of infographics, color schemes, message dynamics, and references to financial institutions in attracting attention and engagement. The results indicate that young people operate primarily in virtual space and express limited interest in traditional media such as television or print. They favor short, clear, and visually structured messages. Excessive textual content and lack of clarity provoked negative reactions and discouraged further engagement. Elements like infographics, colors, and logos were found to be strongly associated with brand recognition and memorability. Full article
(This article belongs to the Special Issue Latest Research on Eye Tracking Applications)

23 pages, 7732 KB  
Article
Vocabulary Retention Under Multimodal Coupling Strength Index (MCSI): Insights from Eye Tracking
by Qiyue Tang and Chen Chen
Appl. Sci. 2025, 15(14), 7645; https://doi.org/10.3390/app15147645 - 8 Jul 2025
Viewed by 907
Abstract
This eye-tracking investigation employed a 2 × 2 experimental design to examine multimodal lexical encoding processes. Eighty participants were systematically assigned to four conditions: Group A (text-only), Group B (text + image), Group C (text + sound), and Group D (text + image + sound). The results demonstrated significantly superior recall accuracy in Group D (92.00%) compared with unimodal conditions (Group B: 82.07%; Group C: 76.00%; Group A: 59.60%; p < 0.001), confirming robust audiovisual synergy. The novel Multimodal Coupling Strength Index (MCSI) dynamically quantified crossmodal integration efficacy through eye-tracking metrics (Attentional Synchronization Coefficient, ASC; Saccade Duration–Fixation Duration differential, SD-FD), revealing significantly stronger coupling in audiovisual conditions (C/D: 0.71; B/D: 0.54). Crucially, the established MCSI provides a transferable diagnostic framework for evaluating multimodal integration efficiency in learning environments. Full article

27 pages, 316 KB  
Article
Hearing Written Magic in Harry Potter Films: Insights into Power and Truth in the Scoring for In-World Written Words
by Jamie Lynn Webster
Humanities 2025, 14(6), 125; https://doi.org/10.3390/h14060125 - 10 Jun 2025
Viewed by 4237
Abstract
This paper explores how sound design in the Harry Potter film series shapes the symbolic significance of written words within the magical world. Sound mediates between language and meaning; while characters gain knowledge by reading and seeing, viewers are guided emotionally and thematically by how these written texts are framed through sound. For example, Harry’s magical identity is signalled to viewers through the score long before he fully understands himself—first through music when he speaks to a snake, then more explicitly when he receives his letter from Hogwarts. Throughout the series, characters engage with a wide array of written media—textbooks, letters, newspapers, diaries, maps, and inscriptions—that gradually shift in narrative function, from static props to dynamic, multi-sensory agents of transformation. Using a close analysis of selected scenes to examine layers of utterances, diegetic sounds, underscore, and sound design, this study draws on metaphor theory and adaptation theory to examine how sound design gives writing a metaphorical voice, sometimes framing it as character, landscape, or moral authority. As the series progresses, becoming more autonomous from the literary source, written words take on greater symbolic significance, and sound increasingly determines which texts are granted narrative power, whose voices are trusted, and how viewers interpret truth and agency across media. Ultimately, written words in the films are animated through sound into agents of growth, memory, resistance, and transformation. Thus, the audio-visual treatment of written magic reveals not just what is written, but what matters. Full article
(This article belongs to the Special Issue Music and the Written Word)
13 pages, 2815 KB  
Article
More than Interactivity: Designing a Critical AI Game Beyond Ludo-Centrism
by Hongwei Zhou, Fandi Meng, Katherine Kosolapova and Noah Wardrip-Fruin
Humanities 2025, 14(4), 88; https://doi.org/10.3390/h14040088 - 15 Apr 2025
Viewed by 1465
Abstract
This article presents our work-in-progress game Sea of Paint, aimed at exploring concerns around contemporary machine-learning-based AI technologies. It is a narrative-driven game with dialogues and a custom-made text-to-image system as its core mechanics. We identify our design approach as non-ludo-centric, that is, de-emphasizing the importance of mechanical interactions. We argue that contemporary game design language has largely been ludo-centric, where audiovisual and narrative aspects are framed as having somewhat static and complementary roles to rules and mechanics: as context, content, or smoothening and juicing up interactions. Although we do not believe that game design writ large has been ludo-centric, given the diversity of games in both commercial and experimental spaces, we still argue that the entanglement of design decisions across a game’s different aspects has been under-discussed. By presenting our project, we demonstrate how the interrelations across mechanical, narrative and visual aspects help us communicate our critical AI themes more effectively, and explore their potential more thoroughly. Full article
(This article belongs to the Special Issue Electronic Literature and Game Narratives)

22 pages, 1195 KB  
Article
Harmonizing Sight and Sound: The Impact of Auditory Emotional Arousal, Visual Variation, and Their Congruence on Consumer Engagement in Short Video Marketing
by Qiang Yang, Yudan Wang, Qin Wang, Yushi Jiang and Jingpeng Li
J. Theor. Appl. Electron. Commer. Res. 2025, 20(2), 69; https://doi.org/10.3390/jtaer20020069 - 8 Apr 2025
Cited by 5 | Viewed by 10529
Abstract
Social media influencers strategically design the auditory and visual features of short videos to enhance consumer engagement. Among these, auditory emotional arousal and visual variation play crucial roles, yet their interactive effects remain underexplored. Drawing on multichannel integration theory, this study applies multimodal machine learning to analyze 12,842 short videos from Douyin, integrating text analysis, sound recognition, and image processing. The results reveal an inverted U-shaped relationship between auditory emotional arousal and consumer engagement, where moderate arousal maximizes interaction while excessively high or low arousal reduces engagement. Visual variation, however, exhibits a positive linear effect, with greater variation driving higher engagement. Notably, audiovisual congruence significantly enhances engagement, as high alignment between arousal and visual variation optimizes consumer information processing. These findings advance short video marketing research by uncovering the multisensory interplay in consumer engagement. They also provide practical guidance for influencers in optimizing voice and visual design strategies to enhance content effectiveness. Full article
(This article belongs to the Topic Interactive Marketing in the Digital Era)
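
Illustrative note on the abstract above: an inverted U-shaped relationship is commonly checked by fitting a quadratic term and confirming that its coefficient is negative and that the turning point lies inside the observed range. The sketch below shows that generic check only. It is not the authors' estimation procedure, which the abstract does not describe; the function names, the quadratic specification, and the toy data are all assumptions made for illustration.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = b0 + b1*x + b2*x^2 via the normal equations."""
    # Power sums S[k] = sum(x^k) for k = 0..4 and moments T[k] = sum(y * x^k).
    S = [sum(x ** k for x in xs) for k in range(5)]
    T = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented 3x3 system: row i is [S[i], S[i+1], S[i+2] | T[i]].
    A = [[S[i + j] for j in range(3)] + [T[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(3):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * p for a, p in zip(A[r], A[col])]
    return [A[i][3] / A[i][i] for i in range(3)]


def turning_point(b1, b2):
    """x at which the fitted parabola peaks (a maximum only when b2 < 0)."""
    return -b1 / (2 * b2)


# Hypothetical toy data (not from the study): engagement peaks at moderate arousal.
arousal = [0.0, 0.25, 0.5, 0.75, 1.0]
engagement = [1.0 - 4.0 * (a - 0.5) ** 2 for a in arousal]

b0, b1, b2 = fit_quadratic(arousal, engagement)
peak = turning_point(b1, b2)
# Inverted U: negative curvature and a peak inside the observed arousal range.
is_inverted_u = b2 < 0 and min(arousal) < peak < max(arousal)
```

On real data, the sign of the squared term alone is usually not treated as sufficient evidence; the turning-point condition guards against a curve that merely flattens within the sampled range.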

24 pages, 19241 KB  
Article
Secular “Angels”. Para-Angelic Imagery in Popular Culture
by Urszula Jarecka
Religions 2025, 16(3), 396; https://doi.org/10.3390/rel16030396 - 20 Mar 2025
Viewed by 6976
Abstract
Religious symbols and figures are gaining new life in popular culture. Reinterpretations of symbols rooted in the visual arts tradition are appearing in film, TV series and short audiovisual forms presented on the Internet, especially on social media. This also applies to angels, the subject of the author’s research. This article discusses images of “secular angels”, decontextualized religious symbols, popularized throughout the 20th and 21st centuries in the visual media of Western culture. From the rich research material, the most characteristic images are selected for discussion and interpreted in the spirit of discourse analysis. The images of modern “angels” in the texts of popular culture refer not so much to their biblical prototypes as to the moral condition of man in consumerist, individualistic societies focused on living for pleasure. Film, TV series and Internet images of “angels” also show the controversies and social problems (such as racism) faced by contemporary Western societies. Full article
(This article belongs to the Special Issue The Interplay between Religion and Culture)

20 pages, 20407 KB  
Article
VAD-CLVA: Integrating CLIP with LLaVA for Voice Activity Detection
by Andrea Appiani and Cigdem Beyan
Information 2025, 16(3), 233; https://doi.org/10.3390/info16030233 - 16 Mar 2025
Cited by 1 | Viewed by 4109
Abstract
Voice activity detection (VAD) is the process of automatically determining whether a person is speaking and identifying the timing of their speech in audiovisual data. Traditionally, this task has been tackled by processing either audio signals or visual data, or by combining both modalities through fusion or joint learning. In our study, drawing inspiration from recent advancements in visual-language models, we introduce a novel approach leveraging Contrastive Language-Image Pretraining (CLIP) models. The CLIP visual encoder analyzes video segments focusing on the upper body of an individual, while the text encoder processes textual descriptions generated by a Generative Large Multimodal Model, i.e., the Large Language and Vision Assistant (LLaVA). Subsequently, embeddings from these encoders are fused through a deep neural network to perform VAD. Our experimental analysis across three VAD benchmarks showcases the superior performance of our method compared to existing visual VAD approaches. Notably, our approach outperforms several audio-visual methods despite its simplicity and without requiring pretraining on extensive audio-visual datasets. Full article
(This article belongs to the Special Issue Application of Machine Learning in Human Activity Recognition)

28 pages, 13856 KB  
Article
Tayseer: A Novel AI-Powered Arabic Chatbot Framework for Technical and Vocational Student Helpdesk Services and Enhancing Student Interactions
by Abeer Alabbas and Khalid Alomar
Appl. Sci. 2024, 14(6), 2547; https://doi.org/10.3390/app14062547 - 18 Mar 2024
Cited by 16 | Viewed by 7690
Abstract
The rise of conversational agents (CAs) like chatbots in education has increased the demand for advisory services. However, student–college admission interactions remain manual and burdensome for staff. Leveraging CAs could streamline the admission process, providing efficient advisory support. Moreover, limited research has explored the role of Arabic chatbots in education. This study introduces Tayseer, an Arabic AI-powered web chatbot that enables instant access to college information and communication between students and colleges. This study aims to improve the abilities of chatbots by integrating features into one model, including responding with audiovisuals, various interaction modes (menu, text, or both), and collecting survey responses. Tayseer uses deep learning models within the RASA framework, incorporating a customized Arabic natural language processing pipeline for intent classification, entity extraction, and response retrieval. Tayseer was deployed at the Technical College for Girls in Najran (TCGN). Over 200 students used Tayseer during the first semester, demonstrating its efficiency in streamlining the advisory process. It identified over 50 question types from inputs with a 90% precision in intent and entity predictions. A comprehensive evaluation illuminated Tayseer’s proficiency as well as areas requiring improvement. This study developed an advanced CA to enhance student experiences and satisfaction while establishing best practices for education chatbot interfaces by outlining steps to build an AI-powered chatbot from scratch using techniques adaptable to any language. Full article

27 pages, 843 KB  
Article
EMOLIPS: Towards Reliable Emotional Speech Lip-Reading
by Dmitry Ryumin, Elena Ryumina and Denis Ivanko
Mathematics 2023, 11(23), 4787; https://doi.org/10.3390/math11234787 - 27 Nov 2023
Cited by 3 | Viewed by 3307
Abstract
In this article, we present a novel approach for emotional speech lip-reading (EMOLIPS). This two-level approach to emotional speech-to-text recognition based on visual data processing is motivated by human perception and the recent developments in multimodal deep learning. The proposed approach uses visual speech data to determine the type of speech emotion. The speech data are then processed using one of the emotional lip-reading models trained from scratch. This essentially resolves the multi-emotional lip-reading issue associated with most real-life scenarios. We implemented these models as a combination of EMO-3DCNN-GRU architecture for emotion recognition and 3DCNN-BiLSTM architecture for automatic lip-reading. We evaluated the models on the CREMA-D and RAVDESS emotional speech corpora. In addition, this article provides a detailed review of recent advances in automated lip-reading and emotion recognition that have been developed over the last 5 years (2018–2023). In comparison to existing research, we mainly focus on the valuable progress brought with the introduction of deep learning to the field and skip the description of traditional approaches. By considering emotional features of the pronounced audio-visual speech, the EMOLIPS approach significantly improves the state-of-the-art accuracy for phrase recognition, up to 91.9% and 90.9% for RAVDESS and CREMA-D, respectively. Moreover, we present an extensive experimental investigation that demonstrates how different emotions (happiness, anger, disgust, fear, sadness, and neutral), valence (positive, neutral, and negative) and binary (emotional and neutral) affect automatic lip-reading. Full article
(This article belongs to the Section E: Applied Mathematics)
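
Illustrative note on the abstract above: the two-level idea (first predict the emotion, then hand the clip to a lip-reading model specific to that emotion) can be sketched as a simple dispatcher. This is a hedged structural sketch only, not the EMOLIPS implementation: the function names, the string stand-ins for video frames and models, and the fallback to a "neutral" reader are hypothetical and do not come from the abstract.

```python
from typing import Callable, Dict, Sequence

Frames = Sequence[str]  # stand-in for a clip of video frames (assumption)


def make_two_level_reader(
    emotion_classifier: Callable[[Frames], str],
    lip_readers: Dict[str, Callable[[Frames], str]],
    fallback: str = "neutral",
) -> Callable[[Frames], str]:
    """Build a reader that routes each clip to an emotion-specific model."""

    def read(frames: Frames) -> str:
        # Level 1: predict the emotion from the visual speech data.
        emotion = emotion_classifier(frames)
        # Level 2: pick the matching lip-reading model; falling back to a
        # default model for unknown labels is an assumption of this sketch.
        reader = lip_readers.get(emotion, lip_readers[fallback])
        return reader(frames)

    return read


# Hypothetical stubs standing in for trained networks.
def toy_emotion_classifier(frames: Frames) -> str:
    return "anger" if "frown" in frames else "neutral"


toy_lip_readers = {
    "neutral": lambda frames: "neutral-model:" + str(len(frames)),
    "anger": lambda frames: "anger-model:" + str(len(frames)),
}

read = make_two_level_reader(toy_emotion_classifier, toy_lip_readers)
angry_result = read(["frown", "open"])
calm_result = read(["open"])
```

Keeping the classifier and the per-emotion readers as plain callables means either level can be swapped out without touching the routing logic.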

15 pages, 274 KB  
Article
Facing Your Fears: Navigating Social Anxieties and Difference in Contemporary Fairy Tales
by Dorothea Trotter
Literature 2023, 3(3), 342-356; https://doi.org/10.3390/literature3030023 - 4 Sep 2023
Viewed by 4394
Abstract
In the 20th and 21st centuries, the rise of audio-visual media, particularly cinema and television, brought about new visual techniques and storytelling conventions that have transformed the way fairy tales are adapted for the screen. Initially adapted for a younger audience, newer adaptations often return to the darker and more horrific elements of the source texts; this includes body horror and an emphasis on physiological differences. This article employs structural, cultural, and folkloric interpretive lenses for the analysis of three contemporary, audio-visual fairy tales to discuss the way contemporary fairy tales include disability and difference as social constructs that are shaped by cultural attitudes and anxieties. The stories’ plots are driven by the protagonists’ “otherness”, and these texts feature transformations that provide clues to understanding current standards of beauty and normality. I argue that newer adaptations place an emphasis on finding resolutions to difference that challenge the traditional idea that if one has a face or body that strays from the standard of the norm, one must die, relegate oneself to the margins, or join others like oneself. Full article
22 pages, 3769 KB  
Article
Multilingualism as a Functional Element, a Useful Category for the Study of the Construction and Translation of Linguistically Diverse Discourse
by Lorena Hurtado-Malillos
Languages 2023, 8(3), 198; https://doi.org/10.3390/languages8030198 - 23 Aug 2023
Cited by 1 | Viewed by 2729
Abstract
This article is a discursive and equivalence-generating study of the use of the multilingual property as a narrative transmission mechanism in audiovisual texts. Specific functions can be constructed and different events and aspects of the plot can be presented through the introduction of linguistic variation and its deliberate application to achieve defined purposes. The analysis is based on functionalist approaches to the study of fiction and translation and on the binary branching classification model of solution types for determining textual problems in translation based on the form these adopt. This article presents the findings of multilingual property identification and translation related to the application of this forms- and functions-based approach. Several classifications of solution types are also developed with representative examples extracted from film and series. Full article

13 pages, 392 KB  
Article
Stereotypes in a Multilingual Film: A Case Study on Issues of Social Injustice
by Azadeh Eriss and Masood Khoshsaligheh
Languages 2023, 8(3), 174; https://doi.org/10.3390/languages8030174 - 20 Jul 2023
Cited by 2 | Viewed by 12999
Abstract
Films serve to (re-)create a ‘world’ within the mind of the audience. Additionally, they introduce or reinforce stereotypes portrayed as a reality of the modern world through multiplexity and the strategic use of foreign languages, dialects, and non-native language use, among others. Various concepts of stereotypes can be explored in fiction feature films, especially as film characters are often based on different kinds of stereotypes. Audiovisual texts tend to operate as cultural constructs that reflect and convey certain ideologies within an industry that holds the power to marginalize or belittle voices. Multilingual films highlight the contrasts among and within cultures; hence, they can further exacerbate the marginalization and stereotyping of different cultures and nations, ultimately having damaging effects on society’s perception of different stereotypes, such as race and gender groups, which is shown with the examples from a multilingual film. This article analyzes the marginalization and stereotypes in a Hollywoodian multilingual film through film analysis and critical theory. By doing so, this study aims to provide insight into the stereotypes that have been depicted, covering various clichés and stereotypes, including cultural, gender, political, and religious stereotypes. Furthermore, it seeks to dissect the societal consequences that arise from detrimental portrayals of stereotyping in a purposeful selection of an American multilingual film. Full article
14 pages, 414 KB  
Article
Translating Multilingualism in Mira Nair’s Monsoon Wedding
by Montse Corrius, Eva Espasa and Laura Santamaria
Languages 2023, 8(2), 129; https://doi.org/10.3390/languages8020129 - 17 May 2023
Cited by 2 | Viewed by 3444
Abstract
Linguistic diversity is present in many audiovisual productions and has given rise to fruitful research on translation of multilingualism and language variation. Monsoon Wedding (Mira Nair, 2001) is a prototypical film for translation analysis, since multilingualism is a recurrent feature, as the film dialogue combines English (L1) with Hindi and Punjabi (L3), which creates an effect of code-switching. This article analyses how the multilingualism and the cultural elements present in the source text (ST) have been transferred to the Spanish translated text (TT) La boda del monzón. The results show that in the Spanish dubbed and subtitled versions, few Indian cultural elements are left, and little language variation is preserved. Thus, L3 does not play a central role as it does in the source text. In the translation, only a few loan words from Hindi or Punjabi are kept, mainly from the domains of food and cooking, as well as terms of address and greetings, or words related to the wedding ceremony. The results also show that when L3 is not fully rendered in translation, otherness is still conveyed through image and music, thus (re)creating a different atmosphere for Spanish audiences. Full article
19 pages, 8455 KB  
Article
Elicitation of Content Layout Preferences in Virtual 3D Spaces Based on a Free Layout Creation Task
by Anna Sudár and Ádám B. Csapó
Electronics 2023, 12(9), 2078; https://doi.org/10.3390/electronics12092078 - 30 Apr 2023
Cited by 11 | Viewed by 2507
Abstract
Three-dimensional virtual reality (VR) environments, whether operating on desktop platforms or immersive screens, have been recognized for enabling novel and extremely engaging methods of interacting with digital content across various fields of application. Studies conducted over the past several years have also consistently suggested that utilizing 3D in contrast to 2D interfaces can lead to enhancements in multiple performance dimensions. These enhancements encompass better understanding and retention of information, increased capacity for inventive and efficient collaboration, and the ability to execute workflows that integrate numerous information sources more quickly. At the same time, how digital content such as documents, audio–visual content and web browsers is integrated into 3D spaces is often decided by the creators of the spaces based either on aesthetic considerations or on a case-by-case basis depending on the workflow. In this paper, we present the results of an experiment we conducted to better understand how users prefer to arrange digital content in their 3D environments, depending on the subject matter, the format of the content (e.g., text-based, image, or audio–visual) and the 3D objects within the space. The results of the experiment presented in the paper can help inform future 3D VR design methodologies and may also provide support for automated content arrangement solutions. Full article