Search Results (115)

Search Parameters:
Keywords = voice appropriation

12 pages, 445 KiB  
Article
The Effect of Phoniatric and Logopedic Rehabilitation on the Voice of Patients with Puberphonia
by Lidia Nawrocka, Agnieszka Garstecka and Anna Sinkiewicz
J. Clin. Med. 2025, 14(15), 5350; https://doi.org/10.3390/jcm14155350 - 29 Jul 2025
Viewed by 268
Abstract
Background/Objective: Puberphonia is a voice disorder characterized by the persistence of a high-pitched voice in sexually mature males. In phoniatrics and speech-language pathology, it is also known as post-mutational voice instability, mutational falsetto, persistent fistulous voice, or functional falsetto. The absence of an age-appropriate vocal pitch may adversely affect psychological well-being and hinder personal, social, and occupational functioning. The aim of this study was to evaluate the impact of phoniatric and logopedic rehabilitation on voice quality in patients with puberphonia. Methods: The study included 18 male patients, aged 16 to 34 years, rehabilitated for voice mutation disorders. Phoniatric and logopedic rehabilitation included voice therapy tailored to each subject. A logopedist led exercises aimed at lowering and stabilizing the pitch of the voice and improving its quality. A phoniatrician supervised the therapy, monitoring the condition of the vocal apparatus and providing additional diagnostic and therapeutic recommendations as needed. The duration and intensity of the therapy were adjusted for each patient. Before and after voice rehabilitation, the subjects completed the following questionnaires: the Voice Handicap Index (VHI), the Vocal Tract Discomfort (VTD) scale, and the Voice-Related Quality of Life (V-RQOL). They also underwent an acoustic voice analysis. Results: Statistical analysis of the VHI, VTD, and V-RQOL scores, as well as the voice’s acoustic parameters, showed statistically significant differences before and after rehabilitation (p < 0.005). Conclusions: Phoniatric and logopedic rehabilitation is an effective method of lowering vocal pitch and maintaining a stable, euphonic male voice in patients with functional puberphonia. Effective voice therapy positively impacts selected aspects of psychosocial functioning reported by patients, improves voice-related quality of life, and reduces physical discomfort in the vocal tract.
(This article belongs to the Section Otolaryngology)

20 pages, 3901 KiB  
Article
Designing Social Robots with LLMs for Engaging Human Interaction
by Maria Pinto-Bernal, Matthijs Biondina and Tony Belpaeme
Appl. Sci. 2025, 15(11), 6377; https://doi.org/10.3390/app15116377 - 5 Jun 2025
Viewed by 1157
Abstract
Large Language Models (LLMs), particularly those enhanced through Reinforcement Learning from Human Feedback, such as ChatGPT, have opened up new possibilities for natural and open-ended spoken interaction in social robotics. However, these models are not inherently designed for embodied, multimodal contexts. This paper presents a user-centred approach to integrating an LLM into a humanoid robot, designed to engage in fluid, context-aware conversation with socially isolated older adults. We describe our system architecture, which combines real-time speech processing, layered memory summarisation, persona conditioning, and multilingual voice adaptation to support personalised, socially appropriate interactions. Through iterative development and evaluation, including in-home exploratory trials with older adults (n = 7) and a preliminary study with young adults (n = 43), we investigated the technical and experiential challenges of deploying LLMs in real-world human–robot dialogue. Our findings show that memory continuity, adaptive turn-taking, and culturally attuned voice design enhance user perceptions of trust, naturalness, and social presence. We also identify persistent limitations related to response latency, hallucinations, and expectation management. This work contributes design insights and architectural strategies for future LLM-integrated robots that aim to support meaningful, emotionally resonant companionship in socially assistive settings.

23 pages, 634 KiB  
Systematic Review
Implementation Outcomes for Agitation Detection Technologies in People with Dementia: A Systematic Review
by Nicolas Farina, Lorna Smith, Melissa Rajalingam and Sube Banerjee
Geriatrics 2025, 10(3), 70; https://doi.org/10.3390/geriatrics10030070 - 24 May 2025
Viewed by 638
Abstract
Background: Experiencing agitation can be particularly distressing for people with dementia and their caregivers. Using technologies to detect agitation can help monitor and intervene when agitation occurs, potentially reducing overall care and support needs. This systematic review aims to explore the implementation outcomes related to the use of agitation detection technologies in people with dementia. By adopting a taxonomy of implementation outcomes, this review seeks to provide insights valuable for the real-world adoption of such technologies for people with dementia. Methods: Searches were conducted in the following databases: SCOPUS, PubMed, PsycINFO, IEEE Xplore, and CINAHL Plus. Included studies were required to have implemented, evaluated, or validated technology with the intention to detect agitation in people with dementia in real time. Results: On 14 May 2024, 1697 records were identified, and 19 were included in the review. The median sample size was 10, and around two-thirds of the records (n = 12, 63%) used ‘multimodal’ technologies for detecting agitation. Over half of the records (n = 10, 53%) were reporting from two studies. Across technologies, there was evidence of acceptability and feasibility, though there was a general absence of primary data related to implementation outcomes. There were, however, a number of technical issues and limitations that affected the fidelity and appropriateness of the technology, albeit not unique to people with dementia. Conclusions: There is a need for more empirical data on this topic to maximise uptake and adoption. Future research needs to ensure that the voice of the person with dementia is integrated within the evaluation process.

14 pages, 2750 KiB  
Article
Subjective Evaluation of Generative AI-Driven Dialogues in Paired Dyadic and Topic-Sharing Triadic Interaction Structures
by Kaori Abe, Changqin Quan, Sheng Cao and Zhiwei Luo
Appl. Sci. 2025, 15(9), 5092; https://doi.org/10.3390/app15095092 - 3 May 2025
Viewed by 617
Abstract
As the linguistic capabilities of dialogue systems improve, the importance of how they interact with humans and build trustworthy relationships is increasing. This study investigated the effect of interaction structures in a generative AI-driven dialogue system to improve relationships through interactions. The dialogue system communicated with subjects in natural language via voice and included a facial expression function. The settings of dyadic and triadic interaction structures were applied to the system. The one-to-one dyadic interaction and triadic interaction with joint attention to a topic were designed following the developmental stages of children’s social communication ability. Subjective evaluations of the dialogues and the system were conducted through a questionnaire. As a result, positive evaluations were based on well-constructed structures. The system’s inappropriate behavior under failed structures reduced the quality of the dialogues and worsened the evaluation of the system. The interaction structures in the system settings needed to match the structures intended by the subjects, whether the structures were dyadic or triadic. Under the matching and successful construction, the system fully demonstrated its dialogue capability and behaved pleasantly with the subjects. By switching interaction structures to adapt to users’ demands, system behavior becomes more appropriate for users.

14 pages, 5944 KiB  
Article
Analysis of the Fundamental Frequency F0 of Oesophageal Speech in Patients Following Total Laryngectomy Surgery
by Krzysztof Tyburek
Appl. Sci. 2025, 15(8), 4402; https://doi.org/10.3390/app15084402 - 16 Apr 2025
Cited by 1 | Viewed by 413
Abstract
The aim of this article is to analyse the fundamental frequency (F0) of oesophageal speech (ES) and compare the results with the physiological speech of healthy people. The research focused on spectrogram analysis, taking into account a frequency range that is appropriate for both people following total laryngectomy and healthy people. Therefore, the frequency range of 50 Hz to 200 Hz was proposed for the research. The studied fundamental frequency F0 was determined by segmenting the speech signal using a moving time window. As a result, a frequency vector F0 (for each tested word) was obtained, the length of which depends on the number of frames. The obtained set of fundamental frequencies (pitch listing) was analysed using statistical functions, which led to the determination of the F0 distribution in terms of its minimum, maximum, median, and standard deviation values. Voice samples were taken from 12 people aged between 30 and 70 following total laryngectomy. In accordance with the rehabilitation process, words (spoken in Polish) such as “barrel”, “bread roll”, “egg”, “package”, and “snow” were analysed (each as a separate pattern).

24 pages, 4828 KiB  
Article
Effects of Different Individuals and Verbal Tones on Neural Networks in the Brain of Children with Cerebral Palsy
by Ryosuke Yamauchi, Hiroki Ito, Ken Kitai, Kohei Okuyama, Osamu Katayama, Kiichiro Morita, Shin Murata and Takayuki Kodama
Brain Sci. 2025, 15(4), 397; https://doi.org/10.3390/brainsci15040397 - 15 Apr 2025
Viewed by 568
Abstract
Background/Objectives: Motivation is a key factor for improving motor function and cognitive control in patients. Motivation for rehabilitation is influenced by the relationship between the therapist and patient, wherein appropriate voice encouragement is necessary to increase motivation. Therefore, we examined the differences between mothers and other individuals, such as physical therapists (PTs), in their verbal interactions with children with cerebral palsy who have poor communication abilities, as well as the neurological and physiological effects of variations in the tone of their speech. Methods: The three participants were children with cerebral palsy (Participant A: boy, 3 years; Participant B: girl, 7 years; Participant C: girl, 9 years). Participants’ mothers and the assigned PTs were asked to speak under three conditions. During this, the brain activity of the participants was measured using a 19-channel electroencephalogram. The results were further analyzed using Independent Component Analysis (ICA) frequency analysis with exact Low-Resolution Brain Electromagnetic Tomography, allowing for the identification and visualization of neural activity in three-dimensional brain functional networks. Results: The results of the ICA frequency analysis for each participant revealed distinct patterns of brain activity in response to verbal encouragement from the mother and PT, with differences observed across the theta, alpha, and beta frequency bands. Conclusions: Our study suggests that the children were attentive to their mothers’ inquiries and focused on their internal experiences. Furthermore, it was indicated that when addressed by the PT, the participants found it easier to grasp the meanings and intentions of the words.
(This article belongs to the Special Issue The Application of EEG in Neurorehabilitation)

20 pages, 6941 KiB  
Article
EmoSDS: Unified Emotionally Adaptive Spoken Dialogue System Using Self-Supervised Speech Representations
by Jaehwan Lee, Youngjun Sim, Jinyou Kim and Young-Joo Suh
Future Internet 2025, 17(4), 143; https://doi.org/10.3390/fi17040143 - 25 Mar 2025
Viewed by 786
Abstract
In recent years, advancements in artificial intelligence, speech, and natural language processing technology have enhanced spoken dialogue systems (SDSs), enabling natural, voice-based human–computer interaction. However, discrete, token-based LLMs in emotionally adaptive SDSs focus on lexical content while overlooking essential paralinguistic cues for emotion expression. Existing methods use external emotion predictors to compensate for this but introduce computational overhead and fail to fully integrate paralinguistic features with linguistic context. Moreover, the lack of high-quality emotional speech datasets limits models’ ability to learn expressive emotional cues. To address these challenges, we propose EmoSDS, a unified SDS framework that integrates speech and emotion recognition by leveraging self-supervised learning (SSL) features. Our three-stage training pipeline enables the LLM to learn both discrete linguistic content and continuous paralinguistic features, improving emotional expressiveness and response naturalness. Additionally, we construct EmoSC, a dataset combining GPT-generated dialogues with emotional voice conversion data, ensuring greater emotional diversity and a balanced sample distribution across emotion categories. The experimental results show that EmoSDS outperforms existing models in emotional alignment and response generation, achieving a minimum 2.9% increase in text generation metrics, enhancing the LLM’s ability to interpret emotional and textual cues for more expressive and contextually appropriate responses.
(This article belongs to the Special Issue Generative Artificial Intelligence in Smart Societies)

34 pages, 7840 KiB  
Article
Context-Based Model for Browsing the Web Through Voice
by Citlalli Selene Avalos Montiel, José G. Rodríguez García, Sonia Mendoza and Dominique Decouchant
Appl. Sci. 2025, 15(6), 3400; https://doi.org/10.3390/app15063400 - 20 Mar 2025
Viewed by 473
Abstract
To find useful information on the Web, a user must define the search according to their interests, then they must select and analyze one or more web pages, and finally they must decide which content is most useful to them. This process requires visual attention, certain skills, and interaction with the web browser through keyboards, screens, or mice. Web browsing can be difficult for people who have some disability or lack of knowledge in the use of information and communications technology, causing them to stop this activity. This paper proposes a model to facilitate web browsing and contribute to reducing the digital divide among the population. The model input is the user’s request in natural language using voice, and the output, presented in sound, text, or graphic format, is the most suitable content that corresponds to the user’s interests. First, a content search is performed based on the user’s context. Subsequently, among the results obtained, the most appropriate for the user are identified by analyzing the context of web pages. We implemented a prototype, which was evaluated by users. The results show that it reached an acceptable usability level and that 84.75% of users obtained relevant results in their interactions.
(This article belongs to the Special Issue Human–Computer Interaction and Virtual Environments)

18 pages, 2430 KiB  
Article
The Art of Replication: Lifelike Avatars with Personalized Conversational Style
by Michele Nasser, Giuseppe Fulvio Gaglio, Valeria Seidita and Antonio Chella
Robotics 2025, 14(3), 33; https://doi.org/10.3390/robotics14030033 - 13 Mar 2025
Viewed by 1447
Abstract
This study presents an approach for developing digital avatars replicating individuals’ physical characteristics and communicative style, contributing to research on virtual interactions in the metaverse. The proposed method integrates large language models (LLMs) with 3D avatar creation techniques, using what we call the Tree of Style (ToS) methodology to generate stylistically consistent and contextually appropriate responses. Linguistic analysis and personalized voice synthesis enhance conversational and auditory realism. The results suggest that ToS offers a practical alternative to fine-tuning for creating stylistically accurate responses while maintaining efficiency. This study outlines potential applications and acknowledges the need for further work on adaptability and ethical considerations.
(This article belongs to the Special Issue Human–AI–Robot Teaming (HART))

11 pages, 204 KiB  
Article
Prayer When Life’s in the Balance: One Pentecostal’s Perspectives on Luther’s Theology of the Cross
by David J. Courey
Religions 2025, 16(2), 223; https://doi.org/10.3390/rel16020223 - 12 Feb 2025
Viewed by 783
Abstract
Hearing the word ‘death’ applied to oneself is a remarkably sobering experience. This is particularly true when the ‘one’ being referred to is a Pentecostal, a theologian, and a friend of Martin Luther. Reading Luther with Pentecostal ears is always a deconstructive process against the accumulated Luther scholarship that champions his view of the objective nature of Word and Sacrament over against the vicissitude of spiritual experience. Nevertheless, two moments in Luther’s life (the recovery of Philip Melanchthon and the death of his daughter Magdalena) open perspectives on the personal appropriation of the theologia crucis in the later Luther. In the process they illuminate the Pentecostal longing for healing, while critiquing some of its popular paradigms. Together they voice this particular ‘one’s’ journey through a bout of cancer.
(This article belongs to the Special Issue Cancer and Theology: Personal and Pastoral Perspectives)

12 pages, 1841 KiB
Article
Electromagnetic Design and End Effect Suppression of a Tubular Linear Voice Motor for Precision Vibrating Sieves
by Meizhu Luo, Zijiao Zhang, Yan Jiang and Ji-an Duan
Energies 2025, 18(3), 704; https://doi.org/10.3390/en18030704 - 3 Feb 2025
Cited by 1 | Viewed by 741
Abstract
Precision vibrating sieves need a power source featuring small size, high frequency response, and small vibration amplitude. A Linear Voice Coil Motor (LVCM) can achieve high acceleration in a short stroke, which makes it an appropriate power source for precision vibrating sieves. This paper designs a tubular LVCM with a volume of no more than 6 cm³ and a stroke of no less than 1.5 mm. The electromagnetic topology of this LVCM is established to validate its feasibility; the back Electromotive Force (back EMF) and the electromagnetic force are calculated. The end effect of this tubular LVCM is studied in detail; the auxiliary pole and the magnetic conductive stator base are designed to suppress its end detent force. Then, the main structure parameters are globally optimized by the multi-objective genetic algorithm to obtain better performance. The prototype of this tubular LVCM is manufactured and tested. The results of the experiments are compared with those of theoretical analyses. The results indicate that this tubular LVCM can provide an acceleration of 15 g, where g is the gravitational acceleration.
(This article belongs to the Section F3: Power Electronics)

24 pages, 2500 KiB  
Article
Formative Research for Adapting the Cholera-Hospital-Based-Intervention-for-7-Days (CHoBI7) Water Treatment and Hygiene Mobile Health Program for Scalable Delivery in Rural Bangladesh
by Fatema Zohura, Tahmina Parvin, Kelly Endres, Elizabeth D. Thomas, Zakir Hossain, Kabir Hossain, Jahed Masud, Ismat Minhaj, Sawkat Sarwar, Jamie Perin, Mohammad Bahauddin, Md. Nazmul Islam, Sheikh Daud Adnan, Ahmed Al-Kabir, Abu S. G. Faruque and Christine Marie George
Int. J. Environ. Res. Public Health 2025, 22(2), 170; https://doi.org/10.3390/ijerph22020170 - 26 Jan 2025
Viewed by 1335
Abstract
The Cholera-Hospital-based-Intervention-for-7-Days (CHoBI7) mobile health (mHealth) program is a targeted water treatment and hygiene (WASH) program for the household members of diarrhea patients, initiated in the healthcare facility with a single in-person visit and reinforced through weekly voice and text messages for 3 months. A recent randomized controlled trial of the CHoBI7 mHealth program in urban Dhaka, Bangladesh, found that this intervention significantly increased WASH behaviors and reduced diarrhea prevalence. The objective of this present study was to conduct formative research using an implementation science framework to adapt the CHoBI7 mHealth program for scalable implementation in rural Bangladesh, and to promote construction of self-made handwashing stations (CHoBI7 Scale-up program). We conducted a 3-month multi-phase pilot with 275 recipients and 25 semi-structured interviews, 10 intervention planning workshops, and 2 focus group discussions with intervention recipients and program implementers. High appropriateness, acceptability, and adoption of the CHoBI7 Scale-up program were observed, with most recipients constructing self-made handwashing stations (90%) and chlorinating drinking water (63%) and 50% of participants observed handwashing with soap in the final pilot phase. At the recipient level, facilitators included weekly voice and text messages with videos on handwashing station construction, which served as reminders for the promoted water treatment and hand hygiene behaviors. Barriers included perceptions that self-made iron filters commonly used in households also removed microbial contamination from water and therefore chlorine treatment was not needed, and mobile messages not always being shared among household members.
At the implementer level, facilitators for program implementation included follow-up phone calls to household members not present at the healthcare facility at the time of intervention delivery, and the promotion of multiple self-made handwashing station designs. Barriers included high patient volume in healthcare facilities, as well as the high iron in groundwater in the area that reduced chlorination effectiveness. These findings provide valuable evidence for adapting the CHoBI7 mHealth program for a rural setting, with a lower-cost, scalable design, and demonstrated the important role of formative research for tailoring WASH programs to new contexts.

18 pages, 2854 KiB  
Article
Lenition in L2 Spanish: The Impact of Study Abroad on Phonological Acquisition
by Ratree Wayland, Rachel Meyer, Sophia Vellozzi and Kevin Tang
Brain Sci. 2024, 14(9), 946; https://doi.org/10.3390/brainsci14090946 - 21 Sep 2024
Cited by 2 | Viewed by 2292 | Correction
Abstract
Objective: This study investigated the degrees of lenition, or consonantal weakening, in the production of Spanish stop consonants by native English speakers during a study abroad (SA) program. Lenition is a key phonological process in Spanish, where voiced stops (/b/, /d/, /ɡ/) typically weaken to fricatives or approximants in specific phonetic environments. For L2 learners, mastering this subtle process is essential for achieving native-like pronunciation. Methods: To assess the learners’ progress in acquiring lenition, we employed Phonet, a deep learning model. Unlike traditional quantitative acoustic methods that focus on measuring the physical properties of speech sounds, Phonet utilizes recurrent neural networks to predict the posterior probabilities of phonological features, particularly sonorant and continuant characteristics, which are central to the lenition process. Results: The results indicated that while learners showed progress in producing the fricative-like variants of lenition during the SA program and understood how to produce lenition in appropriate contexts, the retention of these phonological gains was not sustained after their return. Additionally, unlike native speakers, the learners never fully achieved the approximant-like realization of lenition. Conclusions: These findings underscore the need for sustained exposure and practice beyond the SA experience to ensure the long-term retention of L2 phonological patterns. While SA programs offer valuable opportunities for enhancing L2 pronunciation, they should be supplemented with ongoing support to consolidate and extend the gains achieved during the immersive experience.

39 pages, 6629 KiB  
Article
A Combined CNN Architecture for Speech Emotion Recognition
by Rolinson Begazo, Ana Aguilera, Irvin Dongo and Yudith Cardinale
Sensors 2024, 24(17), 5797; https://doi.org/10.3390/s24175797 - 6 Sep 2024
Cited by 6 | Viewed by 4984
Abstract
Emotion recognition through speech is a technique employed in various scenarios of Human–Computer Interaction (HCI). Existing approaches have achieved significant results; however, limitations persist, with the quantity and diversity of data being more notable when deep learning techniques are used. The lack of a standard in feature selection leads to continuous development and experimentation. Choosing and designing the appropriate network architecture constitutes another challenge. This study addresses the challenge of recognizing emotions in the human voice using deep learning techniques, proposing a comprehensive approach, and developing preprocessing and feature selection stages while constructing a dataset called EmoDSc as a result of combining several available databases. The synergy between spectral features and spectrogram images is investigated. Independently, the weighted accuracy obtained using only spectral features was 89%, while using only spectrogram images, the weighted accuracy reached 90%. These results, although surpassing previous research, highlight the strengths and limitations when operating in isolation. Based on this exploration, a neural network architecture composed of a CNN1D, a CNN2D, and an MLP that fuses spectral features and spectrogram images is proposed. The model, supported by the unified dataset EmoDSc, demonstrates a remarkable accuracy of 96%.
(This article belongs to the Special Issue Emotion Recognition Based on Sensors (3rd Edition))

15 pages, 833 KiB  
Article
An Attractive School-Age Educare—Free Choices as Expanded or Limited Agency
by Helena Ackesjö, Marina Wernholm and Mergim Krasniqi
Educ. Sci. 2024, 14(9), 937; https://doi.org/10.3390/educsci14090937 - 26 Aug 2024
Cited by 2 | Viewed by 1736
Abstract
The present study aimed to investigate and problematize an attractive school-age educare (SAEC) from the children’s perspectives. Which different aspects of quality appear in the children’s narratives about the SAEC activities? This was achieved by listening to children’s narratives and their voices. Forty-three children aged 6 to 10 participated in group conversations with the staff in their SAEC center. The study is theoretically based on a childhood sociological lens where children are recognized as active participants and agents for change and therefore important to listen to. The results show that an attractive School-Age Educare requires committed staff who inspire new discoveries, its own identity-specific premises with appropriate materials, the provision of planned and guided activities, the offering of unexpected and non-routine activities, and space for children’s agency to influence and to choose and to direct one’s own time. It is shown that free choices can both expand and limit children’s agency. In addition, the study illustrates how conversations with children can form a basic method for both developing quality and making the contextual factors for children’s agency visible.
(This article belongs to the Section Early Childhood Education)