Search Results (95)

Search Parameters:
Keywords = musical pitch

30 pages, 522 KiB  
Article
Enhancing Typhlo Music Therapy with Personalized Action Rules: A Data-Driven Approach
by Aileen Benedict, Zbigniew W. Ras, Pawel Cylulko and Joanna Gladyszewska-Cylulko
Information 2025, 16(8), 666; https://doi.org/10.3390/info16080666 - 4 Aug 2025
Abstract
In the context of typhlo music therapy, personalized interventions can significantly enhance the therapeutic experience for visually impaired children. Leveraging a data-driven approach, we incorporate action-rule discovery to provide insights into the factors of music that may benefit individual children. The system utilizes a comprehensive dataset developed in collaboration with an experienced music therapist, special educator, and clinical psychologist, encompassing meta-decision attributes, decision attributes, and musical features such as tempo, rhythm, and pitch. By extracting and analyzing these features, our methodology identifies key factors that influence therapeutic outcomes. Some themes discovered through action-rule discovery include the effect of harmonic richness and loudness on expression and communication. The main findings demonstrate the system’s ability to offer personalized, impactful, and actionable insights, leading to improved therapeutic experiences for children undergoing typhlo music therapy. Our conclusions highlight the system’s potential to transform music therapy by providing therapists with precise and effective tools to support their patients’ developmental progress. This work shows the significance of integrating advanced data analysis techniques in therapeutic settings, paving the way for future enhancements in personalized music therapy interventions. Full article
(This article belongs to the Section Information Applications)

22 pages, 3451 KiB  
Article
LSTM-Based Music Generation Technologies
by Yi-Jen Mon
Computers 2025, 14(6), 229; https://doi.org/10.3390/computers14060229 - 11 Jun 2025
Viewed by 643
Abstract
In deep learning, Long Short-Term Memory (LSTM) is a well-established and widely used approach for music generation. Nevertheless, creating musical compositions that match the quality of those created by human composers remains a formidable challenge. The intricate nature of musical components, including pitch, intensity, rhythm, notes, chords, and more, necessitates the extraction of these elements from extensive datasets, making the preliminary work arduous. To address this, we employed various tools to deconstruct the musical structure, conduct step-by-step learning, and then reconstruct it. This article primarily presents the techniques for dissecting musical components in the preliminary phase. Subsequently, it introduces the use of LSTM to build a deep learning network architecture, enabling the learning of musical features and temporal coherence. Finally, through in-depth analysis and comparative studies, this paper validates the efficacy of the proposed research methodology, demonstrating its ability to capture musical coherence and generate compositions with similar styles. Full article

22 pages, 1596 KiB  
Article
Fuzzy Frequencies: Finding Tonal Structures in Audio Recordings of Renaissance Polyphony
by Mirjam Visscher and Frans Wiering
Heritage 2025, 8(5), 164; https://doi.org/10.3390/heritage8050164 - 6 May 2025
Viewed by 641
Abstract
Understanding tonal structures in Renaissance music has been a long-standing musicological problem. Computational analysis on a large scale could shed new light on this. Encoded scores provide easy access to pitch content, but the availability of such data is low. This paper addresses this shortage of data by exploring the potential of audio recordings. Analysing audio, however, is challenging due to the presence of harmonics, reverb and noise, which may obscure the pitch content. We test several multiple pitch estimation models on audio recordings, using encoded scores from the Josquin Research Project (JRP) as a benchmark for evaluation. We present a dataset of multiple pitch estimations from 611 compositions in the JRP. We use the pitch estimations to create pitch profiles and pitch class profiles, and to estimate the lowest final pitch of each recording. Our findings indicate that the Multif0 model yields pitch profiles, pitch class profiles and finals most closely aligned with symbolic encodings. Furthermore, we found no effect of year of recording, number of voices and ensemble composition on the accuracy of pitch estimations. Finally, we demonstrate how these models can be applied to gain insight into tonal structures in early polyphony. Full article
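As an illustration of the pitch class profile and lowest-final ideas mentioned in this abstract, here is a minimal sketch. It assumes the pitch estimates have already been converted to MIDI note numbers; the function names `pitch_class_profile` and `lowest_final` are hypothetical and are not taken from the paper or from any of the models it evaluates.

```python
from collections import Counter


def pitch_class_profile(midi_pitches):
    """Return the relative frequency of each of the 12 pitch classes.

    Assumption: pitches are MIDI note numbers (60 = C4), so the pitch
    class is the note number modulo 12 (0 = C, 1 = C#, ..., 11 = B).
    """
    counts = Counter(p % 12 for p in midi_pitches)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * 12
    return [counts.get(pc, 0) / total for pc in range(12)]


def lowest_final(final_sonority):
    """Return the lowest MIDI pitch sounding in the final sonority."""
    return min(final_sonority)


# Hypothetical estimates: C4, C5, G4, E4
profile = pitch_class_profile([60, 72, 67, 64])
final = lowest_final([48, 60, 67])
```

Folding octaves together in this way is what distinguishes a pitch class profile from a plain pitch profile, which would count each MIDI pitch separately.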

20 pages, 1143 KiB  
Review
Perfecting Sensory Restoration and the Unmet Need for Personalized Medicine in Cochlear Implant Users: A Narrative Review
by Archana Podury, Brooke Barry, Karen C. Barrett and Nicole T. Jiam
Brain Sci. 2025, 15(5), 479; https://doi.org/10.3390/brainsci15050479 - 1 May 2025
Viewed by 1616
Abstract
Hearing loss is one of the most common and undertreated medical conditions worldwide, with an estimated 466 million people (5% of the world’s population) reporting disabling hearing impairment. The implications are significant; untreated hearing loss increases the risk of depression, social isolation, unemployment, cognitive decline, and falls. Cochlear implants (CIs) are surgically implanted electrical devices that allow people with severe hearing loss to process sound. Over the past 50 years, CI development has made remarkable ground, such that most CI users have adequate speech perception in a silent environment. These language achievements, while significant milestones, fall short of perfect sensory restoration. Many of these limitations with complex sound perception are due to our one-size-fits-all approach towards CIs and speech-based metrics for evaluating implant performance. In the past decade, there has been exponential interest in improving CI-mediated music perception, as it serves as a key conduit to restoring normal hearing. The present literature demonstrates the need for a personalized approach towards cochlear implantation and management. Our proposed narrative review illustrates the limitations of CI-mediated sound processing and discusses ways in which precision medicine can be introduced into the ever-expanding hearing loss population. Full article

20 pages, 618 KiB  
Systematic Review
Music and Language in Williams Syndrome: An Integrative and Systematic Mini-Review
by Jérémy Villatte, Agnès Lacroix, Laure Ibernon, Christelle Declercq, Amandine Hippolyte, Guillaume Vivier and Nathalie Marec-Breton
Behav. Sci. 2025, 15(5), 595; https://doi.org/10.3390/bs15050595 - 29 Apr 2025
Viewed by 792
Abstract
Individuals with Williams syndrome (WS) are known for their interest in language and music. As producing and comprehending music and language usually involve a set of similar or comparable cognitive abilities, the music–language relationship might be of interest to better understand WS. We identified, analyzed, and synthesized research articles on music and language among individuals with WS. Three different databases were searched (SCOPUS, PubMed, PsycInfo). Eight research articles were identified after screening, based on title, abstract and full text. In this integrative–systematic review, we assess methodologies, report findings and examine the current understanding of several subdimensions of the relationship between music and language. The findings suggest that basic musical abilities such as tone, rhythm and pitch discrimination are correlated with several verbal skills, particularly the understanding of prosody. Musical practice seems to benefit individuals with WS, in particular for prosody understanding and verbal memory. A correlation was also observed between emotional responsiveness to music and verbal ability. Further studies are needed to better characterize the relationship between music and language in WS. The clinical use of musical practice could be of interest in improving prosodic skills and verbal memory, which deserves extended experimental investigation. Full article
(This article belongs to the Section Developmental Psychology)

16 pages, 553 KiB  
Article
Improving Phrase Segmentation in Symbolic Folk Music: A Hybrid Model with Local Context and Global Structure Awareness
by Xin Guan, Zhilin Dong, Hui Liu and Qiang Li
Entropy 2025, 27(5), 460; https://doi.org/10.3390/e27050460 - 24 Apr 2025
Viewed by 488
Abstract
The segmentation of symbolic music phrases is crucial for music information retrieval and structural analysis. However, existing BiLSTM-CRF methods mainly rely on local semantics, making it difficult to capture long-range dependencies, leading to inaccurate phrase boundary recognition across measures or themes. Traditional Transformer models use static embeddings, limiting their adaptability to different musical styles, structures, and melodic evolutions. Moreover, multi-head self-attention struggles with local context modeling, causing the loss of short-term information (e.g., pitch variation, melodic integrity, and rhythm stability), which may result in over-segmentation or merging errors. To address these issues, we propose a segmentation method integrating local context enhancement and global structure awareness. This method overcomes traditional models’ limitations in long-range dependency modeling, improves phrase boundary recognition, and adapts to diverse musical styles and melodies. Specifically, dynamic note embeddings enhance contextual awareness across segments, while an improved attention mechanism strengthens both global semantics and local context modeling. Combining these strategies ensures reasonable phrase boundaries and prevents unnecessary segmentation or merging. The experimental results show that our method outperforms the state-of-the-art methods for symbolic music phrase segmentation, with phrase boundaries better aligned to musical structures. Full article
(This article belongs to the Section Multidisciplinary Applications)

17 pages, 908 KiB  
Article
Motor-Sensory Learning in Children with Disabilities: Does Piano Practice Help?
by Simon Strübbe, Susmita Roy, Irina Sidorenko and Renée Lampe
Children 2025, 12(3), 335; https://doi.org/10.3390/children12030335 - 7 Mar 2025
Viewed by 1317
Abstract
Background/Objectives: Patients with physical disabilities, like cerebral palsy, the most common movement disorder in childhood, can benefit from instrumental therapy using piano. Playing the piano promotes the interaction between different brain regions and integrates motor skills, sensory skills, musical hearing, and emotions. A pilot music study examined the effects of six months of piano lessons on hand motor skills and musical hearing in groups of children with motor disabilities. Methods: The allocation to the group was not randomized. Various tests, including the standardized Box and Block Test (BBT) and piano tests, assessed hand motor skills. Musical hearing was evaluated, and a questionnaire was used to determine the participants’ enjoyment and experience with the piano lessons. The regularity, tempo of keystrokes, and synchronization between the two hands were assessed and compared to evaluate the effects of six months of piano training. Results: After six months of piano training, statistically significant improvements were observed in the BBT, as well as in the regularity and tempo of the non-dominant hand. The children showed significant improvement in hand-motor control, moving 27.3% more cubes in the BBT. Regularity and tempo in piano playing, especially in the non-dominant hand, also improved. Moreover, 55% of the children better recognized the correct pitches of notes. Conclusions: Thus, this study supports the concept that piano lessons are an effective form of physical therapy for the development of hand motor skills and musical hearing. Full article
(This article belongs to the Special Issue Children with Cerebral Palsy and Other Developmental Disabilities)

20 pages, 3601 KiB  
Article
Full-Scale Piano Score Recognition
by Xiang-Yi Zhang and Jia-Lien Hsu
Appl. Sci. 2025, 15(5), 2857; https://doi.org/10.3390/app15052857 - 6 Mar 2025
Viewed by 851
Abstract
Sheet music is one of the most efficient methods for storing music. Meanwhile, a large amount of sheet music-image data is stored in paper form, but not in a computer-readable format. Therefore, digitizing sheet music is an essential task, such that the encoded music object could be effectively utilized for tasks such as editing or playback. Although there have been a few studies focused on recognizing sheet music images with simpler structures—such as monophonic scores or more modern scores with relatively simple structures, only containing clefs, time signatures, key signatures, and notes—in this paper we focus on the issue of classical sheet music containing dynamics symbols and articulation signs, more than only clefs, time signatures, key signatures, and notes. Therefore, this study augments the data from the GrandStaff dataset by concatenating single-line scores into multi-line scores and adding various classical music dynamics symbols not included in the original GrandStaff dataset. Given a full-scale piano score in pages, our approach first applies three YOLOv8 models to perform the three tasks: 1. Converting a full page of sheet music into multiple single-line scores; 2. Recognizing the classes and absolute positions of dynamics symbols in the score; and 3. Finding the relative positions of dynamics symbols in the score. Then, the identified dynamics symbols are removed from the original score, and the remaining score serves as the input into a Convolutional Recurrent Neural Network (CRNN) for the following steps. The CRNN outputs KERN notation (KERN, a core pitch/duration representation for common practice music notation) without dynamics symbols. By combining the CRNN output with the relative and absolute position information of the dynamics symbols, the final output is obtained. The results show that with the assistance of YOLOv8, there is a significant improvement in accuracy. Full article
(This article belongs to the Special Issue Integration of AI in Signal and Image Processing)

16 pages, 2532 KiB  
Article
Towards Automatic Expressive Pipa Music Transcription Using Morphological Analysis of Photoelectric Signals
by Yuancheng Wang, Xuanzhe Li, Yunxiao Zhang and Qiao Wang
Sensors 2025, 25(5), 1361; https://doi.org/10.3390/s25051361 - 23 Feb 2025
Viewed by 596
Abstract
The musical signal produced by plucked instruments often exhibits non-stationarity due to variations in the pitch and amplitude, making pitch estimation a challenge. In this paper, we assess different transcription processes and algorithms applied to signals captured by optical sensors mounted on a pipa—a traditional Chinese plucked instrument—played using a range of techniques. The captured signal demonstrates a distinctive arched feature during plucking. This facilitates onset detection to avoid the impact of the spurious energy peaks within vibration areas that arise from pitch-shift playing techniques. Subsequently, we developed a novel time–frequency feature, known as continuous time-period mapping (CTPM), which contains pitch curves. The proposed process can also be applied to playing techniques that mix pitch shifts and tremolo. When evaluated on four renowned pipa music pieces of varying difficulty levels, our fully time-domain-based onset detectors outperformed four short-time methods, particularly during tremolo. Our zero-crossing-based pitch estimator achieved a performance comparable to short-time methods with a far better computational efficiency, demonstrating its suitability for use in a lightweight algorithm in future work. Full article
(This article belongs to the Special Issue Recent Advances in Smart Mobile Sensing Technology)
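To make the idea of a zero-crossing-based pitch estimator concrete, the following is a minimal time-domain sketch. It is not the authors' estimator: it assumes a clean, roughly periodic signal, and the function name `zero_crossing_pitch` is hypothetical. It locates rising zero crossings with linear interpolation and divides the number of full periods by the time they span.

```python
import math


def zero_crossing_pitch(samples, sample_rate):
    """Estimate the fundamental frequency in Hz from rising zero crossings.

    Returns None when fewer than two rising crossings are found.
    """
    crossings = []
    for i in range(1, len(samples)):
        prev, cur = samples[i - 1], samples[i]
        if prev < 0.0 <= cur:
            # Linear interpolation of the sub-sample crossing position.
            frac = -prev / (cur - prev)
            crossings.append((i - 1 + frac) / sample_rate)
    if len(crossings) < 2:
        return None
    periods = len(crossings) - 1
    return periods / (crossings[-1] - crossings[0])


# Synthetic 440 Hz sine, 0.1 s at 44.1 kHz
rate = 44100
tone = [math.sin(2 * math.pi * 440 * n / rate) for n in range(rate // 10)]
estimate = zero_crossing_pitch(tone, rate)
```

Because it needs only one pass over the samples and no transform, an approach of this kind is cheap to compute, which is consistent with the efficiency claim in the abstract. Real plucked-string signals would likely need filtering first, since strong harmonics and noise add spurious crossings.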

30 pages, 21062 KiB  
Article
Influence of Microstructure on Music Properties of SWP-B Music Steel Wire Under Different Annealing Treatments
by Xinru Jia, Qinghua Li, Fuguo Li, Xiaohui Fang, Junda You, Qian Zhao, Xia Wang and Jinhua Lu
Materials 2025, 18(2), 440; https://doi.org/10.3390/ma18020440 - 18 Jan 2025
Viewed by 960
Abstract
The mechanical properties of music wire are contingent upon its microstructure, which in turn influences its applications in music. Chinese stringed instruments necessitate exacting standards for comprehensive performance indexes, particularly with regard to the strength, resilience, and rigidity of the musical steel wires, which differ from the Western approach to musical wire. In this study, SWP-B music wire was selected for investigation through metal heat treatment, which was employed to regulate its microstructure characteristics. Furthermore, a spectral analysis was conducted to evaluate the musical expression, encompassing attributes such as pitch and timbre. In conclusion, the governing law of the impact of the microstructure of music wire on its musical expression was established. The results demonstrate that steel wire subjected to a 200 °C annealing treatment for cementite spheroidization can effectively reduce stress concentration, thereby reducing the probability of fracture and consequently improving tonal uniformity and richness while increasing tensile strength from 2578 MPa to 2702 MPa. Conversely, the high-temperature annealing treatment alters the crystalline structure of the material and refines the grain structure, thereby improving the material’s performance and sound quality. The fine microstructure of the music steel wire displays enhanced uniformity. As the annealing temperature increases, the strength of the ferrite phase <110>//ND (<010>//ND, indicating that the <010> direction of the crystal is parallel to the normal direction of the material) and the cementite phase <010>//ND demonstrates a gradual decline. However, this also results in a more pronounced harmonic performance, which, in turn, affects the overall music expression. Full article

17 pages, 1898 KiB  
Article
Musical Pitch Perception and Categorization in Listeners with No Musical Training Experience: Insights from Mandarin-Speaking Non-Musicians
by Jie Liang, Fen Zhang, Wenshu Liu, Zilong Li, Keke Yu, Yi Ding and Ruiming Wang
Behav. Sci. 2025, 15(1), 30; https://doi.org/10.3390/bs15010030 - 31 Dec 2024
Cited by 1 | Viewed by 1211
Abstract
Pitch is a fundamental element in music. While most previous studies on musical pitch have focused on musicians, our understanding of musical pitch perception in non-musicians is still limited. This study aimed to explore how Mandarin-speaking listeners who did not receive musical training perceive and categorize musical pitch. Two experiments were conducted in the study. In Experiment 1, participants were asked to discriminate musical tone pairs with different intervals. The results showed that the closer together the tones were, the more difficult they were to distinguish. Among adjacent note pairs at a major 2nd pitch distance, the A4–B4 pair was perceived as the easiest to differentiate, while the C4–D4 pair was found to be the most difficult. In Experiment 2, participants completed a tone discrimination and identification task with the C4–D4 and A4–B4 musical tone continua as stimuli. The results revealed that the C4–D4 tone continuum elicited stronger categorical perception than the A4–B4 continuum, although the C4–D4 pair was previously found to be more difficult to distinguish in Experiment 1, suggesting a complex interaction between pitch perception and categorization processing. Together, these two experiments revealed the cognitive mechanism underlying musical pitch perception in ordinary populations and provided insights into future musical pitch training strategies. Full article

15 pages, 17738 KiB  
Article
Assessing the Impact of Force Feedback in Musical Knobs on Performance and User Experience
by Ziyue Piao, Christian Frisson, Bavo Van Kerrebroeck and Marcelo M. Wanderley
Actuators 2024, 13(11), 462; https://doi.org/10.3390/act13110462 - 16 Nov 2024
Cited by 1 | Viewed by 1543
Abstract
This paper examined how rotary force feedback in knobs can enhance control over musical techniques, focusing on both performance and user experience. To support our study, we developed the Bend-aid system, a web-based sequencer with pre-designed haptic modes for pitch modulation, integrated with TorqueTuner, a rotary haptic device that controls pitch through programmable haptic effects. Then, twenty musically trained participants evaluated three haptic modes (No-force feedback (No-FF), Spring, and Detent) by performing a vibrato mimicry task, rating their experience on a Likert scale, and providing qualitative feedback in post-experiment interviews. The study assessed objective performance metrics (Pitch Error and Pitch Deviation) and subjective user experience ratings (Comfort, Ease of Control, and Helpfulness) of each haptic mode. User experience results showed that participants found force feedback helpful. Performance results showed that the Detent mode significantly improved pitch accuracy and vibrato stability compared to No-FF, while the Spring mode did not show a similar improvement. Post-experiment interviews showed that preferences for Spring and Detent modes varied, and the participants provided suggestions for future knob designs. These findings suggest that force feedback may enhance both control and the experience of control in rotary knobs, with potential applications for more nuanced control in DMIs. Full article
(This article belongs to the Special Issue Actuators for Haptic and Tactile Stimulation Applications)

15 pages, 3317 KiB  
Article
Musicianship Modulates Cortical Effects of Attention on Processing Musical Triads
by Jessica MacLean, Elizabeth Drobny, Rose Rizzi and Gavin M. Bidelman
Brain Sci. 2024, 14(11), 1079; https://doi.org/10.3390/brainsci14111079 - 29 Oct 2024
Cited by 1 | Viewed by 1257
Abstract
Background: Many studies have demonstrated the benefits of long-term music training (i.e., musicianship) on the neural processing of sound, including simple tones and speech. However, the effects of musicianship on the encoding of simultaneously presented pitches, in the form of complex musical chords, are less well established. Presumably, musicians’ stronger familiarity and active experience with tonal music might enhance harmonic pitch representations, perhaps in an attention-dependent manner. Additionally, attention might influence chordal encoding differently across the auditory system. To this end, we explored the effects of long-term music training and attention on the processing of musical chords at the brainstem and cortical levels. Method: Young adult participants were separated into musician and nonmusician groups based on the extent of formal music training. While recording EEG, listeners heard isolated musical triads that differed only in the chordal third: major, minor, and detuned (4% sharper third from major). Participants were asked to correctly identify chords via key press during active stimulus blocks and watched a silent movie during passive blocks. We logged behavioral identification accuracy and reaction times and calculated information transfer based on the behavioral chord confusion patterns. EEG data were analyzed separately to distinguish between cortical (event-related potential, ERP) and subcortical (frequency-following response, FFR) evoked responses. Results: We found musicians were (expectedly) more accurate, though not faster, than nonmusicians in chordal identification. For subcortical FFRs, responses showed stimulus chord effects but no group differences. However, for cortical ERPs, whereas musicians displayed P2 (~150 ms) responses that were invariant to attention, nonmusicians displayed reduced P2 during passive listening. Listeners’ degree of behavioral information transfer (i.e., success in distinguishing chords) was also better in musicians and correlated with their neural differentiation of chords in the ERPs (but not high-frequency FFRs). Conclusions: Our preliminary results suggest long-term music training strengthens even the passive cortical processing of musical sounds, supporting more automated brain processing of musical chords with less reliance on attention. Our results also suggest that the degree to which listeners can behaviorally distinguish chordal triads is directly related to their neural specificity to musical sounds primarily at cortical rather than subcortical levels. FFR attention effects were likely not observed due to the use of high-frequency stimuli (>220 Hz), which restrict FFRs to brainstem sources. Full article
(This article belongs to the Section Sensory and Motor Neuroscience)

14 pages, 9755 KiB  
Article
Phoneme Recognition in Korean Singing Voices Using Self-Supervised English Speech Representations
by Wenqin Wu and Joonwhoan Lee
Appl. Sci. 2024, 14(18), 8532; https://doi.org/10.3390/app14188532 - 22 Sep 2024
Viewed by 1749
Abstract
In general, it is difficult to obtain a huge, labeled dataset for deep learning-based phoneme recognition in singing voices. Studying singing voices also offers inherent challenges, compared to speech, because of the distinct variations in pitch, duration, and intensity. This paper proposes a detouring method to overcome this insufficient dataset, and applies it to the recognition of Korean phonemes in singing voices. The method started with pre-training HuBERT, a self-supervised speech representation model, on a large-scale English corpus. The model was then adapted to the Korean speech domain with a relatively small-scale Korean corpus, in which the Korean phonemes were interpreted as similar English ones. Finally, the speech-adapted model was again trained with a tiny-scale Korean singing voice corpus for speech–singing adaptation. In the final adaptation, melodic supervision was chosen, which utilizes pitch information to improve the performance. For evaluation, the performance on multi-level error rates based on Word Error Rate (WER) was taken. Using the HuBERT-based transfer learning for adaptation improved the phoneme-level error rate of Korean speech by as much as 31.19%. Again, on singing voices by melodic supervision, it improved the rate by 0.55%. The significant improvement in speech recognition underscores the considerable potential of a model equipped with general human voice representations captured from the English corpus that can improve phoneme recognition on less target speech data. Moreover, the musical variation in singing voices is beneficial for phoneme recognition in singing voices. The proposed method could be applied to the phoneme recognition of other languages that have fewer speech and singing voice corpora. Full article
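For readers unfamiliar with the metric, WER-style error rates are conventionally computed as the token-level edit distance between a reference and a hypothesis, divided by the reference length. The sketch below shows that standard definition; applying it to phoneme tokens instead of words gives a phoneme-level rate. The function names are hypothetical, and the paper's exact multi-level scoring may differ.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hypothesis) + 1))
    for i, ref_tok in enumerate(reference, start=1):
        cur = [i]
        for j, hyp_tok in enumerate(hypothesis, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # substitution or match
        prev = cur
    return prev[-1]


def error_rate(reference, hypothesis):
    """Edit distance divided by reference length (words or phonemes)."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# One substitution in a three-token reference
rate_example = error_rate(["a", "b", "c"], ["a", "x", "c"])
```

Note that this rate can exceed 1.0 when the hypothesis contains many insertions, because the denominator is the reference length only.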

13 pages, 1479 KiB  
Article
Effects of Rhythm Step Training on Foot and Lower Limb Balance in Children and Adolescents with Flat Feet: A Radiographic Analysis
by Ji-Myeong Park, Byung-Cho Min, Byeong-Chae Cho, Kyu-Ri Hwang, Myung-Ki Kim, Jeong-Ha Lee, Min-Jun Choi, Hyeon-Hee Kim, Myung-Sung Kang and Kyoung-Bin Min
Medicina 2024, 60(9), 1420; https://doi.org/10.3390/medicina60091420 - 30 Aug 2024
Viewed by 2492
Abstract
Background and Objectives: Owing to the recent reports regarding the efficacy of rhythm step training (RST) in lower limb muscle development and motor skill enhancement, this study aimed to evaluate the effects of RST on foot and lower limb balance in children and adolescents diagnosed with flat feet using radiographic analysis. Materials and Methods: A total of 160 children and adolescents diagnosed with flat feet from a hospital in Seoul were randomly assigned to the general flat feet training (GFFT) (n = 80) or RST (n = 80) group. Patients in both groups exercised for 50 min once a week for 12 weeks. Key variables, such as quadriceps angle (Q-angle), calcaneal pitch angle (CPA), calcaneal–first metatarsal angle (CFMA), and navicular–cuboid overlap ratio (OR) were measured before and after the intervention. Results: Significant improvements in Q-angle (p < 0.001), CPA (p < 0.001), CFMA (p < 0.001), and navicular–cuboid OR (p < 0.001) were observed in the RST group compared to the GFFT group. RST was found to be more effective in normalizing the biomechanical function of the calcaneus and improving lower limb function. Conclusions: RST significantly enhances foot and lower limb balance in children and adolescents with flat feet, suggesting its potential use as an effective intervention for this population. The study did not specifically analyze the effects of various components of rhythm training, such as music, exercise intensity, and frequency, on the outcomes. Further research is needed to determine how each of these elements individually influences the results. Full article
(This article belongs to the Section Sports Medicine and Sports Traumatology)