AI-Enhanced Motion Capture for Multimodal Interaction in Chinese Shadow Puppetry Heritage
Abstract
1. Introduction
- The ongoing value conflict between technological adaptability and cultural authenticity. Proponents emphasize the advantages of transcending spatiotemporal limitations, while critics warn against the risks of performative stylization resulting from over-technologization.
- Ethical dilemmas in technological implementation are becoming increasingly prominent, including the lack of standardized data protocols, varying levels of technological acceptance among cultural bearers, and instances of cultural appropriation in commercial applications.
2. Overview of AI-Enhanced Motion Capture Technology
2.1. Technical Principles and Classification
- First, motion data acquisition:
- Second, AI algorithmic intervention:
- Third, practical application scenarios:
- Enabling multi-user interactive scenarios powered by embedded AI algorithms expands the expressive boundaries of intangible cultural heritage (ICH) art forms.
2.2. Technological Evolution and Key Breakthroughs
2.2.1. Transformation from Film Entertainment to Cultural Heritage Preservation
2.2.2. Integration of AI Algorithms
3. Current Applications of AI-Based Motion Capture in Intangible Cultural Heritage (ICH)
3.1. Case Studies and Critical Analysis
3.1.1. Technology-Driven Approach
3.1.2. Cultural Integration Model
3.1.3. Entertainment-Driven Integration Model
4. Innovative Practices of AI Motion Capture Technology in Intangible Cultural Heritage
4.1. Specific Pathways of Technological Empowerment
4.2. Comparative Analysis of Technology-Enabled Shadow Puppetry Cases
4.2.1. The ShadowStory Digital Narrative System: A Case Analysis of Wireless Sensor Interaction
4.2.2. The Kinect-Based Digital Shadow Play Interaction System: A Case Analysis
4.3. Parametric Modeling-Driven Digital Shadow Puppetry Interaction System: A Conceptual and Technical Analysis
5. Challenges, Solutions, and Future Outlook
- Hybrid capture strategies that combine the portability of inertial sensing with the lower-threshold advantages of vision-based motion capture may offer a more balanced approach to documentation across different heritage settings [13].
- Future systems may further explore lightweight temporal models and edge-based interaction architectures in order to improve responsiveness in real-time heritage interaction, while remaining attentive to interpretability, deployment conditions, and cultural use contexts.
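The hybrid strategy and the lightweight temporal processing mentioned above can be illustrated with a complementary filter, one common way to combine a drifting inertial estimate with a drift-free but noisier visual one. The following is a minimal sketch rather than a method taken from the cited work: the class name `ComplementaryFilter`, the weight `alpha`, and the single-angle joint model are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ComplementaryFilter:
    """Blend an inertial prediction with a vision-based correction (illustrative)."""
    alpha: float = 0.98  # assumed weight on the inertial prediction
    angle: float = 0.0   # current fused joint angle, in degrees

    def update(self, gyro_rate, dt, vision_angle=None):
        # Predict by integrating the gyroscope rate; this step accumulates drift.
        predicted = self.angle + gyro_rate * dt
        if vision_angle is None:
            # No visual fix this frame (for example occlusion or poor lighting).
            self.angle = predicted
        else:
            # Pull the prediction toward the vision estimate to cancel drift.
            self.angle = self.alpha * predicted + (1.0 - self.alpha) * vision_angle
        return self.angle


# Hypothetical scenario: a joint held still at 30 degrees, a gyroscope with a
# constant 1 deg/s bias, sampled at 100 Hz for 10 seconds.
fused_filter = ComplementaryFilter(angle=30.0)
inertial_only = ComplementaryFilter(angle=30.0)
for _ in range(1000):
    fused = fused_filter.update(gyro_rate=1.0, dt=0.01, vision_angle=30.0)
    drifted = inertial_only.update(gyro_rate=1.0, dt=0.01)
print(round(fused, 2), round(drifted, 2))
```

In this toy scenario the inertial-only estimate drifts by about ten degrees, while the fused estimate stays within roughly half a degree of the true angle. The per-frame cost is a handful of arithmetic operations, which is the kind of budget an edge-based interaction architecture would need.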
Cultural Challenges
- Commercialization may lead to the decontextualization of intangible cultural heritage symbols. For example, the game “Projection: First Light” simplifies Chinese shadow puppet patterns, which creates a gap in user recognition of the original forms.
- The intergenerational digital divide may create a “digital breakpoint” in the inheritance chain, manifested in limited digital literacy among elderly artists and superficial cultural understanding among young practitioners.
- A “heritage practitioner–technician–user” collaborative platform may provide a useful direction for future work. Rather than treating the technical intervention threshold as a fixed technical limit, future research may explore it as a culturally negotiated boundary—concerning what should be preserved, what may be moderately adapted, and what should not be altered solely for the sake of novelty—through sustained dialogue with heritage practitioners and comparative case discussion.
- Future work may also explore the use of smart-contract or traceability-oriented frameworks to better coordinate digital records, performance parameters, and rights-related governance in heritage transmission contexts [29].
- Another promising direction is the development of training-oriented digital systems that combine gesture recognition, multimodal feedback, and practitioner-informed verbal guidance in order to support apprenticeship, movement correction, and the transmission of embodied know-how.
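The movement-correction idea in the last point can be sketched as a comparison between a learner's pose and a practitioner's reference pose, using the cosine similarity of bone directions. This is an illustrative sketch only: the joint names, the 2D coordinates, the function names, and the 0.95 threshold are assumptions, and a real training system would need practitioner input on which deviations actually matter.

```python
import math


def bone_vectors(joints, bones):
    """Turn named 2D joint positions into one direction vector per bone."""
    return [
        (joints[b][0] - joints[a][0], joints[b][1] - joints[a][1])
        for a, b in bones
    ]


def cosine(u, v):
    """Cosine similarity of two 2D vectors; 0.0 if either has zero length."""
    norm = math.hypot(*u) * math.hypot(*v)
    return (u[0] * v[0] + u[1] * v[1]) / norm if norm else 0.0


def correction_feedback(learner, reference, bones, threshold=0.95):
    """List the bones whose direction deviates from the reference pose."""
    feedback = []
    pairs = zip(bones, bone_vectors(learner, bones), bone_vectors(reference, bones))
    for bone, u, v in pairs:
        similarity = cosine(u, v)
        if similarity < threshold:  # assumed tolerance, not an empirical value
            feedback.append((bone, round(similarity, 3)))
    return feedback


# Hypothetical hand pose: the index finger matches, the thumb is 45 degrees off.
bones = [("wrist", "index_tip"), ("wrist", "thumb_tip")]
reference = {"wrist": (0.0, 0.0), "index_tip": (0.0, 1.0), "thumb_tip": (1.0, 0.0)}
learner = {"wrist": (0.0, 0.0), "index_tip": (0.0, 2.0), "thumb_tip": (1.0, 1.0)}
print(correction_feedback(learner, reference, bones))
# → [(('wrist', 'thumb_tip'), 0.707)]
```

Because cosine similarity ignores vector length, the score is insensitive to hand size or distance from the camera. That makes it a plausible starting point for feedback that a practitioner's verbal guidance could then refine.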
6. Conclusions
- Cross-case comparison suggests that technological intervention should remain responsive to the aesthetic density and performative complexity of the heritage form itself [7,25]. When digital systems over-simplify component structure, movement logic, or stylistic detail (as in the often-cited reduction of shadow-puppet configuration in ShadowStory), the result may be increased accessibility at the cost of cultural and performative fidelity.
- The comparison also indicates that symbol embedding and context reconstruction may provide a promising pathway for renewing cultural engagement, especially when digital systems retain narrative background, symbolic coherence, and recognizable cultural cues [21]. At the same time, such approaches remain vulnerable to commercialization, contextual detachment, and the risk of reducing cultural meaning to a surface-level aesthetic resource.
- The comparison further highlights the importance of digital ethics as a governance-oriented dimension of heritage digitization. Issues such as consent, cultural boundaries, interpretive authority, and responsible reuse cannot be resolved by technical optimization alone but require localized ethical frameworks that remain sensitive to community knowledge, symbolic value, and the conditions of cultural transmission.
- The case studies focus primarily on performance-related intangible cultural heritage and do not sufficiently explore how well the technology adapts to craft-based intangible cultural heritage.
- The application of neural-symbolic systems to the digital representation of tacit principles, embodied know-how, or “mental principles” remains largely conceptual and requires further case-based exploration.
- The broader transferability of the digital ethics framework across different cultural traditions, heritage categories, and regional contexts still requires further comparative discussion.
- Establishing a metaverse-based “Digital Cultural Community” and exploring, within a decentralized structure, an inheritance ecology in which each tradition retains “its own beauty in its own way”.
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
| Abbreviation | Full Term |
|---|---|
| AI-MoCap | AI-enhanced Motion Capture |
| ICH | Intangible Cultural Heritage |
| MoCap | Motion Capture |
| VR/AR | Virtual Reality/Augmented Reality |
References
1. Li, J.; Krishnamurthy, S.; Pereira Roders, A.; van Wesemael, P. Community Participation in Cultural Heritage Management: A Systematic Literature Review Comparing Chinese and International Practices. Cities 2020, 96, 102476.
2. Yang, C.; Tengku Wook, T.S.M.; Rosdi, F. Advancing Cultural Heritage: A Decadal Review of Digital Transformation in Chinese Museums. npj Herit. Sci. 2025, 13, 189.
3. He, Y. ShadowPlayVR: Understanding Traditional Shadow Puppetry Performance Techniques Through Non-Intuitive Embodied Interactions. In Proceedings of the 29th ACM Symposium on Virtual Reality Software and Technology (VRST ′23), Christchurch, New Zealand, 9–11 October 2023.
4. Lu, F.; Tian, F.; Jiang, Y.; Cao, X.; Luo, W.; Li, G.; Zhang, X.; Dai, G.; Wang, H. ShadowStory: Creative and Collaborative Digital Storytelling Inspired by Cultural Heritage. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI 2011), Vancouver, BC, Canada, 7–12 May 2011; pp. 1919–1928.
5. Hu, Z.; Lin, M.; Liu, S.; Wang, M.; Hong, R.; Yan, S. eHeritage of Shadow Puppetry: Creation and Manipulation. In Proceedings of the 21st ACM International Conference on Multimedia (MM ′13), Barcelona, Spain, 21–25 October 2013.
6. Selmanović, E.; Rizvić, S.; Harvey, C.; Bosković, D.; Hulusić, V.; Chmeliar, J.; Okanović, V. Improving Accessibility to Intangible Cultural Heritage Preservation Using Virtual Reality. J. Comput. Cult. Herit. 2020, 13, 13.
7. Hou, Y.; Kenderdine, S.; Picca, D.; Egloff, M.; Adamou, A. Digitizing Intangible Cultural Heritage Embodied: State of the Art. J. Comput. Cult. Herit. 2022, 15, 55.
8. Yu, J.; Wang, Z.; Cao, Y.; Cui, H.; Zeng, W. Centennial Drama Reimagined: An Immersive Experience of Intangible Cultural Heritage through Contextual Storytelling in Virtual Reality. J. Comput. Cult. Herit. 2025, 18, 11.
9. Mourot, L.; Hoyet, L.; Le Clerc, F.; Schnitzler, F.; Hellier, P. A Survey on Deep Learning for Skeleton-Based Human Animation. Comput. Graph. Forum 2022, 41, 122–157.
10. Jia, W.; Wang, H.; Chen, Q.; Bao, T.; Sun, Y. Analysis of Kinect-Based Human Motion Capture Accuracy Using Skeletal Cosine Similarity Metrics. Sensors 2025, 25, 1047.
11. Plemons, A.M.; Spiros, M.C. Toward Ethical Digital Practices: Guidelines for Consent, Accountability, and Transparency in Anthropology. Am. J. Biol. Anthropol. 2025, 186, e70044.
12. Jiang, X.; Ibrahim, Z.; Jiang, J.; Liu, G. Motion Capture as an Immersive Learning Technology: A Systematic Review of Its Applications in Computer Animation Training. Multimodal Technol. Interact. 2026, 10, 1.
13. Pan, S.; Ma, Q.; Yi, X.; Hu, W.; Wang, X.; Zhou, X.; Li, J.; Xu, F. Fusing Monocular Images and Sparse IMU Signals for Real-Time Human Motion Capture. In Proceedings of the SIGGRAPH Asia 2023 Conference Papers, Sydney, Australia, 12–15 December 2023; pp. 1–11.
14. Guo, Y.; Zhong, C. Motion Capture Technology and Its Applications in Film and Television Animation. Adv. Multimed. 2022, 2022, 6392168.
15. Hu, Z.; Tang, J.; Li, L.; Hou, J.; Xin, H.; Yu, X.; Bu, J. MarkerNet: A Divide-and-Conquer Solution to Motion Capture Solving from Raw Markers. Comput. Animat. Virtual Worlds 2024, 35, e2228.
16. McInnes, M.; Chadwick, E.; Blana, D.; Starkey, A. Improving the Accuracy and Reliability of Upper Limb Inertial Motion Capture without Increasing Calibration Complexity. Gait Posture 2024, 113, 144–145.
17. Zabulis, X.; Partarakis, N.; Manikaki, V.; Demeridou, I.; Dubois, A.; Moreno, I.; Bartalesi, V.; Pratelli, N.; Meghini, C.; Manitsaris, S.; et al. A Digitally Enhanced Ethnography for Craft Action and Process Understanding. Appl. Sci. 2025, 15, 5408.
18. Du, D.; Ding, J.; Liu, Y. Knowledge Graph Construction of Chinese Embroidery Evolution Based on Associating Cultural Space and Critical Incidents under Intangible Cultural Heritage. Electron. Libr. 2025, 43, 283–302.
19. Zabulis, X.; Stamou, A.; Demeridou, I.; Koutlemanis, P.; Karamaounas, P.; Papageridis, V.; Partarakis, N. Simulation and Visualisation of Traditional Craft Actions. Heritage 2024, 7, 7083–7114.
20. Shen, X.S.; Gao, J.; Li, M.; Zhou, C.; Hu, S.; He, M.; Zhuang, W. Toward Immersive Communications in 6G. Front. Comput. Sci. 2023, 4, 1068478.
21. Zabulis, X.; Partarakis, N.; Bartalesi, V.; Pratelli, N.; Meghini, C.; Dubois, A.; Moreno, I.; Manitsaris, S. Multimodal Dictionaries for Traditional Craft Education. Multimodal Technol. Interact. 2024, 8, 63.
22. Zabulis, X.; Meghini, C.; Dubois, A.; Doulgeraki, P.; Partarakis, N.; Adami, I.; Karuzaki, E.; Carre, A.-L.; Patsiouras, N.; Kaplanidi, D.; et al. Digitisation of Traditional Craft Processes. J. Comput. Cult. Herit. 2022, 15, 53.
23. Zhang, M.; Chen, X.; Pan, Z. Interaction Study of Digital Shadow Play Using the Kinect. In Proceedings of the SIGGRAPH Asia 2014 Posters, Shenzhen, China, 3–6 December 2014.
24. Li, T.; Cao, W. Research on a Method of Creating Digital Shadow Puppets Based on Parameterized Templates. Multimed. Tools Appl. 2021, 80, 20403–20422.
25. Wagner, A.; de Clippele, M.-S. Safeguarding Cultural Heritage in the Digital Era—A Critical Challenge. Int. J. Semiot. Law—Rev. Int. Sémiot. Jurid. 2023, 36, 1915–1923.
26. UNESCO. Report of the Independent Expert Group on Artificial Intelligence and Culture. 2025. Available online: https://www.unesco.org/sites/default/files/medias/fichiers/2025/09/CULTAI_Report%20of%20the%20Independent%20Expert%20Group%20on%20Artificial%20Intelligence%20and%20Culture%20%28final%20online%20version%29%201.pdf (accessed on 22 April 2026).
27. UNESCO. Recommendation on the Ethics of Artificial Intelligence. 2021. Available online: https://unesdoc.unesco.org/ark:/48223/pf0000380455 (accessed on 22 April 2026).
28. Marek, H.M. Navigating Intellectual Property in the Landscape of Digital Cultural Heritage Sites. Int. J. Cult. Prop. 2022, 29, 1–21.
29. Liu, X.; Dong, F.; Shui, W.; Geng, G. Blockchain in Digital Cultural Heritage Resources: Technological Integration, Consensus Mechanisms, and Future Directions. npj Herit. Sci. 2025, 13, 235.




| Type | Core Technology | Advantages | Limitations | Application Scenarios |
|---|---|---|---|---|
| Optical Motion Capture | Infrared markers + multi-camera tracking | High precision, low error rate, strong data integrity | High equipment cost, requires a fixed setup | Immersive experience and public-oriented teaching of Cantonese Opera |
| Inertial Motion Capture | Wearable IMU sensors | High portability, supports outdoor dynamic capture | Accumulated errors over time require periodic calibration | Field recording of ritual dances such as Nuo Opera |
| Vision-based Motion Capture | Deep learning + mono-/multi-view vision | Low cost, non-intrusive, easily scalable | Sensitive to lighting, complex movements may lose detail | Construction of shadow puppetry motion libraries |

| Shadow Puppet Type | Number of Control Points | Mapping Strategy |
|---|---|---|
| Human | 5 | Palm movement → torso, finger bending → head/limbs |
| Winged Bipedal Animal | 4 | Palm tilt → wing extension, finger spacing → leg movement |
| Serpentine Animal | 3 | Hand trajectory → spine curve fluctuation |
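
The mapping strategies in the table above can be read as functions from tracked hand features to puppet control points. The sketch below illustrates the “Human” row only, under stated assumptions: the `HandFrame` structure, the assignment of the five fingers to head, arms, and legs, and the 90-degree rotation range are hypothetical choices made for illustration and are not taken from the systems discussed in the article.

```python
from dataclasses import dataclass


@dataclass
class HandFrame:
    """One tracked hand sample (hypothetical structure)."""
    palm_xy: tuple     # palm centre in normalized screen coordinates
    finger_bend: list  # five bend values in [0, 1], thumb to little finger


# Assumed assignment of fingers to the five control points of a human puppet.
HUMAN_CONTROLS = ["head", "left_arm", "right_arm", "left_leg", "right_leg"]
MAX_JOINT_DEG = 90.0  # assumed rotation range per control point


def map_human_puppet(frame):
    """Palm movement drives the torso; finger bending drives head and limbs."""
    if len(frame.finger_bend) != len(HUMAN_CONTROLS):
        raise ValueError("expected one bend value per control point")
    pose = {"torso_xy": frame.palm_xy}
    for name, bend in zip(HUMAN_CONTROLS, frame.finger_bend):
        clamped = min(max(bend, 0.0), 1.0)  # guard against tracker overshoot
        pose[name + "_deg"] = clamped * MAX_JOINT_DEG
    return pose


frame = HandFrame(palm_xy=(0.5, 0.4), finger_bend=[0.0, 0.5, 1.0, 1.2, -0.1])
print(map_human_puppet(frame))
```

The other two rows would follow the same pattern with different inputs: palm tilt and finger spacing for the winged bipedal animal, and a short history of palm positions for the serpentine spine curve.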
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Wang, G.; Yun, H.; Yang, L.; Zheng, Q.; Liu, T. AI-Enhanced Motion Capture for Multimodal Interaction in Chinese Shadow Puppetry Heritage. Multimodal Technol. Interact. 2026, 10, 46. https://doi.org/10.3390/mti10050046
Wang G, Yun H, Yang L, Zheng Q, Liu T. AI-Enhanced Motion Capture for Multimodal Interaction in Chinese Shadow Puppetry Heritage. Multimodal Technologies and Interaction. 2026; 10(5):46. https://doi.org/10.3390/mti10050046
Chicago/Turabian Style: Wang, Gaihua, Hengchao Yun, Lixin Yang, Qingyuan Zheng, and Tianmuran Liu. 2026. "AI-Enhanced Motion Capture for Multimodal Interaction in Chinese Shadow Puppetry Heritage" Multimodal Technologies and Interaction 10, no. 5: 46. https://doi.org/10.3390/mti10050046
APA Style: Wang, G., Yun, H., Yang, L., Zheng, Q., & Liu, T. (2026). AI-Enhanced Motion Capture for Multimodal Interaction in Chinese Shadow Puppetry Heritage. Multimodal Technologies and Interaction, 10(5), 46. https://doi.org/10.3390/mti10050046

