MuMIA: Multimodal Interactions to Better Understand Art Contexts
Abstract
1. Introduction
2. Related Work and Motivation
2.1. Multimodal Interactions
2.2. Multimodal Technologies and Interfaces in Cultural Heritage
2.2.1. Interactions with Sound/Speech Interfaces
2.2.2. Interactions with Visual/Gaze Interfaces
2.3. Motivation
“Is the design of an interactive multimodal interface based on eye-tracking and voice communication within the cultural heritage domain feasible?”
“What is its impact on the visitor experience?”
3. Design of the System
3.1. System Components
3.1.1. Art Exhibit
3.1.2. Eye-Tracking Module
- It monitors the eye gaze behavior of the visitors in real time: it captures saccades (i.e., the rapid eye movements used to move the fovea from one point of interest to another) as (x, y) coordinates projected on the surface of the art exhibit. An area of interest is characterized by the collection of (x, y) coordinates that make up its total area (Equation (3)); therefore, Equation (2) is enriched with that collection of coordinates (Equation (4)). Moreover, a calibration procedure needs to be performed prior to the eye-tracking session to increase the validity of the captured eye gaze data.
- It analyzes the captured saccades to identify fixations (i.e., the periods of time when the eye is aligned with the target for a certain duration, allowing the image details to be processed) and then extracts more complex metrics, such as fixation duration, fixation entropy, etc. To extract the fixations, we used a customized velocity-threshold identification algorithm with a minimum fixation duration of 80 ms, since fixations shorter than 100 ms are considered acceptable when analyzing visual scene perception [51], as MuMIA does. Based on the position and the duration of the fixations, the system understands when and what the visitors gaze at, and thus, it activates (or not) the corresponding area of interest (see the sketch after this list).
- It identifies the areas of interest, notifies the visitors (with the use of visual annotations), and then provides the visitors with a way to interact with them (through voice commands, as discussed in the next subsection).
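To illustrate how the module processes the gaze stream, the following Python sketch combines a minimal velocity-threshold (I-VT) fixation filter with a rectangular area-of-interest hit test. It is a simplified illustration rather than the production code of MuMIA: only the 80 ms minimum fixation duration comes from the text above, while the velocity threshold, the sampling rate, and the rectangular AOI shape are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Illustrative values: only MIN_FIXATION_MS follows the paper; the rest are assumptions.
SAMPLING_RATE_HZ = 60
VELOCITY_THRESHOLD_DEG_S = 30.0
MIN_FIXATION_MS = 80

@dataclass
class Fixation:
    x: float           # mean gaze x on the exhibit surface
    y: float           # mean gaze y
    duration_ms: float

def detect_fixations(samples: List[Tuple[float, float, float]]) -> List[Fixation]:
    """Velocity-threshold (I-VT) sketch: samples are (x, y, angular_velocity_deg_s) tuples."""
    fixations: List[Fixation] = []
    current: List[Tuple[float, float]] = []
    sample_ms = 1000.0 / SAMPLING_RATE_HZ

    def flush() -> None:
        # Close the current group of slow samples and keep it only if it is long enough.
        duration = len(current) * sample_ms
        if current and duration >= MIN_FIXATION_MS:
            cx = sum(p[0] for p in current) / len(current)
            cy = sum(p[1] for p in current) / len(current)
            fixations.append(Fixation(cx, cy, duration))

    for x, y, velocity in samples:
        if velocity < VELOCITY_THRESHOLD_DEG_S:
            current.append((x, y))        # slow movement: part of a fixation
        else:
            flush()                       # fast movement (saccade): close the fixation
            current = []
    flush()
    return fixations

@dataclass
class AreaOfInterest:
    name: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, fixation: Fixation) -> bool:
        return (self.x_min <= fixation.x <= self.x_max
                and self.y_min <= fixation.y <= self.y_max)

def active_aoi(fixation: Fixation, aois: List[AreaOfInterest]) -> Optional[AreaOfInterest]:
    """Return the area of interest the visitor fixates on, or None if no AOI is activated."""
    return next((aoi for aoi in aois if aoi.contains(fixation)), None)
```

In the actual system, the activated area of interest is then highlighted with a visual annotation and exposed to the voice-command module described next.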
3.1.3. Voice Commands
3.1.4. Repository
3.2. Scenario
1. The visitor enters a virtual gallery hall and starts exploring the various art exhibits. When an exhibit attracts their attention, they visit it (e.g., they stand in front of it).
2. The visitor visually explores the art exhibit, trying to discover areas that seem interesting. At the same time, the audio guide provides them with brief information about the exhibit and the gallery theme.
3. The visitor identifies an area of interest by gazing at it and, through voice commands, asks the system to provide more information about what they are gazing at.
4. The system searches the repository for information about the identified area of interest. When found, it retrieves the corresponding cultural information and presents it to the visitor through the audio channel (i.e., the sound interface).
5. After receiving the information, the visitor can gaze at other areas of interest to receive cultural information about them, or they can ask the system for more information about the currently gazed area of interest, in order to create deeper connections and build a better understanding of the underlying contexts of the presented area of interest and its relation to other areas within the art exhibit. (A simplified sketch of this interaction loop is given after the list.)
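As a minimal sketch of how such a gaze-plus-voice loop could be wired together, consider the Python fragment below. All object interfaces (e.g., `eye_tracker.next_fixation`, `speech.listen`, `repository.lookup`, `audio.speak`) are hypothetical names introduced for illustration and are not specified in the paper.

```python
def visit_exhibit(exhibit, eye_tracker, speech, repository, audio):
    """Hypothetical event loop for one art exhibit; all interfaces are assumptions."""
    audio.play(exhibit.intro_clip)                 # step 2: brief audio-guide introduction
    while exhibit.session_active():
        fixation = eye_tracker.next_fixation()     # gaze input from the eye-tracking module
        aoi = exhibit.find_aoi(fixation)           # step 3: which area of interest is gazed at?
        if aoi is None:
            continue
        exhibit.highlight(aoi)                     # visual annotation of the activated AOI
        command = speech.listen(timeout_s=5)       # step 3: voice command, e.g., "tell me more"
        if command is None:
            continue
        info = repository.lookup(aoi, command)     # step 4: retrieve the cultural information
        if info is not None:
            audio.speak(info)                      # step 4: deliver it over the audio channel
        # step 5: the loop repeats, so the visitor can explore further AOIs or ask follow-ups
```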
3.3. Implementation
4. Evaluation Study
4.1. Method
4.1.1. Procedure
- S1. Preparation: The preparation started with the recruitment process, during which we contacted potential visitors who had experience in visiting both physical and virtual cultural heritage institutions and communicated the study motivation. Next, the people who were willing to take part received more information about the study, and we arranged a mutually agreed date and time to conduct the main stage.
- S2. Main stage: For the main stage of the user study, (i) the participants provided their consent; (ii) we presented them with the MuMIA system and the audio guide; (iii) the participants used MuMIA and the audio guide to explore the School of Athens exhibit; and (iv) we discussed with the participants their overall experience, following a semi-structured interview approach in which we adopted the MUSETECH [52] model. During the aforementioned steps, we took notes and recorded the participants when necessary.
- S3. Analysis: After all participants completed the main stage, we collected the data, transcribed the recordings, and performed a thematic analysis based on the dimensions described in the MUSETECH framework [52], which is discussed in more detail in the following paragraphs.
4.1.2. Participants
4.1.3. Visit Scenario
4.1.4. Apparatus
4.1.5. Thematic Analysis
4.2. Results
4.2.1. Experience Design and Narrative
“I totally lost myself in the visit experience! The feeling was great and I hope I had more time to explore more MuMIA exhibits and other aspects of the gallery”∼[Participant #4]
“The time slipped away so fast! I was fully concentrated and engaged with the exploration of the School of Athens; that categorization and the chunking of the narratives into micro-steps of acquiring knowledge definitely helped.”∼[Participant #3]
“I loved the features of MuMIA! The fact that I could visually explore the scene and select a figure or an instrument, that I knew nothing about, to receive information and know about it, was super helpful. For example, I didn’t know what the central figure holds and I wouldn’t be able to ask about it, if I couldn’t gaze at it to show to the system somehow that I need more information about it.”∼[Participant #10]
“MuMIA incorporates a great multi-modal interaction mechanism! The look-and-ask feature and the micro-blocks of narratives, I have the feeling that I could acquire quick knowledge and information about the art exhibit; which would normally take time when searching in the web and Wikipedia.”∼[Participant #11]
“The system gave me the opportunity of self-exploration, meaning that I could explore, inspect, and process what I liked at a given time under a given motivation. So, it triggered my curiosity and it decreased my cognitive effort, helping me to better understand the concept of the art exhibit and hopefully remember it.”∼[Participant #2]
“I am sure that a single pass through the whole information does not help you to remember, and thus learn, the presented cultural content, in contrast to a more goal-oriented approach, as the one supported in MuMIA. The visitors are implicitly guided to small chunks of information, which are more memorable and thus the visitors can recall the information in the future.”∼[Participant #11]
4.2.2. Interactions, Affordances, and Metaphors
“While I had never used an eye-tracker before, its use was quite simple and intuitive. It was as simple as you look at an area that is interesting and you ask the system to provide more information about it.”∼[Participant #11]
“I noticed no failure of the system; it always provided information about the figure I was looking at. I enjoyed the fact that I could ask questions for combinations of figures, such as Michelangelo and Heraclitus, with the system providing me with more complex information.”∼[Participant #10]
“It was easy to learn how to use MuMIA; for people who face difficulties, maybe, it would be a good idea to include a tutorial phase to practice interactions before the actual visit phase.”∼[Participant #6]
“The concept was natural and friendly to me, so, I didn’t need much time to understand how to use MuMIA. But, even if I had to spend more time to figure out how it works, I would had spent it, because the visitor experience was unique!”∼[Participant #11]
“I didn’t notice any failure during the operation, but a shortcoming that I noticed is that, at some times, there was a latency to follow my gaze. So, I caught myself trying to follow the [system] pointer rather than focusing on the content of the exhibit.”∼[Participant #3]
“While I enjoyed MuMIA, I am not sure if it’s a feasible solution for a museum context. I mean, what is the cost for a museum to have an eye-tracker for each exhibit or what is the cost for a visitor to have their own eye-tracker? In case that the cost is high, it might be a problem for adopting such solutions in real-life settings.”∼[Participant #1]
“Both navigating through the scene and navigating from eye-tracking to voice-commands and vice versa was super easy. It was more natural than using an audio guide, and I think, that it contributed into being more engaged with the cultural information.”∼[Participant #5]
“Given that visitors can use MuMIA through multisensoriality, meaning both visual and auditory channels, definitely will help them navigate through the art concepts presented in each exhibit or even gallery, and thus, they will eventually have a more pleasant and informative visitor experience.”∼[Participant #10]
4.2.3. Perceived Content Quality
“I was free to choose and filter what content I had access to, and this helped a lot at acquiring information and knowledge for art concepts that I was interested and not for concepts that were not relevant to my quest.”∼[Participant #9]
4.2.4. Operation as Deployment and Setting Up
“The system can definitely be used in real-life settings, in a museum for example, as it is robust, credible, and performs flawlessly; with no doubt, it can increase the quality of the visitor experience when interacting with art exhibits.”∼[Participant #7]
“To be entirely honest I was somewhat surprised that the system operation was that good! Given that it is based on a eye-tracking, I wouldn’t be surprised if the prototype was not working well.”∼[Participant #9]
“One thing that I am concerned about the tool is that it might not be integrable to the devices that the people bring with them when visiting a cultural heritage institution, such as their mobile devices; so, the set-up might be difficult in such conditions.”∼[Participant #2]
“The responsiveness of the tool was impressive, given that it works with eye-tracking and voice commands. When gazing at some of the painting figures, it was giving me the opportunity to ask for information in zero time. I had also no problems with the voice commands, as it understood literally all the commands I gave.”∼[Participant #5]
5. Discussion
5.1. Evoking Natural Interactions
5.2. Micro-Narratives for Supporting Self-Exploration through Multimodal Interaction
5.3. Implications Beyond the Cultural Heritage Domain
6. Limitations, Future Work, and Ethical Considerations
6.1. Limitations and Future Work
6.2. Ethical Considerations
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
Abbreviation | Definition |
---|---|
AOI | Area of Interest |
API | Application Programming Interface |
MuMIA | Multi-Modal Interactions in Art |
SQL | Structured Query Language |
References
- Bekele, M.K.; Pierdicca, R.; Frontoni, E.; Malinverni, E.S.; Gain, J. A Survey of Augmented, Virtual, and Mixed Reality for Cultural Heritage. J. Comput. Cult. Herit. 2018, 11, 1–36. [Google Scholar] [CrossRef]
- Walmsley, A.P.; Kersten, T.P. The Imperial Cathedral in Königslutter (Germany) as an Immersive Experience in Virtual Reality with Integrated 360° Panoramic Photography. Appl. Sci. 2020, 10, 1517. [Google Scholar] [CrossRef] [Green Version]
- Edler, D.; Keil, J.; Wiedenlübbert, T.; Sossna, M.; Kühne, O.; Dickmann, F. Immersive VR Experience of Redeveloped Post-industrial Sites: The Example of “Zeche Holland” in Bochum-Wattenscheid. KN J. Cartogr. Geogr. Inf. 2019, 69, 267–284. [Google Scholar] [CrossRef] [Green Version]
- Raptis, G.E.; Katsini, C.; Chrysikos, T. CHISTA: Cultural Heritage Information Storage and reTrieval Application. In Digital Heritage: Progress in Cultural Heritage: Documentation, Preservation, and Protection; Springer: Berlin/Heidelberg, Germany, 2018; pp. 163–170. [Google Scholar] [CrossRef]
- Turk, M. Multimodal interaction: A review. Pattern Recognit. Lett. 2014, 36, 189–195. [Google Scholar] [CrossRef]
- Cutugno, F.; Leano, V.A.; Rinaldi, R.; Mignini, G. Multimodal Framework for Mobile Interaction. In Proceedings of the International Working Conference on Advanced Visual Interfaces, Capri Island, Italy, 22–26 May 2012; pp. 197–203. [Google Scholar] [CrossRef]
- Oviatt, S. Advances in robust multimodal interface design. IEEE Ann. Hist. Comput. 2003, 23, 62–68. [Google Scholar] [CrossRef]
- Xiao, B.; Girand, C.; Oviatt, S. Multimodal integration patterns in children. In Proceedings of the Seventh International Conference on Spoken Language Processing, Denver, CO, USA, 16–20 September 2002. [Google Scholar]
- Oviatt, S.; Lunsford, R.; Coulston, R. Individual differences in multimodal integration patterns: What are they and why do they exist? In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Portland, OR, USA, 2–7 April 2005; pp. 241–249. [Google Scholar]
- Oviatt, S.; Cohen, P.; Wu, L.; Duncan, L.; Suhm, B.; Bers, J.; Holzman, T.; Winograd, T.; Landay, J.; Larson, J.; et al. Designing the user interface for multimodal speech and pen-based gesture applications: State-of-the-art systems and future research directions. Hum. Comput. Interact. 2000, 15, 263–322. [Google Scholar] [CrossRef]
- Jaimes, A.; Sebe, N. Multimodal human–computer interaction: A survey. Comput. Vis. Image Underst. 2007, 108, 116–134. [Google Scholar] [CrossRef]
- Li, T.J.J.; Azaria, A.; Myers, B.A. SUGILITE: Creating multimodal smartphone automation by demonstration. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, Denver, CO, USA, 6–11 May 2017; pp. 6038–6049. [Google Scholar]
- Srinivasan, A.; Lee, B.; Henry Riche, N.; Drucker, S.M.; Hinckley, K. InChorus: Designing consistent multimodal interactions for data visualization on tablet devices. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, 25–30 April 2020; pp. 1–13. [Google Scholar]
- Aslan, S.; Alyuz, N.; Tanriover, C.; Mete, S.E.; Okur, E.; D’Mello, S.K.; Arslan Esme, A. Investigating the impact of a real-time, multimodal student engagement analytics technology in authentic classrooms. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, Glasgow, UK, 4–9 May 2019; pp. 1–12. [Google Scholar]
- Alyuz, N.; Okur, E.; Genc, U.; Aslan, S.; Tanriover, C.; Esme, A.A. An unobtrusive and multimodal approach for behavioral engagement detection of students. In Proceedings of the 1st ACM SIGCHI International Workshop on Multimodal Interaction for Education, Glasgow UK, 13 November 2017; pp. 26–32. [Google Scholar]
- Bedri, A.; Li, D.; Khurana, R.; Bhuwalka, K.; Goel, M. Fitbyte: Automatic diet monitoring in unconstrained situations using multimodal sensing on eyeglasses. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, 25–30 April 2020; pp. 1–12. [Google Scholar]
- Speicher, M.; Nebeling, M. Gesturewiz: A human-powered gesture design environment for user interface prototypes. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, Montréal, QC, Canada, 21–26 April 2018; pp. 1–11. [Google Scholar]
- Salminen, K.; Farooq, A.; Rantala, J.; Surakka, V.; Raisamo, R. Unimodal and multimodal signals to support control transitions in semiautonomous vehicles. In Proceedings of the 11th International Conference on Automotive User Interfaces and Interactive Vehicular Applications, Utrecht, The Netherlands, 22–25 September 2019; pp. 308–318. [Google Scholar]
- Politis, I.; Brewster, S.; Pollick, F. Language-based multimodal displays for the handover of control in autonomous cars. In Proceedings of the 7th International Conference on Automotive User Interfaces and Interactive Vehicular Applications, Nottingham UK, 1–3 September 2015; pp. 3–10. [Google Scholar]
- Khamis, M.; Alt, F.; Hassib, M.; von Zezschwitz, E.; Hasholzner, R.; Bulling, A. Gazetouchpass: Multimodal authentication using gaze and touch on mobile devices. In Proceedings of the 2016 CHI Conference Extended Abstracts on Human Factors in Computing Systems, San Jose, CA, USA, 7–16 May 2016; pp. 2156–2164. [Google Scholar]
- Lee, J.; Han, J.; Lee, G. Investigating the information transfer efficiency of a 3x3 watch-back tactile display. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, Seoul, Korea, 18–23 April 2015; pp. 1229–1232. [Google Scholar]
- He, L.; Xu, C.; Xu, D.; Brill, R. PneuHaptic: Delivering haptic cues with a pneumatic armband. In Proceedings of the 2015 ACM International Symposium on Wearable Computers, Osaka, Japan, 7–11 September 2015; pp. 47–48. [Google Scholar]
- Lee, J.; Lee, G. Designing a non-contact wearable tactile display using airflows. In Proceedings of the 29th Annual Symposium on User Interface Software and Technology, Tokyo, Japan, 16–19 October 2016; pp. 183–194. [Google Scholar]
- Shim, Y.A.; Lee, J.; Lee, G. Exploring multimodal watch-back tactile display using wind and vibration. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, Montréal, QC, Canada, 21–26 April 2018; pp. 1–12. [Google Scholar]
- Christou, C.; Angus, C.; Loscos, C.; Dettori, A.; Roussou, M. A versatile large-scale multimodal VR system for cultural heritage visualization. In Proceedings of the ACM Symposium on Virtual Reality Software and Technology, Limassol, Cyprus, 1–3 November 2006; pp. 133–140. [Google Scholar]
- Liarokapis, F.; Petridis, P.; Andrews, D.; de Freitas, S. Multimodal serious games technologies for cultural heritage. In Mixed Reality and Gamification for Cultural Heritage; Springer: Berlin/Heidelberg, Germany, 2017; pp. 371–392. [Google Scholar]
- Dimitropoulos, K.; Manitsaris, S.; Tsalakanidou, F.; Denby, B.; Crevier-Buchman, L.; Dupont, S.; Nikolopoulos, S.; Kompatsiaris, I.; Charisis, V.; Hadjileontiadis, L.; et al. A Multimodal Approach for the Safeguarding and Transmission of Intangible Cultural Heritage: The Case of i-Treasures. IEEE Intell. Syst. 2018, 1. [Google Scholar] [CrossRef]
- Santoro, C.; Paterno, F.; Ricci, G.; Leporini, B. A multimodal mobile museum guide for all. In Proceedings of the Mobile Interaction with the Real World Workshop, Singapore, 9 September 2007; pp. 21–25. [Google Scholar]
- Ho, C.M.; Nelson, M.E.; Müeller-Wittig, W. Design and implementation of a student-generated virtual museum in a language curriculum to enhance collaborative multimodal meaning-making. Comput. Educ. 2011, 57, 1083–1097. [Google Scholar] [CrossRef]
- Santangelo, A.; Augello, A.; Gentile, A.; Pilato, G.; Gaglio, S. A Chat-Bot Based Multimodal Virtual Guide for Cultural Heritage Tours. In Proceedings of the 2006 International Conference on Pervasive Systems & Computing, Las Vegas, NV, USA, 26–29 June 2006; pp. 114–120. [Google Scholar]
- Carmichael, J.; Larson, M.; Marlow, J.; Newman, E.; Clough, P.; Oomen, J.; Sav, S. Multimodal indexing of digital audio-visual documents: A case study for cultural heritage data. In Proceedings of the 2008 International Workshop on Content-Based Multimedia Indexing, London, UK, 18–20 June 2008; pp. 93–100. [Google Scholar]
- Cutugno, F.; Dell’Orletta, F.; Poggi, I.; Savy, R.; Sorgente, A. The CHROME Manifesto: Integrating Multimodal Data into Cultural Heritage Resources. In Proceedings of the Fifth Italian Conference on Computational Linguistics, Torino, Italy, 10–12 December 2018. [Google Scholar]
- Neto, J.N.; Silva, R.; Neto, J.P.; Pereira, J.M.; Fernandes, J. Solis’ Curse-A Cultural Heritage game using voice interaction with a Virtual Agent. In Proceedings of the 2011 Third International Conference on Games and Virtual Worlds for Serious Applications, Athens, Greece, 4–6 May 2011; pp. 164–167. [Google Scholar]
- D’Auria, D.; Di Mauro, D.; Calandra, D.M.; Cutugno, F. A 3D audio augmented reality system for a cultural heritage management and fruition. J. Digit. Inf. Manag. 2015, 13. [Google Scholar] [CrossRef]
- Sernani, P.; Vagni, S.; Falcionelli, N.; Mekuria, D.N.; Tomassini, S.; Dragoni, A.F. Voice interaction with artworks via indoor localization: A vocal museum. In Proceedings of the International Conference on Augmented Reality, Virtual Reality and Computer Graphics, Lecce, Italy, 8–11 June 2020; Springer: Cham, Switzerland; pp. 66–78. [Google Scholar]
- Ferracani, A.; Faustino, M.; Giannini, G.X.; Landucci, L.; Del Bimbo, A. Natural experiences in museums through virtual reality and voice commands. In Proceedings of the 25th ACM international conference on Multimedia, Mountain View, CA, USA, 23–27 October 2017; pp. 1233–1234. [Google Scholar]
- Picot, A.; Charbonnier, S.; Caplier, A. Drowsiness detection based on visual signs: Blinking analysis based on high frame rate video. In Proceedings of the 2010 IEEE Instrumentation & Measurement Technology Conference, Austin, TX, USA, 3–6 May 2010; pp. 801–804. [Google Scholar]
- Xu, G.; Zhang, Z.; Ma, Y. Improving the performance of iris recognition system using eyelids and eyelashes detection and iris image enhancement. In Proceedings of the 2006 5th IEEE International Conference on Cognitive Informatics, Beijing, China, 17–19 July 2006; Volume 2, pp. 871–876. [Google Scholar]
- Zhang, Y.; Chong, M.K.; Müller, J.; Bulling, A.; Gellersen, H. Eye Tracking for Public Displays in the Wild. Pers. Ubiquitous Comput. 2015, 19, 967–981. [Google Scholar] [CrossRef]
- Katsini, C.; Abdrabou, Y.; Raptis, G.E.; Khamis, M.; Alt, F. The Role of Eye Gaze in Security and Privacy Applications: Survey and Future HCI Research Directions. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, 25–30 April 2020; pp. 1–21. [Google Scholar] [CrossRef]
- Raptis, G.E.; Fidas, C.; Avouris, N. Do Game Designers’ Decisions Related to Visual Activities Affect Knowledge Acquisition in Cultural Heritage Games? An Evaluation From a Human Cognitive Processing Perspective. J. Comput. Cult. Herit. 2019, 12, 4:1–4:25. [Google Scholar] [CrossRef]
- Rainoldi, M.; Neuhofer, B.; Jooss, M. Mobile eyetracking of museum learning experiences. In Information and Communication Technologies in Tourism 2018; Springer: Berlin/Heidelberg, Germany, 2018; pp. 473–485. [Google Scholar]
- Pelowski, M.; Leder, H.; Mitschke, V.; Specker, E.; Gerger, G.; Tinio, P.P.; Vaporova, E.; Bieg, T.; Husslein-Arco, A. Capturing aesthetic experiences with installation art: An empirical assessment of emotion, evaluations, and mobile eye tracking in Olafur Eliasson’s “Baroque, Baroque!”. Front. Psychol. 2018, 9, 1255. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Raptis, G.E.; Fidas, C.; Katsini, C.; Avouris, N. A Cognition-Centered Personalization Framework for Cultural-Heritage Content. User Model. User Adapt. Interact. 2019, 29, 9–65. [Google Scholar] [CrossRef]
- Pierdicca, R.; Paolanti, M.; Quattrini, R.; Mameli, M.; Frontoni, E. A Visual Attentive Model for Discovering Patterns in Eye-Tracking Data—A Proposal in Cultural Heritage. Sensors 2020, 20, 2101. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Mokatren, M.; Kuflik, T.; Shimshoni, I. Exploring the potential of a mobile eye tracker as an intuitive indoor pointing device: A case study in cultural heritage. Future Gener. Comput. Syst. 2018, 81, 528–541. [Google Scholar] [CrossRef]
- Toyama, T.; Kieninger, T.; Shafait, F.; Dengel, A. Museum Guide 2.0–an eye-tracking based personal assistant for museums and exhibits. In Proceedings of the International Conference “Re-Thinking Technology in Museums”, Limerick, Ireland, 26–27 May 2011; pp. 103–110. [Google Scholar]
- Garbutt, M.; East, S.; Spehar, B.; Estrada-Gonzalez, V.; Carson-Ewart, B.; Touma, J. The embodied gaze: Exploring applications for mobile eye tracking in the art museum. Visit. Stud. 2020, 23, 82–100. [Google Scholar] [CrossRef]
- Cantoni, V.; Merlano, L.; Nugrahaningsih, N.; Porta, M. Eye Tracking for Cultural Heritage: A Gaze-Controlled System for Handless Interaction with Artworks. In Proceedings of the 17th International Conference on Computer Systems and Technologies 2016, Palermo, Italy, 23–24 June 2016; pp. 307–314. [Google Scholar] [CrossRef]
- Sibert, L.E.; Jacob, R.J.K. Evaluation of Eye Gaze Interaction. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Hague, The Netherlands, 1–6 April 2000; pp. 281–288. [Google Scholar] [CrossRef] [Green Version]
- Helmert, J.R.; Joos, M.; Pannasch, S.; Velichkovsky, B.M. Two visual systems and their eye movements: Evidence from static and dynamic scene perception. In Proceedings of the Annual Meeting of the Cognitive Science Society, Stresa, Italy, 21–23 July 2005; Volume 27. [Google Scholar]
- Damala, A.; Ruthven, I.; Hornecker, E. The MUSETECH Model: A Comprehensive Evaluation Framework for Museum Technology. ACM J. Comput. Cult. Herit. 2019, 12, 7:1–7:22. [Google Scholar] [CrossRef] [Green Version]
- Braun, V.; Clarke, V. Using thematic analysis in psychology. Qual. Res. Psychol. 2006, 3, 77–101. [Google Scholar] [CrossRef] [Green Version]
- Damala, A.; Ruthven, I.; Hornecker, E. The MUSETECH Companion: Navigating the Matrix. In Guide or Manual; University of Strathclyde: Glasgow, UK, 2019. [Google Scholar]
- Falk, J.H.; Dierking, L.D. The Museum Experience Revisited; Routledge: Oxfordshire, UK, 2016. [Google Scholar]
- Ahuja, K.; Islam, R.; Parashar, V.; Dey, K.; Harrison, C.; Goel, M. EyeSpyVR: Interactive Eye Sensing Using Off-the-Shelf, Smartphone-Based VR Headsets. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 2018, 2, 57:1–57:10. [Google Scholar] [CrossRef]
- Fuhl, W.; Tonsen, M.; Bulling, A.; Kasneci, E. Pupil Detection for Head-mounted Eye Tracking in the Wild: An Evaluation of the State of the Art. Mach. Vis. Appl. 2016, 27, 1275–1288. [Google Scholar] [CrossRef]
- George, C.; Khamis, M.; Buschek, D.; Hussmann, H. Investigating the Third Dimension for Authentication in Immersive Virtual Reality and in the Real World. In Proceedings of the 2019 IEEE Conference on Virtual Reality and 3D User Interfaces (VR), Osaka, Japan, 23–27 May 2019; pp. 277–285. [Google Scholar] [CrossRef] [Green Version]
- Hirzle, T.; Gugenheimer, J.; Geiselhart, F.; Bulling, A.; Rukzio, E. A Design Space for Gaze Interaction on Head-mounted Displays. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, Glasgow, UK, 4–9 May 2019; p. 625. [Google Scholar] [CrossRef]
- Liu, S.; Wilson, J.; Xia, Y. Eye Gazing Passcode Generation Crossing Augmented Reality (AR) and Virtual Reality (VR) Devices. US Patent 9824206B1, 21 November 2017. [Google Scholar]
- Khamis, M.; Alt, F.; Bulling, A. The Past, Present, and Future of Gaze-enabled Handheld Mobile Devices: Survey and Lessons Learned. In Proceedings of the 20th International Conference on Human-Computer Interaction with Mobile Devices and Services, Barcelona, Spain, 3–6 September 2018. [Google Scholar] [CrossRef]
- Brondi, R.; Avveduto, G.; Carrozzino, M.; Tecchia, F.; Alem, L.; Bergamasco, M. Immersive Technologies and Natural Interaction to Improve Serious Games Engagement. In Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2016; pp. 121–130. [Google Scholar] [CrossRef]
- Galais, T.; Delmas, A.; Alonso, R. Natural interaction in virtual reality. In Proceedings of the 31st Conference on l’Interaction Homme-Machine Adjunct, Grenoble, France, 10–13 December 2019. [Google Scholar] [CrossRef] [Green Version]
- McMahan, R.P.; Alon, A.J.D.; Lazem, S.; Beaton, R.J.; Machaj, D.; Schaefer, M.; Silva, M.G.; Leal, A.; Hagan, R.; Bowman, D.A. Evaluating natural interaction techniques in video games. In Proceedings of the 2010 IEEE Symposium on 3D User Interfaces (3DUI), Waltham, MA, USA, 20–21 March 2010. [Google Scholar] [CrossRef]
- Cantoni, V.; Cellario, M.; Porta, M. Perspectives and challenges in e-learning: Towards natural interaction paradigms. J. Vis. Lang. Comput. 2004, 15, 333–345. [Google Scholar] [CrossRef]
- Pisoni, G.; Díaz-Rodríguez, N.; Gijlers, H.; Tonolli, L. Human-Centered Artificial Intelligence for Designing Accessible Cultural Heritage. Appl. Sci. 2021, 11, 870. [Google Scholar] [CrossRef]
- Bordoni, L.; Ardissono, L.; Barcelo, J.; Chella, A.; de Gemmis, M.; Gena, C.; Iaquinta, L.; Lops, P.; Mele, F.; Musto, C.; et al. The contribution of AI to enhance understanding of Cultural Heritage. Intell. D 2013, 7, 101–112. [Google Scholar] [CrossRef]
- Díaz-Rodríguez, N.; Pisoni, G. Accessible cultural heritage through explainable artificial intelligence. In Proceedings of the Adjunct Publication of the 28th ACM Conference on User Modeling, Adaptation and Personalization, Genoa, Italy, 12–18 June 2020; pp. 317–324. [Google Scholar]
- Caggianese, G.; De Pietro, G.; Esposito, M.; Gallo, L.; Minutolo, A.; Neroni, P. Discovering Leonardo with artificial intelligence and holograms: A user study. Pattern Recognit. Lett. 2020, 131, 361–367. [Google Scholar] [CrossRef]
- Antoniou, A.; O’Brien, J.; Bardon, T.; Barnes, A.; Virk, D. Micro-Augmentations: Situated Calibration of a Novel Non-Tactile, Peripheral Museum Technology. In Proceedings of the 19th Panhellenic Conference on Informatics, Athens, Greece, 1–3 October 2015; pp. 229–234. [Google Scholar] [CrossRef]
- Rizvic, S.; Djapo, N.; Alispahic, F.; Hadzihalilovic, B.; Cengic, F.F.; Imamovic, A.; Okanovic, V.; Boskovic, D. Guidelines for interactive digital storytelling presentations of cultural heritage. In Proceedings of the 2017 9th International Conference on Virtual Worlds and Games for Serious Applications (VS-Games), Athens, Greece, 6–8 September 2017. [Google Scholar] [CrossRef]
- Sylaiou, S.; Dafiotis, P. Storytelling in Virtual Museums: Engaging A Multitude of Voices. In Visual Computing for Cultural Heritage; Springer: Berlin/Heidelberg, Germany, 2020; pp. 369–388. [Google Scholar] [CrossRef]
- Caine, K. Local Standards for Sample Size at CHI. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, San Jose, CA, USA, 7–16 May 2016; pp. 981–992. [Google Scholar] [CrossRef] [Green Version]
- Kang, J.; Jang, J.; Jeong, C. Understanding museum visitor satisfaction and revisit intentions through mobile guide system: Moderating role of age in museum mobile guide adoption. Asia Pac. J. Tour. Res. 2018, 23, 95–108. [Google Scholar] [CrossRef]
- Hammady, R.; Ma, M.; Strathearn, C. User experience design for mixed reality: A case study of HoloLens in museum. Int. J. Technol. Mark. 2019, 13, 354–375. [Google Scholar] [CrossRef]
Work | Sound/Speech Interface | Gaze Interface |
---|---|---|
Raptis et al. [44] | Not available | Eye-tracking was used to elicit visitors’ cognitive profiles and to adjust cultural heritage applications (virtual tour and game) during visit time. |
Pierdicca et al. [45] | Passive audio guide | Gaze data were used to build visitors’ attention models to extract patterns and to provide art recommendations that would be interesting to the visitors during visit time. |
Mokatren et al. [46] | The visitors were provided with verbal information about the exhibit they were looking at. Gesture identification was also employed. | Mobile eye tracking was used to identify visitors’ location and exhibits of interest, taking advantage of image-based object recognition techniques.
Toyama et al. [47] | After identifying an exhibit, the corresponding audio file was played and verbal information was delivered to the visitors. | Mobile eye tracking was used to detect gaze on exhibits, which were then identified through image-based object recognition techniques. |
Garbutt et al. [48] | Not available | Mobile eye tracking was used to identify areas of attention on exhibits and identify eye-movement patterns within exhibition spaces during visit time. |
Cantoni et al. [49] | Not available | Gaze interaction was used to enable visitors of an exhibition to select artworks, to perform image handling (e.g., scrolling and resizing), and to define areas of interest.
Our tool: MuMIA | Visitors ask for information about the areas of interest they look at. They can combine areas of interest and ask for information about them, including areas that they had looked at before. The system provides visitors with verbal information. | Gaze data are used to help visitors identify multiple areas of interest on an exhibit and ask the system for information about them. They can revisit the identified areas of interest. |
Characteristic | Information |
---|---|
Sampling rate | 60 Hz |
Accuracy | 0.5° (average) |
Spatial resolution | 0.1° (RMS) |
Latency | <20 ms |
Calibration | 5, 9, 12 points |
Operating range | 45 cm–75 cm |
Tracking area | 40 cm × 30 cm at 65 cm distance |
Screen sizes | Up to 24 inches |
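As a rough worked example of what these specifications mean in practice, the angular accuracy can be converted into an on-screen error. The sketch below assumes the 65 cm reference distance from the tracking-area row and a typical 24-inch full-HD display (roughly 53 cm wide); both the display geometry and the resulting figures are illustrative assumptions, not values reported in the paper.

```python
import math

# Eye-tracker specifications from the table above.
ACCURACY_DEG = 0.5            # average angular accuracy
DISTANCE_CM = 65.0            # reference viewing distance (from the tracking-area row)

# Assumed display: 24-inch full-HD screen, roughly 53 cm wide -> ~36 px/cm.
PIXELS_PER_CM = 1920 / 53.0

def angular_error_on_screen(accuracy_deg: float, distance_cm: float) -> float:
    """Convert an angular accuracy into a linear error on the screen plane (in cm)."""
    return distance_cm * math.tan(math.radians(accuracy_deg))

error_cm = angular_error_on_screen(ACCURACY_DEG, DISTANCE_CM)
error_px = error_cm * PIXELS_PER_CM
print(f"~{error_cm:.2f} cm (~{error_px:.0f} px) of gaze error at {DISTANCE_CM:.0f} cm")
# ~0.57 cm (~21 px): areas of interest should be comfortably larger than this.
```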