Creating Non-Visual Non-Verbal Social Interactions in Virtual Reality
Abstract
1. Introduction
1.1. Job Training
1.2. Remote Assist and Collaboration
1.2.1. Healthcare
1.2.2. Visual Virtual Reality
Part 1: Visuals
Part 2: Audio
Part 3: Head Tracking
Part 4: Tactile
1.3. Use of VR with BLVIs While Navigating
1.3.1. Non-Visual VR Navigation Technologies
1.3.2. Locomotive Navigation
1.4. Non-Verbal Social Interactions, Importance for BLVIs
1.5. Inventory of Non-Verbal Social Interactions in VR
1.6. Evaluations of Non-Visual Non-Verbal Social Interactions in VR
1.7. Auditory Display Techniques Useful in Non-Visual VR
1.8. Audio Games
1.9. Game and VR Accessibility Conventions
1.10. Accessibility Barriers in Mainstream VR Platforms
1.11. Self-Disclosure
2. Method for Data Collection
3. Results
3.1. Locative Movement
3.1.1. Direct Teleportation
Description
Results (Quotations)
There are two possible interactions (recall that these “results” are direct quotations from Delphi participants):
I press a button to begin selecting the target. I hear a small sound in the background that lets me know I’m in selection mode. Any movement now gets represented by the actual orientation of the listener in game, including position and orientation. The background sound changes slightly based on whether or not I’m able to teleport to this specific spot. Alternatively, it would include a mode which does not permit me to move to an area that I cannot teleport to, making the exploration a separate action from the actual target selection. Alternatively, exploration mode could be the default, and I press a button to select this spot as my target. If impossible, I get an error sound.

As an alternative to a free-floating destination target: I am in the lobby of the virtual hotel, hearing any number of sounds around me. I press my teleport key and hear “D4” softly spoken. I can now arrow around a 7 by 7 grid, which has placed itself around me. As I arrow, I have not officially moved my avatar, but I begin hearing sounds as though I were in that location. I arrow up and over a few times and hear “A3”, followed by splashing in the pool and the sounds of the waterslide. In my actual current location (D4 on this imaginary grid overlay) I wasn’t able to hear the pool because it was around a corner and down a hall. From A3 I am around that corner and at a direct enough path to be able to hear the pool a little ways off. My rotation is maintained as I get to “try out” each space on this 7 by 7 grid, so I won’t have to deal with the confusion of teleporting and being spun in some unexpected way. Some of the grids do say “unavailable” after speaking their coordinates, letting me know it is an invalid teleport location. I may escape out of this teleport menu without moving anywhere, though I will have quickly gained some information about my surroundings by looking around the grid. If I do choose to move, I press enter, hear some sort of sound/teleported statement, and am now actually in this new location. Dedicated background music could play any time someone is using the teleportation grid, just to help them always know the difference between actually being in a spot versus sampling it only through the grid.
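The grid-preview idea above can be sketched in a few lines of Python. The `world`, `listener`, and `speech` objects, their method names, and the cell size are illustrative assumptions rather than any existing engine’s API; the sketch only shows how the preview cursor, the walkability check, and the temporary listener repositioning could fit together.

```python
"""Sketch of an audio 'teleport grid' preview (interfaces are assumptions).

The player stays put while a 7x7 grid of candidate destinations is laid out
around them. Arrowing over a cell moves only the *listener* so the player can
sample how that spot sounds; pressing enter commits the move."""

from dataclasses import dataclass

GRID_SIZE = 7                      # 7x7 grid, labelled A1..G7
COLS = "ABCDEFG"

@dataclass
class Vec3:
    x: float
    y: float
    z: float

class TeleportGrid:
    def __init__(self, world, listener, speech, cell_size=2.0):
        self.world = world          # assumed: world.is_walkable(pos) -> bool
        self.listener = listener    # assumed: listener.set_position(pos), rotation untouched
        self.speech = speech        # assumed: speech.say(text)
        self.cell_size = cell_size
        self.origin = None          # player position when the grid was opened
        self.row = self.col = GRID_SIZE // 2   # start on the centre cell ("D4")

    def open(self, player_pos: Vec3):
        self.origin = player_pos
        self._preview()

    def _cell_center(self, row, col) -> Vec3:
        half = GRID_SIZE // 2
        return Vec3(self.origin.x + (col - half) * self.cell_size,
                    self.origin.y,
                    self.origin.z + (row - half) * self.cell_size)

    def _preview(self):
        pos = self._cell_center(self.row, self.col)
        label = f"{COLS[self.col]}{self.row + 1}"
        if self.world.is_walkable(pos):
            self.speech.say(label)
            # Move only the listener; the avatar has not teleported yet,
            # and its rotation is deliberately left alone.
            self.listener.set_position(pos)
        else:
            self.speech.say(f"{label}, unavailable")

    def arrow(self, drow: int, dcol: int):
        self.row = max(0, min(GRID_SIZE - 1, self.row + drow))
        self.col = max(0, min(GRID_SIZE - 1, self.col + dcol))
        self._preview()

    def confirm(self):
        pos = self._cell_center(self.row, self.col)
        if self.world.is_walkable(pos):
            self.speech.say("Teleported")
            return pos              # the caller actually moves the avatar here
        self.speech.say("Invalid teleport location")
        return None
```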
Discussion
3.1.2. Analog Stick Movement
Description
Results (Quotations)
This is handled as in most traditional games, and thus should be performed in a similar manner, but non-visually. Movement sounds denote the actual movement of the character (steps, wing flaps, claw clicks, etc.) as well as collision sounds for running into other objects that differ based on the object that’s being collided with. This includes jumping, crouching, crawling, and so on.
Discussion
3.1.3. 1:1 Player Movement
Description
Results (Quotations)
This should be the easiest form to represent and should not require any additional sound cues (although sounds can add a lot), besides the obvious positioning and rotating of the listener’s position, mapped to the player’s head. If the player cannot move in the virtual environment for any reason, such as if their avatar is not standing, or has a broken leg, this must be clearly indicated to the player, usually with a text message as well as sound. The text message should indicate why movement is impossible. Note that blind players may have difficulty sensing or checking for obstacles in the real world if their ears and hands are busy in VR. This mode of interaction may not be the best idea.
Discussion
3.1.4. Third-Person Movement
Description
Results (Quotations)
Upon initiating this mode, the target begins to make a sound. This sound could potentially change based on rotation; however, rotation seems to play a secondary role here, so it doesn’t strike me as particularly important and can be neglected. The sound of the target should either play in short intervals, or preferably be a looping sound that plays continuously, like a hum or chord. Alternatively, you could take over the third person and temporarily control them as if they are your avatar. You could also do this when the third person needs to stay with you. You could define a position, such as 5 feet behind you, where the third-person avatar attempts to remain. But note: For completely blind players, I do not believe there is any distinction between first-person and third-person perspectives. Moving the camera back behind a player can be a visual benefit to sighted players (allows you to see yourself, possibly gain a little bit of increased field of view and possibly see around corners) but this does not translate once visuals are removed. To a fully blind player the change to third-person perspective is simply an adjustment to sound volumes, without anything gained.
Discussion
3.1.5. Hotspot Selection
Description
Results (Quotations)
Each time a new hotspot is selected, the hotspot emits a singular sound placed at the position in 3D space of the selected object. This sound can change based on what object is being selected—chair, door, etc. It should also use speech, either pre-recorded or synthetic or both, to convey additional information about the selected hotspot object and what one is supposed to do with it.
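A minimal sketch of hotspot cycling follows, assuming hypothetical `audio.play_at` and `speech.say` helpers; the earcon keyed to the object type and the spoken description follow the quotation above.

```python
"""Sketch of hotspot cycling: each hotspot plays a type-specific earcon at its
3D position and speaks a description when it gains focus (names are illustrative)."""

from dataclasses import dataclass

@dataclass
class Hotspot:
    name: str            # e.g. "chair", "door"
    position: tuple      # (x, y, z) in world space
    earcon: str          # sound keyed by object type
    description: str     # what it is and what one is supposed to do with it

class HotspotSelector:
    def __init__(self, hotspots, audio, speech):
        self.hotspots = hotspots
        self.audio = audio      # assumed: audio.play_at(sound, position)
        self.speech = speech    # assumed: speech.say(text)
        self.index = -1

    def next(self):
        if not self.hotspots:
            return
        self.index = (self.index + 1) % len(self.hotspots)
        spot = self.hotspots[self.index]
        # A single positioned sound tells the listener *where* the hotspot is...
        self.audio.play_at(spot.earcon, spot.position)
        # ...and speech tells them *what* it is and how to use it.
        self.speech.say(f"{spot.name}. {spot.description}")

    def current(self):
        return self.hotspots[self.index] if self.index >= 0 else None
```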
Discussion
3.1.6. Other Thoughts in “Locative Movement”
Results (Quotations)
If the game has automatic movement, such as in rail shooters, art, or sequences in moving vehicles, for example, this should be represented in an environment-friendly way, such as the following:
Cars have engine sounds, characters could walk and emit footsteps, wind sound could be used to represent movement without a particular movement constraint.
Discussion
3.2. Camera Positions
3.2.1. POV Shift
Description
Results (Quotations)
POV changes do not really exist for blind players and should be avoided when possible. If there is a strong need for a POV shift, it should be represented using either speech or a recognizable sound cue. The position of the listener should gradually change to represent this and not change abruptly. This assumes POV changes would affect my listening location, otherwise a change in POV would mean absolutely nothing. I am moving around, and the game changes me to a third person perspective POV. Perhaps I hear a sound letting me know this change has taken place, which is a good warning against the confusion that would come next. Even though I am not moving, all of my senses (sound in this case) tell me that I am being moved backwards. When the movement stops, I would assume I am now several steps farther away from the counter I was approaching moments ago. I start moving forward and collide with it much sooner than expected, in fact the sound of the bartender still makes it seem as though I’m too far away. I continue moving around the room but keep misjudging my surroundings. I overshoot things I wanted to interact with and struggle to navigate by sound, because my body is now invisibly floating out ahead of me somewhere, but I cannot visually see it lining up with things like my sighted counterparts can. I can no longer use sound to position myself correctly.
Discussion
3.3. Facial Control
3.3.1. Expression Preset
Description
Results (Quotations)
The main idea that comes to mind here is to simply display an emoticon or a one- to three-word text string next to the name of the avatar displaying the expression, wherever such names are shown in a list of nearby avatars or entities. A small sound sequence could be made to loosely indicate someone’s current expression as well either when it changes or when the avatar gets in range of sight of the listener. It is, I think, very possible to indicate expression or emotion with just a few well-produced and rapid musical notes, and as such if they are played in the right manner, it may at least communicate a subset of the information attempted. Listeners would be able to remember the UI sound associated with a common expression. On the next row [in Puppeteered Expressions] I describe a sort of audible facial description system using tonal audio waveforms, that would apply to this section as well. A text string that the listener can access at will, though, would be far more diverse and would require no learning curve. In response to pressing a button in the interface to indicate nearby avatars, my screen reader would announce instead of just the avatar’s name and distance from me, smiling George is off to the left. Frank is behind and off to the right. If I am supposed to involuntarily know people’s expressions, the same text string could be used to describe the avatar all the time. Angry face Joe has just entered the room, for example, could be spoken if Joe enters the room while being straight in front of the listener. This would fit into the method described under “Words on body” (C-40). This is essentially a form of avatar customization, and a description of the preset expression can be added to the description list.
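One way to generate the “smiling George is off to the left” style of announcement is to bucket each nearby avatar’s bearing, relative to where the listener is facing, into a short phrase. The sketch below assumes a simple 2D coordinate system with +z as “forward” and hypothetical avatar data; the direction buckets are illustrative.

```python
"""Sketch: announce nearby avatars with their current preset expression and a
rough relative direction, as in "smiling George is off to the left"."""

import math
from dataclasses import dataclass

@dataclass
class Avatar:
    name: str
    expression: str     # preset name, e.g. "smiling", "angry face"
    x: float
    z: float

def relative_direction(listener_x, listener_z, heading_deg, avatar: Avatar) -> str:
    """Bucket the avatar's bearing (relative to where the listener faces)
    into a short phrase a screen reader can speak quickly."""
    bearing = math.degrees(math.atan2(avatar.x - listener_x, avatar.z - listener_z))
    rel = (bearing - heading_deg + 180) % 360 - 180   # -180..180, 0 = straight ahead
    if abs(rel) <= 30:
        return "straight ahead"
    if abs(rel) >= 150:
        return "behind"
    side = "to the right" if rel > 0 else "to the left"
    return f"off {side}" if abs(rel) < 90 else f"behind and {side}"

def announce_nearby(listener_x, listener_z, heading_deg, avatars):
    lines = []
    for a in avatars:
        where = relative_direction(listener_x, listener_z, heading_deg, a)
        lines.append(f"{a.expression} {a.name} is {where}.")
    return " ".join(lines)

# Example: announce_nearby(0, 0, 0, [Avatar("George", "smiling", -3, 2),
#                                     Avatar("Frank", "neutral", 2, -3)])
# -> "smiling George is off to the left. neutral Frank is behind and to the right."
```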
Discussion
3.3.2. Puppeteered Expressions
Description
Results (Quotations)
Imagine if each controllable segment of the face (upper lip, lower lip, left eyelid, right eyelid, nostril flare, etc.) had a different and distinct audio waveform attached to it. To be clear, these would be constant tones. When the listener is interested in watching the face of another avatar, they focus on the avatar they want to watch, allowing these waveforms to become audible. The waveforms would change pitch based on the position of each facial element. The more curved down the lower lip, the lower the pitch of that particular waveform in the mix, the more curved up the upper lip is, the higher the pitch of the waveform indicating the state of the upper lip would go, etc. Volume could also be worked with here. A simple learning menu would of course exist that would allow the listener to learn what each waveform in the mix meant, preferably giving the listener the ability to change the controls on a sample face so that they could really learn the sounds. I’m pretty sure that enough listening to the output from this audible facial description method would cause the sounds that common expressions make to be remembered, so that by listening to just a fragment of the audio mix describing a face, the listener would understand the set of all features being displayed. Assuming you know how, you can convert between a couple hundred milliseconds of a chord to a few letters that can represent that chord, why not the same, but representing facial expression instead. If someone switches from a frown to a smile, you would even be able to hear the mix shift over time as the face visually changed. Alternatively, there probably is a list of facial expressions. The selected expression can be spoken.
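The “audible facial description” idea maps each facial element’s position onto the pitch of its own constant tone. A sketch of that mapping is shown below; the base frequencies, the element list, and the seven-semitone range are illustrative assumptions, and a real implementation would feed these frequencies to continuously running oscillators.

```python
"""Sketch of the 'audible facial description' idea: each facial element drives
one constant tone whose pitch follows the element's position (all values and
element names here are illustrative assumptions)."""

# Each element gets its own base frequency so the tones stay distinguishable.
BASE_FREQS_HZ = {
    "upper_lip": 220.0,
    "lower_lip": 277.2,
    "left_eyelid": 329.6,
    "right_eyelid": 392.0,
    "nostril_flare": 440.0,
}

SEMITONE = 2 ** (1 / 12)
PITCH_RANGE_SEMITONES = 7          # +/- a fifth across the element's full travel

def element_frequency(element: str, position: float) -> float:
    """position is normalised to -1.0 (fully down/closed) .. +1.0 (fully up/open).
    A lower lip curved further down lowers its tone; curved up raises it."""
    position = max(-1.0, min(1.0, position))
    return BASE_FREQS_HZ[element] * (SEMITONE ** (position * PITCH_RANGE_SEMITONES))

def face_mix(face_state: dict) -> dict:
    """Return the frequency each oscillator should be set to for the current face."""
    return {elem: element_frequency(elem, pos) for elem, pos in face_state.items()}

# A broad smile versus a frown produces two audibly different mixes:
smile = face_mix({"upper_lip": 0.8, "lower_lip": 0.6, "left_eyelid": 0.2,
                  "right_eyelid": 0.2, "nostril_flare": 0.0})
frown = face_mix({"upper_lip": -0.5, "lower_lip": -0.8, "left_eyelid": -0.3,
                  "right_eyelid": -0.3, "nostril_flare": 0.1})
```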
Discussion
3.3.3. Lip Sync
Description
Results (Quotations)
If the player is speaking, I think the listener can automatically assume that their lips are moving, and any additional information would be distracting from whatever the avatar is trying to say. If a user’s vision is too poor to see the lip movements of other avatars, I don’t believe it is important to devise an audio equivalent. The lack of lip movement might break the immersion of sighted users, because they are used to seeing that with every person they speak with in real life. If someone does not see lip movements ever, they won’t miss it once they enter the virtual world. There are times we can find a clever way to restore some social cue that blind users struggle to experience out of VR, but lip movement does not seem to be worth the effort. We must always ask ourselves if feature X even needs to be adapted. In the case of voice audio, I guess the real concern would be making sure that it is clearly indicated whether voice audio is coming from a source with lips, or something like a speaker system like a telephone or the voice of some robotic creature. This would simply be accounted for by applying an effect to the voice audio, but other than that the transmission of lip synchronicity I believe is already achieved in the talking itself in regards to voice audio. As for silent lip movement though, this is more tricky. An option may be the audible facial description idea I mentioned above [in the Puppeteered Expressions section], where the listener could focus or gaze at the person whose lips are moving, and the audio tones changing would indicate the movement of the lips. Alternatively, it may be possible to play generic audio of whispered fragments of human speech [phonemes] when a moving pair of lips passes the position required to emit such a sound. For example, if the lips aggressively part while the listener is interested in the facial activity of an avatar, the p sound would play. The same could be applied for as many lip positions as possible. When we think to ourselves it can often manifest in audible whisperings anyway, a similar concept could be used here to indicate silent lip movement.
Discussion
3.3.4. Gaze/Eye Fixation
Description
Results (Quotations)
The most unobtrusive way to handle this mostly would involve the use of text strings, where the listener can press a key that would cause screen reading software to announce what entity or object the given avatar is currently looking at. Text strings would be the simplest way to keep from overwhelming a player with audio cues. However, I could think of a way to do this with just sound. I imagine if we’re talking about virtual reality, this includes a complete 3D environment, including the properties of audio. As such, imagine that if you press a key, a UI sound of a small laser beam would emanate from the position of the gazing avatar and quickly travel to the destination of the gaze. Almost like an inconsequential projectile, it would quickly travel from the source of the gaze to where the gaze lands, whatever the sound projectile impacted would be what the gaze was focused on. The sound could play automatically if the listener was focused on an avatar whose gaze changed. A less intrusive version of the same effect could happen if, for example, the listener is in a group of avatars who all suddenly turn to look at something, or even just when the gaze of an avatar you are not focused on changes. The only issue with sounds alone for this is if there are a cluster of entities in one area and an avatar gazes at just one of them. With sounds alone, it would be difficult to tell for sure, especially from more than a couple feet away, exactly what was being looked at. Combine sound with speech for this though? That would be cool.
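The “gaze projectile” cue could be approximated by sweeping a short UI sound along the line from the gazing avatar to the gaze target and then naming the target in speech. The audio and speech interfaces in the sketch below are assumptions, not an existing API; a real implementation would schedule the sweep on the game loop rather than block.

```python
"""Sketch of the 'gaze projectile' cue: a short UI sound travels from the gazing
avatar to whatever they are looking at, then the target is named in speech."""

import time

def lerp(a, b, t):
    return tuple(av + (bv - av) * t for av, bv in zip(a, b))

def play_gaze_cue(audio, speech, gazer_pos, target_pos, target_name,
                  duration_s=0.4, steps=16):
    """audio is assumed to offer play_at(sound, position) -> handle and
    move_to(handle, position); speech is assumed to offer say(text)."""
    handle = audio.play_at("gaze_beam", gazer_pos)
    for i in range(1, steps + 1):
        t = i / steps
        audio.move_to(handle, lerp(gazer_pos, target_pos, t))
        time.sleep(duration_s / steps)   # blocking only for the sketch
    # Combining the sound with speech resolves ambiguity when several
    # objects are clustered near the gaze target.
    speech.say(f"Looking at {target_name}")
```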
Discussion
3.3.5. Other Thoughts in “Facial Control”
Results (Quotations)
Generally, I’ve listed ideas I’d like to see in VR experiences, the only thing is that a lot of the stuff regarding facial expressions must be optional. For example, as noted above, I certainly don’t want lip sync indication when an avatar is talking. I know they’re talking. It must just be kept in mind that as a blind user in a VR experience, too much automatic reporting of expressions or gazes, etc., would serve to detract from the experience. Such things should be optional, or for example only trigger when you are in line of sight with the avatar’s face etc. Of course, the facial expression reporting should exist and would be awesome to see, but unless it is utterly unobtrusive I really don’t care to know, at least in some cases, about every expression change that the resident annoying troll in the corner is making. Basically, as far as developer implementation, just make sure we can look away from things at least to the degree a sighted person can.
3.4. Multi-Avatar Interactions
3.4.1. Physical Contact
Description
Results (Quotations)
Many MUDs, and at least Survive the Wild, I don’t know how many other audio games have socials. The user can select an action, such as laughing or whistling, and the associated sound will play from that player’s position. In a MUD with a soundpack, the effect is somewhat similar, enough to show the concept. In short [to represent character interactions, use] Foley sounds. Pretty much all of the examples here are not purely visual in nature anyway, in the real world they are just nearly silent. You can certainly hear kissing, the clasp of a handshake, certainly a high five, and unless little clothing, you can even sometimes tell that a hug is taking place. In a VR experience, these sounds would probably just be exaggerated. The Hollywood Edge Foley FX library [165] is a great example of one that contains many delicate human Foley sounds, anything from placing an arm on a chair to someone rubbing their belly to someone grabbing one hand with another to grabbing a shirt that’s being worn, etc. In addition, less obvious sounds could be described.
Discussion
- Recording 2: Emote Examples—Materia Magica. (https://www.openicpsr.org/openicpsr/project/224701/version/V1/view?path=/openicpsr/224701/fcr:versions/V1/Emote-Examples---Materia-Magica.mp3&type=file, accessed on 1 April 2025). Recording 2 demonstrates both sides of a multi-player interaction in a MUD interface.
- Recording 3: Emotes Example—STW Socials. (https://www.openicpsr.org/openicpsr/project/224701/version/V1/view?path=/openicpsr/224701/fcr:versions/V1/Emotes-Example---STW-Socials.mp4&type=file, accessed on 1 April 2025). Recording 3 from Survive the Wild shows an example of proximity with a non-player character, although the same command can be used for player characters.
3.4.2. Avatar–Avatar Collisions
Description
Results (Quotations)
I think this can just be indicated with a sound that is more exaggerated than how it would actually sound in real life. Bang two folded towels together, maybe two books wrapped in cloth, two thick jackets or whatever, and there is your collision sound. If an avatar passes through another, a tiny cinematic or magical effect would probably handle the situation nicely to indicate a defiance of physics and the passthrough. Alternatively, a small shuffling sound to indicate that, for a brief moment, one avatar stepped aside and made room for another.
Discussion
3.5. Gesture and Posture
3.5.1. Micro-Gestures
Description
Results (Quotations)
Set your avatar to randomly twirl their mustache. Rustling of cloth for body movement, and possible earcons like quick blink sounds for things like eye blinks. This is also not completely necessary. We make a lot of little micromovements, and if each one was accompanied by a sound, it would quickly get overwhelming, just like how sighted people stop focusing on something, this should only be audible if either you have this feature enabled or are consciously paying attention to a person. I do not wish to be notified of every eye blink of every avatar, that would quickly become an annoyance during a voice chat session, for example. I only wish to be automatically notified if such a blink is communicative in any way, for example the reaction of the eyes after being suddenly hit by sunlight might be interesting to hear/be notified about, but any micro-gesture that cannot be represented by a light Foley sound and that is also not communicative or important in any way, I feel, should not produce any notification other than that which the listener wishes to receive. If it is your own avatar, you should be able to select automatic body actions to happen randomly.
Discussion
3.5.2. Posable/Movable Appendages
Description
Results (Quotations)
If voice chat, adjusting relative position of voice. It may be interesting to consider positioning the sounds for an avatar using multiple HRTF [head-related transfer function] sources set near each other, one for each major appendage. If an avatar snaps with their left hand extended, and the listener is standing right behind that avatar, the sound of the snap playing slightly off to the left in conjunction with the position of the extended appendage would already communicate much. The constant updating of the positions of such sounds as the avatar moves combined with some method of a speech description (Fred with left arm extended) should be able to communicate all important information about the basic position of an avatar’s body, with the sound positioning and very slight sounds for each movement, it should be possible to get a clear picture. I do think that for particularly communicative gestures (thumbs up or raising five fingers), they should be detected and spoken as text strings to a listener if possible.
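A sketch of the per-appendage idea follows: one spatialized source tracks each major joint, and particularly communicative gestures (such as an extended arm) are additionally spoken. The joint names, the extension threshold, and the audio calls are illustrative assumptions.

```python
"""Sketch of per-appendage sound sources plus a spoken summary for clearly
communicative gestures (thresholds and names are illustrative)."""

APPENDAGES = ("head", "left_hand", "right_hand", "left_foot", "right_foot")

def update_appendage_sources(audio, avatar_id, joint_positions: dict):
    """Keep one spatialised source per appendage in step with the skeleton.
    audio.set_source_position(source_id, position) is an assumed call."""
    for name in APPENDAGES:
        if name in joint_positions:
            audio.set_source_position(f"{avatar_id}:{name}", joint_positions[name])

def describe_gesture(speech, avatar_name, joint_positions, torso_pos,
                     extend_threshold=0.45):
    """Speak only gestures that carry obvious meaning, e.g. an extended arm."""
    for hand, label in (("left_hand", "left arm"), ("right_hand", "right arm")):
        if hand in joint_positions:
            hx, _, hz = joint_positions[hand]
            tx, _, tz = torso_pos
            dist = ((hx - tx) ** 2 + (hz - tz) ** 2) ** 0.5
            if dist > extend_threshold:
                speech.say(f"{avatar_name} with {label} extended")
```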
Discussion
3.5.3. Mood/Status
Description
Results (Quotations)
Textual descriptions of facial expressions. If some of this is customizable by the user, allowing the user to set some sort of custom mood string (just a few words long) should they wish to, which would then be announced by the screen reader of an observer, may also be beneficial here, at least among anyone who would wish to take advantage of such a feature.
Discussion
3.5.4. Proxemic Acts
Description
Results (Quotations)
Definitely HRTF audio. Volume [loudness] for distance and panning for position. Survive the Wild uses similar systems [115].
Discussion
3.5.5. Iconic Gestures
Description
Results (Quotations)
Textual descriptions of gestures via TTS [screen reader, or text-to-speech]. Any MUD with emotes is probably a good example.
Discussion
- Recording 2: Emote Examples—Materia Magica. (https://www.openicpsr.org/openicpsr/project/224701/version/V1/view?path=/openicpsr/224701/fcr:versions/V1/Emote-Examples---Materia-Magica.mp3&type=file, accessed on 1 April 2025). Recording 2 has examples (e.g., “cat nap”) that have been expanded for both the user and their recipient. Instead of just saying “You cat nap”, it says “You curl up in John’s lap for a quick cat nap.”
- Recording 3: Emotes Example—STW Socials. (https://www.openicpsr.org/openicpsr/project/224701/version/V1/view?path=/openicpsr/224701/fcr:versions/V1/Emotes-Example---STW-Socials.mp4&type=file, accessed on 1 April 2025). Recording 3 has examples of emotes that players can perform.
3.5.6. Other Thoughts in “Gesture and Posture”
Results (Quotations)
Make sure there is some sort of table of constantly updating information or keystroke a user can quickly press that prints a list of nearby avatars with some textual information about their posture. I imagine if someone enters a room, they can glance around to take note of the current posture of avatars; visually impaired users need this information as well, what use are posture updates if one cannot have known the previous posture being changed?
Discussion
3.6. Avatar Mannerisms
3.6.1. Personal Mannerisms
Description
Results (Quotations)
If we are talking about pre-designed actions you can tell your avatar to perform, then it could be handled in the same way as other gestures (such as hand waving, smiling, or a thumbs up) with earcons or extra layered Foley sounds. Make sure that either sounds representing such actions are very well done or allow the report of such information to be optional. As described in previous sections, BLVIs don’t want to be notified of someone’s eye blinks any more aggressively than a sighted user would be. If a player drums their fingers on a table, make certain the volume of such a sound is quiet relative to how such a thing would sound in the real world, for example. If we are talking about the actual motions of other users, then I suspect it should be ignored. Depending on how accurately the VR headset/hand controls pick up motion, a sighted user may notice that someone in the room is gently rocking back and forth or has a nervous tick where they physically jerk without intending to. A blind user does not need to be specifically told about those movements, because the person making them would probably prefer that the sighted user doesn’t see it either. An involuntary or nervous movement is not often something someone wishes to announce to those around them, which is why they are involuntary movements and not an intentional hand wave or other form of expression. If I stutter and give a speech, I cannot help those in attendance hear my stuttering. If I dictate my speech to someone writing it in an email, it would frankly be insulting if they misspelled all of my words to relay my stuttering to the eventual reader. In the same way, involuntary/unintended motions may be picked up by sighted users, but going out of our way to translate that to those who wouldn’t otherwise notice is just rude.
Discussion
3.7. Conventional Communication
3.7.1. Visual Communication Support
Description
Results (Quotations)
In regards to showing visual displays in HTML for screen readers. Consider addons for the Non-Visual Desktop Access screen reader that actually turn the screen into a 2-dimensional HRTF field for the user. If the user passes a link, a little click can be played, in HRTF, at the position on the screen where the item is visually located.
There are two possible displays:

1. A standard chat window: If one were to make an accessible display of such a board that followed the standards of an accessible chatting application (modified to suit the purposes of the board), a blind user should have no trouble viewing the information, even if not done through a virtual lens or gaze. Consider that a blind user of a chatting application does not typically get behind in the conversation. The keystroke convention examples below assume a Windows keyboard but could be adapted to whatever control methods are available. Such standards may include but are not limited to home and end for top or bottom item (sorted by date added usually), up and down arrows to scroll between messages (quickly announce ONLY prevalent info here—in a chatting app this would be name: message, sent at date), BEING ABLE TO HOLD ARROW KEYS TO SCROLL THROUGH ITEMS ON THE BOARD QUICKLY, and instantly announcing new changes to the board or visual display as they are made. Consider that many websites are accessible to a blind person that contain many complicated structures of information; showing the user a small browser window with an HTML representation of a visual display is probably the easiest for both developers to integrate, and for VI users to understand. That being said, being able to use a virtual gaze to understand the layout of the visual display would be really cool indeed if done correctly, as in a browser window, a blind person doesn’t know where an element they are focused on is visually located on the screen.

2. Virtual gaze: This one is trickier than my bulletin board approach below [in Public Postings], because it is likely the data will be actively changing as new messages are added to the board. I still think the bulletin board approach will help the user visualize the layout of the messages, but changes will not be immediately obvious without the user happening upon that location to read that the message is different. New messages being added would/could quickly force background music to be changed for existing entries, causing confusion to a user who had already begun to depend on them to navigate the board. Without an example put together to test this idea, I have no idea whether a blind user would be able to follow the changes, or if they would become hopelessly lost and left behind. The bulletin board approach might end up working fine for actively changing whiteboards, but it would need to be tested.
Discussion
3.7.2. Public Postings
Description
Results (Quotations)
I am assuming such a bulletin board would not be neatly arranged in grids, but rather be messages thrown about in possibly different sizes and orientations. I imagine this like the map coloring puzzles, where algorithms can be used to determine the minimum number of colors needed to fill in a map where no adjacent states share the same color. Rather than color I’m imagining a handful of unobtrusive background sounds like background hums. More like drones instead of music. Maybe white noise with a set of triangle waves in a chord. Something soft and unobtrusive. The board itself would invisibly divide up all entries posted on it, assigning jagged borders equal distance between them. Using a “map coloring puzzle” technique, entries are assigned background sound so that no adjacent entries use the same one. This should always keep the number of required background sound files to a handful or less. As the user’s gaze moves from entry to entry, the background sound adjusts in real time, and the text at that location is spoken using text-to-speech. The user can perform a gesture to open the message in a standard text box so they can read it with their screen reader using the screen reader’s commands. It might even be useful to adjust some attribute of the voice based on the size/bold/underline of the text, since often on a whiteboard an important message may be written bold with an underline, while less important messages are smaller. A pitch adjustment to the voice could help convey emphasis, which may be important depending on the situation. This is only if the user has decided to use the built-in screen reader, but changing the voice is not possible when interfacing with the user’s own screen reader. The main purpose of the background sound is to help the user construct the 2D layout of the messages, and to help as “landmarks” when moving their focus around. Initially looking at a board of messages is going to take time slowly looking it over, which is expected. After moving around to read/have read several messages, going back to revisit a previous message is far easier if your brain has linked it to the background sound. As you turn your head back in the direction you believe it was, each change in sound clearly marks the borders between messages, letting you keep track of how far you’re moving. If the user chooses, the sound will play for a moment before the text has a chance to be read, letting someone quickly move four or five messages until they stop on the sound they are looking for, without needing to sit through several spoken words, checking to see if those are how their desired message should start. The reason for this is because the problem with a finite number of music tracks or sounds, particularly that play several seconds before the text of a post is read, is that this alone is not sufficient for quickly browsing between posts or locating one you are interested in. This is because, even if there is an assurance of different tracks on each border, two messages that are across the room or that are many feet apart from each other may end up using the same track, thus causing me to have to listen to it for seconds to determine that the track doesn’t correlate to the message I’m looking for. 
As such, in optional addition to the sounds/tracks that let one learn the physical layout, I think that every time your focus fixes on a message, some sort of post number, author username, subject slug or any other form of very short identifier should be announced to the user instantly upon message focus as well. Then I can simply remember something like 95, Marcus, July 30 public announcement, 09:31 PM, or literally anything else that is very quickly presented to relocate a post I am interested in with speed. If you come in view of a message you feel you may want to view again, you should be able to somehow highlight or bookmark the message, so that the UI can instantly alert you if you return your gaze to a message you have interest in. There will be times when a user gets a little lost in their search, but I do believe background sound will help tremendously and save time. Finally, if I know I have located a post, there needs to be a way to interrupt the delay between the start of the background sound and the text of the post, or such a delay needs to be optional, as I do not want to wait a couple of seconds after finding my post for it to begin reading.
Ensure, though, that when a message is viewed it is not just spoken, it should be imperative that such a message show up in an accessible and navigable read only edit box or other browsable element so that a blind user can scroll through the message with their screen reader, similar to a sighted user. If not this, a way to, for example, copy the focused message to a user’s clipboard in plain text, so that the user can optionally browse it in their own text editing application.
Alternatively, randomly placed messages could be collected and transformed into a list which can just be browsed with the user’s screen reader, similar to an email client.
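The “map coloring puzzle” technique described above corresponds to graph coloring over the board’s adjacency graph, and a greedy coloring normally needs only a handful of drones for a planar layout. The sketch below assumes the board can report which entries share a border; the drone names are placeholders.

```python
"""Sketch of the 'map coloring' assignment: give each bulletin-board entry a
background drone so that no two adjacent entries share the same one. A greedy
coloring over the board's adjacency graph keeps the palette small
(drone names and the adjacency source are assumptions)."""

DRONES = ["drone_low_hum", "drone_triangle_chord", "drone_soft_noise",
          "drone_warm_pad", "drone_airy_tone"]

def assign_drones(entries, adjacency):
    """entries: iterable of entry ids.
    adjacency: dict mapping entry id -> set of ids whose regions share a border.
    Returns entry id -> drone name."""
    assignment = {}
    # Color the most-connected entries first; this tends to keep the count low.
    for entry in sorted(entries, key=lambda e: -len(adjacency.get(e, ()))):
        used = {assignment[n] for n in adjacency.get(entry, ()) if n in assignment}
        for drone in DRONES:
            if drone not in used:
                assignment[entry] = drone
                break
        else:
            # Board layouts should rarely exhaust the palette; fall back gracefully.
            assignment[entry] = DRONES[len(used) % len(DRONES)]
    return assignment

# Example: three messages where A borders B and B borders C.
layout = assign_drones(
    ["A", "B", "C"],
    {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}},
)
# A and C end up sharing a drone here, while B differs from both of its neighbours.
```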
Discussion
3.7.3. Synchronous Communication
Description
Results (Quotations)
Old school chalk and whiteboards serve as a proof of concept here. The rubbing of the chalk and the squeak of the marker both provide this auditory feedback, starting and stopping to indicate how long the movement took place, and a pitch change tied to the speed of that movement. I wouldn’t be able to tell you what someone was drawing or what they were writing on the chalk/whiteboard, but from sound alone I would know if they were drawing or writing, and could let you know any time they switched from one to the other. Drawing and writing would sound very different, and as a human I can learn to tell the difference. No sound effect had to play to inform me that the person was writing, another that they switched to drawing, then back to writing, and finally that they’ve stopped. Each of those sound cues would have to be memorized independently, and you’d need a ton of them as possible movements added up. The chalk/whiteboard example shows that even a single sound (started and stopped correctly plus pitch adjusted for speed) lets me determine all of those actions on my own.
When enabled for accessibility, I think unobtrusive sounds could be linked to many avatar motions. As an example, if someone turns their head left it could play a soft creaking sound for only as long as their head was turning, and possibly pitch adjusted to express the speed (distance head actually turned). A different creaking sound would represent a head turning to the right. In combination, a person slowly shaking their head “no” would resemble a creaky door being moved open and closed a few times. With the pitch expressing speed and the sound stopping as the head motion stopped, hearing alone would easily be able to differentiate someone slowly shaking their head “no…?” in an unsure/questioning manner versus a strong “No!”. The sounds would need to be carefully selected to keep them from being distracting or annoying, and ideally paired so expected combinations would sound nice. The sound of a creaking door being opened and closed a few times would seem to fit together, much better than the mooing of a cow and the ring of an old telephone being alternated. Head nods up and down would have their own sounds. Without needing to capture every possible motion, some trial and error could be used to determine a minimum functional set of body movements to be given such sounds. I can imagine benefits to the sound of each wrist moving closer/farther from the body, also one for each hand being turned face-up or face-down. These eight sounds alone (close/away/face-up/face-down per arm) would help indicate if the other person was offering to hand you something or take something from you. A short moment of the right wrist turning toward face-up, plus the wrist moving away from the person, is likely them initiating a hand shake. Even without shoulders, both wrists moving to face-up is likely a questioning shrug. Depending on the size of the message being drawn out, a person moving their hand to draw should eventually be determinable through the sound of their wrist rotating and moving toward/away from the body. I believe it would sound different enough from a person “speaking with their hands”, which is of course itself worth capturing. After enough conversations with someone, the sounds of their body language softly playing in the background as they speak, should help add to the experience. Insight can be gained about a person’s excitement level by seeing how they move their hands as they speak, as much as hearing the words themselves. I’m hoping a few dozen carefully chosen background sounds could solve several problems at the same time. For training purposes, a user should be able to activate these sounds for themselves, so they can begin associating those sounds with their own movements.
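A sketch of the motion-to-sound mapping described above: each tracked motion direction has a paired sound that loops only while the motion continues and is pitch-shifted by its speed. The sound names, thresholds, scaling, and the audio interface are assumptions.

```python
"""Sketch of continuous body-language sonification: each tracked motion gets a
paired sound that starts when the motion starts, stops when it stops, and is
pitch-shifted by speed (sound names and scaling are illustrative)."""

MOTION_SOUNDS = {
    ("head", "left"):  "creak_left",
    ("head", "right"): "creak_right",
    ("head", "up"):    "nod_up",
    ("head", "down"):  "nod_down",
}

MIN_SPEED = 5.0          # degrees/second below which the joint counts as still
PITCH_PER_DEG_S = 0.004  # how strongly speed bends the pitch

class MotionSonifier:
    def __init__(self, audio):
        self.audio = audio          # assumed: loop(sound) -> handle, set_pitch, stop
        self.active = {}            # (joint, direction) -> playing handle

    def update(self, joint, direction, angular_speed_deg_s):
        key = (joint, direction)
        sound = MOTION_SOUNDS.get(key)
        if sound is None:
            return
        if angular_speed_deg_s < MIN_SPEED:
            # Motion stopped: silence the cue so its duration carries meaning.
            if key in self.active:
                self.audio.stop(self.active.pop(key))
            return
        if key not in self.active:
            self.active[key] = self.audio.loop(sound)
        # A slow, unsure "no...?" and a sharp "No!" end up audibly different.
        self.audio.set_pitch(self.active[key],
                             1.0 + angular_speed_deg_s * PITCH_PER_DEG_S)
```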
Discussion
3.7.4. Asynchronous Direct Communication
Description
Results (Quotations)
I don’t think anything needs to be changed from how sighted users receive emails/messages. For sighted users I assume some icon somewhere indicates they have unread emails/messages, and once opening that they are presented with the message and are able to reply. When a new message arrives while they are active, I assume there is some sort of sound in addition to a visual cue. For blind users they’ll know if a message arrives while they are active because of the same sound sighted users hear. When signing in, the same sound could be played to let them know existing messages are waiting from when they were away.
Discussion
3.8. Avatar Appearance
3.8.1. Avatar Customization
Description
Results (Quotations)
I don’t think anything needs to be changed from how sighted users select these options, so long as each option is given an adequate description. It may feel like a description isn’t going to help blind users recreate their appearance as precisely as sighted users can, and while that’s true, it doesn’t matter in the slightest. Sighted users who are looking to match their own face settle constantly with options that are “close enough”. Users who are very happy with the face they’ve made will not actually have eyes that perfectly match the avatar, face shape, hair style, eyebrow shape, etc. They’re assembling an approximation. For blind users this shouldn’t be any different, and they won’t notice or care if the description they selected for a hairstyle doesn’t actually make their avatar look exactly like their real hair. I believe users fully accepting “close enough” is going to be the same within the blind community as well.
Discussion
3.8.2. Words on Body
Description
Results (Quotations)
In an environment where people can be anything (custom 3D models), it will have to rely on player-provided descriptions, and some menu/key blind users can press to have that read aloud about the avatar they are facing. Far from perfect since it requires people to participate in something they won’t benefit from or see a need for (in 99.9% of situations). For text, particularly custom text that is not part of a simple list of clothing items for example, OCR can be performed on the image of the user that is being inspected. A screen reader’s optical character recognition function can read the screen so thoroughly that a virtual document can get created which a user can scroll through, and sometimes even click elements of. I think this proves that with refinement, using OCR to determine text on clothing items, furthermore, even telling the user which clothing item contained the text or where on the clothing item, is very possible. If the environment is more “sims”-like, where avatars are clothed using a large list of options, then descriptions could be relied upon. Developers/artists would add descriptions to the clothing/decorations and normal users wouldn’t be required to do anything extra. The safest approach is probably just have a key that can be held while facing someone’s avatar. Clothing/decoration descriptions should begin being listed in a specific order starting from most generic and moving to specifics. The longer they hold the key, the further down the list of descriptions they go (are read aloud to them). As I encounter someone new, I hold the key. Jim is wearing a red shirt, blue pants, a tan baseball cap, and I release the key to hear them better as he talks to me. In a short conversation pause I hold the key again. Jim is wearing white shoes, his shirt contains a logo on the chest, the pants are torn on the left knee, a black watch on his right wrist. I once again release the key to talk. As in real life, we quickly grab generalities about someone and begin filling in the details over time. Each clothing/decoration avatar can equip what would need a few descriptions entered that span a few levels of detail, and a lot of this could be automatic, where the logo placed on a shirt is automatically on such a list, and the logo’s description itself is further down. It would require some test groups to determine what order an avatar’s appearance should be described in, but I’m sure there is some order that would be ideal for most situations. In my example I didn’t include other features, but hair color and style, eyes, makeup, and basically everything else designed into the avatar can be handled in this way. Perhaps 2 different keys would be used to separate describing face/hair/body from clothing to save on time, that would depend on how many controls are free to dedicate to this, but I think traveling down a list from generic to specific descriptions goes a long way in reproducing how sighted people gather this same information (albeit slower). Five minutes after meeting someone, “Oh hey I like your earring! I hadn’t noticed that until now,” would literally be a thing in this approach, the same as it is for sighted users. Alternatively or additionally, a menu should be easily accessible that should allow you to scroll through all such items that would be automatically announced the longer a key is held down. This way the user can quickly scroll to the information that interests them, rather than relying on some predetermined order of elements.
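The hold-to-describe idea can be sketched as a queue of descriptions ordered from generic to specific, with one item spoken at a time for as long as the key stays held. The speech interface and the example detail levels below are assumptions.

```python
"""Sketch of the hold-a-key appearance reader: descriptions run from generic to
specific and are spoken one at a time while the key stays held (the ordering
and the speech interface are assumptions)."""

class AppearanceReader:
    def __init__(self, speech):
        self.speech = speech        # assumed: speech.say(text)
        self.queue = []
        self.position = 0

    def start(self, avatar_name, detail_levels):
        """detail_levels: list of lists, most generic first, e.g.
        [["red shirt", "blue pants", "tan baseball cap"],
         ["white shoes", "logo on the chest", "torn left knee", "black watch"]]"""
        self.queue = [d for level in detail_levels for d in level]
        self.position = 0
        self.speech.say(f"{avatar_name} is wearing")
        self.tick(key_still_held=True)

    def tick(self, key_still_held):
        """Call each time the previous utterance finishes."""
        if not key_still_held or self.position >= len(self.queue):
            return False            # key released, or nothing left to say
        self.speech.say(self.queue[self.position])
        self.position += 1
        return True

    def resume(self):
        """Holding the key again continues where the reader left off, so details
        keep filling in over the course of a conversation."""
        return self.position < len(self.queue)
```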
Discussion
3.8.3. Other Thoughts in “Avatar Appearance”
Results (Quotations)
Something similar to this was mentioned above, but it’s very important that in addition to VI users receiving audio updates when someone’s appearance changes, there needs to be, whether by some sort of HTML table or keystroke that presents information, a way to get a summary of those near you and a basic description of what they look like, despite the more detailed features that would appear if your gaze fixates on a certain user or avatar.
Discussion
3.9. Avatar–Environment Interactions
3.9.1. Avatar–Environment Collisions
Description
Results (Quotations)
You would hear a body thump with the avatar exclaiming the appropriate vocal sound. For example, a knight walks into a tree. You would hear metal hitting wood with a randomly selected groan, curse or gasp.
Discussion
3.9.2. Object Manipulation
Description
Results (Quotations)
Use a unique directional (i.e., via HRTF) sound to represent an object’s location in relationship to the listener, and the sound of the object itself to indicate what it is and how it’s being interacted with. Pitch can be used to designate height for non-3D systems. You can also use sound qualities like pitch to convey further information, for example when hammering in a nail, the pitch of the hammering sounds can be three or four semitones lower or higher in pitch when starting than when performing the last stroke with the hammer.
A button on the wall can be directionally located by sound. For height, the sound’s pitch can slide repeatedly between a low tone to a tone representing its height. In Survive the Wild, for example, nobody has an issue figuring out what someone is doing with an object, as each major action attached to each object has a sound cue, which is positioned in [virtual space, via] HRTF, where the object is located. Different sounds for each action should make it very clear what’s happening with an object.
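Two of the pitch mappings quoted above, hammering pitch rising with progress and a repeated low-to-high slide conveying a button’s height, can be sketched numerically; all frequencies, ranges, and the semitone count here are illustrative assumptions.

```python
"""Sketch of the two pitch mappings described above: hammer strikes rise a few
semitones as the nail goes in, and a wall button's height is conveyed by a
repeated slide from a fixed low tone up to a height-dependent tone."""

SEMITONE = 2 ** (1 / 12)

def hammer_strike_pitch(progress: float, base_hz=200.0, total_semitones=4) -> float:
    """progress runs 0.0 (first stroke) .. 1.0 (nail fully driven in)."""
    progress = max(0.0, min(1.0, progress))
    return base_hz * (SEMITONE ** (progress * total_semitones))

def height_slide(height_m: float, room_height_m=2.5,
                 low_hz=150.0, high_hz=900.0, steps=12):
    """Return the frequencies of one low-to-target sweep for a button at
    height_m; repeating the sweep lets the listener judge height."""
    frac = max(0.0, min(1.0, height_m / room_height_m))
    target_hz = low_hz + (high_hz - low_hz) * frac
    return [low_hz + (target_hz - low_hz) * i / (steps - 1) for i in range(steps)]

# A button at shoulder height sweeps noticeably higher than one near the floor:
shoulder = height_slide(1.4)   # sweep ends near 570 Hz
floor = height_slide(0.2)      # sweep ends near 210 Hz
```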
Discussion
- Recording 3: Emotes Example—STW Socials. (https://www.openicpsr.org/openicpsr/project/224701/version/V1/view?path=/openicpsr/224701/fcr:versions/V1/Emotes-Example---STW-Socials.mp4&type=file, accessed on 1 April 2025). Recording 3 shows an example of how objects are described around the user in a menu (e.g., “Mud near the stream straight behind, shallow stream straight behind, sand near the pond straight off to the left.”)
- Recording 7: Avatar Object Interaction Example. (https://www.openicpsr.org/openicpsr/project/224701/version/V1/view?path=/openicpsr/224701/fcr:versions/V1/Avatar-Object-Interaction-Example.mp4&type=file, accessed on 1 April 2025). Recording 7 shows how a user interacts with doors and an object they can take.
3.9.3. Other Thoughts in “Avatar–Environment Interactions”
Description
Results (Quotations)
Use two or more 2D or 3D sounds to provide orientation. The volume of these sounds imparts direction. First-person perspective in audio games (like Swamp) try to place ambient sounds around maps so that players are always within earshot of at least three sounds at different positions, even if at a distance. Being aware of three sounds lets the player triangulate their location and the direction they are facing.
I find myself randomly placed in a large room I’m already familiar with. In my right ear I hear the radio softly playing, which I know is in the center of the room on a table. Based on how loud it is, I have a pretty good idea of how far from the table I am, and I know it’s to my right, but that means I could be anywhere in a clockwise circle around the radio. Now I notice some bubbling sounds from the fish tank elsewhere in the room. I can now immediately rule out all but two spots in the room I could be standing, to hear both sounds how I hear them. In one situation the fish tank is ahead of me a little to my left, and the radio is ahead of me a little to my right. The other situation is that the fish tank and radio are still to my left and right, but behind me. It is quite difficult to differentiate sounds ahead of you verses behind you based solely on the volumes of each sound in each ear. Finally, the final clue is the ticking of a distant clock on a wall. I know the clock should be on the wall behind me (through previously wandering around and exploring this place), and it is so quiet that I know I’m far from that wall. Of the two possible locations I could have been standing, I now know I am the one farther from the back wall. The fish tank and radio are actually behind me and to either side. From this point forward I am fully aware of my position in the room and can turn and move accordingly. This entire process may have only taken a few moments as I took in the sounds around me.
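The triangulation described above works because a rough distance to each of three known sources is consistent with essentially one spot in the room. The brute-force sketch below illustrates the idea with an invented room layout and distance estimates.

```python
"""Sketch of why three ambient sources pin down the listener: if loudness gives
a rough distance to each known source, only one spot in the room fits all three
distances at once (source layout and distances here are illustrative)."""

import math

def estimate_position(sources, measured_distances, room_w=10.0, room_d=10.0,
                      resolution=0.1):
    """sources: list of (x, z) positions; measured_distances: distance estimated
    to each source from its loudness. Returns the grid point that best matches."""
    best, best_err = None, float("inf")
    steps_x = int(room_w / resolution)
    steps_z = int(room_d / resolution)
    for ix in range(steps_x + 1):
        for iz in range(steps_z + 1):
            x, z = ix * resolution, iz * resolution
            err = 0.0
            for (sx, sz), d in zip(sources, measured_distances):
                err += (math.hypot(x - sx, z - sz) - d) ** 2
            if err < best_err:
                best, best_err = (x, z), err
    return best

# Radio on the centre table, fish tank by a wall, clock on the back wall:
sources = [(5.0, 5.0), (1.0, 8.0), (5.0, 0.0)]
# With only the radio's distance there is a whole circle of candidate spots;
# adding the other two distances leaves essentially one answer, near (7, 3).
print(estimate_position(sources, [2.8, 7.8, 3.6]))
```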
Discussion
3.10. Full-Body Interactions
3.10.1. Emotes
Description
Results (Quotations)
Sounds are played to represent the action, with HRTF positioning the sounds based on the location of the object or player making them. In this example, the sound of jumping feet combined with an excited voice.
Discussion
- Recording 2: Emote Examples—Materia Magica. (https://www.openicpsr.org/openicpsr/project/224701/version/V1/view?path=/openicpsr/224701/fcr:versions/V1/Emote-Examples---Materia-Magica.mp3&type=file, accessed on 1 April 2025). Recording 2 shows how MUDs have a set of key actions (e.g., Cat Nap, bonk, and poke) that developers have added into the game with extra text for both the action performer and the action receiver.
- Recording 3: Emotes Example—STW Socials. (https://www.openicpsr.org/openicpsr/project/224701/version/V1/view?path=/openicpsr/224701/fcr:versions/V1/Emotes-Example---STW-Socials.mp4&type=file, accessed on 1 April 2025). Recording 3 from Survive the Wild has a list of emotes (e.g., burp, yawn, scratch, stretch) that play a speech message and a sound in spatial audio.
3.11. Environment Appearance
3.11.1. Customization
Description
Results (Quotations)
Include sounds that the object might make. There should of course be a hotkey that speaks or shows in a menu a list of objects and their textual descriptions, particularly those that don’t make sounds. Flapping drapes, a clock ticking, bird sounds coming from where a window might be. It should also be possible to explore these objects somehow, for example by interacting/touching them, with a description or similar.
Discussion
3.12. Non-Body Communication
3.12.1. Emojis
Description
Results (Quotations)
Sounds and or speech messages are played to represent the emotion when the user presses a hotkey.
Soft romantic violin sounds to represent a heart, a jubilant trumpet sound to represent a good idea, or simple TTS speech to represent the emoji (e.g., “Face with heart shaped eyes”).
Discussion
3.13. Working with the BLVI Community (Message from One Sighted VR Developer to Another)
3.13.1. Why Should You Make a Non-Visual Experience?
3.13.2. What Impact Making My Games Non-Visual Has Had on Me
4. Overall Discussion
5. Final Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
VR | virtual reality |
VE | virtual environment |
BLVI | blind and low vision individual |
HRTF | head-related transfer function (for spatial audio); see 2D or 3D audio below for more information. Technically, an HRTF describes how a sound is filtered by the listener’s head and ears before it reaches each eardrum, so that sounds can be rendered as though they come from specific directions. This is intended to be as close to real life as possible. Frequently, such as in web audio, a generic head is assumed and is not customizable by the user. In this paper, HRTF, 2D audio, 3D audio, and spatial audio all mean the same thing: sounds that are positioned around the user in space. |
2D or 3D audio | Same as above. HRTF, 2D audio, 3D audio, binaural, and spatial audio all refer to sounds designed so that they seem to come from a specific location in the space around the user’s head. |
IVR | immersive virtual reality |
AR | augmented reality |
AI | artificial intelligence |
3D | three-dimensional (i.e., located in the space around the user) |
2D | two-dimensional (i.e., located on a surface) |
UI | user interface |
LLM | large language model |
ADA | Americans with Disabilities Act |
POV | point of view |
Appendix A. Social Interactions Inventory
Category or Subcategory | Short Description | Design Summary |
Category: Locative Movement | ||
Direct Teleportation: | The user pushes a button and a target appears on the ground where they are pointing; they may move the target around. The user teleports on release. | Activate teleport mode, move to a free location indicated by speech message and sound and activate teleport |
Analog Stick Movement: | Joystick and its buttons are used to move the character within the virtual environment. | As normal, but with movement and collision sounds
1:1 Player Movement: | The relative position of the player’s body in their physical play space is mapped directly to the position of the avatar in VR, so that the user’s body moves exactly as it does in real life. | Map movement to player’s head, use movement and vocal sounds |
Third-Person Movement: | The player places a movable target in the environment, using a teleportation arc to place it. The user views the avatar from a third-person point of view (POV) instead of the usual first-person POV. | Same as analog stick movement; there is no third-person view in audio
Hotspot Selection: | Users can jump from hotspot to hotspot, but no other movement is supported (e.g., interactions take place around a table, with chairs for users, and users may only sit in the chairs). | Menu of hotspots that play a sound in spatial audio and speech message when focused |
Please list and describe any movements we have missed in the category of Locative Movement: | If the game has automatic movement, such as in rail shooters, art, or sequences in moving vehicles, for example, this should be represented in an environment-friendly way. | Indicate movement and collisions with sound
Category: Camera Positions | ||
POV Shift: | Shifting the POV from first- to third-person view (e.g., watching an interaction from a first-person vs. third-person perspective). | Do not POV shift, but if needed, gradual shift indicated with speech messages and sound cues |
Category: Facial Control | ||
Expression Preset: | The user has selectable/templatized facial expressions to choose from in menus or interfaces. Presets manipulate the entire face, not individual features. | Text message combined with a short musical phrase |
Puppeteered Expressions: | Users control and compose individual facial features (or linked constellations of features) through a range of possible expressions to varying degrees. A user might puppeteer an expression from slightly smiling to grinning broadly, and any point in-between extremes. | Experimental: Multiple short musical phrases tracking facial elements simultaneously |
Lip Sync: | The movement of the avatar’s lips or mouth synchronizes with the player’s voice audio (or with another voice track, such as a song). | Play speech, or experimental: phonemes if only lips are moving |
Gaze/Eye Fixation: | The ability for the avatar’s gaze to fixate on items or people in their environment. | Text messages, or experimental: moving sound from gazer to target combined with speech message |
Category: Multi-Avatar Interactions | ||
Physical Contact: | Interactions where two or more avatars interact physically (e.g., a high five, hugging, dancing, poking, shaking hands, or kissing). These can be intentional or unintentional (e.g., bumping into someone in a crowded room). | Exaggerated short sounds combined with a speech message |
Avatar–Avatar Collisions: | Avatars collide with one another. These are intentional collisions and may result either in moving through the other player’s avatar, or bumping off of them. | Exaggerated short sounds combined with a speech message |
Category: Gesture and Posture | ||
Micro-Gestures: | Automatic gestures, such as eye blinking, shifting weight and other actions people perform without conscious effort. | This should be optional: short sounds and musical phrases combined with text messages |
Posable/Movable Appendages: | The avatar’s body movements that occur when the head/torso/arm/leg of the avatar moves. These movements change in response to the player’s head/torso/arm movement in space. | Speech message describing body position combined with sounds of body parts in spatial audio
Mood/Status: | The way that the avatar’s movement may communicate the avatar’s general emotional state. When the mood/status changes movements may change subtly to match the mood. | This should be optional: Text message describing emotional state |
Proxemic Acts: | Movements which occur in relation to how close/far users should be able to perceive communication. For instance, increasing the voice volume of speaking or text size of written messages. | Use spatial audio |
Iconic Gestures: | Use of gestures which have a clear and direct meaning related to communication. These include social gestures (e.g., waving, pointing) and representational gestures (e.g., miming actions, such as scissors with one’s hand). | Text messages |
Category: Avatar Mannerisms | ||
Personal Mannerisms: | Smaller movements that avatars perform periodically and/or repetitively (e.g., rocking, swaying, tapping feet, shaking legs, twisting/pulling hair, playing with clothes/jewelry). | Short sounds or musical phrases |
Category: Conventional Communication | ||
Visual Communication Support: | Visuals (e.g., mind maps, whiteboards) used to record and organize ideas while meeting, brainstorming, or collaborating with others. | Text messages, and experimental: short musical phrases based on different messages or changes |
Public Postings: | Areas that host messages meant for public viewing (e.g., bulletin boards). | Text messages in a menu, and experimental: short musical phrases based on different messages
Synchronous Communication: | Non-verbal communication between two or more avatars in real time (e.g., signing, writing). | Short sounds tied to user actions, or experimental: short sounds tied to body parts that change speed based on user movement
Asynchronous Direct Communication: | Emails/messages sent directly to someone, but which may not be received in real time. | Same as currently exists: a short sound and text message indicating new mail; messages appear in a menu
Category: Avatar Appearance | ||
Avatar Customization: | Selecting personalized looks and voice, including gender expression, body features, and dress of avatars. | Describe each option, and play a speech message with the description when a new option is selected
Words on Body: | These are words that are written on the avatar’s clothes or directly on the body, e.g., words on a T-shirt, tattoo, product logo. | Create multiple detail levels describing the user, including AI-recognized text
Category: Avatar–Environment Interactions | ||
Avatar–Environment Collisions: | Avatar behavior when they collide with objects or walls in the environment. (e.g., hitting the wall, running into trees, and interacting with the environment without going through it). | Short collision sounds |
Object Manipulation: | Avatars’ interactions with objects in the environment (e.g., throwing a ball back and forth, playing the guitar), and indications that an object can be manipulated (e.g., showing that a ball can be picked up or a button can be pushed). | Short sounds in spatial audio that change based on state variables
Please list and describe if we have missed any movements in the category of Avatar–Environment Interactions: | Orientation in a room | Use spatial audio to help orient to a room |
Category: Full-Body Interactions | ||
Emotes: | Preset animations that involve a full-body demonstration of an emotion (e.g., jumping up and down when happy, waving arms to gain attention). | Short sound in spatial audio |
Category: Environment Appearance | ||
Customization: | Adding objects and NPCs into the world and making them look a particular way (e.g., decorating a room, changing the lighting). | Menu of objects with short sounds in spatial audio |
Category: Non-Body Communication | ||
Emojis: | Emojis that appear above avatar heads to signal emotions (e.g., large happy face above a head, heart emojis surrounding avatar to signal love). | Text message combined with a short musical phrase |
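Several rows in this table recommend pairing an exaggerated or spatialized short sound with a brief speech message (e.g., Physical Contact, Avatar–Avatar Collisions, and Avatar–Environment Collisions). The following minimal sketch illustrates one way such a cue could be wired up in a browser-based social VR client using the Web Audio API and the Web Speech API; the CollisionEvent shape, the loadBuffer and playCollisionCue helpers, and the sound file path are illustrative assumptions rather than part of any existing platform.

```typescript
// Sketch: pair a spatialized collision earcon with a spoken message,
// as suggested for avatar-avatar and avatar-environment collisions above.
// Assumes a browser/WebXR context; names and paths are illustrative only.

interface CollisionEvent {
  otherName: string;                              // e.g., "Alex" or "wall"
  position: { x: number; y: number; z: number };  // relative to the listener
}

const audioCtx = new AudioContext();

// Fetch and decode a short sound file into an AudioBuffer.
async function loadBuffer(url: string): Promise<AudioBuffer> {
  const response = await fetch(url);
  const data = await response.arrayBuffer();
  return audioCtx.decodeAudioData(data);
}

function playCollisionCue(buffer: AudioBuffer, event: CollisionEvent): void {
  // Spatialize the short sound at the collision point using HRTF panning.
  const panner = new PannerNode(audioCtx, {
    panningModel: "HRTF",
    positionX: event.position.x,
    positionY: event.position.y,
    positionZ: event.position.z,
  });

  const source = audioCtx.createBufferSource();
  source.buffer = buffer;
  source.connect(panner).connect(audioCtx.destination);
  source.start();

  // Follow the earcon with a brief speech message (Web Speech API).
  const message = new SpeechSynthesisUtterance(`Bumped into ${event.otherName}`);
  window.speechSynthesis.speak(message);
}

// Example usage (placeholder sound file and values):
// const thud = await loadBuffer("/sounds/collision-thud.ogg");
// playCollisionCue(thud, { otherName: "Alex", position: { x: 1, y: 0, z: -0.5 } });
```

In a production client, the speech portion might instead be routed to the user's screen reader or a self-voicing text-to-speech queue, and the earcon could vary by collision type, consistent with the recommendations in the rows above.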
References
- Cipresso, P.; Giglioli, I.A.C.; Raya, M.A.; Riva, G. The past, present, and future of virtual and augmented reality research: A network and cluster analysis of the literature. Front. Psychol. 2018, 9, 2086. [Google Scholar] [CrossRef] [PubMed]
- Kreimeier, J.; Götzelmann, T. Two decades of touchable and walkable virtual reality for blind and visually impaired people: A high-level taxonomy. Multimodal Technol. Interact. 2020, 4, 79. [Google Scholar] [CrossRef]
- Yiannoutsou, N.; Johnson, R.; Price, S. Non visual virtual reality. Educ. Technol. Soc. 2021, 24, 151–163. [Google Scholar] [CrossRef]
- Desai, P.R.; Desai, P.N.; Ajmera, K.D.; Mehta, K. A review paper on oculus rift-a virtual reality headset. arXiv 2014, arXiv:1408.1173. [Google Scholar] [CrossRef]
- Picinali, L.; Afonso, A.; Denis, M.; Katz, B.F.G. Exploration of architectural spaces by blind people using auditory virtual reality for the construction of spatial knowledge. Int. J. Hum. Comput. Stud. 2014, 72, 393–407. [Google Scholar] [CrossRef]
- Zhao, Y.; Bennett, C.L.; Benko, H.; Cutrell, E.; Holz, C.; Morris, M.R.; Sinclair, M. Enabling people with visual impairments to navigate virtual reality with a haptic and auditory cane simulation. In Proceedings of the 2018 CHI Conference on Human factors in Computing Systems, Montreal, QC, Canada, 21–26 April 2018; pp. 1–14. [Google Scholar] [CrossRef]
- Sinclair, J.-L. Principles of Game Audio and Sound Design: Sound Design and Audio Implementation for Interactive and Immersive Media; CRC Press: Boca Raton, FL, USA, 2020. [Google Scholar]
- De Almeida, G.C.; de Souza, V.C.; Da Silveira Júnior, L.G.; Veronez, R. Spatial audio in virtual reality: A systematic review. In Proceedings of the 25th Symposium on Virtual and Augmented Reality, Rio Grande, Brazil, 6–9 November 2023; pp. 264–268. [Google Scholar] [CrossRef]
- Control Spatial Audio and Head Tracking. 2024. Available online: https://support.apple.com/en-ph/guide/airpods/dev00eb7e0a3/web (accessed on 1 April 2025).
- Using Directional Audio—Zoom Support. 2023. Available online: https://support.zoom.com/hc/en/article?id=zm_kb&sysparm_article=KB0058025 (accessed on 1 April 2025).
- Spatial Audio in Microsoft Teams Meetings—Microsoft Support. 2024. Available online: https://support.microsoft.com/en-gb/office/spatial-audio-in-microsoft-teams-meetings-547b5f81-1825-4ee1-a1cf-f02e12db4fdb (accessed on 1 April 2025).
- Stefan Campbell. VR Headset Sales and Market Share in 2023 (How Many Sold?). 2023. Available online: https://thesmallbusinessblog.net/vr-headset-sales-and-market-share/#:~:text=2019%20%E2%80%93%205.51%20million%20pieces%20of,been%20sold%20during%20the%20year (accessed on 1 April 2025).
- Ferjan, M. Interesting AirPods Facts 2023: AirPods Revenue, Release Date, Units Sold; HeadphonesAddict: 2023. Available online: https://headphonesaddict.com/airpods-facts-revenue/ (accessed on 1 April 2025).
- PEAT LLC. Inclusive XR & Hybrid Work Toolkit. 2022. Available online: https://www.peatworks.org/inclusive-xr-toolkit/ (accessed on 1 April 2025).
- Thévin, L.; Brock, A. How to move from inclusive systems to collaborative systems: The case of virtual reality for teaching o&m. In CHI 2019 Workshop on Hacking Blind Navigation; HAL: Lyon, France, 2019. [Google Scholar]
- Carruth, D.W. Virtual reality for education and workforce training. In Proceedings of the 2017 15th International Conference on Emerging Elearning Technologies and Applications (ICETA), Stary Smokovec, Slovakia, 26–27 October 2017; pp. 1–6. [Google Scholar] [CrossRef]
- Pillai, A.S.; Mathew, P.S. Impact of virtual reality in healthcare: A review. In Virtual and Augmented Reality in Mental Health Treatment; IGI Global: Hershey, PA, USA, 2019; pp. 17–31. [Google Scholar] [CrossRef]
- Striuk, A.; Rassovytska, M.; Shokaliuk, S. Using blippar augmented reality browser in the practical training of mechanical engineers. arXiv 2018, arXiv:1807.00279. [Google Scholar] [CrossRef]
- Torres-Gil, M.A.; Casanova-Gonzalez, O.; González-Mora, J.L. Applications of virtual reality for visually impaired people. WSEAS Trans. Comput. 2010, 9, 184–193. [Google Scholar] [CrossRef]
- The Journey Forward: Recovery from the COVID-19 Pandemic. 2022. Available online: https://www.afb.org/research-and-initiatives/covid-19-research/journey-forward/introduction (accessed on 1 April 2025).
- Biggs, B.; Agbaroji, H.; Toth, C.; Stockman, T.; Coughlan, J.M.; Walker, B.N. Co-designing auditory navigation solutions for traveling as a blind individual during the COVID-19 pandemic. J. Blind. Innov. Res. 2024, 14, 1. [Google Scholar] [CrossRef]
- Thomas, D.; Warwick, A.; Olvera-Barrios, A.; Egan, C.; Schwartz, R.; Patra, S.; Eleftheriadis, H.; Khawaja, A.; Lotery, A.; Müller, P.; et al. Estimating excess visual loss in people with neovascular age-related macular degeneration during the COVID-19 pandemic. BMJ Open. 2020, 12, e057269. [Google Scholar] [CrossRef]
- Williams, M.A.; Hurst, A.; Kane, S.K. “Pray before you step out”: Describing personal and situational blind navigation behaviors. In Proceedings of the 15th International ACM SIGACCESS Conference on Computers and Accessibility, Bellevue, WA, USA, 21–23 October 2013; pp. 1–8. [Google Scholar] [CrossRef]
- Cliburn, D.C. Teaching and learning with virtual reality. J. Comput. Sci. Coll. 2023, 39, 19–27. [Google Scholar] [CrossRef]
- Hoffmann, C.; Büttner, S.; Prilla, M. Conveying procedural and descriptive knowledge with augmented reality. In Proceedings of the 15th International Conference on PErvasive Technologies Related to Assistive Environments, Corfu, Greece, 29 June–1 July 2022; pp. 40–49. [Google Scholar] [CrossRef]
- Damiani, L.; Demartini, M.; Guizzi, G.; Revetria, R.; Tonelli, F. Augmented and virtual reality applications in industrial systems: A qualitative review towards the industry 4.0 era. IFAC-Pap. 2018, 51, 624–630. [Google Scholar] [CrossRef]
- Nair, V.; Ma, S.-E.; Penuela, R.E.G.; He, Y.; Lin, K.; Hayes, M.; Huddleston, H.; Donnelly, M.; Smith, B.A. Uncovering visually impaired gamers’ preferences for spatial awareness tools within video games. In Proceedings of the 24th international ACM SIGACCESS Conference on Computers and Accessibility, Athens, Greece, 23–26 October 2022; pp. 1–16. [Google Scholar] [CrossRef]
- Fehling, C.D.; Müller, A.; Aehnelt, M. Enhancing vocational training with augmented reality. In Proceedings of the 16th International Conference on Knowledge Technologies and Data-Driven Business, Graz, Austria, 16–19 September 2016. [Google Scholar] [CrossRef]
- Leporini, B.; Buzzi, M.; Hersh, M. Video conferencing tools: Comparative study of the experiences of screen reader users and the development of more inclusive design guidelines. ACM Trans. Access. Comput. 2023, 16, 1–36. [Google Scholar] [CrossRef]
- Schröder, J.-H.; Schacht, D.; Peper, N.; Hamurculu, A.M.; Jetter, H.-C. Collaborating across realities: Analytical lenses for understanding dyadic collaboration in transitional interfaces. In Proceedings of the 2023 CHI conference on human factors in computing systems, Hamburg, Germany, 23–28 April 2023; pp. 1–16. [Google Scholar] [CrossRef]
- Le, K.-D.; Ly, D.-N.; Nguyen, H.-L.; Le, Q.-T.; Fjeld, M.; Tran, M.-T. HybridMingler: Towards mixed-reality support for mingling at hybrid conferences. In Proceedings of the Extended abstracts of the 2023 CHI Conference on Human Factors in Computing Systems, Hamburg, Germany, 23–28 April 2023; pp. 1–7. [Google Scholar] [CrossRef]
- Biggs, B.; Toth, C.; Stockman, T.; Coughlan, J.M.; Walker, B.N. Evaluation of a non-visual auditory choropleth and travel map viewer. In Proceedings of the 27th International Conference on Auditory Display, Virtually, 24–27 June 2022. [Google Scholar] [CrossRef]
- Wisotzky, E.L.; Rosenthal, J.-C.; Eisert, P.; Hilsmann, A.; Schmid, F.; Bauer, M.; Schneider, A.; Uecker, F.C. Interactive and multimodal-based augmented reality for remote assistance using a digital surgical microscope. In Proceedings of the 2019 IEEE Conference on Virtual Reality and 3D User Interfaces (VR), Osaka, Japan, 23–27 March 2019; pp. 1477–1484. [Google Scholar] [CrossRef]
- Aira. 2018. Available online: https://aira.io/ (accessed on 1 April 2025).
- Oda, O.; Elvezio, C.; Sukan, M.; Feiner, S.; Tversky, B. Virtual replicas for remote assistance in virtual and augmented reality. In Proceedings of the 28th Annual ACM Symposium on User Interface Software & Technology, Charlotte, NC, USA, 11–15 November 2015; pp. 405–415. [Google Scholar] [CrossRef]
- Mourtzis, D.; Siatras, V.; Angelopoulos, J. Real-time remote maintenance support based on augmented reality (AR). Appl. Sci. 2020, 10, 1855. [Google Scholar] [CrossRef]
- Cofano, F.; Di Perna, G.; Bozzaro, M.; Longo, A.; Marengo, N.; Zenga, F.; Zullo, N.; Cavalieri, M.; Damiani, L.; Boges, D.J.; et al. Augmented reality in medical practice: From spine surgery to remote assistance. Front. Surg. 2021, 8, 657901. [Google Scholar] [CrossRef]
- Lányi, S. Virtual reality in healthcare. In Intelligent Paradigms for Assistive and Preventive Healthcare; Springer: Berlin/Heidelberg, Germany, 2006; pp. 87–116. [Google Scholar] [CrossRef]
- Wedoff, R.; Ball, L.; Wang, A.; Khoo, Y.X.; Lieberman, L.; Rector, K. Virtual showdown: An accessible virtual reality game with scaffolds for youth with visual impairments. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, Glasgow, Scotland, UK, 4–9 May 2019; pp. 1–15. [Google Scholar] [CrossRef]
- Moline, J. Virtual reality for health care: A survey. Virtual Real. Neuro-Psycho-Physiol. 1997, 44, 3–34. [Google Scholar] [CrossRef]
- Shenai, M.B.; Dillavou, M.; Shum, C.; Ross, D.; Tubbs, R.S.; Shih, A.; Guthrie, B.L. Virtual interactive presence and augmented reality (VIPAR) for remote surgical assistance. Operat. Neurosurg. 2011, 68, ons200–ons207. [Google Scholar] [CrossRef] [PubMed]
- Riva, G. Virtual reality for health care: The status of research. Cyberpsychol. Behav. 2002, 5, 219–225. [Google Scholar] [CrossRef]
- Wilson, C.J.; Soranzo, A. The use of virtual reality in psychology: A case study in visual perception. Comput Math Methods Med. 2015, 2015, 151702. [Google Scholar] [CrossRef]
- Ruotolo, F.; Maffei, L.; Di Gabriele, M.; Iachini, T.; Masullo, M.; Ruggiero, G.; Senese, V.P. Immersive virtual reality and environmental noise assessment: An innovative audio–visual approach. Environ. Impact Assess. Rev. 2013, 41, 10–20. [Google Scholar] [CrossRef]
- Walker, B.N.; Nees, M.A. Chapter 2: Theory of sonification. In The Sonification Handbook; Hermann, T., Hunt, A., Neuhoff, J.G., Eds.; Logos Publishing House: Berlin, Germany, 2011; Available online: http://sonification.de/handbook/download/TheSonificationHandbook-chapter2.pdf (accessed on 1 April 2025).
- Brinkman, W.-P.; Hoekstra, A.R.D.; Van Egmond, R. The effect of 3D audio and other audio techniques on virtual reality experience. Annu. Rev. Cybertherapy Telemed. 2015, 219, 44–48. [Google Scholar] [CrossRef]
- LaValle, S.M.; Yershova, A.; Katsev, M.; Antonov, M. Head tracking for the oculus rift. In Proceedings of the 2014 IEEE international conference on robotics and automation (ICRA), Hong Kong, China, 31 May–7 June 2014; pp. 187–194. [Google Scholar] [CrossRef]
- Wu, T.L.Y.; Gomes, A.; Fernandes, K.; Wang, D. The effect of head tracking on the degree of presence in virtual reality. Int. J. Hum.-Comput. Interact. 2019, 35, 1569–1577. [Google Scholar] [CrossRef]
- Knigge, J.-K. Theoretical background: The use of virtual reality head-mounted devices for planning and training in the context of manual order picking. In Virtual Reality in Manual Order Picking: Using Head-Mounted Devices for Planning and Training; Springer: Berlin/Heidelberg, Germany, 2021; pp. 13–32. [Google Scholar] [CrossRef]
- Pamungkas, D.S.; Ward, K. Electro-tactile feedback system to enhance virtual reality experience. IJCTE 2016, 8, 465–470. [Google Scholar] [CrossRef]
- Buttussi, F.; Chittaro, L. Locomotion in place in virtual reality: A comparative evaluation of joystick, teleport, and leaning. IEEE Trans. Vis. Comput. Graph. 2019, 27, 125–136. [Google Scholar] [CrossRef]
- de Pascale, M.; Mulatto, S.; Prattichizzo, D. Bringing haptics to second life for visually impaired people. In Proceedings of the International Conference on Human Haptic Sensing and Touch Enabled Computer Applications, Madrid, Spain, 10–13 June 2008; pp. 896–905. [Google Scholar] [CrossRef]
- Bekrater-Bodmann, R.; Foell, J.; Diers, M.; Kamping, S.; Rance, M.; Kirsch, P.; Trojan, J.; Fuchs, X.; Bach, F.; Çakmak, H.K.; et al. The importance of synchrony and temporal order of visual and tactile input for illusory limb ownership experiences–an fMRI study applying virtual reality. PLoS ONE 2014, 9, e87013. [Google Scholar] [CrossRef]
- Kunz, A.; Miesenberger, K.; Zeng, L.; Weber, G. Virtual navigation environment for blind and low vision people. In Proceedings of the International Conference on Computers Helping People with Special Needs, Linz, Austria, 11–13 July 2018; pp. 114–122. [Google Scholar] [CrossRef]
- HaptX|haptic Gloves for VR Training, Simulation, and Design. 2019. Available online: https://haptx.com/ (accessed on 1 April 2025).
- Soviak, A.; Borodin, A.; Ashok, V.; Borodin, Y.; Puzis, Y.; Ramakrishnan, I.V. Tactile accessibility: Does anyone need a haptic glove? In Proceedings of the 18th International ACM SIGACCESS Conference on Computers and Accessibility, Reno, NV, USA, 24–26 October 2016; pp. 101–109. [Google Scholar] [CrossRef]
- Afonso-Jaco, A.; Katz, B.F.G. Spatial knowledge via auditory information for blind individuals: Spatial cognition studies and the use of audio-VR. Sensors 2022, 22, 4794. [Google Scholar] [CrossRef]
- White, G.R.; Fitzpatrick, G.; McAllister, G. Toward accessible 3D virtual environments for the blind and visually impaired. In Proceedings of the 3rd International Conference on Digital Interactive Media in Entertainment and Arts, Athens, Greece, 10–12 September 2008; pp. 134–141. [Google Scholar] [CrossRef]
- Schinazi, V.R.; Thrash, T.; Chebat, D.-R. Spatial navigation by congenitally blind individuals. Wiley Interdiscip. Rev. Cogn. Sci. 2016, 7, 37–58. [Google Scholar] [CrossRef] [PubMed]
- Siu, A.F.; Sinclair, M.; Kovacs, R.; Ofek, E.; Holz, C.; Cutrell, E. Virtual reality without vision: A haptic and auditory white cane to navigate complex virtual worlds. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, 25–30 April 2020; pp. 1–13. [Google Scholar] [CrossRef]
- Allain, K.; Dado, B.; Van Gelderen, M.; Hokke, O.; Oliveira, M.; Bidarra, R.; Gaubitch, N.D.; Hendriks, R.C.; Kybartas, B. An audio game for training navigation skills of blind children. In Proceedings of the 2015 IEEE 2nd VR workshop on Sonic Interactions for Virtual Environments (SIVE), Arles, France, 24 March 2015; pp. 1–4. [Google Scholar] [CrossRef]
- Kuriakose, B.; Shrestha, R.; Sandnes, F.E. Tools and technologies for blind and visually impaired navigation support: A review. IETE Tech. Rev. 2022, 39, 3–18. [Google Scholar] [CrossRef]
- Guerreiro, J.; Kim, Y.; Nogueira, R.; Chung, S.; Rodrigues, A.; Oh, U. The design space of the auditory representation of objects and their behaviours in virtual reality for blind people. IEEE Trans. Vis. Comput. Graph. 2023, 29, 2763–2773. [Google Scholar] [CrossRef]
- Magnusson, C.; Rassmus-Gröhn, K.; Sjöström, C.; Danielsson, H. Navigation and recognition in complex haptic virtual environments–reports from an extensive study with blind users. In Proceedings of the third International ACM Conference on Assistive Technologies, Marina del Rey, CA, USA, 15–17 April 1998. [Google Scholar] [CrossRef]
- Balan, O.; Moldoveanu, A.; Moldoveanu, F. Navigational audio games: An effective approach toward improving spatial contextual learning for blind people. Int. J. Disabil. Hum. Dev. 2015, 14, 109–118. [Google Scholar] [CrossRef]
- Podkosova, I.; Urbanek, M.; Kaufmann, H. A hybrid sound model for 3D audio games with real walking. In Proceedings of the 29th International Conference on Computer Animation and Social Agents (CASA ’16), Geneva, Switzerland, 23–25 May 2016; pp. 189–192. [Google Scholar] [CrossRef]
- Seki, Y.; Sato, T. A training system of orientation and mobility for blind people using acoustic virtual reality. IEEE Trans. Neural Syst. Rehabil. Eng. 2010, 19, 95–104. [Google Scholar] [CrossRef]
- Husin, M.H.; Lim, Y.K. InWalker: Smart white cane for the blind. Disabil. Rehabil. Assist. Technol. 2020, 15, 701–707. [Google Scholar] [CrossRef] [PubMed]
- Tatsumi, H.; Murai, Y.; Sekita, I.; Tokumasu, S.; Miyakawa, M. Cane walk in the virtual reality space using virtual haptic sensing. In Proceedings of the 2015 IEEE International Conference on Systems, Man, and Cybernetics, Hong Kong, China, 9–12 October 2015. [Google Scholar] [CrossRef]
- Tahat, A.A. A wireless ranging system for the blind long-cane utilizing a smart-phone. In Proceedings of the 2009 10th International Conference on Telecommunications, Zagreb, Croatia, 8–10 June 2009; pp. 111–117. [Google Scholar] [CrossRef]
- Lécuyer, A.; Mobuchon, P.; Mégard, C.; Perret, J.; Andriot, C.; Colinot, J.-P. HOMERE: A multimodal system for visually impaired people to explore virtual environments. In Proceedings of IEEE Virtual Reality 2003; IEEE: New York, NY, USA, 2003; pp. 251–258. [Google Scholar] [CrossRef]
- Schloerb, D.W.; Lahav, O.; Desloge, J.G.; Srinivasan, M.A. BlindAid: Virtual environment system for self-reliant trip planning and orientation and mobility training. In Proceedings of the 2010 IEEE Haptics Symposium, Washington, DC, USA, 25–26 March 2010; pp. 363–370. [Google Scholar] [CrossRef]
- Lahav, O. Virtual reality systems as an orientation aid for people who are blind to acquire new spatial information. Sensors 2022, 22, 1307. [Google Scholar] [CrossRef] [PubMed]
- Kreimeier, J.; Karg, P.; Götzelmann, T. BlindWalkVR: Formative insights into blind and visually impaired people’s VR locomotion using commercially available approaches. In Proceedings of the 13th ACM International Conference on PErvasive Technologies Related to Assistive Environments, Corfu, Greece, 30 June 2020–3 July 2020; pp. 1–8. [Google Scholar] [CrossRef]
- Strelow, E.R.; Brabyn, J.A. Locomotion of the blind controlled by natural sound cues. Perception 1982, 11, 635–640. [Google Scholar] [CrossRef]
- McCrindle, R.J.; Symons, D. Audio space invaders. In Proceedings of the 3rd international conference on disability, virtual reality and associated technologies, Alghero, Italy, 23–25 September 2000. [Google Scholar] [CrossRef]
- Fryer, L.; Freeman, J. Presence in those with and without sight: Audio description and its potential for virtual reality applications. J. CyberTherapy Rehabil. 2012, 5, 15–23. [Google Scholar] [CrossRef]
- Roberts, J.; Lyons, L.; Cafaro, F.; Eydt, R. Interpreting data from within: Supporting human–data interaction in museum exhibits through perspective taking. In Proceedings of the 2014 Conference on Interaction Design and Children, Aarhus, Denmark, 17–20 June 2014; pp. 7–16. [Google Scholar] [CrossRef]
- Fishkin, K.P.; Moran, T.P.; Harrison, B.L. Embodied user interfaces: Towards invisible user interfaces. In Proceedings of the Engineering for Human-Computer Interaction: IFIP TC2/TC13 WG2. 7/WG13. 4 Seventh Working Conference on Engineering for Human-Computer Interaction, Heraklion, Greece, 14–18 September 1998; pp. 1–18. [Google Scholar] [CrossRef]
- Zhang, X.; Zhang, H.; Zhang, L.; Zhu, Y.; Hu, F. Double-diamond model-based orientation guidance in wearable human–machine navigation systems for blind and visually impaired people. Sensors 2019, 19, 4670. [Google Scholar] [CrossRef] [PubMed]
- McDaniel, T.; Bala, S.; Rosenthal, J.; Tadayon, R.; Tadayon, A.; Panchanathan, S. Affective haptics for enhancing access to social interactions for individuals who are blind. In Proceedings of the Universal Access in Human-Computer Interaction. Design and Development Methods for Universal Access: 8th International Conference, UAHCI 2014, Heraklion, Greece, 22–27 June 2014; Held as Part of HCI International 2014, Proceedings, Part I; pp. 419–429. [Google Scholar] [CrossRef]
- Krishna, S.; Colbry, D.; Black, J.; Balasubramanian, V.; Panchanathan, S. A systematic requirements analysis and development of an assistive device to enhance the social interaction of people who are blind or visually impaired. In Workshop on Computer Vision Applications for the Visually Impaired; HAL: Lyon, France, 2008. [Google Scholar] [CrossRef]
- Sarfraz, M.S.; Constantinescu, A.; Zuzej, M.; Stiefelhagen, R. A multimodal assistive system for helping visually impaired in social interactions. Inform. Spektrum 2017, 40, 540–545. [Google Scholar] [CrossRef]
- McDaniel, T.; Tran, D.; Devkota, S.; DiLorenzo, K.; Fakhri, B.; Panchanathan, S. Tactile facial expressions and associated emotions toward accessible social interactions for individuals who are blind. In Proceedings of the 2018 Workshop on Multimedia for Accessible Human Computer Interface, Seoul, Republic of Korea, 22 October 2018; pp. 25–32. [Google Scholar] [CrossRef]
- McDaniel, T.; Krishna, S.; Balasubramanian, V.; Colbry, D.; Panchanathan, S. Using a haptic belt to convey non-verbal communication cues during social interactions to individuals who are blind. In Proceedings of the 2008 IEEE International Workshop on Haptic Audio Visual Environments and Games, Phoenix, AZ, USA, 16–17 October 2010; pp. 13–18. [Google Scholar] [CrossRef]
- Tao, Y.; Ding, L.; Ganz, A. Indoor navigation validation framework for visually impaired users. IEEE Access 2017, 5, 21763–21773. [Google Scholar] [CrossRef]
- de Almeida Rebouças, C.B.; Pagliuca, L.M.F.; de Almeida, P.C. Non-verbal communication: Aspects observed during nursing consultations with blind patients. Esc. Anna Nery 2007, 11, 38–43. [Google Scholar] [CrossRef]
- James, D.M.; Stojanovik, V. Communication skills in blind children: A preliminary investigation. Child Care Health Dev. 2007, 33, 4–10. [Google Scholar] [CrossRef]
- Collis, G.M.; Bryant, C.A. Interactions between blind parents and their young children. Child Care Health Dev. 1981, 7, 41–50. [Google Scholar] [CrossRef]
- Klauke, S.; Sondocie, C.; Fine, I. The impact of low vision on social function: The potential importance of lost visual social cues. J. Optom. 2023, 16, 3–11. [Google Scholar] [CrossRef]
- Pölzer, S.; Miesenberger, K. Presenting non-verbal communication to blind users in brainstorming sessions. In Proceedings of the Computers Helping People with Special Needs: 14th International Conference, ICCHP 2014, Paris, France, 9–11 July 2014; Proceedings, Part I; pp. 220–225. [Google Scholar] [CrossRef]
- Maloney, D.; Freeman, G.; Wohn, D.Y. “Talking without a voice”: Understanding non-verbal communication in social virtual reality. In Proceedings of the ACM on Human-Computer Interaction, 4 (CSCW2); ACM: New York, NY, USA, 2020; pp. 1–25. [Google Scholar] [CrossRef]
- Argyle, M. Non-verbal communication in human social interaction. Non-Verbal Commun. 1972, 2, 1. [Google Scholar] [CrossRef]
- Astler, D.; Chau, H.; Hsu, K.; Hua, A.; Kannan, A.; Lei, L.; Nathanson, M.; Paryavi, E.; Rosen, M.; Unno, H.; et al. Increased accessibility to nonverbal communication through facial and expression recognition technologies for blind/visually impaired subjects. In Proceedings of the 13th International ACM SIGACCESS Conference on Computers and Accessibility, Dundee, Scotland, UK, 24–26 October 2011; pp. 259–260. [Google Scholar] [CrossRef]
- Tanenbaum, T.J.; Hartoonian, N.; Bryan, J. “How do I make this thing smile?”: An inventory of expressive nonverbal communication in commercial social virtual reality platforms. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, 25–30 April 2020; pp. 1–13. [Google Scholar] [CrossRef]
- Ke, F.; Im, T. Virtual-reality-based social interaction training for children with high-functioning autism. J. Educ. Res. 2013, 106, 441–461. [Google Scholar] [CrossRef]
- Wigham, C.R.; Chanier, T. A study of verbal and nonverbal communication in second life–the ARCHI21 experience. ReCALL 2013, 25, 63–84. [Google Scholar] [CrossRef]
- Ji, T.F.; Cochran, B.; Zhao, Y. VRBubble: Enhancing peripheral awareness of avatars for people with visual impairments in social virtual reality. In Proceedings of the 24th International ACM SIGACCESS Conference on Computers and Accessibility, Athens, Greece, 23–26 October 2022; pp. 1–17. [Google Scholar] [CrossRef]
- Jung, C.; Collins, J.; Penuela, R.E.G.; Segal, J.I.; Won, A.S.; Azenkot, S. Accessible nonverbal cues to support conversations in VR for blind and low vision people. In Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility, St. John’s, NL, Canada, 27–30 October 2024; pp. 1–13. [Google Scholar] [CrossRef]
- Giudice, N.A.; Guenther, B.A.; Kaplan, T.M.; Anderson, S.M.; Knuesel, R.J.; Cioffi, J.F. Use of an indoor navigation system by sighted and blind travelers: Performance similarities across visual status and age. ACM Trans. Access. Comput. (TACCESS) 2020, 13, 1–27. [Google Scholar] [CrossRef]
- Loeliger, E.; Stockman, T. Wayfinding without visual cues: Evaluation of an interactive audio map system. Interact. Comput. 2014, 26, 403–416. [Google Scholar] [CrossRef]
- Walker, B.N.; Wilson, J. SWAN 2.0: Research and development on a new system for wearable audio navigation. In Proceedings of the 2007 11th IEEE International Symposium on Wearable Computers, Boston, MA, USA, 11–13 October 2007. [Google Scholar] [CrossRef]
- Ponchillia, P.E.; Jo, S.-J.; Casey, K.; Harding, S. Developing an indoor navigation application: Identifying the needs and preferences of users who are visually impaired. J. Vis. Impair. Blind. 2020, 114, 344–355. [Google Scholar] [CrossRef]
- Diaz-Merced, W.L.; Candey, R.M.; Brickhouse, N.; Schneps, M.; Mannone, J.C.; Brewster, S.; Kolenberg, K. Sonification of astronomical data. Proc. Int. Astron. Union 2011, 7, 133–136. [Google Scholar] [CrossRef]
- Lakoff, G. Women, Fire, and Dangerous Things: What Categories Reveal about the Mind; The University of Chicago Press: Chicago, IL, USA, 2012. [Google Scholar]
- Biggs, B.; Coughlan, J.; Coppin, P. Design and evaluation of an audio game-inspired auditory map interface. Proc. Int. Conf. Audit. Disp. 2019, 2019, 20–27. [Google Scholar] [CrossRef]
- Dingler, T.; Lindsay, J.; Walker, B.N. Learnability of sound cues for environmental features: Auditory icons, earcons, spearcons, and speech. In Proceedings of the 14th International Conference on Auditory Display, Paris, France, 24–27 June 2008. [Google Scholar] [CrossRef]
- O’Sullivan, J.A.; Power, A.J.; Mesgarani, N.; Rajaram, S.; Foxe, J.J.; Shinn-Cunningham, B.G.; Slaney, M.; Shamma, S.A.; Lalor, E.C. Attentional selection in a cocktail party environment can be decoded from single-trial EEG. Cereb. Cortex 2015, 25, 1697–1706. [Google Scholar] [CrossRef]
- Skantze, D.; Dahlback, N. Auditory icon support for navigation in speech-only interfaces for room-based design metaphors. In Proceedings of the 2003 International Conference on Auditory Display, Boston, MA, USA, 6–9 July 2003. [Google Scholar] [CrossRef]
- Audiogames.net. AudioGames, Your Resource for Audiogames, Games for the Blind, Games for the Visually Impaired! 2018. Available online: http://audiogames.net/ (accessed on 1 April 2025).
- Nair, V.; Karp, J.L.; Silverman, S.; Kalra, M.; Lehv, H.; Jamil, F.; Smith, B.A. NavStick: Making video games blind-accessible via the ability to look around. In Proceedings of the 34th annual ACM Symposium on User Interface Software and Technology, online, 10–14 October 2021; pp. 538–551. [Google Scholar] [CrossRef]
- Balan, O.; Moldoveanu, A.; Moldoveanu, F.; Dascalu, M.-I. Audio games-a novel approach towards effective learning in the case of visually-impaired people. In ICERI2014 Proceedings; IATED: Valencia, Spain, 2014; pp. 6542–6548. [Google Scholar] [CrossRef]
- Biggs, B.; Yusim, L.; Coppin, P. The Audio Game Laboratory: Building Maps from Games; OCAD University: Toronto, ON, Canada, 2018. [Google Scholar] [CrossRef]
- Jørgensen, K. Gameworld Interfaces; MIT Press: Cambridge, MA, USA, 2013. [Google Scholar]
- Sam Tupy. Survive the Wild. 2021. Available online: http://www.samtupy.com/games/stw/ (accessed on 1 April 2025).
- Max, M.L.; Gonzalez, J.R. Blind persons navigate in virtual reality (VR); hearing and feeling communicates “reality”. In Medicine Meets Virtual Reality; IOS Press: Amsterdam, The Netherlands, 1997; pp. 54–59. [Google Scholar] [CrossRef]
- Andrade, R.; Rogerson, M.J.; Waycott, J.; Baker, S.; Vetere, F. Playing blind: Revealing the world of gamers with visual impairment. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, Glasgow, UK, 4–9 May 2019; pp. 1–14. [Google Scholar] [CrossRef]
- Nielsen, J. 10 Usability Heuristics for User Interface Design. 1994. Available online: https://www.nngroup.com/articles/ten-usability-heuristics/ (accessed on 1 April 2025).
- Kaldobsky, J. Swamp. 2011. Available online: http://www.kaldobsky.com/audiogames/ (accessed on 1 April 2025).
- Out of Sight Games. A Hero’s Call. 2019. Available online: https://outofsightgames.com/a-heros-call/ (accessed on 1 April 2025).
- Magica, M. Materia Magica. 2017. Available online: https://www.materiamagica.com/ (accessed on 1 April 2025).
- Gluck, A.; Boateng, K.; Brinkley, J. Racing in the dark: Exploring accessible virtual reality by developing a racing game for people who are blind. In Proceedings of the Human Factors and Ergonomics Society Annual Meeting, Baltimore, MD, USA, 3–8 October 2021; pp. 1114–1118. [Google Scholar] [CrossRef]
- Gluck, A. Virtual Reality in the Dark: VR Development for People Who are Blind. 2022. Available online: https://equalentry.com/virtual-reality-development-for-blind/ (accessed on 1 April 2025).
- Design for Every Gamer—DFEG. Available online: https://www.rnib.org.uk/living-with-sight-loss/assistive-aids-and-technology/tv-audio-and-gaming/design-for-every-gamer/ (accessed on 1 April 2025).
- Ellis, B.; Ford-Williams, G.; Graham, L.; Grammenos, D.; Hamilton, I.; Lee, E.; Manion, J.; Westin, T. Game Accessibility Guidelines. Available online: https://gameaccessibilityguidelines.com/full-list/ (accessed on 1 April 2025).
- Accessible Platform Architectures Working Group. XR Accessibility User Requirements. 2021. Available online: https://www.w3.org/TR/xaur/ (accessed on 1 April 2025).
- Creed, C.; Al-Kalbani, M.; Theil, A.; Sarcar, S.; Williams, I. Inclusive AR/VR: Accessibility barriers for immersive technologies. Univers. Access Inf. Soc. 2024, 23, 59–73. [Google Scholar] [CrossRef]
- Perdigueiro, J. A Look at Mobile Screen Reader Support in the Unity Engine. 2024. Available online: https://unity.com/blog/engine-platform/mobile-screen-reader-support-in-unity (accessed on 1 April 2025).
- Stealcase. UI Toolkit Screen Reader, Comment 2. Available online: https://discussions.unity.com/t/ui-toolkit-screen-reader/246795/2 (accessed on 1 April 2025).
- MetalPop Games. UI Accessibility Plugin (UAP). 2021. Available online: https://assetstore.unity.com/packages/tools/gui/ui-accessibility-plugin-uap-87935 (accessed on 1 April 2025).
- Aralan007. An Open Letter: Please Improve Screen Reader Support of the Unity Editor and Engine. 2023. Available online: https://discussions.unity.com/t/an-open-letter-please-improve-screen-reader-support-of-the-unity-editor-and-engine/882417 (accessed on 1 April 2025).
- Repiteo. Available online: https://github.com/godotengine/godot/pull/76829 (accessed on 1 April 2025).
- lightsoutgames. Godot Accessibility Plugin. 2020. Available online: https://github.com/lightsoutgames/godot-accessibility (accessed on 1 April 2025).
- Epic Games. Designing UI for Accessibility in Unreal engine|Unreal Engine 5.5 Documentation. Available online: https://dev.epicgames.com/documentation/en-us/unreal-engine/designing-ui-for-accessibility-in-unreal-engine (accessed on 1 April 2025).
- Epic Games. Blind Accessibility Features Overview. Available online: https://dev.epicgames.com/documentation/en-us/unreal-engine/blind-accessibility-features-overview-in-unreal-engine (accessed on 1 April 2025).
- Bridge, K.; Coulter, D.; Batchelor, D.; Satran, M. Microsoft Active Accessibility and UI Automation Compared. 2020. Available online: https://learn.microsoft.com/en-us/windows/win32/winauto/microsoft-active-accessibility-and-ui-automation-compared (accessed on 1 April 2025).
- Gorla, E. Foundations: Native Versus Custom Components. 2023. Available online: https://tetralogical.com/blog/2022/11/08/foundations-native-versus-custom-components/ (accessed on 1 April 2025).
- Federal Communications Commission. 21st Century Communications and Video Accessibility Act (CVAA). 2010. Available online: https://www.fcc.gov/consumers/guides/21st-century-communications-and-video-accessibility-act-cvaa (accessed on 1 April 2025).
- U.S. Department of Justice Civil Rights Division. Fact Sheet: New Rule on the Accessibility of Web Content and Mobile Apps Provided by State and Local Governments. 2024. Available online: https://www.ada.gov/resources/2024-03-08-web-rule/ (accessed on 1 April 2025).
- Costanza-Chock, S. Design Justice: Towards an intersectional feminist framework for design theory and practice. In Proceedings of the Design as a Catalyst for Change—DRS International Conference, Limerick, Ireland, 25–28 June 2018. [Google Scholar] [CrossRef]
- Welcome to XR Navigation! Available online: https://xrnavigation.io/ (accessed on 1 April 2025).
- Skulmoski, G.J.; Hartman, F.T.; Krahn, J. The delphi method for graduate research. J. Inf. Technol. Educ. Res. 2007, 6, 1–21. [Google Scholar] [CrossRef]
- Screen Reader User Survey 10 Results. 2024. Available online: https://webaim.org/projects/screenreadersurvey10/#intro (accessed on 1 April 2025).
- Bălan, O.; Moldoveanu, A.; Moldoveanu, F.; Nagy, H.; Wersényi, G.; Unnórsson, R. Improving the audio game-playing performances of people with visual impairments through multimodal training. J. Vis. Impair. Blind. 2017, 111, 148. [Google Scholar] [CrossRef]
- Naughty Dog. Accessibility Options for The Last of Us Part II. Available online: https://www.playstation.com/en-us/games/the-last-of-us-part-ii/accessibility/ (accessed on 1 April 2025).
- Bailenson, J.N.; Yee, N. Digital chameleons: Automatic assimilation of nonverbal gestures in immersive virtual environments. Psychol. Sci. 2005, 16, 814–819. [Google Scholar] [CrossRef] [PubMed]
- Peña, A.; Rangel, N.; Muñoz, M.; Mejia, J.; Lara, G. Affective behavior and nonverbal interaction in collaborative virtual environments. J. Educ. Technol. Soc. 2016, 19, 29–41. [Google Scholar] [CrossRef]
- Using NVDA to Evaluate Web Accessibility. Available online: https://webaim.org/articles/nvda/ (accessed on 1 April 2025).
- NV Access. NVDA 2017.4 User Guide. 2017. Available online: https://www.nvaccess.org/files/nvda/documentation/userGuide.html (accessed on 1 April 2025).
- Using VoiceOver to Evaluate Web Accessibility. Available online: https://webaim.org/articles/voiceover/ (accessed on 1 April 2025).
- Apple. Chapter 1. Introducing VoiceOver. 2020. Available online: https://www.apple.com/voiceover/info/guide/_1121.html (accessed on 1 April 2025).
- Kager, D.; Kelman, A. Tolk: Screen reader abstraction library. Available online: https://github.com/dkager/tolk (accessed on 1 April 2025).
- Mozilla. ARIA Live Regions. 2019. Available online: https://developer.mozilla.org/en-US/docs/Web/Accessibility/ARIA/ARIA_Live_Regions (accessed on 1 April 2025).
- Hubs—Private, Virtual 3D Worlds in Your Browser. Available online: https://hubs.mozilla.com/ (accessed on 1 April 2025).
- Ian Reed. Tactical Battle. 2013. Available online: https://blindgamers.com/Home/IanReedsGames (accessed on 1 April 2025).
- Zombies, Run! Available online: https://zrx.app/ (accessed on 1 April 2025).
- Wu, J. Voice Vista. 2023. Available online: https://drwjf.github.io/vvt/index.html (accessed on 1 April 2025).
- Iachini, T.; Ruggiero, G.; Ruotolo, F. Does blindness affect egocentric and allocentric frames of reference in small and large scale spaces? Behav. Brain Res. 2014, 273, 73–81. [Google Scholar] [CrossRef]
- DragonApps. Cyclepath. 2017. Available online: https://www.iamtalon.me/cyclepath/ (accessed on 1 April 2025).
- The Mud Connector. Getting Started: Welcome to Your First MUD Adventure. 2013. Available online: http://www.mudconnect.com/mud_intro.html (accessed on 1 April 2025).
- Tanveer, M.I.; Anam, A.S.I.; Rahman, A.K.M.; Ghosh, S.; Yeasin, M. FEPS: A sensory substitution system for the blind to perceive facial expressions. In Proceedings of the 14th International ACM SIGACCESS Conference on Computers and Accessibility, Boulder, CO, USA, 22–24 October 2012; pp. 207–208. [Google Scholar] [CrossRef]
- Valenti, R.; Jaimes, A.; Sebe, N. Sonify your face: Facial expressions for sound generation. In Proceedings of the 18th ACM International Conference on Multimedia, Florence, Italy, 25–29 October 2010; pp. 1363–1372. [Google Scholar] [CrossRef]
- Denby, B.; Schultz, T.; Honda, K.; Hueber, T.; Gilbert, J.M.; Brumberg, J.S. Silent speech interfaces. Speech Commun. 2010, 52, 270–287. [Google Scholar] [CrossRef]
- Kapur, A.; Kapur, S.; Maes, P. Alterego: A personalized wearable silent speech interface. In Proceedings of the 23rd International Conference on Intelligent User Interfaces, Tokyo, Japan, 7–11 March 2018; pp. 43–53. [Google Scholar] [CrossRef]
- Foley Sound Effect Libraries. Available online: https://www.hollywoodedge.com/foley.html (accessed on 1 April 2025).
- OpenAI. GPT-4 Technical Report. arXiv 2023. [Google Scholar] [CrossRef]
- Buzzi, M.C.; Buzzi, M.; Leporini, B.; Mori, G.; Penichet, V.M.R. Collaborative editing: Collaboration, awareness and accessibility issues for the blind. In On the Move to Meaningful Internet Systems: OTM 2014 Workshops. Proceedings of the Confederated International Workshops: OTM Academy, OTM Industry Case Studies Program, c&TC, EI2N, INBAST, ISDE, META4eS, MSC and OnToContent 2014, Amantea, Italy, 27–31 October 2014; Springer: Berlin/Heidelberg, Germany, 2014; pp. 567–573. [Google Scholar] [CrossRef]
- Fan, D.; Glazko, K.; Follmer, S. Accessibility of linked-node diagrams on collaborative whiteboards for screen reader users: Challenges and opportunities. In Design Thinking Research: Achieving Real Innovation; Springer: Berlin/Heidelberg, Germany, 2022; pp. 97–108. [Google Scholar] [CrossRef]
- Kolykhalova, K.; Alborno, P.; Camurri, A.; Volpe, G. A serious games platform for validating sonification of human full-body movement qualities. In Proceedings of the 3rd International Symposium on Movement and Computing, Thessaloniki, Greece, 5–6 July 2016; pp. 1–5. [Google Scholar] [CrossRef]
- Tiku, K.; Maloo, J.; Ramesh, A.; Indra, R. Real-time conversion of sign language to text and speech. In Proceedings of the 2020 Second International Conference on Inventive Research in Computing Applications (ICIRCA), Coimbatore, India, 15–17 July 2020; pp. 346–351. [Google Scholar] [CrossRef]
- Rumble. The Builder’s Tutorial. 2018. Available online: https://www.tbamud.com (accessed on 1 April 2025).
- Driftwood Games. Entombed. 2008. Available online: http://www.blind-games.com/ (accessed on 1 April 2025).
- Bergin, D.; Oppegaard, B. Automating media accessibility: An approach for analyzing audio description across generative artificial intelligence algorithms. Tech. Commun. Q. 2024, 34, 169–184. [Google Scholar] [CrossRef]
- Silverman, A.M.; Baguhn, S.J.; Vader, M.-L.; Romero, E.M.; So, C.H.P. Empowering or excluding: Expert insights on inclusive artificial intelligence for people with disabilities. Am. Found. Blind. 2025. [CrossRef]
- Audiom: The world’s Most Inclusive Map Viewer. 2021. Available online: https://audiom.net (accessed on 1 April 2025).
- Sketchbook (Your World). Available online: https://sbyw.games/index.php (accessed on 1 April 2025).
- Download Sable Proof of Concept Demo. Available online: https://ebonskystudios.com/download-sable-demo/ (accessed on 1 April 2025).
- Ebon Sky Studios. Sable demo—Part i (Map Creation). 2018. Available online: https://www.youtube.com/watch?v=wyAkqGlDIgY (accessed on 1 April 2025).
- Tigwell, G.W.; Gorman, B.M.; Menzies, R. Emoji accessibility for visually impaired people. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, 25–30 April 2020; pp. 1–14. [Google Scholar] [CrossRef]
- Cantrell, S.J.; Winters, R.M.; Kaini, P.; Walker, B.N. Sonification of emotion in social media: Affect and accessibility in facebook reactions. In Proceedings of the ACM on Human-Computer Interaction; ACM: New York, NY, USA, 2022; pp. 1–26. [Google Scholar] [CrossRef]
- Virtual Reality Accessibility: 11 Things we Learned from Blind Users. 2022. Available online: https://equalentry.com/virtual-reality-accessibility-things-learned-from-blind-users/ (accessed on 1 April 2025).
- Soviak, A. Haptic gloves for audio-tactile web accessibility. In Proceedings of the 12th Web for All Conference, Florence, Italy, 18–20 May 2015; p. 40. [Google Scholar] [CrossRef]
- Your Rights Under Section 504 of the Rehabilitation Act. 2006. Available online: https://www.hhs.gov/sites/default/files/ocr/civilrights/resources/factsheets/504.pdf (accessed on 1 April 2025).
- Rulings, Filings, and Letters. Available online: https://nfb.org/programs-services/legal-program/rulings-filings-and-letters#education (accessed on 1 April 2025).
- Unity. Unity Gaming Report. Unity Technologies. 2022. Available online: https://create.unity.com/gaming-report-2022 (accessed on 1 April 2025).
- Awards—the Last of Us: Part II. Available online: https://www.imdb.com/title/tt6298000/awards/ (accessed on 1 April 2025).
- System Usability Scale (SUS). 2021. Available online: https://www.usability.gov/how-to-and-tools/methods/system-usability-scale.html (accessed on 1 April 2025).
- NASA. NASA TLX. 2018. Available online: https://humansystems.arc.nasa.gov/groups/TLX/ (accessed on 1 April 2025).
- Tomlinson, B.J.; Noah, B.E.; Walker, B.N. Buzz: An auditory interface user experience scale. In Proceedings of the Extended Abstracts of the 2018 CHI Conference on Human Factors in Computing Systems, Montreal, QC, Canada, 21–26 April 2018; pp. 1–6. [Google Scholar] [CrossRef]
- Coleman, G.W.; Hand, C.; Macaulay, C.; Newell, A.F. Approaches to auditory interface design-lessons from computer games. In Proceedings of the 11th International Conference on Auditory Display, Limerick, Ireland, 6–9 July 2005. [Google Scholar] [CrossRef]
- Oren, M.A. Speed sonic across the span: Building a platform audio game. In CHI’07 Extended Abstracts on Human Factors in Computing Systems; ACM: New York, NY, USA, 2007; pp. 2231–2236. [Google Scholar] [CrossRef]
- Ian Reed. Blind Gamers Home. 2025. Available online: https://blindgamers.com/Home/ (accessed on 1 April 2025).
- World Health Organization. Global Data on Visual Impairments. 2010. Available online: https://www.who.int/publications-detail-redirect/world-report-on-vision (accessed on 1 April 2025).
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).