Pediatric Speech Audiometry Web Application for Hearing Detection in the Home Environment

Abstract: This paper describes the development of a speech audiometry application for pediatric patients in the Slovak language and the experience obtained during testing with healthy children, hearing-impaired children, and elderly persons. The first motivation behind the presented work was to reduce the stress and fear of children who must undergo postoperative audiometry, but over time we shifted our focus to a simple game-like mobile application for detecting possible hearing problems in children in the home environment. Conditioned play audiometry principles were adopted to create a speech audiometry application in which children help the virtual robot Thomas assign words to pictures; this can be described as a speech recognition test. Several game scenarios were created, tested, and discussed together with issues concerning the test conditions. First experiences show a positive influence on the children's mood and motivation.


Introduction
Audiometry is generally aimed at measuring the perception of the audio signal by the human auditory system. Audiometric tests can be divided into two main categories: pure tone audiometry and speech audiometry. Speech audiometry is a standard part of the audiological test battery. It is usually done after pure tone audiometry and helps the audiologist to answer questions regarding a patient's ability to take part in speech communication. In other words, speech audiometry makes it possible to test speech processing abilities at different levels within the auditory system [1].
Speech audiometry contains several types of speech tests which focus on different aspects of speech perception and processing. Tests can measure the patient's most comfortable and uncomfortable listening levels, their range of comfortable listening, and their ability to recognize and discriminate speech sounds. A typical setting for the speech audiometry is similar to the pure tone audiometry. Usually, a two-room setting is needed. A two-channel audiometer can be used to present the stimulus through a microphone (monitored live voice) or through an external device, in case of recorded speech. The patient's role in speech audiometry is to react to the provided stimuli. They can repeat proposed words, write down their response or point to the picture.
Speech audiometry tests can be divided into two main categories: threshold level testing and suprathreshold testing [2]. In threshold level testing, audiologists try to find the lowest level of speech at which patients can detect and recognize the speech stimulus. The speech detection threshold (SDT) can be seen as the basic parameter. The SDT is the lowest level of speech that a patient is able to detect at least 50% of the time. The next important measure is the speech recognition threshold (SRT), which is the lowest level at which a person can recognize speech and repeat it back to the audiologist.
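The SDT definition above (the lowest level detected at least 50% of the time) can be illustrated with a minimal sketch. The function name `estimate_sdt`, the data layout, and the example values are assumptions made for illustration only; they are not taken from the application described in this paper, and clinical threshold-search procedures are more elaborate.

```python
from typing import Dict, Optional


def estimate_sdt(detection_rates: Dict[float, float],
                 criterion: float = 0.5) -> Optional[float]:
    """Return the lowest presentation level (dB) whose detection rate
    reaches the criterion, or None if no tested level reaches it."""
    passing = [level for level, rate in detection_rates.items()
               if rate >= criterion]
    return min(passing) if passing else None


# Hypothetical data: presentation level in dB -> fraction of trials detected.
rates = {10.0: 0.0, 15.0: 0.25, 20.0: 0.5, 25.0: 0.75, 30.0: 1.0}
sdt = estimate_sdt(rates)
print(sdt)
```

With these hypothetical rates, 20 dB is the lowest level at which the stimulus is detected in at least half of the trials.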
The next group of speech audiometry tests falls under suprathreshold tests. After the SDT and SRT have been found, the speech material in suprathreshold testing is presented to the patient at a normal conversation level, and we try to identify the speech recognition and understanding ability of that person. There are several suprathreshold speech tests that can be done. One is the most comfortable loudness (MCL) test, which tries to identify the most comfortable speech level for the patient. Another important measure is the uncomfortable loudness level (UCL). The UCL represents the maximum level at which word recognition testing can be performed and, together with the SDT, makes it possible to determine the dynamic range for speech.
The next important speech audiometry tests are word recognition tests or speech recognition tests. Their purpose is to determine the person's ability to understand and repeat words presented at a conversation level. Speech recognition testing is performed with a specific set or list of words known as phonetically balanced words. These lists consist of commonly occurring words in their normal proportion in everyday speech. A few standardized lists exist. The best known are the PB-50 (phonetically balanced 50 words), the CID (Central Institute for the Deaf) W-22 list, and the Northwestern University (NU-6) list. Words from these lists can be presented by a live voice, but a prerecorded voice is the better choice: prerecorded sounds allow identical measurements to be made repeatedly, and they are therefore preferred in audiometry. Typically, 25 or 50 words are presented to the patient, who is instructed to repeat them or point to the correct picture. Words are usually presented at a fixed level (sound pressure level (SPL)) about 30 dB to 40 dB above the patient's SDT. The result is expressed as the percentage of words identified correctly. A person without any hearing impairment should have a word recognition score (WRS) of 90-100%.
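The arithmetic in the preceding paragraph can be sketched in a few lines. The function names, the default 35 dB offset, and the placeholder word items below are illustrative assumptions, not part of any standardized procedure or of the application described here.

```python
from typing import Sequence, Tuple


def presentation_level(sdt_db: float, offset_db: float = 35.0) -> float:
    """Fixed presentation level: SDT plus an offset in the 30-40 dB range."""
    if not 30.0 <= offset_db <= 40.0:
        raise ValueError("offset is expected to lie between 30 and 40 dB")
    return sdt_db + offset_db


def word_recognition_score(responses: Sequence[Tuple[str, str]]) -> float:
    """Percentage of (presented, answered) pairs that match."""
    if not responses:
        raise ValueError("at least one response is required")
    correct = sum(1 for presented, answered in responses
                  if presented == answered)
    return 100.0 * correct / len(responses)


# Hypothetical session: 25 placeholder items, 23 identified correctly.
words = [f"word{i}" for i in range(25)]
responses = [(w, w) for w in words[:23]] + [(w, "") for w in words[23:]]

level = presentation_level(sdt_db=20.0)       # 55.0 dB
wrs = word_recognition_score(responses)       # 92.0 %
in_typical_range = 90.0 <= wrs <= 100.0       # True for this session
print(level, wrs, in_typical_range)
```

In this hypothetical session the score of 92% falls inside the 90-100% range expected for a listener without hearing impairment.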
The current research in the pediatric audiometry domain is mainly focused on phonetically balanced sentence/word lists in different languages, such as Greek [2], German (German Oldenburg Sentence Test for Children [3] or Mainz speech test for children 3-7 years old [4]), Thai [5] or Chinese [6]. However, these word lists are not combined with pictures and audio samples to build an automated speech application for the home environment. On the other hand, a very interesting THear framework for mobile audiometry [6] is designed mainly for adults, where written text and Chinese traditional character pictures are used to choose the word heard. This is not suitable for pediatric patients.
Because, to our knowledge, there is no Slovak word list, we reviewed the latest pediatric audiometry applications in other Slavic languages and found the following: Polish hearing screening of school children is tone-based [7]; the authors of Serbian speech audiometry recently published a good re-evaluation of speech audiometry words [8], but it is not oriented to the pediatric domain and has no associated picture set; Czech testing of preschool children [9] was based on a whispered voice and performed by pediatricians; Russian speech audiometry materials and SRT tests include high-quality recorded audio samples [10] but are still not suitable for child audiometry, as they have no associated picture set; Ukrainian phonetic tables [11] are not designed for speech audiometry, and no word list for pediatric audiometry was found; for the Bulgarian language, we found only newborn screening program results [12], with no Bulgarian phonetically balanced word list available.
In the present paper we focus our attention on pediatric speech audiometry, which has its own specifics. Children have a limited, specific vocabulary, which must be considered. A closed set of words is preferred to an open set; that is, the patient selects the correct answer from a set of options. This can be done using picture cards, where a child points to the picture related to what he or she heard. For children of kindergarten age, phonetically balanced kindergarten word lists exist (e.g., [13]).
During the audiometry testing of pediatric patients, we need to consider their abilities and limitations in the language area, which relate to their age. The above-mentioned audiometry tests need to be modified to adapt to pediatric patients. Pediatric audiometry is performed in cases of hearing loss and deafness of children of prelingual and post-lingual age, in the course of postoperative, medication, and compensatory therapy. A high level of stress and distrust of the pediatric patient towards the therapist and the therapy is a commonly observed issue during therapy; as a result, checking sound perception during an interview with the child becomes ineffective or even impossible. The level of stress in a child patient, low motivation, and low involvement in therapy can be positively influenced using game-like approaches, smart technologies, or robotic systems. Generally, studies indicate a positive effect of robots on therapy (see [14][15][16][17]).
Therapists often report problems with children's motivation and involvement during audiometry after implantation of a cochlear implant [18]; this became the motivation behind the current study. The initial idea lay in the use of a robot during the audiometry process, in order to positively influence the therapy by decreasing the stress and distrust of the pediatric patient towards the therapist and the therapy. Our goal was to prepare a research platform that would enable us to study aspects of virtual robot-supported audiometry. After collecting first experiences, we extended our focus to involve other smart devices, such as smartphones and tablets, in speech audiometry. Our attention was focused on the development of a simple game-like mobile application for the detection of possible hearing loss in children under home conditions. Conditioned play audiometry (CPA) principles were adopted to create speech audiometry applications in which children help a virtual robot known as Thomas to assign words to pictures. We previously described a similar web-based audiology application [19], in which a telemetry application is presented with remote audiology measuring devices. The advantage of our proposed system is that it does not depend on special devices, and its purpose, home environment testing, is different. We mainly propose a simple child-acceptable application that is easy to play with and can provide short everyday testing of the current state of hearing with cochlear implants or otitis media (middle ear inflammation).
Our initial aim of supporting child audiometry with robots was extended to the use of modern technologies for the same purpose. One of the main reasons for addressing this problem was the experience of therapists in post-operative therapy after cochlear implantation, who described problems with fear and low motivation in children. Another reason was that no commonly accepted pre-recorded speech stimuli for kindergarten children existed, nor any tool for the diagnostics of children with hearing problems in the Slovak language, particularly in the home environment. Therefore, we started to develop the audiometry application, which can serve both therapists and parents and can be easily used in the home environment. Testing the application brought many ideas and findings, which were used to further improve the application.
Our work consists of several tasks. The first task was to select a child audiometry method that is well suited to pediatric speech audiometry and the desired application. The second task was to define the scenario of the audiometry tests and prepare resources in the Slovak language. The next tasks were focused on the design and development of the research platform for robot-assisted child audiometry and of the speech audiometry application for smart devices.
The paper is organized as follows: Section 2 describes the design and development of the pediatric speech audiometry application including speech stimuli preparation. Section 3 presents details about the performed experiments, their results, and collected observations.

Speech Stimuli
Speech stimuli are the most fundamental part of speech audiometry testing. To test speech perception and processing, a therapist provides speech stimuli to the patient. Speech stimuli can be provided by a live voice or by a pre-recorded voice (recordings). Both live and pre-recorded voices have their own advantages and disadvantages. With a live voice, rapport between the patient and the therapist can be reached more easily. On the other hand, such a measurement is difficult to repeat under the same conditions. With a pre-recorded voice, the measurement can be easily repeated in the future.
Speech stimuli need to cover the phonetical set of the language, to be able to assess the speech understanding ability of the patient. Moreover, in the case of pediatric audiometry, they need to consider significant differences in speech and language ability in comparison to adult patients.
Speech audiometry can be performed with very young children (from approx. two and a half years), and pediatric audiometry methods are replaced by adult audiometry in the case of ten- or twelve-year-old patients. Pediatric audiometry methods can also be used for adult patients with specific kinds of mental disabilities or for elderly patients. The child's mental processes up to two years depend on his or her experiences: what they see, what they hear, and what they touch. This period is based on sensorimotor thinking and the development of practical intelligence [20,21]. The vocabulary of a child of this age is very limited. A two-year-old child can actively use around 200-300 words. Of course, his or her understanding capacity is larger [22].
Speech stimuli often form word lists, which contain words that are pronounced by the therapist or played through an audiometer or CD player. Several word lists exist, which were developed for other languages. The most well-known word list for children's speech audiometry is the Phonetically balanced Kindergarten List (PBK) defined by Haskins in 1949 [13]. It consists of 50 phonetically balanced word items, which were selected from the spoken vocabulary of normal-hearing kindergarten children [1]. There are also other lists, such as the Isophonemic Word Lists designed by Boothroyd in 1968 [23] or the Northwestern University Children's Perception of Speech (NU-CHIPS), which consists of 50 words with pictures [24].
To perform children's speech audiometry for Slovak children, Dr. Hapčo and Dr. Bargár designed a Slovak set of words, which contains 80 words; this set is well suited to older (school-age) children. Another set also exists, which is used by audiologists during behavioral audiometry, but it is not standardized or publicly available. Behavioral audiometry is usually performed with very young pediatric patients (from 6 months old), and it is very interactive and subjective.
Due to the lack of an appropriate word list for Slovak kindergarten children with hearing disabilities, we decided to develop a new list that is well suited to children two years old and older, although predominantly to kindergarten children. The newly designed Slovak kindergarten word list (SKWL) for child audiometry consists of 50 word items, related pictures, and audio files with recorded speech stimuli. Familiarity was the most important criterion in the process of word selection. Words were separated into the following groups: transport/vehicles (5), colors (4), animals (10), toys/things (8), human body (5), food (5), and combinations (13). More information about the SKWL word list can be found elsewhere [25]. The group of animals is the biggest group (10) because animals are usually the first words in a child's vocabulary [26]. In early childhood, children imitate animal sounds, play with animal toys, and are happy to watch animals; thus, this category is representative. The category of the human body contains only one- and two-syllable words, so we consider it to be the least demanding for perception. Conversely, the food category contains one-, two-, three-, and four-syllable words with the Slovak phonemes č, dž, ĺ, ch, which are taught in the second part of the primer [27]. For these reasons, we consider this category to be the most challenging. A specific category is the combination of words and phrases. We graded the items as two-word, three-word, and sentence, so that the audiometric measurement can gradually distinguish what the patient hears and understands from what is already too difficult for him or her.
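One possible way to represent such a word list in software is sketched below. The group sizes are those given above; the `WordItem` structure, its field names, and the file paths are hypothetical and serve only to show how a word, its picture, and its recording could be kept together.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class WordItem:
    """One list entry: the word, its group, and its associated resources."""
    word: str
    category: str
    picture_file: str               # hypothetical path to the picture card
    audio_file: str                 # hypothetical path to the recording
    variants: Tuple[str, ...] = ()  # e.g., diminutive forms used at home


# Group sizes as stated in the text.
SKWL_CATEGORY_SIZES: Dict[str, int] = {
    "transport/vehicles": 5,
    "colors": 4,
    "animals": 10,
    "toys/things": 8,
    "human body": 5,
    "food": 5,
    "combinations": 13,
}


def check_category_sizes(sizes: Dict[str, int], expected_total: int = 50) -> bool:
    """Verify that the groups add up to the expected number of items."""
    return sum(sizes.values()) == expected_total


# Illustrative entry; file names are assumptions.
cow = WordItem("krava", "animals", "pictures/krava.png", "audio/krava.wav",
               variants=("kravička",))
print(check_category_sizes(SKWL_CATEGORY_SIZES), cow.word, cow.variants)
```

Keeping the picture and the recording on the same record makes it straightforward to present matching stimuli in a picture-based test.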
There are several words in the database, which can serve as a distinctive element for stratifying the patient's audio capabilities [28], for example:
• the syllable length criterion (čokoládka, kravička, lietadielko);
• the occurrence of syllable phonemes […]
Electronics 2020, 9, x FOR PEER REVIEW 4 of 15
A total of 23 words have a diminutive equivalent (e.g., krava-kravička; pes-psík-havo), so that we can get as close as possible to the child's speech in each household, where the same object may be named differently.
Due to our focus on conditioned play audiometry and behavioral audiometry, we decided to prepare a picture card for each word in the Slovak kindergarten word list. These pictures were carefully drawn by an artist for this research to be friendly and suitable for pediatric patients. Five types of picture tests from our speech recognition test are depicted in Figures 1-5. Each of them focuses on a specific task connected with hearing capability.
A picture identification speech recognition test is suitable for children up to 10 years old. The pediatric patient marks the image that corresponds to the heard sound stimulus. In this case, it is a closed-set test.

Previous Experiences with HRI Audiometry
As mentioned in the abstract, the first motivation behind the presented work was to improve user acceptance and user experience and to reduce the stress and fear of children who must undergo postoperative audiometry, a need that arose from the experience of therapists. Our first idea was to involve real robots in the speech audiometry process, which is performed repeatedly after cochlear implant surgery. We started by designing and developing a small application with a humanoid robot, in which the robot prompts a child to help it match pictures on the table with sounds. This robot-assisted speech audiometry ran on the VoMIS system (see [29]) and is described in detail elsewhere [25]. During the experiments with the robot in this role, we collected new ideas and experiences. One of the key findings was that healthy children liked to interact with the robot. Further experiments also revealed several drawbacks:

• It was uncomfortable to use a magnetic table for presenting the pictures. A robot with a touchscreen would be more suitable, but the humanoid robot used had no display.
• Using a humanoid robot allows only free field audiometry, which does not allow the left and right ear to be measured separately, and there was a high risk of cross hearing.
• Motors in the joints of the robot produce noise during gesticulation, which may be disturbing in such an audiometry scenario.
• Very young children may be afraid of humanoids (we performed tests with 4- to 6-year-old children only). Instead of a humanoid robot, it could be better to use some family member, companion, or social robot (e.g., Asus Zenbo, currently not available in the EU).
• Robot-assisted speech audiometry for children cannot be easily used in home conditions.
In other words, the idea of using robots looks very attractive, but our experience showed that such a system was not usable or helpful for the therapy. Therefore, we turned to something more usable, simple, and helpful. We focused our attention on hearing detection in the home environment without any humanoid robot. We developed the idea of a simple game-like mobile application that parents can easily use when they have doubts about their child's speech and sound perception. Because users at home cannot set accurate acoustic conditions, the application focuses on suprathreshold speech tests rather than on threshold level testing. The designed application falls into the category of closed-set word recognition audiometry tests.

Web-Based Application for Children Speech Audiometry
Conditioned play audiometry principles were adopted to create a speech audiometry application in which children help robot Thomas assign words (sounds) to pictures, which can be described as a kind of speech recognition test. The selected test is a part of behavioral audiometry.
The application was built as a web application, which allows it to run on any device with an Internet connection and a web browser, without any other requirements. The application can be used for free field speech audiometry and with a headset. The design is currently simple HTML code with a short task description and pictures to choose from, based on the heard voice command. The pictures were designed and completely sketched by one of the co-authors, and together with the SKWL word list and audio recordings they are a unique and significant contribution to the community of Slovak audiology clinicians.
The application is organized into levels. In each level, five screens with a set of pictures are presented to the patient together with randomly generated speech stimuli. After all levels are completed, the word recognition score is computed. Speech stimuli are presented at the presumed most comfortable loudness level (MCL), which is around 50 dB. The application offers an introduction that helps the therapist or parent to set the required acoustic conditions (MCL). Cold running speech is used to set the MCL: the parent or therapist, together with the patient, adjusts the volume while listening to a short story.
The first screens of the application contain the story description, a basic settings page, and an entry form. The story (Figure 6, screen 2) about robot Thomas was designed to motivate a child to undergo the audiometry. The child is invited to help robot Thomas organize his broken collection of pictures and sounds. We tried to engage emotions by placing a GIF animation of the sad robot on the screen. The next screen offers basic setting instructions, which help parents or therapists to set the MCL. The last initial screen is an entry form, where the user fills in his/her name (or nickname), gender, and age. He/she can also provide information about the sound level that was set for the experiment (when audiometry is done at the most comfortable loudness level). Then, the application is ready to start the game.
Electronics 2020, 9, x FOR PEER REVIEW 7 of 15
The game is divided into several levels. In each level, a set of pictures is provided together with a randomly selected audio file with a speech stimulus. On each screen with pictures, a word is played that belongs to one of the pictures (see Figure 7). The task of the patient is to select the correct picture. After each level, the overall score is calculated, but it stays hidden from the patient. We decided to hide the partial score because initial testing showed that children become demotivated by a bad score.
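The level structure and scoring described above can be illustrated with a short sketch. It is not the application's actual code: the function names, the data layout, and the number of pictures per screen (`n_choices=4`) are assumptions made for illustration; only the five screens per level and the closed-set selection task come from the text.

```python
import random


def build_level(word_list, n_screens=5, n_choices=4, rng=None):
    """Build one level: each screen shows a set of pictures and plays one target word.

    n_choices is a hypothetical value; the paper does not state how many
    pictures appear on a screen.
    """
    rng = rng or random.Random()
    screens = []
    for _ in range(n_screens):
        choices = rng.sample(word_list, n_choices)  # pictures shown on this screen
        target = rng.choice(choices)                # word whose recording is played
        screens.append({"choices": choices, "target": target})
    return screens


def word_recognition_score(responses):
    """Return the WRS in percent from (target, selected) pairs."""
    if not responses:
        raise ValueError("no responses recorded")
    correct = sum(1 for target, selected in responses if target == selected)
    return 100.0 * correct / len(responses)


if __name__ == "__main__":
    words = ["krava", "pes", "vlak", "vták", "auto", "autobus"]
    level = build_level(words, rng=random.Random(0))
    # Simulate a child who always picks the first picture on the screen.
    answers = [(screen["target"], screen["choices"][0]) for screen in level]
    # The score is computed but, as in the application, would not be shown to the child.
    score = word_recognition_score(answers)
    print(f"screens played: {len(level)}, WRS: {score:.0f}%")
```

Keeping the score computation separate from the screen logic makes it easy to hide the partial score from the child while still storing it for the parent or therapist.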
Two versions of the game were developed: the first for free field audiometry and the second for testing each ear separately. Although free field audiometry is more comfortable and less stressful for pediatric patients, it gives less precise results. Where possible, better results can be obtained using a headset, because each ear can be tested separately. Cross hearing may prevent proper diagnosis [30,31]; therefore, a special version of the speech recordings was prepared for right and left ear testing. During testing of one ear, masking noise is played into the other ear at a level 10 dB below the speech stimuli.
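As a rough illustration of the per-ear recordings, the sketch below builds stereo sample pairs with speech in the tested ear and attenuated noise in the other ear. It assumes that the speech and noise signals have already been normalized to the same level, so that an amplitude gain of 10^(-10/20) yields the 10 dB difference; the function names and the list-of-floats signal representation are hypothetical and are not taken from the application.

```python
def db_to_gain(db):
    """Convert a level difference in dB to a linear amplitude factor."""
    return 10 ** (db / 20)


def mix_for_ear(speech, noise, test_ear="left", masking_offset_db=-10.0):
    """Return (left, right) sample pairs for testing one ear.

    Speech goes to the tested ear; masking noise, attenuated by
    masking_offset_db relative to the speech, goes to the other ear.
    Assumes both input signals are normalized to the same level.
    """
    if test_ear not in ("left", "right"):
        raise ValueError("test_ear must be 'left' or 'right'")
    gain = db_to_gain(masking_offset_db)
    frames = []
    for s, n in zip(speech, noise):
        masked = n * gain
        frames.append((s, masked) if test_ear == "left" else (masked, s))
    return frames


if __name__ == "__main__":
    speech = [0.0, 0.5, -0.5, 0.25]
    noise = [0.1, -0.1, 0.1, -0.1]
    for left, right in mix_for_ear(speech, noise, test_ear="right"):
        print(f"L={left:+.4f}  R={right:+.4f}")
```

The resulting pairs could then be written to a stereo audio file with any audio library; that step is omitted here.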

Experiments and Results
Several game scenarios together with the setting condition issues were created, tested, and discussed. First experiences show a positive influence on the children's mood and motivation.
Eleven child participants were involved in the interaction with the audiometry application. Nine were healthy children, mainly around kindergarten age. The other two were a 4-year-old boy and a 16-year-old girl with hearing impairment. The last participant was an elderly patient (a 72-year-old woman with a hearing aid in the right ear and hearing problems in both ears). The total number of test participants was 12.
All tests were performed in the home environment in a relatively quiet place. One of the parents played the role of the therapist: the parent set the sound pressure level (SPL) and read the motivation stories and instructions to the child. Before testing, the child was not exposed to any loud sound. Each test consisted of three game levels. The role of the child was to pick the correct picture from the provided set according to the speech stimuli, which were prerecorded words from the Slovak kindergarten word list. The audiometry application ran on mobile phones (Samsung Galaxy A70 and Xiaomi Redmi 4X) and a tablet (Huawei MediaPad M5 lite). At the beginning, the application requires the SPL to be adjusted before the game is played.
The first version of our audiometry application used a different mechanism to set the SPL, and each level was played with a different SPL. To set the desired sound level, the therapist or parent needed a second device (smartphone) with a sound meter application to measure and set the correct SPL. Achieving stable acoustic conditions was very difficult. Additional problems were identified during the experiments:
• movement, or even the presence, of another person causes a disturbance;
• the position of the child relative to the sound source/smartphone varies (the child tends to be as close as possible to the sound source);
• the left and right ear cannot be tested separately, and cross hearing occurs;
• physical properties of sound propagation (reflections, attenuation, etc.) had a significant impact on the resulting perceived level of acoustic information.
Therefore, in the second version of the application, we decided to perform testing at the most comfortable loudness level, which can be easily set at the beginning. Based on the speech audiometry literature we analyzed, we abandoned strict adherence to acoustic conditions, because some background noise can lead to more realistic audiometry results that more closely reflect situations in the real environment.
Each game level has a different difficulty and allows us to test various aspects of the cognitive ability of patients. The tests can evaluate several distinct levels of hearing and subsequent understanding (e.g., phonetic similarity of words, visual similarity of the presented pictures, the same word base, different word length, etc.). Each of these aspects focuses on a specific task connected with hearing capability, and each can influence the perception results.
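One way to think about a harder level is to choose distractor pictures whose words resemble the target. The sketch below ranks candidate words by spelling similarity using Python's `difflib.SequenceMatcher`. This is only an assumption for illustration: spelling similarity is a crude stand-in for phonetic similarity, and the paper does not say that the picture sets were generated automatically. The function names are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Spelling similarity in [0, 1]; a rough proxy for phonetic similarity."""
    return SequenceMatcher(None, a, b).ratio()


def pick_distractors(target, candidates, k=3):
    """Return the k candidate words most similar to the target (target excluded)."""
    others = [w for w in candidates if w != target]
    return sorted(others, key=lambda w: similarity(target, w), reverse=True)[:k]


if __name__ == "__main__":
    words = ["vlak", "vták", "krava", "autobus", "auto", "pes"]
    print(pick_distractors("vlak", words, k=2))
    print(pick_distractors("auto", words, k=1))
```

A production version would more plausibly compare phoneme transcriptions than spellings, and would still need review by a clinician.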

Experiments with Healthy Children
In these experiments, we considered children healthy if they had not previously been clinically diagnosed with any hearing problem. From the testing of healthy children, two main observations were collected:

• The children found playing the audiometry game fun and exciting. They did not want to stop playing and did not perceive it as therapeutic testing.
• When a child marked an incorrect picture and the system displayed a picture of the sad robot, the child became demotivated and sad and did not want to continue with the game.
When we decreased the SPL to approx. 30 dB, the word recognition score decreased to 65% for children #1 and #2, which is still higher than the threshold score for healthy patients (WRS = 50%) [1].
Based on these observations, we decided to remove the backchannel after each set of pictures. Instead of a negative backchannel, the application now always provides a positive backchannel after each level.
The test routine in the second version of the application consisted of setting the MCL and SRT volume levels and selecting the test method (via the loudspeaker, i.e., free field, or via headphones for the right and left ear). Both volume levels were adjusted subjectively by the parent in cooperation with the child. The precondition for such a setting is that the parent has no hearing impairment. The MCL is set correctly if the sound stimuli are well audible (neither too loud nor too quiet). The child completes the test, and based on the final score the parent obtains information about the child's hearing abilities. The minimum audible level (SRT) was also set by the parent, who gradually increased the volume of the presented sounds from zero while observing the child's reactions and ability to repeat the presented sound. This setting can be simplified by the fact that the SRT is usually the lowest level that can be heard through the device used (computer, mobile phone, tablet). If the parent completed the test too, he/she could compare the obtained results with the child's results to get an idea of the child's hearing abilities.

Experiment with Hearing-Impaired Child
The third test subject was a 4-year-old boy with hearing impairment. He interacted with the audiometry application several times, both in the free field scenario and with headphones (see Figure 8). The first interactions were made at the MCL, set interactively by the child and his parent. In these tests, all speech stimuli were recognized correctly and a WRS of 100% was achieved. Then we varied the sound pressure level from 70 dB down to the lowest level that the device (Samsung Galaxy A70) could reach, which was around 35 dB. Recognition problems started to occur at this low level, and the word recognition score declined below 50%. According to our observations, the incorrectly recognized words were short words and phonetically similar words.
Since hearing problems were suspected, we continued the audiometry testing with headphones (Marshall Major III Bluetooth closed headphones). The same game scenario was performed. The subject was able to complete individual levels of the game without errors for SPL from 70 dB to 30 dB. All presented recordings were in mono mode, and although the sound was presented on one side (for one ear), both ears participated in the perception of the sound stimulus via vibrations through bone conduction.
The last part of the experiment with the hearing-impaired boy was performed with in-ear headphones, which enabled us to partially reduce sound stimulation of the healthy ear via vibrations through bone conduction. Speech stimuli in this scenario were delivered only to the tested ear, without precise masking of the untested ear.
The results for the left ear were very good: we obtained a WRS higher than 90%. The situation was completely different for the right ear, where the word recognition score was very low even at higher sound pressure levels (only 30% at levels above 50 dB). When we decreased the sound pressure level below 50 dB, he became angry and demotivated, did not want to continue, and demanded that the volume be increased.
For reliable evaluation of hearing in each ear separately, it is necessary to mask the untested ear with noise. Therefore, we later performed testing in which the healthy ear was masked by cocktail-party noise. The noise level was set to 10 dB below the speech stimuli delivered to the tested ear.
During testing of the hearing-impaired child, we also observed the mood and motivation of the patient. During the audiometric game, the child was very motivated and really enjoyed the game. Some disappointment was observed when the child was unable to hear and correctly label several consecutive test items. The volume of the presented sound stimulus at which a child starts to become disappointed by failures in the game is close to his or her speech detection threshold (SDT).
Table 1 contains the results of all tested children who participated in our research. Most of the tested children managed both the MCL and SRT levels very well in all tested scenarios. In child #3, the deterioration of hearing in the right ear was confirmed. Child #7 (with cochlear implants (CI)) achieved very good results in the tests, which indicates the correct functioning of her cochlear implant.
With the elderly participant, the first tests were performed in the free field scenario with her hearing aid. With speech stimuli presented at the MCL, we obtained a word recognition score of around 75%. When we decreased the sound pressure level to 35 dB, the score decreased to 50%. It should be noted that she needed to listen to the speech stimuli several times to be able to recognize the word. Phonetically similar words (e.g., "vlak" and "vták") were the most difficult for her to recognize. For words that share a part, she anticipated the correct answer from the combination of the pictures and the part of the word she heard. This was observed for the words "auto" and "autobus", where "auto" is part of both.
When we tested without her hearing aid, the situation was completely different. She was able to detect sound only with very loud stimuli of around 70 dB, and the word recognition score was very poor (under 20%). Tests with closed headphones were performed too, but only with the hearing aid and with the SPL equal to the MCL. The result of this testing was a WRS of 89%.
These results show that the hearing aid works at an acceptable level, since the lowest acceptable WRS of 50% is already reached near the SRT. Her overall impression of the audiometry application was also informative for us. Initially, she was reluctant to participate. After overcoming this initial reluctance, she completed the whole test without any problems, including the part without her hearing aid. The overall length of the test was acceptable for her, but she found the pictures too childish.

Results Summary
The evaluation was performed several times with healthy children, two children with hearing impairment, and one elderly (72-year-old) individual with a hearing aid. In these experiments, we considered children healthy if they had not previously been clinically diagnosed with any hearing problem. The two hearing-impaired children were 4 and 16 years old. In this study, we instructed the parents to contact a clinician when the test result fell under 50%, as described elsewhere [1]. More accurate results can be obtained using headphones, when each ear is measured separately, which eliminates the problem of cross hearing.
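The referral rule given to parents can be written down as a simple check. The sketch below is illustrative only: the function name and the dictionary of per-condition scores are hypothetical, and the example values merely echo the orders of magnitude reported for the hearing-impaired boy; they are not an excerpt from Table 1.

```python
REFERRAL_THRESHOLD = 50.0  # percent; parents were told to contact a clinician below this WRS


def conditions_needing_referral(wrs_by_condition, threshold=REFERRAL_THRESHOLD):
    """Return the test conditions whose WRS fell under the threshold, sorted by name."""
    return sorted(name for name, wrs in wrs_by_condition.items() if wrs < threshold)


if __name__ == "__main__":
    # Illustrative values only.
    results = {"free field": 100.0, "left ear": 90.0, "right ear": 30.0}
    flagged = conditions_needing_referral(results)
    if flagged:
        print("Consider contacting a clinician; low score for:", ", ".join(flagged))
    else:
        print("All scores are at or above the threshold.")
```

Reporting the scores per condition, rather than a single overall value, reflects the observation that a free field result can hide a problem in one ear.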
Testing the app with an elderly person showed that it can easily be used for speech audiometry in this group of patients. Both the children and the elderly participant were able to interact with the application easily thanks to pointing gestures on the touchscreen. The large size of the pictures seems to be important too. The selection of words, which covers words known by children, is also suitable for testing elderly patients with reduced mental capabilities.

Conclusions
In this work, a web-based pediatric speech audiometry application for hearing impairment detection was described and evaluated. The designed application is suitable for use in the home environment. It enabled us to measure the word recognition score (WRS) in a free field scenario and also to measure each ear separately using headphones. The application adopts conditioned play audiometry principles and can be classified as a speech recognition test. Recordings from the newly designed Slovak kindergarten word list (SKWL) were used as speech stimuli. SKWL meets all requirements for audiometric data and, together with the corresponding images and speech audio recordings, forms a unique new database that is especially suitable for pediatric otological patients during long-term therapy and has a high user acceptance level among pediatric and elderly patients.
The evaluation shows that the designed application can detect hearing problems at an early stage to support better intervention. More accurate results can be obtained using headphones, when each ear is measured separately, which eliminates the cross-hearing problem. Children accepted the application very well. They liked the application and did not want to stop playing it. Some stress was observed when the child was unsuccessful several times in a row or perceived the presentation volume as too low. In comparison with the classical speech audiometry methodology using live speech as a stimulus, the designed application removes the problem of lip reading. The application can be used to measure different levels and to evaluate hearing loss or to verify the functionality of a hearing aid. Even though we initially intended the application to support speech audiometry performed by therapists, our experiments showed many other cases where the application can be used:
• by a therapist to increase motivation and reduce the fear of the pediatric patient during speech audiometry;
• by parents to verify hearing problems when they start to observe hearing problems in their child;
• for daily verification of correct functionality of a hearing aid or cochlear implant;
• for adults with specific disabilities and for audiometry testing of elderly patients, especially in situations when the patient is not able to answer by voice or writing;
• in the home environment;
• in any web browser without any special requirements.
In the future, we plan to improve the application in several areas: extending the number of levels, adding more phonetically similar word pairs, and enabling parents to identify words that are unknown to their child. We also plan to add other types of tests, such as speech detection threshold and speech recognition threshold testing, and to develop an application for the Ling 6-word test. We have developed an Android application based on the proposed web application, and it will soon be available on Google Play for free. The next idea is to use an automatic speech recognition system and natural language processing tools (see [32]) to enable the child to react using his/her voice or to prepare more sophisticated audiometric games. We plan to test the application with autistic pediatric patients and with a larger group of elderly patients. We have already started a collaboration with the Bulgarian Academy of Sciences and EPU University on a Bulgarian version of this application for elderly people [33].