The Design of an Environmental Noise Labeling App for Citizen Participation in Smart Cities †

Abstract: The urban acoustic environment is composed of a great variety of sounds whose effects citizens are mostly unaware of, despite medical studies showing that urban noise affects quality of life and health, with consequences ranging from sleep disturbances to cardiovascular diseases. The development and deployment of Wireless Acoustic Sensor Networks (WASNs) presents new ways to face urban acoustic challenges in the context of a smart city environment. Improving citizens' quality of life cannot be limited to measuring the equivalent noise levels in the streets; it should also identify the type of noise source and its impact on the overall noise measurement. For this purpose, collecting information about noise sources requires many techniques, including the recording and labeling of noise events. These tasks are mostly performed manually by experts using specialized software and are very time-consuming. To improve this task, and taking advantage of the rise of new technologies, we propose the design of a game delivered as a mobile application to encourage citizens' participation in research. The goals of the app are twofold: to raise awareness of the problems generated by noise and to help experts with the work of pre-labeling sounds for later analysis. We detail the envisioned mechanics of the proposed game, its dynamics, and its design.


Introduction
In our daily lives, we are surrounded by a great variety of sounds that we are able to recognize and discern automatically, without paying much attention to them. This information is usually considered irrelevant. However, over the last few years there has been increasing interest in what these sounds can convey, with a growing number of studies in the field. Different projects and tools have been created to collect and classify sound events for various purposes [1]. Some examples are the study of behavior patterns of society or cities based on sound trends [2], or the identification of excess noise for the subsequent study of its impact and the design of improvements [3].
Today, collecting information and subsequently classifying and labeling it is a costly and time-consuming task, because it is carried out manually by experts. This work can also be done in other ways, such as automating the process with the help of Artificial Intelligence mechanisms. Nevertheless, our team has evidence [4,5] that, depending on the method used, different results in terms of accuracy can be obtained [6], usually traded off against the amount of time devoted to precise labeling. For this reason, the manual method is still commonly used: it is more precise [4] and does not require the post-supervision by experts that automatic methods do. In this sense, giving citizens basic training so that they can help is a real option because, with the support of technology, they can be key actors in distributing the process among more volunteers.
The present work aims to use gaming and new technologies to make the classification and labeling process more enjoyable, taking advantage of their ubiquity in today's society. An example of this is found in schools, institutes and universities, which use digital tools for many tasks in the teaching-learning process [7]. Likewise, gamification elements in digital and board games hook users and create incentives for them to continue playing [8]. In addition, mobile phones provide permanent access to games and increase the potential number of players, which in turn can increase the data collected.

State of Art
In order to carry out a good design and development of a system aimed at labeling sounds or audio recordings, different knowledge areas are required. These must be intertwined and complementary, so that the best possible experience and results are obtained.

Gamification
Gamification promotes the best user experience through the use of playful elements, techniques and game mechanics in a mobile application to engage, motivate and encourage its use [8].
When it comes to designing a game with gamification in mind, one of the most influential approaches is the Mechanics-Dynamics-Aesthetics (MDA) framework [9], which aims to bridge the gap between game design and development, game criticism and technical research. Specifically, this framework breaks games down into rules, system and "fun", establishing the following concepts:
• Mechanics: describes the game components, rules, and concepts that specify the system.
• Dynamics: describes the run-time behavior of the mechanics, which act on the players' inputs and on each other's outputs over time.
• Aesthetics: describes the desirable emotional responses evoked in the player when interacting with the game system, caused by the dynamics.
Importantly, from the designer's perspective, mechanics give rise to dynamic behavior in the system, which in turn leads to concrete aesthetic experiences. Conversely, from the player's perspective, aesthetics set the tone, which arises from observable dynamics and, ultimately, from operable mechanics.

Citizen Science
Citizen science is scientific research carried out thanks to the participation of volunteer citizens, usually not specialists in this subject, who actively contribute to science with their intellectual efforts, tools and resources [10].
A compilation of such initiatives, resources and projects can be found on the Observatory of Citizen Science platform in Spain. Among these, the ones most closely related to the topic of this paper are the following:
• Zaragoza Noise Pollution Map Project [11]: a mobile application to document neighborhood noise. It allows labeling the noise source together with its measurement. Citizens collect data to create an urban map of noise sources harmful to health and to provide administrations with the information needed for decision making.
• Noise App [12]: a mobile application to record and report noise in the user's area. It offers the option of recording noise with the phone itself in order to report it to a central office registered in the system. In addition, users can create a personal recordings diary in which they can store and label any noise for later use.

Labeling Sounds
A fundamental part of the project is the proper labeling of sounds, a process that consists of indicating where in a recording a specific sound is located and how it is defined [3]. To carry out this task, the characteristics of the sounds being sought must first be precisely defined. Using recordings obtained in a real environment implies that background noise is present. This makes processing more difficult, since it is harder to indicate the beginning and end of the desired event or to distinguish it from the ambient sound.
After this definition, the labeling process is carried out by comparing and analyzing sonograms and LAeq (the A-weighted equivalent continuous sound level). The classic method followed by experts is the following [13]:
1. Check the trend changes in the spectrum. For this, different techniques can be followed, such as looking for its maxima and analyzing their surroundings, or setting a threshold level and looking for changes that rise above this value.
2. Check for level trends in specific noise spectrum bands.
3. Analyze the frequencies that are typically found to be abnormal.
Once the sound has been identified, it is labeled with the corresponding word, and its beginning and end within the recording are marked.
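The threshold technique mentioned in step 1 can be illustrated with a minimal sketch. It assumes the recording has already been reduced to a series of short-term levels in dB sampled at a fixed step; the function name, the 65 dB threshold and the sample values are hypothetical and are not taken from the method in [13].

```python
from typing import List, Tuple


def find_candidate_events(
    levels_db: List[float], threshold_db: float, step_s: float = 1.0
) -> List[Tuple[float, float]]:
    """Return (start_s, end_s) spans where the level stays above threshold_db.

    levels_db is assumed to hold one short-term level per step_s seconds.
    """
    events = []
    start = None  # index where the current above-threshold run began
    for i, level in enumerate(levels_db):
        if level > threshold_db and start is None:
            start = i
        elif level <= threshold_db and start is not None:
            events.append((start * step_s, i * step_s))
            start = None
    if start is not None:  # run still open at the end of the recording
        events.append((start * step_s, len(levels_db) * step_s))
    return events


# Hypothetical one-second levels with two excursions above 65 dB.
levels = [55.0, 56.0, 71.0, 73.0, 58.0, 57.0, 69.0]
print(find_candidate_events(levels, threshold_db=65.0))
```

Each returned span is only a candidate: an expert, or a player in the proposed app, would still have to decide what the sound is and adjust its boundaries.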

Requirements and Design
In line with this objective, the project focuses on the development of a citizen science mobile application to pre-label sound events in a recording. Gamification was chosen to make the project motivating and engaging for users and thus to obtain the maximum number of tagged audios. The MDA framework explained above is applied, as detailed in the following sections.

Mechanics
For the mechanics, points, a ranking and levels are used. The ranking aims to promote positive competitiveness by ordering users according to the points obtained, regardless of level. There are three levels: low, medium and high. All users start at the lowest level and, as the level increases, so does the complexity of labeling, because a greater number of sounds is present. This is the only difference between levels. To advance to the next level, the user needs to collect enough points. The application awards points by comparing the labels entered by the user with those previously registered in the database to determine how often each label has already been entered, requiring that the beginning and end also coincide. Coincidence, not innovation, is rewarded in order to obtain a database of labels concentrated in the terms offered. For this reason, the more times a word has already been used as a tag, the more points the player gets. The points follow the pattern:
• None entered: 0 points
• Entered more than four times: 4 points
This score corresponds to the low level; it is multiplied by 2 at the medium level and by 3 at the high level, as shown in Table 1. To advance, the user must get 20 points to go from low to medium level and 40 more to go from medium to high level. No maximum number of points is set, since the more recordings are tagged and the higher the level, the more points can be achieved.
Finally, it should be noted that it is important to know the ability of the user to classify and label noise events, both to provide a good experience and to define the level of reliability in the data labelling generated by that user.
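The scoring rules of this section can be sketched as follows. Only the two end points of the base score (0 points when nobody has entered the label, 4 points when it has been entered more than four times), the level multipliers and the 20 and 40 point steps come from the text. The intermediate base scores and the reading of "40 more" as a cumulative total of 60 are assumptions made for illustration, and all names are hypothetical.

```python
LEVEL_MULTIPLIER = {"low": 1, "medium": 2, "high": 3}  # from the text

MEDIUM_THRESHOLD = 20       # points needed to leave the low level (from the text)
HIGH_THRESHOLD = 20 + 40    # ASSUMPTION: "40 more" is counted on the running total


def base_points(prior_matches: int) -> int:
    """Low-level score for a label already entered prior_matches times."""
    if prior_matches <= 0:
        return 0  # from the text: none entered
    if prior_matches > 4:
        return 4  # from the text: entered more than four times
    # ASSUMPTION: placeholder for the intermediate cases, which are not
    # recoverable from the text; one point per earlier matching label.
    return prior_matches


def points_for_label(prior_matches: int, level: str) -> int:
    """Score awarded at the given level: base score times the level multiplier."""
    return base_points(prior_matches) * LEVEL_MULTIPLIER[level]


def level_for_total(total_points: int) -> str:
    """Level reached with a given running total of points."""
    if total_points >= HIGH_THRESHOLD:
        return "high"
    if total_points >= MEDIUM_THRESHOLD:
        return "medium"
    return "low"


print(points_for_label(7, "medium"))  # label seen 7 times, medium level
print(level_for_total(25))
```

Keeping the base score separate from the multiplier means the real values of Table 1 can replace the placeholder without touching the level logic.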

Dynamics
Regarding the dynamics, the application first shows a start screen (Figure 1a), from which the user can begin a game (Figure 1b) or access the frequently asked questions area (Figure 1c). If the user chooses to play and it is the first time he or she accesses the application, a tutorial is shown (Figure 1d) to teach the basics of good labeling. The tutorial consists of an initial video followed by a screen on which the user carries out a practical example, verifying that the knowledge has been acquired. Once the tutorial is finished, the game starts. It consists of a screen (Figure 1b) where a recording is played. To tag it, a label must be chosen in the upper part among those shown; if none of them matches, a new label can be added by entering a name and its description. Users should prioritize choosing from the available options whenever possible before introducing new labels. Another screen can be opened to see details of the recording (Figure 2a). In the lower area, the start and end of the labeled sound should be marked, adjusting them as precisely as possible. Once both tasks are finished, the label must be saved in order to continue with the recording or, if everything has been labeled, to finish and send all the information to a central server for storage. Once the labels are submitted, the score obtained is displayed (Figure 2b), followed by a screen with the user's profile (Figure 2c), detailing his or her total score, current level, number of tagged audios with respect to the total and position in the ranking.
Once a recording has been tagged ten times in the same way, the audio is assumed to be tagged correctly and is no longer offered for tagging in the application. This threshold was validated with an expert in sound labeling to decide at which point the labeling can be considered good.
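The retirement rule can be sketched as a consensus check over the submissions received for one recording. The threshold of ten comes from the text. What counts as "the same way" is not specified, so the sketch assumes that two submissions match when they contain the same tags with start and end times that fall into the same 0.5 s bucket; that tolerance and all names are hypothetical.

```python
from collections import Counter
from typing import FrozenSet, Iterable, List, Tuple

Label = Tuple[str, float, float]  # (tag, start_s, end_s)

CONSENSUS_COUNT = 10  # from the text: ten identical taggings
TOLERANCE_S = 0.5     # ASSUMPTION: boundaries are compared in 0.5 s buckets


def normalise(submission: Iterable[Label]) -> FrozenSet[Tuple[str, int, int]]:
    """Reduce one user's labels for a recording to a comparable, hashable key."""
    return frozenset(
        (tag.strip().lower(), round(start / TOLERANCE_S), round(end / TOLERANCE_S))
        for tag, start, end in submission
    )


def is_retired(submissions: Iterable[List[Label]], needed: int = CONSENSUS_COUNT) -> bool:
    """True once at least `needed` submissions agree on every label."""
    counts = Counter(normalise(s) for s in submissions)
    return any(n >= needed for n in counts.values())


# Nine matching submissions are not enough; a tenth one within tolerance is.
nine = [[("Car", 2.1, 4.0)] for _ in range(9)]
print(is_retired(nine))
print(is_retired(nine + [[("car ", 2.2, 3.9)]]))
```

Fixed buckets are a simplification: two boundaries that lie close together but on opposite sides of a bucket edge would not match, so a real implementation might compare boundaries pairwise instead.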

Aesthetics
Regarding aesthetics, when the user changes level, an informational message is displayed and the application switches to a color scheme different from the previous one.

Related Work
There are different systems and applications related to the labeling of sounds. The most relevant ones are described below:
• MajorMiner [14]: a web application for tagging entire audios. The process consists of playing offline "against" a database, so that the user tags an audio and scores according to the number of people who have previously tagged the same audio in the same way. After each game or audio, the application shows a summary of the tags entered by the player and by other people, the score obtained in that round and the total points.
• TagATune [15]: an application to tag entire audios within a limited time. It is a game to be played in pairs, or with a bot in case no one else is online at the same time. The user must describe the sound according to the category indicated during the set time. Once the round is over, either because the time has elapsed or because both players have decided to finish, points are awarded according to the matches with the partner. Based on the descriptions received from users, a tag is considered official when a minimum number of people have matched, which depends on the total number of participants.
• The Listen Game [16]: an application to tag entire audios while playing simultaneously against multiple players. The process begins with all players listening to the audio; they then select from the labels shown on the screen whether they are "good" or "bad" sounds, and receive feedback on their selection compared with the other players, adding points according to the coincidences between them. First, 7 rounds are played, followed by a freestyle round in which the player enters the word that best matches the audio heard. In the next round, this word appears as an option among those chosen by the system. Finally, every 8 rounds a score summary and different statistics are shown.
After reviewing this field, we have observed that all of these systems look for techniques to attract attention and engage users in a fun way, and thus to keep them playing. To do this, most rely on competition or comparison with other users, scoring according to the coincidence between them. Likewise, labeling quality is verified through the coincidence of labels between users, with a label considered correct once a certain number of matches is reached.

Conclusions and Future Work
This work has presented the design of a game using a mobile application to encourage citizens' participation in research, raising awareness of the problems generated by noise and helping experts in the work of pre-labeling sounds for later analysis. The most important aspects to consider when implementing an application of this type are detailed, such as gamification, citizen science and sound tagging, which all combined generate an attractive prototype for users to perform the task of pre-labeling in a more enjoyable way.
Moreover, we developed a first prototype and tested it with 10 researchers of La Salle University. In that test, the criteria assessed were installation, labels, navigation, tutorial, frequently asked questions, operation, design, and general aspects. The main findings were that half of the users did not understand the sounds correctly, some felt that the labels shown were a little confusing and did not always fit the sound, and most felt the app was easy to use and useful. The overall assessment of the app was good. From this, we detected the need for some improvements: improving the feedback received by the user during the tutorial, reconsidering the labels displayed for labeling and improving the operation of the audio controller.
Our future work is to make the appropriate corrections to the proposed design and implementation to improve the functioning of the app, test it with users and continue with a labeling marathon.