Crowdsourcing without Data Bias: Building a Quality Assurance System for Air Pollution Symptom Mapping

Crowdsourcing is one of the sources of spatial data, but due to its unstructured form, the quality of noisy crowd judgments is a challenge. In this study, we address the problem of detecting and removing crowdsourced data bias as a prerequisite for better-quality open-data output. This study aims to find the most robust data quality assurance system (QAs). To achieve this goal, we designed logic-based QAs variants and tested them on an air quality crowdsourcing database. By extending the paradigm of urban air pollution monitoring from particulate matter concentration levels to air-quality-related health symptom load, the study also builds a new perspective for citizen science (CS) air quality monitoring. The method includes a geospatial web (GeoWeb) platform as well as a QAs based on conditional statements. A four-month crowdsourcing campaign resulted in 1823 outdoor reports, with a rejection rate of up to 28%, depending on the applied QAs variant. The focus of this study was not on digital sensor validation but on eliminating logically inconsistent surveys and technologically incorrect objects. As QAs effectiveness may depend on location and societal structure, this opens up new cross-border opportunities for replicating the research in other geographical conditions.


Introduction
Urban air pollution is well known to cause negative health impacts. Therefore, the monitoring of pollutant concentration levels plays a key role in understanding air quality and its effects on the subjective well-being (SWB) of citizens [1]. SWB reflects the philosophical notion of a good life, serving as a proxy for assessing life satisfaction, momentary experiences, and stress. Kim-Prieto et al. [2] also took into account contemporary health hazards, among which air pollution is a key factor [3,4]. The development of smart and sustainable cities can only be accomplished through inclusive growth, drawing on smart people, technologies, and policies [5]. From the perspective of smart people [6], i.e., those who use smart devices to make their everyday living easier and their health safer, we found it necessary to develop geospatial web (GeoWeb) solutions to measure the adverse impact of cities on their inhabitants. Ensuring measurement credibility becomes a key scientific challenge in this context. To this end, we carried out research on the example of air pollution health symptoms, an emerging trend particularly related to odor [7] and green pollutant [8] crowdsensing [9,10]. As crowdsensing (or, more generally, crowdsourcing) methods for health symptom mapping are subject to data bias [11], we developed and tested a quality assurance mechanism (QAm) framework (Section 2.2), which can be transferred to similar health-symptom-based studies.
Sparse or irregular monitoring station networks as well as limited access to the reference air pollution data underlie the need for citizen science (CS) activities in the field of air pollution monitoring. Personalized information about exposure to air pollutants, monitoring during acute events or at specific locations, partnerships with local governments, and educational and community-driven purposes are the key benefits of bottom-up environmental monitoring. CS enables the collection of data on much larger spatial and temporal scales and at much finer resolution than would otherwise be possible. The issue of urban air pollution crowdsourcing motivated the implementation of several CS programs, such as those led by Mapping for Change, a community interest company in London (e.g., Pepys Air Quality Project, Science in the City Project, Love Lambeth Air (https://mappingforchange.org.uk/projects/)), as well as other international-scale activities found at claircity.eu and citi-sense.eu [12]. All the CS activities mentioned above were hosted on GeoWeb [13][14][15] (as discussed in Section 1.2). In this approach, citizens are required to act as sensors [12,16].
The CS HackAir project (https://www.hackair.eu) as well as PollenApp undertook the first attempts to crowdsource urban air pollution data, where air pollution was understood as the compound effect of anthropogenic- and biophysical-sourced particulate matter (PM) [8]. Pan-European CS projects, such as Distributed Network for Odor Sensing Empowerment and Sustainability (D-NOSES) [7], have shown that particular aspects of urban air pollution can be measured through the sense of smell of trained citizen scientists. Despite the subjective measurement nature of the human sense of smell, human-sensed CS (sCS) meets high-quality method expectations (D-NOSES meets the VDI 3940 standard of the Verein Deutscher Ingenieure, the Association of German Engineers). Ambient air pollution sCS, as well as the symptoms it causes, provides a promising source of spatial information. However, the unstructured nature of crowdsourced data [17][18][19] requires data quality assurance (QA) protocols [20,21], as well as trust and reputation modeling (TRM) [22] procedure development.
To the best of our knowledge, using QA for the purpose of air pollution symptom mapping (APSM) has not yet been investigated. Our goal was to address the challenge of crowdsourced APSM with the use of the GeoWeb platform (Section 2.3) and to solve the data bias problem by using a quality assurance system (QAs; Section 2.2) based on logic rules. By assessing the rejection rate of reports affected by data bias, we provide evidence of reliable air pollution monitoring expressed as the severity of human health symptoms caused by combined factors of anthropogenic and biophysical ambient air pollutants [23]. Extending the paradigm of air pollution referenced by WHO in terms of six main air pollutant [24] concentration levels and their public health impacts [25] to air pollution symptoms (APS), we indicate new possibilities for citizen-driven research and social inclusion in environmental and health-related issues, as experienced by sustainable cities. We also contribute to the development of the sCS data quality methodology.
In our study, we consider health symptoms caused by air pollution as one of the indicators of the current ecological footprint of humanity on the environment. Therefore, we focus on urban air pollution as a case study with the starting point of health symptoms caused by human exposure to air pollutants. This concept highlights the relationship between urban habitats and the SWB of citizens. The issue of air pollution requires spatial information provided thoroughly by a modern spatially variable society [26,27]. This underlines the need to implement a local partnership between monitoring agencies, researchers, and the local community, who all breathe the same air.

Extending the Paradigm of Urban Air Pollution
In the field of environmental air quality research, information about the state (i.e., clean or polluted) of the air is reported as an air quality index (AQI) [28]. The AQI tracks six major air pollutants: inhalable particulate matter (PM 10 ), fine particulate matter (PM 2.5 ), ozone (O 3 ), sulfur dioxide (SO 2 ), nitrogen dioxide (NO 2 ), and carbon monoxide (CO) [24]. The spectrum of pollutant sources includes those related to the development of human civilization (anthropogenic pollutants) [23], as well as those from natural sources, which questions the belief that everything that is natural is healthy [29]. Ambient air pollution concentrations above the approved limits [30,31] can cause certain health symptoms. Conversely, health symptoms can reflect air pollution. However, health symptoms resulting from inhalation of polluted air are also stimulated by natural-sourced biophysical PM such as pollen, mold spores [23,32], and volcanic emissions [33], causing human health problems such as respiratory allergies, including allergic asthma, which is regarded as an important disease [23,34,35]. Bastl et al. [8,32] described pollen as one of the green pollutants, which are significant components of the atmosphere and are relevant to air quality information for pollen allergy sufferers. This distinction is important for a comprehensive understanding of APS. Air pollution is specified as the concentration of pollutants measured in physical values (e.g., micrograms per cubic meter), whereas air quality refers to the AQI, as well as to classifications, opinions, and feelings, including the experiences of citizens in terms of air- and air-quality-related SWB [1,4]. This concept extends our understanding of air pollution from pollutant concentration levels to personal health symptoms caused by pollutant inhalation.
The quantity and severity of symptoms can explain air quality; however, a consensus about the terminology involving urban air quality has not yet been reached, and researchers typically distinguish air pollution from pollen exposure [36]. There is no symptom classification for air quality yet. Regardless, both factors shape air quality. Future research is required to understand and quantify the interaction of co-exposure to both types of air pollutants and its impact on the severity of human health symptoms [37].
Symptom mapping is a prerequisite for the spatial explanation of both dependencies. The first attempts at citizen symptom mapping related to green pollutants were made by Bastl et al. [32] and Werchan et al. [38]. Their research proved that citizen symptom load can be mapped efficiently using crowdsourced data; however, the sources of the symptoms cannot be clearly determined. The symptom load index is not directly correlated with annual pollen load and has a strong correlation to allergen content [32], with a linear (often daily) correlation [32,39]. Finding that relationship is beyond the scope of this paper; however, crowdsourced symptom data have shown potential as an indicator of the effects of urban air pollution on citizen well-being. The unstructured nature of crowdsourced data requires rigorous QA mechanisms (QAm). In this study, our aim is to identify a QA system for GeoWeb-based APSM to stream high-quality geospatial data. So far, this data stream does not exist. By sharing trusted and open data on air pollution symptoms, our findings can be used for aerobiological and health-risk-forecasting research.

Contribution of Citizen Science to Improvements in Air Pollution Mapping
According to Haklay [14], geographical citizen science overlaps with volunteered geographic information (VGI), especially in the geographical context of citizen-driven research. GeoWeb plays an essential role in this field. However, it is crucial that CS and VGI not be seen as equal, as the main purpose of VGI is to produce geographical information, while citizen science aims to produce new scientific knowledge [40,41]. Citizens engaged in scientific research projects become citizen scientists [42], who, depending upon their personal interests, motivation, education level, and experience in previous projects, engage with different levels of participation and expect to see the results of their research contribution. They contribute to the project by collecting and analyzing data but may also be involved in defining research questions or even interpreting results [14,43,44]. Considering the scope of citizen participation, Haklay [14] defined four levels of CS: crowdsourcing (first level), distributed intelligence (second level), participatory science (third level), and extreme citizen science (fourth level). Citizen involvement in environmental projects on air pollution is usually based on collecting and analyzing sensor data in the form of online maps. In this way, knowledge is produced. The fundamental questions about the harmful health effects of air pollutants have already been asked, so these activities are typified as CS level 1 and CS level 2. Of course, higher levels (depending on the engagement of members) are not excluded. In the case of odor crowdsourcing, which requires training as well as feeding measurement insights back to members, a collection method can be devised (i.e., level 3). Our study was based on the first level of CS, where citizens are engaged in the process of crowdsourcing APS data, producing new scientific knowledge of APSM together with researchers. CS provides a solution to research problems [44], while also educating citizens [45].
So far, smartphones have not been considered appropriate equipment for measuring urban air pollution. This is due to the fact that the built-in sensors of smartphones, by default, do not allow users to measure air pollutant concentrations. Therefore, bottom-up activities considering air pollution have usually relied on external, low-cost sensors (initially only capable of PM measurement, these sensors can now also sense all major pollutants, including volatile organic compounds). Loreto et al. [46] emphasized that modern participatory sensing, which is one of three sub-categories of citizen cyberscience [47], has witnessed significant progress related to the fast development and social networking tools of information and communication technologies (ICT), which "allow effective data and opinion collection and real-time information sharing processes". In that context, Guo et al. [48] and Capponi et al. [49] introduced mobile crowdsensing (MCS), which focuses on sensing and collecting data with mobile devices and aggregating data in the cloud. However, there are pollutants that remain exclusive to Internet of Things sensor dust. A great challenge of contemporary CS measurement is odor sensing, which affects both indoor and outdoor air quality. Human-sensed air pollution monitoring seems to be an emerging trend.
In this research, we specify the citizens as sensors and participatory sensing concepts, where the senses, subjective impressions, and perception of humans are the only sensors used in the project; therefore, we propose this as sCS. Moreover, by developing a QA mechanism for sCS, this study contributes to bottom-up air pollution monitoring and open-data credibility.

Importance of Data Quality in Crowdsourced Air Pollution
Data quality issues include errors and biases. Factors affecting the data collected through citizen perceptions result in data biases. CS requires the collaborative contributions of multiple contributors [50], but the assumption of multiple contributors is insufficient to provide high-quality data. Therefore, data quality protocols are an essential part of crowdsourcing-driven research. Although participatory research faces methodological challenges such as biases in data collection [51,52], CS has been proven to be a source of trusted geospatial data [20,[53][54][55][56], including for health risk mapping [57][58][59] and risks caused by poor air quality [32,60,61]. The data quality determines its usefulness [62,63]. Thus, the unstructured nature of crowdsourced data requires rigorous data QA protocols [17,[64][65][66][67].
The starting point for QA in CS is education and the provisioning of technical information and resources [21,68] in order to increase citizen knowledge about the issues of air pollution and to improve their environmental awareness and motivation to provide air-quality-monitoring supporting activities [60,69,70]. Of the range of crowdsourced data quality measures discussed in the academic literature by Haklay [50] and Foody et al. [67], among others, attribute accuracy and completeness are essential. Furthermore, these aspects of geographic data quality have also been recognized by international standards of spatial data quality. The ISO 19157 [71], which handles the diverse perspective of data quality, defines a set of standardized data quality measures, including completeness of data, positional accuracy, and temporal accuracy, which are all grouped as so-called data quality elements (DQEs) [72]. Each DQE is, then, further evaluated, and the result of the evaluation is documented and reported [67]. The principles of the aforementioned ISO 19157 [71] served as the basis for the proposed APSM data quality assurance framework.
The goal of this study was to answer the question of data quality assurance mechanism implementation in GeoWeb-based APSM. For this purpose, we propose a dedicated APSM framework for the QA mechanisms of start-check, sequence, cross-validation, repeating, and time-loop check in order to reduce data bias, such as user response inconsistency, location inaccuracy, and duplicate time-space-related reports (see Section 2). By combining several logic-based QA mechanisms (QAm), we tested the robustness of the QAm to find the strongest QAm set and build a ranked data quality assurance system (QAs). Our research question was: which QAs best reduces data bias in APSM? The sources of data bias include contradictory entries in the geodatabase attribute table, recorded as answers supplied to the specially created APSM survey. However, we did not address the non-air-pollution-related factors that affect human symptom severity and act synergistically with air pollution, which also contribute to the robustness of spatial databases on health-related symptoms [8,[73][74][75][76].

Materials and Methods
For our participatory APSM project, we followed the CS development framework of Bonney et al. [77], starting from the research question and project team formulation through to CS action execution and the dissemination of project findings (Sections 2.1 and 2.3). However, we focused on addressing data bias (Section 2.2) to improve the symptom-based air-pollution-mapping data quality.

Building a Field Data Collection Strategy
The method was adopted in the city of Lublin and Lubelskie Voivodship, located in eastern Poland. The data collection campaign lasted for one academic semester, from February to May 2018. Starting the campaign in the first quarter of the year is crucial, as anthropogenic pollutants and pollen occur simultaneously at the beginning of the year, especially gaseous pollutants that can act as adjuvants, exacerbating pollen allergenic potency and immunoreactivity [78]. A group of 56 students from different faculties of the University of Life Sciences in Lublin were involved in the project as volunteers and, according to Harlin et al. [79], turned into citizen scientists, of which, 30 spatial management students were the core citizen group of the project. A total of 18 non-academic citizens joined the research and collected data together with the student group. The project was continually open to everybody through the community channel for data and app sharing, which was implemented in GeoWeb. By sharing their conclusions and opinions during the social campaign, the citizens had a direct impact on the optimization of methods used.
To strengthen data quality even before the field data collection campaign began, the scientific student organization of the Spatial Management Faculty of the University of Life Sciences in Lublin organized workshops for the citizen scientists, who learned about the subject of study and the research project assumptions. Furthermore, they were trained in handling the mobile and web apps (details about the apps are provided in Section 2.3). The workshops, training, informing, and research project promotion among citizens lasted for the first month.
According to the study requirements, the citizen scientists were asked to collect data only during their daily outdoor activities, preferably once per day. If they observed air-pollution-related symptoms, they were asked to report them as soon as possible. If they caught a cold or were sick, they were expected to stop collecting data until they recovered. The project assumptions and rules for collecting data were included in the mobile app as a user guide introducing the study. Other educational materials were made available in the narrative web apps.
Our study referred to the patterns of citizen activity characteristic for CS specifics [80]. As such, we implemented a method for citizen engagement improvement, which was based on the monthly activity ranking of users, presenting a number of submitted APS reports. Users were assigned award titles according to the following number of reports sent per month: 1-5, Beginner; 6-10, Pretty Involved; 11-20, Super-Engaged; and >20, Excellent Citizen Scientist. A user-activity-tracking module, included in the operations dashboard app (details in Section 2.3), was public and allowed for competition between the citizens.
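The monthly award thresholds above can be expressed as a small lookup, sketched below in Python. This is only an illustration of the ranking rule; the function name and the label for inactive users are ours, not part of the project code.

```python
def award_title(reports_per_month: int) -> str:
    """Map a citizen's monthly report count to an award title.

    Thresholds follow the campaign rules: 1-5 Beginner, 6-10 Pretty
    Involved, 11-20 Super-Engaged, >20 Excellent Citizen Scientist.
    The label for zero reports is an assumption for completeness.
    """
    if reports_per_month > 20:
        return "Excellent Citizen Scientist"
    if reports_per_month >= 11:
        return "Super-Engaged"
    if reports_per_month >= 6:
        return "Pretty Involved"
    if reports_per_month >= 1:
        return "Beginner"
    return "No reports"
```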

Data Quality Assurance Methods for APSM
Following other studies of human health symptom severity, such as the Symptom Load Index research in pollen allergy sufferers [81], we used a questionnaire sheet. However, we extended the group of potential citizens to all those who suffered from air-quality-related health symptoms, and, to improve the method of data quality assessment, a user-screening question was implemented in the questionnaire, which helped identify the citizens more sensitive to air pollution. The question asked for any pollen allergens the user is allergic to (birch, alder, hazel, grass, pine, plantain, mugwort, nettle, ragweed, others; it is also possible to answer no allergens; Figure 1a). The survey included close-ended questions about health symptoms related to air quality according to the classification defined in the research on the epidemiology of allergic diseases in Poland (PL: Epidemiologia Chorob Alergicznych w Polsce (ECAP)) [82]; an accompanying SWB question, which supported the QA method; a personal profile with the user-screening question; and, finally, time range and geolocation information. The questionnaire design was inspired by sociological research. We followed the rule that filling out a web form shouldn't take more than 10 min [83]; the designed web questionnaire was tested in web form to confirm it did not exceed this limit. Following Malhotra [84], the questionnaire contained mainly dichotomous and multiple-choice questions. The APS questions were grouped together [85]. The data stored in the database were displayed as a text data type in the mobile app, which is simple and intuitive for the user. The data were also coded in the database in the short integer data type (except for the user-screening question, which was coded in the text data type) and were used for data analysis and statistics, as shown in the flowchart of the APSM questionnaire (Figure 1).
The proposed method includes data forms, which are the basis of the developed conditional statements.

QA Methods Applied during the Data Collection Process
We initially adopted three QA methods in the mobile survey app in order to improve data quality during the data collection process. These methods were based on the quality measures of ISO 19157 [71]: positional and temporal accuracy, data completeness, and consistency. The first QA method eliminated duplicated reports, i.e., identical reports sent from the same location within a certain time interval (5 min); duplication was confirmed by the matching nickname of the citizen who made the report. Reports lacking geolocation were then excluded from the database by the second QA method. The third method controlled the Global Navigation Satellite System (GNSS) positioning accuracy of the reported APS observations, under the assumption that reports with a horizontal accuracy error greater than 100 m are outliers; these were eliminated in the data collection stage. If the surveys were accepted under the three QA methods described above, they were finally checked with the completeness quality measure. Surveys that were incomplete in terms of obligatory questions were automatically blocked against submission through a mechanism configured in the app, so that the database was free from incompleteness.
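The three collection-stage filters can be sketched as follows. This is a minimal in-memory Python illustration of the logic, not the actual Survey123 configuration; the `Report` structure and function name are assumptions for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

@dataclass
class Report:
    nickname: str
    timestamp: datetime
    lat: Optional[float]          # None if geolocation is missing
    lon: Optional[float]
    accuracy_m: Optional[float]   # horizontal GNSS accuracy estimate

def collection_stage_qa(reports):
    """Apply the three collection-stage QA filters described above."""
    accepted = []
    seen = []  # (nickname, lat, lon, timestamp) of accepted reports
    for r in sorted(reports, key=lambda r: r.timestamp):
        # QA2: reports without geolocation are excluded
        if r.lat is None or r.lon is None:
            continue
        # QA3: horizontal accuracy worse than 100 m -> outlier
        if r.accuracy_m is None or r.accuracy_m > 100:
            continue
        # QA1: an identical report from the same user and location
        # within a 5-minute window counts as a duplicate
        dup = any(
            n == r.nickname and (la, lo) == (r.lat, r.lon)
            and abs(r.timestamp - t) <= timedelta(minutes=5)
            for n, la, lo, t in seen
        )
        if dup:
            continue
        seen.append((r.nickname, r.lat, r.lon, r.timestamp))
        accepted.append(r)
    return accepted
```

Completeness is not modeled here because, in the app, incomplete surveys were blocked before submission and never reached the database.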

Logic-Based QA Mechanisms Implemented after the Data Collection Process
The proposed logic-based QA framework for APSM has a four-tier structure, starting with a set of logically related questions, which form the basis of the conditional statements; these, in various configurations, form five QA mechanisms and, finally, QA systems (Figure 2).
The logic formula for each conditional statement was built to filter out and eliminate data bias. The conditional statements were the primary components of the QA mechanisms. To determine the potential logic rules, the following general criteria based on the database-coded answers from the questionnaire were assumed: Q2-Q7, the questions reporting citizen health symptoms, should be implemented in at least six conditional statements, while Q1 and Q9 (SWB questions) should be implemented in at least two conditional statements each. The conditional statements then identified data inconsistencies based on false hypotheses, such that false results were returned for biased reports (Table 1). The QA mechanisms of start-check, sequence, cross-validation, repeating, and time-loop check were proposed, with one or two levels of robustness (Table 2). These QA mechanisms were implemented in the database after the data collection process was finished.
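As an illustration of how such a conditional statement can be encoded, a report can be modeled as a dictionary of database-coded answers. The rule below is an analogous example of our own, not a verbatim formula from Table 1.

```python
# A report is a dict of coded answers, e.g. {"Q1": 1, "Q2": 0, ..., "Q7": 2}.
SYMPTOM_QUESTIONS = ["Q2", "Q3", "Q4", "Q5", "Q6", "Q7"]

def con_example(report: dict) -> bool:
    """Illustrative conditional statement (not a verbatim Table 1 rule):
    return False (inconsistent) when overall comfort is 'good' (Q1 == 1)
    yet some symptom is reported at 'strong' severity (code 3)."""
    if report["Q1"] == 1 and any(report[q] == 3 for q in SYMPTOM_QUESTIONS):
        return False
    return True
```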

Table 1. Conditional (Con.) statements, the basis of the quality assurance (QA) mechanisms.

Conditional Statement Number | Logic Formula
To date, such a method has not been implemented in an sCS air-pollution-related symptom-mapping project. As a data quality assurance framework for APSM, we propose a data quality assurance system (QAs) that is a combination of the QA mechanisms, depending on the robustness variant of each mechanism (Table 3). The choice of the QA system variant depends on the project character: each QA mechanism works independently and, so, can be implemented in an APSM project separately or in any combination, as needed.

Table 2. QA mechanisms implemented in the citizen air-pollution-related symptoms questionnaire, with less (1) and more (2) robust variants.

QA Mechanism | QA Mechanism Code | Components (Conditional Statement Combination)

Start-check (SC) | SC1 | Con.1 or Con.2
Start-check (SC) | SC2 | Con.3 or Con.4
Sequence (Sq) | Sq | Con.5 or Con.6 or Con.7
Cross-validation (CV) | CV1 | Con.8
Cross-validation (CV) | CV2 | Con.9
Repeating (Rp) | Rp1 | Con.10 or Con.11
Repeating (Rp) | Rp2 | Con.12 or Con.13
Time-loop check (TC) | TC | Con.14 or Con.15 or Con.16 or Con.17

The core principle for the research is the logic-based data quality assurance procedure. The start-check, cross-validation, and time-loop check mechanisms are based primarily on the medical assumptions included in the logic rule framework. The other QA mechanisms, sequence and repeating, are purely logical and result from social and psychological survey methods for monitoring, the typical respondent-answering process, and data quality control [86].
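Assuming each mechanism is exposed as a predicate over a report (True = consistent), composing a set of mechanisms into one QA system and measuring its rejection rate might look like the following sketch; the function names are illustrative.

```python
def build_qas(mechanisms):
    """Compose independent QA mechanisms into one QA system.

    Each mechanism is a function report -> bool (True = consistent).
    A report passes the QAs only if every selected mechanism accepts it,
    so mechanisms can be combined freely, as the text describes.
    """
    def qas(report):
        return all(m(report) for m in mechanisms)
    return qas

def rejection_rate(reports, qas):
    """Share of reports eliminated by the QA system (0.0-1.0)."""
    if not reports:
        return 0.0
    rejected = sum(1 for r in reports if not qas(r))
    return rejected / len(reports)
```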
The start-check mechanism was used to verify report consistency at the beginning of the survey, excluding reports whose symptom severity answers were not consistent with the general well-being question. The quality assurance method of preceding the detailed questions with a general question about the issue was used by Bastl et al. [8] and Bousquet et al. [87]; however, these studies did not report success in using this conditional statement. We examined it in two variants of robustness. The components and algorithms of the start-check mechanism, presented in Tables 1 and 2 and implemented in the database, work as follows: Variant 1 is less robust and assumes that a report is consistent when the citizen assesses their current comfort as "good" (1) and the answers to Q2-Q7 are between "no symptoms" (0) and "moderate symptom severity" (2). If the answer to Q1 is poor self-comfort (3), then at least one of Q2-Q7 must be answered as "strong symptom severity" (3). Variant 2 is stricter and assumes that the possible answers for Q2-Q7 can only be "no symptoms" (0) or "mild symptom severity" (1) when current comfort is rated as "good" (1). If the answer to Q1 is "poor comfort" (3), the same conditional statement is used as in variant 1. In both variants, reports with the answer "I have no opinion" (0) for Q1 were excluded.
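A possible Python encoding of the two start-check variants, following the answer codes described above; the behaviour for Q1 codes not mentioned in the text is an assumption (such reports are passed through unchanged).

```python
SYMPTOMS = ["Q2", "Q3", "Q4", "Q5", "Q6", "Q7"]

def start_check(report: dict, variant: int = 1) -> bool:
    """Start-check mechanism, variants 1 (less robust) and 2 (stricter).

    Returns True if the report is consistent. Codes: Q1 0 = 'I have no
    opinion', 1 = 'good', 3 = 'poor'; symptom severities 0-3.
    """
    q1 = report["Q1"]
    severities = [report[q] for q in SYMPTOMS]
    if q1 == 0:                      # 'I have no opinion' -> excluded
        return False
    if q1 == 1:                      # comfort rated 'good'
        limit = 2 if variant == 1 else 1
        return all(s <= limit for s in severities)
    if q1 == 3:                      # comfort rated 'poor'
        return any(s == 3 for s in severities)
    return True                      # other Q1 codes: no rule given
```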
The sequence mechanism was applied to exclude user automatism in providing answers, which is often caused by a citizen giving rash answers or not reading the questions. Therefore, each report with all questions answered by responses in the same position of the answer sequence (e.g., every first answer for each question) was eliminated from the database. The QA mechanism is based on the method of rearranging the order of possible answers [88,89] for one question in the sequence of similarly asked questions. The standard order of the answers for questions Q2-Q7 was 1, 2, 3, 0. Q5 was an exception, with an answer order of 3, 0, 1, 2.
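The straight-lining test can be sketched by mapping each answer back to its position in the (deliberately rearranged) answer order; the code below is an illustration of the rule, not the database implementation.

```python
# Possible answer codes for Q2-Q7, in the order shown to the user.
STANDARD_ORDER = [1, 2, 3, 0]        # Q2-Q4, Q6, Q7
Q5_ORDER = [3, 0, 1, 2]              # Q5: deliberately rearranged
ANSWER_ORDER = {q: STANDARD_ORDER for q in ("Q2", "Q3", "Q4", "Q6", "Q7")}
ANSWER_ORDER["Q5"] = Q5_ORDER

def sequence_check(report: dict) -> bool:
    """Sequence mechanism: reject straight-lining. A report is flagged
    as inconsistent when every answer occupies the same position in its
    question's answer order (e.g., every first answer was picked)."""
    positions = {order.index(report[q]) for q, order in ANSWER_ORDER.items()}
    return len(positions) > 1        # all identical positions -> reject
```

Because Q5's order is reversed relative to the others, a user clicking the same position throughout produces a detectably inconsistent pattern of codes.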
The cross-validation mechanism was used to reject responses by using three essentially related questions. If the answer to the additional question was not consistent with one of the two previously answered related questions, the report was excluded. The APS mentioned in Q8 (rubbing eyes) should be related to symptoms of a runny nose (Q4) or watering eyes (Q5). The mechanism was tested in two variants, where variant 2 is stricter.
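Since the exact Table 1 formulas are not reproduced here, the following is only an assumed encoding of the variant 1 idea (eye rubbing in Q8 must be supported by a runny nose in Q4 or watering eyes in Q5); the coding of Q8 as 0 = no eye rubbing is also an assumption.

```python
def cross_validation(report: dict) -> bool:
    """Cross-validation sketch (variant 1). A report claiming eye
    rubbing in Q8 is accepted only if it also reports a runny nose (Q4)
    or watering eyes (Q5); otherwise the report is inconsistent."""
    if report["Q8"] > 0:             # eye rubbing reported
        return report["Q4"] > 0 or report["Q5"] > 0
    return True
```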
The repeating mechanism determines the consistency of the report, according to the other previously answered questions, by asking the same question but in a different way. If the repeated answer is not consistent with the former one [21,88,90,91], the report is excluded from the database. This was used with Q9, which repeated the content about the citizen's self-comfort in Q1. Additionally, the mechanism tested the report's compatibility based on the repeated SWB question and symptom severity ones (Q2-Q7). The mechanism was examined in two variants of the robustness level analogous to the previous mechanisms.
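A sketch of the repeating mechanism under the stated coding; variant 1 requires the repeated SWB answer (Q9) to agree with Q1, while the variant 2 severity re-check below is an assumption made by analogy with the start-check.

```python
def repeating_check(report: dict, variant: int = 1) -> bool:
    """Repeating mechanism sketch: Q9 re-asks the Q1 self-comfort
    question. Variant 1 requires the two answers to agree; variant 2
    (assumed here) additionally re-applies the comfort/severity
    consistency test against Q9."""
    if report["Q9"] != report["Q1"]:
        return False
    if variant == 2:
        severities = [report[q] for q in ("Q2", "Q3", "Q4", "Q5", "Q6", "Q7")]
        if report["Q9"] == 1 and any(s == 3 for s in severities):
            return False             # 'good' comfort yet a strong symptom
        if report["Q9"] == 3 and not any(s == 3 for s in severities):
            return False             # 'poor' comfort without a strong symptom
    return True
```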
The time-loop check mechanism was used to eliminate reports that did not align with the geolocation of the citizens, according to the length of their stay in a place in comparison to the duration of their symptoms. Previous studies [92][93][94] have shown that human allergic reactions to pollen begin within 10 to 20 min, usually resolving after 1-2 h for an early-phase reaction, and begin within 3-4 h with resolution after 6-12 h, sometimes even 24 h, for a late-phase reaction. The late-phase reaction is preceded by an early-phase reaction. The early-phase reaction is characterized by symptoms such as allergic rhinitis (including sneezing, itching, and rhinorrhea) [34,95], while the late phase is connected with nasal congestion and obstruction [34,96]. For anthropogenic-sourced air pollutants PM10 and PM2.5, the reaction time (according to the pollutant type) ranges between 2 and 10 min for the most sensitive subjects and resolves after 30 min [97,98]. For the purpose of APSM, we adopted the time range for the duration of the human early-phase reaction to air pollutants as between 1 min and 2 h. This mechanism assumes that all reports with a time-loop value higher than the duration the user remained in the place are eliminated, as such information indicates a late-phase reaction, which is not connected with the actual geolocation of the citizen.
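The time-loop rule above can be sketched as a simple comparison. The early-phase window (1 min to 2 h) follows the paper; enforcing that window as a hard bound in the same function, and the function name itself, are assumptions for illustration.

```python
EARLY_PHASE_MIN, EARLY_PHASE_MAX = 1, 120   # adopted early-phase window, minutes

def time_loop_check(symptom_duration_min: int, stay_duration_min: int) -> bool:
    """Return True if the symptoms can be linked to the current geolocation."""
    if not (EARLY_PHASE_MIN <= symptom_duration_min <= EARLY_PHASE_MAX):
        return False                    # outside the early-phase reaction window
    # symptoms older than the stay suggest a late-phase reaction begun elsewhere
    return symptom_duration_min <= stay_duration_min
```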

GeoWeb Method Supporting Data Quality Assurance
The GeoWeb platform, which was the technological basis of this project, was designed to maintain project openness. It consists of a mobile app for field data collection, based on the configurable Survey123 for ArcGIS application (Survey123; Esri Inc., Redlands, CA, USA), and a data mapping module, including dashboards (web mapping applications based on ArcGIS Online (AGOL) components; see Figure 3), available to citizens for free on a smartphone as well as via a web browser.
The mobile app (A) includes the APS survey for field data collection. The educational and technical training materials (B) are provided in a narrative web mapping application (Esri story map templates), which is served through Representational State Transfer (REST) services. The collected data are sent to the geodatabase through REST services, and the raw data are presented in the dashboard app (C) in real time. Then, the raw data are QA-checked in the database using ArcGIS Desktop integrated with the REST services and the R statistical software (R: a language and environment for statistical computing; R Foundation for Statistical Computing, Vienna, Austria; www.R-project.org) (D), which is used for analysis of the QA mechanisms and QA system variant robustness, as well as the survey result statistics. Finally, the QA-checked data are returned to the database and presented as APSM results in the open web mapping applications, including dashboards (C). The results were reported in the dashboard app as the percentage of observations eliminated by the most robust QA mechanism combination, presenting the total robustness of the implemented QA system.
The questions proposed for the survey in Section 2.2 were included in the mobile survey, which was divided into individual pages in order to help the user quickly navigate between questions without scrolling down the whole form [99,100]. This helped to avoid mistakes in filling out similar APS questions and to focus on the relevant section of the survey. An offline mode was provided in the mobile app to collect the actual geolocation even without an internet connection. We used a user authorization option with an alphanumeric nickname and a four-digit code at the beginning of the survey in order to help follow the reports of each citizen, while also providing them with anonymity. To improve the quality of collected data, we implemented closed questions of the single- or multiple-choice type and data range functionality, which prevented the submission of certain reports that were inconsistent with the rules of the project.
Within the field data collection campaign, the reports were mapped, in real time, in a point-symbolized data layer (Figure A4, Appendix A). Cooperation between the Survey123 mobile app and the web mapping module was based on the typical WebGIS architecture [101]. The dashboard displayed the user activity-ranking widget, which was based on a list of the 10 most active citizens, presented as their nicknames together with their award titles, as well as their number of reports in the last month (Section 2.1).
This functionality helped to enhance the engagement of the citizens during the data collection process.
The design concept and interface description of GeoWeb applications, along with the crowdsourced symptom map formed from the Lublin case study, are provided in Appendix A.

Results
Data quality assurance is key for useful and effective sCS, and so, the results mostly focus on the implemented QA mechanisms and the robustness analysis of the QA system variants for air-pollution-related symptom mapping. We first focused on the field data collection campaign, which provided us with the input for our analysis. The citizen activity curves indicated the varying and regularly decreasing activity of the citizens. For QA robustness results, our key focus was the eight QA system variants, the combinations of five logic-based QA mechanisms, which rejected from 18.3% to 28.6% of reports with data bias. As a result, we created and proposed a QA framework for APSM, based on the ArcGIS platform, which was configured and customized to the study requirements. The air-pollution-related symptom map and GeoWeb apps are presented in Appendix A.

Data Collection Campaign Outcomes
During the data collection campaign, 1936 APS reports were sent by citizens to the cloud and used as input to the database on which the QA mechanisms were developed. The citizen scientists involved in the APSM project became members of the Citizen Science Community of the University of Warsaw (CS-Community-UW; Warsaw, Poland) (https://arcg.is/WeqfK), which was set up to share all project data, materials, and apps in GeoWeb, available at https://arcg.is/1iDD18, in order to provide continuous access to the results. When citizens collected data, their activity was tracked and updated every month in a user activity module implemented in the dashboard app (Section 3.3). Analysis of the activity curve indicated that citizen activity peaked for 11 days at the beginning of the campaign (39-48 reports per day) and then decreased. After this, activity stabilized at between 4 and 29 reports per day, with two peaks of 32 and 34 reports per day (Figure 4). The students were more active than non-academic citizens: the most active student collected 25 reports in April, whereas the most active non-academic citizen collected 7 reports in March.

Robustness of QA Mechanisms
During the initial stage of data set filtering, 19 outliers with no geolocation were eliminated. Another 68 APS records were excluded as they were duplicate reports at the same geolocation within a short (5 min) interval. The horizontal positioning accuracy varied from 0 to 100 m, where 93% of reports had a horizontal accuracy between 0 and 30 m. We filtered 26 outliers that had a horizontal accuracy error exceeding 100 m, which were rejected. As a result, 1823 reports of the 1936 remained and were used as the object of our logic-based QA implementation study. After the implementation of the QA mechanisms in the database, their robustness was analyzed.
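The initial filtering stage described above can be sketched as a single pass over the raw reports: drop reports with no geolocation, near-duplicate reports (same location within 5 min), and reports whose horizontal accuracy exceeds 100 m. The field names and the exact duplicate rule are illustrative assumptions.

```python
def prefilter(reports: list[dict]) -> list[dict]:
    """Sketch of the pre-QA filtering: geolocation, duplicates, accuracy."""
    kept, seen = [], {}
    for r in sorted(reports, key=lambda r: r["timestamp_min"]):
        if r.get("location") is None:
            continue                              # no geolocation
        if r.get("accuracy_m", 0) > 100:
            continue                              # positioning error too large
        last = seen.get(r["location"])
        if last is not None and r["timestamp_min"] - last < 5:
            continue                              # duplicate at same spot, <5 min
        seen[r["location"]] = r["timestamp_min"]
        kept.append(r)
    return kept
```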
According to Table 4, the three most robust QA mechanisms for our specific case study were repeating, in two variants (more robust, repeating2 (Rp2); less robust, repeating1 (Rp1)), and start-check in the more robust variant (SC2). The repeating2 mechanism excluded 23.1% of reports, repeating1 eliminated 10.6% of reports, and start-check2 eliminated 6.9% of reports. Thus, most data bias resulting from inconsistencies between the first and the repeated question was identified by the repeating mechanism.

Three QA mechanisms (start-check, cross-validation, and repeating), which were implemented in two variants (i.e., less robust and more robust), were analyzed in terms of the report reduction in each variant. For the start-check mechanism, the two variants rejected the same 71 reports, while start-check2 eliminated 54 more reports than start-check1. Cross-validation1 and cross-validation2 rejected the same 54 reports, while cross-validation2 additionally eliminated 54 observations. For the repeating mechanism, repeating1 and repeating2 agreed on 194 reports, while repeating2 was 12.5% more robust than repeating1, excluding 228 additional APS reports (Table 5). The largest data bias was related to the inconsistency between the general SWB question and its repeated query (repeating1: 10.6%; repeating2: 23.1%). This result could have been produced by inaccuracy in the repeated-question structure or by citizens misunderstanding the question. The start-check2 mechanism rejected 6.9% of reports, meaning that the symptom severities did not align with the general SWB assessment. The sequence mechanism was the least robust (0.9%), indicating that citizens rarely filled in the form automatically, that is, without carefully reading the answers. High rejection rates were observed for start-check2 (6.9%), cross-validation2 (5.9%), and repeating2 (23.1%), which were between 46% and 57% more robust than their alternative variants (start-check1, cross-validation1, and repeating1, respectively). Due to its high rejection rate, start-check2 was found to also reject some consistent reports.
Finally, according to the methodology (Section 2), the QA mechanisms were combined into eight QA system variants (QAs1-QAs8; Table 6). Implementing each subsequent QA mechanism changed the database, considering the QA system functions. For this study, the most robust QA systems were QAs8 and QAs7 (28.6%) and QAs2 and QAs5 (27.3%), which best reduced the number of falsely completed reports in the survey. They increased the quality of the collected data but also rejected a percentage of reports that might have contained valid information; as a result, some valuable data would be lost. The least robust were QAs1 (18.3%) and QAs3 (20.0%), which could not identify all the reports with false data, thus decreasing the quality and validity of the research results. As mentioned above, two pairs of QAs variants reduced the same data set and replicated the result database (QAs2 with QAs5; QAs7 with QAs8), thus allowing their interchangeable use. To reduce replication in the results, the studied QA framework was limited to six QAs variants (with QAs5 and QAs7 removed). For another location (i.e., country or continent) and society structure, the robustness ranking of the six QAs variants could differ, and particular QA mechanisms could be more or less effective than in the considered case study in Lublin.
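The way mechanisms combine into a QA system variant can be sketched as follows: a report passes the system only if it passes every member mechanism, and the system's robustness is the fraction of reports it rejects. The function names and report structure are illustrative assumptions.

```python
def qa_system(report: dict, mechanisms) -> bool:
    """A report passes the QA system if it passes every member mechanism."""
    return all(check(report) for check in mechanisms)

def robustness(reports, mechanisms) -> float:
    """Fraction of reports rejected by the combined QA system."""
    rejected = sum(1 for r in reports if not qa_system(r, mechanisms))
    return rejected / len(reports)
```

Because rejection is a set union over the member mechanisms, two variants built from different mechanism subsets can still reject exactly the same reports, which is what makes pairs such as QAs7 and QAs8 interchangeable.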


APSM Results after QA System Implementation
APSM results calculated on the QAs8-checked data set were added to the GeoWeb dashboard app (Figure 5) such that each user could track the APSM data of the whole crowdsourcing campaign, presenting the impact of air pollution on his/her SWB status. In the QAs8-checked database, the most frequently observed health symptom was a runny nose, reported in 46.98% of surveys during the whole campaign. The least frequently reported symptoms were breathing problems (only 6.92% of surveys). The QAs8 had an impact on the percentage of surveys reporting any APS (Table 7). Most surveys reporting breathing problems were rejected, reducing their percentage in the database by 47.34%. The percentage of surveys reporting a runny nose remained almost unchanged, increasing by 0.86%. The percentage of surveys reporting other symptoms was reduced by 14.88% to 35.42%. The information about the percentage of surveys reporting individual symptom severity values was presented in the dashboard app (Q2-Q8; Figure 5f), which additionally includes the percentage robustness of the QAs8 implemented in the project (Figure 5g).
Moreover, the web application presented the results in terms of the spatial location of eligible symptom-related layers (Q1-Q8; Figure 5a-c), the number of QA-checked reports depending on the spatial extent (Figure 5d), and the general level of citizen comfort (Figure 5e). For more details about the GeoWeb app, see Appendix A.

Discussion
In this study, we introduced social innovation into the urban air pollution issue, where citizens act to assess air pollution using their symptoms, thereby extending the air pollution monitoring paradigm. We considered the spatial context; therefore, a map was used to spatially model symptom severity (Figure A5, Appendix A). The web mapping application is public and provides information about air pollution in specific areas of the city such that citizens can learn which areas could positively or negatively impact their SWB, according to information about the severity of the symptoms observed by the APSM project members within individual months. The air pollution level based on health symptom observations could be compared with other methods, such as satellite observations and in situ measurements. Such in situ measurements were described and studied by [81], who compared pollen counts recorded by the European Aeroallergen Network with pollen-induced symptoms reported by pollen allergy sufferers, specifically the Pollen App user community. The health symptoms registered by the APSM community can be compared with green pollutant concentration levels recorded by the Polish Aerobiological Monitoring Network [102]. In the context of other air pollutants, Sentinel-5P imagery can be taken as a point of comparison, especially because it is served in near real time (NRT) [103], although at a more global scale.
Although the potential for using crowdsourced data to monitor urban air pollution was demonstrated here, the limited report sample can be considered a limitation of the project. Citizen science is an emerging trend in Poland, and so specific motivation mechanisms need to be elaborated, based on the citizen motivation recommendations developed in other projects [104,105]. As APSM focuses on citizens who are interested in the effect of air pollution on their health and well-being, our study is broader than other projects dedicated only to people diagnosed with health problems. This means that the motivation mechanisms involved differ from those that are appropriate for patients (e.g., free medical consultations).
Crowdsourcing projects rely on a suite of methods to boost data quality and account for data bias [20]. To gain better data quality in CS, three-step mechanisms are generally recommended: taking considerations before, during, and after the data collection process [21]. The APSM method, using logic rules to reject inconsistent database entries, was successfully implemented after the data collection process. Depending on the expected data quality level, different mechanisms were tested.
From a practical point of view, the data quality (i.e., completeness, spatial accuracy, thematic resolution, timeliness, and logic consistency) should be suitable for the project purpose. The quality of CS data is expected to be similar to that of the data collected by professionals. According to Wiggins et al. [21], some general solutions for improving crowdsourced data quality include volunteer training (workshops), a large sample size, data filters, data mining algorithms, a qualifying system, voting for the best, reputation scores, online data and metadata sharing, citizen contribution feedback, reuse of data, and replicate studies; however, the purpose of our study was to create a data quality framework.
This confirms that data quality control mechanisms are an indispensable element of any citizen-driven research and are therefore also effective in CS activities such as that considered in this paper. The removal of falsely completed surveys increased the quality and usefulness of the collected data. In our research, only 5.8% of data were eliminated due to positioning accuracy, either a lack of geolocation or duplicated reports at the same geolocation. Farman [106] identified 12% of crowdsourced data as having accuracy-related errors. Thus, we conclude that, in our sCS, this type of error did not pose a significant problem in terms of data quality; subjective data bias, however, definitely does. The problem of human bias in data must always be considered during data analysis. Human bias introduced into data can be mitigated by using clearly stated survey questions, providing additional training, and limiting the scope of the survey. We found that up to 29% of all collected surveys regarding air pollution objectively contained useless or false information. Kosmala et al. [20] reported subjective data bias at levels between 5% and 35%, depending on the simplicity of the tasks assigned to citizens; Hube et al. [107] presented data bias at 15-17% in a crowdsourced data set; and Eickhoff [108] pointed out an accuracy rate reduction of 20% due to cognitive data bias. We recognize that our percentage share of data bias was high, highlighting the absolute necessity of a QA mechanism framework for sCS health-related projects. As QA mechanisms are created through the use of logic rules, they are easy to understand and can be crafted to match particular (expected or observed) error types in collected survey data. We found that QA mechanisms can be used to remove surveys that contain clearly defined errors. Moreover, by choosing a single QA system (Table 3) and combining rejected reports with the username, each user's quality rank can be calculated.
A user who passes the QA system could then be rewarded with trust and reputation statuses. Such citizen trust models have been proposed by Alabri and Hunter [109], who developed a social trust metrics framework, and Langley et al. [110], who applied a reputation model and used a reputation score system to determine the threshold for accepting volunteered data. This should be based, for each citizen, on the ratio of reports accepted by the QA system to the total number of their surveys: the higher the ratio of accepted reports, the higher the level of citizen trust.
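The proposed trust score, the ratio of a user's QA-accepted reports to their total reports, can be sketched as follows. The function and field names are illustrative assumptions.

```python
def trust_scores(reports: list[dict]) -> dict[str, float]:
    """Per-user trust: accepted reports / total reports (sketch)."""
    totals, accepted = {}, {}
    for r in reports:
        u = r["user"]
        totals[u] = totals.get(u, 0) + 1
        if r["qa_passed"]:
            accepted[u] = accepted.get(u, 0) + 1
    return {u: accepted.get(u, 0) / n for u, n in totals.items()}
```

A threshold on this score, in the spirit of the reputation model of Langley et al. [110], could then gate whether a citizen's future reports are accepted automatically.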
The APSM data quality mechanisms implemented after the data collection stage-but referenced to a particular user's reports-could be used to develop a user motivation system (which, in the current study, was limited only to user activity). Furthermore, the technological implementation of QA systems as cloud services may enable the ranking of user trust and reputation during the data collection process. Then, not only the quantity but also the quality of user reports can be analyzed and their level of reputation could be assessed and presented during the campaign. In large-scale CS projects such as iNaturalist (iNaturalist.org), the trust and reputation of citizens are based on their activity: "The users' community ensures that data is reliable, but it also gives the opportunity for fellow users to gain knowledge" [111].
In future research, during the data collection process, some new solutions can be implemented, such as GNSS trajectories. Currently, Q10 (Figure 1) requires users to estimate how long they have stayed in a location. Using GNSS or Global System for Mobile Communications (GSM) data to characterize user mobility patterns and analyze user spatial trajectories could increase data quality and make the application smarter. Changes in user trajectories could also result in an individual push notification to maintain or cancel the APS, depending on the change of location. Finally, to gain better data quality, improvements before the data collection stage may also be proposed. APSM was carried out as a Polish case study. As CS has been recognized as an emerging trend in Poland, we found it necessary to promote the sCS concept through the European CS platform (https://eu-citizen.science) and engage citizens in air pollution monitoring by organizing training sessions. What differentiates CS from other VGI activities is the fact that CS can be taken up by any volunteers who have undertaken standardized training. Learning how to observe one's own symptoms in reaction to air pollution, relating them with air quality information and green pollutant concentration levels, and regular symptom recording were considered prerequisite parameters for ensuring the quality of APSM. The D-NOSES project [7] proved that the sense of smell of individuals can be calibrated through training on odor pollution and workshops exploring odor perception in the D-NOSE method.
In the study, we applied a user activity rank model, which showed that citizen activity decreased over time, which is typical for a CS project [112]. The level of citizen engagement and motivation was highest at the beginning of the crowdsourcing campaign (100 reports per day), dropping after 14 days. The two peaks in the last month of the field data collection campaign (35 and 40 reports per day, May 2018) could have resulted from the motivational workshops with an educator where citizens' rankings were discussed, thus increasing competition between the citizens. In summary, for a case study of Lublin or any city of similar size and structure, a citizen group larger than 74 people is needed, and regular workshops, as well as social media campaigns engaging people to participate directly in the CS group forum, are necessary to maintain their activity.
Due to the intuitive access and operation of the presented tools, such methods and tools are suitable for scientists, educators, and evaluators alike. Reducing data bias in real time is not possible without programmed web-based mechanism functionality. AGOL-configurable capabilities allow for data filtering, but the filters are too basic for the conditional statement combinations that form the QA mechanisms.
Conveniently, our database was set up on REST services, such that the QA mechanisms and their combinations could be implemented and analyzed using desktop software, in direct connection with AGOL apps, which ensured the stability of the REST service-based data source.

Conclusions
In summary, the APSM project was the first research in Europe that focused on assessing air pollution based on the health symptoms of citizens and their subjective well-being. This source of air pollution monitoring could be complementary to other methods. The highly subjective form of the data source could be burdened with data bias and specific errors. Therefore, it requires a QA framework for APSM projects, which was proposed and implemented in this study.
Of the five QA mechanisms employed, the most robust were those aimed at removing inconsistent user answers to questions that were intentionally repeated in the survey (i.e., the repeating QA mechanism). These results suggest that some of the methods employed might reduce user engagement, as some users were not consistent with their own answers within the same survey. This finding may be due to a natural phenomenon associated with the human condition or to a survey questionnaire that fails to engage users. Future surveys employing sCS as a data source might expect many haphazardly completed user surveys. Analyzing the QAm effectiveness results, we assume that the effectiveness of some QA mechanisms can be increased, and we recommend some modifications to the examined QA framework. The sequence mechanism should additionally be enhanced by a functionality measuring how much time it took to fill out the form, which would help capture user automatism in filling out the survey. Furthermore, the results provided evidence that APS assessment is much easier for citizens than identification of their SWB. This conclusion is confirmed by the robustness level of the repeating mechanism, which eliminated the most reports and was based on the repeated SWB question. Therefore, we recommend that researchers focus on the health symptom questions and not repeat SWB questions in the survey, as they could be too difficult for the citizens to verify. The screening question could also be developed in future research. Moreover, the QA method for APSM can be extended by considering the abundance and frequency of surveys submitted at the same or nearby geolocations within a short time window, a mechanism that could potentially improve the reliability of the collected data. If several citizens give very similar responses at close times and locations, these data should carry a higher level of trust than reports that are not confirmed by any surveys of other citizens.
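The proximity-confirmation idea sketched above could count, for each report, how many other citizens reported nearby in space and time. The thresholds, planar coordinates, and field names below are illustrative assumptions.

```python
import math

def confirmations(report, others, max_dist_m=200, max_dt_min=30):
    """Count other citizens' reports close to this one in space and time."""
    count = 0
    for o in others:
        if o["user"] == report["user"]:
            continue                              # self-reports do not confirm
        dt = abs(report["t_min"] - o["t_min"])
        dx = report["x_m"] - o["x_m"]
        dy = report["y_m"] - o["y_m"]
        # rough planar distance; adequate at city scale
        if dt <= max_dt_min and math.hypot(dx, dy) <= max_dist_m:
            count += 1
    return count
```

Reports with a higher confirmation count would then be assigned a higher trust level than unconfirmed ones.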
The focus of our research was not on validation with digital sensors but on the elimination of logically inconsistent answers and technologically incorrect objects. However, the APSM method can capture the moment when air pollution changes. The observed health symptom severity can be validated against air pollution concentrations measured by air-quality-monitoring stations. With information about whether citizens are diagnosed pollen allergy sufferers, collated with the current concentrations of pollen species, the chance of confirming the impact of air pollution on citizen SWB is higher. The completed database has potential for further research to test the thesis that citizens more sensitive to air pollution provide data of better quality than those who do not report any pollen allergy or other relevant preexisting conditions. Thus, people who are more sensitive to air pollution may be more interested in providing high-quality data than those who have no air-pollution-related health problems. However, understanding the mechanisms underlying citizen scientists' motivations requires further research.
Citizens, together with scientists, built a reliable model of the impact of air pollution on the well-being of citizens in their city. As a result of our research, we can confirm that not only QA mechanisms but also citizen activity are necessary for CS contribution to geospatial data quality improvement.
Due to the proposed QA framework, the data obtained with regard to the measured air pollution could be output as a spatial model of city well-being.

Acknowledgments: The authors would like to thank the Esri Poland team. We are also grateful to the editors and anonymous reviewers for their constructive comments and suggestions that helped to improve this manuscript.

Conflicts of Interest:
The authors declare no conflict of interest.

Acronyms

CS (citizen science): Citizen-driven research in which citizens (non-experts) participate by cooperating with researchers (Introduction, Section 1.2).

QAm (quality assurance mechanism): Conditional-statement-based mechanism for controlling data bias. Five data quality assurance mechanisms are proposed in this study (Introduction, Section 1.3; Materials and Methods, Section 2.2).

QAs (quality assurance system): Combination of data quality assurance mechanisms. In the study, we analyzed eight QAs variants, QAs1-QAs8, depending on their robustness levels (Materials and Methods, Section 2.2).

GeoWeb (geospatial web): Geographically related tools and web services for individuals and groups (Abstract; Introduction; Materials and Methods, Section 2.3).

SC, SC1, SC2 (start-check mechanism, variants 1 and 2): Verifies report quality at the beginning of the survey and controls report quality during the analysis of each symptom severity answer. Studied and proposed in two robustness variants (variant 1: less robust; variant 2: more robust) (Materials and Methods, Section 2.2).

Rp, Rp1, Rp2 (repeating mechanism, variants 1 and 2): Determines the quality of the report, according to other previously asked questions, by asking the same question in a different way. Studied and proposed in two robustness variants (variant 1: less robust; variant 2: more robust) (Materials and Methods, Section 2.2).

Sq (sequence mechanism): Applied to exclude user automatism in providing answers (Materials and Methods, Section 2.2).

CV, CV1, CV2 (cross-validation mechanism, variants 1 and 2): Rejects responses using three essentially related questions. Studied and proposed in two robustness variants (variant 1: less robust; variant 2: more robust) (Materials and Methods, Section 2.2).

Appendix A APSM-Dedicated GeoWeb Tools Supporting Quality Assurance
The tools for the APSM project were based on GeoWeb. We used ArcGIS platform components, available to the technologist as modular building blocks, which allowed the applications to be customized directly to the APSM assumptions and requirements. For the project, we configured a mobile app, and a set of web apps was publicly shared with citizens.

Appendix A.1 Mobile App for Crowdsourcing
The mobile app was based on Survey123 for ArcGIS components and is available at the public link https://arcg.is/0HWXrO. The survey consists of six information pages to facilitate its use and clear navigation ( Figure A1a). It is available in two languages, Polish and English ( Figure A1b). The app starts with an introduction containing a short user guide ( Figure A1c), which explains the research rules and how to use the survey app (page 1), followed by the user's basic information (nickname, four-digit code, and student/nonacademic status; Figure A1d), helping the users in the citizen group to control the data collection process (page 2). The next pages (3-5) include 12 APSM questions, which are completed with the user's geolocation and the date of the report (page 6). All obligatory questions are marked with a red star ( Figure A1e). The third page focuses only on the general well-being level of the citizens ( Figure A1f), which is the basis for the start-check mechanism. Then, the citizens answer questions about their individual symptoms using drop-down lists of answers ( Figure A1g). In the summary (page 5), the citizens specify their level of well-being, choosing from a star rating scale, where one star means the lowest and three stars indicate the highest level of well-being ( Figure A1h). Then, using a calculator-style widget, the users report how long they have stayed at their location and observed the symptoms; this value, expressed in minutes, is provided for question 11 ( Figure A1i). Question 11, regarding the length of the observation of symptoms, is configured in the app as relevant only when any APS are observed. If Q2-Q7 are answered as "no symptoms," then Q11 does not appear in the survey. On the last page of the app, a map widget is presented to mark the current location and date ( Figure A1j).
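The conditional visibility of Q11 described above can be sketched as a simple relevance predicate. The answer labels and question keys below are assumptions for illustration; in Survey123 this kind of rule is expressed as a "relevant" expression rather than application code:

```python
# Sketch of the Q11 relevance rule: the symptom-duration question is shown
# only when at least one of Q2-Q7 reports a symptom. Answer labels assumed.

def q11_relevant(answers):
    symptom_questions = ("Q2", "Q3", "Q4", "Q5", "Q6", "Q7")
    return any(answers.get(q) != "no symptoms" for q in symptom_questions)

no_symptom_report = {q: "no symptoms" for q in ("Q2", "Q3", "Q4", "Q5", "Q6", "Q7")}
print(q11_relevant(no_symptom_report))                      # False: Q11 hidden
print(q11_relevant({**no_symptom_report, "Q4": "mild"}))    # True: Q11 shown
```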
Here, app users are told that all reports with a horizontal positioning accuracy error greater than 100 m are automatically eliminated, as these values are considered GNSS positioning accuracy outliers. When completing the survey, the user can check the current location status at any time (i.e., latitude, longitude, and horizontal accuracy; Figure A1k). The default date is set to the current date. The geolocation defaults to the current GPS location of the user, as well. When the survey is completed, a bottom-right submit tick is made active, and the report is ready to send to the cloud geodatabase.

The first app-Introduction to the project ( Figure A3)-is based on the Esri Story Map Cascade template, which is used for building narrative web mapping apps by combining images, maps, and multimedia context with narrative text (https://storymaps-classic.arcgis.com/en/app-list/cascade/). The application has an educational function for the citizens involved. It provides educational materials about air pollution and the APSM project idea, as well as extended mobile and web app tutorials and technical knowledge. Button number 2 links to a web version of the Survey123-based application for data collection.
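The 100 m horizontal accuracy threshold described earlier in this appendix amounts to a simple filter over incoming reports. A minimal sketch, assuming a hypothetical `horizontal_accuracy_m` field on each report record:

```python
# Sketch of the GNSS accuracy filter: reports whose horizontal positioning
# accuracy error exceeds 100 m are treated as outliers and dropped.
# The report structure and field name are illustrative assumptions.

def accept_by_accuracy(reports, max_horizontal_error_m=100.0):
    return [r for r in reports
            if r["horizontal_accuracy_m"] <= max_horizontal_error_m]

reports = [{"id": 1, "horizontal_accuracy_m": 12.5},
           {"id": 2, "horizontal_accuracy_m": 250.0}]
print([r["id"] for r in accept_by_accuracy(reports)])  # [1]
```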
Application 3 presents the raw data collected before the QA process and contains the operations-dashboard-based interface, which consists of five modules: a map with the raw-data APSM reports collected during the crowdsourcing campaign ( Figure A4a); a legend ( Figure A4b); an indicator counting the total number of reports ( Figure A4c); a histogram of the citizen activity from the beginning of the crowdsourcing campaign to the current moment ( Figure A4d), which changes dynamically, according to the map; and a citizen activity ranking, divided for each month and cumulatively ( Figure A4e), as described in Section 2.1.
The APSM result application (number 4) is a six-module app that presents the results of the APSM data checked with QAs 8, the most robust variant of the QA system. The application includes a map ( Figure A5a) with eligible symptom-related layers (Q1-Q8; Figure A5b), a legend for the map ( Figure A5c), a pie chart for the severity of each symptom (Q2-Q8) indicated with question-related bookmarks ( Figure A5d), a bar chart presenting the level of general citizen comfort ( Figure A5e), a pie chart presenting the percentage robustness of the QA system ( Figure A5f), and an indicator counting all the QA-checked reports ( Figure A5g).
The time slider app (number 5) is based on the Esri Time Aware configurable template. It includes a map of point-symbolized QA-checked reports accompanied by a time slider tool, which displays the increase in collected data over the entire duration of the crowdsourcing campaign. The time slider can move automatically (with a play button) or can be moved manually to the required date ( Figure A6). Figure A6. Time slider app following the data collection process over time.