Mobile Apps to Fight the COVID-19 Crisis

Abstract: The COVID-19 pandemic led to a multi-faceted global crisis, which triggered the diverse and quickly emerging use of old and new digital tools. We have developed a multi-channel approach for the monitoring and analysis of a subset of such tools, the COVID-19 related mobile applications (apps). Our approach builds on the information available in the two most prominent app stores (i.e., Google Play for Android-powered devices and Apple’s App Store for iOS-powered devices), as well as on relevant tweets and digital media outlets. The dataset presented here is one of the outcomes of this approach: it reuses and enriches the content of the app stores, providing aggregated information about 837 mobile apps published across the world to fight the COVID-19 crisis. This information includes: (a) information available in the mobile app stores between 20 April 2020 and 2 August 2020; (b) complementary information obtained from manual analysis performed until mid-September 2020; and (c) status information about app availability on 28 February 2021, when we last collected data from the mobile app stores. We highlight our findings with a series of descriptives, which depict both the activities in the app stores and the qualitative information that was revealed by the manual analysis. License: CC BY 4.0.


Summary
The COVID-19 pandemic led to a multi-faceted global crisis, which triggered the diverse and quickly emerging use of old and new digital tools. The release of digital tools (including mobile applications, also known as apps) to help mitigate the COVID-19 pandemic almost immediately caused a lively public debate, since many stakeholders questioned the efficiency of the tools and addressed the risks introduced regarding privacy and personal data protection [1][2][3]. As a consequence, privacy, personal data protection, use of data for social good [4] and trust from the citizens were the next concerns. To address the latter, relevant work on COVID-19 apps was conducted by a few groups, resulting in app classifications based on different criteria. An example is the survey on frameworks and mobile apps presented in [5], which examines both mobile apps and the underlying infrastructure from a technical point of view, focusing on cybersecurity and privacy and classifying the frameworks based on the privacy-preserving Bluetooth technology as centralised, decentralised and hybrid. In addition, several classifications of app functionality have also emerged, e.g., [1,2,6].
We have developed an approach for the monitoring and analysis of the COVID-19 related mobile apps [7], which builds on the information available in the two most prominent app stores (i.e., Google Play for Android-powered devices and Apple's App Store for iOS-powered devices), as well as on relevant tweets and news items. Each of the three channels (i.e., app stores, tweets and media outlets) is processed separately, but we occasionally benefit from interconnections.
The dataset presented here is one of the outcomes of this approach, resulting from the separate processing of the information coming from our primary information sources, i.e., the app stores. We have reused the relevant app store content, harmonised and enriched it, to derive information about 837 mobile apps published across the world in the attempt to fight the COVID-19 crisis. The dataset includes: (a) information available in the mobile app stores (Google Play and App Store); (b) complementary information obtained from manual analysis; and (c) status information regarding recent app availability in the stores.
In the period from 20 April 2020 to 2 August 2020, we monitored mobile app releases and updates related to the COVID-19 pandemic in both Google Play and the App Store. The relevant app store contents were analysed daily, and the data retrieved were processed once a week; this way, we developed a database of relevant mobile apps together with the metadata describing them. This information was then complemented with the results of the manual analysis that we performed, in order to derive some descriptives on the COVID-19 mobile apps landscape.
The remainder of this paper is organised as follows. The description of the dataset is provided in Section 2, and the methodology that we followed to collect the data is outlined in Section 3. Our concluding remarks together with further notes informing the adequate usage of the dataset are provided in Section 4.

Data Description
In this section, we describe in detail our dataset, which provides information on the mobile apps that emerged during the first wave of the pandemic across the world to assist in fighting the COVID-19 crisis. The dataset comprises 837 records: each record corresponds to an app found in either of the two stores; 249 apps have been found in both stores, so there are 588 "unique" apps in total. The dataset is described using 38 attributes that encode: (a) information about the apps available in the mobile app stores between 20 April 2020 and 2 August 2020; (b) complementary information obtained from manual analysis performed until mid-September 2020; and (c) status information about app availability on 28 February 2021, when we last visited the mobile app stores. The above information is captured in the attributes outlined below and further detailed in Table 1.
Information available in the mobile app stores. This information has been retrieved, using the publicly available Application Programming Interfaces (APIs), from the respective mobile app stores for the apps that were identified as COVID-19 related. For every app, the information includes its name and version, the name of the store it has been found in (storeName), the date of the app release (releaseDate) and the date of the latest app update that we took into account for our analysis (latestUpdate).
Information obtained from manual analysis. A team of six researchers, among the coauthors of this paper, added this information in a coordinated manner based on the app descriptions available on the app stores (or their English translations, for the apps with descriptions available in other languages). In case the app descriptions were not informative enough, we also looked at the user reviews and integrated information available on the websites of the apps (provided in the stores), but the apps were neither installed nor tested. The latter was a necessary choice because the goal was to provide a high-level analysis of the overall landscape of mobile apps and because many of the apps were only available to be downloaded in the countries in which they were released. The information derived at this step includes:
• An indication (yes/no) whether the app was considered interesting for more in-depth analysis (interestingApp). Our selection criteria included the novelty of the app at the technological and governance level, the sophisticated use and generation of data, and the potential for fighting the pandemic. We evaluated as "interesting" all contact tracing apps, apps with more sophisticated functions for personal data sharing and exchange (compared to simple symptom trackers), as well as the ones with innovative features and/or technical solutions (such as decentralised data management).
• The type of organisation of the app provider (providerCategory). The attribute providerCategory may take the values non-profit organisation, community, technology company, health company, other type of company, university(ies) or research centre(s), international organisation, local/regional government or national government.
• An indication whether a health authority is/was involved in the development/distribution of the app and/or using data collected with the app (healthEntityInvolved). This attribute is not present for apps that were not considered interesting for further analysis. The attribute healthEntityInvolved may take the values health authority, health facilities, medical research institute(s) university(ies), consortium, no or na.
• An indication (yes/no), in the (EU) attribute, whether the app is/was released in a European Union (EU) country.
• The ISO 3166-1 alpha-3 code of the country the app comes from (geographicCoverage).
• The continent the app comes from (continents).
• Category of the app (appCategory), according to the framework of app functionalities presented in Table 2. The attribute appCategory may take the values: (a) COVID-19 specific, for apps that were specifically developed to face the COVID-19 crisis; (b) COVID-19 influenced, for apps that existed before the COVID-19 crisis but were modified to help address it; (c) health-generic, for apps with generic health-related functionality that can also help with the COVID-19 crisis; and (d) other, for apps outside the health domain that can also help with the COVID-19 crisis.
• Category of the app functionality (appFunctionalityCategory), according to the framework of app functionalities presented in Table 2. The attribute appFunctionalityCategory may take the values: (a) expert support, for apps that were developed for specific experts (such as, for example, medical staff members) to help them carry out dedicated COVID-19 related duties; (b) information provision, for apps that are essentially one-directional channels of information about COVID-19; (c) personalised support, for apps that provide COVID-19 related personalised support (such as, for example, symptom trackers) without sharing collected data with third parties; (d) information exchange, for apps that provide bi-directional information exchange for COVID-19 related personalised support (i.e., including data sharing with third parties); (e) contact tracing, for apps that allow for identifying persons who may have been in contact with an individual infected with COVID-19; (f) notifications, for apps that offer COVID-19 related personal notification functionalities with or without having previously received personal data; (g) lockdown management, for apps that support lockdown management with functionalities such as mobility checking, exit management, etc.; (h) other-health, for apps that offer other health-related functionalities; and (i) other, for any other app that helps address the COVID-19 crisis.
• The app functionalities (appFunctionality), specifying the actual app functionalities according to the framework of app functionalities presented in Table 2 and in compliance with the app functionality categories presented there. The attribute is not present for apps that were not considered interesting for further analysis.
• Types of personal data collected by the app (typesOfPersonalData), with possible values equal to Proximity, Location as province/region, Location as GPS/cell tower data, Health status, Positive status and Other. The attribute is not present for apps that have not been considered interesting for further analysis.
• An indication (yes/no/na) whether the app clearly communicates what information it collects and how it processes it, explains how to request data deletion or informs about a clear window for data retention (clearPrivacyPolicy). Other key information for a clear privacy policy includes the privacy statement explaining that personal data collection is limited to what is necessary and that data is shared only upon explicit user consent (in the spirit of the General Data Protection Regulation, GDPR [8]). The attribute is not present for apps that have not been considered interesting for further analysis.
Status information. This is an indication whether a mobile app was still active on 28 February 2021 (i.e., the date of our last visit to the mobile app stores), which is captured in the status attribute.
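The controlled vocabularies listed above lend themselves to a simple consistency check when reusing the dataset. The following Python sketch is illustrative only: the attribute names and values are taken from the description above, but the exact spelling and capitalisation of the values in the published CSV file are an assumption, and the function name invalid_attributes is hypothetical.

```python
# Allowed values per attribute, as listed in the data description.
# Assumption: the published CSV uses exactly these spellings.
ALLOWED_VALUES = {
    "appCategory": {
        "COVID-19 specific", "COVID-19 influenced", "health-generic", "other",
    },
    "appFunctionalityCategory": {
        "expert support", "information provision", "personalised support",
        "information exchange", "contact tracing", "notifications",
        "lockdown management", "other-health", "other",
    },
    "providerCategory": {
        "non-profit organisation", "community", "technology company",
        "health company", "other type of company",
        "university(ies) or research centre(s)", "international organisation",
        "local/regional government", "national government",
    },
}

# Attributes that are only filled in for apps flagged as interesting.
INTERESTING_ONLY = (
    "healthEntityInvolved", "appFunctionality",
    "typesOfPersonalData", "clearPrivacyPolicy",
)


def invalid_attributes(record):
    """Return the names of attributes that contradict the description.

    An attribute is reported when its value is outside the documented
    vocabulary, or when it is filled in although the app was not
    considered interesting (interestingApp == "no").
    """
    problems = []
    for name, allowed in ALLOWED_VALUES.items():
        value = record.get(name)
        if value is not None and value not in allowed:
            problems.append(name)
    if record.get("interestingApp") == "no":
        for name in INTERESTING_ONLY:
            if record.get(name):
                problems.append(name)
    return problems
```

A record read from the CSV file (for example with csv.DictReader) can be passed directly to the function; an empty list means that no inconsistency was detected.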

Methods
The workflow for creating the dataset comprises four processes: (a) Store monitoring; (b) Information harmonisation; (c) Information aggregation; and (d) Information enrichment. The data flow in these processes is presented in Figure 1 using the Input-Process-Output (IPO) pattern (https://en.wikipedia.org/wiki/IPO_model, accessed on 9 August 2021). As shown in Figure 1, the outputs of a process are inputs for the next one. Details on the processes are provided in the following paragraphs.
Store monitoring looks up Google Play and the App Store weekly, in an automated way, for COVID-19 related apps and stores the search results in JSON format. To retrieve the relevant information from the app stores, we used semi-automatically selected keywords (terms): first, we manually searched for COVID-19 related apps in the Italian country app store; we then went through the search results and identified the most appropriate combination of keywords for retrieving the apps of interest in every country store. We ended up with the following keywords: coronavirus, corona, covid, sars-cov-2, symptoms, track, virus, social isolation and self-diagnosis. We retrieved the app metadata that are made available through APIs by the stores if at least one of these keywords (case-insensitive) appears in: (a) the brief description or title of the mobile app in Google Play and (b) the app name, app subtitle, app description, app keywords or app reviews in the App Store.
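The keyword rule described above can be sketched in a few lines of Python. This is not the code we ran (the matching was carried out through the store search facilities and FME); the field names in SEARCHED_FIELDS are hypothetical placeholders for the store-specific metadata fields, and plain substring matching is an assumption about how "appears in" is interpreted.

```python
# Keywords listed in the text above.
KEYWORDS = (
    "coronavirus", "corona", "covid", "sars-cov-2", "symptoms",
    "track", "virus", "social isolation", "self-diagnosis",
)

# Hypothetical field names standing in for the metadata fields that are
# searched in each store, following rules (a) and (b) above.
SEARCHED_FIELDS = {
    "Google Play": ("title", "summary"),
    "App Store": ("name", "subtitle", "description", "keywords", "reviews"),
}


def matched_keywords(app, store):
    """Return the keywords found (case-insensitively) in the searched fields."""
    text = " ".join(str(app.get(field, "")) for field in SEARCHED_FIELDS[store])
    text = text.lower()
    # Assumption: a keyword matches as a substring, so "track" also
    # matches "tracker" and "corona" also matches "coronavirus".
    return [keyword for keyword in KEYWORDS if keyword in text]


def is_covid_related(app, store):
    """An app is retained if at least one keyword matches."""
    return bool(matched_keywords(app, store))
```

Substring matching is deliberately permissive; it favours recall over precision, which fits a workflow in which the retrieved apps are subsequently reviewed manually.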
Information harmonisation takes the JSON files resulting from store monitoring as input and automatically transforms them into a single overview table with common attributes. An Extract-Transform-Load (ETL) procedure is employed for the transformation, since the attribute names in Google Play and the App Store may differ, although they are often semantically equivalent. The ETL procedure also checks if an app is available in both app stores.
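A minimal Python sketch of such a schema-level harmonisation is given below. It is an illustration under stated assumptions rather than a transcript of the FME workspace: the store-specific attribute names in FIELD_MAP are assumed examples of what the two sources may return, and matching apps across stores by normalised name is a simplification chosen for the example.

```python
# Assumed mapping from store-specific attribute names to the common
# attributes of the overview table (name, version, releaseDate, latestUpdate).
FIELD_MAP = {
    "Google Play": {
        "title": "name",
        "version": "version",
        "released": "releaseDate",
        "updated": "latestUpdate",
    },
    "App Store": {
        "trackName": "name",
        "version": "version",
        "releaseDate": "releaseDate",
        "currentVersionReleaseDate": "latestUpdate",
    },
}


def harmonise(raw, store):
    """Translate one store-specific JSON record into the common schema."""
    record = {"storeName": store}
    for source_key, common_key in FIELD_MAP[store].items():
        record[common_key] = raw.get(source_key)
    return record


def names_in_both_stores(records):
    """Return the normalised names that occur in both stores.

    Assumption: apps are matched on their lower-cased, stripped name.
    """
    stores_by_name = {}
    for record in records:
        key = (record["name"] or "").strip().lower()
        stores_by_name.setdefault(key, set()).add(record["storeName"])
    return {name for name, stores in stores_by_name.items() if len(stores) > 1}
```

Name-based matching is fragile when the same app is published under slightly different names in the two stores, so any such automatic check benefits from manual verification.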
Information aggregation uses an ETL procedure to update the joint overview table, which keeps the aggregated information from the app store monitoring, with the overview table created by the information harmonisation process; it then automatically creates an analysis table including all the information available about COVID-19 related apps. In addition to the aggregated information from the overview tables that are created every week, the joint overview table also stores the timestamps of the release dates and the available versions of the apps, thus allowing us to create timelines for both app releases and app updates.
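The idea of accumulating weekly overview tables into a joint table that supports release and update timelines can be sketched as follows. The data structure, the function name update_joint_table and the use of the (storeName, name) pair as key are assumptions made for this illustration; they do not describe the internals of our ETL procedure.

```python
def update_joint_table(joint, weekly_overview, snapshot_date):
    """Merge one weekly overview table into the joint overview table.

    joint maps (storeName, name) to a dictionary holding the release date
    of the app and, for every version observed so far, the date of the
    snapshot in which that version was first seen.
    """
    for record in weekly_overview:
        key = (record["storeName"], record["name"])
        entry = joint.setdefault(
            key, {"releaseDate": record["releaseDate"], "versions": {}}
        )
        # Keep the first snapshot date per version; later sightings of the
        # same version do not overwrite it.
        entry["versions"].setdefault(record["version"], snapshot_date)
    return joint


def update_timeline(joint, key):
    """Return (snapshot_date, version) pairs for one app, oldest first."""
    versions = joint[key]["versions"]
    return sorted((seen, version) for version, seen in versions.items())
```

With ISO 8601 date strings, the lexicographic sort in update_timeline coincides with chronological order.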
Information enrichment was manually conducted by a team of six researchers to enrich the information available in the analysis table created by the information aggregation process, thus resulting in the dataset on COVID-19 related apps and a set of descriptives. The analysis was conducted on top of the app descriptions retrieved from the stores.
At the technical level, we used the iTunes Search API [9] to identify app descriptions and retrieve the relevant data from the App Store, and the Node.js [10] module "googleplay-scraper" [11] for Google Play. Data integration and analysis were performed using the ETL tool Feature Manipulation Engine (FME) [12], which also launches all the search APIs and lookup searches daily. In particular, we used the FME Workbench application of FME Desktop 2019.2.1.
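For readers who wish to reproduce a single App Store query outside FME, the following Python sketch builds and issues a request to the public iTunes Search API. The parameter choices (entity set to software, a per-country query, the limit value) are assumptions made for this example and not necessarily the settings used in our workflow; the function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

SEARCH_ENDPOINT = "https://itunes.apple.com/search"


def build_search_url(term, country, limit=200):
    """Build a search URL for apps matching a keyword in one country store."""
    params = {
        "term": term,          # one of the search keywords
        "country": country,    # two-letter country code of the store
        "entity": "software",  # restrict the search to apps
        "limit": limit,
    }
    return SEARCH_ENDPOINT + "?" + urlencode(params)


def search_app_store(term, country):
    """Issue the request and return the list of result records.

    Requires network access; the JSON response carries the matching
    records under the "results" key.
    """
    with urlopen(build_search_url(term, country), timeout=30) as response:
        return json.load(response).get("results", [])
```

For example, build_search_url("covid", "it") yields a URL that can be inspected or fetched manually before any automated run is scheduled.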

User Notes and Conclusions
The collected dataset allowed us to analyse different temporal and spatial characteristics of the rapidly emerging phenomenon of mobile apps created to fight the COVID-19 crisis. Potential reusers can benefit from the dataset to investigate different aspects of COVID-19 related mobile apps, such as (i) the interplay between public and private actors in the governance of the apps, (ii) the implicit signals for the importance of the data protection legislation for the governance of data and (iii) how governments attempted to harness novel technologies in fighting the pandemic.
The dataset is publicly available in the CSV (Comma-Separated Values) format, through the Joint Research Centre (JRC) Data Catalogue [13], under the CC BY 4.0 licence. We also published, on our corporate GitHub space [14], a set of descriptives based on our analysis. The navigation page of this space is presented in Figure 2 and leads to four sections: (a) Basic information, which presents statistics on the basic information on COVID-19 related mobile apps; (b) Geospatial information, which presents statistics on the geospatial distribution of COVID-19 related mobile apps; (c) Basic descriptives, which presents statistics on the basic descriptives about COVID-19 related mobile apps, based on the information resulting from the dataset analysis; and (d) Composite descriptives, which presents statistics on the composite descriptives about COVID-19 related mobile apps, based on the information resulting from the dataset analysis.
Figure 2. Navigation page of the GitHub space [14] with the descriptives resulting from the dataset analysis, leading to four sections: Basic information; Geospatial information; Basic descriptives; and Composite descriptives.
Our early findings from the analysis of COVID-19 related apps include:
• The two app stores we monitored (App Store and Google Play) describe the apps differently, using different attributes. Thus, we employed ETL procedures to get a harmonised view of the apps in both stores, with common attributes (i.e., schema-level harmonisation). We noticed, though, that in many cases the same app could have different (although similar) names in the two stores and/or be described in a different way, with the common attributes having different values. Since this observation was frequent, we preferred not to proceed to a data-level harmonisation and kept the apps published in both stores as two different records. Figure 3 shows the numbers of apps published in both the stores (249 apps), in App Store only (359 apps) and in Google Play only (229 apps).
• Results from automatic (objective) processes can be combined with the findings of manual (subjective) work. However, to harmonise the results coming from inherently different viewpoints and approaches, we relied on cross-validation (peer reviews) of analysis results by more than one researcher. Furthermore, meetings to discuss the definition of the attributes, held while the analysis was ongoing, were very useful to make sure there was a common understanding of the classification approach.
• Although the first COVID-19 cases started in China in early December 2019, followed by the first cases in Europe in the second half of January 2020, with the World Health Organisation (WHO) declaring the global health emergency on 30 January 2020 [15], the emergence of COVID-19 related apps only accelerated in April 2020 (see Figure 4 for the timeline of new COVID-19 related app releases in Google Play and App Store).
• The COVID-19 crisis led to the development of many different apps, many more than the contact tracing apps that were prominently present in the news and on social media. We now have a rich landscape of COVID-19 related apps available, and they are highly diverse across countries. Different clusters of functionalities can be spotted, but multiple practices emerge in terms of addressing quarantine management, checking mobility, etc.
• The public sector, i.e., local, regional and national governments, is by far the main provider of COVID-19 related apps, as shown in Figure 5 (although most of the time these apps were formally developed by companies contracted by the governments).
• Sharing of personal data often seems not to be regulated by clear privacy policies, especially for apps released in countries outside the EU. This was indicated when analysing the text provided in the app privacy policy. Further analysis on this dimension could be performed by installing the apps and checking/completing the attributes from a user experience perspective.
• Almost every country, especially in the EU, adopted its own contact tracing app. Most of these apps are based on Bluetooth technology to exchange data in a fully anonymous and privacy-respectful way.
• The geographic distribution of offers is highly diverse. Some countries (such as India, Brazil and the USA) provide a high number of apps, where besides apps released on a national scale, there are also apps with similar functionalities released by different cities or regions (hence functioning on a more local scale). We did not spot this general trend in European countries.
• The functionalities that the COVID-19 apps provide change over time. Whereas many apps focused early on on information provision about regional situations and on training (e.g., how to wash hands), we witnessed a peak of contact tracing apps over the summer of 2020, followed by an increase in apps that support the return to school or work, or online records of test results. Logically, the countries that were hit earlier by the crisis underwent this evolution earlier than those that were affected later.
• We also observe differences in the geographic distribution of app availability, which might be explained by the disparity in the popularity of iOS/Android per country.
As already mentioned, this dataset is one of the outcomes of our approach, with the first being the technical report published by the end of 2020 [7] and further scientific publications envisaged. We hope that this dataset might also inspire others in researching the use of mobile apps to stand up against crises, technology for public good and related fields.