1. Introduction
Nowadays, smartphones and smart wearables are becoming more prevalent and powerful. One of the areas that saw considerable growth with mobile devices is the fitness industry Silva et al. [
1], Yeoh et al. [
2]. Wearable devices such as smartbands are oriented toward fitness, as they pack a significant set of sensors and features, such as a heart rate monitor, step counter, Oxygen Saturation (SpO2
), and sleep monitoring, to name just a few. The functionality of wearables often depends on, and is augmented by, smartphones: for example, a smartband can be paired with a dedicated mobile application over Bluetooth Low Energy (BLE). This companion application interfaces between the device and the smartphone, providing services such as data visualization, geolocation, Internet access for cloud storage and firmware updates. Another crucial role of companion applications is to transfer fitness data from the wearable to the phone, which subsequently uploads these data to the application’s cloud server. Frequently, a companion application is designed by a fitness brand and is compatible exclusively with wearables produced by that brand. An example is Garmin Connect (
https://play.google.com/store/apps/details?id=com.garmin.android.apps.connectmobile (accessed on 8 August 2024)), which is the companion application for Garmin devices such as smartbands and smartwatches. Another example is the Fitbit application, which interacts with Fitbit fitness devices. Furthermore, these applications extend their reach and functionality by providing APIs that can be used by independent fitness applications to send/receive data from the associated companion application. For example, through the proper Application Programming Interface (API), the diet application MyFitnessPal (
https://play.google.com/store/apps/details?id=com.myfitnesspal.android (accessed on 8 August 2024)) can receive the number of calories burned from companion applications, providing in return the number of calories ingested by the user. The Strava (
https://play.google.com/store/apps/details?id=com.strava (accessed on 8 August 2024)) application is another example, as it can synchronize data from a running activity tracked by a Garmin smartband.
As they often couple precise GPS-based locations with date/time, fitness applications can provide valuable digital forensic data, allowing one to locate the whereabouts of the device bearer at a given date and time [
3]. Additionally, if other metrics are available, such as heart rate, it is possible to infer the activity state of the device bearer: idle, normal or high. This might prove of paramount importance in criminal and fraud investigations [
4].
A real-world illustration of how smartwatch data can assist a criminal investigation is a murder case in which the victim’s Fitbit data played a crucial role in securing the husband’s conviction: the Fitbit timeline demonstrated movements within the house and the distance travelled, conclusively placing the husband at the crime scene and refuting his alibi [
5]. In 2015, the police used GPS coordinates and step pace stored in a Garmin smartwatch to help bring charges against a man for a double homicide, correlating the time/date of the homicides with the stored coordinates to reconstruct the suspect’s escape route [
6]. Another publicly known case is the use of Strava in running mode to identify and convict a man of hitting and injuring a cyclist in Virginia, USA [
7]. A final example is a case in which Apple Health data were used to convict a man of rape and murder: the police seized the man’s iPhone, extracted the data and correlated them with the time and location of the murder [
8]. Therefore, fitness applications can provide valuable data in digital forensics as they store a wealth of private information related to the user in specific companion applications such as Garmin Connect [
9]. They can store health data, GPS data, sleep data and more. Running applications are another type of application worth analyzing. They focus on outdoor running and other types of physical activity. Usually, these applications also function as fitness-tracking social networks, where users connect with one another and share their running activities, sometimes to compare performances. Unlike companion applications, they do not require the user to wear a smartwatch or smartband, as a smartphone with GPS and step counter capabilities is sufficient. As running applications store data such as GPS coordinates and the date/time of run workouts, they can be instrumental in digital forensic analysis. Because of all their benefits, fitness applications have grown in number and in users in recent years. In 2019, the major app stores featured over 350,000 healthcare and fitness applications, resulting in an annual download count of 3.7 billion [
10]. This prevalence increases the likelihood of their relevance in a criminal investigation.
The high volume makes these types of applications a target for cybercriminals intending to steal valuable private data related to the applications’ users. Since these applications handle private data, security and privacy should be their top priority, considering the number of regulations and legislation that they must follow. However, this is not always the case. Various studies have shown failures and shortcomings in the security and privacy of these applications. Scott et al. [
11] studied fitness applications, showing that most of them store data in plain text and do not encrypt their communications. These weaknesses make health and fitness apps a target for attacks that could leak the data of millions of users.
Another issue is the poor privacy features implemented in some of these applications, which could be exploited by users with malicious intent, such as stalking another person by studying their GPS activities. A famous example is the Strava application, where a man used it to stalk his ex-partner [
12]. In 2022, it was revealed that Strava harbored a security vulnerability capable of enabling user tracking, even for the ones who had set the strictest privacy settings allowed by the application. The failure could be exploited by uploading fake running segments, allowing malicious users to learn the identities and past routes of Strava’s users in the area [
13]. In recent years, applications claim to have significantly evolved in security and privacy concerns by adding various privacy features, often due to discovered failures. However, some of these problems remain, and many applications still store a large amount of sensitive data, making them vulnerable to data leaks.
In this work, we conduct a forensic analysis of six running applications. As demonstrated later, Android applications were selected due to their widespread popularity, indicated by download numbers in the millions. All running applications except Strava are linked to well-known and popular sportswear brands. Strava, while not affiliated with a sportswear brand, is a highly regarded sports application. It allows users to record workouts and share their achievements on social networks, and it is favored by top professional runners and cyclists, contributing significantly to its popularity Couture [
14].
The main contributions of this work are (i) the analysis and findings of forensic artifacts in post mortem scenarios for the Android version of the studied applications; (ii) the development of 12 modules for the framework Android Logs Events And Protobuf Parser (ALEAPP) to ease the extraction of forensic artifacts and creation of reports for further analysis; (iii) two new functionalities incorporated into ALEAPP: the capability to handle Flexible and Interoperable data Transfer (FIT) files and the introduction of a timeline plugin.
The remainder of this paper is organized as follows.
Section 2 discusses related work while
Section 3 describes the materials and methods of this study.
Section 4 studies the
Nike Run Club application, explaining the process and the main analysis tools. In
Section 5, we focus on the peculiar analysis of
Strava, as the results and artifacts differ from the other applications. In
Section 6, we present the methodology gathered from the
Nike Run Club and apply it to the other studied applications.
Section 7 focuses on our open-source Python 3 modules developed for ALEAPP and the added
timeline feature. Finally,
Section 8 concludes the paper.
2. Related Work
We begin by defining two important concepts in digital forensics—
forensic artifacts and
data extraction—before examining related work.
Forensic artifacts are data elements that provide reliable evidence to support or refute hypotheses about user activities, system operations or network communications on digital devices.
Data extraction, within the context of digital forensics, involves retrieving data from digital devices in a sound manner, maintaining the integrity and authenticity of data, a crucial step in the preservation phase of digital investigations [
3].
Scott et al. [
11] examined 20 health apps, primarily concentrating on security and privacy aspects but not targeting specific apps under our scope. While the analysis may lack relevance due to subsequent updates received by the applications, their methods remain pertinent for current post mortem analysis.
Hassenfeldt et al. [
4] analyzed nine Android fitness apps. Some of these applications— Runkeeper (from Asics), Strava and Runtastic (now designated
Adidas Running)—are also the focus of our work. The authors created their testing environment by collecting data and extracting them from the smartphone to the forensic computer through Android Debug Bridge (ADB) and the commercial XRY (
https://www.msab.com/product/xry-extract (accessed on 8 August 2024)) tool. In their analysis, they found (
i) account data, (
ii) personal information and (
iii) GPS data. The authors also developed a Python tool to extract potential artifacts from the Extensible Markup Language (XML) and database files of the applications’ private directory.
Sinha et al. [
15] performed forensic analysis of six fitness applications with an interesting focus on the application
Nike Training Club, from the same developers as
Nike Run Club, and
MapMyFitness, whose developers also developed
MapMyWalk, two applications studied in our research. The data found for each application are quite similar: (
i) user profile data, (
ii) health and exercise data, (
iii) captured device data and (
iv) GPS data.
Hutchinson et al. [
16] performed a forensic analysis of three companion applications using different devices:
Amazon Halo,
Garmin Connect and
Mobvoi. Their comprehensive analysis unveiled the following forensic artifacts: (
i) health data; (
ii) profile information; (
iii) phone notifications; (
iv) exercise data; (
v) GPS data; (
vi) steps data. The in-depth study enabled us to grasp how fellow analysts establish their data collection and analysis environments.
Donaire-Calleja et al. [
17] studied the forensic analysis of wearable devices, specifically smartwatches. The paper focuses on the challenges that forensic analysts face in this area, such as the lack of standardized procedures and the use of private communication protocols.
In prior research [
9], we analyzed the Garmin Connect application paired with the Garmin Vivosmart 4 smartband. Our developed open-source tools can extract several artifacts, such as (
i) daily summary data; (
ii) GPS data; (
iii) response cache data; (
iv) network logs; (
v) Facebook API tokens; (
vi) device synchronization cache; and (
vii)
reading charts. These results motivated us to delve deeper into Android applications, focusing on applications to monitor running activities.
Several papers study wearable and smartphone accuracy in physical activity measurements such as step count and walking distances [
18,
19,
20]. van Zandwijk and Boztas [
21] reported that the iPhone has a low 2% error in step count reporting but can diverge up to 40% for walking distance measurement. Goh et al. [
22] documented that smartphones overestimated step counts when assessed on a 3-day free-living condition, with precision affected by factors such as walking style and phone-wearing location. Precision assessment is beyond the scope of this work.
To summarize this section,
Table 1 compares our work with the studies presented above, focusing on the goals of the analysis, the target applications and the major findings.
4. Post Mortem Analysis
This section presents the post mortem analysis of each application. This kind of analysis focuses on extracting all data generated by the applications (after one month of usage) and analyzing them on a dedicated computer. After generating the data, we followed a structured approach to retrieve the information from the device.
By default, application-generated data are stored in the internal storage, which remains private to the application and inaccessible to other apps or users unless root access is granted. As implied by its name, internal storage is a suitable location for storing application data that users do not directly interact with, such as database files, application logs and related data. Also, Android devices support a shared “external storage” space where developers can save files. Files saved to external storage are accessible and modifiable by the user when they enable USB mass storage to transfer files to a computer. Forensic practitioners aim to extract data stored in both the internal and, if available, external storage systems. The extraction process can be accomplished using commercial forensic tools, like Cellebrite (
https://cellebrite.com/ (accessed on 8 August 2024)). However, because our device was rooted, we utilized ADB. ADB, or Android Debug Bridge, is a command-line tool for communicating with Android devices connected to a computer via USB or wirelessly (since Android 11). ADB facilitates various device actions, such as app installation and debugging, and provides access to a Unix shell for executing commands on the device (
https://developer.android.com/studio/command-line/adb (accessed on 8 August 2024)). The procedure for extracting data using ADB is the same across all studied applications:
Access the device via ADB;
Navigate to the public or private directory (the latter requires root privileges);
Locate the application folder;
Archive and compress the folder, storing it in external storage;
Transfer the archived data from the device to the analyst’s workstation.
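The five steps above can be sketched in Python as follows. This is a minimal illustration and not the authors' ADB-Extractor: the package name, the archive location and the storage roots /data/data and /sdcard/Android/data are assumptions (the usual Android defaults), and the sketch assumes that su and tar are available on the device.

```python
import shlex
import subprocess

# Assumed default locations; they may differ on some devices.
PRIVATE_ROOT = "/data/data"
PUBLIC_ROOT = "/sdcard/Android/data"


def build_extraction_commands(package, private=True,
                              device_archive="/sdcard/extraction.tar.gz"):
    """Return the adb command lines for the five steps, without running them."""
    root = PRIVATE_ROOT if private else PUBLIC_ROOT
    # Steps 2-4: archive and compress the application folder into external storage.
    tar_cmd = f"tar -czf {device_archive} -C {root} {package}"
    if private:
        # The private directory is only readable as root.
        tar_cmd = f"su -c {shlex.quote(tar_cmd)}"
    local_name = f"{package}_{'private' if private else 'public'}.tar.gz"
    return [
        ["adb", "shell", tar_cmd],                    # steps 1-4
        ["adb", "pull", device_archive, local_name],  # step 5
    ]


def extract(package, private=True):
    """Run the commands against the device currently attached to adb."""
    for command in build_extraction_commands(package, private):
        subprocess.run(command, check=True)
```

For instance, `extract("com.example.runningapp")` (a hypothetical package name) would archive the private folder as root and pull it to the workstation. Building the command list separately from running it makes the procedure easy to log and review.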
The following steps were undertaken to initiate the extraction process using ADB. Initially, we extracted data from the public folder located at
Subsequently, the most relevant part was extracting the application’s private folder. Unlike the public directory, the private directory requires root access. On a rooted device, the user need only execute the su command to enter privileged mode and access the private directory, which resides in the following path:
The binaries of the installed application can also be important in studying their behavior. Their location is
Since Android 8, app folders have been named using a random string in base64 to enhance privacy and security, making it more difficult for unauthorized users or malicious applications to access sensitive app data [
26]. To identify the correct path, the following command was executed:
This command outputs the path to the application’s APK file location. That file can then be extracted using ADB with the corresponding path. Extracting data from all locations can be time-consuming. Therefore, we developed a Python script named ADB-Extractor to automate the process. The script allows users to select what data they want to extract (public directory, private directory or application binary file) from the chosen device (Android emulator or physical device). The usage of the script is as follows:
A graphical interface is also available, where users can select the device, what to extract and where to save it. Additionally, users can select the desired package name from a list of all installed applications.
Figure 2 displays the tool’s graphical interface.
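As an illustration of how the APK location can be resolved programmatically, the sketch below assumes the standard package manager command `pm path`, which prints one `package:<path>` line per APK. The helper names and the sample path are hypothetical, and this is not necessarily how ADB-Extractor does it.

```python
import subprocess


def parse_pm_path(output):
    """Extract APK paths from the text printed by `pm path <package>`."""
    prefix = "package:"
    return [line.strip()[len(prefix):]
            for line in output.splitlines()
            if line.strip().startswith(prefix)]


def apk_paths(package):
    """Ask the attached device where the APK files of `package` are stored."""
    result = subprocess.run(["adb", "shell", "pm", "path", package],
                            capture_output=True, text=True, check=True)
    return parse_pm_path(result.stdout)


def pull_apks(package):
    """Copy every APK of `package` to the current working directory."""
    for path in apk_paths(package):
        subprocess.run(["adb", "pull", path], check=True)
```

Keeping the parsing in its own function lets it be checked against saved command output without a device attached.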
Maintaining forensic integrity is critical to the credibility and admissibility of digital evidence. To ensure that digital evidence remains authentic, unaltered and verifiable from the point of collection through to its presentation in legal or investigative contexts, investigators compute the hash of extracted files prior to analysis so that any later tampering can be detected [
27]. Our tool calculates the
SHA256 hash for each extracted file and directory, storing the results in an output file for subsequent integrity verification.
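A minimal sketch of this kind of integrity manifest is shown below. It is not the code of our tool: the function names and the manifest layout are assumptions, and the sketch hashes regular files only.

```python
import hashlib
from pathlib import Path


def sha256_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so that large extractions do not exhaust memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_tree(root):
    """Map each file below `root` (by relative POSIX path) to its SHA-256 digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def write_manifest(root, manifest_path):
    """Store one '<digest>  <relative path>' line per file for later verification."""
    with open(manifest_path, "w", encoding="utf-8") as out:
        for name, digest in hash_tree(root).items():
            out.write(f"{digest}  {name}\n")
```

Re-running `hash_tree` on the working copy and comparing the result with the stored manifest reveals any file that changed after extraction.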
As mentioned previously, our study examines six distinct run-tracking applications. They all have the same goal—to record running routes, times and performances—and thus exhibit many similarities. To prevent redundancy, we will provide an in-depth analysis of one of the applications, specifically Nike Run Club, which is the most feature-rich among the set. Our methodology for analyzing Nike Run Club is then applied to the remaining fitness applications.
4.1. Nike Run Club
Nike Run Club has 10M+ downloads in Google Play Store (see
Table 3) and focuses on preparing athletes for competitions by mixing virtual personal training with a gamification system and social network. Users can connect with friends through the club feature, which acts as a small community. Within this community, they can engage in challenges, compete with each other and collectively work toward enhancing their overall fitness levels.
Nike Run Club lacks a built-in authentication method. During login or account creation, the application opens a webview for the
URL https://accounts.nike.com/ (accessed on 8 August 2024), where the browser will display the authentication form. This fact means that the authentication is performed in the browser and is exposed to possible website vulnerabilities, which is vital for future forensic dynamic analysis. The registration form asks for (
i) the email (where the verification code is sent); (
ii) password; (
iii) name; and (
iv) birth date. After successfully creating an account, the user returns to the application, where the application asks for (
v) gender, (
vi) height and (
vii) weight.
Upon login, the authenticated user lands on the main dashboard, featuring a live Google Maps display of their current location, shown in
Figure 3a. Here, users can track their activity, with metrics such as distance, calories burned and duration showcased alongside the activity’s route on the map. This run-tracking dashboard is standard, complemented by extras like training plans and audio-guided runs accessed by swiping. The app includes a side menu with options like an activity summary screen detailing stats and a log of past activities. Users can delve into individual activities for detailed statistics like pace, duration, calories burned, heart rate and GPS route, as shown in
Figure 3b. Although all studied running applications are unique, we identified similar points, such as the activity screen and the challenges page. This fact is essential from a forensic standpoint because it helps us to identify the possible data commonly stored in all of them.
Data Synchronization
As standard in modern applications, Nike Run Club can synchronize data between different devices. After analyzing its main database, as will be explained in
Section 4.4, we discovered that the application can store, in its local databases, the data from the devices that are synchronizing in the same account.
In addition, the Nike Run Club application can import activities from the cloud of companion applications. We had previously used the Nike Run Club application on an iOS device, and all activities from Apple’s HealthKit were seemingly imported into the Android database. To test this avenue for collecting data further, we installed the application on an Android emulator created with Android Studio. After installing the application and logging into the Nike Run Club account, we extracted the application’s private directory. Again, all previous activities were stored in the databases. This finding is relevant from a forensic standpoint, since digital forensic practitioners are not bound to using a rooted device to access data: with access to the account credentials, one can install the application on an emulator and collect all data. Moreover, if the Nike Run Club account is connected to a companion application such as Garmin Connect or Fitbit, one can potentially access activities recorded by devices running iOS through HealthKit.
4.2. Android Permissions
Upon accessing the Play Store, one can observe the permissions that the application requests during installation: Play Store → About this app → permissions (at the bottom of the page) → view details.
Table 5 displays the permissions requested by Nike Run Club, which include several high-level permissions. It is important to note that these permissions are typical for this category of applications, and the other apps in our study also request similar permissions.
4.3. Extraction of Data
We used ADB to extract both public and private data from each application. This process can be time-consuming since it requires finding the data, compressing them, storing them in the phone’s public storage and then pulling them from the device to the forensic practitioner’s computer. Therefore, we created a Python 3 script to automate the process with the following syntax:
Next, we analyze the data generated by the Nike Run Club app. We start with the so-called public data and then analyze the private data.
4.3.1. Public Data
The directory structure of the public data is shown in
Figure 4, which holds 2 subdirectories and 11 files. However, as expected, the data in these directories do not hold much forensic value, as all stored files are cache files related to maps. Nonetheless, there was one specific application whose public directory held relevant forensic data:
Strava. This will be discussed later on.
4.3.2. Private Data
Applications typically store most data in their private folders, which holds for the applications under examination in this study. Unfortunately, accessing private data requires root privileges, meaning that such data can only be retrieved when the mobile device is rooted. Consequently, using a rooted device was essential for conducting this study.
The private directory structure, up to two sub-levels, is depicted in
Figure 5. This structure is notably more extensive than the public directory, encompassing 182 subdirectories and 411 files.
Examining all these files can be a daunting task. However, obtaining a clear idea of where the most crucial information is stored is feasible by leveraging prior knowledge of the Android operating system and understanding how private data are organized [
28]. The
files/ folder typically contains cache or temporary files and log files. In the case of Nike Run Club, there are no forensically relevant data in this folder. The
databases/ folder usually holds the most meaningful forensic artifacts, as it stores the application’s databases, typically in SQLite 3 format. Applications such as Nike Run Club have many
SQLite databases, yet most of the relevant data are stored in just a few of them. The remaining databases are usually related to several services, such as Google
APIs, integration with other applications of the same developer or premium features.
The shared_prefs/ folder is another location where relevant forensic data can be found. It usually holds several XML files that store key-value data. It is common for applications to store there data such as user account credentials and other information related to hardware and to interactions with authentication services, such as Facebook and Google. This folder can also hold API keys if developers are not careful enough to protect them. Next, we focus on the databases of the Nike Run Club application.
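To illustrate how such key-value files can be read, the sketch below parses the usual Android shared preferences layout (a `<map>` root with typed child elements) into a Python dictionary. The function name and the sample keys are hypothetical and do not come from any of the studied applications.

```python
import xml.etree.ElementTree as ET


def parse_shared_prefs(xml_data):
    """Convert the content of a shared_prefs XML file into a dictionary."""
    root = ET.fromstring(xml_data)
    prefs = {}
    for node in root:
        name = node.get("name")
        if node.tag == "string":
            prefs[name] = node.text or ""
        elif node.tag == "boolean":
            prefs[name] = node.get("value") == "true"
        elif node.tag in ("int", "long"):
            prefs[name] = int(node.get("value"))
        elif node.tag == "float":
            prefs[name] = float(node.get("value"))
        elif node.tag == "set":
            # String sets are stored as nested <string> elements.
            prefs[name] = [child.text or "" for child in node]
    return prefs


# Hypothetical sample, for illustration only.
SAMPLE = """<map>
    <string name="user_email">runner@example.com</string>
    <boolean name="logged_in" value="true" />
    <long name="last_sync" value="1723100000000" />
</map>"""

print(parse_shared_prefs(SAMPLE))
```

Values such as tokens or e-mail addresses can then be searched across all files of the folder with ordinary dictionary operations.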
4.4. Databases
The Nike Run Club application keeps 15 different SQLite 3 databases in its private directory. The most relevant ones, from a digital forensic perspective, are (i) com.nike.nrc.room.database, which acts as the main database, and (ii) ns_inbox.db, which holds the application notifications.
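When facing this many databases, a quick triage is to list the tables of each file together with their row counts. The sketch below is schema-agnostic and makes no assumption about the Nike Run Club tables; the function name is hypothetical, and the database is opened read-only so that the working copy is not modified.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def summarize_database(db_path):
    """Return {table name: row count} for a SQLite file opened read-only."""
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        summary = {}
        for table in tables:
            # Quote the identifier, doubling any embedded double quotes.
            quoted = '"' + table.replace('"', '""') + '"'
            summary[table] = conn.execute(
                f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
        return summary
```

Tables with many rows, or whose names hint at activities or users, are natural starting points for manual inspection.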
4.4.1. com.nike.nrc.room.database
The database com.nike.nrc.room.database is the core storage of the application, holding 53 tables. The main elements stored by the database are (i) user activities; (ii) user weight; (iii) training plans; (iv) challenges; (v) audio runs; and (vi) achievements.
User Activities
Nike Run Club splits data related to the activities into eight different tables. The main table
activity stores the basic information, such as start and end time. Each of the other tables is linked to this table, and the data can originate either directly from the application or from a companion application. An activity can have multiple records related to it in other tables.
Table 6 explains the different tables and their forensic value. Indeed, the application stores a large amount of information related to an activity. We could extract information like the timespan of the run, duration, distance, heart rate, calories burned and more by analyzing the data. The database diagram, which highlights relationships among the tables listed in
Table 6, is shown in
Figure 6. This diagram was produced with the aid of
schemacrawler and
DBDiagram.io: the former generated a base diagram, which was then refined with the latter. Since the diagram is quite large,
Figure 6 shows only the relationships among a fraction of the database tables. A more detailed version of the diagram and the code used to generate it in
DBDiagram.io are available in a GitHub repository (
https://github.com/labcif/Running-Databases (accessed on 8 August 2024)). This repository contains the diagrams and code for each diagram presented in this study.
One of the most relevant artifacts found is the activities’ GPS coordinates. Nike Run Club stores these coordinates in a unique polyline format in the table
activity_polyline.
A polyline is a string of characters that encodes the coordinates used by Google Maps to draw the route line on the map [
29] as shown in
Figure 3b.
The coordinates bear crucial forensic significance as they precisely identify the user’s whereabouts during a specific timeframe. According to Google’s documentation, reversing a polyline back into grouped coordinates is feasible. Utilizing the Python library
polyline (
https://pypi.org/project/polyline/ (accessed on 8 August 2024)), we devised a Python script to decode Google’s polylines into coordinates and store them in an
XLSX file. This file hosts the activity’s coordinates. Additionally, employing the
geopy library (
https://pypi.org/project/geopy/ (accessed on 8 August 2024)), our script enriches these coordinates with additional details such as the corresponding
road,
city,
postcode and
country. This file aims to facilitate the analysis of GPS coordinates and streamline the identification of potential locations of interest. Subsequently, utilizing these coordinates, we generate a file showcasing the user’s traveled route as exemplified in
Figure 7. The script allows for the export of this file in either
HTML or Keyhole Markup Language (KML). The
Polyline2GPS (
https://github.com/labcif/Polyline2GPS (accessed on 8 August 2024)) script is available as a standalone tool; the functions described were initially developed for the ALEAPP framework.
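To make the decoding step concrete, the sketch below re-implements the publicly documented encoded polyline algorithm in plain Python. It is an illustration only: Polyline2GPS relies on the polyline library rather than on this code, and the sketch assumes the default precision of five decimal places.

```python
def decode_polyline(encoded, precision=5):
    """Decode an encoded polyline string into a list of (lat, lng) tuples."""
    factor = 10 ** precision
    coordinates = []
    index = 0
    lat = lng = 0
    while index < len(encoded):
        deltas = []
        for _ in range(2):  # latitude delta first, then longitude delta
            shift = 0
            result = 0
            while True:
                byte = ord(encoded[index]) - 63
                index += 1
                result |= (byte & 0x1F) << shift
                shift += 5
                if byte < 0x20:  # no continuation bit: last chunk of this value
                    break
            # The lowest bit carries the sign; the rest is the magnitude.
            deltas.append(~(result >> 1) if result & 1 else result >> 1)
        lat += deltas[0]
        lng += deltas[1]
        coordinates.append((lat / factor, lng / factor))
    return coordinates


# Example string taken from Google's documentation of the format.
print(decode_polyline("_p~iF~ps|U_ulLnnqC_mqNvxq`@"))
# [(38.5, -120.2), (40.7, -120.95), (43.252, -126.453)]
```

Each point is stored as a difference from the previous one, which is why the decoder keeps running totals for latitude and longitude.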
Another forensically relevant table of the com.nike.nrc.room.database database is activity_raw_metric. It stores specific actions during the activity, such as the user pausing, resuming or ending the activity, recording both the action that occurred and its timestamp. This can be useful if the need arises to accurately determine an activity’s sequence of events. For instance, cross-referencing these data allows one to detect that a user stopped walking for a given period and when the stop happened.
4.4.2. ns_inbox.db
The ns_inbox.db database holds only one relevant table, which stores notifications received through the application’s inbox. These messages could be the application’s news or notifications about other users. This database stores the notifications’ contents and the date/time at which each message was sent. Note, however, that Nike Run Club does not have an integrated messaging system; hence, all messages are system-generated, and no user messages are stored in this database.
7. ALEAPP Modules
To ease and automate the analysis of the six studied run-tracking apps for digital forensic practitioners, we provide several modules for the ALEAPP forensic framework created by Alexis Brignoni (
https://github.com/abrignoni/ALEAPP (accessed on 8 August 2024)). ALEAPP is an open-source Python-based tool created specifically for digital forensics investigations of Android devices. It is used to parse and analyze various types of data extracted from Android devices, including logs, events and protocol buffer (Protobuf) files, a common data format used in mobile applications. By automating some of the repetitive tasks in Android forensics, ALEAPP allows digital forensics examiners to save time.
Figure 10 shows the graphical user interface of ALEAPP.
ALEAPP was designed with scalability and modularity in mind. Developers can integrate their contributions into the framework by creating a single Python 3 file, which acts as an ALEAPP module (also known as a plugin) that processes and analyzes a particular type of forensic artifact found on Android devices. Parsing operates on previously extracted Android files, and the results are presented in the HTML report generated by ALEAPP. An ALEAPP module reads the specified file, extracts the relevant data and presents them to the user. Although this is its primary function, additional features have been incorporated into ALEAPP to improve its reporting capabilities.
The decision to target ALEAPP stems from its ease of use, its open-source nature, its popularity and its transparent integration into the also open-source
Autopsy forensic software [
33].
To deal with the forensic artifacts of the six applications, we developed 14 new modules for ALEAPP and a new ALEAPP feature that we named
timeline.
Table 7 lists all 14 created scripts that have been added to ALEAPP’s GitHub repository (
https://github.com/abrignoni/ALEAPP/ (accessed on 24 July 2024)).
7.1. Timeline
Our addition to the core of ALEAPP was the creation of a timeline plugin. The plugin displays events in chronological order, in a timeline format. This plugin was developed using the open-source component
timeline.js (
https://github.com/squarechip/timeline (accessed on 8 August 2024)). The plugin is utilized to showcase data from the
activity_moment table of the
NikeRunClub database. This table stores specific moments for each activity, such as pausing, stopping and the completion of each additional kilometre. These data are read by the special
NikeAMoments.py module, which resorts to the
timeline plugin to render the run activity, as shown in
Figure 11.
Note that any ALEAPP module can use the timeline plugin, as it was designed to be application-agnostic. In Listing 1, we exemplify how it is possible to generate a timeline using this feature.
Listing 1. Code snippet for implementing the timeline feature on ALEAPP.
7.2. Specific ALEAPP Modules for Running Applications
Although it is possible to extract all data from an application using a single Python 3 script, the ALEAPP community suggests developing separate modules for each parsed data source. This strategy simplifies module maintenance. Because the analyzed running applications often share common features and stored data, the individual modules created for parsing each application’s data are highly similar.
The Activities modules (e.g., MMWActivities.py) extract and present information on running activities, such as (i) the duration of the run; (ii) calories burned; and (iii) distance travelled. Additionally, they use a functionality that we previously developed for ALEAPP to display the user’s route on an OpenStreetMap map. The same approach applies to the Users modules (e.g., adidasUsers.py), which process data related to user accounts, including (i) gender; (ii) email; (iii) height; and (iv) weight.
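The core of such an Activities module is a query over the application’s SQLite database followed by some light formatting. The sketch below shows this pattern under stated assumptions: the activity table, its column names and the millisecond units are hypothetical, since each application uses its own schema, and read_activities is an illustrative helper rather than part of ALEAPP.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema: real applications use their own table and column names.
SCHEMA = """
CREATE TABLE activity (
    id INTEGER PRIMARY KEY,
    start_utc_ms INTEGER,  -- start time, Unix epoch in milliseconds (assumed)
    duration_ms INTEGER,   -- elapsed time in milliseconds (assumed)
    calories REAL,
    distance_km REAL
)"""


def read_activities(conn):
    """Return (start, duration, calories, distance) rows, oldest first."""
    rows = conn.execute(
        "SELECT start_utc_ms, duration_ms, calories, distance_km "
        "FROM activity ORDER BY start_utc_ms"
    ).fetchall()
    report = []
    for start_ms, duration_ms, calories, distance_km in rows:
        start = datetime.fromtimestamp(start_ms / 1000, tz=timezone.utc)
        minutes, seconds = divmod(duration_ms // 1000, 60)
        report.append(
            (start.isoformat(), f"{minutes:d}:{seconds:02d}", calories, distance_km)
        )
    return report


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.execute(
        "INSERT INTO activity (start_utc_ms, duration_ms, calories, distance_km) "
        "VALUES (?, ?, ?, ?)",
        (1_700_000_000_000, 1_830_000, 250.0, 5.2),
    )
    for row in read_activities(conn):
        print(row)
```

When working on an extracted evidence file rather than an in-memory database, opening it read-only, for instance with sqlite3.connect("file:<path>?mode=ro", uri=True), avoids modifying the artifact.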
The module NikePolyline.py renders each activity stored by the application on an OpenStreetMap map. It was placed in a separate module because the Activities module already handled a sizable amount of data, and the aim was to keep that module streamlined. The module NikeNotification.py parses the notifications that the application sends.
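Before a route can be drawn on a map, the stored representation has to be turned into latitude/longitude pairs. Assuming that the route is stored in the Google Encoded Polyline Algorithm Format with the usual precision of five decimal places (an assumption, as the exact encoding used by each application is not detailed here), a decoder can be sketched as follows; decode_polyline is an illustrative name and not the code of NikePolyline.py.

```python
def decode_polyline(encoded, precision=5):
    """Decode an encoded polyline string into a list of (lat, lon) tuples."""
    coords = []
    index = lat = lon = 0
    factor = 10 ** precision
    while index < len(encoded):
        # Each point stores a latitude delta followed by a longitude delta.
        for is_lon in (False, True):
            shift = result = 0
            while True:
                byte = ord(encoded[index]) - 63  # undo the ASCII offset
                index += 1
                result |= (byte & 0x1F) << shift  # collect 5-bit chunks
                shift += 5
                if byte < 0x20:  # continuation bit cleared: last chunk
                    break
            # The lowest bit carries the sign; the remaining bits the magnitude.
            delta = ~(result >> 1) if result & 1 else result >> 1
            if is_lon:
                lon += delta
            else:
                lat += delta
        coords.append((lat / factor, lon / factor))
    return coords


if __name__ == "__main__":
    # Reference string from the public description of the format.
    print(decode_polyline("_p~iF~ps|U_ulLnnqC_mqNvxq`@"))
    # → [(38.5, -120.2), (40.7, -120.95), (43.252, -126.453)]
```

The values are accumulated as integers and divided only at the end, which avoids floating-point drift over long routes.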
The module adidasGoals.py extracts data related to the user-defined goals in Adidas Running. Lastly, although the output delivered by the StravaGPS.py module is similar to that of the Activities modules, its approach is entirely different, as it extracts data from FIT files.
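Once coordinates have been extracted, whether from a database or from a FIT file, they can be exported for viewing in Google Earth as a KMZ file, which is a ZIP archive containing a KML document. The sketch below covers only this export step and assumes that the track is already available as a list of (latitude, longitude) pairs; decoding the FIT file itself requires a dedicated parser and is not shown. The helpers build_kml and write_kmz are hypothetical names and do not reproduce the decode_FIT script.

```python
import io
import zipfile
from xml.sax.saxutils import escape


def build_kml(name, points):
    """Return a KML document with one LineString for the given track."""
    # KML expects "longitude,latitude,altitude" triples separated by spaces.
    coords = " ".join(f"{lon},{lat},0" for lat, lon in points)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2">\n'
        "  <Document>\n"
        "    <Placemark>\n"
        f"      <name>{escape(name)}</name>\n"
        f"      <LineString><coordinates>{coords}</coordinates></LineString>\n"
        "    </Placemark>\n"
        "  </Document>\n"
        "</kml>\n"
    )


def write_kmz(target, name, points):
    """Write the track as a KMZ archive to a path or binary file object."""
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as kmz:
        kmz.writestr("doc.kml", build_kml(name, points))


if __name__ == "__main__":
    buffer = io.BytesIO()
    write_kmz(buffer, "Example run", [(38.5, -120.2), (40.7, -120.95)])
    print(len(buffer.getvalue()), "bytes written")
```

Note the coordinate order: KML places longitude before latitude, the reverse of the order commonly used when storing GPS fixes.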
8. Conclusions
In this work, we performed a post-mortem forensic analysis of six popular fitness applications for Android focusing on outdoor running: Adidas Running, MapMyWalk, Nike Run Club, Pumatrac, Runkeeper and Strava. We resorted to a rooted Android smartphone and a Garmin Vivosmart 4 smartband to generate, collect and analyze data in search of meaningful forensic artifacts.
As all applications were similar in nature and structure, we focused on one application—NikeRunClub—and then used the process and knowledge learned to adapt the analysis methodology to the other applications.
The NikeRunClub application stores forensically meaningful data—activities, GPS coordinates and account data—in SQLite 3 databases in its private directory, mainly in the databases directory. Related work has already demonstrated the high value of accessing data from applications that track running workouts as they frequently merge locations with date/time. These can yield critical proof for refuting or, on the contrary, confirming someone’s alibi.
We analyzed the remaining applications based on NikeRunClub and, by using the same methods, we acquired similar information in their respective files. The main findings are as follows:
- The applications do not store data in their public directory. The exception is Strava.
- Workout activities are stored in SQLite 3 databases, with the most significant forensic data being GPS coordinates, timestamps and duration. Strava is again the exception, resorting to the FIT file format to store workouts.
- The format for GPS coordinates depends on the applications, with some encoding geolocation coordinates using the polyline format and others relying on a text-based pair of latitude/longitude values.
- User’s account data are often stored in a database, although they can also be in XML format in the shared_prefs folder.
To decode FIT files, we developed the decode_FIT Python script. It extracts activity data and either formats them for rendering on an OpenStreetMap map or generates a KMZ file for visualization in Google Earth. Additionally, we created a set of modules for the ALEAPP framework to parse application data and generate a case report for streamlined data interpretation. We further enhanced the ALEAPP report with a new timeline feature, which has already been incorporated into the ALEAPP repository. As ALEAPP is integrated with the Windows version of the Autopsy forensic software, our modules will also be seamlessly incorporated into this forensic application.
As future work, we plan to assess the privacy and security features of the applications, analyzing the data collected and sent to their respective cloud servers. This requires dynamic analysis techniques for mobile applications, namely intercepting and analyzing the HTTPS traffic between the applications and their cloud servers to decode and map their APIs [34]. Another task for future work is to assess the performance of our methodologies, looking for possible optimizations. Finally, we also plan to analyze the iOS versions of the studied applications to see whether the same or even more data can be obtained, so that we can adapt our ALEAPP modules to the iOS Logs, Events, And Plists Parser (iLEAPP) framework, which is similar to ALEAPP but focused on iOS applications.