1. Introduction
Through technological evolution, smartphones have become mini-computers, possessing the ability to store and process data, perform tasks of high computational value, and access a large amount of information through Internet access [
1]. Consequently, smartphones store a trove of user information, such as location data, captured images, videos and audio, and interactions in multiple social networks, from entertainment to e-dating. As such, these devices are an essential part of a digital forensic analysis process.
Bumble is an e-dating app founded in 2014 [
2], available for both Android and iOS. At a time when the term feminism is gaining traction, the Bumble app (
https://bumble.com/, accessed on 28 December 2021) proposes to empower women by making “(…) not only necessary, but acceptable for women to make the first move, shaking up outdated gender norms” [
3]. The name indicates that the app spins around the queen bee, in this case represented by the app’s female users. For this reason, the company describes itself as 100 percent feminist, encouraging equality and “reversing the heteronormative rules surrounding dating” [
4]. Bumble currently has more than 10 million downloads on the Google Play Store (
https://play.google.com/store/apps/details?id=com.bumble.app, accessed on 28 December 2021), where its average rating is 3.9, based on approximately 280 thousand user reviews. In the App Store (
https://apps.apple.com/us/app/bumble-dating-meet-people/id930441707, accessed on 28 December 2021), its rating is higher, at 4.2, based on about one million reviews. Bumble has around 42 million active users globally, of which 1.2 million subscribe to the application’s premium features [
5]. The United States has the highest number of users, with around 5 million, only surpassed amongst dating apps by Tinder, with approximately 8 million users [
6]. In February 2021, Bumble stated that its market value was 13 billion dollars, an increase from the previous year when it was valued at around 8 billion [
5].
The Bumble app allows the user to perform various operations, such as creating a profile; exchanging messages, images, and videos; searching with filters; and integrating with other social networks like Facebook, Instagram, and Spotify. To benefit from the features that Bumble offers, one must first create an account in the application. Authentication can be done in two ways: with a mobile phone number or through a Facebook account. When creating a Bumble account with Facebook credentials, users grant permission to share their names and profile pictures. Users can also limit the additional permissions granted through Facebook by opting out of sharing their email address (if one is associated with the Facebook account), date of birth, profile photos, gender, page likes, and current town/city.
Like other e-dating applications, Bumble allows the creation of a profile, where users add information about themselves, such as a short biography, their interests, height, weight, religion, photos, geographic location, and gender. Furthermore, users can indicate habits and activities, such as whether they smoke, drink, or practice sports. However, no information is mandatory other than a name, the date of birth, and a mobile number or a Facebook account. It is noteworthy that, when creating the profile, the user is asked to verify it by sending a photo in a pose illustrated in the application. This procedure aims to reduce the creation of fake profiles, because of the extra work it would take to find someone else’s picture in a pose that is unknown before the verification. Yet, the verification process is not mandatory.
Once the account is created, the application starts to present a stream of profiles for the user to choose from by swiping right (indicating interest in the shown profile) or left (if not interested). When a match occurs, that is, two people swipe right on each other’s profile, Bumble invites one of them to send a message within 24 h. If this does not happen, the match expires and the users are no longer able to interact. If the match is between two individuals of opposite genders, the female user must start the conversation. If the match is between users of the same gender, either can initiate the conversation [
7]. While communicating, the application allows users to exchange text messages, photos, and audio recordings, and to make video calls. If the user is not interested in the shown profile, they can swipe left. From then on, the user no longer sees that profile, and the application recommends another potentially compatible user. The application also has a nude recognition mechanism. Through artificial intelligence, it processes the exchanged pictures to identify whether they contain sexual content, also known as “nudes” [
8]. In our limited test, we sent five distinct sexual pictures and the application was able to identify all of them accurately. This functionality is designed to prevent the delivery of unsolicited nudes. If a nude is delivered, the application hides the image and informs the user that it is an erotic image, asking the user to approve, or not, its reception. If preferred, the user may not receive the image and report the sender.
Besides the e-dating mode, Bumble also offers two extra modes in the application: the Best Friends Forever (BFF) and business (identified as BIZZ). The BFF mode focuses on establishing a match to create a new friendship, while BIZZ mode aims at finding professional connections.
As Bumble is a global application, all data is sent to, stored, and processed in the United States of America, and the United Kingdom, regardless of the user’s country of residence. The privacy policy continues explaining how personal data is handled [
9].
Table 1 shows the information that might be collected about a user when creating a Bumble account. Additionally, the application may request the user’s full name and address to share with third parties, to send merchandise and loyalty programs. Besides registering the user’s email, when the customer support team is contacted, Bumble can store the users’ Internet Protocol (IP) address to keep track of customer communications and complaints concerning other users. Bumble processes collected data, such as demographic information, to target advertising through in-app advertising and, in addition, shares data with ad networks that host its ads. Bumble may also collect information about the device, such as its unique identifier, model, operating system, Media Access Control (MAC) address, and, if authorized, may access the user’s contact list.
Bumble offers a free tier with a limited number of profiles a user can view per day. The optional premium account offers additional features, such as Spotlight, SuperSwipe, Bumble Boost, and Bumble Premium. Spotlight allows the user’s profile to be viewed by more people instantly for 30 min, while SuperSwipe notifies a potential match that the user is confidently interested in them. The Bumble Boost feature includes the possibility to backtrack (swipe right on a profile on which the user previously swiped left), extend the time to answer current matches, unlimited swipes, one Spotlight, and five SuperSwipes per week. The Bumble Premium service has all the Bumble Boost features plus access to unlimited advanced filters, the user’s admirers, travel mode, rematch with expired connections, and incognito mode (not available on Bumble’s Web version).
The Android version of Bumble is updated regularly, about once a week. The available versions of the application are compatible with Android 5.0 and later. So far, the updates of Bumble for Android do not include release notes describing the modifications made. In contrast, the updates for the iOS app describe the changes made, which mainly consist of bug fixes. Bumble does not allow the use of some prior versions, pushing users to update regularly to keep using the application. Our tests showed that only versions up to ten releases old can be used before the user is required to update. The version analyzed in this work is 5.250.1, released on 13 December 2021 (
https://www.apkmirror.com/apk/bumble-holding-limited/bumble-date-meet-friends-network/bumble-date-meet-friends-network-5-250-1-release/, accessed on 28 December 2021).
Despite app developers’ best efforts to protect users’ information, e-dating services are not immune to cybercrimes, such as extortion, romance scams, and identity theft. Generally, attackers use e-dating applications to build a relationship with users, misleading them into sending money and personal and financial information. According to the Internet Crime Complaint Center’s (IC3) 2020 Internet Crime Report, extortion, identity fraud, and romance scams were, respectively, the third, fifth, and eighth most commonly reported cybercrimes globally, with total victim losses exceeding 900 million US dollars [
10]. Therefore, identifying and analyzing e-dating applications’ forensic artifacts can help to discover how a crime was conducted and uncover information that might lead to its perpetrators. There are several studies about e-dating applications, such as Tinder and Badoo [
11,
12,
13,
14], but, to the best of our knowledge, there is no in-depth analysis of Bumble in the digital forensics domain. Hence, the contributions of this work are: (1) a thorough study of the Bumble data stored on an Android device and the analysis of its artifacts, (2) a diagram of the schema of the most forensically valuable database, and (3) a parsing script that presents the most relevant forensic artifacts in a human-readable format.
This document is structured in the following way:
Section 2 presents a literature review of works about mobile digital forensics analysis in general and e-dating apps. Then,
Section 3 displays the testing methodology and tools used to conduct our study. The obtained results are presented and discussed in
Section 4, followed by the description of the developed parsing script in
Section 5. Finally, the conclusions and future work are presented in
Section 6.
2. Literature Review
The analysis of mobile applications and devices has been the subject of several literature reviews over the last few years. The forensic interest stems from the amount of information about the status of devices, the user, mobile systems, and application use and storage [
15]. Data about users’ previous locations can be obtained from applications’ digital artifacts, both on Android and iOS systems. This information contributes to an investigation with data relating to preceding suspect locations [
16]. Another type of forensic artifact is the identification of files present in the cache, which are automatically generated by the applications and may contain important information about the user [
17]. One type of application that can create relevant artifacts for forensic investigations is the e-dating app. E-dating applications might provide multiple artifacts, such as the Global Positioning System (GPS) location, telephone contacts, email addresses, and messages exchanged between users [
14,
18]. Additionally, Lcdi and Lcdi [
19] performed a static analysis on an older version of Bumble for Android, where it was possible to identify images stored locally on the device [
19].
Hayes and Snow [
11] analyzed three e-dating apps (Tinder, Bumble, and Grindr) and identified that these were collecting and sharing personal information about their users that did not match the information stated in the privacy policy [
11]. Additionally, the authors identified communication protocols that posed potential risks to the security of users’ data. Knox et al. [
20] performed an analysis of the Happn e-dating app for Android and iOS systems to identify artifacts that could be exploited by a malicious agent to gain access to confidential data of its users [
20]. Furthermore, Kim et al. [
12] analyzed five e-dating apps (Tinder, Amanda, Noondate, Glam, and DangYeonsi). Through traffic analysis and reverse engineering techniques they found that sensitive user data could be exploited by malicious agents [
12]. Another study by Shetty et al. [
13] carried out an analysis to assess the possibility of executing Man-In-The-Middle (MITM) attacks on seven e-dating apps (Tinder, Happn, Badoo, MeetMe, Skout, Lovoo, and Coffee Meets Bagel), as well as on Chrome for Android and Facebook. The authors identified that these were vulnerable to this type of attack, allowing access to the application’s user data [
13]. Also, a study from Farnden et al. [
14] presents forensic techniques used to identify and retrieve data from the following e-dating apps: Badoo, Grindr, Skout, Tinder, Jaumo, Meet Me, FullCircle, and MuiMeet [
14].
The literature review by Phan et al. [
21] presents a different perspective, identifying the psychological impacts that e-dating apps can have on users. The authors also point out the risks related to digital security, possible crimes, and the digital artifacts of these apps that can be used to solve crimes [
21].
Several e-dating applications have been studied to identify digital artifacts and determine which have forensic value. Nonetheless, to the best of our knowledge, the Bumble app has not yet been a target of an in-depth analysis in the digital forensic domain.
4. Results
After conducting several interaction tests with our accounts, we extracted the data from both the public folder (/sdcard/Android/data/com.bumble.app) and private folder (/data/data/com.bumble.app). The Bumble public folder contains two folders: cache and files, both without any content.
The most valuable data for the forensic analyst resides in the application’s private directory:
/data/data/com.bumble.app. To read the contents of this directory, the device (physical or emulated) must be rooted. To quickly acquire data while testing, a bash script was developed that automates the acquisition operations of the private directory. Bumble’s private directory hierarchy is illustrated in
Figure 1.
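The acquisition step described above can be sketched as follows. This is not the authors' bash script but a minimal Python restatement under stated assumptions: the device is rooted and exposes `su` and `tar`, `adb` is on the analyst's PATH, and the archive path `/sdcard/bumble_private.tar` is a hypothetical temporary location. Note that writing a temporary archive to the device alters it, which is acceptable in a test setting but not in a real seizure.

```python
import subprocess

PACKAGE = "com.bumble.app"  # package name taken from the paths in the text


def build_acquisition_commands(package=PACKAGE,
                               remote_tar="/sdcard/bumble_private.tar",  # hypothetical path
                               local_dir="."):
    """Return the adb command lines (as argv lists) for a root-level copy."""
    private_dir = f"/data/data/{package}"
    return [
        # Archive the private directory as root (assumes `su` and `tar` exist on the device).
        ["adb", "shell", f"su -c 'tar -cf {remote_tar} {private_dir}'"],
        # Copy the archive to the analyst's workstation.
        ["adb", "pull", remote_tar, local_dir],
        # Remove the temporary archive from the device.
        ["adb", "shell", "rm", remote_tar],
    ]


def acquire(dry_run=True):
    """Print the commands, or execute them when dry_run is False."""
    for argv in build_acquisition_commands():
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)
```

Calling `acquire()` only prints the commands, while `acquire(dry_run=False)` runs them against the connected device.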
When the Bumble app is uninstalled, all folders and files created by the app are deleted, which is the normal behaviour of the Android operating system. If the app is then reinstalled and the same account is configured, all previous data is restored from Bumble’s servers. However, if the Bumble account is deleted, no data will be restored, even if the user creates a new account with the same credentials.
In the following paragraphs, the directories that were considered most relevant at a forensic level are addressed, namely: databases, shared_prefs, files, and cache. The remaining directories are not addressed, as they did not reveal any data with forensic value, or the content could not be identified due to unknown encoding.
4.1. Databases
Bumble contains six databases, four of which are in the databases directory (ChatComDatabase, com.google.android.datatransport.events, google_app_measurement.db, and lexems.db). The Cookies database is present in the app_webview/Default directory, while androidx.work.workdb is in the no_backup directory. The tables and fields that we consider most relevant from a forensic perspective are discussed below. The timestamps in the databases are in Epoch format (a representation of time used in operating systems such as Unix, Linux, and Android, given by the number of seconds elapsed since 0:00 UTC of 1 January 1970).
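To make such timestamps readable, an Epoch value can be converted to a UTC date as in the minimal sketch below. The function name `epoch_to_utc` is ours, and the rule that treats very large values as milliseconds is an assumption added for convenience, since Android applications frequently store milliseconds rather than seconds.

```python
from datetime import datetime, timezone


def epoch_to_utc(value):
    """Convert an Epoch timestamp to an ISO 8601 string in UTC."""
    value = int(value)
    # Assumption: values of 1e11 or more are milliseconds, not seconds.
    seconds, millis = divmod(value, 1000) if value >= 10**11 else (value, 0)
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.replace(microsecond=millis * 1000).isoformat()
```

For example, `epoch_to_utc(1639353600)` corresponds to midnight UTC on 13 December 2021, the release date of the analyzed version.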
The
ChatComDatabase database contains the largest number of tables and mostly stores information about the user’s messages and conversations. The 18 tables of this database are listed in
Table 4, and a summary of the database schema is illustrated in
Figure 2.
A conversation contains a set of messages between two users of the application. The table that holds information about conversations is named
conversation_info, and each table entry (row) represents a user conversation. The table fields identified as having forensic value are illustrated in
Table 5.
The tables that store information about messages are designated as
offline_message_read_info,
message_read_info, and
message. In this context, a message is any data sent or received, such as text, images, GIFs, and audio. The main fields of the
message table with forensic value are described in
Table 6. Each entry in the table represents a message sent or received in one of the user’s conversations. It is worth noting that even if a user blocks another user, the messages exchanged remain available in this table.
The payload field holds a JavaScript Object Notation (JSON) object whose structure depends on the type of message. Compared with text messages, images, audio, and GIFs carry additional information. For a picture, data such as height, width, id, expiration timestamp, and whether the image is masked (is_masked) are stored. The image itself is stored online, on Bumble’s servers, and only the URL and its metadata are stored in the local database. An image marked as masked is blurred by displaying it at a markedly lower resolution, to the point that its content is not recognizable. Furthermore, an additional field is saved: is_lewd_photo. This boolean field is true if the image content is classified as sexual and false otherwise. Usually, whenever a sexual image is sent, it is masked and flagged as sexual, allowing the recipient to choose whether to view it. An audio message stores its id, expiration timestamp, duration, and waveform. The latter represents, through a numerical vector, the change in amplitude over time. For a GIF, the application stores two pieces of information: its Uniform Resource Locator (URL) and its provider.
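The distinction between message types can be illustrated with a short sketch that classifies a payload by the keys it contains. Only `is_masked` and `is_lewd_photo` are field names confirmed by the text; the key spellings `url`, `duration`, `waveform`, and `provider`, as well as the function name, are assumptions made for illustration and should be checked against real payloads.

```python
import json


def summarize_payload(payload_text):
    """Classify a message payload (JSON text) by the keys it contains."""
    payload = json.loads(payload_text)
    if "is_masked" in payload or "is_lewd_photo" in payload:
        return {
            "kind": "image",
            "masked": bool(payload.get("is_masked", False)),
            "lewd": bool(payload.get("is_lewd_photo", False)),
            "url": payload.get("url"),  # assumed key name
        }
    if "waveform" in payload:  # assumed key name
        return {
            "kind": "audio",
            "duration": payload.get("duration"),  # assumed key name
            "samples": len(payload["waveform"]),
        }
    if "provider" in payload:  # assumed key name
        return {"kind": "gif", "provider": payload["provider"], "url": payload.get("url")}
    return {"kind": "text"}
```

Classifying by key presence keeps the sketch tolerant of fields that differ between application versions.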
Sending images, audio, and GIFs stores only their URLs in the local database. However, only GIFs can be viewed outside Bumble’s app, for example in a browser. An attempt to view the exchanged private media always returns the HTTP error “403 Forbidden”. Nevertheless, the user’s profile photos specified in the user_image_url and user_photos fields of the conversation_info table can be viewed in a browser until their URLs expire.
In addition to the aforementioned tables, the database contains a virtual table created using the SQLite Full-Text Search (FTS4) extension. This extension supports the creation of virtual tables with a built-in full-text index. Through this method, it is possible to manually create a virtual table that maps a JSON document schema [
23]. This table is named
search_fts and includes a payload field that contains the text messages sent in multiple conversations; however, it does not include information on the images, audio, and GIFs sent.
Despite the tests carried out, the google_app_measurement.db database was populated with relevant data in only one table, named apps. The table stores the application version in the app_version field, which could be useful to determine the last version the user executed.
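Reading this field from an acquired copy of the database could look like the sketch below. The table and column names (`apps`, `app_version`) come from the text; the function name is ours, and opening the file in read-only mode is our choice to avoid modifying the evidence copy.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def read_app_versions(db_path):
    """Return the app_version values stored in the apps table."""
    # Open read-only through a URI so the acquired file is never written to.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        return [row[0] for row in conn.execute("SELECT app_version FROM apps")]
```

A typical call would be `read_app_versions("databases/google_app_measurement.db")`, run against the extracted private directory.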
4.2. XML Files
The Bumble application uses XML (Extensible Markup Language) files to configure and store information and application features. These files are located in the
shared_prefs subdirectory. This directory had a total of 36 XML files, 10 of which had forensic value and are listed in
Table 7.
The contents of the files are briefly presented below, as well as a consideration of why each was classified as being of forensic interest. Note that the files appsflyer-data.xml and
com.google.android.gms.measurement.prefs.xml have already been addressed in other literature [
24] and refer to functions not related to the operation of the application itself.
The file
appsflyer-data.xml is related to the AppsFlyer Software Development Kit to collect statistical information for advertisers and advertising campaigns. It displays the information related to the IMEI (International Mobile Equipment Identity) of the device on which the application is running. The IMEI-related fields can be useful to identify the phone which executed the application [
24]. However, in all our tests the IMEI values were set to false, meaning they were not collected. The file also presents some information related to timestamps in the
appsFlyerFirstInstall field, referring to when the app was first installed.
The
com.google.android.gms.measurement.prefs.xml file relates to a collection of Google APIs that contains classes to configure Firebase Analytics core services to support functionalities across multiple devices [
25]. The file is of forensic interest because of the timestamp stored in the
first_open_time field, referring to when the app was first opened.
The BumbleAppPreferences.xml file shows, in the attribute MyCurrentUserStateKEY_GAME_MODE, the information of the current App mode selected by the user. When dating mode is active, the stored value is GAME_MODE_REGULAR. This option can also have the values GAME_MODE_BFF, for those who select the looking for friends option, and GAME_MODE_BIZZ, which corresponds to the users that selected the business connections option.
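Extracting the mode from this file can be sketched as follows. The sketch relies on the standard Android shared preferences layout, a `<map>` root whose `<string>` children hold the value as text while numeric and boolean children use a `value` attribute. That the game mode is stored as a `<string>` element is our assumption, and the function names and human-readable labels are ours.

```python
import xml.etree.ElementTree as ET

# Values reported in the text, mapped to readable labels (labels are ours).
GAME_MODES = {
    "GAME_MODE_REGULAR": "dating",
    "GAME_MODE_BFF": "friends (BFF)",
    "GAME_MODE_BIZZ": "business (BIZZ)",
}


def parse_shared_prefs(xml_text):
    """Turn a shared preferences XML document into a name-to-value dict."""
    prefs = {}
    for element in ET.fromstring(xml_text):
        # <string> keeps its value as text; other types use a "value" attribute.
        value = element.text if element.tag == "string" else element.get("value")
        prefs[element.get("name")] = value
    return prefs


def current_mode(xml_text):
    """Return a readable label for the mode selected by the user."""
    value = parse_shared_prefs(xml_text).get("MyCurrentUserStateKEY_GAME_MODE")
    return GAME_MODES.get(value, "unknown")
```

The same `parse_shared_prefs` helper applies to the other files in the shared_prefs directory, since they share this layout.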
In the file
com.facebook.AccessTokenManager.SharedPreferences.xml it is possible to identify a considerable amount of information regarding the Facebook account linked to the application. The identifiers referring to the Facebook token stand out, bringing a set of information, such as the registration or token code, the permissions within Facebook assigned to this token (which can be accessed on Facebook when using this token), the application that is using it (code
428250913904849 refers to Bumble), the account code to use, and the token and its expiration date. In the same file, it is possible to identify the name of the Facebook account to which the token is linked, and which is related to the id. A content sample of this file is shown in Listing 1.
Listing 1. Sample content of com.facebook.AccessTokenManager.SharedPreferences.xml file (redacted).
The file com.facebook.login.AuthorizationClient.WebViewAuthHandler.TOKEN_STORE_KEY.xml contains the Facebook authentication token to be used by the application. This token allows the application to access the information of the user’s Facebook account that was previously linked to the Bumble account. The expires_at field (line 7 of Listing 1) has the value 9223372036854775807, which corresponds to the maximum value of a 64-bit signed integer (0x7FFFFFFFFFFFFFFF). This suggests that the token never expires.
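The arithmetic behind this observation is easy to verify: 9223372036854775807 equals 2^63 − 1, the largest value of a 64-bit signed integer, whereas a 32-bit signed integer tops out at 2147483647. The helper name below is ours, and reading the sentinel as "never expires" follows the interpretation given in the text.

```python
INT64_MAX = 2**63 - 1  # largest 64-bit signed integer
INT32_MAX = 2**31 - 1  # largest 32-bit signed integer, for comparison


def token_never_expires(expires_at):
    """True when expires_at holds the 64-bit maximum used as a sentinel."""
    return int(expires_at) == INT64_MAX


print(INT64_MAX, hex(INT64_MAX), INT32_MAX)
```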
The DeviceUtil.xml file presents some information that identifies the device on which the application is installed. In this sense, the DeviceId attribute is present. Also, there is the attribute FirstLaunch, which shows if it is the first execution of the application on the device. The third attribute of interest is DeviceIdStored, however it was not possible to identify the information referring to it.
In the HotLexemPrefs.xml file, the version of the application running on the device can be found. As previously stated, the version of the application used is 5.250.1. However, it can also be referenced by the number 2618, as shown on websites that distribute APK files. It was also possible to identify the APK’s creation date as well as the installation date.
The RatingRulesCriteriaParams.xml file appears to be related to the number of times the application has been executed and the number of conversations with other users.
The ServerCommunicationPreferences.xml file shows the address of the hosts and possible ports considered to be secure, and which should be used in the application’s communication with the server. It is noteworthy that the file does not present information regarding the protocol to be used by the application in establishing communication with the server (TCP or UDP). The mentioned addresses and the respective ports are bmaeu.bumble.com:80, and bmaeu.bumble.com:443.
The VOTING_QUOTA.xml file records how many swipes the user can perform on the same day. This file can be considered a possible indicator that the user has a premium account, as the value of the KEY_YES_VOTING_QUOTA field is much higher in the premium version than in the free version of the application. The premium version has a value in the billions, whereas in the free version the value starts at 80. The value of the free version can increase if the user stays more than one day without swiping.
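This indicator can be turned into a simple heuristic, sketched below. The field name `KEY_YES_VOTING_QUOTA` comes from the text; the threshold of one billion is our reading of "in the billions", and the function name is ours. Because the free quota can grow over time, the result is a hint rather than proof of a premium account.

```python
import xml.etree.ElementTree as ET

PREMIUM_THRESHOLD = 1_000_000_000  # assumption derived from "in the billions"


def looks_premium(xml_text):
    """Return True/False from the quota value, or None if the field is absent."""
    for element in ET.fromstring(xml_text):
        if element.get("name") == "KEY_YES_VOTING_QUOTA":
            # Numeric preferences use a "value" attribute; fall back to text.
            raw = element.get("value") or element.text
            return int(raw) >= PREMIUM_THRESHOLD
    return None
```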
The
c2V0dGluZ3M= file stands out because its name, when base64 decoded, reads
settings. In the analysis performed, it was not possible to fully access the content of this file, as neither its encoding nor whether it was encrypted could be determined. Interpreting the content as UTF-8 gives access to part of it, including some information about the user, which is presented in
Table 8. The premium version of Bumble contains additional information that is marked with (*) in the aforementioned table.
Figure 3 shows a capture of the partial mobile number using the HxD hexadecimal editor.
It is possible to conclude that the file files/c2V0dGluZ3M= contains information relating to user data. It is important to note that the location is not reliable, as it can be manually entered by the user. The travel mode present in the premium version does not change the data referring to the user’s location but adds the data related to the travel mode’s destination location. In our tests we noted that a user can enter the desired location on travel mode without any restriction or verification.
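Both steps used for this file, decoding its base64 name and recovering readable fragments from content of unknown encoding, can be sketched as follows. The function names are ours, and the extraction of printable ASCII runs is a generic technique similar to the Unix `strings` tool, not necessarily the procedure followed in the analysis.

```python
import base64
import re


def decode_name(name):
    """Decode a base64 file name, adding any missing '=' padding."""
    padded = name + "=" * (-len(name) % 4)
    return base64.b64decode(padded).decode("utf-8")


def printable_strings(data, min_length=4):
    """Return runs of printable ASCII characters found in binary data."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_length
    return [match.decode("ascii") for match in re.findall(pattern, data)]
```

The `decode_name` helper also applies to other base64-named files in the same folder, such as c2Vzc2lvbklk.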
The files folder contains the
PersistedInstallation file, whose contents (in JSON format) are shown in Listing 2. This file contains relevant information about user authentication in the application and the communication with Bumble servers. It is noteworthy that the token is renewed every seven days. In the same folder, there is a file named
c2Vzc2lvbklk, which represents the
sessionId in base64 encoding. However, no data with forensic value was identified.
Listing 2. Sample content of the file PersistedInstallation.
Inside the Cache folder, two directories are of forensic interest, these being: downloader and decorator_tmp. These contain images sent, received, and viewed while using the application.
The
decorator_tmp directory contains images with a blurring filter applied by the application, while the
downloader folder contains the same pictures but without the filter applied. The comparison between the same image in the two directories is shown in
Figure 4 and
Figure 5.
Although the images are present in the directories mentioned above, they are quickly deleted by the application. For this reason, these folders are not a reliable source for collecting all images sent by a user. In addition, it is not possible to correlate the images referenced in the message table of the ChatComDatabase database with the images present in these directories. Images stored in both the decorator_tmp and downloader directories do not contain metadata that would help identify them, such as their date and time of submission. We determined that the Bumble application modifies photos uploaded by users: when photos are sent with Bumble’s app, all EXIF (Exchangeable Image File Format) metadata is removed. Therefore, it is not possible to obtain typical image metadata such as the location or camera model. In our tests, we were not able to confirm whether the metadata is removed in the app itself or on the server side, given that the app does not allow exporting pictures taken with the app; this seems to be a privacy feature.
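Whether a recovered JPEG still carries EXIF data can be checked without external libraries by walking its segment markers, as in the simplified sketch below. It looks for an APP1 segment (marker 0xFFE1) whose payload begins with the `Exif` identifier and stops at the start of the image data. The function name is ours, and the parser ignores rarely used constructs such as padding bytes between segments.

```python
import struct


def has_exif(jpeg_bytes):
    """Return True if a JPEG byte stream contains an EXIF APP1 segment."""
    if jpeg_bytes[:2] != b"\xff\xd8":  # every JPEG starts with the SOI marker
        raise ValueError("not a JPEG stream")
    offset = 2
    while offset + 4 <= len(jpeg_bytes):
        marker, length = struct.unpack(">HH", jpeg_bytes[offset:offset + 4])
        # Stop at malformed data, start of scan (image data), or end of image.
        if marker >> 8 != 0xFF or marker in (0xFFDA, 0xFFD9):
            break
        if marker == 0xFFE1 and jpeg_bytes[offset + 4:offset + 10] == b"Exif\x00\x00":
            return True
        offset += 2 + length  # the length field counts itself but not the marker
    return False
```

Running this over the files in the downloader and decorator_tmp directories would be expected to return False for every image, given the behaviour described above.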
In the cache folder, only the most recent images that the user viewed, sent, or received can be observed. However, the images alone are not sufficient to determine whether the user sent them, received them, or viewed them on another user’s profile. Additionally, the file cache/recents/teleport_cache contains the last location; this file exists only when the location has been changed.
Additionally, requests to the Bumble API were executed. When viewing the Bumble Web API calls, it was found that several of the values involved, such as session_id and device_id, are present in the files directory. For this reason, this data was used to try to gain access to the URLs described in the message table, replacing Bumble Web’s cookie_session with the tokens provided by the application. However, the requests made to the API with the authentication data found in the files returned no results. This may be due to numerous factors, such as missing parameters, additional authentication by the application, or requests that do not respect the structure required by Bumble, making it extremely difficult to access the URLs of the images present in the payload field.
Various databases and XML files held no forensic interest. The databases com.google.android.datatransport.events, lexems, Cookies, and androidx.work.workdb were discarded since they either contained no information of forensic interest or were not populated with data. These databases are related to how data transport takes place, information required for the app to function, web application session cookies, and a Room database, respectively. Likewise, most of the XML files had no forensic value and, therefore, were not discussed in this work.
4.3. MobSF Analysis
By means of the MobSF tool we were able to identify several relevant pieces of information from a privacy point of view. The following subsections address our findings.
MobSF highlighted eight permissions classified as dangerous, which are displayed in
Table 9.
The APKID Analysis section contains information about how the APK was built, the compiler used, packages, obfuscation techniques, and more. The report shows that anti-reverse engineering techniques were applied to the application’s source code. In this sense, it should be noted that anti-emulation (identified in the report as anti-VM, or virtual machine, code) and code obfuscation techniques were applied.
The Network Security section addresses the security of the application to receive and send data. A vulnerability classified as high was identified, which indicated that the application is configured to allow the traffic of cleartext (unencrypted) information for API versions lower than 27. However, it was not possible to test this vulnerability, as the available APKs were not able to compile in this Android version.
The Manifest Analysis describes some essential information about the application for the Android build tools, about permissions used in the application, and so on. In this analysis, 14 high-level vulnerabilities were identified related to essential application information and permissions that could be exploited by malicious users or to get privileged information that could help forensic analysts.
The Code Analysis section contains information regarding the vulnerabilities identified in the application and their respective classification according to CVSSv2 (Common Vulnerability Scoring System version 2) [
26]. However, the vulnerabilities were not analyzed due to obfuscation techniques applied to the source code.
The NIAP Analysis section assesses the application against the requirements of the National Information Assurance Partnership (NIAP), which oversees and monitors the security of commercial Information Technology products. In this section, only one risk was identified: the application uses cryptographic hashing services that do not comply with requirement FCS_COP.1.1(2), relying on outdated and insecure cryptographic algorithms such as RC2/RC4 and MD4/MD5.
The Domain Malware Check verifies whether the domains contacted by the application are associated with malware. None of the domains used by the application are listed as unsafe or linked to suspicious or malicious records or activities, so all of them are identified as safe.
Multiple URLs hardcoded in the application were identified, but none were characterized as suspicious. The tool also identified six email addresses, of which only two are valid:
[email protected] and
[email protected].
Trackers are pieces of software that are used to record information about the user. The MobSF report highlights the following trackers: AppsFlyer, FacebookAnalytics, and GoogleFirebaseAnalytics. Reviewing the data found by MobSF, we conclude that these services are used for statistical purposes. As a result, the Bumble application may send data to them to obtain relevant information about its users, although such transfers could not be identified in the analysis performed.
The Hardcoded Secrets section lists secrets stored in the application, that is, privileged information such as API keys, passwords, and other relevant data. Several possible secrets were identified in this section, but only two were recognized as valuable: google_api_key and google_crash_reporting_api_key.
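A manual check for candidate secrets can be approximated by scanning the decompiled string resources for suspicious names. The sketch below assumes the keys are declared as `<string>` entries in a resource file such as `res/values/strings.xml`, which the report does not confirm. The sample values are placeholders, and the keyword list and the function name `find_candidate_secrets` are illustrative choices rather than MobSF's actual rules.

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt of a string resource file; the values are placeholders.
SAMPLE_STRINGS = """<resources>
    <string name="app_name">Example</string>
    <string name="google_api_key">PLACEHOLDER_API_KEY</string>
    <string name="google_crash_reporting_api_key">PLACEHOLDER_CRASH_KEY</string>
</resources>
"""

# Illustrative substrings that often appear in the names of secrets.
KEYWORDS = ("api_key", "secret", "password", "token")


def find_candidate_secrets(xml_text, keywords=KEYWORDS):
    """Map resource names that look like secrets to their stored values."""
    root = ET.fromstring(xml_text)
    hits = {}
    for entry in root.iter("string"):
        name = entry.get("name", "")
        if any(keyword in name.lower() for keyword in keywords):
            hits[name] = entry.text or ""
    return hits


for name, value in find_candidate_secrets(SAMPLE_STRINGS).items():
    print(name, "=", value)
```

A name-based scan like this only produces candidates; each hit still has to be reviewed by hand, as was done for the two keys above.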
The information found in the static analysis is common to other applications, and the API keys were the only data of interest. Access to an API key could allow unauthorized parties to make undue calls on behalf of the application and to access sensitive information. Such abuse can diminish customer trust and cause financial losses for the organization. However, this might be more harmful to the app itself than to its users.
4.4. Dynamic Analysis
Bumble requires the location service to be active to load its contents. Hence, we expected to find GPS coordinates in the post-mortem analysis, but that was not the case. We hypothesized that such information is transient and not stored on the mobile device. To test this hypothesis, we performed a dynamic analysis, as described in the following paragraphs.
Android implements several protective mechanisms to prevent impersonation attacks (also known as man-in-the-middle attacks). One of those mechanisms is HTTP Public Key Pinning (HPKP), also known as certificate pinning. The HPKP security mechanism is delivered via an HTTP header, which allows HTTPS websites to resist impersonation using fraudulent digital certificates. A server uses it to deliver to the client (e.g., web browser, or an app) a set of hashes of the public keys that must appear in the certificate chain of future connections to the same domain name [
27]. Although this mechanism is a good protective measure for end users’ privacy, it is also an obstacle that a forensic analysis must overcome. We followed the “Android Network Traffic Interception” tutorial available on GitHub [
28], but with some modifications. This tutorial shows how to install a proxy and a new digital certificate on the Android device to intercept the traffic. However, we opted to use Packet Capture because it allows us to intercept the network packets inside the Android device itself. Packet Capture also requires the installation of a Certificate Authority (CA) digital certificate bundle (both private and public keys), which we created by issuing the openssl commands shown in Listing 3.
Listing 3. openssl commands to generate a CA digital certificate to be used by Packet Capture to intercept traffic.
Once the certificate bundle is installed in Android, the operating system redirects the traffic through a fake VPN service created by Packet Capture, where it is intercepted. However, this is not enough, as the Bumble app still refuses to load any content due to the certificate pinning security mechanism. Therefore, we also had to resort to the Frida tool, which can intercept Bumble’s calls to the Android API and thereby bypass this security mechanism.
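The reason the app rejects the interception certificate can be illustrated with a small model of the pinning check: the client accepts a connection only if the hash of at least one public key in the presented chain is among its pinned hashes. This is a minimal sketch, assuming SHA-256 hashes in Base64 form. The function names and the byte strings standing in for DER-encoded public keys are hypothetical and do not reflect Bumble's actual implementation.

```python
import base64
import hashlib


def spki_pin(spki_der: bytes) -> str:
    """Base64-encoded SHA-256 digest of a DER-encoded public key."""
    return base64.b64encode(hashlib.sha256(spki_der).digest()).decode("ascii")


def chain_is_pinned(chain_spkis, pinned):
    """True if at least one key in the presented chain matches a pinned hash."""
    return any(spki_pin(spki) in pinned for spki in chain_spkis)


# Placeholder byte strings stand in for real DER-encoded public keys.
genuine_key = b"genuine-server-public-key"
proxy_key = b"interception-proxy-public-key"

# The client ships with the hash of the genuine key only.
pins = {spki_pin(genuine_key)}

print(chain_is_pinned([genuine_key], pins))  # chain from the genuine server
print(chain_is_pinned([proxy_key], pins))    # chain from the interception CA
```

Because the interception CA's key is not among the pinned hashes, the second check fails, and this is the check that has to be neutralized at runtime before traffic can be read.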
The data we intercepted includes user profile information, mobile device information (manufacturer, model, version, and device_id), and event timestamps in Unix Epoch format, as shown in Listing 4. We also found many packets corresponding to the images on other Bumble users’ profiles, as presented in
Figure 6. Although Bumble requires the location service to be active, we did not find any GPS coordinates being transmitted. However, we cannot assert that this information is never sent to Bumble’s servers. First, the location information might be sent only sporadically and our tests might not have been long enough to find it. Second, the information might be encoded in an unknown way and, for that reason, it was not identified.
Listing 4. Sample of JSON-encoded data sent to Bumble’s servers.
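To place intercepted events on a timeline, the Unix Epoch timestamps have to be converted to readable dates. The following is a minimal sketch, assuming timestamps may be expressed in either seconds or milliseconds. The JSON sample, its field names, and the function name `epoch_to_utc` are hypothetical and do not reproduce the intercepted payload.

```python
import json
from datetime import datetime, timezone

# Hypothetical payload; field names and values are illustrative only.
SAMPLE_PACKET = (
    '{"device": {"manufacturer": "ExampleCorp", "model": "X1"},'
    ' "event_ts": 1640692800}'
)


def epoch_to_utc(value):
    """Convert a Unix Epoch timestamp (seconds or milliseconds) to ISO 8601 UTC."""
    # Heuristic: values this large are assumed to be milliseconds.
    seconds = value / 1000 if value > 10**11 else value
    return datetime.fromtimestamp(seconds, tz=timezone.utc).isoformat()


packet = json.loads(SAMPLE_PACKET)
print(packet["device"]["model"], epoch_to_utc(packet["event_ts"]))
```

Converting to UTC first avoids ambiguity when the analyst's workstation and the seized device use different time zones.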
6. Conclusions
Bumble is an e-dating application with many features and a complex internal structure. Thus, numerous tests were executed, using both virtual and physical devices, to cover as many use cases as possible. Afterward, an in-depth post-mortem analysis was done to determine what user data with forensic value is stored. Like other mobile apps, Bumble stores its data in public and private directories, hence the /sdcard/data/com.bumble.app and /data/data/com.bumble.app directories were the primary focus of our analysis. The public directory was found to contain no forensically relevant data. The private directory, in contrast, yielded the most significant results, revealing several forensic artifacts. Bumble’s private directory, in version 5.250.1, contains six SQLite3 databases, accounting for 40 tables, numerous cache files, and 36 XML files. These provided valuable information about a user, such as the messages exchanged with others, the user’s matches, data about the linked Facebook account, and configuration settings. The files sent (images and audio recordings) were not obtainable, since they are not stored locally. Nevertheless, it was possible to identify some of the pictures exchanged in personal conversations within cached files. Given the size of the cache, however, a forensic analyst with access only to the private directory cannot accurately tell which pictures were sent and which were received by the user. Additionally, the automated static analysis identified Bumble’s required permissions, APKID, source code, secrets, application vulnerabilities, URLs, and IPs. To analyze the application’s behavior in real time, a dynamic analysis was performed with the help of tools that bypass the certificate pinning security mechanism and intercept the network traffic on an Android device. The intercepted data included the user profile information and the mobile device information (manufacturer, model, version, and device_id).
Although Bumble requires the location service to be active, no GPS coordinates were found being transmitted. However, it is not possible to assert that this information is never sent to Bumble’s servers. Finally, a Python script was developed that enables the messages exchanged between users to be viewed in a web browser and exported to a PDF document.
As future work, it would be valuable to obtain the pictures and audio exchanged through the links available in ChatComDatabase’s message table. It would also be worthwhile to investigate the application’s vulnerabilities and determine whether they can be used to obtain more forensically relevant data. Additionally, it would be valuable to further develop our script so that it can be included in tools like the Autopsy Forensic Browser and the Android Logs Events And Protobuf Parser (ALEAPP) (
https://github.com/abrignoni/ALEAPP, accessed on 28 December 2021).