A User Study of a Wearable System to Enhance Bystanders’ Facial Privacy

Abstract: The privacy of users and information is becoming increasingly important with the growth and pervasive use of mobile devices such as wearables, mobile phones, drones, and Internet of Things (IoT) devices. Today many of these mobile devices are equipped with cameras which enable users to take pictures and record videos anytime they need to do so. In many such cases, bystanders’ privacy is not a concern, and as a result, audio and video of bystanders are often captured without their consent. We present results from a user study in which 21 participants were asked to use a wearable system called FacePET, developed to enhance bystanders’ facial privacy by providing a way for bystanders to protect their own privacy rather than relying on external systems for protection. While past works in the literature focused on privacy perceptions of bystanders when photographed in public/shared spaces, there has not been research with a focus on user perceptions of bystander-based wearable devices to enhance privacy. Thus, in this work, we focus on user perceptions of the FacePET device and/or similar wearables to enhance bystanders’ facial privacy. In our study, we found that 16 participants would use FacePET or similar devices to enhance their facial privacy, and 17 participants agreed that if smart glasses had features to conceal users’ identities, it would allow them to become more popular.


Introduction
The availability of cameras and Artificial Intelligence (AI) through wearables, mobile phones, drones, and Internet of Things (IoT) devices is making bystanders' facial privacy more significant to the general public. Bystanders' privacy issues arise when a device that collects sensor data (such as photos, sound, or video) can be used to identify third parties (or their actions) when they have not given consent to be part of the collection [1,2]. Even though bystanders' privacy has been an issue since the end of the 19th century with the invention of portable cameras that could take photos in a short amount of time [1], recent advances in camera-enabled devices (e.g., mobile phones, IoT) combined with AI and the Internet have raised awareness about this privacy issue, especially in the last couple of years. We show in Figure 1 some of the issues related to bystanders' facial privacy. Our focus in this work is to study bystanders' perceptions of a bystander-centric device/system to enhance facial privacy.
Figure 1. Issues related to bystanders' privacy.
Recently, various solutions that address bystanders' privacy have been proposed in the literature. However, most of these solutions rely on bystanders trusting third-party devices or systems which do not give them a choice to protect their privacy. To enable bystanders to protect their privacy, we have developed the Facial Privacy Enhancing Technology (FacePET) [25] smart wearable device. FacePET is a wearable system made of intelligent goggles worn by bystanders to protect their privacy from unauthorized face detection. FacePET operates on image features (in particular Haar-like features [26]) through visible light produced by the FacePET goggles to confuse face detection algorithms based on the Viola-Jones face detection algorithm [27]. If an unauthorized party takes a photo of the bystander with the FacePET system enabled, the action on the features is registered in the photo. Thus, if an AI algorithm based on the Viola-Jones algorithm later attempts to detect the bystander's face, the goal of FacePET is to prevent detection of the bystander's facial features by that algorithm.
The FacePET goggles are controlled via a mobile application on the bystander's mobile phone, which permits the bystander to create privacy policies to automatically provide consent to third-party cameras. When a third party authorized by the bystander wants to take a photo of the bystander, FacePET turns off the goggles and disables the operation.
The concept of consent is a cornerstone in privacy [28–30], and in this context, FacePET improves upon previous bystander-based approaches to protect facial privacy by allowing the bystander to create his/her own privacy policies and provide the consent. We describe the complete FacePET system, how it acts on Haar-like features based on the Viola-Jones face detection algorithm, and its effectiveness in [25].
In this work, we present the results of a small user study with a focus on perceptions of users about the FacePET system and intelligent goggles with features to mitigate facial detection algorithms. While there have been past works [8,14,31–40] on understanding the perceptions of bystanders with respect to facial privacy, to the best of our knowledge, our user study is the first to address the perceptions of a smart wearable (IoT) system worn by bystanders with a privacy protection focus.

Research contributions of this work
We summarize the main research contributions of this work as follows:
• We present a summary of human-computer interaction studies and systems related to facial privacy.
• We present a user study of the FacePET system with a focus on users' perceptions about the device and intelligent goggles with features to mitigate facial detection algorithms.
• We discuss the results of the study to further enhance the FacePET system, as well as influence the development of future bystander-centric devices for facial privacy.
The rest of this paper is organized as follows. Section 2 presents a review of related works. In Section 3 we describe the FacePET system. Section 4 presents the results of our usability evaluation of FacePET. Finally, in Section 5, we make some concluding remarks and present future work.

Bystanders' Facial Privacy: Human-Computer Interaction (HCI) Perspective
From the HCI perspective, research studies related to bystanders' privacy can be classified into two groups: (1) understanding the utilization/adoption of mobile, camera-enabled devices (e.g., mobile phones, wearables, IoT, and drones) and related technologies in shared spaces; (2) usability studies for facial privacy systems. These studies have been conducted using a variety of methods such as interviews, analysis of logged data (e.g., voice-mail diaries), online web comments, surveys, and combinations of these methods. We highlight some of these studies in Table 1. We describe below some of the common findings among these studies:
• Seven studies in Table 1 recruited fewer than 36 participants (five studies recruited 20 or fewer participants [8,31,38–40], and two studies recruited fewer than 36 participants [32,34]). Only two studies recruited more than 100 participants [36,37]. The studies with fewer than 36 participants use interviews, observation, and testing of devices, and some of them use surveys. The studies with more than 100 participants use surveys or automated (AI) methods to gather data of interest.
• The definitions of private/public (shared) spaces and privacy perceptions vary among individuals. What counts as a private/public space seems to depend on context (i.e., individuals, actions, and devices used at any given location).
• The design of the data capturing device has an impact on user and bystanders' privacy perceptions.
• Individuals want to have control of their facial privacy even though some contexts are less privacy-sensitive than others.
In contrast to the related works discussed above which focused primarily on privacy perceptions of users/bystanders when photographed in shared/public spaces by different kinds of devices, and their perceptions about how these photographs are shared in social networks and used by external parties (i.e., in web/remote services for facial recognition), in this work we explore the perceptions of a bystander-centric device (smart goggles) to protect bystanders' facial privacy. To the best of our knowledge, our study is the first study to explore user perceptions of a bystander-centric IoT/wearable system with a focus on privacy.

Bystanders' Facial Privacy: Solutions
In the past we proposed a taxonomy [1] to classify solutions to handle bystanders' facial privacy. Our taxonomy is composed of two major groups of solutions: location-dependent methods and obfuscation-dependent methods. Methods in these categories have differences in terms of effectiveness [25], usability [41], and power consumption [42]. We show this taxonomy in Figure 2 and we present a summary of methods under each category in Table 2.

Location-Dependent Methods
The focus of location-dependent methods is to disable/enable the utilization of a capturing device at a particular location [43,44] or context. Location-based methods can be divided into two categories:
• Banning/Confiscating devices: Even though it is a non-technological solution, banning/confiscating devices is the oldest method to handle bystanders' privacy. In the U.S., this method was first used starting from the development of portable photographic cameras at the end of the 19th century [45]. Around this time, cameras were forbidden at some public spaces and private venues.
• Disabling devices: In this group the goal is to disable a capturing device to protect bystanders' privacy. Methods under this category can be further classified based on the technology used to disable the capturing device. In the first group (sensor saturation), a capturing device is disabled by some type of signal that interferes with a sensor that collects identifiable data [3]. In the broadcasting of commands group, a capturing device receives disabling messages via data communication interfaces (e.g., Wi-Fi, Bluetooth, infrared) [4,5]. In the last group (context-based approaches), the capturing device identifies contexts using badges or labels, or it recognizes contexts [46] using Artificial Intelligence (AI) methods to determine whether capturing can take place [6–8].

Obfuscation-Dependent Methods
The goal of obfuscation-dependent methods is to hide the identity of bystanders to avoid their identification. Depending on who performs the action to hide a bystander, these methods can be classified into two categories:
• Bystander-based obfuscation: In this category, bystanders avoid their facial identification either by using technological solutions to hide or perturb their identifiable features, or by performing a physical action such as asking somebody to stop capturing data, or simply leaving a shared/public space. Our FacePET [25] wearable device falls into this category.
• Device-based obfuscation: In this group, third-party devices which are not owned by the bystander perform blurring or add noise (in the signal processing sense) to the image captured from the bystander to hide his/her identity. Depending on how the software at the capturing device performs the blurring, solutions in this category can be further classified into default obfuscation (any face in the image will be blurred) [19], selective obfuscation (third-party device users select who to obfuscate in the image) [20], or collaborative obfuscation (third-party and bystander's devices collaborate via wireless protocols [47] to allow a face to be blurred) [21]. A drawback of device-based obfuscation methods is that a bystander must trust a device that he/she does not control to protect his/her privacy.

Adversarial Machine Learning Attacks on the Viola-Jones Algorithm
To detect a face automatically in an image, supervised machine learning (classification) methods in image processing can be used. Given an image/photo x and a face detection (classification) method/algorithm Fd, the goal of Fd is to classify (or assign a label) to the image x such that if x contains a face, then Fd(x) = 1, and if x does not contain a face then Fd(x) = 0.
The process of finding a vulnerability to make classification algorithms fail is an application of a field called adversarial machine learning [48,49], which studies how an adversary/attacker can generate attacks to render machine learning models/methods ineffective. For face detection, this process can be done by applying a transformation Tr(x) on the image such that if Fd(x) = 1, then Fd(Tr(x)) = 0. In other words, if x contains a face, the goal of an adversary during the face detection process is to find a transformation of the face in x so that the face detection method does not detect it. The transformation can be applied after the image x has been captured by a camera, in which case Tr(x) is performed by software, or it can be generated as part of the capture itself, wherein a person (e.g., a bystander) in the photo uses a physical method to execute the transformation, which is then recorded in the image. Thus, the goal for FacePET is to physically generate a transformation that prevents the Haar-like features from being used by the face detection (classification) algorithm.
A Haar-like feature is calculated using the following formula: f = s(r1) - s(r2). In this formula, s(r1) is the average of pixel intensities in the "white" regions, and s(r2) is the average of pixel intensities in the "black" regions of predefined black/white patterns that are juxtaposed over an image (or a region of an image). The patterns are engineered to train classification models using machine learning algorithms and the Haar-like features. Once the model is trained, the patterns are applied to images to calculate the Haar-like features, which then serve as inputs to the trained classifier. Figure 3 presents the predefined black/white patterns used by Viola-Jones to calculate Haar-like features for face detection.
When using these patterns, the Viola-Jones algorithm creates windows of different sizes (subregions/sub images), calculates the Haar-like features for each window using the patterns, and then each window is passed through a classifier Fd(x) that outputs 1 if a face is detected. Performing adversarial attacks on a Viola-Jones face detection algorithm can be achieved by generating noise (in the signal processing sense) in the bystander's face (or photo) such that the values of the Haar-like features make a Viola-Jones classifier fail.
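To make the effect of such noise concrete, the following is a minimal sketch, not the FacePET implementation. It assumes a Haar-like feature is the difference s(r1) - s(r2) between the average intensity of a "white" region and a "black" region, and it uses a toy 4x4 grayscale window with made-up pixel values. The function names `region_mean` and `two_rect_feature` are hypothetical.

```python
def region_mean(img, top, left, height, width):
    """Average pixel intensity of a rectangular region (s(r) in the text)."""
    rows = img[top:top + height]
    values = [v for row in rows for v in row[left:left + width]]
    return sum(values) / len(values)


def two_rect_feature(img, top, left, height, width):
    """Two-rectangle pattern: upper half is 'white' (r1), lower half 'black' (r2).

    Assumes the feature is s(r1) - s(r2); the sign convention is illustrative.
    """
    half = height // 2
    s_r1 = region_mean(img, top, left, half, width)
    s_r2 = region_mean(img, top + half, left, half, width)
    return s_r1 - s_r2


# Toy window: a bright band above a dark band (values are made up).
window = [
    [200, 200, 200, 200],
    [200, 200, 200, 200],
    [50, 50, 50, 50],
    [50, 50, 50, 50],
]

# Simulated light source saturating the dark band (hypothetical values).
perturbed = [row[:] for row in window]
for r in (2, 3):
    perturbed[r] = [255] * 4

clean_value = two_rect_feature(window, 0, 0, 4, 4)
attacked_value = two_rect_feature(perturbed, 0, 0, 4, 4)
print(clean_value, attacked_value)
```

In this toy case the feature value changes sign once the dark band is saturated, which is the kind of shift that could push a window below a trained classifier's threshold. Real detectors combine many such features, so a single pattern is only illustrative.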
In FacePET [25], PrivacyVisor [10], and Invisibility glasses [18], these attacks are performed using Light Emitting Diodes (LEDs) (either through visible light in the case of FacePET or infrared light in the case of PrivacyVisor and Invisibility glasses) embedded in goggles. Figure 4 shows an example of a detected face without the attack (Figure 4a) and an undetected face with the attack (Figure 4b). This figure shows screenshots of an application that we created using OpenCV's implementation of the Viola-Jones algorithm to demonstrate the attack on the Haar-like features. We note that when the face is detected, the software superimposes a blue square around the area of the face, and green squares around the areas of the eyes and mouth (Figure 4a). However, when the features are attacked, the software fails to detect the face (Figure 4b) and no squares are superimposed on the face.
Recent advances in deep learning and Convolutional Neural Networks (CNNs) have improved the accuracy of image processing methods, including face detection methods. While in Viola-Jones methods the features for face detection are hand-crafted through the use of patterns and Haar-like features, CNN-based algorithms need neither, because a CNN can learn the features needed for detection through the automated training of neural networks [50]. However, CNNs for face detection can also be subject to adversarial machine learning attacks that include the optimization of adversarial generator networks for face detection [51], image-level distortions (i.e., modifications of the image's appearance not related to faces), and face-level distortions (i.e., modifications of facial landmarks in an image) [52].

The Facial Privacy Enhancing Technology (FacePET) System
In Section 2.2, we described different classes of facial privacy systems that are not controlled by bystanders, many of which do not provide a choice for bystanders before a photo is taken (i.e., a bystander can still be photographed inadvertently and identified without consent). These systems require bystanders to trust other parties to protect their facial privacy without a choice or assurances that their privacy is indeed being protected. We argue that the best types of facial privacy systems are those that provide methods for bystanders to make choices for their own facial privacy before a photo can be taken. We developed FacePET [25] under this premise. Figure 5 shows the components of the FacePET system. The major components of FacePET include:
• FacePET wearable: The FacePET wearable (as Figure 6 shows) is composed of goggles with 6 strategically placed Light Emitting Diodes (LEDs), a Bluetooth Low Energy (BLE)-enabled microcontroller, and a power supply. When a bystander wears and activates the wearable, it emits green light that generates noise (in the signal processing sense) and confuses the Haar-like features used by the Viola-Jones algorithm. The BLE microcontroller allows the bystander to turn the lights on/off through a Graphical User Interface (GUI) implemented as a mobile application that runs on the bystander's mobile phone.

• FacePET mobile applications: We implemented two mobile applications for the FacePET system. The first mobile application, namely the Bystander's mobile app, implements a GUI to turn the FacePET wearable on/off through commands broadcast using BLE communications. The Bystander's mobile app also implements an Access Control List (ACL) in which third-party cameras are authorized to disable the wearable and take photos. Different types of policies can be enforced for external parties to disable the wearable. For example, for a specific third-party user, the Bystander's mobile app can limit the number of times the wearable can be disabled for that third-party user. Further privacy policies based on contexts (e.g., location) can also be implemented. The second app, called the Third-party (stranger) mobile application, issues requests to disable the wearable and take photos of the bystander with the wearable's lights off. In the current prototype, the Third-party (stranger) mobile application connects to the Bystander's mobile app via Bluetooth [53]. Figure 7 presents screenshots of both mobile applications.

• FacePET consent protocol: The FacePET consent protocol (as Figure 8 shows) enables a mechanism that creates a list of trusted cameras (an ACL) at the bystander's mobile application. In our current prototype the consent protocol is implemented over Bluetooth.
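The access control behaviour described in the components above can be sketched as follows. This is an illustrative model only, assuming a per-camera quota of disable requests; the class names `Policy` and `BystanderACL`, the method names, and the camera identifiers are hypothetical and are not taken from the FacePET apps. The Bluetooth transport is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    """Hypothetical per-camera policy: how many disable requests are allowed."""
    max_disables: int
    used: int = 0


@dataclass
class BystanderACL:
    """Toy access control list kept by the bystander's app (assumed design)."""
    policies: dict = field(default_factory=dict)

    def authorize(self, camera_id, max_disables):
        """Add a trusted camera with a quota of disable requests."""
        self.policies[camera_id] = Policy(max_disables)

    def request_disable(self, camera_id):
        """Return True if the goggles may be turned off for this request."""
        policy = self.policies.get(camera_id)
        if policy is None or policy.used >= policy.max_disables:
            return False  # unknown camera or quota exhausted: lights stay on
        policy.used += 1
        return True


acl = BystanderACL()
acl.authorize("friend-phone", max_disables=2)
results = [acl.request_disable("friend-phone") for _ in range(3)]
stranger = acl.request_disable("unknown-camera")
print(results, stranger)
```

Denying by default keeps the wearable active for any camera the bystander has not explicitly trusted, which matches the consent-first premise of the system. Context-based policies such as location would need additional fields.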

Methodology
We applied for approval from the Columbus State University (CSU) Institutional Review Board (IRB) to conduct our study. The initial recruitment of participants was conducted by sending a flyer through CSU's e-mail system. The flyer explained the steps for participants to take part in the study, which was performed in a room at CSU's Synovus Center for Commerce and Technology. Once in the room, each participant filled out an informed consent form that provided information about the research and its risks. Next, participants filled out an initial survey (called the "Bystander's Privacy Survey") to gauge their knowledge about the concept of bystanders' privacy as well as their personal preferences on having their photos taken in certain situations and places. We used questions from the survey developed for the I-Pic system [14]. Figure 9 shows the questions asked in the survey.
After the initial survey, participants wore the FacePET wearable and had their photo taken using the rear-facing camera of an iPhone 7 in an indoor setting (i.e., a lab) with the wearable system both active and inactive. The captured photos were then used as input to a Python application built on OpenCV's face detection Application Programming Interface (API) [26], which provides an open source implementation of the Viola-Jones face detection algorithm [27]. Figure 4 shows screenshots of this application. The results of the face detection were presented to the participants (as Figure 4 shows) before they filled out a second survey (called the "Usability Survey") about the use of the wearable device and their attitudes about it. Figure 10 shows the questions we asked in this second survey. Once this second survey was completed, the participants concluded their participation in the study. A total of n = 21 participants took part in this study, and we raffled a gift card for USD 25.00 among the participants as an incentive reward for their participation. Table 3 presents the participants' demographics in this study. All participants were at least 18 years old.
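The detection step of this procedure, in which each photo is passed through OpenCV's Viola-Jones implementation, might look like the sketch below. It is not the study's script. It assumes the `opencv-python` package and its bundled `haarcascade_frontalface_default.xml` model, and the `scaleFactor` and `minNeighbors` values are common defaults rather than the settings used in the study. The helper names `detect_faces` and `is_protected` are hypothetical.

```python
def detect_faces(image_path, scale_factor=1.1, min_neighbors=5):
    """Return a list of (x, y, w, h) face boxes found by OpenCV's Haar cascade."""
    import cv2  # imported lazily so the helper below works without OpenCV

    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )
    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)
    # Haar cascades operate on single-channel (grayscale) images.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    boxes = cascade.detectMultiScale(
        gray, scaleFactor=scale_factor, minNeighbors=min_neighbors
    )
    return [tuple(int(v) for v in box) for box in boxes]


def is_protected(boxes):
    """A photo counts as protected when no face box was returned."""
    return len(boxes) == 0
```

A caller would run `detect_faces` once on the photo taken with the wearable active and once on the photo taken with it inactive, then compare `is_protected` for the two results. The file paths are up to the caller and are not specified in the study.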

Study Results
The initial bystanders' privacy survey assessed the participants' knowledge about facial and bystanders' privacy and how it affects them. Participants were first asked questions about how they saw themselves with respect to technology and how often they took pictures and videos. They were also asked how much they knew about the issue of bystanders' privacy and if they found it to be an important issue in today's world. Out of the 21 participants, 19 considered themselves to be tech savvy. When asked how often they took pictures/videos, 11 participants took pictures often while the rest answered not so often (8 participants) or very little (2 participants). When asked how much they knew about bystanders' privacy, surprisingly, most of them knew little or nothing about the issue (11 participants adding both choices). In this question, 2 participants stated that they knew a lot about it and 7 participants stated that they knew enough. After these questions and being introduced to the topic, most of the participants agreed that it is an important issue in today's world (18 participants), while the rest stated that it was not (3 participants).
When asked about the preferred privacy actions in certain contexts such as being at the gym, in a bar, at the beach, among others (see Figure 11), the participants were given five choices for each situation (I agree to be captured in any photograph; I agree to be captured, but please send me a copy of any photograph that includes me; Please obscure my appearance in any photograph that includes me; I can decide my preference only after I see the photograph; I do not wish to be captured in any photograph). The most common choice among all contexts was "I can decide my preference only after I see the photograph" (32% of all choices). The second most frequent choice was "I agree to be captured in any photograph" (28.07% of all choices). It is worth noting that in general, 15 participants chose a privacy action other than always agreeing to be photographed. This result indicates that most of our survey participants prefer some type of privacy protection when photographed. In this part of the survey we had a total of 228 answers.
From the results of the survey, we found that the participants of our study prefer to be photographed without restrictions in some communal places and activities such as outdoor activities, workplaces, and private gatherings with known people (i.e., family and friends), while they do not wish to be photographed in places and activities related to health (e.g., at hospitals, at gyms). It is worth noting that the most preferred choice for places such as bars/nightclubs, the beach, a place of worship, and a restaurant was "I can decide my preference only after I see the photograph". These results show that, in health-related activities and in contexts that involve consumer/lifestyle habits (e.g., bars, beaches, and restaurants), participants of the study want to control their privacy. This conclusion is similar to past works with a focus on bystanders' privacy perceptions (as described in Section 2.1).
The last section of the initial bystanders' privacy survey evaluated the participants' comfort levels about who may be a photographer taking photos of them and what the photographer can do with the photos regardless of any specific situation. For each type of photographer/action, the participant could choose among five comfort levels (on a Likert scale). Figure 12 shows the results of these questions. In the figure, the Likert scale has been reduced to three categories to simplify the visualization and analysis. In these questions, less comfortable choices (little less and much less) represented 35.24% of all choices, the neutral choice ("I will feel the same") represented 32.86% of all choices, and more comfortable choices represented 31.9%. In these questions, participants felt more comfortable in situations where there was some type of privacy protection or the photographer was somebody professional or known to the participant.
Finally, participants felt less comfortable if the photos were to be published without consent, if the photographer was a stranger, and if there were children in the proximity of the photo. These results demonstrate that participants were concerned about their facial privacy when photos are taken and published without their knowledge. In this part of the survey there were 210 answers. The results in this part of the study are similar to past works in the area of bystanders' perceptions on photo sharing (see Section 2.1).
The participants then wore the FacePET system. Each individual was photographed using the rear-facing camera of an Apple iPhone 7 mobile phone with the device enabled (privacy protection) and disabled. These photos were then fed into the OpenCV face detection script (Figure 4). Out of the 21 participants, six participants' faces were detected, giving the device a success rate of around 71% in protecting a user's face. A handful of the participants also took pictures using their own mobile phones so that comparisons could be made of how effectively the device worked across different cameras. The participants were then shown the results of the application (Figure 4) and then they answered the usability/user perceptions survey shown in Figure 10.
While capturing the photos during the experiment, we noticed that the glasses seemed a little bit big on some of the participants who had thinner or smaller facial structures. In these cases, OpenCV's face detection script detected their faces because the FacePET device failed to disrupt the facial features. We also observed that the illumination in the room where the experiment was conducted diminished the effectiveness of the device. We plan to address these aspects in the future.
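The reported success rate follows directly from the counts given above (21 participants, six detected faces). A short check of the arithmetic, with variable names chosen only for illustration:

```python
participants = 21
faces_detected = 6  # faces still found with the wearable active

protected = participants - faces_detected
protection_rate = protected / participants
summary = f"{protected}/{participants} = {protection_rate:.1%}"
print(summary)
```

This gives 15 protected participants out of 21, consistent with the "around 71%" figure in the text.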
After using the FacePET system and answering the usability survey, 17 participants found the system easy to understand and use. When asked if the device was something they would use on a daily basis, nine participants answered affirmatively, while the rest stated that they would not use the device in its current state. Within the group of participants who answered that they would not use the device (12 participants), we asked if they would use a similar version of the device (one that would achieve the same goals for privacy protection). In this question, 7 out of 12 participants answered affirmatively.
Even though the original FacePET system is not a wearable that most of the participants would use, when adding those participants who initially answered yes (9 participants) to using FacePET and those who would use a similar version (7 participants), the majority of the participants (16 participants out of 21) would use FacePET or similar devices (i.e., other bystander-based devices) to protect their facial privacy. Most of the concerns or reasons participants gave for not wanting to use the device seemed to relate to the device's form factor.
[Figure: survey items on the photographer and the use of the photograph. The photographer is a professional photographer (e.g., wedding photographer, journalist, artist); The photograph will be limited to personal use by the photographer; There are minor children in your vicinity who might also be photographed; The photograph may be published online and I am notified afterwards (e.g., social networks); The photograph may be posted in a forum with restricted membership (e.g., company/university mailing list); The photographer is an acquaintance.]
When the participants were asked about how people would react when seeing them wearing the device, a variety of responses were given:
• Person laughs and says, "Stupid glasses".
• People would stare a lot.
• People would be confused at first or creeped out.
• People would ask why the user was wearing such a device.
• The device would only invite more people to take pictures of it.
From these answers, it seems there would be plenty of confusion among others around the user about the purpose of the device and why someone would wear it in its current form factor. Although some of the feedback obtained relates specifically to our FacePET prototype, it is worth pointing out that the majority of the participants agreed that if smart glasses had features to conceal users' identities, it would allow such smart glasses to become more popular, with 17 participants stating yes, 3 participants feeling indifferent, and 1 participant stating no. Finally, we gathered some suggestions on how to improve our FacePET prototype. Some of the improvements that were repeated among the responses include a more fashionable design, a better (smaller) size for the goggles, and fixing the long wires that connect the power supply with the goggles and the microcontroller in the current prototype.
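The adoption figures reported in this section combine answers to more than one survey question. The sketch below simply recomputes them from the counts stated in the text. The percentages are derived here and are not reported by the study, and the variable names are illustrative.

```python
participants = 21

would_use_facepet = 9    # would use the current device daily
would_not_use = participants - would_use_facepet
would_use_similar = 7    # of those who declined, would use a similar device

total_adopters = would_use_facepet + would_use_similar
adopter_share = f"{total_adopters / participants:.1%}"

popularity_yes = 17      # agree privacy features would make smart glasses popular
popularity_share = f"{popularity_yes / participants:.1%}"
print(would_not_use, total_adopters, adopter_share, popularity_share)
```

The 12 participants who declined the current prototype and the combined total of 16 match the counts given in the text.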

Study Limitations
Due to the sample size (n = 21) of our study and because all participants recruited in our study were from Columbus State University, the findings of this study cannot be generalized to a broader population. Thus, if we conduct our study with a broader and more diverse population, we may obtain different results from the ones currently presented in this work. As such, our conclusions are written in terms that relate to our participants rather than a broader population. While our sample size and its characteristics are similar to previous works that also used interviews, testing of devices, and the study of users in the wild [8,31,32,34,35,40], we acknowledge that to achieve external validity we will need to scale our experiment to reach a broader population and increase both the sample size and its diversity. To achieve this, we propose as future work the development of an experiment wherein participants do not rely on the FacePET device for the study. Instead, by using current advances in AI for face and eye detection, we could simulate how a participant would look with a bystander-based privacy protection device similar to FacePET, followed by participants interacting with an interface that simulates the device, and finally have participants answer an online survey or record them answering open questions about the simulated device. We plan to conduct this study in our future research works.

Conclusions
In this work we conducted a user study to assess user perceptions about the FacePET system or similar bystander-centric devices for facial privacy protection. We conducted our study with 21 participants who took a survey to gather information about facial and bystanders' privacy, privacy choices with cameras, and preferences about sharing photos. Participants then used the FacePET wearable and answered a second survey about the usability and perceptions of the system and/or similar devices. We found evidence that participants want some type of privacy protection when photographed, especially in contexts that involve consumer/lifestyle habits, and they do not wish to be photographed in contexts that involve health-related activities or locations. Participants also showed concerns about their facial privacy when photos are taken and published without their knowledge.
When the participants used the FacePET system, we found that even though they would not use the current prototype on a daily basis because of its bulkiness and unfashionable design, most of the participants agreed that they would use a device similar to FacePET to protect their facial privacy. Participants finally agreed that if smart glasses had features that would allow users to protect their facial privacy, this feature would make smart glasses more popular with the general public.
For future work, we will develop a research study to recruit more participants and address the external validity of the conclusions of our small study. To achieve this, we plan to create a research protocol that does not require the utilization of a physical wearable (e.g., access to a FacePET prototype) to scale the data collection. In addition, based on the results of the FacePET evaluation, we plan to improve the appearance of the FacePET design. Finally, we also plan to improve the facial privacy protection aspects of the device to protect against newer face detection and recognition systems based on deep learning and Convolutional Neural Networks (CNNs).