Digital Phenotyping and Patient-Generated Health Data for Outcome Measurement in Surgical Care: A Scoping Review

Digital phenotyping—the moment-by-moment quantification of human phenotypes in situ using data related to activity, behavior, and communications, from personal digital devices, such as smart phones and wearables—has been gaining interest. Personalized health information captured within free-living settings using such technologies may better enable the application of patient-generated health data (PGHD) to provide patient-centered care. The primary objective of this scoping review is to characterize the application of digital phenotyping and digitally captured active and passive PGHD for outcome measurement in surgical care. Secondarily, we synthesize the body of evidence to define specific areas for further work. We performed a systematic search of four bibliographic databases using terms related to “digital phenotyping and PGHD,” “outcome measurement,” and “surgical care” with no date limits. We registered the study (Open Science Framework), followed strict inclusion/exclusion criteria, performed screening, extraction, and synthesis of results in line with the PRISMA Extension for Scoping Reviews. A total of 224 studies were included. Published studies have accelerated in the last 5 years, originating in 29 countries (mostly from the USA, n = 74, 33%), featuring original prospective work (n = 149, 66%). Studies spanned 14 specialties, most commonly orthopedic surgery (n = 129, 58%), and had a postoperative focus (n = 210, 94%). Most of the work involved research-grade wearables (n = 130, 58%), prioritizing the capture of activity (n = 165, 74%) and biometric data (n = 100, 45%), with a view to providing a tracking/monitoring function (n = 115, 51%) for the management of surgical patients. Opportunities exist for further work across surgical specialties involving smartphones, communications data, comparison with patient-reported outcome measures (PROMs), applications focusing on prediction of outcomes, monitoring, risk profiling, shared decision making, and surgical optimization. The rapidly evolving state of the art in digital phenotyping and capture of PGHD offers exciting prospects for outcome measurement in surgical care pending further work and consideration related to clinical care, technology, and implementation.


Introduction
Technology-enabled solutions that capture patient-generated health data (PGHD)-data related to activity, mobility, cognition, behavior, mood and social interactions-are rapidly evolving with the aim of a more personalized, patient-centered, and data-driven approach to the delivery of surgical care [1][2][3][4][5]. The concept of "digital phenotyping" was first coined in 2015 by J.P Onnela as the moment-by-moment quantification of individual human phenotypes in situ using data related to activity, behavior, and communications from personal digital devices, such as smartphones and wearable sensors (wearables) [6][7][8][9][10]. While the first smartphones were developed around 1992, wider utilization and applications capturing PGHD occurred toward the late 2000s. The acquisition of PGHD in the form of patient-reported outcome measurements (PROMs) is commonplace in clinical research and increasingly common in clinical care. PROMs are questionnaires that quantify the patient's perspective of their physical, emotional, and social health, and are commonly collected using tablet devices and web-based, online portals [11][12][13][14][15]. The electronic capture and utility of PROMs has transformed the evaluation of health outcomes in surgical research, partly due to well-defined surgical pathways and time points during the preoperative baseline to postoperative recovery and rehabilitation [12,16]. However, the adoption of PROMs in clinical practice is limited by the burden placed on patients to interpret and complete surveys, is often restricted to the clinical encounter, and associated with several administrative and logistical barriers in sustaining longitudinal data collection, especially in busy, resource-limited settings [15,17].

Rationale
The continuous capture of passive PGHD in "real time" may overcome these limitations via digital phenotyping. However, little is known around digital phenotyping and PGHD in the context of outcome measurement in surgical care. An individual's digital phenotype and how they interact with these devices aims to provide dynamic insights around the impact of a given condition on the patient's lived experience, both within and outside health care settings. This rich data source may augment the way we traditionally acquire health information via physical assessment (clinical history and examination), and investigations (vital signs monitoring, laboratory tests, medical imaging), and further advance the tracking and surveillance of health, enhance decision making at the point of care, trigger the timely detection of clinical deterioration, and better predict surgical outcomes [13,14,18]. While a growing evidence base supports the value of digital phenotyping and PGHD to provide actionable data and targeted interventions, few have comprehensively characterized this technology in surgery or mapped current concepts for driving research and development in this field. The overarching goal of this study was to conduct a rapid scoping review of digital phenotyping and PGHD for outcome measurement in surgery to generate a repository of evidence for the current state of the art, identify knowledge gaps, and guide recommendations for future work.

Objectives
The primary objective was to map the application of digital phenotyping and digitally captured active and passive PGHD for outcome measurement in surgical care by study characteristics, clinical characteristics, technological/data characteristics, and functional characteristics. The secondary objective was to synthesize the body of evidence to define specific areas of further work necessary to translate this technology from research bench to surgical practice. Ultimately, this review aims to inform stakeholders in advancing the field of patient-centered digital health and outcome measurement in surgical care.

Study Design
We performed a rapid scoping review as a streamlined approach to synthesizing evidence for emergent research and development in this field [19][20][21]. We started with a strategic search applied to multiple electronic databases using search terms related to key concepts within our primary and secondary objectives. This was followed by a stepwise process of screening, data extraction, and synthesis.

Protocol and Registration
The protocol was developed a priori, guided by the Preferred Reporting Items for Systematic Reviews and Meta-analysis-Extension for Scoping Reviews (PRISMA-ScR) (Appendix A) [20], and study registered prospectively with the Open Science Framework, Center for Open Science (Registration No. url: osf.io/p9c7u).

Eligibility Criteria
Eligibility criteria were as follows: studies focused on adult patients undergoing any form of surgical care at any phase along the care pathway (i.e., preoperative evaluation, perioperative care, postoperative recovery and rehabilitation), involving personal digital devices used to capture active and/or passive PGHD, describing outcome measurement(s) across any health domain, within original studies (prospective, retrospective, technical feasibility) in peer-reviewed journals that were available in the English language. Studies were excluded if they involved pediatric and adolescent patients, non-surgical contexts, lacked capture of any form of PGHD, involved digital solutions to collect and synthesize PROMs only, or were reviews, commentaries, case studies, without original data, and not available in the English language.

Search and Data Sources
We developed a search strategy guided by our lead institutional librarian [IV], who is experienced in performing systematic reviews. Following rounds of refinement among the research team we defined and combined terms related to "digital phenotyping and PGHD" (concept A), "outcome measurement" (concept B), and "surgical care" (concept C) (Appendix B). Search engines were selected by consensus among authors and our librarian expert then deployed the final search strategy across the following electronic bibliographic databases: PubMed (NLM), Web of Science (Clarivate Analytics, Philadelphia, PA, USA), Cochrane Library (Wiley, Hoboken, NJ, USA), and IEEE Xplore; Databases were searched on 1 June 2020 and refreshed on 1 July of 2020 to ensure we acquired an up-to-date set of articles before reporting findings. No limits were set in publication dates for search purposes, however results spanned years from 1994 to 2020. Search results were limited by language (English only) and resource type (journal articles only). Search results were exported into and deduplicated with the citation management tool EndNote. The search was supplemented by scanning reference lists of relevant reviews.

Data Screening
Three investigators (EL/VG/JM) independently screened titles and abstracts from the full set of articles based on eligibility criteria. For quality control and to increase consistency among reviewers, all reviewers initially screened a set of 25 publications at the outset and discussed the results before continuing with the screening process. Subsets of articles from batches were cross-checked by investigators (PJ/EL/VG) for consistency and quality assurance. Excluded studies were coded with reasons for exclusion using the criteria established a priori. Any differences in judgment on inclusion/exclusion of studies were resolved by group discussions with the senior investigator (AH) as needed. Full-text articles were retrieved for further independent review and final assessment for eligibility (PJ/EU/VG). Number of articles screened, and articles excluded including duplicates were logged for each source of evidence and presented in a PRISMA flow diagram (Appendix C). The final study set for data extraction was thus identified.

Data Charting Process
Two investigators (PJ/EL) jointly developed the data charting system including electronic forms for screening, data extraction, and synthesis of relevant information (Microsoft Excel, v16.21, USA). The screening form logged articles for inclusion/exclusion, allowed tagging of queried citations for further discussion and recording reasons for excluding articles. The extraction form included parameters developed in relation to our primary objective, i.e., study characteristics, clinical characteristics, technological/data characteristics and functional characteristics. Data items for each category were selected by four investigators (PJ/EL/VG/JM) who charted data independently. These investigators regrouped at regular points throughout the screening and extraction phase to discuss and iterate the data charting parameters. Any inconsistencies were resolved by additional input from the senior author (AH) as needed.

Data Items and Extraction
We finally abstracted data from the full text (PDFs) of the final set of selected articles on: study characteristics (lead author, study year, country of origin, study design, total number of patients), clinical characteristics (surgical specialty, surgical procedure, point of application along care pathway), technological and data characteristics (type of device including brand/proprietary names, type of data), and functional characteristics (types of clinical function and utility) with additional notes to document salient points.

Appraisal of Individual Sources of Evidence
We focused on presenting the results as a "map" of data utilizing data visualizations and data tabulations along with a descriptive narrative as per published guidelines, in keeping with a broad and scoping systematic review [20,22]. While we closely reviewed the full text articles during the data extraction phase, we did not proceed with a formal critical appraisal partly given the heterogeneity of the study set (varying study designs in particular), and partly due to the lack of a universal and validated quality assessment tool.

Synthesis of Results
We synthesized results using coding and grouping of relevant data elements using our electronic database. with descriptive analysis using frequencies and percentages within each category of extracted data. Following consensus discussions on metrics of interest by three investigators [PJ/EL/VG], we proceeded to tabulate data and generate visualizations using a data analytics package (Tableau, 2020. v3.0, Mountain View, CA, USA). Visualizations included a geographical chart of country of origin for selected articles; bubble charts and other standard charts for other metrics of relevance, and a Sankey-type flow diagram (@SankeyMATIC, Virginia, USA) to provide an overview of the specific inter-relation between technological, data and functional characteristics.

Initial Evaluation and Selection of Studies
A total of 3001 citations were generated from the original literature search and after adjusting for duplicates (n = 575), 2426 remained for screening. After reviewing titles and abstracts, 2157 were excluded by criteria leaving 269 publications for full-text review. A further 45 studies were excluded based on a lack of alignment with our study objectives and leaving a final set of 224 articles (Table 1

Study Characteristics
The number of studies increased over time ( Figure 2). Studies originated from 29 countries with the majority performed in the USA (n = 74, 33%) ( Figure 3). The majority of studies featured original prospective work (n = 149, 67%), and a substantial proportion of studies involved technical validation and feasibility of digital solutions (n = 50, 22%) ( Figure 4). The cohorts of patients involved in these studies ranged from 5 to 406 participants.

Study Characteristics
The number of studies increased over time ( Figure 2). Studies originated from 29 countries with the majority performed in the USA (n = 74, 33%) ( Figure 3). The majority of studies featured original prospective work (n = 149, 67%), and a substantial proportion of studies involved technical validation and feasibility of digital solutions (n = 50, 22%) ( Figure 4). The cohorts of patients involved in these studies ranged from 5 to 406 participants.

Clinical Characteristics
Studies spanned 14 surgical specialties with the majority being performed in the context of orthopedic surgery (n = 129, 58%) and procedures including total joint replacement, fracture and soft tissue trauma reconstruction, joint fusion, brachial plexus injury, rotator cuff repair, anterior cruciate ligament reconstruction and carpal tunnel release ( Figure 5) (Appendix D). The majority of studies were conducted in the postoperative phase (n = 210, 94%).

Clinical Characteristics
Studies spanned 14 surgical specialties with the majority being performed in the context of orthopedic surgery (n = 129, 58%) and procedures including total joint replacement, fracture and soft tissue trauma reconstruction, joint fusion, brachial plexus injury, rotator cuff repair, anterior cruciate ligament reconstruction and carpal tunnel release ( Figure 5) (Appendix D). The majority of studies were conducted in the postoperative phase (n = 210, 94%).

Clinical Characteristics
Studies spanned 14 surgical specialties with the majority being performed in the context of orthopedic surgery (n = 129, 58%) and procedures including total joint replacement, fracture and soft tissue trauma reconstruction, joint fusion, brachial plexus injury, rotator cuff repair, anterior cruciate ligament reconstruction and carpal tunnel release ( Figure 5) (Appendix D). The majority of studies were conducted in the postoperative phase (n = 210, 94%).
The width of each flow is proportional to number of studies channeled from one category to another, i.e., the flow of number of articles published by technology type that involved the capture of activity, biometric and/or communications data in order to provide a given function

Functional Characteristics
The focus of the majority of studies was on tracking/monitoring of surgical patients (n = 115, 51%), and assessment of technical feasibility (n = 78, 35%), versus prediction of surgical outcomes (n = 32, 14%), risk profiling (n = 20, 9%), and surgical optimization (n = 25, 11%). A wide range of technologies were utilized such as activity trackers, smartphone applications, research-and commercial-grade wearables, and other sensors (Appendix E) alongside numerous types of activity, biometric, and communication-related data points ( Figure 7). As a single publication could be categorized in multiple functional characteristics, the sum of the values and sum of percentages is higher than 224 and 100%, respectively.
Notably, various patient-reported outcome measures were utilized in more than half of the studies (n = 121, 54%) and mostly used to validate wearable data. Findings from these evaluations, such as those assessing correlation between data types, were highly variable. PROMs in these studies included measures of condition-specific health (e.g., Hip Disability and Osteoarthritis Outcome Score, HOOS) (n = 86, 71%), general health and quality of life (e.g., Patient Reported Outcome Measurement Information System (PROMIS)-Global, RAND 36-Item Short Form Health Survey) (n = 54, 45%), and psychosocial factors (e.g., PHQ-9) (n = 6, 5%). A single publication could utilize more than one PROM; thus, the values are higher than the total 121 of publications. The width of each flow is proportional to number of studies channeled from one category to another, i.e., the flow of number of articles published by technology type that involved the capture of activity, biometric and/or communications data in order to provide a given function

Functional Characteristics
The focus of the majority of studies was on tracking/monitoring of surgical patients (n = 115, 51%), and assessment of technical feasibility (n = 78, 35%), versus prediction of surgical outcomes (n = 32, 14%), risk profiling (n = 20, 9%), and surgical optimization (n = 25, 11%). A wide range of technologies were utilized such as activity trackers, smartphone applications, research-and commercial-grade wearables, and other sensors (Appendix E) alongside numerous types of activity, biometric, and communication-related data points ( Figure 7). As a single publication could be categorized in multiple functional characteristics, the sum of the values and sum of percentages is higher than 224 and 100%, respectively. Notably, various patient-reported outcome measures were utilized in more than half of the studies (n = 121, 54%) and mostly used to validate wearable data. Findings from these evaluations, such as those assessing correlation between data types, were highly variable. PROMs in these studies

Discussion
Digital phenotyping and PGHD has been studied in a range of surgical contexts. Smartphones and wearable sensors have been used to capture an array of activity/mobility, biometric, and communication-related data. Studies have been conducted to establish the feasibility of these technologies to gather information from patients, while also assessing the potential for clinically meaningful functions, such as tracking and monitoring change in health status, decision support, and prediction of health outcomes. Our findings should be considered in light of some limitations.

Limitations
Firstly, scoping reviews encompassing broad concepts that generate large numbers of citations may be prone to human error where investigators inadvertently miss relevant articles. Further, where there are multiple investigators performing screening, there is a risk of alternative interpretations of abstracts. To mitigate this, we commenced screening following independent review and group discussion of a common set of articles, before proceeding with screening in batches and regular check-ins to query any concerns, share ideas, reach consensus, and resolve any disputes as necessary. Second, for speed, only two investigators were involved in developing the initial data charting system. A wider group discussion could have generated additional elements for consideration at the outset. Nevertheless, ample opportunities were built into our process for implementing ideas, new concepts, and iterating the data extraction chart. Third, given the heterogeneity of the articles and intention to encompass studies focused on technical feasibility as well as original research, it was challenging to identify a universal tool to appraise the quality and validity of the studies. Finally, while we aimed to comprehensively categorize the wide variety of devices among these studies, as none of the investigators were technologists, there may have been some degree of error in taxonomy and classification, especially among the commercial-and research-grade wearable/sensors. This may have been further complicated by the proprietary names for the devices which could have varied by geographical region or changed as technologies evolved.
Through the process of our full-text review, we identified three spheres of insights: clinical, technological/data, and interpersonal spheres, with future scopes of work required to realize the translation of personalized digital technologies from the research bench to surgical care [10].

Clinical Sphere
Authors have categorized surgical applications of wearable technologies into providing augmentative functions (the provision of information in real time for surgeons during clinical or surgical encounters, e.g., head-up displays on glasses), assistive functions (the use of wearables to replace physical tasks, e.g., gesture control of electronic systems while scrubbed for surgery), and assessment functions (i.e., objective measurement of clinical outcomes and disease severity, e.g., tracking mobility data and walking tolerance in degenerative musculoskeletal conditions) [24]. Wearable technologies can overlap to varying extents among these functions and be positioned at differing points along the continuum of surgical centeredness versus patient centeredness [24].
In this scoping review, beyond studies demonstrating technical feasibility alone, most studies involved the assessment function-commonly tracking and monitoring of activity and biometric data. Fewer studies involved prediction, risk profiling, surgical optimization, diagnostic processes, development of new interventions and care delivery models, shared decision making, decision support and targeted treatment selection, and recovery and rehabilitation support, e.g., gamification [25]. Personal digital technologies capturing PGHD were most commonly applied in the context of orthopaedics and neurosurgery. Applications mostly involved wearable motion sensors in populations with chronic musculoskeletal conditions, such as advanced osteoarthritis requiring total joint replacement [26][27][28][29]. In the context of musculoskeletal health in general, activity/mobility data (from accelerometers and GPS), communication data (text and telephone logs, screen time), and self-reported pain (phone-based visual analogue surveys) have been used to predict outcomes of care for spinal conditions [30]. Further, mobility metrics (gait speed using accelerometer and gyroscope sensors) from wearable sensors have been associated with health outcomes, including activities of daily life in older adults [31].
Digital phenotyping has also captured recovery metrics and physical activity in non-operative spinal care [18], augmented neurological care [47,48], signaled cardiovascular risk [49], characterized loneliness and social isolation [49], and been used to develop behavioral change interventions [50]. In relation to the point of application along the surgical pathway, personal digital devices have established baseline function [51][52][53], enabled advanced monitoring of biometric data during the perioperative phase/acute recovery phase [54], and tracked progress during postoperative recovery [25].

Future Scope
While there are a wide range of clinical applications, directions for further work and surgical use cases involving digital phenotyping can be summarized as (i) enhanced recovery monitoring, (ii) improving decision making, and (iii) surgical optimization (including optimization/prehabilitation). Further studies are also needed to understand the ability of PGHD to segment patient populations during the care cycle without stigmatizing the individual, define postoperative recovery trajectories, and assess the association of passive PGHD with PROMs.
As PGHD commonly involves activity-related metrics, there seems a natural opportunity to expand this form of measurement in orthopaedics to assess the association with PROMs capturing physical function, especially considering the direct impact of common conditions, such as osteoarthritis and fractures, and interventions, such as total joint replacement surgery and fracture fixation, on physical activities and mobility.
In relation to psychometric evaluation (i.e., assessment of validity, reliability, responsiveness, reproducibility, feasibility and user-friendliness), the same level of rigor applied to testing PROMs should be applied to passive PGHD. Full scale adoption of this technology across different surgical settings also requires forecasting of barriers and pitfalls related to surgical quality and safety, alongside the ethical, privacy, and legal considerations related to the use of this technology [18,55].

Technological/Data Sphere
Personal digital technologies such as smartphones-mobile devices used for core phone functions (voice calls, text messaging) and computing functions (wider software, internet, e.g., web browsing, mobile broadband, and multi-media functionality, e.g., gaming, music, video, cameras)-and wearable sensors (wearables)-small electronic devices embedded into items possessing computational ability that interface with the body-are now ubiquitous across the consumer market [3,24].
A rapidly evolving combination of sensors, displays, processors and storage memory, and interconnected software and computer algorithms are accelerating the collection, filtering, processing, interpretation, and visualization of an individual's interactions with their environment from raw data [24]. In this review, we map an array of these generated data points and categorize them into an "Activity-Biometrics-Communication framework of digital biomarkers" from personal digital devices ( Figure 6). The fast pace of this evolution is being fueled by developments in advanced technologies such as artificial intelligence and machine learning (especially around anomaly detection), increasing analytic capabilities alongside advances in collection and processing power. Increase in technical development has been matched by the explosion of scientific work in this field over the last twenty years. This growth may in part be due to the fast-paced release of wearable technologies in the health and fitness consumer market: FitBit releasing their first activity tracker and wearable technologies in 2014; Apple releasing the AppleWatch in 2015; and Garmin releasing the Forerunner 101 back in 2003.
Interestingly, the majority of studies in this review utilized research-grade technologies, despite most of the development, distribution, and sales occurring in the commercial sector. This raises questions around the availability, translation, and scalability of technologies developed and tested at the research bench, and whether such devices serve as appropriate benchmarks for testing commercial-grade devices. In contrast, the proprietary nature of the technology behind commercial-grade devices also warrants further discussion around standardization and scalability. How can health care professionals be sure that the summary statistics from different devices are measuring the same health domain? The heterogeneity throughout PGHD methodology provides a major challenge for scaling solutions.

Future Scope
Further work is needed around a data infrastructure and defining platforms necessary to support multiple forms of PGHD. The development of Fast Healthcare Interoperability Resources (FHIR), smart marker capabilities, and application programming interfaces (APIs) by the clinical informatics community should support the interoperability of personal digital technologies and applications in existing IT infrastructures as they become standardized within care pathways.
Infrastructure also depends on standardization of this technology, including the validation, testing, refinement, and standardizing the algorithms behind personal digital devices. The issue of standardization is particularly relevant when considering health domains derived from proprietary mathematical models, such as sleep quality. Greater clarity is needed around whether the raw data aligns with the processed data and whether these metrics are measuring what they claim to represent.
Validation of this technology should also include active PGHD, such as PROMs, and understanding the extent to which patient-reported outcomes can and/or should be used as comparators and benchmarks for passive PGHD. Extensive cycles of testing are needed to establish whether passive PGHD from personal digital devices can one day be used as standalone measures of health outcomes or be used side by side with active measurements.
Robust and transparent set of IT governance standards are required to optimize interoperability and reproducibility. In a broader sense, a strategic approach is needed to contend with the rate of technological advancement versus rate of adoption in existing health care settings. Further work should also evaluate the challenges around distribution, access to technology and costs.

Interpersonal Sphere
While digital phenotyping provides a rich source of PGHD to support the optimal delivery of surgical care, the nature of data captured using personal digital technologies (e.g., how many texts they wrote or how long they spent talking to friends and family; how long they took in moving from place to place); may further humanize the interaction between patients and clinicians [11,12,56]. This interaction contrasts with legacy electronic health record (EHR) systems and systems actively collecting PROMs (via in-person, digital and telephone assessment), that have led to patient and physician burden, burnout, inefficiencies, and distance between patients and providers.
A key aspect of humanizing the technology behind digital phenotyping requires an exploration of patient and professional perspectives around its acceptability, including sensitives and stigmas that may be associated with this form of data. Further, we need to understand how patients and professionals think and act in response to active and passive data capture, feedback, and visualization of metrics in relation to an individual's condition. Interestingly, early work assessing personal digital data in the mental health arena has shown most patients (in outpatient settings) are happy to share social media and passive smartphone data [20,57]. In contrast, authors have also shown populations have been very apprehensive about actively reporting data summaries from their wearables with concerns around data privacy [18,58].

Future Scope
Further work is needed in surgical settings around the willingness of patients to have their personal passive data shared and utilized for their care [57]. Studies should also investigate the influence of this exposure on performance metrics and outcomes, as well as the way this data shapes the relationship with clinicians [15,59]. The aspect of data overload and data fatigue for clinicians should also be explored. When it comes to multi-disciplinary care and care spanning primary and secondary care, we should aim to define ownership and responsibility for the data.
The successful adoption of digital phenotyping requires an interdisciplinary approach involving co-collaboration and co-development of innovations between stakeholders in health care and digital health (patients and families, health care professionals, medical device industry, researchers, designers, technologists, bioengineers and scientists).
Future work should also explore considerations for integrating PGHD and digital phenotyping into existing patient pathways within and outside the walls of hospitals and clinics. The minimum level of technology required to integrate this technology should be defined alongside an understanding of the relative advantages of data generated for health systems at the clinical, institutional, network, and policy level.

Conclusions
Widescale adoption and use of smartphone and wearable technologies in the consumer and surgical health care sector has sparked opportunities to provide a digital phenotype for patients that aims to reflect their physical ability, cognition, social interaction and behavior in free-living settings. Active and passive data generated from sensors within these devices provide a nuanced view of patient outcomes for surgical conditions both alone and in combination with other data elements. While the ubiquity of such personal digital devices across society averts the need to introduce further technology, substantial further work is needed in relation to technological (data collection and analysis), clinical (standardized integration into workflows) and interpersonal (impact on patient-professional relationship) spheres of research and development. As technological, clinical, and interpersonal considerations unfold in this fast-moving space, more sophisticated ways of modelling themes, such as natural language processing of scientific and technical resources, can be used to better understand these elements [60,61]. Digital phenotyping offers an advanced understanding of human behavior and promises to drive objective, scalable, time sensitive, cost-effective, and reproducible digital outcome measurement for improving routine surgical care.   Identify the report as a scoping review. 1

Structured summary 2
Provide a structured summary that includes (as applicable): background, objectives, eligibility criteria, sources of evidence, charting methods, results, and conclusions that relate to the review questions and objectives.

Rationale 3
Describe the rationale for the review in the context of what is already known. Explain why the review questions/objectives lend themselves to a scoping review approach.

3
Objectives 4 Provide an explicit statement of the questions and objectives being addressed with reference to their key elements (e.g., population or participants, concepts, and context) or other relevant key elements used to conceptualize the review questions and/or objectives.

Protocol and registration 5
Indicate whether a review protocol exists; state if and where it can be accessed (e.g., a Web address); and if available, provide registration information, including the registration number. Specify characteristics of the sources of evidence used as eligibility criteria (e.g., years considered, language, and publication status), and provide a rationale.

Information sources 7
Describe all information sources in the search (e.g., databases with dates of coverage and contact with authors to identify additional sources), as well as the date the most recent search was executed.

4
Search 8 Present the full electronic search strategy for at least one database, including any limits used, such that it could be repeated.

4; Appendix B
Selection of sources of evidence 9 State the process for selecting sources of evidence (i.e., screening and eligibility) included in the scoping review.

4
Data charting process 10 Describe the methods of charting data from the included sources of evidence (e.g., calibrated forms or forms that have been tested by the team before their use, and whether data charting was done independently or in duplicate) and any processes for obtaining and confirming data from investigators.

Data items 11
List and define all variables for which data were sought and any assumptions and simplifications made.

SECTI.ON ITEM PRISMA-ScR CHECKLIST ITEM REPORTED ON PAGE
Critical appraisal of individual sources of evidence 12 If done, provide a rationale for conducting a critical appraisal of included sources of evidence; describe the methods used and how this information was used in any data synthesis (if appropriate).
n/a; Comment on page 5.

Synthesis of results 13
Describe the methods of handling and summarizing the data that were charted. 6

Selection of sources of evidence 14
Give numbers of sources of evidence screened, assessed for eligibility, and included in the review, with reasons for exclusions at each stage, ideally using a flow diagram. 6 Characteristics of sources of evidence 15 For each source of evidence, present characteristics for which data were charted and provide the citations.

6
Critical appraisal within sources of evidence 16 If done, present data on critical appraisal of included sources of evidence (see item 12). n/a

Results of individual sources of evidence 17
For each included source of evidence, present the relevant data that were charted that relate to the review questions and objectives.

6,7
Synthesis of results 18 Summarize and/or present the charting results as they relate to the review questions and objectives. 6,7

Summary of evidence 19
Summarize the main results (including an overview of concepts, themes, and types of evidence available), link to the review questions and objectives, and consider the relevance to key groups.

7-11
Limitations 20 Discuss the limitations of the scoping review process. 7

Conclusions 21
Provide a general interpretation of the results with respect to the review questions and objectives, as well as potential implications and/or next steps.

Funding 22
Describe sources of funding for the included sources of evidence, as well as sources of funding for the scoping review. Describe the role of the funders of the scoping review. 11 JBI = Joanna Briggs Institute; PRISMA-ScR = Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for Scoping Reviews.

Appendix B
Appendix B. Search strategy for PubMed involving terms related to "digital phenotyping and PGHD" (concept A), "outcome measurement" (concept B), and "surgical care" (concept C)