Mobile Applications in Mood Disorders and Mental Health: Systematic Search in Apple App Store and Google Play Store and Review of the Literature

Objectives: The main objective of this work was to explore and characterize the current landscape of mobile applications available to treat mood disorders such as depression, bipolar disorder, and dysthymia. Methods: We developed a tool that makes both the Apple App Store and the Google Play Store searchable using keywords and that facilitates the extraction of basic app information from the search results. All app results were filtered using various inclusion and exclusion criteria. We characterized all resultant applications according to their technical details. Furthermore, we searched for scientific publications on each app's website and in PubMed, to understand whether any of the apps were supported by any type of scientific evidence on their acceptability, validation, use, effectiveness, etc. Results: Thirty apps were identified that fit the inclusion and exclusion criteria. The literature search yielded 27 publications related to the apps. However, these did not exclusively concern mood disorders. Six were randomized studies and the rest included a protocol and pilot, feasibility, case, or qualitative studies, among others. The majority of studies were conducted on relatively small scales and 9 of the 27 studies did not explicitly study the effects of mobile application use on mental wellbeing. Conclusion: While there exists a wealth of mobile applications aimed at the treatment of mental health disorders, including mood disorders, this study showed that only a handful of these are backed by robust scientific evidence. This result uncovers a need for further clinically oriented and systematic validation and testing of such apps.


Introduction
Currently, the majority of the world population uses a smartphone, and we use it an average of 61 h per week [1]. In fact, a 2018 report of the Spanish National Observatory for Telecommunications and the Information Society (ONTSI) reported that 78.9% of Spaniards (≥15 years of age) own a smartphone and 74.8% of them used them to access the Internet [2]. It is estimated that there are more than 325,000 applications (apps) classified as health or

App Search and Filtering
To explore the landscape of MHapps for the treatment of mood disorders, a list of key search terms was developed, namely: "bipolar", "depression", "dysthymia", and "mood". These were used to search both the Apple App Store and the Google Play Store. To extract the necessary data from the stores in a structured way, and then be able to filter the resulting apps appropriately, an in-house search engine was developed. It produced spreadsheets containing the following information: app name; app developer; approximate number of downloads for Google and number of user ratings for Apple; average user rating; genre; language (Apple App Store only); price; release date; and the corresponding URL. Our tool was limited to extracting up to 200 of the most relevant apps for each search.
The tool allows key terms to be searched in the App/Play Store of a particular country, thereby giving access to various, potentially region-specific apps. This research restricted its search to the following countries: Spain, France, Germany, Italy, the United Kingdom, and the United States. The search terms were used both in English and in each country's native language between 30/11/2020 and 11/12/2020. In the cases of the UK and USA, English and Spanish terms were used.
After generating spreadsheets of results for each search term, the results were compiled by country into master files and the initial two filters were applied: only apps with >10,000 downloads; and only apps belonging to the genres "Health and Fitness", "Lifestyle", "Medical", and "Education". After removing any apps not complying with the above, all country-specific results were compiled into one overall master file. Repeated app results were manually removed, and two further filters were applied: excluding any apps explicitly not available in Spanish; and excluding apps whose most recent update dated back further than a year, i.e., before 2020. This was done to try to exclude apps that were no longer actively used and/or improved upon. Only Apple provided specific language information, so Apple apps that did not list Spanish were removed. In the Google results, some apps were available only in another language (for example, Portuguese or Italian) and were therefore also removed.
Lastly, while systematically reviewing all apps, 4 distinct app types arose, which were eliminated from the results due to irrelevance to the project, namely: meditation apps; journaling, diary, or (mood) tracker apps; diagnostic tests, with varying degrees of clinical legitimacy (e.g., filling in questionnaires of symptoms for categorization for depression, anxiety or bipolar, without intention of treatment); and apps providing information only, without any element of interaction or treatment. Although many of these apps may be used in the global treatment of mental disorders, their primary focus is not strictly on intervention, hence they were excluded. Table 1 provides an overview of the applied app exclusion criteria.
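The automated filters described in this section (not the manual review of app types) can be illustrated with a minimal sketch. It is not the in-house tool itself: the record fields, the "ES" language code, and the name-plus-developer de-duplication key are assumptions made for illustration, and the >100 ratings threshold used for Apple apps is the proxy reported in the Results.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Genres retained by the first filtering step.
ALLOWED_GENRES = {"Health and Fitness", "Lifestyle", "Medical", "Education"}


@dataclass(frozen=True)
class AppRecord:
    """One spreadsheet row; field names are hypothetical."""
    name: str
    developer: str
    store: str                 # "google" or "apple"
    genre: str
    popularity: int            # downloads (Google) or number of ratings (Apple)
    last_update: date
    languages: Optional[frozenset] = None  # only Apple lists languages


def passes_filters(app: AppRecord) -> bool:
    # >10,000 downloads for Google; >100 ratings as the Apple proxy.
    threshold = 10_000 if app.store == "google" else 100
    if app.popularity <= threshold:
        return False
    if app.genre not in ALLOWED_GENRES:
        return False
    # Exclude only apps that explicitly do not list Spanish.
    if app.languages is not None and "ES" not in app.languages:
        return False
    # Exclude apps last updated before 2020.
    return app.last_update >= date(2020, 1, 1)


def filter_apps(apps):
    """Apply the filters and drop repeated results (assumed key: name + developer)."""
    seen, kept = set(), []
    for app in apps:
        key = (app.name.casefold(), app.developer.casefold())
        if key in seen or not passes_filters(app):
            continue
        seen.add(key)
        kept.append(app)
    return kept
```

In the study, repeated results were removed manually and Google apps available only in other languages were identified by inspection; the sketch automates only the parts that can be expressed as simple rules.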

App Characterization
The filtered apps were then systematically analyzed in more detail to understand the following parameters: specific type of app and objective; developer profile (who and what kind of background do they have); target population; type of data collected by the app.

Publication Search
Furthermore, a literature search was performed, to see whether any of the apps were supported by any type of scientific evidence on their acceptability, validation, use, effectiveness, etc. This was done by searching PubMed using each app's name in combination with several keywords (displayed in Figure 1). No further filters or search criteria were applied to the PubMed search. For each relevant search result, any listed related literature was also examined to ensure the inclusion of all relevant available information that may not have been shown using the search terms in Figure 1. Additionally, each app's developer's website was examined for references to published evidence to also be included.
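As an illustration of how such a name-plus-keyword search could be scripted, the sketch below builds a PubMed query string and a request URL for the NCBI E-utilities ESearch endpoint. This is an assumption for illustration only: the text does not state that the search was automated, the keywords shown are placeholders rather than the terms in Figure 1, and combining the keywords with OR is likewise assumed.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_query(app_name, keywords):
    """Combine an app name with keywords: "<app>" AND (kw1 OR kw2 ...)."""
    # Multi-word keywords are quoted so they are searched as phrases.
    quoted = [f'"{k}"' if " " in k else k for k in keywords]
    return f'"{app_name}" AND ({" OR ".join(quoted)})'


def build_esearch_url(app_name, keywords, retmax=100):
    """Return an ESearch URL for the query; no request is sent here."""
    params = {
        "db": "pubmed",
        "term": build_query(app_name, keywords),
        "retmode": "json",
        "retmax": retmax,
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Placeholder app name and keywords (hypothetical, not from Figure 1).
example_url = build_esearch_url("ExampleApp", ["depression", "mobile application"])
```

No additional field tags or date limits are added, in line with the statement that no further filters were applied to the PubMed search.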

Results
The search resulted in >500 apps, of which 30 apps met the inclusion and exclusion criteria. Figure 2 shows the flowchart of app results obtained at various stages of filtering and application of inclusion and exclusion criteria.
Certain metrics were not readily available from the platforms, specifically, the number of app downloads from the Apple App Store. Therefore, as a proxy, the number of user ratings was used, with a more generous threshold of >100 ratings, as there tended to be far fewer ratings. The app results are split up into three categories of results, namely: 8 apps (26.7%) with published evidence available (Table 2); 15 apps (50%) with no published evidence, but legitimate background (Supplementary Materials Table S2); and 7 apps (23.3%) with limited available information (Supplementary Materials Table S3). More detailed information, including developer information, URLs, and the type of data collected, of the apps which had connected published evidence can be found in Supplementary Materials Table S1.
The results include technical details (expanded on in Supplementary Materials Tables S1-S3) of each app and information regarding the type of service it provides and under which objective, what its target population is, the type of data collected by the app, and the profile of the developer(s), i.e., their backgrounds and potential ties to research groups, clinicians, public health efforts, etc.
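The reported category shares can be reproduced from the counts with a few lines of arithmetic. The dictionary keys below are shorthand labels chosen for this sketch, not the authors' terminology.

```python
# Number of apps per evidence category, as reported in the text.
category_counts = {
    "published evidence": 8,
    "no published evidence, legitimate background": 15,
    "limited available information": 7,
}

total_apps = sum(category_counts.values())

# Share of each category as a percentage, rounded to one decimal place.
category_shares = {
    label: round(100 * count / total_apps, 1)
    for label, count in category_counts.items()
}
```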
Supplementary Materials Table S4 details all the published articles that we were able to identify related to one of the apps listed in Table 2. A total of 27 publications were found through the PubMed search, supplemented by additional findings from developers' websites. Each of these publications described distinct studies of varying sizes and robustness. The results contained: 7 observational/longitudinal studies; 6 randomized controlled trials (RCTs), including one crossover study; 5 nonrandomized feasibility/usability/acceptability/effectiveness studies, including 4 pilot studies; 4 descriptive studies; 1 user satisfaction survey; 1 case study; 1 focus group; 1 context analysis; and 1 study protocol. The studies showed a high degree of heterogeneity in terms of the type of mental health disorder investigated. Despite searching for apps designed to treat mood disorders such as depression, dysthymia, and bipolar disorder, these studies focused on various disorders that do not all fall into one of these categories yet could be understood to be interrelated in the larger context of mental health. Table 3 illustrates this heterogeneity and the various disease or treatment areas addressed in each publication, which mainly included depression and anxiety, but also obsessive-compulsive disorder (OCD) and body image disorder (BID), among others. Some publications included patients with different disorders, such as both anxiety and depression patients. Lastly, other publications focused more on usability, acceptability, or user satisfaction than on any clinical benefit of the apps [34–36,51,52].
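As a quick consistency check, the study designs listed above can be tallied against the reported total of 27 publications. The labels are abbreviated for this sketch, and the RCT share is derived here rather than quoted from the text.

```python
# Publications per study design, as listed in the text.
study_designs = {
    "observational/longitudinal": 7,
    "randomized controlled trial": 6,
    "nonrandomized feasibility/usability/acceptability/effectiveness": 5,
    "descriptive": 4,
    "user satisfaction survey": 1,
    "case study": 1,
    "focus group": 1,
    "context analysis": 1,
    "study protocol": 1,
}

total_publications = sum(study_designs.values())

# Derived share of RCTs among all identified publications, in percent.
rct_share = round(
    100 * study_designs["randomized controlled trial"] / total_publications, 1
)
```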

Principal Findings
This article presents a systematic search of the Apple and Android app stores for mobile apps specifically targeted at mood disorders. In light of the fact that, despite limiting our search to mood disorder-related terms, the resulting apps and associated publications concerned disorders in the broader topic of general mental health, this discussion treats MHapps in general, not limited to mood disorders. This is because a lot of the discussed issues apply broadly to MHapps, as well as to other health apps.
We aimed to explore not only the breadth and variety of available apps for mood disorders, but most importantly we sought to understand the scientific and clinical evidence base supporting the use of the most popular ones. Emphasis on evidence-based interventions could propel a pivotal paradigm shift away from more traditional ways of treating mental health disorders and toward mHealth/eHealth, especially in a post-COVID era that has made us rethink conventional patient-practitioner interactions.
What we found, however, was that while the app marketplace offers users a wide range of apps marketed toward mental health intervention, only 30 mobile apps fit the inclusion criteria, and of those only 8 (26.7%) were supported by any type of published scientific evidence. Moreover, while the resulting apps could be used in the treatment of mood disorders, only 5 of the 27 associated publications measured the effectiveness of an MHapp on health outcomes and symptoms of mood disorders in particular, namely depression, including postpartum depression (PPD) [37,38,53,55,56]. Table S4 of the Supplementary Materials provides an overview of all publications, the type of evidence reported within them, and their principal findings. Most applications in this overview are not supported by studies published in scientific journals and lack the approval of official agencies endorsing their use. The few publications which did report clinical outcome measures found positive changes to measures such as depression symptom severity, or improved well-being scores. However, studies were early stage and sample sizes tended to be small.
This implies that while a plethora of mobile interventions is developed and marketed, the channels through which this is commonly done do not lend themselves to implementation within healthcare provision contexts, which usually rely on robust effectiveness and safety testing. This might mean that mHealth interventions developed and tested in formal research settings are rarely made available to the general public, that they do not garner popularity and attention, that they simply do not exist, or that they are widely outnumbered by applications developed in non-research settings.
Our findings highlight considerable shortcomings in the clinical validation of even the most popular MHapps. While the 5 aforementioned publications did report a promising amelioration of wellbeing or various depression symptoms, these findings arise from small-scale (pilot) studies with various methodological limitations. Therefore, these findings cannot be considered robust enough to provide strong scientific support for the routine clinical use of such interventions. Moreover, comparing these trial findings with the type of robust and methodologically sound evidence necessary to approve, implement, and recommend other kinds of health interventions (e.g., drugs, devices, therapeutic methods) once more underscores the lack of systematic testing and validation of mental health apps.
Although app developers have made efforts to incorporate evidence-based treatments, such as cognitive behavioral therapy, more research is needed to improve the clinical validity, treatment reliability, and safety of MHapps. This is supported by the findings of a recent systematic search and content analysis of depression apps, which assessed how mobile applications measured up against the National Institute for Health and Care Excellence (NICE) guidelines for the treatment of depression in adults [61]. None of the identified apps fully aligned with the NICE guidelines, and the authors urged developers to consult and regard relevant guidelines and standards throughout app development and content design.
This calls into question whether direct-to-consumer (DTC) is the most effective and safe route for MHapps to be marketed and distributed to patients struggling with complex mental health pathologies.

Safety and Ethical Considerations of MHapps
The principal area of concern regarding health apps in general, but of course also MHapps, is privacy and the use and protection of personal/medical data. A recent analysis of privacy-related permissions of diabetes apps found that approximately 60% of the analyzed apps requested potentially dangerous permissions, meaning permissions that might lead to data breaches and thus pose a considerable risk to data privacy [73]. Moreover, the authors found that app users may not always realize that the business model of free apps is largely based on advertising and, consequently, on directly or indirectly sharing or selling their private data to unknown third parties [73]. These concerns about privacy further expand in the context of apps that use passive monitoring of individuals with mental illness. This involves collecting data from patients through sensors without requiring direct patient input, such as speech patterns, mobility, activity level and signs of social interaction [74,75]. A considerable population segment does not want their digital activity to be monitored and tracked, and without an understanding of the digital economy, which is based on creating value from the analysis of tracked behavioral data, encouraging the use of MHapps may inadvertently lead to harm [76–81].
Furthermore, a risk to MHapp users' safety may be the promotion of unproven, unsafe, and misleading messages. A study of 61 frequently used MHapps concluded that the themes they emphasized may promote medicalization of normal mental states and imply individual responsibility for mental health [10]. While the idea of mental health care for everyone might help reduce stigma, this type of messaging could lead to overdiagnosis and pharmaceutical overtreatment [82] and be potentially dangerous for diagnosed patients who need a clear understanding of when to seek professional help [10]. Moreover, in the absence of adequate regulation and if affiliations to regulated mental health professionals are lacking, DTC MHapps may connect users to nonprofessional therapists or chatbots with limited personalized treatment capacities. MHapps may also fail to provide emergency information, all of which exacerbates concerns over safety, accountability, and treatment effectiveness and adherence [4,5,83–85].

Effectiveness and Evidence of MHapps
Previous studies corroborate our findings that despite the potential of mobile mental health intervention, only a slim percentage of MHapps are based on clinically validated research and a lack of evidence on the effectiveness of mobile health apps is pervasive [64,86–88]. Even reviews of MHapp controlled trials generally conclude that studies are of mixed quality and highlight the necessity for further systematic investigation [89]. However, considering the low cost of entry for app developers in general, it is unlikely that many of them will ever be able to afford even a simple clinical trial to validate effectiveness and safety [90]. This is especially true for private sector products, which will often not be subject to more rigorous testing, unlike digital health technology developed by clinical researchers, and may instead be designed to maximize user engagement. This has been termed the "commercialization gap" and it can lead to situations where DTC MHapps end up being popular despite being less effective [83,91]. The primary goal of such an app may be regular engagement, instead of efficacious treatment.

Access to and Adoption of Mobile Mental Health Interventions
Currently, there exist no consequences for marketing mobile health interventions containing inaccurate or non-evidence-based information, although calls to improve health app oversight and raise the standard of app development and clinical validation mechanisms are increasing [4]. The described issues give rise to opportunities for collaborations between industry and clinical researchers, with the goal of developing MHapps that are safe and effective, while also sufficiently engaging to ensure compliance and sustain therapeutic effect [69,92,93]. Such collaborations could infuse private app developments with the viewpoints and priorities of healthcare professionals, or vice versa, making interventions originating from research contexts more commercially viable and attractive.
Besides collaboration, a different approach to MHapp quality assurance may be to rethink the routes of access and accreditation of such interventions, to facilitate eventual integration with clinical practice. Curated, though limited, app libraries, such as Psyberguide [94] or the NHS App Library [95], aim to provide a solution to the unstructured and at times overwhelming access to (mental) health apps. Official regulatory bodies, such as the FDA and the European CE marking directives [96], list just 9 MHapps to date. App assessment tools, such as the APA framework [97] or the Mobile Application Rating Scale (MARS)/User MARS (uMARS) [98], put the onus on app users or their healthcare providers to assess app quality.
Another recent example of MHapp access facilitation is the AppSalut Site, created by the Catalan Fundació TIC Salut Social [99]. This project aims to showcase apps in the field of health and social services, promoting health within the public. The catalogue allows prescription of certified apps by primary care doctors, and generated data can then be consulted by the professionals. A 5-month pilot study of this system validated the functionality of the platform and its compliance with data security regulations. However, it did not assess any form of clinical effectiveness [100].
In Germany, the Federal Parliament passed the Digital Healthcare Act (Digitale-Versorgung-Gesetz, DVG) in 2019, allowing digital health applications (DiGA) to be prescribed by either physicians or psychotherapists and reimbursed by statutory health insurance [101].
These types of projects and legislation allow us to start thinking about MHapps as something to be prescribed within a context of clinical guidance, similarly to common pharmaceutical interventions. This conceptual shift may also facilitate the construction of infrastructure, or a "pipeline", that allows for more robust clinical testing, validation, and evidence generation with the aim of being included within the prescribable pool of mobile applications. This could furthermore be coupled with continuous evidence-reporting and retrospective outcome assessment in individual patients, as well as across user populations, in cases where large-scale RCTs may not be feasible. These kinds of arrangements could be thought of as a type of market access agreement that is subject to continued evidence development. Based on these outcome measurements, MHapps could be continually improved, and ineffective interventions could be weeded out. These types of structures should ideally work in concert with the development of standards, such as standardized health outcomes that should be consistently measured in studies assessing the effectiveness and safety of health apps, with specific adaptations for different therapeutic areas. Of course, such outcome measurement at individual patient level and subsequent incorporation of this data into the electronic health record, for example, would be ideal.
However, at a logistic and technological level, this might be a lofty goal to aspire to, and the lack of robust validation and accreditation is only one of the many stumbling blocks in the road toward incorporating MHapps in a broader healthcare context. Despite the precedent set by the DiGA, on a European level there exist no specific regulations on the use of digital therapeutics (DTx) [102]. Similarly, a dedicated FDA regulatory framework for software-as-a-medical-device (SaMD) solutions remains up in the air [103]. The European Data Protection Supervisor identifies various risks to data security in relation to DTx, such as constant observation of the patient or risk of data breaches, which make the development of appropriate legislation difficult [102]. Ensuring the security of large-scale healthcare data infrastructure, which offers appropriate levels of oversight, is a hugely complex task.
Furthermore, even considering that DTx solutions may undergo rigorous RCT validation, unlike traditional pharmaceuticals, they have the potential to be frequently updated after regulatory approval, a matter further complicated with the incorporation of AI technologies [103,104]. This means that regulatory pathways need modernization to account for the adaptive nature of DTx.
Lastly, a considerable issue in the adoption of DTx is the lack of standardized payment and reimbursement frameworks [103]. Options may include licensing or value-based agreements, but without clear guidance on DTx financing within the various health insurance structures across Europe, prescribers and payers may be unable to transition away from traditional models and patients cannot access these therapeutic options.
While proper clinical validation and accreditation is certainly not the only hurdle facing MHapps and DTx, it is an important step toward building an environment conducive to implementing the necessary frameworks, so that DTx interventions can be used safely and effectively.

Implications for Further Research and Policy
The message of this study is that clear, more robust evidence is necessary for the development and subsequent clinical implementation of MHapps. More outcomes-focused research is the crucial building block to harness the potential of mHealth in the treatment of mood disorders and mental health disorders in general.
Well-designed studies and the implementation of standardized outcome-monitoring could address concerns regarding effectiveness and safety and help overcome skepticism towards the systematic implementation of MHapps, both from the sides of healthcare professionals and users.
A concrete action toward holding MHapp and DTx products accountable to the same level of scientific rigor that is expected of traditional pharmaceuticals could be the development and widespread use of app-assessment and accreditation tools. Having standardized assessment metrics to judge MHapp effectiveness and safety, which demand a certain type and quality of evidence, would not only ensure a product's merit, but also help developers at the time of creating their apps. As with traditional medical products, knowing the requirements for approval helps guide the R&D process to gather all necessary data and substantiate a drug or device's claims.
Entrenching such assessment tools in a formal authorization process undertaken by a national or international governing body cements the need for robust evidence if we want to start thinking about apps as prescribed therapeutic options or adjuvants. It is also clear that for this, DTx-specific legislation, approval pathways, and monitoring systems are necessary, which consider all the ways apps and digital solutions are distinct from traditional medicines and devices. The concept of app "administration" may lend itself more easily to appropriate regulatory oversight in terms of privacy and accountability, seeing as healthcare professionals are involved in the process. The authors believe that this is best encouraged and catalyzed through research and industry collaborations, which can capture the various relevant perspectives and needs. Involving diverse stakeholders, such as users, researchers, healthcare providers, and software developers, in the creation of applications, as well as standards and best practices, may best tackle the various issues effective MHapp implementation still faces to date. Industry-based developers might find such collaborations attractive if they can facilitate mHealth interventions reaching wider target populations and garner trust and a positive reputation with clinical professionals.

Limitations
There exists no gold standard for the systematic search and evaluation of mHealth interventions. Despite searching for apps designed to treat mood disorders such as depression, dysthymia, and bipolar disorder, the resulting studies focused on various disorders that do not all fall into one of these categories, yet could be understood to be interrelated. Moreover, relying on the information that is publicly available through the Apple App Store and Google Play Store carries some limitations, such as incomplete information (e.g., the lack of download information or language specifications, depending on the store) and unstructured information organization. For the ease of our study, we developed a search tool to extract the relevant data in a structured format. Our search was conducted between 30 November and 11 December 2020, and considering that the app landscape is rapidly changing, conducting the same review at a later time might yield different results. Considering the above, we recognize that our results may not be reproducible, despite the transparency of our methods.
Furthermore, our review focused on a Spanish context. Despite searching app stores in the EU5 and USA, we did implement exclusion criteria that would filter out apps that were explicitly not available in Spanish. This was done because this study forms the basis of a larger research project, which aims to develop an app evaluation tool for use within the Spanish healthcare context. In addition, this study was not designed specifically in accordance with PRISMA guidelines. However, we did attempt to construct a robust rationale for the various inclusion and exclusion criteria we applied.
Lastly, we did not undertake a thorough examination of the functionalities of all apps, beyond the basic technical details, since this research limited itself more to understanding the evidence base instead of the effectiveness or adequacy of the individual apps in treating mood disorders. However, in further research it may be interesting to use the MARS/uMARS, for example, to evaluate other characteristics such as engagement, functionality, or information quality.

Conclusions
The use of digital technology in the treatment of mental health is an area of immense potential, especially considering the double-edged consequences of COVID-19: a greater mental health burden accompanied by the increased facilitation of tele- and mHealth. Mental healthcare could be made more accessible and affordable, and stigma could be reduced through the effective use of MHapps. However, the lack of robust scientific evidence is continuously underscored, not only in the present study, but in many examinations of the current app landscape. Finding ways to facilitate robust evidence generation in a timely and cost-effective manner will remain a significant challenge. Moreover, it will always be necessary to ensure that compliance with meticulous empirical research standards is prioritized over the potential appeal of producing a "hit" app. Here, research and industry collaborations, or innovative methodological approaches, may offer some solutions by incorporating diverse viewpoints to tackle issues such as producing efficacious apps, setting standards and best practices, and defining universally applicable empirical outcome measures. Additionally, appropriate regulatory oversight, especially when dealing with privacy and the protection of patient data, will be crucial.

Supplementary Materials:
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ijerph19042186/s1, Table S1: Resulting apps with some available published evidence; Table S2: Resulting apps with no available published evidence, but legitimate background; Table S3: Resulting apps with no available published evidence and limited available information; Table S4: Summary of all identified publications pertaining to the found mental health apps.
Funding: The research for this paper was fully funded by the Instituto de Salud Carlos III from the Spanish Ministry of Science, Innovation, and Universities (PI21/00234).
Institutional Review Board Statement: Not applicable, study did not involve humans or animals.

Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.