Machine Learning in Chronic Pain Research: A Scoping Review

Owing to its high prevalence and associated costs, chronic pain has a significant impact on individuals and society. Improvements in the treatment and management of chronic pain may increase patients’ quality of life and reduce societal costs. In this paper, we evaluate state-of-the-art machine learning approaches in chronic pain research. A literature search was conducted using the PubMed, IEEE Xplore, and Association for Computing Machinery (ACM) Digital Library databases. Relevant studies were identified by screening titles and abstracts for keywords related to chronic pain and machine learning, followed by analysing full texts. Two hundred and eighty-seven publications were identified in the literature search. In total, fifty-three papers on chronic pain research and machine learning were reviewed. The review showed that while many studies have emphasised machine learning-based classification for the diagnosis of chronic pain, far less attention has been paid to the treatment and management of chronic pain. More research is needed on machine learning approaches to the treatment, rehabilitation, and self-management of chronic pain. As with other chronic conditions, patient involvement and self-management are crucial. To achieve this, patients with chronic pain need digital tools that can help them make decisions about their own treatment and care.


Introduction
Chronic pain has a serious impact on both individuals and society. Prevalence estimates vary among studies, partly due to inconsistent definitions of chronic pain and differences in the methods used to estimate it [1]. Several studies conducted in Europe and the United States have estimated that prevalence is 15-50% [1][2][3][4][5]. In terms of individuals, chronic pain can severely affect quality of life [6,7]. It is also known that comorbidities are common among patients with chronic pain, e.g., anxiety, depression, sleep disturbance, fatigue, and decreased overall physical and mental functioning [1,8,9]. Mortality rates have been found to be higher in individuals affected by chronic pain. For instance, a large meta-analysis showed that for widespread chronic pain, mortality is 2.5-times higher than normal [10]. In terms of society, chronic pain has a major influence on the consumption of healthcare resources and related costs [11]. The reduced work capacity of patients with chronic pain results in major expenditures in sick leave compensation and disability benefit [12]. For individuals with chronic pain, a study found that the probability of being unemployed was twice as high and the probability of receiving disability benefits was four-times higher, compared to the general population [13].
The most common definition of pain is "an unpleasant sensory and emotional experience associated with actual or potential tissue damage, or described in terms of such damage" [14]. Chronic pain or persistent pain is pain that lasts more than three months (i.e., longer than the normal healing time of tissue). Chronic pain is different from acute pain. Acute pain is caused by a specific disease or injury. Acute pain is usually resolved as the injured tissue heals [15]. Chronic pain may arise from psychological states, does not serve any biological purpose, and does not have a predictable endpoint [16]. Despite recent advances in empirical pain research, effective methods for the treatment of chronic pain are still lacking. How individual patients with chronic pain respond to treatment and the adverse effects they experience are highly variable. In many cases, the cause of chronic pain is unknown, as there are no objective findings or measures that can be used to detect it. This makes treatment and management complex and time consuming. Due to the high variability among patients, successful management of chronic pain requires personalisation with respect to each patient's pain intensity and duration, disease state, tolerance of adverse events, and risk of medication abuse [17].
Identifying the underlying mechanisms of disease and enabling personalisation of care are key to improving the treatment of chronic pain. These are areas where machine learning (ML) is expected to be useful because ML excels at detecting patterns, rules, and causal dependencies in large datasets. ML algorithms learn from data without being explicitly programmed and can be used to obtain insights and make predictions and decisions from large and complex datasets. In healthcare, these analyses can help to better predict, identify, and treat diseases. To create reliable ML models, large volumes of relevant training data are needed. Increased adoption of information systems in healthcare creates vast amounts of health data and facilitates research in ML, which has the potential to realise person-centred healthcare. Simultaneously, citizens are increasingly using health apps and wearables for self-management, thus generating large amounts of health-related data outside of the formal healthcare sector. This wide range of health data, generated both in and out of clinical settings, can potentially be used by ML algorithms to support chronic pain research. This paper intends to determine the state-of-the-art knowledge in this field, identify directions for future studies using health data from numerous sources, and indicate which decision-support tools and recommendation systems for both patients and clinicians may be useful. Such tools can be built either by using existing chronic pain models or data-driven methods.
ML is used both for classification (predicting discrete values such as male/female, chronic pain/no chronic pain, etc.) and regression (predicting continuous values, such as height, salary, etc.). An ML algorithm can be supervised (labelled training data available) or unsupervised (no labelled training data available). Brief explanations of the most common ML algorithms follow below. For more information about ML algorithms, the book by Theodoridis [18] can be consulted.
A naive Bayes classifier calculates the probability that a data point belongs to a certain class using the Bayes rule. The data point is assigned to the class to which it most likely belongs. This method is naive in the sense that it assumes the features of a data point are conditionally independent of each other given the class. Support vector machine (SVM) is a supervised method where the task is to create a hyperplane separating two classes. The hyperplane is found by optimising a cost function. Decision trees (DTs) are multistage decision systems where classes are sequentially rejected until an accepted class is reached. DTs can be used both for classification and regression. A random forest (RF) is an ensemble of DTs. The output of an RF is the mode of the classes (classification) or the average prediction of the trees (regression). K-nearest neighbours (kNN) is a supervised learning method where for each data point, the distance to the training points is considered. If most of the nearest training points to the data point belong to a certain class, the data point is assigned to that class. Clustering is the task of dividing an unlabelled dataset into groups. k-means clustering is one of the most popular clustering methods. In k-means clustering, the task is to partition n data points into k clusters. Each observation is assigned to the cluster with the nearest mean. AdaBoost is the abbreviation for adaptive boosting. Boosting is a technique used for creating a strong classifier from a number of weak classifiers. Deep learning is a family of ML methods based on neural networks (NNs). The algorithms are inspired by the human brain. NNs consist of neurons arranged in different layers, similar to neurons in the brain.
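To make the kNN description above concrete, the following is a minimal sketch in plain Python. The function name `knn_predict`, the choice of Euclidean distance, and the toy feature vectors are assumptions made for illustration; they are not taken from any of the reviewed studies.

```python
from collections import Counter
import math


def knn_predict(train_points, train_labels, query, k=3):
    """Assign `query` to the majority class among its k nearest training points."""
    # Euclidean distance from the query to every training point
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest training points
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote over the labels of those neighbours
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy, made-up 2D feature vectors (not data from any reviewed study)
train_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
                (4.0, 4.2), (4.1, 3.9), (3.8, 4.0)]
train_labels = ["control", "control", "control",
                "chronic pain", "chronic pain", "chronic pain"]

print(knn_predict(train_points, train_labels, (1.1, 1.0)))
print(knn_predict(train_points, train_labels, (4.0, 4.0)))
```

In practice, the value of k and the distance measure would be chosen by validation on held-out data; the values here are arbitrary.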
Previous review papers have focused on how ML and statistical methods [19][20][21][22] can be utilised in the treatment of patients with chronic pain. However, these reviews had a restricted scope or were outdated in that they did not consider recent developments in ML research. An up-to-date analysis of ML research in chronic pain care is currently lacking. This is the intended contribution of the present paper.
The aim of this scoping review is to summarise and disseminate research findings on the use of ML approaches in chronic pain research, to identify research gaps, and to make recommendations for future research [23]. The literature search was guided by a set of predefined search queries to major research databases. The identified publications were screened by several independent reviewers following predefined inclusion and exclusion rules. This rigorous process ensured broad coverage of published evidence and facilitated an objective assessment of the validity and relevance of every publication included in the review [23]. The present review addresses five research questions (RQ1-RQ5), which concern the ML methods used (RQ1), the data size and data sources (RQ2 and RQ3), the chronic pain conditions studied (RQ4), and the stages of chronic pain care addressed (RQ5). By answering these research questions, we aim to contribute to the body of knowledge by providing insight into how ML approaches may be used to improve the treatment and management of patients with chronic pain.

Search Strategy
The literature search of three databases, PubMed, IEEE Xplore, and the ACM Digital Library, was performed on 27 February 2019. The search required the keyword "machine learning", together with "pain" or "fibromyalgia", to be present in the title or abstract. Specifically, the following query was used: (pain OR fibromyalgia) AND machine learning.
This query returned 299 records. After removing duplicates, 287 records were included in the screening process (Figure 1). PubMed searches required keyword presence in the title or abstract. This option was not available in IEEE Xplore and the ACM Digital Library, which limited keyword searches to abstracts only.
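The logic of the search query and the duplicate removal can be sketched as follows. This is only an illustration: the databases apply their own query engines, the duplicate rule based on normalised titles is an assumption, and the records below are made up.

```python
def matches_query(text):
    """Return True if text satisfies (pain OR fibromyalgia) AND machine learning."""
    lowered = text.lower()
    has_condition = "pain" in lowered or "fibromyalgia" in lowered
    has_method = "machine learning" in lowered
    return has_condition and has_method


def deduplicate(records):
    """Drop records whose normalised title was already seen (hypothetical rule)."""
    seen = set()
    unique = []
    for record in records:
        # Normalise case and whitespace so trivially different titles collide
        key = " ".join(record["title"].lower().split())
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Made-up records for illustration only
records = [
    {"title": "Machine learning for chronic pain classification", "abstract": ""},
    {"title": "Machine Learning for Chronic Pain  Classification", "abstract": ""},
    {"title": "Deep networks for image segmentation", "abstract": "No pain data."},
    {"title": "Clustering fibromyalgia subgroups",
     "abstract": "A machine learning approach."},
]

hits = [r for r in records if matches_query(r["title"] + " " + r["abstract"])]
unique_hits = deduplicate(hits)
print(len(hits), len(unique_hits))
```

Plain substring matching is deliberately naive (for example, "pain" also matches "painting"), which is one reason manual screening of titles and abstracts follows the automated search.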

Inclusion Criteria
Before screening titles and abstracts, the paper inclusion criteria were defined as follows:
• The paper must be written in English.
• The reported trials must be done on humans.
• The trials must include patients with chronic pain.
• The study must have been completed.
• This review focused on chronic pain studies where the source of the pain was not known, such as fibromyalgia (FM) and chronic low back pain (CLBP).

Exclusion Criteria
To limit the scope of the review, the following exclusion criteria were applied to the identified publications:
• Review papers were excluded.
• Studies on acute pain were excluded.
• Headache, migraine, and chest pain were excluded.
• Postoperative pain and cancer-related pain were excluded.
• Studies with only healthy research participants were excluded.
We wanted to review primary studies on patients with chronic pain; review papers appearing in our literature search were therefore excluded. Our focus was on chronic pain; hence, we excluded all studies on acute pain. Chest pain is often acute or related to heart disease, which is why it was excluded. Headache and migraine can be chronic, but we decided to exclude these conditions. For postoperative and cancer-related pain, the cause of the pain is known, whereas we focused on pain with an unknown cause. Pain studies on healthy participants were excluded because they were not related to chronic pain.

Screening and Full-Text Reading
Before full-text reading, we screened the abstracts of the 287 records (Figure 1). Based on this screening, 199 papers were excluded outright. The reviewers disagreed on 51 papers, of which 28 were excluded after the conflicts were resolved. In total, 227 papers were excluded at the screening stage, leaving 60 papers for full-text reading. The reviewers disagreed on 12 of these papers. As in the screening process, we resolved the conflicts and ended up with 39 papers included in the review. At this stage, 21 articles were excluded for the following reasons:
• Studies not using ML: ten papers (all from the papers found in the reference lists)
Some of the papers were excluded for more than one of the reasons above. While reading the papers in full text, we discovered some papers in the reference lists that seemed relevant for our review, but that were not identified in the original literature search because the term "machine learning" was not present in their titles or abstracts. Hence, to identify additional papers for inclusion in the review, we checked the reference lists of the papers that had already been included. Titles containing words such as "classification" and "clustering" were considered. Thirty-two papers from the reference lists were selected for further review. Following a full-text reading of these papers, 14 were included and 18 were excluded. In total, 53 studies were included in the review (Table 1).
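The selection flow described above can be checked with simple arithmetic. The sketch below uses only counts stated in the text; the variable names are our own, and the last two lines derive the number of reference-list papers implied by the overall total of 53 and by the 32 candidates with 18 exclusions.

```python
# Counts reported in the text
records_identified = 299
records_after_deduplication = 287
excluded_without_conflict = 199
excluded_after_conflict_resolution = 28
excluded_at_full_text = 21
reference_list_candidates = 32
reference_list_excluded = 18
total_included = 53

# Screening stage
duplicates_removed = records_identified - records_after_deduplication
excluded_at_screening = excluded_without_conflict + excluded_after_conflict_resolution
full_text_papers = records_after_deduplication - excluded_at_screening

# Full-text stage
included_from_search = full_text_papers - excluded_at_full_text

# Reference-list stage, derived in two independent ways
implied_by_total = total_included - included_from_search
implied_by_candidates = reference_list_candidates - reference_list_excluded

print(duplicates_removed, excluded_at_screening, full_text_papers)
print(included_from_search, implied_by_total, implied_by_candidates)
```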

Applications of Machine Learning in Chronic Pain Research
The results of the review are summarised in Table 1. In this table, the publications are grouped into six categories according to the objectives of the studies and the stages of chronic pain care addressed (RQ5):
• Classification/diagnosis of patients with chronic pain using structured health data
• Classification/diagnosis of patients with chronic pain using text and images
• Genomics approaches and pain biomarker identification
• Treatment
• Self-management
• Measuring pain intensity
For each publication, the following information was extracted and is presented as columns in Table 1 to answer the research questions (RQ1-RQ5): chronic pain condition (RQ4), data size and data sources (RQ2 and RQ3), and ML methods (RQ1).
Table 1. Decisions: Classifying using health data, classifying using text and images, genomics approaches and biomarker identification, treatment, self-management, and measuring pain intensity.

The results showed that an obvious problem of interest is finding an appropriate way of distinguishing patients with chronic pain from healthy controls. Many of the included papers had such a focus [32,35,36,46,49,50,52,53,54]. The identification of useful means of distinguishing patients with chronic pain from healthy controls would make it easier to characterise a patient with a certain set of symptoms and thereby make it easier to suggest a successful treatment [26]. A specific line of research was provided in [51], where the goal was to distinguish FM from rheumatoid arthritis (RA) patients. Although these patients have many similar symptoms, they usually require different treatment. Hence, a successful classifier in this setting could lead to improved treatment outcomes for these patients. A good first step in identifying the best treatment for the individual patient is to cluster patients with the same disease, as done in [27,31].
Identifying genes relevant to chronic pain is useful for identifying patients with chronic pain conditions and can provide links to the pathophysiology of chronic pain. However, the number of papers related to genomics approaches and biomarker identification is small. We found two studies that aimed to identify biomarkers for bladder pain and pelvic pain using extended random forest [56,58]. In other studies, Ultsch et al. [57] identified 535 genes related to pain. This set of genes can be considered a reflection of the knowledge concerning the genetic architecture of pain acquired in any context of pain research available at the time of publication. Lukkahatai et al. [55] developed an algorithm to identify genes that discriminate FM patients from healthy controls. Efficient ML methods for classifying and clustering patients with chronic pain have proven to be useful in pain assessment and treatment decision-making. These methods have been used both for making a correct diagnosis and for identifying subgroups among patients with the same diagnosis who potentially react differently to a certain treatment; for example, two of the studies aimed at identifying subgroups of FM patients by using clustering techniques [27,31]. Examples of research more directly related to treatment include outcome prediction in acupuncture [25], suicide prediction among FM patients [62], and disc localisation in CLBP patients [67]. Jiang et al. [63] used SVM-based classifiers of surface electromyography (SEMG) data to identify low back pain (LBP) patients who would respond to a 12-week rehabilitation program. The best performing classifier achieved a classification accuracy of 97%. Thus, the authors claimed that their study could be helpful for physicians or therapists in recommending the best treatment for individual LBP patients.
In another study [61], a clinical decision support tool was built for treatment of LBP using three ML models (DT, RF, and boosted tree).
Self-management has also been shown to be a key application of machine learning in empowering patients to manage their condition. Six of the included papers focused on self-management of chronic pain [69,71,72,73,74,76]. Two of these were authored by the same group of researchers, with both studies using the same training dataset [74,76]. They aimed to support patient self-management by analysing self-reported data from patients with chronic pain pre-treatment, post-treatment, and at three-month follow-up. The main topics of the studies were prediction of patients' health level [76] and feature selection to optimise questionnaires by omitting irrelevant questions [74]. Three studies were related to fear-avoidance behaviour and anticipation of pain [69,71,73]. Regular physical activity is an important factor for successful management of chronic musculoskeletal pain [73]. However, many patients avoid physical activity due to a strong fear of pain. Jamison et al. [69] aimed to identify individuals who are prone to catastrophise their pain, by analysing daily assessment data from a mobile application for patients with chronic pain. Meier et al. [71] analysed self-reported questionnaire data and functional magnetic resonance imaging (fMRI) of patients with CLBP to estimate the level of pain-related fear in individuals. Rabbi et al. [73] examined the feasibility of a mobile app to promote physical activity among patients suffering from chronic back pain. The application uses ML to analyse sensor data and self-reported physical activity data to generate personalised physical activity recommendations based on the patient's behaviour. One study focused on measuring and predicting pain volatility by employing ML methods to analyse data from users of a pain management app [72].
Valid and reliable assessment of pain is essential in diagnostics of chronic pain and in guiding treatment decisions [82]. This review includes five studies focusing on automatic estimation of pain intensity. Of these, four studies used facial expression videos from a publicly available shoulder pain research database as the training data [78][79][80][81]. The training data were annotated, and all four studies used algorithms for supervised learning. In two of the studies, an SVM was applied to classify discrete levels of pain intensity [79,81]. Kaltwang et al. [80] applied the relevance vector regression algorithm for continuous estimation of pain intensity. Jaiswal et al. [78] used a convolutional neural network (CNN) to estimate pain intensity by regression. Lee et al. [77] used ML to predict clinical pain based on brain imaging data (fMRI) and physiological data from patients with CLBP. This study used support vector regression (SVR) for the prediction of pain intensity on a continuous scale. In research settings, brain imaging approaches can help to better understand the underlying mechanisms of pain.

Machine Learning Algorithms
A wide variety of ML algorithms were used in the reviewed papers. To provide a structured overview, these publications were grouped according to a hierarchy of ML algorithms.

Supervised Algorithms
The vast majority of included publications used supervised approaches for addressing problems in chronic pain research. These models rely on labelled data available for training. Most of the algorithms solved classification tasks by assigning a predefined class label to an observation. Common examples included SVMs [77,79] and DT-based models (DT and RF) [51,61]. The predictions made by these algorithms are relatively easy to interpret, making them attractive choices for addressing clinical problems.
Deep learning-based models have already demonstrated their value, especially in analysing unstructured data and images. Despite their superior performance, they are still only employed to a limited extent in medical research. This was confirmed by the papers included in this review, as only four out of 53 included publications used neural net-based models [26,36,68,78].
Seven of the included publications used various forms of regression algorithms [37,41,52,59,62,72,80]. Instead of assigning a predefined class label, these algorithms output continuous numerical values. Such outputs may be preferred when predicting risk scores or probabilities or solving other problems requiring a continuous value rather than a class label.
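The difference between a continuous output and a class label can be illustrated with a minimal sketch. The example below fits a one-variable least-squares line, which is only one simple form of regression and is not claimed to be the method used in any of the cited studies; the data, the function name `fit_line`, and the threshold of 7.0 are made up for illustration.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variance terms (the common 1/n factor cancels out)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data: one input feature and a continuous target score
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

slope, intercept = fit_line(xs, ys)

# Regression output: a continuous value
predicted_score = slope * 5.0 + intercept

# A classifier would instead report a discrete label, here obtained
# by applying a hypothetical threshold to the continuous score
predicted_label = "high" if predicted_score >= 7.0 else "low"

print(predicted_score, predicted_label)
```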

Unsupervised Algorithms
Unsupervised ML algorithms do not require labelled data for training. These algorithms are used for identifying patterns or reducing dimensionality in high-dimensional data. Five out of the 53 included publications used unsupervised clustering methods (k-means) [26,27,28,31,72]. Typical problems solved by clustering algorithms were partitioning patients with chronic pain according to pain intensity/volatility, physical and mental state, and social support [31,72]. Clustering was also used for identifying granular characteristics of FM patient subgroups [27] and grouping of symptoms reported in patient survey data [26]. Clustering results, in most cases, were used in further analysis and served as the input for task-specific data-processing pipelines.
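As an illustration of the clustering step, the following is a minimal k-means sketch in plain Python. The fixed initial centroids, the iteration limit, and the toy points are assumptions chosen to keep the example deterministic; the reviewed studies do not necessarily use this exact procedure.

```python
import math


def nearest_centroid(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def kmeans(points, initial_centroids, iterations=10):
    """Partition `points` into len(initial_centroids) clusters."""
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster with the nearest mean
        clusters = [[] for _ in centroids]
        for point in points:
            clusters[nearest_centroid(point, centroids)].append(point)
        # Update step: move each centroid to the mean of its members
        new_centroids = []
        for old, members in zip(centroids, clusters):
            if members:
                new_centroids.append(
                    tuple(sum(coords) / len(members) for coords in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    assignments = [nearest_centroid(point, centroids) for point in points]
    return centroids, assignments


# Toy, made-up 2D points (not patient data)
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
centroids, assignments = kmeans(points, initial_centroids=[(0.0, 0.0), (10.0, 10.0)])
print(assignments)
```

The resulting cluster indices could then be used as input to a later analysis step, in line with how clustering results served as input for task-specific pipelines in the reviewed studies.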

Data
The amount of health-related data collected and stored is increasing at a rapid pace. Due in particular to advances in medical imaging and genomics, substantial volumes of health data are stored in clinical systems. In addition, the increased availability and use of consumer health technologies, such as smartphones, mobile apps, and wearable devices, have led to the availability of patient-generated health data. Hence, a wide range of health data exists, both in and out of clinical settings, which can potentially be used by ML algorithms in chronic pain research.
Health data are either structured or unstructured. Structured data are easier to store and access and can be used in the development of ML algorithms. Examples of structured data that have been used for chronic pain diagnosis and treatment include electronic health record (EHR) data [25,32,62], claims data [27], International Classification of Diseases (ICD) codes, and patient-generated data (from questionnaires or mobile phone applications) [26,56,72]. Feature selection in structured data can be systematically done with the help of statistical analysis, similar analysis studies, and guidelines [27].
Unstructured data are data that do not have a data model or are not organised in a predefined manner. The most common unstructured data in the studies included in this review are magnetic resonance imaging (MRI) data. MRI involves the use of strong magnetic fields and radio waves to generate images of the organs in the body. Both fMRI and structural magnetic resonance imaging (sMRI) images are widely used in classifying patients with chronic pain versus healthy controls. Other imaging data include facial expressions used to measure pain intensity [78][79][80][81]. Medical texts are important parts of unstructured data in the EHRs that can provide useful information. Natural language processing has been used to obtain pain-related information from medical texts [39,50]. Sensor data such as gait information and electromyography (EMG) are also useful in examining the kinematic and kinetic parameters of the gait of patients [35,36] or measuring muscle activity.
Patient-generated data are also emerging as important for chronic pain research. Wearable devices such as wristbands may provide biometric and physical activity data [60]. Sensor data and patient self-reported data can complement EHR data to provide a more comprehensive picture of a patient's health status. Questionnaires or self-reported data can be either structured or unstructured, depending on whether the data are in the form of multiple-choice responses (for example, patients indicating the frequency of the pain symptoms they experienced on a point scale) or free text. ML methods that have been used to analyse structured questionnaire data include adaptive networks [26].
The availability of training data is often an obstacle for ML research in health settings, due to strict privacy and confidentiality regulations. Hence, most of the datasets used in the included studies are not publicly available. However, a few are available for academic use, and these are shown in Table 2.
• OpenPain dataset [44]: fMRI data of patients with CLBP
• Optum Humedica database [83]: >150 million people; de-identified EHR data and claims data, including diagnoses, treatments, pharmacy and over-the-counter prescriptions, text notes, etc.

Types of Chronic Pain that May Benefit from ML Research
While many cases of chronic pain originate from primary underlying medical conditions or injury, the majority of studies included in this paper focused on pain conditions with unclear pathophysiology and relatively high prevalence. This subsection describes how ML has been used for the diagnosis, treatment, and management of these conditions. CLBP is the most prevalent chronic pain condition, with estimates falling in the range of 5% to 20% [84]. Prevalence increases linearly with age between the ages of 20 and 60 years and is higher among women than men [84]. In some cases, CLBP can be caused by injury or disease, but ninety percent of patients with CLBP have non-specific low back pain [85]. CLBP is the condition most frequently studied in the current review, and more aspects of the care pathway are covered than is the case for other conditions, as can be seen in Table 3.
Table 3. Applications of machine learning studies on different types of chronic pain (prevalence sources: [86][87][88][89][90][91]).
• Rheumatoid Arthritis (RA), prevalence 0.41%-0.54%: [51]
• Ankylosing Spondylitis (AS), prevalence 0.07%-0.32%: [41,42]
• Trigeminal Neuralgia (TN), prevalence 0.03%-0.3%: [40]
• Complex Regional Pain Syndrome (CRPS), prevalence 0.025%: [36]
• Unspecified Chronic Pain: [57], [69,72,74,76]
FM has also been examined in several studies. FM is characterised by widespread musculoskeletal pain [92], and its prevalence is estimated as being between 2% and 5% [93,94]. The most common application of ML in FM is to apply classification and clustering techniques to aid in diagnostics or to discover previously unknown subgroups of FM patients [26,27,31,32,33,46,51,55]. Identifying subgroups might be particularly useful in the treatment and management of FM since these patients are highly heterogeneous and may respond differently to the same therapy. Yim et al. [31] conducted a study where FM patients were divided into four subgroups using a cluster analysis method. Data on pain, physical, social, and psychological function were used to discover the clusters. Other applications of machine learning in FM included predicting quality of sleep, level of fatigue [60], and suicide risk [62].
ML applications for other functional disorders, such as myofascial pain syndrome (MPS) [28], irritable bowel syndrome (IBS) [26,49], and chronic fatigue syndrome (CFS) [26,37,45], could also be useful for FM patients. Similar to the FM studies, many of these studies focused on the classification of patients. Lin et al. [28] classified MPS patients by SEMG patterns using template matching and k-means clustering. Melidis et al. [26] used symptom data from patients with a functional disorder, FM, IBS, or CFS, to discover symptom clusters. Labus et al. [49] used sMRI data to classify IBS patients versus healthy controls.
Two studies focused on the diagnosis of chronic pelvic pain (CPP) (pain in the lower abdomen and/or in the pelvis) [29,53]. Oliveiro and Poli-Neto [29] compared different classification algorithms on the multi-label problem of diagnosing which diseases (e.g., interstitial cystitis/bladder pain syndrome (IC) and IBS) are causing the pelvic pain. Bagarinao et al. [53] used sMRI data and a multivariate classification approach to distinguish participants with CPP from healthy controls. Two studies applied ML to identify diagnostic biomarkers for IC [56,58]. As with many other chronic pain conditions, the underlying cause of the disease and diagnostic biomarkers remain unknown. Chancellor et al. [56] applied supervised ML on urine samples and questionnaire data to identify biomarkers that can discriminate two sub-types of IC patients and healthy controls. To discover stool-based biomarkers for IC, Braundmeier-Fleming et al. [58] used ML to classify DNA from stool samples of IC patients and healthy controls.
Ankylosing spondylitis (AS), also known as Bekhterev's disease, is a chronic inflammatory autoimmune disease that primarily affects spine joints, causing severe chronic pain [95]. The exact cause of AS remains unclear, and no effective treatment exists [95]. AS was the focus of two studies in the current review, both of which applied ML approaches to the same fMRI dataset to classify AS and healthy controls [41,42].
For each of the chronic pain conditions complex regional pain syndrome (CRPS), chronic neck pain, and trigeminal neuralgia (TN), we identified a single study. Yang et al. [36] proposed to apply multilayer perceptron neural networks to detect abnormal gait patterns for CRPS. CRPS affects the limbs and can be induced by surgery or trauma [96]. Zhang et al. [25] used similarity-based learning for outcome prediction of acupuncture therapy for chronic neck pain. To distinguish TN patients from healthy controls, Zhong et al. [40] applied multivariate pattern classification to neuroimaging data. TN is a condition characterised by severe facial pain and can be associated with systemic diseases such as multiple sclerosis or cardiovascular diseases [89].
Four studies focused on shoulder pain; all used the same dataset of facial expression videos to automatically assess pain intensity [78][79][80][81]. Five studies investigated chronic pain in general. Four of these focused on self-management [69,72,74,76], while one study aimed to identify genes relevant to pain [57] to aid diagnostics.

Important Findings
The considerable number of papers that appeared in the literature search indicates that the use of ML methods in chronic pain research has drawn significant attention and is considered potentially useful. A wide variety of ML methods were employed in the considered studies. Most of these methods involve supervised classification or regression algorithms. Classical ML algorithms, such as SVM and DT, were commonly used in the selected studies. Deep learning was applied to a limited extent, indicating that the uptake of these algorithms may be more problematic for the research community. The limited interpretability and explainability of deep learning-based models may be important reasons why these methods, the use of which has become the de facto standard in modern image and video analysis, are less visible in the scientific publications in the medical field.
A limited number of the studies focused on the treatment of chronic pain. The majority of the studies contributed to treatment indirectly through various types of classification. The task of finding the best treatment for a specific patient is still performed by the clinician, without the assistance of clinical decision support tools. Hence, there is a need for tools that help clinicians make decisions concerning safe, effective, and efficient treatment plans based on either contemporary chronic pain models or directly on the data. Useful tools for these tasks could include DTs and reinforcement learning (RL).
Various types of data have been utilised in chronic pain studies. However, methods that integrate multiple data sources (e.g., both EHR data and patient-generated data) are underdeveloped. Consequently, existing work typically depends on a limited number of data types. The issue of data quality in clinical data and how it affects the results of studies has still not been fully explored. For genomics data, translation of findings from animals to humans is still limited and remains a challenging topic [26]. Furthermore, the sample size used in studies is typically small [63]. Moreover, long-term monitoring with large datasets is required in future work to test whether these techniques can be applied to assess rehabilitation or treatment outcomes for patients [36].
As shown in Table 3, previous research has concentrated on the early stages of the chronic pain care pathway. However, few studies have focused on the rehabilitation and self-management phases. CLBP is the pain condition that has attracted the most research interest, as it was the focus of 20 of the included studies. This is also the only condition in which research focused on all phases of the care pathway. The significant amount of research that has focused on CLBP reflects the high prevalence of the condition. FM was the second most frequently investigated condition, with 11 studies covering the diagnosis and clinical decision support phases. While the majority of the research focused on CLBP and FM, several chronic pain conditions were addressed by only one or two studies.

Recent Research
Several papers relevant for this review have been published after February 2019. A few papers are mentioned below. Lamichhane et al. 2021 [97] used SVM to classify LBP patients versus healthy controls. Brain images (MRI and fMRI) were used in the classification process. The study fits into "Classifying patients using text and images" in Table 1. Santana et al. 2020 [98] tested different classifiers (logistic regression, support vector classifier, kNN, DT, RF, and multilayer perceptron (MLP)) on chronic pain patients (FM and CLBP) and healthy controls. Questionnaires associated with depression and anxiety and quantitative sensory tests were used as input data to the classifiers. This study fits into the group "Classifying patients using structured health data" in Table 1. Santana et al. 2019 [99] used deep learning to classify chronic pain patients versus healthy controls. Resting state fMRI data were used; hence, the study can be put into the group "Classifying patients using text and images" in Table 1.

Potential Limitations of this Review
The aim of the review presented in this paper was to survey a wide range of studies to summarise the state-of-the-art with regard to ML methods in chronic pain research. The literature search was performed using three major research databases (PubMed, IEEE Xplore, and the ACM Digital Library). To reduce the risk of overlooking important studies, we used a generic query. However, no query search can identify all relevant publications on a topic. To identify additional studies for the review, the reference lists of the included papers were checked. Publications not indexed in these major databases (typically some domain-specific conferences and recently established journals) were left out of this review. The use of strict criteria for determining the inclusion of papers can lead to more attention being paid to formalities than to the contents of a specific publication. Inclusion criteria are defined beforehand and sometimes lead to highly relevant and even top-rated publications in the field being overlooked due to minor discrepancies. In this study, the risk of excluding relevant publications was minimised by involving five independent reviewers in the publication screening process and by relying on majority votes when it came to deciding which papers should be included or excluded. Moreover, in the initial screening stage (title and abstract), only publications that were clearly irrelevant were discarded. Questionable papers were passed to the next step and considered after full-text reading.

Future Research Directions
As noted previously, few of the included studies focused on the use of ML in the treatment of chronic pain. This observation, combined with the fact that learning from data can facilitate the treatment of patients with chronic pain, indicates that decision support tools and recommendation systems for both patients and clinicians may be useful. Tools can be built either by using existing chronic pain models or based on data-driven methods.
At present, no satisfactory treatment exists for patients with chronic pain. There are at least two important reasons for this: First, chronic pain is not a life-threatening disease. Hence, it attracts less attention and is considered to be less of a priority in healthcare services [100]. Second, the population of patients with chronic pain is large and heterogeneous. Identifying the best treatment for an individual patient is challenging. ML may be key to remedying this problem. The underlying concept is that patients and clinicians can learn about the best treatment in a specific case based on real-world patient data. Existing analytical predictive models and ML techniques can be utilised to provide a personalised model of the risk associated with the disease. Based on this model, personalised treatment plans and recommendations can be derived by maximising the treatment outcome while simultaneously minimising patient risk.
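One possible way to write down this trade-off, offered only as an illustrative sketch and not as a formulation used in the reviewed studies, is as a penalised selection rule. All symbols are assumptions introduced here: $x$ denotes the data available for a patient, $\mathcal{A}$ a set of candidate treatment plans, $\hat{Y}(x, a)$ a model-based estimate of the expected treatment outcome under plan $a$, $\hat{R}(x, a)$ the corresponding estimate of patient risk, and $\lambda \ge 0$ a weight that sets how strongly risk is penalised.

```latex
a^{*}(x) \;=\; \arg\max_{a \in \mathcal{A}}
  \left[ \hat{Y}(x, a) \;-\; \lambda \, \hat{R}(x, a) \right],
  \qquad \lambda \ge 0
```

With $\lambda = 0$ the rule considers outcome alone. Larger values of $\lambda$ favour more conservative plans, and the choice of $\lambda$ would be a clinical decision, not a purely technical one.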
By collecting and analysing large volumes of patient data, it may be possible to identify underlying patterns that could prove instrumental to improving the quality of life for a large group of people. To achieve the best possible decision support system, access to both EHR data and patient-generated data is important. Both data sources have the potential to support the search for a successful decision support tool. Typically, over the short term, it is easier and more realistic to obtain access to patient-generated data because we assume they contain less sensitive information. Information like heart rate, physical activity, sleep, medication, and patient-reported pain level may offer valuable insights into the patients' health under given circumstances. Such data may be analysed to determine whether they can be used to quantify a patient's level of pain. If so, using these data to build an ML model that can provide tailored recommendations to patients may reduce the number of painful episodes.
RL and DTs may be utilised to build a data-driven decision support tool for the treatment and management of chronic pain. RL is an area of ML concerned with how to take actions in an environment so as to maximise a cumulative reward. Novel RL research needs to be conducted to make the algorithms safer and more robust and to reduce the risk for patients. DT is another ML technique that can be effective for deriving decisions and decision-making policies from data in a visual and explicit form.
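The idea of learning actions that maximise cumulative reward can be sketched with an epsilon-greedy multi-armed bandit, one of the simplest RL settings. This is an illustration under stated assumptions, not a method from the reviewed studies. The three "arms" stand for hypothetical treatment options, the reward is a simulated binary outcome (1 = pain reduced), and the success probabilities are invented for the simulation. The sketch also ignores the safety constraints that any clinical use would require.

```python
import random


def run_epsilon_greedy(success_probs, steps=10000, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy agent choosing among hypothetical treatments.

    success_probs: simulated probability that each option yields reward 1.
    Returns (estimated_values, pull_counts, total_reward).
    """
    rng = random.Random(seed)
    n_arms = len(success_probs)
    counts = [0] * n_arms    # how often each option was chosen
    values = [0.0] * n_arms  # running mean reward per option
    total_reward = 0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random option.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the option with the best estimate so far.
            arm = max(range(n_arms), key=lambda a: values[a])

        # Simulated outcome; a real system would observe patient data instead.
        reward = 1 if rng.random() < success_probs[arm] else 0

        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean
        total_reward += reward

    return values, counts, total_reward


if __name__ == "__main__":
    # Invented success probabilities for three hypothetical options.
    values, counts, total = run_epsilon_greedy([0.2, 0.5, 0.8])
    for arm, (v, c) in enumerate(zip(values, counts)):
        print(f"option {arm}: estimated value {v:.2f}, chosen {c} times")
    print("cumulative reward:", total)
```

During exploration the agent deliberately selects options it currently believes to be worse. In a clinical setting this is exactly the behaviour that raises safety concerns, which is why safer and more robust RL algorithms are needed before such methods can be applied to patients.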
To enable the development and usability of personalised decision support systems, the availability of patient-generated data is crucial. By utilising new self-management technologies, personal data can be collected in a straightforward and cost-effective manner. ML algorithms can learn from these large amounts of personal data to customise treatment and management to the needs and preferences of each individual. However, a platform to collect and process these data in a privacy-preserving manner needs to be developed, to enable ML algorithms to exploit the vast potential of such data sources.

Conclusions
Publications on ML in chronic pain research were reviewed. The reviewed papers were grouped into six categories. The largest group consisted of papers presenting approaches to classifying patients with chronic pain using structured data. While many studies have emphasised classification tasks, less attention has been paid to the treatment and management of chronic pain. FM and CLBP were the most frequently researched conditions. Several conditions attracted less research attention, as they were covered by only one or two studies. More research is needed on the treatment, rehabilitation, and self-management of chronic pain. As with other chronic diseases, encouraging patient involvement and self-management is crucial. To become more involved in the management of their conditions, patients require tools that can help them make healthy choices. Increased use of advanced data-analytics methods alongside more traditional approaches has the potential to make valuable contributions to chronic pain treatment and management.
Funding: This work was supported by Tromsø Research Foundation, the Northern Norway Regional Health Authority (Grant Numbers HNF1463-19 and HNF1445-19), and the Norwegian Directorate of Health through the national project-Report for AI Implementation in Healthcare.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Acknowledgments: We would like to thank Line Linstad, Norwegian Centre for E-health Research, for her comments, which greatly improved the manuscript.

Conflicts of Interest: The authors declare no conflict of interest.