There are different definitions and interpretations of a “smart city”. In the research literature, a smart city is viewed as having characteristics such as: (i) adopting information and communication technology (ICT) to solve daily-life problems in governance, environment, economy, healthcare, etc.; (ii) improving the quality of life of human beings; (iii) using computational intelligence to address real-world issues through mathematical formulations and machine learning algorithms; and (iv) distributing the workload to computers, robots and machines [1].
Human health is wealth; there is nothing more valuable than good health. Researchers have devoted considerable effort to proposing new policies, algorithms, systems and architectures for healthcare. Healthcare is defined as the improvement of health through the prevention, treatment and examination of physical damage, mental damage, illness, injury and disease.
Today’s world faces three challenges in healthcare: a shortage of medical personnel, ageing, and high total expenditure on healthcare. Reports from the World Health Organization (WHO) stated that the global need for health workers and the actual size of the health workforce were 60.4 million and 43 million, respectively, in 2013 [2]. These figures will increase to 81.8 million and 67.3 million, respectively, by 2030. Therefore, the shortage of medical personnel remains unsolved and serious.
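The size of the gap follows directly from the four WHO figures quoted above. As a minimal sketch (the dictionary and function names are illustrative and not taken from the source), the shortfall can be computed both in absolute terms and as a share of the need:

```python
# Figures quoted in the text (WHO), in millions of health workers.
need = {2013: 60.4, 2030: 81.8}
supply = {2013: 43.0, 2030: 67.3}

def shortfall(year):
    """Return the absolute gap (millions) and the gap as a fraction of need."""
    gap = need[year] - supply[year]
    return gap, gap / need[year]

for year in (2013, 2030):
    gap, share = shortfall(year)
    print(f"{year}: shortfall {gap:.1f} million ({share:.1%} of need)")
```

In absolute terms the projected gap narrows slightly, yet it still amounts to a sizeable share of the required workforce, which is consistent with describing the shortage as unsolved.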
From 2000 to 2050, the percentage of the world population over 60 years of age will double (from 11% to 22%) [3]. One of the major reasons is that the birth rate has remained low in the past and is estimated to stay low in the coming decades. The older a person is, the higher the chance of disease and illness and of requiring long-term care and medical treatment. Consequently, more human resources and expenditure should be allocated to the age group of 60 or above.
Total expenditure on healthcare accounts for a significant percentage of the gross domestic product (GDP). A survey from the WHO concluded that the corresponding figures in China, U.S.A., Canada, Brazil, Russian Federation, India, and Australia are 5.6%, 17.1%, 10.5%, 8.3%, 7.1%, 4.7%, and 9.4%, respectively [4]. Owing to ageing, these figures are expected to increase in the coming decades.
It is good news that, with the rapid increase in computational power and the availability of health data, healthcare applications can incorporate artificial intelligence and thus become smart healthcare. This can help to solve some of the aforementioned challenges in healthcare. In addition, it will facilitate sustainable development [5]. Healthcare sustainability aims at simultaneously optimizing the financial and social impacts of the health service, without compromising the health of patients or the ability to provide healthcare in the future. In fact, the inadequate number of medical personnel and the increasing proportion of elderly people increase government expenditure and lower social productivity.
Optimization algorithms and machine learning algorithms are core tools that can benefit healthcare applications. Three kinds of optimization algorithms, evolutionary [8], stochastic [18] and combinatorial optimization [30], will be addressed. For machine learning algorithms, the discussion is based on unsupervised learning [39], supervised learning [50] and semi-supervised learning [71]. The mathematical formulation of these algorithms is not discussed in this review; instead, examples are given to reveal the potential opportunities for applying optimization and machine learning algorithms to healthcare.
Due to the multitude of smart healthcare applications, only four applications in the field of disease diagnosis are considered: cardiovascular diseases [82], diabetes mellitus [100], Alzheimer’s disease and other forms of dementia [113], and tuberculosis [127]. These are among the top 10 causes of annual global death. The applications of disease diagnosis in smart healthcare are related to automated decision making. If the performance of the classification algorithm is good in terms of overall accuracy and testing time, it may ultimately replace the role of medical doctors in disease diagnosis (so that medical doctors can devote most of their time to complicated surgery). Alternatively, the classification algorithm can be utilized as a rapid test (fair overall accuracy and rapid decision) for low-cost and large-scale screening.
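The two criteria named above, overall accuracy and testing time, can be measured with a few lines of code. The following sketch is illustrative only: the `threshold_rule` classifier, its cutoff and the toy samples are hypothetical stand-ins, not a method or data from the reviewed works.

```python
import time

def evaluate(classify, samples, labels):
    """Return (overall accuracy, mean testing time per sample in seconds)."""
    start = time.perf_counter()
    predictions = [classify(x) for x in samples]
    elapsed = time.perf_counter() - start
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels), elapsed / len(samples)

# Hypothetical decision rule and toy data (not clinical values).
def threshold_rule(x, cutoff=0.5):
    return 1 if x >= cutoff else 0

samples = [0.1, 0.4, 0.45, 0.6, 0.8, 0.9, 0.3, 0.7]
labels = [0, 0, 1, 1, 1, 1, 0, 0]

accuracy, seconds_per_sample = evaluate(threshold_rule, samples, labels)
print(f"overall accuracy: {accuracy:.2f}")
print(f"testing time per sample: {seconds_per_sample:.2e} s")
```

Reporting both numbers side by side makes the trade-off explicit: a screening tool may accept a lower accuracy in exchange for a faster decision.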
This paper is organized as follows. Section 2 discusses the emerging optimization algorithms and machine learning algorithms in healthcare. Section 3 provides an up-to-date review of the literature on four key applications. Section 4 discusses challenges in smart healthcare. Finally, a conclusion is drawn.
3. Smart Healthcare Applications
Since there are numerous applications in smart healthcare, this review focuses only on disease diagnosis. Cardiovascular diseases, diabetes mellitus, Alzheimer’s disease and other forms of dementia, and tuberculosis were on the list of the top 10 causes of annual global death in 2015. Table 1 summarizes the mortality due to these diseases in 2000, 2005, 2010 and 2015 [159]. It can be seen that mortality from the first three types of disease increased, whereas mortality from tuberculosis decreased, from 2000 to 2015. The percentage changes in 2000–2005, 2005–2010 and 2010–2015 for each type of disease are 6.7%, 8.03%, and 6.4%; 19.9%, 17.7%, and 17.8%; 29.9%, 36.1%, and 33.7%; and −5.58%, −10.6%, and −2.34%, respectively. These diseases led to more than 22 million deaths in 2015.
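Because each quoted percentage is relative to the start of its own five-year period, the three changes compound multiplicatively rather than add. A minimal sketch, using only the percentages quoted above (the names are illustrative, and the results inherit the rounding of the quoted figures), estimates the implied overall change from 2000 to 2015:

```python
# Five-year percentage changes quoted in the text
# (2000-2005, 2005-2010, 2010-2015).
period_changes = {
    "cardiovascular diseases": [6.7, 8.03, 6.4],
    "diabetes mellitus": [19.9, 17.7, 17.8],
    "Alzheimer's disease and other dementias": [29.9, 36.1, 33.7],
    "tuberculosis": [-5.58, -10.6, -2.34],
}

def compound_change(changes_percent):
    """Overall percentage change implied by consecutive period changes."""
    factor = 1.0
    for change in changes_percent:
        factor *= 1.0 + change / 100.0
    return (factor - 1.0) * 100.0

overall = {name: compound_change(c) for name, c in period_changes.items()}
for name, change in overall.items():
    print(f"{name}: {change:+.1f}% from 2000 to 2015")
```

These derived values are estimates from rounded inputs and do not replace the mortality counts in Table 1.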
The optimization algorithms and machine learning approaches that have been utilized for smart healthcare covering the diseases in Table 1 will be discussed in the following subsections.
3.1. Cardiovascular Diseases
Cardiovascular diseases are a group of disorders involving the heart and blood vessels. Common conditions include atherosclerosis, which may even result in stroke. Most of the risk factors for cardiovascular diseases are related to lifestyle, and the percentages of mortality attributable to each factor are as follows: 40.6% for high blood pressure; 13.7% for smoking; 13.2% for poor diet; 11.9% for insufficient physical activity; 8.8% for abnormal glucose levels; and the rest for other factors [82]. Therefore, improving eating habits and lifestyle is an effective measure for preventing cardiovascular disease. The main principles of a healthy lifestyle include a low-salt and low-fat diet, regular exercise and quitting smoking [83].
Stroke is one kind of cardiovascular disease. A stroke happens when the blood supply to the brain is interrupted. The common causes are a burst blood vessel in the brain or a blockage by a clot. This in turn cuts off the supply of oxygen and nutrients to brain cells and damages the brain tissue. Hypertension is the most notable risk factor for stroke [84], and thus good management of blood pressure is a preventive measure against stroke.
The key priority for effective primary stroke prevention should be behavioral and lifestyle risk factors, including smoking, unhealthy diet, sedentary lifestyle, and improper use of alcohol [85]. Other than diet modification, engaging in physical activity for at least 30 min every day will help to prevent stroke [86].
The surface electrocardiogram (ECG) is the most widely adopted means of recording the electrical activity of the heart and of diagnosing cardiovascular diseases [87]. In addition, ECG can be used for risk stratification and selection of optimal management for different types of cardiovascular diseases. An alternative method, vectorcardiography, is rarely used in clinical practice [88]. It has been concluded that there are three types of ECG features, namely fiducial features (FF), non-fiducial features (NFF) and hybrid features (HF). An FF is any feature that is related to the characteristic points (P wave, QRS complex and T wave) of the ECG signal, whereas an NFF is any feature that is not. An HF combines both FF and NFF. Table 2 shows some of the methodologies that have been proposed for cardiovascular disease classification [90] or ECG recognition [95].
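To make the distinction between the three feature types concrete, the sketch below extracts one feature of each kind from a synthetic trace. Everything in it is an assumption made for illustration: the toy signal, the sampling rate, the fixed-threshold peak detector, the choice of mean RR interval as the fiducial feature and of autocorrelation coefficients as the non-fiducial feature. It is not the method of any work listed in Table 2.

```python
import math

FS = 100  # sampling rate in Hz (assumed for this toy example)

def synthetic_ecg(n_samples=400, period=80, first_peak=40):
    """Toy ECG-like trace: small sinusoidal baseline plus sharp R-like spikes."""
    signal = [0.05 * math.sin(2 * math.pi * i / 50) for i in range(n_samples)]
    for p in range(first_peak, n_samples, period):
        signal[p] += 1.0
    return signal

def r_peaks(signal, threshold=0.5):
    """Indices of local maxima above a fixed threshold (a crude R-peak detector)."""
    return [i for i in range(1, len(signal) - 1)
            if signal[i] > threshold
            and signal[i] >= signal[i - 1] and signal[i] > signal[i + 1]]

def fiducial_features(signal):
    """FF: mean RR interval in seconds, derived from detected R peaks."""
    peaks = r_peaks(signal)
    rr = [(b - a) / FS for a, b in zip(peaks, peaks[1:])]
    return [sum(rr) / len(rr)]

def non_fiducial_features(signal, lags=(1, 2, 3)):
    """NFF: normalized autocorrelation at a few lags; no characteristic points used."""
    mean = sum(signal) / len(signal)
    centered = [s - mean for s in signal]
    energy = sum(c * c for c in centered)
    return [sum(centered[i] * centered[i + k] for i in range(len(centered) - k)) / energy
            for k in lags]

def hybrid_features(signal):
    """HF: concatenation of the fiducial and non-fiducial features."""
    return fiducial_features(signal) + non_fiducial_features(signal)

ecg = synthetic_ecg()
print("R peaks at samples:", r_peaks(ecg))
print("hybrid feature vector length:", len(hybrid_features(ecg)))
```

The fiducial feature depends on locating the R peaks first, whereas the autocorrelation is computed over the whole window without any point detection, which is the practical difference between the two families.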
Eleven machine learning or optimization algorithms have been applied in [90]. Different algorithms may be superior in particular applications. The number of patients, Ns, is very small compared to the number of cardiovascular disease sufferers (more than 10 million). However, it is difficult for researchers to tackle the problem of small sample size because the datasets are generally retrieved from the public domain (hospitals). Whether patient data can be fully disclosed to the public depends on government policy. The performance is good (>90%) in most of the works, except [92].
3.2. Diabetes Mellitus
Diabetes mellitus, or simply diabetes, is one of the most serious epidemic diseases in the world. Over 400 million people are living with diabetes [100]. It is expected that by 2035, the total number of adults with diabetes will increase to 592 million [101]. Its subsequent macrovascular complications can be fatal. Diabetes-related care carries an enormous financial burden. Many patients have to rely on lifelong medication to control diabetes.
There are two types of diabetes: insulin-dependent diabetes (type 1) and non-insulin-dependent diabetes (type 2). Most diabetes cases belong to type 2, which often has its onset in adulthood and is caused by unhealthy lifestyle, improper diet and obesity [102]. These factors cause a combined defect of insulin secretion and insulin resistance, resulting in different severities of diabetes.
Recently, there have been many studies on diabetes mellitus using machine learning algorithms. The applications are summarized in Table 3. They are type 2 diabetes diagnosis [103], prediction of fasting plasma glucose status [106], analysis of the predictive power of the hypertriglyceridemic waist phenotype [107], detection of hypoglycemic episodes in children [108], prediction of protein–protein interaction [109], prediction of vascular occlusion [110], prediction of the development of liver cancer in diabetes sufferers [111] and detection of microalbuminuria [112] related to diabetes.
Similar to the case of cardiovascular diseases (Table 2), 12 types of algorithms have been utilized for diabetes mellitus. The number of samples shows an increasing tendency, which leads to more trustworthy results in evaluating the algorithms. Performance remains challenging (less than 80%) in the prediction of fasting plasma glucose status, analysis of the predictive power of the hypertriglyceridemic waist phenotype, detection of hypoglycemic episodes in children, prediction of protein–protein interaction and prediction of the development of liver cancer in diabetes sufferers.
3.3. Alzheimer’s Disease and Other Forms of Dementia
Dementia is a general term describing a decline in mental ability, including memory and thinking skills, that affects daily living activities. Dementia results from damage to brain cells that makes them unable to communicate with each other [113]. Alzheimer’s disease is the most common form of dementia, accounting for around 60–80% of all cases. Some Alzheimer’s disease patients in the final stages may lose basic bodily functions, including swallowing and moving their limbs. They need around-the-clock care, which puts a huge financial burden on the medical system and the social welfare departments of governments.
The number of dementia cases increases with ageing. This leads to a huge concern among healthcare professionals. Dementia, unlike other forms of chronic illness, has a higher prevalence in developed countries. It is a pandemic disease that affects people in different regions. Moreover, studies have found that people with metabolic syndromes such as diabetes and obesity are at higher risk of Alzheimer’s disease [114]. Therefore, Alzheimer’s disease and dementia will be among the most challenging non-communicable diseases to battle in this era.
Machine learning algorithms have also been applied in studies on Alzheimer’s disease and other forms of dementia. The applications include diagnosis of Alzheimer’s disease [115], diagnosis of dementias [117], detection of Alzheimer’s disease related regions [118], prediction of the conversion of mild cognitive impairment patients to Alzheimer’s disease [119], detection of dissociable multivariate morphological patterns [121], diagnosis of both Alzheimer’s disease and mild cognitive impairment [122] and identification of genes related to Alzheimer’s disease [125].
Fourteen different algorithms were employed in [115]. The datasets for Alzheimer’s disease and other forms of dementia have relatively small sample sizes. Three applications, prediction of the conversion of mild cognitive impairment patients to Alzheimer’s disease [119], prediction of mild cognitive impairment [122] and identification of genes related to Alzheimer’s disease [125], require further improvement.
3.4. Tuberculosis
Tuberculosis (TB) is a lethal infectious disease caused by the bacterium Mycobacterium tuberculosis, which usually attacks the lungs. It is transmitted mainly through the air, and thus TB spreads actively in the community. It was a public health concern in the early 19th century. With the development of drugs and hygiene awareness, the incidence rate has declined slowly since the 1990s [127].
People in poor and underdeveloped countries are more vulnerable to TB because of poor environmental conditions, food insecurity, and inconvenient access to healthcare services [127]. Early detection is important for TB control. Without proper diagnosis, infected people are the major source of infection in the community [128]. It is estimated that about 3 million people remain undiagnosed or are not notified [129].
Intensive research has been carried out on preventing and fighting TB [130]. An effective surveillance system for immediate reporting is crucial for TB control. With the development of science and technology, it is hoped that TB will not be a major public health issue for the next generation.
Smart healthcare applications for tuberculosis include identification of drug resistance-associated mutations [131], detection of tuberculosis [132], detection of multidrug-resistant tuberculosis [136], prediction of treatment failure [137], distinguishing between tuberculosis and human immunodeficiency virus (HIV) [138], prediction of recent transmission of tuberculosis [139] and detection of smear-negative pulmonary tuberculosis [140]. These are summarized in Table 5. Similar findings can be observed in the tuberculosis applications: 12 types of algorithms are used across seven applications, and the sample sizes of the datasets are limited. The performance of two of the works [133] can be improved in the future. Other algorithms can be applied to obtain a favorable performance (>90%).
4. Challenges in Smart Healthcare
The transition from healthcare to smart healthcare encounters numerous challenges, and research is still ongoing. In the following subsections, privacy, pilot studies and real projects, communication between data scientists and medical personnel, the no free lunch theorem and increased short-term to medium-term expenditure are discussed.
4.1. Privacy
Medical data contain meaningful information for modeling and analysis, which can ultimately improve medical practice and research. Privacy must be protected from misuse and violation so that patients and medical institutions will agree to release and share the data. Increased data availability will improve the quality of healthcare [160]. A report has stated that about 60% of patients believe that the wide-ranging use of electronic health records will lead to more personal information being stolen or lost, and about half of patients think that the privacy of medical data is not protected [161].
Ideally, medical data should be accessible to authorized parties or public institutions only if security and privacy are guaranteed. More importantly, researchers need medical data for carrying out data analysis, statistical analysis and machine learning applications. Common privacy-preserving methods include disclosure control [162], output perturbation [164] and anonymization [166].
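As an illustration of output perturbation, the sketch below adds random noise to the result of a counting query so that the released number does not reveal the exact count. The Laplace noise, the `scale` parameter and the toy records are assumptions chosen for the example; the cited works may use different mechanisms.

```python
import random

def laplace_noise(scale, rng):
    """Laplace(0, scale) sample, drawn as the difference of two exponential variates."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def perturbed_count(records, predicate, scale=2.0, rng=None):
    """Count the matching records, then add noise to the released output only."""
    rng = rng or random.Random()
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(scale, rng)

# Hypothetical toy records of the form (age, has_diabetes); not real patient data.
records = [(34, False), (61, True), (58, True), (45, False), (72, True)]

noisy = perturbed_count(records, lambda r: r[1], scale=2.0, rng=random.Random(0))
print(f"released count: {noisy:.2f}")
```

The underlying records are never modified; only the answer that leaves the institution is perturbed. A larger `scale` hides individual contributions better but makes the released statistic less accurate.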
It is hoped that, in the future, the shared datasets for medical applications (especially disease diagnosis) will increase in sample size. The key elements will be medical data privacy and government policies.
4.2. Pilot Studies and Real Projects
Pilot projects are important because, in the long run, they reduce the risk of unprofessional project deployment and help to select appropriate risk mitigation strategies and application directions. There is an increasing tendency to carry out pilot studies [168].
Professionals gain experience and intuition to improve the protocol design, algorithms, and hardware design of smart healthcare. However, in reality, only a few companies and researchers take part in real healthcare projects. Most researchers cannot learn from this experience because pilot projects often provide no timely publications, reports or feedback. Usually, the projects are target-oriented [169]. However, there may exist alternative, more cost-effective solutions that could replace the current solution.
It is highly encouraged that, before technology platforms are adopted, the algorithms and technologies that were tested and selected for adoption be consolidated into technical reports and papers and shared with the public.
Economically, it is desirable that governments support basic research and innovation [170]. The private sector takes responsibility for commercial and applied research since it is attuned to market needs and demands.
For the selection of the policy tool, the features of basic research are essential. For a large-scale and highly focused project, direct conduct of research by government or direct government support is a reliable way to maintain good project management with numerous researchers and staff [171].
The effectiveness of research, innovation and development will increase if government serves as a long-term investor. The time lag between basic research and commercial deployment can be large. Two examples illustrate the benefit of long-term investment. The Internet revolution in the 1990s was the result of long-term investment covering twenty years [172]. Similarly, the commercialization of biotechnology was facilitated by research findings from the 1950s [173].
4.3. Communication between Data Scientists and Medical Personnel
Data scientists generally lack sufficient medical knowledge and need the support of medical experts. Likewise, medical personnel generally lack data science expertise. When data scientists would like to enter medical institutions, medical personnel and management teams often refuse [174]. From the data scientists’ perspective, medical data are useful for statistical analysis, prediction, classification and knowledge discovery. From the medical personnel’s perspective, they are used to diagnosing patients using their medical knowledge and rarely rely on decisions generated by machines.
Medical workers may argue that smart healthcare via machine learning and optimization algorithms conflicts with their interests; however, this is not the case [175]. First, most countries are suffering from a shortage of medical personnel, and the shortage will become more acute. Second, smart healthcare mainly targets routine work so that medical workers can spend more time on professional consultation and surgery. Third, medical workers will earn a higher social status and better job satisfaction because patients are more satisfied with the medical services.
Both data scientists and medical personnel should take a step forward to find the best way to collaborate. The ultimate goal is to improve the quality of life.
4.4. No Free Lunch Theorem
Numerous approaches are used in different machine learning applications in smart healthcare. The “no free lunch” theorem in machine learning says that no single algorithm works best for every application [176]. As a result, when designing an algorithm for a smart healthcare application, the analysis becomes tedious because one must go through many algorithms and select the one with the best performance. An important piece of know-how is to try only algorithms that are appropriate for the problem. For instance, clustering algorithms seem inappropriate for solving classification problems.
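The selection process described above can be sketched as a small comparison loop. The three candidate classifiers, the leave-one-out evaluation and the two toy one-dimensional datasets are hypothetical choices made for illustration; the point is only that the best-scoring candidate changes with the data, which is the practical consequence of the theorem rather than a proof of it.

```python
def majority_class(train_x, train_y):
    """Baseline: always predict the most frequent training label."""
    label = max(set(train_y), key=train_y.count)
    return lambda x: label

def nearest_centroid(train_x, train_y):
    """Predict the class whose mean feature value is closest."""
    centroids = {}
    for label in set(train_y):
        values = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = sum(values) / len(values)
    return lambda x: min(centroids, key=lambda c: abs(x - centroids[c]))

def one_nearest_neighbour(train_x, train_y):
    """Predict the label of the single closest training sample."""
    pairs = list(zip(train_x, train_y))
    return lambda x: min(pairs, key=lambda p: abs(x - p[0]))[1]

def leave_one_out_accuracy(fit, xs, ys):
    """Fraction of samples predicted correctly when each is held out in turn."""
    hits = 0
    for i in range(len(xs)):
        model = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        hits += model(xs[i]) == ys[i]
    return hits / len(xs)

def select_best(candidates, xs, ys):
    """Score every candidate and return (name of the best one, all scores)."""
    scores = {name: leave_one_out_accuracy(fit, xs, ys)
              for name, fit in candidates.items()}
    return max(scores, key=scores.get), scores

candidates = {
    "majority_class": majority_class,
    "nearest_centroid": nearest_centroid,
    "one_nearest_neighbour": one_nearest_neighbour,
}

# Toy dataset A: two separated groups with one outlying class-0 sample.
xs_a = [1.0, 1.2, 1.4, 3.05, 3.0, 3.2, 3.4, 3.6]
ys_a = [0, 0, 0, 0, 1, 1, 1, 1]

# Toy dataset B: class 1 sits between two groups of class 0.
xs_b = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]
ys_b = [0, 0, 1, 1, 0, 0]

best_a, scores_a = select_best(candidates, xs_a, ys_a)
best_b, scores_b = select_best(candidates, xs_b, ys_b)
print("dataset A:", best_a, scores_a)
print("dataset B:", best_b, scores_b)
```

On these toy datasets the centroid-based rule is more robust to the single outlier, whereas the neighbour-based rule copes with the class that sits between two groups of the other class. Restricting the candidate list to classifiers, rather than also trying clustering methods, reflects the advice to test only algorithms suited to the problem.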
Deep learning has gained popularity in recent years owing to its superiority in solving complex problems. However, deep learning is not always the best choice, or even necessary, for every problem. Its suitability depends on the complexity of the problem, the amount of available data, the computational power and the training time [177].
4.5. Increase Short-Term to Medium-Term Expenditure
The authors would like to emphasize that, because machine learning in smart healthcare is still young, governments and institutions should increase short-term to medium-term expenditure. Many applications are only in the initial stage, where data collection, feature extraction, and methodology are key criteria for successful deployment.
Ultimately, smart healthcare applications will benefit human beings through increased life expectancy, early disease detection, ambient assisted living, patient monitoring, etc. As a result, in a sense, machine learning yields a profit in the long term.