We studied the working principles behind designing and developing a collaborative filtering-based recommender system. Here we analyze different health issues and develop an intelligent health recommender system (HRS) that provides high-quality recommendations to patients. Developing and evaluating a health recommender system with different machine learning algorithms involves several steps.
3.3. Framework for HRS
We need a framework for designing an HRS in cooperation with patients, doctors, surgeons, and other medical personnel. The architecture of the framework is divided into three parts (data collection, data transformation, and data analysis and visualization). The first part is data collection. The data sources for the healthcare system are categorized as follows. (i) Structured data: organized data with a predefined format, data type, and structure. Examples include data generated by devices such as sensors, information about diseases with their symptoms and diagnoses, laboratory results, patient medical histories, drug prescriptions, CT scans, and X-rays. (ii) Semi-structured data: data that does not conform to a data model but has some structure, such as data used for the effective monitoring of a patient’s behavior. (iii) Unstructured data: data with no defined structure, such as medical prescriptions written in human languages, research notes, and discharge summaries.
Healthcare is a prime example of how the three Vs of data (velocity, variety, and volume) are an innate aspect of the data a domain produces. A large amount of data is spread among multiple healthcare systems, hospitals, health insurers, researchers, government institutions, etc. Data sources include prescriptions, clinical data, hospital records, patient information, vital signs, CT scans, X-rays, and biometric fingerprints.
Healthcare automation systems constitute a branch of computational intelligence that applies reasoning methods and domain-specific knowledge to make recommendations as human experts would. As with any other recommender domain, we must first understand the different categories of recommendations. The categories are:
Nutritional data: Generating recommendations to improve nutrition. The doctor might change a patient’s food habits so that the patient gets proper nutrition and can recover from illness or disease. Recommendations could include balanced meals, substitute food items, less spicy meals, or additions to a diet.
Physical exercise: Generating recommendations on the type of yoga and physical exercise that patients should do for quick recovery, based on their requirements. These requirements may relate to location, disease, weather, etc.
Diagnosis: Generating diagnostic recommendations for the doctor based on the symptoms shown in similar cases.
Therapy/Medication: Generating recommendations about different types of medication for a particular disease or patient-specific therapy.
The second part of the framework is the data analysis process, during which health-specific recommendations can be generated. We should first consider who will use the system. The end users of the system are medical researchers, doctors, and patients. Apart from these end users, other people can also benefit from the health recommender system (HRS), such as pharmacists and clinicians. Minimizing the cost of healthcare should be the ultimate aim of these recommender systems. Analytical methods include a Hadoop-based approach that uses MapReduce. This approach speeds up medical diagnosis and helps find the optimal parameters, so that doctors can identify the disease a patient is suffering from and check the patient’s condition.
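The MapReduce style of analysis mentioned above can be illustrated with a small, single-machine sketch. It is not Hadoop code and not the authors’ implementation. It only mimics the map, shuffle, and reduce steps in plain Python to count how often each symptom co-occurs with each diagnosis. The record layout, the sample data, and all function names are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical records: (patient_id, diagnosis, list of symptoms).
RECORDS = [
    ("p1", "flu", ["fever", "cough"]),
    ("p2", "flu", ["fever", "fatigue"]),
    ("p3", "cholera", ["dehydration", "fever"]),
]


def map_record(record):
    """Map step: emit ((diagnosis, symptom), 1) for every symptom in a record."""
    _, diagnosis, symptoms = record
    return [((diagnosis, symptom), 1) for symptom in symptoms]


def shuffle(pairs):
    """Shuffle step: group the emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(groups):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def symptom_counts(records):
    """Run map, shuffle, and reduce in sequence on an in-memory data set."""
    mapped = chain.from_iterable(map_record(r) for r in records)
    return reduce_counts(shuffle(mapped))


counts = symptom_counts(RECORDS)
```

In a real Hadoop deployment the map and reduce functions would run in parallel across a cluster, which is where the speed-up described above would come from. The sketch only shows the shape of the computation.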
The third major part of the framework is visualization. This part contains the elements that determine how recommended items are presented. Visualization and knowledge representation techniques are used to present the mined knowledge to the end users. The healthiest recommendation is the one that should be chosen, but sometimes topic-specific criteria play a role in evaluating an item. Data-driven approaches apply data mining and machine learning methods to extract insights from heterogeneous data. They provide individual recommendations based on past learning experience and the patterns extracted from clinical data. A combination of information retrieval and machine learning can be used for medical database classification. The entire framework of the health recommender system (HRS) comprises the following stages:
i. Training Phase
To detect diseases such as tuberculosis, cholera, and flu, doctors conduct clinical tests on patients. To study and analyze these diseases and find cures for them, doctors require information in the form of parameters and variables. Moreover, the quantity of information generated in healthcare has grown tremendously. This phase covers data collection and accumulation, and the absence of proper tools for these tasks will hamper the whole process. The process includes collecting patient data, demographic information, diagnoses, research, clinical tests, patient health records, and real-time data from hospitals and clinics, since real-time data collection can enhance the effectiveness of the recommender.
ii. Patient Profile Generation
During this stage, a profile is created for every patient in the form of a health record documenting the patient’s clinical history. This record contains information from various sources, including the patient, doctors, hospitals, laboratory tests, CT scans, and X-rays. If a new patient is admitted, the whole process starts from the beginning, i.e., from the processing of data and the creation of a new patient health record. For an existing patient, the system updates the record as required.
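The create-or-update behavior described for patient profiles can be sketched as follows. This is a minimal in-memory illustration under stated assumptions: the `PatientRecord` fields, the `ProfileStore` class, and the `upsert` method are hypothetical names that do not come from the source, and a real system would use a persistent and access-controlled store.

```python
from dataclasses import dataclass, field


@dataclass
class PatientRecord:
    """Hypothetical health record; the field names are illustrative only."""
    patient_id: str
    demographics: dict = field(default_factory=dict)
    entries: list = field(default_factory=list)  # (source, note) tuples


class ProfileStore:
    """In-memory store that creates a record for a new patient or updates an existing one."""

    def __init__(self):
        self._records = {}

    def upsert(self, patient_id, source, note, demographics=None):
        """Add one entry to the patient's record; return True if the record was newly created."""
        created = patient_id not in self._records
        if created:
            self._records[patient_id] = PatientRecord(patient_id)
        record = self._records[patient_id]
        if demographics:
            record.demographics.update(demographics)
        record.entries.append((source, note))
        return created

    def get(self, patient_id):
        """Return the record for a known patient (raises KeyError otherwise)."""
        return self._records[patient_id]


store = ProfileStore()
first = store.upsert("p1", "laboratory", "blood test ordered", {"age": 40})
second = store.upsert("p1", "x-ray", "chest image taken")
```

Here `first` reports that a new record was created, while `second` reports an update to the existing record.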
iii. Sentiment Analysis
To support patient-oriented recommendations for clinical services, it is imperative that the patient trusts the whole system, i.e., that the system reliably maintains the privacy and confidentiality of patient data. Information obtained from patients, with or without adequate medical data, is personal and should not be misused.
From the extraction of rules and patient context, recommendations can be generated. Patients receive personalized recommendations. These recommendations can take the form of preventive and corrective measures, reasons for the causes of the disease, or a further process of treatment.
v. Privacy Preservation
The HRS requires the blending of various kinds of clinical information to enhance recommendation quality so that healthcare improves. Consequently, ensuring the privacy of a patient’s information plays a vital role in clinical research. In the proposed approach, the integrity of this information is maintained while personal identity is effectively shielded [17].
3.4. Methods to Design HRS
We need a framework made up of different tools that satisfy the domain requirements and the specific criteria of particular applications, as shown in Figure 6. These tools put the focus on user requirements and ensure that they are met first. The first tool vital for designing the framework is participatory design, which refers to the active participation of stakeholders. Patients should play an active role in designing the framework, because their feedback can improve the whole system and remove the gaps present in the current one. Thus, feedback from the patient is essential. When patients help draft the recommender system, they keep in mind health-related issues rather than sales and marketing by pharmaceutical companies, because such a system can become a personal assistant that helps them overcome the health issues that matter to them. The most demanding part is to adapt an existing framework so that it allows the large-scale participation of patients and doctors. These tools could help in treatment without requiring the direct intervention of doctors.
The second tool important to an HRS is differential privacy. Differential privacy helps maintain data privacy and security, which is a major problem in recommender systems. Here, it is used for sharing patients’ medical histories without revealing their identities, so that privacy is provided to the end user. Patients are often unaware of privacy issues, which works against their long-term interests. To implement an intelligent health recommender system, privacy must therefore play a major role. Awareness of privacy threats on the Internet matters, and differing risk perceptions and levels of digital literacy also affect how the technology is used. Patients are much more reluctant to share data in personal spaces.
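The source does not say which differential-privacy mechanism is intended. As one possible illustration, the sketch below applies the standard Laplace mechanism to a counting query over patient records. The record layout, the function names, and the choice of a counting query are assumptions made for this example.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records matching `predicate`.

    Adding or removing one patient changes a count by at most 1, so the
    sensitivity is 1 and the Laplace noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for record in records if predicate(record))
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical records: (patient_id, diagnosis).
records = [("p1", "flu"), ("p2", "flu"), ("p3", "cholera")]
noisy_flu_cases = private_count(records, lambda r: r[1] == "flu", epsilon=1.0)
```

A smaller `epsilon` adds more noise and gives stronger privacy, while a larger `epsilon` gives more accurate counts. Choosing the value is a policy decision that this sketch does not address.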
The third tool to incorporate is adequate and proper communication. Communication is bidirectional (to the patient and to the recommender). Patients should be able to describe their condition to doctors in an undisturbed and hassle-free manner, so that doctors can interpret the symptoms and give recommendations for a particular disease. The visualization of data should serve the purpose of the recommender system and take into account patients, doctors, and their intentions. A recommender should include a proper visualization tool that fosters the patient’s willingness to explore options and helps to explain individual recommendations. Since individual differences might play a vital role in the health sector, it is crucial to intensify research in this field.
Some common big data tools are used in the healthcare sector, e.g., Data Cleaner, Apache Hadoop, and the Cassandra database. Apache Hadoop is the most prominent tool in the big data industry because of its capability for large-scale data processing. Hive is one of the components of the Hadoop ecosystem and allows programmers to query, manage, and analyze large data sets on the Hadoop platform. Data Cleaner is a data quality analysis platform with a strong data profiling engine. It is usually used for data cleaning, data transformation, and data merging. The Cassandra database is widely used to manage large amounts of data effectively. These tools work together with the recommender engine in big data analytics [34].
3.5. Evaluation of HRS
For a recommender system to succeed, it is very important to choose the criteria used to evaluate it. Conventionally, recommender systems were evaluated based on criteria borrowed from information retrieval [9]. Common metrics used in the evaluation are:
Precision: The fraction of retrieved (recommended) instances that are relevant.
Recall: The fraction of relevant (useful) items that are actually recommended.
F-Measure: It is a measure of a test’s accuracy and is defined as the weighted harmonic mean of the precision and recall of the test.
ROC-Curve: ROC Curve is a way to compare diagnostic tests. It is a plot of the true positive rate against the false positive rate. It is used to represent the relationship between sensitivity and specificity.
RMSE: The root mean square error is the standard deviation of the residual errors, i.e., the differences between predicted values and known values.
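The metrics listed above can be computed as in the following sketch, which uses their standard definitions. The function names and the sample inputs are illustrative assumptions. An ROC curve is obtained by evaluating `roc_point` at many thresholds and plotting the resulting (false positive rate, true positive rate) pairs.

```python
import math


def precision_recall_f(recommended, relevant, beta=1.0):
    """Return (precision, recall, F-measure) for a set of recommendations.

    The F-measure is the weighted harmonic mean of precision and recall;
    beta = 1 weights both equally.
    """
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0
    b2 = beta * beta
    f_measure = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_measure


def rmse(predicted, actual):
    """Root mean square error between predicted and known values."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(predicted))


def roc_point(scores, labels, threshold):
    """Return (false positive rate, true positive rate) at one score threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    tpr = tp / positives if positives else 0.0
    fpr = fp / negatives if negatives else 0.0
    return fpr, tpr


# Hypothetical example: four items recommended, three items truly relevant.
p, r, f = precision_recall_f(["a", "b", "c", "d"], ["a", "b", "e"])
```

In this example two of the four recommended items are relevant, so precision is 1/2, and two of the three relevant items were recommended, so recall is 2/3.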
Evaluation criteria are needed to measure the strength of an HRS in terms of patient acceptance and satisfaction. By making the system suitable for individual patients, it can run according to their requirements so that patients do not face problems, ultimately leading to better medical research. This includes research on patient diversity, not just with regard to patient-specific results but also with regard to the user interface of a health recommender system. The over-emphasis on accuracy metrics and the under-representation of metrics such as serendipity and coverage pose a serious challenge in classic recommender systems. Rare diseases are by definition uncommon, but collecting data and case studies of similar cases can help considerably. Therefore, finding all relevant results is important for health recommender systems. Another vital research issue is trust in recommender systems. If a patient’s condition takes a turn for the worse, the doctor can program the system to take action so that trust is maintained.
When designing a health recommender system, the designer should plan carefully according to the requirements. The capability of a recommender system can be appraised with regard to the patient’s external behavior, so measuring the effectiveness of a health recommender system depends on behavioral evaluations. For example, when monitoring the health progress of patients and providing suggestions for treatment, the system should keep track of their activities. In the case of restrictions imposed on patients, the system has difficulty measuring its effectiveness, as some patients might, for example, smoke without informing the system. Some health recommenders may also aim at long-term behavioral changes, and these must be tracked as well. Once the treatment is administered, the system can continue monitoring the patient to determine whether the treatment is effective. The system should also take steps that promote faster healing. We must prefer recommendations that do not cause side effects, because neglecting one health parameter can lead to another disease. For example, changing food habits may lead to a loss in body weight (a superficial health parameter), and keeping the body fit while neglecting a balanced diet can hamper growth and metabolism. Before this approach is applied in practice, it must be ensured that the systems are user friendly and reliable, and that they deliver results in real time.