1. Introduction
Continued progress in natural language processing (NLP) has enabled highly capable intelligent systems that process and communicate in human language with greater depth and precision. In healthcare, applying these technologies offers a compelling opportunity to build innovative tools that not only comprehend the intricacies of human language but also provide tailored responses. This synergy could redefine how individuals obtain health-related knowledge, marking a significant shift in healthcare communication and information dissemination.
The landscape of health information seeking has experienced a profound transformation with the rise of digital platforms. More and more people are turning to the internet for information about their health issues, creating a need for advanced systems that can effectively classify and address a wide range of health-related questions. Our contribution is poised to meet this demand by introducing an intelligent chatbot designed to adeptly classify health-related inquiries and deliver pertinent advice. The chatbot functions by receiving user queries pertaining to symptoms or health concerns, employing machine learning models to accurately predict the category, and subsequently delivering personalized health advice tailored to the predicted category.
Section 2 presents an overview of current approaches and technologies explored within the medical literature. Section 3 outlines the research problem, details the dataset used for training, introduces the machine learning models employed, and describes the research methodology adopted in this study. Section 4 discusses the experimental results and offers a comparative performance analysis of the proposed models. Finally, Section 5 concludes by recapping the main insights and suggesting future research opportunities.
2. Related Work
The advent of artificial intelligence has opened exciting new possibilities in various sectors, and healthcare is no exception. The integration of intelligent chatbots in the medical field represents a significant advancement, offering innovative solutions to enhance interaction between healthcare professionals, patients, and medical information. Chatbots, underpinned by NLP and ML models, are capable of understanding and analyzing human language in a context-aware manner. Within the healthcare sector, they can offer reliable information, patient support, and improved coordination between medical staff.
The medical field has witnessed a surge in the adoption of intelligent chatbots, a trend largely propelled by continuous breakthroughs in AI and NLP [1]. Natural Language Processing (NLP) merges principles of artificial intelligence, computer science, and linguistics to facilitate the analysis and understanding of natural language. In essence, NLP provides a collection of tools for uncovering valuable insights from text, often used to gather knowledge and support decision-making through the analysis of textual content such as web pages, reports, and user feedback [2]. Linguistics is fundamental to natural language processing, as it involves the scientific examination of language structure, meaning, and sound, namely grammar, semantics, and phonetics. At its core, linguistics focuses on developing and validating rules of language. Natural language therefore adheres to a set of rules encompassing grammar and semantics, and these rules are a pivotal element in enabling machines to comprehend and process textual data. The work in [3] provides an overview of linguistic techniques employed in the automation of language structure analysis and examines the evolution of fundamental AI-powered technologies, including speech recognition, speech synthesis, and machine translation.
The growing integration of chatbots in the medical sector represents a significant advancement, providing innovative solutions to enhance accessibility to medical information, offer patient support, and facilitate communication between healthcare professionals and the general public. Recent research has showcased various contributions in this dynamic field. In their pioneering study [4], the authors introduced a medical chatbot focused on promoting a healthy lifestyle by providing tailored advice. Another notable initiative [5] concentrated on developing a chatbot specifically designed for patients with chronic illnesses, delivering personalized support and relevant medical information. A significant contribution from [6] examined the integration of chatbots within mental health services, emphasizing their crucial role in providing guidance and resources for psychological well-being. Additionally, the authors in [7,8] addressed the improvement of patient engagement through conversational agents, demonstrating the effectiveness of chatbots in encouraging active patient participation in health management.
Finally, comprehensive reviews [9,10] provide an overview of the applications, challenges, and opportunities of chatbots in the medical domain, highlighting their revolutionary potential in healthcare delivery. These collective contributions attest to the positive impact of chatbots and open new perspectives for the future of technology and health interaction. Building on this body of work, we introduce an intelligent chatbot specifically designed to categorize health-related queries and provide personalized advice covering 25 distinct disease categories.
3. Data Collection, Exploratory Analysis, and Model Pipeline
Given the overload of information online and the diversity of health-related queries, developing an intelligent chatbot stands out as an innovative and timely solution. The challenge lies in developing a platform capable of understanding natural language, effectively categorizing health-related questions, and providing personalized advice. This becomes even more crucial with the variety of medical conditions. In this perspective, our goal is to design a specialized chatbot capable of classifying user queries into 25 distinct disease categories. This approach would offer targeted and personalized assistance, thereby improving accessibility to medical information and providing advice tailored to each user’s condition. The creation of an intelligent chatbot in the healthcare domain requires a deep understanding of medical issues, a comprehensive database of diseases, and a robust architecture for contextual analysis of queries. This project aims to push the current boundaries of chatbots by providing more precise and personalized health recommendations. The successful implementation of this chatbot would pave the way for enhanced interaction between users and online medical information, strengthening access to health advice in an efficient and personalized manner.
3.1. Data Collection
The dataset used in this project has been meticulously curated from an open-source repository, reflecting a strategic choice made during the project’s inception. While structured datasets on platforms like Kaggle are often a default choice, our decision to delve into an open-source repository aimed to ensure the inclusion of diverse and real-world audio samples. By opting for an open-source approach, we intended to capture the authentic nuances and complexities present in spoken language expressions of health concerns. This deliberate choice not only aligns with the project’s commitment to realism but also enriches the dataset with a broad spectrum of health-related topics.
The dataset is organized into rows and columns, where each row represents a distinct audio sample and the columns hold attributes associated with that sample, such as its content in the form of sentences written in natural language. The ‘phrase’ and ‘prompt’ columns contain textual information describing the spoken content of the audio. These columns are essential for training the chatbot to accurately interpret and respond to user queries concerning health issues. The last column is the target, indicating the disease category with which each sample is associated. In our case, we have defined 25 disease categories.
Figure 1 presents these categories, along with the percentage of each category in our dataset. The dataset comprises a significant number of records, totaling 6663 overall.
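As a minimal sketch of the layout described above, the following Python snippet parses a small in-memory table and computes the share of each category, which is the kind of summary Figure 1 reports. The sample rows, the category labels, and the target column name `category` are invented for illustration; only the ‘phrase’ and ‘prompt’ column names come from the text.

```python
import csv
import io
from collections import Counter

# Tiny in-memory stand-in for the dataset; all rows and the column name
# "category" are hypothetical and used only for illustration.
SAMPLE_CSV = """phrase,prompt,category
"My knee hurts when I climb stairs","Describe your joint problem","Joint pain"
"I get a sharp pain in my elbow","Describe your joint problem","Joint pain"
"There is a rash on my arm","Describe your skin problem","Skin issue"
"I feel dizzy when I stand up","Describe how you feel","Feeling dizzy"
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, one per sample."""
    return list(csv.DictReader(io.StringIO(text)))


def category_percentages(rows, target="category"):
    """Return {category: percentage of rows}, rounded to one decimal."""
    counts = Counter(row[target] for row in rows)
    total = sum(counts.values())
    return {label: round(100.0 * n / total, 1) for label, n in counts.items()}


rows = load_rows(SAMPLE_CSV)
percentages = category_percentages(rows)
for label, pct in percentages.items():
    print(f"{label}: {pct}%")
```

On the real dataset the same computation would run over all 6663 records and 25 categories.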
3.2. Exploratory Data Analysis and Pipeline
To prepare the text columns for machine learning training and enhance their semantic understanding, various NLP techniques were applied. The primary techniques used are:
Data Cleaning: Data cleaning was the foundational step in refining the textual data. It involved the systematic removal of irrelevant characters, punctuation, and special symbols from both the ‘phrase’ and ‘prompt’ columns. This operation aimed to improve the cleanliness and coherence of the text, ensuring that the subsequent analyses and model training would be conducted on a well-processed dataset. By eliminating noise and non-essential elements, data cleaning laid the groundwork for more effective tokenization and subsequent NLP techniques.
Tokenization: Following data cleaning, tokenization was employed to break down sentences into individual words or tokens. This process provided a granular representation of the text, allowing for a more nuanced understanding of linguistic patterns. Each word or token became a distinct unit for analysis, enabling the machine learning models to capture the subtleties of language in health-related inquiries. Tokenization transformed raw text into a structured format that could be used effectively in the subsequent stages of the preprocessing pipeline.
Lemmatization: Lemmatization played a crucial role in reducing words to their base or root form. This step was essential for streamlining the dataset by eliminating redundant variations of words. By reducing words to their root forms, lemmatization strengthened the semantic clarity of the text, enabling the models to recognize different word variants as a single concept and avoid being misled by morphological differences.
Handling Missing Values: Ensuring the completeness of the dataset by addressing any missing or null values was a critical aspect of the preprocessing pipeline. This step involved identifying and handling instances where data was absent, either due to errors or genuine gaps. Depending on the nature of missing values, strategies such as imputation or removal were applied to maintain the integrity of the dataset. A complete dataset is fundamental for accurate model training and reliable predictions.
Feature Engineering: Feature Engineering was a distinctive phase dedicated to extracting relevant features from the text data. This involved distilling critical information that could serve as discriminative elements for the machine learning models. By carefully curating features that encapsulated the essence of health-related inquiries, Feature Engineering contributed to the creation of a robust dataset. The selected features played a crucial role in the subsequent model development, influencing the models’ ability to discern patterns and make accurate predictions.
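The preprocessing steps listed above can be sketched in a few lines of Python. This is an illustrative outline under stated assumptions, not the pipeline used in the study: the cleaning rule keeps only lowercase letters, digits, and spaces; tokenization is a plain whitespace split; the `LEMMAS` lookup table is a toy stand-in for a real lemmatizer; and missing values are handled by removal rather than imputation.

```python
import re

# Toy lemma table standing in for a real lemmatizer; entries are illustrative.
LEMMAS = {"hurts": "hurt", "feet": "foot", "aching": "ache", "headaches": "headache"}


def clean_text(text):
    """Lowercase, replace non-alphanumeric characters with spaces, collapse spaces."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text):
    """Split cleaned text into tokens on whitespace."""
    return text.split()


def lemmatize(tokens):
    """Map each token to its base form when the toy table knows it."""
    return [LEMMAS.get(tok, tok) for tok in tokens]


def preprocess(rows):
    """Drop rows with a missing phrase, then clean, tokenize, and lemmatize."""
    processed = []
    for row in rows:
        phrase = row.get("phrase")
        if phrase is None or not phrase.strip():
            continue  # removal strategy for missing values
        tokens = lemmatize(tokenize(clean_text(phrase)))
        processed.append({**row, "tokens": tokens})
    return processed


rows = [
    {"phrase": "My feet are ACHING!!", "category": "Foot ache"},
    {"phrase": None, "category": "Unknown"},
    {"phrase": "   ", "category": "Unknown"},
]
result = preprocess(rows)
print(result[0]["tokens"])
```

A production pipeline would typically replace the lookup table with a dictionary-based or model-based lemmatizer and apply the same functions to the ‘prompt’ column.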
3.3. TF-IDF Vectorization: Transforming Text into Numerical Insights
To enable machine learning algorithms to process textual data, vectorization is performed, converting raw text into numerical form. In our study, TF-IDF (Term Frequency–Inverse Document Frequency) is applied to achieve this transformation [1]. These vectors capture the frequency and importance of words in the dataset, providing structured input for machine learning models.
The TF component reflects how frequently a term appears in a given document, thereby indicating its local importance. Meanwhile, IDF evaluates the rarity of a term within the entire corpus, giving higher weights to terms that are common in one document but uncommon across others, thus highlighting their specificity. By combining TF and IDF, TF-IDF vectorization assigns a numerical weight to each term that captures both its frequency in a specific document and its overall importance in the corpus. This vector representation enables machine learning models to process text more meaningfully, considering the context and specificity of terms. In the context of our health-focused chatbot, TF-IDF vectorization is a valuable tool for transforming user textual queries into numerical information usable by our machine learning models. This mathematical transformation is encapsulated by the formula:
\[ \mathrm{tfidf}(t, d, D) = \mathrm{tf}(t, d) \times \mathrm{idf}(t, D) \]
where tf(t,d) denotes the term frequency of term t in document d, and idf(t,D) signifies the inverse document frequency of term t across the entire document collection D.
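To make the weighting concrete, the sketch below computes the product of tf(t,d) and idf(t,D) for a three-document toy corpus. It assumes the plain textbook variant, with tf as a relative frequency and idf(t,D) = ln(|D| / df(t)); the text does not state which variant was used, and libraries such as scikit-learn apply smoothing and normalization by default, so their values would differ. The corpus is invented for illustration.

```python
import math


def tf(term, doc_tokens):
    """Relative frequency of `term` in one tokenized document."""
    return doc_tokens.count(term) / len(doc_tokens)


def idf(term, corpus):
    """Unsmoothed inverse document frequency: ln(N / df)."""
    df = sum(1 for doc in corpus if term in doc)
    return math.log(len(corpus) / df) if df else 0.0


def tfidf_vector(doc_tokens, corpus, vocabulary):
    """One TF-IDF weight per vocabulary term, in vocabulary order."""
    return [tf(t, doc_tokens) * idf(t, corpus) for t in vocabulary]


# Hypothetical, already preprocessed queries.
corpus = [
    ["my", "head", "hurt"],
    ["my", "knee", "hurt"],
    ["my", "skin", "itch"],
]
vocabulary = sorted({tok for doc in corpus for tok in doc})
vectors = [tfidf_vector(doc, corpus, vocabulary) for doc in corpus]

for term, weight in zip(vocabulary, vectors[0]):
    print(f"{term}: {weight:.4f}")
```

In the first document, "my" receives a weight of zero because it occurs in every document, while "head" outweighs "hurt" because it appears in only one document rather than two.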
3.4. Proposed Architecture of the Chatbot
To develop an intelligent and resilient chatbot, we assembled a diverse set of machine learning models, comprising robust standalone algorithms and several ensemble methods. Each model contributes distinctive strengths to a versatile health-centric conversational agent. The models include Support Vector Machines (SVM), Decision Tree, Random Forest, Bagging Classifier, Multinomial Naive Bayes, Calibrated Classifier, K-Nearest Neighbors (KNN), Passive Aggressive Classifier, AdaBoost, Gradient Boosting Machine (GBM), Stochastic Gradient Descent (SGD), One vs. Rest Classifier (OVRC), Neural Network (NN), and others.
The architecture of our chatbot integrates multiple machine learning models to predict the disease category from user queries. First, an input layer preprocesses textual queries, normalizes them, and transforms them into vector representations that machine learning models can consume. Next, several models, such as Support Vector Machines (SVM), Decision Tree, Random Forest, Bagging Classifier, Multinomial Naive Bayes, and others, are applied, each capturing particular nuances or features of the medical data. A model fusion layer then combines the predictions of the individual models using ensemble techniques, such as weighted averaging or majority voting, to leverage the specific strengths of each model. This ensemble approach enhances the overall robustness and accuracy of the chatbot by taking into account the different perspectives that each model offers on the medical data. Finally, the output layer produces the predicted disease category based on the aggregated predictions of the models. This architecture also provides the flexibility to integrate new models or enhance existing ones.
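The two fusion strategies named above, majority voting and weighted averaging, can be sketched as follows. This is a minimal illustration rather than the fusion layer of the study: the category labels, the per-model probabilities, and the weights are all invented, and how the weights would be chosen in practice is not specified in the text.

```python
from collections import Counter, defaultdict


def majority_vote(predictions):
    """Return the label predicted by the largest number of models."""
    return Counter(predictions).most_common(1)[0][0]


def weighted_average(probabilities, weights):
    """Fuse per-model class probabilities with per-model weights.

    probabilities: list of {label: probability} dicts, one per model.
    weights: list of non-negative floats, one per model.
    Returns the label with the highest weighted mean probability.
    """
    total = sum(weights)
    combined = defaultdict(float)
    for probs, weight in zip(probabilities, weights):
        for label, p in probs.items():
            combined[label] += weight * p / total
    return max(combined, key=combined.get)


# Hypothetical hard predictions from three models for one query.
votes = ["Migraine", "Migraine", "Sinusitis"]
voted = majority_vote(votes)

# Hypothetical class probabilities from the same three models.
model_probs = [
    {"Migraine": 0.60, "Sinusitis": 0.40},
    {"Migraine": 0.55, "Sinusitis": 0.45},
    {"Migraine": 0.10, "Sinusitis": 0.90},
]
fused = weighted_average(model_probs, weights=[1.0, 1.0, 2.0])

print(voted, fused)
```

In this toy case the two strategies disagree: two of three models vote for "Migraine", but the third model is both more confident and more heavily weighted, so the weighted average selects "Sinusitis". This illustrates why the choice of fusion rule and weights matters.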
The proposed architecture is summarized in Figure 2.
4. Experimental Results
This section serves as the empirical validation of our proposed methodology, offering insights into the performance and effectiveness of our developed chatbot. Through a systematic evaluation, we aim to demonstrate the practical applicability and robustness of our chatbot architecture in handling health-related queries. This section outlines the experimental setup, details the metrics employed for evaluation, and presents a comprehensive analysis of the obtained results.
4.1. Evaluation Metrics
The effectiveness and dependability of our health-focused chatbot are quantified through a range of evaluation metrics. Choosing suitable metrics is crucial for understanding how well the chatbot performs its tasks and aligns with user requirements. Multiple performance indicators are considered to ensure a holistic evaluation [1]:
Accuracy: reflects how often the chatbot’s predictions are correct, calculated as the proportion of correct predictions among all processed queries.
Precision: reflects the quality of the chatbot’s positive predictions, calculated as the ratio of true positives to all instances classified as positive; a high value indicates few false positives.
Recall (Sensitivity): measures the completeness of positive predictions, calculated as the ratio of true positives to all actual positive instances.
F1 Score: combines precision and recall into a single metric through their harmonic mean, providing a balanced evaluation of a model’s performance.
Confusion Matrix: illustrates the relationship between predicted and actual classes, breaking down results into true positives, true negatives, false positives, and false negatives.
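The metrics above can be computed directly from lists of actual and predicted labels, as in the following sketch. In a multi-class setting such as ours, precision, recall, and F1 are defined per class by treating that class as positive and all others as negative; how the per-class values were averaged in the study is not stated, so the sketch reports them per class. The labels "A", "B", and "C" are placeholders.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the actual label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def confusion_counts(y_true, y_pred):
    """Confusion matrix in sparse form: counts of (actual, predicted) pairs."""
    return Counter(zip(y_true, y_pred))


def per_class_metrics(y_true, y_pred, label):
    """Precision, recall, and F1 with `label` treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Placeholder labels for five hypothetical queries.
y_true = ["A", "A", "B", "B", "C"]
y_pred = ["A", "B", "B", "B", "C"]

acc = accuracy(y_true, y_pred)
matrix = confusion_counts(y_true, y_pred)
precision_b, recall_b, f1_b = per_class_metrics(y_true, y_pred, "B")

print(f"accuracy={acc:.2f}")
print(f"class B: precision={precision_b:.2f} recall={recall_b:.2f} f1={f1_b:.2f}")
```

Here one "A" query is misclassified as "B", which lowers the precision of class "B" to 2/3 while its recall stays at 1.0.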
4.2. Experimental Tests
The outcomes of our machine learning model evaluations are summarized in Table 1, which outlines the performance metrics used to gauge the effectiveness of the health-focused chatbot. Metrics such as accuracy, precision, recall, and F1 score are emphasized, providing a thorough comparison of the models’ capabilities in categorizing health-related queries.
The confusion matrix (Figure 3) delivers a detailed view of classification outcomes, including true and false predictions. Through this comparative analysis, the distinct advantages and limitations of each model become evident, offering insights into their contribution to the chatbot’s accuracy and reliability.
5. Conclusions
In conclusion, this study explored the convergence between Natural Language Processing (NLP) and machine learning, giving rise to an advanced health chatbot. This chatbot introduces an innovative approach by integrating a diverse set of machine learning models to effectively predict disease categories from user queries. The chatbot’s architecture encompasses models such as Support Vector Machines, Decision Tree, Random Forest, Bagging Classifier, Multinomial Naive Bayes, among others. These models are thoughtfully merged to leverage their specific strengths, forming the foundation of this intelligent chatbot.
The chatbot achieved an accuracy of 99%, which underscores its effectiveness and positions it as a robust and reliable tool for addressing health-related queries. By improving the accessibility of health information, this research empowers individuals to make better-informed decisions about their health. The chatbot emerges as a valuable solution to meet the rising need for personalized and knowledge-driven communication in healthcare.