Search Results (16)

Search Parameters:
Keywords = Bernoulli Naïve Bayes

20 pages, 3921 KiB  
Article
Quinary Classification of Human Gait Phases Using Machine Learning: Investigating the Potential of Different Training Methods and Scaling Techniques
by Amal Mekni, Jyotindra Narayan and Hassène Gritli
Big Data Cogn. Comput. 2025, 9(4), 89; https://doi.org/10.3390/bdcc9040089 - 7 Apr 2025
Cited by 1 | Viewed by 566
Abstract
Walking is a fundamental human activity, and analyzing its complexities is essential for understanding gait abnormalities and musculoskeletal disorders. This article delves into the classification of gait phases using advanced machine learning techniques, specifically focusing on dividing these phases into five distinct subphases. The study utilizes data from 100 individuals obtained from an open-access platform and employs two distinct training methodologies. The first approach adopts stratified random sampling, where 80% of the data from each subphase are allocated for training and 20% for testing. The second approach involves participant-based splitting, training on data from 80% of the individuals and testing on the remaining 20%. Preprocessing methods such as Min–Max Scaling (MMS), Standard Scaling (SS), and Principal Component Analysis (PCA) were applied to the dataset to ensure optimal performance of the machine learning models. Several algorithms were implemented, including k-Nearest Neighbors (k-NNs), Logistic Regression (LR), Decision Tree (DT), Random Forest (RF), Support Vector Machine (SVM), Naive Bayes (Gaussian, Bernoulli, and Multinomial) (NB), Linear Discriminant Analysis (LDA), and Quadratic Discriminant Analysis (QDA). The models were rigorously evaluated using performance metrics like cross-validation score, Mean Squared Error (MSE), Root Mean Squared Error (RMSE), accuracy, and R2 score, offering a comprehensive assessment of their effectiveness in classifying gait phases. In the five subphases analysis, RF again performed strongly with a 94.95% accuracy, an RMSE of 0.4461, and an R2 score of 90.09%, demonstrating robust performance across all scaling methods. Full article
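
As a rough illustration of the two training protocols the abstract describes (stratified random sampling vs. a participant-based split, with Min–Max scaling), here is a minimal scikit-learn sketch; the CSV file and the "subphase"/"subject_id" column names are assumptions, not taken from the paper.

```python
# Sketch of the two evaluation protocols described in the abstract.
import pandas as pd
from sklearn.model_selection import train_test_split, GroupShuffleSplit
from sklearn.preprocessing import MinMaxScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

df = pd.read_csv("gait_subphases.csv")            # hypothetical file
X = df.drop(columns=["subphase", "subject_id"])   # hypothetical column names
y, groups = df["subphase"], df["subject_id"]

# Protocol 1: stratified random sampling (80/20 within each subphase).
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y, random_state=0)

# Protocol 2: participant-based split (train on 80% of subjects, test on the rest).
gss = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
tr_idx, te_idx = next(gss.split(X, y, groups))

splits = {
    "stratified": (X_tr, y_tr, X_te, y_te),
    "by-participant": (X.iloc[tr_idx], y.iloc[tr_idx], X.iloc[te_idx], y.iloc[te_idx]),
}
for name, (tr_X, tr_y, te_X, te_y) in splits.items():
    scaler = MinMaxScaler().fit(tr_X)             # Min-Max scaling fitted on training data only
    clf = RandomForestClassifier(random_state=0).fit(scaler.transform(tr_X), tr_y)
    print(name, accuracy_score(te_y, clf.predict(scaler.transform(te_X))))
```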

24 pages, 7611 KiB  
Article
Advancements in Predictive Analytics: Machine Learning Approaches to Estimating Length of Stay and Mortality in Sepsis
by Houssem Ben Khalfallah, Mariem Jelassi, Jacques Demongeot and Narjès Bellamine Ben Saoud
Computation 2025, 13(1), 8; https://doi.org/10.3390/computation13010008 - 1 Jan 2025
Viewed by 1095
Abstract
Sepsis remains a major global health concern, causing high mortality rates, prolonged hospital stays, and substantial economic burdens. The accurate prediction of clinical outcomes, such as mortality and length of stay (LOS), is critical for optimizing hospital resource allocation and improving patient management. The present study investigates the potential of machine learning (ML) models to predict these outcomes using a dataset of 1492 sepsis patients with clinical, physiological, and demographic features. After rigorous preprocessing to address missing data and ensure consistency, multiple classifiers, including Random Forest, Extra Trees, and Gradient Boosting, were trained and validated. The results demonstrate that Random Forest and Extra Trees achieve high accuracy for LOS prediction, while Gradient Boosting and Bernoulli Naïve Bayes effectively predict mortality. Feature importance analysis identified ICU stay duration (ICU_DAYS_OBS) as the most influential predictor for both outcomes, alongside vital signs, white blood cell counts, and lactic acid levels. These findings highlight the potential of ML-driven clinical decision support systems (CDSSs) to enhance early risk assessment, optimize ICU resource planning, and support timely interventions. Future research should refine predictive features, integrate advanced biomarkers, and validate models across larger and more diverse datasets to improve scalability and clinical impact. Full article
(This article belongs to the Special Issue Generative AI in Action: Trends, Applications, and Implications)
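
A hedged sketch of the mortality-prediction step named in the abstract (Gradient Boosting and Bernoulli Naïve Bayes, plus a Random Forest feature-importance check); the file name, the label column, all feature columns other than ICU_DAYS_OBS, and the scaling choice are assumptions.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.naive_bayes import BernoulliNB
from sklearn.metrics import accuracy_score

df = pd.read_csv("sepsis_cohort.csv")                         # hypothetical file
X, y = df.drop(columns=["MORTALITY"]), df["MORTALITY"]        # hypothetical label column
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y, random_state=0)

gb = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
# Standardizing centres the continuous vitals so BernoulliNB's default 0 threshold
# effectively splits each feature at its mean.
bnb = make_pipeline(StandardScaler(), BernoulliNB()).fit(X_tr, y_tr)
print("Gradient Boosting accuracy:", accuracy_score(y_te, gb.predict(X_te)))
print("Bernoulli NB accuracy     :", accuracy_score(y_te, bnb.predict(X_te)))

# Feature-importance check, e.g. to see whether predictors such as ICU_DAYS_OBS rank highly.
rf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)
print(pd.Series(rf.feature_importances_, index=X.columns).sort_values(ascending=False).head(10))
```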

8 pages, 1139 KiB  
Proceeding Paper
Artificial Intelligence-Based Effective Detection of Parkinson’s Disease Using Voice Measurements
by Gogulamudi Pradeep Reddy, Duppala Rohan, Yellapragada Venkata Pavan Kumar, Kasaraneni Purna Prakash and Mandarapu Srikanth
Eng. Proc. 2024, 82(1), 28; https://doi.org/10.3390/ecsa-11-20481 - 26 Nov 2024
Viewed by 1701
Abstract
Parkinson’s disease (PD) is a neurodegenerative illness that affects the central nervous system and leads to a gradual degeneration of neurons that results in movement slowness, mental health problems, speaking difficulties, etc. In the past 20 years, the frequency of PD has doubled. Global estimates revealed that over 8.5 million cases have been identified so far. Thus, early and accurate detection of PD is crucial for treatment. Traditional detection methods are subjective and prone to delays, as they are reliant on clinical evaluation and imaging. Alternatively, artificial intelligence (AI) has recently emerged as a transformative technology in the healthcare sector, showing decent and promising results. However, an effective algorithm needs to be investigated for the most accurate prediction of a particular disease. Thus, this paper explores the ability of different machine learning algorithms for the effective detection of PD. A total of 26 algorithms were implemented using the Scikit-Learn library on the Oxford PD detection dataset. This is a collection of 195 voice measurements recorded from 31 individuals, of which 23 have PD. The implemented algorithms are logistic regression, decision tree, k-nearest neighbors, random forest, support vector machine, Gaussian naïve Bayes, multi-layered perceptron (MLP), extreme gradient boosting, adaptive boosting, stochastic gradient descent, gradient boosting machine, extra tree classifier, light gradient boosting machine, categorical boosting, Bernoulli naïve Bayes, complement naïve Bayes, multinomial naïve Bayes, histogram-based gradient boosting, nearest centroid, radius neighbors classifier, logistic regression with elastic net regularization, extreme learning machine, ridge classifier, Huber classifier, perceptron classifier, and voting classifier. Among them, MLP outperformed the other algorithms with a testing accuracy of 95%, precision of 94%, sensitivity of 100%, F1 score of 97%, and AUC of 98%. Thus, it successfully discriminates healthy individuals from those with PD, thereby enabling accurate early detection of PD in new patients from their voice measurements. Full article
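
Since the abstract names Scikit-Learn and the Oxford PD voice dataset, a small sketch of the benchmarking loop is given below for a few of the 26 classifiers; the file path and the "name"/"status" columns follow the common UCI release of that dataset and should be treated as assumptions.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB, BernoulliNB
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score, f1_score

df = pd.read_csv("parkinsons.data")                 # UCI-style layout (assumption)
X = df.drop(columns=["name", "status"])
y = df["status"]                                    # 1 = Parkinson's, 0 = healthy
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y, random_state=0)

scaler = StandardScaler().fit(X_tr)
X_tr_s, X_te_s = scaler.transform(X_tr), scaler.transform(X_te)

models = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "GaussianNB": GaussianNB(),
    "BernoulliNB": BernoulliNB(),                   # binarizes the standardized features at 0
    "MLP": MLPClassifier(hidden_layer_sizes=(64,), max_iter=2000, random_state=0),
}
for name, model in models.items():
    model.fit(X_tr_s, y_tr)
    pred = model.predict(X_te_s)
    print(f"{name:18s} acc={accuracy_score(y_te, pred):.3f} f1={f1_score(y_te, pred):.3f}")
```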

22 pages, 1165 KiB  
Article
Advanced Comparative Analysis of Machine Learning and Transformer Models for Depression and Suicide Detection in Social Media Texts
by Biodoumoye George Bokolo and Qingzhong Liu
Electronics 2024, 13(20), 3980; https://doi.org/10.3390/electronics13203980 - 10 Oct 2024
Cited by 3 | Viewed by 3328
Abstract
Depression detection through social media analysis has emerged as a promising approach for early intervention and mental health support. This study evaluates the performance of various machine learning and transformer models in identifying depressive content from tweets on X. Utilizing the Sentiment140 and Suicide-Watch datasets, we built several models, including logistic regression, Bernoulli Naive Bayes, Random Forest, and transformer models such as RoBERTa, DeBERTa, DistilBERT, and SqueezeBERT, to detect this content. Our findings indicate that transformer models outperform traditional machine learning algorithms, with RoBERTa and DeBERTa leading when predicting depression and suicide risk. This performance is attributed to the transformers’ ability to capture contextual nuances in language. On the other hand, logistic regression models outperform transformers on another dataset with more accurate information; this is attributed to the traditional model’s ability to capture simple patterns, especially when the classes are straightforward. We employed a comprehensive cross-validation approach to ensure robustness, with transformers demonstrating higher stability and reliability across splits. Despite limitations such as dataset scope and computational constraints, the findings contribute significantly to mental health monitoring and suggest promising directions for future research and real-world applications in early depression detection and mental health screening tools. The various models used performed outstandingly. Full article
(This article belongs to the Special Issue Information Retrieval and Cyber Forensics with Data Science)
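
A minimal sketch of the classical baselines named in the abstract (logistic regression, Bernoulli Naive Bayes, Random Forest) on binary word-presence features; the transformer fine-tuning step is omitted, and the CSV layout ("text", "label") is an assumption.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import BernoulliNB
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report

df = pd.read_csv("suicide_watch_tweets.csv")        # hypothetical file with "text", "label"
X_tr, X_te, y_tr, y_te = train_test_split(df["text"], df["label"],
                                           test_size=0.2, stratify=df["label"], random_state=0)

baselines = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "BernoulliNB": BernoulliNB(),        # pairs naturally with binary word-presence features
    "RandomForest": RandomForestClassifier(n_estimators=200, random_state=0),
}
for name, clf in baselines.items():
    pipe = make_pipeline(CountVectorizer(binary=True, stop_words="english"), clf)
    pipe.fit(X_tr, y_tr)
    print(name)
    print(classification_report(y_te, pipe.predict(X_te)))
```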

14 pages, 1052 KiB  
Article
The Effect of Training Data Size on Disaster Classification from Twitter
by Dimitrios Effrosynidis, Georgios Sylaios and Avi Arampatzis
Information 2024, 15(7), 393; https://doi.org/10.3390/info15070393 - 8 Jul 2024
Cited by 1 | Viewed by 1794
Abstract
In the realm of disaster-related tweet classification, this study presents a comprehensive analysis of various machine learning algorithms, shedding light on crucial factors influencing algorithm performance. The exceptional efficacy of simpler models is attributed to the quality and size of the dataset, enabling them to discern meaningful patterns. While powerful, complex models are time-consuming and prone to overfitting, particularly with smaller or noisier datasets. Hyperparameter tuning, notably through Bayesian optimization, emerges as a pivotal tool for enhancing the performance of simpler models. A practical guideline for algorithm selection based on dataset size is proposed, consisting of Bernoulli Naive Bayes for datasets below 5000 tweets and Logistic Regression for larger datasets exceeding 5000 tweets. Notably, Logistic Regression shines with 20,000 tweets, delivering an impressive combination of performance, speed, and interpretability. A further improvement of 0.5% is achieved by applying ensemble and stacking methods. Full article
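
The size-based guideline above can be checked with a sketch like the following, which scores Bernoulli Naive Bayes against Logistic Regression as the training subset grows; the data file, column names, and TF-IDF features are assumptions.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.naive_bayes import BernoulliNB
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

df = pd.read_csv("disaster_tweets.csv")              # hypothetical file with "text", "label"
train, test = train_test_split(df, test_size=0.2, stratify=df["label"], random_state=0)

for n in [1000, 2500, 5000, 10000, 20000]:
    subset = train.sample(n=min(n, len(train)), random_state=0)
    for name, clf in [("BernoulliNB", BernoulliNB()),
                      ("LogReg", LogisticRegression(max_iter=1000))]:
        pipe = make_pipeline(TfidfVectorizer(min_df=2), clf).fit(subset["text"], subset["label"])
        score = f1_score(test["label"], pipe.predict(test["text"]), average="macro")
        print(f"n={n:6d} {name:12s} macro-F1={score:.3f}")
```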

20 pages, 2469 KiB  
Article
Deep Dive into Fake News Detection: Feature-Centric Classification with Ensemble and Deep Learning Methods
by Fawaz Khaled Alarfaj and Jawad Abbas Khan
Algorithms 2023, 16(11), 507; https://doi.org/10.3390/a16110507 - 3 Nov 2023
Cited by 14 | Viewed by 5074
Abstract
The online spread of fake news on various platforms has emerged as a significant concern, posing threats to public opinion, political stability, and the dissemination of reliable information. Researchers have turned to advanced technologies, including machine learning (ML) and deep learning (DL) techniques, to detect and classify fake news to address this issue. This research study explores fake news classification using diverse ML and DL approaches. We utilized a well-known “Fake News” dataset sourced from Kaggle, encompassing a labelled news collection. We implemented diverse ML models, including multinomial naïve bayes (MNB), gaussian naïve bayes (GNB), Bernoulli naïve Bayes (BNB), logistic regression (LR), and passive aggressive classifier (PAC). Additionally, we explored DL models, such as long short-term memory (LSTM), convolutional neural networks (CNN), and CNN-LSTM. We compared the performance of these models based on key evaluation metrics, such as accuracy, precision, recall, and the F1 score. Additionally, we conducted cross-validation and hyperparameter tuning to ensure optimal performance. The results provide valuable insights into the strengths and weaknesses of each model in classifying fake news. We observed that DL models, particularly LSTM and CNN-LSTM, showed better performance compared to traditional ML models. These models achieved higher accuracy and demonstrated robustness in classification tasks. These findings emphasize the potential of DL models to tackle the spread of fake news effectively and highlight the importance of utilizing advanced techniques to address this challenging problem. Full article
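
A hedged sketch of the classical-ML side of this comparison (the LSTM/CNN models are omitted); the "text" and "label" columns follow the usual Kaggle "Fake News" layout but are assumptions here.

```python
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.naive_bayes import MultinomialNB, BernoulliNB
from sklearn.linear_model import LogisticRegression, PassiveAggressiveClassifier

df = pd.read_csv("fake_news_train.csv").dropna(subset=["text", "label"])

models = {
    "MultinomialNB": MultinomialNB(),
    "BernoulliNB": BernoulliNB(),
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "PassiveAggressive": PassiveAggressiveClassifier(max_iter=1000, random_state=0),
}
for name, clf in models.items():
    pipe = make_pipeline(TfidfVectorizer(stop_words="english", max_features=50000), clf)
    scores = cross_val_score(pipe, df["text"], df["label"], cv=5, scoring="accuracy")
    print(f"{name:20s} mean CV accuracy = {scores.mean():.3f}")
```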

20 pages, 1421 KiB  
Article
Deep Learning-Based Depression Detection from Social Media: Comparative Evaluation of ML and Transformer Techniques
by Biodoumoye George Bokolo and Qingzhong Liu
Electronics 2023, 12(21), 4396; https://doi.org/10.3390/electronics12214396 - 24 Oct 2023
Cited by 29 | Viewed by 13984
Abstract
Detecting depression from user-generated content on social media platforms has garnered significant attention due to its potential for the early identification and monitoring of mental health issues. This paper presents a comprehensive approach for depression detection from user tweets using machine learning techniques. The study utilizes a dataset of 632,000 tweets and employs data preprocessing, feature selection, and model training with logistic regression, Bernoulli Naive Bayes, random forests, DistilBERT, SqueezeBERT, DeBERTA, and RoBERTa models. Evaluation metrics such as accuracy, precision, recall, and F1 score are employed to assess the models’ performance. The results indicate that the RoBERTa model achieves the highest accuracy ratio of 0.981 and the highest mean accuracy of 0.97 (across 10 cross-validation folds) in detecting depression from tweets. This research demonstrates the effectiveness of machine learning and advanced transformer-based models in leveraging social media data for mental health analysis. The findings offer valuable insights into the potential for early detection and monitoring of depression using online platforms, contributing to the growing field of mental health analysis based on user-generated content. Full article
(This article belongs to the Special Issue AI in Knowledge-Based Information and Decision Support Systems)
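
A brief sketch of the 10-fold cross-validation reported above, applied only to the Bernoulli Naive Bayes baseline (the transformer models are omitted); the file and column names are assumptions.

```python
import pandas as pd
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.naive_bayes import BernoulliNB

df = pd.read_csv("depression_tweets.csv")           # hypothetical file with "text", "label"
pipe = make_pipeline(CountVectorizer(binary=True, stop_words="english"), BernoulliNB())

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(pipe, df["text"], df["label"], cv=cv, scoring="accuracy")
print(f"mean accuracy over 10 folds: {scores.mean():.3f} (+/- {scores.std():.3f})")
```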

16 pages, 3107 KiB  
Article
Application of Machine Learning Models for Early Detection and Accurate Classification of Type 2 Diabetes
by Orlando Iparraguirre-Villanueva, Karina Espinola-Linares, Rosalynn Ornella Flores Castañeda and Michael Cabanillas-Carbonell
Diagnostics 2023, 13(14), 2383; https://doi.org/10.3390/diagnostics13142383 - 15 Jul 2023
Cited by 41 | Viewed by 7254
Abstract
Early detection of diabetes is essential to prevent serious complications in patients. The purpose of this work is to detect and classify type 2 diabetes in patients using machine learning (ML) models, and to select the optimal model for predicting the risk of diabetes. In this paper, five ML models, including K-nearest neighbor (K-NN), Bernoulli Naïve Bayes (BNB), decision tree (DT), logistic regression (LR), and support vector machine (SVM), are investigated to identify diabetic patients. A Kaggle-hosted Pima Indian dataset containing 768 patients with and without diabetes was used, including variables such as the number of pregnancies, blood glucose concentration, diastolic blood pressure, skinfold thickness, body insulin levels, body mass index (BMI), genetic background, diabetes in the family tree, age, and outcome (with/without diabetes). The results show that the K-NN and BNB models outperform the other models. The K-NN model obtained the best accuracy in detecting diabetes (79.6%), while the BNB model reached 77.2%. Finally, it can be stated that the use of ML models for the early detection of diabetes is very promising. Full article
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)
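
A compact sketch of the five-model comparison on the Pima Indians dataset; the "diabetes.csv" file and "Outcome" column follow the common Kaggle release, and the scaling choices are assumptions.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import BernoulliNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

df = pd.read_csv("diabetes.csv")
X, y = df.drop(columns=["Outcome"]), df["Outcome"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y, random_state=0)

models = {
    "K-NN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=7)),
    # Standardizing centres each feature so BernoulliNB's 0 threshold is meaningful.
    "BNB": make_pipeline(StandardScaler(), BernoulliNB()),
    "DT": DecisionTreeClassifier(random_state=0),
    "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "SVM": make_pipeline(StandardScaler(), SVC()),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    print(f"{name:4s} accuracy = {accuracy_score(y_te, model.predict(X_te)):.3f}")
```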

18 pages, 864 KiB  
Article
Threatening URDU Language Detection from Tweets Using Machine Learning
by Aneela Mehmood, Muhammad Shoaib Farooq, Ansar Naseem, Furqan Rustam, Mónica Gracia Villar, Carmen Lili Rodríguez and Imran Ashraf
Appl. Sci. 2022, 12(20), 10342; https://doi.org/10.3390/app122010342 - 14 Oct 2022
Cited by 22 | Viewed by 3878
Abstract
Technology’s expansion has contributed to the rise in popularity of social media platforms. Twitter is one of the leading social media platforms that people use to share their opinions. Such opinions may sometimes contain threatening text, deliberate or non-deliberate, which can be disturbing for other users. Consequently, the detection of threatening content on social media is an important task. Contrary to high-resource languages like English, Dutch, and others that have several such approaches, the low-resource Urdu language does not have such a luxury. Therefore, this study presents an intelligent threatening-language detection approach for the Urdu language. A stacking model is proposed that uses an extra tree (ET) classifier and Bayes theorem-based Bernoulli Naive Bayes (BNB) as the base learners, while logistic regression (LR) is employed as the meta learner. A performance analysis is carried out by deploying a support vector classifier, ET, LR, BNB, fully connected network, convolutional neural network, long short-term memory, and gated recurrent unit. Experimental results indicate that the stacked model performs better than both the machine learning and deep learning models. With 74.01% accuracy, 70.84% precision, 75.65% recall, and a 73.99% F1 score, the model outperforms the existing benchmark study. Full article
(This article belongs to the Special Issue Recent Trends in Natural Language Processing and Its Applications)
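
The stacking architecture described above (Extra Trees and Bernoulli Naive Bayes as base learners, logistic regression as meta learner) maps directly onto scikit-learn's StackingClassifier; the character n-gram TF-IDF features and file layout below are assumptions.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.ensemble import ExtraTreesClassifier, StackingClassifier
from sklearn.naive_bayes import BernoulliNB
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report

df = pd.read_csv("urdu_threatening_tweets.csv")      # hypothetical file with "text", "label"
X_tr, X_te, y_tr, y_te = train_test_split(df["text"], df["label"],
                                           test_size=0.2, stratify=df["label"], random_state=0)

stack = StackingClassifier(
    estimators=[("et", ExtraTreesClassifier(n_estimators=200, random_state=0)),
                ("bnb", BernoulliNB())],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
# Character n-grams avoid language-specific tokenization for Urdu script (an assumption).
pipe = make_pipeline(TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)), stack)
pipe.fit(X_tr, y_tr)
print(classification_report(y_te, pipe.predict(X_te)))
```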

17 pages, 3702 KiB  
Article
A Deep Learning Method for the Prediction of the Index Mechanical Properties and Strength Parameters of Marlstone
by Mohammad Azarafza, Masoud Hajialilue Bonab and Reza Derakhshani
Materials 2022, 15(19), 6899; https://doi.org/10.3390/ma15196899 - 5 Oct 2022
Cited by 47 | Viewed by 3270
Abstract
The index mechanical properties, strength, and stiffness parameters of rock materials (i.e., uniaxial compressive strength, c, ϕ, E, and G) are critical factors in the proper geotechnical design of rock structures. Direct procedures such as field surveys, sampling, and testing are used to estimate these properties, and are time-consuming and costly. Indirect methods have gained popularity in recent years due to their time-saving and highly accurate results, which are comparable to those obtained through direct approaches. This study presents a procedure for establishing a deep learning-based predictive model (DNN) for obtaining the geomechanical characteristics of marlstone samples that have been recovered from the South Pars region of southwest Iran. The model was implemented on a dataset resulting from the execution of numerous geotechnical tests and the evaluation of the geotechnical parameters of a total of 120 samples. The applied model was verified by using benchmark learning classifiers (e.g., Support Vector Machine, Logistic Regression, Gaussian Naïve Bayes, Multilayer Perceptron, Bernoulli Naïve Bayes, and Decision Tree), Loss Function, MAE, MSE, RMSE, and R-square. According to the results, the proposed DNN-based model led to the highest accuracy (0.95), precision (0.97), and the lowest error rate (MAE = 0.13, MSE = 0.11, and RMSE = 0.17). Moreover, in terms of R2, the model was able to accurately predict the geotechnical indices (0.933 for UCS, 0.925 for E, 0.941 for G, 0.954 for c, and 0.921 for φ). Full article
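
As a rough, hedged stand-in for the paper's DNN (whose exact architecture is not given here), the sketch below fits a small feed-forward regressor to predict one target (UCS) from the index properties; the file, feature columns, and network size are assumptions.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

df = pd.read_csv("marlstone_samples.csv")             # hypothetical file of 120 tested samples
X, y = df.drop(columns=["UCS"]), df["UCS"]            # one target at a time; UCS as an example
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

dnn = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(64, 32, 16), max_iter=5000, random_state=0),
)
dnn.fit(X_tr, y_tr)
pred = dnn.predict(X_te)
print("MAE :", mean_absolute_error(y_te, pred))
print("RMSE:", mean_squared_error(y_te, pred) ** 0.5)
print("R2  :", r2_score(y_te, pred))
```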

21 pages, 6236 KiB  
Article
TweezBot: An AI-Driven Online Media Bot Identification Algorithm for Twitter Social Networks
by Rachit Shukla, Adwitiya Sinha and Ankit Chaudhary
Electronics 2022, 11(5), 743; https://doi.org/10.3390/electronics11050743 - 28 Feb 2022
Cited by 15 | Viewed by 5400
Abstract
In the ultra-connected age of information, online social media platforms have become an indispensable part of our daily routines. Recently, this online public space is getting largely occupied by suspicious and manipulative social media bots. Such automated deceptive bots often attempt to distort ground realities and manipulate global trends, thus creating astroturfing attacks on the social media online portals. Moreover, these bots often tend to participate in duplicitous activities, including promotion of hidden agendas and indulgence in biased propagation meant for personal gain or scams. Thus, online bots have eventually become one of the biggest menaces for social media platforms. Therefore, we have proposed an AI-driven social media bot identification framework, namely TweezBot, which can identify fraudulent Twitter bots. The proposed bot detection method analyzes Twitter-specific user profiles having essential profile-centric features and several activity-centric characteristics. We have constructed a set of filtering criteria and devised an exhaustive bag of words for performing language-based processing. In order to substantiate our research, we have performed a comparative study of our model with the existing benchmark classifiers, such as Support Vector Machine, Categorical Naïve Bayes, Bernoulli Naïve Bayes, Multilayer Perceptron, Decision Trees, Random Forest and other automation identifiers. Full article
(This article belongs to the Special Issue Hybrid Developments in Cyber Security and Threat Analysis)
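
A hedged sketch of the benchmark comparison mentioned above, using a handful of binary profile flags; the flag names, the "is_bot" label, and the file are assumptions rather than TweezBot's actual feature set.

```python
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import BernoulliNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier

df = pd.read_csv("twitter_profiles.csv")              # hypothetical file
flags = ["verified", "default_profile", "default_profile_image", "has_url", "geo_enabled"]
X, y = df[flags].astype(int), df["is_bot"]

for name, clf in [("SVM", SVC()),
                  ("BernoulliNB", BernoulliNB()),
                  ("DecisionTree", DecisionTreeClassifier(random_state=0)),
                  ("RandomForest", RandomForestClassifier(random_state=0)),
                  ("MLP", MLPClassifier(max_iter=2000, random_state=0))]:
    scores = cross_val_score(clf, X, y, cv=5, scoring="f1")
    print(f"{name:14s} mean F1 = {scores.mean():.3f}")
```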

12 pages, 1935 KiB  
Article
Modelling Service Quality of Internet Service Providers during COVID-19: The Customer Perspective Based on Twitter Dataset
by Bagus Setya Rintyarna, Heri Kuswanto, Riyanarto Sarno, Emy Kholifah Rachmaningsih, Fika Hastarita Rachman, Wiwik Suharso and Triawan Adi Cahyanto
Informatics 2022, 9(1), 11; https://doi.org/10.3390/informatics9010011 - 29 Jan 2022
Cited by 7 | Viewed by 5873
Abstract
Internet service providers (ISPs) conduct their business by providing Internet access features to their customers. The COVID-19 pandemic has shifted most activities to being performed remotely over an Internet connection. As a result, the demand for Internet services increased by 50%. This significant rise in the appeal of Internet services needs to be matched by a notable increase in the service quality provided by ISPs. Service quality plays a great role for enterprises, including ISPs, in retaining consumer loyalty. Thus, modelling ISPs’ service quality is of great importance. Since the common technique for assessing service quality is a time-consuming and costly pencil-and-paper survey, this work proposes a framework based on the Sentiment Analysis (SA) of a Twitter dataset to model service quality. The SA involves the majority voting of three machine learning algorithms, namely Naïve Bayes, Multinomial Naïve Bayes, and Bernoulli Naïve Bayes. Making use of Thaicon’s service quality metrics, this work proposes a formula to generate a rating of service quality accordingly. For the case studies, we examined two ISPs in Indonesia, i.e., By.U and MPWR. The framework successfully extracted the service quality rate of both ISPs, revealing that By.U is better in terms of service quality, as indicated by a service quality rate of 0.71. Meanwhile, MPWR outperforms By.U in terms of customer service. Full article
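
A sketch of the majority-voting sentiment step, interpreting the three learners as Gaussian, Multinomial, and Bernoulli Naive Bayes (the first is an assumption, as the abstract says only "Naïve Bayes"); the labelled-tweet file and columns are also assumptions.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer
from sklearn.ensemble import VotingClassifier
from sklearn.naive_bayes import GaussianNB, MultinomialNB, BernoulliNB
from sklearn.metrics import accuracy_score

df = pd.read_csv("isp_tweets_labeled.csv")             # hypothetical file with "text", "sentiment"
X_tr, X_te, y_tr, y_te = train_test_split(df["text"], df["sentiment"],
                                           test_size=0.2, stratify=df["sentiment"], random_state=0)

to_dense = FunctionTransformer(lambda m: m.toarray(), accept_sparse=True)
voter = VotingClassifier(
    estimators=[("gnb", make_pipeline(to_dense, GaussianNB())),   # GaussianNB needs dense input
                ("mnb", MultinomialNB()),
                ("bnb", BernoulliNB())],
    voting="hard",                                                # simple majority vote
)
pipe = make_pipeline(CountVectorizer(max_features=20000), voter)
pipe.fit(X_tr, y_tr)
print("majority-vote accuracy:", accuracy_score(y_te, pipe.predict(X_te)))
```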

25 pages, 3987 KiB  
Article
Using Machine Learning to Detect Events on the Basis of Bengali and Banglish Facebook Posts
by Noyon Dey, Md. Sazzadur Rahman, Motahara Sabah Mredula, A. S. M. Sanwar Hosen and In-Ho Ra
Electronics 2021, 10(19), 2367; https://doi.org/10.3390/electronics10192367 - 28 Sep 2021
Cited by 8 | Viewed by 5461
Abstract
In modern times, ensuring social security has become the prime concern for security administrators. The widespread and recurrent use of social media sites is creating a huge risk for the lives of the general people, as these sites are frequently becoming potential sources of the organization of various types of immoral events. To protect society from these dangers, a prior detection system that can effectively detect events by analyzing these social media data is essential. However, automating the process of event detection has been difficult, as existing processes must account for diverse writing styles, languages, dialects, post lengths, and so on. To overcome these difficulties, we developed an effective model for detecting events, which, for our purposes, were classified as either protesting, celebrating, religious, or neutral, using Bengali and Banglish Facebook posts. First, the collected posts’ text was processed for language detection, and then the detected posts were pre-processed using stopword removal and tokenization. Features were then extracted from these pre-processed texts using three sub-processes: filtering, phrase matching of specific events, and sentiment analysis. The collected features were ultimately used to train our Bernoulli Naive Bayes classification model, which was capable of detecting events with 90.41% accuracy for Bengali-language posts and 70% accuracy for Banglish posts. For evaluating the effectiveness of our proposed model more precisely, we compared it with two other classifiers: Support Vector Machine and Decision Tree. Full article
(This article belongs to the Section Computer Science & Engineering)
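
A minimal sketch of the classification stage described above (language detection and the custom phrase/sentiment features are omitted); the event labels come from the abstract, while the file and column names are assumptions.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.naive_bayes import BernoulliNB
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Hypothetical layout: "text" plus "event" in {protesting, celebrating, religious, neutral}.
df = pd.read_csv("facebook_posts.csv")
X_tr, X_te, y_tr, y_te = train_test_split(df["text"], df["event"],
                                           test_size=0.2, stratify=df["event"], random_state=0)

for name, clf in [("BernoulliNB", BernoulliNB()),
                  ("SVM", LinearSVC()),
                  ("DecisionTree", DecisionTreeClassifier(random_state=0))]:
    pipe = make_pipeline(CountVectorizer(binary=True), clf)   # tokenization + binary presence features
    pipe.fit(X_tr, y_tr)
    print(f"{name:14s} accuracy = {accuracy_score(y_te, pipe.predict(X_te)):.3f}")
```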

30 pages, 4577 KiB  
Article
Innovative Artificial Intelligence Approach for Hearing-Loss Symptoms Identification Model Using Machine Learning Techniques
by Mohd Khanapi Abd Ghani, Nasir G. Noma, Mazin Abed Mohammed, Karrar Hameed Abdulkareem, Begonya Garcia-Zapirain, Mashael S. Maashi and Salama A. Mostafa
Sustainability 2021, 13(10), 5406; https://doi.org/10.3390/su13105406 - 12 May 2021
Cited by 13 | Viewed by 5872
Abstract
Physicians depend on their insight and experience and on a fundamentally indicative or symptomatic approach to decide on the possible ailment of a patient. However, numerous phases of problem identification and longer strategies can prompt a longer time for consulting and can subsequently cause other patients that require attention to wait for longer. This can bring about pressure and tension concerning those patients. In this study, we focus on developing a decision-support system for diagnosing the symptoms as a result of hearing loss. The model is implemented by utilizing machine learning techniques. The Frequent Pattern Growth (FP-Growth) algorithm is used as a feature transformation method and the multivariate Bernoulli naïve Bayes classification model as the classifier. To find the correlation that exists between the hearing thresholds and symptoms of hearing loss, the FP-Growth and association rule algorithms were first used to experiment with small sample and large sample datasets. The result of these two experiments showed the existence of this relationship, and that the performance of the hybrid of the FP-Growth and naïve Bayes algorithms in identifying hearing-loss symptoms was found to be efficient, with a very small error rate. The average accuracy rate and average error rate for the multivariate Bernoulli model with FP-Growth feature transformation, using five training sets, are 98.25% and 1.73%, respectively. Full article
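
A hedged sketch of the FP-Growth + multivariate Bernoulli naïve Bayes hybrid, using mlxtend's fpgrowth as the feature-transformation step; the one-hot table, column names, and support threshold are assumptions.

```python
import pandas as pd
from mlxtend.frequent_patterns import fpgrowth
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import BernoulliNB

df = pd.read_csv("hearing_records_onehot.csv")        # hypothetical 0/1 table of thresholds/symptoms
X, y = df.drop(columns=["symptom_class"]), df["symptom_class"]

# FP-Growth as a feature-transformation step: keep only attributes that occur
# in frequent itemsets (minimum support of 0.2 is an assumption).
itemsets = fpgrowth(X.astype(bool), min_support=0.2, use_colnames=True)
frequent_cols = sorted({col for items in itemsets["itemsets"] for col in items})

scores = cross_val_score(BernoulliNB(), X[frequent_cols], y, cv=5, scoring="accuracy")
print(f"BernoulliNB on {len(frequent_cols)} FP-Growth-selected features: {scores.mean():.3f}")
```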

19 pages, 710 KiB  
Article
Opinion-Mining on Marglish and Devanagari Comments of YouTube Cookery Channels Using Parametric and Non-Parametric Learning Models
by Sonali Rajesh Shah, Abhishek Kaushik, Shubham Sharma and Janice Shah
Big Data Cogn. Comput. 2020, 4(1), 3; https://doi.org/10.3390/bdcc4010003 - 17 Mar 2020
Cited by 24 | Viewed by 7822
Abstract
YouTube is a boon, and through it people can educate, entertain, and express themselves about various topics. YouTube India currently has millions of active users. As there are millions of active users, it can be understood that the data present on YouTube will be large. With India being a very diverse country, many people are multilingual, and they often express their opinions in a code-mix form, i.e., a mix of two or more languages. It has become a necessity to perform Sentiment Analysis on code-mix languages, as there is not much research on Indian code-mix language data. In this paper, Sentiment Analysis (SA) is carried out on Marglish (Marathi + English) as well as Devanagari Marathi comments extracted via the YouTube API from top Marathi channels. Several machine-learning models are applied to the dataset along with three different vectorizing techniques. Multilayer Perceptron (MLP) with the Count vectorizer provides the best accuracy of 62.68% on the Marglish dataset, while Bernoulli Naïve Bayes with the Count vectorizer gives the best accuracy of 60.60% on the Devanagari dataset; these two are therefore considered the best-performing algorithms. 10-fold cross-validation and statistical testing were also carried out on the dataset to confirm the results. Full article
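
A short sketch of the two best-reported combinations (Count vectorizer with MLP, and with Bernoulli Naïve Bayes) under 10-fold cross-validation; the comment file and column names are assumptions.

```python
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.naive_bayes import BernoulliNB
from sklearn.neural_network import MLPClassifier

comments = pd.read_csv("marglish_comments.csv")       # hypothetical file with "comment", "sentiment"

for name, clf in [("MLP", MLPClassifier(hidden_layer_sizes=(100,), max_iter=500, random_state=0)),
                  ("BernoulliNB", BernoulliNB())]:
    pipe = make_pipeline(CountVectorizer(), clf)
    scores = cross_val_score(pipe, comments["comment"], comments["sentiment"], cv=10)
    print(f"{name:12s} mean 10-fold accuracy = {scores.mean():.3f}")
```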
