Applied Data Analytics

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "Network Science".

Deadline for manuscript submissions: closed (31 August 2021) | Viewed by 21837

Special Issue Editor


Prof. Dr. Tomasz Wiktorski
Guest Editor
Faculty of Science and Technology, Department of Electrical Engineering and Computer Science, University of Stavanger, Stavanger, Norway
Interests: time series; data science; big data

Special Issue Information

Dear Colleagues,

In recent years, we have seen the proliferation of new data analytics methods and approaches. These developments have happened under the labels of machine learning, data mining, deep learning, and smart X (where X can stand for any domain of application, such as smart cities and smart energy).

However, the road from generic methods to practical applications is not always straightforward. Typical problems include incomplete or dirty data, datasets that are too small or biased, a lack of reproducible experiments, missing code for all or part of a method, unclear parameter or hyperparameter values, etc.

The aim of this Special Issue is to provide a forum for applied analytics researchers to present original contributions describing their experience with, and approaches to, the aforementioned or similar problems in real-life applications of data analytics. Improvements and modifications to existing methods are also of interest.

Submissions should be original and unpublished. Extended versions of conference publications will be considered if they contain at least 50% new content.

Prof. Dr. Tomasz Wiktorski
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Data analysis
  • Data science
  • Big data
  • Algorithms
  • Applied data analytics
  • Machine learning
  • Data mining

Published Papers (8 papers)


Research

21 pages, 580 KiB  
Article
Association Rules Mining for Hospital Readmission: A Case Study
by Nor Hamizah Miswan, ‘Ismat Mohd Sulaiman, Chee Seng Chan and Chong Guan Ng
Mathematics 2021, 9(21), 2706; https://doi.org/10.3390/math9212706 - 25 Oct 2021
Cited by 8 | Viewed by 3790
Abstract
As an indicator of healthcare quality and performance, hospital readmission incurs major costs for healthcare systems worldwide. Understanding the relationships between readmission factors, such as input features and readmission length, is challenging owing to the intricacy of hospital readmission procedures. This study discovered significant correlations between potential readmission factors (thresholds of various settings for readmission length) and basic demographic variables. Association rule mining (ARM), particularly the Apriori algorithm, was utilised to extract hidden input variable patterns and relationships among admitted patients by generating supervised learning rules. The mined rules were categorised into two outcomes to comprehend the readmission data: (i) rules associated with various readmission lengths and (ii) several expert-validated variables related to basic demographics (gender, race, and age group). The extracted rules proved useful in facilitating decision-making and resource preparation to minimise patient readmission.
(This article belongs to the Special Issue Applied Data Analytics)
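
To make the core technique concrete, the following is a minimal sketch of Apriori-based association rule mining using the mlxtend library; the toy transactions and the "readmitted<30d" label are invented stand-ins for the study's hospital variables, not the authors' data.

```python
# A minimal Apriori association-rule-mining sketch with mlxtend;
# transactions and item names are invented for illustration.
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

# Each "transaction" lists the attributes observed for one admitted patient.
transactions = [
    ["male", "age>65", "readmitted<30d"],
    ["female", "age>65", "readmitted<30d"],
    ["male", "age<=65"],
    ["female", "age>65", "readmitted<30d"],
    ["male", "age<=65"],
]

# One-hot encode the transactions into a boolean DataFrame.
te = TransactionEncoder()
onehot = pd.DataFrame(te.fit(transactions).transform(transactions),
                      columns=te.columns_)

# Frequent itemsets above a minimum support threshold.
itemsets = apriori(onehot, min_support=0.4, use_colnames=True)

# Derive rules and keep those whose consequent is the readmission outcome.
rules = association_rules(itemsets, metric="confidence", min_threshold=0.7)
rules = rules[rules["consequents"].apply(lambda c: "readmitted<30d" in c)]
print(rules[["antecedents", "consequents", "support", "confidence", "lift"]])
```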

20 pages, 3151 KiB  
Article
Improved Multi-Scale Deep Integration Paradigm for Point and Interval Carbon Trading Price Forecasting
by Jujie Wang and Shiyao Qiu
Mathematics 2021, 9(20), 2595; https://doi.org/10.3390/math9202595 - 15 Oct 2021
Cited by 5 | Viewed by 1179
Abstract
The forecast of the carbon trading price is crucial to both sellers and purchasers, and multi-scale integration models have been widely used in this process. However, these multi-scale models ignore the feature reconstruction process as well as the residual part, and they often focus only on linear integration. Meanwhile, most of these models cannot provide a prediction interval, which means they neglect uncertainty. In this paper, an improved multi-scale nonlinear integration model is proposed. The original dataset is divided into subgroups through variational mode decomposition (VMD), and all subgroups go through a sample entropy (SE) process to reconstruct the features. Then, random forest and long short-term memory (LSTM) integration are used to model the feature sub-sequences. For the residual part, an LSTM residual correction strategy based on a white noise test corrects the residuals to obtain point prediction results. Finally, a Gaussian process (GP) is applied to obtain the prediction interval estimate. The results show that, compared with other methods, the proposed method obtains satisfactory accuracy with the minimum statistical error, so it is safe to conclude that the proposed method can efficiently predict the carbon price and provide a prediction interval estimate.
(This article belongs to the Special Issue Applied Data Analytics)
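
The sample entropy (SE) statistic used to group decomposed sub-sequences can be sketched as follows; this is the standard SampEn definition implemented in NumPy, not the authors' code, and the parameters m and r are common defaults.

```python
# Standard sample entropy SampEn(m, r) = -ln(A/B), where B counts pairs of
# length-m templates within Chebyshev distance r and A does the same for m+1.
import numpy as np

def sample_entropy(x, m=2, r=None):
    x = np.asarray(x, dtype=float)
    n = len(x)
    if r is None:
        r = 0.2 * np.std(x)  # a common default tolerance

    def count_matches(length):
        # All overlapping templates of the given length.
        templates = np.array([x[i:i + length] for i in range(n - length + 1)])
        matches = 0
        for t in templates:
            dist = np.max(np.abs(templates - t), axis=1)  # Chebyshev distance
            matches += np.sum(dist <= r) - 1              # exclude self-match
        return matches

    b, a = count_matches(m), count_matches(m + 1)
    return -np.log(a / b) if a > 0 and b > 0 else float("inf")

# Sub-sequences (e.g., VMD modes) with similar SampEn values could then be
# merged into one feature before fitting the forecasting models.
rng = np.random.default_rng(0)
print(sample_entropy(np.sin(np.linspace(0, 20, 200))))  # low: regular signal
print(sample_entropy(rng.standard_normal(200)))         # higher: irregular
```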

13 pages, 273 KiB  
Article
Application of Exploratory Factor Analysis in the Construction of a Self-Perception Model of Informational Competences in Higher Education
by Belén Quintero Ordóñez, Ignacio González López, Eloísa Reche Urbano and Juan Antonio Fuentes Esparrell
Mathematics 2021, 9(18), 2332; https://doi.org/10.3390/math9182332 - 20 Sep 2021
Cited by 4 | Viewed by 1960
Abstract
The progress society has experienced as a result of the ready availability of information through technology highlights the need to develop specific learning related to informational competences (IC) in educational settings where future professionals are trained to educate others, specifically in university degrees in the social sciences. Through the practical application of exploratory factor analysis, this study seeks to ascertain the opinions of students enrolled in these degrees at the Universidad de Córdoba (Spain) regarding the knowledge they consider they possess about IC for their future professional development. The methodology is based on a descriptive, non-experimental, correlational survey design. The results show that factor analysis is a fundamental tool for capturing students' perception of their knowledge of IC: its psychometric value confirmed construct validity and enabled the items making up the four initial dimensions of IC to be broken down into eight factors, improving the understanding and explanation of these IC.
(This article belongs to the Special Issue Applied Data Analytics)
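
A hedged sketch of the exploratory-factor-analysis step with scikit-learn: the eight-factor varimax solution mirrors the structure reported in the abstract, but the questionnaire matrix below is a random placeholder, not the study's data.

```python
# A minimal exploratory factor analysis sketch with scikit-learn;
# random responses stand in for the real questionnaire data.
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
n_students, n_items = 300, 24  # invented survey dimensions
responses = rng.integers(1, 6, size=(n_students, n_items)).astype(float)

# Standardize items, then extract eight varimax-rotated factors,
# matching the eight-factor structure reported in the paper.
z = StandardScaler().fit_transform(responses)
fa = FactorAnalysis(n_components=8, rotation="varimax", random_state=0)
fa.fit(z)

# Loadings: correlation-like weights of each item on each factor.
loadings = fa.components_.T  # shape: (n_items, 8)
for item in range(n_items):
    top = np.argmax(np.abs(loadings[item]))
    print(f"item {item:2d} loads most strongly on factor {top}")
```
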
23 pages, 3708 KiB  
Article
PM2.5 Prediction Model Based on Combinational Hammerstein Recurrent Neural Networks
by Yi-Chung Chen, Tsu-Chiang Lei, Shun Yao and Hsin-Ping Wang
Mathematics 2020, 8(12), 2178; https://doi.org/10.3390/math8122178 - 6 Dec 2020
Cited by 15 | Viewed by 2471
Abstract
Airborne particulate matter 2.5 (PM2.5) can have a profound effect on the health of the population. Many researchers have been reporting highly accurate numerical predictions based on raw PM2.5 data imported directly into deep learning models; however, there is still considerable room for improvement in terms of implementation costs due to heavy computational overhead. From the perspective of environmental science, PM2.5 values in a given location can be attributed to local sources as well as external sources. Local sources tend to have a dramatic short-term impact on PM2.5 values, whereas external sources tend to have more subtle but longer-lasting effects. In the presence of PM2.5 from both sources at the same time, this combination of effects can undermine the predictive accuracy of the model. This paper presents a novel combinational Hammerstein recurrent neural network (CHRNN) to enhance predictive accuracy and overcome the heavy computational and monetary burden imposed by deep learning models. The CHRNN comprises a base neural network tasked with learning gradual (long-term) fluctuations in conjunction with add-on neural networks to deal with dramatic (short-term) fluctuations. The CHRNN can be coupled with a random forest model to determine the degree to which short-term effects influence long-term outcomes. We also developed novel feature selection and normalization methods to enhance prediction accuracy. Using real-world air quality measurements and PM2.5 datasets from Taiwan, the precision of the proposed system in the numerical prediction of PM2.5 levels was comparable to that of state-of-the-art deep learning models, such as deep recurrent neural networks and long short-term memory, despite far lower implementation costs and computational overhead.
(This article belongs to the Special Issue Applied Data Analytics)
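
The CHRNN itself is not reproduced here, but the sketch below illustrates the combinational idea the abstract describes: a base model tracks gradual fluctuations, an add-on signal corrects for dramatic short-term ones, and a random forest learns how strongly the short-term correction applies. All models and data are simplified stand-ins, not the authors' architecture.

```python
# Illustrative base + add-on combination (not the CHRNN itself): a smooth
# baseline plus a short-term correction, blended by a random forest.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
t = np.arange(500, dtype=float)
long_term = 20 + 5 * np.sin(2 * np.pi * t / 144)           # gradual component
spikes = np.where(rng.random(500) < 0.05, 15.0, 0.0)       # local-source spikes
pm25 = long_term + spikes + rng.normal(0, 1, 500)          # synthetic PM2.5

# "Base" forecast: a moving average capturing the gradual component.
window = 24
base = np.convolve(pm25, np.ones(window) / window, mode="same")

# "Add-on" correction: the previous step's residual, a crude short-term signal.
addon = np.roll(pm25 - base, 1)

# The random forest learns how much weight the short-term correction deserves,
# given simple context features (here: recent volatility).
volatility = np.abs(np.roll(pm25, 1) - np.roll(pm25, 2))
X = np.column_stack([base, addon, volatility])[2:-1]
y = pm25[2:-1]
rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
print("in-sample MAE:", np.mean(np.abs(rf.predict(X) - y)))
```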

17 pages, 5951 KiB  
Article
Improving Accuracy and Generalization Performance of Small-Size Recurrent Neural Networks Applied to Short-Term Load Forecasting
by Pavel V. Matrenin, Vadim Z. Manusov, Alexandra I. Khalyasmaa, Dmitry V. Antonenkov, Stanislav A. Eroshenko and Denis N. Butusov
Mathematics 2020, 8(12), 2169; https://doi.org/10.3390/math8122169 - 4 Dec 2020
Cited by 26 | Viewed by 3308
Abstract
The load forecasting of a coal mining enterprise is a complicated problem due to the irregular technological process of mining. It is necessary to apply models that can distinguish both cyclic components and complex rules in energy consumption data that reflect a highly volatile technological process. For such tasks, Artificial Neural Networks demonstrate advanced performance. In recent years, the effectiveness of Artificial Neural Networks has been significantly improved thanks to new state-of-the-art architectures, training methods, and approaches to reduce overfitting. In this paper, a Recurrent Neural Network architecture with a small-size model was applied to the short-term load forecasting of a coal mining enterprise. A single recurrent model was developed and trained for the entire four-year operational period of the enterprise, with significant changes in the energy consumption pattern during that period. This task was challenging since it required high-level generalization performance from the model. It was shown that the accuracy and generalization properties of small-size recurrent models can be significantly improved by the proper selection of hyperparameters and the training method. The effectiveness of the proposed approach was validated using a real-case dataset.
(This article belongs to the Special Issue Applied Data Analytics)
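
As a hedged sketch of a small-size recurrent load forecaster of the kind discussed, here is a compact GRU model in Keras; the architecture, window size, and hyperparameters are illustrative guesses, not the paper's values, and the series is synthetic.

```python
# A compact recurrent load forecaster in Keras; all settings are
# illustrative, and the hourly load series is synthetic.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(7)
series = (50 + 10 * np.sin(np.arange(2000) * 2 * np.pi / 24)
          + rng.normal(0, 2, 2000))                   # synthetic hourly load

window = 24                                           # one day of history
X = np.array([series[i:i + window] for i in range(len(series) - window)])
y = series[window:]
X = X[..., None]                                      # (samples, steps, features)

# Small-size model: one narrow GRU layer plus a linear head.
model = tf.keras.Sequential([
    tf.keras.layers.GRU(16, input_shape=(window, 1)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mae")
model.fit(X, y, epochs=5, batch_size=64, validation_split=0.2, verbose=0)
print("MAE:", model.evaluate(X, y, verbose=0))
```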

15 pages, 912 KiB  
Article
WINFRA: A Web-Based Platform for Semantic Data Retrieval and Data Analytics
by Addi Ait-Mlouk, Xuan-Son Vu and Lili Jiang
Mathematics 2020, 8(11), 2090; https://doi.org/10.3390/math8112090 - 23 Nov 2020
Cited by 4 | Viewed by 3151
Abstract
Given the huge amounts of heterogeneous data stored in different locations, these data need to be federated and semantically interconnected for further use. This paper introduces WINFRA, a comprehensive open-access platform for semantic web data and advanced analytics based on natural language processing (NLP) and data mining techniques (e.g., association rules, clustering, and classification based on associations). The system is designed to facilitate federated data analysis, knowledge discovery, information retrieval, and new techniques for dealing with semantic web and knowledge graph representation. The processing step integrates data from multiple sources virtually by creating virtual databases. Afterwards, an RDF generator produces RDF files for the different data sources, together with SPARQL queries, to support semantic data search and knowledge graph representation. Furthermore, several application cases are provided to demonstrate how the platform facilitates advanced data analytics over semantic data and to showcase our proposed approach toward semantic association rules.
(This article belongs to the Special Issue Applied Data Analytics)
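
A minimal sketch of the RDF-plus-SPARQL pattern such a platform is built around, using the rdflib library; the namespace, triples, and query below are invented examples, not WINFRA's actual schema.

```python
# Turning tabular records into RDF triples and querying them with SPARQL,
# via rdflib; the vocabulary is invented for illustration.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF

EX = Namespace("http://example.org/")  # hypothetical namespace
g = Graph()

rows = [("alice", "Dataset_A"), ("bob", "Dataset_B")]  # stand-in source data
for person, dataset in rows:
    subject = EX[person]
    g.add((subject, RDF.type, EX.Researcher))
    g.add((subject, EX.uses, Literal(dataset)))

# SPARQL query over the generated graph.
query = """
    PREFIX ex: <http://example.org/>
    SELECT ?who ?dataset
    WHERE { ?who a ex:Researcher ; ex:uses ?dataset . }
"""
for who, dataset in g.query(query):
    print(who, dataset)
```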

22 pages, 490 KiB  
Article
A Novel Data Analytics Method for Predicting the Delivery Speed of Software Enhancement Projects
by Elías Ventura-Molina, Cuauhtémoc López-Martín, Itzamá López-Yáñez and Cornelio Yáñez-Márquez
Mathematics 2020, 8(11), 2002; https://doi.org/10.3390/math8112002 - 10 Nov 2020
Cited by 3 | Viewed by 2098
Abstract
A fundamental issue in software engineering economics is productivity. In this regard, one measure of software productivity is delivery speed. Software productivity prediction is useful for determining corrective activities, as well as for identifying improvement alternatives. One type of software maintenance is enhancement. In this paper, we propose a data analytics-based software engineering algorithm called the search method based on feature construction (SMFC) for predicting the delivery speed of software enhancement projects. The SMFC belongs to the minimalist machine learning paradigm and as such always generates a two-dimensional model. Unlike the usual data analytics methods, SMFC includes an original algorithmic training procedure in which both the independent and dependent variables are considered for transformation. SMFC prediction performance is compared to that of statistical regression, neural networks, support vector regression, and fuzzy regression. To do this, seven datasets of software enhancement projects obtained from the International Software Benchmarking Standards Group (ISBSG) Release 2017 were used. The validation method is leave-one-out cross validation, and absolute residuals were chosen as the performance measure. The results indicate that SMFC is statistically better than statistical regression, while the remaining methods are not statistically better than SMFC; this represents a clear advantage in favor of SMFC.
(This article belongs to the Special Issue Applied Data Analytics)
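
The SMFC algorithm itself is not public here, so the sketch below reproduces only the evaluation protocol the abstract describes: leave-one-out cross validation with absolute residuals as the performance measure, applied to a placeholder regression model on invented project data.

```python
# Leave-one-out cross validation with absolute residuals, as in the paper's
# evaluation protocol; the model and project data are placeholders.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut

rng = np.random.default_rng(3)
size = rng.uniform(10, 500, 60)                 # invented project-size feature
speed = 0.8 * size + rng.normal(0, 20, 60)      # invented delivery speed
X, y = size.reshape(-1, 1), speed

abs_residuals = []
for train_idx, test_idx in LeaveOneOut().split(X):
    model = LinearRegression().fit(X[train_idx], y[train_idx])
    pred = model.predict(X[test_idx])
    abs_residuals.append(abs(pred[0] - y[test_idx][0]))

# Competing methods would be compared on these per-project absolute
# residuals, e.g., with a paired non-parametric significance test.
print("median absolute residual:", np.median(abs_residuals))
```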

15 pages, 2631 KiB  
Article
Corporate Performance and Economic Convergence between Europe and the US: A Cluster Analysis Along Industry Lines
by Călin Vâlsan and Elena Druică
Mathematics 2020, 8(3), 451; https://doi.org/10.3390/math8030451 - 20 Mar 2020
Cited by 6 | Viewed by 2750
Abstract
We investigate the extent to which the United States and the countries of Europe have achieved economic convergence of their corporate sectors. We define convergence as the homogenization of economic performance, institutional arrangements, and market valuation taking place at the meso-economic level. We perform a cluster analysis along industry lines and find that industries and corporations on both continents cluster into four groups, based on six variables measuring operating performance, ownership, and market valuation. The clusters resulting from the US data are more unstable than those resulting from the European data. We are also able to pair a handful of highly similar clusters between the US and European data. These findings suggest a complex dynamic. It seems that the US corporate sector is more homogeneous than the European one. Moreover, some degree of convergence between the European Union and the United States appears to have already occurred.
(This article belongs to the Special Issue Applied Data Analytics)
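
A hedged sketch of the clustering setup the abstract describes: four clusters over six standardized performance, ownership, and valuation variables. The variable names and data below are invented, and since the exact algorithm is not specified in the abstract, k-means is used as a common default.

```python
# Clustering industries into four groups on six standardized variables;
# k-means is a generic stand-in, and the data are synthetic.
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(5)
variables = ["roa", "roe", "margin", "ownership", "pe_ratio", "mb_ratio"]
industries = pd.DataFrame(rng.normal(size=(80, 6)), columns=variables)

z = StandardScaler().fit_transform(industries)
km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(z)

# Inspect cluster centroids in the original variable space.
industries["cluster"] = km.labels_
print(industries.groupby("cluster")[variables].mean().round(2))
```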
