Special Issue "Health Informatics: The Foundations of Public Health"

A special issue of Healthcare (ISSN 2227-9032). This special issue belongs to the section "Health Informatics and Big Data".

Deadline for manuscript submissions: 31 October 2022 | Viewed by 10213

Special Issue Editors

Prof. Dr. Michael T. S. Lee
Guest Editor
Graduate Institute of Business Administration, Fu Jen Catholic University, New Taipei City 24205, Taiwan.
Interests: data mining; medical/health informatics; artificial intelligence and applications; applied statistics
Prof. Dr. Chi-Jie Lu
Guest Editor
Graduate Institute of Business Administration & Department of Information Management, Fu Jen Catholic University, New Taipei City 24205, Taiwan
Interests: machine learning; medical/healthcare informatics; time series forecasting; supply chain management; quality management

Special Issue Information

Dear colleagues,

A Special Issue on public health in health informatics is being organized in Healthcare. For detailed information about the journal, please refer to https://www.mdpi.com/journal/healthcare. Public health presents an extremely wide variety of problems that can be tackled using computational and machine learning techniques. Health informatics is a multidisciplinary field that covers the design, development, and application of computational techniques to improve healthcare. It combines medical disciplines with computing fields such as software engineering, data science, information technology, and behavior informatics. Within academic institutions, health informatics research focuses on applications of artificial intelligence in healthcare. COVID-19 continues to put serious pressure on healthcare systems worldwide, with more than 188 million confirmed cases and more than 4 million deaths to date, and the huge datasets collected will provide many research topics for healthcare informatics. This Special Issue is open to all relevant subject areas of healthcare informatics. The keywords listed below outline some of the possible areas of interest.

Prof. Dr. Michael T. S. Lee
Prof. Dr. Chi-Jie Lu
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Healthcare is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • health informatics
  • public health
  • behavior informatics
  • medical informatics
  • healthcare management
  • artificial intelligence
  • machine learning
  • data analytics
  • cognitive informatics
  • neuroinformatics
  • data science
  • biostatistics
  • information technology
  • geographic information systems
  • database management
  • COVID-19

Published Papers (12 papers)

Research

Article
Validation of Operational Definition to Identify Patients with Osteoporotic Hip Fractures in Administrative Claims Data
Healthcare 2022, 10(9), 1724; https://doi.org/10.3390/healthcare10091724 - 08 Sep 2022
Viewed by 247
Abstract
As incidences of osteoporotic hip fractures (OHFs) have increased, identifying OHFs has become important to establishing the medical guidelines for their management. This study was conducted to develop an operational definition to identify patients with OHFs using two diagnosis codes and eight procedure codes from health insurance claims data and to assess the operational definition’s validity through a chart review. The study extracted data on OHFs from 522 patients who underwent hip surgeries based on diagnosis codes. Orthopedic surgeons then reviewed these patients’ medical records and radiographs to identify those with true OHFs. The validities of nine different algorithms of operational definitions, developed using a combination of three levels of diagnosis codes and eight procedure codes, were assessed using various statistics. The developed operational definition showed an accuracy above 0.97 and an area under the receiver operating characteristic curve above 0.97, indicating excellent discriminative power. This study demonstrated that the operational definition that combines diagnosis and procedure codes shows a high validity in detecting OHFs and can be used as a valid tool to detect OHFs from big health claims data. Full article
(This article belongs to the Special Issue Health Informatics: The Foundations of Public Health)

Article
Use of Netnography to Understand GoFundMe® Crowdfunding Profiles Posted for Individuals and Families of Children with Osteogenesis Imperfecta
Healthcare 2022, 10(8), 1451; https://doi.org/10.3390/healthcare10081451 - 02 Aug 2022
Viewed by 347
Abstract
Osteogenesis imperfecta (OI) is a rare genetic disorder associated with low bone density and increased bone fragility. OI can lead to a variety of supportive and medical care needs; yet financial impacts for families and individuals living with OI remain understudied and largely invisible. Efforts by families to recover costs through GoFundMe®, the most important crowdfunding web platform worldwide, offer an unprecedented opportunity to gain insight into OI costs. The purpose of this study was to describe GoFundMe® profiles and determine what factors may contribute to funding goal achievement. A netnographic approach was used to investigate a publicly available dataset from GoFundMe®, with 1206 webpages extracted and 401 included for analysis. Most webpages originated from the United States and were created by family members. Nineteen cost categories were identified. Thirty-seven web profiles met their funding goal. Funding increases or goal achievements created for children were associated with increased social-media exposure (i.e., Facebook). This study helped to describe and showcase the financial impacts of OI and effectiveness of a crowdfunding website to alleviate costs. The results highlight the need for further research to better understand OI costs and provide economic supports for individuals with OI. Full article

Article
SARIMA Model Forecasting Performance of the COVID-19 Daily Statistics in Thailand during the Omicron Variant Epidemic
Healthcare 2022, 10(7), 1310; https://doi.org/10.3390/healthcare10071310 - 14 Jul 2022
Viewed by 441
Abstract
This study aims to identify and evaluate a robust and replicable public health predictive model that can be applied to the COVID-19 time-series dataset, and to compare model performance across 7-day, 14-day, and 28-day forecast intervals. The seasonal autoregressive integrated moving average (SARIMA) model was developed and validated using a Thailand COVID-19 open dataset from 1 December 2021 to 30 April 2022, during the Omicron variant outbreak. The SARIMA model with a statistically non-significant Ljung–Box test p-value, the lowest AIC, and the lowest RMSE was selected from the top five candidates for model validation. The selected models were validated using the 7-day, 14-day, and 28-day forward-chaining cross-validation method. The model performance matrix for each forecast interval was evaluated and compared. The case fatality rate and mortality rate of the COVID-19 Omicron variant were estimated from the best-performing model. The study highlights that the length of the forecast interval affects model performance. Full article
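The forward-chaining cross-validation named in this abstract can be illustrated with a minimal sketch. Only the 7-, 14-, and 28-day horizons and the study window (1 December 2021 to 30 April 2022) come from the abstract; the function name `forward_chaining_splits`, the 60-day initial training length, and the choice to advance the cut-off by one horizon per fold are assumptions made for illustration, not details reported by the study.

```python
from datetime import date


def forward_chaining_splits(n_obs, initial_train, horizon):
    """Yield (train_indices, test_indices) for expanding-window validation.

    Each fold trains on every observation before the cut-off and tests on
    the next `horizon` observations; the cut-off then advances by `horizon`.
    """
    cutoff = initial_train
    while cutoff + horizon <= n_obs:
        yield range(0, cutoff), range(cutoff, cutoff + horizon)
        cutoff += horizon


# Study window from the abstract (daily data, both end dates included).
n_days = (date(2022, 4, 30) - date(2021, 12, 1)).days + 1

# initial_train=60 is a hypothetical choice, used only for illustration.
fold_counts = {}
for horizon in (7, 14, 28):
    folds = list(forward_chaining_splits(n_days, initial_train=60, horizon=horizon))
    fold_counts[horizon] = len(folds)
    print(f"{horizon}-day horizon: {len(folds)} folds")
```

In each fold the test block always lies strictly after the training block, so a model is never evaluated on dates that precede its training data; longer horizons leave fewer folds from the same window.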

Article
Predicting the Mortality of ICU Patients by Topic Model with Machine-Learning Techniques
Healthcare 2022, 10(6), 1087; https://doi.org/10.3390/healthcare10061087 - 11 Jun 2022
Viewed by 832
Abstract
Predicting clinical patients’ vital signs is a leading critical issue in intensive care unit (ICU)-related studies. Early prediction of the mortality of ICU patients can reduce overall mortality and the cost of treating complications. Some studies have predicted mortality from electronic health record (EHR) data using machine learning models. However, semi-structured data (i.e., patients’ diagnosis data and inspection reports) are rarely used in these models. This study utilized data from the Medical Information Mart for Intensive Care III. We used a Latent Dirichlet Allocation (LDA) model to classify the text in the semi-structured data into particular topics, and then established and compared five models: classification and regression trees (CART), logistic regression (LR), multivariate adaptive regression splines (MARS), random forest (RF), and gradient boosting (GB). A total of 46,520 ICU patients were included, with 11.5% mortality in the Medical Information Mart for Intensive Care III group. Our results revealed that the semi-structured data (diagnosis data and inspection reports) of ICU patients contain useful information that can assist clinical doctors in making critical clinical decisions. In our comparison of the five machine learning models, the GB model showed the best performance in predicting the mortality of ICU patients, with the highest area under the receiver operating characteristic curve (AUROC) (0.9280), specificity (93.16%), and sensitivity (83.25%). The RF, LR, and MARS models (AUROC of 0.9096, 0.8987, and 0.8935, respectively) performed better than CART (0.8511). The analysis results could be used to develop a clinically useful decision support system. Full article

Article
Explaining Cannabis Use by Adolescents: A Comparative Assessment of Fuzzy Set Qualitative Comparative Analysis and Ordered Logistic Regression
Healthcare 2022, 10(4), 669; https://doi.org/10.3390/healthcare10040669 - 02 Apr 2022
Viewed by 683
Abstract
Background: This study assesses the importance of several factors that the literature on adolescent substance use considers relevant. The factors include individual variables, such as gender or age; factors linked with parental style; and variables associated with the teenager’s social environment. Methods: The study applies ordered logistic regression (OLR) and fuzzy set qualitative comparative analysis (fsQCA) complementarily to a sample of 1935 teenagers from Tarragona (Spain). Results: The OLR showed that being female (OR = 0.383; p < 0.0001), parental monitoring (OR = 0.587; p = 0.0201), and religiousness (OR = 0.476; p = 0.006) are significant inhibitors of cannabis consumption. On the other hand, parental tolerance to substance use (OR = 42.01; p < 0.0001) and having close peers that consume substances (OR = 5.60; p < 0.0001) act as enablers. The fsQCA allowed for fitting the linkages between the factors from a complementary perspective. (1) The coverage (cov) and consistency (cons) attained by the explanatory solutions of use (cons = 0.808; cov = 0.357) are clearly lower than those obtained by the recipes for nonuse (cons = 0.952; cov = 0.869). (2) The interaction of being male, having a family tolerant of substance use, and peer attitudes toward substances is continuously present in the profiles that are linked to a risk of cannabis smoking. (3) The most important recipe that explains resistance to cannabis is simply parental disagreement with substance consumption. Conclusions: On the one hand, the results of the OLR make it possible to determine the strength of each evaluated risk or protective factor according to the value of the OR. On the other hand, the fsQCA allows for the identification not only of profiles where there is a high risk of cannabis use, but also of profiles where there is a low risk. Full article
Article
Trends in Ambulatory Analgesic Usage after Myocardial Infarction: A Nationwide Cross-Sectional Study of Real-World Data
Healthcare 2022, 10(3), 446; https://doi.org/10.3390/healthcare10030446 - 26 Feb 2022
Viewed by 736
Abstract
Although current guidelines for myocardial infarction (MI) recommend caution in using non-steroidal anti-inflammatory drugs (NSAIDs), real-world studies of ambulatory settings are rare. This study aimed to explore the patterns and trends of analgesic prescriptions (especially NSAIDs) among patients with a history of MI in ambulatory care settings in Korea. We analyzed real-world data from the Korea National Health Insurance Service database. Patients aged 20 years or older hospitalized with incident MI were identified between January 2007 and December 2015. Ambulatory analgesics were administered after discharge from incident hospitalization for MI, and annual trends in the prescriptions of individual analgesics were evaluated. Among the 93,597 patients with incident MI, 75,131 (80.3%) received a total of 2,081,705 ambulatory analgesic prescriptions. Prescriptions were mainly issued at primary care clinics (80.3%). Analgesics were most frequently prescribed for musculoskeletal diseases (often NSAIDs, 70.7%); aceclofenac (13.7%) and diclofenac injection (9.4%) were the frequently used NSAIDs. Additionally, significant changes were observed in the trends for some analgesics, such as loxoprofen. This study suggested that NSAIDs are commonly prescribed to patients with a history of MI. Future real-world studies are needed to elucidate the drug–disease interactions of NSAIDs prescribed after MI, especially for patients with musculoskeletal diseases. Full article

Article
Evaluating the Operational Efficiency and Quality of Tertiary Hospitals in Taiwan: The Application of the EBITDA Indicator to the DEA Method and TOBIT Regression
Healthcare 2022, 10(1), 58; https://doi.org/10.3390/healthcare10010058 - 29 Dec 2021
Cited by 5 | Viewed by 741
Abstract
This study estimates the efficiency of 19 tertiary hospitals in Taiwan using a two-stage analysis of Data Envelopment Analysis (DEA) and TOBIT regression. It is a retrospective panel-data study and includes all the tertiary hospitals in Taiwan. The data were sourced from open information that hospitals are legally required to disclose to the National Health Insurance (NHI) Administration, Ministry of Health and Welfare. The variables, including five inputs (total hospital beds, total physicians, gross equipment, fixed assets net value, the rate of emergency transfer in-patient stay over 48 h) and six outputs (surplus or deficit of appropriation, length of stay, the total relative value units [RVUs] for outpatient services, total RVUs for inpatient services, self-pay income, modified EBITDA), were adopted into the Charnes, Cooper and Rhodes (CCR) and Banker, Charnes and Cooper (BCC) models. In the CCR model, the technical efficiency (TE) from 2015–2018 increases annually, and the average efficiency of all tertiary hospitals is 96.0%. In the BCC model, the highest pure technical efficiency (PTE) was in 2018 and the average efficiency of all medical centers is 99.1%. The average scale efficiency of all medical centers was 96.8% in the BCC model, meaning that investment could be reduced by 3.2% while maintaining the current production level under constant returns to scale. Correlation coefficient analysis shows that all variables are correlated positively; the highest correlation was between the number of beds and the number of days in hospital (r = 0.988). The results show that TE in the CCR model was similar to PTE in the BCC model over the four years. The difference analysis shows that more hospitals must improve regarding surplus or deficit of appropriation, modified EBITDA, and self-pay income. TOBIT regression reveals that the higher the bed-occupancy rate and turnover rate of fixed assets, the higher the TE; and the higher the number of hospital beds per 100,000 people and the turnover rate of fixed assets, the higher the PTE. DEA and TOBIT regression are used to analyze the other factors that affect medical center efficiency, and different categories of hospitals are chosen to assess whether different years or different types of medical centers affect operational performance. Through reference group analysis and slack variable analysis, this study provides reference values indicating where the inefficient decision-making units among the relevant large hospitals can improve. Full article
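In standard DEA, scale efficiency is the ratio of CCR technical efficiency to BCC pure technical efficiency (SE = TE / PTE). The sketch below applies that relation to the averages quoted in this abstract as a rough consistency check. The helper name `scale_efficiency` is hypothetical, and because the study presumably averages hospital-level ratios, the ratio of averages computed here should only approximate the reported 96.8%.

```python
def scale_efficiency(te_ccr, pte_bcc):
    """Scale efficiency as the ratio of CCR technical efficiency (TE)
    to BCC pure technical efficiency (PTE)."""
    if not 0 < pte_bcc <= 1 or not 0 <= te_ccr <= pte_bcc:
        raise ValueError("expected 0 <= TE <= PTE <= 1 with PTE > 0")
    return te_ccr / pte_bcc


# Average efficiencies quoted in the abstract.
avg_te = 0.960   # CCR model, technical efficiency
avg_pte = 0.991  # BCC model, pure technical efficiency

se = scale_efficiency(avg_te, avg_pte)
print(f"approximate scale efficiency: {se:.3f}")
```

A scale efficiency below 1 indicates that part of the shortfall in the CCR score comes from operating at a non-optimal scale rather than from managerial inefficiency.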
Article
Association of Regular Leisure-Time Physical Activity with Self-Reported Body Mass Index and Obesity Risk among Middle-Aged and Older Adults in Taiwan
Healthcare 2021, 9(12), 1719; https://doi.org/10.3390/healthcare9121719 - 13 Dec 2021
Cited by 1 | Viewed by 797
Abstract
Through this study, we aimed to determine the association of regular leisure-time physical activity (LTPA) with self-reported body mass index (BMI) and obesity risk among middle-aged and older adults in Taiwan. We conducted a cross-sectional study and reviewed the data derived from the Taiwan National Physical Activity Survey (TNPAS). Responses from 12,687 participants aged 45–108 years from the database were collected in this study. All the participants completed a standardized structured questionnaire that solicited information regarding their demographic characteristics (age, gender, education, occupation, and self-reported health status), physical activity behaviors (regular/nonregular LTPA), and self-reported anthropometrics (height, weight, and BMI). Multiple linear and logistic regressions were used to examine the association between regular LTPA and BMI, and between regular LTPA and obesity status, respectively. Regular LTPA was associated with male gender, normal weight, excellent or good self-reported health status, and a lower rate of being underweight compared with nonregular LTPA. Regular LTPA was significantly and negatively associated with being underweight (OR = 0.71, p < 0.05), whereas it had no significant relationship with BMI and obesity (p > 0.05). Regular LTPA was associated with a reduced risk of being underweight among middle-aged and elderly adults in Taiwan. Further research on the relevant mechanism underlying this phenomenon is warranted. Full article
Article
Monthly Disposable Income Is a Crucial Factor Affecting the Quality of Life in Patients with Knee Osteoarthritis
Healthcare 2021, 9(12), 1703; https://doi.org/10.3390/healthcare9121703 - 08 Dec 2021
Viewed by 808
Abstract
Knee osteoarthritis (OA) affects the quality of life (QOL) of elderly people; this study examines the demographic characteristics and QOL of patients with knee OA and identifies demographic characteristics that affect the QOL of these patients. In this cross-sectional study, 30 healthy controls and 60 patients with mild-to-moderate bilateral knee OA aged between 55 and 75 years were enrolled. All participants completed a questionnaire containing questions on 10 demographic characteristics and the Medical Outcome Study 36-Item Short-Form Health Survey (SF-36), and their QOL scores in the eight dimensions of the SF-36 were evaluated. In the OA group, significant correlations were observed between monthly disposable income and physical and mental health components. Monthly disposable income was found to considerably affect the QOL of patients with bilateral knee OA (i.e., it is a crucial factor affecting these patients). The findings of this study may provide a reference for formulating preventive strategies for healthy individuals and for future confirmatory research. Full article
Article
Real-World Evidence of COVID-19 Patients’ Data Quality in the Electronic Health Records
Healthcare 2021, 9(12), 1648; https://doi.org/10.3390/healthcare9121648 - 28 Nov 2021
Viewed by 980
Abstract
Despite the importance of electronic health records data, less attention has been given to data quality. This study aimed to evaluate the quality of COVID-19 patients’ records and their readiness for secondary use. We conducted a retrospective chart review study of all COVID-19 inpatients in an academic healthcare hospital for the year 2020, which were identified using ICD-10 codes and case definition guidelines. COVID-19 signs and symptoms were higher in unstructured clinical notes than in structured coded data. COVID-19 cases were categorized as 218 (66.46%) “confirmed cases”, 10 (3.05%) “probable cases”, 9 (2.74%) “suspected cases”, and 91 (27.74%) “no sufficient evidence”. The identification of “probable cases” and “suspected cases” was more challenging than “confirmed cases” where laboratory confirmation was sufficient. The accuracy of the COVID-19 case identification was higher in laboratory tests than in ICD-10 codes. When validating using laboratory results, we found that ICD-10 codes were inaccurately assigned to 238 (72.56%) patients’ records. “No sufficient evidence” records might indicate inaccurate and incomplete EHR data. Data quality evaluation should be incorporated to ensure patient safety and data readiness for secondary use research and predictive analytics. We encourage educational and training efforts to motivate healthcare providers regarding the importance of accurate documentation at the point-of-care. Full article

Article
Hazardous Effect of Low-Dose Aspirin in Patients with Predialysis Advanced Chronic Kidney Disease Assessed by Machine Learning Method Feature Selection
Healthcare 2021, 9(11), 1484; https://doi.org/10.3390/healthcare9111484 - 31 Oct 2021
Cited by 2 | Viewed by 756
Abstract
Background: Low-dose aspirin (100 mg) is widely used in preventing cardiovascular disease in chronic kidney disease (CKD) because its benefits outweigh the harm; however, its effect on clinical outcomes in patients with predialysis advanced CKD is still unclear. This study aimed to assess the effect of aspirin use on clinical outcomes in this group. Methods: Patients were selected from a nationwide diabetes database from January 2009 to June 2017, and divided into two groups, a case group with aspirin use (n = 3021) and a control group without aspirin use (n = 9063), by propensity score matching with a 1:3 ratio. The Cox regression model was used to estimate the hazard ratio (HR). Moreover, machine learning method feature selection was used to assess the importance of parameters in the clinical outcomes. Results: In a mean follow-up of 1.54 years, aspirin use was associated with a higher risk of entering dialysis (HR, 1.15 [95% CI, 1.10–1.21]) and of death before entering dialysis (1.46 [1.25–1.71]), which were also supported by feature selection. The renal effect of aspirin use was consistent across patient subgroups. Nonusers and aspirin users did not differ significantly in gastrointestinal bleeding (1.05 [0.96–1.15]), intracranial hemorrhage events (1.23 [0.98–1.55]), or ischemic stroke (1.15 [0.98–1.55]). Conclusions: Patients with predialysis advanced CKD and anemia who received aspirin exhibited a higher risk of entering dialysis and of death before entering dialysis, by 15% and 46%, respectively. Full article
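The percentages in the conclusion follow from the reported hazard ratios as (HR − 1) × 100, and the control group size follows from the 1:3 matching ratio. The short arithmetic check below uses only numbers quoted in this abstract; the function name `hr_to_percent_change` is illustrative. Strictly, a hazard ratio describes a relative hazard rate, so reading it as "percent higher risk" is the abstract's shorthand.

```python
def hr_to_percent_change(hr):
    """Convert a hazard ratio to the percent change in hazard
    relative to the reference (non-user) group."""
    return round((hr - 1.0) * 100.0, 1)


# Hazard ratios quoted in the abstract.
dialysis_hr = 1.15  # entering dialysis
death_hr = 1.46     # death before entering dialysis

print(hr_to_percent_change(dialysis_hr))
print(hr_to_percent_change(death_hr))

# 1:3 propensity score matching: three matched controls per aspirin user.
aspirin_users = 3021
controls = aspirin_users * 3
print(controls)
```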

Article
Research on Urban Medical and Health Services Efficiency and Its Spatial Correlation in China: Based on Panel Data of 13 Cities in Jiangsu Province
Healthcare 2021, 9(9), 1167; https://doi.org/10.3390/healthcare9091167 - 06 Sep 2021
Cited by 2 | Viewed by 802
Abstract
The improvement of the efficiency of medical and health services is of great significance for improving the high-quality and efficient medical and health services system and meeting the increasingly diverse health needs of residents. Based on the panel data of 13 cities in Jiangsu Province, this research analyzed the relative effectiveness of medical and health services from 2015 to 2019 using the super efficiency slack-based measure-data envelopment analysis model, and the Malmquist index method was used to explore the changes in the efficiency of medical and health services from a dynamic perspective. Furthermore, the spatial autocorrelation analysis method was used to verify the spatial correlation of medical and health services efficiency. In general, there is room for improvement in the efficiency of medical and health services in 13 cities in Jiangsu Province. There are obvious differences in regional efficiency, and there is a certain spatial correlation. In the future, the medical and health services efficiency of China’s cities should be improved by increasing the investment in high-quality medical and health resources, optimizing their layout and making full use of the spatial spillover effects between neighboring cities to strengthen inter-regional cooperation and exchanges. Full article

Planned Papers

The below list represents only planned manuscripts. Some of these manuscripts have not been received by the Editorial Office yet. Papers submitted to MDPI journals are subject to peer-review.

Title: Develop a Natural Language Processing Pipeline to Automate Extraction of Periodontal Disease Information from Electronic Dental Clinical Notes
Authors: RISHI RAO; JASIM ALBANDAR; MARISOL TELLEZ; JOACHIM KROIS; HUANMEI WU *
Affiliation: Department of Health Services Administration and Policy, College of Public Health, Temple University; Department of Periodontology and Oral Implantology, Temple University Kornberg School of Dentistry; Department of Oral Health Sciences, Temple University Kornberg School of Dentistry; Department of Oral Diagnostics, Digital Health and Health Services Research, Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin, Germany
Abstract: Introduction: Periodontal disease (PD) is one of the most prevalent dental diseases, affecting 80% of US adults. PD can be prevented if its etiologic and risk factors are identified and controlled early. Prediction models may help clinicians identify high-risk PD patients before disease initiation and progression. Electronic dental record (EDR) data give researchers a unique opportunity to develop prediction models that can provide personalized disease risk and treatment recommendations. However, 90% of the rich clinical information is documented in the free-text format of the EDR. The objective of this study was to develop natural language processing (NLP) applications to extract PD diagnoses, medical histories (e.g., cardiovascular diseases, diabetes), and social history (e.g., smoking) in a structured format for comprehensive follow-up periodontal research. Methods: We have developed a five-stage NLP pipeline. First, we retrieve both structured and unstructured data from the EDR using SQL queries. We developed manual annotation guidelines using a bottom-up and top-down approach. Results: We examined 347 clinical notes to identify the writing patterns in our EDR system. We also used existing literature to develop manual annotation guidelines. Two domain experts manually reviewed 4,000 clinical notes using the eHOST annotation tool to create a gold standard dataset. We then split the gold standard dataset into 40% training, 20% testing, and 40% validation datasets (external dataset). The training set was used to create NLP applications, and the performance of these applications was evaluated using the testing and validation sets. We achieved excellent results (>90% accuracy) in extracting patients’ detailed PD diagnoses, CVD, smoking, and diabetes information from the EDR.
Out of a total of 27,138 unique patients, we classified 2,358 (13%) patients as healthy, 3,474 (16%) as having gingivitis, and 12,353 (67%) as having periodontitis. We also found that 3,688 (13.6%) out of 27,138 patients had at least one reported CVD in the EDR. Moreover, 4,973 (18%) patients had an HbA1C level above 7%, indicating poor diabetes control. Last, we found that 1,406 patients were light smokers and 589 patients were heavy smokers. We conclude that NLP applications designed to extract patients’ detailed PD diagnoses, CVD, diabetes, and smoking status performed with high (>90%) accuracy. EDR data provided rich clinical information about patients’ periodontal health; the data quality is high, and the data have high potential to be utilized for periodontal research. Most rich dental clinical information is documented in free-text format and therefore may not be readily available to researchers. Hence, developing novel informatics methods such as NLP is critical for using EDR data optimally and efficiently for research.
