Search Results (207)

Search Parameters:
Keywords = expert bias

25 pages, 5488 KiB  
Article
Biased by Design? Evaluating Bias and Behavioral Diversity in LLM Annotation of Real-World and Synthetic Hotel Reviews
by Maria C. Voutsa, Nicolas Tsapatsoulis and Constantinos Djouvas
AI 2025, 6(8), 178; https://doi.org/10.3390/ai6080178 - 4 Aug 2025
Abstract
As large language models (LLMs) gain traction among researchers and practitioners, particularly in digital marketing for tasks such as customer feedback analysis and automated communication, concerns remain about the reliability and consistency of their outputs. This study investigates annotation bias in LLMs by comparing human and AI-generated annotation labels across sentiment, topic, and aspect dimensions in hotel booking reviews. Using the HRAST dataset, which includes 23,114 real user-generated review sentences and a synthetically generated corpus of 2000 LLM-authored sentences, we evaluate inter-annotator agreement between a human expert and three LLMs (ChatGPT-3.5, ChatGPT-4, and ChatGPT-4-mini) as a proxy for assessing annotation bias. Our findings show high agreement among LLMs, especially on synthetic data, but only moderate to fair alignment with human annotations, particularly in sentiment and aspect-based sentiment analysis. LLMs display a pronounced neutrality bias, often defaulting to neutral sentiment in ambiguous cases. Moreover, annotation behavior varies notably with task design, as manual, one-to-one prompting produces higher agreement with human labels than automated batch processing. The study identifies three distinct AI biases (repetition bias, behavioral bias, and neutrality bias) that shape annotation outcomes. These findings highlight how dataset complexity and annotation mode influence LLM behavior, offering important theoretical, methodological, and practical implications for AI-assisted annotation and synthetic content generation.
(This article belongs to the Special Issue AI Bias in the Media and Beyond)
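
As an illustration of the inter-annotator agreement idea in the abstract above, the following is a minimal sketch of Cohen's kappa for two annotators. The abstract does not say which agreement statistic the authors used, so the choice of kappa is an assumption, and the labels are hypothetical toy data rather than HRAST records.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items that received identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (p_o - p_e) / (1 - p_e)

# Hypothetical toy labels (assumption): the "model" drifts towards neutral.
human = ["pos", "neg", "neu", "pos", "neg", "neu", "pos", "neg"]
model = ["pos", "neu", "neu", "pos", "neu", "neu", "pos", "neg"]
kappa = cohen_kappa(human, model)
print(round(kappa, 3))
```

In this toy case the two label lists agree on 6 of 8 items, but kappa is lower than the raw agreement rate because part of that agreement is expected by chance.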

24 pages, 1855 KiB  
Article
AI-Driven Panel Assignment Optimization via Document Similarity and Natural Language Processing
by Rohit Ramachandran, Urjit Patil, Srinivasaraghavan Sundar, Prem Shah and Preethi Ramesh
AI 2025, 6(8), 177; https://doi.org/10.3390/ai6080177 - 1 Aug 2025
Viewed by 230
Abstract
Efficient and accurate panel assignment is critical in expert and peer review processes. Traditional methods, based on manual preferences or heuristic rules, often introduce bias, inconsistency, and scalability challenges. We present an automated framework that combines transformer-based document similarity modeling with optimization-based reviewer assignment. Using the all-mpnet-base-v2 model (version 3.4.1), our system computes semantic similarity between proposal texts and reviewer documents, including CVs and Google Scholar profiles, without requiring manual input from reviewers. These similarity scores are then converted into rankings and integrated into an Integer Linear Programming (ILP) formulation that accounts for workload balance, conflicts of interest, and role-specific reviewer assignments (lead, scribe, reviewer). The method was tested across 40 researchers in two distinct disciplines (Chemical Engineering and Philosophy), each with 10 proposal documents. Results showed high self-similarity scores (0.65–0.89), strong differentiation between unrelated fields (−0.21 to 0.08), and comparable performance between reviewer document types. The optimization consistently prioritized top matches while maintaining feasibility under assignment constraints. By eliminating the need for subjective preferences and leveraging deep semantic analysis, our framework offers a scalable, fair, and efficient alternative to manual or heuristic assignment processes. This approach can support large-scale review workflows while enhancing transparency and alignment with reviewer expertise.
(This article belongs to the Section AI Systems: Theory and Applications)
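
The similarity-then-assignment pipeline described above can be sketched on a toy scale as follows. This is not the authors' ILP formulation: the embedding vectors are made-up stand-ins for transformer embeddings, the exhaustive search only works for a handful of reviewers, and the function names `cosine` and `best_assignment` are hypothetical. It only shows how similarity scores and a conflict-of-interest constraint can interact.

```python
import math
from itertools import permutations

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm

def best_assignment(proposals, reviewers, conflicts=frozenset()):
    """Exhaustively find the one-to-one proposal -> reviewer match with the
    highest total similarity, skipping any (proposal, reviewer) conflict pair."""
    names_p, names_r = list(proposals), list(reviewers)
    best_score, best_match = float("-inf"), None
    for perm in permutations(names_r, len(names_p)):
        pairs = list(zip(names_p, perm))
        if any(pair in conflicts for pair in pairs):
            continue  # violates a conflict-of-interest constraint
        score = sum(cosine(proposals[p], reviewers[r]) for p, r in pairs)
        if score > best_score:
            best_score, best_match = score, dict(pairs)
    return best_match, best_score

# Made-up 3-dimensional "embeddings" (assumption, for illustration only).
proposals = {"P1": [1.0, 0.0, 0.2], "P2": [0.0, 1.0, 0.1]}
reviewers = {"R1": [0.9, 0.1, 0.0], "R2": [0.1, 0.9, 0.0], "R3": [0.5, 0.5, 0.5]}

free_match, _ = best_assignment(proposals, reviewers)
constrained_match, _ = best_assignment(proposals, reviewers, conflicts={("P1", "R1")})
print(free_match, constrained_match)
```

Declaring a conflict between P1 and R1 pushes P1 to its next-best reviewer while P2 keeps its top match. A real system would replace the exhaustive loop with an ILP or assignment solver and add workload and role constraints.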

14 pages, 1974 KiB  
Article
The Identification of the Competency Components Necessary for the Tasks of Workers’ Representatives in the Field of OSH to Support Their Selection and Development, as Well as to Assess Their Effectiveness
by Peter Leisztner, Ferenc Farago and Gyula Szabo
Safety 2025, 11(3), 73; https://doi.org/10.3390/safety11030073 - 1 Aug 2025
Viewed by 132
Abstract
The European Union Council’s zero vision aims to eliminate workplace fatalities, while Industry 4.0 presents new challenges for occupational safety. Despite HR professionals assessing managers’ and employees’ competencies, no system currently exists to evaluate the competencies of workers’ representatives in occupational safety and health (OSH). It is crucial to establish the necessary competencies for these representatives to avoid their selection based on personal bias, ambition, or coercion. The main objective of the study is to identify the competencies and their components required for workers’ representatives in the field of occupational safety and health by following the steps of the DACUM method with the assistance of OSH professionals. First, tasks were identified through semi-structured interviews conducted with eight occupational safety experts. In the second step, a focus group consisting of 34 OSH professionals (2 invited guests and 32 volunteers) determined the competencies and their components necessary to perform those tasks. Finally, the results were validated through an online questionnaire sent to the 32 volunteer participants of the focus group, from which 11 responses (34%) were received. The research categorized the competencies into three groups: two core competencies (occupational safety knowledge and professional knowledge) and one distinguishing competency (personal attributes). Within occupational safety knowledge, 10 components were defined; for professional expertise, 7 components; and for personal attributes, 16 components. Based on the results, it was confirmed that all participants of the tripartite system have an important role in the training and development of workers’ representatives in the field of occupational safety and health. The results indicate that although OSH representation is not yet a priority in Hungary, there is a willingness to collaborate with competent, well-prepared representatives. The study emphasizes the importance of clearly defining and assessing the required competencies.

24 pages, 624 KiB  
Systematic Review
Integrating Artificial Intelligence into Perinatal Care Pathways: A Scoping Review of Reviews of Applications, Outcomes, and Equity
by Rabie Adel El Arab, Omayma Abdulaziz Al Moosa, Zahraa Albahrani, Israa Alkhalil, Joel Somerville and Fuad Abuadas
Nurs. Rep. 2025, 15(8), 281; https://doi.org/10.3390/nursrep15080281 - 31 Jul 2025
Viewed by 126
Abstract
Background: Artificial intelligence (AI) and machine learning (ML) have been reshaping maternal, fetal, neonatal, and reproductive healthcare by enhancing risk prediction, diagnostic accuracy, and operational efficiency across the perinatal continuum. However, no comprehensive synthesis has yet been published. Objective: To conduct a scoping review of reviews of AI/ML applications spanning reproductive, prenatal, postpartum, neonatal, and early child-development care. Methods: We searched PubMed, Embase, the Cochrane Library, Web of Science, and Scopus through April 2025. Two reviewers independently screened records, extracted data, and assessed methodological quality using AMSTAR 2 for systematic reviews, ROBIS for bias assessment, SANRA for narrative reviews, and JBI guidance for scoping reviews. Results: Thirty-nine reviews met our inclusion criteria. In preconception and fertility treatment, convolutional neural network-based platforms can identify viable embryos and key sperm parameters with over 90 percent accuracy, and machine-learning models can personalize follicle-stimulating hormone regimens to boost mature oocyte yield while reducing overall medication use. Digital sexual-health chatbots have enhanced patient education, pre-exposure prophylaxis adherence, and safer sexual behaviors, although data-privacy safeguards and bias mitigation remain priorities. During pregnancy, advanced deep-learning models can segment fetal anatomy on ultrasound images with more than 90 percent overlap compared to expert annotations and can detect anomalies with sensitivity exceeding 93 percent. Predictive biometric tools can estimate gestational age to within one week and fetal weight to within approximately 190 g. In the postpartum period, AI-driven decision-support systems and conversational agents can facilitate early screening for depression and can guide follow-up care. Wearable sensors enable remote monitoring of maternal blood pressure and heart rate to support timely clinical intervention. Within neonatal care, the Heart Rate Observation (HeRO) system has reduced mortality among very low-birth-weight infants by roughly 20 percent, and additional AI models can predict neonatal sepsis, retinopathy of prematurity, and necrotizing enterocolitis with area-under-the-curve values above 0.80. From an operational standpoint, automated ultrasound workflows deliver biometric measurements at about 14 milliseconds per frame, and dynamic scheduling in IVF laboratories lowers staff workload and per-cycle costs. Home-monitoring platforms for pregnant women are associated with 7–11 percent reductions in maternal mortality and preeclampsia incidence. Despite these advances, most evidence derives from retrospective, single-center studies with limited external validation. Low-resource settings, especially in Sub-Saharan Africa, remain under-represented, and few AI solutions are fully embedded in electronic health records. Conclusions: AI holds transformative promise for perinatal care but will require prospective multicenter validation, equity-centered design, robust governance, transparent fairness audits, and seamless electronic health record integration to translate these innovations into routine practice and improve maternal and neonatal outcomes.

21 pages, 3510 KiB  
Article
An Improved Optimal Cloud Entropy Extension Cloud Model for the Risk Assessment of Soft Rock Tunnels in Fault Fracture Zones
by Shuangqing Ma, Yongli Xie, Junling Qiu, Jinxing Lai and Hao Sun
Buildings 2025, 15(15), 2700; https://doi.org/10.3390/buildings15152700 - 31 Jul 2025
Viewed by 169
Abstract
Existing risk assessment approaches for soft rock tunnels in fault-fractured zones typically employ single weighting schemes, inadequately integrate subjective and objective weights, and fail to define risk clearly. This study proposes a risk-grading methodology that integrates an enhanced game-theoretic weight-balancing algorithm with an optimized cloud entropy extension cloud model. Initially, a comprehensive indicator system encompassing geological (surrounding rock grade, groundwater conditions, fault thickness, dip, and strike), design (excavation cross-section shape, excavation span, and tunnel cross-sectional area), and support (support stiffness, support installation timing, and construction step length) parameters is established. Subjective weights obtained via the analytic hierarchy process (AHP) are combined with objective weights calculated using the entropy, coefficient of variation, and CRITIC methods and subsequently balanced through a game-theoretic approach to mitigate bias and reconcile expert judgment with data objectivity. Subsequently, the optimized cloud entropy extension cloud algorithm quantifies the fuzzy relationships between indicators and risk levels, yielding a cloud association evaluation matrix for precise classification. A case study of a representative soft rock tunnel in a fault-fractured zone validates this method’s enhanced accuracy, stability, and rationality, offering a robust tool for risk management and design decision making in complex geological settings.
(This article belongs to the Section Construction Management, and Computers & Digitization)
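
One of the objective weighting schemes named in the abstract above is the entropy method. The sketch below shows the textbook form of entropy weighting only; it does not reproduce the paper's game-theoretic balancing or cloud model, and the score matrix is hypothetical. It assumes non-negative scores and a positive sum in every indicator column.

```python
import math

def entropy_weights(matrix):
    """Objective indicator weights via the entropy weight method.
    matrix[i][j] is the non-negative score of sample i on indicator j."""
    m, n = len(matrix), len(matrix[0])
    divergences = []
    for j in range(n):
        column = [row[j] for row in matrix]
        total = sum(column)  # assumed to be positive
        shares = [x / total for x in column]
        # Shannon entropy normalised to [0, 1]; 0 * ln(0) is treated as 0.
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(m)
        # Clamp tiny negative values caused by floating-point rounding.
        divergences.append(max(0.0, 1.0 - entropy))
    spread = sum(divergences)
    if spread == 0:
        return [1.0 / n] * n  # every indicator is uniform: fall back to equal weights
    return [d / spread for d in divergences]

# Hypothetical scores (assumption): indicator 0 never varies, indicator 1 does.
scores = [[3.0, 1.0], [3.0, 2.0], [3.0, 9.0]]
weights = entropy_weights(scores)
print(weights)
```

An indicator that takes the same value for every sample carries no discriminating information, so it receives a weight of essentially zero, while the indicator that varies receives nearly all of the weight.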

16 pages, 1194 KiB  
Systematic Review
Artificial Intelligence in the Diagnosis of Tongue Cancer: A Systematic Review with Meta-Analysis
by Seorin Jeong, Hae-In Choi, Keon-Il Yang, Jin Soo Kim, Ji-Won Ryu and Hyun-Jeong Park
Biomedicines 2025, 13(8), 1849; https://doi.org/10.3390/biomedicines13081849 - 30 Jul 2025
Viewed by 255
Abstract
Background: Tongue squamous cell carcinoma (TSCC) is an aggressive oral malignancy characterized by early submucosal invasion and a high risk of cervical lymph node metastasis. Accurate and timely diagnosis is essential, but it remains challenging when relying solely on conventional imaging and histopathology. This systematic review aimed to evaluate studies applying artificial intelligence (AI) in the diagnostic imaging of TSCC. Methods: This review was conducted under PRISMA 2020 guidelines and included studies from January 2020 to December 2024 that utilized AI in TSCC imaging. A total of 13 studies were included, employing AI models such as Convolutional Neural Networks (CNNs), Support Vector Machines (SVMs), and Random Forest (RF). Imaging modalities analyzed included MRI, CT, PET, ultrasound, histopathological whole-slide images (WSI), and endoscopic photographs. Results: Diagnostic performance was generally high, with area under the curve (AUC) values ranging from 0.717 to 0.991, sensitivity from 63.3% to 100%, and specificity from 70.0% to 96.7%. Several models demonstrated superior performance compared to expert clinicians, particularly in delineating tumor margins and estimating the depth of invasion (DOI). However, only one study conducted external validation, and most exhibited moderate risk of bias in patient selection or index test interpretation. Conclusions: AI-based diagnostic tools hold strong potential for enhancing TSCC detection, but future research must address external validation, standardization, and clinical integration to ensure their reliable and widespread adoption.
(This article belongs to the Special Issue Recent Advances in Oral Medicine—2nd Edition)

19 pages, 2215 KiB  
Article
Evaluation of the Effectiveness of Driver Training in the Use of Advanced Driver Assistance Systems
by Małgorzata Pełka and Adam Rosiński
Appl. Sci. 2025, 15(15), 8169; https://doi.org/10.3390/app15158169 - 23 Jul 2025
Viewed by 208
Abstract
This paper evaluates the effectiveness of driver training programmes aimed at the proper use of Advanced Driver Assistance Systems (ADASs). Participants (N = 49) were divided into the following three groups based on the type of training received: practical training, e-learning, and brief manual instruction. The effectiveness of the training methods was assessed using selected parameters obtained from driving simulator studies, including reaction times and system activation attempts. Given the large volume and nonlinear nature of the input data, a heuristic, expert-based approach was used to identify key evaluation criteria, structure the decision-making process, and define fuzzy rule sets and membership functions. This phase served as the foundation for the development of a fuzzy logic model in the MATLAB environment. The model processes inputs to generate a quantitative performance score. The results indicate that practical training (mean score = 4.0) demonstrates superior effectiveness compared to e-learning (3.09) and manual instruction (mean score = 3.01). The primary contribution of this work is a transparent, data-driven evaluation tool that overcomes the inherent subjectivity and bias of traditional trainer-based assessments. This model provides a standardised and reproducible approach for assessing driver competence, offering a significant advancement over purely qualitative, trainer-based assessments and supporting the development of more reliable certification processes.
(This article belongs to the Section Transportation and Future Mobility)

15 pages, 1266 KiB  
Review
Comparison of Oral Microbial Profile Among Patients Undergoing Clear Aligner and Fixed Orthodontic Therapies for the Treatment of Malocclusions: An Updated Review
by Emilie Ponton, Paul Emile Rossouw and Fawad Javed
Dent. J. 2025, 13(7), 322; https://doi.org/10.3390/dj13070322 - 16 Jul 2025
Viewed by 270
Abstract
Objective: The present review aims to compare the oral microbial profile (OMP) of patients undergoing fixed orthodontic therapy (OT) versus clear aligner therapy (CAT) for the treatment of malocclusions. Methods: Clinical studies were included. Case reports, case series, letters to the editor, reviews, perspectives, and expert opinions were excluded. Indexed databases (MEDLINE/PubMed, Embase, Scopus, and Web of Science) were searched up to May 2025, without time or language restrictions. The study was performed in accordance with Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines. The risk of bias (RoB) and quality of evidence were assessed. Results: Three randomized clinical trials (RCTs) and seven non-RCTs were included. In all RCTs and five non-RCTs, OMP was assessed using subgingival plaque samples. Periodontopathogenic bacteria and Gram-negative anaerobic microbes were more often identified in patients undergoing fixed OT than CAT. The biofilm mass was higher in patients undergoing fixed OT than CAT. In two RCTs, periodontopathogenic bacteria were more dominant among patients undergoing fixed OT than CAT. All RCTs and two non-RCTs had a high RoB. The certainty of evidence was “moderate” in 70% of the studies. Conclusions: Due to a high RoB, variability in study designs, and lack of power analysis, direct comparisons remain limited.
(This article belongs to the Special Issue Current Research Topics in Orthodontics)

27 pages, 3562 KiB  
Article
Automated Test Generation and Marking Using LLMs
by Ioannis Papachristou, Grigoris Dimitroulakos and Costas Vassilakis
Electronics 2025, 14(14), 2835; https://doi.org/10.3390/electronics14142835 - 15 Jul 2025
Cited by 1 | Viewed by 494
Abstract
This paper presents an innovative exam-creation and grading system powered by advanced natural language processing and local large language models. The system automatically generates clear, grammatically accurate questions from both short passages and longer documents across different languages, supports multiple formats and difficulty levels, and ensures semantic diversity while minimizing redundancy, thus maximizing the percentage of the material that is covered in the generated exam paper. For grading, it employs a semantic-similarity model to evaluate essays and open-ended responses, awards partial credit, and mitigates bias from phrasing or syntax via named entity recognition. A major advantage of the proposed approach is its ability to run entirely on standard personal computers, without specialized artificial intelligence hardware, promoting privacy and exam security while maintaining low operational and maintenance costs. Moreover, its modular architecture allows the seamless swapping of models with minimal intervention, ensuring adaptability and the easy integration of future improvements. A requirements–compliance evaluation, combined with established performance metrics, was used to review and compare two popular multilingual LLMs and monolingual alternatives, demonstrating the system’s effectiveness and flexibility. The experimental results show that the system achieves a grading accuracy within a 17% normalized error margin compared to that of human experts, with generated questions reaching up to 89.5% semantic similarity to source content. The full exam generation and grading pipeline runs efficiently on consumer-grade hardware, with average inference times under 30 s.

25 pages, 2109 KiB  
Article
Designing Artificial Intelligence: Exploring Inclusion, Diversity, Equity, Accessibility, and Safety in Human-Centric Emerging Technologies
by Matteo Zallio, Chiara Bianca Ike and Camelia Chivăran
AI 2025, 6(7), 143; https://doi.org/10.3390/ai6070143 - 2 Jul 2025
Viewed by 791
Abstract
Background: The implementation of artificial intelligence (AI) has become a pivotal interdisciplinary challenge, creating new opportunities for sharing information, driving innovation, and transforming societal interactions with technology. While AI offers numerous benefits, its rapid evolution raises critical concerns about its impact on inclusion, diversity, equity, accessibility, and safety (IDEAS). Method: This pilot study aimed to explore these issues and identify ways to embed the IDEAS principles into AI design. A qualitative study was conducted with industrial and academic experts in the field. Semi-structured interviews gathered insights into the opportunities, challenges, and future implications of AI from diverse professional and cultural perspectives. Result: Findings highlight uncertainties in AI’s trajectory and its profound cross-sector influence. Key issues emerged, including bias, data privacy, transparency, and accessibility. Participants stressed the need for greater awareness and structured dialogue to integrate the IDEAS principles throughout the AI lifecycle. Conclusion: This study underscores the urgency of addressing AI’s ethical and societal impacts. Embedding the IDEAS principles into its development can help mitigate risks and foster more inclusive, equitable, and accessible technologies.

17 pages, 2241 KiB  
Systematic Review
Increased Overall Mortality in Patients Admitted for Gastrointestinal Bleeding and COVID-19 Infection Compared to No COVID-19 Infection: A Systematic Review and Meta-Analysis
by Sergiu Marian Cazacu, Adina Turcu-Stiolica, Cristina Maria Marginean and Ion Rogoveanu
Gastroenterol. Insights 2025, 16(3), 20; https://doi.org/10.3390/gastroent16030020 - 30 Jun 2025
Viewed by 549
Abstract
(1) Background: Patients admitted for gastrointestinal bleeding (GIB) who are diagnosed with COVID-19 at presentation may face significant therapeutic challenges. The delicate balance between the use of anticoagulant and anti-inflammatory therapy to address COVID-19 and hemostasis targets can, in turn, lead to delays in COVID-19 treatment until bleeding is controlled. The published systematic reviews and meta-analyses that were reviewed included patients with both GIB and COVID-19 regardless of GIB presence at admission, and a separate analysis of patients admitted for GIB and tested for COVID-19 infection during hospitalization was not performed. (2) Methods: PubMed, Web of Science, and Scopus databases were used to access articles published from 1 December 2019 to 20 December 2024. Retrospective studies involving human subjects with GIB and COVID-19 were included in the final analysis. The exclusion criteria were as follows: pediatric population studies; the absence of a GIB control group; reviews, conference abstracts, expert opinions, and letters. The risk of bias in the included studies was assessed using the rank correlation test and Begg’s and Egger’s regression tests. We estimated the outcomes using the pooled odds ratio (OR) and the 95% confidence interval (95% CI). (3) Results: Seven studies, which included 3291 patients admitted for GIB who tested positive for COVID-19 infection, were included in our systematic review; four studies with a control group of patients with GIB but without COVID-19 infection were included in our meta-analysis. The pooled odds ratio for mortality among COVID-19-infected patients admitted for GIB, relative to those without COVID-19 infection, was 3.80. There was heterogeneity regarding the site of GIB (some studies included all forms of GIB, others included only UGIB) and the study period (most studies included only patients from the first pandemic wave, and only one study reported cases from the first 2 years of the pandemic, including the delta pandemic wave). (4) Conclusions: COVID-19 infection in patients admitted for GIB was associated with a higher overall mortality rate.
(This article belongs to the Section Gastrointestinal Disease)
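
To make the "pooled odds ratio with 95% CI" in the abstract above concrete, here is a minimal sketch using a Wald (Woolf) interval for a single 2x2 table and fixed-effect inverse-variance pooling across tables. The abstract does not state whether a fixed-effect or random-effects model was used, so the pooling model is an assumption, and the counts are invented for illustration.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald (Woolf) confidence interval for a 2x2 table:
    a = exposed with event, b = exposed without event,
    c = unexposed with event, d = unexposed without event."""
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)

def pooled_odds_ratio(tables, z=1.96):
    """Fixed-effect inverse-variance pooling of log odds ratios."""
    num = den = 0.0
    for a, b, c, d in tables:
        weight = 1.0 / (1 / a + 1 / b + 1 / c + 1 / d)  # 1 / variance of log OR
        num += weight * math.log((a * d) / (b * c))
        den += weight
    pooled, se = num / den, math.sqrt(1.0 / den)
    return math.exp(pooled), math.exp(pooled - z * se), math.exp(pooled + z * se)

# Invented counts (assumption): (deaths, survivors) with and without exposure.
studies = [(20, 80, 10, 90), (15, 85, 5, 95)]
single_or, single_lo, single_hi = odds_ratio_ci(*studies[0])
pooled_or, pooled_lo, pooled_hi = pooled_odds_ratio(studies)
print(round(single_or, 2), round(pooled_or, 2))
```

The pooled estimate falls between the individual study odds ratios, weighted towards the study with the smaller variance. This simple form breaks down when any cell count is zero; a continuity correction would be needed in that case.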

15 pages, 1244 KiB  
Article
Can AI-Based ChatGPT Models Accurately Analyze Hand–Wrist Radiographs? A Comparative Study
by Ahmet Yıldırım, Orhan Cicek and Yavuz Selim Genç
Diagnostics 2025, 15(12), 1513; https://doi.org/10.3390/diagnostics15121513 - 14 Jun 2025
Viewed by 654
Abstract
Background/Aims: The aim of this study was to evaluate the effectiveness of large language model (LLM)-based chatbot systems in predicting bone age and identifying growth stages, and to explore their potential as practical, infrastructure-independent alternatives to conventional methods and convolutional neural network (CNN)-based deep learning models. Methods: This study evaluated the performance of three ChatGPT-based models (GPT-4o, GPT-o4-mini-high, and GPT-o1-pro) in predicting bone age and growth stage using 90 anonymized hand–wrist radiographs (30 from each growth stage: pre-peak, peak, and post-peak, with equal male and female distribution). Reference standards were established by expert orthodontists using Fishman’s Skeletal Maturity Indicators (SMI) system and the Greulich–Pyle Atlas, with each radiograph analyzed by three GPT models using standardized prompts. Model performances were evaluated through statistical analyses assessing agreement and prediction accuracy. Results: All models showed significant agreement with the reference values in bone age prediction (p < 0.001), with GPT-o1-pro having the highest concordance (Pearson r = 0.546). No statistically significant difference was observed in the mean absolute error (MAE) among the models (p > 0.05). The GPT-o4-mini-high model achieved an accuracy rate of 72.2% within a ±2 year deviation range for bone age prediction. The GPT-o1-pro and GPT-o4-mini-high models showed bias in the Bland–Altman analysis of bone age predictions; however, GPT-o1-pro yielded more reliable predictions with narrower limits of agreement. In terms of growth stage classification, the GPT-4o model achieved the highest agreement with the reference values (κ = 0.283, p < 0.001). Conclusions: This study shows that general-purpose GPT models can support bone age and growth stage prediction, with each model having distinct strengths. While GPT models do not replace clinical examination, their contextual reasoning and ability to perform preliminary assessments without domain-specific training make them promising tools, though further development is needed.
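
The agreement measures mentioned in this abstract (mean absolute error, Bland–Altman bias, and limits of agreement) can be computed as in the sketch below. The 1.96 multiplier is the conventional choice for 95% limits and is assumed here, and the ages are hypothetical values, not data from the study.

```python
import statistics

def mean_absolute_error(predicted, reference):
    """Average absolute difference between predictions and reference values."""
    return statistics.mean([abs(p - r) for p, r in zip(predicted, reference)])

def bland_altman(predicted, reference, z=1.96):
    """Bland-Altman bias (mean difference) and limits of agreement."""
    diffs = [p - r for p, r in zip(predicted, reference)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - z * sd, bias + z * sd

# Hypothetical bone ages in years (assumption): the model overestimates slightly.
reference = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.5, 15.5, 17.0]

mae = mean_absolute_error(predicted, reference)
bias, lower, upper = bland_altman(predicted, reference)
print(mae, bias, round(lower, 3), round(upper, 3))
```

A nonzero bias indicates a systematic over- or underestimate, while the width of the limits of agreement reflects how much individual predictions scatter around that bias. This is why a model can show bias and still have narrower limits than another.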

19 pages, 989 KiB  
Systematic Review
Enhancing Image Quality in Dental-Maxillofacial CBCT: The Impact of Iterative Reconstruction and AI on Noise Reduction—A Systematic Review
by Róża Wajer, Pawel Dabrowski-Tumanski, Adrian Wajer, Natalia Kazimierczak, Zbigniew Serafin and Wojciech Kazimierczak
J. Clin. Med. 2025, 14(12), 4214; https://doi.org/10.3390/jcm14124214 - 13 Jun 2025
Viewed by 714
Abstract
Background: This systematic review evaluates articles investigating the use of iterative reconstruction (IR) algorithms and artificial intelligence (AI)-based noise reduction techniques to improve the quality of oral CBCT images. Materials and Methods: A detailed search was performed across the PubMed, Scopus, Web of Science, ScienceDirect, and Embase databases. The inclusion criteria were prospective or retrospective studies applying IR or AI to CBCT images, studies in which image quality was statistically assessed, studies on humans, and studies published in peer-reviewed journals in English. Quality assessment was performed independently by two authors, and conflicts were resolved by a third expert. Bias was assessed with the Quality Assessment of Diagnostic Accuracy Studies (QUADAS)-2 tool. Material: A total of eleven studies were included, analyzing a range of IR and AI methods designed to reduce noise and artifacts in CBCT images. Results: A statistically significant improvement in CBCT image quality parameters was achieved by the algorithms used in each of the reviewed articles. The most commonly used image quality measures were peak signal-to-noise ratio (PSNR) and contrast-to-noise ratio (CNR). The largest increase in PSNR was demonstrated by Ylisiurua et al. and Vestergaard et al., who reported an increase of more than 30% for both deep learning (DL) techniques used. Another approach to improving CBCT image quality is the reconstruction of synthetic computed tomography (sCT) images using AI. The use of sCT increased PSNR by 17% to 30%. The more traditional methods, FBP and iterative reconstruction, also improved PSNR, but by a smaller margin of 3% to 13%. Among the papers evaluating CNR, an improvement of 17% to 29% was achieved. Conclusions: The use of AI and IR can significantly improve the quality of oral CBCT images by reducing image noise.

32 pages, 4728 KiB  
Article
How Do Ethical Factors Affect User Trust and Adoption Intentions of AI-Generated Content Tools? Evidence from a Risk-Trust Perspective
by Tao Yu, Yihuan Tian, Yihui Chen, Yang Huang, Younghwan Pan and Wansok Jang
Systems 2025, 13(6), 461; https://doi.org/10.3390/systems13060461 - 11 Jun 2025
Viewed by 2285
Abstract
With the widespread application of AI-generated content (AIGC) tools in creative domains, users have become increasingly concerned about the ethical issues they raise, which may influence their adoption decisions. To explore how ethical perceptions affect user behavior, this study constructs an ethical perception model based on the trust–risk theoretical framework, focusing on its impact on users’ adoption intention (ADI). Through a systematic literature review and expert interviews, eight core ethical dimensions were identified: Misinformation (MIS), Accountability (ACC), Algorithmic Bias (ALB), Creativity Ethics (CRE), Privacy (PRI), Job Displacement (JOD), Ethical Transparency (ETR), and Control over AI (CON). Based on 582 valid responses, structural equation modeling (SEM) was conducted to empirically test the proposed paths. The results show that six factors significantly and positively influence perceived risk (PR): JOD (β = 0.216), MIS (β = 0.161), ETR (β = 0.150), ACC (β = 0.137), CON (β = 0.136), and PRI (β = 0.131), while the effects of ALB and CRE were not significant. Six factors have a significant negative influence on trust in AI (TR): CRE (β = −0.195), PRI (β = −0.145), ETR (β = −0.148), CON (β = −0.133), ALB (β = −0.113), and ACC (β = −0.098), while MIS and JOD were not significant. In addition, PR has a significant negative effect on TR (β = −0.234), which further impacts ADI. Specifically, PR has a significant negative effect on ADI (β = −0.259), while TR has a significant positive effect (β = 0.187). This study not only expands the applicability of the trust–risk framework in the context of AIGC but also proposes an ethical perception model for user adoption research, offering empirical evidence and practical guidance for platform design, governance mechanisms, and trust-building strategies.

12 pages, 1366 KiB  
Article
Budget Impact Analysis of the Use of Specific Biomarkers GFAP and UCH-L1 in the Management of Mild Traumatic Brain Injury in Spain
by Francisco Moya Torrecilla, Gemma Álvarez-Corral, Eva Gutiérrez Pérez, Daniel Morell-Garcia, Juan Ortega Pérez, Beatriz Miriam Rodríguez, Leticia Sánchez Martín and Francisco Temboury Ruiz
J. Clin. Med. 2025, 14(12), 4095; https://doi.org/10.3390/jcm14124095 - 10 Jun 2025
Viewed by 488
Abstract
Objective: To evaluate the economic impact associated with the use of the specific brain biomarkers glial fibrillary acidic protein (GFAP) and ubiquitin C-terminal hydrolase L1 (UCH-L1) in adult patients with suspected mild traumatic brain injury (TBI) in a standard Spanish hospital setting. Methods: We used a budget impact analysis (BIA) to compare the cost of standard of care, which uses head computed tomography (CT) to evaluate intracranial injury, with a scenario incorporating the specific biomarkers GFAP and UCH-L1, in an estimated population of 3500 adult patients attending the hospital emergency department with a score of 13 to 15 on the Glasgow Coma Scale (GCS). The probabilities associated with clinical procedures were obtained from a multidisciplinary group of experts from Spanish hospitals and supplemented with data from the literature. Costs were estimated using hospital tariffs from the Spanish autonomous communities and other official sources. Results: The incorporation of the specific biomarkers GFAP and UCH-L1 in the management of mild TBI could generate estimated annual savings of EUR 696,634 in a standard Spanish hospital, mainly due to reduced CT use. The average savings per patient would be EUR 199.04, and the care time would be reduced by 111 min. Sensitivity analysis, with variations of ±20% in the parameters, confirms these savings. Conclusions: This study suggests that the use of the specific biomarkers GFAP and UCH-L1 in the management of mild TBI patients in Spain could reduce the average cost per patient, generating significant savings for hospitals. Future studies that incorporate data from clinical records will help validate these results.
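The per-patient figure in this abstract follows directly from the two totals it reports. As a quick arithmetic check, the sketch below divides the estimated annual savings by the estimated patient population; the function name is hypothetical, and only the two input figures come from the abstract.

```python
def savings_per_patient(annual_savings_eur: float, patients_per_year: int) -> float:
    """Average savings per patient: total annual savings divided by patient count."""
    if patients_per_year <= 0:
        raise ValueError("patients_per_year must be positive")
    return annual_savings_eur / patients_per_year


# Figures quoted in the abstract: EUR 696,634 saved across 3500 patients.
per_patient = savings_per_patient(696_634, 3_500)
print(f"EUR {per_patient:.2f} per patient")
```

The result rounds to the EUR 199.04 reported in the abstract, so the two figures are mutually consistent.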
(This article belongs to the Section Brain Injury)