Search Results (155)

Search Parameters:
Keywords = multimodal foundation models

40 pages, 5708 KB  
Review
Advances on Multimodal Remote Sensing Foundation Models for Earth Observation Downstream Tasks: A Survey
by Guoqing Zhou, Lihuang Qian and Paolo Gamba
Remote Sens. 2025, 17(21), 3532; https://doi.org/10.3390/rs17213532 (registering DOI) - 24 Oct 2025
Abstract
Remote sensing foundation models (RSFMs) have demonstrated excellent feature extraction and reasoning capabilities under the self-supervised learning paradigm of “unlabeled datasets—model pre-training—downstream tasks”. These models achieve superior accuracy and performance compared to existing models across numerous open benchmark datasets. However, when confronted with multimodal data, such as optical, LiDAR, SAR, text, video, and audio, RSFMs exhibit limitations in cross-modal generalization and multi-task learning. Although several reviews have addressed RSFMs, there is currently no comprehensive survey dedicated to vision–X (vision, language, audio, position) multimodal RSFMs (MM-RSFMs). To tackle this gap, this article provides a systematic review of MM-RSFMs from a novel perspective. Firstly, the key technologies underlying MM-RSFMs are reviewed and analyzed, and the available multimodal RS pre-training datasets are summarized. Then, recent advances in MM-RSFMs are classified according to the development of backbone networks and cross-modal interaction methods of vision–X, such as vision–vision, vision–language, vision–audio, vision–position, and vision–language–audio. Finally, potential challenges are analyzed, and perspectives for MM-RSFMs are outlined. This survey reveals that current MM-RSFMs face the following key challenges: (1) a scarcity of high-quality multimodal datasets, (2) limited capability for multimodal feature extraction, (3) weak cross-task generalization, (4) absence of unified evaluation criteria, and (5) insufficient security measures. Full article
(This article belongs to the Section AI Remote Sensing)
21 pages, 1732 KB  
Review
Artificial Intelligence in Clinical Oncology: From Productivity Enhancement to Creative Discovery
by Masahiro Kuno, Hiroki Osumi, Shohei Udagawa, Kaoru Yoshikawa, Akira Ooki, Eiji Shinozaki, Tetsuo Ishikawa, Junna Oba, Kensei Yamaguchi and Kazuhiro Sakurada
Curr. Oncol. 2025, 32(11), 588; https://doi.org/10.3390/curroncol32110588 - 22 Oct 2025
Viewed by 526
Abstract
Modern clinical oncology faces unprecedented data complexity that exceeds human analytical capacity, making artificial intelligence (AI) integration essential rather than optional. This review examines the dual impact of AI on productivity enhancement and creative discovery in cancer care. We trace the evolution from traditional machine learning to deep learning and transformer-based foundation models, analyzing their clinical applications. AI enhances productivity by automating diagnostic tasks, streamlining documentation, and accelerating research workflows across imaging modalities and clinical data processing. More importantly, AI enables creative discovery by integrating multimodal data to identify computational biomarkers, performing unsupervised phenotyping to reveal hidden patient subgroups, and accelerating drug development. Finally, we introduce the FUTURE-AI framework, outlining the essential requirements for translating AI models into clinical practice. This ensures the responsible deployment of AI, which augments rather than replaces clinical judgment, while maintaining patient-centered care. Full article
21 pages, 1703 KB  
Article
Beyond Biomarkers: Blending Copeptin and Clinical Cues to Distinguish Central Diabetes Insipidus from Primary Polydipsia in Children
by Diana-Andreea Ciortea, Carmen Loredana Petrea (Cliveți), Gabriela Isabela Verga (Răuță), Sorin Ion Berbece, Gabriela Gurău, Silvia Fotea and Mădălina Nicoleta Matei
Biomedicines 2025, 13(10), 2573; https://doi.org/10.3390/biomedicines13102573 - 21 Oct 2025
Viewed by 205
Abstract
Background: Polyuria–polydipsia syndrome (PPS) in children poses a major diagnostic challenge, as central diabetes insipidus (CDI) and primary polydipsia (PP) require distinct treatments. Although copeptin is a robust diagnostic biomarker, using only fixed thresholds may not adequately support decision making in borderline cases. To address this gap, we evaluated a multimodal diagnostic approach that integrates copeptin dynamics with clinical profiling. Methods: In a prospective diagnostic study (2019–2025), 24 children with PPS (CDI = 11, PP = 13) underwent hypertonic saline testing with serial sodium, osmolality, and copeptin sampling. Predictors included stimulated copeptin, peak sodium, peak osmolality, test duration, and tolerability. A Ridge regression model was applied and internally validated with stratified cross-validation. Results: Stimulated copeptin was the strongest discriminator, while sodium/osmolality dynamics and tolerability provided complementary value. The multimodal model achieved cross-validated AUC of 0.937 with 83.3% accuracy, and the procedure was safe and feasible in children. These findings support moving beyond biomarker cut-offs toward integrative diagnostic approaches that better reflect real-world clinical practice. Conclusions: Combining copeptin with clinical profiling in a penalized regression framework yields a robust and interpretable tool for distinguishing CDI from PP. More broadly, such integrative models may enhance diagnostic precision in rare pediatric disorders and provide a foundation for future multicenter validation and clinical decision-support applications. Full article
(This article belongs to the Section Molecular and Translational Medicine)
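As a rough, self-contained illustration of the modelling step described in this abstract (a Ridge-penalized classifier assessed with stratified cross-validation), the sketch below uses scikit-learn on placeholder data; the feature list, sample sizes, and hyperparameters are assumptions, not the authors' pipeline.

```python
# Minimal sketch (not the study's code): Ridge-penalized classification of
# CDI vs. PP from hypertonic-saline test features, with stratified CV.
import numpy as np
from sklearn.linear_model import RidgeClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical feature matrix: [stimulated copeptin, peak sodium,
# peak osmolality, test duration, tolerability score]; 1 = CDI, 0 = PP.
rng = np.random.default_rng(0)
X = rng.normal(size=(24, 5))
y = np.array([1] * 11 + [0] * 13)

model = make_pipeline(StandardScaler(), RidgeClassifier(alpha=1.0))
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

aucs, accs = [], []
for train_idx, test_idx in cv.split(X, y):
    model.fit(X[train_idx], y[train_idx])
    score = model.decision_function(X[test_idx])   # signed distance as a score
    aucs.append(roc_auc_score(y[test_idx], score))
    accs.append(accuracy_score(y[test_idx], model.predict(X[test_idx])))

print(f"CV AUC = {np.mean(aucs):.3f}, accuracy = {np.mean(accs):.3f}")
```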
17 pages, 1203 KB  
Article
Exploration of Stability Judgments: Assessing Multimodal LLMs in Game-Inspired Physical Reasoning Tasks
by Mury Fajar Dewantoro, Febri Abdullah, Yi Xia, Ibrahim Khan, Ruck Thawonmas, Wenwen Ouyang and Fitra Abdurrachman Bachtiar
Appl. Sci. 2025, 15(20), 11253; https://doi.org/10.3390/app152011253 - 21 Oct 2025
Viewed by 168
Abstract
This study extends our previous investigation into whether multimodal large language models (MLLMs) can perform physical reasoning, using a game environment as the testbed. Stability served as a foundational scenario to probe model understanding of physical reasoning. We evaluated twelve models, combining those from the earlier study with six additional open-weight models, across three tasks designed to capture different aspects of reasoning. Human participants were included as a reference point, consistently achieving the highest accuracy, underscoring the gap between model and human performance. Among MLLMs, the GPT series continued to perform strongly, with GPT-4o showing reliable results in image-based tasks, while the Qwen2.5-VL series reached the highest overall scores in this extended study and in some cases surpassed commercial counterparts. Simpler binary tasks yielded balanced performance across modalities, suggesting that models can capture certain basic aspects of reasoning, whereas more complex multiple-choice tasks led to sharp declines in accuracy. Structured inputs such as XML improved results in the prediction task, where Qwen2.5-VL outperformed GPT variants in our earlier work. These findings demonstrate progress in scaling and modality design for physical reasoning, while reaffirming that human participants remain superior across all tasks. Full article
20 pages, 2770 KB  
Article
Foundations of Livestock Behavioral Recognition: Ethogram Analysis of Behavioral Definitions and Its Practices in Multimodal Large Language Models
by Siling Zhou, Wenjie Li, Mengting Zhou, Ryan N. Dilger, Isabella C. F. S. Condotta, Zhonghong Wu, Xiangfang Tang, Yiqi Wu, Tao Wang and Jiangong Li
Animals 2025, 15(20), 3030; https://doi.org/10.3390/ani15203030 - 19 Oct 2025
Viewed by 222
Abstract
Computer vision offers a promising approach to automating the observation of animal behavior, thereby contributing to improved animal welfare and precision livestock management. However, the absence of standardized behavioral definitions limits the accuracy and generalizability of artificial intelligence models used for behavior recognition. This study applied natural language processing techniques to analyze 655 behavior definitions related to feeding, drinking, resting, and moving, as reported in the livestock research literature published between 2000 and 2023. Clustering and structural analyses revealed consistent semantic patterns across behavior categories. Feeding and drinking behaviors were concisely defined in 6–10 words, including the semantic elements of body parts, actions, and action objects. Resting and moving behaviors were described in 6–15 words. Resting behavior was defined by actions and action objects, while moving behaviors were characterized by action words only. By integrating these structured definitions into prompts, ChatGPT-4o achieved an average correspondence score of 4.53 out of 5 in an image-based piglet behavior annotation task. These findings highlight the value of standardized behavior definitions in supporting more accurate and generalizable behavior recognition models for precision livestock farming. Full article
(This article belongs to the Special Issue Artificial Intelligence as a Useful Tool in Behavioural Studies)
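To make the prompt-integration step described in this abstract concrete, here is a small illustrative sketch that turns structured behaviour definitions (body part, action, action object, mirroring the semantic elements identified above) into an annotation prompt for a multimodal model; the field names, wording, and definitions are hypothetical and do not reproduce the study's actual prompts.

```python
# Illustrative sketch: composing standardized behavior definitions into an
# annotation prompt for a multimodal LLM. All fields are hypothetical.
behavior_definitions = {
    "feeding":  {"body_part": "head", "action": "lowered into", "object": "the feeder trough"},
    "drinking": {"body_part": "mouth", "action": "in contact with", "object": "the drinker nipple"},
    "resting":  {"body_part": "body", "action": "lying motionless on", "object": "the pen floor"},
}

def build_prompt(definitions: dict) -> str:
    lines = [f"- {name}: {d['body_part']} {d['action']} {d['object']}"
             for name, d in definitions.items()]
    return (
        "You are annotating piglet behavior in the attached image.\n"
        "Use exactly these behavior definitions:\n"
        + "\n".join(lines)
        + "\nReturn one label per visible piglet with a brief justification."
    )

# The resulting text would be sent together with the image to a multimodal
# model (e.g., ChatGPT-4o) for scoring against human annotations.
print(build_prompt(behavior_definitions))
```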
22 pages, 2027 KB  
Article
Agri-DSSA: A Dual Self-Supervised Attention Framework for Multisource Crop Health Analysis Using Hyperspectral and Image-Based Benchmarks
by Fatema A. Albalooshi
AgriEngineering 2025, 7(10), 350; https://doi.org/10.3390/agriengineering7100350 - 17 Oct 2025
Viewed by 241
Abstract
Recent advances in hyperspectral imaging (HSI) and multimodal deep learning have opened new opportunities for crop health analysis; however, most existing models remain limited by dataset scope, lack of interpretability, and weak cross-domain generalization. To overcome these limitations, this study introduces Agri-DSSA, a novel Dual Self-Supervised Attention (DSSA) framework that simultaneously models spectral and spatial dependencies through two complementary self-attention branches. The proposed architecture enables robust and interpretable feature learning across heterogeneous data sources, facilitating the estimation of spectral proxies of chlorophyll content, plant vigor, and disease stress indicators rather than direct physiological measurements. Experiments were performed on seven publicly available benchmark datasets encompassing diverse spectral and visual domains: three hyperspectral datasets (Indian Pines with 16 classes and 10,366 labeled samples; Pavia University with 9 classes and 42,776 samples; and Kennedy Space Center with 13 classes and 5211 samples), two plant disease datasets (PlantVillage with 54,000 labeled leaf images covering 38 diseases across 14 crop species, and the New Plant Diseases dataset with over 30,000 field images captured under natural conditions), and two chlorophyll content datasets (the Global Leaf Chlorophyll Content Dataset (GLCC), derived from MERIS and OLCI satellite data between 2003 and 2020, and the Leaf Chlorophyll Content Dataset for Crops, which includes paired spectrophotometric and multispectral measurements collected from multiple crop species). To ensure statistical rigor and spatial independence, a block-based spatial cross-validation scheme was employed across five independent runs with fixed random seeds. Model performance was evaluated using R², RMSE, F1-score, AUC-ROC, and AUC-PR, each reported as mean ± standard deviation with 95% confidence intervals. Results show that Agri-DSSA consistently outperforms baseline models (PLSR, RF, 3D-CNN, and HybridSN), achieving up to R² = 0.86 for chlorophyll content estimation and F1-scores above 0.95 for plant disease detection. The attention distributions highlight physiologically meaningful spectral regions (550–710 nm) associated with chlorophyll absorption, confirming the interpretability of the model’s learned representations. This study serves as a methodological foundation for UAV-based and field-deployable crop monitoring systems. By unifying hyperspectral, chlorophyll, and visual disease datasets, Agri-DSSA provides an interpretable and generalizable framework for proxy-based vegetation stress estimation. Future work will extend the model to real UAV campaigns and in-field spectrophotometric validation to achieve full agronomic reliability. Full article
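For readers who want a feel for the dual-branch idea described in this abstract, the PyTorch sketch below runs spectral self-attention (bands as tokens) and spatial self-attention (pixels as tokens) over a hyperspectral patch and fuses the two branches into a single regression output; the dimensions, pooling, fusion, and head are assumptions for illustration and do not reproduce the Agri-DSSA architecture.

```python
# Toy dual (spectral + spatial) self-attention block for a hyperspectral patch.
# Sizes and fusion are illustrative assumptions, not the published model.
import torch
import torch.nn as nn

class DualSelfAttention(nn.Module):
    def __init__(self, bands: int, h: int, w: int, dim: int = 64, heads: int = 4):
        super().__init__()
        self.spectral_proj = nn.Linear(h * w, dim)   # each band becomes a token
        self.spatial_proj = nn.Linear(bands, dim)    # each pixel becomes a token
        self.spectral_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.spatial_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.head = nn.Linear(2 * dim, 1)            # e.g., a chlorophyll proxy

    def forward(self, x):                            # x: (batch, bands, h, w)
        spec = self.spectral_proj(x.flatten(2))                 # (B, bands, dim)
        spat = self.spatial_proj(x.flatten(2).transpose(1, 2))  # (B, h*w, dim)
        spec, _ = self.spectral_attn(spec, spec, spec)
        spat, _ = self.spatial_attn(spat, spat, spat)
        fused = torch.cat([spec.mean(dim=1), spat.mean(dim=1)], dim=-1)
        return self.head(fused).squeeze(-1)

model = DualSelfAttention(bands=200, h=9, w=9)
print(model(torch.randn(2, 200, 9, 9)).shape)        # torch.Size([2])
```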
29 pages, 1325 KB  
Article
Digital Stratigraphy—A Pattern Analysis Framework Integrating Computer Forensics, Criminology, and Forensic Archaeology for Crime Scene Investigation
by Romil Rawat, Hitesh Rawat, Mandakini Ingle, Anjali Rawat, Anand Rajavat and Ashish Dibouliya
Forensic Sci. 2025, 5(4), 48; https://doi.org/10.3390/forensicsci5040048 - 17 Oct 2025
Viewed by 289
Abstract
Background/Objectives—Traditional forensic investigations often analyze digital, physical, and criminological evidence separately, leading to fragmented timelines and reduced accuracy in reconstructing complex events. To address these gaps, this study proposes the Digital Stratigraphy Framework (DSF), inspired by archaeological stratigraphy, to integrate heterogeneous evidence into structured, temporally ordered layers. DSF aims to reduce asynchronous inconsistencies, minimize false associations, and enhance interpretability across digital, behavioral, geospatial, and excavation evidence. Methods—DSF employs Hierarchical Pattern Mining (HPM) to detect recurring behavioral patterns and Forensic Sequence Alignment (FSA) to synchronize evidence layers temporally and contextually. The framework was tested on the CSI-DS2025 dataset containing 25,000 multimodal, stratified records, including digital logs, geospatial data, criminological reports, and excavation notes. Evaluation used 10-fold cross-validation, Bayesian hyperparameter tuning, and structured train-validation-test splits. Metrics included accuracy, precision, recall, F1-score, and Stratigraphic Reconstruction Consistency (SRC), alongside ablation and runtime assessments. Results—DSF achieved 92.6% accuracy, 93.1% precision, 90.5% recall, 91.3% F1-score, and an SRC of 0.89, outperforming baseline models. False associations were reduced by 18%, confirming effective cross-layer alignment and computational efficiency. Conclusions—By applying stratigraphic principles to forensic analytics, DSF enables accurate, interpretable, and legally robust evidence reconstruction. The framework establishes a scalable foundation for real-time investigative applications and multi-modal evidence integration, offering significant improvements over traditional fragmented approaches. Full article
(This article belongs to the Special Issue Feature Papers in Forensic Sciences)
69 pages, 7515 KB  
Review
Towards an End-to-End Digital Framework for Precision Crop Disease Diagnosis and Management Based on Emerging Sensing and Computing Technologies: State over Past Decade and Prospects
by Chijioke Leonard Nkwocha and Abhilash Kumar Chandel
Computers 2025, 14(10), 443; https://doi.org/10.3390/computers14100443 - 16 Oct 2025
Viewed by 580
Abstract
Early detection and diagnosis of plant diseases are critical for ensuring global food security and sustainable agricultural practices. This review comprehensively examines the latest advancements in crop disease risk prediction, onset detection through imaging techniques, machine learning (ML), deep learning (DL), and edge computing technologies. Traditional disease detection methods, which rely on visual inspections, are time-consuming and often inaccurate. While chemical analyses are accurate, they can be time-consuming and leave less flexibility to promptly implement remedial actions. In contrast, modern techniques such as hyperspectral and multispectral imaging, thermal imaging, and fluorescence imaging, among others, can provide non-invasive and highly accurate solutions for identifying plant diseases at early stages. The integration of ML and DL models, including convolutional neural networks (CNNs) and transfer learning, has significantly improved disease classification and severity assessment. Furthermore, edge computing and the Internet of Things (IoT) facilitate real-time disease monitoring by processing and communicating data directly in/from the field, reducing latency and reliance on in-house as well as centralized cloud computing. Despite these advancements, challenges remain in terms of multimodal dataset standardization, integration of individual technologies of sensing, data processing, communication, and decision-making to provide a complete end-to-end solution for practical implementations. In addition, the robustness of such technologies under varying field conditions, as well as their affordability, has also not been reviewed. To this end, this review paper focuses on broad areas of sensing, computing, and communication systems to outline the transformative potential of end-to-end solutions for effective implementations towards crop disease management in modern agricultural systems. This review also highlights the critical potential of integrating AI-driven disease detection and predictive models capable of analyzing multimodal data on environmental factors such as temperature and humidity, as well as visible-range and thermal imagery, for early disease diagnosis and timely management. Future research should focus on developing autonomous end-to-end disease monitoring systems that incorporate these technologies, fostering comprehensive precision agriculture and sustainable crop production. Full article
51 pages, 4751 KB  
Review
Large Language Models and 3D Vision for Intelligent Robotic Perception and Autonomy
by Vinit Mehta, Charu Sharma and Karthick Thiyagarajan
Sensors 2025, 25(20), 6394; https://doi.org/10.3390/s25206394 - 16 Oct 2025
Viewed by 376
Abstract
With the rapid advancement of artificial intelligence and robotics, the integration of Large Language Models (LLMs) with 3D vision is emerging as a transformative approach to enhancing robotic sensing technologies. This convergence enables machines to perceive, reason, and interact with complex environments through natural language and spatial understanding, bridging the gap between linguistic intelligence and spatial perception. This review provides a comprehensive analysis of state-of-the-art methodologies, applications, and challenges at the intersection of LLMs and 3D vision, with a focus on next-generation robotic sensing technologies. We first introduce the foundational principles of LLMs and 3D data representations, followed by an in-depth examination of 3D sensing technologies critical for robotics. The review then explores key advancements in scene understanding, text-to-3D generation, object grounding, and embodied agents, highlighting cutting-edge techniques such as zero-shot 3D segmentation, dynamic scene synthesis, and language-guided manipulation. Furthermore, we discuss multimodal LLMs that integrate 3D data with touch, auditory, and thermal inputs, enhancing environmental comprehension and robotic decision-making. To support future research, we catalog benchmark datasets and evaluation metrics tailored for 3D-language and vision tasks. Finally, we identify key challenges and future research directions, including adaptive model architectures, enhanced cross-modal alignment, and real-time processing capabilities, which pave the way for more intelligent, context-aware, and autonomous robotic sensing systems. Full article
(This article belongs to the Special Issue Advanced Sensors and AI Integration for Human–Robot Teaming)
22 pages, 2258 KB  
Article
Designing Light for Emotion: A Neurophysiological Approach to Modeling Affective Responses to the Interplay of Color and Illuminance
by Xuejiao Li, Ruili Wang and Mincheol Whang
Biomimetics 2025, 10(10), 696; https://doi.org/10.3390/biomimetics10100696 - 14 Oct 2025
Viewed by 608
Abstract
As the influence of indoor environments on human emotional regulation and cognitive function becomes increasingly critical in modern society, there is a growing need for intelligent lighting systems that dynamically respond to users’ emotional states. While previous studies have investigated either illuminance or color in isolation, this study concentrates on quantitatively analyzing the interaction of these two key elements on human emotion and cognitive control capabilities. Utilizing electroencephalography (EEG) and electrocardiography (ECG) signals, we measured participants’ physiological responses and subjective emotional assessments in 18 unique lighting conditions, combining six colors and three levels of illuminance. The results confirmed that the interaction between light color and illuminance significantly affects physiological indicators related to emotion regulation. Notably, low-illuminance purple lighting was found to promote positive emotions and inhibit negative ones by increasing frontal alpha asymmetry (FAA) and gamma wave activity. Conversely, low-illuminance environments generally diminished cognitive reappraisal and negative emotion inhibition capabilities. Furthermore, a random forest model integrating time-series data from EEG and ECG predicted emotional valence and arousal with accuracies of 87% and 79%, respectively, demonstrating the validity of multi-modal physiological signal-based emotion prediction. This study provides empirical data and a theoretical foundation for the development of human-centered, emotion-adaptive lighting systems by presenting a quantitative causal model linking lighting, physiological responses, and emotion. These findings also provide a biomimetic perspective by linking lighting-induced physiological responses with emotion regulation, offering a foundation for the development of adaptive lighting systems that emulate natural light–human interactions. Full article
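As a minimal illustration of the prediction stage mentioned in this abstract (a random forest over combined EEG/ECG features), the snippet below trains on placeholder features and labels; the study's actual feature extraction from time-series signals, label definitions, and model settings are not reproduced here.

```python
# Illustrative sketch only: random-forest valence classification from
# hypothetical EEG/ECG features (e.g., frontal alpha asymmetry, band powers,
# heart-rate-variability statistics). Data are random placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(180, 12))      # e.g., 18 lighting conditions x 10 trials
y = rng.integers(0, 2, size=180)    # 1 = positive valence, 0 = negative

clf = RandomForestClassifier(n_estimators=300, random_state=0)
print("CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())
```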
24 pages, 2328 KB  
Review
Large Language Model Agents for Biomedicine: A Comprehensive Review of Methods, Evaluations, Challenges, and Future Directions
by Xiaoran Xu and Ravi Sankar
Information 2025, 16(10), 894; https://doi.org/10.3390/info16100894 - 14 Oct 2025
Viewed by 929
Abstract
Large language model (LLM)-based agents are rapidly emerging as transformative tools across biomedical research and clinical applications. By integrating reasoning, planning, memory, and tool use capabilities, these agents go beyond static language models to operate autonomously or collaboratively within complex healthcare settings. This review provides a comprehensive survey of biomedical LLM agents, spanning their core system architectures, enabling methodologies, and real-world use cases such as clinical decision making, biomedical research automation, and patient simulation. We further examine emerging benchmarks designed to evaluate agent performance under dynamic, interactive, and multimodal conditions. In addition, we systematically analyze key challenges, including hallucinations, interpretability, tool reliability, data bias, and regulatory gaps, and discuss corresponding mitigation strategies. Finally, we outline future directions in areas such as continual learning, federated adaptation, robust multi-agent coordination, and human–AI collaboration. This review aims to establish a foundational understanding of biomedical LLM agents and provide a forward-looking roadmap for building trustworthy, reliable, and clinically deployable intelligent systems. Full article
17 pages, 286 KB  
Review
Deep Learning Image Processing Models in Dermatopathology
by Apoorva Mehta, Mateen Motavaf, Danyal Raza, Neil Jairath, Akshay Pulavarty, Ziyang Xu, Michael A. Occidental, Alejandro A. Gru and Alexandra Flamm
Diagnostics 2025, 15(19), 2517; https://doi.org/10.3390/diagnostics15192517 - 4 Oct 2025
Viewed by 560
Abstract
Dermatopathology has rapidly advanced due to the implementation of deep learning models and artificial intelligence (AI). From convolutional neural networks (CNNs) to transformer-based foundation models, these systems are now capable of accurate whole-slide analysis and multimodal integration. This review synthesizes the most recent advances in deep-learning architectures and traces their evolution from first-generation CNNs through hybrid CNN–transformer systems to large-scale foundation models such as Paige’s PanDerm AI and Virchow. Herein, we examine performance benchmarks from real-world deployments of major dermatopathology deep learning models (DermAI, PathAssist Derm), as well as emerging next-generation models still under research and development. We assess barriers to clinical workflow adoption such as dataset bias, AI interpretability, and government regulation. Further, we discuss potential future research directions and emphasize the need for diverse, prospectively curated datasets, explainability frameworks for trust in AI, and rigorous compliance with Good Machine-Learning-Practice (GMLP) to achieve safe and scalable deep learning dermatopathology models that can fully integrate into clinical workflows. Full article
(This article belongs to the Special Issue Artificial Intelligence in Skin Disorders 2025)
15 pages, 1245 KB  
Article
Multimodal Behavioral Sensors for Lie Detection: Integrating Visual, Auditory, and Generative Reasoning Cues
by Daniel Grabowski, Kamila Łuczaj and Khalid Saeed
Sensors 2025, 25(19), 6086; https://doi.org/10.3390/s25196086 - 2 Oct 2025
Viewed by 467
Abstract
Advances in multimodal artificial intelligence enable new sensor-inspired approaches to lie detection by combining behavioral perception with generative reasoning. This study presents a deception detection framework that integrates deep video and audio processing with large language models guided by chain-of-thought (CoT) prompting. We interpret neural architectures such as ViViT (for video) and HuBERT (for speech) as digital behavioral sensors that extract implicit emotional and cognitive cues, including micro-expressions, vocal stress, and timing irregularities. We further incorporate a GPT-5-based prompt-level fusion approach for video–language–emotion alignment and zero-shot inference. This method jointly processes visual frames, textual transcripts, and emotion recognition outputs, enabling the system to generate interpretable deception hypotheses without any task-specific fine-tuning. Facial expressions are treated as high-resolution affective signals captured via visual sensors, while audio encodes prosodic markers of stress. Our experimental setup is based on the DOLOS dataset, which provides high-quality multimodal recordings of deceptive and truthful behavior. We also evaluate a continual learning setup that transfers emotional understanding to deception classification. Results indicate that multimodal fusion and CoT-based reasoning increase classification accuracy and interpretability. The proposed system bridges the gap between raw behavioral data and semantic inference, laying a foundation for AI-driven lie detection with interpretable sensor analogues. Full article
(This article belongs to the Special Issue Sensor-Based Behavioral Biometrics)
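For orientation, here is a deliberately simplified late-fusion stand-in for the pipeline sketched in this abstract: placeholder embeddings take the place of ViViT (video) and HuBERT (audio) outputs and are concatenated with emotion scores before a small classifier head. Note that the paper itself fuses modalities at the prompt level with an LLM and chain-of-thought reasoning; this sketch only illustrates how the per-modality behavioral features could be combined numerically.

```python
# Simplified late-fusion stand-in (not the paper's prompt-level GPT fusion):
# placeholder ViViT/HuBERT embeddings plus emotion scores -> classifier head.
import torch
import torch.nn as nn

video_emb = torch.randn(4, 768)     # stand-in for ViViT clip embeddings
audio_emb = torch.randn(4, 768)     # stand-in for HuBERT utterance embeddings
emotion = torch.rand(4, 7)          # stand-in for emotion-recognition scores

fusion_head = nn.Sequential(
    nn.Linear(768 + 768 + 7, 128),
    nn.ReLU(),
    nn.Linear(128, 2),              # truthful vs. deceptive
)
logits = fusion_head(torch.cat([video_emb, audio_emb, emotion], dim=-1))
print(logits.softmax(dim=-1))       # per-clip class probabilities
```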
43 pages, 7808 KB  
Article
GeoJSEval: An Automated Evaluation Framework for Large Language Models on JavaScript-Based Geospatial Computation and Visualization Code Generation
by Guanyu Chen, Haoyue Jiao, Shuyang Hou, Ziqi Liu, Lutong Xie, Shaowen Wu, Huayi Wu, Xuefeng Guan and Zhipeng Gui
ISPRS Int. J. Geo-Inf. 2025, 14(10), 382; https://doi.org/10.3390/ijgi14100382 - 28 Sep 2025
Viewed by 613
Abstract
With the widespread adoption of large language models (LLMs) in code generation tasks, geospatial code generation has emerged as a critical frontier in the integration of artificial intelligence and geoscientific analysis. This growing trend underscores the urgent need for systematic evaluation methodologies to assess the generation capabilities of LLMs in geospatial contexts. In particular, geospatial computation and visualization tasks in the JavaScript environment rely heavily on the orchestration of diverse frontend libraries and ecosystems, posing elevated demands on a model’s semantic comprehension and code synthesis capabilities. To address this challenge, we propose GeoJSEval—the first multimodal, function-level automatic evaluation framework for LLMs in JavaScript-based geospatial code generation tasks. The framework comprises three core components: a standardized test suite (GeoJSEval-Bench), a code submission engine, and an evaluation module. It includes 432 function-level tasks and 2071 structured test cases, spanning five widely used JavaScript geospatial libraries that support spatial analysis and visualization functions, as well as 25 mainstream geospatial data types. GeoJSEval enables multidimensional quantitative evaluation across metrics such as accuracy, output stability, resource consumption, execution efficiency, and error type distribution. Moreover, it integrates boundary testing mechanisms to enhance robustness and evaluation coverage. We conduct a comprehensive assessment of 20 state-of-the-art LLMs using GeoJSEval, uncovering significant performance disparities and bottlenecks in spatial semantic understanding, code reliability, and function invocation accuracy. GeoJSEval offers a foundational methodology, evaluation resource, and practical toolkit for the standardized assessment and optimization of geospatial code generation models, with strong extensibility and promising applicability in real-world scenarios. This manuscript represents the peer-reviewed version of our earlier preprint previously made available on arXiv. Full article
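To illustrate what function-level evaluation of generated JavaScript can look like in practice, the sketch below runs a generated snippet under Node.js and checks its JSON output and runtime; the task schema, fields, and example are assumptions for illustration and are not GeoJSEval's actual format (Node.js must be installed for it to run).

```python
# Toy function-level evaluation loop: execute generated JavaScript with Node.js
# and compare its JSON output against an expected value. The task structure is
# a hypothetical stand-in, not the GeoJSEval-Bench schema.
import json
import subprocess
import tempfile
import time

tasks = [
    {
        "id": "bounding-box",
        "generated_js": (
            "const pts = [[30, 10], [40, 40], [20, 40]];\n"
            "const xs = pts.map(p => p[0]), ys = pts.map(p => p[1]);\n"
            "console.log(JSON.stringify([Math.min(...xs), Math.min(...ys),"
            " Math.max(...xs), Math.max(...ys)]));"
        ),
        "expected": [20, 10, 40, 40],
    },
]

results = []
for task in tasks:
    with tempfile.NamedTemporaryFile("w", suffix=".js", delete=False) as f:
        f.write(task["generated_js"])
        path = f.name
    start = time.time()
    proc = subprocess.run(["node", path], capture_output=True, text=True, timeout=30)
    passed = proc.returncode == 0 and json.loads(proc.stdout or "null") == task["expected"]
    results.append({"id": task["id"], "passed": passed, "seconds": round(time.time() - start, 3)})

print(results)
```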
23 pages, 1575 KB  
Systematic Review
Integrating Spatial Omics and Deep Learning: Toward Predictive Models of Cardiomyocyte Differentiation Efficiency
by Tumo Kgabeng, Lulu Wang, Harry M. Ngwangwa and Thanyani Pandelani
Bioengineering 2025, 12(10), 1037; https://doi.org/10.3390/bioengineering12101037 - 27 Sep 2025
Viewed by 611
Abstract
Advances in cardiac regenerative medicine increasingly rely on integrating artificial intelligence with spatial multi-omics technologies to decipher intricate cellular dynamics in cardiomyocyte differentiation. This systematic review, synthesising insights from 88 PRISMA-selected studies spanning 2015–2025, explores how deep learning architectures, specifically Graph Neural Networks (GNNs) and Recurrent Neural Networks (RNNs), synergise with multi-modal single-cell datasets, spatially resolved transcriptomics, and epigenomics to advance cardiac biology. Innovations in spatial omics technologies have revolutionised our understanding of the organisation of cardiac tissue, revealing novel cellular communities and metabolic landscapes that underlie cardiovascular health and disease. By synthesising cutting-edge methodologies and technical innovations across these 88 studies, this review establishes the foundation for AI-enabled cardiac regeneration, potentially accelerating the clinical adoption of regenerative treatments through improved therapeutic prediction models and mechanistic understanding. We examine deep learning implementations in spatiotemporal genomics, spatial multi-omics applications in cardiac tissues, cardiomyocyte differentiation challenges, and predictive modelling innovations that collectively advance precision cardiology and next-generation regenerative strategies. Full article