Search Results (142)

Search Parameters:
Keywords = graphical modeling languages

23 pages, 13345 KB  
Article
Neural-Based Controller on Low-Density FPGAs for Dynamic Systems
by Edson E. Cruz-Miguel, José R. García-Martínez, Jorge Orrante-Sakanassi, José M. Álvarez-Alvarado, Omar A. Barra-Vázquez and Juvenal Rodríguez-Reséndiz
Electronics 2026, 15(1), 198; https://doi.org/10.3390/electronics15010198 - 1 Jan 2026
Viewed by 142
Abstract
This work introduces a logic resource-efficient Artificial Neural Network (ANN) controller for embedded control applications on low-density Field-Programmable Gate Array (FPGA) platforms. The proposed design relies on 32-bit fixed-point arithmetic and incorporates an online learning mechanism, enabling the controller to adapt to system variations while maintaining low hardware complexity. Unlike conventional artificial intelligence solutions that require high-performance processors or Graphics Processing Units (GPUs), the proposed approach targets platforms with limited logic, memory, and computational resources. The ANN controller was described using a Hardware Description Language (HDL) and validated via cosimulation between ModelSim and Simulink. A practical comparison was also made between Proportional-Integral-Derivative (PID) control and an ANN for motor position control. The results confirm that the architecture efficiently utilizes FPGA resources, consuming approximately 50% of the available Digital Signal Processor (DSP) units, less than 40% of logic cells, and only 6% of embedded memory blocks. Owing to its modular design, the architecture is inherently scalable, allowing additional inputs or hidden-layer neurons to be incorporated with minimal impact on overall resource usage. Additionally, the computational latency can be precisely determined and scales with (16n+39)m+31 clock cycles, enabling precise timing analysis and facilitating integration into real-time embedded control systems.
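
As a sanity check on the latency claim, the minimal sketch below simply evaluates the stated (16n+39)m+31 cycle formula; reading n as the number of inputs and m as the number of hidden-layer neurons, and the 50 MHz clock, are illustrative assumptions rather than values from the paper.

```python
# Evaluate the abstract's latency formula, assuming n = inputs and
# m = hidden-layer neurons (an interpretation, not stated in the paper).

def ann_latency_cycles(n: int, m: int) -> int:
    """Clock cycles for one controller update, per (16n+39)m+31."""
    return (16 * n + 39) * m + 31

F_CLK_HZ = 50_000_000  # assumed FPGA clock, for illustration only

for n, m in [(2, 4), (4, 8), (8, 16)]:
    cycles = ann_latency_cycles(n, m)
    print(f"n={n:2d} m={m:2d}: {cycles:6d} cycles = "
          f"{cycles / F_CLK_HZ * 1e6:.2f} us")
```
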

51 pages, 6351 KB  
Article
Benchmarking PHP–MySQL Communication: A Comparative Study of MySQLi and PDO Under Varying Query Complexity
by Nebojša Andrijević, Zoran Lovreković, Hadžib Salkić, Đorđe Šarčević and Jasmina Perišić
Electronics 2026, 15(1), 21; https://doi.org/10.3390/electronics15010021 - 20 Dec 2025
Viewed by 561
Abstract
Efficient interaction between PHP (Hypertext Preprocessor) applications and MySQL databases is essential for the performance of modern web systems. This study systematically compares the two most widely used PHP APIs for working with MySQL databases—MySQLi (MySQL Improved extension) and PDO (PHP Data Objects)—under identical experimental conditions. The analysis covers execution time, memory consumption, and the stability and variability of results across different types of SQL (Structured Query Language) queries (simple queries, complex JOIN, GROUP BY/HAVING). A specialized benchmarking tool was developed to collect detailed metrics over several hundred repetitions and to enable graphical and statistical evaluation. Across the full benchmark suite, MySQLi exhibits the lowest mean wall-clock execution time on average (≈15% overall). However, under higher query complexity and in certain connection-handling regimes, PDO prepared-statement modes provide competitive latency with improved predictability. These results should be interpreted as context-aware rankings for the tested single-host environment and workload design, and as a reusable benchmarking framework intended for replication under alternative deployment models. Statistical analysis (Kruskal–Wallis and Mann–Whitney tests) confirms significant differences between the methods, while box plots and histograms visualize deviations and the presence of outliers. Unlike earlier studies, this work provides a controlled and replicable benchmarking environment that tests both MySQLi and PDO across multiple API modes and isolates the impact of native versus emulated prepared statements. It also evaluates performance under complex-query workloads that reflect typical reporting and analytics patterns on the ClassicModels schema. To our knowledge, no previous study has analyzed these factors jointly or provided a reusable tool enabling transparent comparison across PHP–MySQL access layers. The findings provide empirical evidence and practical guidelines for choosing the optimal API depending on the application scenario, as well as a tool that can be applied for further testing in various web environments.
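
The repetition-based timing loop behind such a benchmark can be sketched in a few lines. The version below is a Python stand-in for the authors' PHP tool, using SQLite purely so it is self-contained; the queries, row counts, and repetition count are illustrative assumptions.

```python
# Minimal repeated-timing benchmark in the spirit of the study, assuming a
# Python/SQLite stand-in for the paper's PHP/MySQL setup. Only the method
# (many repetitions, summary stats, Mann-Whitney test) mirrors the abstract.
import sqlite3
import statistics
import time

from scipy.stats import mannwhitneyu  # nonparametric test used in the study

def benchmark(conn, query, reps=300):
    """Return per-repetition wall-clock times (seconds) for one query."""
    times = []
    for _ in range(reps):
        t0 = time.perf_counter()
        conn.execute(query).fetchall()
        times.append(time.perf_counter() - t0)
    return times

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
conn.executemany("INSERT INTO orders (total) VALUES (?)",
                 [(i * 1.5,) for i in range(10_000)])

simple = benchmark(conn, "SELECT COUNT(*) FROM orders")
grouped = benchmark(conn, "SELECT total, COUNT(*) FROM orders GROUP BY total")

for name, t in [("simple", simple), ("group-by", grouped)]:
    print(f"{name}: mean={statistics.mean(t)*1e3:.3f} ms "
          f"stdev={statistics.stdev(t)*1e3:.3f} ms")

# Mann-Whitney U: are the two latency distributions significantly different?
stat, p = mannwhitneyu(simple, grouped)
print(f"Mann-Whitney U={stat:.0f}, p={p:.2e}")
```
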
(This article belongs to the Section Computer Science & Engineering)

32 pages, 3530 KB  
Article
Empowering Service Designers with Integrated Modelling Tools: A Model-Driven Approach
by Francisco Javier Pérez-Blanco, Juan Manuel Vara, Cristian Gómez-Macías, David Granada and Carlos Villarrubia
Systems 2025, 13(12), 1107; https://doi.org/10.3390/systems13121107 - 8 Dec 2025
Viewed by 492
Abstract
Service design often involves using diverse business and process modelling notations to represent strategic and operational aspects of services. Although these notations are complementary, no modelling environment currently enables their integrated use. This paper addresses this gap by proposing a model-driven solution that supports multiple modelling notations within a unified environment. The research is guided by the following question: To what extent can a modelling environment that integrates multiple business and process modelling notations benefit service designers? To answer it, the study adopts the Design Science Research (DSR) methodology and develops a prototype integrating several graphical Domain-Specific Languages (DSLs), along with mechanisms for model transformation, traceability, and validation. The prototype was evaluated through a two-phase process: (1) a laboratory case study applying the double diamond model of service design to a real-world scenario, and (2) an empirical study involving nine service design professionals who assessed the tool's usability and the efficiency and completeness of the generated models. Results show that integrating heterogeneous modelling notations through Model-Driven Engineering (MDE) can reduce modelling effort by up to 36.4% and generate models with up to 97.7% completeness, demonstrating not only technical benefits but also contributions to the well-being of designers by reducing cognitive load, fostering consistency, and improving communication among the stakeholders involved in the design process.
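
As a generic illustration of the model-transformation-with-traceability mechanism the abstract mentions (not the authors' tool chain), a minimal sketch can map elements of one notation onto another while recording trace links:

```python
# Generic model-to-model transformation with trace links, in plain Python.
# The element types and the mapping rule are invented for illustration; the
# paper's prototype uses graphical DSLs and MDE tooling instead.
from dataclasses import dataclass

@dataclass
class JourneyStep:          # hypothetical source-notation element
    name: str
    actor: str

@dataclass
class ProcessTask:          # hypothetical target-notation element
    label: str
    lane: str

def transform(steps):
    """Map each journey step to a process task, recording a trace link."""
    tasks, trace = [], []
    for step in steps:
        task = ProcessTask(label=step.name, lane=step.actor)
        tasks.append(task)
        trace.append((step, task))    # trace link: source -> target
    return tasks, trace

journey = [JourneyStep("Request quote", "Customer"),
           JourneyStep("Prepare offer", "Sales")]
tasks, trace = transform(journey)
for src, tgt in trace:
    print(f"{src.name!r} ({src.actor}) -> task {tgt.label!r} in lane {tgt.lane}")
```
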
(This article belongs to the Section Systems Practice in Social Science)

13 pages, 892 KB  
Article
LaserCAD—A Novel Parametric, Python-Based Optical Design Software
by Clemens Anschütz, Joachim Hein, He Zhuang and Malte C. Kaluza
Appl. Sci. 2025, 15(22), 11893; https://doi.org/10.3390/app152211893 - 8 Nov 2025
Viewed by 869
Abstract
In this article, we present LaserCAD, an open-source, script-based software toolkit for the design and visualization of optical setups based on parametric ray tracing. Unlike conventional commercial tools, which focus on complex lens optimization and offer dense GUIs with extensive parameters, LaserCAD is tailored for fast, intuitive modeling of laser beam paths and opto-mechanical assemblies with minimal setup overhead. Written in Python, it allows users to describe optical systems in a language close to geometrical optics, using simple commands with sensible defaults for most parameters. Optical elements, including the required mounts, can be positioned automatically. As a graphical backend, FreeCAD renders 3D models of all components for interactive visualization and post-processing. LaserCAD supports integration with other simulation tools and can automate the creation of alignment aids for 3D printing. This makes it especially suitable for rapid prototyping and lab-ready designs.
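
Paraxial parametric ray tracing of the kind such toolkits build on reduces to 2×2 ray-transfer (ABCD) matrices. The sketch below is textbook geometrical optics in plain Python; it deliberately does not use LaserCAD's API, which the abstract does not spell out.

```python
# Paraxial (ABCD) ray trace through free space and a thin lens: standard
# geometrical optics in plain Python. This is NOT LaserCAD's API, only the
# kind of parametric model such tools are built on.
import numpy as np

def free_space(d):        # propagate a distance d (meters)
    return np.array([[1.0, d], [0.0, 1.0]])

def thin_lens(f):         # thin lens of focal length f (meters)
    return np.array([[1.0, 0.0], [-1.0 / f, 1.0]])

# Ray state: (height y, angle theta).
ray = np.array([1.0e-3, 0.0])          # 1 mm off-axis, parallel to axis

# System: 100 mm of air, a 50 mm lens, then 50 mm to the focal plane.
# Matrices apply right-to-left, so the first element is rightmost.
system = free_space(0.050) @ thin_lens(0.050) @ free_space(0.100)
y, theta = system @ ray
print(f"y = {y*1e3:.4f} mm, theta = {theta*1e3:.3f} mrad")  # y ~ 0 at focus
```
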
(This article belongs to the Special Issue Advances in High-Intensity Lasers and Their Applications)

28 pages, 38011 KB  
Article
On the Use of LLMs for GIS-Based Spatial Analysis
by Roberto Pierdicca, Nikhil Muralikrishna, Flavio Tonetto and Alessandro Ghianda
ISPRS Int. J. Geo-Inf. 2025, 14(10), 401; https://doi.org/10.3390/ijgi14100401 - 14 Oct 2025
Viewed by 3283
Abstract
This paper presents an approach integrating Large Language Models (LLMs), specifically GPT-4 and the open-source DeepSeek-R1, into Geographic Information System (GIS) workflows to enhance the accessibility, flexibility, and efficiency of spatial analysis tasks. We designed and implemented a system capable of interpreting natural language instructions provided by users and translating them into automated GIS workflows through dynamically generated Python scripts. An interactive graphical user interface (GUI), built using CustomTkinter, was developed to enable intuitive user interaction with GIS data and processes, reducing the need for advanced programming or technical expertise. We conducted an empirical evaluation of this approach through a comparative case study involving typical GIS tasks such as spatial data validation, data merging, buffer analysis, and thematic mapping using urban datasets from Pesaro, Italy. The performance of our automated system was directly compared against traditional manual workflows executed by 10 experienced GIS analysts. The results from this evaluation indicate a substantial reduction in task completion time, decreasing from approximately 1 h and 45 min in the manual approach to roughly 27 min using our LLM-driven automation, without compromising analytical quality or accuracy. Furthermore, we systematically evaluated the system’s factual reliability using a diverse set of geospatial queries, confirming robust performance for practical GIS tasks. Additionally, qualitative feedback emphasized improved usability and accessibility, particularly for users without specialized GIS training. These findings highlight the significant potential of integrating LLMs into GISs, demonstrating clear advantages in workflow automation, user-friendliness, and broader adoption of advanced spatial analysis methodologies.
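
The core loop (natural-language instruction in, generated Python GIS script out) might be sketched as follows; call_llm is a hypothetical placeholder for the GPT-4 or DeepSeek-R1 backend, and the prompt wording is an assumption.

```python
# Sketch of the instruction -> generated-script loop the paper describes.
# call_llm() is a hypothetical placeholder for the GPT-4 / DeepSeek-R1
# backend; prompt wording and the review step are assumptions.

PROMPT_TEMPLATE = """You are a GIS assistant. Write a standalone Python
script (e.g., using geopandas) that performs this task on the given data:

Task: {task}
Datasets: {datasets}

Return only Python code."""

def call_llm(prompt: str) -> str:
    # Placeholder: return canned output so the sketch runs offline.
    return "# ...generated geoprocessing script would appear here..."

def run_gis_task(task: str, datasets: list[str]) -> str:
    prompt = PROMPT_TEMPLATE.format(task=task, datasets=", ".join(datasets))
    script = call_llm(prompt)
    # Generated code should be reviewed or sandboxed, never exec'd blindly.
    return script

print(run_gis_task("Buffer all schools by 300 m and count dwellings inside",
                   ["schools.shp", "dwellings.shp"]))
```
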
(This article belongs to the Topic Artificial Intelligence Models, Tools and Applications)

26 pages, 1820 KB  
Article
CLARE: Context-Aware, Interactive Knowledge Graph Construction from Transcripts
by Ryan Henry and Jiaqi Gong
Information 2025, 16(10), 866; https://doi.org/10.3390/info16100866 - 6 Oct 2025
Viewed by 2201
Abstract
Knowledge graphs (KGs) represent a promising approach for detecting and correcting errors in automated audio and video transcripts. Yet the lack of accessible tools leaves human reviewers with limited support, as KG construction from media data often depends on advanced programming or natural language processing expertise. We present the Custom LLM Automated Relationship Extractor (CLARE), a system that lowers this barrier by combining context-aware relation extraction with an interface for transcript correction and KG refinement. Users import time-synchronized media, correct transcripts through linked playback, and generate an editable, searchable KG from the revised text. CLARE supports over 150 large language models (LLMs) and embedding models, including local options suitable for privacy-sensitive data. We evaluated CLARE on the Measure of Information in Nodes and Edges (MINE) benchmark, which pairs articles with ground-truth facts. With minimal parameter tuning, CLARE achieved 82.1% mean fact accuracy, exceeding Knowledge Graph Generation (KGGen, 64.8%) and Graph Retrieval-Augmented Generation (GraphRAG, 48.3%). We further assessed interactive refinement by revisiting the twenty-five lowest-scoring graphs for fifteen minutes each and found that fact accuracy rose by an average of 22.7%. These findings show that CLARE both outperforms prior methods and enables efficient user-driven improvements. By streamlining ingestion, correction, and filtering, CLARE makes KG construction more accessible for researchers working with unstructured data.
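
A bare-bones version of the transcript-to-graph pipeline might look like this; the LLM extraction step is stubbed with canned triples, and networkx stands in for CLARE's own graph store, since the abstract does not detail either.

```python
# Bare-bones transcript -> knowledge-graph pipeline. The extraction step is
# stubbed with canned triples; CLARE's real prompts, models, and storage
# are not described in the abstract, so everything here is illustrative.
import networkx as nx

def extract_triples(text: str) -> list[tuple[str, str, str]]:
    """Stub for LLM relation extraction: (subject, relation, object)."""
    return [("Marie Curie", "won", "Nobel Prize in Physics"),
            ("Marie Curie", "born_in", "Warsaw")]

def build_kg(transcript: str) -> nx.MultiDiGraph:
    g = nx.MultiDiGraph()
    for subj, rel, obj in extract_triples(transcript):
        g.add_edge(subj, obj, relation=rel)   # one edge per extracted fact
    return g

kg = build_kg("corrected transcript text goes here")
for s, o, data in kg.edges(data=True):
    print(f"{s} --{data['relation']}--> {o}")
```
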
(This article belongs to the Section Artificial Intelligence)

23 pages, 552 KB  
Article
Flipping the Script: The Impact of a Blended Literacy Learning Intervention on Comprehension
by Michael J. Hockwater
Educ. Sci. 2025, 15(9), 1147; https://doi.org/10.3390/educsci15091147 - 3 Sep 2025
Viewed by 1427
Abstract
This qualitative action research case study explored how a blended literacy learning intervention combining the flipped classroom model with youth-selected multimodal texts influenced sixth-grade Academic Intervention Services (AIS) students’ comprehension of figurative language. The study was conducted over four months in a New York State middle school and involved seven students identified as at-risk readers. Initially, students engaged with teacher-created instructional videos outside of class and completed analytical activities during class time. However, due to low engagement and limited comprehension gains, the intervention was revised to incorporate student autonomy through the selection of multimodal texts such as graphic novels, song lyrics, and YouTube videos. Data were collected through semi-structured interviews, journal entries, surveys, and classroom artifacts, and then analyzed using inductive coding and member checking. Findings indicate that students demonstrated increased comprehension of figurative language when given choice in both texts and instructional videos. Participants reported increased motivation, deeper engagement, and enhanced meaning-making, particularly when reading texts that reflected their personal interests and experiences. The study concludes that a blended literacy model emphasizing autonomy and multimodality can support comprehension and bridge the gap between in-school and out-of-school literacy practices.
(This article belongs to the Special Issue Digital Literacy Environments and Reading Comprehension)

26 pages, 1255 KB  
Article
Interpretable Knowledge Tracing via Transformer-Bayesian Hybrid Networks: Learning Temporal Dependencies and Causal Structures in Educational Data
by Nhu Tam Mai, Wenyang Cao and Wenhe Liu
Appl. Sci. 2025, 15(17), 9605; https://doi.org/10.3390/app15179605 - 31 Aug 2025
Cited by 6 | Viewed by 2584
Abstract
Knowledge tracing, the computational modeling of student learning progression through sequential educational interactions, represents a critical component for adaptive learning systems and personalized education platforms. However, existing approaches face a fundamental trade-off between predictive accuracy and interpretability: deep sequence models excel at capturing complex temporal dependencies in student interaction data but lack transparency in their decision-making processes, while probabilistic graphical models provide interpretable causal relationships but struggle with the complexity of real-world educational sequences. We propose a hybrid architecture that integrates transformer-based sequence modeling with structured Bayesian causal networks to overcome this limitation. Our dual-pathway design employs a transformer encoder to capture complex temporal patterns in student interaction sequences, while a differentiable Bayesian network explicitly models prerequisite relationships between knowledge components. These pathways are unified through a cross-attention mechanism that enables bidirectional information flow between temporal representations and causal structures. We introduce a joint training objective that simultaneously optimizes sequence prediction accuracy and causal graph consistency, ensuring learned temporal patterns align with interpretable domain knowledge. The model undergoes pre-training on 3.2 million student–problem interactions from diverse MOOCs to establish foundational representations, followed by domain-specific fine-tuning. Comprehensive experiments across mathematics, computer science, and language learning demonstrate substantial improvements: 8.7% increase in AUC over state-of-the-art knowledge tracing models (0.847 vs. 0.779), 12.3% reduction in RMSE for performance prediction, and 89.2% accuracy in discovering expert-validated prerequisite relationships. The model achieves a 0.763 F1-score for early at-risk student identification, outperforming baselines by 15.4%. This work demonstrates that sophisticated temporal modeling and interpretable causal reasoning can be effectively unified for educational applications.
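
A rough prototype of the dual-pathway idea (transformer over interaction sequences, cross-attention against knowledge-component embeddings, a joint predictive-plus-consistency loss) is sketched below in PyTorch; all dimensions, the toy consistency penalty, and the module layout are assumptions, as the abstract gives only the architecture's outline.

```python
# Rough PyTorch prototype of the abstract's dual-pathway design. A
# transformer encodes the (item, correctness) sequence, cross-attention
# mixes in knowledge-component (KC) embeddings, and the loss adds a toy
# consistency penalty on a learnable prerequisite-adjacency matrix.
import torch
import torch.nn as nn

class HybridKT(nn.Module):
    def __init__(self, n_items=100, n_kcs=20, d=64, heads=4, layers=2):
        super().__init__()
        self.item_emb = nn.Embedding(n_items * 2, d)  # item x correctness
        self.kc_emb = nn.Embedding(n_kcs, d)
        layer = nn.TransformerEncoderLayer(d, heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, layers)
        self.cross = nn.MultiheadAttention(d, heads, batch_first=True)
        self.head = nn.Linear(d, 1)
        self.adj = nn.Parameter(torch.zeros(n_kcs, n_kcs))  # prereq logits

    def forward(self, items, correct, kcs):
        h = self.encoder(self.item_emb(items * 2 + correct))  # temporal path
        kc = self.kc_emb(kcs)                                 # causal path
        mixed, _ = self.cross(h, kc, kc)          # cross-attention fusion
        return self.head(mixed).squeeze(-1), self.adj

def joint_loss(logits, targets, adj, lam=0.1):
    bce = nn.functional.binary_cross_entropy_with_logits(logits, targets)
    return bce + lam * adj.abs().mean()  # toy graph-consistency term

model = HybridKT()
items = torch.randint(0, 100, (8, 30))   # batch of 8, 30 interactions each
correct = torch.randint(0, 2, (8, 30))
kcs = torch.randint(0, 20, (8, 30))
logits, adj = model(items, correct, kcs)
joint_loss(logits, correct.float(), adj).backward()
```
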

37 pages, 756 KB  
Review
From Fragment to One Piece: A Review on AI-Driven Graphic Design
by Xingxing Zou, Wen Zhang and Nanxuan Zhao
J. Imaging 2025, 11(9), 289; https://doi.org/10.3390/jimaging11090289 - 25 Aug 2025
Cited by 3 | Viewed by 3101
Abstract
This survey offers a comprehensive overview of advancements in Artificial Intelligence in Graphic Design (AIGD), with a focus on the integration of AI techniques to enhance design interpretation and creative processes. The field is categorized into two primary directions: perception tasks, which involve understanding and analyzing design elements, and generation tasks, which focus on creating new design elements and layouts. The methodology emphasizes the exploration of various subtasks including the perception and generation of visual elements, aesthetic and semantic understanding, and layout analysis and generation. The survey also highlights the role of large language models and multimodal approaches in bridging the gap between localized visual features and global design intent. Despite significant progress, challenges persist in understanding human intent, ensuring interpretability, and maintaining control over multilayered compositions. This survey aims to serve as a guide for researchers, detailing the current state of AIGD and outlining potential future directions.
(This article belongs to the Section AI in Imaging)

28 pages, 8325 KB  
Article
Tunnel Rapid AI Classification (TRaiC): An Open-Source Code for 360° Tunnel Face Mapping, Discontinuity Analysis, and RAG-LLM-Powered Geo-Engineering Reporting
by Seyedahmad Mehrishal, Junsu Leem, Jineon Kim, Yulong Shao, Il-Seok Kang and Jae-Joon Song
Remote Sens. 2025, 17(16), 2891; https://doi.org/10.3390/rs17162891 - 20 Aug 2025
Cited by 1 | Viewed by 2990
Abstract
Accurate and efficient rock mass characterization is essential in geotechnical engineering, yet traditional tunnel face mapping remains time-consuming, subjective, and potentially hazardous. Recent advances in digital technologies and AI offer automation opportunities, but many existing solutions are hindered by slow 3D scanning, computationally intensive processing, and limited integration flexibility. This paper presents Tunnel Rapid AI Classification (TRaiC), an open-source MATLAB-based platform for rapid and automated tunnel face mapping. TRaiC integrates single-shot 360° panoramic photography, AI-powered discontinuity detection, 3D textured digital twin generation, rock mass discontinuity characterization, and Retrieval-Augmented Generation with Large Language Models (RAG-LLM) for automated geological interpretation and standardized reporting. The modular eight-stage workflow includes simplified 3D modeling, trace segmentation, 3D joint network analysis, and rock mass classification using RMR, with outputs optimized for Geo-BIM integration. Initial evaluations indicate substantial reductions in processing time and expert assessment workload. Producing a lightweight yet high-fidelity digital twin, TRaiC enables computational efficiency, transparency, and reproducibility, serving as a foundation for future AI-assisted geotechnical engineering research. Its graphical user interface and well-structured open-source code make it accessible to users ranging from beginners to advanced researchers.
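
For readers unfamiliar with the RMR step, a simplified scorer is sketched below in plain Python (TRaiC itself is MATLAB-based): the five component ratings and the orientation adjustment are the standard RMR89 ingredients, but the example values are invented and TRaiC's own implementation may differ.

```python
# Simplified RMR (Rock Mass Rating) tally: sum of five component ratings
# plus a (usually negative) joint-orientation adjustment, then a class
# lookup. Parameters follow standard RMR89; example values are invented.

def rmr_total(strength, rqd, spacing, condition, groundwater, orientation_adj):
    """Each argument is the rating already assigned to that parameter."""
    return strength + rqd + spacing + condition + groundwater + orientation_adj

def rmr_class(rmr: int) -> str:
    if rmr > 80: return "I (very good rock)"
    if rmr > 60: return "II (good rock)"
    if rmr > 40: return "III (fair rock)"
    if rmr > 20: return "IV (poor rock)"
    return "V (very poor rock)"

# Invented example ratings for one tunnel face:
rmr = rmr_total(strength=7, rqd=13, spacing=10, condition=20,
                groundwater=10, orientation_adj=-5)
print(rmr, rmr_class(rmr))   # 55 -> III (fair rock)
```
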

14 pages, 6060 KB  
Article
Text Typing Using Blink-to-Alphabet Tree for Patients with Neuro-Locomotor Disabilities
by Seungho Lee and Sangkon Lee
Sensors 2025, 25(15), 4555; https://doi.org/10.3390/s25154555 - 23 Jul 2025
Viewed by 866
Abstract
Lou Gehrig’s disease, also known as ALS, is a progressive neurodegenerative condition that weakens muscles and can lead to paralysis as it progresses. For patients with severe paralysis, eye-tracking devices such as an eye mouse enable communication. However, the equipment is expensive, and the calibration process is difficult and frustrating for patients. To alleviate this problem, we propose a simple and efficient method for typing text intuitively with graphical guidance on the screen. Specifically, the method detects patients’ eye blinks in video frames to navigate through three sequential steps, narrowing down the choices from 9 letters, to 3 letters, and finally to a single letter (from a 26-letter alphabet). In this way, a patient is able to type a letter rapidly by blinking a minimum of three times and a maximum of nine times. The proposed method integrates a large language model (LLM) API to further accelerate text input and to correct typographical errors, spacing, and upper/lower case in the resulting sentences. Experiments on ten participants demonstrate that the proposed method significantly outperforms three state-of-the-art methods in both typing speed and typing accuracy, without requiring any calibration process.
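
One plausible reading of the three-step scheme: each step uses one to three blinks to pick one of three groups, shrinking 27 slots (26 letters plus a pad) to 9, then 3, then 1, which matches the stated three-to-nine-blink range. The alphabetical grouping below is an assumption; only the step structure comes from the abstract.

```python
# Three-level blink tree, as read from the abstract: each step uses 1-3
# blinks to pick one of three branches. The alphabetical grouping is an
# assumption; only the 3-step / 3-to-9-blink structure is from the paper.
import string

LETTERS = list(string.ascii_uppercase) + ["_"]   # pad to 27 = 3**3 slots

def select_letter(blinks_per_step: tuple[int, int, int]) -> str:
    """blinks_per_step holds the blink count (1-3) at each of 3 steps."""
    index = 0
    for blinks in blinks_per_step:
        assert 1 <= blinks <= 3
        index = index * 3 + (blinks - 1)   # descend one tree level
    return LETTERS[index]

print(select_letter((1, 1, 1)))   # 3 blinks total -> 'A'
print(select_letter((2, 3, 2)))   # index 1*9 + 2*3 + 1 = 16 -> 'Q'
print(select_letter((3, 3, 3)))   # 9 blinks total -> '_' (pad slot)
```
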
(This article belongs to the Section Biomedical Sensors)

49 pages, 3444 KB  
Article
A Design-Based Research Approach to Streamline the Integration of High-Tech Assistive Technologies in Speech and Language Therapy
by Anna Lekova, Paulina Tsvetkova, Anna Andreeva, Georgi Dimitrov, Tanio Tanev, Miglena Simonska, Tsvetelin Stefanov, Vaska Stancheva-Popkostadinova, Gergana Padareva, Katia Rasheva, Adelina Kremenska and Detelina Vitanova
Technologies 2025, 13(7), 306; https://doi.org/10.3390/technologies13070306 - 16 Jul 2025
Viewed by 2409
Abstract
Currently, high-tech assistive technologies (ATs), particularly Socially Assistive Robots (SARs), virtual reality (VR) and conversational AI (ConvAI), are considered very useful in supporting professionals in Speech and Language Therapy (SLT) for children with communication disorders. However, despite a positive public perception, therapists face difficulties when integrating these technologies into practice due to technical challenges and a lack of user-friendly interfaces. To address this gap, a design-based research approach has been employed to streamline the integration of SARs, VR and ConvAI in SLT, and a new software platform called “ATLog” has been developed for designing interactive and playful learning scenarios with ATs. ATLog’s main features include visual programming with a graphical interface, enabling therapists to intuitively create personalized interactive scenarios without advanced programming skills. The platform follows a subprocess-oriented design, breaking down SAR skills and VR scenarios into microskills represented by pre-programmed graphical blocks, tailored to specific treatment domains, therapy goals, and language skill levels. The ATLog platform was evaluated by 27 SLT experts using the Technology Acceptance Model (TAM) and System Usability Scale (SUS) questionnaires, extended with additional questions specifically focused on ATLog structure and functionalities. According to the SUS results, most of the experts (74%) evaluated ATLog with grades over 70, indicating high acceptance of its usability. Over half (52%) of the experts rated the additional questions focused on ATLog’s structure and functionalities in the A range (90–100), while 26% rated them in the B range (80–89), showing strong acceptance of the platform for creating and running personalized interactive scenarios with ATs. According to the TAM results, experts gave high grades for both perceived usefulness (44% in the A range) and perceived ease of use (63% in the A range).
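
The SUS grades reported above come from the standard Brooke scoring rule, which is worth restating: odd items contribute (response - 1), even items contribute (5 - response), and the sum is scaled by 2.5 onto a 0-100 range. The example responses below are invented.

```python
# Standard SUS scoring (Brooke, 1996): odd items contribute (response - 1),
# even items (5 - response), and the sum is scaled by 2.5 to 0-100.
# The ten example Likert responses are invented.

def sus_score(responses: list[int]) -> float:
    """responses: ten 1-5 Likert answers, item 1 first."""
    assert len(responses) == 10 and all(1 <= r <= 5 for r in responses)
    total = sum((r - 1) if i % 2 == 0 else (5 - r)   # i=0 is item 1 (odd)
                for i, r in enumerate(responses))
    return total * 2.5

print(sus_score([4, 2, 4, 1, 5, 2, 4, 2, 4, 2]))   # 80.0, i.e., "B range"
```
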

8 pages, 702 KB  
Proceeding Paper
Overview of Training LLMs on One Single GPU
by Mohamed Ben jouad and Lotfi Elaachak
Comput. Sci. Math. Forum 2025, 10(1), 14; https://doi.org/10.3390/cmsf2025010014 - 9 Jul 2025
Viewed by 3772
Abstract
Large language models (LLMs) are developing at a rapid pace, which has made it necessary to better understand how they train, especially when faced with resource limitations. This paper examines in detail how various state-of-the-art LLMs train on a single Graphics Processing Unit (GPU), paying close attention to crucial elements like throughput, memory utilization, and training time. We find important trade-offs between model size, batch size, and computational efficiency through empirical evaluation, offering practical advice for streamlining fine-tuning processes in the face of hardware constraints.
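
The model-size/memory trade-off the paper explores can be pre-screened with the usual back-of-envelope budget: mixed-precision Adam training costs roughly 16 bytes per parameter in weights, gradients, and optimizer state, before activations. That rule of thumb is general knowledge, not a figure from this paper.

```python
# Back-of-envelope GPU memory budget for mixed-precision Adam fine-tuning:
# ~2 B fp16 weights + 2 B fp16 grads + 12 B fp32 master copy and Adam m/v
# = ~16 bytes per parameter, before activations. A common rule of thumb,
# not a number from the paper; activations are workload-specific.

BYTES_PER_PARAM = 16

def training_state_gib(n_params: float) -> float:
    return n_params * BYTES_PER_PARAM / 2**30

for name, n in [("125M", 125e6), ("1.3B", 1.3e9), ("7B", 7e9)]:
    print(f"{name}: ~{training_state_gib(n):6.1f} GiB of weight/optimizer "
          f"state (excl. activations)")
# A 24 GiB card therefore cannot fully fine-tune a 7B model (~104 GiB)
# without techniques like LoRA, quantization, or CPU offloading.
```
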

23 pages, 2579 KB  
Article
Multimodal Particulate Matter Prediction: Enabling Scalable and High-Precision Air Quality Monitoring Using Mobile Devices and Deep Learning Models
by Hirokazu Madokoro and Stephanie Nix
Sensors 2025, 25(13), 4053; https://doi.org/10.3390/s25134053 - 29 Jun 2025
Cited by 2 | Viewed by 1287
Abstract
This paper presents a novel approach for predicting Particulate Matter (PM) concentrations using mobile camera devices. In response to persistent air pollution challenges across Japan, we developed a system that utilizes cutting-edge transformer-based deep learning architectures to estimate PM values from imagery captured by smartphone cameras. Our approach employs Contrastive Language–Image Pre-Training (CLIP) as a multimodal framework to extract visual features associated with PM concentration from environmental scenes. We first developed a baseline through comparative analysis of time-series models for 1D PM signal prediction, finding that linear models, particularly NLinear, outperformed complex transformer architectures for short-term forecasting tasks. Building on these insights, we implemented a CLIP-based system for 2D image analysis that achieved a Top-1 accuracy of 0.24 and a Top-5 accuracy of 0.52 when tested on diverse smartphone-captured images. The performance evaluations on Graphics Processing Unit (GPU) and Single-Board Computer (SBC) platforms highlight a viable path toward edge deployment. Processing times of 0.29 s per image on the GPU versus 2.68 s on the SBC demonstrate the potential for scalable, real-time environmental monitoring. We consider that this research connects high-performance computing with energy-efficient hardware solutions, creating a practical framework for distributed environmental monitoring that reduces reliance on costly centralized monitoring systems. Our findings indicate that transformer-based multimodal models present a promising approach for mobile sensing applications, with opportunities for further improvement through seasonal data expansion and architectural refinements.
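
The image-to-PM-band step can be approximated zero-shot with the public CLIP checkpoint via Hugging Face transformers; the PM-band prompts below are invented, and the paper's actual prompt design and any fine-tuning are not specified in the abstract.

```python
# Zero-shot CLIP classification of a smartphone photo into coarse PM bands,
# using the public openai/clip-vit-base-patch32 checkpoint. Prompt wording
# and the band definitions are invented for illustration.
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

prompts = ["a photo of a clear sky with clean air",
           "a photo of a slightly hazy sky",
           "a photo of a heavily polluted, smoggy sky"]

image = Image.open("street_scene.jpg")          # any outdoor photo
inputs = processor(text=prompts, images=image,
                   return_tensors="pt", padding=True)
probs = model(**inputs).logits_per_image.softmax(dim=-1)[0]

for prompt, p in zip(prompts, probs.tolist()):
    print(f"{p:.2f}  {prompt}")
```
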
(This article belongs to the Special Issue Machine Learning and Image-Based Smart Sensing and Applications)

39 pages, 4748 KB  
Article
Harnessing Multi-Modal Synergy: A Systematic Framework for Disaster Loss Consistency Analysis and Emergency Response
by Siqing Shan, Jingyu Su, Junze Li, Yinong Li and Zhongbao Zhou
Systems 2025, 13(7), 498; https://doi.org/10.3390/systems13070498 - 20 Jun 2025
Viewed by 858
Abstract
When a disaster occurs, a large number of social media posts on platforms like Weibo attract public attention with their combination of text and images. However, the consistency between textual descriptions and visual representations varies significantly. Consistent multi-modal data are crucial for helping the public understand the disaster situation and support rescue efforts. This study aims to develop a systematic framework for assessing the consistency of multi-modal disaster-related data on social media. This study explored how the congruence between text and image content affects public engagement and informs strategies for efficient emergency responses. Firstly, the CLIP (Contrastive Language–Image Pre-Training) model was used to mine the disaster correlation, loss category, and severity of the images and text. Then, the consistency of image–text pairs was qualitatively analyzed and quantitatively calculated. Finally, the influence of image–text consistency on public attention was discussed. The experimental findings reveal that the consistency of text and image data significantly influences the degree of public concern. When the consistency increases by 1%, the social attention index will increase by about 0.8%. This shows that consistency is a key factor for attracting public attention and promoting the dissemination of information related to important disasters. The proposed framework offers a robust, systematic approach to analyzing disaster loss information consistency. It allows for the efficient extraction of high-consistency data from vast social media data sets, providing governments and emergency response agencies with timely, accurate insights into disaster situations.
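
One simple way to turn image-text consistency into a number is cosine similarity between CLIP text and image embeddings, sketched below; the checkpoint and example post are illustrative, and the paper's full pipeline (loss category, severity) goes beyond this single score.

```python
# Image-text consistency as cosine similarity between CLIP embeddings.
# The checkpoint and example post are illustrative; the paper's framework
# adds loss-category and severity mining on top of a score like this.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def consistency(post_text: str, image_path: str) -> float:
    image = Image.open(image_path)
    inputs = processor(text=[post_text], images=image,
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        t = model.get_text_features(input_ids=inputs["input_ids"],
                                    attention_mask=inputs["attention_mask"])
        v = model.get_image_features(pixel_values=inputs["pixel_values"])
    t = t / t.norm(dim=-1, keepdim=True)
    v = v / v.norm(dim=-1, keepdim=True)
    return float((t @ v.T).item())          # cosine similarity in [-1, 1]

score = consistency("Flooded street after the typhoon", "post_photo.jpg")
print(f"image-text consistency: {score:.3f}")   # higher = more consistent
```
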
(This article belongs to the Section Systems Practice in Social Science)