Search Results (223)

Search Parameters:
Keywords = visual ambiguity

20 pages, 1204 KiB  
Article
Deep Learning for Visual Leading of Ships: AI for Human Factor Accident Prevention
by Manuel Vázquez Neira, Genaro Cao Feijóo, Blanca Sánchez Fernández and José A. Orosa
Appl. Sci. 2025, 15(15), 8261; https://doi.org/10.3390/app15158261 - 24 Jul 2025
Abstract
Traditional navigation relies on visual alignment with leading lights, a task typically monitored by bridge officers over extended periods. This process can lead to fatigue-related human factor errors, increasing the risk of maritime accidents and environmental damage. To address this issue, this study explores the use of convolutional neural networks (CNNs), evaluating different training strategies and hyperparameter configurations to assist officers in identifying deviations from proper visual leading. Using video data captured from a navigation simulator, we trained a lightweight CNN capable of advising bridge personnel with an accuracy of 86% during night-time operations. Notably, the model demonstrated robustness against visual interference from other light sources, such as lighthouses or coastal lights. The primary source of classification error was linked to images with low bow deviation, largely influenced by human mislabeling during dataset preparation. Future work will focus on refining the classification scheme to enhance model performance. We (1) propose a lightweight CNN based on SqueezeNet for night-time ship navigation, (2) expand the traditional binary risk classification into six operational categories, and (3) demonstrate improved performance over human judgment in visually ambiguous conditions.
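The abstract names SqueezeNet as the backbone and six operational categories as the output space; the sketch below is a minimal, hypothetical version of such a classifier built with torchvision, where everything the abstract does not state (input resolution, weights, preprocessing) is an assumption rather than the authors' configuration.

```python
import torch
import torch.nn as nn
from torchvision import models

# Minimal sketch: SqueezeNet 1.1 re-headed for six deviation categories.
# The six-class output comes from the abstract; everything else is assumed.
model = models.squeezenet1_1(weights=None)
model.classifier[1] = nn.Conv2d(512, 6, kernel_size=1)  # new 6-class head
model.num_classes = 6

frame = torch.randn(1, 3, 224, 224)  # stand-in for one simulator frame
logits = model(frame)
print(logits.shape)  # torch.Size([1, 6])
```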
32 pages, 13059 KiB  
Article
Verifying the Effects of the Grey Level Co-Occurrence Matrix and Topographic–Hydrologic Features on Automatic Gully Extraction in Dexiang Town, Bayan County, China
by Zhuo Chen and Tao Liu
Remote Sens. 2025, 17(15), 2563; https://doi.org/10.3390/rs17152563 - 23 Jul 2025
Abstract
Erosion gullies can reduce arable land area and decrease agricultural machinery efficiency; therefore, automatic gully extraction on a regional scale should be one of the preconditions of gully control and land management. The purpose of this study is to compare the effects of the grey level co-occurrence matrix (GLCM) and topographic–hydrologic features on automatic gully extraction and guide future practices in adjacent regions. To accomplish this, GaoFen-2 (GF-2) satellite imagery and high-resolution digital elevation model (DEM) data were first collected. The GLCM and topographic–hydrologic features were generated, and then, a gully label dataset was built via visual interpretation. Second, the study area was divided into training, testing, and validation areas, and four practices using different feature combinations were conducted. The DeepLabV3+ and ResNet50 architectures were applied to train five models in each practice. Third, the training-set gully intersection over union (IOU), test-set gully IOU, receiver operating characteristic curve (ROC), area under the curve (AUC), user’s accuracy, producer’s accuracy, Kappa coefficient, and gully IOU in the validation area were used to assess the performance of the models in each practice. The results show that the validated gully IOU was 0.4299 (±0.0082) when only the red (R), green (G), blue (B), and near-infrared (NIR) bands were applied, and solely combining the topographic–hydrologic features with the RGB and NIR bands significantly improved the performance of the models, which boosted the validated gully IOU to 0.4796 (±0.0146). Nevertheless, solely combining GLCM features with RGB and NIR bands decreased the accuracy, which resulted in the lowest validated gully IOU of 0.3755 (±0.0229). Finally, by employing the full set of RGB and NIR bands, the GLCM and topographic–hydrologic features obtained a validated gully IOU of 0.4762 (±0.0163) and tended to show an equivalent improvement with the combination of topographic–hydrologic features and RGB and NIR bands. A preliminary explanation is that the GLCM captures the local textures of gullies and their backgrounds, and thus introduces ambiguity and noise into the convolutional neural network (CNN). Therefore, the GLCM tends to provide no benefit to automatic gully extraction with CNN-type algorithms, while topographic–hydrologic features, which are also original drivers of gullies, help determine the possible presence of water-origin gullies when optical bands fail to tell the difference between a gully and its confusing background.
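Two building blocks of this study are easy to illustrate: GLCM texture features (here via scikit-image, a stand-in for the authors' pipeline) and the gully IoU used for validation. The parameter choices below are assumptions, not the study's configuration.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

# GLCM texture features for a toy 8-bit patch (distances/angles assumed).
patch = (np.random.rand(64, 64) * 255).astype(np.uint8)
glcm = graycomatrix(patch, distances=[1], angles=[0, np.pi / 2],
                    levels=256, symmetric=True, normed=True)
for prop in ("contrast", "homogeneity", "energy", "correlation"):
    print(prop, graycoprops(glcm, prop).mean())

def gully_iou(pred, target):
    """Intersection over union of two binary gully masks."""
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return inter / union if union else 0.0

print(gully_iou(np.ones((4, 4), bool), np.eye(4, dtype=bool)))  # 0.25
```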

19 pages, 4665 KiB  
Article
Territorial Ambiguities and Hesitant Identity: A Critical Reading of the Fishing Neighbourhood of Paramos Through Photography
by Jorge Marum and Maria Neto
Arts 2025, 14(4), 81; https://doi.org/10.3390/arts14040081 - 22 Jul 2025
Viewed by 41
Abstract
This article offers a critical reading of the fishing neighbourhood of Paramos, located on the northern coast of Portugal, through a methodological approach that combines documentary photography and cognitive cartography. The study investigates the relationships between identity, landscape, and power within a territory marked by spatial fragmentation, symbolic exclusion, and functional indeterminacy. By means of a structured visual essay supported by field observation and interpretive maps, Paramos is examined as a liminal urban enclave whose ambiguities reveal tensions between memory, informal appropriation, and control devices. Drawing on authors such as Lefebvre, Augé, Hayden, Domingues, Foucault, and Latour, the article argues that the photographic image, used as a critical tool, can unveil hidden territorial logics and contribute to a more inclusive and situated spatial discourse.
(This article belongs to the Section Visual Arts)

28 pages, 3894 KiB  
Review
Where Business Meets Location Intelligence: A Bibliometric Analysis of Geomarketing Research in Retail
by Cristiana Tudor, Aura Girlovan and Cosmin-Alin Botoroga
ISPRS Int. J. Geo-Inf. 2025, 14(8), 282; https://doi.org/10.3390/ijgi14080282 - 22 Jul 2025
Viewed by 207
Abstract
We live in an era where digitalization and omnichannel strategies significantly transform retail landscapes, and accurate spatial analytics from Geographic Information Systems (GIS) can deliver substantial competitive benefits. Nonetheless, despite evident practical advantages for specific targeting strategies and operational efficiency, the degree of GIS integration into academic marketing literature remains ambiguous. Clarifying this uncertainty is beneficial for advancing theoretical understanding and ensuring retail strategies fully leverage robust, data-driven spatial intelligence. To examine the intellectual development of the field, co-occurrence analysis, topic mapping, and citation structure visualization were performed on 4952 peer-reviewed articles using the Bibliometrix R package (version 4.3.3) within R software (version 4.4.1). The results demonstrate that although GIS-based methods have been effectively incorporated into fields like site selection and spatial segmentation, traditional marketing research has not yet entirely adopted them. One of the study’s key findings is the distinction between “author keywords” and “keywords plus”: author keywords concentrate on novel topics like omnichannel retail, artificial intelligence, and logistics, whereas “keywords plus” still reflects more traditional terms such as pricing, customer satisfaction, and consumer behavior. This discrepancy reveals a misalignment between current research trends and indexed classification practices. Although mainstream retail research lacks terminology connected to geomarketing, a theme evolution analysis reveals a growing focus on technology-driven and sustainability-related concepts associated with the Retail 4.0 and 5.0 paradigms. These findings underscore a conceptual and structural deficiency in the literature and indicate the necessity for enhanced integration of GIS and spatial decision support systems (SDSS) in retail marketing.
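The study's co-occurrence analysis was run with the Bibliometrix R package; as a language-neutral illustration of the underlying idea, here is a toy keyword co-occurrence count in Python (the keyword lists are invented).

```python
from collections import Counter
from itertools import combinations

# Toy keyword co-occurrence: count how often keyword pairs appear together.
papers = [
    ["geomarketing", "gis", "retail"],
    ["retail", "omnichannel", "artificial intelligence"],
    ["gis", "site selection", "retail"],
]
pairs = Counter()
for kws in papers:
    pairs.update(combinations(sorted(set(kws)), 2))
print(pairs.most_common(3))
```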

21 pages, 4147 KiB  
Article
AgriFusionNet: A Lightweight Deep Learning Model for Multisource Plant Disease Diagnosis
by Saleh Albahli
Agriculture 2025, 15(14), 1523; https://doi.org/10.3390/agriculture15141523 - 15 Jul 2025
Viewed by 311
Abstract
Timely and accurate identification of plant diseases is critical to mitigating crop losses and enhancing yield in precision agriculture. This paper proposes AgriFusionNet, a lightweight and efficient deep learning model designed to diagnose plant diseases using multimodal data sources. The framework integrates RGB and multispectral drone imagery with IoT-based environmental sensor data (e.g., temperature, humidity, soil moisture), recorded over six months across multiple agricultural zones. Built on the EfficientNetV2-B4 backbone, AgriFusionNet incorporates Fused-MBConv blocks and Swish activation to improve gradient flow, capture fine-grained disease patterns, and reduce inference latency. The model was evaluated using a comprehensive dataset composed of real-world and benchmarked samples, showing superior performance with 94.3% classification accuracy, 28.5 ms inference time, and a 30% reduction in model parameters compared to state-of-the-art models such as Vision Transformers and InceptionV4. Extensive comparisons with both traditional machine learning and advanced deep learning methods underscore its robustness, generalization, and suitability for deployment on edge devices. Ablation studies and confusion matrix analyses further confirm its diagnostic precision, even in visually ambiguous cases. The proposed framework offers a scalable, practical solution for real-time crop health monitoring, contributing toward smart and sustainable agricultural ecosystems.
(This article belongs to the Special Issue Computational, AI and IT Solutions Helping Agriculture)
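As a rough sketch of the multimodal design the abstract describes (image features fused with IoT sensor readings), the snippet below concatenates an image embedding with a three-value sensor vector before classification. The 1792-dimensional feature size, layer widths, and class count are assumptions; SiLU stands in for the Swish activation the paper mentions.

```python
import torch
import torch.nn as nn

class FusionHead(nn.Module):
    """Late fusion of a CNN image embedding with a sensor vector."""
    def __init__(self, img_dim=1792, sensor_dim=3, n_classes=10):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(img_dim + sensor_dim, 256),
            nn.SiLU(),  # SiLU is the same function as Swish
            nn.Linear(256, n_classes),
        )

    def forward(self, img_feat, sensors):
        return self.head(torch.cat([img_feat, sensors], dim=1))

head = FusionHead()
logits = head(torch.randn(4, 1792), torch.randn(4, 3))  # batch of 4
print(logits.shape)  # torch.Size([4, 10])
```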

30 pages, 5474 KiB  
Article
WHU-RS19 ABZSL: An Attribute-Based Dataset for Remote Sensing Image Understanding
by Mattia Balestra, Marina Paolanti and Roberto Pierdicca
Remote Sens. 2025, 17(14), 2384; https://doi.org/10.3390/rs17142384 - 10 Jul 2025
Viewed by 247
Abstract
The advancement of artificial intelligence (AI) in remote sensing (RS) increasingly depends on datasets that offer rich and structured supervision beyond traditional scene-level labels. Although existing benchmarks for aerial scene classification have facilitated progress in this area, their reliance on single-class annotations restricts their application to more flexible, interpretable and generalisable learning frameworks. In this study, we introduce WHU-RS19 ABZSL: an attribute-based extension of the widely adopted WHU-RS19 dataset. This new version comprises 1005 high-resolution aerial images across 19 scene categories, each annotated with a vector of 38 features. These cover objects (e.g., roads and trees), geometric patterns (e.g., lines and curves) and dominant colours (e.g., green and blue), and are defined through expert-guided annotation protocols. To demonstrate the value of the dataset, we conduct baseline experiments using deep learning models adapted for multi-label classification (ResNet18, VGG16, InceptionV3, EfficientNet and ViT-B/16), designed to capture the semantic complexity characteristic of real-world aerial scenes. The results, which are measured in terms of macro F1-score, range from 0.7385 for ResNet18 to 0.7608 for EfficientNet-B0. In particular, EfficientNet-B0 and ViT-B/16 are the top performers in terms of the overall macro F1-score and consistency across attributes, while all models show a consistent decline in performance for infrequent or visually ambiguous categories. This confirms that it is feasible to accurately predict semantic attributes in complex scenes. By enriching a standard benchmark with detailed, image-level semantic supervision, WHU-RS19 ABZSL supports a variety of downstream applications, including multi-label classification, explainable AI, semantic retrieval, and attribute-based zero-shot learning (ZSL). It thus provides a reusable, compact resource for advancing the semantic understanding of remote sensing and multimodal AI.
(This article belongs to the Special Issue Remote Sensing Datasets and 3D Visualization of Geospatial Big Data)
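The reported metric, macro F1 over 38 binary attributes, is straightforward to reproduce with scikit-learn; the 38-attribute shape comes from the abstract, and the data below are random placeholders.

```python
import numpy as np
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=(100, 38))  # 100 images x 38 attributes
y_pred = rng.integers(0, 2, size=(100, 38))
# Macro-averaged F1 treats each attribute equally, as in the paper's results.
print(f1_score(y_true, y_pred, average="macro", zero_division=0))
```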

26 pages, 1804 KiB  
Article
Dependency-Aware Entity–Attribute Relationship Learning for Text-Based Person Search
by Wei Xia, Wenguang Gan and Xinpan Yuan
Big Data Cogn. Comput. 2025, 9(7), 182; https://doi.org/10.3390/bdcc9070182 - 7 Jul 2025
Viewed by 353
Abstract
Text-based person search (TPS), a critical technology for security and surveillance, aims to retrieve target individuals from image galleries using textual descriptions. The existing methods face two challenges: (1) ambiguous attribute–noun association (AANA), where syntactic ambiguities lead to incorrect associations between attributes and the intended nouns; and (2) textual noise and relevance imbalance (TNRI), where irrelevant or non-discriminative tokens (e.g., ‘wearing’) reduce the saliency of critical visual attributes in the textual description. To address these aspects, we propose the dependency-aware entity–attribute alignment network (DEAAN), a novel framework that explicitly tackles AANA through dependency-guided attention and TNRI via adaptive token filtering. The DEAAN introduces two modules: (1) dependency-assisted implicit reasoning (DAIR) to resolve AANA through syntactic parsing, and (2) relevance-adaptive token selection (RATS) to suppress TNRI by learning token saliency. Experiments on CUHK-PEDES, ICFG-PEDES, and RSTPReid demonstrate state-of-the-art performance, with the DEAAN achieving a Rank-1 accuracy of 76.71% and an mAP of 69.07% on CUHK-PEDES, surpassing RDE by 0.77% in Rank-1 and 1.51% in mAP. Ablation studies reveal that DAIR and RATS individually improve Rank-1 by 2.54% and 3.42%, while their combination elevates the performance by 6.35%, validating their synergy. This work bridges structured linguistic analysis with adaptive feature selection, demonstrating practical robustness in surveillance-oriented TPS scenarios.
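The ambiguous attribute–noun association problem is, at heart, a dependency-parsing question. The toy below uses spaCy as a stand-in parser to pair attributes with the nouns they modify; DEAAN's own dependency-guided attention is learned rather than rule-based, so this is only the intuition.

```python
import spacy

# Requires: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")
doc = nlp("A woman wearing a red jacket and blue jeans carries a black bag.")
for tok in doc:
    if tok.dep_ == "amod":  # adjectival modifier -> (attribute, noun) pair
        print(f"{tok.text} -> {tok.head.text}")
# Expected pairs: red -> jacket, blue -> jeans, black -> bag
```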

21 pages, 4859 KiB  
Article
Improvement of SAM2 Algorithm Based on Kalman Filtering for Long-Term Video Object Segmentation
by Jun Yin, Fei Wu, Hao Su, Peng Huang and Yuetong Qixuan
Sensors 2025, 25(13), 4199; https://doi.org/10.3390/s25134199 - 5 Jul 2025
Viewed by 386
Abstract
The Segment Anything Model 2 (SAM2) has achieved state-of-the-art performance in pixel-level object segmentation for both static and dynamic visual content. Its streaming memory architecture maintains spatial context across video sequences, yet struggles with long-term tracking due to its static inference framework. SAM2’s fixed temporal window approach indiscriminately retains historical frames, failing to account for frame quality or dynamic motion patterns. This leads to error propagation and tracking instability in challenging scenarios involving fast-moving objects, partial occlusions, or crowded environments. To overcome these limitations, this paper proposes SAM2Plus, a zero-shot enhancement framework that integrates Kalman filter prediction, dynamic quality thresholds, and adaptive memory management. The Kalman filter models object motion using physical constraints to predict trajectories and dynamically refine segmentation states, mitigating positional drift during occlusions or velocity changes. Dynamic thresholds, combined with multi-criteria evaluation metrics (e.g., motion coherence, appearance consistency), prioritize high-quality frames while adaptively balancing confidence scores and temporal smoothness. This reduces ambiguities among similar objects in complex scenes. SAM2Plus further employs an optimized memory system that prunes outdated or low-confidence entries and retains temporally coherent context, ensuring constant computational resources even for infinitely long videos. Extensive experiments on two video object segmentation (VOS) benchmarks demonstrate SAM2Plus’s superiority over SAM2. It achieves an average improvement of 1.0 in J&F metrics across all 24 direct comparisons, with gains exceeding 2.3 points on SA-V and LVOS datasets for long-term tracking. The method delivers real-time performance and strong generalization without fine-tuning or additional parameters, effectively addressing occlusion recovery and viewpoint changes. By unifying motion-aware physics-based prediction with spatial segmentation, SAM2Plus bridges the gap between static and dynamic reasoning, offering a scalable solution for real-world applications such as autonomous driving and surveillance systems.
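The prediction step SAM2Plus adds can be illustrated with a textbook constant-velocity Kalman filter over an object's (x, y) center; the matrices and noise levels below are illustrative assumptions, not the paper's tuning.

```python
import numpy as np

dt = 1.0  # one frame
F = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1,  0],
              [0, 0, 0,  1]], float)   # state: [x, y, vx, vy]
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], float)    # we observe position only
Q, R = np.eye(4) * 1e-2, np.eye(2) * 1e-1  # process / measurement noise

x, P = np.zeros(4), np.eye(4)
for z in ([1.0, 1.0], [2.1, 1.9], [3.0, 3.2]):       # noisy detections
    x, P = F @ x, F @ P @ F.T + Q                    # predict
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)     # Kalman gain
    x = x + K @ (np.asarray(z) - H @ x)              # update
    P = (np.eye(4) - K @ H) @ P
print(x)  # estimated [x, y, vx, vy] after three frames
```

During an occlusion the update step is simply skipped, and the predicted state carries the track forward, which is the behavior the abstract credits with mitigating positional drift.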

31 pages, 1901 KiB  
Article
The Impact of Color Cues on Word Segmentation by L2 Chinese Readers: Evidence from Eye Movements
by Lin Li, Yaning Ji, Jingxin Wang and Kevin B. Paterson
Behav. Sci. 2025, 15(7), 904; https://doi.org/10.3390/bs15070904 - 3 Jul 2025
Viewed by 256
Abstract
Chinese lacks explicit word boundary markers, creating frequent temporary segmental ambiguities where character sequences permit multiple plausible lexical analyses. Skilled native (L1) Chinese readers resolve these ambiguities efficiently. However, mechanisms underlying word segmentation in second language (L2) Chinese reading remain poorly understood. Our study investigated: (1) whether L2 readers experience greater difficulty processing temporary segmental ambiguities compared to L1 readers, and (2) whether visual boundary cues can facilitate ambiguity resolution in L2 reading. We measured the eye movements of 102 skilled L1 and 60 high-proficiency L2 readers for sentences containing temporarily ambiguous three-character incremental words (e.g., “音乐剧” [musical]), where the initial two characters (“音乐” [music]) also form a valid word. Sentences were presented using either neutral mono-color displays providing no segmentation cues, or color-coded displays marking word boundaries. The color-coded displays employed either uniform coloring to promote resolution of the segmental ambiguity or contrasting colors for the two-character embedded word versus the final character to induce a segmental misanalysis. The L2 group read more slowly than the L1 group, employing a cautious character-by-character reading strategy. Both groups nevertheless appeared to process the segmental ambiguity effectively, suggesting shared segmentation strategies. The L1 readers showed little sensitivity to visual boundary cues, with little evidence that this influenced their ambiguity processing. By comparison, L2 readers showed greater sensitivity to these cues, with some indication that they affected ambiguity processing. The overall sentence-level effects of color coding word boundaries were nevertheless modest for both groups, suggesting little influence of visual boundary cues on overall reading fluency for either L1 or L2 readers.

18 pages, 4979 KiB  
Systematic Review
Discordant High-Gradient Aortic Stenosis: A Systematic Review
by Nadera N. Bismee, Mohammed Tiseer Abbas, Hesham Sheashaa, Fatmaelzahraa E. Abdelfattah, Juan M. Farina, Kamal Awad, Isabel G. Scalia, Milagros Pereyra Pietri, Nima Baba Ali, Sogol Attaripour Esfahani, Omar H. Ibrahim, Steven J. Lester, Said Alsidawi, Chadi Ayoub and Reza Arsanjani
J. Cardiovasc. Dev. Dis. 2025, 12(7), 255; https://doi.org/10.3390/jcdd12070255 - 3 Jul 2025
Viewed by 324
Abstract
Aortic stenosis (AS), the most common valvular heart disease, is traditionally graded based on several echocardiographic quantitative parameters, such as aortic valve area (AVA), mean pressure gradient (MPG), and peak jet velocity (Vmax). This systematic review evaluates the clinical significance and prognostic implications of discordant high-gradient AS (DHG-AS), a distinct hemodynamic phenotype characterized by elevated MPG despite a preserved AVA (>1.0 cm²). Although often overlooked, DHG-AS presents unique diagnostic and therapeutic challenges, as high gradients remain a strong predictor of adverse outcomes despite moderately reduced AVA. Sixty-three studies were included following rigorous selection and quality assessment of the key studies. Prognostic outcomes across five key studies were discrepant: some showed better survival in DHG-AS compared to concordant high-gradient AS (CHG-AS), while others reported similar or worse outcomes. For instance, a retrospective observational study including 3209 patients with AS found higher mortality in CHG-AS (unadjusted HR: 1.4; 95% CI: 1.1 to 1.7), whereas another retrospective multicenter study including 2724 patients with AS observed worse outcomes in DHG-AS (adjusted HR: 1.59; 95% CI: 1.04 to 2.56). These discrepancies may stem from delays in intervention or heterogeneity in study populations. Despite the diagnostic ambiguity, the presence of high gradients warrants careful evaluation, aggressive risk stratification, and timely management. Current guidelines recommend a multimodal approach combining echocardiography, computed tomography (CT) calcium scoring, transesophageal echocardiography (TEE) planimetry, and, when needed, catheterization. Anatomic AVA assessment by TEE, CT, and cardiac magnetic resonance imaging (CMR) can improve diagnostic accuracy by directly visualizing valve morphology and planimetry-based AVA, helping to clarify the true severity in discordant cases. However, these modalities are limited by factors such as image quality (especially with TEE), radiation exposure and contrast use (in CT), and availability or contraindications (in CMR). Management remains largely based on CHG-AS protocols, with intervention primarily guided by transvalvular gradient and symptom burden. The variability among the different guidelines in defining severity and therapeutic thresholds highlights the need for tailored approaches in DHG-AS. DHG-AS is clinically relevant and associated with substantial prognostic uncertainty. Timely recognition and individualized treatment could improve outcomes in this complex subgroup.
(This article belongs to the Special Issue Cardiovascular Imaging in Heart Failure and in Valvular Heart Disease)

23 pages, 2463 KiB  
Article
MCDet: Target-Aware Fusion for RGB-T Fire Detection
by Yuezhu Xu, He Wang, Yuan Bi, Guohao Nie and Xingmei Wang
Forests 2025, 16(7), 1088; https://doi.org/10.3390/f16071088 - 30 Jun 2025
Viewed by 269
Abstract
Forest fire detection is vital for ecological conservation and disaster management. Existing visual detection methods exhibit instability in smoke-obscured or illumination-variable environments. Although multimodal fusion has demonstrated potential, effectively resolving inconsistencies in smoke features across diverse modalities remains a significant challenge. This issue stems from the inherent ambiguity between regions characterized by high temperatures in infrared imagery and those with elevated brightness levels in visible-light imaging systems. In this paper, we propose MCDet, an RGB-T forest fire detection framework incorporating target-aware fusion. To alleviate cross-modal feature ambiguity, we design a Multidimensional Representation Collaborative Fusion module (MRCF), which constructs global feature interactions via a state-space model and enhances local detail perception through deformable convolution. Then, a content-guided attention network (CGAN) is introduced to aggregate multidimensional features through a dynamic gating mechanism. Building upon this foundation, the integration of WIoU further suppresses vegetation occlusion and illumination interference on a holistic level, thereby reducing the false detection rate. Evaluated on three forest fire datasets and one pedestrian dataset, MCDet achieves a mean detection accuracy of 77.5%, surpassing advanced methods. This performance makes MCDet a practical solution to enhance early warning system reliability.
(This article belongs to the Special Issue Advanced Technologies for Forest Fire Detection and Monitoring)
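One way to read the "dynamic gating mechanism" is as a learned per-pixel blend of visible and thermal features; the sketch below is our interpretation of that idea, not the paper's MRCF or CGAN modules, and all layer sizes are invented.

```python
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    """Blend RGB and thermal feature maps with a learned sigmoid gate."""
    def __init__(self, channels=64):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, rgb, thermal):
        g = self.gate(torch.cat([rgb, thermal], dim=1))  # weights in (0, 1)
        return g * rgb + (1 - g) * thermal

fuse = GatedFusion()
out = fuse(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32))
print(out.shape)  # torch.Size([1, 64, 32, 32])
```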

27 pages, 5780 KiB  
Article
Utilizing GCN-Based Deep Learning for Road Extraction from Remote Sensing Images
by Yu Jiang, Jiasen Zhao, Wei Luo, Bincheng Guo, Zhulin An and Yongjun Xu
Sensors 2025, 25(13), 3915; https://doi.org/10.3390/s25133915 - 23 Jun 2025
Viewed by 478
Abstract
The technology of road extraction serves as a crucial foundation for urban intelligent renewal and green sustainable development. Its outcomes can optimize transportation network planning, reduce resource waste, and enhance urban resilience. Deep learning-based approaches have demonstrated outstanding performance in road extraction, particularly excelling in complex scenarios. However, extracting roads from remote sensing data remains challenging due to several factors that limit accuracy: (1) Roads often share similar visual features with the background, such as rooftops and parking lots, leading to ambiguous inter-class distinctions; (2) Roads in complex environments, such as those occluded by shadows or trees, are difficult to detect. To address these issues, this paper proposes an improved model based on Graph Convolutional Networks (GCNs), named FR-SGCN (Hierarchical Depth-wise Separable Graph Convolutional Network Incorporating Graph Reasoning and Attention Mechanisms). The model is designed to enhance the precision and robustness of road extraction through intelligent techniques, thereby supporting precise planning of green infrastructure. First, high-dimensional features are extracted using ResNeXt, whose grouped convolution structure balances parameter efficiency and feature representation capability, significantly enhancing the expressiveness of the data. These high-dimensional features are then segmented, and enhanced channel and spatial features are obtained via attention mechanisms, effectively mitigating background interference and intra-class ambiguity. Subsequently, a hybrid adjacency matrix construction method is proposed, based on gradient operators and graph reasoning. This method integrates similarity and gradient information and employs graph convolution to capture the global contextual relationships among features. To validate the effectiveness of FR-SGCN, we conducted comparative experiments using 12 different methods on both a self-built dataset and a public dataset. The proposed model achieved the highest F1 score on both datasets. Visualization results from the experiments demonstrate that the model effectively extracts occluded roads and reduces the risk of redundant construction caused by data errors during urban renewal. This provides reliable technical support for smart cities and sustainable development.
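The graph-convolution step at the core of any GCN-based model like FR-SGCN is compact enough to show directly; below is the standard normalized-adjacency propagation rule (Kipf-Welling form) on a toy graph. The paper's hybrid adjacency construction and attention modules are not reproduced here.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One propagation step: ReLU(D^-1/2 (A+I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])                   # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(1)))
    return np.maximum(d_inv_sqrt @ A_hat @ d_inv_sqrt @ H @ W, 0.0)

A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], float)       # toy 3-node path graph
H = np.random.rand(3, 8)               # node features
W = np.random.rand(8, 4)               # learned weights
print(gcn_layer(A, H, W).shape)        # (3, 4)
```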

21 pages, 4734 KiB  
Article
Youth Data Visualization Practices: Rhetoric, Art, and Design
by Joy G. Bertling and Lynn Hodge
Educ. Sci. 2025, 15(6), 781; https://doi.org/10.3390/educsci15060781 - 19 Jun 2025
Viewed by 414
Abstract
In the recent K-12 educational literature, arts-based data visualization has been positioned as a compelling means of rendering data science and statistical learning accessible, motivating, and empowering for youth, as data users and producers. However, the only research to attend carefully to youth’s data-based, artistic storytelling practices has been limited in scope to specific storytelling mechanisms, like youth’s metaphor usage. Engaging in design-based research, we sought to understand the art and design decisions that youth make and the data-based arguments and stories that youth tell through their arts-based data visualizations. We drew upon embodied theory to acknowledge the holistic, synergistic, and situated nature of student learning and making. Corresponding with emerging accounts of youth arts-based data visualization practices, we saw regular evidence of art, storytelling, and personal subjectivities intertwining. Contributing to this literature, we found that these intersections surfaced in a number of domains, including youth’s pictorial symbolism, visual encoding strategies, and data decisions like manifold pictorial symbols arranged to support complex, multilayered, ambiguous narratives; qualitative data melding community and personal lived experience; and singular statements making persuasive appeals. This integration of art, story, agency, and embodiment often manifested in ways that seemed to jostle against traditional notions of and norms surrounding data science.
(This article belongs to the Section Curriculum and Instruction)

20 pages, 1233 KiB  
Article
What Could Possibly Go Wrong? Exploring Challenges and Mitigation Strategies of Applying a Living Lab Approach in an Innovation Project
by Elias Blanckaert, Louise Hallström, Iris Jennes and Wendy Van den Broeck
Sustainability 2025, 17(12), 5496; https://doi.org/10.3390/su17125496 - 14 Jun 2025
Viewed by 520
Abstract
The living lab methodology is widely used in innovation projects to drive user-centered development. While its benefits, such as co-creation and real-world validation, are well known, its implementation presents challenges that remain underexplored. This study examines these challenges by using the Horizon 2020 Möbius project as a case study. While the Möbius project itself aimed to modernize European book publishing through an immersive reading application and a data visualization tool, this study reflects on the implementation process of the living lab approach within that context, using an action research approach. After project completion, a structured brainstorming session reviewed identified challenges and mitigation strategies. Findings highlight three key challenges. First, misalignment between assumed and actual stakeholder needs hindered industry engagement. Second, recruitment was complicated by the ambiguous use of “prosumer”, causing confusion among participants. Third, communication gaps and personnel changes disrupted the integration of user feedback into development cycles. These challenges underscore the need for early and continuous stakeholder alignment, adaptive communication, and structured knowledge management. Based on these findings, the study proposes strategies to improve engagement and integrate user insights more effectively, ultimately enhancing the impact of living lab-based innovation projects.
(This article belongs to the Special Issue Sustainable Impact and Systemic Change via Living Labs)

18 pages, 4982 KiB  
Article
Unsupervised Clustering and Ensemble Learning for Classifying Lip Articulation in Fingerspelling
by Nurzada Amangeldy, Nazerke Gazizova, Marek Milosz, Bekbolat Kurmetbek, Aizhan Nazyrova and Akmaral Kassymova
Sensors 2025, 25(12), 3703; https://doi.org/10.3390/s25123703 - 13 Jun 2025
Viewed by 366
Abstract
This paper presents a new methodology for analyzing lip articulation during fingerspelling aimed at extracting robust visual patterns that can overcome the inherent ambiguity and variability of lip shape. The proposed approach is based on unsupervised clustering of lip movement trajectories to identify consistent articulatory patterns across different time profiles. The methodology is not limited to a single model: it includes the exploration of varying cluster configurations, an assessment of their robustness, and a detailed analysis of the correspondence between individual alphabet letters and specific clusters. In contrast to direct classification based on raw visual features, this approach pre-tests clustered representations using a model-based assessment of their discriminative potential. This structured approach enhances the interpretability and robustness of the extracted features, highlighting the importance of lip dynamics as an auxiliary modality in multimodal sign language recognition. The obtained results demonstrate that trajectory clustering can serve as a practical method for generating features, providing more accurate and context-sensitive gesture interpretation.
(This article belongs to the Section Intelligent Sensors)
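The clustering step is easy to mock up: flatten each lip-landmark trajectory into a vector and run k-means. Trajectory length, landmark dimensionality, and the number of clusters below are all invented for illustration; the paper explores multiple cluster configurations rather than fixing one.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
# 200 trajectories, each 30 frames of one (x, y) lip landmark, flattened.
trajectories = rng.random((200, 30 * 2))
kmeans = KMeans(n_clusters=8, n_init=10, random_state=0).fit(trajectories)
print(np.bincount(kmeans.labels_))  # how many trajectories per pattern
```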