Search Results (3,199)

Search Parameters:
Keywords = image retrievals

19 pages, 2875 KiB  
Review
Streamlining ICI Transformed as a Nonnegative System
by David Hyland
Photonics 2025, 12(7), 733; https://doi.org/10.3390/photonics12070733 - 18 Jul 2025
Abstract
More than seventy-five years ago, R. Hanbury Brown and R. Q. Twiss performed the first experiments in quantum optics. At the outset, their results showed great promise for astronomical science, featuring inexpensive hardware, immunity to atmospheric turbulence, and enormous interferometry baselines, and the technique has been put to good use for determining stellar diameters up to the present day. However, for two-dimensional imaging of faint objects, the integration times are prohibitive. Recently, in a sequence of papers, the present author developed a stochastic search algorithm that removes this roadblock, reducing millions of hours to minutes or seconds. The author’s paper entitled “The Rise of the Brown-Twiss Effect” summarized the search algorithm and emphasized its mathematical proofs. The current algorithm is a sequence of six lines of code. The goal of the present article is to streamline the algorithm in the form of a discrete-time dynamic system and to reduce the size of the state space. The previous algorithm used initial conditions that were randomly assorted pixel intensities, mutually statistically independent and uniformly distributed over the range [0, δ], where δ is a (very small) positive constant. The present formulation instead employs a transformation that takes the uniformly distributed phases of the fast Fourier transform of the cross-correlations of the data as initial conditions. We shall see that this strategy yields the simplest discrete-time dynamic system capable of exploring the alternative features and benefits of compartmental nonnegative dynamic systems.
(This article belongs to the Special Issue Optical Imaging and Measurements: 2nd Edition)
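
The abstract names two concrete ingredients: initial conditions taken from the uniformly distributed FFT phases of the cross-correlations, and a nonnegative discrete-time update. The sketch below illustrates only that flavour; the array shapes, the rescaling into [0, δ], and the random nonnegative matrix A are all hypothetical stand-ins, not the author's six-line algorithm.

```python
import numpy as np

def initial_state(cross_corr, delta=1e-6):
    # Uniformly distributed FFT phases of the measured cross-correlations,
    # rescaled into [0, delta] to serve as nonnegative initial intensities
    phase = np.angle(np.fft.fft2(cross_corr))        # values in (-pi, pi]
    return ((phase + np.pi) / (2 * np.pi) * delta).ravel()

def iterate(x, A, steps=100):
    # Compartmental nonnegative update: with A >= 0 and x >= 0 elementwise,
    # every iterate stays in the nonnegative orthant
    for _ in range(steps):
        x = A @ x
        x = x / x.sum()          # renormalize so the state does not blow up
    return x

n = 16
A = np.random.rand(n * n, n * n)                     # hypothetical nonnegative map
x = iterate(initial_state(np.random.rand(n, n)), A)
```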

18 pages, 1016 KiB  
Article
The Relationship Between the Phonological Processing Network and the Tip-of-the-Tongue Phenomenon: Evidence from Large-Scale DTI Data
by Xiaoyan Gong, Ziyi He, Jun Wang and Cheng Wang
Behav. Sci. 2025, 15(7), 977; https://doi.org/10.3390/bs15070977 - 18 Jul 2025
Abstract
The tip-of-the-tongue (TOT) phenomenon is characterized by a temporary inability to retrieve a word despite a strong sense of familiarity. While extensive research has linked phonological processing to TOT, the exact nature of this relationship remains debated. The “blocking hypothesis” suggests that the retrieval of target words is interfered with by phonological neighbors, whereas the “transmission deficit hypothesis” posits that TOT arises from insufficient phonological activation of the target words. This study revisited this issue by examining the relationship between the microstructural integrity of the phonological processing brain network and TOT, utilizing graph-theoretical analyses of neuroimaging data from the Cambridge Centre for Ageing and Neuroscience (Cam-CAN), which included diffusion tensor imaging (DTI) data from 576 participants aged 18–87. The results revealed that global efficiency and mean degree centrality of the phonological processing network positively predicted TOT rates. At the nodal level, the nodal efficiency of the bilateral posterior superior temporal gyrus and the clustering coefficient of the left premotor cortex positively predicted TOT rates, while the degree centrality of the left dorsal superior temporal gyrus (dSTG) and the clustering coefficient of the left posterior supramarginal gyrus (pSMG) negatively predicted TOT rates. Overall, these findings suggest that individuals with a more enriched network of phonological representations tend to experience more TOTs, supporting the blocking hypothesis. Additionally, this study highlights the roles of the left dSTG and pSMG in facilitating word retrieval, potentially reducing the occurrence of TOTs.
(This article belongs to the Section Cognition)
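
For readers unfamiliar with the graph measures named here, a minimal sketch of how they can be computed from a structural connectivity matrix with networkx follows. The random matrix is only a stand-in for the DTI-derived phonological-network connectivity, which is not reproduced here, and this unweighted treatment is a simplification of the paper's analysis.

```python
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)
conn = rng.random((10, 10))               # stand-in for ROI-to-ROI connectivity
conn = np.triu(conn, 1)
conn = conn + conn.T                      # symmetric, zero diagonal
conn[conn < 0.6] = 0                      # sparsify so the graph has structure

G = nx.from_numpy_array(conn)

# Network-level predictors reported in the study
global_efficiency = nx.global_efficiency(G)
mean_degree_centrality = np.mean(list(nx.degree_centrality(G).values()))

# Node-level predictors: nodal efficiency and clustering coefficient
def nodal_efficiency(G, node):
    # mean inverse shortest-path length from `node` to every other node
    lengths = nx.single_source_shortest_path_length(G, node)
    return sum(1.0 / d for n, d in lengths.items() if n != node) / (len(G) - 1)

clustering = nx.clustering(G)             # per-node clustering coefficients
print(global_efficiency, mean_degree_centrality, nodal_efficiency(G, 0))
```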

18 pages, 7358 KiB  
Article
On the Hybrid Algorithm for Retrieving Day and Night Cloud Base Height from Geostationary Satellite Observations
by Tingting Ye, Zhonghui Tan, Weihua Ai, Shuo Ma, Xianbin Zhao, Shensen Hu, Chao Liu and Jianping Guo
Remote Sens. 2025, 17(14), 2469; https://doi.org/10.3390/rs17142469 - 16 Jul 2025
Abstract
Most existing cloud base height (CBH) retrieval algorithms are only applicable to daytime satellite observations due to their dependence on visible observations. This study presents a novel algorithm to retrieve day and night CBH using infrared observations from the geostationary Advanced Himawari Imager (AHI). The algorithm integrates deep learning techniques with a physical model. It first utilizes a convolutional neural network-based model to extract cloud top height (CTH) and cloud water path (CWP) from the AHI infrared observations. Then, a physical model is introduced to relate cloud geometric thickness (CGT) to CWP by constructing a look-up table of effective cloud water content (ECWC). The CBH can thus be obtained by subtracting CGT from CTH. The results demonstrate good agreement between our AHI CBH retrievals and spaceborne active remote sensing measurements, with a mean bias of −0.14 ± 1.26 km against CloudSat-CALIPSO observations at daytime and −0.35 ± 1.84 km against EarthCARE measurements at nighttime. Additional validation against ground-based millimeter-wave cloud radar (MMCR) measurements further confirms the effectiveness and reliability of the proposed algorithm across varying atmospheric conditions and temporal scales.
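
The physical step of the retrieval reduces to a one-line relation: the look-up table converts CWP into a geometric thickness via CGT = CWP / ECWC, and the base height follows by subtraction. A minimal sketch, with a hypothetical `ecwc_lut` callable standing in for the paper's look-up table:

```python
def retrieve_cbh(cth_km, cwp_gm2, ecwc_lut):
    """CBH = CTH - CGT, with CGT obtained from the CWP via an
    effective-cloud-water-content look-up table (hypothetical interface)."""
    ecwc_gm3 = ecwc_lut(cth_km, cwp_gm2)     # effective cloud water content, g m^-3
    cgt_km = cwp_gm2 / ecwc_gm3 / 1000.0     # (g m^-2) / (g m^-3) = m, then -> km
    return cth_km - cgt_km

# e.g. a 280 g m^-2 cloud with top at 8 km and a constant ECWC of 0.14 g m^-3
print(retrieve_cbh(8.0, 280.0, lambda cth, cwp: 0.14))   # -> 6.0 km
```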

23 pages, 29759 KiB  
Article
UAV-Satellite Cross-View Image Matching Based on Adaptive Threshold-Guided Ring Partitioning Framework
by Yushi Liao, Juan Su, Decao Ma and Chao Niu
Remote Sens. 2025, 17(14), 2448; https://doi.org/10.3390/rs17142448 - 15 Jul 2025
Abstract
Cross-view image matching between UAV and satellite platforms is critical for geographic localization but remains challenging due to domain gaps caused by disparities in imaging sensors, viewpoints, and illumination conditions. To address these challenges, this paper proposes an Adaptive Threshold-guided Ring Partitioning Framework (ATRPF) for UAV–satellite cross-view image matching. Unlike conventional ring-based methods with fixed partitioning rules, ATRPF incorporates heatmap-guided adaptive thresholds and learnable hyperparameters to dynamically adjust ring-wise feature extraction regions, significantly enhancing cross-domain representation learning through context-aware adaptability. The framework combines three core components: brightness-aligned preprocessing to reduce illumination-induced domain shifts, hybrid loss functions to improve feature discriminability across domains, and keypoint-aware re-ranking to refine retrieval results by compensating for neural networks’ localization uncertainty. Comprehensive evaluations on the University-1652 benchmark demonstrate the framework’s superiority; it achieves 82.50% Recall@1 and 84.28% AP for UAV→Satellite geo-localization, along with 90.87% Recall@1 and 80.25% AP for Satellite→UAV navigation. These results validate the framework’s capability to bridge UAV–satellite domain gaps while maintaining robust matching precision under heterogeneous imaging conditions, providing a viable solution for practical applications such as UAV navigation in GNSS-denied environments.
(This article belongs to the Special Issue Temporal and Spatial Analysis of Multi-Source Remote Sensing Images)
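
The ring construction can be pictured with a short sketch: ring boundaries are level sets of an attention heatmap rather than fixed radii, and each ring is pooled into its own descriptor. The thresholds are fixed here for clarity, whereas the paper learns them; all names and shapes are illustrative, not the ATRPF implementation.

```python
import numpy as np

def ring_partition(feat, heatmap, thresholds=(0.75, 0.5, 0.25)):
    # Ring boundaries are level sets of the heatmap rather than fixed radii;
    # the thresholds are hard-coded here but learnable in the paper
    bounds = (1.0,) + tuple(thresholds) + (0.0,)
    descriptors = []
    for hi, lo in zip(bounds[:-1], bounds[1:]):
        mask = (heatmap <= hi) & (heatmap > lo)
        if mask.any():
            descriptors.append(feat[:, mask].mean(axis=1))   # ring-wise pooling
        else:
            descriptors.append(np.zeros(feat.shape[0]))
    return np.stack(descriptors)                             # (n_rings, C)

rings = ring_partition(np.random.rand(256, 32, 32), np.random.rand(32, 32))
```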

18 pages, 8486 KiB  
Article
An Efficient Downwelling Light Sensor Data Correction Model for UAV Multi-Spectral Image DOM Generation
by Siyao Wu, Yanan Lu, Wei Fan, Shengmao Zhang, Zuli Wu and Fei Wang
Drones 2025, 9(7), 491; https://doi.org/10.3390/drones9070491 - 11 Jul 2025
Abstract
The downwelling light sensor (DLS) is the industry-standard solution for generating UAV-based digital orthophoto maps (DOMs). Current mainstream DLS correction methods primarily rely on angle compensation. However, due to the temporal mismatch between the DLS sampling intervals and the exposure times of multispectral cameras, as well as external disturbances such as strong wind gusts and abrupt changes in flight attitude, DLS data often become unreliable, particularly at UAV turning points. Building upon traditional angle compensation methods, this study proposes an improved correction approach—FIM-DC (Fitting and Interpolation Model-based Data Correction)—specifically designed for data collection under clear-sky conditions and stable atmospheric illumination, with the goal of significantly enhancing the accuracy of reflectance retrieval. The method’s effectiveness is demonstrated in three respects: (1) field tests conducted in the Qingpu region show that FIM-DC markedly reduces the standard deviation of reflectance at tie points across multiple spectral bands and flight sessions, with the most substantial reduction being from 15.07% to 0.58%; (2) it effectively mitigates inconsistencies in reflectance within image mosaics caused by anomalous DLS readings, thereby improving the uniformity of DOMs; and (3) FIM-DC accurately corrects the spectral curves of six land cover types in anomalous images, making them consistent with those from non-anomalous images. In summary, this study demonstrates that integrating FIM-DC into DLS data correction workflows for UAV-based multispectral imagery significantly enhances reflectance calculation accuracy and provides a robust solution for improving image quality under stable illumination conditions.
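
The core idea of FIM-DC, as described, is that under stable illumination the irradiance series is smooth, so anomalous samples can be replaced by fitted values interpolated at the camera exposure times. The sketch below uses a low-order polynomial fit with a simple 3σ outlier rejection; the paper's actual fitting model and anomaly criterion may differ.

```python
import numpy as np

def correct_dls(t_dls, irr_dls, t_exposure, deg=2):
    # Fit a smooth trend to the DLS irradiance series, reject gross outliers
    # (e.g. readings at turning points), refit, and interpolate the corrected
    # irradiance at the camera exposure timestamps
    coeff = np.polyfit(t_dls, irr_dls, deg)
    residual = np.abs(irr_dls - np.polyval(coeff, t_dls))
    ok = residual < 3 * residual.std()
    coeff = np.polyfit(t_dls[ok], irr_dls[ok], deg)
    return np.polyval(coeff, t_exposure)

t = np.linspace(0, 600, 120)                       # 10-minute flight, 5 s sampling
irr = 1000 + 0.1 * t + np.random.normal(0, 2, t.size)
irr[50:54] += 80                                   # simulated turning-point anomaly
corrected = correct_dls(t, irr, t_exposure=np.linspace(0, 600, 240))
```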

30 pages, 5474 KiB  
Article
WHU-RS19 ABZSL: An Attribute-Based Dataset for Remote Sensing Image Understanding
by Mattia Balestra, Marina Paolanti and Roberto Pierdicca
Remote Sens. 2025, 17(14), 2384; https://doi.org/10.3390/rs17142384 - 10 Jul 2025
Abstract
The advancement of artificial intelligence (AI) in remote sensing (RS) increasingly depends on datasets that offer rich and structured supervision beyond traditional scene-level labels. Although existing benchmarks for aerial scene classification have facilitated progress in this area, their reliance on single-class annotations restricts their application to more flexible, interpretable and generalisable learning frameworks. In this study, we introduce WHU-RS19 ABZSL: an attribute-based extension of the widely adopted WHU-RS19 dataset. This new version comprises 1005 high-resolution aerial images across 19 scene categories, each annotated with a vector of 38 features. These cover objects (e.g., roads and trees), geometric patterns (e.g., lines and curves) and dominant colours (e.g., green and blue), and are defined through expert-guided annotation protocols. To demonstrate the value of the dataset, we conduct baseline experiments using deep learning models adapted for multi-label classification—ResNet18, VGG16, InceptionV3, EfficientNet and ViT-B/16—to capture the semantic complexity characteristic of real-world aerial scenes. The results, measured in terms of macro F1-score, range from 0.7385 for ResNet18 to 0.7608 for EfficientNet-B0. In particular, EfficientNet-B0 and ViT-B/16 are the top performers in terms of the overall macro F1-score and consistency across attributes, while all models show a consistent decline in performance for infrequent or visually ambiguous categories. This confirms that it is feasible to accurately predict semantic attributes in complex scenes. By enriching a standard benchmark with detailed, image-level semantic supervision, WHU-RS19 ABZSL supports a variety of downstream applications, including multi-label classification, explainable AI, semantic retrieval, and attribute-based zero-shot learning (ZSL). It thus provides a reusable, compact resource for advancing the semantic understanding of remote sensing and multimodal AI.
(This article belongs to the Special Issue Remote Sensing Datasets and 3D Visualization of Geospatial Big Data)
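
Adapting a single-label backbone such as ResNet18 to the 38 attribute annotations amounts to swapping the classification head and training with a per-attribute sigmoid loss. A minimal PyTorch sketch of this baseline setup; the batch shapes and dummy targets are illustrative, not the paper's training configuration.

```python
import torch
import torch.nn as nn
from torchvision import models

# ResNet18 with the classification head replaced by 38 attribute logits
model = models.resnet18(weights="IMAGENET1K_V1")
model.fc = nn.Linear(model.fc.in_features, 38)

criterion = nn.BCEWithLogitsLoss()               # independent sigmoid per attribute

logits = model(torch.randn(4, 3, 224, 224))      # dummy batch of aerial images
targets = torch.randint(0, 2, (4, 38)).float()   # hypothetical attribute vectors
loss = criterion(logits, targets)
```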

6 pages, 8447 KiB  
Case Report
Magnetic Mishap: Multidisciplinary Care for Magnet Ingestion in a 2-Year-Old
by Niharika Goparaju, Danielle P. Yarbrough and Gretchen Fuller
Emerg. Care Med. 2025, 2(3), 32; https://doi.org/10.3390/ecm2030032 - 8 Jul 2025
Abstract
Background/Objectives: A 2-year-old male presented to the emergency department (ED) with vomiting and abdominal discomfort following ingestion of multiple magnets from a sibling’s bracelet. This case highlights the risks associated with magnet ingestion and the need for coordinated multidisciplinary care and public health intervention. Methods: Radiographs revealed magnets in the oropharynx, stomach, and small bowel. Emergency physicians coordinated care with otolaryngology, gastroenterology, and general surgery. Results: Laryngoscopy successfully removed two magnets from the uvula, and endoscopy retrieved 30 magnets from the stomach. General surgery performed a diagnostic laparoscopy, identifying residual magnets in the colon. Gastroenterology attempted a colonoscopy but was unable to retrieve magnets due to formed stool, leading to bowel preparation and serial imaging. The patient eventually passed 12 magnets per rectum without surgical intervention. Conclusions: This case emphasizes the importance of multidisciplinary collaboration in managing magnet ingestion, a preventable cause of serious gastrointestinal injury. Recent studies highlight the increasing incidence and severity of such cases due to accessibility and inadequate regulation. These findings underscore the need for public awareness and adherence to management protocols to mitigate morbidity and mortality in pediatric patients.

26 pages, 1804 KiB  
Article
Dependency-Aware Entity–Attribute Relationship Learning for Text-Based Person Search
by Wei Xia, Wenguang Gan and Xinpan Yuan
Big Data Cogn. Comput. 2025, 9(7), 182; https://doi.org/10.3390/bdcc9070182 - 7 Jul 2025
Abstract
Text-based person search (TPS), a critical technology for security and surveillance, aims to retrieve target individuals from image galleries using textual descriptions. The existing methods face two challenges: (1) ambiguous attribute–noun association (AANA), where syntactic ambiguities lead to incorrect associations between attributes and the intended nouns; and (2) textual noise and relevance imbalance (TNRI), where irrelevant or non-discriminative tokens (e.g., ‘wearing’) reduce the saliency of critical visual attributes in the textual description. To address these challenges, we propose the dependency-aware entity–attribute alignment network (DEAAN), a novel framework that explicitly tackles AANA through dependency-guided attention and TNRI via adaptive token filtering. The DEAAN introduces two modules: (1) dependency-assisted implicit reasoning (DAIR) to resolve AANA through syntactic parsing, and (2) relevance-adaptive token selection (RATS) to suppress TNRI by learning token saliency. Experiments on CUHK-PEDES, ICFG-PEDES, and RSTPReid demonstrate state-of-the-art performance, with the DEAAN achieving a Rank-1 accuracy of 76.71% and an mAP of 69.07% on CUHK-PEDES, surpassing RDE by 0.77% in Rank-1 and 1.51% in mAP. Ablation studies reveal that DAIR and RATS individually improve Rank-1 by 2.54% and 3.42%, while their combination elevates the performance by 6.35%, validating their synergy. This work bridges structured linguistic analysis with adaptive feature selection, demonstrating practical robustness in surveillance-oriented TPS scenarios.
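
The AANA problem is essentially a dependency-parsing question: which noun does each attribute token modify? A toy illustration with spaCy (not the paper's parser or model) shows the kind of association DAIR is designed to draw from the parse tree instead of word order:

```python
import spacy

nlp = spacy.load("en_core_web_sm")

def attribute_noun_pairs(caption):
    # Resolve which noun each adjective modifies via the dependency tree,
    # the ambiguity that dependency-guided attention is meant to handle
    doc = nlp(caption)
    return [(tok.text, tok.head.text)
            for tok in doc
            if tok.dep_ == "amod" and tok.head.pos_ == "NOUN"]

print(attribute_noun_pairs("a woman wearing a red jacket and blue jeans"))
# [('red', 'jacket'), ('blue', 'jeans')]
```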

22 pages, 3237 KiB  
Article
Local Polar Coordinate Feature Representation and Heterogeneous Fusion Framework for Accurate Leaf Image Retrieval
by Mengjie Ye, Yong Cheng, Yongqi Yuan, De Yu and Ge Jin
Symmetry 2025, 17(7), 1049; https://doi.org/10.3390/sym17071049 - 3 Jul 2025
Abstract
Leaf shape is a crucial visual cue for plant recognition. However, distinguishing among plants with high inter-class shape similarity remains a significant challenge, especially among cultivars within the same species where shape differences can be extremely subtle. To address this issue, we propose a novel shape representation and an advanced heterogeneous fusion framework for accurate leaf image retrieval. Specifically, based on the local polar coordinate system, multiscale analysis, and statistical histograms, we first propose local polar coordinate feature representation (LPCFR), which captures spatial distribution from two orthogonal directions while encoding local curvature characteristics. Next, we present heterogeneous feature fusion with exponential weighting and ranking (HFER), which enhances the compatibility and robustness of fused features by applying exponential weighted normalization and ranking-based encoding within neighborhood distance measures. Extensive experiments on both species-level and cultivar-level leaf datasets demonstrate that the proposed representation effectively captures shape features, and the fusion framework successfully integrates heterogeneous features, outperforming state-of-the-art (SOTA) methods.
(This article belongs to the Section Computer)
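
As a rough picture of what a local polar-coordinate shape feature looks like, the sketch below histograms a contour point's neighbours over (radius, angle) bins in a frame centred at that point. The actual LPCFR additionally works at multiple scales and encodes curvature, so this is only the basic building block under assumed conventions.

```python
import numpy as np

def local_polar_histogram(contour, idx, k=32, bins=(8, 8)):
    # Histogram the k contour neighbours of point `idx` over (radius, angle)
    # bins in a polar frame centred at that point
    p = contour[idx]
    nbrs = np.roll(contour, k // 2 - idx, axis=0)[:k] - p
    r = np.hypot(nbrs[:, 0], nbrs[:, 1])
    theta = np.arctan2(nbrs[:, 1], nbrs[:, 0])
    hist, _, _ = np.histogram2d(r / (r.max() + 1e-9), theta, bins=bins,
                                range=[[0, 1], [-np.pi, np.pi]])
    return hist.ravel() / k                        # normalized local descriptor

# Toy closed contour: points sampled on an ellipse
t = np.linspace(0, 2 * np.pi, 200, endpoint=False)
contour = np.stack([2 * np.cos(t), np.sin(t)], axis=1)
print(local_polar_histogram(contour, idx=10).shape)   # (64,)
```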

20 pages, 4620 KiB  
Article
An Interactive Human-in-the-Loop Framework for Skeleton-Based Posture Recognition in Model Education
by Jing Shen, Ling Chen, Xiaotong He, Chuanlin Zuo, Xiangjun Li and Lin Dong
Biomimetics 2025, 10(7), 431; https://doi.org/10.3390/biomimetics10070431 - 1 Jul 2025
Abstract
This paper presents a human-in-the-loop interactive framework for skeleton-based posture recognition, designed to support model training and artistic education. A total of 4870 labeled images are used for training and validation, and 500 images are reserved for testing across five core posture categories: standing, sitting, jumping, crouching, and lying. From each image, comprehensive skeletal features are extracted, including joint coordinates, angles, limb lengths, and symmetry metrics. Multiple classification algorithms—traditional (KNN, SVM, Random Forest) and deep learning-based (LSTM, Transformer)—are compared to identify effective combinations of features and models. Experimental results show that deep learning models achieve superior accuracy on complex postures, while traditional models remain competitive with low-dimensional features. Beyond classification, the system integrates posture recognition with a visual recommendation module. Recognized poses are used to retrieve matched examples from a reference library, allowing instructors to browse and select posture suggestions for learners. This semi-automated feedback loop enhances teaching interactivity and efficiency. Among all evaluated methods, the Transformer model achieved the best accuracy of 92.7% on the dataset, demonstrating the effectiveness of our closed-loop framework in supporting pose classification and model training. The proposed framework contributes both algorithmic insights and a novel application design for posture-driven educational support systems.
(This article belongs to the Special Issue Biomimetic Innovations for Human–Machine Interaction)
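
The skeletal features named here (joint angles, limb lengths) reduce to a few lines of vector arithmetic on keypoint coordinates. A minimal sketch; the keypoints below are made-up values standing in for a pose estimator's output.

```python
import numpy as np

def joint_angle(a, b, c):
    # Angle at joint b (degrees) between segments b->a and b->c
    v1, v2 = a - b, c - b
    cosang = v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-9)
    return np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))

def limb_length(a, b):
    return np.linalg.norm(a - b)

# Hypothetical (x, y) keypoints from a pose estimator
hip, knee, ankle = np.array([0.0, 0.0]), np.array([0.1, -0.5]), np.array([0.1, -1.0])
print(joint_angle(hip, knee, ankle), limb_length(hip, knee))  # flexion angle, thigh length
```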

16 pages, 3102 KiB  
Article
Unified Depth-Guided Feature Fusion and Reranking for Hierarchical Place Recognition
by Kunmo Li, Yongsheng Ou, Jian Ning, Fanchang Kong, Haiyang Cai and Haoyang Li
Sensors 2025, 25(13), 4056; https://doi.org/10.3390/s25134056 - 29 Jun 2025
Abstract
Visual Place Recognition (VPR) constitutes a pivotal task in the domains of computer vision and robotics. Prevailing VPR methods predominantly employ RGB-based features for query image retrieval and correspondence establishment. Nevertheless, such unimodal visual representations exhibit inherent susceptibility to environmental variations, inevitably degrading method precision. To address this problem, we propose a robust VPR framework integrating RGB and depth modalities. The architecture employs a coarse-to-fine paradigm, where global retrieval of top-N candidate images is performed using fused multimodal features, followed by a geometric verification of these candidates leveraging depth information. A Discrete Wavelet Transform Fusion (DWTF) module is proposed to generate robust multimodal global descriptors by effectively combining RGB and depth data using the discrete wavelet transform. Furthermore, we introduce a Spiking Neuron Graph Matching (SNGM) module, which extracts geometric structure and spatial distance from depth data and employs graph matching for accurate depth feature correspondence. Extensive experiments on several VPR benchmarks demonstrate that our method achieves state-of-the-art performance while maintaining the best accuracy–efficiency trade-off.
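
A toy version of wavelet-based fusion is shown below, assuming a common average/maximum rule for the approximation and detail bands. The paper's DWTF module learns its combination inside a network, so this conveys only the flavour of the operation.

```python
import numpy as np
import pywt

def dwt_fuse(rgb_map, depth_map, wavelet="haar"):
    # Decompose both single-channel maps, average the approximation band,
    # take the per-coefficient maximum of the detail bands, and invert
    a1, d1 = pywt.dwt2(rgb_map, wavelet)
    a2, d2 = pywt.dwt2(depth_map, wavelet)
    fused_details = tuple(np.maximum(x, y) for x, y in zip(d1, d2))
    return pywt.idwt2(((a1 + a2) / 2, fused_details), wavelet)

fused = dwt_fuse(np.random.rand(64, 64), np.random.rand(64, 64))
```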

17 pages, 5319 KiB  
Article
Quantitative Detection of Floating Debris in Inland Reservoirs Using Sentinel-1 SAR Imagery: A Case Study of Daecheong Reservoir
by Sunmin Lee, Bongseok Jeong, Donghyeon Yoon, Jinhee Lee, Jeongho Lee, Joonghyeok Heo and Moung-Jin Lee
Water 2025, 17(13), 1941; https://doi.org/10.3390/w17131941 - 28 Jun 2025
Abstract
Rapid rises in water levels due to heavy rainfall can lead to the accumulation of floating debris, posing significant challenges for both water quality and resource management. However, real-time monitoring of floating debris remains difficult due to the discrepancy between meteorological conditions and the timing of debris accumulation. To address this limitation, this study proposes an amplitude change detection (ACD) model based on time-series synthetic aperture radar (SAR) imagery, which is less affected by weather conditions. The model statistically distinguishes floating debris from open water based on their differing scattering characteristics. The ACD approach was applied to 18 pairs of Sentinel-1 SAR images acquired over Daecheong Reservoir from June to September 2024. A stringent type I error threshold (α < 1 × 10⁻⁸) was employed to ensure reliable detection. The results revealed a distinct cumulative effect, whereby the detected debris area increased immediately following rainfall events. A positive correlation was observed between 10-day cumulative precipitation and the debris-covered area. For instance, on 12 July, a floating debris area of 0.3828 km² was detected, which subsequently expanded to 0.4504 km² by 24 July. In contrast, on 22 August, when rainfall was negligible, no debris was detected (0 km²), indicating that precipitation was a key factor influencing the detection sensitivity. Comparative analysis with optical imagery further confirmed that floating debris tended to accumulate near artificial barriers and narrow channel regions. Overall, this spatial pattern suggests that the detection results can be used to estimate debris transport pathways and to inform retrieval strategies.
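
Amplitude change detection between an image pair can be sketched as a per-pixel hypothesis test on the intensity ratio. The F-distribution model for multi-looked SAR intensity ratios and the look number below are textbook assumptions, not the paper's exact statistic; only the very strict type I error threshold mirrors the study.

```python
import numpy as np
from scipy import stats

def amplitude_change_mask(amp_ref, amp_test, n_looks=5, alpha=1e-8):
    # Per-pixel test of whether the intensity ratio between two acquisitions
    # is consistent with unchanged scattering; for n-look data the ratio is
    # modelled as F-distributed with (2n, 2n) degrees of freedom
    ratio = (amp_test ** 2) / (amp_ref ** 2 + 1e-12)
    p_hi = stats.f.sf(ratio, 2 * n_looks, 2 * n_looks)    # ratio too large
    p_lo = stats.f.cdf(ratio, 2 * n_looks, 2 * n_looks)   # ratio too small
    return np.minimum(p_hi, p_lo) < alpha / 2             # two-sided decision

mask = amplitude_change_mask(np.random.rand(100, 100) + 0.5,
                             np.random.rand(100, 100) + 0.5)
```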

25 pages, 2892 KiB  
Article
Focal Correlation and Event-Based Focal Visual Content Text Attention for Past Event Search
by Pranita P. Deshmukh and S. Poonkuntran
Computers 2025, 14(7), 255; https://doi.org/10.3390/computers14070255 - 28 Jun 2025
Abstract
Every minute, vast amounts of video and image data are uploaded worldwide to the internet and social media platforms, creating a rich visual archive of human experiences—from weddings and family gatherings to significant historical events such as war crimes and humanitarian crises. When properly analyzed, this multimodal data holds immense potential for reconstructing important events and verifying information. However, challenges arise when images and videos lack complete annotations, making manual examination inefficient and time-consuming. To address this, we propose a novel event-based focal visual content text attention (EFVCTA) framework for automated past event retrieval using visual question answering (VQA) techniques. Our approach integrates a Long Short-Term Memory (LSTM) model with convolutional non-linearity and an adaptive attention mechanism to efficiently identify and retrieve relevant visual evidence alongside precise answers. The model is designed with robust weight initialization, regularization, and optimization strategies and is evaluated on the Common Objects in Context (COCO) dataset. The results demonstrate that EFVCTA achieves the highest performance across all metrics (88.7% accuracy, 86.5% F1-score, 84.9% mAP), outperforming state-of-the-art baselines. The EFVCTA framework demonstrates promising results for retrieving information about past events captured in images and videos and can be effectively applied to scenarios such as documenting training programs, workshops, conferences, and social gatherings in academic institutions.
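
The attention step described — a question encoding steering attention over visual content — can be sketched as a minimal PyTorch module. The dimensions, names, and additive scoring form are assumptions for illustration, not the EFVCTA architecture itself.

```python
import torch
import torch.nn as nn

class QuestionGuidedAttention(nn.Module):
    # Attends over region features using an LSTM question state (toy version)
    def __init__(self, vdim, qdim, hdim=512):
        super().__init__()
        self.proj_v = nn.Linear(vdim, hdim)
        self.proj_q = nn.Linear(qdim, hdim)
        self.score = nn.Linear(hdim, 1)

    def forward(self, v, q):
        # v: (B, R, vdim) region features; q: (B, qdim) question encoding
        h = torch.tanh(self.proj_v(v) + self.proj_q(q).unsqueeze(1))
        w = torch.softmax(self.score(h), dim=1)     # weights over the R regions
        return (w * v).sum(dim=1), w                # attended visual evidence

att = QuestionGuidedAttention(vdim=2048, qdim=512)
evidence, weights = att(torch.randn(2, 36, 2048), torch.randn(2, 512))
```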

11 pages, 4829 KiB  
Brief Report
Differences in Imaging and Histology Between Sinonasal Inverted Papilloma with and Without Squamous Cell Carcinoma
by Niina Kuusisto, Jaana Hagström, Goran Kurdo, Aaro Haapaniemi, Antti Markkola, Antti Mäkitie and Markus Lilja
Diagnostics 2025, 15(13), 1645; https://doi.org/10.3390/diagnostics15131645 - 27 Jun 2025
Abstract
Objectives: Sinonasal inverted papilloma (SNIP) is a rare benign tumor that has potential for malignant transformation, usually into squamous cell carcinoma (SCC). The pre-operative differentiation between SNIP and SNIP-SCC is essential in determining the therapeutic strategy, but it is a challenge, as biopsies may fail to recognize the malignant part of the tumor. Further, a SNIP can also be locally aggressive and thus mimic a malignant tumor. This retrospective study compares the pre-operative differences in computed tomography (CT) and histologic findings between patients with a benign SNIP and those with a SNIP-SCC. Methods: Eight patients with SNIP-SCC were selected from the hospital registries of the Department of Otorhinolaryngology, Helsinki University Hospital (Helsinki, Finland). For each case, a comparable SNIP case without malignancy was selected. Five histopathologic samples of both the SNIP and SNIP-SCC tumors were retrieved. CT images and the histopathologic samples were re-evaluated by two observers. Results: The nasal cavity and ethmoid and maxillary sinuses were the most common sites for both tumor types. The SNIP tumors were mostly unilateral, while the SNIP-SCC tumors were both unilateral and bilateral. Only SNIP-SCC tumors demonstrated bone defects and orbital or intracranial invasion. Dysplastic findings such as dyskeratosis, nuclear atypia, and maturation disturbances were seen only in the SNIP-SCC tumors. Conclusions: Bony destruction and invasion of adjacent structures in pre-operative CT images seem to be pathognomonic signs of SNIP-SCC based on this series. To differentiate between SNIP and SNIP-SCC tumors, all available pre-operative investigations are warranted.
(This article belongs to the Section Medical Imaging and Theranostics)

24 pages, 41430 KiB  
Article
An Optimal Viewpoint-Guided Visual Indexing Method for UAV Autonomous Localization
by Zhiyang Ye, Yukun Zheng, Zheng Ji and Wei Liu
Remote Sens. 2025, 17(13), 2194; https://doi.org/10.3390/rs17132194 - 25 Jun 2025
Abstract
The autonomous positioning of drone-based remote sensing plays an important role in navigation in urban environments. Due to GNSS (Global Navigation Satellite System) signal occlusion, obtaining precise drone locations is still a challenging issue. Inspired by vision-based positioning methods, we propose an autonomous positioning method based on multi-view reference images rendered from the scene’s 3D geometric mesh, applying a bag-of-words (BoW) image retrieval pipeline to achieve efficient and scalable positioning without relying on deep learning-based retrieval or 3D point cloud registration. To minimize the number of reference images, scene coverage quantification and optimization are employed to generate the optimal viewpoints. The proposed method exploits a visual bag-of-words tree to accelerate reference image retrieval and improve retrieval accuracy, and the Perspective-n-Point (PnP) algorithm is then used to obtain the drone’s pose. Experiments conducted in real-world urban scenarios show reduced positioning errors, with accuracy ranging from sub-meter to 5 m and an average latency of 0.7–1.3 s; this indicates that our method significantly improves accuracy and latency, offering robust, real-time performance over extensive areas without relying on GNSS or dense point clouds.
(This article belongs to the Section Engineering Remote Sensing)
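
Once BoW retrieval has returned a reference view rendered from the mesh, pose recovery is a standard PnP problem. A minimal OpenCV sketch with made-up correspondences and intrinsics (the true pose is synthesized so the example is self-checking); the paper's matching front end is not reproduced here.

```python
import cv2
import numpy as np

# 2D-3D matches between query keypoints and mesh points give the pose via PnP
K = np.array([[800.0, 0, 320], [0, 800, 240], [0, 0, 1]])      # toy intrinsics
object_pts = np.random.rand(20, 3) * 10 + [0, 0, 20]           # points in front of camera
rvec_true = np.array([0.05, -0.02, 0.01])
tvec_true = np.array([0.5, -0.3, 2.0])
image_pts, _ = cv2.projectPoints(object_pts, rvec_true, tvec_true, K, None)

ok, rvec, tvec, inliers = cv2.solvePnPRansac(
    object_pts.astype(np.float32), image_pts.astype(np.float32), K, None)
R, _ = cv2.Rodrigues(rvec)
position = (-R.T @ tvec).ravel()      # camera (drone) position in scene coordinates
```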
