Search Results (11)

Search Parameters:
Keywords = bottom-up saliency models

14 pages, 4928 KiB  
Article
Retina-Inspired Models Enhance Visual Saliency Prediction
by Gang Shen, Wenjun Ma, Wen Zhai, Xuefei Lv, Guangyao Chen and Yonghong Tian
Entropy 2025, 27(4), 436; https://doi.org/10.3390/e27040436 - 18 Apr 2025
Viewed by 623
Abstract
Biologically inspired retinal preprocessing improves visual perception by efficiently encoding and reducing entropy in images. In this study, we introduce a new saliency prediction framework that combines a retinal model with deep neural networks (DNNs) using ideas from information theory. By mimicking the human retina, our method creates clearer saliency maps with lower entropy and supports efficient computation with DNNs by optimizing information flow and reducing redundancy. We treat saliency prediction as an information maximization problem, in which important regions carry high information and low local entropy. Tests on several benchmark datasets show that adding the retinal model boosts the performance of various bottom-up saliency prediction methods by better managing information and reducing uncertainty. We use metrics such as mutual information and entropy to measure improvements in accuracy and efficiency. Our framework outperforms state-of-the-art models, producing saliency maps that closely match where people actually look. By combining neurobiological insights with information theory, using measures such as Kullback-Leibler divergence and information gain, our method not only improves prediction accuracy but also offers a clear, quantitative understanding of saliency. This approach shows promise for future research that brings together neuroscience, entropy, and deep learning to enhance visual saliency prediction.
(This article belongs to the Section Information Theory, Probability and Statistics)
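The entropy and Kullback-Leibler divergence measures this abstract leans on are standard saliency-evaluation quantities. A minimal numpy sketch of plausible definitions (the function names and exact formulas are illustrative, not taken from the paper):

import numpy as np

def map_entropy(sal, bins=64):
    # Shannon entropy (bits) of the saliency map's intensity histogram;
    # sharper maps concentrate mass in fewer bins and score lower.
    hist, _ = np.histogram(sal, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def kl_divergence(pred, fix, eps=1e-12):
    # KL divergence of the predicted map from the fixation density map,
    # both normalized to sum to 1; lower means a closer match.
    p = pred / (pred.sum() + eps)
    q = fix / (fix.sum() + eps)
    return float((q * np.log((q + eps) / (p + eps))).sum())

# Toy usage: a diffuse random map versus a compact fixation blob.
rng = np.random.default_rng(0)
pred = rng.random((64, 64))
fix = np.zeros((64, 64)); fix[30:34, 30:34] = 1.0
print(map_entropy(pred), kl_divergence(pred, fix))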

28 pages, 5580 KiB  
Article
An Adaptive Multi-Content Complementary Network for Salient Object Detection
by Lina Huo, Kaidi Guo and Wei Wang
Electronics 2023, 12(22), 4600; https://doi.org/10.3390/electronics12224600 - 10 Nov 2023
Cited by 1 | Viewed by 1272
Abstract
Deep learning methods for salient object detection (SOD) have been studied actively and promisingly. However, existing methods mainly focus on the decoding process and ignore the differences in the contributions of different encoder blocks. To address this problem, we propose an adaptive multi-content complementary network (PASNet) for salient object detection that aims to fully exploit the valuable contextual information in the encoder. Unlike existing CNN-based methods, we adopt the pyramidal vision transformer (PVTv2) as the backbone network to learn global and local representations with its self-attention mechanism. We then follow a coarse-to-fine strategy and introduce two novel modules: an advanced semantic fusion module (ASFM) and a self-refinement module (SRM). The ASFM takes local branches and adjacent branches as inputs and collects semantic and location information of salient objects from high-level features to generate an initial coarse saliency map. The coarse saliency map serves as location guidance for low-level features, and the SRM is applied to capture detailed information hidden in low-level features. We expand the location information with high-level semantics from top to bottom across the salient region, which is effectively fused with detailed information through feature modulation. The model effectively suppresses noise in the features and significantly improves their expressive capabilities. To verify the effectiveness of PASNet, we conducted extensive experiments on five challenging datasets, and the results show that the proposed model is superior to some of the current state-of-the-art methods under different evaluation metrics.
(This article belongs to the Section Artificial Intelligence)
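The ASFM and SRM are not fully specified in the abstract, but the coarse-to-fine idea it describes, an initial coarse saliency map acting as location guidance that modulates low-level features, can be sketched in PyTorch roughly as follows (module and parameter names are hypothetical, not the paper's):

import torch
import torch.nn as nn
import torch.nn.functional as F

class CoarseGuidedRefinement(nn.Module):
    # Illustrative module: the coarse map gates low-level features, and a
    # small convolutional block refines the gated features into a finer map.
    def __init__(self, low_channels):
        super().__init__()
        self.refine = nn.Sequential(
            nn.Conv2d(low_channels, low_channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(low_channels, 1, 3, padding=1),
        )

    def forward(self, low_feat, coarse_map):
        # Upsample the coarse map to the low-level feature resolution.
        guide = F.interpolate(coarse_map, size=low_feat.shape[2:],
                              mode='bilinear', align_corners=False)
        # Feature modulation with a residual path, so detail at locations
        # the coarse map missed is attenuated rather than discarded.
        modulated = low_feat * torch.sigmoid(guide) + low_feat
        return torch.sigmoid(self.refine(modulated))

# Toy usage: 64-channel low-level features, coarse map at quarter resolution.
m = CoarseGuidedRefinement(64)
fine = m(torch.randn(1, 64, 88, 88), torch.randn(1, 1, 22, 22))
print(fine.shape)  # torch.Size([1, 1, 88, 88])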

19 pages, 2924 KiB  
Article
Exploring Focus and Depth-Induced Saliency Detection for Light Field
by Yani Zhang, Fen Chen, Zongju Peng, Wenhui Zou and Changhe Zhang
Entropy 2023, 25(9), 1336; https://doi.org/10.3390/e25091336 - 15 Sep 2023
Cited by 2 | Viewed by 1819
Abstract
An abundance of features in the light field has been demonstrated to be useful for saliency detection in complex scenes. However, bottom-up saliency detection models are limited in their ability to explore light field features. In this paper, we propose a light field saliency detection method that focuses on depth-induced saliency, which can more deeply explore the interactions between different cues. First, we localize a rough saliency region based on the compactness of color and depth. Then, the relationships among depth, focus, and salient objects are carefully investigated, and the focus cue of the focal stack is used to highlight the foreground objects. Meanwhile, the depth cue is utilized to refine the coarse salient objects. Furthermore, considering the consistency of color smoothing and depth space, an optimization model referred to as color and depth-induced cellular automata is improved to increase the accuracy of saliency maps. Finally, to avoid interference of redundant information, the mean absolute error is chosen as the indicator of the filter to obtain the best results. The experimental results on three public light field datasets show that the proposed method performs favorably against the state-of-the-art conventional light field saliency detection approaches and even light field saliency detection approaches based on deep learning.
(This article belongs to the Section Signal and Data Analysis)
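One common reading of a saliency-refining cellular automaton, in the spirit of the color- and depth-induced model above, is an iterative update that pulls each superpixel's saliency toward a similarity-weighted average of the others. A numpy sketch under that assumption (the affinity and update rule are illustrative, not the paper's exact formulation):

import numpy as np

def refine_saliency(sal, color, depth, iters=10, sigma=0.1, keep=0.6):
    # Pairwise affinities from CIE Lab color (n x 3) and depth (n,) distances.
    dc = np.linalg.norm(color[:, None, :] - color[None, :, :], axis=-1)
    dd = np.abs(depth[:, None] - depth[None, :])
    w = np.exp(-((dc + dd) ** 2) / (2 * sigma ** 2))
    np.fill_diagonal(w, 0.0)
    w /= w.sum(axis=1, keepdims=True)  # row-stochastic neighbor weights
    s = sal.copy()
    for _ in range(iters):
        # Cellular-automata-style update: keep part of the current state,
        # blend in the similarity-weighted average of the other cells.
        s = keep * s + (1 - keep) * (w @ s)
    return (s - s.min()) / (s.max() - s.min() + 1e-12)

# Toy usage: 5 superpixels with random normalized colors, depths, saliency.
rng = np.random.default_rng(1)
print(refine_saliency(rng.random(5), rng.random((5, 3)), rng.random(5)))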

21 pages, 7923 KiB  
Article
An Effective Method of Infrared Maritime Target Enhancement and Detection with Multiple Maritime Scene
by Chang Ding, Zhendong Luo, Yifeng Hou, Siyang Chen and Weidong Zhang
Remote Sens. 2023, 15(14), 3623; https://doi.org/10.3390/rs15143623 - 20 Jul 2023
Cited by 4 | Viewed by 1799
Abstract
Aiming at maritime infrared target detection with low contrast under maritime clutter and illumination, this paper proposes a Modified Histogram Equalization with Edge Fusion (MHEEF) pre-processing algorithm for backlight maritime scenes and establishes Local-Contrast Saliency Models with Double Scale and Modes (LCMDSM) for detecting targets with both positive and negative contrast. We propose a local-contrast saliency model with double modes, extending earlier single-mode formulations. The big scale and small scale are then combined into one Target Detection Unit (TDU), which better matches the bottom-up mechanism of the Visual Attention Model (VAM) and identifies targets at a size that approaches their actual shape. In the experiments, cluttered, foggy, backlight, and dim maritime scenes are chosen to verify the effectiveness of the target detection algorithm. The LCMDSM achieves an average Detection Rate (DR) of 98.26% across these maritime scenes and can be used for real-time detection at low computational cost.
(This article belongs to the Special Issue Remote Sensing in Intelligent Maritime Research)
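As an illustration of double-scale local contrast that responds to both positive (bright-on-dark) and negative (dark-on-bright) targets, here is a minimal sketch; the window sizes and the min-combination rule are assumptions, not the LCMDSM's exact design:

import numpy as np
from scipy.ndimage import uniform_filter

def local_contrast(img, center=3, surround=9):
    # |center-window mean - surround-window mean| responds to targets of
    # either contrast polarity, matching the double-mode idea above.
    c = uniform_filter(img.astype(float), size=center)
    s = uniform_filter(img.astype(float), size=surround)
    return np.abs(c - s)

def double_scale_saliency(img):
    # Taking the pixel-wise minimum over a small and a big scale favors
    # regions that stand out at both scales, suppressing one-scale clutter.
    return np.minimum(local_contrast(img, 3, 9), local_contrast(img, 5, 15))

# Toy usage: a dim 4x4 target on a noisy infrared-like background.
rng = np.random.default_rng(2)
frame = rng.normal(0.3, 0.02, (128, 128))
frame[60:64, 60:64] += 0.2
sal = double_scale_saliency(frame)
print(np.unravel_index(sal.argmax(), sal.shape))  # near (61, 61)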

18 pages, 59748 KiB  
Article
A Comparison of Bottom-Up Models for Spatial Saliency Predictions in Autonomous Driving
by Jaime Maldonado and Lino Antoni Giefer
Sensors 2021, 21(20), 6825; https://doi.org/10.3390/s21206825 - 14 Oct 2021
Cited by 4 | Viewed by 3729
Abstract
Bottom-up saliency models identify the salient regions of an image based on features such as color, intensity and orientation. These models are typically used as predictors of human visual behavior and for computer vision tasks. In this paper, we conduct a systematic evaluation of the saliency maps computed with four selected bottom-up models on images of urban and highway traffic scenes. Saliency is investigated both over whole images and at the object level, and characterized in terms of the energy and the entropy of the saliency maps. We identify significant differences with respect to the amount, size and shape complexity of the salient areas computed by different models. Based on these findings, we analyze the likelihood that object instances fall within the salient areas of an image and investigate the agreement between the segments of traffic participants and the saliency maps of the different models. The overall and object-level analysis provides insights into the distinctive features of salient areas identified by different models, which can serve as selection criteria for prospective applications in autonomous driving such as object detection and tracking.
(This article belongs to the Special Issue Advanced Computer Vision Techniques for Autonomous Driving)
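Energy and entropy of a saliency map can be defined in several ways; one plausible pair, sufficient to reproduce the kind of comparison described above, is sketched below (the exact formulas in the paper may differ):

import numpy as np

def saliency_energy(sal):
    # One common reading of 'energy': the sum of squared values of the
    # map after normalization to [0, 1]; diffuse maps accumulate more.
    s = (sal - sal.min()) / (sal.max() - sal.min() + 1e-12)
    return float((s ** 2).sum())

def saliency_entropy(sal, eps=1e-12):
    # Shannon entropy of the map treated as a 2D probability distribution;
    # a few compact blobs give low entropy, diffuse maps give high entropy.
    p = sal / (sal.sum() + eps)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# Toy usage: compare a compact map against a diffuse one.
rng = np.random.default_rng(3)
diffuse = rng.random((64, 64))
compact = np.zeros((64, 64)); compact[10:16, 40:46] = 1.0
print(saliency_entropy(compact), saliency_entropy(diffuse))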

12 pages, 1154 KiB  
Article
Spatially Filtered Emotional Faces Dominate during Binocular Rivalry
by Maria Teresa Turano, Fiorenza Giganti, Gioele Gavazzi, Simone Lamberto, Giorgio Gronchi, Fabio Giovannelli, Andrea Peru and Maria Pia Viggiano
Brain Sci. 2020, 10(12), 998; https://doi.org/10.3390/brainsci10120998 - 17 Dec 2020
Cited by 4 | Viewed by 3454
Abstract
The present investigation explores the role of bottom-up and top-down factors in the recognition of emotional facial expressions during binocular rivalry. We manipulated spatial frequencies (SF) and emotive features and asked subjects to indicate whether the emotional or the neutral expression was dominant during binocular rivalry. Controlling bottom-up saliency with a computational model, physically comparable happy and fearful faces were presented dichoptically with neutral faces. The results showed the dominance of emotional faces over neutral ones. In particular, happy faces were reported more frequently as the first dominant percept, even in the presence of only coarse information (at a low SF level: 2–6 cycles/degree). In line with current theories of emotion processing, the results provide further support for the influence of positive compared to negative meaning on binocular rivalry and, for the first time, show that individuals perceive the affective quality of happiness even in the absence of details in the visual display. Furthermore, our findings advance knowledge of the association between the high- and low-level mechanisms behind binocular rivalry.
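Producing the low-SF stimuli described above amounts to band-pass filtering in the Fourier domain, once the viewing geometry is known to convert cycles/degree into cycles/pixel. A minimal sketch (the pixels-per-degree figure is an assumed example, not from the study):

import numpy as np

def bandpass_sf(img, lo_cpd, hi_cpd, pixels_per_degree):
    # Keep only spatial frequencies between lo_cpd and hi_cpd
    # (cycles/degree) using a hard ring mask in the Fourier domain.
    h, w = img.shape
    fy = np.fft.fftfreq(h)[:, None]  # cycles per pixel, vertical
    fx = np.fft.fftfreq(w)[None, :]  # cycles per pixel, horizontal
    cpd = np.hypot(fy, fx) * pixels_per_degree
    mask = (cpd >= lo_cpd) & (cpd <= hi_cpd)
    return np.real(np.fft.ifft2(np.fft.fft2(img) * mask))

# Toy usage: keep the 2-6 cycles/degree band of a grayscale face image,
# assuming the display subtends roughly 40 pixels per degree.
rng = np.random.default_rng(4)
face = rng.random((256, 256))
low_sf_face = bandpass_sf(face, 2.0, 6.0, pixels_per_degree=40.0)
print(low_sf_face.shape)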

16 pages, 2540 KiB  
Article
Multi-Scale Global Contrast CNN for Salient Object Detection
by Weijia Feng, Xiaohui Li, Guangshuai Gao, Xingyue Chen and Qingjie Liu
Sensors 2020, 20(9), 2656; https://doi.org/10.3390/s20092656 - 6 May 2020
Cited by 9 | Viewed by 3679
Abstract
Salient object detection (SOD) is a fundamental task in computer vision that attempts to mimic the human visual system's ability to respond rapidly to visual stimuli and locate visually salient objects in various scenes. Perceptual studies have revealed that visual contrast is the most important factor in the bottom-up visual attention process. Many of the proposed models predict saliency maps based on the computation of visual contrast between salient regions and backgrounds. In this paper, we design an end-to-end multi-scale global contrast convolutional neural network (CNN) that explicitly learns hierarchical contrast information among global and local features of an image to infer its salient object regions. In contrast to many previous CNN-based saliency methods that apply super-pixel segmentation to obtain homogeneous regions and extract their CNN features before producing saliency maps region-wise, our network requires no pre-processing or additional stages, yet predicts accurate pixel-wise saliency maps. Extensive experiments demonstrate that the proposed network generates high-quality saliency maps that are comparable or even superior to those of state-of-the-art salient object detection architectures.
(This article belongs to the Section Intelligent Sensors)
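The network learns contrast end-to-end, but the underlying notion of global contrast has a simple hand-crafted analogue: each pixel's feature distance from the image's global mean, pooled over scales. A numpy sketch of that analogue (not the paper's learned model):

import numpy as np

def global_contrast(feat):
    # Per-pixel distance from the global mean feature vector.
    # feat: (H, W, C) array of per-pixel features (e.g., Lab color).
    mean = feat.reshape(-1, feat.shape[-1]).mean(axis=0)
    return np.linalg.norm(feat - mean, axis=-1)

def multiscale_global_contrast(feat, scales=(1, 2, 4)):
    # Average the contrast over downsampled copies, upsampled back to the
    # original size by nearest-neighbor repetition.
    h, w, _ = feat.shape
    acc = np.zeros((h, w))
    for s in scales:
        c = global_contrast(feat[::s, ::s])
        acc += np.repeat(np.repeat(c, s, axis=0), s, axis=1)[:h, :w]
    return acc / len(scales)

# Toy usage on random (H, W, 3) per-pixel features.
rng = np.random.default_rng(5)
print(multiscale_global_contrast(rng.random((64, 64, 3))).shape)  # (64, 64)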

18 pages, 4754 KiB  
Article
Unsupervised Saliency Model with Color Markov Chain for Oil Tank Detection
by Ziming Liu, Danpei Zhao, Zhenwei Shi and Zhiguo Jiang
Remote Sens. 2019, 11(9), 1089; https://doi.org/10.3390/rs11091089 - 7 May 2019
Cited by 72 | Viewed by 4205
Abstract
Traditional oil tank detection methods often use geometric shape information. However, it is difficult to guarantee accurate detection under a variety of disturbance factors, especially varying colors, scale differences, and the shadows caused by view angle and illumination. We therefore propose an unsupervised saliency model with a Color Markov Chain (US-CMC) for oil tank detection. To avoid the influence of shadows, we use the CIE Lab space to construct a Color Markov Chain and generate a bottom-up latent saliency map. Moreover, we build a circular feature map based on radial symmetry, which strengthens true targets for the detection task. We then combine the latent saliency map with the circular feature map, which effectively suppresses salient regions other than oil tanks. Extensive experimental results demonstrate that US-CMC outperforms 15 saliency models for remote sensing images (RSIs). Compared with conventional oil tank detection methods, US-CMC achieves better results and is also more robust to view angle, shadow, and shape-similarity problems.
(This article belongs to the Special Issue Remote Sensing for Target Object Detection and Identification)
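The Color Markov Chain can be read as a random walk over superpixels whose transition probabilities grow with CIE Lab color similarity; its stationary distribution concentrates on common colors, so inverting it highlights rare (likely salient) ones. A numpy sketch under that reading (the affinity and inversion are assumptions, not the exact US-CMC construction):

import numpy as np

def color_markov_saliency(lab_colors, sigma=0.2, iters=100):
    # Transition matrix from pairwise Lab color similarity.
    d = np.linalg.norm(lab_colors[:, None, :] - lab_colors[None, :, :],
                       axis=-1)
    t = np.exp(-(d ** 2) / (2 * sigma ** 2))
    np.fill_diagonal(t, 0.0)
    t /= t.sum(axis=1, keepdims=True)  # row-stochastic
    # Power iteration for the stationary distribution of the walk.
    pi = np.full(len(lab_colors), 1.0 / len(lab_colors))
    for _ in range(iters):
        pi = pi @ t
    # Walkers accumulate on common colors; invert so rare colors are salient.
    sal = pi.max() - pi
    return (sal - sal.min()) / (sal.max() - sal.min() + 1e-12)

# Toy usage: four background-like superpixels plus one odd-colored one.
colors = np.array([[0.5, 0.1, 0.1]] * 4 + [[0.9, 0.6, 0.4]], dtype=float)
print(color_markov_saliency(colors))  # the last superpixel scores highest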

14 pages, 5269 KiB  
Article
Superpixel-Based Feature for Aerial Image Scene Recognition
by Hongguang Li, Yang Shi, Baochang Zhang and Yufeng Wang
Sensors 2018, 18(1), 156; https://doi.org/10.3390/s18010156 - 8 Jan 2018
Cited by 22 | Viewed by 5922
Abstract
Image scene recognition is a core technology for many aerial remote sensing applications. Different landforms appear as different scenes in aerial imaging, and all landform information is valuable for aerial image scene recognition. However, the conventional features of the Bag-of-Words model are designed around local points or related information and thus cannot fully describe landform areas. This limitation cannot be ignored when the aim is accurate aerial scene recognition. A novel superpixel-based feature is proposed in this study to characterize aerial image scenes, and a Bag-of-Words scene recognition method for aerial imaging is designed on top of it. The proposed superpixel-based feature exploits landform information, connecting top-level superpixel extraction of landforms to bottom-level expression of feature vectors. The characterization comprises the following steps: simple linear iterative clustering (SLIC)-based superpixel segmentation, adaptive filter bank construction, Lie group-based feature quantification, and visual saliency model-based feature weighting. Scene recognition experiments are carried out on real image data captured by an unmanned aerial vehicle (UAV). The recognition accuracy of the proposed superpixel-based feature is 95.1%, higher than that of scene recognition algorithms based on other local features.
(This article belongs to the Special Issue Sensor Signal and Information Processing)
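The pipeline's first step, SLIC superpixel segmentation followed by a per-superpixel descriptor, is easy to sketch with scikit-image; the mean-color feature below is a simple stand-in for the paper's Lie group-based quantification:

import numpy as np
from skimage.segmentation import slic

def superpixel_features(image, n_segments=200):
    # SLIC segmentation, then a mean-color feature per superpixel.
    labels = slic(image, n_segments=n_segments, compactness=10.0,
                  start_label=0)
    feats = np.stack([image[labels == k].mean(axis=0)
                      for k in np.unique(labels)])
    return labels, feats  # feats: (num_superpixels, 3)

# Toy usage on a random RGB image in [0, 1] standing in for an aerial frame.
rng = np.random.default_rng(6)
labels, feats = superpixel_features(rng.random((120, 160, 3)))
print(labels.shape, feats.shape)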

13 pages, 2068 KiB  
Article
Different Judgments About Visual Textures Invoke Different Eye Movement Patterns
by Richard H.A.H. Jacobs, Remco Renken, Stefan Thumfart and Frans W. Cornelissen
J. Eye Mov. Res. 2009, 3(4), 1-13; https://doi.org/10.16910/jemr.3.4.2 - 15 Oct 2010
Cited by 2 | Viewed by 86
Abstract
Top-down influences on the guidance of the eyes are generally modeled as modulating influences on bottom-up salience maps. Interested in task-driven influences on how, rather than where, the eyes are guided, we expected differences in the eye movement parameters accompanying beauty and roughness judgments about visual textures. Participants judged textures for beauty and roughness while their gaze behavior was recorded. Eye movement parameters differed between the judgments, showing task effects on how people look at images. Similarity in the spatial distribution of attention suggests that differences in the guidance of attention are non-spatial, possibly feature-based. During the beauty judgment, participants fixated on patches that were richer in color information, further supporting the idea that differences in the guidance of attention are feature-based. Shorter fixation durations during beauty judgments may indicate that extraction of the relevant features is easier during this judgment, consistent with a more ambient scanning mode. The differences in eye movement parameters during different judgments about highly repetitive stimuli highlight the need for models of eye guidance to go beyond salience maps and include the temporal dynamics of eye guidance.
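Eye movement parameters such as fixation duration are typically derived from raw gaze samples by a fixation-detection pass; the standard dispersion-threshold (I-DT) algorithm is sketched below (thresholds are typical values, not those used in the study):

import numpy as np

def idt_fixations(x, y, t, disp_thresh=1.0, min_dur=0.1):
    # Dispersion-threshold (I-DT) fixation detection.
    # x, y in degrees of visual angle, t in seconds;
    # returns a list of (onset, duration) pairs.
    fixations, i, n = [], 0, len(t)
    while i < n:
        j = i
        # Grow the window while dispersion, (max-min in x) + (max-min in y),
        # stays under the threshold.
        while j + 1 < n and ((x[i:j+2].max() - x[i:j+2].min()) +
                             (y[i:j+2].max() - y[i:j+2].min())) <= disp_thresh:
            j += 1
        if t[j] - t[i] >= min_dur:
            fixations.append((t[i], t[j] - t[i]))
            i = j + 1
        else:
            i += 1
    return fixations

# Toy usage: 1 s of steady gaze, then a saccade to a second location.
rng = np.random.default_rng(7)
t = np.arange(0.0, 2.0, 0.004)  # 250 Hz samples
x = np.where(t < 1.0, 0.0, 8.0) + rng.normal(0, 0.1, t.size)
y = np.zeros_like(x)
print(idt_fixations(x, y, t))  # roughly two fixations of about 1 s each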

7 pages, 733 KiB  
Article
Probing Bottom-Up Processing with Multistable Images
by Ozgur E. Akman, Richard A. Clement, David S. Broomhead, Sabira Mannan, Ian Moorhead and Hugh R. Wilson
J. Eye Mov. Res. 2007, 1(3), 1-7; https://doi.org/10.16910/jemr.1.3.4 - 9 Feb 2009
Viewed by 80
Abstract
The selection of fixation targets involves a combination of top-down and bottom-up processing. The role of bottom-up processing can be enhanced by using multistable stimuli, because their constantly changing appearance seems to depend predominantly on stimulus-driven factors. We used this approach to investigate whether visual processing models based on V1 need to be extended to incorporate specific computations attributed to V4. Eye movements of 8 subjects were recorded during free viewing of the Marroquin pattern, in which illusory circles appear and disappear. Fixations were concentrated on features arranged in concentric rings within the pattern. Comparison with simulated fixation data demonstrated that the saliency of these features can be predicted with appropriate weighting of lateral connections in existing V1 models.
