Search Results (171)

Search Parameters:
Keywords = egocentric

15 pages, 554 KiB  
Article
The Relationship Between Kindness and Transgressive Behaviors in Adolescence: The Moderating Role of Self-Importance of Moral Identity
by Claudia Russo, Ioana Zagrean, Lucrezia Cavagnis, Sara Cristalli, Valentina Valtulini, Francesca Danioni and Daniela Barni
Adolescents 2025, 5(3), 40; https://doi.org/10.3390/adolescents5030040 - 1 Aug 2025
Viewed by 144
Abstract
Adolescence is marked by identity formation and moral development, often accompanied by increased transgressive behaviors. While existing research highlights the interplay between moral constructs and transgression in adolescence, the role of kindness remains underexamined. This study conceptualizes kindness as a multidimensional moral construct and investigates the relationship between different stages of kindness (i.e., egocentric, social/normative, extrinsically motivated, authentic) and transgressive behaviors among adolescents, also considering the moderating role of self-importance of moral identity. The participants were 215 Italian adolescents (aged 15–19) who completed a self-report questionnaire. The results showed that egocentric and authentic kindness were positively and negatively associated with transgression, respectively. Moreover, moral identity significantly enhanced the protective role of authentic kindness. These findings suggest that the relationship between kindness and transgression varies based on the stage of kindness and the importance adolescents attribute to their moral identity. They contribute to extending the understanding of kindness during adolescence, offering implications for reducing transgressive behaviors through targeted and innovative interventions. Full article

18 pages, 2688 KiB  
Article
Generalized Hierarchical Co-Saliency Learning for Label-Efficient Tracking
by Jie Zhao, Ying Gao, Chunjuan Bo and Dong Wang
Sensors 2025, 25(15), 4691; https://doi.org/10.3390/s25154691 - 29 Jul 2025
Viewed by 129
Abstract
Visual object tracking is one of the core techniques in human-centered artificial intelligence and is very useful for human–machine interaction. State-of-the-art tracking methods have shown their robustness and accuracy on many challenges. However, fully supervised training of these models requires a large number of videos with precise, dense annotations. Because annotating videos frame by frame is labor- and time-consuming, reducing the reliance on manual annotations when training tracking models is an important open problem. To balance annotation cost against tracking performance, we propose a weakly supervised tracking method based on co-saliency learning, which can be flexibly integrated into various tracking frameworks to reduce annotation costs and further enhance the target representation in current search images. Since our method enables the model to explore valuable visual information from unlabeled frames and to calculate co-salient attention maps based on multiple frames, it achieves performance competitive with fully supervised baseline trackers while using only 3.33% of the manual annotations. We integrate our method into two CNN-based trackers and a Transformer-based tracker; extensive experiments on four general tracking benchmarks demonstrate its effectiveness. Furthermore, we demonstrate the advantages of our method on the egocentric tracking task: our weakly supervised method obtains a success score of 0.538 on TREK-150, exceeding the prior state-of-the-art fully supervised tracker by 7.7%. Full article

35 pages, 2865 KiB  
Article
eyeNotate: Interactive Annotation of Mobile Eye Tracking Data Based on Few-Shot Image Classification
by Michael Barz, Omair Shahzad Bhatti, Hasan Md Tusfiqur Alam, Duy Minh Ho Nguyen, Kristin Altmeyer, Sarah Malone and Daniel Sonntag
J. Eye Mov. Res. 2025, 18(4), 27; https://doi.org/10.3390/jemr18040027 - 7 Jul 2025
Viewed by 485
Abstract
Mobile eye tracking is an important tool in psychology and human-centered interaction design for understanding how people process visual scenes and user interfaces. However, analyzing recordings from head-mounted eye trackers, which typically include an egocentric video of the scene and a gaze signal, is a time-consuming and largely manual process. To address this challenge, we develop eyeNotate, a web-based annotation tool that enables semi-automatic data annotation and learns to improve from corrective user feedback. Users can manually map fixation events to areas of interest (AOIs) in a video-editing-style interface (baseline version). Further, our tool can generate fixation-to-AOI mapping suggestions based on a few-shot image classification model (IML-support version). We conduct an expert study with trained annotators (n = 3) to compare the baseline and IML-support versions. We measure the perceived usability, annotations’ validity and reliability, and efficiency during a data annotation task. We asked our participants to re-annotate data from a single individual using an existing dataset (n = 48). Further, we conducted a semi-structured interview to understand how participants used the provided IML features and assessed our design decisions. In a post hoc experiment, we investigate the performance of three image classification models in annotating data of the remaining 47 individuals. Full article

26 pages, 2873 KiB  
Article
Interactive Content Retrieval in Egocentric Videos Based on Vague Semantic Queries
by Linda Ablaoui, Wilson Estecio Marcilio-Jr, Lai Xing Ng, Christophe Jouffrais and Christophe Hurter
Multimodal Technol. Interact. 2025, 9(7), 66; https://doi.org/10.3390/mti9070066 - 30 Jun 2025
Viewed by 602
Abstract
Retrieving specific, often instantaneous, content from hours-long egocentric video footage based on hazily remembered details is challenging. Vision–language models (VLMs) have been employed to enable zero-shot, text-based content retrieval from videos. However, they fall short if the textual query contains ambiguous terms or if users do not specify their queries in enough detail, leading to vague semantic queries. Such queries can refer to several different video moments, not all of which are relevant, making it harder to pinpoint content. We investigate the requirements for an egocentric video content retrieval framework that helps users handle vague queries. First, we narrow down the factors behind vague query formulation and limit them to ambiguity and incompleteness. Second, we propose a zero-shot, user-centered video content retrieval framework that leverages a VLM to provide video data and query representations that users can incrementally combine to refine queries. Third, we compare our proposed framework to a baseline video player and analyze user strategies for answering vague video content retrieval scenarios in an experimental study. We report that both frameworks perform similarly, that users favor our proposed framework, and that, in terms of navigation strategies, users value classic interactions when initiating their search and rely on the abstract semantic video representation to refine their resulting moments. Full article

18 pages, 5235 KiB  
Article
Environmental Concern in Rural Andean Communities: Comparative Study in Central Ecuadorian Highlands
by María Fernanda Rivera-Velásquez, Cristina Gabriela Cóndor-Simbaña, Cristhian Mauricio Lapo-Alcivar, Diego Paul Viteri-Núñez and Víctor Santiago Saigua-Pérez
Sustainability 2025, 17(12), 5551; https://doi.org/10.3390/su17125551 - 17 Jun 2025
Viewed by 971
Abstract
High Andean ecosystems face increasing pressures that threaten the sustainability of rural livelihoods, prompting communities to demand culturally appropriate governance responses. This study examines the structure of environmental concern in two rural communities, Riobamba and Guaranda, in the central Ecuadorian Andes. Applying a tripartite model of egocentric, altruistic, and biocentric concern, we assess its validity through Confirmatory Factor Analysis (CFA) and evaluate the influence of age, gender, ethnicity, and economic activity using Structural Equation Modeling (SEM). The results reveal distinct patterns: biocentric concern predominates in the more urbanized Riobamba, while Guaranda shows a stronger egocentric orientation, accompanied by moderate altruistic concern. Agricultural activity and residence in less urbanized environments are associated with lower levels of environmental concern, whereas age, gender, and ethnicity show no significant effects. The results suggest that although there are differences in the forms of environmental concern, these dimensions are not isolated. Instead, they are part of the same hierarchical phenomenon. This analysis supports the idea of a general concept of a relationship with nature. These findings underscore the importance of implementing environmental policies that respect the holistic worldview of Andean communities. They also highlight the need to develop culturally sensitive measurement tools to avoid potential biases and ensure alignment with local realities. Full article
(This article belongs to the Section Social Ecology and Sustainability)

24 pages, 1408 KiB  
Review
Biomolecular Basis of Life
by Janusz Wiesław Błaszczyk
Metabolites 2025, 15(6), 404; https://doi.org/10.3390/metabo15060404 - 16 Jun 2025
Viewed by 580
Abstract
Life is defined descriptively by the capacity for metabolism, homeostasis, self-organization, growth, adaptation, information metabolism, and reproduction. All these are achieved by a set of self-organizing and self-sustaining processes, among which energy and information metabolism play a dominant role. The energy metabolism of the human body is based on glucose and lipid metabolism. All energy-dependent life processes are controlled by phosphate and calcium signaling. To maintain the optimal levels of energy metabolism, cells, tissues, and the nervous system communicate mutually, and as a result of this signaling, metabolism emerges with self-awareness, which allows for conscious social interactions, which are the most significant determinants of human life. Consequently, the brain representation of our body and the egocentric representation of the environment are built. The last determinant of life optimization is the limited life/death cycle, which exhibits the same pattern at cellular and social levels. This narrative review is my first attempt to systematize our knowledge of life phenomena. Due to the extreme magnitude of this challenge, in the current article, I tried to summarize the current knowledge about fundamental life processes, i.e., energy and information metabolism, and, thus, initiate a broader discussion about the life and future of our species. Full article
(This article belongs to the Section Thematic Reviews)

23 pages, 6387 KiB  
Article
Building an Egocentric-to-Allocentric Travelling Direction Transformation Model for Enhanced Navigation in Intelligent Agents
by Zugang Chen and Haodong Wang
Sensors 2025, 25(11), 3540; https://doi.org/10.3390/s25113540 - 4 Jun 2025
Viewed by 527
Abstract
Many behavioral tasks in intelligent agent research involve working with mathematical vectors. While traditional methods perform well in some cases, they struggle in complex and dynamic environments. Recently, bionic neural networks have emerged as a novel solution. Studies on the Drosophila central complex have revealed that these insects use neural signals from the ellipsoid body and fan to track allocentric travel angles and update spatial awareness during movement, a process that heavily relies on directional vector manipulation. Our model accurately replicates the neural connectivity of the Drosophila central complex, drawing inspiration from the half-adder unit to efficiently encode and process spatial direction information. This framework significantly enhances the accuracy of coordinate transformations while increasing adaptability and resilience in challenging environments. Our experimental results demonstrate that the bionic neural network outperforms traditional methods, delivering superior precision and robust generalizability within the coordinate system. Full article
(This article belongs to the Section Sensor Networks)

17 pages, 5756 KiB  
Article
PPDD: Egocentric Crack Segmentation in the Port Pavement with Deep Learning-Based Methods
by Hyemin Yoon, Hoe-Kyoung Kim and Sangjin Kim
Appl. Sci. 2025, 15(10), 5446; https://doi.org/10.3390/app15105446 - 13 May 2025
Viewed by 571
Abstract
Road infrastructure is a critical component of modern society, with its maintenance directly influencing traffic safety and logistical efficiency. In this context, automated crack detection technology plays a vital role in reducing maintenance costs and enhancing operational efficiency. However, previous studies are limited in that they provide only bounding box or segmentation mask annotations for a restricted number of crack classes and use relatively small datasets. To address these limitations and advance deep learning-based crack segmentation, this study introduces a novel crack segmentation dataset that reflects real-world road conditions. The proposed dataset includes various types of cracks and defects (such as slippage, rutting, and construction-related cracks) and provides polygon-based segmentation masks captured from an egocentric, vehicle-mounted perspective. Using this dataset, we evaluated the performance of semantic and instance segmentation models. Notably, SegFormer achieved the highest Pixel Accuracy (PA) and mean Intersection over Union (mIoU) for semantic segmentation, while YOLOv7 exhibited outstanding detection performance for the alligator crack class, recording an AP50 of 87.2% and an AP of 57.5%. In contrast, all models struggled with the reflection crack type, indicating the inherent segmentation challenges. Overall, this study provides a practical and robust foundation for future research in automated road crack segmentation. Additional resources including the dataset and annotation details can be found at our GitHub repository. Full article
(This article belongs to the Section Computing and Artificial Intelligence)

23 pages, 2596 KiB  
Article
RouteLAND: An Integrated Method and a Geoprocessing Tool for Characterizing the Dynamic Visual Landscape Along Highways
by Loukas-Moysis Misthos and Vassilios Krassanakis
ISPRS Int. J. Geo-Inf. 2025, 14(5), 187; https://doi.org/10.3390/ijgi14050187 - 30 Apr 2025
Cited by 1 | Viewed by 1160
Abstract
Moving away from a static concept for the landscape that surrounds us, in this research article, we approach the visual landscape as a dynamic concept. Moreover, we attempt to provide an interconnection between the domains of landscape and cartography by designing maps that are particularly suitable for characterizing the visible landscape and are potentially meaningful for overall landscape evaluation. Thus, the present work mainly focuses on the consecutive computation of vistas along highways, incorporating actual landscape composition—as the landscape is perceived from an egocentric perspective by observers moving along highway routes in peri-urban landscapes. To this end, we developed an integrated method and a Python (version 2.7.16) tool, named “RouteLAND”, for implementing an algorithmic geoprocessing procedure; through this geoprocessing tool, sequences of composite dynamic geospatial analyses and geometric calculations are automatically implemented. The final outputs are interactive web maps, whereby the segments of highway routes are characterized according to the dominant element of the visible landscape by employing (spatial) aggregation techniques. The developed geoprocessing tool and the generated interactive map provide a cartographic exploratory tool for summarizing the landscape character of highways in any peri-urban landscape, while hypothetically moving in a vehicle. In addition, RouteLAND can potentially aid in the assessment of existing or future highways’ scenic level and in the sustainable design of new highways based on the minimization of intrusive artificial structures’ vistas; in this sense, RouteLAND can serve as a valuable tool for landscape evaluation and sustainable spatial planning and development. Full article

15 pages, 1990 KiB  
Article
Watermark and Trademark Prompts Boost Video Action Recognition in Visual-Language Models
by Longbin Jin, Hyuntaek Jung, Hyo Jin Jon and Eun Yi Kim
Mathematics 2025, 13(9), 1365; https://doi.org/10.3390/math13091365 - 22 Apr 2025
Viewed by 696
Abstract
Large-scale Visual-Language Models have demonstrated powerful adaptability in video recognition tasks. However, existing methods typically rely on fine-tuning or text prompt tuning. In this paper, we propose a visual-only prompting method that employs watermark and trademark prompts to bridge the distribution gap of spatial-temporal video data with Visual-Language Models. Our watermark prompts, designed by a trainable prompt generator, are customized for each video clip. Unlike conventional visual prompts that often exhibit noise signals, watermark prompts are intentionally designed to be imperceptible, ensuring they are not misinterpreted as an adversarial attack. The trademark prompts, bespoke for each video domain, establish the identity of specific video types. Integrating watermark prompts into video frames and prepending trademark prompts to per-frame embeddings significantly boosts the capability of the Visual-Language Model to understand video. Notably, our approach improves the adaptability of the CLIP model to various video action recognition datasets, achieving performance gains of 16.8%, 18.4%, and 13.8% on HMDB-51, UCF-101, and the egocentric dataset EPIC-Kitchen-100, respectively. Additionally, our visual-only prompting method demonstrates competitive performance compared with existing fine-tuning and adaptation methods while requiring fewer learnable parameters. Moreover, through extensive ablation studies, we find the optimal balance between imperceptibility and adaptability. Code will be made available. Full article
(This article belongs to the Special Issue Artificial Intelligence: Deep Learning and Computer Vision)

14 pages, 1834 KiB  
Article
Effect of Victim Gender on Evaluations of Sexual Crime Victims and Perpetrators: Evidence from Japan
by Tomoya Mukai
Sexes 2025, 6(2), 18; https://doi.org/10.3390/sexes6020018 - 18 Apr 2025
Viewed by 689
Abstract
Recent legal reforms incorporating the concept of sexual consent into the Penal Code, alongside high-profile scandals involving male idol groups and comedians, have heightened societal attention to sexual crimes in Japan. Although studies have extensively examined this topic, findings have been predominantly from Western or English-speaking countries, which raises questions regarding their applicability to other cultural contexts. To address this gap, this study examined whether the results of prior research could be generalized to Japan. This study examined six hypotheses derived from previous studies. Using a vignette-based online survey (N = 748), participants read a hypothetical sexual assault case and answered questions on sentencing, negative social reactions, and victim/perpetrator blaming. An analysis revealed that only one hypothesis was supported: respondents recommended longer sentences for perpetrators when the victim was male rather than female. Additionally, women were more likely to exhibit egocentric reactions, such as expressing more anger toward the perpetrators than the victims, than men. No other hypothesized gender-based differences, which included victim-blaming or harsher sentencing by male observers, were supported. These findings highlight the risks of generalizing research findings across cultural contexts and emphasize the importance of conducting culturally specific studies. Full article

14 pages, 867 KiB  
Article
(In)Visible Nuances: Analytical Methods for a Relational Impact Assessment of Anti-Poverty Projects
by M. Licia Paglione
Societies 2025, 15(4), 105; https://doi.org/10.3390/soc15040105 - 18 Apr 2025
Viewed by 360
Abstract
In recent social science debates, poverty is seen as a multidimensional phenomenon, not only economic, but also psychological, educational, moral, and relational. The empirical observation and analysis of this latter dimension and its qualities represent a sociological challenge, especially in assessing the integral effectiveness of social projects. As part of this debate, this article proposes an analytical method—based on Social Network Analysis, according to the egocentric or personal approach—and describes its use during an empirical “relational impact assessment” of a specific anti-poverty project in the Northwest region of Argentina. Analysis of the data—collected longitudinally through questionnaires—highlights the changes in the personal “relational configurations” of small entrepreneurs in the tourist area, i.e., the beneficiaries of the project, while also highlighting the emergence of “relational goods”. In this way, this article offers an analytical method to evaluate the “relational impact” of anti-poverty projects in quali–quantitative terms. Full article

17 pages, 2231 KiB  
Article
Brain Functional Connectivity During First- and Third-Person Visual Imagery
by Ekaterina Pechenkova, Mary Rachinskaya, Varvara Vasilenko, Olesya Blazhenkova and Elena Mershina
Vision 2025, 9(2), 30; https://doi.org/10.3390/vision9020030 - 6 Apr 2025
Viewed by 1311
Abstract
The ability to adopt different perspectives, or vantage points, is fundamental to human cognition, affecting reasoning, memory, and imagery. While the first-person perspective allows individuals to experience a scene through their own eyes, the third-person perspective involves an external viewpoint, which is thought to demand greater cognitive effort and different neural processing. Despite the frequent use of perspective switching across various contexts, including modern media and in therapeutic settings, the neural mechanisms differentiating these two perspectives in visual imagery remain largely underexplored. In an exploratory fMRI study, we compared both activation and task-based functional connectivity underlying first-person and third-person perspective taking in the same 26 participants performing two spatial egocentric imagery tasks, namely imaginary tennis and house navigation. No significant differences in activation emerged between the first-person and third-person conditions. The network-based statistics analysis revealed a small subnetwork of the early visual and posterior temporal areas that manifested stronger functional connectivity during the first-person perspective, suggesting a closer sensory recruitment loop, or, in different terms, a loop between long-term memory and the “visual buffer” circuits. The absence of a strong neural distinction between the first-person and third-person perspectives suggests that third-person imagery may not fully decenter individuals from the scene, as is often assumed. Full article
(This article belongs to the Special Issue Visual Mental Imagery System: How We Image the World)

8 pages, 186 KiB  
Opinion
Evidence for Cognitive Spatial Models from Ancient Roman Land-Measurement
by Andrew M. Riggsby
Brain Sci. 2025, 15(4), 376; https://doi.org/10.3390/brainsci15040376 - 4 Apr 2025
Viewed by 432
Abstract
Influential studies in the history of cartography have argued that map-like representations of space were (virtually) unknown in the Classical Mediterranean world and that the cause of this was an absence of underlying cognitive maps. That is, persons in that time/place purportedly had only route/egocentric-type mental representations, not survey/allocentric ones. The present study challenges that cognitive claim by examining the verbal descriptions of plots of land produced by ancient Roman land-measurers. Despite the prescription of a route-based form, actual representations persistently show a variety of features which suggest the existence of underlying survey-type mental models and the integration of those with the route-type ones. This fits better with current views on interaction between types of spatial representation and of cultural difference in this area. The evidence also suggests a linkage between the two kinds of representations. Full article
39 pages, 1298 KiB  
Systematic Review
Vision-Based Collision Warning Systems with Deep Learning: A Systematic Review
by Charith Chitraranjan, Vipooshan Vipulananthan and Thuvarakan Sritharan
J. Imaging 2025, 11(2), 64; https://doi.org/10.3390/jimaging11020064 - 17 Feb 2025
Cited by 1 | Viewed by 1406
Abstract
Timely prediction of collisions enables advanced driver assistance systems to issue warnings and initiate emergency maneuvers as needed to avoid collisions. With recent developments in computer vision and deep learning, collision warning systems that use vision as the only sensory input have emerged. They are less expensive than those that use multiple sensors, but their effectiveness must be thoroughly assessed. We systematically searched academic literature for studies proposing ego-centric, vision-based collision warning systems that use deep learning techniques. Thirty-one studies among the search results satisfied our inclusion criteria. Risk of bias was assessed with PROBAST. We reviewed the selected studies to answer three primary questions: (1) What deep learning techniques are used, and how are they used? (2) What datasets and experiments are used for evaluation? (3) What results are achieved? We identified two main categories of methods: those that use deep learning models to directly predict the probability of a future collision from input video, and those that use deep learning models at one or more stages of a pipeline to compute a threat metric before predicting collisions. More importantly, we show that the experimental evaluation of most systems is inadequate, owing either to a lack of quantitative experiments or to various biases present in the datasets used. Lack of suitable datasets is a major challenge to the evaluation of these systems, and we suggest future work to address this issue. Full article
