Search Results (8)

Search Parameters:
Keywords = Mapillary

31 pages, 55513 KB  
Article
SAM for Road Object Segmentation: Promising but Challenging
by Alaa Atallah Almazroey, Salma Kammoun Jarraya and Reem Alnanih
J. Imaging 2025, 11(6), 189; https://doi.org/10.3390/jimaging11060189 - 10 Jun 2025
Viewed by 1864
Abstract
Road object segmentation is crucial for autonomous driving, as it enables vehicles to perceive their surroundings. While deep learning models show promise, their generalization across diverse road conditions, weather variations, and lighting changes remains challenging. Different approaches have been proposed to address this limitation. However, these models often struggle with the varying appearance of road objects under diverse environmental conditions. Foundation models such as the Segment Anything Model (SAM) offer a potential avenue for improved generalization in complex visual tasks. Thus, this study presents a pioneering comprehensive evaluation of the SAM for zero-shot road object segmentation, without explicit prompts. This study aimed to determine the inherent capabilities and limitations of the SAM in accurately segmenting a variety of road objects under the diverse and challenging environmental conditions encountered in real-world autonomous driving scenarios. We assessed the SAM’s performance on the KITTI, BDD100K, and Mapillary Vistas datasets, encompassing a wide range of environmental conditions. Using a variety of established evaluation metrics, our analysis revealed the SAM’s capabilities and limitations in accurately segmenting various road objects, particularly highlighting challenges posed by dynamic environments, illumination changes, and occlusions. These findings provide valuable insights for researchers and developers seeking to enhance the robustness of foundation models such as the SAM in complex road environments, guiding future efforts to improve perception systems for autonomous driving.
(This article belongs to the Section Computer Vision and Pattern Recognition)
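The established segmentation metrics the abstract refers to typically include intersection over union (IoU). As a minimal illustration (not the paper's own code; the function name is mine), IoU between two binary masks can be computed as:

```python
def iou(pred, gt):
    """Intersection-over-Union between two binary masks given as
    nested lists of 0/1 values of the same shape."""
    inter = sum(p & g for rp, rg in zip(pred, gt) for p, g in zip(rp, rg))
    union = sum(p | g for rp, rg in zip(pred, gt) for p, g in zip(rp, rg))
    # Two empty masks agree perfectly by convention.
    return inter / union if union else 1.0
```

Averaging this score per class over a dataset gives the mean IoU commonly reported for road-scene benchmarks such as KITTI and BDD100K.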

19 pages, 5354 KB  
Article
Method for Applying Crowdsourced Street-Level Imagery Data to Evaluate Street-Level Greenness
by Xinrui Zheng and Mamoru Amemiya
ISPRS Int. J. Geo-Inf. 2023, 12(3), 108; https://doi.org/10.3390/ijgi12030108 - 4 Mar 2023
Cited by 12 | Viewed by 3654
Abstract
Street greenness visibility (SGV) is associated with various health benefits and positively influences perceptions of landscape. Lowering the barriers to SGV assessments and measuring the values accurately is crucial for applying this critical landscape information. However, the verified available street view imagery (SVI) data for SGV assessments are limited to the traditional top-down data, which are generally used with download and usage restrictions. In this study, we explored volunteered street view imagery (VSVI) as a potential data source for SGV assessments. To improve the image quality of the crowdsourced dataset, which may affect the accuracy of the survey results, we developed an image filtering method with XGBoost using images from the Mapillary platform and conducted an accuracy evaluation by comparing the results with official data in Shinjuku, Japan. We found that the original VSVI is well suited for SGV assessments after data processing, and the filtered data have higher accuracy. The discussion on VSVI data applications can help expand useful data for urban audit surveys, and this fully free and open data may promote the democratization of urban audit surveys using big data.
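As a hedged illustration of what a street-greenness metric computes (a crude stand-in, not the paper's pipeline, which relies on filtered Mapillary imagery), the green-view fraction of an image can be approximated by counting vegetation-like pixels:

```python
def green_view_index(pixels):
    """Fraction of pixels whose green channel dominates both red and blue.
    `pixels` is a flat list of (r, g, b) tuples; real SGV pipelines use
    semantic segmentation rather than this simple colour rule."""
    if not pixels:
        return 0.0
    green = sum(1 for r, g, b in pixels if g > r and g > b)
    return green / len(pixels)
```

Averaging this index over images sampled along street segments yields a per-street greenness score comparable across areas.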

12 pages, 5793 KB  
Article
Development of a Large-Scale Roadside Facility Detection Model Based on the Mapillary Dataset
by Zhehui Yang, Chenbo Zhao, Hiroya Maeda and Yoshihide Sekimoto
Sensors 2022, 22(24), 9992; https://doi.org/10.3390/s22249992 - 19 Dec 2022
Cited by 9 | Viewed by 4478
Abstract
The detection of road facilities or roadside structures is essential for high-definition (HD) maps and intelligent transportation systems (ITSs). With the rapid development of deep-learning algorithms in recent years, deep-learning-based object detection techniques have provided more accurate and efficient performance, and have become an essential tool for HD map reconstruction and advanced driver-assistance systems (ADASs). Therefore, the performance evaluation and comparison of the latest deep-learning algorithms in this field is indispensable. However, most existing works in this area limit their focus to the detection of individual targets, such as vehicles or pedestrians and traffic signs, from driving view images. In this study, we present a systematic comparison of three recent algorithms for large-scale multi-class road facility detection, namely Mask R-CNN, YOLOX, and YOLOv7, on the Mapillary dataset. The experimental results are evaluated according to the recall, precision, mean F1-score and computational consumption. YOLOv7 outperforms the other two networks in road facility detection, with a precision and recall of 87.57% and 72.60%, respectively. Furthermore, we test the model performance on our custom dataset obtained from the Japanese road environment. The results demonstrate that models trained on the Mapillary dataset exhibit sufficient generalization ability. The comparison presented in this study aids in understanding the strengths and limitations of the latest networks in multi-class object detection on large-scale street-level datasets.
(This article belongs to the Special Issue AI Applications in Smart Networks and Sensor Devices)
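The reported precision, recall, and F1-score follow the standard detection definitions; a minimal sketch (the counts and function name are illustrative):

```python
def detection_scores(tp, fp, fn):
    """Precision, recall and F1 from detection counts; a prediction counts
    as a true positive (tp) when its IoU with a ground-truth box exceeds
    the matching threshold, unmatched predictions are fp, and unmatched
    ground-truth boxes are fn."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```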

22 pages, 2177 KB  
Article
Hierarchical Novelty Detection for Traffic Sign Recognition
by Idoia Ruiz and Joan Serrat
Sensors 2022, 22(12), 4389; https://doi.org/10.3390/s22124389 - 10 Jun 2022
Cited by 3 | Viewed by 2855
Abstract
Recent works have made significant progress in novelty detection, i.e., the problem of detecting samples of novel classes, never seen during training, while classifying those that belong to known classes. However, the only information this task provides about novel samples is that they are unknown. In this work, we leverage hierarchical taxonomies of classes to provide informative outputs for samples of novel classes. We predict their closest class in the taxonomy, i.e., its parent class. We address this problem, known as hierarchical novelty detection, by proposing a novel loss, the Hierarchical Cosine Loss, designed to learn class prototypes along with an embedding of discriminative features consistent with the taxonomy. We apply it to traffic sign recognition, where we predict the parent class semantics for new types of traffic signs. Our model beats state-of-the-art approaches on two large-scale traffic sign benchmarks, Mapillary Traffic Sign Dataset (MTSD) and Tsinghua-Tencent 100K (TT100K), and performs similarly on natural images benchmarks (AWA2, CUB). For TT100K and MTSD, our approach is able to detect novel samples at the correct nodes of the hierarchy with 81% and 36% of accuracy, respectively, at 80% known class accuracy.
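As an illustration of the inference step such a prototype-based approach enables (a simplification, not the paper's Hierarchical Cosine Loss itself; all names and the threshold are mine), a sample can be assigned to its closest leaf prototype by cosine similarity and deferred to that leaf's parent node when the match is weak:

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors given as sequences of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def hierarchical_predict(embedding, prototypes, parent, threshold=0.8):
    """Pick the leaf class whose prototype is most cosine-similar to the
    embedding; if even the best match falls below `threshold`, treat the
    sample as novel and report the leaf's parent node in the taxonomy."""
    best = max(prototypes, key=lambda c: cosine(embedding, prototypes[c]))
    if cosine(embedding, prototypes[best]) >= threshold:
        return best
    return parent[best]
```

For a novel traffic sign, this returns the semantically closest internal node (e.g., "regulatory") instead of a forced, wrong leaf label.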

22 pages, 6677 KB  
Article
Crowdsourcing Street View Imagery: A Comparison of Mapillary and OpenStreetCam
by Ron Mahabir, Ross Schuchard, Andrew Crooks, Arie Croitoru and Anthony Stefanidis
ISPRS Int. J. Geo-Inf. 2020, 9(6), 341; https://doi.org/10.3390/ijgi9060341 - 26 May 2020
Cited by 43 | Viewed by 8838
Abstract
Over the last decade, Volunteered Geographic Information (VGI) has emerged as a viable source of information on cities. During this time, the nature of VGI has been evolving, with new types and sources of data continually being added. In light of this trend, this paper explores one such type of VGI data: Volunteered Street View Imagery (VSVI). Two VSVI sources, Mapillary and OpenStreetCam, were extracted and analyzed to study road coverage and contribution patterns for four US metropolitan areas. Results show that coverage patterns vary across sites, with most contributions occurring along local roads and in populated areas. We also found that a few users contributed most of the data. Moreover, the results suggest that most data are being collected during three distinct times of day (i.e., morning, lunch and late afternoon). The paper concludes with a discussion that while VSVI data is still relatively new, it has the potential to be a rich source of spatial and temporal information for monitoring cities.
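The finding that a few users contributed most of the data can be quantified with a top-k contributor share; a minimal sketch (function name and counts are mine):

```python
def top_share(counts, k=1):
    """Share of total contributions made by the k most active users,
    given a list of per-user contribution counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return sum(sorted(counts, reverse=True)[:k]) / total
```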

21 pages, 8298 KB  
Article
The State of Mapillary: An Exploratory Analysis
by Dawei Ma, Hongchao Fan, Wenwen Li and Xuan Ding
ISPRS Int. J. Geo-Inf. 2020, 9(1), 10; https://doi.org/10.3390/ijgi9010010 - 20 Dec 2019
Cited by 34 | Viewed by 6390
Abstract
As the world’s largest crowdsourcing-based street view platform, Mapillary has received considerable attention in both research and practical applications. By February 2019, more than 20,000 users worldwide contributed approximately 6.3 million kilometers of streetscape sequences. In this study, we attempted to get a deep insight into the Mapillary project through an exploratory analysis from the perspective of contributors, including the development of users, the spatiotemporal analysis of active users, the contribution modes (walking, cycling, and driving), and the devices used to contribute. It shows that inequality exists in the distribution of contributed users, similar to that in other volunteered geographic information (VGI) projects. However, the inequality in Mapillary contribution is less than in OpenStreetMap (OSM). Compared to OSM, the other main difference is that the data collection demonstrated obvious seasonal variation because contributions to OSM can be accomplished on a computer, whereas images have to be captured on the streets for Mapillary, and this is considerably affected by seasonal weather.
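Contribution inequality of the kind compared here between Mapillary and OSM is commonly measured with the Gini coefficient; a self-contained sketch (not the paper's code):

```python
def gini(counts):
    """Gini coefficient of per-user contribution counts:
    0 means perfectly equal contributions, values near 1 mean
    a handful of users dominate."""
    xs = sorted(counts)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Standard formula based on the rank-weighted sum of sorted values.
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n
```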

26 pages, 18579 KB  
Article
Crowdsourced Street-Level Imagery as a Potential Source of In-Situ Data for Crop Monitoring
by Raphaël D'Andrimont, Momchil Yordanov, Guido Lemoine, Janine Yoong, Kamil Nikel and Marijn Van der Velde
Land 2018, 7(4), 127; https://doi.org/10.3390/land7040127 - 22 Oct 2018
Cited by 27 | Viewed by 7637
Abstract
New approaches to collect in-situ data are needed to complement the high spatial (10 m) and temporal (5 d) resolution of Copernicus Sentinel satellite observations. Making sense of Sentinel observations requires high quality and timely in-situ data for training and validation. Classical ground truth collection is expensive, lacks scale, fails to exploit opportunities for automation, and is prone to sampling error. Here we evaluate the potential contribution of opportunistically exploiting crowdsourced street-level imagery to collect massive high-quality in-situ data in the context of crop monitoring. This study assesses this potential by answering two questions: (1) what is the spatial availability of these images across the European Union (EU), and (2) can these images be transformed to useful data? To answer the first question, we evaluated the EU availability of street-level images on Mapillary—the largest open-access platform for such images—against the Land Use and Land Cover Area frame Survey (LUCAS) 2018, a systematic surveyed sampling of 337,031 points. For 37.78% of the LUCAS points a crowdsourced image is available within a 2 km buffer, with a mean distance of 816.11 m. We estimate that 9.44% of the EU territory has a crowdsourced image within 300 m from a LUCAS point, illustrating the huge potential of crowdsourcing as a complementary sampling tool. After the artificial and built-up (63.14%) and inland water (43.67%) land cover classes, arable land has the highest availability at 40.78%. To answer the second question, we focus on identifying crops at parcel level using all 13.6 million Mapillary images collected in the Netherlands. Only 1.9% of the contributors generated 75.15% of the images. A procedure was developed to select and harvest the pictures potentially best suited to identify crops using the geometries of 785,710 Dutch parcels and the pictures’ meta-data such as camera orientation and focal length. Availability of crowdsourced imagery looking at parcels was assessed for eight different crop groups with the 2017 parcel level declarations. Parcel revisits during the growing season made it possible to track crop growth. Examples illustrate the capacity to recognize crops and their phenological development on crowdsourced street-level imagery. Consecutive images taken during the same capture track allow selecting the image with the best unobstructed view. In the future, dedicated crop capture tasks can improve image quality and expand coverage in rural areas.
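The selection of pictures that actually look at a parcel can be sketched as a bearing-within-field-of-view test on a local planar approximation (a simplification of the paper's procedure; the function names and the default FOV are illustrative):

```python
import math

def bearing(cam, target):
    """Compass bearing in degrees from camera to target, with both points
    given as (x, y) in metres on a local flat approximation — sufficient
    at parcel scale."""
    dx, dy = target[0] - cam[0], target[1] - cam[1]
    return math.degrees(math.atan2(dx, dy)) % 360

def looks_at(cam, heading, target, fov=60.0):
    """True when the target lies within the camera's horizontal field of
    view centred on `heading` (all angles in degrees). A real pipeline
    would derive `fov` from the focal length in the image metadata."""
    diff = (bearing(cam, target) - heading + 180) % 360 - 180
    return abs(diff) <= fov / 2
```

Filtering a capture track with a test like this, then keeping the least obstructed frame, mirrors the image-harvesting step the abstract describes.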

22 pages, 4486 KB  
Data Descriptor
Technical Guidelines to Extract and Analyze VGI from Different Platforms
by Levente Juhász, Adam Rousell and Jamal Jokar Arsanjani
Data 2016, 1(3), 15; https://doi.org/10.3390/data1030015 - 24 Sep 2016
Cited by 10 | Viewed by 9152
Abstract
A growing number of Volunteered Geographic Information (VGI) and social media platforms have been continuously expanding, providing massive georeferenced data in many forms, including textual information, photographs, and geoinformation. These georeferenced data have either been actively contributed (e.g., adding data to OpenStreetMap (OSM) or Mapillary) or collected in a more passive fashion by enabling geolocation whilst using an online platform (e.g., Twitter, Instagram, or Flickr). The benefit of scraping and streaming these data in stand-alone applications is evident; however, it is difficult for many users to script and scrape the diverse types of these data. On 14 June 2016, a pre-conference workshop at the AGILE 2016 conference in Helsinki, Finland was held. The workshop was called “LINK-VGI: LINKing and analyzing VGI across different platforms”. The workshop provided an opportunity for interested researchers to share ideas and findings on cross-platform data contributions. One portion of the workshop was dedicated to a hands-on session. In this session, the basics of spatial data access through selected Application Programming Interfaces (APIs) and the extraction of summary statistics of the results were illustrated. This paper presents the content of the hands-on session including the scripts and guidelines for extracting VGI data. Researchers, planners, and interested end-users can benefit from this paper for developing their own application for any region of the world.
(This article belongs to the Special Issue Geospatial Data)
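As a hedged sketch of the kind of API access the workshop covered, a query URL for Mapillary image metadata can be assembled without sending a request. The endpoint and parameter names follow Mapillary's current v4 Graph API as I understand it, which postdates the workshop; verify them against the official documentation before relying on them:

```python
from urllib.parse import urlencode

def mapillary_image_query(token, bbox, fields=("id", "geometry", "captured_at")):
    """Build a query URL for image metadata inside a bounding box given
    as (west, south, east, north) in degrees. Only constructs the URL;
    fetching and paginating the response is left to the caller."""
    params = {
        "access_token": token,
        "fields": ",".join(fields),
        "bbox": ",".join(str(c) for c in bbox),
    }
    return "https://graph.mapillary.com/images?" + urlencode(params)
```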
