Search Results (28)

Search Parameters:
Keywords = autonomous e-learning ability

24 pages, 2109 KB  
Article
ToggleMimic: A Two-Stage Policy for Text-Driven Humanoid Whole-Body Control
by Weifeng Zheng, Shigang Wang and Bohua Qian
Sensors 2025, 25(23), 7259; https://doi.org/10.3390/s25237259 - 28 Nov 2025
Viewed by 756
Abstract
For humanoid robots to interact naturally with humans and seamlessly integrate into daily life, natural language serves as an essential communication medium. While recent advances in imitation learning have enabled robots to acquire complex motions through expert demonstration, traditional approaches often rely on rigid task specifications or single-modal inputs, limiting their ability to interpret high-level semantic instructions (e.g., natural language commands) or dynamically switch between actions. Directly translating natural language into executable control commands remains a significant challenge. To address this, we propose ToggleMimic, an end-to-end imitation learning framework that generates robotic motions from textual instructions, enabling language-driven multi-task control. In contrast to end-to-end methods that struggle with generalization or single-action models that lack flexibility, our ToggleMimic framework uniquely combines the following: (1) a two-stage policy distillation that efficiently bridges the sim-to-real gap, (2) a lightweight cross-attention mechanism for interpretable text-to-action mapping, and (3) a gating network that enhances robustness to linguistic variations. Extensive simulation and real-world experiments demonstrate the framework's effectiveness, generalization capability, and robust text-guided control performance. This work establishes an efficient, interpretable, and scalable learning paradigm for cross-modal semantic-driven autonomous robot control.
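The abstract names a lightweight cross-attention mechanism and a gating network without giving implementation details. As a hedged illustration of the general pattern, a minimal PyTorch sketch of a text-conditioned, gated policy head follows; all module names, dimensions, and the fusion scheme are assumptions for exposition, not ToggleMimic's actual architecture.

```python
# Illustrative sketch only: a cross-attention block that conditions motion
# features on text embeddings, plus a gating network that blends the two
# streams. Names and dimensions are assumptions, not the authors' code.
import torch
import torch.nn as nn

class TextConditionedPolicy(nn.Module):
    def __init__(self, text_dim=512, motor_dim=256, action_dim=29, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(motor_dim, n_heads,
                                          kdim=text_dim, vdim=text_dim,
                                          batch_first=True)
        # Gate blends text-conditioned and proprioceptive features, so small
        # linguistic variations do not destabilize the action output.
        self.gate = nn.Sequential(nn.Linear(motor_dim * 2, motor_dim), nn.Sigmoid())
        self.head = nn.Linear(motor_dim, action_dim)

    def forward(self, proprio_feats, text_tokens):
        # proprio_feats: (B, T, motor_dim); text_tokens: (B, L, text_dim)
        attended, weights = self.attn(proprio_feats, text_tokens, text_tokens)
        g = self.gate(torch.cat([proprio_feats, attended], dim=-1))
        fused = g * attended + (1 - g) * proprio_feats
        return self.head(fused), weights  # weights expose text-to-action mapping

policy = TextConditionedPolicy()
actions, attn_w = policy(torch.randn(1, 10, 256), torch.randn(1, 8, 512))
```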

37 pages, 13864 KB  
Article
LSTM-Enhanced Deep Reinforcement Learning for Robust Trajectory Tracking Control of Skid-Steer Mobile Robots Under Terra-Mechanical Constraints
by Jose Manuel Alcayaga, Oswaldo Anibal Menéndez, Miguel Attilio Torres-Torriti, Juan Pablo Vásconez, Tito Arévalo-Ramirez and Alvaro Javier Prado Romo
Robotics 2025, 14(6), 74; https://doi.org/10.3390/robotics14060074 - 29 May 2025
Cited by 7 | Viewed by 4714
Abstract
Autonomous navigation in mining environments is challenged by complex wheel–terrain interaction, traction losses caused by slip dynamics, and sensor limitations. This paper investigates the effectiveness of Deep Reinforcement Learning (DRL) techniques for the trajectory tracking control of skid-steer mobile robots operating under terra-mechanical constraints. Four state-of-the-art DRL algorithms, i.e., Proximal Policy Optimization (PPO), Deep Deterministic Policy Gradient (DDPG), Twin Delayed DDPG (TD3), and Soft Actor–Critic (SAC), are selected to evaluate their ability to generate stable and adaptive control policies under varying environmental conditions. To address the inherent partial observability in real-world navigation, this study presents an original approach that integrates Long Short-Term Memory (LSTM) networks into DRL-based controllers, allowing control agents to retain and leverage temporal dependencies to infer unobservable system states. The developed agents were trained and tested in simulations and then assessed in field experiments under uneven terrain and dynamic model parameter changes that lead to traction losses in mining environments, targeting various trajectory tracking tasks, including lemniscate and squared-type reference trajectories. This contribution strengthens the robustness and adaptability of DRL agents by enabling better generalization of learned policies compared with their baseline counterparts, while also significantly improving trajectory tracking performance. In particular, LSTM-based controllers achieved reductions in tracking errors of 10%, 74%, 21%, and 37% for DDPG-LSTM, PPO-LSTM, TD3-LSTM, and SAC-LSTM, respectively, compared with their non-recurrent counterparts. Furthermore, DDPG-LSTM and TD3-LSTM reduced their control effort, measured as the total variation in the control input, by 15% and 20%, respectively, compared with their baseline controllers. Findings from this work provide valuable insights into the role of memory-augmented reinforcement learning for robust motion control in unstructured and high-uncertainty environments.
(This article belongs to the Section Intelligent Robots and Mechatronics)
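As a minimal sketch of the core architectural idea, the block below inserts an LSTM between an observation encoder and the actor head so the policy can accumulate history and infer unobservable states such as wheel slip. Dimensions and layer choices are illustrative assumptions, not the paper's implementation.

```python
# Hedged sketch (not the paper's code): a recurrent actor whose LSTM state
# carries temporal context across control steps under partial observability.
import torch
import torch.nn as nn

class RecurrentActor(nn.Module):
    def __init__(self, obs_dim=12, hidden=128, act_dim=2):  # act_dim: (v, omega)
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, act_dim)

    def forward(self, obs_seq, hc=None):
        # obs_seq: (B, T, obs_dim); hc carries (h, c) between control steps
        z = self.encoder(obs_seq)
        z, hc = self.lstm(z, hc)
        return torch.tanh(self.head(z)), hc  # bounded velocity commands

actor = RecurrentActor()
a, hc = actor(torch.randn(1, 1, 12))      # step 1
a, hc = actor(torch.randn(1, 1, 12), hc)  # step 2 reuses the memory
```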

31 pages, 3425 KB  
Article
RPF-MAD: A Robust Pre-Training–Fine-Tuning Algorithm for Meta-Adversarial Defense on the Traffic Sign Classification System of Autonomous Driving
by Xiaoxu Peng, Dong Zhou, Jianwen Zhang, Jiaqi Shi and Guanghui Sun
Electronics 2025, 14(10), 2044; https://doi.org/10.3390/electronics14102044 - 17 May 2025
Cited by 1 | Viewed by 1038
Abstract
Traffic sign classification (TSC) based on deep neural networks (DNNs) plays a crucial role in the perception subsystem of autonomous driving systems (ADSs). However, studies reveal that the TSC system can make dangerous and potentially fatal errors under adversarial attacks. Existing defense strategies, such as adversarial training (AT), have demonstrated effectiveness but struggle to generalize across diverse attack scenarios. Recent advancements in self-supervised learning (SSL), particularly adversarial contrastive learning (ACL) methods, have demonstrated strong potential in enhancing robustness and generalization compared to AT. However, conventional ACL methods lack mechanisms to ensure effective defense transferability across different learning stages. To address this, we propose a robust pre-training–fine-tuning algorithm for meta-adversarial defense (RPF-MAD), designed to enhance the sustainability of adversarial robustness throughout the learning pipeline. Dual-track meta-adversarial pre-training (Dual-MAP) integrates meta-learning with ACL methods, which improves the generalization ability of the upstream model to different adversarial conditions. Meanwhile, adaptive variance anchoring robust fine-tuning (AVA-RFT) utilizes adaptive prototype variance regularization to stabilize feature representations and reinforce the generalizable defense capabilities of the downstream model. Leveraging the meta-adversarial defense benchmark (MAD) dataset, RPF-MAD ensures comprehensive robustness against multiple attack types. Extensive experiments across eight ACL methods and three robust fine-tuning (RFT) techniques demonstrate that RPF-MAD significantly improves both standard accuracy (SA) by 1.53% and robust accuracy (RA) by 2.64%, effectively enhances the lifelong adversarial resilience of TSC models, achieves a 13.77% improvement in the equilibrium defense success rate (EDSR), and reduces the attack success rate (ASR) by 9.74%, outperforming state-of-the-art (SOTA) defense methods.
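The abstract presupposes familiarity with the attacks that AT and ACL defend against. For background, here is a standard projected gradient descent (PGD) attack in PyTorch, of the kind typically used to evaluate robust accuracy on traffic-sign classifiers; it is generic textbook code, not part of RPF-MAD.

```python
# Background sketch: a standard L-infinity PGD attack against an image
# classifier. Generic illustration, unrelated to RPF-MAD's internals.
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    x_adv = x + torch.empty_like(x).uniform_(-eps, eps)  # random start
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv + alpha * grad.sign()              # ascend the loss
        x_adv = x + (x_adv - x).clamp(-eps, eps)         # project to eps-ball
        x_adv = x_adv.clamp(0, 1)                        # keep valid pixels
    return x_adv.detach()
```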

19 pages, 6222 KB  
Article
Generalization Ability of Bagging and Boosting Type Deep Learning Models in Evapotranspiration Estimation
by Manoranjan Kumar, Yash Agrawal, Sirisha Adamala, Pushpanjali, A. V. M. Subbarao, V. K. Singh and Ankur Srivastava
Water 2024, 16(16), 2233; https://doi.org/10.3390/w16162233 - 8 Aug 2024
Cited by 5 | Viewed by 2204
Abstract
The potential of generalized machine learning models developed for crop water estimation was examined in the current study. The study was conducted in a semiarid region of India, i.e., Karnataka, with daily climatic data (maximum and minimum air temperatures, maximum and minimum relative humidity, wind speed, sunshine hours, and rainfall) of 44 years (1976–2020) for twelve locations. Extreme Gradient Boosting (XGBoost), Gradient Boosting (GB), and Random Forest (RF) are the three ensemble models that were developed using all of the climatic data from a single location (Bengaluru) from January 1976 to December 2017 and then directly applied at eleven different locations (Ballari, Chikmaglur, Chitradurga, Devnagiri, Dharwad, Gadag, Haveri, Koppal, Mandya, Shivmoga, and Tumkuru) without any local calibration. For the test period of January 2018–June 2020, the models' capacity to estimate crop water requirement (Penman-Monteith (P-M) ETo values) was assessed. The developed ensemble models were evaluated using the performance criteria of mean absolute error (MAE), average absolute relative error (AARE), coefficient of correlation (r), noise-to-signal ratio (NS), Nash–Sutcliffe efficiency (η), and weighted standard error of estimate (WSEE). The results indicated that the WSEE values of the RF, GB, and XGBoost models for each location were smaller than 1 mm per day, and model effectiveness varied from 96% to 99% across locations. While all of the models reproduced the P-M ETo values well, the XGBoost model estimated ETo with greater accuracy than the GB and RF models. The XGBoost model's strong performance was also indicated by its lower noise-to-signal ratio. Thus, in this study, a generalized model for short-term ETo estimation is developed using ensemble learning techniques. Given this type of model's accuracy in calculating crop water requirements and its ability to generalize, it can be effortlessly integrated with a real-time water management system or an autonomous weather station at the regional level.
(This article belongs to the Special Issue Water Management in Arid and Semi-arid Regions)
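A hedged sketch of the train-at-one-station, apply-elsewhere protocol described above: fit RF, GB, and XGBoost regressors on one location's climate features and score them on an unseen location with no local calibration. File names, column names, and hyperparameters are illustrative assumptions, not the study's data.

```python
# Illustrative sketch of the generalization setup: train at Bengaluru,
# evaluate at an unseen station. Paths and columns are hypothetical.
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from xgboost import XGBRegressor

FEATURES = ["t_max", "t_min", "rh_max", "rh_min", "wind", "sunshine", "rain"]

train = pd.read_csv("bengaluru_1976_2017.csv")  # source station (hypothetical file)
test = pd.read_csv("ballari_2018_2020.csv")     # unseen target station

models = {
    "RF": RandomForestRegressor(n_estimators=500, random_state=0),
    "GB": GradientBoostingRegressor(random_state=0),
    "XGBoost": XGBRegressor(n_estimators=500, learning_rate=0.05),
}
for name, m in models.items():
    m.fit(train[FEATURES], train["eto_pm"])     # target: Penman-Monteith ETo
    mae = mean_absolute_error(test["eto_pm"], m.predict(test[FEATURES]))
    print(f"{name}: MAE = {mae:.3f} mm/day")
```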

22 pages, 7100 KB  
Technical Note
On Developing a Machine Learning-Based Approach for the Automatic Characterization of Behavioral Phenotypes for Dairy Cows Relevant to Thermotolerance
by Oluwatosin Inadagbo, Genevieve Makowski, Ahmed Abdelmoamen Ahmed and Courtney Daigle
AgriEngineering 2024, 6(3), 2656-2677; https://doi.org/10.3390/agriengineering6030155 - 5 Aug 2024
Cited by 7 | Viewed by 2410
Abstract
The United States is predicted to experience an annual decline in milk production due to heat stress of 1.4 and 1.9 kg/day by the 2050s and 2080s, with economic losses of USD 1.7 billion and USD 2.2 billion, respectively, despite current cooling efforts implemented by the dairy industry. The ability of cattle to withstand heat (i.e., thermotolerance) is influenced by physiological and behavioral factors: the factors contributing to thermoregulation are heritable, and cows vary in their behavioral repertoire. Current methods to gauge cow behaviors lack precision and scalability. This paper presents an approach leveraging machine learning (ML) (e.g., CNN and YOLOv8) and computer vision (e.g., video processing and annotation) techniques to quantify key behavioral indicators, specifically drinking frequency and brush use behaviors. These behaviors, while challenging to quantify using traditional methods, offer profound insights into autonomic nervous system function and an individual cow's coping mechanisms under heat stress. The developed approach quantifies these difficult-to-measure drinking and brush use behaviors of dairy cows milked in a robotic milking system, giving ranchers a better opportunity to make informed decisions that could mitigate the adverse effects of heat stress, and expediting data collection on dairy cow behavioral phenotypes. Finally, the developed system is evaluated using different performance metrics, including classification accuracy. The YOLOv8 and CNN models achieved accuracies of 93% and 96% for object detection and classification, respectively.
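As an illustration of how such a detector is typically deployed, the sketch below runs a YOLOv8 model over barn video with the ultralytics package and flags the two target behaviors. The weights file and class names are hypothetical; the study's dataset and trained model are not reproduced here.

```python
# Illustrative only: running a YOLOv8 detector on barn video to flag
# drinking/brush-use events. Weights and labels are assumptions.
from ultralytics import YOLO

model = YOLO("cow_behaviors.pt")           # hypothetical fine-tuned weights
results = model.predict("barn_cam.mp4", conf=0.5, stream=True)
for frame in results:
    for box in frame.boxes:
        label = model.names[int(box.cls)]  # e.g., "drinking", "brush_use"
        if label in ("drinking", "brush_use"):
            print(frame.path, label, float(box.conf))
```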

15 pages, 1225 KB  
Article
A Self-Supervised Few-Shot Semantic Segmentation Method Based on Multi-Task Learning and Dense Attention Computation
by Kai Yi, Weihang Wang and Yi Zhang
Sensors 2024, 24(15), 4975; https://doi.org/10.3390/s24154975 - 31 Jul 2024
Viewed by 2361
Abstract
Nowadays, autonomous driving technology has become widely prevalent. Intelligent vehicles are equipped with various sensors (e.g., vision sensors, LiDAR, and depth cameras). Among them, vision systems with tailored semantic segmentation and perception algorithms play critical roles in scene understanding. However, traditional supervised semantic segmentation needs a large number of pixel-level manual annotations to complete model training. Although few-shot methods reduce the annotation work to some extent, they remain labor intensive. In this paper, a self-supervised few-shot semantic segmentation method based on Multi-task Learning and Dense Attention Computation (dubbed MLDAC) is proposed. The salient part of an image is split into two parts; one serves as the support mask for few-shot segmentation, while cross-entropy losses are calculated separately between the predicted results and both the other part and the entire salient region, as multi-task learning, so as to improve the model's generalization ability. Swin Transformer is used as our backbone to extract feature maps at different scales. These feature maps are then input to multiple levels of dense attention computation blocks to enhance pixel-level correspondence. The final prediction results are obtained through inter-scale mixing and feature skip connections. The experimental results indicate that MLDAC obtains 55.1% and 26.8% one-shot mIoU for self-supervised few-shot segmentation on the PASCAL-5i and COCO-20i datasets, respectively. In addition, it achieves 78.1% on the FSS-1000 few-shot dataset, proving its efficacy.
(This article belongs to the Section Sensing and Imaging)
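For readers unfamiliar with the mIoU figures quoted above, here is a self-contained sketch of the metric: build a confusion matrix from predicted and ground-truth masks, then average per-class intersection-over-union. Shapes and class counts are illustrative.

```python
# Standard mean intersection-over-union (mIoU) for segmentation masks.
import numpy as np

def mean_iou(pred, gt, num_classes):
    conf = np.zeros((num_classes, num_classes), dtype=np.int64)
    for p, g in zip(pred.ravel(), gt.ravel()):
        conf[g, p] += 1                       # rows: ground truth, cols: prediction
    inter = np.diag(conf)
    union = conf.sum(0) + conf.sum(1) - inter
    ious = inter / np.maximum(union, 1)
    return ious[union > 0].mean()             # average over classes present

pred = np.random.randint(0, 2, (64, 64))      # toy binary masks
gt = np.random.randint(0, 2, (64, 64))
print(f"mIoU = {mean_iou(pred, gt, 2):.3f}")
```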

30 pages, 641 KB  
Article
Strategies of Automated Machine Learning for Energy Sustainability in Green Artificial Intelligence
by Dagoberto Castellanos-Nieves and Luis García-Forte
Appl. Sci. 2024, 14(14), 6196; https://doi.org/10.3390/app14146196 - 16 Jul 2024
Cited by 12 | Viewed by 4311
Abstract
Automated machine learning (AutoML) is recognized for its efficiency in facilitating model development due to its ability to perform tasks autonomously, without constant human intervention. AutoML automates the development and optimization of machine learning models, leading to high energy consumption due to the large number of computations involved. Hyperparameter optimization algorithms, central to AutoML, can significantly impact its carbon footprint. This work introduces and investigates energy efficiency metrics for advanced hyperparameter optimization algorithms within AutoML. These metrics enable the evaluation and optimization of an algorithm's energy consumption, considering accuracy, sustainability, and reduced environmental impact. The experimentation demonstrates the application of Green AI principles to AutoML hyperparameter optimization algorithms. It assesses the current sustainability of AutoML practices and proposes strategies to make them more environmentally friendly. The findings indicate a reduction of 28.7% in CO2e emissions when implementing the Green AI strategy, compared to the Red AI strategy. This improvement in sustainability is achieved with a minimal decrease of 0.51% in validation accuracy. This study emphasizes the importance of continuing to investigate sustainability throughout the life cycle of AI, in line with the three fundamental pillars of sustainable development.
(This article belongs to the Section Ecology Science and Engineering)
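The kind of accounting such energy-efficiency metrics rest on reduces to simple arithmetic: energy drawn during a hyperparameter search times a grid carbon-intensity factor. A back-of-envelope sketch with invented numbers (not the paper's measurements):

```python
# Back-of-envelope CO2e accounting for a hyperparameter search.
# All numbers below are illustrative assumptions.
avg_power_w = 300           # assumed mean GPU+CPU draw during the search
search_hours = 12           # assumed duration of the HPO run
grid_kgco2_per_kwh = 0.4    # assumed grid carbon intensity

energy_kwh = avg_power_w / 1000 * search_hours
co2e_kg = energy_kwh * grid_kgco2_per_kwh
print(f"{energy_kwh:.1f} kWh -> {co2e_kg:.2f} kg CO2e")
# A "Green AI" schedule that prunes 28.7% of this at a 0.51% accuracy cost,
# as reported above, trades a small validation loss for a real energy saving.
```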

22 pages, 1448 KB  
Systematic Review
A Systematic Review of the Role of Multimodal Resources for Inclusive STEM Engagement in Early-Childhood Education
by Sarika Kewalramani, George Aranda, Jiqing Sun, Gerarda Richards, Linda Hobbs, Lihua Xu, Victoria Millar, Belinda Dealy and Bridgette Van Leuven
Educ. Sci. 2024, 14(6), 604; https://doi.org/10.3390/educsci14060604 - 4 Jun 2024
Cited by 5 | Viewed by 6811
Abstract
This paper presents the findings from a systematic review of 29 websites and 13 frameworks that provide STEM (Science, Technology, Engineering, Mathematics) educational resources for parents, educators, and children (birth–8 years of age). Our theoretical approach is rooted in a social semiotic perspective, which indicates that multimodality enables children to use different types of expression to communicate a message or share an idea. Using the PRISMA methodology and a narrative document analysis approach, the themes that emerged included how the content and resources available on the websites addressed whether multimodality supported STEM engagement in an inclusive manner. The findings revealed that there were few multimodal resources that engaged children with fun, interactive, and meaningful opportunities to be autonomous learners (e.g., with agency) (n = 11 out of 29), that moved between digital and hands-on physical spaces (n = 8 out of 29), or that employed gamification for deep learning (n = 4 out of 29); resources piquing children's imagination, inquiry, and creativity and linking to everyday STEM scenarios were also scarce (n = 10 out of 29). The implications lie in addressing early STEM engagement by considering children's learning abilities and agency, bearing in mind parents' and educators' sociocultural backgrounds, confidence in STEM awareness, and multimodal avenues for communicating STEM learning and inquiry.
(This article belongs to the Section STEM Education)

22 pages, 2334 KB  
Article
Vision-Based Object Manipulation for Activities of Daily Living Assistance Using Assistive Robot
by Md Tanzil Shahria, Jawhar Ghommam, Raouf Fareh and Mohammad Habibur Rahman
Automation 2024, 5(2), 68-89; https://doi.org/10.3390/automation5020006 - 15 Apr 2024
Cited by 10 | Viewed by 3916
Abstract
The increasing prevalence of upper and lower extremity (ULE) functional deficiencies presents a significant challenge, as it restricts individuals' ability to perform daily tasks independently. Robotic devices are emerging as assistive technology for individuals with limited ULE functionality in activities of daily living (ADLs). While assistive manipulators are available, manual control through traditional methods like joysticks can be cumbersome, particularly for individuals with severe hand impairments and vision limitations. Autonomous or semi-autonomous control of a robotic assistive device for ADL tasks therefore remains an open research problem. This study addresses the need to foster independence in ADLs by presenting a vision-based control system for a six-degrees-of-freedom (DoF) robotic manipulator designed for semi-autonomous "pick-and-place" tasks, one of the most common activities among ADLs. Our approach involves selecting and training a deep-learning-based object detection model with a dataset of 47 ADL objects, forming the base for a 3D ADL object localization algorithm. The proposed vision-based control system integrates this localization technique to identify and manipulate ADL objects (e.g., apples, oranges, capsicums, and cups) in real time, returning them to specific locations to complete the "pick-and-place" task. Experimental validation involving an xArm6 (six DoF) robot from UFACTORY in diverse settings demonstrates the system's adaptability and effectiveness, achieving an overall 72.9% success rate in detecting, localizing, and executing ADL tasks. This research contributes to the growing field of autonomous assistive devices, enhancing independence for individuals with functional impairments.
(This article belongs to the Collection Smart Robotics for Automation)
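The 3D object localization step described above can be illustrated with standard pinhole-camera back-projection: take the detected bounding-box centre, read its depth, and recover camera-frame coordinates. The intrinsics below are placeholders, and this sketch stands in for, rather than reproduces, the paper's algorithm.

```python
# Generic pinhole back-projection from a 2D detection plus a depth image.
# Intrinsic values are placeholder assumptions.
import numpy as np

FX, FY, CX, CY = 610.0, 610.0, 320.0, 240.0   # assumed camera intrinsics

def bbox_to_3d(bbox, depth_map):
    u = int((bbox[0] + bbox[2]) / 2)          # bbox = (x1, y1, x2, y2) pixels
    v = int((bbox[1] + bbox[3]) / 2)
    z = depth_map[v, u]                       # metres at the bbox centre
    x = (u - CX) * z / FX                     # pinhole back-projection
    y = (v - CY) * z / FY
    return np.array([x, y, z])                # target pose for the 6-DoF arm

depth = np.full((480, 640), 0.8)              # toy depth image
print(bbox_to_3d((300, 200, 340, 260), depth))
```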

29 pages, 10941 KB  
Article
Classification of Lakebed Geologic Substrate in Autonomously Collected Benthic Imagery Using Machine Learning
by Joseph K. Geisz, Phillipe A. Wernette and Peter C. Esselman
Remote Sens. 2024, 16(7), 1264; https://doi.org/10.3390/rs16071264 - 3 Apr 2024
Cited by 6 | Viewed by 2733
Abstract
Mapping benthic habitats with bathymetric, acoustic, and spectral data requires georeferenced ground-truth information about habitat types and characteristics. New technologies like autonomous underwater vehicles (AUVs) collect tens of thousands of images per mission, making image-based ground-truthing particularly attractive. Two types of machine learning (ML) models, random forest (RF) and deep neural network (DNN), were tested to determine whether ML models could serve as an accurate substitute for manual classification of AUV images for substrate type interpretation. RF models were trained to predict substrate class as a function of texture, edge, and intensity metrics (i.e., features) calculated for each image. Models were tested using a manually classified image dataset with 9-, 6-, and 2-class schemes based on the Coastal and Marine Ecological Classification Standard (CMECS). Results suggest that both RF and DNN models achieve comparable accuracies, with the 9-class models being least accurate (~73–78%) and the 2-class models being the most accurate (~95–96%). However, the DNN models were more efficient to train and apply because they did not require feature estimation before training or classification. Integrating ML models into the benthic habitat mapping process can improve our ability to efficiently and accurately ground-truth large areas of benthic habitat using AUV or similar images.
(This article belongs to the Topic Geocomputation and Artificial Intelligence for Mapping)
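A minimal sketch of the RF pipeline the abstract describes: compute simple intensity and edge features per image, then fit a random-forest substrate classifier. The feature set here is a generic stand-in for the paper's texture, edge, and intensity metrics, and the data are synthetic.

```python
# Illustrative RF substrate classifier on hand-crafted image features.
# Features and data are stand-ins, not the study's feature set.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def image_features(img):
    gy, gx = np.gradient(img.astype(float))
    edges = np.hypot(gx, gy)
    return [img.mean(), img.std(),            # intensity metrics
            edges.mean(), edges.std()]        # simple edge/texture metrics

rng = np.random.default_rng(0)
X = np.array([image_features(rng.integers(0, 255, (64, 64))) for _ in range(200)])
y = rng.integers(0, 2, 200)                   # e.g., a 2-class CMECS scheme

clf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X[:150], y[:150])
print("held-out accuracy:", clf.score(X[150:], y[150:]))
```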

17 pages, 12823 KB  
Article
Towards Fully Autonomous UAV: Damaged Building-Opening Detection for Outdoor-Indoor Transition in Urban Search and Rescue
by Ali Surojaya, Ning Zhang, John Ray Bergado and Francesco Nex
Electronics 2024, 13(3), 558; https://doi.org/10.3390/electronics13030558 - 30 Jan 2024
Cited by 6 | Viewed by 2293
Abstract
Autonomous unmanned aerial vehicle (UAV) technology is promising for minimizing human involvement in dangerous activities like urban search and rescue (USAR) missions, both indoors and outdoors. Automated navigation from outdoor to indoor environments is not trivial, as it encompasses the ability of a UAV to automatically map and locate the openings in a damaged building. This study focuses on developing a deep learning model for the detection of damaged building openings in real time. A novel damaged building-opening dataset containing images and mask annotations is presented, along with a comparison between single- and multi-task learning-based detectors. The deep-learning-based detector used in this study is based on YOLOv5. First, this study compared the capacity of different versions of YOLOv5 (i.e., small, medium, and large) to detect damaged building openings. Second, a multitask learning YOLOv5 was trained on the same dataset and compared with the single-task detector. The multitask learning (MTL) model was developed based on the YOLOv5 object detection architecture, adding a segmentation branch jointly with the detection head. This study found that the MTL-based YOLOv5 can improve detection performance by combining detection and segmentation losses. The YOLOv5s-MTL trained on the damaged building-opening dataset obtained 0.648 mAP, an increase of 0.167 over the single-task network, while its inference speed was 73 frames per second on the tested platform.
(This article belongs to the Special Issue Control and Applications of Intelligent Unmanned Aerial Vehicle)

19 pages, 6177 KB  
Article
SyS3DS: Systematic Sampling of Large-Scale LiDAR Point Clouds for Semantic Segmentation in Forestry Robotics
by Habibu Mukhandi, Joao Filipe Ferreira and Paulo Peixoto
Sensors 2024, 24(3), 823; https://doi.org/10.3390/s24030823 - 26 Jan 2024
Cited by 2 | Viewed by 2702
Abstract
Recently, new semantic segmentation and object detection methods have been proposed for the direct processing of three-dimensional (3D) LiDAR sensor point clouds. LiDAR can produce highly accurate and detailed 3D maps of natural and man-made environments and is used for sensing in many contexts due to its ability to capture more information, its robustness to dynamic changes in the environment compared to an RGB camera, and its cost, which has decreased in recent years, an important factor for many application scenarios. The challenge with high-resolution 3D LiDAR sensors is that they can output large amounts of 3D data, up to a few million points per second, which is difficult to process in real time when applying complex algorithms and models for efficient semantic segmentation. Most existing approaches are either only suitable for relatively small point clouds or rely on computationally intensive sampling techniques to reduce their size. As a result, most of these methods do not work in real time in realistic field robotics application scenarios, making them unsuitable for practical applications. Systematic point selection is a possible solution for reducing the amount of data to be processed. Although such selection is memory- and computationally efficient, it retains only a small subset of points, which may result in important features being missed. To address this problem, our proposed systematic sampling method, called SyS3DS (Systematic Sampling for 3D Semantic Segmentation), incorporates a technique in which the local neighbours of each point are retained to preserve geometric details. SyS3DS is based on the graph colouring algorithm and ensures that the selected points are non-adjacent in order to obtain a subset of points that is representative of the 3D points in the scene. To take advantage of ensemble learning, we pass a different subset of points for each epoch, leveraging a technique called auto-ensemble, where ensemble learning is realized as a collection of different learning models instead of tuning different hyperparameters individually during training and validation. SyS3DS has been shown to process up to 1 million points in a single pass. It outperforms the state of the art in efficient semantic segmentation on large datasets such as Semantic3D. We also present a preliminary study on the validity of LiDAR-only data, i.e., intensity values from LiDAR sensors without RGB values, for semi-autonomous robot perception.
(This article belongs to the Special Issue Sensor Based Perception for Field Robotics)
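The non-adjacency idea can be sketched as a greedy independent-set selection on a k-nearest-neighbour graph: keep a point, block its neighbours, repeat. This is an illustration in the spirit of the graph-colouring approach described above, not the authors' implementation; k and the graph construction are assumptions.

```python
# Sketch: sample mutually non-adjacent points on a kNN graph (greedy
# independent set). Illustrative of the idea, not SyS3DS itself.
import numpy as np
from scipy.spatial import cKDTree

def systematic_sample(points, k=8):
    tree = cKDTree(points)
    _, nbrs = tree.query(points, k=k + 1)     # nbrs[:, 0] is the point itself
    selected, blocked = [], np.zeros(len(points), dtype=bool)
    for i in np.random.permutation(len(points)):
        if not blocked[i]:
            selected.append(i)                # keep i, block its neighbours
            blocked[nbrs[i]] = True
    return np.asarray(selected)

cloud = np.random.rand(100_000, 3)
idx = systematic_sample(cloud)
print(f"kept {len(idx)} of {len(cloud)} points")
```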

18 pages, 6689 KB  
Article
Exploring the Potential of Ensembles of Deep Learning Networks for Image Segmentation
by Loris Nanni, Alessandra Lumini and Carlo Fantozzi
Information 2023, 14(12), 657; https://doi.org/10.3390/info14120657 - 12 Dec 2023
Cited by 7 | Viewed by 3800
Abstract
To identify objects in images, a complex set of skills is needed that includes understanding the context and being able to determine the borders of objects. In computer vision, this task is known as semantic segmentation and it involves categorizing each pixel in an image. It is crucial in many real-world situations: for autonomous vehicles, it enables the identification of objects in the surrounding area; in medical diagnosis, it enhances the ability to detect dangerous pathologies early, thereby reducing the risk of serious consequences. In this study, we compare the performance of various ensembles of convolutional and transformer neural networks. Ensembles can be created, e.g., by varying the loss function, the data augmentation method, or the learning rate strategy. Our proposed ensemble, which uses a simple averaging rule, demonstrates exceptional performance across multiple datasets. Notably, compared to prior state-of-the-art methods, our ensemble consistently shows improvements in the well-studied polyp segmentation problem. This problem involves the precise delineation and identification of polyps within medical images, and our approach showcases noteworthy advancements in this domain, obtaining an average Dice of 0.887, which outperforms the current SOTA with an average Dice of 0.885.
(This article belongs to the Special Issue Computer Vision, Pattern Recognition and Machine Learning in Italy)
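Both quoted ingredients, the simple averaging rule and the Dice score, fit in a few lines. The sketch below averages the members' probability maps before thresholding and then scores the result against a ground-truth mask; shapes and the member count are illustrative.

```python
# Sketch of the averaging-rule ensemble and the Dice metric quoted above.
import numpy as np

def ensemble_predict(prob_maps):
    return np.mean(prob_maps, axis=0) > 0.5   # average members, then threshold

def dice(pred, gt, eps=1e-7):
    inter = np.logical_and(pred, gt).sum()
    return (2 * inter + eps) / (pred.sum() + gt.sum() + eps)

members = np.random.rand(5, 256, 256)         # 5 networks' polyp probabilities
gt = np.random.rand(256, 256) > 0.5           # toy ground-truth mask
print(f"ensemble Dice = {dice(ensemble_predict(members), gt):.3f}")
```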

22 pages, 5355 KB  
Article
A Multi-Task Fusion Strategy-Based Decision-Making and Planning Method for Autonomous Driving Vehicles
by Weiguo Liu, Zhiyu Xiang, Han Fang, Ke Huo and Zixu Wang
Sensors 2023, 23(16), 7021; https://doi.org/10.3390/s23167021 - 8 Aug 2023
Cited by 10 | Viewed by 3049
Abstract
Autonomous driving technology based on deep reinforcement learning (DRL) is one of the most cutting-edge research fields worldwide. The agent achieves the goal of making independent decisions by interacting with the environment and learning driving strategies from environmental feedback. This technology has been widely used in end-to-end driving tasks. However, the field faces several challenges. First, developing on real vehicles is expensive, time-consuming, and risky. To expedite the testing, verification, and iteration of end-to-end deep reinforcement learning algorithms, a joint simulation development and validation platform was designed and implemented in this study based on VTD–CarSim and the TensorFlow deep learning framework, and research work was conducted on this platform. Second, sparse reward signals can cause problems (e.g., low sample efficiency), and the agent must be capable of navigating in unfamiliar environments and driving safely under a wide variety of weather and lighting conditions. To address the poor generalization of agents to unknown scenarios, a deep deterministic policy gradient (DDPG) decision-making and planning method based on a multi-task fusion strategy was proposed in this study. The main task, based on DRL decision-making and planning, and the auxiliary task, based on image semantic segmentation, were cross-fused, with part of the network shared with the main task to reduce the possibility of model overfitting and improve generalization ability. As indicated by the experimental results, first, the joint simulation development and validation platform built in this study exhibited prominent versatility: users could easily substitute any default module with customized algorithms and verify the effectiveness of new functions in enhancing overall performance using the platform's other default modules. Second, the proposed multi-task-fusion deep reinforcement learning strategy was competitive: its performance exceeded that of other DRL algorithms in certain tasks, improving the generalization ability of the vehicle decision-making and planning algorithm.
(This article belongs to the Special Issue Machine Learning for Autonomous Driving Perception and Prediction)

13 pages, 1364 KB  
Article
Evaluating of Education Effects of Online Learning for Local University Students in China: A Case Study
by Lifen Bai, Binbin Yang and Shichong Yuan
Sustainability 2023, 15(13), 9860; https://doi.org/10.3390/su15139860 - 21 Jun 2023
Cited by 2 | Viewed by 2470
Abstract
The spread and persistence of the global COVID-19 pandemic have caused online education to gradually become the "new normal" in higher education, and a comprehensive, systematic study of the online course learning of college students in the context of normalized online teaching is urgently needed. Higher education is the main stage for cultivating innovative talent, and the evaluation of collaborative education through science and education in the context of online education is also a key concern. Therefore, this study established a hierarchical evaluation system based on the Analytic Hierarchy Process (AHP), comprising three second-level evaluation factors and fifteen third-level evaluation factors. A judgment matrix for each level of evaluation factor was then constructed and passed the consistency test. Finally, the weights of each factor were determined, and the feasibility and effectiveness of this method for evaluating online learning quality were verified through a case study. The evaluation results indicate that the cultivation of online learning ability for local college students is more important than online course resources or the online learning environment. To promote educational equity, it is necessary not only to focus on improving students' academic performance but also to find ways to enhance students' learning abilities. In addition, cultivating communication and collaboration between teachers and students is an important way to improve the quality of online learning for college students.
(This article belongs to the Section Sustainable Education and Approaches)
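The AHP machinery the study relies on can be made concrete in a short worked example: derive factor weights from a pairwise-comparison judgment matrix via its principal eigenvector, then run the consistency test (consistency ratio CR < 0.1). The 3x3 matrix below is invented for illustration, not the study's actual judgments.

```python
# Worked AHP sketch: weights from the principal eigenvector of a
# pairwise-comparison matrix, plus Saaty's consistency check.
import numpy as np

A = np.array([[1.0, 3.0, 5.0],    # invented judgments: learning ability vs.
              [1/3, 1.0, 2.0],    # course resources vs. learning environment
              [1/5, 1/2, 1.0]])

eigvals, eigvecs = np.linalg.eig(A)
k = np.argmax(eigvals.real)
w = np.abs(eigvecs[:, k].real)
w /= w.sum()                                   # normalized factor weights

n = A.shape[0]
ci = (eigvals[k].real - n) / (n - 1)           # consistency index
RI = {3: 0.58, 4: 0.90, 5: 1.12}[n]            # Saaty's random index
print("weights:", w.round(3), "CR =", round(ci / RI, 3))  # CR < 0.1 passes
```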
