Search Results (7)

Search Parameters:
Keywords = CO2 caption

14 pages, 2414 KiB  
Article
Personalized Contextual Information Delivery Using Road Sign Recognition
by Byungjoon Kim and Yongduek Seo
Appl. Sci. 2025, 15(11), 6051; https://doi.org/10.3390/app15116051 - 28 May 2025
Viewed by 290
Abstract
Road sign recognition is essential for navigation and autonomous driving applications. While existing models focus primarily on text detection and extraction, they fail to incorporate user-specific contextual information, limiting their effectiveness in real-world scenarios. This study proposes a modular system that enhances road sign recognition by integrating user-adapted contextual reasoning. The system applies a step-by-step Chain of Thought (CoT) approach to link detected road signs with relevant contextual data, such as location, speed, and destination. Compared to traditional image captioning models, our approach significantly improves information relevance and usability. Experimental results show that the proposed system achieves a 23.4% increase in user-adapted information accuracy and reduces interpretation errors by 17.8% in real-world navigation scenarios. These findings demonstrate that semantic inference-based reasoning improves decision-making efficiency, making road sign recognition systems more practical for real-world applications. The study also discusses challenges such as real-time processing limitations and potential future improvements for broader infrastructure recognition. Full article

30 pages, 452 KiB  
Article
Advancing Multimodal Large Language Models: Optimizing Prompt Engineering Strategies for Enhanced Performance
by Minjun Son and Sungjin Lee
Appl. Sci. 2025, 15(7), 3992; https://doi.org/10.3390/app15073992 - 4 Apr 2025
Cited by 2 | Viewed by 2646
Abstract
This study investigates prompt engineering (PE) strategies to mitigate hallucination, a key limitation of multimodal large language models (MLLMs). To address this issue, we explore five prominent multimodal PE techniques: in-context learning (ICL), chain of thought (CoT), step-by-step reasoning (SSR), tree of thought (ToT), and retrieval-augmented generation (RAG). These techniques are systematically applied across multiple datasets with distinct domains and characteristics. Based on the empirical findings, we propose the greedy prompt engineering strategy (Greedy PES), a methodology for optimizing PE application across different datasets and MLLM models. To evaluate user satisfaction with MLLM-generated responses, we adopt a comprehensive set of evaluation metrics, including BLEU, ROUGE, METEOR, S-BERT, MoverScore, and CIDEr. A weighted aggregate evaluation score is introduced to provide a holistic assessment of model performance under varying conditions. Experimental results demonstrate that the optimal prompt engineering strategy varies significantly depending on both dataset properties and the MLLM model used. Specifically, datasets categorized as general benefit the most from ICL, ToT, and RAG, whereas mathematical datasets perform optimally with ICL, SSR, and ToT. In scientific reasoning tasks, RAG and SSR emerge as the most effective strategies. Applying Greedy PES leads to a substantial improvement in performance across different multimodal tasks, achieving an average evaluation score enhancement of 184.3% for general image captioning, 90.3% for mathematical visual question answering (VQA), and 49.1% for science visual question answering (VQA) compared to conventional approaches. These findings highlight the effectiveness of structured PE strategies in optimizing MLLM performance and provide a robust framework for PE-driven model enhancement across diverse multimodal applications. Full article

21 pages, 1298 KiB  
Article
Co-LLaVA: Efficient Remote Sensing Visual Question Answering via Model Collaboration
by Fan Liu, Wenwen Dai, Chuanyi Zhang, Jiale Zhu, Liang Yao and Xin Li
Remote Sens. 2025, 17(3), 466; https://doi.org/10.3390/rs17030466 - 29 Jan 2025
Viewed by 1998
Abstract
Large vision language models (LVLMs) are built upon large language models (LLMs) and incorporate non-textual modalities; they can perform various multimodal tasks. Applying LVLMs in remote sensing (RS) visual question answering (VQA) tasks can take advantage of the powerful capabilities to promote the development of VQA in RS. However, due to the greater complexity of remote sensing images compared to natural images, general-domain LVLMs tend to perform poorly in RS scenarios and are prone to hallucination phenomena. Multi-agent debate for collaborative reasoning is commonly utilized to mitigate hallucination phenomena. Although this method is effective, it comes with a significant computational burden (e.g., high CPU/GPU demands and slow inference speed). To address these limitations, we propose Co-LLaVA, a model specifically designed for RS VQA tasks. Specifically, Co-LLaVA employs model collaboration between Large Language and Vision Assistant (LLaVA-v1.5) and Contrastive Captioners (CoCas). It combines LVLM with a lightweight generative model, reducing computational burden compared to multi-agent debate. Additionally, through high-dimensional multi-scale features and higher-resolution images, Co-LLaVA can enhance the perception of details in RS images. Experimental results demonstrate the significant performance improvements of our Co-LLaVA over existing LVLMs (e.g., Geochat, RSGPT) on multiple metrics of four RS VQA datasets (e.g., +3% over SkySenseGPT on “Rural/Urban” accuracy in the test set of RSVQA-LR dataset). Full article

16 pages, 13325 KiB  
Article
Effect of NOX and SOX Contaminants on Corrosion Behaviors of 304L and 316L Stainless Steels in Monoethanolamine Aqueous Amine Solutions
by Eleni Lamprou, Fani Stergioudi, Georgios Skordaris, Nikolaos Michailidis, Evie Nessi, Athanasios I. Papadopoulos and Panagiotis Seferlis
Coatings 2024, 14(7), 842; https://doi.org/10.3390/coatings14070842 - 5 Jul 2024
Cited by 1 | Viewed by 2093
Abstract
This work is devoted to evaluating the corrosion behaviors of SS 304L and SS 316L in monoethanolamine solutions (MEA) containing SOX and NOX pollutants, examining both lean and CO2-loaded conditions at 25 °C and 40 °C. Electrochemical techniques (potentiodynamic and cyclic polarization) were used along with Scanning Electron Microscopy, Confocal Microscopy and weight loss measurements. The results reveal that the introduction of SOX and NOX pollutants increased the corrosion rate, whereas CO2 loading primarily reduced the corrosion resistance in the lean MEA solution, while its impact on solutions with SOX and NOX was less pronounced. This suggests that SOX and NOX play primary roles in the metal’s dissolution. Also, SS 316L demonstrated superior corrosion resistance compared to 304L in nearly all of the cases examined. Elevated temperatures were also found to intensify the corrosion rate, indicating a correlation between the corrosion rate and temperature. A microscopic observation and EDX analysis revealed that corrosion products are characterized by high concentrations of iron (Fe) and oxygen (O) as well as carbon (C). There is also an indication of the possible formation of amine complexes, suggesting a potential for amine degradation. No pitting corrosion was observed in SS 304L and SS 316L across any tested solution. Finally, the immersion results expose a tendency for passivity in all amine solutions and at both temperatures after several days of exposure. Moreover, they confirm the very low corrosion rate calculated from potentiodynamic curves due to minimal weight loss after 24 days of immersion. Full article

18 pages, 5740 KiB  
Article
Design and Implementation of Industrial Accident Detection Model Based on YOLOv4
by Taejun Lee, Keanseb Woo, Panyoung Kim and Hoekyung Jung
Appl. Sci. 2023, 13(18), 10163; https://doi.org/10.3390/app131810163 - 9 Sep 2023
Cited by 3 | Viewed by 2160
Abstract
Korea’s industrial accident rate ranks high among Organization for Economic Co-operation and Development countries. Moreover, large-scale accidents have recently occurred. Accordingly, the requirements for management and supervision in industrial sites are increasing; in this context, the “Act on Punishment of Serious Accidents, etc.” has been enacted. To prevent such industrial accidents, various data collected by devices such as sensors and closed-circuit televisions (CCTVs) are used to track workers and detect hazardous substances, gases, and fires at industrial sites. In this study, an industrial area requiring such technology is selected. A hazardous situation event is derived, and a dataset is built using CCTV data. A violation corresponding to a hazardous situation event is detected and a warning is provided. The events incorporate requirements extracted from industrial sites, such as those concerning collision risks and the wearing of safety equipment. The precision of the event violation detection exceeds 95% and the response and delay times are under 20 ms. Thus, this system is expected to be usable at industrial sites and in other intelligent industrial safety solutions. Full article
(This article belongs to the Section Applied Industrial Technologies)

15 pages, 1384 KiB  
Article
Circumnavigating the Revolving Door of an Ethical Milieu
by Carmel Capewell, Sarah Frodsham and Kim Waring Paynter
Educ. Sci. 2022, 12(4), 250; https://doi.org/10.3390/educsci12040250 - 31 Mar 2022
Cited by 3 | Viewed by 2562
Abstract
This paper reflects on an Ethical Review Board’s (ERB) established structure of practice throughout a student-led project. We use the research project as a means of exploring the three questions set by the Editors, Fox and Busher, regarding the role of ERBs throughout the research process. We gained full university-level ethical approval in October 2020. This project initially focused on collecting data from students, from a UK university. The participatory way we collaborated with both undergraduates and postgraduates illuminated their individual unique perspectives and successfully facilitated their agentive contributions. This required on-going simultaneous negotiation of predetermined ethical procedures through the ERB. We termed this iterative process ‘circumnavigating the revolving door’ as it summarised revisiting ethical approval in the light of requests from our student participants. The participants were also invited to be part of the analysis and dissemination phase of this research. Original data collected related to personalised experiences of learning during the on-going global pandemic. The philosophical approach adopted was through an adaptation of Photovoice. That is, with limited direction by the researchers, the participants were invited to construct images (photos or hand drawn pictures), with captions (written text or voice), to explore their own educative circumstances. With this in mind, this paper explores the students’ participatory agency throughout this visual methods project through three lenses: namely, the appropriateness of ethical practices within a contextualised scenario (i.e., researching learning during lockdown in a higher educational institution); how the ethical process of an educational establishment supported the dynamic and iterative nature of participant-led research; and finally, how the original researchers’ experiences can inform ethical regulations and policy, both nationally and internationally. 
The circumnavigation of the revolving door of participatory ethics has proved invaluable during this research. This iterative cycle was necessary to incorporate the students’ (or co-researchers’) suggested contributions. One example includes gaining the ERB’s approval, post full approval, for participants to audio record their own captions for a public-facing website. From originally welcoming the students as participants, to facilitating them to become agentive co-researchers, it became increasingly important to provide them with opportunities to be actively involved in all parts of the research process. The reciprocal iterative relationship developed between co-researchers, researchers and the ERB served to strengthen the outcomes of the project. Full article
(This article belongs to the Special Issue Regulation and Ethical Practice for Educational Research)

10 pages, 680 KiB  
Article
Accessibility of Online Resources for Associations Providing Services to People with Brain Injuries in Covid-19 Pandemic
by Nolwenn Lapierre, Olivier Piquer, Erik Celikovic, François Routhier, Julie Ruel and Marie-Eve Lamontagne
Int. J. Environ. Res. Public Health 2021, 18(23), 12609; https://doi.org/10.3390/ijerph182312609 - 30 Nov 2021
Cited by 6 | Viewed by 1977
Abstract
Background. Since the Covid-19 pandemic, many community-based services for people with traumatic brain injury (TBI) have been moved online, which may have hindered their accessibility. The study aims to assess the accessibility of online information and resources dedicated to people with TBI. Methods. The websites of 14 organizations offering information and resources to people with TBI in Quebec were evaluated. Two co-authors independently evaluated one page of each website and compared their results. Descriptive statistical analyses were performed. Results. The average accessibility score of the 14 websites evaluated was 54% with a standard deviation of 16%. Website design and writing were the most accessible aspects (72.3%). Only two out of the 14 websites (14%) presented multimedia content. This category presented the most barriers to accessibility with a score of 42%. Regarding images, they reached an accessibility score of 46%. Their main shortcoming was the absence of a caption. Conclusion. This study highlights accessibility issues specific to people with TBI to access online resources and identifies specific areas of improvement. The results of this study provide community organizations with avenues of improvement to make their online resources more accessible to people with TBI and may therefore lead to improved community practices. Full article
(This article belongs to the Section Disabilities)