Search Results (21)

Search Parameters:
Keywords = Parameter-Efficient Fine-Tuning (PEFT)

15 pages, 3233 KB  
Article
Optimizing Client Participation in Communication-Constrained Federated LLM Adaptation with LoRA
by Faranaksadat Solat and Joohyung Lee
Sensors 2025, 25(21), 6538; https://doi.org/10.3390/s25216538 - 23 Oct 2025
Viewed by 518
Abstract
Federated learning (FL) enables privacy-preserving adaptation of large language models (LLMs) across distributed clients. However, deploying FL in edge environments remains challenging because of the high communication overhead of full-model updates. Recent advances in parameter-efficient fine-tuning (PEFT), particularly low-rank adaptation (LoRA), have substantially reduced update sizes by injecting lightweight trainable matrices into pretrained transformers, thereby making FL with LLMs more feasible. In this paper, we propose LoRaC-GA, a communication-aware optimization framework that dynamically determines the optimal number of clients to participate in each round under a fixed bandwidth constraint. We formulated a max-min objective to jointly maximize the model accuracy and communication efficiency and solved the resulting non-convex problem using a genetic algorithm (GA). To further reduce the overhead, we integrated a structured peer-to-peer collaboration protocol with log₂ K complexity, enabling scalable communication without full connectivity. The simulation results demonstrate that LoRaC-GA adaptively selects the optimal client count, achieving competitive accuracy while significantly reducing the communication cost. The proposed framework is well-suited for bandwidth-constrained edge deployments involving large-scale LLMs.
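
As a rough illustration of the search described above, the sketch below runs a tiny genetic algorithm over the client count K under a fixed bandwidth budget. The max-min objective, the update size, and the accuracy/communication proxies are illustrative stand-ins, not the paper's actual models.

```python
import random

BANDWIDTH = 100.0   # total per-round budget (MB) -- illustrative
UPDATE_MB = 4.2     # size of one client's LoRA update (MB) -- illustrative
MAX_CLIENTS = 64

def fitness(k: int) -> float:
    """Toy max-min objective: min of an accuracy proxy (rises with k,
    saturating) and a communication-efficiency proxy (falls as k
    consumes the bandwidth budget). Stand-ins for the paper's models."""
    if k < 1 or k * UPDATE_MB > BANDWIDTH:
        return float("-inf")                    # infeasible under the budget
    acc_proxy = 1.0 - 1.0 / (1.0 + 0.3 * k)
    comm_proxy = 1.0 - (k * UPDATE_MB) / BANDWIDTH
    return min(acc_proxy, comm_proxy)           # max-min formulation

def ga_client_count(pop: int = 20, gens: int = 50, mut_p: float = 0.3) -> int:
    population = [random.randint(1, MAX_CLIENTS) for _ in range(pop)]
    for _ in range(gens):
        parents = sorted(population, key=fitness, reverse=True)[: pop // 2]
        children = []
        while len(children) < pop - len(parents):
            a, b = random.sample(parents, 2)
            child = (a + b) // 2                # arithmetic crossover
            if random.random() < mut_p:         # bounded integer mutation
                child = max(1, min(MAX_CLIENTS, child + random.randint(-3, 3)))
            children.append(child)
        population = parents + children
    return max(population, key=fitness)

print("selected client count:", ga_client_count())
```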

31 pages, 914 KB  
Review
A Survey of Large Language Models: Evolution, Architectures, Adaptation, Benchmarking, Applications, Challenges, and Societal Implications
by Seyed Mahmoud Sajjadi Mohammadabadi, Burak Cem Kara, Can Eyupoglu, Can Uzay, Mehmet Serkan Tosun and Oktay Karakuş
Electronics 2025, 14(18), 3580; https://doi.org/10.3390/electronics14183580 - 9 Sep 2025
Viewed by 4637
Abstract
This survey provides an in-depth review of large language models (LLMs), highlighting the significant paradigm shift they represent in artificial intelligence. Our purpose is to consolidate state-of-the-art advances in LLM design, training, adaptation, evaluation, and application for both researchers and practitioners. To accomplish this, we trace the evolution of language models and describe core approaches, including parameter-efficient fine-tuning (PEFT). The methodology involves a thorough survey of real-world LLM applications across the scientific, engineering, healthcare, and creative sectors, coupled with a review of current benchmarks. Our findings indicate that high training and inference costs are shaping market structures, raising economic and labor concerns, while also underscoring a persistent need for human oversight in assessment. Key trends include the development of unified multimodal architectures capable of processing varied data inputs and the emergence of agentic systems that exhibit complex behaviors such as tool use and planning. We identify critical open problems, such as detectability, data contamination, generalization, and benchmark diversity. Ultimately, we conclude that overcoming these complex technical, economic, and social challenges necessitates collaborative advancements in adaptation, evaluation, infrastructure, and governance.
(This article belongs to the Section Artificial Intelligence)

36 pages, 23263 KB  
Article
RL-TweetGen: A Socio-Technical Framework for Engagement-Optimized Short Text Generation in Digital Commerce Using Large Language Models and Reinforcement Learning
by Chitrakala S and Pavithra S S
J. Theor. Appl. Electron. Commer. Res. 2025, 20(3), 218; https://doi.org/10.3390/jtaer20030218 - 26 Aug 2025
Viewed by 1395
Abstract
In the rapidly evolving landscape of digital marketing and electronic commerce, short-form content—particularly on platforms like Twitter (now X)—has become pivotal for real-time branding, community engagement, and product promotion. The rise of Non-Fungible Tokens (NFTs) and Web3 ecosystems further underscores the need for domain-specific, engagement-oriented social media content. However, automating the generation of such content while balancing linguistic quality, semantic relevance, and audience engagement remains a substantial challenge. To address this, we propose RL-TweetGen, a socio-technical framework that integrates instruction-tuned large language models (LLMs) with reinforcement learning (RL) to generate concise, impactful, and engagement-optimized tweets. The framework incorporates a structured pipeline comprising domain-specific data curation, semantic classification, and intent-aware prompt engineering, and leverages Parameter-Efficient Fine-Tuning (PEFT) with LoRA for scalable model adaptation. We fine-tuned and evaluated three LLMs—LLaMA-3.1-8B, Mistral-7B Instruct, and DeepSeek 7B Chat—guided by a hybrid reward function that blends XGBoost-predicted engagement scores with expert-in-the-loop feedback. To enhance lexical diversity and contextual alignment, we implemented advanced decoding strategies, including Tailored Beam Search, Enhanced Top-p Sampling, and Contextual Temperature Scaling. A case study focused on NFT-related tweet generation demonstrated the practical effectiveness of RL-TweetGen. Experimental results showed that Mistral-7B achieved the highest lexical fluency (BLEU: 0.2285), LLaMA-3.1 exhibited superior semantic precision (BERT-F1: 0.8155), while DeepSeek 7B provided balanced performance. Overall, RL-TweetGen presents a scalable and adaptive solution for marketers, content strategists, and Web3 platforms seeking to automate and optimize social media engagement. The framework advances the role of generative AI in digital commerce by aligning content generation with platform dynamics, user preferences, and marketing goals.
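
The PEFT step of such a pipeline can be approximated with the Hugging Face peft library; the model id, rank, and target modules below are assumptions rather than the paper's exact configuration, and the hybrid reward is reduced to a simple weighted blend.

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# LoRA adaptation of an instruction-tuned LLM (model id and
# hyperparameters are illustrative, not the paper's exact setup).
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2", torch_dtype=torch.bfloat16
)
lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only a fraction of a percent trains

def hybrid_reward(xgb_score: float, expert_score: float, lam: float = 0.7) -> float:
    """Weighted blend of a learned engagement predictor and expert
    feedback -- a stand-in for the paper's hybrid reward function."""
    return lam * xgb_score + (1.0 - lam) * expert_score
```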

17 pages, 2230 KB  
Article
Enhancing Diffusion-Based Music Generation Performance with LoRA
by Seonpyo Kim, Geonhui Kim, Shoki Yagishita, Daewoon Han, Jeonghyeon Im and Yunsick Sung
Appl. Sci. 2025, 15(15), 8646; https://doi.org/10.3390/app15158646 - 5 Aug 2025
Viewed by 2186
Abstract
Recent advancements in generative artificial intelligence have significantly progressed the field of text-to-music generation, enabling users to create music from natural language descriptions. Despite the success of various models, such as MusicLM, MusicGen, and AudioLDM, the current approaches struggle to capture fine-grained genre-specific characteristics, precisely control musical attributes, and handle underrepresented cultural data. This paper introduces a novel, lightweight fine-tuning method for the AudioLDM framework using low-rank adaptation (LoRA). By updating only selected attention and projection layers, the proposed method enables efficient adaptation to musical genres with limited data and computational cost. The proposed method enhances controllability over key musical parameters such as rhythm, emotion, and timbre. At the same time, it maintains the overall quality of music generation. This paper represents the first application of LoRA in AudioLDM, offering a scalable solution for fine-grained, genre-aware music generation and customization. The experimental results demonstrate that the proposed method improves the semantic alignment and statistical similarity compared with the baseline. The contrastive language–audio pretraining score increased by 0.0498, indicating enhanced text-music consistency. The kernel audio distance score decreased by 0.8349, reflecting improved similarity to real music distributions. The mean opinion score ranged from 3.5 to 3.8, confirming the perceptual quality of the generated music.
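
A minimal sketch of the core mechanism, assuming a standard LoRA formulation: a frozen linear layer (for example, an attention projection inside the diffusion model) is augmented with a trainable low-rank update. The rank, scaling, and layer choice here are illustrative.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wrap a frozen linear layer with a trainable low-rank update:
    y = W x + (alpha / r) * B(A(x)). A minimal stand-in for injecting
    LoRA into selected attention/projection layers."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False             # freeze pretrained weights
        self.A = nn.Linear(base.in_features, r, bias=False)
        self.B = nn.Linear(r, base.out_features, bias=False)
        nn.init.zeros_(self.B.weight)           # start as an identity update
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * self.B(self.A(x))

# Example: adapt only one attention projection of the backbone.
attn_q = LoRALinear(nn.Linear(320, 320), r=8)
print(sum(p.numel() for p in attn_q.parameters() if p.requires_grad))  # 2 * 8 * 320
```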

20 pages, 407 KB  
Article
Leveraging Asymmetric Adaptation with Dynamic Sparse LoRA for Enhanced Nuance in LLM-Based Offensive Language Detection
by Yanzhe Wang, Bingquan Chen and Jingchao Sun
Symmetry 2025, 17(7), 1076; https://doi.org/10.3390/sym17071076 - 7 Jul 2025
Viewed by 1411
Abstract
The challenge of detecting nuanced, context-dependent offensive language highlights the need for Large Language Model (LLM) adaptation strategies that can effectively address inherent data and task asymmetries. Standard Parameter-Efficient Finetuning (PEFT) methods like Low-Rank Adaptation (LoRA), while efficient, often employ a more uniform, or symmetric, update mechanism that can be suboptimal for capturing such linguistic subtleties. In this paper, we propose Dynamic Sparse LoRA (DS-LoRA), a novel technique that leverages asymmetric adaptation to enhance LLM finetuning for nuanced offensive language detection. DS-LoRA achieves this by (1) incorporating input-dependent gating mechanisms, enabling the asymmetric modulation of LoRA module contributions based on instance-specific characteristics, and (2) promoting asymmetric sparsity within LoRA update matrices via L1 regularization. This dual asymmetric strategy empowers the model to selectively engage and refine only the most pertinent parameters for a given input, fostering a more parsimonious and contextually aware adaptation. Extensive experiments on benchmark datasets demonstrate DS-LoRA’s significant performance gains over standard LoRA and other strong baselines, particularly in identifying subtle and contextually ambiguous offensive content, underscoring the benefits of its asymmetric adaptive capabilities.
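
A minimal sketch of the two mechanisms as described, with assumed details: an input-dependent sigmoid gate modulates the LoRA update per instance, and an L1 penalty on the low-rank matrices promotes sparsity.

```python
import torch
import torch.nn as nn

class DSLoRALinear(nn.Module):
    """Sketch of DS-LoRA's two asymmetric mechanisms: an instance-level
    gate scaling the LoRA update, and an L1 penalty sparsifying the
    low-rank matrices. Gate architecture and pooling are assumptions."""
    def __init__(self, base: nn.Linear, r: int = 8):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False
        self.A = nn.Linear(base.in_features, r, bias=False)
        self.B = nn.Linear(r, base.out_features, bias=False)
        nn.init.zeros_(self.B.weight)
        self.gate = nn.Sequential(nn.Linear(base.in_features, 1), nn.Sigmoid())

    def forward(self, x):                            # x: (batch, seq, dim)
        g = self.gate(x.mean(dim=1, keepdim=True))   # instance gate in (0, 1)
        return self.base(x) + g * self.B(self.A(x))

    def l1_penalty(self) -> torch.Tensor:
        return self.A.weight.abs().sum() + self.B.weight.abs().sum()

layer = DSLoRALinear(nn.Linear(768, 768))
x = torch.randn(4, 32, 768)
loss = layer(x).pow(2).mean() + 1e-4 * layer.l1_penalty()  # task loss + sparsity
loss.backward()
```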

34 pages, 963 KB  
Review
Synergizing Intelligence and Privacy: A Review of Integrating Internet of Things, Large Language Models, and Federated Learning in Advanced Networked Systems
by Hongming Yang, Hao Liu, Xin Yuan, Kai Wu, Wei Ni, J. Andrew Zhang and Ren Ping Liu
Appl. Sci. 2025, 15(12), 6587; https://doi.org/10.3390/app15126587 - 11 Jun 2025
Cited by 2 | Viewed by 2592
Abstract
Bringing together the Internet of Things (IoT), LLMs, and Federated Learning (FL) offers exciting possibilities, creating a synergy to build smarter, privacy-preserving distributed systems. This review explores the merging of these technologies, particularly within edge computing environments. We examine current architectures and practical methods enabling this fusion, such as efficient low-rank adaptation (LoRA) for fine-tuning large models and memory-efficient Split Federated Learning (SFL) for collaborative edge training. However, this integration faces significant hurdles: the resource limitations of IoT devices, unreliable network communication, data heterogeneity, diverse security threats, fairness considerations, and regulatory demands. While other surveys cover pairwise combinations, this review distinctively analyzes the three-way synergy, highlighting how IoT, LLMs, and FL working in concert unlock capabilities unattainable otherwise. Our analysis compares various strategies proposed to tackle these issues (e.g., federated vs. centralized, SFL vs. standard FL, DP vs. cryptographic privacy), outlining their practical trade-offs. We showcase real-world progress and potential applications in domains like Industrial IoT and smart cities, considering both opportunities and limitations. Finally, this review identifies critical open questions and promising future research paths, including ultra-lightweight models, robust algorithms for heterogeneity, machine unlearning, standardized benchmarks, novel FL paradigms, and next-generation security. Addressing these areas is essential for responsibly harnessing this powerful technological blend.

22 pages, 1562 KB  
Article
Leveraging Vision Foundation Model via PConv-Based Fine-Tuning with Automated Prompter for Defect Segmentation
by Yifan Jiang, Jinshui Chen and Jiangang Lu
Sensors 2025, 25(8), 2417; https://doi.org/10.3390/s25082417 - 11 Apr 2025
Cited by 3 | Viewed by 1616
Abstract
In industrial scenarios, image segmentation is essential for accurately identifying defect regions. Recently, the emergence of foundation models driven by powerful computational resources and large-scale training data has brought about a paradigm shift in deep learning-based image segmentation. The Segment Anything Model (SAM) has shown exceptional performance across various downstream tasks, owing to its vast semantic knowledge and strong generalization capabilities. However, the feature distribution discrepancy, reliance on manually labeled prompts, and limited category information of SAM reduce its scalability in industrial settings. To address these issues, we propose PA-SAM, an industrial defect segmentation framework based on SAM. Firstly, to bridge the gap between SAM’s pre-training data and the distinct characteristics of industrial defects, we introduce a parameter-efficient fine-tuning (PEFT) technique incorporating lightweight Multi-Scale Partial Convolution Aggregation (MSPCA) into Low-Rank Adaptation (LoRA), named MSPCA-LoRA, which effectively enhances the image encoder’s sensitivity to prior knowledge biases while maintaining PEFT efficiency. Furthermore, we present the Image-to-Prompt Embedding Generator (IPEG), which utilizes image embeddings to autonomously create high-quality prompt embeddings for directing mask segmentation, eliminating the limitations of manually provided prompts. Finally, we apply effective refinements to SAM’s mask decoder, transforming SAM into an end-to-end semantic segmentation framework. On two real-world defect segmentation datasets, PA-SAM achieves mean Intersection over Union (mIoU) scores of 73.87% and 68.30%, as well as mean Dice coefficients of 84.90% and 80.22%, outperforming other state-of-the-art algorithms and further demonstrating its robust generalization and application potential.
(This article belongs to the Section Intelligent Sensors)
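
A speculative sketch of the MSPCA idea, assuming the "partial convolution" follows the usual pattern of convolving only a slice of the channels, here at several kernel sizes; the paper's actual block may differ.

```python
import torch
import torch.nn as nn

class MSPCA(nn.Module):
    """Guess at Multi-Scale Partial Convolution Aggregation: each branch
    convolves only a fraction of the channels (partial convolution) at a
    different kernel size, branch outputs are summed, and the remaining
    channels pass through untouched. Structure is an assumption."""
    def __init__(self, dim: int, scales=(3, 5, 7), part: float = 0.25):
        super().__init__()
        self.cp = int(dim * part)               # channels actually convolved
        self.branches = nn.ModuleList(
            nn.Conv2d(self.cp, self.cp, k, padding=k // 2) for k in scales
        )

    def forward(self, x):                       # x: (B, C, H, W)
        head, tail = x[:, : self.cp], x[:, self.cp :]
        head = sum(branch(head) for branch in self.branches)  # multi-scale sum
        return torch.cat([head, tail], dim=1)   # identity path for the rest

out = MSPCA(64)(torch.randn(2, 64, 32, 32))
print(out.shape)  # torch.Size([2, 64, 32, 32])
```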

28 pages, 3613 KB  
Article
Chatbot Based on Large Language Model to Improve Adherence to Exercise-Based Treatment in People with Knee Osteoarthritis: System Development
by Humberto Farías, Joaquín González Aroca and Daniel Ortiz
Technologies 2025, 13(4), 140; https://doi.org/10.3390/technologies13040140 - 4 Apr 2025
Cited by 1 | Viewed by 2158
Abstract
Knee osteoarthritis (KOA) is a prevalent condition globally, leading to significant pain and disability, particularly in individuals over the age of 40. While exercise has been shown to reduce symptoms and improve physical function and quality of life in patients with KOA, long-term adherence to exercise programs remains a challenge due to the lack of ongoing support. To address this, a chatbot was developed using large language models (LLMs) to provide evidence-based guidance and promote adherence to treatment. A systematic review conducted under the PRISMA framework identified relevant clinical guidelines that served as the foundational knowledge base for the chatbot. The Mistral 7B model, optimized with Parameter-Efficient Fine-Tuning (PEFT) and Mixture-of-Experts (MoE) techniques, was integrated to ensure computational efficiency and mitigate hallucinations, a critical concern in medical applications. Additionally, the chatbot employs Self-Reflective Retrieval-Augmented Generation (SELF-RAG) combined with Chain of Thought (CoT) reasoning, enabling dynamic query reformulation and the generation of accurate, evidence-based responses tailored to patient needs. The chatbot was evaluated by comparing pre- and post-improvement versions and against a reference model (ChatGPT), using metrics of accuracy, relevance, and consistency. The results demonstrated significant improvements in response quality and conversational coherence, emphasizing the potential of integrating advanced LLMs with retrieval and reasoning methods to address critical challenges in healthcare. This approach not only enhances treatment adherence but also strengthens patient–provider interactions in managing chronic conditions like KOA.
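
The retrieval-and-reflection loop such a system implies might look like the following, where retrieve, generate, and is_supported are dummy placeholders for the guideline vector store, the fine-tuned Mistral 7B model, and a critic step; the control flow is an assumption, not the paper's implementation.

```python
# Minimal shape of a self-reflective RAG (SELF-RAG) loop with
# chain-of-thought prompting. All three helpers are placeholders.

def retrieve(query: str) -> list[str]:
    # placeholder: search the clinical-guideline knowledge base
    return ["Exercise therapy is a first-line treatment for knee osteoarthritis."]

def generate(prompt: str) -> str:
    # placeholder: call the fine-tuned LLM
    return "Step 1: ... Therefore, continue the prescribed exercise program."

def is_supported(answer: str, passages: list[str]) -> bool:
    # placeholder reflection step: does the evidence back the answer?
    return True

def answer_with_self_rag(question: str, max_rounds: int = 3) -> str:
    query = question
    for _ in range(max_rounds):
        passages = retrieve(query)
        prompt = ("Use only the evidence below. Think step by step, then answer.\n"
                  + "\n".join(passages) + f"\nQuestion: {question}")
        answer = generate(prompt)
        if is_supported(answer, passages):
            return answer
        # Reflection failed: reformulate the query and retry retrieval.
        query = generate(f"Rewrite this search query to find better evidence: {query}")
    return "No reliable guidance found; please consult a clinician."

print(answer_with_self_rag("Is it safe to exercise with knee osteoarthritis?"))
```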

20 pages, 55414 KB  
Article
Parameter-Efficient Fine-Tuning for Individual Tree Crown Detection and Species Classification Using UAV-Acquired Imagery
by Jiuyu Zhang, Fan Lei and Xijian Fan
Remote Sens. 2025, 17(7), 1272; https://doi.org/10.3390/rs17071272 - 3 Apr 2025
Cited by 3 | Viewed by 1806
Abstract
Pre-trained foundation models, trained on large-scale datasets, have demonstrated significant success in a variety of downstream vision tasks. Parameter-efficient fine-tuning (PEFT) methods aim to adapt these foundation models to new domains by updating only a small subset of parameters, thereby reducing computational overhead. However, the effectiveness of these PEFT methods, especially in the context of forestry remote sensing—specifically for individual tree detection—remains largely unexplored. In this work, we present a simple and efficient PEFT approach designed to transfer pre-trained transformer models to the specific tasks of tree crown detection and species classification in unmanned aerial vehicle (UAV) imagery. To address the challenge of mitigating the influence of irrelevant ground targets in UAV imagery, we propose an Adaptive Salient Channel Selection (ASCS) method, which can be simply integrated into each transformer block during fine-tuning. In the proposed ASCS, task-specific channels are adaptively selected based on class-wise importance scores, where the channels most relevant to the target class are highlighted. In addition, a simple bias term is introduced to facilitate the learning of task-specific knowledge, enhancing the adaptation of the pre-trained model to the target tasks. The experimental results demonstrate that the proposed ASCS fine-tuning method, which utilizes a small number of task-specific learnable parameters, significantly outperforms the latest YOLO detection framework and surpasses the state-of-the-art PEFT method in tree detection and classification tasks. These findings confirm that the proposed ASCS is an effective PEFT method, capable of adapting the pre-trained model to tree crown detection and species classification using UAV imagery.
(This article belongs to the Special Issue Intelligent Extraction of Phenotypic Traits in Agroforestry)
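
One plausible reading of ASCS, with assumed details: learnable class-wise importance scores select the top-k channels to emphasize inside each transformer block, and a learnable bias term carries task-specific knowledge.

```python
import torch
import torch.nn as nn

class ASCS(nn.Module):
    """Rough sketch of Adaptive Salient Channel Selection: class-wise
    channel-importance scores pick the top-k channels to highlight,
    plus a task-specific bias. The paper's exact formulation may differ."""
    def __init__(self, dim: int, n_classes: int, k: int = 64):
        super().__init__()
        self.scores = nn.Parameter(torch.zeros(n_classes, dim))  # class-wise importance
        self.bias = nn.Parameter(torch.zeros(dim))               # task-specific bias
        self.k = k

    def forward(self, x, class_idx):             # x: (B, N, dim) tokens
        s = self.scores[class_idx]               # (B, dim) scores for target class
        topk = s.topk(self.k, dim=-1).indices
        mask = torch.zeros_like(s).scatter(-1, topk, 1.0)
        weight = 1.0 + mask * torch.sigmoid(s)   # highlight salient channels only
        return x * weight.unsqueeze(1) + self.bias

block = ASCS(dim=768, n_classes=10)
y = block(torch.randn(2, 196, 768), torch.tensor([3, 7]))
print(y.shape)  # torch.Size([2, 196, 768])
```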

19 pages, 10070 KB  
Article
SAR Image Target Segmentation Guided by the Scattering Mechanism-Based Visual Foundation Model
by Chaochen Zhang, Jie Chen, Zhongling Huang, Hongcheng Zeng, Zhixiang Huang, Yingsong Li, Hui Xu, Xiangkai Pu and Long Sun
Remote Sens. 2025, 17(7), 1209; https://doi.org/10.3390/rs17071209 - 28 Mar 2025
Cited by 1 | Viewed by 1334
Abstract
As a typical visual foundation model, SAM has been extensively utilized for optical image segmentation tasks. However, synthetic aperture radar (SAR) employs a unique imaging mechanism, and its images are very different from optical images. Directly transferring a pretrained SAM from optical scenes to SAR image instance segmentation tasks can lead to a substantial decline in performance. Therefore, this paper fully integrates the SAR scattering mechanism and proposes a SAR image target segmentation method guided by the SAR scattering mechanism-based visual foundation model. First, considering the discrete distribution features of strong scattering points in SAR imagery, we develop an edge enhancement morphological adaptor. This adaptor is designed to incorporate a limited set of trainable parameters aimed at effectively boosting the target’s edge morphology, allowing quick fine-tuning within the SAR realm. Second, an adaptive denoising module based on wavelets and soft-thresholding techniques is implemented to reduce the impact of SAR coherent speckle noise, thus improving the feature representation performance. Furthermore, an efficient automatic prompt module based on a deep object detector is built to enhance the ability of rapid target localization in wide-area scenes and improve image segmentation performance. Our approach has been shown to outperform current segmentation methods through experiments conducted on two open-source datasets, SSDD and HRSID. When the ground truth is used as a prompt, SARSAM improves mIoU by more than 10% and mask AP50 by more than 5% over the baseline. In addition, the computational cost is greatly reduced because the number of parameters and FLOPs of the structures that require fine-tuning are only 13.5% and 10.1% of the baseline, respectively.
(This article belongs to the Special Issue Physics Informed Foundational Models for SAR Image Interpretation)
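
The adaptive denoising module is described as wavelet- and soft-thresholding-based; a classical, non-adaptive version of that operation, using PyWavelets, looks like this. The wavelet choice, level, and fixed threshold are assumptions.

```python
import numpy as np
import pywt  # PyWavelets

def wavelet_soft_denoise(img: np.ndarray, wavelet: str = "db2",
                         level: int = 2, thresh: float = 0.1) -> np.ndarray:
    """Classical wavelet shrinkage as a stand-in for the paper's adaptive
    denoising module: decompose, soft-threshold the detail coefficients
    to suppress speckle, reconstruct. The threshold here is fixed."""
    coeffs = pywt.wavedec2(img, wavelet, level=level)
    denoised = [coeffs[0]]                       # keep the approximation band
    for detail in coeffs[1:]:                    # (cH, cV, cD) per level
        denoised.append(tuple(pywt.threshold(c, thresh, mode="soft")
                              for c in detail))
    return pywt.waverec2(denoised, wavelet)

sar_chip = np.random.rand(128, 128)              # toy stand-in for a SAR chip
print(wavelet_soft_denoise(sar_chip).shape)      # (128, 128)
```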

19 pages, 1572 KB  
Article
FeTT: Class-Incremental Learning with Feature Transformation Tuning
by Sunyuan Qiang and Yanyan Liang
Mathematics 2025, 13(7), 1095; https://doi.org/10.3390/math13071095 - 27 Mar 2025
Viewed by 1101
Abstract
Class-incremental learning (CIL) enables models to continuously acquire knowledge and adapt in an ever-changing environment. However, one primary challenge lies in the trade-off between stability and plasticity, i.e., plastically expanding the knowledge base while stably retaining previous knowledge without catastrophic forgetting. We find that even recent promising CIL methods via pre-trained models (PTMs) still suffer from this dilemma. To this end, this paper begins by analyzing the aforementioned dilemma from the perspective of the marginal distribution of data categories. Then, we propose the feature transformation tuning (FeTT) model, which concurrently alleviates the inadequacy of previous PTM-based CIL in terms of stability and plasticity. Specifically, we apply parameter-efficient fine-tuning (PEFT) strategies solely in the first CIL task to bridge the domain gap between the PTMs and the downstream task dataset. Subsequently, the model is kept fixed to maintain stability and avoid discrepancies in training data distributions. Moreover, feature transformation is employed to regulate the backbone representations, boosting the model’s adaptability and plasticity without additional training or parameter costs. Extensive experimental results, along with a further discussion of feature channel activations, on CIL benchmarks across six datasets validate the superior performance of our proposed method.
(This article belongs to the Special Issue New Insights in Machine Learning (ML) and Deep Neural Networks)
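
A toy rendering of the recipe, under assumptions: PEFT happens only on the first task, after which the backbone stays frozen, and a parameter-free feature transformation (a power transform here, as one plausible instance) regulates the features feeding a nearest-prototype classifier.

```python
import torch

def feature_transform(feats: torch.Tensor, beta: float = 0.5) -> torch.Tensor:
    """Training-free channel regulation in the spirit of FeTT: a power
    transform rebalances channel activations with no new parameters.
    The exact transform is an assumption."""
    return feats.clamp(min=0).pow(beta)

class PrototypeCIL:
    """Backbone is PEFT-tuned on task 1 only, then frozen; later tasks
    simply add class prototypes over transformed features."""
    def __init__(self):
        self.prototypes, self.labels = [], []

    def add_class(self, feats, label):           # feats: (n, d) frozen-backbone features
        self.prototypes.append(feature_transform(feats).mean(0))
        self.labels.append(label)

    def predict(self, feats):                    # nearest-prototype classification
        protos = torch.stack(self.prototypes)    # (C, d)
        dists = torch.cdist(feature_transform(feats), protos)
        return torch.tensor(self.labels)[dists.argmin(1)]

cil = PrototypeCIL()
cil.add_class(torch.rand(20, 512), 0)
cil.add_class(torch.rand(20, 512), 1)
print(cil.predict(torch.rand(4, 512)))
```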

22 pages, 669 KB  
Article
Analyzing LLAMA3 Performance on Classification Task Using LoRA and QLoRA Techniques
by Rajvardhan Patil, Priyanka Khot and Venkat Gudivada
Appl. Sci. 2025, 15(6), 3087; https://doi.org/10.3390/app15063087 - 12 Mar 2025
Viewed by 6250
Abstract
Large language models (LLMs), consisting of billions to trillions of parameters, have demonstrated exceptional ability in natural language understanding (NLU) and natural language generation (NLG) tasks. Increases in their numbers of parameters and model sizes have resulted in better performance and accuracy. However, models with such enormous numbers of parameters incur significant computational costs and resources, making them challenging to fine-tune and adapt to a specific downstream task. Several parameter-efficient fine-tuning (PEFT) techniques have been proposed to address this issue. This study demonstrates the improvement obtained over the base LLaMA3-8B model using two prominent PEFT techniques: LoRA and QLoRA. We use the sequence classification task of sentiment analysis to conduct the experiments. Additionally, we analyze the effects of hyperparameter adjustments (r and α) on the model’s performance. We examine the tradeoff between efficiency and memory savings obtained using the quantized LoRA (QLoRA) technique. We also investigate and compare the performance changes of the LoRA and QLoRA techniques when applied to the attention layers (query, key, value, and output projection) versus all the linear layers during fine-tuning. We report the findings of our work along with limitations and future directions.
(This article belongs to the Special Issue Techniques and Applications of Natural Language Processing)
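
The QLoRA setup the study compares can be reproduced in outline with transformers and peft: load the base model in 4-bit NF4 and train LoRA adapters on top. The model id and the (r, α) values below are illustrative, not the paper's reported sweep.

```python
import torch
from transformers import AutoModelForSequenceClassification, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# QLoRA: 4-bit NF4 quantized base model + trainable LoRA adapters.
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForSequenceClassification.from_pretrained(
    "meta-llama/Meta-Llama-3-8B", num_labels=2, quantization_config=bnb
)
model = prepare_model_for_kbit_training(model)
lora = LoraConfig(
    r=16, lora_alpha=32,                        # the (r, alpha) pair under study
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention-only variant
    task_type="SEQ_CLS",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()
```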

29 pages, 549 KB  
Review
Generative Models in Medical Visual Question Answering: A Survey
by Wenjie Dong, Shuhao Shen, Yuqiang Han, Tao Tan, Jian Wu and Hongxia Xu
Appl. Sci. 2025, 15(6), 2983; https://doi.org/10.3390/app15062983 - 10 Mar 2025
Cited by 7 | Viewed by 8248
Abstract
Medical Visual Question Answering (MedVQA) is a crucial intersection of artificial intelligence and healthcare. It enables systems to interpret medical images—such as X-rays, MRIs, and pathology slides—and respond to clinical queries. Early approaches primarily relied on discriminative models, which select answers from predefined candidates. However, these methods struggle to effectively address open-ended, domain-specific, or complex queries. Recent advancements have shifted the focus toward generative models, leveraging autoregressive decoders, large language models (LLMs), and multimodal large language models (MLLMs) to generate more nuanced and free-form answers. This review comprehensively examines the paradigm shift from discriminative to generative systems, analyzing generative MedVQA works in terms of their model architectures and training processes, summarizing evaluation benchmarks and metrics, and highlighting key advances and techniques that propel the development of generative MedVQA, such as concept alignment, instruction tuning, and parameter-efficient fine-tuning (PEFT), alongside strategies for data augmentation and automated dataset creation. Finally, we propose future directions to enhance clinical reasoning and interpretability, build robust evaluation benchmarks and metrics, and employ scalable training strategies and deployment solutions. By analyzing the strengths and limitations of existing generative MedVQA approaches, we aim to provide valuable insights for researchers and practitioners working in this domain.
(This article belongs to the Special Issue Feature Review Papers in "Computing and Artificial Intelligence")

21 pages, 1595 KB  
Article
Aspect-Based Sentiment Analysis with Enhanced Opinion Tree Parsing and Parameter-Efficient Fine-Tuning for Edge AI
by Shih-wei Liao, Ching-Shun Wang, Chun-Chao Yeh and Jeng-Wei Lin
Electronics 2025, 14(4), 690; https://doi.org/10.3390/electronics14040690 - 10 Feb 2025
Viewed by 2412
Abstract
Understanding user opinions from user comments or reviews in social media text mining is essential for marketing campaigns and many other applications. However, analyzing social media user comments presents significant challenges due to the complexity of discerning relationships between opinions and aspects, particularly when comments vary greatly in length. To effectively explore aspects and opinions in the sentences, techniques based on mining opinion sentiment of the referred aspects (implicitly or explicitly) in the user comments with ACOS (aspect-category-opinion-sentiment) quadruple extraction have been proposed. Among many others, the opinion tree parsing (OTP) scheme has been shown to be effective and efficient for the ACOS quadruple extraction task in aspect-based sentiment analysis (ABSA). In this study, we continue the efforts to design an efficient ABSA scheme. We extend the original OTP scheme further with richer context parsing rules, utilizing conjunctions and semantic modifiers to provide more context information in the sentence and thus effectively improving the accuracy of the analysis. Meanwhile, regarding the limitations of computation resources for edge devices in edge computing scenarios, we also investigate the trade-off between computation saving (in terms of the percentage of model parameters to be updated) and the model’s performance (in terms of inference accuracy) on the proposed scheme under PEFT (parameter-efficient fine-tuning). We evaluate the proposed scheme on publicly available ACOS datasets. Experiment results show that the proposed enhanced OTP (eOTP) model improves the OTP scheme both in precision and recall measurements on the public ACOS datasets. Meanwhile, in the design trade-off evaluation for resource-constrained devices, the experiment results show that, in model training, eOTP requires very limited parameters (less than 1%) to be retrained by keeping most of the parameters frozen (not modified) in the fine-tuning process, at the cost of a slight performance drop (around 4%) in F1-score compared with the case of full fine-tuning. These demonstrate that the proposed scheme is efficient and feasible for resource-constrained scenarios such as mobile edge/fog computing services.
(This article belongs to the Special Issue Feature Papers in "Computer Science & Engineering", 2nd Edition)
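
The less-than-1% trainable-parameter budget described above amounts to freezing everything except a small, named subset and measuring the trainable share; a generic sketch with illustrative name filters follows.

```python
from collections import OrderedDict
import torch.nn as nn

def freeze_for_peft(model: nn.Module, trainable_substrings=("lora_", "classifier")):
    """Freeze all parameters except those whose names match the allowed
    substrings, then report the trainable share -- the kind of <1%
    budget described above. The name filters are illustrative."""
    for name, p in model.named_parameters():
        p.requires_grad = any(s in name for s in trainable_substrings)
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    total = sum(p.numel() for p in model.parameters())
    print(f"trainable: {trainable}/{total} ({100 * trainable / total:.2f}%)")

# Toy model: only the classifier head stays trainable (~0.78% here).
model = nn.Sequential(OrderedDict(
    backbone=nn.Linear(512, 512), classifier=nn.Linear(512, 4)
))
freeze_for_peft(model)
```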

16 pages, 397 KB  
Article
Efficient Fine-Tuning of Large Language Models via a Low-Rank Gradient Estimator
by Luoming Zhang, Zhenyu Lou, Yangwei Ying, Cheng Yang and Hong Zhou
Appl. Sci. 2025, 15(1), 82; https://doi.org/10.3390/app15010082 - 26 Dec 2024
Cited by 1 | Viewed by 6746
Abstract
In this paper, we present a Low-Rank Gradient Estimator (LoGE) to accelerate the fine-tuning computation of transformers, especially large language models (LLMs). Unlike Parameter-Efficient Fine-Tuning (PEFT) methods, which primarily aim to minimize the number of fine-tuning parameters, LoGE also significantly reduces the computational load of activation gradient calculations by decomposing pre-trained weights and utilizing low-rank matrices during the backward pass. Our approach includes an effective solution for identifying sensitive and important latent subspaces in large models before training with downstream datasets. As LoGE does not alter the network structure, it can be conveniently integrated into existing models. We validated LoGE’s efficacy through comprehensive experiments across various models and tasks. For the widely used LLaMA model equipped with LoRA, LoGE achieves up to a 1.3× speedup while largely preserving accuracy.
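
A sketch of the LoGE idea under simplifying assumptions: the forward pass keeps the exact pretrained weight, while the backward pass routes the activation gradient through a truncated-SVD factorization W ≈ UV; the subspace selection and the weight-gradient path are simplified here.

```python
import torch

class LowRankGradLinear(torch.autograd.Function):
    """Forward uses the full pretrained weight; backward approximates the
    activation gradient through rank-r factors, cutting the dominant
    matmul cost. A sketch, not the paper's full method."""
    @staticmethod
    def forward(ctx, x, W, U, V):
        ctx.save_for_backward(x, U, V)
        return x @ W.t()                        # exact forward: y = x W^T

    @staticmethod
    def backward(ctx, grad_y):
        x, U, V = ctx.saved_tensors
        grad_x = (grad_y @ U) @ V               # ~O(nr(d_in + d_out)) vs O(n d_in d_out)
        grad_W = grad_y.t() @ x                 # weight grad kept exact in this sketch
        return grad_x, grad_W, None, None

d_out, d_in, r = 1024, 1024, 32
W = torch.randn(d_out, d_in, requires_grad=True)
U, S, Vh = torch.linalg.svd(W.detach(), full_matrices=False)
U, V = U[:, :r] * S[:r], Vh[:r]                 # rank-r factors: W ≈ U @ V

x = torch.randn(8, d_in, requires_grad=True)
y = LowRankGradLinear.apply(x, W, U, V)
y.sum().backward()                              # backward uses the cheap estimate
print(x.grad.shape, W.grad.shape)
```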
