From Edge AI to On-Device LLMs: Hardware Architectures, Systems, and Applications

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Artificial Intelligence".

Deadline for manuscript submissions: 15 September 2026

Special Issue Editor


Guest Editor
School of Computing and Engineering, Quinnipiac University, Hamden, CT 06518, USA
Interests: AI for science; multi-modal LLM systems

Special Issue Information

Dear Colleagues,

The deployment of artificial intelligence on edge devices has become a critical research frontier, driven by the demand for real-time processing, enhanced privacy, and reduced latency. From traditional deep learning models powering computer vision and speech recognition to the recent emergence of large language models (LLMs) and generative AI, the ability to run sophisticated AI directly on resource-constrained devices is transforming how intelligent systems are designed and deployed.

On-device AI encompasses a broad spectrum of capabilities. Traditional deep neural networks, including convolutional neural networks (CNNs) and recurrent architectures, continue to serve as the backbone for many edge applications such as object detection, image classification, and sensor data analysis. Meanwhile, the rapid advancement of generative AI and LLMs has created new opportunities and challenges for edge deployment, enabling on-device text generation, multimodal understanding, and intelligent conversational agents without cloud dependency.

Realizing efficient on-device AI requires innovations across the full technology stack. Advances in neural network accelerators, energy-efficient processor architectures, and heterogeneous computing platforms are enabling increasingly powerful models to run on edge hardware. Techniques such as model compression, quantization, neural architecture search, and efficient attention mechanisms are bridging the gap between model capability and device constraints. Furthermore, distributed computation frameworks and edge-cloud collaborative architectures allow complex AI workloads to be partitioned and executed across multiple nodes, balancing performance, latency, and resource utilization.

This Special Issue invites contributions that advance the hardware architectures, system designs, and algorithmic innovations that enable efficient on-device and distributed AI deployment. We welcome research spanning traditional deep learning systems, emerging generative AI and LLM implementations, distributed computation frameworks, and the hardware–software co-design approaches that make edge intelligence practical. Both theoretical advances and real-world implementations are encouraged.

Dr. Ron (Rongyu) Lin
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts are available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • on-device AI
  • edge generative AI
  • on-device LLM
  • neural network accelerator
  • model compression and quantization
  • distributed computation
  • hardware–software co-design
  • edge-cloud collaborative architecture
  • TinyML
  • federated learning

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (1 paper)


Research

10 pages, 418 KB  
Article
Empirical Analysis of Internal Hallucination Detection in Quantized LLMs: Layer Dynamics and White-Box Benchmarks
by Haohua Liu and Jinli Xu
Electronics 2026, 15(9), 1802; https://doi.org/10.3390/electronics15091802 - 23 Apr 2026
Abstract
As large language models (LLMs) move onto resource-constrained devices, maintaining factual reliability without adding another expensive decoding pass becomes a practical inference problem. Instead of introducing another complex hallucination detector, this paper presents an empirical study of which low-cost white-box features remain useful under a controlled single-pass benchmark. Across repeated candidate-answer reruns on Qwen2.5-1.5B-Instruct and Llama-3.2-1B-Instruct, truthful and incorrect internal states are most separable in the middle-to-late layers, with the peak consistently falling at 50–70% of total network depth across both model families. The depth-relative pattern is more stable than any single detector ranking: simple residual-space baselines, including Mahalanobis scoring, remain competitive with more elaborate residual-plus-spectral fusion features under the same protocol, although detector ranking still changes by task. A separate preliminary two-seed Qwen2.5-7B-Instruct BF16 probe under that same white-box benchmark reproduces the same middle-to-late peak, and auxiliary Int8 checks on Qwen2.5-1.5B and Qwen2.5-7B remain consistent with that same localization under moderate quantization. Taken together, the results point away from detector complexity and toward a more reproducible question of where hallucination cues emerge, which internal statistics remain reliable, and how cautiously such conclusions should be transferred to deployment settings.
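
For readers unfamiliar with the Mahalanobis scoring baseline named in the abstract, the following is a minimal generic sketch of the idea, not the authors' implementation. It assumes a set of reference feature vectors is available (here replaced by synthetic random data standing in for hidden states), fits a mean and a ridge-regularized covariance, and scores a new vector by its Mahalanobis distance from that reference distribution. The function names `fit_gaussian` and `mahalanobis_score`, the `ridge` parameter, and the feature dimension are all illustrative assumptions.

```python
import numpy as np

def fit_gaussian(features, ridge=1e-3):
    """Estimate the mean and a regularized inverse covariance from reference rows.

    `ridge` is an assumed stabilizer added to the diagonal so the
    covariance matrix stays invertible; it is not taken from the paper.
    """
    mean = features.mean(axis=0)
    centered = features - mean
    cov = centered.T @ centered / max(len(features) - 1, 1)
    cov += ridge * np.eye(cov.shape[0])
    return mean, np.linalg.inv(cov)

def mahalanobis_score(x, mean, precision):
    """Mahalanobis distance of x from the reference distribution."""
    diff = x - mean
    return float(np.sqrt(diff @ precision @ diff))

# Synthetic stand-in for reference hidden states (assumption: 200 samples, 8 dims).
rng = np.random.default_rng(0)
reference = rng.normal(0.0, 1.0, size=(200, 8))
mean, precision = fit_gaussian(reference)

near = mahalanobis_score(np.zeros(8), mean, precision)      # close to the reference mean
far = mahalanobis_score(np.full(8, 4.0), mean, precision)   # far from the reference mean
```

A larger score means the vector lies further from the reference distribution. How such a score would be thresholded or calibrated for a given model and layer is outside what the abstract states.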