FirmVulLinker: Leveraging Multi-Dimensional Firmware Profiling for Identifying Homologous Vulnerabilities in Internet of Things Devices

Yixuan Cheng; Fengzhi Xu; Lei Xu; Yang Ge; Jingyu Yang; Wenqing Fan; Wei Huang; Wen Liu

doi:10.3390/electronics14173438


¹ State Key Laboratory of Media Convergence and Communication, Communication University of China, Beijing 100024, China

² School of Computer and Cyber Sciences, Communication University of China, Beijing 100024, China

* Author to whom correspondence should be addressed.

Electronics 2025, 14(17), 3438; https://doi.org/10.3390/electronics14173438


Abstract

Identifying homologous vulnerabilities across diverse IoT firmware images is critical for large-scale vulnerability auditing and risk assessment. However, existing approaches often rely on coarse-grained components or single-dimensional metrics, lacking the semantic granularity needed to capture cross-firmware vulnerability relationships. To address this gap, we propose FirmVulLinker, a semantic profiling framework that holistically models firmware images across five dimensions: unpacking signature sequences, filesystem semantics, interface exposure, boundary binary symbols, and sensitive parameter call chains. These multi-dimensional profiles enable interpretable similarity analysis without requiring prior vulnerability labels. We construct an evaluation dataset comprising 54 Known Defective Firmware (KDF) images with 74 verified vulnerabilities and assess FirmVulLinker across multiple correlation tasks. Compared to state-of-the-art techniques, FirmVulLinker achieves higher precision with substantially lower false-positive and false-negative rates. Notably, it identifies and reproduces 53 previously undisclosed N-day vulnerabilities in firmware images not listed as affected at the time of public disclosure, effectively extending the known impact scope. Our results demonstrate that FirmVulLinker enables scalable, high-fidelity homologous vulnerability analysis, offering a new perspective on understanding cross-firmware vulnerability patterns in the IoT ecosystem.

Keywords: homologous vulnerability correlation; firmware similarity analysis; IoT; firmware security; static analysis

1. Introduction

With the rapid proliferation of embedded systems and Internet of Things (IoT) devices in scenarios such as smart homes, industrial control systems, and traffic management, firmware has become the essential software foundation supporting these critical applications. Today, firmware is widely deployed in home routers, smart cameras, industrial controllers, and other key devices []. Against this backdrop, security vulnerabilities in firmware have emerged as a core risk that threatens system stability and user privacy [,]. Notably, common development practices such as code cloning, component reuse, and the integration of shared SDKs have led to the widespread presence of homologous vulnerabilities across multiple firmware versions []. Once attackers identify and exploit these vulnerabilities, they can launch large-scale attacks that result in system downtime, data breaches, and the formation of extensive botnets. Well-known examples include the Mirai and Bashlite botnet families, which compromised hundreds of thousands of IoT devices by exploiting only a few dozen homologous vulnerabilities, highlighting the real-world threat of cross-device vulnerability propagation [,].

To address this growing risk, prior research has explored homologous vulnerability analysis to identify how vulnerabilities propagate and are reused across different firmware samples [,,,]. These approaches typically rely on component-level feature matching, string signature comparison, or function-level semantic alignment, which can partially capture the behaviors of firmware clones. However, when analyzing firmware across platforms, architectures, or even different vendors, these methods face significant limitations. On the one hand, they often treat firmware merely as a collection of isolated components, lacking systematic modeling of the firmware’s overall structural semantics and attack surface information. This makes it challenging to represent the complete static behavioral logic and multi-dimensional characteristics of the firmware [,]. On the other hand, most existing techniques do not incorporate reachability analysis and therefore fail to evaluate whether a vulnerability is practically exploitable through real communication interfaces such as web APIs, parameter paths, or user inputs. As a result, they struggle to assess the actual threat boundaries of discovered vulnerabilities accurately [,]. More specifically, mainstream methods face three key challenges:

Limited to component-level analysis without holistic semantic modeling. Current approaches typically focus on specific components, such as ELF binaries, functions, or third-party libraries [,]. Without modeling the firmware’s overall structure, path layout, symbolic context, and configuration logic, these methods rely on shallow component-level matching []. Given the diversity in component layouts across vendors and platforms, such strategies often fail to uncover truly homologous vulnerabilities.
Heavy reliance on third-party components while overlooking vendor-specific logic. Many studies emphasize the reuse of open-source libraries and third-party modules as the primary source of vulnerability propagation []. However, in practice, IoT vendors frequently introduce substantial amounts of custom code tailored to their specific needs. This vendor-specific logic often becomes a concentrated source of previously unknown vulnerabilities []. Thus, analysis methods focused solely on standard components risk missing a significant number of impactful, real-world vulnerabilities.
Lack of communication interface modeling, which limits the assessment of exploitability. In IoT scenarios, attacks are frequently launched via exposed communication interfaces [,,]. If a vulnerability cannot be triggered through external network traffic, its practical security risk is substantially reduced. Existing methods generally do not model the network management interfaces, input parameter paths, or function call chains that connect to external inputs, making it challenging to identify vulnerabilities that are truly reachable or exploitable [].

In summary, current homologous vulnerability analysis methods need to move beyond shallow, component-level matching toward in-depth modeling of firmware-wide structural semantics. At the same time, incorporating perspectives on communication interfaces and reachability is critical to enable more explainable, comprehensive, and practically meaningful vulnerability discovery.

Our Approach. To address these challenges, we propose FirmVulLinker, a framework designed explicitly for homologous vulnerability analysis of embedded firmware. The central idea behind FirmVulLinker is to treat the firmware as a unified modeling target and systematically capture its internal static structure, behavioral logic, and potential attack surfaces by constructing a multi-dimensional semantic profiling system. Based on semantic similarity measurements across these dimensions, FirmVulLinker identifies firmware samples that are likely to contain homologous vulnerabilities, even across large-scale datasets. The overall framework consists of two main stages:

Stage One: Multi-dimensional semantic profiling. FirmVulLinker builds five complementary semantic profiles to represent the static behavioral characteristics of firmware, focusing on unpacking structure, filesystem layout, exposed communication interfaces, boundary binary symbols, and high-risk parameter call chains. Specifically: (i) Unpacking Signature Sequence Profile captures the hierarchical structure and nested relationships revealed during firmware unpacking; (ii) Filesystem Semantic Profile models path semantics and the distribution of sensitive resources; (iii) Interface Exposure Profile identifies reachable input paths and user-controlled parameters that define the external attack surface; (iv) Exposed Binary Symbolic Profile analyzes symbol tables, imported/exported functions, and other interface-related features of boundary binaries; (v) Vulnerability-oriented Call Chain Profile statically traces high-risk parameters to construct potential vulnerability trigger chains. Together, these unified semantic profiles capture diverse dimensions of firmware behavior, providing a rich semantic basis for downstream homologous similarity analysis.
Stage Two: Semantic similarity computation and homologous vulnerability discovery. FirmVulLinker implements tailored similarity metrics for each of the five semantic profiles, allowing it to capture structural and sequence-level differences across multiple dimensions of firmware behavior. These dimension-specific similarity scores are then aggregated through a weighted fusion strategy to compute a global semantic similarity score for each firmware pair. Based on this comprehensive similarity analysis, FirmVulLinker can systematically identify homologous firmware samples and detect the same vulnerabilities within firmware that is highly similar to known vulnerable firmware.

To evaluate the effectiveness of FirmVulLinker, we developed a complete prototype system and constructed a homologous vulnerability dataset containing 74 real-world vulnerabilities across 54 embedded firmware samples. This dataset encompasses multiple mainstream architectures and vendor platforms, with each sample accompanied by standardized emulation environments, deployment scripts, and automated vulnerability verification tools. Using this dataset, we conducted comprehensive, large-scale experiments. The results show that FirmVulLinker outperforms existing state-of-the-art firmware analysis tools in terms of detection accuracy, false-negative rate, and false-positive rate, demonstrating significant improvements in both precision and robustness. Beyond identifying known homologous vulnerability samples, FirmVulLinker also discovered 53 previously undisclosed N-day vulnerabilities, effectively extending the known impact scope of existing vulnerabilities and confirming its strong potential for real-world vulnerability intelligence and supply chain security analysis.

Contributions. Overall, our main contributions are summarized as follows:

New Framework: We present FirmVulLinker, a framework for homologous vulnerability propagation analysis built on multi-dimensional static semantic profiling. Distinct from prior approaches that focus narrowly on individual components, FirmVulLinker treats firmware as a unified analysis target, constructing comprehensive semantic profiles that capture both structural semantics and attack surface characteristics. This holistic modeling paradigm significantly improves detection accuracy, interpretability, and adaptability across heterogeneous platforms.
New Dataset: We have constructed a high-quality evaluation dataset for homologous vulnerabilities, covering 74 verified vulnerabilities across 54 firmware samples. Each firmware sample is provided with a standardized emulation environment, startup scripts, and automated validation tools. This design ensures the reproducibility and comparability of vulnerability samples, offering a stable and reliable testing resource for subsequent research.
Implementation and Evaluation: We implemented a complete prototype of FirmVulLinker and conducted systematic evaluations across multiple real-world vulnerability scenarios and actual firmware samples. Experimental results demonstrate the effectiveness, accuracy, and scalability of our approach in identifying homologous vulnerability propagation, revealing multiple vulnerable firmware samples that were previously missed by existing tools. This further underscores the practical value of our method.

Roadmap: The remainder of this paper is organized as follows. Section 2 reviews the background and related work on IoT firmware vulnerability analysis and homologous firmware correlation. Section 3 details the design rationale of FirmVulLinker and explains the construction of its multi-dimensional static semantic profiles across five core modeling dimensions. Section 4 describes the similarity measurement strategies for each profile and the multi-dimensional fusion method used to support cross-firmware vulnerability identification and propagation analysis. Section 5 presents the implementation details of the prototype system and its experimental evaluation, including comprehensive comparisons with state-of-the-art methods. Section 6 discusses the limitations identified during design and implementation and outlines potential directions for future work. Finally, Section 7 concludes the paper.

To facilitate future research, we will release the open-source implementation of FirmVulLinker at https://github.com/a101e-lab/FirmVulLinker (accessed on 29 July 2025).

2. Background and Related Work

Before detailing our technical approach, we provide an overview of existing methodologies for discovering IoT firmware vulnerabilities, with a particular focus on the background of homologous vulnerability analysis across firmware.

2.1. Vulnerability Discovery in IoT Firmware

Current techniques for identifying security vulnerabilities in IoT firmware generally fall into two broad categories: dynamic analysis and static analysis [,,]. Each offers distinct advantages in terms of applicability, coverage, and engineering cost, while also facing unique challenges and limitations.

Dynamic Analysis-Based Approaches. Among various dynamic analysis techniques, fuzzing has emerged as a dominant method due to its high degree of automation, lack of reliance on prior knowledge, and strong path exploration capabilities [,]. Fuzzing involves repeatedly injecting malformed or partially valid inputs into a target program and monitoring its execution for abnormal behavior such as crashes or hangs, thereby revealing hidden flaws or vulnerabilities []. Depending on visibility into the program’s internal state and the guidance available during execution, fuzzing can be categorized as white-box, grey-box, or black-box [].

However, applying fuzzing to IoT firmware poses substantial challenges. First, due to the strong dependence of firmware execution on hardware-specific environments, dynamic testing often requires emulation platforms such as QEMU [], Firmadyne [], or FirmAE [] to simulate user-space or system-level execution []. In practice, firmware frequently contains low-level hardware interactions, proprietary drivers, or dependencies on NVRAM, which makes accurate emulation difficult and often results in crashes or incomplete behavior during testing, thereby undermining stability and effectiveness []. Moreover, as IoT firmware is typically distributed in a closed-source format, white-box fuzzing becomes infeasible []. While grey-box fuzzers can collect runtime feedback via instrumentation, they often require the firmware to run in user space or within a semi-virtualized setup [,]. These conditions are rarely met in real-world firmware, which limits their applicability. Consequently, black-box fuzzing is the most widely adopted in IoT scenarios due to its independence from source code and execution context [,].

Nevertheless, black-box fuzzing suffers from a critical drawback: lacking knowledge of target interfaces, communication protocols, or authentication mechanisms, generated test inputs often fail to pass the firmware’s initial validation checks, preventing them from reaching deeper logic branches []. This is further complicated by the presence of web interfaces, command-line tools, and other components in firmware that introduce additional layers of encoding, encryption, and multi-step verification, which significantly raise the barrier for constructing valid inputs []. In summary, while dynamic fuzzing offers powerful capabilities for vulnerability discovery, its high deployment cost, environmental dependence, and difficulty in adaptation continue to hinder its large-scale adoption in IoT firmware analysis.

Static Analysis-Based Approaches. In contrast to dynamic techniques, static analysis examines unpacked firmware images without executing the code, offering greater generalizability and scalability []. These methods typically extract executable binaries, configuration scripts, and resource files, then apply techniques such as syntax parsing, control flow modeling, data flow analysis, and taint tracking to identify potential vulnerabilities or construct behavioral models. Static analysis is especially advantageous in analyzing closed-source firmware, as it enables deep inspection of logic structures and non-exposed interfaces that are often unreachable via dynamic methods.

Despite its broader applicability, static analysis faces notable challenges in practice, particularly in terms of false-positive rates and computational complexity []. The lack of runtime context and real inputs makes it inherently difficult to determine whether identified issues are exploitable, often leading to an overwhelming number of false alarms that reduce the practical utility of the results []. In addition, core analysis procedures such as constructing control flow graphs (CFGs), generating data flow graphs (DFGs), and conducting path-sensitive analysis require substantial computing resources, which significantly hampers the efficiency of large-scale analysis on real-world firmware images [].

Moreover, many static techniques depend heavily on accurate symbol resolution and function boundary recovery. In practice, however, firmware binaries are often stripped of symbols, optimized aggressively, or obfuscated. These conditions lead to ambiguous control flow and missing call relationships, which severely limit the precision and robustness of analysis []. Additionally, most existing approaches treat firmware as a collection of isolated components, such as individual binaries, third-party libraries, or exposed interfaces, without modeling the firmware system as an integrated whole []. This narrow perspective makes it difficult to reason about the interaction between components or to trace how vulnerabilities propagate through different firmware images.

Given the challenges of incomplete symbols, heterogeneous binary resources, and the large scale of real-world firmware, a key problem remains unresolved: how to accurately identify vulnerabilities in the absence of an execution context while minimizing false positives and keeping semantic modeling manageable. To address this, there is a growing need for a scalable and robust semantic profiling framework that treats firmware as a unified target for analysis and evaluation. By capturing global structure and behavior across multiple semantic dimensions, such a framework can provide a more complete context for vulnerability discovery. When combined with efficient similarity computation, it can also support large-scale analysis of homologous vulnerability propagation.

2.2. Static Analysis-Based Homologous Vulnerability Identification in Firmware

Due to their minimal dependence on execution environments and broad applicability across platforms, static analysis techniques have become a prominent approach for detecting firmware vulnerabilities and identifying homologous vulnerabilities across firmware samples []. These methods typically operate on unpacked firmware images, extracting multi-dimensional static features through structural modeling, semantic abstraction, or feature alignment. The extracted features are then used to perform cross-platform and cross-architecture similarity analysis, uncovering potential reuse paths and propagation patterns of known vulnerabilities [,,]. Based on the nature of their modeling strategies, existing approaches can be broadly categorized into three groups: code structure modeling, resource feature modeling, and global semantic modeling.

Code Structure Modeling. Control flow graphs (CFGs) and function call graphs are widely adopted for their strong structural representation capabilities in function-level similarity analysis []. Representative works such as Genius [] and FirmSec [] utilize graph embedding over CFGs to learn structural representations and match functions across firmware samples. Subsequent efforts introduce local call graphs (LCGs) to enhance contextual modeling. Wang et al. further propose feature optimization via genetic algorithms to improve alignment accuracy []. While these techniques have demonstrated strong performance at the function granularity, they face two fundamental limitations in practical firmware analysis: (1) many firmware images contain custom or proprietary components that lack consistent structural semantics, making graph-based modeling fragile; and (2) by focusing on functions as the core unit of comparison, these methods fail to capture holistic firmware behaviors and attack surfaces. In particular, they struggle to account for high-level semantic features such as communication interfaces, configuration paths, and trigger conditions, which are essential for cross-vendor or cross-platform analysis.

Resource Feature Modeling. Beyond structural modeling, human-readable strings and scripts are often used for lightweight firmware clustering due to their interpretability and portability []. For instance, IHB employs string extraction, MinHash, and LSH indexing to quickly cluster similar firmware by identifying key symbols and paths in readable content []. UFO focuses on shell scripts and builds script dependency graphs to identify command sequences that may expose vulnerability trigger paths []. However, the effectiveness of these methods largely relies on the quality of extracted resources and their ability to be accurately parsed and analyzed. Obfuscation, encryption, and aggressive compilation often degrade the quality of strings, while script-based approaches lack structured semantics and are challenging to integrate into unified analytical pipelines. Moreover, these approaches typically ignore the interaction between resources and binary logic, limiting their ability to model the propagation path from external inputs to internal kernel logic.

Global Semantic Modeling. To move beyond shallow features, several studies have proposed end-to-end semantic modeling frameworks that link frontend interfaces with backend processing logic. Firmalice, for example, constructs data dependency graphs (DDGs) to model how key variables propagate through configuration logic, identifying instruction sequences that may lead to behavioral deviations []. SaTC combines web frontend forms, CGI parameters, and backend logic to build full-stack input path graphs, enabling the identification of semantically reachable attack points []. These approaches demonstrate early efforts to bridge communication interfaces and internal logic and provide a semantic basis for vulnerability reachability analysis. However, they are typically tailored to specific scenarios or data sources and lack general-purpose modeling capabilities. In the absence of a unified feature space or abstract representation, these methods are not yet scalable or automated enough for large-scale propagation analysis.

In addition to function-level and component-level approaches, there have also been attempts to extend such methods to the firmware level. This typically involves incorporating automated unpacking pipelines, extracting binary modules from the unpacked images, and then aggregating module-level similarity results to derive a firmware-level correlation score. Such extensions provide a practical way to reuse existing analysis techniques at a broader granularity, but they remain conceptually distinct from other research directions.

On the one hand, function-level or component-level approaches often achieve finer-grained feature selection within their chosen dimensions, making them effective for capturing localized structural or syntactic similarities. However, they provide a weaker description of firmware as a whole, as they lack modeling of higher-level semantics and global organization. On the other hand, multi-dimensional firmware-level profiling approaches emphasize a holistic description of the entire firmware by integrating features such as structural unfolding, resource layout, communication interfaces, symbolic information, and call-chain semantics. These approaches are stronger in capturing global characteristics across heterogeneous firmware ecosystems. However, due to the large number of profiling dimensions, the feature refinement within each dimension is typically less fine-grained than in function-level methods. This contrast highlights both the differences in granularity and the potential complementarity between the two perspectives.

In summary, current static approaches to homologous vulnerability detection have explored multiple feature dimensions, including control flow structures, script resources, and frontend–backend semantic paths. However, despite recent firmware-level extensions, most existing techniques remain largely component-centric. They lack a unified modeling paradigm that comprehensively captures the global firmware structure, communication entry points, and symbolic semantics. When applied to customized logic, heterogeneous platforms, or complex communication interfaces, these limitations often result in inaccurate matching and incomplete discovery of propagation paths due to the absence of contextual reachability reasoning and structural constraint inference. Furthermore, most existing techniques fall short in modeling detailed vulnerability trigger chains, making it difficult to assess the exploitability or real-world impact of discovered homologous vulnerabilities. To address these challenges, future research must shift toward holistic firmware-level modeling. This requires the construction of multi-dimensional static semantic profiles that treat the firmware as a unified analysis target. By integrating communication context and reachability modeling, such approaches can support more interpretable, accurate, and generalizable homologous vulnerability identification across diverse firmware ecosystems.

3. Multi-Dimensional Firmware Semantic Profiling

To enable the large-scale analysis of homologous vulnerability propagation across heterogeneous embedded firmware images, we propose FirmVulLinker. This unified static profiling framework models firmware semantics in a structured, platform-independent manner. The framework performs multi-dimensional profiling to capture key features of firmware, facilitating cross-sample semantic alignment and homologous vulnerability discovery. The overall system architecture of FirmVulLinker is illustrated in Figure 1.

Figure 1. Overview of FirmVulLinker.

FirmVulLinker operates in two main stages: multi-dimensional semantic profiling and similarity computation. In the first stage, each input firmware image is profiled across five semantic dimensions, capturing structural organization, resource layout, input interfaces, symbolic structure, and vulnerability propagation paths. These features are then integrated into a unified firmware semantic profile. In the second stage, intra-dimensional similarity evaluation and cross-dimensional fusion strategies are employed to generate a global semantic similarity score for detecting homologous vulnerabilities.
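As a rough illustration of the second stage, the sketch below fuses five per-dimension similarity scores into a single global score with a normalized weighted sum. The function name, the example scores, and the equal weights are assumptions made purely for illustration; the per-profile metrics and the fusion strategy actually used by FirmVulLinker are those described in Section 4.

```python
from typing import Sequence


def fuse_similarities(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Combine per-dimension similarity scores into one global score.

    Hypothetical helper: a weighted sum normalized by the total weight,
    so the result stays in [0, 1] when every score is in [0, 1].
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(s * w for s, w in zip(scores, weights)) / total_weight


# Illustrative values only: one similarity score per profile dimension.
example_scores = [0.9, 0.8, 0.7, 0.6, 0.5]
example_weights = [1.0, 1.0, 1.0, 1.0, 1.0]  # assumed equal weighting
global_score = fuse_similarities(example_scores, example_weights)
print(f"global similarity: {global_score:.2f}")
```

Normalizing by the total weight keeps the fused score on the same scale as the individual scores, which makes a single decision threshold easier to interpret.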

We formally define the semantic profile of a firmware image $F$ as a five-tuple, as shown in Equation (1):

$$P(F) = (P_1, P_2, P_3, P_4, P_5) \tag{1}$$

Each $P_i$ represents an independent feature subspace:

P₁: Unpacking Signature Sequence Profile. Captures the sequence of structural signatures identified during the firmware unpacking process and models their nesting order as a trajectory, reflecting the firmware’s packaging layers and organizational patterns.
P₂: Filesystem Semantic Profile. Analyzes the extracted filesystem layout to model path topology, sensitive resource distribution, and binary component fingerprints. This profile characterizes the deployment logic, exposure surface, and potential code reuse.
P₃: Interface Exposure Profile. Identifies external communication paths and associated parameter keys. This profile models the input attack surface and interface naming behaviors to reflect exposed functionality and entry points.
P₄: Exposed Binary Symbolic Profile. Extracts symbolic information from boundary binaries along communication paths, including imported and exported symbols, function signatures, and symbol table structures. This profile captures visible semantic labels and inter-module dependencies.
P₅: Vulnerability-oriented Call-Chain Profile. Traces the propagation of high-risk parameters and reconstructs multi-path call chains from entry points to sensitive functions. This profile models vulnerability trigger paths and the corresponding propagation context.
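One way to picture the five-tuple is as a simple container with one field per dimension. The following is a minimal sketch; the class name, field names, and field types are assumptions chosen for readability and are not taken from the FirmVulLinker implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class FirmwareProfile:
    """Illustrative container for the five profile dimensions (types are assumed)."""

    # P1: ordered signature identifiers seen during unpacking
    unpacking_signatures: List[str] = field(default_factory=list)
    # P2: filesystem path -> fingerprint of the file at that path
    filesystem: Dict[str, str] = field(default_factory=dict)
    # P3: exposed interface path -> parameter keys accepted there
    interfaces: Dict[str, Set[str]] = field(default_factory=dict)
    # P4: boundary binary -> imported/exported symbol names
    binary_symbols: Dict[str, Set[str]] = field(default_factory=dict)
    # P5: call chains from an entry point to a sensitive function
    call_chains: List[Tuple[str, ...]] = field(default_factory=list)

    def as_tuple(self) -> tuple:
        """Return the dimensions in the order (P1, P2, P3, P4, P5)."""
        return (
            self.unpacking_signatures,
            self.filesystem,
            self.interfaces,
            self.binary_symbols,
            self.call_chains,
        )


# Hypothetical usage with made-up identifiers.
profile = FirmwareProfile(unpacking_signatures=["UIMAGE", "SQUASHFS"])
print(len(profile.as_tuple()))
```

Keeping each dimension in its own field mirrors the idea that every profile is an independent feature subspace that can be compared with its own metric.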

These five profiling dimensions collectively capture the firmware’s structural layout, input behavior, symbolic semantics, and vulnerability-related context. The profiles offer fine-grained representations, complementary semantic perspectives, and cross-platform compatibility. Among them, $P_1$ and $P_2$ emphasize structural and behavioral organization, while $P_3$ and $P_4$ characterize communication interfaces and symbolic visibility. $P_5$ provides deeper insight into control dependencies and parameter flows associated with potential vulnerability triggers. The modular and extensible nature of this profiling system enables FirmVulLinker to accommodate diverse architectures, formats, and compilation styles. This unified abstraction layer establishes a robust semantic foundation for subsequent cross-firmware analysis and homologous vulnerability identification.

In the remainder of this section, we present the modeling and extraction methodology for each semantic dimension in Section 3.1, Section 3.2, Section 3.3, Section 3.4 and Section 3.5. The similarity computation strategies and fusion mechanisms are discussed in detail in Section 4.

3.1. Unpacking Signature Sequence Profile

Embedded firmware images are typically packaged in aggregate or compressed formats, encapsulating multiple structured content segments, such as filesystem images, kernel binaries, initialization scripts, configuration blocks, or key storage regions. Unlike conventional operating system images, the nesting relationships and encapsulation order in embedded firmware vary significantly across implementations, posing substantial challenges for structural analysis and content reuse detection.

To address this, we propose the Unpacking Signature Sequence Profile, a semantic behavior trace that captures the structural unfolding process of a firmware image. This profile models the nesting patterns and component organization by extracting a sequence of structural signatures encountered during the unpacking process. Specifically, we treat the firmware image as a linear byte stream and apply a semantic signature matching mechanism to identify boundaries of nested components. These structural signatures are symbolically encoded based on their appearance order, forming a discrete trace that reflects the logical decomposition path of the firmware. This sequence offers a stable and comparable representation of structural behavior across samples, without relying on complete filesystem mounting or parsing.

To facilitate computational analysis and practical application, we introduce a formal modeling approach that transforms the unpacking sequence into a vectorizable representation within a structural feature space. This modeling process is divided into three stages:

Signature Trace Extraction. Given a firmware image

F

, we apply structural matching techniques to identify embedded components and assign each with a unique semantic signature identifier. To achieve this, we construct a global dictionary of magic byte patterns commonly used to indicate nested structures, such as filesystems, kernel segments, or certificate data. Each magic byte feature in this dictionary is predefined and uniquely assigned with an identifier prior to the unpacking process. During structural matching, whenever a magic byte pattern is detected within the firmware, the corresponding identifier is retrieved from this dictionary. These identifiers are then sequentially arranged according to their appearance in the byte stream, forming a structural signature sequence as defined in Equation (2):

S (F) = [s_{1}, s_{2}, \dots, s_{n}], s_{i} \in Σ

(2)

Here, $Σ$ denotes a finite set of structural signature symbols, where each symbol $s_{i}$ represents the identifier of a specific structural component recognized during the unpacking process, such as a filesystem, kernel segment, certificate data, or path marker. The resulting sequence thus encodes the logical unpacking behavior of the firmware, capturing the order in which components appear and the organizational layout of their internal structures. As this representation does not rely on filesystem mounting or complete extraction, it provides strong adaptability and generality across diverse architectures and packaging formats.
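The extraction step can be illustrated with a minimal sketch. The dictionary entries, identifier names, and the `signature_trace` helper below are assumptions made for illustration; the actual dictionary used by FirmVulLinker is not specified in this section.

```python
# Hypothetical magic-byte dictionary: pattern -> signature identifier.
# The patterns are well-known format magics; the identifiers are invented here.
MAGIC_DICT = {
    b"\x27\x05\x19\x56": "UIMAGE",        # U-Boot legacy image header
    b"\x1f\x8b\x08": "GZIP",              # gzip stream using deflate
    b"hsqs": "SQUASHFS_LE",               # little-endian SquashFS superblock
    b"-----BEGIN CERTIFICATE-----": "PEM_CERT",
}

def signature_trace(blob: bytes) -> list[str]:
    """Return the identifiers of all matched patterns, ordered by byte offset."""
    hits = []
    for magic, ident in MAGIC_DICT.items():
        start = blob.find(magic)
        while start != -1:
            hits.append((start, ident))
            start = blob.find(magic, start + 1)
    hits.sort()  # appearance order in the linear byte stream
    return [ident for _, ident in hits]

# Synthetic image: a kernel header, a gzip payload, then a SquashFS filesystem.
firmware = (b"\x27\x05\x19\x56" + b"\x00" * 60
            + b"\x1f\x8b\x08" + b"\x00" * 20
            + b"hsqs" + b"\x00" * 16)
trace = signature_trace(firmware)
```

The sketch scans only the top-level byte stream; a real unpacker would also have to handle signatures that appear inside decompressed layers.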

Structural Behavior Abstraction. To enhance the expressiveness of the trace sequence and improve semantic alignment across firmware samples, we abstract $S(F)$ into a structural behavior path that represents the sequence of structural states traversed by nested components during the unpacking process. Each signature symbol $s_{i} \in Σ$ denotes an explicit unpacking operation, and the overall combination pattern reveals the encapsulation strategies, module partitioning, or distribution templates adopted during firmware construction. This abstraction disregards the actual physical offsets or sizes of components within the byte stream. Instead, it focuses on the execution order, nesting hierarchy, and the stability and commonality of semantic patterns. These factors form a robust structural foundation for subsequent tasks.

Sequence Pattern Modeling and Vectorization. To transform the structural behavior sequence into a computable feature vector, we apply an n-gram modeling strategy to the sequence $S(F)$ defined in Equation (2). Specifically, we extract all contiguous subsequences of length $k$ to construct the following pattern set:

G_{k} (F) = \{(s_{i}, s_{i + 1}, \dots, s_{i + k - 1}) ∣ 1 \leq i \leq n - k + 1\}

(3)

Here, $G_{k}(F)$ represents the set of all local structural fragments of length $k$ extracted from the signature trace. We then compute the frequency distribution over this set to obtain a vectorized representation of the n-gram patterns:

ϕ_{k} (F) = [f_{1}, f_{2}, \dots, f_{m}] \in ℝ^{m}

(4)

In this vector, each element $f_{j}$ denotes the occurrence count of the $j$-th pattern in $S(F)$, and $m$ is the total number of distinct patterns in the feature space. The resulting vector $ϕ_{k}(F)$ serves as the structural representation of firmware $F$ under this profiling dimension and will be used for subsequent similarity measurement and cross-sample clustering.
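Equations (3) and (4) can be sketched in a few lines. The function names and the choice to order the feature space lexicographically are assumptions for illustration only.

```python
from collections import Counter

def ngrams(trace, k):
    """All contiguous length-k subsequences of the signature trace (Equation (3))."""
    return [tuple(trace[i:i + k]) for i in range(len(trace) - k + 1)]

def ngram_vector(trace, k, vocabulary):
    """Occurrence count of each vocabulary pattern in the trace (Equation (4))."""
    counts = Counter(ngrams(trace, k))
    return [counts[g] for g in vocabulary]

trace = ["UIMAGE", "GZIP", "SQUASHFS", "GZIP", "SQUASHFS"]
grams = ngrams(trace, 2)
vocab = sorted(set(grams))          # assumed ordering of the feature space
vec = ngram_vector(trace, 2, vocab)
```

With $k = 2$ the example trace yields three distinct patterns, and the repeated (GZIP, SQUASHFS) fragment receives a count of 2.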

This modeling strategy offers three key advantages. First, it does not depend on filesystem mounting or semantic recovery, making it robust to damaged or packed samples. Second, it enables the discovery of standard structural encapsulation and reuse patterns across firmware images. Third, it yields a sparse and efficient representation, suitable for large-scale indexing, retrieval, and analysis. This structural profile will be integrated into the firmware similarity analysis module introduced in Section 4.

3.2. Filesystem Semantic Profile

Once unpacked, embedded firmware typically contains a complete or nearly complete filesystem structure. This structure not only governs the organization and invocation of system components but also exposes a wide range of static information through path naming, file content, and resource layout—information that attackers may leverage. To capture the structural and semantic behaviors present in the filesystem, we propose the Filesystem Semantic Profile, which models static characteristics across multiple dimensions, such as path topology, sensitive resource distribution, configuration exposure, and binary file layout. This profile consists of three modeling phases: structural layout modeling, sensitive resource annotation and modeling, and binary resource signature profiling. The outputs are ultimately fused into a unified structural feature vector that supports similarity analysis and clustering across firmware samples based on deployment patterns, information leakage risks, and resource reuse behaviors.

Structural Layout Modeling. To characterize structural differences in component organization and directory deployment, we treat the root of the unpacked firmware filesystem as $R$, and formalize its structure as a directed tree $T(R) = (D, E)$, where $D$ is the set of reachable directories and $E \subseteq D \times D$ encodes parent-child relationships between directories. We define four structural metrics, three computed for each directory $d \in D$ and one computed globally over $D$:

The depth δ(d), indicating the length of the path from the root to d.
The executable density ρ(d), defined as the proportion of executable files among all files in d.
The aggregation degree γ(d), measuring the number of files and subdirectories contained in d.
The global directory entropy H(D), quantifying the uniformity and organizational complexity of the overall filesystem layout.

We compute statistical summaries (e.g., mean $μ$ and standard deviation $σ$) over these metrics to form the structural feature vector:

ϕ_{s} (R) = [μ_{δ}, σ_{δ}, μ_{ρ}, μ_{γ}, H (D)] \in ℝ^{5}

(5)

Here, $μ_{δ}$ and $σ_{δ}$ represent the mean and standard deviation of directory depths, $μ_{ρ}$ and $μ_{γ}$ denote the average executable density and aggregation degree, and $H(D)$ is the normalized Shannon entropy over the distribution of file counts across directories. This vector encodes the organizational behavior of the firmware’s filesystem and provides a basis for analyzing its component deployment practices and directory construction patterns.
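A minimal sketch of Equation (5) follows. The input format, the use of the population standard deviation, and normalizing the entropy by $\log_{2} |D|$ are assumptions; the text does not fix these details.

```python
import math
from statistics import mean, pstdev

def layout_vector(dirs):
    """dirs maps a directory path to (n_files, n_executables, n_subdirs)."""
    depths = [0 if d == "/" else d.strip("/").count("/") + 1 for d in dirs]
    density = [x / f if f else 0.0 for f, x, _ in dirs.values()]
    aggregation = [f + s for f, _, s in dirs.values()]
    total = sum(f for f, _, _ in dirs.values())
    probs = [f / total for f, _, _ in dirs.values() if f] if total else []
    entropy = -sum(p * math.log2(p) for p in probs)
    if len(dirs) > 1:
        entropy /= math.log2(len(dirs))   # assumed normalization to [0, 1]
    return [mean(depths), pstdev(depths), mean(density), mean(aggregation), entropy]

# Toy filesystem: the root holds two subdirectories with four files each.
example = {"/": (0, 0, 2), "/bin": (4, 4, 0), "/etc": (4, 0, 0)}
vec = layout_vector(example)
```

In this toy layout the files are split evenly between two of the three directories, so the unnormalized entropy is 1 bit and the normalized value is $1 / \log_{2} 3$.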

Sensitive Resource Annotation and Modeling. This stage focuses on static assets within the firmware filesystem that may lead to information leakage, authentication bypass, or system exposure. We design a semantic path labeling framework tailored for firmware content, covering three high-risk resource categories: credential artifacts, static information indicators, and sensitive configuration files. These resources are uniformly modeled through a rule-based tagging mechanism that standardizes the representation of static attack surfaces across diverse firmware samples.

We denote the set of all reachable paths in the unpacked filesystem as $P = \{p_{1}, p_{2}, \dots, p_{N}\}$, where each path $p_{i}$ corresponds to a file or directory node. To identify the potential semantics of each path, we define a set of regular expression rules $M$, each associated with a semantic label $l \in L$. A matching function $Match(p_{i}, M) \Rightarrow l_{i}$ maps each path $p_{i}$ to a label $l_{i} \in L$. The label space is organized into the following three categories:

Credential and Cryptographic Artifact Disclosure. This category targets paths related to authentication and cryptographic secrets. It includes: (1) files such as .key, .pem, .crt, .cer indicating private keys and certificates; (2) known password storage files such as .htpasswd, shadow, and passwd; and (3) configuration paths for SSH and TLS services like .ssh/, sshd_config, ssl.conf, and authorized_keys. The exposure of these resources directly undermines remote access control, encrypted communication, and device authentication.
Static Information Leakage Indicators. These labels capture hardcoded static information that can assist adversaries in target fingerprinting, attack tailoring, or device tracking. We define matching patterns for IP addresses, hardcoded URLs, email addresses, MAC addresses, and device identifiers. Although such paths may not directly trigger vulnerabilities, they frequently serve as side channels for intelligence gathering during real-world attacks.
Security-sensitive Configuration Artifact Detection. This category identifies configuration paths tied to system components, service modules, or runtime logic. We include rules for standard web services, databases, automation scripts, and startup configuration files, matching paths such as .conf, .ini, .sql, .db, and .sh. The exposure of these files can disclose execution logic, hardcoded credentials, or automation entry points, thus expanding the attack surface.

We count the number of occurrences of each label $l_{j} \in L$ within the path set $P$ to form a sensitive resource semantic feature vector:

ϕ_{l} (R) = [c_{1}, c_{2}, \dots, c_{n}] \in ℕ^{n}

(6)

Here, $c_{j}$ denotes the frequency of label $l_{j}$ in the firmware filesystem, and $n = |L|$ is the dimensionality of the label space. This vector characterizes the semantic footprint of credential, configuration, and information exposure within the sample, providing a quantitative basis for downstream similarity analysis tasks.
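The rule-based tagging and the count vector of Equation (6) can be sketched as follows. The rule set, label names, and first-match policy are hypothetical; only path-based rules are shown, so the static information indicators (which depend on file contents) are omitted.

```python
import re
from collections import Counter

# Hypothetical rule set M: (regular expression, label). First match wins.
RULES = [
    (re.compile(r"\.(key|pem|crt|cer)$"), "credential"),
    (re.compile(r"(^|/)(\.htpasswd|shadow|passwd|sshd_config|authorized_keys)$"), "credential"),
    (re.compile(r"\.(conf|ini|sql|db|sh)$"), "config"),
]
LABELS = ["credential", "config"]   # fixed label order shared by all samples

def match(path):
    """Map a path to its label, or None if no rule applies."""
    for pattern, label in RULES:
        if pattern.search(path):
            return label
    return None

def label_vector(paths):
    """Count how often each label occurs in the path set."""
    counts = Counter(filter(None, map(match, paths)))
    return [counts[label] for label in LABELS]

paths = ["/etc/shadow", "/etc/ssl/server.pem", "/etc/boa.conf",
         "/bin/busybox", "/etc/init.d/rcS.sh"]
vec = label_vector(paths)
```

Keeping `LABELS` fixed across samples is what makes the resulting vectors directly comparable.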

Binary Resource Signature Profiling. This profiling dimension focuses on the deployment locations, content structures, and potential reuse characteristics of executable binaries extracted from the unpacked firmware filesystem. We propose binary resource signature profiling to statically model the path semantics, string features, and fuzzy hash fingerprints of typical executables. This supports cross-firmware analysis of binary similarity and component-level clustering. We define the set of all executable files with either execution permissions or semantically significant paths as $B = \{b_{1}, b_{2}, \dots, b_{M}\}$, where each $b_{i}$ represents a distinct binary. For each binary, we define three signature functions to capture its key attributes:

SigProfile (b_{i}) = (p_{i}, s_{i}, h_{i})

(7)

Here, $p_{i}$ is a binary path pattern match flag, $s_{i}$ is the set of printable ASCII strings extracted from the binary, and $h_{i}$ is the fuzzy hash fingerprint. Specifically:

Path Semantic Feature. We use a predefined path pattern set $\mathcal{R}_{bin}$ to determine whether $b_{i}$ is deployed in common locations (e.g., /bin/httpd, /usr/sbin/telnetd, /lib/libcrypto.so). If matched, we set $p_{i} = 1$; otherwise, $p_{i} = 0$. The total number of path-matched binaries provides a structural representation of deployment patterns.
String Summary Feature. We extract the printable ASCII string set $S(b_{i})$ from each binary and retain only those strings with length greater than a threshold $n$ (default $n = 10$). This set captures high-level semantic content embedded within the binary.
Fuzzy Hash Fingerprint. We compute the ssdeep [] hash $h_{i} = ssdeep(b_{i})$ for each binary, yielding a fingerprint of its raw content. The complete fingerprint set $H_{R} = \{h_{1}, \dots, h_{M}\}$ serves as a basis for subsequent cross-firmware fuzzy matching.

Finally, we aggregate these features into a unified resource signature vector:

ϕ_{r} (R) = (P_{R}, S_{R}, H_{R})

(8)

Here, $P_{R} = \{p_{i} ∣ p_{i} = 1\}$ denotes the set of binaries with path semantics matches, $S_{R} = ⋃_{i = 1}^{M} S(b_{i})$ is the union of all printable string sets, and $H_{R} = \{h_{i}\}_{i = 1}^{M}$ is the complete set of ssdeep fingerprints. This signature profile serves as a lightweight and semantically rich input for downstream analyses such as path alignment, string-set similarity, and fuzzy fingerprint matching. It enables efficient large-scale binary reuse analysis and similarity-based firmware clustering.
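The path flag and string summary features can be sketched with the standard library alone. The pattern list and helper names are assumptions; the fuzzy hash component is left out here because it requires an external ssdeep implementation.

```python
import re

# Hypothetical path pattern set, built from the example locations in the text.
PATH_PATTERNS = [re.compile(p) for p in
                 (r"^/bin/httpd$", r"^/usr/sbin/telnetd$", r"^/lib/libcrypto\.so")]

def path_flag(path: str) -> int:
    """Return 1 if the binary sits at a commonly used location, else 0."""
    return int(any(p.search(path) for p in PATH_PATTERNS))

def printable_strings(data: bytes, n: int = 10) -> set[str]:
    """Printable ASCII runs strictly longer than n characters."""
    pattern = re.compile(rb"[\x20-\x7e]{" + str(n + 1).encode() + rb",}")
    return {m.decode("ascii") for m in pattern.findall(data)}

sample = b"\x00\x01/cgi-bin/upload_firmware.cgi\x00short\x00admin_password=%s\x00"
strings = printable_strings(sample)
```

Short runs such as `short` fall below the default threshold and are discarded, which keeps the string set focused on paths, format strings, and similar high-information content.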

Feature Fusion and Vectorized Representation. To unify the structural and semantic features modeled across the three sub-dimensions above, we concatenate the structural layout vector $ϕ_{s}(R)$, the sensitive resource vector $ϕ_{l}(R)$, and the binary resource signature vector $ϕ_{r}(R)$ in a fixed order to construct a holistic semantic representation of the firmware’s filesystem dimension:

Φ_{fs} (F) = [ϕ_{s} (R) ∥ ϕ_{l} (R) ∥ ϕ_{r} (R)]

(9)

Here, $∥$ denotes the vector concatenation operator. The resulting vector $Φ_{fs}(F)$ provides a consistent and interpretable feature representation across different firmware samples, supporting direct engagement in downstream tasks such as structural similarity comparison and cluster analysis.

The Filesystem Semantic Profile offers a multi-perspective, lightweight, and highly interpretable modeling paradigm. Its construction does not rely on decrypted file contents, filesystem mounting, or dynamic execution, making it broadly applicable across diverse firmware types. This profile serves as a foundational component in tasks such as structural reuse detection and configuration exposure assessment, providing robust analytical support under both scalable and adversarial analysis conditions.

3.3. Interface Exposure Profile

Communication interfaces serve as the primary entry points through which external entities interact with embedded devices. The presence and structure of such interfaces, along with their associated parameter semantics, reveal the functional exposure and input control boundaries of the underlying system. In embedded firmware, these interfaces are often embedded in configuration scripts, HTML forms, or hardcoded strings, with limited or no accompanying documentation or source code. To systematically recover the externally accessible communication surface, we introduce the Interface Exposure Profile, which extracts reachable interface paths and input parameters from firmware images and models the corresponding exposure structure. This profile is constructed in three stages: Interface Identifier Extraction, Interface Structural Summary Modeling, and Interface Exposure Profile Representation.

Interface Identifier Extraction. We begin by statically scanning the unpacked firmware filesystem for communication-related indicators. Specifically, we focus on HTML pages, CGI handler paths, shell scripts, and string constants embedded within binary files. From these sources, we extract two types of semantic identifiers: (i) interface paths that serve as functional endpoints for communication, and (ii) parameter names that represent user-controlled input keys. Examples of such indicators include CGI paths (e.g., /cgi-bin/upload_firmware.cgi, boafrm/formUpload) and form parameters (e.g., action, modelName) commonly found in embedded web interfaces. Formally, we define:

U (F) = \{u_{1}, u_{2}, \dots, u_{m}\}, K (F) = \{k_{1}, k_{2}, \dots, k_{n}\}

(10)

Here, $U(F)$ denotes the set of all reachable interface paths, and $K(F)$ denotes the set of parameter keys identified in firmware $F$. We preserve the raw string representation of these identifiers to retain semantic fidelity and naming consistency.

Interface Structural Summary Modeling. To enable compact representation and cross-firmware comparison, we construct a five-dimensional statistical vector to summarize the structural characteristics of the extracted interface elements. These features capture the scale, nesting complexity, and naming diversity of the communication surface:

Number of interface paths $N_{u} = |U(F)|$: the number of distinct interface paths.
Number of parameter names $N_{k} = |K(F)|$: the number of unique parameter names.
Average path depth $μ_{d} = \frac{1}{N_{u}} \sum_{i = 1}^{N_{u}} Depth(u_{i})$, where $Depth(u_{i})$ counts the number of slashes in $u_{i}$.
Prefix entropy of interface paths $H_{prefix} = - \sum_{j = 1}^{k} p_{j} \log_{2} p_{j}$, which measures how interface deployment is dispersed across the $k$ distinct top-level path prefixes (e.g., cgi-bin, boafrm, goform), where $p_{j}$ is the probability of prefix $j$.
Character-level entropy of parameter names $H_{key} = - \sum_{c \in C} p(c) \log_{2} p(c)$, which reflects the lexical diversity and naming consistency across all extracted input keys, where $C$ is the set of characters and $p(c)$ is the relative frequency of character $c$.

Formally, the structural summary vector is defined as:

ϕ_{intf} (F) = [N_{u}, N_{k}, μ_{d}, H_{prefix}, H_{key}] \in ℝ^{5}

(11)

This representation offers a holistic view of the input surface in terms of interface scale, structural organization, and naming semantics, supporting clustering and complexity evaluation tasks across firmware samples.
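The five components of Equation (11) can be computed as in the sketch below. The helper names and the way the top-level prefix is taken (the first path segment after stripping leading slashes) are assumptions for illustration.

```python
import math
from collections import Counter

def entropy(counter):
    """Shannon entropy (in bits) of the empirical distribution in a Counter."""
    total = sum(counter.values())
    if not total:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counter.values())

def interface_summary(paths, keys):
    paths, keys = set(paths), set(keys)
    n_u, n_k = len(paths), len(keys)
    mean_depth = sum(u.count("/") for u in paths) / n_u if n_u else 0.0
    prefixes = Counter(u.strip("/").split("/")[0] for u in paths)
    chars = Counter("".join(keys))
    return [n_u, n_k, mean_depth, entropy(prefixes), entropy(chars)]

paths = ["/cgi-bin/upload_firmware.cgi", "/cgi-bin/login.cgi",
         "/goform/setWan", "/boafrm/formUpload"]
keys = ["action", "modelName"]
vec = interface_summary(paths, keys)
```

In this example the prefix distribution is (1/2, 1/4, 1/4), so the prefix entropy is 1.5 bits.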

Interface Exposure Profile Representation. Finally, we define the complete Interface Exposure Profile of a firmware sample as:

Φ_{intf} (F) = (U (F), K (F), ϕ_{intf} (F))

(12)

Here, $U(F)$ and $K(F)$ preserve raw semantic information, and $ϕ_{intf}(F)$ provides a structured summary. Together, these components capture the content, range, and distributional properties of the firmware’s communication surface.

This profile has three key advantages: (i) it retains original identifiers for semantic interpretability and traceability; (ii) it supports alignment and comparison across new firmware samples; and (iii) its structural features enable multi-granularity analysis of communication surface complexity and design consistency. In Section 4, we leverage Interface Exposure Profiles for similarity computation and functional clustering across firmware samples, facilitating large-scale analysis of input structures and attack surfaces.

3.4. Exposed Binary Symbolic Profile

In embedded firmware, boundary binaries serve as the critical bridge between external communication interfaces and internal control logic. These binaries are responsible for input processing, protocol parsing, and system interactions. Their internal structures, such as function definitions, symbol tables, and identifiable semantic elements, directly reflect the device’s entry points and functional layout. In particular, such binaries are typically located at the forefront of the system execution path. As a result, their exposed symbolic features are central indicators of module boundaries and play a pivotal role in defining the attack surface and guiding the propagation of potential vulnerabilities.

To systematically model the symbolic-level structure of these binaries, we propose the Exposed Binary Symbolic Profile, which statically extracts multi-dimensional symbolic features including import tables, export symbols, reconstructed internal symbols, and identifiable function signatures. This modeling process is guided by the communication paths and parameter sets extracted in the Interface Exposure Profile, from which we select the top-n binaries that expose the most communication entry points:

B = \{B_{1}, B_{2}, \dots, B_{n}\}

(13)

We then analyze each $B_{i} \in B$ using static disassembly tools. In this process, we extract four key categories of symbolic features, namely the import table, export symbols, reconstructed internal symbols, and identifiable function signatures.

Import Table Modeling. We extract the set of imported functions required by $B_{i}$, representing its dependencies on external libraries and system interfaces:

Imp (B_{i}) = \{f_{1}^{imp}, f_{2}^{imp}, \dots, f_{m_{i}}^{imp}\}

(14)

This set reflects the runtime behavior intent of the binary. We retain the raw function names for cross-sample comparison and compute the total number of imports:

ϕ_{imp} (B_{i}) = |Imp (B_{i})|

(15)

Export Table Modeling. We extract the set of functions and global symbols exported by $B_{i}$, which define the binary’s external interfaces:

Exp (B_{i}) = \{f_{1}^{\exp}, f_{2}^{\exp}, \dots, f_{r_{i}}^{\exp}\}

(16)

These entries represent callable entry points for other components. We preserve the original names and record the export count:

ϕ_{\exp} (B_{i}) = |Exp (B_{i})|

(17)

Symbol Table Recovery. We attempt to reconstruct internal function symbols within $B_{i}$, yielding a set of recoverable symbols:

Sym (B_{i}) = \{f_{1}^{sym}, f_{2}^{sym}, \dots, f_{s_{i}}^{sym}\}

(18)

These may include internal or static functions detected via debugging residues or symbol recovery algorithms. We compute the total number:

ϕ_{sym} (B_{i}) = |Sym (B_{i})|

(19)

Function Identifier Extraction. We extract identifiable function signatures from all functions within $B_{i}$, forming a set of matched function fingerprints:

FID (B_{i}) = \{f_{1}^{id}, f_{2}^{id}, \dots, f_{t_{i}}^{id}\}

(20)

These signatures support comparison with known vulnerability handlers and protocol logic. We record their count:

ϕ_{id} (B_{i}) = |FID (B_{i})|

(21)

Symbolic Feature Representation. We concatenate the four symbolic statistics into a symbolic feature vector:

ϕ_{symb} (B_{i}) = [ϕ_{imp} (B_{i}), ϕ_{\exp} (B_{i}), ϕ_{sym} (B_{i}), ϕ_{id} (B_{i})] \in ℕ^{4}

(22)

Additionally, we retain the corresponding raw function sets for advanced symbolic matching:

Ψ_{symb} (B_{i}) = (Imp (B_{i}), Exp (B_{i}), Sym (B_{i}), FID (B_{i}))

(23)

To construct the complete symbolic profile for firmware $F$, we aggregate the features across all selected boundary binaries:

Φ_{symb} (F) = ⋃_{i = 1}^{n} (ϕ_{symb} (B_{i}), Ψ_{symb} (B_{i}))

(24)

This representation is both scalable and semantically expressive: it provides quantitative indicators of symbolic exposure while preserving the symbolic-level matching granularity. The Exposed Binary Symbolic Profile serves as a crucial foundation for downstream tasks such as component identification, protocol analysis, semantic attribution, and vulnerability propagation modeling.

3.5. Vulnerability-Oriented Call-Chain Profile

In embedded firmware, communication parameters often serve as key control entry points for triggering functionality and orchestrating protocols. Their propagation paths within a binary may reveal critical interaction chains between external input and internal processing logic. When such parameters reach security-sensitive functions, particularly those related to memory operations or command execution, they can pose serious risks such as buffer overflows or command injection. To capture these behaviors, we introduce the Vulnerability-oriented Call-Chain Profile. This static modeling approach traces parameter-dependent function call paths within boundary binaries to highlight potential high-risk interactions. This profile consists of three modeling stages: Sensitive Parameter-Oriented Trace Extraction, Address Normalization and Chain Reconstruction, and Path Sampling and Sequence Modeling. Together, they provide a unified structural foundation for reconstructing parameter-driven execution contexts and supporting semantic-level comparison across firmware samples.

Sensitive Parameter-Oriented Trace Extraction. We begin by identifying a predefined set of security-sensitive functions (e.g., strcpy, system, memcpy, snprintf) within the top-n boundary binaries. These functions are treated as sink points for potential unsafe input propagation. Using static taint analysis, we reverse-trace their input parameters to identify a subset of communication keys involved in such flows:

K (F) = \{k_{1}, k_{2}, \dots, k_{m}\}

(25)

Each key $k_{i}$ in $K(F)$ represents a parameter that appears at least once in a taint path reaching a sensitive function. For every $k \in K(F)$, we extract all function-level call chains along which the parameter propagates:

C_{k} = \{C_{1}^{k}, C_{2}^{k}, \dots, C_{t_{k}}^{k}\}

(26)

Each call chain $C_{i}^{k}$ is a sequence of function nodes representing the invocation path from input reception to the sensitive function:

C_{i}^{k} = [f_{i, 1}^{k}, f_{i, 2}^{k}, \dots, f_{i, l_{i}}^{k}]

(27)

Each function node $f_{i, j}^{k}$ may contain the function name and disassembly address, describing concrete transfer steps of the parameter through function calls.

Address Normalization and Chain Reconstruction. Due to differences in compilation, loading addresses, and symbol availability across firmware samples, direct use of raw function paths hinders cross-sample alignment. To address this, we normalize each chain by removing address information and preserving only semantically meaningful function names:

Norm (C_{i}^{k}) = [{f^{'}}_{1}, {f^{'}}_{2}, \dots, {f^{'}}_{l_{i}}], {f^{'}}_{j} = Name (f_{i, j}^{k})

(28)

Here, $Name(\cdot)$ extracts the symbolic name of a function. If no name is available, a placeholder is inserted. The normalized call chains are aggregated as:

{\hat{C}}_{k} = \{Norm (C_{i}^{k}) ∣ 1 \leq i \leq t_{k}\}

(29)

This set of address-independent sequences forms a canonical representation for parameter propagation paths, ensuring alignment across diverse firmware environments.

Path Sampling and Sequence Modeling. In practice, a single parameter may propagate through a large number of call chains. To reduce redundancy while preserving structural diversity, we group ${\hat{C}}_{k}$ by path length and retain up to five representative sequences for each length. Let $L$ denote the set of all observed path lengths; the final sampled set is then defined as:

{\tilde{C}}_{k} = ⋃_{l \in L} {Top}_{5} (\{C \in {\hat{C}}_{k} ∣ |C| = l\})

(30)

Here, ${Top}_{5}(\cdot)$ selects up to five chains per group, prioritizing path order or function importance heuristics. This sampling reduces modeling complexity while preserving core behavioral signals.

Finally, we define the overall call-chain profile of the firmware as:

Φ_{chain} (F) = \{{\tilde{C}}_{k} ∣ k \in K (F)\}

(31)

This structure aggregates normalized call paths for all relevant parameters, offering stable, interpretable representations suitable for downstream analysis. For similarity comparison, we adopt edit distance as a path-level metric, enabling robust sequence alignment and behavioral clustering across firmware samples.
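The normalization, sampling, and comparison steps can be sketched as follows. The node format, the `<unnamed>` placeholder, and keeping the first five distinct chains per length are assumptions; the text leaves the selection heuristic open. The edit distance is the standard Levenshtein distance applied to sequences of function names.

```python
from collections import defaultdict

def normalize(chain):
    """chain: list of (name_or_None, address) nodes -> list of function names."""
    return [name if name else "<unnamed>" for name, _addr in chain]

def sample_chains(chains, per_length=5):
    """Group chains by length and keep up to per_length distinct chains each."""
    groups = defaultdict(list)
    for chain in chains:
        key = tuple(chain)
        if key not in groups[len(key)]:
            groups[len(key)].append(key)
    return [c for length in sorted(groups) for c in groups[length][:per_length]]

def edit_distance(a, b):
    """Levenshtein distance between two function-name sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]

raw = [("main", 0x400100), ("handle_request", 0x400A20),
       (None, 0x401B00), ("strcpy", 0x402000)]
chain_a = normalize(raw)
chain_b = ["main", "handle_request", "system"]
distance = edit_distance(chain_a, chain_b)
```

Because addresses are dropped before comparison, two chains from differently compiled firmware images differ only where their function names differ.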

4. Firmware Similarity Computation Across Multi-Dimensional Semantic Profiles

In Section 3, we constructed five types of semantic profiles, focusing on distinct aspects of firmware behavior: structural unpacking signatures, filesystem semantic layouts, interface and parameter exposure, symbolic structures of boundary binaries, and parameter-dependent call chains. These profiles are highly complementary, compensating for the limitations of any single analytical dimension. They also exhibit strong generalizability and stability across heterogeneous architectures, providing a semantic foundation for downstream tasks such as firmware clustering, functionality transfer, and reasoning about vulnerability patterns.

To enable semantic alignment and similarity computation across firmware samples, this section presents a unified Multi-dimensional Semantic Profile Similarity Computation Framework. Given a pair of firmware samples $(F_{i}, F_{j})$, the framework computes a dedicated similarity score ${Sim}_{Φ}(F_{i}, F_{j})$ for each profile $Φ(F)$, adopting customized strategies according to the feature type, such as sequence sets, path structures, statistical vectors, or function call chains. The framework then integrates these individual scores using a profile fusion mechanism to produce a unified semantic similarity metric. The remainder of this section elaborates on the representation and comparison methods for each profile, detailing the similarity metrics and computation procedures for all semantic dimensions.

4.1. Unpacking Signature Sequence Profile Similarity

The Unpacking Signature Sequence Profile captures structural behaviors exhibited during firmware unpacking. It represents each firmware sample using a sequence of signature identifiers $S(F)$, an n-gram structure set $G_{k}(F)$, and a corresponding frequency vector $ϕ_{k}(F)$. To support similarity measurement under this dimension, we design a two-level strategy that combines set-based and frequency-based comparisons.

Let $F_{1}$ and $F_{2}$ be the two firmware samples under comparison. Their respective n-gram structure sets are:

G_{k} (F_{1}) = \{g_{1}^{(1)}, g_{2}^{(1)}, \dots, g_{n_{1}}^{(1)}\}, G_{k} (F_{2}) = \{g_{1}^{(2)}, g_{2}^{(2)}, \dots, g_{n_{2}}^{(2)}\}

(32)

Here, each $g_{i}^{(*)}$ is a length-$k$ structural sub-sequence extracted from the unpacking signature stream. We first compute the overlap between the two sets using the Jaccard similarity:

{Sim}_{sig}^{set} (F_{1}, F_{2}) = \frac{|G_{k} (F_{1}) \cap G_{k} (F_{2})|}{|G_{k} (F_{1}) \cup G_{k} (F_{2})|}

(33)

This metric reflects the extent to which the two firmware samples share common unpacking patterns, which indicates alignment in nested component structures.

To further assess their behavior on shared patterns, we construct a unified vocabulary $G = \{g_{1}, g_{2}, \dots, g_{m}\}$ over the union $G_{k}(F_{1}) \cup G_{k}(F_{2})$, and derive the frequency vectors:

ϕ_{k}^{'} (F_{1}) = [f_{1}^{(1)}, f_{2}^{(1)}, \dots, f_{m}^{(1)}], ϕ_{k}^{'} (F_{2}) = [f_{1}^{(2)}, f_{2}^{(2)}, \dots, f_{m}^{(2)}]

(34)

Here, $f_{j}^{(i)}$ denotes the frequency of $g_{j} \in G$ in firmware $F_{i}$. Based on these vectors, we compute the cosine similarity in the frequency space:

{Sim}_{sig}^{vec} (F_{1}, F_{2}) = \frac{ϕ_{k}^{'} (F_{1}) \cdot ϕ_{k}^{'} (F_{2})}{‖ ϕ_{k}^{'} (F_{1}) ‖ \cdot ‖ ϕ_{k}^{'} (F_{2}) ‖}

(35)

This captures the similarity in usage tendencies over common structural patterns, revealing shared unpacking behaviors and semantic bias.

Finally, we integrate the structural and frequency perspectives with equal weight to define the overall similarity under the unpacking profile:

{Sim}_{sig} (F_{1}, F_{2}) = \frac{1}{2} \cdot ({Sim}_{sig}^{set} (F_{1}, F_{2}) + {Sim}_{sig}^{vec} (F_{1}, F_{2}))

(36)

This dual-layer computation combines structural alignment and behavioral affinity, providing a stable, interpretable, and generalizable metric for modeling unpacking-based firmware similarity. It serves as a key component in the multi-dimensional similarity integration that follows.
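Equations (33) through (36) combine into a short routine. The function names and the convention of returning 0 when neither trace yields any n-gram are assumptions for illustration.

```python
import math
from collections import Counter

def ngrams(trace, k):
    return [tuple(trace[i:i + k]) for i in range(len(trace) - k + 1)]

def sim_sig(trace1, trace2, k=2):
    """Equal-weight mean of n-gram Jaccard and frequency cosine similarity."""
    c1, c2 = Counter(ngrams(trace1, k)), Counter(ngrams(trace2, k))
    union = set(c1) | set(c2)
    if not union:
        return 0.0                      # assumed convention for empty profiles
    jaccard = len(set(c1) & set(c2)) / len(union)
    dot = sum(c1[g] * c2[g] for g in union)
    norm = (math.sqrt(sum(v * v for v in c1.values()))
            * math.sqrt(sum(v * v for v in c2.values())))
    cosine = dot / norm if norm else 0.0
    return 0.5 * (jaccard + cosine)

t1 = ["UIMAGE", "GZIP", "SQUASHFS", "GZIP", "SQUASHFS"]
t2 = ["UIMAGE", "GZIP", "SQUASHFS"]
score = sim_sig(t1, t2)
```

For these traces the 2-gram sets share two of three patterns (Jaccard 2/3), and the frequency vectors (1, 2, 1) and (1, 1, 0) have cosine similarity $3 / \sqrt{12}$, giving a combined score of roughly 0.77.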

4.2. Filesystem Semantic Profile Similarity

The Filesystem Semantic Profile captures firmware deployment patterns and potential attack surfaces from three static perspectives: structural layout, sensitive resource exposure, and binary resource composition. To compute the similarity of two firmware samples under this profile, we decompose the overall representation $Φ_{fs}(F)$ into three heterogeneous feature categories. Each category is processed using a dedicated similarity function, and the resulting scores are aggregated into a unified similarity score. The complete method is detailed below.

Structural Layout Similarity. Let the structural layout vectors of two firmware samples $F_{i}$ and $F_{j}$ be defined as:

ϕ_{s} (F_{i}) = [μ_{δ}^{(i)}, σ_{δ}^{(i)}, μ_{ρ}^{(i)}, μ_{γ}^{(i)}, H^{(i)}], ϕ_{s} (F_{j}) = [μ_{δ}^{(j)}, σ_{δ}^{(j)}, μ_{ρ}^{(j)}, μ_{γ}^{(j)}, H^{(j)}]

(37)

We adopt the normalized Euclidean distance as the base metric for comparing these layout vectors, defined as:

{Sim}_{layout} (F_{i}, F_{j}) = 1 - \frac{∥ ϕ_{s} (F_{i}) - ϕ_{s} (F_{j}) ∥_{2}}{∥ ϕ_{s} (F_{i}) ∥_{2} + ∥ ϕ_{s} (F_{j}) ∥_{2}}

(38)

The similarity score lies within the range $[0, 1]$, since by the triangle inequality the numerator never exceeds the denominator, with larger values indicating greater structural similarity. Normalization mitigates bias caused by vector magnitude and is suitable for low-dimensional dense feature vectors.
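Equation (38) translates directly into code. Treating two all-zero vectors as identical is an assumption, since the equation is undefined in that case.

```python
import math

def sim_layout(a, b):
    """1 minus the Euclidean distance normalized by the sum of vector norms."""
    diff = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    denom = math.sqrt(sum(x * x for x in a)) + math.sqrt(sum(y * y for y in b))
    return 1.0 if denom == 0 else 1.0 - diff / denom

# Illustrative five-dimensional layout vectors (values are made up).
v1 = [3.0, 4.0, 0.0, 0.0, 0.0]
v2 = [6.0, 8.0, 0.0, 0.0, 0.0]
score = sim_layout(v1, v2)
```

Here the distance is 5 and the norms sum to 15, so the score is 2/3. Since the components of $ϕ_{s}$ live on different scales (for example, aggregation degree versus entropy), rescaling each component before comparison may be worth considering, although the text does not state whether this is done.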

Sensitive Resource Tag Similarity. Sensitive resources are modeled as frequency vectors defined over a unified label space $L$, where each label corresponds to a specific category of files or directories with potential security significance. Given two firmware samples $F_{i}$ and $F_{j}$, we represent their tag distributions as:

ϕ_{l} (F_{i}) = [c_{1}^{(i)}, c_{2}^{(i)}, \dots, c_{n}^{(i)}], ϕ_{l} (F_{j}) = [c_{1}^{(j)}, c_{2}^{(j)}, \dots, c_{n}^{(j)}]

(39)

Here, $c_{k}^{(i)}$ denotes the occurrence frequency of label $l_{k} \in L$ in firmware $F_{i}$. Since $L$ is predefined and consistent across all samples, the vectors $ϕ_{l}(F_{i})$ and $ϕ_{l}(F_{j})$ share identical dimensionality. We use cosine similarity to evaluate the alignment between the two label frequency vectors:

{Sim}_{tags} (F_{i}, F_{j}) = \frac{ϕ_{l} (F_{i}) \cdot ϕ_{l} (F_{j})}{∥ ϕ_{l} (F_{i}) ∥_{2} \cdot ∥ ϕ_{l} (F_{j}) ∥_{2}}

(40)

This similarity score reflects the degree of overlap in sensitive resource exposure between the two firmware images. It is particularly effective for comparing sparse high-dimensional vectors and highlights consistency in sensitive asset usage patterns.

Binary Resource Signature Similarity. The representation of binary resources consists of three distinct components: the set of matched file paths denoted as $P_{R}$, the set of printable strings denoted as $S_{R}$, and the set of fuzzy hash digests denoted as $H_{R}$. These components, respectively, characterize the semantic properties of the binary in terms of deployment location, textual content, and structural fingerprint.

Path Semantic Matching Similarity. Let $P_{i}$ and $P_{j}$ denote the sets of matched file paths extracted from firmware $F_{i}$ and $F_{j}$, respectively. The path-based similarity is quantified using the Jaccard index, defined as:

{Sim}_{path} (F_{i}, F_{j}) = \frac{|P_{i} \cap P_{j}|}{|P_{i} \cup P_{j}|}

(41)

String Set Similarity. Let $S_{i}$ and $S_{j}$ denote the sets of printable strings extracted from firmware $F_{i}$ and $F_{j}$, respectively. Their semantic similarity is computed by:

{Sim}_{str} (F_{i}, F_{j}) = \frac{|S_{i} \cap S_{j}|}{|S_{i} \cup S_{j}|}

(42)

Fuzzy Hash Matching Similarity. Let $H_{i}$ and $H_{j}$ represent the sets of fuzzy hash digests generated from the binary files of $F_{i}$ and $F_{j}$, respectively. To compare structural similarity, we take, for each element in $H_{i}$, its maximum fuzzy hash similarity against all elements in $H_{j}$, as given by the ssdeep similarity function ssdeep_sim(h, h′) ∈ [0, 100], whose output is normalized to the range [0, 1]:

{Sim}_{hash} (F_{i}, F_{j}) = \frac{1}{|H_{i}|} \sum_{h \in H_{i}} \max_{h^{'} \in H_{j}} ssdeep_sim (h, h^{'})

(43)

We then aggregate the three signature components using a weighted sum. By default, we assign equal weights $α = β = γ = \frac{1}{3}$, but these can be adjusted based on task-specific priorities:

{Sim}_{bin} (F_{i}, F_{j}) = α \cdot {Sim}_{path} + β \cdot {Sim}_{str} + γ \cdot {Sim}_{hash}, α + β + γ = 1

(44)
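The three signature components of Eqs. (41) to (44) can be sketched as below. The `compare` argument is a hypothetical caller-supplied function that returns a fuzzy hash score in [0, 100] (for example, a thin wrapper around an ssdeep binding); the division by 100 reflects the normalization to [0, 1] stated in the text, and the handling of empty sets is an assumption.

```python
def jaccard(a, b):
    """Eqs. (41) and (42): |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    union = a | b
    # Assumption: two empty sets are treated as identical.
    return len(a & b) / len(union) if union else 1.0


def hash_similarity(hashes_i, hashes_j, compare):
    """Eq. (43), with each score rescaled from [0, 100] to [0, 1]."""
    if not hashes_i or not hashes_j:
        return 0.0  # assumption: nothing to match against
    total = sum(max(compare(h, h2) for h2 in hashes_j) for h in hashes_i)
    return total / (100.0 * len(hashes_i))


def bin_similarity(paths_i, paths_j, strs_i, strs_j, hashes_i, hashes_j,
                   compare, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Eq. (44): weighted sum of path, string, and hash similarity."""
    alpha, beta, gamma = weights
    return (alpha * jaccard(paths_i, paths_j)
            + beta * jaccard(strs_i, strs_j)
            + gamma * hash_similarity(hashes_i, hashes_j, compare))


if __name__ == "__main__":
    # Stand-in comparator for demonstration only: exact match or nothing.
    exact = lambda h, h2: 100 if h == h2 else 0
    print(bin_similarity({"/bin/httpd"}, {"/bin/httpd", "/bin/cgi"},
                         {"admin"}, {"admin"}, ["d1", "d2"], ["d1"], exact))
```

Note that Eq. (43) averages over the digests of $F_{i}$ only, so the hash component is not symmetric in its two arguments.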

Integrated Similarity Fusion. Finally, we compute the overall similarity under the Filesystem Semantic Profile by fusing the three sub-scores:

{Sim}_{fs} (F_{i}, F_{j}) = w_{s} \cdot {Sim}_{layout} + w_{l} \cdot {Sim}_{tags} + w_{r} \cdot {Sim}_{bin}, w_{s} + w_{l} + w_{r} = 1

(45)

Here, $w_{s}$, $w_{l}$, and $w_{r}$ represent the weights assigned to structural layout, label frequency, and binary signature similarity, respectively. By default, we adopt uniform weighting to balance the contributions. This integrated strategy ensures that multi-source semantic features are jointly modeled, providing a robust metric for structural comparison and behavioral clustering at the filesystem level.

4.3. Interface Exposure Profile Similarity

The Interface Exposure Profile $Φ_{intf} (F)$ is designed to capture the structural organization and complexity of external communication surfaces in embedded firmware. It incorporates three components: the interface path set $U (F)$, the parameter name set $K (F)$, and a structural summary vector $ϕ_{intf} (F)$. To compute semantic similarity under this profile, we jointly consider two classes of features: symbolic identifier sets and statistical structural vectors. These features are assessed independently and then integrated under a unified similarity space.

Path and Key Set Similarity. We begin by comparing the sets of extracted interface paths and parameter names between firmware $F_{i}$ and $F_{j}$, using Jaccard similarity for both components:

{Sim}_{url} (F_{i}, F_{j}) = \frac{|U (F_{i}) \cap U (F_{j})|}{|U (F_{i}) \cup U (F_{j})|}, {Sim}_{key} (F_{i}, F_{j}) = \frac{|K (F_{i}) \cap K (F_{j})|}{|K (F_{i}) \cup K (F_{j})|}

(46)

Here, $U (F)$ denotes the set of statically extracted communication paths, while $K (F)$ refers to the corresponding input parameter names. Both are retained in their original string form to preserve semantic traceability. These metrics quantify the degree of overlap in externally exposed endpoints and input semantics, serving as the foundation for structural alignment and shared functionality identification.

Structural Summary Vector Similarity. To assess structural alignment at an abstract level, we represent each firmware using a five-dimensional structural vector $ϕ_{intf} (F) = [N_{u}, N_{k}, μ_{d}, H_{prefix}, H_{key}] \in ℝ^{5}$, which encodes the number of distinct paths and keys, average path depth, prefix entropy of deployment locations, and character-level entropy of parameter names. Since the vector resides in continuous space, we compute similarity using cosine similarity:

{Sim}_{stat} (F_{i}, F_{j}) = \frac{ϕ_{intf} (F_{i}) \cdot ϕ_{intf} (F_{j})}{∥ ϕ_{intf} (F_{i}) ∥ \cdot ∥ ϕ_{intf} (F_{j}) ∥}

(47)

This metric captures higher-level structural and naming trends, reflecting deployment complexity, input diversity, and design consistency across firmware samples.
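One possible way to derive the five components of $ϕ_{intf} (F)$ from the extracted path and key sets is sketched below. Two modeling choices here are assumptions, since this section does not spell them out: the "prefix" of a path is taken to be its first segment, and the character-level entropy is computed over the concatenation of all parameter names. The function names are likewise illustrative.

```python
import math
from collections import Counter


def shannon_entropy(items):
    """Shannon entropy (base 2) of the empirical distribution of items."""
    counts = Counter(items)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def path_depth(path):
    """Number of non-empty segments in a slash-separated path."""
    return len([seg for seg in path.split("/") if seg])


def summary_vector(paths, keys):
    """Return [N_u, N_k, mu_d, H_prefix, H_key] for one firmware image."""
    paths, keys = set(paths), set(keys)
    n_u, n_k = len(paths), len(keys)
    mean_depth = sum(path_depth(p) for p in paths) / n_u if n_u else 0.0
    # Assumption: the prefix of a path is its first segment.
    prefixes = [p.strip("/").split("/")[0] for p in paths]
    h_prefix = shannon_entropy(prefixes)
    # Assumption: character entropy over all parameter names joined together.
    h_key = shannon_entropy("".join(sorted(keys)))
    return [n_u, n_k, mean_depth, h_prefix, h_key]


if __name__ == "__main__":
    print(summary_vector({"/goform/setWifi", "/cgi-bin/login"}, {"ssid", "pwd"}))
```

Two such vectors can then be compared with the cosine formula of Eq. (47).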

Integrated Interface Exposure Similarity. To combine the above three scores into a unified similarity indicator, we define a weighted aggregation scheme:

{Sim}_{intf} (F_{i}, F_{j}) = λ_{1} \cdot {Sim}_{url} (F_{i}, F_{j}) + λ_{2} \cdot {Sim}_{key} (F_{i}, F_{j}) + λ_{3} \cdot {Sim}_{stat} (F_{i}, F_{j})

(48)

Here, $λ_{1}, λ_{2}, λ_{3} \in [0, 1]$ are fusion weights satisfying $λ_{1} + λ_{2} + λ_{3} = 1$. By default, equal weights $λ_{1} = λ_{2} = λ_{3} = \frac{1}{3}$ are used, though these may be adjusted for task-specific requirements.

This similarity formulation provides three key advantages. First, Jaccard similarity over original string sets enables clustering and generalization across vendor-specific interface conventions. Second, the statistical vector captures global structural patterns in a compact, computable form. Third, the fusion mechanism supports multi-source feature coordination, enhancing the alignment of interface semantics across architectures. This similarity score serves as an integral component in the overall cross-firmware similarity computation pipeline.

4.4. Exposed Binary Symbolic Similarity

In embedded firmware, boundary binaries are responsible for protocol parsing and system interaction, and their symbolic-level structures reflect both external dependencies and internal functional layouts. In the modeling described previously, we extract for each top-n boundary binary $B_{i} \in B$ a four-dimensional statistical feature vector $ϕ_{symb} (B_{i}) \in ℕ^{4}$, along with a corresponding symbol set $Ψ_{symb} (B_{i})$. These features jointly capture the exposed import functions, export symbols, reconstructed internal symbols, and function identifiers. To assess similarity under this semantic dimension, we design two complementary modeling strategies based on aggregated statistics and original symbolic content.

Statistical Vector Similarity. Given two firmware samples $F_{a}$ and $F_{b}$, let their respective sets of boundary binaries be $B_{a} = \{B_{1}^{a}, \dots, B_{n_{a}}^{a}\}$ and $B_{b} = \{B_{1}^{b}, \dots, B_{n_{b}}^{b}\}$. For each sample, we apply average pooling over the statistical feature vectors of all its binaries to form a global representation:

{\bar{ϕ}}_{symb} (F_{a}) = \frac{1}{n_{a}} \sum_{i = 1}^{n_{a}} ϕ_{symb} (B_{i}^{a}), {\bar{ϕ}}_{symb} (F_{b}) = \frac{1}{n_{b}} \sum_{j = 1}^{n_{b}} ϕ_{symb} (B_{j}^{b})

(49)

We then measure their similarity using cosine similarity, where $∥ \cdot ∥_{2}$ denotes the L2 norm:

{Sim}_{symb}^{vec} (F_{a}, F_{b}) = \frac{{\bar{ϕ}}_{symb} (F_{a}) \cdot {\bar{ϕ}}_{symb} (F_{b})}{∥ {\bar{ϕ}}_{symb} (F_{a}) ∥_{2} \cdot ∥ {\bar{ϕ}}_{symb} (F_{b}) ∥_{2}}

(50)

Symbol Set Similarity. To further incorporate raw symbolic identifiers, we aggregate all four categories of symbols from the boundary binaries of each firmware into a unified set:

S_{symb} (F_{a}) = ⋃_{i = 1}^{n_{a}} Imp (B_{i}^{a}) \cup Exp (B_{i}^{a}) \cup Sym (B_{i}^{a}) \cup FID (B_{i}^{a})

(51)

S_{symb} (F_{b}) = ⋃_{j = 1}^{n_{b}} Imp (B_{j}^{b}) \cup Exp (B_{j}^{b}) \cup Sym (B_{j}^{b}) \cup FID (B_{j}^{b})

(52)

Here, Imp, Exp, Sym, and FID respectively denote the sets of imported functions, exported symbols, reconstructed internal symbols, and function identifiers. We compute the Jaccard similarity between these unified symbol sets:

{Sim}_{symb}^{set} (F_{a}, F_{b}) = \frac{|S_{symb} (F_{a}) \cap S_{symb} (F_{b})|}{|S_{symb} (F_{a}) \cup S_{symb} (F_{b})|}

(53)

Hybrid Similarity Fusion. We integrate the statistical and symbolic-level similarities into a final similarity score using a weighted combination:

{Sim}_{symb} (F_{a}, F_{b}) = α \cdot {Sim}_{symb}^{vec} (F_{a}, F_{b}) + (1 - α) \cdot {Sim}_{symb}^{set} (F_{a}, F_{b})

(54)

Here, $α \in [0, 1]$ is a tunable parameter that adjusts the trade-off between structural statistics and raw symbolic semantics. The value of $α$ can be selected based on validation performance under different evaluation tasks.

This similarity modeling approach balances computational efficiency and representational fidelity, combining the stability of statistical abstraction with the expressiveness of symbolic identifiers. It is suitable for cross-firmware comparison tasks such as functional module alignment and behavior-based clustering.
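The pooling, set aggregation, and fusion steps of Eqs. (49) to (54) can be sketched as follows. The default `alpha=0.5` is an arbitrary placeholder, since the text leaves the value to validation, and all function names are illustrative.

```python
import math


def mean_pool(vectors):
    """Eq. (49): component-wise average over all boundary binaries."""
    if not vectors:
        raise ValueError("at least one feature vector is required")
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(u, v):
    """Eq. (50): cosine similarity with L2 norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def jaccard(a, b):
    """Eq. (53): Jaccard index of the unified symbol sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def symbolic_similarity(vecs_a, vecs_b, syms_a, syms_b, alpha=0.5):
    """Eq. (54): alpha * vector similarity + (1 - alpha) * set similarity."""
    vec_sim = cosine(mean_pool(vecs_a), mean_pool(vecs_b))
    # Eqs. (51) and (52): union of the per-binary symbol sets.
    set_a = set().union(*syms_a)
    set_b = set().union(*syms_b)
    return alpha * vec_sim + (1 - alpha) * jaccard(set_a, set_b)


if __name__ == "__main__":
    # Hypothetical per-binary counts and symbol names.
    print(symbolic_similarity(
        [[12, 3, 40, 55], [8, 1, 22, 30]], [[10, 2, 35, 50]],
        [{"system", "strcpy"}, {"websGetVar"}], [{"system", "websGetVar"}]))
```

Each element of `syms_a` and `syms_b` stands for the combined import, export, internal, and function-identifier symbols of one boundary binary.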

4.5. Vulnerability-Oriented Call-Chain Similarity

The Vulnerability-oriented Call-Chain Profile captures the semantic propagation paths of communication parameters to security-sensitive functions within embedded firmware. It reflects the internal execution chains that may trigger potential vulnerabilities such as buffer overflows or command injections. To assess the structural similarity of such propagation patterns across different firmware samples, we leverage normalized function call sequences and adopt edit distance as the primary metric for measuring pairwise path similarity. A cross-set alignment strategy is introduced to compute similarity at the firmware level.

Let the Vulnerability-oriented Call-Chain Profiles of firmware $F_{i}$ and $F_{j}$ be represented as:

Φ_{chain} (F_{i}) = \{C_{1}^{i}, C_{2}^{i}, \dots, C_{m}^{i}\}, Φ_{chain} (F_{j}) = \{C_{1}^{j}, C_{2}^{j}, \dots, C_{n}^{j}\}

(55)

Here, $C_{p}^{i}$ and $C_{q}^{j}$ denote individual normalized function call chains, represented as ordered sequences of function names:

C_{p}^{i} = [f_{p, 1}^{i}, f_{p, 2}^{i}, \dots, f_{p, l_{i}}^{i}], C_{q}^{j} = [f_{q, 1}^{j}, f_{q, 2}^{j}, \dots, f_{q, l_{j}}^{j}]

(56)

The similarity between any pair of paths $C_{p}^{i} \in Φ_{chain} (F_{i})$ and $C_{q}^{j} \in Φ_{chain} (F_{j})$ is defined as:

{Sim}_{edit} (C_{p}^{i}, C_{q}^{j}) = 1 - \frac{EditDist (C_{p}^{i}, C_{q}^{j})}{\max (|C_{p}^{i}|, |C_{q}^{j}|)}

(57)

Here, $EditDist (\cdot, \cdot)$ denotes the minimum edit distance between two function sequences, computed using insertion, deletion, and substitution operations. The length $|\cdot|$ denotes the number of functions in each path. This normalized formulation bounds the similarity score within the range $[0, 1]$, where a value of 1 indicates identical chains and 0 reflects complete dissimilarity.

To compute the overall similarity between firmware $F_{i}$ and $F_{j}$, we adopt a max-alignment averaging strategy. For each call chain in $F_{i}$, we select the most similar counterpart in $F_{j}$, and average the resulting similarities:

{Sim}_{chain} (F_{i}, F_{j}) = \frac{1}{|Φ_{chain} (F_{i})|} \sum_{C_{p}^{i} \in Φ_{chain} (F_{i})} \max_{C_{q}^{j} \in Φ_{chain} (F_{j})} {Sim}_{edit} (C_{p}^{i}, C_{q}^{j})

(58)

This strategy enhances structural comparison accuracy while mitigating risks from unmatched path lengths, varying call-chain counts, and inconsistent function naming. By incorporating the similarity of call chains as a semantic representation of vulnerability propagation behavior, this metric provides a robust foundation for identifying firmware samples with analogous vulnerability paths. It also supports cross-architecture vulnerability migration modeling and reproducibility analysis.
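Eqs. (57) and (58) can be sketched with a standard dynamic-programming edit distance over sequences of function names. The function names in the example chains are hypothetical, and returning 0 when either firmware has no chains is an assumption.

```python
def edit_distance(a, b):
    """Levenshtein distance over sequences (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]


def chain_edit_similarity(c_p, c_q):
    """Eq. (57): 1 - EditDist / max(|C_p|, |C_q|)."""
    longest = max(len(c_p), len(c_q))
    if longest == 0:
        return 1.0  # assumption: two empty chains are identical
    return 1.0 - edit_distance(c_p, c_q) / longest


def chain_similarity(chains_i, chains_j):
    """Eq. (58): average, over chains of F_i, of the best match in F_j."""
    if not chains_i or not chains_j:
        return 0.0  # assumption: no chains means no evidence of similarity
    best = [max(chain_edit_similarity(c, d) for d in chains_j) for c in chains_i]
    return sum(best) / len(chains_i)


if __name__ == "__main__":
    f_i = [["websGetVar", "sub_1", "system"], ["websGetVar", "strcpy"]]
    f_j = [["websGetVar", "sub_2", "system"]]
    print(chain_similarity(f_i, f_j))  # 0.5
```

In the example, the first chain differs from its best match by one substitution (similarity 2/3) and the second by two edits over a maximum length of three (similarity 1/3), giving an average of 0.5. As with Eq. (43), the score averages over the chains of $F_{i}$ only and is therefore direction-dependent.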

4.6. Multi-Profile Fusion and Global Similarity Computation

In the previous sections, we introduced independent similarity metrics across five key semantic dimensions of firmware: structural signature sequences, filesystem layout, interface exposure, symbolic representation, and vulnerability-oriented call chains. For each pair of firmware samples $(F_{i}, F_{j})$, we compute semantic similarity scores under the corresponding profiles: ${Sim}_{sig} (F_{i}, F_{j})$, ${Sim}_{fs} (F_{i}, F_{j})$, ${Sim}_{intf} (F_{i}, F_{j})$, ${Sim}_{symb} (F_{i}, F_{j})$, and ${Sim}_{chain} (F_{i}, F_{j})$. To support unified similarity assessment across firmware samples, we adopt a linearly weighted fusion strategy that aggregates scores from all semantic dimensions. The global semantic similarity is defined as follows:

{Sim}_{global} (F_{i}, F_{j}) = λ_{1} \cdot {Sim}_{sig} (F_{i}, F_{j}) + λ_{2} \cdot {Sim}_{fs} (F_{i}, F_{j}) + λ_{3} \cdot {Sim}_{intf} (F_{i}, F_{j}) + λ_{4} \cdot {Sim}_{symb} (F_{i}, F_{j}) + λ_{5} \cdot {Sim}_{chain} (F_{i}, F_{j})

(59)

Here, the weights $λ_{1}, λ_{2}, \dots, λ_{5} \in [0, 1]$ control the contribution of each semantic profile and satisfy the normalization constraint $\sum_{i = 1}^{5} λ_{i} = 1$. By default, we suggest the following weight configuration to balance structural stability, semantic expressiveness, and computational efficiency:

λ_{1} = 0.1, λ_{2} = 0.2, λ_{3} = 0.3, λ_{4} = 0.1, λ_{5} = 0.3

(60)

Among these dimensions, the Interface Exposure Profile $Φ_{intf} (F)$ and the Vulnerability-oriented Call-Chain Profile $Φ_{chain} (F)$ are assigned the highest weights. This is because they directly model the firmware’s attack surface, input boundaries, and potential vulnerability trigger paths, which are critical for homologous vulnerability discovery. The Filesystem Semantic Profile $Φ_{fs} (F)$ is also given a relatively high weight due to its stable directory topology and rich static feature representation, which yield strong discriminative power across scenarios. The remaining dimensions, the Unpacking Signature Sequence $ϕ_{k} (F)$ and the Exposed Binary Symbolic Profile $Φ_{symb} (F)$, serve as foundational and complementary features with lower weights, ensuring a comprehensive and balanced analysis.
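The fusion of Eq. (59) with the default weights of Eq. (60) amounts to a weighted sum, as in the sketch below. The dictionary keys and the per-profile scores in the example are hypothetical and serve only to illustrate the arithmetic.

```python
# Default weights from Eq. (60); the key names are illustrative.
DEFAULT_WEIGHTS = {"sig": 0.1, "fs": 0.2, "intf": 0.3, "symb": 0.1, "chain": 0.3}


def global_similarity(scores, weights=DEFAULT_WEIGHTS):
    """Eq. (59): weighted sum of the five per-profile similarity scores."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same profiles")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[name] * scores[name] for name in weights)


if __name__ == "__main__":
    # Hypothetical per-profile scores for one firmware pair.
    scores = {"sig": 0.9, "fs": 0.8, "intf": 0.7, "symb": 0.2, "chain": 0.95}
    print(round(global_similarity(scores), 3))  # 0.765
```

With these example scores the sum is 0.09 + 0.16 + 0.21 + 0.02 + 0.285 = 0.765, showing how a low symbolic score is outweighed by strong interface and call-chain agreement.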

5. Implementation and Evaluation

In this section, we present the prototype implementation of FirmVulLinker and conduct a comprehensive evaluation of its effectiveness. We assess the system based on the following research questions:

RQ1 Compared with existing approaches for analyzing firmware with homologous vulnerabilities, does FirmVulLinker offer a more comprehensive set of analysis dimensions?

RQ2 How effective is FirmVulLinker in identifying homologous vulnerabilities across firmware samples?

RQ3 Do the multi-dimensional semantic features employed by FirmVulLinker contribute to improved analysis results?

RQ4 Can FirmVulLinker identify real but previously overlooked vulnerabilities?

5.1. Framework Implementation

We have developed and released a fully functional prototype system named FirmVulLinker, which validates the feasibility and practicality of our semantic profiling and homologous vulnerability discovery framework. The system integrates the entire analysis pipeline, from semantic profile construction to cross-sample similarity computation. The source code is available on GitHub to facilitate reproducibility and future extension.

Multi-dimensional Semantic Profiling Module. The semantic profiling module in FirmVulLinker implements five core modeling components, corresponding to the five proposed semantic feature dimensions: Unpacking Signature Sequence Profile, Filesystem Semantic Profile, Interface Exposure Profile, Exposed Binary Symbolic Profile, and Vulnerability-oriented Call Chain Profile.

For Unpacking Signature Sequence Profiling, FirmVulLinker performs recursive scanning of firmware images using the Binwalk [] tool to extract sequences of embedded magic signatures. These sequences reflect the unpacking path and nesting order, forming a structural trace of the unpacking process. After n-gram modeling, the resulting signature vectors capture semantic differences in unpacking structures across samples.

For the Filesystem Semantic Profile, FirmVulLinker statically traverses the extracted filesystem using the Firmwalker [] tool, combined with custom pattern-matching rules, to automatically annotate sensitive resources based on path and file characteristics. The system further integrates directory layout, critical configuration files, and binary signature metadata to generate unified semantic feature vectors that support cross-sample comparison of filesystem structures and sensitive asset placement.

For the Interface Exposure Profile, the system integrates and extends the SaTC [] tool to identify potential external communication entry points, including HTML pages, CGI scripts, and associated parameterized paths. The extracted results are then structurally modeled to capture the number of interfaces, path templates, and parameter name patterns. These features are aggregated into high-dimensional interface exposure vectors, which serve as key inputs to subsequent vulnerability propagation analysis.

For Exposed Binary Symbolic Profiling, FirmVulLinker first ranks boundary binaries based on the number of exposed interfaces and parameters, selecting the top-n candidates for further inspection. Using the Ghidra [] reverse engineering suite, the system extracts each binary’s import table, export table, symbol table, and function identifiers. These symbolic features are then encoded into vectors and statistical descriptors, enabling semantic comparison between boundary modules across firmware.

For the Vulnerability-oriented Call-Chain Profile, FirmVulLinker conducts a deeper analysis of the previously identified boundary binaries. The system employs a predefined set of sensitive functions and uses the SaTC tool to trace the origin of their arguments through static backward slicing. Each extracted path undergoes address normalization, symbol replacement, and sequence reconstruction. Paths of varying lengths are retained to support chain-level sequence alignment and edit-distance comparison. The final representations serve as semantic encodings of potential vulnerability propagation paths at the firmware level.

Firmware Similarity Computation Module. To support the discovery of homologous vulnerabilities across firmware, FirmVulLinker implements a similarity computation and fusion module covering all five semantic dimensions. This module is implemented in Python 3.12.8 and applies tailored matching strategies for each profile, including n-gram and Jaccard-based sequence similarity, vector-based structural comparison, and edit-distance computation for call chains. The final similarity score is generated through feature normalization and weighted fusion, allowing for global comparison between firmware samples and facilitating the identification of homologous vulnerability propagation paths.

5.2. Experimental Setup

We begin by outlining the experimental setup, including the baseline tools used for comparison, hardware configurations, and the evaluation datasets.

Baseline Tool. For evaluating homologous vulnerability analysis across firmware, we selected LibAM [] as the primary baseline. LibAM is a state-of-the-art, open-source binary similarity analysis framework widely adopted for clone detection and reuse analysis across various architectures. As LibAM was originally designed for stand-alone binary analysis, it lacks native support for firmware-level inputs. To bridge this gap, we extended LibAM with additional modules to support automatic firmware unpacking, batch binary extraction, and input preprocessing, enabling direct operation on complete firmware images. To adapt LibAM’s output for firmware-level evaluation, we constructed a reuse correlation matrix that captures matched binary pairs across firmware samples. Based on the number of matched binary modules between two firmware images, we computed the mean squared error (MSE) and Pearson correlation coefficient. We fused these results to generate a unified firmware-level similarity score. This enables a fair comparison between LibAM and FirmVulLinker in terms of detecting shared components and potential vulnerability propagation.

Hardware Configuration. Experiments were conducted on two server environments. FirmVulLinker was executed on a CPU-based server equipped with two common KVM processor CPUs (32 physical cores in total), 128 GB RAM, running 64-bit Ubuntu 22.04.5 LTS. LibAM was executed on a GPU-based server with two Intel(R) Xeon(R) Platinum 8358 CPUs at 2.60 GHz (64 physical cores in total), 256 GB RAM, running 64-bit Ubuntu 22.04 LTS, and 8 × NVIDIA A800-SXM4-80GB GPUs (each with 80 GB of memory, CUDA 12.2). Thanks to its lightweight design, FirmVulLinker is capable of efficiently processing large-scale firmware entirely on the CPU, without requiring GPU support. In contrast, LibAM is GPU-dependent for inference and similarity computation.

Dataset. To evaluate FirmVulLinker’s effectiveness in discovering real-world homologous vulnerabilities across firmware samples, we curated datasets entirely from real-world sources. All firmware samples were historical versions officially released by major vendors, and each vulnerability included has a corresponding CVE identifier. Although FirmVulLinker performs purely static analysis without relying on emulation environments, we conducted full emulation and manual vulnerability validation for each firmware sample before inclusion. Specifically, for each firmware, we first built a complete containerized emulation environment using Docker to provide the necessary runtime for vulnerability validation. We then constructed a dedicated proof-of-concept (PoC) script for each vulnerability. To further improve automation, we developed corresponding validation code based on our prior framework, IoTBenchSL [], which enabled us to trigger and confirm vulnerabilities automatically. Only those samples that could be successfully emulated and whose vulnerabilities were reproducibly triggered were included in the final dataset, ensuring the accuracy and credibility of the ground truth. This additional validation step, though not required by the static analysis workflow, further strengthens the reliability of our evaluation.

In total, we selected 54 firmware images, covering 74 unique vulnerabilities, with each vulnerability appearing in at least two different firmware samples. The vulnerabilities primarily include memory corruption flaws and command injection vulnerabilities, which represent two of the most prevalent categories in IoT firmware security. The dataset also spans the most widely used embedded architectures, namely MIPS and ARM, ensuring that the evaluation reflects the diversity of real-world deployment environments. To further assess FirmVulLinker’s capability in discovering previously undisclosed vulnerabilities, we additionally constructed a second dataset comprising 78 firmware images. These samples could also be fully emulated but had no known vulnerabilities recorded in the National Vulnerability Database (NVD). As discussed in Section 5.6, this dataset was used to explore FirmVulLinker’s potential for identifying latent vulnerabilities that have not yet been reported.

5.3. Comparative Analysis of Feature Dimensions

To answer RQ1, this section comprehensively evaluates FirmVulLinker’s feature modeling capabilities in the context of homology-based firmware vulnerability analysis. We perform a comparative study against several representative static analysis tools that cover different mainstream paradigms, including classic static analysis systems (e.g., Firmalice [] and FirmRec []), structural embedding methods (e.g., Asm2Vec []), and integrated vulnerability analysis frameworks (e.g., SaTC [] and FirmSec []). The evaluation focuses on five critical semantic dimensions to determine whether each tool supports corresponding modeling capabilities:

Unpacking Signature Sequence Modeling: Captures the structural trace and magic byte sequences during firmware unpacking, reflecting packaging formats, nesting structures, and build regularity.
Filesystem Structure Modeling: Represents the organization of the unpacked filesystem, including directory hierarchies, path semantics, and resource density, to assist in identifying deployment-related configurations and behavior scripts.
Communication Interfaces and Parameters: Models externally exposed input vectors, such as web endpoints, CGI paths, and parameter names, to characterize attack surfaces and accessible entry points.
Boundary Binary Modeling: Extracts symbols, function signatures, or control flow information from core binaries to analyze component-level reuse relationships and potential clone characteristics.
Sensitive Call-Chain Modeling: Performs static backward analysis to identify propagation paths of parameters flowing into security-sensitive functions, reconstructing the triggering context of potential vulnerability paths.

Table 1 summarizes the modeling capabilities of FirmVulLinker and other tools across these five dimensions.

Table 1. Comparison of semantic feature coverage across firmware analysis tools.

As shown in Table 1, FirmVulLinker is the only framework that supports all five static semantic dimensions. Existing approaches typically suffer from structural constraints that hinder the construction of stable and comparable semantic representations for homology analysis. For example, FirmSec and Asm2Vec focus on function-level embeddings based on control flow graphs or assembly instructions but lack awareness of communication entry points and reachable paths, thereby failing to capture vulnerability exploitation contexts. Firmalice and FirmRec emphasize clone detection and sensitive path tracing but do not integrate multi-dimensional features or perform structural abstraction. SaTC supports partial interface extraction but is limited to frontend-backend path matching and does not model unpacking sequences or filesystem layouts, nor does it perform cross-dimensional feature integration.

In contrast, FirmVulLinker demonstrates clear advantages in systematic modeling. Its unified five-dimensional profiling design enables comprehensive semantic representation of both firmware structure and attack surfaces. Moreover, it supports interpretability in component decoupling, path modeling, and context reconstruction. This design provides a robust foundation for subsequent tasks such as firmware homology comparison and variant vulnerability tracing, significantly enhancing the framework’s stability and scalability across architectures and vendors.

5.4. Homologous Vulnerability Correlation Across Firmware Images

To address RQ2, we evaluate the effectiveness of FirmVulLinker in analyzing homologous vulnerability correlation from a vulnerability-centric perspective. We conduct a comparative study against LibAM, a state-of-the-art binary similarity analysis framework that we extended to operate on complete firmware images (Section 5.2), to evaluate both the accuracy and interpretability of identifying correlated vulnerabilities across different firmware images.

To support this evaluation, we curated a vulnerability-driven dataset comprising 54 firmware images from mainstream vendors, covering 74 validated CVE vulnerabilities. Each vulnerability is associated with at least two distinct firmware images. These firmware images, in which the presence of the corresponding vulnerabilities has been manually verified, are hereafter referred to as Known Defective Firmware (KDF). To ensure the reliability of the ground truth, all firmware images were fully emulated, and vulnerabilities were manually verified for reachability and triggering conditions. This process enabled us to establish accurate mappings between KDF images and their corresponding vulnerabilities. For each vulnerability, we randomly selected one associated firmware image as the representative reference firmware. In the subsequent correlation analysis, all other firmware images were treated as candidates and compared against these reference firmware images to evaluate semantic similarity. Metadata of the firmware images is summarized in Appendix A Table A1, and the ground truth mappings between firmware images and vulnerabilities are provided in Appendix A Table A2.

FirmVulLinker and LibAM were then used to compute the semantic similarity between each candidate firmware image and the reference firmware. A similarity threshold was applied to determine potential homologous relationships. The predictions were compared against the ground truth to evaluate three core metrics: precision, false-positive rate (FPR), and false-negative rate (FNR). Table 2 presents the comparative performance of FirmVulLinker and LibAM. The results show that FirmVulLinker significantly outperforms LibAM across all metrics, particularly in reducing false positives. Further analysis reveals that the multi-dimensional semantic profiling adopted by FirmVulLinker provides greater expressive power in modeling static behavior, interface exposure, and sensitive code paths. This enables FirmVulLinker to capture the deeper semantics behind homologous vulnerability correlation, rather than relying solely on superficial function or instruction-level clone detection.
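The thresholding and scoring step can be sketched as follows, assuming the standard confusion-matrix definitions: precision = TP / (TP + FP), FPR = FP / (FP + TN), and FNR = FN / (FN + TP). The threshold value, pair identifiers, and scores below are hypothetical and are not taken from the evaluation.

```python
def evaluate(similarities, ground_truth, threshold):
    """Compare thresholded predictions with ground-truth homologous pairs.

    similarities: dict mapping (reference, candidate) -> similarity score
    ground_truth: set of (reference, candidate) pairs that truly share
                  the vulnerability
    """
    tp = fp = tn = fn = 0
    for pair, score in similarities.items():
        predicted = score >= threshold
        actual = pair in ground_truth
        if predicted and actual:
            tp += 1
        elif predicted and not actual:
            fp += 1
        elif not predicted and actual:
            fn += 1
        else:
            tn += 1
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "fpr": fp / (fp + tn) if fp + tn else 0.0,
        "fnr": fn / (fn + tp) if fn + tp else 0.0,
    }


if __name__ == "__main__":
    sims = {("ref", "cand_1"): 0.9, ("ref", "cand_2"): 0.8,
            ("ref", "cand_3"): 0.3, ("ref", "cand_4"): 0.2}
    truth = {("ref", "cand_1"), ("ref", "cand_3")}
    print(evaluate(sims, truth, threshold=0.5))
```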

Table 2. Performance comparison on homologous vulnerability correlation accuracy.

Beyond accuracy, the five-dimensional similarity scoring mechanism introduced by FirmVulLinker further enables interpretable analysis of homologous vulnerability correlation among firmware images. As shown in Figure 2, we visualize the semantic similarity distribution of representative vulnerability scenarios using radar charts, where each axis corresponds to one of the five semantic dimensions modeled by FirmVulLinker: M1 denotes the Unpacking Signature Sequence Profile; M2, the Filesystem Semantic Profile; M3, the Interface Exposure Profile; M4, the Exposed Binary Symbolic Profile; and M5, the Vulnerability-oriented Call-Chain Profile. In each subplot, the red line represents the average similarity curve across all firmware image pairs that are correlated in the ground truth concerning the given vulnerability, providing a reference for typical semantic alignment. In contrast, the blue line depicts the similarity distribution of a specific candidate firmware image compared to its reference.

Figure 2. Radar chart of FirmVulLinker similarity scores for representative firmware pairs.

In Figure 2a, firmware image KDF-18 exhibits above-average similarity across all five dimensions (M1–M5), with particularly high scores from M1 to M3. This indicates substantial semantic alignment between the target and reference firmware images, suggesting a high likelihood of homologous vulnerability correlation. In Figure 2b, KDF-9 demonstrates significantly lower similarity in M4 but a notably high score in M5, while maintaining near-average similarity in the remaining dimensions. This suggests that KDF-4 and KDF-9 differ in their exposed binary symbolic representations but are highly consistent in their Vulnerability-oriented Call-Chain structures. Considering the overall similarity pattern, KDF-4 and KDF-9 may originate from homologous firmware and potentially exhibit a homologous vulnerability correlation. In Figure 2c,d, the target firmware images consistently show low similarity across all semantic dimensions, indicating considerable divergence from their respective reference firmware images and a low probability of homologous vulnerability correlation. The radar chart visualization provides a clear depiction of multi-dimensional similarity distributions across firmware image pairs and supports interpretable analysis of vulnerability correlation based on semantic profiling.

Figure 3 presents a heatmap illustrating the similarity scores between different firmware images and a baseline firmware image containing a specific vulnerability, evaluated across all five profiling dimensions. As shown in the figure, KDF-1, KDF-32, KDF-33, KDF-34, and KDF-29 all contain this vulnerability and exhibit consistently high similarity scores across the five dimensions accordingly. In contrast, KDF-19, KDF-3, KDF-9, KDF-10, and KDF-38 are not affected by this vulnerability, and their similarity scores are significantly lower across all dimensions. This visualization highlights FirmVulLinker’s ability to distinguish vulnerable firmware from non-vulnerable ones based on multi-dimensional semantic profiling.

Figure 3. Firmware similarity heatmap based on multi-dimensional profiling.

These results collectively demonstrate FirmVulLinker’s effectiveness in accurately identifying homologous firmware images within the context of vulnerability propagation. By employing structured and interpretable profiling across five complementary semantic dimensions, FirmVulLinker consistently outperforms traditional methods in both precision and robustness. Moreover, its explainable multi-dimensional scoring framework provides actionable insights into the correlation of homologous vulnerabilities, facilitating risk assessment and informing firmware update strategies. This highlights the practical applicability and engineering significance of FirmVulLinker in real-world firmware security scenarios.

5.5. Ablation Study of Semantic Profiling Dimensions

To address RQ3, which investigates the individual contribution of each semantic profiling dimension to the identification of homologous vulnerability correlation, we conduct a component-level ablation study. This study quantitatively evaluates the independent and joint effectiveness of the five profiling dimensions proposed in FirmVulLinker, thereby validating the necessity of each module in semantic modeling and cross-firmware alignment.

This experiment is conducted on the same evaluation dataset used in Section 5.4, encompassing 54 real-world firmware images and 74 verified vulnerabilities, each observed in at least two firmware images. To systematically assess the impact of each profiling dimension, we construct six comparative models. The FullModel incorporates all five profiling dimensions and serves as the most comprehensive configuration. Each of the other five models, Ablated-M1 through Ablated-M5, omits one specific dimension while keeping the others unchanged. The definitions of M1 through M5 are consistent with those described in Section 5.4. To ensure fair comparison, the weights of the remaining dimensions in each ablation model are normalized proportionally to maintain a consistent total weight.
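The proportional renormalization applied to the ablated models can be sketched as follows. This is a minimal illustration, assuming the global score is a weighted combination of the per-dimension similarities; the weight values and the function name `ablate_and_renormalize` are hypothetical placeholders, not the settings used in the experiments.

```python
from typing import Dict


def ablate_and_renormalize(weights: Dict[str, float], removed: str) -> Dict[str, float]:
    """Drop one dimension and rescale the others so the total weight is unchanged."""
    if removed not in weights:
        raise KeyError(f"unknown dimension: {removed}")
    total = sum(weights.values())
    remaining = {name: w for name, w in weights.items() if name != removed}
    remaining_sum = sum(remaining.values())
    if remaining_sum <= 0:
        raise ValueError("remaining weights must sum to a positive value")
    # Proportional rescaling keeps the ratios between the surviving dimensions.
    scale = total / remaining_sum
    return {name: w * scale for name, w in remaining.items()}


# Hypothetical weights for M1..M5 (assumed values, for illustration only).
full_weights = {"M1": 0.10, "M2": 0.20, "M3": 0.30, "M4": 0.15, "M5": 0.25}
ablated_m1_weights = ablate_and_renormalize(full_weights, "M1")
```

Because the rescaling is proportional, the relative importance of the four remaining dimensions is the same in every ablated model as in the FullModel, so any change in the metrics can be attributed to the removed dimension rather than to a shifted weighting.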

Figure 4 presents the performance of each model on three key metrics: precision, false-positive rate (FPR), and false-negative rate (FNR). The FullModel achieves the best overall performance, with a precision of 0.9564, an FPR of 0.0305, and an FNR of 0.0215. These results confirm that integrating all five profiling dimensions yields the most accurate detection of homologous vulnerability correlation.
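For reference, the three metrics can be computed from confusion counts as in the sketch below. It uses the standard textbook definitions and assumes they match the ones adopted in the evaluation; the `Confusion` type and the example counts are hypothetical and are not taken from the experiments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    tp: int  # correlated pairs correctly reported
    fp: int  # uncorrelated pairs wrongly reported
    tn: int  # uncorrelated pairs correctly rejected
    fn: int  # correlated pairs that were missed


def precision(c: Confusion) -> float:
    """TP / (TP + FP): share of reported correlations that are real."""
    reported = c.tp + c.fp
    return c.tp / reported if reported else 0.0


def false_positive_rate(c: Confusion) -> float:
    """FP / (FP + TN): share of uncorrelated pairs that were reported."""
    negatives = c.fp + c.tn
    return c.fp / negatives if negatives else 0.0


def false_negative_rate(c: Confusion) -> float:
    """FN / (FN + TP): share of correlated pairs that were missed."""
    positives = c.fn + c.tp
    return c.fn / positives if positives else 0.0


# Made-up counts, used only to exercise the functions.
example = Confusion(tp=90, fp=10, tn=190, fn=10)
metrics = (precision(example), false_positive_rate(example), false_negative_rate(example))
```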

Figure 4. Performance impact of removing each semantic profile dimension.

Figure 4a illustrates the precision performance of all models. Notably, Ablated-M1, Ablated-M2, and Ablated-M4 exhibit a substantial drop in precision, highlighting the critical role of unpacking structure, filesystem semantics, and binary symbol context in enhancing alignment accuracy. These dimensions capture high-level firmware organization, resource layout, and symbolic interaction semantics, all of which are essential for distinguishing fine-grained homology. Figure 4b depicts the changes in both false-positive rate and false-negative rate. Ablated-M1, Ablated-M3, and Ablated-M4 exhibit increased FPRs, indicating that unpacking signatures, interface structures, and symbol semantics help to suppress incorrect matches. The absence of these dimensions notably weakens the system’s ability to filter non-homologous firmware images, especially under customized interface or symbol mutations. Conversely, Ablated-M2, Ablated-M3, and Ablated-M5 show an increase in FNRs, suggesting that filesystem layouts, interface parameters, and call-chain paths are indispensable for capturing real vulnerability propagation paths. These dimensions encompass critical reachability, parameter dependency, and behavioral semantics that are essential for identifying actual attack vectors.

An interesting observation is that the performance gap between the FullModel and Ablated-M5 is relatively small compared with the other ablation groups. On closer analysis, we found that some of the parameter-related semantics encoded in the Vulnerability-oriented Call-Chain Profile (M5) are also partially captured by the interface exposure profile (M3), which records interface parameters. Since parameter features are particularly effective for modeling vulnerability correlation, the presence of M3 compensates for part of the information loss when M5 is removed. Nevertheless, including M5 in the FullModel consistently improves accuracy, demonstrating the benefit of retaining all profiling dimensions and further underscoring the robustness of our design.

Taken together, the results across all three metrics demonstrate the strong complementarity of FirmVulLinker’s five-dimensional semantic profiling framework. From structural unfolding and resource semantics to interface behavior and symbolic execution context, each dimension models a distinct aspect of firmware behavior. The absence of any one dimension leads to measurable degradation in performance, reinforcing both the design rationale and practical robustness of our framework. These findings affirm the effectiveness of our multi-dimensional profiling strategy in modeling homologous vulnerability correlation and provide a solid foundation for large-scale reuse detection and threat propagation analysis.

5.6. Discovery of Previously Undisclosed Vulnerabilities

To address RQ4, this section evaluates the capability of FirmVulLinker to discover previously undisclosed vulnerabilities in real-world settings, with a particular focus on its effectiveness in expanding the scope of known vulnerability impacts and revealing homologous vulnerability correlations across firmware images. Specifically, we selected the 74 verified vulnerabilities from Section 5.4 as known vulnerability sources and examined whether the corresponding vulnerable logic could be identified in other firmware images through correlation analysis. To this end, we curated an evaluation set consisting of 78 firmware images, none of which have been reported to contain any of the 74 known vulnerabilities in the CVE or NVD databases, thereby representing typical “unknown-state” targets for homologous vulnerability correlation analysis. Each firmware image was validated via emulation using our in-house open-source tool, FirmEmuHub [], ensuring successful boot and interaction for subsequent vulnerability verification.

The experiment proceeds as follows. We use the 74 verified vulnerabilities and their corresponding reference firmware images from Section 5.4 as the known vulnerability sources. FirmVulLinker is then employed to perform pairwise correlation analysis between each target firmware image and all known source images. By computing multi-dimensional global semantic similarity scores, FirmVulLinker automatically determines whether a target firmware image is a potential homologous carrier of a known vulnerability. A similarity score exceeding a predefined threshold triggers further validation, under the hypothesis that the target may reuse vulnerable logic from the reference.
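The candidate identification step described above can be sketched as follows. This is a simplified illustration, assuming the global score is a weighted average of the five per-dimension similarities; the weights, the threshold, the target names, and the helper functions are hypothetical and do not reproduce the configuration used in the experiment.

```python
from typing import Dict, List, Tuple

DIMENSIONS = ("M1", "M2", "M3", "M4", "M5")


def global_similarity(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of the per-dimension similarity scores (assumed aggregation)."""
    total_weight = sum(weights[d] for d in DIMENSIONS)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[d] * scores[d] for d in DIMENSIONS) / total_weight


def select_candidates(
    pair_scores: Dict[Tuple[str, str], Dict[str, float]],
    weights: Dict[str, float],
    threshold: float,
) -> List[Tuple[str, str, float]]:
    """Return (target, reference, score) for pairs whose score exceeds the threshold."""
    candidates = []
    for (target, reference), scores in pair_scores.items():
        score = global_similarity(scores, weights)
        if score > threshold:
            candidates.append((target, reference, score))
    # Highest-scoring pairs first, so they can be validated first.
    return sorted(candidates, key=lambda item: item[2], reverse=True)


# Hypothetical inputs: equal weights and two made-up target images.
weights = {d: 0.2 for d in DIMENSIONS}
pair_scores = {
    ("target-A", "KDF-18"): {"M1": 0.9, "M2": 0.8, "M3": 0.85, "M4": 0.7, "M5": 0.9},
    ("target-B", "KDF-18"): {"M1": 0.2, "M2": 0.3, "M3": 0.1, "M4": 0.2, "M5": 0.1},
}
candidates = select_candidates(pair_scores, weights, threshold=0.6)
```

Only pairs that pass the threshold move on to emulation-based validation, which keeps the more expensive dynamic step limited to plausible candidates.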

At the candidate identification stage, FirmVulLinker produced 283 potentially correlated firmware pairs. These were further validated using the integrated FirmEmuHub and IoTVulBench [] toolchain, which automates large-scale firmware emulation and vulnerability verification. Through static configuration, dynamic interaction, and automated exploit testing, we refined the candidate set by eliminating redundant matches and reconstructing exploit paths. To validate the identified vulnerabilities, we constructed dedicated emulation environments for each firmware image using Docker-based containerization provided by FirmEmuHub. For every vulnerability, we implemented or reused PoC scripts that trigger the suspected flaw. During validation, IoTVulBench automatically deployed each firmware in the emulated environment, injected the corresponding PoC payloads, and monitored the firmware’s service behavior. A vulnerability was confirmed if observable state changes occurred in the target, such as abnormal responses, service crashes, or the execution of unauthorized commands. This validation pipeline was fully automated through configuration files, enabling reproducible and scalable verification across all candidate firmware images. By combining FirmEmuHub and IoTVulBench, we ensured both the accuracy and efficiency of confirming vulnerabilities at scale. Ultimately, we confirmed and reproduced 53 exploitable vulnerabilities within the target firmware set, each with verifiable impact and valid triggering conditions. We refer to firmware images identified by FirmVulLinker as containing vulnerable logic similar to known sources, despite lacking prior vulnerability disclosures, as Unknown Anomaly Firmware (UAF). Detailed metadata for these firmware images are provided in Table 3.
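The confirmation rule stated above (a vulnerability counts as confirmed when the PoC causes an observable state change) can be expressed as a small predicate. The sketch below is illustrative only: it does not use the FirmEmuHub or IoTVulBench interfaces, and the `PocObservation` record, its fields, and the identifiers in the example are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class PocObservation:
    """Hypothetical record of what was observed after injecting one PoC payload."""
    firmware_id: str
    vulnerability_id: str
    abnormal_response: bool = False
    service_crashed: bool = False
    unauthorized_command_executed: bool = False


def is_confirmed(obs: PocObservation) -> bool:
    """A vulnerability is confirmed if any observable state change occurred."""
    return (
        obs.abnormal_response
        or obs.service_crashed
        or obs.unauthorized_command_executed
    )


def confirmed_pairs(observations: Iterable[PocObservation]) -> List[Tuple[str, str]]:
    """Keep the (firmware, vulnerability) pairs whose PoC produced a state change."""
    return [(o.firmware_id, o.vulnerability_id) for o in observations if is_confirmed(o)]


# Made-up observations for two candidate pairs.
observations = [
    PocObservation("target-A", "VULN-1", service_crashed=True),
    PocObservation("target-B", "VULN-1"),
]
confirmed = confirmed_pairs(observations)
```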

Table 3. Metadata of representative UAF.

Detailed results are summarized in Table 4, which presents the matched UAF and their associated vulnerabilities identified via homologous correlation analysis. It is important to note that although these 53 vulnerabilities are documented in existing databases (i.e., N-day vulnerabilities), the specific firmware versions identified in our experiment are not listed in any official disclosures. This finding underscores FirmVulLinker’s capability to uncover undocumented yet affected firmware images, thereby augmenting public vulnerability intelligence. By accurately identifying homologous firmware exhibiting shared propagation chains, FirmVulLinker provides practical support for extending vulnerability coverage and refining the assessed impact scope.

Table 4. Matched UAF and their associated vulnerabilities identified via FirmVulLinker.

Overall, these findings validate the effectiveness of FirmVulLinker in discovering real-world vulnerabilities under unsupervised conditions. By leveraging semantic profiling and structural similarity analysis, FirmVulLinker captures deep logical correlations across firmware images. Beyond its detection accuracy, FirmVulLinker demonstrates strong potential as an automated tool for expanding the impact boundaries of known vulnerabilities. It facilitates rapid risk identification for security analysts and gives vulnerability disclosure platforms, device vendors, and supply chain managers more accurate and comprehensive impact assessments, thereby advancing firmware security awareness and proactive defense strategies.

6. Discussion and Future Work

Although FirmVulLinker demonstrates promising progress in multi-dimensional semantic modeling and homologous vulnerability correlation analysis across firmware images, and has been empirically validated on real-world firmware datasets, it still faces several limitations and opportunities for further enhancement when applied to larger-scale or more complex deployment scenarios. This section discusses the current capability boundaries and outlines potential future research directions.

Finer-grained Semantic Feature Modeling. The current semantic profiling framework of FirmVulLinker incorporates five major dimensions: unpacking signature sequences, filesystem structure layout, interface exposure profiles, boundary binary symbolic structure, and sensitive parameter call chains. These profiles collectively describe the static structure and behavioral semantics of most generic firmware images. However, finer-grained semantic elements such as control flow graphs (CFGs), data flow graphs (DFGs), system call sequences, and inter-file dependencies are not explicitly modeled. This may limit the system’s semantic resolution capability when dealing with refactored, obfuscated, or structurally evolved firmware. In future work, we plan to further refine static semantic profiling by incorporating representation learning techniques. Specifically, we will explore the use of machine learning and deep embedding models to capture latent relationships across heterogeneous static features, such as CFG fragments, system call contexts, and inter-file dependencies. These methods can provide more expressive semantic embeddings and improve the robustness of similarity computation under code refactoring, obfuscation, and structural variations. Moreover, we recognize that the current framework adopts heuristic parameter settings when integrating different feature dimensions. While this provides a reasonable starting point, future research will investigate data-driven parameter optimization and sensitivity analysis, which can further justify and adaptively balance the contributions of different profiling dimensions. By combining taint-based control dependency modeling, static semantic embeddings, and learning-based similarity metrics, FirmVulLinker will be able to capture fuzzy function boundaries, composite component structures, and implicit vulnerability paths with higher accuracy. These enhancements are expected to substantially improve both the precision and interpretability of homologous vulnerability correlation across diverse firmware images.

Extending Compatibility to Non-Linux Embedded Firmware. FirmVulLinker is currently designed for embedded firmware images running Linux, relying on standardized filesystem layouts and ELF-format binaries for feature extraction. However, in practical industrial contexts, bare-metal architectures, RTOS-based systems, and customized boot logics are prevalent. These systems often adopt fundamentally different program organizations, entry point definitions, and symbol table layouts, which restrict the portability of the current model. Migrating FirmVulLinker to such environments introduces several challenges: the absence of standardized filesystem structures and executable formats complicates consistent feature extraction; the frequent lack of symbol or debugging information in RTOS and bare-metal binaries hinders semantic recovery; and heterogeneous bootstrapping mechanisms, interrupt handling, and peripheral initialization logic introduce significant variability in control and data flow structures. To address these challenges, we plan to design a pluggable abstraction framework targeting non-Linux firmware, integrating techniques such as image structure recognition, heuristic function boundary recovery, and binary context clustering. This extension will enhance FirmVulLinker’s adaptability and robustness across heterogeneous embedded platforms, enabling broader applicability in real-world scenarios.

Managing False Positives and Negatives. Another limitation lies in the potential false positives and negatives introduced by multi-dimensional profiling. For example, structurally similar features across different firmware images may lead to spurious correlations, while incomplete or noisy feature extraction may cause homologous vulnerabilities to be overlooked. These issues stem from the inherent trade-offs between sensitivity and specificity in static semantic modeling. To mitigate such risks, we plan to refine our profiling framework by incorporating adaptive weighting across feature dimensions, introducing contextual constraints to filter out irrelevant matches, and integrating lightweight dynamic validation to cross-check critical cases. These enhancements will improve the balance between precision and recall, thereby strengthening the reliability of FirmVulLinker in large-scale vulnerability correlation tasks.

Structural Modeling of Cross-Firmware Vulnerability Propagation. FirmVulLinker currently focuses on pairwise semantic similarity analysis, which is effective for discovering homologous vulnerability correlations. However, real-world vulnerability propagation often manifests in more complex patterns, such as chained transfers, component-level reuse, and semantic evolution of interfaces. The current model does not explicitly capture causal paths or propagation chains across firmware images. As a future direction, we intend to construct a cross-firmware vulnerability propagation graph by leveraging function slicing, call graph normalization, and path reconstruction techniques. We will further incorporate graph matching algorithms to support path alignment, source inference, and multi-point co-origin recognition, thereby enabling structurally explainable modeling of vulnerability propagation across firmware images.
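As a first approximation of such a propagation graph, pairwise correlations can be linked into an undirected graph whose connected components group firmware images that may share an origin. The sketch below is only a simplified illustration of this idea: it uses plain connected-component grouping rather than the function slicing, call graph normalization, or graph matching techniques named above, and the helper names are hypothetical. The example edges are a few reference-to-associated pairs taken from Table A2 in Appendix A.

```python
from collections import defaultdict, deque
from typing import Dict, Iterable, List, Set, Tuple


def build_graph(edges: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map from correlated firmware pairs."""
    graph: Dict[str, Set[str]] = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def co_origin_groups(graph: Dict[str, Set[str]]) -> List[Set[str]]:
    """Return connected components, found with a breadth-first search."""
    seen: Set[str] = set()
    groups: List[Set[str]] = []
    for start in sorted(graph):
        if start in seen:
            continue
        group = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    group.add(neighbor)
                    queue.append(neighbor)
        groups.append(group)
    return groups


# Reference-to-associated pairs from Table A2 (a small subset).
edges = [
    ("KDF-3", "KDF-19"),   # CVE-2020-10215
    ("KDF-10", "KDF-9"),   # CVE-2022-25061
    ("KDF-24", "KDF-7"),   # CVE-2023-39746
    ("KDF-24", "KDF-40"),  # CVE-2023-39746
]
groups = co_origin_groups(build_graph(edges))
```

Connected components ignore edge direction and causality, so they indicate which images belong together but not where a vulnerability was introduced; source inference would require the richer path-level modeling outlined in this paragraph.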

Enhancing Vulnerability Intelligence and Practical Deployment. FirmVulLinker has demonstrated strong capabilities in identifying homologous vulnerability correlations, offering automated and structured mechanisms for extending the impact scope and bridging version gaps of known vulnerabilities. Moving forward, we plan to develop a comprehensive vulnerability analysis toolchain centered on FirmVulLinker, tailored explicitly for security practitioners. This toolchain will provide end-to-end support for firmware unpacking, feature extraction, similarity analysis, and identification of affected versions, enabling researchers to pinpoint high-risk targets within large-scale firmware collections.

Despite these promising results, we acknowledge several limitations that point to directions for future work. First, while our evaluation demonstrates solid accuracy, we have not yet performed systematic measurements of throughput, runtime efficiency, and memory consumption at larger scales. Second, we have not conducted robustness testing under conditions such as symbol stripping, compiler-based obfuscation, or atypical packing. These transformations are increasingly present in real firmware and may degrade the discriminative power of specific profiling dimensions. Exploring their impact on F1, recall, and failure modes will help refine our profiling strategies and improve resilience against adversarial or obfuscated binaries. Additionally, our current evaluation is limited by the scarcity of baseline tools. Many recent approaches in this field have not released source code, and even when code is available, non-trivial adaptation is often required to apply it to our evaluation setting. These constraints restricted the number of baselines we could incorporate in the present study. We view this as an important avenue for future work and plan to extend our evaluation once more baselines become accessible or can be feasibly adapted.

To support continuous benchmarking, we will also update the open-source repository of FirmVulLinker with extended results, including both performance and robustness experiments. Ultimately, we aim to integrate the planned toolchain with existing vulnerability disclosure platforms and supply chain risk assessment systems. By enriching vulnerability databases with missing information on affected versions, FirmVulLinker can enhance the completeness and timeliness of vulnerability intelligence. This, in turn, will promote automation in vulnerability lifecycle management and contribute to strengthening the overall security of the firmware ecosystem.

7. Conclusions

In this paper, we propose and implement FirmVulLinker, a multidimensional semantic profiling framework for identifying homologous vulnerability correlations across firmware images. FirmVulLinker models each firmware holistically across five static dimensions, including unpacking signature sequences, filesystem semantics, interface exposure, boundary binary symbols, and sensitive parameter call chains. This design enables robust similarity computation and the reconstruction of vulnerability propagation paths. We evaluated FirmVulLinker on two real-world datasets and validated its effectiveness across multiple tasks. Experimental results demonstrate that it outperforms state-of-the-art approaches in terms of precision, false-positive rate, and coverage. Notably, FirmVulLinker successfully identified and reproduced 53 previously undisclosed N-day vulnerabilities in firmware images exhibiting homologous vulnerability correlations. These findings underscore FirmVulLinker’s capability to extend existing vulnerability intelligence and support scalable analysis of firmware-level vulnerability propagation. By providing a unified and interpretable semantic modeling approach, FirmVulLinker delivers practical and extensible support for large-scale IoT firmware security auditing and homologous vulnerability discovery.

Author Contributions

Conceptualization, Y.C. and W.H.; data curation, F.X. and L.X.; methodology, Y.C., J.Y. and W.F.; resources, W.F. and W.L.; software, Y.C., F.X., L.X., Y.G. and J.Y.; validation, Y.C., F.X., L.X. and Y.G.; visualization, F.X., L.X. and Y.G.; writing—original draft, Y.C.; writing—review and editing, Y.C., W.H. and W.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Fundamental Research Funds for the Central Universities under Grant Numbers CUCZDTJ2403 and CUC25SG001. The experiments and data processing in this study were supported by the Public Computing Cloud, CUC.

Data Availability Statement

The data used to support the findings of this study are available from the corresponding author upon request. The disclosed security vulnerabilities used and identified in this paper are accessible in the CVE database (https://cve.mitre.org/, accessed on 29 July 2025). Additionally, the firmware benchmark dataset used during this study will be open-sourced at https://github.com/a101e-lab/FirmVulLinker-dataset (accessed on 29 July 2025).

Acknowledgments

We would like to sincerely thank the reviewers for their insightful comments, which helped us improve this work.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

This appendix provides supplementary materials that support the experiments and findings presented in the main sections. Table A1 summarizes the metadata of all Known Defective Firmware (KDF) images curated for evaluating homologous vulnerability correlations. These firmware images were carefully selected from mainstream vendors and validated to ensure reproducible vulnerability triggering. Table A2 presents the manually verified ground truth of homologous vulnerability correlations, listing each vulnerability alongside its designated reference firmware image and associated correlated firmware images. Together, these tables aim to facilitate transparency, reproducibility, and future comparative research on firmware-level vulnerability correlation and propagation analysis.

Table A1. Metadata of firmware images used in the homologous vulnerability correlation evaluation.

ID	Device Type	Hardware Version	ID	Device Type	Hardware Version
KDF-1	TL-WR940N	V4(us)	KDF-28	TL-WR802N	V1
KDF-2	DIR-823G A1	v1.0.2B03	KDF-29	TL-WR940N	V3(150206)
KDF-3	DIR-825 B1	v2.10NAb02	KDF-30	TL-WR940N	V1(120201)
KDF-4	Archer C20i(UN)	V1	KDF-31	TL-WR940N	V3(161107)
KDF-5	TL-WR841N	V10	KDF-32	TL-WR940N	V4(eu)
KDF-6	DIR-846 A1	v1.0.0(A43)	KDF-33	TL-WR940N	V6
KDF-7	TL-WR740N	V1	KDF-34	TL-WR940N	V3(151102)
KDF-8	TL-WR810N	V2	KDF-35	TL-WR841N	V7(120201)
KDF-9	TL-WR840N	V4	KDF-36	Archer-C2	v1(0.9.1_4.0)
KDF-10	TL-WR845N	V3	KDF-37	Archer-C2	v1(0.9.1_4.1)
KDF-11	TL-WR841N	V8(3_15_9)	KDF-38	Archer-C20	v1(0.9.1_0.2)
KDF-12	TL-WR940N	V2	KDF-39	DIR-823G	v1.02B01
KDF-13	TL-WR1043	V2(150717)	KDF-40	TL-WR743ND	V1(110829)
KDF-14	TL-WR841N	V9(150104)	KDF-41	TL-WR743ND	V1(111212)
KDF-15	TL-WR841N	V9(150310)	KDF-42	TL-WR841N	V7(111228)
KDF-16	Archer_C2	V1	KDF-43	TL-WR940N	V1(111228)
KDF-17	TL-WR841N	V8(3_16_9)	KDF-44	TL-WR940N	V2(3_15_9)
KDF-18	TL-WR841N	V9(us_150401)	KDF-45	TL-WR1042N	V1(120618)
KDF-19	DIR-615 C1	v3.14NA	KDF-46	TL-WR1042N	V1(130117)
KDF-20	TL-WR940N	V1(111228)	KDF-47	DIR-823G	v1.00B02
KDF-21	TL-WR941ND	V6	KDF-48	DIR-823G	v1.0.2B05
KDF-22	TL-WR1043	V2(150910)	KDF-49	TL-WR902AC_US	V1
KDF-23	TL-WR810N	V1	KDF-50	TL-WR1042ND	V1
KDF-24	TL-WR743ND	V1(130109)	KDF-51	TL-WR1041N	V2
KDF-25	Archer-C2	V1	KDF-52	TL-WR841ND	V9
KDF-26	Archer-C20	V1	KDF-53	Archer C20i(EU)	V1_220107
KDF-27	Archer-C50	V1	KDF-54	DIR-846 A1	v1.0.0(A35)

Table A2. Ground truth of the homologous vulnerability correlation.

Vulnerability ID	Reference Firmware Image ID	Associated Firmware Image IDs
CVE-2023-36356	KDF-23	KDF-32	KDF-29	KDF-11	KDF-50	KDF-20	KDF-49
		KDF-1	KDF-35	KDF-15	KDF-45	KDF-5	KDF-33
		KDF-14	KDF-43	KDF-7	KDF-21	KDF-52	KDF-24
		KDF-34	KDF-41	KDF-13	KDF-51	KDF-17	KDF-46
		KDF-18	KDF-12	KDF-8	KDF-44	KDF-22	KDF-31
		KDF-40	KDF-30	KDF-42	KDF-28
CVE-2023-37080	KDF-17	KDF-32	KDF-23	KDF-29	KDF-11	KDF-50	KDF-20
		KDF-49	KDF-1	KDF-35	KDF-15	KDF-45	KDF-5
		KDF-33	KDF-14	KDF-43	KDF-7	KDF-21	KDF-52
		KDF-24	KDF-34	KDF-41	KDF-13	KDF-51	KDF-46
		KDF-18	KDF-12	KDF-8	KDF-44	KDF-22	KDF-31
		KDF-40	KDF-30	KDF-42
CVE-2023-36358	KDF-18	KDF-32	KDF-23	KDF-29	KDF-11	KDF-50	KDF-20
		KDF-49	KDF-1	KDF-35	KDF-15	KDF-45	KDF-5
		KDF-33	KDF-14	KDF-43	KDF-7	KDF-21	KDF-52
		KDF-24	KDF-34	KDF-41	KDF-13	KDF-51	KDF-17
		KDF-46	KDF-12	KDF-8	KDF-44	KDF-22	KDF-31
		KDF-40	KDF-30	KDF-42
CVE-2023-39745	KDF-31	KDF-32	KDF-23	KDF-11	KDF-29	KDF-50	KDF-20
		KDF-49	KDF-1	KDF-35	KDF-15	KDF-45	KDF-5
		KDF-33	KDF-14	KDF-43	KDF-7	KDF-21	KDF-52
		KDF-24	KDF-34	KDF-41	KDF-13	KDF-51	KDF-17
		KDF-46	KDF-18	KDF-12	KDF-8	KDF-44	KDF-22
		KDF-40	KDF-30	KDF-42
CNVD-2023-48042	KDF-32	KDF-23	KDF-11	KDF-29	KDF-50	KDF-20	KDF-49
		KDF-1	KDF-35	KDF-15	KDF-45	KDF-5	KDF-33
		KDF-14	KDF-43	KDF-7	KDF-21	KDF-52	KDF-24
		KDF-34	KDF-41	KDF-13	KDF-51	KDF-17	KDF-46
		KDF-18	KDF-12	KDF-8	KDF-44	KDF-22	KDF-31
		KDF-40	KDF-30	KDF-42
CVE-2023-33536	KDF-1	KDF-32	KDF-23	KDF-11	KDF-29	KDF-50	KDF-20
		KDF-49	KDF-35	KDF-15	KDF-45	KDF-5	KDF-33
		KDF-14	KDF-43	KDF-7	KDF-21	KDF-52	KDF-24
		KDF-34	KDF-41	KDF-13	KDF-51	KDF-17	KDF-46
		KDF-18	KDF-12	KDF-8	KDF-44	KDF-22	KDF-31
		KDF-40	KDF-30	KDF-42
CVE-2023-36354	KDF-12	KDF-32	KDF-23	KDF-11	KDF-29	KDF-50	KDF-20
		KDF-1	KDF-49	KDF-35	KDF-15	KDF-45	KDF-5
		KDF-33	KDF-14	KDF-43	KDF-7	KDF-21	KDF-24
		KDF-52	KDF-34	KDF-41	KDF-13	KDF-51	KDF-17
		KDF-46	KDF-18	KDF-8	KDF-44	KDF-22	KDF-31
		KDF-40	KDF-30	KDF-42
CVE-2023-37081	KDF-50	KDF-32	KDF-23	KDF-11	KDF-29	KDF-20	KDF-1
		KDF-49	KDF-35	KDF-15	KDF-45	KDF-5	KDF-33
		KDF-14	KDF-43	KDF-7	KDF-21	KDF-24	KDF-52
		KDF-34	KDF-41	KDF-13	KDF-51	KDF-17	KDF-46
		KDF-12	KDF-18	KDF-8	KDF-44	KDF-22	KDF-31
		KDF-40	KDF-30	KDF-42
CVE-2023-36357	KDF-34	KDF-32	KDF-23	KDF-11	KDF-29	KDF-50	KDF-20
		KDF-1	KDF-49	KDF-35	KDF-15	KDF-45	KDF-5
		KDF-33	KDF-14	KDF-43	KDF-7	KDF-21	KDF-24
		KDF-52	KDF-41	KDF-51	KDF-13	KDF-17	KDF-46
		KDF-12	KDF-18	KDF-8	KDF-44	KDF-22	KDF-31
		KDF-40	KDF-30	KDF-42
CVE-2023-37079	KDF-8	KDF-32	KDF-11	KDF-23	KDF-29	KDF-50	KDF-20
		KDF-1	KDF-49	KDF-35	KDF-15	KDF-45	KDF-5
		KDF-33	KDF-14	KDF-43	KDF-7	KDF-24	KDF-52
		KDF-34	KDF-41	KDF-51	KDF-13	KDF-17	KDF-46
		KDF-12	KDF-18	KDF-44	KDF-22	KDF-31	KDF-40
		KDF-30	KDF-42
CVE-2023-37082	KDF-13	KDF-32	KDF-11	KDF-23	KDF-29	KDF-50	KDF-20
		KDF-1	KDF-49	KDF-35	KDF-15	KDF-45	KDF-5
		KDF-33	KDF-14	KDF-43	KDF-7	KDF-21	KDF-24
		KDF-52	KDF-34	KDF-51	KDF-17	KDF-46	KDF-12
		KDF-18	KDF-8	KDF-44	KDF-22	KDF-40	KDF-31
		KDF-30	KDF-42
CVE-2023-37083	KDF-51	KDF-32	KDF-11	KDF-23	KDF-29	KDF-50	KDF-20
		KDF-1	KDF-49	KDF-35	KDF-15	KDF-45	KDF-5
		KDF-33	KDF-14	KDF-43	KDF-7	KDF-21	KDF-24
		KDF-52	KDF-34	KDF-13	KDF-17	KDF-46	KDF-12
		KDF-18	KDF-8	KDF-44	KDF-22	KDF-40	KDF-31
		KDF-30	KDF-42
CVE-2023-33537	KDF-7	KDF-32	KDF-11	KDF-23	KDF-29	KDF-50	KDF-20
		KDF-1	KDF-49	KDF-35	KDF-15	KDF-45	KDF-5
		KDF-33	KDF-14	KDF-43	KDF-21	KDF-24	KDF-52
		KDF-34	KDF-51	KDF-13	KDF-17	KDF-46	KDF-12
		KDF-18	KDF-8	KDF-44	KDF-22	KDF-40	KDF-31
		KDF-30	KDF-42
CVE-2023-36359	KDF-11	KDF-32	KDF-23	KDF-29	KDF-50	KDF-20	KDF-1
		KDF-49	KDF-35	KDF-15	KDF-45	KDF-5	KDF-33
		KDF-14	KDF-43	KDF-21	KDF-52	KDF-34	KDF-13
		KDF-17	KDF-46	KDF-12	KDF-18	KDF-8	KDF-44
		KDF-22	KDF-31	KDF-30	KDF-42
CNVD-2021-81545	KDF-40	KDF-11	KDF-29	KDF-50	KDF-20	KDF-35	KDF-15
		KDF-45	KDF-5	KDF-7	KDF-43	KDF-14	KDF-24
		KDF-52	KDF-41	KDF-51	KDF-17	KDF-46	KDF-12
		KDF-18	KDF-44	KDF-30	KDF-42
CNVD-2021-81533	KDF-52	KDF-11	KDF-29	KDF-50	KDF-20	KDF-35	KDF-15
		KDF-45	KDF-5	KDF-7	KDF-43	KDF-14	KDF-24
		KDF-41	KDF-51	KDF-17	KDF-46	KDF-12	KDF-44
		KDF-40	KDF-30	KDF-42
CVE-2024-9284	KDF-5	KDF-32	KDF-23	KDF-29	KDF-1	KDF-49	KDF-15
		KDF-33	KDF-14	KDF-52	KDF-34	KDF-13	KDF-17
		KDF-12	KDF-18	KDF-8	KDF-22	KDF-31
CVE-2017-13772	KDF-49	KDF-32	KDF-23	KDF-29	KDF-1	KDF-15	KDF-5
		KDF-33	KDF-14	KDF-52	KDF-34	KDF-13	KDF-17
		KDF-12	KDF-18	KDF-8	KDF-22	KDF-31
CVE-2019-6989	KDF-15	KDF-32	KDF-23	KDF-29	KDF-1	KDF-49	KDF-5
		KDF-33	KDF-14	KDF-52	KDF-34	KDF-13	KDF-17
		KDF-12	KDF-18	KDF-8	KDF-22	KDF-31
CNVD-2021-35879	KDF-29	KDF-32	KDF-23	KDF-1	KDF-49	KDF-15	KDF-5
		KDF-33	KDF-14	KDF-52	KDF-34	KDF-13	KDF-17
		KDF-12	KDF-18	KDF-8	KDF-22	KDF-31
CVE-2020-8423	KDF-14	KDF-32	KDF-23	KDF-29	KDF-1	KDF-49	KDF-15
		KDF-5	KDF-33	KDF-52	KDF-34	KDF-13	KDF-17
		KDF-12	KDF-18	KDF-8	KDF-22	KDF-31
CVE-2023-33538	KDF-22	KDF-32	KDF-23	KDF-29	KDF-1	KDF-49	KDF-15
		KDF-5	KDF-33	KDF-14	KDF-52	KDF-34	KDF-13
		KDF-17	KDF-12	KDF-18	KDF-8	KDF-31
CVE-2024-46313	KDF-33	KDF-32	KDF-23	KDF-29	KDF-1	KDF-49	KDF-15
		KDF-5	KDF-14	KDF-52	KDF-34	KDF-13	KDF-17
		KDF-12	KDF-18	KDF-8	KDF-22	KDF-31
CVE-2014-9350	KDF-46	KDF-11	KDF-50	KDF-20	KDF-35	KDF-45	KDF-7
		KDF-43	KDF-24	KDF-41	KDF-51	KDF-44	KDF-40
		KDF-30	KDF-42
CVE-2021-44864	KDF-11	KDF-50	KDF-20	KDF-35	KDF-45	KDF-7	KDF-43
		KDF-24	KDF-41	KDF-51	KDF-46	KDF-44	KDF-40
		KDF-30	KDF-42
CVE-2021-26827	KDF-30	KDF-50	KDF-20	KDF-35	KDF-45	KDF-7	KDF-43
		KDF-24	KDF-41	KDF-51	KDF-46	KDF-40	KDF-42
CVE-2023-39748	KDF-45	KDF-51	KDF-50	KDF-20	KDF-35	KDF-7	KDF-43
		KDF-24	KDF-46	KDF-40	KDF-30	KDF-42
CVE-2021-29302	KDF-26	KDF-36	KDF-53	KDF-38	KDF-10	KDF-25	KDF-37
		KDF-16	KDF-27	KDF-4	KDF-9
CVE-2022-26641	KDF-16	KDF-36	KDF-53	KDF-38	KDF-26	KDF-10	KDF-25
		KDF-37	KDF-27	KDF-4	KDF-9
CVE-2022-25062	KDF-38	KDF-36	KDF-53	KDF-26	KDF-10	KDF-25	KDF-37
		KDF-16	KDF-27	KDF-4	KDF-9
CVE-2022-26640	KDF-37	KDF-36	KDF-53	KDF-38	KDF-26	KDF-10	KDF-25
		KDF-16	KDF-27	KDF-4	KDF-9
CVE-2022-24355	KDF-18	KDF-1	KDF-49	KDF-15	KDF-5	KDF-14	KDF-52
		KDF-17	KDF-12
CVE-2022-26639	KDF-4	KDF-36	KDF-53	KDF-26	KDF-10	KDF-25	KDF-37
		KDF-16	KDF-9
CVE-2022-25064	KDF-36	KDF-26	KDF-10	KDF-25	KDF-37	KDF-16	KDF-4
		KDF-9
CVE-2022-42156	KDF-6	KDF-47	KDF-48	KDF-2	KDF-39	KDF-54
CVE-2021-46314	KDF-54	KDF-47	KDF-48	KDF-6	KDF-39	KDF-2
CVE-2018-16408	KDF-6	KDF-47	KDF-48	KDF-39	KDF-54	KDF-2
CVE-2023-51984	KDF-48	KDF-47	KDF-6	KDF-39	KDF-54	KDF-2
CVE-2024-41622	KDF-47	KDF-6	KDF-39	KDF-54	KDF-2
CVE-2023-33735	KDF-6	KDF-39	KDF-54	KDF-2
CVE-2020-25367	KDF-39	KDF-48	KDF-47	KDF-2
CVE-2023-26613	KDF-48	KDF-47	KDF-39	KDF-2
CVE-2019-15528	KDF-2	KDF-48	KDF-47	KDF-39
CVE-2022-43109	KDF-39	KDF-48	KDF-47	KDF-2
CVE-2018-17880	KDF-47	KDF-48	KDF-39	KDF-2
CVE-2019-13481	KDF-48	KDF-47	KDF-39	KDF-2
CVE-2019-15529	KDF-48	KDF-47	KDF-39	KDF-2
CVE-2019-15530	KDF-47	KDF-48	KDF-39	KDF-2
CVE-2019-7298	KDF-2	KDF-48	KDF-47	KDF-39
CVE-2021-43474	KDF-47	KDF-48	KDF-39	KDF-2
CVE-2018-19986	KDF-39	KDF-48	KDF-47	KDF-2
CVE-2018-19989	KDF-2	KDF-48	KDF-47	KDF-39
CVE-2018-19990	KDF-2	KDF-48	KDF-47	KDF-39
CVE-2020-25368	KDF-39	KDF-48	KDF-47	KDF-2
CVE-2018-17787	KDF-39	KDF-48	KDF-47	KDF-2
CVE-2019-12786	KDF-48	KDF-47	KDF-39	KDF-2
CVE-2019-15526	KDF-2	KDF-48	KDF-47	KDF-39
CVE-2019-13128	KDF-48	KDF-47	KDF-39	KDF-2
CVE-2018-19987	KDF-47	KDF-48	KDF-39	KDF-2
CVE-2019-13482	KDF-39	KDF-48	KDF-47	KDF-2
CVE-2023-36355	KDF-32	KDF-1	KDF-31
CVE-2023-39746	KDF-24	KDF-7	KDF-40
CVE-2018-19988	KDF-48	KDF-39	KDF-2
CVE-2022-44808	KDF-39	KDF-48	KDF-2
CVE-2019-12787	KDF-2	KDF-48	KDF-39
CVE-2019-7297	KDF-2	KDF-47	KDF-39
CVE-2020-25366	KDF-2	KDF-48	KDF-39
CVE-2023-39747	KDF-11	KDF-44
CVE-2019-17510	KDF-54	KDF-6
CVE-2023-43284	KDF-6	KDF-54
CVE-2020-10215	KDF-3	KDF-19
CVE-2020-10216	KDF-19	KDF-3
CVE-2022-25061	KDF-10	KDF-9
CVE-2022-46642	KDF-6	KDF-54
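For readers who want to load the ground truth in Table A2 programmatically, the sketch below parses tab-separated rows into a mapping from vulnerability ID to its reference image and associated images. It is an illustrative helper, not part of the released dataset tooling: the `GroundTruthRow` type and the `parse_ground_truth` function are hypothetical names, and the parser assumes that a row with an empty first cell continues the previous vulnerability. Rows that repeat the vulnerability ID and reference image are merged as well.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class GroundTruthRow:
    reference: str
    associated: List[str] = field(default_factory=list)


def parse_ground_truth(text: str) -> Dict[str, GroundTruthRow]:
    """Parse tab-separated ground-truth rows, merging continuation lines."""
    table: Dict[str, GroundTruthRow] = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        cells = [cell.strip() for cell in line.split("\t")]
        if cells[0]:
            # New vulnerability, or a row that repeats an ID already seen.
            if len(cells) < 2 or not cells[1]:
                raise ValueError(f"missing reference image: {line!r}")
            current = cells[0]
            row = table.setdefault(current, GroundTruthRow(reference=cells[1]))
            if row.reference != cells[1]:
                raise ValueError(f"conflicting reference for {current}")
            row.associated.extend(c for c in cells[2:] if c)
        else:
            # Continuation line: the ID and reference cells are empty.
            if current is None:
                raise ValueError("continuation line before any vulnerability ID")
            table[current].associated.extend(c for c in cells[1:] if c)
    return table


# Two entries copied from Table A2.
SAMPLE = (
    "CVE-2014-9350\tKDF-46\tKDF-11\tKDF-50\tKDF-20\tKDF-35\tKDF-45\tKDF-7\n"
    "\t\tKDF-43\tKDF-24\tKDF-41\tKDF-51\tKDF-44\tKDF-40\n"
    "\t\tKDF-30\tKDF-42\n"
    "CVE-2023-39747\tKDF-11\tKDF-44\n"
)
ground_truth = parse_ground_truth(SAMPLE)
```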

References

Friha, O.; Ferrag, M.A.; Shu, L.; Maglaras, L.; Wang, X. Internet of Things for the Future of Smart Agriculture: A Comprehensive Survey of Emerging Technologies. IEEE/CAA J. Autom. Sin. 2021, 8, 718–752.
Travel Routers, NAS Devices among Easily Hacked IoT Devices. Available online: https://threatpost.com/travel-routers-nas-devices-among-easily-hacked-iot-devices/124877/ (accessed on 29 July 2025).
Eceiza, M.; Flores, J.L.; Iturbe, M. Fuzzing the Internet of Things: A Review on the Techniques and Challenges for Efficient Vulnerability Discovery in Embedded Systems. IEEE Internet Things J. 2021, 8, 10390–10411.
He, D.; Yu, X.; Li, T.; Chan, S.; Guizani, M. Firmware Vulnerabilities Homology Detection Based on Clonal Selection Algorithm for IoT Devices. IEEE Internet Things J. 2022, 9, 16438–16445.
Kambourakis, G.; Kolias, C.; Stavrou, A. The Mirai Botnet and the IoT Zombie Armies. In Proceedings of the MILCOM 2017-2017 IEEE Military Communications Conference (MILCOM), Baltimore, MD, USA, 23–25 October 2017; pp. 267–272.
Marzano, A.; Alexander, D.; Fonseca, O.; Fazzion, E.; Hoepers, C.; Steding-Jessen, K.; Chaves, M.H.P.C.; Cunha, Í.; Guedes, D.; Meira, W. The Evolution of Bashlite and Mirai IoT Botnets. In Proceedings of the 2018 IEEE Symposium on Computers and Communications (ISCC), Natal, Brazil, 25–28 June 2018; pp. 00813–00818.
Feng, Q.; Zhou, R.; Xu, C.; Cheng, Y.; Testa, B.; Yin, H. Scalable Graph-based Bug Search for Firmware Images. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security (CCS ’16), Vienna, Austria, 24–28 October 2016; Association for Computing Machinery: New York, NY, USA, 2016; pp. 480–491.
Zhao, B.; Ji, S.; Xu, J.; Tian, Y.; Wei, Q.; Wang, Q.; Lyu, C.; Zhang, X.; Lin, C.; Wu, J.; et al. A large-scale empirical analysis of the vulnerabilities introduced by third-party components in IoT firmware. In Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2022), Virtual, Republic of Korea, 18–22 July 2022; Association for Computing Machinery: New York, NY, USA, 2022; pp. 442–454.
Xiao, H.; Zhang, Y.; Shen, M.; Lin, C.; Zhang, C.; Liu, S.; Yang, M. Accurate and Efficient Recurring Vulnerability Detection for IoT Firmware. In Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security (CCS ’24), Salt Lake City, UT, USA, 14–18 October 2024; Association for Computing Machinery: Salt Lake City, UT, USA, 2024; pp. 3317–3331.
Shoshitaishvili, Y.; Wang, R.; Hauser, C.; Kruegel, C.; Vigna, G. Firmalice-automatic detection of authentication bypass vulnerabilities in binary firmware. In Proceedings of the Network and Distributed Systems Security (NDSS) Symposium 2015, San Diego, CA, USA, 8–11 February 2015.
Ding, S.H.H.; Fung, B.C.M.; Charland, P. Asm2Vec: Boosting Static Representation Robustness for Binary Clone Search against Code Obfuscation and Compiler Optimization. In Proceedings of the 2019 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 19–23 May 2019; pp. 472–489.
Yu, Y.; Gan, S.; Qin, X. FirmVulSeeker—BERT and Siamese Network-Based Vulnerability Search for Embedded Device Firmware Images. J. Internet Things 2022, 4, 1–20.
Ye, J.; Fei, X.; De Carnavalet, X.D.C.; Zhao, L.; Wu, L.; Zhang, M. Detecting Command Injection Vulnerabilities in Linux-Based Embedded Firmware with LLM-Based Taint Analysis of Library Functions. Comput. Secur. 2024, 144, 103971.
Qasem, A.; Shirani, P.; Debbabi, M.; Wang, L.; Lebel, B.; Agba, B.L. Automatic Vulnerability Detection in Embedded Devices and Firmware: Survey and Layered Taxonomies. ACM Comput. Surv. 2021, 54, 1–25.
Feng, X.; Zhu, X.; Han, Q.-L.; Zhou, W.; Wen, S.; Xiang, Y. Detecting Vulnerability on IoT Device Firmware: A Survey. IEEE/CAA J. Autom. Sin. 2023, 10, 25–41.
Yun, J.; Rustamov, F.; Kim, J.; Shin, Y. Fuzzing of Embedded Systems: A Survey. ACM Comput. Surv. 2022, 55, 1–137.
Feng, X.; Sun, R.; Zhu, X.; Xue, M.; Wen, S.; Liu, D.; Nepal, S.; Xiang, Y. Snipuzz: Black-Box Fuzzing of IoT Firmware via Message Snippet Inference. In Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security, Virtual Event, Republic of Korea, 15–19 November 2021; pp. 337–350.
Qin, C.; Peng, J.; Liu, P.; Zheng, Y.; Cheng, K.; Zhang, W.; Sun, L. UCRF: Static Analyzing Firmware to Generate under-Constrained Seed for Fuzzing SOHO Router. Comput. Secur. 2023, 128, 103157.
Bakhshi, T.; Ghita, B.; Kuzminykh, I. A Review of IoT Firmware Vulnerabilities and Auditing Techniques. Sensors 2024, 24, 708.
Ul Haq, S.; Singh, Y.; Sharma, A.; Gupta, R.; Gupta, D. A Survey on IoT & Embedded Device Firmware Security: Architecture, Extraction Techniques, and Vulnerability Analysis Frameworks. Discov. Internet Things 2023, 3, 17.
Zhu, X.; Wen, S.; Camtepe, S.; Xiang, Y. Fuzzing: A Survey for Roadmap. ACM Comput. Surv. 2022, 54, 1–36.
Manès, V.J.; Han, H.; Han, C.; Cha, S.K.; Egele, M.; Schwartz, E.J.; Woo, M. The Art, Science, and Engineering of Fuzzing: A Survey. IEEE Trans. Softw. Eng. 2019, 47, 2312–2331.
Liang, H.; Pei, X.; Jia, X.; Shen, W.; Zhang, J. Fuzzing: State of the Art. IEEE Trans. Reliab. 2018, 67, 1199–1218.
Bellard, F. QEMU, a Fast and Portable Dynamic Translator. In Proceedings of the USENIX Annual Technical Conference, Anaheim, CA, USA, 13–15 April 2005.
Chen, D.D.; Egele, M.; Woo, M.; Brumley, D. Towards Automated Dynamic Analysis for Linux-based Embedded Firmware. In Proceedings of the Network and Distributed Systems Security (NDSS) Symposium 2016, San Diego, CA, USA, 21–24 February 2016.
Kim, M.; Kim, D.; Kim, E.; Kim, S.; Jang, Y.; Kim, Y. FirmAE: Towards Large-Scale Emulation of IoT Firmware for Dynamic Analysis. In Proceedings of the Annual Computer Security Applications Conference, Austin, TX, USA, 7–11 December 2020.
Zhang, C.; Wang, Y.; Wang, L. Firmware Fuzzing: The State of the Art. In Proceedings of the 12th Asia-Pacific Symposium on Internetware, Singapore, Singapore, 1–3 November 2020; pp. 110–115.
Zheng, Y.; Davanian, A.; Yin, H.; Song, C.; Zhu, H.; Sun, L. FIRM-AFL: High-Throughput Greybox Fuzzing of IoT Firmware via Augmented Process Emulation. In Proceedings of the 28th USENIX Security Symposium (USENIX Security 19), Santa Clara, CA, USA, 14–16 August 2019; pp. 1099–1114.
Yin, Q.; Zhou, X.; Zhang, H. Firmhunter: State-Aware and Introspection-Driven Grey-Box Fuzzing towards IoT Firmware. Appl. Sci. 2021, 11, 9094.
Zhang, Y.; Huo, W.; Jian, K.; Shi, J.; Lu, H.; Liu, L.; Wang, C.; Sun, D.; Zhang, C.; Liu, B. SRFuzzer: An Automatic Fuzzing Framework for Physical SOHO Router Devices to Discover Multi-Type Vulnerabilities. In Proceedings of the 35th Annual Computer Security Applications Conference, San Juan, PR, USA, 9–13 December 2019.
Cheng, Y.; Fan, W.; Huang, W.; Yang, J.; Yu, G.; Liu, W. MSLFuzzer: Black-Box Fuzzing of SOHO Router Devices via Message Segment List Inference. Cybersecurity 2023, 6, 51.
Wang, D.; Zhang, X.; Chen, T.; Li, J. Discovering Vulnerabilities in COTS IoT Devices through Blackbox Fuzzing Web Management Interface. Secur. Commun. Netw. 2019, 2019, 1–19.
Costin, A.; Zarras, A.; Francillon, A. Automated Dynamic Firmware Analysis at Scale: A Case Study on Embedded Web Interfaces. In Proceedings of the 11th ACM on Asia Conference on Computer and Communications Security, Xi’an, China, 30 May–3 June 2016; pp. 437–448.
Wu, Y.; Wang, J.; Wang, Y.; Zhai, S.; Li, Z.; He, Y.; Sun, K.; Li, Q.; Zhang, N. Your Firmware Has Arrived: A Study of Firmware Update Vulnerabilities. In Proceedings of the 33rd USENIX Security Symposium (USENIX Security 24), Philadelphia, PA, USA, 14–16 August 2024; pp. 5627–5644.
Wang, Y.; Shen, J.; Lin, J.; Lou, R. Staged Method of Code Similarity Analysis for Firmware Vulnerability Detection. IEEE Access 2019, 7, 14171–14185.
Thomas, S.L.; Chothia, T.; Garcia, F.D. Stringer: Measuring the Importance of Static Data Comparisons to Detect Backdoors and Undocumented Functionality. In Proceedings of the 22nd European Symposium on Research in Computer Security, Oslo, Norway, 11–15 September 2017; pp. 513–531.
Chen, Y.; Li, H.; Zhao, W.; Zhang, L.; Liu, Z.; Shi, Z. IHB: A Scalable and Efficient Scheme to Identify Homologous Binaries in IoT Firmwares. In Proceedings of the 2017 IEEE 36th International Performance Computing and Communications Conference (IPCCC), San Diego, CA, USA, 10–12 October 2017; pp. 1–8.
Tien, C.-W.; Tsai, T.-T.; Chen, Y.; Kuo, S.-Y. UFO-Hidden Backdoor Discovery and Security Verification in IoT Device Firmware. In Proceedings of the 2018 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW), Memphis, TN, USA, 15–18 October 2018; pp. 18–23.
Chen, L.; Wang, Y.; Cai, Q.; Zhan, Y.; Hu, H.; Linghu, J.; Hou, Q.; Zhang, C.; Duan, H.; Xue, Z. Sharing More and Checking Less: Leveraging Common Input Keywords to Detect Bugs in Embedded Systems. In Proceedings of the 30th USENIX Security Symposium (USENIX Security 21), Online, 11–13 August 2021; pp. 303–319.
Kornblum, J. Identifying Almost Identical Files Using Context Triggered Piecewise Hashing. Digit. Investig. 2006, 3, 91–97.
Binwalk. A Firmware Analysis Tool. Available online: https://github.com/ReFirmLabs/Binwalk (accessed on 29 July 2025).
Firmwalker. Script for Searching the Extracted Firmware File System for Goodies. Available online: https://github.com/craigz28/firmwalker (accessed on 29 July 2025).
Ghidra. A Software Reverse Engineering (SRE) Framework. Available online: https://github.com/NationalSecurityAgency/ghidra (accessed on 29 July 2025).
Li, S.; Wang, Y.; Dong, C.; Yang, S.; Li, H.; Sun, H.; Lang, Z.; Chen, Z.; Wang, W.; Zhu, H.; et al. LibAM: An Area Matching Framework for Detecting Third-Party Libraries in Binaries. ACM Trans. Softw. Eng. Methodol. 2024, 33, 1–35.
Cheng, Y.; Li, X.; Mao, Z.; Fan, W.; Huang, W.; Liu, W. IoTBenchSL: A Streamlined Framework for the Efficient Production of Standardized IoT Benchmarks with Automated Pipeline Validation. Electronics 2025, 14, 856.
FirmEmuHub. An Open-Source Firmware Benchmark Dataset Generated by the IoTBenchSL Tool. Available online: https://github.com/a101e-lab/FirmEmuHub (accessed on 29 July 2025).
IoTVulBench. An Open-Source Benchmark Dataset for IoT Security Research. Available online: https://github.com/a101e-lab/IoTVulBench (accessed on 29 July 2025).

Figure 1. Overview of FirmVulLinker.

Figure 2. Radar chart of FirmVulLinker similarity scores for representative firmware pairs.

Figure 3. Firmware similarity heatmap based on multi-dimensional profiling.

Figure 4. Performance impact of removing each semantic profile dimension.

Table 1. Comparison of semantic feature coverage across firmware analysis tools.

Feature Dimension	FirmVulLinker	Firmalice	FirmRec	Asm2Vec	SaTC	FirmSec
Unpacking Signatures	✔	✘	✘	✘	✘	✘
Filesystem Structure	✔	✘	✘	✘	✘	✘
Communication Interfaces	✔	✘	✔	✘	✔	✘
Binary Feature Modeling	✔	✔	✔	✔	✔	✔
Sensitive Call Chains	✔	✔	✔	✘	✔	✘
Vulnerability Discovery	✔	✔	✔	✘	✔	✔

✔: Feature present. ✘: Feature absent.

Table 2. Performance comparison on homologous vulnerability correlation accuracy.

Metric	FirmVulLinker	LibAM
Precision	0.9564	0.7825
False-Negative Rate	0.0215	0.0745
False-Positive Rate	0.0305	0.4379

Table 3. Metadata of representative UAF.

UAF ID	Vendor	Device Type	Firmware Version
UAF-1	D-Link	DIR-822 B1	v2.02KRb06
UAF-2	TP-Link	TL-WR802N	V2
UAF-3	TP-Link	Archer-C2	V5
UAF-4	D-Link	DIR846	100A43
UAF-5	D-Link	DIR-823G	v1.01B02
UAF-6	D-Link	DIR-825 B1	v2.02SSb13
UAF-7	TP-Link	TL-WR940N	v3(151102)
UAF-8	TP-Link	TL-WR743ND	v1(110829)
UAF-9	TP-Link	TL-WR940N	v2(140627)

Table 4. Matched UAF and their associated vulnerabilities identified via FirmVulLinker.

UAF ID	Associated Vulnerabilities
UAF-1	CVE-2020-27600	CVE-2019-17510	CVE-2021-46314	CVE-2021-46315
UAF-1	CVE-2022-46641	CVE-2022-46642
UAF-2	CVE-2023-36356	CVE-2014-9350	CVE-2021-44864	CVE-2024-9284
UAF-2	CVE-2023-33536	CVE-2017-13772	CVE-2023-37080	CVE-2023-36354
UAF-2	CVE-2023-36358	CVE-2023-36357	CVE-2023-37082	CVE-2023-37083
UAF-2	CVE-2019-6989	CVE-2023-36359	CNVD-2021-35879	CVE-2020-8423
UAF-2	CVE-2023-39745	CNVD-2023-48042	CVE-2023-37079	CVE-2023-37081
UAF-2	CVE-2023-33537	CVE-2023-39747	CVE-2023-33538	CVE-2024-46313
UAF-3	CVE-2021-29302	CVE-2022-26641	CVE-2022-25062	CVE-2022-26640
UAF-3	CVE-2022-25061	CVE-2022-26639	CVE-2022-25064
UAF-4	CVE-2020-27600	CVE-2022-46641	CVE-2019-17510	CVE-2021-46314
UAF-4	CVE-2021-46315	CVE-2022-46642
UAF-5	CVE-2019-7297	CVE-2019-7298	CVE-2019-15528	CVE-2019-15529
UAF-5	CVE-2020-25367	CVE-2020-25368	CVE-2021-43474	CVE-2022-43109
UAF-5	CVE-2019-15530	CVE-2020-25366	CVE-2023-26613
UAF-6	CVE-2020-10215	CVE-2020-10213	CVE-2020-10216	CVE-2020-10214
UAF-7	CVE-2023-36356	CVE-2024-9284	CVE-2023-39745	CNVD-2023-48042
UAF-7	CVE-2017-13772	CVE-2023-37080	CVE-2023-36354	CVE-2023-37079
UAF-7	CVE-2023-36358	CVE-2023-36357	CVE-2023-37082	CVE-2023-37083
UAF-7	CVE-2023-36359	CNVD-2021-35879	CVE-2020-8423	CVE-2023-33538
UAF-7	CVE-2023-36355	CVE-2023-33536	CVE-2023-37081	CVE-2022-24355
UAF-7	CVE-2023-33537	CVE-2019-6989	CVE-2024-46313
UAF-8	CVE-2023-36356	CVE-2014-9350	CVE-2021-44864	CVE-2023-39745
UAF-8	CVE-2023-37080	CVE-2023-36354	CVE-2021-26827	CNVD-2021-81545
UAF-8	CVE-2023-37081	CVE-2023-36358	CVE-2023-36357	CVE-2023-37082
UAF-8	CVE-2023-39746	CVE-2023-39748	CNVD-2023-48042	CVE-2023-33536
UAF-8	CNVD-2021-81533	CVE-2023-37079	CVE-2023-37083	CVE-2023-33537
UAF-9	CVE-2023-36356	CVE-2014-9350	CVE-2021-44864	CVE-2023-39745
UAF-9	CVE-2023-37080	CVE-2023-36354	CNVD-2021-81545	CNVD-2021-81533
UAF-9	CVE-2023-36358	CVE-2023-36357	CVE-2023-37082	CVE-2023-37083
UAF-9	CVE-2023-36359	CNVD-2023-48042	CVE-2023-33536	CVE-2023-33537
UAF-9	CVE-2023-37079	CVE-2023-37081	CVE-2023-39747

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
