1. Introduction
The rapid growth of the Internet of Things (IoT) has transformed industrial environments by enabling continuous monitoring, automation, and data-driven decision-making across the manufacturing, logistics, energy, and process control domains. Among these technologies, computer vision (CV) has emerged as a critical enabler of intelligent industrial systems. High-resolution cameras, edge processors, and AI-based visual analytics now support applications such as quality inspection, defect detection, robotic guidance, workplace safety monitoring, and equipment anomaly detection [1,2,3]. These applications often operate continuously and must deliver reliable, real-time performance under harsh and dynamic factory conditions.
Despite their importance, deploying computer vision solutions in Industrial IoT settings remains challenging. Modern factory floors contain highly heterogeneous devices ranging from resource-constrained embedded platforms to industrial PCs and GPU-enabled workstations, with significant variation in computational capacity, energy constraints, model support, and communication capabilities. At the same time, industrial networks are exposed to bandwidth fluctuations, interference, and congestion, which can degrade inference latency, increase frame loss, and destabilize throughput [4,5,6]. Ensuring that the “right” device processes a given vision task at the “right” time is therefore a non-trivial problem, especially when multiple devices are capable of performing the same task.
A further challenge lies in trust and security. Conventional Industrial IoT deployments typically rely on static configurations, fixed routing, or centralized schedulers that do not explicitly account for device reliability, historical behavior, or quality of service delivered over time. For computer vision workloads, an incorrect, delayed, or malicious result can lead directly to production defects, safety incidents, or regulatory non-compliance. While prior research on IoT and the Social Internet of Things (SIoT) has proposed various trust and reputation models, including graph-based trust computation and blockchain-assisted trust management, these approaches are rarely tailored to vision-centric metrics such as detection accuracy, frame-level latency, jitter, or throughput, nor are they evaluated in realistic industrial CV scenarios with heterogeneous hardware [7,8,9,10,11].
The Social Internet of Things (SIoT) [12,13] introduces a complementary perspective by treating devices as socially aware entities that can form, manage, and evolve “friendships” based on interaction history and shared roles. SIoT research has shown that social relationships between devices can improve network navigability, decentralize decision-making, and provide a natural basis for trust and collaboration. However, most existing SIoT frameworks focus on generic service discovery, social relationship modeling, or small-scale simulations, with limited consideration of data-intensive workloads and strict real-time constraints [14,15]. As a result, the application of socially grounded relationships and trust scoring to industrial computer vision pipelines, where high-volume video streams and frame-level latency are critical, remains largely unexplored.
Existing SIoT trust models primarily rely on abstract social relationships, reputation aggregation, or historical interaction patterns to evaluate device reliability [12,13,14,15]. While these approaches effectively capture social proximity and long-term behavioral trends, they remain largely domain-agnostic and do not explicitly incorporate execution-level performance characteristics. In particular, they do not account for computer vision-specific factors such as detection accuracy under workload variation, execution feasibility on heterogeneous hardware, or failure modes arising from real-time streaming constraints. In contrast, the proposed framework grounds trust evaluation directly in observed computer vision execution outcomes, enabling socially inspired relationships to emerge from measurable service behavior rather than from static or self-reported attributes.
To operationalize these socially inspired interactions, the proposed SIoT protocol integrates proven distributed systems technologies. Device discovery is achieved using a lightweight Gossip mechanism, enabling decentralized and fault-tolerant propagation of presence and capability information across heterogeneous IIoT networks [16,17]. Secure and efficient inter-device communication is handled through gRPC, which supports bidirectional streaming and structured payloads that are well-suited for transmitting computer vision results and test frames [18]. For friendship evaluation, the protocol employs a PageRank-inspired scoring model that aggregates multi-attribute performance metrics including latency, accuracy, throughput, and reliability into a single trust value [19,20]. These components collectively enable devices to autonomously identify suitable partners, validate their capabilities, and select the optimal node for executing industrial computer vision workloads under varying network and hardware conditions.
To assess the effectiveness of the proposed protocol, the evaluation includes both homogeneous and heterogeneous test setups. In the homogeneous setting, multiple devices with similar hardware configurations and identical vision models are compared to reveal how the protocol behaves when nominal capabilities are equivalent. In the heterogeneous setting, a Social IoT requester coordinates with diverse devices, including embedded platforms and GPU-enabled systems running different models and operating under varying video resolutions, frame rates (FPS), and network bandwidths. These scenarios reflect realistic industrial conditions, where camera streams and processing nodes differ in capability and network quality. Across both setups, the evaluation includes measurements of vision accuracy, latency, and throughput, along with an analysis of how SIoT friendship scores evolve and influence device selection.
Unlike conventional edge-service selection approaches that rely on direct multi-metric aggregation or static reputation scores [4,5,12,14], the proposed framework operationalizes socially grounded trust through explicit relationship modeling and propagation. Trust is derived from observed execution outcomes and decomposed into multiple logical trust nodes per service-providing device, capturing distinct performance and execution characteristics. PageRank is employed not as a generic ranking mechanism, but as a means of propagating trust through the resulting relational graph, enabling indirect influence among services based on shared behavioral patterns. The social grounding of the proposed framework is motivated by established principles of relational interaction, where persistent relationships provide structure and interpretability beyond isolated performance measurements [21]. This design allows feasibility constraints, execution failures, and performance trade-offs to be reflected naturally in the friendship scores, distinguishing the proposed approach from metric-only selection or reputation-based SIoT models.
The proposed framework targets intra-site industrial deployments, such as factory floors or localized edge clusters, where service-providing devices operate within a managed network environment. The assumed setting corresponds to local-area or edge-network connectivity with variable but bounded bandwidth and latency, rather than wide-area cloud federation. This deployment model reflects common industrial computer vision scenarios, in which devices exhibit heterogeneous computational capabilities and experience execution failures due to resource constraints rather than persistent network disconnections.
This study makes the following contributions:
1. A socially grounded SIoT protocol that aligns device interactions with relational communication principles, enabling structured collaboration in industrial CV environments.
2. A trust-evaluation framework that integrates accuracy, latency stability, throughput, and model behavior to produce reliable and explainable friendship scores.
3. A mathematical scoring model that applies RPD-based attribute comparison, weighted parameter importance, and PageRank-inspired trust propagation.
4. A unified testing methodology covering homogeneous and heterogeneous hardware conditions, including variations in compute capability, model type, video resolution, frame rate, and network bandwidth.
5. A validated industrial testbed demonstrating stable trust evolution, consistent device selection, and improved operational reliability under realistic industrial CV workloads.
The remainder of this paper is organized as follows.
Section 2 presents the background and related work in SIoT, trust management, industrial computer vision, and distributed edge intelligence.
Section 3 outlines the social-theoretic foundations that shape the interaction model.
Section 4 describes the SIoT protocol architecture, including discovery, capability exchange, validation, and friendship scoring.
Section 5 introduces the mathematical model for attribute weighting and PageRank-based trust propagation.
Section 6 describes the test setup and presents the experimental results.
Section 7 discusses the observed results, challenges, and implications of the proposed SIoT protocol.
Section 8 concludes the study and outlines future research directions.
3. Theoretical Foundations
The proposed SIoT protocol is grounded in established theories from social science and interpersonal communication, which provide a principled basis for modeling trust, relationship formation, and interaction dynamics among autonomous devices. These theories offer a conceptual bridge between human social behavior and socially inspired device collaboration in distributed IoT environments.
3.1. Knapp’s Relational Development Model
Knapp’s relational development model describes relationship formation as a staged process through which entities progressively build interaction depth, mutual understanding, and trust over time [41]. The model identifies distinct phases of relationship initiation, intensification, maintenance, and potential dissolution, each characterized by increasing levels of information exchange and commitment.
In the context of the Social Internet of Things, these stages can be mapped to device interactions that evolve from initial discovery and capability advertisement to sustained cooperation and long-term service provision. Early interactions correspond to exploratory exchanges, while repeated successful task execution reinforces trust and strengthens the device relationship.
By adopting Knapp’s model, the proposed SIoT protocol formalizes friendship establishment as a gradual and evidence-driven process rather than a binary or static association. This perspective aligns naturally with trust accumulation based on historical performance, reliability, and behavioral consistency, which are central to industrial computer vision service collaboration.
1. Initiating reflects early-stage contact, aligning with device discovery under uncertain conditions.
2. Experimenting corresponds to capability exchange, where devices explore functional compatibility.
3. Intensifying aligns with trial interactions such as test-frame execution or controlled performance evaluation.
4. Integrating reflects stable collaboration, where devices begin to exchange operational workloads.
5. Bonding represents persistent cooperation supported by consistent trust scores and long-term operational alignment.
This staged progression offers predictable behavioral logic that maps to device interactions in industrial CV workloads. Devices progress toward stable collaboration as they demonstrate accuracy, reliability, and operational consistency.
3.2. Social Exchange Theory
Social Exchange Theory (SET) views relationships as outcomes of cost–benefit interactions, where each participant evaluates the utility returned from the relationship [42]. In SIoT, utility corresponds to performance indicators such as:
- Inference accuracy.
- Processing latency.
- Throughput stability.
- Resource consumption.
- Reliability under load.
Devices prefer partners that deliver durable returns, such as lower latency, higher accuracy, stable throughput, or robust performance under bandwidth variation. Industrial CV environments generate fluctuating demands, making SET principles particularly relevant for evaluating long-term cooperative value.
Relationships become more stable when interaction outcomes consistently meet or exceed expectations. SET therefore informs the trust scoring logic, where favorable results strengthen collaborative ties and inconsistent behavior introduces penalties.
3.3. Social Penetration Theory
Social Penetration Theory (SPT) describes how relationships deepen through progressively richer and more meaningful exchanges [43]. In interpersonal communication, this process involves disclosure of layered information.
In SIoT, this layering manifests as follows:
- Early interactions involve surface-level capability advertisement.
- Deeper interactions involve controlled test-frame processing.
- Mature interactions involve full operational cooperation with real streaming data.
This model supports a graded approach to relationship validation and trust evolution. Devices reveal increasing levels of capacity and reliability as interactions intensify. For industrial CV workloads, this staged approach reduces the risk of selecting unreliable devices before performance validation occurs.
3.4. Fiske’s Relational Models
Fiske’s Relational Models Theory outlines four interaction patterns: Communal Sharing, Authority Ranking, Equality Matching, and Market Pricing [44]. These categories help classify device–device relationships in industrial environments.
- Communal Sharing corresponds to cooperative processing, where devices share capabilities for mutual benefit (e.g., CV offloading among similar edge nodes).
- Authority Ranking reflects hierarchical collaboration, such as GPU-enabled servers supporting lower-tier embedded devices.
- Equality Matching aligns with symmetric exchanges, such as devices with equivalent models and workloads.
- Market Pricing corresponds to performance-based collaboration, where attribute weighting and trust scoring influence partner selection.
Industrial CV deployments often include diverse hardware tiers, making Fiske’s framework useful for structuring role-based collaboration.
3.5. Homophily Theory
Homophily theory states that entities with similar characteristics exhibit a higher likelihood of forming meaningful relationships [7,12]. In SIoT, similarity factors include shared model architectures, comparable processing capabilities, and matching operational profiles. In industrial computer vision systems, homophily influences the clustering of devices with shared operational profiles. Devices processing similar resolutions, FPS levels, or model architectures exhibit more predictable performance, making them suitable collaboration candidates. Homophily therefore supports efficient search-space reduction when identifying suitable peers in socially driven device-selection processes [14,15].
3.6. Proximity, Reciprocity, and Competence
Abed’s CSIoT framework [21] expands SIoT trust formation using proximity, similarity, reciprocity, and competence factors. These factors mirror behavioral influences observed in human social networks:
- Proximity reflects physical or network closeness, relevant for bandwidth-dependent CV cooperation.
- Reciprocity captures consistency of responses and timely interaction.
- Competence aligns with device capability, model support, and resource availability.
- Similarity strengthens relationship formation among devices with aligned performance profiles.
These principles support multi-dimensional trust formation and align effectively with industrial CV contexts, where device quality depends on accuracy, latency, and stability.
3.7. Summary of Theoretical Relevance
The theories reviewed above collectively influence the behavioral model of the SIoT protocol:
- Knapp’s stages model device interaction phases.
- Social Exchange Theory drives utility-oriented trust evaluation.
- Social Penetration Theory informs graded validation through test frames.
- Fiske’s relational types guide role-based collaboration.
- Homophily reduces search complexity and improves compatibility.
- Abed’s CSIoT principles strengthen multi-factor trust reasoning.
These theories provide conceptual grounding for the protocol architecture introduced in the next section.
4. SIoT Protocol Architecture
The proposed SIoT protocol adopts a layered architecture that structures device interaction into progressive stages aligned with relational communication theory. Each layer represents a functional phase in the collaboration lifecycle, mapping directly to Knapp’s relational stages of initiation, experimentation, intensification, integration, and bonding. This layered organization supports scalable discovery, reliable capability exchange, secure interaction, and trust-oriented device selection in industrial computer vision environments.
Figure 1 presents the proposed seven-layer SIoT protocol architecture, illustrating the progressive stages of device interaction from discovery and capability exchange to trust-based friendship scoring and service selection.
4.1. Trust Model Terminology and Decision Flow
For clarity, the following terms are used consistently throughout the manuscript:
Trust: a quantitative assessment of a service provider’s reliability derived from observed execution outcomes, including detection accuracy, communication latency, and processing behavior.
Reputation: an aggregated or historical perception of reliability commonly employed in prior SIoT models but not directly used in the proposed framework.
Friendship score: the final trust-derived ranking value computed through PageRank-based propagation over the SIoT trust graph, reflecting both direct and indirect trust relationships.
Service selection: the process of choosing one or more service-providing devices based on their friendship scores while respecting execution feasibility constraints.
Scope of Trust Evaluation and Threat Model
Building on the trust definition introduced in the previous subsection, trust in the context of this study is defined as an execution-grounded measure reflecting the reliability and suitability of a service provider for industrial computer vision workloads. The selected metrics—detection accuracy, communication latency, and processing time—directly capture the primary factors influencing the correctness and timeliness of vision-based decisions in operational environments. Other dimensions commonly associated with trust in distributed systems, such as long-term availability, fault history, security posture, and adversarial robustness, are intentionally not modeled in this work. The proposed framework assumes a managed industrial deployment in which devices are authenticated and operate within controlled network boundaries, and the focus is placed on performance-driven trust under realistic execution constraints. These additional trust dimensions are complementary and can be incorporated as additional trust attributes in future extensions without altering the core socially grounded trust computation.
4.2. Layer 1—Device Discovery (Initiation Stage)
Layer 1 is responsible for discovering devices that operate within the SIoT environment. Discovery uses a Gossip protocol to propagate presence information without central coordination. Each device periodically publishes minimal identifiers, operational metadata, and availability status. This approach ensures robust discovery under fluctuating industrial network conditions and maps to the initiation stage of relational development, where entities identify potential partners without deep interaction.
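As a rough illustration, a Layer 1 presence announcement can be sketched as a compact serializable record. The field names below are hypothetical; the prototype propagates comparable identifiers, metadata, and availability status through a Serf-based Gossip agent (see the implementation considerations later in this section):

```python
import json
import time
from dataclasses import dataclass, asdict

# Hypothetical minimal presence record for Layer 1 discovery. The actual
# protocol gossips comparable information; these fields are illustrative.
@dataclass
class PresenceRecord:
    device_id: str
    role: str          # e.g., "provider" or "requester"
    available: bool
    last_seen: float   # Unix timestamp of the last heartbeat

def encode_presence(record: PresenceRecord) -> bytes:
    """Serialize a presence record for gossip dissemination."""
    return json.dumps(asdict(record)).encode("utf-8")

def decode_presence(payload: bytes) -> PresenceRecord:
    """Reconstruct a presence record received from a gossiping peer."""
    return PresenceRecord(**json.loads(payload.decode("utf-8")))

record = PresenceRecord("cam-edge-01", "provider", True, time.time())
assert decode_presence(encode_presence(record)) == record
```

Keeping the announcement minimal limits gossip traffic; deeper capability data is deferred to Layer 2, in line with the staged relational model.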
4.3. Layer 2—Capability Exchange (Experimentation Stage)
Layer 2 supports structured exchange of device capabilities, including CV model support, hardware attributes, GPU characteristics, and operational constraints. The interaction occurs through gRPC messaging, allowing devices to share structured and authenticated capability data. This layer maps to Knapp’s experimentation stage, where devices explore compatibility based on functional characteristics.
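The compatibility decision made during this experimentation stage can be sketched as follows. The field names and thresholds are illustrative assumptions; in the actual protocol, such data is encoded as Protocol Buffer messages exchanged over gRPC:

```python
# Hypothetical capability record advertised by a provider (Layer 2).
capabilities = {
    "device_id": "gpu-node-03",
    "models": {"yolov8n", "yolov8s"},   # supported CV models (illustrative)
    "gpu_memory_gb": 8,
    "max_fps": 30,
}

# Hypothetical minimum requirements expressed by a requester.
requirements = {"model": "yolov8n", "min_gpu_memory_gb": 4, "min_fps": 15}

def is_compatible(caps: dict, req: dict) -> bool:
    """Return True when a provider's advertised capabilities satisfy
    the requester's minimum requirements."""
    return (
        req["model"] in caps["models"]
        and caps["gpu_memory_gb"] >= req["min_gpu_memory_gb"]
        and caps["max_fps"] >= req["min_fps"]
    )

assert is_compatible(capabilities, requirements)
```

Only providers passing this coarse filter proceed to friendship establishment and controlled test execution in the subsequent layers.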
4.4. Layer 3—Friendship Establishment (Intensification Stage)
Layer 3 formalizes initial cooperation by initiating explicit friendship requests. Devices issue gRPC calls to confirm participation, exchange service metadata, and prepare for controlled test execution. This layer reflects the intensification stage, where interaction depth increases and devices begin to form targeted collaborative ties.
4.5. Layer 4—Service Requesting (Integration Stage)
Layer 4 executes service requests using gRPC service operations. Devices in the service-requesting mode transmit CV test frames, object-detection requests, or feature-execution tasks to suitable peers. Devices in the providing mode process these tasks and return structured responses. This stage represents integration, where cooperative processing becomes operational rather than exploratory.
4.6. Layer 5—Testing and Validation (Bonding Stage)
Layer 5 performs controlled validation of peer behavior. Devices transmit test frames to evaluating peers, assess response latency, compare inference outcomes with expected results, and update local metrics. This process reflects the bonding stage, where relationships strengthen based on validated reliability and performance consistency. Vision-specific metrics, such as frame-level precision, latency stability, and throughput, play a central role.
4.7. Layer 6—Graph Construction
Layer 6 constructs a directed friendship graph. Each device represents a node, and edges encode multi-attribute similarity using Relative Percentage Difference (RPD) and attribute weighting. This graph becomes the computational backbone for trust propagation. The graph adapts dynamically as devices update performance characteristics, ensuring alignment with current industrial conditions.
Figure 2 provides a conceptual view of the directed SIoT friendship graph constructed at this stage, illustrating how performance metrics, device characteristics, and weighted trust edges are organized to support PageRank-based friendship score propagation.
The SIoT trust graph is constructed dynamically from observed execution data rather than predefined social links or manually assigned relationships. For each service-providing device, normalized performance attributes are computed from measured detection accuracy, communication latency, and processing time. Logical trust nodes represent these attributes, and directed edges are instantiated by comparing relative performance across devices using the Relative Percentage Difference (RPD) formulation. Edge weights therefore reflect empirically observed dominance or degradation relationships rather than assumed trust levels. The relative importance weights assigned to accuracy- and latency-related attributes are not used to define the existence of trust edges, but rather to express application-level preference during trust aggregation. These weights remain fixed within a given experimental configuration to ensure comparability across runs and reflect industrial computer vision priorities, where both correctness and timeliness are critical. Sensitivity analysis results in Section 6.6.4 show that weight variation affects friendship-score magnitude but does not alter service-selection outcomes.
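Under simplified assumptions, the construction of this weighted directed graph can be sketched with NetworkX. The sketch uses one node per device (whereas the framework additionally decomposes each provider into multiple logical trust nodes), and all names and weights are illustrative:

```python
import networkx as nx

# Illustrative composite weights standing in for the RPD-derived
# edge weights; the real values come from measured execution data.
composite_weights = {"edge-01": 0.92, "edge-02": 0.61, "gpu-03": 1.35}

G = nx.DiGraph()
for device, w in composite_weights.items():
    # Edge direction encodes trust flowing from the requester toward
    # each evaluated provider.
    G.add_edge("requester", device, weight=w)

# Providers that validated each other can also link directly, allowing
# trust to propagate indirectly through shared behavioral patterns.
G.add_edge("edge-01", "gpu-03", weight=1.10)

assert G.number_of_nodes() == 4 and G.number_of_edges() == 4
```

Because the graph is rebuilt from current measurements, edges appear, strengthen, or decay as device performance and network conditions evolve.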
4.8. Layer 7—Friendship Scoring and Selection (PageRank Trust Evaluation)
Layer 7 applies a PageRank-based scoring process [19] over the constructed SIoT friendship graph to derive trust scores for all potential collaborators. Each device receives a score proportional to its reliability, performance stability, interaction history, and consistency across prior evaluations. Devices with favorable performance attributes accumulate higher scores and become preferred partners for real-time industrial CV workloads.
PageRank-based trust propagation operates on the directed friendship graph constructed in Layer 6, where weighted directed edges are derived from accuracy, latency, and processing attributes. The propagation produces a global ranking that supports robust service-provider selection under heterogeneous hardware and network conditions.
4.9. Implementation Considerations
The SIoT protocol implementation adopts a lightweight and platform-agnostic design that supports heterogeneous industrial environments. The prototype development uses Python 3.x, selected for its extensive ecosystem, rapid prototyping capability, and compatibility across embedded and high-performance platforms.
gRPC provides the structured communication layer for service requesting, capability exchange, and testing. Its support for bidirectional streaming, HTTP/2 transport, and Protocol Buffer serialization ensures efficient delivery of computer vision payloads and device metadata across diverse hardware. Protocol Buffers define compact message structures for device capabilities, service execution requests, and response exchanges, enabling robust and consistent interaction formats.
Distributed discovery relies on a Gossip mechanism, implemented using the Serf agent, which propagates presence information and device identifiers without central coordination. This approach supports dynamic membership changes, fault tolerance, and efficient propagation under varying industrial network conditions.
Graph construction and trust scoring use the NetworkX version 3.4.2 library, which provides optimized implementations for directed graph structures and PageRank-based evaluation. The library supports dynamic updates to graph nodes and edges, enabling real-time trust computation as device performance, network conditions, or capabilities evolve.
Execution and streaming failures observed during service execution are handled according to their severity in the proposed framework. Hard execution failures, including gRPC stream termination, incomplete frame delivery, or resource exhaustion that prevent completion of inference, are classified as FAIL and result in exclusion of the corresponding configuration from service selection. No friendship edge or validation reward is generated for such infeasible executions. Intermittent execution or streaming degradations that do not halt execution are explicitly treated as negative trust evidence. Such degradations do not receive validation rewards and contribute negatively to the corresponding friendship edge weights in the trust graph. As a result, devices that repeatedly exhibit degraded execution behavior experience a progressive reduction in their PageRank-based friendship scores and are deprioritized in subsequent service-selection rounds. The implications of this failure-handling strategy on performance metrics and service-selection outcomes are reflected in the experimental evaluation presented in Section 6.
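The severity-based handling described above can be summarized in a small sketch. The reward and penalty magnitudes are illustrative placeholders, not values taken from the implementation:

```python
from enum import Enum

class Outcome(Enum):
    OK = "ok"               # clean execution, validation passed
    DEGRADED = "degraded"   # intermittent streaming degradation
    FAIL = "fail"           # hard execution failure (e.g., stream abort)

def update_edge_weight(weight: float, outcome: Outcome,
                       reward: float = 0.05, penalty: float = 0.10):
    """Illustrative policy mirroring the described strategy: hard failures
    exclude the configuration, degradations act as negative trust
    evidence, and clean runs earn the validation reward."""
    if outcome is Outcome.FAIL:
        return None                          # excluded from selection
    if outcome is Outcome.DEGRADED:
        return max(0.0, weight - penalty)    # negative trust evidence
    return weight + reward                   # successful validation

assert update_edge_weight(0.8, Outcome.FAIL) is None
assert abs(update_edge_weight(0.8, Outcome.DEGRADED) - 0.7) < 1e-9
```

Because edge weights feed directly into PageRank, repeated degradations compound: the affected provider's friendship score declines across rounds without any explicit blacklisting step.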
The implementation approach ensures portability across industrial hardware, including embedded platforms, GPU-enabled devices, and high-performance workstations. Each device executes the protocol as a self-contained Python microservice, exposing the required SIoT interfaces and maintaining local friendship tables and configuration stores. The modular design supports integration into existing industrial automation platforms and enables interoperability across heterogeneous systems.
5. Mathematical Model
The SIoT trust evaluation framework quantifies the suitability of each device for industrial computer vision workloads through a combination of attribute comparison, weighted importance, and network-aware trust propagation. The mathematical model integrates performance indicators such as detection accuracy, latency stability, throughput, GPU characteristics, and network sensitivity. The scoring process comprises three stages:
1. Relative Percentage Difference (RPD) computation.
2. Weighted attribute aggregation.
3. PageRank-based trust propagation.
5.1. Attribute Definitions and Categories
Each device is evaluated using a set of $n$ attributes, indexed by $j = 1, \ldots, n$. These attributes represent operational and performance characteristics relevant to industrial CV workloads. Attributes fall into two categories:
- Ascending attributes: Higher values indicate preferable performance. Examples include vision accuracy metrics (precision, recall, F1-score), processing throughput (frames per second), GPU memory size, GPU core count, and supported backbone models.
- Descending attributes: Lower values indicate preferable performance. Examples include network latency, per-frame processing time, and end-to-end response delay.
For each attribute $j$, the desired value is denoted by $d_j$ and the observed value for friend device $i$ is represented as $x_{i,j}$.
5.2. Weight Percentage (WP)
Table 1 presents the weight distribution of model parameters used for friendship computation in service-requesting mode, expressed as Weight Percentages (WP). The Weight Percentage $WP_j$ expresses the relative importance of each attribute $j$ in the trust-evaluation process. Industrial computer vision workloads assign different priorities to performance indicators such as accuracy, latency, throughput, and compute capability. The value of $WP_j$ determines the influence of deviations in attribute $j$ on the final friendship score.
The weighting parameters adopted in this work are application-specific and reflect the operational requirements of industrial computer vision workloads. In such environments, detection accuracy is essential due to safety and process-reliability considerations, while timely execution is equally critical because delayed results are often operationally unusable. At the same time, neither accuracy nor processing latency alone is sufficient to characterize service trust, as execution feasibility, reliability, and validation feedback also play key roles. Accordingly, accuracy and processing time are assigned moderate, bounded weights to ensure balanced influence within the overall trust computation, preventing dominance by any single metric. These weights are design-time policy parameters rather than optimized coefficients and may be adjusted for other application domains with different performance priorities.
Attributes with higher operational significance receive larger $WP_j$ values. For example, precision, recall, and F1-score retain higher weights in safety-critical inspection tasks, while throughput or GPU-memory capacity receives moderate weights when continuous processing speed becomes essential. Latency-sensitive applications assign higher weights to end-to-end delay and per-frame processing time.
The Weight Percentage does not require prior normalization across attributes, since the PageRank stage stabilizes relative contributions based on the SIoT graph structure. During scoring, $WP_j$ modifies the impact of the Relative Percentage Difference (RPD) term, amplifying penalties for deviations in strongly weighted attributes and reducing their effect for attributes with lower relevance.
All WP values originate from the device’s Friendship Configuration file, allowing each node to express workload-specific preferences when selecting collaborators for industrial CV execution.
5.3. Relative Percentage Difference (RPD)
The Relative Percentage Difference (RPD) expresses the deviation between the expected and observed values while normalizing differences across attributes. For each device
i and attribute
j, the RPD is
When attribute
j contains multiple subcomponents indexed by
k (e.g., precision, recall, and F1-score contributing to accuracy), the RPD for each subcomponent is
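Assuming the RPD formulation above (observed value relative to the desired target), the computation reduces to a one-line function:

```python
def rpd(observed: float, desired: float) -> float:
    """Relative Percentage Difference of an observed attribute value
    against its desired target, as a percentage of the target."""
    return (observed - desired) / desired * 100.0

# A provider delivering 27 FPS against a 30 FPS throughput target
# under-performs the target by 10%.
assert abs(rpd(27.0, 30.0) - (-10.0)) < 1e-9
```

The sign of the result distinguishes over- and under-performance, which the ascending and descending weight contributions below treat oppositely.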
5.4. Ascending Attributes
Ascending attributes represent performance indicators where higher values reflect stronger suitability for industrial CV workloads. These attributes describe accuracy, throughput, compute capability, and overall device strength.
For each ascending attribute \(j\), the weight contribution for device \(i\) is
\[
w_{i,j} = WP_j \times \frac{O_{i,j}}{E_{i,j}}.
\]
Higher accuracy or throughput strengthens the ascending-attribute weight, while lower-than-expected performance reduces the weight through inverse scaling.
5.5. Descending Attributes
Descending attributes represent performance indicators where lower values reflect more desirable behavior. Industrial CV pipelines favor minimal network delay, low processing time, and reduced end-to-end latency.
For each descending attribute \(j\), the weight contribution for device \(i\) is
\[
w_{i,j} = WP_j \times \frac{E_{i,j}}{O_{i,j}}.
\]
Network latency, processing time, and end-to-end delay strengthen the weight when they remain below target thresholds.
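The two directional weight rules can be sketched together; the ratio-based scaling below is an assumed formulation consistent with the descriptions above, and the function names are illustrative:

```python
def ascending_weight(wp: float, expected: float, observed: float) -> float:
    # Higher-is-better (accuracy, throughput): weight grows with observed/expected.
    return wp * (observed / expected)

def descending_weight(wp: float, expected: float, observed: float) -> float:
    # Lower-is-better (latency, processing time): weight grows with expected/observed.
    return wp * (expected / observed)
```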
5.6. Split Attributes
When attribute \(j\) comprises \(K\) subcomponents, each subcomponent weight is computed as
\[
w_{i,j,k} = WP_j \times \frac{O_{i,j,k}}{E_{i,j,k}},
\]
and the aggregated attribute weight becomes
\[
w_{i,j} = \frac{1}{K} \sum_{k=1}^{K} w_{i,j,k}.
\]
Split attributes are relevant for multi-metric characteristics such as accuracy (precision, recall, F1).
5.7. Composite Friendship Weight
The composite friendship weight for device \(i\) is
\[
W_i = \sum_{j} w_{i,j}.
\]
Higher composite weights reflect stronger alignment with industrial CV requirements.
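Combining the split-attribute aggregation with the composite sum, a sketch under an assumed ratio-based scaling and mean aggregation might look like this (names and numbers are illustrative):

```python
def split_attribute_weight(wp, expected_sub, observed_sub):
    """Aggregate an attribute with K subcomponents (e.g., precision, recall,
    F1 for accuracy) as the mean of per-subcomponent contributions
    (assumed aggregation rule)."""
    contribs = [wp * observed_sub[k] / expected_sub[k] for k in expected_sub]
    return sum(contribs) / len(contribs)

def composite_weight(attribute_weights):
    # Composite friendship weight: sum of all per-attribute contributions.
    return sum(attribute_weights.values())
```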
5.8. SIoT Graph Construction
The SIoT friendship graph is defined as \(G = (V, E)\), where each node \(v \in V\) represents a device and each directed edge \((u, v) \in E\) encodes directional trust with weight proportional to \(W_v\). The graph structure adapts dynamically as devices update metrics or network conditions shift.
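A directed friendship graph with edge weights proportional to the target device's composite weight can be sketched as an adjacency map. This is a simplified, fully connected construction; the paper's actual edge-creation policy may differ:

```python
def build_siot_graph(devices, composite_weights):
    """Directed SIoT friendship graph as an adjacency map: graph[u][v] is
    the trust weight u places on v, proportional to v's composite weight W_v
    (here simply W_v itself, over a fully connected device set)."""
    return {u: {v: composite_weights[v] for v in devices if v != u}
            for u in devices}
```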
5.9. PageRank-Based Trust Scoring
Trust propagation uses a PageRank formulation. The trust score for device \(i\) is
\[
PR(i) = \frac{1 - d}{N} + d \sum_{j \in \mathrm{In}(i)} PR(j) \, \frac{w_{j,i}}{\sum_{k \in \mathrm{Out}(j)} w_{j,k}},
\]
where \(d\) is the damping factor, \(N\) is the number of devices, \(\mathrm{In}(i)\) denotes devices with links directed to \(i\), and \(\mathrm{Out}(j)\) denotes the targets of outgoing edges from \(j\).
Devices with consistent accuracy, reduced latency, and favorable composite weights achieve higher PageRank scores. PageRank-based friendship scoring is computed using a damping factor of 0.85, consistent with standard practice for influence propagation in directed graphs. All nodes are initialized with equal rank values at the start of computation. Iterative updates are performed until the norm of the difference between successive PageRank vectors falls below a fixed convergence tolerance or a maximum of 50 iterations is reached, whichever occurs first. In all evaluated scenarios, convergence is achieved well within the iteration limit, ensuring stable and reproducible friendship scores.
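The PageRank procedure described above can be sketched as follows, with d = 0.85, equal initial ranks, an L1-style convergence check, and a 50-iteration cap; the tolerance value and toy edge weights are assumed placeholders:

```python
def pagerank(graph, d=0.85, tol=1e-6, max_iter=50):
    """Weighted PageRank over an adjacency map graph[u][v] = edge weight.
    Equal initial ranks; stops when the L1 difference between successive
    rank vectors drops below tol or after max_iter iterations."""
    nodes = list(graph)
    n = len(nodes)
    pr = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        nxt = {}
        for v in nodes:
            incoming = 0.0
            for u in nodes:
                out = graph.get(u, {})
                total = sum(out.values())
                if v in out and total > 0:
                    # Each node splits its rank across out-edges by weight.
                    incoming += pr[u] * out[v] / total
            nxt[v] = (1 - d) / n + d * incoming
        if sum(abs(nxt[v] - pr[v]) for v in nodes) < tol:
            return nxt
        pr = nxt
    return pr

# Toy example: D4 (requester) rates providers D1 and D2; providers also
# rate each other. Weights are illustrative composite friendship weights.
toy_graph = {"D4": {"D1": 1.2, "D2": 0.9},
             "D1": {"D2": 0.9},
             "D2": {"D1": 1.2}}
scores = pagerank(toy_graph)
```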
5.10. Final Device Selection
The final trust value guiding device selection is
\[
T_i = PR(i),
\]
where \(T_i\) ranks the suitability of each device for subsequent industrial CV workloads.
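Final selection then reduces to an argmax over the trust scores of the feasible providers (a minimal sketch with illustrative names; infeasible devices are simply absent from the feasible set):

```python
def select_provider(trust_scores, feasible):
    """Pick the feasible service provider with the highest final trust
    value T_i. Devices excluded from selection (e.g., after hard failures)
    are simply not in the feasible set."""
    candidates = {dev: s for dev, s in trust_scores.items() if dev in feasible}
    return max(candidates, key=candidates.get)
```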
6. Experimental Results and Analysis
6.1. Experimental Setup
This section describes the experimental configuration used to evaluate the proposed SIoT-based protocol for industrial computer vision workloads. The setup is designed to assess protocol behavior under controlled yet realistic operating conditions, reflecting both homogeneous and heterogeneous device environments commonly found in industrial IoT deployments.
Figure 3 illustrates the SIoT experimental test setup used in this study, showing both the homogeneous and heterogeneous deployment environments considered for evaluation.
6.1.1. Test Environments
Two experimental environments are considered:
1. Homogeneous environment: Two Windows-based devices (D1 and D2) with similar hardware capabilities act as service providers, while a Raspberry Pi device (D4) functions as the service requester. This configuration evaluates SIoT behavior when service providers exhibit comparable computational resources.
2. Heterogeneous environment: A Windows device (D1) and an embedded GPU-based Jetson device (D3) act as service providers, with the Raspberry Pi device (D4) as the service requester. This configuration reflects practical industrial scenarios involving mixed hardware platforms with varying compute and memory characteristics.
6.1.2. Device Configuration
Each device is assigned a unique identifier for clarity throughout the evaluation:
D1: Windows Device 1;
D2: Windows Device 2;
D3: Jetson Device;
D4: Raspberry Pi Device.
The detailed hardware and software configurations of all devices, including processor type, memory capacity, storage, operating system, and accelerator support, are summarized in Table 2. These configurations are kept constant throughout all experiments to ensure repeatability.
6.1.3. Object Detection Algorithms
The architectural characteristics and deployment variants of these algorithms are summarized in Table 3. The detection algorithms are instantiated from publicly available pretrained implementations. The YOLO-based detector is deployed using the Ultralytics YOLO framework, while the SSD Inception v2 and EfficientDet-Lite0 models are obtained from the TensorFlow Detection Model Zoo and TensorFlow Hub, respectively.
In the homogeneous environment, algorithms A1 and A2 (as defined in Table 3) are deployed on D1 and D2 and systematically swapped to eliminate device–algorithm bias. In the heterogeneous environment, algorithm A3, as summarized in Table 3, is additionally deployed on the Jetson device (D3) to evaluate embedded GPU behavior under high-resolution streaming conditions.
6.1.4. Workload and Input Configuration
All experiments use a traffic-surveillance workload focused on detecting and counting vehicles (cars and trucks). Video frames are streamed from the service requester (D4) to service providers using a gRPC-based streaming interface.
The workload is evaluated across the following input parameters:
Resolutions:
- Ultra HD (3840 × 2160).
- Full HD (1920 × 1080).
- HD (1080 × 720).
Frame rates: 30 FPS and 60 FPS.
Network bandwidths: 100 Mbps, 50 Mbps, and 10 Mbps.
Each experiment maintains a fixed combination of resolution, frame rate, and bandwidth to isolate the impact of individual parameters on protocol behavior.
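The full parameter sweep implied by the configuration above can be enumerated directly; resolutions, frame rates, and bandwidths are taken from the lists in this subsection:

```python
import itertools

# Input parameter space from this subsection; each experiment fixes one
# (resolution, frame rate, bandwidth) combination.
RESOLUTIONS = [(3840, 2160), (1920, 1080), (1080, 720)]
FRAME_RATES_FPS = [30, 60]
BANDWIDTHS_MBPS = [100, 50, 10]

EXPERIMENT_GRID = list(itertools.product(RESOLUTIONS, FRAME_RATES_FPS, BANDWIDTHS_MBPS))
```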
6.1.5. Reproducibility Details
All experiments are conducted using publicly available datasets and pretrained detection models to ensure reproducibility. Video workloads are selected to represent common industrial surveillance and monitoring scenarios, with annotated object classes corresponding to vehicles and pedestrians. The datasets are used solely as inference workloads, and no additional model training or fine-tuning is performed as part of this study. The sources of pretrained model implementations are documented in References [48,49,50].
Network conditions are emulated by explicitly constraining available bandwidth between the requester and service-providing devices. Bandwidth limits are applied at the operating-system level to reflect realistic industrial network conditions. Packet loss and jitter are not artificially injected; instead, the evaluation focuses on bandwidth-induced latency effects, which dominate performance variability in the targeted deployment setting.
6.1.6. Performance Metrics
The evaluation considers both computer vision performance metrics and SIoT protocol-level metrics:
Precision, Recall, and F1-score: Computed using ground-truth annotations associated with the test video content. These metrics quantify object-detection accuracy and are used as ascending attributes in the trust model.
Round-Trip Time (RTT): Defined as the elapsed time between frame transmission from the requester and receipt of the processed result, including preprocessing, data transfer, and post-processing. RTT is treated as a descending attribute.
Processing time: Measures algorithm execution time at the service provider, excluding network delay.
Friendship score: Represents the final trust value assigned to each service provider after SIoT graph construction and PageRank-based propagation.
SIoT Graph build time and PageRank computation time: Measure the overhead introduced by SIoT trust evaluation.
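As one illustration, the RTT metric as defined above could be measured at the requester by timestamping around the streaming call; the two callables here are hypothetical stand-ins for the gRPC send/receive path:

```python
import time

def measure_rtt(send_frame, receive_result, frame):
    """RTT as defined above: elapsed time from frame transmission at the
    requester to receipt of the processed result. send_frame and
    receive_result are hypothetical stand-ins for the gRPC streaming calls."""
    t0 = time.perf_counter()
    send_frame(frame)          # preprocessing + serialization + uplink
    result = receive_result()  # response reception + post-processing
    return result, time.perf_counter() - t0
```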
In the experimental evaluation, execution failures are treated consistently with the trust-handling logic described in Section 4.9. Device–algorithm configurations that cannot complete inference due to hard execution failures—such as gRPC stream termination, incomplete frame delivery, or resource exhaustion (e.g., Jetson execution at 4K resolution)—are classified as FAIL and excluded from friendship-score computation and service selection. In contrast, configurations that complete execution but exhibit degraded performance or intermittent streaming instability remain feasible and contribute negative trust evidence, resulting in reduced friendship edge weights and lower PageRank-based friendship scores in subsequent selection rounds.
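A compact sketch of this failure-handling policy, with illustrative labels and an assumed penalty factor for degraded-but-feasible runs:

```python
def classify_run(completed: bool, hard_failure: bool) -> str:
    """Map a device-algorithm run outcome to the trust-handling policy
    described above (labels are illustrative)."""
    if hard_failure or not completed:
        return "FAIL"        # e.g., gRPC stream termination, resource exhaustion
    return "FEASIBLE"

def update_edge_weight(weight: float, outcome: str, degraded: bool = False,
                       penalty: float = 0.8):
    """FAIL drops the edge entirely (excluded from scoring and selection);
    degraded-but-feasible runs keep the edge but contribute negative
    evidence. The penalty factor is an assumed placeholder."""
    if outcome == "FAIL":
        return None
    return weight * penalty if degraded else weight
```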
6.1.7. SIoT Protocol Execution
For each experimental run, the SIoT protocol operates as follows:
1. All participating devices are preloaded with the proposed SIoT protocol library, and the requester device (D4) streams video frames to available service providers.
2. Service providers execute object-detection inference using the deployed algorithm.
3. Performance metrics are collected and normalized using the proposed mathematical model.
4. A directed SIoT graph is constructed using composite friendship weights.
5. PageRank-based trust propagation computes final friendship scores.
6. The service provider with the highest friendship score is selected for subsequent execution.
Weight Percentages used in the trust-evaluation process are defined in Table 1.
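Steps 3–6 of a single protocol round can be sketched end-to-end in simplified form; the two-attribute trust model, parameter values, and single-requester star graph below are illustrative simplifications of the full framework:

```python
def run_protocol_round(metrics, wp, expected, d=0.85, iters=50):
    """One simplified SIoT selection round (steps 3-6): normalize metrics
    into composite weights, build a requester->provider star graph, run a
    PageRank-style propagation, and pick the top provider. The two-attribute
    model and all numbers are illustrative."""
    # Steps 3-4: composite friendship weights (ascending F1, descending RTT).
    weights = {dev: wp["accuracy"] * m["f1"] / expected["f1"]
                    + wp["rtt"] * expected["rtt"] / m["rtt"]
               for dev, m in metrics.items()}
    providers = list(weights)
    nodes = ["requester"] + providers
    n = len(nodes)
    total_out = sum(weights.values())
    # Step 5: PageRank-style propagation over the star graph.
    pr = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        pr = {v: (1 - d) / n
                 + (d * pr["requester"] * weights[v] / total_out
                    if v != "requester" else 0.0)
              for v in nodes}
    # Step 6: provider with the highest friendship score wins.
    return max(providers, key=lambda p: pr[p])
```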
6.1.8. Experimental Scope
All experiments are conducted using identical datasets, configurations, and evaluation criteria across both environments. Each experimental configuration was executed three times under identical conditions to assess result stability. Reported values correspond to mean measurements across runs. Observed run-to-run variability was small relative to absolute execution time and did not affect relative device ranking or service-selection outcomes. For this reason, standard deviation values are omitted for clarity. This consistency ensures that observed differences in performance and trust behavior arise from device heterogeneity, algorithm characteristics, and network constraints rather than experimental bias.
6.2. Homogeneous Environment Results
6.2.1. Round-Trip Time (RTT) Analysis
This subsection analyzes the round-trip time (RTT) behavior observed in the homogeneous experimental environment, where two identical Windows devices (D1 and D2) act as service providers and a Raspberry Pi (D4) functions as the service requester. Since D1 and D2 share comparable hardware configurations and operating systems, RTT variations primarily reflect the effects of network bandwidth, video resolution, frame rate, and protocol-level processing rather than hardware heterogeneity. The corresponding numerical results are summarized in Table 4, Table 5 and Table 6 and visualized in Figure 4.
Figure 4. Round-trip time (RTT) comparison for homogeneous device configurations under different device–algorithm assignments and frame rates.
In this study, round-trip time (RTT) is defined as the elapsed time between the transmission of a video frame from the service-requesting device and the reception of the corresponding detection result. RTT includes frame preprocessing, gRPC serialization, uplink transmission, response reception, and post-processing overhead but explicitly excludes the model inference time executed on the service provider, which is reported separately as processing time.
Figure 4 illustrates the RTT measured for different device–algorithm pairs under varying bandwidth conditions (100 Mbps, 50 Mbps, and 10 Mbps) at both 30 FPS and 60 FPS. As reported in Table 4, Table 5 and Table 6, RTT increases sharply as available bandwidth decreases across all resolutions, demonstrating the dominant influence of network throughput on end-to-end communication delay. At 100 Mbps, RTT remains relatively low and stable, while at 10 Mbps, RTT increases by an order of magnitude, particularly for high-resolution video streams. This behavior is consistent with the increased transmission time required for larger frame payloads under constrained network conditions.
The effect of frame rate is also evident. For a fixed resolution and bandwidth, RTT values at 60 FPS are consistently higher than those observed at 30 FPS. Higher frame rates increase the number of frames transmitted per second, thereby amplifying network congestion and buffering effects within the gRPC streaming pipeline. As a result, RTT reflects cumulative queuing delays rather than instantaneous inference latency alone.
Although D1 and D2 are homogeneous devices, small but consistent RTT differences are observed between them across several test cases, as shown in Table 4, Table 5 and Table 6. These variations remain within a narrow range and do not exhibit systematic bias toward either device. Such differences arise from operating-system scheduling, thread synchronization, memory allocation behavior, and network stack timing rather than differences in algorithm execution time.
Resolution has a pronounced impact on RTT scaling. 4K resolution streams exhibit significantly higher RTT values compared to 1920p and 1080p resolutions, especially under reduced bandwidth conditions. The larger frame sizes associated with 4K content increase serialization and transmission costs, making RTT more sensitive to bandwidth constraints. In contrast, lower resolutions demonstrate more gradual RTT scaling, indicating improved robustness to network variability.
Overall, the homogeneous RTT results confirm that network bandwidth and frame rate dominate end-to-end delay, while device-level variations remain minimal when hardware configurations are similar. These observations establish a baseline for interpreting the effects of heterogeneity and failure scenarios analyzed in subsequent sections.
6.2.2. Detection Accuracy and F1-Score Analysis
This subsection evaluates detection performance in the homogeneous environment using precision, recall, and F1-score as ascending attributes. The analysis focuses on two object-detection algorithms—A1 and A2—executed on homogeneous service-provider devices (D1 and D2) under varying resolutions, frame rates, and bandwidth conditions. The corresponding quantitative results are reported in Table 4, Table 5 and Table 6 and visualized in Figure 5.
Figure 5. F1-score comparison of object-detection algorithms A1 and A2 under homogeneous configurations for original and swapped device–algorithm assignments at different frame rates.
Across all test configurations, Algorithm A1 consistently achieves higher precision, recall, and F1-score than Algorithm A2, irrespective of device assignment. This trend holds for both the original deployment (D1–A1, D2–A2) and the swapped configuration (D1–A2, D2–A1). Since D1 and D2 share comparable hardware characteristics, these results indicate that detection accuracy in the homogeneous setting is primarily governed by algorithmic capability rather than device-level factors.
Resolution exhibits a noticeable but non-monotonic influence on detection accuracy. While moderate increases in spatial resolution improve object representation, the experimental results reported in Table 4, Table 5 and Table 6 show that 4K does not consistently yield higher F1-scores than 1920p. In several cases, 1920p achieves superior precision and F1-score compared to 4K. This behavior arises from internal input resizing within detection models, scale-mismatch effects between object size and anchor or grid structures, and increased pipeline pressure associated with higher-resolution streams. Consequently, higher spatial resolution does not automatically translate into improved detection accuracy under real-time streaming conditions.
Bandwidth variations have minimal direct impact on precision, recall, and F1-score. For a fixed resolution and frame rate, detection accuracy remains largely invariant across 100 Mbps, 50 Mbps, and 10 Mbps conditions, as observed in Table 4, Table 5 and Table 6. This observation confirms that bandwidth constraints primarily affect communication delay rather than inference correctness, provided that frames are successfully delivered and processed.
The swapped deployment experiments further validate the robustness of the accuracy results. When Algorithm A1 executes on either D1 or D2, detection performance remains stable, while Algorithm A2 exhibits similarly consistent but lower accuracy regardless of the hosting device. This outcome reinforces the separation between algorithm-level accuracy and device-level performance in homogeneous environments.
Overall, the homogeneous accuracy results demonstrate that algorithm selection dominates detection quality, while network bandwidth and device assignment exert negligible influence on precision, recall, and F1-score. These findings justify the emphasis placed on accuracy-related metrics in the trust-evaluation model and support their higher weighting in the friendship score computation.
6.2.3. Processing Time vs. Bandwidth
This subsection analyzes the processing time behavior observed in the homogeneous environment under varying bandwidth, resolution, and frame-rate configurations. Processing time refers exclusively to the model inference execution on the service-provider device and excludes communication-related delays, which are captured separately as RTT. The corresponding results are reported in Table 4, Table 5 and Table 6 and illustrated in Figure 6.
Across all resolutions and frame rates, processing time remains largely insensitive to network bandwidth. For a fixed resolution and FPS, the measured processing time exhibits minimal variation across 100 Mbps, 50 Mbps, and 10 Mbps conditions. This behavior confirms that processing time is dominated by algorithm execution characteristics and device computing capability rather than network constraints.
Algorithm-level differences strongly influence processing time. Algorithm A1 consistently exhibits higher processing time than Algorithm A2 at 1080p and 1920p resolutions, reflecting the increased computational complexity associated with A1. In contrast, at 4K resolution, this trend reverses in several test cases, with Algorithm A2 exhibiting comparable or higher processing time than Algorithm A1. This behavior arises from resolution-dependent internal preprocessing and scaling operations, where Ultra HD inputs introduce additional resizing and memory-handling overheads that affect lighter models disproportionately.
Figure 6. Processing time variation with network bandwidth under homogeneous configurations for different resolutions and frame-rate settings.
Resolution exerts a pronounced effect on processing time. Processing time increases substantially when moving from 1080p to 1920p and Ultra HD, particularly at higher frame rates. Ultra HD streams impose significantly higher memory bandwidth and tensor-handling demands, amplifying execution overhead even when inference input dimensions are internally normalized. As a result, processing time scales non-linearly with resolution, especially under real-time constraints.
Frame rate further amplifies processing demands. At 60 FPS, processing time increases relative to 30 FPS for all resolutions, reflecting reduced per-frame execution slack and increased scheduling pressure on the service-provider device. However, the relative ranking between algorithms remains consistent within each resolution regime, indicating that frame rate magnifies—but does not fundamentally alter—algorithmic processing characteristics.
Overall, the processing-time analysis demonstrates that algorithm complexity and input resolution dominate inference execution cost, while network bandwidth exerts negligible influence. These findings justify treating processing time as a descending attribute in the trust-evaluation model and motivate its explicit inclusion alongside RTT and accuracy in friendship-score computation.
6.2.4. Friendship Score Analysis
This subsection analyzes the friendship scores derived from the homogeneous experimental setup by integrating detection accuracy, processing time, and round-trip time within the proposed SIoT trust-evaluation framework. Friendship scores are computed using the weighted attribute model followed by PageRank-based trust propagation, as summarized in
Table 4,
Table 5 and
Table 6 and illustrated in
Figure 7.
Figure 7. Friendship score comparison under homogeneous configurations for representative bandwidth and frame-rate conditions.
Across all homogeneous test configurations, service providers executing Algorithm A1 consistently achieve higher friendship scores than those executing Algorithm A2. This outcome remains stable across different resolutions, frame rates, and bandwidth conditions, indicating that friendship scores are primarily driven by algorithm-level performance rather than transient network variations. The dominance of Algorithm A1 in friendship ranking directly reflects its superior precision, recall, and F1-score, as discussed in Section 6.2.2.
Low RTT alone does not guarantee a high friendship score. Several configurations exhibit comparable RTT values across service providers while yielding significantly different friendship scores. This behavior arises from the weighted trust model, in which accuracy-related attributes receive higher importance than latency-related attributes. As a result, service providers delivering consistently higher detection accuracy retain stronger trust scores even when their RTT or processing time is marginally higher.
Processing time contributes as a descending attribute in the friendship computation but does not override accuracy-dominated behavior in the homogeneous environment. Although Algorithm A1 often incurs higher processing time than Algorithm A2 at moderate resolutions, its superior detection accuracy compensates for this penalty in the weighted aggregation stage. This trade-off ensures that the trust model favors reliable detection performance over marginal gains in execution speed.
Bandwidth variations exert minimal influence on friendship scores in the homogeneous setup. Since accuracy remains stable across bandwidth conditions and processing time is largely network-independent, friendship scores exhibit only minor fluctuations between 100 Mbps, 50 Mbps, and 10 Mbps scenarios. This stability confirms that the proposed trust model effectively isolates intrinsic service quality from transient network effects when devices are homogeneous.
Overall, the homogeneous friendship score results validate the design of the SIoT trust-evaluation framework. By jointly considering ascending attributes such as detection accuracy and descending attributes such as RTT and processing time, the model consistently selects service providers that offer reliable and accurate computer vision performance. These results establish a baseline against which heterogeneous environment behavior is analyzed in the subsequent section.
6.3. Heterogeneous Results
This section presents the experimental results obtained in the heterogeneous environment, where service-provider devices exhibit distinct hardware and computational characteristics. In this setup, a Windows device (D1) and an embedded GPU-based device (D3—Jetson) act as service providers, while a Raspberry Pi (D4) functions as the service requester. Unlike the homogeneous configuration, this environment exposes the proposed SIoT protocol to variations in compute capability, memory architecture, and system-level constraints.
The heterogeneous experiments are designed to evaluate the robustness of the SIoT trust-evaluation framework under realistic industrial conditions, where participating devices differ significantly in performance and reliability. The analysis focuses on round-trip time, detection accuracy, processing time, and friendship score behavior, with particular attention to failure scenarios observed during high-resolution video streaming.
6.3.1. Round-Trip Time (RTT) Analysis
This subsection analyzes round-trip time (RTT) behavior in the heterogeneous environment, where a Windows device (D1) and an embedded GPU-based device (D3—Jetson) act as service providers and a Raspberry Pi (D4) functions as the service requester. RTT is used here as an end-to-end communication delay metric, as defined earlier. The results are summarized in Table 7, Table 8 and Table 9 and illustrated in Figure 8.
Figure 8. Round-trip time (RTT) comparison for heterogeneous device configurations under different device–algorithm assignments and frame rates.
Compared to the homogeneous setup, RTT values in the heterogeneous environment exhibit higher variance and stronger sensitivity to resolution and frame rate. For moderate resolutions (1080p and 1920p) at sufficient bandwidth, RTT remains within operational limits for both service providers. However, for 4K streams, stable RTT measurements are not observed for the Jetson-based service provider across multiple bandwidth settings. Instead, repeated gRPC streaming failures occur, preventing sustained frame transmission.
The observed failures for 4K streaming on the Jetson device arise from the combined effects of high-resolution frame payloads, limited memory headroom, and sustained gRPC streaming pressure. 4K frames significantly increase serialization overhead and buffer occupancy, leading to backpressure within the gRPC pipeline. Under these conditions, frame queues grow rapidly, triggering transport-level timeouts or resource exhaustion before inference results can be returned. As a result, RTT becomes unbounded, and the service effectively fails rather than degrading gracefully.
In contrast, the Windows-based service provider maintains stable RTT behavior under identical 4K streaming conditions. This difference reflects the availability of greater system memory, more aggressive buffering capacity, and more robust operating-system scheduling on workstation-class hardware. These characteristics allow the Windows device to absorb transient surges in data rate and maintain bidirectional gRPC communication even at high resolutions.
For resolutions where both service providers operate successfully, RTT follows expected scaling behavior. RTT increases as bandwidth decreases and as frame rate increases, consistent with the accumulation of transmission and buffering delays. However, even under these conditions, the Jetson-based service provider exhibits higher RTT values than the Windows device, indicating reduced tolerance to sustained streaming load.
Overall, the heterogeneous RTT results highlight the importance of device capability awareness in SIoT-based service selection. While homogeneous environments mask such limitations, heterogeneous deployments expose failure thresholds that cannot be captured by average RTT values alone. These findings motivate the integration of reliability and failure-awareness into the trust-evaluation framework, as discussed in subsequent subsections.
6.3.2. Detection Accuracy and F1-Score Analysis
This subsection evaluates detection accuracy in the heterogeneous environment using precision, recall, and F1-score as ascending attributes. The analysis considers object-detection algorithms A1, A2, and A3 deployed on heterogeneous service providers with distinct computational capabilities, namely a Windows-based device (D1) and an embedded GPU-based Jetson device (D3). The corresponding results are summarized in Table 7, Table 8 and Table 9 and illustrated in Figure 9.
Figure 9. F1-score comparison of object-detection algorithms A1 and A2 under heterogeneous device configurations at different frame rates.
For configurations where stable streaming is achieved, detection accuracy primarily reflects algorithmic capability rather than device class. Algorithm A1 consistently delivers higher precision, recall, and F1-score than Algorithm A3 when executed on the Windows-based service provider, mirroring the behavior observed in the homogeneous environment. These results indicate that, in the absence of failures, algorithm selection remains the dominant factor governing detection accuracy even under heterogeneous hardware conditions.
The Jetson-based service provider exhibits acceptable detection accuracy for moderate resolutions such as 1080p and 1920p when bandwidth is sufficient. In these cases, Algorithm A3 achieves reasonable precision and recall, although its F1-score remains lower than that of Algorithm A1 executed on the Windows device. This difference reflects both algorithmic design and the constrained computational and memory resources available on embedded platforms.
For 4K streaming, detection accuracy metrics are not reported for the Jetson device due to repeated gRPC streaming failures. In these scenarios, frames are not processed reliably, resulting in incomplete or missing inference outputs. Consequently, precision, recall, and F1-score values are marked as failed rather than degraded. This distinction is important, as it emphasizes that the absence of accuracy measurements arises from system-level instability rather than poor inference quality.
Frame rate exerts a secondary influence on detection accuracy in the heterogeneous environment. At 60 FPS, detection accuracy remains comparable to 30 FPS for resolutions where streaming is stable, indicating that temporal sampling does not significantly degrade inference correctness prior to system saturation. However, higher frame rates reduce the operational margin for embedded devices, increasing susceptibility to streaming failure under high-resolution workloads.
Overall, the heterogeneous accuracy results demonstrate that algorithmic strength determines detection quality when execution is feasible, while device capability governs whether such execution remains stable. These findings reinforce the need for trust models that account for both accuracy and reliability, particularly in heterogeneous SIoT deployments where failure behavior cannot be inferred from accuracy metrics alone.
6.3.3. Processing Time vs. Bandwidth
This subsection analyzes processing time behavior in the heterogeneous environment under varying bandwidth, resolution, and frame-rate conditions. Processing time represents the model inference execution time on the service-provider device and excludes communication-related delays captured by RTT. The corresponding results are reported in Table 7, Table 8 and Table 9 and illustrated in Figure 10.
Figure 10. Processing time variation with network bandwidth under heterogeneous device configurations at 1080p resolution for different frame-rate settings.
For configurations where stable streaming is achieved, processing time remains largely independent of network bandwidth for both service providers. Across 100 Mbps, 50 Mbps, and 10 Mbps conditions, processing time exhibits only minor variation for a fixed resolution and frame rate. This observation confirms that inference execution cost is dominated by algorithm complexity and device compute capability rather than network throughput.
Clear differences emerge between workstation-class and embedded platforms. The Windows-based service provider consistently achieves lower and more stable processing times compared to the Jetson device for moderate resolutions such as 1080p and 1920p. This behavior reflects higher available CPU and memory bandwidth, more aggressive parallelism, and reduced contention for system resources on the Windows platform.
The Jetson-based service provider exhibits significantly higher processing time variability as resolution and frame rate increase. At 1920p resolution, processing time increases sharply at higher frame rates, indicating reduced scheduling slack and increased memory pressure on the embedded device. These effects are amplified at lower bandwidths, not because of network influence on inference execution, but due to upstream buffering and backpressure that reduce effective processing throughput.
For 4K configurations, processing time measurements are not reported for the Jetson device due to repeated streaming failures. In these cases, inference execution does not reach a steady state, preventing reliable processing-time measurement. This behavior highlights a fundamental limitation of embedded platforms when subjected to sustained high-resolution, high-frame-rate streaming workloads.
Overall, the heterogeneous processing-time results demonstrate that algorithm execution cost and device capability jointly determine inference feasibility. While bandwidth does not directly affect processing time, its interaction with buffering and streaming stability indirectly influences whether inference can proceed reliably on resource-constrained devices. These findings further motivate the inclusion of processing time as a descending attribute in the SIoT trust-evaluation framework.
6.3.4. Friendship Score Analysis
This subsection analyzes friendship score behavior in the heterogeneous environment by integrating detection accuracy, processing time, RTT, and reliability within the proposed SIoT trust-evaluation framework. Friendship scores are computed using weighted attribute aggregation followed by PageRank-based trust propagation. The corresponding results are summarized in
Table 7,
Table 8 and
Table 9 and illustrated in
Figure 11.
Figure 11.
Friendship score comparison under heterogeneous device configurations for representative bandwidth and frame-rate conditions.
In contrast to the homogeneous environment, friendship scores in the heterogeneous setup exhibit stronger differentiation across service providers. The Windows-based service provider consistently achieves higher friendship scores than the Jetson-based device across all stable configurations. This outcome reflects the combined effect of superior detection accuracy, lower processing time, and stable RTT behavior under varying network and workload conditions.
Failure behavior plays a decisive role in friendship-score computation. For 4K streaming scenarios, the Jetson-based service provider repeatedly experiences gRPC streaming failures, resulting in incomplete inference outputs. In these cases, friendship scores associated with the Jetson device are either not computed or significantly penalized, effectively excluding the device from selection. This behavior demonstrates that the trust model captures reliability implicitly, favoring service providers that maintain consistent execution rather than those that intermittently fail.
Low RTT alone does not guarantee a high friendship score in the heterogeneous environment. Several configurations exhibit comparable RTT values across service providers while yielding substantially different friendship scores. This outcome arises from the weighting strategy employed in the trust model, where accuracy and execution reliability exert stronger influence than latency under industrial computer vision workloads. Consequently, devices delivering stable and accurate inference are consistently prioritized even when latency differences are marginal.
Processing time contributes as a descending attribute but remains secondary to accuracy and reliability in determining friendship scores. Although the Jetson device may achieve acceptable processing times under moderate workloads, its susceptibility to failure under high-resolution streaming reduces its overall trustworthiness. In contrast, the Windows-based service provider maintains stable performance across all tested configurations, resulting in consistently higher trust scores.
Overall, the heterogeneous friendship score results validate the ability of the proposed SIoT framework to distinguish between capable and unreliable service providers under realistic industrial conditions. By jointly considering accuracy, latency, processing cost, and failure behavior, the trust-evaluation mechanism selects service providers that offer dependable and sustained computer vision performance. These results confirm that socially grounded trust modeling is essential for robust service selection in heterogeneous SIoT deployments.
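The friendship-score pipeline described above can be sketched as weighted attribute aggregation followed by PageRank propagation over a directed trust graph. The sketch below uses a plain power-iteration PageRank; the provider names, aggregated attribute scores, and graph layout are illustrative assumptions, not the paper's measured data or exact graph topology.

```python
# Minimal sketch: PageRank-based friendship scoring over a small
# weighted directed trust graph. Power iteration, no external libraries.

def pagerank(edges, nodes, damping=0.85, tol=1e-8, max_iter=100):
    """edges: dict mapping (src, dst) -> positive weight.
    Returns dict mapping node -> PageRank score (sums to 1)."""
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    out_sum = {v: 0.0 for v in nodes}          # total outgoing weight per node
    for (src, _), w in edges.items():
        out_sum[src] += w
    for _ in range(max_iter):
        new = {v: (1.0 - damping) / n for v in nodes}
        for (src, dst), w in edges.items():
            if out_sum[src] > 0:
                new[dst] += damping * rank[src] * w / out_sum[src]
        dangling = sum(rank[v] for v in nodes if out_sum[v] == 0)
        for v in nodes:                        # spread dangling mass uniformly
            new[v] += damping * dangling / n
        converged = sum(abs(new[v] - rank[v]) for v in nodes) < tol
        rank = new
        if converged:
            break
    return rank

# Hypothetical aggregated attribute scores (higher is better), e.g. from
# accuracy-dominant weighting of F1, RTT, and processing time.
agg = {"SP-A": 0.82, "SP-B": 0.55}
nodes = ["SC", "SP-A", "SP-B"]                 # SC = service consumer
edges = {("SC", sp): s for sp, s in agg.items()}
edges.update({(sp, "SC"): 1.0 for sp in agg})  # reciprocal social ties
scores = pagerank(edges, nodes)
best = max(agg, key=lambda sp: scores[sp])     # trust-based selection
```

With these illustrative inputs, the provider with the stronger aggregated attribute score receives the larger PageRank mass and is selected, mirroring the selection behavior described in this subsection.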
6.4. SIoT Graph Construction and Friendship Scoring Overhead
This subsection evaluates the computational overhead introduced by SIoT graph construction and PageRank-based friendship scoring across both homogeneous and heterogeneous experimental environments. Unlike inference execution and communication delay, these operations depend primarily on the size and topology of the SIoT graph rather than video resolution, frame rate, bandwidth, or device capability. The corresponding measurements are reported in
Table 4,
Table 5,
Table 6,
Table 7,
Table 8 and
Table 9.
Across all experimental configurations, the SIoT graph structure remains fixed at 11 nodes and 22 directed edges. Consequently, graph construction time remains stable in both homogeneous and heterogeneous setups, typically between approximately 1.3 s and 1.8 s. Minor variations are observed across experiments; however, these variations do not correlate with changes in resolution, frame rate, bandwidth, or algorithm selection. Instead, they arise from runtime factors such as memory allocation, interpreter scheduling, and graph initialization overhead.
The PageRank-based friendship scoring time exhibits minimal variation across all experiments, typically ranging between 0.019 s and 0.023 s. Since the graph topology and edge weights remain structurally consistent, this variation does not reflect changes in trust computation complexity. Rather, it is attributed to runtime effects, including thread scheduling, cache behavior, memory allocation, and floating-point convergence characteristics during PageRank iteration. Importantly, the observed variation is negligible when compared to end-to-end communication delay and inference execution time.
Overall, these results demonstrate that SIoT graph construction and friendship scoring do not constitute performance bottlenecks in the proposed protocol. Compared to the latency introduced by video streaming and object-detection inference, the overhead associated with trust evaluation is minimal. This efficiency is particularly relevant for industrial SIoT deployments, where trust relationships may require frequent recomputation without compromising real-time responsiveness or system stability.
6.5. Scalability Analysis of the SIoT Trust Graph
To evaluate the scalability of the proposed SIoT trust computation beyond the size of the physical testbed, a synthetic trust-graph analysis was conducted. As illustrated in
Figure 2, the SIoT framework operates at the trust-evaluation layer, where each service-providing device contributes multiple logical trust nodes representing performance metrics, execution characteristics, and device capabilities. As a result, the size of the SIoT graph increases with the number of participating devices and their associated evaluation attributes, rather than with the physical device count alone.
In this context, stability refers to the consistency of the relative PageRank ordering and the selected service provider as the SIoT trust graph scales in size, rather than to statistical variance or sensitivity metrics. In the scalability study, synthetic SIoT trust graphs were constructed to emulate deployments with increasing numbers of service providers. For each service provider, multiple logical trust nodes were instantiated to capture detection accuracy, communication latency, processing time, and device-level characteristics. Weighted directed edges were computed using the Relative Percentage Difference (RPD) formulation combined with workload-specific attribute weights. PageRank-based friendship scoring was then applied to the resulting graphs to assess convergence behavior and computational overhead under increasing graph size.
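As a concrete illustration of the edge-weight construction, the following sketch combines per-attribute Relative Percentage Difference with fixed category weights. It assumes the common symmetric RPD definition, |a − b| / ((a + b)/2), and the metric values and weights are hypothetical placeholders, not the weights specified in Section 5.

```python
def rpd(a, b):
    """Relative Percentage Difference between two metric values, in percent.
    Assumes the common symmetric definition; returns 0 when both are 0."""
    denom = (abs(a) + abs(b)) / 2.0
    return 0.0 if denom == 0 else abs(a - b) / denom * 100.0

# Hypothetical per-provider metrics and category weights.
metrics_a = {"f1": 0.91, "rtt_ms": 12.0, "proc_ms": 45.0}
metrics_b = {"f1": 0.78, "rtt_ms": 14.0, "proc_ms": 120.0}
weights   = {"f1": 0.50, "rtt_ms": 0.25, "proc_ms": 0.25}

# Weighted edge contribution: a large RPD on a heavily weighted
# attribute produces stronger trust differentiation between providers.
edge_weight = sum(weights[k] * rpd(metrics_a[k], metrics_b[k])
                  for k in weights)
```

Identical metric values yield a zero contribution, so edge weights grow only where providers genuinely differ on weighted attributes.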
The evaluation considered configurations with up to 50 service providers, corresponding to approximately 500 logical trust nodes and nearly 30,000 weighted edges, as summarized in
Table 10. Across all evaluated network sizes, PageRank convergence was stable under the fixed damping factor and convergence threshold described in
Section 5, and the relative ordering of top-ranked trust nodes remained consistent across repeated executions. While absolute PageRank friendship scores varied with graph size, the relative ordering of top-ranked devices and the resulting trust-based service-selection decisions remained stable and were not sensitive to network scale.
The computational overhead of PageRank-based trust evaluation remained low as the graph size increased. Even for the largest evaluated configuration, the trust-scoring computation completed within a few tens of milliseconds, confirming that PageRank-based friendship evaluation can be executed frequently without compromising system responsiveness. For very small graphs, observed execution-time variability was dominated by Python runtime and library initialization overhead rather than graph size and therefore does not reflect scalability trends. It is noted that the reported PageRank execution times correspond exclusively to the in-memory trust-scoring computation and exclude protocol-level communication and inference overheads, which dominate end-to-end latency in the physical experimental testbed.
These results demonstrate that the proposed socially grounded SIoT trust model scales efficiently with increasing trust-graph dimensionality and introduces negligible overhead relative to inference and communication costs. This confirms the suitability of the framework for frequent trust re-evaluation in dynamic industrial IoT environments.
6.6. Baseline Selector Comparison
To isolate the contribution of socially grounded trust computation from conventional metric-based decision strategies, this section compares the proposed SIoT framework against three non-social baseline selectors. The comparison focuses on decision outcomes, rather than re-measuring performance, and operates exclusively on the summary metrics already reported in
Section 6.2 and
Section 6.3.
6.6.1. Baseline Selector Definitions
Three baseline selectors are considered:
Accuracy-First Selector: Selects the service provider with the highest detection accuracy, measured using the F1-score.
Latency-First Selector: Selects the device with the lowest round-trip time (RTT), prioritizing communication responsiveness.
Weighted-Sum Selector (Non-Social): Applies direct normalization and fixed weighting to accuracy, RTT, and processing time using the same relative importance as the proposed trust model, but without SIoT graph construction or PageRank-based trust propagation.
All baseline selections are derived deterministically from the reported summary metrics, and no additional experimental executions are performed.
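The three baseline selectors can be expressed directly over the summary metrics. The sketch below is a minimal interpretation of the definitions above; the provider names and metric values are illustrative, and the weighted-sum normalization (min–max across candidates) is an assumption about the "direct normalization" step rather than the paper's exact formula.

```python
# Illustrative summary metrics for two candidate service providers.
providers = {
    "windows_sp": {"f1": 0.90, "rtt_ms": 11.0, "proc_ms": 40.0},
    "jetson_sp":  {"f1": 0.88, "rtt_ms": 15.0, "proc_ms": 130.0},
}

def accuracy_first(p):
    """Select the provider with the highest F1-score."""
    return max(p, key=lambda d: p[d]["f1"])

def latency_first(p):
    """Select the provider with the lowest RTT."""
    return min(p, key=lambda d: p[d]["rtt_ms"])

def weighted_sum(p, w=(0.50, 0.25, 0.25)):
    """Non-social weighted sum over min-max normalized metrics.
    RTT and processing time are descending attributes (lower is better)."""
    def norm(key, invert):
        vals = [p[d][key] for d in p]
        lo, hi = min(vals), max(vals)
        span = (hi - lo) or 1.0
        return {d: ((hi - p[d][key]) if invert else (p[d][key] - lo)) / span
                for d in p}
    acc, rtt, proc = norm("f1", False), norm("rtt_ms", True), norm("proc_ms", True)
    score = {d: w[0] * acc[d] + w[1] * rtt[d] + w[2] * proc[d] for d in p}
    return max(score, key=score.get)
```

With these illustrative numbers one provider dominates every metric, so all three selectors converge on the same device, matching the convergence behavior reported for the homogeneous environment.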
6.6.2. Baseline Comparison in the Homogeneous Environment
Table 11 compares service-selection outcomes under representative homogeneous configurations. Representative configurations are selected to span low-load, moderate-load, and high-stress operating regimes; all underlying metrics for other configurations are reported in
Table 4,
Table 5 and
Table 6. Since service providers exhibit comparable hardware and network characteristics in this environment, performance differences primarily arise from algorithmic behavior rather than device heterogeneity. Across homogeneous scenarios, baseline selectors and the proposed SIoT method frequently converge on the same device. This convergence reflects clear dominance in detection accuracy and minimal variation in runtime behavior between service providers. The proposed SIoT framework preserves these dominant performance signals and does not introduce artificial bias when metric differences are unambiguous.
6.6.3. Baseline Comparison in the Heterogeneous Environment
Table 12 presents baseline selection outcomes for representative heterogeneous configurations involving resource-diverse devices. These scenarios introduce competing objectives between accuracy and runtime feasibility, reflecting realistic industrial deployment conditions. In several heterogeneous cases, baseline selectors and the proposed SIoT method coincide due to strong and consistent performance dominance by a single device. This outcome indicates that PageRank-based trust propagation respects clear evidence rather than overriding it. Divergence between selectors arises only under conditions where metric trade-offs or performance variability become significant, at which point the proposed framework emphasizes stability and consistency across repeated interactions.
6.6.4. Weight Sensitivity Analysis
To evaluate the robustness of the proposed SIoT trust model with respect to weight assignment, a sensitivity analysis is conducted by varying the top-level category weights assigned to accuracy and processing time while preserving internal metric ratios. Accuracy and processing-time weights are varied across a broad but reasonable range (5–25%), reflecting different application priorities.
Table 13 summarizes the resulting service-selection outcomes for representative homogeneous and heterogeneous configurations. Across all tested weight combinations, the selected service provider remains unchanged. Absolute PageRank friendship scores exhibit only minor, smooth variations under weight perturbation, while the relative ordering of service providers and final selection remain unchanged, indicating that decision outcomes are driven by dominant performance relationships rather than finely tuned weight values. In the heterogeneous configuration, devices that fail to execute under given resource constraints are excluded from the candidate set, and weight variation does not override feasibility constraints. These results demonstrate that the chosen weights are not tuned post hoc and that the proposed SIoT-based selection strategy is robust to reasonable weight perturbations.
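The sensitivity sweep can be sketched as follows, using a weighted-sum proxy for the selection step (the full model additionally applies PageRank propagation). The accuracy and processing-time weights are varied over the stated 5–25% grid while the remainder is assigned to RTT; the metric values, the renormalization rule, and the two-provider setup are illustrative assumptions.

```python
# Illustrative metrics for two providers; the first dominates accuracy
# and RTT, the second is marginally faster in processing time.
providers = {
    "sp1": {"f1": 0.92, "rtt_ms": 11.0, "proc_ms": 50.0},
    "sp2": {"f1": 0.71, "rtt_ms": 13.0, "proc_ms": 48.0},
}

def select(w_acc, w_proc):
    """Select a provider under given accuracy/processing-time weights;
    the remaining weight mass is assigned to RTT."""
    w_rtt = 1.0 - w_acc - w_proc
    def norm(key, invert):
        vals = [providers[d][key] for d in providers]
        lo, hi = min(vals), max(vals)
        span = (hi - lo) or 1.0
        return {d: ((hi - providers[d][key]) if invert
                    else (providers[d][key] - lo)) / span for d in providers}
    acc, rtt, proc = norm("f1", False), norm("rtt_ms", True), norm("proc_ms", True)
    score = {d: w_acc * acc[d] + w_rtt * rtt[d] + w_proc * proc[d]
             for d in providers}
    return max(score, key=score.get)

# Sweep both varied categories across 5-25% in 5% steps and collect
# the set of distinct selection outcomes.
choices = {select(wa / 100, wp / 100)
           for wa in range(5, 26, 5) for wp in range(5, 26, 5)}
```

When one provider dominates the heavily weighted attributes, the selection set collapses to a single device across the entire grid, which is the stability property the sensitivity analysis verifies.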
7. Discussion
While the proposed framework does not aim to benchmark or improve object-detection models, the validity of trust-aware service selection necessarily depends on the baseline reliability of the deployed detection pipelines. In this study, detection algorithms and datasets are intentionally chosen to reflect commonly used industrial computer vision workloads rather than to achieve state-of-the-art accuracy. Consequently, absolute F1-score values should be interpreted as representative of realistic operating conditions rather than as indicators of model optimality. Trust-aware selection is therefore meaningful within feasible operating regimes where detection quality is acceptable, and the framework is designed to identify the most reliable service provider under these practical constraints.
The discussion of homogeneous and heterogeneous results reveals that trust-aware service selection in the Social Internet of Things (SIoT) is shaped by both algorithmic performance and system-level feasibility constraints. This study evaluates whether socially grounded trust computation can reliably guide service selection when devices differ in performance, execution constraints, and network conditions, rather than assessing detector performance in isolation.
7.1. Key Observations Across Experimental Environments
Results from the homogeneous environment establish a controlled baseline in which service-provider devices exhibit similar hardware configurations and operating systems. Under these conditions, detection accuracy emerges as the primary differentiating factor, while round-trip time and processing latency remain largely comparable across devices. Consequently, the friendship score consistently favors the service provider executing the higher-accuracy algorithm, confirming that the proposed weighting strategy behaves as expected when device-level variability is minimal.
In contrast, the heterogeneous environment introduces realistic asymmetry in compute capability, memory architecture, and execution efficiency. Here, performance differentiation arises not only from algorithmic accuracy but also from device suitability and runtime stability. The observed divergence in processing time and RTT highlights the importance of incorporating multiple performance dimensions into trust evaluation, particularly under high-resolution and high-frame-rate workloads.
7.2. Homogeneous vs. Heterogeneous Trust Dynamics
A key insight from this work is the fundamentally different role played by trust metrics across homogeneous and heterogeneous environments. In homogeneous configurations, trust computation is dominated by algorithm-level performance, as device capabilities are effectively interchangeable. In heterogeneous configurations, however, trust emerges from the interaction between algorithm capability and device execution feasibility.
Importantly, the heterogeneous experiments intentionally preserve realistic deployment constraints. Algorithm–device pairings are restricted to configurations that are stable and deployable under real-time streaming conditions, reflecting practical SIoT deployments rather than synthetic benchmarking scenarios. This design choice ensures that trust evaluation reflects service reliability rather than forced execution of unsuitable algorithm–device combinations.
The resulting friendship scores demonstrate that high algorithmic accuracy alone does not guarantee superior trust when deployed on resource-constrained platforms. Instead, SIoT service selection favors configurations that achieve a balanced trade-off between detection performance, communication latency, and processing efficiency. This behavior confirms that the proposed framework captures performance asymmetries introduced by heterogeneity in a principled and context-aware manner.
7.3. Robustness Under Feasibility Constraints
Observed detection performance, including absolute F1-scores, is also influenced by the intrinsic characteristics of the deployed detection algorithms and their pretrained configurations. The evaluated models are selected to represent commonly used industrial detection pipelines rather than to maximize benchmark accuracy on a specific dataset. Consequently, variations in F1-score reflect both algorithmic capability and system-level execution constraints. The proposed SIoT framework does not seek to optimize detection models themselves but instead provides a trust-aware mechanism to select the most reliable service provider given the available algorithms and operating conditions.
7.4. Protocol Overhead and Practical Feasibility
The computational overhead associated with SIoT trust evaluation is separated into two distinct components: trust-graph construction and PageRank-based friendship scoring. The reported graph construction time corresponds to initializing or updating the trust graph based on aggregated execution outcomes rather than performing per-frame recomputation. In the proposed framework, trust-graph construction is triggered at coarse temporal granularity, such as after completion of a workload batch or upon significant changes in device availability or execution behavior. As a result, the one-time construction overhead is distributed across multiple inference executions and does not dominate end-to-end latency.
From a practical standpoint, the experimental setup reflects realistic industrial scenarios, where a limited number of heterogeneous edge devices collaborate to process high-rate visual data streams. The observed behaviors mirror conditions commonly encountered in smart surveillance, automated inspection, and industrial monitoring systems, where device capability, network variability, and workload intensity interact in complex ways.
Trust-graph construction is therefore treated as an event-driven operation, triggered during initial system setup or when significant changes occur, such as the addition or removal of service-providing devices, administrative reconfiguration, or sustained changes in execution behavior. In contrast, PageRank-based friendship scoring operates on an already constructed in-memory graph and completes within a few tens of milliseconds, enabling frequent trust re-evaluation without compromising system responsiveness.
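The event-driven maintenance policy described above can be sketched as a simple trigger predicate: membership or administrative events always force a rebuild, while execution-behavior drift triggers one only when it is sustained beyond a threshold. The event names and the drift threshold below are illustrative assumptions, not values from the framework.

```python
# Events that always trigger trust-graph reconstruction (hypothetical names).
REBUILD_EVENTS = {"device_joined", "device_left", "admin_reconfig"}
DRIFT_THRESHOLD = 0.2  # illustrative relative change in aggregated execution score

def needs_rebuild(event, prev_score=None, new_score=None):
    """Return True when the trust graph should be reconstructed.
    Membership/administrative events always rebuild; execution drift
    rebuilds only when the relative score change exceeds the threshold."""
    if event in REBUILD_EVENTS:
        return True
    if event == "execution_drift" and prev_score:
        return abs(new_score - prev_score) / prev_score > DRIFT_THRESHOLD
    return False
```

Between rebuilds, only the millisecond-scale PageRank re-scoring runs per selection request, which is what keeps the amortized trust-evaluation overhead low.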
7.5. Practical Relevance and Experimental Scope
Recent work on distributed industrial edge systems has shown that lightweight consensus mechanisms enable consistent sensing data sharing under severe resource constraints by reducing communication and computation overhead through techniques such as node sampling and dynamically adjusted trust or reputation. These approaches focus on maintaining data integrity, provenance, and system-wide consistency while remaining scalable and fault tolerant in large, heterogeneous IoT deployments [
51]. The proposed SIoT trust-evaluation framework complements such mechanisms by operating at a higher service-selection layer. While lightweight consensus protocols coordinate shared system state and data synchronization across edge servers, the SIoT framework leverages locally observed execution outcomes to guide trust-aware selection of feasible and reliable computer vision services. In a combined deployment, consensus mechanisms provide an efficient coordination and consistency substrate, whereas the SIoT framework exploits execution-level feedback to adapt service selection under heterogeneous device capabilities and variable network conditions. This separation of concerns allows both approaches to work together without tight coupling, enhancing scalability and practical deployability in distributed industrial edge environments.
A common concern in SIoT evaluation is whether experiments involving a limited number of devices adequately reflect real-world deployments. In this work, the experimental setup is intentionally scoped to a small but representative set of service providers to enable controlled analysis of trust dynamics. SIoT trust computation is inherently relative: service providers are evaluated based on comparative performance rather than absolute scale.
Even with a small number of devices, the underlying friendship graph construction and PageRank-based scoring mechanism exhibit stable and interpretable behavior. These results indicate that meaningful trust differentiation can emerge without requiring large-scale deployments, particularly during early-stage system validation or in localized industrial settings where device populations are naturally constrained.
Moreover, industrial IoT deployments often evolve incrementally, beginning with a limited number of heterogeneous devices before scaling. The presented evaluation reflects this practical progression and demonstrates that the proposed SIoT framework remains effective under such conditions.
7.6. Limitations and Future Directions
While the proposed SIoT trust-evaluation framework demonstrates consistent and stable service selection under realistic execution constraints, several limitations should be acknowledged. First, the physical experimental evaluation involves a limited number of service-providing devices, resulting in a trust graph comprising 11 logical nodes and 22 edges. Although synthetic scalability analysis indicates stable PageRank convergence for larger graphs, the physical results primarily reflect small- to medium-scale industrial deployments and may not capture all dynamics present in large-scale systems.
Second, the experimental study relies on a selected set of pretrained object detection models and publicly available traffic surveillance datasets. Detection accuracy, failure characteristics, and execution behavior are therefore influenced by the intrinsic properties of these models and datasets. While the proposed SIoT framework is model- and dataset-agnostic, absolute performance values and failure rates may vary under alternative architectures, training regimes, or application domains.
Finally, failure handling in this work focuses on execution feasibility and performance degradation rather than on safety-critical fault tolerance or adversarial resilience.
Configurations that cannot complete inference are excluded from service selection, and repeated execution failures reduce trust scores over time. This approach supports robust service selection under resource constraints but does not replace dedicated safety mechanisms required in mission-critical industrial systems.
Future work will explore larger and more diverse device populations, adaptive trust-weight tuning, and dynamic model migration to further evaluate scalability and autonomy. Incorporating additional trust dimensions, such as availability history and security posture, and exploring cross-vendor acceleration backends represent promising directions for extending the framework toward more resilient industrial deployments.
8. Conclusions
This work presented a socially driven protocol for the Social Internet of Things (SIoT) that enables trust-aware service selection for distributed industrial computer vision workloads. The proposed framework integrates multi-dimensional performance metrics (detection accuracy, communication latency, processing time, and device-level execution characteristics) within a graph-based trust model, allowing service providers to be selected based on observed behavior rather than static device descriptions.
Experimental evaluation across both homogeneous and heterogeneous environments shows that the proposed SIoT framework supports consistent and stable trust-based service selection within feasible operating regimes.
In homogeneous settings, where service providers exhibit comparable hardware capabilities, algorithmic accuracy dominates trust outcomes, and higher-performing detection pipelines achieve higher friendship scores. This behavior validates the adopted weighting strategy in environments where execution feasibility is uniform and resource constraints are minimal.
In heterogeneous environments comprising resource-diverse devices, trust computation reflects the combined influence of algorithm capability and device execution feasibility. Under high-resolution and high-frame-rate workloads, service providers with stronger processing capacity and stable execution profiles are preferentially selected, resulting in clear differentiation as hardware resources, network conditions, and workload intensity vary. Device–algorithm configurations that fail to execute under given constraints are excluded from service selection, allowing the framework to operate reliably within feasible operating regions.
The experimental results further indicate that network bandwidth primarily affects communication delay without significantly impacting detection correctness. Specifically, reducing available bandwidth from 100 Mbps to 10 Mbps increases round-trip communication latency by approximately one order of magnitude, while detection accuracy remains largely invariant. In contrast, processing capability and execution stability emerge as decisive factors under demanding workloads. From a systems perspective, SIoT trust evaluation introduces limited computational overhead: trust-graph construction incurs a one-time or event-driven cost on the order of seconds, while PageRank-based friendship scoring is completed within tens of milliseconds and can be executed frequently without impacting end-to-end service latency.
The practical applicability of the proposed SIoT framework is conditioned on operating regimes where candidate device–algorithm configurations are feasible and detection performance remains within acceptable bounds for the target application. Experimental results show that service selection is fundamentally constrained by device capability and algorithm feasibility under given workload conditions. Within these bounded feasibility regions, the framework consistently enables stable and trust-aware service selection, but it does not eliminate limitations imposed by hardware resources or algorithm suitability.
Overall, the proposed SIoT protocol demonstrates the effectiveness of socially grounded trust computation for coordinating heterogeneous edge devices in industrial IoT environments. By jointly considering accuracy, efficiency, and runtime behavior, the framework supports principled and explainable service selection under dynamic conditions. Future work will explore scaling the framework to larger SIoT deployments, adaptive trust-weight tuning under evolving workloads, and the incorporation of additional trust attributes such as availability history and fault resilience to further enhance applicability in safety-critical industrial systems.