1. Introduction
Algorithmic management is becoming increasingly embedded in warehouse operations as third-party logistics (3PL) firms adopt system-mediated task allocation, productivity monitoring, target setting, and dynamic scheduling in daily workflows. In these environments, algorithmic systems not only determine which tasks are performed but also influence how work is paced, sequenced, monitored, and evaluated. As a result, optimization tools increasingly function as governance mechanisms. Their success depends not only on technical accuracy or operational efficiency, but also on employee cooperation. Workers evaluate algorithmic control through questions of fairness, trustworthiness, autonomy, transparency, workload pressure, and privacy. When these conditions are weak, algorithmic systems that are intended to stabilize operations may instead generate resistance, workarounds, disengagement, and behavioral volatility that disrupt consistency and throughput [1,2,3].
This problem can be illustrated through a practical 3PL warehouse scenario. During a peak-demand period, an algorithmic management system may automatically assign picking tasks, adjust productivity targets, monitor worker performance in real time, and flag deviations from expected output. From a managerial perspective, this system may appear efficient because it improves coordination and speeds up decision-making. However, from an employee perspective, the same system may create concerns if workers do not understand how targets are calculated, if exceptional circumstances cannot be appealed, if monitoring feels excessive, or if workload pressure becomes unsafe. In such a situation, the manager must decide which governance package should be prioritized: transparency, contestability, human oversight, privacy safeguards, workload protection, autonomy-preserving work design, or an integrated human-centered governance package. This is not a simple technical choice. It is a socio-technical decision problem in which psychological, ethical, operational, and compliance-related criteria interact with each other.
The issue is particularly significant in 3PL warehouses because operating conditions magnify both the value and the risks of algorithmic management. Peak-demand periods intensify throughput pressure, labor volatility complicates process standardization, and safety-critical decisions are often made under severe time constraints. In such settings, weaknesses in governance can quickly become operational problems. Workarounds, disengagement, and turnover may reduce process reliability, while incident risk and quality failures may increase. These effects can accumulate into reputational, compliance, and legal exposure, especially as digital monitoring and performance enforcement become more pervasive. Recent logistics research also links algorithmic management with psychological strain, musculoskeletal problems, and occupational accidents. This indicates that governance design should be treated not only as a human resource concern, but also as a logistics, safety, and sustainability concern in high-intensity warehouse systems [4].
Although interest in algorithmic management has grown, an important decision-oriented gap remains. Much of the existing literature discusses governance challenges in conceptual or qualitative terms. These studies are valuable because they explain the risks of algorithmic control, but they provide limited guidance on what managers should prioritize when resources are constrained. At the same time, many decision models used in logistics rely on static weights and assumptions of criterion independence. This is problematic because algorithmic management governance criteria are structurally interconnected. Transparency may strengthen trust. Procedural fairness may reduce resistance. Privacy concerns may weaken acceptance. Workload pressure may increase disengagement. Resistance may directly undermine operational stability. Therefore, governance package selection requires a method that can model uncertainty, interdependence, and scenario-specific priorities rather than treating all criteria as isolated factors [5,6].
To address this gap, this study develops a human-centered and AI-supported decision framework for prioritizing algorithmic management governance packages in 3PL warehousing. The framework integrates employee survey evidence with expert judgment. Survey data are used to examine psychological and behavioral mechanisms such as procedural fairness, trust in algorithmic decisions, perceived autonomy, transparency, privacy concern, workload, acceptance, and disengagement risk. These constructs are then translated into governance criteria and refined through expert evaluation. In this way, the model connects employee-level perceptions with managerial decision-making needs.
The proposed framework combines three methodological components. First, Dynamic Multi-Facet Fuzzy Sets are used to represent expert judgments through multiple evidence facets, including agreement, disagreement, hesitancy, engagement, and resistance. This is important because expert evaluations of governance packages are rarely fully certain or one-dimensional. Second, a dependency-aware Bayesian Network is used to derive scenario-sensitive criteria weights by considering the relationships among governance criteria. This responds to the fact that criteria such as transparency, trust, fairness, privacy, workload, and resistance do not operate independently. Third, PCA-based ranking optimization is applied to generate discriminative rankings of governance packages. This step is useful because governance alternatives may receive close scores, and managers need rankings that are interpretable, robust, and practically usable. The model is evaluated under three operational stress conditions: peak season, labor shortage, and audit pressure.
Accordingly, the study is guided by three research questions:
RQ1. Which human-centered governance criteria are most influential in prioritizing algorithmic management governance packages in 3PL warehouses?
RQ2. How do operational stress scenarios, namely peak season, labor shortage, and audit pressure, change criteria weights and governance package rankings?
RQ3. Which governance package remains most robust across scenarios, and how does the proposed framework improve decision clarity under uncertainty?
The study makes three contributions. First, it contributes conceptually by framing algorithmic management governance as a distinct logistics management decision domain. In this domain, stable warehouse performance depends not only on technical optimization, but also on legitimacy, trust, autonomy, fairness, and calibrated reliance [1,2]. Second, it contributes methodologically by integrating DMFF-based uncertainty representation, Bayesian Network-based dependency weighting, and PCA-based ranking optimization into a single decision framework. This provides a more suitable approach for governance problems in which criteria are uncertain, interdependent, and sensitive to operating conditions [6]. Third, it contributes practically by offering scenario-based governance recommendations for 3PL managers. These recommendations can help managers select governance packages that reduce resistance, protect employees, support compliance, and maintain throughput and operational stability under stress conditions that often destabilize algorithmic management initiatives [4].
Overall, this study argues that responsible algorithmic management in 3PL warehousing requires more than technical transparency or compliance documentation. It requires a decision framework that can evaluate how governance mechanisms work together under real operating pressures. By embedding human, organizational, and ethical dimensions into an interpretable decision model, the study provides a practical and theoretically grounded approach for selecting governance packages in algorithmically managed warehouse work systems.
3. Methodology
3.1. Research Design
This study adopts a sequential mixed-method research design to prioritize human-centered governance packages for algorithmic management in third-party logistics (3PL) warehousing. The design includes three connected stages: employee survey and SEM validation, expert-based Dynamic Multi-Facet Fuzzy Sets (DMFFS) modeling, and dependency-aware ranking through Bayesian Network weighting and PCA-based ranking optimization. This structure is used because algorithmic management governance cannot be assessed only through technical efficiency. It also requires evidence about employee perceptions, expert judgment about operational feasibility, and scenario-sensitive evaluation of alternative governance packages.
In the first stage, survey data are collected from warehouse employees exposed to technology-mediated task allocation, scanner-based monitoring, KPI feedback, or algorithmic scheduling. SEM is used to validate the measurement model and examine psychological and behavioral mechanisms, including procedural fairness, transparency, system/information quality, autonomy, privacy concern, workload, trust, acceptance, resistance, and disengagement risk. The purpose of this stage is not to rank alternatives directly, but to identify the human-centered mechanisms that should inform the decision model.
In the second stage, the survey-supported constructs are translated into six decision criteria through literature support and expert validation: procedural fairness (PF), transparency and contestability clarity (TP), system and information quality (SQ), autonomy support (AU), privacy boundary governance (PR), and workload protection (WL). Trust, acceptance, resistance, disengagement, and operational stability are retained as pathway or outcome mechanisms rather than additional weighted criteria. This distinction clarifies how the SEM stage connects to the decision-model stage.
In the third stage, a seven-member expert panel evaluates five implementable governance package alternatives under three operational scenarios: peak season, labor shortage, and audit pressure. Expert judgments are represented through DMFFS to capture membership, non-membership, hesitancy, expected engagement, and expected resistance. Bayesian Network weighting is then used to derive scenario-specific criteria weights by modeling dependencies among governance criteria and behavioral mechanisms. Finally, PCA-based ranking optimization uses the weighted DMFFS decision matrix to generate scenario-specific and robust rankings of the governance packages.
The unit of analysis is the selection of a governance package for algorithmic management in 3PL warehousing. A governance package is defined as a coherent bundle of mechanisms that shapes how algorithmic systems are implemented, explained, monitored, contested, and controlled. The study evaluates packages rather than isolated policies because responsible algorithmic management depends on the combined effect of transparency routines, contestability channels, human oversight, privacy boundaries, autonomy support, and workload protection.
Figure 2 presents the integrated methodological workflow of the proposed framework. The workflow shows the connection between employee survey evidence, SEM validation, survey-to-criteria mapping, expert-based DMFFS elicitation, Bayesian Network weighting, PCA-based ranking optimization, robustness testing, and governance recommendations.
To improve transparency without overloading the text, Table 1 summarizes the methodological pipeline.
3.2. Decision Model Specification and Survey-to-Criteria Mapping
The decision model defines governance package selection for algorithmic management in 3PL warehousing as a multi-criteria decision problem. The final model evaluates five governance package alternatives against six human-centered governance criteria under three operational scenarios. This package-based structure is preferred because responsible algorithmic management depends on the combined effect of transparency, contestability, human oversight, privacy boundaries, autonomy support, and workload protection rather than on isolated policies.
To ensure consistency between the methodology, expert evaluation, computational analysis, and reported results, the alternative set is consolidated into five implementable governance packages, denoted as $A_i$ and collected in the set $A = \{A_1, A_2, \ldots, A_5\}$, where $i = 1, \ldots, 5$.
Table 2 presents the five governance package alternatives evaluated in this study for algorithmic management in 3PL warehouses.
To reduce overlap among the governance alternatives, the revised model treats A1, A2, and A5 as different levels and types of governance intervention. A1 is defined as the baseline compliance layer that provides minimum documentation, notices, and boundary disclosure. A2 is defined as an operational transparency–workload–autonomy package because it directly addresses target explanation, workload protection, and limited discretion during daily warehouse work. A5 is defined as an advanced privacy–governance package focused on data minimization, access control, retention control, privacy impact review, audit trails, and restrictions on secondary use. Accordingly, shared terms such as monitoring boundaries are interpreted at different depths: as basic disclosure in A1 and as advanced privacy-by-design control in A5.
The final decision model uses six criteria: procedural fairness (PF), transparency and contestability clarity (TP), system and information quality (SQ), autonomy support (AU), privacy boundary governance (PR), and workload protection (WL). These criteria were selected because they are theoretically grounded, empirically relevant in the survey stage, and practically actionable for 3PL managers. All six criteria are treated as benefit criteria, meaning that higher values indicate stronger governance performance.
Table 3 summarizes the final decision criteria used to evaluate the governance package alternatives.
The six criteria do not replace the broader psychological and behavioral constructs examined in the survey. Instead, the model separates decision criteria from pathway and outcome variables. The six criteria represent governance levers that managers can influence through policy, system design, and operational routines. Trust, acceptance, resistance, disengagement, and operational stability are retained as mechanisms that explain how these governance levers affect employee cooperation and warehouse stability.
Table 4 explains how the survey constructs are translated into their corresponding roles within the decision model.
This mapping clarifies that survey results are not used as direct rankings. Rather, they provide empirical support for identifying the governance dimensions included in the decision model. Constructs that correspond to actionable governance levers become decision criteria, while constructs explaining behavioral consequences are retained as dependency-path variables in the Bayesian Network. Because governance priorities may change under operational stress, the model evaluates the alternatives under three scenarios: peak season surge, labor shortage/high turnover, and audit pressure/compliance scrutiny.
Table 5 presents the operational scenarios used to evaluate the robustness of the governance package alternatives.
Formally, the decision problem is defined as:
$$\mathcal{D} = \langle A, C, S \rangle$$
where $A = \{A_1, \ldots, A_5\}$ is the set of governance package alternatives, $C = \{PF, TP, SQ, AU, PR, WL\}$ is the set of decision criteria, and $S = \{S_1, S_2, S_3\}$ is the set of operational scenarios. For each scenario, experts evaluate each alternative against each criterion using the DMFFS linguistic scale. The resulting scenario-specific decision matrices are then weighted through the Bayesian Network and ranked using PCA-based ranking optimization.
3.3. Data Sources, Sampling, and Expert Panel
This study uses two complementary data sources. The first is an employee survey with 380 operational warehouse employees, used to capture frontline perceptions of algorithmic management, including fairness, transparency, system quality, autonomy, privacy concern, workload, trust, acceptance, and resistance. The second is a seven-member expert panel, used to validate the decision criteria, evaluate the governance package alternatives, support the Bayesian Network dependency structure, and review the scenario-specific ranking logic.
The employee survey was conducted among operational employees working in 3PL warehouse environments in Türkiye. Data were collected from employees of three 3PL companies between 15 March and 20 April. The participating employees worked in environments where algorithmic management practices were used or being introduced, including WMS/LMS-based task allocation, scanner-based monitoring, KPI dashboards, digital performance feedback, and algorithmic scheduling. Participants were included if they had direct exposure to system-mediated work allocation or KPI-based performance evaluation. The target population included employees who directly experience task allocation, monitoring, and digital performance feedback in daily warehouse operations, such as pickers, packers, forklift operators, replenishment staff, quality controllers, dispatch employees, team leaders, and shift supervisors.
Table 6 summarizes the sampling frame and inclusion criteria for the employee survey. It specifies the target population, sample size, participating 3PL warehouse context, data collection period, eligibility conditions, and ethical safeguards.
The expert panel was selected purposively because algorithmic management governance requires knowledge from multiple functional areas. The panel included expertise in 3PL operations, warehouse supervision, workforce planning, industrial engineering, ergonomics and safety, compliance and audit, data governance and privacy, and WMS/LMS implementation. Experts were coded as E1–E7 to protect confidentiality. Selection criteria included at least five years of relevant professional experience, familiarity with warehouse performance systems, experience with operational constraints such as peak periods and staffing variability, and willingness to participate in iterative elicitation and validation rounds.
Table 7 presents the composition and selection criteria of the expert panel used in the decision-model stage.
The expert engagement process consisted of three rounds. In the first round, all seven experts reviewed the five governance alternatives, six decision criteria, and three operational scenarios to assess their clarity, realism, and practical relevance. In the second round, detailed expert evidence profiles were assigned according to the primary operational relevance of each expert role rather than as equal scenario subsamples. Specifically, three primary profiles were associated with S1 peak season surge (operations, workforce planning, and ergonomics/safety expertise), two with S2 labor shortage/high turnover (warehouse supervision and WMS/LMS implementation expertise), and two with S3 audit pressure/compliance scrutiny (audit/compliance and data-governance/privacy expertise). This 3-2-2 structure therefore indicates the distribution of primary expert evidence profiles, not the creation of three independent expert panels. All experts reviewed the overall scenario logic and dependency structure, but each expert contributed one detailed primary scenario profile to avoid duplicating the same role-specific evidence across scenarios. In the third round, experts reviewed the aggregated evaluations, scenario-specific criteria weights, preliminary rankings, and explanation outputs to validate the face validity and managerial plausibility of the model.
Judgments were collected individually to reduce dominance bias and group pressure. After collection, expert judgments were aggregated using the arithmetic mean at the alternative–criterion–scenario level. Disagreement among experts was not treated as an error; instead, it was represented through the DMFFS facets of non-membership, hesitancy, engagement, and resistance. This approach is suitable because expert views may differ according to warehouse maturity, workforce composition, supervision capacity, and scenario conditions.
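The aggregation step described above can be sketched for a single alternative–criterion–scenario cell. This is a minimal illustration, not the study's implementation: the facet names, the example values, and the choice to recompute hesitancy from the averaged membership and non-membership (rather than averaging it directly, which gives the same result under a linear rule) are all assumptions.

```python
from statistics import mean

# Hypothetical facet names; hesitancy is derived rather than elicited directly.
FACETS = ("membership", "non_membership", "engagement", "resistance")

def aggregate_cell(expert_judgments):
    """Arithmetic mean of each facet over the experts who rated one
    alternative-criterion-scenario cell."""
    agg = {f: mean(j[f] for j in expert_judgments) for f in FACETS}
    # Assumption: hesitancy = 1 - membership - non_membership.
    agg["hesitancy"] = 1.0 - agg["membership"] - agg["non_membership"]
    return agg

# Illustrative values only (not the study's data).
judgments = [
    {"membership": 0.70, "non_membership": 0.20, "engagement": 0.60, "resistance": 0.30},
    {"membership": 0.50, "non_membership": 0.30, "engagement": 0.50, "resistance": 0.40},
    {"membership": 0.60, "non_membership": 0.10, "engagement": 0.70, "resistance": 0.20},
]
cell = aggregate_cell(judgments)
```

Because the mean is applied facet by facet, diverging expert views lower membership and raise hesitancy or resistance in the aggregated cell instead of being discarded.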
The expert panel also validated the main Bayesian Network dependency paths, including transparency → trust, system quality → trust, procedural fairness → resistance, workload protection → resistance, privacy boundary governance → resistance, trust → acceptance, and resistance → operational stability. These paths were retained when they were theoretically grounded, consistent with the survey model, and meaningful in warehouse practice.
Ethical safeguards were applied across both data sources. Participation was voluntary and based on informed consent. Employee responses were anonymous, expert identities were coded as E1–E7, and all findings were reported only in aggregate form. No directly identifying information, such as names, employee numbers, or organization-specific identifiers, was collected.
3.4. Survey Instrument Development and SEM Validation Logic
The survey instrument was designed to measure employee responses to algorithmic management in 3PL warehousing. It focused on how employees perceive system-mediated task allocation, scanner-based monitoring, KPI feedback, algorithmic scheduling, and digitally supported performance evaluation. The survey was not used to rank governance packages directly. Instead, it provided empirical evidence about the psychological and behavioral mechanisms that inform the decision model.
The survey included constructs related to legitimacy, reliance, perceived control, privacy boundaries, workload strain, acceptance, and resistance. Procedural fairness, transparency, system/information quality, autonomy, privacy concern, and workload were treated as governance-related antecedents. Trust, acceptance, resistance/disengagement, and operational stability were treated as pathway or outcome variables. This distinction is important because the decision model evaluates governance levers that managers can influence, while SEM explains how these levers affect employee cooperation and resistance.
To avoid confusion between measurement support and structural estimation, the operational stability (OS) items were not treated as a separate SEM latent construct. They were included in the questionnaire as calibration items to capture how employees understood stable cooperation, reduced workarounds, and fewer governance-related disruptions in algorithmically managed warehouse work. These items were used to support the conceptual definition of OS, guide expert interpretation during the BN elicitation stage, and inform BN target-node parameterization. They were not included in the SEM measurement model, reliability/validity tables, or structural path testing, because OS functions in this study as the final governance-effectiveness target node rather than as an additional employee-perception construct.
The six final decision criteria were derived from the survey constructs and expressed in governance-action terms: procedural fairness (PF), transparency and contestability clarity (TP), system and information quality (SQ), autonomy support (AU), privacy boundary governance (PR), and workload protection (WL). Privacy concern and workload were direction-adjusted so that higher values in the decision model indicate stronger privacy boundary governance and stronger workload protection.
Table 8 presents the survey constructs and explains how each construct is positioned within the decision framework.
Item development followed a conservative adaptation strategy. Items were adapted from validated scales and contextualized to algorithmic management in 3PL warehousing. Wording referred to WMS/LMS-based task assignment, scanner-based monitoring, KPI dashboards, target calculation, digital performance feedback, monitoring boundaries, and exception handling. Expert review was used to assess clarity, relevance, and non-redundancy. A pilot test with warehouse employees was then used to refine wording, reduce ambiguity, and improve readability.
The final questionnaire used a consistent Likert-type response format. Higher scores represented stronger agreement with each construct statement. For constructs with risk-oriented meanings, such as privacy concern, workload, and resistance/disengagement, direction adjustment was applied before decision-model integration. This ensured that all final decision criteria were interpreted consistently as benefit-type criteria.
Before SEM testing, the measurement model was assessed for reliability and validity. Internal consistency was evaluated using Cronbach’s alpha and Composite Reliability. Convergent validity was assessed through standardized factor loadings and Average Variance Extracted. Discriminant validity was checked using HTMT, with additional checks such as Fornell–Larcker where appropriate. Collinearity was assessed using VIF before interpreting structural paths.
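For readers less familiar with these diagnostics, the three core quantities can be sketched as follows. The formulas are the standard textbook definitions of Cronbach's alpha, Composite Reliability, and Average Variance Extracted; the function names and example inputs are illustrative assumptions, and in practice these values would come from the PLS-SEM software output.

```python
from statistics import variance

def cronbach_alpha(rows):
    """rows: one list of k item scores per respondent, for a single construct."""
    k = len(rows[0])
    sum_item_var = sum(variance(col) for col in zip(*rows))  # per-item variances
    total_var = variance([sum(r) for r in rows])             # variance of sum scores
    return (k / (k - 1)) * (1.0 - sum_item_var / total_var)

def composite_reliability(loadings):
    """loadings: standardized outer loadings of one construct."""
    squared_sum = sum(loadings) ** 2
    error_var = sum(1.0 - l ** 2 for l in loadings)
    return squared_sum / (squared_sum + error_var)

def average_variance_extracted(loadings):
    """Mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)

# Illustrative inputs only (not the study's data).
example_loadings = [0.8, 0.8, 0.8]
cr = composite_reliability(example_loadings)
ave = average_variance_extracted(example_loadings)
```

With three loadings of 0.8, AVE is 0.64 and Composite Reliability is roughly 0.84.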
The SEM stage was conducted using partial least squares structural equation modeling (PLS-SEM). PLS-SEM was selected because this stage was used for prediction-oriented validation of the governance-related psychological mechanisms and for informing the dependency logic of the later BN stage, rather than for covariance-based theory confirmation. The structural paths were evaluated through a non-parametric bootstrapping procedure with 5000 resamples, using two-tailed significance tests. In addition to reliability, AVE, and HTMT, the model was checked through SRMR, inner VIF values, R² values for endogenous constructs, and predictive relevance diagnostics.
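The logic of a non-parametric bootstrap with a two-tailed interval can be illustrated on a single standardized coefficient. The sketch below is not the PLS-SEM estimator: as a simplifying assumption, it uses the Pearson correlation between two score vectors, which equals the standardized path coefficient in a one-predictor model. The function names, the seed, and the percentile-interval rule are illustrative choices.

```python
import random
from math import sqrt

def pearson(x, y):
    """Pearson correlation; returns None if either variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)

def bootstrap_ci(x, y, n_resamples=5000, alpha=0.05, seed=0):
    """Two-tailed percentile bootstrap interval for the coefficient."""
    rng = random.Random(seed)
    n = len(x)
    stats = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]  # resample respondents with replacement
        r = pearson([x[i] for i in idx], [y[i] for i in idx])
        if r is not None:                            # skip degenerate resamples
            stats.append(r)
    stats.sort()
    lower = stats[int((alpha / 2) * len(stats))]
    upper = stats[int((1 - alpha / 2) * len(stats)) - 1]
    return lower, upper

# Illustrative data: y is an exact linear function of x.
x = [float(v) for v in range(1, 21)]
y = [2.0 * v + 1.0 for v in x]
ci_low, ci_high = bootstrap_ci(x, y)
```

A path is treated as significant at the chosen level when the interval excludes zero.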
Table 9 summarizes the measurement model assessment criteria used to evaluate the reliability and validity of the survey constructs.
After measurement validation, SEM was used to estimate the hypothesized relationships among the constructs. These results served two purposes. First, they tested whether the theoretical relationships proposed in Section 2.6 were empirically supported. Second, they informed the dependency logic used in the Bayesian Network. For example, if transparency, system quality, and procedural fairness were linked to trust, and trust was linked to acceptance, these relationships justified treating the criteria as interdependent rather than independent.
The survey scores were not used mechanically as final criteria weights. Instead, they served as an empirical evidence layer. Construct scores were normalized and direction-adjusted so that the six decision criteria could be consistently interpreted as benefit criteria. Positive governance perceptions, including procedural fairness, transparency, system/information quality, and autonomy, were retained in their original direction. Privacy concern and workload were reverse-scaled to represent privacy boundary governance and workload protection. Resistance/disengagement and acceptance were retained as SEM/BN pathway variables, while operational stability was used as the target outcome or governance effectiveness proxy.
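The direction adjustment and normalization described above can be sketched as a single mapping. The sketch assumes a 1 to 5 Likert range and min-max scaling to [0, 1]; the text does not state the scale length or the normalization formula, so both are illustrative assumptions. The reversal flags follow the text: privacy concern and workload are reversed, the other four criteria are not.

```python
# True = risk-oriented construct that must be reversed to become a benefit criterion.
REVERSED = {"PF": False, "TP": False, "SQ": False, "AU": False, "PR": True, "WL": True}

def to_benefit_score(score, reverse, low=1.0, high=5.0):
    """Map a Likert construct score to [0, 1] so that higher always means
    stronger governance performance. `low`/`high` are assumed scale bounds."""
    if not low <= score <= high:
        raise ValueError(f"score {score} outside [{low}, {high}]")
    if reverse:
        score = low + high - score   # reverse-scale risk-oriented constructs
    return (score - low) / (high - low)

def adjust_all(construct_means):
    """Apply direction adjustment and normalization to every criterion."""
    return {c: to_benefit_score(v, REVERSED[c]) for c, v in construct_means.items()}

# Illustrative construct means only (not the study's results).
adjusted = adjust_all({"PF": 4.0, "TP": 3.0, "SQ": 3.5, "AU": 2.0, "PR": 4.0, "WL": 4.5})
```

In this example a privacy concern mean of 4.0 becomes a privacy boundary governance score of 0.25, whereas a procedural fairness mean of 4.0 stays at 0.75.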
Through this procedure, the survey identifies the human-centered mechanisms that matter for algorithmic management governance, supports the selection of the six final criteria, and informs the dependency structure used in the Bayesian Network. This directly connects the SEM stage with the expert-based DMFFS evaluation and the subsequent ranking model.
3.5. Dynamic Multi-Facet Fuzzy Sets Representation
The expert evaluation stage uses Dynamic Multi-Facet Fuzzy Sets (DMFFS) to represent uncertainty in governance package assessment. This method is appropriate because expert judgments about algorithmic management governance are rarely fully certain. A package may be technically useful but may still create hesitation, limited engagement, or employee resistance under specific operational conditions.
DMFFS captures five facets of expert judgment: membership, non-membership, hesitancy, engagement, and resistance. Membership indicates the degree to which a governance package satisfies a criterion, while non-membership indicates the degree to which it fails to do so. Hesitancy captures uncertainty in the evaluation. Engagement reflects expected worker and supervisor buy-in, while resistance reflects expected pushback, avoidance, metric gaming, or workaround behavior.
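For each expert $e$, alternative $A_i$, criterion $C_j$, and scenario $s$, the DMFFS assessment is defined as:
$$\tilde{x}_{ij}^{(e,s)} = \left( \mu_{ij}^{(e,s)},\ \nu_{ij}^{(e,s)},\ \pi_{ij}^{(e,s)},\ \varepsilon_{ij}^{(e,s)},\ \rho_{ij}^{(e,s)} \right)$$
where
$$e \in \{E_1, \ldots, E_7\}, \quad i = 1, \ldots, 5, \quad j = 1, \ldots, 6, \quad s \in \{S_1, S_2, S_3\}.$$
Here, $\mu_{ij}^{(e,s)}$ represents membership, $\nu_{ij}^{(e,s)}$ is non-membership, $\pi_{ij}^{(e,s)}$ is hesitancy, $\varepsilon_{ij}^{(e,s)}$ is engagement, and $\rho_{ij}^{(e,s)}$ is resistance. Each facet is bounded between 0 and 1:
$$\mu_{ij}^{(e,s)},\ \nu_{ij}^{(e,s)},\ \pi_{ij}^{(e,s)},\ \varepsilon_{ij}^{(e,s)},\ \rho_{ij}^{(e,s)} \in [0, 1].$$
The following rule is applied to maintain consistency between membership, non-membership, and hesitancy:
$$\pi_{ij}^{(e,s)} = 1 - \mu_{ij}^{(e,s)} - \nu_{ij}^{(e,s)}, \qquad \mu_{ij}^{(e,s)} + \nu_{ij}^{(e,s)} \le 1.$$
This means that hesitancy increases when an expert does not fully assign the evaluation either to membership or non-membership.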
Table 10 defines the Dynamic Multi-Facet Fuzzy Set (DMFFS) facets used to represent expert judgments in the governance package evaluation.
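The facet bounds and the consistency rule can be sketched as a small validated record. The sketch assumes the usual intuitionistic-fuzzy convention that hesitancy equals one minus membership minus non-membership; the class and field names are hypothetical and not taken from the study.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DMFFSAssessment:
    """One expert judgment for an alternative-criterion-scenario cell."""
    membership: float
    non_membership: float
    engagement: float
    resistance: float

    def __post_init__(self):
        # Every elicited facet must lie in [0, 1].
        for name in ("membership", "non_membership", "engagement", "resistance"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must lie in [0, 1], got {value}")
        # Assumed consistency rule: membership + non-membership cannot exceed 1.
        if self.membership + self.non_membership > 1.0 + 1e-9:
            raise ValueError("membership + non_membership must not exceed 1")

    @property
    def hesitancy(self):
        """Residual uncertainty not assigned to membership or non-membership."""
        return max(0.0, 1.0 - self.membership - self.non_membership)

# Illustrative judgment only.
a = DMFFSAssessment(membership=0.6, non_membership=0.3, engagement=0.7, resistance=0.2)
```

Keeping hesitancy as a derived property, rather than a stored field, prevents the three related facets from drifting out of agreement.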
3.5.1. Linguistic Scale and Mapping
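Experts evaluated the governance packages using a five-level linguistic scale: Very Low (VL), Low (L), Medium (M), High (H), and Very High (VH). Each linguistic term was mapped into interval-valued DMFFS parameters. For a linguistic term $\ell$, the membership, non-membership, engagement, and resistance values are represented as intervals:
$$\mu_\ell = [\mu_\ell^L, \mu_\ell^U], \quad \nu_\ell = [\nu_\ell^L, \nu_\ell^U], \quad \varepsilon_\ell = [\varepsilon_\ell^L, \varepsilon_\ell^U], \quad \rho_\ell = [\rho_\ell^L, \rho_\ell^U].$$
Hesitancy is then calculated as:
$$\pi_\ell = [\pi_\ell^L, \pi_\ell^U] = [\,1 - \mu_\ell^U - \nu_\ell^U,\ 1 - \mu_\ell^L - \nu_\ell^L\,].$$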
Table 11 presents the linguistic scale used to convert expert evaluations into interval-valued DMFFS parameters.
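The linguistic-to-interval mapping can be sketched as a lookup followed by an interval hesitancy calculation. The interval values below are hypothetical placeholders and are not the entries of Table 11. The hesitancy rule is the standard interval-valued intuitionistic convention, assumed here for illustration.

```python
def interval_hesitancy(mu, nu):
    """mu, nu: (lower, upper) intervals. Returns the implied hesitancy interval
    under the assumed rule [1 - mu_U - nu_U, 1 - mu_L - nu_L]."""
    mu_lo, mu_hi = mu
    nu_lo, nu_hi = nu
    if mu_hi + nu_hi > 1.0 + 1e-9:
        raise ValueError("upper bounds of membership and non-membership exceed 1")
    return (max(0.0, 1.0 - mu_hi - nu_hi), 1.0 - mu_lo - nu_lo)

# Hypothetical placeholder intervals, NOT the values reported in Table 11.
SCALE = {
    "VL": {"mu": (0.05, 0.15), "nu": (0.75, 0.85)},
    "L":  {"mu": (0.20, 0.30), "nu": (0.55, 0.65)},
    "M":  {"mu": (0.40, 0.50), "nu": (0.35, 0.45)},
    "H":  {"mu": (0.60, 0.70), "nu": (0.15, 0.25)},
    "VH": {"mu": (0.80, 0.90), "nu": (0.05, 0.10)},
}

hesitancy = {term: interval_hesitancy(v["mu"], v["nu"]) for term, v in SCALE.items()}
```

For the placeholder term H, the hesitancy interval is approximately (0.05, 0.25).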
3.5.2. Scenario-Dependent DMFFS Elicitation
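The dynamic aspect of DMFFS is operationalized by evaluating governance packages under three scenarios: peak season surge, labor shortage/high turnover, and audit pressure/compliance scrutiny. This allows the same package to receive different evaluations depending on operational conditions:
$$\tilde{x}_{ij}^{(s)}, \quad s \in \{S_1, S_2, S_3\},$$
where $\tilde{x}_{ij}^{(s)}$ denotes the DMFFS assessment of alternative $A_i$ on criterion $C_j$ under scenario $s$.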
For example, workload protection may receive stronger membership under peak season because it directly addresses overload and resistance risk, while procedural fairness and documentation may become more important under audit pressure.
Table 12 provides the scenario-dependent elicitation guide used during the expert evaluation process.
After expert evaluations were collected, DMFFS assessments were aggregated into a panel-level evaluation for each alternative–criterion–scenario combination. The aggregated DMFFS decision matrix is reported in Section 4, while full expert-level matrices can be provided in Appendix A.
3.6. DMFF Bayesian Network for Scenario-Specific Criteria Weighting
This study uses a DMFF Bayesian Network (DMFF-BN) to derive scenario-specific weights for the six governance criteria. The purpose of this stage is to avoid treating the criteria as independent. In algorithmic management governance, procedural fairness, transparency, system quality, autonomy, privacy boundary governance, and workload protection are connected through psychological and behavioral mechanisms such as trust, acceptance, resistance, and operational stability. Therefore, a dependency-aware weighting method is more suitable than a static weighting approach.
The Bayesian Network is informed by the conceptual model and SEM logic. SEM identifies the relationships among employee perceptions, trust, acceptance, resistance, and operational stability. The BN then uses these relationships as expert-validated dependency paths. SEM results do not mechanically determine the final weights; instead, they support the structure that experts validate and parameterize.
To avoid ambiguity in the weighting procedure, the present study uses a combined evidence-to-weight logic rather than simple expert-average weighting. Expert judgments are first used to define scenario-specific dependency evidence for the BN structure, including the relative strength of the links between governance criteria, pathway variables, and operational stability. These aggregated dependency inputs are then processed through BN-based information contribution logic, using information gain, entropy, and conditional entropy to estimate how strongly each criterion reduces uncertainty about the target node. Only after this BN information-contribution step are the resulting scores normalized to produce the final scenario-specific criteria weight vector. Thus, the reported weights are BN-derived normalized information-contribution weights informed by expert evidence, not direct arithmetic averages of expert scores.
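The information-contribution logic can be sketched with entropy, conditional entropy, and information gain computed from a joint distribution of one criterion and the target node. The joint tables below are hypothetical; in the study's procedure they would come from inference over the expert-parameterized network. The function names are illustrative.

```python
from math import log2

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)

def information_gain(joint):
    """joint[c][t] = P(criterion state c, target state t); entries sum to 1.
    Returns H(target) - H(target | criterion)."""
    target_marginal = [sum(col) for col in zip(*joint)]
    h_target = entropy(target_marginal)
    h_conditional = 0.0
    for row in joint:
        p_state = sum(row)
        if p_state > 0:
            h_conditional += p_state * entropy([p / p_state for p in row])
    return h_target - h_conditional

def normalized_weights(gains):
    """Scale information gains so that the criteria weights sum to 1."""
    total = sum(gains.values())
    return {name: g / total for name, g in gains.items()}

# Hypothetical joint tables (rows: criterion low/high; columns: OS low/high).
joint_tables = {
    "TP": [[0.35, 0.15], [0.15, 0.35]],
    "WL": [[0.40, 0.10], [0.10, 0.40]],
    "PR": [[0.30, 0.20], [0.20, 0.30]],
}
gains = {name: information_gain(tbl) for name, tbl in joint_tables.items()}
weights = normalized_weights(gains)
```

A criterion whose state tells more about operational stability reduces more uncertainty and therefore receives a larger normalized weight. In this toy example WL outranks TP, which outranks PR.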
The final BN includes three node groups. The first group consists of the six governance criteria: procedural fairness (PF), transparency and contestability clarity (TP), system and information quality (SQ), autonomy support (AU), privacy boundary governance (PR), and workload protection (WL). The second group includes the pathway variables: trust in algorithmic decisions (TR), acceptance/intention to comply (AC), and resistance/disengagement (RD). The final target node is operational stability (OS), which reflects reduced disruption from workarounds, metric gaming, resistance, and unstable cooperation.
The OS node was therefore parameterized as a decision-model target rather than estimated as an SEM factor. The OS calibration items helped define what the target node represented in practice: fewer disruptions, reduced workarounds, stable employee cooperation, and lower governance-related friction. During expert elicitation, these meanings were used to anchor judgments about how the governance criteria and pathway variables contribute to operational stability under each scenario.
Figure 3 presents the core dependency structure of the DMFF-BN.
Before the scenario-specific criteria weights are calculated, the Bayesian Network structure is defined by distinguishing between governance criteria, pathway variables, and the final target outcome. This distinction is important because the model does not treat all constructs as direct decision criteria. Instead, PF, TP, SQ, AU, PR, and WL function as actionable governance criteria, while TR, AC, and RD represent the behavioral mechanisms through which governance affects operational stability.
Table 13 summarizes the nodes used in the DMFF-BN and clarifies their roles in the model.
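The dependency logic is represented as a directed acyclic graph:
\[ G = (V, E) \]
where $V = \{PF, TP, SQ, AU, PR, WL, TR, AC, RD, OS\}$ is the set of nodes and $E$ is the set of directed edges. The baseline paths include the links assessed by the experts in Round 2 (Section 3.8), among them TP → TR, PF → RD, WL → RD, PR → RD, TR → AC, and RD → OS.
These paths summarize the BN dependency structure used to link the governance criteria with trust, acceptance, resistance/disengagement, and the operational stability target node.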
The BN is parameterized through expert judgments expressed with the DMFFS linguistic scale. Experts assess the strength of the dependency paths under each operational scenario. This allows a path to become stronger or weaker depending on context. For example, the path from workload protection to resistance/disengagement (WL → RD) may become stronger during peak season because workload pressure can increase disengagement and workarounds. Similarly, the paths linked to contestability and documentation may become more important under audit pressure because both are central to legitimacy.
For each child node $X_v$ and its parent set $\mathrm{Pa}(X_v)$, experts provide scenario-specific conditional judgments:
\[ \tilde{p}^{(k,e)}\big(X_v \mid \mathrm{Pa}(X_v)\big) \]
where $k$ represents the scenario and $e$ represents the expert. Experts do not provide full numerical conditional probability tables. Instead, they evaluate conditional influence using linguistic terms such as Low, Medium, High, or Very High. These judgments are then transformed into DMFFS parameters and aggregated into scenario-specific conditional values.
The BN weighting stage estimates how much each criterion contributes to the target outcome, operational stability. A criterion receives a higher weight when it provides more information about operational stability under a given scenario. For each criterion $C_j$, the scenario-specific information contribution is calculated as:
\[ IG_j^{(k)} = H^{(k)}(OS) - H^{(k)}(OS \mid C_j) \]
where $IG_j^{(k)}$ is the information gain of criterion $C_j$ under scenario $S_k$, and $H(\cdot)$ represents Shannon entropy. A higher information-gain value means that the criterion reduces uncertainty about operational stability more strongly.
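Because expert judgments are represented through DMFFS, the conditional values are first transformed into comparable numerical values. Membership and engagement increase support for a criterion, while non-membership, hesitancy, and resistance reduce effective support. The information-gain values are then normalized:
\[ w_j^{(k)} = \frac{IG_j^{(k)}}{\sum_{l=1}^{6} IG_l^{(k)}} \]
where
\[ \sum_{j=1}^{6} w_j^{(k)} = 1, \qquad w_j^{(k)} \ge 0. \]
This produces one criterion weight vector for each scenario:
\[ \mathbf{w}^{(k)} = \big(w_1^{(k)}, w_2^{(k)}, \dots, w_6^{(k)}\big), \qquad k = 1, 2, 3. \]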
The output of this stage is a set of scenario-specific criteria weights. These weights are used directly in the PCA-based ranking optimization stage.
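The information-contribution logic can be illustrated with a minimal sketch. It assumes binary states (low/high) for each criterion and for operational stability, and the joint probability tables are hypothetical placeholders rather than values elicited in this study; in the actual procedure, these inputs would come from the DMFFS-transformed expert judgments.

```python
import math

def entropy(probs):
    """Shannon entropy in bits; zero-probability states contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def information_gain(joint):
    """IG = H(OS) - H(OS | C) for a joint table joint[c][o] = P(C=c, OS=o)."""
    n_os = len(joint[0])
    p_os = [sum(row[o] for row in joint) for o in range(n_os)]
    h_cond = 0.0
    for row in joint:
        p_c = sum(row)
        if p_c > 0:
            h_cond += p_c * entropy([p / p_c for p in row])
    return entropy(p_os) - h_cond

def normalize(gains):
    """Rescale information gains so that the resulting weights sum to 1."""
    total = sum(gains.values())
    return {name: g / total for name, g in gains.items()}

# Hypothetical joint tables (rows: criterion low/high, columns: OS low/high).
joint_tables = {
    "PF": [[0.30, 0.20], [0.15, 0.35]],
    "WL": [[0.40, 0.10], [0.05, 0.45]],
    "PR": [[0.25, 0.25], [0.25, 0.25]],  # independent of OS, so zero gain
}
gains = {name: information_gain(t) for name, t in joint_tables.items()}
weights = normalize(gains)
for name in weights:
    print(name, round(gains[name], 4), round(weights[name], 4))
```

In this illustration, a criterion whose state is statistically independent of operational stability receives zero weight, while a criterion that strongly separates stable from unstable outcomes receives the largest share.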
Section 4 reports the full weight vector for each scenario, allowing readers to see how governance priorities shift under peak season, labor shortage, and audit pressure. This improves methodological transparency by showing how survey-supported constructs, expert dependency judgments, and scenario conditions are converted into operational criteria weights.
For computational transparency and reproducibility, the weighting and ranking procedure followed a sequential calculation logic. First, expert linguistic judgments were converted into DMFFS values for membership, non-membership, hesitancy, engagement, and resistance. Second, scenario-specific dependency strengths in the Bayesian Network were derived from expert evaluations of the main governance pathways. Third, these dependency values were transformed into information-contribution scores for each criterion with respect to operational stability and then normalized to obtain one criterion weight vector for each scenario. Fourth, the scenario-specific weights were applied to the aggregated DMFFS decision matrix. Finally, the weighted matrix was transformed into scalar scores and processed through PCA-based ranking optimization to obtain scenario-level ranking scores. This procedure ensures that the final rankings reflect both expert uncertainty and scenario-specific interdependence among governance criteria.
3.7. DMFF-BN-PCRO Ranking Optimization
After scenario-specific criteria weights are obtained through the DMFF-BN stage, the study applies DMFF-BN-PCRO Ranking Optimization to rank the five governance packages. This stage uses the weighted DMFFS decision matrix as input. The Bayesian Network weights determine the importance of each criterion under each scenario, while PCA-based ranking optimization transforms the weighted matrix into scenario-specific ranking scores.
This stage is included because governance packages may receive close scores under conventional additive methods. PCA-based optimization helps identify dominant patterns in the weighted decision space, reduce redundancy among correlated criteria, and separate alternatives more clearly. The ranking stage uses three inputs: aggregated DMFFS evaluations, scenario-specific BN weights, and direction-adjusted benefit criteria. The model evaluates five alternatives $A_i$ ($i = 1, \dots, 5$), six criteria $C_j$ ($j = 1, \dots, 6$), and three scenarios $S_k$ ($k = 1, 2, 3$).
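For each scenario $S_k$, the aggregated DMFFS decision matrix is expressed as:
\[ \tilde{D}^{(k)} = \big[\tilde{x}_{ij}^{(k)}\big]_{5 \times 6} \]
where $\tilde{x}_{ij}^{(k)}$ represents the aggregated DMFFS evaluation of alternative $A_i$ under criterion $C_j$ and scenario $S_k$. The scenario-specific Bayesian Network weight vector is defined as:
\[ \mathbf{w}^{(k)} = \big(w_1^{(k)}, w_2^{(k)}, \dots, w_6^{(k)}\big) \]
The weighted DMFFS decision matrix is then obtained as:
\[ \tilde{V}^{(k)} = \big[\tilde{v}_{ij}^{(k)}\big]_{5 \times 6}, \qquad \tilde{v}_{ij}^{(k)} = w_j^{(k)} \odot \tilde{x}_{ij}^{(k)} \]
where $\odot$ indicates criterion-wise scalar weighting applied to the DMFFS facets. Membership and engagement increase the effective support score, while non-membership, hesitancy, and resistance reduce it. Since privacy concerns and workload were direction-adjusted earlier, all six criteria are treated as benefit-type criteria.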
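PCA requires a numerical matrix. Therefore, each weighted DMFFS evaluation is transformed into a scalar score using a consistent defuzzification rule:
\[ z_{ij}^{(k)} = \phi\big(\tilde{v}_{ij}^{(k)}\big) \]
where $\tilde{v}_{ij}^{(k)}$ is the weighted DMFFS evaluation, $\phi(\cdot)$ denotes the defuzzification rule, and $z_{ij}^{(k)}$ is the scalar score for alternative $A_i$, criterion $C_j$, and scenario $S_k$. The resulting scenario-specific score matrix is:
\[ Z^{(k)} = \big[z_{ij}^{(k)}\big]_{5 \times 6} \]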
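For each scenario, PCA is applied to the score matrix $Z^{(k)}$. The covariance matrix is calculated as:
\[ \Sigma^{(k)} = \frac{1}{m - 1}\big(Z^{(k)} - \bar{Z}^{(k)}\big)^{\top}\big(Z^{(k)} - \bar{Z}^{(k)}\big) \]
where $m = 5$ is the number of alternatives and $\bar{Z}^{(k)}$ contains the column means, and eigen-decomposition is performed as:
\[ \Sigma^{(k)} \mathbf{u}_p^{(k)} = \lambda_p^{(k)} \mathbf{u}_p^{(k)} \]
where $\lambda_p^{(k)}$ is the eigenvalue and $\mathbf{u}_p^{(k)}$ is the loading vector of component $p$. Components are retained according to explained variance. The component score for each alternative is calculated as:
\[ t_{ip}^{(k)} = \mathbf{z}_i^{(k)} \mathbf{u}_p^{(k)} \]
where $\mathbf{z}_i^{(k)}$ is the row vector of scores for alternative $A_i$.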
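The final scenario-specific ranking score is calculated by combining the retained component scores:
\[ R_i^{(k)} = \sum_{p=1}^{P} \omega_p^{(k)} \, t_{ip}^{(k)}, \qquad \omega_p^{(k)} = \frac{\lambda_p^{(k)}}{\sum_{q=1}^{P} \lambda_q^{(k)}} \]
where $P$ is the number of retained components, $t_{ip}^{(k)}$ is the score of alternative $A_i$ on component $p$, and $\omega_p^{(k)}$ is the variance-based weight of component $p$. Alternatives are ranked in descending order:
\[ R_{(1)}^{(k)} \ge R_{(2)}^{(k)} \ge \dots \ge R_{(5)}^{(k)} \]
This procedure produces one ranking for each scenario:
\[ \mathrm{Rank}^{(k)}, \qquad k = 1, 2, 3. \]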
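The PCA-based scoring steps can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions, not the study's implementation: the 85% explained-variance retention threshold, the sign-orientation rule for the loading vectors, and the score matrix `Z` are all hypothetical choices introduced here. The orientation step is needed in practice because eigenvector signs are arbitrary; the sketch flips each component so that its loadings sum to a non-negative value, which is one plausible convention when all criteria are benefit-type.

```python
import numpy as np

def pca_ranking_scores(Z, var_threshold=0.85):
    """Score alternatives (rows of Z) by variance-weighted principal component scores."""
    Z = np.asarray(Z, dtype=float)
    Zc = Z - Z.mean(axis=0)                      # center each criterion column
    cov = np.cov(Zc, rowvar=False)               # criteria-by-criteria covariance
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    eigvals = np.clip(eigvals, 0.0, None)        # guard against tiny negative round-off
    explained = eigvals / eigvals.sum()
    n_keep = int(np.searchsorted(np.cumsum(explained), var_threshold) + 1)
    n_keep = min(n_keep, len(eigvals))
    loadings = eigvecs[:, :n_keep]
    # Assumed convention: orient each component so that its loadings sum to >= 0.
    signs = np.where(loadings.sum(axis=0) < 0, -1.0, 1.0)
    loadings = loadings * signs
    comp_scores = Zc @ loadings                  # component scores per alternative
    comp_weights = explained[:n_keep] / explained[:n_keep].sum()
    return comp_scores @ comp_weights            # variance-weighted combination

# Hypothetical 5 x 6 scalar score matrix (alternatives x criteria).
Z = [
    [0.62, 0.58, 0.55, 0.40, 0.66, 0.45],
    [0.70, 0.72, 0.60, 0.68, 0.50, 0.74],
    [0.75, 0.69, 0.52, 0.48, 0.55, 0.50],
    [0.50, 0.47, 0.58, 0.42, 0.44, 0.41],
    [0.45, 0.43, 0.40, 0.38, 0.78, 0.39],
]
scores = pca_ranking_scores(Z)
ranking = np.argsort(-scores)                    # row indices, best first
print(scores.round(4), ranking)
```

Because the component scores are computed on centered data, they sum to zero across alternatives; reporting them on a 0–1 scale would require an additional rescaling step.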
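To identify governance packages that perform consistently across changing conditions, an aggregated robust score is also calculated:
\[ R_i^{\mathrm{rob}} = \sum_{k=1}^{3} \pi_k \, R_i^{(k)} \]
where $R_i^{(k)}$ is the ranking score of alternative $A_i$ under scenario $S_k$ and $\pi_k$ is the scenario weight. If no scenario is prioritized, equal scenario weights are used:
\[ \pi_k = \tfrac{1}{3}, \qquad k = 1, 2, 3. \]
The final robust ranking is obtained by ordering alternatives according to $R_i^{\mathrm{rob}}$. This allows the study to identify both scenario-specific winners and generally stable governance packages.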
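As a brief illustration of the robust aggregation, the following sketch averages scenario-level scores with equal scenario weights by default. The score values and the three-alternative setup are hypothetical and do not reproduce the results reported in Section 4.

```python
def robust_scores(scenario_scores, scenario_weights=None):
    """Weighted sum of scenario-specific scores; equal weights if none are given."""
    scenarios = list(scenario_scores)
    if scenario_weights is None:
        scenario_weights = {s: 1.0 / len(scenarios) for s in scenarios}
    alternatives = scenario_scores[scenarios[0]].keys()
    return {
        a: sum(scenario_weights[s] * scenario_scores[s][a] for s in scenarios)
        for a in alternatives
    }

def rank_descending(scores):
    """Return alternative labels ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical scenario-level scores (not the study's reported values).
example = {
    "S1": {"A1": 0.9, "A2": 1.0, "A3": 0.8},
    "S2": {"A1": 1.0, "A2": 0.9, "A3": 0.7},
    "S3": {"A1": 0.8, "A2": 0.9, "A3": 1.0},
}
robust = robust_scores(example)
print(rank_descending(robust))
```

Passing an explicit `scenario_weights` dictionary would allow a manager to prioritize one operating condition, for example peak season, over the others.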
To improve computational transparency, Section 4 reports the scenario-specific criteria weights, summarized alternative scores, PCA-based ranking scores, scenario-specific rankings, robust ranking, score differences among close alternatives, and robustness indicators. In addition, the intermediate PCA outputs are reported in Appendix F, including eigenvalues, explained variance, retained components, loading structures, and component scores. This additional reporting allows readers to trace how the weighted DMFFS decision matrix is transformed into the final PCRO scores and rankings.
Table 14 summarizes the input–output logic of the DMFF-BN-PCRO ranking stage. It explains how the weighted DMFFS decision matrix is transformed into scenario-specific ranking outputs through PCA-based ranking optimization.
This stage connects Bayesian weighting to the final governance package ranking and makes the ranking process more interpretable by showing how final scores are generated from weighted criteria, principal components, and scenario-specific evaluations.
3.8. Iterative Expert Engagement and Consensus Protocol
An iterative expert engagement protocol was used to improve the validity, interpretability, and practical relevance of the decision model. The aim was not only to collect expert evaluations, but also to confirm that the governance packages, criteria, scenarios, and dependency paths were realistic for 3PL warehousing.
The expert engagement process consisted of three rounds. In Round 1, experts reviewed the five governance package alternatives, six decision criteria, and three operational scenarios. This round was used to confirm the clarity and practical relevance of the model components. In Round 2, experts provided linguistic evaluations using the DMFFS scale and assessed the strength of key Bayesian Network dependency relationships, including transparency → trust, procedural fairness → resistance, workload protection → resistance, privacy boundary governance → resistance, trust → acceptance, and resistance → operational stability. In Round 3, experts reviewed the aggregated evaluations, criteria weights, preliminary rankings, and interpretation outputs to assess whether the results were plausible and managerially meaningful.
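Judgments were collected individually to reduce dominance bias. Expert inputs were then aggregated at the alternative–criterion–scenario level. For each alternative $A_i$, criterion $C_j$, and scenario $S_k$, the aggregated DMFFS evaluation is represented as:
\[ \tilde{x}_{ij}^{(k)} = \mathrm{Agg}\big(\tilde{x}_{ij}^{(k,1)}, \tilde{x}_{ij}^{(k,2)}, \dots, \tilde{x}_{ij}^{(k,E)}\big) \]
where $\tilde{x}_{ij}^{(k)}$ is the panel-level evaluation, $\tilde{x}_{ij}^{(k,e)}$ is the evaluation given by expert $e$, $E$ is the number of contributing experts, and $\mathrm{Agg}(\cdot)$ represents the aggregation operator. The same aggregation rule was applied across membership, non-membership, hesitancy, engagement, and resistance facets.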
Disagreement among experts was not treated as an error. Instead, divergent views were reflected through the DMFFS facets, especially hesitancy and resistance. This is suitable for algorithmic management governance because experts may reasonably emphasize different risks, such as throughput stability, privacy boundaries, workload pressure, or audit defensibility.
Convergence was assessed through two checks. Structural convergence required that experts accept the final alternatives, criteria, scenarios, and dependency paths as understandable and relevant. Ranking convergence required that the top-ranked alternatives and scenario-specific weight patterns remained plausible after expert review. The panel was considered to have reached sufficient convergence when no major changes were requested in the final validation round.
A clarification is necessary for scenario-level reporting. The seven experts contributed to the overall expert evaluation process, while the scenario-level n values in Section 4 refer to the number of primary expert evidence profiles associated with each operational scenario. These values should therefore not be interpreted as the total number of experts who reviewed the study, nor as three fully independent and equally sized scenario panels.
Table 15 presents the iterative expert engagement and consensus protocol used to support the decision model development. It outlines the main stages of expert involvement, including the review of criteria and alternatives, scenario-based evaluation, aggregation of judgments, validation of preliminary results, and refinement of the final model.
This protocol supports methodological transparency by showing how expert knowledge is transformed into fuzzy evaluations, dependency-aware weights, and scenario-specific governance rankings.
3.9. Robustness and Comparative Validation Procedure
Robustness and sensitivity analyses were conducted to test whether the ranking results remain stable under changes in scenario assumptions, criteria weights, and decision-model specifications. This step is necessary because algorithmic management governance decisions are made under uncertainty, and expert judgments or operating conditions may change across warehouse contexts.
The robustness procedure includes five checks: scenario comparison, weight perturbation, DMFFS facet sensitivity, rank reversal analysis, and comparison with conventional MCDM methods. Scenario comparison evaluates whether the same alternatives remain strong across peak season, labor shortage/high turnover, and audit pressure/compliance scrutiny. For each scenario, the model produces a ranking:
\[ \mathrm{Rank}^{(k)}, \qquad k = 1, 2, 3. \]
These rankings are compared using Top-1 retention, Top-3 overlap, and Spearman rank correlation. These indicators show whether the model produces stable recommendations or whether rankings change sharply across scenarios.
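The three comparison indicators can be computed as in the sketch below. The two example orderings are purely illustrative and are not the rankings reported in Section 4; the Spearman formula used here assumes rankings without ties.

```python
def to_ranks(order):
    """Convert an ordered list (best first) into a {label: rank} mapping."""
    return {item: pos for pos, item in enumerate(order, start=1)}

def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation for two tie-free rankings over the same items."""
    n = len(rank_a)
    d2 = sum((rank_a[item] - rank_b[item]) ** 2 for item in rank_a)
    return 1 - 6 * d2 / (n * (n * n - 1))

def top_k_overlap(order_a, order_b, k=3):
    """Share of the top-k items of one ranking that also appear in the other's top k."""
    return len(set(order_a[:k]) & set(order_b[:k])) / k

def top_1_retained(order_a, order_b):
    """True when both rankings place the same alternative first."""
    return order_a[0] == order_b[0]

# Illustrative orderings for two scenarios (hypothetical, best first).
order_x = ["A2", "A1", "A3", "A4", "A5"]
order_y = ["A3", "A2", "A1", "A5", "A4"]

rho = spearman_rho(to_ranks(order_x), to_ranks(order_y))
overlap = top_k_overlap(order_x, order_y, k=3)
same_winner = top_1_retained(order_x, order_y)
print(rho, overlap, same_winner)
```

In this illustration the winner changes while the top-three set stays the same, which is the pattern the indicators are designed to distinguish from a wholesale reordering.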
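Weight perturbation analysis examines whether the results are sensitive to changes in criteria weights. Criteria weights are perturbed in both directions around their baseline values:
\[ w_j^{\prime(k)} = w_j^{(k)} (1 \pm \delta) \]
where $\delta$ takes each of the two perturbation magnitudes considered. After perturbation, the weights are normalized again:
\[ w_j^{\prime\prime(k)} = \frac{w_j^{\prime(k)}}{\sum_{l=1}^{6} w_l^{\prime(k)}} \]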
The ranking model is then recalculated to identify rank shifts, Top-K retention, and ranking volatility.
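A minimal sketch of the perturb-and-renormalize step is given below. It assumes a multiplicative perturbation applied to one criterion at a time and uses a hypothetical magnitude of 10%; neither choice is confirmed by the text. The baseline vector uses the S1 weights reported in Section 4.5, with the PR weight inferred as the remainder needed for the weights to sum to 1.

```python
def perturb_and_renormalize(weights, criterion, delta):
    """Scale one criterion weight by (1 + delta) and renormalize all weights to sum to 1."""
    perturbed = dict(weights)
    perturbed[criterion] = weights[criterion] * (1 + delta)
    total = sum(perturbed.values())
    return {c: w / total for c, w in perturbed.items()}

# S1 weights from Section 4.5; PR = 1 - 0.207 - 4 * 0.170 is inferred, not quoted.
s1_weights = {
    "PF": 0.170, "TP": 0.170, "SQ": 0.170,
    "AU": 0.170, "PR": 0.113, "WL": 0.207,
}

hypothetical_deltas = (-0.10, 0.10)  # assumed magnitude, for illustration only
perturbed_sets = {
    (criterion, delta): perturb_and_renormalize(s1_weights, criterion, delta)
    for criterion in s1_weights
    for delta in hypothetical_deltas
}
print(perturbed_sets[("WL", 0.10)])
```

Each perturbed weight vector would then be passed back through the ranking stage so that rank shifts and Top-K retention can be recorded.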
DMFFS facet sensitivity examines whether rankings change when different facets are emphasized. This includes increasing the penalty for resistance, increasing the penalty for hesitancy, increasing the contribution of engagement, or reducing the dominance of membership values. This test is important because the model includes behavioral uncertainty, not only positive expert evaluations.
Rank reversal analysis is used to test whether the ranking depends on weaker alternatives. One lower-ranked alternative is removed at a time, and the ranking is recalculated. The main indicators are top alternative retention, Top-3 retention, number of pairwise reversals, and maximum rank shift.
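The leave-one-out logic can be sketched as follows. The scoring function, a max-normalized additive score with equal weights, is a stand-in chosen only because it makes rank reversal easy to demonstrate; it is not the DMFF-BN-PCRO scoring rule, and the matrix values are hypothetical.

```python
def ranking(scores):
    """Return labels ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)

def pairwise_reversals(full_order, reduced_order):
    """Count pairs whose relative order differs between the two rankings."""
    pos_full = {a: i for i, a in enumerate(full_order)}
    pos_red = {a: i for i, a in enumerate(reduced_order)}
    items = list(reduced_order)
    count = 0
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            a, b = items[i], items[j]
            if (pos_full[a] - pos_full[b]) * (pos_red[a] - pos_red[b]) < 0:
                count += 1
    return count

def rank_reversal_check(score_fn, matrix, removable):
    """Remove one alternative at a time and compare against the full ranking."""
    full = ranking(score_fn(matrix))
    report = {}
    for alt in removable:
        reduced_matrix = {a: row for a, row in matrix.items() if a != alt}
        reduced = ranking(score_fn(reduced_matrix))
        report[alt] = {
            "top_retained": reduced[0] == full[0],
            "pairwise_reversals": pairwise_reversals(full, reduced),
        }
    return report

def max_normalized_saw(matrix, weights=(0.5, 0.5)):
    """Hypothetical scorer: divide each column by its maximum, then take a weighted sum."""
    col_max = [max(row[j] for row in matrix.values()) for j in range(len(weights))]
    return {a: sum(w * row[j] / col_max[j] for j, w in enumerate(weights))
            for a, row in matrix.items()}

# Hypothetical two-criterion matrix for four alternatives.
matrix = {"A1": (0.9, 0.5), "A2": (0.6, 0.8), "A3": (0.4, 0.4), "A4": (0.2, 1.0)}
report = rank_reversal_check(max_normalized_saw, matrix, removable=["A3", "A4"])
print(report)
```

In this example, removing A4 changes a column maximum and therefore the normalized scores of the remaining alternatives, which is the mechanism through which a weaker alternative can alter the order of stronger ones.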
To address comparative validation, the proposed DMFF-BN-PCRO ranking is compared with two conventional MCDM benchmarks: Simple Additive Weighting (SAW) and TOPSIS. SAW is used as a transparent additive benchmark:
\[ \mathrm{SAW}_i^{(k)} = \sum_{j=1}^{6} w_j^{(k)} \, r_{ij}^{(k)} \]
where $r_{ij}^{(k)}$ is the normalized score of alternative $A_i$ under criterion $C_j$ and scenario $S_k$. TOPSIS is used as a distance-based benchmark:
\[ CC_i^{(k)} = \frac{D_i^{-(k)}}{D_i^{+(k)} + D_i^{-(k)}} \]
where $D_i^{+(k)}$ is the distance from the positive ideal solution and $D_i^{-(k)}$ is the distance from the negative ideal solution. AHP and BWM are not used as direct benchmarks because the study does not collect complete pairwise comparison matrices for all alternatives and criteria. SAW and TOPSIS are more appropriate because they can be applied to the same normalized score matrix and weight vector.
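Both benchmarks can be sketched in a few lines. The example assumes that the score matrix is already normalized and that all criteria are benefit-type, as stated above; Euclidean distance is assumed for TOPSIS, and the matrix and weights are hypothetical.

```python
import math

def saw_scores(R, w):
    """Simple Additive Weighting: weighted sum of normalized scores per alternative."""
    return [sum(wj * rij for wj, rij in zip(w, row)) for row in R]

def topsis_closeness(R, w):
    """TOPSIS closeness coefficient using Euclidean distances to the ideal points."""
    V = [[wj * rij for wj, rij in zip(w, row)] for row in R]
    cols = list(zip(*V))
    ideal = [max(c) for c in cols]   # positive ideal: best value per benefit criterion
    anti = [min(c) for c in cols]    # negative ideal: worst value per benefit criterion
    closeness = []
    for row in V:
        d_pos = math.dist(row, ideal)
        d_neg = math.dist(row, anti)
        closeness.append(d_neg / (d_pos + d_neg))
    return closeness

# Hypothetical normalized scores (3 alternatives x 2 criteria) and weights.
R = [[1.0, 1.0], [0.0, 0.0], [0.5, 0.5]]
w = [0.5, 0.5]
saw = saw_scores(R, w)
cc = topsis_closeness(R, w)
print(saw, cc)
```

The closeness coefficient is undefined when all alternatives are identical on every criterion, because both distances are then zero; a production implementation would need to handle that case explicitly.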
Comparative validation is assessed using the indicators summarized in Table 16, which also lists the robustness and sensitivity indicators used to assess the stability of the DMFF-BN-PCRO results.
This validation structure adds numerical stability checks and benchmark comparison rather than relying only on verbal claims about robustness.
3.10. Implementation Outputs
The final stage of the methodology translates the computational results into practical governance outputs for 3PL warehouse managers. The aim is not only to identify the highest-ranked governance package, but also to show how governance priorities change under peak season, labor shortage, and audit pressure conditions.
The framework produces five main outputs: scenario-specific criteria weights, scenario-specific rankings of the five governance packages, an aggregated robust ranking, sensitivity and robustness indicators, and comparative validation with SAW and TOPSIS. These outputs help managers understand which package should be prioritized, why it is preferred, and whether the recommendation remains stable when assumptions change.
Table 17 presents the implementation outputs generated by the proposed governance decision framework.
The implementation outputs also support scenario-based interpretation. During peak season, managers may need to prioritize workload caps, target explanations, autonomy support, and rapid exception handling. Under labor shortage or high turnover, the emphasis may shift toward simple rules, reliable system feedback, onboarding-friendly dashboards, and clear privacy notices. Under audit pressure, the focus may move toward appeal logs, override records, fairness reviews, access controls, and monitoring-boundary documentation.
Thus, the final methodological output is not only a ranking table. It is a decision-support structure that combines criteria weights, scenario rankings, robust ranking, stability evidence, benchmark comparison, and practical interpretation. Section 4 reports these outputs through the survey findings, expert-based weights, scenario-specific rankings, robustness tests, and comparative validation results.
4. Results
4.1. Sample Characteristics
The employee survey included 380 valid responses from operational employees working in 3PL warehouse environments. No missing values were identified in the demographic and work-profile variables used in this section. The sample included employees from different operational roles, tenure groups, shift types, and weekly working-hour categories, making it suitable for examining perceptions of algorithmic task allocation, scanner-based monitoring, KPI feedback, and digitally supported performance control.
Picker/packer employees formed the largest group, representing 41.1% of the sample. Forklift operators and team leaders each accounted for 18.9%, followed by shift supervisors (14.5%) and quality/compliance employees (6.6%). In terms of tenure, the largest group had 1–3 years of experience (29.5%), followed by employees with less than one year (21.8%), 3–5 years (19.7%), 5–10 years (19.5%), and more than ten years (9.5%). The sample also included different shift patterns, with day-shift employees representing 43.9%, evening-shift employees 22.9%, rotating-shift employees 17.4%, and night-shift employees 15.8%. Weekly working hours show that many respondents worked under relatively intensive conditions: 41.3% worked 40–45 h per week, 30.3% worked 46–50 h, and 9.5% worked more than 50 h.
Table 18 presents the sample profile of the survey respondents. It summarizes the main demographic and work-related characteristics of the 380 warehouse employees who participated in the study, including variables such as role, tenure, exposure to algorithmic management practices, and relevant workplace characteristics.
4.2. Measurement Model and Construct Validity
The measurement model was assessed before testing the structural relationships. The results indicate that the constructs used to represent employee perceptions of algorithmic management governance met the required reliability and validity conditions. Internal consistency was satisfactory across all constructs, with Cronbach’s alpha values ranging from 0.795 to 0.839 and Composite Reliability values ranging from 0.880 to 0.894. Convergent validity was also supported because all AVE values exceeded the recommended 0.50 threshold, ranging from 0.666 to 0.737. Standardized item loadings were strong, ranging from 0.798 to 0.870, showing that the indicators represented their intended constructs adequately. Discriminant validity was also acceptable, with a maximum HTMT value of 0.540, which is well below the commonly used 0.85/0.90 thresholds. These results show that the measurement model is suitable for the subsequent SEM analysis.
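For readers less familiar with these indices, the sketch below shows how AVE and Composite Reliability are conventionally computed from standardized loadings. The three loadings are hypothetical values chosen within the reported loading range; they are not the study's item-level estimates, so the outputs do not reproduce any value in Table 19.

```python
def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings of a construct."""
    return sum(l * l for l in loadings) / len(loadings)

def composite_reliability(loadings):
    """CR: (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    sum_l = sum(loadings)
    error_var = sum(1 - l * l for l in loadings)
    return sum_l ** 2 / (sum_l ** 2 + error_var)

# Hypothetical standardized loadings for one three-item construct.
loadings = [0.80, 0.85, 0.87]
ave = average_variance_extracted(loadings)
cr = composite_reliability(loadings)
print(round(ave, 3), round(cr, 3))
```

Under these formulas, loadings close to 0.80 or higher are sufficient for AVE to clear the 0.50 threshold, which is consistent with the pattern described above.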
Table 19 reports the reliability and convergent validity results for the survey constructs based on the 380 employee responses.
Operational stability was not estimated as a separate latent construct in the SEM model and was therefore not included in the SEM path table. The OS items reported in Appendix A were used only as calibration items to define the BN target node and to support expert interpretation of governance effectiveness. Accordingly, OS is reported in the decision-model stage as a target outcome for BN parameterization and ranking interpretation, not as an unreported SEM construct.
4.3. Structural Model and Mechanism Testing
After confirming measurement quality, the structural model was examined to test the psychological mechanisms linking algorithmic management governance perceptions to trust, acceptance, and resistance/disengagement. The results support the main logic of the study. Trust was significantly predicted by transparency, system/information quality, and procedural fairness, with the model explaining 25.8% of the variance in trust. Acceptance was significantly predicted by trust, autonomy, and procedural fairness, explaining 20.8% of the variance in acceptance. Resistance/disengagement was mainly shaped by privacy concern, workload, procedural fairness, and system/information quality, with the model explaining 19.3% of the variance in resistance.
Because the study adopted PLS-SEM, the structural model was evaluated using bootstrapped path estimates and prediction-oriented diagnostics. The standardized root mean square residual (SRMR = 0.052) was below the commonly used 0.08 threshold, indicating acceptable model fit for PLS-SEM. Collinearity did not threaten the interpretation of the structural paths because the inner VIF values ranged from 1.18 to 2.31. Bootstrapping with 5000 subsamples was used to obtain the reported significance levels. The explanatory power of the endogenous constructs was moderate for the exploratory purpose of the study, with R² values of 0.258 for trust, 0.208 for acceptance/intention to comply, and 0.193 for resistance/disengagement. Predictive relevance was also acceptable because the Q² values were positive for all endogenous constructs.
Table 20 presents the structural path results and hypothesis decisions obtained from the SEM analysis.
Table 21 presents the additional PLS-SEM structural model diagnostics used to assess the quality and explanatory adequacy of the structural model. It reports key indicators such as bootstrapping settings, SRMR, VIF, R², Q², and effect-size information.
The strongest predictor of trust was procedural fairness, followed by transparency and system/information quality. This indicates that employees are more likely to trust algorithmic decisions when they perceive them as fair, understandable, and based on reliable information. Acceptance was also strengthened by trust, autonomy, and procedural fairness, showing that employees’ willingness to follow algorithmic guidance depends not only on system reliability but also on perceived discretion and legitimacy.
For negative behavioral responses, privacy concern and workload increased resistance/disengagement, while procedural fairness reduced it. Transparency did not directly reduce resistance, suggesting that explanation alone may not be enough when employees are concerned about workload pressure or monitoring boundaries. System/information quality showed a significant negative relationship with resistance, indicating that reliable and relevant system feedback can reduce pushback.
For the same reason, operational stability is not reported as a separate SEM-tested path. It is used in the subsequent DMFF-BN and ranking stages as the target outcome of governance effectiveness. This means that the OS items functioned as conceptual and elicitation support for the BN target node, not as a latent construct omitted from SEM reporting.
Figure 4 summarizes the SEM-tested relationships in a self-explanatory format by displaying the standardized PLS-SEM path coefficients, significance levels, exploratory paths, and R² values for the endogenous constructs. The values shown in the figure correspond directly to Table 20.
4.4. Scenario Evidence Profiles
The expert evidence was grouped by the three operational scenarios used in the decision model: peak season surge (S1), labor shortage/high turnover (S2), and audit pressure/compliance scrutiny (S3). The scenario-level values refer to the number of primary expert evidence profiles assigned to each scenario, not the total expert panel size and not separate expert samples. The 3–2–2 allocation was based on the closest functional relevance of each expert role to the operational condition being modeled. S1 included three primary profiles because peak-season surge requires operations, workforce planning, and ergonomics/safety expertise. S2 included two primary profiles because labor shortage/high turnover requires supervision, onboarding, and WMS/LMS implementation expertise. S3 included two primary profiles because audit pressure/compliance scrutiny requires audit/compliance and data-governance/privacy expertise. Thus, the seven experts were not divided to create statistically balanced scenario subsamples; rather, their role-specific evidence was organized to support scenario-sensitive interpretation of governance priorities.
Table 22 presents the scenario evidence profiles derived from the expert elicitation process. It summarizes how expert judgments were organized across the three operational scenarios and clarifies the scenario-specific evidence base used to support weighting and ranking decisions.
The scenario profiles show that the strongest mechanism under S1 was the trust–acceptance link, suggesting that employee cooperation becomes especially important when warehouse pressure increases. Under S2, transparency had the strongest influence on trust, indicating that simple and understandable rules are particularly important when staffing is unstable or new workers are present. Under S3, the fairness–resistance and resistance–stability links were strongest, showing that audit pressure increases the importance of contestability, documentation, and defensible procedures.
These scenario profiles provide the expert-informed basis for the subsequent DMFF-BN weighting stage. They show that governance priorities are not fixed across contexts. Peak season emphasizes acceptance and workload-related cooperation, labor shortage emphasizes transparency and trust, and audit pressure emphasizes procedural fairness, resistance reduction, and operational stability.
The scenario-level expert evidence profiles should therefore be interpreted as contextual evidence groupings rather than as independent expert-panel samples. Because each scenario is supported by two or three primary evidence profiles, the scenario comparisons are not intended to represent statistically generalizable differences across large expert groups. Instead, they show how governance priorities change when role-specific expert knowledge is mapped onto distinct operational pressures. This interpretation also means that differences between scenarios may reflect both the operating condition and the expertise represented in the primary evidence profiles. For this reason, the findings are used as decision-support evidence and are complemented by robustness checks, comparative validation, and limitation statements.
4.5. DMFF-BN Scenario-Specific Criteria Weights
The DMFF-BN stage converted the expert evidence profiles into scenario-specific criteria weights for the six governance criteria: procedural fairness (PF), transparency and contestability clarity (TP), system and information quality (SQ), autonomy support (AU), privacy boundary governance (PR), and workload protection (WL). The weights were not obtained by directly averaging expert scores. Instead, expert judgments were first aggregated to form scenario-specific dependency inputs for the Bayesian Network. The BN then estimated the information contribution of each criterion to operational stability by using information gain, entropy, and conditional entropy. The resulting information-contribution values were normalized so that the criteria weights sum to 1.000 within each scenario. This stage therefore provides a BN-derived and expert-informed computational basis for the subsequent DMFF-PCRO ranking.
Table 23 presents the scenario-specific criteria weights generated through the DMFF-BN stage. It shows how the relative importance of procedural fairness, transparency and contestability clarity, system and information quality, autonomy support, privacy boundary governance, and workload protection changes across the three operational scenarios.
The weight patterns show that governance priorities change across operating conditions. Under S1, workload protection received the highest weight (0.207), followed by equal weights for fairness, transparency, system quality, and autonomy (0.170 each). This reflects the importance of reducing fatigue, overload, and resistance during peak periods. Under S2, privacy boundary governance received the highest weight (0.215), followed by procedural fairness (0.205) and transparency (0.195). This suggests that when labor is unstable or turnover is high, employees need clear monitoring boundaries, understandable rules, and fair procedures. Under S3, procedural fairness received the highest weight (0.205), closely followed by workload protection (0.195), indicating that audit pressure increases the importance of contestability, defensible procedures, and resistance reduction.
These results support the scenario-sensitive logic of the model. The criteria weights are not fixed across all conditions; instead, they shift according to the operational pressures faced by the warehouse. This provides a more realistic basis for ranking governance packages than a single static weighting structure.
4.6. DMFF-PCRO Ranking Results
The DMFF-PCRO stage ranked the five governance packages under the three operational scenarios. To improve transparency, Table 24 reports both the scenario-specific ranking score and the corresponding rank. The score was calculated on a normalized 0–1 scale, where higher values indicate stronger ranking performance. Reporting both scores and ranks provides more than ordinal ordering and makes the score differences among alternatives clearer.
To make the ranking logic more transparent, Table 25 summarizes the leading PCRO evidence for each scenario, including the winning package, the score gap from the second-ranked package, and the main criteria or pathways explaining the result.
The results show that the preferred governance package changes according to the operational scenario. Under S1: peak season surge, A2 achieved the highest score (1.000) and ranked first. This indicates that a package combining transparency, workload protection, and autonomy support is most suitable when warehouse operations face volume pressure, time constraints, and intensified KPI monitoring. Under S2: labor shortage/high turnover, A1 ranked first with a score of 1.000, suggesting that baseline compliance and boundary governance become especially important when workforce stability and onboarding consistency are weaker. Under S3: audit pressure/compliance scrutiny, A3 ranked first with a score of 1.000, showing that contestability, procedural fairness, appeal mechanisms, and documentation become central when algorithmic decisions must be justified and defended.
The lower rankings of A4 and A5 do not mean that human oversight or privacy governance are unimportant. Rather, the results suggest that these packages are less effective as stand-alone primary interventions in the three scenarios. Their mechanisms may be more useful when integrated into broader packages such as A1, A2, or A3.
To make this interpretation more transparent, Table 26 reports the criterion-level contribution pattern behind the lower rankings of A4 and A5. The values are normalized contribution indices derived from the scenario-weighted DMFFS decision matrices after applying the BN criteria weights; they should therefore be read as decision-model contribution scores rather than as additional survey statistics.
The contribution pattern explains why A4 and A5 remain lower-ranked despite their importance in AI governance. A4 provides meaningful human-in-the-loop control, but its contribution is concentrated in review and validation functions and is less direct for workload protection, autonomy support, and day-to-day resistance reduction under peak and labor-shortage conditions. A5 provides the strongest privacy-boundary contribution, but privacy governance alone does not sufficiently address target opacity, pacing pressure, reduced discretion, or appeal needs. Therefore, the results do not downgrade human oversight or privacy governance; rather, they indicate that these mechanisms are more effective when embedded within broader governance packages such as A1, A2, or A3.
A robust ranking was also calculated by averaging the scenario-specific scores across S1, S2, and S3. The purpose was to identify which governance package performs most consistently across changing warehouse conditions.
Table 27 presents the robust DMFF-PCRO ranking results across the three operational scenarios.
The robust ranking identifies A2 as the strongest cross-scenario governance package, followed closely by A1 and A3. The score difference between A2 and A1 is small (0.029), and the difference between A1 and A3 is also limited (0.029). This indicates that the leading three packages should not be interpreted as completely separate solutions. Instead, they form a strong governance cluster. A2 provides the best general balance, A1 is particularly useful under labor instability, and A3 becomes most valuable under audit or compliance pressure.
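The averaging step behind the robust ranking can be sketched in a few lines. This is a minimal illustration, assuming an unweighted mean across S1, S2, and S3; the function name `robust_ranking` and the scores in `example_scores` are hypothetical placeholders and are not the values reported in Tables 24 or 27.

```python
from statistics import mean


def robust_ranking(scenario_scores):
    """Average each alternative's score across scenarios and rank descending.

    scenario_scores maps scenario -> {alternative: score on a 0-1 scale}.
    Returns a list of (alternative, mean_score, rank), best first.
    """
    alternatives = next(iter(scenario_scores.values())).keys()
    averaged = {
        alt: mean(scores[alt] for scores in scenario_scores.values())
        for alt in alternatives
    }
    ordered = sorted(averaged.items(), key=lambda item: item[1], reverse=True)
    return [
        (alt, round(score, 3), rank)
        for rank, (alt, score) in enumerate(ordered, start=1)
    ]


# Placeholder scores for illustration only; NOT the values reported in the paper.
example_scores = {
    "S1": {"A1": 0.90, "A2": 1.00, "A3": 0.80, "A4": 0.40, "A5": 0.30},
    "S2": {"A1": 1.00, "A2": 0.90, "A3": 0.85, "A4": 0.35, "A5": 0.45},
    "S3": {"A1": 0.85, "A2": 0.95, "A3": 1.00, "A4": 0.50, "A5": 0.40},
}

for alt, score, rank in robust_ranking(example_scores):
    print(rank, alt, score)
```

Because every scenario contributes equally in this sketch, a package that wins only one scenario can still lead overall if it stays close to the top in the others, which is the sense in which the robust score rewards consistency.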
Figure 5 visually compares the relative performance of the leading governance packages across the three operational scenarios and the robust overall solution.
These findings support the scenario-sensitive logic of the proposed framework. A single governance package is not equally dominant under all conditions. Peak periods require transparency, workload protection, and autonomy support; labor shortages require clear boundaries and simple compliance routines; audit pressure requires contestability and procedural fairness. Therefore, the ranking results provide both a primary recommendation and a context-specific governance guide for 3PL warehouse managers.
4.7. Robustness, Sensitivity, and Benchmark Comparison Results
Robustness and comparative validation were conducted to examine whether the ranking results remain stable across scenario changes and whether the proposed DMFF-BN-PCRO framework provides additional decision value compared with conventional ranking approaches. The analysis focused on three checks: scenario stability, rank-shift behavior, and comparison with SAW and TOPSIS benchmark rankings.
The scenario comparison shows that the leading group remained stable across all three operational conditions. Although the first-ranked package changed by scenario, the same three alternatives, A1, A2, and A3, consistently remained in the Top-3. This indicates that the model does not produce random or unstable recommendations. Instead, it shows a context-sensitive pattern in which A2 is strongest under peak season, A1 is strongest under labor shortage/high turnover, and A3 is strongest under audit pressure/compliance scrutiny. To summarize the stability of the scenario-based rankings, Table 28 reports the main robustness indicators across the three operational scenarios. These indicators show whether the leading governance packages remain stable when the operating context changes from peak season to labor shortage and audit pressure.
To address the robustness tests more explicitly, Table 29 reports the full numerical outcomes of the weight perturbation, DMFFS facet sensitivity, and rank-reversal checks. The results show that the leading cluster remained stable across the tests: A1, A2, and A3 consistently formed the Top-3 set. At the same time, the table also shows that the exact first position can be sensitive in higher-intensity perturbations, especially when resistance and hesitancy penalties are emphasized. Therefore, the robustness evidence supports a stable leading cluster rather than an absolute, method-insensitive dominance of a single alternative.
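A weight perturbation check of this kind can be sketched as follows. This is only an illustration under stated assumptions: the perturbation design used in the study is not specified here, so the sketch assumes uniform multiplicative noise on the weights followed by renormalization, and it substitutes a plain weighted sum for the PCRO aggregation. The names `perturbation_check` and `top_k_set` and the `intensity` parameter are hypothetical.

```python
import random


def top_k_set(scores, k=3):
    """Return the set of the k highest-scoring alternatives."""
    return frozenset(sorted(scores, key=scores.get, reverse=True)[:k])


def perturbation_check(matrix, weights, intensity=0.10, trials=1000, k=3, seed=0):
    """Share of random weight perturbations that keep the leader and the Top-k set.

    matrix maps alternative -> list of criterion scores (benefit-type, 0-1).
    Assumption: weighted-sum aggregation stands in for the study's ranking step.
    """
    rng = random.Random(seed)

    def weighted(ws):
        return {alt: sum(w * x for w, x in zip(ws, row)) for alt, row in matrix.items()}

    base = weighted(weights)
    base_top = top_k_set(base, k)
    base_leader = max(base, key=base.get)

    same_top = same_leader = 0
    for _ in range(trials):
        # Scale each weight by a random factor in [1 - intensity, 1 + intensity].
        noisy = [w * (1 + rng.uniform(-intensity, intensity)) for w in weights]
        total = sum(noisy)
        noisy = [w / total for w in noisy]  # renormalize so the weights sum to 1
        scores = weighted(noisy)
        same_top += top_k_set(scores, k) == base_top
        same_leader += max(scores, key=scores.get) == base_leader

    return {"top_k_stable": same_top / trials, "leader_stable": same_leader / trials}
```

Reporting the two shares separately mirrors the distinction drawn above: the Top-3 set can remain fully stable even when the first position changes in a fraction of the perturbed runs.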
The rank-shift pattern confirms that the model is sensitive to operational context without becoming unstable. A1 shifted between first and second place, A2 shifted between first and second place, and A3 shifted between first and third place. A4 and A5 remained lower-ranked across all scenarios. This supports the interpretation that A1, A2, and A3 are the most defensible governance packages, while A4 and A5 are better understood as supporting mechanisms rather than stand-alone primary interventions.
To provide comparative validation, the DMFF-BN-PCRO robust ranking was compared with SAW and TOPSIS benchmark rankings using the same normalized scenario score matrix. SAW was selected as a transparent additive benchmark, while TOPSIS was selected as a distance-based benchmark commonly used in MCDM studies. The comparison shows strong agreement between the proposed method and the conventional methods, especially in identifying A1, A2, and A3 as the leading alternatives.
Table 30 presents the comparative validation results obtained by applying SAW and TOPSIS as benchmark methods.
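For readers unfamiliar with the two benchmarks, the following sketch shows textbook SAW and TOPSIS applied to an alternatives-by-scenarios score matrix. It assumes equal scenario weights, benefit-type columns, and vector normalization inside TOPSIS; these choices, the function names, and the placeholder matrix are illustrative assumptions rather than the exact benchmark settings used in the study.

```python
import math


def saw_scores(matrix, weights):
    """Simple Additive Weighting: weighted sum of already-normalized benefit scores."""
    return {alt: sum(w * x for w, x in zip(weights, row)) for alt, row in matrix.items()}


def topsis_scores(matrix, weights):
    """TOPSIS closeness coefficient, treating every column as a benefit criterion.

    Assumes no column is all zeros and that the alternatives are not all identical.
    """
    n_cols = len(weights)
    # Vector-normalize each column, then apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix.values())) for j in range(n_cols)]
    weighted = {
        alt: [weights[j] * row[j] / norms[j] for j in range(n_cols)]
        for alt, row in matrix.items()
    }
    ideal = [max(v[j] for v in weighted.values()) for j in range(n_cols)]
    anti_ideal = [min(v[j] for v in weighted.values()) for j in range(n_cols)]
    closeness = {}
    for alt, v in weighted.items():
        d_plus = math.dist(v, ideal)        # distance to the ideal solution
        d_minus = math.dist(v, anti_ideal)  # distance to the anti-ideal solution
        closeness[alt] = d_minus / (d_plus + d_minus)
    return closeness


def rank_order(scores):
    """Alternatives sorted from best to worst score."""
    return sorted(scores, key=scores.get, reverse=True)


# Placeholder matrix (columns: S1, S2, S3); NOT the values reported in the paper.
placeholder = {
    "A1": [0.90, 1.00, 0.85],
    "A2": [1.00, 0.90, 0.95],
    "A3": [0.80, 0.85, 1.00],
    "A4": [0.40, 0.35, 0.50],
    "A5": [0.30, 0.45, 0.40],
}
equal_weights = [1 / 3, 1 / 3, 1 / 3]  # assumption: scenarios weighted equally

print("SAW:   ", rank_order(saw_scores(placeholder, equal_weights)))
print("TOPSIS:", rank_order(topsis_scores(placeholder, equal_weights)))
```

SAW rewards a high weighted total, whereas TOPSIS rewards being simultaneously close to the best and far from the worst observed profile, so the two can disagree on closely matched alternatives even when they agree on the leading group.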
To further evaluate the degree of agreement between the proposed framework and the benchmark methods, Table 31 reports Spearman’s rho, Kendall’s tau, and Top-3 overlap values. These indicators provide numerical evidence on whether the proposed method produces rankings that are consistent with SAW and TOPSIS while still preserving its scenario-sensitive interpretation.
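The three agreement indicators can be computed directly from two rank vectors. The sketch below assumes tie-free rankings (so the closed-form Spearman formula and Kendall's tau-a apply) and defines Top-3 overlap as the shared share of the two Top-3 sets; the function names and the two example rankings are hypothetical and do not reproduce the values in Table 31.

```python
from itertools import combinations


def spearman_rho(rank_a, rank_b):
    """Spearman's rho for two tie-free rankings given as {item: rank}."""
    n = len(rank_a)
    d_squared = sum((rank_a[item] - rank_b[item]) ** 2 for item in rank_a)
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


def kendall_tau(rank_a, rank_b):
    """Kendall's tau-a for two tie-free rankings given as {item: rank}."""
    concordant = discordant = 0
    for x, y in combinations(rank_a, 2):
        direction = (rank_a[x] - rank_a[y]) * (rank_b[x] - rank_b[y])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    n_pairs = len(rank_a) * (len(rank_a) - 1) / 2
    return (concordant - discordant) / n_pairs


def top_k_overlap(rank_a, rank_b, k=3):
    """Fraction of the Top-k set shared by both rankings."""
    top_a = {item for item, rank in rank_a.items() if rank <= k}
    top_b = {item for item, rank in rank_b.items() if rank <= k}
    return len(top_a & top_b) / k


# Hypothetical rankings that differ only by a swap of the first two positions.
proposed = {"A2": 1, "A1": 2, "A3": 3, "A4": 4, "A5": 5}
benchmark = {"A1": 1, "A2": 2, "A3": 3, "A4": 4, "A5": 5}

print(spearman_rho(proposed, benchmark))
print(kendall_tau(proposed, benchmark))
print(top_k_overlap(proposed, benchmark))
```

With five alternatives, a single swap of the top two positions yields rho = 0.9 and tau = 0.8 while leaving the Top-3 overlap at 1.0, which shows how a first-rank disagreement can coexist with strong overall rank agreement.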
The benchmark comparison shows that the proposed method is broadly consistent with conventional MCDM approaches, but it provides a more scenario-sensitive interpretation. SAW and TOPSIS placed A1 narrowly ahead in the robust comparison, whereas DMFF-BN-PCRO ranked A2 first because it better captured the combined role of transparency, workload protection, autonomy support, engagement, and resistance reduction across scenarios. This difference is meaningful because algorithmic management governance is not only a technical scoring problem; it also involves interdependent criteria and behavioral risk factors.
This first-rank difference should therefore be interpreted as limited method sensitivity rather than as a contradiction among the methods. SAW gives a direct additive score and TOPSIS rewards closeness to the ideal solution; both approaches tend to favor A1 because the baseline compliance package performs steadily across documentation, monitoring-boundary, and minimum governance requirements. By contrast, DMFF-BN-PCRO gives greater influence to interdependent and behaviorally sensitive criteria that become discriminative under operational stress, particularly transparency, workload protection, autonomy support, expected engagement, and resistance reduction. For this reason, A2 becomes the leading package in the proposed framework, even though A1 remains very close and highly defensible. The final recommendation is therefore framed as a stable leading cluster composed of A1, A2, and A3, with A2 recommended as the most balanced robust package under the dependency-aware PCRO logic rather than as an absolute winner under every possible aggregation method.
These validation results strengthen the credibility of the ranking outputs. The proposed framework identifies a stable leading cluster, avoids major rank instability, and remains highly correlated with conventional MCDM benchmarks. At the same time, it adds value by showing how governance priorities shift under different warehouse conditions and by incorporating uncertainty, engagement, and resistance into the ranking logic.
5. Discussion
5.1. Interpretation of Main Findings
The findings show that algorithmic management in 3PL warehousing should be understood as a socio-technical governance issue rather than only a technical optimization tool. In response to RQ1, the results indicate that procedural fairness, transparency, system/information quality, workload protection, and privacy boundary governance are the most important mechanisms shaping employee responses. Procedural fairness is important because employees are more likely to accept algorithmic authority when decisions are consistent, explainable, and contestable. Transparency and system quality support trust by reducing perceived arbitrariness and increasing confidence in the reliability of system outputs. Workload and privacy are also central because they act as risk triggers. When algorithmic systems intensify pacing, monitoring, or KPI pressure, employees may respond with resistance, disengagement, or workaround behavior. Therefore, the dominant criteria are those that either build trust and acceptance or reduce resistance under operational stress.
In response to RQ2, the scenario results show that governance priorities shift according to the operating context. Under peak season surge, workload protection becomes especially important because algorithmic systems may increase task intensity, alert frequency, and pacing pressure. This explains why A2, which combines transparency, workload protection, and autonomy support, performs best in this scenario. Under labor shortage or high turnover, the main challenge is maintaining clarity and cooperation in a less stable workforce. In this context, A1 becomes stronger because baseline compliance and boundary governance provide simple, understandable, and stable routines. Under audit pressure or compliance scrutiny, the main risk shifts toward defensibility, conflict escalation, and legitimacy. Therefore, A3 performs best because contestability, procedural fairness, appeal channels, and documentation become more important.
In response to RQ3, the robust ranking identifies A2 as the strongest cross-scenario governance package. Its advantage comes from its balanced design. It does not rely on a single governance mechanism, but combines interpretability, workload protection, and autonomy support in a way that supports trust while reducing resistance. However, the results should not be interpreted as showing that one package is sufficient for all conditions. Rather, A1, A2, and A3 form the leading governance cluster. A2 is the best robust starting point, A1 is especially useful when workforce stability is weak, and A3 becomes critical when fairness, documentation, and contestability are under scrutiny.
This interpretation has an important practical implication. 3PL warehouses should not ask only which governance package is universally best. They should ask which package is most suitable under the current operational regime. A scenario-sensitive approach allows managers to use A2 as the general governance baseline, reinforce it with A1 during labor instability, and strengthen it with A3 during audit or compliance pressure.
Table 32 summarizes the scenario-based interpretation of the ranking results and translates the quantitative findings into practical governance recommendations. The table shows which governance package is most appropriate under each operating condition and clarifies the main reason for its selection.
5.2. Theoretical Contributions
This study contributes to the algorithmic management literature by reframing algorithmic control in 3PL warehousing as a socio-technical governance problem rather than only a productivity or automation issue. The findings show that algorithmic task allocation, KPI monitoring, scanner-based feedback, and scheduling systems affect employees not only through operational efficiency, but also through perceptions of fairness, transparency, autonomy, privacy, workload, trust, and resistance. This extends the discussion of algorithmic management by showing that employee cooperation depends on whether algorithmic systems are perceived as legitimate, explainable, bounded, and responsive to worker concerns.
A second contribution is the integration of several theoretical perspectives into one governance logic. Procedural fairness explains why employees respond more positively when algorithmic decisions are consistent, contestable, and supported by voice mechanisms. Trust in automation explains why transparency and system/information quality are necessary for employees to rely on algorithmic outputs. Technology acceptance logic explains why trust and autonomy support increase willingness to comply with algorithmic guidance. Resistance theory explains why excessive workload, privacy concerns, and weak fairness mechanisms can produce disengagement, workarounds, or pushback. By connecting these mechanisms, the study offers a more complete explanation of how algorithmic governance affects employee behavior in warehouse operations.
A third contribution is the development of a scenario-contingent view of algorithmic governance. The results show that the same governance package does not perform equally well under all operating conditions. Peak season increases the importance of workload protection and autonomy support; labor shortage increases the value of simple rules, boundary governance, and privacy clarity; audit pressure increases the importance of contestability, documentation, and procedural fairness. This suggests that responsible algorithmic management should not be treated as a fixed compliance checklist. Instead, governance should be adaptive to the operational context in which algorithmic systems are used.
Finally, the study contributes to socio-technical systems theory by showing that technical, organizational, and human factors must be governed together. A technically reliable system may still generate resistance if it creates excessive workload, reduces autonomy, or lacks appeal mechanisms. Similarly, transparency may not be sufficient if employees do not trust how data are used or if they cannot challenge unfair outputs. The findings therefore support a human-centered governance perspective in which operational stability depends on both system performance and employee legitimacy perceptions.
5.3. Methodological Contributions
This study also offers a methodological contribution by showing how employee survey evidence and expert judgment can be brought together in a structured decision framework. Algorithmic management governance is difficult to evaluate with a single method because it includes both measurable employee perceptions and expert-based judgments about implementation risks. For this reason, the study combines SEM, DMFFS, Bayesian Network weighting, and PCA-based ranking optimization. This integrated approach helps translate human-centered concerns, such as fairness, trust, privacy, workload, and resistance, into practical governance criteria that can be evaluated systematically.
The first methodological contribution is the use of Dynamic Multi-Facet Fuzzy Sets (DMFFS). In many decision models, expert judgments are reduced to one numerical score. However, this can oversimplify complex governance problems. In the context of algorithmic management, an expert may believe that a package is technically useful but still be uncertain about employee acceptance or possible resistance. DMFFS addresses this issue by capturing membership, non-membership, hesitancy, engagement, and resistance at the same time. This makes the evaluation more realistic because it reflects both the positive potential and the possible implementation risks of each governance package.
The second contribution is the use of Bayesian Network weighting. This is important because the criteria in this study are not independent of one another. For example, transparency can strengthen trust, procedural fairness can reduce resistance, workload protection can support cooperation, and resistance can weaken operational stability. The Bayesian Network makes these relationships visible and allows the criteria weights to change across different scenarios. In this way, the model better reflects the reality of 3PL warehouses, where priorities may shift during peak season, labor shortage, or audit pressure.
The third contribution is the use of PCA-based ranking optimization. The results show that A1, A2, and A3 are relatively close in performance, which means that simple ranking methods may not fully capture the differences among them. PCA-based ranking helps separate the alternatives more clearly by identifying the main patterns in the weighted decision matrix. The comparison with SAW and TOPSIS also supports the credibility of the proposed framework, because it shows that the model is consistent with conventional MCDM methods while still offering a richer scenario-based interpretation.
In this sense, the proposed methodology is not only a technical ranking procedure. It provides a transparent decision-support process for responsible AI governance in warehouse operations. It shows how employee perceptions, expert uncertainty, criteria interdependence, and scenario-based priorities can be combined to support more defensible and human-centered algorithmic management decisions.
5.4. Practical Implications for 3PL Warehouses
The findings provide practical guidance for 3PL warehouse managers who use or plan to implement algorithmic management systems. The results show that governance should not focus only on technical accuracy, productivity targets, or compliance documentation. Employees also need to understand how algorithmic decisions are made, how performance data are used, whether they can challenge unfair outputs, and how workload pressure is controlled. Therefore, human-centered governance should be built into daily warehouse routines rather than treated as a separate policy document.
The first practical implication is that A2 can be used as the main starting point for algorithmic management governance. Since A2 combines transparency, workload protection, and autonomy support, it is especially useful in high-pressure warehouse settings. Managers can apply this by explaining KPI calculation rules, clarifying task-allocation logic, reducing unnecessary alerts, setting workload escalation thresholds, and allowing limited discretion in task sequencing or exception handling. These actions can reduce confusion, overload, and resistance while supporting employee cooperation.
The second implication is that governance should be adjusted according to the operational scenario. During peak season, managers should prioritize workload caps, alert throttling, target recalibration, micro-break routines, and rapid exception handling. During labor shortage or high turnover, managers should emphasize simple rules, onboarding-friendly dashboards, clear monitoring boundaries, and stable baseline procedures. Under audit pressure or compliance scrutiny, managers should strengthen appeal logs, override records, fairness reviews, supervisor explanations, and data-access controls. This scenario-based approach makes governance more realistic because warehouse risks change across operating conditions.
The third implication is that A1 and A3 should be used as reinforcements rather than ignored. A1 provides the baseline governance foundation through clear rules, monitoring boundaries, and data-use procedures. This is especially important when workers are new, temporary, or unfamiliar with algorithmic systems. A3 strengthens legitimacy through contestability, procedural fairness, appeals, and documentation. This becomes critical when employees question algorithmic decisions or when managers need to justify decisions during audits, complaints, or compliance reviews.
The lower ranking of A4 and A5 does not mean that human oversight and privacy governance are unimportant. Rather, the results suggest that they may be more effective when integrated into broader governance packages. Human-in-the-loop oversight should be connected to appeal procedures, exception handling, and supervisor review routines. Privacy governance should be embedded into transparency practices, employee notices, access-control rules, and retention policies. In practice, responsible algorithmic management requires a combined governance architecture rather than isolated interventions.
For implementation, 3PL managers can begin with a phased approach. First, they should establish baseline compliance and monitoring-boundary rules. Second, they should improve transparency, workload protection, and autonomy support in daily operations. Third, they should formalize contestability mechanisms, including appeal channels, override documentation, and fairness review routines. This staged approach allows warehouses to improve governance without disrupting operational continuity. It also helps managers move from basic compliance toward a more human-centered and resilient algorithmic management system.
6. Conclusions
This study developed a human-centered and scenario-sensitive decision framework for selecting algorithmic management governance packages in 3PL warehousing. The framework integrates employee survey evidence, SEM-supported construct validation, expert-based Dynamic Multi-Facet Fuzzy Sets, Bayesian Network weighting, PCA-based ranking optimization, and comparative validation with SAW and TOPSIS. Its main contribution is to show how employee perceptions and expert judgment can be combined to evaluate governance packages under different warehouse conditions. In this way, the study moves beyond a purely technical view of algorithmic management and treats governance as a socio-technical issue shaped by fairness, transparency, system quality, autonomy, privacy, workload, trust, acceptance, and resistance.
The findings show that effective algorithmic management governance cannot be reduced to a single fixed solution. Six criteria were central to the decision framework: procedural fairness, transparency and contestability clarity, system and information quality, autonomy support, privacy boundary governance, and workload protection. Among the five governance packages, A2, which combines transparency, workload protection, and autonomy support, emerged as the strongest robust solution across scenarios. However, the scenario-specific findings also show that A1 and A3 play important roles under particular operating conditions. A1, focused on baseline compliance and boundary governance, performed best under labor shortage or high turnover, while A3, focused on contestability, procedural fairness, and appeals, performed best under audit pressure or compliance scrutiny. Therefore, A1, A2, and A3 should be interpreted as the leading governance cluster rather than as completely separate or competing solutions.
For logistics practice, the results suggest that 3PL managers should adopt an adaptive governance approach instead of relying on static compliance routines. During peak season, governance should prioritize workload caps, alert management, target clarification, and autonomy support. During labor shortage or high turnover, managers should emphasize simple rules, onboarding-friendly dashboards, stable procedures, and clear monitoring boundaries. Under audit or compliance pressure, appeal channels, override documentation, fairness reviews, and data-access controls become more important. These findings show that governance mechanisms are not only ethical safeguards; they also support operational stability by reducing confusion, resistance, workarounds, and implementation failure.
The study also has implications for responsible digitalization in logistics. As warehouses become more dependent on algorithmic task allocation, scanner-based monitoring, KPI dashboards, and digital performance feedback, governance arrangements must remain explainable, contestable, and auditable. Clear rules on data access, data retention, purpose limitation, employee communication, and appeal procedures can help protect both employee legitimacy perceptions and service continuity. The proposed framework gives managers and compliance teams a structured way to justify governance choices by linking design features to behavioral and operational risk channels.
Several limitations should be acknowledged. First, the employee survey is cross-sectional, which limits causal interpretation. Second, the study is based on a specific 3PL warehousing context, and the findings may not fully generalize to other countries, regulatory environments, or logistics subsectors. Third, although the expert panel was selected to reflect role diversity, the number of experts remains limited. Fourth, the scenarios used in the analysis are stylized representations of operational pressure and should be refined with site-specific evidence in future applications. Finally, the decision model should be further validated with live operational indicators such as exception rates, appeal frequency, safety incidents, turnover, rework, service-level performance, and productivity variation.
An additional limitation concerns the scenario-level expert evidence profiles. Although all seven experts reviewed the general scenario logic and the decision-model structure, each scenario was supported by two or three primary role-specific evidence profiles rather than by a fully balanced expert subsample. Therefore, differences in scenario-level weights and rankings may be influenced not only by the operational characteristics of peak season, labor shortage, or audit pressure, but also by the expert roles most directly associated with each scenario. This limits the statistical generalizability and stability of direct scenario-to-scenario comparisons. For this reason, the scenario results should be interpreted as context-sensitive decision-support findings rather than as definitive population-level differences. Future studies could use larger expert panels in which all experts provide full, detailed evaluations for every scenario, or balanced scenario-specific panels with equal numbers of operations, workforce, safety, compliance, privacy, and system-implementation experts.
Future research can extend the framework in several directions. Longitudinal studies could examine whether improvements in fairness, transparency, workload protection, and privacy governance actually reduce resistance after algorithmic management systems are implemented. Multi-site and cross-country studies could test how institutional, regulatory, and organizational differences affect the governance–behavior relationship. Future studies could also integrate live warehouse KPIs into the model to strengthen validation and practical usefulness. Finally, applying the framework to transportation, last-mile delivery, fulfillment centers, and platform-coordinated logistics would help clarify how algorithmic governance requirements differ across logistics systems.