Article

Research on the Association and Pathways Between Data Elements and Coastal Port Smartness Enhancement

School of Maritime Economics and Management, Dalian Maritime University, Dalian 116026, China
* Author to whom correspondence should be addressed.
Sustainability 2026, 18(12), 5989; https://doi.org/10.3390/su18125989
Submission received: 24 April 2026 / Revised: 7 June 2026 / Accepted: 8 June 2026 / Published: 11 June 2026

Abstract

Against the backdrop of the “Dual Carbon” strategy and global shipping digitalization, data elements have emerged as a key enabling factor and predictive correlate of coastal port smartness. Using panel data for seven coastal provinces/municipalities and eight coastal ports in China from 2017 to 2024, this paper constructs a “base-supply-flow-use” data element development index (DEDI) and a “WSR” coastal port smartness index (CPSI), and employs the VHSD-EM dynamic model, the random forest algorithm, and a partial effect model to examine the association patterns, nonlinear responses, and differentiated enhancement pathways between data elements and port smartness. The findings are as follows: (1) CPSI and DEDI are strongly positively correlated, and regional disparities are narrowing; (2) CPSI shows stepwise spatial differentiation, with Shanghai and Ningbo-Zhoushan Ports leading, while Guangdong demonstrates “data advancement but smartness lag”; (3) in the random forest model, the predictive contribution of DEDI to CPSI is 13.586%, which ranks behind digital inclusive finance and openness level but ahead of regional economic strength and innovation output, and the combined predictive contribution of the DEDI main effect and its interaction terms reaches 32.567%; (4) the univariate partial effect of DEDI on predicted CPSI follows a three-stage nonlinear pattern of initial accumulation, accelerated improvement around a threshold of DEDI ≈ 0.215, and diminishing marginal effects at higher levels; and (5) the joint partial effects of DEDI with digital inclusive finance, economic development, fiscal transportation expenditure, and innovation output show clear dimensional and regional heterogeneity. Accordingly, four policy pathways are proposed: constructing a full-chain data element system, enabling synergistic empowerment of data and supporting elements, formulating regionally differentiated catch-up strategies, and strengthening the dual-wheel support of digital inclusive finance and opening-up, all aimed at advancing the development of world-class ports.

1. Introduction

The digitization of global trade, the “Dual Carbon” strategy, and the demand for supply chain resilience are collectively propelling ports to transition from competition centered on traditional throughput to that focused on efficiency, green performance, and intelligent capabilities. The development of smart ports encompasses not only automated equipment and digital platforms but also the alignment among data resources, business processes, organizational capabilities, and governance mechanisms. As a new production factor, data elements provide a critical foundation for port perception, scheduling, coordination, and service optimization. However, existing studies remain confined to single-port case analyses or static evaluations, lacking empirical evidence on the dynamic correlation between provincial data elements and the intelligence level of coastal ports. This deficiency results in a lack of quantitative basis for policy instruments and investment decisions. Therefore, within the context of China’s coastal ports, a systematic investigation into the correlation characteristics and regional disparities between the development of data elements and port intelligence holds significant practical urgency for the implementation of the “Data Elements × Transportation” special action.

2. Literature Review

Port smartness is a comprehensive measure of how far a port's smart transformation has progressed, reflecting the maturity and efficiency of its smart-port initiatives. It serves as a core indicator for measuring the progress and outcomes of those efforts. Existing literature on the measurement and evaluation of port smartness primarily focuses on the construction of indicator systems and the innovation of evaluation methodologies. For instance, Molavi et al. (2020) proposed the Smart Port Index (SPI) [1], while Philipp (2020) developed the Port Digital Readiness Index [2]. However, the former assesses smart ports from only three dimensions, and the latter emphasizes digital infrastructure and technological readiness, making it difficult to fully capture the multidimensional characteristics of port intelligence. In China, Cai Wenxue and Zheng Jichuan (2019) developed an evaluation framework based on a set of indicators and applied the Analytic Hierarchy Process (AHP) combined with fuzzy comprehensive evaluation, but the manual weighting makes their results heavily dependent on expert judgment [3]. Cao Jie et al. (2021) employed an improved matter-element theory based on cloud models to evaluate Tianjin Port, yet the limited sample size restricts cross-comparisons among coastal ports [4]. Zheng Zhong and Li Hongliang (2022) used AHP to evaluate 14 international ports including Hamburg Port, broadening the comparative perspective internationally; however, their evaluation system still relies heavily on static weight assignments, failing to reflect the dynamic evolution of port intelligence [5]. Luo Bencheng et al. (2023) proposed a grey entropy change-weight evaluation model and conducted empirical analysis on the intelligence levels of China’s top 13 container throughput ports, but their sample selection remains throughput-driven, potentially underestimating differences in intelligent transformation among small and medium-sized ports [6]. Zhu Jishuang et al. (2025) broke away from the “throughput-centric” paradigm by establishing a multidimensional comprehensive evaluation system for world-class ports and applied AHP and fuzzy comprehensive evaluation to assess 34 global ports [7]. Zhou et al. (2024) based their assessment on indicators from the “Evaluation Index System for Smart Ports”, using correlation calculation and entropy weighting methods, though weights computed separately for each period vary over time, so the static entropy method may still produce scores that are not comparable across periods [8]. Clearly, although existing research on port intelligence evaluation continues to evolve, challenges remain, including inconsistent indicator systems, static evaluation methods, and insufficient cross-port comparability.
Data elements, with data as their carrier, are a new type of production factor that can participate in production and operation activities, create economic value, and improve total factor productivity after undergoing collection, processing, circulation, and application. Li Zhiguo and Wang Jie (2021), as well as He Wei et al. (2024), constructed indicator systems for data elements and used principal component analysis and entropy weight methods to measure the development level of data elements; however, their indicators primarily focused on digital infrastructure and macro-level allocation environments [9,10]. Huang (2025) combined indicator system approaches with text analysis to assess China’s data element levels at both the provincial macro level and the enterprise micro level [11]. Pan Hongliang (2025) established an evaluation framework for data element development based on three dimensions (data foundational support, data capability transformation, and industry application) and measured the domestic development level of data elements [12]. Although this study began to emphasize the transformation process of data elements, its “capability” dimension placed greater emphasis on digital technological capabilities than on data supply quality or data service capacity [12]. Chen Rongda (2025) measured the marketization level of data elements across Chinese provinces from a macro perspective, focusing on data supply, circulation, and utilization [13]. Wu Jie and Chen Hongzhao (2025) developed an evaluation system for data element development from an input-output perspective, systematically depicting the processes of data input, transmission, processing, support, and value realization [14]. Nevertheless, their approach still leaned toward macro-level statistical and spatiotemporal evolution analysis, offering insufficient explanation of how data elements integrate into port smartness construction [14].
In summary, existing research has measured the development level of data elements at regional, urban, and enterprise levels, highlighting the importance of data infrastructure, technological supply, data circulation, and application scenarios. However, within the port sector, the development level of data elements is often reflected in the data environment and digital foundation of the regions where ports are located. Whether these elements can be transformed into port intelligence capabilities depends further on port-specific business contexts, data governance mechanisms, and organizational absorption capacity.
As data elements have developed rapidly, they have gradually emerged as a core driving force in the smartness process of coastal ports. A growing number of scholars have explored the role of data elements and associated digital technologies in enhancing port smartness. Sun Yu and Wang Pei (2019) emphasized that the key to applying unmanned technology lies in precise data collection and efficient utilization; however, their research focused primarily on individual technological applications and did not comprehensively analyze the overall impact of data elements on port intelligence [15]. Paulauskas et al. (2021) argued that port digitization levels are influenced by multiple internal factors, with small and medium-sized ports significantly lagging behind larger ones in terms of digital maturity [16]. Tang Hao et al. (2021), based on 5G technology, developed an information platform for Zhanjiang Port’s smart port system, proposing optimization strategies such as establishing databases, planning and constructing port platforms, and building information perception networks [17]. Min (2022) highlighted the roles of the Internet of Things and big data in port management, but the analysis mainly focused on management architecture and technical applications without incorporating data as an independent production factor into the analytical framework [18]. Deng Yuyong et al. (2022) analyzed 15 major Chinese ports and concluded that smart port development has significantly improved total factor productivity and overall efficiency through technological advancement, although current technical and scale efficiency still require improvement [19]. Du Xinke (2023) and Ma Lanqing et al. (2024) respectively examined the application of artificial intelligence and sensing technologies in smart port development, strengthening the focus on key technologies, yet insufficient attention was given to the supply, circulation, and value transformation of data elements [20,21].
Hua Jiang et al. (2025), using Jiangyin Port as a case study, proposed that smart port development should be data-centric [22], while Cai Hanyi (2026) discussed the significant role of big data technology in intelligent port logistics [23]; however, both studies primarily focused on practical pathways and management strategies, lacking quantitative validation across multiple regions and years. These research findings indicate that digital technologies and data resources play a crucial role in improving port efficiency, yet there remains insufficient discussion regarding the systematic relationships, nonlinear characteristics, and regional heterogeneity between regional data element development and port smartness enhancement. Particularly in China’s coastal regions, differences in data element development levels, port scales, hinterland industries, and governance conditions across provinces and cities may lead to varying efficiencies in transforming data elements into port smartness.
In summary, existing research still suffers from four major shortcomings. First, lack of universality: Current port smartness evaluation systems employ different standards, with significant variations in indicator dimensions and weighting methods, resulting in insufficient horizontal comparability across studies. Second, lack of full-chain coverage: According to the lifecycle theory of data elements, data elements as a production factor must pass through a complete “base-supply-flow-use” chain. Most existing studies focus only on infrastructure or macro-level applications, whereas this paper is the first to incorporate all four stages simultaneously into port smartness analysis. Third, lack of dynamism: Most studies use static evaluation methods such as the entropy weight method or fuzzy comprehensive evaluation to measure the development level of data elements and port smartness, making it difficult to capture the temporal evolution of indicators. Fourth, lack of mechanisms for identifying associations: Existing approaches mostly rely on linear weighted models or static evaluation models, implicitly assuming stable marginal contributions of each indicator, which makes it difficult to identify threshold effects, synergistic effects, and saturation effects in how data elements influence port smartness. They also rarely discuss the possibility that highly intelligent ports may, in turn, promote improvements in data collection and application capabilities. By contrast, the random forest algorithm is less sensitive than traditional linear models to model specification bias and multicollinearity, making it well-suited for analyzing complex real-world scenarios involving interactions among multiple factors.
Given this, this paper constructs an indicator system for measuring coastal port smartness and the development level of data elements, employs the VHSD-EM model to assess these dimensions across coastal regions, analyzes their temporal and spatial evolution, and integrates Random Forest algorithms with partial-effect models to examine the predictive contribution and average marginal response of data elements to port smartness. This provides practical guidance and empirical support for enhancing the smartness of China’s coastal ports. The main contribution of this study lies in offering an empirical diagnostic framework that captures the relationship between data elements and port smartness for China’s coastal ports, rather than proposing strict causal identification or novel machine learning algorithms. Specifically, the combined use of VHSD-EM, Random Forest, and partial-effect analysis provides complementary evidence on index construction, predictive association, and nonlinear response patterns while avoiding causal claims beyond the capacity of the data. Although the empirical setting is China, the framework also offers a transferable reference for other port systems seeking to assess whether regional data-element environments are effectively converted into port smartness performance.

3. Mechanism Analysis of Data Elements Driving the Enhancement of Coastal Port Smartness

The development of smart ports is a critical strategy for responding to global trade demands, enhancing national competitiveness, and promoting economic and social progress. It has become an important driver of efficiency improvement, cost reduction, and competitive advantage.
From a socio-technical and ecosystem perspective, the smartness of coastal ports is not determined solely by automation equipment or digital technologies. Rather, it results from the coordinated transformation of digital infrastructure, operational processes, governance mechanisms, and organizational capabilities. Digital infrastructure provides the foundation for data sharing, platform interconnection, and collaborative innovation. Socio-technical transformation theory emphasizes that technological upgrading must evolve alongside user practices, institutional arrangements, industrial networks, and governance structures. The digital platform ecosystem perspective further suggests that ports are multi-actor collaborative systems involving terminal operators, shipping companies, customs authorities, logistics providers, financial institutions, and public management agencies. Their value creation depends on data sharing, platform interoperability, and rule-based coordination. Therefore, data elements enhance port smartness not only by improving technical efficiency but also by reshaping governance models, inter-organizational collaboration, and operational performance.
The advancement of port smartness is a socio-technical process driven by data elements. However, data elements do not automatically translate into higher levels of smartness. Their value realization depends on the synergistic interaction among technological infrastructure, governance mechanisms, and human and organizational support. Accordingly, the WSR framework provides a suitable analytical lens for explaining the formation mechanism of port smartness.
The data-driven enhancement of coastal port smartness follows a progressive mechanism of factor reconstruction, chain empowerment, and system transition. Specifically, factor reconstruction explains the micro-level embedding of data elements into port production functions; chain empowerment reveals the meso-level circulation and value creation of data elements across the port value chain; and system transition captures the macro-level transformation of data elements into smart port capabilities through the coordinated evolution of the Wuli-Shili-Renli system. The mechanism of data elements driving smartness enhancement in coastal ports is presented in Figure 1.
At the factor reconstruction stage, data elements are embedded into the port production function as a new production factor, reshaping the structure of production inputs. The traditional port production function can be expressed as $Y = A f(N, K)$, where $Y$ denotes port output, $N$ labor input, $K$ capital input, and $A$ total factor productivity. With the introduction of data elements, the function can be reconstructed as $Y = A_I f(N_I, K_I, I)$, where $I$ represents data elements as an independent production factor, $N_I$ denotes data-empowered labor input, $K_I$ denotes data-empowered capital input, and $A_I$ refers to data-enhanced total factor productivity. These terms represent different mechanisms of data empowerment: data improve labor skills and decision-making capabilities, optimize capital allocation and equipment utilization, directly participate in production as an independent input, and enhance total factor productivity through integrated digital technologies.
At the chain empowerment stage, data elements permeate the entire base-supply-flow-use value chain. In the foundation support layer, digital infrastructure is established to enable human–machine-object interconnection. In the technology supply layer, production and operation are smart-enabled through the dual drive of mechanisms and data. In the circulation operation layer, data barriers among stakeholders are reduced to facilitate collaborative sharing. In the value transformation layer, the multiplier effect of data is activated to support new business models and service innovation. These four layers progress hierarchically and form a closed-loop empowerment process from infrastructure construction to value realization.
At the system transition stage, the effects generated through factor reconstruction and chain empowerment are integrated into the WSR system. The Wuli dimension represents the technological and material foundation of port smartness, including digital infrastructure, data collection systems, intelligent equipment, communication networks, and integrated digital platforms. It provides the basic conditions for data acquisition, transmission, processing, and application. The Shili dimension reflects the institutional and managerial logic through which technical resources are transformed into operational efficiency and governance capacity, including governance rules, process integration, collaboration mechanisms, platform management, risk control, and decision-making processes. The Renli dimension emphasizes the human-centered and organizational foundation of port intelligence, including managerial cognition, digital skills, organizational learning, stakeholder collaboration, and cross-departmental cooperation.
Through the coordinated interaction of Wuli, Shili, and Renli, data elements are transformed from technical resources into system-level capabilities for enhancing port smartness. This process systematically improves the smartness of port infrastructure, operational service efficiency, and talent-driven innovation capacity. Moreover, the empowerment effect on port smartness is strengthened by multiple supporting factors, generating a synergistic amplification effect in which the whole is greater than the sum of its parts. As a result, coastal port smartness evolves from quantitative accumulation toward qualitative transformation, forming a dynamic coupling mechanism between data elements and supporting conditions.

4. Research Methods

To avoid methodological over-complexity, the three analytical components are assigned distinct and limited roles: VHSD-EM constructs comparable CPSI and DEDI indices, Random Forest identifies predictive contributions under nonlinear conditions, and partial-effect analysis visualizes the model-predicted response intervals. The framework is therefore used for empirical diagnosis and policy interpretation of port smartness enhancement, not for claiming algorithmic novelty or establishing causal identification.

4.1. The VHSD-EM Model

The VHSD-EM model is a dynamic comprehensive evaluation approach that integrates the Vertical and Horizontal Scatter Degree (VHSD) model with the Entropy Method (EM). Compared with the traditional analytic hierarchy process (AHP), the VHSD-EM model reduces the subjectivity associated with expert-based weighting. Unlike the standalone entropy method, it considers not only the dispersion of indicator information but also the vertical and horizontal scatter degree, thereby incorporating both temporal dynamics and cross-sectional heterogeneity into the weighting process and improving the dynamic comparability of panel evaluation results. In contrast to the standalone VHSD model, VHSD-EM further leverages the entropy method’s capacity to capture the information content of indicators, thus avoiding an excessive reliance on inter-object differences that neglects the intrinsic informational contribution of the indicators. A comparison of the evaluation methods is presented in Table 1.
Given that the evaluation of coastal port smartness and data element development involves both temporal evolution and spatial differentiation, the heterogeneity arising from spatiotemporal factors must be adequately addressed. Therefore, this study constructs the VHSD-EM model by integrating the two methods, generating composite weights that balance temporal dynamics, cross-sectional differentiation, and information completeness. The proposed model provides a more transparent and comparable diagnostic basis for measuring and evaluating coastal port smartness and the development level of data elements.
(1) Basic principle of the VHSD model. Let the smartness of coastal ports be denoted as $PI_{jt}$. The core formula is as follows:
$$PI_{jt} = \sum_{i=1}^{I} W_i X_{ijt}, \quad i = 1, 2, \ldots, I;\ j = 1, 2, \ldots, J;\ t = 1, 2, \ldots, T \tag{1}$$
In Formula (1), $PI_{jt}$ represents the comprehensive evaluation score of the $j$-th evaluation object (port) in period $t$; $W_i$ denotes the weight of the $i$-th indicator; and $X_{ijt}$ is the standardized value of the original data for the $i$-th indicator of the $j$-th port in period $t$.
The determination of indicator weights adheres to the principle of “maximizing inter-object differentiation”. This differentiation is measured by the total sum of squared deviations (TSSD), expressed as:
$$\sigma^2 = \sum_{t=1}^{T} \sum_{j=1}^{J} \left( PI_{jt} - \overline{PI} \right)^2 = \sum_{t=1}^{T} W^{\mathrm{T}} H_t W = W^{\mathrm{T}} \left( \sum_{t=1}^{T} H_t \right) W = W^{\mathrm{T}} H W \tag{2}$$
In Formula (2), $W = (W_1, W_2, \ldots, W_I)^{\mathrm{T}}$ is the weight vector, and $H = \sum_{t=1}^{T} H_t$ is a symmetric matrix of order $I$. Let $X_t$ be the $J \times I$ matrix of standardized values in period $t$, with one row per port and one column per indicator,
$$X_t = \begin{pmatrix} X_{11t} & \cdots & X_{I1t} \\ \vdots & \ddots & \vdots \\ X_{1Jt} & \cdots & X_{IJt} \end{pmatrix},$$
then we have $H_t = X_t^{\mathrm{T}} X_t$.
To satisfy the basic constraint of indicator weights $W^{\mathrm{T}} W = 1$, the maximization of the objective function in Equation (2) is transformed into the following optimization problem:
$$\max\ W^{\mathrm{T}} H W, \quad \text{s.t.}\ \lVert W \rVert = 1,\ W > 0 \tag{3}$$
The solution $W$ is the eigenvector corresponding to the largest eigenvalue $\lambda_{\max}(H)$ of matrix $H$. Finally, the comprehensive evaluation score $PI_{jt}$ for each port at each time point is calculated using Equation (1).
(2) Basic principle of the EM model. Firstly, the same standardization process is applied to obtain normalized indicator values $X_{ijt}$. Information entropy is then used to determine indicator weights $w_{it}$, with the core formula:
$$w_{it} = \frac{1 - E_{it}}{\sum_{i=1}^{I} \left( 1 - E_{it} \right)} \tag{4}$$
$$E_{it} = -\frac{\sum_{j=1}^{J} P_{ijt} \ln P_{ijt}}{\ln J}, \quad P_{ijt} = \frac{X_{ijt}}{\sum_{j=1}^{J} X_{ijt}} \tag{5}$$
Here, $E_{it}$ denotes the information entropy of the $i$-th indicator in period $t$, and $P_{ijt}$ represents the normalized result. Finally, the comprehensive evaluation score $PI_{jt}$ is calculated according to Equation (1).
(3) Construction of the VHSD-EM model. The final weight $\delta_{it}$ is derived by taking the arithmetic mean of the weights obtained from Equations (3) and (4):
$$\delta_{it} = \frac{W_i + w_{it}}{2} \tag{6}$$
In Formula (6), $\delta_{it}$ represents the weight of the $i$-th indicator for coastal port smartness in period $t$, as calculated by the VHSD-EM model. The comprehensive evaluation score is then obtained via linear weighting aggregation (layer-by-layer summation). The calculation process for the data element development level follows the same logic.
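The three weighting steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the toy two-indicator panel, and the use of power iteration to obtain the dominant eigenvector are all choices made here for illustration. The sketch assumes non-negative standardized data, so that the dominant eigenvector of H can be taken as positive. It applies Formula (6) literally, averaging the unit-norm VHSD weights with the entropy weights that sum to one; whether either vector is rescaled before averaging is not stated in the text.

```python
import math

def vhsd_weights(panel, iterations=500):
    """panel[t][j][i]: standardized value of indicator i, port j, period t."""
    n_ind = len(panel[0][0])
    # Accumulate H = sum_t X_t^T X_t (an n_ind x n_ind symmetric matrix).
    H = [[0.0] * n_ind for _ in range(n_ind)]
    for X_t in panel:
        for row in X_t:
            for a in range(n_ind):
                for b in range(n_ind):
                    H[a][b] += row[a] * row[b]
    # Power iteration: converges to the eigenvector of the largest eigenvalue.
    W = [1.0 / math.sqrt(n_ind)] * n_ind
    for _ in range(iterations):
        v = [sum(H[a][b] * W[b] for b in range(n_ind)) for a in range(n_ind)]
        norm = math.sqrt(sum(x * x for x in v))
        W = [x / norm for x in v]
    return W

def entropy_weights(X_t):
    """X_t[j][i]: non-negative standardized value of indicator i for port j."""
    n_ports, n_ind = len(X_t), len(X_t[0])
    redundancy = []
    for i in range(n_ind):
        column = [X_t[j][i] for j in range(n_ports)]
        total = sum(column)
        shares = [x / total for x in column]
        # Convention: a zero share contributes 0 to the entropy sum.
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n_ports)
        redundancy.append(1.0 - entropy)
    denom = sum(redundancy)
    return [r / denom for r in redundancy]

def vhsd_em_weights(W, w_t):
    """Formula (6): arithmetic mean of the VHSD and entropy weights."""
    return [(a + b) / 2 for a, b in zip(W, w_t)]

# Toy panel (hypothetical values): 2 periods, 3 ports, 2 indicators.
panel = [
    [[0.2, 0.9], [0.5, 0.4], [1.0, 0.1]],
    [[0.3, 1.0], [0.6, 0.5], [0.9, 0.2]],
]
W = vhsd_weights(panel)
w_0 = entropy_weights(panel[0])
delta_0 = vhsd_em_weights(W, w_0)
# Composite score of each port in period 0 by linear weighting, as in Equation (1).
scores_0 = [sum(d * x for d, x in zip(delta_0, row)) for row in panel[0]]
```

The VHSD weights are computed once from the pooled panel, while the entropy weights are recomputed for each period, which is why the combined weight carries a time index.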
In addition, this study employs the Spearman correlation test to assess the consistency between the measurement results of the VHSD model and the EM model. It should be noted that this test only indicates the consistency of rankings derived from the two weighting methods, and cannot independently verify the external validity of the CPSI or DEDI indicator systems. Therefore, this paper further supplements and validates the measurement results by combining ranking stability comparison and model robustness tests.
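As a rough illustration of this consistency check, the sketch below computes a Spearman rank correlation between two sets of composite scores in plain Python. The score vectors are hypothetical numbers, not results from the paper, and the helper names are invented for this example. In practice a library routine such as `scipy.stats.spearmanr` would normally be used instead.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks

def spearman(a, b):
    """Spearman correlation = Pearson correlation of the rank vectors."""
    ra, rb = average_ranks(a), average_ranks(b)
    ma, mb = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    var_a = sum((x - ma) ** 2 for x in ra)
    var_b = sum((y - mb) ** 2 for y in rb)
    return cov / (var_a * var_b) ** 0.5

# Hypothetical composite scores of four ports under the two weighting methods.
scores_vhsd = [0.48, 0.46, 0.64, 0.66]
scores_em = [0.41, 0.47, 0.73, 0.67]
rho = spearman(scores_vhsd, scores_em)
```

A value of `rho` close to 1 would indicate that the two weighting methods order the ports similarly; as the text notes, it says nothing about whether the underlying indicator system is valid.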

4.2. Random Forest Algorithm

In this study, the random forest model was implemented using the Python programming language, with model construction and computation specifically carried out via the Scikit-learn machine learning library. The random forest was utilized to identify the relative predictive importance of different feature variables in CPSI prediction and to capture potential nonlinear relationships between variables. It should be clarified that the results derived from the random forest reflect the variable importance in the context of model prediction.
(1) Sample splitting and node optimization. For a given feature variable $K_j$ and its threshold split point $m$, the sample set $\{K_1, K_2, K_3, \ldots, K_p\}$ is split into two subsets such that the residual sum of squares (RSS) of the target values $q_i$ is minimized:
$$\min_{j,m} \left[ \sum_{q_i \in R_1(j,m)} \left( q_i - \hat{q}_{R_1} \right)^2 + \sum_{q_i \in R_2(j,m)} \left( q_i - \hat{q}_{R_2} \right)^2 \right] \tag{7}$$
In Formula (7), $\hat{q}_{R_1}$ and $\hat{q}_{R_2}$ denote the mean target values of the two subsets after splitting, calculated as:
$$\hat{q}_{R_1} = \operatorname{ave}\left( q_i \mid q_i \in R_1(j,m) \right), \quad \hat{q}_{R_2} = \operatorname{ave}\left( q_i \mid q_i \in R_2(j,m) \right) \tag{8}$$
The subsets $R_1$ and $R_2$ are defined as:
$$R_1(j,m) = \left\{ (q_i, K_{i1}, K_{i2}, \ldots, K_{ij}) \mid K_j \le m \right\}, \quad R_2(j,m) = \left\{ (q_i, K_{i1}, K_{i2}, \ldots, K_{ij}) \mid K_j > m \right\} \tag{9}$$
For each decision tree in the random forest, the splitting process (7)–(9) is repeated to iteratively select the optimal feature variable and optimal threshold that minimize RSS. The splitting stops when a preset stopping criterion is satisfied. The random forest then performs bootstrap sampling (random sampling with replacement) on the dataset, selects feature variables corresponding to different split nodes for each tree, and finally outputs the predicted target value by averaging the predictions across trees.
(2) Parameter tuning and overfitting mitigation. To mitigate the risk of overfitting under small-sample conditions, this study adopts a combined approach of grid search and cross-validation for parameter tuning. The parameter settings and performance evaluation results of the random forest model are presented in Table 2.
(3) Model training, contribution rate calculation, and performance evaluation. The optimal parameter combination model, obtained through hyperparameter tuning, is trained using sample data, and the contribution rate of each feature variable is calculated. Additionally, in Table 3, this study reports model performance metrics such as CV_R2, RMSE, and MAE, comparing them with benchmark models to enhance transparency in result interpretation.
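The tuning and contribution-rate steps can be sketched with scikit-learn as follows. This is an assumption-laden illustration rather than the configuration used in the study: the data are synthetic, the three features are hypothetical, and the parameter grid, fold count, and scoring choice are placeholders (the actual settings are those reported in Table 2).

```python
import random
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, KFold

random.seed(0)
# Synthetic stand-in data: 64 rows and 3 hypothetical features.
# The third feature is pure noise and does not enter the target.
X = [[random.random() for _ in range(3)] for _ in range(64)]
y = [0.6 * a + 0.3 * b * b + 0.05 * random.random() for a, b, _ in X]

# Hypothetical grid: shallow trees and larger leaves limit overfitting
# when the sample is small.
param_grid = {
    "n_estimators": [25, 50],
    "max_depth": [2, 3],
    "min_samples_leaf": [2, 4],
}
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=KFold(n_splits=5, shuffle=True, random_state=0),
    scoring="r2",
)
search.fit(X, y)

best_params = search.best_params_   # selected hyperparameter combination
cv_r2 = search.best_score_          # mean cross-validated R^2 of that combination
# Impurity-based importances of the refitted best model; they sum to 1 and can
# be read as percentage contribution rates after multiplying by 100.
importances = search.best_estimator_.feature_importances_
```

Impurity-based importances describe how much each feature reduces prediction error inside the fitted forest, so they are a predictive ranking rather than evidence of causal effect.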
In terms of predictive performance, the Random Forest model exhibits a level of accuracy comparable to that of the panel fixed-effects model and shows better predictive accuracy than the traditional linear model (ordinary least squares, OLS), regularized regression methods (Ridge, Lasso), and the Gradient Boosting algorithm. Notably, it also enables the generation of nonlinear feature contribution rankings, a capability that enhances the interpretability of feature importance in nonlinear modeling frameworks.
The basic principle of the random forest and the workflow for calculating feature variable contribution rates are illustrated in Figure 2.
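The node-splitting rule in Equations (7)–(9) can be illustrated with a short exhaustive search. The sketch below is a simplified stand-in for what a tree library does internally: it tries every observed feature value as a threshold (production implementations typically use midpoints between sorted values and random feature subsets), and the toy data are hypothetical.

```python
def best_split(rows, targets):
    """Return (rss, j, m): the feature index j and threshold m with minimal RSS."""
    def rss(values):
        if not values:
            return 0.0
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    best = None
    for j in range(len(rows[0])):
        # Candidate thresholds: observed values except the largest,
        # so that the right-hand subset R2 is never empty.
        for m in sorted({row[j] for row in rows})[:-1]:
            left = [q for row, q in zip(rows, targets) if row[j] <= m]   # R1(j, m)
            right = [q for row, q in zip(rows, targets) if row[j] > m]   # R2(j, m)
            total = rss(left) + rss(right)
            if best is None or total < best[0]:
                best = (total, j, m)
    return best

# Toy sample: two features, four observations (hypothetical values).
rows = [[0.1, 5], [0.2, 1], [0.3, 4], [0.4, 2]]
targets = [1.0, 1.1, 3.0, 3.2]
split_rss, split_feature, split_threshold = best_split(rows, targets)
```

In this toy sample the first feature separates the low targets from the high ones, so the search selects it; a full tree would apply the same search recursively to each resulting subset until a stopping criterion is met.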

4.3. Partial Effect Model

In model analyses involving multiple independent variables, partial effect models can be employed to examine the average marginal response of a specific independent variable on the predicted value of the dependent variable. Partial effect plots, by marginalizing other variables, illustrate the average change in the model-predicted outcome variable as a single variable or a combination of two variables varies. It should be emphasized that partial effect plots are used to identify nonlinear trends, threshold intervals, and interaction relationships, rather than to establish strict causal identification.
Let the set of independent variables be denoted as $X_p\ (p = 1, 2, \ldots, m)$. When examining the average marginal response of a specific independent variable $X_S$ on the predicted value of the dependent variable $Y$, we define $X_C$ as the subset of independent variables excluding $X_S$, satisfying $X_S \cup X_C = \{X_1, X_2, \ldots, X_m\}$. The partial effect of $X_S$ on $Y$ is then expressed as:
$$f_S(X_S) = E_{X_C}\left[ f(X_S, X_C) \right] \tag{10}$$
where $f_S(\cdot)$ denotes the partial effect function, and $E_{X_C}$ denotes the expectation taken over $X_C$, which gives the expected value of the dependent variable $Y$ corresponding to different values of $X_S$.
We further extend this concept to a three-dimensional scenario to analyze the joint partial effect of two target independent variables, $X_A$ and $X_B$, on $Y$. Let $X_C$ denote the subset of independent variables excluding $X_A$ and $X_B$, satisfying $X_A \cup X_B \cup X_C = \{X_1, X_2, \ldots, X_m\}$. The joint partial effect of $X_A$ and $X_B$ on $Y$ is given by:
$$f_{A,B}(X_A, X_B) = E_{X_C}\left[ f(X_A, X_B, X_C) \right] \tag{11}$$
where $f_{A,B}(\cdot)$ denotes the joint partial effect function, and $E_{X_C}$ represents the expectation over $X_C$, giving the expected value of $Y$ corresponding to different values of $X_A$ and $X_B$.
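In practice the expectation over the remaining variables is usually approximated by averaging model predictions over the observed sample. The sketch below shows that approximation for the univariate and the joint case. It is model-agnostic and takes any prediction function; the function names, the toy predictor, and the data are assumptions made for illustration (scikit-learn provides comparable functionality in its `sklearn.inspection` module).

```python
def partial_effect(predict, rows, feature, grid):
    """Average prediction with column `feature` fixed at each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)
            modified[feature] = value   # fix X_S, keep the observed X_C
            total += predict(modified)
        curve.append(total / len(rows))
    return curve

def joint_partial_effect(predict, rows, feat_a, feat_b, grid_a, grid_b):
    """Average prediction with two columns fixed at each pair of grid values."""
    surface = []
    for a in grid_a:
        line = []
        for b in grid_b:
            total = 0.0
            for row in rows:
                modified = list(row)
                modified[feat_a], modified[feat_b] = a, b
                total += predict(modified)
            line.append(total / len(rows))
        surface.append(line)
    return surface

# Toy predictor with an interaction between the first two features (hypothetical).
def toy_predict(row):
    return 2 * row[0] + row[0] * row[1] + row[2]

rows = [[0.1, 0.2, 1.0], [0.3, 0.4, 2.0], [0.5, 0.6, 3.0]]
curve = partial_effect(toy_predict, rows, feature=0, grid=[0.0, 1.0])
surface = joint_partial_effect(toy_predict, rows, 0, 1, [0.0, 1.0], [0.0, 1.0])
```

For this toy predictor the univariate curve reduces to 2.4 times the grid value plus 2.0, because the other two features average to 0.4 and 2.0 in the sample, which makes the averaging step easy to verify by hand.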
To enhance the statistical interpretability of partial effect results, this study employs the bootstrap resampling method to construct 95% uncertainty intervals. Specifically, repeated sampling with replacement is performed on the original sample; in each iteration, the random forest model is retrained, and partial effect values are calculated at uniform grid points. The 95% uncertainty interval is then constructed using the 2.5th and 97.5th percentiles of the resampled partial effect distributions. For univariate partial effect plots, the uncertainty interval is visualized as a shaded area; for bivariate joint partial effect plots, contour lines and sample location markers are incorporated to facilitate result interpretation. The parameter settings of the partial effect model are presented in Table 4.
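The bootstrap procedure described above can be sketched as follows. To keep the example self-contained, a simple one-feature least-squares fit stands in for the retrained random forest; that stand-in, the function names, the number of resamples, and the toy data are all assumptions for illustration, and the actual settings are those reported in Table 4. The sketch also averages predictions over the resampled rows, which is one possible reading of the procedure.

```python
import random

def percentile(sorted_vals, q):
    """Percentile by linear interpolation, with q in [0, 100]."""
    pos = (len(sorted_vals) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)

def bootstrap_partial_effect(fit, rows, targets, feature, grid, n_boot=200, seed=0):
    """Return (lower, upper) 95% percentile bounds of the partial effect curve."""
    rng = random.Random(seed)
    n = len(rows)
    curves = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]   # sample with replacement
        b_rows = [rows[k] for k in idx]
        b_targets = [targets[k] for k in idx]
        predict = fit(b_rows, b_targets)             # retrain on the resample
        curve = []
        for value in grid:
            preds = [predict(r[:feature] + [value] + r[feature + 1:]) for r in b_rows]
            curve.append(sum(preds) / n)
        curves.append(curve)
    lower, upper = [], []
    for g in range(len(grid)):
        vals = sorted(c[g] for c in curves)
        lower.append(percentile(vals, 2.5))
        upper.append(percentile(vals, 97.5))
    return lower, upper

def fit_line(rows, targets):
    """Stand-in model (not a random forest): least-squares line on feature 0."""
    xs = [r[0] for r in rows]
    mx, my = sum(xs) / len(xs), sum(targets) / len(targets)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, targets))
    slope = sxy / sxx if sxx > 0 else 0.0
    return lambda row: my + slope * (row[0] - mx)

# Hypothetical data: 20 observations, 2 features, deterministic pseudo-noise.
rows = [[k / 19, (k * 7 % 5) / 4] for k in range(20)]
targets = [0.2 + 0.5 * r[0] + 0.05 * ((k * 3) % 4 - 1.5) for k, r in enumerate(rows)]
grid = [0.0, 0.5, 1.0]
lower, upper = bootstrap_partial_effect(fit_line, rows, targets, feature=0, grid=grid)
```

Replacing `fit_line` with a function that trains a random forest and returns its prediction function would give the kind of interval the text describes; the band between `lower` and `upper` is what would be drawn as the shaded area in a univariate plot.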

5. Evaluation of the Smartness of Coastal Ports and the Development of Data Elements

5.1. Evaluation Index System for the Smartness of Coastal Ports and Data Element Development

Based on the WSR systems methodology, this study constructs an evaluation index system for coastal port smartness from three dimensions: the Wuli layer, the Shili layer, and the Renli layer. Drawing on insights from existing literature [3], eight secondary indicators and 23 tertiary indicators are selected, as presented in Table 5.
The full industrial chain of data elements refers to the complete industrial ecosystem that encompasses the entire process from the generation of data (as a new production factor) to the ultimate realization of its value. This chain can be summarized into four core links: “base-supply-flow-use”. Based on the full industrial chain of data elements, this paper constructs an evaluation index system for the development level of data elements. Drawing on the approaches of Pan Hongliang [12] and Chen Rongda [13], eight secondary indicators and 22 tertiary indicators are selected, as presented in Table 6.
The DEDI indicators are selected as observable proxies for the regional data-element environment rather than as direct port-operational variables. Indicators such as the number of high-tech zones and digital-economy policies reflect the institutional and industrial supply of data-related resources; optical cable length and telecommunications switch capacity capture the infrastructure foundation for data transmission and connectivity. These provincial-level indicators may influence port smartness indirectly by shaping the availability, circulation, and application capacity of data resources in the surrounding logistics and industrial ecosystem.

5.2. Data Sources

Since port smartness and data elements are emerging concepts proposed in recent years, related research in China remains at an early stage. As a result, the continuity and comparability of relevant statistical data are still subject to certain limitations. Accordingly, this paper selects seven coastal provinces/municipalities and eight coastal ports over the period 2017–2024 as the research sample, based primarily on the following considerations:
Firstly, the selected ports cover major coastal port regions in China, including Northeast China, North China, East China, and South China, thus ensuring regional representativeness. Secondly, these ports play an important role in terms of cargo throughput, container transportation, regional economic linkages, and port infrastructure. Thirdly, since this paper constructs both the port-level CPSI and the provincial-level DEDI, the availability of continuous panel data and the consistency of statistical standards constitute key constraints in sample selection. Fourthly, Guangzhou Port and Zhuhai Port are both included in the Guangdong sample because they differ markedly in port scale, functional positioning, and port smartness performance. This enables a more nuanced examination of heterogeneity in port smartness under the same provincial data-element environment.
For port smartness data, primary sources include China port yearbooks, annual reports of listed port companies, policy documents, and survey data released by government departments and authoritative institutions such as the Ministry of Transport, the China Ports Association, and shipping exchanges. Missing values encountered during data collection were filled using multiple imputation.
For data element data, sources include the China Statistical Yearbook, provincial statistical yearbooks, the China Science and Technology Statistical Yearbook, and the National Intellectual Property Administration. The counts of data-related regulations/standards and data security documents were obtained via web scraping in Python (Python 3.11.0, with PyCharm 2024.1 as the development environment), which was also used for data processing and analysis. Missing values in these datasets were addressed using linear interpolation.
The use of provincial-level DEDI indicators is mainly constrained by data availability and the fact that ports are embedded in wider regional economic and governance systems. Nevertheless, this scale mismatch between provincial data-element indicators and port-level smartness indicators remains a limitation, and the empirical results should be interpreted as regional association evidence rather than direct port-level causal effects.

5.3. Sustainability Implications: Linking CPSI with the SDGs

From the perspective of sustainable development, the CPSI indicator system constructed in this paper exhibits strong alignment with the United Nations Sustainable Development Goals (SDGs), as shown in Table 7. Within the Wuli layer, indicators such as intelligent facilities, automated terminals, big data platforms, and informatization and paperless operations correspond to SDG 9, embodying the upgrading of port infrastructure and technological innovation. Green and low-carbon indicators align with SDG 13 and SDG 12, reflecting the port’s performance in energy conservation, emission reduction, and low-carbon operations. In the Shili layer, indicators related to logistics efficiency, operational effectiveness, and service quality are associated with SDG 9, SDG 11, and SDG 17, demonstrating the port’s supporting role in regional supply chain efficiency, urban logistics systems, and trade connectivity. In the Renli layer, indicators of talent teams and innovation-driven development correlate with SDG 4, SDG 8, and SDG 9, reflecting the human capital and innovation foundation required for the transformation of smart ports. Thus, the CPSI not only functions as an evaluation tool for the level of port intelligence but also serves as a comprehensive indicator for observing the sustainable transformation capacity of port systems.

5.4. Robustness Analysis

This study calculates the annual CPSI for coastal ports and the annual DEDI for provinces using the VHSD-EM model. The results of the Spearman test for the VHSD and EM models are presented in Table 8.
According to the Spearman test results, the evaluation results derived from the vertical-horizontal grading method and the information entropy weighting method are significantly correlated at the 5% level in all years, with some years reaching the 1% significance threshold. Additionally, the correlation coefficients approach 1.000, indicating that the two methods rank the samples almost identically and are therefore highly consistent.
To examine the robustness of the composite weighting scheme, a weight sensitivity analysis was further conducted. Specifically, α was set to 0.25, 0.50, and 0.75, where α denotes the relative contribution of VHSD weights and 1 − α denotes that of EM weights. The composite weight was recalculated as $W_{\alpha} = \alpha \times W_{\mathrm{VHSD}} + (1 - \alpha) \times W_{\mathrm{EM}}$. The Spearman rank correlation coefficient was then used to compare the consistency of indicator-weight rankings under alternative α settings. Spearman correlation coefficients of CPSI ranking results are presented in Table 9.
The results show that the overall Spearman rank correlation coefficient between α = 0.25 and the baseline setting of α = 0.50 is 0.9654, while that between α = 0.75 and α = 0.50 is 0.9258. The year-by-year results also indicate high rank stability, with mean annual Spearman coefficients of 0.9576 and 0.9072, respectively. These findings suggest that the CPSI weighting scheme is not sensitive to moderate changes in the relative contributions of VHSD and EM weights, thereby confirming the robustness of the weighting structure. The weight sensitivity analysis for DEDI follows the same principle.
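The sensitivity check can be illustrated with a short sketch. It is not the study's code: the function names are hypothetical, the rank helper assumes no tied scores, and the comparison is made on the rankings of composite scores (an indicator matrix multiplied by the composite weights), which is one plausible reading of the procedure.

```python
import numpy as np


def composite_weights(w_vhsd, w_em, alpha):
    """W_alpha = alpha * W_VHSD + (1 - alpha) * W_EM."""
    return alpha * np.asarray(w_vhsd, float) + (1.0 - alpha) * np.asarray(w_em, float)


def ranks(values):
    """Ranks 1..n in ascending order (assumes no tied values)."""
    order = np.argsort(values)
    result = np.empty(len(values))
    result[order] = np.arange(1, len(values) + 1)
    return result


def spearman_rho(a, b):
    """Spearman coefficient computed as the Pearson correlation of the ranks."""
    return float(np.corrcoef(ranks(a), ranks(b))[0, 1])


def sensitivity(indicators, w_vhsd, w_em, alphas=(0.25, 0.75), baseline=0.50):
    """Rank agreement of composite scores under each alpha versus the baseline."""
    indicators = np.asarray(indicators, float)
    base_scores = indicators @ composite_weights(w_vhsd, w_em, baseline)
    result = {}
    for alpha in alphas:
        scores = indicators @ composite_weights(w_vhsd, w_em, alpha)
        result[alpha] = spearman_rho(scores, base_scores)
    return result
```

Here `indicators` would be a standardized matrix with one row per port (or port-year) and one column per tertiary indicator. Coefficients close to 1 for both alternative settings indicate that the ranking barely depends on α.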

5.5. Analysis of Evolution in the Time Dimension

This study further estimates the coastal port smartness index (CPSI) and data element development index (DEDI) for the period 2017–2024 using the VHSD-EM model, with results presented in Figure 3.
Firstly, in terms of the overall development trend, both the smartness level of coastal ports and the provincial-level data element development level exhibit an upward trajectory, indicating a certain degree of temporal synchronization between the two. However, such synchronization does not imply a strict causal relationship; instead, it suggests a strong correlation between the improvement of the data element environment and the enhancement of port smartness during the sample period.
Secondly, from a regional perspective, the differentiation between the two indices is pronounced across coastal regions. In East China, the smartness of coastal ports and the development of data elements have mostly accelerated synchronously, with port smartness significantly outperforming data element development—both of which are notably higher than the overall level of coastal areas. By contrast, the growth rates of both data elements and port smartness in Fujian are relatively moderate. In North China and the Northeast (specifically Liaoning and Tianjin), the growth of data elements and port smartness is insignificant, showing almost flat trends; while port smartness outperforms data element development, both remain below the coastal average. In South China, the growth rate of data elements is significantly higher than the national average, yet the progress of port smartness is relatively slow—leading to a clear misalignment in their development pace and a gradually widening gap. In contrast to other regional characteristics, Guangdong province exhibits a relatively high level of data element development; however, the smartness rankings of Guangzhou Port and Zhuhai Port remain relatively low, indicating a structural mismatch between the provincial-level data element advantage and the smartness performance of its ports.

5.6. Analysis of Evolution in the Spatial Dimension

Based on the calculation results of the VHSD-EM model, the average values of the coastal port smartness index (CPSI) and data element development index (DEDI) for coastal regions during the period 2017–2024 were further derived. The results are shown in Table 10.
Firstly, regarding the total coastal port smartness index (CPSI), the smartness of coastal ports exhibits a stepped distribution with significant regional gradient disparities. Shanghai Port (0.5116) and Ningbo-Zhoushan Port (0.5022) rank first and second, with their smartness levels far outperforming other ports, forming the first tier. Qingdao Port (0.4456) follows closely, constituting the second tier. Tianjin Port (0.3480), Dalian Port (0.3413), Guangzhou Port (0.3156), and Xiamen Port (0.3150) are at a medium development level, forming the third tier. Zhuhai Port (0.2123) has the lowest CPSI, with a distinct gap from leading ports, reflecting a significant regional imbalance in the smart development of coastal ports.
As an external qualitative validation, the leading CPSI positions of Shanghai Port and Ningbo-Zhoushan Port are broadly consistent with their widely recognized roles as advanced international hub ports with strong digital infrastructure, automated terminal development, and integrated logistics capabilities. This comparison is not intended as an additional quantitative validation test, but it improves the external interpretability of the CPSI ranking results.
Secondly, regarding the WSR subsystem index of port smartness, the development of subsystems in some ports is unbalanced, with structural bottlenecks. From the physical layer perspective, Ningbo-Zhoushan Port (0.2473) leads with well-developed infrastructure and intelligent facilities, while Zhuhai Port (0.1086) performs the worst, indicating a substantial gap in physical facility support capabilities between leading and trailing ports. From the Shili layer perspective, Qingdao Port (0.2200) and Ningbo-Zhoushan Port (0.2175) demonstrate coordinated leadership in logistics efficiency, operational efficiency, and service quality, whereas Zhuhai Port (0.0754) shows obvious deficiencies in operational processes and service efficiency, with prominent regional differentiation characteristics. From the Renli layer perspective, Tianjin Port (0.0671) leads Shanghai Port (0.0668) by a narrow margin; Zhuhai Port (0.2123) and Xiamen Port (0.0085) lag significantly in talent reserves and innovation-driven capabilities, which have become bottlenecks restricting their smartness improvement. This subsystem imbalance phenomenon indicates that some ports have development biases of “prioritizing hardware construction over software empowerment” or “emphasizing operational efficiency while neglecting talent cultivation”.
Thirdly, regarding the matching degree between the data element development index (DEDI) and the coastal port smartness index (CPSI), there exist significant disparities in their spatial synergy, giving rise to three distinct development patterns. The “data-leading but smartness-lagging” pattern: Guangdong province serves as a typical case. Its DEDI (0.5962) ranks first nationwide, yet the CPSI rankings of Guangzhou Port and Zhuhai Port are only 6th and 8th, respectively. This indicates that the enabling value of data elements has not been fully unleashed, with a prominent issue of inefficient conversion of data dividends. The “smartness-leading but data-lagging” pattern: For example, Tianjin’s DEDI (0.1409) is at the bottom of the sample, but its CPSI ranking outperforms its DEDI ranking, presenting a reverse imbalance. The supporting role of data elements in port smartness remains to be enhanced. The “coordinated development and virtuous cycle” pattern: Shanghai, Zhejiang province, and Shandong province exhibit high levels of both DEDI and CPSI with strong synergy, forming a positive cycle of “data element support → smartness enhancement → data value deepening”. In contrast, Fujian province and Liaoning province show a balanced feature of low rankings in both indices, leaving substantial room for improvement in their overall development level.
Fourth, from the perspective of spatial agglomeration characteristics, a spatial pattern of “leader-led agglomeration and regional block differentiation” has taken shape. The port smartness and data element development levels in East China (with Shanghai, Zhejiang, and Shandong as prominent leaders) rank first nationwide. Leveraging a solid digital economy foundation, clear port development positioning, and robust policy support, they have formed a collaborative agglomeration effect, emerging as the core leading region for the intelligent transformation of China’s coastal ports. In South China (Guangdong), data element development advantages are prominent, yet port smartness construction has failed to keep pace, resulting in a lack of effective synergy. Ports in North China (Tianjin) and Northeast China (Liaoning) have a relatively solid foundation in port smartness, but lag in data element development—a key bottleneck restricting regional collaborative upgrading. This spatial distribution pattern is closely linked to regional data element resource endowments, port development positioning, policy support intensity, and industrial synergy levels, further exacerbating regional differentiation in the intelligent development of coastal ports.

6. Association Analysis Between Data Elements and Smartness of Coastal Ports

6.1. Identification of Direct Association Between Data Elements and Smartness of Coastal Ports

This study employs the coastal port smartness index (CPSI) as the dependent variable and utilizes the random forest algorithm to conduct feature importance analysis. Given that the predictive association of data elements on the enhancement of coastal port smartness is context-dependent—i.e., the association between data elements and CPSI exhibits nonlinear characteristics as the socio-economic conditions of coastal regions evolve—this paper, in addition to the core explanatory variable (development index of data elements, DEDI), selects 10 control variables as feature variables based on existing literature [24,25,26,27,28,29,30,31,32]. These variables span dimensions including economic development, policy support, talent reserve, capital investment, openness, financing capacity, innovation level, and financial development. They not only serve as important contextual predictors of coastal port smartness but are also influenced by the development of data elements to varying degrees. It should be noted that the feature importance derived from random forest analysis reflects the relative contribution of different variables to the model’s prediction of CPSI.
The specific operationalization of variables is as follows: the economic development level of coastal regions is measured by per capita regional GDP (RGDP_PC); government support for the transportation sector in coastal regions is reflected by the budgeted general public budget expenditure on transportation (BET_GPB); talent density in coastal regions is captured by the proportion of port employees with a bachelor’s degree or above (EDU) and the proportion of port technical personnel (TECH); capital investment level and openness of coastal regions are represented by per capita fixed asset investment (CFAI_PC) and foreign trade dependence (OPE), respectively; the financing level of coastal regions is measured by the social financing scale (SSF); the independent innovation capacity of coastal regions is assessed by the R&D investment intensity of port enterprises (RDI_PE) and the number of authorized invention patent applications (IPA); and the development status of digital finance in coastal regions is gauged by the digital inclusive finance index (DFI).
The model training adopted an iterative optimization strategy, integrating cross-validation and model performance metric evaluation to assess the stability of results. After training the initial model on the full sample, this study progressively eliminated low-contribution variables (e.g., CFAI_PC, RDI_PE, SSF) based on feature contribution rates, theoretical relevance, and model performance. The remaining features were then retrained, and their contribution rates recalculated, ultimately retaining eight core feature variables. This process was designed to enhance the interpretability and parsimony of the model; however, given the limited sample size, the findings should be interpreted as exploratory predictive evidence.
This variable combination not only enhances the contribution rate of data elements but also maintains a high model goodness of fit ($R^2$), thereby providing an empirical basis for a more cautious identification of association patterns between data elements and the improvement of coastal port smartness. The measurement results of feature variable contribution rates are presented in Table 11.
Firstly, the data element development index (DEDI) has a contribution rate of 13.586% in model prediction, ranking third among the eight feature variables, which indicates its importance in explaining the CPSI variations among the sampled ports. This result points to a substantial predictive association between the provincial-level data element development level and coastal port smartness.
Secondly, the remaining seven feature variables also exhibited varying degrees of importance in the model. Among them, the digital inclusive finance index (DFI) has the highest contribution rate (29.324%), highlighting the fundamental supporting role of the popularization and deepening of digital financial services in the smart transformation of ports; foreign trade dependence (OPE) ranks second (17.033%), indicating that the level of openness and trade activity are important external factors associated with smart port development; per capita regional GDP (RGDP_PC) contributes 10.270%, reflecting the supporting effect of regional economic strength on port smartness; the number of authorized invention patent applications (IPA) and the proportion of port technical personnel (TECH) have contribution rates of 9.351% and 8.166%, respectively, embodying the enabling role of innovation capacity and professional technical talent in port smartness; the budgeted general public budget expenditure on transportation (BET_GPB) contributes 6.881%, underscoring the important role of government policy support; and the proportion of port employees with a bachelor’s degree or above (EDU) contributes 5.389%, reflecting the supporting role of high-end talent reserves.
This indicates that port smartness is not the outcome of the standalone role of data elements, but rather the combined effect of regional economic foundation, openness level, financial services, talent structure, innovation capacity, and policy support.
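The iterative screening described in this subsection can be sketched as follows. This is a simplified illustration under stated assumptions: it uses scikit-learn's `RandomForestRegressor` and `cross_val_score`, the function name and defaults are hypothetical, and variables are dropped by importance alone, whereas the study also weighed theoretical relevance and model performance. Plain five-fold cross-validation is used for brevity and ignores the panel structure of the data.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score


def screen_features(X, y, names, protected, n_keep, n_trees=100, seed=0):
    """Drop the least important unprotected feature until n_keep remain.

    Returns one (names, cv_r2, importances) record per round so that model
    performance can be inspected alongside the contribution rates.
    """
    X = np.asarray(X, dtype=float)
    names = list(names)
    rounds = []
    while True:
        model = RandomForestRegressor(n_estimators=n_trees, random_state=seed)
        # Cross-validated R^2 for the current variable set.
        cv_r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
        model.fit(X, y)
        # Impurity-based importances, normalized to sum to 1.
        importances = dict(zip(names, model.feature_importances_))
        rounds.append((list(names), float(cv_r2), importances))
        candidates = [n for n in names if n not in protected]
        if len(names) <= n_keep or not candidates:
            return rounds
        # Remove the weakest candidate and refit in the next round.
        weakest = min(candidates, key=importances.get)
        column = names.index(weakest)
        X = np.delete(X, column, axis=1)
        names.pop(column)
```

Passing `protected={"DEDI"}` keeps the core explanatory variable in every round, and the per-round records make it possible to check that the cross-validated $R^2$ does not deteriorate as variables are removed.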

6.2. Identification of Interactive Associations Between Data Elements, Other Elements, and Smartness in Coastal Ports

In the analysis of regression problems involving moderating effects, the introduction of interaction terms is a standard approach. For a given feature variable, its contribution rate calculated by the random forest algorithm without interaction terms reflects the combined effect of the variable itself and its interactions with other feature variables. Thus, if the magnitude of a feature variable’s impact on the target value varies with the values of other variables (and a high correlation exists between them), the variable’s standalone contribution rate will inevitably decrease after adding interaction terms, while the interaction terms will account for a portion of the contribution. Concurrently, the model’s goodness of fit ($R^2$) will either remain stable or improve compared to the model without interaction terms.
To further investigate the interactive relationships between data elements and supporting conditions, this study incorporates first-order interaction terms of DEDI with the seven feature variables into the random forest model. It is worth noting that the contribution rate of interaction terms is used to characterize the changes in the importance of different variable combinations in model prediction, which can serve as a reference for identifying synergistic relationships, but does not equate to the causal moderating effect in the context of traditional regression.
After incorporating the interaction terms between the DEDI and each of the seven feature variables, and based on the optimal model derived from hyperparameter tuning of the random forest algorithm, the contribution rates of all feature variables—including the seven interaction terms—were finally obtained, as presented in Table 12.
A comparison between Table 11 and Table 12 reveals a significant shift in the contribution rate structure of variables following the introduction of interaction terms:
Firstly, the standalone contribution rates of the data element development index (DEDI) and the seven feature variables all exhibit a downward trend, while interaction terms account for a substantial proportion of total contributions. Specifically, the standalone contribution rate of DEDI decreased from 13.586% to 3.216%, a relative reduction of 76.329%; the combined standalone contribution rate of the seven feature variables fell from 86.414% to 67.433%; and the combined contribution rate of the DEDI main effect and DEDI interaction terms reached 32.567%. This finding indicates that the predictive contribution of data elements is strongly conditioned by supporting regional factors; that is, the predictive association of data elements varies with the economic and social conditions of coastal regions. This supports the existence of nonlinear, context-dependent association patterns in port smartness enhancement.
Secondly, from the distribution of interaction term contribution rates, the factors with relatively high contributions are concentrated in the digital inclusive finance index (DFI), per capita regional GDP (RGDP_PC), budgeted general public budget expenditure on transportation (BET_GPB), and the number of authorized invention patent applications (IPA), reaching 6.031%, 5.870%, 5.7166%, and 4.369%, respectively. Their independent contribution rates decreased from 29.324%, 10.270%, 6.881%, and 9.351% before including interaction terms to 25.564%, 9.747%, 3.287%, and 4.629% after including them. This indicates that whether viewed in terms of overall contribution or considering interactive effects, enhancing digital finance development, promoting regional economic growth, increasing government support, and strengthening independent innovation capacity are associated with stronger coordinated improvement between data elements and port smartness.
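The construction of interaction terms and the arithmetic behind the reported shares can be illustrated as follows. The helper `add_dedi_interactions` is hypothetical and assumes that a first-order interaction term is the element-wise product of DEDI with another feature, which the text does not state explicitly. The three contribution rates are those reported above; the quantities derived from them are simple consistency checks.

```python
import numpy as np


def add_dedi_interactions(X, names, core="DEDI"):
    """Append the product of the core column with every other column."""
    X = np.asarray(X, dtype=float)
    c = names.index(core)
    columns, labels = [X], list(names)
    for j, name in enumerate(names):
        if j != c:
            columns.append((X[:, c] * X[:, j])[:, None])
            labels.append(f"{core}*{name}")
    return np.hstack(columns), labels


# Contribution rates (%) reported in the text for the two models.
dedi_before = 13.586    # DEDI, model without interaction terms
dedi_after = 3.216      # DEDI main effect, model with interaction terms
dedi_combined = 32.567  # DEDI main effect plus its seven interaction terms

relative_drop = (dedi_before - dedi_after) / dedi_before * 100
interaction_share = dedi_combined - dedi_after
other_main_effects = 100 - dedi_combined

print(round(relative_drop, 3))       # 76.329
print(round(interaction_share, 3))   # 29.351
print(round(other_main_effects, 3))  # 67.433
```

With eight base features, the augmented matrix has fifteen columns (eight main effects and seven DEDI interaction terms). The checks show that the seven interaction terms jointly carry 29.351 percentage points, roughly nine times the DEDI main effect.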

6.3. Pathways for Enhancing Coastal Port Smartness Associated with Data Elements

6.3.1. Univariate Partial Effect

The preceding analysis has established the high significance of data elements in predicting port smartness, while their model-predicted response patterns require further examination. To identify the average marginal response of CPSI predicted values across varying DEDI levels and explore how provinces can select appropriate pathways for port smartness enhancement based on their respective data element development statuses, this study leverages a partial effect model to analyze the nonlinear association between data elements and port smartness, and further discusses potential differentiated pathways by incorporating the geographical locations of sample regions.
Based on the random forest algorithm, the univariate dynamic partial effect formula in this paper is expressed as:
$$\hat{f}_S(X_S) = \frac{1}{n}\sum_{k=1}^{n} f\left(X_S, x_k^C\right), \quad k = 1, 2, \ldots, n \tag{12}$$
Similarly, the multivariate dynamic partial effect formula is expressed as:
$$\hat{f}_{A,B}(X_A, X_B) = \frac{1}{n}\sum_{k=1}^{n} f\left(X_A, X_B, x_k^C\right), \quad k = 1, 2, \ldots, n \tag{13}$$
In Equations (12) and (13), $\hat{f}_S(\cdot)$ and $\hat{f}_{A,B}(\cdot)$ denote the functional relationships between the feature variables and coastal port smartness estimated by the random forest model, $x_k^C$ denotes the observed values of the remaining variables $X_C$ for the $k$-th sample, and $n$ represents the number of training set samples in the random forest model.
In essence, the above two equations represent the discretization of integrals for continuous functions: by summing and averaging, all other variables $X_C$ are eliminated via integration, yielding the partial effect of $X_S$ on $Y$ and the joint partial effect of $X_A$ and $X_B$ on $Y$. Let $X_S$ denote the data element development index; the partial effect plot of $X_S$ on coastal port smartness is presented in Figure 4.
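The sample averages in the two formulas can be written directly in a few lines. The sketch below is illustrative only: the function names are hypothetical, and `predict` stands for any fitted model's prediction function (for example, the `predict` method of a trained random forest) that maps a sample matrix to predicted values.

```python
import numpy as np


def partial_effect_1d(predict, X, s, grid):
    """Univariate partial effect: fix column s, average predictions over all rows."""
    effect = np.empty(len(grid))
    for i, value in enumerate(grid):
        X_fixed = X.copy()
        X_fixed[:, s] = value                # set X_S to the grid value for every sample
        effect[i] = predict(X_fixed).mean()  # (1/n) * sum over k of f(X_S, x_k^C)
    return effect


def partial_effect_2d(predict, X, a, b, grid_a, grid_b):
    """Joint partial effect: fix columns a and b, average predictions over all rows."""
    effect = np.empty((len(grid_a), len(grid_b)))
    for i, value_a in enumerate(grid_a):
        for j, value_b in enumerate(grid_b):
            X_fixed = X.copy()
            X_fixed[:, a] = value_a
            X_fixed[:, b] = value_b
            effect[i, j] = predict(X_fixed).mean()
    return effect
```

Each grid value requires one pass over all $n$ samples, so a 20 × 20 joint grid costs 400 prediction passes. This is negligible for a small panel but grows quickly with sample size and grid resolution.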
As illustrated in Figure 4, the results of the univariate partial effect analysis reveal a pronounced nonlinear association between DEDI and CPSI.
At low DEDI levels, the predicted CPSI shows an overall upward trend as data element development improves, indicating that the initial construction of data elements is associated with higher predicted comprehensive service capacity of coastal ports. Around DEDI ≈ 0.215, the predicted curve exhibits a potential inflection point at which the slope increases, indicating that the positive association between data element development and port smartness strengthens once data element development reaches a certain level.
When DEDI rises to a medium-to-high level, the curve gradually flattens, suggesting that the marginal contribution of data element development to CPSI may experience a diminishing trend, and simply improving the regional data-element environment may not be sufficient to generate equivalent increases in predicted CPSI without complementary governance, operational, and innovation conditions.
In addition, some provinces, such as Guangdong, are characterized by high DEDI but low CPSI, which indicates that CPSI is not fully explained by DEDI, but is also jointly influenced by multiple factors such as port infrastructure, industrial collaboration, openness, and technical capabilities.

6.3.2. Joint Partial Effects

To examine the joint association of data elements and the aforementioned supporting factors with port smartness, this study further employs a multivariate dynamic partial effect model to estimate the joint dynamic partial effects of data elements and feature variables on coastal port smartness, thereby identifying differentiated optimization paths for port smartness enhancement across regions. Because analyzing all feature variables individually would unduly lengthen the paper, only the four variables whose interaction terms with data elements have the highest contribution rates are selected as typical cases. The model is constructed based on Equation (13), where $X_A$ denotes the data element development index (DEDI) and $X_B$ represents a selected feature variable. Two-dimensional contour heat maps were plotted to visualize the joint dynamic partial effects, as detailed in Figure 5.
Figure 5 presents the joint partial effects of DEDI and four supporting factors on the predicted CPSI using two-dimensional contour heat maps. In each subplot, the horizontal axis denotes the DEDI, while the vertical axes respectively represent the digital inclusive finance index (DFI), per capita regional GDP (RGDP_PC), the budgeted general public budget expenditure on transportation (BET_GPB), and the number of authorized invention patent applications (IPA). The color gradient indicates the predicted CPSI generated by the random forest model, with the transition from red to green representing an increase in the predicted CPSI level. Contour lines represent different levels of the joint partial effect. Hollow scatter points denote raw observations, solid labeled dots indicate provincial mean positions, and red pentagrams identify the largest-gradient points on the predicted response surfaces, namely the intervals where predicted CPSI is most sensitive to changes in the combination of DEDI and the corresponding supporting factor. These points should not be interpreted as strict causal thresholds or policy optima; rather, they serve as diagnostic references for identifying potential nonlinear response intervals and regional differences in factor coordination. The relative positions of provincial mean points and largest-gradient points are further used to classify regional development stages, as shown in Table 13.
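One way to locate a largest-gradient point on a joint partial effect surface, and to position a provincial mean relative to it, is sketched below. The details are assumptions, since the text does not specify them: both axes are rescaled to [0, 1] before differentiation so that variables measured in very different units contribute comparably, the gradient is approximated by finite differences, and the function names and category labels are hypothetical.

```python
import numpy as np


def _unit(grid):
    """Rescale a grid to [0, 1] so both axes contribute comparably."""
    grid = np.asarray(grid, dtype=float)
    return (grid - grid.min()) / (grid.max() - grid.min())


def largest_gradient_point(surface, grid_a, grid_b):
    """Grid location where the response surface changes fastest."""
    # Finite-difference partial derivatives along each (rescaled) axis.
    d_a, d_b = np.gradient(surface, _unit(grid_a), _unit(grid_b))
    magnitude = np.hypot(d_a, d_b)
    i, j = np.unravel_index(np.argmax(magnitude), magnitude.shape)
    return float(grid_a[i]), float(grid_b[j])


def classify(mean_a, mean_b, point):
    """Position of a provincial mean relative to the largest-gradient point."""
    beyond_a, beyond_b = mean_a >= point[0], mean_b >= point[1]
    if beyond_a and beyond_b:
        return "beyond both"
    if beyond_a:
        return "beyond DEDI only"
    if beyond_b:
        return "beyond supporting factor only"
    return "beyond neither"
```

Here `surface[i, j]` is the joint partial effect at `grid_a[i]` (DEDI) and `grid_b[j]` (the supporting factor). Because the result depends on grid resolution and on the rescaling choice, such a point is best read as a diagnostic reference, in line with the caution stated above.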
Based on Figure 5 and Table 13, this study presents the following findings:
Firstly, the synergistic state between data elements and supporting factors exhibits significant heterogeneity across the four dimensions.
In the DFI dimension, there remains substantial room for improvement across all regions in the synergy between digital inclusive finance and data elements. Shanghai, Zhejiang, Guangdong, and Shandong have crossed the DEDI inflection point but have not yet entered the maximum gradient interval corresponding to DFI. This indicates that these regions possess a solid data element foundation, while the synergistic empowerment of digital finance for port intelligence has not been fully unleashed. In contrast, Fujian, Tianjin, and Liaoning have crossed neither the DEDI nor the DFI inflection point, suggesting that they need to both strengthen their data element infrastructure and enhance the capacity of digital financial services to support intelligent port transformation.
In the RGDP_PC dimension, the synergy between regional economic foundations and data elements presents a stratified pattern. Shanghai has simultaneously crossed the DEDI and RGDP_PC inflection points, demonstrating a well-established synergistic support between data element development and economic foundations. Zhejiang, Guangdong, and Shandong have crossed the DEDI inflection point but have not reached the high-efficiency synergy interval corresponding to RGDP_PC, reflecting the characteristic of “relatively advanced data elements yet insufficient economic support”. Fujian, Tianjin, and Liaoning remain in a state of dual lag, requiring further enhancement of regional economic strength and industrial hinterland support to improve the conversion capacity of data elements into port intelligence.
In the BET_GPB dimension, the supporting role of fiscal transportation expenditure in enabling data elements varies across regions. Guangdong has simultaneously crossed the DEDI and BET_GPB inflection points, indicating strong synergy between its data element foundation and fiscal support for transportation. Zhejiang, Shanghai, and Shandong have crossed the DEDI inflection point, but their fiscal transportation expenditure has not yet entered the high-efficiency synergy interval, suggesting that these regions need to further optimize fiscal resource allocation by directing transportation fiscal investment more precisely toward the construction of smart port infrastructure, digital platforms, and intelligent operation systems. Fujian, Liaoning, and Tianjin have not crossed either inflection point, necessitating simultaneous strengthening of data infrastructure construction and fiscal support in the transportation sector.
In the IPA dimension, the synergy between innovation capacity and data elements is relatively favorable, yet regional differentiation persists. Shanghai, Guangdong, Shandong, and Zhejiang have simultaneously crossed the DEDI and IPA inflection points, indicating a strong synergistic relationship between their data element foundation and innovation output. Although Tianjin has exceeded the IPA inflection point, it has not crossed the DEDI inflection point, exhibiting the characteristic of “relatively advanced innovation capacity yet insufficient data element foundation”. Moving forward, it should strengthen the construction of data infrastructure, data circulation systems, and port digital application scenarios to better translate innovation capacity into port intelligence performance. Fujian and Liaoning lag behind in both DEDI and IPA, requiring simultaneous reinforcement of data element accumulation and innovation capacity building.
Secondly, the coordination status between data elements and supporting factors differs markedly across coastal provinces and municipalities, resulting in differentiated development patterns and catch-up pathways. Shanghai performs relatively well in terms of RGDP_PC and IPA, but still has room for improvement in DFI and BET_GPB, indicating that its future focus should shift from relying solely on economic and innovation advantages to strengthening digital finance support and precise allocation of fiscal resources for transportation. Guangdong excels in the BET_GPB and IPA dimensions and maintains a high level of DEDI, yet it has not fully entered the efficient coordination range in DFI and RGDP_PC, suggesting the need to further improve the efficiency of transforming data elements into smart port applications and enhance the support of digital finance and regional economic quality for port intelligence. Zhejiang and Shandong are broadly balanced, but in DFI, RGDP_PC, and BET_GPB they have crossed the DEDI inflection point while the supporting factors have not yet reached the high-efficiency synergy interval, indicating that they should prioritize enhancing synergies among digital finance, economic support, and fiscal investment. Tianjin holds certain advantages in IPA, but lags behind in DEDI, DFI, RGDP_PC, and BET_GPB, meaning that its advancement in port intelligence cannot rely solely on innovation foundations; it must simultaneously strengthen data infrastructure, economic underpinning, and fiscal input. Fujian and Liaoning lag behind overall, with most indicators failing to cross the inflection points, requiring priority improvements in data infrastructure, digital financial services, regional economic foundations, and innovation capacity.
Overall, the enhancement of coastal port intelligence is not determined solely by data elements, but rather by the degree of coordinated alignment among data elements, digital finance, economic foundation, fiscal support, and innovation capability. Policies must be tailored to regional endowment differences, implementing targeted measures to unlock the synergistic value of data elements through upgrades in supporting factors.

6.4. Further Analysis and Discussion of the Guangdong Case

Taken together, the random forest and partial dependence results indicate a strong statistical association and a nonlinear response between DEDI and CPSI. However, the case of Guangdong suggests that high levels of data element development do not necessarily correspond to equally high levels of port digitalization performance. To avoid over-interpreting this phenomenon causally, this paper further explores the issue from the perspective of institutional and governance contexts.
On the one hand, Guangdong lies within the Greater Bay Area port cluster, where ports such as Guangzhou, Shenzhen, and Zhuhai differ in functional positioning, hinterland structure, port management entities, and industrial foundations. The “Outline Development Plan for the Guangdong-Hong Kong-Macao Greater Bay Area” positions Hong Kong as an international shipping center and places key ports like Guangzhou and Shenzhen within the regional integrated transportation system, proposing to strengthen inland waterways and port-access rail and highway networks centered on major coastal ports. This indicates that the digital transformation of ports in Guangdong involves complex coordination across cities, ports, and transport modes.
On the other hand, while the Guangdong port system already possesses substantial scale and regional data infrastructure, the performance of port digitalization still depends on whether data standards, platform interconnectivity, operational collaboration, and governance mechanisms are effectively integrated. In other words, data resource advantages can only be translated into improvements in CPSI when embedded in specific scenarios such as port operations, sea-rail intermodal transport, customs clearance coordination, port-logistics-trade services, and green supervision.
Based on the above exploratory analysis, Guangdong should focus on improving the efficiency of transforming its provincial data element advantages into smart port scenarios. Specifically, building upon the existing coordinated development of port clusters, efforts should be further advanced to unify data standards, interconnect platform interfaces, and coordinate business processes among major ports such as Guangzhou Port, Shenzhen Port, and Zhuhai Port, thereby promoting the in-depth application of data elements in areas including intelligent dispatching, sea-rail intermodal transport, customs clearance coordination, green supervision, and port-trade services.
For Guangzhou Port, priority should be given to strengthening its role as an international hub port, integrating port, shipping, and trade services, and enhancing cross-port intelligent dispatching capabilities. For Zhuhai Port, efforts should focus on addressing shortcomings in smart logistics platforms, digitalization of port operations, and multimodal transport coordination.
It should be noted that the explanation of the Guangdong case in this paper is intended as an exploratory discussion of mechanisms based on statistical results and policy texts, rather than a rigorous causal analysis. Future research should incorporate port-level operational data, platform development data, port governance structures, enterprise interviews, or case studies to conduct more detailed empirical analyses of data element conversion efficiency in Guangdong ports.
The Guangdong case further suggests that data-element accumulation is a necessary but not sufficient condition for port smartness enhancement. Governance capacity, cross-port coordination, data-standard unification, and scenario-based operational integration determine whether regional data resources can be converted into port-level intelligent services. This finding also has international relevance: in European and other global port systems, the same diagnostic framework may be transferable, but implementation effects are likely to vary with port governance structures, data-sharing rules, public–private coordination mechanisms, and the degree of integration among port, logistics, customs, and hinterland stakeholders.

7. Conclusions, Suggestions, Limitations, and Future Research

7.1. Conclusions

This study investigates the association between data element development and coastal port smartness, and identifies pathways for enhancement, thereby providing empirical evidence and policy implications for the smart development of coastal ports. The key conclusions are as follows:
(1) CPSI and DEDI both exhibit an overall upward trend during the sample period, yet their development is not spatially synchronized. East China, represented by Shanghai, Zhejiang, and Shandong, displays a relatively coordinated pattern in which data element development and port smartness mutually reinforce each other. Tianjin and Liaoning demonstrate relatively stronger port-smartness foundations than their provincial data-element environments, whereas Guangdong presents a typical mismatch characterized by “data-element advancement but port-smartness lag.” Fujian and Liaoning still have considerable room for improvement in both data element development and port smartness.
(2) Coastal port smartness displays a clear stepwise spatial structure and internal subsystem imbalance. Shanghai Port and Ningbo-Zhoushan Port form the leading tier, Qingdao Port occupies the second tier, and Tianjin, Dalian, Guangzhou, Xiamen, and Zhuhai show varying degrees of catch-up pressure. From the WSR perspective, the Wuli layer reveals gaps in infrastructure and intelligent facilities, the Shili layer reflects differences in logistics and operational efficiency, and the Renli layer highlights talent reserves and innovation capacity as key constraints for several ports.
(3) The random forest results indicate that data elements are an important predictor of coastal port smartness, but their role should be interpreted as predictive contribution rather than strict causal effect. In the model without interaction terms, DEDI contributes 13.586% to CPSI prediction, ranking behind digital inclusive finance and openness and ahead of regional economic strength and innovation output. After introducing interaction terms, the standalone contribution of DEDI decreases to 3.216%, whereas the combined contribution of the DEDI main effect and its interaction terms accounts for 32.567%, indicating that the association between data elements and port smartness is strongly dependent on supporting regional conditions.
(4) The partial effect results further reveal a nonlinear response of CPSI to DEDI. At low DEDI levels, data element accumulation provides a foundation for port smartness improvement; around the threshold of DEDI ≈ 0.215, the positive association becomes more pronounced; at medium-to-high DEDI levels, the curve gradually flattens, suggesting diminishing marginal predictive gains. This pattern helps explain why regions with high DEDI, such as Guangdong, do not automatically achieve high CPSI: the transformation of data-element advantages into port-smartness performance also depends on infrastructure, industrial coordination, openness, digital finance, fiscal support, and innovation capacity.
(5) The joint partial effect analysis shows marked dimensional and regional heterogeneity in the synergy between data elements and supporting factors. Digital inclusive finance, regional economic development, fiscal transportation expenditure, and innovation output all interact with DEDI, but the efficiency of these combinations differs across regions.
Overall, coastal port smartness is shaped not by data elements alone, but by the coordinated allocation of data elements and multiple supporting factors.
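The combined contribution reported in conclusion (3) adds the DEDI main effect to every interaction term that involves DEDI; the reported figures imply that the interaction terms alone account for 32.567 − 3.216 = 29.351 percentage points. A minimal sketch of this aggregation follows. The feature names, the "×" naming convention for interaction terms, and the raw importance values are placeholders chosen for illustration, not the paper's data.

```python
def contribution_shares(importances):
    """Rescale raw importances so that they sum to 100 (percentage shares)."""
    total = sum(importances.values())
    return {name: 100.0 * value / total for name, value in importances.items()}

def combined_share(shares, variable, sep="×"):
    """Sum the share of `variable` and of every interaction term that includes it."""
    return sum(share for name, share in shares.items()
               if variable in name.split(sep))

# Placeholder importances (NOT the paper's values); "OPEN" is a hypothetical
# label for the openness variable.
raw = {"DEDI": 1.0, "DFI": 4.0, "DEDI×DFI": 3.0, "OPEN": 2.0}
shares = contribution_shares(raw)
dedi_combined = combined_share(shares, "DEDI")
```

Because the shares are normalized within one model, a main effect can shrink once interaction terms are added even though the variable's combined share grows, which matches the pattern described above.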

7.2. Suggestions

The empirical findings of this paper are primarily derived from a sample of China’s coastal ports. The following recommendations therefore do not constitute a universally applicable policy framework for smart ports. They are region-specific, measured proposals based on how well data element development and port smartness are matched in each region and on each region’s shortcomings in supporting factors.
Firstly, establish a full-chain “base-supply-flow-use” data element system to help data element development cross the scale threshold. For regions where data element development has not yet reached the threshold, accelerate the improvement of data infrastructure by expanding coverage of Internet broadband, big data platforms, and 5G networks; establish and improve data circulation rules and transaction mechanisms to facilitate cross-port, cross-department, and cross-regional data sharing and interconnection; strengthen data security and intellectual property rights (IPR) protection; and improve the efficiency of data element circulation and the quality of its application, thereby providing full-chain data support for port smartness.
Secondly, promote synergistic empowerment of data elements and supporting factors. In response to the dimensional heterogeneity in the synergistic state between data elements and supporting factors, targeted policy interventions are proposed: strengthen the recruitment and cultivation of port talents with bachelor’s degrees or above; enhance R&D investment intensity and patent commercialization efficiency; optimize fiscal expenditure structures to tilt toward smart port development; and deepen the application of digital inclusive finance in the port sector. By upgrading supporting factors, the synergistic value of data elements can be fully unlocked.
Thirdly, implement a differentiated regional catch-up strategy to advance port smartness according to region type. Given the varying endowments and divergent development paths among coastal provinces and municipalities, tailored development strategies should be formulated. For regions characterized by “data leadership but lagging smartness”, such as Guangdong, efforts should focus on strengthening integration between provincial data infrastructure and port application scenarios and on promoting data sharing among ports, customs, shipping companies, logistics platforms, and financial institutions, thereby enhancing the port’s capacity to absorb and utilize data resources. For areas like Tianjin, which exhibit “smartness leadership but data lag”, it is essential to fully leverage their innovation advantages while simultaneously strengthening data infrastructure, data circulation rules, port data platform development, and policy-based financial support. For relatively underdeveloped regions such as Fujian and Liaoning, policies should prioritize synchronized advancement in data infrastructure, digital financial services, industrial hinterland support, fiscal investment, and innovation capabilities. For regions demonstrating “coordinated development and positive feedback loops”, including Shanghai, Zhejiang, and Shandong, the focus should shift from scale expansion to quality improvement, with enhanced synergy between data elements and digital inclusive finance, regional economic support, fiscal transportation spending, and innovation output.
Fourthly, strengthen the dual drivers of digital inclusive finance and opening-up to unlock the synergistic value of supporting factors. In response to the pervasive shortcoming of lagging digital inclusive finance across all seven provinces and municipalities, the following targeted measures are proposed: the People’s Bank of China (PBC) and the National Financial Regulatory Administration (NFRA) should issue special policies to encourage financial institutions to develop financial products tailored to port smart transformation, thereby reducing the financing costs of digital upgrading; regions with opening-up advantages (e.g., Guangdong, Shanghai) should enhance the synergistic effect between trade dynamism and data element empowerment; and regions with lagging opening-up (e.g., Fujian, Liaoning) should expand outward-oriented businesses such as cross-border e-commerce and international logistics to form a virtuous cycle of “trade-driven growth → smart upgrading → data empowerment → financial innovation”, advancing the construction of world-class ports.

7.3. Limitations and Future Research

Currently, research on how data elements can empower port intelligence is still in its exploratory stage, and the theoretical framework requires further refinement. This study aims to investigate the relationship between data elements and the enhancement of coastal port smartness by establishing a scientific and reasonable indicator system. However, due to limitations in data availability and research conditions, the developed indicator system remains incomplete, and this study has certain inherent limitations.
Specifically, this study has five main limitations: the sample size is relatively small; the random forest and partial-effect models provide predictive rather than causal evidence; the DEDI is measured mainly at the provincial level, while CPSI is measured at the port level; the institutional and governance setting is China-specific; and the international generalizability of the conclusions remains limited. These caveats should be considered when interpreting the empirical results and policy implications.
Future research should focus on the following five aspects:
(1) Indicator System Construction: As statistical data continue to improve, future work should develop a more detailed, scientific, and systematic evaluation index system that incorporates more micro-level and qualitative indicators.
(2) Sample Size: Because of data constraints, the current study is based on a limited number of major coastal ports, corresponding to 56 port-year observations. Model complexity is deliberately constrained through cross-validation, parsimonious feature selection, benchmark comparison, and robustness checks. These procedures reduce, but cannot eliminate, the uncertainty associated with small-sample machine-learning analysis. To achieve more reliable results, future research should consider larger samples covering small- and medium-sized ports and inland river ports.
(3) Research Methods: Exploring other ways to assess the nonlinear and spatial spillover effects of data elements, the degree of synergy between data elements and emerging technologies, and the long-term dynamic development of smart ports could lead to more comprehensive conclusions.
(4) Causal Interpretation and Indicator Scale: The machine-learning results should be interpreted as predictive associations rather than causal effects. Future research should integrate port-level operational data, platform transaction data, enterprise interviews, and longitudinal policy shocks to better address causality and the scale mismatch between provincial DEDI indicators and port-level CPSI indicators.
(5) International Applicability and Governance Context: The conclusions of this study are primarily applicable to the context of China’s coastal ports and cannot be directly generalized to all international port systems because institutional arrangements, port governance models, data-sharing rules, and public–private coordination mechanisms differ across countries. Future research may further expand the sample scope, integrating micro survey data, enterprise operation data, and international port comparison data to conduct external testing and revision of the framework proposed in this study.
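The small-sample caveat in item (2) rests on comparing in-sample fit with cross-validated fit. The sketch below shows plain k-fold cross-validation and the resulting train–CV gap. It is illustrative only: the data are synthetic (only the sample size of 56 is taken from the text), a one-variable least-squares line stands in for the random forest, and the number of folds is an assumption because the fold scheme is not specified here.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and split the indices into k folds of near-equal size."""
    order = list(range(n))
    random.Random(seed).shuffle(order)
    return [order[f::k] for f in range(k)]

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (stand-in for the real learner)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b

def r_squared(y_true, y_pred):
    """Coefficient of determination; can be negative on held-out data."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def cross_validated_r2(xs, ys, k=5, seed=0):
    """Mean out-of-fold R² across k folds."""
    scores = []
    for fold in k_fold_indices(len(xs), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(xs)) if i not in held_out]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        scores.append(r_squared([ys[i] for i in fold],
                                [a + b * xs[i] for i in fold]))
    return sum(scores) / len(scores)

# Synthetic data: 56 points, matching only the sample size reported in the text.
rng = random.Random(1)
xs = [i / 55 for i in range(56)]
ys = [0.2 + 0.5 * x + rng.gauss(0, 0.05) for x in xs]
a, b = fit_line(xs, ys)
train_r2 = r_squared(ys, [a + b * x for x in xs])
cv_r2 = cross_validated_r2(xs, ys)
gap = train_r2 - cv_r2  # a large positive gap signals overfitting
```

With panel data, observations from the same port in adjacent years are not independent, so random folds of this kind may give optimistic scores; grouping folds by port or by year is a common alternative.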

Author Contributions

Conceptualization, L.J. and Y.B.; methodology, L.J.; software, Y.B.; validation, L.J., Y.B. and X.Z.; formal analysis, Y.B.; investigation, X.Z. and Q.Z.; resources, Y.B.; data curation, Y.B. and X.Z.; writing—original draft preparation, Y.B. and X.Z.; writing—review and editing, X.Z. and Q.Z.; visualization, Y.B.; supervision, L.J.; project administration, L.J.; funding acquisition, L.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Social Science Fund of China (No. 24BJY114).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Molavi, A.; Lim, G.J.; Race, B. A framework for building a smart port and smart port index. Int. J. Sustain. Transp. 2020, 14, 686–700.
  2. Philipp, R. Digital readiness index assessment towards smart port development. Sustain. Manag. Forum|Nachhalt. 2020, 28, 49–60.
  3. Cai, W.X.; Zheng, J.C. Construction of evaluation index system for intelligent degree of ports in Guangdong Province. Water Transp. Manag. 2019, 41, 14–17.
  4. Cao, J.; Wang, D.C.; Dai, R.; Li, H.; Huang, F.C. Evaluation model of smart port development based on improved matter-element method. J. Chongqing Jiaotong Univ. (Nat. Sci.) 2021, 40, 59–65.
  5. Zheng, Z.; Li, H.L. Evaluation of international port intelligence based on KPI and AHP. J. Wuhan Univ. Technol. (Soc. Sci. Ed.) 2022, 35, 81–88.
  6. Luo, B.C.; Hu, J.; Wang, R.D. Smart port grade evaluation method based on grey entropy variable weight. Transp. Res. 2023, 9, 127–133+142.
  7. Zhu, J.S.; Li, Q.; Lin, J.J.; Ning, T. Comprehensive evaluation index system and method of world-class ports. Navig. China 2025, 48, 1–8.
  8. Zhou, H.; Zhou, D.; Liu, Z.H. Research on modern intelligent port evaluation. In Proceedings of the 9th International Conference on Financial Innovation and Economic Development; Atlantis Press: Dordrecht, The Netherlands, 2024; pp. 1–6.
  9. Li, Z.G.; Wang, J. Digital economy development, data factor allocation and manufacturing productivity improvement. Economist 2021, 10, 41–50.
  10. He, W.; Dong, Y.; Sun, Z.Y. Impact of data elements on regional coordinated development: Empirical analysis based on panel data of 279 prefecture-level cities in China. Urban Probl. 2024, 6, 35–44.
  11. Huang, Z. Theoretical measurement and analysis of Chinese data factor development index. Int. J. Financ. Investig. 2025, 2, 1–6.
  12. Pan, H.L.; Zhao, L.X.; Ye, L. Measurement and spatiotemporal evolution of data factor development level in China. Stud. Sci. Sci. 2025, 43, 205–216.
  13. Chen, R.D.; Wang, C.; Pan, C.X.; Liu, C. Impact of data factor marketization on firm performance: Evidence from Chinese listed companies. Chin. J. Manag. Sci. 2025, 33, 1–11.
  14. Wu, J.; Chen, H.Z. Measurement of the development level of data elements in China, spatiotemporal evolution and promotion path. Stat. Decis. 2025, 41, 82–87.
  15. Sun, Y.; Wang, P. Application of unmanned driving technology in future smart ports. Pearl River Water Transp. 2019, 23, 5–7.
  16. Paulauskas, V.; Filina-Dawidowicz, L.; Paulauskas, D. Ports digitalization level evaluation. Sensors 2021, 21, 6134.
  17. Tang, H.; Chang, Z.X.; Wu, Z.M.; Wei, Y.N. Exploration of smart port construction in Zhanjiang port based on 5G technology. Water Transp. Manag. 2021, 43, 11–15.
  18. Min, H. Developing a smart port architecture and essential elements in the era of Industry 4.0. Marit. Econ. Logist. 2022, 24, 189–207.
  19. Deng, Y.Y.; Qin, J.P. Efficiency evaluation of smart ports based on super-efficiency DEA and DEA-Malmquist index model. J. Shanghai Marit. Univ. 2022, 43, 83–90.
  20. Du, X.K. Research on the path of artificial intelligence to empower intelligent port upgrading and transformation. E3S Web Conf. 2023, 372, 02001.
  21. Ma, L.Q.; Zhang, Y.; Zhao, C.; Feng, X.J.; Wang, R.H.; Li, X.D. Review on key technologies and applications of smart port construction. Mod. Transp. Metall. Mater. 2024, 4, 16–30.
  22. Hua, J.; Ma, W.; Wang, Z.F.; Li, Z. Exploration and practice of smart port in Wuxi (Jiangyin) Port. China Water Transp. 2025, 7, 34–36.
  23. Cai, H.Y. Practical application of big data technology in the intelligent management of port logistics. China Logist. Purch. 2026, 1, 72–73.
  24. Jiang, Z.R.; Zhu, H.Y.; Wang, C.J.; Ye, S.L. Spatial pattern and influencing factors of port system transformation: An empirical analysis based on the Yangtze River Delta. Sci. Geogr. Sin. 2021, 41, 1187–1198.
  25. Zheng, B.Y.; Yang, H.F. Evaluation of port efficiency of China’s coastal cities in “The Belt and Road” —Based on DEA game crossover efficiency-Tobit model. J. Appl. Stat. Manag. 2021, 40, 502–514.
  26. Cao, J.J.; Lei, A.H.; Liu, Q.; Zhang, Y.; Wang, L.; Yan, X.P. Research on the development of smart ports driven by the integration of virtual and real. Eng. Sci. China 2023, 25, 239–250.
  27. Zhao, Z.Y. High-quality development path of smart ports empowered by digitalization. Mark. Mod. 2024, 24, 48–51.
  28. Wang, K.L.; Jiang, X.M. Spatiotemporal differences and influencing factors of high-quality development of marine economy in China. Ocean Dev. Manag. 2024, 41, 121–132.
  29. Zhao, D.; Xu, D.M.; Zhou, Y.T.; Duan, W. The spatial spillover effect and its attenuation boundary of urban economy on port efficiency. PLoS ONE 2024, 19, e0304973.
  30. Zhou, Y.T.; Li, Z.F.; Deng, Z. The spatiotemporal differentiation, future trends and driving factors of port efficiency in the Bohai Rim region. World Reg. Stud. 2024, 33, 142–154.
  31. Yan, X.P.; Tu, M.; Yang, J.Q.; Xu, H.R.; Zhang, T.X. Research on the path to building world-class marine ports in China. Eng. Sci. China 2025, 27, 236–247.
  32. Xue, T.T.; Zhang, F.; Xue, Y. Technical adoption barriers and countermeasures in the smart transformation of ports. Pearl River Water Transp. 2026, 4, 15–17.
Figure 1. The mechanism of data elements driving smartness enhancement in coastal ports (the figure was created by the authors).
Figure 2. Basic principles of random forest and the calculation process of feature variable contribution rates (the figure was created by the authors).
Figure 3. Trends of CPSI and DEDI from 2017 to 2024. The x-axis denotes time, and the y-axis denotes index values (DEDI, CPSI). Solid lines represent DEDI; dashed lines represent CPSI (the figure was created by the authors).
Figure 4. Univariate partial effect of DEDI on CPSI with Bootstrap 95% uncertainty interval (the figure was created by the authors). Note: This figure presents the partial dependence relationship between DEDI and predicted CPSI, estimated from the random forest model. The solid line denotes the mean partial effect, and the dashed vertical line identifies a potential inflection point around DEDI ≈ 0.215. The shaded area represents the Bootstrap 95% uncertainty interval obtained by resampling the empirical distribution used in partial-effect integration. Since the random forest model is kept fixed during this procedure, the interval reflects the uncertainty of the average predictive response.
Figure 5. Joint partial effects of data elements and feature variables on coastal port smartness (the figure was created by the authors). Note: The horizontal axis of the figure denotes the DEDI, while the vertical axis represents the corresponding supporting variables. Color coding indicates the CPSI predicted by the random forest model, contour lines illustrate the gradient of predicted value changes, scatter points mark the actual locations of sample regions, and red stars indicate the largest-gradient locations on the predicted response surface. It is important to note that the joint partial effect plot reflects the average response in the context of model prediction, which is used to identify regions of nonlinear variation in CPSI predicted values under combinations of variables. It does not constitute a strict causal effect or a policy-optimal point.
Table 1. Comparison of evaluation methods (the table was created by the authors).
Method | Main Advantages | Main Limitations | Approach in This Study
AHP | Enables the integration of expert judgment and theoretical weighting | High subjectivity and insufficient dynamic comparability | Used as a literature-based reference, not the primary methodology
Entropy Weight Method | Objectively assigns weights based on indicator dispersion | Limited comparability of weights across different years | Utilized to characterize the information content of indicators
VHSD | Balances longitudinal temporal variations and cross-sectional object differences | Insufficient consideration of indicator information volume | Employed to enhance dynamic comparability of measurements
VHSD-EM | Integrates dynamic variations and information dispersion | Still dependent on the predefined indicator system | Adopted as the comprehensive measurement method in this study, with sensitivity analysis performed
Table 2. Parameter settings and performance evaluation of the random forest model (the table was created by the authors).

| Item | Setting/Result |
|---|---|
| Sample size | 56 |
| Number of features | 11 |
| Random forest trees | 300 |
| max_features | sqrt |
| min_samples_leaf | 1 |
| min_samples_split | 2 |
| n_estimators | 300 |
| Train R² | 0.9729 |
| CV R² | 0.7688 |
| OOB R² | 0.7935 |
| CV RMSE | 0.0507 |
| CV MAE | 0.0390 |
Table 3. Performance comparison between the random forest model and benchmark models (the table was created by the authors).

| Model | CV R² Mean | CV R² Std | RMSE Mean | RMSE Std | MAE Mean | MAE Std | Train R² Mean | Train R² Gap |
|---|---|---|---|---|---|---|---|---|
| Random Forest | 0.7548 | 0.0489 | 0.0500 | 0.0082 | 0.0389 | 0.0064 | 0.9610 | 0.2063 |
| Gradient Boosting | 0.6395 | 0.0595 | 0.0612 | 0.0111 | 0.0415 | 0.0043 | 0.9989 | 0.3593 |
| Ridge | 0.0726 | 0.1909 | 0.0969 | 0.0130 | 0.0833 | 0.0153 | 0.4121 | 0.3395 |
| Lasso | −0.0938 | 0.0662 | 0.1063 | 0.0138 | 0.0896 | 0.0162 | 0.0000 | 0.0938 |
| OLS | −0.2817 | 0.4918 | 0.1130 | 0.0247 | 0.0923 | 0.0135 | 0.5087 | 0.7903 |
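As a reading aid for Table 3, the following Python sketch recomputes the last column. It rests on two assumptions that the table itself does not spell out: that "Train R2 Gap" means Train R2 Mean minus CV R2 Mean, and that the Lasso and OLS CV R2 means are negative (−0.0938 and −0.2817). The numbers are transcribed from the table; the function name is illustrative only.

```python
# Values transcribed from Table 3: (CV R2 mean, Train R2 mean, reported gap).
# Assumption: the Lasso and OLS CV R2 means are negative values.
table3 = {
    "Random Forest": (0.7548, 0.9610, 0.2063),
    "Gradient Boosting": (0.6395, 0.9989, 0.3593),
    "Ridge": (0.0726, 0.4121, 0.3395),
    "Lasso": (-0.0938, 0.0000, 0.0938),
    "OLS": (-0.2817, 0.5087, 0.7903),
}


def train_cv_gap(cv_r2, train_r2):
    """Assumed definition of the gap: training fit minus cross-validated fit."""
    return train_r2 - cv_r2


gaps = {model: train_cv_gap(cv, train) for model, (cv, train, _) in table3.items()}
best_by_cv = max(table3, key=lambda model: table3[model][0])

for model, (cv, train, reported) in table3.items():
    print(f"{model:18s} recomputed={gaps[model]:.4f} reported={reported:.4f}")
print("Highest CV R2:", best_by_cv)
```

Under these assumptions the recomputed gaps agree with the reported column to within rounding of the fourth decimal place, which supports reading the gap as an overfitting indicator: the larger the gap, the more the training fit overstates out-of-sample performance.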
Table 4. Parameter settings of the partial effect model (the table was created by the authors).

| Item | Setting/Result |
|---|---|
| Sample size | 56 |
| Number of features | 8 |
| Random forest trees | 300 |
| max_features | sqrt |
| min_samples_leaf | 2 |
| Bootstrap repetitions | 100 |
| Monte Carlo integration samples | 5000 |
| OOB R² | 0.7230 |
| Interval label | Bootstrap 95% uncertainty interval |
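Table 5. Indicator system for coastal port smartness (the table was created by the authors).

| Primary Indices | Secondary Indices | Tertiary Indices | Attribute |
|---|---|---|---|
| Wuli layer | Infrastructure | Length of production wharves | + |
| Wuli layer | Infrastructure | Number of production berths | + |
| Wuli layer | Intelligent facilities | Number of automated terminals | + |
| Wuli layer | Intelligent facilities | Presence of intelligent operating systems | + |
| Wuli layer | Intelligent facilities | Presence of big data platforms | + |
| Wuli layer | Intelligent facilities | Informatization and paperless operations | + |
| Wuli layer | Intelligent facilities | Degree of new technology application | + |
| Wuli layer | Green and low-carbon | Port carbon dioxide emissions | |
| Wuli layer | Green and low-carbon | Comprehensive energy consumption per unit throughput of ports | |
| Shili layer | Logistics efficiency | Port cargo throughput | + |
| Shili layer | Logistics efficiency | Container throughput | + |
| Shili layer | Logistics efficiency | Sea-rail intermodal volume | + |
| Shili layer | Operational efficiency | Average vessel loading/unloading volume at container terminals | + |
| Shili layer | Operational efficiency | Average quay crane handling volume at container terminals | + |
| Shili layer | Service quality and effectiveness | Port connectivity | + |
| Shili layer | Service quality and effectiveness | Production cost per ton | |
| Shili layer | Service quality and effectiveness | Actual regular cost of exports | |
| Shili layer | Service quality and effectiveness | Import container pick-up time efficiency | |
| Shili layer | Service quality and effectiveness | Export container gate-in time efficiency | |
| Renli layer | Talent team | Proportion of employees with bachelor’s degree or above | + |
| Renli layer | Talent team | Proportion of technical personnel | + |
| Renli layer | Innovation-driven | Ratio of total R&D investment to operating revenue | + |
| Renli layer | Innovation-driven | Number of authorized invention patent applications in port cities | + |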
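Table 6. Indicator system for the development level of data elements (the table was created by the authors).

| Primary Indices | Secondary Indices | Tertiary Indices | Attribute |
|---|---|---|---|
| Basic support for data element development | Supporting environment for data element development | Number of state-level high-tech zones | + |
| Basic support for data element development | Supporting environment for data element development | Number of state-level university science parks | + |
| Basic support for data element development | Supporting environment for data element development | Number of national key laboratories | + |
| Basic support for data element development | Supporting environment for data element development | Number of digital economy policies issued | + |
| Basic support for data element development | Supporting environment for data element development | Number of data security documents issued at the provincial level | + |
| Basic support for data element development | Supporting environment for data element development | Number of data-related regulations and standards | + |
| Basic support for data element development | Supporting environment for data element development | Number of internet broadband access ports | + |
| Basic support for data element development | Supporting environment for data element development | Length of optical cable lines | + |
| Basic support for data element development | Supporting environment for data element development | Capacity of mobile phone switches | + |
| Basic support for data element development | Supporting environment for data element development | Per capita volume of telecommunications services | + |
| Data element supply capacity | R&D and innovation capability of elements | Number of digital economy patents | + |
| Data element supply capacity | R&D and innovation capability of elements | R&D investment intensity of high-tech industries | + |
| Data element supply capacity | Supply capacity of data technologies and services | Number of digital business enterprises | + |
| Data element supply capacity | Supply capacity of data technologies and services | Technological transformation expenditure of industrial enterprises above designated size | + |
| Data element supply capacity | Supply capacity of data technologies and services | Proportion of software business revenue in GDP | + |
| Data element supply capacity | Supply capacity of data technologies and services | Proportion of information technology service revenue in GDP | + |
| Data element supply capacity | Scale of data resources | Number of mobile internet users | + |
| Data element supply capacity | Popularization level of data terminal facilities | Number of computers per 100 persons | + |
| Effectiveness of data element application | Application of electronic big data | E-commerce sales volume | + |
| Effectiveness of data element application | Application of financial big data | Digital inclusive finance index | + |
| Effectiveness of data element application | Application of industrial big data | Technology market transaction volume | + |
| Effectiveness of data element application | Application of industrial big data | Robot density | + |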
Table 7. Mapping between CPSI indicators and the United Nations Sustainable Development Goals (SDGs) (the table was created by the authors).

| CPSI Dimension | Representative Indicators | Related SDGs | Sustainability Relevance |
|---|---|---|---|
| Wuli layer | Automated terminals; intelligent operating systems; big data platforms; paperless operations | SDG 9 | Infrastructure upgrading and technological innovation |
| Wuli layer | Port carbon dioxide emissions; energy consumption per unit throughput | SDG 13; SDG 12 | Climate action, energy saving, and resource efficiency |
| Shili layer | Cargo throughput; container throughput; sea-rail intermodal volume; handling efficiency | SDG 9; SDG 11 | Resilient logistics systems and urban supply-chain efficiency |
| Shili layer | Port connectivity; export/import costs; container pick-up and gate-in efficiency | SDG 8; SDG 17 | Trade facilitation and regional economic connectivity |
| Renli layer | Highly educated employees; technical personnel; R&D investment; invention patents | SDG 4; SDG 8; SDG 9 | Human capital, innovation capacity, and quality employment |
Table 8. Spearman correlation test results for the VHSD-EM model (the table was created by the authors).

| Spearman Correlation Coefficient | 2017 | 2018 | 2019 | 2020 | 2021 | 2022 | 2023 | 2024 |
|---|---|---|---|---|---|---|---|---|
| CPSI | 0.952 ** | 1.000 *** | 1.000 *** | 0.810 * | 1.000 *** | 0.976 *** | 1.000 *** | 1.000 ** |
| DEDI | 0.974 *** | 0.984 *** | 1.000 *** | 1.000 *** | 0.964 *** | 0.851 * | 0.873 * | 0.939 *** |
Note: The sample includes eight coastal ports: Dalian Port (Liaoning), Tianjin Port (Tianjin), Qingdao Port (Shandong), Shanghai Port (Shanghai), Ningbo-Zhoushan Port (Zhejiang), Xiamen Port (Fujian), Zhuhai Port, and Guangzhou Port (Guangdong), corresponding to seven coastal provinces and municipalities. ***, **, and * denote significance at the 1%, 5%, and 10% levels, respectively.
Table 9. Spearman correlation coefficients of CPSI ranking results under different α settings (the table was created by the authors).

| Comparison | Overall Spearman ρ | Mean Yearly ρ | Min Yearly ρ | Max Yearly ρ |
|---|---|---|---|---|
| α = 0.25 vs. α = 0.50 | 0.9654 | 0.9576 | 0.9447 | 0.9822 |
| α = 0.75 vs. α = 0.50 | 0.9258 | 0.9072 | 0.8389 | 0.9506 |
| α = 0.25 vs. α = 0.75 | 0.8088 | 0.7953 | 0.6789 | 0.8696 |

Note: α denotes the relative contribution of VHSD weights in W_α = α × W_VHSD + (1 − α) × W_EM. α = 0.50 is treated as the baseline setting.
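The sensitivity check summarized in Table 9 can be illustrated with a small Python sketch: blend two weight vectors with the formula from the note, score each unit as a weighted sum, and compare the resulting rankings with Spearman's ρ. All weights and indicator values below are hypothetical placeholders rather than the study's data, and the weighted-sum scoring step is an assumption made for illustration.

```python
def combine_weights(w_vhsd, w_em, alpha):
    """W_alpha = alpha * W_VHSD + (1 - alpha) * W_EM, element by element."""
    return [alpha * a + (1 - alpha) * b for a, b in zip(w_vhsd, w_em)]


def average_ranks(values):
    """Ascending ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical inputs: 3 indicators, 5 units, values already scaled to [0, 1].
w_vhsd = [0.5, 0.3, 0.2]
w_em = [0.2, 0.3, 0.5]
indicators = [
    [0.9, 0.2, 0.4],
    [0.6, 0.8, 0.1],
    [0.3, 0.5, 0.9],
    [0.1, 0.9, 0.7],
    [0.7, 0.4, 0.6],
]


def scores(alpha):
    """Weighted-sum composite score per unit for a given alpha (assumed form)."""
    w = combine_weights(w_vhsd, w_em, alpha)
    return [sum(wi * xi for wi, xi in zip(w, row)) for row in indicators]


baseline = scores(0.50)
for alpha in (0.25, 0.75):
    print(alpha, round(spearman_rho(scores(alpha), baseline), 4))
```

Because both input weight vectors sum to one, every blended vector does as well, so changing α only redistributes weight among indicators. A ρ close to 1 against the α = 0.50 baseline then indicates that the ranking is not sensitive to that choice.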
Table 10. Average index of coastal port smartness and data element development from 2017 to 2024 (the table was created by the authors).

| Coastal Region | Coastal Port | CPSI Score | CPSI Ranking | Wuli Layer Score | Wuli Layer Ranking | Shili Layer Score | Shili Layer Ranking | Renli Layer Score | Renli Layer Ranking | DEDI Score | DEDI Ranking |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Shanghai | Shanghai | 0.5116 | 1 | 0.2393 | 2 | 0.2055 | 3 | 0.0668 | 2 | 0.3469 | 3 |
| Zhejiang | Ningbo-Zhoushan | 0.5022 | 2 | 0.2473 | 1 | 0.2175 | 2 | 0.0374 | 4 | 0.3299 | 5 |
| Shandong | Qingdao | 0.4456 | 3 | 0.1886 | 5 | 0.2200 | 1 | 0.0370 | 5 | 0.3319 | 4 |
| Tianjin | Tianjin | 0.3480 | 4 | 0.1579 | 6 | 0.1230 | 4 | 0.0671 | 1 | 0.1409 | 8 |
| Liaoning | Dalian | 0.3413 | 5 | 0.2035 | 4 | 0.1062 | 6 | 0.0316 | 6 | 0.1631 | 7 |
| Guangdong | Guangzhou | 0.3156 | 6 | 0.1449 | 7 | 0.1163 | 5 | 0.0544 | 3 | 0.5962 | 1 |
| Fujian | Xiamen | 0.3150 | 7 | 0.2178 | 3 | 0.0888 | 7 | 0.0085 | 8 | 0.1686 | 6 |
| Guangdong | Zhuhai | 0.2123 | 8 | 0.1086 | 8 | 0.0754 | 8 | 0.0283 | 7 | 0.5962 | 1 |
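Two patterns in Table 10 can be checked with a few lines of Python. The sketch below tests whether each port's CPSI score equals the sum of its Wuli, Shili, and Renli scores, which is an observation about the reported numbers rather than a definition stated in the table, and it computes a simple rank gap (CPSI ranking minus DEDI ranking) as one possible way to express the "data advancement but smartness lag" pattern that the abstract attributes to Guangdong. The rank-gap measure is an illustrative assumption, not the authors' metric.

```python
# Transcribed from Table 10:
# port -> (CPSI, Wuli, Shili, Renli, CPSI ranking, DEDI ranking)
table10 = {
    "Shanghai": (0.5116, 0.2393, 0.2055, 0.0668, 1, 3),
    "Ningbo-Zhoushan": (0.5022, 0.2473, 0.2175, 0.0374, 2, 5),
    "Qingdao": (0.4456, 0.1886, 0.2200, 0.0370, 3, 4),
    "Tianjin": (0.3480, 0.1579, 0.1230, 0.0671, 4, 8),
    "Dalian": (0.3413, 0.2035, 0.1062, 0.0316, 5, 7),
    "Guangzhou": (0.3156, 0.1449, 0.1163, 0.0544, 6, 1),
    "Xiamen": (0.3150, 0.2178, 0.0888, 0.0085, 7, 6),
    "Zhuhai": (0.2123, 0.1086, 0.0754, 0.0283, 8, 1),
}

# Difference between the reported CPSI and the sum of the three layer scores.
residuals = {
    port: cpsi - (wuli + shili + renli)
    for port, (cpsi, wuli, shili, renli, _, _) in table10.items()
}

# Illustrative measure: a positive gap means the port ranks worse on
# smartness (CPSI) than its region ranks on data elements (DEDI).
rank_gap = {
    port: cpsi_rank - dedi_rank
    for port, (_, _, _, _, cpsi_rank, dedi_rank) in table10.items()
}
lagging = sorted(port for port, gap in rank_gap.items() if gap > 1)

for port in table10:
    print(f"{port:16s} residual={residuals[port]:+.4f} rank_gap={rank_gap[port]:+d}")
print("Smartness lags data by more than one rank:", lagging)
```

With the transcribed values, the layer scores add up to the CPSI score within rounding of the fourth decimal place, and only the two Guangdong ports show a rank gap larger than one.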
Table 11. Measurement results of feature variable contribution rates (the table was created by the authors).

| Variable | Contribution Rate | Variable | Contribution Rate |
|---|---|---|---|
| DEDI | 13.586% | IPA | 9.351% |
| DFI | 29.324% | TECH | 8.166% |
| OPE | 17.033% | BET_GPB | 6.881% |
| RGDP_PC | 10.270% | EDU | 5.389% |
Table 12. Measurement results of feature variable contribution rates incorporating data element interaction terms (the table was created by the authors).

| Variable | Contribution Rate | Variable | Contribution Rate |
|---|---|---|---|
| DEDI | 3.216% | DEDI main effect & DEDI interaction terms | 32.567% |
| DFI | 25.564% | DEDI × DFI | 6.031% |
| RGDP_PC | 9.747% | DEDI × RGDP_PC | 5.870% |
| BET_GPB | 3.287% | DEDI × BET_GPB | 5.716% |
| IPA | 4.629% | DEDI × IPA | 4.369% |
| TECH | 5.710% | DEDI × TECH | 2.858% |
| OPE | 14.292% | DEDI × OPE | 2.682% |
| EDU | 4.204% | DEDI × EDU | 1.825% |
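The combined figure of 32.567% in Table 12 can be reproduced by adding the DEDI main effect to its seven interaction terms. The Python sketch below does this arithmetic with the values transcribed from the table and also checks that all fifteen contribution rates sum to 100%; the variable names in the code are illustrative only.

```python
# Contribution rates (%) transcribed from Table 12.
main_effects = {
    "DEDI": 3.216, "DFI": 25.564, "RGDP_PC": 9.747, "BET_GPB": 3.287,
    "IPA": 4.629, "TECH": 5.710, "OPE": 14.292, "EDU": 4.204,
}
# Interaction terms of the form DEDI x <variable>.
dedi_interactions = {
    "DFI": 6.031, "RGDP_PC": 5.870, "BET_GPB": 5.716, "IPA": 4.369,
    "TECH": 2.858, "OPE": 2.682, "EDU": 1.825,
}

# DEDI main effect plus every DEDI interaction term.
dedi_combined = main_effects["DEDI"] + sum(dedi_interactions.values())
# All main effects plus all interaction terms.
total = sum(main_effects.values()) + sum(dedi_interactions.values())

print(f"DEDI main effect + interactions: {dedi_combined:.3f}%")
print(f"All contribution rates:          {total:.3f}%")
```

The DEDI main effect alone is small in this specification (3.216%), so most of the combined contribution comes from the interaction terms, led by DEDI × DFI.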
Table 13. Classification of partial effects of drivers of coastal port smartness (the table was created by the authors).

| DFI | RGDP_PC | BET_GPB | IPA |
|---|---|---|---|
| Shanghai (2) | Shanghai (4) | Guangdong (4) | Shanghai (4) |
| Zhejiang (2) | Fujian (1) | Zhejiang (4) | Guangdong (4) |
| Fujian (1) | Zhejiang (2) | Shanghai (4) | Shandong (4) |
| Guangdong (2) | Tianjin (1) | Shandong (4) | Tianjin (3) |
| Tianjin (1) | Guangdong (2) | Fujian (3) | Zhejiang (4) |
| Shandong (2) | Shandong (2) | Liaoning (3) | Fujian (1) |
| Liaoning (1) | Liaoning (1) | Tianjin (3) | Liaoning (1) |
Note: Each column in the table is sorted in descending order according to the corresponding feature variable value. The number in parentheses indicates the relative position of the region with respect to the maximum gradient point marked by a star in Figure 5: 1 indicates the region has not reached the maximum partial derivative point of DEDI and the feature variable; 2 indicates the region has crossed the DEDI turning point but not the maximum partial derivative point of the feature variable; 3 indicates the region has not crossed the DEDI turning point but has exceeded the maximum partial derivative point of the feature variable; 4 indicates the region has crossed both the DEDI turning point and the maximum partial derivative point of the feature variable.
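The four position codes defined in the note to Table 13 amount to a two-way threshold test, sketched below in Python. Several inputs are assumptions: the DEDI turning point is taken to be the threshold of about 0.215 reported in the abstract, the regional DEDI values are the 2017 to 2024 means from Table 10, and the feature values are placeholders because the table does not report the star locations in Figure 5. The function name `position_code` is hypothetical.

```python
DEDI_TURNING_POINT = 0.215  # assumption: threshold quoted in the abstract


def position_code(dedi, feature, feature_star, dedi_star=DEDI_TURNING_POINT):
    """Return the Table 13 code (1-4) for a region relative to the star point."""
    crossed_dedi = dedi > dedi_star
    crossed_feature = feature > feature_star
    if crossed_dedi and crossed_feature:
        return 4  # beyond both points
    if crossed_dedi:
        return 2  # beyond the DEDI turning point only
    if crossed_feature:
        return 3  # beyond the feature's maximum-gradient point only
    return 1      # beyond neither point


# Regional mean DEDI, 2017-2024, transcribed from Table 10.
mean_dedi = {
    "Shanghai": 0.3469, "Zhejiang": 0.3299, "Shandong": 0.3319,
    "Tianjin": 0.1409, "Liaoning": 0.1631, "Guangdong": 0.5962,
    "Fujian": 0.1686,
}

# DFI column of Table 13: every code is 1 or 2, so no region lies beyond the
# DFI star. The feature value 0.0 and star 1.0 are placeholders encoding
# "not crossed"; they are not values taken from the article.
dfi_codes = {region: position_code(d, 0.0, 1.0) for region, d in mean_dedi.items()}
print(dfi_codes)
```

Under these assumptions the sketch reproduces the DFI column of Table 13: the four regions whose mean DEDI exceeds the threshold receive code 2 and the remaining three receive code 1.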
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Jian, L.; Bai, Y.; Zhang, X.; Zhao, Q. Research on the Association and Pathways Between Data Elements and Coastal Port Smartness Enhancement. Sustainability 2026, 18, 5989. https://doi.org/10.3390/su18125989
