Article

Quantitative Measurement of Digital Maturity in Manufacturing Enterprises: An Application Scenario-Based Study

College of Management Science and Engineering, China Jiliang University, Hangzhou 310018, China
*
Author to whom correspondence should be addressed.
Sustainability 2026, 18(1), 274; https://doi.org/10.3390/su18010274
Submission received: 4 November 2025 / Revised: 18 December 2025 / Accepted: 23 December 2025 / Published: 26 December 2025

Abstract

With the rapid advancement of intelligent manufacturing and digital transformation, assessing the digital transformation and maturity of manufacturing enterprises enables firms to evaluate their digital achievements and establish appropriate transformation pathways. Existing maturity assessment models for manufacturing enterprises predominantly emphasize strategic-level evaluation and rely heavily on survey-based data, while paying limited attention to the business-function level. However, in practice, enterprises often initiate digital transformation by addressing specific business challenges and then gradually advance through concrete application scenarios. To address this gap, this paper proposes a scenario-based approach for measuring the digital transformation and maturity of manufacturing enterprises. With this approach, a differentiated weighting system is constructed based on “core keywords–extended keywords–negative keywords”, and a semantic similarity model is used to identify and quantify digital application scenarios in corporate annual reports. Building on this, a three-dimensional evaluation framework, comprising scenario coverage, scenario depth, and scenario consistency, is developed to comprehensively assess the extent of digital transformation from the perspective of application scenarios. With the proposed method, different business units can be evaluated independently, thereby capturing transformation progress across heterogeneous levels. Since it relies on publicly available corporate annual reports, the evaluation process is transparent and traceable and generates quantitative results. By shifting from survey-based, strategy-oriented assessments to function-oriented, data-driven, and modular evaluations, the method not only enhances accuracy and interpretability but also provides practical guidance for resource allocation, cross-functional complexity management, and the progressive expansion of digital transformation.

1. Introduction

With the accelerated progress in intelligent manufacturing and industrial digital transformation, manufacturing enterprises are undergoing a profound transformation from traditional production models to deeply integrated systems of physical and digital operations. Digital technologies have been widely adopted as critical tools to enhance flexibility, reduce operating costs, and restructure strategic models. However, scientifically evaluating the maturity and evolutionary pathways of a manufacturing enterprise’s digital transformation remains a major challenge.
In contrast to holistic transformation, manufacturing enterprises typically initiate digital transformation by addressing specific operational problems and then gradually implement it through application scenarios. Corporate annual reports serve as an important and reliable source of information for disclosing the degree of digital transformation, but their texts are often characterized by high semantic variability and unstructured expressions, making it difficult for traditional methods to effectively extract and quantify the underlying digital capabilities. Therefore, there is an urgent need for a measurement approach that can identify application scenarios from authentic texts, reflect the development depth, and track transformation trends to better align with enterprise practices and policy evaluation needs.
To address this issue, this paper proposes a scenario-based method for measuring the digital transformation of manufacturing enterprises. The method employs keywords and semantic similarity models to identify and quantitatively evaluate digital application scenarios in corporate annual reports. First, application scenarios are classified according to evaluation objectives. Then, three categories of keywords—“core words, extended words, and negation words”—are constructed and assigned differentiated weights. Next, based on keyword intensity, semantic similarity is applied to evaluate the alignment between digital transformation objectives and outcomes reflected in specific business scenarios. For each predefined scenario, a three-dimensional evaluation framework—comprising scenario coverage, scenario depth, and consistency—is established using keyword frequency, keyword intensity, and semantic similarity, thereby providing a systematic assessment of digital transformation from the perspective of application scenarios.
The proposed measurement framework for digital transformation is based on “scenario division–keyword guidance–semantic model embedding.” The framework demonstrates strong model adaptability and interpretability, as well as reliable trend tracking. It can be used to quantify digital transformation maturity across different business scenarios and reveal the transformation trajectories in functional dimensions such as R&D, manufacturing, operations, and collaboration, thereby providing a foundation for multidimensional assessment. To validate its effectiveness, the method was applied to ten years of annual reports from manufacturing enterprises in Zhejiang Province. The results show that it can consistently capture the trajectories of digital transformation across scenarios and align closely with regional policy developments. Compared with traditional approaches, this study introduces a novel paradigm for evaluating digital maturity and offers enterprises, governments, and research institutions a practical tool for assessing transformation progress and supporting decision-making, with the potential for cross-industry and cross-regional applications.

2. Literature Review

Maturity models (MMs) have been reviewed from various perspectives by several authors (for example, [1,2,3]). Senna et al. [3] provided a literature review of 55 digital MMs in the manufacturing sector. We therefore used these reviews as a starting point to build our model and collected research results on MMs for manufacturing enterprises. The reviewed literature on MMs in the manufacturing sector was published between 2023 and 2025 and was obtained from the WoS and Scopus databases.
In the following subsections, existing MMs across four key dimensions are analyzed: assessment level (strategic, organizational, functional), evaluation purpose (readiness, maturity, benchmarking), data sources (survey, expert, system-generated, text), and methodology (qualitative, MCDA/statistical, hybrid, data-driven). This framework allows for a structured comparison of key models, identifying gaps that motivate the proposed scenario-based approach.

2.1. Evaluation Level and Scope

From a structural perspective, layered frameworks with hierarchical stages are commonly adopted in the 105 reviewed MMs. Five- and six-level maturity structures are frequently used and have become relatively standardized, providing a clear basis for stakeholders to understand how capability progresses across stages. Some studies apply such structures in their models to describe capability gaps and evolution paths [4,5,6].
While some models focus primarily on technological aspects, most incorporate multiple assessment domains, including strategy, organization, technology, human factors, and operations, reflecting the multidimensional nature of Industry 4.0 readiness. For instance, several of these domains are used in the assessments at the same time [7,8,9,10]. In addition, some models are tailored for specific contexts, such as simplified frameworks for small and medium-sized enterprises [11] or models with targeted objectives such as green readiness [12] or workforce evaluation [13].
However, existing MMs predominantly define and assess digital maturity at the organizational level using macro-level dimensions, with limited attention to the transformation processes of specific business functions in manufacturing firms. In practice, manufacturing digital transformation often starts with a single function or a critical use case and then expands progressively across functional modules. Relying solely on organizational-level stage definitions and aggregated scoring can obscure maturity disparities among functions, making it difficult to identify function-specific implementation pace and capability gaps.

2.2. Purpose of Model Evaluation

The reviewed MMs vary in their focus and intended purpose, which can be broadly grouped into three categories: (1) readiness assessment, which evaluates an organization’s preparedness to initiate or advance Industry 4.0 transformation [9,14,15]; (2) maturity measurement, which determines the current state of digital maturity across selected domains or technologies [8,10,16]; and (3) capability benchmarking and strategic planning, which aims to identify capability gaps and support the formulation of strategic roadmaps or related decision-making inputs [17,18].
Of the 105 models analyzed, more than 70% are primarily designed for readiness assessment or maturity evaluation, indicating the strong emphasis on understanding organizational status relative to Industry 4.0 goals. Models with a strategic orientation are typically used to support transformation planning through methods such as capability gap analysis or weighted performance evaluation [11,17,18,19].
In addition, models are increasingly being developed for policy support and sustainability integration. These models have a range of purposes, such as education and workforce development [13,20], green and sustainable transformation [12,21], and SME-oriented policy frameworks [11].
Most existing evaluation targets adopt a firm-level perspective and primarily address the question of “where the organization stands overall in its digital transformation.” Consequently, they provide limited visibility into the performance and status of specific functions or processes in manufacturing firms and fail to reveal heterogeneity in transformation outcomes across functional modules.

2.3. Data for Model Evaluation

The evaluation data used can be categorized as follows: (1) survey-based data, which are collected via structured questionnaires, typically Likert scales [15,22]; (2) expert input and interviews, which are collected via expert judgment, interviews, and workshops [16]; (3) literature-driven or framework-derived data, which are based on existing models, standards, and theoretical mapping [23]; and (4) operational or system-generated data, which are based on digital systems such as MES, IoT, or performance logs [16].
Most (over 60%) of the examined models rely primarily on survey instruments. While this method is easy to implement, it may be limited in terms of objectivity and scalability. Recent developments indicate a gradual shift towards the use of system-based, real-time, or hybrid data sources, with the aim of improving reliability, automation, and integration with operational technologies. Beyond the above four data categories, there is the case-study-oriented data collection strategy, which draws directly from empirical evidence in multiple enterprises [24].
Among the commonly used data types, survey- and expert-based inputs are inherently subjective, while literature-driven or framework-derived data often fail to accurately represent firm-specific contexts. Moreover, substantial heterogeneity across manufacturing sectors and the diversity of digital software systems make it challenging to leverage operational or system-generated data effectively, as doing so requires standardized information-extraction methods and unified data formats, which are difficult to design and implement.

2.4. Evaluation Methods

Evaluation methods were grouped into the following four categories: (1) expert-based and self-assessment tools, which involve subjective scoring by external experts or internal stakeholders using checklists, Likert scales, or maturity matrices; (2) quantitative and statistical methods, which include AHP (Analytic Hierarchy Process), PROMETHEE (Preference Ranking Organization METHod for Enrichment Evaluations), SEM (Structural Equation Modeling), PLS-SEM (Partial Least Squares Structural Equation Modeling), CCA (Canonical Correlation Analysis), IPMA (Importance–Performance Map Analysis), fuzzy logic, and regression-based scoring; (3) hybrid or simulation-based approaches, which combine expert input with decision models, simulations, or model-driven reasoning (e.g., fuzzy AHP, job profile scoring); and (4) process-based or value-oriented assessment, using VSM (Value Stream Mapping), carbon footprint, or value-added analysis to link maturity with sustainability or process efficiency.
Among the reviewed literature, the expert-based assessment is prevalent; many models prioritize usability and adopt quick self-assessment tools—especially suited for SMEs (Small and Medium Enterprises) and initial diagnostic stages [3,25,26]. The use of formal MCDA (Multi-Criteria Decision Analysis) and statistical approaches is also rising. For instance, PLS-SEM, AHP, or PROMETHEE are incorporated into various models for more reliable validation and scoring, especially in academic contexts [4,22,27]. Hybrid models are also emerging. A growing number of studies [13,28] mix expert scoring with algorithmic computation, such as job role alignment and fuzzy logic, aiming for scalability and personalization. Newer approaches linking maturity to performance/sustainability integrate environmental KPIs (e.g., [12]) or production performance (e.g., [29]) to assess real-world impact and implementation depth. The evolution of these methods reflects a shift from manual checklists to hybrid, data-supported strategies.
In the assessment of digital transformation in manufacturing enterprises, the underlying evidence is predominantly textual and available at a considerable scale. Most existing approaches remain traditional, small-data–oriented evaluation methods, which are insufficient to fully leverage massive text-based evidence. Against the backdrop of rapid advances in artificial intelligence, it is necessary to explore automated analytics for large-scale, multi-source textual data to characterize firms’ digital practices in a more objective and reproducible manner and to derive robust assessment conclusions.

2.5. Summary of the Research State and Trends

To provide a clearer understanding of the current landscape of digital transformation MMs, Table 1 compares 10 key models based on their assessment levels, evaluation purposes, data sources, and method types. While over 100 models are described in the literature, these 10 were specifically chosen to highlight the most relevant approaches, gaps, and limitations in line with the objectives of this study. The selected models encompass various assessment levels, data sources, and evaluation methods commonly used in MMs.
This comparative framework illustrates the dominant trends in existing research and highlights some inherent limitations. Many models focus on high-level organizational assessments, rely on subjective data sources such as surveys or expert judgment, and lack detailed analysis at the functional or scenario level. These gaps hinder a comprehensive assessment of digital transformation across various business functions. The following sections will explore these limitations in more detail.
(1)
Dominant focus on maturity and readiness evaluation
Over 60% of the reviewed models are primarily designed to evaluate readiness or maturity. This indicates that most frameworks are oriented toward identifying the current state of digital adoption, particularly in relation to emerging technologies such as IoT and AI [7,30,31,32]. Academic studies tend to emphasize descriptive frameworks, paying limited attention to how firms progress between maturity stages. While some models support basic benchmarking, they often provide limited guidance on improvement actions or resource allocation [3,33]. In most cases, the emphasis remains on answering “Where are you now?” rather than “How are different parts of the organization evolving, and what should be improved next?”
(2)
Emphasis on strategic-level evaluation
From the scope perspective, many of the models (over 40%) focus on strategic-level assessment. In contrast, the number of models addressing operational and technology-level evaluations is considerably smaller. Strategic assessments can help initiate high-level discussions but often lack the specificity required for implementation in concrete areas such as production, IT, and operations [34]. Cross-functional integration is also limited, particularly in terms of linking strategic intent with coordinated actions across operational and technological domains. This is partly because many models are designed to support top-level decision-making, helping executives align Industry 4.0 initiatives with business goals and digital investment planning [9,30]. In addition, conceptual elements such as vision, leadership, and innovation are more readily modeled and generalized across different organizations and sectors [3]. As a result, function-level diagnostics—such as those for manufacturing, IT, logistics, or human resources—are addressed to a lesser extent in many models [29].
(3)
Prevalence of survey-based data and underuse of real-time or system data
Survey-based models far outnumber other types. Surveys are a commonly used method for deployment and scaling, especially across sectors with different digital maturity levels [7,29], and many models follow management research paradigms that favor self-reported questionnaires [28,33]. This reliance on survey instruments is also related to access limitations: researchers often lack access to live factory data due to privacy, system integration, or technical constraints [35].
However, surveys typically involve cross-sectional data collection and can only capture information at a specific moment in time, reducing their usefulness for continuous improvement or automated decision-making [16]. Without continuous data streams, most models provide one-off diagnostics with limited feedback mechanisms [36], and current models rarely explore the use of intelligent algorithms to process and interpret operational or text-based data.
(4)
Synthesized gaps
The review identifies three key gaps in current MM research for digital transformation in the manufacturing industry:
  • Lack of Function- and Scenario-Level Diagnostics: Existing models predominantly focus on organization- or strategy-level assessments, providing limited insight into how specific business functions or application scenarios progress asynchronously during digital transformation.
  • Dependence on Subjective, Survey-Based Data: The dominance of self-reported questionnaires and expert scoring limits scalability, continuity, and objectivity, while underutilizing the potential of publicly available or system-generated data.
  • Limited Use of Data-Driven and Text-Based Semantic Modeling: Although some models integrate quantitative and hybrid approaches, the systematic use of text mining and advanced semantic models (e.g., LLMs) to extract and quantify digital practices from unstructured data is still rare.
To address these limitations, future MMs should adopt a more modular, function-oriented, and data-driven architecture. Models ought to be structured around key organizational functions—such as production, supply chain, IT, maintenance, R&D, and HR—to enable their independent assessment using tailored criteria. This allows organizations to prioritize areas aligned with their strategic goals and transformation priorities. In addition, models should become more configurable, enabling the selection of maturity dimensions based on specific objectives, capabilities, and resource constraints. To improve accuracy and credibility, objective system-generated or publicly available data should be used, supporting real-time or at least regularly updated assessments of digital maturity in rapidly changing environments.
In manufacturing, digital transformation is typically incremental rather than uniform, often starting in areas such as production, maintenance, or logistics, where the value is clear and measurable. These pilot implementations gradually expand as technologies mature and organizational readiness grows. Successful transformation also depends on integrating enterprise software systems such as MES, which are inherently modular and support discrete operations (e.g., production scheduling, inventory control, quality management, customer service). Therefore, assessing transformation at the business-function level is often more effective than relying solely on aggregate organizational scores.
Accordingly, this study proposes a business-function-oriented maturity evaluation model. The model is structured around independent business functions, each assessed through quantifiable indicators derived from objective and publicly available data. This design has four advantages:
  • Modularity—Each function is evaluated independently, capturing asynchronous transformation progress across different areas.
  • Objectivity—System-generated or publicly available data are used, rather than purely subjective judgments.
  • Practicality—The model employs clear evaluation logic with a low data collection burden, supporting benchmarking, monitoring, and cross-functional comparison without imposing excessive reporting requirements on firms.
  • Quantifiability—Targeted, quantitative diagnostics guide stepwise transformation aligned with varied maturity levels and functional priorities.
With these features, the proposed model addresses the identified gaps and complements existing MMs, extending the literature from predominantly strategic, survey-based assessments toward a more fine-grained, function- and scenario-oriented, data-driven evaluation of digital transformation.

3. Scenario-Based Framework for Quantitative Evaluation of Digital Transformation in Manufacturing Enterprises

3.1. Digital Transformation Application Scenarios in Manufacturing Enterprises

In the context of digital transformation, application scenarios focus on addressing specific business problems by integrating digital software functions with enterprise processes.
For manufacturing enterprises, digital transformation often begins with operational challenges or tasks that offer significant potential to improve efficiency; these opportunities then serve as entry points for broader organizational expansion. From the perspective of efficiency and value creation, application scenarios represent concrete solutions that drive performance improvement in manufacturing firms. A digital application scenario can therefore be understood as a practical instance in which digital technologies and advanced manufacturing technologies are embedded into specific business functions or production processes to achieve process optimization, efficiency gains, or value creation. Examples include predictive maintenance, digital quality traceability systems, and automated material delivery. Beyond demonstrating technological application, these scenarios also reflect a firm’s digital capability and illustrate how technology can be embedded into core operations to reduce costs, enhance flexibility, and strengthen customer responsiveness. Thus, application scenarios act as a bridge between technological potential and business value, providing a vehicle for practical application while significantly influencing the effectiveness and sustainability of digital transformation initiatives.
Digital transformation’s gradual and uneven nature further underscores the importance of scenario analysis. Transformation rarely unfolds uniformly across an organization; instead, it progresses incrementally, often beginning in functions such as production, maintenance, or logistics—domains where digital interventions generate immediate and measurable value. These pilot practices typically expand as technological capabilities mature and organizational readiness improves. From this perspective, application scenarios can be regarded as structured practices that integrate one or more digital technologies into business functions or processes to drive improvement or innovation. They are not only solutions to business problems but also structured mechanisms that enable digital transformation to scale in line with enterprise capabilities.
The essence of scenario-based digital transformation lies in deriving technological implementation from the need to improve process efficiency. By identifying and continuously expanding application scenarios, enterprises can clarify transformation objectives, deeply integrate digital technologies with business processes, and pursue sustained innovation oriented toward value creation. Moreover, the evaluation of application scenarios can reveal their breadth (functional coverage), depth (technological maturity), and integration (extent of business alignment), thereby reflecting the staged characteristics of digital transformation and identifying weaknesses across departments or processes. These insights provide a basis for resource allocation and optimization. Importantly, the selection of application scenarios is highly flexible and can be adjusted according to the evaluation object, purpose, and scope.

3.2. Challenges in Using Annual Report Data to Evaluate Digital Transformation

One of the key challenges in evaluating enterprise digital transformation from the perspective of business scenarios lies in obtaining stable and reliable data on business operations. Consequently, corporate annual reports have become an important data source and have been used for text analyses in several studies [37,38,39]. However, in existing studies that employ annual report data to measure digital transformation [40,41], keyword-matching methods are typically based on string equivalence or containment, which cannot capture the semantic similarity among different textual expressions. Moreover, the keywords used are primarily names of digital technologies, which fail to reflect the actual digital transformation outcomes within enterprises.
To evaluate whether manufacturing enterprises have undertaken digital transformation in specific business scenarios and to determine the degree of such transformation, this study builds on existing research by integrating scenario-specific keywords with semantic analysis. Beyond literal keyword matching, LLMs (large language models) are employed to uncover broader semantic relationships, thereby complementing keyword matching and enhancing the comprehensiveness of the evaluation.

3.3. Evaluation Framework and Procedure

The proposed method, as shown in Figure 1, employs annual reports as the primary data source. Typical manufacturing business scenarios are first selected as the analytical units, and textual analysis is performed accordingly. Keyword extraction and semantic similarity computation are then applied to derive three indicators—scenario coverage, scenario depth, and scenario consistency—forming a three-dimensional evaluation system for assessing the degree of digital transformation.
Compared with traditional approaches that rely on process-based segmentation or manually defined indices, the innovations of this framework can be summarized as follows: (1) business scenarios are adopted as the fundamental analytical unit, which better captures the practical logic of digital transformation in manufacturing enterprises; (2) a keyword–semantic model integration mechanism that combines expert knowledge with semantic embeddings is introduced to enhance the detection of target information in unstructured texts; and (3) a three-dimensional evaluation structure (coverage, depth, and consistency) is designed, which enables a systematic measurement from local semantic expressions to overall trend tracking. The evaluation procedure is outlined as follows:
(1)
Annual report data collection and preprocessing
A cluster of listed manufacturing enterprises is identified based on the evaluation objectives. Their annual reports are then collected and preprocessed, with the Management Discussion and Analysis (MD&A) section retained for further analysis.
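As a rough sketch of this preprocessing step, the MD&A section can be isolated with pattern matching. The heading patterns below are illustrative assumptions only: real annual reports (often in Chinese) vary by exchange and reporting year, so the patterns would need adapting per corpus.

```python
import re

# Illustrative MD&A extraction. The heading strings are assumptions:
# actual annual-report layouts differ by exchange and reporting year.
MDA_START = re.compile(r"Management Discussion and Analysis")
NEXT_SECTION = re.compile(r"Corporate Governance")  # heading assumed to follow the MD&A

def extract_mda(report_text: str) -> str:
    """Return the MD&A section of an annual report, or '' if it is not found."""
    start = MDA_START.search(report_text)
    if start is None:
        return ""
    end = NEXT_SECTION.search(report_text, start.end())
    body = report_text[start.end(): end.start()] if end else report_text[start.end():]
    return body.strip()
```

In practice, a per-year mapping of section headings (or a PDF outline parser) would replace the two fixed patterns.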
(2)
Definition of scenario keywords
First, scenarios are selected according to the current evaluation objectives. For each scenario, core keywords (CKs) are identified, and then corresponding extended keywords (EKs) and negative keywords (NKs) are derived. The number of CKs, EKs, and NKs can be flexibly determined based on the evaluation purpose and specific conditions.
CKs represent the fundamental characteristics of digital transformation, typically expressed through key processes, technologies, or methods. They explicitly capture the outcomes of digital application, integration, and transformation. EKs complement the CKs and generally include concepts and domain-specific descriptions related to the core technologies or methods. They provide additional detail on the specific business processes or scope of digital transformation, thereby enriching the description of each scenario. NKs denote negations, adverse terms, or exclusionary expressions (e.g., “underperformed,” “lack of resources,” “non-strategic direction”). The combined use of CKs, EKs, and NKs—where CKs define the direction, EKs broaden the scope, and NKs eliminate noise—enables a more comprehensive understanding of textual semantics.
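A minimal sketch of how such a scenario keyword table might be organized is given below. The scenario names and phrases are hypothetical examples, not the study's actual keyword lists, which are defined per evaluation objective.

```python
# Hypothetical scenario keyword table: each scenario maps the three keyword
# categories (core / extended / negative) to lists of phrases.
SCENARIO_KEYWORDS = {
    "predictive_maintenance": {
        "core": ["predictive maintenance", "condition monitoring"],
        "extended": ["vibration analysis", "sensor data", "equipment health"],
        "negative": ["underperformed", "lack of resources"],
    },
    "quality_traceability": {
        "core": ["quality traceability", "digital quality system"],
        "extended": ["batch tracking", "defect analytics"],
        "negative": ["non-strategic direction"],
    },
}
```

Keeping the table as plain data makes the keyword library easy to extend or recalibrate for a different evaluation target, as the text notes.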
(3)
Calculation of scenario keyword intensity
When calculating keyword intensity for each scenario in annual reports, different weights are assigned to core keywords (CKs), extended keywords (EKs), and negative keywords (NKs). For example, CKs are assigned a weight of 1.5, EKs a weight of 1.0, and NKs a weight of −0.8. The intensity score for each scenario is then computed by combining the frequency of each keyword type with its assigned weight.
CKs carry the highest weight to anchor the core theme of the scenario. Even in lengthy texts, CKs keep the evaluation focused and allow rapid identification of the central elements of digital transformation, ensuring strong alignment between the assessment and the business scenario. EKs are assigned moderate weights and are used to capture derivative expressions and indirect associations of the core theme. They help recognize different formulations of the same concept, supplementing industry-specific terminology, synonyms, or scenario extensions, thereby enhancing the reliability of the assessment. NKs are assigned negative weights to suppress noise or adverse information by detecting negation patterns and lowering the corresponding scenario score. This adjustment improves the ability to identify the true strategic intent expressed in annual reports.
When CKs, EKs, and NKs are jointly used to calculate keyword intensity, each of their contributions can be traced individually, ensuring interpretability and facilitating verification. In addition, by adjusting the keyword library and weights, the sensitivity of the evaluation can be flexibly calibrated, making the method adaptable to different assessment targets.
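Using the example weights above (CK 1.5, EK 1.0, NK −0.8), the intensity computation can be sketched as follows. Counting occurrences by case-insensitive substring match is an assumption for illustration; the paper does not fix a particular matching rule at this step.

```python
# Weights follow the illustrative values given in the text:
# core 1.5, extended 1.0, negative -0.8.
WEIGHTS = {"core": 1.5, "extended": 1.0, "negative": -0.8}

def keyword_intensity(text: str, keywords: dict) -> float:
    """Weighted keyword-frequency intensity for one scenario.

    `keywords` maps each category ('core'/'extended'/'negative') to a list of
    phrases; frequencies are counted by case-insensitive substring occurrence.
    """
    lowered = text.lower()
    score = 0.0
    for category, phrases in keywords.items():
        freq = sum(lowered.count(p.lower()) for p in phrases)
        score += WEIGHTS[category] * freq
    return score
```

Because each category's contribution is a separate weighted term, per-category scores can be reported alongside the total, which is what makes the intensity traceable and verifiable.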
(4)
Semantic similarity calculation
Keyword matching is based on string equivalence or containment. However, the number of keywords is limited, and their semantics are often narrow: directly searching for a keyword’s occurrence in annual reports fails to account adequately for synonyms, near-synonyms, or alternative expressions.
This study employs pre-trained language models to learn semantic associations among word vectors, thereby capturing deeper semantic information. By calculating semantic similarity, more comprehensive scenario-related information can be extracted from annual reports. Even when specific words do not appear explicitly, semantic similarity allows related expressions to be recognized, thus complementing the limitations of simple keyword matching.
The results of semantic similarity computation further improve keyword intensity analysis by evaluating the clarity of digital transformation expressions. They also assess scenario consistency—the consistency between the intended objectives and actual outcomes of digital transformation in specific business scenarios.
For the computation, pre-trained language models (e.g., Word2Vec, BERT, and other Transformer-based models) are first employed to extract core semantic anchors, which serve as the foundation for semantic understanding, from the scenario keyword table. Next, the models transform the annual report texts into multidimensional vectors, representing their overall semantics rather than superficial lexical forms. Cosine similarity is then applied to measure the similarity between semantic vectors, producing a semantic similarity score, which is normalized to a value between 0 and 1.
In this paper, the normalized semantic similarity score between the annual report text and the scenario keywords is denoted by $SS_i$, and the processing steps are outlined in Appendix A. This score is used as the scenario consistency measurement.
Through the above steps, deep semantic recognition and matching between the MD&A section of annual reports and the scenario keywords can be achieved.
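The core of the computation, cosine similarity between two embedding vectors mapped onto [0, 1], can be sketched as follows. In practice the vectors would come from a sentence embedding model; the linear rescaling (s + 1)/2 used here is one common normalization choice and stands in for the exact procedure detailed in Appendix A.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two semantic vectors."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def normalized_similarity(a, b):
    """Map cosine similarity from [-1, 1] onto a [0, 1] score
    (one simple normalization; assumed, not the paper's exact one)."""
    return (cosine_similarity(a, b) + 1.0) / 2.0
```

Orthogonal vectors map to 0.5 and parallel vectors to 1.0 under this scheme, so higher scores indicate closer semantic alignment between the report text and the scenario keywords.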
In the three core tasks involved in the above process—semantic anchor understanding, text semantic vectorization, and semantic similarity computation—different language models significantly differ in their adaptability to Chinese and English corpora. Specifically, semantic anchor understanding places high demands on a model’s capacity for contextual modeling and semantic completion. This is particularly critical when dealing with short keywords with insufficient contextual information, where the model must possess a certain level of “semantic extrapolation” ability. Text semantic vectorization, in contrast, emphasizes the model’s ability to achieve consistent encoding across texts of varying lengths while maintaining semantic structural stability. Semantic similarity computation requires the embedding space to exhibit well-organized semantic distribution characteristics, such that semantically similar text pairs are represented with smaller distances in the vector space. For tasks involving semantic anchor understanding and semantic similarity computation, different language models demonstrate distinct strengths in language coverage, application scenarios, and performance.
(5)
Three-dimensional evaluation
During the evaluation, the consolidated annual report text is treated as a whole and assessed for each predefined scenario separately. For each scenario, a three-dimensional evaluation framework is constructed based on keyword frequency, keyword strength, and semantic similarity. This framework comprises scenario coverage, scenario depth, and scenario consistency and is designed to comprehensively assess the degree of digital transformation. Scenario coverage is measured by the number of scenario-related keywords appearing in the annual report, thereby evaluating the breadth of digital transformation. Scenario depth is determined using keyword weights together with the frequency and number of keywords within each scenario, thus evaluating transformation depth. Scenario consistency is assessed by supplementing the analysis with semantic similarity, which evaluates the coherence and clarity of transformation-related keyword expressions within a scenario. The computational logic of these three indicators is summarized in Table 2, and the detailed process of calculating scenario coverage and scenario depth is presented in Appendix B.
The scores across different dimensions reflect a firm’s transformation status within each scenario. By integrating the values of scenario coverage, scenario depth, and scenario consistency, the Digital Transformation Index (DTI) can be derived to evaluate the degree of digital transformation for an individual scenario.
$$\mathrm{DTI}_i = \frac{SC_i + SD_i + SS_i}{3}.$$
The evaluation value of a single scenario can be used to assess different transformation scenarios for a target group of enterprises. Furthermore, aggregating the evaluation values across all scenarios provides an overall assessment of a firm’s digital transformation. In this study, the mean value of all scenario-specific evaluation scores is defined as the Overall Digital Transformation Index (ODTI).
$$\mathrm{ODTI} = \frac{\sum_{i=1}^{n_s} \mathrm{DTI}_i}{n_s},$$
where $n_s$ is the predefined number of scenarios.
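Under these definitions, both indices are simple means; a minimal sketch, with hypothetical score values:

```python
def dti(sc, sd, ss):
    """Digital Transformation Index for one scenario: the mean of
    scenario coverage (SC), scenario depth (SD), and scenario
    consistency (SS)."""
    return (sc + sd + ss) / 3.0

def odti(scenario_dtis):
    """Overall Digital Transformation Index: the mean of the
    per-scenario DTI values over the n_s predefined scenarios."""
    return sum(scenario_dtis) / len(scenario_dtis)

# Hypothetical scores for two scenarios of one firm-year.
s1 = dti(0.3, 0.6, 0.9)
s2 = dti(0.1, 0.3, 0.5)
overall = odti([s1, s2])
```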

4. Data Example and Analysis

To validate the effectiveness of the proposed method, listed manufacturing enterprises in Zhejiang Province are used as an empirical case study for evaluation and analysis. Manufacturing in Zhejiang accounts for 35% of the provincial GDP, exceeding the national average of approximately 27%. The province hosts 45 national-level intelligent manufacturing demonstration factories (over 10% of the national total), 32 global “lighthouse factories” (representing one-fifth of China’s total), and 18 national-level industrial clusters, covering both traditional and emerging industries. In addition, Zhejiang has more than 450 A-share listed manufacturing firms. With its strong manufacturing base, advanced digital transformation process, comprehensive policy support, and high data completeness, Zhejiang constitutes a representative case for examining the digital transformation of China’s manufacturing sector.

4.1. Annual Report Data Acquisition and Sampling Strategy

Following prior studies on annual report sampling [40,42], manufacturing firms listed on the Shanghai and Shenzhen A-share markets with registered addresses in Zhejiang Province between 2014 and 2023 were identified using the Choice data platform. Industry classification was based on three systems: (1) the CSRC “manufacturing” category; (2) selected SW categories, including machinery, automotive, electrical equipment, electronics, petrochemicals, and basic chemicals; and (3) the CSI “industry” category. The results from the three systems were merged and manually screened to exclude non-manufacturing firms, yielding the final list of Zhejiang manufacturing enterprises. To ensure data completeness, only firms with at least five consecutive years of annual report data were retained in the final sample. The full texts of their annual reports were then downloaded from the Choice database “http://choice.eastmoney.com (accessed on 10 June 2025)” and preprocessed prior to evaluation.
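The five-consecutive-year continuity screen can be sketched as follows; the firm identifiers and reporting years are illustrative.

```python
def has_consecutive_years(years, required=5):
    """True if `years` contains a run of at least `required`
    consecutive calendar years."""
    ys = sorted(set(years))
    if not ys:
        return False
    run = best = 1
    for prev, cur in zip(ys, ys[1:]):
        run = run + 1 if cur == prev + 1 else 1
        best = max(best, run)
    return best >= required

# Illustrative firm-year coverage: firm "A" passes the screen
# (2014-2018 is a five-year run), firm "B" does not.
firms = {"A": [2014, 2015, 2016, 2017, 2018, 2020],
         "B": [2014, 2016, 2018, 2020, 2022]}
sample = [f for f, ys in firms.items() if has_consecutive_years(ys)]
```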

4.2. Three-Dimensional Keyword Mapping Table Construction

According to the Implementation Guidelines for Digital Transformation of Manufacturing Enterprises, jointly issued by the Ministry of Industry and Information Technology and two other departments, the value-creation processes of manufacturing enterprises can be categorized into six typical digital transformation scenarios: research and design, production and manufacturing, operation and maintenance services, business management, supply chain management, and cross-linkage collaboration. In addition to defining these scenarios, the guidelines also delineate their functional scopes. The literature on the digital transformation of manufacturing enterprises further discusses scenario descriptions and transformation-related evaluation keywords.
By building on the scenario definitions and descriptions provided in the policy guidelines and integrating relevant domestic and international research, a mapping table of scenario keywords adapted for annual report texts was developed. As shown in Appendix C, the mapping table includes CKs, EKs, and NKs.
Given the continuous development and iteration of digital technologies and their applications in manufacturing, as well as the varying purposes of digital transformation evaluation, scenario definitions are not immutable. They may be adjusted according to specific evaluation objectives and prevailing conditions.

4.3. Evaluation Process

4.3.1. Calculation of Keyword Intensity

From the previously collected ten years of annual reports, the “Management Discussion and Analysis” (MD&A) sections were extracted for analysis. Since the reports are written in Chinese, the JIEBA segmentation tool was first applied to perform word segmentation and remove stop words. The predefined keyword mapping table was then loaded, and weights of 1.0, 0.6, and −0.8 were assigned to CKs, EKs, and NKs, respectively. For each of the six manufacturing scenarios, research and design (S1), production and manufacturing (S2), business management (S3), operation and maintenance services (S4), supply chain management (S5), and cross-linkage collaboration (S6), the frequency of each keyword appearing in the corresponding scenario was calculated, and the total number of identified keywords within each scenario was recorded. Finally, by applying Equations (A2) and (A3) in Appendix B, the keyword frequencies, counts, and weights were integrated to obtain the evaluation values of scenario coverage and depth.
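The counting step can be sketched as follows. In the actual pipeline the tokens would come from JIEBA segmentation of the MD&A text (e.g., `jieba.lcut(text)`); here a pre-segmented token list keeps the sketch self-contained, and the stop-word set and keyword table are illustrative.

```python
from collections import Counter

STOP_WORDS = {"的", "了", "和"}  # illustrative stop-word set

def preprocess(tokens):
    """Drop stop words and empty tokens from a segmented MD&A text."""
    return [t for t in tokens if t not in STOP_WORDS and t.strip()]

def scenario_counts(tokens, keyword_table):
    """Per-scenario total keyword frequency and distinct-keyword
    count: the raw quantities that the weights are combined with."""
    counts = Counter(preprocess(tokens))
    result = {}
    for scenario, keywords in keyword_table.items():
        freq = sum(counts[k] for k in keywords)
        hits = sum(1 for k in keywords if counts[k] > 0)
        result[scenario] = (freq, hits)
    return result

# Toy MD&A fragment: "数字化" (digitalization) appears twice,
# "工厂" (factory) once; "机器人" (robot) does not appear.
tokens = ["数字化", "的", "工厂", "数字化", "了"]
table = {"S2": ["数字化", "工厂", "机器人"]}
result = scenario_counts(tokens, table)
```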

4.3.2. Calculation of Semantic Similarity

As noted above, computing semantic similarity based on keywords involves three steps: semantic anchor understanding, text semantic vectorization, and semantic similarity computation. Since the case study entails evaluating semantic similarity in Chinese annual reports of manufacturing enterprises, the text structure is relatively stable, the anchor keywords have clear semantic meanings, and noise interference is limited. Therefore, employing a single model to perform semantic anchor understanding, text semantic vectorization, and semantic similarity computation in a unified manner provides a cost-effective solution.
To compare the results of different language models, three representative sentence embedding models were selected for the semantic similarity evaluation experiments: SimCSE-chinese-roberta-wwm-ext, paraphrase-multilingual-MiniLM-L12-v2, and text2vec-base-chinese. These models differ in network architecture, pre-training corpora, and training objectives, thereby reflecting their performance in semantic alignment tasks from different perspectives.
By introducing models with different architectures and training mechanisms for comparative analysis and examining the evaluation results from multiple perspectives, such as inter-model differences, the robustness and generalizability of the evaluation method can be validated, which enhances the reliability of the overall analysis and the persuasiveness of the conclusions.
Based on the annual report data of the manufacturing enterprise group described in Section 4.1, the three Chinese sentence embedding models were applied to evaluate digital transformation levels across six typical manufacturing scenarios (S1–S6) over the ten-year period from 2014 to 2023. The scenarios include research and design (S1), production and manufacturing (S2), business management (S3), operation and maintenance services (S4), supply chain management (S5), and cross-linkage collaboration (S6). This process yielded three score matrices of size 10 × 6. The evaluation results are aggregated at the “year–scenario” level, resulting in a relatively small outcome matrix. However, each value is derived from a large corpus of annual report texts, ensuring that the results capture macro-level trends. The evaluation results are analyzed below.

5. Evaluation Results and Trend Analysis

Although the primary contribution of this study is methodological—developing a scenario-based DTI/ODTI evaluation framework—measurement research should also demonstrate that the resulting index behaves in theoretically meaningful ways. Therefore, the empirical patterns reported in this section are presented descriptively, but they also serve as construct-validation evidence rather than purely narrative description. In particular, we examine whether (i) the temporal evolution of DTI/ODTI is consistent with widely discussed diffusion and deepening processes of digital transformation, and (ii) the observed scenario-level dynamics correspond to staged advancement of intelligent manufacturing in Zhejiang Province. We also report cross-model consistency as reliability-oriented validation evidence.

5.1. Comparative Analysis Across Models

Table 3 and Figure 2 present the ten-year average scores of the three models across the six evaluation scenarios (S1–S6). The results indicate noticeable differences in score ranges among the models. The Text2Vec model consistently achieves higher scores, with the highest score in S3 (0.5415) and the lowest in S6 (0.1188). In contrast, the SimCSE and Paraphrase models exhibit relatively low average scores, primarily distributed between 0.07 and 0.45. Table 4 and Figure 3 show the time series of the overall digital transformation evaluation index (ODTI) from 2014 to 2023.
From a trend perspective, although the three models exhibit systematic shifts in their absolute scores, their evaluation results demonstrate a high degree of consistency in relative temporal patterns (see Table 4 and Figure 3). Specifically, the average annual scores of all three models gradually increased from 2014 to 2020, followed by a pronounced rise from 2020 to 2021. In the subsequent two years, the scores of all models fluctuated slightly but remained relatively stable, without any clear trend divergence. These consistent temporal dynamics suggest that, while the models differ in their scoring scales, they respond to semantic feature variations in a largely uniform direction when applied to the same time period.
A closer examination of the trends shown in Figure 4 across scenarios S1–S6 reveals that, despite differences in absolute scores due to model architecture and training strategies, SimCSE, Paraphrase, and Text2Vec exhibit highly consistent temporal patterns. Notably, all three models peaked in S1 in 2021, with scores remaining high but slightly declining thereafter; a comparable surge was observed for S3 in the same period. These results indicate that mainstream Chinese sentence embedding models possess strong substitutability and cross-validation capabilities when applied to structured semantic texts such as corporate annual reports, reinforcing the reliability of similarity-based approaches for index modeling, trend detection, and intelligent classification and validating the effectiveness of the proposed methodology.
Altogether, the high consistency in relative temporal patterns across models provides reliability-oriented validation evidence for the proposed framework, suggesting that the observed trends are not driven by a particular embedding model.

5.2. Overall Analysis of Digital Transformation Trends Across Enterprise Scenarios

To comprehensively capture the digital semantic expression across different business segments in corporate annual reports, the annual trend curves of six scenarios were plotted based on the overall evaluation values produced by the SimCSE, Paraphrase, and Text2Vec models (see Figure 5).
S1 (R&D design) remained at a relatively low level (0.2–0.4) between 2014 and 2020, indicating that digital transformation in R&D was still in its early stage. After 2021, scores rose sharply and remained at their highest levels through 2023, reflecting the concentrated release of investments and achievements in R&D-related digital transformation in annual reports. This trend suggests that R&D has recently become critical for breakthroughs in manufacturing enterprises.
S2 (Production and manufacturing) followed a steady upward trajectory, markedly accelerating after 2019 and reaching nearly 0.5 in 2023. This indicates that production, as the core domain of digital transformation, is progressively advancing from equipment upgrades toward more systematic transformation.
S3 (Business management) remained at a relatively high level throughout the period and rose rapidly after 2020, surpassing 0.7. This reflects the increasing recognition and application of digital value in strategic planning and resource allocation by management, positioning this scenario as a leading driver of digital transformation.
S4 (Operations and maintenance services) remained low overall but grew slowly, reaching only about 0.3 in 2023. This suggests that although technologies such as remote operations and intelligent maintenance are being gradually adopted, digital transformation in this domain has yet to see widespread diffusion.
S5 (Supply chain management) displayed a nonlinear trajectory, with relatively high scores during 2014–2016 but a decline to the lowest point in 2020, followed by stabilization. This pattern may be influenced by the increasing external complexity of supply chains and rising challenges in upstream–downstream coordination, highlighting the instability and lack of standardized expressions in supply chain digital transformation.
S6 (Cross-process collaboration) consistently had the lowest scores, with limited growth and poor interannual consistency. This indicates that cross-departmental and inter-firm data integration and business collaboration remain major challenges in digital transformation, requiring further breakthroughs in organizational mechanisms and technical standards.

5.3. Correspondence Between Enterprise Digital Maturity Trends and Stages of Intelligent Manufacturing Advancement in Zhejiang Province

As shown in Figure 6, the trends in the average digital transformation evaluation scores of the six key scenarios (S1–S6) for manufacturing enterprises from 2014 to 2023 reveal that digital semantic expression in annual reports is evolving in alignment with the staged advancement path of intelligent manufacturing in Zhejiang Province. This finding indicates that the evaluation based on semantic models not only captures the internal evolution of linguistic standardization within enterprises but also serves as an external mapping tool for assessing the effectiveness of regional digital transformation initiatives.
Phase I (2014–2017): National and provincial initiatives emphasized informatization and automation, such as the Machine Substitution program (2013) and Made in China 2025 (2015). Correspondingly, S1 (R&D) and S3 (management) showed early and stable improvements, indicating initial progress in digitalized R&D processes and management systems. However, S2 (production) did not exhibit a significant increase, suggesting lagging digital transformation in manufacturing.
Phase II (2018–2020): Policies focused on technological integration and system development, exemplified by the Intelligent Manufacturing Action Plan (2018–2020) and the provincial program on industrial internet platform construction (2019). The model results, however, show only limited gains in S1 and S2, while S4 (operations) steadily improved and S5 (supply chain) declined before stabilizing. This reflects the organizational complexity and external disruptions constraining digital transformation in production and collaboration scenarios.
Phase III (2021–2023): Policy emphasis shifted toward system integration and deeper digital application, as evidenced by the Opinions on Accelerating the Construction of “Future Factories” issued by Hangzhou (2021) and the national Industrial Internet Innovation and Development Action Plan (2021–2023). During this period, core scenarios S1, S2, S3, and S4 experienced substantial increases, reflecting the accumulated effects of earlier policy efforts. S5 (supply chain) rebounded slightly, but S6 (cross-process collaboration) remained at a low level, underscoring the persistent challenges of integration and inter-organizational collaboration.
Overall, the ten-year trend analysis reveals that the enterprise digital maturity in Zhejiang Province evolves following a distinct staged trajectory: beginning with internal management as a priority, progressing through production and operations, and advancing toward breakthroughs in system-wide collaboration. The digital and intelligent transformation of manufacturing enterprises is gradually shifting from isolated digital transformation of individual processes to holistic process-wide collaboration. This correspondence provides theory-consistent validation that the proposed index captures meaningful staged evolution rather than arbitrary narrative variation.

5.4. Other Issues Identified in the Ten-Year Scenario-Based Digital Transformation Evaluation

(1)
Variations in the Clarity of Digital Transformation Expressions across Business Scenarios
The scoring trends and semantic analysis reveal marked differences in the clarity of digital transformation expressions across business scenarios. These differences are evident not only in semantic density but also in the degree of linguistic standardization and the recognizability of technical terminology. Scenarios such as S2 (production) and S6 (cross-process collaboration) consistently fall within the lower score range, not solely due to weak enterprise capabilities but, more likely, because of ambiguous expression combined with model adaptation challenges.
For S2, persistently lower scores can be attributed to two main factors. On the one hand, production systems are largely internal infrastructure and thus are rarely elaborated upon in annual reports. On the other hand, terminology in this domain is highly heterogeneous and industry-specific, with little semantic standardization, making it difficult for semantic models to capture accurately. Furthermore, unlike strategic or management modules that benefit from a more communicable narrative logic, production-related practices lack systematic and standardized linguistic representation, hindering the extraction of underlying capabilities.
Collaboration capability, meanwhile, is widely regarded in both theory and policy as a hallmark of digital maturity—particularly since the industrial internet and platform-based collaboration have become strategic priorities. Yet, the results show that S6 scores have remained low over the decade, without sustained upward movement. This highlights the inherent difficulty of expressing cross-process collaboration in textual form. Its semantics involve multi-system and multi-organization interactions, with key expressions scattered across contexts such as “supply chain integration,” “system interconnection,” and “upstream–downstream coordination.” These elements lack standardized terminology, rely heavily on contextual inference and cross-paragraph linkage, and thus are not readily captured in semantic similarity calculations. Moreover, cross-process collaboration often entails organizational restructuring, data-sharing protocols, and platform-based mechanisms that are difficult to implement and slow to diffuse, making them less likely to be explicitly reported in annual documents.
Taken together, the persistently low scores of S2 and S6 reflect both limitations in enterprise reporting practices and methodological challenges in handling terminological diversity, semantic fragmentation, and dispersed logic. This suggests that future research should focus on dynamic keyword modeling, semantic standardization, and enhanced adaptation mechanisms for complex scenarios, thereby improving the model’s ability to capture and represent digital transformation across diverse business contexts.
(2)
Time Lag between Policy Priorities and the Transformation Effects Reflected in Textual Expressions
Although the overall scoring trends broadly align with Zhejiang’s trajectory of intelligent manufacturing, there is a noticeable temporal lag between policy deployment and enterprise textual expressions. In Phase I (2014–2017), S3 (management) and S1 (R&D) improved first, reflecting the orientation toward informatization, while S2 (production) did not respond to the goal of intelligent transformation. In Phase II (2018–2020), despite policies promoting ERP–MES integration and industrial internet platforms, gains in S1 and S2 remained limited. Only in Phase III (2021–2023) did the scores of core scenarios rise sharply, with S1 and S3 reaching high levels. This suggests that the semantic reflections of digital transformation in annual reports are not immediate responses, but typically lag behind policy pilots and system deployment, closely linked to the accumulation of prior technological efforts. The findings further indicate a consistent 1–2-year “policy-first, expression-lagging” window, reflecting a multi-level trajectory of digital transformation from institutional design to organizational practice and finally to external disclosure. This also highlights the proposed method’s sensitivity in structural alignment and phase detection, meaning that the enterprise transformation pace and strategic consistency can be inferred from publicly available texts.

5.5. Sensitivity and Robustness Analysis

To evaluate the robustness and reliability of the proposed DTI/ODTI framework, we conducted a sensitivity analysis in three ways: cross-model robustness, keyword-weight sensitivity, and external validity through comparison with benchmark enterprises.

5.5.1. Cross Model Robustness

Three representative sentence embedding models were used in this article: SimCSE-chinese-roberta-wwm-ext, paraphrase-multilingual-MiniLM-L12-v2, and text2vec-base-chinese. Table 4 and Figure 3 show the ODTI from 2014 to 2023 for these three models.
Spearman correlation coefficients showed highly consistent temporal trends among the three models, with values ranging from ρ = 0.98 to 0.99 (p < 0.01), indicating stable alignment in their temporal evaluations. The overall ranking pattern was also consistent, with Text2Vec consistently yielding the highest scores, Paraphrase in the middle, and SimCSE the lowest. This demonstrates that the framework is robust across different models.
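For series without tied values, such as the yearly ODTI sequences compared here, the Spearman coefficient reduces to the classic rank-difference formula; a minimal sketch:

```python
def spearman_rho(x, y):
    """Spearman rank correlation via the rank-difference formula
    (valid when neither series contains ties)."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, idx in enumerate(order):
            r[idx] = rank
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))
```

Two score series that move monotonically together yield ρ = 1 regardless of their absolute scales, which is why Spearman correlation is a suitable check of relative-trend agreement between models whose scoring ranges differ.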

5.5.2. Sensitivity to Keyword Weights

To assess the DTI’s sensitivity to keyword weights, two tests were conducted:
(1)
Single-factor perturbation (±20% weight change):
The weights of each keyword category were adjusted by ±20%; the experimental results are shown in Figure 7. Adjusting the weight of core keywords by ±20% led to significant changes in evaluation scores across all scenarios. In contrast, changes in the weights of extended and negative keywords had minimal effects, indicating that the framework is highly sensitive to the weight of core keywords, while the other categories have little impact on the final score.
(2)
Extreme Weight Configurations
Several extreme weight configurations were also tested, including core-keyword-dominant (CK-dominant), balanced weights (Equivalent), and extended-keyword-dominant (EK-dominant). The evaluation results corresponding to these configurations are presented in Figure 8.
The CK-dominant configuration produced stable results, similar to the baseline, while the EK-dominant configuration led to significant score reductions, particularly in the R&D scenario. This suggests that core keywords have the greatest influence, while extended keywords contribute little in the current dataset.
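Both tests amount to recomputing the intensity score with modified weights. The sketch below illustrates the ±20% single-factor perturbation, using the baseline weights from Section 4.3.1 (1.0, 0.6, −0.8) and illustrative keyword frequencies.

```python
BASELINE = {"CK": 1.0, "EK": 0.6, "NK": -0.8}  # weights from Section 4.3.1

def intensity(freqs, weights):
    """Weighted intensity score given per-category frequencies."""
    return sum(weights[c] * freqs.get(c, 0) for c in weights)

def perturb(freqs, category, factor):
    """Score after scaling one category's weight by `factor`,
    holding the other categories at their baseline weights."""
    w = dict(BASELINE)
    w[category] *= factor
    return intensity(freqs, w)

freqs = {"CK": 10, "EK": 6, "NK": 1}     # illustrative frequencies
base = intensity(freqs, BASELINE)        # 10.0 + 3.6 - 0.8
ck_up = perturb(freqs, "CK", 1.2)        # core-keyword weight +20%
ek_up = perturb(freqs, "EK", 1.2)        # extended-keyword weight +20%
```

Because core keywords dominate both the weight and the frequency counts, the same ±20% relative change moves the score far more through CK than through EK or NK, mirroring the sensitivity pattern reported above.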

5.5.3. External Validity: Benchmark Enterprises vs. Other Firms

To validate the external applicability of the framework, we compared the DTI/ODTI scores of seven demonstration enterprises selected from the Future Factory Cultivation List in Zhejiang Province (2021) against the industry average from 2017 to 2023.
Figure 9 and Figure 10 show that the DTI/ODTI scores for these enterprises consistently exceeded the industry average, confirming that the framework effectively identifies companies with advanced digital transformation capabilities.
Collectively, the sensitivity and robustness analyses demonstrate that the DTI/ODTI framework is reliable across different models, sensitive primarily to the weighting of core keywords, and capable of distinguishing leading digital transformation firms through external validation. These results highlight the practical applicability and external validity of the framework in real-world settings.

6. Discussion

6.1. Summary of Methods

The methodological contributions of this study are threefold. First, it introduces “application scenarios” as the fundamental unit of analysis, going beyond traditional approaches that rely on business processes or manually defined indicators. This design is more closely aligned with the practical logic of digital transformation in manufacturing enterprises and enables the identification of heterogeneous trajectories across different functional domains. While its application was demonstrated in the manufacturing sector, the framework can also be adapted to other industries for digital transformation assessment. Second, the method incorporates a “core–extended–excluded” keyword mechanism, which directs the semantic embedding process for unstructured texts. This mechanism not only enhances the ability to capture target concepts but also increases the transparency and interpretability of the evaluation, with adjustable weighting parameters to accommodate different analytical needs. By maintaining semantic accuracy while significantly improving interpretability and traceability, it provides a replicable pathway for subsequent semantic analysis. Third, the framework’s stability was further validated through multi-model comparison. Comparative experiments using three structurally distinct models—SimCSE, Paraphrase, and Text2Vec—demonstrated strong consistency in temporal trends (average Spearman correlation = 0.89), despite differences in raw scores. This indicates the cross-model generalizability of the approach and lays a foundation for its application in multilingual, cross-industry, and multi-task environments.
In sum, the proposed framework introduces methodological innovations in scenario-based structuring, keyword-enhanced semantic modeling, and cross-model validation, providing a generalizable and scalable tool for quantifying enterprise digital transformation. Moreover, the framework has the potential to be extended beyond manufacturing to applications across industries and international contexts.

6.2. Limitations and Future Improvements

Although the evaluation method proposed in this study performs well in trend identification and cross-model stability, several limitations remain. The keyword system heavily relies on expert knowledge, which limits its adaptability to emerging fields and rapidly evolving terminology. For example, new technical terms may not be recognized, leading to underestimated evaluation results. Additionally, keywords are typically short phrases with sparse semantics, which can cause misalignment when matching them to long, complex annual report texts, especially in cross-paragraph and cross-functional contexts. The framework also uses a single semantic model for keyword parsing, text encoding, and similarity computation, resulting in high task coupling and limiting scalability as corpus size or domain diversity increases. Finally, the method relies on static annual reports, which often lag behind actual organizational practices, potentially underestimating recent progress in digital transformation efforts.
In addition, the framework does not integrate operational data, such as IoT or MES logs, which could offer more granular, real-time insight into digital transformation efforts; integrating such proprietary data presents significant challenges, particularly in industries with heterogeneous data formats and structures. Moreover, the LLM-based semantic similarity approach may face difficulties when applied to emerging industries or cross-regional contexts. The current study draws on a dataset from the manufacturing sector in Zhejiang Province, and applying the method to other industries or regions would likely require substantial re-engineering of the keywords and adjustments to the model. This limits the framework’s generalizability and its ability to handle industry- or region-specific terminology.
Given these limitations, future research could focus on the following areas:
  • Automatic keyword generation and semantic lexicon expansion: develop automatic mechanisms for keyword and semantic lexicon expansion, such as topic modeling and knowledge graph-based methods, to reduce dependence on manually curated keywords and improve the framework’s adaptability to emerging and unstandardized terminology.
  • Context-aware and document-level modeling: incorporate context-aware mechanisms, including paragraph- and document-level embeddings, to enhance the framework’s ability to recognize implicit statements and capture cross-paragraph relationships, especially in complex, multi-functional scenarios.
  • Model architecture optimization: explore modular architectures in which different models handle specific tasks, such as anchor extraction, text encoding, and semantic matching; integrating graph neural networks or attention mechanisms to construct semantic dependency graphs could further improve the scalability and consistency of the evaluation process.
  • Integration of real-time and operational data: incorporate real-time data sources, such as quarterly reports, social media, and IoT logs, to provide more up-to-date insights into digital transformation efforts; time-series forecasting models (e.g., ARIMA or LSTM) could also be used to predict future trends and adjust evaluations for delays in policy implementation and reporting.
In conclusion, addressing these limitations through future optimization measures will enhance the method’s adaptability, scalability, and applicability across diverse industries, improving its effectiveness in digital transformation assessment and policy monitoring.

Author Contributions

Methodology, Q.L. and X.J.; Validation, Q.L.; Data curation, X.J.; Writing—review and editing, Q.L.; Supervision, Q.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Calculation of Semantic Similarity

  • Step 1: The predefined scenario keywords are input into the LLM to obtain word-vector representations, which serve as semantic anchors.
  • Step 2: The LLM performs context-aware encoding of the target text. Specifically, the annual report text under analysis is input into the LLM to generate high-dimensional semantic vectors that capture contextual relevance.
  • Step 3: Cosine similarity is applied to calculate the similarity between the semantic anchors and the target text vectors. The results are normalized to a range between 0 and 1. The cosine similarity is computed as follows:
$$\mathrm{Sim}(T, K_i) = \frac{T \cdot K_i}{\lVert T \rVert \, \lVert K_i \rVert},$$
where “$\cdot$” denotes the dot product between vectors and $\lVert \cdot \rVert$ the Euclidean norm of a vector. Here, $K_i$ denotes the semantic anchor of the $i$th scenario keyword and $T$ the target text vector under comparison, with $\mathrm{Sim}(T, K_i) \in [-1, 1]$.
To standardize the measurement results, the cosine similarity is further transformed into a semantic relevance score normalized to the range [0, 1], using the following formula:
$$SS_i = \frac{1 + \mathrm{Sim}(T, K_i)}{2}.$$
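As a minimal sketch of Steps 1–3, the fragment below computes the cosine similarity between an anchor vector and a text vector and maps it to the $[0, 1]$ relevance score. The low-dimensional vectors are hypothetical stand-ins; real embeddings would come from the chosen LLM encoder.

```python
import math

def cosine_similarity(a, b):
    # Sim(T, K_i) = (T . K_i) / (||T|| ||K_i||), in [-1, 1].
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def relevance_score(sim):
    # SS_i = (1 + Sim) / 2, normalized to [0, 1].
    return (1 + sim) / 2

# Hypothetical low-dimensional stand-ins for LLM embeddings.
anchor = [0.2, 0.7, 0.1]  # semantic anchor for a scenario keyword
text   = [0.3, 0.6, 0.0]  # encoded annual-report passage
sim = cosine_similarity(anchor, text)
print(round(relevance_score(sim), 3))
```

The normalization matters downstream: a score of 0.5 corresponds to orthogonal (unrelated) vectors, while values near 1 indicate strong semantic alignment with the anchor.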

Appendix B. Calculation of the Scenario Coverage and Scenario Depth

  • $S_i$: the $i$th scenario, $i = 1, \ldots, n_s$, where $n_s$ is the number of scenarios.
  • $CK_{ij}$: word frequency of the $j$th core keyword (CK) in the $i$th scenario, $j = 1, \ldots, ck_i$, where $ck_i$ is the number of core keywords (CK) defined for the $i$th scenario.
  • $EK_{ij}$: word frequency of the $j$th extended keyword (EK) in the $i$th scenario, $j = 1, \ldots, ek_i$, where $ek_i$ is the number of extended keywords (EK) defined for the $i$th scenario.
  • $NK_{ij}$: word frequency of the $j$th negative keyword (NK) in the $i$th scenario, $j = 1, \ldots, nk_i$, where $nk_i$ is the number of negative keywords (NK) defined for the $i$th scenario.
  • $W_{CK}$: weight of the core keywords (CK), $W_{CK} \in (0, 1]$;
  • $W_{EK}$: weight of the extended keywords (EK), $W_{EK} \in (0, 1)$, with $W_{CK} > W_{EK}$;
  • $W_{NK}$: weight of the negative keywords (NK), $W_{NK} \in [-1, 0)$.
The keyword weights can be flexibly defined within these ranges as needed. For example, the weight of a core keyword may be set to 1, an extended keyword to 0.8, and a negative keyword to −0.8.
For the $i$th scenario, scenario coverage $SC_i$ (the number of keywords appearing in the text divided by the total number of keywords, expressed as a percentage) is calculated with Equation (A2):
$$SC_i = \frac{\sum_{j=1}^{ck_i} \frac{CK_{ij}}{CK_{ij}} + \sum_{j=1}^{ek_i} \frac{EK_{ij}}{EK_{ij}} + \sum_{j=1}^{nk_i} \frac{NK_{ij}}{NK_{ij}}}{ck_i + ek_i + nk_i} \times 100\%, \qquad CK_{ij}, EK_{ij}, NK_{ij} \ge 0,$$
where each ratio equals 1 if the corresponding keyword appears in the text (frequency greater than zero) and is taken as 0 otherwise.
Scenario depth $SD_i$ (the weighted sum of keyword frequencies divided by the core-weighted total keyword frequency, expressed as a percentage) is calculated with Equation (A3):
$$SD_i = \frac{\sum_{j=1}^{ck_i} W_{CK} \, CK_{ij} + \sum_{j=1}^{ek_i} W_{EK} \, EK_{ij} + \sum_{j=1}^{nk_i} W_{NK} \, NK_{ij}}{W_{CK} \left( \sum_{j=1}^{ck_i} CK_{ij} + \sum_{j=1}^{ek_i} EK_{ij} + \sum_{j=1}^{nk_i} NK_{ij} \right)} \times 100\%, \qquad CK_{ij}, EK_{ij}, NK_{ij} \ge 0.$$
Scenario consistency $SS_i$ is calculated with Equation (A4):
$$SS_i = \frac{1 + \mathrm{Sim}(T, K_i)}{2}.$$
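The three indicators can be sketched as follows, using hypothetical keyword frequencies for a single scenario and the illustrative weights given above (1, 0.8, −0.8); the cosine similarity feeding Equation (A4) is assumed to be precomputed.

```python
def coverage(freqs, totals):
    # Equation (A2): share of defined keywords that appear at least once.
    detected = sum(1 for group in freqs for f in group if f > 0)
    return detected / sum(totals) * 100

def depth(freqs, weights):
    # Equation (A3): weighted frequencies over the core-weighted total frequency.
    weighted = sum(w * f for w, group in zip(weights, freqs) for f in group)
    total = weights[0] * sum(f for group in freqs for f in group)
    return weighted / total * 100

def consistency(sim):
    # Equation (A4): normalized semantic similarity, in [0, 1].
    return (1 + sim) / 2

# Hypothetical frequencies for one scenario: core, extended, negative keywords.
ck = [3, 1, 0]        # 2 of 3 core keywords detected
ek = [2, 0]           # 1 of 2 extended keywords detected
nk = [1]              # 1 of 1 negative keywords detected
w = (1.0, 0.8, -0.8)  # W_CK, W_EK, W_NK (illustrative values)

sc = coverage([ck, ek, nk], [len(ck), len(ek), len(nk)])
sd = depth([ck, ek, nk], w)
ss = consistency(0.62)  # assumed precomputed cosine similarity
print(round(sc, 1), round(sd, 1), round(ss, 2))
```

Note how the negative keyword pulls the depth score down while still counting toward coverage: coverage measures breadth of mention, while depth rewards substantive (core-weighted) engagement.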

Appendix C. Scenario Keywords

For each scenario, the corresponding core keywords, extended keywords, and negative keywords are listed below.
R&D design
Core keywords: Digital Twin; Virtual Reality (VR); Augmented Reality (AR); Mixed Reality (MR); 3D Optimization Design; Computer-Aided Design (CAD); Digital Mock-Up (DMU); Model-Driven Design (MDD); Product Platform Design; Virtual Process Simulation; Digital Factory; Numerical Control (NC); Product Lifecycle Management (PLM); Model-Based Systems Engineering (MBSE); Text Mining; Heterogeneous Data; Digital Communication; Digital Control; Digital Network; Image Understanding; Semantic Search; Visual Recognition; 3D Printing/Additive Manufacturing; 3D Technology; Digital Thread; Parametric Design; Topology Optimization; Simulation and Verification Platform; Multiphysics Coupling Analysis; Modular Architecture; Collaborative Design Platform; Virtual Verification Platform; Design Knowledge Base; Reverse Engineering Technology; Lightweight Design; Thermodynamic Simulation; Computational Fluid Dynamics (CFD) Analysis; Electronic Design Automation (EDA); Materials Database; Digital Prototype Iteration
Extended keywords: Investment in R&D Digital transformation; Investment in Digital Mock-Up (DMU) Development; Investment in Reverse Engineering; Optimization of R&D Expense Ratio; Capitalization of R&D Expenses; R&D Digital transformation Strategy; R&D Innovation Incentive Mechanism; R&D Talent Reserve; Industry-Academia-Research Collaboration Achievements; Virtual Simulation Capability Building; Process Simulation Platform Development; Digital Twin Verification; Thermodynamic Simulation Capability; Simulation Verification Coverage; Improvement of Simulation Confidence; Cross-Platform Collaborative Development; Application of Intelligent Design Tools; Modular Design System; Application of Parametric Design; Forward Design Capability; Exploration of Generative Design; Application of Intelligent Correction Systems; Application of Smart Materials; R&D Cycle Reduction Indicator; Design Achievement Conversion Rate; Intellectual Property Commercialization; Patent Conversion of R&D Achievements; Improvement of Design Reuse Rate; Design Asset Reuse Rate; Improvement of R&D Efficiency; Design Iteration Speed; Enhancement of R&D Knowledge Base; Multidisciplinary Collaboration Mechanism; R&D Process Reengineering; Design Standardization Rate; 3D Model Coverage Rate; Electronic Design Automation (EDA); Digital transformation Rate of R&D Equipment; Breakthrough in Lightweight Design; Integration of the Digital Thread
Negative keywords: Insufficient R&D Investment; Outdated Design Tools; Lack of Simulation and Verification; Dependence on 2D Drawings; High Cost of Physical Prototypes; Low Design Reuse Rate; Weak Intellectual Property Commercialization; Insufficient Cross-Department Collaboration; Excessive R&D Cycle Time; Lack of Forward Design Capability; Limited Application of Parametric Design; Absence of Modular Design; Fragmented Knowledge Management; Weak Reverse Engineering Capability; Insufficient Thermodynamic Verification; Lack of Materials Database; Absence of Industry-Academia-Research Collaboration; Non-Standardized R&D Processes; Frequent Design Changes; Imbalanced R&D Expense Ratio; Shortage of Digital Talent; Lagging Lightweight Technology; Insufficient Simulation Confidence; Dependence on Outsourced Electronic Design; Lack of Innovation Incentive Mechanism; Low Patent Output Efficiency; Low Process Simulation Coverage; Obsolete R&D Equipment; Disconnected Digital Thread; Absence of Intelligent Correction Systems; Insufficient Design Asset Reuse; Lack of Collaborative Development Platform; Low R&D Achievement Conversion Rate; Delayed Verification Platform Development; Insufficient Multidisciplinary Coupling; Low Design Standardization Rate; Lag in R&D Process Digital transformation; Slow Design Iteration Speed; Lack of Intelligent Material Selection; Unclear R&D Strategy
Production and manufacturing
Core keywords: Intelligent Manufacturing; Smart Factory; Industrial Robots; Machine Substitution for Labor; Human-Machine Collaboration; Automated Production Line; Manufacturing Execution System (MES); Distributed Control System (DCS); Production Process Optimization; Intelligent Quality Inspection; Automatic Monitoring; Automated Inspection; Automated Production; Flexible Manufacturing; Precision Manufacturing; Lean Production; Industry 4.0; Industrial Cloud; Digital Factory; Virtual Manufacturing; Lighthouse Factory; Future Factory; Intelligent Fault Diagnosis; Intelligent Production Line; Smart Workshop; Self-Optimizing Process Parameters; Adaptive Machining System; Production Line Digital Twin; Real-Time Energy Consumption Regulation; Automated Defect Classification; Equipment OEE Improvement; Intelligent Work Order Scheduling; Optimization of Material Readiness Rate; Process Knowledge Graph; Error-Proofing and Traceability System; Production Takt Balancing; Predictive Equipment Maintenance; Human-Machine Safety Interlock; Mold Life Prediction; Energy-Carbon Collaborative Management
Extended keywords: Investment in Smart Factory Development; Improvement of Equipment Connectivity Rate; Effectiveness of Production Process Optimization; Progress of Automation Upgrades; Flexible Manufacturing Capability Development; Enhancement of Quality Traceability System; Digital Energy Consumption Management; Production Line OEE Improvement; Self-Optimizing Process Parameters; Predictive Maintenance Coverage Rate; Application of Digital Twin Factory; Lighthouse Factory Certification; Completion Rate of Machine Substitution for Labor; Coverage Rate of Smart Warehousing; Optimization of Production Takt; Accumulation of Process Knowledge; Standardization of Equipment Interconnection Protocols; Benchmarking Management of Energy Consumption; Reduction in Carbon Emission Intensity; Scrap Recycling Utilization Rate; Intelligent Manufacturing Demonstration Projects; Industrial Robot Density; Breakthrough of Process Bottlenecks; Application of Dynamic Scheduling System; Flexibility of Production Line Reconfiguration; Intelligent Mold Management; Rapid Changeover Rate of Tooling; Tool Life Prediction System; Effectiveness of Energy-Carbon Collaboration; Coverage Rate of Intelligent Inspection; Closed-Loop Rate of Quality Abnormalities; Digital transformation of Process Compliance; Equipment Health Management Platform; Automation of Production Reporting; Deepening of Manufacturing Execution System (MES); Preventive Maintenance Rate of Equipment; Material Readiness Early Warning System; Level of Production Visualization; Standardization Rate of Process Packages; Return on Investment in Intelligent Manufacturing
Negative keywords: Low Equipment Automation Rate; Dependence on Experience for Process Parameters; Incomplete Quality Traceability; Extensive Energy Consumption Management; Insufficient Equipment Connectivity Rate; Lack of Production Line Flexibility; Absence of Process Knowledge Accumulation; Unstable Production Takt; Insufficient Digital transformation of Mold Management; Low Efficiency of Tooling Changeover; Uncontrollable Tool Wear; Unmonitored Carbon Emissions; Absence of Scrap Recycling System; Equipment OEE Below Industry Benchmark; Lack of Preventive Maintenance; Fluctuating Material Readiness Rate; Lagging Production Visualization; Manual Inspection for Process Compliance; Absence of Equipment Health Management; Insufficient Dynamic Scheduling Capability; Inadequate Investment in Intelligent Manufacturing; Low Industrial Robot Density; Delayed Lighthouse Factory Development; Absence of Digital Twin Applications; Slow Progress in Machine Substitution for Labor; Low Coverage of Smart Warehousing; Slow Response to Quality Abnormalities; Absence of Energy-Carbon Collaboration Mechanism; Persistent Process Bottlenecks; Manual Preparation of Production Reports; Shallow Application of Manufacturing Execution System (MES); Fragmented Equipment Interconnection Protocols; Absence of Energy Consumption Benchmarking System; Non-Compliance of Carbon Emission Intensity; Low Scrap Recycling Rate; Poor Flexibility of Production Reconfiguration; Non-Standardization of Process Packages; Absence of Intelligent Inspection Equipment; Insufficient Closed-Loop Rate of Quality Issues; ROI of Intelligent Manufacturing Below Expectations
Operations and maintenance services
Core keywords: Online Equipment Monitoring and Maintenance; Predictive Maintenance; Remote Energy Consumption Monitoring; Safety and Environmental Monitoring and Supervision; Intelligent Operation and Maintenance (O&M); Intelligent Customer Service; Smart Wearables; Smart Environmental Protection; Smart Transportation; Smart Healthcare; Internet of Things (IoT); Industrial Internet; Cyber-Physical Systems (CPS); Edge Computing; Remote Diagnosis; Aftermarket Services; Product Lifecycle Management (PLM); Fault Prediction; Smart Terminals; Digital Twin; Vibration Spectrum Analysis; Lubricant Condition Monitoring; Corrosion Rate Assessment; Equipment Health Index; Automated Service Work Orders; AR-Based Remote Guidance; Intelligent Spare Parts Recommendation; Customer Usage Behavior Analysis; Optimization of Service Response Time; Maintenance Knowledge Base; Service Revenue Management; Equipment Residual Value Assessment; Extended Warranty Service Design; Visualization of Service Network; Customer Satisfaction Modeling; Service Cost Attribution; 3D Printing of Service Spare Parts; Digital transformation of Service Contracts; Service Resource Scheduling; Service Risk Early Warning
Extended keywords: Equipment Health Management System; Investment in Predictive Maintenance; Coverage Rate of Remote Diagnosis; Construction of Intelligent O&M Platform; Service Digital Transformation; Spare Parts Inventory Turnover Rate; Accuracy of Customer Profiling; Service Response Time Indicator; Application of AR-Based Remote Guidance; Enhancement of Maintenance Knowledge Base; Equipment Residual Value Assessment Model; Growth of Extended Warranty Service Revenue; Visualization of Service Network; Customer Satisfaction Modeling; Service Cost Attribution Analysis; Increase in Service Revenue Contribution; Equipment Migration Planning Capability; Digital transformation Rate of Service Contracts; Optimization of Service Resource Scheduling; Service Risk Early Warning System; Extension of Product Lifecycle; Service Revenue Growth Rate; Resolution Rate of Intelligent Customer Service; Contribution Rate of Aftermarket Services; Improvement of Service Gross Margin; Service Work Order Closed-Loop Rate; Equipment Performance Analysis Report; Standard Operating Procedures (SOP) for Services; Application of Service Scenario Simulation; Service Value Quantification System; Development of Service Digital Twin; Knowledge Management of Service Experience; Enhancement of Service Resource Profiling; Service Capability Evaluation Model; Timeliness of Service Anomaly Detection; Rate of Service Process Automation; Coverage Rate of Service Network; Compliance Rate of Service Response SLA; Investment in Service Digital Transformation; ROI Analysis of Intelligent O&M
Negative keywords: Predominance of Passive Maintenance; Absence of Predictive Maintenance; Weak Remote Diagnosis Capability; Lag in Service Digital transformation; Low Spare Parts Inventory Turnover; Rough Customer Profiling; Service Response Delays; Lack of AR Guidance Application; Fragmented Maintenance Knowledge; Absence of Equipment Residual Value Assessment; Low Penetration of Extended Warranty Services; Lack of Service Network Visibility; Absence of Customer Satisfaction Modeling; Unclear Service Cost Attribution; Low Contribution of Service Revenue; Absence of Equipment Migration Planning; Paper-Based Management of Service Contracts; Inefficient Service Resource Scheduling; Absence of Service Risk Early Warning System; Insufficient Contribution from Aftermarket Services; Low Coverage of Intelligent Customer Service; Low Service Work Order Closed-Loop Rate; Absence of Equipment Performance Analysis; Lack of Established Service SOP; Absence of Service Scenario Simulation; Inability to Quantify Service Value; Absence of Service Digital Twin; Lack of Service Experience Accumulation; Absence of Service Resource Profiling; Absence of Service Capability Evaluation; Slow Response to Anomaly Detection; Low Service Process Automation Rate; Insufficient Coverage of Service Network; Low SLA Compliance Rate; Insufficient Investment in Digital Transformation; Unclear ROI of Intelligent O&M; Continuous Decline in Service Gross Margin; Stagnant Growth of Service Revenue; Lack of Unified Service Standards; Absence of Customer Self-Service Portal
Business management
Core keywords: Business Intelligence (BI); Enterprise Resource Planning (ERP); Management Cockpit; Lean Management; Financial System; Customer Relationship Management (CRM); Data Visualization; Market Positioning; Profit Model; Sales Model; Human Resources System; Contract Management System; Document Management System; Office Automation (OA); Data Middle Platform; Management Information System (MIS); Precision Marketing; Customer Insights; Supply Chain Marketing; Smart Business District; Business Profit and Loss Simulation; Cash Flow Forecasting Model; Organizational Effectiveness Analysis; Strategy Map Decomposition; Quantification of Risk Appetite; Automated Compliance Auditing; Opportunity Funnel Management; Customer Segmentation and Profiling; Employee Competency Mapping; Meeting Efficiency Analysis; Travel Cost Attribution; Dynamic Budget Adjustment; Expense Control Rule Engine; Contract Performance Monitoring; Seal Usage Traceability; Knowledge Base Search Optimization; Process Mining; Decision Tree Analysis; Executive Dashboard; Organizational Resilience Assessment
Extended keywords: Strategy Execution Visualization; Organizational Effectiveness Digital transformation; Process Mining Coverage Rate; Intelligent Decision Support; Intelligent Risk Early Warning; Digital Compliance Auditing; Intelligent Opportunity Discovery; Customer Value Segmentation; Employee Capability Digital transformation; Meeting Efficiency Analysis; Optimization of Business Travel Models; Intelligent Budgeting; Expense Control Rule Engine; Intelligent Contract Review; Digital Seal Supervision; Intelligent Knowledge Retrieval; Process Bottleneck Diagnosis; Digital Management Dashboard; Organizational Resilience Assessment; Strategic Objective Decomposition; Risk Appetite Modeling; Compliance Knowledge Graph; Opportunity Value Assessment; Customer Churn Prediction; Employee Turnover Early Warning; Meeting Decision Traceability; Business Travel Cost Attribution; Budget Execution Monitoring; Expense Anomaly Detection; Contract Risk Profiling; Seal Usage Traceability; Knowledge Association Mining; Process Compliance Checking; Decision Factor Analysis; Management Cockpit; Organizational Health Index; Strategy Execution Deviation; Risk Heat Map; Compliance Audit Clues; Investment in Digital Governance
Negative keywords: Lack of Strategy Execution Visibility; Unclear Organizational Effectiveness; Black-Box Process Operations; Experience-Driven Decision-Making; Delayed Risk Early Warning; Manual Compliance Inspection; Inefficient Opportunity Discovery; Absence of Customer Value Segmentation; Unquantified Employee Capabilities; Low Meeting Efficiency; Extensive Business Travel Model; Static Budgeting; Absence of Expense Control Rules; Manual Contract Review; Lack of Seal Supervision; Inefficient Knowledge Retrieval; Unidentified Process Bottlenecks; Absence of Management Dashboard; Lack of Organizational Resilience Assessment; Absence of Strategic Objective Decomposition; Unquantified Risk Appetite; Fragmented Compliance Knowledge; Misjudgment of Opportunity Value; Absence of Customer Churn Early Warning; Lack of Employee Turnover Prediction; No Traceability of Meeting Decisions; Lack of Business Travel Cost Attribution; Uncontrolled Budget Execution; Undetected Expense Anomalies; Absence of Contract Risk Profiling; No Record of Seal Usage; Existence of Knowledge Silos; Process Compliance Gaps; Ambiguous Decision Factors; Dispersed Management Data; Lack of Organizational Health Diagnosis; Significant Deviation in Strategy Execution; One-Sided Risk Identification; Inefficient Compliance Auditing; Insufficient Investment in Digital Governance
Supply Chain Management
Core keywords: Supply Chain Management (SCM); Intelligent Logistics; Unmanned Warehousing; Supply Chain Collaboration; Multi-Tier Supplier Management; Supply Chain Disruption Prediction; Blockchain Traceability; Product Quality Traceability; Route Optimization; Inventory Management; Industrial Internet; Internet of Things (IoT); E-Commerce; Cross-Border E-Commerce; Electronic Data Interchange (EDI); Intelligent Procurement; Centralized Procurement System; Supply Chain Visibility; Cold Chain Logistics; Supply Chain Finance; Supplier Risk Assessment; On-Time Delivery Rate Optimization; VMI Inventory Model; Transportation Cost Modeling; Packaging Standardization Rate; Automated Customs Documentation; Cross-Border Compliance Checking; Supplier Resilience Scoring; Alternative Sourcing Strategy; Logistics Carbon Footprint Tracking; Available-to-Promise (ATP); Dynamic Safety Stock; Demand Sensing Algorithm; Supply Network Reconfiguration; Material Readiness Early Warning; Supplier Collaboration Portal; Contract Performance Anomaly Detection; Logistics Lead Time Prediction; Supplier Capacity Profiling; Supply Chain Stress Testing
Extended keywords: Supply Chain Visibility; Intelligent Demand Forecasting; Dynamic Safety Stock; Cross-Border Compliance Engine; Supplier Digital Profiling; Logistics Carbon Footprint Tracking; Alternative Sourcing System; Supply Disruption Simulation; Intelligent Customs System; Logistics Lead Time Prediction; Supplier Collaboration Platform; Contract Anomaly Detection; Material Readiness Early Warning; Transportation Cost Optimization; Intelligent Packaging Design; Supplier Risk Assessment; On-Time Delivery Model; VMI Inventory Optimization; Supply Network Reconfiguration; Demand Propagation Model; Intelligent Traceability System; Real-Time Cold Chain Monitoring; Electronic Data Interchange (EDI); Intelligent Route Planning; Inventory Health Index; Supply Chain Transparency; Supplier Credit Rating; Material Lifecycle Management; Supply-Demand Matching; Intelligent Proof-of-Delivery System; Logistics Resource Scheduling; Supply Volatility Absorption; Intelligent Replenishment Strategy; Reserved Supply Capacity; Cross-Border Tax Optimization; Supply Network Simulation; Blockchain-Based Quality Traceability; Multi-Tier Supplier Penetration; Supply Resilience Assessment; Supply Chain Digital Twin
Negative keywords: Lack of Supply Chain Visibility; High Demand Forecast Error; Static Safety Stock; Cross-Border Compliance Risks; Subjective Supplier Evaluation; Untracked Logistics Carbon Emissions; Insufficient Alternative Sourcing Development; Absence of Supply Disruption Contingency Plans; Manual Customs Processes; Uncontrollable Logistics Lead Time; Inefficient Supplier Collaboration; Undetected Contract Anomalies; Delayed Material Readiness Early Warning; Unoptimized Transportation Costs; Traditional Packaging Design; Unquantified Supplier Risks; Low On-Time Delivery Rate; Ineffective VMI Execution; Rigid Supply Network; Distorted Demand Propagation; Absence of Traceability System; Gaps in Cold Chain Monitoring; Inefficient Data Exchange; Manual Route Planning; Undiagnosed Inventory Health; Low Supply Chain Transparency; Unrated Supplier Credit; Untracked Material Lifecycle; Imbalance in Supply-Demand Matching; Traditional Proof-of-Delivery System; Inefficient Logistics Scheduling; Poor Response to Supply Volatility; Primitive Replenishment Strategy; Insufficient Reserved Supply Capacity; Unoptimized Cross-Border Taxation; Absence of Network Simulation; Incomplete Quality Traceability; Uncontrollable Multi-Tier Supply; Unassessed Supply Resilience; Unapplied Supply Chain Digital Twin
Cross-process collaboration
Core keywords: Data Integration; Information Integration; Model Interconnection; Data-Driven; Ecosystem Collaboration; Industrial Internet; Product Lifecycle Management (PLM); C2M Mass Customization; Data Middle Platform; Cloud Platform; Industrial Cloud; Cloud Ecosystem; Digital Twin; Digital Thread; Industrial Information; Industrial Communication; Mixed Reality (MR); Augmented Reality (AR); Human-Computer Interaction; Digital Marketing; Componentization of Business Capabilities; End-to-End Process Integration; Organization-Level Architecture Governance; Master Data Consistency; Indicator Lineage Traceability; Heterogeneous System Adapter; Microservices Governance; API Economy Model; Digital Thread Connectivity; Change Impact Domain Analysis; Business Object Modeling; Shared Rule Engine; User Experience Journey Mapping; Value Stream Panorama View; Demand Propagation Mechanism; Capability Open Platform; Standardization of Business Semantics; Grey Release Control; Quantification of Technical Debt; Innovation Funnel Evaluation
Extended keywords: Digital Thread Connectivity; Standardization of Business Objects; Intelligent Service Orchestration; Event-Driven Architecture; Master Data Governance; Intelligent Interface Adaptation; Deep Application of Process Mining; Change Impact Analysis; Capability Open Platform; Digital transformation of User Experience; Unified Business Semantics; Visualization of Technical Debt; Architecture Compliance Checking; Service Contract Governance; Data Lineage Traceability; Decoupling of Business Capabilities; Grey Release Control; Innovation Funnel Management; Value Stream Analysis; Centralized Rule Management; Experience Measurement System; Service Circuit Breaker Mechanism; Business Continuity Assurance; Architecture Resilience Assessment; Digital Product Factory; Business Middle Platform Maturity; Technology Stack Governance; Alignment of Business Semantics; Optimization of Capability Portfolio; Reengineering of Experience Journey; Architecture Evolution Roadmap; Change Impact Simulation; Service Dependency Mapping; Business Rules Engine; Data Ownership Mechanism; Technology Asset Inventory; Innovation Value Assessment; Reuse of Business Components; Digital transformation of Architecture Decisions; Digital Resilience Building
Negative keywords: Broken Digital Thread; Disordered Business Objects; Inefficient Service Orchestration; Absence of Event-Driven Mechanism; Inconsistent Master Data; Difficulties in Interface Adaptation; Lack of Process Mining Application; Unknown Change Impact; Insufficient Capability Openness; Unquantifiable User Experience; Ambiguous Business Semantics; Accumulation of Technical Debt; Architecture Compliance Gaps; Absence of Service Contracts; Blurred Data Lineage; Strong Business Coupling; Uncontrolled Grey Release; Ineffective Innovation Funnel; Lack of Value Stream Analysis; Fragmented Rule Management; Absence of Experience Measurement; Blank Circuit Breaker Mechanism; Risks to Business Continuity; Lack of Architecture Resilience Assessment; Absence of Digital Product Factory; Weak Middle Platform Capability; Chaotic Technology Stack; Misaligned Business Semantics; Inefficient Capability Portfolio; Unoptimized Experience Journey; Disordered Architecture Evolution; Unmeasured Change Impact; Unclear Service Dependencies; Absence of Rules Engine; Unclear Data Ownership; Lack of Technology Asset Inventory; Unassessed Innovation Value; Low Component Reuse Rate; Arbitrary Architecture Decisions; Insufficient Digital Resilience

  21. Gladysz, B.; Krystosiak, K.; Buczacki, A.; Quadrini, W.; Ejsmont, K.; Kluczek, A.; Park, J.; Fumagalli, L. Sustainability and Industry 4.0 in the packaging and printing industry: A diagnostic survey in Poland. Eng. Manag. Prod. Serv. 2024, 16, 51–67. [Google Scholar] [CrossRef]
  22. Hajoary, P.K.; Balachandra, P.; Garza-Reyes, J.A. Industry 4.0 maturity and readiness assessment: An empirical validation using confirmatory composite analysis. Prod. Plan. Control 2024, 35, 1779–1796. [Google Scholar] [CrossRef]
  23. Manavalan, E.; Jayakrishna, K. A review of Internet of Things (IoT) embedded sustainable supply chain for industry 4.0 requirements. Comput. Ind. Eng. 2019, 127, 925–953. [Google Scholar] [CrossRef]
  24. Naeem, H.M.; Garengo, P. The interplay between industry 4.0 maturity of manufacturing processes and performance measurement and management in SMEs. Int. J. Product. Perform. Manag. 2022, 71, 1034–1058. [Google Scholar] [CrossRef]
  25. Gomes, A.D.O.; Basilio, J.C. A fuzzy inference model to identify the current industry maturity stage in the transformation process to Industry 4.0. IEEE Trans. Autom. Sci. Eng. 2024, 21, 1607–1622. [Google Scholar] [CrossRef]
  26. Krykavskyy, Y.; Pokhylchenko, O.; Hayvanovych, N. Supply chain development drivers in industry 4.0 in Ukrainian enterprises. Oeconomia Copernic. 2019, 10, 273–290. [Google Scholar] [CrossRef]
  27. Saad, S.M.; Bahadori, R.; Jafarnejad, H. The smart SME technology readiness assessment methodology in the context of Industry 4.0. J. Manuf. Technol. Manag. 2021, 32, 1037–1065. [Google Scholar] [CrossRef]
  28. Ferreira, D.V.; De Gusmão, A.P.H.; De Almeida, J.A. A multicriteria model for assessing maturity in industry 4.0 context. J. Ind. Inf. Integr. 2024, 38, 100579. [Google Scholar] [CrossRef]
  29. Bayrak, İ.T.; Cebi, F. Procedure model for Industry 4.0 realization for operations improvement of manufacturing organizations. IEEE Trans. Eng. Manag. 2024, 71, 7901–7912. [Google Scholar] [CrossRef]
  30. Zhao, L.; Shao, J.; Qi, Y.; Chu, J.; Feng, Y. A novel model for assessing the degree of intelligent manufacturing readiness in the process industry: Process-industry intelligent manufacturing readiness index (PIMRI). Front. Inf. Technol. Electron. Eng. 2023, 24, 417–432. [Google Scholar] [CrossRef]
  31. Modrak, V.; Soltysova, Z.; Sobotova, L. Transition of SMEs towards smart factories: A multi-case survey. Pol. J. Manag. Stud. 2024, 29, 237–254. [Google Scholar] [CrossRef]
  32. Jankowska, B.; Götz, M.; Mińska-Struzik, E.; Bartosik-Purgat, M. A new wave and the ripples it makes: Post-transition firm’s digital maturity and its consequences in global value chains. Entrep. Bus. Econ. Rev. 2024, 12, 135–152. [Google Scholar] [CrossRef]
  33. Ciravegna Martins Da Fonseca, L.M.; Pereira, T.; Oliveira, M.; Ferreira, F.; Busu, M. Manufacturing companies industry 4.0 maturity level: A multivariate analysis. J. Ind. Eng. Manag. 2024, 17, 196. [Google Scholar] [CrossRef]
  34. Mora-Alvarez, Z.A.; Hernandez-Uribe, O.; Luque-Morales, R.A.; Cardenas-Robledo, L.A. Modular ontology to support manufacturing SMEs toward Industry 4.0. Eng. Technol. Appl. Sci. Res. 2023, 13, 12271–12277. [Google Scholar] [CrossRef]
  35. Ganzarain, J.; Errasti, N. Three stage maturity model in SMEs toward industry 4.0. J. Ind. Eng. Manag. 2016, 9, 1119–1128. [Google Scholar] [CrossRef]
  36. Ahn, D.-J.; Jun, C.; Song, S.; Baek, J.-G. Production system maturity model (PSMM) for assessing manufacturing execution system. IEEE Access 2024, 12, 123459–123475. [Google Scholar] [CrossRef]
  37. Garechana, G.; Río-Belver, R.; Bildosola, I.; Rodríguez Salvador, M. Effects of innovation management system standardization on firms: Evidence from text mining annual reports. Scientometrics 2017, 111, 1987–1999. [Google Scholar] [CrossRef]
  38. Maibaum, F.; Kriebel, J.; Foege, J.N. Selecting textual analysis tools to classify sustainability information in corporate reporting. Decis. Support Syst. 2024, 183, 114269. [Google Scholar] [CrossRef]
  39. Stinson, M.; Mohammadian, A. W2VPCA: A machine learning method for measuring attitudes with natural language. IEEE Trans. Intell. Transp. Syst. 2024, 25, 8063–8077. [Google Scholar] [CrossRef]
  40. Wu, F.; Hu, H.; Lin, H.; Ren, X. Corporate digital transformation and capital market performance: Empirical evidence from stock liquidity. Manag. World 2021, 37, 130–150. [Google Scholar]
  41. Meng, M.; Fan, S.; Li, X.; Lei, J. Digital transformation and strategic risk taking dataset for China’s public-listed companies. Data Brief 2024, 54, 110511. [Google Scholar] [CrossRef]
  42. Jin, X.Y.; Zuo, C.J.; Fang, M.Y.; Li, T.; Nie, H.H. The measurement dilemma of enterprise digital transformation: New methods and findings based on large language models. J. World Econ. 2024, 2024, 34–53. [Google Scholar]
Figure 1. The framework of the evaluation method.
Figure 2. Temporal trends in the overall average digital transformation evaluation scores of the three models (mean of all scenario scores, S1–S6).
Figure 3. Temporal trends in the average scores of the three models across scenarios (S1–S6).
Figure 4. Temporal trends in the overall digital transformation evaluation index (ODTI) during the study period for the three models.
Figure 5. Evaluation values of each scenario using the three models.
Figure 6. Annual variation trends in scenario evaluation score averages (mean across three models).
Figure 7. Evaluation variation under different single-weight perturbations.
Figure 8. Evaluation variation in DTI under different extreme weight configurations.
Figure 9. Evaluation of DTI for 7 benchmark enterprises from 2017 to 2023.
Figure 10. Evaluation of ODTI for 7 benchmark enterprises from 2017 to 2023.
Table 1. Comparative overview of existing digital transformation MMs.
Model | Level/Scope | Main Purpose | Data Source | Method Type | Function-/Scenario-Level?
Mittal et al. (2018) [10] | Organization/plant | Industry 4.0 readiness/maturity | Survey | Expert/MCDA | No
De Carolis et al. (DREAMY) (2025) [7] | Organization (manufacturing) | Digital readiness/roadmap | Survey + interviews | Expert/MCDA | Limited
Battistoni et al. (2023) [14] | Manufacturing SMEs/firm | DT adoption paths | Survey | PLS-SEM + NCA | No
Gürdür et al. (2019) [18] | Cross-industry/organization | Data analytics readiness | Web-based survey | Survey-based scoring | No
Manavalan & Jayakrishna (2019) [23] | Supply chain/organization | I4.0/sustainable SC readiness | Literature review | Conceptual framework | No (supply-chain level)
Rahamaddulla et al. (2021) [11] | Manufacturing SMEs/organization | Smart manufacturing readiness | Literature + conceptual synthesis | Conceptual readiness–MM | Limited
Saad et al. (2021) [27] | SMEs/value chain/design | I4.0 technology readiness (SSTRA) | Case/practitioner data | AHP-based assessment | Partial (design-related)
Fareri et al. (2020) [13] | Organization/HR | I4.0 impact on skills/profiles | Job descriptions/text | Text mining/NLP | No
Jamwal et al. (2025) [4] | Manufacturing SMEs/organization | I4.0 practices maturity | Survey | Fuzzy/multi-criteria maturity | No
Latino, M.E. (2025) [8] | Manufacturing SMEs/organization | Industry 5.0 implementation maturity | Self-assessment/survey | Expert/MCDA-style model | Limited
This study | Function/scenario (firm) | Scenario-level digital maturity | Public annual reports (text) | LLM-based semantic model | Yes (scenario-level)
Table 2. The calculation principles of the three indicators.
Indicator Name | Calculating Principle | Equation No.
Scenario coverage | (Number of keywords appearing in the text / Total keywords) × 100% | Equation (A2) in Appendix B
Scenario depth | Σ(Weights of different types of keywords × Frequencies of different types of keywords) / (Weight of the core keywords × keyword frequency) × 100% | Equation (A3) in Appendix B
Scenario consistency | Normalized semantic similarity value | Equation (A4) in Appendix B
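The calculation principles in Table 2 can be sketched in code. The snippet below is a minimal illustrative implementation, not the paper's exact procedure: the keyword lists, the three type weights ("core", "extended", "negative"), the reading of the depth denominator as core weight × total keyword frequency, and the [−1, 1] normalization range for cosine similarity are all assumptions for demonstration; the authoritative definitions are Equations (A2)–(A4) in Appendix B.

```python
def scenario_coverage(text: str, keywords: list[str]) -> float:
    """Scenario coverage: share of scenario keywords found in the text, in percent."""
    hits = sum(1 for kw in keywords if kw in text)
    return hits / len(keywords) * 100


def scenario_depth(freqs: dict[str, int], weights: dict[str, float]) -> float:
    """Scenario depth: weighted keyword frequency relative to a core-keyword baseline.

    freqs and weights are keyed by keyword type ('core', 'extended', 'negative').
    The denominator follows one plausible reading of Table 2: core weight times
    the total keyword frequency.
    """
    weighted_sum = sum(weights[t] * freqs.get(t, 0) for t in weights)
    baseline = weights["core"] * sum(freqs.values())
    return weighted_sum / baseline * 100 if baseline else 0.0


def scenario_consistency(similarity: float, lo: float = -1.0, hi: float = 1.0) -> float:
    """Scenario consistency: min-max normalization of a semantic-similarity
    score (e.g., cosine similarity from SimCSE/Text2Vec/Paraphrase) into [0, 1]."""
    return (similarity - lo) / (hi - lo)
```

For example, a report snippet matching 2 of 4 scenario keywords yields a coverage of 50%, and a raw cosine similarity of 0.5 normalizes to a consistency of 0.75 under the assumed [−1, 1] range.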
Table 3. Average evaluation values of S1–S6 per model from 2014 to 2023.
Scenario | SimCSE | Text2Vec | Paraphrase
S1 | 0.4031 | 0.4989 | 0.3983
S2 | 0.1792 | 0.2690 | 0.2011
S3 | 0.4205 | 0.5415 | 0.4488
S4 | 0.1159 | 0.1814 | 0.1331
S5 | 0.2354 | 0.2802 | 0.2691
S6 | 0.0566 | 0.1188 | 0.0744
Table 4. Average evaluation values of S1–S6 per model for each year from 2014 to 2023.
Model | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 | 2020 | 2021 | 2022 | 2023
SimCSE | 0.1826 | 0.1601 | 0.1762 | 0.1871 | 0.1806 | 0.1911 | 0.1935 | 0.3727 | 0.3548 | 0.3526
Paraphrase | 0.1966 | 0.1759 | 0.1915 | 0.1877 | 0.1977 | 0.2113 | 0.2173 | 0.3999 | 0.3836 | 0.3799
Text2Vec | 0.2515 | 0.2309 | 0.2496 | 0.2641 | 0.2568 | 0.2680 | 0.2742 | 0.4639 | 0.4457 | 0.4450

Share and Cite

MDPI and ACS Style

Liu, Q.; Jiang, X. Quantitative Measurement of Digital Maturity in Manufacturing Enterprises: An Application Scenario-Based Study. Sustainability 2026, 18, 274. https://doi.org/10.3390/su18010274
