Research on a Task-Driven Classification and Evaluation Framework for Intelligent Massage Systems

Wang, Lingyu; Wang, Junliang; Guo, Meixing; Liu, Guangtao; Fang, Mingzhu; Yan, Xingyun; Wang, Hairui; Chen, Bin; Zhu, Yuanyuan; Hu, Jie; Qi, Jin

doi:10.3390/app15179327


by Lingyu Wang ¹, Junliang Wang ², Meixing Guo ¹, Guangtao Liu ², Mingzhu Fang ¹, Xingyun Yan ¹, Hairui Wang ¹, Bin Chen ¹, Yuanyuan Zhu ³, Jie Hu ⁴,* and Jin Qi ⁴,*

¹ School of Design, Shanghai Jiao Tong University, No. 800, Dongchuan Road, Minhang District, Shanghai 200240, China

² Shanghai Rongtai Health Technology Co., Ltd., No. 1226, Zhufeng Road, Qingpu District, Shanghai 201702, China

³ School of Art and Design, Nanjing Forestry University, No. 159, Longpan Road, Xuanwu District, Nanjing 210037, China

⁴ School of Mechanical and Power Engineering, Shanghai Jiao Tong University, No. 800, Dongchuan Road, Minhang District, Shanghai 200240, China

* Authors to whom correspondence should be addressed.

Appl. Sci. 2025, 15(17), 9327; https://doi.org/10.3390/app15179327

Submission received: 14 July 2025 / Revised: 16 August 2025 / Accepted: 22 August 2025 / Published: 25 August 2025


Abstract

As technologies become increasingly diverse and complex, Intelligent Massage Systems (IMS) are evolving from traditional mechanically executed modes toward personalized and predictive health interventions. However, the field still lacks a unified grading standard for intelligence, making it difficult to quantitatively assess a system’s overall intelligence level. To address this gap, this paper proposes a task-driven six-level (L0–L5) classification framework and constructs a Massage-Driven Task (MDT) model that decomposes the massage process into six subtasks (S1–S6). Building on this, we design a three-dimensional evaluation scheme comprising a Functional Delegation Structure (FDS), an Anomaly Perception Mechanism (APM), and a Human–Machine Interaction Boundary (HMIB), and we select eight key performance indicators to quantify IMS intelligence across the perception–decision–actuation–feedback closed loop. We then determine indicator weights via the Delphi method and the Analytic Hierarchy Process (AHP), and obtain dimension-level scores and a composite intelligence score S0 using normalization and weighted aggregation. Threshold intervals for L0–L5 are defined through equal-interval partitioning combined with expert calibration, and sensitivity is verified on representative samples using ±10% data perturbations. Results show that, within typical error ranges, the proposed grading framework yields stable classification decisions and exhibits strong robustness. The framework not only provides the first reusable quantitative basis for grading IMS intelligence but also supports product design optimization, regulatory certification, and user selection.

Keywords: intelligent massage systems (IMS); capability grading; task-driven; Massage-Driven Task (MDT) model; performance indicators

1. Introduction

Under the strategic context of “Healthy China 2030”, the global population is aging at an accelerating pace, and health needs are steadily increasing [1]. Intelligent health devices have thus become an important development direction in the healthcare and eldercare sector. As a new generation of home-based physiotherapy and rehabilitation equipment, the Intelligent Massage System (IMS) integrates advanced sensing technologies, artificial intelligence algorithms, and human–machine interaction methods and already possesses capabilities for environmental perception, decision-making, and feedback regulation [2,3,4,5]. This enables a gradual transition from passive mechanical execution to an intelligent health terminal with environmental awareness, autonomous regulation, and personalized services. IMS has also demonstrated significant effects in relieving muscle fatigue, improving blood circulation, and enhancing user comfort [6,7,8,9].

However, in a market environment where the technological architecture of massage systems is increasingly complex and product form factors have become highly diverse, descriptions of the “degree of intelligence” of IMS in both industry and academia largely remain at the level of stacked feature lists or qualitative commentary, with no unified, quantitative capability evaluation standard. This gap not only makes it difficult for R&D teams to set clear technical priorities but also hampers regulators in formulating certification criteria and leaves consumers without an effective basis for comparison when making purchases. Therefore, it is urgent to establish a clear and operational capability-grading framework.

To address the above shortcomings, this paper draws on the notions of task delegation and hierarchical capability assessment in the SAE J3016 levels of driving automation and develops a task-driven six-level (L0–L5) grading model tailored to IMS. First, based on the proposed Massage-Driven Task (MDT) model, the massage process is decomposed into six subtasks (S1–S6): body-state recognition, program recommendation, force and rhythm control, multi-region coordination, anomaly detection and intervention, and user modeling and learning. From this, three evaluative dimensions are derived, namely the Functional Delegation Structure (FDS), the Anomaly Perception Mechanism (APM), and the Human–Machine Interaction Boundary (HMIB), to characterize the system’s degree of delegation/agency, processing capability, and proactivity across the subtasks. Subsequently, eight corresponding key performance indicators are selected; indicator weights are determined via the Delphi method and Analytic Hierarchy Process (AHP); and a composite intelligence score S0 is constructed using min–max normalization and weighted summation. Finally, sensitivity analysis is conducted to verify the model’s stability and robustness under typical perturbations.
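As an illustration of the scoring step described above, the following Python sketch applies min–max normalization and weighted summation to a small set of indicators and then repeats the calculation under a ±10% perturbation. The indicator names, value ranges, weights, and the choice to scale all raw readings uniformly are hypothetical placeholders; they are not the eight indicators or the AHP-derived weights used in this paper.

```python
from typing import Dict, Tuple

Bounds = Dict[str, Tuple[float, float]]


def min_max(value: float, lo: float, hi: float) -> float:
    """Scale a raw indicator reading into [0, 1], clamping out-of-range values."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    return min(1.0, max(0.0, (value - lo) / (hi - lo)))


def composite_score(raw: Dict[str, float], bounds: Bounds,
                    weights: Dict[str, float]) -> float:
    """Weighted sum of normalized indicators, divided by the total weight."""
    total_weight = sum(weights.values())
    weighted = sum(weights[k] * min_max(raw[k], *bounds[k]) for k in weights)
    return weighted / total_weight


def perturbed_scores(raw: Dict[str, float], bounds: Bounds,
                     weights: Dict[str, float],
                     rel: float = 0.10) -> Tuple[float, float]:
    """Scores after scaling every raw reading by (1 - rel) and (1 + rel).

    Uniform scaling is an assumption made for this sketch only.
    """
    low = {k: v * (1.0 - rel) for k, v in raw.items()}
    high = {k: v * (1.0 + rel) for k, v in raw.items()}
    return (composite_score(low, bounds, weights),
            composite_score(high, bounds, weights))


# Hypothetical indicators, ranges, and weights (placeholders, not paper data).
raw = {"k1": 80.0, "k2": 0.5, "k3": 30.0}
bounds = {"k1": (0.0, 100.0), "k2": (0.0, 1.0), "k3": (0.0, 60.0)}
weights = {"k1": 0.5, "k2": 0.3, "k3": 0.2}

s0 = composite_score(raw, bounds, weights)
s0_low, s0_high = perturbed_scores(raw, bounds, weights)
print(f"S0 = {s0:.3f}, range under ±10%: [{s0_low:.3f}, {s0_high:.3f}]")
```

If the perturbed scores stay inside the same threshold interval as the unperturbed score, the assigned level does not change, which is the sense in which a grading decision can be called stable.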

The remainder of this paper is organized as follows: Section 2 reviews related research in the field of intelligent massage, as well as research on levels of driving automation. Section 3 details the construction of the MDT model, the design of the three-dimensional capability evaluation framework, and the procedures for indicator quantification and analysis. Section 4 defines the L0–L5 level specifications, presents the computation of the composite intelligence score, and reports the sensitivity verification. Section 5 summarizes the findings, discusses their implications, and outlines directions for future work.

2. Related Works

2.1. Current Status of Research on Vehicle Autonomous Driving Classification

During the early stage of the industrialization and commercialization of automated driving, the absence of unified grading standards and policy support led to long-standing divergences in how “automation” was understood, which in turn affected regulatory rule-making, corporate R&D, and public perception [10,11]. To address these issues, in 2014, SAE International released the J3016 standard, which pioneered decomposing the Dynamic Driving Task (DDT) into a set of subtasks according to the responsibilities that can be delegated to the system. Building on three core elements—Functional Delegation Structure (FDS), Operational Design Domain (ODD), and fallback capability—it established a six-level capability classification system from L0 (no automation) to L5 (full automation) [12,13]. This approach broke from schemes based solely on technological maturity and enabled a structured expression of system behavioral responsibilities and safety boundaries.

Around and after the publication of the J3016 framework, various regions around the world introduced their own classification standards. As illustrated in Figure 1, the U.S. National Highway Traffic Safety Administration (NHTSA) proposed a five-level classification model in 2013, shortly before J3016 was released, which distinguished levels such as function-specific automation, combined-function automation, and conditional automation [14]. In 2019, both Germany’s Federal Highway Research Institute (BASt) and the United Nations Economic Commission for Europe (UNECE WP.29) developed related classification models and regulatory frameworks, advancing the international harmonization of standards [15]. In 2021, China released the national standard GB/T 40429-2021: Classification of Driving Automation for Vehicles, which inherits the core logic of J3016 while tailoring it to domestic road characteristics, safety supervision requirements, and technological realities. The Chinese standard refines definitions of minimum functional sets, safety assurance measures, and applicable scenarios for each automation level [16,17,18]. The successive issuance of these standards has not only provided a unified, layered framework for evaluating the capabilities of autonomous driving systems but also laid a solid foundation for regulatory certification, industry testing, and consumer education [19].

In China, as autonomous driving technologies continue to evolve and reach practical deployment, vehicles are gradually transitioning from traditional means of transportation to intelligent mobile entities equipped with autonomous navigation capabilities. Their core value is also shifting from mechanical drive systems to the co-evolution of software algorithms and intelligent chips [20], with the influence of the SAE J3016 classification framework increasingly evident within the industry [21]. On the one hand, the classification system provides a clear capability layering framework for intelligent driving products, facilitating the explicit definition of system functional boundaries. For example, at the L2–L3 levels, driver assistance features such as adaptive cruise control and lane keeping have become standard offerings in new energy and premium vehicles [22]. According to data from the Ministry of Industry and Information Technology, the penetration rate of L2 functions in new vehicles in China reached 55.7% in the first half of 2024 and is projected to exceed 65% by 2025 [23]. Demonstration applications of L3 systems, such as Audi A8’s Traffic Jam Pilot, have promoted on-road pilot programs in restricted scenarios in cities like Beijing and Wuhan, and prompted several automakers (including XPeng, GAC, and Zeekr) to announce mass-production plans [24]. On the other hand, the establishment of the classification system also provides a structured foundation for accident liability determination, risk regulation, and user behavior guidance. Higher-level automated systems (L3 and above) are tasked with more complex driving responsibilities, raising legal challenges surrounding “human–machine handover disputes” in dynamic environments. Researchers have begun exploring decision transparency within explainable AI frameworks to address accountability boundaries and fault tolerance in L3–L4 scenarios [25]. Moreover, users are prone to cognitive bias during transitions between L2 and L3, further underscoring the importance of clearly defined levels and user education.

Compared with the development path in China, international research on autonomous driving classification places greater emphasis on the coordinated advancement of technological maturity and regulatory systems. The United States and Europe are representative in practical research efforts and differ in their technical approaches and regulatory strategies. The United States holds a leading position in the research and application of autonomous driving technologies, characterized by a market-driven approach and strong emphasis on technological innovation [26]. Technology companies and traditional automakers—represented by Google’s Waymo and Tesla—are actively promoting the development of autonomous driving [27]. Waymo has launched driverless taxi services in several U.S. cities, while Tesla continues to upgrade its Autopilot system to gradually realize more advanced autonomous driving capabilities [28]. In addition, many U.S. states have relatively lenient regulations for autonomous vehicle testing. For example, California and Arizona have created favorable testing environments. In contrast, Europe places greater emphasis on public safety and the coordination between policy and technological development [29]. The European Union supports the advancement of autonomous driving through initiatives such as Horizon 2020 and promotes multinational collaboration. For instance, the L3Pilot project is aimed at conducting L3-level autonomous driving tests across multiple European countries. Europe also emphasizes the integration between road infrastructure and autonomous vehicles, promoting V2X (Vehicle-to-Everything) technologies to enable real-time information exchange between vehicles and traffic infrastructure [30,31]. With continuous technological progress and the gradual improvement of regulatory systems, autonomous vehicles are expected to achieve large-scale commercial deployment in the near future [32,33]. 
The convergence of artificial intelligence, 5G communication, and intelligent transportation infrastructure will further accelerate the development of autonomous driving technologies [34].

From a technical perspective, both in China and internationally, classification frameworks have facilitated the coordinated development of key technologies, such as multi-sensor fusion, behavior prediction, and decision-making and execution mechanisms. For example, the integration of data from LiDAR, millimeter-wave radar, and cameras has significantly improved the system’s ability to perceive and interpret complex traffic scenarios [35,36,37,38]. Moreover, the classification logic has inspired the development of tiered software version pathways (e.g., ADS Lite → ADS 2.0), offering automotive manufacturers such as XPeng, Huawei, and BYD a standardized model for progressive feature deployment [39,40]. The SAE J3016 classification standard has accelerated the global transition of autonomous driving technologies from conceptual ambiguity to structured standardization. Its successful implementation demonstrates that a capability-layered framework based on task delegation and responsibility boundaries not only enables a clear articulation of system intelligence levels but also provides a unified paradigm for the technical evolution, regulatory governance, and market adoption of complex human–machine interactive systems. This methodology offers important insights for the classification of emerging intelligent devices such as IMS and lays a solid theoretical and methodological foundation for adapting autonomous driving classification principles to capability modeling within IMS.

2.2. Research Status of Intelligent Massage System

In recent years, driven by consumption upgrading trends and national policy support such as the Action Plan for the Development of the Smart Health and Elderly Care Industry (2021–2025), China has actively promoted the development of new health devices based on physiological sensing, human–machine interaction, and data-driven intelligence [41], creating a favorable environment for the technological evolution and market application of IMS. At the same time, under the dual influence of population aging and the expanding population with sub-health conditions, the global massage equipment market reached USD 25.26 billion in 2024 and is projected to grow to USD 41.18 billion by 2032, with a compound annual growth rate (CAGR) of 6.35% [42]. During the same period, China’s massage chair market expanded from RMB 7.8 billion in 2023 to RMB 8.485 billion in 2024 and is expected to continue its growth trend in 2025 [43]. Driven by both policy and demand, IMS is rapidly evolving toward high-level intelligence and personalization. Over 60% of mid- to high-end massage products are now equipped with AI algorithms, IoT connectivity, and adaptive control functions, signaling a transition from traditional mechanical control systems to multimodal intelligent response systems.

The technological evolution of IMS can be divided into four stages: mechanical drive, perception integration, knowledge-based reasoning, and deep learning-based closed-loop control systems, as shown in Figure 2.

The current system architecture has gradually developed into a closed-loop framework with four collaborative layers: perception, decision, execution, and feedback. The perception layer constructs a digital human body model through pressure sensing, infrared imaging, and millimeter-wave radar. The decision layer performs program matching and strategy optimization based on deep learning models and rule sets. The execution layer achieves high-precision operations through multi-axis drive and force control systems. The feedback layer integrates force feedback and physiological signal return to drive active learning modules and promote adaptive system evolution. This architecture reflects a key transformation of IMS from passive execution to active perception and self-optimization.

In China, with the maturation of technologies such as big data, deep learning, and machine vision, research on IMS has focused on technical principles, algorithmic optimization, and user experience enhancement. Current efforts aim to develop systems capable of more precise massage region localization, massage technique recognition, and personalized services tailored to individual differences [44,45]. For example, National Taiwan University proposed a method for automatic massage trajectory generation based on point cloud data [46]. Tsinghua University developed an acupoint recognition system based on visual feedback, achieving millimeter-level acupoint localization with the aid of QR code-assisted positioning [47]. Shanghai Jiao Tong University introduced a Symmetric Space Transformation Network (SSTN) that combines skeletal geometry and depth information to enable real-time 3D localization of acupoints under varying body postures and clothing conditions [48]. To improve the accuracy and intelligence level of massage functions, BP neural networks and deep optimization algorithms have been applied to the precise tracking of specific body parts and personalized dynamic adjustment [49,50,51]. These include muscle tension prediction and force regulation during massage, enabling autonomous positioning and enhanced user experience [52], which significantly strengthens the system’s capability for personalized response. The industry has also shown a development trend that echoes academic progress. In 2025, OGAWA launched its 5D-AI massage robot, which integrates multi-axis sensing and biological feedback to deliver deeply customized massage services based on user muscle tension and fatigue levels [53]. Several manufacturers have combined image recognition with robotic control to realize automatic acupoint detection and execution of massage techniques [54,55], thereby improving therapeutic effectiveness and user comfort [56,57]. 
These advancements mark a shift in IMS from auxiliary equipment to intelligent health interaction systems with cognitive capabilities.

Compared with domestic research, which mainly focuses on system architecture and user experience optimization, international studies on IMS place greater emphasis on the practical deployment of robotic platforms and the scientific evaluation of massage effectiveness. For instance, researchers have proposed a vision-based robotic massage (VBRM) framework capable of performing safe and autonomous robotic massage operations on the human body in dynamic environments. This system allows operators to customize massage parameters, and the robot adjusts the massage trajectory in real time using visual recognition and motion tracking technologies [58]. In addition, some studies have systematically extended the understanding of the mechanisms behind manual and self-massage and evaluated the effectiveness of autonomous, interactive robotic massage solutions [59]. These studies highlight that IMS should not only be regarded as a “technical device” but also as an active participant in the process of human–machine interaction.

As the technology continues to mature, the functional design of IMS is evolving from predefined workflows to dynamically generated processes. As shown in Table 1, at the strategy layer, the system progressively incorporates adaptive regulation mechanisms, based on physiological signals such as heart rate and electromyography (EMG) [60,61,62]. At the trajectory layer, personalized paths are generated in real time using machine vision and deep learning algorithms [63,64,65,66]. At the mode layer, traditional techniques (e.g., kneading, rolling) are deeply integrated with multimodal therapies, such as heat application, electrical stimulation, and pneumatic compression [67,68,69,70]. Some products further allow users to freely combine programs through human–machine interaction systems, enabling a highly customized massage experience [71,72,73].

Although existing research and commercial products have made significant progress in perception, decision-making, and execution, the high degree of system freedom has led to a phenomenon of “functional explosion”, in which the overall system exhibits trends of functional fragmentation and trajectory dispersion. This further increases the complexity of capability modeling and level classification. How to strike a balance between functional diversity and system controllability and how to construct a unified classification and evaluation framework have become key challenges that must be addressed in current IMS research. In addition, the current market descriptions of the “intelligence level” of massage products mostly remain at the level of qualitative marketing claims, and there is still no established system for quantitative, comparable, and verifiable classification of system capabilities. Existing classification approaches mainly fall into the following three categories:

Price gradient: Intelligence level is inferred based on product pricing. However, price does not have a linear correlation with functional capability, which can easily lead to misjudgment.
Function stacking: Evaluation is conducted based on the number of massage programs, the variety of covered body regions, and the presence of additional modules (such as voice control, heating, or music). Nevertheless, this approach fails to reflect the structural nature of system intelligence.
Subjective labels: Classification is based on marketing-oriented terms constructed from subjective expressions, such as “AI massage” or “automatic recognition”, which lack clear definitions and standardized criteria.

This gives rise to three major challenges: consumers are unable to accurately assess the intelligence level of products; enterprises face difficulties in developing differentiated solutions based on capability stratification; and regulatory systems lack standardized metrics for quantitative certification. In contrast, the field of autonomous driving has established a graded task delegation framework that clearly defines the responsibility boundaries and intervention modes of systems at each level. This has provided a unified paradigm for technological iteration, industry standardization, and public understanding of complex intelligent systems, and has gained broad consensus in both engineering implementation and societal adoption.

Therefore, establishing a multidimensional classification framework for IMS based on system capability structures and task delegation mechanisms has become an urgent research priority. In response to this need, this paper proposes a capability grading system that is structurally clear and technically comprehensive. The framework is intended to provide design guidance for product development, interpretability for end users, and a theoretical foundation for industry standardization, thereby promoting the evolution of massage systems from function stacking toward structured, capability-oriented intelligence.

3. Materials and Methods

3.1. Definition of the Massage-Driven Task (MDT) Model

To address the research gap identified in Section 2—namely, the lack of a structured task framework in the classification of existing intelligent massage systems—this paper first proposes the Massage-Driven Task (MDT) model as the functional foundation for capability grading. The model draws on the DDT structure from the field of autonomous driving and decomposes the IMS operational process into four closed-loop stages: Perception, Decision, Execution, and Feedback. To clearly define functional modules, this paper sequentially numbers six core task submodules within the MDT model as S1 to S6 (Subtask 1 to Subtask 6), which will be used consistently in subsequent sections and figures for identification and mapping.

The MDT model is based on the operational process of human massage and sequentially covers the following six core task submodules, as illustrated in Figure 3.

1. Perception stage (S1: Body State Recognition)

S1 collects key body parameters such as body contour, sitting posture, and muscle tension in real time through multi-source sensor fusion, using pressure sensors, millimeter-wave radar, and infrared depth cameras. These precise inputs provide refined data for subsequent personalized decision-making.

2. Decision stage (S2: Program Recommendation)

S2 generates adaptive massage programs based on a combination of user fatigue status, local pressure distribution, and historical preferences, using a built-in rule base and data-driven algorithms. This reflects the autonomous intelligence of the IMS in strategy formulation and plan scheduling.

3. Execution stage (S3: Force and Rhythm Control; S4: Multi-Zone Coordination)

S3 is the control module responsible for core massage actions. It dynamically adjusts the intensity, frequency, and rhythm of massage movements based on preset programs or real-time feedback signals, reflecting the system’s ability for active regulation and adaptation.

S4 further coordinates the timing and spatial execution of massage actions across multiple body regions, such as the back, neck, and legs. It ensures sequence synchronization and spatial consistency, avoiding conflicts, overlaps, or omissions between regions, thereby enhancing the overall coherence and comfort of the user experience.

4. Feedback stage (S5: Anomaly Detection and Intervention; S6: User Modeling and Learning)

S5 focuses on risk management during system operation. By continuously monitoring abnormal conditions such as massage force exceeding thresholds, posture deviation, and equipment jamming, it can trigger proactive safety interventions including alarms, pauses, or force reduction, reflecting the system’s safety assurance mechanism.

S6 focuses on user modeling and learning updates during long-term use. The system can record and analyze user preference information, operating habits, and feedback content and dynamically adjust strategies to optimize program parameters, thereby enabling the personalized evolution of services.

In summary, the six subtasks S1 to S6 together form the complete task cycle of the IMS, from perception input to feedback closure, progressively reflecting its core capability characteristics across different functional dimensions.
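The stage-to-subtask structure of the MDT model can be written down as a small lookup table. The Python sketch below is one possible encoding, offered only as an illustration; the identifiers `Stage`, `MDT_SUBTASKS`, and `subtasks_in` are hypothetical names and are not part of the model definition.

```python
from enum import Enum
from typing import Dict, List, Tuple


class Stage(Enum):
    """The four closed-loop stages of the MDT model."""
    PERCEPTION = "perception"
    DECISION = "decision"
    EXECUTION = "execution"
    FEEDBACK = "feedback"


# Subtask ID -> (subtask name, stage it belongs to), following the list above.
MDT_SUBTASKS: Dict[str, Tuple[str, Stage]] = {
    "S1": ("Body State Recognition", Stage.PERCEPTION),
    "S2": ("Program Recommendation", Stage.DECISION),
    "S3": ("Force and Rhythm Control", Stage.EXECUTION),
    "S4": ("Multi-Zone Coordination", Stage.EXECUTION),
    "S5": ("Anomaly Detection and Intervention", Stage.FEEDBACK),
    "S6": ("User Modeling and Learning", Stage.FEEDBACK),
}


def subtasks_in(stage: Stage) -> List[str]:
    """Return the IDs of the subtasks assigned to a stage, in S1..S6 order."""
    return [sid for sid, (_, st) in MDT_SUBTASKS.items() if st is stage]


for stage in Stage:
    print(stage.value, subtasks_in(stage))
```

Keeping the mapping in one table makes it straightforward to attach per-subtask evaluations to the same S1 to S6 identifiers later on.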

3.2. Design Logic of the Grading Structure

Building on the MDT model as the core task model for IMS, this paper further proposes a grading structure grounded in system capability performance to systematically support the definition and classification of intelligence levels. The logical framework consists of three key mechanisms:

Functional Delegation Structure (FDS): Measures the system’s autonomous task completion capability in the task execution dimension.
Anomaly Perception Mechanism (APM): Evaluates the system’s ability to monitor and respond to operational anomalies.
Human–Machine Interaction Boundary (HMIB): Reflects the system’s proactive service capability in the process of human–machine interaction.

These three mechanisms correspond to the three core capability dimensions of IMS: task closed-loop capability, abnormal state handling capability, and interaction proactivity. They provide a clear and quantifiable basis for constructing the subsequent intelligence level classification system.

Unlike traditional methods that classify levels solely based on the number of massage programs or product price, the grading logic constructed in this section focuses more on the intrinsic structure of the system’s actual capabilities. This is reflected in three aspects:

Task closed-loop capability: the degree to which the system autonomously undertakes each subtask in the MDT model (S1 to S6).
Abnormal state handling capability: the system’s level of perception, judgment, and intervention in abnormal conditions during operation, reflecting its safety assurance capability.
Interaction proactivity: the extent to which the system actively engages in user modeling, personalized learning, and service response.

The following sections will introduce the definitions, classification standards, and determination methods of these three mechanisms and will map and quantify capabilities in relation to the six subtasks (S1 to S6) in the MDT model.

3.2.1. Functional Delegation Structure (FDS)

In dividing the intelligence capability levels of IMS, the first step is to assess whether the system can independently complete execution in each core task dimension. Therefore, this paper proposes the FDS to quantify the delegation status of IMS for the six subtasks (S1 to S6) in the MDT model across the four stages of perception, decision, execution, and feedback, thereby establishing a mapping between task execution capability and level classification.

The design concept of the FDS draws on the functional takeover mechanism for the DDT in the field of autonomous driving (as specified in SAE J3016) and adapts and reconstructs it using the perception–decision–execution–feedback closed-loop framework from cognitive science that underlies the MDT model. Although IMS and autonomous driving systems differ significantly in their application goals, both share highly similar task structures in terms of recognizing user states, matching control strategies, implementing intervention actions, and perceiving feedback. This structural similarity provides the theoretical foundation for applying the FDS to IMS.

As shown in Figure 4, for S1–S6 in the MDT model, an IMS can exhibit varying degrees of active participation capability. To address this, the system’s delegation behaviors are classified into three types of functional delegation states, which are mapped and presented in Table 2.

No Delegation (N): The system has no perception, judgment, or response capability in this task dimension, and all operations rely entirely on the user, representing a purely manual control stage.
Limited Delegation (L): The system can perform part of the functions with the support of static rules or simple algorithms, such as executing programs based on fixed templates or adjusting feedback based on thresholds. However, in this state, the system lacks adaptability and closed-loop task capability and still requires user guidance or intervention.
Full Delegation (F): The system possesses a complete capability chain from environmental perception and state judgment to action control and result feedback, enabling adaptive operation without user intervention and demonstrating strong task comprehension and system stability.

It should be noted that the core value of the FDS lies not in simply counting the number of system functions but in identifying the system’s structured participation capability across the six subtasks—that is, whether it can independently understand the task, carry out intervention and execution, and adjust based on feedback. Through the three-level classification of N-L-F, the FDS provides a clear and quantifiable evaluation framework for the execution capability of each subtask and also establishes a solid capability foundation for the construction of the IMS intelligence level classification system.
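To make the N-L-F classification concrete, the following Python sketch records one delegation state per subtask and tallies how many subtasks fall into each state. The example profile describes a hypothetical product, and the names `Delegation`, `SUBTASKS`, and `tally` are illustrative assumptions; the sketch does not reproduce the mapping in Table 2 or any scoring rule from this paper.

```python
from collections import Counter
from enum import Enum
from typing import Dict


class Delegation(Enum):
    """The three functional delegation states of the FDS."""
    N = "No Delegation"
    L = "Limited Delegation"
    F = "Full Delegation"


SUBTASKS = ("S1", "S2", "S3", "S4", "S5", "S6")


def tally(profile: Dict[str, Delegation]) -> Dict[str, int]:
    """Count how many of the six subtasks are in each delegation state."""
    missing = [s for s in SUBTASKS if s not in profile]
    if missing:
        raise ValueError(f"profile is missing subtasks: {missing}")
    counts = Counter(profile[s] for s in SUBTASKS)
    return {state.name: counts.get(state, 0) for state in Delegation}


# Hypothetical profile for illustration only (not a product from this paper).
example_profile = {
    "S1": Delegation.L,
    "S2": Delegation.L,
    "S3": Delegation.F,
    "S4": Delegation.L,
    "S5": Delegation.L,
    "S6": Delegation.N,
}

print(tally(example_profile))
```

The tally is only a summary of the profile. The structured question the FDS asks is which subtasks are delegated and to what degree, so the per-subtask profile itself, not the count, is the primary record.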

3.2.2. Anomaly Perception Mechanism (APM)

In the MDT model, the feedback stage plays a critical role in ensuring system safety and operational stability. During actual operation, an IMS may encounter various unexpected conditions, such as user posture deviation, localized abnormal force, equipment jamming, sensor malfunction, or control program delays. Whether the system can perceive these abnormal states in real time, accurately identify them, and respond effectively is an important indicator of its level of intelligence.

To address this, this paper proposes the APM as one of the core components in the classification of IMS intelligence levels. The APM is used to measure the system’s monitoring depth, the complexity of its response pathways, and its safety regulation capability when dealing with non-standard states in uncertain environments. The capability level of this mechanism directly affects the completion quality of S5 (safety protection) in the MDT model.

Based on the system’s capability to perceive abnormal scenarios and the complexity of its handling strategies, the APM can be categorized into three levels:

No Perception: The system lacks any abnormal state monitoring functions. Unexpected conditions during operation cannot be detected, and all risk handling relies on the user to manually terminate the operation or restart the device, significantly compromising operational safety. (Corresponds to S5 “No Automation”)
Rule-Based Perception: The system uses static threshold settings or predefined rule templates to identify and respond to certain typical abnormalities, for example, automatically reducing massage force when it exceeds a safety threshold or issuing a warning when a sensor fails. This mechanism has a single, fixed response path, lacks dynamic adaptability, and can only cover abnormal scenarios predefined in the rules. (Corresponds to S5 “Limited Automation”)
Active Perception: The system integrates multi-source data (e.g., pressure, posture, time series) and builds dynamic behavior models, enabling it to predict, identify, and perform closed-loop regulation of irregular, complex, and evolving abnormalities. Typical capabilities include machine learning-based abnormal pattern recognition, dynamic strategy switching, process interruption control, and adaptive calibration, thereby establishing a fundamental safety assurance framework. (Corresponds to S5 “Full Automation”)

APM and the previously described FDS together form a dual-support structure for the IMS in the execution–feedback phase. FDS ensures that the system possesses structured execution capability during normal subtasks, while APM safeguards its ability to maintain continuous monitoring and risk intervention in abnormal conditions. The synergy between the two not only enhances the stability of routine operations but also significantly strengthens the system’s robustness and user safety assurance in emergency situations.

3.2.3. Human–Machine Interaction Boundary (HMIB)

Within the MDT model, the manner in which the user interacts with the system directly reflects the autonomy level of the IMS. To systematically evaluate the IMS’ dependence on user operations and its level of proactive responsiveness within interaction processes, this study proposes the HMIB mechanism. This mechanism captures the progression of IMS interaction patterns from user-driven to system-driven as its perception and decision-making capabilities advance.

In intelligent massage scenarios, low-intelligence systems typically rely on explicit user commands to initiate, adjust, and interrupt operations. By contrast, high-intelligence systems leverage user state recognition and strategy learning to automate the interaction process, and, in some cases, achieve near-invisible or seamless interaction.

Based on the system’s mode of control within task workflows, the HMIB is categorized into three representative patterns:

Fully Dependent Interaction: In subtasks such as S2 and S3, the system relies entirely on the user to issue start, adjustment, and termination commands. Interaction channels are primarily graphical user interfaces, voice commands, or mobile applications. The system lacks autonomous decision-making capabilities, resulting in a high operational workload for the user.
Semi-Autonomous Interaction: In subtasks such as S1 and S2, the system can automatically execute certain processes based on predefined conditions (e.g., user posture, time markers). For example, it may automatically start a program upon detecting that the user is seated or transition to the next massage phase after completing a session. Although some processes run autonomously, critical points such as task switching and exception handling still require active user intervention.
Passive–Aware Interaction: Based on its capabilities in subtasks S1 and S6, the system achieves continuous dynamic sensing and prediction of the user’s state. Without any explicit user commands, it can autonomously perform program recommendations, parameter adjustments, and process control. This mode, enabled by multimodal data fusion and behavioral modeling, significantly enhances task continuity, interaction immersion, and overall user experience.

Complementing FDS and APM, HMIB focuses on the interaction core, measuring the IMS’ proactive service capability and interaction intelligence level. Together, these three mechanisms systematically establish the theoretical framework and evaluation logic for IMS intelligence grading, providing a solid foundation for both grade definition and validation.

3.3. Construction of the Grading Indicator System

3.3.1. Determination of Key Indicators

Guided by the three logical frameworks proposed in Section 3.2—FDS, APM, and HMIB—this section conducts a systematic analysis of the operational processes covered by the MDT model (S1–S6).

To achieve a comprehensive quantitative evaluation of the IMS intelligence level and thereby support the objective determination of grading, this section establishes an evaluation system consisting of eight core indicators, with detailed definitions provided in Table 3. Each indicator corresponds to the system’s performance in different task stages, covering the critical links in the intelligentization process as follows:

1. Task Recognition Accuracy (P1): Measures the system’s precision in identifying the user’s current physical state, corresponding to the classification performance of the S1 body state recognition module in FDS.
2. Abnormal Detection Sensitivity (P2): Evaluates the system’s capability to detect and respond to abnormal states in S5 (e.g., sensor failure, overload) in a timely manner, reflecting the coverage and emergency response capability of APM.
3. Recommendation Hit Rate (D1): Measures the degree to which program recommendations in S2 match the user’s actual needs, reflecting the decision-making accuracy of the FDS decision layer.
4. Decision Response Latency (D2): Refers to the average delay from perception input to decision output, measuring the real-time performance of the system in converting information and initiating execution in S2–S3 subtasks.
5. Force Control Error (E1): Calculated as the average deviation between the actual applied force and the target force curve, indicating the execution accuracy in force control and rhythm coordination in S3.
6. Path Tracking Accuracy (E2): Measures the spatial deviation between the actual trajectory of the massage head and the preset path, assessing the spatial accuracy of multi-region coordinated control in S4.
7. Physiological Feedback Response Rate (F1): Measures the proportion of user physiological signals (e.g., EDA, HRV) effectively activated during the massage process, reflecting the effectiveness of the feedback loop between S5 and S6, and representing the HMIB’s capability for dynamic user state perception.
8. User Subjective Satisfaction (F2): Reflects the user’s subjective evaluation of the overall massage experience, typically collected via questionnaires or interviews, and provides a comprehensive view of HMIB’s service quality and user experience optimization at the perceptual level.

This indicator system not only encompasses the IMS’ perception, decision-making, and execution capabilities under the FDS framework but also integrates the APM’s mechanisms for responding to abnormal scenarios. In addition, it incorporates HMIB’s performance in human–machine interaction and subjective perception, thereby enabling a multidimensional quantitative assessment of the system’s capabilities.

3.3.2. Indicator Quantification Method

After determining the key indicators, to eliminate dimensional differences and ensure comparability, the raw data for each indicator is first subjected to min–max normalization, linearly mapping the values to the range [0, 1]. The formula is as follows:

x_{i}^{'} = \frac{x_{i} - x_{i}^{\min}}{x_{i}^{\max} - x_{i}^{\min}}, x_{i}^{'} \in [0, 1]

(1)

Here, $x_{i}$ is the original observation value of the i-th indicator, and $x_{i}^{\min}$ and $x_{i}^{\max}$ are the minimum and maximum values of that indicator across all samples or simulation tests, respectively. Normalization not only eliminates the influence of dimensionality but also provides a unified measurement basis for subsequent weight aggregation analysis.
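The normalization in Eq. (1) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `min_max_normalize`, the sample readings, and the handling of the degenerate case where all observations are equal are assumptions introduced here.

```python
import numpy as np

def min_max_normalize(values):
    """Map raw observations of one indicator to [0, 1] following Eq. (1)."""
    x = np.asarray(values, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo:
        # Eq. (1) is undefined when all samples are equal; returning zeros
        # is an assumption made only for this sketch.
        return np.zeros_like(x)
    return (x - lo) / (hi - lo)

# Hypothetical raw readings of a single indicator across five samples.
raw = [62.0, 75.0, 88.0, 91.0, 70.0]
normalized = min_max_normalize(raw)
print(normalized)
```

The smallest observation maps to 0 and the largest to 1, so the result depends on which samples are included in the minimum and maximum.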

After the indicators are made dimensionless, this study combines the Delphi expert interview method [74] with the Analytic Hierarchy Process (AHP) to determine weights that reflect the relative importance of each indicator in the comprehensive evaluation. The specific procedure is as follows:

First, ten industry experts from the fields of massage chair R&D and rehabilitation medicine were invited to participate. They conducted three rounds of anonymous scoring on the importance of the eight primary indicators (1 = very unimportant, 5 = very important). After each round, the average importance ${\bar{d}}_{i}$ and standard deviation $σ_{i}$ of each indicator were calculated, and the results were fed back to the experts for revised scoring. Indicators with $σ_{i} > 0.5$, which indicates relatively large divergence, were discussed and re-evaluated in subsequent rounds until all indicators had $σ_{i} < 0.5$, at which point expert opinions were considered to be basically consistent.

Based on the final round of average ratings from the Delphi process, an AHP judgment matrix was constructed, and the eigenvalue method was used to calculate the weight vector $w_{i}$ [75]. The elements of the judgment matrix were derived from pairwise comparisons of the Delphi mean values ${\bar{d}}_{i}$. A consistency check was then conducted to verify the logical coherence of the matrix: when the consistency ratio (CR) is less than 0.1, the weight allocation of the judgment matrix is considered to have good consistency and logical soundness.

The final obtained weight coefficients satisfy:

\sum_{i = 1}^{8} w_{i} = 1

(2)

After completing the normalization and weight determination, a weighted comprehensive evaluation model was established. Each normalized indicator $x_{i}^{'}$ was combined with its corresponding weight $w_{i}$ to calculate the final score, using the following formula:

S = \sum_{i = 1}^{8} w_{i} x_{i}^{'}

(3)

where $S$ represents the overall capability level of the system across the four stages defined in the MDT model. This model will play a central role in the subsequent questionnaire data analysis and case validation and will provide a reliable quantitative basis for intelligent grade classification.

3.3.3. Questionnaire Design and Implementation

To further verify the importance of the eight core evaluation indicators and obtain expert judgments on the subjective weights of each indicator, this study designed and conducted three rounds of Delphi surveys. The questionnaire content strictly adhered to the aforementioned indicator definitions (see Table 4), with each question employing a 5-point Likert scale (1 representing “very unimportant” and 5 representing “very important”), and was set as a mandatory item to ensure data completeness and comparability of the scoring results.

Each question corresponds to a key indicator, namely, Task Recognition Accuracy (P1), Abnormal Detection Sensitivity (P2), Recommendation Hit Rate (D1), Decision Response Latency (D2), Force Control Error (E1), Path Tracking Accuracy (E2), Physiological Feedback Response Rate (F1), and User Subjective Satisfaction (F2).

The expert sample consisted of 10 individuals, including engineering and technical specialists in massage device research and development, as well as clinical researchers in the field of rehabilitation medicine. The Delphi survey was conducted over three rounds, each lasting 5–7 days. During the intervals between rounds, feedback based on the statistical results was provided to guide the experts in revising their judgments toward achieving consensus.

The first-round questionnaire was distributed over a period of 7 days, and all 10 valid questionnaires were returned (response rate: 100%). Based on the first-round results, the mean and standard deviation of each indicator score were calculated, and the results for indicators with $σ > 0.5$ were fed back to the experts for re-evaluation in the second round. The second- and third-round questionnaires continued to use the same questions, providing feedback on the deviation of the average scores from the previous round, allowing experts to adjust their ratings accordingly. Each round was conducted at an interval of 4–5 days. When the standard deviation of all indicators fell below 0.5, it indicated that expert opinions had reached a high level of consensus. In total, 28 valid questionnaires were collected over the three rounds, providing a reliable data foundation for constructing the AHP judgment matrix and performing weight calculations.

3.3.4. Data Analysis and Quantification

(1) Expert scoring aggregation and normalization calculation

Based on the aforementioned nine valid questionnaires from the third round, the expert scoring data were aggregated and subjected to a consistency verification analysis, resulting in the following scoring matrix:

D = [d_{i k}], i = 1, …, 8; k = 1, …, 9

(4)

Calculate the average importance of the i-th indicator, where $N = 9$ is the number of experts in the third round:

{\bar{d}}_{i} = \frac{1}{N} \sum_{k = 1}^{N} d_{i k}

(5)

And standard deviation:

σ_{i} = \sqrt{\frac{1}{N - 1} \sum_{k = 1}^{N} {(d_{i k} - {\bar{d}}_{i})}^{2}}

(6)

As shown in Table 5, the $σ_{i}$ values of all indicators are less than 0.5, indicating that experts have reached a high degree of consensus on the importance of all candidate indicators. The indicator numbers in the table (P1–P2, D1–D2, E1–E2, I1–I2) correspond to the definitions provided in Section 3.3.1, where I1 and I2 are listed as F1 and F2.
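The aggregation in Eqs. (5) and (6) and the $σ_{i} < 0.5$ consensus criterion can be sketched as follows. The ratings below are hypothetical and cover only two indicators; the paper's actual expert scores are not reproduced here, and the function name `delphi_statistics` is an assumption for illustration.

```python
import numpy as np

def delphi_statistics(scores, sigma_limit=0.5):
    """Per-indicator mean, sample standard deviation, and consensus flag.

    `scores` has shape (indicators, experts), matching the matrix D in Eq. (4).
    """
    d = np.asarray(scores, dtype=float)
    mean = d.mean(axis=1)            # Eq. (5)
    std = d.std(axis=1, ddof=1)      # Eq. (6): divides by N - 1
    consensus = std < sigma_limit    # consensus criterion used in the Delphi rounds
    return mean, std, consensus

# Hypothetical 5-point ratings from nine experts for two indicators.
scores = [
    [5, 5, 5, 5, 5, 5, 5, 4, 4],   # tightly clustered ratings
    [2, 4, 3, 5, 3, 4, 2, 4, 3],   # widely spread ratings
]
mean, std, consensus = delphi_statistics(scores)
print(mean, std, consensus)
```

In this example the first indicator would be treated as settled, while the second would be returned to the experts for another round.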

The scores were further standardized using a linear normalization method to construct the initial weight vector. The specific calculation method is as follows:

w_{i}^{norm} = \frac{{\bar{d}}_{i}}{\sum_{j = 1}^{n} {\bar{d}}_{j}}, i = 1, …, n, n = 8

(7)

The calculation yields:

w^{norm} = [0.138, 0.099, 0.106, 0.138, 0.138, 0.138, 0.134, 0.109]

(8)

This weight vector will serve as a reference baseline for comparing structural consistency and robustness with the AHP calculation results in subsequent analyses.
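As a quick check of Eq. (7), the direct normalization can be recomputed from the third-round mean vector reported later in this section. The sketch below uses the means as printed (rounded to two decimals), so the last digit of a weight may differ slightly from Eq. (8); the variable names are illustrative.

```python
import numpy as np

# Third-round Delphi means for P1, P2, D1, D2, E1, E2, F1, F2 as reported in the text.
d_bar = np.array([4.78, 3.44, 3.67, 4.78, 4.78, 4.78, 4.67, 3.78])

# Eq. (7): each mean divided by the sum of all means.
w_norm = d_bar / d_bar.sum()
print(np.round(w_norm, 3))
```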

(2) Construction of the judgment matrix and consistency verification

The weight results obtained above directly reflect the relative importance relationships from expert ratings. However, to further verify the structural rationality and stability of indicator weights, this study introduces the AHP for consistency and robustness analysis.

Based on ${\bar{d}}_{i}$, an eight-dimensional judgment matrix $A = [a_{i j}]$ is constructed. Each element in the matrix $A$ reflects the relative importance of the i-th and j-th indicators. Subsequently, the eigenvalue method is used to solve the maximum eigenvalue $λ_{\max}$ and its corresponding eigenvector, and the eigenvector is normalized to obtain the preliminary weight vector $w_{i}$.

Specifically, each element of the judgment matrix is the ratio of the corresponding Delphi means:

a_{i j} = \frac{{\bar{d}}_{i}}{{\bar{d}}_{j}}, i, j = 1, …, 8

(9)

The average importance vector of the eight indicators obtained from the third round of the Delphi survey is $\bar{d} = [{\bar{d}}_{1}, {\bar{d}}_{2}, …, {\bar{d}}_{8}] = [4.78, 3.44, 3.67, 4.78, 4.78, 4.78, 4.67, 3.78]$. The resulting judgment matrix $A$, retaining three decimal places, is as follows:

A = [\begin{matrix} 1.000 & 1.383 & 1.325 & 1.004 & 1.004 & 0.986 & 1.010 & 1.243 \\ 0.724 & 1.000 & 0.951 & 0.723 & 0.726 & 0.706 & 0.751 & 0.922 \\ 0.754 & 1.051 & 1.000 & 0.759 & 0.758 & 0.758 & 0.780 & 0.972 \\ 0.991 & 1.384 & 1.317 & 1.000 & 0.997 & 0.992 & 1.029 & 1.247 \\ 0.996 & 1.378 & 1.319 & 1.003 & 1.000 & 0.992 & 1.019 & 1.263 \\ 1.014 & 1.417 & 1.319 & 1.008 & 1.008 & 1.000 & 1.036 & 1.250 \\ 0.990 & 1.332 & 1.282 & 0.973 & 0.982 & 0.966 & 1.000 & 1.236 \\ 0.805 & 1.084 & 1.029 & 0.802 & 0.792 & 0.801 & 0.809 & 1.000 \end{matrix}]

(10)

Perform eigenvalue decomposition on the judgment matrix $A$ and take the maximum eigenvalue $λ_{\max}$ and its corresponding eigenvector $v$:

Av = λ_{\max} v

(11)

Normalize $v$ so that the sum of its elements is 1:

w = \frac{v}{\sum_{i = 1}^{8} v_{i}}

(12)

Based on the above formula, the calculated weight vector is:

w \approx [0.125, 0.090, 0.096, 0.125, 0.125, 0.125, 0.122, 0.113]

(13)

The maximum eigenvalue is $λ_{\max} \approx 8.001$. To verify the consistency of the judgment matrix, its Consistency Index (CI) and Consistency Ratio (CR) are calculated:

CI = \frac{λ_{\max} - n}{n - 1} \approx 0.00014

(14)

CR = \frac{CI}{RI} = \frac{0.00014}{1.41} \approx 0.0001

(15)

The results show that the judgment matrix $A$ passes the AHP consistency test and has good hierarchical logic.
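The eigenvalue procedure in Eqs. (9), (11), (12), (14), and (15) can be sketched with NumPy. This illustration builds the matrix from exact ratios of the rounded Delphi means, so it is perfectly consistent: the principal eigenvalue equals $n$ up to floating-point error and the weights coincide with the direct normalization. It therefore does not attempt to reproduce the three-decimal matrix in (10) or the values reported in (13) to (15). The function name `ahp_weights` is an assumption for this sketch.

```python
import numpy as np

RI_8 = 1.41  # random index for n = 8, the value used in Eq. (15)

def ahp_weights(A):
    """Return (weights, lambda_max, CI) from the principal eigenpair of A."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    eigvals, eigvecs = np.linalg.eig(A)
    k = np.argmax(eigvals.real)          # index of the largest eigenvalue
    lam_max = eigvals[k].real
    v = np.abs(eigvecs[:, k].real)       # principal eigenvector, sign removed
    w = v / v.sum()                      # Eq. (12): normalize to sum 1
    ci = (lam_max - n) / (n - 1)         # Eq. (14)
    return w, lam_max, ci

d_bar = np.array([4.78, 3.44, 3.67, 4.78, 4.78, 4.78, 4.67, 3.78])
A = d_bar[:, None] / d_bar[None, :]      # Eq. (9): a_ij = d_i / d_j

w, lam_max, ci = ahp_weights(A)
cr = ci / RI_8                           # Eq. (15)
print(np.round(w, 3), lam_max, cr)
```

A matrix assembled from independently rounded or independently judged entries would generally give a principal eigenvalue slightly different from $n$, which is what the CI and CR are designed to measure.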

(3) Perturbation Analysis and Weight Robustness Testing

To verify the stability of the constructed weight vector $w$ within expert judgments, this study further introduces minor symmetric perturbations to the judgment matrix $A$, and conducts a sensitivity analysis of the weight allocation to assess its robustness.

With the upper limit of the disturbance set to $δ \approx 0.02$, for each pair of indicators $i \neq j$ a symmetric perturbation $ε_{i j} \sim U (- δ, δ)$ is applied to the original ratio $a_{i j} = \frac{{\bar{d}}_{i}}{{\bar{d}}_{j}}$, with $ε_{j i} = - ε_{i j}$ and $ε_{i i} = 0$.

The matrix elements after slight perturbation are defined as follows, where $a_{i j}^{(0)}$ denotes the original element:

a_{i j}^{'} = a_{i j}^{(0)} (1 + ε_{i j}), i, j = 1, …, 8

(16)

The perturbation matrix $ε$ is:

ε = [\begin{matrix} 0.000 & 0.002 & 0.009 & 0.004 & 0.002 & - 0.003 & 0.006 & - 0.002 \\ - 0.002 & 0.000 & 0.016 & 0.019 & - 0.005 & 0.012 & 0.001 & 0.003 \\ - 0.009 & - 0.016 & 0.000 & 0.000 & - 0.017 & - 0.017 & - 0.019 & 0.013 \\ - 0.004 & - 0.019 & - 0.017 & - 0.017 & 0.011 & 0.015 & 0.019 & 0.012 \\ - 0.002 & 0.005 & 0.017 & 0.017 & 0.000 & - 0.002 & 0.011 & - 0.015 \\ 0.003 & - 0.012 & 0.017 & 0.017 & 0.002 & 0.000 & - 0.007 & 0.007 \\ - 0.006 & - 0.001 & 0.019 & 0.019 & - 0.011 & 0.007 & 0.000 & - 0.008 \\ 0.002 & - 0.003 & - 0.013 & - 0.013 & 0.015 & - 0.007 & 0.008 & 0.000 \end{matrix}]

(17)

Applying these perturbations to $A$ yields the perturbed judgment matrix $A'$:

A^{'} = [\begin{matrix} 1.000 & 1.386 & 1.337 & 1.008 & 1.006 & 0.983 & 1.016 & 1.241 \\ 0.723 & 1.000 & 0.966 & 0.737 & 0.722 & 0.714 & 0.752 & 0.925 \\ 0.747 & 1.034 & 1.000 & 0.772 & 0.745 & 0.745 & 0.765 & 0.985 \\ 0.987 & 1.358 & 1.295 & 1.000 & 1.008 & 1.007 & 1.049 & 1.262 \\ 0.994 & 1.385 & 1.341 & 0.992 & 1.000 & 0.990 & 1.030 & 1.244 \\ 1.021 & 1.401 & 1.293 & 1.023 & 1.026 & 1.000 & 1.029 & 1.267 \\ 0.984 & 1.349 & 1.256 & 0.993 & 0.992 & 0.959 & 1.000 & 1.226 \\ 0.822 & 1.118 & 1.040 & 0.792 & 0.801 & 0.793 & 0.815 & 1.000 \end{matrix}]

(18)

Repeat the eigenvalue decomposition and normalization steps for $A'$ to obtain the perturbed weight vector $w^{'}$:

w^{'} \approx [0.138, 0.100, 0.104, 0.138, 0.138, 0.137, 0.129, 0.117]

(19)

Compare the deviation $Δ w_{i} = |w_{i}^{'} - w_{i}|$ between the original weight $w_{i}$ and the perturbed weight $w_{i}^{'}$ of each indicator.

As shown in Table 6, the deviation of all indicators is less than 0.02, meeting the ±2% perturbation sensitivity requirement. This indicates that the constructed weight structure exhibits good robustness.

The maximum eigenvalue and consistency test results of the perturbed matrix $A'$ are as follows:

CI = \frac{λ_{\max} - n}{n - 1} \approx 0.002

(20)

CR = \frac{CI}{RI} = \frac{0.002}{1.41} \approx 0.0014

(21)

The results show that the perturbed judgment matrix still meets the consistency requirements of AHP.
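The perturbation test can be sketched as follows, assuming antisymmetric relative perturbations drawn from $U(-δ, δ)$ with $δ = 0.02$ as described above. The random seed, the helper name `principal_eigen`, and the use of the exact ratio matrix as the starting point are assumptions of this sketch, so the draws will not match the matrix printed in (17).

```python
import numpy as np

def principal_eigen(A):
    """Return (lambda_max, normalized principal eigenvector) of A."""
    eigvals, eigvecs = np.linalg.eig(A)
    k = np.argmax(eigvals.real)
    v = np.abs(eigvecs[:, k].real)
    return eigvals[k].real, v / v.sum()

rng = np.random.default_rng(0)   # fixed seed, chosen only for repeatability
delta = 0.02                     # upper limit of the disturbance
n = 8

d_bar = np.array([4.78, 3.44, 3.67, 4.78, 4.78, 4.78, 4.67, 3.78])
A0 = d_bar[:, None] / d_bar[None, :]           # unperturbed ratios

# Antisymmetric perturbation: eps_ji = -eps_ij and eps_ii = 0.
upper = np.triu(rng.uniform(-delta, delta, size=(n, n)), k=1)
eps = upper - upper.T

A1 = A0 * (1.0 + eps)                          # Eq. (16)

_, w0 = principal_eigen(A0)
lam1, w1 = principal_eigen(A1)

max_dev = np.max(np.abs(w1 - w0))              # largest weight deviation
cr1 = ((lam1 - n) / (n - 1)) / 1.41            # CR of the perturbed matrix
print(max_dev, cr1)
```

Repeating the draw with several seeds would give a distribution of deviations rather than a single value, which may be a more informative robustness summary than one perturbed matrix.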

Comparison with the direct normalization results shows that both methods maintain identical rankings, with weight differences across all indicators not exceeding 0.013. This finding indicates that the expert ratings exhibit good internal consistency, and the priority structure reflected in the ratings demonstrates strong stability under mathematical modeling. Therefore, the AHP method, serving as a structural validation tool, confirmed that the rating system achieves consistency within the pairwise comparison matrix.

(4) Distribution of indicator weights and interpretation of system functions

As shown in Figure 5, the radar chart distribution of the eight indicator weights is generally balanced, with a slight structural emphasis. The weights range from 0.090 to 0.125, with a range of 0.035 and a coefficient of variation of 12.4%, indicating that the importance of the indicators fluctuates minimally in expert evaluations. Overall, the high concentration reflects a solid foundation for weight allocation.

On this basis, the actual focus reflected by the different weights can be further interpreted from the perspective of system functionality: E1, E2, and P1 have the highest weights, indicating that the system’s ability to accurately perceive the user’s current state and precisely execute massage actions is the core factor determining its overall intelligence level. D2 and I1 highlight the important role of real-time decision-making capabilities and closed-loop adjustments based on the user’s physiological signals in enhancing the system’s intelligence. In contrast, although D1, P2, and I2 are also indispensable supporting indicators in the evaluation system, their weights are relatively low, classifying them as secondary supporting indicators.

In summary, this section not only verifies the rationality of constructing the eight indicators but also provides a basis for the hierarchical classification of system capabilities in the subsequent analysis.

3.3.5. Determination of the Grading Index System

After completing the construction of the MDT model (S1–S6) and the normalization and weight assignment of the eight performance indicators, this section structurally integrates the indicators to build a three-tier graded evaluation index system. This system covers multidimensional core tasks and coupling capabilities, as shown in Figure 6.

In the FDS dimension, the evaluation indicators focus on the system’s ability to substitute for and assist the user’s operational perception, corresponding to stages S1 and S2 in the MDT model. P1 measures the task perception accuracy in S1, reflecting the system’s sensitivity in detecting input signals. D1 reflects the responsiveness and preliminary judgment capability in S2, indicating the speed and accuracy of front-end decision-making. Together, these two indicators reveal the degree of autonomy of the IMS during the perception and decision-making stages.

In the APM dimension, the indicators focus on reflecting the system’s safety monitoring and real-time intervention capabilities, primarily covering stages S3 to S5. P2 evaluates the quality of perception–cognition integration in S3, as well as the responsiveness and flexibility in the subsequent decision-making chain. D2 measures the timeliness from anomaly detection in S4 to the triggering of protective interventions. E1 reflects the precision with which S5 controls execution timing, rhythm, and force response. Collectively, these three indicators support the system’s safety, efficiency, and reliability during the execution and feedback stages.

In the HMIB dimension, the indicators address both execution accuracy and user experience, primarily covering stages S4 and S6, forming a key evaluation basis for user interaction and human-factor feedback. E2 focuses on the adaptability of S6 in tuning massage parameters to align with users’ physiological signals and subjective experiences, reflecting the system’s human–machine coordination. F1 evaluates the degree of adaptiveness in S6 when automatically adjusting massage parameters in response to real-time physiological signals such as heart rate and electromyography. F2 reflects the comfort and user satisfaction of S1 during the service interaction process.

Through the systematic integration of the eight indicators within the three major dimensions, this study has developed a closed-loop, multidimensional, and quantifiable hierarchical evaluation system for IMS. This framework provides a solid theoretical and empirical foundation for the subsequent clear delineation of intelligent level thresholds.

4. Results

4.1. Intelligent Level Classification

4.1.1. Calculation of Three-Dimensional Ability Scores

Based on the normalization and weighting results of the eight core indicators, this section performs a weighted integration of the quantitative results across the three mechanisms, thereby obtaining the IMS’ composite scores for the three capability dimensions on a unified numerical scale, enabling direct comparison and classification of system performance levels.

The specific mapping relationship and calculation formula are as follows:

\begin{matrix} \mathrm{FDS\_score} = w_{P 1} \times P 1 + w_{D 1} \times D 1 \\ \mathrm{APM\_score} = w_{P 2} \times P 2 + w_{D 2} \times D 2 \\ \mathrm{HMIB\_score} = w_{E 1} \times E 1 + w_{E 2} \times E 2 + w_{I 1} \times I 1 + w_{I 2} \times I 2 \end{matrix}

(22)

Here, the weight vector $w_{i}$ is taken from the consistency-checked result in Section 3.3.4, and all indicators have been normalized to [0, 1]:

[w_{P 1}, w_{P 2}, w_{D 1}, w_{D 2}, w_{E 1}, w_{E 2}, w_{I 1}, w_{I 2}] \approx [0.125, 0.090, 0.096, 0.125, 0.125, 0.125, 0.122, 0.113]

(23)

Under this weighted calculation framework, FDS_score reflects the degree of autonomy of the IMS in the perception and decision-making stages, APM_score evaluates the system’s ability for anomaly monitoring and safety intervention in the execution and feedback phases, while HMIB_score focuses on the level of multi-region coordination and user feedback optimization during the execution and feedback phases. By obtaining the three-dimensional score results, the functional core of the IMS can be quantitatively analyzed, providing a scientific basis for subsequent grading modeling and capability evaluation.
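The dimension-level aggregation in Eq. (22) can be illustrated with the weights from Eq. (23). In this sketch the HMIB term is taken to combine the two execution indicators (E1, E2 in the weight vector) and the two feedback indicators (I1, I2), which is an assumption about the notation. The indicator values are hypothetical normalized scores, not measurements from the paper.

```python
# Weights from Eq. (23), keyed by indicator name.
WEIGHTS = {
    "P1": 0.125, "P2": 0.090, "D1": 0.096, "D2": 0.125,
    "E1": 0.125, "E2": 0.125, "I1": 0.122, "I2": 0.113,
}

# Indicator-to-dimension grouping as read from Eq. (22) (assumed mapping).
DIMENSIONS = {
    "FDS": ("P1", "D1"),
    "APM": ("P2", "D2"),
    "HMIB": ("E1", "E2", "I1", "I2"),
}

def dimension_scores(indicators):
    """Weighted sum of normalized indicator values for each dimension."""
    return {
        dim: sum(WEIGHTS[name] * indicators[name] for name in names)
        for dim, names in DIMENSIONS.items()
    }

# Hypothetical normalized indicator values in [0, 1].
sample = {
    "P1": 0.80, "P2": 0.60, "D1": 0.70, "D2": 0.50,
    "E1": 0.90, "E2": 0.85, "I1": 0.40, "I2": 0.60,
}
scores = dimension_scores(sample)
print(scores)
```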

4.1.2. Threshold Design and Grading

After completing the three-dimensional capability score calculations for FDS_score, APM_score, and HMIB_score, this section further develops a grading standard aligned with the MDT model and the three major capability mechanisms. Given that the MDT model decomposes the IMS operation process into S1–S6, and drawing reference from the six-level hierarchy used in the field of autonomous driving, this study establishes an L0–L5 six-level capability framework to achieve structural correspondence between task granularity and grading. The six-level framework adopts a threshold design method based on equal-interval division combined with expert calibration. First, the continuous score range [0, 1] is divided into six equal segments, each with a length of approximately 0.167, bounded by seven initial boundary points: 0.000, 0.167, 0.333, 0.500, 0.667, 0.833, and 1.000. This results in six continuous intervals, corresponding to levels L0 through L5, thus establishing a direct mapping between the scoring scale and task levels (S1–S6).

On this basis, qualitative evaluations by Delphi experts of the three-dimensional score distributions for typical cases were incorporated to fine-tune two key boundary points: the L1 → L2 boundary was raised from 0.333 to 0.350 to more accurately distinguish between “limited agency” and “intermediate autonomy” capabilities; the L3 → L4 boundary was lowered from 0.667 to 0.650 to better match the actual distribution of system capabilities. The final threshold division results are shown in Table 7.

The quantitative thresholds of FDS_score, APM_score, and HMIB_score within the L0–L5 six-level framework balance interval length against separability, providing a repeatable quantitative standard for the subsequent level determination of each instance.
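The threshold lookup can be sketched with a sorted boundary list. The five interior boundaries below are taken from the text (equal-interval values with the two expert-calibrated adjustments); Table 7 is not reproduced here. Treating each interval as closed on the left and open on the right, with 1.000 included in L5, is an assumption of this sketch, as is the function name `grade`.

```python
from bisect import bisect_right

# Interior boundaries after expert calibration (0.333 -> 0.350, 0.667 -> 0.650).
BOUNDARIES = [0.167, 0.350, 0.500, 0.650, 0.833]

def grade(score):
    """Map a score in [0, 1] to a level label L0..L5."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    # bisect_right counts how many boundaries are <= score, which is the level index.
    return f"L{bisect_right(BOUNDARIES, score)}"

for s in (0.10, 0.167, 0.349, 0.35, 0.70, 1.0):
    print(s, grade(s))
```

Keeping the boundaries in one list makes it straightforward to re-run a classification if the calibrated thresholds are revised.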

4.1.3. Comprehensive Intelligence Calculation

Based on the multidimensional grading framework, to address potential inconsistencies in dimensionality that may arise when determining grades from three-dimensional scores, this study further introduces a unified metric, “Comprehensive Intelligence $S_{0}$”, which integrates the results of FDS_score, APM_score, and HMIB_score across the three dimensions. This integration ensures dimensional uniformity and weighted aggregation across dimensions.

To this end, a paired-comparison questionnaire was designed, inviting eight domain experts with extensive experience in the development of intelligent massage and rehabilitation devices to perform pairwise comparisons across the three capability dimensions, assigning a relative intensity score (1–5) for each comparison. Each expert was required to select the dimension they considered more important in each pair and provide an intensity rating. The expert scoring results are shown in Table 8. For example, if an expert judged “FDS to be more important than APM” and assigned a score of 5, this would be recorded as FDS > APM with an intensity of 5.

Based on the above scores, the average score for each pairwise comparison was first calculated and normalized to relative proportions. After further normalization, the resulting relative weights for the three dimensions were: FDS = 0.341, APM = 0.254, and HMIB = 0.405.

Accordingly, the unified index for comprehensive intelligence $S_{0}$ is calculated as:

S_{0} = 0.341 \times \mathrm{FDS\_score} + 0.254 \times \mathrm{APM\_score} + 0.405 \times \mathrm{HMIB\_score}

(24)

This index has been applied in the subsequent grading process, ensuring both the rationality of contributions across dimensions and the consistency of the aggregated results.
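Eq. (24) reduces to a three-term weighted sum. The sketch below uses the dimension weights stated above; the example dimension scores and the function name `comprehensive_intelligence` are hypothetical and serve only to show the arithmetic.

```python
# Dimension weights from the expert pairwise comparison, as used in Eq. (24).
DIM_WEIGHTS = {"FDS": 0.341, "APM": 0.254, "HMIB": 0.405}

def comprehensive_intelligence(dim_scores):
    """Weighted aggregation of the three dimension scores into S0."""
    return sum(DIM_WEIGHTS[name] * dim_scores[name] for name in DIM_WEIGHTS)

# Hypothetical dimension scores for one system.
example = {"FDS": 0.60, "APM": 0.40, "HMIB": 0.50}
s0 = comprehensive_intelligence(example)
print(round(s0, 4))
```

Because the three weights sum to 1, equal dimension scores yield an identical value of $S_{0}$.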

4.2. Definition of L0–L5 Levels

After defining the three-dimensional composite scores and their corresponding threshold ranges, this section presents the capability characteristics for the six levels (L0–L5) in sequence. Each level is characterized based on its three-dimensional score range, in combination with the subtasks (S1–S6) in the MDT process and the degree of user involvement, to describe the progression of the system’s intelligence level.

4.2.1. L0—Mechanical Execution Level

When the IMS’ three-dimensional scores and $S_{0}$ all fall within the range of [0.00, 0.167), it indicates extremely limited capabilities in intelligent control, strategy generation, and human–machine interaction, possessing only the most basic mechanical execution functions. Therefore, it is defined as Level L0 (Mechanical Execution Level).

At this level, the IMS does not yet possess any capabilities for active perception, feedback correction, solution generation, or user modeling, relying entirely on manually preset logic to complete tasks. Its primary characteristic is the ability to perform fixed operations along a predetermined path, without responding or adapting to external environments or user states. To provide a more concrete depiction of the performance of IMS at this level, we analyze it in conjunction with the six key task dimensions (S1–S6) under the MDT framework, as shown in Table 9.

An L0-level IMS lacks capabilities in multiple task dimensions (S1, S2, S5, and S6) and only possesses basic path execution capabilities in S3 and S4. However, it cannot perform feedback control or strategy optimization based on external inputs or user states, exhibiting a distinct “rigid-drive” system characteristic. Therefore, an L0-level system relies entirely on user operations for step-by-step task control and does not yet demonstrate any form of intelligent behavior.

4.2.2. L1—Environmental Perception Level

When the IMS’ three-dimensional scores and $S_{0}$ all fall within the range [0.167, 0.350), the system is classified as the L1 Environmental Perception Level.

At this level, the intelligent massage system begins to exhibit environmental perception and basic early-warning capabilities through sensor data, but its autonomous decision-making and closed-loop feedback abilities remain limited. As shown in Table 10, the system can utilize a single pressure or posture sensor to perform basic detection of the user’s body contour and sitting posture, with P1 improved from near-zero at L0 to a low–medium level. In S2, the system can match and select massage programs from a limited library, with D1 slightly enhanced, but still requiring user verification and confirmation. S3–S4 continue to execute preset force and path, maintaining control errors within ±20%, without dynamic correction based on feedback. In S5, upon detecting predefined anomalies such as overpressure or overheating, the system can trigger alarms; however, D2 response latency remains high, and adjustments can only be made after user intervention. S6 is limited to recording basic usage data, such as massage duration and force settings.

L1 marks the IMS’ initial transition from “none” to “basic” capability in the perception and decision-making stages. At this level, the system overcomes the limitations of purely mechanical execution by introducing environmental perception and early-warning functions for the first time, signifying a critical step toward intelligence. However, due to the absence of feedback regulation and proactive response mechanisms, the user must still maintain a high degree of involvement in program matching and anomaly handling.

4.2.3. L2—Intelligent Assistance Level

When the IMS’ three-dimensional scores and $S_{0}$ all fall within the range [0.350, 0.500), the system possesses semi-automatic control capabilities across the perception, decision, execution, and feedback stages, thereby reaching the L2 Intelligent Assistance Level.

At this level, the system provides semi-automatic execution for most of the six core tasks. As shown in Table 11, S1 and S2 can perform multi-angle body scanning and generate personalized massage programs from the preset library based on the sensing results, with P1 and D1 reaching medium levels. S3 and S4 can keep force and path deviation within ±10% and perform limited real-time trajectory adjustments through feedback. S5 can detect and promptly respond to common anomalies such as overpressure and mechanical jamming, with shorter D2 response times, and the system can automatically execute shutdown or reset operations without user confirmation to ensure safety and operational continuity. S6 introduces initial physiological signal inputs, such as heart rate and skin conductivity, for simple real-time adjustments based on user feedback, indicating the system’s preliminary closed-loop active response capability.

At the L2 level, the IMS has significantly reduced its reliance on active user operations during task execution, enabling the completion of high-quality operational processes in most standard operation scenarios through a semi-automatic closed-loop approach, thus laying the foundation for achieving higher levels of fully autonomous intelligent control in subsequent stages.

4.2.4. L3—Autonomous Decision-Making Level

When the IMS’ three-dimensional scores and $S_{0}$ all fall within the range [0.500, 0.650), the system reaches the L3 Autonomous Decision-Making Level.

At this level, the IMS not only possesses in-depth cross-domain coordination capabilities but can also continuously optimize control strategies online through learning. As shown in Table 12, S3 and S4 can dynamically allocate and sequence force among the back, neck, and leg massage units, and E1 and E2 errors are further reduced. S5 begins to perform multi-stage intelligent intervention on complex physiological signal waveforms: the IMS is no longer limited to passive alarms based on preset thresholds but can issue early warnings based on adaptive analysis of multimodal signals and execute multi-level preemptive strategies. S6 uses online algorithms that leverage historical usage data and real-time feedback to automatically generate massage plans, form personalized recommendation models, and, once the user inputs the target, autonomously complete the execution and adaptive adjustment of the entire process without requiring secondary confirmation.

At the L3 level, the system significantly reduces its reliance on user-initiated operations and can perceive, assess, and dynamically adapt to multi-parameter environments. This advancement marks the IMS’ attainment of genuine autonomous decision-making capability, laying a solid foundation for higher-level fully automated and predictive interventions.

4.2.5. L4—Health Steward Level

When the IMS’ three-dimensional scores and $S_{0}$ all fall within the range [0.650, 0.800), the system reaches the L4 Health Steward Level.

At the L4 level, the IMS not only completes full-process automation (S1–S5) but also further assumes the responsibility of managing the user’s health. As shown in Table 13, during the sensing and execution stages (S1–S5), the system can perform multimodal sensing (heart rate, respiratory rate, muscle activity, body temperature) to achieve real-time monitoring and basic health index evaluation through multi-channel signals, and can link with clinical decision support systems to automatically complete data-based regulation and decision-making. The system can provide precise prediction of physiological abnormalities and assist users in implementing targeted health interventions. In addition, in the continuous learning aspect (S6), the system can analyze and learn from long-term accumulated data to build a personalized health dynamic model, thereby realizing continuous health maintenance and risk prediction for the user. At this stage, even without frequent manual input or preset indicators, the IMS can actively maintain and optimize health protection based on dynamic sensing results.

In terms of key technology implementation pathways, L4-level systems typically integrate:

Flexible electronic skin, millimeter-wave radar, and other multi-source sensors to enable non-intrusive monitoring of pressure, temperature, electromyography (EMG), electrodermal activity (EDA), electrocardiography (ECG), and vital signs.
Digital modeling of traditional Chinese medicine acupoints and path optimization algorithms to enhance precise adaptation to individualized acupoints.
Federated learning and cloud-based health profiling platforms to support multidevice data collaboration and secure sharing.

Multiple studies and patents have verified the feasibility of the above technologies. For example, a 120 GHz FMCW millimeter-wave radar can achieve heart rate and respiration monitoring with an accuracy of ±2 bpm within a range of 0.5–2 m, and signal processing algorithms can further enhance robustness [76,77,78]. The published patent WO2015174879A1 also highlights the potential application of such technologies in home and infant environments [79]. Overall, the L4 level possesses the capability to transform multi-source physiological signals into health intervention recommendations, representing a key stage toward proactive health management.

4.2.6. L5—Predictive Intervention Level

When the IMS’ three-dimensional scores and $S_{0}$ all fall within the range [0.800, 1.000], the system is classified as entering the L5 Predictive Intervention Level, the highest level of IMS intelligence.

At this level, the system achieves fully autonomous closed-loop operation across all S1–S6 tasks, with continuous learning, self-adaptation, and predictive intervention capabilities. As shown in Table 14, S1 and S2 integrate multimodal sensing with advanced deep learning models to enable high-precision recognition of user body shape, posture, and physiological state, supporting the automatic generation of personalized massage programs. S3 and S4 perform dynamic force allocation and sequencing across multiple body regions in real time, without requiring user intervention, achieving ultra-high control accuracy with a deviation within ±2%. S5 performs sequential analysis of ECG, respiration, EMG, and EDA signals to provide predictive alerts and implement proactive risk prevention actions (such as adjusting massage parameters to guide the user in preventive activities). S6 connects to a cloud-based health management platform to continuously track user health data, automatically complete personalized health records, and provide comprehensive recovery and health improvement recommendations based on predictive analysis results.

At the technical implementation level, L5-level systems typically integrate:

Multiple types of non-invasive sensors for long-term physiological monitoring.
Predictive algorithms such as machine learning and knowledge graphs to perform temporal prediction of health data and construct personalized user models.
Integration of edge computing and cloud services to enable cross-scenario data sharing and privacy protection.

At this level, users do not need to engage in frequent interactions; by providing only basic initial information, they can continuously enjoy personalized, predictive intelligent massage and health management services, thereby truly achieving a “hands-free” smart health experience.
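The threshold intervals stated in Sections 4.2.1–4.2.6 can be collected into a simple lookup, sketched below. The function names `level_of` and `classify_system` are assumptions. The text assigns a level when the three dimension scores and $S_{0}$ all fall within the same interval; how mixed cases are resolved is not specified in this section, so the sketch simply returns `None` for them.

```python
from bisect import bisect_right

# Lower bounds of the L0-L5 intervals; each interval is closed on the left,
# and the last one, [0.800, 1.000], is closed on both sides.
LOWER_BOUNDS = [0.000, 0.167, 0.350, 0.500, 0.650, 0.800]
LEVELS = ["L0", "L1", "L2", "L3", "L4", "L5"]


def level_of(score):
    """Map a single normalized score in [0, 1] to its level label."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must lie in [0, 1], got {score}")
    return LEVELS[bisect_right(LOWER_BOUNDS, score) - 1]


def classify_system(fds, apm, hmib, s0):
    """Return the level only if all four values agree, else None (assumption)."""
    levels = {level_of(value) for value in (fds, apm, hmib, s0)}
    return levels.pop() if len(levels) == 1 else None


print(level_of(0.425))
print(classify_system(0.40, 0.45, 0.42, 0.423))  # all four values in one interval
print(classify_system(0.40, 0.55, 0.42, 0.450))  # one dimension in a higher interval
```

Using `bisect_right` on the lower bounds keeps the left-closed, right-open convention of the intervals, so a score of exactly 0.167 maps to L1 rather than L0.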

4.3. L0–L5 Level Capability Analysis

To more intuitively present the differences in maturity across the S1–S6 tasks for levels L0–L5, this section introduces a visual analysis of capability coverage. As shown in Figure 7, six gradient intervals of light to dark blue are used to represent capability maturity levels (0–20%, 21–40%, 41–60%, 61–80%, 81–95%, 96–100%). The horizontal axis lists the intelligent levels from L0 to L5 in sequence, while the vertical axis corresponds to the six task dimensions S1–S6.

L0–L1: The main task dimensions fall within the lowest (0–20%) or initial (21–40%) maturity intervals, with only localized responses observed in S3 and S4. This indicates that system functions rely heavily on external triggers or mechanical execution, resulting in a low level of intelligence. S1, S2, S5, and S6 remain largely undeveloped, showing the absence of critical capability support.
L2–L3: Most task dimensions enter the medium maturity range (41–80%). Among them, the average maturity of five task dimensions (S1–S5) falls between 70% and 85%, while S6 rises sharply from about 10% at L1 to about 50% at L2. This reflects the system’s partial realization of closed-loop control and a significant enhancement in its ability to dynamically model user states.
L4: Most task dimensions fall within the 81–95% deep-blue range, with S1–S6 capabilities highly coordinated. This indicates that the system has achieved a high-quality closed loop of perception, decision-making, execution, and feedback.
L5: All task dimensions enter the 96–100% range, shown in the darkest blue in the figure, representing fully matured, end-to-end intelligent closed-loop capabilities.

Overall, Figure 7 clearly illustrates the coverage and maturity evolution trajectory of key task dimensions across the L0 to L5 levels, further validating the rationality and progressive nature of the multi-level intelligence grading framework proposed earlier. It provides clear quantitative support for mapping the capability evolution pathway of the IMS.
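For readers who want to reproduce the colour coding of Figure 7, the six maturity intervals can be expressed as a small lookup. This is only a sketch: the function name `maturity_band` is an assumption, and assigning fractional values such as 20.5% to the next higher band is a choice made here, since the intervals are stated for whole percentages.

```python
from bisect import bisect_left

# Upper bounds (in percent) of the six maturity intervals listed for Figure 7.
BAND_UPPER_BOUNDS = [20, 40, 60, 80, 95, 100]
BAND_LABELS = ["0–20%", "21–40%", "41–60%", "61–80%", "81–95%", "96–100%"]


def maturity_band(percent):
    """Return the maturity interval that contains the given percentage."""
    if not 0 <= percent <= 100:
        raise ValueError(f"percent must lie in [0, 100], got {percent}")
    return BAND_LABELS[bisect_left(BAND_UPPER_BOUNDS, percent)]


# Approximate S6 maturity values quoted in the text for L1 and L2.
print(maturity_band(10))
print(maturity_band(50))
```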

Figure 8 provides an integrated overview of the evolutionary characteristics of the L0–L5 levels across three dimensions: user roles/behaviors, system responsibilities and capabilities, and key technology support. The vertical steps reflect the progressive enhancement of IMS complexity and autonomy, while the gradient band at the bottom illustrates the advancing trend in intelligence levels. The progression moves from L0’s purely mechanical execution to L1’s introduction of basic environmental perception, L2’s semi-closed-loop fine-tuning, L3’s real-time decision-making, L4’s predictive health management, and ultimately L5’s end-to-end self-learning with continuous cloud-based optimization. This grading framework not only enables precise mapping between user participation and system capability but also underscores the central role of technological evolution in driving functional upgrades, providing a tiered reference framework for IMS development and application.

4.4. Grading System Verification

To evaluate the stability and robustness of the constructed L0–L5 grading system, this study adopts a numerical sensitivity analysis method to test the tolerance of the grading results to practical errors by simulating minor fluctuations in the input data. This method is based on the previously established comprehensive intelligence score calculation and threshold intervals, ensuring a rigorous mathematical foundation and reproducibility.

First, three groups of representative test vectors are selected:

Very low group: FDS_score, APM_score, and HMIB_score are all set to 0.05, and the calculated score of $S_{0}$ falls in the interval [0.000, 0.167), corresponding to level L0.
Midpoint group: The three-dimensional scores are all set to 0.425, and the $S_{0}$ score falls in the interval [0.350, 0.500), corresponding to the L2 level.
Extremely high group: The three-dimensional scores are all set to 0.95, and the $S_{0}$ score falls in the interval [0.800, 1.000], corresponding to level L5.

This initial mapping verified the accuracy of the grading thresholds in both extreme and intermediate scenarios. Subsequently, a ±10% perturbation test was conducted on the midpoint group to simulate minor deviations that may occur in practical applications. Specifically, each dimension score was increased and decreased by 10% (0.425 ± 0.0425), and the comprehensive intelligence score was recalculated to assess whether slight fluctuations could cause the system to jump to another grade (e.g., from L2 to L3).

$$S_{0}^{+10\%} = 0.341 \times 0.4675 + 0.254 \times 0.4675 + 0.405 \times 0.4675 = 0.4675 \tag{25}$$

$$S_{0}^{-10\%} = 0.341 \times 0.3825 + 0.254 \times 0.3825 + 0.405 \times 0.3825 = 0.3825 \tag{26}$$

Table 15 presents the intelligence scores under perturbation.

As shown in Table 15, under all perturbation scenarios, the comprehensive intelligence score of the midpoint group remains at the L2 level, with no occurrence of level jumping. This result indicates that the grading system demonstrates good fault tolerance and robustness to minor fluctuations in the input scores, maintaining stability in intelligence level determination within the common error range of ±10%.
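The perturbation check can be reproduced with a short script. The sketch below uses the weights of Equation (24) and the threshold intervals of Section 4.2. As an extension not taken from the text, it perturbs each dimension independently by −10%, 0, or +10% (27 combinations), which includes the uniform ±10% cases as special cases. All function names are assumptions.

```python
from bisect import bisect_right
from itertools import product

WEIGHTS = (0.341, 0.254, 0.405)  # FDS, APM, HMIB weights from Eq. (24)
LOWER_BOUNDS = [0.000, 0.167, 0.350, 0.500, 0.650, 0.800]
LEVELS = ["L0", "L1", "L2", "L3", "L4", "L5"]


def composite_score(scores):
    """Weighted sum of the three dimension scores."""
    return sum(w * s for w, s in zip(WEIGHTS, scores))


def level_of(score):
    """Map a composite score in [0, 1] to its level label."""
    return LEVELS[bisect_right(LOWER_BOUNDS, score) - 1]


def perturbation_levels(base_scores, rel=0.10):
    """Evaluate every combination of -rel / 0 / +rel relative shifts."""
    results = {}
    for shifts in product((-rel, 0.0, rel), repeat=len(base_scores)):
        scores = [s * (1.0 + d) for s, d in zip(base_scores, shifts)]
        s0 = composite_score(scores)
        results[shifts] = (s0, level_of(s0))
    return results


results = perturbation_levels((0.425, 0.425, 0.425))
s0_values = [s0 for s0, _ in results.values()]
print(f"S0 range: {min(s0_values):.4f} to {max(s0_values):.4f}")
print("levels seen:", sorted({level for _, level in results.values()}))
```

Because the weights sum to 1, a uniform shift of all three scores moves $S_{0}$ by the same relative amount, so the midpoint group would need a shift of roughly 18% in either direction before crossing the L2 boundaries at 0.350 and 0.500.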

5. Conclusions

Although capability grading of IMS across the entire process of perception, decision-making, execution, and feedback is urgently needed, existing methods still lack a systematic and quantifiable evaluation framework, making it difficult to provide precise guidance for product design, performance benchmarking, and technical optimization. This study therefore proposes and validates, for the first time, a structured and quantifiable six-level (L0–L5) grading framework from a task-driven perspective, achieving accurate characterization and stable evaluation of the intelligence level of IMS.

The research mainly includes three stages:

(1): To address the lack of evaluation for functional depth and proactiveness in the IMS operational process, an MDT model is proposed, which divides the system operation process into six subtasks (S1–S6). On this basis, a three-dimensional capability measurement framework, consisting of FDS, APM, and HMIB, is established to characterize the system’s agent capability and proactiveness level in each subtask.
(2): In response to the current lack of reproducible quantitative methods, eight key performance indicators were selected to comprehensively reflect the intelligence level of the system. The Delphi method combined with the AHP was used to determine the weights of the indicators. The three-dimensional capability scores and the overall intelligence degree were then calculated through normalization and weighted summation. Subsequently, within the [0, 1] interval, the grading thresholds for L0–L5 were determined by combining equal-interval division with expert calibration, thereby achieving a quantitative mapping from the original performance indicators to the corresponding intelligence levels.
(3): To address potential minor errors that may occur in the practical application of the grading system, experiments involving extreme-value and midpoint mapping of typical test vectors were designed, and a sensitivity analysis was conducted by introducing ±10% numerical perturbations to the midpoint group. The results indicate that the grading decisions remain stable within common error ranges, thereby verifying the stability and robustness of the grading system.

This study has the following main contributions:

(1): This study introduces the task-driven concept into the IMS capability grading framework for the first time. It integrates the three-dimensional capability framework, comprising FDS, APM, and HMIB, with the MDT model to establish a structured “Task-Capability-Level” mapping. This work fills the research gap in capability grading within the IMS domain and provides a quantifiable theoretical foundation for evaluating the capabilities of intelligent healthcare devices.
(2): This study proposes a reproducible IMS grading method that combines the Delphi method and the AHP to determine indicator weights, followed by normalization and overall intelligence degree calculation. The resulting quantitative standard can be directly applied to product design, performance benchmarking and technical optimization.

Although this study has constructed and validated a systematic grading model and confirmed its stability, several limitations remain. The model validation relies primarily on simulated vectors and mathematical derivations and lacks support from real-world data. The indicator weights are derived from expert judgment, which inevitably introduces a degree of subjectivity. With the advancement of emerging technologies such as flexible sensing and brain–computer interfaces, the coverage of the existing task model and capability dimensions still has room for expansion. To address these issues, future research could leverage automated simulation platforms or industry collaboration to obtain multi-scenario empirical data and conduct dynamic and cross-scenario validation. Machine learning or big data methods could be integrated to enable adaptive optimization of weights and thresholds. Furthermore, new sensing and interaction technologies could be incorporated into the grading framework to dynamically expand task dimensions and grading criteria, thereby further enhancing the forward-looking nature and applicability of the grading and evaluation system.

Author Contributions

Conceptualization, L.W., J.W., J.H. and J.Q.; Methodology, L.W., J.W., M.G. and G.L.; Investigation, J.W., M.G., M.F. and X.Y.; Validation, J.W., M.F., X.Y., H.W. and B.C.; Formal analysis, L.W., J.W., G.L. and M.G.; Data curation, M.G., G.L. and B.C.; Visualization, L.W., J.W. and M.G.; Writing—original draft preparation, L.W. and J.W.; Writing—review and editing, J.H., J.Q. and Y.Z.; Project administration, J.H. and J.Q.; Funding acquisition, J.H. and J.Q. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Natural Science Foundation of China (52035007, 52475270), Ergonomics drives Rongtai Health Intelligent Innovation Design (24H020103149), and Shanghai Jiao Tong University “Multimodal Cognitive Model-Driven Intelligent Innovation Design Research” (AF4300019).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data reported in this study can be found in the sources cited in the reference list.

Conflicts of Interest

Authors Wang Junliang and Liu Guangtao are employed by Shanghai Rongtai Health Technology Co., Ltd., the funding sponsor. They contributed to the design of the study, the collection, analysis, and interpretation of data, as well as the writing of the manuscript and the decision to publish the results. The other authors declare no conflicts of interest.

References

Su, J. Promoting the steady and healthy development of China’s economy through the dual expansion of demand and supply. State-Own. Assets Rep. 2025, 90–93.
Zhang, Y.; Lu, S. Behavioral decision-making of treatment plan of traditional Chinese medicine massage robot. J. Nat. Sci. Heilongjiang Univ. 2016, 33, 545–549.
Wang, Z. Design of Massage Manipulator and Research on Massage Effect Based on Surface Electromyography Signal. Ph.D. Thesis, Tianjin University, Tianjin, China, 2023.
Fan, C. Design and Analysis of Back Massage Robot Based on Traditional Chinese Medicine Massage Techniques. Ph.D. Thesis, Dongguan University of Technology, Dongguan, China, 2024.
Mei, Z. Research on Massage Manipulator Control Based on Vision and Force Feedback. Ph.D. Thesis, Wuhan University of Technology, Wuhan, China, 2024.
Zhou, Y.; Wang, Z.; Yu, X.; Hu, L.; Xu, Y. Research on fatigue recovery effect of different massage modes of massage chair on shoulder trapezius muscle. Furnit. Inter. Decor. 2024, 31, 67–71.
Peng, D.; Zhai, J.; Chen, Y. Multimodal user comfort monitoring for massage robots. Mech. Des. Manuf. 2025, 1–5.
Li, B.; Xu, B.; Xue, Y.; Li, J. Experimental study on the influence of EEG-based massage position on the comfort of wearable massager. Ergonomics 2023, 29, 26–31.
Wang, Q.; Wu, Y. Research on the design of integrated cervical massager based on emotional design strategy. Beauty Times 2024, 5, 139–142.
Yan, X.; Ji, Y. Research on the liability of autonomous driving car accidents from the perspective of SAE classification standards. Stand. Sci. 2019, 12, 50–54.
Zhang, X.; Sun, H. Analysis of GB/T 40429-2021 “Automobile Driving Automation Classification”. China Automob. 2022, 5, 3–5+7.
Society of Automotive Engineers (SAE). Taxonomy and Definitions for Terms Related to Driving Automation Systems for On-Road Motor Vehicles (J3016_201806); Society of Automotive Engineers (SAE): Warrendale, PA, USA, 2018.
Trypuz, R.; Kulicki, P.; Sopek, M. Ontology of autonomous driving based on the SAE J3016 standard. Semant. Web 2024, 15, 1837–1862.
NHTSA. Automated Driving Systems: A Vision for Safety 2.0. U.S. Department of Transportation. 2017. Available online: https://www.nhtsa.gov/technology-innovation/automated-vehicles-safety (accessed on 25 June 2025).
UNECE. UN Regulation No. 157-Automated Lane Keeping Systems (ALKS). World Forum for Harmonization of Vehicle Regulations (WP.29). 2021. Available online: https://unece.org/transport/standards/transport/vehicle-regulations-wp29 (accessed on 25 June 2025).
GB/T 40429-2021; Automobile Driving Automation Classification. State Administration for Market Regulation & National Administration of Standardization: Beijing, China, 2021.
Hu, J.; Deng, J. Analysis of the standard “Automobile Driving Automation Classification” (Draft for Approval). Environ. Technol. 2020, 38, 192–195.
Lv, Z.; Tian, Y. The classification is clearer. The Chinese version of the autonomous driving classification standard is announced. Intell. Connect. Veh. 2020, 2, 13–15.
Yuan, P. The dilemma of attribution of liability for traffic accidents caused by autonomous driving vehicles and the criminal law response. J. South China Univ. Technol. (Soc. Sci. Ed.) 2023, 25, 30–40.
Liu, Y.; Zhan, J.; Li, S.; Li, X.; Chen, J. The future of autonomous driving technology: Single-vehicle intelligence and intelligent vehicle-road collaboration. J. Automot. Saf. Energy Conserv. 2024, 15, 611–633.
Bimbraw, K. Autonomous cars: Past, present and future a review of the developments in the last century, the present scenario and the expected future of autonomous vehicle technology. In Proceedings of the 2015 12th International Conference on Informatics in Control, Automation and Robotics (ICINCO), Colmar, France, 21–23 July 2015; IEEE: New York, NY, USA, 2015; Volume 1, pp. 191–198.
Chen, Y. Current status and development trend of autonomous driving technology research. Energy Technol. Manag. 2021, 46, 34–37.
Xinhuanet. The Penetration Rate of L2 Autonomous Driving Functions in China’s Passenger Cars Has Reached 55.7% [EB/OL]. Available online: https://www.news.cn/fortune/20250226/5f2e955579de42cfb51dde1eb46a4ff5/c.html (accessed on 26 June 2025).
PDay Smart Car Network. L3 Autonomous Driving Pilot Projects Continue to Advance, and Road Tests are Launched in Beijing and Other Places [EB/OL]. Available online: https://www.pday.com.cn/Htmls/Report/202502/24550682.html (accessed on 26 June 2025).
Atakishiyev, S.; Salameh, M.; Yao, H.; Goebel, R. Explainable artificial intelligence for autonomous driving: An overview and guide for future research directions. IEEE Access 2024, 12, 101603–101625.
Badue, C.; Guidolini, R.; Carneiro, R.V.; Azevedo, P.; Cardoso, V.B.; Forechi, A.; Jesus, L.; Berriel, R.; Paixão, T.M.; Mutz, F.; et al. Self-Driving Cars: A Survey. Expert Syst. Appl. 2021, 165, 113816.
Kosuru, V.S.R.; Venkitaraman, A.K. Advancements and challenges in achieving fully autonomous self-driving vehicles. World J. Adv. Res. Rev. 2023, 18, 161–167.
Geng, L. Autonomous Driving Driven by Artificial Intelligence: Development Status and Future Prospects. Comput. Artif. Intell. 2025, 2, 29–36.
Sever, T.; Contissa, G. Automated driving regulations—Where are we Now? Transp. Res. Interdiscip. Perspect. 2024, 24, 101033.
Su, Y.-S.; Huang, H.; Daim, T.; Chien, P.-W.; Peng, R.-L.; Akgul, A.K. Assessing the technological trajectory of 5G-V2X autonomous driving inventions: Use of patent Analysis. Technol. Forecast. Soc. Change 2023, 196, 122817.
Gora, P.; Rüb, I. Traffic Models for Self-driving Connected Cars. Transp. Res. Procedia 2016, 14, 2207–2216.
Fu, Z. The Current Development and Future Prospects of Autonomous Driving Driven by Artificial Intelligence. Comput. Artif. Intell. 2025, 2, 8–15.
Sheng, S. Review of Development Trends and Challenges in Autonomous Driving Technology. Appl. Comput. Eng. 2025, 140, 113–118.
Deng, X.; Wang, L.; Gui, J.; Jiang, P.; Chen, X.; Zeng, F.; Wan, S. A review of 6G autonomous intelligent transportation systems: Mechanisms, applications and Challenges. J. Syst. Archit. 2023, 142, 102929.
Zheng, J. Application of autonomous driving technology in intelligent connected vehicles. Automob. Pict. 2024, 8, 30–32.
Zhu, Z.; Zhang, Y.; Zhang, Z.; Peng, Z.; Zhu, T. A multi-sensor post-fusion method for autonomous driving. Bus Technol. Res. 2025, 47, 1–5.
Guo, J. Application of multi-sensor fusion in environmental perception of autonomous driving vehicles. Automob. Maint. Tech. 2025, 2, 13–14.
Hu, C. On the development status and safety considerations of autonomous driving technology of intelligent vehicles in my country. J. Hunan Police Coll. 2019, 31, 99–106.
Hangzhou Auto Channel. Several Car Companies Have Clarified the Timetable for Mass Production of L3 Autonomous Driving [EB/OL]. Available online: https://auto.hangzhou.com.cn/reading/content/2025-04/24/content_8981898.html (accessed on 28 June 2025).
Sina Finance. BYD, Xiaopeng and Others Are Accelerating the Promotion of High-End Intelligent Driving [EB/OL]. Available online: https://finance.sina.com.cn/stock/zqgd/2025-02-21/doc-inemeuym3096446.shtml (accessed on 28 June 2025).
Ministry of Industry and Information Technology, Ministry of Civil Affairs, National Health Commission. Action Plan for the Development of Smart Health Care Industry (2021–2025); Ministry of Industry and Information Technology: Beijing, China, 2021.
Zheng, M. The classification of automobile driving automation has a basis. China Transp. News 2022, 5.
China Industry Information Network. China’s Massage Chair Industry Chain Map, Industrial Environment, Market Status and Future Trend Analysis in 2025 [EB/OL]. Available online: https://www.chyxx.com/industry/1222032.html (accessed on 29 June 2025).
Zhou, L.; Feng, Z.; Cai, Z.; Yang, X.; Ai, C.; Shao, H.; Hu, Z. A Massage Area Positioning Algorithm for Intelligent Massage System. Comput. Intell. Neurosci. 2022, 2022, 7678516.
Zhu, S.; Lei, J.; Chen, D. Recognition Method of Massage Techniques Based on Attention Mechanism and Convolutional Long Short-Term Memory Neural Network. Sensors 2022, 22, 5632.
Luo, R.C.; Chen, S.Y.; Yeh, K.C. Human body trajectory generation using point cloud data for robotics massage applications. In Proceedings of the 2014 IEEE International Conference on Robotics and Automation (ICRA), Hong Kong, China, 31 May–7 June 2014.
Sun, K.; Zhao, Q.; Yang, Z.; Xu, X. Visual feedback system for traditional chinese medical massage robot. In Proceedings of the 2019 Chinese Control Conference (CCC), Guangzhou, China, 27–30 July 2019; IEEE: New York, NY, USA, 2019; pp. 6379–6385.
Hu, W.; Sheng, Q.; Sheng, X. A novel realtime vision-based acupoint estimation for TCM massage robot. In Proceedings of the 2021 27th International Conference on Mechatronics and Machine Vision in Practice (M2VIP), Shanghai, China, 26–28 November 2021; IEEE: New York, NY, USA, 2021; pp. 771–776.
Zhang, Q.; Zhang, Y.; Li, C. Application of genetic algorithm optimized BP neural network in traditional Chinese medicine massage robot. Appl. Sci. Technol. 2017, 44, 5.
Sun, L.; Sun, S.; Fu, Y.; Zhao, X. Acupoint detection based on deep convolutional neural network. In Proceedings of the 2020 39th Chinese control conference (CCC), Shenyang, China, 27–29 July 2020; IEEE: New York, NY, USA, 2020; pp. 7418–7422.
Zhai, J.; Zeng, X.; Su, Z. An intelligent control system for robot massaging with uncertain skin characteristics. Ind. Robot Int. J. Robot Res. Appl. 2022, 49, 634–644.
Zhong, Y. Research on the technical principles and applications of massage robots. Appl. Comput. Eng. 2024, 77, 37–42.
Sina Finance. Ogawa AI Massage Robot Debuts, 5D Sensing Technology Leads the Industry Upgrade [EB/OL]. Available online: https://finance.sina.com.cn/wm/2025-05-14/doc-inewpmqy1033029.shtml (accessed on 30 June 2025).
Gu, H.; Wang, D. Massage Robot Using Machine Vision. CN111343958A, 26 June 2020. [Google Scholar]
Pan, X. A Massage Robot Acupoint Tracking System Based on Visual Positioning. CN110882150A, 17 March 2020. [Google Scholar]
Liu, L. Research on Expert System of Traditional Chinese Medicine Massage Robot. Ph.D. Thesis, Shandong Jianzhu University, Jinan, China, 2016. [Google Scholar]
Ma, Z. Intelligent Massage System Based on Big Data. Chinese Patent CN201910172730.2, 7 March 2019. [Google Scholar]
Fukushima, S.; Nakajima, R.; Nomura, J. A VR Relax/Refresh System Employing Physiological Feedback. IEEJ Trans. Electron. Inf. Syst. 1995, 115, 222–229.
Lin, C.; Chen, L.; Ren, Y.; Tang, B. Research on Intelligent Massage Chair Based on Fuzzy Control. Mech. Electr. Eng. 2011, 28, 201.
Jaafar, H.; Fariz, A.; Ahmad, S.A.; Yunus, N.A.M. Intelligent massage chair based on blood pressure and heart rate. In Proceedings of the 2012 IEEE-EMBS Conference on Biomedical Engineering and Sciences, Langkawi, Malaysia, 17–19 December 2012; IEEE: New York, NY, USA, 2012; pp. 514–518.
Hiyamizu, K.; Fujiwara, Y.; Genno, H.; Yasuda, M.; Koma, T. Development of human sensory sensor and application to massaging chairs. In Proceedings of the 2003 IEEE International Symposium on Computational Intelligence in Robotics and Automation. Computational Intelligence in Robotics and Automation for the New Millennium (Cat. No. 03EX694), Kobe, Japan, 16–20 July 2003; IEEE: New York, NY, USA, 2003; Volume 1, pp. 140–144.
Alavandar, S.; Sundaram, K.A.; Nigam, M.J. Genetic algorithm based robot massage. J. Theor. Appl. Inf. Technol. 2007, 3, 102–109.
Chen, H.; Shen, L.; Song, J. The influence of massage mode on the massage comfort of kneading massage chair. Ergonomics 2012, 2, 40–43.
Liu, M.; Dai, F.; Yang, T. Research on robot constant force massage based on admittance control. Mech. Des. 2025, 42, 116–122.
Ma, L.; Yao, G.; Ni, Q.; Zhu, Z. Research on parallel-series hybrid robot for Chinese medicine massage with rolling method. Mech. Des. Res. 2005, 12, 43–46.
Wang, W.; Yu, H.; Zhou, C.; Song, S. Based on ADAMS Design and analysis of bionic massage mechanical device. Robot. Appl. 2011, 44–46.
Xu, Q.; Deng, Z.; Zeng, C.; Li, Z.; He, B.; Zhang, J. Toward automatic robotic massage based on interactive trajectory planning and control. Complex Intell. Syst. 2024, 10, 4397–4407.
Kerautret, Y. Caractérisation des Techniques de Massage et Validation des Bénéfices Physiologiques d’un Système de Massage Robotique, Autonome et Interactif. Ph.D. Thesis, Université de Lyon, Lyon, France, 2021.
Shen, K. Research on Seated Electric Massage Chair. Ph.D. Thesis, China Jiliang University, Hangzhou, China, 2015.
Wang, Q.; Liu, W. Research on the design of massage chair products for the elderly. Furniture 2022, 40–44.
Yu, S.; Ma, L.; Guo, Z. Analysis of kinematic and dynamic characteristics of traditional Chinese medicine massage techniques. J. Shandong Univ. Technol. 2005, 19, 82–85.
Su, D. Design and Research of Adaptive Massage Chair Control System. Ph.D. Thesis, Anhui University of Technology, Ma’anshan, China, 2017.
Han, X.; Xu, P. Design of intelligent massage robot. Autom. Appl. 2024, 65, 74–76.
Li, X.; Shi, R.; Xu, L. Research on ESG evaluation system of Chinese automobile industry based on Delphi-AHP. Automob. Ind. Res. 2025, 35–40.
Ishizaka, A.; Nemery, P. Multi-Criteria Decision Analysis: Methods and Software; John Wiley & Sons: Hoboken, NJ, USA, 2013.
Garg, P. Remote Pulse Monitoring Using Millimeter Waves. 2021. Available online: https://www.diva-portal.org/smash/record.jsf?pid=diva2%3A1616868&dswid=4205 (accessed on 5 July 2025).
Iyer, S.; Zhao, L.; Mohan, M.P.; Jimeno, J.; Siyal, M.Y.; Alphones, A.; Karim, M.F. mm-Wave radar-based vital signs monitoring and arrhythmia detection using machine learning. Sensors 2022, 22, 3106.
Lv, W.; He, W.; Lin, X.; Miao, J. Non-contact monitoring of human vital signs using FMCW millimeter wave radar in the 120 GHz band. Sensors 2021, 21, 2732.
Lee, J.; Kim, D.; Choi, H. Vital Sign Monitoring System Using Millimeter Wave Radar. WO2015174879A1, 19 November 2015.

Figure 1. Evolution of the Autonomous Driving Grading System (created by the authors).

Figure 2. Technology evolution of intelligent massage systems (created by the authors).

Figure 3. MDT system (created by the authors).

Figure 4. Comparative analysis between IMS and the DDT (created by the authors).

Figure 5. Radar chart of the weights of the eight key indicators (created by the authors).

Figure 6. Evaluation indicator system for IMS, comprising eight indicators (created by the authors).

Figure 7. The maturity of capabilities at different levels (created by the authors).

Figure 8. IMS level classification (created by the authors).

Table 1. Evolution of function–design dimensions in IMS.

Dimension/Aspect	Example Function	Technology Example	Core Advantage
Massage Strategy	Adaptive adjustment driven by physiological signals	Heart-rate sensor, EMG detection + adaptive control algorithm	Real-time response to user state; high personalization
Massage Track	Dynamic path planning combining vision and algorithms	Depth camera/RGB camera + path-planning/optimization algorithms	Highly precise trajectories; adapts to varied postures and body shapes
Massage Mode	Traditional motions (kneading, tapping, rolling) + multimodal operations (heat therapy, electrical pulse, air-bag compression)	Multi-axis motor drive + heating elements + pneumatic control + pulse modules	Rich, composite actions; enhanced comfort and efficacy

Table 2. Mapping of MDT subtask functional delegation states.

Subtasks	N	L	F
S1	Fully manual	Basic perception	Closed-loop intelligent identification
S2	Manual selection by user	Preset programs	Intelligent dynamic recommendation
S3	Manual adjustment	Step adjustment	Adaptive adjustment
S4	No feedback	Simple instructions	Intelligent interaction
S5	User-controlled	Basic safety	Smart protection
S6	No continuous learning	Parameter memory	Deep learning and strategy optimization

Table 3. Grading evaluation indicators and their meanings.

Indicator Code	Indicator Name	Dimension	Unit
P1	Task recognition accuracy	FDS-Perception	%
P2	Abnormal detection sensitivity	APM	%
D1	Recommendation hit rate	FDS-Decision	%
D2	Decision response delay	FDS-Decision	s
E1	Force control error	FDS-Execution	N
E2	Path tracking accuracy	FDS-Execution	mm
F1	Physiological feedback response rate	HMIB-Feedback	%
F2	User subjective satisfaction	HMIB-Feedback	point
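
The indicators in Table 3 are measured in different units, and some (such as D2 and E1) are lower-is-better quantities, so they need a common scale before any weighting. The following is a minimal sketch of min-max normalization, not the authors' procedure: the bounds, the direction flags, and the sample measurements are all hypothetical values chosen for illustration.

```python
# Min-max normalization of heterogeneous indicators to [0, 1].
# All bounds, directions, and measurements below are illustrative assumptions.

def normalize(value, lo, hi, higher_is_better=True):
    """Map a value in [lo, hi] to [0, 1], clipping values outside the range."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    clipped = min(max(value, lo), hi)
    score = (clipped - lo) / (hi - lo)
    # Invert cost-type indicators so that 1.0 is always the best score.
    return score if higher_is_better else 1.0 - score

# Hypothetical (lo, hi, higher_is_better) settings per indicator code.
BOUNDS = {
    "P1": (0.0, 100.0, True),   # task recognition accuracy, %
    "D2": (0.0, 5.0, False),    # decision response delay, s
    "E1": (0.0, 10.0, False),   # force control error, N
}

raw = {"P1": 80.0, "D2": 1.0, "E1": 2.5}  # made-up measurements
scores = {code: normalize(value, *BOUNDS[code]) for code, value in raw.items()}
```

With these assumed bounds, a shorter delay or a smaller force error yields a score closer to 1, which keeps all indicators pointing in the same direction for aggregation.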

Table 4. Questionnaire items.

Number	Question
1	Is it important for the massage system to accurately identify your physical state? For example, it can distinguish whether you are “tired” or “relaxed”.
2	Is it important for the massage system to detect equipment failures in a timely manner? For example, problems such as sensor abnormalities and massage head jams can be detected quickly.
3	Is it important that the massage program recommended by the massage system meets your needs?
4	Is the speed at which the massage system makes decisions important?
5	Is the accuracy of the massage system’s force control important?
6	Is the accuracy of the massage system’s path tracking important?
7	Is it important for the massage system to adjust in real time based on your physiological signals?
8	Is your overall satisfaction with the massage experience important?

Table 5. Mean and standard deviation of expert ratings.

Indicator Code	$\bar{d}_{i}$	$\sigma_{i}$
P1	4.78	0.44
P2	3.44	0.53
D1	3.67	0.50
D2	4.78	0.44
E1	4.78	0.44
E2	4.78	0.44
F1	4.67	0.50
F2	3.78	0.44
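
One way to read Table 5 is through the coefficient of variation (standard deviation divided by mean), which Delphi studies often use as a rough indicator of expert agreement. The sketch below applies that idea to the tabulated values; the 0.25 cut-off is a commonly quoted rule of thumb assumed here, and the source does not state that the authors used it.

```python
# Coefficient of variation (CV) per indicator from Table 5.
# CV_LIMIT is an assumed rule-of-thumb threshold, not a value from the paper.

RATINGS = {  # indicator code: (mean rating, standard deviation)
    "P1": (4.78, 0.44),
    "P2": (3.44, 0.53),
    "D1": (3.67, 0.50),
    "D2": (4.78, 0.44),
    "E1": (4.78, 0.44),
    "E2": (4.78, 0.44),
    "F1": (4.67, 0.50),
    "F2": (3.78, 0.44),
}
CV_LIMIT = 0.25

cv = {code: sd / mean for code, (mean, sd) in RATINGS.items()}
consensus = {code: value <= CV_LIMIT for code, value in cv.items()}
least_agreed = max(cv, key=cv.get)  # indicator with the widest relative spread
```

Under this assumed threshold every indicator passes, and the relative spread is largest for P2.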

Table 6. Comparison of original weights and perturbed weights.

Indicator Code	$w_{i}$	$w_{i}'$	$\Delta w_{i}$
P1	0.125	0.138	0.013
P2	0.090	0.100	0.010
D1	0.096	0.104	0.008
D2	0.125	0.138	0.013
E1	0.125	0.138	0.013
E2	0.125	0.137	0.012
F1	0.122	0.129	0.007
F2	0.113	0.117	0.004
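
The last column of Table 6 is the difference between the perturbed and the original weight of each indicator. A short check that recomputes it from the two printed weight columns (the variable names are illustrative):

```python
# Recompute the weight shifts in Table 6 from the two printed weight columns.

W_ORIG = {"P1": 0.125, "P2": 0.090, "D1": 0.096, "D2": 0.125,
          "E1": 0.125, "E2": 0.125, "F1": 0.122, "F2": 0.113}
W_PERT = {"P1": 0.138, "P2": 0.100, "D1": 0.104, "D2": 0.138,
          "E1": 0.138, "E2": 0.137, "F1": 0.129, "F2": 0.117}

# Round to three decimals, the precision used in the table.
delta = {code: round(W_PERT[code] - W_ORIG[code], 3) for code in W_ORIG}
largest_shift = max(delta.values())
```

The recomputed shifts agree with the printed column, and no indicator moves by more than 0.013.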

Table 7. Classification and threshold interval.

Level	Threshold Interval
L0	[0.000, 0.167)
L1	[0.167, 0.350)
L2	[0.350, 0.500)
L3	[0.500, 0.650)
L4	[0.650, 0.833)
L5	[0.833, 1.000]
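
The intervals in Table 7 are left-closed and right-open, except for L5, which also includes 1.000. A minimal sketch of the lookup from a composite score to a level, assuming the score already lies in [0, 1] (the function name `classify` is illustrative):

```python
from bisect import bisect_right

# Lower bounds of L1..L5 taken from Table 7; L0 starts at 0.000.
CUTS = [0.167, 0.350, 0.500, 0.650, 0.833]

def classify(s0):
    """Return the level label for a composite score in [0, 1]."""
    if not 0.0 <= s0 <= 1.0:
        raise ValueError("composite score must lie in [0, 1]")
    # bisect_right counts the cut points <= s0, so a score equal to a
    # lower bound falls into the higher level (left-closed intervals).
    return f"L{bisect_right(CUTS, s0)}"
```

For example, `classify(0.4253)` returns `"L2"`, and a score exactly on a boundary such as 0.167 is assigned to the upper level, L1.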

Table 8. Experts' pairwise comparison scores for the importance of the three dimensions.

Expert ID	FDS vs. APM	FDS vs. HMIB	APM vs. HMIB
E1	5	4	3
E2	4	4	3
E3	5	4	2
E4	4	3	3
E5	5	5	4
E6	5	4	3
E7	4	3	3
E8	5	4	3
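
The scores in Table 8 can be turned into dimension weights with a standard AHP calculation. The sketch below is one plausible route, not necessarily the authors' exact procedure. It assumes that each score expresses how much more important the first-named dimension is than the second, that the eight experts are aggregated by the geometric mean, and that weights are approximated by the row geometric mean method.

```python
import math

# Expert judgments from Table 8: (FDS vs APM, FDS vs HMIB, APM vs HMIB).
JUDGMENTS = [
    (5, 4, 3), (4, 4, 3), (5, 4, 2), (4, 3, 3),
    (5, 5, 4), (5, 4, 3), (4, 3, 3), (5, 4, 3),
]

def geometric_mean(values):
    """Geometric mean of a sequence of positive numbers."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def ahp_weights(judgments):
    """Aggregate expert judgments and return weights for (FDS, APM, HMIB)."""
    # Assumption: experts are aggregated column-wise by geometric mean.
    a12, a13, a23 = (geometric_mean(col) for col in zip(*judgments))
    # Reciprocal pairwise comparison matrix.
    matrix = [
        [1.0, a12, a13],
        [1.0 / a12, 1.0, a23],
        [1.0 / a13, 1.0 / a23, 1.0],
    ]
    # Row geometric mean approximation of the priority vector.
    row_gm = [geometric_mean(row) for row in matrix]
    total = sum(row_gm)
    return [g / total for g in row_gm]

weights = dict(zip(("FDS", "APM", "HMIB"), ahp_weights(JUDGMENTS)))
```

Under these assumptions FDS receives the largest weight, followed by APM and then HMIB, which matches the direction of the experts' scores.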

Table 9. L0 level task capability mapping.

Subtasks	Supported	Capability
S1	×	No automatic body shape/posture detection
S2	×	No solution recommendation algorithm
S3	√	Supports fixed path/sequence execution, but lacks feedback correction mechanisms
S4	√	Can operate along fixed paths, but lacks online replanning capability
S5	×	No anomaly detection or alerting capability
S6	×	No user modeling or adaptive capability

Table 10. L1 level task capability mapping.

Subtasks	Supported	Capability
S1	√	Basic body shape/posture detection
S2	√	Scheme recommendation from a program library; slight improvement in D1
S3	√	Execution of preset force/sequence (±20%), without adaptive correction
S4	√	Sequential execution of preset paths, without dynamic force allocation
S5	√	Fixed rule-based alarm triggering; longer D2 response delay
S6	×	Only records basic usage parameters; no user modeling or adaptation

Table 11. L2 level task capability mapping.

Subtasks	Supported	Capability
S1	√	Multi-angle body scanning; P1 reaches a mid-range level
S2	√	Personalized recommendation based on historical preferences; D1 significantly improved
S3	√	Force/rhythm error within ±10%; limited fine-tuning based on real-time feedback
S4	√	Coordinated dynamic adjustment across multiple areas; simple dynamic force adjustment for key body parts
S5	√	Automatically identifies common faults and can quickly perform decompression or pause operations
S6	√	Preliminary adaptive learning of physiological signals

Table 12. L3 level task capability mapping.

Subtasks	Supported	Capability
S1	√	Multi-sensor dynamic body shape/posture recognition
S2	√	Online learning drives solution optimization
S3	√	Force/rhythm error within ±10%; repeated fine-tuning based on real-time feedback
S4	√	Coordinated dynamic adjustment across multiple areas; continuous action switching between complex body parts
S5	√	Multi-stage intelligent intervention for complex abnormalities
S6	√	Continuous closed-loop learning and precise recommendations

Table 13. L4 level task capability mapping.

Subtasks	Supported	Capability
S1	√	Real-time identification of multimodal health status
S2	√	Dynamic plan generation driven by health goals
S3	√	Velocity/rhythm error within ±3%, with proactive pre-adjustment of parameters
S4	√	Continuous multi-stage health intervention across body sites
S5	√	Predictive warning and intervention for subtle health fluctuations
S6	√	Long-term health record construction and personalized strategy promotion

Table 14. L5 level task capability mapping.

Subtasks	Supported	Capability
S1	√	Multi-source perception and physiological state prediction
S2	√	Big data-driven dynamic rehabilitation/massage program
S3	√	Force/rhythm error within ±1%; automatic adjustment based on prediction results
S4	√	Executes customized continuous actions to achieve preventive health interventions for the whole body
S5	√	Predictive warning and intervention for subtle health fluctuations
S6	√	Predict potential health risks and automatically implement intervention measures
S6	√	Online learning and real-time improvement of health profiles

Table 15. Sensitivity analysis under ±10% data perturbation.

Perturbation Amplitude	Calculated S0 Value	Corresponding Level
±0%	0.4253	L2
+10%	0.4665	L2
−10%	0.3831	L2
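
The stability shown in Table 15 can be checked against the Table 7 thresholds by computing, for each composite score, its level and its distance to the nearest bound of that level. A minimal sketch using only the tabulated values (the helper `level_and_margin` is an illustrative name, not from the paper):

```python
# Level bounds from Table 7, including the outer limits 0 and 1.
BOUNDS = [0.0, 0.167, 0.350, 0.500, 0.650, 0.833, 1.0]

def level_and_margin(s0):
    """Return (level index, distance to the nearest bound of that level)."""
    for level in range(6):
        lo, hi = BOUNDS[level], BOUNDS[level + 1]
        # Intervals are left-closed; only L5 also includes its upper bound.
        if lo <= s0 < hi or (level == 5 and s0 == hi):
            return level, min(s0 - lo, hi - s0)
    raise ValueError("composite score must lie in [0, 1]")

# Composite scores reported in Table 15.
SAMPLES = {"0%": 0.4253, "+10%": 0.4665, "-10%": 0.3831}

results = {name: level_and_margin(s0) for name, s0 in SAMPLES.items()}
stable = len({level for level, _ in results.values()}) == 1
```

All three scores stay inside the L2 interval [0.350, 0.500), so `stable` is true; the smallest remaining margins are about 0.033 on either side, which gives a sense of how close the perturbed scores come to a level change.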

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
