Article

Adaptive Multidimensional Model for User Interface Quality Assessment

by Ina Asenova Naydenova 1, Zlatinka Svetoslavova Kovacheva 1,2,* and Iliya Krasimirov Georgiev 1

1 Institute of Mathematics and Informatics, Bulgarian Academy of Sciences, 1113 Sofia, Bulgaria
2 Department of Mathematics and Informatics, University of Mining and Geology, 1700 Sofia, Bulgaria
* Author to whom correspondence should be addressed.
Future Internet 2026, 18(5), 261; https://doi.org/10.3390/fi18050261
Submission received: 20 March 2026 / Revised: 11 May 2026 / Accepted: 12 May 2026 / Published: 15 May 2026
(This article belongs to the Section Techno-Social Smart Systems)

Abstract

User interface evaluation remains fragmented across performance metrics, subjective assessments, and user-dependent factors, limiting the comparability and interpretability of results across methodological traditions. This paper proposes a multidimensional evaluation framework that integrates these perspectives into a coherent analytical structure. The framework consists of three dimensions—Functional–Objective, Cognitive–Perceptual, and Contextual–Individual—each capturing a distinct facet of interface quality. A key feature of the proposed approach is the use of profile-dependent weighting, which enables evaluation results to reflect the specific priorities of different user groups. The framework’s operational logic is demonstrated through structured illustrative scenarios, showing how the model can be applied in practice to support more informed design and evaluation decisions. By aligning heterogeneous evaluation logics within a unified structure, the proposed approach provides a systematic basis for more consistent, transparent, and context-sensitive assessment of user interfaces.


1. Introduction

The user interface serves as the primary bridge between humans and information systems. Users’ perceptions of a system depend not only on its functional capabilities but also on its ease of use [1]. Empirical evidence shows that evaluations are strongly influenced by interface design: technically robust systems may receive negative assessments when the interface is unintuitive, while more modest systems are often perceived as reliable and modern when they provide a clear and well-structured design [2]. This discrepancy underscores the importance of considering not only technical performance but also the perceptual and experiential dimensions that influence judgments of interface quality.
Traditional UI evaluation methods typically consider both objective and subjective aspects of interaction. Performance-based metrics such as task completion time, error rates, and efficiency are commonly used alongside subjective assessments of perceived usability and user experience (UX) [3,4]. More recent approaches extend UI/UX evaluation by incorporating contextual dimensions such as user expectations and interaction context [5,6]. Despite progress in UX and QoE research toward multidimensional and user-dependent models [7], a number of frameworks continue to treat these factors as separate analytical components [8].
A further limitation concerns the absence of systematic user profiling within the evaluation process, which restricts the ability of existing approaches to account for variability across users. User profiling, defined as the identification and modeling of individual characteristics such as expertise and domain knowledge, is essential for the accurate interpretation of evaluation results [9]. Without an explicit profiling component, usability evaluations often rely on the “average user” assumption, overlooking cases in which the same system performs differently across expertise levels [10].
To address these limitations, this paper proposes a multidimensional model (framework) for user interface evaluation that integrates performance-based metrics, subjective assessments, and user-specific characteristics. The model defines relationships among these dimensions and introduces a normalization mechanism that enables comparison across heterogeneous indicators within a unified framework. A key contribution of the proposed approach is the incorporation of adaptive user profiling to capture variability across user groups and usage contexts.
This study is conceptual and methodological in nature, focusing on the structure, logic, and operational principles of the proposed framework. Illustrative examples demonstrate how the framework can be applied in practice, enabling subsequent prototype development and empirical validation.
The research is guided by the following questions:
RQ1. How can user profiles, such as differences in experience level, be systematically integrated into UI evaluation through adaptive weighting mechanisms?
RQ2. How can a conceptual adaptive evaluation model demonstrate the effects of user-dependent weighting and context-sensitive indicators on evaluation outcomes?
RQ3. Which conceptual and methodological components are required to support the future empirical operationalization of the proposed framework?

2. Related Work

Existing approaches to user interface evaluation can be broadly grouped into four major research directions: performance-based usability approaches, questionnaire-based methods, composite UX/QoE models, and adaptive and personalized systems. These categories differ in the dimensions they emphasize, the methodological assumptions they adopt, and the extent to which they incorporate user-specific variability into the evaluation process.
Early usability research emerged from human–computer interaction (HCI) and cognitive engineering traditions, emphasizing measurable aspects of interaction performance. Foundational work by Card, Moran, and Newell [4], together with the usability engineering principles of Nielsen [3], established efficiency, effectiveness, and error frequency as central evaluation indicators. This line of research was supported by well-established cognitive models, including Fitts’ Law [11] and Hick’s Law [12], which describe fundamental relationships between interface design, motor behavior, and decision complexity. These principles were later formalized in standardized frameworks such as ISO 9241-11 [13], which defines usability in terms of effectiveness, efficiency, and satisfaction.
A complementary research direction focuses on subjective and experiential aspects of interaction. Widely adopted instruments such as SUS [14], PSSUQ [15], and NASA-TLX [16] provide structured mechanisms for capturing perceived usability, user satisfaction, and cognitive workload. Research on perceived ease of use [1], together with studies examining the relationship between interface aesthetics and usability perception [2,17], underscores the importance of perceptual and experiential dimensions in interface evaluation. Experiential UX perspectives proposed by Hassenzahl [18] additionally emphasize emotional, hedonic, and contextual aspects of interaction that extend beyond purely task-oriented usability assessment.
Recent research increasingly explores composite UX/QoE models that integrate objective and subjective indicators within broader analytical structures. Quality-in-use frameworks [5], QoE-oriented approaches [6,7], and comprehensive reviews of UX evaluation methodologies [8,19,20] highlight the shift toward integrated representations of interaction processes and user perception. In parallel, multi-criteria decision-making approaches such as AHP [21], together with composite-indicator methodologies [22,23], provide formal mechanisms for combining heterogeneous evaluation criteria into consolidated assessment models. These approaches support more systematic representations of interaction-related characteristics, particularly in complex evaluation environments.
Another important research direction concerns adaptive UI systems. Advances in user modeling and profiling [9], intelligent user interfaces [24], adaptive UI frameworks [25], and personalized system evaluation [26] emphasize the role of user-specific characteristics in shaping interaction outcomes. Such approaches commonly employ contextual adaptation, behavioral analysis, and machine learning techniques to personalize interface behavior and interaction flow. Recent reviews further demonstrate the growing importance of adaptive and user-dependent approaches in intelligent environments [10]. Related work on behavioral analysis and interaction pattern recognition [27,28,29] also illustrates the increasing use of data-driven techniques for modeling user behavior and interaction variability.
While existing approaches provide complementary perspectives, they differ in their operational assumptions and in the degree of integration across evaluation dimensions. Composite UX/QoE models enable the aggregation of heterogeneous metrics [7,21,22], yet they typically rely on static weighting schemes that do not account for variation across users and contexts [19,23]. In contrast, adaptive UI systems support personalization through user modeling and contextual adaptation [9,24,25], but these mechanisms are rarely extended to evaluation frameworks. Consequently, current approaches offer limited means for combining objective, subjective, and user-dependent criteria within a unified analytical structure. This limitation is particularly evident in heterogeneous user populations, where aggregated results may conceal differences in expertise, cognitive strategies, and interaction behavior [10,19,20,30].
Building on these observations, the proposed framework integrates performance-related metrics, subjective evaluation, adaptive weighting, and structured user profiling within a unified multidimensional evaluative structure. The framework additionally introduces normalization and relationship-modeling mechanisms that support comparison across heterogeneous indicators while accounting for variability across user groups and usage contexts.
Table 1 provides a comparative overview of how the proposed approach aligns with existing methodologies in terms of supported dimensions and level of integration. The comparative data underscores a persistent fragmentation in UI evaluation, where traditional metrics [3,13] and UX tools [18,20] remain largely disconnected from user-dependent adaptation [24,26]. This fragmentation reflects the historical development of these approaches, which emerged from separate methodological lineages—ranging from cognitive engineering [4] to experiential UX [18] and intelligent environments [10].
Therefore, current models rarely share compatible assumptions or evaluation logics, which limits the extent to which their results can be meaningfully aligned or interpreted within a common analytical frame. This synthesis clarifies how the proposed framework positions itself relative to these traditions and highlights its potential to support more coherent evaluation across heterogeneous methodological backgrounds.

3. Methodological Approach

The formulation of the Adaptive Multidimensional Model (AMM) follows a structured methodological sequence, progressing from conceptual grounding to formal specification and demonstration. The process is organized into three phases.

3.1. Phase I: Conceptual Foundation and Dimensional Structuring

Phase I establishes the theoretical basis of the model by identifying the core conceptual principles derived from gaps in current UI/UX evaluation. These principles are then translated into a structured dimensional representation of interface quality, and the corresponding evaluation indicators are selected. The selection process is grounded in established HCI theory and informed by the PACT framework [31], ensuring alignment with fundamental components of interactive system design.

3.2. Phase II: Formal Specification and Calibration Logic

The conceptual model is operationalized through a formal specification of its computational mechanisms, including normalization of heterogeneous metrics, adaptive weighting based on user expertise, and hierarchical aggregation into composite indices and an overall quality score.

3.3. Phase III: Model Illustration and Functional Demonstration

This phase illustrates the internal coherence of the AMM and demonstrates its practical applicability. Through structured illustrative scenarios (Section 6 and Section 7), the model’s computational steps are applied to hypothetical cases, showing how profile-specific weighting configurations influence evaluation outcomes and support reasoning about interface patterns tailored to different user categories.

4. Concept of an Adaptive Multidimensional Model

The model is based on three main principles:
  • Multidimensionality—the evaluation considers several interrelated dimensions;
  • Integration of heterogeneous evaluation inputs—no single type of indicator is sufficient on its own;
  • Adaptivity—the model adjusts evaluation priorities based on user characteristics.

4.1. Dimensions of the Model

To operationalize the described principles, AMM is structured into three dimensions, each reflecting a distinct facet of interface quality. The Functional–Objective dimension captures measurable aspects of user performance and system behavior; the Cognitive–Perceptual dimension reflects users’ subjective interpretations, clarity, and emotional responses; and the Contextual–Individual dimension incorporates user-specific characteristics that influence interaction patterns. Together, these dimensions enable clearer interpretation of evaluation results by mapping outcomes to specific categories of interface quality.

4.1.1. Functional–Objective Dimension

The first dimension organizes technical and behavioral metrics into a hierarchical structure based on composite indices (see Table 2), grounded in established HCI metrics commonly used in usability evaluation research [3,13,32]. Three composite indices are defined to align with recognized standards and core HCI principles:
  • Interaction Efficiency quantifies the effort required to achieve goals, a core component of usability as defined by ISO 9241-11 [13];
  • Reliability captures interface robustness and error-prevention capabilities, which are critical for maintaining user trust [30];
  • Interface Consistency reflects the principle that predictable system behavior reduces learning effort and cognitive workload [3].
The hierarchical design addresses the challenge of metric fragmentation by grouping related data into composite indices, achieving a balance between analytical precision and operational usability [22]. It aggregates fine-grained interaction metrics into higher-level indicators, enabling a clearer diagnostic view of functional performance without excessive detail. While aggregation may reduce sensitivity to subtle interaction nuances, the modular design ensures extensibility, allowing individual sub-metrics to be introduced or replaced without altering the overall conceptual framework.
In addition to the core composite indices, the framework considers characteristics such as flexibility and scalability [33,34], which are not treated as standalone indices but as supplementary factors that influence the stability of evaluation outcomes. Flexibility reflects the interface’s ability to maintain perceptual consistency and usability across diverse user profiles, while scalability captures the stability of functional performance and cognitive load as system complexity increases. Rather than operating as independent measures, these characteristics function as stress-test conditions that probe the robustness of both functional and perceptual aspects of the interface [30,35].
Methods for Functional–Objective Metrics Assessment
Functional–Objective metrics are measured using quantitative, performance-based evaluation methods. A primary approach is controlled usability testing, where participants execute predefined tasks under standardized conditions while performance indicators such as task completion time, number of interaction steps, and error rates are systematically recorded [3,32]. In addition to empirical testing, predictive HCI models provide a complementary analytical perspective by estimating interaction efficiency based on theoretical models of human motor and cognitive behavior. Methods such as Fitts’s Law, Hick’s Law, GOMS, and the Keystroke-Level Model (KLM) enable the formalization of interaction costs without requiring direct user observation [11,12].
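As a concrete illustration of such predictive estimates, the following minimal Python sketch computes a Fitts's-law movement time and a Hick's-law decision time. The coefficients a and b and the task parameters are illustrative placeholders, not values from this paper; in practice they are fitted empirically for a given device and user population.

```python
import math

def fitts_movement_time(a: float, b: float, distance: float, width: float) -> float:
    """Fitts's law (Shannon formulation): MT = a + b * log2(D/W + 1),
    where D is the distance to the target and W its width."""
    index_of_difficulty = math.log2(distance / width + 1)  # in bits
    return a + b * index_of_difficulty

def hick_decision_time(a: float, b: float, n_choices: int) -> float:
    """Hick's law: T = a + b * log2(n + 1) for n equally probable choices."""
    return a + b * math.log2(n_choices + 1)

# Illustrative coefficients only; real values must be fitted per device and population.
mt = fitts_movement_time(a=0.1, b=0.15, distance=480, width=40)  # 40 px target, 480 px away
dt = hick_decision_time(a=0.2, b=0.15, n_choices=8)              # choosing among 8 menu items
print(f"Predicted pointing time: {mt:.3f} s, decision time: {dt:.3f} s")
```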
System reliability is assessed through analysis of failure-related data, including defect density and system logs, often supported by stochastic models such as Poisson-based and non-homogeneous Poisson process (NHPP) approaches [36,37].
Interface consistency is evaluated using structured inspection techniques, including heuristic evaluation, consistency checklists, and formal UI audits [30,38].
Taken together, these complementary methods provide objective and reproducible measurements of functional performance, which can be systematically integrated into composite indices within the proposed multidimensional framework.
The methods described in this section represent established approaches for the assessment of Functional–Objective metrics as reported in the literature. The proposed framework does not prescribe a fixed method for metric computation, as the most appropriate evaluation technique may vary depending on system characteristics, data availability, application domain, and evaluation context. In particular, different analytical and empirical methods may be more suitable under different conditions, especially in relation to system complexity and domain-specific expertise. This flexibility allows the framework to be applied across a wide range of practical scenarios without constraining the operationalization of individual metrics. A similar principle applies to the metrics of the other model dimensions discussed in the following sections.

4.1.2. Cognitive–Perceptual Dimension

The second dimension focuses on subjective metrics capturing users’ perceptions of interface quality and overall satisfaction. Similar to the Functional–Objective dimension, these metrics are organized into hierarchical composite indices (see Table 3), reflecting the main semantic domains of user experience and enabling a structured interpretation of subjective feedback [8,19,31].
Within this structure, four composite indices are defined to align with widely recognized UX models:
  • Usability & Intuitiveness captures perceived ease of use and self-efficacy, which are key predictors of system acceptance [1];
  • Clarity & Comprehensibility addresses the cognitive effort required to process information and relates to perceived cognitive load [35];
  • Structural & Feedback Quality evaluates system transparency and the effectiveness of feedback mechanisms [30];
  • Aesthetic & Emotional Satisfaction reflects the Aesthetic–Usability Effect, where visual appeal influences perceived quality and overall satisfaction [17].
Each composite index aggregates related sub-metrics to support structured analysis of subjective evaluations and reduce variability in individual responses. This organization also allows adaptation of sub-metrics to different contexts (e.g., mobile or desktop) while preserving consistency across evaluation scenarios.
Methods for Cognitive–Perceptual Metrics Assessment
Cognitive–Perceptual metrics are typically assessed through structured user surveys, standardized usability scales, and Likert-type questionnaires, which enable the systematic quantification of subjective perceptions such as perceived clarity, intuitiveness, cognitive load, and aesthetic satisfaction [15,16]. Organizing indicators into conceptually coherent groups facilitates the construction of evaluation instruments and supports comparability of results across different contexts and user populations.
In addition to conventional survey instruments, subjective data can be collected through AI-assisted questionnaires that dynamically adapt their structure based on user responses. Such adaptive mechanisms may be triggered when inconsistencies emerge between log-based and self-reported measures, when response uncertainty is high, or when additional clarification is required through follow-up or comparative questions [39]. This approach enables more context-sensitive and robust elicitation of subjective judgments, particularly in complex or heterogeneous usage scenarios.

4.1.3. Contextual–Individual Dimension (User-Centered)

The third dimension captures user-related factors that shape interaction behavior. Users are classified into three representative expertise profiles—Beginner, Intermediate, and Advanced—based on a composite evaluation of their experience and background. This categorization provides an interpretable way to operationalize expertise and to differentiate evaluation outcomes across diverse user groups.
To ensure that the classification is both objective and theoretically grounded, the selection of indicators follows the PACT framework [31]. The model incorporates four key indicators:
  • Domain Experience—prior knowledge of the subject matter or business logic, reflecting the maturity of users’ mental models [30];
  • System Fluency—frequency and history of interaction with the specific interface, indicating the development of automated task schemas [3,30];
  • Digital Literacy—general proficiency with interface conventions and technological self-efficacy [1];
  • Task-Context Complexity—typical complexity of performed tasks and the characteristics of the interaction environment (e.g., mobile vs. desktop) [31].
Experience level is used as the primary grouping factor because it strongly influences how users perceive and process interface qualities. According to Cognitive Load Theory [35], beginners lack well-developed schemas and therefore expend more cognitive resources when interpreting unfamiliar structures. In contrast, experienced users rely on internalized mental models and automated routines, enabling more efficient interaction and faster task execution [30].
By grouping results according to experience level, the model aligns evaluation outcomes with the distinct mental models, expectations, and needs of each user segment [3].
Methods for Contextual–Individual Metrics Assessment
User profiles are derived from a combination of empirical and self-reported data sources that provide measurable inputs for the contextual indicators. These include preliminary questionnaires capturing user experience, digital literacy, and familiarity with the system; task-based assessments that evaluate interaction performance under controlled conditions; and, where applicable, AI-assisted surveys that adapt dynamically to improve the consistency and completeness of the collected data.
While the model is formally defined through a composite representation of user-related indicators, practical implementations may rely on simpler rule-based or threshold-based profiling strategies, depending on data availability and application requirements.

4.2. Integrated Evaluation and Interpretation

Once the values for all composite indices have been computed, they are integrated into a final quality assessment through a profile-specific weighting mechanism. In this process, the indices are aggregated using weights that are dynamically adjusted according to the identified user profile and the context of use (Figure 1).
Weights are determined through a structured calibration process, ranging from initial equal distribution to expert-defined heuristics and empirical data-driven optimization (for details see “Calibration and Operationalization Principles” section below).
By incorporating user-specific sensitivities into the quality assessment, the model identifies which interface elements or design strategies are most effective for different segments. These insights enable the generation or recommendation of appropriate UI variants—for example, reducing visual complexity for novice users or offering richer functionality and streamlined interactions for advanced users. In this way, the model functions not only as an evaluation tool but also as a framework for personalized interface design and optimization.
More specifically, the model enables:
  • Personalized assessment of interface quality that reflects profile-specific sensitivities to different usability factors;
  • Systematic comparison of interface performance across distinct user groups, highlighting where design solutions benefit or disadvantage particular profiles;
  • Data-driven identification of interface elements that may require adaptation or redesign for specific user segments.

5. Formal Definition of the Proposed Model

5.1. Indicator Sets

The quality of the user interface $Q_{UI}$ is modelled as an aggregated function of multiple objective and subjective components, structured as:
$$Q_{UI} = F(O, S, U),$$
where
  • $O = \{o_1, o_2, \dots, o_n\}$ is the set of objective indicators (e.g., task completion time, error rate, etc.);
  • $S = \{s_1, s_2, \dots, s_m\}$ is the set of subjective indicators (e.g., perceived usability, visual simplicity, satisfaction, etc.);
  • $U = \{u_1, u_2, \dots, u_k\}$ represents user-centered indicators (e.g., domain experience, digital literacy, task-context complexity, etc.).

5.2. Normalization of Indicators

To ensure comparability between heterogeneous metrics, all indicators are normalized to the interval [0, 1]. Since the indicators differ in their orientation, direction-specific normalization is applied.
For positively oriented indicators (higher raw values indicate better performance, e.g., clarity, satisfaction), normalization is defined as:
$$\hat{o}_i = \frac{o_i - \min(o_i)}{\max(o_i) - \min(o_i)}, \quad i = 1, \dots, n; \qquad \hat{s}_j = \frac{s_j - \min(s_j)}{\max(s_j) - \min(s_j)}, \quad j = 1, \dots, m$$
For negatively oriented indicators (lower raw values indicate better performance, e.g., task time, errors), the transformation is defined as:
$$\hat{o}_i = \frac{\max(o_i) - o_i}{\max(o_i) - \min(o_i)}, \quad i = 1, \dots, n; \qquad \hat{s}_j = \frac{\max(s_j) - s_j}{\max(s_j) - \min(s_j)}, \quad j = 1, \dots, m$$
This transformation guarantees that higher normalized values consistently correspond to better performance, regardless of the original direction of the indicator.
Subjective indicators S are quantified using standardized rating scales such as the Mean Opinion Score (MOS) [40] and subsequently normalized using the same procedure.
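The direction-specific normalization defined above can be expressed compactly in code. The following sketch is a minimal illustration; the indicator values and observed ranges are hypothetical.

```python
def normalize(value: float, vmin: float, vmax: float, higher_is_better: bool = True) -> float:
    """Min-max normalization to [0, 1]; the orientation flag flips the
    transformation for negatively oriented indicators (task time, errors)."""
    if vmax == vmin:
        return 0.5  # degenerate range carries no information; midpoint is one pragmatic choice
    if higher_is_better:
        return (value - vmin) / (vmax - vmin)
    return (vmax - value) / (vmax - vmin)

# Task completion time of 32 s within an observed 20-90 s range (negatively oriented):
print(normalize(32.0, 20.0, 90.0, higher_is_better=False))  # ~0.829
# A MOS satisfaction rating of 4 on the 1-5 scale (positively oriented):
print(normalize(4.0, 1.0, 5.0))                             # 0.75
```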

5.3. Determination of Minimum and Maximum Values

Minimum and maximum values are established using a combination of empirical range estimation, percentile-based thresholds (e.g., trimming extreme values at distribution tails) and expert-defined operational bounds. Empirical ranges ensure that normalization reflects observed user behavior, percentile thresholds reduce the influence of outliers, and expert-defined limits provide realistic boundaries in early design stages or in domains where empirical data are not yet available.

5.4. Context-Dependent User Representation and Profiling

The context-dependent feature vector $U = \{u_1, u_2, \dots, u_k\}$ captures both intrinsic user characteristics (e.g., experience level, digital literacy) and interaction conditions (e.g., task type, system characteristics, and environmental factors).
User performance is therefore interpreted as context-dependent rather than as a global and stable trait. Accordingly, the user profile $P$ is defined as:
$$P = g(U), \qquad P \in \{\text{Beginner}, \text{Intermediate}, \text{Advanced}\}$$
A composite experience score is computed as:
$$E(U) = \sum_{i=1}^{k} \beta_i u_i, \qquad \sum_{i=1}^{k} \beta_i = 1$$
where $u_i$ are user-related contextual indicators and $\beta_i$ are weighting coefficients.
User profiles are then assigned using a percentile-based segmentation of the empirical distribution of $E(U)$ within the corresponding context:
$$P = \begin{cases} \text{Beginner}, & E(U) < P_{33} \\ \text{Intermediate}, & P_{33} \le E(U) < P_{66} \\ \text{Advanced}, & E(U) \ge P_{66} \end{cases}$$
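A minimal sketch of this profiling step is shown below. The indicator values, weighting coefficients, and the synthetic population used to estimate the percentiles are all hypothetical; in practice $P_{33}$ and $P_{66}$ are computed over $E(U)$ scores observed in the same usage context.

```python
import numpy as np

def experience_score(u: np.ndarray, beta: np.ndarray) -> float:
    """E(U) = sum_i beta_i * u_i with sum_i beta_i = 1; u holds
    normalized user-related contextual indicators in [0, 1]."""
    assert np.isclose(beta.sum(), 1.0)
    return float(beta @ u)

def assign_profile(e_u: float, population_scores: np.ndarray) -> str:
    """Percentile-based segmentation of E(U) against the empirical
    distribution of scores within the corresponding context."""
    p33, p66 = np.percentile(population_scores, [33, 66])
    if e_u < p33:
        return "Beginner"
    if e_u < p66:
        return "Intermediate"
    return "Advanced"

# Hypothetical indicators: domain experience, system fluency, digital literacy, task-context complexity.
u = np.array([0.8, 0.6, 0.9, 0.5])
beta = np.array([0.3, 0.3, 0.2, 0.2])
population = np.random.default_rng(0).uniform(0, 1, 200)  # synthetic stand-in for observed E(U) scores
e = experience_score(u, beta)                             # 0.70
print(e, assign_profile(e, population))                   # typically "Advanced" here
```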

5.5. Hierarchical Indicator Aggregation

Individual normalized sub-metrics are first aggregated into composite indices that represent higher-level aspects of interface quality. Each composite index is computed as a weighted aggregation of its corresponding normalized sub-metrics:
$$C_i^{O} = \sum_{k=1}^{n_i} \alpha_{ik} \, \hat{o}_{ik}, \qquad \sum_{k=1}^{n_i} \alpha_{ik} = 1$$
$$C_j^{S} = \sum_{l=1}^{m_j} \beta_{jl} \, \hat{s}_{jl}, \qquad \sum_{l=1}^{m_j} \beta_{jl} = 1$$
Here, $\hat{o}_{ik}$ and $\hat{s}_{jl}$ denote the normalized sub-metrics, while $\alpha_{ik}$ and $\beta_{jl}$ are their fixed multipliers. The multipliers are defined independently of the user profile $P$, ensuring that the composite indices reflect the intrinsic importance of each metric rather than variations introduced by specific user groups. This preserves the comparability and stability of the evaluation model across profiles.
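The aggregation into a composite index can be illustrated as follows; the sub-metric names mirror Table 2, but the normalized values and multipliers are hypothetical examples rather than calibrated figures.

```python
def composite_index(normalized_submetrics: dict[str, float],
                    multipliers: dict[str, float]) -> float:
    """Weighted aggregation of normalized sub-metrics into a composite index;
    the multipliers are fixed (profile-independent) and must sum to 1."""
    assert abs(sum(multipliers.values()) - 1.0) < 1e-9
    return sum(multipliers[name] * value for name, value in normalized_submetrics.items())

# Interaction Efficiency built from its Table 2 sub-metrics (values illustrative):
efficiency = composite_index(
    {"task_completion_time": 0.83, "number_of_steps": 0.70, "actions_per_task": 0.75},
    {"task_completion_time": 0.5, "number_of_steps": 0.3, "actions_per_task": 0.2},
)
print(f"C_efficiency = {efficiency:.3f}")  # 0.5*0.83 + 0.3*0.70 + 0.2*0.75 = 0.775
```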

5.6. Integrated Assessment Function

Overall interface quality is computed as a profile-dependent weighted aggregation of the composite indices, allowing the model to emphasize the aspects most relevant to each user category:
$$Q_{UI}(P) = \sum_{i} w_i^{O}(P)\, C_i^{O} + \sum_{j} w_j^{S}(P)\, C_j^{S}, \qquad \sum_{i} w_i^{O}(P) + \sum_{j} w_j^{S}(P) = 1$$
where
  • $P$ denotes the user profile (e.g., Beginner, Intermediate, Advanced);
  • $w_i^{O}(P)$ and $w_j^{S}(P)$ are the adaptive weights assigned to each composite index, determined by the function $P = g(U)$.
This formulation introduces user-specific adaptation at the aggregation level, while preserving a stable internal structure of the composite indices.
The resulting value $Q_{UI}(P)$ represents a profile-aware evaluation of interface quality, reflecting both measurable system performance and perceived usability.
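A compact sketch of the integrated assessment function is given below; the composite scores and Beginner-profile weights are illustrative assumptions only.

```python
def overall_quality(composites: dict[str, float], profile_weights: dict[str, float]) -> float:
    """Q_UI(P): profile-dependent weighted sum over all composite indices
    (Functional-Objective and Cognitive-Perceptual together); weights sum to 1."""
    assert abs(sum(profile_weights.values()) - 1.0) < 1e-9
    return sum(w * composites[name] for name, w in profile_weights.items())

composites = {"interaction_efficiency": 0.78, "clarity_comprehensibility": 0.58}
weights_beginner = {"interaction_efficiency": 0.4, "clarity_comprehensibility": 0.6}
print(f"Q_UI(Beginner) = {overall_quality(composites, weights_beginner):.3f}")  # 0.660
```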

5.7. Calibration and Operationalization Principles

The model employs a unified three-level calibration framework across all components (user profiles, composite-index multipliers, and adaptive weights).
  • Level 1: Baseline configuration
Equal weights and initial user profiles derived from self-reported experience provide a neutral and fully replicable starting point. This configuration, however, offers limited practical value, as it does not reflect differences in how users perceive or prioritize interface qualities.
  • Level 2: Expert-informed refinement
Domain experts adjust weights, define user profile boundaries, and identify relevant indicators based on established HCI principles and widely adopted heuristic evaluation approaches. Such heuristic-based calibration is a standard practice in usability and UX evaluation and provides a theoretically grounded way to incorporate domain knowledge into the model [3,30,38].
Rather than imposing a fixed scheme, this level provides the flexibility needed to adapt the model to diverse software contexts. For instance, mobile interactions often prioritize efficiency and minimal steps due to their goal-directed nature, whereas desktop environments may emphasize functional breadth for cognitively demanding tasks.
Domain characteristics further shape calibration decisions: safety-critical systems prioritize reliability and error minimization, while non-critical or entertainment-oriented systems may emphasize usability, responsiveness, and engagement.
These variations demonstrate that weighting decisions are inherently context-dependent and cannot be reduced to a universal scheme without sacrificing practical relevance.
  • Level 3: Data-driven refinement
When empirical data are available, statistical or machine learning techniques may be used to optimize weights and validate user profile assignments. For example, profile classification may be performed using models trained on user behavior logs and survey responses, including clustering algorithms [27], logistic regression [28], and Bayesian classification methods [29].
This level of calibration provides the highest degree of empirical precision, as it allows weights and decision boundaries to be derived directly from observed user behavior. It can capture complex and context-specific patterns that may not be fully captured through heuristic reasoning. However, it is also the most resource-intensive approach, requiring sufficient datasets and computational effort.
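As one possible illustration of this calibration level, the sketch below clusters synthetic behavior-log features with scikit-learn's KMeans and maps the resulting clusters onto expertise tiers. The feature set, its distributions, and the centroid-ordering heuristic are all assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical, already-normalized behavior features per user (higher = more expert-like):
# [task speed score, accuracy score, shortcut-usage ratio]
rng = np.random.default_rng(42)
features = np.vstack([
    rng.normal([0.3, 0.4, 0.1], 0.05, (40, 3)),  # slow, error-prone, few shortcuts
    rng.normal([0.6, 0.6, 0.4], 0.05, (40, 3)),
    rng.normal([0.9, 0.9, 0.8], 0.05, (40, 3)),  # fast, accurate, shortcut-heavy
])
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(features)
# Rank clusters by centroid mean so cluster ids map onto expertise tiers.
order = np.argsort(kmeans.cluster_centers_.mean(axis=1))
tier = {int(c): name for c, name in zip(order, ["Beginner", "Intermediate", "Advanced"])}
new_user = np.array([[0.85, 0.88, 0.75]])
print(tier[int(kmeans.predict(new_user)[0])])  # expected: "Advanced"
```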

Synthesis of Calibration Levels

The baseline configuration (Level 1) ensures transparency and reproducibility, while expert-informed refinement (Level 2) integrates domain knowledge and established heuristic reasoning. Data-driven refinement (Level 3) offers the highest accuracy but may be infeasible in early-stage or low-data scenarios. In practice, a hybrid strategy is often most effective: expert-informed initialization progressively refined with empirical data when available. This balances theoretical grounding, interpretability, and empirical accuracy across different evaluation contexts. In settings with limited data, simplified profiling may be implemented through predefined templates or rule-based classification (e.g., beginner, intermediate, expert), particularly when input indicators are categorical or when high-precision numerical calibration is not required.

6. Illustrative Example: Profile-Dependent Weight Configuration

To illustrate the operation of the proposed model, a simplified example calculation is presented. Once the user profile P is determined, the weighting structure may be adjusted to emphasize interface characteristics most relevant for that user category. Figure 2 presents an example configuration of weights for the main composite indices in the Functional–Objective and Cognitive–Perceptual dimensions.
The example weighting scheme demonstrates how evaluation priorities may differ across user profiles:
  • Beginner users—greater emphasis on clear visual cues and straightforward interaction patterns. Higher weights may be assigned to clarity, intuitiveness, perceptual comfort, and aesthetic support to facilitate orientation during early interactions.
  • Intermediate users—a more balanced distribution of priorities. With moderate familiarity, both usability-related and efficiency-related criteria remain relevant.
  • Advanced users—stronger emphasis on efficiency and performance-oriented factors, such as rapid task completion and streamlined interaction flows. Cognitive–Perceptual factors may retain a secondary, but still meaningful, role.
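The sketch below makes the effect of such configurations concrete: the same set of composite-index scores is aggregated under three hypothetical weight vectors (none of the numbers are calibrated values from this paper), yielding a different overall score per profile.

```python
# Composite indices for one interface (illustrative normalized scores), listed in the
# order: efficiency, reliability, consistency, usability, clarity, feedback, aesthetics.
composites = [0.78, 0.85, 0.80, 0.62, 0.58, 0.70, 0.66]
profile_weights = {
    "Beginner":     [0.10, 0.15, 0.10, 0.20, 0.25, 0.10, 0.10],  # clarity-oriented
    "Intermediate": [0.15, 0.15, 0.10, 0.15, 0.15, 0.15, 0.15],  # balanced
    "Advanced":     [0.30, 0.20, 0.10, 0.10, 0.10, 0.10, 0.10],  # efficiency-oriented
}
for profile, weights in profile_weights.items():
    assert abs(sum(weights) - 1.0) < 1e-9
    q = sum(w * c for w, c in zip(weights, composites))
    print(f"{profile:12s} Q_UI = {q:.3f}")
# Prints ~0.691 / ~0.709 / ~0.740: with strong functional indices but weaker perceptual
# ones, this interface scores best under the efficiency-oriented Advanced weighting.
```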

7. Illustrative Scenario: Dynamic Interface Adaptation

This section presents a proof-of-concept scenario illustrating the operational logic of the proposed AMM. The objective is to clarify how an adaptive evaluation mechanism supports responsive, user-centered interface design through a systematic feedback loop.
In this scenario, an information system integrates objective performance metrics with subjective feedback to determine the optimal user profile and its corresponding UI configuration. As shown in Figure 3, the system selects among simplified, balanced, or advanced configurations, tailored to Beginner, Intermediate, and Advanced users.

7.1. Adaptation Pathways

Several complementary adaptation mechanisms may be employed:
  • Automatic Profile Inference—behavioral data are analyzed to infer the user’s profile and deploy the most suitable UI configuration.
  • User-Controlled Configuration—users may manually select or adjust their preferred UI configuration, ensuring transparency and preserving autonomy in environments where personal preference or professional expertise is critical.
  • Adviser-Based Suggestions—a recommendation mechanism periodically evaluates accumulated logs and survey responses. When behavioral evidence suggests that another configuration may better support the user, the system proposes trying an alternative template.
  • Initial Onboarding—for first-time users, the system may rely on self-reported experience, goals, and preferences to generate an initial profile before behavioral data become available.

7.2. Dynamic Deployment and Feedback Loop

Once a user profile is identified, the system deploys the corresponding UI configuration and continuously monitors interaction patterns to ensure that the selected configuration remains appropriate over time. As illustrated in Figure 3, this process is supported by a dynamic feedback loop that integrates functional performance data, perceptual evaluations, and user-controlled settings.
The system collects objective indicators such as task efficiency, error patterns, and navigation behavior, together with subjective feedback related to clarity, perceived workload, and overall satisfaction. These data feed into an adaptive evaluation mechanism that assesses whether the current UI configuration continues to align with the user’s interaction style and evolving expertise.
When discrepancies emerge—such as improved performance suggesting increased fluency, or elevated cognitive load indicating that the interface may be too complex—the system may initiate one of several adjustment pathways. These include fine-tuning specific interface parameters, recommending an alternative UI configuration through an adviser-based suggestion mechanism, or automatically reassigning the user to a more suitable configuration when strong behavioral evidence supports the change. Users retain the ability to override system recommendations, ensuring transparency and maintaining control over the interaction experience.
This dynamic deployment process enables the interface to evolve alongside the user, supporting gradual transitions between configurations and maintaining alignment between system behavior, user capabilities, and contextual demands.
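A rule-based sketch of this feedback loop is shown below. The snapshot fields, thresholds, and decision labels are illustrative assumptions; a deployed system would calibrate them empirically and, as noted above, leave the final decision to the user.

```python
from dataclasses import dataclass

@dataclass
class InteractionSnapshot:
    task_efficiency: float    # normalized, higher is better
    error_rate: float         # normalized, lower is better
    reported_workload: float  # normalized subjective workload, lower is better

def adaptation_decision(profile: str, s: InteractionSnapshot) -> str:
    """Illustrative thresholds only: propose a richer configuration when behavior
    suggests growing fluency, a simpler one when overload indicators accumulate."""
    if profile != "Advanced" and s.task_efficiency > 0.8 and s.error_rate < 0.1:
        return "suggest-upgrade"    # adviser-based suggestion of a richer template
    if profile != "Beginner" and (s.reported_workload > 0.7 or s.error_rate > 0.4):
        return "suggest-downgrade"  # interface likely too complex for current fluency
    return "keep"                   # configuration still fits; continue monitoring

snapshot = InteractionSnapshot(task_efficiency=0.86, error_rate=0.05, reported_workload=0.30)
print(adaptation_decision("Intermediate", snapshot))  # suggest-upgrade
```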

8. Discussion and Implications

The results of this study highlight the importance of considering user diversity as a central factor in user interface evaluation. The proposed framework addresses the limitations of uniform assessment by moving toward a profile-sensitive logic that reflects systematic differences in user perception and interaction behavior.

8.1. Comparison with Related Work

From a methodological perspective, this model aligns with foundational performance metrics [4] and standardized frameworks like ISO 9241-11 [13]. While the integration of objective and subjective indicators is a well-established practice in UX research [7,8,18], the current framework advances this integration by introducing a profile-dependent aggregation logic.
Unlike standalone metrics [20] or fixed composite indices, the present approach treats the relationship between performance and perception as context-dependent rather than static. This distinguishes it from classical MCDM methods used in UI evaluation [21,22], which typically rely on predefined weighting schemes [23]. The framework enables evaluation priorities to shift based on user characteristics. For instance, it can emphasize error minimization for novice users while prioritizing efficiency for experts, thereby addressing the long-standing need for flexibility in user-centered evaluation [19,24]. Furthermore, while adaptive UI systems [9,25] primarily focus on real-time interface modification, the present work extends this perspective by introducing a structured approach for evaluating interface quality across heterogeneous user groups.

8.2. Practical and Design Implications

From a design perspective, profile-sensitive evaluation reveals how different user groups perceive the same interface differently. This is particularly relevant for systems serving heterogeneous populations, where design trade-offs must account for varying cognitive strategies and interaction patterns. By identifying exactly where design solutions may disadvantage particular profiles, the model functions not only as an evaluation tool but also as a framework for personalized interface optimization. It allows practitioners to justify design decisions—such as simplifying navigation for novices or expanding shortcut capabilities for experts—based on empirical, profile-specific data.

8.3. Limitations and Validation Roadmap

Despite its advantages, the proposed framework depends on the calibration of its components, which introduces several methodological considerations.
The data-driven refinement level provides the highest degree of precision but requires sufficient datasets and computational resources, which may not always be feasible in early-stage or low-resource scenarios. In addition, the expert-informed refinement stage remains partially dependent on subjective judgment during the definition of weights and profile boundaries. Another limitation is that the current profile structure is discrete, leaving intermediate stages of expertise development unrepresented.
To support the transition from conceptual specification to empirical application, future work will follow a three-phase validation strategy:
  • Experimental Design and Data Collection: A representative interface and task set will be selected to enable the collection of objective performance metrics and subjective assessments.
  • User Profiling and Model Application: Participants will be assigned to profiles based on their experience levels. To isolate the effect of adaptive profiling, the framework will be tested in two configurations: an adaptive version (using profile-dependent weighting) and a non-adaptive baseline (using fixed weights). In both cases, the balance between objective and subjective metrics will remain identical.
  • Evaluation and Comparative Analysis: This phase will examine the evaluation outcomes to assess differences across user groups and the alignment between model outputs and users’ overall quality judgments. The goal is to determine whether adaptive weighting yields more consistent and profile-sensitive interpretations of interface quality compared to fixed schemes.
The validation process will focus on whether adaptive weighting produces more consistent and profile-sensitive interpretations of interface quality compared to fixed evaluation schemes. Rather than benchmarking against incompatible multidimensional UX models, the evaluation adopts a user-centered perspective by examining the correspondence between model-generated scores and users’ own quality perceptions.
Future research will additionally investigate dynamic transitions between user profiles and the corresponding adjustment of adaptive weights as user proficiency evolves over time.

8.4. Summary of Contribution

Overall, the discussion underscores that UI evaluation benefits from moving beyond uniform assessment assumptions toward models that explicitly account for user variability. By separating evaluative structure from operational implementation, the framework maintains the flexibility required for diverse application domains—ranging from safety-critical systems to consumer mobile apps—without sacrificing methodological rigor or theoretical grounding.

9. Conclusions

This study introduced a conceptual framework for user interface evaluation that integrates heterogeneous evaluation inputs within a unified structure. Designed to support more context-sensitive interpretations of interface quality, the framework is particularly relevant in scenarios where user diversity significantly shapes interaction outcomes. It provides a structured foundation for combining multiple evaluation signals while preserving flexibility in how individual metrics are selected and operationalized.
In relation to the research questions, the model establishes a conceptual basis for integrating user profiles into evaluation processes (RQ1), demonstrates how adaptive weighting can be embedded within a multidimensional structure (RQ2), and outlines the methodological requirements for future empirical validation (RQ3).
Future work will focus on empirical studies, refinement of profiling strategies, and exploration of the framework in real-world interactive systems. In the longer term, extending the profiling dimension to include accessibility-related characteristics offers a pathway toward more inclusive and universally applicable evaluation practices.

Author Contributions

Conceptualization, I.A.N. and Z.S.K.; methodology, I.A.N. and Z.S.K.; software, I.A.N.; formal analysis, I.A.N. and Z.S.K.; investigation, I.A.N. and I.K.G.; resources, I.A.N. and I.K.G.; data curation, I.A.N. and I.K.G.; writing, I.A.N. and Z.S.K.; writing—review and editing, I.A.N. and Z.S.K.; visualization, I.A.N.; supervision, I.A.N.; project administration, Z.S.K.; funding acquisition, Z.S.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Bulgarian National Science Fund (grant number KP-06-N52/2: “Perspective Methods for Quality Prediction in the Next Generation Smart Informational Service Networks”).

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:
UI: User Interface
HCI: Human–Computer Interaction
KLM: Keystroke-Level Model
MOS: Mean Opinion Score
AMM: Adaptive Multidimensional Model
UX: User Experience
QoE: Quality of Experience
SUS: System Usability Scale
PSSUQ: Post-Study System Usability Questionnaire
UEQ: User Experience Questionnaire
MCDM: Multi-Criteria Decision-Making
AHP: Analytic Hierarchy Process
PACT: People, Activities, Contexts, Technologies
NASA-TLX: NASA Task Load Index
GOMS: Goals, Operators, Methods, and Selection rules
NHPP: Non-Homogeneous Poisson Process
AI: Artificial Intelligence

References

  1. Davis, F.D. Perceived Usefulness, Perceived Ease of Use, and User Acceptance of Information Technology. MIS Q. 1989, 13, 319–340. [Google Scholar] [CrossRef] [PubMed]
  2. Sonderegger, A.; Sauer, J. The influence of design aesthetics in usability testing: Effects on user performance and perceived usability. Appl. Ergon. 2010, 41, 403–410. [Google Scholar] [CrossRef] [PubMed]
  3. Nielsen, J. Usability Engineering; Morgan Kaufmann: San Diego, CA, USA, 1993. [Google Scholar] [CrossRef]
  4. Card, S.K.; Moran, T.P.; Newell, A. The Psychology of Human–Computer Interaction; Lawrence Erlbaum Associates: Hillsdale, NJ, USA, 1983. [Google Scholar]
  5. Bevan, N. Extending quality in use to provide a framework for usability measurement. In Human Centered Design; Springer: Berlin/Heidelberg, Germany, 2009; pp. 13–22. [Google Scholar] [CrossRef]
  6. Raake, A.; Egger, S. Quality and Quality of Experience. In Quality of Experience: Advanced Concepts, Applications and Methods; Möller, S., Raake, A., Eds.; Springer: Cham, Switzerland, 2014; pp. 11–33. [Google Scholar]
  7. Möller, S.; Raake, A.; Fiedler, M.; Hossfeld, T.; Keimel, C.; Schatz, R. Towards a Holistic View of Quality of Experience. Dagstuhl Manifesto. In Dagstuhl Reports; Schloss Dagstuhl–Leibniz Center for Informatics: Wadern, Germany, 2017. [Google Scholar]
  8. Vermeeren, A.P.; Law, E.L.C.; Roto, V.; Obrist, M.; Hoonhout, J.; Väänänen-Vainio-Mattila, K. User experience evaluation methods: Current state and development needs. In Proceedings of the 6th Nordic Conference on Human-Computer Interaction; ACM: Reykjavik, Iceland, 2010; pp. 521–530. [Google Scholar] [CrossRef]
  9. Purificato, I.; Narducci, F.; Musto, C.; Semeraro, G. User Modeling and Profiling: A Comprehensive Survey. arXiv 2024, arXiv:2402.09660. [Google Scholar] [CrossRef]
  10. Ntoa, S. Usability and User Experience Evaluation in Intelligent Environments: A Review and Reappraisal. Int. J. Hum.-Comput. Interact. 2024, 41, 2829–2858. [Google Scholar] [CrossRef]
  11. Fitts, P.M. The information capacity of the human motor system in controlling the amplitude of movement. J. Exp. Psychol. 1954, 47, 381–391. [Google Scholar] [CrossRef]
  12. Hick, W.E. On the rate of gain of information. Q. J. Exp. Psychol. 1952, 4, 11–26. [Google Scholar] [CrossRef]
  13. ISO 9241-11:2018; Ergonomics of Human–System Interaction—Part 11: Usability: Definitions and Concepts. ISO: Geneva, Switzerland; IEC: Geneva, Switzerland, 2018.
  14. Bangor, A.; Kortum, P.; Miller, J. Determining What Individual SUS Scores Mean: Adding an Adjective Rating Scale. J. Usability Stud. 2009, 4, 114–123. [Google Scholar]
  15. Lewis, J.R. Psychometric evaluation of the PSSUQ using data from five years of usability studies. Int. J. Hum.-Comput. Interact. 2002, 14, 463–488. [Google Scholar]
  16. Hart, S.; Staveland, L. Development of NASA TLX (Task Load Index): Results of empirical and theoretical research. In Human Mental Workload; Hancock, P.A., Meshkati, N., Eds.; North Holland: Amsterdam, The Netherlands, 1988; pp. 139–183. [Google Scholar]
  17. Tractinsky, N.; Katz, A.S.; Ikar, D. What is beautiful is usable: The impact of website aesthetics on perceived usability. Interact. Comput. 2000, 13, 127–145. [Google Scholar] [CrossRef]
  18. Hassenzahl, M. User experience (UX): Towards an experiential perspective on product quality. In IHM’08: Proceedings of the 20th Conference on l’Interaction Homme-Machine; Association for Computing Machinery: New York, NY, USA, 2008. [Google Scholar] [CrossRef]
  19. Borsci, S.; Federici, S.; Malizia, A. A Multidimensional Approach to Usability and UX Evaluation: A Critical Review. Behav. Inf. Technol. 2023, 42, 389–407. [Google Scholar]
  20. Mortazavi, E.; Doyon-Poulin, P.; Imbeau, D.; Taraghi, M.; Robert, J.-M. Exploring the Landscape of UX Subjective Evaluation Tools and UX Dimensions: A Systematic Literature Review (2010–2021). Interact. Comput. 2024, 36, iwae017. [Google Scholar] [CrossRef]
  21. Saaty, T.L. The Analytic Hierarchy Process; McGraw-Hill: New York, NY, USA, 1980. [Google Scholar]
  22. OECD. Handbook on Constructing Composite Indicators: Methodology and User Guide; OECD Publishing: Paris, France, 2008. [Google Scholar]
  23. Triantaphyllou, E. Multi-Criteria Decision Making Methods: A Comparative Study; Kluwer Academic Publishers: Boston, MA, USA, 2000. [Google Scholar]
  24. Brdnik, A.; Heričko, M.; Šumak, B. Intelligent user interfaces and their evaluation: A systematic mapping study. Sensors 2022, 22, 5830. [Google Scholar] [CrossRef] [PubMed]
  25. Carrera-Rivera, A.; Larrinaga, F.; Lasa, G.; Martínez-Arellano, G.; Unamuno, G. AdaptUI: A Framework for the Development of Adaptive User Interfaces in Smart Product–Service Systems. User Model. User-Adapt. Interact. 2024, 34, 1929–1980. [Google Scholar] [CrossRef]
  26. Knijnenburg, B.P.; Willemsen, M.C.; Gantner, Z.; Soncu, H.; Newell, C. Evaluating Recommender Systems with User Experiments. In Recommender Systems Handbook; Ricci, F., Rokach, L., Shapira, B., Eds.; Springer: Boston, MA, USA, 2015; pp. 309–352. [Google Scholar] [CrossRef]
  27. Wang, G.; Zhang, X.; Tang, S.; Zheng, H.; Zhao, B.Y. Unsupervised Clickstream Clustering for User Behavior Analysis. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems (CHI ’16), San Jose, CA, USA, 7–12 May 2016; ACM: New York, NY, USA, 2016; pp. 225–236. [Google Scholar] [CrossRef]
  28. Hosmer, D.W.; Lemeshow, S.; Sturdivant, R.X. Applied Logistic Regression, 3rd ed.; Wiley: Hoboken, NJ, USA, 2013. [Google Scholar]
  29. Duda, R.O.; Hart, P.E.; Stork, D.G. Pattern Classification, 2nd ed.; Wiley: New York, NY, USA, 2001. [Google Scholar]
  30. Shneiderman, B.; Plaisant, C.; Cohen, M.; Jacobs, S.; Elmqvist, N.; Diakopoulos, N. Designing the User Interface: Strategies for Effective Human–Computer Interaction, 6th ed.; Pearson: Boston, MA, USA, 2016. [Google Scholar]
  31. Benyon, D. Designing Interactive Systems: A Comprehensive Guide to HCI, UX and Interaction Design, 4th ed.; Pearson Education Limited: Harlow, UK, 2019. [Google Scholar]
  32. Sauro, J.; Lewis, J.R. Quantifying the User Experience: Practical Statistics for User Research, 2nd ed.; Morgan Kaufmann: Cambridge, MA, USA, 2016. [Google Scholar]
  33. Subramaniam, H.; Zulzalil, H. Software quality assessment using flexibility: A systematic literature review. Int. Rev. Comput. Softw. 2012, 7, 5. [Google Scholar]
  34. Bass, L.; Clements, P.; Kazman, R. Software Architecture in Practice, 4th ed.; Addison-Wesley Professional: Boston, MA, USA, 2021. [Google Scholar]
  35. Sweller, J.; Ayres, P.; Kalyuga, S. Cognitive Load Theory; Springer: New York, NY, USA, 2011. [Google Scholar]
  36. Musa, J.D.; Iannino, A.; Okumoto, K. Software Reliability: Measurement, Prediction, Application; McGraw Hill: New York, NY, USA, 1987. [Google Scholar]
  37. Lyu, M.R. (Ed.) Handbook of Software Reliability Engineering; McGraw Hill: New York, NY, USA, 1996. [Google Scholar]
  38. Nielsen, J.; Molich, R. Heuristic evaluation of user interfaces. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Washington, DC, USA, 1–5 April 1990; pp. 249–256. [Google Scholar]
  39. Luo, X.; Li, Y.; Xu, J.; Zheng, Z.; Ying, F.; Huang, G. AI in medical questionnaires: Scoping review. J. Med. Internet Res. 2025, 27, e72398. [Google Scholar] [CrossRef] [PubMed]
  40. Streijl, R.; Winkler, S.; Hands, D. Mean opinion score (MOS) revisited: Methods and applications, limitations and alternatives. Multimed. Syst. 2016, 22, 213–227. [Google Scholar] [CrossRef]
Figure 1. Adaptive UI evaluation model diagram.
Figure 2. Profile-dependent weighting of composite indices.
Figure 3. Dynamic UI adaptation process based on user profiles.
Table 1. Comparison of existing UI evaluation approaches with the proposed framework (symbols ✔ and ✖ denote full alignment and the absence of a specific feature, respectively).

Approach/Model | Objective Indicators | Subjective Indicators | User Profiles | Adaptive Weighting | Multidimensional Structure
Classical Usability Metrics | ✔ | ✖ | ✖ | ✖ | ✖
UX Questionnaire-Based Evaluation | ✖ | ✔ | ✖ | ✖ | ✖
Composite UX/QoE Models | ✔ | ✔ | ✖ | ✖ | ✔
Adaptive UI Systems | ✖ | ✖ | ✔ | ✔ | ✖
Proposed Framework | ✔ | ✔ | ✔ | ✔ | ✔
Table 2. Composite indices of the Functional–Objective dimension.

Composite Index | Sub-Metrics | Description
Interaction Efficiency | Task completion time, number of steps, actions per task | Measures how quickly and easily users can complete tasks, minimizing effort and redundant actions
Reliability | Error rate, system failures, incorrect states | Captures the frequency of errors and system instability, reflecting robustness of the interface
Interface Consistency | Repeatability of elements, uniform behavior across screens | Evaluates whether design elements behave predictably, providing a coherent experience
Table 3. Composite indices of the Cognitive–Perceptual dimension.

Composite Index | Sub-Metrics | Description
Usability & Intuitiveness | Usability, perceived ease of use, intuitiveness | Measures how naturally and predictably users can interact with the interface
Clarity & Comprehensibility | Clarity, visual simplicity, cognitive load | Evaluates how easily users understand information and navigate the interface without unnecessary mental effort
Structural & Feedback Quality | Logical hierarchy, structural organization, adequacy of feedback | Assesses how well the interface organizes content and communicates system status to the user
Aesthetic & Emotional Satisfaction | Emotional perception, aesthetic satisfaction | Captures the emotional and aesthetic experience, influencing overall satisfaction and perceived reliability
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
