Article
Peer-Review Record

A Framework for Standardizing the Development of Serious Games with Real-Time Self-Adaptation Capabilities Using Digital Twins

Technologies 2025, 13(8), 369; https://doi.org/10.3390/technologies13080369
by Spyros Loizou * and Andreas S. Andreou
Reviewer 1: Anonymous
Reviewer 2:
Reviewer 3: Anonymous
Submission received: 16 April 2025 / Revised: 6 August 2025 / Accepted: 8 August 2025 / Published: 18 August 2025

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The authors wrote about building a framework to make serious games smarter and more personal by using digital twins and semantic annotation. They split the game development process into clear phases, starting from setting goals, designing the game environment with help from experts, tagging gameplay data with metadata, and using real-time analytics to adjust the game experience based on how players perform. They tested their ideas with two real-world examples: a speech therapy game for children with learning disabilities, and a factory training game for engineers, showing that the games could automatically tweak difficulty and content for each user. They also compared their framework to others and argued that their approach stood out because of better real-time adaptability, stronger standardisation, and broader domain flexibility. Overall, they showed strong early results, even though they admitted more work is needed to scale and refine the system.

 

 

Issue 1: Lack of Large-Scale Validation
The authors mainly relied on small-scale experiments involving just a handful of participants, particularly five children in the speech therapy example, and a limited number of engineers in the factory setting. They wrote that their framework worked well based on these small trials, but they did not run any broad user studies or long-term evaluations to show that the system would hold up with a bigger or more diverse group. Without serious longitudinal data or more diverse user sampling, the claims about the system's effectiveness, scalability, and generalisability stayed mostly theoretical.


Issue 2: Limited Real-Time Adaptation Depth
Even though the authors claimed their framework could adapt games in real-time using digital twins, the types of adaptations they demonstrated were relatively simple. They mainly adjusted difficulty levels, provided hints, or added visual cues. They did not show deeper structural changes to gameplay, storyline branching, or richer player modelling. So while they wrote a lot about dynamic adaptation, in reality the adaptations they achieved seemed closer to basic parameter tweaking than to true complex, context-aware gameplay evolution.


Issue 3: Expert Dependency Weakness
The system heavily relied on domain experts to define the adaptation rules, configure the initial semantic annotations, and continuously refine the system. The authors wrote that this expert input was a strength, but it actually introduced a major bottleneck. Without expert involvement, the framework would not easily scale or adapt to new domains, because non-expert users would struggle to create or adjust the necessary ontologies, rule sets, or XML structures. They did not propose or test any way to automate or ease this part of the process.


Issue 4: Overstated Claims about Standardisation
The authors repeatedly stated that their framework standardised the development of serious games through semantic annotation and metadata structures. However, the “standards” they created were internal and specific to their own project. They did not align their work with any major public standards for game development, data interoperability, or educational technology, such as IEEE or IMS standards. This made their talk about standardisation seem overhyped, because in practice they just created a nice internal convention rather than a real standard.

 


Issue 5: No Real Machine Learning or Predictive Modelling
Although the paper positioned the system as highly data-driven, the authors did not actually use machine learning, predictive analytics, or behavioural modelling. They wrote about using historical data and monitoring performance, but all adaptations were based on simple rule engines and predefined thresholds. This missed the opportunity to leverage the rich gameplay data they collected for deeper insights, like predicting skill progression, clustering player profiles, or discovering hidden patterns in behaviour.


Issue 6: Limited Handling of Data Privacy and Ethics
The authors talked about collecting a lot of sensitive data from players, including performance logs, decision patterns, and even voice recordings, but they barely addressed privacy, ethics, or data protection in any meaningful way. They wrote that they collected and semantically annotated the data, but they did not mention how they anonymised, secured, or ethically managed that information, especially when working with vulnerable groups like children. In a real-world deployment, this would have been a major oversight.


Issue 7: Weak Discussion on Transferability to Other Game Genres
The paper focused entirely on training and therapeutic serious games, and while the authors claimed their framework was domain-agnostic, they did not show how it could handle entertainment games, multiplayer environments, open-world scenarios, or games with emergent gameplay. They wrote confidently about adaptability across fields, but the examples they provided were highly controlled, linear, and single-user focused. They missed the chance to seriously discuss or test the system’s adaptability to more dynamic or less structured gaming contexts.

 

Author Response

HOW COMMENTS WERE DEALT WITH

We would like to thank the two reviewers for their valuable comments. We tried to address all of their suggestions either by revising the manuscript (adding, enhancing or modifying text where appropriate) or by clarifying or answering a comment where we did not fully agree with it.

Below we provide each comment/suggestion in italics and our answer right below it for each reviewer.

 

Reviewer 1

Issue 1: Lack of Large-Scale Validation

The authors mainly relied on small-scale experiments involving just a handful of participants, particularly five children in the speech therapy example, and a limited number of engineers in the factory setting. They wrote that their framework worked well based on these small trials, but they did not run any broad user studies or long-term evaluations to show that the system would hold up with a bigger or more diverse group. Without serious longitudinal data or more diverse user sampling, the claims about the system's effectiveness, scalability, and generalisability stayed mostly theoretical.

Response to Issue 1:

We agree with this comment, and the manuscript already referred in several places to the preliminary nature of the evaluation. A full-scale experimental assessment was beyond the scope of the present paper, whose aim was to propose a new, phased approach for developing serious games, describe its distinct steps in detail and provide demo examples. The preliminary evaluation was intended to provide initial evidence that the approach works well in practice. Future steps will, of course, include larger-scale experimentation, which has already been scheduled and will last for at least one year. To clarify this issue, we now report it more extensively in the manuscript, placing more emphasis on the reason for the small-scale assessment of the proposed framework. We also refer to the future full-scale assessment (see revised sections and pages below, with revised text in red):

Abstract (page 1, line 29)

Preliminary, small-scale experimentation indicated that this framework promotes personalized and dynamic user experience, with improved engagement through the adjustment of gaming elements in real-time to match each player’s unique profile, actions and achievements. Comparison with similar approaches using a set of properties and features suggested the superiority of the proposed framework.

5.2 Evaluation (page 19, line 684)

In conclusion, the preliminary evaluation results indicate that the framework is promising with respect to adaptability and usability. Nevertheless, it is important to note that the pilot results reported may be considered as preliminary evidence rather than proof of efficacy. To this end, additional large-scale and long-term user studies are required to fully assess the framework’s efficacy, scalability and generalization. The use of a DT for real-time self-adjustment appears to be a strong advantage, creating an efficient process for a personalized and dynamic environment. Still, there is ample room for improvement on scenarios configuration and on enhancing the difficulty adjustment.

5.4 Discussion (page 24, line 878)

These applications showcase the framework’s adaptability and potential, demonstrating that it is not limited to one specific domain and that it can be applied to a range of training and educational contexts. Nevertheless, as the current evaluation is limited to small-scale, short-term demonstrations, future work will focus on large-scale and long-term experiments in order to fully evaluate the framework’s efficiency, scalability and impact on a range of user groups. 

 

Issue 2: Limited Real-Time Adaptation Depth

Even though the authors claimed their framework could adapt games in real-time using digital twins, the types of adaptations they demonstrated were relatively simple. They mainly adjusted difficulty levels, provided hints, or added visual cues. They did not show deeper structural changes to gameplay, storyline branching, or richer player modelling. So while they wrote a lot about dynamic adaptation, in reality the adaptations they achieved seemed closer to basic parameter tweaking than to true complex, context-aware gameplay evolution.


Response 2:

We would like to thank the reviewer for this comment. Indeed, the adaptations demonstrated were confined to small to medium changes in scenarios and gameplay, but they were not as simple as the reviewer states. Our intention was to demonstrate the ability to perform these adaptations, not to show the full depth or breadth of the changes the framework can make. In addition, structural changes and storyline branching are, in our framework, nothing more than elements of the game environment. Once these elements are described and the associated rules are defined, such changes can be performed without any limitation. We did not consider that demonstrating structural changes would add to the applicability of the framework; moreover, explaining and then demonstrating these changes would take considerably more space and increase the size of the paper. Nevertheless, we took this comment into serious consideration and made some adjustments in the manuscript to highlight this point and explain it better, as shown below (see revised sections and pages below, with revised text in red):

Proposed Framework (page 6, line 238)

The goals and metrics orient the Game Environment, the latter including the core aspects of the game, such as the gaming rules, game objects (images, videos, text, music, etc.), scenarios/scenes and challenges to be faced, associated with partial goals for each scenario/scene. The structure of the game is built in a way that is adaptable to different roles or requirements of the players corresponding to their condition (e.g., patients that have suffered stroke), or skills/abilities (e.g., newly hired factory engineers under training). More specifically, the framework supports three levels of adaptation: (i) Parameter-Level, that adjusts values for certain parameters (e.g., difficulty, time limits), (ii) Scenario-Level, that modifies game tasks, rules or environmental elements such as switching from syllable recognition to word formation, and, (iii) Structural-Level that modifies core gameplay mechanics and content by introducing new storylines or interactive objects.
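The three adaptation levels listed in the revised text can be illustrated with a minimal Python sketch. It is not taken from the manuscript: the names `AdaptationLevel`, `GameState` and `apply_adaptation`, as well as the field names and payload keys, are hypothetical and chosen only to make the distinction between the levels concrete.

```python
from dataclasses import dataclass, field
from enum import Enum


class AdaptationLevel(Enum):
    """Hypothetical labels for the three levels described in the text."""
    PARAMETER = "parameter"    # adjust values such as difficulty or time limits
    SCENARIO = "scenario"      # change tasks, rules or environmental elements
    STRUCTURAL = "structural"  # add storylines or interactive objects


@dataclass
class GameState:
    """Assumed, simplified view of a running game (not the authors' model)."""
    difficulty: int = 1
    time_limit_s: int = 60
    task: str = "syllable_recognition"
    storylines: list = field(default_factory=lambda: ["intro"])


def apply_adaptation(state: GameState, level: AdaptationLevel, payload: dict) -> GameState:
    """Apply one adaptation to the state, dispatching on its level."""
    if level is AdaptationLevel.PARAMETER:
        # Only numeric knobs change; the task and structure stay the same.
        state.difficulty = payload.get("difficulty", state.difficulty)
        state.time_limit_s = payload.get("time_limit_s", state.time_limit_s)
    elif level is AdaptationLevel.SCENARIO:
        # The current task is swapped for another one.
        state.task = payload["task"]
    elif level is AdaptationLevel.STRUCTURAL:
        # New content is added to the game itself.
        state.storylines.append(payload["storyline"])
    return state
```

For example, calling `apply_adaptation(GameState(), AdaptationLevel.SCENARIO, {"task": "word_formation"})` mirrors the switch from syllable recognition to word formation mentioned in the text.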

Page 7, line 271

As previously mentioned, the rules can vary depending on the level of adaptation, ranging from simple scene adjustments (objects, difficulty, gameplay) to complex, deep structural changes to the content, educational approach (e.g., increase of learning curve) and game style (e.g., from simple tile or image matching to incorporating AR or VR elements depending on the game implementation). The multi-layer adaptation capabilities are therefore driven by DT’s rule engine, which supports expert-defined XML representations to align with real-time players’ data.

Page 11, line 395

The framework utilizes a set of rules encoded in the DT rule engine to guide the adaptation process. These rules dynamically modify the game environment based on players’ performance and selections, and can vary from simple changes in the graphical layout (e.g., colors or lines), to deep structural changes involving scenarios and educational content.
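As a rough illustration of how expert-defined XML rules could drive threshold-based adaptation, the sketch below parses a small rule set and evaluates it against a player's current metrics. The XML layout, the attribute names (`metric`, `op`, `threshold`, `action`) and the action labels are assumptions made for this example; the manuscript's actual rule schema is not shown in this record.

```python
import operator
import xml.etree.ElementTree as ET

# Hypothetical rule set; element and attribute names are illustrative only.
RULES_XML = """
<rules>
  <rule metric="accuracy" op="lt" threshold="0.5" action="decrease_difficulty"/>
  <rule metric="accuracy" op="ge" threshold="0.9" action="increase_difficulty"/>
  <rule metric="avg_response_s" op="gt" threshold="10" action="show_hint"/>
</rules>
"""

# Map the textual comparison names used in the XML to Python operators.
OPS = {"lt": operator.lt, "le": operator.le, "gt": operator.gt, "ge": operator.ge}


def load_rules(xml_text):
    """Parse the XML into (metric, comparison, threshold, action) tuples."""
    root = ET.fromstring(xml_text)
    return [
        (r.get("metric"), OPS[r.get("op")], float(r.get("threshold")), r.get("action"))
        for r in root.iter("rule")
    ]


def evaluate(rules, metrics):
    """Return the actions of all rules whose condition holds for the given metrics."""
    return [
        action
        for metric, compare, threshold, action in rules
        if metric in metrics and compare(metrics[metric], threshold)
    ]
```

With these rules, `evaluate(load_rules(RULES_XML), {"accuracy": 0.4, "avg_response_s": 12.0})` returns `["decrease_difficulty", "show_hint"]`. Keeping the thresholds in XML rather than in code is what would let domain experts revise them without touching the engine.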

5.4 Discussion (Page 25, line 912)

Another possible area for improvement is the incorporation of deep analytics on player behavior and engagement to discover possible patterns and trends that could be incorporated in the rule-engine and be treated as “ground-truth” for deciding in real-time if, when and where adaptations should take place. In addition, exploring AI-driven content generation may provide further automatic adaptations and support deeper structural changes.

 

Issue 3: Expert Dependency Weakness

The system heavily relied on domain experts to define the adaptation rules, configure the initial semantic annotations, and continuously refine the system. The authors wrote that this expert input was a strength, but it actually introduced a major bottleneck. Without expert involvement, the framework would not easily scale or adapt to new domains, because non-expert users would struggle to create or adjust the necessary ontologies, rule sets, or XML structures. They did not propose or test any way to automate or ease this part of the process.

Response 3:

We disagree with the characterization of expert support as a weakness; in our view, experts are an important part of how the proposed framework works. The framework is built to combine the knowledge of domain experts with automated tools, so that the rules and adjustments in the game make sense for each field, such as industrial education or health. This combination is common in similar systems and helps ensure that the game is useful and safe. Similar studies also report this; relevant text has been inserted in the manuscript and the related papers are included in the references. In general terms, the issue is the same as in software systems development, where analysts become domain experts through the elicitation of requirements. This is how a software system may become useful in different disciplines. In much the same way, experts provide the necessary details of the application domain, thus directing the efforts of game developers to deliver a useful serious game. Of course, we know that setting up rules and game details can be made easier, especially for new users or new areas. That is why, as we mention in the future work section, we plan to focus on improving usability and scalability, integrating AI-driven adaptation techniques and conducting in-depth, long-term research to validate the effectiveness of this approach across different domains. One additional note: the reviewer states that experts configure the initial semantic annotations. This is not correct. Experts provide their knowledge and expertise to guide the semantic annotations, which are performed by game developers.

Finally, we would like to note that we have revised the paper and modified or added text to further clarify this point (see revised sections and pages below, with revised text in red):

 

Abstract (page 1, line 19)

This information is then used by a digital twin for automatically adjusting the game experience using a set of rules defined by a group of domain experts. The framework thus follows a hybrid approach, combining expert knowledge with automated adaptation actions being performed to ensure meaningful educational content delivery and flexible, real-time personalization.

  1. Introduction (page 2, line 78)

A key feature of our methodology is its hybrid nature, with the potential of utilizing domain experts to assist in defining and/or refining game rules and scenarios, while the DT adapts the game in real-time based on player data (e.g., progress) and game rules.

Proposed Framework (page 7, line 265)

The experts define also the rules which will guide the DT to perform adjustments in terms of complexity and difficulty. The integration between experts and DT is a feedback loop starting with the initial setup and continuing with periodical re-definition of rules and scenarios taking place when necessary (i.e., based on new educational targets). These rules are incorporated in the DT’s Rule Engine and relevant feedback may be provided to the experts to support their re-definition when necessary.

System Demonstration (page 12, line 449)

The game environment was designed with the aid of speech pathologists at the rehabilitation clinic so as to be simple but also attractive (e.g., colours, icons, figures, etc.), and at the same time be interactive to motivate children to participate in the educational tasks. This case study highlights the hybrid workflow of our approach, where domain experts guide the initial setup and ongoing redefinition, while the adaptive system manages real-time adjustments based on players’ performance.

5.4 Discussion (page 25, line 893)

Additionally, expert guidance supports the development of adaptation rules ensuring that the modifications made to the game are in full alignment with domain-specific learning objectives, keeping educational value and player engagement. It is important to note that our framework balances between automation and human intervention. It may be considered a hybrid solution that brings together the strengths of both worlds, that is, expert input and automated processes, allowing us to maintain quality and relevance while enabling scalability and flexibility. Nevertheless, the structure of the framework also supports modifying the levels of contribution between expert support and automation, even to the extreme ends with only experts being used, or only automated rules, the latter being directed by generative AI models.

Conclusion (page 26, line 957)

Future steps will also focus on improving scalability, integrating AI-driven adaptation techniques and creating deep long-term research to validate the effectiveness of this approach across different domains. Additionally, the use of new AI tools, such as generative AI and Large Language Models (LLMs) will be investigated to assist experts create and update rules faster and more easily. Depending on the findings, the utilization of LLMs could be performed at the level at which all rules defined by domain experts may be replaced by those produced by LLMs using the domain knowledge embodied. This way, the proposed approach may offer varying levels of automation ranging from full scale to hybrid schemes, utilizing partly the experts or being fully expert-based.

 

Issue 4: Overstated Claims about Standardisation

The authors repeatedly stated that their framework standardised the development of serious games through semantic annotation and metadata structures. However, the “standards” they created were internal and specific to their own project. They did not align their work with any major public standards for game development, data interoperability, or educational technology, such as IEEE or IMS standards. This made their talk about standardisation seem overhyped, because in practice they just created a nice internal convention rather than a real standard.

 

Response 4:

We thank the reviewer for this comment. We agree with the reviewer that our framework introduces internal standardization for describing game elements and adaptation rules using XML and ontologies, and that it does not align with any public standards such as IEEE or IMS Learning resource metadata. Our goal was not to fully implement a standardization scheme aligned with external learning management systems or educational technology platforms, but to ensure portability, consistent interpretation and structured game elements within the proposed framework. Therefore, in response to this comment, we have made the following modifications to the manuscript (see revised sections and pages below, with revised text in red) to make this point clearer:

Abstract (Page 1, line 11)

The present paper introduces a framework to guide the development of serious games using a phased approach. The framework introduces a level of standardization for the game elements, scenarios and data descriptions, mainly to support portability, interpretability and comprehension. This standardization is achieved through semantic annotation and it is utilized by digital twins to support self-adaptation.

  1. Introduction (Page 2, line 60)

This paper presents a framework which offers a phased and structured SG development process supported by standardization of the steps to produce dynamically adjustable games using DTs. The standardization provided is not aligned with any public or industry wide standards for games development, such as, the IEEE P2948 (Cloud Gaming), or P3341 (Mobile Game Experience) [26,27], the IMS Learning Tools Interoperability (LTI) or Learning Design (LD) [28,29], or SCORM [30]; it aims primarily to facilitate portability across platforms, improve consistency and support integration of different game elements.

Page 2, line 75

This environment is built with the support of domain experts, and it is formally described using standardised semantic forms, such as ontologies and blueprints. These forms are designed to make it easier to describe, parse and transfer game elements and data described within the proposed framework.

The Proposed Framework (page 7, line 286)

As mentioned earlier, the dedicated XML structure formalizes the description of game scenarios and parameters. This standardized format may be considered internal and it is used to define all related elements, such as game scenes, player difficulty levels, game tasks, and performance metrics. The term “internal” is used here to differentiate between the use of a globally accepted format, such as XML or JSON files, to express game elements and does not refer to universal standards for educational technology or game development as explained previously.

5.3 Comparison with other Approaches (page 21, line 768)

Standardization is a key enabler of interoperability and cross-platform functionality. While all approaches incorporate some form of standardization, ranging from procedural content generation ([21]) to JSON-based data exchange ([24]), our approach stands out through its comprehensive use of both XML and JSON to define rules, game logic, scenarios, metrics, and thresholds. Supporting portability and modular updates using this scheme does not mean that our framework is aligned with full schemes defined by industry-wide or public standards. This provides a structured way to manage and reuse game elements, which can ease future integration or migration to standards if needed. 

5.4 Discussion (page 24, line 863)

One of the key contributions of this framework is the introduction of an internal standardization process for adaptive SG making them scalable and reusable across different domains. This process formalises SG development using ontologies and metadata for the description of game elements, scenarios, scenes, goals and gameplay. Standardization is therefore framework-specific and does not aim to provide compliance with external standards for game development or educational technology. Our goal was mainly to provide better integration and consistency within our framework and facilitate portability.

 

Issue 5: No Real Machine Learning or Predictive Modelling

Although the paper positioned the system as highly data-driven, the authors did not actually use machine learning, predictive analytics, or behavioural modelling. They wrote about using historical data and monitoring performance, but all adaptations were based on simple rule engines and predefined thresholds. This missed the opportunity to leverage the rich gameplay data they collected for deeper insights, like predicting skill progression, clustering player profiles, or discovering hidden patterns in behaviour.

Response 5:

We would like to thank the reviewer for this suggestion. We agree that integrating machine learning or predictive analytics is a very important next step for our framework, and we intend to explore this in the near future. The main goal of the present paper was to introduce and describe the proposed approach, outline its major components and demonstrate its applicability through real-world examples. At this stage, we focused on using expert-driven rules and real-time data collection and feedback, which we consider essential for sensitive domains such as the healthcare and industrial training examples. We have now clarified that we intend to prioritize machine learning, generative AI and advanced data-driven techniques in our future work. To this end, we have made the following modifications to the manuscript to address this comment and further clarify this issue (see revised sections and pages below, with revised text in red):

 

5.4 Discussion (page 25, line 912)

Another possible area for improvement is the incorporation of deep analytics on player behavior and engagement to discover possible patterns and trends that could be incorporated in the rule-engine and be treated as “ground-truth” for deciding in real-time if, when and where adaptations should take place. In addition, exploring AI-driven content generation may provide further automatic adaptations and support deeper structural changes.

Conclusion (page 26, line 951)

Future work will focus on extending the framework application to more domains, refining adaptability features and optimizing user profiling capabilities. Also, standardizing data collection by developing formal methods for collecting and storing gameplay information based on different data sources (i.e. players) and types, improving consistency and reliability of the analysis phase. Furthermore, integration of advanced technologies like AI and machine learning will be performed to add more features such as prediction of user behavior or the learning curve to adapt better to players’ needs. Future steps will also focus on improving scalability, integrating AI-driven adaptation techniques and creating deep long-term research to validate the effectiveness of this approach across different domains. Additionally, the use of new AI tools, such as generative AI and Large Language Models (LLMs) will be investigated to assist experts create and update rules faster and more easily. Depending on the findings, the utilization of LLMs could be performed at the level at which all rules defined by domain experts may be replaced by those produced by LLMs using the domain knowledge embodied. This way, the proposed approach may offer varying levels of automation ranging from full scale to hybrid schemes, utilizing partly the experts or being fully expert-based. Finally, large-scale trials and long-term evaluations will be conducted to validate the platform impact over extended periods of use and different user groups, with high volume of simultaneous users.

 

Issue 6: Limited Handling of Data Privacy and Ethics

The authors talked about collecting a lot of sensitive data from players, including performance logs, decision patterns, and even voice recordings, but they barely addressed privacy, ethics, or data protection in any meaningful way. They wrote that they collected and semantically annotated the data, but they did not mention how they anonymised, secured, or ethically managed that information, especially when working with vulnerable groups like children. In a real-world deployment, this would have been a major oversight.

Response 6:

We would like to thank the reviewer for highlighting this important issue. We agree that privacy, ethics and data protection are critical, especially when handling sensitive data such as data about children or industrial data. Indeed, the previous version of our manuscript did not provide enough detail on how we dealt with these aspects, and we have now improved this section. As reported in the revised text, access is granted only to authorized users, while collected data is anonymised and encrypted before storage and analysis. This ensures that even in the case of unauthorized access, the data remains unreadable and protected. Any information that could identify a user is removed so that individual identities cannot be linked to the data. We have made the following modifications to the manuscript to address this comment and clarify this issue (see revised sections and pages below, with revised text in red):

 

The Proposed Framework (page 9, line 353)

Player interactions, performance metrics and outcomes are collected here including every selection made and the response time within which it was made, all user actions (e.g., clicking correctly or not, cancelling or re-loading stage), and completed challenges. To protect user privacy, all collected data is first anonymized, meaning that any information that could directly identify users is removed, while each user is associated with a unique internal ID. Other sensitive data, such as the profiling of a user for game adaptation (e.g., a syndrome) or their performance data, is encrypted during acquisition and storing so that the relevant data remains unreadable to unauthorized users even if they gain access.
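The anonymization step described above (removing direct identifiers and attaching a unique internal ID) can be sketched as follows. This is an illustrative assumption, not the authors' implementation: the field names, the `IDENTIFYING_FIELDS` set and the keyed-hash scheme for deriving the internal ID are all hypothetical. Strictly speaking, a stable keyed ID of this kind is pseudonymization rather than full anonymization, and the encryption of sensitive fields mentioned in the text is omitted here because it would require a cryptography library outside the Python standard library.

```python
import hashlib
import hmac

# Hypothetical set of fields treated as directly identifying.
IDENTIFYING_FIELDS = {"name", "email", "date_of_birth"}


def internal_id(user_key: str, secret: bytes) -> str:
    """Derive a stable internal ID from a user key with a keyed hash (HMAC-SHA256).

    Without the secret, the ID cannot be recomputed from the user key.
    """
    digest = hmac.new(secret, user_key.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def anonymize(record: dict, secret: bytes) -> dict:
    """Drop identifying fields and attach the internal ID instead.

    Assumes the record carries an "email" field used as the user key.
    """
    cleaned = {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
    cleaned["player_id"] = internal_id(record["email"], secret)
    return cleaned
```

Because the same user key always maps to the same `player_id`, gameplay sessions from one player can still be linked for adaptation purposes while the stored record no longer contains the name or e-mail address.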

System Demonstration (page 13, line 498)

In this preliminary demonstration, five users/players (children) up to the age of twelve years old with learning difficulties due to various syndromes (e.g. Down, Williams) were engaged in the gaming process and used the game task of recognizing the syllables of words. The private nature of the participants’ data was preserved through mechanisms to prevent unauthorised access to the system, as well as the utilization of data anonymization and encryption techniques.

 

Issue 7: Weak Discussion on Transferability to Other Game Genres

The paper focused entirely on training and therapeutic serious games, and while the authors claimed their framework was domain-agnostic, they did not show how it could handle entertainment games, multiplayer environments, open-world scenarios, or games with emergent gameplay. They wrote confidently about adaptability across fields, but the examples they provided were highly controlled, linear, and single-user focused. They missed the chance to seriously discuss or test the system’s adaptability to more dynamic or less structured gaming contexts.

Response 7:

We would like to thank the reviewer for this comment, with which we partly agree and partly disagree. We agree that we did not elaborate on how our approach could be used in the variety of game cases mentioned by the reviewer. This was outside the scope of this paper. We could not introduce the framework, describe it and demonstrate it with many examples, as this would greatly increase the paper’s size. Instead, we decided to demonstrate its applicability through two examples that differ both in domain (industrial, therapeutic) and in nature or content. We disagree that we focused entirely on training and therapeutic serious games and did not touch upon other game genres. We addressed them in the text, though not through demo use cases. Nevertheless, taking this comment into serious consideration, we enhanced the relevant passages, as shown below, to place more emphasis on the framework’s domain-agnostic design, data architecture and modular game mechanisms, which allow its use in a variety of genres. Specifically, we have made the following modifications to the manuscript to address this comment and clarify this issue (see revised sections and pages below, with revised text in red):

  1. Introduction (page 3, line 91)

The framework is general and domain-agnostic as it is not tailored for or attached to a specific application area. Instead, it provides the means, activities and structures to serve practically any field, with experts supporting the transfer of domain knowledge which is reflected in scenarios, scenes, goals and gameplay. The way scenarios, levels and game elements are defined is flexible and independent of the application area, making this approach applicable in a wide range of domains, not just the industrial training or therapeutic cases used in this paper for demonstration purposes.

  1. The Proposed Framework (page 10, line 374)

The latter is able to efficiently handle vast amounts of data that is being generated from multiple simultaneous users at different frequencies and formats resembling Big Data. Therefore, this metadata scheme may also be used to facilitate big data management [17]. The DL architecture of our framework is flexible to support the definition and management of any game environment, that is, scenarios, tasks, gameplay and central data. This reflects to practically all known game types or genres, with the proposed framework being able to accommodate all distinct characteristics of a game type without any dependencies or limitations hindering the setup of the environment for any such kind of game, including those that give birth to big data game environments.

5.4 Discussion (page 25, line 880)

Nevertheless, as the current evaluation is limited to small-scale, short-term demonstrations, future work will focus on large-scale and long-term experiments in order to fully evaluate the framework’s efficiency, scalability and impact on a range of user groups.  The mechanisms provided by the framework for setting up and creating games are modular, allowing developers to target different game genres and styles without difficulty.

Author Response File: Author Response.docx

Reviewer 2 Report

Comments and Suggestions for Authors

Serious games are an important tool in the field of education and training. This paper introduces a framework. The framework adopts a phased approach to guide the development of serious games and uses semantic annotation and digital twin technology to standardize the steps for creating adaptive serious games. Existing problems:
Delete Figure 1 and just describe it in one sentence.
Simplify the description of the digital twin in line 163.
Delete programs such as Figure 3 and Figure 4 and place them as attachments at the end of the article.
Modify Figure 7 to be horizontal.
Delete Figure 8 and change it to a simple diagram highlighting the design technology. Merge and modify Figure 9 as the information is too limited.

Author Response

HOW COMMENTS WERE DEALT WITH

We would like to thank the two reviewers for their valuable comments. We tried to address all of their suggestions by either revising the manuscript (adding/enhancing or modifying text where appropriate) or by clarifying or responding to a comment where we did not fully agree with it.

Below we provide each comment/suggestion in italics and our answer right below it for each reviewer.

 

Reviewer 2

Comment 1:

Delete Figure 1 and just describe it in one sentence.

Response 1:

We have deleted figure 1 and added the following text in section “2. Technical Background and Related Work” (page 3, line 124):

The core of the proposed framework combines SGs, DTs, semantic annotation and predefined rules to enable real-time self-adaptation and personalized learning experiences.

Comment 2:

Simplify the description of the digital twin in line 163.

Response 2:

We have simplified the description of the digital twin to: (page 4, line 183)

The main benefit a DT offers is the ability to simulate real-world scenarios, predict outcomes and perform decision making based on data. Although DTs can help adaptive systems adjust in real-time, it remains challenging to use real-time data to offer personalised user experience in interactive environments like SGs [11,5].

Comment 3:

Delete programs such as Figure 3 and Figure 4 and place them as attachments at the end of the article.

Response 3:

The requested change has been made: Figures 3 and 4 have been removed from the main text and included in the Appendix at the end of the paper.

Comment 4:

Modify Figure 7 to be horizontal.

Response 4:

Figure 7 now appears in horizontal format to better present the ontology structure and relationships within the game environment.

Comment 5:

Delete Figure 8 and change it to a simple diagram highlighting the design technology.

Response 5:

We respectfully disagree with the reviewer’s suggestion. Figure 8 presents the structure and key elements of the game environment, which is the main driver for understanding how the game's adaptive mechanisms and scenarios are organized. This visual context is important for readers to fully understand the complexity and adaptability of the game design as described in the manuscript.

Comment 6:

Merge and modify Figure 9 as the information is too limited.

Response 6:

Following the reviewer’s suggestion, Figure 9 has been merged with Figure 10 to provide one dashboard with examples of graphs that domain experts can utilize.

Author Response File: Author Response.docx

Reviewer 3 Report

Comments and Suggestions for Authors

Overall I think it should be more discussed and described why this framework is new to many other frameworks. One big problem in its current form is reproducibility - and a missing/good described methodology (section). In addition it was not clear to me what the real contribution/novelty is.

* Please describe why you used 2 examples in more detail (and why you think based on this, that your framework works in general - at least it is read to be like this).
* Please also add more aspects about concrete results in the abstract
* The connex to digital twins should be presented in more detail
* I would also suggest to keep the results/methodology out of the introduction
* The methodology is missing, and it is very hard/not possible to understand why each step was chosen and how you ended up with these results. I would suggest to add a dedicated methodology section and split what/how aspects to methodology and results
* Please describe the two examples also in more details - e.g. design choices, requirements - how did you end up with these 2, why these two were chosen etc.
* Why didn’t you put digital twins into the title?
* Please check the styling of your references (e.g. (14))
* I am missing other/more serious game frameworks - there are quite a lot, and they should be explained in more detail/describe what the real novelty is and focus on that
* Please also describe the self adoption part in more detail (for me, this would be the most novel aspect)
* Please check your references to pictures: e.g. where is Fig. A1 and a2?
* The description of the second game is missing in section 4
* I would also suggest to use the IMRAD structure
* Please also explain why you evaluated it like this - there are many standardized questionnaires available.
* Please describe inclusion and exclusion criteria in more detail
* I would suggest to skip table 2 - instead add more details about participants
* Please also explain more about the “brief” literature review and how it was done
* I like Tab 3. Although I think you should really focus on adaptability - at least based on the title I would have expected it to go into that direction
* I believe the source code/pseudocode can be skipped in the publication (or argue in more detail, why it is that important/new)
* In the end it was not clear to me what the real novelty is and into what direction the paper is heading.

Author Response

HOW COMMENTS WERE DEALT WITH

We would like to thank the reviewer for his/her valuable comments. We tried to address all of their suggestions by either revising the manuscript (adding/enhancing or modifying text where appropriate) or by clarifying/answering to a comment in case we did not fully agree with it.

Below we provide each comment/suggestion in italics and our answer right below it for each reviewer.

 

Reviewer #3

Overall Comment 1:

“Overall I think it should be more discussed and described why this framework is new to many other frameworks.”

Response Overall 1:

This is extensively described in Section 5 (subsection 5.4 - Comparison with Other Approaches) where we briefly present other approaches that are similar or close enough to the proposed framework. As described in the manuscript, we selected four other approaches to compare with the proposed one using Table 3 and the selected criteria. The results are discussed in the same section and are further analysed in the subsequent section (5.5 Discussion). This text explains the differences between the proposed framework and others and describes the unique characteristics that differentiate it from potential rivals. Nevertheless, we added a new Table 2 and relevant text in subsection 5.4 - Comparison with Other Approaches, which covers this comment.

Section 5.4 – Comparison with Other Approaches

Page 22 (Lines 874-877)

Various frameworks or approaches have been proposed across the literature to streamline development and ensure effectiveness. Table 2 provides an overview of similar approaches analyzed in terms of their focus, key features and limitations, juxtaposed with the proposed framework.

Page 22 (Lines 896-918)

Their approach is domain-specific, centered on physical rehabilitation, and primarily validated in a single context; (ii) Saeedi et al. [26], who developed "Ava", a smartphone-based serious game designed to assist speech therapy for preschool children with speech sound disorders (SSD). The game teaches consonants, syllables, words, and sentences through four interactive levels. Results showed high satisfaction among speech-language pathologists and positive feedback from children. The game demonstrated potential as a tool for home-based therapy under parental supervision. The work emphasizes user-centered design and usability, but is still specific to speech therapy and does not address any adaptability or domain transfer; (iii) Alcover et al. [27], who introduce PROGame, a structured framework for developing serious games aimed at motor rehabilitation therapy. The framework integrates agile methodologies (Scrum), web application development principles, and clinical trial processes to ensure systematic and validated game development. PROGame demonstrates potential for broader use in rehabilitation game development. The model is process-oriented and repeatable within rehabilitation contexts, but its application and validation are limited to motor therapy; (iv) Antunes and Madeira [28], who introduce the PLAY platform, a model-driven framework for designing serious games tailored to children with special needs. It focuses on physiotherapy and cognitive rehabilitation by gamifying therapeutic exercises into structured levels and actions. The platform integrates patient profiles, enabling personalized game recommendations based on therapeutic needs and progress. Therapists can monitor performance and adapt exercises in real time, while data analytics tools support decision-making. The PLAY platform is modular and domain-agnostic, but focuses only on therapy and clinical settings.

 

Overall Comment 2:

“One big problem in its current form is reproducibility - and a missing/good described methodology (section).”

Response Overall 2:

We have revised this section and changed the title to “Methodology and the Proposed Framework”. We also added text on page 6 to describe how we arrived at the design of the framework and its distinct parts or phases. With this enhancement it is clearer how we conceived the approach and its steps, while their detailed description, which follows, is now connected more smoothly and readability is enhanced.

As regards reproducibility, we have devoted a separate part of the paper (section 4) to provide a full demonstration of the steps within the phases of the proposed framework using the case-study of the therapeutic game. Using this stepwise demo approach the framework becomes easy, simple and straightforward for someone to follow. The steps are generic, inclusive and domain agnostic so they pose no limitations or difficulties that may obscure reproducibility.

Section Methodology and The Proposed Framework

Page 6 (Lines 241-269)

The methodology used for defining the framework introduced in this paper was as follows: First, we identified three key pillars upon which we structured the phases of the framework, that is, Functionality, Stakeholders and Data. The Functionality pillar encloses all game elements that give birth to the scenarios, scenes, tasks and gameplay. These elements constitute the environment of the game and as such a critical part of the framework is devoted to describing them effectively and efficiently through dedicated descriptors and standardised formats that will be analyzed later on. The Stakeholders pillar is considered the cornerstone of the framework as it serves two purposes. First, it includes domain experts whose valuable guidance and knowledge drive the development of the game. Second, it involves the end-users, that is, the enablers of the game (e.g., administrators that initiate the game and make sure it will execute as planned) and the actual users that engage with the game and produce performance results. Data is the final pillar and encompasses all related information, starting with game elements and ending with the outputs of the game during user interaction. These three pillars were thought of as living interconnected parts of a larger ecosystem, the integration of which serves the following purposes: The functionality of a game is described in its environment, which is driven by a set of goals. These goals are defined from the beginning with the help of domain experts. Game tasks are oriented by rules and gameplay which must also be part of the environment. The execution of the game by its end-users produces game experience data which should be recorded in profiles and analyzed further. Therefore, corresponding mechanisms for profiling and managing data should be in place, allowing for models and techniques to monitor, analyse and evaluate it. The core of the framework is its ability to offer a game self-adaptive properties in real-time.
To serve this purpose there should be a game engine in place governing adaptability decisions, which are taken based on the processing of the data and feedback from experts where necessary. This engine would be built to combine elements of Digital Twins, a powerful technology not fully exploited in the past for SG, and a rule-based approach that should enable bridging game requirements and goals with gameplay. This was in a nutshell the philosophy behind the structuring of the framework introduced in this work.

 

Overall Comment 3:

“In addition it was not clear to me what the real contribution/novelty is.”

Response Overall 3:

We would like to thank the reviewer for this comment. Taking this into serious consideration we added a new paragraph in the Introduction (Page 3 in Lines 100-105) summarizing and highlighting the contributions of our paper, as follows:

“The main contributions of this paper may be summarised as follows: (i) Provides a phased, stepwise approach for developing SGs from scratch, which is domain agnostic; (ii) Enables real-time, automatic self-adaptation of SGs via the integration with DTs; (iii) Offers internal standardization by introducing a formalised structure for describing game elements, scenarios, and adaptation rules, thus supporting consistency and portability.”

We would like to highlight that core innovations, challenges and contributions of our work are mentioned and described throughout the text, excerpts of which are provided in Comment 20.

 

Comment 1:

“Please describe why you used 2 examples in more detail (and why you think based on this, that your framework works in general - at least it is read to be like this).”

Response 1:

Following this suggestion we have now revised the manuscript and added new text in section 1 - Introduction (see page 2: lines 83-85, page 3: lines 90-96) and section 5 - Preliminary Experimentation and Evaluation (see page 15, lines 613-701). This new text provides more details on the development of the games (requirements, design, assessment, etc), as well as the reason why we have selected these two types of games and how these contribute to the generalizability of the proposed approach.

In addition, similar approaches deal with the issue of generalizability more superficially than this paper, or do not deal with it at all. For example:

Hocine et al. (2015): Mentions this issue very briefly and describes its adaptation method as “generic” for rehab games, but in reality, the authors show it working only for one game and one type of therapy (stroke rehab).

Saeedi et al. (2024): Does not provide any generalization process but speaks in general about design and testing speech therapy games for children.

Antunes et al. (2024): Argues for building a flexible, modular system that can plug in different kinds of therapy games and sensors. The framework introduced is called PLAY and works with multiple types of therapy (upper limb, lower limb, speech), and even cognitive training. The system is built to be domain-agnostic, meaning it is not tied to just one use-case, and can add game therapies as needed. No other detail is provided as to generalizability.

Streicher & Smeddinck (2016): Mentions generalizability but in a “broader picture” way. This book chapter provides theory and methods for making adaptive games work for all sorts of users and situations. No other evidence of generalizability is provided.

Calvo-Morata et al. (2020): This is a review of quite a few different games for bullying and cyberbullying prevention. It does not propose a framework or discuss generalizability.

Amengual Alcover et al. (2018): Introduces a process for building serious games for motor rehab, which, as stated by the authors, can be used for other rehab games, not just the one tested (adults with cerebral palsy). It is also claimed that the framework is adaptable for different patient groups and therapies, but all within the rehab field. Again, no discussion on generalizability.

Of course, generalizability is a major issue that would require a substantial amount of text in its own right to establish, and this is beyond the scope of our paper. Nevertheless, below we provide some text from our manuscript that reports on generalizability:

"The diversity of the games demonstrates the flexibility and generalizability of the framework, and its potential to adapt to a wide range of application areas." (Lines 611-613)

"The way scenarios, levels and game elements are defined is flexible and independent of the application area, making this approach applicable in a wide range of domains, not just the industrial training or therapeutic cases used in this paper for demonstration purposes." (Lines 79-82)

"The proposed framework is general and domain-agnostic as it is not tailored for or attached to a specific application area." (Lines 76-77)

"These applications showcase the framework’s adaptability and potential, demonstrating that it is not limited to one specific domain and that it can be applied to a range of training and educational contexts." (Lines 1072-1074)

Section Introduction

Page 2 (Lines 83-85)

The proposed framework and associated methodology are demonstrated via two case studies, the first being utilized as a therapeutic game and the second being integrated into a smart manufacturing environment.

Page 2 (Lines 90-96)

The diversity of the games developed indicates the generalizability of the framework as the different settings, operational environment and educational targets are served equally well despite the application domain. In these case-studies the game environment supports efficiently different user training needs, such as to overcome certain types of learning weaknesses or enhance their experience on the use of complicated machinery by allowing personalised and self-adaptive game experience.

Section 5.1 – Case-studies Design

Page 15 (Lines 613-701)

For experimentation purposes, two SGs were developed that cover distinct key domains: Healthcare and Industry 4.0. The diversity of the games demonstrates the flexibility and generalizability of the framework, and its potential to adapt to a wide range of application areas. The two game examples have been chosen so as to show the framework's applicability across these two very different application domains with distinct user-group characteristics and requirements. The speech therapy game targets children with special needs, requiring therapeutic personalization. The factory game targets adult engineers, focusing on operational skill training and real-time adaptation to performance. Therefore, through these case-studies it is demonstrated that the proposed approach is not tied to a specific application area but is instead general and flexible enough to be applied to various domains as long as the targets, rules, scenarios and other game elements are tuned to serve a particular domain and user group. By successfully applying the framework to these contrasting settings, its domain-agnostic design and adaptability are highlighted. In both cases, requirements and design guidelines were sourced from domain experts, ensuring relevance and effectiveness in real-world contexts. The selection of the specific case studies was motivated by an additional factor, the ongoing collaboration with domain experts in the relevant application areas, which would allow engaging groups of users for real-time experimentation and evaluation, and working within a real-world environment and not a simulated one.

The development of the two games followed a short cycle of requirements analysis which resulted in the following:

  1. Speech Therapy Game for Children

Purpose: Developed in collaboration with speech therapists at the Rehabilitation Clinic of the Cyprus University of Technology, the game aims at supporting children with syndromes or learning difficulties, focusing on improving phonological awareness and syllable segmentation skills.

Requirements: Sourced by Speech Therapy Experts. Target Group shall be children with learning difficulties, including those with syndromes affecting phonological awareness.

Therapeutic Goals: Improve syllable segmentation and phonological processing, maintain high engagement and motivation during therapy.

Game Structure and Mechanics: The main task is to identify the correct number of syllables in words presented with colorful images, receiving immediate feedback. Correct answers should trigger positive reinforcement (e.g., happy sounds, animations). Increasing difficulty: starting with two-syllable words, advancing to more complex words and higher syllable counts as accuracy improves. Additional tasks include replacing letters to form new words and matching spoken words to corresponding images.

Design Guidelines: Shall begin with simple two-syllable words, progressing to more complex structures as mastery is demonstrated. Utilize feedback and trigger mechanisms that use positive reinforcement (sounds, animations) for correct answers; offer hints or visual cues when errors occur.

Scenarios: Support a variety of task types (syllable segmentation, letter replacement, image-word matching) to address different therapy needs.

Adaptation and Personalization: The game should dynamically adjust difficulty based on individual player performance (accuracy, response time, error rates). Also, should a child struggle, the system will provide immediate, supportive feedback and hints as needed (e.g., underlining the correct answer or splitting words into syllables). Progress shall be tracked, and metrics like accuracy and completion time shall guide real-time adjustments.

Expert Involvement: Speech therapists shall define the initial rules, thresholds, and adaptation strategies. Experts shall be able to manually adjust game parameters and review performance dashboards to monitor progress and refine the game experience.

Evaluation: Preliminary trials shall involve at least five children with various syndromes. The game shall be assessed in terms of flexibility, adaptability, and ability to reduce the need for constant manual intervention by therapists.

Data Collection: Record accuracy, response times, error rates, and progression. Also, enable therapists to review session data for individualized planning.

Performance Monitoring: Integrate real-time dashboards for therapists to monitor and adjust therapy dynamically.
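
The data collection requirement above (accuracy, response times, error rates) can be illustrated with a minimal Python sketch. It is not taken from the manuscript: the `Attempt` record layout and the `summarize_session` helper are hypothetical names, introduced only to show how per-session metrics of this kind might be computed from recorded answers.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Attempt:
    """One answer given by a player (hypothetical record layout)."""
    word: str
    correct: bool
    response_time_s: float


def summarize_session(attempts):
    """Return accuracy, error rate and mean response time for one session."""
    if not attempts:
        # No answers recorded yet: report neutral values instead of dividing by zero.
        return {"accuracy": 0.0, "error_rate": 0.0, "mean_response_time_s": 0.0}
    n_correct = sum(1 for a in attempts if a.correct)
    accuracy = n_correct / len(attempts)
    return {
        "accuracy": accuracy,
        "error_rate": 1.0 - accuracy,
        "mean_response_time_s": mean(a.response_time_s for a in attempts),
    }
```

A summary of this shape is the kind of input a therapist dashboard or an adaptation rule could consume; the actual metrics and storage format used by the authors are described in the manuscript, not here.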

 

  1. Factory Training Game for Engineers

Purpose: Implemented at the Paradisiotis Group poultry meat factory and designed to train engineers and factory staff in machinery operation and climate control within poultry farms.

Requirements: Target Group is factory engineers and technical staff in a Paradisiotis production environment. The training objectives are to simulate real-world machinery operation and climate control scenarios, support skill development in diagnosing faults and maintaining optimal environmental conditions.

Game Structure and Mechanics: Two main types of tasks: (i) Simulate the factory environment where users diagnose and resolve machinery issues by selecting correct images based on textual descriptions. (ii) Control the climate system: players adjust parameters such as temperature, humidity, and ventilation to maintain optimal conditions during breeding cycles.

Design Guidelines: Scenarios shall create realistic simulations of machinery faults and climate management, reflecting actual factory challenges. Furthermore, adaptation shall increase task complexity as users demonstrate proficiency and offer additional support for users who struggle or have difficulties.

Scenarios: Support a variety of task types (handle knobs, gauges, control panels) to address different training needs.

Adaptation and Personalization: The game shall increase complexity of scenarios as the player progresses (e.g., more challenging machinery faults, more nuanced climate control tasks). Real-time adaptation shall be based on performance metrics like task accuracy and completion times. The system shall provide feedback and adjust the level of challenge to match the trainee’s skill and progress.

Expert Involvement: Factory managers and domain experts shall contribute to scenario design, rule definition, and evaluation of training effectiveness. Experts shall use dashboards to track user performance and adjust training parameters as needed.

Evaluation: The game shall be tested by domain experts and shift managers to provide feedback on usability, adaptability, and the relevance of training scenarios. Also, the approach shall be assessed in terms of supporting both individual and group training, adapting to various skill levels, and providing actionable performance insights.

Data Collection: Record task accuracy, completion times, and decision patterns, and enable supervisors to track individual and group progress.

Performance Monitoring: Use dashboards to visualize progress, highlight areas for improvement, and inform future training adjustments.

 

Comment 2:

“Please also add more aspects about concrete results in the abstract”

Response 2:

Following this suggestion, we added one more sentence in the Abstract to mention more about results.

“Using a specially prepared questionnaire the framework was evaluated by domain experts that suggested high levels of usability and game adaptation. Comparison with similar approaches via a set of properties and features indicated the superiority of the proposed framework.” (Lines 33-36)

 

Comment 3:

“The connex to digital twins should be presented in more detail”

Response 3:

Digital Twins were described already in various places in the manuscript allowing for details to be shared with the readers. For example:

Abstract:

“This standardization is achieved through semantic annotation and it is utilized by digital twins to support self-adaptation. The proposed approach describes the game environment using ontologies and specific semantic structures, while it collects and semantically tags data during players’ interactions, including performance metrics, decision-making patterns and levels of engagement. This information is then used by a digital twin for automatically adjusting the game experience using a set of rules defined by a group of domain experts.” (lines 15-21)

Introduction:

“Although mainly used in different contexts (e.g., smart manufacturing or healthcare delivery), a DT may enable a SG to respond dynamically to players’ behaviour and performance.” (lines 60-62)

“The DT in our case has a special form that allows changing the way educational content is delivered and training activities are performed. This is achieved by making certain adjustments (e.g., changing the scenes or scenarios, or modifying the complexity/difficulty) for a specific sample user profile category and then measuring the response of the trainees. In this case the physical object is the educational tool used, that is, the training approach, content, performance assessment and adaptability according to types of users.” (lines 70-75)

Methodology and Proposed Framework:

“The integration between experts and DT is a feedback loop starting with the initial setup and continuing with periodical re-definition of rules and scenarios taking place when necessary (i.e., based on new educational targets). These rules are incorporated in the DT’s Rule Engine and relevant feedback may be provided to the experts to support their re-definition when necessary” (lines 323-328)

“This information essentially feeds the self-adaptive part of the framework performed by the DT, while it is also stored in the user profiles. The adaptability feature is ensured by using a specific, formal and standard description of the current rules that are active within the game environment.” (lines 337-341)

“Essentially, the DT parses the corresponding XML file and recognizes the rules set. For example, the DT assesses a player’s stress levels, scores, response time and error rates. This information triggers adaptation rules to increase or reduce task complexity or provide hints to assist users successfully completing a scene or game task.” (lines 365-369)
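
As an illustration of the rule-parsing step described in the excerpt above, the following Python sketch reads a small XML rule set and evaluates it against player metrics. The XML element and attribute names, the thresholds and the action labels are assumptions made for this example; the manuscript's actual rule schema is not reproduced here.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule file; element and attribute names are illustrative only.
RULES_XML = """
<rules>
  <rule metric="accuracy" op="lt" threshold="0.5" action="provide_hint"/>
  <rule metric="accuracy" op="gt" threshold="0.9" action="increase_difficulty"/>
  <rule metric="response_time_s" op="gt" threshold="10" action="decrease_difficulty"/>
</rules>
"""

# Comparison operators supported by this sketch.
OPS = {"lt": lambda value, limit: value < limit,
       "gt": lambda value, limit: value > limit}


def load_rules(xml_text):
    """Parse the rule set into a list of plain dictionaries."""
    root = ET.fromstring(xml_text)
    return [
        {
            "metric": rule.get("metric"),
            "op": rule.get("op"),
            "threshold": float(rule.get("threshold")),
            "action": rule.get("action"),
        }
        for rule in root.findall("rule")
    ]


def triggered_actions(rules, metrics):
    """Return the actions of all rules whose condition holds for the metrics."""
    actions = []
    for rule in rules:
        value = metrics.get(rule["metric"])
        # Rules that refer to a metric that was not measured are skipped.
        if value is not None and OPS[rule["op"]](value, rule["threshold"]):
            actions.append(rule["action"])
    return actions
```

For instance, a player with low accuracy and slow responses would trigger both the hint rule and the rule that lowers difficulty, in the order the rules appear in the file.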

System Demonstration:

“As previously mentioned, the self-adaptive simulation (DT and rules engine) dynamically adjusts the game difficulty in real-time based on player performance. The DT processes gameplay data and applies certain rules set by domain experts to modify game tasks at a certain pace and level.” (lines 535-538).

Nevertheless, we also added more text to account for this comment in section 4 page 13 where we provide more details and discuss how the DT technology was used in our demo example.

Section 4 – System Demonstration

Page 13 (Lines 535-553)

As previously mentioned, the self-adaptive simulation (DT and rules engine) dynamically adjusts the game difficulty in real-time based on player performance. The DT processes gameplay data and applies certain rules set by domain experts to modify game tasks at a certain pace and level. Essentially, the DT continuously monitors the response of users to these modifications based on performance data. Then, it applies the relevant rules for lowering or increasing the difficulty/complexity of the current game task either partially or fully. In our example, if the player faced difficulties, then the adaptation process would trigger a slower or marginal differentiation and enable the provision of hints, such as underlining the correct answer, or splitting the word into the correct syllables and displaying them on screen. Otherwise, if progress showed that it was too easy for the player to answer (this was measured by counting the number of correct answers and the time it took to provide them) then the adaptation would make the task harder at higher levels of change, for example by increasing the words to 3 syllables, or by using words having less common sounds like “br” as in “bridge”. Two more tasks were additionally developed in our case-study game with increasing difficulty for adaptability purposes, which involved (i) replacing one or more letters of a given word and creating new words, and, (ii) listening to the pronunciation of various words and choosing the correct image out of the options displayed (see Figure 3 (b)). The adaptation process followed the logic that is described in the pseudocode of Figure A3 in Appendix A.
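
The adaptation behaviour described in this passage can be sketched in Python as follows. This is a simplified illustration and not the pseudocode of Figure A3: the `TaskState` structure, the threshold values and the `adapt` function are hypothetical, and the "slower or marginal" change for struggling players is approximated here by holding the current level while enabling hints.

```python
from dataclasses import dataclass


@dataclass
class TaskState:
    """Current settings of the syllable task (hypothetical structure)."""
    syllables: int = 2          # number of syllables in the presented words
    hints_enabled: bool = False


# Hypothetical thresholds; in the framework such values are set by domain experts.
LOW_ACCURACY = 0.5
HIGH_ACCURACY = 0.9
FAST_RESPONSE_S = 5.0
MAX_SYLLABLES = 5


def adapt(state, accuracy, mean_response_time_s):
    """Return the next task state given the latest performance metrics."""
    if accuracy < LOW_ACCURACY:
        # Player is struggling: keep the level and switch on hints.
        return TaskState(state.syllables, True)
    if accuracy >= HIGH_ACCURACY and mean_response_time_s <= FAST_RESPONSE_S:
        # Task is too easy: move to longer words and drop the hints.
        return TaskState(min(MAX_SYLLABLES, state.syllables + 1), False)
    # Performance is within the expected band: leave the task unchanged.
    return state
```

Under these assumed thresholds, a player starting on two-syllable words who answers quickly and almost always correctly would be moved to three-syllable words, matching the example given in the text.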

 

Comment 4:

“I would also suggest to keep the results/methodology out of the introduction”

Response 4:

Following this suggestion, we moved one large paragraph that provides some more details on the methodology from the Introduction to section 3 - Methodology and the Proposed Framework

 

Comment 5:

“The methodology is missing, and it is very hard/not possible to understand why each step was chosen and how you ended up with these results. I would suggest to add a dedicated methodology section and split what/how aspects to methodology and results”

Response 5:

We would like to thank the reviewer for pointing this out. We have revised this section and changed the title to “Methodology and the Proposed Framework”. We also added text on page 6 to describe how we arrived at the design of the framework and its distinct parts or phases. With this enhancement it is clearer how we conceived the approach and its steps, and their detailed description, which follows, now connects more smoothly and reads more easily.

As regards splitting into methodology and results, this is answered at length in Response 13 below. In essence, the paper includes these two parts with slightly different titles: methodology is section 3 and results is section 5.

Section 3 – Methodology and the Proposed Framework

Page 6 (Lines 241 – 269)

The methodology used for defining the framework introduced in this paper was as follows: First, we identified three key pillars upon which we structured the phases of the framework, that is, Functionality, Stakeholders and Data. The Functionality pillar encloses all game elements that give birth to the scenarios, scenes, tasks and gameplay. These elements constitute the environment of the game, and as such a critical part of the framework is devoted to describing them effectively and efficiently through dedicated descriptors and standardised formats that will be analyzed later on. The Stakeholders pillar is considered the cornerstone of the framework as it serves two purposes. First, it includes domain experts whose valuable guidance and knowledge drive the development of the game. Second, it involves the end-users, that is, the enablers of the game (e.g., administrators that initiate the game and make sure it will execute as planned) and the actual users that engage with the game and produce performance results. Data is the final pillar and encompasses all related information, starting with game elements and ending with the outputs of the game during user interaction. These three pillars were thought of as living interconnected parts of a larger ecosystem, the integration of which serves the following purposes: The functionality of a game is described in its environment, which is driven by a set of goals. These goals are defined from the beginning with the help of domain experts. Game tasks are oriented by rules and gameplay, which must also be part of the environment. The execution of the game by its end-users produces game experience data which should be recorded in profiles and analyzed further. Therefore, corresponding mechanisms for profiling and managing data should be in place, allowing for models and techniques to monitor, analyse and evaluate it. The core of the framework is its ability to offer a game self-adaptive properties in real-time. To serve this purpose there should be a game engine in place governing adaptability decisions, which are taken based on the processing of the data and feedback from experts where necessary. This engine would be built to combine elements of Digital Twins, a powerful technology not fully exploited in the past for SG, and a rule-based approach that should enable bridging game requirements and goals with gameplay. This was in a nutshell the philosophy behind the structuring of the framework introduced in this work.

 

 

Comment 6:

“Please describe the two examples also in more details - e.g. design choices, requirements - how did you end up with these 2, why these two were chosen etc.”

Response 6:

As mentioned in comment 1 above, we have modified the manuscript to address all the issues related to the case-studies noted by the referee. The parts modified are found in section 5.1 page 15.

Section 5.1 – Case-studies Design

Page 15 (Lines 613 – 701)

The two game examples have been chosen so as to show the framework's applicability across these two very different application domains with distant user groups characteristics and requirements. The speech therapy game targets children with special needs, requiring therapeutic personalization. The factory game targets adult engineers, focusing on operational skill training and real-time adaptation to performance. Therefore, through these case-studies it is demonstrated that the proposed approach is not tied to a specific application area but is instead general and flexible enough to be applied to various domains as long as the targets, rules, scenarios and other game elements are tuned to serve a particular domain and user group. By successfully applying the framework to these contrasting settings, its domain-agnostic design and adaptability is highlighted. In both cases, requirements and design guidelines were sourced from domain experts ensuring relevance and effectiveness in real-world contexts. The selection of the specific case studies was motivated by an additional factor, the ongoing collaboration with domain experts in the relevant application areas which would allow utilizing and exploiting groups of users for real-time experimentation and evaluation, and working within a real-world environment and not a simulated one.

The development of the two games followed a short cycle of requirements analysis which resulted in the following:

  1. Speech Therapy Game for Children

Purpose: Developed in collaboration with speech therapists at the Rehabilitation Clinic of the Cyprus University of Technology, the game aims at supporting children with syndromes or learning difficulties, focusing on improving phonological awareness and syllable segmentation skills.

Requirements: Sourced by Speech Therapy Experts. Target Group shall be children with learning difficulties, including those with syndromes affecting phonological awareness.

Therapeutic Goals: Improve syllable segmentation and phonological processing, maintain high engagement and motivation during therapy.

Game Structure and Mechanics: The main task is to identify the correct number of syllables in words presented with colorful images, receiving immediate feedback. Correct answers should trigger positive reinforcement (e.g., happy sounds, animations). Increasing difficulty: starting with two-syllable words, advancing to more complex words and higher syllable counts as accuracy improves. Additional tasks include replacing letters to form new words and matching spoken words to corresponding images.

Design Guidelines: Shall begin with simple two-syllable words, progressing to more complex structures as mastery is demonstrated. Utilize feedback and trigger mechanisms that use positive reinforcement (sounds, animations) for correct answers; offer hints or visual cues when errors occur.

Scenarios: Support a variety of task types (syllable segmentation, letter replacement, image-word matching) to address different therapy needs.

Adaptation and Personalization: The game should dynamically adjust difficulty based on individual player performance (accuracy, response time, error rates). Also, should a child struggle, the system will provide immediate, supportive feedback and hints as needed (e.g., underlining the correct answer or splitting words into syllables). Progress shall be tracked, and metrics like accuracy and completion time shall guide real-time adjustments.

Expert Involvement: Speech therapists shall define the initial rules, thresholds, and adaptation strategies. Experts shall be able to manually adjust game parameters and review performance dashboards to monitor progress and refine the game experience.

Evaluation: Preliminary trials shall involve at least five children with various syndromes. The game shall be assessed in terms of flexibility, adaptability, and ability to reduce the need for constant manual intervention by therapists.

Data Collection: Record accuracy, response times, error rates, and progression. Also, enable therapists to review session data for individualized planning.

Performance Monitoring: Integrate real-time dashboards for therapists to monitor and adjust therapy dynamically.
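
The data collection requirement above (accuracy, response times, error rates) can be sketched as a small per-session log. This is an illustrative assumption about how such metrics might be stored, not the implementation used in the game; `SessionLog` and its field names are hypothetical.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class SessionLog:
    # Each entry is (is_correct, response_time_in_seconds).
    answers: list = field(default_factory=list)

    def record(self, is_correct: bool, seconds: float) -> None:
        """Store one answered item."""
        self.answers.append((is_correct, seconds))

    def summary(self) -> dict:
        """Aggregate the metrics a therapist dashboard could display."""
        if not self.answers:
            return {"accuracy": 0.0, "error_rate": 0.0, "mean_response_s": 0.0}
        total = len(self.answers)
        correct = sum(1 for ok, _ in self.answers if ok)
        return {
            "accuracy": correct / total,
            "error_rate": (total - correct) / total,
            "mean_response_s": mean(s for _, s in self.answers),
        }


if __name__ == "__main__":
    log = SessionLog()
    log.record(True, 2.0)
    log.record(False, 4.0)
    print(log.summary())
```

A summary of this kind is what an adaptation rule or a review dashboard would read at the end of each task.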

 

  2. Factory Training Game for Engineers

Purpose: Implemented at the Paradisiotis Group poultry meat factory and designed to train engineers and factory staff in machinery operation and climate control within poultry farms.

Requirements: Target Group is factory engineers and technical staff in a Paradisiotis production environment. The training objectives are to simulate real-world machinery operation and climate control scenarios, support skill development in diagnosing faults and maintaining optimal environmental conditions.

Game Structure and Mechanics: Two main types of tasks: (i) Simulate the factory environment where users diagnose and resolve machinery issues by selecting correct images based on textual descriptions. (ii) Control the climate system: players adjust parameters such as temperature, humidity, and ventilation to maintain optimal conditions during breeding cycles.

Design Guidelines: Scenarios shall create realistic simulations of machinery faults and climate management, reflecting actual factory challenges. Furthermore, adaptation shall increase task complexity as users demonstrate proficiency and offer additional support to users who struggle.

Scenarios: Support a variety of task types (handle knobs, gauges, control panels) to address different training needs.

Adaptation and Personalization: The game shall increase complexity of scenarios as the player progresses (e.g., more challenging machinery faults, more nuanced climate control tasks). Real-time adaptation shall be based on performance metrics like task accuracy and completion times. The system shall provide feedback and adjust the level of challenge to match the trainee’s skill and progress.

Expert Involvement: Factory managers and domain experts shall contribute to scenario design, rule definition, and evaluation of training effectiveness. Experts shall use dashboards to track user performance and adjust training parameters as needed.

Evaluation: The game shall be tested by domain experts and shift managers to provide feedback on usability, adaptability, and the relevance of training scenarios. Also, the approach shall be assessed in terms of supporting both individual and group training, adapting to various skill levels, and providing actionable performance insights.

Data Collection: Record task accuracy, completion times, and decision patterns, and enable supervisors to track individual and group progress.

Performance Monitoring: Use dashboards to visualize progress, highlight areas for improvement, and inform future training adjustments.

 

Comment 7:

“Why didn’t you put digital twins into the title?”

Response 7:

This is a nice suggestion. Actually, we were thinking about that from the very beginning and the reason we did not put it in the first place is that we thought that the title would get too long. Following this comment, we have now revised the title to “A Framework for Standardizing the Development of Serious Games with Real-Time Self-Adaptation Capabilities Using Digital Twins”

 

Comment 8:

“Please check the styling out your references (e.g. (14))”

Response 8:

Thank you for your observation regarding the styling of our references. All references now follow the required formatting guidelines.



Comment 9:

“I am missing other/more serious game frameworks - there are quite a lot, and they should be explained in more detail/describe what the real novelty is and focus on that”

Response 9:

Indeed there are a lot of other frameworks for serious games, all of which present pros and cons. These are reported in section 2 - Technical Background and Related Work and discussed extensively in section 5 - Preliminary Experimentation and Evaluation (subsection 5.4 - Comparison with Other Approaches).  The latter contains a short analysis of each rival framework and explains the challenges they face and their weaknesses, and where the proposed framework is superior and how.

Parts of the following analysis can be found in subsection 5.4:

Hocine et al. [25]: Propose an adaptation technique for upper-limb rehabilitation games, focusing on dynamic difficulty adjustment tailored to stroke patients. Their approach is domain-specific, centered on physical rehabilitation, and primarily validated in a single context.

Saeedi et al. [26]: Present the design and usability evaluation of a speech therapy game for preschool children with speech sound disorders. The work emphasises user-centered design and usability, but is still specific to speech therapy and does not address any adaptability or domain transfer.

Antunes et al. [5]: Describe a DT framework for personalised SGs that are based on therapy, supporting multi-domain integration (e.g., upper/lower limb, speech, cognitive therapy). The PLAY platform is modular and domain-agnostic, but focuses on therapy and clinical settings.

Streicher & Smeddinck [24]: Offer a conceptual and methodological discussion on personalisation and adaptivity in SG across domains. The work describes theoretical models and taxonomies rather than a concrete, implementable framework.

Calvo-Morata et al. [6]: Conduct a systematic review of SG for bullying and cyberbullying prevention. This is a literature review, not a framework proposal.

Amengual Alcover et al. [27]: Introduce PROGame, a process framework for developing SG in motor rehabilitation. The model is process-oriented and repeatable within rehabilitation contexts, but its application and validation are limited to motor therapy.

Antunes & Madeira [28]: Detail a web-based, model-driven platform using DTs for therapy games, supporting configuration, monitoring, and adaptation across multiple therapy types and user groups.

We did not want to extend this analysis at the expense of the paper’s size, therefore we kept it as short and concise as possible. In addition, we added Table 2 and some text to summarize the differences between these approaches and ours.

Section 5.4 – Comparison with Other Approaches

Page 22 (Lines 874-877)

Various frameworks or approaches have been proposed across the literature to streamline development and ensure effectiveness. Table 2 provides an overview of similar approaches analyzed in terms of their focus, key features and limitations, juxtaposed with the proposed framework.

Page 22

(Lines 896-897)

Their approach is domain-specific, centered on physical rehabilitation, and primarily validated in a single context

Page 23

(Lines 902-904)

The work emphasizes user-centered design and usability, but is still specific to speech therapy and does not address any adaptability or domain transfer

(Lines 909-910)

The model is process-oriented and repeatable within rehabilitation contexts, but its application and validation are limited to motor therapy

(Lines 917-918)

The PLAY platform is modular and domain-agnostic, but focuses only on therapy and clinical settings


Comment 10:

“Please also describe the self adoption part in more detail (form me, this would be the most novel aspect)”

Response 10:

To account for this comment we have added new text in section 4, page 13, where we provide more details and discuss how the DT technology was used in our demo example, to explain better how self-adaptation was performed in practice. Additionally, self-adaptation has already been mentioned in various parts of the manuscript, such as:

Abstract:

“This standardization is achieved through semantic annotation and it is utilized by digital twins to support self-adaptation. The proposed approach describes the game environment using ontologies and specific semantic structures, while it collects and semantically tags data during players’ interactions, including performance metrics, decision-making patterns and levels of engagement. This information is then used by a digital twin for automatically adjusting the game experience using a set of rules defined by a group of domain experts.” (lines 15-21)

 

Introduction:

“Although mainly used in different contexts (e.g., smart manufacturing or healthcare delivery), a DT may enable a SG to respond dynamically to players’ behaviour and performance.” (lines 60-62)

Technical Background and Related Work:

“There is still a gap in integrating real-time adjustments based on DTs to further personalize game experience [5]. Combining DTs with SGs will create self-adaptive and effective strategies, which is exactly what the present paper aims at.” (lines 156-159)

“None of the studies thus far have explored adequately the standardization in integrating DTs with SGs for creating self-adaptive games based on user interactions and abilities. The aforementioned gaps and challenges observed in the relevant research on DTs supporting interactive and adaptive environments are significant obstacles to the development of dynamic systems that enable real-time adaptability and personalized experience.” (lines 221-226)

Proposed Framework:

“A key feature of our methodology is its hybrid nature, with the potential of utilizing domain experts to assist in defining and/or refining game rules and scenarios, while the DT adapts the game in real-time based on player data (e.g., progress) and game rules. Progress is measured through performance metrics, such as completion times, interactions and performance outcomes. Every interaction is tracked and valuable data is collected that helps understand how the players respond to distinct game scenes. Analytical models process the data to evaluate a player’s progress and feed the adjustment phase of the game. Specifically, the output of these models is used to adapt the game experience in real-time, adjusting scenarios based on the observed players’ behaviour and patterns. This is feasible as the game structure is designed to respond to players’ actions in real-time, that is, the game adapts itself by adjusting the complexity of the educational content delivery ensuring that the player is challenged at the correct level of difficulty.” (lines 275-287)

“As previously mentioned, the rules can vary depending on the level of adaptation, ranging from simple scene adjustments (objects, difficulty, gameplay) to complex, deep structural changes to the content, educational approach (e.g., increase of learning curve) and game style (e.g., from simple tile or image matching to incorporating AR or VR elements depending on the game implementation). The multi-layer adaptation capabilities are therefore driven by DT’s rule engine, which supports expert-defined XML representations to align with real-time players’ data” (lines 328-334)

“At the same time, this standardised format offers on one hand a unique and unambiguous structure of the game, and, on the other, the means for the proper communication between the game environment and the DT, thus enabling self-adaptation and real-time adjustments. For example, in the game with the therapeutic purpose mentioned in the Introduction and utilized later in this paper for experimentation, the XML defines the relevant scenarios and parameters, such as task type (syllable recognition), difficulty level (beginner, intermediate, advanced) and success criteria (pass, fail, repeat). Furthermore, this structured XML relates to metadata that describes the above parameters so that their dynamic adjustment, based on player feedback and performance data collected during gameplay, is made feasible.” (lines 349-358)
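
As an illustration of how such an XML description could be read and adjusted, the following minimal sketch uses Python's standard `xml.etree.ElementTree` module. The element and attribute names (`scenario`, `task`, `difficulty`, `success`) are assumptions made for this example and are not the schema used in the manuscript; only the parameter values (syllable recognition; beginner, intermediate, advanced; pass, fail, repeat) come from the quoted text.

```python
import xml.etree.ElementTree as ET

# Hypothetical scenario description; tag names are illustrative only.
SCENARIO_XML = """
<scenario id="s1">
  <task type="syllable_recognition"/>
  <difficulty level="beginner"/>
  <success criteria="pass"/>
</scenario>
""".strip()

LEVELS = ["beginner", "intermediate", "advanced"]


def load_scenario(xml_text: str) -> dict:
    """Extract the adjustable parameters from a scenario description."""
    root = ET.fromstring(xml_text)
    return {
        "task_type": root.find("task").get("type"),
        "difficulty": root.find("difficulty").get("level"),
        "success": root.find("success").get("criteria"),
    }


def raise_difficulty(xml_text: str) -> str:
    """Return the scenario XML with difficulty moved one level up (capped)."""
    root = ET.fromstring(xml_text)
    node = root.find("difficulty")
    index = LEVELS.index(node.get("level"))
    node.set("level", LEVELS[min(index + 1, len(LEVELS) - 1)])
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(load_scenario(raise_difficulty(SCENARIO_XML)))
```

Because the scenario stays in a structured format, a rules engine can change one parameter and hand the same document back to the game.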

System Demonstration:

“As previously mentioned, the self-adaptive simulation (DT and rules engine) dynamically adjusts the game difficulty in real-time based on player performance. The DT processes gameplay data and applies certain rules set by domain experts to modify game tasks. Then, based on performance data, it applies the relevant rules for lowering or increasing the difficulty/complexity of the current game task. In our example, if the player faced difficulties, then the adaptation process would enable the provision of hints, such as underlining the correct answer, or splitting the word into the correct syllables and displaying them on screen. Otherwise, if progress showed that it was too easy for the player to answer (this was measured by counting the number of correct answers and the time it took to provide them) then the adaptation would make the task harder by increasing the words to 3 syllables, or by using words having less common sounds like “br” as in “bridge”. Two more tasks were additionally developed in our case-study game with increasing difficulty for adaptability purposes, which involved (i) replacing one or more letters of a given word and creating new words, and, (ii) listening to the pronunciation of various words and choosing the correct image out of the options displayed.” (lines 535-552)

Preliminary Experimentations:

“The platform adaptability to user needs suggested how to handle (increase) complexity for making training in the factory more challenging. Overall, users proposed new features that could be incorporated in upcoming updates (e.g., allowing experts to adjust training scenarios dynamically through a visual editor, adding guided directions for first-time users, enabling the system to recommend difficulty adjustments based on long-term performance trends, etc.) For the platform to effectively meet the needs of domain experts and players, this input was considered essential.” (lines 853-859)

Section 4 – System Demonstration

Page 13 (Lines 535-553)

As previously mentioned, the self-adaptive simulation (DT and rules engine) dynamically adjusts the game difficulty in real-time based on player performance. The DT processes gameplay data and applies certain rules set by domain experts to modify game tasks at a certain pace and level. Essentially, the DT continuously monitors the response of users to these modifications based on performance data. Then, it applies the relevant rules for lowering or increasing the difficulty/complexity of the current game task either partially or fully. In our example, if the player faced difficulties, then the adaptation process would trigger a slower or marginal differentiation and enable the provision of hints, such as underlining the correct answer, or splitting the word into the correct syllables and displaying them on screen. Otherwise, if progress showed that it was too easy for the player to answer (this was measured by counting the number of correct answers and the time it took to provide them) then the adaptation would make the task harder at higher levels of change, for example by increasing the words to 3 syllables, or by using words having less common sounds like “br” as in “bridge”. Two more tasks were additionally developed in our case-study game with increasing difficulty for adaptability purposes, which involved (i) replacing one or more letters of a given word and creating new words, and, (ii) listening to the pronunciation of various words and choosing the correct image out of the options displayed (see Figure 3 (b)). The adaptation process followed the logic that is described in the pseudocode of Figure A3 in Appendix A.

 

Comment 11:

“Please check your references to pictures: e.g. where is Fig. A1 and a2?”

Response 11:

Thank you for highlighting the figure references. We have reviewed and corrected the reference text throughout the manuscript. These figures appear in Appendix A, and all corresponding references in the main text have been updated to clearly indicate their location.

 

Comment 12:

“The description of the second game is missing in section 4”

Response 12:

Section 4 uses the first game as a demonstration example to provide a step-by-step guide on how to execute the various phases of the framework. Section 5 - Preliminary Experimentation and Evaluation (see page 15, lines 610-704) has been revised and new text has been inserted to provide more details on the development of both games (requirements, design, assessment, etc.).

Section 5.1 – Case-studies Design

Page 15 (Lines 610-704)

For experimentation purposes, two SGs were developed that cover distinct key domains: Healthcare and Industry 4.0. The diversity of the games demonstrates the flexibility and generalizability of the framework, and its potential to adapt to a wide range of application areas. The two game examples have been chosen so as to show the framework's applicability across these two very different application domains with distant user groups characteristics and requirements. The speech therapy game targets children with special needs, requiring therapeutic personalization. The factory game targets adult engineers, focusing on operational skill training and real-time adaptation to performance. Therefore, through these case-studies it is demonstrated that the proposed approach is not tied to a specific application area but is instead general and flexible enough to be applied to various domains as long as the targets, rules, scenarios and other game elements are tuned to serve a particular domain and user group. By successfully applying the framework to these contrasting settings, its domain-agnostic design and adaptability is highlighted. In both cases, requirements and design guidelines were sourced from domain experts ensuring relevance and effectiveness in real-world contexts. The selection of the specific case studies was motivated by an additional factor, the ongoing collaboration with domain experts in the relevant application areas which would allow utilizing and exploiting groups of users for real-time experimentation and evaluation, and working within a real-world environment and not a simulated one.

The development of the two games followed a short cycle of requirements analysis which resulted in the following:

  1. Speech Therapy Game for Children

Purpose: Developed in collaboration with speech therapists at the Rehabilitation Clinic of the Cyprus University of Technology, the game aims at supporting children with syndromes or learning difficulties, focusing on improving phonological awareness and syllable segmentation skills.

Requirements: Sourced by Speech Therapy Experts. Target Group shall be children with learning difficulties, including those with syndromes affecting phonological awareness.

Therapeutic Goals: Improve syllable segmentation and phonological processing, maintain high engagement and motivation during therapy.

Game Structure and Mechanics: The main task is to identify the correct number of syllables in words presented with colorful images, receiving immediate feedback. Correct answers should trigger positive reinforcement (e.g., happy sounds, animations). Increasing difficulty: starting with two-syllable words, advancing to more complex words and higher syllable counts as accuracy improves. Additional tasks include replacing letters to form new words and matching spoken words to corresponding images.

Design Guidelines: Shall begin with simple two-syllable words, progressing to more complex structures as mastery is demonstrated. Utilize feedback and trigger mechanisms that use positive reinforcement (sounds, animations) for correct answers; offer hints or visual cues when errors occur.

Scenarios: Support a variety of task types (syllable segmentation, letter replacement, image-word matching) to address different therapy needs.

Adaptation and Personalization: The game should dynamically adjust difficulty based on individual player performance (accuracy, response time, error rates). Also, should a child struggle, the system will provide immediate, supportive feedback and hints as needed (e.g., underlining the correct answer or splitting words into syllables). Progress shall be tracked, and metrics like accuracy and completion time shall guide real-time adjustments.

Expert Involvement: Speech therapists shall define the initial rules, thresholds, and adaptation strategies. Experts shall be able to manually adjust game parameters and review performance dashboards to monitor progress and refine the game experience.

Evaluation: Preliminary trials shall involve at least five children with various syndromes. The game shall be assessed in terms of flexibility, adaptability, and ability to reduce the need for constant manual intervention by therapists.

Data Collection: Record accuracy, response times, error rates, and progression. Also, enable therapists to review session data for individualized planning.

Performance Monitoring: Integrate real-time dashboards for therapists to monitor and adjust therapy dynamically.

 

  2. Factory Training Game for Engineers

Purpose: Implemented at the Paradisiotis Group poultry meat factory and designed to train engineers and factory staff in machinery operation and climate control within poultry farms.

Requirements: Target Group is factory engineers and technical staff in a Paradisiotis production environment. The training objectives are to simulate real-world machinery operation and climate control scenarios, support skill development in diagnosing faults and maintaining optimal environmental conditions.

Game Structure and Mechanics: Two main types of tasks: (i) Simulate the factory environment where users diagnose and resolve machinery issues by selecting correct images based on textual descriptions. (ii) Control the climate system: players adjust parameters such as temperature, humidity, and ventilation to maintain optimal conditions during breeding cycles.

Design Guidelines: Scenarios shall create realistic simulations of machinery faults and climate management, reflecting actual factory challenges. Furthermore, adaptation shall increase task complexity as users demonstrate proficiency and offer additional support to users who struggle.

Scenarios: Support a variety of task types (handle knobs, gauges, control panels) to address different training needs.

Adaptation and Personalization: The game shall increase complexity of scenarios as the player progresses (e.g., more challenging machinery faults, more nuanced climate control tasks). Real-time adaptation shall be based on performance metrics like task accuracy and completion times. The system shall provide feedback and adjust the level of challenge to match the trainee’s skill and progress.

Expert Involvement: Factory managers and domain experts shall contribute to scenario design, rule definition, and evaluation of training effectiveness. Experts shall use dashboards to track user performance and adjust training parameters as needed.

Evaluation: The game shall be tested by domain experts and shift managers to provide feedback on usability, adaptability, and the relevance of training scenarios. Also, the approach shall be assessed in terms of supporting both individual and group training, adapting to various skill levels, and providing actionable performance insights.

Data Collection: Record task accuracy, completion times, and decision patterns, and enable supervisors to track individual and group progress.

Performance Monitoring: Use dashboards to visualize progress, highlight areas for improvement, and inform future training adjustments.

Two game tasks were implemented that focus on speech therapy (phonological training), while another two focus on machinery training within the poultry meat factory described earlier.

 

Comment 13:

“I would also suggest to use the IMRAD structure”

Response 13:

Thank you for this suggestion. IMRaD is a widely used format that organizes scientific research papers into Introduction, Methods, Results, and Discussion sections. Our structure is very similar, but we preferred to keep our slightly different format as we feel it is more appropriate for this paper. In essence, the content of our sections maps closely onto IMRaD: Methods corresponds to our section 3 - Methodology and the Proposed Framework, and Results to our section 5 - Preliminary Experimentation and Evaluation. The different titles that we use reflect better what is reported in the text. Finally, in between we have section 4 - System Demonstration, which we consider quite significant as it provides details for the practical application of the framework and supports its reproducibility.

 

Comment 14:

“Please also explain why you evaluated it like this - there are many standardized questionnaires available.”

Response 14:

We chose to use a customised questionnaire rather than a standardised one like the System Usability Scale (SUS), the Post-Study System Usability Questionnaire (PSSUQ), or the Game Experience Questionnaire (GEQ), as we felt that the customised version would better serve the purposes of our study. The aforementioned approaches offer well-established questionnaires that provide valuable insights into usability, user satisfaction, and general game experience, but they do not fully capture the unique and critical aspects we aimed to evaluate in our framework. This framework emphasizes real-time self-adaptation and the ability to define, store and process rules via a dedicated engine, uses semantic annotation, and provides dynamic personalisation based on detailed performance metrics. Our customised questionnaire was designed to directly evaluate these specific components, focusing on the effectiveness of rule definition, adaptability based on user needs, and accuracy and performance through data collection. This approach allowed us to collect targeted feedback on the key performance indicators of our framework. By aligning the evaluation closely with the features of our framework, we ensured that the assessment was both relevant and actionable, providing insights that directly inform the validation of our framework’s novel capabilities.

Similar text to the above was also inserted in section 5 (subsection 5.2 Evaluation Criteria) to clarify this issue (Lines 752-767)

 

Comment 15:

“Please describe inclusion and exclusion criteria in more detail”

Response 15:

We assume that the referee refers to the criteria used to compare the proposed framework to other similar approaches. As described in Comment 14 above, the criteria were selected to best reflect the components or properties we aimed to compare. The inclusion criteria for the questionnaire were selected to directly reflect the most critical aspects of our framework’s operation and user interaction, as identified both in the literature and through expert consultation. The focus was on usability, configuration and rule definition, adaptability, accuracy and usefulness of results, user experience, and overall satisfaction, as these areas are the most relevant to evaluating the unique features and performance of our framework. We selected and emphasized those criteria in order to keep the evaluation focused and ensure that the most important and useful feedback could be collected without confusing respondents or losing focus. As mentioned in the manuscript, this evaluation is preliminary and does not exclude the possibility of incorporating additional or more standardised criteria in future work as the framework matures. The criteria were not intended to be exhaustive, but rather to provide a focused assessment aligned with the current scope and aims of the study, allowing for deeper exploration in subsequent research phases. This is clearly described in section 5.

Quoting from the text in Section 5.2 – Evaluation Criteria:

Page 19 (Lines 768-778)

“The questionnaire was designed based on the pillars mentioned above and having the goal of obtaining feedback from domain experts on the following areas: (i) Usability, focusing on how easy it was to understand the delivered content and utilize the supported gameplay to complete the tasks;[18] (ii) Configuration and Rule definition, assessing how easy and efficient was the process of defining rules and configuring game parameters and settings; (iii) Adaptability, that is, how well the games provided feedback and adjusted to user needs; (iv) Accuracy and Usefulness of Results, evaluating the proper use of performance metrics and their usefulness in assessing user progress and interactions;[20] (v) User experience, measuring the overall user experience including enjoyability; and (vi) General satisfaction and suggestions, assessing overall impression or general feedback on the platform and recording possible suggestions.[19]”

Therefore, what was included in this comparative study was a mixture of recommendations by similar studies with properties offered by the proposed framework.

 

Comment 16:

“I would suggest to skip table 2 - instead add more details about participants”

Response 16:

Thank you for your suggestion. We feel that keeping table 1 improves readability of the paper. As for the details on participants, we added the following text in Section 5 to address this comment: 

Section 5.3 – Evaluation

Page 20 (Lines 803-816)

“The experts were selected so as to reflect both therapeutic and industrial expertise relevant to the framework’s application areas. Of these, 9 were professionals from the therapeutic domain. This group included 3 highly experienced individuals holding PhDs or recognized as senior experts in their field, while the remaining 6 were either graduate-level specialists or students at the master’s level. The other 6 participants came from the PARG company, representing the industrial training context. This subgroup included 2 shift managers, each with around 10 years of experience in the company, and 4 Electrical Engineers. Among the engineers, 2 had similar experience (~8-10 years), while the other 2 were newer to the organisation, with approximately 2 years prior experience at other industries. This mixture of senior and junior staff ensured that feedback collected a variety of perspectives, from long-term operational knowledge to new viewpoints from newer employees. This diverse participants’ profile provided a strong basis for evaluating the framework’s usability, adaptability, and relevance across both therapeutic and industrial training aspects.”

 

Comment 17:

“Please also explain more about the „brief“ literature review and how it was done “

Response 17:

Following the suggestion of the reviewer, we added a new paragraph at the beginning of Section 2 explaining how the literature review was conducted.

Quoting from the manuscript:

Section 2 – Technical Background and Related Work

Page 3 (Lines 121-138)

Literature review conducted in this study followed a short but focused and structured approach to ensure relevance and clarity. A selection of recent peer-reviewed journal articles, systematic reviews, and established frameworks in the fields of educational technology, serious games, and adaptive systems was performed aiming at identifying research gaps directly related to the development, standardisation and adaptability of serious games, especially real-time self-adaptation and integration with digital twin technology. The review of these sources was brief, giving priority to works that best matched the main ideas of the suggested framework. The search process aimed at tracing articles indexed in Scopus, Science Direct, IEEE Xplore, ACM Digital Library, SpringerLink, Google Scholar and Wiley Online. Inclusion criteria focused on studies that addressed standardisation in game development, integration of digital twins, and real-time adaptation mechanisms. This set of criteria consisted of the type of paper, publication year, publication venue and number of citations. Only scientific papers published in recognized venues with a significant number of citations were included in the final set of papers. This targeted approach allowed collecting findings and best practices from the relevant literature, ensuring that a compact, yet solid background was formed that enables the identification of gaps and challenges and act as an informative guideline for the design and validation of the proposed framework.”

 

Comment 18:

“I like Tab 3. Although I think you should really focus on adaptability - at least based on the title I would have expected it to go into that direction”

Response 18:

As mentioned in Comment 10 above, self-adaptation is a key feature and we agree with the referee that it is indeed the primary focus of this paper. As such, it has already been highlighted in many places in the manuscript (see the examples of the text reported in Comment 10). With the changes made in response to Comments 3 and 10, we believe that this aspect is now emphasized even more.

 

Comment 19:

“I believe the source code/pseudocode can be skipped in the publication (or argue in more detail, why it is that important/new)”

Response 19:

Thank you for your suggestion about the source code and pseudocode. In response, I have moved all program code and basic pseudocode figures from the main text to Appendix A.

Comment 20:

“In the end it was not clear for me what the real novelty is and into what direction the paper is heading.”

Response 20:

We would like to thank the reviewer for this comment. Taking this into serious consideration we added a new paragraph in the Introduction summarizing and highlighting the contributions of our paper, as follows:

Section 1 – Introduction

Page 3 (Lines 100-105)

The main contributions of this paper may be summarised to the following: (i) Provides a phased, stepwise approach for developing SG from scratch, which is domain agnostic; (ii) Enables real-time, automatic self-adaptation of SGs via the integration with DT; (iii) Offers internal standardization by introducing a formalised structure for describing game elements, scenarios, and adaptation rules thus supporting consistency and portability.

We would like to highlight that core innovations, challenges and contributions of our work are mentioned and described throughout the text, excerpts of which are provided below:

Step by step approach:

  • The paper introduces a phased and structured approach to SG development, offering a clear methodology that guides the creation of self-adaptive games using DT, semantic annotations, rule engine (expert-driven rules) and performance metrics. (Section 3, lines 235-240)
  • This step-by-step approach is itself innovative, as it standardises the process and guarantees adaptability, portability and consistency across domains. (lines 270-287, 920-922)

Integration of DT for Real Time Self-Adaptation:

  • Integration of DTs into SGs, enabling real-time adaptation of gameplay based on user data, which is not common in existing literature (lines 15-17, 55-62, 970-977)
  • The DT works in combination with a rule engine and analytical model, adjusting the game experience dynamically based on players interactions and performance metrics (lines 77-79, 326-334, 454-458)
  • Hybrid approach combines self-adaptation with expert feedback, ensuring both educational relevance and technical innovation (lines 22-24, 275-278, 511-513, 1040-1042).

Standardisation for Portability and Interoperability:

  • We introduce a standardised description of game elements, scenarios, and data using XML and ontologies, ensuring portability across platforms and supporting integration of diverse game elements (lines 13–15, 272-274, 729-732).
  • This internal standardisation is an innovative contribution, enabling consistent development and future extensions (lines 344-349, 1047-1059).

Blending Multiple Technologies:

  • The framework allows for easy adaptation to many application areas, not only to a specific domain (lines 610-613, 624-628).
  • The combination of DTs, rule engines, semantic annotation, and expert-driven adaptation in this framework is unique in the field (lines 202–205, 221-229, 618-621, 1072-1074).
  • All these components work together to deliver real-time self-adaptation and a personalised, data-driven gaming experience (lines 20–21, 392–409, 686-690, 729-732).

Challenges:

  • The main challenge addressed is the lack of standardisation and real-time adaptability in serious game development, as well as the need for domain flexibility and user personalisation (lines 46–54, 211-216, 221-229, 729-732, 1100-1104, 1049-1052).
  • Our approach directly responds to these gaps by providing a phased, structured and extensible framework that supports real-time adaptation, a rule engine (domain experts), and integration of new technologies (lines 156-159, 323-326, 344-346, 624-628, 729-732).

 

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

Acceptance recommended following their major revision.

Author Response

Thank you for your follow-up review and recommendation for acceptance. I appreciate your time and constructive feedback throughout the review process.

Reviewer 2 Report

Comments and Suggestions for Authors

The article explores the value of serious games and has some merit. However, the following issues exist:

  1. Much of the program code and basic code definitions are listed in the main text. To avoid disrupting the main content, they should be moved to the appendix.
  2. The article describes the process of developing the game but fails to demonstrate innovative technologies or ideas in game development.
  3. The game interface is overly simple, making it unclear what it intends to convey.
  4. This type of game development should emphasize technology rather than just ideas.

Comments on the Quality of English Language

none

Author Response

HOW COMMENTS WERE DEALT WITH

We would like to thank the reviewer for his/her valuable comments. We tried to address all of their suggestions either by revising the manuscript (adding, enhancing or modifying text where appropriate) or by clarifying or responding to a comment in case we did not fully agree with it.

Below we provide each comment/suggestion in italics and our answer right below it for each reviewer.

 

Reviewer #2

 

Comment 1:

“Much of the program code and basic code definitions are listed in the main text. To avoid disrupting the main content, they should be moved to the appendix.”

Response 1:

Thank you for your suggestion. I have moved all program code and basic code figures from the main text to the appendix.

 

Comment 2:

“The article describes the process of developing the game but fails to demonstrate innovative technologies or ideas in game development”

Response 2:

Thank you for your observation about the demonstration of innovative technologies and ideas in our game development process.

Taking this into serious consideration we added a new paragraph in the Introduction (Lines 100-105) summarizing and highlighting the contributions of our paper, as follows:

The main contributions of this paper may be summarised to the following: (i) Provides a phased, stepwise approach for developing SG from scratch, which is domain agnostic; (ii) Enables real-time, automatic self-adaptation of SGs via the integration with DT; (iii) Offers internal standardization by introducing a formalised structure for describing game elements, scenarios, and adaptation rules thus supporting consistency and portability.

We would like to highlight that core innovations, challenges and contributions of our work are mentioned and described throughout the text, excerpts of which are provided below for quick reference:

Step by step approach:

  • The paper introduces a phased and structured approach to SG development, offering a clear methodology that guides the creation of self-adaptive games using DT, semantic annotations, rule engine (expert-driven rules) and performance metrics. (Section 3, lines 235-240)
  • This step-by-step approach is itself innovative, as it standardises the process and guarantees adaptability, portability and consistency across domains. (lines 270-287, 920-922)

Integration of DT for Real Time Self-Adaptation:

  • Integration of DTs into SGs, enabling real-time adaptation of gameplay based on user data, which is not common in existing literature (lines 15-17, 55-62, 970-977)
  • The DT works in combination with a rule engine and analytical model, adjusting the game experience dynamically based on players interactions and performance metrics (lines 77-79, 326-334, 454-458)
  • Hybrid approach combines self-adaptation with expert feedback, ensuring both educational relevance and technical innovation (lines 22-24, 275-278, 511-513, 1040-1042).

Standardisation for Portability and Interoperability:

  • We introduce a standardised description of game elements, scenarios, and data using XML and ontologies, ensuring portability across platforms and supporting integration of diverse game elements (lines 13–15, 272-274, 729-732).
  • This internal standardisation is an innovative contribution, enabling consistent development and future extensions (lines 344-349, 1047-1059).

Blending Multiple Technologies:

  • The framework allows for easy adaptation to many application areas, not only to a specific domain (lines 610-613, 624-628).
  • The combination of DTs, rule engines, semantic annotation, and expert-driven adaptation in this framework is unique in the field (lines 202–205, 221-229, 618-621, 1072-1074).
  • All these components work together to deliver real-time self-adaptation and a personalised, data-driven gaming experience (lines 20–21, 392–409, 686-690, 729-732).

Challenges:

  • The main challenge addressed is the lack of standardisation and real-time adaptability in serious game development, as well as the need for domain flexibility and user personalisation (lines 46–54, 211-216, 221-229, 729-732, 1100-1104, 1049-1052).
  • Our approach directly responds to these gaps by providing a phased, structured and extensible framework that supports real-time adaptation, a rule engine (domain experts), and integration of new technologies (lines 156-159, 323-326, 344-346, 624-628, 729-732).

 

Comment 3:

 “The game interface is overly simple, making it unclear what it intends to convey.”

Response 3:

We thank the reviewer for this feedback about the game's interface design. We would like to clarify that the simplicity of the interface was an intentional and critical design choice, driven by the objectives and the specific needs of our target users.

The central contribution of this work is the development of a standardised framework for creating SG with real-time self-adaptation capabilities. The primary focus was on demonstrating the innovative integration of DT, a rule engine, and data-driven performance metrics and analytics to enable dynamic personalisation. The two games were developed as proofs of concept to validate the framework's effectiveness and domain-agnostic nature, rather than to develop complex, elaborate gaming interfaces.

The two games were aimed at inexperienced audiences (children with disabilities and factory workers with some computer knowledge), with the following goals:

  • Therapeutic Game for Children: This game was developed for children with learning difficulties and syndromes that affect their learning abilities. As mentioned in the text, in collaboration with the speech pathologists at the Rehabilitation Clinic, a key requirement was to create an interface that was "simple but also attractive... and at the same time... interactive to motivate children".
  • Industrial Training Game: This game targets newly hired factory engineers or non-expert workers who are being trained to use complex machinery. The goal here was to effectively teach a technical skill, not to provide an entertainment experience.

 

Comment 4:

“This type of game development should emphasize technology rather than just ideas.”

Response 4:

Thank you for this suggestion. We strongly disagree with the referee that our work is limited to “just ideas”, which, essentially, is the end-product (i.e., the actual games). Our manuscript explains in detail the integration and application of advanced technologies. Below, we clarify how the proposed framework is technological rather than merely conceptual, and highlight the specific innovations that drive the development of serious games.

The framework includes real-time adaptation mechanisms, enabling the game environment to dynamically adjust its difficulty, content, and feedback based on player performance and interactions (lines 20-24, 100-105, 235-236, 263-265). Also, the core technological innovation is the use of DTs in the game architecture. The DTs apply adaptation rules to modify the game environment accordingly (see lines 24-26, 71-75, 320-322, 364-369, 454-458). Furthermore, the rule engine is responsible for encompassing expert knowledge and adaptation strategies, which are executed in real time by the DT. The experts' input is key to ensuring operationalisation through the components of the framework (see lines 326-334, 454-458, 511-513, 970-977). Finally, the framework uses semantic annotation and standardized XML structures to formally describe game elements, scenarios, and user data. This ensures portability and the ability to scale or transfer games across different domains and platforms (see lines 13-15, 344-349, 729-732).
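
The adaptation loop described above, in which expert-defined rules are evaluated against performance metrics and then applied to the game environment, can be pictured with a minimal Python sketch. It is purely illustrative and is not the implementation reported in the manuscript: the names `PlayerMetrics`, `GameState`, `Rule` and `adapt`, the 1-to-5 difficulty scale, and the thresholds are all hypothetical assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PlayerMetrics:
    accuracy: float          # fraction of correct answers, 0.0 to 1.0
    completion_time: float   # seconds taken for the last task


@dataclass
class GameState:
    difficulty: int = 1      # hypothetical scale: 1 (easiest) to 5 (hardest)


@dataclass
class Rule:
    name: str
    condition: Callable[[PlayerMetrics], bool]   # when the rule fires
    action: Callable[[GameState], None]          # how the game is adjusted


def raise_difficulty(state: GameState) -> None:
    state.difficulty = min(5, state.difficulty + 1)


def lower_difficulty(state: GameState) -> None:
    state.difficulty = max(1, state.difficulty - 1)


# Illustrative expert rules; the thresholds are assumptions, not values from the paper.
RULES: List[Rule] = [
    Rule("raise", lambda m: m.accuracy >= 0.9 and m.completion_time <= 30.0,
         raise_difficulty),
    Rule("lower", lambda m: m.accuracy < 0.5, lower_difficulty),
]


def adapt(state: GameState, metrics: PlayerMetrics,
          rules: List[Rule] = RULES) -> List[str]:
    """Apply every rule whose condition holds; return the names of fired rules."""
    fired = []
    for rule in rules:
        if rule.condition(metrics):
            rule.action(state)
            fired.append(rule.name)
    return fired
```

In this sketch, calling `adapt` after each completed task would play the role of the DT applying the rule set; keeping rules as data (a name, a condition, an action) is one way experts could edit them without touching the game logic.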

Author Response File: Author Response.pdf

Round 3

Reviewer 2 Report

Comments and Suggestions for Authors

The article proposes a development framework for serious games from the perspectives of portability and comprehensibility, and verifies through integration with therapists' practice and engineer training scenarios that such games have good usability and adaptability. What needs to be supplemented is how these games can help people achieve physical and mental relaxation, and that physiological data such as brain waves can well reflect the actual effect of the games.

Author Response

We would like to thank the reviewer for this suggestion. Although it was not among the targets of this paper to address how serious games can help people achieve physical and mental relaxation, or how physiological data such as ECG and brain waves can reflect the actual effects of these games, we followed this suggestion and revised our manuscript (text and framework architecture) as follows:

  1. We have modified the framework architecture (Page 6, Figure 1) and added a module in Game Experience which collects, processes and analyses biosignals. This module is later described in various parts of section 3 - Methodology and the Proposed Framework.

 

  2. We have added text to the literature review discussing physical and mental relaxation and the assessment of serious games using physiological data:

Section 2 - Technical Background and Related Work

2.1 – Serious Games (Page 4, lines 183-207)

Serious games often incorporate biofeedback and immersive technologies to reduce stress and enhance well-being [9]. Several studies report that serious games incorporating mindfulness, guided breathing, and light cognitive tasks can induce measurable relaxation effects [10]. For example, games designed with calm visual and auditory stimuli help users enter a parasympathetic state, reducing anxiety and muscle tension. Virtual Reality (VR) games like Deep and FlowVR immerse players in peaceful underwater or natural environments, requiring breathing-based controls that promote deep, slow respiration. These have been shown to significantly reduce self-reported stress and anxiety scores in both healthy and clinical populations [11]. In addition, researchers frequently use EEG (brain-waves), ECG (heart rate), and EDA (skin conductance) to evaluate the effectiveness of relaxation-focused serious games [12]. EEG studies show that increased alpha and theta activity, associated with relaxed wakefulness and meditative states, is more prominent during and after game play. For instance, a study using a neurofeedback game found a rise in frontal alpha power, correlating with reduced anxiety levels [9]. Heart rate variability (HRV), a key indicator of autonomic nervous system balance, also improves after gameplay in many studies. Greater HRV indicates a shift toward parasympathetic dominance, suggesting reduced stress. Similarly, decreases in skin conductance reflect reduced arousal [10]. There are also studies in the literature that report the use of physiological data for adjusting game parameters. For example, RECOGNISED is a serious game that uses real-time EEG input to modify in-game environments based on user relaxation levels. Clinical trials showed improvement in sleep quality and reduced stress in participants [12].
Other examples are NeuroRacer and similar games that are tailored for older adults, which also demonstrated improved emotional regulation and stress recovery [10]. Finally, rehabilitation games that use motion sensors and biofeedback help patients manage pain and anxiety during recovery [9].

We included a module in our conceptual architecture which is dedicated to utilizing such data in the framework and discussed how this module may be utilized to offer guidance to the self-adaptation process. Especially in the case of having available physiological data such as ECG and brain waves, we explained how and where this data may be exploited for assessing the effectiveness of the serious game and offering more accurate customization of game scenarios and tasks.

 

  3. We had already briefly described in the therapeutic case-study how physical and mental relaxation of children with syndromes and learning disabilities is taken into consideration during self-adaptation (colors, calm visuals, etc.). This part is now better linked with the new text and the modification performed on the architecture.

 

Section 3 - Methodology and the Proposed Framework

Page 8 (Lines 345-351)

While the current version of the therapeutic game leverages expert input to provide supportive elements, such as calming visuals and music for users with heightened stress, no biosignals or physiological measurements are recorded as part of the adaptation process. However, it is recognized that including real-time biosignals (such as heart rate or skin conductance) in future iterations could provide deeper, objective insights into stress and relaxation responses, further enhancing the game’s ability to tailor interventions to individual needs.

 

Page 9 (Lines 367-370)

Player interactions, behaviors, decisions and performance metrics, such as scores and completion times, potentially enhanced by biosignals of the players recorded during playing sessions (e.g., ECG, heart rate, blood pressure) are examples of game experience data.

 

Page 10 (Lines 447-455)

Also, Game Experience offers the potential to exploit biosignals via a dedicated Biosignal Analysis module. This component is designed to acquire in real-time and process physiological data such as heart rate, skin conductance, or EEG signals, when and if such information is available. Its purpose is to add an additional layer of insight into the players’ state such as stress, relaxation or emotional engagement thus enabling more precise adaptation of the game environment. Although biosignals data are not currently utilised for real-time adaptation in the case studies described later in this paper, the proposed framework structure allows for their integration as an additional input stream, enhancing personalisation and supporting a more holistic evaluation of user experience.
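
The idea of biosignals as an optional additional input stream can be sketched in Python as follows. This is a hypothetical illustration, not part of the manuscript: the function name `needs_calming_content`, the default resting heart rate, and both thresholds are assumptions chosen for the example and carry no clinical meaning.

```python
from typing import Optional


def needs_calming_content(accuracy: float,
                          heart_rate_bpm: Optional[float] = None,
                          resting_bpm: float = 80.0) -> bool:
    """Decide whether calming visuals or music should be shown.

    Performance data is always used. The heart-rate reading is optional:
    when no biosignal is available, the decision falls back to
    performance alone.
    """
    low_performance = accuracy < 0.5           # assumed threshold
    if heart_rate_bpm is None:
        return low_performance
    elevated = heart_rate_bpm > 1.2 * resting_bpm   # assumed 20% margin
    return low_performance or elevated
```

Making the biosignal argument optional mirrors the quoted text: the adaptation still works when no physiological data is recorded, and a reading only refines the decision when one happens to be present.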

 

  4. We added some text in future work to discuss how biosignals in general may be further exploited by the proposed framework to support more fine-grained and accurate game self-adaptation.

 

Section 6 – Conclusion (Page 30, Lines 1204-1210)

Finally, future research steps will focus on enhancing the real-time adaptability capabilities of the proposed framework by incorporating biosignals analysis as an active input to the adaptation engine and assessing its contribution to the automatic adjustment performed. Integrating real-time physiological data such as heart rate variability, or EEG signals will enable more accurate monitoring of user states and support new strategies for dynamic, data-driven gameplay adjustments, particularly in therapeutic and stress-management scenarios.

 

Author Response File: Author Response.docx
