Implementation and Evaluation of a Serious Game for Working Memory Enhancement

Abstract: The amount of information that can be stored in the human brain is limited and depends on memory capacity. Over the last few years there has been a trend toward training cognitive skills, not only to counter the cognitive decline that is inevitable as a person grows older, but also to increase or at least preserve the mental abilities that allow a person to function at a higher cognitive level. Memory is a key cognitive skill with a significant role in a person's mental performance. Specifically, focus is given to Working Memory (WM), as evidence has shown that it can be increased by applying targeted interventions. Such an intervention program is the subject of this paper. Following a Serious Game (SG) approach, we designed and created a video game which targets WM training. Its effectiveness was evaluated in a study in which forty people participated in a seven-week training program. Post-test results showed that participants increased their WM performance, especially those who had lower scores at the pre-test, while those with high pre-test scores preserved their initial status. Additionally, all participants agreed that the game is fun and enjoyable to play and that it helps them to increase WM performance.

Author Contributions: T.T., Conceptualization, Formal Resources, Writing - Original Preparation, Visualization. T.T.: Conceptualization, Methodology, Validation, Writing - Review


Introduction
Over the last few years, brain functionality and cognitive skills have been at the center of interest. Numerous studies have been carried out, and are still under way, to identify best practices and evidence that will support better training and enhancement of the human brain [1,2]. Although the evidence so far is mixed, it is claimed that training Working Memory (WM) can affect a person's overall cognitive skills and intelligence.
As we move towards an era where people have to deal with a plethora of information and process data in sectors such as the daily routine, workplace or even at school, it is crucial to be able to store and manage all this information, using it effectively as needed. Moving from school towards a professional career, someone must maintain strong brain functionality and develop advanced cognitive skills (e.g., problem-solving capability, novel thinking, strategic leadership, emotional intelligence and adaptability). Starting from a young age, children also are required to develop higher mental abilities to be successful and perform at maximum level, so that they can prepare themselves for a professional career afterwards [3][4][5].
Another key challenge related to brain functionality concerns the mental deficits caused by the normal ageing of the brain (excluding dementia, Alzheimer's disease, strokes or any other diseases which cause more severe mental deficits). The older someone gets, the more likely they are to present reduced brain functionality, since the human brain tends to shrink over the years [6,7]. The challenge here is that older people should be able to preserve their mental abilities and prevent cognitive decline in order to be able to function and live on their own. Training and improving cognitive skills is possible according to the theory that the human brain is characterized by "neuroplasticity", the ability to create new neurons and to build or even reform the interconnections between them (called synapses) [8][9][10]. In recent years, psychologists have developed several test batteries which train and stimulate the human brain through specific cognitive exercises, and there is continuous interest and research on how to apply these practices in intervention programs aimed at strengthening the brain, delaying cognitive decline and improving overall quality of life [11][12][13][14][15].
This research describes the design and implementation of a Serious Game (SG) for WM training, its final evaluation regarding its effectiveness as a training tool for WM and its usefulness and enjoyment evaluation as an SG as well, through a study with 40 participants in a 7-week intervention program. The results showed that users with high pre-test scores preserved their performance with no further improvement, but for those who had scored lower in the pre-test, there was a notable increase in their performance. Additionally, all users found the game useful in WM training and enjoyable as well.
The following sections contain the theory behind WM training and SGs, which the game was based on, followed by the description of the study, the tools and instruments used, along with the evaluation process and finally the results.

Working Memory
Cognitive functions are the basic functions taking place in the human brain which allow the storing and processing of new and old data that a person receives from several stimuli. They include skills like knowledge, memory, reasoning and evaluation, as well as concentration, thinking, perception and language/speaking. Performance on these skills differs between people and can be measured by several instruments. Among them, memory is of high importance since it affects most of the other skills. According to [16], attention (concentration), encoding, storing and recalling information are the four levels of memory. Concentration/attention is related to WM while information storage is related to long-term memory. The remaining levels, encoding and recalling information, refer to both kinds of memory.
As Klingberg has stated [17], WM is the cognitive system which is located in the prefrontal cortex of the brain and strongly affects the development of reasoning skills, as it underlies several cognitive abilities such as logical reasoning and problem solving. The prefrontal cortex also hosts the memory and all the functions we need when we have to temporarily hold and/or process specific data in order to execute some action. This part of the brain is responsible for new incoming information which, through proper encoding and processing, leads to the creation of new knowledge. Thus, it is important that it performs as well as possible. People with high-capacity WM are able to apply recent incoming information in new functions, remain focused and concentrated on every procedure and reorganize their thoughts in order to include new information. Furthermore, these people can take better notes and process information with increased accuracy while following complicated and numerous directions and orders.
Self-regulation, flexibility and WM are the three cores of the executive functions of cognitive skills and processes in the human brain [18]. We could say that WM is always alert and ready to be used; it is the active memory which must immediately preserve, recall and/or process information in order to execute any task assigned to it. It is crucial for daily routines, work, education and our entire life in general. WM's capacity affects how the rest of the cognitive skills perform and is also a strong prognostic factor for the majority of them, ranging from language comprehension to performing complex numerical calculations. Therefore, it is important to increase WM as much as possible and to discover new ways to make this happen through external factors. WM's capacity corresponds to the number of items that can be retained in WM. According to Miller, it can hold 7 ± 2 items [19], but Cowan [20] has stated that if we exclude cognitive iteration or information storage in long-term memory (factors that affect WM's capacity), then only four items can be stored temporarily. Obviously, this depends on the kind of items that we have to preserve; for example, it differs for numbers, letters, words, etc. The timespan over which we can retain this information is limited. As Goldstein [21] has concluded, it is between 10 and 15 seconds unless the information is actively applied or repeated, in which case it is stored in long-term memory. To conclude, we focus on a kind of memory which can be further improved, and this improvement can in turn facilitate other cognitive functions in the human brain.

Working Memory Training
WM's capacity is not fixed, but rather depends on and is defined by what one is able to remember. Recent research shows that the capacity of WM can be increased through targeted training. Considering the strong relationship between WM and higher cognitive skills, recent studies have focused on the alterability of this capacity, with results indicating that training WM generates benefits reflected in the cognitive functions of the human brain. This is another key factor behind the ongoing need for WM training. There are two main approaches in WM training, basic and strategic, and there are underlying theoretical and practical incentives which lead to each one of them. Increasing WM's capacity through targeted training has been accepted and utilized in recent years, presenting encouraging and positive results in many different groups of people. However, it should be noted that each approach leads to different outcomes, and we should not expect that following one or the other will positively affect every cognitive function. Nevertheless, there have been studies with encouraging results for a targeted and holistic training approach, which can lead to an overall improvement of cognitive skills. There are still certain boundaries that we have to overcome in order for these studies to be accepted by the scientific community.
Memory can be treated like the human body; just like the human body needs regular and systematic training and exercise, memory should also be trained in a way that will guarantee its top performance. The only aspect of WM that can be improved is its capacity. Unfortunately, regarding the time period that information can be preserved, there is not much that can be done in the form of a training program, apart from more repetitions. These repetitions will eventually result in the storage of information in the long-term memory so that it can remain there for more time. Time is specific, but capacity has several ways of improvement.
Memory enhancement methods include basic and strategic training and mnemonic strategies such as rehearsal (the conscious repetition of information to be remembered) [22], chunking (organizing information into manageable bits or chunks) [23] or building mental images for information encoding [24]. In order to attain memory decongestion, we can utilize management, memorizing techniques, connection, cognitive load reduction, a record of stressful situations and the capability of reducing interferences. Therefore, to enhance WM we can use the same techniques that we exploit for memory enhancement. The simplest strategy, according to Turley-Ames and Whittfield [25], is the repetition of information, which allows us to strengthen WM and is suitable for people with low WM such as, for instance, the elderly. The WM decongestion techniques mentioned above, on the other hand, aim to free space so that WM can function better; it is not all about increasing WM capacity. We can achieve the same results by freeing space for WM.
Of course, besides the abovementioned methods for the enhancement and decongestion of WM, drugs have also been developed which help to improve brain functionality, such as bromocriptine, which affects dopamine receptors [26]. Another method intervenes in the way the neurons of the human brain function, as shown in a recent study by R. Reinhart and J. Nguyen [27], in which they managed to restore the memory of a group of elderly people for a limited time period (50 min). Certain interactions between neurons are part of WM functionality, and with electric stimulation these recovered to a level comparable to that of a 20-year-old.
From these findings, we can conclude that the intervention method for WM enhancement varies from case to case and depends on the group of people we attempt to treat. It can range from a simple game, where the main characteristic is repetition, to electric stimulation of the connections between neurons in order to recover the processing speed of a younger age. In conclusion, specific intervention techniques may work for people with low WM capacity, while other methods may be more effective for people with higher memory capacity [25].

Serious Games for Supporting WM Training
There are several definitions about SGs such as that of Clark back in 1970, who related SGs to board and card games whose purpose was not only fun but education as well [28]. Another definition describes as serious the games that set educational goals with an entertaining background [29]. Additionally, Michael and Chen [30] also defined SG as a game whose primary target is education rather than entertainment. An interesting definition has been given by Sawyer, who stated that the serious characterization of a game does not lie on the content and/or its representation but on the goals that it sets and how they are achieved [31,32]. According to him, personal computers were the medium that transformed an entertaining game into a serious one. Finally, Zyda defined SGs as "a mental contest, played with computer in accordance with specific rules that uses entertainment to further government or corporate training, education, health, public policy, and strategic communication objectives" [33].
All the previous definitions conclude that the primary purpose of SGs is not pure entertainment; rather that they have an explicit educational purpose and intend to provide certain knowledge to the players that are involved in them. Towards this direction, SGs can be utilized in the learning process, as they provide the ability to create memorable and engaging learning experiences. In addition, SGs can be built on sound learning principles encompassing teaching and training approaches that support the design of authentic and situated learning activities in an engaging and immersive way. Moreover, they may develop and reinforce modern skills such as collaboration, problem solving and communication. For that reason, SG applications have been used in several sectors like education, healthcare, defense, engineering, emergency management and more.
Building SGs based on activity-centered pedagogies that enable trainees to engage actively with problems associated with cognitive training, and specifically WM training, is an empowering approach with benefits for learning as well as for developing a wide range of significant high-level intellectual attributes. Under suitable conditions, SGs are probably one of the best ways of building cognitive skills and of making educational methods attractive for the learner. Based on the above, in order to evaluate WM training we experimented with a memory game which uses the repetition of paths in order to potentially improve WM. The game in question is therefore placed among serious ones, since, apart from entertainment, we test whether it achieves its training purpose.
As mentioned, WM allows the transfer of information into long-term memory and temporarily stores new information, creating conditions for recall, when needed, so that this information can be used or permanently stored [34]. In addition, WM facilitates the maintenance of concentration, building visuo-spatial images in order not to cause cognitive load, which overloads the limited memory so that cognitive functions can be performed more easily. All these directly connect to what has been mentioned regarding the parameters which are important when building a successful SG.
WM is a key factor in enabling concentration and preserving data, even temporarily, so that there can be a connection with existing memories or relative encoding, which facilitate cognitive functions. In this way, we realize its significance, which makes it imperative to include it as a point of reference in the design of an SG. Almost all factors that contribute to the successful design of an SG are related directly or indirectly to WM. Therefore, besides the basic design purpose, which is the user's entertainment, WM constitutes another essential aspect which should be taken into consideration.
In order to have an effective SG, its goals should be clear and it should apply every appropriate cognitive/behavioral design principle. Consequently, during design we should take into consideration the target group, its educational level, its motivation and any difficulties in the design per se. The game should provide simple and clear instructions in order to avoid any discomfort in the player which could result in them not playing the game in the desired manner. Of course, we should not forget the game's main axes, which are enjoyment and a fun environment that promotes learning, or any other SG purpose, without this necessarily being noticed.
When building a successful SG, we aim to actually attract trainees and not just have them participate in the process. It should not be another mandatory and boring learning experience but a pleasant and entertaining environment which allows the user to learn and have fun at the same time. To sum up, the most significant requirements are: (a) simple and clear instructions, (b) training with scalable difficulty, (c) motivation, (d) concentration/attention and (e) fun environment.

Rationale for Supporting Working Memory Training with Serious Games
WM is a kind of memory that allows the transfer of information into long-term memory and can temporarily store new information, creating suitable conditions for recall, when needed, so that the information can be used or stored permanently [35]. It is the type of memory which facilitates attention preservation and attempts to create visuo-spatial designs so as not to overload the already limited memory, but to help the brain perform cognitive tasks more easily. Since WM is a key factor in activating attention to preserve information, even temporarily, in order to create connections with other types of memory, or to have the appropriate encoding which will facilitate cognitive function, it is important to include WM in the development of an SG. Almost every factor taken into consideration for the successful design of an SG is strongly connected, either directly or indirectly, to WM. Thus, apart from the user's enjoyment, WM constitutes another factor that should be considered when designing SGs.
Recently, several SGs for cognitive training have been developed which directly target the training of cognitive functions by utilizing mental exercises in games, called "brain training". Games like "Sudoku" and crossword puzzles constitute examples of simple yet effective games which support brain training. Sudoku, for example, relies on short-term memory, since in order to successfully complete a Sudoku puzzle, you have to be able to make predictions and follow trails of consequence. This type of planning helps improve short-term memory and concentration.
A well-known platform is "Lumosity" (http://www.lumosity.com/), which aims to enhance attention, WM and executive functions. Lumosity's brain training and mental fitness games, tests, and activities are backed by science. Other similar options are: The approach chosen in these programs relies on performing short tasks repeatedly and with increasing levels of difficulty. Most brain training games have been developed for healthy elderly people and people with mild cognitive impairments. Evidence for the effectiveness of these brain training games is inconclusive, as the effects of the training generally are not generalized beyond the training itself [35,36].
Cognitive training applications, and specifically WM training systems, are designed to improve the user's WM. However, conventional systems are frequently considered tedious or repetitive, which deeply affects the user's motivation to learn and consequently the potential for learning transfer [37]. According to Prins et al. [38], WM training with game elements significantly improves motivation and training performance.
A game is considered to be "serious" if it offers strong motivation for an objective, provides simple and clear directions, increases the attention and focus of the player, and offers scalable and adaptive difficulty, all supported by a fun environment. The connection of SGs with cognitive training lies in the fact that cognitive training also has specific goals which motivate the subject. Moreover, it requires the subject's increased attention and concentration. There are usually simple instructions and guidance and a scalable difficulty as well, depending on the task being performed. Cognitive training may not be characterized by increased fun, but this is where SGs can be utilized to fill the gap. Thus, SGs are appropriate to support cognitive training by utilizing all aspects of building an attractive training environment.

Research Goals
As mentioned, the aim of this study is to examine the effectiveness of an SG in WM training. In other words, the primary research goal (RG1) is whether WM can be increased by following an intervention program with this particular SG. Since the game was designed and built from scratch, based on SG principles and WM training strategies, the main question is whether this design fulfilled the main goal.
The second important research goal (RG2) is to investigate the usefulness and enjoyment of the game itself, since commitment and consistency in daily sessions were crucial factors. Additionally, the depth of the game, in other words the size of the grid and the maximum path length, needed to be defined in order to adjust the game and avoid both too easy and too difficult levels.

Game Design and Description
The game's main goal is to improve and increase the players' WM capacity by utilizing repetition and concentration. The design of the game was based on findings of previous research regarding WM training and on a specialized psychologist's guidance and directions. In order to improve WM, the game was designed so that the player must remember as many items as possible during a predefined time period. We decided to follow a web-based approach, so that our application can be accessed by anyone who possesses a PC/laptop and an internet connection. In addition, the game's design requires enough screen real estate, so mobile devices are not supported for the time being.
The main gameplay consists of a variable-size grid with tiles on it that represent the steps a player should make in order to complete an entire path. The game generates random paths with start/end points and each path is presented to the player gradually, one tile at a time. The player starts with an initial route length; the route is shown as a sequence of flashing tiles, and the player must then repeat it without deviating from the original path. The scalable difficulty of the game lies in the number of steps each route has: after successful repetitions of a route, the game generates routes with an increased number of steps. For instance, the player can start with a route of seven steps, and after repeating it correctly, the game presents a route of eight steps and so on. Each new tile flashes on the screen for one second and disappears right after. The route resets to the starting point and the player must repeat all the steps, including the new tile that flashed right before.
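The route generation described above can be sketched as a random walk on the grid. This is only an illustrative sketch, not the game's actual implementation: the function name `generate_path`, the restriction to four-directional moves (suggested by the arrow-key controls) and the rule that a route never revisits a tile are all assumptions.

```python
import random

# Four-directional moves, assumed from the arrow-key controls.
MOVES = ((-1, 0), (1, 0), (0, -1), (0, 1))


def generate_path(grid_size, steps, rng=None, max_attempts=1000):
    """Return a route as a list of (row, col) tiles: a start tile plus `steps` moves.

    Assumption: a route never visits the same tile twice.
    """
    rng = rng or random.Random()
    for _ in range(max_attempts):
        start = (rng.randrange(grid_size), rng.randrange(grid_size))
        path = [start]
        visited = {start}
        while len(path) < steps + 1:
            row, col = path[-1]
            # Neighbouring tiles that are on the grid and not yet used.
            options = [
                (row + dr, col + dc)
                for dr, dc in MOVES
                if 0 <= row + dr < grid_size
                and 0 <= col + dc < grid_size
                and (row + dr, col + dc) not in visited
            ]
            if not options:
                break  # dead end: discard this walk and try again
            tile = rng.choice(options)
            path.append(tile)
            visited.add(tile)
        if len(path) == steps + 1:
            return path
    raise RuntimeError("could not generate a route of the requested length")


route = generate_path(grid_size=5, steps=7, rng=random.Random(0))
```

Checking a player's repetition then reduces to comparing the tiles they select, in order, with `route`.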
We tried to keep the game's interface as simple as possible, so that it does not require much effort and can be played by anyone, with or without experience of video games. Following that direction, the game has a single main screen containing the grid and information about the player's performance and current level (difficulty level, current path length, score, etc.). As depicted in Figure 1, the difficulty level is shown in the top right corner, with options like easy, medium, hard, very hard and insane (see section b in the aforementioned figure), which correspond to the grid size. Moving to the bottom (section c in Figure 1), there is a reversed pyramid which fills as the player manages to complete paths of increased size. The next one (section d in Figure 1) is the current path length indicator, which informs the player of the number of steps that each route has. Finally, the last element is the player's score (section e in Figure 1), which is calculated from the grid size, the path length and the number of repetitions for the current path length, also considering any mistakes that the player may have made. Of course, there is a registration screen where the player provides any necessary information in order to be identified by the system, so that they can be tracked and evaluated performance-wise (Figure 1 presents a sample of that screen). As a final addition, we included a leaderboard screen, where players can check their ranking among all players.
When the player logs into the game, the grid size reflects her/his current performance, which translates to the number of items that can be stored in her/his WM. When the player logs in for the first time, the grid size reflects the player's performance on the pre-test, which will be discussed later on. For example, if a player scored a Corsi block span of seven, they would start in a 7 × 7 grid. However, if a player scored six in the pre-test, they would start in a 5 × 5 grid, as there is no option for an even-number grid size and it would be more suitable for them to start in a lower grid size. On every subsequent login, the grid size is determined by the player's last session performance, so that they can resume and continue from the last session. This ensures progress in the players' performance in a way that pushes the user to perform in a higher cognitive state.
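The mapping from pre-test score to starting grid can be written down in a few lines. The sketch below follows the two examples in the text (a span of seven starts at 7 × 7, a span of six at 5 × 5) and uses the grid sizes the game offers (5 × 5 up to 15 × 15). The function name `starting_grid_size` and the clamping of very low spans to the smallest grid are assumptions made for illustration.

```python
# Odd grid sizes available in the game.
GRID_SIZES = (5, 7, 9, 11, 13, 15)


def starting_grid_size(corsi_span):
    """Map a pre-test Corsi block span to the side length of the first grid.

    Even spans are rounded down to the next odd size, as in the text's
    example (span 6 -> 5 x 5). Spans outside the available range are
    clamped to the smallest or largest grid, which is an assumption.
    """
    odd_span = corsi_span if corsi_span % 2 == 1 else corsi_span - 1
    return max(GRID_SIZES[0], min(odd_span, GRID_SIZES[-1]))


for span in (4, 6, 7, 9):
    print(span, "->", starting_grid_size(span))
```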
The difficulty is adjusted by the system itself. Apart from increasing the number of steps, which translates to longer routes and eventually to a larger grid, the difficulty can also be adjusted in the opposite direction: the system can reduce the route length to a level where the player can successfully complete each stage. Although this is rather unlikely to happen, it is a feature that supports the game's adaptivity and keeps the player constantly motivated and challenged to perform better. Specifically, if a player makes two consecutive mistakes on the same route, the route length is reduced by one, and the player has to complete this route length from the beginning. If they are still unable to complete this route length, the game adjusts again and requires a shorter route, and so on.
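The downward adjustment rule can be illustrated with a small state object. This is a hedged sketch rather than the game's code: `RouteState`, `record_mistake`, `record_success` and the lower bound `MIN_ROUTE_LENGTH` are hypothetical names and values introduced here.

```python
from dataclasses import dataclass

MIN_ROUTE_LENGTH = 2  # hypothetical floor; the text does not state one


@dataclass
class RouteState:
    route_length: int
    consecutive_mistakes: int = 0
    completed_repetitions: int = 0


def record_mistake(state):
    """Two consecutive mistakes shorten the route by one and restart that length."""
    state.consecutive_mistakes += 1
    if state.consecutive_mistakes >= 2:
        state.route_length = max(MIN_ROUTE_LENGTH, state.route_length - 1)
        state.consecutive_mistakes = 0
        state.completed_repetitions = 0  # the shorter length starts from the beginning
    return state


def record_success(state):
    """A correct repetition clears the mistake streak."""
    state.consecutive_mistakes = 0
    state.completed_repetitions += 1
    return state


state = RouteState(route_length=8)
record_mistake(state)
record_mistake(state)
print(state.route_length)  # the route is now one step shorter
```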
As mentioned before, starting from a given grid size and route length, the game displays the current route by flashing tiles sequentially on the grid, presenting the route which the player must follow. Once the route has been shown, the player starts from the initial tile and must choose the next tile, and the next after this and so on, in order to form the route successfully. A correct completion leads to a different route of the same size. The player must correctly repeat six different routes of the same size in order to move to the next level of difficulty, where the route size is increased by one. Grid size increases when the player completes all routes, which is essentially the sum of the grid size plus five. Of course, each grid size offers routes of different lengths, but we wanted to be consistent in the number of routes that each player should successfully complete before they get to the next grid size. Lastly, the available grid sizes in the game are 5 × 5, 7 × 7, 9 × 9, 11 × 11, 13 × 13 and 15 × 15.
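The upward progression can be sketched in the same spirit. The six-repetition rule and the list of grid sizes come from the text. The function `advance` and its `max_route_length` parameter are hypothetical: because the exact point at which the grid grows is not pinned down here, the threshold is passed in by the caller. The route length carried over to the larger grid is likewise an assumption.

```python
GRID_SIZES = (5, 7, 9, 11, 13, 15)
REPETITIONS_PER_LENGTH = 6  # six correct routes per route length, as in the text


def advance(grid_size, route_length, repetitions, max_route_length):
    """Register one correct route and return (grid_size, route_length, repetitions).

    `max_route_length` is a hypothetical, caller-supplied threshold: the
    longest route played on the current grid before moving to the next one.
    """
    repetitions += 1
    if repetitions < REPETITIONS_PER_LENGTH:
        return grid_size, route_length, repetitions
    # Six routes of this length are done.
    if route_length < max_route_length:
        return grid_size, route_length + 1, 0
    index = GRID_SIZES.index(grid_size)
    if index + 1 < len(GRID_SIZES):
        # Assumption: routes keep growing by one step on the larger grid.
        return GRID_SIZES[index + 1], route_length + 1, 0
    return grid_size, route_length, 0  # largest grid: stay at the top level


print(advance(5, 7, 5, max_route_length=10))
```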
The player can choose each tile by either using the keyboard's arrows or by using the mouse and clicking on a representation of the keyboard's arrows, located in the bottom left corner of the game's screen. This feature was added in order to satisfy some users who reported that this would be a nice feature to have while playing.
Each session lasts only seven minutes, based on the psychologist's suggestions and directions. The main screen also has a countdown timer (section a in Figure 1) so that the player can monitor the time left in the session. After the session ends, the user's performance is automatically stored in the database, along with other significant game metrics. We should note that no feedback is given to the player when the game ends, since it would not offer any useful information or essential metrics. The player can see their current performance in the score, grid size and route length, and can also see their previous performance in order to compare the last two sessions.

Evaluation Plan-Methodology
As described before, the current research has two research goals (RG1 and RG2) concerning its impact on WM and the gameplay. The evaluation plans and the respective methodology followed are described below.

Working Memory Evaluation Method
The main goal of this study is to evaluate any potential effect on the players' WM capacity. To do that, we utilized a free online assessment tool called the Corsi test or "Corsi block tapping test", which is used to measure WM's storage capacity [39]. In this test, there are nine blocks which are flashed once, one after the other, so that they form a sequence of blocks which the subject must repeat in the same order. The sequence starts from two blocks, and in case of a correct answer, the test proceeds to the next higher number of blocks. In case of a wrong sequence, there is another chance, and if the subject still fails, the test ends and the score is the Corsi block span. The highest possible block span is nine, although there might be people who can do better; however, this is quite rare. In the study of Kessels et al., healthy adults had an average block span of 6.2 blocks (S.D. = 1.3), which means that they are most likely to have a block span of between about five and seven blocks [40]. Figure 2 shows a sample of the main screen of the Corsi span test.
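The stopping rule of the test can be sketched in a few lines of Python. This is a hedged illustration of the procedure as described above, not the code of the online tool that was used: the `respond` callback is a hypothetical stand-in for the participant, and scoring the span as the longest sequence length reproduced correctly is an assumption.

```python
import random


def corsi_span(respond, n_blocks=9, start_length=2, attempts=2, rng=None):
    """Run the span procedure and return the longest length reproduced correctly.

    `respond` receives the presented sequence (a list of block indices) and
    returns the sequence the participant tapped.
    """
    rng = rng or random.Random()
    span = 0  # ASSUMPTION: the span is 0 if even the shortest sequence fails
    for length in range(start_length, n_blocks + 1):
        for _ in range(attempts):
            # Each block is flashed at most once within a sequence.
            sequence = rng.sample(range(n_blocks), length)
            if list(respond(sequence)) == sequence:
                span = length
                break  # correct answer: proceed to the next length
        else:
            return span  # every attempt at this length failed: the test ends
    return span
```

For example, a simulated participant who can reproduce at most five blocks, `lambda seq: seq if len(seq) <= 5 else []`, obtains a span of five.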
In this study, participants were tested in the Corsi test before and after the end of the training program. Their initial scores were used in the registration form in order to define the minimum grid size which would be the starting point of the game for that player. By the end of the intervention, participants were tested again and their score was reported and stored in the database. Both tests took place in a controlled environment and participants were prompted to perform as well as possible.

Game Usefulness and Enjoyment Evaluation Method
The study's secondary goal was to examine the game's usability and the players' enjoyment. For this purpose, we utilized a questionnaire of 93 questions which participants had to complete at the end of the training program. The last five questions concerned personal information and the remaining three questions were about their gaming experience.
Regarding the evaluation of the game's usability, participants were asked to complete an online questionnaire after the end of the intervention program. The questionnaire was based on the "USE" questionnaire as defined by Lund [41]. It is a short questionnaire designed to effectively measure the most significant aspects of the usability of a product.
The "USE" questionnaire consists of 32 questions that are categorized into four key dimensions: (a) Usefulness, (b) Ease of use, (c) Ease of Learning and (d) Satisfaction. Participants' answers include a 7-point Likert rating scale with the following available options: 1-strongly disagree, 2-disagree, 3-slightly disagree, 4-neutral, 5-slightly agree, 6-agree and 7-strongly agree.
On the other hand, in an attempt to explore any potential enjoyment that the game offers to players, participants were asked to complete a second online questionnaire as well, the EGameFlow questionnaire [42], which includes 56 questions grouped into eight dimensions: (a) Concentration, (b) Clear Goal, (c) Feedback, (d) Challenge, (e) Autonomy, (f) Immersion, (g) Social Interaction and (h) Knowledge Improvement. For the answers to these questions a 7-point Likert rating scale was used to represent the lowest and the highest degree.
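Responses on Likert scales such as these are commonly summarised as one mean value per dimension. The Python sketch below shows one way to do so; the item identifiers, the data layout and the function name `dimension_means` are hypothetical, and the ratings in the usage note are illustrative, not the study's data.

```python
from statistics import mean


def dimension_means(responses, dimensions, scale=(1, 7)):
    """Mean rating per dimension across all participants.

    responses:  one dict per participant, mapping item id -> rating
    dimensions: dict mapping dimension name -> list of item ids
    """
    low, high = scale
    result = {}
    for name, items in dimensions.items():
        ratings = [participant[item] for participant in responses for item in items]
        if any(not low <= r <= high for r in ratings):
            raise ValueError(f"rating outside the {low}-{high} scale in {name!r}")
        # With no missing answers this equals the mean of the per-item means.
        result[name] = mean(ratings)
    return result
```

With two participants who rate two hypothetical "Ease of Use" items as 7 and 5, and 6 and 6, the function returns a dimension mean of 6.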

Participants and Training Program
For the evaluation process, forty healthy adults participated in a seven-week training program. More specifically, the sample consisted of twenty-three men (57.5%) and seventeen women (42.5%) aged between twenty-one and forty-two years (48.8% aged below twenty-five and 51.2% aged over twenty-five; mean age 29.9 years, S.D. 8.23 years). Only three of them had a basic education degree, while the rest had a minimum of a BSc degree. Finally, 70% of them were core and casual gamers, playing video games two to four times per week, while the rest did not play video games at all.
As for the training program per se, following research and the psychologist's proposition, the overall program lasted seven weeks and participants had to play the game daily, for seven minutes only. They could skip only two sessions in the whole program. Additionally, they were strongly advised to play only when they were not mentally tired, during the early hours of each day. Finally, consecutive sessions should be separated by a twenty-four-hour rest, so participants were expected to play at a stable time each day.

Results
The evaluation results are presented below, starting from the WM test results and then moving to the USE and EGameFlow questionnaire results.

Working Memory Capacity
As mentioned before, in order to evaluate WM capacity, participants were asked to take the Corsi test, which measures WM capacity in number of items. In order to examine any differences between pre- and post-results, we performed a Paired-Samples t-Test in SPSS. The results (Figures 3-5) showed that there is a statistically significant difference (p < 0.05) between pre- and post-performance, indicating an increase in WM capacity after the training program. It can be seen that before the intervention, the sample's mean value was around 5.5 items, and at the end of the program it had increased by about one item, to a mean value of 6.6 items.
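For readers without SPSS, the statistic behind a paired-samples t-test can be computed by hand as the mean of the pre/post differences divided by its standard error. The sketch below uses only the Python standard library; the function name `paired_t` is hypothetical and the scores are illustrative values, not the study's data.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(pre, post):
    """Return (t, degrees of freedom) for paired pre/post scores."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need two equally long samples with at least two pairs")
    diffs = [after - before for before, after in zip(pre, post)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean(diffs) / (sd / sqrt(len(diffs)))
    return t, len(diffs) - 1


# Illustrative Corsi spans only, NOT the study's data.
pre = [5, 5, 6, 4, 6, 7, 5, 6]
post = [6, 6, 7, 5, 7, 7, 6, 7]
t, df = paired_t(pre, post)
```

For these illustrative scores, t is approximately 7.0 with 7 degrees of freedom. The p-value is then read from the t distribution with that many degrees of freedom, which is the step that statistical packages such as SPSS perform automatically.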

Usefulness
For this statistical analysis, we calculated descriptive measures of central tendency, the most significant of which is mean. For each Likert-type item of the four dimensions of the USE questionnaire (usefulness, ease of use, ease of learning and satisfaction), we calculated the mean values [41]. Figure 6 below presents the mean values of the dimensions.

As can be seen, all four dimensions are above average (M = 3.5). The highest mean values are those of "Ease of Use" (M = 6.12) and "Ease of Learning" (M = 6.47), which translates into a game that is easy for players to use in terms of controls and gameplay, and also easy for them to learn how to operate and navigate the game's environment. The results for "Usefulness" (M = 4.32) and "Satisfaction" (M = 4.36) are just above average, but still indicate the participants' positive opinion. This more moderate result is to some extent expected, since participants cannot fully appreciate the usefulness of such games when they cannot see any immediate effects on their memory performance. However, the positive thing is that they recognize the overall usefulness of intervention programs in WM enhancement. Overall, we can conclude that the players' opinions were positive regarding every dimension, indicating that the game can be characterized as useful in WM training.

Enjoyment
The same approach was followed for the EGameFlow questionnaire [42]: for each Likert-type item of the eight dimensions (concentration, clear goal, feedback, challenge, autonomy, immersion, social interaction and knowledge improvement), we calculated descriptive measures of central tendency, of which the mean value was the most significant one. The mean values of the eight dimensions are presented in Figure 7.
As indicated by the results of the analysis, the players' opinions were positive. The mean value was 4.96 for the "Concentration" dimension, 5.91 for "Clear Goal", 4.65 for "Feedback", 4.25 for "Challenge" and 4.06 for "Autonomy". The lowest mean values were recorded for social interaction and immersion (3.7 and 3.07, respectively), which was expected, since this kind of game does not include interaction between users and immersion is low by design. Furthermore, players believe that the game improves their knowledge, since the mean value for "Knowledge Improvement" is 4.32.

The users' answers to the open-ended questions revealed positive and negative issues:
• Positive: easy to use; simple; does not require any previous knowledge; helps in WM improvement
• Negative: slow pace/progress; too many steps before changing level; difficulty should be more adjustable

Overall, most participants reported that they were motivated by WM improvement and were willing to pursue that goal. They also took the whole process seriously. The game was easy for them to use and they familiarized themselves with the environment immediately. Yet, because most of them already had a good level of WM, they often found the progress of the game very slow, and so they started feeling bored until they moved to the next level, which was more challenging.

Discussion
Starting with the results regarding WM improvement, the majority of the participants presented an increase in their WM capacity, as their performance in the Corsi test increased by at least one block, while there were also cases who managed to score two more blocks compared to their pre-test results. These are encouraging findings, as they indicate that the proposed SG fulfills its main purpose, which is to enhance WM capacity. A finding worth mentioning concerns those cases who had a high score during the pre-test evaluation. In particular, participants who achieved a score of seven blocks in the Corsi test did not manage to increase their WM capacity, except for a single case who managed to reach a score of eight blocks. The remaining cases obtained the same score in the post-test as in the pre-test, but none of them presented lower scores. There are two main reasons that may explain this result. One reason, and the most dominant one, would be that those cases had reached the maximum capacity of their WM, so it would be hard for them to present any further improvement. The other reason concerns the participants' condition during the post-evaluation process: their mental status at the time of the test (possible tiredness during the day) or even a lack of motivation, since the test itself offered them no added value, such as a reward, and so they were not motivated to perform as highly as possible. In either case, these factors can be eliminated if we consider a more controlled environment during pre- and post-evaluation and if we add motivation elements after the training program in order to ensure that participants perform as highly as possible, resulting in more solid results.
In addition to the latter, another element that can be added in the training process is an electroencephalogram sensor device, along with a Brain Computer Interface. This will allow us to have an intervention program with an electroencephalogram-based brain activity observation process. By monitoring the level of concentration and attention of the players, one could constantly check their mental status and how much effort they put into the game. The utilization of electroencephalogram sensors has gained ground over recent years, and the plethora of devices in combination with their low cost provide us with many available options, as Soufineyestani, Dowling and Khan (2020) stated. They concluded that by utilizing real-time electroencephalogram signals, we can have immediate information about brain-wave activities that is useful in cognitive impairment and memory deficits [43]. Furthermore, in the study of Katona and Kővári in 2018, a similar system was used in order to observe the level of attention of students and to evaluate the output with learning efficiency tests applied in cognitive neuroscience. They concluded that with the help of a brain-computer interface test, the attention level could be measured in real time and this would be an influential factor in the learning process, as we could have continuous feedback about the level of attention of students and adjust the teaching plan accordingly [44]. These promising findings seem to be the next step in brain training programs as the monitoring and evaluation of the subject's brain processes will allow adjustable and personalized interventions, which will keep the subject in the highest performance level possible.
Furthermore, we acknowledge that the lack of a control group might not give a clear picture of the intervention program itself. However, since the design was based on already tested and applied strategies regarding SGs and cognitive training, our main interest was in testing the game per se and examining whether it is attractive to users and effective for memory training. This was an experimental application of the game, intended primarily to test it as an effective SG, so that we can continue our research and expand this design by incorporating different input methods and modalities.
Another noticeable limitation is the lack of coaching by either a domain expert in memory training or a virtual assistant that would guide players and provide them with appropriate feedback. We wanted to test the game as an autonomous SG that would be easy to use and would not require extra effort. Needless to say, it is expected that coaching would produce better results, pushing the players to perform better and always remain focused. In this direction, a reward system would also help players stay motivated, as it would challenge them to perform better. Finally, a competitive environment would also raise motivation and challenge, as players would compete with each other by comparing scores and related metrics. Although we have included a leaderboard in the game and players liked the idea, they neither used it extensively nor found it useful. Thus, we need to introduce further competitive characteristics to keep players motivated and the game challenging.
The current evaluation process required a very strict plan regarding the daily interventions. Participants had to commit to the intervention program and follow it for 7 weeks with only two skips. These were the psychologist's instructions, who also defined the session period as 7 min only. This makes the whole process cumbersome and alternatives should be examined in order to have a more flexible training program. In addition, it was expected that some players would lose interest or get tired during the process, so their performance would not be as high as at the beginning. It would be good to examine either a shorter intervention period or ways of keeping motivation high.
We need to underline that although the Corsi block test is considered a valid test to evaluate WM, the outcome for each participant may not be accurate, since all participants first familiarized themselves with it and then used it normally in order to be evaluated. The conditions and the status of participants during the test are not to be considered controlled since they performed the test on their personal PCs and under various conditions. In any case, they were prompted to use the test during the early hours of the day in order to avoid tiredness and fatigue that would normally ensue during the day.
Finally, as happens with most similar studies, we were not able to evaluate WM performance in follow-up assessments. Although it is considered that the effects of WM interventions are short-lived, it would be useful and meaningful to examine how long these effects last in different people.

Conclusions
This study concerned the design and implementation of an SG for WM training. The design was based on WM training principles and guidance by a specialized psychologist. The implementation was based on general guidelines for user interfaces, especially in video games. The "seriousness" in this specific game was to improve the players' WM. Regarding this goal, the results are encouraging, since most of the participants managed to increase their memory span by 1 item. This finding might not look so impressive, but it is still encouraging and indicates that SGs can be utilized in that direction. Even though the intervention program that was followed involved strict rules and looked demanding for participants, in the end it seemed that they found it useful and easy to attend. Additionally, this is not the only possible implementation of similar games; since it is simple and easy to implement, it could be the base for future designs.
Regarding the game per se, usability and enjoyment metrics look quite promising since the participants agreed that the game is easy to use and play, without requiring much effort in terms of technological background. They realized its usefulness as an SG and believed that it helped them to increase their memory capacity. Furthermore, they found the game enjoyable to play and spent hours on it, as enjoyment analysis indicates positive results. Of course, social interaction and immersion are below average, but the game's purpose and nature concern self-improvement. In any case, it is difficult to utilize multiplayer games or user collaboration in cognitive skills enhancement. Moreover, low immersion is justified by the game's nature, since in order to achieve simplicity and usability, we could not build a complicated environment with sophisticated graphics or technology.
Overall, the evaluation results indicate that the proposed game fits well in the WM training domain and fulfills its purpose. It could be utilized in this direction and offer certain benefits to many people. It goes without saying that there are many improvements that can be applied and can lead to a better game in general, and we as authors realize that. On that ground, we are about to start working on the feedback that the participants gave.
Moreover, the grid size compelled us to design the game for larger screens because it would be difficult to fit grids larger than 9 × 9 on the screens of mobile devices. On the other hand, many participants reported that it would have been much easier for them to follow the intervention program if they had the choice of a game supported by mobile devices.