Developing and Exploring an Evaluation Tool for Educational Apps (E.T.E.A.) Targeting Kindergarten Children

During the last decade, there has been an explosive increase in the number of mobile apps that are called educational and target children aged three to six. Research has shown that only a few of them have been created taking into consideration young children's development and learning processes. The key question that emerges is how parents, custodians, or teachers can choose appropriate, high-quality educational apps. The literature offers only a limited number of assessment tools based on advanced statistical procedures that allow validity and reliability issues to be addressed. This study investigates the dimensions of using and operating educational apps for kids and presents a thirteen-item assessment instrument along with its psychometric properties. Data (N = 218) were collected via an electronic questionnaire from pre-service teachers of preschool education. Principal Component Analysis (PCA) with Varimax rotation was used to investigate the underlying dimensions. The resulting structure included four factors, namely: Usability, Efficiency, Parental Control, and Security. PCA supported the factorial validity of the instrument, while the Cronbach's alpha reliability measures for the four dimensions were satisfactory. Finally, a lucid discussion of the findings is provided.


Introduction
Although interactive touchscreen technology made its public appearance in 2007 with the introduction of the iPhone from Apple, one of the first mainstream multimedia devices, it was the release of the first tablet-type device (Apple iPad) on April 3, 2010 that so profoundly changed the ecosystem of smart mobile devices. In terms of sales, the first iPad sold more than 300,000 units in its first 24 hours, and almost 15 million units in nine months. For the first time, the general public had access to a high-quality device with a crisper screen, eliminating the ergonomic problems that users encountered with the smaller screens of mobile phones and other devices such as Personal Digital Assistants (PDAs) [1].
The popularity of these devices is due, in part, to the fact that they were soon seen by the general public as an alternative to the 'traditional' graphical user interface (GUI) of desktops and laptops. At the same time, researchers have described the app market aimed at young children as a 'Digital Wild West', suggesting that parents should be wary of developers' educational claims [13].
For parents and educators, choosing an appropriate educational application is a great challenge [14]. The issue of what constitutes an educational app is strikingly complex, since it implies the consideration of various scientific aspects. Thus, it is sometimes easier to identify what constitutes a lack of quality. For instance, Martens, Rinnert & Andersen [15] report that the presence of ads, including pop-ups and pop-unders, as well as poor or inadequate design and non-functional elements, is disruptive to the educational process, while privacy violations further diminish the value of an app.
Kucirkova, Messer, Sheehy & Panadero [16] state that researchers who aim to propose a conceptual framework for mobile learning applications face many of the same challenges as those researching educational software used on desktop computers. To highlight this, Hirsh-Pasek and her colleagues describe the current app market as the 'first wave of application development', in which already-existing non-digital material is being converted into a digital format [17]. Indeed, most apps are found to be reproductions of their print-based counterparts: simple, enjoyable activities offering only passive learning experiences, even though apps with educational value should focus primarily on promoting learning, not just on being entertaining [16,18].
Shuler, Levine & Ree [10] analyzed the best children's educational apps by evaluating the top 100 educational apps for each of the iPad and iPhone devices (200 apps in total). They found that more than 80% of top-selling paid apps in the Education category target children, 72% of which are designed for preschool-aged children. The study also revealed that developers' target audience was primarily parents seeking to cultivate a creative environment at home for their children. For anyone who is not a mobile educational technology expert, finding high-quality and appropriate educational apps requires a great deal of time, effort, and luck, because the procedure is hampered not only by the sheer volume of apps available in the stores and the inconvenient user interfaces of the digital stores, but also by factors such as missing descriptions, misleading scoring systems, subjective user comments, and ineffective search algorithms [19]. Martens et al. [15] noted that a simple search in the Apple App Store using the terms 'A, B, C' or 'Alphabet' returned approximately 279 to 286 results. Indeed, the world's two major smart device app stores do not provide users with a user-friendly interface in which navigation is easy and reliable. Moreover, the information provided about the principles followed and the methodology used by the development team is often insufficient for successful decision-making. Although one might argue that information about apps is available in the digital stores, this information cannot be used as a general criterion for evaluating educational value. In fact, this content often comes from the app's creator and therefore cannot be considered accurate or reliable [20].
In addition, there are very few tools for evaluating applications. Although assessment tools in the form of rubrics and checklists have been developed by university researchers, parents and teachers either ignore their existence or find it difficult to use them and interpret the results [15]. Researchers such as Hirsh-Pasek et al. and Kucirkova [17,21] also emphasize the fierce competition in the app market. Kucirkova [21] states that developing an application is a costly endeavor, with an average cost ranging from 10,000 to 70,000 USD. At the same time, the average price is about 3 USD, while most Android and Apple apps are available for free download.
Given that the app market is highly competitive, with dozens of new products introduced every week, commercial success is not just a result of quality; it is also a matter of luck. In fact, success relates closely to the number of users who have chosen a given app from a plethora of similar products. Therefore, rapid growth in production and sales is a survival bet for most developers. That may explain why many children's apps offer the same content with a slightly modified design [17], resulting in a lack of effectiveness in academic terms when choosing among the most popular educational apps [17,21]. Indeed, popularity as measured by user reviews, star ratings, or the number of installations is often misleading for parents and teachers, who make a choice based solely on these subjective and therefore unreliable criteria [22].

Are There Tools to Help the General Population to Choose Appropriate Apps?
The low quality of the majority of educational apps targeting preschool-aged children highlights the need for a tool to help parents and educators to evaluate the self-proclaimed educational apps for their real value. In 2019, using the validated PRISMA methodology (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) [23], various databases and digital repositories were searched for studies that met the following inclusion criteria: (1) Field study in smart mobile devices, (2) describes an educational app assessment tool, (3) reports on educational apps for children, and (4) is scientifically sound.
The results collected during the review reinforced the importance of evaluation tools for educational apps. The study found 11 articles describing two different assessment approaches. Six studies present a rubric and five studies present a checklist. Additionally, the study also identified seven nonscientific-based tools. Four web sources present a rubric and three sources present a checklist. According to the researcher, the term 'nonscientific-based tools' refers to freely available online tools for the evaluation of educational apps that present their quality criteria without a focus on research methodology and scientific evidence.
Of all 18 tools, the researcher classified only three [14,24,25] as both comprehensive and scientific-based instruments. The other 15 were characterized as 'very poor' in terms of their evaluation power. In general, they were not found to be comprehensive, as they omit aspects that are considered important when evaluating the educational value of an app. For instance, many tools do not take into consideration the presence of pop-up advertisements, a critical characteristic, as researchers claim that their existence distracts users from the learning process [26].
In fact, even though the three tools were considered appropriate for app evaluation, they were judged ineffective in terms of the effort, experience, and time their use demands of parents and educators. For instance, the rubric in [24] contains eight pages and 24 items, while the rubric in [14] contains four pages and 18 items. The checklist provided by Lee & Kim [25] is two pages long and contains 33 items. These tools require a considerable amount of time and experience from their users, and it is doubtful that a parent or teacher would be willing to spend so much time and effort just to evaluate an app. Researchers must, therefore, find a balance between the length, the evaluation power, and the ease of use of a proposed tool.
In conclusion, the digital market is full of apps that are promoted as educational but have little or no pedagogical value, because they are often made with limited input from educators or developmental specialists. Moreover, the majority of the tools presented in the relevant literature are not available to the parents, caregivers, and educators of young children, as they are stored in copyrighted digital repositories and databases. Even if they were available, several questions would arise concerning their appropriateness, the time needed to complete an evaluation, and so on. On the other hand, the freely available tools are outdated and inadequate in terms of their depth and scientific grounding. This literature review pointed out the lack of reliable and easy-to-use evaluation tools and highlighted the need for a new, improved one to help anyone interested in choosing apps with genuine educational value. Such a tool must be easy to use, reliable, short enough, and able to serve as more than a general guideline.

The Instrument
An initial battery of items was created based on relevant published rubrics, checklists, and questionnaires attempting to evaluate apps. Following the PRISMA method (Preferred Reporting Items for Systematic Reviews and Meta-Analyses), we identified 11 articles describing two different approaches concerning the evaluation of educational apps: six studies presented a rubric [27][28][29][30][31] and five presented a checklist [32][33][34][35]. We also conducted internet-based searches for gray literature and identified seven tools, four of which presented a rubric [36][37][38][39] and three a checklist [40][41][42]. In addition, we found papers that did not include a rubric or a checklist but were considered particularly valuable; as they were in the form of guidelines, they were also included in the present study [17][43][44][45].
Based on theoretical elaborations, certain properties of educational apps comprise the dimensions on which an evaluation can be established. Thus, the underlying factorial structure of a set of items that operationalizes such properties has to be detectable and explicitly expressed via a factor analysis procedure. To achieve this, an Exploratory Factor Analysis (EFA) was applied to a number of items, which were anticipated to conform to a four-factor structure. The four factors were named: Usability, Efficiency, Parental Control, and Security. Note that the EFA procedure started with a larger number of items, which were excluded from the final solution when they did not conform to the meaningful structure. Though the instrument targets parents, custodians, or teachers, we consider that pre-service teachers of preschool education, possessing the proper cognitive and affective assets, are a suitable sample for exploring and establishing the psychometric properties of the proposed questionnaire. The instrument was named the E.T.E.A. (Evaluation Tool for Educational Apps) and is presented in Appendix A.

Participants
The participants (N = 218) were students at the University of Crete, Greece, studying at the Department of Preschool Education. They were first- and second-year students taking a relevant course in educational technologies during the winter semester of 2019-2020, and they were relatively familiar with educational apps. First, they downloaded to smart portable devices three educational apps for preschool-aged children, which were in the Greek language and were chosen to be of similar content but of varying quality. The sample consisted only of female students, so it was not possible to investigate any effects of gender.
The study was approved by the University of Crete Institutional Review Board and followed the university guidelines on ethical practice when researching with adults. Anonymity and confidentiality were guaranteed, and informed consent, as well as the right to withdraw, were the ethical principles adhered to. Procedures were followed to ensure the voluntary nature of being involved in the study.

Exploratory Factor Analysis (EFA)
In the EFA procedure, Principal Components Analysis (PCA) with Varimax rotation was implemented.
Initially, a rescaled transformation (Categorical Principal Components Analysis, CATPCA procedure) [46] was applied to the ordinal data so that the resulting scale was suitable for PCA. Bartlett's test of sphericity (χ² = 1755.95, p < 0.001) and the Kaiser-Meyer-Olkin index (0.79) indicated adequate variance for factor analysis. The number of factors was decided using the Kaiser criterion (eigenvalue greater than 1), while only items with loadings greater than 0.60 were kept. The four factors explained 79.39% of the total variance. The rotated factor structure is shown in Table 2. Reliability analysis of the four dimensions showed that the Cronbach's coefficient values for Usability, Efficiency, Parental Control, and Security were 0.91, 0.83, 0.96, and 0.73, respectively. Thus, the internal consistency is adequate, and the reliability of the measurements can be considered satisfactory. Note also that, besides PCA, the principal axis factoring method was applied, which resulted in the same factor structure. The overall factor and reliability analyses verify the factorial validity of the E.T.E.A. instrument. Table 2 includes the thirteen items allocated to the factors.

In practice, the four factors are correlated. Table 3 shows the correlation matrix for the four factors. Usability and Efficiency are positively correlated (r = 0.59, p = 0.01). This is a reasonable finding, since an easy-to-use app can become efficient in children's hands. Efficiency is correlated with Parental Control (r = 0.212, p = 0.01) and with Security (r = 0.252, p = 0.01). The fact that Parental Control is not correlated with Security indicates that apps may provide parental control and security options independently of each other. It is important to note here that the four dimensions validated in this endeavor represent essential properties of educational apps that a kid and a parent anticipate.
Usability is a primary aspect to consider for facilitating a kid's involvement and efficient use of all encoded activities, so that children can enjoy and learn through them. Parents, on the other hand, care about security aspects and reasonably demand information about what their children do.
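For readers who wish to replicate the core of this analysis, the pipeline described above (correlation-matrix PCA with the Kaiser criterion, Varimax rotation, and Cronbach's alpha) can be sketched in a few dozen lines. This is a minimal illustration built from the textbook formulas and run on simulated data; it is not the CATPCA/SPSS procedure actually used in the study, and all function and variable names are ours.

```python
import numpy as np

def kaiser_pca(data):
    """PCA on the correlation matrix; retain components whose
    eigenvalue exceeds 1 (Kaiser criterion). Returns the unrotated
    loadings, the number of retained factors, and the proportion
    of total variance they explain."""
    R = np.corrcoef(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]              # sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    k = int(np.sum(eigvals > 1.0))                 # Kaiser criterion
    loadings = eigvecs[:, :k] * np.sqrt(eigvals[:k])
    explained = eigvals[:k].sum() / len(eigvals)
    return loadings, k, explained

def varimax(loadings, n_iter=100, tol=1e-6):
    """Orthogonal Varimax rotation of an (items x factors) loading
    matrix, via the standard SVD-based iteration."""
    p, k = loadings.shape
    rotation = np.eye(k)
    total = 0.0
    for _ in range(n_iter):
        L = loadings @ rotation
        u, s, vt = np.linalg.svd(
            loadings.T @ (L ** 3 - L @ np.diag((L ** 2).sum(axis=0)) / p))
        rotation = u @ vt
        if s.sum() - total < tol:                  # converged
            break
        total = s.sum()
    return loadings @ rotation

def cronbach_alpha(items):
    """Cronbach's alpha for a (respondents x items) score matrix:
    alpha = k/(k-1) * (1 - sum of item variances / variance of total)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)
```

In the study, items whose rotated loadings fell below 0.60 were dropped; in this sketch, that filter would be applied to the matrix returned by `varimax`, after which `cronbach_alpha` would be computed separately for the items allocated to each factor.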

Limitations
Notwithstanding its importance and striking empirical findings, this study has some limitations. Since it is the first attempt to validate these assessment aspects of apps, the findings should be replicated with wider samples, larger numbers of apps, and varied populations, e.g., parents and/or educators, in order to establish the instrument's validity and its generalized psychometric properties. Moreover, the four-factor structure should not be considered complete; it could be further extended to include additional dimensions dictated by other theoretical premises that we might foster.

Discussion and Conclusions
The present work contributes toward the development of a valid instrument for assessing educational apps for children aged 3-6 years. As mentioned earlier, the EFA refinement started with a larger number of items, which were excluded from the final solution when they did not conform to the meaningful structure. The final proposed dimensions were: Usability, Efficiency, Parental Control, and Security. These dimensions were conceived in an initial theoretical elaboration, taking into consideration the relevant functions of the apps, their purposes, children's expectations, and surely the parents' implicit demands. The resulting four-dimensional structure satisfies the validity and reliability prerequisites for a scientifically developed instrument. It is important, however, to emphasize that this factor structure, even though valid, is not complete, as additional dimensions (e.g., related to learning experiences and outcomes) could be found to exist and be incorporated into the present structure. Although some apps can provide a rich learning experience to enhance young children's knowledge while maintaining a high level of satisfaction in both formal and informal learning environments, the majority of self-proclaimed 'educational' apps simply do not provide significant evidence to substantiate that title [47]. On the other hand, as far as measurement is concerned, learning outcomes are a difficult case, and merely reporting one's personal opinions does not guarantee valid judgment. Assessing learning demands more objective approaches and a significantly more complicated procedure. In contrast, the easy-to-use aspects are more obvious and are straightforwardly evaluated. Indirectly, these could be considered as facilitating learning outcomes. A relatively small number of apps are well designed, easily navigable, and offer innovative approaches that help children learn more effortlessly and effectively.
Most app developers build their apps without an understanding of how children learn and develop [12]. As a result, they do not create digital environments that foster children's engagement in play activities that promote learning across academic and social domains. Instead, they reproduce printed learning tools and materials in a digital format that promotes the acquisition of knowledge or skills through repetitive practice [1,14]. Efficiency is the other mediator towards learning; it is considered essential and is easier to evaluate. Since some apps are more effective than others in facilitating a learning-by-doing approach and improving children's engagement in the learning process, a question that arises is how teachers and parents can recognize them among the others. Teachers and parents have to spend many hours trying to find quality learning apps to support children's learning experiences both at school and at home [48].
So, what can teachers and parents do when looking for 'an educational app'? They need to be able to assess its suitability and determine whether it is appropriate based on the child's age or level of development [1]. Vaala, Ly & Levine [12] claim that it is essential to go beyond the descriptions and comments in the app stores before downloading an app, and also to visit the developers' websites. Additionally, Callaghan [9] suggests that parents must play an active role in their children's learning by playing with apps with their children.
No one can deny that smart mobile devices and their accompanying apps are here to stay, and they have already transformed the way that young children experience and act in the world [49]. Therefore, it is necessary for those involved in the application of technology within the formal and informal learning environments to focus on improving educational outcomes by creating more efficient apps that enhance the cognitive development of young children.
Until that happens, however, it is essential for parents and/or educators to have a reliable, powerful, fast, and easy-to-use tool to guide them in choosing apps with actual educational value for their children and/or students. The tool presented in this paper aims to contribute in this direction. Given the multidimensional underlying structure of this evaluation tool, further exploration and extension of the E.T.E.A. is possible, and is certainly needed to attain a more complete instrument, a valuable asset particularly for parents.
Author Contributions: All co-authors contributed to data collection and/or analysis of project results. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.