Article

Development and Validation of the Computational Thinking Assessment Tool DACT

by Emmanouil Poulakis 1,*, Panagiotis Politis 1,† and Petros Roussos 2
1 Department of Primary Education, University of Thessaly, 38221 Volos, Greece
2 Department of Psychology, National and Kapodistrian University of Athens, 15784 Athens, Greece
* Author to whom correspondence should be addressed.
† Current address: Department of Digital Systems, University of Thessaly, 41500 Larissa, Greece.
Computers 2026, 15(3), 165; https://doi.org/10.3390/computers15030165
Submission received: 30 December 2025 / Revised: 17 February 2026 / Accepted: 21 February 2026 / Published: 4 March 2026
(This article belongs to the Special Issue STEAM Literacy and Computational Thinking in the Digital Era)

Abstract

Although computational thinking (CT) has attracted the interest of researchers and educators for the last 20 years, resulting in new educational approaches and curriculum reforms, more research is still needed, especially in the field of CT assessment. Taking this need into consideration, this article describes the development of a new CT assessment tool. The DACT assessment tool was developed on the basis of the CT literature, taking into consideration six basic CT dimensions. Initially, 90 CT assessment tasks were created, which were examined and refined through a pilot study. The main research consisted of an extensive study (521 students), which resulted in the construction of the DACT tool, consisting of 36 final tasks, through continuous monitoring of Cronbach’s α. DACT is decoupled from programming and does not require a specific programming language, as it uses its own micro-world; it is cross-platform and can be administered online or on paper, supported by an administration protocol. This article also discusses the validation process of the DACT and reports several validation checks, such as face validity, criterion validity and concurrent validity. This work aims to provide a new, useful CT assessment tool to the scientific community.

1. Introduction

Computational thinking (CT) has been discussed world-wide for many years, as many researchers note [1,2,3,4,5,6], and has been the subject of extensive discussion within the National Research Council [7]. CT came into research focus after an article by Wing [8], which discusses the horizontal penetration of computer science into all disciplines. Following this discussion, large communities of teachers and academics, as well as organizations, began to discuss and search for the definition of CT, its development at school age, its assessment, and the inclusion of CT instruction in curricula all over the world [7,9,10,11,12,13,14,15,16,17,18,19]. Researchers soon proposed a categorization of instructional interventions for CT development into those that use computers (computerized) and those that do not (unplugged) [20], with interventions concerning computer programming being the most numerous [21].
Despite the numerous proposals of instructional resources for the development of CT, CT assessment does not seem to keep the same pace. Many open issues have surfaced, such as the age groups that CT assessment targets, dependence on specific programming environments, a focus mainly on programming concepts, non-automated and subjective procedures, and the validation of the proposed assessment tools [22,23,24].

Theoretical Framework

Selby and Woollard [1] refer to five concepts common to almost all attempts to define CT: algorithmic thinking, abstraction, decomposition, evaluation and generalization. Their work is based on a literature review they conducted on the attempts to define CT. They refer to evidence from the literature, examine the consensus of the works included in their sample, and argue about the candidate terms that a definition of CT should include. Finally, they arrive at 12 terms, from which they keep the aforementioned five in their proposed definition: CT “is a cognitive or thought process that reflects the ability to think in abstractions, the ability to think in terms of decomposition, the ability to think algorithmically, the ability to think in terms of evaluation, and the ability to think in generalizations”.
Similar concepts are also reported, such as automation [4], which is connected with algorithmic thinking [16]. Likewise, data representation is connected with abstraction, while data analysis and organization are carried out in terms of logic, thus connecting these concepts with logic in CS Unplugged [19]. To date, there is no commonly accepted CT definition, which led the European Commission [9] to specify a group of commonly accepted CT concepts: abstraction, algorithmic thinking, automation, decomposition, debugging and generalization. Debugging is also a programming phase, and we can argue that it is part of algorithmic thinking, as well as of evaluation and logic. Initiatives with a world-wide impact, such as CS Unplugged or Computing At School (CAS), use specific CT concepts. In CS Unplugged, all activities are nowadays connected with CT, using six basic concepts [25]: algorithmic thinking, abstraction, decomposition, generalization and patterns, evaluation and logic. Bell and Lodi [25] report these six concepts, referring to Selby and Woollard [1], while explaining the addition of logic. CAS also refers to the same six concepts [15], and additionally to techniques and approaches. Teaching London Computing [26] also refers to the CAS approach to CT [15], and finally the Bebras contest aligns with the concepts suggested by Selby and Woollard [27].
Thus, in the context of this paper, we will follow the same theoretical approach, in which CT is a cognitive process or thought process and includes the following six (6) dimensions: algorithmic thinking, abstraction, decomposition, generalization and patterns, evaluation, and logic. Brief examples and definitions of the concepts are provided by Curzon et al. [28].
Throughout our study of existing CT assessment tools, we did not find any tool that autonomously assesses CT while taking the above dimensions into account. As Bocconi et al. [10] point out, the dimensions that make up CT can be grouped into two general categories: (a) CT related to general problem solving and (b) CT related to programming and computation. The first category includes almost all the dimensions (except algorithmic thinking) that make up CT according to our analysis above. Most automated CT assessment tools belong to the second category and deal mainly with programming concepts. However, as research shows [22], the assessment of CT should not be based solely on purely programming concepts or a specific programming environment; it can be made independent of specific programming platforms, allowing for cross-platform assessment tools. We also noted the lack of assessment tools aimed at specific age groups, as well as the lack of assessment tools for older students beyond primary school. The study by Román–González and Pérez–González [29], which examines the cognitive stages of development according to Piaget’s theory, notes that in middle school the stage of formal logical–abstract operations begins, in which abstraction, metacognitive abilities, and systematic problem solving through logic appear in a more methodical way. The same authors also place evaluation at this age stage, while they place generalization at a slightly older age. Román–González and Pérez–González [29] also refer to a new definition of CT using cognitive concepts, specifically (a) decomposition, (b) pattern recognition, (c) abstraction, (d) algorithmic design, (e) automation, (f) evaluation, and (g) generalization. It should be noted that this agrees almost entirely with the approach we follow, with the only differences being the treatment of logic and automation.
Taking these parameters into account, we decided to proceed with the design and construction of a new CT assessment tool that
  • will be based on the six dimensions of CT mentioned above;
  • will be detached from purely programming concepts;
  • will not be based on a specific programming environment;
  • will have a simple abstract setting and forms (not distracting);
  • will represent movement in two dimensions;
  • will be aimed at the 11–14 age group.
The main research objective is to develop an autonomous tool for assessing the CT of students aged 11–14, and more specifically to design a new tool based on specific CT concepts and to develop the tool through the research process.
This article is organized as follows: Section 2 (Material and Methods) provides a thorough analysis of the research design, namely the construction of an initial assessment task pool, the main research for the construction of the assessment tool DACT and its validation process; Section 3 (Results) provides a report on the implementation of the research and its main results; Section 4 (Discussion) and Section 5 (Conclusions) discuss and summarize the main findings and results of this research, combining the produced knowledge with more recent research in the specific field of CT assessment.

2. Materials and Methods

In order to develop this tool, we first constructed original tasks that refer to the basic CT dimensions. Robinson [30] states that at least three (3) questions are needed to measure a characteristic on a scale, recommending four (4) for safety reasons, and we aimed for five (5) tasks for each dimension in the final assessment tool for greater reliability. The number of initial questions is recommended to be at least double the final number [31] for the construction of a scale, and we aimed at three times the desired final number for greater safety. Thus, the design of the tool included 90 tasks, corresponding to 15 per dimension.
After creating the tasks, all of which are original and were developed as part of this research, we initially administered them in a pilot study with a small sample of students to identify any inaccuracies and malfunctions in the research design and tasks. After the necessary corrections, replacements, or modifications of tasks, the final bank of 90 questions was created and used in the main research.
As Boateng et al. [31] point out, constructing a scale requires a large sample, which can be 10 participants per question or a total of 200–300 individuals for factor analysis. In the main study, bearing in mind a target of 30 final tasks out of the initial 90, we ended up with a sample of more than 500 students.
The methodology we followed centered on monitoring the reliability index (Cronbach’s α) of the entire questionnaire. Using the responses of all participants in SPSS, we ran reliability analyses (reliability statistics), each time requesting the recalculated value of the index if a given task were deleted, removing one task at a time and running the same test again. As stopping criteria, we set the point at which we were left with at least five questions in each dimension, or the point beyond which we judged that the planned tool would no longer be reliable (Cronbach’s α < 0.80).
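For reference, the internal consistency index monitored throughout this procedure is the standard Cronbach’s α; for a questionnaire of k scored tasks it can be written as follows:

```latex
% Cronbach's alpha for a scale of k items:
%   sigma^2_{Y_i} is the variance of item i across participants,
%   sigma^2_X     is the variance of the total score X = Y_1 + ... + Y_k.
\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma^{2}_{Y_i}}{\sigma^{2}_{X}}\right)
```

Removing an item changes both k and the variance terms, which is why a recalculated “α if item deleted” can be reported for every task and used to guide the removal procedure described here.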
Throughout the research, particular emphasis was placed on ethical issues: participants were thoroughly informed, the necessary permissions were obtained (e.g., access to schools, parental consent), and particular attention was paid to maintaining anonymity and to informing participants about the possible contexts in which the results might be presented or published. At all stages, participants were free to decide whether to take part in the research, with the ultimate goal of achieving informed consent [32].
The ethics of our research required complete anonymity of the participants, while at the same time we had to be able to link the research data collected in more than one meeting. For this reason, codes were used and the matching was performed by the school teacher, so that the researchers could not identify individual students from the results. Tablets and Google forms were used for the research and the collection of questionnaire data. This approach had advantages, such as avoiding errors and saving time (answers were immediately available digitally), and it facilitated student participation.

2.1. Creation of the Initial Pool of Tasks

The tasks were constructed as part of this research and were based on the researchers’ experience, drawn both from theoretical study of the literature at an academic level and from their work in the teaching and didactics of computer science. We used the first two steps of the four-step framework proposed by Li et al. [33], which concern the construction and attribution of each task to a CT concept, as well as a review of the proposed tasks by experts to confirm their correspondence with the specific CT concepts.
The construction of the tasks took into consideration the basic literature from which the CT concepts in question arose, as already mentioned in the theoretical part [1,15,25,27]. At this point, we should note that, in our opinion, a task never involves just one CT concept; several coexist, an argument that also appears in the literature [34,35]. Rowe et al. [36] provide examples of logic and sometimes refer to algorithmic thinking, pattern recognition, or abstraction. In constructing the tasks, we used ideas and variations from Rowe et al. [36] and Li et al. [33]. Our effort could not avoid the writing of algorithms, instructions, and commands in a specific order and form, mainly for the concept of algorithmic thinking, but also for some of the other dimensions of CT. For this reason, we decided to create an original “programming” environment in which shapes move and which contains basic objects. Based on the two-dimensional approach of Scratch- and Logo-like environments, we designed a micro-world consisting of the following: a two-dimensional maze with distinct and easily measurable steps, a male and a female figure (both abstract), a broom, a bucket of paint, four basic commands that make the figure move right, left, up, and down, and programming structures that represent the simple selection structure, the complex selection structure, and the repetition structure.
The process followed specific steps: initial recording of ideas for each CT concept; design and discussion between the two researchers on the categorization and form of each task; construction of the tasks and the required graphics; formatting them in Microsoft Word and entering them into a corresponding Google form to make them accessible on the internet; and review by two experts to confirm that each question does indeed belong to the concept invoked by the researchers who designed and constructed it. In this way, 90 tasks were finally constructed, which were divided into six categories, creating six distinct questionnaires.

2.2. Pilot Implementation of Initial Tasks’ Pool

A provincial secondary school, with three classes and a total of 70 participants, took part in the pilot study. Despite our best efforts, not all of the 70 participating students completed all of the questionnaires, as the questionnaires were not all completed in consecutive hours, to avoid the expected fatigue; as a result, some students were absent from some sessions, even though the researchers returned. Table 1 shows the participants. In the end, 41 students completed all six questionnaires and 23 students completed five out of the six questionnaires. Table 1 shows that the 70 participants completed 383 questionnaires in total. With six questionnaires per participant, the maximum is 420 questionnaires, giving a completion rate of 383/420 = 0.91, which would naively suggest about 63 participants with all six questionnaires completed. However, as mentioned above, only 41 students completed all six questionnaires.
All questions on the Google forms were mandatory, thus ensuring that no questions would remain unanswered, as is the case with printed questionnaires. To avoid hasty or random answers, all questions, without exception, included a fifth option “NO ANSWER”, which we encouraged students to use whenever they could not or would not answer a question.
We used a sheet to record the progress of the students in each class, which included details of the school and class, the dates and times we visited the class, the students’ codes, absences, and which questionnaire each student completed. We also used a free-form observation sheet, which recorded the date, time, and class, so that if we needed any clarification, we could return to the same class at a later date.
During the observation, we kept notes of the students’ questions and, through discussion with some of them, identified tasks that they found easy, difficult, or fun.
The data were stored in spreadsheets on Google Drive and downloaded immediately to ensure maximum data integrity. The data were processed in Excel, where an auto-correct function assigned one mark to a correct answer and zero marks to an incorrect answer or “NO ANSWER”.
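As an illustration of this automated scoring step (the actual processing was performed with Excel formulas; the task names and answer key below are hypothetical), a minimal sketch in Python could look as follows:

```python
import pandas as pd

# Hypothetical answer key: task identifier -> correct option (illustrative only)
ANSWER_KEY = {"ALG-1": "B", "ALG-2": "D", "ABS-1": "A"}

def score_responses(responses: pd.DataFrame) -> pd.DataFrame:
    """Assign 1 to a correct answer and 0 to an incorrect answer or 'NO ANSWER'."""
    scores = pd.DataFrame(index=responses.index)
    for task, correct in ANSWER_KEY.items():
        scores[task] = (responses[task] == correct).astype(int)
    return scores

if __name__ == "__main__":
    # Toy data standing in for the downloaded Google Forms spreadsheet
    raw = pd.DataFrame({
        "ALG-1": ["B", "NO ANSWER", "C"],
        "ALG-2": ["D", "D", "NO ANSWER"],
        "ABS-1": ["A", "B", "A"],
    })
    print(score_responses(raw))
```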

2.3. Main Research

Concerning the research sample, the initial target was a large number of students aged approximately 12–13 (first year of secondary school). We followed the convenience sampling method [32], including schools from both urban centers and smaller towns, and even two regional schools. As seen in Table 2, the initial number of participants was 521, and the survey was ultimately completed by 452 individuals (86.76%). This high completion rate was due to our persistence in returning to the schools, but also to the fact that the students themselves expressed their interest and wanted to participate.
The research design described in the pilot study was used in this research too, using the final questionnaires in Google form format, student codes, and tablets for the completion of the tasks. The answers per student (code) from all questionnaires were collected, checked for errors, corrected, and statistically processed. The processing included the extraction of statistics per question to show whether it was too easy or too difficult, as well as the extraction of statistics per student to discuss any unreliability in the answers (e.g., a questionnaire with 80% “NO ANSWER” responses would be deemed unsuitable for inclusion in our results). Once again, notes were kept on students’ questions and reactions in an attempt to evaluate tasks (what was very difficult, what was incomprehensible, what was fun, etc.). A second researcher participated as an observer in the research, helping to record questions asked in the classroom and malfunctions, and also to provide his own perspective.
The responses of the students included in the final sample were then statistically processed, following the removal of some entries. The statistical processing included the reliability of the questionnaire and, more specifically, its internal reliability. The following procedure was followed:
  • Reliability Analysis was performed repeatedly on the entire questionnaire, and in each analysis, SPSS was asked to display Cronbach’s alpha for all tasks, as well as the value of the index if a task was removed (“Scale if item deleted”).
  • In each iteration, one task was removed and we checked the index according to the previous step.
  • The process stopped when we reached the minimum number of five tasks that we had initially set as the limit per CT concept, or if the proposed index fell below 0.80, which is considered the threshold for acceptable internal reliability [37]. A minimal sketch of this iterative procedure is given below.
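The sketch below (written in Python with pandas purely for illustration; the original analysis was performed in SPSS, and ties between equally good candidates were resolved by researcher judgment rather than automatically) captures the core of the iteration:

```python
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha for a DataFrame of item scores (rows = participants)."""
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

def prune_items(data: pd.DataFrame, dims: dict, min_per_dim: int = 5,
                alpha_floor: float = 0.80) -> pd.DataFrame:
    """Repeatedly drop the item whose removal yields the highest alpha;
    stop when that removal would leave a CT dimension with fewer than
    `min_per_dim` items or would push alpha below `alpha_floor`.
    `dims` maps each item name to its CT dimension."""
    data = data.copy()
    while True:
        alpha_if_deleted = {c: cronbach_alpha(data.drop(columns=c)) for c in data.columns}
        best = max(alpha_if_deleted, key=alpha_if_deleted.get)
        remaining_in_dim = sum(dims[c] == dims[best] for c in data.columns)
        if remaining_in_dim <= min_per_dim or alpha_if_deleted[best] < alpha_floor:
            break  # removing `best` would violate one of the stopping criteria
        data = data.drop(columns=best)
    return data
```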
Following the construction of the new DACT CT assessment tool, an exploratory factor analysis (EFA) was conducted to examine the dimensionality of the questionnaire, using the Jamovi v.2.6.44.0 tool. The EFA was conducted using the initial main research sample presented in Table 2 (N = 452).
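The EFA itself was run in Jamovi; purely as an illustration of the same sequence of checks (KMO, Bartlett’s test, eigenvalue inspection, single-factor extraction), a sketch using the Python factor_analyzer package on simulated stand-in data might look as follows (the data generation, task names, and the default minres extraction are assumptions; the published analysis used principal axis factoring with oblimin rotation):

```python
import numpy as np
import pandas as pd
from factor_analyzer import FactorAnalyzer, calculate_kmo, calculate_bartlett_sphericity

# Toy stand-in for the 36 scored DACT tasks (0/1), one row per student
rng = np.random.default_rng(0)
theta = rng.normal(size=452)                                   # latent ability
probs = 1 / (1 + np.exp(-(theta[:, None] + rng.normal(0, 0.5, 36))))
scores = pd.DataFrame((rng.random((452, 36)) < probs).astype(int),
                      columns=[f"TASK-{i+1}" for i in range(36)])

# Sampling adequacy and sphericity checks before extracting factors
kmo_items, kmo_total = calculate_kmo(scores)
chi_square, p_value = calculate_bartlett_sphericity(scores)
print(f"KMO overall = {kmo_total:.3f}, Bartlett chi2 = {chi_square:.1f}, p = {p_value:.4g}")

# How many eigenvalues of the correlation matrix exceed 1?
fa = FactorAnalyzer(rotation=None)
fa.fit(scores)
eigenvalues, _ = fa.get_eigenvalues()
print("Eigenvalues > 1:", int((eigenvalues > 1).sum()))

# Single-factor solution and its loadings per task
fa_one = FactorAnalyzer(n_factors=1, rotation=None)
fa_one.fit(scores)
print(pd.Series(fa_one.loadings_[:, 0], index=scores.columns).sort_values().head())
```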

2.4. Validation

Following the initial development of the new CT assessment tool, our research proceeded to the validation of the DACT tool. For the validation process, we searched for independent assessment tools in Greek, but none were found in our review of CT assessment [22]. A few attempts have been made to assess CT in Greek [38,39,40,41], but they either refer to a different age group than that of our research sample, or they are not autonomous CT assessment tools, instead proposing a CT assessment methodology within the context they analyze.
Thus, among the assessment tools studied, the Computational Thinking Test (CTt) [42] was selected as the most suitable for validating the DACT assessment tool we created. The main reason for this selection is that the CTt has already undergone several validation checks, and its relationship with CT assessment has thus been established by research. Several studies have already been conducted on the CTt to describe and validate it [43], covering content validity [42], descriptive statistics, reliability, criterion validity [44], convergent validity [43], predictive validity [45], and cross-cultural validity [44]. Furthermore, research has been conducted on instructional validity and the sensitivity of measurements depending on whether the assessment is conducted before or after teaching, and the CTt has been studied both in an unplugged teaching environment [43] and in the Scratch programming teaching environment [46,47]. In addition, it has been studied in other programming environments such as Penguin Go [48] and Bomberbot [49]. Finally, the CTt has undergone validation checks using the IRT approach [50].
We can argue that the CTt belongs to the programming-based assessment tools and, in fact, uses a specific, existing programming environment. The CTt assesses CT mainly through the programming approach, and a comparison with the DACT tool would be expected to show a positive correlation. We should mention that the creators of the CTt state that they are theoretically based on Aho’s definition of CT [51], discuss the fundamental role of algorithmic thinking in CT, and refer to the primary position of programming in relation to CT, while clarifying that unplugged environments for CT are also possible [44]. The CTt tool does not require the use of a computer to complete it, but can also be administered on paper, as its questions (28 in total) are each based on an image and the answers are all in multiple choice format.
The CTt is an already validated tool for assessing CT, based on a smaller number of CT concepts than our approach in the DACT tool and referring mainly to programming concepts. However, since both tools aim to assess CT, and CT is a single construct, the results of the two tools can be compared on a common sample and conclusions can be drawn about their possible correlation.

2.4.1. Adaptation of the CTt Tool in Greek

For the purposes of this study, the CTt was translated and adapted into Greek so that it could be used to validate the DACT tool. For the use of the CTt tool and its adaptation into Greek, we communicated with the research team that created it and received their approval for our research purposes. Our research objectives in this section are the translation of the CTt into Greek and its cultural adaptation.
For the translation into Greek and the cultural adaptation of the CTt tool, we translated the tool from English into Greek (forward translation) and back from Greek into English (backward translation). The translation was performed with attention to the correct use of language, cultural elements, and scientific terminology, as recommended by Borsa et al. [52]. A factor that facilitated this process is that the CTt does not use a large amount of text, relying mainly on images from a Scratch-like programming environment.
Furthermore, as the programming environment from Code.org was used for the initial construction of the CTt, and this environment already has a translation available in Greek, we did not need to translate the images and commands into Greek. We only created the questions in the original environment in Greek from the outset, using the official translation of the block commands provided by Code.org.
As Hambleton [53] points out, issues related to cultural, idiomatic, linguistic, and content differences should be taken into account during translation. He also states that translators should be bilingual [52,54]. For this reason, two translators who met the criteria were used, who initially translated CTt into Greek. No discrepancies were found, due to the fact that there is not a large volume of text, and the commands within the blocks had already been translated by the official programming environment.
Beaton et al. [54] suggest that one translator should be familiar with the tool’s content and terminology, while the other should not. They argue that in this way the familiar translator tends to use scientific terminology, while the unfamiliar translator uses terminology that the average user of the tool will understand. For this reason, an additional translator was used, whose translation did not differ from those of the first two, familiar translators.
Before reaching the final form of the tool in Greek, a minor adjustment to cultural elements was made. The figure referred to as “Artist” in the original environment, literally translated as “Καλλιτέχνης” in Greek, was ultimately rendered with the term “Μαθητής” (student), since drawing mainly lines and rectangular shapes with the program’s pen does not evoke an “Artist” in Greek culture.
For the selection of the sample, we followed the convenience sampling method [32], choosing a school in the city of Heraklion. The translated CTt tool was administered in a pilot study to a class of 24 students.
For the administration of the translated CTt tool, we followed the instructions of its creators, who recommend a pre-defined completion time and the familiarization of the participants through three initial examples.
The Greek version of the CTt was coded in Google Form format so that it could be accessed remotely in electronic form. This format was also chosen by the creators of the original tool, as it facilitates administration and the electronic collection of responses. Participants completed the Greek version of the CTt within one teaching hour. Subsequently, the students were asked questions to assess their understanding of the tool’s items. The recording and analysis of the responses did not reveal any need for modifications, so the initial Greek version of the CTt used in this study also constituted the final version of the CTt tool in Greek used in the DACT validation research.

2.4.2. Validation Research

In the previous sections, we have described the development of the DACT assessment tool, as well as the translation, cultural adaptation, and pilot implementation of the CTt tool. The CTt is an already validated tool for CT assessment, referring mainly to programming concepts and covering fewer CT concepts than DACT. However, since both tools aim to assess CT, and CT is a single construct, the results of the two tools can be compared on a common sample and conclusions can be drawn about their possible correlation.
Validity is important when creating a measurement tool and is a key to effective research, and there are several different kinds of validity [32]. The main research questions of this section of our research are as follows:
  • Does the DACT tool have internal reliability?
  • Does the DACT tool cover the domain it purports to cover (content validity)?
  • Does the DACT tool seem to measure what it is designed for (face validity)?
  • Is there a correlation between the DACT tool and the CTt tool?
  • Does a high correlation coefficient exist between the scores on the DACT tool and the scores on other accepted tests of the same performance (criterion validity)?
  • Do the DACT tool results concur with results on other tests or instruments that are assessing CT (concurrent validity)?
Some of our research questions, as discussed below, have already been answered during the research procedures described in the previous sections.
The reliability of the DACT tool was tested using Cronbach’s α through the tool construction process, which we analyzed in a previous section. Through successive repetitions of the corresponding procedure in SPSS, and making use of the column titled “Cronbach’s Alpha if Item Deleted” provided in the SPSS output, we arrived at the final DACT tool with 36 questions and an index value of 0.926. Thus, the reliability analysis of the DACT tool has already shown that the tool is reliable and, in fact, has a very high internal reliability index.
Furthermore, during the construction of the DACT tool, and prior to data collection, its content validity was ensured, as mentioned in the respective sections on task construction. We applied the first two steps of the four-step framework proposed by Li et al. [33], which concern the construction and attribution of each question to one CT concept, as well as a review of the proposed tasks by experts to confirm their correspondence with the specific CT concepts. Furthermore, the construction of the initial tasks took into account the relevant literature, as well as the efforts to assess CT already mentioned in the theoretical part. Finally, in the research process that led to the final 36-question DACT tool, the criterion set was that no CT concept should have fewer than five questions (out of the initial 15 that corresponded to each one). The above design, expert review, and support from the literature, as well as the fact that no concept was left with fewer than five questions, allow us to claim that the DACT tool has content validity.
Regarding the face validity of the DACT tool, an initial assessment was made, as mentioned above, both during the process of constructing the initial tasks and during the process of constructing the final DACT questionnaire, as at some points it was necessary to decide which of two or more equally suitable tasks would be removed from the tool.
In the following research, we describe the process of validating the DACT tool in terms of criterion validity, and, in particular, concurrent validity, which is based on a comparison with the already validated CTt tool. In other words, the research will provide answers to our last three research questions, while the first three have already been answered in the above paragraphs.
In order to study criterion validity and concurrent validity, two tools were administered simultaneously:
  • The DACT tool in its final form
  • The CTt tool in its Greek version, as described in the previous section
The two tools were administered to students, the results of the two tools were calculated, and the results were subjected to correlation tests using Pearson’s r coefficient. We followed the convenience sampling method [32] for this research. Two secondary schools in the city of Heraklion were selected. A total of 119 students from these two secondary schools participated in the study. Of these individuals, 111 ultimately completed the CTt in Greek and 112 completed the DACT. Of the 111 individuals who completed the CTt, seven did not complete the DACT, while of the 112 individuals who completed the DACT, eight did not complete the CTt. Thus, seven and eight (15 in total) individuals were removed from the statistical analysis of the survey, resulting in a sample of 104 individuals who had completed both questionnaires from the initial sample of 119.
The design of this research followed the administration procedure used in our previous studies. The assessment tools were available online in Google forms, and each student had to use a code to start the process. The codes ensured the anonymity of the participants and were provided by the school teacher, while also allowing us to link the answers given by the same student in the two questionnaires through the same code. Tablets were used again, with the DACT homepage [https://dact.pre.uth.gr/en/] as the start page, from which the two Google forms were accessed.
Once participation was complete, the data were available electronically to the researcher and processing began. Automatic answer correction functions in Excel were used, as both questionnaires consist of multiple choice questions, making them easy to correct automatically. The results were then entered into SPSS and, using Pearson’s r coefficient, we reached the corresponding conclusions regarding the criterion validity and concurrent validity of the DACT tool.
It took about two teaching hours to complete both questionnaires. Some students needed more time and others less. In most cases, the questionnaires were completed on the same day, while in some cases we had to come back to them. In any case, the completion of both questionnaires by those students who did not finish them on the same day took place within two to three days, so that the answers given were close in time.
Microsoft Excel was initially used to process the students’ responses, and the initial scoring of the assessment tools was performed using automated correction functions. The data were also entered into SPSS (IBM SPSS Statistics v. 29.0.0.0), where the analysis was performed and the results were extracted. Once the data collection was complete, we first examined the individual data files and compared them with the research recording sheets. As in the pilot study, we observed some minor errors in the codes, mainly due to typos (e.g., someone typed “HUVPHC” instead of “HVUPHC”). These errors were easily identified and corrected, as we had assigned specific codes to each school and class, noted the date and time of intervention in each school, and the computer files of the Google form responses had timestamps, so we could easily identify any questionnaire that had an incorrect code. Note that the code is the only means of identifying the participating student, and it is used to combine their answers in the two questionnaires.
After correcting any code errors, we merged the individual files of the students’ answers in Excel. To ensure that this process was carried out correctly, we sorted each of the two individual files by code and then copied each questionnaire into a new file, side by side on the same worksheet, so that each row corresponded to a specific participant. In cases where a code was missing (since, as noted above, not all students completed both questionnaires), we left the cells corresponding to that questionnaire blank. Since there were only two questionnaires per participant, this process was fairly quick and simple.
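As a small illustration of this matching step (the actual work was done by sorting and copying in Excel; the column names and toy values below are hypothetical), the same outer join by student code can be expressed in Python as follows:

```python
import pandas as pd

# Toy stand-ins for the two per-tool result files (one row per student code)
dact = pd.DataFrame({"code": ["AB1", "AB2", "AB3"], "dact_score": [30, 22, 28]})
ctt = pd.DataFrame({"code": ["AB1", "AB3", "AB4"], "ctt_score": [24, 21, 19]})

# An outer join keeps students who completed only one questionnaire,
# leaving the other tool's cells empty (NaN), as in the Excel worksheet
merged = dact.merge(ctt, on="code", how="outer", validate="one_to_one")

# Only students who completed both questionnaires enter the correlation analysis
complete = merged.dropna(subset=["dact_score", "ctt_score"])
print(merged)
print(len(complete), "students completed both tools")
```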
Once all student entries had been correctly entered into Excel and the individual results for each assessment tool had been extracted, statistical processing was carried out using SPSS.

3. Results

3.1. Creation of an Initial Tasks’ Pool

At this stage of the research, the result was a basic pool of 90 tasks, which were divided into six categories, thus creating six distinct questionnaires. The six questionnaires correspond to the six main CT concepts, which we have analyzed in the Introduction: algorithmic thinking, abstraction, decomposition, generalization and patterns, evaluation, and logic. Each questionnaire consists of 15 tasks, making a total of 90 initial tasks, which were used in the pilot study.

3.2. Pilot Implementation of the Initial Tasks’ Pool

The pilot research helped us to test our research design in practice and to identify specific difficulties and shortcomings. It verified that electronic collection of responses via the Internet is feasible and did not present any problems, while being superior in terms of speed and helping to avoid errors.
The pilot study provided an opportunity to adapt and correct many tasks so that students would not have difficulty understanding them. Visualization of the data proved to be a good practice, while the use of organizational structures, such as tables or bullet points, also helped students to understand the tasks more easily. The way the questions were written also seemed to play a significant role in how well they were understood by each participant, and for this reason several tasks were reworded. Some incorrect code entries were also observed, but these were mainly due to transpositions. Applying the above, this research resulted in a final pool of 90 tasks, divided into six individual questionnaires of 15 tasks each, on algorithmic thinking, evaluation, decomposition, generalization and patterns, logic, and abstraction, which were used for the main research.

3.3. Main Research

In the main research, after the data had been collected, corrected, and coded, it was transferred to SPSS. The dataset consists of 92 variables: the student code, the school, and the 90 tasks answered by each participant.
The analysis began with the removal of two variables due to zero variance. In the first statistical analysis, with 88 questions, Cronbach’s alpha had a value of 0.933. Following the suggestion of “Cronbach’s α if item deleted”, we chose to remove the variable whose removal leaves the highest reliability index. When two or more variables yielded the same value, we followed different paths regarding which question should be removed. The possible options for removing one question at a time open up a tree of choices, whose branches increase exponentially. Most of the time, we ended up back at the same point after a few steps, i.e., the same questions had been removed but in a different order each time, so we continued from that point. At some points, where the decision was made based on the researchers’ experience, we also relied on our notes from the process (students’ questions and discussions) in order to select one question over another for removal. The process ended after 53 steps, and the termination condition that was activated was the one ensuring that no concept is left with fewer than five tasks in the final questionnaire. Table 3 shows some steps of this process at the beginning, in the middle, and at the end of the process.
We thus concluded with the final questionnaire, which ultimately consists of 36 tasks corresponding to the six dimensions of CT: algorithmic thinking (8), evaluation (5), decomposition (5), generalization and patterns (7), logic (6), and abstraction (5). Cronbach’s Alpha coefficient is 0.926, which is a very high value for internal reliability. Table 4 shows the final tasks per concept.
The final questionnaire is available in text file format (.docx) and in portable format (.pdf) (print version), while it has already been formatted appropriately for use via the Internet and is available in Google form format. The final questionnaire is accompanied by a corresponding administration protocol, and in its final form it will be available on the project’s website in all formats for use, after the appropriate permission has been obtained from its creators.
Finally, our notes of students’ questions and reactions, as well as discussions with some of them, allowed us to argue that students find questions related to movement and shapes that have images easier and more entertaining, while they find questions with more text and no other means of representing information more difficult.
Following the construction of the proposed DACT tool, an exploratory factor analysis (EFA) was conducted to examine the dimensionality of the questionnaire. The sample used in this analysis is the initial, main research sample (N = 452) due to its large number of participants. The results indicated a single-factor solution (unidimensional structure), as only one factor had an eigenvalue greater than one and the scree plot showed a clear inflection point. Figure 1 provides the EFA scree plot.
Table 5 shows the factor loadings, where we can see only one factor (Factor 1) and the relation of each variable to this factor. The principal axis factoring extraction method was used in combination with an “oblimin” rotation. Loadings with absolute values greater than 0.4 usually indicate a strong relation, and tasks presenting such values are considered to define the factor. Three of the final tasks, ABS-3, ABS-6 and ABS-13, present a low factor loading (less than 0.30). As seen in Table 3, ABS-13 was proposed for removal in the final step (if removed, it would have left the concept of Abstraction with fewer than five tasks), and ABS-3 and ABS-6 were likewise proposed for removal one step before the end of the process. Table 6, Table 7 and Table 8 show the values for the KMO Measure of Sampling Adequacy, the Model Fit Measures, and Bartlett’s Test of Sphericity, respectively.
Table 6 gives detailed information on the KMO (Kaiser–Meyer–Olkin) Measure of Sampling Adequacy, which indicates whether the data are suitable for factor analysis. Values above 0.5 and up to 0.6 are generally acceptable, 0.6 to 0.7 indicates medium adequacy, 0.7 to 0.8 is considered good, and values higher than 0.8 are considered excellent. As can be clearly seen in Table 6, all item values are in the excellent range, and the overall KMO value is 0.902. These values allow us to characterize the data as excellent for factor analysis.
Table 7 provides measures of how well our model explains the data. RMSEA (Root Mean Square Error of Approximation) measures discrepancy per degree of freedom; a value less than 0.08 is usually considered good, while a value less than 0.06, as in our case, is considered excellent. The chi-square test was used to assess whether the model fits the data exactly; this measure is sensitive to sample size, and in our case it is statistically significant (p < 0.001).
Finally, Table 8 provides a statistically significant result for Bartlett’s Test of Sphericity, a statistical test used to justify the use of factor analysis. A statistically significant result, as in our case (p < 0.001), means that the variables are sufficiently correlated to proceed with dimension reduction, thus justifying the use of the EFA.

3.4. Validation

Following the development of the DACT tool, we continued with its validation research. In the previous paragraphs, we described the process of selecting a validated CT assessment tool (CTt) and translating and adapting it into Greek for use in the validation of the DACT tool we developed. In addition, the CTt will now also be available in Greek, as the results of our research will be delivered to its creators, along with the Greek version of the tool, which can also be used by other Greek research efforts.
Thus, this work contributes to strengthening the effort to assess CT in the Greek context in yet another way. At this point, we should emphasize again that the two tools (CTt—DACT) do not refer exactly to the same CT concepts, so the use of one does not exclude the use of the other, but as both aim to assess CT as a whole, we can use them together and (as we will see in the next section) compare their results. Moreover, multiple assessment methods are currently the most appropriate and reliable method for CT assessment [22].
To investigate concurrent validity, we computed Pearson’s product–moment correlation coefficient between the total scores of the two assessment tools (DACT, range 0–36; CTt, range 0–28). Data from 104 students who completed both assessments were analyzed. Both variables met the assumptions for Pearson’s correlation.
A strong positive correlation was found between the two measures, r (102) = 0.80, p < 0.001, 95% CI [0.70, 0.86], indicating that higher scores on the DACT tool are associated with higher scores on the CTt. This result supports the concurrent validity of DACT, as it correlates highly with an established measure of computational thinking (CTt), consistent with criterion validity.
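To illustrate how such a coefficient and its confidence interval can be obtained (the published analysis used SPSS; the sketch below uses scipy and the standard Fisher z transformation for the 95% CI, with toy stand-in data in place of the actual score columns):

```python
import numpy as np
from scipy import stats

def pearson_with_ci(x, y, confidence=0.95):
    """Pearson's r with p-value and a Fisher-z confidence interval."""
    r, p = stats.pearsonr(x, y)
    z = np.arctanh(r)                            # Fisher z transformation
    se = 1.0 / np.sqrt(len(x) - 3)
    z_crit = stats.norm.ppf(1 - (1 - confidence) / 2)
    lo, hi = np.tanh(z - z_crit * se), np.tanh(z + z_crit * se)
    return r, p, (lo, hi)

if __name__ == "__main__":
    # Toy stand-ins for the DACT (0-36) and CTt (0-28) total scores of 104 students
    rng = np.random.default_rng(0)
    dact_total = rng.integers(0, 37, size=104)
    ctt_total = np.clip(dact_total * 0.7 + rng.normal(0, 4, 104), 0, 28)
    print(pearson_with_ci(dact_total, ctt_total))
```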

4. Discussion

In our research, we took the existing literature into account, including similar efforts and suggestions for the format of the tasks belonging to each category, and we created an original pool of 90 tasks for assessing CT. These tasks formed the basis of the pilot study, from which, with minimal corrections, the final task pool emerged and was then used in our main research. The main research resulted in the final CT assessment questionnaire, which consists of far fewer questions: 36 in total, with a very high Cronbach’s alpha internal reliability index. We hope to make it available to the scientific community soon to fill the gap mentioned in the literature [22].
The construction of the original CT assessment tasks of DACT is theoretically based on the six CT concepts that we mentioned: algorithmic thinking, abstraction, decomposition, generalization and patterns, evaluation, and logic. However, the EFA results did not differentiate between six distinct factors, but rather indicated a single-factor solution (unidimensional structure). Thus, at this phase of our research, the structure of the DACT assessment tool appears to be unidimensional; it does not assess distinct CT skills, but rather treats CT as a single concept and skill. In the “Creation of the Initial Pool of Tasks” section, we have already discussed that CT concepts coexist to a large extent and cannot be completely separated, as other researchers also point out and explain [34,35,36]. We believe that the separation of the concepts, as well as their temporal dependence (whether they always appear together, some before, some after, and in what way), should be the subject of extensive research in the future.
At the same time, DACT is detached from purely programming concepts, without ignoring basic programming structures. It is not based on a specific programming environment and therefore does not require knowledge of programming or of a specific programming language; in tasks that use scenery, figures, and commands, these do not come from a specific programming environment but from the DACT micro-world created for this research and tool. In addition to avoiding the difficulty of having to learn a programming environment in order to be assessed within it, this also prevents students who may already be familiar with a specific programming environment from being advantaged over those using it for the first time.
Furthermore, as mentioned in the design of the questions, the setting and forms are deliberately abstract so as not to distract attention, while movement in two dimensions is considered sufficient and no attempt was made to involve students in the three-dimensional world, as such a need does not arise either intuitively or from our literature review.
The assessment tool is based on the theoretical analysis we have described. At the same time as our own research work, the scientific community has also been working on assessing CT. Thus, Wiebe et al. [44] propose and validate the CTA-M tool, which is based on the existing CTt assessment tool and selected topics from the Bebras competition [45,55]. The tasks introduced by the authors from Bebras are similar to those we have designed and included in our tool.
In their 2020 review, Tang et al. [23] reported on the progress made in CT assessment to date. Among their key conclusions, they recommended that more tools be created for older students and university students (a need covered by our tool for the first grades of secondary education), that greater emphasis be given to the theoretical documentation of assessment tools, and that tools be administrable independently of specific environments (cross-platform); our tool adequately covers the latter two points as well, according to what we have already presented. More recent proposals address primary school students [33], making the need for assessment aimed at secondary education even more pressing, while assessment proposals continue to be based mainly on programming structures [56]. The use of concepts more general than basic programming concepts in the assessment of CT, which our tool follows, is also emphasized by Lai [57].
In 2022, El–Hamamsy et al. [58] used the Competent Computational Thinking Test (cCTt) to target older elementary school students, employing an unplugged approach and moving away from basic programming environments, but the gap in secondary education, which our tool aims to fill for the first grades, still remains. More generally, other approaches to assessing CT are emerging for younger ages, such as the proposals by Shen et al. [59], Rowe et al. [36], Sartor Hoffer [60], and many others presented in the research of Ocampo et al. [61], while more recent research continues to rely mainly on programming for the assessment of CT, such as that of Ghosh et al. [62]. This is confirmed by recent research by Ukkonen et al. [24] on teachers’ views on CT assessment: teachers report that it is easier to assess basic programming concepts, but they consider CT to also include concepts more general than programming, which are not adequately assessed by existing assessment tools, showing that the research gap that prompted us to create the CT assessment tool continues to exist.
Furthermore, Román–González and Pérez–González [29] analyze the dimensions of higher-order thinking that are expected to develop in different age groups, placing in middle school (typically ages 11–14) the foundation of logical thinking (according to Piaget) and its transfer to abstract schemas: the use of abstract thinking, evaluation through argumentation and reflection, and a logical and methodical approach to problem solving rather than simple reliance on trial and error. The above confirms the targeting of our tool at this age group.
In this study, specific validity checks were performed on the DACT tool. In the future, additional validity checks could be performed, such as external validity and construct or structural validity (which includes convergent, discriminant, and predictive validity), as well as further criterion-related and concurrent validity checks against other instruments.
The DACT tool could also be tested with a wider age range, as we believe it could cover both the older grades of elementary school and the last grades of middle school (ages 10–15).
Furthermore, it would be worthwhile for future research to translate and adapt this tool into English; as the text is not very long, it uses easily understandable images and symbols, and it could thus be used by the international community as a CT assessment tool.
To present the research results and support the research, a website was created on the site of the Department of Primary Education of the University of Thessaly, Greece, under the acronym DACT, from the initials of the words Development and Assessment of Computational Thinking. This acronym also names the tool created in the context of this research.
Moving on to the validation of the DACT tool, we searched the literature for a validated tool for CT assessment. At the time the study was designed, no validated tools for autonomous CT assessment were available in Greek. One tool that stood out, owing to the numerous validation checks it had undergone, was the Computational Thinking test (CTt). Although the CTt has been validated by its creators and is a tool for CT assessment, it relies mainly on a programming environment for assessment. It consists of 28 questions and uses a ready-made programming environment (from code.org) for the figures, commands, and scenarios it contains. Its analysis is based on the recognition and use of programming patterns related to basic instructions and uses the sequence structure, the repetition structure, the selection structure, and functions. In a sense, the CTt’s orientation leans more towards algorithmic thinking and evaluation, two of the CT concepts, but it still remains a validated tool for CT assessment.
CT is a cognitive process that consists of several dimensions, as we have already analyzed. This research is based on the analysis of CT into six (6) concepts, and thus our assessment tool refers to more CT concepts than the CTt. The two approaches and the two assessment tools are not identical. However, CT is a single construct, so since the CTt tool assesses specific CT concepts, based mainly on the programming approach, a comparison with the DACT tool should show a positive correlation. It is important to note that the creators of the CTt state that they are theoretically based on Aho’s definition of CT [51], discuss the fundamental role of algorithmic thinking and programming in CT, but clarify that unplugged environments for CT are also possible [44]. Thus, although the CTt tool is based on an existing programming environment, it does not require the use of a computer to complete it, but can also be administered on paper, since each of its tasks is based on an image and the answers are all multiple choice.
Thus, we believe that since CTt is a validated tool for assessing CT, taking into account that it may use fewer CT concepts, it can be used to validate DACT. Even though the latter refers to more concepts, the results of the two tools should show a positive correlation as they assess the same general common concept of CT.
In order to be able to use the CTt tool in Greek, we translated and adapted it, thus producing the Greek version of the CTt. With the Greek version of the CTt now available, we proceeded to validate the DACT tool. The CTt was used to check concurrent validity: both tools (CTt and DACT) were administered to the same student population, and our statistical test gave us statistically significant results for their correlation (a high correlation), thus supporting concurrent validity, which is a form of criterion validity.
The process of constructing the DACT through continuous monitoring of Cronbach’s alpha provided us with statistical results for its internal reliability, while our construction process ensured content validity. In addition, during the construction process, the face validity of the DACT was checked, thus completing a basic cycle of validation checks for the tool.
Thus, we can now recommend DACT as a CT assessment tool, which is based on the analysis of CT in six main dimensions, namely algorithmic thinking, evaluation, generalization and patterns, abstraction, logic, and decomposition, but it provides a single result and does not evaluate each of these dimensions separately. The DACT tool is administered autonomously, without depending on a specific programming environment, and can be administered either online or printed and completed by students. The online version is recommended, as it has the advantage of automatically checking the answers to each question, and the results are immediately available electronically.
The license to use the tool will also include the spreadsheet with the automated correction functions for the DACT tool. The provision of the DACT tool will be accompanied by an administration protocol for its use.
A limitation of this research consists of conducting an Explanatory Factor Analysis (EFA) using an even larger sample of participants, who will complete all 90 tasks of the CT assessment tasks’ pool. Following this case, a new research will be conducted, which might also include a Confirmatory Factor Analysis (CFA). This work presents the first steps of the development and validation of the DACT assessment tool, keeping in mind that there are more things to be examined in future research and probably a new wider research cycle will be necessary in order to present the final assessment tool.
In future research, the questionnaire construction process could be repeated while addressing the aforementioned limitations, and it would be interesting to compare the results with the final DACT CT assessment tool. It might also be interesting to target not whole classes but selected students with a proven interest in the subject, and possibly in programming or problem solving, to examine their performance alone, on the assumption that their answers would contain no element of randomness. Furthermore, as there are different kinds of validity, further research on the validity of the DACT could be conducted in the future.

5. Conclusions

This work describes the process of developing a new CT assessment tool, named DACT. The initial research provided a pool of 90 assessment tasks, originally created by the authors and checked by experts. Each task was assigned to one CT concept, namely algorithmic thinking, abstraction, decomposition, evaluation, generalization and patterns, or logic, although it is not always easy to argue for the presence of only one CT concept per task. A pilot study initially checked these tasks and revised some of them, allowing us to proceed to the main research, which produced the final DACT CT assessment tool with 36 tasks. The proposed tool has undergone validation checks, and we have argued for its concurrent validity, criterion validity and face validity, thus completing a basic cycle of validation checks.
The lack of CT assessment tools for the age group (11–14 years old) that the DACT targets, the fact that it is designed to be administered autonomously, without depending on a specific programming environment or on programming knowledge, and its ease of administration, either online (Google Form) or in printed format (PDF file), allow us to hope that this work can fill a significant gap for the scientific community.

Author Contributions

Conceptualization, E.P., P.P. and P.R.; methodology, E.P.; validation, E.P., P.P. and P.R.; formal analysis, E.P. and P.R.; investigation, E.P.; resources, E.P.; data curation, E.P.; writing—original draft preparation, E.P.; writing—review and editing, E.P., P.P. and P.R.; visualization, E.P.; supervision, P.P.; project administration, E.P., P.P. and P.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
DACT	Development and Assessment of Computational Thinking
CT	Computational Thinking
CTt	Computational Thinking test
cCTt	Competent Computational Thinking Test
EFA	Exploratory Factor Analysis

References

  1. Selby, C.; Woollard, J. Computational Thinking: The Developing Definition. University of Southampton (E-prints). Available online: https://eprints.soton.ac.uk/356481/ (accessed on 10 December 2025).
  2. Yadav, A.; Stephenson, C.; Hong, H. Computational Thinking for Teacher Education. Commun. ACM 2017, 60, 55–62. [Google Scholar] [CrossRef]
  3. Allan, W.; Coulter, B.; Denner, J.; Erickson, J.; Lee, I.; Malyn-Smith, J.; Martin, F. Computational Thinking for Youth. Available online: https://stelar.edc.org/sites/default/files/Computational_Thinking_paper.pdf (accessed on 10 December 2025).
  4. Barr, V.; Stephenson, C. Bringing computational thinking to K-12: What is involved and what is the role of the computer science education community. ACM InRoads 2011, 2, 48–54. [Google Scholar] [CrossRef]
  5. Denning, P.J. Computing is a natural science. Commun. ACM 2007, 50, 13–18. [Google Scholar] [CrossRef]
  6. Guzdial, M. Paving the way for computational thinking. Commun. ACM 2008, 51, 25–27. [Google Scholar] [CrossRef]
  7. National Research Council. Committee for the Workshops on Computational Thinking: Report of A Workshop on the Scope and Nature of Computational Thinking; The National Academies Press: Washington, DC, USA, 2010. [Google Scholar]
  8. Wing, J.M. Computational thinking. Commun. ACM 2006, 49, 33–35. [Google Scholar] [CrossRef]
  9. Bocconi, S.; Chioccariello, A.; Dettori, G.; Ferrari, A.; Engelhardt, K. Developing Computational Thinking in Compulsory Education—Implications for Policy and Practice. Available online: https://publications.jrc.ec.europa.eu/repository/bitstream/JRC104188/jrc104188_computhinkreport.pdf (accessed on 10 December 2025).
  10. Bocconi, S.; Chioccariello, A.; Kampylis, P.; Dagienė, V.; Wastiau, P.; Engelhardt, K.; Earp, J.; Horvath, M.A.; Jasutė, E.; Malagoli, C.; et al. Reviewing Computational Thinking in Compulsory Education. In Reviewing Computational Thinking in Compulsory Education; Inamorato Dos Santos, A., Cachia, R., Giannoutsou, N., Punie, Y., Eds.; Publications Office of the European Union: Luxembourg, 2022. [Google Scholar]
  11. Computational Thinking with Scratch. Computational Thinking. Available online: https://scratched.gse.harvard.edu/ct/ (accessed on 10 December 2025).
  12. Computer Science Teachers Association Task Force. Computational Thinking. Available online: https://advocate.csteachers.org/2015/02/04/csta-computational-thinking-ct-task-force/ (accessed on 10 December 2025).
  13. Computing At School. Computing in the National Curriculum: A Guide for Primary Teachers; Newnorth Print: Bedford, UK, 2013. [Google Scholar]
  14. Computing At School. Computing in the National Curriculum: A guide for Secondary Teachers; Newnorth Print: Bedford, UK, 2014. [Google Scholar]
  15. Computing At School. Computational Thinking: A Guide for Teachers. 2015. Available online: https://community.computingatschool.org.uk/files/8550/original.pdf (accessed on 10 December 2025).
  16. ISTE; CSTA. Operational Definition of Computational Thinking in K-12 Education. Available online: https://cdn.iste.org/www-root/Computational_Thinking_Operational_Definition_ISTE.pdf (accessed on 10 December 2025).
  17. National Research Council. Committee for the Workshops on Computational Thinking: Report of a Workshop on the Pedagogical Aspects of Computational Thinking; The National Academies Press: Washington, DC, USA, 2011. [Google Scholar]
  18. The Royal Society. Shut Down or Restart: The Way Forward for Computing in UK Schools. Available online: https://royalsociety.org/~/media/education/computing-in-schools/2012-01-12-computing-in-schools.pdf (accessed on 10 December 2025).
  19. University of Canterbury. Computational Thinking and CS Unplugged. Available online: https://csunplugged.org/en/computational-thinking/ (accessed on 10 December 2025).
  20. Kalelioglu, F.; Gulbahar, Y.; Kukul, V. A framework for computational thinking based on a systematic research review. Balt. J. Mod. Comput. 2016, 4, 583–596. [Google Scholar]
  21. Angeli, C.; Jaipal-Jamani, K. Preparing Pre-service Teachers to Promote Computational Thinking in School Classrooms. In Computational Thinking in the STEM Disciplines; Khine, M.S., Ed.; Springer: Cham, Switzerland, 2018; pp. 127–150. [Google Scholar]
  22. Poulakis, E.; Politis, P. Computational Thinking assessment: Literature review. In Research on e-Learning and ICT in Education: Technological, Pedagogical and Instructional Issues; Tsiatsos, T., Demetriadis, S., Dagdilelis, V., Mikropoulos, A., Eds.; Springer: Cham, Switzerland, 2021; pp. 111–128. [Google Scholar]
  23. Tang, X.; Yin, Y.; Lin, Q.; Hadad, R.; Zhai, X. Assessing computational thinking: A systematic review of empirical studies. Comput. Educ. 2020, 148, 103798. [Google Scholar] [CrossRef]
  24. Ukkonen, A.; Pajchel, K.; Mifsud, L. Teachers’ understanding of assessing computational thinking. Comput. Sci. Educ. 2024, 35, 794–819. [Google Scholar] [CrossRef]
  25. Bell, T.; Lodi, M. Constructing Computational Thinking Without Using Computers. Constr. Found. 2019, 14, 342–351. [Google Scholar]
  26. Teaching London Computing. A Resource Hub from CAS London and CS4FN. Available online: https://teachinglondoncomputing.org/ (accessed on 10 December 2025).
  27. Dagienė, V.; Sentance, S. It’s Computational Thinking! Bebras tasks in the curriculum. In Informatics in Schools: Improvement of Informatics Knowledge and Perception; Brodnik, A., Tort, F., Eds.; Springer: Cham, Switzerland, 2016; pp. 28–39. [Google Scholar]
  28. Curzon, P.; Dorling, M.; Ng, T.; Selby, C.; Woollard, J. Developing Computational Thinking in the Classroom: A Framework. 2014. Available online: https://www.researchgate.net/publication/299364542_Developing_computational_thinking_in_the_classroom_a_framework (accessed on 10 December 2025).
  29. Román-González, M.; Pérez-González, J. Computational Thinking Assessment: A Developmental Approach. In Computational Thinking Curricula in K–12: International Implementations; Abelson, H., Kong, S.C., Eds.; The MIT Press: Cambridge, MA, USA, 2024; pp. 121–141. [Google Scholar]
  30. Robinson, M.A. Using multi-item psychometric scales for research and practice in human resource management. Hum. Resour. Manage. 2018, 57, 739–750. [Google Scholar] [CrossRef]
  31. Boateng, G.O.; Neilands, T.B.; Frongillo, E.A.; Melgar-Quiñonez, H.R.; Young, S.L. Best Practices for Developing and Validating Scales for Health, Social, and Behavioral Research: A Primer. Front. Public Health 2018, 6, 1–18. [Google Scholar] [CrossRef]
  32. Cohen, L.; Manion, L.; Morrison, K. Research Methods in Education, 5th ed.; Routledge: London, UK, 2000. [Google Scholar]
  33. Li, Y.; Xu, S.; Liu, J. Development and Validation of Computational Thinking Assessment of Chinese Elementary School Students. J. Pac. Rim Psychol. 2021, 15, 1–22. [Google Scholar] [CrossRef]
  34. Computing At School. Computing Progression Pathways. Available online: https://teachinglondoncomputing.org/wp-content/uploads/2014/07/computing_progression_pathways_with_computational_thinking_v2-3.pdf (accessed on 10 December 2025).
  35. Rich, P.J.; Egan, G.; Ellsworth, J. A Framework for Decomposition in Computational Thinking. In Proceedings of the 2019 ACM Conference on Innovation and Technology in Computer Science Education; Scharlau, B., McDermott, R., Pears, A., Sabin, M., Eds.; ACM: New York, NY, USA, 2019; pp. 416–421. [Google Scholar]
  36. Rowe, E.; Asbell-Clarke, J.; Almeda, M.; Gasca, S.; Edwards, T.; Bardar, E.; Shute, V.; Ventura, M. Interactive Assessments of CT (IACT): Digital Interactive Logic Puzzles to Assess Computational Thinking in Grades 3–8. Int. J. Comput. Sci. Educ. Sch. 2021, 5, 28–73. [Google Scholar] [CrossRef]
  37. Bryman, A. Social Research Methods, 3rd ed.; Oxford University Press: New York, NY, USA, 2008. [Google Scholar]
  38. Atmatzidou, S.; Demetriadis, S. Advancing students’ computational thinking skills through educational robotics: A study on age and gender relevant differences. Rob. Auton. Syst. 2016, 75, 661–670. [Google Scholar] [CrossRef]
  39. Kanaki, K.; Kalogiannakis, M. Assessing Algorithmic Thinking Skills in Relation to Gender in Early Childhood. Educ. Process Int. J. 2022, 11, 44–59. [Google Scholar]
  40. Kanaki, K.; Kalogiannakis, M. Assessing computational thinking skills at first stages of schooling. In Proceedings of the 2019 3rd International Conference on Education and E-Learning, New York, NY, USA, 5–7 November 2019. [Google Scholar]
  41. Kanaki, K.; Kalogiannakis, M.; Stamovlasis, D. Assessing algorithmic thinking skills in early childhood education: Evaluation in physical and natural science courses. In Handbook of Research on Tools for Teaching Computational Thinking in P-12 Education; Kalogiannakis, M., Papadakis, S., Eds.; IGI-Global: Hershey, PA, USA, 2020; pp. 488–523. [Google Scholar]
  42. Román-González, M. Computational thinking test: Design guidelines and content validation. In Proceedings of the 7th International Conference on Education and New Learning Technologies, Barcelona, Spain, 6–8 July 2015. [Google Scholar]
  43. Brackmann, C.; Roman-Gonzalez, M.; Robles, G.; Moreno-Leon, J.; Casali, A.; Barone, D. Development of Computational Thinking Skills through Unplugged Activities in Primary School. In Proceedings of the 12th Workshop on Primary and Secondary Computing Education, Nijmegen, The Netherlands, 8–10 November 2017. [Google Scholar]
  44. Wiebe, E.; London, J.; Aksit, O.; Mott, B.W.; Boyer, K.E.; Lester, J.C. Development of a Lean Computational Thinking Abilities Assessment for Middle Grades Students. In Proceedings of the 50th ACM Technical Symposium on Computer Science Education; Hawthorne, E.K., Pérez-Quiñones, M.A., Heckman, S., Zhang, J., Eds.; ACM: New York, NY, USA, 2019; pp. 456–461. [Google Scholar]
  45. Román-González, M.; Pérez-González, J.C.; Moreno-Leon, J.; Robles, G. Can computational talent be detected? Predictive validity of the Computational Thinking Test. Int. J. Child-Comput. Interact. 2018, 18, 47–58. [Google Scholar] [CrossRef]
  46. Pérez-Marín, D.; Hijón-Neira, R.; Bacelo, A.; Pizarro, C. Can computational thinking be improved by using a methodology based on metaphors and scratch to teach computer programming to children? Comput. Hum. Behav. 2018, 105, 1–10. [Google Scholar] [CrossRef]
  47. Rose, S.P.; Habgood, M.P.J.; Jay, T. Using Pirate Plunder to Develop Children’s Abstraction Skills in Scratch. In Proceedings of the Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems, New York, NY, USA, 4–9 May 2019. [Google Scholar]
  48. Zhao, W.; Shute, V.J. Can playing a video game foster computational thinking skills? Comput. Educ. 2019, 141, 103633. [Google Scholar] [CrossRef]
  49. Fanchamps, N.L.J.A.; Slangen, L.; Specht, M.; Hennissen, P. The Impact of SRA-Programming on Computational Thinking in a Visual Oriented Programming Environment. Educ. Inf. Technol. 2021, 26, 6479–6498. [Google Scholar] [CrossRef]
  50. Chan, S.W.; Looi, C.K.; Sumintono, B. Assessing computational thinking abilities among Singapore secondary students: A Rasch model measurement analysis. J. Comput. Educ. 2021, 8, 213–236. [Google Scholar] [CrossRef]
  51. Aho, A.V. Computation and computational thinking. Comput. J. 2012, 55, 832–835. [Google Scholar] [CrossRef]
  52. Borsa, J.C.; Damásio, B.F.; Bandeira, D.R. Cross-cultural adaptation and validation of psychological instruments: Some considerations. Paidéia 2012, 22, 423–432. [Google Scholar] [CrossRef]
  53. Hambleton, R.K. Issues, designs, and technical guidelines for adapting tests into multiple languages and cultures. In Adapting Educational and Psychological Tests for Cross-Cultural Assessment; Hambleton, R.K., Merenda, P.F., Spielberger, C.D., Eds.; Lawrence Erlbaum: Mahwah, NJ, USA, 2005; pp. 3–38. [Google Scholar]
  54. Beaton, D.E.; Bombardier, C.; Guillemin, F.; Ferraz, M.B. Guidelines for the Process of Cross-Cultural Adaptation of Self-Report Measures. Spine 2000, 25, 3186–3191. [Google Scholar] [CrossRef]
  55. Blokhuis, D.; Millican, P.; Roffey, C.; Schrijvers, E.; Sentance, S. UK Bebras Computational Thinking Challenge 2016; University of Oxford: Oxford, UK, 2016. [Google Scholar]
  56. Grover, S. Designing an Assessment for Introductory Programming Concepts in Middle School Computer Science. In Proceedings of the 51st ACM Technical Symposium on Computer Science Education; Zhang, J., Sherriff, M., Eds.; ACM: New York, NY, USA, 2020; pp. 678–684. [Google Scholar]
  57. Lai, R.P.Y. Beyond Programming: A Computer-Based Assessment of Computational Thinking Competency. ACM Trans. Comput. Educ. 2021, 22, 1–27. [Google Scholar] [CrossRef]
  58. El-Hamamsy, L.; Zapata-Caceres, M.; Barroso, E.M.; Mondada, F.; Dehler Zufferey, J.; Bruno, B. The Competent Computational Thinking Test: Development and Validation of an Unplugged Computational Thinking Test for Upper Primary School. J. Educ. Comput. Res. 2022, 60, 1818–1866. [Google Scholar] [CrossRef]
  59. Shen, L.; Mirakhur, Z.; LaCour, S. Investigating the psychometric features of a locally designed computational thinking assessment for elementary students. Comput. Sci. Educ. 2024, 35, 414–433. [Google Scholar] [CrossRef]
  60. Sartor Hoffer, M.; Baroni, S.; Fronza, I.; Pahl, C. About Computational Thinking Assessment: A Proposal for Primary School First Year from a Pedagogical Perspective. In Proceedings of the 2nd Systems of Assessments for Computational Thinking Learning Workshop (TACKLE 2019), Co-Located with 14th European Conference on Technology Enhanced Learning (EC-TEL 2019), Delft, The Netherlands, 17 September 2019. [Google Scholar]
  61. Ocampo, L.M.; Corrales-Álvarez, M.; Cardona-Torres, S.A.; Zapata-Cáceres, M. Systematic Review of Instruments to Assess Computational Thinking in Early Years of Schooling. Educ. Sci. 2024, 14, 1124. [Google Scholar] [CrossRef]
  62. Ghosh, A.; Malva, L.; Singla, A. Analyzing–Evaluating–Creating: Assessing Computational Thinking and Problem Solving in Visual Programming Domains. In Proceedings of the 55th ACM Technical Symposium on Computer Science Education; Stephenson, B., Stone, J.A., Battestilli, L., Rebelsky, S.A., Shoop, L., Eds.; ACM: New York, NY, USA, 2024; pp. 387–393. [Google Scholar]
Figure 1. EFA scree plot.
Table 1. Participations per questionnaire for the pilot study (N = 70).

AN | Questionnaire             | Participation
1  | Algorithmic thinking—ALG  | 68
2  | Evaluation—EVA            | 65
3  | Decomposition—DEC         | 51
4  | Generalization—GEN        | 69
5  | Logic—LOG                 | 69
6  | Abstraction—ABS           | 61
   | Total                     | 383
Table 2. Main research sample.

AN | School                    | Classes | Participants | Completed
1  | 1st Gymnasium District 1  | 8       | 87           | 80
2  | 2nd Gymnasium District 1  | 5       | 127          | 115
3  | 3rd Gymnasium District 1  | 5       | 107          | 91
4  | 4th Gymnasium District 1  | 7       | 126          | 106
5  | 1st Gymnasium District 2  | 3       | 67           | 57
6  | 2nd Gymnasium District 2  | 1       | 7            | 3
   | Total                     | 29      | 521          | 452
Table 3. Steps; total remaining tasks; tasks per concept; removal proposal.

Step | Tasks | ALG | EVA | DEC | GEN | LOG | ABS | Removal Proposal | Cronbach’s α If Item Removed
1    | 88    | 15  | 15  | 14  | 15  | 15  | 14  | LOG-14 | 0.935
2    | 87    | 15  | 15  | 14  | 15  | 14  | 14  | EVA-7, EVA-11, DEC-5, GEN-4, LOG-12, ABS-4, ABS-8 | 0.935
30   | 59    | 10  | 11  | 10  | 10  | 10  | 8   | ALG-1, ALG-4, ALG-10, ALG-11, ALG-12, ALG-14, EVA-2, EVA-3, EVA-5, EVA-6, EVA-8, EVA-9, EVA-13, EVA-15, DEC-2, DEC-4, DEC-9, DEC-10, DEC-12, DEC-13, DEC-14, DEC-15, GEN-1, GEN-9, GEN-10, GEN-11, GEN-15, LOG-2, LOG-4, LOG-7, LOG-9, LOG-11, LOG-15, ABS-1, ABS-3, ABS-6, ABS-9, ABS-11, ABS-12, ABS-13, ABS-15 | 0.945
52   | 37    | 8   | 5   | 6   | 7   | 6   | 5   | DEC-9, ABS-3, ABS-6, ABS-9, ABS-13 | 0.926
53   | 36    | 8   | 5   | 5   | 7   | 6   | 5   | ABS-13 | 0.927
Table 4. Final tasks per CT concept and number in DACT.

CT Concept                   | Tasks                                                      | Number
Algorithmic Thinking         | ALG-3, ALG-4, ALG-5, ALG-6, ALG-8, ALG-10, ALG-12, ALG-14  | 8
Evaluation                   | EVA-3, EVA-4, EVA-10, EVA-13, EVA-14                       | 5
Decomposition                | DEC-8, DEC-10, DEC-11, DEC-13, DEC-14                      | 5
Generalization and Patterns  | GEN-3, GEN-7, GEN-9, GEN-11, GEN-12, GEN-13, GEN-14        | 7
Logic                        | LOG-3, LOG-4, LOG-5, LOG-6, LOG-9, LOG-13                  | 6
Abstraction                  | ABS-1, ABS-3, ABS-6, ABS-9, ABS-13                         | 5
Total number of tasks in DACT|                                                            | 36
Table 5. EFA factor loading for DACT.

Task   | Factor 1 | Uniqueness
ALG_3  | 0.521    | 0.729
ALG_4  | 0.466    | 0.783
ALG_5  | 0.543    | 0.706
ALG_6  | 0.520    | 0.730
ALG_8  | 0.562    | 0.684
ALG_10 | 0.463    | 0.786
ALG_12 | 0.379    | 0.857
ALG_14 | 0.375    | 0.859
EVA_3  | 0.424    | 0.820
EVA_4  | 0.502    | 0.748
EVA_10 | 0.424    | 0.820
EVA_13 | 0.361    | 0.870
EVA_14 | 0.530    | 0.719
DEC_8  | 0.419    | 0.824
DEC_10 | 0.399    | 0.841
DEC_11 | 0.402    | 0.839
DEC_13 | 0.333    | 0.889
DEC_14 | 0.383    | 0.853
GEN_3  | 0.446    | 0.801
GEN_7  | 0.410    | 0.832
GEN_9  | 0.351    | 0.876
GEN_11 | 0.406    | 0.835
GEN_12 | 0.411    | 0.831
GEN_13 | 0.466    | 0.783
GEN_14 | 0.484    | 0.766
LOG_3  | 0.433    | 0.812
LOG_4  | 0.443    | 0.803
LOG_5  | 0.404    | 0.837
LOG_6  | 0.586    | 0.656
LOG_9  | 0.327    | 0.893
LOG_13 | 0.452    | 0.796
ABS_1  | 0.509    | 0.741
ABS_3  |          | 0.934
ABS_6  |          | 0.912
ABS_9  |          | 0.938
ABS_13 | 0.397    | 0.843
Table 6. KMO Measure of Sampling Adequacy.

Task   | MSA
ALG_3  | 0.942
ALG_4  | 0.928
ALG_5  | 0.889
ALG_6  | 0.908
ALG_8  | 0.913
ALG_10 | 0.924
ALG_12 | 0.925
ALG_14 | 0.876
EVA_3  | 0.902
EVA_4  | 0.925
EVA_10 | 0.936
EVA_13 | 0.869
EVA_14 | 0.938
DEC_8  | 0.870
DEC_10 | 0.902
DEC_11 | 0.881
DEC_13 | 0.876
DEC_14 | 0.879
GEN_3  | 0.914
GEN_7  | 0.842
GEN_9  | 0.841
GEN_11 | 0.850
GEN_12 | 0.906
GEN_13 | 0.900
GEN_14 | 0.924
LOG_3  | 0.880
LOG_4  | 0.891
LOG_5  | 0.904
LOG_6  | 0.945
LOG_9  | 0.886
LOG_13 | 0.923
ABS_1  | 0.920
ABS_3  | 0.854
ABS_6  | 0.843
ABS_9  | 0.854
ABS_13 | 0.899
Table 7. Model fit measures.

Measure | Value
RMSEA   | 0.0316
TLI     | 0.889
BIC     | −2768
χ2      | 863
df      | 594
p       | <0.001
Table 8. Bartlett’s Test of Sphericity.

χ2   | df  | p
3206 | 630 | <0.001