1. Introduction
In modern education, the importance of visual programming tools is increasingly emphasized because they help students master fundamental programming concepts more easily through intuitive graphical interfaces [1, 2]. Research indicates that this approach fosters logical and algorithmic thinking through interactive, visual learning [1, 3]. When working with children with special educational needs, selecting an appropriate software solution is further complicated by cognitive, motor, and perceptual limitations [4, 5].
The research problem is the lack of a standardized, empirically validated, multicriteria framework for selecting visual programming software for children with special educational needs. Although many tools are available, comparing them is not straightforward because numerous, often conflicting criteria affect their applicability in the educational process. Consequently, the selection process in practice is often insufficiently structured and prone to subjective judgment, which makes objective and transparent decisions difficult. This situation indicates the need to develop a formalized hybrid multi-criteria decision-making (MCDM) framework to enable systematic evaluation, ranking, and selection of software solutions in this context.
MCDM is an approach to problem-solving under conditions involving multiple, often conflicting criteria [6, 7], and has wide applications across various fields, including education [8, 9, 10]. MCDM methods enable the structured evaluation and comparison of alternatives against predefined criteria, thereby ensuring greater transparency and consistency in the decision-making process [6, 8, 9].
The evaluation criteria framework was developed based on an analysis of the educational context and the specific needs of the target population, as no single standardized set of criteria is available in the literature for this purpose. Empirical data were collected through a structured electronic questionnaire distributed to educational institutions specializing in educating children with special educational needs, enabling the construction of a matrix for multi-criteria analysis. The methodological approach included applying the Rank Order Centroid (ROC) method to determine the criterion weights and the PROMETHEE II method (Preference Ranking Organization Method for Enrichment Evaluations) to rank the alternatives. The stability of the results was further examined using a sensitivity analysis.
The main contribution of this study is the development of an integrated multi-criteria framework for evaluating and selecting visual programming software solutions for children with special needs, including a clearly defined set of criteria tailored to the target group’s specific characteristics. This contribution is also reflected in the combined use of the ROC and PROMETHEE II methods in this context, as well as in a sensitivity analysis to assess the model’s robustness. The proposed framework contributes to a more objective and transparent decision-making process and serves as a basis for further research in educational software evaluation.
Given the specific requirements of inclusive education, selecting an appropriate software solution is complicated by the interdependence and often conflicting relationships among relevant criteria, including ease of use, cognitive load, and pedagogical effectiveness. In this context, multi-criteria decision-making provides a systematic, methodologically grounded approach to decision-making problems with conflicting criteria, as confirmed by contemporary MCDM approaches [11, 12].
2. Literature Review
The literature was searched across scientific databases, digital libraries, and open-access sources. The search covered papers published between 2020 and 2025 that relate to the application of multicriteria methods in education, the evaluation of educational software, and the use of visual programming with children with special educational needs. The search was organized around these three research directions, as no study integrating them into a single whole was identified. Papers were selected based on their thematic relevance to the research.
In education, multicriteria methods are widely used to evaluate teaching strategies and educational technologies. Troussas et al. developed a TOPSIS-based model to tailor teaching strategies to students’ individual characteristics, thereby improving learning efficiency [13]. Similarly, Alshamsi et al. applied an MCDM approach to evaluate distance learning systems, identifying key criteria such as content quality, interactivity, and user satisfaction [14].
Basciani et al. proposed a model for software quality assessment that integrates technical and user aspects, enabling a comprehensive evaluation of software solutions [15]. Shabrina et al. also analyzed systems for automatically generating feedback in programming learning, using a multicriteria approach that evaluates accuracy, efficiency, and user experience [16].
The application of multicriteria methods in inclusive education is a significant research area. Studies show that MCDM can be used to prioritize the needs of children with autism spectrum disorders and to evaluate different educational approaches [17]. Additionally, methods such as AHP and GRA have been used to analyze the environment and factors influencing the learning of children with autism [18]. Kumar et al. analyzed the application of MCDM methods for evaluating inclusive educational systems and identified key factors affecting the quality of education for children with special needs [19]. Cañete et al. used a multicriteria approach in product design, specifically for smart toys for children with autism, defining criteria that include sensory, functional, and economic aspects [20].
Digital technologies play an important role in supporting the development of children with special needs, particularly in communication, social interaction, and learning, and recent advances have further expanded these possibilities. For example, modern technologies, such as machine learning and deep learning, enable the development of systems that can assist in identifying and addressing issues in children’s learning and daily functioning [21, 22]. These approaches indicate the significant potential for integrating intelligent systems and educational tools.
Visual programming is a significant approach to teaching programming to children, as it enables the acquisition of fundamental concepts in an intuitive and accessible way. Block-based tools are particularly noteworthy because they eliminate the need to learn syntax, thereby facilitating initial learning. However, their use with children with special needs requires additional adaptations to ensure adequate accessibility and effective learning. Research shows that numerous challenges arise in using these tools, including the complexity of the user interface, visual overload, and a lack of functionality tailored to users’ specific needs [23]. To overcome these problems, guidelines have been proposed that include simplifying the interface, applying visual and auditory stimuli, and supporting individualized learning [23, 24].
Despite the growing number of papers addressing multi-criteria analysis, inclusive education, and visual programming, the literature still lacks an integrated approach that unifies these research directions into a methodologically grounded framework. Existing research has mainly focused on individual aspects of the problem. At the same time, systematic evaluation and selection of software solutions that consider the specific needs of children with special educational needs remain insufficiently represented.
This shortcoming indicates the need for an approach that enables clearer, more objective comparisons of available solutions. Accordingly, this study aims to integrate existing knowledge from multi-criteria analysis, software evaluation, and inclusive education by proposing a hybrid framework for the systematic analysis and selection of software solutions for visual programming for children with special educational needs.
3. Materials and Methods
3.1. Research Design
This quantitative cross-sectional study included 66 respondents employed in schools serving children and students with developmental disabilities in the Republic of Serbia [25, 26]. The sample comprised computer science teachers and professional associates (pedagogues, psychologists, and special education teachers) who were key participants in selecting and implementing educational software solutions. In addition to respondents’ socio-professional characteristics, the study captured characteristics of the educational environment in which they worked, including students’ ages and the types of developmental and educational needs (e.g., learning difficulties, intellectual disabilities, developmental disorders, and multiple disabilities). These data provide a more precise understanding of the conditions under which visual programming software is applied in practice. Given the anonymous nature of the research, no identification data on individual schools were collected; only aggregated data on the type of institution (primary or secondary school) and the basic characteristics of the educational environment were gathered. The sample was formed using convenience sampling, constrained by limited access to specialized educational institutions and the specific nature of the target population.
3.2. Research Objectives and Research Questions
This study aimed to identify and quantify the criteria that teachers and professional associates use when selecting visual programming software for children with special educational needs, and to develop and apply a hybrid multi-criteria framework to support decision-making in evaluating and ranking these solutions. In line with this aim, the following research questions were formulated:
RQ1: What are the most important criteria for evaluating and selecting visual programming software solutions for children with special educational needs?
RQ2: How are software solutions ranked using the hybrid ROC–PROMETHEE II model according to the defined criteria?
RQ3: How stable is the resulting ranking of software solutions under changes in criterion weights, as assessed by sensitivity analysis and GAIA visualization?
3.3. Data Collection and Instrumentation
The data were collected via an anonymous, structured questionnaire created in Google Forms and electronically distributed to 48 primary and secondary schools in the Republic of Serbia that serve children and students with developmental disabilities, yielding 66 valid responses from teachers and professional associates. The questionnaire included both closed- and open-ended questions to identify software solutions used for visual programming among children with special educational needs, to rank evaluation criteria, and to determine key factors influencing the choice of software solution. Criteria were ranked on an ordinal scale, with respondents assigning ranks from 1 to 11 (1—most important criterion, 11—least important), providing the basis for determining criterion weights using the ROC method in the next phase of the analysis. Participation was voluntary and anonymous, and no personal identification data were collected.
3.4. Criteria Definition
Based on a systematic review and analysis of relevant scientific and professional literature in the fields of educational technologies, inclusive education, and multicriteria analysis, a set of criteria was developed to evaluate software solutions for visual programming by children with special educational needs. The criteria were formulated to encompass the pedagogical, cognitive, technical, and inclusive aspects of software solutions and to enable quantification within the MCDM model.
Special emphasis is placed on the principles of inclusive education, including accessibility, equity in learning, and adapting educational content to the diverse needs of students [27, 28]. Equity in learning means providing equal opportunities to achieve learning outcomes through tailored approaches, methods, and resources, in accordance with students’ individual characteristics. Contemporary frameworks for evaluating digital educational tools that highlight the importance of usability, interactivity, and pedagogical value were also considered [29, 30].
By synthesizing relevant sources, a structured set of criteria suitable for multi-criteria analysis was developed [31, 32]. The resulting criteria, shown in Table 1, were used to evaluate software solutions for visual programming intended for children with special educational needs. Their systematic presentation ensures clarity and provides a basis for further quantitative analysis within a multicriteria decision-making model.
3.5. Data Structure and Preprocessing
The data were organized into two basic structures to construct a hybrid MCDM framework using the ROC and PROMETHEE II methods. This approach enables the integration of respondents’ subjective assessments with the objective characteristics of software solutions.
The first structure presents the criterion-ranking matrix based on respondents’ responses. Using these rankings, the average criterion rankings were calculated and used as inputs for the ROC method [37]. The ROC method transforms ordinal rankings into quantitative weights, with the weight of the i-th criterion determined as the sum of the reciprocals of ranks i through n, divided by n, according to the following expression:
$$w_i = \frac{1}{n} \sum_{k=i}^{n} \frac{1}{k}, \qquad i = 1, \dots, n,$$
where $w_i$ is the weight of the i-th criterion and $n$ is the total number of criteria. The ROC method was chosen for its simplicity, stability, suitability for processing ordinal data, and the absence of additional subjective parameters in weight determination [38, 39].
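The weight calculation described above can be sketched in a few lines of Python. This is an illustrative sketch only (the study itself reports using Microsoft Excel), and the function name `roc_weights` is an assumption introduced here. It assumes the standard ROC expression, in which the criterion at aggregate rank i receives the sum of 1/k for k from i to n, divided by n.

```python
from fractions import Fraction


def roc_weights(n: int) -> list[float]:
    """Rank Order Centroid weights for n criteria ranked 1 (most important) to n."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    weights = []
    for i in range(1, n + 1):
        # w_i = (1/n) * sum_{k=i}^{n} 1/k, computed exactly before converting to float
        tail = sum(Fraction(1, k) for k in range(i, n + 1))
        weights.append(float(tail / n))
    return weights


# Eleven criteria, as in the questionnaire described in Section 3.3.
weights = roc_weights(11)
print([round(w, 4) for w in weights])
```

With n = 11, the first two values round to 0.2745 and 0.1836, and the weights sum to 1 by construction.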
The questionnaire was designed to obtain ordinal rankings of predefined evaluation criteria for use within a multicriteria decision-making framework. The evaluation criteria were predefined based on a systematic review of relevant scientific and professional literature and were not intended to represent items of a single measurement scale. Instead, they were treated as independent decision criteria within the MCDM framework. Since the criteria represent independent evaluation dimensions rather than items of a psychometric scale, psychometric reliability and validity procedures, such as Cronbach’s alpha and factor analysis, were not considered appropriate for the present study design. The ROC method was applied to transform respondents’ ordinal rankings into quantitative criterion weights.
The second structure presents a decision matrix that includes the software alternatives Scratch, Tynker, MakeCode, Blockly, and Alice, selected based on empirical data from respondents and the relevant literature in educational technologies [40, 41]. Although the present study considered five software alternatives identified through empirical findings, the proposed MCDM framework is not limited to these solutions and may be applied to a broader set of visual programming environments. The values for the alternatives were drawn from technical documentation and prior research and are expressed on a unified scale to ensure comparability across criteria.
The data were normalized using the min–max method to ensure comparability across criteria, given their differing measurement scales [42]. This method was chosen because it preserves relative relationships between values and is a standard step in MCDM data preprocessing [41]. The normalization further enables the consistent application of the PROMETHEE II method for ranking alternatives [43].
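A minimal sketch of min–max normalization for one criterion column is given below. The function name, the `maximize` flag, and the example scores are hypothetical and are not taken from the study's decision matrix. The flag is an optional convenience for criteria where lower values are better; if preference direction is handled later in the PROMETHEE II step, it should be left at its default so that the direction is not reversed twice.

```python
def min_max_normalize(values, maximize=True):
    """Rescale one criterion column to [0, 1] with min-max normalization."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column cannot discriminate, so map it to 0.
        return [0.0] * len(values)
    span = hi - lo
    if maximize:
        return [(v - lo) / span for v in values]
    # For a minimized criterion, the smallest raw value maps to 1.
    return [(hi - v) / span for v in values]


# Hypothetical scores of five alternatives on a single criterion.
benefit = min_max_normalize([5, 4, 4, 3, 2])
cost = min_max_normalize([5, 4, 4, 3, 2], maximize=False)
print(benefit)
print(cost)
```

The ordering and the relative spacing of the values are preserved, which is the property the text cites as the reason for choosing this method.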
3.6. MCDM Framework
The hybrid MCDM framework combines the ROC method, which transforms ordinal ranks into numerical weights without additional subjective parameters, with the PROMETHEE II method, which provides a complete ranking of alternatives based on the net preference flow [7, 43]. The two methods were selected for their complementary roles in multicriteria decision-making: ROC offers a simple, stable transformation of criterion ranks into weights, whereas PROMETHEE II uses those weights in pairwise comparisons of alternatives on each criterion. This combination enables the consistent integration of respondents’ subjective assessments with objective multicriteria analyses.
The criterion weights obtained using the ROC method served as input parameters for the PROMETHEE II analysis, ensuring consistent evaluation of alternatives [38, 39]. The net preference flow for each alternative is defined as
$$\Phi(a) = \Phi^{+}(a) - \Phi^{-}(a),$$
where $\Phi^{+}(a)$ and $\Phi^{-}(a)$ are the positive and negative preference flows, respectively. The final ranking of alternatives is based on the net flow values, with higher values indicating a better position in the ranking and negative values indicating weaker relative performance [43].
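For reference, the two flows entering this definition can be written out in the standard PROMETHEE form. The notation is an assumption introduced here for illustration, since the text does not define these symbols: $A$ is the set of $m$ alternatives, $w_j$ are the criterion weights (summing to 1), $g_j(a)$ is the score of alternative $a$ on criterion $j$, and $P_j$ is the preference function of criterion $j$.

```latex
\pi(a, b) = \sum_{j=1}^{n} w_j \, P_j\bigl(g_j(a) - g_j(b)\bigr),
\qquad
\Phi^{+}(a) = \frac{1}{m-1} \sum_{\substack{b \in A \\ b \neq a}} \pi(a, b),
\qquad
\Phi^{-}(a) = \frac{1}{m-1} \sum_{\substack{b \in A \\ b \neq a}} \pi(b, a).
```

For a criterion that is to be minimized, the sign of the difference $g_j(a) - g_j(b)$ is reversed before $P_j$ is applied.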
A linear (V-shaped) preference function with a threshold of p = 2 was applied to all criteria because the input data were previously normalized and expressed on a uniform scale. This approach enabled differentiation among alternatives based on the magnitude of differences in criterion values while avoiding the need to define an additional indifference threshold (q). This choice contributes to the model’s simplicity, preserves interpretability, and increases the method’s sensitivity to variations among alternatives [44, 45].
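The procedure described in this subsection can be sketched as follows. This is a minimal illustration under stated assumptions, not the Visual PROMETHEE implementation used in the study: the function names `v_shape` and `promethee_ii`, the three-alternative score matrix, and the two weights are all hypothetical. The only values taken from the text are the V-shape form of the preference function and the threshold p = 2.

```python
def v_shape(d: float, p: float = 2.0) -> float:
    """Linear (V-shape) preference: 0 for d <= 0, d/p for 0 < d <= p, 1 above p."""
    if d <= 0:
        return 0.0
    return min(d / p, 1.0)


def promethee_ii(matrix, weights, maximize, p=2.0):
    """Return (phi_plus, phi_minus, phi_net), one value per row of `matrix`."""
    m, n_crit = len(matrix), len(weights)
    total_w = sum(weights)

    def pi(a, b):
        # Weighted preference of alternative a over alternative b.
        score = 0.0
        for j in range(n_crit):
            d = matrix[a][j] - matrix[b][j]
            if not maximize[j]:
                d = -d  # lower raw values are better on this criterion
            score += weights[j] * v_shape(d, p)
        return score / total_w

    others = lambda a: (b for b in range(m) if b != a)
    phi_plus = [sum(pi(a, b) for b in others(a)) / (m - 1) for a in range(m)]
    phi_minus = [sum(pi(b, a) for b in others(a)) / (m - 1) for a in range(m)]
    phi_net = [fp - fm for fp, fm in zip(phi_plus, phi_minus)]
    return phi_plus, phi_minus, phi_net


# Hypothetical toy data: 3 alternatives, 2 criteria on a 1-5 scale.
scores = [[5, 2], [4, 3], [2, 4]]
weights = [0.75, 0.25]
maximize = [True, False]  # second criterion behaves like a cost
plus, minus, net = promethee_ii(scores, weights, maximize)
ranking = sorted(range(len(scores)), key=lambda a: net[a], reverse=True)
print(net, ranking)
```

In this toy example the net flows are 0.75, 0.1875, and −0.9375, so the alternatives are ranked in the order in which they are listed. The net flows sum to zero, which holds for any PROMETHEE II result.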
Although a common preference threshold (p = 2) was applied across all criteria to ensure methodological consistency and simplify model implementation, criterion-specific thresholds could provide a more detailed representation of preference structures. This may be particularly relevant when comparing objectively measurable criteria, such as software solution cost, with more subjective criteria, such as pedagogical support. Future studies may investigate criterion-specific preference and indifference thresholds to further refine the decision-making model and assess their influence on ranking stability.
A sensitivity analysis was conducted by varying the criteria weights within ±10% and +20% of their nominal values to assess the stability of the ranking and identify the criteria with the greatest impact on the results. This approach validates the model’s robustness and aligns with contemporary PROMETHEE and MCDM studies [39, 43, 44, 45].
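One way to generate such weight scenarios is sketched below. The function name and the three example weights are hypothetical. The sketch also assumes that, when one weight is changed, the remaining weights are rescaled proportionally so that the total is unchanged; the text does not state how the remaining weights were adjusted, so this rule is an assumption.

```python
def perturb_weight(weights, index, relative_change):
    """Scale one weight by (1 + relative_change); rescale the others proportionally."""
    total = sum(weights)
    new_w = weights[index] * (1.0 + relative_change)
    rest = total - weights[index]
    if rest <= 0 or not (0.0 <= new_w <= total):
        raise ValueError("perturbation leaves no valid weight for the other criteria")
    # Assumption: the other weights keep their mutual ratios.
    scale = (total - new_w) / rest
    return [new_w if j == index else w * scale for j, w in enumerate(weights)]


# Hypothetical base weights; the three scenarios mirror the -10%, +10%, +20% variations.
base = [0.5, 0.3, 0.2]
scenarios = {change: perturb_weight(base, 0, change) for change in (-0.10, 0.10, 0.20)}
for change, w in scenarios.items():
    print(f"{change:+.0%}", [round(x, 4) for x in w])
```

Each perturbed weight vector would then be passed to the ranking step, and the resulting order compared with the baseline.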
The ROC weights were calculated in Microsoft Excel for its convenience in processing ordinal data and implementing standard MCDM calculations. Visual PROMETHEE 1.4 Academic Edition software was used to apply the PROMETHEE II method and conduct a sensitivity analysis, enabling ranking of alternatives and assessing the stability of the results.
3.7. Limitations of the Study
A limitation of this research is that fewer than 10% of participants had experience with multiple visual programming software solutions. Consequently, respondents’ direct experience could not provide a sufficiently reliable basis for the comparative evaluation of all analyzed alternatives. For this reason, the assessment of software alternatives was based on technical documentation, relevant scientific literature, and a comparative analysis of software functionalities. The survey was not used to evaluate software alternatives against the defined criteria; rather, it served to identify visual programming environments used in educational practice and to determine the relative importance of the evaluation criteria. Therefore, the final ranking results should be interpreted in light of this triangulated evaluation framework.
Another limitation relates to the heterogeneity of the target population. Children with special educational needs were considered as a broad target group, without distinguishing among specific categories of disabilities, such as cognitive, motor, sensory, or perceptual impairments. Although the selected criteria reflect general principles of inclusive education and accessibility, future research may examine the suitability of visual programming environments for specific groups of learners and investigate whether different disability categories require different evaluation criteria or weighting structures.
A further limitation concerns the study sample. The empirical research was conducted among teachers and professional associates working in schools that educate children with special educational needs in the Republic of Serbia. Participants were recruited using a convenience sampling approach, which may have influenced the composition of the sample. Consequently, the obtained criterion weights and ranking results should be interpreted in terms of the characteristics of the surveyed participants and should not be automatically generalized to other educational systems. Future research may validate and extend the proposed framework using larger and more geographically diverse samples, enabling comparisons across different countries and educational systems.
4. Results
The relative importance of the criteria was determined using the ROC method based on the ranking of 11 criteria by 66 respondents. The criteria were ranked on a scale from 1 (most important) to 11 (least important), after which the mean ranks were calculated, and an aggregate ranking of the criteria was established. Based on these values, the weights of the criteria were calculated and used in further analyses.
Table 2 presents the calculated criteria weights obtained using the ROC method.
The results in Table 2 show that the criteria for visual accessibility (0.2745) and cognitive complexity (0.1836) have the highest weights. These are followed by pedagogical support, interaction simplicity, support for specific skill development, adaptability and inclusiveness, multisensory learning, and technical accessibility. The lowest weights were assigned to software solution cost, voice/audio support, and adaptability to the school environment. This weight distribution underscores the dominant importance of accessibility, usability, and support for the learning process in evaluating software solutions.
The criterion weights and the decision matrix, together with the defined preference functions, constitute the fundamental inputs of the multi-criteria model and enable quantitative comparison of alternatives within the MCDM approach [46]. The PROMETHEE II method is based on the outranking principle, in which alternatives are compared in pairs for each criterion. The results are aggregated into positive and negative flows, and the difference between them defines the net flow, which is used to rank the alternatives [47].
Table 3 presents a qualitative evaluation of the software alternatives against the defined criteria, which serves as the basis for constructing the decision matrix and precedes its numerical transformation. Qualitative assessments were standardized using a five-level ordinal scale to ensure consistency and comparability across criteria. The levels used (very high, high, medium, low, and very low) were based on an analysis of technical documentation, available scientific literature, and a comparative analysis of software solution functionalities.
The qualitative assessments were converted to a five-point ordinal Likert scale (1–5) [60], using the following mapping: very high = 5, high = 4, medium = 3, low = 2, and very low = 1. This approach enables the quantification of qualitative data and their use in multi-criteria analysis. In this scheme, a higher value (5) indicates the most favorable performance level, whereas lower values indicate a weaker degree of fulfillment of the observed criterion. For the criteria of cognitive complexity and software solution cost, an inverse preference was applied, so that lower values represented more favorable alternatives, thereby ensuring consistency in preference direction within the PROMETHEE II method [43, 47].
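The mapping and the preference directions stated above can be encoded as a small lookup. The sketch below is illustrative: the names `LEVEL_TO_SCORE`, `MINIMIZED_CRITERIA`, `to_score`, and `is_maximized` are assumptions introduced here, and the example ratings are hypothetical rather than taken from Table 3.

```python
# Mapping stated in the text: five ordinal levels to scores 1-5.
LEVEL_TO_SCORE = {"very high": 5, "high": 4, "medium": 3, "low": 2, "very low": 1}

# Criteria for which lower values are preferred, as stated in the text.
MINIMIZED_CRITERIA = {"cognitive complexity", "software solution cost"}


def to_score(label: str) -> int:
    """Convert a qualitative level to its numeric score, ignoring case and spacing."""
    key = " ".join(label.lower().split())
    if key not in LEVEL_TO_SCORE:
        raise ValueError(f"unknown level: {label!r}")
    return LEVEL_TO_SCORE[key]


def is_maximized(criterion: str) -> bool:
    """True when higher scores are better for the given criterion."""
    return criterion.lower() not in MINIMIZED_CRITERIA


# Hypothetical ratings for one alternative (illustrative only).
example = {"visual accessibility": "Very high", "cognitive complexity": "low"}
scores = {c: to_score(level) for c, level in example.items()}
directions = {c: ("max" if is_maximized(c) else "min") for c in example}
print(scores, directions)
```

Keeping the direction separate from the score leaves the decision matrix on the original 1–5 scale and lets the ranking step apply the inverse preference.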
Qualitative assessments were assigned through a comparative analysis of the software solutions’ functionalities, drawing on relevant technical documentation and scientific literature. Using these standardized and transformed data, a decision matrix was constructed, as shown in Table 4, which serves as the input dataset for the multicriteria ranking of alternatives.
Based on the scores in the decision matrix alone, before criterion weights and preference functions are applied, Scratch achieves the highest scores across nearly all evaluated criteria, indicating its overall superiority among the software solutions analyzed. MakeCode ranks second, demonstrating strong performance in technical accessibility, software solution cost, and adaptability to the school environment, while maintaining stable results across other criteria. Tynker ranks third, showing consistently high values in pedagogical support and multisensory learning, although slightly lower performance on the software solution cost criterion influences its overall position. Blockly ranks fourth and exhibits moderate performance across most criteria, with its strongest performance observed in the software solution cost criterion. Alice ranks last, with comparatively lower values for visual accessibility, cognitive complexity, and interaction simplicity than the other evaluated alternatives.
The resulting decision matrix served as input for the PROMETHEE II analysis. Within the model, a linear (V-shaped) preference function with a threshold p = 2 was applied, defined in accordance with the five-point Likert rating scale, thereby allowing differentiation between moderate and more pronounced differences among alternatives and enabling more precise discrimination with respect to the criteria considered [60, 61, 62].
The PROMETHEE II method enables calculation of positive (φ+), negative (φ−), and net preference flows (φ). Positive flow (φ+) reflects how much an alternative surpasses others, whereas negative flow (φ−) indicates how much others outweigh a given alternative. The net flow (φ) is the difference between the positive and negative flows and is used to rank alternatives [61, 62, 63]. The results are presented in Table 5.
The values in Table 5 show that the Scratch alternative achieved the highest net preference flow (φ = 0.2817), ranking first. Tynker ranked second with a positive net flow (φ = 0.1554), indicating stable and competitive performance across the observed criteria. MakeCode achieved a lower but still positive value (φ = 0.0380), indicating a relatively balanced ratio of advantages to disadvantages. Alice and Blockly had negative net preference flow values, indicating that they were outperformed by other alternatives on more criteria. The lowest value was recorded by Blockly (φ = −0.2456), which ranked last.
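As a consistency check on these values, PROMETHEE II net flows sum to zero over all alternatives, because every pairwise preference index contributes once to a positive flow and once to a negative flow. Assuming the four values quoted above are exact to four decimal places, the net flow of Alice, which is not stated in the text, can be inferred as follows. This is a derived estimate under that assumption, not a value read from Table 5.

```latex
\begin{aligned}
\sum_{a} \varphi(a) &= 0 \\
\varphi(\text{Alice}) &\approx -\bigl(0.2817 + 0.1554 + 0.0380 - 0.2456\bigr) = -0.2295
\end{aligned}
```

The inferred value lies above that of Blockly (−0.2456), which agrees with Alice being placed fourth and Blockly last.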
Although the initial analysis of the decision matrix (Table 4) shows a slight advantage for the MakeCode alternative over the Tynker alternative, the application of the PROMETHEE II method yields a different ranking, with the Tynker alternative ranked second. This difference results from the influence of criterion weights and the model’s preference structure, thereby confirming the importance of applying multi-criteria methods in decision-making and the need for a more detailed analysis of the relationships between alternatives. When interpreting the ranking results, it should be taken into account that respondents differed in their familiarity with the analyzed software solutions. However, the ranking was derived from a triangulated evaluation process based on technical documentation, scientific literature, and software functionality analysis rather than respondents’ assessments of the software alternatives. For a clearer interpretation of the results, a graphical representation of the ranking of alternatives based on net preference flow values is provided (Figure 1).
The ranking graph (Figure 1) shows the distribution of alternatives according to the net preference flow values. Scratch held the highest position with a pronounced positive value, confirming its overall most favorable status compared to the other alternatives. Tynker also achieved a significant positive value and ranked second, whereas MakeCode, with a value close to zero, indicated an almost balanced relationship between advantages and disadvantages. In contrast, Blockly and Alice are located in the negative part of the scale, indicating that, in aggregate terms, they are outperformed by other alternatives based on the observed criteria. Blockly has the lowest value and thus occupies the last place in the ranking. For additional insight into the relationships between alternatives and criteria, the results of the PROMETHEE II method are also presented on the GAIA plane, which enables visualization of multidimensional preferential relations [61, 62] (Figure 2).
The GAIA plane enables visual analysis of relationships between alternatives and criteria by projecting the multidimensional decision space into a two-dimensional display [62, 63]. The GAIA plane shows that most criteria are aligned in a similar direction, indicating strong mutual alignment and low conflict within the model. Due to the high similarity and close alignment of several criterion vectors, some criteria overlap in the GAIA projection and are therefore not individually distinguishable in the graphical representation. Such overlap is expected when criteria exhibit a high degree of consistency and does not affect the interpretation of the results.
The criteria cognitive complexity and software solution cost stand out with a different vector orientation, indicating their conflicting relationship with criteria related to the inclusive, pedagogical, and functional characteristics of the software solutions. In relation to the decision axis (π-axis), the alternative Scratch is closest to the dominant direction of the criteria, confirming its highest alignment with the defined preferences and corresponding to the results of the PROMETHEE II ranking. Tynker and MakeCode occupy intermediate positions, indicating partial alignment with the dominant decision direction. At the same time, Alice and Blockly are farther from the π-axis, indicating a lower level of overall alignment with the defined evaluation criteria. The spatial distribution of alternatives confirms differences in the performance of the analyzed software solutions and enables a clearer interpretation of their interrelationships within the defined decision model. The projection quality of 91.1% indicates very good representativeness of the GAIA plane and confirms the reliability of the visual interpretation of the results [62, 63].
To assess the stability of the ranking and determine the influence of individual criteria on the results, a sensitivity analysis was conducted by varying criterion weights by ±10% and +20%. The analysis was performed using the variable weights method (Walking Weights) in the Visual PROMETHEE tool, with particular emphasis on the key criteria [44, 45, 47]. The results of the sensitivity analysis are presented in Table 6, and Table 7 provides a comparative overview of the baseline ranking and the rankings obtained across the analyzed scenarios.
The sensitivity analysis focused on the four highest-ranked criteria: visual accessibility, cognitive complexity, pedagogical support, and interaction simplicity. These criteria were selected because they received the highest rankings from respondents and obtained the largest ROC weights. Together, they account for 70.42% of the total weight distribution and therefore exert the strongest influence on the final ranking results. Consequently, sensitivity analysis was focused on these criteria, as variations in lower-weighted criteria would be expected to have a substantially smaller impact on the overall ranking. For all criteria, weight changes of +10% and +20% were applied to assess the impact of positive deviations on the final ranking of alternatives. In comparison, negative variation was limited to −10% to maintain scenario comparability and a stable sensitivity analysis structure.
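The 70.42% share can be reproduced from the ROC expression with n = 11. The first two weights below match the values reported for Table 2; the third and fourth are computed here from the standard ROC formula rather than read from the table, so they should be treated as derived values.

```latex
\begin{aligned}
w_1 &= \tfrac{1}{11}\left(1 + \tfrac{1}{2} + \tfrac{1}{3} + \dots + \tfrac{1}{11}\right) \approx 0.2745, \\
w_2 &= w_1 - \tfrac{1}{11}\cdot 1 \approx 0.1836, \\
w_3 &= w_2 - \tfrac{1}{11}\cdot\tfrac{1}{2} \approx 0.1382, \\
w_4 &= w_3 - \tfrac{1}{11}\cdot\tfrac{1}{3} \approx 0.1079, \\
w_1 + w_2 + w_3 + w_4 &\approx 0.7042 .
\end{aligned}
```

The remaining seven criteria therefore share about 29.58% of the total weight, which is why variations in their weights are expected to move the ranking less.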
In the base scenario, the PROMETHEE II method ranked the alternatives as follows: Scratch, Tynker, MakeCode, Alice, and Blockly. The same order was used as the reference display in the tables presenting the sensitivity analysis results (
Table 6 and
Table 7).
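To make the ranking step concrete, the following minimal Python sketch computes PROMETHEE II net outranking flows for a toy problem. Everything in it is hypothetical: the alternatives A, B, and C, their scores, the three weights, and the choice of the usual (strict) preference function, since this section does not state which preference functions were used. It shows how net flows produce a complete ranking and does not reproduce the reported results.

```python
def promethee_ii(scores, weights, maximize):
    """Net outranking flows with the usual preference function
    (preference 1 if strictly better on a criterion, else 0)."""
    names = list(scores)
    n = len(names)
    phi = {}
    for a in names:
        plus = minus = 0.0
        for b in names:
            if a == b:
                continue
            for j, w in enumerate(weights):
                d = scores[a][j] - scores[b][j]
                if not maximize[j]:
                    d = -d  # lower values are better on minimized criteria
                if d > 0:
                    plus += w   # contributes to the positive flow of a
                elif d < 0:
                    minus += w  # contributes to the negative flow of a
        phi[a] = (plus - minus) / (n - 1)
    return phi


# Hypothetical data: three alternatives scored on three criteria.
scores = {"A": [5, 2, 4], "B": [4, 3, 5], "C": [3, 1, 3]}
weights = [0.5, 0.3, 0.2]
maximize = [True, False, True]  # the second criterion is minimized

phi = promethee_ii(scores, weights, maximize)
ranking = sorted(phi, key=phi.get, reverse=True)
print(ranking)  # prints ['A', 'B', 'C']
```

Re-running the function with a perturbed weight vector and comparing the resulting `ranking` with the baseline is the basic operation behind a weight-based sensitivity analysis.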
The results in
Table 6 indicate a relatively stable and robust ranking across variations in criterion weights. The greatest stability was observed for the criteria visual accessibility and interactional simplicity under negative weight variation (−10%), and for the criterion pedagogical support under positive variations (+10% and +20%), where the ranking of alternatives did not change relative to the baseline scenario. This indicates that, within the tested range, weight changes in these criteria have limited influence on the overall ranking structure.
On the other hand, the cognitive complexity criterion emerged as the most influential for model stability, since variations in its weight lead to the largest changes in the ranking of alternatives. To a lesser extent, the interactional simplicity criterion also showed changes, with shifts in positions among the lower-ranked alternatives.
The results of the sensitivity analysis indicate that the top three alternatives (Scratch, Tynker, and MakeCode) remain stable across most scenarios analyzed. In contrast, ranking changes primarily affect the last two alternatives (Alice and Blockly). These results confirm the high stability of the top-ranked alternatives, while the lower-ranked alternatives are more sensitive to changes in criterion weights.
Table 7 compares the baseline ranking of alternatives with the rankings obtained across scenarios with varying criterion weights. To assess ranking stability, a five-point sensitivity scale (1–5) was used; the scale was defined in this study to operationalize the results of the PROMETHEE-based sensitivity analysis [
45].
The results in
Table 7 summarize the stability of alternative rankings across the sensitivity analysis and indicate varying levels of robustness among software solutions in response to changes in criterion weights. Scratch is the top-ranked alternative but shows sensitivity in scenarios with increased weighting of the cognitive complexity criterion, shifting as low as fourth place, indicating moderate sensitivity.
Tynker appears as a stable alternative with low sensitivity. Its position mainly alternates between first and second place, indicating balanced results in relation to the most influential criteria. MakeCode demonstrates low-to-moderate sensitivity and consistently ranks in the middle. Minor fluctuations between second and third place occur mainly when the weighting of the cognitive complexity criterion is more pronounced, indicating a relatively balanced profile.
Alice demonstrates high sensitivity, as its position changes significantly with shifts in the criterion weights. In scenarios with an increased weight for cognitive complexity and a reduced weight for inclusive criteria, Alice shifts toward the middle of the ranking list, whereas in other scenarios it remains in the lower positions.
Blockly demonstrates low-to-moderate sensitivity and remains in last place in most scenarios. Although it occasionally switches positions with Alice, its overall ranking stays at the bottom, indicating limited competitiveness relative to the defined evaluation criteria.
The sensitivity analysis demonstrates that the ranking of alternatives remains relatively stable under the considered weight variations. Although cognitive complexity produces the most pronounced ranking changes, the three highest-ranked alternatives generally preserve their leading positions. These findings indicate that the proposed ROC–PROMETHEE II framework is robust to moderate preference shifts within the most influential evaluation criteria. Future research may investigate more extreme weight modifications and additional scenario configurations to further evaluate the robustness of the highest-ranked alternatives under stronger preference shifts.
5. Discussion
The research results showed that criteria related to accessibility and cognitive load reduction play a central role in evaluating and selecting software solutions for visual programming intended for children with special educational needs. The highest weights were assigned to visual accessibility and cognitive complexity, indicating that, when choosing software solutions, the greatest importance is placed on intuitive, clearly structured environments that facilitate content understanding and user interaction with the software. These results are consistent with research showing that block-based programming environments reduce cognitive load and make it easier for beginners and students with diverse educational needs to master fundamental programming concepts [
35,
64].
High-ranking criteria for pedagogical support and interactional simplicity confirm that the quality of educational software depends not only on technical capabilities but also on its ability to support the learning process, motivation, and students’ autonomous activity. A clear interface structure, a gradual introduction to programming concepts, and visually guided interaction can significantly affect engagement and learning efficiency among children with special educational needs. The results are consistent with research in inclusive digital education, which emphasizes the importance of adaptive and interactive environments for students with diverse educational needs [
65,
66].
In the PROMETHEE II ranking, Scratch stood out as the highest-rated alternative, achieving the most favorable balance across the criteria analyzed. Its dominant position primarily stems from high scores on the criteria most influential in the final ranking, particularly visual accessibility, cognitive complexity, pedagogical support, and interactional simplicity. These results are consistent with other studies that recognize Scratch as one of the most accessible environments for introductory programming and for developing computational logic in children [
67,
68]. Its intuitive drag-and-drop approach, visual organization of programming elements, and wide availability of educational resources make it especially suitable for an inclusive educational environment [
67].
Tynker ranked second and demonstrated stable performance across most criteria. Its strengths are particularly pronounced in multisensory learning and pedagogical support, positively affecting student motivation and engagement during the learning process. MakeCode displayed a balanced performance profile and a relatively high level of stability across various sensitivity analyses. Its strengths primarily relate to technical accessibility and compatibility with the educational environment, while somewhat lower scores in inclusiveness and pedagogical support limit its overall ranking. The stability of this alternative suggests it could be a reliable solution in educational settings where evaluation priorities vary by the needs of teachers, students, and institutions. In contrast, Alice and Blockly achieved lower results than the leading alternatives, especially in the criteria related to accessibility, inclusiveness, and cognitive simplicity. Although Alice enables more advanced visualizations and more complex programming concepts, the increased complexity of the interaction can be a limitation for students who require additional educational support.
The results of the GAIA analysis showed high consistency across most criteria, indicating a relatively coherent decision-making model and low conflict among the dominant evaluation criteria. The criteria of cognitive complexity and software solution cost stood out because their vectors pointed in different directions, indicating a conflicting relationship with the criteria related to the software solutions’ pedagogical and inclusive features. This distribution confirms that lower costs or higher functional complexity do not necessarily lead to greater educational effectiveness, especially when working with children with special educational needs.
Sensitivity analysis showed that the proposed ROC–PROMETHEE II framework was stable across most scenarios. The largest changes in the ranking of alternatives occurred when the weight of the cognitive complexity criterion changed, confirming its significant impact on the final evaluation results. The stability of the top alternatives across scenarios with varying criterion weights confirms the reliability and practical applicability of the proposed hybrid MCDM framework for evaluating and selecting educational software solutions for children with special educational needs.
6. Conclusions
This study presents a hybrid multi-criteria decision-making (MCDM) framework for evaluating, ranking, and selecting visual programming software solutions for children with special educational needs. Drawing on the specificities of the educational context and the needs of the target population, a set of criteria tailored to the research domain was defined. It was observed that no unified, standardized framework for this type of evaluation exists in the current literature.
The research used a structured questionnaire administered to all schools in Serbia that educate children with special educational needs. The resulting dataset provided the empirical basis for determining criterion importance and constructing the weighting structure used within the decision model. The ROC method was used to determine criterion weights, and the PROMETHEE II method was used to rank the alternatives. Additional GAIA and sensitivity analyses were conducted to assess the stability and interpretability of the results.
Regarding RQ1, the results indicate that visual accessibility, cognitive complexity, pedagogical support, and interactional simplicity are the most significant criteria in decision-making. The results confirm the dominant role of educational-cognitive factors over technical and economic aspects.
Regarding RQ2, the ROC–PROMETHEE II model produced a clear ranking of the alternatives, with Scratch in first place, followed by Tynker and MakeCode, while Alice and Blockly ranked lower. The results indicate that software solutions with a simpler interface and stronger pedagogical support perform better under the defined criteria.
For RQ3, the sensitivity analysis and GAIA projection confirm the ranking’s robustness to changes in criterion weights (±10% and +20%). Although the preference flows change numerically, the ranking structure remains stable, especially for the top three alternatives (Scratch, Tynker, and MakeCode). Most changes occur among the lower-ranked alternatives (Alice and Blockly), indicating greater sensitivity in these solutions. The analysis shows that the model is most sensitive to changes in the cognitive complexity criterion, while the other criteria have less impact on the final ranking.
The work’s contribution is an integrated ROC–PROMETHEE II framework for the multicriteria evaluation, ranking, and selection of visual programming software solutions in an inclusive educational context, based on empirically collected data. The approach enables structured, transparent decision-making by combining objective determination of criterion weights with the PROMETHEE II method for ranking alternatives. Another contribution is the application of the GAIA projection and sensitivity analysis, which complement the final ranking by assessing the stability of the results under changes in input parameters. In this way, the methodological framework enables not only the selection of the best alternative but also insight into the robustness of solutions across different decision-making scenarios.
The proposed framework has practical value, as it can support decision-making in educational institutions when selecting appropriate educational tools, thereby improving the quality of teaching and the inclusiveness of education. The study has several limitations, including the number of alternatives considered, the use of a predefined set of criteria, the limited direct experience of respondents with multiple visual programming environments, the heterogeneous nature of the target population, and the limited representativeness of the study sample. Future research may expand the set of criteria, include a larger number of software solutions, and apply additional multi-criteria methods for comparative analysis and further validation of the results’ robustness. It may also incorporate qualitative data collected through interviews or focus groups with teachers, professional associates, and software developers, examine the practical application of the proposed framework in educational settings, and explore the long-term effects of visual programming environments on the development of children with special educational needs.