Analysis of Word Problems in Primary Education Mathematics Textbooks in Spain

A textbook constitutes the hegemonic material of the educational institution. It acts as a mediator between the official curriculum and the educational practice. Given its potential influence in the classroom, this study analyzes the treatment of word problems included in the mathematics textbooks published by the publishing houses with the greatest diffusion in Spain at every primary education grade. Three variables were analyzed: their semantic structure, their degree of challenge, and their situational context. The results indicate that most of the problems included in textbooks are characterized by low complexity and variability regarding their semantic structure. They are also characterized by a limited degree of challenge and by being presented in highly standardized situational contexts. Likewise, it is found that there is no evolution in the treatment of these problems with respect to previous studies carried out in the Spanish context. Therefore, it is concluded that the mathematics textbooks currently used in schools are not effective tools to address the process of teaching-learning problem solving.


Introduction
According to Reference [1], the relevance of curricular materials lies in being an inherent part of school practice, in such a way that it would be unthinkable to carry out any activity without the support of an educational material. To this instrumental function, we should also add a significance, since the curricular materials do not have a neutral character. On the contrary, they reveal a certain vision of education and the teaching function, based on a pedagogical theory or model. In this sense, from technical rationality, highly structured and standardized materials that pose few difficulties for teachers are advocated. In this way, the function of teachers is limited to assuming and reproducing pre-prepared materials mechanically. The maximum exponent of this logic is the textbook, which has traditionally occupied a hegemonic role in the classrooms of many countries [2][3][4][5][6].
The mathematics textbook is no exception. It can be firmly stated that this material resource has a dominant character in the teaching-learning process of mathematics, both nationally and internationally [7][8][9]. Mathematics textbooks largely determine what teachers teach and, consequently, what students learn, since their role is frequently even more decisive than the prescriptions of the official curriculum [10][11][12]. In fact, different studies indicate that the behavior of teachers is, in general, very consistent with the contents, structure, and methodological approach taken by mathematics textbooks [13][14][15]. For this reason, publishers become the most decisive agent when determining the real curriculum, which is established based on the pedagogical beliefs of a certain author or group of authors [16].
Reference [17] points out that, because of this influence and its effects on educational practice, many researchers have studied the treatment of different mathematical contents in textbooks, among them problems and their solving process. Accordingly, the main aim of this study is to analyze the treatment of verbal arithmetic problems (which we will refer to in this article as "word problems") of additive structure in primary education mathematics textbooks, taking into account their semantic structure, their degree of challenge, and the situational context in which the problem appears. On the one hand, this analysis will allow us to present a general panorama of the current state of the matter. On the other hand, taking into account the pioneering study carried out by Reference [18] in Spain, it will offer us the possibility of verifying to what extent there has been a change or an improvement in the treatment of this type of problem since the entry into force of a modification of the national educational law in 2013 [19] until the current moment, in which we are witnessing a new legislative change.
Specifically, we aim to answer the following questions. First, since the semantic structure of the problems determines their degree of difficulty, what is the frequency and variability of the semantic structures of the problems included in textbooks? Second, what proportion of problems present some kind of challenge beyond the choice and the execution of the correct algorithm to solve the problem, and what is the nature of that challenge? Finally, what proportion of problems appear in a situational context other than the standard one (premises with data and questions), and what type of situational information appears in the problem statement? In short, what do students usually solve in the classroom? What kind of problematic situations do they face throughout primary education? What type of educational practices are mathematics textbooks currently promoting?
To achieve these objectives, we present, first, the previous studies carried out on the analysis of word problems in Spanish textbooks. Second, we describe the procedure carried out by the different coding systems used. The results are provided below based on the research objectives. Finally, the discussion of the results, the limitations and the educational implications that can be extracted from this work are raised.

Previous Studies
In Spain, the first study that analyzed the problems presented in textbooks was carried out with textbooks published between 1999 and 2001 [18], within the legislative framework of Reference [20]. The results of that study showed that the textbooks of the three publishers analyzed (Santillana, Anaya, and S.M) offered a very similar panorama, characterized by a scarce variety of subtypes of problems and a high frequency of consistent problems that did not require advanced conceptual knowledge or sophisticated solving strategies. Regarding the second variable analyzed, the findings indicated that only a small proportion of the problems involved some type of challenge. Finally, the results referring to the situational context variable showed that the problems presented by the textbooks appeared in highly stereotyped contexts with very little or even no situational information to help students solve them.
A second study, published one year after the promulgation of the modification of the national educational law in 2013 [19], was carried out by Reference [21], although the problems analyzed were those included in a textbook of the Santillana publishing house, which was published under the normative framework of the national educational law [22]. The first objective of this work was to characterize the degree of authenticity of the problems that students usually solve in the classroom; that is, to analyze the possible connections between the problems presented by the textbook and the problematic situations that students face in their daily life. The second objective was to analyze some of the variables studied by Reference [18] in their pioneering study, in order to check whether the panorama described by these authors had changed. The first conclusion of this study was that the problems analyzed were distant from the real life of the students (only 3.5% of the problems could be considered authentic problem situations). The second conclusion was the low frequency of word problems with inconsistent additive structure (which are more difficult to solve) and the scarce variability of word problems with additive and multiplicative structure (only 1% of the additive structure problems and 0.51% of the multiplicative structure problems were challenging). Consequently, this second study corroborated the panorama described in the previous study by Reference [18].
The most recent study in Spain was conducted by Reference [23], with two series of textbooks (Santillana and S.M) published in 2009 and 2010. Therefore, this work is also framed within the normative framework of Reference [22]. The aim of these authors was to analyze the word problems of additive and multiplicative structure in order to determine their degree of complexity, both at a procedural level (number of steps necessary to solve the problem) and at a semantic-mathematical level (structure of the problem). Likewise, the authors set out to update the study developed by Reference [18]. The results revealed that most of the word problems presented by these publishers had low procedural complexity and low semantic-mathematical complexity. Regarding the second objective, the authors conclude that there is no evolution with respect to the panorama presented in the study of Reference [18], and they add that "the books seem to be oblivious to the successive educational reforms carried out in our country" [23]. Furthermore, according to the authors, carrying out educational reforms that avoid the issues closest to educational practice (in this case, the issue of textbooks) can limit student learning, since textbooks are the curricular materials that really define what students learn.

Materials
The study was carried out with primary education textbooks from three publishing projects: Santillana Group ("Knowing how to do"), Anaya Group ("Learning is growing"), and S.M Edition ("Savia"), which were published between 2014 and 2015 with the entry into force of Reference [19]. Students have used those textbooks until now. The publishers were chosen for two main reasons. First, they are three of the publishers with the greatest diffusion in Spanish schools. Second, since they are the same publishers used in the study of Reference [18], this analysis allows us to trace the evolution of our object of study after more than a decade. The total number of problems analyzed was 1900.

Semantic Structure of the Problems
To classify the simple word problems (those that are solved through a single operation) according to their semantic structure, the categorization system established by Reference [24] was followed, which distinguishes twenty categories and subcategories of change (CHAN), combination (COMB), and comparison (COMP) problems. Likewise, the equalization category (EQUA), which was subsequently proposed by Reference [25], was also taken into account.
Change categories:
• Change 1: It starts with an initial amount, which is increased by an action of adding. The question refers to the final set. Example: Juan had 5 marbles. In one game, he won 3 marbles. How many marbles does Juan have now?
• Change 2: It starts from an initial amount, which undergoes a decrease. The question refers to the final set. Example: Juan had 8 marbles. In one game, he lost 3 marbles. How many marbles does Juan have now?
• Change 3: It starts from an initial amount, which undergoes a change of unknown quantity, and which results in a known final set greater than the initial set. The question refers to the change set. Example: Juan had 5 marbles. In one game, he won some marbles. Now Juan has 8 marbles. How many marbles did Juan win?
• Change 4: It starts with an initial amount that undergoes a change of unknown quantity, which results in a known quantity that is less than the initial amount. The question refers to the change set. Example: Juan had 8 marbles. In one game, he lost some marbles. Juan has 5 marbles now. How many marbles did Juan lose?
• Change 5: It starts with an unknown initial amount, which is increased by a set of known quantity, and which results in another known quantity. The question refers to the initial set. Example: Juan had some marbles. In one game, he won 3 marbles. Juan has 8 marbles now. How many marbles did Juan have at the beginning?
• Change 6: It starts from an unknown initial quantity, which is decreased by a set of known quantity, and which results in another known quantity. The question refers to the initial set. Example: Juan had some marbles. In one game, he lost 3 marbles. Juan has 5 marbles now. How many marbles did Juan have at the beginning?
Combination categories:
• Combination 1: The two parts come together to form a whole. Example: Juan has 3 marbles. Peter has 5 marbles. How many marbles do they have between the two of them?
• Combination 2: The whole and one of the parts are known. The problem asks about the other part. Example: Juan and Peter have 8 marbles between them. Juan has 3 marbles (or Peter has 5 marbles). How many marbles does Peter (or Juan) have?
Comparison categories:
• Comparison 1: The reference set and the comparison set are known. The question refers to the difference set in terms of "how many more" elements the compared set has with respect to the referent. Example: Juan has 8 marbles. Peter has 5 marbles. How many more marbles does Juan have than Peter?
• Comparison 2: The reference set and the comparison set are also known. The question refers to the difference set, but in terms of "how many fewer" elements the compared set has with respect to the reference set. Example: Juan has 8 marbles. Peter has 5 marbles. How many fewer marbles does Peter have than Juan?
• Comparison 3: The reference set and the difference with respect to the compared set are known, indicating "how many more" it has. The problem asks about the compared set. Example: Peter has 5 marbles. Juan has 3 more marbles than Peter. How many marbles does Juan have?
• Comparison 4: The reference set and the difference with respect to the compared set are known, indicating "how many fewer" elements it has. The problem asks about the compared set. Example: Juan has 8 marbles. Peter has 3 fewer marbles than Juan. How many marbles does Peter have?
• Comparison 5: The compared set and the difference set are known, noting how many "more" elements the reference set has. The problem asks about that reference set. Example: Juan has 8 marbles. Juan has 3 more marbles than Peter. How many marbles does Peter have?
• Comparison 6: The compared set is known. The difference set, expressed in terms of how many "fewer" elements the compared set has with respect to the reference set, is also known. The problem asks about that reference set. Example: Peter has 5 marbles. Peter has 3 fewer marbles than Juan. How many marbles does Juan have?
Equalization category:
• Equalization 1: The largest and the smallest set are known, and the difference is asked in terms of how much must be added to the comparison set to equalize the two sets. Example: Juan has 8 marbles. Peter has 5 marbles. How many marbles must be given to Peter so that he has as many marbles as Juan?
• Equalization 2: The largest and the compared set are also known, and the difference is asked in terms of how much must be removed from the largest in order to make the two sets equal. Example: Juan has 8 marbles. Peter has 5 marbles. How many marbles must be taken from Juan so that he has as many marbles as Peter?
Moreover, to determine the degree of difficulty of the problems according to their semantic structure, the consistency hypothesis proposed by Reference [26] was taken into account. These authors established a dichotomous classification to categorize the additive structure word problems, based on the relationship between the surface structure of the problem and the algorithm necessary to solve it. The surface structure of problems can be expressed in consistent or inconsistent language. Thus, canonical problems or problems expressed in consistent language are easier to solve than non-canonical or inconsistent problems. Consistent problems are easier because the surface structure of the problem is coherent with the arithmetic operation with which it is solved. For instance:
Juan has 3 marbles. In a game, he wins 5 marbles. How many marbles does Juan have now? 3 + 5 = 8 (to win = to add).
Juan has 8 marbles. In a game, he loses 5 marbles. How many marbles does Juan have now? 8 − 5 = 3 (to lose = to subtract).
However, in inconsistent problems, the "keyword" indicates the opposite operation to the one that must be applied. That is, terms such as "win" appear in inconsistent (non-coherent) problems that require a subtraction to be solved; or, conversely, terms such as "lose" appear in problems in which the solver must add. For instance:
Juan has some marbles. In a game, he wins 5 marbles. Juan has 8 marbles now. How many marbles did he have? 8 − 5 = 3 (to win = to subtract).
Juan has some marbles. In a game, he loses 5 marbles. Juan has 3 marbles now. How many marbles did he have? 5 + 3 = 8 (to lose = to add).
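The keyword logic behind the four marble examples above can be sketched in a few lines of Python. This is only an illustration of the consistency check as described here, not the coding procedure used in the study; the `KEYWORD_CUE` table and the `is_consistent` helper are hypothetical names introduced for this sketch.

```python
# Operation that each keyword cues on the surface of the statement.
# Assumption for illustration: only the two keywords used in the examples.
KEYWORD_CUE = {"win": "+", "lose": "-"}


def is_consistent(keyword, required_operation):
    """Return True when the cued operation matches the one actually needed."""
    return KEYWORD_CUE[keyword] == required_operation


# The four marble examples: (keyword, operation needed, calculation).
examples = [
    ("win", "+", "3 + 5 = 8"),
    ("lose", "-", "8 - 5 = 3"),
    ("win", "-", "8 - 5 = 3"),
    ("lose", "+", "5 + 3 = 8"),
]

for keyword, operation, calculation in examples:
    label = "consistent" if is_consistent(keyword, operation) else "inconsistent"
    print(f"{keyword!r} with {calculation}: {label}")
```

The first two examples come out as consistent and the last two as inconsistent, which mirrors the distinction drawn in the text.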
For the classification of compound word problems with additive structure (those that require more than one operation), the categorization system created by Reference [18] in their pioneering study was used, which distinguishes the following categories:

• Category A: problems that combine the structure of change with the structure of combination, with the main structure being the structure of change. Example: Sergio had 150 euros. On his birthday his father gave him 35 euros and his mother 46 euros. How much money does Sergio have now?
• Category B: the change structure is repeated successively. Example: 56 people were traveling on a bus. At the first stop, 16 people got off and at the second stop, 12 people got on. How many people are traveling on the bus now?
• Category C: the main structure is of comparison 1 or 2, and the major or minor set, or both, are obtained from combination. Example: Luis has an album with 750 stickers and another album with 380 stickers. Susana has an album with 560 stickers. How many more stickers does Luis have than Susana?
• Category D: the comparison structure is repeated successively (two, three, or more times). Example: Alfredo has 26 marbles. Ramón has 7 fewer marbles than Alfredo and Rosa has 9 more marbles than Ramón. How many marbles does Rosa have?
• Category E: this category is similar to the previous one, but it is combined with combination structure 1, which acts as the main structure. In this case, one or more of the "parts" are given by comparison. Example: There are 154 strawberry candies, 27 more orange candies than strawberry and 19 more lemon candies than orange in a bag. How many candies are there in total?
• Category F: the main structure is combination 1 and one or more parts are obtained from the change structure. Example: Roberto bought a shirt and a sweater. The shirt cost 46 euros and the sweater cost 37 euros. He was given a discount of 9 euros on each garment. How much did Roberto spend on the purchase of the two garments?
• Category G: the main category is combination 2, and the "all" set is obtained from change 3 or 4. Example: A mounting kit has 130 pieces. To make a boat, Peter has used 45 large pieces and the rest small, and he has 18 pieces left over. How many small pieces did Peter use to make the boat?
• Category H: the main structure is equalization 1, and the minor set is obtained from a combination 1. Example: Carlos and Alba are making a puzzle of 5800 pieces. Carlos has already placed 1214 pieces and Alba has placed 897 pieces. How many pieces do they need to finish the puzzle?
• Category I: the main structure is combination 1, obtaining one of the parts from combination 2. This is a special case, since it needs to be accompanied by a multiplicative structure; otherwise, the calculation of the part from combination 2 would be irrelevant. Example: A liter bottle of tomato juice weighs 1350 g. An empty bottle of that juice weighs 385 g. The empty 5-L bottle of tomato juice weighs 675 g. How much does the full bottle weigh?
Finally, as in the study of Reference [18], the multiplicative structure problems (multiplication and division) were not coded. However, problems with mixed structures, that is, problems in which additive and multiplicative structures are combined, were included in the analysis. For example: "Fabiana works in a bookstore. Every day, in the morning, she sends 9 emails with the new orders and, in the afternoon, she sends another 6 new emails. How many emails does she send in total from Monday to Friday?" (Santillana, 3rd grade). These problems were coded in the category of the corresponding additive structure part; in the case of the example, it is the combination 1 category.

Degree of Challenge of Problems
With regard to the degree of challenge variable, the word problems were classified into two categories, "problem posing" and "information", as was done in previous studies in Spain [18] and other countries [12,17,27,28].
Within the first category, two subcategories were distinguished. On the one hand, the subcategory of total problem posing, in which students are asked to create a complete problem statement, for example: "Write a similar problem to the ones on this page that can be solved by representing the data graphically" (Santillana, 5th grade). On the other hand, the subcategory of partial problem posing, in which students must complete a statement with the question or some other information, for instance: "In a hotel they are going to host 560 tourists today. There are already 325 installed since yesterday and 136 have arrived this morning. The rest will arrive in the afternoon." (S.M, 3rd grade). In the information category, a distinction was made between problems with irrelevant or superfluous information (extra data) that is not necessary to solve the problem (for example: "Vicente has 49 sheets, Leire has 46 and Marina has 15 fewer sheets than Vicente. How many sheets does Marina have?" Santillana, 1st grade) and problems with missing or omitted information (less data) (for instance: "Miguel bought a backpack that cost 15 euros and a folder. How much did Miguel spend in total?" Santillana, 2nd grade).

Situational Context
The third variable analyzed was the situational context in which the problem appears, considered relevant for helping students to understand the statement of the problem and, therefore, to solve it. The study of the situational context is based on the double nature of every problem: conceptual or mathematical, on the one hand, and textual, on the other [29][30][31]. Thus, underlying every word problem is a mathematical equation with numerical data, which will be solved by applying one or more operations (mathematical nature). Nevertheless, in order to solve the problem, the first step must necessarily be the reading of its statement (textual nature).
This textual character, which takes shape in the situational context, has been analyzed in textbook problems by different authors [32][33][34][35], although with different procedures and classification systems.
In Spain, Reference [18] operationalized the study of this variable using Reusser's Situation Problem Solver as a reference [36,37]. According to the model, which comes from the field of study of text comprehension, the difficulties that students present when solving a problem are not due exclusively to aspects of a mathematical nature, but also to a lack of understanding of certain linguistic expressions. Above all, however, their difficulties are due to factors closely linked to the situational context in which the statement appears: agents, events, goals and intentions, causal and temporal chains, etc. Thus, based on the preceding study of Reference [18] and on the model created by References [36,37], the following categories of analysis were used in this work:
• Possible combinations of the above information: action + intention; cause + action; action + description; or intention + action: "To celebrate her birthday, Gemma is spending the day with her friends (...)" (Santillana, 2nd grade).

Reliability
To ensure that the problem coding process was sufficiently reliable, an inter-judge reliability procedure was conducted.
The second author of the study carried out the coding of all the problems in all the variables that were analyzed. Subsequently, the first author coded 120 problems that were randomly selected from the set of problems included in the unit of analysis (10% of the total). Next, using the SPSS 27 statistical package, Cohen's Kappa index was calculated to determine the degree of agreement. This index takes into account the degree of agreement between judges and the degree of agreement that can be attributed to chance, thus providing a more reliable indicator than just the percentage of agreement. The results are shown in Table 1.
Subsequently, four Doctors of Education and Educational Psychology carried out the following tasks: (a) to code ten problems according to their semantic structure; (b) to determine the degree of challenge of five problems; and (c) to indicate the type of situational information included in the statement of five problems. In order to cover a greater number of problems, the coding carried out by each collaborator was different. Thus, 40 problems were coded in the first task, and 20 different problems were coded in the second and third tasks.
Finally, Cohen's Kappa index was calculated again as an indicator of the reliability of the four judges. The results of this agreement are shown in Table 2.
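For readers unfamiliar with the index, the following Python sketch shows how Cohen's Kappa corrects the observed agreement between two judges for the agreement expected by chance. It is a minimal illustration, not the SPSS procedure used in the study; the function name `cohen_kappa` and the two lists of codings are hypothetical and do not reproduce study data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each assign one label per item."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: sum over labels of the product of marginal proportions.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one and the same label for every item
    return (p_o - p_e) / (1 - p_e)


# Hypothetical codings of ten problems (illustrative only, not study data).
judge_1 = ["CHAN", "CHAN", "COMB", "COMB", "COMP",
           "COMB", "CHAN", "EQUA", "COMB", "COMP"]
judge_2 = ["CHAN", "COMB", "COMB", "COMB", "COMP",
           "COMB", "CHAN", "EQUA", "COMB", "CHAN"]

kappa = cohen_kappa(judge_1, judge_2)
print(f"kappa = {kappa:.3f}")
```

In this made-up case the judges agree on 8 of 10 problems (80%), but because part of that agreement could arise by chance, the kappa value is lower than the raw percentage of agreement.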

Analysis of the Semantic Structure
In total, 1900 problems were analyzed. Among the different categories of simple word problems, the most frequent were combination (40.0%) and change (31.4%). The comparison category was much less frequent (13.2%), and the equalization category (0.90%) was practically non-existent (Table 3).
Regarding the category of change, the highest proportion of the problems is concentrated in the subcategories of change 2 (21.21%) and change 1 (10.05%), both of which are consistent. The rest of the subcategories of change barely appeared in the analysis. Regarding the combination category, the subtype of combination 1 (consistent) was the most frequent of the entire sample (33.89%), while the subtype of combination 2 (inconsistent) represented a very low rate (6.15%). With regard to the comparison category, the most numerous subcategory was comparison 1 (7.10%), of an inconsistent nature. The rest of the comparison subcategories had minimal or no presence. Finally, the equalization problems presented the lowest rates, both with respect to frequency and variability, since, of the six subcategories, the publishers included only equalization 1 (0.68%) and equalization 2 (0.22%), with minimal percentages.
To sum up, according to the consistency hypothesis proposed by Reference [26], of the total of simple problems analyzed (1628), 85.3% were consistent (easier to solve), compared to 14.6% that turned out to be inconsistent.
A relevant result refers to the difference in the frequency of simple problems (85.5%), compared to the compound ones, which appeared in a much lower proportion (14.5%). In addition, another remarkable result was, as in the simple problems, their low variability. Although it is true that, of the eleven categories of compound problems, the three publishers include eight of them, the proportions were so small that they lead us to conclude that this variability is only apparent. Actually, most of the compound problems are concentrated in category A (8.80%) and B (2.40%). The rest of the categories show values that range between 0.05% and 1.10%.
Regarding the analysis by publishers, as shown in Table 4, Santillana included the largest number of problems in its textbooks, a total of 988 problems (52%), roughly double the number included by each of the other two publishers: S.M included 487 (25.6%), and Anaya included 425 (22.3%). However, results showed a higher proportion of consistent problems in the three publishers, with hardly any notable differences between them: S.M (77.4%), Anaya (74.1%), and Santillana (67%). In the three publishers, the most frequent problems were those of combination.
The characterization of the compound problems was similar to that of the simple ones in terms of the low frequency (Santillana, 17%; Anaya, 14.8%; S.M, 9.2%) and the scarce variability: of the eleven categories, most of the compound problems were concentrated in categories A and B.
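As a quick arithmetic check, the publisher shares can be recomputed from the problem counts reported in the text. The short Python sketch below does only that; the variable names are illustrative, and the last digit of a recomputed percentage may differ slightly from the reported one depending on how the values were rounded.

```python
# Problem counts per publisher, as reported in the text.
counts = {"Santillana": 988, "S.M": 487, "Anaya": 425}
total = sum(counts.values())


def share(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100 * part / whole


shares = {publisher: share(n, total) for publisher, n in counts.items()}

for publisher, pct in shares.items():
    print(f"{publisher}: {counts[publisher]} problems, {pct:.2f}% of {total}")
```

The three counts add up to the 1900 problems analyzed, which confirms that the publisher breakdown covers the whole sample.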
On another note, the comparative analysis of this work (2021) with previous studies (see Table 5) shows that, in the four compared studies, the order of frequency of the different categories of simple word problems is the same (combination, change, comparison, and equalization). Furthermore, in all the studies, the subcategory of combination 1 leads the general category of combination to present the highest percentages of the entire sample of problems, since the subcategory of combination 2, whose nature is inconsistent, appears in a much lower proportion, especially in our study.
Regarding the category of change, the highest proportion of the problems is concentrated in the subcategories of change 1 and 2 (both of them consistent) in the four studies. Likewise, except in the study conducted by Reference [23], where the data do not allow this distinction to be made, subcategory change 2 occurs with a considerably higher frequency than subcategory change 1. However, the most important result with respect to this category is probably that, in the four studies, both change subcategories 3 and 4 (medium difficulty) and change subcategories 5 and 6 (high difficulty) are almost non-existent, so that the variability of this category of problems is reduced to the easiest problems to solve (change 1 and 2).
In the comparison category, the four studies also show similar results. In addition, the same effect observed in the category of change appears again. That is, the largest proportion of problems is concentrated in the comparison subcategories 1 and 2, while the rest of the comparison subcategories, in which difficulty is medium or high, are presented with a very low frequency. Finally, in the four studies, the category of equalization problems is also almost non-existent. Of the six subcategories of problems of this type, only equalization problems 1, 2, and 3 appear, in a very low proportion.
Regarding the frequency of compound problems, in the study of Reference [18], a total of 1749 word problems were analyzed (87.42% simple and 12.52% compound). In our study, 1900 word problems were analyzed (85.50% simple and 14.50% compound). Therefore, the results regarding the frequency of these two categories of problems in the textbooks are very similar. From the results on the variability of the different categories of compound problems (see Table 6), it is evident that, in both studies, this variability is only apparent since, although all the categories of problems are represented, the percentages are so low that the presence of each of them is almost non-existent. Actually, the presence of compound problems is concentrated in categories A and B, which also have very low percentages.
Table 6. Comparative analysis of the frequency and variability of compound word problems with additive structure.

Analysis of the Degree of Challenge
As reflected in Table 7, the number of problems that present some type of challenge is very low. Of the 1900 problems analyzed, only 299 problems (15.7%) propose some type of task that goes beyond the choice and application of the operation. Santillana is the publisher with the highest frequency of problems of this type (55.8%), followed by S.M (34.4%) and Anaya (9.6%). However, Santillana is also the publisher that most frequently reveals the challenge of the problem to the student prior to its solving.
The analysis of the results by categories shows that the most frequent type of challenge is partial problem posing (easier than total problem posing): Santillana (35.1%) and S.M (21%). However, in Anaya, this category does not appear, and total problem posing is scarcely represented (3.6%).
In the information category, there are no notable differences between the three publishers. Both the categories of superfluous information and of omitted information present negligible results that range between 0.6% and 6.3%. Therefore, the features that characterize this type of task are its low frequency and low variability.

Analysis of the Situational Context
The first relevant result refers to the low proportion of problems in which situational information has been included (see Table 8). Of the total problems analyzed, only 317 (16.6%) are presented in situational contexts enriched with qualitative information. Therefore, the vast majority of problems follow the prototypical pattern of standard problems.
Regarding the frequency and variability of the different categories, the most notable result is the absence of completely rewritten problems. Another relevant result is the considerable difference between the category of actions detached from any type of situational information (54.5%) and the rest of the categories: intentions, purposes, goals (19.5%); temporal structures (1%); causal (12%); and those that combine actions and causes (0.6%).
The analysis of this variable by publishers indicates that Santillana has the highest proportion of problems in which situational information of some kind has been included (52.7%), followed by SM (28.8%) and Anaya (18.5%). The three publishers follow the same pattern of variability described above: the most frequent category in all of them is actions (Santillana, 56%; Anaya, 52.2%; SM, 51.6%), followed by intentions (SM, 21.3%; Santillana, 19.5%; Anaya, 18.6%), while the rest of the categories show very low percentages.
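The two headline proportions reported in the results can be reproduced with simple arithmetic. The following is only an illustrative sketch: the counts (1900, 299, 317) and the publisher percentages are taken from the text, while the helper name `share` and the variable names are hypothetical and not part of the study.

```python
# Counts and percentages are those reported in the text;
# the helper and variable names are illustrative assumptions.
TOTAL_PROBLEMS = 1900


def share(count: int, total: int = TOTAL_PROBLEMS) -> float:
    """Return `count` as a percentage of `total`."""
    return 100 * count / total


challenge_share = share(299)    # problems that pose some type of challenge
situational_share = share(317)  # problems with enriched situational context

print(f"challenge: {challenge_share:.2f}%")
print(f"situational: {situational_share:.2f}%")

# Publisher shares within each subset, as reported in the text.
challenge_by_publisher = {"Santillana": 55.8, "SM": 34.4, "Anaya": 9.6}
situational_by_publisher = {"Santillana": 52.7, "SM": 28.8, "Anaya": 18.5}

# Each set of shares should add up to roughly 100%.
print(f"challenge shares sum: {sum(challenge_by_publisher.values()):.1f}")
print(f"situational shares sum: {sum(situational_by_publisher.values()):.1f}")
```

Under these figures, the first proportion comes to about 15.7% and the second to about 16.7%; the 16.6% reported in the text is consistent with truncation rather than rounding. The publisher shares add up to 99.8% and 100.0%, which suggests that they are expressed relative to the 299 and 317 problems, respectively, rather than to the full sample.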

Discussion
In this study, the main aim was to analyze the treatment of word problems with additive structure in primary education mathematics textbooks that were published with the entry into force of a modification of the current educational law in Spain in 2013 [19]. On the one hand, this analysis intended to offer a general vision of the current panorama of this issue. On the other hand, more than a decade after the pioneering study carried out in the Spanish context by Reference [18], we aimed to check to what extent there has been a change and/or improvement in the treatment of this type of problem in the mathematics textbooks that are currently used in our educational system. This general objective has been specified in the analysis of the frequency and variability of the semantic structure of word problems, the presence/absence of problems with some degree of challenge, and the situational context in which they are presented. In sum, since textbooks largely determine teaching practice, our purpose has been to know what type of problems students face throughout primary education and which are the educational practices promoted by these materials.
Regarding the semantic structure of the word problems, our findings have shown a panorama characterized by a low variability of the different subcategories of problems and a high frequency of consistent problems, which are the easiest to solve. Therefore, it can be concluded that using superficial solving strategies is enough to solve most of the problems presented in these curricular materials, so that the problem solving process becomes a routine and mechanical task in which the use of reasoning is hardly necessary.
With regard to the degree of underlying challenge, it has been verified that the inclusion of this type of problem is very limited. There are very few problem situations in which the proposed challenge goes beyond the selection of the data and the application of the corresponding operation. The analyzed problems of the different publishers contain only "coincident information", that is, the necessary and sufficient data for their solving. However, as previous research indicates, the inclusion of superfluous or insufficient information in the statement is crucial to develop the ability to solve problems, since it is a way of helping students to consider the context as a relevant element to address the solving [39][40][41]. Consequently, it is not rare that students infer that solving a problem means "doing something with (all) the numbers given in the statement" [18] (p. 444). Neither is it strange that students develop superficial and passive solving strategies, which demand little cognitive effort.
Problem posing tasks for students are also scarce. Moreover, total problem posing, the more challenging category, is much less frequent than partial problem posing. In this regard, it should be noted that the proposals of the different mathematics curricular projects must focus on the solving process, but they must also include the task of problem posing, which should be essential in learning mathematics at every educational level [42,43]. Problem posing by the student is an activity inherent to the problem solving process [44], whose educational value at both the cognitive and the attitudinal level has long been emphasized by many researchers in the field of mathematics education. Specifically, Reference [45] affirms that this task involves a high cognitive demand, promotes the mathematical understanding of the students, fosters their reasoning capacity, and awakens their motivation.
On another note, the analysis of the situational context in which the problems appear reveals that very few problem situations are enriched with qualitative information that contributes to the understanding and solving of the problem by the students. As a result, the problems are presented in highly standardized contexts (very specific premises with data and questions). The most frequent category is actions detached from causal and intentional information (purposes, goals, intentions of the characters), even though, according to the research, such information has positive effects on the problem solving process when it is linked to the mathematical structure of the problem [46].
In short, the word problems presented by the textbooks analyzed in this study are characterized by being mainly consistent problems, with a low variability of the different semantic structures and a limited degree of challenge. Moreover, they are usually presented in highly standardized situational contexts. All these characteristics allow us to conclude, first, that textbooks are not effective tools for approaching the teaching-learning process of problem solving. Likewise, regarding the second research objective, it has been verified that, in the three variables analyzed, the current panorama does not differ from that of the pioneering study carried out in Spain by Reference [18] with textbooks published in the legislative framework of Reference [20]. Nor does it differ from the studies carried out by References [21,23], which also analyzed the semantic structure of problems in textbooks published during the period of validity of Reference [22]. In conclusion, the analyzed textbooks are generally characterized by their formal homogeneity, by their standardization and isomorphism, by their scarce evolution, and by their resistance to change.

Educational Implications
As has been verified in this paper, most of the word problems included in textbooks by publishers are characterized by a low level of semantic complexity, so that the problem situations that students usually face in classrooms are closer to the function of the "exercise" than the function of the "problem". Therefore, a first educational implication that emerges from this study is the need for publishers, when they design and develop mathematics textbooks, to rethink the concept of problem and the meaning and function that problem solving should fulfill at school. This would possibly be a first step, a starting point that would determine the effectiveness of the publisher's curricular proposal. It is difficult for a curricular project to be valid if the function of the problem (which is the backbone of the mathematics curriculum) is subordinated to the function of the exercise. It would also be difficult if the solving process is reduced to the development of superficial strategies, to the detriment of genuine modes of solving, in which the intervention of reasoning is necessary.
Furthermore, following Reference [47], we would like to echo two instructional principles that educational research has highlighted. First, the principle of prolonged commitment, according to which, to improve their problem solving ability, students have to "work on problem tasks on a regular basis, for an extended period of time" [47] (p. 272). Second, the task variety principle, which states that students will improve as problem solvers "only if they are given opportunities to solve a variety of types of problem tasks" [47] (p. 272). According to these two instructional principles, the second educational implication is that, in the design and development of textbooks or any other alternative material, problems should be included more frequently and systematically, compared to other routine tasks. However, it is not only a question of increasing the diet of problems, since this quantitative change, by itself, would not entail an improvement. Rather, a qualitative change is needed that, in line with the second principle formulated by Reference [47], would also consist of increasing the variety of problem situations, involving higher levels of challenge that favor the development of reasoning.
Moreover, if we assume that textbooks, regardless of educational reforms, will continue to reproduce the same school practices, the third educational implication is that changes must necessarily pass through the action of the teacher. That said, we must bear in mind that teaching activity is largely determined by the awareness of the need for change and by the knowledge necessary to carry it out. From our point of view, a teacher's activity can range from what we could call a "minimum perspective" to a "maximum perspective". On the one hand, the first perspective would involve compensating for the limitations of textbooks, using them flexibly as one more resource among other possible ones, and selecting and/or modifying those aspects that may be useful to make a specific publisher proposal beneficial. From this perspective, textbooks would be an aid for the teacher, who will use them "as support for their work in the classroom, but not as a didactic action guide to direct daily work in a prescriptive way" [48] (p. 38). On the other hand, from what we have called a "maximum perspective", the teacher's activity would focus on the elaboration of his or her own curricular proposals. It would go beyond the current traditional disciplinary approach, in which, according to Reference [49], the textbook is usually the only teaching resource used. However, this implies the need to question the role that initial and continuing teacher training plays as a key aspect to achieve change. Consequently, in our opinion, teachers should be provided with criteria to analyze textbooks and to check whether a certain curricular proposal can be effective, so that they can modify or adapt those proposals to the classroom context or, in the best of cases, develop their own proposals.
Finally, the role of the educational administration, whose policies do not contribute at all to changing this reality, could also be questioned. Specifically, the policies of free textbooks through the implementation of the "book bank" system, which is generalized in all the autonomous communities of Spain, naturalize the presence of the textbook in schools as something typical of pedagogical normality; thus, textbooks become the main curricular material.

Limitations and Future Studies
As in the pioneering study conducted by Reference [18], one of the variables analyzed has been the frequency and variability of the different types of semantic structures of word problems with additive structure. Nevertheless, word problems with multiplicative structure have not been analyzed in either of the two studies, even though problems of two or more operations, in which additive structures were combined with multiplicative structures, were taken into account. Subsequent studies carried out in Spain [21,23] analyzed word problems with multiplicative structure included in textbooks published during the period of validity of Reference [22], thus offering a more complete view of the different types of textbook word problems (additive and multiplicative). However, the sample of books used for those analyses was smaller: in the study of Reference [21], the research was carried out with just one publisher (Santillana), while, in the study of Reference [23], two publishers were analyzed (Santillana and S.M). Therefore, a first limitation of our study is that it does not offer a broader vision of the treatment of problems in the textbooks published within the normative framework of Reference [19], since we have not included word problems with multiplicative structure in our analysis. Furthermore, also regarding this variable, our study focuses on the analysis of semantic-mathematical complexity (the semantic structure of word problems), and it does not address the analysis of procedural complexity, which is carried out in the study of Reference [23].
Finally, a future line of research could be the analysis of other publishers that are also relevant in the Spanish publishing market (e.g., Edelvives, Edebé, Vicens-Vives). Such a study would allow us to obtain a broader, and therefore more generalizable, view of the treatment of word problems with additive structure in textbooks.

Conclusions
The main conclusions that we draw from the obtained results are the following. In Spain, mathematics textbooks in primary education lack rigorous planning regarding the number and type of word problems. Consistent word problems of the simplest semantic categories (change and combination) predominate throughout the entire stage, while inconsistent word problems and those of the more complex semantic categories are much less frequent. Word problems usually contain only the data necessary for their solving. There are practically no cases in which textbooks propose problems with more data than those strictly necessary to find the solution, or problems in which data needed to find the solution are missing. In addition, students are almost always asked to solve problems, and rarely to pose or invent them. The problems thus become relatively routine tasks, in which there is always a solution, and it is found by operating exclusively with the numerical information of the statement.
In sum, the analysis of textbooks suggests that word problems are often considered a mere excuse to practice performing operations. Word problems should be one of the axes on which the entire mathematics curriculum is organized, but they are devalued and placed at the service of calculation practice.
The results of this study are similar to those obtained in previous studies also carried out in Spain under different legislative frameworks. This suggests that changes in legislation have little effective impact on the curriculum that is presented in mathematics textbooks.
For this reason, we consider it necessary for the agents in charge of developing textbooks and other types of curricular materials to plan rigorously the type of word problems that they present to students, to give these types of tasks the importance they truly deserve, and to place problem solving at the center of the school mathematics curriculum.