1. Introduction
The 2023 ACS (American Chemical Society) Guidelines for Undergraduate Chemistry Programs state that programs should “use effective pedagogies” and “teach their courses in a challenging, engaging, and inclusive manner that helps improve learning for all students”. In fact, the list of “effective pedagogies” includes “inquiry-based learning” [1] (p. 28), and evidence is building that active learning makes courses more equitable [2,3,4,5,6]. Some evidence indicates that achieving equity in STEM courses may be especially problematic [7,8,9], while other evidence links higher-order thinking in the classroom to gains in learning [10,11] or equity [12,13,14]. Furthermore, studies show that what and how students learn is linked to how they are tested [15,16,17,18,19]. Information on the extent of higher-order thinking in classrooms and assessments is therefore valuable for providing equitable, inclusive learning environments for students.
The United States Military Academy at West Point provides an intensive, immersive undergraduate experience and graduates approximately 1000 lieutenants for the United States Army each year [20]. West Point aspires to have each graduating class reflect the composition of soldiers of the United States, which requires us to admit and graduate students from all 50 states and varied demographics. Teaching effectively and inclusively is essential for us to meet our goals.
Within these goals of efficacy and inclusion, all students are required to complete or validate one semester of general or introductory chemistry. Teaching 500–600 students in General Chemistry I each semester in sections of 16–18 students requires 25–30 sections. The transitory nature of many of our faculty [21] and the large number of sections prompt us to structure General Chemistry I to provide similar content and opportunities across all sections of the course. General Chemistry has common learning objectives for each class meeting, and students take common mid-term and final exams. Generally, learning objectives for the course persist from year to year with minor changes in content and phrasing. Changes to General Chemistry I learning objectives are usually caused by changes in textbooks that use different vocabulary (for example, molar mass instead of molecular weight).
Traditional teaching at West Point emphasizes student preparation before class and practice on learning objectives during class. Students are expected to come to class having read assigned material from the textbook and attempted practice problems. When students have questions based on their work, they note these questions and bring them to the class meeting. Faculty often open class by asking, “What are your questions?” Once students’ questions are answered, they are told to “Take Boards”, and students spend the balance of class working on problems or answering questions on chalkboards around the classroom [20,22].
After a review of the General Chemistry program, the department implemented Process Oriented Guided Inquiry Learning (POGIL) [23,24]. Guided inquiry activities often approach information differently from many traditional textbooks. Having students draw conclusions from data, figures, and text in self-managed teams allowed them to practice leadership, teamwork, and communication skills, all of which are especially relevant to preparing our students to become Army officers. This pedagogical change required assessment, which included investigating how learning objectives and mid-term exams had changed.
Bloom’s revised taxonomy [25] offers a framework to assess the complexity of learning objectives and exam questions. The taxonomy has two dimensions: Cognitive Process and Knowledge. The Cognitive Process dimension organizes learning from simplest to most complex, using the categories Remember, Understand, Apply, Analyze, Evaluate, and Create. The Knowledge dimension sorts learning from concrete to abstract, using the categories Factual, Conceptual, Procedural, and Metacognitive. For a more detailed summary of Bloom’s revised taxonomy, see Asmussen [26].
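For reference, the categories of both dimensions used in our coding can be summarized compactly. The sketch below is a Python representation offered for illustration only; the nested structure is ours, not part of the taxonomy itself.

```python
# Illustrative summary of the two dimensions of Bloom's revised taxonomy
# as used for coding in this study (representation is ours; see [25,26]).
BLOOMS_REVISED = {
    "Cognitive Process": [  # ordered from simplest to most complex
        "Remember", "Understand", "Apply", "Analyze", "Evaluate", "Create",
    ],
    "Knowledge": [          # ordered from concrete to abstract
        "Factual", "Conceptual", "Procedural", "Metacognitive",
    ],
}
```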
Previous studies used Bloom’s revised taxonomy to explore assessments in physiology [27], biology [28], and natural sciences [29]. Learning objectives for chemistry in Czechia, Finland, and Turkey were compared to one another [30]. Learning objectives in introductory chemistry were also compared to faculty beliefs and exams [31]. Momsen and coworkers studied learning objectives and exams in undergraduate biology classes, noting that objectives and exam questions were not always at the same Bloom’s level [28]. Undergraduate biology exams were also often shown to be at a lower level than MCAT questions [32]. Spindler used Bloom’s taxonomy to study modeling projects in a differential equations class [33]. Similar work has been done by researchers who developed alternative frameworks to Bloom’s taxonomy [34,35,36,37,38,39]. To our knowledge, no one has used Bloom’s taxonomy to compare learning objectives and mid-term exam questions in a transition to guided inquiry. We aim here to use both dimensions of Bloom’s revised taxonomy to examine General Chemistry I learning objectives and mid-term exam questions before and during the transition from traditional to guided inquiry teaching.
We posed three research questions:
How do learning objectives in traditional semesters and guided inquiry semesters compare in the Cognitive Process or Knowledge dimensions of Bloom’s revised taxonomy?
How do mid-term exam questions in traditional semesters and guided inquiry semesters compare in the Cognitive Process or Knowledge dimensions of Bloom’s revised taxonomy?
Did Bloom’s revised taxonomy levels for mid-term exam questions correspond to learning objectives?
2. Materials and Methods
The move from traditional teaching methods [22] to guided inquiry provided a natural experiment for comparing learning objectives and mid-term exam questions before and after the transition. The faculty member with primary responsibility for writing General Chemistry I objectives and exams during the transition held this position in Fall 2009, Fall 2014, Fall 2015, and Spring 2016. For this reason, Fall 2009 and Fall 2014 were chosen to represent semesters using traditional teaching. Fall 2015 and Spring 2016 were semesters using guided inquiry.
Three raters, each having significant teaching experience in STEM, coded all learning objectives and mid-term exam questions for the four semesters into subtypes of both the Cognitive Process and Knowledge dimensions of Bloom’s revised taxonomy. When a single learning objective or question contained tasks or concepts from more than one level in the taxonomy, the objective or question was split into separate items for coding. For example, the learning objective “Explain the three mass laws (mass conservation, definite composition, multiple proportions), and calculate the mass percent of an element in a compound” was split into “Explain the three mass laws” and “Calculate the mass percent of an element in a compound”. All learning objectives and their codes are available in the Supporting Materials as Excel spreadsheets, Tables S1–S4; mid-term exam questions and their codes are in Tables S7–S10.
All three raters compared their codes and discussed them until they came to an agreement at the major type level. That is, coders might disagree on whether an item was 1.1 Recognizing or 1.2 Recalling, but consensus that an item was Level 1 (Remember) in the Cognitive Process dimension was sufficient.
Codes for similar objectives and exam question topics from different semesters were reviewed for internal consistency. Because the complete analysis of objectives and exam questions took six months, we also checked whether our coding had drifted over time. At least ten percent of codes from each major type in both dimensions were reviewed for consistency. When discrepancies were found, the three coders discussed the items again to reach a consensus.
Because the number of learning objectives in traditional semesters was different from the number of learning objectives in guided inquiry semesters, F-tests were done to check that the data sets were comparable. Data were analyzed using Pearson’s χ² test for more than two categories [40] to identify differences within and between groups, as shown in Table 1. All categories in both dimensions of Bloom’s revised taxonomy were considered during coding. Because no learning objectives or exam questions were coded as Create or Metacognitive, these categories were not considered in the analysis once coding was complete. Analyze and Evaluate were combined so that fewer than 20% of cells in contingency tables had values less than 5 and no expected values were less than 1, meeting the requirements for Pearson’s χ² analysis. Cramér’s V was chosen to measure effect sizes because the contingency tables were larger than 2 × 2 [41]. Calculations of F-tests and Pearson’s χ² analysis were done in R Studio [42]. Cramér’s V was calculated using Microsoft Excel. Calculations are provided in the Supporting Materials as Tables S5, S6, S11 and S12.
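To illustrate this workflow, the sketch below reproduces the contingency-table analysis in Python; our original calculations were done in R Studio and Microsoft Excel, and the counts shown here are hypothetical placeholders rather than data from this study.

```python
# Illustrative sketch of the chi-squared analysis and Cramér's V effect size
# described above. Counts are hypothetical, not data from this study.
import numpy as np
from scipy.stats import chi2_contingency

# Rows: traditional vs. guided inquiry semesters.
# Columns: Remember, Understand, Apply, Analyze/Evaluate
# (Create omitted because no items were coded there).
table = np.array([
    [27, 18, 39, 5],   # hypothetical counts, traditional
    [12, 34, 29, 7],   # hypothetical counts, guided inquiry
])

chi2, p, dof, expected = chi2_contingency(table)

# Check the Pearson chi-squared requirements noted above:
# no expected count below 1, and fewer than 20% of cells with expected counts below 5.
assert expected.min() >= 1
assert (expected < 5).mean() < 0.20

# Cramér's V for an r x c table: sqrt(chi2 / (n * (min(r, c) - 1)))
n = table.sum()
cramers_v = np.sqrt(chi2 / (n * (min(table.shape) - 1)))

print(f"chi2 = {chi2:.2f}, p = {p:.4f}, Cramér's V = {cramers_v:.2f}")
```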
Course averages and aggregated grades for the four semesters under investigation were also collected. Two-tailed t-tests were done to identify statistical differences between semesters. Enrollment in Fall 2015 was lower than in previous years because West Point began offering General Chemistry I in both fall and spring that year. Grades for Fall 2015 and Spring 2016 were therefore combined to facilitate comparison with Fall 2009 and Fall 2014.
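A corresponding sketch of the grade comparison is shown below, again in Python and for illustration only; the values are hypothetical and do not come from this study.

```python
# Illustrative two-tailed t-test comparing grades from a traditional semester
# with the combined guided inquiry semesters. Values are hypothetical placeholders.
from scipy.stats import ttest_ind

traditional = [82.1, 76.5, 88.0, 91.2, 79.4, 84.6]     # hypothetical section averages, Fall 2014
guided_inquiry = [84.0, 77.8, 86.9, 90.1, 81.3, 83.2]  # hypothetical, Fall 2015 + Spring 2016 combined

t_stat, p_value = ttest_ind(traditional, guided_inquiry)  # two-tailed by default
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```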
This study was a subset of a larger project conducted under protocol approval number CA17-016-17 through West Point’s Collaborative Academic Institutional Review Board. This was a natural experiment rather than an intervention. No personally identifiable information was collected.
4. Discussion
We set out to answer three specific research questions. We now summarize our findings.
4.1. How Do Learning Objectives in Traditional Semesters and Guided Inquiry Semesters Compare in the Cognitive Process or Knowledge Dimensions of Bloom’s Revised Taxonomy?
The statistical comparison of learning objectives in traditional and inquiry semesters shown in Table 3 reveals a significant difference with a corresponding medium effect size. The proportions of learning objectives classified by Cognitive Process and Knowledge category in the inquiry semesters show a preponderance of objectives in Conceptual (43.9%) and Understand (41.5%). This is in stark contrast to the traditional course, which had a greater proportion of learning objectives in Procedural (48.6%) and Apply (43.7%). While educators sometimes worry that a dominance of the Conceptual and Understand categories will come at the expense of problem-solving ability, this does not seem to be the case for the inquiry-based semesters, as a substantial percentage of learning objectives still belong to Procedural (40.2%) and Apply (35.4%).
These changes are even more surprising because, during the transition to guided inquiry, the course director made a deliberate choice to retain the content and phrasing of the learning objectives as much as possible (personal communication). Partly, this was motivated by simplicity; fewer changes streamlined the overall process. A second motivation was an “if it ain’t broke, don’t fix it” philosophy. Despite these efforts to minimize changes, the data suggest that learning objectives in traditional and guided-inquiry semesters were at different levels in both the Cognitive Process and Knowledge dimensions.
4.2. How Do Mid-Term Exam Questions in Traditional Semesters and Guided Inquiry Semesters Compare in the Cognitive Process or Knowledge Dimensions of Bloom’s Revised Taxonomy?
During traditional semesters, Apply and Procedural exam questions comprised a large portion of the points on the exams, 60.3% and 64.2%, respectively. After adopting guided inquiry, Apply and Procedural exam questions both decreased to ~35% of the total points. The remaining Cognitive Process and Knowledge categories increased their shares of points by at least 5 percentage points, with the Conceptual category having the largest increase, from 18.8% to 43.8%.
The proportions of Remember and Factual questions were similar for both traditional and guided inquiry semesters. This is likely attributable to concerns that free-response questions might be long or difficult to grade. The exam author made a deliberate choice to include a page of easy-to-grade questions (fill-in-the-blank or multiple choice, often Remember and/or Factual) to allow more time to grade free-response questions asking students to sketch or provide a rationale for Understand or Conceptual items.
4.3. Did Bloom’s Revised Taxonomy Levels for Exam Questions Correspond to Learning Objectives?
Comparison of the Knowledge dimension for the traditional semesters shows an emphasis on Procedural learning objectives (Table 4, 48.6%) and exams (Table 9, 64.2%). Factual and Conceptual remained relatively equal in the proportions of learning objectives and exam points.
For the Knowledge dimension in inquiry semesters, learning objectives and exam points placed nearly identical emphasis on Conceptual (~44%) items. There was a slight increase (~5%) in Factual questions on exams compared to the learning objectives. The increased proportion of Factual questions over Procedural can be attributed to the deliberate choice to include a page of easy-to-grade questions.
In the Cognitive Process dimension for the traditional semesters, learning objectives (Table 5) and exam questions (Table 10) emphasized Apply items. Both Remember and Understand decreased in proportion going from learning objectives to exam points, with Remember having the larger decrease (30.3% to 16.3%). Analyze/Evaluate remained constant between learning objectives and exam questions.
For the inquiry semesters’ learning objectives, Understand (41.5%) was emphasized the most, followed by Apply (35.4%), Remember (14.6%), and then Analyze/Evaluate (8.5%). The exam questions favored Apply (35.7%), then Understand (32.7%), Remember (22.2%), and Analyze/Evaluate (9.4%). The increased emphasis on Remember questions on the exams, as with Factual in the Knowledge dimension, was an effect of the deliberate choice to include a page of easy-to-grade questions.
4.4. Comparison of Grades
In addition to answering the above research questions, a comparison of average grades and grade distributions across the studied semesters found only small fluctuations in assigned grades regardless of each semester’s teaching approach. Therefore, despite the more conceptual learning objectives and mid-term exam questions in Fall 2015 and Spring 2016, asking more complex questions did not reduce course grades or disadvantage below-average students, contrary to faculty expectations [43,44,45].
4.5. Future Work
Analyzing objectives and exams from more recent semesters could identify trends in Bloom’s levels over time. Future work could also include analyzing learning objectives and mid-term exam questions using other frameworks [34,35,36,37,38,39]. 3D-LAP’s [35] focus on science and engineering practices, crosscutting concepts, and disciplinary core ideas might yield different results from a framework organized around definitions, concepts, and algorithms [37]. Conducting a similar analysis using Fink’s Taxonomy of Significant Learning [46] might also yield valuable insights.
4.6. Limitations
The Cognitive Process and Knowledge dimensions of Bloom’s revised taxonomy are not totally independent of each other. Remembering Factual knowledge was more common than remembering other types of knowledge in an analysis of 940 biology assessment items [47]. Similarly, Apply and Procedural tended to occur together. Also, the taxonomy provides a list of action verbs that can indicate the level of Cognitive Process. These can be a good starting place, but they are not sufficient [47]. For example, “distinguish” is listed under 4.1 Differentiating, a subtype of Analyze, but “distinguish metals from nonmetals” is 2.3 Classifying, a subtype of Understand. Properly assigning each item to an appropriate category requires considering how the objective or exam question is implemented in teaching and learning.
Learning objectives or exam questions categorized by Bloom’s revised taxonomy only provide opportunities for thinking at that level. Students may have chosen to work at a lower level. For example, objectives or exam questions intended to be higher level might be answered by recalling factual information [24] (p. 5). It is also possible that instructors taught at a lower level, perhaps inadvertently [48]. In particular, calculations that are Apply and Procedural often do not require Conceptual Understanding [49,50,51].
These findings may apply only to this context; further work at other institutions would be required to generalize them. Guided inquiry’s emphasis on analyzing information may be especially suited to supporting higher levels of Bloom’s taxonomy. Other active learning strategies, such as think-pair-share or jigsaw, may not show changes in Bloom’s levels.
5. Conclusions
Using Bloom’s revised taxonomy, we coded and analyzed learning objectives and General Chemistry I mid-term exams for traditional and guided-inquiry versions of the course across four semesters (Fall 2009, Fall 2014, Fall 2015, and Spring 2016). Analysis of learning objectives for the guided-inquiry semesters (Fall 2015 and Spring 2016) revealed more Understand, Analyze, and Conceptual objectives, with significantly fewer Remember and Factual objectives.
In addition to the analysis of learning objectives, similar coding and statistical analyses were performed for exams. These data show more Understand, Analyze, and Conceptual questions in the guided inquiry semesters, with fewer Procedural and Apply questions. Small increases in Remember and Factual reflected a choice to facilitate grading. Despite this, the guided inquiry semesters generally gave students many opportunities to consider complex questions. Several studies indicate that faculty choose lower-level assessments out of concern that only a few students could successfully reason at higher levels [43,44,45]. This approach may be counterproductive, as more conceptual, less algorithmic classes have been shown to be more equitable [4,16].
All semesters showed some correspondence between exam questions and learning objectives within Bloom’s revised taxonomy. Learning objectives and mid-term exam questions in traditional semesters had similar proportions of Factual and Conceptual items, but Procedural questions on mid-term exams were overrepresented relative to learning objectives. Guided inquiry semesters showed strong correspondence between Bloom’s taxonomy levels in learning objectives and mid-term exam questions. The closer alignment of learning objectives and mid-term exam questions in the guided inquiry semesters is consistent with a literature recommendation for equitable teaching [52].
Guided inquiry has the capacity to enrich complexity and enhance learning. Making learning accessible to a wide variety of learners is paramount in any classroom, and it is especially important in introductory STEM courses, where problems of retention and persistence are all too common [7,8,9,53]. The guided inquiry classroom promotes participation from all learners, regardless of course prerequisites and prior STEM experiences, making it an environment of equity, belonging, and inclusivity. Guided inquiry can then serve as a catalyst for tailoring learning objectives and assessing them in ways that ensure mastery of concepts at a deeper level.