Broader terms curriculum mapping: Using natural language processing and visual-supported communication to create representative program planning experiences

Accreditation bodies call for curriculum development processes open to all stakeholders, reflecting viewpoints of students, industry, university faculty and society. However, communication difficulties between faculty and non-faculty groups leave unexplored an immense collaboration potential. Using classification of learning objectives, natural language processing, and data visualization, this paper presents a method to deliver program plan representations that are universal, self-explanatory, and empowering. A simple example shows how the method contributes to representative program planning experiences and a case study is used to confirm the method's accuracy and utility.


Introduction
Changing times impose new and radical challenges for societies, urging Higher Education Institutions (HEI) to rethink their educational offer. Curriculum development is the process where changes to the educational offer are conceived and, to be successful, this process needs to create opportunities for active and consequent reflection. To create these opportunities, stakeholders' participation is essential. This is the unanimous opinion of researchers and accreditation bodies [ABET, 2020, Crawley et al., 2007, Sutherland, 2018], who defend curriculum development open to all stakeholder groups, expressing the viewpoints of students, industry, university faculty and society.
Focusing on program planning, a core process that lies at the heart of curriculum development, open principles are typically associated with interviews and focus group sessions, where non-faculty stakeholders are asked to express their views. Faculty, who hold the central managing role [Wilson and Slade, 2020], are responsible for processing the data collected in these informational touchpoints, and it is faculty who participate in program planning discussions.
Faculty's central role in program planning is not only a traditional functional attribute. Program planning requires scientific and pedagogic skills that are essential to (and therefore mastered by) faculty, but not essential to non-faculty stakeholders. This opens an important communication gap, and the absence of dialogue between faculty and non-faculty groups [Ornstein and Hunkins, 2018] leaves an immense collaboration potential unexplored.
Yet the challenges imposed by a rapidly changing society, together with the recognition that responsive and effective program plans are more likely with participatory (not just informational) involvement of all those concerned, force HEI to reach out and find ways to integrate external contributions, valuing non-faculty stakeholders as experts of their own experience.
To achieve this objective and bridge communication gaps, better representations of the program plan are essential. This paper proposes the use of a broader terms curriculum mapping method to deliver program plan representations that are universal, self-explanatory, and empowering.
The core objective of this paper is the presentation of the broader terms CM (Curriculum Mapping) method. Because this method uses a combination of information and data science techniques, a significant part of the paper is dedicated to the step-by-step description, using a simple example, of these techniques and of how they contribute to representative program planning experiences. To verify the method's accuracy and utility, a case study is used.
But, before proceeding to the presentation of the broader terms CM method, it is important to explain what is meant by curriculum mapping, how it has been used previously, and what changes are required to make it a vehicle for representative program planning. This is the topic of the next section.

The context: Curriculum mapping
According to Burns [2001], curriculum mapping is a process for recording what content and skills are taught in a study program. The recording relies on a visual medium, typically a chart, table, or map, depicting the building blocks of the study program and how these blocks relate to one another. Because different types of building blocks could be used, there are different types of curriculum mapping.
When individual courses are the building blocks, curriculum mapping provides a snapshot of existing learning pathways considering the available courses, helping students navigate the study program. These course mappings use the calendar year as an organizer to depict vertical (from year to year) and horizontal (within a year) relations between courses [Burns, 2001], and are usually represented as flowcharts. Meij and Merx [2018] provide an example of a course mapping published online, with course-specific scientific and pedagogic details available as hyperlinked content.
For accreditation bodies, the grouping of contents and skills per course is not as relevant as ensuring these contents and skills result in expected learning outcomes [Felder and Brent, 2003, Crawley et al., 2007]. For this reason, for accreditation purposes, learning outcomes are the building blocks, and learning outcomes mappings are used to show that the study program yields the expected learning outcomes. These mappings are typically represented as tables aligning program learning outcomes and accreditation standards. Examples of learning outcomes mappings are given in Dyjur and Lock [2016].
The purpose of learning outcomes mappings goes beyond reporting the alignment with accreditation standards. This type of curriculum mapping is used to communicate accreditation bodies' vision of transparency, accountability and scientific curriculum development [Ornstein and Hunkins, 2018], a vision that becomes reality with HEI adoption of outcomes-based education [Spady, 1988, Harden, 2001] and constructive alignment principles [Biggs and Tang, 2011, p.99]. Curriculum mapping is used, therefore, as a tool to shape HEI processes, particularly program planning.
Willcox and Huang describe another type of curriculum mapping: concept mapping. Concepts are in this case used as building blocks, with a concept denoting "the main idea underlying a (typically small) unit of content covered in a course" [Willcox and Huang, 2017, p.9]. These units of content are linked to the Knowledge Concepts defined by Koedinger et al. [2012], and Willcox and Huang use concept mappings to provide insight into the relations between learning outcomes and between courses, helping faculty with precise navigation of the program plan. Examples of this type of mapping are given in Seering et al. [2015], Willcox and Huang [2017] or Varagnolo et al. [2020]. These authors use circular ideograms [Krzywinski et al., 2009] and/or network graphs [Rosen, 2009] to detail concepts' precedence relations. The visual outputs presented by these researchers are very successful and efficient in conveying visual meaning to the complex relations found in study programs.
The analysis of the three types of curriculum mapping reveals important characteristics. Curriculum mapping is used to shape HEI processes, and this ability is valuable for opening program planning discussions to non-faculty groups. Curriculum mapping uses visual-supported communication to represent and discuss study programs, and the developments taking place in the field of information visualization can be used to bridge communication gaps, helping stakeholders to articulate their expert (non-verbal) knowledge. However, as regards the choice of building blocks, if the objective is to increase non-faculty groups' participatory involvement, broader (not detailed) concepts, requiring fewer scientific and pedagogic skills, should be preferred.
A curriculum mapping method that builds on the practices already available but tailored for non-faculty groups participation in program planning discussions is described in the next section.

The method: Broader terms curriculum mapping
This section presents a curriculum mapping method designed for representative program planning: a method that empowers all stakeholders.
A flowchart representing the method steps, with the respective inputs and outputs, is presented in Figure 1. The method considers four steps, detailed in the following subsections: (1) classification of course learning objective statements into broader terms; (2) use of Natural Language Processing (NLP) to convert broader terms into quantitative frequencies of key program concepts; (3) visualization of key program concept frequencies and mappings with links between key concepts and/or courses; (4) discussion, considering the participation of all stakeholder groups, of the method's visual outputs and decision to reclassify or review course learning objectives.
To illustrate how these steps apply, a simple example is considered. The third column of Table 1 presents broader terms derived from the courses' learning objectives. The methodology used to obtain these broader terms is described in the following subsection.

Step 1: Classification of course learning objectives
To characterize courses and the program-degree, the broader terms CM method uses course Learning Objectives (LO).
According to Felder and Brent [2003, p.19], course LO are defined as "statements of observable actions that serve as evidence of the knowledge, skills and attitudes acquired in a course." These statements define key program concepts and through these key concepts the intricate web of course relations is revealed. Course LO provide, therefore, access to the "mechanics" behind a program plan.
The problem of using course LO is that they presume tacit understanding of concepts specific to disciplinary and scientific sub-areas, and this renders LO-statements seldom clear and unequivocal [Ballantyne et al., 2019, Watts and Hodgson, 2015]. Even when LO are written according to specific rules (e.g., considering Bloom's taxonomy, Bloom, 1956, Adam, 2004, Felder and Brent, 2003), the variability in style and scope results in a heterogeneous set, including statements that are often too abstract or too detailed [Lam and Tsui, 2016, Hussey and Smith, 2003].
To disclose their latent information and for effective [...] 1: "Recognize a real-valued function of a real variable; Recall the concept of derivative of a real function and explain its geometric interpretation." Using the Wikipedia index, the first statement could be classified according to the Wikipedia concept "Function of a real variable", a matching concept almost identical to the original LO. However, broader concepts could be chosen. For example, the second LO statement could be classified (with exaggeration) by the broader Wikipedia concept "Differential calculus". The third column in Table 1 includes the results of the classification of course LO-statements for the 5 courses.
The classification example provided in the previous paragraph, and the comparison of columns two and three in Table 1, show a significant reduction in the vocabulary used to characterize the courses. Because the reduction in vocabulary could entail an important loss of information, the conceptual analysis of LO-statements [Lancaster, 2003] is important: determining what the LO statement is about (its "aboutness") and translating it into (selecting) specific broader terms.
Because program planning relies, typically, on the declared curriculum, with LO-statements written by university faculty, the classification of LO is frequently done by faculty in collaboration with curriculum planners [Willcox and Huang, 2017, Seering et al., 2015, Varagnolo et al., 2020]. An alternative procedure consists of an initial classification by an information science professional, subsequently revised by faculty [Ballantyne et al., 2019]. For large database classification, automated machine learning techniques are also used [Golub, S/D, West, 2016]. In this paper, an initial draft classification of course LO is made by a small multidisciplinary team of university faculty. Once visual outputs derived with the broader terms CM method are available, a reclassification is made with contributions from all stakeholders (see Step 4 in Figure 1).
For effective communication of the program plan, having courses associated with a small subset of broader terms selected from a control vocabulary is an important advantage. Key concepts found in courses can be identified, paving the way to their quantification and to the analysis of the relations between courses, i.e., to the analysis of information flows, such as topics covered, which assessments relate to which topic, and so on.
The next section describes in detail the method used in the quantitative processing of broader terms.

Step 2: Processing of broader terms
This paper uses Natural Language Processing (NLP, Manning and Schütze, 1999, West, 2018) to convert broader terms assigned to courses into quantitative data, i.e., into frequencies of words. It will be assumed that these words (tokens, as they are called in the NLP literature [Manning and Schütze, 1999]), extracted from broader terms, still carry conceptual meaning and can still be used to characterize courses and the program-degree. For this reason, in this paper, token and key (program or course) concept, K, are used as synonyms.
NLP applies a sequence of processing functions to an original set of broader terms. Tokenization, the first of these functions, identifies words in broader terms that are included in a corpus, i.e., in a dictionary of tokens. Recalling the example in Section 3.1 of obtaining broader terms for Mathematics (C1:MATH), where "Function of a real variable" and "differential calculus" were the resulting broader terms, and considering the corpus of English words, after tokenization the set of tokens {Function, of, a, real, variable, differential, calculus} characterizes the Mathematics course.
But the above set includes tokens (i.e., "of" and "a") that add no value to the course characterization; therefore, these tokens, known as stop-words, as well as any punctuation signs and numerals, should be removed. Moreover, words written with capital letters and different conjugations of the same word should be replaced by an adequate "stem-word" (in a process known as stemming [Manning and Schütze, 1999]).
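The tokenization and normalization steps just described can be sketched in a few lines of Python. This is a minimal illustration and not the implementation used in this paper: the stop-word list `STOP_WORDS` and the helper `crude_stem` are simplified, hypothetical stand-ins for a full stop-word dictionary and a proper stemming algorithm.

```python
import re

# Hypothetical, deliberately tiny stop-word list (assumption, for illustration only).
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "the", "to"}


def tokenize(broader_term):
    """Split a broader term into alphabetic tokens; punctuation and numerals are dropped."""
    return re.findall(r"[A-Za-z]+", broader_term)


def crude_stem(word):
    """Toy stand-in for stemming: only strips a plural 's' (not a real stemmer)."""
    if word.endswith("s") and not word.endswith(("ss", "us")) and len(word) > 3:
        return word[:-1]
    return word


def normalize(tokens):
    """Lowercase tokens, remove stop-words, and reduce each token to a crude stem."""
    lowered = (token.lower() for token in tokens)
    return [crude_stem(token) for token in lowered if token not in STOP_WORDS]


# Broader terms quoted in the text for C1:MATH.
broader_terms = ["Function of a real variable", "Differential calculus"]
course_tokens = [t for term in broader_terms for t in normalize(tokenize(term))]
print(course_tokens)
```

For the two broader terms quoted for C1:MATH, the sketch keeps the tokens function, real, variable, differential and calculus, dropping "of" and "a". A real stemmer would shorten some of these tokens further.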
Denoting the stemming and the purging of meaningless tokens as normalization, if a study program has $N$ courses, after tokenization and normalization of the broader terms of course $C_i$ ($i \in \{1, 2, \ldots, N\}$), a multiset $K_i$ (allowing multiple instances of the same token) of $m_i$ tokens $K_{i,k}$ is obtained (with $k = 1, 2, \ldots, m_i$). For the program-degree as a whole, a set $K$ (no repetitions) with a total of $M$ tokens $K_j$ ($j = 1, 2, \ldots, M$) is obtained, and the number of times token $K_j$ occurs in multiset $K_i$,

$$b_{i,j} = \left|\{\, k : K_{i,k} = K_j \,\}\right| \qquad (1)$$

represents the elements of an $N \times M$ matrix $B$ of token frequencies per course.
Considering the broader terms for the 5 courses in Table 1 (column three), after tokenization and normalization, the resulting course-token matrix $B_{5C}$ is obtained (Eq. 3), where, given the large number of identified tokens (70), only the columns for the six most frequent are shown.
Observe how this matrix attaches quantitative information to courses based on token frequency. Observe, for instance, the link that emerges between courses C1:MATH and C2:PHY via token K2:calculus. Matrix $B_{5C}$ shows this token is found twice among the tokens associated with course C1:MATH, and once among those associated with course C2:PHY (see also the underlined words in Table 1).
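A course-token matrix of this kind can be assembled by counting tokens per course. The sketch below is illustrative only: apart from the two occurrences of "calculus" in C1:MATH and the single occurrence in C2:PHY mentioned above, the token lists are hypothetical and do not reproduce Table 1. The function name `course_token_matrix` is also an assumption.

```python
from collections import Counter

# Hypothetical token multisets per course (only the "calculus" counts follow the text).
tokens_per_course = {
    "C1:MATH": ["function", "real", "variable", "calculus", "calculus", "linear"],
    "C2:PHY": ["calculus", "mechanics"],
}

courses = list(tokens_per_course)
# Program-level token set K (no repetitions), sorted for a stable column order.
vocabulary = sorted({tok for toks in tokens_per_course.values() for tok in toks})


def course_token_matrix(tokens_by_course, course_order, token_order):
    """Return B as a list of rows; B[i][j] counts token j in the multiset of course i."""
    rows = []
    for course in course_order:
        counts = Counter(tokens_by_course[course])
        rows.append([counts[token] for token in token_order])
    return rows


B = course_token_matrix(tokens_per_course, courses, vocabulary)
j = vocabulary.index("calculus")
print(vocabulary)
print(B)
print("calculus counts:", B[0][j], B[1][j])
```

Each row of `B` corresponds to one course and each column to one token of the program-level set, so a shared non-zero column (here "calculus") is what links two courses.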
This ability to describe a study program quantitatively is an important breakthrough: a way to bridge the gap created by tacit understanding and unclear LO-statements. But, at the same time, notice how impractical the analysis of data in matrix format is.
To achieve a clearer understanding of the quantitative data emerging from NLP, an alternative to matrix or tabular representations of data is essential.

Step 3: Visualization of quantitative data
An important result from NLP is the set of token frequencies: the column-wise sums of the elements in the course-token matrix.
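As a sketch of this column-wise sum, the snippet below computes token frequencies from a small course-token matrix and ranks the tokens, which is also one way a subset of the most frequent tokens could be selected. The matrix values, the token labels and the function names are hypothetical and are not the data of the 5 courses example.

```python
# Hypothetical course-token matrix: rows are courses, columns are tokens.
tokens = ["Ka", "Kb", "Kc", "Kd"]
B = [
    [2, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 3, 1, 1],
]


def token_frequencies(matrix):
    """Column-wise sums: total occurrences of each token across all courses."""
    return [sum(column) for column in zip(*matrix)]


def most_frequent(labels, frequencies, top):
    """Return the `top` (label, frequency) pairs, most frequent first."""
    ranked = sorted(zip(labels, frequencies), key=lambda pair: pair[1], reverse=True)
    return ranked[:top]


frequencies = token_frequencies(B)
print(frequencies)
print(most_frequent(tokens, frequencies, top=2))
```

These frequencies are the quantities a wordcloud would encode as font size.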
A convenient visual representation of these frequencies is obtained with wordclouds. Figure 2 presents a wordcloud from data in matrix $B_{5C}$ (Eq. 3).
Figure 2: Wordcloud of token frequencies for the 5 courses example. Graph obtained using the Wordcloud package [Fellows, 2018] for the R programming language [R Core Team, 2019]; see supplementary material [Duarte, 2020].
Figure 3: a) Circular ideogram and b) multigraph for the 5 courses and 6 most frequent tokens. The circular ideogram [Krzywinski et al., 2009] was obtained with the Circlize package [Gu et al., 2014] and the multigraph was obtained with the iGraph package [Csardi and Nepusz, 2006], both packages for the R programming language [R Core Team, 2019]; see supplementary material [Duarte, 2020].
The outer circumference in Figure 3 a) displays the 5 courses Ci on the right side and the 6 tokens Kj on the left side. This circumference specifies the number of links (see scale) between courses and tokens. For example, courses C3 and C4 have the largest number of links (8 and 7, respectively) to tokens. Token K1 has the largest number of links (6) to courses.
But the advantage of the circular ideogram comes, especially, from the inner circle in Figure 3 a), and from the stripes that link courses and tokens. The inner circle in Figure 3 a) emphasizes the (previously mentioned) link between courses C1:MATH and C2:PHY via token K2:calculus (see purple stripe). But much more is revealed: for example, while course C2:PHY has no further associations, course C1:MATH is also related to course C3:LOGOP through K5:linear (green stripe). The width of the stripes (the strength of the links) connecting C1:MATH to K2:calculus and K5:linear is also larger than the width of the stripes connecting these tokens to courses C2:PHY and C3:LOGOP. Given that "calculus" and "linear" are mathematics-related tokens, these results were expected, and the expert analysis of the 5 courses' LO-statements (in Table 1) should result in identical conclusions. But in Figure 3, the combination of the stripes' curvature, color and width renders the analysis universal, self-explanatory, and empowering, uncovering latent information and helping the verbal articulation of expert (non-verbal) knowledge.
Another equally useful visual representation of data in matrix $B_{5C6K}$ is presented in the multigraph of Figure 3 b), where the number (the cardinality) of vertex links determines the vertex position [Fruchterman and Reingold, 1991]. Notice how vertices C3 and C4, with the largest number of links, shape a central cluster, while vertices C1 and C2 protrude to the periphery. Moreover, vertex centrality is emphasized through the diameter of course vertices, with larger diameters representing courses with a larger number of incident links.
In matrix format, the multigraph in Figure 3 b) for the 5 courses and 6 most frequent tokens is,

$$A_{5C6K} = \begin{bmatrix} 0 & B_{5C6K} \\ B_{5C6K}^{T} & 0 \end{bmatrix} \qquad (4)$$

a square biadjacency matrix obtained from $B_{5C6K}$ (superscript T denotes matrix transpose).
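The construction of a square biadjacency matrix from a course-token matrix and its transpose can be sketched as follows. The helper name `biadjacency` and the small matrix `B` are hypothetical and are not the data of the 5 courses example. The sketch assumes the usual block layout for a bipartite graph, with zero blocks on the diagonal and the course-token matrix and its transpose off the diagonal.

```python
def transpose(matrix):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(column) for column in zip(*matrix)]


def biadjacency(B):
    """Build the square matrix [[0, B], [B^T, 0]] for N courses and M tokens."""
    n_courses, n_tokens = len(B), len(B[0])
    top = [[0] * n_courses + list(row) for row in B]
    bottom = [list(row) + [0] * n_tokens for row in transpose(B)]
    return top + bottom


# Hypothetical 3-course, 3-token example (not the paper's data).
B = [
    [2, 2, 0],
    [1, 0, 0],
    [0, 1, 3],
]
A = biadjacency(B)
for row in A:
    print(row)
```

With N courses and M tokens the result has N + M rows and columns and is symmetric, as expected for an undirected graph.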
Figures 3 a) and 3 b) provide important insights on how key concepts and courses interrelate. However, a simpler and yet very useful representation would consist of the direct links between courses and between tokens.
Observing Figure 3 b) and matrix $A_{5C6K}$ we conclude that the elements of the biadjacency matrix represent the cardinality of 1-walks between consecutive vertices, with a k-walk defined as the sequence of $k$ edges $(e_1, e_2, \ldots, e_k)$ joining $k + 1$ vertices $(v_1, v_2, \ldots, v_{k+1})$ [Rosen, 2009]. For example, matrix $A_{5C6K}$ shows that between vertices $v_{C1}$ and $v_{K2}$ there are two 1-walks, one for each of the two parallel edges joining these vertices. A direct link between vertices $v_{C1}$ and $v_{C2}$ could be conceived as the two 2-walks joining these vertices through $v_{K2}$, denoting the two available options to go from C1 to C2. For the 5 courses example, using matrix algebra, the number of 2-walks between course vertices and between token vertices is found from the second power of the biadjacency matrix, with the diagonal elements of the resulting matrix made equal to zero [Rosen, 2009]. With $L_{5C6K}$ denoting the 2-walk matrix, it follows that

$$L_{5C6K} = A_{5C6K}^{2} - \operatorname{diag}\left(A_{5C6K}^{2}\right),$$

and replacing $A_{5C6K}$ gives,

$$L_{5C6K} = \begin{bmatrix} B_{5C6K} B_{5C6K}^{T} - \operatorname{diag}\left(B_{5C6K} B_{5C6K}^{T}\right) & 0 \\ 0 & B_{5C6K}^{T} B_{5C6K} - \operatorname{diag}\left(B_{5C6K}^{T} B_{5C6K}\right) \end{bmatrix} \qquad (5)$$

Submatrices $L_{5C} = L_{5C6K}[1:5; 1:5]$ and $L_{6K} = L_{5C6K}[6:11; 6:11]$ in Eq. (5) represent the number of possible 2-walks between consecutive courses and consecutive tokens, respectively.
To confirm the results discussed previously for the direct link between vertices $v_{C1}$ and $v_{C2}$, notice the value 2 found in matrix element $L_{5C6K}[1; 2]$ (or $L_{5C6K}[2; 1]$, because the graph is undirected).
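The 2-walk computation can be checked numerically on a small case. The sketch below squares a biadjacency matrix, zeroes the diagonal, and extracts the course and token submatrices. The 6 × 6 matrix is hypothetical (3 courses and 3 tokens, not the 5 courses example), although it mirrors the situation discussed above: the first course has two links to the first token and the second course has one, so two 2-walks join these two courses. Function names are assumptions.

```python
def matmul(X, Y):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def two_walk_matrix(A):
    """Square the matrix and set the diagonal to zero."""
    squared = matmul(A, A)
    return [
        [0 if i == j else value for j, value in enumerate(row)]
        for i, row in enumerate(squared)
    ]


def submatrix(matrix, start, stop):
    """Square block of `matrix` for indices start..stop-1 (0-based, stop excluded)."""
    return [row[start:stop] for row in matrix[start:stop]]


# Hypothetical biadjacency matrix: courses at indices 0-2, tokens at indices 3-5.
A = [
    [0, 0, 0, 2, 2, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 1, 3],
    [2, 1, 0, 0, 0, 0],
    [2, 0, 1, 0, 0, 0],
    [0, 0, 3, 0, 0, 0],
]
L = two_walk_matrix(A)
L_courses = submatrix(L, 0, 3)  # 2-walks between courses
L_tokens = submatrix(L, 3, 6)   # 2-walks between tokens
print(L_courses)
print(L_tokens)
```

Note that the code uses 0-based indices, whereas the submatrix notation in the text is 1-based. The off-diagonal course-token blocks of the squared matrix are zero, because a walk of even length on a bipartite graph always ends on the same side where it started.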
Using submatrices $L_{5C}$ and $L_{6K}$, representations of the direct links between courses and between key concepts are presented in Figures 4 a) and 4 b), respectively. Figure 4 b) confirms the peripheral role played by the mathematics-related concept K2:calculus, and it is interesting to contrast this graph's discriminating potential with that of the wordcloud in Figure 2. Indeed, no evidence is found in the wordcloud as to differences between tokens K2 to K6 (because the number of edges incident on vertices $v_{K2}$ to $v_{K6}$ is the same for these tokens, 3). Obviously, reasons for this should be discussed; in particular, the absence of an (expected) link between C2:PHY and C4:ENER.
Results from this section show that visual outputs from the broader terms CM method provide evidence-based details on weaknesses (and strengths) in program plans, namely those related to key program concepts and to the interrelations between these and/or courses.

Step 4: Discussion of the visual outputs
With the adoption of the broader terms CM method, the focus of program planning discussions shifts from the discussion of written statements of course LO (seldom clear and unequivocal), and from atomized discourses about the links between courses, to the interpretation of quantitative data communicated visually in a way understandable to all.
Because of the universal, self-explanatory quality of its visual outputs (the mappings), the broader terms CM method empowers all stakeholders, allowing participatory involvement of non-faculty groups in program planning discussions. Because of its quantitative nature, the broader terms CM method nurtures constructive critique, effectively addressing disciplinary and scientific boundaries, hierarchical and functional differences, and atomized discourses.
With the objective identification of program plan weaknesses, it is possible to unfreeze [Schein, 1999] long established beliefs, preparing the agreement for change with contributions from all stakeholders (see the review feedback loop in Figure 1). Some of the weaknesses identified in the mappings may derive from course LO classification. Section 3.1 stated that an initial draft classification was made by a small multidisciplinary team of university faculty. During the discussion step, with the help of mappings, classification problems are easily identified, justifying the reclassification feedback loop in Figure 1.
As mentioned in Section 3.1, this reclassification carries some subjectivity. Different broader terms could be chosen to classify an LO-statement, and there could be LO-statements for which an adequate broader term is not included in the control vocabulary. However, the technical nature of control vocabularies and of the classification task makes the selection of broader terms distinctly less subjective than the head-on discussion of LO-statements.
Using the mappings for the 5 courses example we elaborate on relevant discussion topics that would benefit from the participatory involvement of all stakeholders.
Considering frequencies and links between key program concepts, in Figures 3 a) and 4 b), stakeholders (namely, industry and society groups) could contribute with their experience to identify important key concepts and essential links, considering not only scientific and pedagogic arguments, but also the mission of HEI in the context of rapidly changing technological, economic, societal and political environments. With respect to the links between courses, and the links between courses and key concepts, in Figures 4 a) and 3 b), student and graduate groups could contribute with their experience to contrast the differences between the declared and the enacted curriculum [Arafeh, 2016, Varagnolo et al., 2020].
Concerning the lack of an expected link between C2:PHY and C4:ENER, mentioned at the end of the last subsection (recall Figure 4 a)): given that Applied Physics and Energy Management syllabuses are typically linked by thermodynamics, heat transfer and fluid flow topics, stakeholders could use handwritten notes to express the importance of this link and communicate a desirable change, as depicted in Figure 5. Figure 5 demonstrates the ease with which stakeholders take possession of the mappings. The dashed lines show the preferred location of C2:PHY (and C1:MATH), closer to core courses. Text in square brackets points to broader terms justifying the link between C2:PHY and C4:ENER. Figure 5 could be the starting point for the revision of these courses' LO, perhaps considering another forum and using detailed concept mappings.
To conclude this section, notice that, having used a simple example with only 5 courses, it is not possible to verify whether the method's most frequent key program concepts, and the links between these and/or courses, are accurate. To assess the accuracy of the broader terms CM method, the next section presents a case study.

The case study: Bachelor degree in T&IM
To evaluate the accuracy of the broader terms CM method, results obtained with this method should be confirmed by actual observations. For this purpose, this section uses a bachelor degree, Technology and Industrial Management (T&IM), assessed by the Portuguese accreditation agency (A3ES) in 2013. The section starts with the generic presentation of the T&IM study program and with the presentation of the recommendations issued by A3ES (the results of the assessment). Afterwards, the broader terms CM method is used to generate mappings from the courses' LO-statements. To evaluate the accuracy of the method, the mappings are compared to the recommendations, which are deemed accurate. The section ends with a discussion of this comparison.

T&IM bachelor degree and the Portuguese accreditation agency recommendations
The bachelor degree (180 ECTS credits) in Technology and Industrial Management (T&IM, Lourenço et al., 2013, Duarte et al., 2014, 2018) [...] Another important characteristic represented in Figure 6 is the dispersal of T&IM core and elective courses among 6 departments.
Six years after the degree began, the Portuguese accreditation agency assessed the T&IM bachelor degree [A3ES, 2013]. Study program data for the 2007-2012 period were gathered, a self-assessment report was delivered by the HEI, and an independent panel of experts (representing A3ES) visited and met with HEI stakeholders.
Regarding the program plan, the A3ES produced the following recommendations:
i. Increase program-degree mathematical content.
ii. Steer programming skills towards high-level languages with practical use.
iii. Strengthen the program plan with important applied industrial management content, namely, in operations management, supply chain management and operational research.
iv. Excessive number of courses, some with little additional content.
v. Poor integration of topics taught in the different courses.
Concerning these recommendations, note that: (1) these are considered an accurate expression of weaknesses in the 2007-2012 T&IM program plan; (2) the non-prescriptive (and somewhat vague) style of the recommendations results from accreditation criteria allowing program-degrees to adjust to different HEI missions, to student demographics and available resources.

T&IM mappings
Using LO-statements from the T&IM courses (2007-2012 program) and the methodology described in Figure 1 (excluding the feedback loop), after classification of the courses' LO and NLP of the broader terms, a total of 256 program tokens (no repetitions) was obtained. Note that out of the total 38 courses in Figure 6, three (ETH, NET and CAD) were not taught and were excluded; Internships I&II were also excluded, justifying the analysis of only 33 courses (for the meaning of the course acronyms please refer to Figure 6).

Comparing T&IM mappings with A3ES recommendations
Comparing Figures 7 and 8 with the A3ES recommendations (in Section 4.1) it is possible to evaluate, for each recommendation, whether the meaning conveyed in writing has a visual equivalent. The comparison of written and visual meaning is used to verify the accuracy of the broader terms CM method.
Consider item (i) of the A3ES recommendations (increase program-degree mathematical content). The visual equivalent of this recommendation is the (relative) absence of mathematics-related tokens in the mappings. Indeed, Figure 7 a) includes very few mathematics-related tokens (e.g., mathemat, algebra, theorem), with the small font size of these tokens confirming the detachment of mathematics from core concepts taught in the T&IM degree. The position of the "mathemat" token in Figure 7 b), distant from central key program concepts, is also consistent with this analysis. Figure 8 provides further evidence that some action should be taken concerning mathematics contents. The relative position of courses MATH and STAT, and their small number of links to other program courses, translate into insufficient integration of mathematics contents.
As regards item (ii) of the A3ES recommendations (steer programming skills towards high-level languages with practical use), the visual equivalent should be the absence of links between programming and applied key concepts.
An analysis similar to the previous one shows few programming-related tokens in Figure 7 a), with none among the 28 most frequent in Figure 7 b). As regards the PROG course, its location in Figure 8 confirms it is among those with fewer links to central and applied courses.
As for item (iii) (strengthen the program plan with important applied industrial management content, namely, in operations management, supply chain management and operational research), from Figure 7 b), tokens "oper[ational]" and "logist[ics]" are found among the 28 most frequent. Figure 7 a) includes additional concepts related to the mentioned courses, such as "suppli", "chain", "optim". Comparing font sizes in Figure 7 a), these latter key concepts are less frequent than the generic managerial key concepts "resourc", "financ", "account", which could be subjectively deemed less important in a Technology & Industrial Management program plan. A detailed quantitative analysis of token frequencies and of token connections could be made, contributing relevant insights to the constructive discussion of this recommendation.
Using a similar line of inquiry, item (iv) in the A3ES recommendations (excessive number of courses, some with little additional content) would benefit from the detailed analysis of token frequencies per course and from the equivalent to Figure 3 b) with data from the T&IM study program. This detailed analysis and the graph are obtained with ease from matrix $A_{T\&IM}$, using the methods and tools considered in the supplementary material [Duarte, 2020]. However, from Figure 8 it is possible to sort courses based on their connectivity (close to core or peripheral location). This figure depicts ultra-peripheral courses (MECHT, MULT, INNOV) in the background plane with no links. These courses are obvious candidates for detailed scrutiny, a scrutiny that should be extended to courses closer to the graph central region but, nevertheless, showing a small number of links (e.g., GLOB, ENVECON, ECON).
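Sorting courses by connectivity, as read visually from Figure 8, can also be done directly on a course-to-course link matrix. The following sketch is a hedged illustration: the course labels, the link values and the function names are hypothetical and do not correspond to the T&IM data.

```python
# Hypothetical course-to-course link matrix (symmetric, zero diagonal).
courses = ["P", "Q", "R", "S"]
links = [
    [0, 4, 1, 0],
    [4, 0, 2, 0],
    [1, 2, 0, 0],
    [0, 0, 0, 0],
]


def connectivity(matrix):
    """Total number of links incident on each course (row sums)."""
    return [sum(row) for row in matrix]


def rank_by_connectivity(labels, matrix):
    """Return (course, links) pairs sorted from least to most connected."""
    return sorted(zip(labels, connectivity(matrix)), key=lambda pair: pair[1])


ranking = rank_by_connectivity(courses, links)
isolated = [course for course, total in ranking if total == 0]
print(ranking)
print("courses with no links:", isolated)
```

Courses at the top of such a ranking, and in particular those with no links at all, would be the first candidates for the detailed scrutiny discussed above.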
Finally, concerning item (v) (poor integration of topics), as stated previously in Section 4.2, Figures 7 b) and 8 reveal the clustering of managerial and of engineering concepts. In addition, courses more detached from the graph central region and with fewer links in Figure 8 (already identified in the previous A3ES recommendation, item iv) are once more obvious candidates for detailed scrutiny.
In light of the above, and considering A3ES recommendations, Table 2 (second column) summarizes the evidence-based visual meaning obtained from T&IM mappings.

Discussion
From Table 2, for recommendations (i), (ii) and (v), mappings provide detailed visual evidence supporting these recommendations.For recommendations (iii) and (iv), the style (the vagueness) of A3ES statements prevents an objective comparison of visual and written meanings.Yet, these latter recommendations are useful to highlight the striking difference between an evidence-based analysis-possible with the mappings-and the subjective interpretationrelying on tacit understanding-of A3ES written statements.
Because the mappings provide evidence supporting the majority of the A3ES recommendations, it is concluded that the broader terms CM method provides an accurate depiction of T&IM program plan weaknesses.Because all T&IM mappings rely on key program concepts, it is also concluded that these key concepts-and the broader terms CM method-are useful in program planning.
Three additional notes are worth mentioning. Firstly, despite the large number of program courses (33), the classification, NLP and visualization steps were concluded quickly and with ease. Secondly, Figure 8 shows that a holistic experience of the T&IM program plan, considering interrelations between the 33 courses, is possible. Lastly, unlike the course mapping of Meij and Merx [2018] and the detailed concept mappings of Seering et al. [2015], Willcox and Huang [2017] or Varagnolo et al. [2020], visual outputs from the broader terms CM method do not aim at tracing the available learning pathways or the detailed precedence relations between program concepts. Instead, the broader terms CM method maps clusters of key concepts or courses, and maps multiple undirected links between courses and/or concepts. The aim of these maps is to provide a representation of the program plan that is understandable to all stakeholders, allowing the participation of non-faculty groups without imposing predefined models or fixed routes. In this sense, broader terms curriculum mapping does not replace, but rather precedes and complements, other curriculum mapping methods.

Conclusion
Addressing the curriculum development process is of paramount importance. This process has profound consequences, being responsible for the preparation of future professionals and for laying the foundations for dynamic knowledge transfer systems affecting local and global realities. At the heart of curriculum development lies program planning.
Program planning is of immense strategic value. The effort put into program planning propagates through all levels and subprocesses of teaching and learning, imprinting the values, intentions and expectations that will guide stakeholders and shaping HEI educational outcomes.
To improve program planning, more participatory touchpoints for non-faculty groups (i.e., students, industry, society) are needed. Creating these touchpoints, and thereby contributing to representative program planning, was the motivation behind this paper.
An important impediment to representative program planning lies in the communication gap between faculty and non-faculty groups. Curriculum mapping has been used to promote better communication among faculty and to shape program planning. This paper collected practices available from different types of curriculum mapping and, using information and data science techniques, tailored a curriculum mapping method for the participation of non-faculty groups in program planning discussions. The resulting method, the broader terms CM (Curriculum Mapping) method, was illustrated with the help of a simple 5 courses example. The following conclusions were drawn:
• (Section 3.1) Classification replaces head-on discussion of subjective course LO-statements with the much more objective task of selecting broader terms from a control vocabulary.
• (Section 3.2) Natural language processing allows the quantitative analysis of the program plan, providing a way to cut across disciplinary and scientific boundaries, hierarchical and functional differences, and atomized discourses.
• (Section 3.3) Mappings render the interpretation of quantitative results universal and self-explanatory, empowering stakeholders with evidence-based details on weaknesses (and strengths) in the program plan.
• (Section 3.4) Discussion of visual outputs with non-faculty groups allows representative program planning, with these groups' voice being heard on reclassification and review of course LO-statements.
Despite the relevance of the above conclusions, which relate to the participatory involvement of non-faculty stakeholders, the simple 5 courses example was unable to answer the question of the broader terms CM method's accuracy and, therefore, of the method's utility.
To evaluate the method's accuracy, a case study (the T&IM bachelor degree) was used. Mappings for the case study were obtained and compared with observations from an independent panel of experts. From this comparison the following was concluded (Section 4.4):
• Mappings provide evidence supporting the observations, and the broader terms CM method provides an accurate depiction of T&IM program plan weaknesses.
• Key concepts obtained from course LO-statements-and the broader terms CM method-are useful in program planning.
Considering the benefit of non-faculty groups' participation in curriculum development processes, and considering the progress made in information systems and relational databases [Chen, 1976, Bagui and Earp, 2003, Leff and Rayfield, 2001], merging the techniques used in the broader terms CM method with HEI information systems would help bring the method's benefits into HEI everyday reality; for example, through the inclusion of mappings in information systems' summary dashboards. This merger is just one potential topic for further exploration in this rich and challenging research area that joins education, information and data sciences.

Figure 1: Flowchart representing the steps, respective inputs and outputs of the broader terms curriculum mapping method.

Figure 2 identifies the most frequent key program concepts-manag[ement], calculus, control, energi, linear, logistic-, represented with larger font size in a central position.

Figure 2 is adequate to identify the relative importance of different key concepts, but provides no information concerning the relations between these or between courses. To represent these relations, researchers can choose among several alternatives. One that captures all the data in the course-token matrix and makes patterns and descriptive statistics visible is presented in Figure 3 a). It is the circular ideogram representation [Krzywinski et al., 2009] of the data in matrix B5C6K = B5C[1:5; 1:6], a submatrix including the first 6 columns of matrix B5C (Eq. 3).
and 4 b), respectively. The strength of the links (the cardinality of possible 2-walks) is given both by numbers and by edge widths. Moreover, as for Figure 3 b), vertex layout and vertex diameter provide a suggestive visual depiction of core and peripheral courses/key concepts.

Figure 4: Graphs showing direct links between: a) courses (from L5C); b) key concepts (from L6K). Numbers and edge widths represent the strength of the link. Graphs produced with the iGraph package [Csardi and Nepusz, 2006] for the R programming language [R Core Team, 2019]; see supplementary material [Duarte, 2020].

Figure 4 a) shows that the largest number of possible 2-walks (12) occurs between courses C3:LOGOP and C4:ENER. This value can be verified in Figure 3 b). The way key concepts influence links between courses is clearly reflected in the locations of courses C2:PHY and C5:FIN. Although C2:PHY and C5:FIN both have a single key concept among the 6 most frequent (K2:calculus and K1:manag, respectively; see Eq. 3), the fact that "manag[ement]" is more common than the mathematics-related concept pulls C5:FIN closer to where core program courses lie, whereas C2:PHY is pushed to a peripheral location.

Figures 4 a) and 3 b) provide visual evidence of the detachment of course C2:PHY from the remaining courses. The reasons for this should be discussed; in particular, the absence of an (expected) link between C2:PHY and C4:ENER.
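The link strengths discussed for Figure 4 (numbers of possible 2-walks between courses) can be illustrated with a short sketch. It assumes that the course-to-course link matrix is the product of the course-token matrix with its transpose, with the diagonal set to zero, which is the usual way of counting 2-walks through shared tokens in a bipartite graph; the supplementary tool may differ in its details. The matrix `toy_b` and the function name `two_walk_links` are hypothetical, the values are not those of Eq. 3, and Python is used only for illustration (the paper's tool is written in R).

```python
def two_walk_links(b):
    """Count 2-walks between every pair of courses through shared tokens.

    `b` is a course-by-token matrix of token counts (one row per course).
    Entry [i][j] of the result is sum_k b[i][k] * b[j][k] for i != j,
    and 0 on the diagonal (self-links are ignored).
    """
    n = len(b)
    links = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                links[i][j] = sum(x * y for x, y in zip(b[i], b[j]))
    return links


# Hypothetical toy course-token matrix: 3 courses (rows), 3 tokens (columns).
toy_b = [
    [2, 0, 1],
    [1, 3, 0],
    [0, 0, 4],
]

links = two_walk_links(toy_b)
for row in links:
    print(row)
```

In this toy case the first and third courses share the strongest link (4 possible 2-walks), whereas the second and third courses share no token and are therefore not linked, the same kind of absence noted above for C2:PHY and C4:ENER.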

Figure 5: Handwritten notes communicating a desirable change to the mapping in Figure 4 a).Example of how broader terms curriculum mapping can be used by stakeholders during the program planning discussions.
was conceived in 2006 at the College of Engineering of Instituto Politécnico de Setúbal, a Portuguese public HEI. The degree targeted mature students working in the industry sector in the region of Setúbal. Considering the characteristics of the students (mature blue-collar workers with formal and informal skills in their area of professional expertise) and the advanced technological settings provided by the employing organizations (which include automotive, aeronautic and ship repair industries), the 2007-2012 program plan emphasized managerial contents at the expense of engineering and mathematics. This emphasis on management topics is made clear in Figure 6, a circular dendrogram representing T&IM courses and respective departments. Out of the 38 program courses, 18 belonged to the Business Sciences Department.

Figure 6: Circular dendrogram representing T&IM courses (2007-2012 program plan) and respective departments. Courses belonging to the departments of Business Sciences (BScDep), Electrical Engineering (ElecEngDep), Informatics (InfDep), Mathematics (MathDep), Mechanical Engineering (MechEngD) and Process Control (ProcCtrlDep) are represented counterclockwise. The responsibility for Internship I&II is shared among departments and the asterisk symbol (*) is used to identify elective courses.
Figure 7 a) presents a wordcloud with the 200 most frequent key program concepts. Using the program biadjacency matrix A T&IM, graphs with direct links between the most frequent key program concepts and with direct links between courses were obtained (Figures 7 b) and 8, respectively).
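Selecting the most frequent key concepts from a course-token matrix, as done for the wordcloud and for the 28-concept graph, amounts to summing token counts over courses and keeping the largest totals. The sketch below is a minimal illustration under that assumption; the token labels, the counts and the function name `top_tokens` are hypothetical, and Python is used only for illustration (the paper's tool is written in R).

```python
def top_tokens(tokens, matrix, k):
    """Return the k most frequent tokens as (token, total count) pairs.

    `matrix` is a course-by-token matrix of counts; column totals give the
    frequency of each token over the whole program. Ties are broken
    alphabetically by token label.
    """
    totals = [sum(column) for column in zip(*matrix)]
    ranked = sorted(zip(tokens, totals), key=lambda pair: (-pair[1], pair[0]))
    return ranked[:k]


# Hypothetical toy data: 3 courses (rows) and 4 stemmed tokens (columns).
toy_tokens = ["tok_a", "tok_b", "tok_c", "tok_d"]
toy_matrix = [
    [2, 0, 1, 0],
    [1, 3, 0, 0],
    [0, 1, 4, 1],
]

top2 = top_tokens(toy_tokens, toy_matrix, 2)
print(top2)
```

The same totals could drive font sizes in a wordcloud, and restricting the matrix to the columns of the selected tokens gives the submatrix from which a concept-to-concept graph can be built.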

Figure 7: Mappings for the T&IM degree (2007-2012 program plan). a) Wordcloud with the 200 most frequent key program concepts. b) Links between the 28 most frequent key program concepts. Wordcloud obtained using the wordcloud package [Fellows, 2018]. Undirected network graph obtained from matrix L28K using the iGraph package [Csardi and Nepusz, 2006]. Both packages were developed for the R programming language [R Core Team, 2019].

Figure 8: Links between T&IM program courses. The forward (white) plane presents an enlarged detail with 30 out of the 33 courses present in the backward (gray) plane. Network graph obtained from matrix L33C using the iGraph package [Csardi and Nepusz, 2006] for the R programming language [R Core Team, 2019].

Figure 8 presents the links between program courses in two planes. The background (gray) plane is used to show three ultra-peripheral courses with no links: MECHT, MULT and INNOV. The forward (white) plane provides a detail of the courses lying closer to the graph core region. In this detail all courses are linked. Distant from the graph center lie courses PHY and STAT; at an intermediate distance lie MATH, MAIN, CTRLP, PROG, DRAW, ENVECON, ECON, GLOB and ENG; the remaining 19 courses lie in the central region. A divide similar to the one identified previously between managerial and engineering concepts is also present in Figure 8, with managerial courses clustered to the (upper) left and engineering courses clustered towards the (lower) right of the Figure 8 detail.

Table 1: Learning objectives and respective broader terms classification for 5 courses. Notes: (1) A non-truncated version of this table is provided with the supplementary material [Duarte, 2020]. (2) Data in this table models style and scope variability frequently found in learning objective statements and considers different levels of detail in broader terms selection.
This paper considers principles of resource classification to classify course LO. Concepts from Wikipedia index matching course LO-statements are used to define broader terms. To illustrate how this is done, consider the excerpt of LO statements for Mathematics (C1:MATH) in Table

Table 2: Comparing A3ES recommendations with evidence-based visual meaning conveyed by T&IM mappings. Note that these results are obtained exclusively from course LO-statements, whereas A3ES recommendations consider a visit by an independent panel of experts, interviews and focus group sessions, among other inputs.