Pre-Service Teachers’ Knowledge of Relational Structure of Physics Concepts: Finding Key Concepts of Electricity and Magnetism

Relational interlinked dependencies between concepts constitute the structure of abstract knowledge and are crucial in learning conceptual knowledge and the meaning of concepts. To explore pre-service teachers’ declarative knowledge of physics concepts, we have analyzed concept networks, which agglomerate 12 pre-service teacher students’ representations of the key elements in electricity and magnetism. We show that by using network-based methods, the interlinked connections of nodes, locally and globally, can be analyzed to reveal how different elements of the network are supported through their connections to other nodes in the network. Nodes with high global connectivity initialize contiguous concept patchworks within the network and are thus most often found to be abstract, general, and advanced concepts. Locally cohesive concepts, on the other hand, are nearly always auxiliary supporting concepts, related to specific textbook-type experiments and model-type conceptual elements. Comparisons of group-level knowledge and individual pre-service teacher students’ knowledge in the form of networks show that while at the group level the aggregated knowledge is expert-like, at the individual level pre-service teacher students possess only a fraction of that knowledge.


Introduction
Science education is supposed to stay close to conceptual knowledge as it is conceived by experts in science and teaching science. A recurrent theme in science education is the discussion of how the structure of novices' knowledge differs from experts' knowledge and how to facilitate the transformation of novices' knowledge to a more expert-like knowledge. It has been repeatedly noted that the effectiveness of experts' knowledge is derived from its organized structure, which allows easy, fast and accurate retrieval of knowledge [1,2]. The interest towards the structure of students' knowledge is well documented in many research reports, and thus its importance hardly needs reiteration. Consequently, in science education, conceptual structures that are typical for advanced scientific knowledge are also of key interest in learning abstract scientific knowledge.
Research discussing the structure of pre-service teacher students' knowledge and how it is related to experts' knowledge has thus attempted to find structural characteristics of experts' and novices' knowledge for the basis of such comparisons. One extensive branch of such research has focused on finding structure as it is revealed by semantic networks and closeness of terms in semantic networks, where semantic networks are analyzed by graph-theoretical methods [3][4][5]. In word association studies, the closeness of concepts in semantic networks, as they emerge in pairwise associations, is assumed to be a robust method for exploring students' structure of knowledge, or at least how students use [...]
1. Which concepts and conceptual elements have high connectivity, and of which type?
2. How do the concepts with high connectivity of distinct types differ in their content?
3. How does individual students' knowledge relate to group-level knowledge?
For us, the notion of connectivity has a significant role in exploring the structure of students' knowledge expression in the form of concept networks, as the research questions show.
The results of the study, based on extended network-based analysis, confirm the assumption that abstract, general concepts have different structural positions in comparison to more context related concepts. In addition, the results show that at the group-level, the students' knowledge comes close to expert-like knowledge and is richly connected at the global level, while individual students possess only parts of the group-level knowledge. The parts of knowledge structures in possession of students are not always very comprehensive or extensive, but they are nearly always adequate as such and well substantiated.

Relational Knowledge and its Cartography: A Network View
The view that knowledge is a system and that different knowledge elements acquire their meaning as part of that system is supported by recent ideas which emphasize the role of relations in building the meaning of concepts [25][26][27][28]. This view readily underpins an exploration of students' knowledge through how they represent their declarative knowledge. The pre-service teacher students' representations of their declarative knowledge (i.e., knowledge in written, explicated form or represented symbolically) provide, of course, only a window into their knowledge, but are nevertheless a major part of the knowledge they can use in communicating their understanding of the knowledge system that the representations are about. For example, how they would use that knowledge in problem solving would provide a different, complementary picture. Both pictures are needed, and here we focus on the former, declarative knowledge.
We first briefly summarize the basic notions of relational knowledge, then rationalize the representation of such knowledge as kinds of lexicons, and finally introduce appropriate cartographic methods to find metrics of such networks.

Relational Knowledge
The structure of knowledge systems emerges from relational connections between concepts which form the system. The role of relational structures and relational schemes in learning concepts and abstract knowledge has recently been discussed within the relational theory of concepts [25][26][27][28]. In this theory, the relational structure of knowledge is in focus in a very similar way as in those views on scientific knowledge that emphasize the systemic (and systematic) nature of scientific knowledge [29][30][31][32][33]. The importance of the relational theory of concepts to science education has recently been pointed out by Goldwater and Schalk [25], who discuss the role that relational knowledge and relational categories play in bridging cognitive science and science education. In discussing the role of relational concepts in learning, they claim that in learning advanced scientific knowledge, "learners need to acquire a highly interrelated set of concepts and principles that classify phenomena, problems, and situations by their deep (common) relational structure" [25]. According to Goldwater and Schalk, such relational knowledge is indispensable for experts, and they claim that becoming an expert requires "learning and understanding highly interconnected systems of relational knowledge", which during the learning and growth of expertise, become richer and more coherent [25]. In a comparable way, Lachner et al. [19][20][21][22] have pointed out in many of their studies that experts' knowledge builds around abstract, coherent connections, where the conceptual distances between the concepts are often long and the connections complex, while novices' knowledge often consists of closely related concepts but also remains shallower in comparison to experts' knowledge.
These notions suggest that the indispensable feature of the experts' knowledge in learning and teaching contexts is abstractness, deep embedding of the theoretical structure of knowledge, and complex connection within that knowledge system. These notions contained in research emphasising relational knowledge and relational schemes embody the seminal notions [1] that scientific knowledge and experts' knowledge is organized, constituted by relationships between the concepts and that such relations are related to abstractness and generality of concepts. Next, we turn to take a step which brings us closer to operationalization of structural features of interest.

Relational Knowledge as Networked Lexicon
The notions of relational knowledge invite representations in which relational connections and interlinking patterns between concepts are explicated. If relational structures and patterns are taken to be central to learning and understanding scientific knowledge, as these views posit, we must attempt to find ways to represent, explicate, and recognize such structures. One possibility is to turn to conceptual declarative knowledge and how it can be represented. Such a restriction is of course only one aspect of the richness of conceptual knowledge, yet it is important enough to deserve attention of its own.
Conceptual declarative knowledge is often approached from the viewpoint of semantic networks, where networked connections are of primary interest [32] because retrieval and inference are based on traversing such networks. From the viewpoint of semantic networks, concepts are assumed to form clusters in which connections are close. Concepts within clusters share more similarities than concepts in different clusters. The structure revealed by the closeness of terms in semantic networks is appropriate for studies where associative knowledge and word associations are in focus [6,7,32]. However, in cases where the knowledge is rule-based, normative, and based on regulative connections, as in the case of abstract, general concepts, longer, contingent paths which relate concepts in different clusters are important [16,18-20]. In any case, regardless of the distance between concepts, both views direct attention to exploring the connectivity of knowledge elements within the knowledge system.
Educ. Sci. 2019, 9, 18
An obvious way to explore the conceptual connections is by focusing on terms that stand for the concepts, and how relationships between the terms emerge as they are explicated in different contexts. Such a network eventually forms a lexicon of terms and names, where connections between them derive from contextualized instances of how terms are used [33]. The assumption that concept meaning is related to the structure of lexical systems is closely related to notions that learning word meanings is directly reflected in the relational structure of the lexical networks the learners possess [34,35]. From the viewpoint of scientific knowledge, the focus on concept networks as lexical networks finds support from the notion that scientific knowledge can also be seen as a lexical system, and learning scientific knowledge is learning that lexical system [33]. Seeing relational conceptual knowledge as a networked lexicon, although not very common in science education and research on it, receives strong dual support from research on learning as well as from analyses of scientific knowledge. Here, by focusing on the relational structure of pre-service teacher students' declarative knowledge in the form of lexicon, we attempt to bridge the gap that we think remains between recurrent descriptive and qualitative notions of importance of structure of students' knowledge, and how research has thus far managed to operationalize these notions. To get the tools to bridge the gap, we next turn to methods of cartography of knowledge.

Cartography of Networked Lexicons
The network approach to conceptual declarative knowledge provides not only a new view on knowledge as a system but also, importantly, practical tools to develop well-defined operationalizations for the key notions. Complex network methods, as a cartography of knowledge, have been successfully utilized in mapping scientific knowledge as it is revealed by connections between different disciplinary areas, citation networks, and networks of scientific collaborations [36][37][38]. We approach the students' lexicon of declarative knowledge and concept networks by using metrics that measure the connectivity of nodes in the networks, following methods [18] that parallel those developed in the context of the cartography of knowledge. Network theory provides several operationalizations [39][40][41] of such properties, and an exhaustive study of all of them is beyond the scope of the present study. Moreover, there is no straightforward method to decide a priori which measures are optimal; such decisions must be made a posteriori on the basis of how the operationalization of the desired property is carried out and whether it is able to convey information on the desired property of the network.
We assume that the key elements of students' declarative knowledge are those elements that have important local and global connecting roles in the concept networks. On the basis of previous studies focusing on the structure of students' concept networks [17,18,20], we suggest that the three most important properties and their operationalizations are:
• Local connectivity. Local connectivity of a node is simply the strength of all its connections to neighbouring nodes, thus providing the local epistemic support for the node. If the node has many well-substantiated connections, the epistemic support of the node is strong. Local connectivity is operationalized as Degree centrality, D (see Appendix A).
• Local cohesion. Local cohesion of a node is related to the number of neighbours of the node that are also connected. Such triadically connected patterns are mutually supporting and give rise to cyclic structures. Of these, the triadic cycle is of special interest, since it connects the closest neighbours transitively and confers strong local coherence of the system. Local cohesion is operationalized as the Local Clustering coefficient, C (see Appendix A).
• Global contiguity. Global contiguity is related to global connectivity and the availability of long paths connecting two distantly separated knowledge elements in the network. Such long contiguous paths are pathways to transmit supporting information, needed in substantiating knowledge elements not only within the local neighbourhood but also in the network as a whole. Global contiguity is operationalized here as Communicability Centrality, G (see Appendix A).
Communicability Centrality is directly related to knowledge elements' role in mediating between the other elements in the system. High global contiguity obviously also implies high global cohesion, in the sense that distant elements are nevertheless interconnected. Therefore, we do not need a separate measure for global cohesion.
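To make the three measures concrete, the following minimal sketch computes them for a small toy network. It is an illustration only: the five-node network is hypothetical, the network is treated as undirected with unit weights, and the communicability measure is assumed to take the common walk-counting form in which walks of length k are weighted by β^k/k!. The definitions actually used in the study are those of Appendix A and may differ in detail.

```python
# Toy undirected network (assumption: 5 hypothetical nodes, unit link weights).
# Nodes 0, 1, 2 form a triangle; links 2-3 and 3-4 form a chain attached to it.
W = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]


def degree_centrality(W):
    """Local connectivity D: sum of the weights of all links of each node."""
    return [sum(row) for row in W]


def local_clustering(W):
    """Local cohesion C: fraction of a node's neighbour pairs that are linked."""
    n = len(W)
    result = []
    for i in range(n):
        nbrs = [j for j in range(n) if j != i and W[i][j] > 0]
        k = len(nbrs)
        if k < 2:
            result.append(0.0)  # no neighbour pairs, so no triads are possible
            continue
        closed = sum(
            1
            for a in range(k)
            for b in range(a + 1, k)
            if W[nbrs[a]][nbrs[b]] > 0
        )
        result.append(closed / (k * (k - 1) / 2))
    return result


def communicability_centrality(W, beta=1.0, max_len=40):
    """Global contiguity G (assumed form): for each node, the sum over all
    other nodes of walks of length k >= 1, each weighted by beta**k / k!."""
    n = len(W)
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [[0.0] * n for _ in range(n)]
    for k in range(1, max_len + 1):
        # term becomes (beta * W)**k / k!, built from the previous term
        term = [
            [sum(term[i][m] * W[m][j] for m in range(n)) * beta / k
             for j in range(n)]
            for i in range(n)
        ]
        for i in range(n):
            for j in range(n):
                total[i][j] += term[i][j]
    return [sum(total[i][j] for j in range(n) if j != i) for i in range(n)]


D = degree_centrality(W)
C = local_clustering(W)
G = communicability_centrality(W, beta=1.0)
```

In this toy network, node 2 has the highest D and G because it joins the triangle to the chain, while nodes 0 and 1 have C = 1 because their only two neighbours are themselves linked. This mirrors, in miniature, the distinction between globally connecting and locally cohesive nodes.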
The selection of these three properties and operationalizations is justified by their close connection to desired structural properties, but also their practical utility in helping to discern those properties which have been found central to the basis of a posteriori considerations. The above-defined properties can be operationalized in many ways, but here we have opted to use the most obvious standard centrality measures [39][40][41]. The operationalizations, in terms of D, C, and G, are based on path-counting in a network [18,20,40], as explained in detail in Appendix A. The path-counting algorithm contains one parameter β which controls how the long paths are weighted in the counting (i.e., how substantial portions of the network are included in counting) [18,40]. When β << 1, only local neighbours are included, while for β >> 1, all available paths in the network are included. For most cases value β = 1 is used, which weights paths to correct the effect of expected multiplicity (see Appendix A). Finally, it should be noted, however, that the above verbal descriptions of the centralities are only illustrative and cannot replace the mathematical descriptions provided in Table A1 and Appendix A. The verbal descriptions serve to give an overall idea of the nature of operationalization but are incomplete and easily misleading. The mathematical descriptions are thus indispensable [41].

Empirical Sample, Its Preparation, and Method of Analysis
As an example of the network-based cartography of pre-service teacher students' knowledge, we discuss here how students present their views of the relational connectedness of concepts in electricity and magnetism, and how the network view provides a window on the features of that knowledge system. The sample on which we base our analysis consists of 12 concept networks, each made by one student and produced during a seven-week course that focused on the conceptual structure of physics. During the teaching sequence, the pre-service teacher students first produced an initial concept network, and later, after instruction and group discussions, a definitive version of the concept network. In constructing the networks, students were instructed to follow certain design principles, explained in detail elsewhere [16,19] and summarized in Appendix B to the extent necessary here. Here, only the final networks are considered because the final stage of the students' understanding of the relational structure of concepts is of interest. The 12 individual concept networks are not separately analyzed. Instead, for analysis we formed three different agglomerated networks based on the 12 individual networks, which represent students' knowledge at the group level. Of these three networks, the first is a direct agglomeration of acceptable knowledge found in the 12 individual networks, while the two others are augmented versions of the first. The augmented networks contain about 10% additional links, which three experts added to make the augmented network correspond to a representation the experts found adequate and expert like.

Content Analysis
For the analysis, all nodes in the individual concept networks (and also reported in written reports) were evaluated and all relevant nodes were retained (about 5% of nodes were filtered out at this stage). Next, the detailed content analysis based on epistemic classification of nodes was performed. Epistemic, content-related classification of nodes contained two levels [16,19]:
1. Level of using ontologically correct properties for nodes;
2. Level of factually correct statements.
It should be noted that the ontological level is a requirement for the factual level. At the ontological level, the explanation needs to contain correct physics concepts connected correctly to ontological properties of their targets (i.e., particles can have electric charge, mass, and track, whereas fields have strength and extension). At the factual level, the explanation needs to contain correctly identified connections between at least two concepts. Factual knowledge can be represented in the form of physics laws or principles, or it can be some experimentally perceived connection between concepts or a connection derived from a model. Each node in the students' networks gained a score from 0 to 2 based on this epistemic analysis. The content analysis was performed independently by two researchers. The overall interrater agreement was 95%.
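The two-level scoring and the agreement figure can be illustrated with a short sketch. It assumes that a node scores 0 when the ontological level is not met, 1 when only the ontological level is met, and 2 when both levels are met, and that interrater agreement is simple percent agreement; neither assumption is confirmed by the text here. The rater data are made up for illustration.

```python
def epistemic_score(ontologically_correct, factually_correct):
    """Score a node from 0 to 2 (assumed mapping). The factual level
    counts only if the ontological level is satisfied."""
    if not ontologically_correct:
        return 0
    return 2 if factually_correct else 1


def percent_agreement(rater_a, rater_b):
    """Share of nodes (in percent) to which both raters gave the same score."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty node set")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


# Hypothetical scores for ten nodes from two raters (not data from the study).
scores_a = [2, 2, 1, 0, 2, 1, 2, 2, 0, 1]
scores_b = [2, 2, 1, 0, 2, 2, 2, 2, 0, 1]
agreement = percent_agreement(scores_a, scores_b)
```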

Construction of Collated Networks for Cartography
Here we focus on the examination of pre-service teacher students' group-level knowledge. We use three differently collated (agglomerated) networks based on the relationships found in 12 individual networks and on the level of epistemic substantiation of the relationships. The collated networks contain the same 121 nodes, but the nodes are linked differently. The three networks studied are the two collated networks (COL-O and COL-A) and the augmented network (AUG). The augmented network contains 787 links and can be taken to represent experts' knowledge, because three experts refined it to the extent that they deemed it to provide an adequate representation of the topic.
To prepare the collated networks and the individual concept networks made by students in a form that yields to quantitative analysis, the information contained in the epistemic strength of the nodes is transformed into information contained in the links between the nodes, in form of the weight w of the link and normalized to maximum value one. The details of weighting and construction of the weighted network are explained in Appendix A. Here it is enough to keep in mind that epistemic quality of content is now coded in link weights. All networks discussed in what follows, the augmented (AUG), collated (COL-O and COL-A), and individual (IND), are transformed into weighted, directed networks, and analyzed as such.
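The transformation from node scores to normalized link weights can be sketched as follows. The actual weighting rule is the one given in Appendix A; the product rule used here is only a placeholder assumption, and the node names and scores are hypothetical. The sketch is meant to show the normalization to a maximum weight of one on a directed network, not the study's rule.

```python
def links_from_node_scores(edges, score):
    """Turn node-level epistemic scores into normalized directed link weights.

    Placeholder rule (an assumption, not the rule of Appendix A): the raw
    weight of link (u, v) is score[u] * score[v]; weights are then divided
    by the largest raw weight so that the maximum weight is one.
    """
    raw = {(u, v): score[u] * score[v] for u, v in edges}
    w_max = max(raw.values())
    if w_max == 0:
        raise ValueError("all links have zero weight; nothing to normalize")
    return {edge: w / w_max for edge, w in raw.items()}


# Hypothetical nodes with epistemic scores 0-2 and directed links between them.
score = {"charge": 2, "electric field": 2, "potential": 1, "field lines": 0}
edges = [
    ("charge", "electric field"),
    ("electric field", "potential"),
    ("charge", "field lines"),
]
weights = links_from_node_scores(edges, score)
```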
The original 12 students' concept networks are discussed to the extent that they can be compared to collated networks. The reason for this is that only at the level of collated networks do interesting and relevant features of structure of knowledge become apparent. It is important to note that the structural features found at group-level are not artefacts produced by the collation of individual networks. Rather, the collation removes noise, and parses incompletely expressed connections, and connections that individual students failed to notice. In many ways, this parallels the utility and improvement of group-level knowledge found in crowdsourcing of knowledge [42].
One example of the individual networks is shown in Figure 1, which illustrates the overall appearance of the network and shows how it consists of three modules (communities). The modularity is partially a consequence of the task structure. The three tasks (electricity, magnetism, and electromagnetism) were completed separately and joined only in the final concept network. Nodes 1-53 correspond to the module on electricity, nodes 54-93 to magnetism, and nodes 94-121 to electromagnetism.
[Concept list fragment: Magnetic flux Φ; 120, Electromagnetic (propagating) waves]

Results
The relational structure of the collated concept networks is revealed by the network cartography, in terms of local observables for connectivity D, cohesion C, and communicability G. The nodes of the network, representing concepts and conceptual elements, are ranked according to their values of D and C to find concepts with high local connectivity and cohesion, and according to G to find concepts with high global contiguity. We refer to concepts with high G as the key concepts, because they have an important global structural role. The statistical significance of the results summarized here has been assessed in the standard way of network analysis by using the configuration model [41] as a null model (see Appendix A; see also [18,24]). The standardized deviations of the empirical values from the mean values obtained for the null model provide so-called Z-scores (see Appendix A). Z-scores with |Z| > 2 (a deviation of more than two standard deviations) are taken to be statistically significant.

Finding Key Concepts from Relational Structure
Figure 2 shows the different structural roles of nodes for the augmented network (AUG) as they are revealed by the variables D and G. The network for G shows only those nodes which have G > 0.5 with β = 1. In Figure 2, the size of the node corresponds to the value of variables D and G, thus the key concepts are discernible as larger than average nodes in each case.
The 25 concepts and conceptual elements that have the highest values of G are listed in Table 2 for β = 1. The rankings according to D are also provided. For comparison, the rankings for collated networks COL-O and COL-A are also given. The additional information in Table 2 summarizes the number N of networks where the given concept appears among the 10, 20, or 30 highest-ranking concepts.
Table 2. Communicability-based rankings for collated (COL) and augmented (AUG) networks. The symbol R refers to ranking. Corresponding Z-scores are provided. DEG refers to Degree centrality D-based rankings and N refers to frequency in the sample of 12 networks. For collated networks, results are for optimized (opt) and authentic (aut). The last six columns list the frequency of key concepts in individual (IND) networks, either optimized (opt) or with β = 1 when n = 10 (n10), 20 (n20), or 30 (n30) highest ranking concepts are included.
The Z-scores provided in Table 2 for G-values in the AUG network show that some concepts, like 57 and 70, have exceptionally high Z-scores. This indicates a substantially higher number of long contiguous paths than expected on the basis of local connectivity D. These concepts have exceptionally high global influence on the structure. Some other concepts, such as 2, 28, and 100, have exceptionally low Z-scores, which indicates that they are more important locally than globally, and do not affect the conceptual system on a global scale as much as expected on the basis of their local importance.
Table 3 summarizes the 16 top-ranking nodes based on values of C, which thus represent concepts with high local cohesion. The rankings in Tables 2 and 3 allow the following preliminary conclusions:
1. Nodes which rank high in D are also often high-ranking in G, but clear differences emerge.
2. Nodes which rank high in D and G are different from nodes which rank high in C.
Table 3. Ranking based on local clustering C for augmented network. Maximum value C = 1.00 indicates totally transitive (triadic) connectivity. Note that many are connected concepts in Table 2.
These observations are readily interpreted so that the structural positions of nodes are indeed different, and distinct sets of nodes can be formed on the basis of how they contribute to the connectivity of the network locally (D and C) and globally (G). Many nodes with high values of D are those that provide high local connectivity and are thus also important in initializing the contiguous patchwork within the network. The local connectivity as operationalized through D thus correlates strongly with contiguous global connectivity. The Spearman rank correlation coefficient between D and G is 0.87, while the correlation of C with D is in the range of 0.16-0.17. Therefore, a high value of D is a good predictor of a high value of global connectivity as measured by G, but does not determine it completely.
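The rank correlation used above to compare D, C, and G can be computed as in the following minimal sketch, which implements the standard Spearman coefficient (the Pearson correlation of ranks, with tied values given their average rank). The input lists are placeholders for node-wise centrality values and are not data from the study.

```python
import math


def average_ranks(values):
    """Rank values from 1 upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Placeholder centrality values for four nodes (illustrative only).
rho_same_order = spearman([1, 2, 3, 4], [10, 20, 30, 40])
rho_reversed = spearman([1, 2, 3, 4], [4, 3, 2, 1])
```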

Comparison of the values of D and G, and of the relative rankings based on these values for a given node, provides information on the global role of the node; nodes with higher values of G than expected on the basis of D are clearly globally the most important ones.
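Whether a node's global measure is higher than expected from its degree can be assessed with Z-scores against a degree-preserving null model. The sketch below is a simplified illustration under stated assumptions: the toy network is hypothetical, undirected, and unweighted (the networks in the study are weighted and directed); the null ensemble is sampled by random double-edge swaps, a common stand-in for the configuration model on simple graphs; and the global measure is a walk-counting score with weights β^k/k!. The procedure of Appendix A may differ.

```python
import math
import random


def to_adj(edges, n):
    """Adjacency lists of an undirected simple graph."""
    adj = [[] for _ in range(n)]
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    return adj


def degrees(edges, n):
    return [len(nbrs) for nbrs in to_adj(edges, n)]


def walk_centrality(adj, beta=1.0, max_len=30):
    """Walks of length k >= 1 starting at each node, weighted beta**k / k!."""
    n = len(adj)
    vec = [1.0] * n
    total = [0.0] * n
    for k in range(1, max_len + 1):
        vec = [beta / k * sum(vec[j] for j in adj[i]) for i in range(n)]
        total = [t + v for t, v in zip(total, vec)]
    return total


def rewire(edges, n_swaps, rng):
    """Degree-preserving randomization by double-edge swaps."""
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    done = attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if rng.random() < 0.5:
            c, d = d, c
        if len({a, b, c, d}) < 4:
            continue  # swap would create a self-loop
        if frozenset((a, d)) in present or frozenset((c, b)) in present:
            continue  # swap would create a duplicate link
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {frozenset((a, d)), frozenset((c, b))}
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return edges


def z_scores(edges, n, n_samples=100, seed=0):
    """Z = (observed - null mean) / null standard deviation, per node."""
    rng = random.Random(seed)
    observed = walk_centrality(to_adj(edges, n))
    samples = [
        walk_centrality(to_adj(rewire(edges, 10 * len(edges), rng), n))
        for _ in range(n_samples)
    ]
    z = []
    for i in range(n):
        vals = [s[i] for s in samples]
        mean = sum(vals) / len(vals)
        sd = math.sqrt(sum((v - mean) ** 2 for v in vals) / (len(vals) - 1))
        z.append((observed[i] - mean) / sd if sd > 0 else 0.0)
    return z


# Hypothetical 8-node network: two triangles joined by short chains.
n = 8
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4),
         (4, 5), (4, 6), (5, 6), (6, 7), (0, 7)]
z = z_scores(edges, n)
```

Because every rewired network keeps each node's degree, a large positive Z indicates more long walks through a node than its degree alone would suggest, and a large negative Z indicates fewer.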
To test the effect of parameter choices on rankings, we performed the analysis for values in the range of 1/32 < β < 32. The sets of key concepts were found for all 20 cases tested and were classified according to the value of compound parameter β. The results are shown in Figure 3 in the form of a fingerprint-map (a kind of heat-map) for all 121 concepts. In the fingerprint-map, concepts which have high communicability are shown as dark stripes. As can be seen, up to value log β ≈ −0.1 the resolution is poor because too many low communicability nodes (concepts) contribute to communicability. The resolution changes rapidly when value log β ≈ 0 is approached and improves slightly up to log β ≈ 1.2. The optimal scale to explore the network with robust results is thus −0.1 < log β < 1.2. Although the detailed values of the communicability vary in this region, variations are moderate, and the sets of the highest-ranking concepts remain very stable (among the 25 highest-ranking concepts around 90% are always the same). We now take the set of 35 concepts, which are found in all combinations among the 25 highest-ranking concepts. The Communicability centrality G cases of log β = 0, 0.6, and 1.1 are shown in Figure 4 in the form of radar-plots. For comparison, the Degree centrality D is also shown (note that D does not depend on β because it is a local measure).
The radar-plots show the 35 concepts which rank highly according to their value of G in the networks for a wide range of parameters β. The radar plots for G show that the augmented network (AUG) has a very robust set of key concepts, not much affected by the parameter choices. Some concepts which rank highly in providing local connectivity D (e.g., nodes 2, 71, 83, and 100) also rank highly in providing global connectivity G. However, some concepts which rank highly in providing global connectivity (e.g., nodes 57, 59, 116, and 117) do not rank highly in providing local connectivity. In AUG, it is remarkable that the distribution of high-ranking nodes is rather uniform. In particular, when it is noted that the module of electricity corresponds to nodes 1-54, that of magnetism to nodes 55-93, and that of electromagnetism to nodes 94-121, the distribution signals equal importance for each module. Of course, this is as expected, because AUG was augmented on the basis of expert evaluation of the student-produced networks; the experts added links they deemed to be of importance but missing from student expressions. Although the AUG is not a genuine expert-made network, it acts here as a benchmark.
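The robustness check described above, in which rankings are compared across values of β, can be sketched as a sweep over β that records how much of a reference top-k set survives. The sketch rests on assumptions: the toy network is hypothetical and unweighted, the 20 values of β are taken to be evenly spaced in log10 β between −1.5 and 1.5 (roughly 1/32 to 32), and the global measure is a walk-counting score with weights β^k/k!. The parameterization in Appendix A may differ.

```python
def walk_centrality(adj, beta, max_len=400):
    """Walks of length k >= 1 starting at each node, weighted beta**k / k!.
    max_len is set high so that the series also converges for large beta."""
    n = len(adj)
    vec = [1.0] * n
    total = [0.0] * n
    for k in range(1, max_len + 1):
        vec = [beta / k * sum(vec[j] for j in adj[i]) for i in range(n)]
        total = [t + v for t, v in zip(total, vec)]
    return total


def top_k(scores, k):
    """Indices of the k highest-scoring nodes."""
    return set(sorted(range(len(scores)), key=lambda i: -scores[i])[:k])


# Hypothetical 8-node undirected network given as adjacency lists.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4),
         (4, 5), (4, 6), (5, 6), (6, 7), (0, 7)]
adj = [[] for _ in range(8)]
for a, b in edges:
    adj[a].append(b)
    adj[b].append(a)

k = 3
reference = top_k(walk_centrality(adj, beta=1.0), k)  # key nodes at beta = 1

# Assumed grid: 20 values of log10(beta), evenly spaced from -1.5 to 1.5.
log_betas = [-1.5 + 3.0 * t / 19 for t in range(20)]
overlaps = [
    len(top_k(walk_centrality(adj, beta=10 ** lb), k) & reference) / k
    for lb in log_betas
]
```

Each entry of `overlaps` is the fraction of the reference key nodes that remain among the top k at that value of β; values near one across a range of β indicate a stable set of key concepts in that range.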
The key concepts in collated networks (COL-O and COL-A) are based on genuine student networks. In both cases, it is striking that the module of nodes 55-93 corresponding to magnetism is overrepresented in comparison to the module of electricity (nodes 1-54). Similar overemphasis on magnetism was observed in a closely related study [17] but with a set of concepts of more limited scope. Otherwise, the collated networks are very close to the augmented network, which indicates that the structures of AUG and COL support the relational connection between key concepts similarly; as far as the structure of conceptual knowledge is in focus, expert-augmented knowledge and students' group-level knowledge come close to each other.
The comparison of COL-O to COL-A provides an interesting observation: the results for key concepts are nearly indistinguishable, with only very small differences between the key concepts of COL-A and COL-O. This indicates that interrater agreement in the classification of nodes is perhaps not crucial, at least at the level of agreement obtained, for the relevance of the structural analysis. The different results of classification at the level of single nodes are masked by the importance of structure, which is in any case not captured by an assessment of interrater reliability. The situation in which traditional statistical tests designed for local quantities fail to be of relevance in networked systems is not uncommon, and it calls for critical thinking about the uses of traditional measures of reliability in concept-network analysis (cf. ref. [41]).

Content of Key Concepts
The structural network-based analysis picks out concept-nodes according to their role in local and global connectivity. The analysis, as such, does not refer in any way to the content of nodes, nor does it use any information about the content of a node in assessing its structural importance. Next, we turn to the question concerning the content of the key concepts.
The local connectivity, as operationalized by degree D, picks out certain nodes with high values of D, as shown in Figure 2 (upper row, left) and summarized in Table 2. In the expert network, the concept-nodes magnetic force (node 83) and electric charge (node 2) have the highest rankings. These concepts are central to the electrostatic and magnetostatic clusters of concepts. Node 2 (electric charge) has high connectivity because it is the central starting concept in the substantiation of many other concepts; node 83 (magnetic force) is used in many different models and experiments to interpret them, and thus connects with magnetostatics as well as electromagnetism. The magnetic force is F = qv × B, understood as part of the Lorentz force. Some other concepts with high values of D are field concepts, either related to electric fields (nodes 28 and 109) or to magnetic fields (nodes 91, 71, and 66). Node 28 is the electric field, empirically defined as E = F/q through the electrostatic force and understood as the space-filling field describing the electrostatic interaction. Node 91 is the magnetic field H, defined by the Ampere-Laplace law and Ampere's circuital law, which relates the field H to electric current. The magnetic flux density B, as it appears in node 71, is related to the empirical definition of the strength of magnetic interaction, through torque M and magnetic moment m and the relation M = m × B. The flux density B (node 66) interprets this in terms of magnetic fields in vacuum as created by current-carrying wires, through the Biot-Savart experimental law.
The high-ranking nodes related to electromagnetic induction are 100 (Faraday-Henry empirical induction law) and 109 (rotational electric field). Field concept 109 is also among the top 15 concepts. It is satisfying to find that the induction law is so prominently featured in the students' group-level (COL-A) knowledge base. In addition, another relatively well-substantiated node is 113, which is the Ampere-Maxwell law for rotational field. From the viewpoint of content (of university level physics), it is of course satisfactory that all nodes with high values are also central to the topics of electricity and magnetism; such concepts are thus key concepts structurally and in terms of content. Interestingly, magnetic force has the topmost ranking for local connectivity, indicating that it plays a key role for many other concepts, at least locally. In addition to field concepts, theoretical general principles, such as superposition of fields (node 27) and the principle of energy conservation in the context of mechanical work (node 33), have relatively high local connectivity.
The global contiguity of a node, as operationalized by communicability centrality G, is, as expected, correlated with values of D; if a node has a high value of D it often also has a high value of G, which indicates that the node is connected to many other nodes through contiguous paths. The concepts with the highest G values are related to magnetic fields (nodes 57, 71, 91, and 66), magnetic interaction (nodes 63 and 83), electromagnetic induction (nodes 100 and 109), and waves (nodes 116, 117, and 120). Nodes 57, 63, 116, 117, and 120 are interesting, because they do not have as high a local connectivity as might be expected on the basis of their global connectivity. In turn, node 28 has substantially lower global connectivity than expected on the basis of its local connectivity.
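The contrast between local connectivity D and global connectivity G can be illustrated with a minimal sketch. It assumes, as a simplification, that communicability is obtained from the matrix exponential exp(βa) of the adjacency matrix and that G for a node sums its communicability to all other nodes; the exact definition and normalization used in the study may differ. The function names and the five-node chain are hypothetical and not taken from the study.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(a, beta, terms=40):
    """Approximate exp(beta * a) with a truncated Taylor series."""
    n = len(a)
    scaled = [[beta * x for x in row] for row in a]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # holds (beta * a)^k / k!
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


def degree(a):
    """Local connectivity: row sums of a symmetric adjacency matrix."""
    return [sum(row) for row in a]


def communicability_centrality(a, beta):
    """Assumed form of G: communicability of a node to all other nodes."""
    e = mat_exp(a, beta)
    n = len(a)
    return [sum(e[i][j] for j in range(n) if j != i) for i in range(n)]


# Hypothetical undirected chain 0-1-2-3-4 (toy data, not from the study).
n = 5
A = [[0.0] * n for _ in range(n)]
for u, v in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    A[u][v] = A[v][u] = 1.0

D = degree(A)
G = communicability_centrality(A, beta=1.0)
print(D)
print([round(g, 3) for g in G])
```

In this toy chain, nodes 1, 2, and 3 all have the same degree, but the middle node 2 obtains a larger G because more and longer walks pass through it. This is the kind of difference between local and global connectivity that the rankings above rely on.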
The higher than expected global connectivity of the above-mentioned nodes is also reflected in their Z-scores, which are high. Similarly, node 28 has a very low Z-score. Node 57 is a conceptual element related to the empirical definition of the magnetic field, through quantitative experiments and measurement of force and torque. Node 28 has a similar epistemic role regarding electric fields and represents definitions of field strength through force. However, their global structural positions are very different. While node 57 supports many other concepts in modules of magnetism concepts, node 28 does not have an equally important role for concepts in the electricity concepts module. Also, nodes 113, 116, and 120, which are all related to the conceptualization of propagating electric fields, culminating in the introduction of propagating electric fields as node 120, are highly globally connected. This reflects the fact that they are linked with both electricity and magnetism modules through electric field and magnetic field concepts.
Among the top-ranking nodes are rotational electric fields (node 109) and Ampere's and Maxwell's law (node 113), which are rather high-level theoretical concepts related to electromagnetic induction phenomena and the induction law (node 100). Interestingly, the rotational magnetic field as a non-conservative field (node 85) has rank 3, although it does not appear among the 25 highest-ranking nodes in D. Apparently, in students' knowledge, node 85 plays a key role in connecting magnetostatics to the clusters of electrostatics and electrodynamics. The global importance of magnetostatics and electromagnetic induction is of course to be expected, because many of their concepts mediate between electrostatics and electrodynamics. For students, magnetostatics also seems to be of special interest, which was noted in a previous study focusing on learning concepts of electrostatics and magnetostatics [13].
The 35 nodes featured in the AUG network are all unquestionably central to the content, and several theoretically important and relatively abstract key concepts are found in this list. Many of these concepts are field concepts. Collated networks COL-O and COL-A do not feature all of these concepts as strongly as in the AUG network, but many of them are still among the key concepts of the collated networks. The COL networks are biased toward magnetism concepts, and for their part agree closely with the AUG network.
Some very central theoretical laws are included among the key concepts. The conclusion is that high values of D and G indicate that a concept is central from the viewpoint of theoretical, abstract content. Within that set of concepts, the concepts with high values of G are often related to magnetism and magnetic fields, thus revealing their role in providing global connectivity within the system of concepts and conceptual elements. Such global connections, consisting of the contiguous paths, form a skeletal patchwork that extends throughout the network and provide global cohesion and reachability of concepts otherwise distant.
The local (transitive) cohesion, as measured by clustering C, picks out concepts and conceptual elements which are very different from the high-D and high-G conceptual elements. The top conceptual elements with high C values reported in Table 3 are mostly model-based derivations of certain laws, models, experiments, or definitions. For example, in the expert network, some of the top-ranking nodes are: 107 (definition of mutual inductance); 89 (derivation of Ampere's circuital law by using a specific model of linear wire and cylindrical magnetic field); 4 (Millikan's oil drop experiment); 52 (example of using a homogeneous electric field); and 50 (derivation of Gauss's law through the model of a spherical conductor). The other top-ranking high-C concepts and conceptual elements are similar. For expert and novice networks, the 16 highest-ranking concepts according to their C values contain 12 which are derivations, models, model-based definitions, or experiments. Moreover, these items, and the ways they are reported in students' reports, closely match textbook presentations, some specific models offered in textbooks, and the problems discussed in them (compare with refs. [43,44]).
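The notion of triadic (transitive) cohesion can be made concrete with a short sketch of the standard local clustering coefficient for an unweighted, undirected network. This is an assumption made for illustration: the study analyzes weighted networks, and its operationalization of C may differ in detail. The function name and the toy network are hypothetical.

```python
def local_clustering(adj):
    """Fraction of a node's neighbour pairs that are themselves linked.

    adj maps each node to the set of its neighbours (undirected network).
    Nodes with fewer than two neighbours get C = 0 by convention.
    """
    c = {}
    for v, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            c[v] = 0.0
            continue
        nb = sorted(nbrs)
        # Count linked pairs among the neighbours of v (closed triads).
        links = sum(1 for i, a in enumerate(nb)
                    for b in nb[i + 1:] if b in adj[a])
        c[v] = 2.0 * links / (k * (k - 1))
    return c


# Hypothetical toy network: a triangle 0-1-2 with a tail 0-3-4.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0, 4}, 4: {3}}
C = local_clustering(adj)
print(C)
```

In the toy network, nodes 1 and 2 sit entirely inside a triangle and have C = 1, whereas node 0, which also links outward along the tail, has a lower C. A node can thus be tightly embedded locally without being the best-connected node.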
In summary, the network analysis supports the view that globally important concepts are joined together by long contiguous paths, and thus play a role in connecting distant parts of the network. These concepts are primarily abstract, general concepts, in the case of electricity and magnetism very often general field concepts. The concepts which are tightly and transitively (triadically) connected, and which form tight and closely connected clusters, are almost never the concepts with long contiguous paths. Such concepts are locally important, situational, and appear in the context of specific models, experiments, and examples related to them. Therefore, the abstract and situational concepts indeed have different and recognizable structural positions in students' representations of their knowledge.

Similarity Comparisons
The collated networks (COL-O and COL-A) contain 697 substantiated connections against 787 connections in the augmented network (AUG), while the number of connections in the individual networks on which the collated networks are based ranges from 64 to 129. The variation between networks is thus considerable. The communicability centrality of nodes with G > 0.5 in all 12 student networks is shown in Figure 5. The student networks share many high-ranking concepts with the expert network, but they also contain many high-ranking nodes which are not among the high-ranking nodes of the collated networks. There is thus overlap between the networks, but not a perfect match. A more detailed breakdown of the key concepts is revealed by the radar-maps of Communicability centrality G in Figure 6, to be compared with the radar-maps in Figure 3. As may be seen, with increased demand for quality of substantiation, obtained by increasing the value of β, very few nodes remain as high-ranking nodes. Only networks g4 and g5 remain relatively robust and retain their set of key concepts when β is increased.
The key concepts with high D- and G-rankings in novice networks are not as systematic a collection of concepts as found in the expert network, and many individual networks miss the abstract field concepts as high-ranking ones. On the other hand, nearly all top-ranking concepts and conceptual elements in individual networks are relevant to the topic. The comparison shows that although the 12 individual student networks contain the set of highly abstract and central concepts, these concepts are often not adequately substantiated in individual concept networks. The individual networks thus contain the knowledge which is contained in the collated networks, but in a very fragmented way; high-ranking concepts with well-substantiated connections are different in various individual networks, although at the group level the collection and the collated structure formed out of them are very satisfactory.
The novice network reveals a similar bias towards highly cohesive, locally triadically and transitively connected cliques with high values of C. These are again conceptual elements related to derivations, models, model-based definitions, and experiments (see Table 3). In novice networks, however, the high-C conceptual elements are different from the elements found in the expert network; only their type is similar. The diversity of high-C nodes is thus very high, contrary to the collection of nodes with high values of D and G, in which case the novice networks share many nodes with the expert network. The substantially low overlap of high-C nodes in novice and expert networks reveals that individual student concept networks, on which the novice network is based, have very few highly cohesive clusters in common, but such clusters are nearly always of a similar type: model-based derivations, model-based definitions, models, or experiments as found in textbooks.
On the other hand, the diversity of high-C nodes reflects the fact that high-C nodes are auxiliary, in the role of supporting (e.g., through derivation, definition, or modeling) the substantiation of highly abstract field concepts, which have globally more important structural and content-related roles. This interpretation of the results concerning the high-C nodes is in concordance with the notion that tight clustering of concepts may be characteristic of shallow knowledge related to a specific context, instead of being abstract and general [19]. Here, however, the nodes with high values of C do not represent shallow but auxiliary and very context-specific knowledge.
We base the similarity comparisons on the distribution of communicability centrality values; the more similar the distributions, the more similar are the networks. Such a comparison is most easily done on the basis of the so-called cosine-similarity S(g,g') for networks g and g' [24,40], which is defined in detail in Appendix A in terms of the communicability of nodes contained in the networks. Similarity S is normalized and attains values from 0 (complete dissimilarity) to 1 (complete similarity). The cosine-similarity S is a convenient way to compress the information and perform a similarity comparison by using only one number. This is of course a highly averaged view and a lot of information is lost. Nevertheless, it provides a comprehensive overall picture of how networks are related when attention is focused on key concepts (high communicability concepts). The cosine-similarity is shown in Figure 7.
The similarity comparison shows that the collated networks are rather similar; at the local level, corresponding to log β < −0.5, they are nearly completely similar (i.e., they have the same sets of key concepts). Even with increased scale of exploration, when long global paths are included, similarity remains high, ranging from about 0.98 for COL-O vs. COL-A to 0.82 for AUG vs. COL-A. The similarities of individual networks are also substantially high at the local level for log β < −0.5, on average being 0.75. When similarity on more global scales is explored by increasing the value of β, and when longer paths become involved, the similarities of individual networks to other individual networks drop drastically, as does their similarity to the collated, authentic COL-A network. On average, they saturate to a similarity of 0.3. This suggests that in individual networks, successfully substantiated links are all alike, while every unsuccessful link is unsuccessful in its own way. By collecting and collating the successful links, however, a highly satisfactory group-level conceptual structure emerges.
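The cosine-similarity comparison described above can be sketched in a few lines. The sketch assumes that each network is summarized by a vector of communicability centrality values, one per concept-node, listed in the same node order for both networks. The precise definition is the one given in Appendix A and may differ from this simplified form; the function name and example vectors are hypothetical.

```python
import math


def cosine_similarity(g, h):
    """Cosine of the angle between two equally long centrality vectors."""
    dot = sum(x * y for x, y in zip(g, h))
    norm_g = math.sqrt(sum(x * x for x in g))
    norm_h = math.sqrt(sum(y * y for y in h))
    if norm_g == 0.0 or norm_h == 0.0:
        return 0.0  # convention chosen here for an empty network
    return dot / (norm_g * norm_h)


# Hypothetical centrality vectors over the same four concept-nodes.
G_collated = [0.9, 0.7, 0.1, 0.0]
G_student = [0.8, 0.0, 0.2, 0.5]
S = cosine_similarity(G_collated, G_student)
print(round(S, 3))
```

Because centrality values are non-negative, S stays between 0 and 1. Two networks with proportional centrality profiles give S = 1 even if their absolute values differ, so the measure compares which concepts are key rather than how strongly they are connected overall.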
Consequently, the results suggest that pre-service teacher students' knowledge is piecewise and partial from the viewpoint of the global connectivity required of expert-like knowledge, yet it has many expert-like parts. If such shared knowledge is supposed to be the basis of mutual communication and sharing of knowledge, the similarity may be too low for effective discussion; on the other hand, it provides a lot of unused potential to learn through sharing the knowledge that is not yet shared.

Discussion and Conclusions
Cognitively-oriented research on learning has pointed out that relational knowledge and relational interlinked dependencies between concepts are crucial in learning conceptual knowledge and the meaning of concepts, especially in the case of abstract concepts. The relational view of knowledge lends itself easily to an approach in which students' conceptual knowledge (or conceptual knowledge in general) is considered as a networked relational system. Informed by such notions, we have here investigated pre-service teachers' declarative knowledge of physics concepts (in electricity and magnetism) to find out if the structural position of a concept within a networked system of concepts is indicative of its abstractness or context dependence. We focus here on the knowledge that students can express by drawing concept networks and by writing explanations of what those drawn networks contain and describe. Such knowledge is declarative conceptual knowledge, and because it is expressed through terms and names of concepts and conceptual structures (like models), it can be analyzed as a lexicon of scientific knowledge, in which connections between concepts derive from how they can be used together with other concepts.
The purpose of the study is to show how the interlinked connections of nodes, locally and globally, can be used in analysis of such a network, and in revealing how different elements of the network are supported through their weighted connection to other nodes in the network. The methods of analyzing such networks introduced here augment the traditional methods, which most often focus either on local properties through counting links [8,11,45] or on qualitative global relational properties by visual inspection [46,47]. The concept networks analyzed here represent pre-service teacher students' (N = 12) understanding of physics concepts and their interrelationships as a collated network, consisting of 121 nodes and 787 links in its most extensive form when all links are included (augmented network), or only 602 links when substantiated links are considered (novice network). In addition, a third network was formed on the basis of the augmented network by having experts add links that they felt were missing (expert network). All three networks were analyzed by operationalizing the notions of local connectivity (degree centrality D), global connectivity (communicability centrality G), and local cohesion (local clustering coefficient C). We can now provide answers to the first two of our three research questions as follows.
Research question 1 asked "Which concepts and conceptual elements have high connectivity, and of which type?" The results of the analysis show that the set of concepts which have high global connectivity and coherence (high values of G) are predominantly theoretical and abstract concepts, many of them field-concepts. These concepts form the theoretical skeletal conceptual frame of the network. Some of the concepts in that group play a key role in connecting different, well-connected clusters of concepts. All concepts and conceptual elements that have high values of D and G are also central from the point of view of content. This finding is of course satisfying and not trivially expected to be featured in students' representations of their knowledge.
Research question 2 asked "How do the concepts with high connectivity of various types differ in their content?" The results show that another set of concepts is the set with high local cohesion in form of transitive (triadic) connectivity, as measured by local clustering coefficient C. These concepts and conceptual elements are specific derived models or specific examples, or auxiliary concepts connected with them. They are not nodes in the network having high local coherence, but they have a key role in supporting or augmenting the theoretical skeletal structure formed by the abstract theoretical concepts with high global connectivity and coherence. To our knowledge, no previous studies have demonstrated similar results as conclusively, based on detailed network analysis as presented here.
The notion that concepts and conceptual elements with high global connectivity are predominantly theoretical and abstract parallels the notion that concepts of abstract knowledge are not closely connected, and that such connections in the case of experts' knowledge may be parsimonious [19]. This opens an interpretation that the tendency that high-G concept-nodes have lower global connectivity than expected on the basis of their local connectivity may be a feature of expert-like knowledge instead of an indication of difficulty in creating such connections. On the other hand, it has been suggested that novices' shallow knowledge is only locally connected and lacks connections that are contiguous and indirect [19]. In the present case, such conceptual knowledge is recognized as concept-nodes which have relatively high values of local clustering (C) and high Z-scores, but low values of global connectivity (G).
We have not analyzed individual student networks in detail but have only compared the similarity between the collated novice network and the augmented and expert networks on the basis of the distribution of the values of local connectivity D, global connectivity G, and local cohesion C of their nodes. Similarity between the networks is taken to be higher when the similarity of rankings of the nodes based on the values of D, G, and C is greater. This analysis allows us to answer the third and last research question.
Research question 3 asked "How does individual students' knowledge relate to group-level knowledge?" The results show that the augmented network (i.e., expert network) is nearly identical to the collated network (student knowledge at group-level), but the novice network has only about 40% similarity with the augmented network. An additional comparison between the 12 individual networks, each made by one student, shows that the mutual similarity of individual networks is also roughly 40%. The conclusion is that, on average, individual student-made networks share a substantial number of the highest-ranking nodes, and the best substantiation of nodes as collected in the collated networks comes from only a few networks. This indicates that high-quality conceptual knowledge, as collected in the novice network, is distributed sparsely and frugally within the group of 12 students. On the other hand, when demands on substantiation of the knowledge are relaxed and recognition on the level of identification of a connection is sufficient, the collated network (augmented network) is a highly satisfactory collection of potentially valid connections. This means that students' knowledge is dispersed, and as such not necessarily easily retrieved or consolidated collectively. If the dispersed knowledge could be consolidated through collaborative knowledge elaboration, that would provide an immensely effective instructional approach. It remains a challenge to instruction and teaching to find ways to make this distributed knowledge available to all students in the group.
This study, although it reveals many important relations between the structural position of concepts and their abstractness, has its limitations. The most obvious limitation, of course, is that we restrict attention to declarative conceptual knowledge. However, declarative conceptual knowledge is indispensable in problem solving and in acquiring procedural knowledge. Also, it is the most obvious type of knowledge to pay attention to in communication. In this respect, the limitation to declarative knowledge is shared with many other studies in learning and science education. Regarding the method, the assumption that connectivity is the property of most interest is the most crucial one. The motivation to focus on connectivity is based on its role in the relational view of concepts as well as in analyses of the structure of scientific knowledge. Connectivity, however, as a term related to network science, is not self-explanatory and has different interpretations [48]. Our data is only a time slice of a more complex, dynamic knowledge acquisition and processing situation, and thus we are restricted to exploring the connectivity at a static level. In that limited sense, the way in which we discuss connectivity agrees with how it is understood within network science. It should also be noted that we have deliberately avoided using coherence and hierarchy as key notions characterising the structure of knowledge. The reason is that coherence is a notion which is too elusive to be operationalized, while hierarchy can be operationalized but then requires very reliable information on the directedness of connections. Directedness, however, in cases where students represent their knowledge, would be too awkward because we have little understanding of what students represent when they represent directions in their concept networks.
Finally, it is necessary to discuss how the analysis of the structure of knowledge relates to other dimensions of learning. One of the main notions contained in cognitively- and psychologically-oriented research is the key role of the structure of a teacher's knowledge in instruction; experts are more efficient in facilitating higher-level learning than novices because they have better mastery and organization of target knowledge than novices [19,22,49,50]. However, a study focusing on problem-solving has demonstrated that the use of conceptual knowledge is strongly situational. Interestingly, in studies concerning problem-solving in electricity and magnetism [12,15], some concepts classified as typical for low-level knowledge are in our study found to be key concepts with high global connectivity. Nevertheless, in both cases the field concepts are classified as high-level concepts. The difference between the results of research focusing on situational problem-solving and on concepts as part of a system of knowledge (as in the present study) may simply be due to the focus on different dimensions of learning. Conceptual learning involves learning to use concepts in specific problems and, in addition, analyzing the norms and relations regarding how concepts are connected as part of a system. The latter ability, though needed in problem-solving, may not be fully visible in addressing closed and traditional problems, but may become visible when dealing with open and more complex problems. Network methods as introduced here are suitable tools for tracking the large-scale structure of students' declarative knowledge and for the cartography of conceptual semantic fields, revealing how students conceive and rationalize the relational connections between concepts. In the future, these kinds of methods need to be connected to research which explores the ways the concepts are used in different contexts, situations, and problem solving, where, in addition, procedural or strategic knowledge is needed.
In summary, we have shown how pre-service teacher students' declarative conceptual knowledge builds on relational connections between concepts, and how the local cohesion and global connectedness of students' knowledge builds up through relational connections. The analysis shows that, with increasing comprehensiveness and richness, abstract advanced concepts are recognized through their high global connectivity and the contiguity of paths through which they are connected to other concepts. The locally cohesive concepts, on the other hand, are auxiliary supporting concepts, specific textbook-type experiments, and model-type conceptional elements. We see three advantages to the type of approach to science education research presented here: explicated views of learning provided by relational structure of knowledge [19–22,25–28]; rationalization informed by analysis of knowledge as a system coming from philosophical analyses of scientific knowledge [29–31]; and conceptualization and operationalization of central quantities used in analysis based on a network view of knowledge [36–38], which is in concordance with the view on learning and conception of knowledge that forms the basis of the study. The results help to better define the notion of structure of pre-service teacher students' relational knowledge, its cohesiveness, connectedness, and contiguity, to point out how to make such properties of knowledge visible and approachable in research, and to recognize their role in teaching and instruction.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A. Mathematical Models of Analysis
In the epistemic analysis, the epistemic strength s(v) of node v is a measure of how strongly it is substantiated. If the given node v has intrinsic epistemic strength s(v) and it is connected to a set of other nodes, we can operationalize the effective epistemic strength by assigning to each link connecting nodes v and u a weight a, which derives its strength from the node it originates from: a(v→u) = s(u). The epistemic strengths of the nodes are thus transformed to link weights a(v→u). The weights are elements of adjacency matrix a, which then contains all information of connections between nodes and their epistemic weights, which makes the network amenable to flexible analysis as a weighted, directed network. Analysis of connectivity of nodes in the network is based on the weighted network.

A1. The Centralities
The analysis method introduced here is based on finding the nodes that are important in the network, locally and globally. Each node in the network represents a knowledge element, and thus links between the nodes directly represent the relations between knowledge elements. The analysis of the relational structure and coherence of the network requires that the relevant information of the connections yield to quantitative analysis. In Section 3 we introduced the operationalization of the desired types of local and global connectivity in the form of: (1) local connectivity; (2) local cohesion; and (3) global connectivity. In what follows, we introduce mathematical formulae to calculate them by counting connections and paths. All quantities are defined in terms of the adjacency matrix. The adjacency matrix $\mathbf{a}$ has elements $a_{ij} = [\mathbf{a}]_{ij}$, where $a_{ij}$ is the epistemic weight of the link when nodes i and j are connected and is otherwise zero.
1. Degree centrality D as a measure of local connectivity. Nodes of high local importance have many connections, so that they are adjacent to many nodes in the network. This kind of centrality is measured through the total number of links attached to the given node and is thus called the Degree Centrality D of the node. The degree centrality D [35,36] is the weighted number of links (out- and ingoing) $D_v$ attached to a given node v:

$$D_v = \sum_j \left( a_{vj} + a_{jv} \right) \qquad \text{(A1)}$$

Degree centrality for a weighted network as defined here is sometimes called strength. It is an efficient and simple measure to gauge local connectivity, but it only provides information on connections to adjacent nodes, i.e., to the nearest neighbours.
2. Local clustering coefficient C as a measure of local cohesion. Local clustering $C_v$, which measures the extent to which the nodes adjacent to a given node v are connected to each other so that they form fully connected triads with v, is defined as the ratio [39,40]

$$C_v = \frac{N_{\triangle}(v)}{N_{\wedge}(v)} \qquad \text{(A2)}$$

where $N_{\triangle}(v)$ is the number of fully connected triples (triads) and $N_{\wedge}(v)$ is the number of triples connected by two links only (elementary spokes); the explicit expression in terms of the adjacency matrix is given in Table A1. Transitive connections, in which links between A and C and between B and C are often accompanied by a link between A and B, lead to high clustering [39,40].
3. Communicability centrality G. This centrality operationalizes global connectivity by counting all paths (or walks) between nodes. The weighted matrix can be used directly to obtain the number of walks. This is based on the notion that there is a walk p→q if $a_{pq} \neq 0$, a walk p→j→q if $a_{pj} a_{jq} \neq 0$, a walk p→j→k→q if $a_{pj} a_{jk} a_{kq} \neq 0$, etc. Correspondingly, a walk of two links from p to q exists if $[\mathbf{a}^2]_{pq} \neq 0$, and a walk of three links if $[\mathbf{a}^3]_{pq} \neq 0$. In a connected network, the number of long walks increases rapidly, nearly factorially with the length of the walk, because different combinatorial possibilities emerge, and one is interested in the relative weight of such walks. Therefore, the number of walks is usually divided by the factorial, to obtain [18,40]

$$G_{pq}(\beta) = \sum_{k=0}^{\infty} \frac{\beta^{k}}{k!} \, [\mathbf{a}^{k}]_{pq} = [\exp(\beta \mathbf{a})]_{pq} \qquad \text{(A3)}$$

The Communicability centrality $G_v$ of node v is then obtained from these matrix elements (Equation A4, see Table A1), where $G_{vv} = [\exp(\beta \mathbf{a})]_{vv}$. From the definition in Equation A4 it can be seen that the Communicability centrality is a fully holistic network measure. The communicability G, which will be used to explore the contiguous connectivity of the networks and the role of specific nodes (key concepts) in providing the connectivity, uses the parameter β to tune the length of paths to be included in the exploration of the network. A value β << 1 makes the exploration strictly local, while β >> 1 explores the entire network, granting all path lengths nearly equal importance. The value β = 1 corresponds to the case where paths including L links are weighted by the inverse factorial 1/L! of the number of links. Although the number of alternative paths between two distant nodes does not always grow in a strictly multiplicative manner, in practice the factorial provides a reasonable and robust normalization [40].
The Communicability centrality G, which quantifies the property of contiguous connectivity, is thus the most important of the operationalized measures, because with it, by increasing the values of β, we can explore the effect of long contiguous paths on the communicability between nodes. The centrality measures used in analysis are summarized in Table A1 with their mathematical definitions.
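The role of β in the walk counting above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the four-node network, the unit link weights, the function names and the truncation length `max_len` are all assumptions made for illustration. It evaluates the weighted degree (strength) of Equation A1 and a truncated version of the series for exp(βa).

```python
from math import factorial

def mat_mul(x, y):
    """Product of two square matrices given as lists of lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def strength(a):
    """Weighted degree D_v: sum of out- and ingoing link weights of each node."""
    n = len(a)
    return [sum(a[v][j] + a[j][v] for j in range(n)) for v in range(n)]

def communicability(a, beta=1.0, max_len=30):
    """Truncated series G = sum_{k=0}^{max_len} beta^k a^k / k!.

    max_len is an illustrative cut-off (assumption); the exact quantity
    is the matrix exponential exp(beta * a).
    """
    n = len(a)
    power = [[float(i == j) for j in range(n)] for i in range(n)]  # a^0
    g = [[0.0] * n for _ in range(n)]
    for k in range(max_len + 1):
        weight = beta ** k / factorial(k)
        for i in range(n):
            for j in range(n):
                g[i][j] += weight * power[i][j]
        power = mat_mul(power, a)  # a^(k+1) for the next term
    return g

# Hypothetical directed network: 0 -> 1 -> 2 -> 3 plus a shortcut 0 -> 2,
# all links with unit epistemic weight (assumption).
n = 4
a = [[0.0] * n for _ in range(n)]
for i, j in [(0, 1), (1, 2), (2, 3), (0, 2)]:
    a[i][j] = 1.0

D = strength(a)
G_wide = communicability(a, beta=1.0)   # long walks contribute noticeably
G_local = communicability(a, beta=0.1)  # long walks are strongly suppressed
print(D)
print(G_wide[0][3], G_local[0][3])
```

In this toy network, node 3 is reached from node 0 only through one two-link walk and one three-link walk, so the element G[0][3] equals β²/2! + β³/3!. Lowering β from 1 to 0.1 shrinks this contribution by roughly two orders of magnitude, which is the sense in which small β keeps the exploration local.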

A2. Reliability Analysis
An important part of network analysis is the analysis of the reliability and the statistical significance of the results. This can be accomplished by comparing the results of the analysis to results obtained from an appropriate null-model [40,51]. To decide which features and which values of centralities are exceptional and not determined simply by the size (number of nodes and links) of the networks and the distribution of in- and out-going links among the nodes (in- and out-degrees of the nodes), we need to define a null-model which preserves the number of nodes and links and the direction of links but destroys the correlations contained in the linkages of different nodes in the collated network. Such a null-model is obtained by rewiring all links in the network. In rewiring, two nodes are selected at random and their outgoing links are switched, as well as the in-coming links. When this is repeated many times, the correlations initially present in the wirings are wiped out. In this study, we used 5000 re-wirings to produce each rewired network and repeated the procedure 1000 times to obtain an ensemble of networks to be compared with the original collated networks. All re-wirings were performed with IGraph software [52].
The average values of the variables O ∈ {C, G} were calculated for the ensemble of rewired networks, with averages and standard deviations denoted by $\langle O \rangle$ and $\mathrm{std}_O$, respectively. The statistical significance of the different centralities and the reliability of the results can be assessed by calculating the so-called Z-scores (i.e., standardized form) of variable O, defined as [40,51]

$$Z = \frac{O - \langle O \rangle}{\mathrm{std}_O} \qquad \text{(A5)}$$

where O is the observed value in the empirical sample, $\langle O \rangle$ is the corresponding average value in the ensemble of networks produced by the null-model, and $\mathrm{std}_O$ is the corresponding standard deviation. Reliability and statistical significance require that Z-values are high enough: usually the value Z = 2 is taken as a limiting case. Assuming that the variables O are normally distributed, Z = 2 corresponds to a one-sided p-value of about 0.02, while Z = 3.0 and Z = 3.5 correspond to p-values of about 0.001 and 0.0002. Here, we have chosen to use Z = 2 as a cut-off for statistically significant deviations deserving special attention.
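The null-model comparison can be sketched in a few lines of Python. The sketch below is an illustration under stated assumptions, not the IGraph routine used in the study: it uses a common edge-swap scheme (two links a→b and c→d are replaced by a→d and c→b), which preserves every node's in- and out-degree; the edge list, the function names and the use of the population standard deviation are hypothetical choices.

```python
import random
from statistics import mean, pstdev

def rewire(edges, n_swaps, rng):
    """Degree-preserving randomization of a directed edge list.

    Repeatedly picks two links (a->b, c->d) and swaps their targets to
    (a->d, c->b). Swaps that would create a self-loop or a duplicate
    link are skipped, so in- and out-degrees never change.
    """
    edges = list(edges)
    present = set(edges)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if a == d or c == b or (a, d) in present or (c, b) in present:
            continue  # rejected swap
        present -= {(a, b), (c, d)}
        present |= {(a, d), (c, b)}
        edges[i], edges[j] = (a, d), (c, b)
    return edges

def z_score(observed, null_values):
    """Z = (O - <O>) / std_O, with mean and std taken over the null ensemble.

    Uses the population standard deviation (assumption); raises
    ZeroDivisionError if all null values are identical.
    """
    return (observed - mean(null_values)) / pstdev(null_values)

# Hypothetical small network and a small ensemble (the study itself used
# 5000 re-wirings per network and 1000 networks).
rng = random.Random(0)
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2), (1, 3)]
ensemble = [rewire(edges, 50, rng) for _ in range(20)]
```

Any node-level observable, such as C or G, would then be evaluated on the empirical network and on each member of `ensemble`, and the resulting values passed to `z_score`.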

A3. Similarity Analysis
Comparison of different networks is done on the basis of the so-called cosine-similarity S(g,g') of networks g and g' [24,40], which is defined in terms of the communicability of the nodes contained in the networks. If the communicability centrality of node k is given by $G_k$ in network g and by $G'_k$ in network g', the cosine-similarity of networks g and g' is defined as

$$S(g,g') = \frac{\sum_k G_k \, G'_k}{\sqrt{\sum_k G_k^{2}} \, \sqrt{\sum_k {G'_k}^{2}}} \qquad \text{(A6)}$$

The cosine-similarity S is normalized to have values from 0 to 1. High values of G have more weight, while low values contribute less, and values of zero contribute nothing to the similarity. The similarity is thus biased to be sensitive to high-G nodes.
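As a minimal illustration of the cosine-similarity, the following Python sketch compares two hypothetical sets of node communicabilities. The concept labels and numerical values are invented for the example, and treating a concept that is absent from one network as having G = 0 is an assumption of the sketch.

```python
from math import sqrt

def cosine_similarity(g, g_prime):
    """Cosine-similarity of two networks from node communicabilities.

    g and g_prime map node labels to non-negative communicability values;
    a node missing from one network is treated as having value 0
    (assumption), so it contributes nothing to the numerator.
    """
    nodes = set(g) | set(g_prime)
    dot = sum(g.get(k, 0.0) * g_prime.get(k, 0.0) for k in nodes)
    norm = sqrt(sum(v * v for v in g.values())) * \
        sqrt(sum(v * v for v in g_prime.values()))
    return dot / norm if norm > 0 else 0.0

# Hypothetical communicability values for three concept networks.
net_a = {"field": 3.0, "charge": 4.0}
net_b = {"field": 3.0, "charge": 4.0}
net_c = {"field": 4.0, "force": 3.0}

same = cosine_similarity(net_a, net_b)
partial = cosine_similarity(net_a, net_c)
print(same, partial)
```

Here `net_a` and `net_c` share only the concept "field", so only that node contributes to the numerator; the shared high-G node alone produces a similarity of about 0.48, while identical networks give 1.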