The Relationship between the Outdoor School Violence Distribution and the Outdoor Campus Environment: An Empirical Study from China

It is widely believed that outdoor environmental design contributes to outdoor violence prevention. To enhance the effectiveness of environmental design, the intrinsic link between the outdoor school violence distribution (OSVD) and the outdoor campus environment (OCE) should be fully considered. For this purpose, this study investigated boarding school L, located in southern Zhejiang Province of China, through a questionnaire and Spatial Syntax theory. Based on the questionnaire marker method (N = 338, 50.59% female), the OSVD was mapped using kernel density estimation in ArcGIS for four types of violence: teacher-student conflict, verbal bullying, physical conflict, and external intrusion. The spatial analysis of the OCE (spatial configuration and spatial visibility) was then generated in DepthmapX, involving four spatial attributes: integration, mean depth, connectivity, and visibility connectivity. Statistical analysis indicated a correlation between the OSVD and both the spatial configuration and the spatial visibility of the OCE. The relationships differed by violence type: integration was a significant predictor of teacher-student conflict and physical conflict (p < 0.01) and a general predictor of verbal bullying (p < 0.05), while mean depth was a significant predictor of physical conflict (p < 0.01) but is not recommended as a predictor of external intrusion. This study explores and predicts the relationship between the OSVD and the OCE, providing guidance and evidence for environmental design aimed at school violence prevention. It is a novel attempt, but it remains challenging and requires more research to refine.


Introduction
Although schools are supposed to be harmonious and stable places, there have been many incidents of student violence in recent years, which have seriously damaged the campus atmosphere. School violence takes many forms, including not only physical conflict but also social violence such as verbal intimidation, cyber violence, group isolation, and student-teacher conflict [1,2]. The persistence of school violence can affect the order of teaching and learning, the campus living environment, and the healthy development of students to varying degrees. In particular, it can damage students' physical and mental health [3], reduce subjective well-being [4,5], and cause serious psychological disorders [6,7]. Meanwhile, school violence is a global problem: according to reports published by UNESCO [1,8], some 246 million children worldwide experience different forms of violence, and almost a third (32%) of students are bullied by their peers on one or more days. In addition, the proportion of bullied students is highest in the Middle East, North Africa, and Sub-Saharan Africa and lowest in Central America, the Caribbean, and Europe. In China, a field study conducted by Central China Normal University in six provinces, with a sample of more than 10,000 students in more than 130 primary and secondary schools, showed that the school violence rate was 32.4% from 2019 to 2020 [9,10]. It is, therefore, clear that research into school violence prevention is urgent, especially in developing countries such as China.
In recent years, many scholars around the world have begun to explore the causes of school violence from the perspectives of culturally transformative education [11], adolescent personality traits [12,13], student physicality [14], family background [15], teaching climate [16], and social capital [17] in order to prevent or reduce the occurrence of school violence, and have proposed violence prevention measures such as dialogue models [18], drama therapy programs [19], and community participation integrated models [20,21]. Benefiting from these important research findings, the majority of schools around the world have now implemented prevention and control efforts that are based on safety education for school violence, with proactive interventions such as increased surveillance equipment and patrols [22]. Different countries have also developed specific and effective approaches. Ireland [23], the United States [24], and South Korea [25], for example, have incorporated school violence into their youth legislation and policies to regulate it. In Finland, the Ministry of National Education has initiated the development of a curriculum system plan for the protection of students against violence in schools [26]. The Ministry of Education of Korea has launched the Regional Professional Programme to guide the prevention and follow-up of school violence through the development and implementation of special local programs that are led by schools and communities [27]. In addition, the Olweus Bullying Prevention Programme at Clemson University, USA, which includes whole-school, classroom, individual, and community components, was designed and evaluated for elementary, middle, junior high, and high school (K-12) and has been validated in relevant evaluation studies [28]. These social interventions are well suited to indoor violence prevention but are less effective for outdoor violence.
However, the lack of access and lighting in outdoor school environments [29,30] greatly increases the likelihood of violence, making it equally important to conduct research on outdoor school environments to prevent outdoor violence. Newcastle University in the UK launched an experiment called "Watchful Eye" [22] in response to the frequent theft of bicycles on campus. By putting up posters to psychologically intimidate potential criminals, the number of bike thefts on campus was halved after 12 months. Sandy Hook Elementary School in the USA was the scene of a serious school shooting in 2012, in which a number of students and teachers were killed. In the aftermath, Jay Brotman redesigned the school in the context of crime prevention theories [31]. Using a sample of high schools in New Jersey, USA, Aggarwal J compared data to find a stratified relationship between schools (buildings and facilities) and potential gun violence harborage, noting the need for timely response mechanisms between schools and government [32]. Further, Ning Ma of the Beijing University of Technology, China, designed a safety assessment method for the visually impaired based on the Visual Access and Exposure model of the outdoor campus environment, providing guidance for design and optimization [33]. As can be seen, although there has been some empirical research on school violence prevention through environmental design, it is mostly qualitative research and practice. Therefore, the outdoor school violence distribution (OSVD) should be fully considered in relation to the outdoor campus environment (OCE) in order to increase the design effectiveness.
In response, a large number of scholars have carried out long-term research and practice, and a more complete theoretical system has been formed for the study of the relationship between criminal behavior and the spatial environment [34][35][36][37]. In 1961, Jane Jacobs analyzed the security of street space from a sociological perspective [38] and introduced the concept of the "Street Eye", while in 1971, C. Ray Jeffery developed the first generation of Crime Prevention Through Environmental Design (CPTED) theory [39]. Then, the con-
In summary, in the face of the growing problem of school violence, social measures such as security patrols and safety education already exist, and physical interventions based on CPTED theory are being developed. However, quantitative research on the relationship between the OSVD and the OCE, which is crucial for effective prevention, is still lacking. To fill this gap, the aim of this study is to quantitatively investigate the relationship between the OSVD (including teacher-student conflict, verbal bullying, physical conflict, and external intrusion) and the spatial attributes of the OCE (including the Integration, Mean Depth, and Connectivity of spatial configuration, as well as the Visibility Connectivity of spatial visibility) through a case study, based on questionnaires, GIS techniques, and Spatial Syntax theory with its software. Through correlation and multiple regression analysis, regression models between the distribution of each outdoor school violence type and the outdoor campus environment were then developed, indicating the corresponding spatial attribute predictors. The results may provide guidance and theoretical support for schools in planning, designing, and renovating outdoor environments for safety.
The research questions addressed in this paper include: (1) What are the spatial distribution characteristics of different outdoor school violence types? (2) Is there a relationship between the distribution of each outdoor school violence type and the spatial attributes (spatial configuration and spatial visibility) of the OCE? What is the relationship? (3) How do the various spatial attributes of the OCE affect the distribution of each outdoor school violence type? What are the predictors? Figure 1 displays the flow framework of this study, corresponding to the three sections of research questions, research methods, and expected results. Firstly, the violence occurrence situation (types of violence, number of occurrences, spatial points of occurrence) and the current environment evaluation in the case school were collected through a questionnaire survey to produce a spatial point distribution map of outdoor school violence. The kernel density bandwidth values were then calculated through Ripley's K Function, and the kernel density values in the case school were computed using kernel density estimation to map the kernel density distribution of the different outdoor school violence types. Secondly, based on Spatial Syntax theory, the spatial simulation analysis was conducted by building models of the OCE, and the vector distribution graphs of spatial configuration and spatial visibility were generated using DepthmapX software. Finally, statistical analysis, including correlation analysis and multiple regression analysis, was carried out to examine the relationship between the various types of OSVD and the OCE, and to identify the regression models and predictors.

Field Site
This study was carried out in Case L, a boarding secondary school that is located in southern Zhejiang Province of China. The school was chosen as a case study for the following reasons.
[73] whose parents are often in the laboring group. These students' personalities tend to become stubborn, distrustful, lonely, and rude due to the lack of family care, as well as psychological problems such as anxiety, loneliness, and boredom that are caused by adaptation to the environment. Students in boarding schools are, therefore, a special group, with not only the general psychological characteristics of contemporary students, but also the special psychological problems of left-behind children.
• Representative violence rate: A random sample of students in different grades obtained an outdoor violence rate that reached 58.01%, which is higher than the abovementioned violence rate in China (32.4%) and the global violence rate (32%), indicating that Case L is strikingly representative [68].
Consequently, Case L was finally selected for the study based on full consideration of the significance and generalizability of the findings. The students of Case L are mainly from the seventeen surrounding towns and eight neighborhoods, and are divided into three grades (freshman, sophomore, and senior) with a total of 27 classes and about 1000 students (10 classes in the freshman year, 13 classes in the sophomore year, and 4 classes in the senior year; the number of seniors was small because most students had transferred or dropped out of further studies). The school covers an area of 67,333.3 m² with a building area of 57,000 m². Figure 2 illustrates that the school is separated into two areas by a river, with the northern part containing the teaching buildings, gymnasium, dining hall, male dormitories, and playgrounds, and the southern part including the female dormitories and parking lots. In addition, the roads (denoted by lowercase letters a to s) and the main spatial nodes with actual photographs (denoted by Arabic numerals 1 to 20) have all been marked one by one. Case L also has three playground areas, a square with a garden, and several landscape nodes. There is currently only one exit from the school, located to the north-east of the basketball court, which connects to the road leading to the town.

Investigation Method
In cooperation with teachers and students, the research team used recess time to administer questionnaires to students from different grades by using a combination of stratified and random sampling. More specifically, the three grades (998 students in total) were divided into three tiers, i.e., the freshman year as a tier (376 students), the sophomore year as a tier (462 students), and the senior year as a tier (160 students). As suggested by William G. Cochran [74], a further random sample of approximately 35% of the students was then selected from each tier to participate in the survey, comprising 132 freshmen (35.11%), 162 sophomores (35.06%), and 56 seniors (35.00%), for a total of 350 tested students (1:1 male to female ratio), so as to achieve a certain representativeness, as detailed in Table A1. The inclusion criteria for the sample were regular students who were enrolled in Case L and who were informed and consented to participate in this study, while the exclusion criteria were students with mental disorders, cognitive impairment, or who were unable to cooperate for other reasons. Table A2 shows the details of the questionnaire, which consisted of three sections: personal information statistics (non-scale questions), outdoor school violence statistics (non-scale questions), and current environmental assessment (scale questions). The first part was designed to find out whether the composition of the tested students was balanced and reasonable. The second part aimed to understand the types of violence, the number of occurrences, and the spatial points of occurrence through item quizzes and spatial point markers. The third part was intended to support the subsequent exploration of the relationship between the OSVD and OCE by means of the subjects' satisfaction with the current outdoor environment (16 questions were set in conjunction with the literature [75], containing 5 scales).
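The proportional allocation described above can be checked with a short sketch. The tier sizes and the 35% fraction come from the text; the function names, the use of student indices as identifiers, and the fixed random seed are illustrative assumptions, not the procedure actually used by the research team.

```python
import random

# Tier sizes reported in the text (students per grade).
strata = {"freshman": 376, "sophomore": 462, "senior": 160}

def allocate(strata, fraction):
    """Proportional allocation: round each tier's share to the nearest student."""
    return {name: round(size * fraction) for name, size in strata.items()}

def draw(strata, allocation, seed=0):
    """Draw a simple random sample of student indices within each tier.

    The seed is a hypothetical choice made only so the sketch is repeatable.
    """
    rng = random.Random(seed)
    return {name: sorted(rng.sample(range(strata[name]), allocation[name]))
            for name in strata}

allocation = allocate(strata, 0.35)
selected = draw(strata, allocation)
for name, n in allocation.items():
    print(f"{name}: {n} of {strata[name]} ({100 * n / strata[name]:.2f}%)")
print("total:", sum(allocation.values()))
```

Rounding 35% of each tier reproduces the reported tier samples of 132, 162, and 56 students, which sum to 350.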
Once the questionnaires were returned, individual data samples needed to be screened and processed to improve the accuracy and completeness of the underlying data based on the following criteria [76,77].

• Deletion: When a questionnaire contained invalid responses throughout, the sample was simply deleted.
• Partial deletion: When an individual question in a valid questionnaire had a missing response, the data for it were removed but the other valid variables were retained.
• Modify: When a variable was missing in a sample, it could be filled in with a mean computed from the valid data for that variable. Specifically, when less than 5% of the values were missing, the overall mean of the variable was used as the replacement. Otherwise, the hot deck method was used to group the valid data for that variable and take the mean of the relevant group as the value for the missing sample.
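The "Modify" rule can be illustrated with a minimal sketch. It assumes that missing answers are stored as None and that each respondent carries a grouping label (for example, the grade); the function name, the grouping key, and the fallback to the overall mean for a group with no valid answers are hypothetical choices for illustration only.

```python
from statistics import mean

def impute(values, groups, threshold=0.05):
    """Fill missing answers (None) following the 5% rule described above.

    values: answers to one question, None where missing.
    groups: hypothetical grouping label per respondent (e.g., grade).
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to impute from")
    missing_share = 1 - len(observed) / len(values)
    overall = mean(observed)
    if missing_share < threshold:
        # Few gaps: replace with the overall mean of the variable.
        return [overall if v is None else v for v in values]
    # Many gaps: group the valid data and use the mean of the respondent's group.
    by_group = {}
    for v, g in zip(values, groups):
        if v is not None:
            by_group.setdefault(g, []).append(v)
    group_mean = {g: mean(vs) for g, vs in by_group.items()}
    # Assumption: fall back to the overall mean if a group has no valid data.
    return [group_mean.get(g, overall) if v is None else v
            for v, g in zip(values, groups)]

few_gaps = impute([2.0] * 12 + [4.0] * 12 + [None], ["g1"] * 25)
many_gaps = impute([1.0, 3.0, None, 5.0], ["a", "a", "a", "b"])
print(few_gaps[-1], many_gaps)
```

In the first call one of 25 answers is missing (4%), so the overall mean is used; in the second call 25% are missing, so the group mean is used instead.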
Finally, a total of 342 questionnaires were returned, giving a return rate of 97.71%. After screening and collating, 338 valid questionnaires remained, giving an effective rate of 96.57%. SPSS 19.0 was then used to complete the data entry and obtain the basic sample data after statistical analysis.

Investigation Findings
The part "Personal Information Statistics" showed that among the 338 students participating in the survey, 167 were male (49.41%) and 171 were female (50.59%). By source of students, the largest number came from rural areas, reaching 291 (86.09%), while the smallest number came from suburban areas, only 17 (5.03%), and 30 (8.88%) came from urban areas. By grade, 127, 160, and 51 students participated from the freshman, sophomore, and senior years, accounting for 37.57%, 47.34%, and 15.09%, respectively. Additionally, according to whether they were left-behind children or not, there were 87 left-behind students and 252 non-left-behind students, accounting for 25.74% and 74.56%, respectively.
The part "Outdoor School Violence Statistics" indicated that more than 50% of the students reported having participated in, experienced, or observed different types of school violence, and the average incidence of outdoor violence for Case L was calculated to be 58.01%, as detailed in Table A3. On the one hand, such a high violence incidence was inevitably linked to the nature of Case L as a boarding school (diversified building functions, special student population, etc.), which is also confirmed by numerous local studies in China [78][79][80]. On the other hand, the statistics included violence incidents that were experienced by perpetrators, victims, and bystanders, which may have been counted more than once, resulting in some error. Overall, the school violence incidence in Case L still remained high and representative. In particular, as presented in Figure 3, the area near the toilet (78.11%), the area around the dormitory (69.82%), the way to school (66.57%), and the secluded grove (62.13%) were identified as the places where the most outdoor violence occurred. Males (64.50%) and seniors (73.37%) were considered to be the most dominant perpetrators, and 12.13% of the tested students indicated the presence of teacher violence. Furthermore, following previous relevant references [81,82], these violence incidents were grouped into four categories, namely teacher-student conflict, verbal bullying, physical conflict, and external intrusion. Table 1 lists the number of occurrences and the spatial distribution maps of the four types of outdoor violence after coding and statistical processing (as marked by the tested students).
Among them, teacher-student conflict and verbal bullying were the most common violent behaviors (the combined numbers of perpetrators and victims were 185 and 426, respectively). Physical conflict and external intrusion were not as prevalent (the combined numbers of perpetrators and victims were 130 and 76, respectively), but they were still a strong sign of the seriousness of outdoor school violence in Case L.
The part "Current Environmental Assessment" was the scale part, so its reliability and validity were tested using IBM SPSS Statistics 26, which showed that the findings were reliable; the detailed analysis process is given in Appendix B. The results showed that more than 20.00% of the tested students felt that the entrance to the school was narrow and disorderly (29.00%) and that the outdoor sports fields, for example, were far from the dormitory area (20.12%). In addition, more than 17.00% of the tested students believed that the current outdoor environment still has problems such as dead space in the landscape (19.23%), dim lighting design (17.16%), remote location of the dormitory building (18.34%), lack of leisure facilities (17.46%), more visual blind spots (18.94%), and so on. By contrast, only around 6.00% of the tested students disagreed with the "poorly managed green space" and "untimely litter removal". In addition, the students' agreement ratings with issues such as "low campus fence" and "confusing planning of school buildings" can be found in Table A4.
Note: These statistics of violent incidents were obtained by marking on a general plan.

Data Handling Tools
After obtaining the spatial point distribution maps for the four outdoor violence types in Case L, further speculative analysis was needed to get the density of violence distribution across the outdoor environment. As a result, Ripley's K Function and kernel density estimation were applied in ArcGIS software. Firstly, Ripley's K Function was used for pre-processing the spatial point distribution of outdoor violence occurrence to determine the distance parameter (kernel density bandwidth value) for kernel density estimation [83].
The formula is shown in Equation (1). After calculating Ripley's K expected and observed values, the best clustering distances for teacher-student conflict, verbal bullying, physical conflict and external intrusion were determined to be 33.04 m, 70.50 m, 52.38 m, and 32.52 m, respectively.
In Equation (1), d is the spatial radius scale, n is the total number of elements in the study area, A is the total area of the study site, and k is the number of elements in the set.
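To make the bandwidth selection step concrete, the sketch below computes a naive, textbook form of Ripley's K and its L(d) transform without edge correction. It is an assumption that this matches the variant applied in ArcGIS in this study, and choosing the distance at which the observed L(d) most exceeds the expected value d is likewise an assumed selection rule. The sample coordinates and the study-area value are hypothetical.

```python
import math

def ripley_l(points, area, d):
    """Naive L(d) = sqrt(K(d) / pi), with
    K(d) = A / (n (n - 1)) * number of ordered pairs (i != j) within distance d.
    No edge correction is applied (a simplifying assumption).
    """
    n = len(points)
    if n < 2:
        raise ValueError("at least two points are required")
    pairs = 0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i != j and math.hypot(xi - xj, yi - yj) <= d:
                pairs += 1
    k = area * pairs / (n * (n - 1))
    return math.sqrt(k / math.pi)

def best_distance(points, area, distances):
    """Distance at which observed L(d) exceeds the expected value d the most."""
    return max(distances, key=lambda d: ripley_l(points, area, d) - d)

# Hypothetical marked incident locations (metres) in a hypothetical 100 m2 area.
incidents = [(0, 0), (1, 0), (0, 1), (1, 1)]
print(best_distance(incidents, 100.0, [0.5, 1.0, 1.5]))
```

Under complete spatial randomness L(d) is approximately d, so a large positive difference signals clustering at that scale.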
Secondly, with the advantages of intuitive representation, conceptual simplicity, and ease of computer implementation, kernel density estimation was used to estimate the density distribution of violence occurrence, forming a two-dimensional smoothed estimation surface that reflects the characteristics and spatial variation of outdoor violence clustering in Case L. The specific formula is given in Equation (2) [83]. x_i is the point where the highest density of violence occurs. The farther a location is from x_i, the lower the corresponding density value, and when the distance reaches a certain threshold, the density value approaches zero. Ultimately, the kernel density distribution for each type of outdoor school violence was calculated by the above two steps.
In Equation (2), f(x) is the kernel density at point x, n is the number of points whose distance from x is equal to or less than h, h is the distance decay threshold (kernel density bandwidth value), and λ is the spatial weight function.
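The distance-decay behavior described above can be sketched as follows. The quartic weight function used here is an assumption (it is a common choice for planar kernel density surfaces), not necessarily the λ of Equation (2), and the result is not normalised by the number of points. The bandwidth of 33.04 m is the value reported for teacher-student conflict; the incident coordinates are hypothetical.

```python
import math

def kernel_density(x, y, points, h):
    """Density at (x, y) from incident points within bandwidth h.

    Assumed quartic weight: (3 / pi) * (1 - (dist / h)^2)^2 for dist < h, else 0.
    """
    total = 0.0
    for px, py in points:
        u = math.hypot(x - px, y - py) / h
        if u < 1.0:
            total += (3.0 / math.pi) * (1.0 - u * u) ** 2
    return total / (h * h)

h = 33.04  # bandwidth reported for teacher-student conflict (metres)
incidents = [(10.0, 10.0), (15.0, 12.0), (60.0, 40.0)]  # hypothetical coordinates
for location in [(10.0, 10.0), (30.0, 20.0), (200.0, 200.0)]:
    print(location, kernel_density(*location, incidents, h))
```

The density is highest at an incident point, falls as the distance grows, and is exactly zero once the distance reaches the bandwidth.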

Spatial Simulation
DepthmapX, a tool of Spatial Syntax theory, was used to analyze the OCE in Case L. Specifically, it is software dedicated to urban spatial analysis and contains three basic models, namely the Axis model, the Convex Space Analysis (CSA) model, and the Visibility Graph Analysis (VGA) model [71], which can be used to calculate parameters such as integration, choice, and visibility. The Axis and VGA models were applied in this study to analyze the relevant parameters of the spatial configuration and spatial visibility of the OCE, described as follows.
• Integration (In) represents the relationship between a space and the local or overall space, that is, the accessibility of the space. The higher the In value, the higher the accessibility [73].
• Mean Depth (MD) indicates the number of transformations from a local space to other parts of the space, representing the convenience of the node in the spatial system. Higher values of MD indicate higher spatial separation [84].
• Connectivity (Con) refers to the number of spaces directly connected with a given space. The higher the Con value of a space, the more spaces are connected with it, characterizing it as a transportation hub in the spatial system.
• Visibility Connectivity (VC) shows the number of other points that a point can see within its line of sight, reflecting the quality of natural surveillance provided by the outdoor environment to users or passers-by. The higher the VC value, the better the quality of surveillance or of being under surveillance [68].
These parameters that are mentioned above provide a comprehensive characterization of the accessibility and visibility capabilities of the OCE. The specific range of values in the final analysis for each parameter is expressed in a color scale from dark blue to dark red, with the former indicating low values and the latter indicating high values.
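The three configuration attributes can be illustrated on a toy graph. This is a minimal sketch, not DepthmapX itself: the road network is hypothetical, the graph is assumed to be connected with at least three nodes, and integration is simplified to the reciprocal of relative asymmetry, RA = 2(MD - 1)/(k - 2), whereas DepthmapX applies a further size normalisation before inverting.

```python
from collections import deque

def depths_from(graph, start):
    """Topological step depth from start to every node (breadth-first search)."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in depth:
                depth[neighbour] = depth[node] + 1
                queue.append(neighbour)
    return depth

def syntax_measures(graph):
    """Connectivity, mean depth, and simplified integration (1 / RA) per node."""
    k = len(graph)
    result = {}
    for node in graph:
        depth = depths_from(graph, node)
        md = sum(depth.values()) / (k - 1)
        ra = 2 * (md - 1) / (k - 2)
        result[node] = {
            "connectivity": len(graph[node]),
            "mean_depth": md,
            # A node adjacent to every other node has RA = 0.
            "integration": float("inf") if ra == 0 else 1 / ra,
        }
    return result

# Hypothetical chain of four road segments: a - b - c - d.
roads = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
measures = syntax_measures(roads)
for name, values in measures.items():
    print(name, values)
```

In this chain the end segments have the highest mean depth and the lowest integration and connectivity, which mirrors the "dead end" pattern that the definitions above associate with low accessibility.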
It is important to note that elements such as vegetation and street furniture within the environment need to be subdivided before conducting spatial simulations, as their impact on spatial configuration and spatial accessibility analysis is complex [46]. On the one hand, the height of the canopy bifurcation affects the spatial analysis, e.g., trees with a high bifurcation do not affect sight and behavior, but trees (especially shrubs) and street furniture that are too short can impede crossing. On the other hand, heavy foliage and bulky street furniture can obscure sight, while sparse foliage and furniture (such as utility poles) can provide some visibility. Therefore, a survey of the components of the outdoor environment in Case L was carried out to classify how they were calculated [68], as shown in Table 2. To summarize, first, shrubs and small trees with large crowns and low forks are defined as a hindrance. Second, large trees with extremely high forked canopies or small canopies, as well as low shrubs, resting seats, street lights, and other street furniture, are considered to be negligible obstructive factors. Third, motor vehicle parking is regarded as a hindering factor. In addition, considering the influence of tree canopy changes in different seasons on the spatial analysis, two seasons, summer (VC_S) and winter (VC_W), were chosen for this study to obtain more accurate and comprehensive results.

Statistical Analysis
After the above processing, the kernel density results of the OSVD and the spatial attribute variables of the OCE were derived. Statistical analysis was then conducted to explore the correlations and regression relationships between them [85]. As illustrated in Figure 2, roads and major spatial nodes in the OCE have been labeled to link their outdoor violence kernel density to the spatial attributes. With interpolation, the kernel density values of the various types of outdoor violence and the values of the four spatial attributes derived from DepthmapX (In, MD, Con, and VC_M, the mean of VC_S and VC_W) were combined in the same data file for use in IBM SPSS Statistics 26 and Stata 15.1 software. They were then analyzed by correlation analysis and multiple regression, following the specific steps below.

• Normal distribution test: This test was required for all data in order to select the appropriate correlation coefficient. After the data were imported into SPSS Statistics 26, the one-sample Kolmogorov-Smirnov (K-S) test was performed. If the variables were normally distributed (p > 0.05), Pearson's correlation coefficient was used; otherwise (p ≤ 0.05), Spearman's correlation coefficient was used [86].
• Correlation analysis: With the data imported into Stata 15.1 software, the bivariate correlation analysis was carried out using the appropriate correlation coefficients for the kernel density variables and spatial attribute variables, judged by the significance p and the correlation coefficient r [85]. p was used to test whether there was a statistically significant relationship between the two variables (p-values less than 0.1, 0.05, or 0.01 all indicate a correlation, denoted by *, **, and ***, respectively, in ascending order of significance). If the relationship was significant, the sign of the correlation coefficient r and the degree of correlation were analyzed. A larger absolute value of r indicated a higher degree of correlation.

• Multiple regression analysis and model building: This step was used to explore the regression relationships between the outdoor school violence kernel density variables and the spatial attribute variables, in order to predict the magnitude of the effect of different spatial attribute variables on the same violence kernel density. Specifically, the hierarchical regression method was used if the relationship between certain variables could be obtained from previous literature; otherwise, the stepwise method was chosen [85]. Once the regression method was determined, the data were tested for multicollinearity to prevent interactions among the independent variables, using two indicators: the variance inflation factor (VIF, less than 10) and tolerance (greater than 0.2). Further, the regression coefficient beta, the t-test value, and the significance p were selected as indicators to discriminate the regression relationship. The t-test and p-value were judged by the same criteria as above. If the absolute value of beta was greater than zero, the independent variable could effectively predict the variation of the dependent variable, i.e., there was a significant influence relationship. In the hierarchical regression analysis, the R² change (the difference in R² between two consecutive models) represented the contribution made by adding the independent variable; the larger the value, the more prominent the influence relationship.
• Regression model checking: The validity, reliability, and generalizability of the model were all tested in this step. Following previous literature [87][88][89], a difference between R² and adjusted R² close to zero indicated good generalizability. The Durbin-Watson statistic needed to be in the range of 1.0-3.0, and the maximum Cook's Distance over all samples had to be less than 1.0 to demonstrate that the regression model had good reliability. It was also necessary to determine whether the regression residuals were approximately normally distributed based on normal Q-Q plots, with a better fit indicating a distribution closer to normal. In addition, the results of the correlation and regression analyses needed to be combined to define the effective predictors of the regression model.
Figure 4 illustrates the kernel density distribution of each outdoor school violence type formed from the questionnaire results, aided by ArcGIS 10.8 software. The aggregation degree of outdoor school violence is reflected by the distribution of kernel density values, i.e., the darker the color (brown), the higher the kernel density value and the higher the violence aggregation [83]. Firstly, Figure 4a shows the kernel density distribution of teacher-student conflict (N = 185, d = 33.04 m; kernel density range 0.000000-0.021252), the second most frequent type of violence, which is concentrated mainly in the areas of the teaching building and school entrance. Therefore, areas A (internal courtyard), B (north-west of the playground), and C (entrance plaza) were marked as violence hotspots, with corresponding maximum kernel density values of 0.02083, 0.01513, and 0.0212, respectively. Secondly, Figure 4b shows the kernel density distribution of verbal bullying (N = 462, d = 70.50 m; kernel density range 0.000000-0.014691), the violence type with the highest incidence. Its violence hotspots are mainly in areas A, D (courtyard in the male dormitory), E (basketball court), and F (front plaza of the female dormitory), with maximum kernel densities of 0.01468, 0.00801, 0.00754, and 0.00548, respectively. Verbal bullying was thus widespread, present in almost the entire outdoor environment and highly concentrated in two areas, the teaching building and the male dormitory. Thirdly, the kernel density distribution of physical conflict (N = 130, d = 52.38 m; kernel density range 0.000000-0.008190) is shown in Figure 4c, with areas A, D, E, G (southern parking lot), and H (volleyball court) marked as violence hotspots with the highest kernel densities of 0.00223, 0.00819, 0.00230, 0.00247, and 0.00339, respectively. The physical conflict distribution was scattered, largely located in the corners of the OCE and highly clustered in the male dormitory area. Finally, Figure 4d represents the kernel density distribution of external intrusion (N = 76, d = 32.52 m; kernel density range 0.000000-0.011644), the type of violence with the lowest incidence. The results show that areas E, G, H, and I (the grove in the north part of the male dormitory area) were marked as violence hotspots, with maximum kernel densities of 0.00561, 0.00327, 0.01164, and 0.01041, respectively. As can be seen, while both external intrusion and physical conflict were mainly located in the corners, the former was only located in the campus boundary spaces (near the fence).

Spatial Syntactic Graphs of the Outdoor Campus Environment
Using DepthmapX, the boundary lines of walkable areas such as roads, squares, and outdoor sports fields were drawn to create an Axis model for the spatial configuration, i.e., the walkable layers. Meanwhile, the boundaries of the fence, the outer contours of the buildings, the contours of the plant canopy, and the boundaries of the different street furniture were mapped to establish a VGA model for the spatial visibility. In mapping the two models above, the lawn was defined to be a non-walkable layer and the parking lots were considered to be parked full of vehicles. Lastly, the analysis results were interpolated through ArcGIS to obtain specific values, as shown in Table 3.  Figure 5 indicates the simulation results of the spatial configuration of the OCE, including In, MD, and Con. The graphs show that the playground, volleyball court, and the roads connecting them (roads f, h, i, l, and n) were important areas in the OCE with the highest levels of In and Con (described in red) and the lowest MD (described in blue), which had the best accessibility due to the absence of obstacles and were identified as hotspots. In addition, several areas with moderate In, MD, and Con were found, including the basketball court, the south square, and the roads connecting them to the playground (roads b, c, d, k, m, and o), which were medium in accessibility (described in the neutral green). However, there are still a few traffic-clogged spaces that were regarded as less accessible spatial dead spots because they were at the end of the road network (road ends or narrow courtyards), such as the courtyards and roads around the male dormitory (roads a and j) and the roads near the female dormitory (roads q, r, s, and t), which got the least In and Con (described in blue) and the highest MD (described in red). In combination, the spatial configuration attributes in the OCE showed a clear polarization. 
Compared with the kernel density distributions above, outdoor violence mostly occurred in areas with lower In and Con but higher MD, including areas A, D, E, F, and G. This was supported by the fact that the most frequently agreed items in the "Current Environmental Assessment" of the questionnaire were "confusing campus building planning", "remote location of the dormitory building", "remote outdoor sports grounds", "dead space in the landscape", and "lack of leisure facilities".
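For readers unfamiliar with the three configuration attributes, the following sketch computes Con, MD, and a simplified In on a small axial graph. The topology is hypothetical (road letters are reused only as labels), the graph is assumed to be connected with more than two lines, and In is taken here as the reciprocal of relative asymmetry; DepthmapX, to our understanding, applies a further normalization that is omitted.

```python
from collections import deque

def depths_from(graph, start):
    """Breadth-first topological depth from `start` to every reachable line."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in depth:
                depth[neighbour] = depth[node] + 1
                queue.append(neighbour)
    return depth

def axial_measures(graph):
    """Connectivity, mean depth, and a simplified integration for each line."""
    n = len(graph)
    result = {}
    for node in graph:
        depth = depths_from(graph, node)
        mean_depth = sum(depth.values()) / (n - 1)
        # Relative asymmetry; its reciprocal is used as an un-normalized integration.
        ra = 2.0 * (mean_depth - 1.0) / (n - 2)
        result[node] = {
            "connectivity": len(graph[node]),
            "mean_depth": mean_depth,
            "integration": 1.0 / ra if ra > 0 else float("inf"),
        }
    return result

# Hypothetical axial map: "f" is a well-connected road, "a" a dead end.
graph = {
    "f": ["h", "i", "l"],
    "h": ["f"],
    "i": ["f"],
    "l": ["f", "j"],
    "j": ["l", "a"],
    "a": ["j"],
}
measures = axial_measures(graph)
for name in ("f", "a"):
    print(name, measures[name])
```

In this toy network the hub "f" has high Con and In with low MD, while the dead end "a" shows the opposite pattern, mirroring the polarization between the sports field roads and the dormitory roads described above.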

Spatial Visibility Graphs
The results of the spatial visibility simulation of the OCE of Case L are shown in Figure 6, containing both global and partial scales for summer and winter seasons. Globally, the visibility distribution remained strongly consistent in both two seasons. Specifically and firstly, the playground, volleyball court, and roads to the west of them (roads f and l) all showed good visibility (described in red) due to the openness of the space, especially the volleyball court, which was the area with the best visibility and is considered as a hotspot. The large number of trees planted along the outer boundary of the playground diminished the visibility of the area behind it, making the boundary area the least visible (described in blue). It means that the visibility of the outdoor sports field area was radially distributed, i.e., the central area was the most visible and the outer boundary areas were the least visible. Secondly, the spaces among the teaching building, the gymnasium, and the dining hall as well as the basketball court, were at an average level of visibility (described in the neutral green), with relatively average VC. Thirdly, the internal courtyard, the male dormitory area, and the parking lot around the female dormitory, which were shaded by the building recessed spaces and vegetation areas, presented a lower level of VC (described in blue) and were identified as the least visible spaces. Collectively, the spatial visibility in the OCE also showed clear differences, which, when compared to the outdoor violence kernel density hotspots, were found to be mostly concentrated in areas of poor visibility such as areas A, D, F, and G. Similarly, the "more visual blind spots", which was highly endorsed by the tested students in the questionnaire, explained this phenomenon. 
Furthermore, a comparative analysis of local visibility between the two seasons revealed that the spatial visibility was more negatively affected by deciduous trees, especially trees with medium-sized canopies such as Liquidambar formosana Hance, Punica granatum L., and Prunus serrulata. The effect was more pronounced in summer than in winter, as evidenced by the fact that the area of high visibility was significantly smaller in summer than in winter. Also, some of the spatial nodes (nodes 3, 4, 13, 14, 15, 19, and 20) were more visible in winter due to the seasonality of deciduous trees.

The Coupling Relationship between the OSVD and the OCE
Referring to the roads and main spatial nodes in the OCE of Case L, a square with a side length of 5 m was drawn at the center of each of the 20 spatial nodes (5 m being appropriate relative to the total dimensions of Case L) [90], and the spatial attribute variables were extracted at the four vertices and the centroid of each square, for a total of 100 samples. The kernel densities of the four outdoor school violence types were named KD1 (for teacher-student conflict), KD2 (for verbal bullying), KD3 (for physical conflict), and KD4 (for external intrusion). Importantly, the Axis model did not exactly overlap with the VGA model, as the former only included walkable space, while the latter covered roughly the entire outdoor environment. Therefore, the spatial configuration parameters (In, MD, and Con) of the non-walkable layer were set to zero when interpolation was performed, while the spatial visibility parameter remained valid and was represented by VC M (the average of VC S and VC W). Finally, the sample data were imported into IBM SPSS Statistics 26 and Stata 15.1 for subsequent statistical analysis.
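The sampling scheme can be sketched as follows. The node coordinates are hypothetical placeholders, and reading VC S and VC W as the summer and winter visibility values is an assumption based on the seasonal analysis above.

```python
def square_sample_points(cx, cy, side=5.0):
    """Four vertices and the centroid of a square centred on a spatial node."""
    half = side / 2.0
    return [
        (cx - half, cy - half),
        (cx + half, cy - half),
        (cx + half, cy + half),
        (cx - half, cy + half),
        (cx, cy),
    ]

def mean_visibility(vc_summer, vc_winter):
    """VC M as the plain average of the two seasonal values (assumed S = summer, W = winter)."""
    return (vc_summer + vc_winter) / 2.0

# Hypothetical node centres in metres; the real centres come from the campus plan.
node_centres = [(float(i * 30), 0.0) for i in range(20)]
samples = [point for centre in node_centres for point in square_sample_points(*centre)]
print(len(samples))
```

Twenty nodes with five points each give the 100 samples used in the statistical analysis.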

Normal Distribution Test Results
To ensure logical rigor, a normality test was carried out before the statistical analysis to determine the appropriate correlation coefficient. With IBM SPSS Statistics 26, the one-sample Kolmogorov-Smirnov (K-S) test was used, and the results are shown in Table 4. The p-values of the K-S statistic for all variables were less than 0.05, implying that none of them was normally distributed; the Spearman correlation coefficient was therefore applied, as suggested by Andy Field [86]. Also, the magnitudes of some of the variables were rescaled, where statistically permissible, to eliminate the effects of large differences in numerical magnitude between variables.

Correlation Relationship
The correlation between the kernel densities of the different OSVD maps and the spatial attributes of the OCE was assessed by applying the non-parametric Spearman correlation coefficient in Stata 15.1, and the results are shown in Table 5. KD1 was found to be related to all four spatial attribute variables, and all were moderately positively correlated (In: r = 0.246, p < 0.05; MD: r = 0.210, p < 0.05; Con: r = 0.200, p < 0.05; VC M: r = 0.242, p < 0.05). Similarly, KD2 showed positive correlations with these spatial attribute variables (In: r = 0.332, p < 0.01; MD: r = 0.187, p < 0.1; Con: r = 0.290, p < 0.01; VC M: r = 0.293, p < 0.01), with In correlated the most strongly and MD the most weakly. In contrast, KD3 presented a strong negative correlation with In in the spatial configuration (r = -0.320, p < 0.01) and a moderate negative correlation with Con and VC M (Con: r = -0.250, p < 0.05; VC M: r = -0.224, p < 0.05). At the same time, KD4 was identified to have a strong negative correlation with MD (r = -0.392, p < 0.01), but no significant correlation with the other three variables. Figure 7 displays the relationship between KD4 and MD. Since some of the data points were collected from the non-walkable layer, these points were distributed on the axis where the MD value is zero, leading to the illusion of a negative correlation. In fact, most of the high kernel density points of external intrusion were concentrated in the non-walkable layer. In summary, the results of the Spearman correlation analysis indicated that as the values of the three spatial attribute variables In, Con, and VC M increased, both KD1 and KD2 in Case L grew, while KD3 decreased. Among these, verbal bullying was most significantly associated with all three, physical conflict was the next, and teacher-student conflict was the least relevant. Moreover, KD1 and KD2 were also related to MD and rose with its increase.
KD4 correlated most strongly with MD, decreasing as MD increased, and was highest where the MD value was zero (i.e., in the non-walkable layer).
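The way zero-valued MD points can produce an apparent negative correlation is illustrated below with a pure-Python Spearman coefficient (Pearson correlation of tie-averaged ranks). The eight data points are hypothetical and chosen only to reproduce the pattern described for Figure 7; they are not values from Table 5.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)

def spearman(x, y):
    """Spearman rank correlation as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical samples: the first four lie in the non-walkable layer (MD set to 0).
md = [0.0, 0.0, 0.0, 0.0, 2.0, 2.5, 3.0, 3.5]
kd4 = [0.010, 0.011, 0.009, 0.012, 0.001, 0.002, 0.003, 0.004]

rho_all = spearman(md, kd4)               # negative, driven by the MD = 0 points
rho_walkable = spearman(md[4:], kd4[4:])  # positive once those points are excluded
print(rho_all, rho_walkable)
```

In this constructed example the sign of the coefficient flips when the MD = 0 points are excluded, which is the kind of artefact that motivates treating the KD4-MD correlation with caution.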

Regression Relationship and Models
To explore the nature of the relationship between the kernel density distribution of each outdoor school violence type and the spatial attributes described above, and to move the discussion from qualitative to quantitative analysis, a regression model for Case L was developed in Stata 15.1. Before the regression analysis, the four spatial attribute independent variables first needed to be checked for multicollinearity, to determine whether there was some degree of collinearity between them and to prevent it from distorting the estimated contribution of each predictor. Table 6 presents the results of the collinearity analysis. All spatial attribute variables had VIF values less than 10 and Tolerance values above 0.2, indicating the absence of significant collinearity and supporting the validity of the regression models that follow, as suggested by Bruce L. Bowerman and Richard O'Connell [87,88]. Multiple regression analysis followed the approach described by Field; because there had been no explicit quantitative research on the relationship between school violence and spatial attributes, a stepwise method and standardized tests were applied to all independent variables. For KD1, the value of R² increased whenever an independent variable was added to the model, as displayed in Table 7. Among these, In and Con accounted for 5.7% and 7.0% of the R² change in the teacher-student conflict kernel density. In the final model of Table 8, the standardized test revealed the effect of each predictor, with In and Con found to be significant predictors of teacher-student conflict (In: beta = 10.334, t(100) = 3.04, p < 0.01; Con: beta = -0.066, t(100) = -3.23, p < 0.01), with positive and negative effects, respectively. As shown in Table 9, the R² change for KD2 was relatively large, at 11.2% and 1.9%, when In and Con were added.
In turn, the data from the final model in Table 10 verified the influence of In and Con on verbal bullying (In: beta = 5.585, t(100) = 2.54, p < 0.05; Con: beta = -0.023, t(100) = -1.69, p < 0.1), with the former being a moderate positive effect and the latter a weak, negligible negative effect. For KD3, as illustrated in Table 11, the value of R² also increased significantly with each added variable. The largest R² changes were caused by In, MD, and Con, accounting for 8.4%, 4.9%, and 7.6% of the physical conflict kernel density, respectively. Meanwhile, as Table 12 shows, the standardized test also indicated that these three independent variables significantly predicted the distribution of physical conflict, with a negative effect of In and a positive effect of the other two variables (In: beta = -6.521, t(100) = -4.12, p < 0.01; MD: beta = 4.616, t(100) = 3.44, p < 0.01; Con: beta = 0.029, t(100) = 3.00, p < 0.01). The regression model for external intrusion was similar to that for physical conflict, as can be seen in Tables 13 and 14, with In, MD, and Con also being significant predictors of the distribution, except that MD was associated with a moderate negative effect (beta = -3.510, t(100) = -2.42, p < 0.05).
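The collinearity screening that precedes these models can be sketched without a statistics package. The example below obtains each VIF as the corresponding diagonal entry of the inverse of the predictors' correlation matrix and takes Tolerance as its reciprocal; the two short predictor series are hypothetical, and the 10 and 0.2 thresholds are the ones quoted in the text.

```python
def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)

def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting for a small square matrix."""
    n = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def vif_and_tolerance(predictors):
    """Map each predictor name to (VIF, Tolerance)."""
    names = list(predictors)
    columns = [predictors[name] for name in names]
    corr = [[pearson(a, b) for b in columns] for a in columns]
    inverse = invert(corr)
    return {name: (inverse[i][i], 1.0 / inverse[i][i]) for i, name in enumerate(names)}

# Hypothetical predictor values (correlation 0.8), not the Case L samples.
example = vif_and_tolerance({"In": [1, 2, 3, 4, 5], "Con": [2, 1, 4, 3, 5]})
for name, (vif, tolerance) in example.items():
    print(name, round(vif, 3), round(tolerance, 3), vif < 10 and tolerance > 0.2)
```

With only two predictors the VIF reduces to 1 / (1 - r²), which gives a quick way to check the output by hand.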

Regression Model Checking Results
Model generalizability, the Durbin-Watson statistic, case diagnostics, and normality tests were then examined to check the validity of the regression models built above, as detailed in Table 15. Firstly, comparing R² with the adjusted R² for the four final models shows that each model had a difference of around 0.03, i.e., a shrinkage of only about 3.0% of the variance, which suggests that all four models have good generalizability. Secondly, all values of the Durbin-Watson statistic were between 1.0 and 3.0, indicating that the assumption of independent errors was tenable, as Field [49] argued. Thirdly, the maximum values of Cook's distance for all the models were less than 1.0, indicating that the influence of any single sample on the overall regression model was within a manageable range. Finally, as can be seen in Figure 8, the residuals of each regression model were approximately normally distributed. The stability, validity, and generalizability of the models were thus supported by all the regression diagnostics analyzed. Notably, the results of the regression analysis supported some of the conclusions of the correlation analysis, while others were not confirmed. With reference to the relevant literature [89,91], correlation analysis describes a bivariate relationship, while regression analysis considers multiple variables jointly, meaning that a correlation does not necessarily imply a regression relationship. When there is a correlation between the variables but no regression relationship, it is considered that only a correlation exists. When there is no correlation but a regression relationship exists, the conclusion of no correlation is accepted.
However, when there is a negative (positive) effect but a positive (negative) correlation, it is concluded that there is a correlation but no regression relationship. Consequently, in this study, In was found to have a positive effect on teacher-student conflict and verbal bullying and a negative effect on physical conflict, although its relationship with verbal bullying was relatively weak. In other words, In can be identified as a significant predictor of teacher-student conflict and physical conflict, and a general predictor of verbal bullying. MD was found to have a positive effect on physical conflict and a negative effect on external intrusion, which, in principle, means it could be used as a significant predictor of these two violence types. However, according to Figure 7, external intrusion usually occurred in spaces with an MD value of zero; MD was therefore not recommended as a predictor of external intrusion.
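Two of the model checks above have simple closed forms, sketched here with hypothetical inputs: the adjusted R² (assumed here to be the usual degrees-of-freedom correction), whose gap from R² is the shrinkage discussed in the text, and the Durbin-Watson statistic computed from a residual series. The R² value, sample size, predictor count, and residuals below are illustrative only and are not taken from Table 15.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for n samples and k predictors (degrees-of-freedom correction)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squared residuals."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e * e for e in residuals)
    return numerator / denominator

# Hypothetical model summary: R^2 = 0.20 with 100 samples and 3 predictors.
r2 = 0.20
shrinkage = r2 - adjusted_r2(r2, n=100, k=3)

# Hypothetical residual series.
dw = durbin_watson([0.4, -0.1, 0.3, -0.5, 0.2, 0.1, -0.3])
print(round(shrinkage, 3), round(dw, 3), 1.0 < dw < 3.0)
```

A value of the statistic near 2 indicates little first-order autocorrelation in the residuals, which is why a range of 1.0 to 3.0 is used as an acceptance band in the text.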

Discussion
Research on the relationship between criminal behavior and environmental space has been going on for about 60 years. Through the investigation and practice of numerous scholars, important theories such as the first and second generations of CPTED have been established and made significant contributions to safety construction in many aspects of urban settlements, streets, parks, and schools. In recent years, with the information age arriving, research in this field has gradually been linked to quantitative research methods such as geospatial technology, with ArcGIS and Spatial Syntax theory having become commonly used tools. However, over the years, most of these quantitative studies have focused on applications at urban scales but consistently paid little attention to school violence. To fill this gap, this study was purposed to explore the intrinsic relationship between the OSVD and the OCE, taking China, which has a high proportion of boarding schools, as the research context, and selecting Case L in southeastern Zhejiang province as the empirical case. By organizing the meticulous field survey and using ArcGIS and DepthmapX software, the kernel density distributions of different types of outdoor school violence and the spatial syntactic graphs of the outdoor campus environment were mapped. Through statistical analysis, the quantitative relationships between the four kernel density variables of outdoor school violence (teacher-student conflict, verbal bullying, physical conflict, external intrusion) and the four spatial attribute variables (In, MD, Con, VC M ) were revealed, with their regression models developed and the corresponding predictors indicated, providing guidance for improving the safety of the outdoor environment.
Firstly, four important and major school violence types and their distributional characteristics were identified for Case L. Teacher-student conflict occupied the second highest percentage (22.64%), which was strongly related to the nature of boarding schools. Many studies have also shown that boarding schools were often characterized by functional and mixed buildings, specific physical and psychological problems of the student groups (rebellion, sensitivity, irritability, lack of family care), inappropriate management by teachers (full-day management of both study and life, and unreasonable teacher-student ratios), which led to numerous teacher-student conflicts such as corporal punishment. In addition, this kind of violence was mainly clustered within the area around the teaching building (the maximum kernel density was 0.021200), which was closely associated with the range of teachers' activities [92,93]. Verbal bullying was the most prevalent of all the violence types (52.14%), both in terms of numbers and distribution characteristics, as it was a less difficult and more insidious form to perpetrate, as found by Wang Peng [78]. As a result, it was present in almost the entire outdoor environment and was concentrated in the teaching building and the dormitory area (the maximum kernel density was 0.01468). Physical conflict, with the third highest incidence (15.92%), was relatively costly to implement and needed to be carried out in spaces that were well concealed or easy to escape from, as summarized by Wang Juan [79]. Its distribution, therefore, was more dispersed and mostly located in the end of roads, courtyards, and other corners with less In and greater MD, such as the male dormitory area (the maximum kernel density was 0.00819), which was consistent with previous studies [94]. 
External intrusion, the violence type with the lowest incidence (9.30%), tended to occur in the border areas of the non-walkable layer, which was related to the activity characteristics of outsiders as they needed to seek the fastest escape routes [79,80,94].
Secondly, the correlation analysis results indicated that there was a relationship between the OSVD and all spatial attributes of the OCE, which corresponded to the findings of Kweon Jihoon, Ju Mi-Ok, and Lee Chang-Hun [95][96][97][98][99]. Specifically, teacher-student conflict, verbal bullying, and physical conflict were all found to be correlated with In, Con, and VC M (for KD1, In: p < 0.05, Con: p < 0.05, VC M : p < 0.05. for KD2, In: p < 0.01, Con: p < 0.01, VC M : p < 0.01. for KD3, In: p < 0.01, Con: p < 0.05, VC M : p < 0.05.), with verbal bullying being the most significantly related. There are two main reasons why it was the most significant. For one, it had the largest sample size (N = 462), making it a more accurate calculation of kernel density values and possibly further highlighting the significance [91]. Another reason is that the distribution of teacher-student conflict was related to the range of teacher activities, with physical conflict also often occurring in areas of high In and low MD, such as outdoor sports fields, while the distribution of verbal bullying was relatively consistent, mostly in areas of high In and low MD [97,99]. The high coherence of the data thus strengthened the significance of the correlation between the variables. Furthermore, MD was also a spatial attribute that was associated with the presence of teacher-student conflict, verbal bullying, and external intrusion (KD1: p < 0.05. KD2: p < 0.1. KD4: p < 0.01). Within these, the most significant correlation between external intrusion and MD was due to the large number of spatial points that were distributed in the non-walkable layer (MD = 0), creating a prominent correlation [78,94].
Thirdly, the regression analysis results supported some of these findings and showed that In was a significant predictor of teacher-student conflict and physical conflict, and a general predictor of verbal bullying, while MD was a significant predictor of physical conflict, but not recommended as a predictor of external intrusion. As an extrapolation, it can be argued that the outdoor environment with lower In and higher MD may be the place where physical conflict is frequent, as verified in the spatial distribution maps in Case L and in line with the points made by Yijuan Qiao and her team [100,101]. For this violence space, it is possible to attract students and teachers to carry out communication activities and enhance territoriality by re-planning the road network or implanting crowd activities, such as landscaping vignettes, resting seats, and organizing outdoor activities [58,102]. Meanwhile, the environmental image should be well maintained. When choosing tree species, shrubs are mainly below the average height of students' sight lines, while trees are mainly those with high forked crowns and small crowns. Teacher-student conflict tends to occur in areas with high In and low MD, and in the light of the relevant literature [78,79,103], it can be speculated that this may be related to the teachers' activity phenomenon that tends to revolve around spaces with good accessibility, such as teaching buildings. In addition to social interventions such as ICC-T [104], the campus environmental design needs to strengthen the monitoring capacity, and activity inclusion of teachers to reduce the violence occurrence. For example, teacher management stations can be set up in areas of frequent student-teacher activity, including sports fields and courtyards, to reduce the violence incidence such as corporal punishment as a result of student rebellion [94]. 
The effects of each spatial attribute on verbal bullying were not strongly significant, which may be related to the lower cost of perpetrating this kind of violence, as Ku Na Hyoen and Wahab AA [105,106] argued. For prevention, electronic surveillance and safety information boards in crowded spaces will be important to restrain the behavior and motivation of potential perpetrators. For spaces with high In and low MD, in order to control the flow of people and make the space more inviting to stay in, spatial variation can be enriched by adding flower beds, ponds, and other landscape features, appropriately increasing the MD, and reasonably installing facilities to enhance surveillance and management [94,107]. In contrast, this study does not recommend MD as a predictor of external intrusion, despite its significance, as external intrusion tends to occur in the non-walkable layer along the boundary, where the MD value is zero. For these spaces, on the one hand, higher or additional walls are needed to improve access control; on the other hand, landscape design and image maintenance should be enhanced to reduce view shading, by regularly pruning foliage or planting trees with high forked canopies, so that outside perpetrators cannot intrude and hide [68,78,107]. In addition, to compensate for the poor visibility of such spaces, increased lighting and electronic surveillance equipment are also necessary to improve the public safety of the outdoor campus environment. The quantitative findings and environmental security optimization measures summarized above are consistent with the qualitative analysis in the relevant literature, which further supports the reliability of the predictors.
Finally, there are still numerous limitations and shortcomings in this study. Above all, only one boarding school was selected as a sample, which has the restriction of a small sample and is not representative of the specific situation of school violence across China or the world as a whole, and more further case studies are needed to validate this. Also, the outdoor violence distribution maps were plotted by marking the violence locations on the general plan, which was not a fully accurate method of counting and relied on student memory, with data subject to error. Nevertheless, it could still be a positive help in determining the general areas where outdoor violence occurs. It would be a more reliable approach to use incidents of violence that are registered with the school security office as a source of data. Moreover, many difficulties were identified in the simulations that were carried out using DepthmapX. On the one hand, the outdoor environment does not have clear physical boundaries, such as walls, so the spatial configuration analysis classified the lawn as an inactive area, even though people can step on it. On the other hand, as the software is a two-dimensional operating environment, it does not yet support the analysis of influencing factors in three-dimensional space. Although this study categorized possible obstacles in the outdoor campus environment to get as close to the real results as possible, it ignored the influence of factors such as road materials and environmental color [108,109], which still needs to be improved with the help of 3D simulation software and a more thoughtful evaluation system.

Conclusions
Despite the obvious limitations of the small sample and its representativeness, this study makes a contribution to research on school violence prevention through environmental design. Few studies to date have used quantitative techniques to explore the distributional characteristics of school violence, and this is where the innovation of this study lies. Through questionnaire investigation, ArcGIS, and Spatial Syntax theory, the relationship between the OSVD and the OCE was explored and the corresponding predictors were determined, highlighting and validating the intrinsic role of the spatial attributes of the OCE (spatial configuration and spatial visibility) in the OSVD. Specifically, In was noted as a significant predictor of teacher-student conflict and physical conflict (KD1: beta = 10.334, p < 0.01; KD3: beta = -6.521, p < 0.01) and a general predictor of verbal bullying (beta = 5.585, p < 0.05), while MD was found to be a significant predictor of physical conflict (beta = 4.616, p < 0.01). Therefore, by calculating the In and MD of the spatial configuration, combined with VC M, a significantly correlated variable, it is possible to predict more accurately the places where school violence is likely to occur. Environmental design strategies, such as enhanced surveillance, activity support, and image maintenance, can then be targeted at different spatial attributes, thereby making prevention more effective. Such a predictive model would allow the establishment of a guideline for secure campus planning, as well as an evaluation mechanism to predict and test design solutions, with considerable potential and practicality for providing guidance and evidence for the design of outdoor environments that prevent school violence.
It is a brand-new attempt, but to obtain a more accurate and reasonable prediction method, the field needs to be advanced with more types and numbers of school samples, more accurate and objective violence data, and more comprehensive software for 3D spatial simulation (e.g., the quantitative calculation of environmental color and texture).
The questionnaire section has been closed. Your responses will be counted anonymously and confidentiality will be guaranteed during the research process. Thank you very much for your support and cooperation with our team!

Appendix B
The questionnaire part "Current Environmental Assessment" consisted of 16 items, and its reliability and validity tests were carried out in IBM SPSS Statistics 26.
Firstly, internal consistency reliability was tested using Cronbach's alpha, the Spearman-Brown coefficient, and McDonald's omega. The calculation results showed that Cronbach's alpha for the questionnaire was 0.922, while McDonald's omega was 0.934. When the 16 items were split into halves, the Spearman-Brown coefficient was 0.905. All three indicators were greater than 0.900, which indicates good reliability according to the relevant references [110,111], suggesting that the reliability of the study data can be affirmed.
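As an illustration of two of these reliability indicators, the sketch below computes Cronbach's alpha from item scores and applies the Spearman-Brown correction to a split-half correlation. The response matrix is hypothetical and far smaller than the 338-respondent, 16-item questionnaire; the split-half correlation passed to the second function is likewise illustrative.

```python
def sample_variance(values):
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)

def cronbach_alpha(item_scores):
    """item_scores: one list per item, each holding the respondents' scores."""
    k = len(item_scores)
    totals = [sum(respondent) for respondent in zip(*item_scores)]
    item_variance_sum = sum(sample_variance(item) for item in item_scores)
    return k / (k - 1) * (1.0 - item_variance_sum / sample_variance(totals))

def spearman_brown(half_correlation):
    """Full-test reliability predicted from the correlation between two halves."""
    return 2.0 * half_correlation / (1.0 + half_correlation)

# Hypothetical responses: 3 items answered by 5 respondents on a 5-point scale.
items = [
    [1, 2, 3, 4, 5],
    [2, 1, 4, 3, 5],
    [1, 3, 3, 5, 4],
]
alpha = cronbach_alpha(items)
corrected = spearman_brown(0.8)  # illustrative split-half correlation
print(round(alpha, 3), round(corrected, 3))
```

Alpha approaches 1 as the items vary together, which is why values above 0.9 are read as good internal consistency.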
Secondly, the structural validity was tested using exploratory factor analysis, covering KMO, Bartlett's test of sphericity, variance explained, and communality, with differences considered statistically significant at p < 0.05. With all items subjected to exploratory factor analysis simultaneously, the communality value for Q4 was found to be less than 0.4, indicating a very weak relationship between this item and the factors. After excluding Q4 [112], exploratory factor analysis was performed again; the results gave a KMO of 0.933 (greater than 0.8, showing a good fit for factor analysis), and the data passed Bartlett's test of sphericity (p < 0.05), satisfying the prerequisite requirements and implying that the data could be used for factor analysis. In addition, two factors (both with eigenvalues greater than 1) were extracted, with a rotated variance explained of 21.473% and 36.377%, respectively, and a cumulative rotated variance explained of 57.850%. Table A5 displays the classification of the two factors together with their factor loadings and communalities, with Factor 1 containing Q1, Q2, Q3, and Q11, and Factor 2 containing Q5, Q6, Q7, Q8, Q9, Q10, Q12, Q13, Q14, Q15, and Q16. The communality value for each item was above 0.4, signifying a strong correlation between the items and the factors. Taken together, the above indicators show that the questionnaire had good construct validity [110].
Thirdly, confirmatory factor analysis was carried out for the two factors and 15 items, using three indicators: average variance extracted (AVE), construct reliability (CR), and the square root of the AVE. The effective sample size for this analysis was 338, which exceeded the number of items analyzed. The results revealed that the AVE values of the two factors were both greater than 0.5 (Factor 1: 0.502, Factor 2: 0.511) and the CR values were both above 0.7 (Factor 1: 0.799, Factor 2: 0.919), implying good convergent validity [110,111]. Additionally, the AVE square root values for Factor 1 and Factor 2 were 0.709 and 0.715, respectively, which were greater than 0.590, the maximum absolute value of the inter-factor correlation coefficient, signifying good discriminant validity [110].
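The convergent and discriminant validity indicators follow from standard formulas, sketched below under the assumption of standardized factor loadings. The loadings in the example are hypothetical (Table A5 is not reproduced here), whereas the AVE values and the 0.590 correlation are the ones reported above, so the square roots can be checked directly.

```python
import math

def ave(loadings):
    """Average variance extracted: mean of the squared standardized loadings."""
    return sum(l * l for l in loadings) / len(loadings)

def construct_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    loading_sum = sum(loadings)
    error_sum = sum(1.0 - l * l for l in loadings)
    return loading_sum ** 2 / (loading_sum ** 2 + error_sum)

def discriminant_ok(ave_values, max_abs_correlation):
    """True when every factor's sqrt(AVE) exceeds the largest inter-factor correlation."""
    return all(math.sqrt(a) > max_abs_correlation for a in ave_values)

# Hypothetical standardized loadings for a four-item factor.
loadings = [0.70, 0.70, 0.70, 0.70]
print(round(ave(loadings), 3), round(construct_reliability(loadings), 3))

# Values reported in the text: AVE of 0.502 and 0.511, maximum correlation 0.590.
reported_ave = [0.502, 0.511]
print([round(math.sqrt(a), 3) for a in reported_ave], discriminant_ok(reported_ave, 0.590))
```

The square roots of the reported AVE values round to 0.709 and 0.715, matching the figures given in the text.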
In summary, the reliability and validity of the responses to the scale questions of the questionnaire were validated.