Molecular-Composition Analysis of Glass Chemical Composition Based on Time-Series and Clustering Methods

The weathering of ancient glass relics has long been a concern. Therefore, a systematic and more comprehensive mathematical model should be established with which to correctly judge the category of ancient glass products whose chemical composition has changed due to weathering. This paper systematically analyzes the changes in the composition of ancient glass products as a result of weathering. We first analyze the surface weathering of glass relics and its correlation with three properties, and establish a multivariable time-series model to predict the chemical-composition content before weathering. Next, we use one-way analysis of variance for subclassification. Finally, we use principal component analysis to assess the rationality of the classification and vary the significance level to determine its sensitivity. The results provide a theoretical basis for the reasonable prediction of chemical-composition content, for classification, and for improving the model. The model thus provides reference values that can be used in the protection of cultural relics, historical research, and other fields.


Introduction
Glass is precious material evidence of the trade between the Early Silk Road and the West. Cultural relics are the cultural heritage of a nation and the carriers of civilization and the national spirit. Ancient glass is strongly influenced by its burial environment and by weathering. Glass weathering is generally related to glass composition, glass surface chemicals, the environmental atmosphere of glass products, and other factors. During weathering, there is a high degree of exchange between the internal elements and the environmental elements, and the composition proportions change. These changes affect the correct judgment of the category of weathered glass objects. The color and decoration on the surfaces of cultural relics cannot be used as the basis for judging their weathering, and a cultural relic with a large weathered area may still have unweathered areas. After the weathering of glass cultural relics, the differences between the chemical elements in different categories of cultural relics also have a certain relevance. With the development of society and the progress of science and technology, glass weathering, as a traditional topic, urgently requires more rigorous and accurate methods of study. In this regard, the quantitative analysis and type identification of chemical composition is a good approach to the study of the weathering of glass cultural relics [1][2][3].
Gan Fuxi et al. combined X-ray fluorescence analysis, X-ray diffraction, and laser Raman spectral analysis [4,5]. Li Qing-hui et al. studied the similarities and differences between flux, K2O-CaO-SiO2 and PbO-BaO-SiO2 system glass, and glass sand from the Western Zhou and the Spring and Autumn periods [6,7]. Liu Song et al. discussed the application of a portable energy-dispersive X-ray fluorescence spectrometer to the analysis of the chemical composition of glass in ancient China [8][9][10].
Because an authoritative chemical-composition index is lacking, existing research by domestic and foreign scholars on the chemical composition of glass remains limited. Most scholars analyze a small number of elements using qualitative methods or tools, with relatively little substantive research. This paper aims to organize the data on the various chemical components and predict their content; unlike the work of other scholars, this paper focuses on the chemical composition of glass relics [11,12].
Through mathematical modeling, the classification and quantitative analysis of the known data on ancient glass relics provide accurate results. Combined with the theoretical rate of change of the chemical substances, this offers a theoretical basis for the reasonable prediction of chemical contents and the classification of cultural relics. In terms of the protection of cultural relics, we enrich the understanding of the current preservation status and the weathering mechanism, which provides a scientific basis for protection-and-restoration schemes. This research expands the study of glass cultural relics and, at the same time, has special significance for field operations, archaeological sites, and cultural protection. (1) In this paper, various elements are incorporated into the same analytical framework and combined with the theoretical rate of change of the various chemical elements to systematically analyze their change trends, so as to supplement the domestic theoretical research on the changes in the chemical elements of glass relics and provide theoretical support for research on the protection and restoration of glass relics. (2) From an empirical perspective, this paper uses a time-series model to predict the composition of various chemical elements, clusters the glass relics based on their characteristics, and provides relevant policy suggestions, so as to provide reference values for archaeological sites in China [13][14][15].

Data Sources and Basic Assumptions
The data are all from question C of the 2022 National College Students Mathematical Modeling Competition. To facilitate the research, the following assumptions are made [16]: (1) Environmental assumption: the glass is weathered under natural conditions, without special treatment such as high temperatures; (2) Reasonability assumption: the changes in the individual elements influence each other; (3) Unity assumption: the four stages (no surface weathering, local unweathered; surface weathering, local unweathered; surface weathering, local weathering; and surface weathering, severe local weathering) each last the same length of time; (4) Substantial assumption: even in the first stage (no surface weathering, local unweathered), weathering has already begun; (5) Exclusivity assumption: all cultural relics are unaffected by their ornamentation, type, and color, and the content of their chemical composition before weathering is a standard fixed value [17,18].

Missing Data
Based on the data, four values are missing from the "color" column, at entries 19, 40, 48, and 58. The common characteristic of these four cultural relics is that their surfaces were weathered. We speculate that the surface weathering had progressed to a level at which their color could not be observed. To support subsequent data processing, we filled in the four missing color values with a reasonable supplement [19][20][21].

Data Preprocessing
According to the requirements, samples whose component proportions sum to between 85% and 105% are regarded as valid data. The component proportions of sampling sites 15 and 17 summed to 79.47% and 71.89%, respectively; therefore, we deleted the data of these two sampling sites. Finally, the number of valid values obtained was 58.
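The validity rule above can be sketched in a few lines of Python. This is only an illustration: the sample identifiers and composition values are made up, and the function names `is_valid` and `filter_valid` are hypothetical rather than taken from the original analysis.

```python
# Keep only samples whose component percentages sum to a value in [85, 105].
# All sample IDs and percentages below are hypothetical illustration data.

def is_valid(components, low=85.0, high=105.0):
    """Return True if the summed component percentages lie within [low, high]."""
    total = sum(components.values())
    return low <= total <= high


def filter_valid(samples):
    """Split a dict of samples into (kept, dropped) according to is_valid."""
    kept, dropped = {}, {}
    for sample_id, components in samples.items():
        target = kept if is_valid(components) else dropped
        target[sample_id] = components
    return kept, dropped


samples = {
    # Sums to roughly 97.57, so it is kept
    "A": {"SiO2": 69.33, "K2O": 9.99, "CaO": 6.32, "Al2O3": 3.93, "other": 8.0},
    # Sums to roughly 79.47, so it is dropped
    "B": {"SiO2": 61.71, "K2O": 5.19, "CaO": 2.0, "Al2O3": 5.5, "other": 5.07},
}
kept, dropped = filter_valid(samples)
print("kept:", sorted(kept), "dropped:", sorted(dropped))
```

Keeping the dropped samples in a separate dictionary makes it easy to report which sampling sites were excluded and why.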

Data Visualization
According to the existing data, the relevant information about the types, colors, patterns and weathering of different cultural relics was collected, and the types, colors and patterns were labeled as first-level characteristics, so that the data could be clearly displayed.
We used the chi-square test to analyze the correlation between the surface weathering of cultural relics and their characteristics. First, the null hypothesis is assumed to be true, and the chi-square statistic is calculated on this premise; it represents the degree of deviation between the observed values and the theoretical values. From the chi-square distribution and the degrees of freedom, the probability p of obtaining the current statistic or a more extreme one can be determined if the null hypothesis holds. If the p-value is small, the deviation between the observed and theoretical values is too large, and the null hypothesis should be rejected, indicating significant differences between the compared data; otherwise, the null hypothesis cannot be rejected, and the actual situation represented by the sample cannot be considered different from the theoretical hypothesis.
In Tables 1 and 2, it is possible to observe the relationship between the surface weathering of cultural relics and their characteristics. At the 95% confidence level, the p-value of decoration is 0.08, so the null hypothesis cannot be rejected; the p-value of type is 0.01, so the null hypothesis should be rejected; the p-value of color is 0.42, so the null hypothesis cannot be rejected. Therefore, the correlation between surface weathering and glass type is significant, whereas the correlations with decoration and color are not significant [22].
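As a minimal sketch of the chi-square test of independence described above, the following Python code computes the statistic, the degrees of freedom, and the p-value for a contingency table. The counts in the example are hypothetical and are not the values from Tables 1 and 2. The function names are illustrative, and the code assumes every expected cell count is positive.

```python
import math


def chi_square_statistic(table):
    """Return (statistic, degrees_of_freedom) for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


def chi_square_sf(x, dof):
    """Upper-tail probability P(X >= x) for a chi-square variable with integer dof >= 1."""
    half = x / 2.0
    if dof % 2 == 0:
        # Even dof = 2m: exp(-x/2) * sum_{j=0}^{m-1} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, dof // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd dof = 2m+1: erfc(sqrt(x/2)) + exp(-x/2) * sum_{j=1}^{m} (x/2)^(j-1/2) / Gamma(j+1/2)
    total = 0.0
    for j in range(1, (dof - 1) // 2 + 1):
        total += half ** (j - 0.5) / math.gamma(j + 0.5)
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


# Hypothetical counts: rows = weathered / unweathered, columns = two glass types
table = [[10, 20], [20, 10]]
stat, dof = chi_square_statistic(table)
p_value = chi_square_sf(stat, dof)
print(f"chi2 = {stat:.3f}, dof = {dof}, p = {p_value:.4f}")
print("reject H0 at 0.05" if p_value < 0.05 else "cannot reject H0 at 0.05")
```

The closed-form tail probability avoids any third-party dependency; in practice a statistics package would normally be used instead.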

Research Ideas
Based on the data presented in this paper, the surface weathering of the ancient glass cultural relics was first divided into four stages: no surface weathering, local unweathered; surface weathering, local unweathered; surface weathering, local weathering; and surface weathering, severe local weathering. Secondly, for each of the four stages, the mean value of each component was calculated to represent the criteria of that stage. With the relevant literature as theoretical support, and combined with the theoretical rate of change of the chemical substances, weight values were assigned to each element. The composite score was calculated by multiplying the weight coefficient of each element (the element-ratio weight of each glass type) by the mean, and the scores were then ranked. According to the analysis, the lower the score, the greater the degree of weathering. A total of 56 samples were processed in turn. The relationship analysis of the four-stage means gave the initial value. This completed the multivariate time-series prediction. The specific process is shown in Figure 1.
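The composite-score step can be illustrated with a short Python sketch. The weights and stage means below are hypothetical placeholders, not the values in Table 3, and only two components are shown for brevity.

```python
# Composite score = sum over components of (weight * mean content).
# Weights and stage means are hypothetical; the paper's weights are in its Table 3.

def composite_score(mean_content, weights):
    """Weighted sum of mean component contents for one weathering stage."""
    return sum(weights[name] * mean_content[name] for name in weights)


def rank_stages(stage_means, weights):
    """Return (stage, score) pairs sorted from highest to lowest score.

    Following the text, a lower score indicates a greater degree of weathering,
    so the most weathered stage appears last.
    """
    scores = {
        stage: composite_score(means, weights)
        for stage, means in stage_means.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


weights = {"SiO2": 0.5, "K2O": 0.5}
stage_means = {
    "stage_1": {"SiO2": 60.0, "K2O": 10.0},
    "stage_2": {"SiO2": 40.0, "K2O": 4.0},
}
ranking = rank_stages(stage_means, weights)
print(ranking)
```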

Preparation of the Model
First, based on the weathering characteristics of the data (Supplementary Materials), we divided the cultural relics into the following four stages: no surface weathering, local unweathered; surface weathering, local unweathered; surface weathering, local weathering; and surface weathering, severe local weathering.
Next, we performed a further analysis of the four stages. By referring to the literature, we determined that in glass products, after weathering, the content of compounds composed of Si and Na can change significantly. For the group of elements comprising K, Ca, Al, and Pb, the content of the resultant compounds changes somewhat after weathering. Thus, according to the literature, Si, Na, K, Ca, Pb, and Al are important variables in the changes caused by glass weathering, and we assigned their weights according to the proportional change in each element's content before and after weathering. In order to make full use of the form data, we also weighted the remaining elements of the 14 chemical components. The final resulting weight ratios of the 14 chemical components are shown in Table 3 [23][24][25].
We organized the data and analyzed the time-series data with R, which can not only describe patterns in historical data over time, but can also be used for further study and prediction. A multivariate autoregressive model was fitted using the VAR() function in the R package "vars". First, each variable was plotted against the sampling points of the relics, as shown in Figure 2.
In the figure, it can be seen that the changes in SiO2 were relatively regular and the SnO2 and SO2 changes were relatively stable, while the K2O and Na2O changes were relatively stable in the early stage but fluctuated greatly in the later stage. The other components fluctuated throughout.
Second, we created a probability-density map, on which the abscissa represents the value interval of each variable, the ordinate represents the probability density of each variable at that value, and the area between the curve and the x-axis is 1. In Figure 3, we can see where the values of each variable are mainly concentrated [26,27].
Molecules 2023, 28, x FOR PEER REVIEW
Next, we calculated the correlation between the variables, with the following formula:

$$r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}$$

where $x_i$ and $y_i$ are the values of two variables at sampling point $i$, and $\bar{x}$ and $\bar{y}$ are their means. According to the calculation results, the closer the absolute value is to 1, the stronger the correlation. When the result is closer to 1, the positive correlation is stronger, and when the result is closer to −1, the negative correlation is stronger. Based on the results, we created an analysis diagram of the correlation between the variables. In Figure 4, it can be seen that there were many obvious correlations between the variables. The result on the diagonal is the correlation of each element with itself; since this value is always 1, we examined only the off-diagonal results. We found that silica, potassium oxide, lead oxide, barium oxide, phosphorus pentoxide, and strontium oxide had an obvious correlation with the artificial sampling points. The artificial sampling points are the parts of glass relics that are analyzed according to the weathering and the content of each element [28].
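A minimal Python sketch of the correlation calculation follows, assuming the coefficient in question is the standard Pearson correlation. The series in the example are hypothetical, the function names are illustrative, and the code assumes that no variable is constant (a constant series would give a zero denominator).

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Pairwise Pearson correlations for a dict mapping variable name to values."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical series indexed by sampling point
columns = {
    "sampling_point": [1, 2, 3, 4, 5],
    "SiO2": [60.0, 62.5, 65.0, 70.0, 74.0],
    "K2O": [9.0, 8.0, 6.5, 3.0, 1.0],
}
matrix = correlation_matrix(columns)
for name, row in matrix.items():
    print(name, {other: round(value, 3) for other, value in row.items()})
```

The diagonal entries equal 1 by construction, which is why only the off-diagonal entries carry information.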
Finally, we took various elements of each cultural relic as independent variables, predicted various elements in chronological order, and created the prediction plot. In Figure 5, as in the first figure, the black dots represent the true values, and the last five blue dots represent the predicted values. Based on the change results, we believe that the prediction results can better reflect the sequence-change characteristics [29][30][31].

Solution of the Model
We created a five-step prediction of the time series, and the results are shown in Table 4. The relationship analysis was used to derive the initial value and to create the five-step prediction. For components whose predicted value was less than 0, we assume that the corresponding chemical reaction had not occurred in the initial stage, so the value can be directly set to 0. We compared the non-zero values with the initial value derived from the mean and considered the values that were close to the initial value reasonable. The specific results are shown in Table 5.
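The five-step prediction and the zero-clamping rule can be sketched as follows. This is not the fitted model from the paper, which was estimated with VAR() in R. The sketch assumes a lag order of 1, and the intercept, coefficient matrix, and last observation are hypothetical. Clamping is applied only to the reported values, not fed back into the recursion, which is also an assumption.

```python
# Five-step forecast from a VAR(1) recursion x_{t+1} = c + A x_t.
# Lag order, intercept c, and coefficient matrix A are hypothetical.

def var1_forecast(last_obs, intercept, coef, steps=5):
    """Iterate the VAR(1) recursion and return a list of forecast vectors."""
    forecasts = []
    current = list(last_obs)
    size = len(current)
    for _ in range(steps):
        current = [
            intercept[i] + sum(coef[i][j] * current[j] for j in range(size))
            for i in range(size)
        ]
        forecasts.append(current)
    return forecasts


def clamp_negative(forecasts):
    """Replace negative predicted contents with 0, as described in the text."""
    return [[max(0.0, value) for value in row] for row in forecasts]


last_obs = [2.0, 1.0]        # hypothetical contents of two components
intercept = [-1.0, 0.0]      # hypothetical intercept vector c
coef = [[1.0, 0.0],          # hypothetical coefficient matrix A
        [0.0, 1.0]]

raw = var1_forecast(last_obs, intercept, coef, steps=5)
reported = clamp_negative(raw)
for step, (r, c) in enumerate(zip(raw, reported), start=1):
    print(step, r, c)
```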

Research Ideas
We pre-processed the data from Form 1 and preliminarily classified the categories of the cultural relics. We used a one-way analysis of variance to select the appropriate chemical components for the high-potassium and lead-barium cultural relics to obtain the first classification results. We used the K-means algorithm to divide the glass types into subclasses and analyzed the specific division methods and results. The rationality of the results is supported if the form is classified again and the results are similar to the clustering results; the sensitivity is judged by adjusting the significance level in the one-way analysis of variance.

Selection of the Appropriate Chemical Composition
(1) Model Principle One-way analysis of variance (ANOVA) is a method of analyzing single-factor test results and testing whether the factor has a significant impact on the test results.
Assume that the collected data are sample values drawn from $s$ different populations (each level corresponds to one population), and denote the population means by $u_1, u_2, \cdots, u_s$. The following hypotheses need to be tested: Null hypothesis: $H_0: u_1 = u_2 = \cdots = u_s$. Alternative hypothesis: $H_1: u_1, u_2, \cdots, u_s$ are not all equal. Introducing the level effect $\delta_j = u_j - u$ $(j = 1, 2, \cdots, s)$, where $u$ denotes the overall mean, the hypotheses are equivalent to $H_0: \delta_1 = \delta_2 = \cdots = \delta_s = 0$ against $H_1$: the $\delta_j$ are not all zero.
Thus, when $H_0$ is true, the test statistic of the one-way ANOVA follows an F distribution:

$$F = \frac{S_A/(s-1)}{S_E/(n-s)} \sim F(s-1,\, n-s)$$

where $S_A$ is the between-group sum of squares, $S_E$ is the within-group sum of squares, and $n$ is the total number of observations. With the significance level $\alpha$, the rejection domain of the test problem is:

$$F \geq F_{\alpha}(s-1,\, n-s)$$

When $F$ falls in this domain, the null hypothesis is rejected, indicating significant differences between the samples.
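As an illustration of the one-way ANOVA principle, the Python sketch below computes the F statistic as the ratio of the between-group mean square to the within-group mean square. The group data are hypothetical and the function name is illustrative. The p-value is not computed here; the statistic would be compared with a critical value of the F distribution taken from a table or a statistics package.

```python
# One-way ANOVA F statistic: between-group mean square / within-group mean square.
# The groups below are hypothetical component contents for three classes.

def one_way_anova_f(groups):
    """Return (F, dof_between, dof_within) for a list of groups of observations."""
    s = len(groups)                               # number of levels (populations)
    n = sum(len(group) for group in groups)       # total number of observations
    grand_mean = sum(sum(group) for group in groups) / n
    group_means = [sum(group) / len(group) for group in groups]
    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2
        for group, mean in zip(groups, group_means)
    )
    ss_within = sum(
        (value - mean) ** 2
        for group, mean in zip(groups, group_means)
        for value in group
    )
    f_stat = (ss_between / (s - 1)) / (ss_within / (n - s))
    return f_stat, s - 1, n - s


groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, dof_between, dof_within = one_way_anova_f(groups)
print(f"F = {f_stat:.3f} with ({dof_between}, {dof_within}) degrees of freedom")
```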
(2) Model Building First, we investigated whether the fourteen chemical components have a significant effect on the classification results of high-potassium-type glass.
Therefore, we established the following hypotheses: Null hypothesis: The fourteen chemical components will not have a significant impact on the classification results of high-potassium-type glass. Alternative hypothesis: The fourteen chemical components will have a significant impact on the classification results of high-potassium-type glass.
In the ANOVA results in Table 6, the p-values of SiO2, K2O, CaO, Al2O3, and Fe2O3 are less than 0.05. The null hypothesis is therefore rejected for these components: SiO2, K2O, CaO, Al2O3, and Fe2O3 significantly affect the classification of high-potassium-type glass, so we selected these five chemical components to subdivide the subclasses of high-potassium-type glass. Similarly, we investigated whether the fourteen chemical components have a significant impact on the classification results of lead-barium-type glass.
Therefore, we established the following hypotheses: Null hypothesis: The fourteen chemical components will not significantly affect the results of lead-barium-type-glass classification.
Alternative hypothesis: The fourteen chemical components will have a significant impact on the results of lead-barium-type-glass classification.
In the ANOVA results in Table 6, it can be seen that the p-values of SiO2, Na2O, Al2O3, CuO, PbO, BaO, P2O5, SrO, and SO2 are less than 0.05. The null hypothesis is therefore rejected for these components: SiO2, Na2O, Al2O3, CuO, PbO, BaO, P2O5, SrO, and SO2 significantly affect the classification of lead-barium-type glass, so we selected these nine chemical components to subdivide the subclasses of lead-barium-type glass.

Subclass Division
(1) Model Preparation To divide each category into subclasses using the chemical components selected above, we use the K-means algorithm.
The need to set the value of K in advance is a known limitation of the algorithm. In order to choose K more reliably, we used the fast-clustering method to determine the value of K in the K-means algorithm and obtained a K value of 3 through systematic clustering in SPSS software [37,38].
(2) Model Building Step 1: we randomly selected K samples from the sample set as the initial mean vectors. Step 2: we calculated the distance from each sample to each mean vector and assigned the sample to the cluster of its nearest mean vector. Step 3: after the assignment, the center point of each cluster was redetermined as the mean (centroid) of all samples in that cluster. Steps 2 and 3 were repeated until the subclass subdivision of the high-potassium glass and the lead-barium glass was completed.
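The three steps above can be sketched in plain Python as follows. The paper's clustering was carried out in SPSS; this sketch only illustrates the standard K-means iteration. The data points, the fixed random seed, the stopping rule (assignments no longer change), and the handling of empty clusters (the old center is kept) are all assumptions made for the example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equally long vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(points):
    """Component-wise mean (centroid) of a non-empty list of vectors."""
    return [sum(column) / len(points) for column in zip(*points)]


def k_means(points, k, max_iter=100, seed=0):
    """Return (labels, centers) after running the K-means iteration."""
    rng = random.Random(seed)
    # Step 1: pick K samples at random as the initial mean vectors
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Step 2: assign each sample to the cluster of its nearest mean vector
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers will not move again
        labels = new_labels
        # Step 3: recompute each center as the centroid of its members
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the previous center if a cluster becomes empty
                centers[c] = mean_vector(members)
    return labels, centers


# Hypothetical two-component samples forming two well-separated groups
points = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
          [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
labels, centers = k_means(points, k=2)
print(labels)
print(centers)
```

Because the result depends on the random initial centers, running the sketch with several seeds and comparing the outcomes is a common precaution.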

Model Analysis
(1) Rationality Analysis In order to verify the rationality of the classification results, we used the principal-component-analysis method to cluster the data and compared the resulting classification with the K-means classification results. If the two results do not differ greatly, the classification results are reasonable; otherwise, they are not reasonable. According to the results obtained from the above cluster analysis, the data were divided into two categories; therefore, we also extracted the same two categories using the principal-component-analysis method. The principal-component-analysis steps were as follows. With $n$ cultural relics and $p$ indicators, the initial sample matrix is:

$$X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{pmatrix}$$

We then calculate the eigenvalues and the eigenvectors $e_j$ of the inter-index correlation coefficient matrix $R$ and obtain the principal components $W_j$: $W_j = X e_j$.
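The principal-component step can be sketched in plain Python under some stated assumptions. The sketch standardizes each indicator before forming the correlation matrix, and it uses power iteration to approximate only the leading eigenpair; neither choice is confirmed by the paper, which does not say how the eigenvalues were computed. The data matrix is hypothetical, and the code assumes that no indicator is constant.

```python
import math


def standardize(matrix):
    """Center each column and scale it to unit sample standard deviation."""
    n = len(matrix)
    columns = list(zip(*matrix))
    means = [sum(col) / n for col in columns]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in col) / (n - 1))
        for col, m in zip(columns, means)
    ]
    return [
        [(row[j] - means[j]) / stds[j] for j in range(len(columns))]
        for row in matrix
    ]


def correlation_from_standardized(z):
    """Correlation matrix R = Z^T Z / (n - 1) for standardized data Z."""
    n, p = len(z), len(z[0])
    return [
        [sum(row[a] * row[b] for row in z) / (n - 1) for b in range(p)]
        for a in range(p)
    ]


def leading_eigenpair(r, iterations=500):
    """Approximate the largest eigenvalue and its unit eigenvector by power iteration."""
    p = len(r)
    v = [1.0 / (j + 1) for j in range(p)]  # asymmetric start vector
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(r[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
        eigenvalue = norm
    return eigenvalue, v


# Hypothetical data: 4 relics (rows) by 2 indicators (columns)
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
z = standardize(data)
r = correlation_from_standardized(z)
eigenvalue, e1 = leading_eigenpair(r)
# First principal component scores: W_1 = Z e_1
scores = [sum(row[j] * e1[j] for j in range(len(e1))) for row in z]
print(eigenvalue, e1, scores)
```

Power iteration recovers one component at a time; a full eigen-decomposition routine from a numerical library would normally be used when several components are needed.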
(2) Sensitivity Analysis During the extraction of the significant chemical-component content, the significance level of the one-way ANOVA was determined to be 0.05. To explore the sensitivity of our classification results, we adjusted the significance levels to 0.01 and 0.1, respectively, and the specific experimental procedures and results were as follows.
The table shows that at the significance levels of 0.1 and 0.01, although the classification of the high-potassium and lead-barium glass was perturbed, only a few relics changed class; therefore, we believe that the sensitivity of the glass-classification rule is low.
In conclusion, although changing the significance level can alter the classification results, the number of changes is small and has only a weak impact on the population; therefore, we believe that the results obtained by the K-means classification have low sensitivity.
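The first step of the sensitivity analysis, re-selecting the significant components at each significance level, can be sketched as follows. The p-values in the example are hypothetical and are not those reported in the paper; the subsequent re-clustering with the selected components is not shown.

```python
# Select the components whose ANOVA p-value is below each significance level.
# The p-values below are hypothetical illustration values.

def select_components(p_values, alpha):
    """Return the sorted names of components with p-value below alpha."""
    return sorted(name for name, p in p_values.items() if p < alpha)


def sensitivity(p_values, alphas=(0.01, 0.05, 0.1)):
    """Map each significance level to the components selected at that level."""
    return {alpha: select_components(p_values, alpha) for alpha in alphas}


p_values = {"SiO2": 0.001, "K2O": 0.03, "CaO": 0.07, "MgO": 0.5}
selected = sensitivity(p_values)
for alpha, names in selected.items():
    print(alpha, names)
```

Comparing the selected sets across levels shows directly which components enter or leave the clustering features when the threshold changes.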

Conclusions
By converting the relevant data to a time series, we can derive the initial value of each element variable. Based on the initial value of each element variable, combined with the existing element content, we can analyze the chemical reactions on the surfaces of glass artifacts. At the same time, based on the classification of glass into high-potassium and lead-barium types, the use of subclass division can help us to obtain a more detailed understanding of glass artifacts. By providing a reasonable initial prediction of the chemical-composition content and a theoretical basis for the division of cultural-relic categories, such studies not only enrich the understanding of the preservation status and weathering mechanism of cultural relics but are also of special significance in field operations, archaeological sites, and cultural-relic protection, among other applications. They also provide a scientific basis for the formulation of protection-and-restoration programs. Based on the research conclusions, the following policy suggestions are made:
First, increase the innovative research on glass relics. Most of the existing studies are on the protection and restoration of cultural relics. With the help of instruments and equipment, research on specific elements should widen its scope and conduct a more comprehensive study of glass relics.
Second, enrich the explanatory variables by combining color, category, decoration, and so on with chemical elements when classifying glass relics. This makes research on glass relics more relevant.
Third, we should attach importance to the coordinated development of the study of environmental systems and technological innovation. In the research on glass cultural relics, we should pay attention to environmental protection and encourage technological innovation through technical exchanges at home and abroad. This would highlight the development of glass technology at various points in time, as well as the integration of Chinese and Western glass technology.
Funding: This work was supported by the major project of the National Social Science Fund, "Statistical Study on the Impact of Digital Economy on Carbon Emission" (22BTJ048), and its phased research results.