Identification of Saline Soils Using Soil Geochemical Data: A Case Study in Soda-Salinization Areas, NE China

Abstract: Identifying saline soils is of great importance for protecting land resources and for the sustainable development of agriculture. Total soil salinity (TSS) is the most commonly used indicator for determining soil salinization, but the application of soil geochemical data is rarely reported. In general, there is a significant relationship between TSS and the content of soil-soluble Na, which can be estimated by the difference between the bulk-soil Na₂O content and its background value. In this study, the partial least squares regression (PLSR) method was employed to calculate the Na₂O background value via a regression model between Na₂O and SiO₂, Al₂O₃, TFe₂O₃, Cr, Nb, and P in a 1:250,000 scale regional geochemical data set of soils in Jilin Province, NE China. We defined δNa as the difference between the bulk-soil Na₂O value and the regression background value, which can be used as a geochemical indicator to identify saline soils. One hundred and five samples with known TSS contents in the study area were selected to test the capability of the indicator δNa. The result shows that the identification accuracy can be up to 75%, indicating that the indicator can provide a new means for saline soil identification.


Introduction
Salinization refers to the continuous accumulation of water-soluble salts on the soil surface to the extent that they affect agricultural production, environmental health, and economic development [1][2][3]. As a major soil degradation process, soil salinization can cause soil desertification, worsen soil physical and chemical properties, and reduce the available land area. It has increasingly become a serious ecological and environmental problem threatening agricultural production, food security, and sustainability in arid and semi-arid regions worldwide. Identifying soil salinization is therefore of great significance for land ecological protection and sustainable agricultural development.
At present, the most commonly used indicator for determining soil salinization and its degree is total soil salinity (TSS), the sum of soluble Na⁺, Mg²⁺, Ca²⁺, K⁺, HCO₃⁻, CO₃²⁻, Cl⁻, and SO₄²⁻ contents in soils [4,5]. This method, however, has limitations, such as high costs and a heavy laboratory workload. More importantly, regional investigations of soil soluble salts are relatively few, and the amount of TSS data is limited, which restricts research on the spatial distribution and the origin of soil salinization.
Recently, retrieving salinization from remote sensing spectral data of soils has attracted increasing attention [6][7][8]. This method undoubtedly has the advantages of a wide monitoring range, strong continuous monitoring ability, and fast information updates. However, remote sensing interpretation is only an indirect retrieval method and is greatly affected by soil moisture content, vegetation development, data processing models, and personnel experience, so the reliability of the retrieval result is sometimes low.
Over the past several decades, China has accumulated a large amount of soil geochemical data at various scales. These compositional data have the advantages of wide [...]
Located in the hinterland of the Songnen Plain, the west of Jilin Province is one of China's most important grain-producing areas. Affected by natural and human factors, such as the semi-arid climate, high evaporation intensity, low and flat terrain, rising groundwater levels, irrational development of agriculture and animal husbandry, and poor drainage, the soil in some areas of this region presents a severe salinization problem [9][10][11][12][13][14]. In this paper, western Jilin Province (Baicheng, Taonan, Da'an, Zhenlai, Songyuan, Tongyu, Qian'an, and Qianguo) was selected as the study area (Figure 1).
The study area covers about 30,000 km² and contains many lakes and marshes of different sizes. It has little topographic relief, and the terrain is flat and open. The terrain of Da'an and Zhenlai in the middle is the lowest, and that of Qianguo and Qian'an in the southeast and Taonan in the northwest is higher. The study area is located in the semi-arid monsoon region and belongs to the continental monsoon climate of East Asia, with four distinct seasons: hot and rainy in summer, and cold and dry in winter. The annual average temperature and precipitation are 4.3 °C and about 410 mm, respectively. Evaporation is far greater than rainfall in the study area.
Tectonically, the study area is located at the "bottom" of the Songliao Basin. The lower terrain creates favorable conditions for the accumulation of soluble salts and the formation of saline soils. The saline soils are mainly distributed in the central part, where numerous lakes and marshes occur, and in the northern part of the study area (Figure 1). Previous studies show that saline soils in the study area are mainly soda-type, with cations primarily of Na⁺ and anions of HCO₃⁻ and CO₃²⁻ [15,16]. In addition, the topsoil layer is the main place for salt accumulation [17][18][19].

Data
A total of 7631 surface soil samples were collected from the study area, with a sampling depth of 0–20 cm. The sampling density was about 1 sample/km², and 4 neighboring samples constituted 1 analytical sample, which can reflect the soil geochemical information within 4 km². The sample collection rigorously followed the specification of the multi-purpose regional geochemical survey (1:250,000) of the China Geological Survey. To improve the representativeness of samples, 3 to 5 subsamples were collected in each grid cell (1 km²) and combined into a single sample.
The contents of SiO₂, Al₂O₃, TFe₂O₃, Na₂O, K₂O, Ce, Co, Cr, Ni, La, Nb, Ti, Th, V, Y, Zr, and P were analyzed using X-ray fluorescence (XRF) spectrometry. One hundred and five typical soil samples were selected for the analysis of soluble Mg²⁺ and Ca²⁺ by atomic absorption spectrometry, soluble K⁺ and Na⁺ by flame photometry, and soluble Cl⁻, HCO₃⁻, CO₃²⁻, and SO₄²⁻ by titration. The analytical accuracy of major elements, trace elements, and soluble ions was generally better than 5%, 10%, and 5%, respectively. The major element unit is wt%, the trace element unit is 10⁻⁶, and the soluble ion unit is g/kg.

Basic Principles
Linear regression models were first considered to establish the relationship between soil chemical composition and Na₂O. Because soil geochemical data are constrained to sum to 100%, the elements are inevitably multicollinear (multiply correlated), which is difficult to handle with ordinary regression analysis. Therefore, the partial least squares regression (PLSR) method was employed in this study [20][21][22][23][24]. PLSR was first proposed by S. Wold in 1984 to solve the problem of multiple correlations between variables [25]. The main steps of the PLSR method are as follows. First, the independent and dependent variables are standardized to obtain E₀ and F₀, respectively. The first components p₁ and q₁ are then extracted, where p₁ = E₀w₁, q₁ = F₀v₁, ‖w₁‖ = 1, and ‖v₁‖ = 1. For component p₁, regression equations of E₀ and F₀ on p₁ are established, leaving the residuals E₁ and F₁. If the residuals E₁ and F₁ do not meet the accuracy requirement, further components are extracted from them until the requirement is met. For the extracted components p₁, p₂, ..., pₘ, a regression model of F₀ on these components is established. Finally, after the standardization is reversed and the terms are rearranged, we obtain Y = XA + Fₘ, where Y represents the predicted value of the model, X represents the values of the independent variables, A is the regression coefficient of the rearranged model, and Fₘ is the residual after rearrangement.
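For reference, the decomposition described in words above is commonly written as follows. This is a sketch of the standard PLSR formulation rather than a recovery of the authors' own equations; the coefficient symbols α and β are assumptions introduced here for illustration only.

```latex
\begin{aligned}
E_0 &= p_1 \alpha_1^{\mathsf{T}} + E_1, \qquad
F_0 = p_1 \beta_1^{\mathsf{T}} + F_1, \\
\alpha_1 &= \frac{E_0^{\mathsf{T}} p_1}{\lVert p_1 \rVert^2}, \qquad
\beta_1 = \frac{F_0^{\mathsf{T}} p_1}{\lVert p_1 \rVert^2}, \\
F_0 &= p_1 \beta_1^{\mathsf{T}} + p_2 \beta_2^{\mathsf{T}} + \cdots + p_m \beta_m^{\mathsf{T}} + F_m .
\end{aligned}
```

Because every component is a linear combination of the standardized predictors, the last line can be rewritten in terms of the original variables, which gives the form Y = XA + F_m quoted in the text.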

The Data Processing Procedure
The procedure to establish the relationship between soil chemical composition and Na₂O is as follows: (1) Select non-saline soil samples whose Na₂O contents are well below the upper limit of the non-saline range, approximately identified from the relationship between Na₂O and TSS, and use the frequency distribution of Na₂O to eliminate outliers and improve the accuracy of sample selection. (2) Select major and/or lithophile trace elements that are not, or only slightly, affected by salinization as the independent variables of the regression model. (3) To strengthen the prediction ability of the model, carry out cross-validation during model establishment. (4) Evaluate the model's fit by the determination coefficient R² and the predicted R²:
$$R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}$$
where y_i is the ith observed response value, ȳ is the average response value, and ŷ_i is the ith fitted response value. The predicted R² is obtained in the same way from the cross-validated predictions. Ideally, R² is close to the predicted R². The model is overfitted if the predicted R² is significantly smaller than R². When R² reaches its maximum and is close to the predicted R² value, the optimal model fit is obtained, and the corresponding number of components should be selected for the model.
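The two fit statistics can be illustrated with a minimal sketch. It assumes a one-variable least-squares line instead of the PLSR model and uses leave-one-out cross-validation for the predicted R²; the data values and all function names are hypothetical and not taken from the study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def r_squared(ys, preds):
    """1 - SS_res / SS_tot, with SS_tot taken about the mean of ys."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def fitted_and_predicted_r2(xs, ys):
    """Return (R2, predicted R2) using leave-one-out cross-validation."""
    a, b = fit_line(xs, ys)
    fitted = [a + b * x for x in xs]
    loo = []
    for i in range(len(xs)):
        # Refit without sample i, then predict the held-out sample.
        a_i, b_i = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        loo.append(a_i + b_i * xs[i])
    return r_squared(ys, fitted), r_squared(ys, loo)


# Made-up illustrative data (not from the study).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 1.9, 3.2, 3.8, 5.1, 5.9]
r2, pred_r2 = fitted_and_predicted_r2(xs, ys)
print(f"R2 = {r2:.3f}, predicted R2 = {pred_r2:.3f}")
```

For a least-squares fit the predicted R² cannot exceed R², so a large gap between the two is the signal of overfitting discussed above.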

Summary Statistics
The bulk-soil Na₂O content and pH range from 1.09 to 16.52 wt% and from 5.14 to 10.62, respectively, with average values of 2.32 wt% and 9.16, which are remarkably higher than those in the eastern plain of China (Table 1). This may be caused by soil salinization. The lower contents of TFe₂O₃, Al₂O₃, Co, Cr, Ni, Ti, Th, and Y in the soil of the study area may be related to the properties of soil parent materials. Previous studies show that the sedimentary materials of the Songnen Basin are mainly the weathering products of andesite, trachyte, rhyolite, and other magmatic rocks rich in alkali minerals, such as orthoclase, plagioclase, and nepheline. The contents of other elements in the soil samples are similar to those of the soils in the eastern plain of China.
The soluble salt in the soil has Na⁺ as the main cation and HCO₃⁻ as the main anion, which are the compositional characteristics of soda-type saline soils. The contents of Na⁺ and HCO₃⁻ vary from 0.012 to 5.221 g/kg and from 0.029 to 5.630 g/kg, respectively, with average values of 0.703 g/kg and 0.875 g/kg (Table 2). The TSS was calculated from the analysis results of the eight ions. TSS ranges from 0.169 g/kg to 18.629 g/kg, with an average value of 3.626 g/kg. The XRD analysis results of soluble salts in the surface soils of the study area also support the soda-type salinization. The soluble salts in saline soils mainly include thenardite (anhydrous mirabilite, Na₂SO₄), nahcolite (NaHCO₃), halite (NaCl), natural alkali (Na₂CO₃), carnallite (KCl·MgCl₂), and sylvite (KCl) (Figure 2).

Sample Selection
Following the two steps below, 1393 non-saline soil samples were selected from the study area. First, we roughly determined the upper limit of the bulk-soil Na₂O content of non-saline soils by observing the frequency distribution of the bulk-soil Na₂O of the 7631 surface soil samples from the study area, which should follow a normal distribution (Figure 3). The right boundary of the normal distribution curve representing non-saline soils is about 2.90 wt%, which should be the upper limit of the Na₂O content of non-saline soils in the study area.
Second, to improve the selection accuracy, the data of 105 typical soil samples analyzed for both soil-soluble salts and bulk-soil Na₂O were used to establish the correlation between bulk-soil Na₂O and TSS (Figure 4). Although the relationship is not very significant, the upper limit of the Na₂O content in non-saline soils can still be estimated from the regression equation. Generally, saline soils have TSS values of ≥1 g/kg, while non-saline soils have TSS values of <1 g/kg [27]. Therefore, the 1 g/kg TSS value was used to calculate the upper limit according to the equation TSS = −5.61 + 3.513 × Na₂O. The result is 1.88 wt%, much lower than the 2.90 wt% obtained from the frequency distribution histogram, and this lower value was used to choose non-saline soil samples.
Finally, 1393 non-saline soil samples were selected using the 1.88 wt% threshold to establish the PLSR model.
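The threshold arithmetic can be checked with a short sketch. The regression coefficients and the 1 g/kg limit are taken from the text; the function names, the strict "less than" comparison, and the sample values are assumptions made for illustration.

```python
# Regression between TSS (g/kg) and bulk-soil Na2O (wt%) quoted in the text:
#   TSS = -5.61 + 3.513 * Na2O
INTERCEPT = -5.61
SLOPE = 3.513
TSS_SALINE_LIMIT = 1.0  # g/kg, boundary between non-saline and saline soils


def na2o_upper_limit(tss_limit=TSS_SALINE_LIMIT):
    """Solve the regression equation for Na2O at the given TSS limit."""
    return (tss_limit - INTERCEPT) / SLOPE


def select_non_saline(na2o_values, limit):
    """Keep samples whose Na2O is below the limit (strict '<' is an assumption)."""
    return [v for v in na2o_values if v < limit]


limit = na2o_upper_limit()
print(round(limit, 2))  # 1.88

# Hypothetical Na2O contents (wt%), not data from the study.
sample_na2o = [1.45, 1.80, 1.95, 2.60, 3.40]
print(select_non_saline(sample_na2o, limit))  # [1.45, 1.8]
```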

The PLSR Model
In establishing the regression model, Na₂O was set as the dependent variable (Y), and SiO₂, TFe₂O₃, K₂O, Al₂O₃, Ce, Co, Cr, Ni, La, Nb, P, Ti, Th, V, Y, and Zr as the independent variables (X). Leave-one-out cross-validation was used. The result shows that K₂O, Ce, Ni, La, Ti, and Th play a limited role in the establishment of the regression model, indicating that these six elements have a very weak impact on the fitting of the model and the prediction of observed values. When the remaining elements (SiO₂, TFe₂O₃, Al₂O₃, Co, Cr, Nb, P, V, Y, and Zr) were used, the test and predicted R² both reached the maximum (Table 3). To improve the generalization ability of the model, unnecessary independent variables were further removed, provided that the fitting effect stayed the same. Finally, six elements were selected as independent variables for model creation, namely SiO₂, TFe₂O₃, Al₂O₃, Cr, Nb, and P. The regression equation explains 58.07% of the variation of the dependent variable (Table 3). The six independent variables, in descending order of influence on the model, are Al₂O₃, Cr, TFe₂O₃, Nb, P, and SiO₂.

The diagram of actual versus calculated response values of the PLSR model can be used to assess the model fit and the prediction for each observation. A close agreement between the fitted and cross-validated values indicates that the model performs adequately. In addition, the better the linear relationship presented by the data points in the response diagram, the higher the correlation between the dependent and independent variables.
The response diagram of bulk-soil Na₂O in non-saline soils from the study area shows that the fitted values are close to the cross-validated values (Figure 5), indicating that the PLSR model fits the data well and that there is a high correlation between bulk-soil Na₂O and the selected elements (Al₂O₃, Cr, TFe₂O₃, Nb, P, and SiO₂). The linear regression equation finally established is referred to as Equation (1).
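How a PLSR background model of this kind can be fitted is sketched below. It is a plain-Python single-response PLS (the NIPALS variant) written under the assumption that it mirrors the general method, not the software or settings the authors used; the function names and the toy data are hypothetical, and the coefficients of Equation (1) are not reproduced.

```python
import math


def _standardize(col):
    """Return (z-scores, mean, sample standard deviation) for one column."""
    n = len(col)
    mean = sum(col) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
    return [(v - mean) / std for v in col], mean, std


def pls1_fit(X, y, n_components):
    """Fit single-response PLS by NIPALS on standardized data."""
    cols = [_standardize([row[j] for row in X]) for j in range(len(X[0]))]
    f, y_mean, y_std = _standardize(y)
    # E holds the current (deflated) standardized predictor rows.
    E = [[cols[j][0][i] for j in range(len(cols))] for i in range(len(X))]
    components = []
    for _ in range(n_components):
        # Weight vector: direction of maximum covariance with the response.
        w = [sum(E[i][j] * f[i] for i in range(len(E))) for j in range(len(cols))]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
        t = [sum(e * wj for e, wj in zip(row, w)) for row in E]  # scores
        tt = sum(v * v for v in t)
        p = [sum(E[i][j] * t[i] for i in range(len(E))) / tt for j in range(len(cols))]
        q = sum(fi * ti for fi, ti in zip(f, t)) / tt
        # Deflate predictors and response before extracting the next component.
        E = [[e - ti * pj for e, pj in zip(row, p)] for row, ti in zip(E, t)]
        f = [fi - q * ti for fi, ti in zip(f, t)]
        components.append((w, p, q))
    x_stats = [(m, s) for _, m, s in cols]
    return {"components": components, "x_stats": x_stats, "y_stats": (y_mean, y_std)}


def pls1_predict(model, x):
    """Predict the response for one sample x (list of predictor values)."""
    e = [(v - m) / s for v, (m, s) in zip(x, model["x_stats"])]
    y_std_pred = 0.0
    for w, p, q in model["components"]:
        t = sum(ej * wj for ej, wj in zip(e, w))
        y_std_pred += q * t
        e = [ej - t * pj for ej, pj in zip(e, p)]
    y_mean, y_std = model["y_stats"]
    return y_mean + y_std * y_std_pred


# Toy data with an exact linear relation (hypothetical, not soil data).
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 4.0]]
y = [2 * a - b + 3 for a, b in X]
model = pls1_fit(X, y, n_components=2)
# With as many components as predictors, PLS reproduces ordinary least
# squares, so the prediction should equal 2*6 - 2 + 3 = 13 up to rounding.
print(round(pls1_predict(model, [6.0, 2.0]), 6))
```

In practice the number of components would be chosen by cross-validation, as described in the data processing procedure, rather than fixed in advance.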

The Application of δNa in Salinization Identification
From the point of view of geochemical exploration, identifying soil salinization amounts to determining the geochemical anomalies of bulk-soil Na₂O in the study area. We therefore established a geochemical indicator named δNa, defined as the difference between the measured and background values of Na₂O, to reflect soda salinization:
$$\delta\mathrm{Na} = \mathrm{Na_2O}_{\mathrm{measured}} - \mathrm{Na_2O}_{\mathrm{background}} \qquad (2)$$
Obviously, the higher the δNa value, the stronger the soil salinization. With the application of Equation (1), the background values of Na₂O in the 7631 surface soil samples from the study area were calculated, and then the δNa value of each soil sample was calculated using Equation (2). Subsequently, a contour map was drawn to show the distribution characteristics of δNa (Figure 6). The areas with high δNa values are mainly distributed in the middle and northern parts of the study area. Most of the high-value areas around lakes and marshes are island-shaped. The areas with low δNa values are concentrated mainly along the Songhua and Tao'er Rivers (Figure 6). Overall, the distribution of δNa corresponds well with that of soil salinization in the study area (Figures 1 and 6), indicating that the indicator δNa can effectively identify soil salinization in the study area.
In addition, the relationship between δNa and TSS was investigated using the 105 typical soil samples with known TSS values. The SiO₂, TFe₂O₃, Al₂O₃, Cr, Nb, and P contents of these 105 samples were substituted into Equation (1) to calculate the background values of bulk-soil Na₂O, and then the δNa values were obtained. Finally, the equation δNa = −0.04884 + 0.127 × TSS was established from these soils (Figure 7). The result shows that δNa has a significant linear relationship with TSS, and the correlation coefficient reaches 0.85, much higher than that of bulk-soil Na₂O. All these indicate that the geochemical indicator δNa can play an essential role in determining soda salinization.
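The indicator and its relation to TSS can be sketched as below. The regression coefficients are those quoted in the text; the function names and the measured and background values are hypothetical. Solving the fitted line for TSS is only an algebraic rearrangement used for illustration, not a separately fitted model.

```python
def delta_na(measured_na2o, background_na2o):
    """deltaNa (wt%): measured bulk-soil Na2O minus the regression background."""
    return measured_na2o - background_na2o


def estimate_tss(d_na, intercept=-0.04884, slope=0.127):
    """Rearrange deltaNa = intercept + slope * TSS to express TSS in g/kg."""
    return (d_na - intercept) / slope


# Hypothetical measured and background Na2O values (wt%), not study data.
d = delta_na(2.50, 1.70)
print(round(d, 2))                # 0.8
print(round(estimate_tss(d), 2))  # 6.68
```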

The Application of δNa in Determining Salinization Degrees
The TSS boundary values of 1 g/kg, 2 g/kg, 4 g/kg, and 10 g/kg are commonly used to identify salinization degrees [28]. To further evaluate the effect of the indicator δNa, the linear regression equation between δNa and TSS established above and these four TSS boundary values were used to calculate the δNa boundary values for identifying the degree of soil salinization in the study area. The result is listed in Table 4. According to the calculated δNa boundary values, the 7631 surface soil samples were divided into five groups: non-saline, mildly salinized, moderately salinized, severely salinized, and extremely saline soils. We numbered each group of soil samples and then used differently colored circles to draw the classification map of soil salinization in the study area (Figure 8).
About 56% of the surface soils in the study area have a salinization problem, which deserves attention. These saline soils have different degrees of salinization. The proportions of extremely saline, severely salinized, moderately salinized, and mildly salinized soils are 8%, 19%, 15%, and 14%, respectively. Overall, the distribution of these four types of saline soils in the study area is consistent with the actual situation (Figure 1), indicating that the correlation between δNa and TSS established above is reasonable, and that the indicator δNa can play an important role in identifying the degree of soda salinization.
To verify the reliability of the geochemical indicator δNa, the new method was applied to 105 typical soil samples with known TSS, including 4 extremely saline soils, 28 soils with severe salinization, 15 soils with moderate salinization, 12 soils with mild salinization, and 46 non-saline soils. The δNa boundary values of 0.08, 0.21, 0.46, and 1.22 calculated above were used to determine the salinization degree of these 105 typical soil samples. The results showed that 3 extremely saline soils, 25 soils with severe salinization, 10 soils with moderate salinization, 7 soils with mild salinization, and 34 non-saline soils were recognized, with accuracies of 75%, 86%, 67%, 60%, and 74%, respectively. The overall accuracy reached 75%, indicating that δNa can be used as one of the useful indirect geochemical indicators of salinization in this area.
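The boundary calculation and the overall accuracy can be reproduced with a short sketch. The regression coefficients, TSS boundaries, and sample counts come from the text; the class labels, the function names, and the choice to assign a value lying exactly on a boundary to the higher class are assumptions made for illustration.

```python
from bisect import bisect_right

TSS_BOUNDS = [1.0, 2.0, 4.0, 10.0]  # g/kg, boundaries quoted in the text
CLASSES = ["non-saline", "mild", "moderate", "severe", "extremely saline"]


def delta_na_bounds(tss_bounds, intercept=-0.04884, slope=0.127):
    """Map TSS boundaries to deltaNa boundaries with the fitted line."""
    return [round(intercept + slope * t, 2) for t in tss_bounds]


def classify(d_na, bounds):
    """Return the salinization class for one deltaNa value.

    bisect_right places a value equal to a boundary in the higher class,
    which is an assumption, not something stated in the source.
    """
    return CLASSES[bisect_right(bounds, d_na)]


bounds = delta_na_bounds(TSS_BOUNDS)
print(bounds)                  # [0.08, 0.21, 0.46, 1.22]
print(classify(0.30, bounds))  # moderate

# Correctly recognized and total sample counts per class, as listed in the
# text (extremely saline, severe, moderate, mild, non-saline).
correct = [3, 25, 10, 7, 34]
total = [4, 28, 15, 12, 46]
print(round(100 * sum(correct) / sum(total), 1))  # 75.2
```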

Origin of Soil Salinization in the Study Area
Soil salinization has always been closely related to hydrogeological conditions. The areas with a high salinization degree in the study area have low terrain, with altitudes mostly between 140 and 200 m. The study area is surrounded by the Greater Xing'an Mountains in the west, the Xiao Xing'an Mountains, and the Changbai Mountains in the east, from which a large amount of water and salt converges into the basin, providing a source of salt for salinization. This water is mainly bicarbonate water and contains a small amount of sodium silicate. During evaporation, the bicarbonate water changes into NaHCO₃- and Mg(HCO₃)₂-type water, and the concentration of sodium silicate in the water increases. The carbon dioxide in the water gradually transforms sodium silicate into silicic acid and sodium bicarbonate. Silicic acid precipitates, while sodium bicarbonate remains in the water, producing trace amounts of soda and increasing the pH of the water. As evaporation intensifies, the pH of the water increases significantly, and calcium bicarbonate and magnesium bicarbonate precipitate, resulting in a considerably higher Na⁺ content in the water than Ca²⁺ and Mg²⁺ [12,28,29].
Owing to tectonic movement, the Songliao watershed formed from the end of the Middle Pleistocene to the Late Pleistocene, and the river channel from the Nenjiang River to the Liaohe River was abandoned, forcing the Nenjiang River to join the Songhua River system. The diversion of river beds in the area has produced many lake clusters and enclosed or semi-enclosed depressions on a low, flat surface with minor relief, resulting in poor drainage and the formation of closed-flow lakes. After natural precipitation or the collection of water from higher ground, the water flows slowly, drainage is poor, and water accumulates. Because evaporation in this area exceeds precipitation, the water in lakes, flood plains, and depressions evaporates rapidly, so that the water salinity is high and the surrounding soil is seriously salinized [10][30][31][32].
Many factors contribute to the formation of soil salinization, including both natural and human factors [33]. One of the human factors may be the construction of numerous water conservancy projects with poor drainage, which has raised the groundwater level and brought salt to the soil surface. Another may be the irrational development of agriculture and animal husbandry, which has reduced the area and degraded the quality of the grassland in the study area and accelerated water evaporation from the surface. More attention should be paid to salinization in the study area.

Conclusions
In this study, the surface soils in the western part of Jilin Province, China, were taken as an example to explore the relationship between soil chemical composition and Na₂O by applying the partial least squares regression method, and the geochemical indicator δNa for soil salinization identification was established. The δNa indicator can identify soda-type salinization quickly, with an overall accuracy of about 75%, and has achieved good results in identifying soil salinization in the west of Jilin Province, China. In addition, this indirect salinization identification method can provide significant cost savings compared to the TSS method because the XRF analysis of soil chemical composition is relatively inexpensive. The research ideas of this paper can provide a reference for the identification of soda-type salinization in other areas, and different regression models can be established for areas with other types of soil salinization. More than half of the soils in the study area are affected by salinization, which deserves attention.