1. Introduction
Financing difficulties have always been an obstacle to the healthy development of China’s manufacturing SMEs. In recent years, the development of the supply chain finance model has greatly improved the financing efficiency of small and medium-sized enterprises [1,2]. The supply chain finance model has also attracted the attention of core enterprises in China’s manufacturing supply chain, small and medium-sized enterprises, and financial institutions. Supply chain finance can effectively improve the audit efficiency of financial institutions through the closed-loop management of capital flow, information flow, and business flow across the entire supply chain. At the same time, core enterprises in the supply chain can also reduce production costs and financing costs by integrating supply chain resources and providing credit guarantees to upstream and downstream enterprises, thereby enhancing their product competitiveness. Supply chain finance can reduce the credit risk of enterprises in the supply chain compared to the traditional financing model. However, small and medium-sized manufacturing enterprises in China face a more complex operating environment, characterized by significant operational risk, opaque financial information, and more intermediate links with core enterprises, so credit risk is still the main obstacle for current supply chain finance businesses.
Corporate credit risk should be carefully assessed since the accuracy of the assessment directly affects the future income levels of financial institutions [3]. In this context, academic and financial circles are actively exploring how to more accurately assess the credit risk in supply chain finance. For example, some scholars have used statistical methods, such as logistic regression and linear regression, to establish enterprise credit risk assessment models. Commercial banks in China usually use credit ratings to review the credit risk of financing enterprises. These methods are effective for large enterprises with transparent information, sound financial systems, and credit records, but they struggle to achieve the desired results for small and medium-sized manufacturing enterprises because of the special nature of these enterprises.
In recent years, the development of artificial intelligence technology has provided new ideas for the credit evaluation of manufacturing SMEs. Existing research shows that machine learning methods, such as the neural network, random forest, and support vector machine models, have higher accuracy than traditional credit evaluation methods, especially in the evaluation of nonlinear factors [4]. Machine learning algorithms can be trained with existing data to extract knowledge and predict future corporate credit risks based on the trained model. It is worth noting that machine learning methods can produce relatively accurate predictions for a small amount of data based on a trained model. Although the use of machine learning models to predict credit risk has produced great achievements, there are still some shortcomings. For example, Li et al. [5] proposed a hybrid model combining logistic regression and an artificial neural network to predict the credit risk of SMEs. However, their research objects were SMEs in general, and the research results could not reflect the real credit risk situation of the manufacturing industry. Fayyaz et al. [6] used a supervised learning algorithm to predict participant credit risk from the perspective of participant network attributes. Although these authors used data for Iran’s automobile industry for analysis, they did not take into account the impact of the overall operation of the supply chain. Xuan [7] argued that a risk evaluation index based on a fuzzy preference relationship and risk-related theory can accurately reflect the credit risk status of a research enterprise. However, this author did not undertake a corresponding analysis of how the risk indicators affect the credit risk status of enterprises. According to the above analysis, although the application of machine learning methods to the credit evaluation of SMEs has been recognized in academic and financial circles, current research on enterprise credit risk assessment cannot truly reflect the real situation of manufacturing SMEs.
Therefore, based on the real-life scenario of small and medium-sized manufacturing enterprises in China, this paper makes the following research contributions: (Ⅰ) Taking small and medium-sized manufacturing enterprises in China as the research object and analyzing their special environment and characteristics, this paper determines the credit risk evaluation indicators suitable for these enterprises in order to reflect the status of manufacturing credit risk more truthfully. (Ⅱ) Most existing studies take a single SME as the risk assessment object but ignore the impact of upstream and downstream enterprises in the supply chain and of the overall operation of the supply chain on the assessed enterprise. Additionally, commercial competition today no longer takes place between single enterprises; development has shifted from the single enterprise to the supply chain. The core enterprises in the supply chain and the operation status of the supply chain have increasingly important influences on the operation of SMEs. Therefore, this study introduces core enterprise and supply chain operations into the credit risk assessment of SMEs in order to reflect the financing environment of SMEs more comprehensively. (Ⅲ) There are many existing machine learning methods, but few scholars have explored which risk assessment model is suitable for SMEs in China’s manufacturing industry. A reasonable risk assessment model can not only help banks carry out risk assessments but also help SMEs in the manufacturing industry obtain funds for development. Therefore, it is particularly important to explore which machine learning risk assessment model is suitable for small and medium-sized manufacturing enterprises in China.
By comparing the experimental results for commonly used machine learning credit risk assessment models and using the partial dependence plot method to explore the impact of a single indicator on credit risk, this study aims to provide specific and feasible suggestions for supply chain managers.
The rest of this paper is organized as follows: Section 2 reviews the literature on credit risk assessment in supply chain finance. Section 3 introduces the selection of the financial credit risk evaluation indicators for the manufacturing supply chain. Section 4 uses four different machine learning models to evaluate the credit risk of the sample enterprises selected from the Chinese stock market and uses the partial dependence plot method to analyze the impact of the main indicators on the results. In Section 5, we summarize the overall research process, present managerial implications, and describe future research prospects.
4. Model Evaluation Results and Analysis
4.1. Data Sources
Because China’s supply chain finance developed slowly prior to 2015, reliable data to support research are scarce. We therefore followed the current common practice in data selection and used data for listed companies because these data are accurate, transparent, and easy to obtain. The data source for this study was a selection of manufacturing SMEs on China’s stock market between 2015 and 2020. The sample of core enterprises was screened from the top five customers announced by the SMEs. Firstly, we judged whether the enterprise was a small or medium-sized manufacturing enterprise based on an analysis of the official website of the target enterprise. Secondly, we used the CAMAR database to query specific risk indicators according to the selected enterprise stock code. Finally, we defined a risky enterprise according to a comparison between the asset–liability ratio of the SME and the lower value from the Enterprise Performance Evaluation Standard Value publication. An enterprise with an asset–liability ratio higher than the lower value was defined as a risky enterprise and assigned the value of 1. Otherwise, it was defined as a non-risk enterprise and assigned the value of 0. During data screening, companies with unlisted core enterprises or core enterprises with credit ratings that could not be queried were excluded, as were SMEs with incomplete data. A total of 170 data samples were obtained through the data screening, including 140 samples for model training and 30 samples for model testing. All the data in this paper were derived from the official websites of the enterprises, their annual reports, or the CAMAR database.
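The labeling rule and the 140/30 division described above can be sketched as follows. This is only an illustration: the reference value `ASSUMED_LOWER_VALUE`, the helper names, and the toy ratios are hypothetical, and the source does not state how the split was drawn, so a seeded random shuffle is assumed.

```python
import random

# Hypothetical placeholder: the actual "lower value" comes from the Enterprise
# Performance Evaluation Standard Value publication and is not given in the text.
ASSUMED_LOWER_VALUE = 0.70


def label_risk(asset_liability_ratio, lower_value=ASSUMED_LOWER_VALUE):
    """Return 1 (risky) if the ratio is higher than the reference value, else 0."""
    return 1 if asset_liability_ratio > lower_value else 0


def split_samples(samples, n_test, seed=0):
    """Shuffle a copy of the samples and hold out n_test of them for testing."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]


# Toy asset-liability ratios for illustration only (not data from the study).
ratios = [0.35, 0.82, 0.70, 0.91, 0.48]
labels = [label_risk(r) for r in ratios]

# 170 screened samples divided into 140 for training and 30 for testing.
train, test = split_samples(range(170), n_test=30)
```

Note that a ratio exactly equal to the reference value is labeled 0 here, since the rule speaks of ratios "higher than" the lower value.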
We selected the samples through empirical observation so as to avoid the decline in predictive performance caused by overly concentrated label values in the data samples. As the selected indicators had different dimensions, which could have affected the research results, this study used SPSS 26 software to standardize the sample data based on the data standardization formula $x^{*} = (x - \mu)/\sigma$, where $\mu$ represents the mean value of the data and $\sigma$ represents the standard deviation of the data. After standardization, each indicator had a mean of 0 and a standard deviation of 1, which prevented differences in dimension from distorting the contribution of each feature in the model experiment. As there were only 170 data samples available, two measures were taken to address the small sample size. Firstly, a support vector machine, logistic regression, and other models with simpler calculation principles were selected. Secondly, this study used a tenfold cross-validation method to increase the amount of data that could be verified out of sample and thus reduce the influence of the small sample size on the experimental results.
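The standardization and tenfold cross-validation steps described above can be illustrated with a minimal standard-library sketch. The study performed these steps in SPSS; the function names here are hypothetical, and the use of the population standard deviation and of contiguous, unshuffled folds are assumptions made for simplicity.

```python
import math
import statistics


def standardize(values):
    """Z-score a list of numbers: subtract the mean, divide by the standard deviation."""
    mean = statistics.mean(values)
    # Assumption: population SD. A statistics package may use the sample SD
    # (statistics.stdev) instead, which gives slightly smaller z-scores.
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, valid_idx) pairs that partition range(n_samples) into k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        valid = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, valid
        start += size


# Toy indicator values (illustrative only): mean 5, population SD 2.
z = standardize([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# With 170 samples, each of the ten folds validates on 17 samples.
folds = list(k_fold_indices(170, k=10))
```

Because the folds are contiguous, the sample order should be shuffled beforehand if the data are sorted by label or by year.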
4.2. Experimental Process
In this paper, support vector machine, random forest, multilayer perceptron, and traditional logistic regression models were compared to identify the most suitable method for manufacturing supply chain financial credit risk assessment, and then important characteristic indicators were selected for visual analysis. Firstly, we selected the original data based on the official websites of the enterprises and the CAMAR database. Secondly, SPSS 26 software was used to screen the data in combination with the real-life scenario. Thirdly, on the basis of the processed data, the data analysis software SPSS Modeler 18.0 was used to establish four models: a support vector machine, a random forest, a multilayer perceptron, and logistic regression. The models were trained on the training set data, and the trained models were then verified with the test set data. Finally, Python 3.7 was used to build a partial dependence plot model, and then the impact of characteristic indicators on the prediction results was visually analyzed.
4.3. Comparison of Model Results
4.3.1. Evaluation Index of Model Prediction Effect
Commonly used measures of the prediction performance of machine learning models include the confusion matrix, accuracy, false negative rate, precision, and F-measure, which are defined as follows:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

$$\text{False negative rate} = \frac{FN}{TP + FN}$$

$$\text{Precision} = \frac{TP}{TP + FP}$$

$$\text{F-measure} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}, \quad \text{Recall} = \frac{TP}{TP + FN}$$

where TP, TN, FP, and FN refer to true positive, true negative, false positive, and false negative, respectively. In line with the descriptions below, a risk-free enterprise is treated as the positive class. The confusion matrix is a visual tool for supervised learning that is mainly used to compare classification results with the real information of the instance and is helpful for intuitively understanding the prediction of a model. Accuracy is the most commonly used indicator for a model, representing the proportion of samples correctly predicted by the model among the total number of samples. The false negative rate represents the probability that the model will predict a risk-free enterprise to be a risky enterprise. If the false negative rate is too high, the bank will lose high-quality customers, which may affect future earnings. Precision indicates the proportion of enterprises correctly predicted as risk-free among all enterprises predicted as risk-free. The F-measure is also an important indicator of the accuracy of binary classification models, and it takes into account both the precision and the recall of the classification model. The F-measure can be regarded as the harmonic mean of the model precision and recall, with a maximum value of 1 and a minimum value of 0. We used the above indicators to evaluate model performance.
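The indicators described above can be computed directly from the four confusion-matrix counts. The following is a minimal sketch in which the function name is hypothetical, the counts are made up for illustration rather than taken from the study's tables, and the risk-free class is assumed to be the positive class, following the description of the false negative rate in the text.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, false negative rate, precision, recall, and F-measure."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    # Share of truly positive (risk-free) enterprises that the model misses.
    false_negative_rate = fn / (tp + fn) if (tp + fn) else 0.0
    # Share of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # Harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "false_negative_rate": false_negative_rate,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Made-up counts for a 30-sample test set (illustrative only).
m = classification_metrics(tp=12, tn=13, fp=2, fn=3)
```

With these counts, the false negative rate and recall sum to 1, which is a quick consistency check on any reported table of results.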
4.3.2. Model Evaluation Results
Based on the selected evaluation models, the experimental data were analyzed, and the performance of each model was assessed according to the experimental results. The confusion matrices of the prediction results of the four models were obtained through experiments, as shown in Table 5, Table 6, Table 7 and Table 8. The accuracy, false negative rate, precision, and F-measure of the different models were calculated from the confusion matrices, as shown in Table 9.
The experimental results show that the accuracy rates for the different models with the training set were similar, but the models could not be directly evaluated with the training set. This was because the data related to all the influencing factors present in practice could not be obtained. In the validation set, the random forest model had the best accuracy, false negative rate, precision, and F-measure score, indicating that it could more effectively identify risky enterprises and was more suitable for credit risk assessment of SMEs in China’s manufacturing industry. The specific reasons for this were as follows. Firstly, the logistic regression (LR) model is similar to a linear model and struggles to address nonlinear problems. However, there were nonlinear characteristic indicators in the real supply chain financial risk assessment, and logistic regression could not solve the problem of data imbalance. Therefore, although the accuracy of the logistic regression model in the test set was high, the accuracy of the verification results was low. Secondly, the performance of the support vector machine (SVM) model mainly depends on the kernel function. However, the kernel functions of currently used SVM risk assessment models are generally selected manually, which introduces a degree of arbitrariness, and they struggle to reflect the real-life scenario for the manufacturing supply chain. As the kernel function could not be accurately selected, it was difficult for the model verification results to reach a high level. Thirdly, the low accuracy of the multilayer perceptron (MLP) on the test set may have been due to the high sample dimension and the small amount of sample data. Considering the limited business cooperation between these two factors in real supply chains, it may not be possible to obtain sufficient data, thus making it impossible to accurately predict supply chain finance risk.
Lastly, there are various risk factors to consider in manufacturing supply chain finance, and more characteristic indicators should be selected when making risk predictions. As the random forest model can deal with high-dimensional data, it should show better performance in credit risk assessment. Moreover, the RF model can balance errors and has a strong generalization ability with good performance.
On the whole, the model test results corroborated the real situation of current manufacturing supply chain financial risk prediction. Additionally, the random forest model was found to be more suitable for the risk assessment of manufacturing supply chain finance in terms of risk identification and accuracy.
4.4. Visual Analysis of Model Prediction Results
In order to further understand the specific impacts of different feature indicators on credit risk, the reader can refer to [39]. This section describes how Python 3.7 was used to build a partial dependence plot model on the basis of the above RF model and then presents a visual analysis of the effects of feature indicators on prediction results. Limited by space, we focus on the characteristic indicators that reflect the characteristics of the manufacturing industry and supply chain, such as the cash ratio, the average age of management, the asset–liability ratio of core enterprises, the ratio of R&D investment to operating income, the credit evaluation of core enterprises, and the strength of supply chain relationships.
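To make the idea behind a partial dependence plot concrete, the sketch below computes a one-feature partial dependence curve by hand: the feature of interest is fixed at each grid value for every sample, and the model outputs are averaged. It is not the study's implementation. The function `toy_default_probability` is a hypothetical stand-in for the trained random forest, and the feature names and sample rows are invented for illustration.

```python
def partial_dependence(predict, rows, feature, grid):
    """Average the model output over all rows while one feature is fixed at each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = dict(row)       # copy so the original sample is untouched
            modified[feature] = value  # overwrite only the feature of interest
            total += predict(modified)
        curve.append(total / len(rows))
    return curve


def toy_default_probability(row):
    """Hypothetical stand-in for a trained model's predicted default probability."""
    p = 0.6 - 0.5 * row["cash_ratio"] + 0.2 * row["core_debt_ratio"]
    return min(1.0, max(0.0, p))  # clip to a valid probability


# Invented samples (not data from the study).
rows = [
    {"cash_ratio": 0.1, "core_debt_ratio": 0.4},
    {"cash_ratio": 0.5, "core_debt_ratio": 0.6},
    {"cash_ratio": 0.9, "core_debt_ratio": 0.5},
]
grid = [0.0, 0.4, 0.8]
curve = partial_dependence(toy_default_probability, rows, "cash_ratio", grid)
```

Plotting `curve` against `grid` gives the partial dependence plot for that feature. In practice a library routine such as scikit-learn's `sklearn.inspection.partial_dependence` could play this role; the source does not say which package was used.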
4.4.1. Effect of Cash Ratio
Overall, as shown in Figure 1, the default probability of financing enterprises decreases with the increase in the cash ratio. When the cash ratio is greater than 0.8, the default probability remains at a low level. Since the cash ratio reflects the short-term solvency of an enterprise, the higher the cash ratio is, the stronger the liquidity of the enterprise and the stronger the short-term solvency. The financial sector generally believes that a corporate cash ratio above 0.2 is preferable. Nevertheless, an excessively high cash ratio also means that an enterprise’s current assets have not been properly utilized, resulting in the lower profitability of cash assets. In Figure 1, we can see that, when the cash ratio is close to 0.8, the profit of the financing enterprise’s cash assets decreases and the opportunity costs of the company increase, creating a certain risk for the enterprise.
4.4.2. Effect of Average Age of Executives
Figure 2 shows the effect of the average management age on the default probability of financing enterprises. It is clear that the default probability of financing enterprises tends to decrease first and then increase with the average age of management. Specifically, when the average age of management is between 44 and 51 years old, the default probability of financing enterprises is at a low level. When the average age of the management reaches 51 years old, the learning and management decision-making abilities of managers decrease with their increasing age, and the business risk of the enterprise increases. This ultimately leads to an increase in the credit risk of the enterprise.
4.4.3. Effect of Asset–Liability Ratio of Core Enterprises
Figure 3 shows the impact of the asset–liability ratio of core enterprises on the default probability of financing enterprises. In general, the credit default probability of financing enterprises increases with the increase in the asset–liability ratio of core enterprises. This result is consistent with the real-life scenario. In other words, when the asset–liability ratio of core enterprises is high, their solvency and development ability are reduced, which leads to an increase in operational risks for the entire supply chain and affects the development of SMEs. At the same time, the reduced solvency of core enterprises also means that, when SMEs default, core enterprises cannot provide financing guarantees for the SMEs in the supply chain.
4.4.4. Effect of Ratio of R&D Investment to Operating Income
Generally, as shown in Figure 4, the impact of the ratio of R&D investment to operating income on the default probability of financing enterprises shows a trend of first decreasing and then increasing. Specifically, when the proportion of R&D investment increases from 0 to 4.75%, the default probability of enterprises significantly decreases, indicating that R&D innovation is very important for manufacturing enterprises. However, strong R&D also means high investment. When the proportion of R&D investment reaches a certain level, the probability of default begins to increase. As an explanation for this finding, it can be suggested that, when an enterprise invests too much operating income in research and development, this may lead to other problems, such as a shortage of enterprise funds and unstable product sales, thereby increasing the operational risk of the enterprise. Therefore, most SMEs control the proportion of R&D investment in practice.
4.4.5. Effect of Credit Evaluation of Core Enterprises
Figure 5 reflects the impact of core enterprises’ credit rating on the default probability of financing enterprises. It is clear that there is a linear relationship between the credit evaluation of core enterprises and the default probability of financing enterprises. In other words, the better the credit rating of core enterprises is, the lower the default probability of financing enterprises. The main reason for this is that core enterprises serve as an important foundation for supply chain finance and significantly support the credit of the entire supply chain. The higher the credit rating of a core company is, the more likely it is to work with supply chain financing companies to avoid defaults. This also confirms the real-life scenario of risk avoidance, in which financial institutions preferentially select core enterprises with good credit ratings for cooperation with and financing of SMEs.
4.4.6. Effect of Strength of Supply Chain Relationships
Figure 6 shows the effect of the strength of supply chain relationships on the default probability of financing enterprises. The figure shows that, when the strength of the relationship between financing enterprises and core enterprises is at or below a certain level, the default probability of financing enterprises is low. With a closer cooperative relationship between suppliers and core enterprises in the supply chain, the core enterprises will be more willing to provide guarantees for them in order to achieve win–win cooperation. However, as shown in the figure, after the strength of supply chain relationships reaches level three (when the relationships are strong), the default probability for financing enterprises significantly increases with further increases in the strength of the cooperation between the two sides. This seems to be inconsistent with the common understanding of supply chain finance that the closer the cooperation is between a financing enterprise and a core enterprise, the lower the credit risk of the financing enterprise. In reality, this may be because the financing enterprise and the core enterprise are too closely related, causing SMEs to rely heavily on core enterprises. If the core enterprises encounter risks in their operation, these risks may be transmitted directly to the dependent financing enterprises, resulting in an increase in their credit risk.
5. Concluding Remarks
On the basis of the characteristics of manufacturing enterprises, we preliminarily selected financial credit risk evaluation indicators for the supply chain of manufacturing SMEs. Then, we used SPSS 26 software to analyze the correlations between the primary indicators and eliminate the indicators with strong correlations in the correlation test. Finally, we obtained a financial credit risk evaluation system for the manufacturing supply chain that consisted of 18 evaluation indicators in four categories: the overview of the financing enterprises, the asset status of financing companies, the overview of the core enterprises, and the operation of supply chains.
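The correlation-based screening of indicators described above can be sketched as follows. The study carried out this step in SPSS; the cutoff of 0.8, the greedy keep-first rule, the function names, and the toy indicator columns (including `quick_ratio`) are all assumptions introduced for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def drop_correlated(columns, threshold=0.8):
    """Keep indicators in order, skipping any whose |r| with a kept indicator exceeds threshold."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Invented indicator columns (not data from the study).
columns = {
    "cash_ratio": [0.2, 0.5, 0.3, 0.8, 0.6],
    "quick_ratio": [0.4, 1.0, 0.6, 1.6, 1.2],  # exactly twice cash_ratio
    "rd_share": [0.05, 0.01, 0.04, 0.03, 0.02],
}
kept = drop_correlated(columns)
```

Here `quick_ratio` is removed because it is perfectly correlated with `cash_ratio`, while `rd_share` is retained. The result of a greedy rule like this depends on the order in which indicators are listed.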
Taking the manufacturing SMEs listed on China’s stock market as the data sample, we used SPSS Modeler 18 software to experimentally compare four common supervised learning algorithms: random forest, logistic regression, multilayer perceptron, and support vector machine. The results showed that the prediction accuracy and type I error rate for the random forest model were the best, indicating that the random forest model was more suitable for the credit risk assessment of China’s manufacturing SMEs.
Based on the random forest model, a partial dependence plot (PDP) model was established using Python 3.7 software. Then, the credit risk prediction results were visually analyzed. The results revealed that the indicators for the credit rating and the asset–liability ratio of core enterprises were in line with the general perception of credit risk prediction results. However, the indicators for the strength of supply chain relationships and the cash ratio were different from the general understanding. In other words, the two indicators significantly increased the default probability after reaching a certain level. In terms of R&D investment, the default probability first decreased and then increased with the increase in R&D investment.
On the basis of the above research conclusions, the main suggestions are as follows.
The selection of credit risk evaluation indicators for manufacturing supply chain finance should take into account both financial indicators and non-financial indicators. In terms of financial indicators, financing enterprises should focus on strengthening the true records and disclosures of their finances. Non-financial indicators should be combined with the situations of specific enterprises. For example, in terms of R&D innovation for manufacturing products, attention should be paid to enterprises’ proportions of R&D investment. In addition, SMEs and core enterprises should maintain a stable and transparent trading environment, as this is more conducive to the acquisition of credit risk indicators.
In the financial risk assessment of the supply chain, each participant should make timely adjustments to their operation according to the credit risk assessment results. Manufacturing SMEs should make reasonable production and sales plans according to their operating conditions to avoid an increase in their credit risk due to improper operation. For example, financing enterprises should make reasonable use of their current assets to increase the profitability of cash assets and create reasonable production plans to prevent a break in the capital chain. To ensure the stability of product supply, core enterprises should actively participate in supply chain finance-related services, efficiently adjust their relationships with SMEs, and provide credit support for SMEs in supply chain finance. In terms of the supply chain, although core enterprises are the core and entry point of supply chain finance, SMEs should properly handle their dependency relationships with core enterprises to avoid excessive reliance on them, which can result in additional risks to their development. In addition, when planning the production of products, enterprises in the supply chain should make arrangements according to the future popularity of products and the development prospects of the industry to avoid business risks caused by blindly expanding market share.
This study has two main limitations. Firstly, the data sample was small, and all the sampled enterprises were listed companies. In future research, a larger amount of raw data for SMEs should be collected, which would help improve the performance of the models and make the prediction results more representative of real conditions. Secondly, only four kinds of models were compared in this paper. In future research, other types of evaluation models could be included in the comparative analysis so that a more suitable model for manufacturing supply chain credit risk evaluation can be identified.