Communication
Peer-Review Record

Seasonal Analysis and Risk Management Strategies for Credit Guarantee Funds: A Case Study from Republic of Korea

by Juryon Paik 1 and Kwangho Ko 2,*
Reviewer 1: Anonymous
Reviewer 2:
Reviewer 3: Anonymous
Submission received: 19 October 2024 / Revised: 20 December 2024 / Accepted: 23 December 2024 / Published: 26 December 2024

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The article analyzes time series of performance indicators for small and medium-sized enterprises in Korea from 2012 to 2023 using several models. It proposes identifying trends and seasonality in the data; a comprehensive model is then used to forecast the probability of default, and risk-mitigation decisions can be made on that basis.

The article is written competently, the modeling methodology is presented clearly, and related work is analyzed. I have several minor comments on the article:

1. It is desirable to expand the list of publications, mentioning works devoted to forecasting the probability of default of enterprises and organizations.

2. Expand the methodology section, indicating the basic differences between the ARIMA, SARIMA, and Prophet models used.

3. In the abstract, describe the proposed methods and the difference between the method for solving the problem posed in the work and others. Also expand the conclusion of the article.

Comments on the Quality of English Language

Please correct the syntax and punctuation.

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

The paper uses three methods to predict the default rate from a loan dataset provided by a national credit institution, and compares their predictive ability and the scenarios in which each method performs better. While the paper conducts a relatively comprehensive exercise, it can be further improved by considering the following three issues. First, we need a benchmark setup for prediction so that the accuracy and robustness of the three methods chosen can be evaluated. Second, for a case-report style of article, the current paper lacks case background and discussion and devotes too many pages to statistical analysis. Third, the authors should discuss the in-sample and out-of-sample prediction results, because out-of-sample reliability is what readers and users care about most. Finally, there are also some minor issues, such as specifying the reason for choosing those three methods, optimizing the graphical presentation, and making some of the technical statements more concise.
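The benchmark and out-of-sample points above can be illustrated with a minimal sketch. It assumes a monthly series with a 12-month seasonal period and uses a seasonal naive forecast as the benchmark, scored on a held-out final year. The function names, the toy data, and the choice of mean absolute error are illustrative assumptions and are not taken from the manuscript.

```python
from typing import List, Sequence


def seasonal_naive_forecast(train: Sequence[float], horizon: int,
                            period: int = 12) -> List[float]:
    """Benchmark forecast: repeat the last observed seasonal cycle forward."""
    if len(train) < period:
        raise ValueError("need at least one full seasonal cycle of training data")
    last_cycle = list(train[-period:])
    return [last_cycle[h % period] for h in range(horizon)]


def mean_absolute_error(actual: Sequence[float],
                        predicted: Sequence[float]) -> float:
    """Average absolute forecast error over the evaluation window."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical monthly series (three years): a seasonal pattern plus a mild trend.
series = [1.0 + 0.1 * (m % 12) + 0.01 * m for m in range(36)]

# Chronological split: the final 12 months are held out and never used for fitting.
train, test = series[:-12], series[-12:]

benchmark = seasonal_naive_forecast(train, horizon=len(test))
benchmark_mae = mean_absolute_error(test, benchmark)
```

Any candidate model (ARIMA, SARIMA, or Prophet) would be fitted on `train` only and scored on `test` with the same metric; a model is only informative if its out-of-sample error is below `benchmark_mae`.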

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

 

This is an interesting study in the field of time series forecasting modeling via an application on default rates in South Korea by using ARIMA, SARIMA, and Prophet models. However, basic methodologies are applied, which doesn't “shock” me, but the problem lies in the various errors and inaccuracies in their application.

The paper raises some questions that need to be clarified and some changes need to be made towards greater precision in the presentation and analysis of results, as well as better justification of the choices that were made throughout the study.

Therefore:

(1)   In the preliminary data analysis, the authors do not say how the missing values were dealt with (Page 5-section 3.1). It is known that the Box-Jenkins methodology (SARIMA models) cannot be applied to time series with missing values; the missing values have to be “estimated”. And about outliers, how were they treated?

(2)   There are inaccuracies and a lack of rigor throughout the paper. For example: Page 5, line 239: “a relatively stable pattern” or line 468: “pronounced seasonality”. What do these mean in the context of time series analysis? It is known that a time series can be stationary or non-stationary, but a “stable pattern” is not a defined concept; likewise, seasonality either exists or does not, so what does “pronounced” mean? It is not correct.

(3)   Calculating descriptive statistics for a time series is also not correct when the series is non-stationary. (Stationarity Check: line 390) What does the overall average value of a trending time series mean? For example, if the trend is linear and increasing, the average will increase over time, so calculating an overall average is meaningless. Likewise, if the series is non-stationary in variance, calculating a standard deviation is not correct either. The authors should therefore consider whether they should present what is stated between lines 237–247.

(4)   The graphical representations of the time series indicate that most of them are non-stationary in variance. In this study, the authors don't show how they solved this problem so that they could then apply (S)ARIMA models. Shouldn't they, as a first step, transform the data via a Box-Cox transformation to stabilize the variance? Lines 406-407: “To address this issue, first-order differencing was applied to the data from these regions.” First-order differencing removes the trend, but the variance must be stabilized first.

(5)   When comparing series, whether default rates (in 17 cities), observed series, or component graphs (trend, seasonal, and residual components compared across the different series), the authors must always use the same scale range. For each of the graphs presented in Figures 9 to 12, the authors should choose a compromise scale range so that each component has the same range of values across all time series.

(6)   I don't think it's appropriate to apply ARIMA models to series that the authors have shown to be seasonal. ARIMA models are known to be a particular case of SARIMA models. For example, for the time series of default rates in Sejong city, which has seasonality, an ARIMA model and a SARIMA model were established and compared, which is unjustified, as it is known that it is not appropriate to apply an ARIMA model to a series in which seasonality is statistically significant. On the other hand, even in the case of time series without seasonality, it would be a mistake to apply SARIMA models. I think it is inappropriate to compare the performance of ARIMA and SARIMA models (Section 33.3.5), as it is the Box-Jenkins methodology (a classical already widely studied and applied methodology), in which you can follow the usual steps to decide whether the final model is an ARIMA or a SARIMA. It does make sense to compare the performance of the model resulting from the application of the Box-Jenkins methodology with that of Prophet.

(7)   Lines 502-503 “The data was then split into training and test sets, with the training set consisting of data up to December 2023, and the test set reserved for evaluating the model’s performance.” I can't understand why this approach wasn't also applied in the previous modeling exercises, since, in time series modeling, it is typically part of the various processes. If this approach were applied, it would ultimately be interesting to evaluate the performance of the different models on both the training sets and the test sets.

(8)   In line 478, R-squared (R2) is presented as a measure to assess the accuracy of each (S)ARIMA model. How is that possible?

(9)   The presentation and description of the databases is very confusing, in particular the appearance of a new time series (lines 611–612) “without considering specific facts such as weekends or holidays”. Are the default rates calculated on a monthly basis without the values observed on those days? This should be better explained. And if the authors are comparing forecasting models, why haven't (S)ARIMA models been applied to these new series? That should have been done. This new time series should have been presented in section 3.1.
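The order of operations raised in point (4), stabilizing the variance before differencing, can be sketched as follows. This is a minimal illustration under stated assumptions: the series is hypothetical and strictly positive, the Box-Cox parameter is fixed at 0 (the natural log) rather than estimated, and the helper names `box_cox` and `difference` are introduced here for illustration only. A real default-rate series containing zeros would need a shift or a different transformation.

```python
import math
from typing import List, Sequence


def box_cox(values: Sequence[float], lam: float) -> List[float]:
    """Box-Cox transform for strictly positive data; lam = 0 is the natural log."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def difference(values: Sequence[float], lag: int = 1) -> List[float]:
    """Lag-`lag` differencing: y[t] - y[t - lag]."""
    return [values[i] - values[i - lag] for i in range(lag, len(values))]


# Hypothetical series growing 5% per step, so its fluctuations scale with its level.
raw = [100.0 * 1.05 ** t for t in range(24)]

# Differencing alone: the increments still grow with the level of the series.
diff_raw = difference(raw)

# Variance stabilization first (log), then differencing: the increments are constant.
stabilized = difference(box_cox(raw, lam=0.0))
```

In this toy case every element of `stabilized` equals log(1.05), whereas `diff_raw` keeps growing; forecasts produced on the transformed scale would have to be back-transformed before being compared with the observed default rates.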

The paper is well written but it needs some proofreading. Here are some examples taken from the first 3 pages of the paper:

Abstract

1)     Results prove that the ARIMA model = The results prove that the ARIMA model

Introduction

1)    in diverse fields as business = in diverse fields, such as business

2)    to predict a patient’s future condition for a specific disease = to predict a patient’s future condition regarding a specific disease

3)    and declared by The World Health Organization = and declared by the World Health Organization

4)    improving its clinical diagnostic = improving its clinical diagnosis / clinical diagnostic accuracy

5)    with 39,053 as deaths = with 39,053 as deaths

6)    two lacs of cases of COVID-19: non-Indian readers won't be able to decipher the term lacs, and the correct term is lakh (100,000)

7)    53,652 cases with less testing facilities = 53,652 cases, with less testing facilities

 

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

Thank you for responding to my previous comments. I believe the current version is suitable for publication once two minor issues are addressed. First, some of the figures and tables can be streamlined and moved to an appendix, while those that remain in the text should be understandable on a stand-alone basis (all notation needs to be briefly explained in a footnote). Second, there are some formatting problems; e.g., lines 789-794 have indentation errors.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf
