Article
Peer-Review Record

Large Language Models: Their Success and Impact

Forecasting 2023, 5(3), 536-549; https://doi.org/10.3390/forecast5030030
by Spyros Makridakis 1, Fotios Petropoulos 1,2 and Yanfei Kang 3,*
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 12 July 2023 / Revised: 18 August 2023 / Accepted: 19 August 2023 / Published: 25 August 2023

Round 1

Reviewer 1 Report

Given the hype surrounding ChatGPT and the numerous papers written on it, this paper distinguishes itself by examining ChatGPT’s responses to forecasting questions whose correct answers are known. In particular, responses from OpenAI’s ChatGPT are compared with those of a custom-trained version called CustomGPT. It is a highly informative paper, and many thanks to the authors for their original work on this highly relevant issue.

My comments are as follows:

- The first two sentences of the Abstract would be better placed in the Introduction.

- Abstract: regarding the description of forecasting as “a specialized, domain-specific field”, I am not sure what this means. Please consider using different (more meaningful) wording to describe forecasting.

- The paper gives a thorough description of the relative forecasting accuracy of ChatGPT and CustomGPT for the chosen topic of the M Forecasting Competitions. A table comparing the forecasting accuracy of ChatGPT with that of the custom-trained version across the different categories of questions used in the forecasting task would be highly informative in summarizing the findings.

- Similarly, a table comparing the findings for the judgmental adjustments section would also be required.

- The literature review is very light and gives a somewhat biased view, especially in the judgmental adjustments section of the paper. Overall, it is worth noting that such biases could influence the choice of training materials fed into LLMs; these discussions need to be included in the limitations of such work.

- Is there any particular reason why forecast comparisons with Bard (another widely used LLM) were not made/included?

- Very good section on the future of LLM technology.

- What does the following mean (ln 471): “..forecasting domain specific vertical geared to the field”?

Good

Author Response

Dear Reviewer,

Thank you for your comments. Please see our replies in the attached reply letter. 

Best regards,

Yanfei

Author Response File: Author Response.pdf

Reviewer 2 Report

This work analyzes the success of LLMs such as ChatGPT. The authors also compare the responses of standard ChatGPT and a custom ChatGPT trained with specific material about forecasting from specialized sources. They analyze the accuracy of the standard and customized ChatGPT on questions related to the M forecasting competitions. They also analyze responses in a particular field of forecasting, judgmental adjustments. They find that standard ChatGPT is less precise than custom ChatGPT, and that neither of them is accurate enough.

 

In my view, this article is very difficult to assess because it is not the typical research article with novel techniques, results, or applications. It addresses a new topic that is changing many aspects of our academic life and, as far as I know, it is one of the first articles (perhaps the first) that relates ChatGPT and forecasting. That is also one of its main strengths.

 

The paper is well written and, of course, very interesting and related to the topic of the journal. I think the paper has two views. The technical one is focused on the experiment designed to test both versions of ChatGPT (standard and customized). In this sense, I think the paper could be improved if the authors tried to quantify the accuracy of each type of ChatGPT, so that other researchers attempting something along those lines could reproduce this research. In general, I think the research has been conducted properly.

 

The other view is less technical, and it is more related to the authors’ opinion about the future prospects of similar LLMs. Although it is interesting, I am not qualified to review it.

Author Response

Dear Reviewer,

Thank you for your comments. Please see our replies in the attached reply letter. 

Best regards,

Yanfei

Author Response File: Author Response.pdf
