Article
Peer-Review Record

Can AI Technologies Support Clinical Supervision? Assessing the Potential of ChatGPT

Informatics 2025, 12(1), 29; https://doi.org/10.3390/informatics12010029
by Valeria Cioffi 1,*, Ottavio Ragozzino 1, Lucia Luciana Mosca 1, Enrico Moretto 1, Enrica Tortora 1, Annamaria Acocella 2, Claudia Montanari 3, Antonio Ferrara 4, Stefano Crispino 5, Elena Gigante 6, Alexander Lommatzsch 7, Mariano Pizzimenti 8, Efisio Temporin 9, Valentina Barlacchi 10, Claudio Billi 11, Giovanni Salonia 12 and Raffaele Sperandeo 1
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3:
Submission received: 16 January 2025 / Revised: 28 February 2025 / Accepted: 11 March 2025 / Published: 17 March 2025

Round 1

Reviewer 1 Report (Previous Reviewer 3)

Comments and Suggestions for Authors

Aim

It would be good to have a citation referring to when ChatGPT-4 gained the long-term memory (LTM) function.

Methodology

Important note: the term "pretraining" is easily confused with "pre-training" as commonly used in neural networks, where a model's parameters are reused for training another model. The authors need to clarify the term "pretraining" used in this study, which actually utilizes the memory function of ChatGPT to refine the response to the prompt (much like pre-training a student in a particular skill before the actual task).

Please standardize the term "pretraining" or "pre-training". 

The problem with the citation of Cioffi et al. (2024) [12] still exists: the methods described in that preprint are not detailed enough (only five sentences), yet this study relies heavily on the reference, making the methodology of this study less reliable and valid. The authors should provide another, similar peer-reviewed publication to substantiate the methodology cited in this study [12].

Please provide better resolution for Figures 1 and 2.

Author Response

Response to Reviewer 1:

Aim: We appreciate your suggestion and have now included a citation referencing when ChatGPT-4 introduced the Long-Term Memory (LTM) function to clarify this point (l.121-126).

Methodology:

  1. Clarification of "pretraining" vs. "pre-training": We acknowledge the ambiguity in terminology and have now clearly defined "pretraining" as used in this study. In our context, "pretraining" refers to the process of using the memory function of ChatGPT to refine responses to prompts, rather than reusing parameters for another model’s training. We have standardized the term consistently throughout the manuscript (l.153-155).
  2. Citation of Cioffi et al. (2024) [12]: We recognize the limitations of relying on this preprint as a primary reference for the methodology. To strengthen the validity of our study, we have now included additional references from publications that support our approach and provide a more substantial basis for our methodology (l. 114-116).
  3. Resolution of Figures 1 and 2: We acknowledge the concern regarding the quality of Figures 1 and 2, and we have replaced both figures entirely with higher-resolution versions.

Reviewer 2 Report (New Reviewer)

Comments and Suggestions for Authors

I have the following observations:

The paper assesses the potential of ChatGPT. The manuscript does not provide empirical validation through actual clinical trials. The study does not thoroughly examine the biases associated with artificial intelligence or the dangers of excessive dependence on AI-generated suggestions. The paper also fails to comprehensively address the difficulties of integrating AI-based clinical supervision across various healthcare environments. At the same time, other emerging open AI platforms may be significantly improved over the current ChatGPT-4 platform; keeping this rapid evolution in mind, the problem statement becomes too restricted.

Comments on the Quality of English Language

I have the following observations:

The paper assesses the potential of ChatGPT. The manuscript does not provide empirical validation through actual clinical trials. The study does not thoroughly examine the biases associated with artificial intelligence or the dangers of excessive dependence on AI-generated suggestions. The paper also fails to comprehensively address the difficulties of integrating AI-based clinical supervision across various healthcare environments. At the same time, other emerging open AI platforms may be significantly improved over the current ChatGPT-4 platform; keeping this rapid evolution in mind, the problem statement becomes too restricted.

Author Response

Response to Reviewer 2:

  1. Empirical validation and AI integration: We have stated in the manuscript the importance of an integrated approach that combines AI capabilities with those of an experienced human supervisor. This ensures that AI does not replace human oversight but rather complements it, reinforcing the blended supervision methodology (l. 442-450).
  2. Challenges in AI-based supervision and rapid evolution of AI platforms: We acknowledge the limitations of the current study in addressing the broader difficulties of integrating AI-based clinical supervision across diverse healthcare settings. While ChatGPT-4 is the focus of our study, we have expanded our discussion to recognize the evolving landscape of AI technologies. Future studies should compare multiple AI platforms to assess their relative strengths and limitations in clinical supervision (l. 451-455).

Reviewer 3 Report (New Reviewer)

Comments and Suggestions for Authors

The paper explores the feasibility of ChatGPT-4 as a supervision tool in psychological training by comparing feedback from three sources—untrained AI, pretrained AI, and a qualified human supervisor—on a clinical case. Gestalt psychotherapy trainees assessed their satisfaction using a Likert scale. The study is well written, but revisions are needed to enhance clarity and rigor.

Figure 1. I do not understand the presence of two repeated diagrams with the "Step 3" and "Step 4" boxes, which appear to be the same. Could this be an error?

Row 109 “In this study…” and “The ultimate purpose…” The sentences, which are the core of the section, are difficult to understand, partly because it is written in somewhat convoluted English. Could you clarify it further?

 

The same holds for the sentences starting at row 133. In particular, the relation between the sentences starting with “It should be noted…” and the diagram in Figure 1 is not clear to me. Could you explain the matter better?

 

Row 158. “The first feedback… the second feedback…” Do you mean that the expert gave more than one feedback, or that first, second, and third refer to the feedback provided by the untrained ChatGPT, the pretrained ChatGPT, and the expert, respectively?

 

Was there only one expert, or a panel of experts? I wonder if a single expert is sufficient for comparison, considering that ChatGPT’s response is generated by models trained on a vast amount of diverse documents.

 

Section Results

Row 250, Component one refers to item clarity in Table 4, I assume. However, using the same term as in Table 3 may lead to ambiguity.

Related to this, the p-value of the clarity dimension is 0.097, which is not significant at the 5 percent level.

To my knowledge, a significant p-value in the ANOVA test does not indicate which specific groups differ. How did you determine that the difference between fb2 and fb3 is not significant?

Table 4, Item 4, empathic approach: shouldn't it be p<0.0001?

Please revise this section.

Comments on the Quality of English Language

The English language could be improved (see comments above): e.g., row 87, the sentence starting with "So that" sounds incomplete; the use of "he" at row 212; some convoluted sentences that are difficult to understand (see comments); typos (e.g., "the work the work" at row 114); etc.

Author Response

Response to Reviewer 3:

  1. Figure 1 – Repeated diagrams with "Step 3" and "Step 4": We appreciate your observation. The repetition was unintentional, and we have corrected Figure 1. To improve resolution, we have entirely replaced Figures 1 and 2.
  2. Row 109 – Clarity of core sentences: We have revised the sentences in this section to improve readability and coherence, ensuring that the purpose of the study is conveyed more clearly (l. 109-112).
  3. Row 133 – Relation between “It should be noted…” and Figure 1: We have provided a more explicit explanation of how the statement connects to Figure 1, clarifying the relationship between the described methodology and the diagram (l. 112-113).
  4. Row 158 – Terminology of feedback sequence: We have clarified that "first feedback," "second feedback," and "third feedback" refer to the responses provided by the untrained ChatGPT, the pre-trained ChatGPT, and the expert supervisor, respectively (l. 223-227).
  5. Number of experts involved: Only one expert provided feedback in this study. We acknowledge the potential limitation of using a single expert compared to AI, which aggregates vast amounts of diverse information. This point has been addressed in the limitations and future developments, suggesting that future research could involve multiple experts for a more robust comparison (l. 456-458).
  6. Results Section – Clarifications on terminology and statistical analysis:
  • We have clarified that "Component One," which refers to the relational and emotional dimension as analyzed in Table 3, includes "Item Clarity" as one of its key aspects in Table 4 (l. 267-268).
  • The p-value of 0.097 for clarity is noted as not statistically significant. We have bolded the statistically significant values to make them easier to notice.
  • We have now specified the post-hoc comparison results in Tables 3 and 4.
  • Table 4, Item 4 ("empathic approach") – We have revised the p-value notation to ensure it correctly reflects statistical significance (p<0.0001). The mistake was an oversight.
  7. English Language Improvements:
  • We have revised and streamlined convoluted sentences, corrected typos (e.g., "the work the work" at row 114), and improved overall readability throughout the manuscript.
  • The phrase starting with "So that" in row 87 has been restructured for clarity.
  • The erroneous use of "he" in row 212 has been corrected.
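The post-hoc question raised by the reviewer (a significant ANOVA does not say which groups differ) can be illustrated in code. The sketch below uses simulated Likert scores, not the study's data; group sizes and score ranges are hypothetical:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Simulated 5-point Likert satisfaction scores for the three feedback
# sources compared in the study (hypothetical data, 30 trainees each)
fb1 = rng.integers(1, 6, 30).astype(float)  # untrained ChatGPT
fb2 = rng.integers(2, 6, 30).astype(float)  # pre-trained ChatGPT
fb3 = rng.integers(2, 6, 30).astype(float)  # human expert supervisor

# One-way ANOVA: does *any* group mean differ?
f_stat, p_anova = stats.f_oneway(fb1, fb2, fb3)

# Tukey's HSD post-hoc test: *which* pairs of groups differ?
posthoc = stats.tukey_hsd(fb1, fb2, fb3)
print(f"ANOVA p = {p_anova:.4f}")
print(posthoc)  # pairwise p-values and confidence intervals
```

The pairwise p-values in the Tukey result are what justify statements such as "the difference between fb2 and fb3 is not significant"; the omnibus ANOVA p-value alone cannot.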

This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.


Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

1. The paper aligns with its intended scope and is generally written at a satisfactory level.

2. The purpose and main outcomes of the study are missing from the abstract and conclusion. Include the study’s aims and findings to enhance clarity and completeness in these sections.

3. ChatGPT functions as a data repository and analytics tool, with its performance dependent on data availability and query types. How do the authors utilize feedback as input data for further analysis? Please justify.

4. The problem statement and introduction should more explicitly relate to the research domain. Include details on ChatGPT’s role in clinical trials and discuss any performance metrics used in this context.

5. A literature review is currently missing. Provide an overview of relevant existing studies to frame this work within the context of related research and highlight any novel contributions.

6. In the Aim section, the authors mention the study’s intent to support psychotherapists using AI. Specify the treatment methods, conditions, or specific applications where this AI model will be useful to provide greater clarity.

7. Include a diagram or flowchart in the methodology section to visually represent the model and its analysis flow for better comprehension.

8. The discussion needs clearer insights into the findings, with more direct recommendations based on the results.

9. Standard methods and performance measures do not appear to be applied. If alternate criteria were used, clarify how these support the conclusions reached in the study.

10. Specify the levels at which the AI mechanism will support therapists and the stages involved. Providing this breakdown will make the AI's functionality and integration into therapy clearer.

Comments for author File: Comments.pdf

Comments on the Quality of English Language

A thorough grammar check is needed.

Author Response

We are grateful for the opportunity to revise and improve our manuscript based on the valuable feedback provided by the reviewers. Their constructive suggestions have allowed us to refine our work and address critical areas for clarification. Below, we have outlined our responses to the reviewers' comments and detailed the corresponding changes made to the manuscript.

We have changed the title: Can AI technologies support clinical supervision? Assessing the potential of ChatGPT

  1. The purpose and main outcomes of the study are missing from the abstract and conclusion.
    • Response: We have updated the abstract (lines 25-31, 37) and conclusion (lines 294-296, 317-320) to explicitly state the study’s aims and summarize the findings. The abstract now clarifies the intent to evaluate ChatGPT as a supervisory tool and highlights the key results. The conclusion emphasizes the potential of pre-trained AI in professional supervision.
  2. How do the authors utilize feedback as input data for further analysis? Please justify.
    • Response: The methodology section (lines 117-127) has been revised to explain that the feedback from AI and human supervisors was analyzed using PCA and ANOVA to identify significant differences in trainee satisfaction. This supports the findings by linking feedback data to statistical evaluations.
  3. Problem statement and introduction should relate more explicitly to the research domain.
    • Response: We have expanded the introduction (lines 75-79) to include ChatGPT’s potential role in clinical trials, highlighting its use as a generative feedback tool. Performance metrics such as Likert scales and statistical evaluations are also discussed to anchor the study within the research domain.
  4. Provide an overview of relevant studies.
    • Response: The lack of extensive literature on AI in clinical supervision is acknowledged in the introduction.
  5. Specify treatment methods, conditions, or applications of ChatGPT.
    • Response: The Aim section (lines 90-101) now includes examples of ChatGPT’s utility, such as aiding therapists in managing transference and countertransference, and preventing burnout through continuous guidance.
  6. Include a flowchart in the methodology section.
    • Response: A flowchart has been added to visually represent the analysis flow and methodology. It is provided as a supplementary file.
  7. Provide clearer insights in the discussion with direct recommendations.
    • Response: The discussion (lines 280-283, 314-315) has been enhanced to include actionable recommendations, such as the integration of AI tools in blended supervision models.
  8. Clarify alternate criteria supporting the conclusions.
    • Response: The methodology (lines 157-167) discusses how PCA and ANOVA were used as alternative evaluation metrics to support the conclusions.
  9. Specify levels and stages where AI supports therapists.
    • Response: The Conclusion section (lines 322-330, 332-337) now outlines three levels of AI support: professional development, clinical supervision, and real-time clinical assistance.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

1. Contribution to the Body of Knowledge

The paper’s contribution to the advancement of AI in clinical supervision is vague. It does not provide a solid basis for the application of ChatGPT to the specific case study. The authors should be more specific, as it is not clear which “liking questionnaires” they refer to. The introduction should clearly state the problem along with the article’s contribution. As it stands, the reader gains no insight into the case study unless they are a specialist in the field (i.e., a psychotherapist).

 

2. Technical Soundness (Is the paper technically sound?)

The authors claim that this work is part of a broad program, but they provide no information about the participants, duration, funding sources, etc.

The methodology followed is not reliable at all. The comparison focuses on a pretrained chatbot, a non-pretrained chatbot, and a group consisting of an expert psychotherapist and their supervisor. Subsequently, the results were evaluated by a group of trainees in integrated Gestalt psychotherapy, yet no information is available about the population of this group. This could not be described as a reliable comparison test, not even for a feasibility study.

The data analysis is naïve, and the authors provide no information about the calculations that lead to the results in Tables 2 to 4. Consequently, the discussion and the conclusions are ambiguous.

There is no reference (and consequently no prior work) presenting any relevant work in the field that would let the reader judge whether the methodology presented is scientifically solid.

3. Comprehensiveness of the Subject Matter (Is the subject matter presented in a comprehensive manner?)

It is essential for the authors to thoroughly revise their manuscript for language issues. Furthermore, there are sentences that are too long and do not allow the reader to follow the subject. Several words keep repeating in the same sentence. The word likert should be capitalized (Likert).

 

4. Adequate Use of References (Are the references provided applicable and sufficient?)

There are references with obsolete metadata. For example, ref [1] is in Italian, making it hard to assess, and there is no information about the issue, volume, or even the journal in which it was published.

There is a mix of Harvard and IEEE references within text, even for the same article.

Recommendation: Reject

 

Comments on the Quality of English Language

It is essential for the authors to thoroughly revise their manuscript for language issues. There are sentences that are too long and do not allow the reader to follow the subject. Several words keep repeating in the same sentence. 

Author Response

We are grateful for the opportunity to revise and improve our manuscript based on the valuable feedback provided by the reviewers. Their constructive suggestions have allowed us to refine our work and address critical areas for clarification. Below, we have outlined our responses to the reviewers' comments and detailed the corresponding changes made to the manuscript.

We have changed the title: Can AI technologies support clinical supervision? Assessing the potential of ChatGPT

  1. Contribution to the body of knowledge is vague.

Response: The study is positioned as a pilot study due to the scarcity of existing literature.

Thank you for highlighting the need for clarity regarding the application of ChatGPT to the specific case study and the reference to "liking questionnaires." We have addressed these concerns as follows:

  1. Clarification of "liking questionnaires"(Lines 140-155):
    We have revised the text to clarify that the "liking questionnaires" refer to a structured evaluation tool used by psychotherapy trainees to assess the feedback provided by ChatGPT and human supervisors. Specifically, a 16-item Likert-scale questionnaire was utilized to measure aspects such as clarity, relevance, and comprehensiveness of the feedback. This additional detail ensures transparency and connects the evaluation process to measurable outcomes.
  2. Explicit statement of the problem and contribution(Lines 75-79):
    The introduction has been revised to explicitly state the problem this study addresses: the limited availability of tools to complement traditional psychotherapy supervision, particularly in providing timely and structured feedback. We have also emphasized the contribution of the study as a pilot investigation into the feasibility of using ChatGPT as a supplementary tool for clinical supervision, filling a gap in the current literature.
  3. Case study accessibility to non-specialists(Lines 294-296; 317-320; 278-279):
    Additional context has been provided to ensure the case study is accessible to readers who are not specialists in psychotherapy. Key methodological details have been elaborated, such as the anonymized clinical cases used, the demographic profile of the psychotherapy trainees (aged 25-40), and the specific focus on transference, countertransference, and professional development. These revisions aim to make the study's objectives and findings comprehensible to a broader audience.
  2. Provide details on participants, duration, and funding.
    • Response: The methodology specifies the demographic characteristics of trainees (aged 25-40, randomly recruited, line 141) and clarifies the study’s exploratory nature, giving more details about the previous study as well (l. 83-87).
  3. Methodology reliability concerns.
    • Response: Lines 117-127 explain the comparison between trained and untrained AI and human supervisors. The methodology, although novel, is justified by outlining the statistical analysis methods (PCA and ANOVA) used to ensure rigor.
  4. Data analysis is naïve.
    • Response: Lines 157-167 provide a detailed description of the calculations leading to the results in Tables 2-4, clarifying the basis for statistical significance.
  5. Lack of references presenting relevant work.
    • Response: As noted in Response 1, the literature on AI in psychotherapy supervision is scarce. We have added references from similar fields to establish a foundation for the methodology.
  6. Language issues and reference formatting.
    • Response: The manuscript has been thoroughly revised for grammar and formatting. The word "Likert" is now capitalized consistently, and the mix of Harvard and IEEE reference styles has been standardized.
    • References with obsolete metadata and Italian-language citations: We acknowledge that reference [1] is in Italian, which may make it less accessible to an international audience. However, the citation refers to foundational literature in Gestalt therapy, a psychotherapeutic model developed in the 1950s. This model has historical and cultural significance, and much of its foundational work, including key texts, was originally published in Italian. These references are included as they are essential for contextualizing the study within the framework of Gestalt therapy.
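As a concrete illustration of the PCA step referenced in these responses, the dimensionality reduction of a multi-item Likert questionnaire can be sketched as follows; the trainee count, item count, and data are all simulated, not taken from the study:

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical responses: 30 trainees x 16 Likert items (scored 1-5)
X = rng.integers(1, 6, size=(30, 16)).astype(float)

# PCA via SVD of the mean-centred data matrix
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

explained = S**2 / (S**2).sum()  # variance explained per component
scores = Xc @ Vt.T               # trainee scores on each component

# The leading component scores (e.g. a relational/emotional dimension)
# can then be carried into an ANOVA comparing the feedback sources
print(f"PC1 explains {explained[0]:.1%} of the variance")
```

Reducing the 16 items to a few components before the group comparison is what keeps the subsequent ANOVA from being run once per item, which would inflate the risk of spurious significance.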

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

In this study, ChatGPT was used as a chatbot and alternative supervisor to supervise the trainees and compare the effectiveness of the supervision with human experts.

The main problem with this study is ChatGPT, which the authors claim is "untrained" and "pre-trained", a distinction that is difficult to justify from the results contained in the manuscript. For example, if the authors create a prompt with ChatGPT on the backend, they have not indicated how accurately and consistently the backend LLM was able to answer the question (https://learn.microsoft.com/en-us/azure/machine-learning/prompt-flow/how-to-develop-an-evaluation-flow?view=azureml-api-2). It therefore appears that the authors used ChatGPT directly from https://chatgpt.com/, which raises strong concerns about the methodology and validity of using "ChatGPT" both without training and with pre-training.

In many places, the authors refer to the model or method via the citation below [10]; unfortunately, the publication was not available/published at the time of submission, making the construction of the model/prompt for this study difficult to assess.

 

Cioffi, V., Ragozzino, O., Scognamiglio, C., Mosca, L. L., Moretto, E., Stanzione, R., Marino, F., Acocella, A., Ammendola, A., D'Aquino, R., Durante, S., Tortora, E., Morfini, F., Montanari, C., Rosa, V., Rossi, O., Ferrara, A., Mori, E., Gigante, E., Pizzimenti, M., Zangarini, S., Sperandeo R. & Cantone D.(2024). Towards integrated AI psychotherapy supervision: A proposal for a ChatGPT-4 study. In press.

The minor issues are listed below:

1. The structure of the text can be improved in many ways; e.g., the content of lines 101 to 108 in the "Aim" section sounds more like a future perspective or discussion, which is not necessary in the introduction.

2. The evaluation of the effectiveness of ChatGPT is based on user (trainee) experience, which is appropriate (PCA to reduce dimensions, then one-way ANOVA to compare significance). However, normality should be checked before the one-way ANOVA, and justification must be provided as to why a comparison statistic is used rather than a regression analysis, which may be more robust.

Comments on the Quality of English Language

The English could be improved to more clearly express the research.

Author Response

Dear Reviewer,

Thank you for your thorough review and constructive feedback on our manuscript. We have carefully addressed the concerns raised and made several revisions to the paper to improve its clarity and methodological rigor. Below, we outline the specific changes and clarifications:

  1. Technical Specifications of ChatGPT-4
    • In response to your comment regarding the description of the ChatGPT model, we have added technical details in lines 69-92. This section provides a clearer explanation of how the model operates, including its "pre-trained" and "untrained" capabilities, and the methods used in our study to optimize its feedback.
    • We have also included two new bibliographical references ([10] and [11]) to provide additional context and support for these technical explanations.
  2. Changes to the "Aim" Section
    • The content of lines 101-108, which included future perspectives more appropriate for the discussion, has been relocated to lines 349-356 in the "Discussion" section. This restructuring ensures that the "Aim" section remains focused and concise.
  3. Revision of Lines 135-150
    • We removed the original content from lines 135-139 and replaced it with a detailed description of how the prompt was developed and utilized, drawing upon insights from our previous study ([12]). This revision elaborates on the methodologies employed and addresses concerns about the construction of the model/prompt.
  4. Status of Reference [12]
    • Regarding citation [12], the referenced study remains "in press" as it was presented at a conference in June 2024 and is awaiting publication in the conference proceedings. We have clarified this in the manuscript to provide transparency about its current status.
  5. Justification for Statistical Analysis
    • As suggested, we have justified the use of one-way ANOVA for comparing the effectiveness of feedback in lines 192-195. Given the study's design, ANOVA was deemed more suitable than regression analysis, as our goal was to compare group differences rather than explore predictive relationships.
    • We have also verified normality and reported Cronbach’s Alpha (lines 188-191) to demonstrate the reliability of the satisfaction questionnaire used in the study.
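The normality check and reliability reporting described above can be sketched in code; the data below are simulated, and only the 16-item Likert structure mirrors the questionnaire described in this record:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
# Simulated responses: 30 trainees x 16 Likert items (scored 1-5)
items = rng.integers(1, 6, size=(30, 16)).astype(float)

def cronbach_alpha(item_scores: np.ndarray) -> float:
    """Cronbach's alpha: internal-consistency reliability of a scale."""
    k = item_scores.shape[1]
    item_var = item_scores.var(axis=0, ddof=1).sum()
    total_var = item_scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_var / total_var)

alpha = cronbach_alpha(items)

# Shapiro-Wilk normality check on per-trainee total scores,
# a standard prerequisite before applying a one-way ANOVA
w_stat, p_norm = stats.shapiro(items.sum(axis=1))
print(f"Cronbach's alpha = {alpha:.2f}, Shapiro-Wilk p = {p_norm:.3f}")
```

A common rule of thumb treats alpha of 0.7 or higher as acceptable internal consistency; a Shapiro-Wilk p-value above 0.05 means normality is not rejected, supporting the use of ANOVA.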

We hope these revisions address your concerns and improve the clarity and validity of our manuscript. Thank you again for your valuable feedback and for giving us the opportunity to improve our work.

Kind regards

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The paper is revised in a sufficient level and with appropriate modifications

Author Response

Thank you for your positive evaluation and for approving our reviewed manuscript. We greatly appreciate your thoughtful comments and are pleased to hear that the revisions meet your expectations.

It has been a rewarding experience refining the paper, and we are grateful for your guidance throughout the review process.

Wishing you continued success in your work and a pleasant holiday season.

Reviewer 2 Report

Comments and Suggestions for Authors

The work is not substantially improved. There are still some points that are not clarified. For example, within the scope of which project was this research conducted and submitted? What are the technical details of the ChatGPT version used?

Is the questionnaire weighted? What is the Cronbach's alpha?

There is a lot of work still to be done.

Comments on the Quality of English Language

The language needs revision. There are many long sentences with errors, and in several cases the meaning is lost.

Author Response

Dear Reviewer,

Thank you for your detailed feedback on our manuscript. We appreciate the opportunity to address the concerns raised and clarify key points. Below, we outline the specific revisions and improvements made in response to your comments:

  1. Technical Details of ChatGPT-4
    • In lines 69-92, we have added a detailed description of the ChatGPT-4 model used in our study, including its pre-trained nature and how prompts were utilized to optimize its feedback capabilities. This addition provides transparency about the technical setup and methodologies employed.
  2. Detailed Description of Prompt Development
    • We have removed the content in lines 135-139 and replaced it with a comprehensive explanation (lines 135-150) of how the prompt was developed and applied in this study. This section builds upon findings from our previous research ([12]) and clarifies the iterative process used to fine-tune the model’s responses.
  3. Cronbach’s Alpha
    • In lines 188-189, we have added the Cronbach’s Alpha value to demonstrate the reliability of the questionnaire used in this study. This provides statistical validation for the instrument and addresses your concern regarding the weighted nature of the questionnaire.
  4. Scope of the Research
    • The study falls within the broader project aimed at evaluating the potential of AI tools, specifically ChatGPT-4, to complement traditional supervision methods in psychotherapy. This has been clarified in the introduction and methodology sections, aligning the study’s objectives with its practical implications.
  5. Language Revisions
    • We have thoroughly revised the manuscript to address linguistic issues and improve clarity. Sentences have been simplified where necessary to ensure that the meaning is clear and concise.

Your feedback has been invaluable in guiding these revisions, and we hope the changes made address the issues raised.

Thank you for your time and thoughtful review.

Kind regards

Reviewer 3 Report

Comments and Suggestions for Authors

Thank you for the effort of trying to improve the manuscript. However, the additional content still does not justify the reusability and validity of the methodology. This is especially true for the construction of the prompt, which is the core of this study. Did the authors use ChatGPT directly with the "Reminder" function in the settings, or did they use the GPT Builder function to customize the content fed into ChatGPT? There is a big difference between "untrained" and "pre-trained" for the LLM. The authors may need to revise the methodology and describe the pipeline (how the customized GPT was created in ChatGPT) in more detail. In addition, the reference on which the manuscript relies for the details of the methodology is cited with "in press" status. There are no preprints from other sources in which to look up the content, which raises the question of whether the methodology in the manuscript is realistic. For this reason, I think the manuscript is not suitable for publication at this stage. It requires an intensive review of the methodology and a revision of the section.

Comments on the Quality of English Language

The English could be improved to more clearly express the research.

Author Response

Dear Reviewer,

We would like to express our gratitude for your thoughtful and constructive feedback on our manuscript, titled "Can AI technologies support clinical supervision? Assessing the potential of ChatGPT". Your insights have been invaluable in helping us identify areas that require further elaboration and improvement to enhance the clarity, rigor, and reproducibility of our work.

Below, we address your concerns point by point:

  1. Reusability and Validity of the Methodology

You raised concerns regarding the reusability and validity of the methodology, particularly about the construction of the prompt and the cited references.

  • Pipeline Details:
    • We have expanded the "Methodology" section to include a visual diagram representing the pipeline (from input preparation to feedback evaluation) and a step-by-step explanation of the process used to construct the customized prompt for ChatGPT, shown in Fig. 1 and Fig. 2 (Methodology Diagram Details).
  • References:
    • We have supplemented the manuscript with additional references to strengthen the theoretical basis of our methodology (https://doi.org/10.20944/preprints202501.0848.v1 Preprints ID 145816). Where references are "in press," we have summarized key findings and methods to ensure transparency and reproducibility.

Conclusion

We are confident that these revisions address your concerns and enhance the rigor, clarity, and reproducibility of our manuscript. We deeply appreciate your feedback and the opportunity to improve our work. Should there be any additional suggestions or clarifications required, we would be delighted to address them.

Thank you again for your valuable input and time.

Sincerely,

Valeria Cioffi, Ottavio Ragozzino, Lucia Luciana Mosca, Enrico Moretto, Enrica Tortora, and Co-authors

Author Response File: Author Response.docx
