Peer-Review Record

University Students’ Perceptions and Intentions to Use Digital Mental Health Services Including Online Therapy and Mental Health Apps: A Cross-Sectional Study

Int. J. Environ. Res. Public Health 2026, 23(6), 719; https://doi.org/10.3390/ijerph23060719

by Tamadhir Al-Mahrouqi¹, Maryam Al Wardy¹,*, Abdullah Al Lawati², Ahmed Al Maskari², Alazhar Al Azri³, Qaiser Al Riyami², Hamood Al Aufi², Sachin Jose⁴ and Hamed Al Sinawi²

Reviewer 1: Anonymous

Reviewer 2: Anonymous

Reviewer 3: Anonymous

Submission received: 20 April 2026 / Revised: 25 May 2026 / Accepted: 25 May 2026 / Published: 28 May 2026

(This article belongs to the Special Issue AI Chatbots and Human Assistants for Mental Health)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Dear Authors,

Thank you for submitting this cross-sectional study examining the mediational roles of perceptions of mental health apps and online therapy in the relationship between attitudes toward digital technology and intention to use digital mental health solutions among Omani university students. The topic is timely and clinically relevant, the use of structural equation modeling for mediation analysis is methodologically appropriate for the research questions posed, and the cultural framing through the Theory of Planned Behaviour adds meaningful theoretical grounding. Below is detailed feedback organized by manuscript section.

Title

The title accurately reflects the study content.

Suggestions for improvement:

The title ends with a full stop, which is non-standard for article titles in this journal — remove the period

Abstract

The abstract clearly summarizes the study design and key findings.

Suggestions for improvement:

Line 40: The abstract reports "58% female and 42% male," yet Table 1 (lines 219–220) reports 49.7% male and 50.3% female — a direct contradiction; verify and correct throughout
Line 39 vs. line 220: The abstract reports mean age 21.4 years (SD = 1.8), while Table 1 reports 21.24 ± 3.12 — both the mean and the standard deviation differ; verify and correct consistently throughout the manuscript

Keywords

Suggestions for improvement:

Lines 53–54: Keywords are capitalized throughout ("Digital Mental Health," "Artificial Intelligence," etc.), which is non-standard; keywords should be written in lowercase unless they are proper nouns — revise accordingly

Introduction

The introduction effectively establishes the burden of mental health difficulties among university students and contextualizes the Omani and MENA-specific challenges well.

Suggestions for improvement:

Line 440, Reference 19: Etheridge, Sinyard, and Brindle's chapter on Implementation Research is cited as the source for the Theory of Planned Behaviour — this is incorrect; TPB is attributed to Icek Ajzen (1991): Ajzen I. The theory of planned behavior. Organizational Behavior and Human Decision Processes, 1991;50:179–211; misattributing the foundational theoretical framework of the study is not a minor bibliographic issue and must be corrected
Lines 88–89: "Two trials by Fulmer et al and Reyes et al" — a comma is missing before "and," and a period is missing at the end of the sentence before the citation

Methods

The study design is appropriate and the sample size calculation is clearly reported.

Suggestions for improvement:

The questionnaire was adapted from Gbollie et al., which was developed and validated for South African university students; Supplementary Table S1 retains the item "Apps developed by South Africans" — this item was clearly not adapted for the Omani context and represents a meaningful oversight in cultural adaptation; either replace this item with a culturally appropriate equivalent (e.g., "Apps developed locally") or explicitly acknowledge its retention as a limitation and explain the rationale; as it stands, this raises questions about the validity of the perceptions subscale as applied in this study
Neither Supplementary Table S1 nor Table S2 is cited anywhere in the main text of the manuscript — add in-text references to both tables where appropriate, or clarify their role in the supplementary material
Both Supplementary Table S1 and Table S2 carry identical titles: "Perceptions and beliefs about mental health apps" — Table S2 describes preferred app focus areas and should be retitled accordingly (e.g., "Preferred focus areas for mental health apps among university students")

Results

Results are clearly organized and the mediation model is well presented.

Suggestions for improvement:

Line 241–242: States that Figure 2 and Figure 3 illustrate the final model, yet Figure 3 does not appear in the manuscript — either add the missing figure or remove the reference to it
Line 242: Refers readers to "Table 1" for a detailed description of the observed effects, but the effects table is labeled Table 2 — correct the in-text reference
Lines 232–239: The measurement model fit requires a more candid and consolidated discussion; CFI = .871 and TLI = .858 fall below the conventional threshold of .90 for acceptable fit in CFA; AVE values of .35–.49 are uniformly below the recommended threshold of .50 for convergent validity; composite reliability for the Intention construct (CR = .66) falls below the recommended .70; the authors acknowledge each of these individually but do not collectively address what they mean for the interpretation of the mediation results — a dedicated paragraph should explicitly discuss how these measurement limitations taken together affect confidence in the indirect effect estimates, particularly given that the central finding rests on a construct with marginal reliability and below-threshold convergent validity
Line 232: The degrees of freedom reported for model fit (χ²(318) = 816) should be explained — with 27 items across four factors the basis for df = 318 is not immediately apparent and warrants clarification

Discussion

The discussion contextualizes findings appropriately within the TPB framework and the collectivist cultural context.

Suggestions for improvement:

References 21 and 24 are identical — both cite Bosnjak, Ajzen, and Schmidt (2020) on the Theory of Planned Behaviour; delete the duplicate

Limitations

Limitations are acknowledged but incomplete.

Suggestions for improvement:

The limitations section does not mention: (1) the cultural adaptation issue with the retained South African item in the instrument; (2) that the measurement model fit indices fell below conventional thresholds and what this means for result interpretation; (3) that Tables S1 and S2 in the supplementary material are not referenced in the main text — all three should be addressed

References

Suggestions for improvement:

Replace Reference 19 with the correct Ajzen (1991) citation for the Theory of Planned Behaviour as noted above
Remove the duplicate reference (21/24)

With these revisions addressing the factual inconsistencies, the cultural adaptation of the instrument, the measurement model limitations, and the missing figure, this study will make a valuable contribution to the growing literature on digital mental health adoption in collectivist societies, particularly in the underrepresented MENA context.

Sincerely,

Reviewer

Author Response

Comment 1: The title ends with a full stop, which is non-standard for article titles in this journal — remove the period

Response 1: Thank you for the detailed observation. The period has been removed and highlighted for your reference (Line 4).

Comment 2: Line 40: The abstract reports "58% female and 42% male," yet Table 1 (lines 219–220) reports 49.7% male and 50.3% female — a direct contradiction; verify and correct throughout. Line 39 vs. line 220: The abstract reports mean age 21.4 years (SD = 1.8), while Table 1 reports 21.24 ± 3.12 — both the mean and the standard deviation differ; verify and correct consistently throughout the manuscript.

Response 2: Thank you for your careful review of the descriptive statistics. The gender split and the mean age/SD were revisited, and the changes are highlighted for your reference in the abstract (Lines 40–41).

Comment 3: Lines 53–54: Keywords are capitalized throughout ("Digital Mental Health," "Artificial Intelligence," etc.), which is non-standard; keywords should be written in lowercase unless they are proper nouns — revise accordingly.

Response 3: Thank you for helping us improve the manuscript. The keywords were rewritten and highlighted for your reference (Lines 54–55).

Comment 4: Line 440, Reference 19: Etheridge, Sinyard, and Brindle's chapter on Implementation Research is cited as the source for the Theory of Planned Behaviour — this is incorrect; TPB is attributed to Icek Ajzen (1991): Ajzen I. The theory of planned behavior. Organizational Behavior and Human Decision Processes, 1991;50:179–211; misattributing the foundational theoretical framework of the study is not a minor bibliographic issue and must be corrected.

Response 4: We appreciate your valuable and detailed observation on the references. The reference has been revisited and highlighted for your reference in the references section (updated line 504).

Comment 5: Lines 88–89: "Two trials by Fulmer et al and Reyes et al" — a comma is missing before "and," and a period is missing at the end of the sentence before the citation.

Response 5: The suggested punctuation was added and highlighted in the text for your reference (Lines 91–93). We appreciate your time and effort.

Comment 6: The questionnaire was adapted from Gbollie et al., which was developed and validated for South African university students; Supplementary Table S1 retains the item "Apps developed by South Africans" — this item was clearly not adapted for the Omani context and represents a meaningful oversight in cultural adaptation; either replace this item with a culturally appropriate equivalent (e.g., "Apps developed locally") or explicitly acknowledge its retention as a limitation and explain the rationale; as it stands, this raises questions about the validity of the perceptions subscale as applied in this study.

Response 6: Thank you for this important comment. We agree that the retention of the item “Apps developed by South Africans” in Supplementary Table S1 was not appropriate for the Omani context and represents an oversight in the cultural adaptation of the questionnaire. We did not undertake a formal cultural adaptation or re-validation process for the original instrument; therefore, we agree that this should be explicitly acknowledged as a limitation of the study (Line 366-373).

Comment 7: Neither Supplementary Table S1 nor Table S2 is cited anywhere in the main text of the manuscript — add in-text references to both tables where appropriate, or clarify their role in the supplementary material.

Response 7: Thank you for your thoughtful evaluation. References to Supplementary Tables S1 and S2 were added in the Methods section (Lines 202–204), where they refer to the ‘Perceptions of mental health apps’ construct of the Gbollie et al. questionnaire.

Comment 8: Both Supplementary Table S1 and Table S2 carry identical titles: "Perceptions and beliefs about mental health apps"; Table S2 describes preferred app focus areas and should be retitled accordingly (e.g., "Preferred focus areas for mental health apps among university students").

Response 8: Thank you for your helpful recommendations. The title for Supplementary Table 2 (Table S2) was changed to accurately describe the table as “Mental Health App Priorities Among University Students”, and was highlighted in the supplementary materials for your reference.

Comment 9: Line 241–242: States that Figure 2 and Figure 3 illustrate the final model, yet Figure 3 does not appear in the manuscript — either add the missing figure or remove the reference to it.

Response 9: Thank you for your review and helpful insights. The in-text citations were corrected to refer to Figure 2 and Table 2, and were highlighted for your reference (Lines 256 and 258).

Comment 10: Line 242: Refers readers to "Table 1" for a detailed description of the observed effects, but the effects table is labeled Table 2 — correct the in-text reference.

Response 10: Thank you for your review and helpful insights. The in-text citations were corrected to refer to Figure 2 and Table 2, and were highlighted for your reference (Lines 256 and 258).

Comment 11: Lines 232–239: The measurement model fit requires a more candid and consolidated discussion; CFI = .871 and TLI = .858 fall below the conventional threshold of .90 for acceptable fit in CFA; AVE values of .35–.49 are uniformly below the recommended threshold of .50 for convergent validity; composite reliability for the Intention construct (CR = .66) falls below the recommended .70; the authors acknowledge each of these individually but do not collectively address what they mean for the interpretation of the mediation results — a dedicated paragraph should explicitly discuss how these measurement limitations taken together affect confidence in the indirect effect estimates, particularly given that the central finding rests on a construct with marginal reliability and below-threshold convergent validity.

Response 11: We sincerely thank the reviewer for this thoughtful and valuable comment. We agree that, while our measurement model showed an overall acceptable fit, several psychometric indicators did not fully meet conventional standards. Specifically, the CFI (.871) and TLI (.858) values were below the preferred .90 threshold, the AVE values across constructs were below .50, and the Intention construct demonstrated slightly lower composite reliability (CR = .66) than the recommended cutoff. Taken together, these findings suggest that although the model provides a reasonably structured representation of the data, some constructs—particularly Intention—may not capture the underlying latent concepts as strongly as desired. As a result, the mediation findings should be interpreted with caution, especially since the primary outcome relies on a construct with marginal reliability and weaker convergent validity. These measurement limitations may reduce the precision of the indirect effect estimates and could potentially attenuate the observed relationships. However, because most factor loadings were statistically significant, discriminant validity was supported, and other fit indices such as RMSEA and SRMR indicated acceptable approximation, we believe the model still offers meaningful preliminary insights into the proposed mediation pathways. We have revised the manuscript to include a dedicated paragraph that more transparently discusses these combined limitations and clarifies that the mediation results should be viewed as supportive but not definitive, highlighting the need for future studies using more robustly validated measurement instruments (Line 379-393).
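The two reliability indices discussed in Response 11 can be illustrated with a minimal sketch. It assumes the common formulas based on standardized loadings with uncorrelated residuals (composite reliability as the squared sum of loadings over that quantity plus the summed residual variances; AVE as the mean squared loading). The function names and the example loadings are hypothetical and are not taken from the manuscript.

```python
from typing import Sequence


def composite_reliability(loadings: Sequence[float]) -> float:
    """Composite reliability from standardized loadings.

    Assumes uncorrelated residuals, so each residual variance is 1 - loading**2.
    """
    sum_loadings = sum(loadings)
    residual_variance = sum(1 - loading ** 2 for loading in loadings)
    return sum_loadings ** 2 / (sum_loadings ** 2 + residual_variance)


def average_variance_extracted(loadings: Sequence[float]) -> float:
    """Average variance extracted: the mean of the squared standardized loadings."""
    return sum(loading ** 2 for loading in loadings) / len(loadings)


# Hypothetical loadings for a three-item construct (NOT the study's values).
example_loadings = [0.6, 0.6, 0.6]
cr = composite_reliability(example_loadings)
ave = average_variance_extracted(example_loadings)
print(f"CR = {cr:.2f} (recommended >= .70), AVE = {ave:.2f} (recommended >= .50)")
```

In this illustration, three items that each load at 0.6 give a CR of about .63 and an AVE of .36, which shows how moderate loadings on a short scale can leave both indices under their conventional cutoffs at the same time.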

Comment 12: Line 232: The degrees of freedom reported for model fit (χ²(318) = 816) should be explained — with 27 items across four factors the basis for df = 318 is not immediately apparent and warrants clarification

Response 12: We thank the reviewer for highlighting the need for greater clarity regarding the reported degrees of freedom. In confirmatory factor analysis, the degrees of freedom are the difference between the number of unique observed variances and covariances in the data and the number of freely estimated model parameters; they do not depend on sample size. For the present model with 27 observed items, there are 27 × 28 / 2 = 378 unique variance–covariance data points. The reported df of 318 therefore reflects the estimation of 60 free parameters (378 − 318 = 60), including factor loadings, latent variances/covariances, and item error variances, within the specified four-factor structure. Thus, the model was over-identified, allowing for statistical evaluation of model fit.
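The arithmetic in Response 12 can be checked with a short sketch. The parameter breakdown below is an assumption, since the response does not itemize it: a simple-structure model in which each item loads on exactly one factor, residuals are uncorrelated, latent variances are fixed to 1 for identification, and only the covariance structure is modeled. Under those assumptions the count reproduces the 60 free parameters implied by df = 318. The function names are hypothetical.

```python
def unique_moments(n_items: int) -> int:
    """Unique variances and covariances among n_items observed variables."""
    return n_items * (n_items + 1) // 2


def cfa_free_parameters(n_items: int, n_factors: int) -> int:
    """Free parameters in a simple-structure CFA with correlated factors.

    Assumed specification: one loading per item, one residual variance per
    item, all factor covariances free, latent variances fixed to 1.
    """
    loadings = n_items
    residual_variances = n_items
    factor_covariances = n_factors * (n_factors - 1) // 2
    return loadings + residual_variances + factor_covariances


def cfa_degrees_of_freedom(n_items: int, n_factors: int) -> int:
    """Degrees of freedom: unique moments minus free parameters."""
    return unique_moments(n_items) - cfa_free_parameters(n_items, n_factors)


# 27 items on four factors, as described in the review record.
print(unique_moments(27))             # 378
print(cfa_free_parameters(27, 4))     # 27 + 27 + 6 = 60
print(cfa_degrees_of_freedom(27, 4))  # 318
```

Identifying the factors with marker loadings instead (one loading per factor fixed to 1, latent variances free) gives the same total: 23 loadings, 4 latent variances, 6 covariances, and 27 residual variances also sum to 60.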

Comment 13: References 21 and 24 are identical — both cite Bosnjak, Ajzen, and Schmidt (2020) on the Theory of Planned Behaviour; delete the duplicate.

Response 13: Removed. Thank you for the insightful comment. The change was highlighted in the reference section for your reference (now reference 22; Lines 335 and 509).

Comment 14: The limitations section does not mention: (1) the cultural adaptation issue with the retained South African item in the instrument; (2) that the measurement model fit indices fell below conventional thresholds and what this means for result interpretation; (3) that Tables S1 and S2 in the supplementary material are not referenced in the main text — all three should be addressed.

Response 14: Thank you for this important feedback. The manuscript has been revised in response to these comments, and the cultural adaptation issue was added to the study's limitations. Furthermore, the supplementary tables are now cited to indicate their relevance in the manuscript, and explanations of the model fit were provided as suggested.

Comment 15: Replace Reference 19 with the correct Ajzen (1991) citation for the Theory of Planned Behaviour as noted above

Response 15: We appreciate your detailed observations on our manuscript. The above-mentioned reference was replaced with the correct reference and highlighted in the reference section (Line 504).

Comment 16: Remove the duplicate reference (21/24)

Response 16: Removed. Thank you for the insightful comment. This was highlighted in the reference section for your reference (Line 509).

Reviewer 2 Report

Comments and Suggestions for Authors

Dear authors,

Your study addresses a relevant and timely topic: the adoption of digital mental health tools among university students in Oman. The methodological approach is well-structured, and the use of mediation analysis via SEM represents a notable strength. However, several methodological and conceptual issues need to be addressed before publication.

Internal Inconsistencies in Demographic Data

There is an unexplained internal discrepancy between the abstract and the main text. The abstract reports 58% female and 42% male participants (lines 40–41), while Table 1 shows a near-equal split (49.7% male, 50.3% female). This inconsistency must be corrected and accounted for.

Similarly, the mean age reported in the abstract is 21.4 years (SD = 1.8), whereas the Results section reports 21.24 ± 3.12. The SD values differ substantially and must be reconciled across the manuscript.

Weaknesses in the Measurement Model

The composite reliability of the outcome variable Intention (CR = .66) falls below the recommended threshold of .70. The authors acknowledge this limitation but minimize its implications without offering concrete solutions. It is recommended that the authors: (a) consider removing items with the lowest loadings, (b) repeat the analysis excluding items Q12–Q14, or (c) discuss more explicitly the consequences of this limitation for the interpretation of findings.

Additionally, convergent validity is moderate (AVE = .35–.49), with values falling below the .50 threshold for some constructs. This warrants more substantive discussion.

CFA Model Fit

CFI = .871 and TLI = .858 are below the conventional threshold of .90, indicating acceptable but suboptimal fit. The authors describe the model as "acceptable" without comparing it against alternative models or reporting modification indices. It is recommended that the most relevant modification indices be reported and that model respecification be considered.

Measurement Instrument

The study adapts the questionnaire from Gbollie et al. (2023) without sufficiently detailing the cultural adaptation process. Specifically:

It is unclear whether a translation/back-translation procedure was carried out, given that the original instrument is in English but the study context is Arabic-speaking.
No piloting or focus group process is described to verify the cultural appropriateness of items prior to administration.
Different response scales (4-point and 5-point) are used across questionnaire sections; this heterogeneity should be methodologically justified.

Sampling and Selection Bias

Recruitment via institutional email and social media platforms (WhatsApp, Twitter, Instagram) introduces a non-trivial selection bias: participants are likely to already have familiarity with and positive attitudes toward digital tools compared to the broader student population. This substantially limits the generalizability of findings and should be addressed more explicitly in the Limitations section.

Furthermore, the response rate is not reported, which is essential information for evaluating sample representativeness.

Missing Data Handling

The authors state that Full Information Maximum Likelihood (FIML) was used for model estimation, but neither the proportion of missing data nor the rationale for choosing FIML over alternatives such as multiple imputation is provided. The distribution of missing values per variable should be reported.

Uncontrolled Confounding

Potentially confounding variables — including mental health status, prior history of psychological treatment, and digital literacy level — are not included in the SEM as covariates. The authors note that 38% of participants self-reported "Very Good" mental health (Appendix), yet this variable is not integrated into the primary analysis, despite its potential influence on both attitudes and intention to use digital tools.

Erroneous Reference

Reference 10 (Columbus, Forbes 2017) reports an access date of "December 17, 2026," which falls after the study's data collection period. This should be verified and corrected.

Discussion of Non-Significance

The non-mediation of perceptions of online therapy (p = .109) is adequately discussed; however, the 95% confidence interval (−0.007, 0.067) includes positive values, suggesting a potentially small positive effect. A post-hoc power analysis could help rule out a Type II error and would strengthen the interpretive discussion.

Terminological Consistency

The title and keywords include "Artificial Intelligence," yet the study does not directly measure AI use: it measures perceptions of mental health apps and online therapy more broadly. The conceptual distinction between AI chatbots, general mental health apps, and online therapy should be made more precise in both the Introduction and Discussion.

Highlights Section

The three-part structure (relevance, significance, implications) is atypical for this journal, and some statements are redundant with the abstract. Authors should verify compliance with the journal's editorial guidelines.

Final Recommendation: Major Revision Required. This manuscript addresses a meaningful and underexplored research question with genuine publication potential. Nevertheless, the demographic inconsistencies, psychometric weaknesses, and absence of essential methodological information — including response rate, instrument cultural adaptation, and missing data handling — require substantial revision before the manuscript can be considered for publication.

Author Response

Comment 1: There is an unexplained internal discrepancy between the abstract and the main text. The abstract reports 58% female and 42% male participants (lines 40–41), while Table 1 shows a near-equal split (49.7% male, 50.3% female). This inconsistency must be corrected and accounted for. Similarly, the mean age reported in the abstract is 21.4 years (SD = 1.8), whereas the Results section reports 21.24 ± 3.12. The SD values differ substantially and must be reconciled across the manuscript.

Response 1: Thank you for your careful review of the descriptive statistics. The gender split and the mean age/SD were revisited, and the changes are highlighted for your reference in the abstract (Lines 40–41).

Comment 2: The composite reliability of the outcome variable Intention (CR = .66) falls below the recommended threshold of .70. The authors acknowledge this limitation but minimize its implications without offering concrete solutions. It is recommended that the authors: (a) consider removing items with the lowest loadings, (b) repeat the analysis excluding items Q12–Q14, or (c) discuss more explicitly the consequences of this limitation for the interpretation of findings. Additionally, convergent validity is moderate (AVE = .35–.49), with values falling below the .50 threshold for some constructs. This warrants more substantive discussion. CFI = .871 and TLI = .858 are below the conventional threshold of .90, indicating acceptable but suboptimal fit. The authors describe the model as "acceptable" without comparing it against alternative models or reporting modification indices. It is recommended that the most relevant modification indices be reported and that model respecification be considered.

Response 2: We sincerely thank the reviewer for this thoughtful and valuable comment. We agree that, while our measurement model showed an overall acceptable fit, several psychometric indicators did not fully meet conventional standards. Specifically, the CFI (.871) and TLI (.858) values were below the preferred .90 threshold, the AVE values across constructs were below .50, and the Intention construct demonstrated slightly lower composite reliability (CR = .66) than the recommended cutoff. Taken together, these findings suggest that although the model provides a reasonably structured representation of the data, some constructs—particularly Intention—may not capture the underlying latent concepts as strongly as desired. As a result, the mediation findings should be interpreted with caution, especially since the primary outcome relies on a construct with marginal reliability and weaker convergent validity. These measurement limitations may reduce the precision of the indirect effect estimates and could potentially attenuate the observed relationships. However, because most factor loadings were statistically significant, discriminant validity was supported, and other fit indices such as RMSEA and SRMR indicated acceptable approximation, we believe the model still offers meaningful preliminary insights into the proposed mediation pathways. We have revised the manuscript to include a dedicated paragraph that more transparently discusses these combined limitations and clarifies that the mediation results should be viewed as supportive but not definitive, highlighting the need for future studies using more robustly validated measurement instruments (Line 379-393).

Comment 3: The study adapts the questionnaire from Gbollie et al. (2023) without sufficiently detailing the cultural adaptation process. Specifically: It is unclear whether a translation/back-translation procedure was carried out, given that the original instrument is in English but the study context is Arabic-speaking. No piloting or focus group process is described to verify the cultural appropriateness of items prior to administration. Different response scales (4-point and 5-point) are used across questionnaire sections; this heterogeneity should be methodologically justified.

Response 3: Thank you for this important comment. We agree that the manuscript required clearer reporting of the questionnaire adaptation process. The original instrument by Gbollie et al. (2023) was in English, and our questionnaire was also administered in English. Therefore, no translation or back-translation procedure was conducted. Although Oman is an Arabic-speaking country, the study was conducted among university students in an academic setting where English is the language of instruction. For this reason, the research team considered the English version appropriate for the target population. We have now clarified this in the Methods section and the Limitations section (Lines 366–373). Regarding the response scales, we retained the scale structure used in the original instrument to preserve comparability with Gbollie et al. (2023). The different response formats reflect the structure of the original questionnaire sections, where different constructs were measured using different Likert-type scales. We have now clarified this in the Methods section and the Limitations section (Lines 171–178).

Comment 4: Recruitment via institutional email and social media platforms (WhatsApp, Twitter, Instagram) introduces a non-trivial selection bias: participants are likely to already have familiarity with and positive attitudes toward digital tools compared to the broader student population. This substantially limits the generalizability of findings and should be addressed more explicitly in the Limitations section. Furthermore, the response rate is not reported, which is essential information for evaluating sample representativeness. The authors state that Full Information Maximum Likelihood (FIML) was used for model estimation, but neither the proportion of missing data nor the rationale for choosing FIML over alternatives such as multiple imputation is provided. The distribution of missing values per variable should be reported.

Response 4: Thank you for this important comment. We agree that recruitment through institutional email and social media may have introduced selection bias, as digitally engaged students may have been more likely to participate. We have therefore revised the Limitations section to acknowledge that this may limit the generalizability of the findings (Lines 374–378). We also clarified that the response rate could not be calculated because the survey was distributed through open online channels, where the total number of students who received or viewed the invitation could not be accurately determined. Regarding missing data, we would like to clarify that there were no missing data in our dataset; all 360 participants had complete responses across all 27 items. Therefore, no procedures for handling missing data were required in the analysis. Full Information Maximum Likelihood (FIML) was specified as the estimation method within the CFA framework; however, since the dataset was complete, the use of FIML did not involve any missing data estimation and produced results equivalent to standard maximum likelihood estimation. We have now clarified this explicitly in the Methods section to avoid any possible misunderstanding (Lines 171–178). We also confirm that multiple imputation was not considered necessary, given the completeness of the data.

Comment 5: Potentially confounding variables — including mental health status, prior history of psychological treatment, and digital literacy level — are not included in the SEM as covariates. The authors note that 38% of participants self-reported "Very Good" mental health (Appendix), yet this variable is not integrated into the primary analysis, despite its potential influence on both attitudes and intention to use digital tools.

Response 5: Thank you for this important comment. We agree that mental health status, prior psychological treatment, and digital literacy may act as potential confounders influencing both attitudes toward digital tools and intention to use them. These variables were not included as covariates in the SEM, and we acknowledge this as a limitation. Although self-reported mental health status was collected descriptively, it was not incorporated into the primary model because the SEM was specified according to the theoretical structure adapted from Gbollie et al. We have revised the Limitations section to clarify that the absence of these covariates may limit causal interpretation and may leave residual confounding (Lines 394–401).

Comment 6: Erroneous reference. Reference 10 (Columbus, Forbes 2017) reports an access date of "December 17, 2026," which falls after the study's data collection period. This should be verified and corrected.

Response 6: Thank you for your constructive feedback, which is valuable for improving this manuscript. The access date has been corrected; it was originally meant to refer to the year ‘2025’. The correction is highlighted in the reference section (Line 484).

Comment 7: Discussion of non-significance. The non-mediation of perceptions of online therapy (p = .109) is adequately discussed; however, the 95% confidence interval (−0.007, 0.067) includes positive values, suggesting a potentially small positive effect. A post-hoc power analysis could help rule out a Type II error and would strengthen the interpretive discussion.

Response 7: We sincerely thank the reviewer for this helpful suggestion. We agree that the non-significant indirect effect for perceptions of online therapy (p = .109; 95% CI: −0.007, 0.067) should be interpreted carefully, especially since the confidence interval includes very small positive values close to zero. This means that while we did not find statistical significance, we cannot completely rule out the possibility of a very small indirect effect. We also appreciate the recommendation to conduct a post-hoc power analysis. After careful consideration, we have chosen not to include it, as post-hoc power based on observed results is generally considered to add little new information beyond what is already conveyed by the p-values and confidence intervals, particularly in mediation models. In fact, such calculations tend to mirror the observed significance level rather than provide additional meaningful insight. Instead, we have strengthened the Discussion by focusing on what the data actually show—the size of the effect and the confidence interval. We now clarify that, although the indirect effect was not statistically significant, the result should not be interpreted as definitive evidence of no effect. Rather, it is more appropriate to view it as inconclusive, with the possibility of a very small effect that future, better-powered studies may be able to explore further. We believe this provides a clearer, more transparent, and statistically appropriate interpretation of the finding (Line 288-295).
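The point that post-hoc power tends to mirror the observed significance level can be illustrated with a minimal sketch. It assumes a two-sided z-test at alpha = .05, which is a simplification of the mediation test actually used; the function name `observed_power` is hypothetical and introduced only for this illustration. Under that assumption, "observed" power is computed from the p-value alone, so it carries no information beyond the p-value itself.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def observed_power(p_value: float, alpha: float = 0.05) -> float:
    """Post-hoc ("observed") power of a two-sided z-test, derived from p alone.

    Assumption for illustration: the true effect equals the observed effect.
    """
    # Absolute z statistic implied by the two-sided p-value.
    z_obs = STD_NORMAL.inv_cdf(1 - p_value / 2)
    # Critical value for the chosen two-sided alpha level.
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Probability of landing in either rejection region.
    return STD_NORMAL.cdf(z_obs - z_crit) + STD_NORMAL.cdf(-z_obs - z_crit)


# Observed power falls steadily as p rises; p = .109 is the value reported
# for the online-therapy indirect effect.
for p in (0.001, 0.01, 0.05, 0.109, 0.50):
    print(f"p = {p:.3f} -> observed power = {observed_power(p):.3f}")
```

Because the mapping from p to observed power is one-to-one here, a result just at the significance threshold always yields observed power of about one half, and any non-significant result yields less.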

Comment 8: Terminological consistency. The title and keywords include "Artificial Intelligence," yet the study does not directly measure AI use: it measures perceptions of mental health apps and online therapy more broadly. The conceptual distinction between AI chatbots, general mental health apps, and online therapy should be made more precise in both the Introduction and Discussion.

Response 8: Thank you for your careful and informative feedback. We appreciate your comments and thoughtful suggestions. On revisiting the manuscript, we note that the title explicitly mentions the terms ‘digital mental health services’, ‘online therapy’, and ‘mental health apps’. The term ‘Artificial Intelligence’ was selected as a keyword only, because the manuscript discusses the embedding of AI in mental health apps. As we used an adapted version of Gbollie et al.’s questionnaire, which was validated to measure intention to utilize AI-based tools, we believe the keyword ‘Artificial Intelligence’ is relevant to the manuscript.

Comment 9: Highlights section. The three-part structure (relevance, significance, implications) is atypical for this journal, and some statements are redundant with the abstract. Authors should verify compliance with the journal's editorial guidelines.

Response 9: Thank you for helping us improve the manuscript. We value your detailed and helpful feedback. The manuscript was initially submitted with 6 bullet points, addressing each question with 2 highlights. The second version of the manuscript was forwarded to us in this formatting, as we did not include the three-part structure in the initial submission.

Reviewer 3 Report

Comments and Suggestions for Authors

Overall

From the perspective of the impact of the digital environment on individuals' psychological safety and well-being—so-called "Informational Health"—this topic is extremely timely, focusing on the digital technology adoption process in the collectivist cultures of the war-torn Middle East and MENA region. It possesses high relevance and a certain degree of originality in the field of public health.

The adoption of structural equation modeling (SEM) for mediation analysis as a statistical method is commendable, as it goes beyond simple correlation analysis. While the analytical method itself is solid, the model's goodness-of-fit indices (CFI = .871, TLI = .858) are slightly below the general acceptable standard (>.90), raising concerns from a rigorous standpoint. Furthermore, the composite reliability (CR) of the dependent variable, "Intention," is explicitly stated as 0.66, below the standard value of 0.70. Deriving conclusions with low reliability of the measured variables weakens the validity of the estimation results.

While a sufficient sample size (n=360) is ensured, significant sampling bias exists.
According to the provided Supplementary Table S3, the current mental health status of the survey participants is as follows:
Good: 96 people (26.7%)
Very good: 137 people (38.1%)
Excellent: 87 people (24.2%)
In other words, approximately 89% of the entire sample has a mental health status of good or better, and only a small percentage are in need of actual treatment or intervention (Poor/Fair). There is a significant discrepancy between the intention of healthy students to use the service if they face mental difficulties in the future and the intention of individuals who are actually experiencing difficulties. The failure to thoroughly discuss this "bias towards healthy groups" in the discussion section significantly undermines the reliability of the research.

While the results of the mediation analysis (Table 2) are presented in this paper, the mathematical definition of the structural model in SEM and the calculation process (formula for calculating path coefficients) are not clearly stated in the text. From the perspective of mathematical modeling of social data, defining the relationships between each variable as a system of equations and presenting the calculation process, including the error term, helps ensure transparency.

Path of the independent variable $X$ (attitude towards technology) for the mediating variables $M_1$ (perception of the app) and $M_2$ (perception of online therapy):
$$M_1 = \alpha_1 + a_1 X + e_1$$
$$M_2 = \alpha_2 + a_2 X + e_2$$

Comprehensive structural equation for the dependent variable $Y$ (intent to use):
$$Y = \alpha_3 + c' X + b_1 M_1 + b_2 M_2 + e_3$$

Furthermore, there is logical confusion (misuse of terminology) in the notation of Table 2. While the sum of indirect effects ($0.065 + 0.030 = 0.095$) is consistent with the calculation, the notation "Total Effects (Direct)" is contradictory. Typically, the total effect $c$ is the sum of the direct effect $c'$ and the indirect effect ($c = c' + a_1b_1 + a_2b_2$). Therefore, it should simply be labeled "Total Effects" here, clearly distinguishing it from the direct effect (Direct Effect: $\beta=0.358$) from variable $X$ to $Y$.
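The decomposition described above can be checked with a short arithmetic sketch. It uses only the figures quoted in this comment. Treating 0.358 as the direct effect $c'$ follows the reviewer's reading of Table 2 and is an assumption rather than something verified against the manuscript; the variable names are illustrative.

```python
# Figures as quoted in the reviewer's comment on Table 2.
# Assumption: 0.358 is the direct effect c' (the reviewer's reading).
indirect_apps = 0.065     # a1 * b1, via perceptions of mental health apps (M1)
indirect_therapy = 0.030  # a2 * b2, via perceptions of online therapy (M2)
direct_effect = 0.358     # c', attitude -> intention with both mediators present

total_indirect = indirect_apps + indirect_therapy
total_effect = direct_effect + total_indirect  # c = c' + a1*b1 + a2*b2
proportion_mediated = total_indirect / total_effect

print(f"total indirect effect = {total_indirect:.3f}")
print(f"total effect          = {total_effect:.3f}")
print(f"share mediated        = {proportion_mediated:.1%}")
```

If 0.358 were instead the total effect, the same identity would be rearranged to recover the direct effect, which is why an unambiguous row label matters.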

This study attempts to verify a theoretical framework (TPB) explaining behavioral orientation using mediation analysis with SEM.

Negative factors: Because it is a cross-sectional study, causal relationships cannot be identified; it relies on self-report data (risk of common method bias); and, as mentioned above, the sample is extremely biased towards healthy individuals. While sufficient as a basic factual survey, the design has significant limitations when used as a basis for interventions or social implementation.

The conclusion that "awareness of mental health apps mediated, while awareness of online therapy did not" is a qualitatively compelling explanation, considering it within the context of Middle Eastern-specific privacy concerns (a collectivist culture that values the opinions of others). However, quantitatively, the construct validity of "Intention" (CR=.66) is low, making it difficult to determine whether this difference in mediation is due to a true psychological mechanism or simply measurement error. While the conclusion itself is useful, definitive claims that exceed the data's limitations (such as broad recommendations in Clinical Implications) should be avoided.

Clarification and Re-analysis of Bias Regarding Mental Health Status
Based on the data in Supplementary Table S3, clearly state in the Limitation section that "the majority of this sample is a psychologically healthy group." If possible, incorporate mental health status (Good or above vs. Fair or below) as a control variable into the model or add subgroup analysis to confirm the robustness of the results.
Correct "Total Effects (Direct)" to "Total Effects" in Table 2. Furthermore, please add the structural equation formula (relationship between $X, M_1, M_2, Y$) to the Methodology section to verify the mediating effects as described above, and clarify the calculation framework.
Discuss the statistical justification in the text for CFI and TLI being less than 0.90, and for the Intention CR being 0.66 (why these values are acceptable in the context of this study). It is strongly recommended to remove certain questionnaire items (items with low loadings, such as Q12 and Q14) and rebuild the model to verify if the goodness of fit improves.
Since all variables were measured simultaneously using the same self-report questionnaire, CMV is likely to occur. Add statistical measures, such as Harman's single-factor test, to demonstrate that CMV does not significantly affect the results.

Author Response

Comment 1: The adoption of structural equation modeling (SEM) for mediation analysis as a statistical method is commendable, as it goes beyond simple correlation analysis. While the analytical method itself is solid, the model's goodness-of-fit indices (CFI = .871, TLI = .858) are slightly below the general acceptable standard (>.90), raising concerns from a rigorous standpoint. Furthermore, the composite reliability (CR) of the dependent variable, "Intention," is explicitly stated as 0.66, below the standard value of 0.70. Deriving conclusions with low reliability of the measured variables weakens the validity of the estimation results.

Response 1: We sincerely thank the reviewer for this thoughtful and valuable comment. We agree that, while our measurement model showed an overall acceptable fit, several psychometric indicators did not fully meet conventional standards. Specifically, the CFI (.871) and TLI (.858) values were below the preferred .90 threshold, the AVE values across constructs were below .50, and the Intention construct demonstrated slightly lower composite reliability (CR = .66) than the recommended cutoff. Taken together, these findings suggest that although the model provides a reasonably structured representation of the data, some constructs—particularly Intention—may not capture the underlying latent concepts as strongly as desired. As a result, the mediation findings should be interpreted with caution, especially since the primary outcome relies on a construct with marginal reliability and weaker convergent validity. These measurement limitations may reduce the precision of the indirect effect estimates and could potentially attenuate the observed relationships. However, because most factor loadings were statistically significant, discriminant validity was supported, and other fit indices such as RMSEA and SRMR indicated acceptable approximation, we believe the model still offers meaningful preliminary insights into the proposed mediation pathways. We have revised the manuscript to include a dedicated paragraph that more transparently discusses these combined limitations and clarifies that the mediation results should be viewed as supportive but not definitive, highlighting the need for future studies using more robustly validated measurement instruments (Line 379-393).
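For readers less familiar with the two indices discussed here, the following minimal sketch shows how composite reliability (CR) and average variance extracted (AVE) are conventionally computed from standardized factor loadings. The loadings are invented for illustration and are not the study's estimates; they were chosen only to give values in the range under discussion. The function names are hypothetical.

```python
def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances).

    Assumes standardized loadings, so each item's error variance is 1 - loading^2.
    """
    squared_sum = sum(loadings) ** 2
    error_variance = sum(1 - lam ** 2 for lam in loadings)
    return squared_sum / (squared_sum + error_variance)


def average_variance_extracted(loadings):
    """AVE = mean of the squared standardized loadings."""
    return sum(lam ** 2 for lam in loadings) / len(loadings)


# Hypothetical three-item construct (NOT the study's loadings).
hypothetical_loadings = [0.70, 0.62, 0.55]

cr = composite_reliability(hypothetical_loadings)
ave = average_variance_extracted(hypothetical_loadings)
print(f"CR  = {cr:.2f} (conventional cutoff .70)")
print(f"AVE = {ave:.2f} (conventional cutoff .50)")
```

The sketch makes one relationship visible: moderate loadings across a short scale pull both indices below their conventional cutoffs at the same time.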

Comment 2: While a sufficient sample size (n=360) is ensured, significant sampling bias exists.
According to the provided Supplementary Table S3, the current mental health status of the survey participants is as follows:
Good: 96 people (26.7%)
Very good: 137 people (38.1%)
Excellent: 87 people (24.2%)
In other words, approximately 89% of the entire sample has a mental health status of good or better, and only a small percentage are in need of actual treatment or intervention (Poor/Fair). There is a significant discrepancy between the intention of healthy students to use the service if they face mental difficulties in the future and the intention of individuals who are actually experiencing difficulties. The failure to thoroughly discuss this "bias towards healthy groups" in the discussion section significantly undermines the reliability of the research.

Response 2: Thank you for this important comment. We agree that the sample was predominantly composed of students who reported good or better mental health, which may limit the applicability of the findings to students currently experiencing psychological distress. We have revised the Limitations section to acknowledge this “healthy participant” bias and to clarify that the findings mainly reflect hypothetical intention to use digital mental health tools among relatively healthy students, rather than actual help-seeking behavior among students with current mental health difficulties (Line 395-401).

Comment 3: While the results of the mediation analysis (Table 2) are presented in this paper, the mathematical definition of the structural model in SEM and the calculation process (formula for calculating path coefficients) are not clearly stated in the text. From the perspective of mathematical modeling of social data, defining the relationships between each variable as a system of equations and presenting the calculation process, including the error term, helps ensure transparency.

Comprehensive structural equation for the dependent variable $Y$ (intent to use):
$$Y = \alpha_3 + c' X + b_1 M_1 + b_2 M_2 + e_3$$

Response 3: Thank you for your valuable comment. We have revised and updated Table 2 accordingly.

Comment 4: This study attempts to verify a theoretical framework (TPB) explaining behavioral orientation using mediation analysis with SEM. Because it is a cross-sectional study, causal relationships cannot be identified; it relies on self-report data (risk of common method bias); and, as mentioned above, the sample is extremely biased towards healthy individuals. While sufficient as a basic factual survey, the design has significant limitations when used as a basis for interventions or social implementation. The conclusion that "awareness of mental health apps mediated, while awareness of online therapy did not" is a qualitatively compelling explanation, considering it within the context of Middle Eastern-specific privacy concerns (a collectivist culture that values the opinions of others). However, quantitatively, the construct validity of "Intention" (CR=.66) is low, making it difficult to determine whether this difference in mediation is due to a true psychological mechanism or simply measurement error. While the conclusion itself is useful, definitive claims that exceed the data's limitations (such as broad recommendations in Clinical Implications) should be avoided.

Response 4: Thank you for this thoughtful comment. We agree that the interpretation of the differential mediation findings should be more cautious, particularly given the marginal reliability of the Intention construct and the broader measurement limitations identified in the CFA. We have therefore revised the Conclusion to avoid definitive claims and to clarify that the observed difference between mental health apps and online therapy may reflect either a meaningful psychological mechanism or measurement-related limitations (Line 415-432). We also toned down the clinical implications to avoid recommendations that exceed the strength of the data and emphasized the need for future studies using more robustly validated measures.

Comment 5: Clarification and Re-analysis of Bias Regarding Mental Health Status: Based on the data in Supplementary Table S3, clearly state in the Limitation section that "the majority of this sample is a psychologically healthy group." If possible, incorporate mental health status (Good or above vs. Fair or below) as a control variable into the model or add subgroup analysis to confirm the robustness of the results. Correct "Total Effects (Direct)" to "Total Effects" in Table 2. Furthermore, please add the structural equation formula (relationship between $X, M_1, M_2, Y$) to the Methodology section to verify the mediating effects as described above, and clarify the calculation framework. Discuss the statistical justification in the text for CFI and TLI being less than 0.90, and for the Intention CR being 0.66 (why these values are acceptable in the context of this study). It is strongly recommended to remove certain questionnaire items (items with low loadings, such as Q12 and Q14) and rebuild the model to verify if the goodness of fit improves. Since all variables were measured simultaneously using the same self-report questionnaire, CMV is likely to occur. Add statistical measures, such as Harman's single-factor test, to demonstrate that CMV does not significantly affect the results.

Response 5: We sincerely thank the reviewer for this helpful suggestion. We agree that common method variance (CMV) can be a concern in studies like ours, where all data were collected at a single time point using a self-report questionnaire. In response, we conducted Harman’s single-factor test to examine this issue. The results showed that the first unrotated factor explained less than 50% of the total variance, indicating that no single factor dominated the data. Taken together, this suggests that CMV is unlikely to have significantly influenced the study findings.
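The logic of Harman's single-factor test can be sketched as follows. This is a simplified illustration, not the procedure the authors ran: it approximates the first unrotated factor by the first principal component of the item correlation matrix, and it uses simulated stand-in data rather than the study's responses. All function names are hypothetical, and the 50% threshold is a common heuristic rather than a formal test.

```python
import math
import random


def correlation_matrix(rows):
    """Pearson correlation matrix for the columns (items) of a response table."""
    n = len(rows)
    standardised = []
    for col in zip(*rows):
        mu = sum(col) / n
        sd = math.sqrt(sum((v - mu) ** 2 for v in col) / n)
        standardised.append([(v - mu) / sd for v in col])
    return [[sum(a * b for a, b in zip(zi, zj)) / n for zj in standardised]
            for zi in standardised]


def largest_eigenvalue(matrix, iterations=300):
    """Largest eigenvalue of a symmetric matrix via power iteration."""
    k = len(matrix)
    vec = [1.0 / math.sqrt(k)] * k  # unit-length starting vector
    eigenvalue = 0.0
    for _ in range(iterations):
        nxt = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
        eigenvalue = sum(a * b for a, b in zip(vec, nxt))  # Rayleigh quotient
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    return eigenvalue


def first_factor_share(rows):
    """Share of total variance carried by the first unrotated component."""
    corr = correlation_matrix(rows)
    return largest_eigenvalue(corr) / len(corr)


# Simulated stand-in data (NOT the study's responses): 360 respondents, 27 items,
# each item driven by a weak common factor plus item-specific noise.
rng = random.Random(0)
simulated = []
for _ in range(360):
    common = rng.gauss(0, 1)
    simulated.append([0.6 * common + rng.gauss(0, 1) for _ in range(27)])

share = first_factor_share(simulated)
print(f"first component explains {share:.1%} of total variance")
print("below the 50% heuristic" if share < 0.5 else "at or above the 50% heuristic")
```

A share below the heuristic threshold indicates only that a single factor does not dominate; it does not establish that common method variance is absent.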

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

Dear Authors,

Thank you for your revised manuscript and for the detailed point-by-point responses provided in the accompanying letter. I appreciate the effort made to address the concerns raised in the first round of review, and I acknowledge that several issues have been satisfactorily resolved.

Specifically, I am satisfied with the clarification regarding the absence of missing data and the use of FIML estimation, the correction of the erroneous access date in Reference 10, the strengthened and more transparent discussion of the non-significant indirect effect via perceptions of online therapy, and the revised Limitations section, which now more explicitly acknowledges the role of selection bias and the absence of covariates in the structural model.

However, upon reviewing both the revised manuscript and the supplementary materials, I have identified a critical methodological concern — not previously raised because the supplementary materials were not available at the first round of review — that substantially affects the interpretability of the study's primary finding. In addition, several points from the first round remain unresolved. For these reasons, I recommend Major Revision before this manuscript can be considered for publication.

1. CRITICAL CONCERN: CULTURALLY INAPPROPRIATE ITEM IN SUPPLEMENTARY TABLE S1
Table S1 of the supplementary materials — which reports the items used to measure perceptions of mental health apps (Mediator M1) — includes the following item: "Apps developed by South Africans." This item was retained verbatim from the original instrument by Gbollie et al. (2023), developed and validated in a South African university context. Its inclusion in a study conducted among Omani students is conceptually inappropriate: asking participants in Muscat to rate the importance of an app's South African origin does not correspond to any meaningful construct in their cultural or healthcare context, and is unlikely to have been interpreted consistently across respondents.

This is not a peripheral concern. M1 — perceptions of mental health apps — is the only statistically significant mediator in the model and the basis for the study's central conclusion. Any threat to the construct validity of M1 directly undermines the reliability of that finding. Moreover, the presence of this item is consistent with, and may help explain, the below-threshold AVE values and the suboptimal composite reliability of the intention construct observed in the CFA. It also provides direct empirical evidence that the instrument was not culturally adapted prior to administration — a concern I raised in Comment 3 of the first review round, which the authors addressed by invoking linguistic rather than cultural appropriateness. The presence of this item illustrates that linguistic accessibility and cultural relevance are distinct properties that cannot be conflated.

I ask the authors to choose one of the following courses of action and justify their choice explicitly:

(a) Remove the item from the M1 scale, rerun the CFA and mediation analysis without it, and report the updated results. If findings change materially, the Discussion and Conclusions must be revised accordingly.

(b) Retain the item but provide a substantive theoretical justification for why "locally developed" — rather than "developed by South Africans" specifically — constitutes a valid and culturally relevant dimension of app perception in the Omani context. In this case, the item wording must be discussed explicitly in the Methods section, and the label used in the main text should reflect its actual content rather than the origin-specific phrasing of the original.

(c) Acknowledge that new data collection with a culturally adapted instrument is needed, and reframe the paper explicitly as a preliminary, exploratory study with severely limited conclusions regarding the mediation pathway. In this case, the abstract, Discussion, and Conclusions must be substantially revised to reflect this framing, and any clinical recommendations must be removed or strongly qualified.

Regardless of which option is chosen, the Limitations section must explicitly acknowledge this item as a substantive threat to construct validity.

2. FACTUAL ERROR IN FIGURE 2 — MANDATORY CORRECTION
The note to Figure 2 states that ** denotes p < .001. However, the path coefficient b1 = 0.148 is marked with ** despite having p = .004 according to Table 2. This factual inconsistency was not acknowledged or corrected in the response letter. The authors are required to adopt a differentiated significance notation (e.g., * p < .05; ** p < .01; *** p < .001) applied consistently across all figures and tables.
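The differentiated notation requested here can be expressed as a small lookup, sketched below. The function name `significance_stars` is hypothetical. The two p-values in the example are the ones cited in this review record, for path b1 and for the indirect effect through online therapy.

```python
def significance_stars(p: float) -> str:
    """Map a p-value to the notation * p < .05; ** p < .01; *** p < .001."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""  # not significant at the .05 level


# p-values cited in this review record.
for label, p in (("b1", 0.004), ("indirect effect via online therapy", 0.109)):
    stars = significance_stars(p) or "n.s."
    print(f"{label}: p = {p:.3f} -> {stars}")
```

Under this scheme p = .004 receives two asterisks because it falls below .01 but not below .001, so the marker on b1 would be correct only if the figure note defined ** as p < .01.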

3. MODIFICATION INDICES — UNADDRESSED POINT
In my original comment, I asked the authors not only to acknowledge the suboptimal model fit, but also to report the most relevant modification indices and to consider whether model respecification might be warranted. The response focused exclusively on expanding the limitations discussion, which does not fulfil this specific request. I ask the authors to either: (a) report the primary modification indices — even in a supplementary note — and briefly discuss whether any respecification would be theoretically justifiable; or (b) provide an explicit methodological rationale for the decision not to pursue respecification.

4. CONCEPTUAL GAP BETWEEN THEORETICAL FRAMING AND MEASUREMENT
The Introduction dedicates substantial space to AI chatbots as a specific technology, whereas the constructs entered into the SEM are operationalised at a broader level not exclusively mapped onto AI-powered tools. This disconnect has not been resolved. I ask the authors to include a brief clarifying statement — in the Introduction or Methods — explicitly acknowledging this distinction and discussing its implications for the scope and interpretation of the findings.

5. HEALTHY PARTICIPANT BIAS — INSUFFICIENT DISCUSSION
Supplementary Table S3 shows that 62.3% of participants rated their mental health as "Very good" or "Excellent", while only 2.2% reported "Poor" mental health. This distribution is not discussed in the main text with adequate depth. The study's findings therefore primarily reflect hypothetical intentions among students with no current psychological distress — a population arguably least likely to seek digital mental health services in practice. This must be addressed more substantively in the Discussion, particularly when drawing implications for clinical implementation.

6. ABSENCE OF MISSING DATA — CLARIFICATION NEEDED
The complete absence of missing data across all 360 participants and 27 items most likely reflects the use of mandatory response fields in Google Forms. If so, this should be stated explicitly, as forced-response designs may introduce response bias through arbitrary answers to items participants would otherwise have skipped. A brief methodological note in the Methods section is requested.

I recognise that some of these concerns — particularly Point 1 — require substantial analytical and editorial work. I nonetheless believe they are necessary to address before the manuscript meets the standards required for publication. I look forward to reviewing a thoroughly revised submission.

Yours sincerely,

Reviewer 1

Author Response

Comment 1: Table S1 of the supplementary materials — which reports the items used to measure perceptions of mental health apps (Mediator M1) — includes the following item: "Apps developed by South Africans." This item was retained verbatim from the original instrument by Gbollie et al. (2023), developed and validated in a South African university context. Its inclusion in a study conducted among Omani students is conceptually inappropriate: asking participants in Muscat to rate the importance of an app's South African origin does not correspond to any meaningful construct in their cultural or healthcare context, and is unlikely to have been interpreted consistently across respondents.

Response 1: Thank you for highlighting this important point. We agree that the item “Apps developed by South Africans,” if used verbatim, would be conceptually inappropriate in the Omani context. After checking with the authors who were responsible for data collection, we confirmed that the survey administered to participants had been contextually adapted, and this item was revised to reflect the Omani setting. However, during the analysis and preparation of the supplementary tables, the team responsible for preparing the tables inadvertently referred to the original scale wording, rather than the adapted version that was actually used in the data collection. We sincerely apologize for this oversight. We have now corrected Table S1 (Supplementary Materials File).

Comment 2: The note to Figure 2 states that ** denotes p < .001. However, the path coefficient b1 = 0.148 is marked with ** despite having p = .004 according to Table 2. This factual inconsistency was not acknowledged or corrected in the response letter. The authors are required to adopt a differentiated significance notation (e.g., * p < .05; ** p < .01; *** p < .001) applied consistently across all figures and tables.

Response 2: Thank you for the detailed observation. The figures and tables have now been updated to denote the significance of *p<.05.

Comment 3: In my original comment, I asked the authors not only to acknowledge the suboptimal model fit, but also to report the most relevant modification indices and to consider whether model re-specification might be warranted. The response focused exclusively on expanding the limitations discussion, which does not fulfil this specific request. I ask the authors to either: (a) report the primary modification indices — even in a supplementary note — and briefly discuss whether any re-specification would be theoretically justifiable; or (b) provide an explicit methodological rationale for the decision not to pursue re-specification.

Response 3: We thank the reviewer for this important methodological suggestion. In response, we have now reviewed and summarized the primary modification indices associated with the CFA model and included these details in the supplementary note (attached). The largest modification indices were primarily related to a limited number of potentially overlapping items, suggesting possible localized areas of shared variance or conceptual overlap. However, extensive post-hoc model re-specification, such as adding multiple cross-loadings or substantially altering the original factor structure, was not pursued because our primary objective was to preserve the theoretical integrity and conceptual framework of the validated questionnaire rather than optimize statistical fit in a purely data-driven manner. Only theoretically defensible considerations, such as reviewing weaker items and acknowledging localized misfit, were evaluated. This approach helps maintain model interpretability, reduces the risk of overfitting, and ensures that findings remain aligned with the study’s original conceptual assumptions.

Comment 4: The Introduction dedicates substantial space to AI chatbots as a specific technology, whereas the constructs entered into the SEM are operationalised at a broader level not exclusively mapped onto AI-powered tools. This disconnect has not been resolved. I ask the authors to include a brief clarifying statement — in the Introduction or Methods — explicitly acknowledging this distinction and discussing its implications for the scope and interpretation of the findings.

Response 4: Thank you for this important comment. We have therefore shortened the AI-focused discussion in the Introduction. We also added a clarification in the Discussion noting that the findings should be interpreted as reflecting broader perceptions of mental health apps, rather than attitudes toward AI-powered tools specifically.

Comment 5: Supplementary Table S3 shows that 62.3% of participants rated their mental health as "Very good" or "Excellent", while only 2.2% reported "Poor" mental health. This distribution is not discussed in the main text with adequate depth. The study's findings therefore primarily reflect hypothetical intentions among students with no current psychological distress — a population arguably least likely to seek digital mental health services in practice. This must be addressed more substantively in the Discussion, particularly when drawing implications for clinical implementation.

Response 5: Thank you for this important comment. We agree that the distribution of self-rated mental health should be discussed more explicitly. We have revised the Discussion to acknowledge that most participants rated their mental health as very good or excellent, while only a small proportion reported poor mental health. Therefore, the findings should be interpreted primarily as reflecting hypothetical intentions and general acceptability of digital mental health solutions among students with relatively good perceived mental health, rather than actual service use among students experiencing current psychological distress. We have also tempered the clinical implications and noted that future studies should examine digital mental health acceptability and uptake among students with higher levels of psychological distress or established mental health needs.

Comment 6: The complete absence of missing data across all 360 participants and 27 items most likely reflects the use of mandatory response fields in Google Forms. If so, this should be stated explicitly, as forced-response designs may introduce response bias through arbitrary answers to items participants would otherwise have skipped. A brief methodological note in the Methods section is requested.

Response 6: We agree that the absence of missing data should be clarified. We have added a brief note in the Methods section stating that the Google Forms survey used mandatory response fields, which resulted in complete item-level responses. We also acknowledge that forced-response formats may introduce response bias if participants provide arbitrary answers to items they might otherwise have skipped.

Author Response File: Author Response.pdf
