Article
Peer-Review Record

The Arabic Lubben Social Network Scale-6: Psychometric Validation, Measurement Invariance, and Social Support Profiles in Arabic-Speaking Older Adults

Eur. J. Investig. Health Psychol. Educ. 2026, 16(3), 40; https://doi.org/10.3390/ejihpe16030040
by Khaled Trabelsi 1,2,*, Waqar Husain 3, Hadeel Ghazzawi 4, Zahra Saif 5, Achraf Ammar 6,7 and Haitham Jahrami 5,8,*
Reviewer 1: Anonymous
Reviewer 2:
Submission received: 30 December 2025 / Revised: 22 February 2026 / Accepted: 27 February 2026 / Published: 6 March 2026

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Please find the comments:

  1. Adding more keywords (up to ten) would be beneficial for indexation.
  2. The introduction is clear. However, long paragraphs are unwanted. Please improve readability.
  3. Lines 137-139 should be incorporated into section 2.1.
  4.  Examples of items of the used measures should be indicated.
  5. Table 1 includes variables related to mental health, physical health etc. How were these measured?
  6. "Analysis of the LSNS-6 revealed moderate levels of social connectedness..." What was the base of this claim? Moderate? Do you have norms, cut-offs? Please justify.
  7. What was the reason for analysing each item of MOS-SSS (lines 349...). There is no need. These questionnaires, with subscales, and subscale scores should be analysed. 
  8. Item 1 has a low factor loading. However, the discussion of this fact is somewhat poor. Please elaborate on the decision to keep this item in the scale.
  9. Before testing MI, the authors should have tested the model fit in each demographic/dietary habits/physical activity group (category) separately. Without these, the MI analysis is not fully reported.
  10. Convergent validity is reported, however, the overall number of correlates used is small. It is a preliminary validation study rather than a comprehensive one (as indicated in the title). The fact that different analyses were used does not mean comprehensiveness as the collected data are very limited. 
  11. LCA: A figure with profiles should be presented. 
  12. LCA: Descriptive statistics of all profiles as well as statistical comparisons between them should be provided.
  13. The discussion is overwhelmed by different and sometimes irrelevant details, such as comparisons of the scores between countries. This is beyond the topic of this paper and, more importantly, beyond the data. To support such claims, measurement invariance between countries should be established.
  14. The authors should discuss other potential issues that may have caused the low loading of item 1, for instance translation errors or ambiguities. There is no strong evidence to suggest refinement of the scale (lines 639) based on this study. What did other studies indicate about this item?
  15. The recruitment process was described poorly. 22 Arabic-speaking countries participated? Which countries? How were participants recruited? When? By whom? The sample size is small given that participants from 22 Arabic-speaking countries took part. What was the response rate?

Author Response

Authors’ Response: Dear Reviewer 1, we appreciate your positive feedback and valuable insights. Your guidance has been instrumental in enhancing our work, and we are grateful for your support. We would like to assure you that we have incorporated all of your suggestions in the revised manuscript. For the convenience of review, revisions to the manuscript appear in RED Font.

Adding more keywords (up to ten) would be beneficial for indexation.

Authors’ Response: We have now updated the keyword section to include carefully selected terms that reflect the core focus, instrument, population, methods, and key analyses of the study. The revised keywords are as follows:  Social isolation; Lubben Social Network Scale; LSNS-6; Arabic validation; Psychometric properties; Older persons; Measurement equivalence; Item response theory; Latent class analysis; Social support.

The introduction is clear. However, long paragraphs are unwanted. Please improve readability.

Authors’ Response: We have modified the Introduction and have shortened the paragraphs. The introduction now reads as follows: “ Social isolation (SIL) typically refers to the objective absence or limitation of social contact with others (National Academies of Sciences, Engineering, and Medicine [NASEM], 2020). It is characterized by having few social network ties, infrequent social interactions, or possibly living alone (Lubben et al., 2006). In contrast to loneliness, which is a subjective feeling of being isolated, SIL is an objectively quantifiable condition based on social connectivity and interaction levels (NASEM, 2020). It is a complex construct that includes both an objective lack of social ties and a subjective perception of loneliness or social disconnection (Cornwell & Waite, 2009).

The increasing recognition of SIL as a significant public health concern arises from its substantial impact on mental and physical well-being (U. S. General, 2023; NASEM, 2020; Schoenmakers et al., 2025). Previous studies have linked SIL to increased risks of cardiovascular disease (Hodgson et al., 2020; Parker et al., 2024) and cognitive decline (Livingston et al., 2024), as well as negative psychological consequences such as increased risks of depression and anxiety (Li et al., 2024), suicidal ideation (Motillon-Toudic et al., 2022), and even premature mortality (Løvsletten & Brenn, 2024). Additionally, a study revealed that individuals who experienced SIL before hospitalization in an intensive care unit faced a greater burden of disability and elevated mortality risk within a year following their critical illness (Falvey et al., 2021). The adverse impacts are especially evident in older persons, who are more vulnerable to SIL and loneliness due to life transitions such as retirement, bereavement, or reduced physical mobility (Hawkley et al., 2022; Hawkley et al., 2019). Given these consequences, it is not surprising that SIL and loneliness impose significant financial burdens on society (WHO, 2021; Schoenmakers et al., 2025). For example, a study conducted in the United Kingdom estimated that the additional costs associated with loneliness-related health and long-term care services amounted to GBP 11,725 per person over 15 years (Fulton & Jupp, 2015).

It should be noted that the NASEM indicated that approximately 24% of community-dwelling Americans aged 65 and older are classified as socially isolated (NASEM, 2020). Additionally, Zhenrong et al. (2024), in a systematic review and meta-analysis, reported a 33% prevalence of SIL among elderly individuals. This issue is likely to worsen, as the World Health Organization (WHO) highlights the rapid global growth of the population aged 60 and older, particularly in developing countries, which is expected to increase the prevalence of SIL among older adults (WHO, 2019).

These findings indicate an immediate necessity to develop standardized screening tools and targeted interventions to identify and address SIL, particularly in older adults and those with severe health conditions. Furthermore, recognizing the critical nature of this issue, the WHO has urged governments worldwide to prioritize SIL and loneliness as major public health challenges (WHO, 2021). The WHO calls for increased political engagement and resource allocation to ensure that all individuals have access to social support networks, fostering a sense of community and solidarity (WHO, 2021). Promoting social interactions and reducing loneliness fosters healthier behaviors, enhances illness recovery, and improves overall well-being by strengthening social ties, which are linked to lower mortality rates, better health, and greater emotional resilience (Uchino, 2006; Umberson & Karas Montez, 2010; Rutledge et al., 2003; TIES, 2003; Cassel, 1976). However, accurately assessing and understanding social networks is essential for identifying individuals at risk of SIL, implementing effective interventions to mitigate its negative effects, and evaluating the effectiveness of interventions targeting SIL (Pomeroy et al., 2023; Schoenmakers et al., 2025).

One widely used tool for assessing social networks and identifying SIL is the Lubben Social Network Scale (LSNS), which was developed based on social aging theories that recognize the essential role of family and friendships as both sources of support and facilitators of social engagement (Berkman et al., 2000; Berkman & Syme, 1979). Originally introduced in 1988 as a 10-item self-reported measure (Lubben, 1988), the LSNS has been refined into four versions: LSNS-12 (Kuru Alici & Kalanlar, 2021), LSNS-18 (Myagmarjav et al., 2019), LSNS-23 (Munn et al., 2018), and LSNS-6 (Lubben et al., 2006). All versions assess the frequency of social contact, network size, and the perceived support provided by these connections, making them valuable tools in gerontological research and clinical applications.

Among these versions, the six-item LSNS (LSNS-6) was specifically designed as a concise and effective assessment tool, making it particularly suitable for large-scale surveys and healthcare settings where time and resources are often limited (Lubben et al., 2006). Its shorter format reduces respondent burden, enhancing its feasibility for use in clinical and community-based settings, especially among older adults who may experience fatigue with longer questionnaires.

The LSNS-6 was originally developed in English, and to ensure its applicability in different cultural contexts, it has been translated and validated in multiple languages, predominantly in Western countries, such as Chinese (Chang et al., 2018), Japanese (Kurimoto et al., 2011), Korean (Hong et al., 2011), Mongolian (Myagmarjav et al., 2019), Spanish (Vilar-Compte et al., 2018), and Portuguese (Ribeiro et al., 2012; Tavares et al., 2023). Despite its widespread adoption, a validated Arabic version of the LSNS-6 is currently unavailable, limiting its applicability among over 400 million Arabic speakers across 22 countries (ICLS, 2024). The lack of an Arabic-translated and validated LSNS-6 makes it challenging to accurately assess SIL in Arab communities, thereby restricting efforts to develop effective interventions and public health strategies.

Furthermore, most existing validations of the LSNS-6 have relied primarily on classical test theory (CTT) approaches, focusing on scale-level indices (e.g., internal consistency, factor structure). Although CTT provides essential information about overall reliability and construct validity, it offers limited insight into how individual items function across different levels of the underlying construct. This limitation is particularly relevant for short screening instruments such as the LSNS-6, in which each item contributes substantially to measurement precision. Item response theory (IRT) provides a complementary psychometric framework by enabling item-level evaluation of discrimination and threshold parameters, thereby assessing how effectively individual items differentiate respondents across the latent continuum of social connectedness and identifying potential ceiling or floor effects (Reckase, 2006).

In addition, SIL is not a homogeneous phenomenon (Bu and Fancourt 2024). Older adults may demonstrate distinct configurations of structural networks and perceived support. Variable-centered approaches evaluate latent dimensions but do not determine whether meaningful subgroups exist within the population (Berlin et al., 2014). Latent class analysis (LCA), a person-centered technique, enables identification of clinically interpretable social support profiles (Lanza et al., 2013). Incorporating LCA allows assessment of whether the LSNS-6 can move beyond aggregate total scores to stratify individuals into meaningful risk groups, thereby enhancing its relevance for targeted public health strategies.

Accordingly, the present study translated and culturally adapted the LSNS-6 into Arabic and conducted a comprehensive psychometric evaluation among older Arabic-speaking adults. Using an integrated framework combining CTT and modern approaches, we examined internal consistency, the two-factor structure (Family and Friends), measurement invariance across key subgroups, convergent validity with the Medical Outcomes Study Social Support Survey (MOS-SSS), item-level parameters using IRT, and profiles of social connectedness and perceived social support identified through LCA.”

Lines 137-139 should be incorporated into section 2.1.

Authors’ Response: We have moved these lines to section 2.1. We also added minor modifications to text, as follows: “Establishing the psychometric properties of LSNS-6 in Arabic was essential to support reliable screening, cross-cultural research, and the development of targeted interventions to mitigate SIL and its associated health consequences in Arabic-speaking populations”.

 Examples of items of the used measures should be indicated.

Authors’ Response: Thank you for your comment. We have provided examples of items for both the measures used as follows:

“2.3.1 The Lubben Social Network Scale-6 (LSNS-6)

The Lubben Social Network Scale-6 (LSNS-6) is a validated screening tool that evaluates social network size and support through six self-reported items. It consists of two subscales: family support (Items 1–3; for example, “How many relatives do you see or hear from at least once a month?”) and friend support (Items 4–6; for example, “How many friends do you see or hear from at least once a month?”). Responses are scored on a 6-point Likert scale (0 = None to 5 = Nine or more), with total scores ranging from 0 to 30. Lower scores indicate greater SIL, with a cutoff score of less than 12 initially proposed (Lubben et al., 2006). However, this threshold may vary by context, as cultural norms can influence both network structure and reporting patterns (Hong et al., 2011). A Cronbach alpha of 0.83 has been reported in the original English version of LSNS-6 (Lubben et al., 2006).

2.3.2 The Medical Outcomes Study Social Support Survey (MOS-SSS)

To assess convergent validity, participants completed the Arabic version of the MOS-SSS, which has been previously validated in the Arabic-speaking population (Cronbach's alpha = 0.93) (Dafaalla et al., 2016). The MOS-SSS is a 19-item self-report instrument designed to assess perceived social support across four functional domains: emotional/informational support, tangible support, affectionate support, and positive social interaction. Each item is rated on a 5-point Likert scale (1 = None of the time, 5 = All of the time), with higher scores indicating greater perceived social support. Representative examples of items include: Item 2 (Emotional/Informational Support): "Someone you can count on to listen to you when you need to talk"; Item 4 (Tangible Support): "Someone to help you if you were confined to bed"; Item 5 (Affectionate Support): "Someone who shows you love and affection"; and Item 7 (Positive Social Interaction): "Someone to do things with to help you get your mind off things." The survey also includes a single-item measure of overall social support. The MOS-SSS is widely used in health and epidemiological research to evaluate the impact of social support on physical and mental health outcomes (Dao‐Tran et al., 2023).”
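
The LSNS-6 scoring rule quoted in section 2.3.1 above (six items scored 0 to 5, Items 1–3 forming the family subscale, Items 4–6 the friends subscale, and a total below 12 flagging possible isolation) can be illustrated with a minimal Python sketch. The function name `score_lsns6`, the returned dictionary keys, and the example responses are hypothetical and are not taken from the manuscript.

```python
from typing import Dict, Sequence, Union

# Cutoff initially proposed by Lubben et al. (2006): totals below 12 flag risk.
LSNS6_CUTOFF = 12


def score_lsns6(items: Sequence[int]) -> Dict[str, Union[int, bool]]:
    """Score one respondent's six LSNS-6 answers (hypothetical helper)."""
    if len(items) != 6:
        raise ValueError("LSNS-6 requires exactly six item responses")
    if any(not 0 <= x <= 5 for x in items):
        raise ValueError("each item must be scored from 0 to 5")
    family = sum(items[:3])   # Items 1-3: family subscale (0-15)
    friends = sum(items[3:])  # Items 4-6: friends subscale (0-15)
    total = family + friends  # total score (0-30)
    return {
        "family": family,
        "friends": friends,
        "total": total,
        "at_risk": total < LSNS6_CUTOFF,
    }


# Illustrative (made-up) responses for a single participant.
example = score_lsns6([3, 3, 2, 2, 2, 2])
print(example)
```

As the quoted text notes, the threshold of 12 may not transfer across cultural contexts, so the `at_risk` flag should be read as a screening indicator only.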

Table 1 includes variables related to mental health, physical health etc. How were these measured?

Authors’ Response: We clarified in the manuscript that these variables were assessed using self-reported single-item measures. Specifically, mental health was assessed with the question: “How would you rate your mental health?”, with example anchors including feelings of anxiety, stress, and general mood. Physical health was evaluated using the question: “How would you rate your physical health?”, with example anchors including level of activity, endurance capacity, and physical complaints such as pain or fatigue. Responses were recorded on a 5-point Likert scale ranging from very poor to excellent. Please see Methods section.

"Analysis of the LSNS-6 revealed moderate levels of social connectedness..." What was the base of this claim? Moderate? Do you have norms, cut-offs? Please justify.

Authors’ Response: Thank you for your comment. We provided a discussion paragraph as follows: “Descriptive analysis of the LSNS-6 revealed mild-moderate levels of social connectedness among the sample, with a mean total score of 14.04 (SD = 5.46, median = 14). This value falls above the established cutoff of <12 proposed by Lubben et al. (2006) for identifying greater risk of SIL, indicating that the majority of participants (approximately 65%, based on the normal distribution approximation with Z = -0.374) scored at or above this threshold. The near-symmetrical distribution of scores (skewness = 0.01; kurtosis = -0.27) and substantial variability further support the interpretation of moderate but heterogeneous social networks. Stronger family ties (M = 8.06) compared with friendships (M = 5.98) are consistent with collectivistic social structures in Arabic-speaking populations.” Please see Discussion section.
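
The normal-approximation step in this response can be reproduced with a short Python sketch. It uses only the reported mean (14.04), SD (5.46), and cutoff (12); the variable names are illustrative, and treating the discrete 0–30 total as exactly normal is the authors' stated approximation, not a property of the data.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function via math.erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


mean, sd, cutoff = 14.04, 5.46, 12  # values reported in the response

z = (cutoff - mean) / sd                  # standardized cutoff, about -0.374
share_at_or_above = 1.0 - normal_cdf(z)   # share at or above the cutoff

print(f"z = {z:.3f}, share at or above cutoff = {share_at_or_above:.3f}")
```

Under these assumptions the share comes out just under two thirds, in line with the "approximately 65%" quoted above; the observed proportion in the sample may differ from this approximation.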

What was the reason for analysing each item of MOS-SSS (lines 349...). There is no need. These questionnaires, with subscales, and subscale scores should be analysed. 

Authors’ Response: Thank you for this important comment. We deleted analysis of each item of MOS-SSS and retained results of total/subscales.

Item 1 has a low factor loading. However, the discussion of this fact is somewhat poor. Please elaborate on the decision to keep this item in the scale.

Authors’ Response: We thank the reviewer for highlighting this important issue. We realize that our discussion of Item 1's low factor loading (0.26) was insufficient. We have substantially expanded the discussion section to provide a more thorough analysis of this problematic item, by including the following new text:

“Within the Family subscale, items 2 and 3 demonstrated strong loadings (>0.80), whereas item 1, which assesses the frequency of contact with relatives, showed a considerably weaker loading (0.26). In contrast, all Friends items loaded strongly, particularly item 6 (0.92), reflecting the internal consistency of this subscale. Validity was further supported by acceptable AVE values (Family = 0.544, Friends = 0.673) and an HTMT of 0.712; this confirms adequate discriminant validity between the two subscales.

A notable finding from the CFA and IRT analyses concerns Item 1, which assesses the frequency of contact with relatives ("How many relatives do you see or hear from at least once a month?"). This item demonstrated a considerably weak standardized factor loading (0.26, p < 0.001) within the Family subscale, along with notably lower discrimination power in the IRT analysis (a = 0.38). These psychometric properties suggest that Item 1 functions distinctly—and less effectively—compared to Items 2 and 3 (which assess the size of the family network and closeness of family relationships, respectively).

Item deletion analysis indicated that removing Item 1 would actually improve the internal consistency of the Family subscale from α = 0.70 to α = 0.83, suggesting that this item may not cohere well with the other family-related items. The weak loading and low discrimination suggest that contact frequency alone may not be a strong indicator of meaningful family social support in Arabic-speaking cultures, where strong emotional bonds often persist despite infrequent communication. This pattern reflects the collectivistic cultural context of Arabic-speaking populations, where family relationships are characterized by deep emotional ties and interdependence that transcend frequency of contact. In such cultures, family members may maintain strong psychological and emotional connections even when face-to-face or direct communication is infrequent due to geographic distance, life circumstances, or other practical barriers.

Examination of Item 1's performance across previous LSNS-6 validations reveals considerable variability, with factor loadings ranging from 0.26 (present study) to 0.85. Specifically, the original English validation by Lubben et al. (2006) reported a factor loading of 0.77 for Item 1, indicating strong performance in the original language context. Similarly, the Chinese validation by Chang et al. (2018) demonstrated robust performance with a factor loading of 0.85, and the Mongolian validation by Myagmarjav et al. (2019) reported a loading of 0.75. However, the Korean validation by Hong et al. (2011) reported a notably lower factor loading of 0.46, intermediate between the strong performance in English, Chinese, and Mongolian contexts and the substantially weaker performance (0.26) observed in the present Arabic validation. Except for the Korean version, previous validations have consistently demonstrated moderate to strong factor loadings for Item 1 (ranging from 0.75–0.85), suggesting that the item functions adequately in most cultural and linguistic contexts. The Arabic version's loading of 0.26 represents a substantial departure from this pattern and is the lowest reported to date. The Korean validation's intermediate loading (0.46) suggests that East Asian contexts may present some challenges for this item, but the Arabic version's performance is markedly poorer. This differential performance across languages and cultures suggests that translation-specific issues or cultural factors particular to Arabic adaptation may be responsible, rather than inherent limitations of the item itself.

Despite these psychometric limitations, we elected to retain Item 1 in the Arabic LSNS-6 for several important reasons. First, maintaining the original structure of the LSNS-6 facilitates international comparisons and allows researchers to track consistency with validations in other languages and cultural contexts. Second, removing an item would alter the instrument's established scoring structure and potentially compromise its utility as a standardized screening tool. Third, the item still contributes meaningfully to the overall scale, and the weak loading may reflect a true cultural difference rather than a measurement error—specifically, that contact frequency may be less relevant than emotional closeness in assessing family social networks in this population”.

Additionally, we have mentioned this in the Limitations section as follows:

“Finally, the weak factor loading of Item 1 (assessing contact frequency with relatives) suggests that this item may not optimally capture family support in the Arabic cultural context. Future refinement of this item, potentially through cognitive interviews or qualitative research, may enhance the scale's cultural validity and measurement precision. Additionally, test-retest reliability was not assessed in this study, and future research should examine the temporal stability of the tool to ensure consistent measurement over time”.
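
The item deletion analysis mentioned in this response (alpha for the Family subscale rising when Item 1 is removed) rests on Cronbach's alpha, which for k items is k/(k − 1) × (1 − sum of item variances / variance of the total score). The sketch below is a minimal illustration of that calculation, not the authors' code; the function names and the small data set are hypothetical and do not reproduce the reported values of 0.70 and 0.83.

```python
from statistics import variance
from typing import List, Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for respondents (rows) by items (columns)."""
    k = len(rows[0])
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1.0 - sum(item_vars) / total_var)


def alpha_if_deleted(rows: Sequence[Sequence[float]], drop: int) -> float:
    """Alpha recomputed after removing the item at column index `drop`."""
    reduced: List[List[float]] = [
        [v for j, v in enumerate(r) if j != drop] for r in rows
    ]
    return cronbach_alpha(reduced)


# Made-up responses to three items; the first column is only weakly
# related to the other two, mimicking the pattern described for Item 1.
rows = [
    [3, 1, 1],
    [2, 2, 2],
    [4, 3, 3],
    [1, 4, 4],
    [3, 5, 5],
    [2, 3, 2],
]

alpha_full = cronbach_alpha(rows)
alpha_without_first = alpha_if_deleted(rows, 0)
print(f"alpha = {alpha_full:.2f}, alpha without first item = {alpha_without_first:.2f}")
```

In this toy data, dropping the weakly related first column raises alpha, which is the same direction of effect the authors report for Item 1.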

Before testing MI, the authors should have tested the model fit in each demographic/dietary habits/physical activity group (category) separately. Without these, the MI analysis is not fully reported.

Authors’ Response:

Thank you for this important comment. We agree that evaluating model fit within each subgroup is a necessary preliminary step before conducting measurement invariance testing. Accordingly, we examined the baseline CFA model fit separately for each demographic, dietary, and physical activity category. These results are now clearly reported in the Results section and presented in detail in Supplementary Table S1, which includes χ², CFI, TLI, RMSEA, and confidence intervals for each group. This ensures that MI testing was conducted only after confirming acceptable subgroup-level model fit.

Convergent validity is reported, however, the overall number of correlates used is small. It is a preliminary validation study rather than a comprehensive one (as indicated in the title). The fact that different analyses were used does not mean comprehensiveness as the collected data are very limited. 

Authors’ Response: Thank you for this helpful comment. We agree with the reviewer and have revised the Discussion accordingly. Specifically, we clarified the conceptual distinction between the LSNS-6 and the MOS-SSS by adding the following paragraph: “While the Arabic LSNS-6 and the MOS-SSS both assess aspects of social relationships and support, they capture complementary but distinct constructs, which helps explain the moderate correlations observed between them (r = 0.43–0.51 with MOS-SSS subscales; r = 0.51 with overall perceived social support). The LSNS-6 is a structurally oriented measure that quantifies the size and frequency of social network contacts (e.g., objective social connectedness with family and friends), whereas the MOS-SSS is functionally oriented, assessing the perceived availability and quality of different types of support (emotional/informational, tangible, affectionate, and positive social interaction). These conceptual differences (structural versus functional) naturally result in correlations that are positive and meaningful but not high enough to suggest redundancy.”

LCA: A figure with profiles should be presented. 

Authors’ Response: A figure illustrating class prevalence is now presented as follows.

Figure 1. Class prevalences for social connectedness and perceived social support (N = 327)

LCA: Descriptive statistics of all profiles as well as statistical comparisons between them should be provided.

Authors’ Response: We added descriptive statistics as follows: “3.7. Latent class analysis (LCA)

The LCA supported a four-class solution as the best-fitting model (AIC = 7793, BIC = 9479, entropy = 1.00, bootstrap p = 0.02). The estimated class prevalences were approximately 30% (Class 1), 18% (Class 2), 30% (Class 3), and 23% (Class 4). Detailed item-response probabilities and marginal prevalences are presented in Table 7 and Figure 1.

Class 1 showed moderate-to-high LSNS-6 probabilities and consistently strong perceived support across MOS-SSS domains. The mean LSNS-6 total score was 14.11 (SD = 4.72), and overall perceived support was relatively high (MOS-SSS mean = 64.27, SD = 9.80). Class 2 represented the lowest-support subgroup. This class had the weakest emotional/informational and affectionate support and generally lower LSNS-6 total scores. Overall functional support was reduced (MOS-SSS mean = 55.90, SD = 25.29), with substantial variability, particularly in emotional/informational support (mean = 23.52, SD = 11.66). The mean LSNS-6 total score was 12.83 (SD = 6.82). Class 3 demonstrated the highest and most consistent support levels. Members showed the highest LSNS-6 total score (mean = 15.89, SD = 5.45) and the strongest MOS-SSS scores (overall mean = 87.79, SD = 10.63), particularly for affectionate (mean = 13.98, SD = 1.91) and emotional/informational support (mean = 37.96, SD = 4.98). Class 4 showed a pragmatic pattern with moderate LSNS-6 total score (mean = 12.51, SD = 4.50) and comparatively lower functional support. Positive social interaction (mean = 6.89, SD = 1.76), affectionate support (mean = 7.29, SD = 2.21), and overall support (MOS-SSS mean = 49.53, SD = 10.46) were reduced despite adequate network size.”
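
For readers unfamiliar with the entropy value reported for the four-class solution, the following Python sketch shows one commonly used definition of relative entropy for latent class models: 1 minus the summed classification uncertainty of the posterior class probabilities, divided by N ln K. Whether the authors' software uses exactly this definition is an assumption, and the posterior matrices below are made up for illustration.

```python
import math
from typing import Sequence


def relative_entropy(posteriors: Sequence[Sequence[float]]) -> float:
    """Relative entropy from posterior class probabilities (rows = persons).

    Values near 1 indicate that respondents are assigned to classes with
    little ambiguity; values near 0 indicate near-uniform posteriors.
    """
    n = len(posteriors)
    k = len(posteriors[0])
    # Sum of -p * ln(p) over all persons and classes (0 * ln 0 taken as 0).
    uncertainty = -sum(
        p * math.log(p) for row in posteriors for p in row if p > 0
    )
    return 1.0 - uncertainty / (n * math.log(k))


# Hypothetical posteriors for a four-class model.
crisp = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]  # unambiguous assignment
fuzzy = [[0.25, 0.25, 0.25, 0.25]] * 3                 # completely ambiguous

print(relative_entropy(crisp), relative_entropy(fuzzy))
```

Under this definition, a reported entropy of 1.00 would mean that essentially every participant was assigned to a single class with posterior probability close to 1.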

 

The discussion is overwhelmed by different and sometimes irrelevant details, such as comparisons of the scores between countries. This is beyond the topic of this paper and, more importantly, beyond the data. To support such claims, measurement invariance between countries should be established.

Authors’ Response: Thank you for your comment. We have removed the comparisons of LSNS-6 mean scores across countries and revised the Discussion to focus exclusively on findings derived from the present sample.

The authors should discuss other potential issues that may have caused the low loading of item 1, for instance translation errors or ambiguities. There is no strong evidence to suggest refinement of the scale (lines 639) based on this study. What did other studies indicate about this item?

Authors’ Response: We have modified the discussion in this regard and have added comparisons for item 1 from other countries as follows:

“A notable finding from the CFA and IRT analyses concerns Item 1, which assesses the frequency of contact with relatives ("How many relatives do you see or hear from at least once a month?"). This item demonstrated a considerably weak standardized factor loading (0.26, p < 0.001) within the Family subscale, along with notably lower discrimination power in the IRT analysis (a = 0.38). These psychometric properties suggest that Item 1 functions distinctly—and less effectively—compared to Items 2 and 3 (which assess the size of the family network and closeness of family relationships, respectively).

Item deletion analysis indicated that removing Item 1 would actually improve the internal consistency of the Family subscale from α = 0.70 to α = 0.83, suggesting that this item may not cohere well with the other family-related items. The weak loading and low discrimination suggest that contact frequency alone may not be a strong indicator of meaningful family social support in Arabic-speaking cultures, where strong emotional bonds often persist despite infrequent communication. This pattern reflects the collectivistic cultural context of Arabic-speaking populations, where family relationships are characterized by deep emotional ties and interdependence that transcend frequency of contact. In such cultures, family members may maintain strong psychological and emotional connections even when face-to-face or direct communication is infrequent due to geographic distance, life circumstances, or other practical barriers.

Examination of Item 1's performance across previous LSNS-6 validations reveals considerable variability, with factor loadings ranging from 0.26 (present study) to 0.85. Specifically, the original English validation by Lubben et al. (2006) reported a factor loading of 0.77 for Item 1, indicating strong performance in the original language context. Similarly, the Chinese validation by Chang et al. (2018) demonstrated robust performance with a factor loading of 0.85, and the Mongolian validation by Myagmarjav et al. (2019) reported a loading of 0.75. However, the Korean validation by Hong et al. (2011) reported a notably lower factor loading of 0.46, intermediate between the strong performance in English, Chinese, and Mongolian contexts and the substantially weaker performance (0.26) observed in the present Arabic validation. Except for the Korean version, previous validations have consistently demonstrated moderate to strong factor loadings for Item 1 (ranging from 0.75–0.85), suggesting that the item functions adequately in most cultural and linguistic contexts. The Arabic version's loading of 0.26 represents a substantial departure from this pattern and is the lowest reported to date. The Korean validation's intermediate loading (0.46) suggests that East Asian contexts may present some challenges for this item, but the Arabic version's performance is markedly poorer. This differential performance across languages and cultures suggests that translation-specific issues or cultural factors particular to Arabic adaptation may be responsible, rather than inherent limitations of the item itself.

Despite these psychometric limitations, we elected to retain Item 1 in the Arabic LSNS-6 for several important reasons. First, maintaining the original structure of the LSNS-6 facilitates international comparisons and allows researchers to track consistency with validations in other languages and cultural contexts. Second, removing an item would alter the instrument's established scoring structure and potentially compromise its utility as a standardized screening tool. Third, the item still contributes meaningfully to the overall scale, and the weak loading may reflect a true cultural difference rather than a measurement error—specifically, that contact frequency may be less relevant than emotional closeness in assessing family social networks in this population”.

The recruitment process was described poorly. Did 22 Arabic-speaking countries participate? Which countries? How were participants recruited? When? By whom? The sample size is low, taking into account that participants from 22 Arabic-speaking countries took part. What was the response rate?

Authors’ Response:

In this cross-sectional study, participants were recruited through an online dissemination strategy using our research network across 22 Arabic-speaking countries to enhance geographic diversity. The survey link was distributed between January 2025 and March 2025 via social media platforms, including Instagram, Facebook, and Twitter/X, as well as instant messaging applications such as WhatsApp, Signal, and Viber. Members of the research team in different countries shared the invitation within professional, community, and older-adult networks. Participation was voluntary and anonymous, and eligibility criteria included being aged 60 years or older and able to read Arabic. Although recruitment targeted 22 countries, participation varied across countries, and some countries yielded no responses despite dissemination efforts. Because the survey was distributed openly online, a precise response rate could not be calculated, as the total number of individuals who viewed the invitation was unknown.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

This is a methodologically ambitious and largely well-executed psychometric validation study. The manuscript demonstrates several strengths, including a clearly articulated public health rationale, a rigorous translation and cultural adaptation process, an adequate and geographically diverse sample, and the use of both classical test theory and modern psychometric approaches (CFA, measurement invariance, IRT, and latent class analysis). The study goes beyond minimal validation requirements and, in principle, makes a meaningful contribution by providing the first validated Arabic version of the LSNS-6, a tool with clear relevance for screening social isolation among older adults. A few recommendations follow for the authors’ reference.

A key issue concerns the interpretation of model fit in the CFA. While most incremental fit indices are strong, the RMSEA is clearly above conventional thresholds and indicates model misfit. At present, the manuscript downplays this inconsistency. The authors should explicitly acknowledge the elevated RMSEA, briefly explain why conflicting fit indices may occur in short scales with few degrees of freedom, and moderate claims about “strong” or “excellent” model fit. Alternatively, reporting and testing a minor model modification (e.g., theoretically justified correlated residuals) could be considered, provided it is clearly justified and transparently reported.

Relatedly, Item 1 consistently performs weakly across analyses, showing low factor loading, high residual variance, low discrimination in IRT, and improved reliability when removed. At present, this is described descriptively but not followed through conceptually. The authors should either (a) provide a stronger theoretical explanation for why this item behaves differently in Arabic-speaking contexts (e.g., cultural norms around extended family contact), or (b) explicitly discuss the implications for future scale refinement, including whether a revised Arabic version of the LSNS-5 or an adapted wording of Item 1 should be explored in subsequent studies.

The measurement invariance results, while impressive in scope, are, at the moment, presented in a way that may overstate robustness. Several subgroup models show RMSEA values well above acceptable thresholds, even when invariance is claimed. The authors should clarify that invariance conclusions are based primarily on ΔCFI and ΔRMSEA criteria rather than absolute model fit, and briefly acknowledge that some subgroup fits were suboptimal. This clarification would improve methodological transparency and credibility.

The latent class analysis, although interesting, currently feels under-integrated with the rest of the paper. The authors should more clearly justify why LCA was included in a validation study and explicitly link the identified classes to practical screening or intervention implications. For example, consider explaining how the “low-support” class could be used to inform targeted public health interventions to strengthen the applied contribution.

The sampling and recruitment strategy also requires more cautious framing. Although participants were recruited across multiple countries, the use of online and social media-based recruitment introduces selection bias, which is not a fundamental flaw that undermines the study integrity; however, the author should explicitly acknowledge/address this. In this sense, the manuscript should avoid terms such as “representative” and instead explicitly describe the sample as a convenience sample, noting how digital access, literacy, and health status may have shaped participation.

In addition, the Discussion section could be more synthetic. Rather than reiterating results, it should prioritise the key take-home messages, e.g., the robustness of the Friends subscale, the cultural sensitivity of family-based items, and the implications of item-level findings for cross-cultural measurement of social isolation.

Author Response

This is a methodologically ambitious and largely well-executed psychometric validation study. The manuscript demonstrates several strengths, including a clearly articulated public health rationale, a rigorous translation and cultural adaptation process, an adequate and geographically diverse sample, and the use of both classical test theory and modern psychometric approaches (CFA, measurement invariance, IRT, and latent class analysis). The study goes beyond minimal validation requirements and, in principle, makes a meaningful contribution by providing the first validated Arabic version of the LSNS-6, a tool with clear relevance for screening social isolation among older adults. A few recommendations follow for the authors’ reference.

Authors’ Response: Dear Reviewer 2, we appreciate your positive feedback and valuable insights. Your guidance has been instrumental in enhancing our work, and we are grateful for your support. We would like to assure you that we have incorporated all of your suggestions in the revised manuscript. For the convenience of review, revisions to the manuscript appear in RED Font.

A key issue concerns the interpretation of model fit in the CFA. While most incremental fit indices are strong, the RMSEA is clearly above conventional thresholds and indicates model misfit. At present, the manuscript downplays this inconsistency. The authors should explicitly acknowledge the elevated RMSEA, briefly explain why conflicting fit indices may occur in short scales with few degrees of freedom, and moderate claims about “strong” or “excellent” model fit. Alternatively, reporting and testing a minor model modification (e.g., theoretically justified correlated residuals) could be considered, provided it is clearly justified and transparently reported.

Authors’ Response: Thank you for this critical observation. You are correct that our interpretation of model fit was inconsistent and potentially misleading. While we highlighted strong CFI, TLI, and SRMR values, we inadequately addressed the elevated RMSEA (0.113, 90% CI [0.081–0.148]), which exceeded the conventional threshold of 0.08. We have substantially revised our discussion section and have added the following text:

“Model fit indices, including CFI (0.963), TLI (0.931), and SRMR (0.03), indicated good overall model fit, although RMSEA (0.113) exceeded the ideal threshold. This pattern is not uncommon in CFA of brief scales with limited degrees of freedom. Several factors contribute to this phenomenon: (1) RMSEA is particularly sensitive to model complexity and degrees of freedom—short scales with few items naturally have minimal degrees of freedom (in this case, df = 8), which can lead to elevated RMSEA values even when the model is otherwise well-specified; (2) incremental fit indices (CFI, TLI) compare the specified model to a baseline model and may be less sensitive to absolute fit departures, particularly in smaller models; (3) brief screening instruments like the LSNS-6 are necessarily parsimonious and may not capture the complexity of constructs, leading to some residual misfit; and (4) RMSEA is also known to penalize simpler models relative to more saturated models, which can inflate RMSEA values for instruments designed to be brief and efficient (Kenny et al., 2015; McNeish et al., 2018). Given these considerations, the moderate-to-good performance on incremental fit indices and absolute fit indices (SRMR, TLI, CFI), combined with the strong factor loadings and theoretical coherence of the two-factor structure, suggests that the two-factor model provides an adequate fit to the data”.

References: Kenny, D.A.; Kaniskan, B.; McCoach, D.B. (2015). The Performance of RMSEA in Confirmatory Factor Analysis: Traditional Criteria Versus New Guidelines. Structural Equation Modeling: A Multidisciplinary Journal, 22(3), 449–458.
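The sensitivity of RMSEA to degrees of freedom described in the quoted text can be illustrated with a minimal sketch. It counts the degrees of freedom of a two-factor, six-item CFA and evaluates the common point-estimate formula RMSEA = sqrt(max(χ² − df, 0) / (df × (N − 1))). The χ² values and N = 300 are hypothetical illustration values, not figures from the manuscript, and some software uses N rather than N − 1 in the denominator.

```python
import math

def cfa_df(n_items: int, n_free_params: int) -> int:
    """Degrees of freedom = unique variances/covariances minus free parameters."""
    n_moments = n_items * (n_items + 1) // 2
    return n_moments - n_free_params

def rmsea(chi2: float, df: int, n: int) -> float:
    """Point estimate of RMSEA (N - 1 convention)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

# Two correlated factors, three items each, factor variances fixed to 1:
# 6 loadings + 6 residual variances + 1 factor covariance = 13 free parameters.
df_lsns6 = cfa_df(n_items=6, n_free_params=13)  # 21 - 13 = 8

# Hypothetical comparison: identical excess misfit (chi2 - df = 30) and
# identical N, but a short scale (df = 8) versus a longer one (df = 80).
n = 300
short_scale = rmsea(chi2=df_lsns6 + 30, df=df_lsns6, n=n)
long_scale = rmsea(chi2=80 + 30, df=80, n=n)
print(df_lsns6, round(short_scale, 3), round(long_scale, 3))
```

Because the excess misfit is divided by df, the same amount of misfit yields a much larger RMSEA for the short scale than for the longer one.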

Relatedly, Item 1 consistently performs weakly across analyses, showing low factor loading, high residual variance, low discrimination in IRT, and improved reliability when removed. At present, this is described descriptively but not followed through conceptually. The authors should either (a) provide a stronger theoretical explanation for why this item behaves differently in Arabic-speaking contexts (e.g., cultural norms around extended family contact), or (b) explicitly discuss the implications for future scale refinement, including whether a revised Arabic version of the LSNS-5 or an adapted wording of Item 1 should be explored in subsequent studies.

Authors’ Response: We thank you for highlighting this important issue. We realize that our discussion of Item 1's low factor loading (0.26) was insufficient. We have substantially expanded the discussion section to provide a more thorough analysis of this problematic item, by including the following new text:

“Within the Family subscale, items 2 and 3 demonstrated strong loadings (>0.80), whereas item 1, which assesses the frequency of contact with relatives, showed a considerably weaker loading (0.26). In contrast, all Friends items loaded strongly, particularly item 6 (0.92), reflecting the internal consistency of this subscale. Validity was further supported by acceptable AVE values (Family = 0.544, Friends = 0.673) and an HTMT of 0.712; this confirms adequate discriminant validity between the two subscales.

A notable finding from the CFA and IRT analyses concerns Item 1, which assesses the frequency of contact with relatives ("How many relatives do you see or hear from at least once a month?"). This item demonstrated a considerably weak standardized factor loading (0.26, p < 0.001) within the Family subscale, along with notably lower discrimination power in the IRT analysis (a = 0.38). These psychometric properties suggest that Item 1 functions distinctly—and less effectively—compared to Items 2 and 3 (which assess the size of the family network and closeness of family relationships, respectively).

Item deletion analysis indicated that removing Item 1 would actually improve the internal consistency of the Family subscale from α = 0.70 to α = 0.83, suggesting that this item may not cohere well with the other family-related items. The weak loading and low discrimination suggest that contact frequency alone may not be a strong indicator of meaningful family social support in Arabic-speaking cultures, where strong emotional bonds often persist despite infrequent communication. This pattern reflects the collectivistic cultural context of Arabic-speaking populations, where family relationships are characterized by deep emotional ties and interdependence that transcend frequency of contact. In such cultures, family members may maintain strong psychological and emotional connections even when face-to-face or direct communication is infrequent due to geographic distance, life circumstances, or other practical barriers.

Examination of Item 1's performance across previous LSNS-6 validations reveals considerable variability, with factor loadings ranging from 0.26 (present study) to 0.85. Specifically, the original English validation by Lubben et al. (2006) reported a factor loading of 0.77 for Item 1, indicating strong performance in the original language context. Similarly, the Chinese validation by Chang et al. (2018) demonstrated robust performance with a factor loading of 0.85, and the Mongolian validation by Myagmarjav et al. (2019) reported a loading of 0.75. However, the Korean validation by Hong et al. (2011) reported a notably lower factor loading of 0.46, intermediate between the strong performance in English, Chinese, and Mongolian contexts and the substantially weaker performance (0.26) observed in the present Arabic validation. Except for the Korean version, previous validations have consistently demonstrated moderate to strong factor loadings for Item 1 (ranging from 0.75–0.85), suggesting that the item functions adequately in most cultural and linguistic contexts. The Arabic version's loading of 0.26 represents a substantial departure from this pattern and is the lowest reported to date. The Korean validation's intermediate loading (0.46) suggests that East Asian contexts may present some challenges for this item, but the Arabic version's performance is markedly poorer. This differential performance across languages and cultures suggests that translation-specific issues or cultural factors particular to Arabic adaptation may be responsible, rather than inherent limitations of the item itself.

Despite these psychometric limitations, we elected to retain Item 1 in the Arabic LSNS-6 for several important reasons. First, maintaining the original structure of the LSNS-6 facilitates international comparisons and allows researchers to track consistency with validations in other languages and cultural contexts. Second, removing an item would alter the instrument's established scoring structure and potentially compromise its utility as a standardized screening tool. Third, the item still contributes meaningfully to the overall scale, and the weak loading may reflect a true cultural difference rather than a measurement error—specifically, that contact frequency may be less relevant than emotional closeness in assessing family social networks in this population”.
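The effect of one weak loading on average variance extracted (AVE) can be sketched as follows, taking AVE as the mean of the squared standardized loadings. The Item 1 loading (0.26) comes from the response above; the value 0.88 for Items 2 and 3 is a hypothetical stand-in consistent with the reported ">0.80", so the result only approximates the reported Family AVE of 0.544.

```python
def ave(loadings):
    """Average variance extracted: mean of squared standardized loadings."""
    return sum(l * l for l in loadings) / len(loadings)

# Item 1 = 0.26 (reported); Items 2 and 3 = 0.88 (hypothetical, ">0.80").
family_loadings = [0.26, 0.88, 0.88]

ave_full = ave(family_loadings)
ave_without_item1 = ave(family_loadings[1:])
print(round(ave_full, 3), round(ave_without_item1, 3))
```

Under these assumed loadings the subscale still clears the conventional 0.50 AVE threshold with Item 1 retained, while the AVE rises noticeably once Item 1 is dropped.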

Additionally, we have mentioned this in the Limitations section as follows:

“Finally, the weak factor loading of Item 1 (assessing contact frequency with relatives) suggests that this item may not optimally capture family support in the Arabic cultural context. Future refinement of this item, potentially through cognitive interviews or qualitative research, may enhance the scale's cultural validity and measurement precision. Additionally, test-retest reliability was not assessed in this study, and future research should examine the temporal stability of the tool to ensure consistent measurement over time”.
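The item deletion analysis mentioned in the response (Cronbach's α rising when Item 1 is removed) can be illustrated with a minimal sketch on synthetic data. The data generator, the loadings of 0.26 and 0.88, and the sample size are assumptions chosen to mimic the reported pattern; they are not the study data.

```python
import random
import statistics

def cronbach_alpha(columns):
    """Cronbach's alpha; `columns` is a list of equally long item-score lists."""
    k = len(columns)
    item_var_sum = sum(statistics.variance(col) for col in columns)
    totals = [sum(row) for row in zip(*columns)]
    return k / (k - 1) * (1 - item_var_sum / statistics.variance(totals))

random.seed(0)
n = 1000
latent = [random.gauss(0, 1) for _ in range(n)]
# Synthetic items: item 1 is weakly tied to the latent factor, items 2-3 strongly.
item1 = [0.26 * f + random.gauss(0, 0.97) for f in latent]
item2 = [0.88 * f + random.gauss(0, 0.47) for f in latent]
item3 = [0.88 * f + random.gauss(0, 0.47) for f in latent]
items = [item1, item2, item3]

alpha_full = cronbach_alpha(items)
# Alpha recomputed with each item left out in turn.
alpha_if_deleted = [
    cronbach_alpha(items[:j] + items[j + 1:]) for j in range(len(items))
]
print(round(alpha_full, 2), [round(a, 2) for a in alpha_if_deleted])
```

In this setup, dropping the weakly related item raises α, whereas dropping either strongly related item lowers it, which is the same direction of effect the authors report for the Family subscale.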

The measurement invariance results, while impressive in scope, are, at the moment, presented in a way that may overstate robustness. Several subgroup models show RMSEA values well above acceptable thresholds, even when invariance is claimed. The authors should clarify that invariance conclusions are based primarily on ΔCFI and ΔRMSEA criteria rather than absolute model fit, and briefly acknowledge that some subgroup fits were suboptimal. This clarification would improve methodological transparency and credibility.

Authors’ Response: We have revised the text to clarify that invariance conclusions rely on fit index changes (ΔCFI, ΔRMSEA) rather than absolute fit, and we now explicitly acknowledge instances where subgroup models showed elevated RMSEA values. The following text has been added to the Discussion:

“While the multi-group CFA supported full measurement invariance across examined subgroups based on hierarchical model comparisons, we acknowledge important methodological considerations. Invariance conclusions are based primarily on changes in fit indices (ΔCFI ≤ 0.01, ΔRMSEA ≤ 0.015) between nested models rather than absolute model fit. Several subgroup models, particularly for dietary habits and physical activity levels, demonstrated RMSEA values exceeding conventional thresholds (ranging from 0.10 to 0.15 across subgroups), despite meeting invariance criteria through relative fit comparisons. This pattern reflects the inherent challenges of fitting brief instruments to multiple subgroups with smaller sample sizes. The invariance conclusions indicate that the factor structure and loadings function equivalently across groups, a prerequisite for valid group comparisons, but do not imply absolute model fit excellence within each subgroup. This distinction is important for interpretation: observed differences in latent means across groups reflect true differences in the construct rather than measurement artifacts, yet absolute fit statistics suggest some residual misspecification remains. Future research with larger subgroup samples may further refine understanding of subgroup-specific model fit.”.
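The distinction drawn in the quoted text between relative and absolute fit can be made concrete with a minimal sketch of the decision rule (CFI may drop by at most 0.01 and RMSEA may rise by at most 0.015 between nested models). The fit values, the `Fit` class, and the 0.08 RMSEA cutoff used for the absolute check are illustrative assumptions, not results from the manuscript.

```python
from dataclasses import dataclass

@dataclass
class Fit:
    name: str
    cfi: float
    rmsea: float

def invariance_steps(models, max_cfi_drop=0.01, max_rmsea_rise=0.015,
                     rmsea_cutoff=0.08):
    """Compare each nested model with the previous, less constrained one."""
    results = []
    for prev, curr in zip(models, models[1:]):
        d_cfi = curr.cfi - prev.cfi
        d_rmsea = curr.rmsea - prev.rmsea
        results.append({
            "step": f"{prev.name} -> {curr.name}",
            # Relative criterion: small change between nested models.
            "invariance_supported": d_cfi >= -max_cfi_drop
                                    and d_rmsea <= max_rmsea_rise,
            # Absolute criterion: is the constrained model itself well fitting?
            "absolute_fit_ok": curr.rmsea <= rmsea_cutoff,
        })
    return results

# Hypothetical fit indices for one grouping variable.
models = [
    Fit("configural", cfi=0.960, rmsea=0.120),
    Fit("metric", cfi=0.958, rmsea=0.112),
    Fit("scalar", cfi=0.955, rmsea=0.105),
]
steps = invariance_steps(models)
for step in steps:
    print(step)
```

With these assumed values every step passes the change-based criteria while failing the absolute RMSEA check, which is the situation the reviewer asked the authors to make explicit.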

The latent class analysis, although interesting, currently feels under-integrated with the rest of the paper. The authors should more clearly justify why LCA was included in a validation study and explicitly link the identified classes to practical screening or intervention implications. For example, consider explaining how the “low-support” class could be used to inform targeted public health interventions to strengthen the applied contribution.

Authors’ Response: You raise a valid point about the disconnect between our LCA methodology and the validation study's primary aims. We have revised the Introduction and Discussion sections to: (1) explicitly justify LCA's inclusion in a validation framework; (2) clearly link the identified classes to practical screening and intervention applications; and (3) emphasize how the vulnerable low-support class can inform targeted public health strategies. We have made the following changes by adding new text as follows:

In the Introduction section, the following text has been added:

“In addition, SIL is not a homogeneous phenomenon (Bu and Fancourt 2024). Older adults may demonstrate distinct configurations of structural networks and perceived support. Variable-centered approaches evaluate latent dimensions but do not determine whether meaningful subgroups exist within the population (Berlin et al., 2014). Latent class analysis (LCA), a person-centered technique, enables identification of clinically interpretable social support profiles (Lanza et al., 2013). Incorporating LCA allows assessment of whether the LSNS-6 can move beyond aggregate total scores to stratify individuals into meaningful risk groups, thereby enhancing its relevance for targeted public health strategies.”

In section 3.7 dealing with the results of LCA, we added the following concluding text:

“The LCA supported a four-class solution as the best-fitting model (AIC = 7793, BIC = 9479, entropy = 1.00, bootstrap p = 0.02). The estimated class prevalences were approximately 30% (Class 1), 18% (Class 2), 30% (Class 3), and 23% (Class 4). Detailed item-response probabilities and marginal prevalences are presented in Table 7 and Figure 1.

Class 1 showed moderate-to-high LSNS-6 probabilities and consistently strong perceived support across MOS-SSS domains. The mean LSNS-6 total score was 14.11 (SD = 4.72), and overall perceived support was relatively high (MOS-SSS mean = 64.27, SD = 9.80).

Class 2 represented the lowest-support subgroup. This class had the weakest emotional/informational and affectionate support and generally lower LSNS-6 total scores. Overall functional support was reduced (MOS-SSS mean = 55.90, SD = 25.29), with substantial variability, particularly in emotional/informational support (mean = 23.52, SD = 11.66). The mean LSNS-6 total score was 12.83 (SD = 6.82).

Class 3 demonstrated the highest and most consistent support levels. Members showed the highest LSNS-6 total score (mean = 15.89, SD = 5.45) and the strongest MOS-SSS scores (overall mean = 87.79, SD = 10.63), particularly for affectionate (mean = 13.98, SD = 1.91) and emotional/informational support (mean = 37.96, SD = 4.98).

Class 4 showed a pragmatic pattern with moderate LSNS-6 total score (mean = 12.51, SD = 4.50) and comparatively lower functional support. Positive social interaction (mean = 6.89, SD = 1.76), affectionate support (mean = 7.29, SD = 2.21), and overall support (MOS-SSS mean = 49.53, SD = 10.46) were reduced despite adequate network size.”.

In the Discussion section, we added the following text: “The LCA revealed four meaningful and clinically interpretable profiles of social connectedness and perceived support among older Arabic-speaking adults, extending the findings from classical and IRT analyses. Approximately 60% of the sample belonged to well-supported classes (Class 1 and Class 3), characterized by strong overall networks and high affectionate/positive interaction support, which aligns with the cultural emphasis on familial collectivism previously observed in the sample. These groups would benefit primarily from maintenance-oriented strategies that preserve existing social ties and encourage continued participation in community and family activities. In contrast, the identification of a substantial vulnerable subgroup (Class 2, ≈18%) with particularly low emotional and affectionate support highlights an important at-risk population that may be overlooked when relying solely on total scores or mean comparisons. The emergence of a tangible-focused intermediate class (Class 4) further suggests heterogeneity in how older adults experience practical versus emotional support, potentially reflecting generational or situational differences in family roles. These findings reinforce the multidimensional nature of SIL in collectivistic contexts and provide a foundation for developing targeted, profile-specific interventions to reduce SIL and its associated health risks.”

In Section 4.1, Implications for research and public health interventions, we added the following text:

“The identification of meaningful social support profiles through LCA enhances the Arabic LSNS-6's utility as a screening tool. Beyond identifying SIL via total scores, the scale can now inform stratified intervention approaches. Individuals in the low-support class (Class 2) would benefit from interventions emphasizing emotional connection and affective support, such as peer support groups, counseling services, or community programs fostering meaningful relationships. Those in the tangible-focused class (Class 4) might benefit from interventions strengthening emotional reciprocity within existing practical relationships. This profile-based approach represents an advance over aggregate screening, enabling health systems to allocate limited intervention resources toward populations with the greatest need for specific dimensions of social support.”.

The sampling and recruitment strategy also requires more cautious framing. Although participants were recruited across multiple countries, the use of online and social media-based recruitment introduces selection bias, which is not a fundamental flaw that undermines the study integrity; however, the author should explicitly acknowledge/address this. In this sense, the manuscript should avoid terms such as “representative” and instead explicitly describe the sample as a convenience sample, noting how digital access, literacy, and health status may have shaped participation.

Authors’ Response:

Thank you for this important observation. We agree with the reviewer that online and social media–based recruitment may introduce selection bias due to differences in digital access, literacy, and health status among older adults. We have therefore revised the manuscript to avoid the term ‘representative’ and now refer to the sample as a convenience sample. We also clarified in the Limitations section that online recruitment may have shaped participation and may underrepresent older adults with limited digital access.

In addition, the Discussion section could be more synthetic. Rather than reiterating results, it should prioritise the key take-home messages, e.g., the robustness of the Friends subscale, the cultural sensitivity of family-based items, and the implications of item-level findings for cross-cultural measurement of social isolation.

Authors’ Response: We have restructured the Discussion to prioritize major take-home messages: the robust performance of the Friends subscale, cultural specificity of family-centered support patterns, and implications for cross-cultural measurement of social isolation. The Discussion begins with the following new paragraph now:

“This study presents the first validation of the Arabic version of the LSNS-6 and yields three principal findings with implications for assessing SIL in Arabic-speaking populations. First, the Friends subscale demonstrated consistently strong psychometric performance across classical test theory and IRT analyses. Factor loadings ranged from 0.72 to 0.92, internal consistency was robust (ω = 0.86), and discrimination parameters were high, particularly for Item 6 (a = 12.61). These findings indicate that friendship-based social networks are reliably and sensitively captured in this cultural context. Second, the Family subscale revealed culturally specific response patterns. Items assessing emotional closeness and family network size (Items 2 and 3) performed well psychometrically, whereas the contact frequency item (Item 1) showed weaker performance. This pattern likely reflects collectivistic cultural norms in which strong family bonds are maintained independently of frequent communication. This finding challenges frequency-based assumptions about family relationships common in Western instruments and underscores the necessity of culturally informed measurement development. Third, the identification of four distinct social support profiles, including a vulnerable 18% low-support subgroup, demonstrates that the Arabic LSNS-6 can stratify individuals for targeted intervention beyond binary isolation classifications. These findings collectively advance understanding of how social network assessment instruments must be adapted for collectivistic contexts and how brief screening tools can inform precision public health approaches.”

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

Thank you for revisions. The paper has been improved significantly. However, Figure 1 is somewhat irrelevant. I feel that comment "LCA: A figure with profiles should be presented." was not addressed properly. It will be better to create a figure, representing profiles and their variable levels, with mean or standardised z-scores of each profile. 

Author Response

We thank Reviewer 1 for the positive evaluation of the revised manuscript and for the constructive suggestions provided. We have carefully considered the comment and made the requested improvements.

Comment: Thank you for revisions. The paper has been improved significantly. However, Figure 1 is somewhat irrelevant. I feel that comment "LCA: A figure with profiles should be presented." was not addressed properly. It will be better to create a figure, representing profiles and their variable levels, with mean or standardised z-scores of each profile. 

Response: Thank you for your comment. We have now incorporated a new Figure 1 that presents the standardized z-score profiles of the four latent classes across all indicators included in the LCA (LSNS-6 total score and the MOS-SSS subscales). 
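As a rough illustration of what a standardized profile figure encodes, the sketch below approximates class-level z-scores from the class summaries quoted in the Round 1 response. It assumes that the reported SDs are within-class SDs and that the rounded prevalences (which sum to 101% and are renormalized here) can serve as weights; the pooled SD is obtained from the law of total variance. The authors' figure was presumably computed from the raw data, so these values are only an approximation.

```python
import math

# Class prevalences (%) and (mean, SD) pairs as quoted in the Round 1 response.
prevalence = {"Class 1": 30, "Class 2": 18, "Class 3": 30, "Class 4": 23}
summaries = {
    "LSNS-6 total": {"Class 1": (14.11, 4.72), "Class 2": (12.83, 6.82),
                     "Class 3": (15.89, 5.45), "Class 4": (12.51, 4.50)},
    "MOS-SSS overall": {"Class 1": (64.27, 9.80), "Class 2": (55.90, 25.29),
                        "Class 3": (87.79, 10.63), "Class 4": (49.53, 10.46)},
}

def z_profiles(class_stats, weights):
    """Standardize class means against the pooled mean and pooled SD."""
    total = sum(weights.values())
    w = {c: weights[c] / total for c in weights}  # renormalize rounded weights
    grand_mean = sum(w[c] * class_stats[c][0] for c in w)
    within = sum(w[c] * class_stats[c][1] ** 2 for c in w)
    between = sum(w[c] * (class_stats[c][0] - grand_mean) ** 2 for c in w)
    pooled_sd = math.sqrt(within + between)  # law of total variance
    return {c: (class_stats[c][0] - grand_mean) / pooled_sd for c in w}

profiles = {name: z_profiles(stats, prevalence)
            for name, stats in summaries.items()}
for name, zs in profiles.items():
    print(name, {c: round(z, 2) for c, z in zs.items()})
```

Plotting one line per class across the indicators, with these z-scores on the vertical axis, gives the kind of profile plot the reviewer requested.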
