Application of the Delphi Method for Content Validity Analysis of a Questionnaire to Determine the Risk Factors of the Chemsex

Chemsex is understood as “the intentional use of stimulant drugs to have sex for an extended time among gay, bisexual, and other men who have sex with men”. It is a public health problem because of the increased incidence of cases and because of the consequences on the physical and mental health of those who practice it. Aim: This study aimed to analyze, with the help of the Delphi method, the content validity of a new instrument to assess the risk of behaviors associated with the chemsex phenomenon. Method: First, a bank of items identified from the literature was elaborated. Secondly, 50 experts with knowledge of the chemsex phenomenon at the national level were contacted. A Delphi group was formed with them to carry out two rounds of item evaluation. The linguistic evaluation (comprehension and appropriateness) was assessed using a Likert scale from 1 to 5 for each item. Items that did not reach a mean score of 4 were eliminated. Content assessment was calculated using each item’s content validity index (CVI) and Aiken’s V (VdA). A minimum CVI and VdA value of 0.6 was established to include the items in the questionnaire. Results: A total of 114 items were identified in the literature. In the first round of Delphi evaluation, 36 experts evaluated the items. A total of 58 items were eliminated for obtaining a CVI or VdA of less than 0.6, leaving 56 items. In a second Delphi round, 30 experts re-evaluated the 56 selected items; 4 items were eliminated, leaving 52, and 10 further items were eliminated for not being relevant to the topic even though they had values higher than 0.6, leaving a final scale of 42 items. Conclusion: A questionnaire has been designed to assess the risk of behaviors associated with the chemsex phenomenon. The items that make up the questionnaire have shown adequate content and linguistic validity. The Delphi method proved to be a helpful technique for the proposed objective.


Introduction
Chemsex is understood as "the intentional use of stimulant drugs to have sex for a long period among gay, bisexual, and other men who have sex with men" and is a public health problem because of the increased incidence of cases [1,2] and because of the consequences on the physical and mental health of those who practice it [3][4][5][6]. According to Madrid's 2022-2026 Addictions Plan, the number of people receiving care in connection with chemsex rose from 50 in 2017 to 351 in 2021, an increase of 602% [7].
It should also be noted that the increased use of geolocation applications has led to an increase in unprotected sexual encounters, which is associated with increased risk behaviors for sexually transmitted infections (STIs) [4,8]. Previous studies associate chemsex with the preferential use of substances such as mephedrone, methamphetamine, cocaine, and GHB/GBL [9], and with the pursuit of emotions and pleasurable sensations and the management of negative symptoms [10]. Other studies indicate that individuals with problems related to the substances used in chemsex may have experienced early adverse events and may have an avoidant insecure attachment style. In addition, those who have been diagnosed with HIV may show greater emotional dysregulation and worse self-care patterns. These variables should be routinely assessed in this population [11].
The assessment of mood disorders and addiction linked to the practice of chemsex is of interest given the psychoactive substances used [11]. The practice of chemsex has been linked to increased suicide, sexually transmitted infections (STIs) [12][13][14][15], psychosis [16], and mental health problems [9], and to decreased adherence to pre-exposure prophylaxis (PrEP), among others [17].
It is important to raise awareness of chemsex as a public health problem among gay, bisexual, and other men who have sex with men (GBMSM). Specific identification, education [18], and prevention programs need to be strengthened to reduce the incidence of the most undesirable consequences of sexualized drug use (SDU) among GBMSM [10].
The literature indicates the importance of self-monitoring for reducing the harm associated with chemsex [19], as well as the development of programs that use computer applications [18] and other applications (apps) [20,21] to support and inform participants, reduce the negative impacts associated with chemsex, and encourage more reasoned participation. On the other hand, the lack of knowledge among professionals in our country [22] regarding this growing problem should be considered.
Thus, Nagington's 2022 study suggests that medicalized forms of chemsex support could benefit from more rigorous and rapid forms of assessment for problematic chemsex, and from infrastructure and training for peer support initiatives. It also suggests that medical services can learn from patients and their peers about support needs that professional services continue to miss, and can engage in collaborative approaches to practice development [23]. Early detection and knowledge of risk factors can contribute to the reinforcement of accessible, non-judgmental, and well-informed prevention and harm reduction activities to support MSM who engage in slamsex [24]. In addition, an equity-oriented approach should be adopted to facilitate unbiased care opportunities [25].
This phenomenon has raised public health concerns, as it can lead to risky behaviors, such as unsafe sexual practices and an increase in sexually transmitted infections, as well as addictions and mental health problems. Chemsex raises essential questions about the intersection between sexuality, drug use, and health and highlights the need to address this issue comprehensively, providing support and education to those who engage in these practices to minimize the associated risks.
However, to date, the risk factors for chemsex have not been specifically addressed, nor is there currently any instrument available to assess them. This can slow down the detection of warning signs by the health professionals who work with and care for people who practice chemsex, and who understand that the quality of life of these people is fundamental. Therefore, it is necessary to know the predictive behaviors to improve their care and holistic well-being. To respond to these needs, this study aimed to design a questionnaire to detect risk factors for chemsex.

Design
The study was instrumental in type; according to Montero and León (2005), instrumental research develops tests and devices, covering both their design or adaptation and the study of their psychometric properties. The scale will, therefore, be subject to validation in Spanish.

Item Bank Construction
The questionnaire was developed between July and August 2022. A literature search was conducted, and reference studies related to the research topic were analyzed, verifying the absence of instruments available for the study. The initial version of the questionnaire, entitled "Chem-Sex Inventory" (CSI), was organized into six sections consisting of 114 items drawn from various validated scales. It is important to note that the scales were not used as such; instead, items were selected from each questionnaire to create the new instrument. All the tools used to develop the CSI questionnaire have been validated in the Spanish environment. The first block related to anxiety and consisted of 7 items (items 1-7) (GAD-7) [26]. The second block related to depression (PHQ-9) [27] and consisted of 9 items (items 8-16). The third block related to the risk of psychosis and drew on the PQ-B [28], the Corrigan Scale [29], and the CAPE-15 scale [30], which contributed 21 items (items 17-37), 14 items of which 12 were retained (items 88-99), and 15 items (items 100-114), respectively. The fourth block related to impulsivity (BIS-11) [31] and consisted of 30 items (items 38-67). The fifth block related to body perception (PHQ-15) [32] and consisted of 15 items (items 68-82). Finally, the sixth block related to suicide risk and consisted of 5 items (items 83-87) (Paykel Scale) [33]. The researchers initially eliminated two items associated with hospital admission from the Corrigan Scale.
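The block layout described above can be checked with a short sketch. The tuple layout and the function names `block_sizes` and `is_contiguous` are illustrative assumptions, not part of the study; the item ranges are the ones quoted in the text. Counting from the ranges, the Corrigan Scale range 88-99 spans 12 items, which matches the 14 original items minus the two that were removed.

```python
# (block topic, source scale, first item number, last item number)
# Ranges are taken from the text; the data structure itself is hypothetical.
ITEM_BLOCKS = [
    ("anxiety", "GAD-7", 1, 7),
    ("depression", "PHQ-9", 8, 16),
    ("risk of psychosis", "PQ-B", 17, 37),
    ("impulsivity", "BIS-11", 38, 67),
    ("body perception", "PHQ-15", 68, 82),
    ("suicide risk", "Paykel Scale", 83, 87),
    ("risk of psychosis", "Corrigan Scale", 88, 99),
    ("risk of psychosis", "CAPE-15", 100, 114),
]


def block_sizes(blocks):
    """Return the number of items contributed by each source scale."""
    return {scale: last - first + 1 for _, scale, first, last in blocks}


def is_contiguous(blocks, total):
    """Check that the ranges cover items 1..total with no gap or overlap."""
    expected_next = 1
    for _, _, first, last in sorted(blocks, key=lambda b: b[2]):
        if first != expected_next:
            return False
        expected_next = last + 1
    return expected_next == total + 1


sizes = block_sizes(ITEM_BLOCKS)
for scale, size in sizes.items():
    print(f"{scale}: {size} items")
print("total items:", sum(sizes.values()))
print("covers 1-114 without gaps:", is_contiguous(ITEM_BLOCKS, 114))
```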
Each research committee member, including all the study's authors, collaborated in developing the new questionnaire, defining its structure and main characteristics, selecting the items, and reviewing them.

Selection of Experts
Fifty experts on the subject were contacted. Experts were understood as professionals who had more than five years of professional experience in their field and at least two years of experience in the management of users who practice chemsex, knew about the chemsex phenomenon, were active specialists in their field, and had a direct relationship with users who practice chemsex. The 50 experts comprised 10 LGTBI+ individuals, 10 mental health professionals, 10 emergency and urgent care professionals, 10 primary care professionals, and 10 professionals from infectious disease and sexually transmitted infection (STI) units.
The experts were invited to participate in the study directly by e-mail. The e-mail was accompanied by a letter of introduction to the survey, informing about chemsex and the risk factors associated with the phenomenon, and an information sheet describing the characteristics of the study, the objectives of the research, the selection criteria, the confidentiality of the data, and the voluntary nature of the study. Participation was voluntary, anonymous, and confidential, and took place through a questionnaire on the Microsoft Forms platform. Before the first survey was disseminated, the experts identified were asked to accept the Declaration of Consent if they were interested in participating in the study, in accordance with the Data Protection Law in force in Spain.

Delphi Method
The conventional Delphi method was used through an iterative process in which experts were consulted in two rounds [34]. Linstone and Turoff (1975) consider that two rounds are sufficient to reach a consensus, allowing adequate reflection on the group's responses [35].
The rounds were developed through different phases from August 2022 to January 2023. In the first phase, the draft questionnaire on the risk behaviors of the chemsex phenomenon was sent to the experts, who were asked to evaluate both the relevance and comprehensibility of each item using a Likert-type scale from 1 (strongly disagree) to 5 (strongly agree), in order to clarify the content and form of the future questionnaire. A qualitative question on the relevance and clarity of the sections was also added, in addition to criteria of completeness, wording, and structuring of each item. Secondly, the responses of the group of experts were received. Subsequently, a discussion group was held, where suggestions were considered. Finally, the experts' responses were collected, the pertinent modifications were integrated, and the final version of the questionnaire was defined.
Communication with the experts began on 28 July 2022, when the cover letter for acceptance of participation in the study was disseminated. The first round was sent on 4 August 2022 and the second round on 4 October 2022, with the process finally closing on 28 January 2023.

Round 1: Content Validity/Linguistic Validity and Loss of Experts Are Evaluated
The first round of consultation was used to evaluate the content validity (appropriateness) and linguistic validity (comprehensibility) of each item. After this first round, the number of items considered in the second round was significantly reduced.

Round 2: Content Validity Assessed
In the second round, the content validity of the items was evaluated. Items that had adequate content validity but low comprehensibility scores were also re-evaluated for comprehensibility.

Content Validity Analysis
The content validity of the questionnaire was analyzed by calculating the content validity index and Aiken's V value for each item. A minimum CVI and Aiken's V value of 0.6 was established as the criterion for including items in the questionnaire. Following the methodology described by Polit and Beck [36], and used by other authors [37][38][39], three indicators of content validity were calculated for each item (CVI, kappa coefficient (K), and Aiken's V), based on the ratings made by the group of experts, using the following equations:

(a) Content validity index (CVI)

$$\mathrm{CVI} = \frac{A}{N} \qquad (1)$$

where A is the number of experts who evaluated the item with 4 or 5 and N is the total number of experts.

(b) Modified kappa coefficient (K)

$$K = \frac{\text{I-CVI} - P_c}{1 - P_c} \qquad (2)$$

in which the I-CVI is the item-level content validity index, previously calculated for each item, whereas P_c is the probability of chance agreement between observers, calculated through the formula:

$$P_c = \frac{N!}{A!\,(N - A)!} \times 0.5^{N} \qquad (3)$$

(c) Aiken's V

Its equation, algebraically modified by Penfield and Giacobbi (2004) [40], is:

$$V = \frac{\bar{X} - l}{k} \qquad (4)$$

where $\bar{X}$ is the mean of the experts' ratings, l is the lowest possible score, and k is the range of possible values of the Likert scale used. For example, if the lowest score is 1 and the highest score is 5, then k = 5 − 1 = 4. Once calculated, Aiken's V confidence intervals were obtained using the score method [41]. To obtain this confidence interval, the following equation was used for the lower limit of the interval:

$$L = \frac{2nkV + Z^{2} - Z\sqrt{4nkV(1 - V) + Z^{2}}}{2\,(nk + Z^{2})} \qquad (5)$$

And for the upper limit of the interval:

$$U = \frac{2nkV + Z^{2} + Z\sqrt{4nkV(1 - V) + Z^{2}}}{2\,(nk + Z^{2})} \qquad (6)$$

L: lower limit of the interval; U: upper limit of the interval; Z: value in the standard normal distribution; V: Aiken's V as calculated above; n: number of experts; k: range of the rating scale, as defined for Aiken's V.
The CVI, modified kappa, and Aiken's V were calculated in a database created in Excel 2013, using the assessments of the expert group and according to their respective formulas.
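As a minimal sketch of how these indicators could be computed outside a spreadsheet, the Python below follows the definitions given in this section. The function names and the example ratings are hypothetical, and the confidence interval uses the score-method form attributed to Penfield and Giacobbi, which is assumed here to combine the number of experts n with the scale range k; it is not the authors' Excel workbook.

```python
from math import comb, sqrt


def count_agreeing(ratings, threshold=4):
    """Number of experts who rated the item at or above the threshold (4 or 5)."""
    return sum(1 for r in ratings if r >= threshold)


def item_cvi(ratings, threshold=4):
    """Item-level CVI: agreeing experts divided by all experts."""
    return count_agreeing(ratings, threshold) / len(ratings)


def modified_kappa(ratings, threshold=4):
    """Kappa adjusted for chance agreement, following Polit and Beck."""
    n = len(ratings)
    a = count_agreeing(ratings, threshold)
    p_chance = comb(n, a) * 0.5 ** n  # probability of chance agreement
    return (a / n - p_chance) / (1 - p_chance)


def aiken_v(ratings, lowest=1, highest=5):
    """Aiken's V: (mean rating - lowest score) / range of the scale."""
    k = highest - lowest
    return (sum(ratings) / len(ratings) - lowest) / k


def aiken_v_interval(v, n, k, z=1.96):
    """Score-method confidence limits for V (assumed Penfield-Giacobbi form)."""
    nk = n * k
    root = sqrt(4 * nk * v * (1 - v) + z ** 2)
    denom = 2 * (nk + z ** 2)
    lower = (2 * nk * v + z ** 2 - z * root) / denom
    upper = (2 * nk * v + z ** 2 + z * root) / denom
    return lower, upper


# Hypothetical ratings from 10 experts for a single item (not study data).
ratings = [5, 4, 4, 5, 3, 4, 5, 2, 4, 5]
v = aiken_v(ratings)
lower, upper = aiken_v_interval(v, n=len(ratings), k=4)
print(f"CVI = {item_cvi(ratings):.3f}")
print(f"modified kappa = {modified_kappa(ratings):.3f}")
print(f"Aiken's V = {v:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

An item would then be compared against the 0.6 cut-off on CVI and Aiken's V; whether both indicators or only one must reach the cut-off is a design choice the sketch leaves to the caller.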

Comprehensibility Analysis/Linguistic Validation
To assess comprehension validity, the experts were asked in the first round to rate how understandable each item was and to indicate whether any should be reformulated. The mean score was calculated for each item. Items with scores above 4 were considered to be of high comprehensibility; those with scores between 3.5 and 4, of medium comprehensibility; and those with scores below 3.5, of low comprehensibility. Items that obtained lower scores in the first round but were selected for the second round because of their content validity were reformulated, and their comprehensibility was re-evaluated in the second round.
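The three comprehensibility bands can be expressed as a small classification rule. This is a sketch under stated assumptions: the function name `comprehensibility_level` and the example scores are hypothetical, and a mean of exactly 4 or exactly 3.5 is assigned to the medium band, since the text does not say how boundary values were handled.

```python
from statistics import mean


def comprehensibility_level(scores):
    """Classify an item by the mean of the experts' 1-5 comprehension ratings."""
    average = mean(scores)
    if average > 4:
        return "high"
    if average >= 3.5:  # assumption: boundary values 3.5 and 4 count as medium
        return "medium"
    return "low"


# Hypothetical ratings for three items (not study data).
examples = {
    "item A": [5, 5, 4, 4, 5],
    "item B": [4, 4, 3, 4, 4],
    "item C": [3, 3, 4, 3, 2],
}
for name, scores in examples.items():
    print(name, round(mean(scores), 2), comprehensibility_level(scores))
```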

Ethical Considerations
The study was conducted under the Declaration of Helsinki, and was approved by the Committee of the University of La Rioja with verification code (CSV) (D2R1m2Iu3v LVPdIzGZVnK0h6N558tCyN) for human studies through this link: https://sede.unirioja.es/csv/public/index.xhtml;jsessionid=2D8AB94A16FEB14C923EFD5E63C25AF2-n1.ma_07, accessed on 19 September 2023.

Content Validity Analysis
The criterion used to select the items that made up the final questionnaire was that the CVI value or Aiken's V test score was higher than 0.6. See Table 2. Thirty-six experts completed the first round. After a review of the first round, those items that had obtained a minimum CVI and VdA value of 0.6 were selected. Of the 114 items that comprised the questionnaire (Appendix A), 61 met both relevance and comprehensibility criteria. Of these 61 items, we proceeded to analyze the experts' observations for reformulation and understanding of items 43, 44, 63, 65, 65, 73, 91, 92, and 97. The rest of the experts' suggestions were not considered because they related to items that did not obtain a value higher than 0.6 and were therefore eliminated from the questionnaire. In addition, the group of experts considered that questions 11 and 81, 16 and 84, 10 and 82, 65 and 67, and 9 and 83 (all of which, in principle, met the criteria for retention) were similar, so 5 of these 10 questions were eliminated, leaving 56 items finally selected after the first round.
A total of 30 experts completed the second round out of the 36 who initially completed the first round. After a review of the second round, the same criteria as in the first round were maintained, selecting items with a minimum CVI and VdA value of 0.6. Of the 56 items in the questionnaire, the items that did not obtain a minimum CVI and VdA value of 0.6 were eliminated, leaving 54 items that met the criteria. According to the experts' observations, questions 11 "Have you had difficulty concentrating when doing everyday things, such as reading the newspaper or watching television?" and 26 "Are you a person who has difficulty concentrating?" were similar, so item 26 was eliminated because the question was less complete. Subsequently, after review by the research group, two similar items were observed, item 55 "Have you ever heard voices when you were alone?" and item 56 "Have you ever heard voices talking to each other when you were alone?", so the second item was removed. Finally, a questionnaire with 52 items was obtained.
When analyzing the content validity of the 52 items through the CVI and Aiken's V, the researchers eliminated 10 items for not being relevant to the topic even though they had values above 0.6. The values of CVI and Aiken's V for each of the items that make up the questionnaire are shown in Table A1 (Appendix A). Analyzing the CVI and Aiken's V for each item, we see that 97.6% of them reached the value of 0.6 considered acceptable. Lawshe (1975) suggests that a CVI = 0.29 will be adequate when 40 experts have been used, a CVI = 0.51 will suffice with 14 experts, but a CVI of at least 0.99 will be necessary when the number of experts is 7 or less [43]. Although classically a value of 0.70 has been taken for the Aiken's V cut-off point [44] and 0.8 for the CVI [45], the team set the cut-off point at 0.6, firstly so as not to be overly restrictive, secondly because there are many reviewers, and thirdly because the number of items may be reduced further in the following phases of instrument development.
Only the following two items, "Have you ever seen things that other people cannot see?" and "Have you ever felt as if you were under the control of any external force or power?", scored below 0.6 in one of the two indicators, while the other indicator was above 0.6. Finally, the questionnaire consisted of 42 items (Table 2).
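The item counts reported across the two rounds can be tallied in a few lines to confirm that each step follows from the previous one. The step labels and the helper `removed_per_step` are illustrative; the counts are the ones stated in the text.

```python
# (step description, items remaining after the step); counts taken from the text.
ITEM_FLOW = [
    ("initial item bank", 114),
    ("round 1: met relevance and comprehensibility criteria", 61),
    ("round 1: one item dropped from each of 5 similar pairs", 56),
    ("round 2: met the CVI and Aiken's V cut-off", 54),
    ("round 2: similar items 26 and 56 dropped", 52),
    ("research-group review: items judged not relevant dropped", 42),
]


def removed_per_step(flow):
    """Return (step description, number of items removed at that step)."""
    return [
        (later[0], earlier[1] - later[1])
        for earlier, later in zip(flow, flow[1:])
    ]


for label, removed in removed_per_step(ITEM_FLOW):
    print(f"{label}: -{removed}")
print("final questionnaire:", ITEM_FLOW[-1][1], "items")
```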

Comprehensibility or Linguistic Validation Analysis
The experts' comments on the items were minimal and concerned spelling, grammar, and syntax. For example, experts requested that item 13 be reformulated. Items 2, 15, 40, 84, 93, and 95 were eliminated from the questionnaire, rather than reformulated, because they presented a low level of comprehension, with a value lower than 4. Item 13 also had values lower than 4 in comprehension; however, keeping it in the instrument was considered essential, so it was reformulated and retained.
All the active experts who participated responded that they understood, without difficulty, the content of the final questionnaire, its concepts, and the answers to each item in terms of adequacy and comprehension. Finally, no expert raised any doubts regarding the completion of the questionnaire.

Discussion
Given the growing prevalence of chemsex and its importance as a public health problem, we set out to design and validate a scale entitled "Chem-Sex Inventory" (CSI). Using the Delphi method, we sought expert consensus on the content of the questionnaire [46,47] and analyzed its content validity, with the aim of identifying the behaviors that predict consumption, allowing a better approach to people who practice chemsex and advancing research on this type of addiction [48].
As a strength of the study, this instrument will facilitate the development of future studies to analyze and relate its construct to different variables such as substance use, impulsivity, altered body perception, risk of psychosis, risk of suicide, anxiety, and depression. Its application will allow multidisciplinary teams of professionals to plan and develop a better approach to chemsex patients and knowledge of their behavior.
The Delphi method has been shown to be relevant and helpful in health research [49] to obtain reasoned, consensual, and individualized opinions concerning the analysis and reflection on a given research objective [50,51].
First, professionals who had practice-based knowledge and experience in chemsex and could make valid contributions to the study were chosen as experts. Validation aims to ensure that the questionnaire measures what it is intended to measure and that its design and validation are of rigorous scientific quality. Such validation was carried out with experts to achieve an optimal level of content validity, defined as the degree to which all representative indicators intended to be assessed are included in the questionnaire [48].
Furthermore, the group of experts was heterogeneous, drawn from different regions and work services, which is fundamental because it brought a range of perspectives. Various studies indicate the strengths of the modified Delphi technique: because each round is conducted anonymously and independently, all expert group participants have an equal opportunity to participate in the study, which reduces the response bias that can arise in group settings [42,52].
Secondly, the methodology used was the Delphi technique, a consensus technique that allows quantitative estimators to be obtained through the degree of agreement among the participants. This is an effective method for building consensus in a group without the experts having to meet in person [47]; instead, each group member is contacted via e-mail. In light of all of the above, it is necessary to have validated tools adapted to our environment that allow us to know the behavior and conduct of people who practice chemsex.
Regarding the characteristics of the present study, we can affirm that the content validity is high. As we have seen, the decision-making of the group of experts became increasingly uniform across the Delphi stages, which is reflected in the significant increase in the congruence index of the items included [48]. The group of experts considered that the 42 items of the "Chem-Sex Inventory" (CSI) questionnaire capture aspects related to the predictive behavior of chemsex and therefore present a high level of content validity [42]. As this is the first version of the questionnaire, the research group eliminated second-round items with a CVI and Aiken's V higher than 0.6 that were not relevant for assessing the risk behaviors of the chemsex phenomenon, and kept items with a CVI or Aiken's V lower than 0.6 that were considered relevant to the study.
The Delphi method showed a high degree of agreement among experts in evaluating the questionnaire [48]. In the absence of instruments to assess the risk factors of the chemsex phenomenon in the literature reviewed, developing new tools is necessary [53]. This study can be used to inform other researchers in their efforts to validate the content of behaviors predictive of the chemsex phenomenon [47]. Following this, researchers can conduct a study of the LGTBI+ population that practices chemsex in order to assess additional aspects of the instrument's validity and reliability and, based on the results, perform item reduction and subsequently establish a cut-off point to discriminate risk and allow for group stratification.
In this sense, the panel of experts made a quantitative and qualitative contribution that allowed for improving the tool [54], obtaining very positive values in all dimensions and their assessment category, namely, the relevance of the reference question and its response category, and relevance to the object of research, clarity, adequacy and comprehension of the wording, structure, and sequence of dimensions and questions. After content validation by experts, a final questionnaire was developed, whose other psychometric properties are currently being analyzed.
For future studies, some limitations should be kept in mind. First, a diverse sample of experts working in different systems and autonomous communities was selected to obtain a broader perspective on the chemsex phenomenon. Although previous research identified snowball sampling as an effective method for identifying expert groups in Delphi studies [55], its application in the present study may have resulted in the inclusion of some non-experts in the sample.
This may indicate that those with less experience may have limited exposure to some aspects of chemsex practice. As this was a study based on expert opinion, and although the team of experts was sufficiently broad and knowledgeable about the phenomenon, there is always the possibility that not all aspects or dimensions of the phenomenon were addressed. Another limitation is that the resulting questionnaire is relatively long, in our opinion, which may lead to a lack of responses in future research. Since this is the first version of the "Chem-Sex Inventory" (CSI) questionnaire, the research team aims to shorten it in successive versions while maintaining its properties. Therefore, future research using expert samples should consider how participants' experiences fit the research objectives.
Once the instrument has been constructed, a subsequent pilot test is planned to analyze the psychometric properties of the "Chem-Sex Inventory" (CSI) questionnaire at the national level and a nationwide study of all the autonomous communities is planned to validate the instrument in men who have sex with men.

Conclusions
In conclusion, the results of the present study indicate that the questionnaire designed to determine the risk factors for the chemsex phenomenon has a high level of content validity and can, therefore, be used in emergency, primary care, psychiatry, and infectious disease departments for this purpose. It should be noted that although this instrument may be potentially helpful, other psychometric properties should be evaluated to ensure its validity and structure.

Figure 1. Questionnaire development process and the content validity index (I-CVI) of each of the 114 items from the Delphi rounds. Of these 114 items, 56 (49.1%) had an I-CVI > 0.6 after the first round, and 58 items with an I-CVI < 0.6 were eliminated. After the second round, 52 items (92.8%) exceeded the cut-off value, eliminating the remaining 4 items.
Healthcare 2023, 11

Table 1. Demographic data of the panel of experts participating in the Delphi method.

Table 2. The content validity index (CVI), Aiken's V, and the kappa for each item.
1 K (kappa); 2 IC (confidence interval).Note: The CSI questionnaire was designed in Spanish; this table shows items translated into English (but not an English-validated version).

Table A1. Index of content validity and comprehension of each item in the Delphi procedure in the first round.