Entry

Questionnaire Use and Development in Health Research

1 Department of Nursing, Tzu Chi University, Hualien City, Hualien 970302, Taiwan
2 Dalla Lana School of Public Health, University of Toronto, Toronto, ON M5T 3M7, Canada
3 Department of Education, National Chiayi University, Minsyong Township, Chiayi 621302, Taiwan
4 Department of Styling and Cosmetology, Tainan University of Technology, Tainan City 710302, Taiwan
* Author to whom correspondence should be addressed.
Encyclopedia 2025, 5(2), 65; https://doi.org/10.3390/encyclopedia5020065
Submission received: 15 March 2025 / Revised: 14 May 2025 / Accepted: 15 May 2025 / Published: 16 May 2025
(This article belongs to the Section Medicine & Pharmacology)

Definition

A questionnaire is a structured instrument used in health research to systematically collect data on perceptions, behaviors, and health outcomes. It serves as a fundamental tool for capturing patient-reported outcomes, evaluating treatment effectiveness, and monitoring public health trends. Questionnaires can be administered in various formats, including paper-based, digital, or interactive systems, and must be carefully designed to ensure reliability, validity, and minimal bias. While validated questionnaires facilitate cross-study comparability, new instruments may be needed to address emerging health concerns or specific cultural contexts. Adhering to best practices in survey methodology allows researchers to maximize the utility of questionnaires, ensuring accurate, reproducible, and ethically sound health research.

Graphical Abstract

1. Introduction

Questionnaires are essential tools in health research, functioning as structured instruments for the systematic and consistent collection of information from patients, healthcare providers, and broader populations. Despite their widespread use, the terminology, design principles, and methodological standards for questionnaire development remain inconsistently applied across disciplines. This entry was written in response to a recognized need for a comprehensive yet accessible reference that synthesizes best practices across the lifecycle of questionnaire use in health research. Our goal is not to prescribe a singular approach or propose a universal checklist, nor to suggest that a single questionnaire could or should incorporate every principle discussed herein. Instead, the principles outlined are intended to provide a conceptual and methodological framework that researchers can adapt according to the context, purpose, and constraints of their specific studies.
A questionnaire typically comprises a set of questions designed to capture data on behaviors, attitudes, perceptions, symptoms, or other measurable phenomena. Depending on its design, it may include open-ended, closed-ended, or mixed-format questions, resulting in qualitative, quantitative, or mixed data types. Questionnaires can be administered through various formats, including paper-based surveys, digital platforms, or interactive systems [1]. Although they are relatively low-cost and scalable, the development or selection of a questionnaire requires meticulous planning and design to ensure the collection of valid, reliable, and unbiased data [2].
In health research, precise terminology regarding questionnaires is important but often inconsistently applied, leading to conceptual confusion, particularly among new investigators. Terms such as survey, instrument, questionnaire, scale, and inventory are frequently used interchangeably, despite referring to distinct concepts with specific methodological implications. To improve clarity, these terms can be organized into a hierarchical framework based on their scope and application (Table 1).
First, the term survey refers to the broadest methodological approach. It encompasses the planning, design, administration, and analysis of data collection procedures. Surveys may employ a variety of data collection tools, including questionnaires, interviews, and observational methods. For example, a patient satisfaction survey might involve distributing a questionnaire alongside qualitative interviews to achieve a more comprehensive understanding of patient experiences.
Second, instrument is an umbrella term encompassing any tool or device used to measure variables in research. This category includes questionnaires, scales, inventories, diagnostic tests, and observational checklists. Third, a questionnaire is a structured set of questions designed to elicit information from respondents. Questionnaires may include a range of item types, such as open-ended, closed-ended, or Likert-type questions, and can be used to assess one or multiple constructs. For example, the International Physical Activity Questionnaire (IPAQ) is a widely used tool to quantify physical activity levels in epidemiological studies [3].
Fourth, a scale refers to a specific component within a questionnaire, or a standalone tool, designed to measure a single latent construct, such as anxiety (e.g., Generalized Anxiety Disorder 7-item Scale, GAD-7 [4]), pain (e.g., visual analogue scale, VAS [5]), or fatigue (e.g., Fatigue Assessment Scale, FAS [6]). Scales typically consist of multiple items that, when aggregated, yield a composite score representing the underlying variable.
Finally, an inventory is a more comprehensive type of questionnaire, often used to measure multiple dimensions or traits, such as those found in personality assessments. For example, the Minnesota Multiphasic Personality Inventory (MMPI) is a widely used standardized psychometric test designed to assess adult personality traits and psychopathology [7]. Unlike scales, inventories are not limited to a single construct and often contain multiple subscales or domains [8].
Selecting the appropriate tool also depends on the research context. Questionnaires and scales are most commonly used in quantitative studies, where measurement precision and statistical analysis are essential. Inventories may be employed in quantitative or mixed-methods research [9], particularly when assessing complex psychological or behavioral profiles. Although qualitative studies are less reliant on structured instruments, they may still incorporate questionnaires with open-ended items or use them in conjunction with interviews or focus groups to support methodological triangulation.

2. Strengths and Limitations of Questionnaires

Questionnaires are widely used in research due to their multiple advantages, including cost-effectiveness, time efficiency, respondent anonymity, and standardization of data collection. A key benefit is their ability to facilitate large-scale data collection, allowing researchers to efficiently reach broad and geographically dispersed samples [10]. When distributed online, questionnaires can significantly reduce costs associated with traditional methods such as interviews or postal surveys by eliminating expenses related to printing and travel [11]. Furthermore, their structured format, often incorporating predefined response options such as Likert-type scales [12] or multiple-choice questions, ensures consistency in data collection. This standardization enhances the efficiency of data entry and analysis, streamlining the overall research process.
Another strength of questionnaires is their role in standardizing measurements across studies. By using established, validated scales, researchers can compare findings across different studies, thereby enhancing the reliability and validity of results. This standardization also supports the aggregation of data across populations and time points, which is essential for identifying trends and variations in health outcomes or other key variables. In addition, the anonymity provided by questionnaires encourages honest responses, particularly on sensitive health topics. This anonymity enhances data integrity, as respondents may feel more comfortable providing truthful information without fear of judgment or repercussions.
Despite these advantages, several limitations can affect the quality and validity of questionnaire-based research. A major concern is response bias, where participants’ answers may be influenced by external factors rather than their true beliefs or experiences. One common form of response bias is social desirability bias, which occurs when respondents provide answers they perceive as socially acceptable rather than truthful [13]. For example, in nutrition-related surveys, individuals may overreport their adherence to a healthy diet [14]. To mitigate this, researchers can employ indirect questioning techniques, such as the crosswise model, to reduce socially desirable responses [15].
Another form of response bias is acquiescence bias, in which respondents tend to agree with statements regardless of their actual opinions. This tendency may arise from a desire to please the researcher or to avoid perceived confrontation, leading to skewed data that fails to accurately reflect respondents’ true perspectives. A review identified 48 common types of bias in questionnaires, providing examples of each [16]. Addressing these biases is critical to ensuring the validity and reliability of data collected through questionnaires in health research.
Engagement with questionnaires also presents challenges that can impact data quality. One such challenge is survey fatigue, where participants become disengaged due to long or repetitive questions, leading to low completion rates or high dropout rates [17]. If respondents perceive the survey content as irrelevant to their experiences or interests, their level of engagement decreases, which in turn affects the representativeness and reliability of the collected data.
Another critical issue is the potential for misinterpretation of questions. Ambiguously worded or poorly phrased items can lead to varying interpretations, resulting in inconsistent and unreliable responses. To mitigate this risk, careful questionnaire design is essential. Ensuring clear instructions, neutral wording, and concise, unambiguous questions can help minimize confusion and improve data accuracy [18].
To enhance the reliability and validity of questionnaire-based research, researchers must acknowledge these limitations and implement strategic methodological adjustments. Thoughtful questionnaire design, along with measures to maintain participant engagement, is crucial for obtaining high-quality, meaningful data.

3. Using an Existing Questionnaire or Developing a New One

The decision to use an existing questionnaire or develop a new one should be guided by the specific needs, resources, and objectives of the research. If an instrument adequately addresses the research questions and has established validity and reliability, using it may be the most efficient approach. However, if the study requires capturing unique aspects or contexts not covered by existing tools, developing a new questionnaire may yield more insightful and applicable results. Ultimately, the choice must balance pragmatism with scientific rigor to ensure the integrity and relevance of the findings.

3.1. When to Use an Existing Questionnaire

Using a well-established questionnaire is often the most efficient and scientifically sound approach when the instrument adequately addresses the research questions and demonstrates documented validity and reliability. One of the primary advantages is the time and resource savings associated with using a tool that has undergone rigorous psychometric testing. Moreover, employing a standardized instrument enhances comparability across studies, which facilitates benchmarking and contributes to longitudinal analyses and meta-analyses. This comparability strengthens both the generalizability and credibility of findings.
Nevertheless, using a validated instrument comes with important responsibilities. Researchers must preserve the original format, item wording, time frames, and scaling. Even seemingly minor alterations, such as adjusting response anchors, changing the format from numeric scales to checkboxes, or adding questions, can compromise the tool’s psychometric properties [19]. If a questionnaire does not fully align with the study’s needs or target population, it is preferable to either request permission for contextual adaptations or develop a new instrument, rather than introduce unverified modifications. Preserving the integrity of validated instruments is essential to maintaining scientific rigor.
To determine whether an existing questionnaire is appropriately validated, researchers should consult published literature for several key indicators. A validated instrument will typically have accompanying studies that report psychometric properties, such as internal consistency, test–retest reliability, and different forms of validity (e.g., content, construct, criterion-related). Researchers should also check for evidence of cross-cultural adaptation, especially when applying an instrument in a population different from the one it was originally developed for. Without these elements, the appropriateness of the instrument for a given context may be limited.
A strategic search for existing instruments begins with a thorough literature review using databases such as PubMed or PsycINFO to identify prior studies measuring similar constructs. Validation studies, reference lists, academic networks, and professional societies can be valuable resources in locating appropriate tools. Once a questionnaire is identified, researchers must confirm copyright status and usage permissions. Some tools are freely available for non-commercial research, while others may require purchase or formal authorization. Ensuring legal and ethical compliance is a fundamental responsibility in research.

3.2. When to Develop a New Questionnaire

Developing a new questionnaire becomes necessary when no existing instrument adequately captures the constructs, contexts, or populations relevant to a given study. This is especially true in emerging or rapidly evolving fields, such as digital health, social media, or artificial intelligence, where existing tools may be outdated or lack relevance. It is also warranted when unique or localized aspects must be assessed that are not addressed by standard measures.
While developing a questionnaire offers the opportunity for tailored measurement, it is resource-intensive and methodologically demanding. The process involves multiple stages, including item generation, cognitive interviewing, pilot testing, and formal psychometric evaluation. Without this level of rigor, the instrument may yield unreliable or invalid data, undermining the study’s conclusions. Moreover, for a new questionnaire to achieve broader impact, it must be replicable and interpretable by other researchers. The absence of established benchmarks or comparison points can hinder generalizability and reduce the influence of the study’s findings.
Despite these challenges, the development of new instruments can significantly advance a research field, provided that it is undertaken with methodological care and transparency. Researchers embarking on this path should be prepared for an iterative process that demands both conceptual clarity and statistical expertise.

4. Steps in Questionnaire Design

4.1. Defining Objectives and Target Population

Designing an effective questionnaire begins with two foundational steps, defining the research objectives and identifying the target population, to ensure the instrument is both purposeful and participant-centered. Researchers must consider demographic and cultural factors that may influence responses. For example, the mode of administration (e.g., online, paper-based, or face-to-face) should align with the target population’s accessibility and preferences. Cultural sensitivity also enhances the questionnaire’s relevance, and questions should be framed to respect diverse social norms.
Next, the construct to be measured must be precisely defined. A thorough literature review helps distinguish the construct from related concepts and ensures a strong theoretical foundation. Researchers should also determine whether the construct is unidimensional or multidimensional, as this decision influences the questionnaire’s structure and item organization.
Once the construct is defined, the next step is operationalizing it into measurable items. This refers to the process of transforming an abstract concept into concrete, structured questions. This process involves defining the construct in specific terms, designing relevant questions, selecting appropriate response formats, and ensuring that the items accurately capture the intended concept.
An effective approach to operationalization involves generating a broad pool of potential items and systematically refining them through expert feedback, focus groups, or pilot testing with a subset of the target population. This iterative process helps eliminate ambiguity, redundancy, and irrelevant items while ensuring that each remaining question meaningfully contributes to measuring the underlying construct.

4.2. Item Generation

The process of item generation involves crafting specific questions that align with research objectives and resonate with the target population. While well-designed questions elicit clear, meaningful responses, poorly constructed items can introduce ambiguity, bias, or respondent fatigue.
To develop high-quality questionnaire items, researchers should integrate several best practices, including reviewing existing studies to identify validated questions as a foundation, consulting subject-matter experts to refine the construct and ensure key domains are covered, and engaging with the target population through focus groups to capture respondents’ language, perspectives, and experiences, ensuring questions are relatable and contextually appropriate.
Three core attributes of effective questions are focus, brevity, and clarity:
(1) Focus: each question should measure a single concept. Avoid double-barreled questions, which ask about multiple issues in one item and lead to ambiguous responses.
(2) Brevity: concise questions reduce cognitive load and response errors. Lengthy or complex questions may overwhelm respondents, increasing the risk of skipped or misinterpreted items.
(3) Clarity: questions should be universally understood, using simple, familiar language suited to the respondents’ education level. Avoid jargon, slang, abbreviations, and double negatives.
In addition, vague terms such as “frequently” can be interpreted differently by respondents, making it difficult to analyze and compare responses. Instead, researchers should use specific, measurable alternatives to reduce ambiguity. Furthermore, researchers should avoid problematic question types that can compromise data integrity. Common examples include leading questions, double-barreled questions, double negatives, non-applicable questions, and those over-demanding recall [10]. By adhering to these principles, researchers can enhance the reliability, validity, and overall quality of their questionnaires, ensuring meaningful and accurate data collection.

4.3. Item Format and Response Scales

The item format and response scale are fundamental elements of questionnaire design. The item format refers to how questions are structured, including open-ended, closed-ended, and mixed formats. Open-ended questions allow respondents to provide unrestricted, detailed answers, yielding rich qualitative data. They are particularly useful for exploring a broad range of perspectives. However, analyzing open-ended responses is time-consuming and may introduce inconsistencies. As a result, they are often used as a preliminary step to identify common themes before designing structured questions. In contrast, closed-ended questions provide predefined response options, such as yes/no, multiple-choice answers, or Likert-type scales, ensuring consistency and facilitating quantitative analysis. Mixed-format items combine elements of both open- and closed-ended questions, offering flexibility but requiring careful consideration of their implications for data analysis [10].
Response scales determine how respondents express their answers and can be categorical or continuous. The choice of scale depends on the construct being measured and the level of precision required. Nominal scales classify data into categories without inherent order, such as sex or ethnicity, while ordinal scales introduce a ranking element, such as levels of agreement or satisfaction. Forced-ranking scales also produce ordinal data, but they differ from other ordinal scales by requiring respondents to rank items in order of preference or importance; this method is useful for identifying relative priorities, particularly in contexts where understanding trade-offs is essential. Interval scales have equal measurement intervals, such as temperature, but lack a true zero point. Ratio scales provide both consistent intervals and an absolute zero, making them ideal for measuring attributes such as age or income.
When designing response scales, factors such as the number of response options, labeling, and visual presentation are critical for ensuring clarity and reliability. Likert-type scales are widely used for capturing attitudes or perceptions across a continuum, typically ranging from strongly disagree to strongly agree [12]. The visual analogue scale (VAS) is frequently employed in health research to measure subjective experiences such as pain, fatigue, or anxiety. It consists of a 100 mm line with endpoints representing extremes, such as “no pain” and “worst pain imaginable”. Respondents mark a point on the line to indicate their perception, with scores calculated by measuring the distance from the starting point. The VAS provides precise, continuous data and is highly sensitive to subtle variations [5].
Nonverbal response scales are particularly useful for populations that may struggle with traditional verbal or numeric formats, such as young children or individuals with limited literacy. These scales use visual or symbolic representations, such as images or pictograms, to effectively capture responses. The Faces Pain Scale, developed by Wong and Baker [20], is a widely adopted tool for pediatric pain assessment. The scale presents a series of six facial expressions ranging from a smiling face, representing “no hurt”, to a crying face, indicating “hurts worst”. Children select the face that best matches their pain level. This simple yet effective approach ensures clarity and accessibility, even for those with limited verbal skills.

4.4. Questionnaire Layout

A well-designed questionnaire consists of three main sections: the introduction, the body, and the conclusion. The introduction typically includes a brief preamble or cover letter outlining the purpose of the study, the target audience, and the significance of the respondent’s participation. It should also specify the estimated time required to complete the questionnaire and provide assurances regarding confidentiality and anonymity. If applicable, the introduction may include instructions for answering specific question types or navigating the questionnaire, ensuring respondents clearly understand how to proceed.
A logical structure is essential, with questions grouped into thematic sections or arranged in a natural flow to prevent confusion. Questions should be concise, unambiguous, and directly aligned with the research objectives. To maintain respondent engagement, it is advisable to begin with simple, non-sensitive questions before gradually progressing to more complex or sensitive topics [1,21].
The conclusion thanks respondents for their participation and may provide additional information, such as the next steps in the study or contact details for further inquiries.

4.5. Questionnaire Formatting and Length

A user-friendly questionnaire requires careful attention to layout and formatting to enhance readability and ensure respondents can navigate the instrument with ease. The design should be clean and visually appealing, with sufficient white space to prevent clutter. Questions should be clearly numbered or labeled and logically grouped into thematic sections, allowing respondents to follow a natural flow throughout the questionnaire.
Typography plays a critical role in readability and response rates. Research suggests that using larger font sizes can increase response rates, particularly among older adults completing self-administered postal questionnaires [22]. A study on young adults found that a 14-point font size was optimal for on-screen reading, with Courier New facilitating faster reading and Verdana preferred for digital presentations [23]. In contrast, another study found that Times New Roman, a serif font, improved attention, as participants completed a letter cancellation task significantly faster and with fewer omissions compared to sans-serif and script fonts [24].
To improve usability, consistent use of headings and subheadings helps delineate sections, making it easier for respondents to locate specific parts of the questionnaire. Uniform spacing, alignment, and margins contribute to a professional appearance, while visual aids such as bullet points, arrows, or icons can highlight key instructions and reduce cognitive load.
Questionnaire length is a crucial factor in maintaining respondent engagement and data quality. Researchers should estimate completion time in advance to identify potential risks of survey fatigue, rushed responses, or incomplete data. While 25 to 30 well-structured questions are often considered a balance between thoroughness and brevity [25], the optimal length depends on the study’s content, respondent motivation, literacy level, and time constraints. Including an estimated completion time in the introduction helps respondents plan accordingly, improving engagement and response quality.
For mail surveys, formatting the questionnaire as a booklet, using large sheets of paper folded and stapled through the spine, enhances readability, makes page-turning more intuitive, and reduces the risk of misplaced pages [26]. Moreover, careful layout design should ensure that questions are not split across pages, as this can cause confusion and incomplete responses.
Beyond booklet formatting, several strategies have been shown to improve postal questionnaire response rates. A systematic review of 292 randomized trials found that using colored ink, placing engaging questions at the beginning, and keeping the questionnaire concise significantly increased response rates [27].
Pre-testing a questionnaire’s layout and formatting with a sample of the target population is crucial for identifying usability issues and ensuring a seamless respondent experience. A clear, intuitive design enhances engagement, reduces errors, and minimizes survey fatigue, ultimately improving data accuracy, reliability, and response rates.

4.6. Statistical Analyses of Questionnaire Data

The statistical analysis of questionnaire data is a structured process that transforms raw responses into meaningful insights while ensuring data accuracy and validity. Before conducting any statistical tests, data preparation is essential; it includes outlier detection, verification of data types (categorical vs. continuous), and handling of missing values.
Descriptive statistics, such as means, medians, standard deviations, and frequency distributions, provide an initial understanding of the dataset. Inferential statistics are used to test hypotheses and generalize findings to a larger population. The choice between parametric and non-parametric tests depends on whether the data satisfy the assumption of normality: parametric tests (e.g., t-tests, analysis of variance) are applied when this assumption holds, whereas non-parametric alternatives (e.g., Mann–Whitney U test, Kruskal–Wallis test) are used when the data do not follow a normal distribution.
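To make this decision rule concrete, the following minimal sketch (in Python; the toy data and the 0.05 normality threshold are assumptions for illustration) checks normality with the Shapiro–Wilk test and then selects the corresponding two-group comparison:

```python
# Minimal sketch: choosing a parametric vs. non-parametric two-group test.
# The simulated scores and the 0.05 threshold are illustrative assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
group_a = rng.normal(50, 10, 40)   # e.g., questionnaire scores, group A
group_b = rng.normal(55, 10, 40)   # e.g., questionnaire scores, group B

# Shapiro-Wilk test of normality for each group
normal_a = stats.shapiro(group_a).pvalue > 0.05
normal_b = stats.shapiro(group_b).pvalue > 0.05

if normal_a and normal_b:
    stat, p = stats.ttest_ind(group_a, group_b)      # parametric
    test = "independent-samples t-test"
else:
    stat, p = stats.mannwhitneyu(group_a, group_b)   # non-parametric
    test = "Mann-Whitney U test"

print(f"{test}: statistic = {stat:.2f}, p = {p:.4f}")
```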
Ensuring questionnaire reliability and validity is crucial for accurately measuring intended constructs. Cronbach’s alpha or McDonald’s omega is commonly used to evaluate internal consistency, ensuring that related items yield consistent responses. Factor analysis (exploratory or confirmatory) verifies the underlying structure of the questionnaire, confirming whether items align with expected theoretical constructs. Regression analysis is used to examine relationships between multiple variables, identifying predictors of responses and controlling for confounding factors.
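As a concrete illustration of internal consistency, the sketch below computes Cronbach’s alpha directly from the classical formula, alpha = k/(k − 1) × (1 − Σ item variances/total variance); the four-item data frame and its column names are invented for the example:

```python
# Minimal sketch of Cronbach's alpha from item-level data.
# The example responses and item names are assumptions.
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """items: respondents in rows, scale items in columns."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)        # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)    # variance of the sum score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Five respondents answering a hypothetical 4-item Likert-type scale
data = pd.DataFrame({
    "item1": [4, 5, 3, 4, 2],
    "item2": [4, 4, 3, 5, 2],
    "item3": [3, 5, 4, 4, 1],
    "item4": [4, 5, 3, 5, 2],
})
print(f"Cronbach's alpha = {cronbach_alpha(data):.2f}")
```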
Determining an adequate sample size is critical to ensure the study is sufficiently powered to detect meaningful effects while balancing practical constraints (e.g., time, budget, and participant availability). Insufficient sample sizes increase the risk of false-negative results, while overly large samples waste resources. For basic tests, such as t-tests, analysis of variance, and Chi-square tests, G*Power, a widely used and freely available tool [28], can assist in sample size determination. However, for more complex multivariate analyses, specialized formulas, Monte Carlo simulations, or bootstrapping methods are needed.
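For instance, an a priori calculation analogous to a G*Power computation can be run with the statsmodels package; the effect size, alpha, and power below are assumed example values, not recommendations:

```python
# Illustrative a priori sample-size calculation for an independent-samples
# t-test; effect size, alpha, and power are example assumptions.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_group = analysis.solve_power(
    effect_size=0.5,   # Cohen's d (medium effect, assumed)
    alpha=0.05,        # two-sided significance level
    power=0.80,        # desired statistical power
)
print(f"Required sample size per group: {n_per_group:.0f}")  # ~64
```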
When estimating prevalence, sample size calculations depend on desired precision, often determined by the margin of error and confidence interval width. Larger sample sizes are necessary when studying low-prevalence conditions or rare events [29].
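A minimal sketch of the standard single-proportion formula, n = z²p(1 − p)/d², follows; the expected prevalence and margin of error are assumed example values:

```python
# Sample size for estimating a single prevalence: n = z^2 * p * (1-p) / d^2.
# p (expected prevalence) and d (margin of error) are example assumptions.
from math import ceil

z = 1.96    # z-value for a 95% confidence interval
p = 0.50    # expected prevalence (0.5 gives the most conservative n)
d = 0.05    # margin of error (+/- 5 percentage points)

n = ceil(z**2 * p * (1 - p) / d**2)
print(f"Required sample size: {n}")  # 385
```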
In addition to quantitative methods, researchers must consider appropriate analytic strategies for responses to open-ended items, which are common in exploratory or mixed-methods designs. Qualitative approaches such as thematic analysis and content analysis are useful for identifying patterns, categories, and emergent concepts in narrative data. These methods involve systematic coding of textual responses, allowing researchers to capture the depth and complexity of participants’ perspectives [9]. Incorporating qualitative analysis can enrich interpretation by revealing contextual or latent meanings that closed-ended questions may overlook.

5. Reliability and Validity

5.1. Reliability

Reliability refers to the consistency and stability of measurement results over time, across raters, and within the internal structure of a questionnaire. Internal consistency, which evaluates how well items within a scale measure the same underlying construct, is most commonly assessed using Cronbach’s alpha. Values of 0.70 or higher are generally considered acceptable. However, Cronbach’s alpha is based on assumptions such as unidimensionality and tau-equivalence (i.e., equal factor loadings across items), which are often violated in real-world data. When these assumptions are not met, alpha can either overestimate or underestimate the true reliability of a scale [30]. An alternative index, McDonald’s omega, has been proposed, particularly for instruments with heterogeneous items or multidimensional constructs. Unlike alpha, omega does not require equal loadings and instead derives reliability from a factor model, providing a more accurate estimate of the proportion of variance attributable to the underlying construct [31].
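For reference, McDonald’s omega (total) for a unidimensional factor model can be written as follows, where the loadings and error variances come from the fitted factor model:

```latex
% McDonald's omega (total) for a unidimensional factor model:
% \lambda_i are standardized factor loadings and \theta_{ii} are the
% item error variances estimated from the model.
\omega = \frac{\left(\sum_{i=1}^{k}\lambda_i\right)^{2}}
              {\left(\sum_{i=1}^{k}\lambda_i\right)^{2} + \sum_{i=1}^{k}\theta_{ii}}
```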
Another key form of reliability is test–retest reliability, which assesses the temporal stability of responses by administering the instrument to the same participants on two separate occasions. This is particularly important for constructs expected to remain stable over time. In contexts involving subjective ratings, such as clinical assessments or observational studies, interrater reliability becomes relevant. Agreement between raters is commonly measured using Cohen’s kappa or the intraclass correlation coefficient (ICC), depending on the scale and number of raters involved [32].
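As an illustration, both agreement indices can be computed in Python; the toy ratings below, and the choice among the ICC forms reported by the third-party pingouin package, are assumptions for the example:

```python
# Minimal sketch of two interrater agreement indices; the toy ratings
# and the long-format layout are illustrative assumptions.
import pandas as pd
import pingouin as pg
from sklearn.metrics import cohen_kappa_score

# Cohen's kappa for two raters assigning categorical codes
rater1 = ["yes", "no", "yes", "yes", "no", "yes"]
rater2 = ["yes", "no", "no", "yes", "no", "yes"]
print(f"Cohen's kappa = {cohen_kappa_score(rater1, rater2):.2f}")

# Intraclass correlation coefficients for continuous ratings
long_df = pd.DataFrame({
    "subject": [1, 1, 2, 2, 3, 3, 4, 4, 5, 5],
    "rater":   ["A", "B"] * 5,
    "score":   [7, 8, 5, 5, 9, 8, 4, 5, 6, 7],
})
icc = pg.intraclass_corr(data=long_df, targets="subject",
                         raters="rater", ratings="score")
print(icc[["Type", "ICC", "CI95%"]])  # choose the ICC form matching the design
```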
It is important to stress that reliability is not an inherent property of a questionnaire but is sample- and context-dependent. The same questionnaire may exhibit different reliability coefficients across populations with different cultural, educational, or clinical backgrounds. Therefore, researchers should consider whether the original validation sample is comparable to their target population before adopting an instrument. Furthermore, while reliability is a prerequisite for validity, it is not sufficient on its own, as an instrument may consistently measure something other than the intended construct [33].
In addition to classical reliability indices, modern psychometric frameworks provide more sophisticated approaches for evaluating measurement precision. Item Response Theory (IRT), for instance, models the probability of a given response based on both the respondent’s latent trait level and the characteristics of each item. Unlike classical test theory, IRT generates item-level parameters, such as discrimination, difficulty, and guessing, that offer detailed insights into how individual items function across different levels of the underlying construct. This allows researchers to develop shorter, more efficient instruments that retain psychometric robustness and can be tailored to specific populations [34].
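As a point of reference, the three-parameter logistic (3PL) model expresses the probability of endorsing an item as a function of the latent trait and the item parameters named above:

```latex
% 3PL item response function: a_i = discrimination, b_i = difficulty,
% c_i = pseudo-guessing; the 2PL model is the special case c_i = 0.
P_i(\theta) = c_i + \frac{1 - c_i}{1 + e^{-a_i(\theta - b_i)}}
```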
Complementing this, Generalizability Theory (G-Theory) expands upon classical test theory by decomposing the variance in observed scores into multiple sources of measurement error, including items, raters, and testing occasions. This multifaceted approach enables researchers to estimate reliability across a range of study conditions and to pinpoint which sources of error most affect score precision [35]. By integrating IRT and G-Theory into the development and evaluation of questionnaires, researchers can produce instruments that are not only more accurate and reliable but also better aligned with the complexities of real-world research settings.

5.2. Validity

Validity refers to the extent to which a questionnaire measures what it is intended to measure. While reliability and validity are interconnected, they are not interchangeable. A tool that is unreliable cannot be valid, as inconsistency prevents accurate measurement. However, an instrument can be reliable without being valid. For example, a malfunctioning bathroom scale that consistently displays the same incorrect weight is reliable but not valid, as it fails to measure actual weight accurately.
Face validity assesses whether a questionnaire appears, on the surface, to measure the intended construct effectively. It is based on subjective judgments from researchers, participants, or experts regarding the clarity, relevance, and appropriateness of the items. While face validity is not an empirical measure, it provides initial insights into how well an instrument resonates with its intended audience. Some critics argue that it lacks scientific rigor [36], but others contend that high face validity enhances participant engagement and compliance, as clear and relevant items are more likely to elicit genuine responses [37]. However, due to its subjective nature, face validity should be complemented with more robust validation methods, such as content and construct validity, for a comprehensive assessment.
Content validity evaluates whether a questionnaire comprehensively represents all relevant aspects of the construct it aims to measure. This process involves systematic expert review to ensure that the instrument captures the full scope of the construct while excluding irrelevant elements. A common metric for content validity is the Content Validity Index (CVI), which aggregates expert ratings to determine the proportion of items deemed essential or highly relevant [38]. The CVI is assessed at both the item level and scale level to provide a robust measure of an instrument’s adequacy. High content validity is crucial for ensuring theoretical alignment and minimizing measurement bias [39].
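As a minimal sketch, the item-level CVI (proportion of experts rating an item 3 or 4 on a 4-point relevance scale) and the scale-level average, S-CVI/Ave, can be computed as follows; the expert-by-item rating matrix is invented for illustration:

```python
# Item-level and scale-level CVI; the 6-expert x 5-item rating matrix
# (4-point relevance scale) is an illustrative assumption.
import numpy as np

ratings = np.array([
    [4, 3, 2, 4, 4],
    [4, 4, 3, 3, 4],
    [3, 4, 2, 4, 3],
    [4, 3, 3, 4, 4],
    [4, 4, 2, 3, 4],
    [3, 4, 3, 4, 4],
])

relevant = ratings >= 3        # dichotomize: 3 or 4 = relevant
i_cvi = relevant.mean(axis=0)  # I-CVI: proportion of experts rating item relevant
s_cvi_ave = i_cvi.mean()       # S-CVI/Ave: mean of the I-CVIs

print("I-CVI per item:", np.round(i_cvi, 2))
print(f"S-CVI/Ave = {s_cvi_ave:.2f}")
```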
Criterion validity assesses how well a questionnaire’s outcomes correlate with external benchmarks or a “gold standard”. A questionnaire is considered valid if its scores strongly correlate with those of an established criterion. This type of validity includes predictive validity, which tests whether the instrument accurately predicts future outcomes, and concurrent validity, which evaluates the questionnaire’s alignment with established measures administered at the same time. For example, a new mental health questionnaire should demonstrate high concurrent validity by correlating strongly with validated tools measuring similar constructs [40].
Construct validity examines whether a questionnaire accurately measures the theoretical construct it is intended to assess. It encompasses two main components: discriminant/convergent validity and factorial validity. Convergent validity is demonstrated when a scale correlates strongly with other instruments measuring similar constructs, whereas discriminant validity is confirmed when the scale does not correlate with unrelated instruments. Factorial validity assesses the internal structure of a scale, ensuring that its items accurately reflect the underlying construct. Factorial validity is typically evaluated using exploratory or confirmatory factor analysis to examine how well items cluster and represent the theoretical phenomenon [41]. By establishing construct validity, researchers can ensure that a questionnaire provides an accurate operationalization of the intended concept.
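As an illustrative sketch of factorial validity testing, an exploratory factor analysis can be run with the third-party factor_analyzer package; the synthetic six-item data set, the number of factors, and the varimax rotation are assumptions for the example:

```python
# Illustrative exploratory factor analysis; the simulated two-construct,
# six-item data set and analysis settings are assumptions.
import numpy as np
import pandas as pd
from factor_analyzer import FactorAnalyzer

rng = np.random.default_rng(0)
n = 300
anxiety = rng.normal(size=n)   # hypothetical latent construct 1
fatigue = rng.normal(size=n)   # hypothetical latent construct 2

# Six items: three loading on each latent construct, plus noise
df = pd.DataFrame(
    {f"anx{i}": anxiety + rng.normal(scale=0.6, size=n) for i in range(1, 4)}
    | {f"fat{i}": fatigue + rng.normal(scale=0.6, size=n) for i in range(1, 4)}
)

fa = FactorAnalyzer(n_factors=2, rotation="varimax")
fa.fit(df)

loadings = pd.DataFrame(fa.loadings_, index=df.columns,
                        columns=["Factor 1", "Factor 2"])
print(loadings.round(2))          # items should load on their intended factor
print(fa.get_factor_variance())   # variance explained per factor
```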

5.3. Pre-Testing and Pilot Testing

While both pre-testing and pilot testing are essential for enhancing the validity and reliability of survey results, they serve distinct purposes in questionnaire development. Pre-testing represents the initial stage, primarily aimed at refining individual questions to ensure clarity, cultural appropriateness, and correct interpretation by respondents. This stage often involves qualitative methods, such as cognitive interviews, where participants provide feedback on question comprehension and wording [42]. Pre-testing is typically conducted with questionnaire developers, subject-matter experts, and a small number of individuals resembling the target population. Because the goal is conceptual clarity rather than statistical inference, purposive samples of 5 to 10 domain experts and several representatives of the target population are usually sufficient. Common issues such as ambiguous wording, technical jargon, double-barreled questions, missing or inappropriate response options, and culturally insensitive content are documented and corrected. Addressing these problems strengthens the questionnaire’s content validity, ensuring that the instrument measures what it intends to capture.
Pilot testing, in contrast, is a broader evaluation of the entire survey questionnaire. It is conducted with a small, representative sample from the intended study population and aims to assess the overall functionality of the survey. Although samples of 5 to 15 respondents are sometimes used, a pilot sample of at least 30 participants is generally recommended to allow meaningful psychometric evaluation [43]. Pilot testing assesses the logical flow and sequencing of questions, estimated completion time, clarity of instructions, and feasibility of data collection procedures in real-world conditions [44]. Moreover, it provides an opportunity to assess preliminary reliability and construct validity. Statistical analyses may include internal consistency measures (e.g., Cronbach’s alpha, McDonald’s omega), exploratory factor analysis to explore the underlying structure of constructs, and item–total correlations to identify weak items. A subset of participants may be asked to complete the questionnaire again after 1 to 2 weeks to estimate test–retest reliability of the instrument.

6. Cultural Adaptation

Cultural adaptation is critical when developing questionnaires for use across diverse populations. It ensures that instruments are not only linguistically accurate but also culturally relevant, meaningful, and sensitive to the target population’s norms and experiences. Cultural adaptation extends beyond simple translation by addressing conceptual, social, and contextual factors that influence how respondents interpret and answer questions [45]. Without proper adaptation, questionnaires may introduce biases, compromise validity, and limit the comparability of results across different cultural and linguistic groups [46].
An example of how inadequate cultural adaptation can compromise the validity and effectiveness of interventions is noted in the Brazilian PROERD program, which is a direct translation of the North American DARE-keepin’ it REAL curriculum. Mismatches in language, cultural references, and infrastructural assumptions limited the program’s relevance and accessibility, contributing to its failure to prevent substance use in a randomized trial [47].
Cultural adaptation is an iterative, multi-step process that requires the collaboration of professionals from different disciplinary backgrounds:
(1) Forward translation: conducted by two independent professional translators who are fluent in both the source and target languages and ideally possess subject-matter expertise. Their role is to ensure technical accuracy while preserving cultural and contextual appropriateness.
(2) Back-translation: performed by a separate translator unfamiliar with the original version to identify semantic discrepancies and shifts in meaning, thereby improving translation fidelity.
(3) Expert committee review: a multidisciplinary panel is essential for ensuring cultural and methodological rigor. This team should include a psychometrician or survey methodologist, at least one healthcare or domain-specific professional, the original translators, and, if possible, the original instrument developer. The panel must also include at least one native speaker of the target language who is culturally embedded in the population of interest. The team collaboratively evaluates the translations for linguistic clarity, conceptual equivalence, and cultural sensitivity, flagging any content that may be misunderstood or offensive.
(4) Pre-testing with the target population: a pre-final version is administered to a small, representative sample of the intended population. This stage combines quantitative feedback (e.g., response distributions, item non-response) with qualitative methods, such as cognitive interviews or think-aloud protocols. These methods reveal misunderstandings, ambiguous wording, or culturally inappropriate references.
(5) Revision and finalization: feedback from pre-testing is used to iteratively refine the questionnaire. This may involve modifying question wording, altering response scales, or replacing culturally incongruent examples. Multiple rounds of testing and refinement may be necessary to produce a culturally valid, conceptually coherent, and linguistically accessible instrument [48].
The cultural adaptation process is often underestimated in terms of both complexity and cost. It demands considerable time, expertise, and financial resources, particularly when involving professional translators, qualitative interviewers, and expert panels. Under-budgeting this process can compromise the quality of the adaptation and ultimately undermine data validity [49].
If a translated version of the questionnaire already exists, researchers must verify that it is the official, validated version, not an informal or unpublished adaptation. Relying on unauthorized or poorly adapted translations risks introducing measurement errors and reducing construct validity. Even existing translations should be critically reviewed for cultural relevance and fidelity to the original content.
Finally, comprehensive documentation of each step in the adaptation process is essential for transparency and reproducibility. It allows future researchers to evaluate the rigor of the adaptation, replicate the process in other contexts, and build upon the instrument’s validity evidence over time.

7. Ethical Considerations

Ethical considerations are paramount in any research involving human participants, including studies using questionnaires. While such studies are generally low-risk, researchers must adhere to ethical principles to protect participants’ rights, dignity, and well-being. Although questionnaire studies are often non-interventional, obtaining formal ethics review or waivers is advisable, even in jurisdictions where full ethics approval is not mandated [50]. Ethical oversight helps identify potential risks, ensures compliance with institutional and legal standards, and is particularly critical when studying vulnerable or marginalized populations. These groups may face challenges in understanding participation implications or be at a higher risk of exploitation due to their social status or health conditions.
While questionnaire participation typically poses no physical harm, informational and psychological risks may arise when sensitive topics are explored or personal data are collected [51]. For example, while online surveys offer distinct advantages, such as reduced administration costs, increased accessibility, and lower susceptibility to social desirability bias due to perceived anonymity, the digital format also introduces new ethical and methodological challenges, particularly concerning data validity, privacy, and cybersecurity [52]. Unauthorized access, data interception during transmission, and insecure storage of sensitive information present tangible risks, especially in biomedical contexts where health-related or personally identifiable data are involved.
To mitigate these risks, researchers should implement robust technical safeguards. These include encryption to secure data in transit and secure authentication to prevent unauthorized access. Moreover, deploying questionnaires through platforms that comply with applicable data protection regulations, such as the Health Insurance Portability and Accountability Act (HIPAA) in the United States, is strongly recommended [53].
In addition to digital safeguards, physical records containing sensitive participant information must be stored in locked, access-controlled environments to prevent unauthorized access. Access should be restricted to authorized personnel only. Furthermore, institutions should establish clear protocols for secure data disposal. Physical documents should be shredded or incinerated, and digital records should be permanently deleted using certified data-wiping software after the retention period, typically five to seven years, or as specified by institutional policies.
Beyond these technical measures, transparency in data handling is essential. Researchers should clearly communicate how participant data will be de-identified, aggregated, or linked to other datasets. Explicit informed consent should be obtained not only for initial data collection but also for any data sharing or secondary use [54]. Participants must be made aware of their rights, including the right to withdraw from the study and the extent to which their data can be erased upon request. By implementing these safeguards, researchers uphold participant trust, data integrity, and ethical research practices.

8. Reporting of Findings

Effective reporting of questionnaire study findings enhances transparency, reproducibility, and practical application. However, deficiencies in survey reporting remain prevalent across medical research [55,56]. To address this, several evidence-based guidelines and checklists are available, including those listed on the EQUATOR Network website (https://www.equator-network.org/), to guide researchers in reporting survey studies effectively. The Checklist for Reporting Results of Internet E-Surveys (CHERRIES) is the gold standard for reporting online surveys. It addresses unique challenges of web-based data collection, such as manipulation prevention and response metrics [57]. In addition, the Consensus-Based Checklist for Reporting of Survey Studies (CROSS), established through expert consensus from 2018 to 2019, offers a comprehensive guide for reporting both web-based and non-web-based surveys [58].
Latour and Tume [59] proposed the SURVEY mnemonic as a practical tool and structured approach for ensuring robust survey reporting. S represents study design and objectives, emphasizing the need for a clear research question and a well-defined methodological approach. U stands for understanding the population and sampling, ensuring that the target respondents are appropriately selected and representative of the study population. R highlights response rate and representativeness, acknowledging the impact of low participation or selection bias on the generalizability of findings. V refers to validity and reliability of the tool, emphasizing the importance of using psychometrically sound instruments that accurately measure the intended constructs. E addresses ethical considerations, such as obtaining informed consent, maintaining participant confidentiality, and adhering to ethical guidelines. Finally, Y signifies the yield of findings and interpretation, focusing on how the collected data will be analyzed, contextualized, and applied to improve patient care and clinical decision-making.

9. Future Directions in Questionnaire Development

Recent advances in generative artificial intelligence (AI) have introduced promising innovations in questionnaire development. In particular, large language models (LLMs) have demonstrated substantial potential across various stages of the research process, including item generation, sampling, data management, analysis, and reporting [60]. For example, LLMs can generate culturally sensitive and linguistically refined questionnaire items that go beyond literal translation by capturing subtle connotations and avoiding potentially offensive or ambiguous language.
AI chatbots can administer questionnaires in a conversational format that simulates human interaction [61]. This interactive interface can guide participants through complex survey structures and provide real-time clarification for unclear items. Such interactivity enhances engagement and helps reduce respondent fatigue, particularly in lengthy or branching questionnaires.
Adaptive testing, a technique originating in educational psychometrics, represents another AI-enabled advancement. By dynamically adjusting item difficulty based on a respondent’s previous answers, adaptive testing improves assessment efficiency while maintaining the instrument’s sensitivity to individual response patterns [62]. This approach both minimizes participant burden and increases the precision and depth of data collection. Moreover, in qualitative research, LLMs can automate sentiment analysis and thematic coding, thereby reducing the time and human resources typically required for manual coding [63].
Despite these advantages, integrating AI into questionnaire development introduces several methodological and ethical challenges. LLMs are trained on large datasets that may encode cultural, gender-based, or socioeconomic biases, potentially resulting in biased content. Moreover, the lack of algorithmic transparency raises concerns regarding the reproducibility and epistemic validity of AI-generated instruments [60,64].
A further issue is the absence of explicit theoretical grounding in AI-generated items. These models generate content based on probabilistic language associations rather than established conceptual frameworks. Although such items may demonstrate face validity, they may fail to capture the intended constructs, necessitating rigorous empirical validation to ensure construct validity [65].
Future research should prioritize the validation of AI-generated items, the development of bias mitigation strategies [66], and the systematic evaluation of AI tools in qualitative analytic paradigms. In addition, data privacy and security must be upheld. One potential solution is the use of locally deployed LLMs, which can mitigate risks associated with transmitting sensitive participant data to external servers. This approach may enhance compliance with data protection regulations and uphold ethical standards in research involving human subjects [67]. With careful integration and appropriate oversight, AI technologies hold significant promise for advancing efficient, inclusive, and ethically responsible questionnaire development.

10. Conclusions

Questionnaires are essential tools in health research, enabling systematic data collection on perceptions, behaviors, and health outcomes. When adopting or developing questionnaires, researchers must ensure reliability, validity, and ethical transparency to produce robust and credible findings. While established questionnaires facilitate comparability across studies, new instruments may be required to address emerging health concerns and evolving cultural contexts.
Rigorous item generation, pilot testing, psychometric evaluation, and adherence to reporting guidelines are critical for enhancing the scientific integrity of self-report data. Additionally, AI-driven advancements in survey research offer promising opportunities to refine questionnaire design and improve respondent engagement.
By integrating best practices in survey methodology, researchers can maximize the utility of questionnaires, ultimately enhancing health data quality and informing evidence-based policy and practice.

Author Contributions

Conceptualization, M.K. and S.-W.Y.; writing—original draft preparation, M.K. and S.-W.Y.; writing—review and editing, M.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study. Therefore, data sharing is not applicable to this article.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AI: artificial intelligence
CVI: Content Validity Index
ICC: intraclass correlation coefficient
LLMs: large language models
VAS: visual analogue scale

References

1. Boynton, P.M.; Greenhalgh, T. Selecting, designing, and developing your questionnaire. BMJ 2004, 328, 1312–1315.
2. Thwaites Bee, D.; Murdoch-Eaton, D. Questionnaire design: The good, the bad and the pitfalls. Arch. Dis. Child. Educ. Pract. Ed. 2016, 101, 210–212.
3. Craig, C.L.; Marshall, A.L.; Sjöström, M.; Bauman, A.E.; Booth, M.L.; Ainsworth, B.E.; Pratt, M.; Ekelund, U.; Yngve, A.; Sallis, J.F.; et al. International physical activity questionnaire: 12-country reliability and validity. Med. Sci. Sports Exerc. 2003, 35, 1381–1395.
4. Spitzer, R.L.; Kroenke, K.; Williams, J.B.; Löwe, B. A brief measure for assessing generalized anxiety disorder: The GAD-7. Arch. Intern. Med. 2006, 166, 1092–1097.
5. Heller, G.Z.; Manuguerra, M.; Chow, R. How to analyze the Visual Analogue Scale: Myths, truths and clinical relevance. Scand. J. Pain 2016, 13, 67–75.
6. Michielsen, H.J.; De Vries, J.; Van Heck, G.L. Psychometric qualities of a brief self-rated fatigue measure: The Fatigue Assessment Scale. J. Psychosom. Res. 2003, 54, 345–352.
7. Sellbom, M. The MMPI-2-Restructured Form (MMPI-2-RF): Assessment of personality and psychopathology in the twenty-first century. Annu. Rev. Clin. Psychol. 2019, 15, 149–177.
8. Younas, A.; Porr, C. A step-by-step approach to developing scales for survey research. Nurse Res. 2018, 26, 14–19.
9. Rana, K.; Chimoriya, R. A guide to a mixed-methods approach to healthcare research. Encyclopedia 2025, 5, 51.
10. Meadows, K.A. So you want to do research? 5: Questionnaire design. Br. J. Community Nurs. 2003, 8, 562–570.
11. Ball, H.L. Conducting online surveys. J. Hum. Lact. 2019, 35, 413–417.
12. Koo, M.; Yang, S.-W. Likert-Type Scale. Encyclopedia 2025, 5, 18.
13. van de Mortel, T.F. Faking it: Social desirability response bias in self-report research. Aust. J. Adv. Nurs. 2008, 25, 40–48.
14. Miller, T.M.; Abdel-Maksoud, M.F.; Crane, L.A.; Marcus, A.C.; Byers, T.E. Effects of social approval bias on self-reported fruit and vegetable consumption: A randomized controlled trial. Nutr. J. 2008, 7, 18.
15. Hoffmann, A.; Meisters, J.; Musch, J. Nothing but the truth? Effects of faking on the validity of the crosswise model. PLoS ONE 2021, 16, e0258603.
16. Choi, B.C.; Pak, A.W. A catalog of biases in questionnaires. Prev. Chronic Dis. 2005, 2, A13.
17. Ghafourifard, M. Survey fatigue in questionnaire-based research: The issues and solutions. J. Caring Sci. 2024, 13, 214–215.
18. Ranganathan, P.; Caduff, C. Designing and validating a research questionnaire—Part 1. Perspect. Clin. Res. 2023, 14, 152–155.
19. Juniper, E.F. Validated questionnaires should not be modified. Eur. Respir. J. 2009, 34, 1015–1017.
20. Wong, D.L.; Baker, C.M. Pain in children: Comparison of assessment scales. Pediatr. Nurs. 1988, 14, 9–17.
21. Rattray, J.; Jones, M.C. Essential elements of questionnaire design and development. J. Clin. Nurs. 2007, 16, 234–243.
22. Mallen, C.D.; Dunn, K.M.; Thomas, E.; Peat, G. Thicker paper and larger font increased response and completeness in a postal survey. J. Clin. Epidemiol. 2008, 61, 1296–1300.
23. Banerjee, J.; Bhattacharyya, M. Selection of the optimum font type and size interface for on-screen continuous reading by young adults: An ergonomic approach. J. Hum. Ergol. 2011, 40, 47–62.
24. Gadhvi, M.A.; Baranwal, A.; Chalakapure, A.; Dixit, A. Font matters: Deciphering the impact of font types on attention and working memory. Cureus 2024, 16, e59845.
25. Sharma, H. How short or long should a questionnaire be for any research? Researchers’ dilemma in deciding the appropriate questionnaire length. Saudi J. Anaesth. 2022, 16, 65–68.
26. Dillman, D.A. The design and administration of mail surveys. Annu. Rev. Sociol. 1991, 17, 225–249.
27. Edwards, P.; Roberts, I.; Clarke, M.; DiGuiseppi, C.; Pratap, S.; Wentz, R.; Kwan, I. Increasing response rates to postal questionnaires: Systematic review. BMJ 2002, 324, 1183–1185.
28. Kang, H. Sample size determination and power analysis using the G*Power software. J. Educ. Eval. Health Prof. 2021, 18, 17.
29. Althubaiti, A. Sample size determination: A practical guide for health researchers. J. Gen. Fam. Med. 2022, 24, 72–78.
30. Sijtsma, K. On the use, the misuse, and the very limited usefulness of Cronbach’s alpha. Psychometrika 2009, 74, 107–120.
31. Dunn, T.J.; Baguley, T.; Brunsden, V. From alpha to omega: A practical solution to the pervasive problem of internal consistency estimation. Br. J. Psychol. 2014, 105, 399–412.
32. Koo, T.K.; Li, M.Y. A guideline for selecting and reporting intraclass correlation coefficients for reliability research. J. Chiropr. Med. 2016, 15, 155–163.
33. Cook, D.A.; Beckman, T.J. Current concepts in validity and reliability for psychometric instruments: Theory and application. Am. J. Med. 2006, 119, 166.e7–166.e16.
34. Hays, R.D.; Morales, L.S.; Reise, S.P. Item response theory and health outcomes measurement in the 21st century. Med. Care 2000, 38, II28–II42.
35. Bloch, R.; Norman, G. Generalizability theory for the perplexed: A practical introduction and guide: AMEE Guide No. 68. Med. Teach. 2012, 34, 960–992.
36. Downing, S.M.; Haladyna, T.M. Validity threats: Overcoming interference with proposed interpretations of assessment data. Med. Educ. 2004, 38, 327–333.
37. Allen, M.S.; Robson, D.A.; Iliescu, D. Face validity: A critical but ignored component of scale construction in psychological assessment. Eur. J. Psychol. Assess. 2023, 39, 153–156.
38. Grant, J.S.; Davis, L.L. Selection and use of content experts for instrument development. Res. Nurs. Health 1997, 20, 269–274.
39. Almanasreh, E.; Moles, R.; Chen, T.F. Evaluation of methods used for estimating content validity. Res. Soc. Adm. Pharm. 2019, 15, 214–221.
40. Streiner, D.; Norman, G.; Cairney, J. Health Measurement Scales: A Practical Guide to Their Development and Use; Oxford University Press: Oxford, UK, 2015; pp. 233–235.
41. Goodwin, L.D.; Goodwin, W.L. Focus on psychometrics: Estimating construct validity. Res. Nurs. Health 1991, 14, 235–243.
42. Bowden, A.; Fox-Rushby, J.A.; Nyandieka, L.; Wanjau, J. Methods for pre-testing and piloting survey questions: Illustrations from the KENQOL survey of health-related quality of life. Health Policy Plan. 2002, 17, 322–330.
43. Perneger, T.V.; Courvoisier, D.S.; Hudelson, P.M.; Gayet-Ageron, A. Sample size for pre-tests of questionnaires. Qual. Life Res. 2015, 24, 147–151.
44. Drennan, J. Cognitive interviewing: Verbal data in the design and pretesting of questionnaires. J. Adv. Nurs. 2003, 42, 57–63.
45. Beaton, D.E.; Bombardier, C.; Guillemin, F.; Ferraz, M.B. Guidelines for the process of cross-cultural adaptation of self-report measures. Spine 2000, 25, 3186–3191.
46. Gjersing, L.; Caplehorn, J.R.; Clausen, T. Cross-cultural adaptation of research instruments: Language, setting, time and statistical considerations. BMC Med. Res. Methodol. 2010, 10, 13.
47. Valente, J.Y.; Franciosi, B.; Garcia-Cerde, R.; Pietrobon, T.; Sanchez, Z.M. Addressing the needs for cultural adaptation of DARE-keepin’ it REAL among Brazilian students: Strategies to improve implementation. Subst. Abuse Treat. Prev. Policy 2024, 19, 48.
48. Sousa, V.D.; Rojjanasrirat, W. Translation, adaptation and validation of instruments or scales for use in cross-cultural health care research: A clear and user-friendly guideline. J. Eval. Clin. Pract. 2011, 17, 268–274.
49. Cruchinho, P.; López-Franco, M.D.; Capelas, M.L.; Almeida, S.; Bennett, P.M.; Miranda da Silva, M.; Teixeira, G.; Nunes, E.; Lucas, P.; Gaspar, F.; et al. Translation, cross-cultural adaptation, and validation of measurement instruments: A practical guideline for novice researchers. J. Multidiscip. Healthc. 2024, 17, 2701–2728.
50. Evans, M.; Robling, M.; Maggs Rapport, F.; Houston, H.; Kinnersley, P.; Wilkinson, C. It doesn’t cost anything just to ask, does it? The ethics of questionnaire-based research. J. Med. Ethics 2002, 28, 41–44.
51. Whicher, D.; Wu, A.W. Ethics review of survey research: A mandatory requirement for publication? Patient 2015, 8, 477–482.
52. Hailu, A.; Rahman, S.M. Security concerns for web-based research survey. In Proceedings of the 7th International Conference on Electrical and Computer Engineering (ICECE), Dhaka, Bangladesh, 20–22 December 2012; pp. 358–361.
53. Mocydlarz-Adamcewicz, M.; Bajsztok, B.; Filip, S.; Petera, J.; Mestan, M.; Malicki, J. Management of onsite and remote communication in oncology hospitals: Data protection in an era of rapid technological advances. J. Pers. Med. 2023, 13, 761.
54. Hutchings, E.; Loomes, M.; Butow, P.; Boyle, F.M. A systematic literature review of health consumer attitudes towards secondary use and sharing of health administrative and clinical trial data: A focus on privacy, trust, and transparency. Syst. Rev. 2020, 9, 235.
55. Shankar, P.R.; Maturen, K.E. Survey research reporting in radiology publications: A review of 2017 to 2018. J. Am. Coll. Radiol. 2019, 16, 1378–1384.
56. Li, A.H.; Thomas, S.M.; Farag, A.; Duffett, M.; Garg, A.X.; Naylor, K.L. Quality of survey reporting in nephrology journals: A methodologic review. Clin. J. Am. Soc. Nephrol. 2014, 9, 2089–2094.
57. Eysenbach, G. Improving the quality of Web surveys: The Checklist for Reporting Results of Internet E-Surveys (CHERRIES). J. Med. Internet Res. 2004, 6, e34.
58. Sharma, A.; Minh Duc, N.T.; Luu Lam Thang, T.; Nam, N.H.; Ng, S.J.; Abbas, K.S.; Huy, N.T.; Marušić, A.; Paul, C.L.; Kwok, J.; et al. A consensus-based checklist for reporting of survey studies (CROSS). J. Gen. Intern. Med. 2021, 36, 3179–3187.
59. Latour, J.M.; Tume, L.N. How to do and report survey studies robustly: A helpful mnemonic SURVEY. Nurs. Crit. Care 2021, 26, 313–314.
60. Jansen, B.J.; Jung, S.G.; Salminen, J. Employing large language models in survey research. Nat. Lang. Proc. J. 2023, 4, 100020.
61. Rhim, J.; Kwak, M.; Gong, Y.; Gweon, G. Application of humanization to survey chatbots: Change in chatbot perception, interaction experience, and survey data quality. Comput. Hum. Behav. 2022, 126, 107034.
62. Hadzhikoleva, S.; Rachovski, T.; Ivanov, I.; Hadzhikolev, E.; Dimitrov, G. Automated test creation using large language models: A practical application. Appl. Sci. 2024, 14, 9125.
63. Bennis, I.; Mouwafaq, S. Advancing AI-driven thematic analysis in qualitative research: A comparative study of nine generative models on Cutaneous Leishmaniasis data. BMC Med. Inform. Decis. Mak. 2025, 25, 124.
64. Bahammam, A.S.; Trabelsi, K.; Pandi-Perumal, S.R.; Jahrami, H. Adapting to the impact of artificial intelligence in scientific writing: Balancing benefits and drawbacks while developing policies and regulations. J. Nat. Sci. Med. 2023, 6, 152–158.
65. Falcão, F.; Pereira, D.M.; Gonçalves, N.; De Champlain, A.; Costa, P.; Pêgo, J.M. A suggestive approach for assessing item quality, usability and validity of Automatic Item Generation. Adv. Health Sci. Educ. 2023, 28, 1441–1465.
66. Wei, X.; Kumar, N.; Zhang, H. Addressing bias in generative AI: Challenges and research opportunities in information management. Inf. Manag. 2025, 62, 104103.
67. Williamson, S.M.; Prybutok, V. Balancing privacy and progress: A review of privacy challenges, systemic oversight, and patient perceptions in AI-driven healthcare. Appl. Sci. 2024, 14, 675.
Table 1. Description and typical applications of common terms in questionnaire-based research.

Term | Description | Typical Use
Survey | A broad methodological approach involving the design, administration, and analysis of data collection procedures. | To gather information from a population using various tools and techniques.
Instrument | Any tool or device used to measure variables in research. | General term for tools such as questionnaires, scales, or tests.
Questionnaire | A structured set of questions used to collect self-reported data from respondents. | Used to assess behaviors, attitudes, symptoms, or experiences.
Scale | A component or standalone tool designed to measure a single latent construct. | Used to generate a score representing a specific psychological or health-related variable (e.g., anxiety, pain).
Inventory | A comprehensive questionnaire designed to measure multiple constructs or dimensions. | Used to assess complex traits or multidimensional profiles (e.g., personality, coping styles).
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
