1. Introduction
Geographical regions and their related conditions are essential for the economic development of any area. When favorable, these conditions lead to the clustering of innovative firms [
1]. Geographical elements are termed ecosystems, and these elements affect business innovation and job creation. Favorable regions thrive on labor markets, information, and networks [
2]. The ecosystem approach encompasses entrepreneurial university ecosystems, location-based innovative clusters, social entrepreneurship ecosystems, and sustainable entrepreneurship ecosystems [
3].
Conceptual attempts have been made to better understand the entrepreneurial ecosystem and why firms cluster in those regions, for example, references [
4,
5,
6]. However, empirical measurement is still in its nascent stage and continues to evolve [
3]. The perception-based scale of the entrepreneurial ecosystem was developed by reference [
3], and this scale has been used to collect data. The scale is necessary to understand the perspective of entrepreneurs, although the quality of entrepreneurial ecosystem conditions in each region may differ in terms of psychophysical behaviors. Such scales are important for capturing feelings, perceptions, and behaviors that cannot be measured with a single variable. Because instruments or scales are always error-prone, the use of multiple items measuring the underlying latent traits or constructs leads to more generalizable and precise research findings [
4]. There is an extensive literature on the theory of scale development; however, scales that measure physical, mental, and behavioral attributes, which are important for scientific inquiry, remain incomplete [
5].
Scales can be validated using either classical test theory (CTT) or item response theory (IRT). CTT assumes that reliability is constant across all respondents, regardless of their ability levels, while in IRT, reliability depends on the ability levels or traits of the respondents. These differences can affect the reliability and validity of the instruments or scales; therefore, conclusions drawn from the scales may not be accurate [
6].
Therefore, the main aim of this research is to assess whether the entrepreneurial ecosystem (EE) scale possesses acceptable psychometric properties, e.g., the discrimination and difficulty of the items, and to identify which items in the scale are problematic. IRT helps to validate and assess the properties of the scale at the item level because the IRT measurement framework is extensively used in scoring data from questionnaires or scales [
7].
This would be the first application of IRT to the EE scale in a developing country such as Pakistan. The IRT measurement framework and its key ideas are applied to individual items, to the estimation of abilities or latent traits, and to measurement error.
1.1. Study Motivation
This study was conducted to evaluate and validate the EE scale using IRT. The chi-square fit index was applied at the item level, and the items lacking goodness of fit were identified.
Empirical research on measuring the entrepreneurial ecosystem is scarce [
8], and no study has evaluated the reliability of the entrepreneurial ecosystem scale developed by reference [
3] using IRT. Therefore, this study highlights a few studies in other fields where IRT has been applied.
1.2. Literature Review
Researchers are concerned about the quality of the EE in different regions and countries; policy makers, on the other hand, want to identify policy action points within entrepreneurial ecosystems [
9]. The ecosystem metaphor is borrowed from biology, and an EE is defined as a set of interdependent factors and actors whose interactions affect the viability of entrepreneurial activity in a particular region [
10].
Researchers focus on the strategic perspective of the EE and seek to understand the complex interactions between actors and factors within it, which allows them to look inside the black box of the EE. These actors work toward the common goals of job creation and economic growth [
11]. The EE has six domains: (1) finance, (2) policy, (3) markets, (4) human resources, (5) support, and (6) culture [
3]. Policy concerns the extent to which government supports entrepreneurial activity through favorable legislation, rules, and laws; finance primarily concerns access to finance; markets concern diaspora networks, distribution channels, and early adopters; human resources concern access to human capital in terms of training, the technical workforce, and universities; culture deals with values and attitudes towards innovation and business venturing; and support includes infrastructure, supporting professions, and entrepreneur-friendly programs and institutions [
3].
Developing countries, in contrast to developed countries, often lack proper support for startups, beneficial legal rules and procedures, access to human capital, adequate infrastructure, and finance, among other challenges [
12]; therefore, measuring the entrepreneurial ecosystem in a certain region to identify policy action points is very important for developing countries to foster entrepreneurship and reap rewards from it.
Souza et al. [
13] used IRT on the entrepreneurship attitude scale developed by Souza and Lopes Jr. (2005) [
14]. This scale has two dimensions: (1) innovation and prospection and (2) measurement and persistence. They used the graded response model (GRM) to evaluate the scale and found that it has two levels and was able to distinguish between respondents with different ability levels.
Wu et al. (2015) [
15] used IRT to validate, operationalize, and conceptualize the multidimensional, as well as multirole, “Entrepreneurial Behavior” scale in retailing businesses, and they found that IRT yielded robust, reliable measurements of the “Multirole and Multidimensional construct”.
Harrison et al. (2017) [
16] applied IRT to a psychological capacity, namely melodic discrimination, or the ability to detect differences between two or more melodies. Their results support the application of IRT, with strong construct reliability and validity. Şen and Toker (2021) [
17] applied multilevel mixture IRT, using six different multilevel models, to an eighth-grade dataset. Their results indicated that the model with one school-level and four student-level latent classes provided the best fit. IRT was also applied by reference [
12]. Using the generalized partial credit model (GPCM), they found that IRT was suitable for the instrument, identified item difficulty levels at different grades, and reported that 52% of students had above-average mathematical literacy.
Lemée (2019) [
18] applied IRT to evaluate coping mechanisms in risky environmental situations. Their results suggest that 10 of the 23 items were sufficient to measure passive and active coping willingness. Moreover, their study found that IRT can reveal the link between willingness to cope and other factors of interest. Zampetakis et al. (2015) [
19] used IRT with the GRM to evaluate whether anticipated effect predictions conform to the question of “what people usually do” or “what people can do”. Their findings suggest that self-reported responses to the expected effect correspond to maximal behavior.
Terman and Burke (2021) [
20] used IRT to examine the disability-related questions in the National Health and Nutrition Examination Survey. Their findings showed a high degree of information for distinguishing individuals with higher-than-mean limitations and no resolution for individuals with lower-than-mean activity limitations. IRT was applied by Cordier et al. (2019) [
21] to the pragmatics observational measure; their findings showed that the scale needed revision, since they observed significant covariances in a large and complex dataset.
Barbosa et al. (2021) [
22] applied IRT to validate and assess the “impacts of Integrated Management System” instrument, whose purpose is to assess the effect of an integrated management system on the performance of an organization. They assessed the discriminating capacity and the difficulty level of the items in the instrument and found that both were good. Their study revealed six levels for measuring the impact of integrated management systems.
Silvia (2021) [
23] evaluated the psychometric properties of a Likert item scale using polytomous IRT. The findings showed that the graded response model provided the best fit, and the threshold estimates were close to a range of one to five. To date, no study has applied IRT to the EE scale developed by Ligouri (2019) [
3].
1.3. Problem Statement
A questionnaire or instrument can be evaluated using CTT or IRT. CTT uses simplified measures, for example, Cronbach’s alpha, while IRT provides reliability at the item level. IRT generates the item category characteristic curve (ICCC) and the test information function (TIF), which show precision at different values of theta, that is, the ability levels of the respondents. Another difference is that CTT assumes constant reliability and error across all respondents. IRT, on the other hand, assumes that instruments are always error-prone and that there is always a difference between a person’s expected and true scores. IRT calculates a probability based on item characteristics and latent trait scores. The outcome measure in IRT is based on theta, whereas in CTT it is based on the distribution of sum scores [
24].
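As a point of reference for the CTT side of this comparison, the following minimal sketch computes Cronbach’s alpha from a respondents-by-items score matrix. The function name and the toy responses are hypothetical and are not taken from the study data; the sketch only illustrates that CTT summarizes reliability with a single coefficient for the whole sample.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n_items = len(responses[0])
    # Sample variance of each item across respondents.
    item_variances = [
        variance([row[i] for row in responses]) for i in range(n_items)
    ]
    # Sample variance of the summed score, the CTT outcome measure.
    total_variance = variance([sum(row) for row in responses])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical 5-point Likert responses (4 respondents x 3 items), not study data.
toy_responses = [
    [1, 2, 1],
    [2, 3, 2],
    [4, 4, 5],
    [5, 5, 4],
]
alpha = cronbach_alpha(toy_responses)
print(f"alpha = {alpha:.3f}")
```

The coefficient is one number for all respondents, which is the constant-reliability assumption that IRT relaxes.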
The authors of references [
19,
20] have argued that CTT assumes equal measurement precision for all respondents without considering their individual abilities, while IRT uses a measurement precision that depends on latent traits. Another advantage cited is the use of the two-parameter logistic (2PL) model and the GRM. These models are used to score items when computing latent trait scores; thus, IRT reveals minor changes in the mental ability of individuals. IRT also facilitates the use of pre-test and post-test questions [
6].
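To make the role of item characteristics and theta concrete, the sketch below computes category probabilities for a single Likert item under the graded response model in its standard logistic form. The parameter values are hypothetical and are not estimates from this study; with a single threshold, the same function reduces to the 2PL model.

```python
import math


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under the graded response model.

    theta: latent trait; a: discrimination; thresholds: increasing b_1..b_{K-1}.
    """
    # Cumulative probabilities P(X >= k): 1 for the lowest category, 0 past the highest.
    cumulative = [1.0] + [logistic(a * (theta - b)) for b in thresholds] + [0.0]
    # Each category probability is the difference of adjacent cumulative curves.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]


# Hypothetical parameters for a 5-point Likert item (not estimates from the study).
probs = grm_category_probs(theta=0.0, a=1.57, thresholds=[-2.0, -1.0, 0.5, 1.5])
for category, p in enumerate(probs, start=1):
    print(f"P(X = {category} | theta = 0) = {p:.3f}")
```

A larger discrimination value makes the cumulative curves steeper, so small differences in theta shift the category probabilities more strongly.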
4. Discussion
The entrepreneurial ecosystem concept is gaining popularity, and there has been a continuous rise in scholarly research on measuring the entrepreneurial ecosystem in order to assess the environment for entrepreneurs in a given region. Such measurement can provide policy action points to improve the entrepreneurial ecosystem at the regional, state, and national levels.
The perception-based instrument of EE was developed by Ligouri (2019) [
3] and expanded by reference [
1]. This questionnaire has 48 items on a 5-point Likert scale. The pilot study validated the scale, and its reliability was found to be satisfactory. The scale was assessed through factor and parallel analysis, and it was found to be unidimensional; therefore, the multidimensional IRT model was used (see
Table 1 and
Table 2 and
Figure 1).
With the help of IRT, it was possible to validate the scale by calculating the discrimination and difficulty levels for each response, that is, the level of agreement on the domains of the entrepreneurial ecosystem (general finance, capital finance, general support, support professions, human resources, culture, market, and policy), together with the ability levels (θ). The IRT results showed that the items of the scale were able to differentiate between respondents with different ability levels (θ). The threshold for discriminating power was 0.700, as recommended by Tezza et al. (2011) [
47]. All items showed good discriminating power (>0.700). The highest discrimination value (4.13) was found for item Pol5 (provincial and local governments have strong policies for the growth of entrepreneurship), and the lowest value was for Mr5 (1.57), although it was still greater than the recommended value (0.700). Items Mr5 and Hr4 were the easiest items for the respondents at level two, while items Sp1, Cf4, and Cf2 were the most difficult at alternative level four. IRT thus allows us to see the discrimination power and the difficulty levels of the items of the scale.
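The screening rule described above can be sketched as a simple threshold check. The helper name is hypothetical, and the dictionary contains only the two discrimination values quoted in the text; the remaining items are omitted rather than guessed.

```python
def flag_low_discrimination(discriminations, threshold=0.700):
    """Return item names whose discrimination is at or below the threshold."""
    return [name for name, a in discriminations.items() if a <= threshold]


# Only the two values quoted in the text; the other items are omitted.
reported = {"Pol5": 4.13, "Mr5": 1.57}
flagged = flag_low_discrimination(reported)
weakest = min(reported, key=reported.get)
print("flagged items:", flagged)
print("lowest discrimination:", weakest)
```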
Figure 2 shows the standard error and the information obtained at different ability levels; the red curves show the standard error and the blue curves show the information. These curves indicate that the highest information was achieved in the range of −2.5 to +1.5 and that information across the perception range is obtained at different ability levels, which validates the instrument [
44,
48,
49].
The information and expected total score curves (
Figure 2) indicate that the latent trait fits the cumulative model well and that the information covers the different latent trait values, validating the instrument. Validations of latent traits using this same procedure can be found in references [
44], [
48], and [
49].
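The relationship between the information and standard error curves can be sketched as follows, assuming graded response model items and the usual relation SE(θ) = 1/√I(θ). The item parameters and helper names are hypothetical and are not the estimates behind Figure 2.

```python
import math


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def grm_item_information(theta, a, thresholds):
    """Fisher information of one graded response model item at theta."""
    cumulative = [1.0] + [logistic(a * (theta - b)) for b in thresholds] + [0.0]
    info = 0.0
    for k in range(len(thresholds) + 1):
        upper, lower = cumulative[k], cumulative[k + 1]
        prob = upper - lower
        # Derivative of the category probability with respect to theta.
        slope = a * (upper * (1.0 - upper) - lower * (1.0 - lower))
        info += slope ** 2 / prob
    return info


def scale_information(theta, items):
    """Test information: the sum of item informations at theta."""
    return sum(grm_item_information(theta, a, bs) for a, bs in items)


def standard_error(theta, items):
    """Standard error of the theta estimate, 1 / sqrt(information)."""
    return 1.0 / math.sqrt(scale_information(theta, items))


# Hypothetical (discrimination, thresholds) pairs, not estimates from the study.
items = [(1.57, [-2.0, -1.0, 0.5, 1.5]), (4.13, [-1.5, -0.5, 0.5, 1.2])]
for theta in [-3.0, -1.5, 0.0, 1.5, 3.0]:
    print(f"theta={theta:+.1f}  info={scale_information(theta, items):.3f}  "
          f"se={standard_error(theta, items):.3f}")
```

Because the standard error is the inverse square root of the information, the two curves move in opposite directions: precision is highest where the information peaks.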
The anchoring analysis was performed following guidelines set by Vincenzi et al. (2018) [
45], and it is one of the main advantages of IRT models compared to CTT [
50]. The anchoring analysis identified three levels, namely the lowest, moderate, and highest levels of agreement on the domains of the entrepreneurial ecosystem, i.e., general finance, capital finance, general support, support professions, human resources, culture, market, and policy. Respondents had the lowest level of agreement for items Cf1, Cf5, Pol3, and Pol5, and moderate levels of agreement for items Gf3, Gf4, Gf2, Gf1, Cf2, Cf4, Gs1, Gs7, Sp4, Sp3, Sp5, Sp1, Hr5, Pol1, Pol4, Pol6, and Pol2. The highest level of agreement was found for items Gf5, Gf6, Cf3, Gs2, Gs3, Gs4, Gs5, Gs6, Sp2, Hr3, Hr4, Hr2, Hr6, Hr1, Mr1, Mr2, Mr3, Mr4, and Mr5. Overall, most respondents fall in the lowest to moderate levels of agreement (33.76% to 64.35%) (
Table 6), so it can be inferred that most participants have a moderate level of agreement regarding the entrepreneurial ecosystem.
Finally, the scale was tested for goodness of fit using the CFI, TLI, RMSEA, and SRMR indices. The CFI (0.973) and TLI (0.955) values are close to 1, indicating good fit. The RMSEA was 0.111; its lower bound (0.073 > 0.05) indicates poor fit, its upper bound (0.152 > 0.10) likewise does not indicate good fit, and the p-value (0.006 < 0.05) indicates that the fit is not close. The SRMR was 0.026, below the threshold value of 0.08; an SRMR of zero indicates a perfect fit.
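The comparison of the reported indices with their cutoffs can be sketched as a small check. The RMSEA bounds (0.05 and 0.10) and the SRMR threshold (0.08) are those named in the text; the 0.95 cutoff for CFI and TLI is an assumed conventional value, since the text only states that values close to 1 are desirable. The function name is hypothetical.

```python
def assess_fit(cfi, tli, rmsea_lower, rmsea_upper, srmr):
    """Compare fit indices with cutoffs; the 0.95 CFI/TLI cutoff is an assumption."""
    return {
        "CFI >= 0.95": cfi >= 0.95,
        "TLI >= 0.95": tli >= 0.95,
        "RMSEA lower bound <= 0.05": rmsea_lower <= 0.05,
        "RMSEA upper bound <= 0.10": rmsea_upper <= 0.10,
        "SRMR <= 0.08": srmr <= 0.08,
    }


# Values as reported in the text.
checks = assess_fit(cfi=0.973, tli=0.955, rmsea_lower=0.073,
                    rmsea_upper=0.152, srmr=0.026)
for criterion, met in checks.items():
    print(f"{criterion}: {'met' if met else 'not met'}")
```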
The validation of the scale is based on acceptable levels for the factor analysis and for the IRT parameters, such as discriminating power and difficulty level, which make it possible to obtain an acceptable measurement scale for theta, the latent trait; therefore, the scale shows good psychometric properties. The factor analysis validates the latent constructs, and IRT validates the discrimination power of the items [
43,
44].
5. Conclusions
The scale tested in this research is able to measure respondents’ level of agreement regarding the entrepreneurial ecosystem. The 48 items showed satisfactory psychometric properties, with a high to very high level of discriminating power (greater than 1.7) across respondents with different levels of knowledge, and with increasing difficulty for higher response alternatives, which require more ability or a higher latent trait. The tested and validated EE scale provides an opportunity to identify problematic items that need attention or rewording; e.g., item Mr5 under market has the lowest discrimination (1.57) and should therefore be reworded to improve its discrimination. The IRT results generated three levels, from the lowest to the highest level of agreement.
The main findings of this research suggest that the EE scale possesses satisfactory psychometric properties, in terms of item difficulty and discrimination, for measuring the attitudes of respondents towards the elements of the entrepreneurial ecosystem in a particular region, e.g., policy, finance, market, human resources, culture, and support. This will help policy makers to identify and prioritize policy action towards an entrepreneurial ecosystem based on the perception-based scale [
3]. This research has contributed to the literature by providing a validated scale to measure the EE in a given region through the attitudes of entrepreneurs, whilst many authors [
51,
52,
53,
54,
55,
56] have provided different measuring frameworks to measure the EE.
The present research has a number of limitations. First, the participants were drawn from one region of Pakistan (Sindh). Respondents from other regions, e.g., Punjab, KPK, and Baluchistan, should be included in future research because they might have different perceptions and attitudes towards specific items; establishing the invariance of the scale across different regions, ages, and gender groups would provide further evidence for the application of the scale. Another limitation is that the scale has not yet been translated into local languages to assess its psychometric properties and invariance. Future studies should also examine differential item functioning (DIF) by age, gender, business experience, and city of origin, which would provide further evidence of the reliability of the EE scale.