This section comprises three parts. The first assesses the volume and quality of the data to ensure they were sufficient and appropriate for carrying out the study. The second evaluates the quality of our measurement system, in which each construct in the model was decomposed into indicators corresponding to the questions posed in the survey. The third analyses the quality of the model itself (structural analysis), with the objectives of ensuring there were no redundant elements (collinearity analysis) and of determining the predictive ability of the model, together with the relative weight of each construct. The predictive relevance of the model and the weight of each construct were determined, independently of the data, with the use of blindfolding techniques.
4.1. Data Analysis
This subsection deals with the evaluation of different aspects of the dataset, which consisted of a total of 346 samples. First, the sample size was assessed in relation to the chosen model using three different methods. The rule of [68] suggested a minimum viable sample size of between 40 and 60 (respectively equivalent to the highest number of formative indicators of a construct and the highest number of structural relationships, each multiplied by ten). An estimate based on statistical power, developed by [69], indicated a minimum sample size of 97 (this method uses four parameters: the effect size, the power, alpha, and the number of predictors). The last method involved the G*Power programme, as recommended by [70], which yielded a value of 98. The threshold values thus obtained were, in all cases, amply surpassed by our dataset of 346 samples.
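As an illustration, the power-based minimum sample size can be reproduced with a short script. The function below is a sketch, not the authors' code: it searches for the smallest n at which the F test on a regression model's R2 reaches the desired power, and the effect size f2 = 0.15 and the predictor count are assumed inputs.

```python
from scipy.stats import ncf, f as f_dist

def min_sample_size(f2=0.15, alpha=0.05, power=0.80, predictors=5):
    """Smallest n whose F test on R^2 attains the requested power.

    Noncentrality lambda = f2 * n; df1 = k predictors, df2 = n - k - 1.
    """
    n = predictors + 2
    while True:
        df1, df2 = predictors, n - predictors - 1
        critical = f_dist.ppf(1 - alpha, df1, df2)
        achieved = 1 - ncf.cdf(critical, df1, df2, f2 * n)
        if achieved >= power:
            return n
        n += 1
```

Raising the requested power (or shrinking the effect size) pushes the minimum n upward, which is why the three methods in the text, with their different inputs, yield somewhat different thresholds.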
The second part assessed the quality of the dataset. Missing values were evaluated using SmartPLS, which detected a total of 290 missing values across 34 indicators and 346 samples, representing 2.4% of the dataset (290/(346 × 34)). As this value is below the 5% threshold, the amount of missing data was considered acceptable. With regard to the distribution of the data, PLS imposes no normality assumptions.
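The missing-value check above reduces to a one-line computation against the 5% threshold:

```python
# 290 missing cells out of 346 samples x 34 indicators
ratio = 290 / (346 * 34)
acceptable = ratio < 0.05  # below the 5% missing-data threshold
```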
The theoretical model was set up in SmartPLS, as shown in
Figure 3.
4.2. Analysis of the Measurement System
The objective of this analysis was to determine the extent to which the indicators for each construct or latent variable meet the required levels of reliability and validity.
For the reflective indicators (outward arrows from the constructs), reliability was analysed using Cronbach’s alpha, composite reliability [
71], and consistent reliability [
72]. The resulting values were above 0.7. Convergent validity, or the extent to which a set of indicators represents a single construct, was evaluated through load analysis, commonality analysis, and AVE. Finally, discriminant validity, or the extent to which an indicator is different from the rest, was assessed using cross-load analysis [
51].
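For reference, the reliability and convergent-validity statistics named above can be computed directly from indicator data and standardized loadings. The sketch below is illustrative (it is not the SmartPLS implementation) and covers Cronbach's alpha, composite reliability, and AVE.

```python
import numpy as np

def cronbach_alpha(items):
    """items: (n_observations, k_indicators) matrix of scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances / total_variance)

def composite_reliability(loadings):
    """loadings: standardized outer loadings of one construct."""
    lam = np.asarray(loadings, dtype=float)
    return lam.sum() ** 2 / (lam.sum() ** 2 + (1 - lam ** 2).sum())

def ave(loadings):
    """Average Variance Extracted: mean squared standardized loading."""
    lam = np.asarray(loadings, dtype=float)
    return (lam ** 2).mean()
```

For example, a construct with loadings of 0.8, 0.75, and 0.7 clears both the 0.7 composite-reliability and 0.5 AVE thresholds.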
The analysis of these indicators is reflected in
Table 1. Focusing on the results for the UB construct, Cronbach's alpha was 0.514, below the lower limit of 0.6, but composite reliability (0.843) and consistent reliability (1.369) were both acceptable. The indicator BO1 was dropped for having a loading below 0.7. The criteria of [
73] were followed.
The convergent validity of the reflective indicators was assessed, first, by analysing the loadings; BO1 had a value below 0.7. The commonality of each indicator represents how much of the variance of the construct is due to that indicator (a loading of 0.7 corresponds to roughly 50% of the construct variance). AVE describes how variations in the indicators are reflected in the construct; all AVE values were above 0.5.
The discriminant validity was tested using the classical methods of cross-load analysis and the Fornell–Larcker criterion. Cross-loads were calculated using the correlations between the construct scores and the standardised data [
53]. To comply, no item should load more heavily on another construct than on the one it is meant to measure [
46]. The Fornell–Larcker criterion states that the variance a construct shares with its indicators (measured by AVE) should be greater than the variance it shares with any other construct. The results are presented in
Table 2A,B. Cells in bold indicate links between the indicators and the constructs to which they belong (e.g., the intersections between BI and BI1, BI2, and BI3).
Ref. [
74] demonstrated the lack of sensitivity of the classical methods and devised the HTMT method. HTMT is the ratio of the average heterotrait-heteromethod correlations to the average monotrait-heteromethod correlations, and the values obtained should be below 0.9. The bootstrapping technique was then used to test whether each result was significantly below the threshold, that is, whether the value 1 fell outside the 90% confidence interval. The results for discriminant validity (HTMT) were as follows: EE–BI, 0.356; PE–BI, 0.588; PE–EE, 0.490; UB–BI, 0.188; UB–EE, 0.128; UB–PE, 0.124.
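A minimal HTMT computation from an indicator correlation matrix might look as follows; this is a sketch of the ratio described above, where the index lists (an assumption of this example) identify which rows and columns belong to each construct.

```python
import numpy as np

def htmt(corr, items_a, items_b):
    """Heterotrait-monotrait ratio for two constructs.

    corr: indicator correlation matrix; items_a / items_b: column
    indices of the indicators belonging to each construct.
    """
    corr = np.abs(np.asarray(corr, dtype=float))
    # average heterotrait-heteromethod correlation
    hetero = corr[np.ix_(items_a, items_b)].mean()

    def monotrait(items):
        sub = corr[np.ix_(items, items)]
        upper = np.triu_indices(len(items), k=1)
        return sub[upper].mean()

    # heterotrait average over geometric mean of monotrait averages
    return hetero / np.sqrt(monotrait(items_a) * monotrait(items_b))
```

With within-construct correlations of 0.8 and cross-construct correlations of 0.4, HTMT = 0.4/0.8 = 0.5, comfortably below the 0.9 threshold.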
The following step was to evaluate the formative indicators (arrows from indicators towards constructs in
Figure 3). It is important to note that the criteria applied to reflective constructs cannot be applied to formative ones due to the inherent nature of how the latter are formed [
75].
The formative constructs were tested through convergent validity analysis, also known as redundancy analysis. This required a reflective indicator for every formative construct, which was included in the survey. Each formative construct was split in two: the first remained in the model with its formative indicators and, in turn, pointed to a new global construct with a single reflective indicator that had already been included in the survey. The path coefficient of this redundancy model gives an idea of convergent validity [
76]. A path coefficient value of around 0.7 or higher indicates convergent validity.
According to these criteria, the only formative indicators left were those for Social Influence (SI). The next step was to analyse the collinearity between them; VIF values < 3.3 indicate an absence of collinearity. Relevance and significance were then analysed. Given that indicator weights decrease as the number of indicators increases, the maximum possible weight for each of the three SI indicators was 1/√3 = 0.58. The absolute importance of a formative indicator comes from its outer loading (loadings come from simple regressions of the construct on each of its indicators). Therefore, when the outer weight of an indicator was not significant, its outer loading was analysed; if the loading was also below 0.5, its statistical significance was tested by applying the bootstrapping technique with more than 5000 samples. Doubts arose with respect to SI1: it had the lowest outer loading, and its absolute contribution of 0.338 (below 0.5) was not significant, but at the same time it showed no collinearity (VIF < 3.3) and was conceptually relevant [
77]. See
Table 3.
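The collinearity check among the SI indicators can be reproduced with a standard VIF computation. The sketch below regresses each indicator on the others; the data passed to it would be the indicator scores (the example values in the test are invented).

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (n x k)."""
    X = np.asarray(X, dtype=float)
    vifs = []
    for j in range(X.shape[1]):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(len(y)), others])  # add intercept
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        residuals = y - A @ beta
        r2 = 1 - residuals.var() / y.var()
        vifs.append(1.0 / (1.0 - r2))
    return vifs
```

Uncorrelated columns give VIF ≈ 1; values below 3.3 indicate no collinearity concern.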
The remaining steps involved analysing the significance of the differences between the groups of users present in the measurement system, namely university type (public/private), gender, and engineering school.
Two types of non-parametric statistical tests were used. The Mann-Whitney U test was employed when comparing two groups (see
Table 4).
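A Mann-Whitney U comparison of two groups is a one-liner with SciPy; the scores below are invented stand-ins for the survey answers of the two university types.

```python
from scipy.stats import mannwhitneyu

# hypothetical Likert-scale answers from two university types
public_scores = [4, 5, 3, 4, 5, 4]
private_scores = [2, 3, 2, 3, 2, 3]

stat, p_value = mannwhitneyu(public_scores, private_scores,
                             alternative="two-sided")
# a p-value below 0.05 flags a significant difference between groups
```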
The indicators showing significant differences by university type (public/private) were BO2, FCG, SI1, and BI3. On the other hand, BO2, SI1, and SI2 showed significant differences by gender.
The differences between the nine engineering schools in the dataset were evaluated using the Kruskal–Wallis test for multigroup data (see
Table 5).
The indicators showing significant differences by engineering school, according to the Kruskal–Wallis non-parametric test for multigroup data, were BO2, PUPG, FCG, SI1, and BI1.
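Likewise, the multigroup comparison corresponds to SciPy's kruskal; the three score vectors here are invented stand-ins for answers from different schools.

```python
from scipy.stats import kruskal

school_a = [1, 2, 1, 2, 1]
school_b = [4, 5, 4, 5, 4]
school_c = [8, 9, 8, 9, 8]

h_stat, p_value = kruskal(school_a, school_b, school_c)
# a small p-value: at least one school answers differently
```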
4.3. Analysis of the Structural System
As a means of testing the hypotheses, the capacity of the model to predict one or more constructs was evaluated through an analysis of the structural model [
76].
The following steps were taken:
Assessment of collinearity in the structural model;
Assessment of the significance and relevance of the relationships within the structural model;
Assessment of the level of R2;
Assessment of the effect size (f2);
Assessment of the predictive relevance (Q2);
Assessment of the predictive significance (q2).
Collinearity was evaluated considering a variance inflation factor (VIF), for which the value obtained was below the threshold of 3, thus meeting the criterion of acceptability. See
Table 6.
Path coefficient values range between −1 and 1, indicating stronger and more important relationships as they approach 1. The results yielded positive values in all cases, thereby supporting the model. Statistical significance depends on the standard error obtained by applying the bootstrapping technique to the data; in this analysis, a two-tailed test with 5000 bootstrap samples was employed. The higher the empirical t-value is above the critical value, the greater the statistical significance: with an alpha of 5% (0.05), the critical value is 1.96. The
p-value is often used because it is easier to interpret: it reflects the probability of erroneously rejecting the null hypothesis when it is true. Thus, a
p-value below the significance level (alpha) implies that the path coefficient is significant; conversely, if zero falls within the confidence interval, the coefficient is not significant. A statistically significant path coefficient indicates the extent to which the exogenous construct is linked to the endogenous construct [
77] (see
Table 7).
Therefore, our model is shown in
Figure 4.
R-squared (R2) is useful as an assessment of the predictive ability of the model. It is calculated as the squared correlation between the actual and predicted values of a specific endogenous construct, and it takes into account the combined effect of all exogenous constructs affecting that endogenous variable. Going further, the adjusted R-squared (R2adj), which attempts to correct for bias, is more appropriate for complex models: R2adj = 1 − (1 − R2) × (n − 1)/(n − k − 1), where n is the sample size and k is the number of exogenous variables affecting the endogenous construct being measured. The resulting values should be high enough to achieve a minimum explanatory power, which [78] state should be at least 0.1. [56] established that values of 0.67 and above were substantial, around 0.33 moderate, and around 0.19 weak in terms of explanatory power. [77] state that, in the field of marketing, values above 0.75 indicate substantial explanatory power, around 0.50 moderate, and around 0.25 weak. In the model, BI had a substantial predictive ability, PE moderate, and UB weak. See
Table 8.
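The adjusted R2 formula translates directly into code. In the worked call below, n = 346 comes from the dataset, while k = 4 predictors and R2 = 0.660 are illustrative inputs.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared: penalizes R2 for the number of predictors k."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)
```

For example, adjusted_r2(0.660, 346, 4) ≈ 0.656: with a sample this large, the correction is tiny.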
Next, the impact of the non-significant constructs was measured, in particular the effects of EE on BI and of FC on UB. The effect size f2 was used, which measures the impact on predictive ability (R2) of omitting an exogenous construct from an endogenous construct's equation. The contributions of FC and EE to the predictive ability of the model were very small.
The heuristic rule established by [
69] states that f2 values of 0.02, 0.15, and 0.35 represent small, medium, and large effects, respectively.
Therefore, there was a large effect for SI on BI and PUP on PE, a small effect for PE on BI, and a negligible effect for PUP on BI, FC on BI, FC on UB, PE on UB, and SI on UB (marked in red). See
Table 9.
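The f2 effect size has a simple closed form: the drop in R2 when an exogenous construct is omitted, scaled by the unexplained variance. The R2 values in the example call are made up for illustration.

```python
def f_squared(r2_included, r2_excluded):
    """Cohen's f2 for omitting one exogenous construct."""
    return (r2_included - r2_excluded) / (1 - r2_included)

# hypothetical: dropping a construct lowers R2 from 0.66 to 0.61
effect = f_squared(0.66, 0.61)  # falls between the 0.02 and 0.15 thresholds
```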
The predictive relevance (Q2) provides an assessment of out-of-sample predictive ability. For this purpose, a blindfolding technique was used, which consists of reusing the sampled data, omitting every dth data point, and re-estimating the model parameters with the remaining data. The predictive relevance was obtained from the difference between the true values (those omitted) and the predicted ones. The Handbook of Partial Least Squares Structural Equation Modelling (PLS-SEM) states that for “the relative measure of predictive relevance, values of 0.02, 0.15, and 0.35 indicate that, for a particular construct, an exogenous construct has small, medium, or large predictive relevance, respectively”. Thus, the predictive relevance for BI was considerably high (0.406); for PE, moderate (0.187); and for UB, small (0.035).
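Once the blindfolded predictions are available, Q2 follows the usual 1 − SSE/SSO form. The helper below is a sketch; the omission-distance machinery itself is handled by SmartPLS, and the sample values in the test are invented.

```python
import numpy as np

def q_squared(actual, predicted):
    """Stone-Geisser Q2: 1 - SSE/SSO over the omitted data points."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    sse = ((actual - predicted) ** 2).sum()      # prediction error
    sso = ((actual - actual.mean()) ** 2).sum()  # error of a mean-only guess
    return 1 - sse / sso
```

Values above zero indicate that the construct has predictive relevance.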
Next, an analysis of the effect size was performed based on the predictive relevance, similar to what was previously done based on the R2 values.
Therefore, considering the values of the path coefficients, their significance, and the effect sizes on both BI and UB, the relationships EE on BI and FC on UB were dropped from the model.
The following step was to analyse the heterogeneity of the dataset. To this end, the category groups described above (university type, gender, and engineering school) were analysed.
Finally, the extent to which the groups differed was analysed to determine whether the differences among them were significant. The technique proposed by [
74] was used, which involves applying bootstrapping to the dataset with 5000 samples. Parameters were estimated for each group, and a comparison among them determined whether the differences were significant. This analysis was carried out using MGA (multigroup analysis) in SmartPLS. The results in
Table 10 show that only PUP was significant when comparing university types. Also, R
2 was greater in the public sector than in the private one (0.714 vs. 0.493). See
Figure 5A–D and
Figure 6, and
Table 10.
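The group-comparison logic can be sketched as a bootstrap of the difference between two groups' path coefficients. The data below are synthetic, with deliberately different true slopes standing in for the public and private groups; SmartPLS's MGA performs the analogous test on the full model.

```python
import numpy as np

def slope(x, y):
    return np.cov(x, y, ddof=1)[0, 1] / x.var(ddof=1)

rng = np.random.default_rng(1)
n = 150
# hypothetical groups with different true path coefficients
x_pub = rng.normal(size=n)
y_pub = 0.7 * x_pub + rng.normal(scale=0.6, size=n)
x_prv = rng.normal(size=n)
y_prv = 0.3 * x_prv + rng.normal(scale=0.6, size=n)

diffs = []
for _ in range(5000):
    i = rng.integers(0, n, size=n)   # independent resamples per group
    j = rng.integers(0, n, size=n)
    diffs.append(slope(x_pub[i], y_pub[i]) - slope(x_prv[j], y_prv[j]))

low, high = np.percentile(diffs, [2.5, 97.5])
groups_differ = not (low <= 0.0 <= high)  # CI excluding zero => difference
```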
Diagrams for the different university engineering schools. The brackets indicate the number of respondents from each school. Source: own elaboration.
The school-based models did not have sufficient sample sizes to validate them, except in the case of Industrial Engineering (160 samples, above the minimum sample size of 97). Therefore, it was not possible to analyse the extent to which differences among schools were significant. In addition, the school-based models displayed rather atypical results due to the limited amount of data available (see
Figure 6).
4.4. Summary of Results
Table 11 presents a summary of significant differences by group for each indicator.
The indicator BO2 (use of MS Teams) displayed significant differences among all groupings. This is due to the tool being mandatory at the private university.
The indicator SI1 (the professor’s social influence) displayed significant differences between genders and types of universities, and it reflected the important influence exerted by the professor on public students and women.
The indicator SI2 (classmates’ social influence) displayed a significant difference between genders, being a more important factor for women than men.
The structural system shows a very good predictive ability (see
Table 12). R2 values should be high enough for the model to achieve a minimum level of explanatory power; [
58] recommend values above 0.10, whereas [
56] considers R2 values of 0.67, 0.33, and 0.19 as substantial, moderate, and weak, respectively. The model had a very high coefficient of determination (R
2 = 0.660) for the intention to use MS Teams (BI), it being higher for public universities (R
2 = 0.718) and men (R
2 = 0.688). The value of 0.384 for BI indicates a high coefficient of determination. The model is, therefore, valid for BI.
The elements that chiefly determine BI are SI (Social Influence) and PE (Performance Expectancy), according to the values of the path coefficients.
The differences between the path coefficients when comparing public and private universities suggest why predictive ability is higher in the public sector: it is for PE that the non-parametric statistical tests show a significant difference (see the numbers in blue in
Table 12). It is, therefore, PE that accounts for the shift in predictive ability between public and private institutions.