A Cognitive Systems Engineering Approach Using Unsupervised Fuzzy C-Means Technique, Exploratory Factor Analysis and Network Analysis—A Preliminary Statistical Investigation of the Bean Counter Profiling Scale Robustness

A bean counter is defined as an accountant or economist who makes financial decisions for a company or government, especially someone who wants to severely limit the amount of money spent. The rise of the bean counter in both public and private companies has motivated us to develop a Bean Counter Profiling Scale in order to further depict this personality typology in real organizational contexts. Since there are no scales to measure such traits in personnel, we have followed the methodological steps for elaborating the scale's items from the available qualitative literature and further employed a cognitive systems engineering approach based on statistical architecture, employing cluster, factor and items network analysis to statistically depict the best mathematical design of the scale. The statistical architecture employs a clustering analysis using the unsupervised fuzzy c-means technique, an exploratory factor analysis and an items network analysis technique. The network analysis, which draws on networks and graph theory, is used to depict relations among items and to analyze the structures that emerge from the recurrence of these relations. During this preliminary investigation, all statistical techniques employed yielded a six-element structural architecture of the 68 items of the Bean Counter Profiling Scale. This research represents one of the first scale validation studies employing the fuzzy c-means technique along with a factor analysis comparative design.


Introduction
A bean counter, according to a widely recognized definition, is a person, often an accountant or bureaucrat, who is regarded as placing an undue focus on controlling expenditure and budgets; a person involved in business or government financial decisions, particularly one who is hesitant to spend money.
The term "bean counter", also referred to as "bean keeper", "number cruncher" [1], "ledger attendant", "money guard" [2], or "corporate cop" [3][4][5][6], evokes an image of an accountant spending the whole day hunched over a calculator, trying to save expenses for the company on some form of expenditure or creating spreadsheets loaded with information to back up management's decision to reduce staff or programs [7].
Thus, the need for such a scale, together with the need for a novel validation technique that harnesses the power of machine learning from the cognitive systems engineering perspective, has motivated us to apply several novel and traditional statistical procedures for uncovering clusters and factors among well-established bean counter behaviors and attitudes, which we had earlier transformed into 68 items, since there are no theoretical approaches to defining a bean counter so far. To guarantee that the final product or service is both effective and reliable, cognitive systems engineering, as a professional field, uses systematic methods of cognitive analysis and cognitive design.
In this research, we train a machine learning model to discover clusters in our data set. A clustering process aims to uncover structures in data. To do so, the algorithm must determine the number of structures/groups in the data and how the characteristics are distributed within each group. Because an unsupervised learning algorithm simply accepts features as input, no proper labels are required. The purpose of unsupervised learning is to find clusters that indicate similarities across characteristics, allowing us to illustrate major patterns in describing a bean counter profile from the scientific literature. This research represents one of the first scale validation studies employing the fuzzy c-means technique along with a factor analysis comparative design.
Since the focus of this paper is to provide detailed statistical argumentation for the factorial model chosen for our Bean Counter Profiling Scale, we are reporting this methodology as a preliminary investigation. In the validation study, we will focus on qualitative interpretation of the factorial structure in terms of items selected and the overall statistical descriptors of the Bean Counter Profiling Scale.

Statistical Architecture
The process of changing raw data into features that better describe the underlying problem to predictive models, resulting in enhanced model accuracy on unseen data, is known as feature engineering [21].
We will further employ the statistical architecture term to describe the cognitive systems engineering approach to validating our Bean Counter Profiling Scale; namely, we will further adopt traditional and innovative statistical procedures based on machine learning techniques and a structural visualization technique. The statistical architecture will further employ the cluster, factor and items network analysis to statistically depict the best mathematical design of the scale.
A typical cluster analysis includes three steps: dimension reduction, cluster identification, and result evaluation. Data sets are often made up of many observations with a variety of characteristics. Each element adds another dimension to the data, making it difficult to show the data fully in a two-dimensional graphic. Furthermore, clustering algorithms are afflicted by the propensity to perform badly if a large number of characteristics are added [22].
Dimensionality reduction methods that project multiple characteristics into a low-dimensional space can solve both of these difficulties. The next stage is to identify subgroups, for which there are several methods. Finding clusters and assigning individual observations to those clusters is a feature shared by all approaches. Distance to a cluster centroid (k-means), dissimilarity and connectivity between groups of data (agglomerative hierarchical clustering), or density (hierarchical density-based clustering) can all be used. Not every approach assigns every observation to a cluster. There are also "fuzzy" algorithms (such as c-means) that estimate how much each observation corresponds with each cluster. Latent profile analysis (for continuous data) and latent class analysis (for binary data) are two instances of this.
Mixture models conceptually resemble fuzzy clustering in that they evaluate the probability that an observation belongs to a cluster. The findings of the cluster analysis must be quantified as the last stage.
Two distinct statistical techniques used in data analytics are cluster analysis and factor analysis, which are frequently applied in disciplines such as the behavioral sciences. Both analytical processes have such names because they allow users to divide data into clusters or factors. The fact that the two techniques appear essentially identical perplexes many newly established data analysts. While the two appear identical on the surface, they differ in several ways, including their applications and goals.
A key contrast between cluster analysis and factor analysis is that they serve different purposes. The basic goal of factor analysis is to understand how the variables relate to one another and to explain the connection in a data set. Contrarily, the goal of cluster analysis is to address the variation in distinct data sets. To put it another way, factor analysis is intended to make things simpler, whereas cluster analysis aids in classification. By combining several variables into a smaller group of factors, factor analysis minimizes the number of variables. By dividing the data into fewer groups, cluster analysis lowers the amount of observations. As with factor analysis, there is no separation between independent and dependent variables in cluster analysis.
Both factor analysis and cluster analysis use unsupervised learning to segment data. Many researchers in this field treat factor analysis and cluster analysis as interchangeable terms. Despite appearing similar, they are not the same: cluster analysis and factor analysis have different objectives. The objective of cluster analysis is to divide the observations into distinct, homogeneous groups. Factor analysis, on the other hand, explains the homogeneity of the variables in terms of value similarity. Complexity is another distinction between cluster and factor analysis. The size of the data clearly affects how the analysis is conducted; cluster analysis becomes computationally challenging for huge data sets. The solution to a problem may be the same in factor analysis as in cluster analysis, but factor analysis provides the researcher with a more complete answer.
Finally, the visual representation of data allows for the quantification of qualitative codes via network analysis as well as the investigation of network indices' relationships with other quantitative factors via standard statistical processes [23].
Network analysis may be regarded as a set of methodologies with a shared methodological viewpoint that allows researchers to depict interactions among actors and investigate the social structures that emerge as a result of the recurrence of these ties. The basic idea is that examining the links between objects leads to more accurate interpretations of the interaction processes. This analysis is performed by compiling relational data into a matrix.
For a better comprehension of the statistical architecture design, we will further present in the Table 1, the main advantages of each factor-cluster-network approach used in the present pre-validation study that can help researchers develop new profiling scales in selecting their methodological approach. The study of Hofstetter and collaborators [24] provided a clear delineation between factor analysis and cluster analysis. A comprehensive comparison between factor analysis and network analysis was provided by Lee and collaborators [25], and the study of Ferligoj and collaborators [26] offered a comparative framework between cluster analysis and network analysis; still, no study has provided a comparative analysis between factor, cluster and network analysis so far.

Cluster Analysis
Clustering is a technique for identifying segments or groupings in a data set. In hard clustering, each data point is assigned to exactly one cluster. K-means is a hard clustering method; it divides the data into k clusters. Instead of placing each data point into a single cluster, soft clustering assigns a likelihood for that point to be in each cluster. In soft clustering or fuzzy clustering, each data point can belong to numerous groups, each membership coupled with a probability score or likelihood. The fuzzy c-means clustering (FCM) technique is a popular soft clustering algorithm.
The soft clustering technique known as fuzzy c-means clustering assigns a likelihood or probability score to each data point, indicating the degree to which it belongs to each cluster. Fuzzy c-means clustering can be a better choice than the k-means technique: the fuzzy c-means algorithm permits data points to belong to many clusters, in contrast to the k-means method, which only allows data points to belong to one cluster. Fuzzy c-means clustering results are noticeably better for overlapping data sets.
The fuzzy iterative self-organizing data analysis technique (ISODATA), also known as fuzzy c-means clustering (FCM), is a clustering algorithm that uses membership degrees to determine how much each data point belongs to a certain cluster. Fuzzy c-means (FCM) is a data-clustering technique that splits a data set into N clusters, with various degrees of membership for each data point in the data set in each cluster. Fuzzy c-means is a well-known fuzzy clustering technique. One may create a fuzzy partitioning from data using an unsupervised clustering method. The method is dependent on a parameter m that denotes the level of fuzziness in the result.
K-means clustering divides the whole data set into k clusters, with each data point belonging to just one cluster. Fuzzy c-means generates k clusters and then assigns each data point to one of them; however, there is a factor that determines how strongly the data belong to that cluster. The main advantage of fuzzy c-means clustering is that it allows gradual memberships of data points to clusters measured as degrees in [0,1]. This gives the flexibility to express that data points can belong to more than one cluster.
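The gradual memberships described above can be sketched in a few lines of NumPy. This is an illustrative implementation on synthetic two-dimensional data, not the actual BCPS responses; the function name, defaults, and data are our own assumptions.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=100, seed=0):
    """Minimal fuzzy c-means: returns cluster centers and the membership
    matrix U, whose rows are membership degrees in [0, 1] summing to 1."""
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(c), size=len(X))     # random soft memberships
    for _ in range(n_iter):
        W = U ** m                                 # fuzzified weights
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-10)                      # guard against zero distance
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)   # membership update
    return centers, U

# two well-separated synthetic blobs (illustrative, not the BCPS data)
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.1, (20, 2)), rng.normal(5, 0.1, (20, 2))])
centers, U = fuzzy_c_means(X, c=2)
```

On clearly separated data the memberships become nearly crisp; on overlapping data they stay graded, which is the behavior that distinguishes fuzzy c-means from k-means.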

Factor Analysis
Factor analysis is a statistical data reduction and analysis approach used to explain relationships between different outcomes as the product of one or more underlying causes or factors. The approach uses data reduction to try to express a group of variables with a reduced number. Factor analysis seeks to identify unexplained variables that impact covariation across numerous data. These variables indicate basic notions that are insufficiently measured by a single variable. Factor analysis is especially common in survey research, where each question's replies indicate an outcome. Because many questions are frequently connected, underlying causes may have an impact on the subject's replies. Because the goal of factor analysis is to find underlying factors that explain correlations across various outcomes, the variables analyzed must be at least partially associated; otherwise, factor analysis is ineffective.
Factor analysis is an exploratory investigation that aids in the classification of comparable variables into dimensions. It may be used to reduce the dimensionality of the observations in order to make the data easier to understand. For factor analysis, there are several rotation techniques. Data reduction typically involves the use of factor analysis. The exploratory technique is used when you do not already know the structures or dimensions of a group of variables. The confirmatory approach is used when one wants to test a specific hypothesis about the structures or dimensions of a set of variables.
To screen scale items and discover groupings that will further enable us to choose the most representative items, this study uses an exploratory factor analysis. The most popular technique is principal component analysis, where factor weights are calculated to extract the greatest variation feasible until there is no more useful variance.
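As a sketch of how principal components apportion variance, the following NumPy fragment eigendecomposes an item correlation matrix built from hypothetical survey data; the respondent counts, item counts, and latent-factor structure are illustrative, not the BCPS items.

```python
import numpy as np

rng = np.random.default_rng(0)
# toy item-response matrix: 200 respondents x 6 items; the first three
# items are driven by one shared latent factor (hypothetical data)
latent = rng.normal(size=(200, 1))
X = np.hstack([latent + 0.5 * rng.normal(size=(200, 3)),
               rng.normal(size=(200, 3))])

R = np.corrcoef(X, rowvar=False)          # 6 x 6 item correlation matrix
eigvals = np.linalg.eigvalsh(R)[::-1]     # eigenvalues, largest first
explained = eigvals / eigvals.sum()       # proportion of variance per component
cumulative = np.cumsum(explained)
```

Each eigenvalue divided by the trace of the correlation matrix gives the proportion of variance that component extracts, which is how the percentages reported later for the six BCPS factors are obtained.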

Network Analysis
As a collection of approaches with a common methodological stance, network analysis enables researchers to visualize operator interactions and explore the social structures that develop as a result of the recurrence of these linkages. The main principle is that understanding the relationships between items enables us to analyze interaction processes more precisely. Relational data are compiled into a matrix and then used for this analysis. The idea of a social network becomes a useful analytical tool that makes use of the mathematical language of graph theory and the linear presumptions of matrix algebra when items are represented as nodes and their relationships are represented as lines linking pairs of nodes.
Network analysis, in comparison to most other quantitative social science topics, has paid minimal attention to statistical problems. Most techniques and measures examine the structure of specific data sets without accounting for sample variation, measurement error, or other unknown variables. Such difficulties are complex because of the dependency inherent in network data, but they are gaining traction.
Psychometric network analysis is a novel method for investigating the structure of psychological items. The majority of the psychometric network literature so far has been on measuring constructs (for example, dimensional structure); nevertheless, this is only one aspect of psychometrics. In this study, we looked at whether network analysis may be used as a tool for scale development.

Participants
In total, data were collected from 433 participants, of whom 83 (19%) identified as male and 351 (81%) as female. In terms of age, our research sample covered the segment between 21 and 70 years old, with a mean of 43 years. As for previous work experience, our sample respondents registered values between 1 and 52 years of professional experience in financial departments, with a mean of 20 years.
In terms of the sociocultural context of the present study, over the past few decades there have been several significant changes to the sociocultural environment in which people live [27]. According to life span psychological theory [28], both ontogenetic and historical forces influence how an individual develops. From this perspective, results of the present study might be extrapolated to the Eastern European bloc countries.
An example item is "Draw management's attention to the financial implications of the company's actions". Responses were registered on a Likert-type scale, ranging from 1 = "This statement does not characterize me at all", to 5 = "This statement totally characterizes me".

Procedure
The online version of the questionnaire was sent by email to previously targeted individuals with a solid background in financial departments in Romania. In total, 433 randomly selected participants completed the 68-item version of the BCPS during June and July 2022. A standard protocol for administering the questionnaire was used [46].

Fuzzy C-Means Clustering Results
Fuzzy c-means clustering is a fuzzy partitioning approach that outputs the degree of connection between each observation and each cluster. This allows data observations to be partially allocated to several clusters and provides substantial confidence in cluster membership. Apart from the soft approach, the technique of fuzzy c-means clustering is quite similar to that of k-means clustering.
Each data point is clustered or assigned to any one cluster in the k-means clustering algorithm for hard clustering [47]. Each data point may fully or partially belong to a cluster. Instead of grouping every data point into its own cluster, soft clustering assigns each point a likelihood of being in that cluster. Any data point can adhere to many clusters in soft clustering or fuzzy clustering, which is coupled with a probability score or likelihood [48]. The fuzzy c-means clustering (FCM) algorithm is one of the most used soft clustering methods [49]. Fuzzy c-means clustering has been successfully used in previous psychological clustering research to examine and find the music features that can calm individuals and which form the most relaxing music for therapeutic use [50], to analyze missing data [51], to analyze employee branding typology [52] and to investigate language processing [53].
The algorithm was set to run for a maximum of 25 iterations, with a fuzziness parameter of two and a maximum of 10 clusters. Measures of model performance that take model complexity into consideration are provided by the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). In AIC and BIC, a term that measures how well the model fits the data is combined with a term that penalizes the model proportionally to the number of parameters. A statistic used to assess the efficacy of a clustering method is the silhouette coefficient, often known as the silhouette score. Its value ranges over [−1, 1], where 1 indicates that the clusters are clearly distinct from one another and spaced widely apart. Values near 0 denote overlapping clusters. The worst value is −1; negative scores mean that data may have been assigned to the wrong clusters. The silhouette score may be calculated as s = (q − p)/max(p, q), where p = average intra-cluster distance, i.e., the average distance between each point within a cluster, and q = average inter-cluster distance, i.e., the average distance to the other clusters. The size of each cluster is shown in Table 2. The silhouette score for Cluster 1 is 0.013, for Cluster 2 is 0.051, for Cluster 3 is −0.027, for Cluster 4 is 0.005, for Cluster 5 is 0.101 and for Cluster 6 is −0.087. As seen, four clusters obtained reasonable values, presenting overlapping information, which is expected in the case of our 68 items all referring to the bean counter profile; the exceptions are Cluster 3 and Cluster 6, which represent a poor fit for the given data due to their below-average silhouette scores (Table 2).
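The silhouette computation can be illustrated with a minimal NumPy sketch. Following the standard definition, q is taken here as the mean distance to the nearest other cluster; the data, labels, and function name are illustrative assumptions, not the study's software.

```python
import numpy as np

def silhouette_scores(X, labels):
    """Per-point silhouette s = (q - p) / max(p, q), where p is the mean
    intra-cluster distance and q the mean distance to the nearest other cluster."""
    D = np.linalg.norm(X[:, None] - X[None, :], axis=2)   # pairwise distances
    s = np.zeros(len(X))
    for i, lab in enumerate(labels):
        same = (labels == lab)
        same[i] = False                  # exclude the point itself
        if not same.any():
            continue                     # singleton cluster: score stays 0
        p = D[i, same].mean()
        q = min(D[i, labels == other].mean()
                for other in set(labels) if other != lab)
        s[i] = (q - p) / max(p, q)
    return s

# two tight, well-separated clusters: scores should be close to 1
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, (15, 2)), rng.normal(10, 0.5, (15, 2))])
labels = np.array([0] * 15 + [1] * 15)
```

Averaging the per-point scores within each cluster yields per-cluster silhouettes like those reported for the six BCPS clusters above.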
In terms of evaluation metrics, the maximum cluster diameter in Euclidean distance was 24.182 and the minimum cluster separation in Euclidean distance was 4.040. Pearson's γ value was 0.195, namely, the correlation between distances and a 0-1 vector where 0 means the same cluster and 1 means different clusters. The Dunn index (minimum separation/maximum diameter) was 0.167, the entropy of the distribution of cluster memberships was 1.002 and the Calinski-Harabasz index, the variance ratio criterion of the cluster memberships, was 16.341.
The elbow method generated a plot with the total within sum of squares on the y-axis and the number of clusters on the x-axis (Figure 1). The plot can be used for determining the optimal number of clusters, which in our case is six. The plot shows three curves using AIC, BIC, and elbow method optimization.
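The elbow heuristic can be sketched with a basic k-means and a sweep over k. Everything below (the data, function name, and iteration counts) is illustrative rather than the software actually used in the study.

```python
import numpy as np

def kmeans_wss(X, k, n_iter=50, seed=0):
    """Basic k-means; returns the total within-cluster sum of squares (WSS)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)].copy()
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():      # leave empty clusters in place
                centers[j] = X[labels == j].mean(axis=0)
    d = np.linalg.norm(X[:, None] - centers[None], axis=2)
    return float((d.min(axis=1) ** 2).sum())

# three synthetic groups; WSS falls sharply until k reaches the true
# number of groups, then levels off, producing the "elbow" in the plot
X = np.vstack([np.random.default_rng(s).normal(5 * s, 0.3, (30, 2))
               for s in range(3)])
wss = [kmeans_wss(X, k) for k in range(1, 7)]
```

Plotting `wss` against k reproduces the kind of curve shown in Figure 1; the bend marks the candidate number of clusters.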


Exploratory Factor Analysis Results
The factorability of the 68 BCPS items was first investigated. We utilized a number of well-accepted criteria to determine whether a factor structure could be calculated. First, every single one of the 68 items showed a strong correlation with at least half of the other items, indicating plausible factorability. Second, Bartlett's test of sphericity was significant, χ²(2278) = 16,787.285, p < 0.001, and the Kaiser-Meyer-Olkin measure of sampling adequacy was 0.713, which is above the generally advised threshold of 0.6. The Chi-squared test was also significant at 4065.240 (1885), p < 0.001.
Finally, as seen in Table 3, the communalities were all over 0.407, further demonstrating that each item shared some variation with other items. These broad indications led to the conclusion that factor analysis was appropriate for all 68 items. Because the main objective was to identify and compute composite scores for the factors underlying the 68-item BCPS version, principal components analysis was performed.
The initial eigenvalues indicated that the six factors' unrotated and rotated solution explained a cumulative 45% of the variance, with the first factor explaining 30%, the second factor explaining 4%, the third factor explaining 3%, the fourth factor explaining 2%, the fifth factor explaining 2%, and the sixth factor explaining 1%.
Due to the leveling out of eigenvalues on the scree plot ( Figure 2) after six factors, the six-factor solution, which explained 45% of the variation, was chosen for the statistical pre-evaluation of the BCPS.


Network Analysis Results
The network structure of variables may be analyzed using network analysis. To ensure the correctness of our network analysis, the network needs to be a sufficiently accurate representation of the underlying data [61]. Since the aim of this study is to bring as much statistical evidence as possible for the six-factorial structure of the Bean Counter Profiling Scale, we have employed network analysis to add empirical rigor [62][63][64][65] beyond the results obtained with factorial and cluster analysis.
A network is a collection of structures that include variables represented by nodes and the links (officially termed edges) between these nodes. Cross-sectional data from a group can indicate conditional independence connections at the group level [66].
Nodes represent items in psychological networks, whereas edges reflect correlations or predictive associations that may be calculated from data. In our case, a node represents a single item on a scale.
The direction and strength of the connection between nodes, or in our case, items, are indicated by edges. An edge may be positive, as in the case of positive covariance or correlation between items, or it may be negative. Different colored lines depicting the edges show the polarity of the interactions: positive relationships are often colored blue or green, while negative relationships are typically colored red [67]. Edges can be weighted or unweighted. A weighted edge changes the thickness and color density of the line linking the nodes to show the strength of a node-to-node link: larger, denser colored lines denote stronger relationships. Alternatively, an edge may be unweighted and merely indicate whether a link between two nodes is present or absent.
To keep the graph as clean as possible, thus reducing the noise of the items that were previously discovered as not fitting the six-factor structure, we have eliminated items that did not load on any factor in the exploratory factor analysis. Thus, our network analysis investigated the relationship between 46 items of the total of 68 items of BCPS.
This network has 46 nodes, a maximum of 1035 edges, and a sparsity value of 0.719. The number of edges estimated was reduced to 291 by the analysis's use of EBICglasso estimation. Figure 3 shows a visualization of the network of the 46 items of BCPS, which are depicted as six different groups colored differently, as seen in the legend. For example, items 5, 8, 15, 17, 50, 51, 52 and 68 belong to group 1, depicted in red. As the purpose of this research is only the pre-validation of the BCPS, we will not detail the theoretical meaning of the groupings; we are only delineating the scale network structure. As shown in Figure 3, nodes are associated both positively and negatively with one another.
As seen in Table 4, there are four centrality measures employed: betweenness, closeness, strength and expected influence to identify highly influential nodes [80]. According to the centrality of closeness, a node's proximity to every other node in the network is measured. It is calculated as the mean of the shortest paths connecting each network node. The total distance between a node and all other nodes is inversely proportional to a node's centrality. One technique for understanding closeness is as a metric of how long it will take for information to spread sequentially from one node to every other node. Betweenness centrality counts how frequently a node is on the shortest route between other nodes. Betweenness centrality is frequently used to quantify a node's dependence on other nodes and, thus, its potential for control. A node's influence on its immediate neighbors or nodes with which it has an edge is determined by its strength, which is the sum of all the absolute values of connections with other nodes in the network [80].
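Strength and expected influence in particular reduce to simple row sums over the signed edge-weight matrix, as this hypothetical four-item example shows; the weights are invented for illustration and do not come from the BCPS network.

```python
import numpy as np

# hypothetical signed, weighted adjacency matrix for a 4-item network
# (symmetric, zero diagonal; positive entries = positive associations)
W = np.array([[ 0.0, 0.4, -0.2, 0.0],
              [ 0.4, 0.0,  0.3, 0.1],
              [-0.2, 0.3,  0.0, 0.0],
              [ 0.0, 0.1,  0.0, 0.0]])

strength = np.abs(W).sum(axis=1)    # strength: sum of absolute edge weights
expected_influence = W.sum(axis=1)  # expected influence keeps the signs
```

Here the second item has the largest strength (0.8), so it would be flagged as the most influential over its immediate neighbors, just as item 44 is in Table 4.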
Item 46 has the greatest effect on the flow between all items in terms of betweenness. In terms of closeness, the item best placed to influence the entire network most quickly is the same item 46. In terms of strength, the most influential item over its immediate neighbors is item 44, and in terms of expected influence, the same item 44 presents the most prominent characteristics in the analyzed network.

Discussion
There have been no published experimental investigations that use fuzzy c-means in the pre-validation of profiling scales; hence, our technique is completely novel, especially since our design process is rooted in an exploratory investigation of the psychological factors accounting for the bean counter profile. With this original factor-cluster-network analysis of the 68-item Bean Counter Profiling Scale, we have statistically prototyped its six-dimensional structure.
As the results clearly pointed out, a six-dimension scale was yielded by all techniques involved in this pre-validation study: the unsupervised fuzzy c-means clustering algorithm, exploratory factor analysis and network analysis. We found strong evidence for further qualitatively investigating the scale dimensionality and depicting the theoretical explanation behind each of the six dimensions. We anticipate a high correlation with all the Big Five personality dimensions (openness, conscientiousness, extraversion, agreeableness and neuroticism), as all items extracted from the literature depict both general and very specific behaviors of accountants, mostly in terms of personality characteristics.

Conclusions
Over the last few years, the need for a systematic and comprehensive approach to cognitive concerns in the design of sociotechnical systems has evolved as computer-based technologies have pushed the nature of operational work in a direction where cognitive challenges predominate [81][82][83][84].
Decision making in complex and dynamic information settings, remote cooperation, and the administration of substantially networked systems have all revolutionized the nature of work in many circumstances. Cognitive systems engineers discover the cognitive states, cognitive processes, and cognitive strategies utilized by competent practitioners to conduct this task and then provide design solutions for tools that enable expert human cognition, such as decision and planning tools.
Human cognition is one critical dimension on which sociotechnical systems can fail, often catastrophically. On the other hand, it is a dimension that has the potential to significantly improve the overall performance of a sociotechnical system. Cognitive system engineers, in particular, do not regard the human as a user or operator but rather as an entity having functional features that contribute to system performance.
For determining user requirements and creating acceptable and effective computer-based information aid systems, traditional human-computer interaction (HCI) and system design paradigms have proven ineffective. By merging modeling aspects from engineering, psychology, cognitive science, information science, and computer science, cognitive systems engineering (CSE) offers a much wider, more dynamic framework. This study is one of the first practical applications of the burgeoning new subject of cognitive systems engineering in the construction of psychological resilient scales.
Distinct characteristics of a psychological construct are represented by different individual items, since the construct cannot be directly assessed. The underlying assumption in instrument development is that the construct is what motivates respondents to react similarly to all of these items. As a result, it is appropriate to combine the responses to all of these items into a single score and draw conclusions about the construct. Measuring instruments can be used to measure a single construct, multiple separate constructs, or even smaller distinctions within a construct.
Even though a set of items appears to measure the same construct theoretically, the researcher must empirically demonstrate that the population studied exhibits a coherent response pattern throughout the set of items to support its use to measure the construct. If respondents do not react coherently, it indicates that the items are not working as intended and may not all measure the same concept. As a result, the single number chosen to represent the construct from that set of items would reveal very little about the intended construct.
A multidimensional scale is an instrument designed to assess multiple related constructs or several separate facets of a construct. To be able to divide the findings into subscales, the items must measure distinctly different constructs. It is vital to remember that whether a group of items reflects distinct constructs depends on the intended population, in this case accountants.
The coefficient alpha does not indicate whether the instrument assesses one or several underlying components [85,86]. In conclusion, reporting only coefficient alpha, in our case an overall Cronbach's alpha of 0.947, is insufficient evidence to support reliable interpretations of instrument scores or to demonstrate that a set of items measures only one underlying concept, i.e., that the scale is unidimensional.
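For reference, coefficient alpha is a simple function of the item variances and the variance of the total score. The NumPy sketch below, using made-up toy responses rather than the study's data, shows the computation:

```python
import numpy as np

def cronbach_alpha(scores):
    """Coefficient alpha for an (n_respondents x k_items) matrix:
    alpha = k/(k-1) * (1 - sum(item variances) / variance(totals))."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)       # per-item variances
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of scale totals
    return k / (k - 1) * (1.0 - item_vars.sum() / total_var)

# Made-up responses: 2 respondents x 2 items
toy = np.array([[1, 1],
                [2, 3]])
print(round(cronbach_alpha(toy), 3))  # 0.889
```

Note that a set of perfectly redundant items also yields alpha = 1, so a high alpha such as the 0.947 reported here is compatible with many different factor structures, which is precisely why alpha alone cannot establish unidimensionality.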
Factor analysis is a statistical technique that examines the relationships between survey items in order to determine whether participant responses to particular subsets of items are more closely related to one another than to other subsets; in other words, it examines the dimensionality of the survey items [87][88][89]. The method was originally developed to clarify the dimensionality underlying sets of achievement test items [90]. In terms of constructs, factor analysis may be used to investigate whether a certain collection of items collectively measures a predetermined construct, thereby providing validity evidence on internal structure.
As a result, EFA can clarify the connections between various concepts and structures and contribute to the creation of new frameworks. EFA is appropriate early in the development of an instrument: the researcher can use it to identify survey items that do not empirically fit the intended construct and should be eliminated, and to investigate the instrument's dimensionality. In EFA, the variance that the items share is assumed to represent the construct, while the non-shared variance represents measurement error.
Mathematically, factor analysis examines the variances and covariances among the items. The common variance among the items is assumed to represent the construct; in factor analysis, constructs are referred to as factors, and non-shared variance is referred to as error variance. In an EFA, the covariances among all items are examined collectively, and items that share a significant amount of variance are collapsed into a factor. A CFA, by contrast, extracts the common variance across items that are designated in advance to assess the same underlying concept. In EFA, no a priori assumption about which items correspond to which factors is required, because the EFA itself establishes these associations. Before applying the measuring instrument in research, researchers should preferably corroborate the six-factor structure we established using EFA with a CFA on diverse accountant populations.
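One simple and widely used way to probe dimensionality along these lines is to count the eigenvalues of the item correlation matrix that exceed 1 (the Kaiser criterion). The sketch below applies it to simulated data with a known two-factor structure; it is a generic illustration of the idea, not necessarily the extraction rule used in this study, and all loadings and sample sizes are arbitrary.

```python
import numpy as np

def kaiser_factor_count(data):
    """Count eigenvalues of the item correlation matrix above 1,
    a common first estimate of how many factors to retain."""
    R = np.corrcoef(np.asarray(data, dtype=float), rowvar=False)
    return int((np.linalg.eigvalsh(R) > 1.0).sum())

# Simulated responses with a known two-factor structure
rng = np.random.default_rng(0)
factors = rng.normal(size=(500, 2))
loadings = np.array([[0.8, 0.0], [0.7, 0.0], [0.8, 0.0],   # items 1-3: factor 1
                     [0.0, 0.8], [0.0, 0.7], [0.0, 0.8]])  # items 4-6: factor 2
items = factors @ loadings.T + 0.4 * rng.normal(size=(500, 6))
print(kaiser_factor_count(items))  # 2
```

In practice this count is only a starting point; as the text notes, the retained solution must also be theoretically interpretable and, ideally, confirmed with a CFA on an independent sample.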
All of the techniques described yield viable solutions that may be examined to determine the optimal one. When these indices agree, it indicates that the data have a distinct factor structure. It is critical that the number and character of the factors make theoretical sense, so that each component is interpretable. Furthermore, the scale's intended use should be evaluated.
The proposed six-factor solution necessitates additional refinement, which is the primary limitation of this pre-validation research. Another limitation is that the sample was 70% female, an over-representation of women relative to the accounting community. More research should be conducted to see whether the reported dimensionality holds in a larger sample, and future studies should examine whether the instrument has the same structure when used with accountants from various backgrounds. The work presented here merely establishes the BCPS's dimensionality. We advocate gathering additional forms of validity evidence, such as evidence based on content or correlations with other variables, to improve our confidence that scores from this scale accurately represent the bean counter personality profile.

Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.
Data Availability Statement: Data will be made available on request by the first author and corresponding authors.

Conflicts of Interest:
The authors declare no conflict of interest.