Cultural Intelligence: What Is It and How Can It Effectively Be Measured?

We administered both maximum-performance and typical-performance assessments of cultural intelligence to 114 undergraduates in a selective university in the Northeast of the United States. We found that cultural intelligence could be measured by both maximum-performance and typical-performance tests of cultural intelligence. Cultural intelligence as assessed by a maximum-performance measure is largely distinct from the construct as assessed by a typical-performance measure. The maximum-performance test, the Sternberg Test of Cultural Intelligence (SCIT), showed high internal consistency and inter-rater reliability. Sections with problems from two content domains—Business (SCIT-B) and Leisure (SCIT-L) activities—were highly intercorrelated, suggesting they measured largely the same construct. The SCIT showed substantial correlations with another maximum-performance measure of cultural intelligence, Views-on-Culture. It also was correlated, at more modest levels, with fluid intelligence and personal intelligence tests. Factorially, the (a) maximum-performance cultural intelligence tests, (b) typical-performance cultural intelligence test and a test of openness to experience, and (c) fluid intelligence tests formed three separate factors.


Introduction
Cultural intelligence is one's ability to adapt when confronted with problems arising in interactions with people or artifacts of cultures other than one's own. Some might view cultural intelligence as merely a special case of general intelligence, but there is at least some evidence that cultural intelligence is a distinct construct that is related but nonidentical to general intelligence (Ang et al. 2006, 2007, 2020; Sternberg 2008; Sternberg and Grigorenko 2006; Van Dyne et al. 2008).
At least conceptually, there are three ways in which cultural intelligence might plausibly differ from general intelligence while at the same time being related to it: First, cultural intelligence would seem to have a practical, tacit-knowledge-based component that makes it akin to what sometimes is called "practical intelligence", which (arguably) is at least somewhat distinct from general intelligence (Hedlund 2020; Polanyi 1976; Sternberg and Hedlund 2002; Sternberg and Horvath 1999). Tacit knowledge is acquired from experience. It is a matter not of how much experience one has but rather of what one learns from that experience. Presumably, cultural intelligence, like crystallized intelligence, develops in part as a result of experience. Where it is more like practical intelligence than like crystallized intelligence is that it is procedural. It is not a matter of knowing declarative general information or vocabulary (as measured on the Wechsler intelligence scales, e.g., Wechsler 1944) but rather resembles the procedural practical skills measured through situational judgment tests (Weekley and Ployhart 2005). Cultural intelligence is not something one can memorize, such as a vocabulary list or a set of facts. Rather, it is something that one deploys according to the intricacies of a situation and in light of the task and the persons involved.
Second, cultural intelligence can be seen as having metacognitive, cognitive, motivational, and behavioral components (Ang et al. 2006, 2007). The first two components are measured by tests of general intelligence, but in abstract contexts that are different from those of intercultural interactions. To the extent that intelligence comprises an interaction of person by task by situation, the metacognitive and cognitive components may be quite different from those that are displayed in a conventional intelligence testing situation. The metacognitive component is used, for example, to understand how one is thinking about the situation one is in: as friendly, hostile, indifferent, or whatever. The cognitive component is used to figure out what to do in the situation. The motivational component is used to create engagement with the situation; some people simply shy away from intercultural situations or refuse to accept them as involving norms potentially different from their own. Finally, the behavioral component is used to enact the behavior one views as appropriate in a given situation.
Third, to measure cultural intelligence fully, one might wish to use a combination of typical-performance and maximum-performance measures. Past research suggests the two kinds of measures assess different aspects of cultural intelligence, much as do typical- and maximum-performance measures of emotional intelligence (Rivers et al. 2020). There is, of course, no perfect measure of anything: Any measure has error built into it. For example, typical-performance measures are subject to deception, directed both against the tester and the test-taker (who may, for example, have an inflated perception of their own skills). They are also subject to bias: people use rating scales differently, so some tend to rate higher than others, much as would be true in grading in school.
However, maximum-performance measures are also susceptible to bias, for example, how alert one happens to be at the time of testing and how one handles what are usually timed, multiple-choice, or short-answer tests.
Although maximum-performance and typical-performance tests sometimes are pitted against each other, as though one is the "correct" kind of test and the other an "incorrect" kind of test, or one is a better kind of test and the other is a lesser kind of test (Kunzmann 2019; Webster 2019), we view them more as complementary than as competing. That said, a risk of self-report measures is that individuals simply do not know where they stand, or worse, that individuals who are low performers have greatly inflated perceptions of their own performance, the so-called Dunning-Kruger effect (Kruger and Dunning 1999). This is a particular problem in the study of wisdom, where epistemic humility is an essential component of wisdom (Grossmann et al. 2020), so wiser people often think of themselves as less wise and less wise people think of themselves as wiser.
It is not difficult to imagine the extension of the effect to cultural intelligence, wherein people of low cultural intelligence may view the problem of intercultural interaction as a simple problem of the people of perceived "inferior" cultures needing to adapt to those of perceived "superior" cultures; in contrast, the culturally intelligent person likely would realize that cultures differ considerably and cannot simply be placed on some kind of value scale from better to worse. Lest this sound exaggerated, it is worth remembering that the early history of psychology and especially of cross-cultural psychology is replete with examples of white male researchers from Western cultures imposing what they saw as their "superior" values on members of what they saw as "inferior" cultures (see, e.g., Gasquoine 1997; Gould 1981).
Why is cultural intelligence even important? The first and main reason is that intercultural interactions are omnipresent, whether we wish them to be or not. Countries can clash because they do not understand each other's values, as can individuals and groups (Markus and Conner 2014). For example, what is viewed as acceptable behavior in male-female interactions differs widely across cultures (Wood and Eagly 2002). Intercultural interactions, for many people, are no longer exotic or some kind of luxury. They have become a nearly inevitable part of everyday life.
There is a second reason cultural intelligence is important, however. Regardless of how it affects our interactions with people of other cultures, it increases our understanding of our own cultures. Presuppositions and cultural patterns that once may have seemed to be necessary parts of life may now be seen as merely single options among many different options. For example, men who have grown up in a culture that shows a general disrespect for women, and who may not have thought that there were other viable options, may now see that the way their culture treats women is not a necessity or even perhaps desirable; it is a choice and perhaps a suboptimal one.
A third reason for the importance of cultural intelligence is that, for whatever arguments one might make in one direction or another about the teachability of general intelligence, cultural intelligence is clearly teachable at some level. No one is born with cultural intelligence. They may be born with propensities at one level or another. However, tacit knowledge is acquired from experience (Sternberg and Hedlund 2002). To the extent that it can be isolated from experience, it can be taught. People can learn from their experience and any instruction they receive about how to interact better with people of diverse cultures. Moreover, they may even increase their own self-understanding and self-awareness.
A previous study examined cultural intelligence and how to measure it. In the current study, we used what is now called the Sternberg Cultural Intelligence Test (SCIT), which comprises two subscales, a Business subscale (SCIT-B) and a Leisure subscale (SCIT-L). There were certain aspects of that earlier study that, at least in principle, could be improved upon, and that we addressed in the current study. In particular: First, the subscale coefficient alpha internal consistency reliabilities of .79 (SCIT-B) and .77 (SCIT-L), with a combined reliability of .87, were somewhat lower than would have been ideal. One might have hoped for subscale reliabilities over .80 and total reliability over .90. We therefore lengthened the measures in the hope of attaining higher internal consistency reliability and possibly higher validity as well, as a larger sampling of behavior would be considered. The subscales, which had been 10 and 9 items in length, were each 12 items in length in the current study. This is an increase in length of 20% for the SCIT-B and 33% for the SCIT-L.
Second, the instructions in the earlier version of the SCIT did not make clear that the items would be scored in such a way that multiple solutions to a problem would result in a higher score. The idea was that, in intercultural interactions, the first response sometimes is ineffective or, at best, only partially effective. Some participants may not have realized that multiple responses were desired. In this revised version, participants were informed that they should "come up with a solution to solve the problems in scenarios and alternative solutions if the main one does not work out". This change may have increased both validity and reliability.
Third, the previous study lacked, we believe, sufficient measures for convergent validation. In particular, there was a need for questions that were relevant to cultural intelligence and that would correlate with SCIT scores. To increase the number of hoped-for convergent validators, we introduced a Views-on-Culture measure that would measure knowledge and skills that we believed would be relevant to cultural intelligence.
Fourth, the previous study had just 6.4% Black or African American participants. Our hope in the present study was to have greater representation of Black participants. Indeed, in this study, the representation was 13.2% of the participants.
Fifth, the test graders in the previous study also devised the rubric for grading. One could have argued (and a reader did argue) that the high inter-rater reliability arose because the raters devised the rubric under which they were rating. The three raters in the current study used an enhanced version of the previous rubric and hence did not devise it themselves, so the inter-rater reliability figure could not be due to their having devised the rubric. Nevertheless, we obtained high inter-rater reliabilities among the three graders: .97 for the SCIT-B, .96 for the SCIT-L, and .98 for the SCIT-B + SCIT-L. The rubric for this work was more detailed than that for the previous work.
Sixth, in the previous study, we had two graders. In the current study, we had three graders in order to increase inter-rater reliability.
Thus, in this study, we sought both to show the replicability of the earlier results in terms of the construct validity of the SCIT for measuring cultural intelligence and also to refine past work to address some of the inadequacies of the previous work, as described above. In particular, based on the past research, we hypothesized that the maximum-performance tests would correlate with each other, and the typical-performance tests would correlate with each other, but the maximum-performance tests would not correlate much, if at all, with the typical-performance tests. This pattern derives from the notion that the two types of tests measure relatively distinct aspects of cultural intelligence. We further expected the maximum-performance tests of cultural intelligence to show factorial loadings different from those of fluid intelligence tests.
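The expected effect of lengthening the subscales on internal consistency can be estimated with the Spearman-Brown prophecy formula. The sketch below is illustrative only: it assumes the added items are parallel to the original ones, and it takes the item counts (10 and 9 items lengthened to 12) and the earlier alphas (.79 and .77) from the text above. The function name `spearman_brown` is hypothetical and is not part of the study's analysis.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability after changing test length by `length_factor`.

    Assumes the new items are parallel to the existing ones.
    """
    if not 0.0 <= reliability < 1.0:
        raise ValueError("reliability must lie in [0, 1)")
    if length_factor <= 0.0:
        raise ValueError("length_factor must be positive")
    return length_factor * reliability / (1.0 + (length_factor - 1.0) * reliability)


# Earlier alphas and item counts as stated in the text (assumed mapping:
# SCIT-B went from 10 to 12 items, SCIT-L from 9 to 12 items).
predicted_scit_b = spearman_brown(0.79, 12 / 10)
predicted_scit_l = spearman_brown(0.77, 12 / 9)

print(f"Predicted alpha, SCIT-B: {predicted_scit_b:.3f}")
print(f"Predicted alpha, SCIT-L: {predicted_scit_l:.3f}")
```

Under these assumptions the projected reliabilities are roughly .82 for each subscale, so lengthening alone would be expected to yield only a modest gain.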

Participants
A total of 114 undergraduate and graduate students attending a selective university in the Northeast of the United States participated in the data collection, which was conducted through an online survey. Of these participants, 93 of them were female, 20 of them were male, and 1 of them indicated "other." The average age of the participants was 20.19 years with a standard deviation of 1.03. The self-indicated racial/ethnic composition was 34.2% Asian and Asian American, 29.8% White or Caucasian, 13.2% Black or African American, 9.6% Hispanic or Latino, 1.8% American Indian or Alaska Native, and 7.9% of two or more races; 3.5% preferred not to answer.

Materials
There were a total of 9 assessments in the form of an online survey, administered through Qualtrics. These consisted of two psychometric assessments, (1) Letter Sets and (2) Figure Classification; two maximum-performance Sternberg Cultural Intelligence Test subtests, (3) one of which detailed a scenario pertaining to a business trip and (4) the other of which depicted a leisure trip; (5) a Views-on-Culture questionnaire we created that was composed of 3 items; (6) a typical-performance Cultural Intelligence Scale (CQS; Van Dyne et al. 2008); (7) an Openness to Experience (OE) scale (Johnson 2014); (8) a Test of Personal Intelligence Mini (TOPI; Mayer et al. 2018); and (9) a demographic questionnaire we constructed.
Psychometric Assessments. The two psychometric assessments administered for this study were (1) Letter Sets and (2) Figure Classification. The Letter Sets test required participants to rule out one letter set that did not fit in with the four other sets given. The Figure Classification test required participants to select and categorize each given figure into a group based on feature similarity. The purpose of the psychometric assessments was to measure fluid intelligence. The tests were taken from the Kit of Factor-Referenced Cognitive Tests (Ekstrom et al. 1963). This section was scored based on how many correct answers were given, with each correct answer yielding one point.
Maximum-Performance Sternberg Cultural Intelligence Test (SCIT). Two subtests of the maximum-performance Sternberg Cultural Intelligence Test were developed. These two subtests and the test as a whole were a modification of the test used in the previous study. One subtest was related to a business trip (SCIT-B) and the other subtest was related to a leisure trip (SCIT-L). Both subtests presented a variety of simulated, realistic scenarios that one might experience in different cultural contexts. Participants were asked to say what they would do to deal with and overcome the challenges presented when traveling to a new cultural environment.
The general instructions were: "Instructions: Please read the following information and come up with a solution to solve the problems in scenarios and alternative solutions if the main one does not work out." For example, in the SCIT-B, a business executive travels to a foreign country with which the executive has little familiarity to try to reach an important business agreement:
"You have just arrived on your current confidential assignment in a foreign country with which you are largely unfamiliar. Your assignment is to negotiate a memorandum of agreement between your organization and a large organization in the foreign country. You were told that you were expected to return to the US with a signed agreement. Before leaving the US, you were given very little information about your destination country, and most of that was basic information on the political system, imports and exports, and the general economy. You do not know the language and you know that relations with the country are tense. You realize that your room in the hotel in which you are staying has no access to the World Wide Web. Moreover, your cell phone does not work in this country."
One item from the SCIT-B was: "As you get ready to approach customs at the airport, a woman seems to come out of nowhere and approaches you. You think you recognize her from your trip. You can't quite place her but believe she was one of the employees of the organization with which you negotiated. She says that the organization forgot to give you a farewell gift and that she was instructed to give it to you before you departed. She has only now caught up with you. She shoves a gift box into your hands. It is packed in gift wrap with a gold ribbon but otherwise has no identifying marks. On one hand, you don't want to insult the organization but, on the other hand, you have no idea what is in the box. What would you do?"
There were 12 items each for the SCIT-B and the SCIT-L. Each item was graded by three different graders. The final score was calculated as the average of all three graders' scores. The scoring was based on a 5-point scale, in which 1 indicated a poorly answered item and 5 indicated a thoroughly and elaborately answered item. The grading rubric for the Sternberg Cultural Intelligence Tests (both SCIT-B and SCIT-L) was as follows:
Zero points: No answer/blank.
One point: Provided one plausible response with no/vague further explanation, for example, "I would go to a hospital."
Two points: Provided one or two plausible responses with some explanations, for example, "I would go use hand gestures to indicate my illness and ask for a map to find a hospital."
Three points: Provided two or more plausible responses with more elaborated explanations, for example, "I would first do . . . then . . . if something went wrong, I would . . . ."
Four points: Provided three or more plausible responses with elaborated explanations, for example, "I would use nonverbal body language to show that my stomach is in pain. If there was a pharmacy nearby, I would point that out to a local and then use nonverbal body language to see if the local could help me find the hospital. If that did not work, I would pretend to be listening to someone's heartbeat with a stethoscope and see if someone could help me find a hospital after that."
Five points: Provided three or more plausible (and novel/unique) responses with specific and detailed explanations.
There were no 5's given in grading this sample. The highest score given was a 4, with the lowest score a 1.
The inter-rater reliabilities among the three graders, computed as intraclass correlation coefficients, were .97 for the SCIT-B, .96 for the SCIT-L, .98 for the SCIT-B + SCIT-L, .78 for Views-on-Culture Item 1, .76 for Views-on-Culture Item 2, and .86 for Views-on-Culture Item 3. The relatively robust inter-rater reliabilities may be attributed in part to the establishment and careful implementation of a standard rubric. The three graders were themselves somewhat diverse: Two were Asian-Americans and one was German. All were thoroughly trained on the rating protocol.
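As an illustration of how final scores and an intraclass correlation can be derived from a responses-by-graders matrix, the following is a minimal sketch. The text does not state which ICC form was used; the sketch assumes the two-way consistency form for the average of k raters, often written ICC(3,k). The ratings are made up for illustration, and the function name `icc_consistency_average` is hypothetical.

```python
import numpy as np


def icc_consistency_average(ratings: np.ndarray) -> float:
    """ICC(3,k): two-way model, consistency, average of k raters.

    `ratings` has shape (n_targets, k_raters).
    """
    ratings = np.asarray(ratings, dtype=float)
    n, k = ratings.shape
    grand_mean = ratings.mean()

    # Sums of squares from a two-way ANOVA without replication.
    ss_rows = k * np.sum((ratings.mean(axis=1) - grand_mean) ** 2)
    ss_cols = n * np.sum((ratings.mean(axis=0) - grand_mean) ** 2)
    ss_total = np.sum((ratings - grand_mean) ** 2)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return float((ms_rows - ms_error) / ms_rows)


# Hypothetical data: 6 responses, each rated by 3 graders on the 0-5 rubric.
ratings = np.array([
    [1, 1, 2],
    [2, 2, 2],
    [3, 3, 4],
    [4, 4, 4],
    [2, 3, 3],
    [1, 2, 1],
])

final_scores = ratings.mean(axis=1)  # final score = mean of the three graders
icc = icc_consistency_average(ratings)

print("Final scores:", np.round(final_scores, 2))
print(f"ICC(3,k): {icc:.3f}")
```

This form of the ICC is algebraically identical to coefficient alpha when the columns are treated as items rather than raters, so the same function applied to a participants-by-items matrix would give an internal consistency estimate.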
Views-on-Culture (VC) (3 items). The Views-on-Culture (VC) questionnaire consisted of three items we created, each intended to gauge each participant's interests and personal opinions regarding different aspects of culture.
We included this measure for convergent validation because previous work, we believed, showed the need for an additional measure to demonstrate convergent validity. At the time, there were no fully validated maximum-performance measures of cultural intelligence. Ideally, we would have used an already existing, if not fully validated and standardized, maximum-performance measure of cultural intelligence, based on the notion of measurement dealing with unexpected situations in a novel cultural setting. However, we were unable to find any adequate existing convergent validator. Schwarzenthal et al. (2019) devised a situational judgment test, but it was for students meeting students of other cultures, which was situationally quite distinct from our measure. The measure we used, in contrast, was designed for adult, post-student use to measure cultural intelligence in business and leisure settings. Another possibility would have been the measure of Rockstuhl et al. (2015), but we had 1½ hours of testing time and did not have enough remaining time for a test longer than the one we used.
Each of the three items of the Views-on-Culture measure is listed below: Item 1: "Some people believe it is worthwhile to learn to speak at least one foreign language fluently. Other people believe it is not worthwhile.  Item 3: "You meet someone from a foreign country who, in a conversation, expresses beliefs with which you strongly disagree. You are surprised that they could believe and express such a thing.
(a) What would you say or do? (b) Why would you say or do that?"
For grading, Part a of Item 1 was not given a score. Item 1, Part b, was graded on a 3-point scale based on the number and quality of the reasons given, with 3 rated as a very good answer. Item 2, Part a, was graded with "yes" as 1 point and "no" as 0. Item 2, Part b, was not given a score. Item 2, Part c, was scored for quality on a scale of 0-3. Item 3, Parts a and b, were also both scored for quality on a scale of 0-3. The rubric for the items scored on a scale of 0-3 was as follows: Zero: no answer/perverse answer (irrelevant/mean); One: weak response ("I don't understand why you would say that"); Two: good answer; Three: very good answer.
Typical-Performance Cultural Intelligence Scale (CQS). The typical-performance Cultural Intelligence Scale (Van Dyne et al. 2008) is presented as a measure of an individual's capability of navigating various cultural settings that are different from their own. It is composed of 20 statements, one example being, "I am conscious of the cultural knowledge I apply to cross-cultural interactions". Each of the 20 statements was rated by the participant, as a self-report, on the following scale of 1 to 7: 1 = strongly disagree; 2 = disagree; 3 = more or less disagree; 4 = undecided; 5 = more or less agree; 6 = agree; 7 = strongly agree.
The purpose of the CQS is to measure the individual's cultural intelligence as typical performance. The CQS is composed of metacognitive, cognitive, motivational, and behavioral CQ subscales.
Openness to Experience (OE). The typical-performance Openness to Experience (OE) scale was modified from the Big Five Inventory Personality scale (Johnson 2014). In this section, a total of 24 statement items gauging individual personality traits and the resulting attitudes toward life were shown. Participants were then asked to indicate their level of agreement with each statement, on a scale of 1 to 5, with 1 representing a very inaccurate statement with regard to oneself and 5 representing a very accurate statement.
Test of Personal Intelligence Mini-12 (TOPI). The Test of Personal Intelligence was adapted from the full TOPI test, which is a questionnaire composed of 134 items (Mayer et al. 2018). The maximum-performance TOPI mini was a much shorter version intended for quick use in the laboratory. It was composed of 12 items that assessed the individual's problem-solving capabilities. For each item, participants read a short passage and then picked the one correct answer from four choices.
Demographic Questionnaire. The demographic questionnaire requested information such as age, gender, ethnicity, year at the university, SAT and ACT scores, cumulative college GPA, and the number of different countries the participants had visited.

Design
The design of this study was correlational. The main dependent variables were scores on the Sternberg Cultural Intelligence Test (SCIT, including SCIT-B for Business items and SCIT-L for Leisure items). Other scores were used as independent variables to predict scores on the SCIT.

Procedure
This study was administered in the form of an online survey through the Qualtrics platform. Participants were recruited through an online platform for students at the university. First, before any tests or surveys were administered, participants were asked to read and sign the consent form. Upon giving consent, they were taken to the two psychometric assessments: Letter Sets and then Figure Classification. The psychometric sections were automatically timed, and once the time limit was reached, the system forwarded the participant directly to the next section. The time limit was 7 min for the Letter Sets and 8 min for the Figure Classification. The following sections, including the Sternberg Cultural Intelligence Test (SCIT-B, SCIT-L), Views-on-Culture (VC) questionnaire, Cultural Intelligence Scale (CQS), Openness to Experience (OE) Scale, Test of Personal Intelligence Mini (TOPI), and the Demographic Questionnaire, had no time limit. Upon the completion of the study, a written debriefing form was presented to the participants.

Basic Statistics
Descriptive statistics for demographic questions (age, cross-cultural experience in years, and number of countries visited), OE, psychometric assessments (Letter Sets, Figure Classification, and TOPI), standardized admissions tests (ACT and SAT with subtests reading and math), as well as college GPA, are summarized in Table 1. Table 1 further provides basic statistics for the tools that were used to assess cultural intelligence: the maximum-performance Sternberg Cultural Intelligence Test (SCIT, including the Business and Leisure subtests SCIT-B and SCIT-L), the three items that assessed Views-on-Culture, and the typical-performance Cultural Intelligence Scale (CQS) by Van Dyne et al. (2008). The SCIT overall mean was 50.50 with a standard deviation of 13.25. Mean ACT and SAT scores in our sample were higher than those of the general population of college students: the national ACT average is 20.6, and the national SAT averages are 533 for Reading and 527 for Math. The values for our selective university sample were 33.56 for the ACT overall and, for the SAT, 727.30 for Reading and 754.12 for Math (https://nces.ed.gov/programs/digest/d17/tables/dt17_226.40.asp, accessed on 3 August 2022; https://www.number2.com/average-act-score/#What_is_the_National_Average_ACT_Score, accessed on 1 August 2022). Our sample also featured smaller standard deviations of 49.44 in SAT Reading and 56.63 in SAT Math, compared with the national standard deviations of 100 and 107. The national standard deviation for the ACT is 4.8, considerably greater than our standard deviation of 1.49. However, many participants did not take the standardized tests, and those who did submit scores may well have scored higher than the mean the university population would have obtained had all students taken the tests.
Table 2 summarizes the results of an analysis of variance for sex. None of the mean differences were significant. Results of the ANOVA for ethnicity are contained in Table 3.
Our analysis revealed significant relationships with ethnicity for SAT Reading (p < 0.05), SAT/ACT (p < 0.05), GPA (p < 0.01), and OE (p < 0.05). Black or African American participants had lower overall scores in SAT Reading, SAT/ACT, and GPA compared to other ethnicities. Students who preferred not to answer their ethnicity had lower overall OE scores. There are many possible causes of such differences; we have no basis for choosing among them.
Table 4 provides the internal consistency reliabilities as measured by coefficient alpha. The Sternberg Cultural Intelligence Test (Business, Leisure, and total) showed high reliabilities (0.95, 0.94, and 0.97), with coefficient alpha reliabilities comparable to or higher than those for the other measures used. The results also were better than those of the previous version of the test, which were 0.79, 0.77, and 0.87, respectively, perhaps because the test was revised and lengthened.
Table 5 shows the intercorrelations among all measures used in this study. First, significant correlations were not found between the Sternberg Cultural Intelligence Test and either self-reported standardized admissions tests (SAT/ACT) or self-reported GPA. Only the first Views-on-Culture item correlated significantly with the SAT Reading score (r = .28, p < 0.05) and with college GPA (r = .23, p < 0.05). The fourth dimension of the CQS also correlated with college GPA (r = .28, p < 0.05) as well as the ACT (r = .26, p < 0.05). However, correlations were generally rather weak.
Third, the SCIT and all three Views-on-Culture items correlated with the Test of Personal Intelligence (TOPI; r = .24, r = .31, r = .28, r = .38, r = .25, r = .29, all p < 0.01) at small to medium effect sizes. In contrast, there were no significant correlations between the typical-performance CQS and the Letter Sets, Figure Classification, or Test of Personal Intelligence, except for the second dimension of the CQS with the Test of Personal Intelligence (r = −.22, p < .05).
Fourth, the Business (SCIT-B) and Leisure (SCIT-L) tests correlated at high effect sizes with each other (which was expected, as all parts were developed to measure cultural intelligence) and with the three Views-on-Culture items (p < 0.01). The SCIT-B correlated with the SCIT-L at r = .85, with the first Views-on-Culture item at r = .45, with the second Views-on-Culture item at r = .46, and with the third Views-on-Culture item at r = .39. The SCIT-L correlated with the first and second Views-on-Culture items at r = .51 and with the third Views-on-Culture item at r = .47. The total score correlated with the first, second, and third Views-on-Culture items at r = .50, r = .50, and r = .44, respectively. The first Views-on-Culture item correlated at r = .46 with the second and at r = .54 with the third Views-on-Culture item. The second Views-on-Culture item correlated at r = .54 with the third Views-on-Culture item.
Note: * Correlation is significant at the 0.05 level (2-tailed); ** correlation is significant at the 0.01 level (2-tailed).
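To see how correlations of these sizes relate to the two-tailed significance levels, one can approximate the p-value of a Pearson correlation with Fisher's z transformation. This is a rough sketch rather than the procedure used in the study (statistical packages typically use an exact t-test), and it assumes the full sample of n = 114 contributed to each correlation, which may not hold for measures with missing data such as the SAT or ACT. The function name `fisher_z_p_value` is hypothetical.

```python
import math
from statistics import NormalDist


def fisher_z_p_value(r: float, n: int) -> float:
    """Approximate two-tailed p-value for H0: rho = 0 via Fisher's z."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    if n <= 3:
        raise ValueError("n must exceed 3")
    z = math.atanh(r) * math.sqrt(n - 3)  # approximately standard normal under H0
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


n = 114  # assumed pairwise sample size (full sample)
examples = [
    ("SCIT-B with SCIT-L", 0.85),
    ("SCIT-B with Views-on-Culture item 3", 0.39),
    ("smallest reported significant r", 0.19),
]
for label, r in examples:
    print(f"{label}: r = {r:.2f}, approx. p = {fisher_z_p_value(r, n):.4f}")
```

With this sample size, correlations of about .19 fall just under the .05 threshold, which is consistent with the smallest correlations flagged as significant in the text.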
Fifth, in contrast to those statistically significant correlations, the maximum-performance SCIT did not correlate significantly with the typical-performance CQS, nor with its dimensional subscores. The Views-on-Culture items, however, did correlate with it: the first item showed correlations with the CQS (r = .20, p < 0.05) and its first (metacognitive) and fourth (behavioral) dimensions (r = .21 and r = .19, both p < .05), the second item correlated with the first and fourth dimensions as well (r = .21 and r = .19, both p < .05), and the third item correlated with the motivational CQS dimension (r = .19, p < .05). The total SCIT score and SCIT-B score as well as all three Views-on-Culture items showed small- to medium-sized correlations (Cohen 1988) with OE (r = .22, p < .05; r = .20, p < .05; r = .28, p < .01; r = .27, p < .01; and r = .22, p < .05). Lastly, the first and third Views-on-Culture items correlated with the number of countries visited (each r = .19, p < .05).

Principal Component Analyses
Table 6 provides the rotated principal component analysis for the maximum-performance Letter Sets and Figure Classification, the two subscales (SCIT-B and SCIT-L) of the maximum-performance Sternberg Cultural Intelligence Test (SCIT), all three items of the Views-on-Culture measure, the typical-performance CQS, the typical-performance OE scale, and the maximum-performance TOPI. The measures clustered into three distinct groups, listed here in order of decreasing proportion of variance explained: (1) SCIT-B, SCIT-L, and VC items; (2) Letter Sets, Figure Classification, and the TOPI; (3) CQS and OE.
A principal component analysis for the psychometric measures, the total score of the SCIT measure, the Views-on-Culture (VC) measure, the CQS, and the Openness to Experience (OE) measure is compiled in Table 7. In analogy to the results in Table 6, the SCIT maximum-performance measures comprised the first principal component, the psychometric tests the second, and the typical-performance CQS and OE the third.
Table 8 shows the two subtests (Business and Leisure) of the Sternberg Cultural Intelligence Test, the three Views-on-Culture items, the CQS, the Openness to Experience (OE) scale, the Test of Personal Intelligence (TOPI), self-reported SAT/ACT scores, and college GPA, with four principal components constructed to account for most of the variance. Notably, while the two subscales of the Sternberg Cultural Intelligence Test and the Views-on-Culture items were featured as the first principal component, mirroring the results in Tables 6 and 7, the second component consisted of the standardized SAT/ACT test scores and the college GPA. The third component comprised the Test of Personal Intelligence and Openness to Experience; Openness to Experience also loaded on the fourth component alongside the CQS.
Table 9 provides the results of a principal component analysis of the total Sternberg Cultural Intelligence Test score, the Views-on-Culture items, the CQS, Openness to Experience, the Test of Personal Intelligence, the SAT to ACT conversion, and college GPA. Three components had eigenvalues greater than 1: the total SCIT score and the three VC items made up the first component, SAT/ACT and college GPA the second, and Openness to Experience and the Test of Personal Intelligence the third. The CQS did not show major loadings on any of the components.
Finally, Table 10 shows a principal factor analysis for the same tests, with an outcome very similar to that of the principal component analysis. In general, principal factor analyses revealed results quite similar to the principal component analyses.
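As a rough illustration of the retention rule referred to above (keeping components whose eigenvalues exceed 1), the following Python sketch extracts eigenvalues from a small correlation matrix and applies that rule. It is not the analysis code used in the study. The 4 × 4 matrix is hypothetical except for the r = .85 between SCIT-B and SCIT-L reported earlier, the second pair of variables and its r = .50 are invented for illustration, the function name jacobi_eigenvalues is ours, and the sketch omits the rotation step.

```python
import math


def jacobi_eigenvalues(matrix, tol=1e-12, max_sweeps=100):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]  # work on a copy
    n = len(a)
    for _ in range(max_sweeps):
        # Sum of squared off-diagonal entries; stop once it is negligible.
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Rotation angle that zeroes the (p, q) entry.
                theta = 0.5 * math.atan2(2 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                # Update columns p and q, then rows p and q.
                for k in range(n):
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


# Hypothetical correlation matrix: only the .85 comes from the text.
corr = [
    [1.00, 0.85, 0.00, 0.00],  # SCIT-B
    [0.85, 1.00, 0.00, 0.00],  # SCIT-L
    [0.00, 0.00, 1.00, 0.50],  # invented variable C
    [0.00, 0.00, 0.50, 1.00],  # invented variable D
]

eigenvalues = jacobi_eigenvalues(corr)
retained = [ev for ev in eigenvalues if ev > 1.0]  # eigenvalue-greater-than-1 rule

for ev in eigenvalues:
    share = ev / len(corr)  # proportion of total variance
    print(f"eigenvalue {ev:.2f}  variance share {share:.1%}")
print(f"components retained: {len(retained)}")
```

In a correlation matrix each variable contributes one unit of variance, so a component with an eigenvalue above 1 accounts for more variance than any single variable does, which is the usual rationale for the cutoff.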

Discussion
We sought, in this study, to continue and refine the construct validation of the Sternberg Cultural Intelligence Test (SCIT). The results largely replicate and extend the results of the previous work. Our main findings, largely consistent with previous research, were that:
1. The overall pattern of results, as described below, seems to suggest that cultural intelligence is a construct that can be measured by a maximum-performance measure with substantial reliability and validity.
2. Cultural intelligence, as measured by a maximum-performance measure, is somewhat different from cultural intelligence as measured by a typical-performance measure. The SCIT did not correlate significantly with the CQS, a typical-performance measure of cultural intelligence. Thus, the way people characterize themselves in intercultural situations is not related significantly to their maximum performance in at least some such situations. A maximum-performance measurement of cultural intelligence by the SCIT was reliable in terms of internal consistency. The internal-consistency reliabilities were .97 (SCIT), .95 (SCIT-B), and .94 (SCIT-L). Inter-rater reliability was also high (.97) for the SCIT.
3. The SCIT-B and the SCIT-L were highly intercorrelated, r = .85, p < .001, suggesting that the test measures a coherent set of skills across at least two domains, Business and Leisure activities.
4. The SCIT is not a disguised test of scholastic or academic achievement. It correlated significantly neither with self-reported standardized admissions test scores (SAT/ACT) nor with self-reported cumulative college GPA.
5. However, the SCIT does relate to fluid intelligence, with significant correlations with Letter Sets in the .30s and significant correlations with Figure Classification problems in the .20s.
6. The SCIT does correlate significantly with a maximum-performance measure of Views-on-Culture, through which participants are asked about their views on (a) the importance of learning a foreign language, (b) the value of living abroad for at least six months, and (c) how to resolve a discrepancy in values between them and a foreigner. The correlations of the SCIT with the three items, respectively, were r = .50, r = .50, and r = .44. Thus, the maximum-performance measures of cultural intelligence seem to show convergent validity with respect to each other.
7. The SCIT also correlated significantly with the TOPI (r = .28). Thus, the maximum-performance tests relevant to the socioculturally related aspects of intelligence were significantly correlated with each other.
8. Factorially, the maximum-performance cultural intelligence tests (the SCIT and the Views-on-Culture questions) factored together; the Letter Sets and Figure Classification tests measuring fluid intelligence factored together; and the typical-performance CQS and Openness to Experience tests factored together.
9. Thus, the maximum-performance and typical-performance cultural intelligence tests showed external correlates, but with generally different measures. The two types of tests may measure somewhat different aspects of cultural intelligence.
10. Because we did a number of factor analyses with different and diverse variables included in the various analyses, which variables loaded where depended on the full set of variables that set up the context for each analysis. However, the results were consistent both across analyses and with regard to earlier work on cultural intelligence. In the current work, the variables included were more diverse across factor analyses than those in the previous work.
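Whether a correlation such as those reported above reaches significance depends on the sample size as well as on the size of r. The following Python sketch gives an approximate two-tailed p-value using Fisher's z transformation. It is offered only as a rough check, not as the procedure used in the study: it assumes that every correlation was computed on all 114 participants (the text does not state pairwise sample sizes), and the normal approximation will differ slightly from an exact t-test. The function name is ours.

```python
import math
from statistics import NormalDist


def pearson_p_value(r: float, n: int) -> float:
    """Approximate two-tailed p-value for H0: rho = 0 (Fisher z approximation)."""
    if n <= 3:
        raise ValueError("n must exceed 3 for the Fisher z approximation")
    z = math.atanh(r) * math.sqrt(n - 3)  # standardized Fisher z
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


N = 114  # assumption: the full sample contributed to every correlation

for r in (0.19, 0.28, 0.50, 0.85):
    print(f"r = {r:.2f}, n = {N}: approx. p = {pearson_p_value(r, N):.4g}")
```

Under these assumptions, r = .19 falls just below the .05 threshold and r = .28 falls below .01, in line with the significance levels reported in the Results.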
To conclude, first, we found here, as before, that the maximum-performance measures of cultural intelligence (the SCIT and the new measure in the present study, Views-on-Culture) loaded on the same factor. Maximum-performance cultural intelligence thus is replicated as a measure that appears to have integrity as a unified construct. These results suggest that the SCIT and the Views-on-Culture measure have at least some construct validity as converging measures of maximum-performance cultural intelligence, although future studies need to compare these measures with other existing maximum-performance measures, such as that of Rockstuhl et al. (2015).
Second, our measures of fluid intelligence (Letter Sets and Figure Classification), as in the previous work, loaded on the same factor, one measuring conventional fluid intelligence.
Third, SAT/ACT and cumulative college GPA consistently loaded on the same factor, suggesting a college preparedness/achievement factor that may have been akin to, but probably not identical to, crystallized intelligence. Both measured acquired academic knowledge and skills. (GPA was not included as a variable in the factor analyses of the previous work.)
Fourth, the CQS, a typical-performance measure of cultural intelligence, never loaded on the same factor as the maximum-performance measures of cultural intelligence, as in the previous work. Maximum- and typical-performance cultural intelligence appear to be different constructs, one measuring cognitive aspects and the other more (self-reported) attitudinal aspects of cultural intelligence. This finding is similar to findings for wisdom (Kunzmann 2019; Webster 2019) and emotional intelligence (Rivers et al. 2020).
Fifth, where the CQS and Openness to Experience showed substantial factor loadings, they always loaded on the same factor, consistent with the previous work. In some analyses, there were not enough typical-performance measures to balance maximum-performance measures, so they did not both show substantial loadings.
Sixth and finally, the Test of Personal Intelligence (TOPI), a measure related to measures of emotional intelligence, showed somewhat variable patterns of factor loadings. It usually, but not always, loaded with Openness to Experience. Because we did not choose tests to study the construct validity of this measure, we cannot say definitively where it fit into the nomological net of our constructs and measures. Therefore, the loadings of this test were less stable than those of the previous work.
In terms of "improvements" on an earlier study, (a) we substantially increased coefficient alpha internal consistency reliability, probably by making the SCIT longer and by clarifying the instructions; (b) we made clear to participants that we were seeking more than a single response to challenging intercultural situations; (c) we added the Views-on-Culture measure, which, as expected, provided convergent validation for the SCIT; (d) we more than doubled the percentage of participants who were African-American, although this change still left us with a sample restricted to college students from a selective university; and (e) we used a rubric from a previous cultural intelligence study, rather than having the raters devise their own rubric. Most importantly, we largely replicated the past results.
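The suggestion that lengthening the SCIT raised coefficient alpha is in line with the classical Spearman-Brown prophecy formula, which predicts the reliability of a test lengthened by a factor k from its original reliability. The Python sketch below is a minimal illustration only. The input values are hypothetical, because the text reports neither the earlier alpha nor the ratio of test lengths, and the formula assumes the added items are parallel to the existing ones.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability after changing test length by `length_factor`.

    Classical Spearman-Brown prophecy formula:
        r_new = k * r / (1 + (k - 1) * r)
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    k, r = length_factor, reliability
    return k * r / (1.0 + (k - 1.0) * r)


# Hypothetical values: a test with reliability .80 that is doubled in length.
predicted = spearman_brown(0.80, 2.0)
print(f"predicted reliability: {predicted:.3f}")
```

The formula shows diminishing returns: each further increase in length adds less reliability, so a longer test alone would not necessarily account for the whole improvement, consistent with the additional role the text gives to clearer instructions.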
As always, there are questions that remain unanswered. First, in our study, the individuals described in the SCIT were visiting a foreign country. However, many intercultural interactions occur when someone from a foreign country visits one's own country. A more nearly complete test would have items in which an individual from a different culture visits one's own culture, rather than vice versa. Second, our participants were all undergraduates from a selective Northeastern university. They were therefore not a representative sample from any population of interest. A more representative sample is needed. Third, it would be helpful to have performance-based measures of actual performances executed in intercultural contexts, as opposed to hypothetical situations presented on a computer. Fourth, it would help in future research to delineate more clearly the relationship between cultural intelligence, on the one hand, and social, practical, and emotional intelligence, on the other. Fifth, future research on our cultural intelligence measure should compare it to the Rockstuhl et al. (2015) measure and possibly the Schwarzenthal et al. (2019) measure as well. Finally, we need to learn more about the relationship between typical- and maximum-performance measures of cultural intelligence. As with measures of emotional intelligence (Rivers et al. 2020), typical- and maximum-performance measures seem to be measuring different things. How do they differentially relate to actual intercultural performances?
Cultural intelligence may once have been a luxury. People could grow up in their own little corners of the world and live and die there with few or no intercultural interactions. Such a life is becoming increasingly hard to lead. Moreover, cultural misunderstandings abound. It often is very challenging for people in one culture to understand why people in another culture think, feel, and act the way they do. Cultural intelligence provides an important key to unlocking the mysteries of what makes people different from us the way they are.
Although we believe our measure shows promise, until the measure is shown to predict actual behavior in real-world intercultural situations, its ecological validity as a measure of cultural intelligence cannot be fully demonstrated. This demonstration could be an important task for future research.
It would be easy but, we believe, mistaken to get into an argument over whether typical-performance measures such as the CQS "really" measure cultural intelligence or whether maximum-performance measures such as the SCIT do. No measure is perfect or complete. We adhere to the view expressed by  that intelligence has both typical- and maximum-performance aspects, that is, ones that are both attitudinal and ability-based. Moreover, in the end, while intellectual ability is important, how it is deployed, as determined by one's attitudes, will determine how it affects adaptation to the environment. We believe the two kinds of measures in combination provide a better reading of a person's cultural intelligence than either alone.

Institutional Review Board Statement: The study was declared "exempt" by the Cornell University Institutional Review Board.
Informed Consent Statement: Informed consent was obtained from all participants involved in the study.
Data Availability Statement: Data are available from coauthor Chak Haang Wong, cw574@cornell.edu. The SCIT is available from Robert J. Sternberg, robert.sternberg@cornell.edu.