Validity and Reliability of Adapted Self-Efficacy Scales in Malaysian Context Using PLS-SEM Approach

Self-efficacy scales have been used widely across curriculum and demographic structures, while retaining their objectivity in a specific domain. This pilot study aimed to test the validity and reliability of adapted scales that incorporated four sources of self-efficacy (mastery experiences, vicarious experiences, social persuasion, and emotional and physiological states), as well as science self-efficacy in the Malaysian context. A total of 109 students participated in this study. Data were analyzed through confirmatory factor analysis (CFA), using the partial least squares structural equation modelling (PLS-SEM) approach. Reliability was assessed through outer loadings and composite reliability (CR). Convergent validity was assessed using the average variance extracted (AVE), while discriminant validity was confirmed using the heterotrait-monotrait criterion (HTMT), along with the bootstrapping procedure. The CR values were at a satisfactory level, and two indicators (PM3 and PMPR6) were eliminated to improve the AVE values of the construct. All HTMT values, along with their confidence intervals, were within the range recommended to establish discriminant validity. The assessment of validity and reliability through PLS-SEM indicated that the scales used in this study are valid and statistically reliable.


Introduction
Self-efficacy is a dynamic construct and changes along with the learning process [1]. A student who goes through a learning phase from childhood to adulthood does not necessarily maintain the same level of self-efficacy throughout the process. Furthermore, self-efficacy is specific to the domain. The level of a student's self-efficacy may vary across different subjects. However, various opinions revolve around this matter. Some researchers state that self-efficacy assessments are more stable due to their objectivity, thus causing them to be unaffected by social comparisons and affective changes [2]. Other researchers, however, argue that self-efficacy is a less stable construct and often results in various findings in terms of self-efficacious aspects, as compared to the other constructs, such as the self-concept in measuring the effect on a student's achievements [3].
However, self-efficacy is an important element of motivation, especially in terms of intrinsic motivation and self-regulation in science learning [4]. The prominent impact of self-efficacy on motivation enables this aspect to be used as a factor to measure performance and achievement [5]. Previous studies have shown that self-efficacy not only correlates with motivation, but also has a significant relationship with science achievement [6,7]. In fact, self-efficacy can affect students' science achievements at various levels of education [5,7-9].
In the classroom, science self-efficacy influences students' choices in science-related tasks, the efforts they contribute to the completion of those tasks, as well as students' determination when facing difficulties in those tasks [10,11]. Originally, students may have

Literature Reviews

Science Self-Efficacy Scale
Previous findings demonstrate that the level of students' science self-efficacy was at a moderate level [6,17]. In fact, past studies have also shown that students who are not in science streams have lower science self-efficacy than science stream students [2,18]. This scenario also occurs in Malaysia, where most secondary school students avoid going into science streams because they believe that the pure science subjects are extremely difficult [19].
Nevertheless, the study of the influence of self-efficacy in the domain of science is not something new and began long ago. The diversity of the scale used to measure science self-efficacy in the past studies may either spread across various domains or be specific to the domain of science only. Furthermore, the selection of the scale also depends on the researcher's intention. Although previous studies focused on samples of students majoring in science, the instrument used to measure the level of self-efficacy is general in nature and is not confined to the subject of science [20].
Specifically, the motivated strategies for learning questionnaire (MSLQ), developed by Pintrich and De Groot [21], is widely used by researchers to measure the level of science self-efficacy [9,11,22,23]. The self-efficacy items represent only a few of the subscales in the instrument. In fact, the instrument often has general characteristics that can be used to measure students' motivations. However, recently used self-efficacy instruments are more focused on and specific to certain domains, and can thus measure self-efficacy more effectively.
A few Asian researchers have shown great interest in using the science learning self-efficacy (SLSE) scale [24-26] in self-efficacy studies. This scale measures science self-efficacy across five dimensions, comprising conceptual understanding, higher order thinking skills, practical work, daily life applications, and science communication. One study used the SLSE scale to make a cross-cultural comparison of the level of science self-efficacy between Taiwanese and Singaporean students [26]. In that study, exploratory analysis with Varimax rotation was performed to validate the SLSE and verify the structure of each dimension. As a result, the students' responses were grouped into five orthogonal dimensions. The satisfactory Cronbach Alpha values and eigenvalues indicated that this scale had strong psychometric features to assess science self-efficacy.
Another study that utilized this scale focused on investigating the relationship between the self-efficacy and achievement in science subjects among rural secondary school students in Malaysia [24]. The researchers retained features such as the total items and the original Likert scale used in the scale. However, due to the difference in language background, the original scale was translated to the Malay language. The high Cronbach Alpha values that resulted from the process of validation ensured that the SLSE scales used in this study are suitable to be used to assess science self-efficacy in students.

Sources of Self-Efficacy Scale
Over the past decade, the sources of self-efficacy have become an important aspect in the research on self-efficacy. Initiatives to build scales of sources of self-efficacy are observed in some research [6,27-29]. Most scales focusing on the sources of self-efficacy in education are constructed towards career choice and the students' achievements. As self-efficacy is domain specific, most constructed scales focus on a particular domain, based on the field or construct under study. However, in the context of the science domain, the scarcity of existing scales has resulted in most researchers having to modify scales from other domains [6,12,17]; for example, researchers might have to adapt a mathematics sources of self-efficacy scale to the science domain [6].
The sources of middle school mathematics self-efficacy scale, constructed by Usher and Pajares, is a widely used scale for measuring the sources of self-efficacy in the context of secondary schools [30]. This scale has been shown to have high reliability and validity, as well as strong psychometric features [17]. The original version measured Bandura's proposed sources of self-efficacy using a five-point Likert scale. This scale was originally used to measure the sources of self-efficacy of secondary school students in the mathematics domain. Due to the flexibility of this scale, it has also been applied to other domains by researchers [17,31-33]. One clear example was by Chen and Usher, who specifically adapted this scale to be used in the domain of science [17].
However, in contrast with Bandura's proposal that vicarious experience is the second strongest source of self-efficacy after mastery experiences, most studies were unable to confirm this point [30,34-37]. The same issue was also visible in this scale. This may be due to the way the items were worded. The items used to measure vicarious experiences did not identify the individuals acting as social models in the source. In order to rectify this, Ahn, Bong, and Kim proposed new vicarious experiences and social persuasion scales [27] that include three social models, namely, teachers, families, and peers, due to their strong influences on individual learning that were highlighted in the past literature [38-40].
The new vicarious experiences and social persuasion scales were validated using three different samples and two domains for each sample in Korea. The final version of this scale contains 15 items on vicarious experiences and 16 items on social persuasion, with confirmed social model divisions. The researchers suggest using this instrument as a subscale to replace the vicarious experiences and social persuasion items in the sources of middle school mathematics self-efficacy scale. As this scale has been tested in different domains in Korea, namely mathematics, English as a foreign language, and the Korean language, it presents solid evidence that it is flexible and applicable to other domains studied by researchers.

Significance of the Study
The evolution in the learning of science is parallel with the changing times. Although every country has its own unique learning curriculum, most countries share similarities in terms of science subjects offered to their students; for instance, in Malaysia, students from the age of 13 to 16, who have entered the realm of secondary school, will study general science. After that, they will be given the option to choose their subject of interest. Students who are interested in science can choose to delve into pure science subjects, such as biology, physics, and chemistry. For students who did not choose these pure science subjects, general science subjects are compulsory for them, as the Malaysian government sees the importance of inculcating science-based knowledge to the younger generations.
However, over the past decades, Malaysia has recorded inconsistent achievements in average science scores in international assessments, such as TIMSS and PISA [41,42]. This scenario is directly linked to students' negative attitudes towards science [43,44]. Malaysian secondary school students have the tendency to avoid science streams, due to their preconceptions that pure science subjects are extremely difficult subjects to learn [19].
Based on this, the Malaysian Ministry of Education has implemented a comprehensive new curriculum for both primary and secondary school levels. The new curriculum was developed as a result of drastic educational reform, as documented in the Malaysian Education Blueprint 2013-2025 [45]. The standard curriculum for secondary school (KSSM) was introduced in the year of 2017, for form one students [46]. Thus, a pioneer batch of students were in form four in 2020. For the purpose of this study, a total of 109 students who were learning the general science subject from the first batch of students were selected as the sample. Their views of the new curriculum for science subjects, in terms of science self-efficacy, were gathered and analyzed in this study.
The psychometric structure of a scale may differ across cultural backgrounds. Therefore, it is important to reconfirm that the scale used is relevant in the context of this study. The psychometric structure of all the scales used in this study was reconfirmed using the partial least squares structural equation modeling (PLS-SEM) approach. This is necessary because the original scales were modified and translated for the Malaysian context. Prior to that, content and face validation were performed to uphold the quality of the scale.

Sample
This study adopted a quantitative approach using the survey design. In addition, a multistage sampling method was adopted for this study. This study was conducted in the west coast division of Sabah, Malaysia. Out of the seven districts in the west coast of Sabah, two district education offices were selected through a random sampling method. Next, the researchers selected a few secondary schools from the two districts for the pilot study. Lastly, purposive sampling was used to select form four students who took general science subject to meet the needs of the study. Form four students in Malaysia have similar characteristics to international grade 10 students. They are around 16 years old and are studying in secondary schools. However, the science curriculum in Malaysia might be different to other countries.
The best sample size determination method for structural equation modeling in a study is based on power analysis with specified features [47]. The minimum sample size in this study was determined through calculations in G*Power software version 3 [48]. Based on the calculations, a minimum sample size of 74 was proposed for the structural model in this study. However, previous studies have suggested that 100 to 200 samples is a good starting point for studies related to path estimation analysis, especially for structural equation models [49,50]. Therefore, in order to satisfy this requirement, the researchers collected data from 109 students. The sample was considered to be homogeneous, as all schools in Malaysia use the same curriculum and syllabus, as stipulated by the Malaysian Ministry of Education.

Instrument
In this study, all original scales involving sources of self-efficacy and science self-efficacy were translated into the Malay language and were later combined into a single scale. Table 1 shows the codes used in this scale. The finalized items with code abbreviations in English are shown in Appendix A as references. Several scales were adapted, modified, and translated for the purposes of this study. The sources of middle school mathematics self-efficacy scale was used to measure sources of self-efficacy through mastery experiences (PM) as well as emotional and physiological states (KEF) [30]. In addition, sources of self-efficacy through vicarious experiences (PMP) as well as social persuasions (PS) were measured using the new vicarious experiences and social persuasion scales [27]. Science learning self-efficacy (SLSE) was selected to measure science self-efficacy [25]. Five dimensions are highlighted in this instrument: conceptual understanding (EKSPK), higher order thinking skills (EKSKBAT), daily application (EKSHARIAN), science communication (EKSKOM), and practical tasks (EKSAMALI).
All items in original scales were retained in this study. However, some modifications were made to fulfil the objectives of this study. Firstly, the 11-point semantic differential scale was used in this study to replace the 5-point Likert scale in the original instruments. Next, all items from three original scales were combined into a single scale to represent the structural model for this study. In addition, back-to-back translation was performed by a panel [51,52]. The translation process ensured that there was no change in the meanings, concepts and instructions stated in the translated version compared to the original scale. Lastly, content validity and face validity were carried out prior to pilot study. Both validities are important in determining whether the translated scale is truly appropriate for measuring the external and internal context of the study [53].

Procedures
The study began with an application for permission from the faculty. After that, permission to enter the schools was obtained from the Education Policy Planning and Research Division, Ministry of Education, and the Sabah State Education Department. Data collection was carried out in four schools in September 2020. The scale was administered during science sessions in the students' respective classrooms. The students were briefed prior to the data collection. This procedure was to make sure that the students were fully aware that the scale does not function as a test. It only elicits students' own perceptions, and the students should not feel pressured by the expectations of their schools. They were also given adequate time to answer the instrument, and all students managed to finish answering the scale in 20 min.

Data Analysis
Structural equation modeling (SEM) is a new measurement alternative labeled as a second-generation measurement method in multivariate analysis. SEM analysis techniques display various advantages over first-generation multivariate analysis methods. In SEM analysis, the combination of factor analysis and regression methods gives researchers the advantage of studying the relationship between the observed variable and the latent variable. Among the advantages of SEM techniques is that all these relationships can be studied at the same time or simultaneously.
Covariance-based SEM (CB-SEM) and partial least squares SEM (PLS-SEM) are the most widely used SEM analysis techniques in recent studies. The two techniques are complementary, yet differ in various aspects, especially goals, statistical methods, and analysis requirements [54]. The researchers chose the PLS-SEM method because no established theory or stable results in previous studies confirm the role of science self-efficacy as a mediator between sources of self-efficacy and science achievement.
The model proposed in the framework for this study is a hierarchical component model (HCM), which involves both higher order components (HOC) and lower order components (LOC). In the PLS-SEM analysis method, the reflective-reflective relationships that exist between higher order and lower order components are denoted as a type I model. Therefore, the construct validity and reliability of the scale were evaluated following the measurement model assessment procedure of the PLS-SEM method.
The validity and reliability of the scale were tested through confirmatory factor analysis (CFA) using the partial least squares structural equation modelling (PLS-SEM) method. Reliability was assessed through the composite reliability (CR) and outer loading values, convergent validity through the average variance extracted (AVE), and discriminant validity through the HTMT criterion. Table 2 summarizes the acceptance criteria for the reported values. The outer loading represents the reliability of an indicator within its construct, and the recommended value is above 0.7. The square of the standardized outer loading represents the communality, i.e., the extent to which the variance of the indicator is explained by its construct [66]. However, when the outer loading is between 0.4 and 0.7, the decision to retain, modify, or delete an item depends on conditions such as the outer loadings of the other items and on criteria such as the CR and AVE values. The outer loadings, viewed together with the AVE, were used to establish convergent validity for the scale. The AVE is calculated as the mean of the squared outer loadings of the indicators of a construct [67]. The recommended value is above 0.5, which means that the construct explains more than 50% of the variance of its reflective indicators.
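The two construct-level statistics described above can be computed directly from standardized outer loadings. The following is a minimal sketch, assuming standardized reflective indicators so that each error variance equals one minus the squared loading; the function names and the example loadings are hypothetical and are not values from this study.

```python
def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized outer loadings of one construct."""
    return sum(x ** 2 for x in loadings) / len(loadings)


def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances).

    Assumes standardized indicators, so each error variance is 1 - loading^2.
    """
    total = sum(loadings)
    error = sum(1 - x ** 2 for x in loadings)
    return total ** 2 / (total ** 2 + error)


# Hypothetical loadings for one reflective construct (illustration only).
example_loadings = [0.80, 0.75, 0.70, 0.45]

ave = average_variance_extracted(example_loadings)
cr = composite_reliability(example_loadings)
print(f"AVE = {ave:.3f}, above 0.5: {ave > 0.5}")
print(f"CR  = {cr:.3f}")
```

In this illustration the single weak loading pulls the AVE below 0.5 even though the other three indicators meet the 0.7 guideline, which is the situation in which removing an indicator is usually considered.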
In addition, several criteria have been proposed for assessing discriminant validity, such as cross-loading values, the Fornell-Larcker criterion, and the heterotrait-monotrait criterion (HTMT) [47]. In past studies, discriminant validity was mostly assessed from the cross-loading values as the first step, followed by the Fornell-Larcker criterion and the HTMT table [54]. Under the cross-loading criterion, the loading of an item on its own construct must be higher than its loadings on all other constructs. Meanwhile, the Fornell-Larcker criterion states that the square root of the AVE of a construct must be higher than its correlations with all other constructs. The assessment was finalized with an analysis of the HTMT values. In this step, the HTMT values should not exceed 0.90 to meet the requirement of discriminant validity for the measurement model [64].
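To make the HTMT criterion concrete, the sketch below computes the ratio for one pair of constructs as the mean correlation between indicators of different constructs divided by the geometric mean of the average within-construct indicator correlations. It is an illustration under stated assumptions: plain (not absolute) correlations are used, each construct has at least two indicators, and the data and names are hypothetical rather than taken from this study or from any PLS-SEM software.

```python
from itertools import combinations, product
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean(values):
    return sum(values) / len(values)


def htmt(block_a, block_b):
    """HTMT for two constructs; each block is a list of indicator columns.

    Requires at least two indicators per block and positive average
    within-block correlations (otherwise the square root is undefined).
    """
    hetero = [pearson(a, b) for a, b in product(block_a, block_b)]
    mono_a = [pearson(p, q) for p, q in combinations(block_a, 2)]
    mono_b = [pearson(p, q) for p, q in combinations(block_b, 2)]
    return mean(hetero) / sqrt(mean(mono_a) * mean(mono_b))


# Hypothetical responses of eight students on two indicators per construct.
construct_a = [[3, 5, 4, 6, 8, 7, 9, 6], [4, 5, 5, 7, 7, 8, 9, 5]]
construct_b = [[2, 4, 3, 3, 6, 5, 7, 6], [1, 4, 2, 4, 5, 6, 8, 5]]

value = htmt(construct_a, construct_b)
print(f"HTMT = {value:.3f}, within the 0.90 guideline: {value <= 0.90}")
```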
However, the newer HTMT assessment has become the main method in recent studies, in preference to cross-loadings and the Fornell-Larcker criterion, due to better data presentation [65,66,68]. The first step is the same as in the earlier HTMT assessment, in which the HTMT values are inspected against the threshold. The second step differs from the older method, as it adds a bootstrapping procedure. Bootstrapping is performed to obtain bias-corrected and accelerated confidence intervals for the HTMT values. Discriminant validity is then assessed through the confidence intervals shown in the table. The entire confidence interval should lie below 1. If the upper bound reaches the value 1, this indicates that the constructs involved have poor discriminant validity [65].

Structural Equation Modeling
Figure 1 displays the structural equation model for this study. The main objective of the actual study is to test the mediation effect of science self-efficacy on the relationship between sources of self-efficacy and science achievement. It is well established, both theoretically and statistically, that sources of self-efficacy affect the level of individual self-efficacy [6,10,12,17,69]. In addition, past studies confirmed that self-efficacy has a direct effect on science achievement [6,11,22,24,70]. While recent studies confirm the role of self-efficacy as a mediator between several constructs and science achievement in their proposed models [71,72], few studies have tested self-efficacy as a mediator between the sources of self-efficacy and science achievement. Some previous researchers proposed a similar model, but in different settings [73,74]. For this reason, the researchers proposed the structural model shown in Figure 1. This is a higher order structural equation model in the form of reflective-reflective relationships [67]. Four sources of self-efficacy, namely, mastery experiences (PM), vicarious experiences (PMP), social persuasion (PS), and physiological and emotional states (KEF), act as separate exogenous constructs. Achievement acts as an endogenous construct in this study. Science self-efficacy (EKS) plays multiple roles: it is endogenous with respect to the sources of self-efficacy, exogenous with respect to achievement, and a mediator in the structural model. PMP, PS, and EKS are higher order constructs whose dimensions act as lower order constructs. The weighting of all higher order constructs was set to mode A, as suggested by past literature [75]. However, only the lower order constructs were analyzed for the purpose of establishing the validity and reliability of the questionnaire instrument.
Science achievement in this study refers to the current performance of students in science subjects. Science achievement is measured through the students' actual test scores on a set of objective questions. This objective test is a paper and pencil test that contains 50 questions and covers 5 chapters in the new science syllabus in the standard curriculum for secondary school (KSSM) for form four students (Chapter 1: Safety Measures in Laboratory, Chapter 2: Emergency Help, Chapter 3: Techniques in Measuring the Parameters of Body Health, Chapter 4: Green Technology for Environmental Sustainability and Chapter 5: Genetic). The psychometric features of this test, including reliability and validity, were already established through the Rasch dichotomous measurement model. All the students took the same set of questions at the same time and place. Since the presented model only used actual test scores to represent student achievement, the reliability and validity for this construct were omitted from this analysis. This is because a construct measured by a single score produces a reliability value of exactly 1.

Construct Reliability and Convergent Validity
The reliability of the construct is determined by the outer loading and CR values, while convergent validity is determined through the AVE value. Table 3 shows the values for each indicator before any adjustment was made. Indicator reliability is assessed through the outer loading. In the table, the outer loadings of all indicators exceed the recommended level of 0.7, except for EKSHARIAN8, EKSKOM4, PM1, PM3, PMPR3, PMPR5, PMPR6, and PMPR7. However, all the indicators below the 0.7 level exceed the 0.4 level, with the lowest value, 0.416, belonging to PM3. These indicators are conditionally accepted, subject to the CR and AVE values of the constructs to which they belong [57,58].
In Table 3, the CR values for all the constructs are at a satisfactory level. Although some constructs have a CR value above 0.9, they are still acceptable because they are below 0.95 [67,76]. The convergent validity assessment using AVE values indicated that all the constructs passed the acceptance level of 0.5, except for PMPR. Therefore, indicators were removed to observe the change in AVE values. Based on the reliability criteria in the SmartPLS analysis, the AVE value for PMPR can be increased by removing indicators that are below the 0.7 level. Although PMPR3, PMPR5, PMPR6, and PMPR7 are below the 0.7 level, indicator removal should start from the indicator with the lowest outer loading overall. In the table, PM3 has the lowest outer loading of all the indicators and was therefore selected for removal in the first round of analysis.
After the removal of PM3, the model was reanalyzed to obtain the new outer loading, CR, and AVE values. The item removal process was to be repeated until the AVE value that was below the 0.5 level exceeded that level. Since the removal of PM3 produced no increase in the AVE value for PMPR, the item removal process was repeated with the PMPR6 indicator. Table 4 shows the reliability and convergent validity values after the removal of the second indicator, PMPR6. After PMPR6 was removed, the AVE value for PMPR increased to 0.513. The values of outer loading, CR, and AVE for each construct and indicator involved finally reached an acceptable and satisfactory level after adjustment. Thus, the reliability and convergent validity of the scale were established through the removal of two items, namely, PM3 and PMPR6.
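The logic of removing weak indicators until the AVE passes 0.5 can be sketched as a simple loop. This is a deliberately simplified illustration and not the procedure run in the PLS-SEM software: it works within a single construct and holds the remaining loadings fixed, whereas in the actual analysis the model is re-estimated after every removal and the first indicator removed was the one with the lowest loading across all constructs. The function name, the indicator names, and the loadings are hypothetical.

```python
def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized outer loadings."""
    values = list(loadings)
    return sum(x ** 2 for x in values) / len(values)


def prune_until_ave(loadings, ave_threshold=0.5, loading_guideline=0.7):
    """Drop the weakest indicator repeatedly until AVE reaches the threshold.

    loadings: dict mapping indicator name -> standardized outer loading.
    Stops early if the weakest remaining indicator already meets the
    loading guideline or if only two indicators would remain.
    Simplification: loadings are NOT re-estimated after each removal.
    """
    kept = dict(loadings)
    removed = []
    while len(kept) > 2 and average_variance_extracted(kept.values()) < ave_threshold:
        weakest = min(kept, key=kept.get)
        if kept[weakest] >= loading_guideline:
            break
        removed.append(weakest)
        del kept[weakest]
    return kept, removed


# Hypothetical construct with five indicators (illustration only).
hypothetical = {"X1": 0.78, "X2": 0.74, "X3": 0.66, "X4": 0.62, "X5": 0.55}
kept, removed = prune_until_ave(hypothetical)
print("removed:", removed)
print("AVE after pruning:", round(average_variance_extracted(kept.values()), 3))
```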

Discriminant Validity
The discriminant validity of the scale was evaluated through the HTMT criterion. Although previous studies considered cross-loading values, as well as the Fornell-Larcker criterion, to assess the discriminant validity of a scale, these two criteria did not appear to be able to detect most cases of poor discriminant validity. Therefore, researchers proposed the HTMT criterion as the main assessment for the discriminant validity of a structural model [65]. In a reflective measurement model, discriminant validity is established when the HTMT value for each pair of constructs does not exceed 0.9 [64]. Next, a bootstrapping technique was performed to obtain the HTMT confidence intervals. In the process, the researchers used 5000 bootstrap subsamples with the complete bootstrapping option [77]. A confidence level of 95% was applied in this technique. The HTMT confidence interval should not include the value 1 to confirm the discriminant validity of the construct. Tables 5 and 6 show the HTMT criterion between the constructs in the structural model for this study. The HTMT values shown in these tables are the values outside the parentheses. All these values are below the maximum level of acceptance, with the highest HTMT value being 0.879 (EKSHARIAN, EKSKBAT). Meanwhile, the HTMT confidence intervals are shown in parentheses. The confidence intervals shown are bias-corrected and accelerated confidence intervals. In Tables 5 and 6, none of the confidence intervals contains the value 1. If a confidence interval contained the value 1, this would indicate poor discriminant validity [65]. The evaluation through the HTMT criterion in this pilot study supported strong discriminant validity for the scale used in this study.
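The resampling idea behind the confidence intervals can be illustrated with a plain percentile bootstrap. Note the assumption: the study reports bias-corrected and accelerated intervals, which add a correction step that this sketch omits, so it shows the resampling logic only. The helper names and the data are hypothetical, and a simple correlation stands in for the HTMT statistic, which in practice would be recomputed on each resampled set of respondents.

```python
import random
from math import sqrt


def percentile_bootstrap_ci(rows, statistic, n_boot=5000, level=0.95, seed=0):
    """Percentile bootstrap interval for statistic(rows).

    rows: list of respondent records, resampled with replacement.
    This is NOT the bias-corrected and accelerated (BCa) interval.
    """
    rng = random.Random(seed)
    n = len(rows)
    estimates = sorted(
        statistic([rows[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    tail = (1 - level) / 2
    lower = estimates[int(tail * n_boot)]
    upper = estimates[min(n_boot - 1, int((1 - tail) * n_boot))]
    return lower, upper


def correlation(rows):
    """Pearson correlation between the two scores in each (x, y) record."""
    n = len(rows)
    mx = sum(r[0] for r in rows) / n
    my = sum(r[1] for r in rows) / n
    sxy = sum((r[0] - mx) * (r[1] - my) for r in rows)
    sxx = sum((r[0] - mx) ** 2 for r in rows)
    syy = sum((r[1] - my) ** 2 for r in rows)
    return sxy / sqrt(sxx * syy)


# Hypothetical paired scores for twelve respondents (illustration only).
records = [(3, 4), (5, 5), (4, 5), (6, 7), (8, 7), (7, 8),
           (9, 9), (6, 5), (5, 6), (8, 9), (7, 6), (10, 9)]

low, high = percentile_bootstrap_ci(records, correlation)
print(f"95% interval: ({low:.3f}, {high:.3f}); excludes 1: {high < 1}")
```

The decision rule is the same as in the text: the interval is inspected to see whether it contains the value 1.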

Comparison of Structural Equation Modelling
The validity and reliability of the questionnaire items were established after the elimination of two indicators (PM3 and PMPR6). Figures 2 and 3 display a comparison of the structural equation model before and after item elimination. The values shown on the arrows in both figures are the outer loading values for each indicator. Meanwhile, the number shown inside each circle is the average variance extracted (AVE) for the corresponding lower order construct. However, the values of outer loading and AVE shown for the higher order constructs (PMP, PS, and EKS) should be ignored, as they do not represent the actual values of outer loading and AVE for higher order constructs [75].
All the outer loading values displayed for the lower order constructs exceed 0.4. However, the AVE value for PMPR is 0.474, which is below the acceptance level of 0.5. The elimination of indicators starts from the indicator with the lowest value. In this case, the first indicator to be removed is PM3, since this indicator has the lowest outer loading value. After each elimination, the calculation is performed again until the AVE value for PMPR reaches the acceptance level of 0.5.

Discussion and Conclusions
The main purpose of this study was to validate the adapted, modified, and translated scales used to measure the sources of self-efficacy and science self-efficacy. The assessment of validity and reliability through partial least squares structural equation modelling indicates that the scales used in this study are valid and statistically reliable. This establishes the psychometric properties of the scales for use in the further analysis of this study.
This paper presents another method of validating the items in the science self-efficacy scale (SLSE), the sources of middle school mathematics self-efficacy scale, and the new vicarious and persuasions scale. In the present work, the researchers focused on validating the science self-efficacy constructs based on the five dimensions of the SLSE scale by Lin and Tsai [25]. The findings show consistent psychometric properties across the different validation methods. The original study used a different statistical analysis, but both analyses show that this scale has strong psychometric properties for assessing the science self-efficacy constructs. Another study that used this scale in the Malaysian context applied a different method of validation from the present work, namely Cronbach's alpha values computed in SPSS [24]. The present work also uses a different measurement scale from previous studies, as the researchers introduced the semantic differential scale to obtain more information on this construct. This indicates that the scale can be used with different measurement scales, depending on the context of the study. In this study, the sources of self-efficacy based on Bandura's theory were assessed separately for each source. The researchers focused on validating both mastery experiences and emotional and physiological states as constructs. Both sources were assessed using subscales that originated from Usher and Pajares; the present work may therefore address part of the future research direction suggested by the original study [78]. This study shows that both subscales can be used in different cultural contexts and domains.
In addition, this study provides evidence that the psychometric properties are stable across different measurement scales, since the researchers used the semantic differential scale rather than the Likert scale of the original study.
Meanwhile, the validation of the vicarious experiences and social persuasion constructs shows that the items are statistically valid and reliable under cross-cultural validation, as suggested by the original study of Ahn et al. [27]. The present work also adds new information, showing that this scale can be used in the science domain and with semantic differential scales. To conclude, this scale is suitable for samples with characteristics similar to the sample used in this study, especially Form Four students who take science subjects in the Malaysian context.
This study has some limitations that need to be acknowledged. It used a small sample from a limited location due to the pandemic situation. As a consequence, the findings cannot be generalized to the actual population. Future studies could extend this scale to a larger sample in a different context. Moreover, the use of actual test scores may result in poor information about the structure of achievement constructs. Therefore, subsequent studies should emphasize the use of instruments that have been validated through the proposed model.

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The data that support the findings of this study are available on request from the corresponding author. The data are not publicly available due to privacy or ethical restrictions.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
The scale used, with its dimensions and code abbreviations.

I am able to propose many viable solutions to solve a science problem.

EKSKBAT4
When I come across a science problem, I will actively think over it first and devise a strategy to solve it.

EKSKBAT5
I am able to make systematic observations and inquiries based on a specific science concept or scientific phenomenon.

EKSKBAT6
When I am exploring a scientific phenomenon, I am able to observe its changing process and think of possible reasons behind it.

In science classes, I can clearly express my own opinions.

EKSKOM6
In science classes, I can express my ideas properly.