Does Project Children’s University Increase Academic Self-Efficacy in 6th Graders? A Weak Experimental Design

The purpose of this study is to determine whether students’ academic self-efficacy levels increase through 20 weeks of education that is based on problem-based learning theory and delivered in an inter-disciplinary manner in Project Children’s University. The project aimed to teach students to learn how to learn so that, eventually, they become life-long learners and gain sustainable learning skills. In order to observe the effect of Project Children’s University, academic self-efficacy levels were measured in the “self-efficacy in ability”, “context”, and “education quality” domains. Changes in treatment group students’ academic self-efficacy levels were modeled in a growth curve modeling framework across three waves. Then, they were compared with those of control group students using Welch’s t test. Results showed that the academic self-efficacy levels of the treatment group students fell significantly over the course of the research. In addition, the levels of self-efficacy in ability of the treatment group students were significantly higher than those of the control group students. On the other hand, the treatment group students showed significantly lower self-efficacy in the context domain than the control group students. In conclusion, Project Children’s University failed to increase students’ academic self-efficacy levels, but enabled them to understand what academic self-efficacy really means, to socialize, to be self-confident students, and to criticize themselves more rationally.


Introduction
As a concept that gained importance towards the end of the 1980s, sustainability refers to protecting natural resources and benefiting from available resources in the most effective way. While the concept of sustainability was initially associated with ecology and environmental issues (Chapin, Torn, and Tateno, 1996) [1], it is now associated with a wide range of issues such as economics, living, development, and education. In definitions of sustainability, it is generally expressed as continuous development in multi-dimensional areas such as cultural, economic, social, environmental, and energy, and as the transfer of these skills and achievements to new generations (Brundtland, 1987; Hargreaves and Fink, 2006) [2,3]. In fact, these definitions show that sustainability is a concept that is born and developed within humans (Suzuki, 1997) [4], and that this concept is directly connected with the learning and education process. The education and learning process has turned into a student- and life-focused structure in recent years. This focus has also brought the understanding of lifelong learning to the forefront and becomes the main reason of
When the related literature is reviewed, it is seen that research related to academic self-efficacy mainly focuses on three headings: performance, the choice of profession, and its effects on the education process (Akbas and Çelikkaleli, 2006) [20]. Research has shown that individuals who have developed academic self-efficacy manage their learning process with a more sustainable approach and become more successful. Within this context, the purpose of this study is to increase students' academic self-efficacy levels through 20 weeks of education that is grounded in problem-based learning theory and delivered in an inter-disciplinary manner. The project aimed to teach students to learn how to learn.
Eventually, students will be life-long learners and gain sustainable learning skills. In order to observe the effect of Project Children's University on academic self-efficacy, the following research questions are addressed throughout the research:
1. What are the descriptive characteristics of treatment group students' academic self-efficacy in three waves and control group students' academic self-efficacy in wave 3?
2. How do treatment group students' academic self-efficacy levels vary across three waves? Null hypothesis: Treatment group students' academic self-efficacy levels do not vary across three waves.
3. What is the difference between the treatment and control group students' academic self-efficacy levels in wave 3? Null hypothesis: There is no difference between the treatment and control group students' academic self-efficacy levels in wave 3.

Sample
The treatment group consists of a total of 117 6th grade students who are enrolled at 8 different middle schools with different socio-economic levels in Siirt city center. Each school principal was asked to recruit students based on willingness to participate in the Project Children's University (Grant number: SODES 2017-56-172). The number of students who wanted to participate in the Project Children's University was far beyond what the project budget allowed. Based on the number of 6th graders enrolled at a particular school, we assigned each school a quota ranging from 3 to 20 students. Then, we asked school principals to decide which students would participate in the Project Children's University. The remaining students enrolled at those schools served as the control group. Later we found that the students selected for the treatment group by the school principals were the most academically successful students in their schools. The data collection tool (introduced in Section 2.3) was administered to treatment group students before the Project, after 10 weeks of instruction, and at the end of the Project. The data collection tool was administered to control group students (N = 417) only at the end of the Project. In this sense, the design of the study reflects the properties of a weak experimental design with a control group.

Project Children's University
The project Children's University was funded by the Republic of Turkey, Ministry of Development with the project number SODES 2017-56/172. Details of the Project Children's University can be found in Çetin and Toytok (2018) [21]. Students received education based on problem-based learning theory in an inter-disciplinary manner for 10 hours per week on Saturdays and Sundays for 20 weeks. Mathematics, Science, Foreign Language, and Personal Development courses (i) were given with Music, Drawing, and Physical Education courses (ii). The first 10 weeks were dedicated to exploring those areas. During the second 10 weeks of the project, students specialized in one course from section (i) and one course from section (ii). All of the courses were given in labs, studios, and a gymnasium. Within the scope of the Project Children's University, the objective was to make students aware that physics, chemistry, biology, and mathematics are ubiquitous in their daily lives. In this context, lectures were made fun and meaningful by associating them with courses such as physical education, music, and visual arts, which students would enjoy most. To give some examples of the activities: the states of matter topic covered in the chemistry course was taught together with the music course by investigating how sound propagates in solids, liquids, and gases with the help of everyday materials such as glass bottles that were empty, filled with water, or filled with soil. When the same topic was taught with the physical education course, a game called solid-liquid-gas was designed. In this game, groups huddled when the instructor shouted "solid". Students only held hands when the instructor shouted "liquid". Lastly, students made no physical contact when the instructor shouted "gas".
The main purpose of this game is to convey the structure of molecules in different states of matter: close to each other in solids, less densely packed in liquids, and much more spread out in gases. When states of matter were taught in visual arts, painting activities were carried out with solid, liquid, and gas materials. The advantages and disadvantages of painting with these three forms of matter were discussed. As with chemistry, the physics, biology, and mathematics courses were organized in the same way and given with physical education, music, and visual arts. All of the courses were designed such that students were involved in the application process by solving daily life problems; therefore, the knowledge constructed throughout the courses would be more permanent and effective. The main purpose of the project is to develop students' problem solving skills on problems related to daily life issues. Students will be able to find the connection between science and daily life; hence, they will learn how to learn, and this learning will be sustainable (Çetin, Kahyaoglu, and Erulaş, 2018) [22].
While all of the students were full-time students at a public school, the Project Children's University provided a free education opportunity. It was made clear in the parental consent form that usual burdens of schools such as grading pressure, taking notes, mandatory attendance, and carrying a backpack were not part of our project. In addition, students were transported between Siirt University and their schools without any fee and were given free lunch. Parents were also informed about the possible benefits of participation in the project as listed in the previous section. The parental consent form also included information regarding the possibility of student fatigue. While attendance was not mandatory, attendance records revealed that most of the treatment group students were not absent for more than one week, indicating that students preferred attending Project Children's University half a day at the weekends to staying home and having a rest. Required permissions for all of the applications and data collection activities were granted by the Governorship of Siirt and Siirt Provincial Directorate of National Education, with approval number 42026020-604.E.19691363.

Data Collection Tool
The Academic Self Efficacy Scale was initially developed by Morgan and Jinks (1999) [23] and adapted to the Turkish language by Öncü (2012) [24] for middle school students. The adapted form has 21 items and measures academic self-efficacy in the ability (11 items), context (7 items), and educational quality (3 items) domains. Items in the context domain are negatively worded. Students with higher academic self-efficacy levels should have lower scores in this particular domain because its items attribute academic success to external sources. Each item is answered on a 4-point scale ranging from "really disagree" to "really agree". In other words, the collected data are ordinal.

Analytical Strategies
Analytical strategies are summarized in this section as follows:

Missing Data
As explained under the sample section, the treatment group data were collected in three waves. When a student was absent on a data collection day, the data for that particular occasion were missing. Assuming that absence was unrelated to the missingness mechanism, students' missing data were imputed with multiple imputation by chained equations using the "mice" package version 2.46.0 (van Buuren and Groothuis-Oudshoorn, 2011) [25] in R version 3.4.2 (R Core Team, 2017) [26]. Since 22.5% of the data collected in three waves were missing, missing data were imputed 5 times. All of the results reported in this study regarding treatment group students are pooled results of statistical analyses run separately on the five imputed datasets. On the other hand, only 0.37% of the data were missing for the control group students; therefore, single imputation was used to impute the missing data points.
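The pooling step described above can be illustrated with Rubin's rules, the usual way of combining estimates from multiply imputed datasets. The source does not state which pooling formulas were applied, so treating them as Rubin's rules is an assumption here. The function name `pool_rubin` and the numbers in the usage lines are hypothetical, and Python is used only for illustration (the study itself was run in R).

```python
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one parameter across m imputed datasets (Rubin's rules, assumed).

    estimates  -- point estimate from each imputed dataset
    std_errors -- standard error of that estimate in each dataset
    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need matching lists from at least two imputations")
    pooled = mean(estimates)
    within = mean([se ** 2 for se in std_errors])  # average sampling variance
    between = variance(estimates)                  # variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled, total ** 0.5


# Hypothetical slope estimates from five imputed datasets (not the study's data).
slopes = [-6.0, -6.3, -6.1, -6.2, -5.9]
slope_ses = [1.1, 1.0, 1.2, 1.1, 1.0]
print(pool_rubin(slopes, slope_ses))
```

The pooled standard error is never smaller than the average within-imputation standard error, which reflects the extra uncertainty introduced by the missing values.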

Reliability Analyses
In order to ensure the reliability of the measures, Cronbach's Alpha was calculated in each imputed dataset for each wave and domain. The following table reflects reliability statistics that are pooled across imputed datasets.
Based on the statistics reported in Table 1, we may conclude that each domain in each wave was measured reliably, with the exception of the educational quality domain for the control group students. Due to this low reliability, the comparison of the control and treatment groups in the educational quality domain is excluded.
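For readers unfamiliar with the reliability statistic used above, the following is a minimal sketch of Cronbach's alpha computed from an item-response matrix. The function name `cronbach_alpha` and the example responses are hypothetical; the study computed alpha in R on each imputed dataset, and that code is not reproduced here.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    if len(rows) < 2 or len(rows[0]) < 2:
        raise ValueError("need at least two respondents and two items")
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the summed score (must be non-zero).
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses of five students to a three-item domain (1 to 4 coding).
responses = [
    [4, 3, 4],
    [2, 2, 3],
    [3, 3, 3],
    [1, 2, 1],
    [4, 4, 3],
]
print(cronbach_alpha(responses))
```

With only three items, as in the educational quality domain, alpha tends to be lower than for longer domains even when the items are reasonably correlated.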

Obtaining Scaled Domain Scores
In order to obtain total scores in each domain, we summed the responses given to the items of that particular domain. Data were collected using an ordinal scale, but we assumed that they were collected on an interval scale where "really disagree" is represented with 1, "disagree" is represented with 2, "agree" is represented with 3, and "really agree" is represented with 4. Once the total score is calculated, it is rescaled such that the lowest possible scaled domain score is 0 and the highest possible scaled domain score is 100 to ensure comparability across domain score levels.
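The scoring rule above can be sketched as a short function. The source states only the end points of the scaled score (0 and 100), so the linear min-max rescaling used here is an assumption, and the function name `scaled_domain_score` is hypothetical.

```python
def scaled_domain_score(responses, low=1, high=4):
    """Sum item responses and rescale the total linearly to the 0-100 range.

    responses -- item responses for one domain, coded low..high
    The linear (min-max) rescaling is an assumed reading of the text.
    """
    if not responses:
        raise ValueError("responses must not be empty")
    if any(r < low or r > high for r in responses):
        raise ValueError("response outside the coded range")
    n_items = len(responses)
    total = sum(responses)
    min_total = n_items * low    # every item answered "really disagree"
    max_total = n_items * high   # every item answered "really agree"
    return 100 * (total - min_total) / (max_total - min_total)


# Hypothetical answers to the 7 context items.
print(scaled_domain_score([1, 2, 1, 1, 2, 1, 2]))
```

Under this rule, domains with different numbers of items (11, 7, and 3) are placed on the same 0 to 100 range.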

Growth Curve Analyses
In order to address research question 2, the academic self-efficacy of the treatment group students over time is modeled in three waves. Results from the first wave represented the intercept (i.e., the beginning point), and results from the second and the third waves were used to model the growth at the middle point and at the end point of the Project. The "lavaan" package version 0.5-23.1097 (Rosseel, 2012) [27] in R version 3.4.2 (R Core Team, 2017) [26] was used to model students' growth in academic self-efficacy over time for the three domains described in the data collection tool part. Each growth analysis for each particular domain score was run 5 times, once for each imputed dataset, and results are reported as the pooled averages of the five analyses. Figure 1 visualizes the conceptual model for the growth curve analysis.

Comparison of Treatment vs. Control Groups
In order to address research question 3, an independent samples Welch's t test was utilized to compare the means of each particular domain score for treatment and control group students. Welch's t test does not assume equal variance across compared groups (Welch, 1947) [28]. Since group sizes are quite large, the normality assumption can be ignored because of the central limit theorem. Error terms may not be independent of one another, given that students come from 8 schools; considering also the low sample sizes from some schools, the estimated standard errors may be smaller than they should be, which may increase the Type-I error rate.

Results
In this section, we briefly explain the descriptive properties of academic self-efficacy scores of the treatment group students in three waves and control group students in wave 3. Then, we report the growth in academic self-efficacy of the treatment group students over three waves. Finally, we summarize the comparison of treatment and control group students' academic self-efficacy scores in wave 3.

Descriptive Properties of the Sample Regarding Academic Self Efficacy
In order to understand the academic self-efficacy properties of the sample, we obtained means and standard deviations of the three scaled domain scores of academic self-efficacy in three waves for treatment group students and in wave 3 for the control group students. Results are listed in Table 2. Focusing only on the treatment group students, on average, we observed high levels of self-efficacy in ability as 80.352, 70.692, and 68.262 in three waves, respectively. In addition, we observed high levels of educational quality as 86.743, 74.872, and 71.434 in three waves, respectively. On the other hand, we observed low levels of context as 17.558, 30.631, and 31.820 in three waves, respectively. Variability in each academic self-efficacy domain score across treatment group students increased over three waves. Moving to the control group students, on average, moderate self-efficacy in ability scores were observed with the mean of 56.696. On average, observed context scores were low with the mean of 24.906. Descriptive properties of the sample regarding academic self-efficacy are visualized in Figure 2.

Table 2. Descriptive properties of the sample regarding academic self-efficacy.
Sustainability 2019, 11

Growth of Academic Self Efficacy Levels of Treatment Group Students
Three separate growth curve analyses were run, one for each of the three academic self-efficacy scaled domain scores, using each of the five imputed datasets. Results were pooled to establish Table 3. Growth curve analyses for educational quality scaled domain scores did not converge; therefore, they are not reported.

Table 3. Summary table for the growth curve analyses over three waves.
Note: * p < 0.05, ** p < 0.01, and *** p < 0.001. Variance of the intercept and slope are fixed at 1 to obtain standardized results.

Growth Curve Analysis for the Self-Efficacy in Ability
Results in Table 3 revealed that, on average, treatment group students initially had a self-efficacy in ability score of 79.688 (p < 0.0001). There is a significant linear trend for self-efficacy in ability scores over three waves with a slope of −6.139 (p < 0.0001), indicating that, on average, students' scaled self-efficacy in ability domain scores decreased by 6.139 points in each wave. The negative correlation of −0.261 between intercept and slope indicates that students with higher scaled self-efficacy in ability domain scores in wave 1 have a steeper decrease in scaled self-efficacy in ability domain scores, but the relationship is not statistically significant with p = 0.2794. The variance component for the slope of 146.073 is statistically significant with a p value of 0.0011, indicating that the slope for scaled self-efficacy in ability domain scores across three waves varies significantly among students.

Table 3 also revealed that, on average, treatment group students initially had a scaled context domain score of 18.607 (p < 0.0001). There is a significant linear trend for scaled context domain scores over three waves with a slope of 7.040 (p < 0.0001), indicating that, on average, students' scaled context domain scores increased by 7.040 points in each wave. The negative correlation of −0.116 between intercept and slope indicates that students with lower scaled context domain scores in wave 1 have a steeper increase in scaled context domain scores, but the relationship is not statistically significant with p = 0.3156. The variance component for the slope of 148.512 is statistically significant with a p value of 0.0026, indicating that the slope for scaled context domain scores across three waves varies significantly among students.
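As a worked illustration of what the reported intercepts and slopes imply, the sketch below computes model-implied mean scores per wave for a linear growth curve. The time coding of 0, 1, and 2 for the three waves is an assumption (the text says only that wave 1 represents the intercept), and the function name `implied_means` is hypothetical.

```python
def implied_means(intercept, slope, time_scores=(0, 1, 2)):
    """Model-implied mean at each wave for a linear growth curve.

    time_scores -- assumed coding of the waves; 0 places the intercept at wave 1.
    """
    return [round(intercept + slope * t, 3) for t in time_scores]


# Pooled estimates reported in the text.
ability = implied_means(79.688, -6.139)
context = implied_means(18.607, 7.040)

print("ability:", ability)
print("context:", context)
```

Under that assumed coding, the implied ability means can be set beside the observed means reported in the descriptive results (80.352, 70.692, and 68.262), and likewise for the context domain (17.558, 30.631, and 31.820), to see how closely a straight line follows the three waves.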

Comparison of Treatment and Control Group Students in Wave 3
In order to compare treatment and control group students' academic self-efficacy scores in wave 3, we ran Welch's t test for self-efficacy in ability and context scores. Welch's t test does not require variance homogeneity across treatment and control groups. Since the sample size is large, deviations from normality in the scores do not increase the Type-I error rate. Results are listed in Table 4. Notes: * p < 0.05, ** p < 0.01, and *** p < 0.001. 1 df stands for degrees of freedom.

Comparison of Treatment and Control Groups in Wave 3 for the Scaled Self-Efficacy in Ability Domain Scores
Results in Table 4 revealed that the mean of the scaled self-efficacy in ability domain scores in the treatment group was 11.566 points higher than the mean of the scaled self-efficacy in ability domain scores in the control group with t (151.104) = 4.155, p < 0.0001. We are 95% confident that the true population difference is in the (6.067, 17.066) interval. The effect size estimate for this difference is 0.676, indicating that the difference is medium to large.

Comparison of Treatment and Control Groups in Wave 3 for the Scaled Context Domain Scores
Results in Table 4 revealed that the mean of the scaled context domain scores in the treatment group was 6.914 points higher than the mean of the scaled context domain scores in the control group with t (145.372) = 2.465, p = 0.0149. We are 95% confident that the true population difference is in the (1.257, 12.458) interval. The effect size estimate for this difference is 0.409, indicating that the difference is small to medium.
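The comparison above can be sketched from summary statistics alone. The function below computes Welch's t statistic and the Welch-Satterthwaite degrees of freedom; the function name `welch_t` is hypothetical, and the standard deviations in the usage lines are made-up placeholders because the standard deviations from Table 2 are not reproduced in the text. The group means and sizes (117 and 417) are taken from the text.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    v1 = sd1 ** 2 / n1   # squared standard error of group 1 mean
    v2 = sd2 ** 2 / n2   # squared standard error of group 2 mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Mean differences recomputed from the wave 3 means reported in the text
# (treatment minus control).
ability_diff = 68.262 - 56.696
context_diff = 31.820 - 24.906
print(round(ability_diff, 3), round(context_diff, 3))

# Hypothetical standard deviations (25.0 and 20.0), used only to show the call.
t_stat, df = welch_t(68.262, 25.0, 117, 56.696, 20.0, 417)
print(t_stat, df)
```

The recomputed differences agree with the 11.566 and 6.914 points reported above. Because the degrees of freedom depend on both variances and both group sizes, they are generally not whole numbers, which is consistent with values such as 151.104 in Table 4.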

Discussion
Under the sustainable teaching and learning framework, students received education that was based on problem-based learning theory and delivered in an inter-disciplinary manner throughout 20 weeks. The aim of the study was to develop students' problem solving skills on problems related to daily life issues. All of the courses were given in labs, studios, and a gymnasium. Mathematics, Science, Foreign Language, and Personal Development courses (i) were given with Music, Drawing, and Physical Education courses (ii). In this relational and holistic education approach, students will be able to find the connection between science and daily life; hence, they will learn how to learn and this learning will be sustainable.
Throughout 20 weeks, it was observed that the treatment group students' academic self-efficacy levels fell continuously. However, Lv, Zhou, Liu, Guo, Zhang, Liu, and Luo (2018) [29] stated that there is a strong relationship between academic achievement and academic self-efficacy. On the other hand, Honicke and Broadbent (2016) [30] reported that the relationship between academic performance and academic self-efficacy was moderate, based on research that covered a total of 12 years of data collected between 2003 and 2015. Because self-efficacy belief is an important source of motivation for individual success, its domain of influence is not limited to academic success but also extends to skills related to social and emotional areas (Sachitra and Bandara, 2017) [31]. Bandura (1999) [32] found that academic self-efficacy was an important explanatory factor for feeling less depressed and for coping with stress freely and easily. In this context, academic self-efficacy affects academic proficiency as well as social and emotional competence areas.
Students that were selected for the treatment group in this study were considered to be the most successful students of their class and schools. As a result of the selection process, treatment group students showed a high level of academic self-efficacy in Wave 1 because their answers were based on their experience in classrooms and schools where they were already considered as the most successful students. Throughout 20 weeks of the experiment phase, courses taught in the Project Children's University were different from regular courses in their schools. In addition, the fact that almost all of the students from eight different schools were among best students of their classes caused the students to question themselves and see their shortcomings. They found that they were not successful enough to transfer their knowledge to the real world, especially in problem-oriented applications. All of these situations enabled students to make more rational self-criticism by creating awareness, and accordingly, they criticized themselves and found that they in fact do not have a really high level of academic self-efficacy. On the other hand, students attended the activities voluntarily with a high participation rate throughout the project. Moreover, observations and interviews with students, teachers, and parents indicated a positive development in other proficiency areas of the self-efficacy such as social and emotional competence. Since social and emotional competence are not easily measurable, measurement tools about self-efficacy focus mostly on the academic competence. Almost all schools in Turkey perceive academic success as the success in science, math, social studies, and Turkish language. These courses are the main courses in curricula and measured in high-stake exams. 
However, it is not possible to develop students in a holistic manner without lessons such as the physical education course, which improves the orientation of students, the music course, which helps students understand and express their emotions, and the visual arts course, which gives an artistic perspective and aesthetic skills. Considering solely the basic courses and ignoring the other courses listed above causes students to grow up without the skills that are covered in those courses. As mentioned by Bandura (1997) [8], Pajares (1997) [33], and Schunk (1981, 1982) [34,35], a child's development process should not only focus on the cognitive aspect, but also on balanced social and physical aspects. All of the courses taught in Project Children's University were designed to achieve development in students in those three areas holistically. As a result of all these practices, students have not only questioned themselves, but have also started to question what is accepted as academic achievement in their schools. This increase in student awareness may be seen as another reason for the decrease in academic self-efficacy scores. The concept of self-efficacy as being aware of one's goals and realizing what one can and cannot do to achieve these goals (Elias and MacDonald, 2007) [36] is different from the concept of academic self-efficacy. In fact, this realization can be viewed as a process in which students balance themselves. According to Medrano, Flores-Kanter, Moretti, and Pereno (2016) [37], children with high academic self-efficacy have also shown success in balancing emotionally difficult situations. Lower academic self-efficacy scores throughout the study can be regarded as an indication that students have begun to understand the concept of academic self-efficacy. This development and change in the level of cognition of students can be seen as a big step at the beginning of the lifelong learning journey.
Individuals who acquire lifelong learning skills are also accepted as individuals who learn how to learn (Olcum and Titrek, 2015) [38]. Learning how to learn is a sine qua non for sustainable learning.
At the end of wave 3, we observed significant differences in academic self-efficacy domain scores between treatment and control group students. Having higher scores in the self-efficacy in ability domain can be seen as a positive situation, while having higher scores in the context domain can be seen as a negative situation. Compared to control group students, treatment group students have higher levels of self-efficacy in the ability domain. Parallel to the findings of Lv, Zhou, Liu, Guo, Zhang, Liu, and Luo (2018) [29], which indicate a strong relationship between academic achievement and academic self-efficacy, our treatment group students showed higher academic self-efficacy in ability compared to the control group students because they were academically successful. On the other hand, context domain scores for treatment group students were higher than those of control group students, indicating that treatment group students show lower academic self-efficacy in this particular domain. This result was unexpected. When we investigated this issue further through face-to-face interviews with treatment group students, we found that students were happy with how the faculty treated them, the quality of the instruction, and the access to labs, studios, and gymnasium opportunities throughout 20 weeks. Students started blaming their schools for not being able to provide those privileges; therefore, they reflected lower academic self-efficacy levels in the context domain.

Conclusions
Under the sustainable teaching and learning framework, students received education that was based on problem-based learning theory and delivered in an inter-disciplinary manner throughout 20 weeks. The aim of the study was to develop students' problem solving skills on problems related to daily life issues. Hence, students would learn how to learn. Eventually, students would be life-long learners and gain sustainable learning skills. Academic self-efficacy levels of the students were measured as an indicator of sustainable learning skills, assuming that students whose learning skills are sustainable have higher academic self-efficacy. Throughout 20 weeks, it was observed that the treatment group students' academic self-efficacy levels fell continuously. Yet, compared to control group students, treatment group students still have higher levels of self-efficacy. We may conclude that the objective of the project has not been achieved. The educational process designed in the project may be inadequate, or the duration of the project may be too short to improve learners' ability to learn how to learn. Nevertheless, we have observed many positive changes, such as better inter-personal relationships among students, a more positive attitude towards schooling, and increased self-confidence in self-efficacy of the students, in social and emotional areas that were not in the scope of this research. Since the students we recruited for the treatment group already had higher levels of self-efficacy in the ability domain of academic self-efficacy, due to the ceiling effect, there was not much room for them to grow throughout the project. Replicating this study with a more heterogeneous student group may provide different results. Additionally, social and emotional characteristics of self-efficacy should be monitored in a more comprehensive study to measure the growth of the students in a holistic manner.
Due to the unreliable measurement process and non-converging statistical model issues, we failed to investigate the education quality domain of the academic self-efficacy construct. This particular domain was not apparent in the original version of the scale and was measured using only three items. Therefore, the academic self-efficacy measurement tool was limited to measuring only the self-efficacy in ability and context domains. Besides, some parts of the original academic self-efficacy scale, such as the task difficulty and effort dimensions, were not adapted to the Turkish version; hence, those parts are not examined in this study. In addition, the sample consists of students from Siirt city center only, and the results should not be generalized beyond Siirt city center. Researchers should investigate the relationships listed above in other cities and countries, and they should use a more comprehensive data collection tool.