Texas Rural vs. Nonrural School District Student Growth Trajectories on a High-Stakes Science Exam: A Multilevel Approach

Abstract: This study compares the science achievement growth trajectories of fifth-grade students in rural and nonrural school districts in Texas. Using a growth hierarchical linear model, we explored the effects of time, school location (rural vs. nonrural), and their interaction on students' science performance as measured by the high-stakes State of Texas Assessments of Academic Readiness (STAAR) science test over five academic years. We found that rural school students lagged in science at the initial stage when STAAR was first administered in the 2011–2012 school year. With time, the gap between rural and nonrural district students' science performance persisted. We further added eight district-level factors that might influence students' academic performance into the model and found that three variables (i.e., student mobility rate, percentage of students identified as ELs, and teacher turnover rate) consistently influenced students' science scores. The implications for teaching pedagogy and research are discussed regarding science education in Texas rural districts.


Introduction
Compared to other states, Texas has the highest percentage of schools located in rural areas (Texas Education Agency 2017) and the greatest number of rural students attending public schools (Showalter et al. 2017). U.S. rural areas are home to 11.4 million children (Avery 2013). Over 9.1 million students enrolled in rural public schools in fall 2013 (National Center for Education Statistics 2013), yet little attention is given to rural education in either policy or academia (Lavalley 2018). Given rural school districts' small size, limited educational resources and instructional capacity, and high student mobility rates, these districts face challenges serving diverse students, especially English learners (ELs) and economically disadvantaged (ED) students, including improving their academic language proficiency and content knowledge (Cooley and Floyd 2013;Howley et al. 2009;Masumoto and Brown-Welty 2009;Paik and Phillips 2002;Showalter et al. 2017).
With a rich language base, science learning is a process of acquiring both science content and academic language (Lee et al. 2013; Rosebery and Warren 2008). Since science was included in the Texas school accountability system starting in 2007, science education has received increased attention (Maerten-Rivera et al. 2016). In Texas, science is a required course for all grade levels, and the high-stakes science test is administered in Grades 5 and 8. The literature indicated that fifth grade is a critical time for students' science learning (Singh et al. 2002) and motivation (Karaarslan and Sungur 2011), as well as selection of advanced science courses and science career aspirations (Simpkins et al. 2006; Singh et al. 2002). Furthermore, although rural students have opportunities to explore the natural world and apply science concepts in their daily life, researchers found that science education in rural areas can be problematic, stemming from a lack of funding, educational resources, and personnel (Avery 2013; Avery and Kassam 2011). According to a National Commission on Teaching and America's Future (2002) report, due to limited funding and resources, rural schools not only have difficulty in attracting and retaining quality science teachers, but also employ teachers with fewer science teaching credentials.
Therefore, in this study, we examine and compare the growth trajectories of Texas fifth-grade students' science performance in rural and nonrural school districts. This paper includes (a) a brief overview of the impact of rural factors on students' science performance and (b) a quantitative, data-driven analysis that compares Texas students' science achievement in rural and nonrural school districts.

An Overview of the Impact of Rural Demographic and Social Context on Students' Science Learning
Among the features commonly cited as negatively impacting educational development in rural districts, the following three have received the most attention: isolation (Lee et al. 2018; Williams 2010), underfunding (Williams 2010; Showalter et al. 2017), and small enrollment size (Jimerson 2005; Showalter et al. 2017).
First, the great physical distances of rural settings result in isolation based on geographic location (Reeves 2003) and teacher professional isolation (Howley and Howley 2005; Mollenkopf 2009). Economic hardship pushes impoverished ethnic minority populations to move to geographically isolated locations (Lichter 2012; Massey and Pren 2012). "People who live in communities with historically limited access to education or modest economic returns to educational investments may simply invest less in education" (Roscigno et al. 2006, p. 2123). In addition, rural communities and school districts have trouble recruiting and retaining qualified teachers (Howley et al. 2002; Kearney et al. 2018; Reeves 2003). Since rural districts often hire less experienced and/or novice teachers, teacher professional development is needed to improve the quality of instruction. However, rural teachers reported professional isolation, which hinders them from acquiring the knowledge and skills of working with rural students who have diverse needs in education, behavior, and mental health (Monk 2007). Moreover, among the rural teachers who are trained within the district, a quarter of them leave each year and move to a more affluent district nearby in search of higher salaries and shorter commutes (Kearney et al. 2018).
Second, compared to nonrural schools, rural schools receive more limited funding from state and federal agencies (Harmon and Smith 2007;Williams 2010). According to Showalter et al. (2017), only 17% of state education funding was funneled to rural school districts. With inadequate funding, rural teachers are provided with outdated facilities (Harmon and Smith 2007) and have limited access to professional development (Harmon and Smith 2007;Kenny et al. 2008;Showalter et al. 2017). The National Education Association (2008) reported that rural teachers are often required to teach several subjects and at different grade levels due to finite school funding and personnel. Furthermore, rural teachers may take on other school-related responsibilities, such as supervising extracurricular activities or driving school buses (Brownell et al. 2005).
Third, rural school districts face small student enrollments (Monk 2007; Reeves 2003). Although smaller class sizes are appreciated by some rural teachers (Kenny et al. 2008; Monk 2007), Jimerson (2004) argued that with small student sample sizes, it is challenging for researchers and policymakers to make valid and reliable evaluations and recommendations for rural school districts to improve their Adequate Yearly Progress. Moreover, small student enrollment also directly affects how much funding rural districts receive (Reeves 2003).

Rural Culture, Local Knowledge, and Science Learning
Learning is a social and cultural process (Vygotsky 1987), which involves cultural practices and everyday life (Zimmerman and Weible 2017). A commonly adopted pedagogy in rural education studies is from a place-based perspective that incorporates students' local community experience with broader concerns in discipline-specific practices (Avery 2013; Zimmerman and Weible 2017). The literature indicated that when rural students' local knowledge is valued, their academic performance can be improved (Avery and Kassam 2011; Borgerding 2017; Zimmerman and Weible 2017). We were able to locate a handful of empirical studies on science education in rural areas. For example, Zimmerman and Weible (2017) investigated 74 ninth- and tenth-grade biology students' understanding of their local community and water quality. They found that as students learned the science content, they acknowledged the impact of local activities on the environmental health of their community. Borgerding (2017) studied the role of the teacher in guiding students to make connections between science and their local environment. A total of 50 high school biology students participated in this study. The researcher found that with the teacher's assistance, students learned about rural places, made connections between their daily lives and local environmental problems, and were able to leverage their rural experiences as they engaged in science during school. Avery and Kassam (2011) examined 20 fifth- and sixth-grade students' science learning through the analysis of videotaped interviews and photographs. In their study, all students identified scientific phenomena in their daily lives, and they learned science concepts by observing the rural environment and participating in daily activities. The authors suggested that students' local rural knowledge contributed to their science learning.
Although these three studies covered rural student science learning, none of them was a longitudinal study reflecting the growth of student science achievement. Moreover, none of these studies adopted a standardized test or high-stakes exam to measure student science progress. Furthermore, the student sample size in each of these studies was very small, so caution needs to be taken regarding the generalizability of their findings.

The Context of Texas Rural Districts
Texas has the largest rural student enrollment in the nation, reaching nearly 610,000 students in 2015-2016 (Showalter et al. 2017). Due to social and demographic changes in Texas small towns, the number of ELs enrolled in rural schools is growing rapidly (Schafft and Jackson 2010). In Texas, ELs and ED students comprise a large proportion of the student population in rural districts, with 8.2% of EL students attending rural schools compared to 3.5% at the national level (Showalter et al. 2017). In rural districts, 53.3% of students are eligible for free or reduced lunch (Showalter et al. 2017).
Texas rural school districts face the same challenges as other rural districts across the nation: low funding, isolation, and small enrollments. Compared with other states, Texas rural school districts have a larger proportion of ELs and a higher percentage of students from low-socioeconomic status (SES) families. Additionally, according to Showalter et al. (2017), Texas rural districts have remarkably low instructional spending per student, which makes Texas one of the most inequitable states across the nation. Only 14.2% of Texas's state educational funding goes to rural districts (Showalter et al. 2017). Harter (1999) emphasized that students' high-level academic achievement is associated with the amount of funding spent on hiring qualified teachers and the provision and maintenance of basic supplies. Despite this finding, Texas rural school districts are less competitive for grant funds than their nonrural counterparts (Texas Education Agency 2017). Moreover, due to their isolated locations, rural school districts are plagued by persistent shortages of educational resources (Cooley and Floyd 2013), quality teachers (Hutson et al. 2011), and administrative personnel (Canales et al. 2008). In small rural school districts, teachers and administrators often fill multiple roles other than managing district finances, instructional support, and human resources (Canales et al. 2008).
In terms of teacher professional isolation, Hansen-Thomas et al. (2016) conducted a survey of 159 K-12 teachers from Texas rural school districts regarding their professional development needs. They found that rural teachers encountered many challenges, including working with ELs to improve their language and academic learning, encouraging parental involvement, and having limited time for professional development to improve the quality of instruction. Canales et al. (2008) observed the leadership behaviors of 278 superintendents, principals, and teachers in small rural school districts in Texas. They found that superintendents, principals, and teachers are expected to take on more administrative responsibilities due to the districts' limited staffing, and they felt isolated due to scant supports. This led to underperformance and high turnover of teachers and administrators. For example, according to the Texas Academic Performance Reports from the Texas Education Agency, the teacher turnover rates of Buckholts ISD (a typical rural school district) ranged from 55.2% to 88.1% between 2012 and 2017, while at the state level these ranged from 15.3% to 16.6%.

Texas Science Instruction and Standards
In Texas, science is a required subject at every grade level. As required by the Texas Education Code (2017), the primary goal of science instruction is to provide students with opportunities to learn about the natural world. Texas science teachers are expected to follow the Texas Essential Knowledge and Skills (TEKS), a series of curriculum standards for each grade level. The TEKS outlines the knowledge and skills that Texas students should know and master. For Grade 5 science instruction and standards, there are ten types of essential knowledge and skills that students should know: (a) following safety procedures and environmentally appropriate practices when conducting indoor and outdoor scientific investigations; (b) applying scientific investigative practices, such as asking well-defined questions and collecting observation data; (c) using scientific problem-solving skills and critical thinking before drawing conclusions; (d) utilizing different methods and tools when conducting science inquiry; (e) concepts of matter and energy; (f) different forms of energy through various cycles, patterns, and systems; (g) the surface of the Earth is changing constantly and consists of various types of natural resources; (h) the patterns of the natural world and the relationships between the Sun, Moon, and Earth; (i) certain types of relationships, cycles, and systems in the natural environment; and (j) how organisms survive in their environment with specific structures and behaviors (Texas Education Code 2017). In Grade 5, students should acquire the basic skills to answer certain types of science questions through observation and update their conclusions as new information is available or observed (Texas Education Code 2017). Moreover, school districts are able to develop their own curricular frameworks and pacing guides that are aligned to the TEKS to facilitate standards-based instruction (Jackson and Ash 2012).
The State of Texas Assessments of Academic Readiness (STAAR) science test, the annual high-stakes exam, has been administered to measure student science learning in Grades 5 and 8 since 2011-2012.

Students' Science Achievement in Texas Rural Schools
In order to meet the requirements of the No Child Left Behind Act and to promote students' science achievement, the Texas Science Initiative program was established to provide every science classroom with qualified science teachers who utilize rigorous and effective science teaching (Schroeder et al. 2007). However, with a persistent shortage of educational resources, Texas districts struggle to provide quality public education to their steadily growing student population (Cooley and Floyd 2013). There are very few empirical studies investigating students' science achievement in rural Texas. For example, Jackson and Ash (2012) investigated the impact of a three-year science instruction and school-wide professional development intervention on students' achievement in two Texas elementary schools with high-need and ethnically diverse populations. The 5E Instructional Model (Bybee 1996) for science inquiry and academic content vocabulary development were embedded in science instruction. The researchers found that with the support of professional development, teachers transferred state standards into practice and significantly improved student performance on high-stakes tests. Holland et al. (2011) looked at the relationship between school district size and students' academic achievement. They utilized Texas Assessment of Knowledge and Skills (TAKS) data retrieved from 1135 school districts. Their results showed that students in rural school districts with small enrollments underperformed compared to students from larger school districts. Due to small enrollment sizes and limited funding, Texas rural school districts struggled to provide quality education to students. The authors urged the state to focus on these rural districts for science education and to provide more resources.
Taken together, this review of literature on science education in Texas rural school districts revealed (a) that rural students underperform compared to their peers in urban and suburban areas in science, (b) teachers and administrators in rural districts need additional support and professional development to work with students, and (c) the importance of rural teachers utilizing local resources and areas as well as integrating students' daily experiences with the science curriculum to enhance learning. These findings concurred with the Showalter et al. (2017) report that local rural context (e.g., percentage of rural schools) and student background (e.g., percentage of rural ELs) are the two most important considerations for evaluating the condition of rural education in a state. However, few empirical studies have been conducted to explore academic success for rural students in general, especially in science (Zuniga et al. 2005). Furthermore, limited research exists showing how well students understand the specific knowledge and skills required in the science TEKS.
Therefore, in the next section, we present a data-driven analysis of fifth-grade students' science performance in Texas districts. We compare rural and nonrural student science performance on the high-stakes STAAR, to explore the extent to which the rural context impacts students' science achievement. The research questions were as follows.

1. Was there a significant difference between Texas rural and nonrural school districts in their student performance on the STAAR science test in the 2011-2012 school year when STAAR was first administered?
2. Over time, did Texas rural school districts significantly differ from nonrural districts in their growth trajectories of student performance on the STAAR science test from 2011 to 2016?
3. What was the impact of district-level characteristics (i.e., student mobility rate; percentage of students identified as ELs; teacher-student ratio; percentage of full-time teachers with bachelor's, master's, and doctoral degrees; teacher average base salary; and teacher turnover rate) on student performance on the STAAR science test, when time and location are controlled for?

Research Design and Context
According to the Texas Education Agency (2017), a Texas district is classified as rural if it has (a) an enrollment of fewer than 300 students or (b) an enrollment of between 300 and the state median district enrollment, with an enrollment growth rate of less than 20% over the past five years. In 2015-2016, there were 459 districts classified as rural among 1210 public school districts in Texas.
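The two-part classification rule can be expressed as a small predicate. The following is only a sketch of the rule as paraphrased above: the function and argument names, the handling of boundary values (exactly 300 students, exactly the median, exactly 20% growth), and the example median enrollment are assumptions made for illustration, not Texas Education Agency definitions.

```python
def is_rural(enrollment: int,
             five_year_growth_rate: float,
             state_median_enrollment: float) -> bool:
    """Return True if a district meets either rural criterion described in the text.

    five_year_growth_rate is a proportion (0.20 means 20% growth over five years).
    """
    # Criterion (a): fewer than 300 students enrolled.
    if enrollment < 300:
        return True
    # Criterion (b): enrollment between 300 and the state median district
    # enrollment, with five-year enrollment growth below 20%.
    # Treating both ends of the band as inclusive is an assumption.
    in_size_band = 300 <= enrollment <= state_median_enrollment
    slow_growth = five_year_growth_rate < 0.20
    return in_size_band and slow_growth


# Hypothetical state median enrollment, used only to exercise the function.
HYPOTHETICAL_MEDIAN = 850

for enrollment, growth in [(250, 0.50), (500, 0.10), (500, 0.25), (1000, 0.00)]:
    label = "rural" if is_rural(enrollment, growth, HYPOTHETICAL_MEDIAN) else "nonrural"
    print(f"enrollment={enrollment}, growth={growth:.0%}: {label}")
```

Under these assumptions, a very small district is rural regardless of growth, while a mid-sized district is rural only when its enrollment growth is slow.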
In this study, we randomly selected 90 rural districts and paired them with 90 typical nonrural districts, including 11 urban and 79 suburban districts. Therefore, a total of 180 school districts were included. Through the publicly available Texas Assessment Management System (TAMS), we gathered district-level STAAR science data for these 180 districts. Since STAAR was first administered in the 2011-2012 school year as a state accountability measure and the STAAR science assessment is first administered in fifth grade, we collected district-level STAAR science scores of fifth-grade students from 2011 to 2016. District-level demographic data for the 2011-2012 school year were collected from Texas Education Agency reports. These data comprised student mobility rate (a student who has missed school for over six weeks of the school year is considered mobile); percentage of students identified as ELs; teacher-student ratio; percentage of full-time teachers with bachelor's, master's, and doctoral degrees; teacher average base salary; and teacher turnover rate (the percentage of teachers from the fall of the previous academic year who were not employed in the district in the fall of the following academic year).

Measurement
STAAR is a paper-based and state-mandated assessment for school accountability purposes. It measures students' academic performance on core subjects, including reading, math, social studies, writing, and science. The STAAR Grade 5 science test assesses students' science learning across four reporting categories (RC): RC1-Matter and Energy; RC2-Force, Motion, and Energy; RC3-Earth and Space; and RC4-Organisms and Environments, which are aligned with the TEKS. The test consists of 36 items, and it has three performance levels: unsatisfactory, satisfactory, and advanced. RC1 includes six items (17%) and measures students' understanding of the properties of matter and energy and their interactions. RC2 has eight items (22%) and tests students' knowledge of force, motion, and energy. RC3 consists of ten items (28%) and evaluates students' knowledge of Earth and space. RC4 includes 12 items (33%) and tests students' familiarity with organisms and environments.
The total raw score possible on the STAAR science test is 44 points, with eight points for RC1, ten points for RC2, 12 points for RC3, and 14 points for RC4. A student needs to earn a total of 26 points or above on the STAAR science test to be rated at a satisfactory level. The same standard for aggregating students' performance was followed by TAMS and applied to the district-aggregated STAAR science score. For example, a district should score above 4.7 points in RC1, 5.9 points in RC2, 7.1 points in RC3, and 8.3 points in RC4 to meet a satisfactory level.
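The district-level thresholds quoted above are consistent with spreading the 26-point satisfactory cut score across the reporting categories in proportion to each category's maximum points. The sketch below checks that arithmetic; proportional scaling is our reading of the numbers, not a procedure the source attributes to TAMS, and the names are illustrative.

```python
# Maximum raw points per reporting category, as stated in the text.
MAX_POINTS = {"RC1": 8, "RC2": 10, "RC3": 12, "RC4": 14}
SATISFACTORY_TOTAL = 26  # total raw points needed for a satisfactory rating


def category_thresholds(max_points: dict, cut_total: float) -> dict:
    """Split a total cut score across categories in proportion to their maximum points."""
    total_points = sum(max_points.values())  # 44 for the values above
    return {
        category: round(cut_total * points / total_points, 1)
        for category, points in max_points.items()
    }


thresholds = category_thresholds(MAX_POINTS, SATISFACTORY_TOTAL)
print(thresholds)  # {'RC1': 4.7, 'RC2': 5.9, 'RC3': 7.1, 'RC4': 8.3}
```

The rounded values match the 4.7, 5.9, 7.1, and 8.3 points given in the text and sum to the 26-point total.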

Data Analysis and Model Specification
We compared rural and nonrural school districts' student science performance by utilizing the district-aggregated, fifth-grade STAAR science achievement data. A growth hierarchical linear model (GHLM) was adopted to analyze the multilevel longitudinal dataset since the aggregated data were collected for school years 2011-2016. This growth model was chosen because, compared to traditional repeated-measure techniques, it offers additional benefits and information about the data (Chen and Cohen 2006). For example, growth modeling estimates both linear and nonlinear change in the data and allows estimation with unbalanced data that have missing information at some time points (Chen and Cohen 2006; Kwok et al. 2008). Five separate models were created during the model-building process, and each was repeated four times for reporting categories 1 to 4. We utilized SAS 9.4 to complete these analyses. Model specifications are described as follows.
Model 1: an unconditional model. Starting with an unconditional model, this analysis provides information about the mean of RC science scores. It shows whether school districts generally demonstrate variation in these scores and whether investigating the school district variability in these scores over time is plausible. It is unnecessary to further explain the variability across time if the variability in science scores across school districts is not statistically significant.
Level-1 Model: $RC_{ij} = \beta_{0j} + r_{ij}$
Level-2 Model: $\beta_{0j} = \gamma_{00} + u_{0j}$
Mixed Model: $RC_{ij} = \gamma_{00} + u_{0j} + r_{ij}$
$RC_{ij}$ is STAAR science performance at time i for school district j, $\beta_{0j}$ is the expected mean STAAR science RC score for an individual school district j, $\gamma_{00}$ is the expected grand mean RC score across all occasions and districts, $u_{0j}$ is the deviation of district j from $\gamma_{00}$ (i.e., a between-district random effect), and $r_{ij}$ is the deviation of time i from district j's mean RC score (i.e., a within-district random effect).
Model 2: the unconditional growth model. Based on the unconditional model, this model adds a time variable at level 1. This model is to explore the estimated average growth rate in RC scores for districts in each year.
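The equations for Model 2 are not printed in the text. Assuming Model 2 mirrors Model 3 without the location term and, like Model 3, keeps the time slope fixed across districts, it could be written as the following sketch, with TIME coded 0 for the 2011-2012 reference year:

```latex
\begin{align*}
\text{Level-1 Model:}\quad & RC_{ij} = \beta_{0j} + \beta_{1j}\,\mathrm{TIME}_{ij} + r_{ij} \\
\text{Level-2 Model:}\quad & \beta_{0j} = \gamma_{00} + u_{0j}, \qquad \beta_{1j} = \gamma_{10} \\
\text{Mixed Model:}\quad   & RC_{ij} = \gamma_{00} + \gamma_{10}\,\mathrm{TIME}_{ij} + u_{0j} + r_{ij}
\end{align*}
```

Under this coding, $\gamma_{00}$ would be the average RC score in 2011-2012 and $\gamma_{10}$ the average annual change in the RC score.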
Model 3: the conditional growth model. Based on the unconditional growth model, Model 3 adds location (rural vs. nonrural) as the level-2 predictor. By adding this predictor, we determine whether, on average, rural districts exhibit different growth trajectories than nonrural school districts regarding their RC science scores.
Level-1 Model: $RC_{ij} = \beta_{0j} + \beta_{1j}(\mathrm{TIME}_{ij}) + r_{ij}$
Level-2 Model: $\beta_{0j} = \gamma_{00} + \gamma_{01}(\mathrm{Location}_{j}) + u_{0j}$; $\beta_{1j} = \gamma_{10}$
Mixed Model: $RC_{ij} = \gamma_{00} + \gamma_{01}\,\mathrm{Location}_{j} + \gamma_{10}\,\mathrm{TIME}_{ij} + u_{0j} + r_{ij}$
$\gamma_{01}$ is the difference between rural and nonrural school districts in RC science scores in the school year 2011-2012, and $u_{0j}$ is the district-level residual (after considering whether a district is rural or nonrural) in the intercept $\beta_{0j}$.
Time was added as a level-1 predictor, with the school year 2011-2012 as the reference time point, in Models 2, 3, and 4. In total, five time points were included in the analysis to represent the school years from 2011 to 2016. Location (rural vs. nonrural) was a level-2 predictor. In this study, in order to explore school districts' STAAR science performance across the four reporting categories (RC1, RC2, RC3, and RC4), we replicated each model four times, using RC1, RC2, RC3, and RC4 as the outcome separately. To test model fit, the difference in deviance (-2 log-likelihood) between two nested models (Model 1 vs. Model 2; Model 2 vs. Model 3; Model 3 vs. Model 4) was calculated with the following formula: $\chi^2 = \mathrm{Deviance}_{\mathrm{Reduced}} - \mathrm{Deviance}_{\mathrm{Full}}$. For example, in order to test whether Model 2 is significantly different from Model 1, that is, whether it is necessary to add time as a level-1 predictor, the difference in deviances between Model 1 and Model 2 was calculated. The critical value was set as 3.84, the .05 critical value of the chi-square distribution with one degree of freedom. If the difference in deviances between Model 1 and Model 2 is larger than 3.84, Model 2 is statistically significantly different from Model 1, which indicates that adding time as a level-1 predictor in Model 2 is meaningful.
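The deviance comparison can be sketched in a few lines. The deviance values below are hypothetical placeholders (the text reports only the chi-square differences, not the deviances themselves), and the function name is illustrative. The p-value uses the closed form for a chi-square variable with one degree of freedom, which applies when the two models differ by a single parameter.

```python
import math

CRITICAL_VALUE_DF1 = 3.84  # critical value used in the text (alpha = .05, df = 1)


def deviance_test(deviance_reduced: float, deviance_full: float,
                  critical_value: float = CRITICAL_VALUE_DF1):
    """Compare two nested models by the difference in their deviances.

    Returns (chi_square, p_value, significant). The p-value assumes the
    models differ by exactly one parameter (one degree of freedom).
    """
    chi_square = deviance_reduced - deviance_full
    if chi_square <= 0:
        return chi_square, 1.0, False
    # For 1 df, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value, chi_square > critical_value


# Hypothetical deviances chosen so that the difference is 88.5.
chi_square, p_value, significant = deviance_test(1500.0, 1411.5)
print(f"chi-square = {chi_square:.1f}, p = {p_value:.3g}, significant = {significant}")
```

A difference below 3.84 would be flagged as not significant, which corresponds to retaining the simpler model.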
In addition, to examine what percentage of the variance of the level-2 random effect could be explained by district location, we calculated the intraclass correlation (ICC) with the following formula: $ICC = \tau_{00} / (\tau_{00} + \sigma^2)$.
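As a small numerical illustration of the ICC formula, the sketch below computes the ratio from a between-district variance ($\tau_{00}$) and a within-district residual variance ($\sigma^2$). The variance components are hypothetical values chosen for illustration; they are not estimates reported in this study. The ratio is conventionally read as the share of total variance in the outcome that lies between districts.

```python
def intraclass_correlation(tau_00: float, sigma_squared: float) -> float:
    """ICC = tau_00 / (tau_00 + sigma^2), using variance components from an unconditional model."""
    total_variance = tau_00 + sigma_squared
    if total_variance <= 0:
        raise ValueError("Total variance must be positive.")
    return tau_00 / total_variance


# Hypothetical variance components (not taken from the study's tables).
icc = intraclass_correlation(tau_00=0.41, sigma_squared=0.59)
print(f"ICC = {icc:.2f}")
```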
Model 5: the full model, that is, the interaction model with district-level covariates. This model further incorporates district-level covariate factors, including the student mobility rate; percentage of students identified as ELs; teacher-student ratio; percentage of full-time teachers with a bachelor's, master's, and doctoral degree; teacher average base salary; and teacher turnover rate at the district level in the 2011-2012 school year. This model is designed to explore, over time, whether rural districts exhibit different growth trajectories from nonrural school districts regarding their RC science scores, while taking additional district characteristics into consideration.

Results
Descriptive statistics of rural and nonrural school district Grade 5 STAAR scores by reporting categories are reported in Table 1. An independent sample t-test was conducted to examine differences between rural and nonrural school district characteristics, as collected in the school year 2011-2012.
Research Question 1: Was there a significant difference between Texas rural and nonrural school districts in their student performance on the STAAR science test in the 2011-2012 school year when STAAR was first administered?

ICC was calculated to determine the percentage of the variance of the level-2 random effect that could be explained by school district location in the four reporting categories. The results indicated that 41%, 40%, 55%, and 58% of the variance could be explained by district location for RC1, RC2, RC3, and RC4, respectively. The parameter estimates of Model 1 by reporting categories are listed in Table 3.
Table 3. Parameter estimates of fixed and random effects of Model 1 by reporting categories.

[Table 3, surviving row for RC1] Fixed effect: Intercept (γ00); Coefficient (SE) = 5.65 (0.03); t(df) = 172.14 (178); p < 0.001.
Time was added as the level-1 predictor in Model 2 to answer the first research question. Model fit analysis indicated that Model 2 was significantly different from Model 1 in three of the four reporting categories, with a chi-square value of model fit change of 88.5 for RC1, 290.8 for RC2, and 31.1 for RC3, all of which were higher than the critical value of 3.84. For RC4, the chi-square value of model change was less than 3.84, which indicated no statistical difference between Models 1 and 2 for RC4. The parameter estimates of Model 2 by reporting categories are shown in Table 4. On average, school district fifth-grade STAAR science scores in the 2011-2012 school year were 5.85, 7.54, 7.99, and 9.62 for RC1, RC2, RC3, and RC4, respectively. The results indicated that time was a statistically significant predictor for RC1, RC2, and RC3. Between 2011 and 2016, students' STAAR science performance across school districts decreased 0.10, 0.22, and 0.08 points annually for RC1, RC2, and RC3, respectively. Over the same period, students' STAAR science scores for RC4 across school districts increased 0.01 points annually; however, time was not a statistically significant predictor for RC4.
Research Question 2: Over time, did Texas rural school districts significantly differ from nonrural districts in their growth trajectories of student performance on the STAAR science test from 2011 to 2016?

In order to examine the difference between rural and nonrural districts regarding their STAAR science RC scores, location (rural vs. nonrural) was added in Model 3. The model fit analysis indicated that Model 3 was significantly different from Model 2 in reporting categories 1-4, with a chi-square value of model change of 14.9 for RC1, 16.8 for RC2, 30.8 for RC3, and 5.9 for RC4. The chi-square values of these four categories were higher than the critical value of 3.84. The parameter estimates of Model 3 by reporting categories are listed in Table 5. The results indicated that location is a statistically significant predictor across the four reporting categories. In the 2011-2012 school year, Texas rural school districts on average scored 5.72, 7.37, 7.68, and 9.47 points in RC1, RC2, RC3, and RC4, respectively, while nonrural school districts on average scored 5.97, 7.70, 8.29, and 9.76 points. Growth rates for the four reporting categories remained the same as in Model 2. The interaction between time and location (rural vs. nonrural) was added in Model 4 to explore the growth trajectory differences between rural and nonrural school districts regarding their STAAR science RC scores. Compared with Model 3, the chi-square change of Model 4 was 8.3, 1.9, 1.0, and 0.5 for RC1, RC2, RC3, and RC4, respectively. This suggested that by adding the interaction between time and location, Model 4 was statistically significantly different from Model 3 in only one reporting category, RC1, where the chi-square change was larger than the critical value of 3.84. In Model 4, time and location remained statistically significant predictors in RC1, RC2, and RC3 after the interaction variable was included in the model. In RC4, after the interaction between time and location was added, time was not a statistically significant predictor, and location became marginally significant in Model 4. Additionally, the interaction between time and location was a statistically significant predictor in RC1. The parameter estimates of Model 4 by reporting categories are displayed in Table 6.
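To make the significant RC1 interaction concrete, the sketch below projects the linear Model 4 trajectories for rural and nonrural districts across the five time points, using the RC1 starting scores and annual changes reported for Model 4 in this section (5.67 and -0.07 for rural; 6.04 and -0.13 for nonrural). The function and variable names are illustrative, and the projected values are simple arithmetic on the rounded estimates rather than results taken from the tables.

```python
# RC1 estimates reported for Model 4: (2011-2012 score, annual change).
MODEL4_RC1 = {
    "rural": (5.67, -0.07),
    "nonrural": (6.04, -0.13),
}


def predicted_score(intercept: float, annual_change: float, years_elapsed: int) -> float:
    """Linear growth prediction: the score after years_elapsed years from 2011-2012."""
    return intercept + annual_change * years_elapsed


def rural_gap(years_elapsed: int) -> float:
    """Nonrural minus rural predicted RC1 score at a given time point."""
    rural = predicted_score(*MODEL4_RC1["rural"], years_elapsed)
    nonrural = predicted_score(*MODEL4_RC1["nonrural"], years_elapsed)
    return nonrural - rural


for t in range(5):  # five time points, coded 0 to 4
    rural = predicted_score(*MODEL4_RC1["rural"], t)
    nonrural = predicted_score(*MODEL4_RC1["nonrural"], t)
    print(f"time {t}: rural = {rural:.2f}, nonrural = {nonrural:.2f}, gap = {rural_gap(t):.2f}")
```

On these rounded estimates, the RC1 gap narrows over the period because nonrural scores decline faster, but rural districts remain below nonrural districts at every time point.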
The results of Model 4 revealed that Texas rural school districts scored 5.67, 7.33, 7.65, and 9.49 points in RC1, RC2, RC3, and RC4, respectively, in the 2011-2012 school year, while nonrural school districts scored 6.04, 7.72, 8.32, and 9.74 points in these four categories in the same school year. In Texas rural areas, district STAAR science RC1, RC2, and RC3 scores decreased 0.07, 0.2, and 0.07 points annually during the 2011 to 2016 school years. Their RC4 scores, however, increased 0.001 points annually during the same period. In nonrural school districts, STAAR science RC1, RC2, and RC3 scores decreased 0.13, 0.23, and 0.1 points annually during the 2011 to 2016 school years, while RC4 increased 0.02 points annually.
Research Question 3: What was the impact of district-level characteristics (i.e., student mobility rate; percentage of students identified as ELs; teacher-student ratio; percentage of full-time teachers with bachelor's, master's, and doctoral degrees; teacher average base salary; and teacher turnover rate) on student performance on the STAAR science test, when time and location are controlled for?
To explore the influence of other related factors on rural and nonrural school districts' STAAR science scores, eight additional district-level variables (student mobility rate; percentage of students identified as ELs; teacher-student ratio; teacher average base salary; teacher turnover rate; and percentage of teachers with bachelor's, master's, and doctoral degrees) were added in Model 5. The results of Model 5 indicated that time remained a statistically significant predictor in RC 1-3. In RC1, both location and the interaction variable remained statistically significant predictors. However, location and the location-time interaction were no longer significant predictors in RC 2-4 in Model 5. The parameter estimates of Model 5 by reporting category are displayed in Table 7. We found that student mobility rate, percentage of students identified as ELs, and teacher turnover rate had a statistically significant negative impact on student science achievement in all four reporting categories. For example, in RC1, a one-unit increase in student mobility rate was associated with a 0.02-point decrease in the RC1 score, a one-unit increase in the percentage of students identified as ELs with a 0.01-point decrease, and a one-unit increase in teacher turnover rate with a 0.01-point decrease. This indicates that districts with higher mobility rates, higher turnover rates, or higher percentages of ELs had lower STAAR science RC scores. We identified a similar pattern in RC 2-4. Additionally, the percentages of teachers holding graduate degrees were not statistically significant predictors in any reporting category. Furthermore, teacher-student ratio, teacher average base salary, and percentage of teachers with bachelor's degrees each had a statistically significant impact on only one of the reporting categories. This suggested that these three predictors are not as influential on fifth-grade students' science scores as other district-level variables, such as student mobility rate, percentage of EL students, and teacher turnover rate.
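The interpretation of the RC1 coefficients can be made concrete with a short sketch. It uses only the three RC1 estimates quoted above (-0.02, -0.01, and -0.01) and assumes that the predictors enter the model linearly and additively and that one unit corresponds to one percentage point; the district scenario and the function name are hypothetical and serve only as an illustration.

```python
# RC1 coefficients quoted in the text for Model 5 (change in RC1 score per one-unit increase).
RC1_COEFFICIENTS = {
    "student_mobility_rate": -0.02,
    "percent_el_students": -0.01,
    "teacher_turnover_rate": -0.01,
}


def predicted_rc1_change(unit_changes, coefficients=RC1_COEFFICIENTS):
    """Sum the linear contributions of each predictor's change to the RC1 score.

    Assumes a linear, additive model with all other predictors held constant.
    """
    return sum(coefficients[name] * delta for name, delta in unit_changes.items())


# Hypothetical comparison: a district whose mobility rate is 5 units higher,
# EL percentage 10 units higher, and teacher turnover rate 5 units higher.
hypothetical_difference = {
    "student_mobility_rate": 5,
    "percent_el_students": 10,
    "teacher_turnover_rate": 5,
}

change = predicted_rc1_change(hypothetical_difference)
print(f"Predicted difference in RC1 score: {change:.2f} points")
```

Because the reported coefficients are rounded to two decimals, any such calculation should be read as an approximation rather than as a reproduction of the fitted model.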
To summarize, we found student mobility rate, percentage of EL students, and teacher turnover rate to be statistically significant predictors of fifth-grade students' science performance in all reporting categories for both rural and nonrural school districts. Furthermore, we found that Texas rural school districts lagged behind nonrural school districts in 2011-2012, when the STAAR science test was first implemented as a state-mandated, high-stakes assessment. As time passed, both Texas rural and nonrural school districts had significant decreases in their STAAR science performance in reporting categories 1-3 and showed no significant change in RC4. The fifth-grade science achievement gap between rural and nonrural school districts persisted into school year 2015-2016. Among the eight district-level variables, student mobility rate, percentage of EL students, and teacher turnover rate played critical roles in fifth-grade student science learning.
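The persistence of the rural-nonrural gap can be traced directly from the Model 4 estimates reported above. The sketch below is illustrative only: it assumes that time is coded 0 for 2011-2012 through 4 for 2015-2016 (the coding is not stated in this section), it uses the rounded intercepts and annual changes quoted in the text, and the function names are hypothetical.

```python
# Model 4 estimates quoted in the text: (2011-2012 score, annual change) by location.
MODEL4_ESTIMATES = {
    "RC1": {"rural": (5.67, -0.07), "nonrural": (6.04, -0.13)},
    "RC2": {"rural": (7.33, -0.20), "nonrural": (7.72, -0.23)},
    "RC3": {"rural": (7.65, -0.07), "nonrural": (8.32, -0.10)},
    "RC4": {"rural": (9.49, 0.001), "nonrural": (9.74, 0.02)},
}


def predicted_score(intercept, slope, years_since_start):
    """Linear growth trajectory: starting score plus annual change times elapsed years."""
    return intercept + slope * years_since_start


def nonrural_minus_rural_gap(reporting_category, years_since_start):
    """Predicted nonrural score minus predicted rural score at a given time point."""
    estimates = MODEL4_ESTIMATES[reporting_category]
    rural = predicted_score(*estimates["rural"], years_since_start)
    nonrural = predicted_score(*estimates["nonrural"], years_since_start)
    return nonrural - rural


# Assumed time coding: 0 = 2011-2012, 4 = 2015-2016.
for rc in MODEL4_ESTIMATES:
    start_gap = nonrural_minus_rural_gap(rc, 0)
    end_gap = nonrural_minus_rural_gap(rc, 4)
    print(f"{rc}: gap in 2011-2012 = {start_gap:.2f}, gap in 2015-2016 = {end_gap:.2f}")
```

With these rounded inputs, the predicted nonrural score stays above the predicted rural score in every reporting category at both the first and the last time point, although the size of the gap changes over time.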

Discussion and Conclusions
This study compared the growth trajectories of fifth-grade student science performance in Texas rural and nonrural school districts, as measured by the high-stakes assessment administered by the state of Texas. To analyze the multilevel longitudinal dataset, we adopted a growth hierarchical linear model. Specifically, we ran the analysis four times, once each for RC1, RC2, RC3, and RC4, which capture students' science achievement in different areas aligned with the TEKS.
We discovered that fifth-grade students in Texas rural school districts lagged behind their nonrural peers when the STAAR science test was first administered in 2011-2012, and they continued to lag behind nonrural students over the span of five years. We further compared Texas rural and nonrural students' science performance across four reporting categories. A similar pattern was identified in all four reporting categories, in that rural district students consistently underperformed compared to their counterparts in nonrural districts. These findings are consistent with previous studies showing a persistent science achievement gap among demographic subgroups (Llosa et al. 2016) and showing that rural school districts are challenged to improve their students' academic performance (Cooley and Floyd 2013). The results also align with a previous longitudinal analysis of statewide science test scores of fifth-grade students, which determined that ELs and ED students in Texas consistently underperformed (Texas Education Agency 2011). Because Cummins (1980) suggested that it might take ELs five to seven years to reach grade-level academic English language proficiency, we argue that science achievement for rural ELs is contingent on access to strong, well-implemented language programs and on science teachers receiving adequate professional development in using hands-on science inquiry and integrating explicit academic vocabulary and literacy development into their instruction.
We detected statistically significant differences between rural and nonrural school districts' science performance concerning a number of district-related factors, including the percentage of EL students; teacher-student ratio; percentages of teachers with bachelor's, master's, and doctoral degrees; teacher average base salary; and teacher turnover rate. Most of these variables are related to the retention of qualified teachers, which is consistent with previous studies documenting that one of the greatest challenges for rural schools is the recruitment and retention of qualified teachers (Howley et al. 2002; Kearney et al. 2018; Reeves 2003). Among these variables, it is worth noting that Texas nonrural districts have a higher percentage of fifth-grade students identified as ELs than rural districts. A possible reason might be that students in rural districts are less likely to report English learning status than their peers in nonrural areas (Fry 2008). Moreover, rural districts might lack the financial resources to recruit certified bilingual teachers and to launch or maintain bilingual programs up to fifth grade (Faltis 2011), resulting in ELs being placed in transitional early-exit bilingual programs or in regular English as a Second Language (ESL) programs in which students are required to transition to a mainstream class by Grade 2.
Additionally, we found that three district-level characteristics consistently and significantly impacted fifth-grade students' science performance across the four reporting categories: student mobility rate, percentage of students identified as ELs, and teacher turnover rate. This finding is in line with previous studies that showed student mobility (Isernhagen and Bulkin 2011; Paik and Phillips 2002), teacher turnover rate, and percentage of EL students (Jimerson 2004) to have a significant negative impact on students' academic performance. The instability in rural schools caused by high teacher turnover and student mobility (Isernhagen and Bulkin 2011) and the comparatively greater instructional demands on teachers working with EL students (Jimerson 2005) together led to the lower science performance of rural schools. In general, teacher educational background was not a critical predictor of student science learning. A possible reason might be that districts normally set a bachelor's degree as the minimum degree requirement, which results in less variance between different districts and different locations. However, neither rural nor nonrural districts reached the level at which 100% of their teachers held the degree. We believe that this is an area that both rural and nonrural districts need to work on, regardless of whether teacher academic degree is a critical predictor. Moreover, teacher-student ratio and teacher average base salary were statistically significant predictors in only one of the reporting categories, which indicated that these factors are less influential on student science achievement than student mobility rate, teacher turnover rate, and percentage of EL students.
After we added the additional district-level variables into the model, location and its interaction with time were no longer statistically significant in three reporting categories (i.e., RC 2-4). This suggests that the significant difference between rural and nonrural school districts that we detected in Model 4 could be further explained by these variables. In short, what actually impacts student science performance is not the physical location, but the local context and factors associated with that context. If rural schools could attract the same level of resources and teachers, students' academic performance could be improved. However, the physical location is the initial cause of the resource and staffing inequities present in rural school districts.
Time was identified as a statistically significant negative predictor in all models in RC 1-3, which is an alarming sign that, as time went by, neither rural nor nonrural students showed significant improvement in science. A possible explanation might be that in Texas, science, unlike reading and math, is not a subject area that impacts student retention and advancement (Texas Education Agency 2019). Since science has no bearing on student retention, teachers and schools might not expend the same level of effort on improving student science performance as they do on preparing students for reading and math. Yet science is extremely important for diverse student populations to become educated citizens in the 21st century (American Association for the Advancement of Science 1989, 1993; National Research Council 1996).
Our findings are consistent with previous studies showing that small student enrollment (Zuniga et al. 2005), inexperienced or novice teachers (Showalter et al. 2017), and inadequate access to science instructional resources (Sipple and Brent 2007) limit rural student science learning. Further, we investigated how other factors, such as student mobility, teacher turnover, and teacher educational background, impact student science performance from a longitudinal perspective. Though our findings were based on Texas rural school districts, they could be further applied to understand other rural school districts in the U.S., as these districts share similar challenges, such as high student mobility rates, limited resources, and poverty (Masumoto and Brown-Welty 2009; Paik and Phillips 2002; Showalter et al. 2017).
Based on our findings, we emphasize the need to capitalize on distinctive rural district features as much as possible to increase student science learning. For instance, a small student enrollment provides opportunities for curriculum and instruction to be truly tailored to the community, its locale, and its context. Geographic isolation creates the space for hands-on student exploration of the natural world. Furthermore, given limited district resources, we recommend the use of structured online professional development with a focus on improving teachers' scientific knowledge and place-based pedagogy (Tang et al. 2019). We urge teachers and schools to partner with researchers to develop research-based curriculum and resources that reflect the local community and its knowledge to serve the diverse needs of rural students. Moreover, instead of marginalizing ELs and ED students in the science classroom, teachers should provide extra support. Such support might come in the form of in-depth discussions of scientific concepts that are embedded in everyday English vocabulary and instructional scaffolding to help students acquire scientific language (Lee et al. 2013). Finally, we encourage administrators and policymakers to consider and act on the empirical evidence that rural districts lag behind in providing science education.
Funding: This research received no external funding.

Conflicts of Interest: The authors declare no conflicts of interest.