#### 2.2. Measures

**Dependent Variables.** Dependent variables include participants’ declared major in 2006, two years after high school, and the major field of their first completed degree as of 2012, eight years after high school. Majors are coded to compare mathematics-intensive PEMC fields (physical sciences, engineering, mathematics, and computer sciences) with other STEM fields (biological sciences, health sciences, and social/behavioral and other sciences) and with non-STEM fields, which serve as the reference group. Declared major includes an undeclared/undecided category to capture students who had not yet selected a field of study or who had delayed entry into postsecondary education.

**Independent Variables: Difficulty Orientations.** Questionnaires in students’ 10th grade year included Likert-scale items regarding perceived ability to learn the most “difficult,” “hard,” or “complex” material in general as well as in English or mathematics classes. These items were originally developed for self-efficacy scales in PISA:2000 and modified for ELS:2002 (Ingels et al. 2004; OECD n.d.). However, given our interest in difficulty orientations, we focused our analyses on the six items that measure students’ perceptions of their own abilities with difficult or challenging material. After developing our 10 datasets using multiple imputation, we used confirmatory factor analysis to develop three scales that reflect students’ difficulty orientation by domain: general difficulty orientation (alpha = 0.7), verbal difficulty orientation (alpha = 0.9), and mathematics difficulty orientation (alpha = 0.9). Table A1 provides a description of the items for each scale, factor loadings, scoring coefficients, eigenvalues, and average alpha coefficients, all of which meet generally acceptable levels for use as scales (Kline 2011).
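As an illustration of the reliability statistic reported for each scale, the following is a minimal Python sketch of Cronbach's alpha, averaged across imputed datasets. It is not the authors' code (the study used Stata); the function names and the input layout (one list of scores per item) are assumptions made for this example.

```python
from statistics import mean, variance


def cronbach_alpha(items):
    """Cronbach's alpha for one scale.

    `items` is a list of item-score lists: one inner list per item, with
    respondents in the same order in every inner list (assumed layout).
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Total scale score for each respondent.
    totals = [sum(scores) for scores in zip(*items)]
    # Sum of the individual item variances.
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def average_alpha(imputed_datasets):
    """Mean alpha across imputed datasets (each in the `items` layout)."""
    return mean(cronbach_alpha(items) for items in imputed_datasets)
```

With two perfectly correlated items the function returns 1, and with two uncorrelated items of equal variance it returns 0, which brackets the 0.7 to 0.9 range reported above.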

**Covariates.** Given the above-cited prominence of background and educational experiences in the literature, the analysis additionally included demographic factors (gender, race/ethnicity, family income, and parent education), high school experiences (standardized test scores, science course taking, GPA, valuing mathematics,² and mathematics growth mindset³), high school characteristics (percentage free/reduced lunch, region, and urbanicity), postsecondary participation in undergraduate research with a faculty member, and postsecondary institutional characteristics (control and selectivity of the first attended institution).

Table A2 shows pooled sample descriptive statistics for each of the covariates listed.

#### 2.3. Analysis

This study examines the extent to which difficulty orientations, gender, and race/ethnicity predict mathematics-intensive degrees, independently and interdependently. The following research questions guide our study.

RQ1. Do domain-specific and domain-general difficulty orientation measures differ by gender and race/ethnicity identity categories?

**H1A.** High school boys report higher difficulty orientations than their female peers, particularly in mathematics. In other words, boys’ mathematics difficulty orientation scores will be higher than girls’.

**H1B.** Non-White students’ difficulty orientations will be lower than those of their White peers.

RQ2. To what extent do difficulty orientation measures predict PEMC degrees?

**H2.** Students with higher mathematics difficulty orientations will be more likely to declare PEMC majors and earn PEMC degrees, all else being equal.

RQ3. Do the relationships between difficulty orientation and PEMC degrees differ by gender and race/ethnicity?

**H3.** The relationship between mathematics difficulty orientation and PEMC outcomes will be greater among non-White students than White students and among women than men, such that the relationship for White men will be weaker than for other gender and race/ethnicity groups.

To answer the first research question, we estimated linear regression models to evaluate how difficulty orientation differs by gender and race/ethnicity.⁴ To address the second research question, we estimated a series of multinomial logistic regression models, progressively introducing difficulty orientation measures to estimate their effects on declared/degree major while controlling for the covariates listed in the previous section. While our reporting focuses on results for PEMC fields, the models distinguish gradations in declared/degree majors rather than collapsing them into a binary PEMC/non-PEMC outcome. Non-STEM majors serve as the reference group and are compared with (a) PEMC majors; (b) other STEM majors; and, in the declared major models, (c) undeclared/undecided majors.
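To make the role of the reference group concrete, the sketch below shows how a multinomial logit turns linear predictors into outcome probabilities when the reference category's predictor is fixed at zero. It is written in Python for illustration only; the function name and the numeric values are hypothetical and are not estimates from the study.

```python
import math


def mlogit_probabilities(linear_predictors, reference="non-STEM"):
    """Convert multinomial-logit linear predictors into probabilities.

    `linear_predictors` maps each non-reference outcome to its log-odds
    relative to the reference outcome, whose predictor is fixed at 0.
    """
    eta = dict(linear_predictors)
    eta[reference] = 0.0
    # Subtract the maximum before exponentiating for numerical stability.
    shift = max(eta.values())
    weights = {outcome: math.exp(value - shift) for outcome, value in eta.items()}
    total = sum(weights.values())
    return {outcome: weight / total for outcome, weight in weights.items()}


# Hypothetical log-odds for one student in a declared-major model.
example = mlogit_probabilities(
    {"PEMC": math.log(2), "other STEM": 0.0, "undeclared": 0.0}
)
```

Because every outcome is expressed relative to non-STEM, a coefficient in the PEMC equation shifts the PEMC versus non-STEM log-odds, while the resulting probabilities across all categories still sum to one.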

We started with a base model (Equation (1)), including the dependent variable of interest, gender, race/ethnicity, and control variables.
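One way to write this base model, assuming a standard multinomial logit with non-STEM as the reference category, is given here. The coefficient symbols, the student index $i$, and the outcome index $j$ are notational assumptions introduced for this sketch rather than the authors' own notation.

$$
\ln\!\left[\frac{\Pr(\text{major}_i = j)}{\Pr(\text{major}_i = \text{non-STEM})}\right]
= \beta_{0j} + \beta_{1j}\,\text{gender}_i + \boldsymbol{\beta}_{2j}^{\top}\,\text{race/ethnicity}_i
+ \boldsymbol{\beta}_{3j}^{\top} S_i + \boldsymbol{\beta}_{4j}^{\top} HS_i
+ \beta_{5j}\,\text{research}_i + \boldsymbol{\beta}_{6j}^{\top} PSI_i
\qquad (1)
$$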

where

major = declared or degree major (see “Dependent Variables” section);

S = student-level controls (family income, parent education, standardized test scores, science course taking, GPA, mathematics value, and growth mindset);

HS = high school characteristics (percentage free and reduced lunch, region, and urbanicity);

research = participation in undergraduate research; and

PSI = postsecondary institutional characteristics (control and selectivity of the first attended institution).

To capture the domain-specific effects of each difficulty orientation, we estimated four additional models. The first three added only one of the difficulty orientations to the base model (Equations (2)–(4)). The last model in this sequence included all three of the difficulty orientation scales (Equation (5)).
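A sketch of these four specifications follows, under stated assumptions: $\ell_{ij}$ denotes the log-odds of outcome $j$ relative to non-STEM for student $i$, $\mathbf{x}_i^{\top}\boldsymbol{\beta}_j$ is shorthand for all base-model terms (intercept, gender, race/ethnicity, S, HS, research, and PSI), and the $\gamma$ symbols are introduced here for illustration. The assignment of scales to Equations (2) through (4) follows the order of the definitions below and is also an assumption. Coefficients are re-estimated in each model.

$$
\begin{aligned}
\ell_{ij} &= \mathbf{x}_i^{\top}\boldsymbol{\beta}_j + \gamma_{1j}\,\text{general}_i && (2)\\
\ell_{ij} &= \mathbf{x}_i^{\top}\boldsymbol{\beta}_j + \gamma_{2j}\,\text{verbal}_i && (3)\\
\ell_{ij} &= \mathbf{x}_i^{\top}\boldsymbol{\beta}_j + \gamma_{3j}\,\text{math}_i && (4)\\
\ell_{ij} &= \mathbf{x}_i^{\top}\boldsymbol{\beta}_j + \gamma_{1j}\,\text{general}_i + \gamma_{2j}\,\text{verbal}_i + \gamma_{3j}\,\text{math}_i && (5)
\end{aligned}
$$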

where

general = domain-general difficulty orientation scale,

verbal = verbal difficulty orientation scale, and

math = mathematics difficulty orientation scale.

Our final research question (RQ3) examines whether the relationship between difficulty orientations and PEMC outcomes varies by gender and race/ethnicity. In our preliminary analyses, we tested for significant differences in gender and race/ethnicity slopes by including interaction terms.⁵ Because these interaction terms were not statistically significant in our initial model results, they were removed from our final models and are not shown in our mathematical expressions of these models above. Despite the null findings for the interaction terms, we hypothesized there could still be meaningful differences in the relationship between difficulty orientations and PEMC outcomes by identity group.

Using the Equation (5) specification, we estimated multinomial logistic regression (mlogit) models to predict students’ PEMC major outcomes (declared and degree field). To better understand potential differences by race/ethnicity and gender, and to simplify interpretation of our results, we report these results as predicted probabilities. Post-estimation predicted probabilities were generated with mimrgns, a user-written Stata command that produces pooled estimates from multiply imputed data by running Stata’s built-in margins command and applying Rubin’s rules (Klein 2016). These predicted probabilities were estimated holding all other variables in Equation (5) constant.
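The pooling step can be illustrated with a short Python sketch of Rubin's rules for a single quantity, such as one predicted probability estimated separately in each imputed dataset. This shows the general combining rule only and does not reproduce the internals of mimrgns; the function name and the example numbers are hypothetical.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates of one quantity with Rubin's rules.

    `estimates` holds the point estimate from each imputed dataset and
    `variances` holds the matching squared standard errors.
    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching estimates and variances from 2+ imputations")
    pooled_estimate = mean(estimates)
    within = mean(variances)        # average sampling variance
    between = variance(estimates)   # variance across imputations (m - 1 divisor)
    total = within + (1 + 1 / m) * between
    return pooled_estimate, math.sqrt(total)


# Hypothetical values from three imputations (the study used 10).
pooled, pooled_se = pool_rubin([0.10, 0.12, 0.14], [0.0004, 0.0004, 0.0004])
```

The between-imputation term widens the standard error to reflect uncertainty about the missing values, so the pooled standard error is never smaller than the one implied by the average within-imputation variance alone.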

First, we generated predicted probabilities of declaring a major or earning a degree in PEMC by both gender and race/ethnicity at the 10th, 25th, 50th, 75th, and 90th percentiles of each difficulty orientation scale. Next, using the pwcompare option, we evaluated the statistical significance of differences in students’ predicted probabilities of PEMC majors and degrees by difficulty orientation, gender, and race/ethnicity. We assessed intersectional differences by identity (gender and race/ethnicity) as follows: (1) comparing women and men within race/ethnicity groups (e.g., Latinas vs. Latinos) and (2) comparing race/ethnicity groups within gender categories (e.g., Latinos vs. White men). Finally, we examined how much each identity group’s predicted probability changed as difficulty orientation increased from one percentile to the next. For instance, we tested whether the probability for Latinas at the 25th percentile differed from the probability for Latinas at the 10th percentile. Together, these results provide insight into how PEMC outcomes relate to the intersections of gender, race/ethnicity, and difficulty orientations.
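A pairwise comparison of two predicted probabilities can be sketched in Python as a test of their difference. This is a simplified illustration rather than the procedure Stata runs. It assumes a normal reference distribution, whereas pooled multiply imputed results would ordinarily use a t distribution with adjusted degrees of freedom. It also defaults the covariance between the two predictions to zero, which a real pairwise comparison would take from the estimated model. All names and numbers are hypothetical.

```python
import math


def pairwise_contrast(p_a, se_a, p_b, se_b, cov_ab=0.0):
    """Test whether two predicted probabilities differ.

    Returns the difference, its standard error, the z statistic, and a
    two-sided p-value from the standard normal distribution.
    """
    difference = p_a - p_b
    se = math.sqrt(se_a**2 + se_b**2 - 2 * cov_ab)
    z = difference / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return difference, se, z, p_value


# Hypothetical probabilities for one group at the 25th vs. 10th percentile.
diff, diff_se, z_stat, p_val = pairwise_contrast(0.30, 0.03, 0.20, 0.04)
```

The same function covers all three kinds of comparison described above (gender within race/ethnicity, race/ethnicity within gender, and adjacent percentiles within a group), since each reduces to a difference between two predicted probabilities.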