A Variable Selection Method for Small Area Estimation Modeling of the Proficiency of Adult Competency
Abstract
1. Introduction
2. Background
2.1. PIAAC
2.2. Proficiency Measures in the PIAAC
2.3. PIAAC SAE Models
3. Methods
3.1. Identifying County and State Variables
3.2. Initial Set of Selected County and State Variable Sources
3.2.1. Initial Set of Selected Sources for County-Level Variables
3.2.2. Initial Set of Selected Sources for State-Level Variables
3.3. Variable Selection Process
3.3.1. Phase 1—Variable Reduction
3.3.2. Phase 2—Cross-Validation
- We sorted the 184 sampled counties from the largest to the smallest by sample size and divided them into groups of 10 counties, with the last group having only 4 counties. There were 19 groups in total.
- For each group of 10 counties, the counties were randomly assigned to 10 subsets, with each subset containing 1 county from the group. For the group with four counties, the counties were randomly assigned to four subsets. At the end of this step, each subset contained 18 or 19 counties with varying sample sizes.
- The counties in the first subset were excluded. For each given set of variables, the bivariate small area estimation model was fitted to the counties in the remaining nine subsets, and the fitted model was used to make predictions for the excluded counties.
- The previous step was repeated by excluding subsets 2 through 10, one at a time. At the end of this process the predicted proportions at or below level 1, at level 2, and at or above level 3 were calculated for all the counties.
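The subset construction and leave-one-subset-out loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`assign_subsets`, `cross_validate`, `sum_squared_diff`), the `fit` and `predict` callables, and the synthetic sample sizes are all hypothetical, and the bivariate small area estimation model itself is left to whatever the caller supplies.

```python
import random


def assign_subsets(sample_sizes, n_subsets=10, seed=0):
    """Map each county to a subset index in 0..n_subsets-1.

    sample_sizes: dict of county id -> sample size (hypothetical input).
    """
    rng = random.Random(seed)
    # Sort counties from the largest to the smallest sample size.
    ordered = sorted(sample_sizes, key=sample_sizes.get, reverse=True)
    assignment = {}
    for start in range(0, len(ordered), n_subsets):
        group = ordered[start:start + n_subsets]
        # Draw distinct subset indices so that each subset receives at most
        # one county from this group (a short last group fills only some).
        subsets = rng.sample(range(n_subsets), len(group))
        for county, subset in zip(group, subsets):
            assignment[county] = subset
    return assignment


def cross_validate(sample_sizes, fit, predict, n_subsets=10, seed=0):
    """Hold out one subset at a time and collect out-of-sample predictions.

    fit(train_counties) -> model and predict(model, test_counties) -> dict
    are placeholders for the model-fitting and prediction steps.
    """
    assignment = assign_subsets(sample_sizes, n_subsets, seed)
    predictions = {}
    for held_out in range(n_subsets):
        train = [c for c, s in assignment.items() if s != held_out]
        test = [c for c, s in assignment.items() if s == held_out]
        model = fit(train)
        predictions.update(predict(model, test))
    return predictions


def sum_squared_diff(predicted, direct):
    """Sum of squared differences over the counties that have direct estimates."""
    return sum((predicted[c] - direct[c]) ** 2 for c in direct)


if __name__ == "__main__":
    # Synthetic sample sizes for 184 counties, for illustration only.
    sizes = {f"county_{i}": 300 - i for i in range(184)}
    subsets = assign_subsets(sizes)
    per_subset = [list(subsets.values()).count(k) for k in range(10)]
    print(per_subset)
```

Sorting by sample size before grouping means every subset mixes large and small counties, so each held-out subset resembles the full sample. A helper such as `sum_squared_diff` could then be used to compare the predicted proportions with direct estimates for counties that have large samples.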
4. Results
4.1. Phase 1—Variable Reduction
4.2. Phase 2—Cross-Validation
5. Discussion
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Appendix A
County Characteristics | Source | Year |
---|---|---|
Poverty | ||
Percentage of population below 150 percent of poverty line | ACS | 2013–2017 |
Percentage of population receiving SNAP/food stamps | ACS | 2013–2017 |
Percentage of population below 100 percent of poverty line | ACS | 2013–2017 |
Percentage of population in poverty (all ages) | SAIPE | 2015 |
Income | ||
Median household income—ACS | ACS | 2013–2017 |
Median household income—SAIPE | SAIPE | 2015 |
Per capita personal income | BEA | 2015 |
Education | ||
Percentage of population aged 25+: with education less than high school (no high school diploma) | ACS | 2013–2017 |
Percentage of population aged 25+: with high school diploma, no college | ACS | 2013–2017 |
Percentage of population aged 25+: with education more than high school (including some college, no degree) | ACS | 2013–2017 |
English-speaking ability for people who speak another language | ||
Percentage of population aged 5+: speaking other languages and speaking English not at all or not well | ACS | 2013–2017 |
Percentage of population aged 5+: speaking other languages | ACS | 2013–2017 |
Urban/rural | ||
Metro or non-metro counties | ACS | 2013–2017 |
Counties in metro areas of 1 million population or more | USDA | 2013 |
Counties in metro areas of less than 1 million population | USDA | 2013 |
Non-metro counties | USDA | 2013 |
Race/ethnicity | ||
Percentage of Hispanics | ACS | 2013–2017 |
Percentage of Whites | ACS | 2013–2017 |
Percentage of Blacks | ACS | 2013–2017 |
Percentage of Asians | ACS | 2013–2017 |
Percentage of American Indians and Alaska Natives | ACS | 2013–2017 |
Percentage of Native Hawaiians and Pacific Islanders | ACS | 2013–2017 |
Percentage of other races | ACS | 2013–2017 |
Foreign-born status | ||
Percentage of foreign-born people who entered United States after year 2010 | ACS | 2013–2017 |
Percentage of foreign-born people who entered United States between years 1990 and 2009 | ACS | 2013–2017 |
Percentage of foreign-born people who entered United States after year 1990 | ACS | 2013–2017 |
Percentage of foreign-born people who entered United States before year 1990 | ACS | 2013–2017 |
Percentage of population born outside of United States | ACS | 2013–2017 |
Age | ||
Percentage of population 16–54 years old | ACS | 2013–2017 |
Percentage of population 55–64 years old | ACS | 2013–2017 |
Percentage of population 65+ years old | ACS | 2013–2017 |
Gender | ||
Percentage of male population | ACS | 2013–2017 |
Employment status | ||
Unemployment rate | BLS | 2015 |
Percentage of population aged 20–64: in armed forces | ACS | 2013–2017 |
Percentage of population aged 20–64: in labor force and employed | ACS | 2013–2017 |
Percentage of population aged 20–64: in labor force and unemployed | ACS | 2013–2017 |
Percentage of population aged 20–64: not in labor force | ACS | 2013–2017 |
Occupation | ||
Percentage of population aged 16+: management/professional occupations | ACS | 2013–2017 |
Percentage of population aged 16+: service occupation | ACS | 2013–2017 |
Percentage of population aged 16+: sales/office occupation | ACS | 2013–2017 |
Percentage of population aged 16+: natural resources/construction/maintenance occupation | ACS | 2013–2017 |
Percentage of population aged 16+: military | ACS | 2013–2017 |
Percentage of population aged 16+: production/transportation/moving occupation | ACS | 2013–2017 |
Census division | ||
New England | ACS | 2013–2017 |
Middle Atlantic | ACS | 2013–2017 |
East North Central | ACS | 2013–2017 |
West North Central | ACS | 2013–2017 |
South Atlantic | ACS | 2013–2017 |
East South Central | ACS | 2013–2017 |
West South Central | ACS | 2013–2017 |
Mountain | ACS | 2013–2017 |
Pacific | ACS | 2013–2017 |
Journey to work | ||
Percentage of population aged 16+ and did not work at home: less than 30 min to work | ACS | 2013–2017 |
Percentage of population aged 16+ and did not work at home: 30–44 min to work | ACS | 2013–2017 |
Percentage of population aged 16+ and did not work at home: 45–59 min to work | ACS | 2013–2017 |
Percentage of population aged 16+ and did not work at home: 60+ minutes to work | ACS | 2013–2017 |
Housing unit tenure and phone service | ||
Percentage of owner-occupied housing units | ACS | 2013–2017 |
Percentage of renter-occupied housing units | ACS | 2013–2017 |
Percentage of owner-occupied housing units with phone service available | ACS | 2013–2017 |
Percentage of renter-occupied housing units with phone service available | ACS | 2013–2017 |
Percentage of occupied housing units | ACS | 2013–2017 |
Plumbing facilities | ||
Percentage of housing units with plumbing facilities | ACS | 2013–2017 |
Marital status | ||
Percentage of population 15+: never married | ACS | 2013–2017 |
Percentage of population 15+: married | ACS | 2013–2017 |
Percentage of population 15+: widowed | ACS | 2013–2017 |
Percentage of population 15+: divorced | ACS | 2013–2017 |
Migration | ||
Percentage of population 1+: in different house in the past year | ACS | 2013–2017 |
Percentage of population 1+: in different county in the past year | ACS | 2013–2017 |
Percentage of population 1+: in different state in the past year | ACS | 2013–2017 |
Percentage of population 1+: moved from abroad in the past year | ACS | 2013–2017 |
Health | ||
Percentage of civilian non-institutionalized population with one type of health insurance coverage | ACS | 2013–2017 |
Percentage of civilian non-institutionalized population with two or more types of health insurance coverage | ACS | 2013–2017 |
Percentage of civilian non-institutionalized population with no health insurance coverage | ACS | 2013–2017 |
Percentage of diagnosed diabetes | DDT | 2013 |
Percentage of obesity | DDT | 2013 |
Percentage of population eligible for Medicaid | CMS | 2015 |
Tax | ||
Average number of tax returns per person | SOI | 2014 |
Average number of returns with unemployment compensation per person | SOI | 2014 |
Average number of returns with taxable Social Security benefits per person | SOI | 2014 |
Proportion of the amount of unemployment compensation among all tax return amounts | SOI | 2014 |
Proportion of the amount of taxable Social Security benefits among all tax return amounts | SOI | 2014 |
State Characteristics | Source | Year |
---|---|---|
Socioeconomic status | ||
Average annual pay | BLS | 2015 |
Homeownership rate | Housing Vacancies and Home Ownership (CPS/HVS) | 2015 |
Education | ||
Adult basic education enrollment rate | OCTAE | 2015 |
Adult secondary education enrollment rate | OCTAE | 2015 |
English as a second language enrollment rate | OCTAE | 2015 |
Graduation rate of postsecondary institutes | IPEDS | 2014–2015 |
Average weighted monthly salary for full-time instructional staff | IPEDS | 2014–2015 |
Average amount of grant and scholarship aid received | IPEDS | 2014–2015 |
Annual college cost (tuition and fees) | IPEDS | 2014–2015 |
GED test completion rate | GED Testing Service (GEDTS) | 2013 |
Average 4th-grade reading composite scale scores | NAEP | 2015 |
Average 4th-grade math composite scale scores | NAEP | 2015 |
Average 8th-grade reading composite scale scores | NAEP | 2015 |
Average 8th-grade math composite scale scores | NAEP | 2015 |
Other area characteristics | ||
Infant mortality rate per 1000 live births | NCHS, Vital Statistics of the United States, annual, and unpublished data | 2013 |
Women 15–50 years old who gave birth in the past 12 months (per 1000 15–50-year-old women) | ACS | 2011–2015 |
Physicians per 100,000 population | AMA, Chicago, IL, Physician Characteristics and Distribution in the United States, 2014 | 2015 |
Violent crime rate per 100,000 population | FBI, Crime in the United States, annual | 2015 |
Federal aid to state and local governments per capita | Census Bureau, Federal Aid to States for Fiscal Year 2010 | 2010 |
State government general revenue per capita | Census Bureau; State and Local Government Finance Estimates by State, annual, and unpublished data | 2014 |
Energy consumption per person | EIA, State Energy Data Report, 2014 | 2014 |
Traffic fatalities per 100 million vehicle miles | NHTSA, Traffic Safety Facts, annual | 2015 |
Birth rate | National Vital Statistics Reports, 2015 | 2017 |
Birth rate for teenagers aged 15–19 | National Vital Statistics Reports, 2015 | 2017 |
Variable | Literacy P1 | Literacy P2 | Literacy P3 | Literacy average | Numeracy P1 | Numeracy P2 | Numeracy P3 | Numeracy average |
---|---|---|---|---|---|---|---|---|
County level | ||||||||
Percentage of population aged 25+: with education less than high school | 0.72 | 0.22 | −0.70 | −0.73 | 0.74 | −0.11 | −0.63 | −0.73 |
Percentage of population aged 25+: with high school diploma, no college | 0.28 | 0.59 | −0.59 | −0.44 | 0.36 | 0.41 | −0.59 | −0.44 |
Percentage of population aged 25+: with education more than high school | −0.56 | −0.52 | 0.77 | 0.68 | −0.63 | −0.22 | 0.73 | 0.68 |
Percentage of population below 100 percent of poverty line | 0.65 | 0.24 | −0.65 | −0.67 | 0.74 | −0.10 | −0.64 | −0.71 |
Percentage of population receiving SNAP/food stamps | 0.59 | 0.31 | −0.66 | −0.64 | 0.69 | 0.01 | −0.66 | −0.68 |
Percentage of population below 150 percent of poverty line | 0.67 | 0.28 | −0.70 | −0.70 | 0.75 | −0.05 | −0.68 | −0.73 |
Percentage of population in poverty (all ages) | 0.64 | 0.23 | −0.64 | −0.64 | 0.71 | −0.09 | −0.62 | −0.68 |
ACS median household income—log-transformed | −0.49 | −0.42 | 0.65 | 0.56 | −0.59 | −0.13 | 0.64 | 0.59 |
SAIPE median household income | −0.49 | −0.42 | 0.65 | 0.56 | −0.59 | −0.13 | 0.64 | 0.59 |
Per capita personal income—log-transformed | −0.17 | −0.34 | 0.35 | 0.23 | −0.20 | −0.15 | 0.28 | 0.21 |
Percentage of population aged 5+: speak another language and speak English not at all or not well | 0.15 | −0.15 | −0.02 | −0.10 | 0.11 | −0.18 | 0.01 | −0.12 |
Percentage of population aged 5+: speaking other languages | 0.24 | −0.37 | 0.05 | −0.15 | 0.14 | −0.33 | 0.07 | −0.13 |
Percentage of Hispanics | 0.33 | −0.25 | −0.10 | −0.27 | 0.26 | −0.26 | −0.09 | −0.26 |
Percentage of Blacks | 0.37 | −0.03 | −0.27 | −0.32 | 0.46 | −0.24 | −0.28 | −0.39 |
Percentage of Asians | −0.04 | −0.40 | 0.28 | 0.13 | −0.15 | −0.27 | 0.31 | 0.16 |
Percentage of American Indians and Alaska Natives | 0.01 | −0.04 | 0.02 | −0.03 | 0.01 | <0.001 | −0.01 | −0.05 |
Percentage of Whites | −0.33 | 0.27 | 0.08 | 0.23 | −0.34 | 0.37 | 0.09 | 0.28 |
Percentage of Native Hawaiians and Pacific Islanders | −0.04 | −0.13 | 0.12 | 0.07 | −0.08 | −0.05 | 0.11 | 0.08 |
Percentage of other races | 0.20 | −0.29 | 0.03 | −0.12 | 0.14 | −0.29 | 0.05 | −0.11 |
Percentage of foreign-born people who entered United States after year 2010 | −0.22 | −0.27 | 0.34 | 0.31 | −0.21 | −0.18 | 0.31 | 0.26 |
Percentage of foreign-born people who entered United States between years 1990 and 2009 | 0.16 | −0.19 | −0.01 | −0.10 | 0.13 | −0.23 | 0.02 | −0.13 |
Percentage of foreign-born people who entered United States after year 1990 | <0.001 | −0.31 | 0.19 | 0.09 | −0.01 | −0.29 | 0.20 | 0.05 |
Percentage of foreign-born people who entered United States before year 1990 | −0.02 | −0.02 | 0.03 | 0.02 | −0.07 | 0.07 | 0.02 | 0.06 |
Percentage of population born outside of United States | 0.12 | −0.40 | 0.16 | −0.03 | 0.02 | −0.33 | 0.18 | −0.01 |
Percentage of population 16–54 years old | 0.11 | −0.20 | 0.04 | −0.05 | 0.07 | −0.17 | 0.04 | −0.05 |
Percentage of population 55–64 years old | −0.16 | 0.32 | −0.08 | 0.07 | −0.14 | 0.31 | −0.06 | 0.09 |
Percentage of population 65+ years old | −0.07 | 0.36 | −0.17 | −0.03 | −0.03 | 0.33 | −0.17 | −0.02 |
Percentage of male population | 0.19 | <0.001 | −0.15 | −0.18 | 0.14 | −0.12 | −0.06 | −0.12 |
Percentage of population aged 20–64: in armed forces | −0.10 | −0.04 | 0.10 | 0.11 | −0.06 | 0.01 | 0.05 | 0.09 |
Percentage of population aged 20–64: in labor force and employed | −0.52 | −0.41 | 0.66 | 0.58 | −0.60 | −0.12 | 0.64 | 0.60 |
Percentage of population aged 20–64: in labor force and unemployed | 0.33 | 0.05 | −0.29 | −0.33 | 0.39 | −0.07 | −0.33 | −0.39 |
Percentage of population aged 20–64: not in labor force | 0.62 | 0.24 | −0.63 | −0.63 | 0.67 | −0.07 | −0.59 | −0.64 |
Percentage of population aged 16+: management/professional occupations | −0.38 | −0.50 | 0.61 | 0.51 | −0.44 | −0.33 | 0.62 | 0.52 |
Percentage of population aged 16+: service occupation | 0.34 | 0.07 | −0.31 | −0.37 | 0.39 | −0.07 | −0.33 | −0.39 |
Percentage of population aged 16+: sales/office occupation | −0.05 | 0.17 | −0.07 | −0.04 | 0.02 | 0.24 | −0.16 | −0.09 |
Percentage of population aged 16+: natural resources/construction/maintenance occupation | 0.22 | 0.36 | −0.40 | −0.30 | 0.22 | 0.23 | −0.35 | −0.28 |
Percentage of population aged 16+: military | −0.09 | −0.02 | 0.08 | 0.10 | −0.06 | 0.03 | 0.04 | 0.09 |
Percentage of population aged 16+: production/transportation/moving occupation | 0.29 | 0.43 | −0.50 | −0.39 | 0.32 | 0.29 | −0.48 | −0.38 |
Percentage of population aged 16+ and did not work at home: less than 30 min to work | <0.001 | −0.02 | 0.01 | 0.01 | 0.02 | 0.01 | −0.03 | −0.02 |
Percentage of population aged 16+ and did not work at home: 30–44 min to work | −0.02 | −0.09 | 0.08 | 0.05 | −0.01 | −0.16 | 0.11 | 0.05 |
Percentage of population aged 16+ and did not work at home: 45–59 min to work | −0.07 | 0.04 | 0.03 | 0.06 | −0.09 | <0.001 | 0.08 | 0.09 |
Percentage of population aged 16+ and did not work at home: 60+ min to work | 0.07 | 0.11 | −0.12 | −0.11 | 0.02 | 0.14 | −0.10 | −0.07 |
Percentage of owner-occupied housing units | −0.20 | 0.32 | −0.05 | 0.08 | −0.20 | 0.38 | −0.04 | 0.12 |
Percentage of renter-occupied housing units | 0.20 | −0.32 | 0.05 | −0.08 | 0.20 | −0.38 | 0.04 | −0.12 |
Percentage of owner-occupied housing units with phone service available | −0.30 | −0.11 | 0.30 | 0.29 | −0.32 | 0.04 | 0.28 | 0.31 |
Percentage of renter-occupied housing units with phone service available | −0.21 | −0.05 | 0.20 | 0.19 | −0.21 | 0.02 | 0.18 | 0.19 |
Percentage of occupied housing units | −0.10 | −0.26 | 0.24 | 0.15 | −0.15 | −0.14 | 0.23 | 0.16 |
Percentage of housing units with plumbing facilities | −0.15 | −0.07 | 0.16 | 0.15 | −0.14 | 0.02 | 0.12 | 0.14 |
Percentage of population aged 15+: never married | 0.24 | −0.37 | 0.05 | −0.11 | 0.23 | −0.41 | 0.03 | −0.16 |
Percentage of population aged 15+: married | −0.35 | 0.19 | 0.15 | 0.26 | −0.40 | 0.31 | 0.19 | 0.33 |
Percentage of population aged 15+: widowed | 0.35 | 0.47 | −0.57 | −0.45 | 0.41 | 0.28 | −0.57 | −0.46 |
Percentage of population aged 15+: divorced | 0.07 | 0.34 | −0.27 | −0.15 | 0.18 | 0.23 | −0.31 | −0.19 |
Percentage of population aged 1+: in different house in the past year | −0.10 | −0.20 | 0.20 | 0.16 | −0.05 | −0.23 | 0.19 | 0.13 |
Percentage of population aged 1+: in different county in the past year | 0.10 | −0.01 | −0.07 | −0.10 | 0.11 | −0.11 | −0.04 | −0.08 |
Percentage of population aged 1+: in different state in the past year | −0.20 | −0.17 | 0.26 | 0.27 | −0.16 | −0.20 | 0.28 | 0.25 |
Percentage of population aged 1+: moved from abroad in the past year | −0.15 | −0.52 | 0.45 | 0.32 | −0.23 | −0.44 | 0.49 | 0.33 |
Percentage of civilian non-institutionalized population with one type of health insurance coverage | −0.43 | −0.20 | 0.46 | 0.43 | −0.48 | <0.001 | 0.46 | 0.46 |
Percentage of civilian non-institutionalized population with two or more types of health insurance coverage | −0.04 | 0.29 | −0.15 | −0.02 | 0.01 | 0.23 | −0.15 | −0.01 |
Percentage of civilian non-institutionalized population with no health insurance coverage | 0.52 | <0.001 | −0.41 | −0.48 | 0.53 | −0.17 | −0.40 | −0.51 |
Percentage of diagnosed diabetes | 0.39 | 0.45 | −0.58 | −0.49 | 0.50 | 0.19 | −0.59 | −0.52 |
Percentage of obesity | 0.40 | 0.38 | −0.55 | −0.47 | 0.48 | 0.17 | −0.56 | −0.49 |
Percentage of population eligible for Medicaid | 0.54 | 0.09 | −0.48 | −0.52 | 0.55 | −0.06 | −0.49 | −0.54 |
Average number of tax returns per person | −0.12 | −0.40 | 0.35 | 0.18 | −0.20 | −0.22 | 0.33 | 0.19 |
Average number of returns with unemployment compensation per person | −0.04 | <0.001 | 0.03 | 0.03 | −0.06 | 0.07 | 0.01 | 0.04 |
Average number of returns with taxable Social Security benefits per person | −0.38 | 0.28 | 0.12 | 0.28 | −0.36 | 0.38 | 0.10 | 0.30 |
Proportion of the amount of unemployment compensation among all tax return amounts | 0.09 | 0.10 | −0.14 | −0.12 | 0.09 | 0.09 | −0.14 | −0.12 |
Proportion of the amount of taxable Social Security benefits among all tax return amounts | −0.09 | 0.42 | −0.19 | −0.02 | −0.03 | 0.39 | −0.21 | −0.03 |
Unemployment rate | 0.48 | 0.15 | −0.46 | −0.48 | 0.54 | −0.09 | −0.45 | −0.51 |
Counties in metro areas of 1 million population or more | −0.12 | −0.23 | 0.24 | 0.17 | −0.15 | −0.12 | 0.22 | 0.16 |
Counties in metro areas of less than 1 million population | −0.07 | −0.01 | 0.06 | 0.06 | −0.07 | 0.05 | 0.04 | 0.05 |
Non-metro counties | 0.22 | 0.28 | −0.35 | −0.25 | 0.26 | 0.08 | −0.29 | −0.24 |
New England | −0.16 | −0.02 | 0.14 | 0.15 | −0.16 | 0.02 | 0.14 | 0.16 |
Middle Atlantic | −0.07 | −0.01 | 0.06 | 0.06 | −0.08 | 0.05 | 0.04 | 0.06 |
East North Central | −0.17 | 0.08 | 0.08 | 0.11 | −0.13 | 0.11 | 0.06 | 0.09 |
West North Central | −0.11 | 0.03 | 0.07 | 0.10 | −0.18 | 0.10 | 0.11 | 0.14 |
South Atlantic | 0.07 | <0.001 | −0.06 | −0.05 | 0.11 | −0.03 | −0.09 | −0.10 |
East South Central | 0.16 | 0.23 | −0.27 | −0.19 | 0.23 | 0.07 | −0.26 | −0.18 |
West South Central | 0.27 | −0.09 | −0.15 | −0.23 | 0.26 | −0.16 | −0.15 | −0.25 |
Mountain | −0.14 | −0.01 | 0.12 | 0.15 | −0.14 | −0.03 | 0.16 | 0.17 |
Pacific | 0.06 | −0.26 | 0.12 | <0.001 | −0.02 | −0.16 | 0.12 | 0.01 |
State level | ||||||||
Adult basic education enrollment rate | 0.20 | 0.31 | −0.35 | −0.25 | 0.28 | 0.14 | −0.36 | −0.28 |
Physicians per 100,000 population | −0.18 | −0.15 | 0.23 | 0.23 | −0.20 | −0.07 | 0.23 | 0.23 |
Birth rate for teenagers aged 15–19 | 0.34 | 0.21 | −0.39 | −0.36 | 0.40 | 0.02 | −0.39 | −0.38 |
Average annual pay | −0.11 | −0.27 | 0.25 | 0.16 | −0.15 | −0.14 | 0.23 | 0.16 |
Adult secondary education enrollment rate | −0.12 | 0.13 | 0.01 | 0.09 | −0.10 | 0.14 | 0.01 | 0.11 |
Birth rate | 0.23 | −0.07 | −0.13 | −0.16 | 0.18 | −0.20 | −0.05 | −0.13 |
GED test completion rate | 0.13 | 0.13 | −0.18 | −0.17 | 0.18 | 0.07 | −0.22 | −0.17 |
English as a second language enrollment rate | −0.15 | −0.33 | 0.32 | 0.20 | −0.23 | −0.18 | 0.33 | 0.22 |
Traffic fatalities per 100 million vehicle miles | 0.31 | 0.25 | −0.40 | −0.35 | 0.35 | 0.09 | −0.39 | −0.35 |
Women 15–50 years old who gave birth in the past 12 months | 0.10 | −0.05 | −0.04 | −0.06 | 0.04 | −0.09 | 0.02 | −0.03 |
Average amount of grant and scholarship aid received | −0.26 | −0.01 | 0.20 | 0.24 | −0.26 | 0.09 | 0.19 | 0.24 |
Graduation rate of postsecondary institutes | −0.01 | −0.18 | 0.12 | 0.06 | −0.08 | −0.12 | 0.15 | 0.09 |
Homeownership rate | −0.18 | 0.20 | 0.02 | 0.12 | −0.14 | 0.17 | 0.02 | 0.13 |
Infant mortality rate per 1000 live births | 0.21 | 0.22 | −0.30 | −0.24 | 0.31 | 0.05 | −0.32 | −0.29 |
Average 4th-grade math composite scale scores | −0.23 | 0.01 | 0.17 | 0.22 | −0.24 | 0.06 | 0.19 | 0.24 |
Average 8th-grade math composite scale scores | −0.37 | −0.06 | 0.32 | 0.35 | −0.41 | 0.07 | 0.34 | 0.39 |
Energy consumption per person | 0.23 | −0.06 | −0.14 | −0.18 | 0.20 | −0.13 | −0.11 | −0.18 |
State government general revenue per capita | −0.11 | −0.25 | 0.25 | 0.19 | −0.19 | −0.08 | 0.23 | 0.20 |
Federal aid to state and local governments per capita | −0.01 | −0.07 | 0.05 | 0.07 | −0.02 | −0.06 | 0.05 | 0.06 |
Average 4th-grade reading composite scale scores | −0.22 | 0.13 | 0.09 | 0.18 | −0.19 | 0.12 | 0.11 | 0.20 |
Average 8th-grade reading composite scale scores | −0.41 | 0.10 | 0.26 | 0.35 | −0.42 | 0.19 | 0.28 | 0.39 |
Average weighted monthly salary for full-time instructional staff | −0.34 | −0.29 | 0.33 | 0.23 | −0.25 | −0.12 | 0.32 | 0.25 |
Annual college cost (tuition and fees) | −0.25 | −0.02 | 0.21 | 0.23 | −0.24 | 0.03 | 0.21 | 0.24 |
Violent crime rate per 100,000 population | 0.21 | −0.15 | −0.07 | −0.19 | 0.16 | −0.11 | −0.09 | −0.19 |
Variable | Literacy P1 (λ = 0.02) | Literacy P3 (λ = 0.02) | Literacy P1 (λ = 0.03) | Literacy P3 (λ = 0.03) | Literacy Avg. (λ = 2) | Literacy Avg. (λ = 3) | Numeracy P1 (λ = 0.02) | Numeracy P3 (λ = 0.02) | Numeracy P1 (λ = 0.03) | Numeracy P3 (λ = 0.03) | Numeracy Avg. (λ = 2) | Numeracy Avg. (λ = 3) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Percentage of population aged 25+: with education less than high school | 0.62 | −0.51 | 0.57 | −0.53 | −108.97 | −107.03 | 0.48 | −0.26 | 0.44 | −0.28 | −86.67 | −88.65 |
Percentage of population aged 25+: with education more than high school | −0.13 | 0.45 | −0.13 | 0.38 | 27.54 | 22.77 | −0.24 | 0.47 | −0.21 | 0.39 | 41.17 | 31.47 |
Percentage of population below 100 percent of poverty line | 0.26 | −0.27 | 0.26 | −0.28 | −31.84 | −34.53 | 0.47 | −0.35 | 0.52 | −0.39 | −45.39 | −59.07 |
Percentage of Blacks | 0.03 | 0.02 | † | † | −1.71 | † | 0.10 | 0.05 | 0.04 | 0.02 | −12.76 | −5.30 |
Percentage of foreign-born people who entered United States after year 2010 | −0.01 | 0.02 | † | † | 1.64 | † | † | † | † | † | † | † |
Percentage of civilian non-institutionalized population with no health insurance coverage | 0.07 | −0.04 | † | † | −11.68 | −6.44 | 0.21 | −0.14 | 0.11 | −0.07 | −38.52 | −33.75 |
Birth rate | <0.001 | <0.001 | † | † | † | † | † | † | † | † | † | † |
Average amount of grant and scholarship aid received | <0.001 | <0.001 | † | † | <0.001 | † | † | † | † | † | <0.001 | † |
Percentage of population born outside of United States | <0.001 | <0.001 | † | † | † | † | † | † | † | † | † | † |
Unemployment rate | † | † | † | † | −0.03 | † | † | † | † | † | −0.33 | −0.26 |
Percentage of population aged 16+: service occupation | † | † | † | † | −16.52 | −0.70 | † | † | † | † | −17.44 | −0.81 |
Percentage of population aged 16+ and did not work at home: 60+ minutes to work | † | † | † | † | −0.31 | † | † | † | † | † | † | † |
Percentage of Hispanics | † | † | † | † | † | † | † | † | † | † | −1.95 | † |
Number of Completed Cases | Number of Counties |
---|---|
Less than 5 | 4 |
5 to 10 | 14 |
11 to 20 | 10 |
21 to 50 | 58 |
51 to 100 | 56 |
101 or more | 43 |
Total | 185 |
Proficiency Domain | Proficiency Measure |
---|---|
Literacy | Average score |
Literacy | Proportion at or below level 1 |
Literacy | Proportion at level 2 |
Literacy | Proportion at or above level 3 |
Numeracy | Average score |
Numeracy | Proportion at or below level 1 |
Numeracy | Proportion at level 2 |
Numeracy | Proportion at or above level 3 |
Source | Year(s) | Description | Label |
---|---|---|---|
American Community Survey | 2013–2017 | Percentage of population aged 25 and over with less than high school education (no high school diploma) | Education—LH |
American Community Survey | 2013–2017 | Percentage of population aged 25 and over with more than high school education (including some college, no degree) | Education—MH |
American Community Survey | 2013–2017 | Percentage of population below 100 percent of the poverty line | Poverty |
American Community Survey | 2013–2017 | Percentage of Black or African American population | Black |
American Community Survey | 2013–2017 | Percentage of Hispanic population | Hispanic |
American Community Survey | 2013–2017 | Percentage of civilian non-institutionalized population who has no health insurance coverage | No health insurance |
American Community Survey | 2013–2017 | Percentage of population aged 16 and over with service occupations | Service occupations |
American Community Survey | 2013–2017 | Percentage of foreign-born people who entered the United States after the year 2010 among the population born outside the United States | Enter U.S. 2010 |
American Community Survey | 2013–2017 | Percentage of the population born outside of the United States | Foreign born |
American Community Survey | 2013–2017 | Percentage of population 16 and over who did not work at home who spent more than 60 min traveling to work | Journey to work |
Bureau of Labor Statistics | 2015 | Unemployment rate | Unemployment rate |
Division of Diabetes Translation | 2013 | Percentage of diabetes diagnosed | Diabetes rate |
National Vital Statistics Reports | 2015 | Birth rate per 1000 women | Birth rate |
The Integrated Postsecondary Education Data System | 2014–2015 | Average amount of grant and scholarship aid received | Grant/scholarship received |
Variable | Literacy proportion model (λ = 0.02) | Literacy proportion model (λ = 0.03) | Literacy average model (λ = 2) | Literacy average model (λ = 3) | Numeracy proportion model (λ = 0.02) | Numeracy proportion model (λ = 0.03) | Numeracy average model (λ = 2) | Numeracy average model (λ = 3) |
---|---|---|---|---|---|---|---|---|
Education—LH | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Education—MH | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Poverty | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Black | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ||
Hispanic | ✓ | |||||||
No health insurance | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | |
Service occupations | ✓ | ✓ | ✓ | ✓ | ||||
Enter U.S. 2010 | ✓ | ✓ | ||||||
Foreign born | ✓ | |||||||
Journey to work | ✓ | |||||||
Unemployment rate | ✓ | ✓ | ✓ | |||||
Diabetes rate | ✓ | ✓ | ||||||
Birth rate | ✓ | |||||||
Grant/scholarship received | ✓ | ✓ | ✓ |
Variable | Scenario 1 | Scenario 2 | Scenario 3 | Scenario 4 | Scenario 5 |
---|---|---|---|---|---|
Education—LH | ✓ | ✓ | ✓ | ✓ | ✓ |
Education—MH | ✓ | ✓ | ✓ | ✓ | ✓ |
Poverty | ✓ | ✓ | ✓ | ✓ | ✓ |
Black | ✓ | ✓ | ✓ | ✓ | |
Enter U.S. 2010 | ✓ | ✓ | |||
No health insurance | ✓ | ✓ | ✓ | ||
Birth rate | ✓ | ||||
Grant/scholarship received | ✓ | ||||
Foreign born | ✓ | ||||
Hispanic | ✓ | ✓ | ✓ | ||
Service occupations | ✓ | ||||
Sums of squared differences between the predicted proportions and direct estimates over 44 counties with a sample size of at least 100 | |||||
P1 | 0.109 | 0.078 | 0.081 | 0.076 | 0.076 |
P2 | 0.136 | 0.137 | 0.144 | 0.141 | 0.143 |
P3 | 0.212 | 0.155 | 0.186 | 0.170 | 0.183 |
Variables | Label |
---|---|
Percentage of population aged 25 and over with less than high school education | Education—LH |
Percentage of population aged 25 and over with more than high school education | Education—MH |
Percentage of population below 100 percent of the poverty line | Poverty |
Percentage of Black or African American population | Black |
Percentage of Hispanic population | Hispanic |
Percentage of civilian non-institutionalized population who had no health insurance coverage | No health insurance |
Percentage of population aged 16 and over with service occupations | Service occupations |
Variable | Education—MH | Poverty | Black | Hispanic | No Health Insurance | Service Occupation |
---|---|---|---|---|---|---|
Education—LH | −0.76 | 0.64 | 0.34 | 0.42 | 0.58 | 0.21 |
Education—MH | | −0.53 | −0.20 | −0.04 | −0.38 | −0.13 |
Poverty | | | 0.47 | 0.08 | 0.47 | 0.37 |
Black | | | | −0.11 | 0.19 | 0.15 |
Hispanic | | | | | 0.40 | 0.15 |
No health insurance | | | | | | 0.19 |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Ren, W.; Li, J.; Erciulescu, A.; Krenzke, T.; Mohadjer, L. A Variable Selection Method for Small Area Estimation Modeling of the Proficiency of Adult Competency. Stats 2022, 5, 689-713. https://doi.org/10.3390/stats5030041