Lack of consistent sheltering options for youths experiencing homelessness (YEH) often intersects with limited access to healthcare, living wage employment, education and other unmet needs that may impede the ability to exit homelessness [1]. YEH sleep in a variety of places, ranging from the streets and other places not meant for human habitation to temporary stays with others and shelters. Youths spending the night in a shelter (SN), another person’s home (unstable housing (UH), i.e., transient sleeping arrangements such as the home of a friend, acquaintance, partner or extended family) or less-structured locations (literally homeless (LH), e.g., a car, street, park or abandoned building) vary in how consistently they use these different options [2].
Among a sample of YEH aged 14–24 years (n = 426), fewer than half (41%) were consistently housed over a two-year period [3]. The remaining youth used short-term (e.g., staying with a friend; 20%) or longer-term sheltering options (e.g., permanent supportive housing; 39%). In a longitudinal study of shelter-using youth (n = 166), 18% of first-time runaways and 34% of repeat runaways returned to the shelter within a year [4]. One recent study that used daily ecological momentary assessments (EMAs) of sleeping arrangements found that among YEH aged 16–22 years (n = 150), sleeping location changed almost daily during a one-month period [5]. While the literature suggests that there is substantial variation in sheltering and transiency among YEH, little is known about the day-level predictors of those various sheltering patterns.
Previous evaluations of sheltering are often limited by their reliance on cross-sectional data and have focused on measuring the lifetime, rather than day-to-day, experiences of homeless populations [6]. Because such cross-sectional data may have been collected at a service location, this methodology may increase the likelihood that these youths will report staying in a shelter. For example, in a large, seven-city study among YEH aged 18–26 years recruited from shelters and service-providing locations, YEH reported primarily sleeping at a shelter (49%) the night before participating in the survey, while 33% reported staying on the streets the night before [9]. Other studies that include youth recruited from the streets suggest that very few YEH use shelters, with rates ranging from 7% in the last three months to 21% in the last year among YEH aged 14 to 21 years [10]. Given the variable use of shelters among this high-risk and underserved population, longitudinal methods that account for within-person variance in sheltering patterns are sorely needed.
While shelters often offer a place to stay at night, onsite access to healthcare, life skills education, school success supports, workforce readiness and referrals for other services [11], there are many facilitators and barriers to accessing these shelter-based services. YEH have reported that attitudinal facilitators (e.g., the desire to extricate themselves from street life and turn their lives in a new direction) increase service utilization [12]. Yet, YEH also reported barriers to shelter access, including shelter availability and the use of restrictive definitions of homelessness that limit services to youth with prolonged street dwelling [5]. Other reported barriers to shelter-based service utilization among YEH include negative encounters with service staff [13] and inflexible shelter rules [14
]. Further, beyond structural limitations in access to resources, unstable housing presents immediate concerns related to lack of safety and potential victimization. However, understanding the factors that influence sheltering choices remains a challenge. The present research seeks to explicate the relationships between sheltering and various behavioral/environmental factors and identify the strongest predictors of sheltering option on a given night. Essentially, many factors may conceivably influence daily sheltering choices, and to date, these have seldom been explored in the literature. The present analysis uses a large-scale, data-driven approach to explore a wide set of potential factors and provide hypothesis-generating direction to this field of research. These findings would directly address the primary aims of the study: (1) to contextualize the current state of daily sheltering patterns among YEH (i.e., look beyond broad factors that implicate sheltering to find more precise, day-to-day indicators), (2) identify behavioral targets for interventions that may reduce risky sheltering (i.e., of those strongest predictors, which, if any, are specific behaviors that could be modifiable and thus targetable for intervention) and (3) inform higher-level decision making (i.e., provide direction for researchers and/or policy makers to further investigate the specific factors found here). A secondary aim of the present study is to demonstrate the application of a useful machine-learning algorithm for this research domain that can identify the strongest subset of predictors from a large set while fully accounting for multicollinearity.
1.1. Sexual Risk Behaviors and Sheltering
Evidence suggests that homelessness is associated with sexual risks. YEH consistently report high rates of sexual risk behaviors across studies [15]. Studies using cross-sectional designs report rates of condomless sex ranging from 40% to 70% [18], and being homeless has been associated with a higher number of sexual partners [19]. A longitudinal study of EMA data indicated that condom use was much lower (25%) than indicated by self-reporting at baseline (54%) among a sample of YEH [20]. Furthermore, lacking consistent housing may lead to increased risk of trading sex for shelter [21] and exposure to sexual exploitation [22]. However, few studies have assessed the relationship between sexual activity and sheltering patterns using more granular measures that account for within-person variations.
1.2. Role of Drug Use on Sheltering
The impact of drug use on homelessness has also been well documented across various populations, including YEH. Among YEH, greater shelter utilization has been associated with reductions in substance use [5]. YEH aged 14–24 years who used drugs were less likely to be consistently sheltered across a two-year period than those who did not use drugs [3]. In a hospital-based sample, those experiencing homelessness or unstable housing had higher rates and greater severity of alcohol and drug use than other patients seen in the emergency room [25]. Having a substance use problem is also a risk factor for failure to achieve longer-term housing stability [26]. Yet, the implications of drug use for patterns of daily sheltering over time have not yet been fully explored.
1.3. Gender Identity, Sexual Orientation and Sheltering
System-based and societal homophobia and transphobia act as barriers to accessing supportive services, including through the lack of safe, gender-affirming sheltering options [27]. Transgender and gender-nonconforming individuals often experience gender-based discrimination from service providers that may lead to disparities in access to and utilization of shelters and other social services [28]. There is some evidence that cisgender women may be more satisfied with homeless youth services than cisgender men [29]. Additionally, lesbian, gay, bisexual and queer (LGBQ) youth report having more trouble finding a shelter compared to heterosexual youth [30]. Moreover, lacking an affirming, safe sheltering option may increase the risk of engaging in trade sex among LGBQ youth, which is one survival strategy YEH may use to secure shelter [31].
1.4. Using Intensive Longitudinal Assessment Methods to Identify Predictors of Sheltering Patterns
Although a few studies have investigated factors that may influence patterns of shelter use among YEH, less well understood are the factors that influence sheltering on a day-to-day basis, which may be accessible via longitudinal study. EMA is currently the gold standard methodology for the measurement of real-time data in natural settings [32], with generally high compliance among youth across studies [34]. Several studies have shown that EMA is more accurate than self-reports that require participants to average behaviors over periods of time [35]. Daily diary assessments collected via EMA, which share the compliance and accuracy benefits of the methodology, are thus an ideal data collection method for exploring sheltering patterns and identifying predictors of daily sheltering patterns among YEH. The present exploratory secondary data analyses aimed to identify predictors of daily sheltering accommodations among YEH using demographic and daily diary items collected via EMA, including items evaluating the broad predictor classes described above. For example, the broad class of sexual risk was evaluated by items regarding any sexual activity, number of sexual partners, condom use and prostitution. The broad class of drug use was evaluated by a set of items inquiring about any drugs used as well as alcohol and nicotine. The broad class of gender issues (identity; sexual orientation) was evaluated via questions about discrimination and sexual partners (related to the items above).
3.1. Sample Characteristics
Analyses were restricted to the N = 66 participants with available data on the sheltering outcome. Participants were mostly male (62.1%), Black (65.2%), heterosexual (78.8%) and unemployed (83.3%). Male participants had a slightly higher mean age (male: 21.2 (SD = 1.9); female: 20.8 (SD = 2.0)). Male and female participants were represented across racial categories (male: 68.3% Black; female: 62.5% Black, each relative to other race), orientation (male: 87.8% heterosexual; female: 62.5% heterosexual) and employment status (male: 17.1% employed; female: 16.7% employed). The present non-randomized sample was largely representative of the YEH population in the Houston metropolitan area.
3.2. Sheltering Characteristics
Across the 724 observations, participants more often reported their previous night’s sleeping arrangements as UH (n = 362, 50.0%) than as LH (n = 262, 36.2%) or SN (n = 100, 13.8%). The locations characterized as UH were staying at a relative’s house or in the family home (n = 107/724; 14.8%), staying with a friend or acquaintance (n = 102/724; 14.1%), staying in the home of a boyfriend, girlfriend or sexual partner (n = 97/724; 13.4%) or staying in a hotel/motel (n = 56/724; 7.7%). The most frequent LH locations were staying on the street, in a park or near a bayou (n = 135/724; 18.6%), staying in an abandoned apartment, vaco or squat (n = 94/724; 13.0%), staying in a bus, metro or train (n = 24/724; 3.3%) or staying in a car (n = 9/724; 1.2%).
Participants experienced various possible combinations of shelter types during the study. Eighteen participants reported each type at least once; among these participants, 125 observations were UH, 70 were LH and 37 were SN. Twenty-four participants reported a combination of two types: 20 reported UH and LH, three reported UH and SN, and one reported LH and SN. The first of these combinations was slightly less characterized by UH (119 observations) than LH (123 observations), the second was nearly evenly split between UH and SN (22 and 21 observations, respectively), and the third consisted of one LH and one SN observation. Finally, 24 participants reported only one type: 13 reported only UH (96 observations), 7 only LH (68 observations) and 4 only SN (41 observations).
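The participant and observation counts reported in this subsection can be cross-checked with simple arithmetic. The short Python sketch below is illustrative only: the tuple layout and variable names are assumptions introduced here, while the numbers are copied from the text above.

```python
# Each row: (pattern, participants, UH observations, LH observations, SN observations).
# Values are copied from the text; the layout itself is a hypothetical convenience.
patterns = [
    ("all three types", 18, 125, 70, 37),
    ("UH and LH", 20, 119, 123, 0),
    ("UH and SN", 3, 22, 0, 21),
    ("LH and SN", 1, 0, 1, 1),
    ("UH only", 13, 96, 0, 0),
    ("LH only", 7, 0, 68, 0),
    ("SN only", 4, 0, 0, 41),
]

# Total participants across all sheltering patterns.
participants = sum(row[1] for row in patterns)

# Total observations per sheltering category.
totals = {
    "UH": sum(row[2] for row in patterns),
    "LH": sum(row[3] for row in patterns),
    "SN": sum(row[4] for row in patterns),
}
observations = sum(totals.values())

# Share of all observations in each category, in percent.
shares = {k: round(100 * v / observations, 1) for k, v in totals.items()}

print(participants, observations, totals, shares)
```

Under these inputs, the per-pattern counts sum to the sample size and to the category totals reported at the start of the subsection.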
3.3. Component-Wise Gradient Boosting
The CGB algorithm was used to derive an optimized model fitting the categorical sheltering variable, whereby SN was compared to both LH and UH, using a set of 92 predictors. With the default shrinkage parameter nu = 0.1, tuning the optimal number of boosting iterations by 10-fold cross-validation resulted in a model featuring 35 predictors. Penalized coefficients, odds ratios, normalized importance scores, and raw endorsement frequency by category for the selected predictors are included in Supplementary Table S2. This model of 35 predictors was further reduced by another pass through the algorithm with the shrinkage parameter set to half of the default (nu = 0.05). Tuning the second pass through the remaining predictors resulted in a model featuring 15 predictors (Table 1).
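For readers unfamiliar with the technique, the general idea of component-wise gradient boosting can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the present study: it handles a binary outcome with a logistic loss rather than the three-category outcome analyzed here, it uses simple one-predictor linear base-learners, it fixes the number of iterations rather than tuning it by cross-validation, and all function names, variable names and data are hypothetical. At each iteration, every predictor is fitted on its own to the current negative gradient, only the best-fitting predictor is updated, and the update is shrunk by nu; predictors that are never chosen keep a coefficient of zero, which is how variable selection arises.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def componentwise_boost(X, y, n_iter=50, nu=0.1):
    """Component-wise gradient boosting for a binary outcome (logistic loss).

    X is a list of rows (lists of floats); y is a list of 0/1 labels.
    Returns one coefficient per predictor. Predictors that were never
    selected keep a coefficient of exactly 0.0.
    """
    n, p = len(X), len(X[0])
    # Center each column so every base-learner is a slope-only linear fit.
    means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]
    sum_sq = [sum(row[j] ** 2 for row in Xc) for j in range(p)]

    # Start from the intercept-only model (log-odds of the outcome).
    y_bar = sum(y) / n
    f = [math.log(y_bar / (1.0 - y_bar))] * n
    coefs = [0.0] * p

    for _ in range(n_iter):
        # Negative gradient of the logistic loss: observed minus fitted probability.
        grad = [y[i] - sigmoid(f[i]) for i in range(n)]
        best = None  # (residual sum of squares, predictor index, slope)
        for j in range(p):
            if sum_sq[j] == 0.0:
                continue  # a constant column carries no information
            beta = sum(Xc[i][j] * grad[i] for i in range(n)) / sum_sq[j]
            sse = sum((grad[i] - beta * Xc[i][j]) ** 2 for i in range(n))
            if best is None or sse < best[0]:
                best = (sse, j, beta)
        if best is None:
            break
        _, j, beta = best
        # Update only the winning predictor, shrunk by the step length nu.
        coefs[j] += nu * beta
        for i in range(n):
            f[i] += nu * beta * Xc[i][j]
    return coefs


# Synthetic illustration (hypothetical data): only predictor 0 drives the outcome.
rng = random.Random(0)
X = [[rng.uniform(-1, 1) for _ in range(5)] for _ in range(200)]
y = [1 if row[0] + rng.gauss(0, 0.1) > 0 else 0 for row in X]

coefs = componentwise_boost(X, y, n_iter=50, nu=0.1)
selected = [j for j, c in enumerate(coefs) if c != 0.0]
odds_ratios = {j: math.exp(coefs[j]) for j in selected}
print(selected, odds_ratios)
```

In this sketch, a smaller nu or fewer iterations yields a sparser model, which mirrors the role that the shrinkage parameter and the cross-validated stopping iteration play in the two passes described above.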
Interpreting a given predictor in Table 1 follows from understanding the odds ratios, normalized importance and raw endorsement frequencies. For example, the predictor with the highest average normalized importance was endorsing the response option, “Not having a place to stay” to the item, “What were you stressed about?” (ORLH = 1.37; ORUH = 0.90). These odds ratios may be interpreted such that endorsing the “Not having a place to stay” option was associated with a 37% increase in the odds of experiencing a LH evening compared to a SN evening and a 10% decrease in the odds of experiencing a UH evening compared to a SN evening. Odds ratios were calculated in the present study by exponentiating the raw penalized coefficients reported by the tuned algorithm. Subsequently examining the frequency of endorsement for each outcome category aids interpretation: the “Not having a place to stay” response option was endorsed more than twice as often for LH nights as for UH nights and approximately five times more often than for SN nights.
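The conversion between coefficients, odds ratios and percentage changes in odds described above can be illustrated with a brief Python sketch. The function names are hypothetical, and the coefficients are back-calculated from the reported odds ratios purely for illustration; they are not values taken from the model output.

```python
import math


def odds_ratio_from_coefficient(coef):
    """Exponentiate a (penalized) log-odds coefficient to obtain an odds ratio."""
    return math.exp(coef)


def percent_change_in_odds(odds_ratio):
    """Express an odds ratio as a percentage change in the odds."""
    return (odds_ratio - 1.0) * 100.0


# Reported odds ratios for "Not having a place to stay" (each relative to SN).
or_lh, or_uh = 1.37, 0.90

# Back-calculated coefficients (illustrative assumption, not reported values).
coef_lh, coef_uh = math.log(or_lh), math.log(or_uh)

change_lh = percent_change_in_odds(odds_ratio_from_coefficient(coef_lh))
change_uh = percent_change_in_odds(odds_ratio_from_coefficient(coef_uh))
print(round(change_lh), round(change_uh))
```

A positive coefficient therefore corresponds to an odds ratio above 1 and increased odds relative to a SN night, and a negative coefficient to an odds ratio below 1 and decreased odds.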
The relative strength of the various predictor relationships with the outcome may be investigated via further consideration of the normalized importance scores. The predictor with the second-highest normalized average importance was endorsing the response option, “Yes” to the item, “Were you arrested yesterday?” (ORLH = 0.87; ORUH = 1.33). These odds ratios correspond to a 13% decrease in the odds of a LH night and a 33% increase in the odds of a UH night, each relative to a SN night. The importance scores (ordered in Table 1 by normalized average importance, high to low) provide additional detail: for the LH versus SN comparison, the importance was 43.8% that of the strongest predictor and for the UH versus SN comparison, the predictor yielded the top rank in importance; subsequent averaging of these importance scores demonstrated an average normalized importance of 99.5%. In essence, this predictor provided almost the same amount of overall predictive value to the model as the top ranking predictor (not having a place to stay), with the understanding that the variable contributes more to understanding the UH versus SN comparison relative to the LH versus SN comparison. The frequency of endorsements supports this interpretation, with substantially higher frequencies reported for the UH nights relative to the other categories.
Eight additional predictors provided at least 25% of the average normalized importance of the top predictor; these are described here with odds ratios relative to a SN night except where otherwise specified. These predictors included indicating that a friend had discriminated against the participant yesterday (ORLH = 1.30; ORUH = 0.89), responding that race was the primary reason for experiencing discrimination (ORLH = 1.28; ORUH = 0.92), using synthetic cannabinoids (a.k.a., “kush”; ORLH = 1.25; ORUH = 0.93), reporting having had sex with an unspecified other person (i.e., not a significant other or a prostitute; ORLH = 0.99; ORUH = 1.14), receiving verbal abuse (ORLH = 1.08; ORUH = 0.96), not responding to the item regarding having worked yesterday (ORLH = 1.01; ORUH = 1.12), being physically assaulted (i.e., hit/punched/slapped/kicked; ORLH = 0.99; ORUH = 1.11) and stress about parenting (ORLH = 0.99; ORUH = 1.12). Additional predictors in the model may be interpreted in similar fashion, but do not provide as much predictive utility. The remaining five predictors selected by the reduced model warrant correspondingly less attention.
Weaker predictors of the sheltering outcome that did not meet the 25% importance threshold of the present study deserve correspondingly less, but still some, attention here, given that the tuned algorithm chose to retain them (especially relative to the 77 predictors the algorithm did not retain). The remaining predictors of a LH night were reporting stress about hunger, receiving discrimination from an unspecified other, not asking a sex partner if they wanted to have sex before it happened each time and receiving discrimination due to being homeless. The remaining predictor of a UH night was an affirmative response to having worked yesterday. Percentage changes in the odds of a LH or UH night did not exceed 5% for any of these relatively less important predictors.
Model performance metrics indicated that the algorithm provided daily shelter status classification accuracy of 79.9%, a significant (p < 0.001) improvement over the no information rate of 50.0% (represented by choosing UH, the most frequent category, for each prediction). The algorithm more readily distinguished the UH locations (92.0%) than the LH locations (71.8%) or the shelter (58.0%). The algorithm’s overall ability to distinguish among the sheltering categories was given by the multiclass AUC = 0.92.
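The relationship among the no information rate, the per-category classification rates and the overall accuracy can be checked with a short Python sketch. This is an approximation under an explicit assumption: the per-category percentages above are treated as per-class sensitivities (the share of nights in each category that were classified correctly), which the text does not state directly. Variable names are hypothetical, and the category counts are those reported in Section 3.2.

```python
# Observed nights per category (Section 3.2) and the reported per-category rates.
counts = {"UH": 362, "LH": 262, "SN": 100}
per_class_rate = {"UH": 0.920, "LH": 0.718, "SN": 0.580}  # assumed to be sensitivities

total = sum(counts.values())

# No information rate: accuracy from always predicting the most frequent category.
most_frequent = max(counts, key=counts.get)
no_information_rate = counts[most_frequent] / total

# Approximate overall accuracy implied by the rounded per-category rates.
approx_correct = sum(counts[k] * per_class_rate[k] for k in counts)
approx_accuracy = approx_correct / total

print(most_frequent, no_information_rate, round(approx_accuracy, 3))
```

Because the per-category rates are rounded, the implied accuracy agrees with the reported 79.9% only to within rounding error.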
The present study applied data science techniques to three weeks of intensive longitudinal data to predict different sheltering patterns among YEH. Sheltering patterns varied within and across participants over the study period, indicating substantial transiency among YEH. Although shelters are well positioned to assist youth in accessing needed resources and services, only 13.8% of the observed nights (roughly one in seven) were spent in a shelter. YEH utilize shelters less commonly than other types of services such as drop-in centers [12] or staying temporarily with others. Consistent with the literature [2], this signals the need to potentially broaden the definition of homelessness and/or modify point-in-time counting methodologies to account for variations in sheltering patterns among YEH and reduce the risk of undercounting disconnected youth in need of services who may be UH, rather than LH, during a point-in-time count [64]. Findings from this study identified predictors of shelter use and literal homeless nights that should be considered by service providers. These findings can be used to inform policies that support low-barrier access to shelters and homeless services.
The results of the present study are summarized in Table 1. Interpretative statements here directly follow from the coefficients, importance measurements, and endorsement frequencies described for each retained predictor. Generally, the probability of a SN fell between the probabilities of a LH and a UH night, i.e., most predictors followed a pattern of either LH > SN > UH or LH < SN < UH. For example, endorsing the “Not having a place to stay” response option of the question, “What were you stressed about yesterday?” followed the former pattern of being more likely to experience literal homelessness than use a shelter or find unstable housing. Responding “yes” to the question, “Were you arrested yesterday?” followed the latter pattern of being more likely to experience unstable housing, i.e., a night in jail. One exception to this pattern was noted: not responding to the item, “I worked yesterday” was related to higher probabilities for both LH and UH relative to SN.
In the present study, stress related to not having a place to stay, being arrested, experiencing discrimination (particularly due to race) and using synthetic cannabinoids were the strongest predictors of not staying in a shelter on a given night (>50% normalized importance). These relationships may be driven by substance use policies in the shelters, spending the night in jail and being denied access to a shelter related to perceived discrimination. Additional predictors demonstrating a substantial contribution to the model (those between 25% and 50% normalized importance) included having sex with an unspecified other (i.e., not a significant other or a prostitute) and being physically (i.e., hit, slapped, punched or kicked) or verbally abused. This may indicate that youth who secure unstable housing may be doing so in exchange for sex and may be exposed to violence on those nights. Youth who are parenting may perceive unstable housing to be safer than shelters, thus increasing their use of unstable housing. Shelters often strongly encourage and/or require youth to be actively working or seeking employment, which may lead to less shelter use among those who are not working.
Several of the strongest predictors may be directly related to the broad classes of factors related to homelessness that were discussed in Sections 1.1–1.3 of this manuscript. Regarding sexual activity, having sex with an unspecified individual was related to a 14% increase in the odds of a UH night. Drug use was captured by synthetic cannabinoid (“kush”) use (25% increased odds of a LH night). Factors directly related to gender identity and sexual orientation were not identified by the algorithm; however, the first pass through the algorithm (Supplementary Table S2) identified sex with a non-binary partner as related to increased odds (+10%) of a LH night (this variable was likely not selected by the final model due to low frequencies of endorsement). Further studies are needed to disentangle these phenomena. Research methods that merge geographical data, longitudinal data, and qualitative interviews may enhance our understanding of the drivers of sheltering patterns. Such methodology has been instructive in furthering our understanding of geographical connections to substance use [65].
Synthetic cannabinoid (“kush”) use was found to strongly predict the nights that youth spent on the streets relative to SN or UH nights. However, drug use was less predictive of staying in a shelter compared to any other place. The literature clearly supports that drug use is associated with less shelter use [5] and less housing stability overall [3]. The findings from this study suggest that on the days that YEH use synthetic cannabinoids, they are more likely to be LH. More research is needed to determine the best strategies for sheltering youth who use substances, both within emergency shelters and in more permanent housing options. While substance use was less likely on the nights one used a shelter, it is unclear whether substance use follows the inability to secure temporary shelter or whether youth are denied the ability to stay in a shelter due to using substances. Event-based assessments inquiring about sheltering attempts would increase our understanding of the critical points that lead to LH nights among substance-using YEH. Shelter-based substance-using spaces have been explored as a way to increase safety and reduce overdose among homeless populations in Canada [66] and may improve rates of shelter use.
Many predictors were not selected by the algorithm in the present study; the final model retained only 15 of 92 predictors (thus discarding the 77 others). The non-retained predictors included those related to nicotine and alcohol use, other stressors (e.g., money, job, personal safety), aspects of sexual activity (type, partner’s gender identity, securing active sexual consent, condom use), school attendance, several other types of discrimination (e.g., age, gender identity) and sources (e.g., family member, friend) and other assault types (e.g., robbed, held against will) and sources (e.g., family member, significant other). This may indicate that, while still all too common experiences for YEH, these factors may not be as strongly related to where one stays on a given night as the retained predictors. However, it is important to explore these phenomena further in larger studies using mixed methods to improve our understanding of sheltering patterns and inform interventions that address barriers to sheltering and prevention efforts needed to keep unstably housed youth safe. Further, the present
4.1. Implications and Significance of the Present Findings
This is the first study to use longitudinal data and techniques from data science to explore patterns of sheltering and to predict the likelihood of utilizing a shelter or unstable housing among a high-risk, hard-to-reach population of youth experiencing homelessness. The longitudinal and applied machine-learning methodologies used here are potentially applicable to other hard-to-reach populations and have been used to predict other risk behaviors such as sexual activity and substance use that vary across days, occur with frequency and are potentially affected by real-time factors [20].
Although the present study does not evaluate causality and generalizability may be limited to YEH who interface with shelter and drop-in service locations, the present study was able to isolate a small, parsimonious set of factors demonstrating the strongest relationship to daily sheltering. Moreover, the methodology used here provided an index of the relative importance of each predictor in the model, in essence ranking the predictors. Although we may have generally expected the direction of influence for each predictor, understanding the relative contribution of the predictors provides considerable value (e.g., racial discrimination, particularly by a friend, is more predictive of a LH night than stress about hunger). Further, by focusing attention on 15 predictors while discarding 77, the algorithm provides an optimized set of variables for further investigation. In essence, future efforts may place more value on targeting interventions at these predictors than at the non-retained predictors, particularly those with the strongest relationship to the outcome.
This study deepens our understanding of the variation and transiency in sheltering patterns and suggests that it is possible to predict days when youth are less likely to access the relative safety of emergency shelters. This study adds valuable information to the literature regarding the aforementioned broad factors related to homelessness and provides a starting point for investigating specific predictors related to daily sheltering in future studies. Moreover, with these data, it may be possible to develop just-in-time messaging and alerts that can disrupt the progression from drug use to unstable or literal homeless nights and encourage safer sexual practices on nights when youth are unstably housed. Further research is needed to inform violence prevention efforts for youth experiencing unstable housing. Finally, findings from this study indicate there may be a need for location-specific resource navigation to assist youth in finding safer sheltering options and seeking alternatives to unstable housing that may increase the risk of experiencing violence.
4.2. Limitations and Future Directions
One limitation of the present research lies in the confusion that may arise from the disparate definitions of UH, LH and SN that have arisen in the study of housing instability over time. The present research relies on a distinction between UH and LH that has been described as precariously or marginally housed by some [67] but largely concurs with research suggesting that housing instability lacks a fundamental, standard definition [68] irrespective of authoritative criteria (e.g., the HEARTH Act in the United States). The present research also has methodological constraints: although machine learning allows for exploration of all measured potential predictors for an outcome (e.g., sheltering), the findings may not reflect other possible factors that may influence sheltering but were not measured. While the daily survey was based on extensive formative research [70
], this particular outcome of sheltering patterns was not a primary research question. Nevertheless, using longitudinal data and machine learning methodologies to assess sheltering patterns and predictors is a novel approach that accounts for large variabilities within and across participants. Of note, these data collection approaches are a class of relatively new methods. As a result, several of the measures used to assess these factors have not yet been psychometrically validated. Further, the response patterns of the participants necessitated a coarsening of the available data to focus strictly on the daily diary observations. This limitation inherently restricts the granularity of the predictions possible by the algorithm; however, it may be somewhat tempered by the wide predictor set that was available on the daily diary observations. Data temporality is another limitation, as youth provided retrospective reports of sheltering behavior on the previous day. Therefore, we cannot conclude whether these predictors (e.g., drug use, sexual behaviors) lead to sheltering choices or were a byproduct/consequence of that sheltering choice. The current study does not evaluate causality.
Another limitation is the sampling strategy used in this study. While the use of frequent assessments of sheltering patterns over a period of time is an improvement over cross-sectional designs, the participants were recruited from service locations and were compensated for their participation. Therefore, the results may not be generalizable to youth who do not interface with shelter or drop-in center services. Further studies should include youth recruited from the streets. In addition, to the extent that other samples differ from the present sample, not all of the results may generalize to YEH. For example, synthetic cannabinoids were particularly salient to the present sample at the time of data collection; other samples may be more influenced by other drugs (or none). Other predictors may similarly be influenced by sampling concerns. Finally, it is important to conduct subsequent studies to determine the reproducibility of the patterns that emerged in this study. Future research should investigate the extent to which sample characteristics moderate the relationships between these predictors and sheltering.