Respondent-Driven Sampling for Surveying Ethnic Minorities in Ecuador

Abstract: In this work, we consider the problem of surveying a population of young Indigenous, Montubios and Afro-Ecuadorians to study their living conditions and socioeconomic issues. We conducted a respondent-driven sampling (RDS) survey in the canton of Riobamba, Ecuador. RDS is a network-based sampling method intended to survey hidden or hard-to-reach populations. We have obtained RDS estimates and confidence intervals of these characteristics. We have illustrated and discussed some of the assumptions of the method using some available diagnostic tools. Our results suggest that RDS is an effective methodology for studying social and economic issues of this ethnic minority in Ecuador. This technique is relatively easy to implement and has the potential to be applied to survey other hidden populations in other settings.


Introduction
Like any other South American country, Ecuador is ethnically diverse [1][2][3]. Most of its population identifies as mestizo (71.93%), of mixed Amerindian and Spanish heritage, alongside the following minorities: Montubio (7.39%), Afro-Ecuadorian (7.19%), Indigenous (7.03%), White (6.09%), and other (0.37%) [4]. Poverty and unsatisfied basic needs affect Indigenous and Afro-Ecuadorian households more than those of mixed origins [5]. Furthermore, the incidence of extreme poverty is even higher in Indigenous and Afro-Ecuadorian households than in mestizo families, which results in less access to education and difficulty obtaining decent housing [5]. Several studies have highlighted that housing is one of the most important components in a person's subjective well-being and overall satisfaction [6][7][8]. Chica et al. [9] recently studied the quality of life of households in Colombia, and Chica and Cano [10] studied the prices of houses using a regression-kriging method. There is evidence that issues with housing and the living standard of some groups in the population and their social exclusion can lead to health issues [11,12].
Some studies and data collected from official statistical agencies highlight the higher rate of unemployment and poorer access to health, education and housing of Indigenous people and other ethnic minorities compared to those with mixed origins [13][14][15][16][17]. Nevertheless, most of these studies are based on sources of data and information on the living conditions of ethnic minorities in Ecuador such as the Census of population and housing (CPV) 2010, the Survey on living conditions (ECV) 2014, or the National survey of employment, unemployment, and underemployment (ENEMDU) 2018, but they are not representative at the canton level and are outdated or do not take into account

Materials and Methods
The present study is a cross-sectional survey conducted by the University of Granada and the Superior Technical College of Chimborazo, Ecuador, in collaboration with the National Confederation of Farmers, Indigenous and Black People Organizations (FENOCIN). Inclusion criteria for participants in the survey were: self-identification as Indigenous, Montubio, or Afro-Ecuadorian; being 18 to 29 years old; living in the city of Riobamba; having a valid ID number; and giving their consent to participate in the study. First of all, the authors evaluated the suitability of RDS for surveying this group of people. The non-random selection of initial respondents, called seeds, is critical, as they must have a large social network. They were selected using face-to-face and telephone interviews among 40 youth leaders who work on issues of interculturality, justice and solidarity for FENOCIN. They were interviewed several times to ensure their suitability. Ten seeds were selected, diverse in terms of sex, age, marital status, ethnicity, and instruction (see Table 1). The selection was based on two criteria: personal characteristics and number of connections within their social group. RDS methodology requires that the group of interest is a hidden population and that its members form a well-connected social network. Indigenous, Montubio, and Afro-Ecuadorian populations have been studied so far in the CPV, ECV, and ENEMDU surveys, but there is no available survey focused on excluded young ethnic populations in Ecuador. More importantly, young Indigenous, Montubio, and Afro-Ecuadorians find it difficult to self-identify [23][24][25][26]. Therefore, we lack a reliable sampling frame for this group, which makes traditional sampling difficult to implement. As RDS reduces privacy concerns, it can be a convenient method for surveying such populations [28].
The Riobamba canton is home to the highest proportion of Indigenous youth in Ecuador and also has young Montubios and Afro-Ecuadorians [39]. Evidence from different studies shows that the province of Chimborazo, and particularly the city of Riobamba, concentrates the vast majority of the poorest Indigenous population in the country, with essential living conditions being far from ideal [15][16][17]. The city of Riobamba has approximately 39,000 students in universities, technological institutes, etc. Most of the Montubios and Afro-Ecuadorians living in Riobamba are students who have migrated from rural areas (mostly from the Ecuadorian coast) to the city of Riobamba seeking education and better living conditions. This has led to a fusion of different social groups, which have merged into a single social network [26]. Therefore, Indigenous, Montubio, and Afro-Ecuadorian youth constitute a hidden population that forms a social network and is of socio-economic interest, so we aim to survey this young ethnic group using RDS.
The questionnaire included eight different sections covering the following information: contact and eligibility data, informed consent, socio-demographics, housing and home, health, habits, practices and use of time, poverty, discrimination, and general satisfaction with life (see Table A1 of Appendix A). Once it was completed, respondents became recruiters and could access a different form to recruit new respondents. Both the questionnaire and the recruitment forms were hosted on the website (www.ugremina.com). After collecting the contact information of new participants at each wave, the computer system sent them an email with instructions on how to use the website to fill out the survey and how to recruit up to three new participants. The computer system awaited their response for the following two weeks, sending up to four text messages reminding them to complete the questionnaire and to invite new peers. Each recruit received a username and password by email for logging in to the website. The identifiers of both the recruit and their recruiter were stored in a database, which allowed tracking the chains created from each seed. We followed the recruitment process, making sure that the RDS sample was large enough to overcome the potential bias introduced by the initial selection of seeds. We used the convergence and bottleneck plots (Figures 1 and 2) to check the evolution of the RDS estimates along the chains. The final RDS sample is consistent with the CPV 2010 and the ENEMDU 2018 surveys. A dual system of incentives was used to promote recruitment, as is usual in RDS surveys [28]. The incentive was the right to participate in a lottery where the prize was a holiday trip to Galapagos. Participants received one raffle ticket immediately after filling out the web survey and another for each successfully recruited peer (up to three).
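The chain tracking and the dual incentive described above can be sketched in a few lines. This is a minimal Python illustration, not the authors' system (which is not described in code): the function names `trace_chains` and `raffle_tickets`, and the `{recruit: recruiter}` mapping with `None` for seeds, are assumptions made for the example.

```python
from collections import Counter


def trace_chains(recruiter_of):
    """Map each respondent to (seed, wave) given a {recruit: recruiter} dict.

    Hypothetical data layout: seeds have recruiter None and sit at wave 0.
    """
    info = {}

    def resolve(person):
        if person in info:
            return info[person]
        parent = recruiter_of[person]
        if parent is None:
            info[person] = (person, 0)          # a seed starts its own chain
        else:
            seed, wave = resolve(parent)        # inherit the recruiter's seed
            info[person] = (seed, wave + 1)
        return info[person]

    for person in recruiter_of:
        resolve(person)
    return info


def raffle_tickets(recruiter_of, max_coupons=3):
    """One ticket for completing the survey plus one per recruited peer (capped)."""
    recruits = Counter(r for r in recruiter_of.values() if r is not None)
    return {p: 1 + min(recruits[p], max_coupons) for p in recruiter_of}
```

For a toy mapping such as `{"s1": None, "a": "s1", "b": "s1", "c": "a"}`, respondent `c` is placed in the chain of seed `s1` at wave 2, and `s1` holds three raffle tickets.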
To assess the RDS assumptions, we considered the convergence and bottleneck plots and the homophily ratio of the variables. Homophily is the tendency to associate with those with similar characteristics. The RDS survey homophily scores are shown in Table 2 and interpreted in the next section. Convergence plots show the running estimate of the population parameter on the vertical axis against the number of recruits on the horizontal axis. This plot can help assess whether the sample is biased by the initial set of seeds. Bottleneck plots show the same running estimates separately for each seed and can therefore reveal differences between seeds. These two plots for the RDS ethnic survey data are given and interpreted in the next section.
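The quantity drawn in a convergence plot can be illustrated with a short Python sketch. It assumes, for the example only, that the running estimate is an inverse-degree weighted proportion (the RDS-II form) and that respondents are listed in recruitment order; the function name `convergence_trace` is hypothetical and this is not the routine from the R package used in the study.

```python
def convergence_trace(y, degree):
    """Running inverse-degree weighted proportion after each respondent.

    y      : 0/1 indicators of the characteristic, in recruitment order
    degree : self-reported network sizes (all > 0), same order
    """
    num = 0.0   # cumulative sum of y_i / d_i
    den = 0.0   # cumulative sum of 1 / d_i
    trace = []
    for yi, di in zip(y, degree):
        num += yi / di
        den += 1.0 / di
        trace.append(num / den)
    return trace
```

Plotting the returned list against the respondent index gives the curve whose flattening is read as evidence that the estimate no longer depends on the seeds.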
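We used the most common estimators in RDS: the RDS-I ratio estimator, the RDS-II estimator [30], and the Gile and Handcock [40] successive sampling (RDS-SS) version for sampling without replacement. The RDS-I estimator of the proportion of group A, for a binary response with groups A and B, is defined as

$$\hat{p}_A^{\text{RDS-I}} = \frac{\hat{C}_{BA}\,\hat{D}_B}{\hat{C}_{BA}\,\hat{D}_B + \hat{C}_{AB}\,\hat{D}_A},$$

with $\hat{C}_{AB} = \frac{r_{AB}}{r_{AB} + r_{AA}}$, where $r_{AB}$ is the number of people in A recruiting people in B in the sample, $r_{AA}$ is the number of people in A recruiting people in A in the sample, $\hat{C}_{BA}$ is defined analogously, and $\hat{D}_A$ and $\hat{D}_B$ are the estimated mean degrees of groups A and B. The RDS-SS [40] estimator of a proportion is

$$\hat{p}_A^{\text{RDS-SS}} = \frac{\sum_{i \in s} y_i / \hat{\pi}(d_i)}{\sum_{i \in s} 1 / \hat{\pi}(d_i)},$$

with $y_i = 1$ if respondent $i$ in the sample $s$ belongs to group A and $y_i = 0$ otherwise, and $\hat{\pi}(d_i)$ the inclusion probability of a respondent with degree $d_i$, obtained from the population distribution of degrees estimated through successive sampling. The RDS-II estimator takes the form of the Hajek estimator:

$$\hat{p}_A^{\text{RDS-II}} = \frac{\sum_{i \in s} y_i / d_i}{\sum_{i \in s} 1 / d_i},$$

with $d_i$ the degree reported by respondent $i$. The RDS-SS and RDS-II estimators have desirable statistical properties, as they are consistent and asymptotically unbiased. We used the R software environment, in particular the RDS library (Handcock et al. 2017) and the igraph library for drawing social networks.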
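To make the estimators concrete, the following Python sketch computes RDS-II and RDS-I on toy data. It is an illustration under stated assumptions, not the R `RDS` library used in the study: the function names, the `(recruiter_index, recruit_index)` input layout, and the use of a harmonic-mean degree estimate for each group in RDS-I are choices made for this example. RDS-SS is omitted because it needs the successive-sampling step and a population size.

```python
def rds_ii(y, degree):
    """Hajek-type estimate: inverse-degree weighted mean of 0/1 indicators."""
    weights = [1.0 / d for d in degree]
    return sum(w * yi for w, yi in zip(weights, y)) / sum(weights)


def rds_i(recruitments, y, degree):
    """RDS-I estimate of the proportion of group A (y == 1).

    recruitments : list of (recruiter_index, recruit_index) pairs
    Assumes both groups recruit at least once (otherwise division by zero).
    """
    # r[(g, h)]: recruitments made by members of group g of members of group h
    r = {(g, h): 0 for g in (0, 1) for h in (0, 1)}
    for rec, new in recruitments:
        r[(y[rec], y[new])] += 1
    c_ab = r[(1, 0)] / (r[(1, 0)] + r[(1, 1)])   # share of A's recruits in B
    c_ba = r[(0, 1)] / (r[(0, 1)] + r[(0, 0)])   # share of B's recruits in A

    def mean_degree(group):
        # Assumed degree estimator: harmonic mean of reported degrees in group
        inv = [1.0 / d for yi, d in zip(y, degree) if yi == group]
        return len(inv) / sum(inv)

    d_a, d_b = mean_degree(1), mean_degree(0)
    return (c_ba * d_b) / (c_ba * d_b + c_ab * d_a)
```

Both functions return a proportion between 0 and 1; comparing them on the same data mirrors the comparison of estimators reported in Table A2.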

Results
As mentioned before, ten initial seeds were selected to participate and recruit up to three more respondents. Every new respondent was given the opportunity to recruit another three new participants in the study. Thirty-two of the recruits used the three coupons to recruit, 300 just two of them, 108 one, and 60 did not recruit anybody. Three of the seeds were very successful (86 or more recruited within their chains), four had moderate success (from 66 to 85 recruited within their chains), and three performed less well (65 or fewer recruited). The survey reached six waves for the 10 seeds (see Figures 1 and 2).
A total of 1510 invitations were sent to potentially eligible participants, and 814 completed the questionnaire, which gives a 53.9% overall cooperation rate. Valid cases and cooperation rates are distributed from the first wave to the sixth as shown in Figure 2.

Living Conditions of the Ethnic Group
We studied the living conditions of this ethnic group and compared the RDS estimates with the values obtained with the official surveys CPV 2010 and ENEMDU 2018 for the general Ecuadorian population (blue color values) and for those belonging to ethnic minorities (green color values). RDS estimates and confidence intervals are reported in Table 3. We computed the three usual RDS estimators (given in Section 2) for every characteristic under study and obtained similar results, which are reported in Table A2 of Appendix A.
RDS allows recruiting participants who would not normally be part of a probabilistic sample in the context of studying hidden populations. Age, marital status, and salary characteristics (for the ENEMDU survey of ethnic minorities) fall outside the 95% RDS confidence interval, indicating that people who are reluctant to be identified as part of those ethnic minorities (who were captured by RDS) tend to be younger (21.81 years) than those who have no problem with their ethnic self-identification (23.25 years). Similarly, they tend to be single (81.31% compared to 55.09%) and to have a lower median income ($295.50 compared to $379.64). In contrast, the characteristics sex, instruction, language, work, social security, and extreme poverty fall within the interval, showing that there is no difference for these variables between those who self-identify as part of the ethnic minority and those who do not.
We compared the estimates of the total ENEMDU survey with the 95% RDS confidence intervals with the intention of identifying gaps. Table 3 shows large differences in socio-economic characteristics, such as between the total ENEMDU salary and the RDS estimate ($523.58 compared to $295.50), with the former falling well outside the 95% RDS confidence interval. There are also differences in instruction and social security coverage between these two groups, with the total ENEMDU values outside the confidence intervals. Moreover, 90.63% of the ethnic youngsters claim to have occasionally been victims of discrimination. Following the same arguments, there are differences in most of the housing characteristics considered in the survey (number of people living in a house, water, and energy service). Despite the important effort being made by the Ecuadorian administrations to avoid social exclusion and discrimination of these ethnic groups, there is evidence of socio-economic differences.

Assessment of the RDS Survey
We computed the homophily ratios of the characteristics under study, as shown in Table 2. Homophily is computed as the ratio of the number of recruits with the same characteristic as their recruiter to the number expected by chance [41]. A value of 1 means that there is no preferential recruitment, while values over 1 indicate homophily and values under 1 indicate heterophily (e.g., a value slightly over 1 indicates modest homophily).
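The ratio just defined can be illustrated with a small Python sketch. The definition of "expected by chance" used here, independence between the recruiter's and the recruit's category given the observed margins of the recruitment table, is an assumption made for the example and may differ in detail from the computation in [41] or in the R package; the function name `recruitment_homophily` is hypothetical.

```python
from collections import Counter


def recruitment_homophily(pairs):
    """Observed same-category recruitments divided by the number expected by chance.

    pairs : list of (recruiter_category, recruit_category) tuples
    Expected count assumes recruiter and recruit categories are independent.
    """
    n = len(pairs)
    observed = sum(1 for rec, new in pairs if rec == new)
    recruiter_counts = Counter(rec for rec, _ in pairs)
    recruit_counts = Counter(new for _, new in pairs)
    expected = sum(
        recruiter_counts[k] * recruit_counts[k] for k in recruiter_counts
    ) / n
    return observed / expected
```

With pairs split evenly between same-category and cross-category recruitments the ratio is 1, while a sample in which every recruit shares the recruiter's category gives a ratio above 1.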
In order to assess the recruitment, we computed the homophily scores for every variable under study. Nevertheless, some socio-economic variables are more closely connected with what homophily represents, namely the tendency to associate with those with similar characteristics. The occupation group in which respondents work appears to have modest homophily. For other socio-economic characteristics, such as clothing or social security, there is very small homophily. Generally, most of the values are close to 1, indicating modest homophily or modest heterophily and therefore a satisfactory recruitment.
Convergence plots in Figure 3 show the estimates stabilizing as the sample grows for the variables sex, instruction level, working status and income. This stabilization of values as recruitment continues suggests that the resulting sample is not biased by the initial selection of seeds. Similar plots and similar results have been obtained for all the other characteristics under study and for the RDS-I and RDS-II estimators, but they are not reported here for ease of presentation. Figure 4 shows the bottleneck plots, which appear to converge on one point estimate for each variable, suggesting stable estimates (instead of converging on two or three, which would indicate unstable estimates and important differences between the data from different seeds). Examples of bottleneck plots are available in Reference [42]. We computed RDS-II and RDS-SS estimates for every variable in the survey. Differences between the RDS-II and RDS-SS estimators are very small (under 0.01) for all variables in the survey (see Table A2 of Appendix A), indicating that the size of the population is not inducing bias in our estimates [42].
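The reading of a bottleneck plot can be sketched numerically as well. The Python example below is only an illustration: it assumes that each curve is a running inverse-degree weighted proportion computed within one seed's chain, and the names `bottleneck_traces` and `final_spread`, as well as the `(seed_id, y, degree)` record layout, are hypothetical.

```python
from collections import defaultdict


def bottleneck_traces(records):
    """Running weighted proportion per seed.

    records : iterable of (seed_id, y, degree) in recruitment order,
              with y in {0, 1} and degree > 0
    Returns {seed_id: [estimate after 1st member, after 2nd member, ...]}.
    """
    num = defaultdict(float)
    den = defaultdict(float)
    traces = defaultdict(list)
    for seed, yi, di in records:
        num[seed] += yi / di
        den[seed] += 1.0 / di
        traces[seed].append(num[seed] / den[seed])
    return dict(traces)


def final_spread(traces):
    """Gap between the highest and lowest final per-seed estimates."""
    finals = [t[-1] for t in traces.values()]
    return max(finals) - min(finals)
```

A small final spread corresponds to curves that end close to a single value, whereas chains that settle on clearly different values would show up as a large spread.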

Discussion
RDS has been widely used in public health studies, particularly for studying the prevalence of a disease, but there are very few examples of its application to surveying ethnic minorities.
We carried out an RDS survey to study the socioeconomic and living conditions of the youngest segments of the Indigenous, Montubio, and Afro-Ecuadorian population in the cities of the Riobamba canton in Ecuador. We considered dimensions such as housing, social welfare, income, poverty, social exclusion, and perception of life. We compared the RDS estimates of these characteristics for the ethnic population in Riobamba with those of the average Ecuadorian to check for potential gaps and differences in such dimensions.
We showed that RDS can collect information on participants who would not be recruited using traditional sampling. We showed there are differences in some socio-economic characteristics between those who self-identify as part of the ethnic group and those who are reluctant to do so. These differences suggest that RDS is an effective method for studying social and economic issues of ethnic minority urban youth in Ecuador. Furthermore, we used the RDS-II and the RDS-SS estimators, the two most important estimators in RDS, well known among RDS practitioners for their good theoretical and practical properties.
We recruited a sample of 814 Indigenous, Montubio, and Afro-descendant urban youth over five months of fieldwork in the Riobamba canton in Ecuador. 93% of participants in the study (including the 10 initial seeds) successfully recruited at least one peer. The resulting sample is ethnically, demographically, and socioeconomically diverse and was large enough to produce estimates on the population.
A well-documented social problem is the under-registration of ethnic minorities in surveys and censuses in Ecuador. These populations can be better represented with the RDS methodology. RDS also has the potential to be useful for studying sensitive issues in hidden populations, such as having been a victim of discrimination. With further information about a sensitive issue of interest in hidden populations, we can gain a better understanding of the actual state of that issue and use it to design informed policies to address the problem, for instance, eradicating all forms of discrimination.
There are some limitations to this study. Its results deal with an urban young ethnic population in the Riobamba canton and cannot be generalized to other areas in Ecuador or to the rural youth in the canton (due to poor internet access). Nevertheless, RDS has the potential to be applied at the national level by conducting it separately in the 24 provincial capital cities of Ecuador. The survey is subject to coverage bias, as approximately 20% of this social group in Ecuador does not use the internet regularly. Finally, it is not clear how the network recruitment process could be affected by socially desirable responding in surveys using RDS sampling methods. Future research is needed on this issue, as most studies using RDS deal with topics that are deemed sensitive.
RDS is a useful methodology that can be applied to a wide range of populations and contexts that are difficult to address with probability-based sampling techniques. Social researchers may consider using these techniques to survey other hidden and/or difficult-to-reach populations.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript:

RDS        Respondent-driven sampling
FENOCIN    National Confederation of Farmers, Indigenous and Black People Organizations
Appendix A