Towards a Scalable Architecture for Smart Villages: The Discovery Phase

Alleviating poverty, reducing inequality, and achieving economic prosperity and well-being is a global challenge. The spread and quantum of this daunting challenge call for a scalable solution. The aim of the ‘Scalable Architecture for Smart Villages’ project is to contribute to an effective solution which addresses scale as well as customization. In order to achieve both in our new framework for smart villages, we take an endogenous approach. This approach emphasizes learning which will create a catalytic effect for scale. Learning is an essential component in the process, both for the researchers as well as members of the community. With these principles in mind, our approach proceeds in four phases, namely discovery, planning, resourcing and executing. In this paper we outline the discovery phase, which will lay the foundation for developing our framework of scalable smart villages. The Discovery Phase is a research process where the community learns about itself and the researchers learn about the underlying factors that can help uplift and develop a smart village. Using conventional qualitative and quantitative research methodology, the researchers and the community will generate baseline data which will help calibrate villages for future development into smart villages.


Introduction
Alleviating poverty, reducing inequality and achieving economic prosperity and well-being is a global challenge, which corresponds to the United Nations Sustainable Development Goals 1, 8, and 10 [1]. The aim of the 'Scalable Architecture for Smart Villages' project is to make a contribution to this rather daunting challenge. In proposing an approach to scalable architecture, we are aware of the many great minds that have directed their attention to this challenge and who have made significant contributions, and thus we approach it with humility. We acknowledge that those who have come before have effected varying degrees of change, each in their own unique way. We would like to share our novel approach, with our focus being on the words 'smart' and 'scalable', which allow communities to customize their development trajectory.
In order to achieve scale and customization in smart villages, we are formulating a new scalable framework that is guided on a philosophical level by the need for an endogenous initiative. We draw on the scholarship around the field of Participatory Action Research (PAR), but note some differences in our approach. We detail the differences between PAR and our approach further in the paper, under the section "Knowledge Gap and The Discovery Phase". With our endogenous principles serving as a base, our approach proceeds in four phases, namely discovery, planning, resourcing and executing. In this paper, we discuss the discovery phase as a foundation to build ownership. While randomized control trials represent an advance in measuring the effectiveness of an intervention, we observe that in general, it will be challenging to achieve scale through an interventionist approach to rural poverty. On the other hand, given that India alone is home to 663,418 villages, scalable solutions are critical in making a significant impact [13].

Smart Cities and Smart Villages
A significant amount of research has already been done on developing and studying the evolution of Smart Cities in terms of elevating an existing city's transportation systems, government systems and improving health record organization. While research on smart cities can support the efforts in smart villages research, the modeling of smart villages needs to consider different parameters [14]. There is a distinction between developing a smart city and modeling a smart village, because of the varying priorities and needs of a city as compared to a village [14].
According to Anthony M. Townsend, American researcher and author of 'Smart Cities', information and communications technology is the backbone of a city's life, including infrastructure, architecture and all the objects used to make it livable and sustainable [15]. Internet and connectivity are key components [16]. Considering that rural environments and smaller remote communities have their own challenges and opportunities, including the possible absence of a legacy infrastructure, transcribing the approaches of cities may not be appropriate. The concept of a smart village requires fundamentally different thinking in terms of objectives and processes.
As a starting point, the authors of this paper view smart villages as a vehicle by which to advance human development, and thus view the components of the human development index, namely education, employment and health, as key factors to focus on in developing the concept of a smart village. Thus, in contrast to smart cities, information and communication technologies are relevant only to the extent that they advance human development.
The European Commission (EC) launched a smart villages initiative in 2017 as a policy priority to transform rural areas. According to the European Network of Rural Development (ENRD) Thematic Group, Smart Villages are about rural citizens taking the initiative to transform their communities through local solutions. A key component of making villages smart is improving rural services through digital and communication technologies, but rural communities take the initiative in building on their strengths and developing new opportunities [17]. The efforts of ENRD are towards designing a policy framework where digital communications technology plays an active role.
The Millennium Villages Project (MVP), initiated by Sachs at Columbia University, is perhaps the most well-known smart villages project. While it has laudable goals and has achieved a measure of success, it is heavily resource-dependent, and does not appear to be a scalable solution. There are also other projects (such as that of the Berkeley group headed by Darwin) that have had some success in transforming individual villages, but have not addressed the issue of scalability.
The experience of these projects and the framework put forward by the European Commission can guide future research, but we strongly believe that scalability is the key to making a sizable impact on rural poverty.

Scalable Architecture for Smart Villages
The objective of our model is to develop a scalable 'smart architecture' of village communities, through which they become self-reliant and sustainable. The design of scalable architecture needs to consider four important aspects.
First, there is a need to identify development issues that can be replicated across villages without compromising on the ability of each village to customize to their needs [18]. A simple example of this is the need for a primary school and a basic health-care facility. A more sophisticated, albeit well-known example, is the versatility and customized use of generic multi-micronutrient powders in different foods for reducing iron deficiency among children [19]. Secondly, scaling up requires adaptation to match local community needs and available resources to ensure sustainability [20]. This requires an endogenous receptivity and initiative, and a community capacity to examine and customize available opportunities that are relevant for the community. Thirdly, smartness with respect to sustainability and resilience lies in the ability to constantly learn and respond to changes in the ecosystem [21]. Fourthly, development policies and practices that can be scaled up through explicit and implicit social networks in the community, and between communities, need to be analyzed. Social capital and networks, and scaling them up to levels beyond the immediate community can result in sustainable development as well as impact knowledge flow and innovation [22,23].
This paper is largely concerned with the second of these aspects, namely the development of community capacity and endogenous receptivity and initiative. We propose an approach to both qualitatively and quantitatively locate the baseline of these attributes in a given community, and to understand the conditions under which, and the mechanisms through which, there may be movement from the baseline. This should shed light on the readiness or preparedness of a community to move towards being a smart community. The manner in which this information will be used to catalyze a scalable architecture for smart villages through networks of learning will be discussed elsewhere.

The Role of Learning
Note that learning includes knowledge as well as skills, in other words, actionable knowledge. The famous work [24] of the economist Kenneth Arrow on 'learning by doing' argues that we learn as we produce and invest. In our work, we believe that learning holds the key to solving the scalability problem [24]. Networks of learning provide an endogenous approach to scaling. In other words, we leverage the network of villages themselves and their interrelationships to solve the scaling problem. In order for this to succeed, we have to begin by evaluating community capacity for learning in an individual community.
Our point of view differs from many prevailing approaches to smart villages in that we relegate the role of technology to that of a supporting tool, while the key driver of development in this scalable architecture is peer-to-peer collaboration and learning. By building a learning society, progress towards a smart village will be the result of endogenous effort, and the result will be a self-reliant and sustainable community. The focus is on employment opportunities, education and health, three pillars of human development, which can help village communities discover their potential as independent creators and contributors, instead of seeing themselves as consumers and dependents. Empowering people to help themselves, and creating learning networks is key to catalyzing a scalable effect.
The importance of learning in development economics has been well articulated in the important work of Greenwald and Stiglitz [25]. They explain that increases in standards of living arise from increases in productivity, that is learning how to do things better [25]. Thus, a focus of economic policy should be to increase, or to facilitate and actively encourage learning with an aim to close the knowledge gap [25] (pp. 5,6). Greenwald and Stiglitz focus their discussion on how government economic policy can promote learning and how companies might be structured so as to benefit from learning.
However, it seems to us that the concept is of much wider applicability. In particular, improving productivity and 'doing things better' in a sense that is relevant in a rural community may result in an improved quality of life, and in particular an improved access to health, education and employment. Whether it actually does so depends on whether the proposition is contextually relevant and whether there is endogenous receptivity for it. In the context of a village community, we interpret a 'learning society' to be one which has a high degree of community capacity and endogenous receptivity and initiative.
Our goal is to measure key social indicators and relate them to community capacity and endogenous receptivity and initiative. There is already a body of literature on how community capacity might be measured; however, it is clear that it has to be customized for the context in which one is working [26].
The ambitious goal of designing scalable architecture for smart villages can be successfully advanced in four phases, namely discovery, planning, resourcing and executing. The aim of this paper is to describe what is meant by the 'Discovery' phase.

Knowledge Gap and the Discovery Phase
Discovery has two aims. The first is for the village community to discover its aspirations, strengths, and areas in which improvement is needed to ensure whatever upward mobility is collectively desired in the three pillars of health, education and employment. Community members can use the baseline data to assess changes over time. The second is for the research team to use the data to identify underlying drivers and variables that would be impactful in the creation of a scalable smart architecture model. We expect that amongst these are community capacity, receptivity and initiative, which indicate the potential of a village to become a 'smart village'.
We hypothesize that the currently unsolved scalability problem can be addressed by identifying and seeding a sufficiently large number of villages in which these key drivers are strong. Indeed, such villages are likely to move more quickly to adopt practices that will make them 'smart' and to influence other nearby villages to do the same. This catalytic effect can be quantified through network modeling through a process of Dynamic Community Discovery [27,28]. Discovery is the first step in that process, through which we expect to shed light on the data-driven identification of villages (viewed as nodes of a network) that can serve as catalytic nuclei, in other words villages that are poised to make significant progress towards becoming 'smart' and influence other villages to do the same. These nodes can later trigger change, in neighboring or other networked villages through peer to peer learning.
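As a purely illustrative sketch of the hypothesized catalytic effect, and not the Dynamic Community Discovery method cited above, the following Python toy model treats villages as nodes of a network and lets a village adopt a practice once a given share of its neighbouring villages has adopted it. The threshold rule, the function name `simulate_spread` and the four-village example network are assumptions introduced here for illustration only; they are not part of the project's methodology.

```python
def simulate_spread(neighbors, seeds, threshold=0.5, max_rounds=100):
    """Toy threshold model of peer-to-peer adoption between villages.

    neighbors: dict mapping each village to a list of linked villages
    seeds:     villages assumed to adopt first (the 'catalytic nuclei')
    threshold: share of a village's neighbours that must have adopted
               before the village itself adopts (an assumed rule)
    Returns the set of villages that have adopted when spreading stops.
    """
    adopted = set(seeds)
    for _ in range(max_rounds):
        # All villages are evaluated against the same snapshot of `adopted`,
        # so the update within one round is synchronous.
        newly_adopted = {
            village
            for village, linked in neighbors.items()
            if village not in adopted
            and linked
            and sum(n in adopted for n in linked) / len(linked) >= threshold
        }
        if not newly_adopted:
            break  # no further change: the process has stabilized
        adopted |= newly_adopted
    return adopted


# Hypothetical chain of four villages: A - B - C - D
example_network = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C"],
}

if __name__ == "__main__":
    print(sorted(simulate_spread(example_network, seeds={"A"}, threshold=0.5)))
```

In such a model, the choice of seed villages and the receptivity threshold jointly determine how far a practice travels, which is the kind of question the baseline data are intended to inform.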
While there have been many studies and projects in which discovery has played a role to solve a specific problem or to engage in a specific intervention, we are not aware of it being used in the holistic sense that we are envisaging. Studies on rural issues with a focus on intervention to solve a specific problem are inadequate for creating a database that can be mined for knowledge discovery and used for scalable architecture of smart villages. First, the range of information required for smart villages is extensive and holistic. It should represent all aspects of rural life and living. Rural studies are often focused on a specific area: addressing the crisis of arsenic poisoning in rural Bangladesh, developing a framework for planning, implementing and evaluating sexual health interventions for youth in rural Australia and tourism development in Cameroon [29][30][31]. Second, the data need to be free of global dominated ideologies and images that may not truly represent the reality of rural life [32]. These ideological beliefs may be based on Western perspectives that undermine other forms of knowledge that are created locally [32]. There needs to be an appreciation of social circumstances in which local knowledge is created and used [33]. Data have to represent experiences of people who live in the respective space-time settings [32,34]. Third, the data need to be expressed by people with their own socio-cultural, historical and rural perspective. These three characteristics give authenticity to the data that is collected, and they help us discover the distinct and unique identities of village communities.
In addition to data, the process of discovery, though participatory, captures the indigenous knowledge base. PAR has been acclaimed to be a way of democratizing knowledge [35]. It is a facilitative process for knowledge generation by those who want to transform the environments they live in [35]. In the context of a community, it allows them to ask, think and change their behavior as well as evaluate the changes [35]. Through this learning process, they create their own body of actionable knowledge [35]. Wood and colleagues discuss the need for action researchers to reflect on how their practices are helping the process of democratization of knowledge [35]. There is a fear that action research may not be achieving the objective of acknowledging that there are different ways knowledge can be created and represented [35]. The critique of PAR by some researchers highlights the need to decolonize the process to avoid 'epistemicide of indigenous ways of knowing' [35] (p. 8).
A study done by Stanton documenting a community based participatory research (CBPR) within an Indigenous community shows the epistemological differences between community and mainstream academics [36]. Stanton provides a comprehensive table which looks at the epistemological differences between Indigenous/community epistemologies and mainstream academic epistemologies. In this table, it is important to note the conflicting paradigms-the Indigenous epistemologies state that "Native peoples have unique histories and potential as demonstrated through oral histories, ceremonies, visual art, and so on", whereas the mainstream epistemologies are of the belief that "Native communities are deficient according to mainstream measures such as large scale studies, written accounts, and tests"-that would seemingly impact every aspect of interactions between the researchers and community members [36]. Stanton emphasizes that in order to decolonize CBPR, researchers must center the Native knowledge and epistemologies through the research process [36].
Masalam and colleagues discuss the neo-colonial capitalist cooptation of PAR and provide examples of "Third Worldist-PAR" as adult education praxis [37]. Masalam and colleagues state that through the examples of Indigenous peoples in Indonesia and India, the significance of a decolonized Third Worldist-PAR methodology would include: a better utilization of PAR, which would be more supportive of change in rural communities; PAR could contribute to the transgression of coloniality of power in knowledge production; and it can function as explanatory theory and methodology of praxis because it can generate movement-relevant knowledge which goes beyond the traditional function of academic explanation and description [37].
Drawing upon the literature, the discovery phase will use participatory processes incorporating indigenous knowledge by giving autonomy to the village communities.
Therefore, the discovery phase is unique in both content and process for developing scalable smart architecture. The design of the discovery phase is based on collective learning and experiences shared by academic scholars and practitioners from across the world at various Smart Villages conferences and seminars, and on field visits by the authors.

The Discovery Phase: Research Design
The overarching research program in the design of a scalable architecture for smart villages has many components. There is the scalability problem which development economics has not been able to solve despite decades of interventionist approaches. There is the question of what constitutes a smart community, especially in a rural setting where there is little or no infrastructure, not to mention highly connected infrastructure. The Discovery phase is, in our view, the first step towards solving these fundamental problems.
Within the scope of this project, the aim of the Discovery phase is to gather information about the communities we are working with, especially as it relates to community capacity and endogenous receptivity and initiative. We shall loosely define community capacity as the totality of assets that can be brought to bear for collective action to address community challenges. Similarly, we define endogenous receptivity as the openness of the community to new ideas and endogenous initiative as the willingness of the community to carry an idea into action. However these attributes may be measured, a high level of community capacity and endogenous receptivity and initiative is likely to indicate readiness to embrace the concept of a 'smart village'. The collection of a large number of community parameters, together with a mathematical analysis, will be used to attempt to formulate a quantitative indicator of these key attributes of a village.
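The mathematical analysis that will produce such an indicator is not specified here. Purely as a minimal sketch of one common possibility, the Python fragment below rescales each community parameter to the range 0 to 1 across villages (min-max normalization) and combines the rescaled values in a weighted average. The function names, the parameter names and the weights are all hypothetical and are chosen only to make the idea concrete.

```python
def min_max_normalize(values):
    """Rescale a list of numbers to the range [0, 1]."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # No variation across villages: assign a neutral mid-point (assumption).
        return [0.5 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]


def readiness_index(villages, weights):
    """Weighted average of normalized parameters for each village.

    villages: dict mapping village name -> dict of parameter -> raw score
    weights:  dict mapping parameter -> non-negative weight (hypothetical)
    Returns a dict mapping village name -> index value in [0, 1].
    """
    names = sorted(villages)
    total_weight = sum(weights.values())
    index = {name: 0.0 for name in names}
    for parameter, weight in weights.items():
        raw = [villages[name][parameter] for name in names]
        for name, scaled in zip(names, min_max_normalize(raw)):
            index[name] += (weight / total_weight) * scaled
    return index


# Hypothetical survey scores for three villages on two parameters.
example_villages = {
    "Village A": {"community_capacity": 1, "receptivity": 1},
    "Village B": {"community_capacity": 3, "receptivity": 3},
    "Village C": {"community_capacity": 2, "receptivity": 2},
}
example_weights = {"community_capacity": 1.0, "receptivity": 1.0}

if __name__ == "__main__":
    print(readiness_index(example_villages, example_weights))
```

An index built this way is relative to the set of villages surveyed, so it ranks villages against one another rather than against an absolute standard; a different aggregation would be needed if absolute thresholds were required.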
The outcome of the process is baseline data, which will help us to calibrate villages according to whether an attempt to develop a smart community might be fruitful. From the point of view of the community, the baseline data will achieve two objectives: (a) provide a mirror for the communities to see where they stand and where they want to be, and (b) create a base against which progress can be monitored by the community. This is done through data shared by the community using data collection instruments designed by the researchers in consultation with grassroots team members. These instruments, in the form of questionnaires, would have been vetted through a pilot program involving sample villages. The information that is gathered is both quantitative as well as qualitative, and it includes aspirational aspects. It may also be seen as 'learning' about the community and is consistent with our general approach of empowering rather than instructing.
Thus, the Discovery phase for smart villages goes well beyond collecting and analyzing information. During this phase, we conceptualize the smart village and our approach to smart village development. Relationships with communities and stakeholders are built, helping them to reflect on their personal and community aspirations and create an environment that fosters change from within these communities.
This stage in the smart villages project will help us gain a holistic understanding of the communities that we are working with in a manner that is sensitive to their perceptions and needs, as well as aspirations. It will also properly document our findings with a systematic and data-driven methodology. This will allow future researchers, as well as the community itself, to clearly understand what information has already been collected from the community and what insights the analysis of this information provides.
There are two main principles of the Discovery phase: firstly, the discovery phase is not an intervention which provides solutions; secondly, its purpose is to establish and begin tracking parameters that measure the effectiveness of changes made and how these changes are helping communities develop.
Obviously, any interaction with a community for the purpose of research might be termed an intervention. However, we wish to point out that it is not an intervention in the usual sense of identifying a specific problem, proposing a specific solution, implementing the solution on a trial basis, and measuring the outcome. The discovery phase that we are proposing may be seen as a step in promoting community self-evaluation.
By presenting the data in a clear and concise manner, through the means of data visualization, members of the community would be able to identify areas of improvement that they themselves would like to take up. Laverack coins the phrase 'organic change' to refer to community efforts to gain control of the determinants of quality of life, and comments that "organic change sometimes involves an emotional or symbolic response that can be triggered by an evidence-based argument as part of a behaviour change approach" [38]. Community members will be able to identify the aspects of their village that they feel most strongly about making "smart". At the same time, our project is a research project and so we are interested in measuring the outcomes, such as the increase of 'community capacity' [39]. This approach is supported by research that indicates individuals are able to build on their intuitive understanding of the factors that are shaping their community, and the potential causal relationships amongst those factors, when they can visualize the data [40]. This encapsulates the spirit of the discovery phase, in which the data facilitate the community's initiative in charting its own direction. Thus, the soul of the project is end-user driven.
In the context of rural development in which many interventionist attempts have failed to produce a meaningful increase in quality of life, our unique integrative endogenous approach might be the only viable option currently available.

Gathering Data
A baseline survey will be designed based on the information and knowledge acquired during exploratory research. Survey instruments will be designed to collect village level information on the three pillars of the human development index (HDI), namely education, employment and health. Additionally, information will be collected on social capital, aspirations, wealth and asset ownership, leadership, resilience and perceived happiness and well-being of the community. Finally, all parameters that are contributory to inequality will be analyzed.
Parameters like perceived happiness and well-being go beyond the human development index (HDI) and can be measured over time to indicate development. The focus will be on perceived happiness and well-being and not on any attempt to measure "actual" well-being, as there are divergent views on how well-being should be measured [41].
Parameters like social capital, aspirations, leadership and resilience can be indicators of the potential of a village for development, and can help villagers determine their path to becoming a smart village. Social capital can promote development [42]. It can also facilitate learning by the creation and exchange of knowledge, and it can positively impact grassroots development [10].
Though social capital has been seen to decline with changes like youth migrating out of the village, what gives optimism for smart village development is that it can grow with certain internal changes [42,43]. Women's empowerment can create social capital [44]. Social capital cannot be measured directly, but proxy measures relevant to a culture can be used to assess it [42]. It can manifest in collective action in the community, how disputes are resolved, trust, solidarity etc. [42]. Leadership can play a key role in the creation of social capital [45]. Research will assess whether social capital lies in the hands of a few people, as this can negatively impact development [46].
Aspirations and poverty are related. Poverty can lead to a loss of aspiration to reach one's own potential [47]. Our approach to Smart Villages is consistent with the perspective of Amartya Sen. In his opinion, the goal of development is the enhancement of human freedom [48]. It is also the primary means of development [48]. Sen writes "The ends and means of development call for placing the perspective of freedom at the center of the stage. The people have to be seen, in this perspective, as being actively involved-given the opportunity-in shaping their own destiny, and not just as passive recipients of the fruits of cunning development programs." [48] (p. 53). An indicator of freedom is the formulation of aspiration. Research shows that when entrepreneurial aspirations are encouraged through conducive institutional environments, poverty reduction can be achieved [49].
Resilience is another important parameter. In this context, we are referring to "community resilience". It is generally viewed as a positive characteristic of a community but scientific literature, policies and practice do not seem to have a consensus on its definitions and core characteristics [50]. Conceptually, it is an ability of communities to thrive when challenged by change [51]. Community resilience can prevent as well as facilitate recovery from a disaster or a traumatic event [52]. It is indicative of the community's ability to prevent poor outcomes from happening, as well as its capacity to restore and adapt itself after an event [52]. The Community Advancing Resilience Toolkit, a community intervention designed to promote community resilience, describes community resilience as the capacity to transform its environment and learn from adverse experiences collectively [50,53]. In rural communities, this adaptive capacity is related to the ability of the community for social learning as well as innovation [54,55].
Secondary research will review existing studies relevant to smart villages development. International organizations (such as the United Nations, World Bank, World Health Organization) conduct periodic studies in collaboration with national organizations to address various development issues, especially in developing countries. Governments around the world conduct countrywide surveys to collect data on specific issues relevant to that country. In India, the Ministry of Statistics and Programme Implementation conducts surveys through its National Statistics Office. The National Institution for Transforming India (NITI Aayog), a policy think tank initiated by the Government of India in 2015, also conducts research that would have policy implications. Most of these studies are at the national, state or district level to help formulate policy or evaluate specific intervention programs.
Access to village level data is a challenge and disaggregation of data is, in general, difficult if not impossible. In India, secondary data is available at the village level for certain areas like school enrolment, availability of teachers and teaching infrastructure [56].
Baseline data will be collected using mixed methods, which include quantitative and qualitative data collection techniques [57]. Two types of research instruments will be used for data collection [47]:
• Structured (closed-ended) questionnaires for quantitative data
• Unstructured (open-ended) questionnaires for qualitative data
Separate questionnaires will be administered for village level and household level data. Simple unstructured questionnaires will be administered to youth in schools. Questionnaires will be designed through a multistage process. To ensure relevance to the communities, questionnaire items, selected from a large bank of questions, will be judged by partners on the ground, who work closely with the villages. The modified questionnaire will be translated for pilot testing in a few villages before it is used for data collection. For the sake of practicality, the pilot testing will take place in whichever villages invite us, through their village leadership. The research team does not intend to disrupt any power structure.
The pilot will also test and compare a paper-based data collection system with an electronic data collection system for the final roll out. Qualitative data will be collected through focus group discussions and key informant interviews. Respondents for both qualitative and quantitative data collection will be household representatives, village leaders (elected as well as non-elected) and youth in schools. The process of empowering village communities will begin at data collection itself. Representatives from the villages will be encouraged and trained to collect information about the village, building among them ownership of and engagement in the process. The process will enable the village community to develop a holistic and objective perspective on their own village.

Sampling
Judgement sampling will be used to select villages for exploratory research. After completion of the exploratory research, baseline data will be collected from states where government permission is obtained. A stratified random sampling method will be used to select villages in the state. The sample will include representation from districts and subdistricts in each state. We are aware that these subdistricts could have different names (for example, in India, they may be called tehsil, taluka, etc.). Each subdistrict will contain a number of villages, and we will again apply random sampling to this collection. We have to keep in mind that subdistricts may vary greatly in terms of size and number of villages. The sample size may vary between 5% and 10% of the number of villages in the subdistrict.
The sample will be further stratified by type and size of village. We have a coarse ontology of three major types of villages, namely remote and isolated, not so remote, and peri-urban. This classification depends upon how far or inaccessible the villages are from an urban center. This ontology will be described in further detail in a paper in preparation.
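The sampling design described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the project's sampling protocol: the function name `stratified_sample`, the field names `subdistrict` and `village_type`, the rule of drawing at least one village per stratum, and the fixed random seed are all hypothetical choices made here. Note that forcing at least one village per stratum can push the overall share above the intended fraction when strata are small.

```python
import math
import random
from collections import defaultdict


def stratified_sample(villages, fraction=0.05, seed=0):
    """Simple random sample within each (subdistrict, village type) stratum.

    villages: list of dicts with keys 'name', 'subdistrict', 'village_type'
    fraction: share of each stratum to draw, e.g. 0.05 to 0.10
    seed:     fixed seed so the draw can be reproduced (an assumption)
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in the interval (0, 1]")
    rng = random.Random(seed)

    # Group villages into strata.
    strata = defaultdict(list)
    for village in villages:
        key = (village["subdistrict"], village["village_type"])
        strata[key].append(village)

    sample = []
    for key in sorted(strata):
        members = strata[key]
        # Round up, and draw at least one village from every stratum.
        size = max(1, math.ceil(fraction * len(members)))
        sample.extend(rng.sample(members, size))
    return sample


# Hypothetical sampling frame: 3 subdistricts x 3 village types x 8 villages.
VILLAGE_TYPES = ["remote and isolated", "not so remote", "peri-urban"]
example_frame = [
    {"name": f"village-{s}-{t}-{i}", "subdistrict": f"subdistrict-{s}",
     "village_type": village_type}
    for s in range(3)
    for t, village_type in enumerate(VILLAGE_TYPES)
    for i in range(8)
]

if __name__ == "__main__":
    chosen = stratified_sample(example_frame, fraction=0.25)
    print(len(chosen), "villages selected")
```

Sampling within each stratum, rather than from the pooled list, is what guarantees that every village type in every subdistrict is represented in the baseline data.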
It is important to understand the justification for stratification of villages based on accessibility and distance from an urban center. In our opinion, smart rural development cannot be based on a 'one-size fits all' approach and the specific rural context has to be taken into account. Villages, as discussed earlier, can be placed on a rural-urban continuum and face different challenges. Rural areas distant from urban centers face the consequence of social and economic decline due to outmigration (moving away from the area, within a certain geographical border) of youth [58,59]. Peri-urban areas face the pressures of modern development, where local infrastructure cannot handle the pace of development [59]. The two types of rural contexts also vary with respect to entrepreneurial activity and knowledge innovation due to differences in population density and connectedness to urban centers [60].
Geographic inaccessibility is another aspect of remoteness that needs to be considered in stratifying the population. Mountainous regions and islands have natural barriers to access, which further increase rural deprivation [61]. In these regions, inaccessibility and development potential form a vicious cycle. Poor network opportunities and poor availability of services in these regions exacerbate the problem of outmigration, which in turn results in the loss of job opportunities and available services, making them inaccessible [61].
Finally, 'the poor' do not form a homogenous category [62]. Different subgroups among poor populations impact levels of participation and collective action [62]. One of the differences is caused by the movement in and out of poverty. Some people were not born poor but came into poverty, some are persistently poor and some came out of poverty [62]. The data from discovery so far also indicates heterogeneity due to varying timings and places of migration into the village and extent of diversity among village social groups. Different subgroups may have varying socioeconomic needs that impact collective action and potential for development [62]. The sample design will also consider the level of heterogeneity of a village for stratification.
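The sampling design described above can be illustrated with a minimal sketch. It is not the project's actual procedure: the field names (`name`, `subdistrict`, `type`), the three type labels, the made-up village frame, and the choice to draw the same fraction from every stratum are all assumptions made for illustration. The sketch groups villages by subdistrict and village type and then draws a simple random sample of about 10% within each group.

```python
import random
from collections import defaultdict

def stratified_village_sample(villages, fraction=0.10, seed=0):
    """Draw a random sample of villages within each (subdistrict, type) stratum.

    `villages` is a list of dicts with the hypothetical keys
    'name', 'subdistrict' and 'type'.
    """
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    strata = defaultdict(list)
    for village in villages:
        strata[(village["subdistrict"], village["type"])].append(village)

    sample = []
    for key in sorted(strata):
        members = strata[key]
        # Sample the requested fraction, but never fewer than one village
        k = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, k))
    return sample

# Made-up sampling frame: 2 subdistricts x 3 village types x 20 villages
VILLAGE_TYPES = ["remote_isolated", "not_so_remote", "peri_urban"]
frame = [
    {"name": f"{sd}-{vtype}-{i}", "subdistrict": sd, "type": vtype}
    for sd in ["A", "B"]
    for vtype in VILLAGE_TYPES
    for i in range(20)
]

chosen = stratified_village_sample(frame, fraction=0.10, seed=42)
print(len(chosen), "villages selected")
```

A further stratification variable, such as village size or level of heterogeneity, could be added to the grouping key in the same way, at the cost of smaller strata.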

Participatory Mapping
These instruments will be supplemented with participatory mapping, a participatory research tool [63]. It gives communities an opportunity to actively participate in the research process by visually representing the area they live in. In its simple form, it is a map created by the community of its physical and social structures [63]. These maps may not be drawn to scale but provide information on the spatial distribution of facilities for health, education, recreation, public utility, networks, practicing faith and all services used by the community [63]. In addition to providing information about the spatial distribution of infrastructure and resources, the participatory process helps to understand a community's social dynamics, challenges and potential solutions to address the challenges [64]. Maps can include actors in the community responsible for achieving certain goals or those who are opposing some actions. In addition, a sociogram can be an extension of the map in which relationships between different actors and groups are visualized [65]. Social mapping benefits the community as well. It helps them document and record local knowledge; raise awareness within the community; reflect on their issues and think of actions they can take; and build community cohesion [64]. The research team and the community are both in a process of discovery. The tool is versatile and has been used for improving land administration and management of natural resources, as a teaching tool for public health, and for building strong links between a community and a local rural school [63,64,66]. Mapping sessions can lead to insightful discussions on identifying issues, priorities and possible actions [65].
There are several high-tech multimedia and internet based technologies available for participatory mapping data collection and analysis [67]. These can be efficient in monitoring development over time. We will evaluate the options based on access and digital literacy of participating communities [67]. It is important to ensure that benefits of the participatory process are not lost when using technology [67].
There are several examples of social mapping currently in use, though examples from rural settings appear to be scarce in the peer-reviewed literature. One study, in the village of Thenganayakanahalli in the vicinity of Bangalore, India, worked with respondents to create a map that highlighted various aspects of the infrastructure in place, arrived at the respondents' own definition of poverty, and helped underline certain social dynamics [68]. This in turn led to the respondents reflecting on areas of concern or improvement. The authors found that although the visual representation of their focus area lacked certain information, the presence of the map allowed them to start a dialogue, which helped the respondents to open up further and discuss certain nuances with them [68].

Analysis
Discovery will provide us with both quantitative and qualitative data for analysis [58]. In this project, analysis will involve integrating both types of data, as they will be designed to measure different aspects of village reality [69]. The two types will provide a more complete picture of the ground reality and can be analyzed with respect to the hypothesis or theoretical assumptions of the research [58,69]. Triangulation connects qualitative and quantitative empirical findings with theoretical propositions [58].
Analysis of the data will use a number of techniques, including cluster analysis and dimensionality reduction. In dealing with data that depend on several parameters whose correlations are not known to us a priori, cluster analysis can help us to observe potential relationships. There are several well-known algorithms for this kind of analysis and we will use a combination of them according to the context [70]. For example, in analyzing data collected in a social context, there is a degree of uncertainty or unreliability in the information provided. Thus, it is useful to consider the use of a mixture-modeling algorithm to understand such data [70] (Chapter 18). Along with cluster analysis, dimensionality reduction will be used in order to reduce the number of variables being analyzed without introducing significant error in the calculations [71]. Dimensionality reduction arises from a methodology that exploits sparsity [72]. In other words, indicators may depend on, or be essentially determined by, just a few of the many possible parameters that are measured.
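To make the idea of mixture modeling concrete, the following is a minimal sketch rather than the analysis pipeline of the project. It fits a two-component Gaussian mixture to a single made-up indicator using the expectation-maximization algorithm. The function names, the synthetic scores and the choice of two components are assumptions made for illustration; real survey data would have many variables and would normally be handled with an established statistical library.

```python
import math
import random

def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_two_component_mixture(xs, iterations=100):
    """Fit a two-component 1-D Gaussian mixture by expectation-maximization."""
    n = len(xs)
    # Crude initialization: means at the data extremes, shared overall variance
    means = [min(xs), max(xs)]
    overall_mean = sum(xs) / n
    overall_var = sum((x - overall_mean) ** 2 for x in xs) / n
    variances = [overall_var, overall_var]
    weights = [0.5, 0.5]
    resp = []

    for _ in range(iterations):
        # E-step: soft membership of each observation in each component
        resp = []
        for x in xs:
            p = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(p)
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weights, means and variances from the memberships
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            spread = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            variances[k] = max(spread, 1e-6)  # floor avoids a collapsed component
    return weights, means, variances, resp

# Synthetic indicator: two made-up groups of respondents with different averages
rng = random.Random(0)
scores = [rng.gauss(2.0, 1.0) for _ in range(200)] + [rng.gauss(8.0, 1.0) for _ in range(200)]

weights, means, variances, resp = fit_two_component_mixture(scores)
print("estimated group means:", [round(m, 2) for m in means])
print("estimated group weights:", [round(w, 2) for w in weights])
```

Because each observation receives a probability of belonging to each group rather than a hard label, this kind of model accommodates the uncertainty in self-reported data mentioned above.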
Electronic data capture may be used to capture baseline information. In health research, it has been observed that electronic data collection offers both greater reliability and savings in time compared with paper-based questionnaires [73]. This has been observed even among users with varied experience and literacy levels who collect and transmit health-related information [74].

Smart Village Discovery in Telecommunication
While we have not observed 'discovery' being used in the way we have described it here, there are instances in which some aspects of it have been employed and we describe one such instance here. Fennell and their colleagues investigated if rural youth have improved employment opportunities through ICT initiatives and digital inclusion. This study could help understand if mobile telephones could catalyze the development of smart villages in India [75].
Fennell et al. used household surveys within villages in two states in India, Punjab and Tamil Nadu. In Tamil Nadu, their sample consisted of 3 districts with 100 participants, and in Punjab they sampled 1 district with 112 participants. The discovery process involved designing a research method to examine the choices made by rural youth, in terms of their education and employment. They created a protocol that allowed them to look at the accessibility of the current IP (internet protocol) connectivity expansion program. Then they used a network sensing architecture application to geographically map the quality and availability of mobile internet connectivity. From that, they could construct maps using their parameters for the quality of the connectivity available from every service provider on the market. The authors wanted to see whether the aspirations of the rural youth were reflected in their mobile device usage. This might provide an understanding of what programs could be implemented to help with the youth's social mobility.
For the most part, their discovery process was a combination of secondary and primary research. The outcome of the study was to present their findings and analysis that could be used by other researchers and practitioners or policy makers.

Participatory Action Research in Conservation Agriculture
Sulifoa and Cox studied the implementation of conservation agriculture in Samoa [76]. About 75% of the population of this Pacific Island Country is engaged in village agriculture at the subsistence level [76]. Significant investment of development dollars resulted in a set of sustainable farming practices which were disseminated widely but which failed to be adopted, especially by smallholder farmers [76]. In Samoa and elsewhere, adoption rates of conservation agriculture were low because the initial set-up is labor intensive, it requires equipment which smallholder farmers may not have access to, and yield improvement may not manifest for as long as seven years [76]. In a 2016-2017 study involving extensive interviews with farmers and extension officers, the authors were able to develop a clear picture of "agricultural farming practices across Samoan villages; details about the role of agriculture in these villages and information on the differences between various stakeholders" [76] (p. 133). They concluded that changing the agricultural practices currently in use required a long-term participatory approach involving "co-creation activities, reflexive feedback loops and cooperative buy-in" [76].
The authors concluded "As this case study illustrates, devoting resources to the process of understanding exactly what the problem is and what solutions might be most effective could play a much larger role in making a difference than how quickly a solution suggested by experts is implemented." [76] (p. 142). Moreover, they point out that a collaborative approach is needed that includes smallholder farmers, their families and their village, in addition to officials with a more national or global perspective [76].

Discussion
We discuss similarities and differences between our approach and that of the above two studies. Our approach differs from the approach of Fennell in several ways. Firstly, our project aims to look at multiple parameters contributing to the creation of smart villages and not just communications technology (or any technology for that matter). It considers education, employment and health, together with several other parameters mentioned above. We consider these parameters taken collectively to be fundamental in building a smart village.
Secondly, the telecommunications project was a meta-analysis of data that had already been collected for other purposes and it ended with their analysis and conclusions [75]. On the other hand, our discovery process is meant to engage the communities that it affects in interpreting and potentially actioning the knowledge thus gained. The collective effort is envisaged to result in the development of a scalable model for implementation. Respondents in our discovery, unlike the telecom study, will play a role beyond data collection. They will be active participants in the future development of their village, and its transformation into a smart village.
Thirdly, the Fennell study sampled villages in two states of India [75]. This project will sample villages more broadly. These sample villages will be the implementation centers, which would then influence other villages through networks and learning. In terms of initial geographies, we are targeting selected villages in India, Botswana and First Nations communities in Canada.
Fourthly, the telecom study addresses sustainability but not scalability, while our focus is on scalable sustainable development.
Our approach also differs from that of Sulifoa and Cox, though there are also similarities. Firstly, we are concerned with multiple parameters of community well-being and development, and not just conservation agriculture. Secondly, as in Sulifoa and Cox, the project will rely on data directly collected from villages and not on secondary data sources. Thirdly, unlike Sulifoa and Cox, we plan to sample a number of different geographies. Fourthly, unlike Sulifoa and Cox, we are interested in both sustainability and scalability.
The discovery process in this project is more akin to the agriculture study than the telecommunications study in that it is not limited to conducting research and involves developing a collaborative approach. Moreover, unlike both the telecommunications study and the agriculture study, we do not begin with an a priori concept of the direction that development ought to take, whether it is a widespread deployment of telecommunications capacity or implementation of conservation agriculture.
Our process is the foundational part of a larger initiative of developing a scalable architecture of smart villages. The proposed outcomes of the Discovery Phase are:
1. analyzing the data that has been collected by the community;
2. presenting this analysis to the village communities who provided the data, so as to create ownership of the process;
3. creating a mirror for the communities to see their aspirations, assets and areas of improvement; and
4. building a baseline for measuring potential future progress of the village towards becoming a smart village.
The process, therefore, will be more holistic and will require the research itself to be more detailed and conducted in multiple stages. It is about discovering how to build smart villages in partnership with those who are most affected by such transformation, namely the villagers themselves. Such a partnership requires that the village and its population want such a transformation. In keeping with this goal, we propose that the survey is conducted by village representatives themselves, and not by an external team of researchers. Unlike the telecommunication study, where a representative sample of only rural youth was surveyed, our discovery process involves the entire village community. This project is not trying to identify a catalytic factor, but laying the foundation for an endogenous process of development, in other words, a process that is guided by the community itself.

Conclusions
The Smart Villages project is an interdisciplinary endeavor to address global poverty through a model that has a catalytic impact and is sustainable. The objective is to design a scalable smart architecture through which villages, whether remote or peri-urban, can break the cycle of poverty. In our view, one of the attributes of a smart village is a sentiment that it has achieved its potential.
The Discovery phase of the development of scalable smart architecture uniquely creates a database that can be mined for modeling through an endogenous process. Discovery enables our learning at the grass roots level about the range of issues faced by rural communities, their diverse aspirations and strengths, the variety of paths to achieving their potential and relationships among relevant stakeholders. Most importantly, it enables us to gauge community capacity, endogenous receptivity and initiative. We feel that these factors are critical in determining whether a village will move towards a 'smart' village. Discovery also provides an opportunity for the village communities to learn about themselves, map their assets, and create an environment of ownership to become 'smart'. The process of discovery has started in India, and our experience has helped us to abstract the process to some extent. The plan is to continue to test and refine this model by applying it to other geographies.
As learning and exploration continue in this phase, information will be generated to help design the scalable smart architecture. The data collected from a sample of villages will also provide a baseline to monitor the impact of the model when it is implemented.