A Human Machine Hybrid Approach for Systematic Reviews and Maps in International Development and Social Impact Sectors

Abstract: The international development and social impact evidence community is divided about the use of machine-centered approaches in carrying out systematic reviews and maps. While some researchers argue that machine-centered approaches such as machine learning, artificial intelligence, text mining, automated semantic analysis, and translation bots are superior to human-centered ones, others claim the opposite. We argue that a hybrid approach combining machine and human-centered elements can have higher effectiveness, efficiency, and societal relevance than either approach can achieve alone. We present how combining lexical databases with dictionaries from crowdsourced literature, using full texts instead of titles, abstracts, and keywords, and using metadata sets can significantly improve the current practices of systematic reviews and maps. Since the use of machine-centered approaches in forestry and forestry-related reviews and maps is rare, the gains in effectiveness, efficiency, and relevance can be very high for the evidence base in forestry. We also argue that the benefits from our hybrid approach will increase over time as digital literacy improves and better ontologies become available globally.


Introduction
Information and communication technologies have made significant advances in recent decades and disrupted science and other knowledge sectors. Many mathematical problems that had waited centuries for a solution have been solved using more efficient algorithms and increased computing power. Artificial intelligence has become commercially viable, and standard tools using machine learning have been developed in space, technology, and medical research. Many tools and techniques for analyzing unstructured data have been developed and commercialized at scale. However, the changes in systematic review methodologies and tools in international development and social impact sectors have been much slower. There are many missed opportunities to increase the effectiveness, efficiency, and societal relevance of systematic reviews and maps in these sectors.
Systematic reviews and maps have gained momentum over the last decade in the broader forest use and management literature, including subjects such as land use management, ecosystem services, and zoology. There are about 140 systematic reviews and maps related to forest use and management in the Scopus and Web of Science databases. Following early years of minor increase between 2003 and 2013, the number of forest-related reviews and maps has grown. The diversity of the review types has also been increasing. In the early years, systematic reviews were the only type. Since 2012, systematic review protocols have become a part of the forest use and management systematic review work, and since 2015 systematic maps and map protocols have been published. As of June 2021, there were more than 106 systematic reviews, 15 systematic review protocols, 12 systematic maps, and seven systematic map protocols in broader forest use and management related subjects. However, the use of machine-centered approaches, such as machine learning, artificial intelligence, text mining, automated semantic analysis, and translation bots, in carrying out these reviews and maps was minimal. None of the 140 reviews and maps used artificial intelligence, algorithmic procedures, or text mining approaches in a structured manner. Only a few used analysis software with text and content analysis functionalities.
One of the root causes of the slow change in the evidence sector in general and systematic reviews, in particular, is the polarization of the opinion about the role of machine-centered systems. On the one hand, several scientists argue that machine-centered evidence systems, such as artificial intelligence-based classifications, are black boxes, and their findings cannot be validated sufficiently [1,2]. Some others raise ethical concerns and claim that evidence generated by machine-centered systems will lead to ethically blind interventions [3,4]. On the other hand, multiple scientists argue that human judgment on evidence will always be incomplete, at best [5,6], and partial in some cases [6]. Some proponents of machine-centered systems also argue that human-centered evidence approaches reflect the preferences of a professional community rather than the needs of society [7,8] since building sufficient capacity to generate, disseminate, and use evidence effectively requires a considerable investment that can only be done by a minority of the people in society [9,10].
In this article, we provide a hybrid approach that combines both machine-centered and human-centered elements. We think our hybrid approach for conducting systematic reviews and maps can address most concerns about evidence management and improve the efficiency, effectiveness, and societal relevance of systematic reviews and maps. We propose using the hybrid approach especially in the evidence sector in international development, social impact, and broader forest use and management, the sectors that shaped our perspective and led to the design and development of the approach. We think that broader forest use and management sectors, including land management, are among the fields that can benefit from the approach in addressing the key challenges in contemporary forestry.

Current Challenges in Systematic Review and Map Sector
We classify current challenges in the systematic review and map sector that inhibit effectiveness, efficiency, and societal relevance in three major groups. The first group of challenges is related to the evidence sources. In the last decade, the number of publications in the international development and social impact sectors has increased exponentially [11,12]. However, none of the major databases hosting publications have near full coverage [13,14]. The metadata sets of the publications registered in the databases are not standardized. It is common to have multiple versions of the same publication [15,16]. In addition, authors of publications may inadvertently "obscure" their articles by using loosely related popular terms in the titles and keywords, decreasing the specificity of articles to increase the likelihood of the article being indexed in and prioritized by the query engines, as titles play an essential role in making publications accessible to search engines and attractive to users [17].
The second group of challenges is related to the process of conducting the systematic reviews and maps, which can be a 'time consuming and sometimes boring task' [18]. In health research, a systematic review was estimated to take more than 12 months [19,20]. Although these estimates were made for health reviews and maps and the evidence sources in international development and social impact sectors differ, it is realistic to assume that reviews and maps in international development would take this amount of time. Although anecdotal in the sense that it is not yet fully documented, our own experience indicates the time-consuming nature of systematic reviews in international development [21].
Current approaches used for systematic reviews and maps rely heavily on manual, labor-intensive efforts in compiling, screening, analyzing, and synthesizing literature sources, which require significant time investments [22,23] and usually leave limited time for evaluation and synthesis activities [24]. Analysis and synthesis require advanced skills, necessitating comprehensive education and training periods coupled with significant financial investments [9]. Due to time limitations, low investment in evidence by the international development and social impact sectors, and a lack of capabilities in review teams, the evidence generation process is not sufficiently documented, making replication of the review results hardly feasible. Because of the same limitations, review teams cannot devote sufficient effort to documenting their learning about the content and the evidence, which could have been an essential source of information for the design of international development and social impact interventions. Notwithstanding the considerable potential of systematic reviews for knowledge generation, these process-related challenges highlight the lack of timeliness of systematic reviews and their limited uptake, significantly limiting the use of the evidence in different contexts and making use at scale unlikely.
The third group of challenges relates to the acceptance and applicability of systematic reviews in the social sciences. Because they were originally derived from the health and medical field to consider the effectiveness of specific health interventions, systematic reviews are less able to deal with qualitative evidence, multidisciplinary studies, and differing contexts, all of which are common in international development research [25]. Systematic reviews also privilege scientific knowledge over other forms of knowledge, such as local, technical, and experiential knowledge, making them less applicable to multidisciplinary research.
Despite these challenges, systematic reviews are seen by bilateral and multilateral donors as an essential tool for evidence-informed research for policymaking precisely because they provide a unique opportunity to synthesize large bodies of evidence. For example, the UK Department for International Development (DFID), the Australian Agency for International Development (AusAID), and the International Initiative for Impact Evaluation (3ie) had commissioned up to 100 systematic reviews [24]. Given this emphasis on systematic reviews among funders, improving current approaches and addressing some of these challenges, specifically their timeliness, is essential.

A Hybrid Approach to Address the Challenges
Our hybrid approach builds on the relative advantages of human and machine-centered approaches. In conducting systematic reviews and maps, machines have a comparative advantage over humans in any process that can be standardized and structured into simple components. Although this is changing rapidly, humans have advantages in unstandardized processes and semantics. For instance, machine-centered approaches are not advanced enough to have common sense, and machines cannot yet establish semantic relations as effectively and efficiently as human-centered approaches [26,27]. Based on this picture, our approach uses three critical operations.

Combining Lexical Databases with Dictionaries from Crowdsourced Literature for Queries
Building a search query is one of the critical steps of any systematic review and map process [22,28]. In conventional reviews and maps, the queries are manually identified by experts in the author team of the review or the advisory committee [28,29] (Figure 1). In some cases, the queries are tested against a reference set of resources and updated until they retrieve the whole reference set or a significant part of it. Expert opinion is a quick way to develop a query with a high chance of retrieving relevant resources. However, since the expert identification process is not random and the number of experts who can be consulted is limited, there is always a high risk of omitting relevant resources and of expert bias, especially when experts are selected based on convenience. Organizing the advisory committee is also cost- and effort-intensive for reviews and maps in international development and social impact. Since these sectors are international, advisory committees require diverse members from different countries and continents, who need to travel significant distances to attend the committee meetings. Each trip requires significant resources and time for various arrangements.
Figure 1 (caption): When there is a difference between the conventional and hybrid approaches, multiple boxes indicate the differences. When the tasks are not different between the two approaches, either a box with "same" was used, or detailed parts of the tasks were omitted to improve the accessibility of the figure. For instance, in the conventional approaches, the end product is static reviews. In contrast, in the hybrid approach, the end product is dynamic (live) reviews, i.e., reviews that can be updated using recent evidence sources without significant effort and reviewer time.
In our hybrid approach, we reduce the risk of bias induced by omitting relevant resources and evidence that the author team and advisory committee did not know by building queries using standardized ontologies in lexical databases and expert pooled evidence dictionaries. Standardized agricultural and forestry ontologies such as AGROVOC, AIMS, and The Crop Ontology for Agricultural Data have been built with the participation of many experts and peer-reviewed by large communities of researchers. They combine more than 40,000 concepts and 700,000 terms in more than 20 languages, providing opportunities to analyze literature published in those languages. WordNet combines all concepts and terms on the internet with a clear semantic relation structure for about 200 languages. The semantic relations in WordNet go beyond synonyms and antonyms and include all relations such as hyponyms and meronyms. By selecting the words and terms from these ontologies based on the research question, we capitalize on the inputs of hundreds of experts in identifying the search query in a standardized way. The queries formulated by the hybrid approach identify a higher number of relevant resources from a broader base of literature.
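The query-building step described above can be illustrated with a minimal sketch. It assumes a tiny, hypothetical in-memory ontology (`TOY_ONTOLOGY`) that maps each concept in the research question to related terms; the entries are invented for illustration and are not taken from AGROVOC or WordNet. The function name `build_query` and the Boolean query syntax are likewise assumptions, since the syntax accepted by a given literature database may differ.

```python
# Hypothetical miniature ontology: concept -> related terms.
# Real ontologies are far larger; these entries are illustrative only.
TOY_ONTOLOGY = {
    "forest": ["forest", "woodland", "agroforestry"],
    "livelihood": ["livelihood", "income", "food security"],
}


def build_query(concepts, ontology):
    """Join the terms of each concept with OR, and the concepts with AND."""
    groups = []
    for concept in concepts:
        # Fall back to the concept itself when the ontology has no entry.
        terms = ontology.get(concept, [concept])
        quoted = " OR ".join(f'"{term}"' for term in terms)
        groups.append(f"({quoted})")
    return " AND ".join(groups)


query = build_query(["forest", "livelihood"], TOY_ONTOLOGY)
print(query)
```

Because the terms come from a shared vocabulary rather than from the memory of a few experts, the same research question yields the same query regardless of who runs the procedure.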
The ontologies suggest too many words and terms for practical use when the review or map research question is not focused enough. In this case, we use dictionaries generated from expert pooled literature to prioritize the words and terms included in the ontologies. To ensure that the experts cover all possible points of view, we use a crowdsourcing approach. Instead of looking at publications or university positions to identify the experts, we use professional social media platforms, mostly ResearchGate, Academia.edu, and LinkedIn. We share the research question, through the social media pages and by email, with all the experts identified by searching these platforms and ask them whether they could provide literature resources informing the question. Since these social media platforms include expertise and contact information even about experts who do not have personal profiles, this approach makes it possible to identify and contact digitally disengaged experts. Afterward, we use the most frequent words that emerge from the analysis of the literature shared by the experts.
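The prioritization step above can be sketched as a simple frequency ranking. This is a minimal illustration under stated assumptions: the candidate terms and the pooled texts are invented, the function name `rank_terms` is hypothetical, and only single-word terms are handled (multi-word terms would need phrase matching).

```python
import re
from collections import Counter


def rank_terms(candidate_terms, documents, top_n=3):
    """Rank single-word candidate terms by frequency in the pooled documents."""
    counts = Counter()
    for text in documents:
        # Lowercase the text and split it into alphabetic tokens.
        counts.update(re.findall(r"[a-z]+", text.lower()))
    # Sort by descending frequency, breaking ties alphabetically.
    ranked = sorted(candidate_terms, key=lambda term: (-counts[term], term))
    # Keep only terms that actually occur in the pooled literature.
    return [term for term in ranked if counts[term] > 0][:top_n]


# Hypothetical ontology terms and expert-shared texts, for illustration only.
candidates = ["forest", "woodland", "timber", "savanna"]
pooled = [
    "Community forest management and timber income.",
    "Forest tenure reform in woodland areas.",
    "Forest governance and timber markets; forest users benefit.",
]

print(rank_terms(candidates, pooled))
```

Terms that never appear in the pooled literature (here `savanna`) drop out, which keeps the final query focused on vocabulary the expert community actually uses.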

Using Full Texts Instead of Title, Abstract, and Keywords for Screening
Most conventional systematic review and mapping methods include a screening protocol whereby titles, abstracts, and keywords are screened against specific inclusion criteria to identify literature potentially relevant to the research question [30,31]. Since identification depends on each screener's interpretation of the inclusion criteria, more than one person does the screening, and relevance decisions are compared [28,32]. When there are differences in inclusion decisions, the screeners interact, agree on standard criteria, and reassess [33,34]. This process requires significant time, as reaching an overall consensus is time-intensive, and it is blind to the inadvertent obscuring described in the section on challenges. Most graduate programs have a research writing course in which the strategic use of scientifically or societally popular words in titles, abstracts, and keywords is taught. Although the objective of such courses is to increase the chances of the articles being prioritized among the many hits generated by a query, the systematic use of this practice creates a massive discrepancy between the content of the full papers and their titles, abstracts, and keywords. In addition, a significant portion of the evidence generated in international development and social impact literature is funded by international research for development projects. Access to such funding depends on the quality of research and on a promise of positive, large-scale development impact. These expectations create an incentive for authors to use catchy titles and maximalist impact claims in the abstracts.
In our hybrid approach, we go a step further and do not screen the resources using only the title, abstract, and keywords. We retrieve the full texts of all accessible resources from the major general databases, i.e., Scopus, Web of Science, and PubMed, and from specialty databases chosen based on the research question. Then, we leverage algorithmic procedures to analyze them and extract the relevant parts using text mining methods such as word combinations, interactive word trees, and word-resource maps. This enables us not only to significantly increase the number of publications we can use for the synthesis and mapping (while also reducing the time needed to gather them in ways that would not be possible manually) but also to identify evidence patterns that would not be visible in conventional human-centered approaches. For research questions that include concepts with fuzzy boundaries, which are common in international development and social impact sector reviews and maps, a machine-only system might extract a text set so large that a coherent synthesis becomes impractical. If this is the case, we use human selection and auditing of the extracted text and identify a subset based on key principles identified by the author team. Our early attempts showed that enforcing a further focus and rerunning the updated query might also reduce the information relevant to the focus. Going back to the beginning and updating the research question would require a new process of ontology identification and crowdsourcing. We think that human-based selection is a more viable approach for concepts with fuzzy boundaries until validated ontologies are published.
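The difference between title-based screening and full-text extraction with word combinations can be illustrated with a small sketch. It is not the authors' implementation: the function names, the naive sentence splitter, the crude substring matching (which would also match "forest" inside "deforestation"), and the example texts are all assumptions made for illustration.

```python
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_passages(text, term_groups):
    """Return sentences containing at least one term from every group."""
    hits = []
    for sentence in split_sentences(text):
        lowered = sentence.lower()
        # A sentence qualifies only if each group contributes a matching term.
        if all(any(term in lowered for term in group) for group in term_groups):
            hits.append(sentence)
    return hits


# Hypothetical word combination: a forest term AND a livelihood term.
groups = [["forest", "woodland"], ["income", "livelihood"]]

# Invented example: the title hides what the full text contains.
title = "Rural change in the highlands"
body = (
    "We surveyed households in two districts. "
    "Households with access to community forest reported higher income. "
    "Road density did not change."
)

print(extract_passages(title, groups))  # title-only screening finds nothing
print(extract_passages(body, groups))   # the full text yields the passage
```

When a question involves fuzzy concepts, the list returned by such a procedure can become very long, which is the point at which human selection and auditing of the extracted passages takes over.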

Using Metadata Sets for Mapping and Synthesizing the Evidence
The metadata of publications have not been utilized in the conventional production process of systematic review and mapping methods. Although the location, the period in which the evidence was generated, and the type of publication play essential roles in exclusion and inclusion criteria, they are treated as external variables that define the scope of the review rather than internal variables that can be used to compare the existing evidence across different locations and times [35,36]. A significant part of international development and social impact research is, in fact, multi-location and context-dependent. It also cuts across multiple periods, since development and social impact are long-term processes [24,37]. The inclusion and exclusion of resources based on time and location can therefore lead to a loss of relevant evidence, and the learning that could be generated from comparisons across time and location can be lost.
Other metadata of publications, such as the authors' profiles, the organizational affiliations associated with the publications, and the funding agencies of the research, are hardly used. Since the authors' profiles, organizational affiliations, and funding agencies indicate potential conflicts of interest in a broader sense and give information about the inclusivity and diversity of the agency that generates the evidence, excluding them from the synthesis and maps creates a risk of reinforcing an illusion of the impartiality of evidence.
In our hybrid approach, we use metadata as an internal variable for presenting the evidence in a more granular way. Using a combination of academic reference management software and word-publication text analysis techniques, we create high-quality, comparable metadata and make the authors, organizations, and multiple contextual variables a part of the synthesis. This enables us to present specific configurations of time, space, individual, and organizational factors that can lead to specific international development and social impact outcomes. We also combine the metadata from the academic databases with other data sources, such as national statistics and datasets of international organizations such as the United Nations organizations, to enrich the evidence that can inform the systematic review or map research question.
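Treating metadata as an internal variable can be sketched as a cross-tabulation of evidence by metadata configuration. The record fields (`country`, `period`, `funder`, `outcome`), the placeholder values, and the function name `tabulate_evidence` are hypothetical and chosen only to illustrate the idea; real metadata would first need to be cleaned and standardized.

```python
from collections import defaultdict


def tabulate_evidence(records, keys=("country", "period")):
    """Count reported outcomes for each configuration of metadata fields."""
    table = defaultdict(lambda: defaultdict(int))
    for record in records:
        config = tuple(record[key] for key in keys)
        table[config][record["outcome"]] += 1
    # Convert to plain dicts so the result is easy to print and compare.
    return {config: dict(outcomes) for config, outcomes in table.items()}


# Invented records with placeholder values, for illustration only.
records = [
    {"country": "Country A", "period": "2000-2009", "funder": "Funder X", "outcome": "positive"},
    {"country": "Country A", "period": "2000-2009", "funder": "Funder Y", "outcome": "positive"},
    {"country": "Country A", "period": "2010-2019", "funder": "Funder X", "outcome": "mixed"},
    {"country": "Country B", "period": "2010-2019", "funder": "Funder Y", "outcome": "negative"},
]

by_place_and_time = tabulate_evidence(records)
by_funder = tabulate_evidence(records, keys=("funder",))
print(by_place_and_time)
print(by_funder)
```

Changing the `keys` argument regroups the same evidence by funder, organization, or any other metadata field, so that differences across contexts stay visible in the synthesis instead of being filtered out by inclusion criteria.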

Towards More Effective, Efficient and Societally Relevant Systematic Reviews and Maps in International Development and Social Impact Sectors
Our Human Machine Hybrid Approach proposes changes in how systematic reviews and maps in international development and social impact sectors are conducted. We justify the changes by the gains they bring: higher effectiveness, by capitalizing on standardized ontologies and revealing the strategic behavior in title, abstract, and keyword formulation; higher efficiency, by removing the screening step and introducing a combination of text mining with human auditing; and higher societal relevance, by crowdsourcing the literature identification and increasing granularity in the use of metadata. We argue that, by combining the strengths of both human-centered and machine-centered approaches in multiple steps from building queries to synthesis, our approach is a significant improvement in the ways systematic reviews and maps in international development and social impact sectors are carried out. Since this is a perspective paper, the specific details of our approach are not described here. We intend to provide such details in a follow-up method article.
We are aware that our approach might require higher digital and technical literacy from review teams and greater access to computing and processing power, which might be hard to achieve for teams working in low-income countries and settings where infrastructure capacity is low. We recognize that the gains from implementing our hybrid approach might not be high enough to justify changing the ways of doing reviews and maps when the review questions include fuzzy concepts. Nevertheless, we believe that, as time passes, existing trends of increasing digital literacy at a global scale, the exponential increase in open-access academic literature, and advances in ontologies will reduce the resource demands of our hybrid approach and make it even more beneficial. The value of this approach to subjects such as forestry and sustainable development, which rely heavily on a wide range of research in many different disciplines and on grey literature to provide evidence for decision-making, will be particularly high given the existing constraints.