The Contribution of Data-Driven Technologies in Achieving the Sustainable Development Goals

Abstract: The United Nations’ Sustainable Development Goals (SDGs) set out to improve the quality of life of people in developed, emerging, and developing countries by covering social and economic aspects, with a focus on environmental sustainability. At the same time, data-driven technologies influence our lives in all areas and have caused fundamental economic and societal changes. This study presents a comprehensive literature review on how data-driven approaches have enabled or inhibited the successful achievement of the 17 SDGs to date. Our findings show that data-driven analytics and tools contribute to achieving the 17 SDGs, e.g., by making information more reliable, supporting better-informed decision-making, implementing data-based policies, prioritizing actions, and optimizing the allocation of resources. Based on a qualitative content analysis, results were aggregated into a conceptual framework, including the following categories: (1) uses of data-driven methods (e.g., monitoring, measurement, mapping or modeling, forecasting, risk assessment, and planning purposes), (2) resulting positive effects, (3) arising challenges, and (4) recommendations for action to overcome these challenges. Despite positive effects and versatile applications, problems such as data gaps, data biases, high energy consumption of computational resources, ethical concerns, privacy, ownership, and security issues stand in the way of achieving the 17 SDGs.


Introduction
The Sustainable Development Goals (SDGs) formulated by the United Nations (UN) are internationally agreed goals to improve the quality of life for billions of people in developed, emerging, and developing countries by covering the social and economic aspects of human society, with a focus on economic security and environmental sustainability [1]. The SDGs comprise 17 agendas with 169 targets and 232 indicators that apply to all countries and regions of the world and that are planned to be implemented by 2030. When the goals were adopted in 2015, it was agreed that their progress would be reviewed regularly at the regional, national, and global levels [2]. Despite qualitative and quantitative assessments at national government levels, a cross-country study (sample of 26 countries) by Allen et al. [3] shows that while good progress has been made in the implementation stage, there are large gaps in further stages.
Since (big) data-driven analytics, artificial intelligence (AI), the Internet of Things (IoT), deep learning (DL), and machine learning (ML) influence our lives in all areas and have caused fundamental change processes in recent years [4], this study is concerned with the research question of the extent to which such data-driven technologies have enabled or inhibited the successful achievement of the 17 SDGs to date. For this purpose, a comprehensive review of the relevant scientific literature on the impact of data-driven approaches on sustainable development in the context of each of the 17 SDGs was conducted. Our findings show that data-driven analytics and tools are primarily used for data collection, monitoring, forecasting, and mapping or modeling.
The paper is structured as follows: First, the applied methods (i.e., literature review, content analysis) are described. Second, the results of our literature review are summarized by presenting the state of research separately for each of the 17 SDGs. Third, we summarize the state of research based on the literature, address the potential uses of data-driven methods, and present the resulting positive effects as well as arising challenges. We derive recommendations for action to overcome these problems and to set priorities that should be pursued over the next decade to enhance the positive impact of data-driven approaches on achieving the SDGs by 2030. The paper ends with the discussion of results and conclusion.

Materials and Methods
Our study aimed to identify and summarize the current state of research on the extent to which data-driven technologies have enabled or inhibited the successful achievement of the 17 SDGs to date. To answer our research question, an exploratory literature search, loosely based on the method of Tranfield et al. [5] for systematic literature reviews, was conducted to identify relevant articles in the field and to provide a comprehensive overview of the state of research. The exploratory search for publications covered the two topics of data-driven technologies and sustainable development (cf. SDGs). Articles were identified that contained a combination (AND conjunction) of two groups of keywords, either of which was included in the title, the keywords, or the abstract. The first group of keywords referred to data-driven technologies (e.g., AI, big data, data mining, DL, ML, etc.), and the second group of keywords referenced SDGs or sustainability. Separate searches were conducted for each SDG, and keywords were assigned for each that encompassed the SDG's topic. To give an example: For SDG 6 (Clean Water and Sanitation), these were the keywords "Water stress; water resources management; water scarcity; water security; wastewater; sanitation; hygiene". We searched several databases (e.g., Google Scholar) to obtain an initial sample of English-language journal articles published between 2015 and 2021. The search was not limited to journals in a particular scientific discipline. Our inclusion criteria were broad: in addition to journal articles, conference proceedings, working papers, and book chapters were also included. Care was taken to include mostly ranked journals and/or frequently cited articles. No specific class of journals was preselected with respect to their ranking (e.g., top-ranked journals). This approach is based on Kubíček and Machek [6], who argue that "innovative research ideas may even appear in lower-ranked journals" (p. 967).
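The keyword logic of such a search can be illustrated with a minimal sketch. It assumes that a record qualifies when at least one keyword from each group occurs in its title, keywords, or abstract. The keyword lists are abbreviated from the examples named in the text, while the record structure, the field names, and the function name are hypothetical; the searches themselves were run in bibliographic databases, not with this code.

```python
# Illustrative sketch only: record fields and the function name are hypothetical.
TECH_KEYWORDS = ["artificial intelligence", "big data", "data mining",
                 "deep learning", "machine learning"]
SDG6_KEYWORDS = ["water stress", "water resources management", "water scarcity",
                 "water security", "wastewater", "sanitation", "hygiene"]


def matches(record, group_a, group_b):
    """Return True if at least one keyword from each group occurs in the
    title, keywords, or abstract (AND across groups, OR within a group)."""
    text = " ".join([record.get("title", ""),
                     record.get("keywords", ""),
                     record.get("abstract", "")]).lower()
    return any(k in text for k in group_a) and any(k in text for k in group_b)


# Two made-up records: only the first mentions both a technology and an SDG 6 topic.
records = [
    {"title": "Machine learning for wastewater treatment", "keywords": "", "abstract": ""},
    {"title": "A history of sanitation policy", "keywords": "", "abstract": ""},
]
selected = [r for r in records if matches(r, TECH_KEYWORDS, SDG6_KEYWORDS)]
```

Simple substring matching is used here for brevity; a real query would rely on the search syntax of the respective database.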
A total of 193 papers were identified as being relevant to the review. Despite its limited representativeness, the exploratory literature review shows that there is a vast body of knowledge from different disciplines regarding data-driven models and their influence on helping to achieve the SDGs.
In an attempt to identify and synthesize the state of research on the extent to which data-driven technologies have enabled or inhibited the successful achievement of the 17 SDGs, we followed the process of qualitative content analysis presented by Finfgeld-Connett [7]: We conscientiously read each article to identify data segments and codes (step 1: identification of data segments) and organized these findings in a spreadsheet (step 2: data matrices and coding). As data analysis progressed, coded findings were interpreted, synthesized across studies, and continuously recorded (step 3: memoing). Subsequently, the apparent relation between codes and accompanying memos was compiled in a diagram (step 4: diagramming), on the basis of which our results were later aggregated into a conceptual framework. Through the iterative process of reflecting on ideas and their interconnections (step 5: reflection), concepts were gradually constructed. We synthesized findings into the following categories: (1) uses of data-driven methods, (2) resulting positive effects, (3) arising challenges, and (4) recommendations for action to overcome these challenges. The content analysis of the identified articles was carried out systematically, organized by the respective SDGs, and summarized in the following chapters. In the presentation of the 17 SDGs, selected targets and indicators are mentioned whose inclusion is based on the fact that they are named in the sampled literature or can be clearly identified. However, covering all 169 targets and 232 indicators would go far beyond the scope of this paper.
of sinking back into poverty in a particular year. Rule-based measures can help sustain local governments' poverty alleviation efforts, while requiring the performance of regular surveys, the monitoring of beneficiaries, and the development of databases that help assess the effect of poverty eradication policies and monitor individuals' socioeconomic status.
Vinuesa et al. [23] discuss the positive and negative effects of AI in the context of achieving the 17 SDGs. Concerning SDG 1, positive effects of AI were observed regarding poverty prediction, implementation of social protection/benefit systems and access to essential services for economically disadvantaged and marginalized people, worldwide collaboration in poverty alleviation programs, mobilization of resources, and development of a robust policy framework. Negative effects included increased economic and social inequalities in emerging and developing countries due to digital technologies and a lack of access for economically disadvantaged and marginalized people to basic services due to data biases in the model and resulting biased policy implementation. Data bias is a systematic distortion of sample data that affects the representativeness and predictability of the data [24] and might reflect biases based on race, gender, religion, or disability [25,26]. Fair and efficient model development using data-driven AI and ML models offers the potential to resolve poverty-related challenges by allowing higher level understanding and prediction-based decision-making. The collaborative and focused attention of policymakers, government agencies, economists, service providers, and technology/AI developers is required to maximize AI and ML models' potential to alleviate poverty.

SDG 2: Zero Hunger
SDG 2 sets out to end hunger (target 2.1), achieve food security and improved nutrition (i.e., end all forms of malnutrition) (target 2.2), and promote sustainable agriculture (target 2.4). The unavailability of food in the required quantity causes people to consume less food, which can lead to severe malnutrition and starvation [27]. The reasons for hunger in regions and communities where this problem is acute are complex and include food insecurity caused by poverty, political instability, the lack of social benefit systems, low agricultural productivity, unemployment, famine, natural disasters, wars and conflicts, social discrimination, and gender inequality. Food security is particularly important for the sustainable development of countries, which can be supported by increasing production efficiency. Far-reaching measures are needed, such as (1) comprehensive reduction in grain consumption across all channels, (2) effective intersectoral cooperation among national and regional organizations, and (3) allocation of public funds for intensification of agricultural production and development of industrial infrastructure [28]. To achieve zero hunger goals in the short and long term, the problem must be addressed at multiple levels: Robust policy making and policy implementation [29,30] are needed that ensure food supply by increasing agricultural productivity (target 2.3) and sustainable food production (target 2.4). In addition, efficient supply chain management is needed that can respond to short product life, large fluctuations in supply and demand, seasonal production, and quality issues (target 2.c) [31]. The problem of food insecurity could be addressed by establishing robust social benefit systems as well as food distribution systems that enable people struggling with acute hunger to purchase food at affordable prices and in sufficient quantity and quality.
AI and ML models are being widely used to achieve various SDG 2-related targets and to provide efficient management techniques to address hunger at the primary level, e.g., mapping macro- and micro-level poverty (cf. SDG 1), monitoring sustainable benefit schemes, monitoring food and nutritional supply, and increasing sustainable agricultural production whilst ensuring minimal environmental damage. Addressing malnutrition requires a well-resourced data and information system that includes multisectoral actions [32]. However, data related to hunger and nutrition are sparse, incomplete, and scattered. A framework that can integrate, curate, and analyze data is therefore needed to provide insights into regions experiencing undernutrition and to address challenges in eradicating malnutrition. Data-driven analytics and tools are useful for understanding hunger-related issues and taking action to prevent them. DataDent [33] proposes a visualization tool based on a robust data value chain (e.g., prioritization of measurements, data creation and collection, data curation, and data analysis) to address nutrition-related challenges and enable effective policymaking, evidence-based decision-making, and collaboration.
A key issue in addressing hunger is increasing agricultural production (target 2.3) and developing an efficient and sustainable distribution system. The European Commission's Farm to Fork (F2F) strategy under the European Green Deal aims to make food systems fair, healthy, and environmentally friendly and to reduce food waste [34]. One of the goals of the F2F strategy is healthy food from healthy soils [35]. Data-driven analytics and ML models are applied in recommendation systems for crop identification, selection, and protection, crop growth monitoring, yield maximization and optimization, fertilizer recommendation, and soil quality [36][37][38][39][40][41][42]. To measure progress in implementing a reduction in chemical/hazardous pesticide and fertilizer use in agriculture, a monitoring, reporting, and verification system for organic carbon in agricultural soils is needed [43]. The use of pulsed electric fields could improve energy and yield efficiency of tomato processing by using less energy in pressing tomatoes and reducing greenhouse gas emissions [44,45]. AI- and ML-based models and technologies are being used for sustainable livestock farming, such as monitoring animal health and well-being, predicting disease, and optimizing dietary supply, intake, and consumption of nutrients [46][47][48][49][50][51]. To determine where and how to target efforts most effectively, it is necessary to capture where along the food chain, in which foods, and in which countries the greatest losses occur [52]. However, the lack of comprehensive data on the amount of food wasted makes it difficult to determine the effectiveness of prevention and reduction efforts [53]. The current food system is under pressure due to various challenges, such as growing population, climate change, soil degradation and erosion, eradication of all forms of hunger, price instability, and environmental regeneration [54,55].
These challenges could be met with sustainable intensification (SI) of food production, but implementing SI strategies can involve trade-offs with environmental issues, land use, marketing of agricultural products, and other governance and policy issues. Purnhagen et al. [56] indicate that achieving SDG 2 would benefit from incorporating innovations in biotechnology into organic agriculture, as otherwise, the desired increase in organic production could lead to less sustainable food systems. Minimizing trade-offs in the sustainable food system can be achieved through the application of cutting-edge technologies of AI, data-driven and ML models, and computational methods. Fanzo et al. [57] present a food systems dashboard and visualization tool that provides detailed information for 140 indicators for global, national, and regional food systems, integrates them, and makes them comparable. As a result, it helps prioritize actions to improve diet quality and nutrition and to make decisions based on high-quality data. The dashboard is based on the interconnectedness of various components of the food system, including consumer behavior and diet, individual factors (economic, cognitive, aspirational, and situational), external factors (environmental, political, social, and economic), and the food supply chain. AI and ML models for descriptive, predictive, and prescriptive methods are applied to analyze economic, social, and environmental indicators of the agriculture supply chain (ASC) using soil, weather, yield, and geographic information systems, as well as satellite, irrigation, livestock, and economic data [58,59]. Gardas et al. [60] propose interpretive structural modeling for exploring the indirect impact of different variables, using the influence matrix and reachability matrix in the ASC as a decision support system for policymakers to improve the performance of the ASC. Liu et al. [61] discuss supply chain (re)structuring utilizing big data and blockchain applicable for pricing and investment decisions in the green agri-food supply chain.
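The relation between an influence matrix and a reachability matrix, as used in interpretive structural modeling, can be sketched as follows. This is a generic illustration under stated assumptions and not the implementation of Gardas et al. [60]: it assumes a binary influence matrix and derives reachability as its transitive closure using Warshall's algorithm. The function name and the three-variable toy matrix are hypothetical.

```python
def reachability(influence):
    """Transitive closure of a binary influence matrix (Warshall's algorithm).

    Entry [i][j] of the result is 1 if variable i influences variable j
    directly or indirectly, and 0 otherwise.
    """
    n = len(influence)
    # Each variable is taken to reach itself, as is conventional in ISM.
    reach = [[1 if i == j else influence[i][j] for j in range(n)]
             for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if reach[i][k] and reach[k][j]:
                    reach[i][j] = 1
    return reach


# Hypothetical toy example with three variables: 0 influences 1, 1 influences 2.
influence = [[0, 1, 0],
             [0, 0, 1],
             [0, 0, 0]]
reach = reachability(influence)
# reach[0][2] is now 1: variable 0 affects variable 2 indirectly via variable 1.
```

The indirect link from variable 0 to variable 2 is the kind of effect that is invisible in the direct influence matrix alone.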
Smart farming, precision farming, digital agriculture, and precision irrigation are emerging areas where big data, sensors and IoT infrastructure, robotics, and data analytics are key components of the smart agriculture infrastructure. These applications minimize cost, improve farmers' profit, enable faster and robust supply chain management, and improve transparency. Agricultural activities are driven by AI and ML models used for predicting future outcomes, making real-time operational decisions [62,63], and developing novel business models. In order to meet the requirement of sustainability, these agricultural business models also need to be sustainable. In general, the importance of sustainable business models is increasing due to emerging framework conditions such as environmental legislation, access to raw materials, or decarbonization [64]. Sarker et al. [65] examine the prerequisites for deploying big data in sustainable agricultural production, address technical and organizational challenges associated with implementing big data technologies, and develop a conceptual framework for big data-driven sustainable agriculture. Approaches such as Farmbeats, an IoT data platform for agriculture, enable data collection from digital devices and the effective use of sensors for smart and precision farming [66]. The implementation of data-driven models in smart farming brings concerns about data quality, privacy and ownership, data integration, and the interpretability and acceptability of AI and ML models in decision-making. There are gaps in the reach of data-driven technologies to small farmers, landowners, and remote areas worldwide, which is why Mehrabi et al. [67] recommend that governments, agricultural organizations, entrepreneurs, and researchers adopt policies, make investments, and conduct research focused on the availability, access, and use of technologies that can lead to inclusive and sustainable agriculture and food systems.

SDG 3: Good Health and Wellbeing
SDG 3 aims to ensure healthy lives and promote the well-being of all people at all ages, including reducing maternal mortality (target 3.1), lowering neonatal and under-5 mortality (target 3.2), combating epidemics (e.g., AIDS, tuberculosis, malaria) and communicable diseases (e.g., hepatitis) (target 3.3), and decreasing deaths from traffic accidents (target 3.6). Target 3.1 of SDG 3 is to reduce global maternal mortality, which is currently at 211 per 100,000 births, to fewer than 70 per 100,000 births. The main reason for maternal mortality is pregnancy complications, which are often the result of inadequate knowledge about maternal health care, a lack of screening, and the failure to provide timely hospital treatment for complications. Countries with high maternal mortality rates tend to have poor health infrastructure, a lack of technologies and skilled health workers, and an overburdened medical system [68,69]. To address these issues, big data-driven health systems can be used to fill gaps in resource mobilization and enable early prediction, monitoring, online diagnosis, and medical consultation. Providing affordable access to technology for health care through digital services to countries that do not have robust information technology (IT) and healthcare infrastructures is a challenge. Increased investment in general health coverage is essential for establishing centers for IT/technology-enabled health services that are available to the population in remote regions.
Cloud-based data mining and IoT services [70][71][72], data-driven ML models (e.g., biomarkers, neural network models, Bayesian networks and classifiers, rule-based tree models, random forest approaches, particle swarm optimization, boosting models) [73][74][75][76][77][78][79][80], and mobile healthcare (mHealth) applications are used to monitor maternal health data, provide adequate maternal healthcare, and predict pregnancy risks (e.g., hypertension, preeclampsia, gestational diabetes). mHealth services are applied to detect early risk of maternal mortality using social, demographic, gynecological, and obstetric predictor variables [81], to identify maternity risks based on a pregnancy database to support decision-making [82], and to improve maternal care through mobile diagnostics by accessing data that predict early symptoms of hypertension and recommending countermeasures [83]. It has been shown that mHealth can be successfully used to treat hypertensive disorders in pregnancy and improve clinical care [84]. Mobile apps such as PIERS on the Move can assist healthcare professionals in performing prenatal screening, measuring blood pressure and dipstick proteinuria, and assessing symptoms [85]. Wu et al. [86] present an early warning diagnostic tool that uses ML methods and biomarker studies to provide highly accurate results for early prediction of gestational diabetes. Although national/regional, well-maintained, and regularly updated longitudinal data in the form of electronic health records can be used for maternal and clinical purposes [87][88][89], missing data, insufficient sample size, class imbalance, misclassification of socioeconomic features, and ML algorithms can lead to severe prediction bias [90].
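Why class imbalance can bias predictions is easiest to see with a small worked example. The sketch below uses made-up numbers (20 high-risk cases among 1000 pregnancies) and a trivial model that always predicts "low risk"; neither the data nor the function name comes from the cited studies.

```python
def accuracy_and_recall(y_true, y_pred):
    """Return (accuracy, recall for the positive class) for binary labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_pos = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    positives = sum(y_true)
    recall = true_pos / positives if positives else 0.0
    return correct / len(y_true), recall


# Hypothetical data: 20 high-risk cases (label 1) among 1000 pregnancies.
y_true = [1] * 20 + [0] * 980
# A trivial model that always predicts the majority class "low risk" (label 0).
y_pred = [0] * 1000

accuracy, recall = accuracy_and_recall(y_true, y_pred)
# accuracy is 0.98 although not a single high-risk case is detected (recall 0.0).
```

Accuracy alone would make this model look excellent, which is why metrics such as recall for the minority class matter when the outcome of interest is rare.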
To reduce neonatal and under-5 mortality rates (target 3.2), their causes such as premature births, poor nutrition, poverty, and lack of medical facilities and health care must be addressed. Research shows that AI-based and data-driven decision support systems in neonatal intensive care units can provide affordable, accessible, and highly accurate systems to support neonatal care (e.g., diagnosis, prognosis, monitoring) [91][92][93][94][95][96]. However, early warning models for predicting infant mortality that incorporate biological, social, demographic, and ethnicity characteristics [97,98] are not without challenges, as the clinical application and interpretability of ML models require high-quality data and a deep understanding of clinical data. Target 3.3 is to prevent deaths from epidemics (e.g., AIDS) and airborne and waterborne diseases. AI and data-driven models can be applied in various tools to prevent the spread of disease, enable online tracking, monitoring, and diagnosis, and provide access to affordable drugs/medical products. Digital systems can be effective in raising awareness of HIV prevention, increasing accessibility of testing services, improving uptake of preexposure prophylaxis, and developing technologies and big data algorithms for efficient monitoring of incidence, deaths, mental health, and patient care [99]. Medical imaging, computed tomography, magnetic resonance imaging, echocardiography, and mammography with DL models are state-of-the-art techniques for predicting communicable and non-communicable diseases with higher accuracy, such as tuberculosis from chest radiographs, various cancers from tissue images, neurological diseases, bone fractures, and hemorrhages [100][101][102]. The current use of AI in diagnosis, patient mortality risk assessment, disease outbreak prediction, surveillance, and policy-making may improve healthcare facility performance, resource allocation, and healthcare management in the future [103].
While AI and DL models have the potential to improve predictive accuracy, productivity, performance, and cost-effectiveness in medicine, they also raise concerns related to patient-physician relationships, data security, and privacy [104]. The large-scale adoption of AI technologies poses potential threats that should be considered in the development of AI so that the security, traceability, transparency, explainability, validity, and verifiability of AI applications are ensured in our everyday lives [105]. In low- and middle-income countries, there is an increased need for research on user-centered AI and ML models, with implementation and adherence to statistical, ethical, and regulatory global standards and guidelines being essential [103].
Traffic accidents are a public health issue and cause many deaths and injuries worldwide each year, so reducing them is part of SDG 3 (target 3.6). A large proportion of traffic accidents are due to human error, weak enforcement of traffic laws, poor management of traffic systems, bad infrastructure, and inadequate driver education. AI and data-driven models, such as automated systems, IoT-based ML models for traffic accident risk prediction and smart transportation, and models for collision avoidance, pedestrian movement prediction, and traffic monitoring systems, can be used for traffic control, early prediction of traffic congestion, and driver safety training, thus contributing to a reduction in traffic fatalities and injuries [106,107].

SDG 4: Quality Education
Free, equitable, and quality primary and secondary education for all is the first target (target 4.1) of SDG 4. The data show that in many countries, a large proportion of children, especially girls (cf. gender inequality), do not receive formal primary and secondary education, and the dropout rate among these children is high. The level of education is directly correlated with economic and social outcomes, mainly poverty reduction, health improvement, agricultural productivity, and skilled human resources. Data-driven and AI-enabled systems can be game changers in providing high-quality primary and secondary education for all, especially in remote regions where a formal education infrastructure is inadequate.
In digital learning (D-learning), which includes electronic learning (E-learning) and mobile learning (M-learning), teaching, learning, and study activities are supported by digital and electronic means [108]. In D-learning, AI and data-driven frameworks provide access to educational resources, monitor the progress of students, and allow self-regulated, flexible, and effective learning. This requires the development of a robust digital framework for AI and data-driven learning that is accessible to all. DigCompOrg is such a conceptual meta-framework applied as an evaluation tool for governmental, educational, and other institutions to effectively adapt and integrate digital technologies into teaching [109]. Blayone et al. [110] discuss a theoretical framework for democratized, collaborative D-learning that facilitates socially and cognitively rich multidimensional learning. Yen et al. [111] discuss a digital framework for domain-specific and domain-general self-learning systems. D-learning offers the possibility of a student-centered learning environment, which is not possible in a classroom setting where teacher-centered learning is the predominant approach. Student-centered learning, in which students participate in the learning process in a self-determined and proactive manner and compile their own course of study [112][113][114], can be useful in cases where expert educators are scarce. Yin et al. [115] analyze students' learning behavior with ML models using their eBook learning log data. Such studies can help improve D-learning platforms and user-centric, customizable digital environments for academic applications.
Emerging areas in D-learning are AI-based education systems, AI tutors, AI chatbots, smart AI classrooms, personalized and adaptive learning environments, technologies to support learners with cognitive disabilities, online distance learning, mobile game-based learning and gamification, augmented reality (AR)-, virtual reality (VR)-, and extended reality (XR)-based learning, and learning by design [116][117][118][119][120][121][122][123][124][125][126][127]. Advances in these technologies offer potential for learning complex topics at the school/university level by enabling simulations of rare geographical, ecological, and other complex events. AR, VR, XR, and other interactive visualization tools enable a multimodal, collaborative work environment where learners can build a deep and detailed understanding of a subject, become motivated, develop positive attitudes toward learning, and improve critical thinking. AI-based technologies with robust infrastructure and common, unified data standards can be applied to educational tasks in school/university settings or in online virtual environments. They provide quality education for all by enabling systematic and collaborative learning and increasing productivity, performance, and creativity [128], by improving the level of services (e.g., distribution of learning materials, instructional activities by pedagogical agents) [119], by using chatbots as teaching assistants, by providing language support to teachers interacting with foreign students, by translating educational materials into each student's native language, and by helping teachers assess student performance (e.g., automated grading, online exams) [129,130].
Worldwide accessibility of D-learning platforms would democratize education, increase equity, reduce learning costs, decrease time and emissions from travel/commuting, allow freedom and flexibility to students, and enable continuous learning and rapid access to information from diverse sources. The challenges in D-learning environments are to achieve a viable universal design for a functional, systematically implemented digital framework, to ensure acceptance of the learning environment, to provide broad accessibility around the world [131], and to prepare educators/teachers to use AI technologies safely and effectively to avoid bias [132]. The impact of D-learning on the psychological, emotional, social, and cognitive skills of students in different age groups has not yet been explored in detail. The AI-based education system should incorporate a learning science-driven approach that can fully realize the potential of AI in education. Luckin et al. [128] emphasize that learning is an interdisciplinary science, and therefore designing and developing AI-based technology and tools should involve interdisciplinary collaboration (e.g., psychology, sociology, computer science, pedagogy, and cognitive science) between AI developers, researchers, educators, and learners.

SDG 5: Gender Equality
SDG 5 calls for gender equality and empowerment of all women and girls. The UN criticizes the fact that women worldwide do not participate equally in political and corporate decision-making and are threatened by physical and sexual violence and child marriages. Regarding the question of the extent to which data-driven methods have supported equality, it can be noted that data collection has already proven to be problematic: Information and data gaps are often a barrier to achieving the 17 SDGs, and SDG 5 is no exception, as progress on gender equality depends on closing the so-called "gender data gap" [133]. Criticisms within the SDG context include the lack of diversity in datasets [23], existing data gaps (e.g., 44 out of 54 gender equality indicators are globally not reliably monitored), and the lack of gender-disaggregated statistics [133,134]. As long as discussions around big data and new data technologies remain "gender-blind" [133], the potential of big data and AI to reduce gender inequality and gender bias [135] will not be fully realized. Efforts to address these issues include the creation of Equal Measures 2030 to put data and data-driven tools into practice, such as the SDG Gender Index, which was created as a new tool for data collection and advocacy [136].
Areas in which data-driven approaches have already contributed to partial achievement of SDG 5 targets are: (1) Big data analytics enable the real-time monitoring of gender discrimination worldwide [137] (target 5.1); (2) AI systems can help eliminate gender bias in recruitment decisions [4] (target 5.5); and (3) Internet access, blockchain, and mobile banking support women's (economic) empowerment [4] (target 5.b). In a pilot program by microfinance firm Swadhar, Fidelity Information Services (FIS), and Citibank, women receive free phones along with a money management app called Saathi to track their income and expenses. On the one hand, the app helps women better understand their financial behavior and spend responsibly; on the other hand, it collects data that can be analyzed to provide services that subsequently offset smartphone and internet access expenses [138].
In addition to the positive impact that data-driven technologies can have on achieving SDG 5, there are also drawbacks. Natural language processing (NLP) and ML tools (e.g., smart algorithms, word embeddings, image recognition) may disseminate, reinforce, and amplify gender bias found in text data [139]: ML algorithms [23] and word embeddings [140] uncritically trained on news articles as well as current image captioning models [141] exhibit gender stereotypes and tend to amplify gender bias in training data. Gender debiasing methods to recognize and mitigate gender bias in NLP are relatively new and not yet sufficient to consistently debias models end-to-end for many applications [139], or to provide gender-neutral modeling [142].
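To illustrate how gender bias in word embeddings can be made measurable, the following minimal sketch scores a word by how much closer its vector lies to "he" than to "she". This is one simple association measure chosen for illustration, not the method used in the cited studies, and the three-dimensional vectors are hypothetical toy values rather than output of a trained model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def gender_bias_score(word_vec, he_vec, she_vec):
    """Positive: the word sits closer to 'he'; negative: closer to 'she'."""
    return cosine(word_vec, he_vec) - cosine(word_vec, she_vec)


# Hypothetical toy embeddings (assumption: real embeddings have hundreds
# of dimensions and are learned from text corpora such as news articles).
toy_embeddings = {
    "he": [1.0, 0.0, 0.2],
    "she": [0.0, 1.0, 0.2],
    "engineer": [0.8, 0.2, 0.5],
    "nurse": [0.2, 0.8, 0.5],
    "teacher": [0.5, 0.5, 0.5],
}


def bias_report(embeddings, targets):
    """Score each target word against the 'he' and 'she' anchor vectors."""
    he, she = embeddings["he"], embeddings["she"]
    return {w: gender_bias_score(embeddings[w], he, she) for w in targets}


report = bias_report(toy_embeddings, ["engineer", "nurse", "teacher"])
for word, score in report.items():
    print(f"{word}: {score:+.3f}")
```

In this toy setup, "engineer" receives a positive score, "nurse" a negative one, and "teacher" a score of zero. Debiasing methods aim to move occupation words toward the neutral case without destroying the other information the vectors carry.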

SDG 6: Clean Water and Sanitation
SDG 6 envisions ensuring water security, i.e., the sustainable management and availability of water in sufficient quantities and of acceptable quality, and achieving sanitation and hygiene for all. Climate change, overpopulation, agricultural demands, and pollution have significant impacts on the quality and availability of water resources [143,144]. According to the UN, billions of people worldwide suffer from water scarcity and lack access to safe drinking water, even in places where water is abundant (e.g., the Brazilian Amazon region) [143], creating a water security crisis that will exacerbate water stress in agriculture and energy production. Dealing with this crisis requires integrated strategies that focus on water management, improving water availability, security, and quality [145].
Monitoring progress toward SDG 6 is challenging, as methods for measuring the quality and use of water and sanitation services are expensive and complex [146], for example because of the necessity to collect and evaluate large amounts of spatial data. In order to globally assess water security, Giupponi and Gain [147] present a geographical information system software tool that is used to process huge amounts of spatial information, and by which a spatial multi-criteria analysis framework is derived [145]. To improve the performance of water reservoirs under climate change (e.g., water scarcity), Elhassnaoui et al. [148] propose a program for the real-time control of dams that can help determine optimal schedules and project strategies in terms of drought mitigation, water security, energy conservation, and agricultural development. Rainfall and evaporation data are provided by temporal downscaling and used by a real-time algorithm (based on water balance equations and rule curves) in conjunction with a hydrologic modeling system [148]. The program was applied to the Hassan-Addakhil reservoir in Morocco, and the study shows that the reservoir's efficiency could be improved [148]. This successful application links to the call by Moumen et al. [144] for Morocco to develop advanced tools for water management, such as the establishment of a national database whose data on soil, land use, land cover, hydrology, etc. could be incorporated into mathematical tools for modeling and assisting decision-makers in determining appropriate management strategies.
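The combination of water balance equations and rule curves mentioned above can be sketched as follows. This is a simplified illustration under stated assumptions, not the program of Elhassnaoui et al. [148]: the proportional rationing rule, the function names, and all input values are hypothetical, and all quantities are volumes per time step in one common unit.

```python
def release_from_rule_curve(storage, target_storage, demand):
    """Hypothetical operating rule: release the full demand when storage is
    at or above the rule-curve target, otherwise scale the release down in
    proportion to the storage shortfall."""
    if target_storage <= 0 or storage >= target_storage:
        return demand
    return demand * storage / target_storage


def simulate_reservoir(initial_storage, capacity, inflows, rainfall,
                       evaporation, demands, rule_curve):
    """Step a single reservoir through time with a simple water balance."""
    storage = initial_storage
    history = []
    for inflow, rain, evap, demand, target in zip(
            inflows, rainfall, evaporation, demands, rule_curve):
        # Water available in this step before any release is made.
        available = max(storage + inflow + rain - evap, 0.0)
        release = min(release_from_rule_curve(storage, target, demand),
                      available)
        # Balance: S(t+1) = S(t) + inflow + rain - evaporation - release - spill
        unconstrained = available - release
        spill = max(unconstrained - capacity, 0.0)
        storage = unconstrained - spill
        history.append({"release": release, "spill": spill,
                        "storage": storage})
    return history


# Hypothetical three-step example (values are illustrative only).
history = simulate_reservoir(
    initial_storage=90.0, capacity=100.0,
    inflows=[20.0, 5.0, 60.0], rainfall=[2.0, 0.0, 5.0],
    evaporation=[3.0, 4.0, 3.0], demands=[30.0, 30.0, 30.0],
    rule_curve=[80.0, 120.0, 100.0],
)
for step, record in enumerate(history, start=1):
    print(step, record)
```

In the example, the full demand is released in the first step, releases are rationed once storage falls below the rule curve, and the large inflow in the third step fills the reservoir and causes a spill.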
To achieve data-driven policymaking, data management is employed to decide how and what to localize: The localization of SDG 6 is explored by disaggregating Key Performance Indicators (KPIs) and interlinkages between SDGs, a process that also yields new KPIs [149]. Requejo-Castro et al. [150] propose a data-driven Bayesian network approach to identify and interpret SDG 6 interlinkages across the 2030 Agenda, as monitoring SDGs requires combining conventional indicator-based frameworks with approaches that capture the links and interdependencies between the SDGs and their targets. Such data-driven indicators are used to disaggregate water stress at higher spatial and temporal resolution [151] and to better assess hydro-political risks [152]. In the risk analysis for managing water systems, real-time data provided by IoT systems are applied [4].
In addition to data collection and analysis, AI technologies are applied in monitoring and forecasting: (1) Satellite data are used for remote water supply monitoring [153]; (2) In situ and remote sensing technologies linked to IoT are applied to improve the monitoring of water and sanitation interventions to achieve targets more cost-effectively and efficiently [146]; and (3) ML is utilized for water quality monitoring and forecasting [154,155]. In hydrological forecasting, artificial neural networks are employed in various forms, for example in network structures including feed-forward neural networks, adaptive neuro-fuzzy inference systems, extreme learning machines, and recurrent neural networks [156].
AI technologies are currently applied in clean water test systems to detect water contamination and in smart water management (SWM) systems to help address the water crisis [157] and to provide water to the population [23]. Arsenic, fluoride, and nitrate contamination of groundwater and drinking water is a global problem that negatively impacts the lives of more than 300 million people [158,159]. For example, exposure to arsenic can cause arsenic poisoning in the short term [158] and lead to long-term health problems such as higher rates of bladder cancer [160]. AI (e.g., ML, DL, logistic regression, and random forests) is utilized to develop prediction models for groundwater contamination and can be used as a decision-support system as well as for devising proactive environmental management strategies, with hybrid ML-DL models being useful given the multidimensional nature of environmental data [159]. Complex water systems benefit from AI by being better modeled and simplified, and by becoming more reliable and robust (through blockchain) [4]. Urbanization could have a negative impact on water infrastructure and water resource management in the coming decades due to the associated greater water consumption, pollution, and degradation of water quality [143]. In the future, urban water resilience might be improved by further developing existing big data-driven SWM systems into cyber-physical systems, which collect and analyze water data in real time and control water infrastructures based on technologies such as web-ready sensors, IoT-driven solutions, big data mining, or cloud computing [161].

SDG 7: Affordable and Clean Energy
The goal of SDG 7 is to make affordable, reliable, and modern energy accessible to all, whereby advanced technologies can help to significantly reduce energy consumption. Big data-driven approaches, such as geographical information system software tools [147], are used for modeling and forecasting climatic and geo-ecological situations and allow conclusions regarding global access to energy over time. According to the UN, 759 million people lack access to electricity (so-called "energy poverty"), about three-quarters of whom live in sub-Saharan Africa. Hassani et al. [162] employ big data technologies to identify and predict energy poverty using satellite imagery, and their findings identify sub-Saharan Africa as the region most affected. Big data can contribute to the fight against energy poverty by combining different data sets (e.g., satellite photography, money transfers, mobile telecommunications) to provide information about current and future energy demand. Mastrucci et al. [163] conclude that the energy poverty gap is broader than indicated by the SDG 7 definition when the need for indoor cooling is considered. To estimate the energy required to meet these cooling needs, they apply the variable degree days method on a global grid. Closing this cooling gap with minimal environmental damage would require solutions such as innovative cooling technologies. SDG 7 aims not only at universal access to energy: To meet the sustainability target, energy should also come primarily from renewable sources (target 7.2), and energy efficiency should be improved (target 7.3). Ryan et al. [164] explore how smart information systems (SIS), i.e., technologies that build on big data analytics, typically facilitated by AI techniques (e.g., ML), can help solve the "energy trilemma" (i.e., energy security, energy equity, and environmental sustainability) when used in the energy sector.
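The degree days idea behind the cooling-gap estimate can be sketched in a few lines. Cooling degree days accumulate the amount by which the daily mean temperature exceeds a base temperature; in a variable degree days setting, that base temperature is allowed to differ between locations or building types instead of being fixed. The sketch below is an assumption-laden illustration, not the procedure of Mastrucci et al. [163]: the grid cells, temperatures, and base temperatures are hypothetical, and how the base temperature is derived is not covered here.

```python
def cooling_degree_days(daily_mean_temps_c, base_temp_c):
    """Sum of daily exceedances of the base temperature, in degree days."""
    return sum(max(0.0, t - base_temp_c) for t in daily_mean_temps_c)


# Hypothetical grid cells: identical weather, different base temperatures
# (e.g., reflecting different building characteristics).
grid_cells = {
    "cell_a": {"temps": [24.0, 28.0, 31.0, 26.0], "base": 26.0},
    "cell_b": {"temps": [24.0, 28.0, 31.0, 26.0], "base": 22.0},
}

cdd_by_cell = {
    name: cooling_degree_days(cell["temps"], cell["base"])
    for name, cell in grid_cells.items()
}
print(cdd_by_cell)  # {'cell_a': 7.0, 'cell_b': 21.0}
```

The same weather thus translates into very different cooling needs depending on the base temperature, which is why the choice of base temperature matters when such estimates are aggregated over a global grid.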
The use of SIS in smart grid systems enables renewable energy integration, can provide energy security, and ensures affordable and sustainable energy for an ever-increasing demand (energy distribution). By guaranteeing energy efficiency and timely energy supply at optimal cost, smart grids will likely facilitate the achievement of SDG 7 in the future [4], a goal that will require a massive deployment of mini-grids in rural areas of developing countries. Lorenzoni et al. [165] present a shared database for load profiles of 61 rural mini-grids and identify five archetypal load profile clusters using a clustering procedure. This approach allowed the lack of available data and the difficulties in identifying demand to be overcome. For the design of energy systems, ANN is utilized as a forecast model for available renewable resources [156], and in the provision of energy services to the population, the integration of variable renewables can be supported by smart grids that match electricity demand to times when the sun is shining or the wind is blowing [23]. AI technologies are already contributing to SDG 7 by predicting renewable energy source performance and management needs and by enabling sustainable energy consumption through smart meters [4]. However, AI-based technologies can also compromise outcomes by requiring computational resources that are only available in large computing centers, which have very high energy requirements and carbon footprints [23]. SDG 7 is closely connected to SDG 11 (sustainable cities and communities), as smart sustainable cities depend on the provision of clean and sustainable energy: Targets of SDG 7 should accordingly promote investment in clean energy technologies (target 7.a) and should expand infrastructure for the provision of sustainable energy services to all (especially in developing countries) (target 7.b).
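How a clustering procedure can group mini-grid load profiles into archetypes may be illustrated with a small k-means sketch. The text does not state which algorithm Lorenzoni et al. [165] used, so k-means, the peak normalization, the deterministic initialization, and the four-slot profiles below are all assumptions made for illustration.

```python
def normalize(profile):
    """Scale a load profile to a peak of 1 so that clusters reflect shape."""
    peak = max(profile)
    return [v / peak for v in profile] if peak > 0 else list(profile)


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_profile(profiles):
    """Element-wise mean of a non-empty list of profiles."""
    n = len(profiles)
    return [sum(values) / n for values in zip(*profiles)]


def k_means(profiles, k, iterations=20):
    """Plain k-means; the first k profiles serve as initial centroids
    (a deterministic choice made here only for reproducibility)."""
    centroids = [list(p) for p in profiles[:k]]
    labels = [0] * len(profiles)
    for _ in range(iterations):
        # Assign every profile to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in profiles
        ]
        # Recompute each centroid; keep the old one if a cluster is empty.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == j]
            new_centroids.append(mean_profile(members) if members
                                 else centroids[j])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical loads in kW for four time slots (night, morning, midday,
# evening) of four mini-grids.
raw_profiles = [
    [1.0, 2.0, 2.0, 8.0],     # evening peak
    [5.0, 20.0, 30.0, 10.0],  # midday peak
    [2.0, 5.0, 4.0, 18.0],    # evening peak
    [1.0, 4.0, 6.0, 2.0],     # midday peak
]
labels, centroids = k_means([normalize(p) for p in raw_profiles], k=2)
print(labels)
```

Because the profiles are normalized to their peak, the two evening-peak grids end up in one cluster and the two midday-peak grids in the other, regardless of their absolute size.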
Research is exploring the potential of data-driven smart solutions (e.g., IoT and big data technologies) to improve energy efficiency and environmental sustainability of smart sustainable cities (cf. data-driven urbanism): Chui et al. [166] present pilot systems and prototypes that demonstrate how AI, cognitive computing, IoT, and big data analytics can support the process of energy sustainability in smart cities. A case study conducted by Bibri and Krogstie [167] shows that SDG 7 is supported by establishing a big data ecosystem as part of information and communication technology architectures, even before data-driven applications are adopted. Based on routinely collected data and the use of a new urban intelligence function, models of smart sustainable cities are created to monitor, understand, analyze, and plan such cities to improve their energy efficiency and environmental sustainability [167].

SDG 8: Decent Work and Economic Growth
SDG 8 calls for the promotion of "sustainable economic growth, full and productive employment and decent work for all". Sustainable economic growth can only be achieved by decoupling economic growth from environmental degradation, i.e., by improving global resource efficiency in consumption and production (target 8.4). The achievement of SDG 8 is significantly impaired by the ongoing COVID-19 crisis. The UN report that the pandemic has led to the loss of 255 million jobs and will likely lead to an increase in youth unemployment in the future. Global real GDP collapsed in 2020, and in many countries economic growth is not expected to return to pre-pandemic levels until 2022 or 2023. Big data can contribute to achieving the goal of economic growth (target 8.1) through data analytics and the monitoring of indicators [168], for example through network analyses of logistical networks that contain information about economic growth [169]. AI can promote job creation, entrepreneurship, and innovation (target 8.3), for instance in the bio-based economy, which is likely to be one of the main beneficiaries of the AI-driven acceleration of data mining and forecasting and the resulting creation of new companies and jobs in the microbial biotechnology sector [170].
Advances in financial technology facilitate opportunities for "digitized financial inclusion" [171]: (1) ML is applied in financial market forecasting (e.g., to predict stock trends) [172] and (2) AI, combined with the now widespread use of smart phones in developing countries, gives access to digital financial services to people who were previously unbanked [171], thereby promoting and expanding access to banking and financial services for all (target 8.10). In this context, microfinance needs to be mentioned, which provides microcredit to economically disadvantaged customers who do not have access to conventional banking services. Microfinance plays an important role in developing countries, and in the future, the use of AI technologies could contribute to broader access to finance for economically disadvantaged population segments. Technologies in microfinancing have evolved from management information systems, online lending, and the use of mobile banking to AI use in dynamic information systems in recent years: ML and DL technologies speed up and simplify the lending process by making faster decisions about potential customers and automatically calculating the likelihood of default, even for customers with no formal credit history and no bank account. To do this, the customer installs an app on their smart phone, and the lender can use the information on the customer's phone and browser to assess risk and, if approved, disburse funds within minutes [138].
AI technologies have multiple fields of application in the workplace, with a focus on STEM (science, technology, engineering, and math) job creation [4] and improving employees' working conditions: Smart information systems (SIS) can help workers by reducing demanding work, supporting the completion of complex tasks, and increasing productivity in the workplace [164]. AI technologies can simplify workers' lives, e.g., through intelligent transportation systems that lead to more efficient commuting and flexible working [4], but there are also downsides to the adoption of AI in the workplace: While SDG 8 aims at strengthening the relationship between an organization and its employees, Braganza et al. [173] find that the adoption of AI has the opposite effect and is detrimental to decent work. SIS and AI raise ethical concerns and carry risks that could hinder achieving SDG 8, such as algorithmic bias, job loss, power asymmetries, and inequality [164]. Automated decision-making algorithms have been shown to be biased, to lack ethical oversight, and to have limited transparency, which would reinforce unequal access to funding [171].

SDG 9: Industry, Innovation, and Infrastructure
According to the UN, investing in basic, sustainable infrastructure is essential for improving the living standards of communities worldwide [174]. This refers to the fundamentals, that is, the issues that comprise common human needs around the world. However, just because we are talking about basics does not mean that addressing these concerns properly is simple. In the case of infrastructure, this requires coordinated, long-term planning that spans geographic, political, and cultural boundaries. Although innovation comes last in the description of SDG 9 ("Build resilient infrastructure, promote inclusive and sustainable industrialization and foster innovation"), it can be argued that innovation will underpin the entire SDG 9 agenda in the future, because without innovation we are left with current infrastructure technologies and industrialization models that remain on an unsustainable path [175]. Denoncourt [176] illustrates the important connection between SDG 9 targets, social responsibility, intellectual property rights, and corporate longevity.
In Kynčlová et al. [177], a data-driven metric is introduced to measure the progress of a country towards achieving industry-related targets of SDG 9. The SDG 9 index represents a comprehensive but straightforward approach for assessing the extent to which countries have industrialized while promoting social inclusiveness and minimizing natural resource use and environmental impacts. The selection of indicators is based on the global indicator framework for the goals and targets of the 2030 Agenda adopted by the UN General Assembly. The resulting SDG 9 index benchmarks inclusive and sustainable industrial development in 128 economies over the period 2000-2016. Industrialized economies outperform other countries, with the top five leading the 2016 ranking being Ireland, Germany, the Republic of Korea, Switzerland, and Japan. The calculated scores of the SDG 9 index show in which dimensions countries are leading or lagging behind other economies. Thus, the SDG 9 index forms a valuable tool for policy makers and analysts.
Raut et al. [178] studied the predictors of sustainable business performance through big data analytics in the context of developing countries. Data were collected from manufacturing firms that adopted sustainable practices. A hybrid Structural Equation Modelling-Artificial Neural Network model was used to analyze 316 responses of Indian professional experts. The study's findings show that management and leadership style, as well as state and central government policies are the most important predictors of big data analytics and sustainability practices. The study provides theoretical and practical insights into how manufacturing firms can improve their sustainable business performance from an operations management perspective and details big data implementation issues when accomplishing sustainability practices in businesses in emerging economies.
Ilie et al. [179] suggest monitoring SDG 9 with global open data and open software. They present training materials for using open-source software, along with freely available high-resolution global geospatial datasets. These training materials provide step-by-step guidance on calculating the status of SDG indicators to support monitoring of SDG progress. To demonstrate the usefulness of their approach, they conducted a case study in rural Tanzania. Malhotra et al. [180] propose that data from GPS devices can be used to control traffic and improve public transport to achieve indicators, such as indicator 9.1.1. "Proportion of the rural population who live within 2 km of an all-season road".

SDG 10: Reduced Inequalities
SDG 10 aims to "reduce inequality within and among countries" and refers to all types of discrimination and lack of access in the broadest sense. Inequality within and between countries is a persistent problem, and while much progress has been made globally, it remains one of the most problematic challenges even for developed economies. In the EU, which has adopted numerous policies and directives prohibiting inequality and discrimination based on race or ethnic origin, gender, religious belief, age, disability, or sexual orientation, the legal framework has numerous gaps that prevent it from adequately addressing discrimination [181]. In Europe, there is increasing inequality in terms of economic performance (measured by GDP) and social cohesion (measured by income, employment rates, and opportunities), not only between metropolitan areas but also within them, between affluent central districts and economically disadvantaged peripheral districts [182]. Lelo et al. [182] emphasize the need for data-driven policies that enable national and local authorities to identify needs and address diverse and emerging forms of marginality and social exclusion.
The inclusion of SDG 10 is not based on broad agreement among member states but was only included as a stand-alone goal against considerable opposition [183,184]. This contestation is reflected in a relatively weak SDG 10 that is characterized by imprecise language and that is missing a road map to achieve the goal [185,186]. For example, there is not a single indicator that explicitly refers to inequality. Not surprisingly, official progress reported by the UN on SDG 10 is a challenging issue due to the complexity and impreciseness of its targets. In the 2018 Sustainable Development Report, the progress on SDG 10 is described only in an overview chapter concentrating on SDGs 10.1, 10.a, 10.b, and 10.c [187]. SDG 10 might be the data-weakest SDG, as globally available and comparable data are rare. There are a few data-driven approaches, such as a speech-to-text analysis of local radio content to uncover discrimination issues [180] and an SDG 10 index score applied to India [188].
Truby [171] argues that data-driven and AI-driven technologies have a great impact on inequality. Multiple AI-driven apps and digital services have been established to benefit the developing world, such as through AI-powered medical diagnosis. Critics, however, liken the growth of tech start-ups to "tech colonialism" [189], as AI is developed in and creates revenue for developed countries but disrupts markets in emerging and developing countries at the expense of existing service providers and workers. Examples include firms that were founded in Africa with the intention of benefiting Africa but ended up being owned and managed by Europeans and Americans, who provided most of the capital and receive most of the profits. Despite the best intentions of AI developers with respect to the software's goals, AI may not directly benefit developing countries for several reasons: The software may be so successful in disrupting the market and providing an attractive service for consumers that existing service providers are unable to compete. For example, Uber does create opportunities for individual freelance employment in developing countries, but at the expense of local taxi companies. This drives up profits for company owners, who are often based in Silicon Valley or London because of the opportunity to obtain seed start-up funding there. Developed countries thus have a continuing advantage in the advancement of AI technologies, which in turn generate further revenue that can be used to develop even better AI. Moreover, AI may replace existing jobs without providing opportunities for upskilling people in the economically most disadvantaged communities. When designed with sustainable development in mind, AI can instead provide people with employment and more skilled and interesting opportunities, making them more productive by eliminating unnecessary labor-intensive jobs.

SDG 11: Sustainable Cities and Communities
Projections indicate that two-thirds of the world's population will live in urban areas by 2050, creating an urgent need to apply sound urban planning approaches to create communities that can thrive sustainably [190]. According to Vaidya and Chatterji [191], the world at large is gradually taking an urban turn, as more and more people are moving to cities. Cities account for 55% of the population and produce 85% of the global GDP but also produce 75% of the greenhouse gas emissions. The issues of global sustainability cannot be addressed without focusing on sustainability at the urban level. Rozhenkova et al. [192] argue that there is a critical need for large-scale comparative urban policy data that could be used in conjunction with outcome data to identify where policies are working and where improvements are needed. Based on a comprehensive scoping review, they evaluated existing urban policy databases and concluded that they are inadequate for the purposes of comparative analysis. They developed an "ideal" urban policy database, which they argue is a key tool for achieving SDG 11. Thomas et al. [193] examined 484 data-driven indicators of urban and regional environmental sustainability drawn from 40 indexes and online data repositories to determine their suitability for measuring both urban environmental performance and equity. Despite the large number of existing indicators related to urban environmental monitoring, they found that all indicators were inadequate for evaluating progress regarding SDG 11's targets of inclusive, accessible, and sustainable urban areas (targets 11.3 and 11.7) due to a lack of benchmarks and explicit equity measures. Thomas et al. [193] recommend that future research should focus on collecting data that can be geographically disaggregated to measure distributional equity and establish locally appropriate benchmarks and realistic indicators for urban sustainability targets. Malhotra et al. [180] suggest satellite remote sensing to track encroachment on public land or spaces, such as parks and forests.
The European Commission-Joint Research Centre has developed a suite of (open and free) data and tools named Global Human Settlement Layer (GHSL) that maps the human presence on Earth (built-up areas, population distribution, and settlement typologies) between 1975 and 2015. The GHSL provides information on the progressive expansion of built-up areas on Earth and population dynamics in human settlements, with both sources of information serving as baseline data to quantify land use efficiency, listed as an indicator for SDG 11 (indicator 11.3.1) [194]. Schiavina et al. [194] show that (1) the GHSL framework allows the estimation of land use efficiency (LUE) for the entire planet at several territorial scales, opening the opportunity of lifting the LUE indicator from its Tier II classification; (2) the current formulation of the LUE is substantially subject to path dependency; and (3) it requires additional spatially explicit metrics for its interpretation. They propose the "Achieved Population Density in Expansion Areas" and the "Marginal Land Consumption per New Inhabitant" metrics for this purpose. Their study is planetary and multi-temporal in coverage, demonstrating the value of well-designed, open, and free, fine-scale geospatial information on human settlements in supporting policy and monitoring progress made towards meeting the SDGs.
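The land use efficiency indicator and the two supplementary metrics can be made concrete with a short calculation. The sketch assumes the logarithmic formulation commonly given in the UN metadata for indicator 11.3.1 (ratio of land consumption rate to population growth rate). The two supplementary functions follow a literal reading of the metric names and may differ in detail from the definitions in Schiavina et al. [194]; all input values are hypothetical.

```python
import math


def land_use_efficiency(built_start, built_end, pop_start, pop_end, years):
    """Ratio of land consumption rate (LCR) to population growth rate (PGR).

    LCR = ln(built_end / built_start) / years
    PGR = ln(pop_end / pop_start) / years
    """
    lcr = math.log(built_end / built_start) / years
    pgr = math.log(pop_end / pop_start) / years
    if pgr == 0:
        raise ValueError("population did not change; ratio is undefined")
    return lcr / pgr


def marginal_land_consumption(built_start, built_end, pop_start, pop_end):
    """Newly built-up area per additional inhabitant (literal reading)."""
    return (built_end - built_start) / (pop_end - pop_start)


def achieved_density_in_expansion(built_start, built_end, pop_start, pop_end):
    """Additional inhabitants per unit of newly built-up area (literal reading)."""
    return (pop_end - pop_start) / (built_end - built_start)


# Hypothetical city: built-up area in km2 and population over 15 years.
lue = land_use_efficiency(100.0, 121.0, 1_000_000, 1_100_000, years=15)
mlc = marginal_land_consumption(100.0, 121.0, 1_000_000, 1_100_000)
apd = achieved_density_in_expansion(100.0, 121.0, 1_000_000, 1_100_000)
print(f"LUE ratio: {lue:.2f}")
print(f"km2 per new inhabitant: {mlc:.5f}")
print(f"new inhabitants per km2 of expansion: {apd:.0f}")
```

A ratio above 1, as in this example, means built-up area grew faster than population. The ratio alone does not reveal how much land each new inhabitant consumed, which is the kind of information the supplementary metrics are meant to add.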
Knowledge of the global spatial distribution and evolution of human settlements has become one of the most important requirements for monitoring progress in sustainable development of urban and rural areas. Corbane et al. [195] present recent developments in processing big Earth observation data to improve GHSL data, and they outline the results from two experiments with Sentinel-1 and Landsat datasets based on the Joint Research Centre Earth Observation Data and Processing Platform. A comparative analysis of the results of extracting built-up areas from remote sensing data and through processing workflows shows how information production (supported by a data-intensive computing infrastructure for optimization and multiple tests) can improve mapping capabilities, the handling and processing of such datasets, and the reliability and consistency of output information within the GHSL domain. The approach supports powerful comparative monitoring of land use efficiency and access to basic services under SDG 11.
Gue et al. [156] state that the largest number of journal articles addressing AI is available for SDG 11. The articles cover a wide range of applications, such as urban transport air pollution, prediction of house rental prices, prediction of energy consumption, green building design, simulation of soundscape, prediction of walkability, categorization of slum areas, prediction of air emissions, estimation of flood susceptibility and damages, and characterization of generated waste.

SDG 12: Responsible Consumption and Production
Responsible consumption and production are critical to a sustainable world [196]. Human and environmental systems interact through the economic system in various ways that have caused many unsustainable issues. Solving these problems is a non-trivial exercise and could be considered one of the world's "wicked" problems. SDG 12 aims to ensure sustainable consumption and production patterns and has a long history in various international conferences and actions. Dubey et al. [197] show that organizations engaged in sustainable development programs are increasingly paying attention to synergistic relationships between focal firms and their partners to achieve SDG 12. They conclude that (1) big data and predictive analytics have a significant positive impact on sustainable consumption and production among partners, and (2) organizational compatibility and resource complementarity have positive moderating effects on linking data and predictive analytics to sustainable consumption and production. Gunawan et al. [198] identify a high correlation between CSR and SDG 12.
Gasper et al. [199] state that SDG 12 indicators have major deficiencies, in particular inadequate coverage of corresponding targets and a checklist orientation that privileges counting of reports over examining their content and quality. Carlsen [200] focused on five main indicators selected by Eurostat as key factors for the development of SDG 12, i.e., (1) resource productivity, (2) average CO2 emissions from new passenger cars, (3) circular material use rate, (4) generation of waste excluding major mineral wastes, and (5) consumption of toxic chemicals.
Hermann [201] addresses the relation between marketing, data (especially AI), and SDG 12. Marketing claims to help consumers by satisfying wants and needs, but the endless pursuit of satisfying them can further fuel consumption, which in turn depletes resources, pollutes the environment, and drives climate change. In light of the environmental imperative and the stance of sustainable development, AI in marketing is a double-edged sword: On the one hand, AI applications and systems pursue sales objectives and increase consumption and its (negative) externalities. For instance, Amazon's e-commerce platform relies on AI-driven recommender systems and collaborative filtering, and the company's carbon intensity (the ratio of its carbon footprint in CO2 equivalents to its gross merchandise sales) was 122.8 g/USD in 2019 [202]. Given Amazon's multi-billion sales volume, the carbon footprint of the world's largest e-commerce company alone amounts to tens of millions of tons of CO2 equivalents annually. Moreover, energy consumption and emissions related to AI development, production, and deployment induce adverse rebound effects. On the other hand, AI in marketing can be a powerful force in promoting supply- and demand-side sustainability efforts. AI's potential to foster sustainability in marketing should be leveraged across the four Ps of the marketing mix, that is, product, price, place (distribution), and promotion (communication). AI in marketing should support consumers in making better-informed and more sustainable decisions. Hermann [201] proposes AI-powered devices and applications to continuously update and provide a current ecological footprint (e.g., CO2 emissions, water consumption) based on our purchase history and decisions. In addition, the individual ecological footprint could be compared with an individually defined social comparison group to induce a certain degree of social pressure.
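The link between carbon intensity and total footprint is a simple multiplication, which the following sketch makes explicit. Only the intensity of 122.8 g/USD is taken from the text; the sales volumes are hypothetical round numbers chosen to show the order of magnitude and are not Amazon's reported figures.

```python
GRAMS_PER_TONNE = 1_000_000


def footprint_tonnes(carbon_intensity_g_per_usd, sales_usd):
    """Footprint in tonnes of CO2 equivalents = intensity x sales volume."""
    return carbon_intensity_g_per_usd * sales_usd / GRAMS_PER_TONNE


CARBON_INTENSITY = 122.8  # g CO2e per USD, the 2019 value cited in the text

# Hypothetical sales volumes in USD (assumption, for illustration only).
for sales_usd in (1e9, 1e10, 1e11):
    tonnes = footprint_tonnes(CARBON_INTENSITY, sales_usd)
    print(f"{sales_usd:,.0f} USD -> about {tonnes:,.0f} t CO2e")
```

At this intensity, every billion USD of sales corresponds to roughly 122,800 tonnes of CO2 equivalents, so a sales volume in the hundreds of billions implies a footprint of tens of millions of tonnes.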
Beier et al. [203] identify potential big data use cases for corporate environmental management by using the example of the German automotive industry. The use cases found are: (1) Improved creation of life cycle assessments; (2) measuring energy consumption and increasing energy efficiency; (3) measurement of emissions and their reduction; (4) measurement of water consumption and its reduction; and (5) optimization of waste management. For SDG 12, a large number of journal articles addressing AI are available [156] that cover a wide range of applications such as additive manufacturing, specific production processes, cleaner production, predicting product life, renewable resource management, manufacturing practices, material flow, energy flow, energy consumption, and green technologies. Wang et al. [204] adopted fuzzy interpretive structural modeling to develop a precise evaluation framework and provide a theoretical basis for enhancing the understanding of responsible consumption and production. Malhotra et al. [180] suggest online search patterns or e-commerce transactions to reveal the pace of transition to energy efficient products.

SDG 13: Climate Action
SDG 13 presents five targets, with this work focusing on targets 13.1, 13.2, and 13.3, as they seem to have the highest potential to be achieved through digitalization. Target 13.1 aims to strengthen the resilience and adaptive capacity to climate-related hazards and natural disasters in all countries. The number of deaths or missing persons per 100,000 people is one of this target's indicators (indicator 13.1.1). Prediction and in-time warning of upcoming climate hazards is needed to reduce the number of fatalities or missing persons. Poolman et al. [205] used data-driven deterministic unified models to increase the warning lead-time for flash floods from 1-6 h to 18 h. Digitalization helps to obtain data on different influencing factors and to merge them into a single model. Weyn et al. [206] followed a more generic approach and used a convolutional neural network to forecast several atmospheric variables. The authors stated that their model was not as accurate as operational numerical weather prediction models but that it outperformed a coarse-resolution dynamical numerical weather prediction model for short- to medium-range forecasts. Computing power and ongoing research on data-driven DL approaches are likely to support the achievement of SDG 13.
Target 13.2 is to integrate climate change measures into national policies, strategies, and planning. In the face of extreme weather events, the public sector needs to prepare, yet few actions are taken, and a significant number of public authorities do not have integrated disaster management plans, despite the availability of data and predictions about potential threats [207]. Using five case studies, Saulnier et al. [208] highlighted the complexity of measuring disaster mortality and the apparent need for disaster mortality data, demonstrating that by combining multiple data sources to accurately estimate mortality, a standardized definition of mortality can be provided. This will assist policy makers in implementing appropriate measures to prevent disaster mortality and strengthen disaster risk reduction for a country's citizens. Target 13.3 aims to improve education, raise awareness, and build human and institutional capacity for climate change mitigation, adaptation, impact reduction, and early warning. Climate change mitigation is a challenge that must be faced not only by policy makers or specific industrial sectors but also by all of mankind. Climate change mitigation cannot be accomplished at the household level alone; thus, broader integration is needed, with all actions requiring support of innovation, infrastructure, development, and industrialization [209].
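As a small illustration of how the indicator for target 13.1 described above can be computed, the following sketch expresses deaths and missing persons as a rate per 100,000 inhabitants. The function name and the example figures are hypothetical and chosen only for illustration; official reporting may apply additional definitions not covered here.

```python
def disaster_mortality_rate(deaths: int, missing: int, population: int) -> float:
    """Deaths plus missing persons per 100,000 inhabitants."""
    if population <= 0:
        raise ValueError("population must be positive")
    return (deaths + missing) / population * 100_000


# Hypothetical figures for one country and one year (not taken from any source).
rate = disaster_mortality_rate(deaths=120, missing=30, population=5_000_000)
print(f"{rate:.1f} per 100,000")
```

Normalizing by population makes the figure comparable across countries of different size, which is why earlier warnings are reflected in the indicator only through fewer deaths or missing persons.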

SDG 14: Life below Water
Digitization can support the achievement of SDG 14 on the protection and sustainable use of oceans, seas, and marine resources for sustainable development. The implementation of water pollution indicators in coastal areas is a step toward helping coastal managers prevent pollution [210]. A clear understanding of cause and effect is essential, and network-based approaches can support this understanding. However, Del Río Castro et al. [211] point out that there are still research gaps in the application of digital paradigms (e.g., big data or AI) to achieve SDG 14. They identify governance hurdles and technological white spots in the application of digital paradigms that require a multidisciplinary effort. The reduction in coastal pollution index values could be achieved through several measures, but the connections in such a network are missing. Approaches from other sectors, such as industry or computer science, where the cause-and-effect connection is more elaborated, should be applied to measures to achieve SDG 14.
Emerging digitalization and IT are being used to leverage environmental supply chain sustainability. Sarkis et al. [212] show how digitalization is affecting supply chain greening and present two frameworks with exemplary practices that can advance research. Target 14.4 calls for an end to overfishing the oceans. As the fishing industry is the starting point of a supply chain, greening measures of supply chains could be applied, similar to those in the automotive industry. IoT technologies provide promising opportunities for organizations by connecting and enabling collaborations between physical objects, devices, systems, platforms, and applications [213]. Vollen and Haddara [213] investigated the benefits and challenges of an internet-based refrigerated seawater system that kept the catch at a specific temperature until delivery. Such internet-based systems offer the potential to monitor catch quotas in addition to monitoring temperature control. A transmission of the data to a monitoring system makes it possible to get a real-time overview of the quotas in specific coastal areas.
Ocean acidification poses a threat to marine fauna and flora, which is why minimizing acidification is one of the targets of SDG 14 (target 14.3). Kroeker et al. [214] conducted a meta-analysis on the reaction of organisms to acidification. By synthesizing 228 studies, they found that organisms in a laboratory environment reacted differently to acidification than in a multi-organism environment, such as that prevalent in the ocean. However, when trying to reach a comprehensive understanding, the researchers found that the variability in organisms' responses grows exponentially. Digitalization and data analytics could support this research field with forecasting approaches such as those applied in production [215] or finance [216] and in recently published approaches for well forecasting and groundwater quality [217]. Given the huge amounts of data on the organisms' possible responses in an ocean environment, data analytics approaches from other research fields could help visualize the effects of acidification on organisms.

SDG 15: Life on Land
SDG 15 calls for protecting, restoring, and promoting the sustainable use of terrestrial ecosystems, managing forests sustainably, combating desertification, halting and reversing land degradation, and halting biodiversity loss. The protection and the sustainable use of terrestrial ecosystems (targets 15.1 and 15.2) can benefit from approaches (e.g., quality management measures) that are already being implemented in various sectors of the economy, such as in the production industry. Quality management is omnipresent in this industry, and a connection can be made between quality management and sustainable use of land. Carnerud et al. [218] show changes over the past 40 years in how IT has influenced the recent trend toward sustainability and thereby changed quality management.
In the production industry, resources such as raw materials or spare parts are constantly monitored to avoid shortages, and the overuse of terrestrial resources can be seen as leading to such a shortage. SDG 15 formulates the need for the sustainable use of terrestrial ecosystems. The exploitation of a renewable resource can be connected to quality management in the same way that shortages must be avoided in industry. It might be concluded that overuse of a resource is an indicator of low quality, so that the resource cannot renew itself independently. Information technologies can assist in monitoring natural resource utilization: Applying IT measures to monitor resource use can support the achievement of SDG 15 and can prevent over-utilization, leading to sustainable use of terrestrial resources.
One of SDG 15's targets is to progress towards sustainable forest management (target 15.b). On the one hand, an exit from fossil fuels is necessary to reduce the carbon footprint [219]. On the other hand, people need alternatives for heating. One alternative to fossil heating fuels such as oil, gas, or coal could be cellulosic fuels such as wood-based products, but this results in higher forest usage. Antonescu and Stanescu [219] present technical insights for retrofitting heating chambers to switch from fossil fuels to cellulosic ones such as wood pellets. However, a switch from fossil fuels to wood pellets, in addition to increasing wood consumption, would also increase transportation, because the pellets have to be delivered to households. Digitalization measures can open up solutions to this problem. Earlier research papers [220,221] present optimization approaches that can be adapted to achieve SDG 15. Based on the travelling salesman problem, Petersen and Madsen [221] present a heuristic optimization for complex routing networks. Perpiñá et al. [220] discuss an approach for mapping potential locations for biomass plants and optimizing transportation. This approach could be applied to wood pellet heating systems, so that a map of all users of these heaters and a visualization of user demand could help reduce transportation. Gejdoš et al. [222] address this problem in more detail and present a mathematical optimization approach based on three criteria: time, cost, and fuel consumption per volume of forest biomass. We argue that there are several ways to overcome hurdles such as increased transportation costs when switching from fossil to cellulosic fuels. Digitalization, i.e., increasing the computational power available to heuristic optimization approaches for complex networks, has the potential to solve this problem. Science is applying various optimization approaches, for example in the production industry, and these approaches could also be used to achieve SDG 15.
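To make the routing idea concrete, the following minimal sketch applies a simple nearest-neighbour heuristic to a travelling-salesman-style delivery round. It is not the method of Petersen and Madsen [221] or Gejdoš et al. [222]; the function names, the depot location, and the household coordinates (in km) are hypothetical and serve only to illustrate how a heuristic can shorten a delivery tour compared with an arbitrary visiting order.

```python
import math


def route_length(points, order):
    """Length of the closed tour that visits points in the given order and returns to the start."""
    return sum(
        math.dist(points[a], points[b])
        for a, b in zip(order, order[1:] + order[:1])
    )


def nearest_neighbour_tour(points, start=0):
    """Greedy heuristic: always drive to the closest household not yet served."""
    unvisited = set(range(len(points))) - {start}
    tour = [start]
    while unvisited:
        last = points[tour[-1]]
        # Ties are broken by index so the result is deterministic.
        nxt = min(unvisited, key=lambda i: (math.dist(last, points[i]), i))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour


# Hypothetical coordinates in km; index 0 is the depot, the rest are households.
points = [(0, 0), (3, 4), (0, 4), (3, 0)]
naive_order = list(range(len(points)))  # deliveries in the order the requests arrived
tour = nearest_neighbour_tour(points)

print("naive order:", naive_order, route_length(points, naive_order), "km")
print("heuristic  :", tour, route_length(points, tour), "km")
```

The nearest-neighbour rule is fast but not guaranteed to find the shortest tour; it is shown here only because it is easy to follow. Criteria such as time, cost, or fuel consumption could be substituted for distance by replacing `math.dist` with a different cost function.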

SDG 16: Peace, Justice, and Strong Institutions
SDG 16 aims at promoting peaceful and inclusive societies for sustainable development, access to justice for all, and building effective, accountable, and inclusive institutions at all levels. Van der Velden [223] connects the smartphone to the SDGs and points out the correlation between advancing digitalization, represented by the smartphone, and inequality in society. An increase in smartphone usage is linked to an increased usage of raw materials and components. Both raw materials and components can have a negative impact on poverty or child labor, as the production sites are located in countries where this is tolerated by society and the judiciary. Nevertheless, digitalization and especially the use of smartphones can also have a positive effect on the achievement of the SDGs, e.g., SDG 1 [14], SDG 5 [138], and SDG 8 [171].
SDG 16 states, among other targets, the targets of overall reduction in death rates (target 16.1), reduction in violence against children (target 16.2), equal access to justice for all (target 16.3), reduction in crime (target 16.4), and reduction in corruption and bribery (target 16.5). Big data and related technologies are being used, albeit still to a limited extent, in the justice systems to fight crime [224,225]. Further, predictive algorithms could help improve accuracy, efficiency, and fairness [226]. Laberge and Touihri [227] show that there are opportunities to increase CSR by integrating people. Indicators of SDG 16 were derived for Tunisia, and people are part of all measures, as stated in the goal description. However, the authors mention that there are major challenges in measuring the success of all inclusion measures and the possible overload and skepticism of people. The outcome of this action in Tunisia can only be evaluated after some time, but it is a first step to connect and fully include people. Digitalization has the potential to provide connectivity to foster education and inclusion in society.

SDG 17: Partnerships for the Goals
SDG 17 occupies a special position, as the formation of partnerships and collaborations to fully implement the other SDGs is to be achieved through this SDG. Strong, global partnerships at all levels are often cited as a key mechanism for achieving the other goals, so strengthening the means for their implementation, as well as shared values and inclusion, is crucial [228,229]. Despite being of central importance, indicators for SDG 17 targets have been criticized as being vague, unspecified, and difficult to measure, which is why the increasing number of partnerships reported in the SDG partnership registry might do little to change practices and behaviors [228]. Therefore, initiatives to promote interdisciplinary and cross-border data-driven collaboration and networking, as well as knowledge sharing via freely available datasets, need to be supported.
Advanced cloud computing and the use of big data processing platforms could help establish global partnerships that work on the same SDGs and realize the potential of big data in the context of sustainable development [230]. Concerns about fair data sharing, data ownership, and data privacy hinder partnership collaboration and full use of existing data but could be addressed, alongside mutually consenting policies and agreements, through multi-party computation (MPC) protocols and blockchain technology [168,231]. MPC protocols are an important technical tool for implementing the mechanisms that guarantee the increased utility or accuracy of data [231]. The use of blockchain technology can facilitate the secure exchange and processing of big data, as it was invented to support and record transactions of value without the approval of a central authority (cf. Sharing Economy), and can thus support the building of global partnerships [168,232].
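The basic idea behind MPC can be illustrated with additive secret sharing, one of its simplest building blocks. The sketch below is a toy example under stated assumptions, not the protocol discussed in [231]: several partners jointly compute the sum of their private values without any of them revealing its own value. The modulus, the function names, and the example values are hypothetical, and all parties are simulated inside one process.

```python
import secrets

MODULUS = 2**61 - 1  # all arithmetic is done modulo this (assumed) public constant


def share(value, n_parties):
    """Split a non-negative value into n random-looking shares that add up to it (mod MODULUS)."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def secure_sum(private_values):
    """Simulate a secure sum: no single share or partial sum reveals an individual input."""
    n = len(private_values)
    # Row i holds the shares that party i hands out, one per party.
    distributed = [share(v, n) for v in private_values]
    # Party j adds up the shares it received (column j) and publishes only that partial sum.
    partial_sums = [sum(row[j] for row in distributed) % MODULUS for j in range(n)]
    return sum(partial_sums) % MODULUS


# Hypothetical private values held by three partner organizations.
print(secure_sum([120, 45, 300]))
```

The result is correct as long as the true total is smaller than the modulus. A real deployment would additionally need secure channels between parties and protection against parties that deviate from the protocol, which this sketch does not cover.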

Contribution and Recommendations
This paper contributes to the existing literature by examining the extent to which data-driven technologies have contributed to the achievement of the 17 SDGs to date. The purpose was not to present all existing studies, but to summarize the most current findings and derive insights for future research. The conducted literature analysis shows that data-driven approaches contribute to achieving these goals and could become even more important in the future, provided that currently unused potential can be exploited. The results of our content analysis are summarized and visualized below (see Figure 1). At present, data usage consists of collecting data [136,138] primarily for monitoring [167,193] and measurement [177,203] purposes, performing data analyses [161,169], based on which maps [16,195] and models [60,142] are created, forecasts [18,217] and risk assessments [4,107] are derived, and plans are made [103].
These fields of application bring positive effects: They support the achievement of the 17 SDGs by making available information more reliable and coherent [195], thus promoting better-informed decision-making and data-based policy implementation [15,25,26]. Furthermore, the use of data can thus contribute to the reduction in risks or even to their prevention [81,106,107,208]. Performance improves as actions can be prioritized, and access or distribution of available resources is optimized [43,104,128]. Digitalization and the expansion of IT networks make it possible to reach economically disadvantaged people living in remote regions with inadequate infrastructure, e.g., in terms of medical or educational facilities [67].
Despite many positive effects and versatile application possibilities, several challenges persist: (1) Despite global expansion, IT infrastructures are still inadequate in many regions, especially in developing countries [68,69]. (2) With the number of internet users increasing worldwide, cyber attacks are also a growing phenomenon, and IT and operational technology protection is becoming increasingly important: Many countries, including developed economies, are struggling to fully implement cybersecurity, especially with regard to protecting critical (energy) infrastructures [233]. Although many countries have published cybersecurity strategies, Tvaronavičienė et al. [233] evaluated five selected national strategies using the six-tier model of [234] and showed that the protection of critical infrastructure and the cybersecurity that depends on it is very inadequate in most countries. (3) There is a lack of user acceptability of technologies that address human concerns (e.g., safety of self-driving cars) [66]. (4) Transparency of decision-making is limited with automated decision-making algorithms [171]. (5) One challenge is "data absences" [235] due to missing, unavailable, and/or incomplete data material [15,53,133]. (6) Data gaps mean that data are imperfect and may reflect existing data biases that arise from the reproduction of societal biases that can reinforce inequalities and discrimination [23,24,236]. (7) Increasing inequality affects emerging and developing countries, as they are not benefiting from the deployment of AI technologies but are experiencing a "tech colonialism" [189]. (8) A general challenge is the low interpretability of various ML models, for which it is important for data scientists to explain the models and understand results and for developers to debug and improve the models [97,98]. (9) Research gaps exist regarding the application of data-driven methods to achieve the 17 SDGs [211]. 
(10) Another challenge arises from the high energy consumption and emissions caused by the computing resources required for technology development and deployment, which counteracts the goal of sustainability [23]. (11) Another concern is ensuring model security, i.e., that malware does not manipulate data and lead to incorrect decisions of ML models [104,171]. (12) Lastly, ethical concerns, as well as privacy, ownership, and security issues, are prevalent [104,164,168].
Based on the challenges, recommendations for countermeasures can be derived from the literature: (1) A full-coverage IT infrastructure is necessary to enable systematic data collection in the form of data repositories and/or information systems that can subsequently be used for AI and data-driven applications in the context of the 17 SDGs. With the regionally prioritized expansion of IT infrastructure and AI capacity, open training (e.g., adult education, professional training) on the use of data-driven models (e.g., AI) [179] should occur to guarantee their correct application. Accordingly, education should be the starting point to ensure the supply of qualified AI/ML/IT human resources. Investment in research and training, especially in economically deprived regions, is therefore advisable. (2) To further support the aim of accessibility, worldwide collaboration and networking need to be encouraged, not only across countries but also across disciplines. Advanced cloud computing, big data processing platforms, and blockchain technology can facilitate the secure exchange and processing of big data and can thus help establish global partnerships [168,230,232]. To achieve the 17 SDGs, the planning and implementation of measures should take place at the global level, e.g., in the form of a globally accepted multisectoral action plan [32,231]. (3) In order to protect critical (energy) infrastructures from cyber attacks, the development of effective cybersecurity management strategies is of great importance. Concrete recommendations for action relate to improving planning management and extensive employee training to provide the necessary knowledge about the components and the functioning of critical infrastructures, thus ensuring that volatile and susceptible parts are adequately protected when developing new solutions [233].
(4) Knowledge sharing via open-source software and freely available datasets should be encouraged [179] to advance the rollout of a full-coverage IT infrastructure, to increase user acceptability, and to make decision-making algorithms more transparent. Approaches that could be used here include interpretable models, human-centric AI, fair ML, explainable AI (cf. explainability) [25,237], and causability [238]. In addition to security and reliability, causability is a key criterion for the effective application of a model. It is a measure of how well the ML model clarifies the causal understanding of a specific problem for a user in a particular context based on its proposed explanation (determining causal relations between input features and model predictions) [239][240][241]. In determining causability, three important criteria must be considered: effectiveness, efficiency, and satisfaction for a specific context in which ML is used [239][240][241]. In terms of SDG-related tasks, data-driven approaches need to provide a contextual understanding of various problems, as they can have a variety of causes depending on geography, social structure, governance, the economy of the region, and various other reasons. For example, hunger, health, and social problems may look the same but have different causal relations to input data in different societies. In such cases, causability is a valuable measure for isolating the most appropriate model when multiple models predict outcomes equally well and have reasonable explainability but have different outcomes in terms of understanding context. Nilsson et al. [242] discuss various contextual factors in terms of positive and negative interactions of SDGs; it is important that AI/ML are able to understand such contextual factors. (5) To compensate for data absences, it is recommended that high-quality data [15] are collected and (6) linked to large-scale comparative data with outcome data [192]. 
(7) An urgent recommendation for action is to close existing data gaps [211]. To address data absences as well as data gaps, the following recommendations can be made: widely accessible education; increased awareness of biases; better domain understanding of the field; and effective collaboration between developers, policy makers, and experts (e.g., business, data, law) [236,243]. (8) To specifically address data biases and discrimination, it is recommended that data be collected that can be disaggregated by gender [134] or geography [193]. Other measures include strict evaluation and benchmarking of AI/ML models, a robust framework for fair training, evaluation, and testing of ML, algorithmic transparency [244,245], and data transparency. (9) Democratization of AI, i.e., free and fair access to models [246], could be helpful for combatting "tech colonialism". (10) There are different approaches for solving the problem of low interpretability of various ML models, such as application-grounded, human-grounded, functionally grounded, or model-agnostic evaluation [247]. (11) To close research gaps, multidisciplinary efforts that draw on already proven application fields from other disciplines are recommended [67,211]. (12) In order to reduce energy consumption and emissions from the operation of computing resources, the development of more energy-efficient technologies (cf. green technology) [44,45] is recommended. (13) To ensure model security, model security evaluation is required first, after which reactive and proactive approaches for protecting models can be applied [248]. It is recommended that data and models be tested and protected and that users be warned about data compromises [86,195]. (14) Implementing global guidelines for the secure and ethical use of data would be a step toward addressing ethical concerns as well as privacy, ownership, and security issues [103].
Concerns about fair data sharing, data ownership, and data privacy could be addressed, alongside mutual-consent policies and agreements, through MPC protocols and blockchain technology [168,231].
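The recommendation to collect data that can be disaggregated by gender or geography can be illustrated with a short sketch. The records, field names, and the `disaggregate` helper below are hypothetical and not drawn from any cited dataset; the point is only that an aggregate figure can hide a disparity that becomes visible once the same data are split by group.

```python
from collections import defaultdict


def disaggregate(records, key, value):
    """Mean of `value` for each distinct group found under `key`."""
    totals = defaultdict(lambda: [0, 0])  # group -> [sum, count]
    for record in records:
        totals[record[key]][0] += record[value]
        totals[record[key]][1] += 1
    return {group: s / n for group, (s, n) in totals.items()}


# Hypothetical survey records: 1 = has internet access, 0 = has none.
records = [
    {"gender": "female", "region": "rural", "has_internet": 0},
    {"gender": "female", "region": "rural", "has_internet": 0},
    {"gender": "female", "region": "urban", "has_internet": 1},
    {"gender": "female", "region": "urban", "has_internet": 0},
    {"gender": "male", "region": "rural", "has_internet": 1},
    {"gender": "male", "region": "rural", "has_internet": 0},
    {"gender": "male", "region": "urban", "has_internet": 1},
    {"gender": "male", "region": "urban", "has_internet": 1},
]

overall = sum(r["has_internet"] for r in records) / len(records)
by_gender = disaggregate(records, "gender", "has_internet")
by_region = disaggregate(records, "region", "has_internet")

print("overall  :", overall)
print("by gender:", by_gender)
print("by region:", by_region)
```

In this made-up sample the overall access rate looks moderate, while the split by gender and by region shows markedly different rates. Such a breakdown is only possible if the grouping attributes are recorded at collection time.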

Discussion and Conclusions
Although the SDGs are receiving growing attention worldwide, our research found that there are wide disparities in the achievement of the 17 SDGs, especially when comparing developed, emerging, and developing countries. The latter require measures adapted to their specific needs to further advance the achievement of the SDGs. Furthermore, it cannot be overemphasized that knowledge sharing should be encouraged and supported through all reliable channels. Country- or region-specific strategies are already being applied to develop appropriate policy and regulatory frameworks for AI, as well as AI-based tools, products, and services: The "Smart Africa" government alliance is pursuing the implementation of these plans for its 30 African member countries, while the International Research Centre on Artificial Intelligence (IRCAI) is working to implement these plans for UNESCO member states [249]. The following actions could make a significant contribution to achieving the SDGs: an extensive rollout of a full-coverage IT network, strong collaborative efforts to share knowledge across borders and disciplines, integration and support of actions at all levels, further development and use of AI in line with the SDGs, as well as implementing ethical, legal, and policy guidelines.
The 17 SDGs are closely interrelated, and the data-driven methods used in achieving them overlap or are applicable to multiple SDGs simultaneously. This illustrates that a sustainable environment and a just, secure, and thriving society can only be created if all SDGs are addressed. This is particularly true for the SDGs that have received less research attention compared with others (e.g., SDG 16 and 17). Therefore, for future research, it is suggested that research gaps with regard to neglected SDGs are filled and that their connections to the other SDGs are explored. Ideally, data-driven methods and tools should contribute to the achievement of multiple SDGs simultaneously and should be mutually reinforcing or complementary. Another area of future research should focus on examining the role of AI in accelerating innovations that contribute to achieving the SDGs. Data-driven technologies are innovations in their own right and contribute to change in living and business environments. They open up opportunities that may accelerate developments but also may cause new challenges.
There are certain limitations resulting from the scope of the study and the chosen method that should be addressed by future research: (1) Although it is known that there are complex interrelationships between the 17 SDGs and that they influence each other, the interactions between them are not sufficiently discussed [250][251][252]. Several questions are not conclusively addressed, such as: How does the decision to use AI to perform tasks to achieve one SDG impact the achievement of the other 16 SDGs as they are interlinked? How can AI optimize the overall impact of the SDGs, considering the trade-offs among them? (2) There is a lack of a data-driven framework that enables the use of AI and allows for a robust and comprehensive evaluation of AI methods for making effective decisions about which SDGs should be prioritized in terms of AI use. Therefore, there is no consensus on how AI/ML methods can be applied to SDG problems. (3) AI models vary in their effectiveness in achieving the 17 SDGs and present us with different challenges: For example, DL can be used in healthcare (cf. SDG 3) with high prediction accuracy, but given the lack of explainability and causability, as well as concerns about patient-physician relationships, data security, and privacy [104], the question remains whether or not AI models should be used or how they should be used. In other cases, AI models are only moderately effective, have redundant or negative effects, or may lead to (data) problems. Especially in developing countries, data are often unavailable or insufficient. How useful or effective is the use of AI models in these cases? Accordingly, the use of AI requires a very detailed and systematic understanding of the challenges in its use and the potential impact, both positive and negative. Defining a generally acceptable framework for the use of data-driven approaches to realize their full potential to achieve the 17 SDGs with minimal risk will be one of the upcoming challenges. 
Future research should analyze the interrelationships between the SDGs, identify and prioritize the most important SDGs, and evaluate whether the use of data-driven approaches/AI is effective in achieving the prioritized SDGs.
In conclusion, it can be stated that data-driven and AI approaches have been and are likely to continue to be instrumental in achieving the SDGs, although much of their potential remains untapped to date. For this reason, the achievement of the UN's 17 SDGs by 2030 must be considered ambitious but not very likely. It should be in all of our interests to continue to work towards achieving the SDGs.