A Survey on Sustainable Surrogate-Based Optimisation

Abstract: Surrogate-based optimisation (SBO) algorithms are a powerful class of techniques that combine machine learning and optimisation to solve expensive optimisation problems. This type of problem appears when dealing with computationally expensive simulators or algorithms. By approximating the expensive part of the optimisation problem with a surrogate, the number of expensive function evaluations can be reduced. This paper defines sustainable SBO, which consists of three aspects: applying SBO to a sustainable application, reducing the number of expensive function evaluations, and considering the computational effort of the machine learning and optimisation parts of SBO. The paper reviews sustainable applications that have successfully applied SBO over the past years, and analyses the framework used, the type of surrogate model, the sustainable SBO aspects addressed, and open questions. This leads to recommendations for researchers working on sustainability-related applications who want to apply SBO, as well as recommendations for SBO researchers. It is argued that transparency about the computational resources used in the SBO framework, as well as developing SBO techniques that can deal with a large number of variables and objectives, can lead to more sustainable SBO.


Introduction
While undergoing the climate crisis with no easy global solution in sight, some have turned their attention to Artificial Intelligence (AI) as a key technology in the pathway towards a sustainable future [1,2], causing the rise of new AI initiatives such as Climate Change AI [3] and AI for Good [4]. Though it is unlikely that any one technology will be the solution to one of humanity's greatest challenges, Machine Learning (ML) in particular is seen as a technique with potentially a large positive impact on the United Nations' Sustainable Development Goals [5] (SDGs), for example in forecasting extreme weather events, balancing supply and demand of renewable energy systems, designing zero-emission transportation systems, identifying woodlands from satellite data, and fault detection in wind turbines [1,6-9]. This does not come without cost, however, as it turns out ML is a technology with a substantial carbon footprint [2,10]. This has also been recognised in several sub-fields of ML such as natural language processing [11] and Automated ML [12] (AutoML). However, one sub-field of ML, namely, Surrogate-Based Optimisation (SBO), has not yet achieved similar positive or negative attention, even though it is particularly suitable for reducing energy consumption. In fact, SBO techniques are often especially designed to avoid having to run computationally expensive software. This is done by using an ML model as a surrogate of an expensive part of an optimisation problem.
This work investigates how SBO, as a subset of AI, can contribute to sustainability, not only by replacing computationally expensive software with more efficient ML models, but also by solving optimisation problems in sustainable applications. This is done by means of a literature review on the intersection of sustainability and SBO. The main purpose of this literature review is to answer the question: "How is SBO applied to sustainability-related applications?" This question will be answered by identifying sustainability-related optimisation problems that are addressed using SBO, and by identifying in what way SBO is applied in these applications, e.g., which ML models and SBO techniques are used. This review is not meant to be exhaustive, but to give an overview of several examples of sustainable applications where SBO is applied. Such an overview can help point the way to combinations of applications and SBO techniques that add significant value. Furthermore, attention is given to the sustainability aspect of SBO itself, but only for the studies considered in this review. Finally, new avenues that can improve the use of SBO for sustainable applications are identified, both for researchers applying SBO in their own application domains, as well as for SBO researchers. The overall goal is to use SBO to improve sustainability aspects in a wide variety of applications, while also advancing research in SBO itself.
An overview of how other optimisation algorithms, namely, metaheuristic algorithms that do not make use of ML, are applied to sustainability-related applications, can be found in [13]. Such algorithms are less suited for expensive optimisation problems, as without a surrogate to guide them, they might require a prohibitively large number of evaluations of the expensive part of the problem. Besides not making use of ML, that study also does not take the sustainability aspects of the algorithms themselves into account. These are two significant differences with this work.
This paper is organised as follows: while this section gives a short introduction on the intersection of AI and sustainability, the sub-field of AI under consideration, namely, SBO, is explained and defined in Section 2. This is followed by Section 3, which investigates sustainability aspects of SBO. It defines three different ways in which SBO can contribute to sustainability. After this, Section 4 explains the method used in this literature review. This leads to a number of studies that are analysed in Section 5, where the used SBO techniques are divided into different frameworks, and several sustainable applications are identified that have benefited from SBO. Finally, Section 6 discusses the results based on the analysis, while providing recommendations for researchers that want to apply SBO to sustainable applications, and for SBO researchers themselves.

Surrogate-Based Optimisation
Motivated by the need for efficiently solving expensive optimisation problems, SBO algorithms such as efficient global optimisation [14] and Bayesian optimisation [15] have been developed. These algorithms make use of ML to guide the search for good solutions. The expensive optimisation problems they are designed for can involve computationally demanding simulators, or problems that depend on the outcome of other ML or optimisation algorithms. The problems are also considered 'black-box', meaning that no exact mathematical formulation is available that could be exploited. Examples are designing heat pump systems [16] and diabetes drug manufacturing [17]. The expensive part of the optimisation problem is approximated with a surrogate model in order to reduce the number of expensive computations. The surrogate model is obtained using ML on the available data of the expensive optimisation problem, and is typically updated during the optimisation process as more data becomes available; see Figure 1. The surrogate model is used inside the optimisation process, which makes SBO a powerful combination of ML and optimisation. A recent textbook introduction to SBO can be found in chapter 10 of [18].

Figure 1. Simplified framework of a typical SBO algorithm (diagram components: surrogate model g, candidate solution x, simulator/algorithm f, outcome (x, f(x))). Optimisation is applied to the surrogate model instead of the expensive simulator or algorithm, giving a candidate solution. This solution is evaluated by the simulator or algorithm. The resulting outcome is given to the surrogate model to be updated using machine learning, making it more accurate over time. This gives better candidate solutions and therefore better outcomes.
As there are many synonyms of 'surrogate model' to be found in the literature, such as 'response surface model' or 'metamodel', and many related terms as well, such as 'sequential model-based optimisation', 'Bayesian optimisation', or 'AutoML', it should come as no surprise that an exact definition of SBO is lacking. This work assumes the following broad definition:
Definition 1. Surrogate-based optimisation (SBO) is an optimisation technique that makes use of a surrogate model obtained using machine learning, usually to replace an expensive part of the optimisation problem.
Note that this definition makes no distinction between iterative and non-iterative methods, or between surrogate-based and surrogate-assisted methods. The corresponding optimisation problem is given as
$$\min_{x \in X} f(x), \tag{1}$$
where $f : \mathbb{R}^d \to \mathbb{R}^m$ consists of $m$ objectives that are the outcome of an expensive simulator or algorithm, $x \in \mathbb{R}^d$ consists of the decision variables, and $X \subseteq \mathbb{R}^d$ consists of the search space for the decision variables, including any constraints. The expensive part of the optimisation Equation (1), which is usually $f$ itself, is approximated with a machine learning model $g$, called the surrogate model. This is done using the outcomes obtained so far:
$$g = \arg\min_{g'} \sum_{i=1}^{n} L\big(g'(x_i), f(x_i)\big), \tag{2}$$
where $L$ is a loss function such as the mean squared error or negative log-likelihood, and $n$ is the current iteration of the SBO algorithm. Common surrogate models are Gaussian Processes [15] and random forests [19], among others. The surrogate model $g$ is used to provide a candidate solution by finding the maximum of a so-called acquisition function $\alpha$:
$$x_{n+1} = \arg\max_{x \in X} \alpha\big(g(x)\big). \tag{3}$$
This problem is much easier to solve than the original Equation (1) due to $g$ having a closed form that is easy to evaluate; therefore, traditional optimisation methods such as derivative-based methods can be used. The acquisition function $\alpha$ is used to balance the trade-off between exploration and exploitation. Example acquisition functions are Expected Improvement, Upper Confidence Bound, Thompson sampling, and Entropy Search [15,20]. While the details of SBO algorithms can differ, they all contain a learning part as in (2) and an optimisation part as in (3).
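As an illustration of how an acquisition function trades off exploration and exploitation, the following is a minimal sketch of Expected Improvement for a single objective. It assumes that the objective is minimised and that the surrogate returns a Gaussian prediction with mean `mu` and standard deviation `sigma` at a candidate point; the function name and arguments are chosen here for illustration and do not come from any particular library.

```python
import math

def expected_improvement(mu, sigma, f_best):
    """Expected Improvement for minimisation (illustrative sketch).

    mu, sigma: assumed Gaussian surrogate prediction at a candidate point.
    f_best:    best (lowest) expensive objective value observed so far.
    """
    if sigma <= 0.0:
        # No predictive uncertainty: improvement is deterministic.
        return max(f_best - mu, 0.0)
    z = (f_best - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF
    # First term rewards a low predicted mean (exploitation),
    # second term rewards high uncertainty (exploration).
    return (f_best - mu) * cdf + sigma * pdf
```

For the same predicted mean, a larger `sigma` gives a larger value, which is how unexplored regions remain attractive to the optimiser.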

Sustainability and Surrogate-Based Optimisation
This section proposes three definitions concerning the intersection of sustainability and SBO. Together, they are called Sustainable SBO (SSBO). Following the terminology of [2], where the distinction is made between AI for Sustainability, i.e., using AI as a tool to achieve sustainability, and Sustainability of AI, i.e., taking carbon footprints and energy consumption into account when developing AI methods, one can define similar notions for SSBO:
Definition 2. SBO for sustainability is concerned with applying SBO to sustainable applications, for example those that work towards the United Nations SDGs.
Definition 3. Sustainability of SBO is concerned with making sure the SBO algorithm itself is sustainable, e.g., does not significantly contribute to greenhouse gas emissions, has low energy consumption, is transparent about its computation costs, etc. This holds for both the ML part and the optimisation part of SBO.
However, unlike in AI or ML in general, SBO is concerned with another aspect related to sustainability. As SBO is often used to prevent the prohibitive costs of running computationally expensive simulators multiple times, these 'savings' can be considered part of SSBO as well. The following definition is used in this work, where the name Sustainability with SBO is chosen to stay in line with the existing terminology:
Definition 4. Sustainability with SBO is concerned with the prevention of running computationally expensive software, such as simulators or algorithms, more times than necessary.
'More times than necessary' is ill-defined here, but a comparison can be made with any method that would be used if SBO algorithms did not exist, for example randomly searching for good outcomes of a simulator or algorithm, or applying other black-box optimisation techniques that do not make use of ML surrogates, e.g., metaheuristic algorithms.
It is this last aspect of SBO that sets it apart from other AI techniques, as the main goal of SBO is to reduce the number of expensive function evaluations for some objective function. In fact, Definition 4 is the actual purpose for which SBO was developed in the first place, starting with algorithms such as EGO for expensive black-box optimisation [14]. At the same time, such 'energy savings', which result from using fewer expensive function evaluations, must be considered carefully when compared to the other SSBO definitions, so as to prevent falling into the trap of Jevons' paradox [21]. This paradox, when translated to the case of SSBO, could counter-intuitively result in using SBO with the same or an even higher number of function evaluations than any other algorithm because it is so efficient, which results in no savings of computational resources.
All in all, the three SSBO definitions must be carefully weighed against each other, similar to the weighing of sustainable AI notions according to van Wynsberghe [2]: "to assess whether training or tuning of an AI model for a particular task is proportional to the carbon footprint, and general environmental impact, of that training and/or tuning". While such a 'proportionality framework' [2] is beyond the scope of this paper, Definition 2 is the main focus of this research, though the other SSBO definitions receive some attention as well.

Survey Method
In this literature review, the search terms 'sustainable' and 'surrogate' were used. This is by no means expected to yield an exhaustive list, as both words have many synonyms (and also multiple meanings), and listing all synonyms would not only result in a number of studies too large to analyse, but would also be highly subjective. However, using these two words should be enough to serve the main goal of this survey: identifying studies that apply SBO to sustainable applications. At the same time, the terms are broad enough to cover a wide range of applications.
Only one database was used to retrieve records: SCOPUS (https://www.scopus.com, accessed on 6 February 2022). Furthermore, the time frame was limited to studies published in the 5-year period of 2017-2021. This was done to limit the number of studies to analyse in this exploratory survey, while still focusing on a time period where the topics of sustainability and AI both gathered significant attention, and to make the search easily reproducible.
The following search term was used on SCOPUS:
TITLE-ABS-KEY ( "surrogate" AND "sustainable" ) AND PUBYEAR > 2016 AND PUBYEAR < 2022
This resulted in 329 records. See Figure 2 for an overview of the methodology. Upon closer inspection, many of these records were not relevant for this review. For example, the word 'surrogate' can have different meanings in fields such as chemistry, healthcare and biology (e.g., surrogate mother). Even in optimisation, 'surrogate' could have a different meaning than intended for this survey, such as a surrogate measure that is defined by expert knowledge rather than learned from data (e.g., [22]). Such records were removed by reading the title and abstract of all 329 records and checking whether the word 'surrogate' was used in the context of surrogate models using machine learning. In case of doubt, the record was not removed. At this point, no attention was given to the sustainability aspect. Screening the records this way resulted in 89 reports, of which 2 were automatically detected as duplicates using Rayyan [23], and 5 turned out to not be accessible with the academic licenses of Eindhoven University of Technology. These reports were removed.
All 82 remaining reports were read in full, and were included in this review if they satisfied the following eligibility criteria:
• It can be seen how sustainability is a topic of the report;
• The report uses a surrogate model based on machine learning;
• The surrogate model is used inside an optimisation framework.
These criteria are specified further in the remainder of this section.

Sustainability
As this survey is performed by one person and the first criterion is especially subjective, in order to avoid bias, only in the rarest of circumstances was this criterion used to exclude a report. For two studies, no link to sustainability was found: in [24], the word 'sustainability' appeared in the affiliation, keywords, and acknowledgements, but this word or similar words appeared nowhere in the title, abstract or main text, and in [25], the same word was only found in the copyright text and once in the main text without explanation. However, it is possible that the link to sustainability has gone unnoticed in these two papers, also due to the wide range of applications covered in this survey.

Machine Learning
Some reports did not use surrogate models based on machine learning. This is the same issue that was checked when screening the abstracts, but for these reports more information from the main text was required to notice it (e.g., [26]). This only occurred six times, as in most other cases it was straightforward to notice the lack of machine learning from screening the abstract. Note that adding the words 'machine learning' to the search query would lead to different problems, as papers were found that would satisfy such an extended query but still do not use surrogate models based on machine learning (e.g., [27]).

Optimisation
Most of the studies that were excluded at this stage were not related to optimisation, even if they used surrogate models for other purposes such as prediction (e.g., [28]). In some of these studies, optimisation is only mentioned as a direction for future work (e.g., [29]). Note that adding a search query such as 'optimisation' would not solve the issue, as studies such as the earlier mentioned example [22] would be covered in the search query while not making use of SBO. Overall, 28 reports were excluded for not satisfying this criterion.

Other Criteria
Finally, two reports were a part I and part II of the same study [30,31]. These were both included but were counted together as one study. In the end, 45 studies were included in this review.
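The screening counts reported in this section can be tallied in a few lines. The sketch below only restates numbers given in the text (329 records, 89 reports after screening, 2 duplicates, 5 inaccessible reports, and the per-criterion exclusions of 2, 6 and 28); the variable names are chosen here for illustration.

```python
# Tally of the screening flow described in this section.
records_found = 329
reports_after_screening = 89
duplicates, inaccessible = 2, 5

reports_read_in_full = reports_after_screening - duplicates - inaccessible

# Exclusions per eligibility criterion, as stated in the text.
excluded = {"sustainability": 2, "machine learning": 6, "optimisation": 28}
included_reports = reports_read_in_full - sum(excluded.values())

# Parts I and II of the same study [30,31] are counted as one study.
included_studies = included_reports - 1

print(reports_read_in_full, included_reports, included_studies)
```

The tally agrees with the 82 reports read in full and the 45 included studies.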

Surrogate-Based Optimisation for Sustainable Applications
This section analyses the studies that were included in this literature review, particularly how the studies made use of SBO for sustainable applications. Several properties of the included studies are shown in Table 1. These are: reference to the study, year, SBO framework, surrogate model used, the application related to sustainability that is addressed with SBO, domain or subject area, whether sustainability aspects of SBO are addressed, and any open questions related to SBO rather than the application. These properties are analysed in this section.

Year
Because the search query restricts the publication year to 2017-2021, all surveyed years are covered in full; the year 2022, which was not yet over at the time of writing, was not included in the survey. The number of studies included in this survey increased from 2 in 2017, to 4 in 2018, 8 in 2019, 14 in 2020, and finally 17 in 2021. Even though there is room for other search queries than the one used in this work, and therefore quantitative results could be subject to bias, this does give the indication that interest in applying SBO to sustainable applications has increased over the surveyed time period. A potential explanation for this is that both sustainability and AI were popular topics in this time period, not only in public but also in private sectors. Looking at AI investments for example: "From 2015 to 2020, the total yearly corporate global investment in AI increased by 55 billion U.S. dollars" [32]. It is likely that research on SBO, as a subset of AI, has benefited from this popularity. At the same time, the surveyed period closely follows the adoption of the Paris Agreement [33] and the United Nations SDGs [5], which have likely had a significant impact on the research focus of the time period under consideration.

Framework
While SBO methods such as Bayesian optimisation typically use an iterative approach, where surrogate models are constantly updated and used to search for better candidate solutions, this was not the only framework in which surrogate models found in this review were used. Non-iterative or direct approaches were more common, probably due to not including terms like 'Bayesian optimisation' or 'sequential model-based optimisation' in the search query. This work divides the included studies into six frameworks: Sequential Model-Based Optimisation, Predict-then-Optimise, Optimise-then-Predict, Predict-then-Interact, Bi-level Optimisation, and Automated Machine Learning, with an additional category 'review' for review papers. Note that these frameworks are by no means exhaustive, and that they may be subject to bias and sometimes overlap, as is common when trying to divide optimisation algorithms into categories.

Sequential Model-Based Optimisation (SMBO)
This iterative procedure is the one used by Bayesian optimisation and similar algorithms [15,34]. Starting from an initial set of function evaluations, the surrogate model is learned. Then, an iterative procedure starts where (1) the surrogate model is used to suggest a new candidate point, (2) the expensive objective is evaluated at the new candidate point, and (3) the new evaluation is used to update the surrogate model. These three steps repeat until a stopping criterion such as maximum number of objective evaluations is satisfied. This iterative procedure is shown in Figure 1. This framework ensures that only promising parts of the search space are approximated by the surrogate model, which can reduce the required number of expensive function evaluations. An example of an included study that uses this framework in a water management problem is [35], where the surrogate model is repeatedly used in an optimisation problem solved by a genetic algorithm, the optimal points are given to a high-fidelity hydrodynamics simulator, and the surrogate is then updated with the outcome of the simulator.
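The three repeating steps can be sketched in code. The example below is a minimal illustration, not the method of any included study: it assumes a one-dimensional minimisation problem on [0, 1], a small Gaussian-process surrogate written directly in NumPy, a lower-confidence-bound acquisition function maximised over a fixed grid, and a cheap stand-in function `expensive_f` in place of a real simulator. All names and parameter values are hypothetical.

```python
import numpy as np

def expensive_f(x):
    # Hypothetical stand-in for an expensive simulator (cheap here on purpose).
    return (x - 0.3) ** 2 + 0.1 * np.sin(15.0 * x)

def rbf_kernel(a, b, length=0.15):
    # Squared-exponential kernel with unit prior variance.
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / length) ** 2)

def gp_posterior(x_train, y_train, x_query, noise=1e-6):
    # Posterior mean and standard deviation of a simple GP surrogate.
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf_kernel(x_train, x_query)
    y_mean = y_train.mean()
    mu = y_mean + Ks.T @ np.linalg.solve(K, y_train - y_mean)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)
    return mu, np.sqrt(np.clip(var, 0.0, None))

def smbo(f, n_init=4, n_iter=10, kappa=2.0, seed=0):
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, 201)           # candidate points for the acquisition step
    x = rng.uniform(0.0, 1.0, size=n_init)      # initial set of function evaluations
    y = f(x)
    for _ in range(n_iter):
        mu, sd = gp_posterior(x, y, grid)       # (3)/(0): (re)fit the surrogate on all data
        acquisition = -(mu - kappa * sd)        # negated lower confidence bound
        x_new = grid[np.argmax(acquisition)]    # (1): surrogate suggests a candidate
        y_new = f(x_new)                        # (2): one expensive evaluation
        x, y = np.append(x, x_new), np.append(y, y_new)
    best = np.argmin(y)
    return x[best], y[best], len(y)

x_best, y_best, n_evals = smbo(expensive_f)
```

The stopping criterion here is a fixed budget of `n_init + n_iter` expensive evaluations, which corresponds to the maximum-number-of-evaluations criterion mentioned above.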

Predict-then-Optimise (PtO)
This procedure is also called a direct procedure [36]. The surrogate model is learned from sampled data of the expensive objective, using e.g., Latin hypercube sampling, and then the optimisation problem is solved once with the new surrogate model. See Figure 3. It is possible that an iterative procedure is used for the learning process, however, no optimisation problem is solved within the iterative procedure: accuracy of the surrogate is the only goal. In some fields like chemical engineering, SMBO is known to outperform PtO [37], but in other fields like building design, it is still an open question which approach is better [36]. An example of an included study that uses the PtO framework in sustainable food production is given in [38], where the crop water demand is predicted with a neural network surrogate model, and the resulting optimisation problem is solved with nonlinear optimisation.

Figure 3. Predict-then-Optimise framework (diagram; recoverable labels: multiple sampled x, simulator/algorithm f).
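A minimal sketch of the Predict-then-Optimise procedure is given below, under stated assumptions: a one-dimensional problem on [0, 1], a hand-written Latin hypercube sampler, a quadratic polynomial as the surrogate, and a grid search as the optimiser. The function `expensive_f` is a hypothetical stand-in for a simulator, and none of the names come from an included study.

```python
import numpy as np

def expensive_f(x):
    # Hypothetical stand-in for an expensive simulator.
    return (x - 0.6) ** 2 + 0.05 * x

def latin_hypercube_1d(n, rng):
    # One sample in each of n equal-width strata of [0, 1).
    return (np.arange(n) + rng.uniform(size=n)) / n

def predict_then_optimise(f, n_samples=12, degree=2, seed=0):
    rng = np.random.default_rng(seed)
    x = latin_hypercube_1d(n_samples, rng)
    y = f(x)                                   # all expensive evaluations happen here
    coeffs = np.polyfit(x, y, degree)          # "predict": the surrogate is fitted once
    grid = np.linspace(0.0, 1.0, 1001)
    g_values = np.polyval(coeffs, grid)        # only the cheap surrogate is evaluated
    x_star = grid[np.argmin(g_values)]         # "optimise": solved once on the surrogate
    return x_star, n_samples

x_star, n_expensive = predict_then_optimise(expensive_f)
```

Unlike in SMBO, the optimiser's result is never fed back into the surrogate, so the quality of `x_star` depends entirely on how well the initial sample covers the region around the optimum.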

Optimise-then-Predict (OtP)
Though it is arguable whether this framework is considered SBO, it was included as it satisfies the definition used in this work. Using a similar terminology as for PtO, in OtP, first, an optimisation problem is solved using standard optimisation algorithms, for multiple situations or contexts (e.g., wind conditions). Then, a surrogate model is trained on the resulting data. The model is then used to generalise or visualise the outcomes of the optimisation problem for new situations, i.e., prediction of optimal outputs is the final goal. See Figure 4. An example of an included study that uses this framework when planning charging stations for electric vehicles is found in [39], where a mathematical program is solved multiple times for different situations such as number of electric vehicles or energy storage capacity, and the surrogate model then predicts the optimal annual profit of the system for all possible situations.
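The Optimise-then-Predict idea can be sketched as follows. This is an illustrative toy, not the model of [39]: the profit function, the context variable and the quadratic surrogate are all assumptions made for the example, and a grid search stands in for a mathematical programming solver.

```python
import numpy as np

def optimal_profit(context):
    # Hypothetical inner problem: maximise context * x - x**2 over x in [0, 10].
    # A grid search stands in for a standard optimisation algorithm.
    x = np.linspace(0.0, 10.0, 2001)
    return np.max(context * x - x ** 2)

def optimise_then_predict(contexts, degree=2):
    # "Optimise": solve the problem once per context.
    optima = np.array([optimal_profit(c) for c in contexts])
    # "Predict": fit a surrogate that maps a context to its optimal outcome.
    return np.polyfit(contexts, optima, degree)

contexts = np.linspace(1.0, 9.0, 9)
surrogate = optimise_then_predict(contexts)
predicted = np.polyval(surrogate, 4.5)   # optimal profit for a context not solved above
```

The surrogate here approximates the optimal value as a function of the context, so new situations can be assessed without solving the optimisation problem again.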

Predict-then-Interact (PtI)
Keeping the terminology consistent, PtI is similar to PtO, but instead of using an algorithm to solve the optimisation problem with the trained surrogate, a human interacts with the surrogate. See Figure 5. If used for design, the designer can use the surrogate to manually solve an optimisation problem defined by their own constraints and objectives, which are often related to creativity and cannot always be defined mathematically. Two included studies use this framework, both for building design [40,41].

Figure 5. Predict-then-Interact framework (diagram; recoverable labels: multiple sampled x, simulator/algorithm f, surrogate model g, human interaction).

Bi-level Optimisation (BlO)
In this framework, there are two nested optimisation problems, where the outer optimisation problem depends on the results of the inner optimisation problem. The surrogate model is used to approximate the inner optimisation problem. The framework is the same as in Figure 1, but with f consisting of two nested problems. An example of an included study that uses this framework for sustainable land development is [42], where the problem is formulated as a Stackelberg game between government (who wants to balance land use between food, energy and water) and land developers (who want to maximise profit), and a surrogate model approximates the decisions of the land developers.

Automated Machine Learning (AutoML)
In this framework, the final task is the same as in conventional machine learning (ML), e.g., prediction or classification, but SBO is used (typically in the SMBO framework) to automatically solve part of the ML procedure that is usually done by hand, such as choosing which ML model to use or tuning the hyperparameters. The framework is again the same as in Figure 1, but with f consisting of the ML procedure. The only included study that uses this framework is [43], which uses SBO to select the models and hyperparameters of an ML model that predicts groundwater levels. Note that AutoML is also considered a sub-field of AI that does not necessarily always make use of SBO, but here it is considered an SBO framework to separate it from the other frameworks.

Framework Discussion
In the included studies, not all frameworks had the same frequency of appearance. As mentioned, the PtO framework was quite common. Different optimisers were used in the optimisation step, such as gradient descent (e.g., [44]), NSGA-II (e.g., [45]), tabu search (e.g., [46]), CMA-ES (e.g., [31]), and more. The SMBO framework was most common in studies from the chemical engineering and computer science domains, among others. It is possible that this framework is not yet well known in other domains, despite its earlier mentioned advantages. Examples of optimisers used in the SMBO framework are NSGA-II [47,48], simulated annealing [35], multi-objective particle swarm optimisation [49], and more. Bi-level optimisation problems were often the result of having multiple stakeholders, which is common in sustainable applications; examples are road users and government [50], or land owners and government [42]. OtP problems often appeared when outside factors such as traffic load or weather came into play [39,51]. Two studies used the PtI framework, both for building design [40,41]. There were also some comprehensive reviews for SBO in building design [36,52] and other specific applications, but none for sustainable applications in general. Finally, the AutoML framework appeared only once [43], though replacing 'surrogate' with 'AutoML' or similar terms in the search query would likely yield more results.

Surrogate Model
The type of surrogate model used in the included studies varied a lot, mostly depending on the framework. Artificial Neural Networks (ANNs) are very popular in PtO, while Gaussian Processes (GPs) or Radial Basis Functions (RBFs) are more common in other frameworks such as SMBO or BlO. As noted in one of the included studies [52], the popularity of ANNs can likely be explained by the success of deep learning in the last decade and by researchers having access to more powerful hardware that allows more complex models. Other surrogate models were Multivariate Adaptive Regression Spline (MARS), Support Vector Machine (SVM) or Support Vector Regression (SVR), linear and polynomial regression, piece-wise linear models, Random Forest (RF), Recurrent Neural Network (RNN), and ensembles of multiple models. A general explanation of how to use ML models such as ANNs or GPs in SBO can be found in, e.g., [18].

Application and Domain
The goal of this study is to identify sustainable applications where SBO is applied, both to point SBO researchers to new interesting and relevant problems, and to make researchers in these sustainable application domains aware of the power of SBO. The range of applications is quite broad, from groundwater management to electric vehicles. The terms in the 'application' column of Table 1 were chosen manually after reading the studies, and might be subject to bias, especially considering the broad range of topics. Therefore, the domains of the application are also mentioned: these are retrieved from SCOPUS directly. A broad list of domains is covered, but engineering, environmental science, energy and computer science were the most common domains, covering over half of all included studies as seen in Figure 6. While the prominence of energy and environmental science is to be expected when searching for sustainability-related studies, the engineering domain is likely well-represented due to SBO being considered a subset of engineering optimisation [18]. Similarly, SBO is considered part of AI, which can be considered part of the computer science domain. Overall, due to the diversity of the sustainable applications that were found, this study has already achieved its main goal.

Sustainable SBO
As explained in Section 3, this work considers three types of Sustainable SBO (SSBO). SBO for Sustainability is about the sustainability aspects of the application itself. This work assumes that all the included studies cover this aspect in some way or another: e.g., [53] uses SBO to increase soil health by 7.6%, while [54] finds several new stable chemical compounds for sustainable energy applications using SBO. This assumption is made in order to reduce bias, and because of the wide range of applications which requires expertise in many different science domains to fully understand the sustainability aspects of the applications. The corresponding column in Table 1 ignores this aspect of SSBO and only reveals whether other SSBO aspects were covered. This was done by manually inspecting the studies and is therefore still subject to bias, so only examples and general insights are given here.
Sustainability with SBO is about the prevention of running computationally intensive simulators or algorithms. Since this is the main reason for using SBO, it is assumed that all included studies take this aspect into account when choosing their methods. However, most studies do not quantify this aspect, which makes it difficult to determine whether the benefits (SBO for Sustainability and Sustainability with SBO) are worth the computational resources of SBO itself (Sustainability of SBO). An example of an included study that does quantify this is [55], where the total time of the SBO approach was estimated at 15 h for evaluating the expensive simulator and 1000 s for the ML and optimisation parts of the SBO approach, while directly optimising the expensive simulator using the same optimisation procedure without a surrogate model was estimated to take 330 h. In other words, SBO has led to approximately a 95% reduction in computational resources for this application, when compared to other optimisation techniques that do not make use of ML. Similar savings in computation time were reported for sustainable building design in one of the included reviews [36], with 97% as the largest reported number. Studies that quantify this aspect of SSBO get a checkmark in the corresponding column in Table 1.
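The reduction of approximately 95% quoted for [55] follows directly from the three reported times. The short calculation below reproduces it; the function name and argument names are chosen here for illustration.

```python
def relative_saving(simulator_hours, sbo_overhead_seconds, baseline_hours):
    """Fraction of computation time saved by SBO relative to a baseline without a surrogate."""
    sbo_total_hours = simulator_hours + sbo_overhead_seconds / 3600.0
    return 1.0 - sbo_total_hours / baseline_hours

# Values reported for [55]: 15 h of simulator time, 1000 s for the ML and
# optimisation parts, and an estimated 330 h without a surrogate model.
saving = relative_saving(15.0, 1000.0, 330.0)
print(f"{saving:.1%}")
```

Note that the 1000 s of SBO overhead amounts to less than 2% of the SBO total in this case, so the saving is dominated by the reduced simulator time.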
Finally, studies that discuss Sustainability of SBO itself, for example by mentioning the trade-off in computational resources between expensive optimisation problem and SBO framework, or by quantifying the computation time or energy usage of their SBO framework, get a checkmark in the corresponding column. An example is [44], where computation times for training the ANN surrogate, using the surrogate for optimisation, and evaluating the expensive simulator are all reported. The study estimates that the surrogate model is 4000 times faster than the expensive simulator after performing the 'predict' step of the PtO framework. If the entire PtO framework is taken into account, the study estimates that the SBO framework is more efficient than a regular optimisation framework (in this case the gradient descent or Nelder-Mead simplex algorithm) in the situation that the expensive simulator is called more than 168,000 times. The authors conclude that "the use of ANN in multidisciplinary optimization frameworks transfers the computational cost of the aircraft optimization task to the ANN training process".
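The break-even reasoning in this example can be written as a short inequality. The symbols below are assumptions introduced for illustration and are not taken from [44]: $C_{\text{setup}}$ is the one-off cost of sampling the simulator and training the surrogate, $t_f$ and $t_g$ are the times of one simulator call and one surrogate call, and $N$ is the number of calls the optimiser needs.

```latex
% Surrogate-based framework is cheaper than direct optimisation when
C_{\text{setup}} + N\,t_g < N\,t_f
\quad\Longleftrightarrow\quad
N > \frac{C_{\text{setup}}}{t_f - t_g}.
% With a surrogate that is 4000 times faster, t_g = t_f / 4000, so
N > \frac{C_{\text{setup}}}{t_f\,(1 - 1/4000)} \approx \frac{C_{\text{setup}}}{t_f}.
```

Under these assumptions, the break-even number of calls is essentially the setup cost expressed in units of simulator calls, which is why a large training effort only pays off when the optimiser would otherwise call the simulator very often.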
It should be noted that the above study was an exception: almost none of the studies included in this work mentioned this last aspect of SSBO, i.e., the computational resources of the SBO framework. What was usually mentioned is the number of samples used to train the surrogate model, but not the time used to train or optimise the surrogate. In some studies, computation times of the SBO framework were reported in the supplementary material, e.g., [30]. In others, such as [56], the ML and optimisation parts of the SBO framework were considered to be negligible, so only the total computation time of using the SBO framework and evaluating the expensive simulator together was mentioned. While it is indeed the case for many applications that the SBO framework has negligible computation time compared to the expensive simulator, this is certainly not the case for all applications, as seen earlier with study [44]. Therefore, it is important to keep track of both the computation time of the SBO framework itself and the computation time spent on calling the expensive simulator, and to report these separately.
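One lightweight way to keep these computation times separate is to accumulate wall-clock time per category. The following is a minimal sketch using only the Python standard library; the `TimeLedger` class, the toy simulator, and the nearest-neighbour "surrogate" are hypothetical placeholders and are not taken from any of the reviewed studies.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class TimeLedger:
    """Accumulates wall-clock seconds per named category."""

    def __init__(self):
        self.seconds = defaultdict(float)

    @contextmanager
    def track(self, category):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.seconds[category] += time.perf_counter() - start


def expensive_simulator(x):
    # Hypothetical stand-in for a costly simulation.
    time.sleep(0.01)
    return (x - 0.3) ** 2


ledger = TimeLedger()

# Expensive part: evaluate the simulator on a few sample points.
samples = []
for x in [0.0, 0.25, 0.5, 0.75, 1.0]:
    with ledger.track("simulator"):
        samples.append((x, expensive_simulator(x)))

# SBO part 1: "train" a toy nearest-neighbour surrogate.
with ledger.track("surrogate_training"):
    def surrogate(x):
        return min(samples, key=lambda s: abs(s[0] - x))[1]

# SBO part 2: optimise the surrogate on a fine grid.
with ledger.track("surrogate_optimisation"):
    grid = [i / 1000 for i in range(1001)]
    best_x = min(grid, key=surrogate)

for category, seconds in ledger.seconds.items():
    print(f"{category}: {seconds:.4f} s")
```

Reporting the three totals side by side, rather than a single combined time, makes it possible to judge whether the SBO overhead is in fact negligible for a given application.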

Open Questions
Most studies posed several open questions related to their application; however, many studies also contained more general open questions related to SBO. The latter are denoted in this column, in order to stimulate SBO researchers to tackle these open questions. These questions were found manually, mainly by inspecting the conclusions and future work sections of the papers, and are by no means an exhaustive list of open problems in SBO, even for the applications in this review. Still, many commonalities were found.
One of the most common open questions is related to the high dimensionality present in many applications. Many SBO algorithms struggle when a large number of variables is present, e.g., over 200 variables, which can occur in applications such as building design [31]. Fortunately, high dimensionality is an active area of research in SBO [57,58], though the question remains which approach works best in practice and how to solve the problem efficiently.
Another common open question was that of dealing with multiple objectives, or specifically, many objectives, as most problems in this review already had to deal with two or three objectives and some of them mentioned adding more in the future work section. For example, Ref. [47] uses two objectives but mentions at least five more in the future work section. Multi-objective problems were common in this review due to the nature of sustainable applications: often, ecological, economic, and social aspects of the same problem had to be considered. This was also noticed in a related review on metaheuristic optimisation algorithms for sustainability-related applications, where multi-objective problems outnumbered single-objective problems at least four to one [13]. Furthermore, this could be another reason for the popularity of the PtO framework compared to SMBO, as not all SMBO techniques are equipped for dealing with multi-objective problems. How to deal with a large number of objectives (e.g., more than three) in SBO remains an open question.
The question of robustness and generalisation was also often encountered. A surrogate model that is accurate in one situation is not necessarily accurate in similar situations, such as different weather conditions when designing heat pump systems [16]. In the ML community this question has gained a lot of attention in the past, but in SBO, generalisation aspects are less well understood. Especially in the SMBO framework, where the surrogate only approximates a small part of the search space, generalisation typically plays a smaller role than in traditional ML or in the PtO framework. Research on contextual bandits [59] and similar ideas could be of use here.
Other open questions are concerned with: making efficient use of parallelisation, having multiple users in the PtI framework, including historic data in the SMBO framework, hyperparameter optimisation for surrogate learning, using smooth and/or interpretable surrogates, and general challenges in optimisation such as mixed variables, multimodality, nonlinearity and nonconvexity. Though many of these questions are gaining attention in recent SBO research, such as mixed variables [60,61], these open questions can serve as an incentive for SBO researchers to tackle such problems.

Discussion and Conclusions
Answering the main research question, "How is SBO applied to sustainability-related applications?", several sustainability-related applications from a wide variety of research domains were identified, as well as several frameworks in which SBO was used. Some of these frameworks (such as SMBO and BlO) used an iterative procedure where the surrogate is continuously updated, while others (such as PtO and OtP) trained the surrogate only once. The AutoML framework, where SBO is applied to a more general ML problem, opens up the path of applying SBO indirectly to many more sustainable applications that make use of ML. Overall, besides the sustainability aspect of the application (SBO for Sustainability), many researchers applied SBO to prevent having to run expensive simulators or algorithms multiple times, which is another sustainability aspect related to SBO, denoted Sustainability with SBO in this work. However, researchers were not always transparent about the computational resources spent on the ML and optimisation parts of SBO itself: Sustainability of SBO was often not considered. In some cases, these computational resources were negligible compared to those spent on the expensive simulators and algorithms, but in other cases they were not. This makes it difficult to analyse the trade-off between different sustainability aspects of SBO, for example whether to use a more complex but time-consuming ML model in order to save a few more calls to the expensive simulator or algorithm, or to spend more computational resources to increase the sustainability aspects of the application itself (e.g., reduce CO2 emissions).
Therefore, the following recommendations are made to application researchers to improve Sustainable SBO:
• Report the hardware used for the SBO framework and the hardware used for the expensive simulator or algorithm (these are often the same).
Besides these SBO-specific recommendations, the following recommendation is related to sustainability of AI in general:
• If possible, report not just the computation times, but also the energy consumption (and energy mix used) or even the CO2 emissions caused by the computations.
While this last point is not necessary for making the trade-off between Sustainability of SBO and Sustainability with SBO, i.e., deciding whether using SBO is more efficient than using other optimisation algorithms, it is necessary for also including SBO for Sustainability in the trade-off. To give a concrete example: if 100 tonnes of CO2 can be mitigated by designing a sustainable solar heat system in Mexico, as is done in one of the included studies [48], but training the ML model would emit over 200 tonnes of CO2, as estimated can happen for complex ML tasks such as natural language processing [11], the decision whether to apply SBO to this application is not that easy to make. Fortunately, the carbon footprint of the ML models used in SBO is likely not that high, but without any transparency on this issue it will be difficult to prevent SBO from heading in the same direction as natural language processing, with a large carbon footprint as a result. This same call for transparency on the environmental impact of AI techniques is found in other related subfields of AI, such as AutoML [12]. Examples of tools to measure the CO2 emissions of ML are Carbontracker [83] and Machine Learning Emissions Calculator [84], and a similar example for algorithms in general is Green Algorithms [85].
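To illustrate how computation time, energy consumption, and energy mix combine into an emissions figure, a back-of-the-envelope estimate can be sketched as follows. This is a simplified model under stated assumptions (constant average power draw, a single carbon-intensity value, and an optional data-centre overhead factor) and it does not reproduce the method of any of the tools mentioned above. The function names and the hardware numbers are hypothetical; only the 100 and 200 tonne figures come from the example in the text.

```python
def emissions_kg(power_watts, hours, carbon_intensity_kg_per_kwh, pue=1.0):
    """Rough CO2 estimate: energy (kWh) times carbon intensity of the energy mix.

    pue is an assumed overhead factor for cooling and other infrastructure.
    """
    energy_kwh = power_watts / 1000.0 * hours * pue
    return energy_kwh * carbon_intensity_kg_per_kwh


def net_benefit_tonnes(mitigated_tonnes, emitted_tonnes):
    """Positive if the application mitigates more CO2 than the computation emits."""
    return mitigated_tonnes - emitted_tonnes


# Hypothetical run: 250 W average draw for 20 h on a 0.5 kg CO2/kWh energy mix.
run_kg = emissions_kg(power_watts=250, hours=20, carbon_intensity_kg_per_kwh=0.5)
print(f"Estimated emissions: {run_kg:.2f} kg CO2")

# Example from the text: 100 tonnes mitigated versus 200 tonnes emitted.
print(f"Net benefit: {net_benefit_tonnes(100, 200)} tonnes CO2")
```

A negative net benefit, as in the second case, signals that the computation would cost more emissions than the application saves. Dedicated measurement tools should be preferred over such estimates whenever they are available.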
For SBO researchers, the same recommendations as above are made, but the reviewed studies themselves also contained recommendations in the form of open questions. These can be used to determine which challenges in SBO to tackle next. The most common open questions found in the reviewed studies were related to high dimensionality and multi-objective optimisation. This is in line with the results of a questionnaire on more general real-world optimisation problems [86], i.e., problems not necessarily related to SBO or sustainability, where having two or more objectives and tens or hundreds of variables was quite common. Researchers using the SMBO framework in particular should take this into account when designing their algorithms. Note that in that same questionnaire, objectives that took up to an hour to evaluate were quite common, which indicates the importance of SBO for real-world optimisation problems, and the potential of Sustainability with SBO.
All in all, it can be concluded that there is great potential in responsibly using SBO to make the world more sustainable.
Funding: This research received no external funding.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Data Availability Statement: No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Acknowledgments: The author would like to thank Shane Ó Seasnáin for his valuable feedback on a first draft of this paper.

Conflicts of Interest: The author declares no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: