Application of Sustainability Principles for Harsh Environment Exploration by Autonomous Robot

Abstract: Currently, the European Union (EU) is focusing on a large-scale campaign dedicated to developing a competitive circular economy and expanding the single digital market. One of the main goals of this campaign is the implementation of sustainability principles in the development and deployment cycle of new-generation technologies. This paper focuses on the fast-growing field of autonomous mobile robots and the harsh environment exploration problem. Currently, most state-of-the-art navigation methods evaluate candidate observation locations by combining different task-related criteria. However, these map building solutions are often designed for operating in near-perfect environments, neglecting such factors as the danger to the robot. In this paper, a new strategy is proposed that aims to address the safety and re-usability of the autonomous mobile agent by implementing the economic sustainability principles. A novel multi-criteria decision-making method, Weighted Aggregated Sum Product Assessment with Single-Valued Neutrosophic Sets (WASPAS-SVNS), and the weight selection method of Step-Wise Weights Assessment Ratio Analysis (SWARA) are applied to model a dynamic decision-making system. The experimental evaluation of the proposed strategy shows increased survivability of the autonomous agent. Compared to the greedy baseline strategy, the proposed method forms a movement path that orients the autonomous agent away from dangerous obstacles.


Introduction
Due to the constantly growing human population, the demand for clean food and water, energy, raw materials for habitats and basic goods has been increasing at an unprecedented rate. Naturally, trying to sustain such an economy by exhaustively using non-renewable resources is not effective and can result in global economic collapse. To tackle this problem, the paradigm of sustainable manufacturing was introduced and adopted by many businesses, countries and market groups [1]. For example, in 2015 the European Union [2] announced the action plan for the development of the circular economy and by 2017 identified 27 heavy and light earth elements and platinum group metals as the critical raw materials that should be preserved to develop a competitive and technologically advanced economy [3]. A similar strategy was defined by the United States of America [1] in an attempt to conserve energy and minimise greenhouse gas emissions and other toxic waste products.
The sustainability principles, in general, are supported by environmental, social and economic factors. Therefore, when the paradigm is introduced to the manufacturing system, each factor is addressed at every stage of the product life cycle: the design phase, the manufacture phase, the usage phase [...] scenarios, UGVs are often required to perform in harsh environments that are unreachable or too dangerous for humans, such as in the Fukushima Daiichi event [12]. These environments can be irradiated or flooded, can contain a spreading fire source, or can carry a high risk of explosion or structural collapse [13]. Commonly in these situations, no initial knowledge about the operating environment is given to the autonomous mobile agent. The mainstream solution to this problem is an iterative map building approach, through which the robot expands its knowledge about the environment by adding together small pieces of obtained information. Losing or damaging the robot due to an unexpected change in the environment or a poor decision-making process means that the affected components (or sometimes the whole system) need to be repaired or replaced. Therefore, in real-world scenarios, autonomous robots should not only strive to complete the given task but also preserve themselves and avoid danger whenever possible. Hence, this strategy is an essential requirement to maximise the reusability of the autonomous robot and minimise the potential economic damages.

Multi-Criteria Decision-Making Methods for Sustainable Robot Design
The process of creating a robust robot decision-making module that integrates sustainability principles and also addresses the given task involves multiple environmental, economic, social and functional requirements. The successful fulfilment of each requirement depends on a number of task-related criteria that have to be evaluated during the route planning stage. Hence, the autonomous trajectory selection problem can be modelled as a multi-criteria decision-making (MCDM) problem.
In the context of sustainable system development, several papers propose different MCDM approaches to solve complex real-world selection problems. For example, Zavadskas et al. [14] proposed a new assessment methodology for waste incineration plant location selection. Stojić et al. [15] addressed the problem of supplier selection in the manufacturing chain by introducing a rough WASPAS framework. Stanujkić and Karabašević [16] extended the WASPAS method by integrating intuitionistic fuzzy numbers.
The MCDM methods are fast and easily adjustable tools, which enable the user to evaluate the alternatives not only by considering the data from various conflicting criteria and their relative weights, but also the format of presentation, for example, fuzzy or crisp [14]. Due to these characteristics, multi-criteria decision-making methods were successfully applied in several studies, such as single robot area exploration in [8] or multi-robot navigation in [9]. In both studies, the Choquet fuzzy integral was applied to model redundancy and synergy between the elements of the standard criteria set. This extension was applied to develop a dynamic search and rescue strategy based on the evaluation of observation locations. According to the authors, the proposed method produced topologically representative maps and showed good overall performance. A more advanced MCDM framework, namely PROMETHEE II, was proposed by Taillandier and Stinckwich [17] to improve the decision-making efficiency in search and rescue scenarios. This outranking method provided better results in open and cluttered environments, compared to [9]. However, the experimental evaluation of all discussed area exploration strategies was conducted without taking into consideration harsh environmental conditions, such as open fire sources, dynamic obstacles and faulty sensor readings.
In recent years, an effort has been made to extend the MCDM methods to solve such complex real-world problems. Various fuzzy set formulations for MCDM frameworks were taken into consideration while modelling the incomplete data sets for practical decision problems [18][19][20]. Recently, a new distinctive method to model the vagueness of the perceived data, called neutrosophic set logic, was formulated by Smarandache [21]. Neutrosophic sets can be viewed as a generalisation of intuitionistic fuzzy sets [22], which, unlike other fuzzy-based methods, incorporate the estimation of three independent factors (truth-membership degree, indeterminacy-membership degree and falsity-membership degree) and provide the tools to analyse each of them separately.
The neutrosophic sets were used to extend several multi-criteria decision-making frameworks, such as WASPAS [14,23] or the Decision-Making Trial and Evaluation Laboratory method, namely DEMATEL, proposed by Liu et al. [24]. In general, multi-criteria decision-making frameworks that incorporate such tools show great potential in solving the complex harsh environment analysis problems that are given to UGVs.

Autonomous Robot Platform
In the context of this research, a virtual turtle-bot-like [25] autonomous robot, whose design is presented in Figure 1, is deployed. The robot has two driven wheels on the sides of its chassis and two supporting wheels that follow the movement of the robot. The driven wheel diameter is 0.1 m, and the robot length, width and height are 0.15 m, 0.125 m and 0.135 m, respectively. The robot can rotate through 360° around its vertical axis. High-accuracy virtual heat and laser sensors are utilised as the main environment perception devices and are mounted above the chassis, at the centre of the robot. The Hokuyo laser sensor has a measuring range from 0.01 m to 15 m with an accuracy of ±0.01 m and can detect obstacles within a 180° angle in front of the robot.
The robotic exploration process can be defined as a process through which the physical structure of the initially unknown environment is discovered by making incremental, information-based decisions. In the context of this research, the authors consider a standard map building method in which newly obtained geometrical data is added to a grid of a specified scale, whose cells can have one of three states: occupied, free or unreachable [26]. It is also assumed that the robot can localise itself within the reconstructed map.
It is worth noting that any mobile robot that operates on the ground, in the air, or above or below the water surface can be utilised in the context of the environment exploration problem. Naturally, for such a variety of robots, different engineering solutions can be applied, requiring different materials and investments. However, across the different classes of autonomous robots, commonly shared program components can be distinguished [27]: the environment perception module; the self-localisation module; the cognition and path planning module; and the motion control module. In the context of this research, the main focus is directed to the cognition and path planning module, and the expansion of the robot's decision-making capabilities.
For the robot to efficiently explore the unknown environment, its decision-making strategy must achieve good long-term performance by making a series of short-term decisions. Assuming that the robot sensor range is limited, this problem can also be defined as the iterative next-best observation location selection problem [9]. Thus, the robot's decision at every iteration depends only on its current state and the available candidate locations within the currently explored environment, and not on the previous states. The value of an alternative location can be measured by evaluating multiple criteria that depend on the system goal. The number of considered criteria is essentially unlimited and can be changed to address the requirements of the specific task. In the context of this research, the utilised decision-making module analyses the sensor data, compares alternative routes in relation to their respective criteria and chooses the highest ranked alternative. The complete navigation sequence utilised by the deployed virtual robot system is depicted in Figure 2.
The proposed environment exploration strategy encapsulates decision matrix preparation and criteria evaluation methods under the decision-making module. This module can be easily expanded or moved across different autonomous robots, making the system more dynamic. The exploration sequence ends when the robot battery is depleted, the robot is severely damaged, or the main goal is achieved. In any scenario, the reconstructed map data and the robot's current location coordinates are sent to the control centre for analysis. This iterative strategy is the core of the autonomous exploration system that allows the robot to explore an environment about which no initial knowledge is given.

MCDM Problem Formulation
The main goal of this research is to expand the robot's artificial intelligence capabilities. The proposed decision-making model is essentially responsible for two tasks: (1) the processing of environment information and the computation of the candidate observation location list; (2) the evaluation of observation locations and the selection of the highest ranked alternative.
This alternative selection process can be further formalised from the multi-criteria decision-making perspective. For each movement iteration, the robot computes a new list of $i$ candidate observation locations denoted by $A = \{a_1, a_2, \ldots, a_i\}$. For each alternative, a set of $n$ criteria $C = \{c_1, c_2, \ldots, c_n\}$ is assigned. The utility of a candidate location can be denoted by $u_n(a)$ and used to measure the candidate performance with respect to the criterion $c_n$. Assuming that $C$ has $n$ criteria, the candidate location $a$ can be denoted as a utility vector $(u_1(a), u_2(a), \ldots, u_n(a))$. By applying MCDM methods, the overall value of such a utility vector can be measured and ranked. Therefore, the candidate $a$ with the highest rank is considered to be a solution to the problem. In the following sections, the computation of alternative observation locations and the selection of the sustainability-based criteria that support the safe environment exploration and resource preservation strategy will be further defined.
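The formulation above can be illustrated with a minimal sketch. It deliberately swaps in a plain weighted sum as the ranking rule, which is only a stand-in and not the WASPAS-SVNS method used in this research; the `Candidate` class, the utility values and the weights are all hypothetical, and the utilities are assumed to be already normalised so that larger is better.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Candidate:
    """One candidate observation location a with its utility vector."""
    name: str
    utilities: Tuple[float, ...]  # (u_1(a), ..., u_n(a)), assumed normalised, larger is better


def rank_candidates(candidates: Sequence[Candidate],
                    weights: Sequence[float]) -> List[Candidate]:
    """Sort candidates from best to worst by a weighted sum (stand-in ranking rule)."""
    def score(c: Candidate) -> float:
        return sum(w * u for w, u in zip(weights, c.utilities))
    return sorted(candidates, key=score, reverse=True)


# Hypothetical utilities for three candidates evaluated on three criteria.
candidates = [
    Candidate("a1", (0.9, 0.2, 0.4)),
    Candidate("a2", (0.5, 0.8, 0.7)),
    Candidate("a3", (0.3, 0.3, 0.9)),
]
weights = (0.5, 0.3, 0.2)  # hypothetical criteria weights summing to 1

best = rank_candidates(candidates, weights)[0]
print(best.name)
```

Any other MCDM aggregation can replace `score` without changing the surrounding structure, which is the property the formulation relies on.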

Alternative Computation Method for Local-Space Exploration
To compute the candidate observation location list, the autonomous robot must first define the safe navigation area in its field of view [28]. In the context of this research, a safe area is defined as an object-free space that is visible to the robot at its current state and through which the autonomous agent can move freely.
Considering the proposed turtle-bot robot design, the attached Hokuyo laser sensor uses 720 light beams to detect obstacles in the environment. However, using the entire free area for candidate computation would not be effective: for each iteration, the decision matrix would be computed from 720 alternatives with their respective criteria. To simplify the calculation process, the safe area is segmented into separate regions by grouping the laser beams into similar-length sets, applying a threshold of two metres. For each segment, a candidate observation location is computed, and around each of these locations, meaningful geometrical data is extracted based on the selected criteria.
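One possible reading of this segmentation step is sketched below. The text does not specify how "similar-length" is tested or where inside a segment the candidate is placed, so two assumptions are made here: neighbouring beams stay in the same segment while their ranges differ by less than the two-metre threshold, and the candidate is taken as the endpoint of the middle beam of the segment. The function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def segment_beams(ranges: Sequence[float], threshold: float = 2.0) -> List[List[int]]:
    """Group consecutive beam indices whose neighbouring ranges differ by < threshold (m)."""
    if not ranges:
        return []
    segments, current = [], [0]
    for i in range(1, len(ranges)):
        if abs(ranges[i] - ranges[i - 1]) < threshold:
            current.append(i)
        else:
            segments.append(current)  # range jump: close the segment, start a new one
            current = [i]
    segments.append(current)
    return segments


def candidate_location(segment: Sequence[int], ranges: Sequence[float],
                       fov: float = math.pi) -> Tuple[float, float]:
    """Assumed candidate: endpoint of the segment's middle beam, in the robot frame."""
    mid = segment[len(segment) // 2]
    # Beams are assumed to be evenly spread over the field of view, centred on the heading.
    angle = -fov / 2 + fov * mid / (len(ranges) - 1)
    return ranges[mid] * math.cos(angle), ranges[mid] * math.sin(angle)


# Nine hypothetical beam ranges (m) instead of the 720 beams of the real sensor.
ranges = [1.0, 1.1, 1.2, 5.0, 5.1, 5.2, 5.1, 2.0, 2.1]
segments = segment_beams(ranges)
candidates = [candidate_location(s, ranges) for s in segments]
print(segments)
```

With this grouping the decision matrix has one row per segment rather than one per beam, which is the reduction the paragraph above motivates.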

Criteria Set for Sustainable Environment Exploration
To address the sustainability factors in the autonomous decision-making process, a new candidate observation location evaluation strategy is proposed. A new criteria set is constructed from two main components: three standard criteria, which are commonly applied for greedy map building methods; and three new criteria, constructed to specifically address the economic-related robot safety and re-usability factors for the autonomous harsh environment exploration.
The standard greedy map building strategy relies on the evaluation of the estimated amount of information that would be visible from the new observation point, the length of the collision-free path and the battery consumption rate. Although these criteria are commonly applied in route planning tasks, they are not sufficient for navigation in harsh environments. The inability to identify hazardous obstacles and evaluate their impact on the robot system is a critical design flaw that directly contradicts the sustainability paradigm. Therefore, in the context of this research, the UGV's decision-making module is expanded by introducing three criteria: the ratio between the detected drive-through region and the standard door size, the distance to the detected hazardous obstacle, and the distance to the nearest vision-occluding object. It is worth emphasising that the constructed criteria list is by no means exhaustive and can be easily expanded or adjusted to address any new sustainability requirements. All criteria utilised in this research have been applied separately in autonomous systems, each with its own success rate. However, the proposed criteria set addresses the crucial economic aspects of the sustainable environment exploration process. The authors argue that such contributions in the autonomous agent design phase can have a significant impact on the robot's safety and, as a result, on overall maintenance and repair costs. The full criteria list, relative sustainability factors and used measurement units are presented in Table 1.

| Criterion | Optimum | Unit |
|---|---|---|
| Standard Criteria | | |
| The anticipated amount of new information. | max | m² |
| The length of a visible collision-free path in the robot's local-space. | max | m |
| The battery consumption rate. | min | s |
| Proposed Criteria | | |
| The ratio between the detected drive-through region and standard door size. | max | - |
| The distance to the detected hazardous object. | max | m |
| The distance to the nearest vision-occluding object. | max | m |

First, the standard criteria list, which consists of the anticipated information gain, the length of the collision-free path and the battery consumption rate, will be described.
In the context of this research, anticipated information gain is measured by applying the methodology proposed by Basilico and Amigoni [9]. Considering an integrated grid map building system and the robot's ability to track its current location $p_{cur}$ and movement direction, the free cell count around the candidate location $a$ can be estimated. This calculation is achieved by subtracting the known map information from the total area that would be visible to the robot from the considered observation point. By knowing the map resolution, this cell count can be further converted to metric units for easier processing. Similarly, the robot can estimate how much information it could acquire during the movement to the endpoint of the collision-free route. However, it is worth noting that the estimate can differ greatly from the actual result, depending on how cluttered the environment is at the destination point.
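The estimate described above can be sketched as a count of still-unknown grid cells within sensor range of the candidate, converted to square metres through the map resolution. This is a simplified illustration and not the exact procedure of [9]: occlusion by obstacles and the 180° field of view are ignored, and the cell encoding and function name are hypothetical.

```python
import math
from typing import List, Tuple

UNKNOWN, FREE, OCCUPIED = -1, 0, 1  # hypothetical cell encoding


def anticipated_information_gain(grid: List[List[int]], candidate: Tuple[int, int],
                                 sensor_range: float, resolution: float) -> float:
    """Area (m^2) of unknown cells within sensor_range (m) of candidate (row, col).

    Equivalent to: visible area minus the already known part of it.
    Occlusion is ignored, so the value is an optimistic estimate.
    """
    cand_row, cand_col = candidate
    unknown_cells = 0
    for r, row in enumerate(grid):
        for c, state in enumerate(row):
            dist_m = math.hypot(r - cand_row, c - cand_col) * resolution
            if dist_m <= sensor_range and state == UNKNOWN:
                unknown_cells += 1
    return unknown_cells * resolution ** 2  # cell count converted to square metres


# 5 x 5 map with 0.5 m cells; two cells around the candidate are already known.
grid = [[UNKNOWN] * 5 for _ in range(5)]
grid[2][2] = FREE
grid[2][1] = FREE
gain = anticipated_information_gain(grid, (2, 2), sensor_range=0.5, resolution=0.5)
print(gain)
```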
The length of the collision-free path to the candidate observation location is the maximum possible distance that the robot can traverse in the safe area segment. In the robot's local navigation space, this parameter is measured as the Euclidean distance between the current robot location $p_{cur}$ and the candidate location $a$. In autonomous harsh environment exploration scenarios with a time limit, the robot should maximise the travel distance while also minimising the energy consumption rate.
In a sustainable system, the battery consumption rate is a typical criterion that helps to evaluate the cost of any mechanical or computational action. An effective system should preserve as much energy as possible while also ensuring the highest performance [2]. In other words, energy consumption should be minimised without affecting the overall system performance. In the context of this research, this criterion is measured by evaluating the amount of energy that is needed to reach the candidate location. To estimate the value of this parameter, the robot utilises a simple time-based methodology proposed in [9].
To exhaustively explore the unknown environment, the robot should visit all the regions in the vicinity, giving priority to finding the corridors. In structured human-made environments, rooms, corridors and other enclosed spaces are often separated by doors. Therefore, to further segment the exploration environment and assist the robot in visiting or leaving these areas, one more criterion is added to support the base criteria list: the ratio between the constant $\delta$, representing the standard door size, and the detected wall cavity length $L_d$. For calculation purposes, the robot only uses values of $L_d$ that are bigger than its width. The most common internal door size in England and Wales, 1981 × 762 × 35 mm [29], is chosen, setting $\delta = 0.762$ m. The criterion value is measured by applying the following Equation (1):
The base criteria set commonly addresses the greedy environment exploration methodology. However, to address the core problem of this research, the designed autonomous robot decision-making system must be capable of exhaustively evaluating the local navigation space. The short-term decisions need to be robust and minimise the probability of damaging or losing the autonomous agent. Therefore, the flexibility of multi-criteria decision-making frameworks can be exploited by adding two criteria that address the safety of the autonomous robot and support the economic factors of a sustainable system. These criteria are the distance to the visible hazardous object and the distance to the nearest vision-occluding object.
The probability of causing severe damage to the robot is the most crucial factor that should be considered while developing a sustainable system. In harsh environments, there are numerous unpredictable events and dangerous objects that can destroy the autonomous robot. Naturally, the decision-making module should avoid any threat it can recognise and choose the safest path possible. In the context of this research, fire is proposed as the primary damage source because of the frequency of fire-related events in real-world harsh environment scenarios [30]. This criterion can also be measured by using simple geometry, by calculating the Euclidean distance from the candidate observation location to the detected hazardous object. From this distance, the radius of the hazardous obstacle's effect zone should be subtracted to estimate the real safe navigation area and help the decision-making module to choose the safest alternative.
The probability of colliding with an unseen dynamic object is also high in harsh environments. Constant tracking of the distance to the nearest vision-occluding objects is the main requirement to ensure that the robot can fully stop before a sharp turn and avoid collision with an unseen dynamic object that may cross the movement trajectory. By keeping further away from sharp corners, the robot can choose a safer route, leaving enough time for collision avoidance manoeuvres and the emergency brake function [31,32].
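The hazard-distance criterion described above reduces to a Euclidean distance minus the radius of the effect zone. A minimal sketch follows; the function name, coordinates and radius are hypothetical, and a planar robot frame in metres is assumed.

```python
import math
from typing import Tuple


def hazard_clearance(candidate: Tuple[float, float],
                     hazard: Tuple[float, float],
                     effect_radius: float) -> float:
    """Distance (m) from the candidate to the edge of the hazard's effect zone.

    A negative value means the candidate lies inside the effect zone.
    """
    return math.dist(candidate, hazard) - effect_radius


# Hypothetical fire source at the origin with a 1.5 m effect zone.
clearance = hazard_clearance((3.0, 4.0), (0.0, 0.0), effect_radius=1.5)
print(clearance)
```

Since the criterion is maximised, candidates with a larger clearance are preferred, and a negative clearance can be used to discard a candidate outright.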

Criteria Weight Selection
In general, criteria weights indicate the importance of one criterion in relation to other criteria. Deliberate criteria weight selection is essential to efficiently solve multi-criteria decision-making problems, and therefore only well-founded weighting factors should be used in the decision-making process [33]. To address this problem, the Step-Wise Weights Assessment Ratio Analysis method, namely SWARA, proposed by Kersuliene et al. [33], is utilised.
Unlike commonly applied weight determination methods, the SWARA method provides the means to estimate the opinions of experts and interest groups (stakeholders) about the significance of the criteria, based on the accumulated experience, knowledge and available information [33]. This feature is especially important in harsh environment exploration scenarios, when multiple contradicting criteria have to be addressed. According to the research of Harbers et al. [34], different stakeholders can have different values and can prioritise different criteria. For example, firefighters working in the same harsh environment as an autonomous robot can value access to information provided by the robot more than its safety. However, authorities that provide the robot may prioritise economic factors and re-usability of the system, creating a so-called value tension between the stakeholders [34]. In these scenarios, the SWARA method can be applied to normalise the tensions between the interest groups and assist in developing a more dynamic system.
The process of weight determination with the SWARA method can be described in the six following steps:
(1) First, the list of task-specific criteria is constructed;
(2) Then experts rank criteria by their significance in descending order, as shown in Table 2;
(3) At the third step, the comparative importance of average value $s_j$ is calculated;
(4) Characteristics of the comparative importance are determined by $k_j = s_j + 1$;
Considering the proposed environment exploration strategy, ten experts with a background in the field of robotics, artificial intelligence and decision-making systems have unanimously agreed on the importance of criteria and their order. The ranking results are provided in Table 2.

Table 2. Criteria ranking by their significance for autonomous harsh environment exploration task.

| Criterion | Description | Optimum | Unit |
|---|---|---|---|
| … | … | … | … |
| | The battery consumption rate. | min | s |
| $c_6$ | The distance to the nearest vision-occluding object. | max | m |

Participants also provided their insights about the criteria assessment problem. The pairwise comparison of criteria relative importance is shown in Table 3. Table 4 presents the results obtained by the SWARA method and, most importantly, the final criteria weights.
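The weight computation can be sketched as follows. Only the first four steps are stated in the text; the sketch assumes that the remaining two follow the standard SWARA formulation, in which the recalculated weight is $q_j = q_{j-1} / k_j$ with $q_1 = 1$ and the final weight is $w_j = q_j / \sum_k q_k$. The $s_j$ values below are hypothetical placeholders and are not the expert data of Table 3.

```python
from typing import List, Sequence


def swara_weights(s: Sequence[float]) -> List[float]:
    """Compute SWARA weights for criteria already sorted by descending significance.

    s[j] is the averaged comparative importance of criterion j relative to
    criterion j-1; s[0] is unused because the top criterion has no predecessor.
    """
    k = [1.0] + [sj + 1.0 for sj in s[1:]]   # step 4: k_j = s_j + 1, with k_1 = 1
    q = [1.0]                                 # assumed step 5: q_1 = 1
    for kj in k[1:]:
        q.append(q[-1] / kj)                  # assumed step 5: q_j = q_{j-1} / k_j
    total = sum(q)
    return [qj / total for qj in q]           # assumed step 6: w_j = q_j / sum(q)


# Hypothetical comparative importance values for four ranked criteria.
s = [0.0, 0.25, 0.25, 1.0]
weights = swara_weights(s)
print([round(w, 4) for w in weights])
```

Because every $k_j \ge 1$, the resulting weights never increase down the ranking, which matches the descending order agreed by the experts.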

WASPAS Framework by the Single-Valued Neutrosophic Set
The history of the MCDM method utilised to develop the exploration strategy dates back to 2012, when the Weighted Aggregated Sum Product Assessment (WASPAS) framework was first proposed by Zavadskas et al. [35]. The originally described method aggregates the Weighted Product Model (WPM) and the Weighted Sum Model (WSM) to construct a universal decision-making strategy. In 2014, the original WASPAS MCDM method was extended to tackle the uncertainty of the initial data. The extension is set under the interval-valued intuitionistic fuzzy numbers and is referred to as WASPAS-IVIF [36]. In 2015, Zavadskas et al. [37] proposed a novel technique to address vague input data and improve the accuracy of the decision-making process: the Weighted Aggregated Sum Product Assessment method with grey attribute scores, namely WASPAS-G. In the same year, Turskis et al. [38] proposed a fuzzy multi-attribute performance measurement framework that allows dealing with qualitative parameters in a natural way under uncertainty. Lastly, a new neutrosophic extension to the WASPAS MCDM method, namely WASPAS-SVNS, was introduced in 2015 by Zavadskas et al. [14]. The neutrosophic sets were proposed by Smarandache [21] in 1999 as a framework to model and solve real-world problems with uncertainty. WASPAS-SVNS is built under the environment of single-valued neutrosophic sets, which provides the tools for modelling and evaluating the sensor input data in the context of three membership functions: truth, falsity and indeterminacy. The general concept of neutrosophic sets used in WASPAS-SVNS can be defined as follows:

Definition 1. Let $X$ be the space of the modelled problem-related objects and $x \in X$. The neutrosophic set $A$ in $X$ is defined by three functions: the truth-membership function $T_A(x)$, the indeterminacy-membership function $I_A(x)$ and the falsity-membership function $F_A(x)$.

Each function is defined by real standard or real non-standard subsets of $]0^-, 1^+[$, that is, $T_A(x): X \to\, ]0^-, 1^+[$, $I_A(x): X \to\, ]0^-, 1^+[$ and $F_A(x): X \to\, ]0^-, 1^+[$. In contrast to the application of other fuzzy sets, no restrictions are imposed on the sum of the truth, indeterminacy and falsity membership functions of a neutrosophic set. Therefore, the sum of $T_A(x)$, $I_A(x)$ and $F_A(x)$ can be expressed as:

$$0^- \le \sup T_A(x) + \sup I_A(x) + \sup F_A(x) \le 3^+. \tag{1}$$

Definition 2. SVNS is a simplified version of the neutrosophic set. Let $X$ be a universal space of objects and $x \in X$. The single-valued neutrosophic set $N \subset X$ can be expressed by the following formula:

$$N = \{\langle x, T_N(x), I_N(x), F_N(x) \rangle : x \in X\}, \tag{2}$$

where $T_N(x), I_N(x), F_N(x): X \to [0, 1]$ and $0 \le T_N(x) + I_N(x) + F_N(x) \le 3$ for all $x \in X$. The values of $T_N(x)$ correspond to the truth-membership degree, $I_N(x)$ to the indeterminacy-membership degree, and $F_N(x)$ to the falsity-membership degree of $x$ to $N$, respectively. When $X$ consists of a single element, $N$ is called a single-valued neutrosophic number (SVNN) and can be expressed as:

$$N = (t_N, i_N, f_N), \quad t_N, i_N, f_N \in [0, 1], \quad 0 \le t_N + i_N + f_N \le 3. \tag{3}$$

Integration of the neutrosophication concepts into the decision-making method requires the neutrosophic set algebra, which is the fundamental part of the WASPAS-SVNS framework. This decision-making method is composed of seven stages, which can be presented as follows:

Stage 1. The decision matrix $X$ is constructed from the computed alternative set with respect to the considered criteria. The matrix elements can be expressed as $x_{ij}$, where $i = 1, 2, \ldots, m$; $j = 1, 2, \ldots, n$. In this case, $x_{ij}$ is the rating of alternative $i$ with respect to criterion $j$. The constructed aggregated decision matrix can be defined as:

$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m1} & x_{m2} & \cdots & x_{mn} \end{bmatrix}. \tag{4}$$

Stage 2. This stage consists of the normalisation of the decision matrix $X$, which is achieved by implementing the vector normalisation method expressed by the following equation:

$$\tilde{x}_{ij} = \frac{x_{ij}}{\sqrt{\sum_{i=1}^{m} x_{ij}^{2}}}. \tag{5}$$

Stage 3. In this stage, the neutrosophication of the obtained normalised aggregated decision matrix $\tilde{X}$ in the crisp form, and of the weight vector $w$, is performed. As a result, the neutrosophic aggregated decision matrix $\tilde{X}^n$ is computed.
For this conversion, the relationships between the single-valued neutrosophic numbers and the crisp normalised terms of the alternatives are applied. The linguistic definitions of these conversion grades are provided in Table 5.

Table 5. Neutrosophication grades to rate the importance of the alternatives.

Crisp Normalised Terms SVNNs
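
Returning to Stage 2, the vector normalisation that precedes neutrosophication can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `vector_normalise` and the sample matrix are assumptions, and a zero column is mapped to zeros here simply to avoid division by zero.

```python
import math


def vector_normalise(matrix):
    """Divide every entry by the Euclidean norm of its criterion column.

    matrix[i][j] is the rating of alternative i for criterion j.
    """
    n_cols = len(matrix[0])
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_cols)]
    return [
        [row[j] / norms[j] if norms[j] else 0.0 for j in range(n_cols)]
        for row in matrix
    ]


# Hypothetical 2 x 2 decision matrix (two alternatives, two criteria).
sample = [[3.0, 1.0], [4.0, 0.0]]
normalised = vector_normalise(sample)
```

After normalisation, each criterion column has unit Euclidean length, so criteria measured in different units become comparable before the conversion grades are applied.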
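Stage 4. To apply the first WASPAS-SVNS decision-making strategy, the total relative importance of the alternative is calculated by using the following equation:

$$\tilde{Q}_i^{(1)} = \sum_{j=1}^{L_{\max}} \tilde{x}^{n}_{+ij} \cdot w_{+j} + \left( \sum_{j=1}^{L_{\min}} \tilde{x}^{n}_{-ij} \cdot w_{-j} \right)^{c}, \tag{6}$$

where $i$ is the alternative, $\tilde{x}^{n}_{+ij}$ and $w_{+j}$ correspond to the maximised criteria, $\tilde{x}^{n}_{-ij}$ and $w_{-j}$ correspond to the minimised criteria, and $L_{\max}$ and $L_{\min}$ are the numbers of maximised and minimised criteria, respectively. The summation of two SVNNs $N_1 = (t_1, i_1, f_1)$ and $N_2 = (t_2, i_2, f_2)$ can be performed by the following neutrosophic set algebra equation:

$$N_1 \oplus N_2 = (t_1 + t_2 - t_1 t_2,\; i_1 i_2,\; f_1 f_2). \tag{7}$$

The second term of the summation consists of the complementary neutrosophic number component, which can be defined by applying the following equation:

$$N_1^{c} = (f_1,\; 1 - i_1,\; t_1). \tag{8}$$

Stage 5. In this stage, the second WASPAS-SVNS decision-making strategy is applied, and the product total relative importance of the alternative $i$ is calculated by using the following expression:

$$\tilde{Q}_i^{(2)} = \prod_{j=1}^{L_{\max}} \left( \tilde{x}^{n}_{+ij} \right)^{w_{+j}} \cdot \left( \prod_{j=1}^{L_{\min}} \left( \tilde{x}^{n}_{-ij} \right)^{w_{-j}} \right)^{c}. \tag{9}$$

The components of this expression are defined identically to those of Equation (6). The multiplication of two SVNNs $N_1 = (t_1, i_1, f_1)$ and $N_2 = (t_2, i_2, f_2)$ can be calculated by using the following neutrosophic algebra equation:

$$N_1 \otimes N_2 = (t_1 t_2,\; i_1 + i_2 - i_1 i_2,\; f_1 + f_2 - f_1 f_2). \tag{10}$$

If $N_1 = (t_1, i_1, f_1)$ is a single-valued neutrosophic number and $\lambda \in \mathbb{R}$, $\lambda > 0$, is an arbitrary positive real number, the multiplication between the neutrosophic and the real number can be expressed as:

$$\lambda N_1 = \left( 1 - (1 - t_1)^{\lambda},\; i_1^{\lambda},\; f_1^{\lambda} \right). \tag{11}$$

The power function of the single-valued neutrosophic number $N_1 = (t_1, i_1, f_1)$ and the arbitrary positive real number $\lambda$ can be calculated by the following equation:

$$N_1^{\lambda} = \left( t_1^{\lambda},\; 1 - (1 - i_1)^{\lambda},\; 1 - (1 - f_1)^{\lambda} \right). \tag{12}$$

Stage 6. The joint generalised criterion that incorporates the results obtained in the 4th and 5th stages is determined by applying the following expression:

$$\tilde{Q}_i = 0.5\, \tilde{Q}_i^{(1)} + 0.5\, \tilde{Q}_i^{(2)}. \tag{13}$$

Stage 7. In the last stage, the score function $S(\tilde{Q}_i)$ is applied to determine the alternative rankings. If $N_A = (t_A, i_A, f_A)$ is a single-valued neutrosophic number, the score function can be defined by the following equation:

$$S(N_A) = \frac{3 + t_A - 2 i_A - f_A}{4}. \tag{14}$$

The crisp outputs $S(N_A) \in [0, 1]$ are ranked in descending order, and the alternative with the maximum value is considered to be the solution for the next observation position selection problem. The results of this score function are in the same range interval as all functions applied in the definition of the neutrosophic sets [14].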
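
Stages 4 to 7 can be illustrated with a short Python sketch. It assumes the commonly used SVNN operations (sum, product, scalar multiplication, power, complement) and the score function $S(N) = (3 + t - 2i - f)/4$, with crisp criteria weights. All names (`SVNN`, `waspas_svns_score`) and the sample values are hypothetical and do not come from Tables 6 to 8.

```python
from functools import reduce
from typing import NamedTuple


class SVNN(NamedTuple):
    t: float  # truth-membership degree
    i: float  # indeterminacy-membership degree
    f: float  # falsity-membership degree


def add(a, b):
    return SVNN(a.t + b.t - a.t * b.t, a.i * b.i, a.f * b.f)


def mul(a, b):
    return SVNN(a.t * b.t, a.i + b.i - a.i * b.i, a.f + b.f - a.f * b.f)


def scale(a, lam):  # lam * a, with lam > 0
    return SVNN(1 - (1 - a.t) ** lam, a.i ** lam, a.f ** lam)


def power(a, lam):  # a ** lam, with lam > 0
    return SVNN(a.t ** lam, 1 - (1 - a.i) ** lam, 1 - (1 - a.f) ** lam)


def complement(a):
    return SVNN(a.f, 1 - a.i, a.t)


def score(a):
    return (3 + a.t - 2 * a.i - a.f) / 4


ZERO = SVNN(0.0, 1.0, 1.0)  # neutral element of add
ONE = SVNN(1.0, 0.0, 0.0)   # neutral element of mul


def waspas_svns_score(row, weights, maximise):
    """Score one alternative; row holds SVNNs, weights are crisp numbers."""
    ben = [(x, w) for x, w, m in zip(row, weights, maximise) if m]
    cost = [(x, w) for x, w, m in zip(row, weights, maximise) if not m]
    q1 = reduce(add, (scale(x, w) for x, w in ben), ZERO)   # weighted sum
    q2 = reduce(mul, (power(x, w) for x, w in ben), ONE)    # weighted product
    if cost:  # minimised criteria enter through the complement
        q1 = add(q1, complement(reduce(add, (scale(x, w) for x, w in cost), ZERO)))
        q2 = mul(q2, complement(reduce(mul, (power(x, w) for x, w in cost), ONE)))
    joint = add(scale(q1, 0.5), scale(q2, 0.5))             # joint criterion
    return score(joint)


# Hypothetical example: one maximised and one minimised criterion.
weights, maximise = [0.6, 0.4], [True, False]
alt_a = [SVNN(0.9, 0.1, 0.1), SVNN(0.2, 0.8, 0.8)]
alt_b = [SVNN(0.5, 0.5, 0.5), SVNN(0.5, 0.5, 0.5)]
score_a = waspas_svns_score(alt_a, weights, maximise)
score_b = waspas_svns_score(alt_b, weights, maximise)
```

Sorting the alternatives by this score in descending order yields the ranking described in Stage 7. In this hypothetical example, `alt_a` rates higher on the maximised criterion and lower on the minimised one, so it receives the larger score.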

Experiment Environment
In the context of this research, an unknown harsh environment exploration scenario is considered. The robot is tasked to safely navigate through the disaster site and build a representative environment map within the given time limit of 20 min. The experiment is conducted in a virtual environment created with the Gazebo software [39]. For ease of reproduction, the standard Willow Garage building model provided by Gazebo is used. This building has several small- and large-scale rooms, interconnecting corridors and narrow passages, typical of man-made structures. To simulate the harsh environment and test the efficiency of the proposed sustainable exploration strategy, several non-expanding fire sources were added at random locations within the building. The whole test environment is shown in Figure 3.

Example of the Next-Best Observation Location Selection by WASPAS-SVNS
To highlight the proposed decision-making strategy and to provide a numerical example of next-best observation location selection, the solution of one decision-making iteration is considered. Assuming that the autonomous agent is located at the position shown in Figure 4, the safe navigation area segments, computed in the robot's field of view, are coloured in blue. A total of six candidate observation locations, denoted as $a_1, a_2, \ldots, a_6$, were computed, one for each segment. Each alternative is evaluated on the basis of the proposed criteria that address the sustainability factors of autonomous environment exploration. Criteria weights are obtained by using the SWARA method, as shown in Table 4. The initial decision matrix computed at the sample location is provided in Table 6.

The aggregated decision matrix, obtained by the neutrosophication conversion method, is presented in Table 7. The numerical results of WASPAS-SVNS framework stages 4-7 are presented in Table 8. The ranking of the alternatives is calculated by applying the score function (Equation (14)). It can be observed that alternative location $a_5$ is superior to the other alternatives and should therefore be chosen as the next observation location for the robot to move to. Alternatives $a_4$ and $a_2$ are the second- and third-best candidates. If the robot followed these routes, it would be directed further from vision-occluding objects or would leave the room through the doors on the left side of the map. However, the proposed strategy prioritises the safety factor and battery preservation, and it also directs the robot to the nearest exit. This behaviour is expected to maximise the overall mapped area in the given time interval.

Results and Discussion
To illustrate the efficiency of the proposed sustainable environment exploration strategy, a comparison between the proposed method and the standard greedy exploration strategy is provided. Two representative maps were built by the autonomous robot in a 20-min time interval. The first map, shown in Figure 5, was computed by a greedy autonomous agent controlled only by the three base criteria: $c_3$, $c_4$ and $c_5$. The criteria weights were set to 0.5, 0.3 and 0.2, following related research [9], in which a similar MCDM-based candidate evaluation strategy is utilised. The second map, shown in Figure 6, was obtained by applying the proposed exploration strategy, which incorporates economic sustainability principles.

From the provided examples, it can be observed that the proposed environment exploration strategy enables the robot to successfully avoid dangerous obstacles that can destroy or damage the autonomous agent. Considering Figure 5, it can be seen that, by using only the base criteria to evaluate the observation location, the autonomous agent drove directly through the dangerous obstacles twice. However, the movement trajectory in Figure 6 clearly shows the impact of the $c_1$ criterion: the robot navigates around the dangerous obstacles, ensuring its survival. Moreover, the autonomous agent is directed towards narrow and lengthy corridor spaces more often than towards enclosed areas, such as rooms. This behaviour can be linked to the combined influence of the $c_2$ and $c_3$ criteria. Because of the $c_3$ criterion, the robot prefers to choose distant locations that can provide more information about the environment. By integrating the newly proposed $c_2$ criterion, the agent can detect and evaluate the door-like structures that connect corridors. In general, such iterative behaviour leads the autonomous robot to more distant parts of the map in fewer steps, which in theory can help the agent to preserve more energy in long-term missions.
Improved damage awareness is also observed when comparing the proposed and greedy strategies. By integrating the $c_6$ criterion into the decision-making process, the autonomous agent is forced to keep a safe distance from sharp turns. From the robot movement trajectory, it can be seen that at locations where doors are lined up opposite each other, the robot stays in the middle of the corridor, keeping the same distance from both sides. However, in situations where only a single door was detected, the robot often took a semicircular movement trajectory, trying to keep a certain distance from the estimated damage source. This behaviour, achieved by the WASPAS-SVNS multi-criteria decision-making method, directly addresses the economic factors of sustainability, helping the robot to increase its survival time in a harsh environment.
However, the proposed method also has some flaws. In the context of this research, the map building process is based only on local-space exploration. In some situations, this methodology can force the robot to choose a location that is not optimal in the current state. After making a decision, the robot can move away from other enclosed locations that are behind it or in its close vicinity. This behaviour can be seen in Figures 5 and 6, where the robot drove through the entire length of the corridor without checking the nearby rooms. In theory, this problem could be solved by mixing local-space and global-space exploration models; however, further research is needed.
The authors believe that a solid foundation has been laid for future research on autonomous robot environment exploration strategies that utilise MCDM methods to address the sustainability principles. However, compared to real-world scenarios, the test environment considered in this research is simplified. Future research could focus on expanding the autonomous robot decision-making module to address such problems as global and local space exploration, expanding hazardous obstacles and uneven navigation terrains. More exhaustive questionnaires for experts could also be prepared to identify common stakeholder needs related to environment exploration or search and rescue missions.

Conclusions and Future Work
In this research, sustainability principles were integrated into the fast-expanding field of autonomous mobile robot systems by addressing the autonomous harsh environment exploration problem. A commonly used iterative map building strategy was applied and modelled as a multi-criteria decision-making problem. To solve this problem, a new area exploration strategy that takes into consideration the economic aspects of autonomous environment exploration was proposed. The integration of sustainability principles was achieved by implementing the WASPAS-SVNS decision-making framework, which was developed under the single-valued neutrosophic set environment. WASPAS-SVNS is a fast and powerful framework that provides the means to assess and combine an essentially unlimited number of criteria to solve complex real-world decision-making problems.
Unlike other fuzzy MCDM methods, WASPAS-SVNS can deal with the truth, falsity and indeterminacy membership functions separately, allowing a more accurate evaluation of alternatives when dealing with partial environment information. Combining this framework with the SWARA criteria weight determination method creates a highly dynamic decision-making system that can be adjusted to address specific expert and stakeholder needs.
Considering the experimental evaluation, it can be confirmed that the proposed exploration strategy was successfully utilised in a harsh environment. Compared to the greedy baseline strategy, the decision-making model forms a movement path that is oriented away from dangerous obstacles, keeping the robot safe throughout the given time limit. The considered numerical example and the provided navigational segments show the efficiency and flexibility of the proposed decision-making strategy.
However, the field of autonomous robot decision-making systems continues to grow and is yet to be exhaustively studied. In the context of this research, only the economic factor of the sustainability principles was addressed, and the efficiency of the MCDM method utilisation was presented. To develop a fully sustainable mobile robot, social and environmental factors should also be considered.
In future work, the authors consider expanding the list of criteria used for harsh environment exploration. These criteria could be used not only to develop a robust decision-making system, but also to provide a foundation for designers developing similar systems. The authors also consider developing an adaptable local and global space exploration system based on MCDM methodology.