Development of a Test-Bench for Evaluating the Embedded Implementation of the Improved Elephant Herding Optimization Algorithm Applied to Energy-Based Acoustic Localization

The present work addresses the development of a test-bench for the embedded implementation, validation, and testing of the recently proposed Improved Elephant Herding Optimization (iEHO) algorithm, applied to the acoustic localization problem. The implemented methodology aims to corroborate the feasibility of applying iEHO in real-time applications on low-complexity and low-power devices, where three different electronic modules are used and tested. Swarm-based metaheuristic methods are usually examined by employing high-level languages on centralized computers, demonstrating their capability in finding global or good local solutions. This work considers an iEHO implementation in C running on an embedded processor. Several random scenarios are generated, uploaded, and processed by the embedded processor to demonstrate the algorithm's effectiveness and the test-bench's usability, low complexity, and high reliability. On the one hand, the results obtained in our test-bench are concordant with the high-level implementations using MatLab® in terms of accuracy. On the other hand, concerning the processing time and as a breakthrough, the results obtained over the test-bench demonstrate the high suitability of the embedded iEHO implementation for real-time applications due to its low latency.


Introduction
Embedded systems are widespread in our daily life. They are a vital component of larger structures such as wireless sensor networks (WSNs) [1], the Internet of Things (IoT) [2], automotive electronics [3], home automation [4], energy management [5,6,7], noise monitoring [8,9], and autonomous vehicles [10,11], among several others. Elecia White defines an embedded system as "a computerized system that is purpose-built for its application" [12]. Extending this concept, today's modern systems are based on microcontrollers with integrated memory (volatile and nonvolatile), digital inputs/outputs, and application-specific peripherals. Features such as analog-to-digital or digital-to-analog conversion, connectivity (802.15.4, 802.11b/g/n, and LoRa), or power management make them well suited for data harvesting, sensing, and actuating on the physical environment as edge devices [13]. Bearing in mind that this type of equipment produces knowledge in the context of sensor networks on IoT platforms, the integration of new sources of information (based on more efficient algorithms) should be considered an essential direction of research. In this context, power management, coupled with the efficient implementation of integrated algorithms, is key to achieving significantly longer battery life in embedded systems [14,15]. Embedded system design has several constraints, namely cost, energy consumption, performance, processing time, flexibility, time, or sustainability [12]. When it comes to WSNs, features such as energy consumption and performance are of crucial importance. Wireless sensor nodes (or simply sensors) typically aim to retain their energy level for an extended time, monitor the required data, and send it to a base station located in a remote place. When the sensors are distributed in geographically isolated areas, they are deployed with batteries that cannot be replaced, or whose replacement comes at a high cost [16].
Following the same line of reasoning, location information plays an important role in several fields and applications, namely acoustic localization. Some examples of application can be found in shooter localization in urban terrain [17], smart surveillance [18], wildlife [19], or robotics [20]. The state of the art comprises some platforms that have been proposed in the context of acoustic acquisition. Nonetheless, localization processing is mostly done offline, on a central computing unit, or, when considered locally, on complex and expensive devices such as field-programmable gate arrays (FPGAs) [9,10]. Some proposals of hardware platforms presented in the context of acoustic acquisition can be found in [21], but they require complex mathematical processing. The platform described in [22] performs the acoustic processing stages (detection, classification, and analysis) on the sensor itself, creating the need for complex sensor architectures. In [23], an application to support Ambient Assisted Living based on acoustic events was proposed for indoor environments. However, it is confined to a single place of application, again with the need for complex hardware architectures.
On the other hand, swarm-based algorithms have demonstrated their potential in solving complex engineering problems [24,25]. The methodology was originally proposed to mimic the social behavior of a bird flock [26], but nowadays it comprises a broad group of computational methods known as swarm intelligence. In the particular case of acoustic energy-based source localization, an approach based on Elephant Herding Optimization [27] was initially proposed in [28] and improved in [29]. Earlier results demonstrated the applicability of the methodology, as well as a significantly reduced complexity of implementation in comparison with other approaches available in the scientific literature. As such, the present work focuses on developing a test-bench to evaluate the performance of, validate, and test an embedded implementation of the iEHO algorithm [29] for performing localization of acoustic events. In addition to the challenge of putting the mentioned algorithms into operation on processors with low computational resources that use low-level programming languages, the present work aims to develop a reliable and simple test-bench able to cover a wide range of test conditions. Instead of using well-established protocols such as JTAG or ARM Serial Wire Debug [30], which would require specific tools, the setup and observation data are supplied to the processor through a serial link, and the same goes for the obtained results, incurring only a minimal memory overhead.
To the best of the authors' knowledge, no localization algorithms based on swarm optimization have been incorporated directly on the sensors themselves, where maximizing performance and minimizing computational resources are the two main objectives. Therefore, standard inline debug strategies do not suffice to validate the overall performance of this type of implementation. The proposed work has the following objectives: (1) to design and implement an embedded swarm-based methodology for energy-based acoustic localization; (2) to design a simple integrated test-bench to validate the implementation as a proof of concept; (3) to consider a sufficient number of hardware platforms to generalize their usability; (4) to consider a wide set of simulation conditions to extrapolate the effectiveness of the embedded implementation; (5) to validate the embedded results against high-level languages such as MatLab®.
The current work falls in the field of computation, software testing, and applications of sensor networks signal processing, far from computer simulations with well established software platforms, integrating complex libraries and graphical interfaces. The obtained results will thus serve as a proof of concept given that they are achieved through real-life platforms for product development. All procedures will thus be embedded, namely the data sources and performance metrics calculation.
The remaining paper is organized as follows. Section 2 provides a contextualization with related embedded testing frameworks and protocols. Section 3 introduces the problem of interest, based on presenting its theoretical and technical foundations. Section 4 presents the test-bench developed, the methodology used to address the problem and the experimental setup employed in this work. Section 5 presents and discusses the obtained results, and Section 6 concludes the paper and presents future extensions of the present work.

Related Work
The embedded software testing lifecycle has the goal of finding defects and evaluating the performance of implemented algorithms. To achieve these goals, a test procedure containing activities and test data must be planned carefully in order to specify what should be evaluated and how. Therefore, some means of communication with the embedded processor should be designed. Alternatively, an already existing communication channel could be used [31]. To provide standardized approaches, the "IEEE 1149.1 Standard Test Access Port and Boundary-Scan Architecture", also known as JTAG, specifies a debug interface that can be included in any Integrated Circuit (IC) [32]. The standard is widely adopted by IC manufacturers, and it specifies a four-pin test access port and an optional test reset pin as a debug interface for a chip. Although single-wire interfaces were proposed in the literature [33,34], they are only applicable to System on Chip ICs [35,36] or field-programmable gate array designs [37]. When available, a serial communication port is one of the most flexible solutions, requiring only a two-pin full-duplex interface.
When considering the application level, debugging frameworks are usually used for debugging, verifying, and classifying some performance or behaviors. Implementation examples can be found regarding embedded wireless sensors [38], web-based sensors [39], Internet of Things devices [40], power quality sensor nodes [41], or cyber-physical automation systems [42]. Application-specific debugging is known for the decrease in development cost and time of faulty stage detection [43]. The present work relies on embedded acoustic sensor nodes to validate and evaluate a swarm-based algorithm to perform acoustic localization.

Theoretical Background
This section is divided into two parts. The first part introduces the acoustic measurement model and formalizes the localization problem, while the second part summarizes the swarm-based algorithm of interest here, i.e., the iEHO method. As such, the present section intends to provide the necessary theoretical fundamentals to specify the test-bench environment and its components.

Acoustic Problem Formulation
The localization of an acoustic source is considered in a $q$-dimensional (where $q = 2$ or $3$) sensor network, composed of $N$ sensors and one acoustic source. The true (but unknown) location of the source is denoted by $x$ and the known location of the $i$-th sensor by $s_i$, where $i = 1, \dots, N$. The location of the source is determined by employing the acoustic energy measurements acquired by the sensors. The relationship between the acoustic energy and distance is modeled through a decay model [44,45], where it is considered that the sound propagates in free space, the sound is omnidirectional, the propagation medium (air) is homogeneous, and there is no sound reverberation [44]. Each sensor performs $M$ noisy measurements within a time frame $T = M/f_s$, where $f_s$ is the sampling frequency. Thus, when considering the average of the energy signatures over the time window, according to [44,45], the received signal at the $i$-th sensor can be modeled as:
$$y_i = \frac{g_i P}{\|x - s_i\|^{\beta_L}} + \nu_i, \quad i = 1, \dots, N, \qquad (1)$$
where $y_i$ is the averaged energy measured at the $i$-th sensor, $g_i$ is the sensor gain, $P$ is the transmitted power, $\nu_i \sim \mathcal{N}(0, \sigma_{\nu_i}^2)$ represents the measurement noise, assumed to follow a Gaussian distribution with zero mean and variance $\sigma_{\nu_i}^2$, and $\beta_L$ is the path loss exponent. For the sake of simplicity and without loss of generality, the measurement noise is considered equal for all links and thus $\sigma_{\nu_i}^2 = \sigma_\nu^2$ for $i = 1, \dots, N$. The value of $\beta_L$ is typically considered within the interval $[2, 4]$ [46]. In this work, we consider $\beta_L = 2$, since we consider signal propagation in free space, without reflections or reverberations [46]. Incorporating all observations from the multiple sensors (1), the maximum likelihood (ML) estimator of $x$ can be formulated as [44,45]:
$$\hat{x} = \arg\min_{x} \sum_{i=1}^{N} \frac{1}{\sigma_{\nu_i}^2} \left( y_i - \frac{g_i P}{\|x - s_i\|^{\beta_L}} \right)^2. \qquad (2)$$
The mathematical expression (2) is highly nonconvex, has singularities at each sensor coordinate, and exhibits several suboptimal solutions and saddle regions (see Figure 1). Several solutions for tackling (2) have been proposed in the literature, mostly using deterministic approaches.
The work in [47], in which a closed-form solution was proposed, exhibits good performance for low noise power but suffers considerable degradation for higher levels of noise power. A weighted least squares method was presented in [48,49]. Although its computational load is low, its performance degrades in noisy environments, due to the approximations necessary to achieve a linear estimator. Semidefinite relaxation was proposed in [50,51], where good performance was obtained in terms of localization accuracy, even in noisy environments, but at a high computational cost. The problem of high computational cost was partially mitigated by applying Second-Order Cone Programming [52], but the reduced computational complexity is still not suitable for implementation in real-time scenarios, at least for the time being. The authors in [28,53] adapted the classical Elephant Herding Optimization (EHO) method [27] to the problem of energy-based source localization. This implementation of a swarm-based metaheuristic algorithm made it possible to use computationally much simpler algorithms, which achieve a localization accuracy similar to that of their deterministic counterparts.
Apart from the energy model discussed, time difference of arrival has also been employed to solve the source localization problem. The approach consists of estimating the time delay between a pair of microphones. For that purpose, the work presented in [54] used a triangular array consisting of three microphones to determine the source location from observations of a mobile robot. Additionally, in [55], real-time sound localization approaches are evaluated and compared using an 8-microphone array. Similar architectures were presented in [56,57], where the common need for an array of microphones can be considered a drawback for practical applications [58,59]. As such, the present work considers the acoustic energy-based model of [44].
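As a minimal illustration of this subsection, the sketch below evaluates the noise-free energy decay term and the least-squares cost that the ML estimator minimizes when all links share the same noise variance (the constant weight is dropped). The function names, the sensor layout, and the values `g = 1` and `P = 5` are assumptions made for the example only; they are not taken from the original work.

```python
import math

def received_energy(x, s_i, g_i=1.0, P=5.0, beta_L=2.0):
    """Noise-free energy at a sensor: g_i * P / ||x - s_i|| ** beta_L.

    Undefined (division by zero) when x coincides with a sensor position,
    which is the singularity mentioned in the text.
    """
    return g_i * P / math.dist(x, s_i) ** beta_L

def ml_cost(x, sensors, y, g=1.0, P=5.0, beta_L=2.0):
    """Sum of squared residuals between measured and modeled energies.

    With equal noise variance on every link, the 1/variance weight is a
    constant factor and does not change the minimizer, so it is omitted.
    """
    return sum((y_i - received_energy(x, s_i, g, P, beta_L)) ** 2
               for s_i, y_i in zip(sensors, y))

# Hypothetical 2-D layout: four sensors at the corners of a 50 x 50 region.
sensors = [(0.0, 0.0), (50.0, 0.0), (50.0, 50.0), (0.0, 50.0)]
source = (20.0, 30.0)

# Noise-free observations, so the cost vanishes at the true position.
y = [received_energy(source, s) for s in sensors]
cost_at_source = ml_cost(source, sensors, y)
cost_elsewhere = ml_cost((25.0, 25.0), sensors, y)
print(cost_at_source, cost_elsewhere)
```

With noisy observations the minimum no longer sits exactly at the true source, which is why the search is posed as an optimization problem.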

Swarm Optimization
Swarm optimization falls within the set of global optimization algorithms inspired by behavior observed in nature. A group of initial candidate solutions is generated and updated based on a particular swarm behavior in an iterative process. Each new generation is produced by removing less desired solutions and introducing small random changes based on the behavior in question [60]. Regarding EHO, the algorithm emulates the herding behavior of elephants in a group. In nature, elephants belonging to different clans live together under the leadership of a matriarch, and the male elephants leave their family group when they reach adulthood [27]. Thanks to the low number of control parameters and its simple implementation, the EHO algorithm has been successfully applied to several scientific fields. Applications include drone placement [61,62], power flow management [63], image encryption [64], proportional-integral-derivative control [65,66], and localization in WSNs [67,68].
The mathematical models used are based on simple algebraic expressions, allowing an initial population to approach the real solution. While the initial population is usually randomly generated, the authors in [29] developed a new strategy that, by exploiting the particularities of the problem at hand, starts the search procedure around the region in which a suboptimal solution is most likely to be found. The main finding reported in [29] was the significant reduction in the number of population generations. As such, the implementation of this methodology is expected to significantly reduce the processing needs for finding a feasible solution, i.e., the desired location with high accuracy, while guaranteeing low latency on a more generic architecture. Figure 2 shows the algorithm's flowchart, where usual swarm-based method features such as sorting, updating, performing elitism, and evaluating can be found. In the case of iEHO [29], the population is initialized based on distance estimation, the stopping criterion monitors the cost function progress, and the clan operator performs a local search based on a discrete gradient descent method.
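The distance-based initialization mentioned above needs a distance estimate per sensor. One plausible way to obtain it, assuming the noise term is neglected, is to invert the energy decay model; the sketch below shows only that inversion. Whether [29] uses exactly this expression is not stated here, and the function name and the values `g = 1`, `P = 5` are hypothetical.

```python
import math

def estimate_distance(y_i, g_i=1.0, P=5.0, beta_L=2.0):
    """Invert the noise-free decay model: d_i = (g_i * P / y_i) ** (1 / beta_L).

    Noise can push a measurement to zero or below; in that case no finite
    distance can be inferred, and infinity is returned as a sentinel.
    """
    if y_i <= 0.0:
        return math.inf
    return (g_i * P / y_i) ** (1.0 / beta_L)

# Round trip: build a noise-free measurement at 12.5 units and recover it.
true_distance = 12.5
y_i = 1.0 * 5.0 / true_distance ** 2.0
d_hat = estimate_distance(y_i)
print(d_hat)
```

Each estimate defines a circle around its sensor; the region where those circles meet is a natural place to seed the initial population.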
The present work considers an embedded implementation of iEHO [29], using low-cost processors, available as development boards, compatible with the Arduino® Integrated Development Environment (IDE). The implementation of a problem-specific test-bench will allow collecting information for generating performance metrics and evaluating the obtained results. The iEHO algorithm is preferred here over other swarm-based ones due to its reduced number of tuning parameters and its proven effectiveness in the specific problem under study [28,29,53]. In addition, its simplicity contrasts with the need for matrix calculations and other complex mathematical operations that are essential in the deterministic context [69,70]. In order to get a better understanding and appreciation of the obtained results, the standard EHO is also implemented and evaluated for comparison. Thus, it will be possible to evaluate the effect of the iEHO modifications, in terms of the obtained error and the computational time, with regard to the original EHO.

Methodology and Experimental Setup
Embedded software test design techniques can be classified as White-Box or Black-Box, depending on whether or not explicit knowledge of the implementation details is under scrutiny [13]. In a Black-Box test, the system is fed with some inputs, and the resulting outputs are analyzed as to whether they comply with the expected behavior. On the other hand, a White-Box technique would be based on the knowledge of the firmware's internal structure [13]. Since the major concern of our platform is to determine the positions of acoustic events, a Black-Box approach is considered here (Figure 3). The adopted procedure consists of generating $M_c$ random Monte Carlo observations, corrupted by white Gaussian noise $\nu$ of variance $\sigma_\nu^2$, i.e., $\nu \sim \mathcal{N}(0, \sigma_\nu^2)$. Those observations are the only input provided to the embedded processor in the test-bench, which will estimate the unknown position by applying the iEHO algorithm to solve the problem given by Equation (2). Besides the estimated position, to evaluate the implementation performance, additional metrics will be gathered, namely the processing time and the number of evaluations of the cost function. It should be noted that iEHO uses a stopping criterion based on the change in the cost function value between successive iterations ($\Delta f$), so that the method can stop before it attains a predefined maximum number of evaluations [29]. As such, some correlation between the number of function evaluations and the processing time is expected.
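The Monte Carlo input generation described above can be sketched as follows. This is an illustration only: it assumes that a variance quoted in dB means $10\log_{10}(\sigma_\nu^2)$, draws each source uniformly over a 50 × 50 region, and uses hypothetical values `g = 1`, `P = 5`, and a hypothetical four-sensor layout; none of these choices is confirmed by the original work.

```python
import math
import random

def variance_db_to_linear(variance_db):
    """Convert a variance given in dB (assumed 10*log10) to linear scale."""
    return 10.0 ** (variance_db / 10.0)

def monte_carlo_observations(sensors, variance_db, runs, side=50.0,
                             g=1.0, P=5.0, beta_L=2.0, seed=0):
    """Return `runs` pairs (true_source, noisy_energy_vector).

    A fixed seed makes the scenarios reproducible, so the same inputs can
    be replayed on every hardware module under test.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(variance_db_to_linear(variance_db))
    scenarios = []
    for _ in range(runs):
        source = (rng.uniform(0.0, side), rng.uniform(0.0, side))
        y = [g * P / math.dist(source, s) ** beta_L + rng.gauss(0.0, sigma)
             for s in sensors]
        scenarios.append((source, y))
    return scenarios

# Hypothetical layout: four sensors on a circle of radius 20 centered at (25, 25).
sensors = [(45.0, 25.0), (25.0, 45.0), (5.0, 25.0), (25.0, 5.0)]
scenarios = monte_carlo_observations(sensors, variance_db=-80.0, runs=1000)
print(len(scenarios), len(scenarios[0][1]))
```

Only the energy vectors would be sent to the processor; the true source positions stay on the host for computing the error afterwards.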
The tests are individualized and run on three different platforms. The first one comprises a NodeMCU module, built around an L106 32-bit RISC core processor from Tensilica Xtensa; the other two are the ESP32 and Arduino DUE modules. Table 1 summarizes the individual features of the modules used in the test-bench. The described hardware features two similar architectures, but with different clock speeds, with the purpose of highlighting the expected reduction in the processing time. Since Tensilica Xtensa employs a 32-bit proprietary RISC CPU, we chose a third processor incorporating the well-known ARM Cortex-M3 architecture, allowing the extrapolation of the results, since it is present in microcontrollers from several semiconductor manufacturers, such as STMicroelectronics, NXP Semiconductors, Microchip (ATMEL), Broadcom, Texas Instruments, or Silicon Labs.
The test-bench developed for validating the performance of the iEHO algorithm consists of two main components. The first one relies on a software script running in MatLab® R2019, on an Intel® Core™ i7-4700HQ CPU at 2.4 GHz, with 16 GB of RAM, on the Windows® 8 (64-bit) operating system. Firstly, the script is responsible for loading the observations generated according to (1) and supplying them to the embedded processor, complying with the entry format of the iEHO function call. It controls the loop, providing the $M_c$ samples, depending on the vector size of the generated observations. Secondly, the script waits for an incoming message on the serial port. This wait corresponds to the processing of the algorithm in the embedded processor. When the data is available, in the third stage, the script receives the data, handles any error that occurred, and stores the estimated position, along with the corresponding metrics (time and number of function evaluations). Figure 4 summarizes the three procedures, each one highlighted with a different color. It is worth mentioning that the elapsed time is measured on the embedded processor and not by the test-bench script.
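The three-stage host loop described above (send the observations, wait for the reply, store the result) can be sketched without any hardware by replacing the serial link with a callable. The comma-separated request line and the reply layout `x,y,time_us,evaluations` are purely hypothetical formats invented for this example; the actual message format used by the test-bench is not given in the text.

```python
def format_request(y):
    """Encode one observation vector as a (hypothetical) comma-separated line."""
    return ",".join(f"{v:.9e}" for v in y) + "\n"

def parse_reply(line):
    """Decode a (hypothetical) reply line 'x,y,time_us,evaluations'."""
    x, y, time_us, evaluations = line.strip().split(",")
    return (float(x), float(y)), int(time_us), int(evaluations)

def run_bench(observations, exchange):
    """Run every observation vector through `exchange` and collect results.

    `exchange` stands in for the serial link: it takes the request text and
    blocks until the device has answered with one reply line.
    """
    results = []
    for y in observations:
        reply = exchange(format_request(y))      # stages 1 and 2: send, wait
        try:
            results.append(parse_reply(reply))   # stage 3: receive and store
        except ValueError:
            results.append(None)                 # malformed reply: keep a marker
    return results

def fake_device(request):
    """Stand-in for the embedded processor; returns a fixed, made-up answer."""
    n_sensors = len(request.strip().split(","))
    return f"20.0,30.0,1500,{100 * n_sensors}\n"

results = run_bench([[0.1, 0.2, 0.3], [0.3, 0.2, 0.1]], fake_device)
print(results)
```

Because the elapsed time travels inside the reply, the host never has to time the exchange itself, which matches the remark that timing is measured on the embedded processor.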
The second component relies on the embedded processor application software (firmware) running on the module, which was written in C and performs several tasks (Figure 5). The complete compiled code (iEHO and test-bench) takes only 280 KB of the flash memory and 28 KB of RAM. As can be seen from Figure 5, the developed firmware has some specific features for supporting the test-bench, more specifically the main cycle that receives the serial communication data and calls the iEHO function, providing the observations under test. The figure shows a "Parameters Setup" functional block, which processes the model parameters and the positions of the sensors in an offline configuration, separately from the test runs of iEHO. Regarding the possibility of performing a White-Box test, there are only two function calls that are flow-dependent (green blocks), which would be scrutinized in such a test. These functions correspond to two distinct situations regarding the geometric characteristics of the distance estimations. Since several different scenarios will be randomly generated, for several values of the observation noise, both functions will be called when considering the $M_c$ observation data. With the goal of establishing the considered test conditions, Table 2 summarizes the values for the model parameters, the test conditions, the sets of sensor numbers, and the noise variance intervals. Regarding the noise variance, increments of 5 dB are considered over the proposed range. Overall, considering the 3 hardware modules, 4 different sets of sensors (6, 9, 12, 15), 7 values of noise variance added to the observations, the 1000 Monte Carlo runs, and running both iEHO and EHO as the reference method, a total of 168,000 simulations are carried out and discussed on the test-bench.
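The total of 168,000 simulations follows directly from the product of the test dimensions, as the short check below shows. The noise grid from −80 dB to −50 dB is inferred from the extreme values quoted in the results together with the 5 dB step; the variable names are illustrative.

```python
from itertools import product

modules = ["NodeMCU", "ESP32", "Arduino DUE"]
sensor_sets = [6, 9, 12, 15]
noise_db = list(range(-80, -45, 5))   # -80, -75, ..., -50: seven values
algorithms = ["iEHO", "EHO"]
monte_carlo_runs = 1000

# One configuration per (module, N, noise level, algorithm) combination.
configurations = list(product(modules, sensor_sets, noise_db, algorithms))
total_simulations = len(configurations) * monte_carlo_runs
print(len(configurations), total_simulations)
```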
With regard to the evaluation metrics, the obtained results are processed in order to obtain: the root mean squared error (RMSE) of the $M_c$ position errors (computed against the true source positions used to generate the observations); the mean value of the number of cost function evaluations; and the mean of the $M_c$ processing times. Since four sets of sensors ($N = 6$, $N = 9$, $N = 12$, and $N = 15$) and seven values of noise variance are under scrutiny, twenty-eight values of each considered metric are available for validating the newly designed implementation.
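The three metrics above reduce to a few lines of arithmetic per test configuration. The sketch below assumes the RMSE is taken over the Euclidean position errors of the runs; the sample numbers are made up for illustration and are not results from the test-bench.

```python
import math
from statistics import mean

def rmse(estimates, truths):
    """Root mean squared Euclidean error between estimated and true positions."""
    squared_errors = [math.dist(e, t) ** 2 for e, t in zip(estimates, truths)]
    return math.sqrt(mean(squared_errors))

# Made-up values for two runs of one configuration.
estimates = [(3.0, 4.0), (0.0, 0.0)]
truths = [(0.0, 0.0), (0.0, 0.0)]
evaluations = [100, 200]
times_ms = [12.0, 18.0]

position_rmse = rmse(estimates, truths)
mean_evaluations = mean(evaluations)
mean_time_ms = mean(times_ms)
print(position_rmse, mean_evaluations, mean_time_ms)
```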

Results and Discussion
This section presents a set of numerical results based on the test-bench developed, where the three processed performance metrics, over the $M_c$ runs on the processor, are analyzed. For demonstration purposes, it is considered here that the sensors are uniformly distributed on a circle, centered at the middle point of the search space, deployed over a 50 × 50 square region. Sources are randomly generated over the search space.
Firstly, the number of cost function evaluations is analyzed, where the iEHO algorithm is compared with the standard EHO. As can be seen from Figure 6, for the three modules under test (Figure 6a-c), the mean number of function evaluations over the $M_c$ runs is substantially lower than for EHO, where the number of function evaluations is a constant initial parameter. In addition, when a lower number of sensors is considered, the number of evaluations is likewise lower. This situation occurs essentially due to the population initialization. Since the initial population is closer to the expected solution, fewer population generations are carried out, accelerating the convergence rate of the algorithm and validating the population initialization methodology. Concretely, although a maximum of 3000 function evaluations was allowed, most of the processor runs were far below this number. This result indicates that the population was, in fact, initialized in the region of its (suboptimal) solution. Regarding the noise superimposed on the measurements, it can be seen that for higher values of its variance, the number of function evaluations also increases. This behavior is expected, since noisy measurements will widen the intersection region, resulting in a greater search space, i.e., more function evaluations.
The significance of the mean number of function evaluations can only be validated if the localization error (RMSE) falls within acceptable values. Besides the gain obtained with regard to the number of function evaluations, it is mandatory to obtain a lower value of the RMSE, or at least an equal value, when applying iEHO. As such, Figure 7 plots the RMSE for both standard EHO and iEHO over the sets of the number of sensors and the range of the considered noise variance.
As can be seen from Figure 7, the lowest error value occurs when $N = 15$ and $\sigma_\nu^2 = -80$ dB, that is, for the highest number of sensors and the lowest value of the noise variance. When the number of sensors is decreased, higher values of the RMSE arise. With regard to the measurement noise, the iEHO algorithm never performs worse than EHO: its RMSE is lower, or at most similar to that of its EHO counterpart for the highest values of the considered noise variance, following its growth. Even so, the use of iEHO is justified from the perspective of the number of function evaluations (please see Figure 6). When the noise is varied between its minimum and maximum value, the error also reaches higher values as the noise increases. Nonetheless, when considering the mentioned search space, the sources far from the sensors will be subject to a very low signal-to-noise ratio (SNR) when disturbed with a noise of magnitude −50 dB. This situation is evident when, for high values of the noise power, increasing the number of sensors does not have the expected effect on the error reduction. This behavior is quite perceptible for small values of noise, where the stochastic behavior of the algorithm has a visible effect on the results. When comparing the embedded implementation results with MatLab® (Figure 7d), one can see the same order of magnitude and evolution of the curves, although with slightly lower values. This difference is mostly due to the fact that MatLab® uses the double-precision data type according to IEEE® Standard 754 (64 bits). On the other hand, the embedded firmware was implemented using single precision (32 bits). The analysis of both metrics, the number of function evaluations and the RMSE, shows a consistent behavior across all three modules considered for applying the developed test-bench.
This consistency allows the proposed algorithm to be considered robust with regard to real-life implementations, taking into account the hardware platforms used. Having corroborated the behavior of the algorithm and verified that the obtained positioning error values fall within acceptable ranges, it remains to be analyzed whether the execution times allow running the method in real time. For that purpose, the third metric, the processing time measured by the embedded processor, is analyzed. This metric is presented in relation to the number of sensors and the measurement noise, superimposed on the number of evaluations of the cost function. This overlay intends to demonstrate the correlation between the number of evaluations and the processing time, aiming to illustrate the fitness of the population initialization methodology.
Observing Figure 8a-c, it can be seen that the mean computing time follows the mean number of function evaluations over the $M_c$ test runs. A clear distinction exists between using $N = 6$ and $N = 15$ sensors, where the lower number of sensors has a lower number of function evaluations. The same observation applies to the variance of the perturbation noise, where the execution time increases with the noise variance, a situation that is emphasized with a higher number of sensors. From a quantitative point of view, computing times are below 150 ms for reduced noise values. When the noise increases, their average remains around 100 ms when $N = 6$, and about 500 ms when $N = 15$, when using the NodeMCU and Arduino DUE modules. When considering the ESP32 module, the computation time reduces to 15 ms and 100 ms for $N = 6$ and $N = 15$, respectively. This large difference for the ESP32 module is due to its clock speed, which is double that of its counterparts. In Figure 9a-c, a comparison in execution time between standard EHO and iEHO is illustrated. It should be noted that in the standard EHO there is no difference between the different noise levels, since there is no problem-specific initialization and, for every test, the maximum number of function evaluations is performed. However, there is a significant difference when considering the different sets of sensors. While a mean value of around 1000 ms is obtained for $N = 6$, the value increases to about 2500 ms for $N = 15$ (an increase of more than 100%) for the Arduino DUE module. The rate is essentially the same for the NodeMCU and ESP32 modules. Once again, the difference is explained by the clock speed of the ESP32 module.
Tables 3-5 compare the execution times of the standard EHO and iEHO.

Conclusions and Future Work
In the current work, a test-bench for evaluating and validating a swarm-based algorithm for acoustic localization was presented. The particularity of this implementation was the platform on which the algorithm was executed, where embedded modules were used to perform the calculations, and the simple procedure for supplying and retrieving data. The obtained results, over the proposed test-bench, validated the use of the recently proposed iEHO algorithm with low-power, low-cost, and small form factor microcontrollers on a wide range of performed tests. They indicated good accuracy, compared with the results obtained with high-level programming languages running on high-performance processors. Similar conclusions can be drawn regarding the processing time. In this case, the results obtained over the proposed test-bench showed computation times sufficiently low to allow iEHO's implementation in real-time applications. This is owed to the accelerated convergence rate, which results in reduced latency, where values as low as 100 ms were obtained. Although the proposed test-bench was used for iEHO, its generalization to support the implementation of other swarm-based algorithms is straightforward and requires very few modifications. Hence, this work marks a new stage, where the improved convergence rate, associated with a reduced number of operations of iEHO compared with EHO, indirectly implies a lower energy consumption and, consequently, an extension of the network's lifetime. This result is owed to the flexibility and reliability of the proposed test-bench. As such, new computational/software architectures can be deployed, where swarm-based algorithms (with low complexity and sufficient accuracy) are implemented on embedded frameworks to perform event localization, in this particular case, energy-based acoustic localization.
The test-bench proposed and evaluated in this work, and the data acquired by it, are set to be the foundation for several future developments. Creating a fully distributed platform or applying sequential localization algorithms can take full advantage of implementing the positioning method on an embedded processor. Even when considering a centralized scheme, processing the localization algorithm on the edge of the network (Edge Computing) offers many advantages. Reducing the payload transmitted over the network has direct implications for transmission times, network congestion, and energy consumption, leading to new user applications and improved quality of service. Therefore, the iEHO algorithm can be useful for implementing sequential and distributed localization schemes or for integration into IoT platforms that include localization features.

Conflicts of Interest:
The authors declare no conflict of interest.