Smart Sensor Architectures for Multimedia Sensing in IoMT

Today, a wide range of developments and paradigms require the use of embedded systems characterized by restrictions on their computing capacity, consumption, cost, and network connection. The evolution of the Internet of Things (IoT) towards Industrial IoT (IIoT) or the Internet of Multimedia Things (IoMT), its impact within Industry 4.0, the evolution of cloud computing towards edge or fog computing, also called near-sensor computing, and the increase in the use of embedded vision are current examples of this trend. One of the most common methods of reducing energy consumption is processor frequency scaling, based on a particular policy. The algorithms that define this policy are intended to respond well to the workloads that occur in smartphones. There has been no study that allows a correct definition of these algorithms for workloads such as those expected in the above scenarios. This paper presents a method to determine the operating parameters of the dynamic governor algorithm called Interactive, which offers significant improvements in power consumption without reducing the performance of the application. These improvements depend on the load that the system has to support, so the results are evaluated against three different loads, from higher to lower, showing improvements ranging from 62% to 26%.


Introduction and Related Work
The development of the IIoT (Industrial Internet of Things) and ICPS (Industrial Cyber Physical Systems) paradigms [1] introduced by the Industry 4.0 concept [2] is producing an increase in the use of embedded systems. This in turn is creating new needs in industry [3]. The use of more distributed devices in plants, which collaborate to achieve a certain goal, is one of these needs. This trend has created new communications requirements, as well as requirements in terms of computational capacity and energy consumption. Moreover, there is a trend in the use of the Internet of Things (IoT) towards mobile, multisensorial and smart solutions, which has led to an evolution towards IoMT (Internet of Multimedia Things) [4]. In this field, IoMT devices require low-cost solutions with restrictions in processing capacity and energy consumption, and with wireless connections to the network [5]. Moreover, the use of cloud computing is not feasible in image-processing applications due to latency (although 5G may change this) and privacy issues. This is highlighted in [6], where the importance of near-sensor computing is emphasized, relating IoMT to fog computing. The cloud cannot support and analyse the constant increase in data. Edge computing processes the data very close to the device, and sends only the significant information to higher levels. ETSI (European Telecommunications Standards Institute) has defined several examples of the use of mobile edge computing, one of which is video analysis. Examples of use are wearable cognitive assistance, behavioral analytics and telemedicine. In [6] a proof of concept of software/hardware [...]
• Reduce the vulnerability of data;
• Offer the possibility of customisation through specialisation of the hardware to reduce latency and energy consumption;
• Gain a huge reduction in bandwidth, avoiding the transmission of irrelevant information.
In this work, where trends to develop edge intelligence are analyzed, an Odroid XU4 with Linux and 2 GB of RAM is used. In [9] the authors highlight how embedded platforms are transforming and evolving quickly from standalone computer systems to become part of a smarter, more connected IoT that can be adopted and deployed in different environments, and are being adapted according to their restrictions and needs. A similar concept is that of the IoT-based multimedia applications (IoTMM), where connected industry is one of the 5 categories of application classified in [10], and where the importance of energy saving in the nodes (of limited resources) used is also highlighted. Another concept that is gaining in importance is that of embedded vision [11], which highlights the importance of SoC systems (system on chip) based on ARM architectures to revolutionize image and machine vision. There are therefore several areas where embedded systems, which are connected to the network, are required but which are characterized by restrictions in their computational capacity, consumption, and cost.
SoCs have evolved significantly in recent years, greatly influenced by the exponential increase in the smartphone market, to a point where today we have central processing units (CPUs) capable of running complete operating systems with their own graphical environment. These architectures have evolved from the original homogeneous architectures, to the current heterogeneous architectures, where several cores with different capacities and different energy requirements are mounted on the same chip. This property makes it the perfect choice to integrate IoMT systems into the Industry 4.0 paradigm, as it allows for enormous flexibility. The use of multicore systems also allows for improved energy efficiency based on a reduction in frequency achievable by spreading the work over several cores [12,13]. In this context, where flexibility is an important requirement, systems must be able to be reconfigured in a simple way, so that they can be adapted to applications with different computing requirements in an energy-efficient way. Figure 1 shows an IoMT architecture divided into 4 levels. The lower level, multimedia sensing, is where this work is located. IoMT devices are resource constrained, low-cost, low-power and heterogeneous. They are limited in terms of power resources. However, they should be embedded with application-and context-aware intelligence, so that the multimedia content of the physical world is only acquired when necessary, minimizing the acquisition of redundant information. In the architecture proposed in [5] the only pre-transmission procedures considered are related to the compression of the captured multimedia information. In this work, IoMT technology is used to move from a cloud system to an edge system (or fog), using the processing capacity of the sensors to transmit only the information of interest, and only when that information exists. 
To develop this function, a method is [...]

The predominant architecture in this type of ARM processor-based system is the ARM big.LITTLE architecture, where processes with lower computational requirements are executed in the LITTLE cores, while processes that need increased computational capacity are executed in the big cores. Initially, only the processor could decide internally on which core a process was running. With in-kernel switch scheduling (IKS), each pair of big and LITTLE cores was seen as a single virtual processor associated with a process, switching internally from one core to the other according to these needs. More recently, global task scheduling (GTS) has become available [14], where every core, big or LITTLE, can execute tasks simultaneously. The differences between the IKS and GTS scheduling algorithms, as well as the different energy management techniques in mobile processing units, can be seen in [15]. In these systems dynamic voltage and frequency scaling (DVFS) is available, which is controlled by the governor used.

The use of DVFS techniques has already been analyzed, but mainly in the area of smartphones, with a very specific workload and characteristics. In [16] the workload of smartphones and the influence of governors on heterogeneous multicore systems are analyzed, highlighting the computational over-design in relation to the needs of smartphones. In [17], energy management methods are analysed from the point of view of the response time perceived by the user. Other works have attempted to characterize the user, such as [18], where machine-learning methods are used to identify and classify the user and thus optimize energy consumption, or [19], where a model for predicting satisfaction based on user history is presented. Other work on embedded and mobile systems is presented in [20], based on counter propagation networks to classify tasks and predict the best frequency for the system, or, in the IoT environment, in [21], based on extreme machine learning with the same objective. However, these proposals are not compared with the most suitable governors, when the governor is indicated at all, nor are the default governor parameters changed. Moreover, they select a fixed and static frequency for a given task, wasting the dynamic capacity to adjust this value when the workload is periodic but not symmetric. Thus in [20], Ondemand is used but not Interactive, and it is not specified which parameters have been used. In [21], no specific information is given on the configuration the proposal was compared against; its results are compared only with the governor that assigns the highest frequency, the Performance governor, against which any frequency scaling brings an improvement.
In Ref. [15] there is another review of references using DVFS-based methods in applications ranging from 3D games on smartphones to wearable devices. This paper does not propose a new governor that would be suitable for a multi-load and multi-application environment and that would require neural network training stages to assign statically the best working frequency for a given symmetric load. Instead, it aims to demonstrate how energy efficiency can be increased in devices by running IoMT applications, which have an asymmetric load by the very nature of the data they handle, using standard dynamic controllers, but setting the parameters of these controllers to achieve this efficiency. The paper presents a methodology to choose these parameters according to the load of the task, and its periodicity. Furthermore, instead of comparing only the energy improvements achieved, the results are explained through an analysis of the frequency histogram obtained with each method.
Section 2 reviews the architecture of ARM-based SoCs, and in particular the part corresponding to energy saving, defining a methodology to determine the parameters. The following section presents the experimental results obtained for different types of video sequences. Finally, the conclusions and future work to be done are presented.

Sensors 2020, 20, 1400

Architecture
In an ARM big.LITTLE architecture (see Figure 2), each type of core has a minimum frequency ($F_{min}$), a maximum frequency ($F_{max}$), and a range of available frequencies between the two. Depending on the policy applied by the governor and the computational load required by the process, one particular frequency is chosen as the working frequency ($f_w$), a value that can change continuously during the execution of the task. Governors can apply a static or a dynamic frequency assignment policy. The static governors include Powersave, where $f_w = F_{min}$, and Performance, where $f_w = F_{max}$. The dynamic governors move $f_w$ up and down through the available frequencies between $F_{min}$ and $F_{max}$, with the objective of executing the tasks in a satisfactory time while reducing energy consumption. Common examples of dynamic policies are the Ondemand, Conservative, and Interactive governors. These are naive algorithms [22], which means that an appropriate configuration may give a significant improvement in consumption. With Ondemand, if the load on the core exceeds a certain threshold, $f_w = F_{max}$ is set, and $f_w$ is then gradually reduced as the load on the core decreases, until $f_w = F_{min}$ is reached. Conservative raises $f_w$ more gradually, and lowers it progressively when the load drops below the lower threshold. The Interactive governor was designed for interactive workloads that require a fast reaction in response to user actions. In addition, the procedure for adjusting $f_w$ runs in the kernel with the highest priority, in order to avoid delays in response. This governor has a series of parameters that allow its operation to be regulated according to the desired balance between performance and energy. Table 1 shows a description of these parameters, and Table 2 shows the default values used and those proposed here.
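As an illustration of how the governor and the frequency limits of a core can be inspected, the following minimal sketch reads the standard Linux cpufreq policy files. It assumes the usual sysfs layout under /sys/devices/system/cpu and a driver that exposes scaling_available_frequencies (not all drivers do). The function names and the `root` argument are hypothetical helpers introduced here, the latter so that the code can be exercised against a temporary directory.

```python
from pathlib import Path

# Standard location of the per-CPU cpufreq policy files on Linux (assumed layout).
CPUFREQ_ROOT = Path("/sys/devices/system/cpu")


def policy_dir(cpu: int, root: Path = CPUFREQ_ROOT) -> Path:
    """Directory holding the cpufreq policy files of one core."""
    return root / f"cpu{cpu}" / "cpufreq"


def read_policy(cpu: int, root: Path = CPUFREQ_ROOT) -> dict:
    """Return the active governor, F_min, F_max and the available frequencies (kHz)."""
    d = policy_dir(cpu, root)
    available = (d / "scaling_available_frequencies").read_text().split()
    return {
        "governor": (d / "scaling_governor").read_text().strip(),
        "f_min_khz": int((d / "scaling_min_freq").read_text()),
        "f_max_khz": int((d / "scaling_max_freq").read_text()),
        "available_khz": sorted(int(f) for f in available),
    }


def set_governor(cpu: int, name: str, root: Path = CPUFREQ_ROOT) -> None:
    """Select a governor for one core (on a real system this needs root privileges)."""
    (policy_dir(cpu, root) / "scaling_governor").write_text(name)
```

On a real board the call would be, for example, `read_policy(0)`. Which CPU indices correspond to the big and the LITTLE cluster depends on the platform and is not assumed here.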

Parameter              Symbol      Description
hispeed_freq           $F_{hs}$    Value of $f_w$ initially chosen as soon as the core load exceeds a certain load value
go_hispeed_load        GHL         Load threshold to increase frequency
above_hispeed_delay    AHD         Time during which, if the load continues to exceed the threshold, the frequency $f_w$ will be raised again until $F_{max}$ is reached
timer_rate             TR          Load sampling interval if the core is not idle

Algorithm 1 describes the operation of the governor, which aims to determine the value of $f_w$ that meets the requirements of applications while reducing consumption. At each load-sampling interval (TR), the load is checked against the GHL threshold. If the load is above the threshold, the frequency is raised to $F_{hs}$. Once this frequency has been established, the system waits for a time determined by AHD before re-evaluating the load; if the load still exceeds the threshold, $f_w$ is raised again, now above $F_{hs}$, and may eventually reach $F_{max}$. If the threshold is not exceeded, the system waits for MST, and if the load remains below the threshold, the value of $f_w$ is reduced. Usually this governor is configured so that $F_{hs} = F_{max}$, as also indicated in [15]. This improves the system's reaction time and provides a very fast response, but it does not exploit the range of intermediate frequencies between the minimum and the maximum, and it also results in the highest power consumption.
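The behaviour described above can be sketched as a small simulation. This is a simplified model written for illustration, not the kernel implementation and not the paper's Algorithm 1: it assumes that the frequency moves one step at a time above $F_{hs}$ and one step down after MST, and that "time waited" is the time spent at the current frequency. The class name, attribute names and all numeric values in the usage note are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class InteractiveSketch:
    """Simplified model of the Interactive governor's decision at each TR tick."""
    freqs: list      # available frequencies (e.g. in MHz)
    f_hs: int        # hispeed_freq (F_hs), must be one of freqs
    ghl: float       # go_hispeed_load (GHL), load threshold in [0, 1]
    ahd_ms: int      # above_hispeed_delay (AHD)
    tr_ms: int       # timer_rate (TR), the sampling interval
    mst_ms: int      # MST, time to wait before lowering the frequency
    f_w: int = field(init=False)
    held_ms: int = field(init=False, default=0)

    def __post_init__(self):
        self.freqs = sorted(self.freqs)
        if self.f_hs not in self.freqs:
            raise ValueError("f_hs must be one of the available frequencies")
        self.f_w = self.freqs[0]  # start at F_min

    def _set(self, freq):
        self.f_w = freq
        self.held_ms = 0  # restart the timer at the new frequency

    def step(self, load: float) -> int:
        """Advance one sampling interval and return the new working frequency."""
        self.held_ms += self.tr_ms
        i = self.freqs.index(self.f_w)
        if load > self.ghl:
            if self.f_w < self.f_hs:
                self._set(self.f_hs)  # jump straight to F_hs
            elif self.held_ms >= self.ahd_ms and i + 1 < len(self.freqs):
                self._set(self.freqs[i + 1])  # still loaded after AHD: go higher
        elif self.held_ms >= self.mst_ms and i > 0:
            self._set(self.freqs[i - 1])  # load stayed low for MST: go lower
        return self.f_w
```

With hypothetical values such as `InteractiveSketch(freqs=[200, 600, 1000, 1400, 2000], f_hs=1000, ghl=0.9, ahd_ms=120, tr_ms=20, mst_ms=80)`, a sustained high load first moves the core to 1000 and only later to 1400, whereas setting `f_hs=2000` sends it to the maximum on the first loaded sample, which is the $F_{hs} = F_{max}$ behaviour discussed above.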

Methodology for Parameter Selection
The approach is that the flexible IoMT node will be executing a certain task at a given time with a periodicity $T$. The algorithm to be executed in each period may be symmetric and invariant to the content, so that its execution time ($C^s_i$) in each period $T$ will be constant if it is run at the same frequency. In this case, a static governor can be chosen using a fixed frequency $f_w$ between $F_{min}$ and $F_{max}$ such that $C^s \lesssim T$, and in this way the temporal requirements of the application are satisfied. Given the quadratic relationship between energy and frequency, the lower the $f_w$, the lower the consumption. In the case of an asymmetric algorithm, where the execution time ($C^a_i$) is not constant but depends on the content of the images, the use of a fixed frequency $f_w$ means that there is sometimes too much idle time, while in other cases there may be very little idle time, or the period $T$ may even be exceeded. Figure 3 shows these two situations graphically.

The process for determining the value of $f_w$ when the load is symmetrical is described in [23], in which recommendations for asymmetrical loads are also given. The asymmetrical load $C^a_i$ is defined by its average $\hat{C}^a$, its standard deviation $\sigma_{C^a}$, and its worst case $C^a_{wc}$ ($C^a_{wc} = \max C^a_i$). The relationship proposed in [23] to choose the value of the working frequency selects the frequency $F_i$ for which, even in the worst case, the execution time is less than the period $T$ of the tasks. However, to satisfy the worst case with a static governor, a high value, close to $F_{max}$, may have to be taken; the idle time is then considerable, which means a significant waste of energy. When a dynamic governor is used, the parameters $F_{hs}$ and AHD are modified in order to satisfy $C^a_{wc} < T$ while achieving an important energy saving.

The first proposal for the value of $F_{hs}$ in this paper, Equation (2), is to choose the frequency $F_i$ that guarantees that the average time is less than the period. Although a value close to the average is used, that average corresponds to one particular value of $f_w$. With this value chosen for $F_{hs}$, if the computational load for processing an image is higher, $f_w$ will move from $F_{hs}$ to higher values, so the execution time will be shorter than the time that would be obtained with a static governor using $f_w = F'_{hs}$. In any case, with this selection the deadline will occasionally be missed. Another possible relationship reduces the possibility of the deadline not being met: in this case a value of $F_{hs}$ higher than with Equation (2) is chosen, reducing the chances of missing the deadline at the cost of higher energy consumption. The other important parameter is the time that this frequency will remain in use before checking that the load is still high and therefore raising the frequency again.

In this way, we can be sure that we are working for a greater period of time with the value of $F_{hs}$, before increasing the value of $f_w$ towards $F_{max}$.

Experimental Results

Equipment and Sequences
An Odroid XU4 was used for the experiments, the main features of which can be seen in Table 3. The haartraining algorithm using OpenCV was used as a load on a video sequence. The video sequence is from a highway, where the passage of vehicles is monitored. In a cloud system, the device would capture and send all the images to the cloud to be processed there. Images like that shown in Figure 4a would involve the entire image being sent to the central office for processing. In the fog computing system, the algorithm looks for the cars in the sequence, and only sends the ROIs (regions of interest) of the images where it has located vehicles, as can be seen in Figure 4b. Since this vehicle search is costly in terms of computing requirements, it has not been executed on all the images, but only on those where a threshold of change between the images $I_i$ and $I_{i+1}$ is exceeded. It is assumed that there is no alternative sensorization that indicates the presence of vehicles, either because it is a provisional installation where installation costs must be kept down, or because of the difficulty of siting presence sensors that could perform this function. This function is therefore carried out by means of multimedia processing [24].
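The gating step described above, running the costly vehicle search only when consecutive images differ enough, can be sketched as follows. The text does not specify how the change between $I_i$ and $I_{i+1}$ is measured, so the metric used here (fraction of pixels whose grey level changes by more than `pixel_delta`) is an assumption, as are the function names and the default value. Frames are represented as flat sequences of grey levels to keep the example self-contained.

```python
def change_ratio(prev, curr, pixel_delta=25):
    """Fraction of pixels whose absolute grey-level difference exceeds pixel_delta."""
    if len(prev) != len(curr) or not prev:
        raise ValueError("frames must be non-empty and of equal size")
    changed = sum(1 for a, b in zip(prev, curr) if abs(a - b) > pixel_delta)
    return changed / len(prev)


def frames_to_process(frames, threshold):
    """Indices i+1 of the frames whose change with respect to frame i exceeds threshold."""
    return [
        i + 1
        for i in range(len(frames) - 1)
        if change_ratio(frames[i], frames[i + 1]) > threshold
    ]
```

Only the frames returned by `frames_to_process` would be passed to the detector; with a threshold set high enough, no frame is selected and the detector never runs.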

This same sequence was then used, but processed in three different ways, which henceforth in this work are described as three different videos. In the first (Vid1), an artificially high change threshold has been used, so no vehicles are detected and it is, therefore, equivalent to a video sequence with very low activity. In the second (Vid2), the threshold is set to an appropriate value, so all the images where there are changes are processed, the movement of cars is detected, and the ROI of the vehicles found is also sent by WiFi. In this sequence, moments of higher activity alternate with moments of lower activity. The third (Vid3) is a part of the same sequence with high activity, which is repeated several times and is thus considered to be a scene with high activity.

Table 4 shows the energy consumption, in Wh, for one hour of video transmission, calculated by extrapolating the data from the duration of the video to a one-hour video. Energy consumption is expressed in the same way in the rest of the cases, so that the comparison is easier to perceive.
Only the consumption in the IoMT device is considered here, and not the consumption generated by having to process all the images in the cloud servers in search of vehicles, a task that is not necessary in the fog computing approach.

Table 5 shows the values obtained with the Performance governor using different values of $F_{max}$ for the sequence Vid3. As can be seen, the two candidate values of $F_{hs}$ would be 1.0 GHz and 1.6 GHz. A value of $T$ = 125 ms has been used in the sequence, so the value of AHD chosen is 120 ms.
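The selection of a frequency from measurements taken at several fixed frequencies can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used to produce Table 5: it assumes that execution times have been measured per frequency, that they decrease as frequency increases, and it covers only the two criteria described in words in the methodology (average below $T$, and worst case below $T$). The function name and the measurement values are hypothetical.

```python
from statistics import mean


def select_frequency(times_ms_by_freq, period_ms, statistic):
    """Lowest frequency whose statistic over the measured times is below the period.

    times_ms_by_freq maps a frequency to the execution times measured at it.
    Returns None if no frequency satisfies the condition.
    """
    for freq in sorted(times_ms_by_freq):
        if statistic(times_ms_by_freq[freq]) < period_ms:
            return freq
    return None


# Hypothetical measurements (frequency in MHz -> execution times in ms).
measured = {
    600: [200.0, 230.0, 260.0],
    1000: [100.0, 120.0, 150.0],
    1400: [80.0, 95.0, 120.0],
    1800: [60.0, 75.0, 95.0],
}
T_MS = 125.0

f_hs_average = select_frequency(measured, T_MS, mean)  # average time below T
f_worst_case = select_frequency(measured, T_MS, max)   # worst case below T
```

With these made-up numbers the average criterion selects a lower frequency than the worst-case criterion, which is the gap a dynamic governor is expected to cover by raising $f_w$ only when a frame needs it.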

Results Obtained for a Sequence of Medium Activity
For the sequence Vid2, the values obtained with the Performance and Ondemand governors can be seen in Table 6.

Table 7 shows the results using the Interactive governor with its default parameters ($F_{hs}$ = 2.0 GHz), and the values obtained for lower values of $F_{hs}$ while keeping the rest of the default parameters. As can be seen in the table, the energy cost is higher than that obtained with the Ondemand governor and very close to that of the Performance governor, with similar temporal results. If a value of $F_{hs}$ is chosen in this configuration that provides a power consumption similar to that of Ondemand, such as $F_{hs}$ = 800 MHz, the values of $\hat{C}^a$ and $\sigma_{C^a}$ are significantly worse.

Table 8 shows the results using the Interactive governor with the parameters proposed in the previous section ($F'_{hs}$ = 1.6 GHz or $F''_{hs}$ = 1.0 GHz, with AHD = 120 ms and the rest of the parameters as in conf2), and the values obtained for other $F_{hs}$ values. As can be seen, with the proposed parameters instead of the default ones, the power consumption using $F'_{hs}$ is reduced by 41% compared to Performance and by 8% compared to Ondemand, while the temporal values also show better performance than the Ondemand governor. Using $F''_{hs}$, consumption is reduced by 50% compared to Performance and by 15% compared to Ondemand, maintaining a temporal behavior similar to that of Ondemand.
The same values of F hs give better energy consumption and behaviour of conf2 compared to conf1.  Figure 5 shows the distribution histograms of the core frequencies used with the two configurations: (a) with F hs = 2 GHz, (b) with F hs = 1.6 GHz and F" hs = 800 MHz. The same frequencies are included with configuration 1 for comparison purposes. As can be seen, when using F hs = F max in configuration 1 (and as it is a process of high computational requirements) the frequency remains at its maximum value most of the time, being at its F min value, a very small part of the time. Using configuration 2 and the values F hs and F" hs , it is clear that most of the time the core will work with that frequency, although sometimes it is necessary to increase the frequency value, but without reaching F max at any time. Using the same values of F hs and F" hs but with configuration 1 (default), the results show how the core, even when remaining for some time in that F hs frequency, the lower value of AHD means that it passes more easily to higher frequencies, including F max , so the consumption is higher but without providing a significant time difference. The reduction of the MST value can also be seen in lower use between F hs and F min increasing the time in F min , which also favours the energy reduction achieved with configuration 2. Figure 6 shows the difference in energy consumption for the parameters of conf1 and conf2, using different values of F hs and compared to OnDemand and Performance. As can be seen, only conf2 provides better energy performance compared to Ondemand, and this is achieved by using an F hs < F max .
Figure 6. Evolution of energy consumption for the interactive controller for configurations 1 and 2, using different values of Fhs, compared to consumption using Ondemand and Performance.
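Frequency distributions such as those discussed for Figure 5 can be derived from per-frequency residency counters. The sketch below is not the authors' measurement procedure. It assumes input in the format of the Linux cpufreq statistics file `time_in_state` (one "frequency_kHz time" pair per line), and the sample values are invented for illustration.

```python
def frequency_distribution(time_in_state_text: str) -> dict:
    """Return {frequency_kHz: percent_of_time} from 'freq time' lines."""
    totals = {}
    for line in time_in_state_text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        freq_khz, ticks = int(parts[0]), int(parts[1])
        totals[freq_khz] = totals.get(freq_khz, 0) + ticks
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {freq: 0.0 for freq in totals}
    return {freq: 100.0 * t / grand_total for freq, t in totals.items()}


# Invented sample: most of the time is spent at 1.6 GHz.
sample = """200000 100
1600000 700
2000000 200
"""
dist = frequency_distribution(sample)
for freq, pct in sorted(dist.items()):
    print(f"{freq / 1e6:.1f} GHz: {pct:.1f}%")
```

Because these counters are cumulative, a real measurement would subtract a snapshot taken before the video sequence from one taken after it.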

Results Obtained for a Low-Activity Sequence
In the Vid1 sequence, in which there is no activity throughout the video, the values obtained with the Performance and Ondemand governors are shown in Table 9. As can be seen, the values are lower than those obtained for the cloud computing solution, since the images are only captured and checked for activity; if there is no activity, neither images nor ROIs are transmitted, so the energy consumption is lower. Table 10 shows the results using the Interactive governor with its default parameters (Fhs = 2.0 GHz), together with the values obtained for lower values of Fhs while maintaining the rest of the default parameters. As can be seen in the table, the same results are obtained as in the previous case. The energy cost is higher than that obtained with the Ondemand governor and very close to that of the Performance governor, with quite similar temporal results. When using an Fhs that provides similar energy consumption, as is the case of 1.6 GHz, better temporal behavior can be observed. Table 11 shows the results using the Interactive governor with the parameters proposed in the previous section (Fʹhs = 1.6 GHz or Fʹʹhs = 1.0 GHz, with AHD = 120 ms and the rest of the parameters as in conf2), together with the values obtained for other Fhs values. As can be seen, with the proposed parameters instead of the default ones, the energy consumption using Fʹhs is reduced by 62% compared to Performance and 18% compared to Ondemand. Although the average time is 33% worse than with Ondemand, there is no danger of missing the deadline. Using Fʹʹhs, energy consumption is reduced by 43% compared to Performance and 5% compared to Ondemand, with a temporal behavior somewhat better than that of Ondemand. In any case, the same values of Fhs give better consumption and behavior with conf2 than with conf1. Figure 7 shows the frequency distribution histograms, as above. As can be seen, using conf2 the highest frequency is practically Fhs, while with conf1 this value is exceeded, since a high load is also detected within its shorter evaluation time.
Figure 7. Core frequency distribution for each configuration for Vid1.

Figure 8 shows the difference in energy consumption for the parameters of conf1 and conf2, using different values of Fhs and compared to OnDemand and Performance. As can be seen, with no activity in the sequence, both conf1 and conf2 provide better energy performance compared to Ondemand, although in both cases it is also necessary that Fhs < Fmax.
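The percentage reductions quoted in this section compare the energy of one governor configuration against a reference. The text does not spell out the formula, so the following sketch shows two common conventions under that caveat. The function names and the energy values are hypothetical and are not taken from the tables.

```python
def reduction_pct(reference: float, measured: float) -> float:
    """Saving relative to the reference: 100 * (Eref - E) / Eref."""
    if reference <= 0:
        raise ValueError("reference energy must be positive")
    return 100.0 * (reference - measured) / reference


def excess_pct(reference: float, measured: float) -> float:
    """Extra energy of the reference relative to the measured value."""
    if measured <= 0:
        raise ValueError("measured energy must be positive")
    return 100.0 * (reference - measured) / measured


# Hypothetical energies in joules, chosen only to illustrate the formulas.
e_reference = 50.0
e_proposed = 19.0

print(f"reduction: {reduction_pct(e_reference, e_proposed):.1f}%")
print(f"excess:    {excess_pct(e_reference, e_proposed):.1f}%")
```

With positive energies the first convention can never exceed 100%, whereas the second has no upper bound, so the two should not be mixed when comparing results.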

Results Obtained for a Sequence with High Activity
In the Vid3 sequence, in which there is a lot of activity throughout the video, the values obtained with the Performance and Ondemand governors can be seen in Table 12. Table 13 shows the results using the Interactive governor with its default parameters (Fhs = 2.0 GHz), together with the values obtained for lower values of Fhs while maintaining the rest of the default parameters. As can be seen in the table, the same results are obtained as in the previous case. The energy cost is higher than that obtained with the Ondemand governor and very close to that of the Performance governor, with quite similar temporal results. Table 14 shows the results using the Interactive governor with the parameters proposed in the previous section (Fʹhs = 1.6 GHz or Fʹʹhs = 1.0 GHz, with AHD = 120 ms and the rest of the parameters as in conf2), together with the values obtained for other Fhs values. As can be seen, with the proposed parameters instead of the default ones, the energy consumption using Fʹhs is reduced by 26% compared to Performance and 22% compared to Ondemand. The average time is 21% worse and, as expected, not all deadlines are met, although they are not fully met with either Ondemand or Performance. Using Fʹʹhs, energy consumption is reduced by 22% compared to Performance and by 12% compared to Ondemand, with a slightly better time performance than Ondemand and with all deadlines met. In this case, the same values of Fhs give better consumption with configuration 2, while the temporal behavior is very similar. Figure 9 shows the frequency distribution histograms, as above. As can be seen, since a greater computational load is required, Fmax is reached in both configurations, although its percentage of use is higher in conf1 than in conf2, where Fhs is still the most used frequency, although it is now exceeded at times.
Figure 9. Core frequencies distribution for each configuration for Vid3.

Figure 10 shows the difference in energy consumption for the parameters of conf1 and conf2, using different values of Fhs and compared to OnDemand and Performance. The figure shows that as there is a high activity in the sequence, the consumption using Ondemand is close to that of Performance. With this workload, both conf1 and conf2 can provide better energy performance compared to Ondemand, although in both cases it is also necessary that Fhs < Fmax.
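The temporal comparison in this section relies on the average time and on whether deadlines are met. A minimal sketch of such a summary is given below, assuming that one computing time is recorded per frame and that a single deadline applies to every frame. The sample times, the deadline and the use of the population standard deviation are assumptions for illustration.

```python
from statistics import mean, pstdev


def timing_summary(times_ms: list, deadline_ms: float) -> dict:
    """Summarise per-frame computing times against a deadline."""
    if not times_ms:
        raise ValueError("at least one measurement is required")
    missed = sum(1 for t in times_ms if t > deadline_ms)
    return {
        "mean_ms": mean(times_ms),
        "std_ms": pstdev(times_ms),  # population standard deviation
        "missed": missed,
        "miss_ratio": missed / len(times_ms),
    }


# Invented per-frame times (ms) with a hypothetical 40 ms deadline.
summary = timing_summary([30, 32, 34, 36, 48], deadline_ms=40)
print(summary)
```

A configuration would then be acceptable when `missed` is zero, or at least no worse than the value obtained with the reference governor on the same sequence.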

Figure 10. Evolution of energy consumption for the interactive governor for configurations 1 and 2, using different values of Fhs, compared to consumption using Ondemand and Performance.

Conclusions and Future Work
The use of smart sensors in industry, in particular in near-sensor computing or fog computing, has great potential for the future. The requirements of Industry 4.0 demand flexible sensors that can be used for different types of tasks. Multicore architectures, whether big.LITTLE or not, allow the field of use of these devices to be extended. With respect to a cloud computing solution using Performance, savings of 178% are achieved for Vid1, 101% for Vid2, and 21% for Vid3. Compared to a cloud computing solution, a high energy saving is therefore achieved, a key aspect in this field, although it depends on the system load. As for the solution based on edge computing, the default configurations are designed for smartphones. The use of the Interactive governor allows different computing loads to be dealt with efficiently. If the parameters of this governor are also set taking into account the load they have to handle at a given time in an Industry 4.0 application, it is possible to significantly improve energy savings while maintaining or even improving the temporal response of the system. This is demonstrated by the differences between configuration 2 used with the Interactive governor and configuration 1 with the same governor, or using Performance or Ondemand. Comparing the proposed Interactive configuration with Performance, which is the one used by default by Exynos_cpufreq (see Table 3), savings range from 62% for low-activity sequences to 26% for very high-activity sequences. Comparing the proposed Interactive configuration (conf2) with the default Interactive configuration (conf1), savings range from 58% for low-activity sequences to 31% for very high-activity sequences.
This paper presents a method to determine the parameters of the Interactive governor so that the temporal requirements of the applications are maintained while significantly improving the energy consumption. This improvement makes an edge computing solution even more efficient than cloud computing. The use of these SoCs in Industry 4.0 allows a high degree of flexibility: depending on the moment, they can be used with only a LITTLE core in applications with very low computing requirements, or in applications such as that shown in Section 3, where the 4 big cores are used, which represents a very wide range of applications.
The parameter selection method shown is based on experimental information obtained with a video sequence. As future work, and in order to achieve a more flexible, autonomous and reconfigurable system, the authors plan to develop a workload analysis system that makes it possible not only to select the parameters of the governor, but also to choose which type of cores and how many of them to use to achieve the desired performance, thus increasing energy savings.