1. Introduction
Modern embedded systems enable effective real-time processing of video sequences. Algorithms for video sequence analysis can be based on classic image processing methods or on neural networks, which are becoming increasingly popular. Cameras are installed in cars and autonomous vehicles [
1,
2], for city monitoring, tracking [
3], but also in specialized applications, such as inspection vehicles [
4]. Such systems help operators to observe urban spaces or machine surroundings and to supervise and respond to current events. In some cases, they are able to completely eliminate the human factor, automatically issuing specific commands to other systems. In the use of autonomous vehicles in airport areas [
4], the accuracy of the algorithms and the hardware performance of the devices are extremely important, because safety depends on them. Moreover, such vehicles should be highly energy efficient because of the large airport areas in which they move, and they should keep the necessary energy reserve for emergencies.
For this reason, it is necessary to inspect the lighting and navigation equipment installed in the airport areas, with particular emphasis on the runway center line, the touchdown zone, and the taxiway center line. To drive correctly over the tested lamp, given the width of the runway and the lack of static reference points, it is necessary to prepare a system that will allow the operator/driver to carry out the measurement correctly [
4]. Such systems are increasingly based on vision solutions. Therefore, the task of detecting runway lines and markings plays an important role in ensuring the safe movement of aircraft on the ground [
5]. They can also be helpful in determining the position of the measurement platform during inspections, because the position of the lamps is closely related to the location of horizontal markings on airport surfaces [
6].
Processing video sequences requires the use of computing units of different power [
7,
8], depending on the selected algorithm and the operations performed on the image [
5,
9]. In mobile applications, it is important to find a compromise between the compact size of the system, power consumption, and computing performance. To select the best equipment for individual cases, it is necessary to perform a comparative analysis of the available embedded systems, which, due to their small size and relatively low energy demand, can be installed in mobile autonomous vehicles, such as a measurement platform for quality testing of airport lamps. The need to build such a platform and inspect airport lamps came from the growing requirements of aviation safety agencies, which introduce more and more stringent regulations regarding the inspection of airport lighting [
6]. Lamps located in airport areas have a very important role in air navigation, because their visibility determines the possibility of performing individual air operations. The decrease in their light efficiency, and thus the need for cyclical checking, results from the use of halogen bulbs or LEDs but also from the winter season. During this period, airport areas are cleared of snow using ploughs with metal brushes. This causes tarnishing and scratching of the prisms of the in-pavement lamps, and even creates cracks, which exclude the lamp from further operation.
Figure 1a shows the idea of a measurement platform in a real working environment (on the runway),
Figure 1b shows the camera mounted on the airport maintenance vehicle, and
Figure 1c shows the operator view of the camera on the airport maintenance vehicle located on the runway.
In order for the platform to properly ride over the lamp built into the runway, it is necessary to use a vision system that correctly identifies the markings used on the runway and other airport areas and makes the necessary correction of the trajectory to improve the measurement. The specificity of a runway, which is much wider than public roads, makes it more difficult to find the right reference point because the edges of the pavement are often outside the frame. The principle of operation of the vision system on the measuring platform for quality testing of airport lamps is shown in
Figure 2.
The purpose of the research was to analyze the video processing rate of the Raspberry Pi 4 and NVIDIA Jetson platforms in relation to their power requirements. The tests were aimed at estimating the energy consumption for the operating modes defined by the manufacturer. This manuscript is a revised and extended version of an article presented at the 29th International Conference Mixed Design of Integrated Circuits and Systems, held in Wroclaw, Poland, from 23 to 25 June 2022 [
10].
2. Related Works
The development of microcontrollers enables the implementation of more and more advanced algorithms for processing images and video sequences. Because of the low energy consumption, it is possible to build portable devices powered by batteries. The design of prototypes of mobile vision processing solutions is facilitated by a wide range of minicomputers from the Raspberry Pi and NVIDIA Jetson families. It should be noted that industrial versions, which offer professional performance and are designed for long-term operation, can also be used for these solutions [
11].
An important aspect in the design of such devices is the assessment of computational efficiency, which in the case of video sequence analysis is related to image resolution and the number of frames per second. The second important factor is power consumption, which should be as low as possible to reduce the size of the device (mainly the number and weight of the batteries) and reduce the need for frequent charging.
In general, the manufacturers of evaluation platforms present the performance of the microprocessors used, and on specialized websites (for example, on the [
12] website), you can find comparisons of such metrics as integer math, floating point math, find prime numbers, random string sorting, data encryption, and data compression. Such metrics are difficult to relate directly to image processing performance, especially if the algorithm contains conditional instructions.
Assessment of computational performance can also be found on the [
13] website, where the manufacturer focuses on popular deep neural network solutions. Performance results in the form of samples/sec are presented for various NVIDIA Jetson models, taking into account power requirements. It should be noted that the indicated neural networks have relatively low input resolutions and that the consumption analysis does not take into account the various possible power supply modes. For example, it is stated that each NVIDIA Jetson module was run in the maximum performance mode, MAXN.
The issue of video sequence processing efficiency is also discussed in scientific publications. In article [
14], various algorithms such as the Canny edge detection algorithm, road line tracking, face and eye recognition, motion detection, and object detection are discussed. Raspberry Pi 4 was indicated as the experimental platform, for which maximum power and frequency values of the CPU and GPU were given, but the exact values for individual algorithms were not indicated. The Raspberry Pi 4 module was also used in the research presented in the article [
15]. The author focused on the frame rate, frame transfer delay, and frame processing time. The energy consumption aspect has not been investigated.
The results of video sequence processing performance analysis can also be found in publications for NVIDIA Jetson family modules. The paper [
16] discusses a vision system to recognize fiducial markers, including the ARTag type. The vision system consists of two Logitech HD Pro Webcam C920 cameras and an NVIDIA Jetson TX2 module that performs digital image processing. NVIDIA Jetson Orin AGX performance and power consumption analyses are presented in article [
17]. The mean Average Precision (mAP) was tested as a function of FPS (frames per second) and the different image sizes. The line detection task with the use of NVIDIA Jetson Xavier NX is discussed in [
18]. The authors proposed a CNN Encoder–Decoder network architecture and tested their solution at different image resolutions and an input image size up to 1280 × 720. In all the above papers, energy consumption aspects were not analyzed.
A benchmark analysis of NVIDIA Jetson platforms (Nano, TX2, NX, and AGX) is presented in the paper [
19]. Measurements of resource usage and power consumption are presented without considering the influence of image resolution. The performance of the NVIDIA Jetson Nano, NVIDIA Jetson TX2, and Raspberry Pi 4 single-board computers running a Convolutional Neural Network (CNN) algorithm created using a fashion product images dataset is compared in [
20]. The authors of this paper analyzed performance defined as the processing power (Central Processing Unit, CPU, and Graphics Processing Unit, GPU), memory (Random-Access Memory, RAM, and cache), power consumption, and cost. Unfortunately, no analysis of the frame processing rate (Frames Per Second, FPS) was conducted.
The experiments presented in the following sections were based on the ideas presented in our previous article [
5]. Due to the potential of using such solutions in battery-powered devices, a comprehensive analysis of both the FPS performance and the power consumption for various operating modes of the device was performed.
4. Results and Discussion
4.1. Performance Comparison of Embedded Systems
According to
Section 3, analysis of algorithm performance was performed using Raspberry Pi 4B, NVIDIA Jetson Nano, NVIDIA Jetson Xavier AGX, and NVIDIA Jetson Orin AGX modules. The aim of the experiment was to segment the lines along the runway of the airport to determine the correct path. The video sequences used show the airport areas with horizontal markings that were registered at the Poznań–Ławica Airport.
Tests were carried out in 10 s fragments, selected randomly from the entire video sequence. Additionally, each fragment was scaled to six resolutions: 1920 × 1080 (Full HD), 1600 × 900 (HD +), 1366 × 768 (HD), 1280 × 720 (WXGA), 640 × 360 (nHD), and 320 × 180.
Figure 7 and
Figure 8 show a comparison of the mean number of frames analyzed per second, obtained during the processing of 10 s fragments, depending on their resolution and the embedded system used, for the individual algorithms. It should be noted that the power consumption depends on the specifics of the image processing algorithm; thus, the sequences were randomly selected only once, and the same set was then used for all offline tests.
In all experiments, for the different platforms and the different modes of operation, identical CPU-only software (without any modification for particular hardware) was used. Tests were carried out using different power consumption modes.
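The measurement procedure described above (processing a fixed fragment with no frame-rate cap and reporting the mean number of frames per second) can be sketched as follows. This is only an illustrative harness, not the authors' test code: the names `mean_fps` and `meets_real_time` are hypothetical, and `process_frame` stands for whichever line detection algorithm is being timed. The 24 FPS threshold is the real-time limit used in this article.

```python
import time


def mean_fps(frames, process_frame, clock=time.perf_counter):
    """Process every frame as fast as possible and return the mean FPS."""
    start = clock()
    count = 0
    for frame in frames:
        process_frame(frame)  # algorithm under test (hypothetical callback)
        count += 1
    elapsed = clock() - start
    if count == 0 or elapsed <= 0:
        raise ValueError("need at least one frame and a positive elapsed time")
    return count / elapsed


def meets_real_time(fps, target_fps=24.0):
    """Return True when the measured rate reaches the assumed real-time limit."""
    return fps >= target_fps


if __name__ == "__main__":
    # Dummy workload standing in for a real 10 s video fragment.
    fps = mean_fps(range(1000), lambda frame: sum(range(100)))
    print(f"mean FPS: {fps:.1f}, real time: {meets_real_time(fps)}")
```

Passing the clock as a parameter keeps the harness easy to check with a fake time source; on real hardware the default `time.perf_counter` would be used.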
The algorithm based on image segmentation in the HSV color space, due to its lower computational complexity [
5], is characterized by shorter processing times than line detection using the Hough algorithm with Scharr mask filtering. The processing time increases with the input resolution. This relationship is visible for each embedded device tested and for each selected power consumption mode. The Raspberry Pi 4B microcomputer gave almost identical results to the NVIDIA Jetson Nano module in MAXN power mode. In the case of NVIDIA Jetson Xavier AGX, the number of processed frames per second increases with the maximum clock speed of the CPU. As shown in
Table 3, higher processor clock frequencies are available in higher power consumption modes. For the 30 W mode, the maximum clock frequency also changes with the number of processor cores used: the fewer active cores, the higher the frequency. A similar dependence can be seen for NVIDIA Jetson Orin AGX, presented in
Table 4, where the clock frequency of the processor also changes. For this algorithm, the most efficient units, NVIDIA Jetson Orin AGX and NVIDIA Jetson Xavier AGX, can process video sequences in real time at maximum resolution, obtaining values over 24 FPS.
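To illustrate why segmentation in the HSV color space is computationally light, the sketch below thresholds each pixel independently after a color conversion. It is a minimal pure-Python illustration using the standard `colorsys` module, not the implementation evaluated in this article; the function name `hsv_mask` and the threshold ranges for a yellow marking are assumptions chosen only for the example.

```python
import colorsys


def hsv_mask(rgb_image, h_range, s_range, v_range):
    """Return a binary mask of pixels whose HSV values fall inside the ranges.

    rgb_image is a list of rows of (r, g, b) tuples with values 0..255;
    all ranges are (low, high) pairs on the 0..1 scale used by colorsys.
    """
    mask = []
    for row in rgb_image:
        mask_row = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            inside = (h_range[0] <= h <= h_range[1]
                      and s_range[0] <= s <= s_range[1]
                      and v_range[0] <= v <= v_range[1])
            mask_row.append(1 if inside else 0)
        mask.append(mask_row)
    return mask


if __name__ == "__main__":
    gray = (90, 90, 90)      # asphalt-like pixel (assumed color)
    yellow = (255, 220, 0)   # marking-like pixel (assumed color)
    image = [[gray, yellow, gray], [gray, yellow, gray]]
    # Assumed thresholds for a saturated, bright yellow.
    mask = hsv_mask(image, h_range=(0.10, 0.20), s_range=(0.4, 1.0), v_range=(0.5, 1.0))
    print(mask)
```

Each pixel is visited once and needs only a conversion and a few comparisons, so the cost grows linearly with the number of pixels.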
Due to the architecture of the airport line detection program, which is single-threaded [
5], reducing the number of active processor cores does not have a negative impact on the achieved results. The higher computational complexity of line detection using the Hough algorithm with Scharr mask filtering increases the processing time on all embedded systems. For this reason, processing video sequences at 24 FPS was possible only after reducing the resolution to 1366 × 768, where 24 FPS was obtained for NVIDIA Jetson Xavier AGX in MAXN mode and 26 FPS for NVIDIA Jetson Orin AGX in MAXN mode.
Because of its greater computational complexity, line detection using the Hough algorithm with Scharr mask filtering achieves lower maximum performance on the tested embedded devices than the algorithm based on image segmentation in the HSV color space, which makes the Hough algorithm less suitable for real-time work.
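The higher cost of the second approach can be seen in a textbook-style sketch of its two stages: a Scharr gradient filter followed by Hough voting. The code below is a generic pure-Python illustration under stated assumptions, not the implementation evaluated in this article; the function names, the gradient threshold, and the one-degree angular step are all hypothetical choices.

```python
import math

# 3x3 Scharr kernels for the horizontal (x) and vertical (y) derivatives.
SCHARR_X = ((-3, 0, 3), (-10, 0, 10), (-3, 0, 3))
SCHARR_Y = ((-3, -10, -3), (0, 0, 0), (3, 10, 3))


def scharr_edge_points(gray, threshold):
    """Return (x, y) pixels whose Scharr gradient magnitude is >= threshold."""
    height, width = len(gray), len(gray[0])
    points = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for j in range(3):
                for i in range(3):
                    value = gray[y + j - 1][x + i - 1]
                    gx += SCHARR_X[j][i] * value
                    gy += SCHARR_Y[j][i] * value
            if math.hypot(gx, gy) >= threshold:
                points.append((x, y))
    return points


def strongest_hough_line(points, theta_steps=180):
    """Vote in (rho, theta) space; return (rho, theta_degrees, votes) or None."""
    if not points:
        return None
    votes = {}
    for x, y in points:
        for step in range(theta_steps):
            theta = math.pi * step / theta_steps
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(rho, step)] = votes.get((rho, step), 0) + 1
    (rho, step), count = max(votes.items(), key=lambda item: item[1])
    return rho, 180.0 * step / theta_steps, count


if __name__ == "__main__":
    # Synthetic 20 x 20 image with a one-pixel-wide bright vertical stripe.
    image = [[255 if x == 10 else 0 for x in range(20)] for _ in range(20)]
    edges = scharr_edge_points(image, threshold=1000)
    print(len(edges), strongest_hough_line(edges))
```

Every edge pixel casts one vote per angle step, so the work grows with both the number of edge pixels and the angular resolution, in addition to the nine multiplications per pixel in the filtering stage.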
4.2. Power Consumption
4.2.1. Software Power Consumption Measurement
Energy consumption was measured using the Jetson-Stats software [
30]. It is an advanced system monitoring and control package explicitly designed for the NVIDIA Jetson series, including the Orin, Xavier, Nano, and TX models. The tool (
Figure 9) is a vital resource for researchers and developers seeking in-depth analysis and performance tracking of their NVIDIA Jetson boards. Key features of Jetson-Stats include:
Hardware, architecture, L4T, and NVIDIA Jetpack decoding: The tool provides detailed information about the system’s underlying hardware and software configurations, thereby offering insights into the system’s operational parameters and potential performance-optimization opportunities.
Comprehensive monitoring: Jetson-Stats can track a wide range of system metrics, including CPU, GPU, memory, engines, and fan speeds. This monitoring feature ensures continuous awareness of system health and performance.
System control: The package allows users to manage their NVP model, fan speed, and jetson_clocks, offering comprehensive control over the performance and power management of the device.
The average power consumption is reported with a resolution of 0.1 W. For consistency with the results obtained using the Jetson-Stats software, the results of the hardware measurements using multimeters are presented with the same resolution.
Table 5 presents the results of measurements of the average power consumption during the operation of the algorithm for various power modes and different resolutions using the algorithm based on image segmentation in the HSV color space.
Table 6 presents the results for line detection using the Hough algorithm with Scharr mask filtering. The tables show the average results of the offline trials for the 10 s sequences mentioned above, with a change in resolution, randomly selected from the database pool. Measurements were made with no FPS limits.
More advanced modules, e.g., NVIDIA Jetson Orin AGX and NVIDIA Jetson Xavier AGX working in MAXN mode, have higher power requirements. This is expected, because the FPS obtained for these devices is also considerably higher, as shown in
Section 4.1. For individual power modes, the power consumption varies only slightly between resolutions, and the measured values are quite similar. This is because the FPS values were not limited and each experiment ran at its individual maximum rate. To better assess the power requirements, which depend on the type of algorithm, the resolution used, and the FPS value obtained, these dependencies are taken into account and summarized in the following subsections.
4.2.2. Power Consumption Measurement Using an Electronic Multimeter
Because an embedded system from the Raspberry Pi family was also used in this work, it was not possible to rely only on the above-mentioned software. For this reason, the authors decided to use standard laboratory multimeters. In this way, the average energy consumption of a given embedded system was obtained. It is worth noting that the Jetson-Stats software solution described earlier measures only the power consumed by the components of the microcomputer and does not take into account the power consumption of connected accessories. In the following cases, the measurements include typical accessories connected to the microcomputers: a keyboard, a mouse, and a fan.
Table 7 and
Figure 10 present the results of the average power consumption measurements, made with a power meter, during the operation of the algorithm based on image segmentation in the HSV color space for various power modes and different resolutions.
Table 8 and
Figure 11 present the results for line detection using the Hough algorithm with Scharr mask filtering.
For the algorithm based on image segmentation in the HSV color space, a lower limit of 24 FPS was assumed at the highest resolution of 1920 × 1080, which is an acceptable value for airport line detection techniques. This assumption is met by two microcomputers operating in the following configurations: NVIDIA Jetson Xavier AGX in MAXN mode and NVIDIA Jetson Orin AGX in MAXN mode. In these cases, the average power consumption is:
| NVIDIA Jetson Xavier AGX (MAXN) | 18.5 W @ 27 FPS |
| NVIDIA Jetson Orin AGX (MAXN) | 20.8 W @ 25 FPS |
In the case of the second solution, line detection using the Hough algorithm with Scharr mask filtering, the same limit was set as before: a minimum processing speed of 24 FPS at a video resolution of 1920 × 1080. In this case, none of the microcomputers reached the minimum FPS value. Only after reducing the resolution to 1366 × 768 could video sequences be processed with sufficient speed. This assumption is again met by two microcomputers operating in the following configurations: NVIDIA Jetson Xavier AGX in MAXN mode and NVIDIA Jetson Orin AGX in MAXN mode. In these cases, the average power consumption is:
| NVIDIA Jetson Xavier AGX (MAXN) | 16 W @ 24 FPS |
| NVIDIA Jetson Orin AGX (MAXN) | 20 W @ 26 FPS |
4.3. Energy Efficiency Analysis
For the FPS and power consumption values determined in
Section 4.1 and
Section 4.2, respectively, comparative charts have been prepared that show the dependence of power consumption requirements on both the resolution and the FPS values.
Figure 12 shows this dependence for the algorithm based on image segmentation in the HSV color space. The lower the value, the lower the power requirements. For faster processing speeds expressed in FPS, the MAXN modes are the best.
The comparison of the two algorithms shows that the algorithm based on image segmentation in the HSV color space can obtain the same or better FPS values with less energy consumption. For example, for NVIDIA Jetson Xavier AGX in MAXN mode, this algorithm requires 679 mJ/frame at a resolution of 1920 × 1080, while line detection using the Hough algorithm with Scharr mask filtering requires 680 mJ/frame at a resolution of only 1366 × 768.
Figure 13 shows the dependence graph for line detection using the Hough algorithm with Scharr mask filtering.
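The energy-per-frame metric used in this comparison can be reproduced from an average power and a frame rate, since one watt is one joule per second. The helper below is a minimal sketch; the name `energy_per_frame_mj` is hypothetical, and the inputs are the rounded power and FPS values quoted in Section 4.2.2. The results therefore differ slightly from the 679 and 680 mJ/frame quoted above, presumably because those were computed from unrounded measurements.

```python
def energy_per_frame_mj(avg_power_w, fps):
    """Energy needed to process one frame, in millijoules.

    avg_power_w is the average power in watts (joules per second),
    fps is the number of frames processed per second.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 * avg_power_w / fps


if __name__ == "__main__":
    # Rounded values quoted in the text for MAXN configurations.
    print(f"{energy_per_frame_mj(18.5, 27):.0f} mJ/frame")  # HSV, 1920 x 1080
    print(f"{energy_per_frame_mj(16.0, 24):.0f} mJ/frame")  # Hough, 1366 x 768
```

Dividing power by frame rate makes configurations with different speeds comparable: a faster mode that draws more power can still be cheaper per processed frame.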
4.4. Dynamic Voltage and Frequency Scaling (DVFS)
Besides the power modes defined by the NVIDIA Jetson manufacturer, hardware performance can also be controlled by dynamic voltage and frequency scaling (DVFS) [
31,
32,
33]. As part of the tests, an experiment was also carried out in which both algorithms, image segmentation in the HSV color space and line detection using the Hough algorithm with Scharr mask filtering, were run as in the previous experiments. Energy consumption was again measured using multimeters and the Jetson-Stats software. This time, however, not only the power mode of the microcomputer was changed but also the DVFS policy (possible modes: schedutil, performance, powersave, userspace, on-demand, interactive, conservative). The tests were carried out on the NVIDIA Jetson AGX Xavier microcomputer because it achieved the best results in the previous tests.
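The policy names listed above match the CPU frequency governors of the Linux CPUFreq subsystem, which exposes them through sysfs. Assuming that the DVFS policies in this experiment correspond to these governors (the text does not state how they were set), the active and available policies could be inspected as in the sketch below. The function names are hypothetical, and note that Linux spells the governor `ondemand`.

```python
from pathlib import Path

# Standard Linux CPUFreq sysfs directory for the first CPU core.
CPUFREQ_DIR = Path("/sys/devices/system/cpu/cpu0/cpufreq")


def read_governor(cpufreq_dir=CPUFREQ_DIR):
    """Return the active governor name, or None if it cannot be read."""
    try:
        return (Path(cpufreq_dir) / "scaling_governor").read_text().strip()
    except OSError:
        return None


def available_governors(cpufreq_dir=CPUFREQ_DIR):
    """Return the list of governors offered by the kernel (may be empty)."""
    try:
        text = (Path(cpufreq_dir) / "scaling_available_governors").read_text()
    except OSError:
        return []
    return text.split()


if __name__ == "__main__":
    print("active governor:", read_governor())
    print("available governors:", available_governors())
```

Changing the governor normally means writing a name to `scaling_governor` with root privileges; that step is left out here because it alters system state.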
Regarding the processing speed assumptions, the NVIDIA Jetson AGX Xavier microcomputer met them only in the MAXN mode, with the following DVFS policies: schedutil (27 FPS), performance (25 FPS), on-demand (27 FPS), interactive (26 FPS), and conservative (27 FPS), for image segmentation in the HSV color space at a resolution of 1920 × 1080. For line detection using the Hough algorithm with Scharr mask filtering, the assumptions were not met.
Figure 14 and
Figure 15 show the exact results of the experiment. It is worth mentioning that combining the MAXN power mode with the powersave DVFS policy made the program practically unusable: its running time was significantly extended, and with it the energy consumption per image frame.
Under the assumptions and algorithms used, the experiments show that the best operating configuration that meets the assumptions is the MAXN power mode with the schedutil DVFS policy (680 mJ/frame); however, in most cases, the differences are within 7%:
| schedutil | 27 FPS | 680 mJ/frame |
| performance | 25 FPS | 732 mJ/frame |
| on-demand | 27 FPS | 697 mJ/frame |
| interactive | 26 FPS | 698 mJ/frame |
| conservative | 27 FPS | 685 mJ/frame |
It is worth noting that the previously obtained results, which used the default schedutil DVFS policy, confirm that this mode is well optimized: among all the tested policies, it gave the best result (the lowest energy consumption per image frame).
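The size of the differences between policies can be checked with a few lines of arithmetic. The sketch below pairs each energy value with a DVFS policy on the assumption that the table rows follow the order in which the policies are listed in the text (schedutil, performance, on-demand, interactive, conservative); the function name `relative_overhead` is hypothetical.

```python
# Energy per frame in mJ for MAXN mode, HSV segmentation, 1920 x 1080.
# The policy labels assume the rows follow the order given in the text.
RESULTS_MJ_PER_FRAME = {
    "schedutil": 680,
    "performance": 732,
    "on-demand": 697,
    "interactive": 698,
    "conservative": 685,
}


def relative_overhead(results):
    """Return the best policy and each policy's extra energy in percent."""
    best_policy = min(results, key=results.get)
    best = results[best_policy]
    overhead = {policy: 100.0 * (value - best) / best
                for policy, value in results.items()}
    return best_policy, overhead


if __name__ == "__main__":
    best_policy, overhead = relative_overhead(RESULTS_MJ_PER_FRAME)
    print("best policy:", best_policy)
    for policy, percent in overhead.items():
        print(f"{policy:>12}: +{percent:.1f}%")
```

Under this pairing, three of the four alternatives stay within about 3% of schedutil, and only the performance policy approaches an 8% overhead.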
5. Conclusions and Future Work
New development modules using advanced microprocessors, and at the same time not having high power requirements (the ability to operate for several hours using battery power), can be successfully used in mobile monitoring devices.
The simplest Raspberry Pi 4B modules are suitable for solutions where resolution requirements are not too high. As the experiments have shown, such modules reach the standard level of 30 FPS only at resolutions up to 640 × 360. For such resolutions, the advantages of Raspberry Pi 4B modules are the low cost of the device and the relatively low energy needed to process one frame of the video sequence. This device can work with passive cooling, so an additional fan, which required an additional 0.5 W during the tests, is not necessary. An alternative to the Raspberry Pi 4B is the NVIDIA Jetson Nano module. This device has two operating modes: 5 W and MAXN. In MAXN mode, it has performance and energy requirements similar to those of Raspberry Pi 4B.
The most efficient units, NVIDIA Jetson Xavier AGX and NVIDIA Jetson Orin AGX, are the only modules that meet the FPS assumptions when processing video sequences. Although NVIDIA Jetson Orin AGX is the manufacturer’s latest solution, in the vast majority of the tests carried out in this article NVIDIA Jetson Xavier AGX achieved better results, not only in FPS values but also in the energy needed to process one frame, especially with the default schedutil DVFS policy.
The evaluation and selection of embedded system devices is a complex issue because of the several different aspects discussed in this article. The energy aspects, important from the point of view of mobile devices, have been noticed by the manufacturers of experimental modules, who provide various power modes in the solutions they offer. At the same time, there are currently few publications that comprehensively assess the use of these modes.
It should be noted that the examined algorithms have a level of difficulty typical of video sequence processing. Thus, the estimates presented in the paper can be valuable guidelines for designers of intelligent embedded systems that process video sequences. Of course, to determine energy requirements precisely, the selected algorithms should be run and measured individually.
The assessment presented in this paper is related to the authors’ previous practical solutions, while the tested sets of image processing blocks are quite universal in most video sequence processing solutions.