Sensors 2012, 12(11), 14647-14670; doi:10.3390/s121114647

Article
Energy Efficient Image/Video Data Transmission on Commercial Multi-Core Processors
Sungju Lee , Heegon Kim , Yongwha Chung * and Daihee Park
Department of Computer Information Science, Korea University, Sejong KS002, Korea; E-Mails: peacfeel@korea.ac.kr (S.L.); khg86@korea.ac.kr (H.K.); dhpark@korea.ac.kr (D.P.)
* Author to whom correspondence should be addressed; E-Mail: ychungy@korea.ac.kr.
Received: 11 July 2012; in revised form: 19 October 2012 / Accepted: 23 October 2012 / Published: 1 November 2012

Abstract

In transmitting image/video data over Video Sensor Networks (VSNs), energy consumption must be minimized while maintaining high image/video quality. Although image/video compression is well known for its efficiency and usefulness in VSNs, the excessive costs associated with encoding computation and complexity still hinder its adoption for practical use. However, it is anticipated that high-performance handheld multi-core devices will be used as VSN processing nodes in the near future. In this paper, we propose a way to improve the energy efficiency of image and video compression with multi-core processors while maintaining the image/video quality. We improve the compression efficiency at the algorithmic level or derive the optimal parameters for the combination of a machine and compression based on the tradeoff between the energy consumption and the image/video quality. Based on experimental results, we confirm that the proposed approach can improve the energy efficiency of the straightforward approach by a factor of 2∼5 without compromising image/video quality.
Keywords: video sensor network; energy efficiency; multi-core processors

1. Introduction

In transmitting image/video data over Video Sensor Networks (VSNs), energy consumption must be minimized while maintaining high image/video quality [1]. Although image/video compression is well known for its efficiency and usefulness in VSNs, the excessive costs associated with the encoding computation and complexity still hinder its adoption in practical applications. Additionally, image/video compression techniques such as JPEG, JPEG2000, and H.264 [2–4] may degrade the image/video quality compared to the original image/video. However, it is anticipated that high-performance handheld multi-core devices will be used as processing nodes of VSNs in the near future, and the use of multi-core processors for handheld devices has been increasing. Since handheld devices operate with a battery, we need to consider energy consumption for efficiently compressing image/video content while still satisfying the user's image/video quality requirements. The use of multi-core processors is a possible way to not only reduce the execution time, but also improve the energy efficiency [5,6]; thus, parallel processing techniques using multi-core processors have become attractive for satisfying both real-time and energy efficiency requirements.

Parallel processing has been widely used to reduce the execution times of applications [5]. With advances in multi-core technology, multiprocessing techniques at a system software level have been used in order to reduce energy consumption [6]. However, parallel processing on multi-core processors may increase the total power consumption due to the use of more physical cores. Therefore, we need to evaluate the power-time tradeoff quantitatively.

Generally, there is a tradeoff between power consumption and execution time [7–11]. That is, if we increase the frequency (i.e., processor speed), the power consumption is increased while the execution time is decreased. Because energy consumption is the product of the power consumption and the execution time, we need to analyze the tradeoff with the given frequency.

Previous studies [7–11] conducted by the computer architecture community were targeted at designing general-purpose processors which could be applied to several applications. Processor vendors provide several levels of frequency settings and several numbers of cores, and it is the user's role to determine the optimal configuration for his/her application. Therefore, we need to optimize the system configuration at the software level (i.e., the frequency setting and the number of cores) by analyzing the machine's characteristics and the application's parallelism collectively, because both the power consumption and the execution time depend on the number of cores and the application's parallelism.

To increase energy efficiency, compression techniques at the algorithmic level have been proposed [12–16]. Traditionally, many studies have been conducted to derive the optimal compression parameters using Rate-Distortion (R-D) analysis [12–14]. However, this traditional analysis has not considered the resource consumption of a platform, and may thus not be suitable for resource-constrained embedded devices or sensor network environments. Recently, some research results using Power-Rate-Distortion (P-R-D) analysis in order to control the power consumption of a network and maximize the video quality have been reported [15,16]. However, these analyses considered neither the compression time on the platform nor the machine's characteristics. Therefore, it is difficult to apply this analysis to an application's parallelism and energy efficiency when using a multi-core processor. Because of these difficulties, we need to analyze the characteristics of the machine and the compression collectively, and thus improve the energy efficiency of compression using a commercial multi-core processor.

In this paper, we propose Energy-Distortion (E-D) analysis in order to analyze the tradeoff between energy consumption of a platform and image/video quality in transmitting image/video data. In particular, we improve the energy efficiency of a commercial multi-core processor by using parallelism, because this analysis includes both the machine's and application's characteristics during the compression operation. Finally, we propose a general approach that can satisfy a user's requirements of image/video quality using E-D analysis.

In the experiments, we used three commercial multi-core processors (Intel quad-core i7 and dual-core i5, AMD quad-core) [17,18] and analyzed the machines' characteristics. The energy efficiency was analyzed by measuring the actual power consumption with a WT210 power meter [19]. We also used three compression algorithms (JPEG, JPEG2000, and H.264), various image/video data, and diverse network conditions. Based on the experimental results with E-D analysis, the proposed approach can improve the energy efficiency of the straightforward approach by a factor of 2∼5 compared to the transmission of uncompressed or compressed data with equal image/video quality. We used a multi-core based notebook and did not consider the data capturing step, since multi-core based sensor devices were not available to us during the experiments and our focus was only on the compression and transmission step. Also, the battery consumption is proportional to the energy consumption, and although we could not measure the battery consumption directly, we believe that the proposed approach for energy efficiency can also extend the battery life of multi-core based sensor devices.

The rest of the paper is structured as follows: Section 2 describes the properties of commercial multi-core processors, the parallelism of applications, the multimedia compressions, and the control parameters. Section 3 explains the proposed approach for E-D analysis of machine characteristics and multimedia application characteristics, and the optimization of system configuration. Finally, Sections 4 and 5 describe the experimental results and conclusions, respectively.

2. Background

2.1. Commercial Multi-Core Processors

To improve the performance of computer systems, many studies related to the developments in semiconductor processes, distributed processing, and parallel processing technologies have been reported. With the advance of integrated circuit technology, the number of transistors and the frequency of processors have been improved significantly. However, improving the frequency is no longer possible due to high power consumption and heat dissipation, which should be reduced for resource-constrained, mobile/ubiquitous environments. To handle this issue, many hardware/software level studies have been reported [5–11].

Commercial multi-core processors have different characteristics according to the hardware architecture design. In Intel's multi-core architecture [17], the L2 cache is shared by two cores. In AMD's multi-core architecture [18], the L2 cache is allocated per core. According to service requirements, various hardware components (i.e., memory, hard disk, IO devices, etc.) can be configured. Since the characteristics of the power consumption and execution time of the commercial multi-core processor depend on the design of the hardware architecture, it is difficult to generalize the power consumption and execution time characteristics. Therefore, to analyze the machine's characteristics, the power consumption and execution time need to be measured at least once.

2.2. Application's Parallelism

The execution time of an application on a multi-core processor depends on the application's parallelism. Amdahl's law provides a simple model to predict the speedup of parallel processing given the sequential portion of a program and the number of processors used.
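As a minimal sketch of the model just described, the following Python function evaluates Amdahl's law for a given parallel fraction and number of cores. The names `amdahl_speedup`, `p_app`, and `n` are chosen here for illustration only; the function is the standard textbook formula and not code from the experiments.

```python
def amdahl_speedup(p_app: float, n: int) -> float:
    """Speedup predicted by Amdahl's law.

    p_app: fraction of the program that can run in parallel (0..1).
    n:     number of cores used (>= 1).
    """
    if not 0.0 <= p_app <= 1.0:
        raise ValueError("p_app must lie in [0, 1]")
    if n < 1:
        raise ValueError("n must be at least 1")
    # Sequential part is unaffected; parallel part is divided across n cores.
    return 1.0 / ((1.0 - p_app) + p_app / n)


if __name__ == "__main__":
    # Illustrative parallel fractions: fully sequential, half parallel, fully parallel.
    for p in (0.0, 0.5, 1.0):
        speedups = [round(amdahl_speedup(p, n), 2) for n in (1, 2, 3, 4)]
        print(f"p_app={p}: speedup on 1..4 cores = {speedups}")
```

The sequential fraction caps the achievable speedup at 1/(1 − p_app), however many cores are added.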

Despite providing insight and usefulness, Amdahl's law considers neither the processor speed (i.e., frequency) nor the power consumption. All the processor speeds are implicitly assumed to have the same (maximum) value. As the energy and the power are some of the most critical shared resources in a multicore-based parallel processor, it is not only interesting, but also necessary to collectively consider the implications of parallelization on the program performance and the energy consumption. Current technologies and design trends strongly indicate that future processors will be capable of Dynamic Voltage and Frequency Scaling (DVFS or DVS in short) [6]. Therefore, we need to collectively analyze the machine's characteristics (i.e., the power and the execution time by setting the frequency and the number of cores) and the application's characteristics (i.e., the application's parallelism), and thus improve the energy efficiency of applications using a commercial multi-core processor. Note that, we apply only the frequency scaling (without the voltage scaling) with the application level command, due to the limitations of our experimental environments.

2.3. Multimedia Compression

Generally, digital image/video data can be compressed using both lossy and lossless compression techniques. Lossy compression is a technique to remove spatial and temporal redundancy [2–4]. In image compression algorithms such as JPEG and JPEG2000, transformation coding (i.e., discrete cosine transform and discrete wavelet transform) and quantization techniques have been studied in order to remove the spatial redundancy. Also, motion estimation and motion compensation have been studied in order to remove temporal redundancy between frames. Lossless compression such as Huffman coding and arithmetic coding is a technique to reduce the amount of statistical entropy.

JPEG and JPEG2000 are standards for still image compression. Notably, JPEG2000 has a rate-distortion advantage over JPEG. MPEG and H.264 are International Organization for Standardization (ISO) and International Telecommunication Union (ITU) standards for video compression. Figure 1 illustrates the H.264 video encoder.

Although image/video compression techniques can reduce the size of an original image/video, they may require more energy consumption due to the high computational complexity of the compression. Therefore, to reduce the energy consumption of image/video compression techniques, many studies using R-D analysis [12–14] or extended P-R-D analysis [15,16] have been reported.

2.4. Compression Control Parameters

In multimedia compression, the type of DCT, DWT, entropy coding and the size of the quantization table, etc., can be used as compression parameters. In this paper, we represent the compression parameter as q (i.e., Quality Level of JPEG/JPEG2000, and Quality Parameter of H.264). The purpose of q is to control the compression rate and image/video quality with a scalable quantization table. q affects not only the image/video quality, but also lossless compression part (i.e., entropy coding) after lossy compression (i.e., DCT or DWT).

In the compression procedure, the image/video is processed in 8×8 pixel blocks. Figure 2(a) shows an example of the FDCT and the Quantization Table for an 8×8 pixel block. In Figure 2(b), the quantized results are calculated by (FDCT_ij / QuantizationTable_ij) × q/100, where q = 1, 2, …, 99, 100. Since the number of zeros increases as q decreases, the computation of lossless compression and the compressed image/video size decrease, and the image/video quality also decreases. Note that the computation of lossless compression and the image/video quality are both maximized at q = 100. In contrast, both are minimized at q = 1. Therefore, we can control the amount of computation, compression rate, and image/video quality with q [2–4].
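The effect of q on the number of zero coefficients can be sketched as follows. The code applies the scaling described above, (FDCT_ij / QuantizationTable_ij) × q/100, followed by rounding. The coefficient and table values are hypothetical, a 4×4 block is used instead of 8×8 for brevity, and the function names are illustrative assumptions rather than part of any JPEG library.

```python
def quantize_block(fdct, table, q):
    """Scale each coefficient as (FDCT_ij / Q_ij) * q / 100 and round to an integer."""
    return [
        [round(c / t * q / 100) for c, t in zip(c_row, t_row)]
        for c_row, t_row in zip(fdct, table)
    ]


def count_zeros(block):
    """Number of coefficients that became zero after quantization."""
    return sum(value == 0 for row in block for value in row)


# Hypothetical FDCT coefficients and quantization table (4x4 for brevity).
FDCT = [
    [-415.0, -30.0, -61.0, 27.0],
    [4.0, -22.0, -61.0, 10.0],
    [-47.0, 7.0, 77.0, -25.0],
    [-49.0, 12.0, 34.0, -15.0],
]
TABLE = [
    [16, 11, 10, 16],
    [12, 12, 14, 19],
    [14, 13, 16, 24],
    [14, 17, 22, 29],
]

if __name__ == "__main__":
    for q in (100, 50, 10, 1):
        zeros = count_zeros(quantize_block(FDCT, TABLE, q))
        print(f"q={q:3d}: {zeros} zero coefficients out of 16")
```

More zeros mean shorter runs for the entropy coder to process, which is why both the lossless-coding computation and the output size shrink as q decreases.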

3. Proposed Approach

We propose an experiment-based model in order to evaluate the performance of a given application on a machine collectively. We measure the power consumption of a test application “only once” with every combination of the number of cores and frequency of a machine in order to understand the machine's characteristics. Then, we measure the execution time of a given application only with the single core and at maximum frequency of a machine in order to understand the application's characteristics. With these two measurements, we can estimate the energy performance of the given application with “any” combination of the number of cores and frequency of the machine. Also, we propose a greedy approach to find the optimal parameters for the energy efficiency in transmitting image/video data without compromising image/video quality.

3.1. Machine's and Application's Characteristics

First, to understand the machine's and application's characteristics, we measured the power consumption, execution time, and the energy consumption of parallelized AES-CBC (i.e., 0% parallelism), AES-CCM (i.e., 50% parallelism) and AES-CTR (i.e., 100% parallelism) [21] with the Pthread library [20] as examples of test applications on the Intel i7 and AMD multi-core processors. The AES-CTR problem has no data dependency and is easily parallelized. In contrast, AES-CCM has 50% data dependency, and AES-CBC has 100% data dependency. According to Amdahl's law, the maximum speedup of AES-CTR with a 4-core processor is 4, while the speedup of AES-CCM is bounded by 2 (1.6 with four cores). Note that AES-CCM combines encryption and authentication, and it is widely used in wireless applications.

Figure 3 shows the power consumption and execution time of the test applications with 0%, 50%, 100% parallelism on multi-core processors, with various frequencies and numbers of cores. The power consumption, the execution time, and the energy consumption were normalized based on the case with a single core and maximum frequency. As shown in Figure 3, the power consumption increased and execution time decreased with increased frequency and number of cores. In the results, it can be seen that these characteristics have similar patterns for each processor. Since increasing or decreasing rates of power consumption and execution time are different across processors, the power consumption and execution time of a processor should be measured at least once in order to analyze the processor's characteristics. As shown in Figure 3, we found that applications with less parallelism can use fewer cores, and thus less power is consumed.

Although an application with less parallelism requires less power consumption, it may consume more energy due to greater execution time. Figure 4 shows the execution time of AES-CBC, AES-CCM, and AES-CTR on 1, 2, 3, and 4 cores. AES-CBC (0% parallelism) can be performed with increased number of the cores, but both the power consumption and the execution time are always constant (see Figures 3 and 4). In contrast, as we increase the number of cores in AES-CTR (100% parallelism), the execution time decreases while the power consumption increases. To improve the energy efficiency, we need a collective analysis of the machine and application characteristics.

Figure 5 shows the energy consumption with various parallel applications on Intel and AMD processors. On the Intel processor, the optimal frequency is always 1,462 MHz, but each optimal number of cores is different for each amount of parallelism: one core (0% parallelism), three cores (50% parallelism), and four cores (100% parallelism). On the AMD processor, the optimal frequency is always 1,796 MHz, and the optimal number of cores is also different for different amounts of parallelism: one core (0% parallelism), four cores (50% parallelism), and four cores (100% parallelism). In this paper, we propose a way to improve the energy efficiency by using optimal machine parameters (i.e., the frequency and the number of cores) according to application's parallelism. We generated a performance metric for the power consumption in order to understand the machine's characteristics, and then predicted the energy consumption by an application's parallelism using Amdahl's law.

3.2. Collective Analysis of Machine's and Application's Characteristics

First, we analyze the relationship between the application/machine and the energy consumption. The power consumption and the execution time depend on the characteristics of the machine and the application. Thus, we can represent the energy consumption E by Equation (1) with power consumption W and execution time T:

$$E = W \times T \tag{1}$$

To analyze the power consumption and the execution time with an application's parallelism, we denote the application's parallelism as papp, where 0 ≤ papp ≤ 1. The application's parallelism (i.e., papp), the frequency (i.e., f), and the number of cores (i.e., n) strongly affect the energy consumption of a processor, as shown in Figure 6. Thus, the energy consumption is represented as Equation (2). To reduce the energy consumption, we need to set the optimal f and n with a prediction of the energy consumption from the given application and machine characteristics.

$$E(f, n, p_{\mathrm{app}}) = W(f, n, p_{\mathrm{app}}) \times T(f, n, p_{\mathrm{app}}) \tag{2}$$

The power consumption can be measured with an application having 100% parallelism (i.e., AES-CTR). With an increased number of cores, the power consumption is also increased. We can also find that the power consumption depends on the number of cores. Thus, when the combination of application and machine characteristics are given, we can analyze the application's parallelism. We can predict the power consumption by using Equation (3) with the measured results. We focus only on the dynamic power consumption of the whole multi-core based platform at the compression and transmission step although the static power consumption at the idle time is not negligible.

Note that the power varies during the execution of the given application. We measured the power consumption at several points and took the average. For simplicity, we used this average value as the power consumption value. Note also that an application consists of a sequential portion (having some data dependency) and a parallel portion (not having any data dependency). We denote the power consumption of the sequential portion of the application with 1 core as Wsequential(f, 1) and the power consumption of the parallel portion of the application with n cores as Wparallel(f, n). As shown in Figure 3 (with the 0% parallelism case), the power consumption of the sequential portion of the application is independent of the number of cores. Therefore, Wsequential(f, 1) = Wsequential(f, n) (i.e., the power consumption of the sequential portion of the application with n cores).

$$W(f, n, p_{\mathrm{app}}) \approx W_{\mathrm{sequential}}(f, 1) \times (1 - p_{\mathrm{app}}) + W_{\mathrm{parallel}}(f, n) \times p_{\mathrm{app}} \tag{3}$$

Also, the total execution time (i.e., T(f, n, papp)) with various numbers of cores can be predicted using Equation (4). Wsequential(f, 1) and Tsequential(f, 1) represent the power consumption and the execution time of the sequential portion of the application, respectively. As shown in Figures 3 and 4 (parallelism of 0% case), both Wsequential(f, 1) and Tsequential(f, 1) are independent of the number of cores. In contrast, Wparallel(f, n) and Tparallel(f, n) represent the power consumption and the execution time of the parallel portion of the application, respectively. As shown in Figures 3 and 4 (parallelism of 100% case), both Wparallel(f, n) and Tparallel(f, n) depend on the number of cores.

We denote the execution time of the sequential portion of the application with 1 core as Tsequential(f, 1) and the execution time of the parallel portion of the application with n cores as Tparallel(f, n). As shown in Figure 4 (with the 0% parallelism case), the execution time of the sequential portion of the application is independent of the number of cores. Therefore, Tsequential(f, 1) = Tsequential(f, n) (i.e., the execution time of the sequential portion of the application with n cores). Note that, if we denote the execution time of the parallel portion of the application with 1 core as Tparallel(f, 1), then Tparallel(f, n) is not equal to Tparallel(f, 1)/n in a strict sense, due to the Pthread overhead. However, Tparallel(f, n) can be approximately equal to Tparallel(f, 1)/n with careful parallelization:

$$T(f, n, p_{\mathrm{app}}) \approx T_{\mathrm{sequential}}(f, 1) \times (1 - p_{\mathrm{app}}) + \frac{T_{\mathrm{parallel}}(f, 1)}{n} \times p_{\mathrm{app}} \tag{4}$$
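The prediction described by Equations (2) to (4) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the normalized power values in the example are made up, not the measured values from the experiments.

```python
def predict_power(w_seq_f1, w_par_fn, p_app):
    """Equation (3): power weighted by the sequential and parallel fractions."""
    return w_seq_f1 * (1.0 - p_app) + w_par_fn * p_app


def predict_time(t_seq_f1, t_par_f1, n, p_app):
    """Equation (4): the parallel part is assumed to scale as T_parallel(f, 1) / n."""
    return t_seq_f1 * (1.0 - p_app) + (t_par_f1 / n) * p_app


def predict_energy(w_seq_f1, w_par_fn, t_seq_f1, t_par_f1, n, p_app):
    """Equation (2): energy is predicted power times predicted time."""
    power = predict_power(w_seq_f1, w_par_fn, p_app)
    time = predict_time(t_seq_f1, t_par_f1, n, p_app)
    return power * time


if __name__ == "__main__":
    # Hypothetical normalized power of a fully parallel test run on 1..4 cores
    # at one fixed frequency (1.0 = single core).
    w_parallel = {1: 1.0, 2: 1.3, 3: 1.6, 4: 1.9}
    p_app = 0.97  # parallelism reported for JPEG in Section 4.2.1
    for n, w_par in w_parallel.items():
        e = predict_energy(1.0, w_par, 1.0, 1.0, n, p_app)
        print(f"n={n}: predicted normalized energy = {e:.3f}")
```

Because the power term grows with n while the time term shrinks, the product has to be evaluated per configuration; neither factor alone identifies the best setting.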

3.3. E-D Analysis

In general, to control the compression rate and image/video quality, compression parameters are widely used by the multimedia compression community. Recently, to improve the energy efficiency, Rate-Distortion (R-D) and Power-Rate-Distortion (P-R-D) analysis have been reported [15,16]. In this paper, we propose E-D analysis in order to analyze the energy efficiency of the machine and the required image/video quality collectively.

R-D or P-R-D analysis is not enough to evaluate multimedia compression algorithms such as JPEG, JPEG2000, and H.264 in terms of the energy consumption and image/video quality. However, the proposed E-D analysis can evaluate them. Figure 7 compares the performance of JPEG, JPEG2000, and H.264.

With E-D analysis, the energy consumption to compress/transmit the multimedia data Ecomp+trans is represented as Equation (5):

$$E_{\mathrm{comp+trans}} = E_{\mathrm{comp}} + E_{\mathrm{trans}} \tag{5}$$

The image/video quality (i.e., distortion) is represented as Equation (6), where PSNR (i.e., peak signal to noise ratio) is widely used as a performance indicator to evaluate image/video distortion by the multimedia compression community. In this paper, we represent the compression parameter as q (i.e., Quality Level of JPEG, JPEG2000, and Quality Parameter of H.264). The purpose of q is to control the compression rate and image/video quality with a scalable quantization table:

$$D(q) = \mathrm{PSNR} \tag{6}$$
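For reference, PSNR can be computed as in the following sketch, which uses the standard definition 10·log10(MAX² / MSE) for samples with peak value MAX. The function name and the flat-list representation of pixels are assumptions made for illustration; the text does not specify how PSNR was computed in the experiments.

```python
import math


def psnr(original, compressed, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(original) != len(compressed) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all samples.
    mse = sum((a - b) ** 2 for a, b in zip(original, compressed)) / len(original)
    if mse == 0:
        return math.inf  # identical images: no distortion
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    reference = [52, 55, 61, 66, 70, 61, 64, 73]  # hypothetical 8-bit samples
    degraded = [50, 56, 60, 68, 69, 60, 66, 72]
    print(f"PSNR = {psnr(reference, degraded):.2f} dB")
```

A higher PSNR means less distortion, so a quality requirement is naturally stated as a lower bound on D(q).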

Figure 8 shows the energy consumption and the image/video quality with the q parameter. We found that q affects both the compression energy consumption and the transmission energy consumption. To minimize the total energy consumption, we need collective analysis that considers machine and application characteristics.

To analyze the energy consumption and image/video quality by controlling q, we can find the image/video quality (i.e., PSNR) with q as shown in Figure 9. Specifically, we use three types of multimedia data (HALL_MONITOR, FOREMAN, and COAST_GUARD) of CIF size, and three compression algorithms (JPEG, JPEG2000, and H.264). The image/video quality of each compression algorithm varies with q in a similar way. Thus, controlling q is a possible way to satisfy a user's image/video quality requirements.

Figure 10 shows the total energy consumption with q. In fact, the power consumption may not be affected by q, but the execution time depends on q. Therefore, q should be determined in order to improve the energy efficiency by using the E-D analysis while satisfying the user's image requirements.

Figure 11 shows the result of the E-D analysis on a commercial multi-core platform (i.e., Intel i7 quad-core processors) in different network environments (i.e., a wired network that supports 100 Mbps with 15 W, and a wireless network that supports 11 Mbps with 11 W). As shown in Figure 11, the energy consumption of compression/transmission depends on the machines, the parallelism of the applications, and the network environment.

This is because the compression computation affects the machine's energy consumption, and both the compression ratio and the transmission bandwidth affect the transmission's energy consumption. Also, in these given environments (i.e., the machines, the parallelism of the applications, the network environment), we should determine whether the compression is applied or not. For example, in Figure 11(a) with JPEG and a wired network, the un-compression/transmission case is always better than the compression/transmission case. However, parallel-compression/transmission using 4 cores can reduce the energy consumption of the un-compression/transmission. Also, in Figure 11(b) with JPEG and a wireless network, both the compression/transmission and the parallel-compression/transmission are always better than the un-compression/transmission. Therefore, given these environments (i.e., commercial multi-core platforms and compression algorithms), we should select the compression/transmission, the parallel-compression/transmission, or the un-compression/ transmission by using the E-D analysis.

3.4. Optimization of System Configuration

In this paper, we propose a greedy approach to find the optimal parameters for the energy efficiency in transmitting image/video data without compromising image/video quality. Algorithm 1 shows the procedure to find the optimal frequency f and the number of cores n by using a greedy approach.

Algorithm 1. Finding Optimal Machine Parameters.
given the environment parameter
 papp ← application's parallelism
set the default parameters
 f ← maximum frequency
 n ← 1 core
do {
 calculate E(f, n, papp)
 if (n_next is not last level) {
  n_next ← next increased level
  calculate E(f, n_next, papp)}
 if (f_next is not last level) {
  f_next ← next decreased level
  calculate E(f_next, n, papp)}
 if (E(f, n_next, papp) < E(f, n, papp)) n ← n_next
 if (E(f_next, n, papp) < E(f, n_next, papp)) f ← f_next
} while (E(f, n, papp) < E(f, n_next, papp) AND E(f, n, papp) < E(f_next, n, papp))
f_opt ← f // found optimal frequency
n_opt ← n // found optimal cores
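One possible reading of the greedy search in Algorithm 1 is sketched below in Python. It is an interpretation, not the authors' implementation: the stopping rule (stop when neither the next core count nor the next lower frequency reduces the energy) and the choice of moving to the lower-energy neighbor are assumptions, and `energy` is a hypothetical callable that returns the predicted E(f, n, papp) for a fixed application.

```python
def greedy_machine_parameters(energy, freqs, cores):
    """Greedy descent over (frequency, cores), starting at max frequency and 1 core.

    energy: callable (f, n) -> predicted energy (assumed to be given).
    freqs:  available frequencies, sorted from highest to lowest.
    cores:  available core counts, sorted from lowest to highest.
    """
    fi, ni = 0, 0
    current = energy(freqs[fi], cores[ni])
    while True:
        candidates = []
        if ni + 1 < len(cores):  # next increased core level
            candidates.append((energy(freqs[fi], cores[ni + 1]), fi, ni + 1))
        if fi + 1 < len(freqs):  # next decreased frequency level
            candidates.append((energy(freqs[fi + 1], cores[ni]), fi + 1, ni))
        if not candidates:
            break
        best = min(candidates)
        if best[0] >= current:
            break  # neither neighbor improves the energy
        current, fi, ni = best
    return freqs[fi], cores[ni], current


if __name__ == "__main__":
    # Toy energy surface with a single minimum, for illustration only.
    def toy_energy(f, n):
        return (f - 1500) ** 2 / 1e6 + (n - 3) ** 2

    print(greedy_machine_parameters(toy_energy, [2000, 1500, 1000], [1, 2, 3, 4]))
```

Like any greedy descent, this sketch can stop at a local minimum if the energy surface is not well behaved; an exhaustive scan over all (f, n) pairs is a simple cross-check when the number of levels is small.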

Note that papp is a given parameter obtained from the application's parallelism. The total energy consumption, represented as Equation (5), consists of compression energy Ecomp and transmission energy Etrans. Ecomp is represented by a compression parameter q as in Equation (7):

$$E_{\mathrm{comp}}(q) = W_{\mathrm{comp}}(q) \times T_{\mathrm{comp}}(q) \tag{7}$$

Since the compression energy consumption should be considered for the given machine and parallel application, Ecomp is represented as in Equation (8). D(q) is the image/video quality with compression parameters, and D0 (i.e., PSNR) is the user's requirement of image/video quality:

$$E_{\mathrm{comp}}(f, n, p_{\mathrm{compress}}, q) = W_{\mathrm{comp}}(f, n, p_{\mathrm{compress}}, q) \times T_{\mathrm{comp}}(f, n, p_{\mathrm{compress}}, q) \tag{8}$$

We also need to analyze the transmission energy consumption to minimize the total energy consumption. The transmission energy consumption Etrans is represented as Equation (9):

$$E_{\mathrm{trans}} = W_{\mathrm{trans}} \times T_{\mathrm{trans}} \tag{9}$$

The machine, network environment, and compression rate affect the transmission energy consumption. Thus, the transmission energy consumption is represented as Equation (10). M is the compressed data size determined by the compression parameter (i.e., q), and B is the network bandwidth (i.e., unit: bit per second).

$$E_{\mathrm{trans}}(q, B) = W_{\mathrm{trans}} \times \frac{M(q)}{B} \tag{10}$$
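As a small worked example of the transmission energy model, the following sketch evaluates W_trans × M / B for one uncompressed CIF frame on the two network settings mentioned in Section 3.3 (100 Mbps at 15 W and 11 Mbps at 11 W). The frame size assumes raw YUV 4:2:0 sampling at 352×288 pixels, which is an assumption made here for illustration; the function name is also hypothetical.

```python
def transmission_energy(w_trans, size_bits, bandwidth_bps):
    """E_trans = W_trans * M / B, in joules (W in watts, M in bits, B in bit/s)."""
    if bandwidth_bps <= 0:
        raise ValueError("bandwidth must be positive")
    return w_trans * size_bits / bandwidth_bps


# Assumption: one raw CIF frame, 352x288 pixels, YUV 4:2:0 (1.5 bytes per pixel).
CIF_FRAME_BITS = 352 * 288 * 3 // 2 * 8

if __name__ == "__main__":
    networks = (("wired", 15.0, 100e6), ("wireless", 11.0, 11e6))
    for name, watts, bps in networks:
        joules = transmission_energy(watts, CIF_FRAME_BITS, bps)
        print(f"{name}: {joules:.4f} J per uncompressed frame")
```

Since M(q) shrinks with stronger compression, the transmission term falls as q decreases, while the compression term depends on the machine configuration.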

By using Equations (6) and (11) collectively, we can minimize the total energy consumption Ecomp+trans while satisfying the user's image/video quality requirements:

$$\min E_{\mathrm{comp+trans}}(f, n, p_{\mathrm{compress}}, q, B) = \min \left[ E_{\mathrm{comp}}(f, n, p_{\mathrm{compress}}, q) + E_{\mathrm{trans}}(q, B) \right] \quad \mathrm{s.t.} \quad D(q) > D_0 \tag{11}$$

Finally, we can find the optimal compression and machine parameters (i.e., the frequency f and the number of cores n) by using Algorithm 2.

Algorithm 2. Finding Optimal Machine and Compression Parameters.
given environment parameters
 pcompress ← compression application's parallelism
 B ← network bandwidth
 D0 ← user's requirement for image/video quality
find machine's parameters by using algorithm 1
 f ← f_opt
 n ← n_opt
set the default compress parameter
 q ← maximum image/video quality parameter
do {
 calculate Ecomp+trans(f, n, pcompress, q, B)
 q_next ← next decreased image/video quality parameter
 calculate Ecomp+trans(f, n, pcompress, q_next, B)
 if (Ecomp+trans(f, n, pcompress, q_next, B) < Ecomp+trans(f, n, pcompress, q, B))
  q ← q_next
} while (D(q) > D0)
q_opt ← q // found optimal compress parameter
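The search over q in Algorithm 2 can be sketched as follows. This is a simplified interpretation rather than the authors' code: it scans q from the highest level downward, assumes that D(q) decreases monotonically with q, and keeps the lowest-energy q that still satisfies D(q) > D0. The callables `total_energy` and `distortion` are hypothetical stand-ins for Ecomp+trans and D(q) at the machine parameters found by Algorithm 1.

```python
def optimal_quality_parameter(total_energy, distortion, q_levels, d0):
    """Return (q_opt, energy) with the lowest energy among q satisfying D(q) > d0.

    total_energy: callable q -> E_comp+trans at fixed f, n, p_compress, B (assumed given).
    distortion:   callable q -> quality D(q) in dB, assumed to decrease with q.
    q_levels:     available quality parameters.
    d0:           required quality; only q with D(q) > d0 is acceptable.
    """
    best_q, best_energy = None, float("inf")
    for q in sorted(q_levels, reverse=True):
        if distortion(q) <= d0:
            break  # quality requirement violated; lower q would be worse
        e = total_energy(q)
        if e < best_energy:
            best_q, best_energy = q, e
    return best_q, best_energy


if __name__ == "__main__":
    # Toy models for illustration only: energy and quality both grow with q.
    energy_model = lambda q: 1.0 + 0.1 * q
    quality_model = lambda q: 20.0 + 0.2 * q
    print(optimal_quality_parameter(energy_model, quality_model, range(1, 101), 30.0))
```

If no q meets the requirement, the sketch returns `None` for q, which a caller would need to treat as "requirement not attainable with this codec".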

In addition, we can select the compression/transmission, the parallel-compression/transmission, or the un-compression/transmission scenario by using Algorithm 3.

Algorithm 3. Selection of the Minimum Energy Consumption Scenario.
given environment parameters
 pcompress ← compression application's parallelism
 B ← network bandwidth
set the optimal parameters by using algorithm 1 and 2
 f ← f_opt
 n ← n_opt
 q ← q_opt
if (Etrans(no_compress) < Ecomp+trans(f, n, pcompress, q, B)) select Etrans(no_compress)
else select Ecomp+trans(f, n, pcompress, q, B)
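The final selection step amounts to a single comparison, sketched below. The function name, the scenario labels, and the numeric values are illustrative assumptions; the two energies are expected to come from the transmission model and from the optimization of Algorithms 1 and 2.

```python
def select_scenario(e_trans_no_compress, e_comp_trans):
    """Pick the scenario with the lower energy, following the comparison in Algorithm 3.

    e_trans_no_compress: energy to send the data without compression.
    e_comp_trans:        energy to compress (possibly in parallel) and then send.
    """
    if e_trans_no_compress < e_comp_trans:
        return "un-compression/transmission", e_trans_no_compress
    return "compression/transmission", e_comp_trans


if __name__ == "__main__":
    # Hypothetical energies in joules for a fast and a slow network.
    print(select_scenario(0.18, 0.25))  # fast link: sending raw data is cheaper
    print(select_scenario(1.22, 0.40))  # slow link: compressing first is cheaper
```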

4. Experimental Results

We present the experimental results. The experimental environment is described in Section 4.1. Then, the energy efficiency that results from using the E-D analysis is explained in Section 4.2.

4.1. Experimental Environments

To evaluate the energy efficiency that results from using the E-D analysis, we configured the experimental environment as shown in Figure 12.

We used three commercial multi-core platforms (i.e., Intel quad-core i7 and dual-core i5, AMD quad-core), which are summarized in Table 1.

We configured the network environment as wired (100 Mbps) and wireless (11 Mbps). Table 2 shows the power consumption of the network devices on the i7, i5, and AMD platforms, respectively.

Figure 13 shows the configuration of the measurement environment. We measured the actual power consumption using a WT210 power meter [19]. We considered the power consumption of the whole system at the compression/transmission step with various machine and application parameters.

We used three compression algorithms (i.e., JPEG, JPEG2000, and H.264), and various image/video data. For parallel compression algorithms, we parallelized JPEG and JPEG2000 with Pthread [20], and used the parallel H.264 of the PARSEC benchmark suite [23]. We selected CIF-size HALL_MONITOR, FOREMAN, and COAST_GUARD from the image/video data set [22], and Figure 14 shows these input data.

4.2. Experimental Analysis

4.2.1. Accuracy Validation of Prediction Parameters

First, to evaluate the prediction accuracy, we measured the performance of AES-CTR with 100% parallelism on each machine. Tables 3–5 show the normalized energy consumption of each machine. With these results, we can predict the energy consumption and find the optimal frequency and number of cores. We normalized the power consumption, execution time, and energy consumption based on a single core and the maximum frequency, and the user's image/video quality requirements.

We also analyzed the parallelism (pcompress) of the JPEG, JPEG2000, and H.264 applications, which was 0.97, 0.95, and 0.93, respectively. With the parallelism analyzed, we can predict the normalized energy consumption and find the machine parameters (i.e., frequency f and number of cores n). Table 6 shows the estimated and measured results from the energy consumption analysis.
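To make the comparison in Table 6 concrete, the following sketch computes the gap between the estimated and measured normalized energy for each platform and algorithm. The percentages are copied from Table 6; the names TABLE6 and abs_errors are hypothetical, and the error metric (absolute difference in percentage points) is our choice for illustration, not one stated in the paper.

```python
# (estimated %, measured %) normalized energy, copied from Table 6.
TABLE6 = {
    ("i7", "JPEG"): (42, 39),
    ("i7", "JPEG2000"): (44, 40),
    ("i7", "H.264"): (46, 38),
    ("i5", "JPEG"): (56, 57),
    ("i5", "JPEG2000"): (57, 59),
    ("i5", "H.264"): (58, 59),
    ("AMD", "JPEG"): (36, 33),
    ("AMD", "JPEG2000"): (38, 35),
    ("AMD", "H.264"): (40, 35),
}


def abs_errors(table):
    """Absolute estimated-vs-measured gap, in percentage points, per entry."""
    return {key: abs(est - meas) for key, (est, meas) in table.items()}


errors = abs_errors(TABLE6)
worst = max(errors, key=errors.get)
mean_error = sum(errors.values()) / len(errors)
print(worst, errors[worst], round(mean_error, 2))  # prints ('i7', 'H.264') 8 3.33
```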

Table 7 shows the estimated and measured results from the E-D analysis on the i7, i5, and AMD platforms over wired and wireless networks (i.e., 100 Mbps and 11 Mbps), with a quality requirement of PSNR > 30 dB. Based on the results, we confirmed that our prediction of energy consumption is accurate and can determine the optimal machine and compression parameters to improve the energy efficiency while satisfying the quality requirement. Finally, we can select the minimum energy consumption scenario by comparing the E-D analysis result with the un-compression scenario.
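The choice of a compression parameter under the quality requirement can be pictured as a constrained minimum search: among the candidate values of q whose PSNR exceeds 30 dB, pick the one with the lowest energy. The sketch below is a simplified illustration under that assumption. The names Candidate and pick_q are hypothetical, and the sweep values are made up for the example rather than taken from the measurements.

```python
from typing import List, NamedTuple, Optional


class Candidate(NamedTuple):
    q: int              # compression parameter
    psnr_db: float      # image/video quality obtained with this q
    energy_pct: float   # normalized energy for compression plus transmission


def pick_q(candidates: List[Candidate], min_psnr_db: float = 30.0) -> Optional[Candidate]:
    """Lowest-energy candidate whose PSNR exceeds the requirement, or None."""
    feasible = [c for c in candidates if c.psnr_db > min_psnr_db]
    return min(feasible, key=lambda c: c.energy_pct, default=None)


# Hypothetical sweep over q; these numbers are invented for illustration.
sweep = [
    Candidate(q=10, psnr_db=28.1, energy_pct=35.0),
    Candidate(q=20, psnr_db=30.4, energy_pct=44.0),
    Candidate(q=50, psnr_db=34.9, energy_pct=70.0),
]
print(pick_q(sweep))
```

With these invented numbers the sketch selects q = 20: q = 10 is cheaper but violates the PSNR requirement, and q = 50 satisfies it at a higher energy cost.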

4.2.2. Results from E-D Analysis

To evaluate the energy efficiency that results from using the E-D analysis, we compared several scenarios with the proposed approach, as shown in Table 8. The baseline scenarios 1-A and 1-B are the un-compression/transmission case and the compression/transmission case, respectively. In scenario 1, we use a single core at the maximum frequency, and in scenario 1-B we set the q parameter to 25 (H.264) or 50 (JPEG and JPEG2000). Scenarios 2 and 3 are the computer architectural approach and the multimedia compression approach, respectively. In scenario 2, we set the optimal machine parameters (i.e., the frequency and the number of cores) and fix the compression parameter q at 25 or 50. In scenario 3, we set the optimal compression parameter and use a single core at the maximum frequency. Finally, in scenario 4, we set the optimal machine and compression parameters collectively by using the E-D analysis.

Scenario 4 is a way to improve the energy efficiency with both the machine and multimedia compression parameters collectively. Table 9 shows the results of the optimal machine and multimedia compression parameters.

Finally, the results for scenarios 1, 2, 3, and 4 on each machine are shown in Figures 15 and 16. In the given environments, scenario 4 (i.e., E-D analysis) can provide the minimum energy consumption. The wireless network consumed more energy than the wired network. With JPEG2000 in the wired network environment shown in Figure 15(b), the energy consumption of scenario 1-A (i.e., un-compression) was less than that of scenarios 2, 3, and 4. However, scenario 4 can provide the minimum energy consumption with the wireless network, as shown in Figure 16(b). Since the energy consumption of H.264 is more affected by the multimedia compression parameters than by the machine parameters, scenario 3 consumed less energy than scenario 2. However, scenario 4 can provide the minimum energy consumption, regardless of the network. Therefore, in the given environments, we can reduce the energy consumption by using the E-D analysis for a given image/video quality.

We focused on reducing the energy consumption at the compression/transmission step by using multi-core based sensor nodes. However, the latency at the compression/transmission step is also important. To evaluate the effect of the proposed approach (i.e., scenario 4 in Table 8: the optimal number of cores with the optimal frequency and the optimal compression parameter) on the latency, we compared the elapsed time at the compression/transmission step. As shown in Figure 17, the proposed approach can also reduce the elapsed time relative to the straightforward approach (i.e., scenario 1-B in Table 8: a single core with the maximum frequency and the default compression parameter).
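The elapsed time at the compression/transmission step can be approximated as the compression time plus the payload size divided by the link bandwidth. The sketch below is a rough illustration under several assumptions that are ours, not the paper's: a raw CIF frame stored as YUV 4:2:0 (352 x 288 pixels, 1.5 bytes per pixel), the nominal 11 Mbps link rate with no protocol overhead, and a hypothetical 10:1 compression that takes 5 ms. The function name elapsed_seconds is also hypothetical.

```python
# One raw CIF frame, assuming YUV 4:2:0 sampling (1.5 bytes per pixel).
CIF_FRAME_BYTES = 352 * 288 * 3 // 2


def elapsed_seconds(payload_bytes: int, bandwidth_bps: float,
                    compress_seconds: float = 0.0) -> float:
    """Compression time plus the time to push the payload over the link."""
    return compress_seconds + payload_bytes * 8 / bandwidth_bps


WIRELESS_BPS = 11e6  # nominal wireless rate used in the experiments

raw = elapsed_seconds(CIF_FRAME_BYTES, WIRELESS_BPS)
# Hypothetical compression: 10:1 size reduction at a cost of 5 ms.
compressed = elapsed_seconds(CIF_FRAME_BYTES // 10, WIRELESS_BPS,
                             compress_seconds=0.005)
print(f"raw: {raw:.4f} s, compressed: {compressed:.4f} s")
```

Under these assumptions the compressed frame is delivered sooner despite the added compression time; on a faster link the balance can shift the other way.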

5. Conclusions

Multi-core processors have recently been adopted in embedded systems, in addition to PCs and servers, and many studies have therefore sought to apply commercial multi-core processors to real applications. This paper proposed an approach that provides both high energy efficiency and high image/video quality by analyzing machine and application characteristics collectively for a given multi-core platform and network environment. We proposed the E-D analysis to capture the tradeoff between the energy consumption of a platform and the image/video quality. In particular, we improved the energy efficiency of a commercial multi-core platform by exploiting parallelism, because this analysis includes both the machine's characteristics and the application's characteristics during the compression operation. Based on the experimental results with image/video data and the Pthread programming model, the proposed approach with the E-D analysis can improve the energy efficiency of typical approaches used by the computer architecture or multimedia compression communities by a factor of 2∼5 with equal multimedia quality. We believe the proposed approach can be applied to real scenarios such as VSNs with multi-core processors in the near future.

This research was supported by the MKE, Korea, under the HNRC ITRC support program supervised by the NIPA (NIPA-2012-H0301-12-1002) and supported by the National Research Foundation of Korea Grant funded by the Korean Government (NRF-2012-S1A5B5A01- 2012R1A6A3A01040440).

References

  1. Makkaoui, L.; Lecuire, V.; Moureaux, J. Fast Zonal DCT-based Image Compression for Wireless Camera Sensor Networks. Proceedings of the International Conference on Image Processing Theory Tools and Applications, Paris, France, 7–10 July 2010; pp. 126–129.
  2. Wallace, G. The JPEG still picture compression standard. IEEE Consum. Electron. Soc. 1992, 38, 108–124.
  3. Skodras, A.; Christopoulos, C.; Ebrahimi, T. The JPEG 2000 still image compression standard. IEEE Signal Proc. Mag. 2001, 18, 36–58.
  4. Richardson, I. H.264 and MPEG-4 Video Compression; Wiley Online Library: New York, NY, USA, 2003.
  5. Kumar, V.; Grama, A.; Gupta, A.; Karypis, G. Introduction to Parallel Computing—Design and Analysis of Algorithms; The Benjamin/Cummings Pub. Co. Inc.: San Francisco, CA, USA, 1994.
  6. Huang, X.; Li, K.; Li, R. An energy efficient scheduling base on dynamic voltage and frequency scaling for multi-core embedded real-time system. LNCS 2009, 5574, 137–145.
  7. Brooks, D.; Tiwari, V.; Martonosi, M. A Framework for Architectural-Level Power Analysis and Optimizations. Proceedings of the International Symposium on Computer Architecture, Vancouver, BC, Canada, 12–14 June 2000; pp. 83–94.
  8. Wang, H.-S.; Zhu, X.; Peh, L.-S.; Malik, S. Orion: A Power-Performance Simulator for Interconnection Networks. Proceedings of the 35th Annual IEEE/ACM International Symposium on Microarchitecture, Istanbul, Turkey, 18 November 2002; pp. 294–305.
  9. Butts, J.; Sohi, G. A Static Power Model for Architects. Proceedings of the Annual IEEE/ACM International Symposium on Microarchitecture, Monterey, CA, USA, 10–13 December 2000; pp. 191–201.
  10. Freeh, V.; Lowenthal, D.; Pan, F.; Kappiah, N.; Speringer, R.; Rountree, B.; Femal, M. Analyzing the Energy-Time Trade-Off in High-Performance Computing Applications. IEEE Trans. Parallel Distrib. Syst. 2007, 18, 835–848.
  11. Lively, C.; Wu, X.; Taylor, V.; Moore, S.; Chang, H.; Cameron, K. Energy and Performance Characteristics of Different Parallel Implementations of Scientific Applications on Multicore Systems. Int. J. High Perform. Comput. Appl. 2011, 25, 342–350.
  12. Taylor, C.; Dey, S. Adaptive Image Compression for Wireless Multimedia Communication. Proceedings of IEEE International Conference on Communications, Helsinki, Finland, 11–14 June 2001; Volume 6, pp. 1925–1929.
  13. Lee, D.; Kim, H.; Rahimi, M.; Estrin, D.; Villasenor, J. Energy-Efficient Image Compression for Resource-Constrained Platforms. IEEE Trans. Image Process. 2009, 18, 2100–2113.
  14. Ferrigno, L.; Paciello, V.; Pietrosanto, A. Balancing Computational and Transmission Power Consumption in Wireless Image Sensor Networks. Proceedings of the IEEE International Conference on VECIMS, Giardini Naxos, Italy, 18–20 July 2005; pp. 61–66.
  15. He, Z.; Liang, Y.; Chen, L.; Ahmad, I.; Wu, D. Power-Rate-Distortion Analysis for Wireless Video Communication under Energy Constraint. IEEE Trans. Circuits Syst. Video Technol. 2005, 15, 645–658.
  16. He, Z.; Cheng, W.; Zhao, X.; Millspaugh, J.; Moll, R.; Beringer, J.; Sartwell, J. Energy-Aware Portable Video Communication System Design for Wildlife Activity Monitoring. IEEE Circuits Syst. Mag. 2008, 8, 25–37.
  17. Intel Processor Chipset. Available online: http://www.intel.com/content/www/us/en/chipsets/ (accessed on 10 July 2012).
  18. AMD Processor. Available online: http://www.amd.com/us/products/desktop/processors/ (accessed on 10 July 2012).
  19. Hirofumi, N.; Naoya, N.; Katsuya, T. WT210/WT230 Digital Power Meters. Yokogawa TR 35l; Yokogawa Electric Corp.: Tokyo, Japan, 2003.
  20. Barney, B. POSIX Threads Programming. Available online: http://www.llnl.gov/computing/tutorials/pthreads (accessed on 20 October 2006).
  21. Dworkin, M. Recommendation for Block Cipher Modes of Operation: The CCM Mode for Authentication and Confidentiality. NIST Special Publication 800-38C, 2004.
  22. Video Test Media. Available online: http://media.xiph.org/video/derf/ (accessed on 20 November 2000).
  23. Bienia, C.; Kumar, S.; Singh, J.P.; Li, K. The PARSEC Benchmark Suite: Characterization and Architectural Implications. Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, Toronto, ON, Canada, 25–29 October 2008; pp. 72–81.
Figure 1. H.264 encoder [19].

Figure 2. Illustration of q (i.e., Quality Level or Quality Parameter).

Figure 3. The power consumption with various test applications on multi-core platforms.

Figure 4. The execution time with test applications on multi-core platforms.

Figure 5. The energy consumption with test applications on multi-core processors.

Figure 6. The relationship between application/machine characteristics and the energy consumption.

Figure 7. Comparison of performance with JPEG, JPEG2000, and H.264.

Figure 8. The relationship between the energy consumption and the image/video quality.

Figure 9. PSNR with q.

Figure 10. The energy consumption with q.

Figure 11. E-D analysis on commercial multi-core processors in various network environments.

Figure 12. The experimental environment.

Figure 13. Configuration of the power measurement environment.

Figure 14. Image/Video data set [22].

Figure 15. The energy consumption with various scenarios over wired network.

Figure 16. The energy consumption with various scenarios over wireless network. (a) The energy consumption with JPEG on i7, i5, and AMD; (b) the energy consumption with JPEG2000 on i7, i5, and AMD; (c) the energy consumption with H.264 on i7, i5, and AMD.

Figure 17. The elapsed time with JPEG/JPEG2000/H.264 in wired and wireless network.

Table 1. Platform specifications of the Intel i7, Intel i5, and AMD processors.

| | i7 | i5 | AMD |
|---|---|---|---|
| Processor | Intel i7 720QM | Intel i5 core | AMD Phenom II |
| Frequency range | 1.0 GHz∼1.5 GHz | 0.9 GHz∼1.5 GHz | 0.7 GHz∼1.7 GHz |
| Frequency step | 133 MHz | 100 MHz | 500/300/200 MHz |
| The maximum # of cores | 4 | 2 | 4 |
| Network device (wired) | Intel(R) 82577LM Gigabit Network Connection | Realtek PCIe GBE Family Controller | JMicron PCI Express Gigabit Ethernet Adapter |
| Network device (wireless) | Intel(R) Centrino(R) Advanced-N 6200 AGN | Broadcom 802.11n Network Adapter | Atheros AR9285 Wireless Network Adapter |

Table 2. Power consumption of the network devices on i7, i5, and AMD platforms.

| | i7 | i5 | AMD |
|---|---|---|---|
| Wired (100 Mbps) | 28.5 W | 17.0 W | 37.5 W |
| Wireless (11 Mbps) | 24.5 W | 19.0 W | 38.5 W |

Table 3. Normalized energy consumption on i7 platform.

| Actual (i7) | 1 core | 2 cores | 3 cores | 4 cores |
|---|---|---|---|---|
| 1,595 MHz | 100% | 63% | 49% | 41% |
| 1,462 MHz | 99% | 59% | 47% | 39% |
| 1,329 MHz | 108% | 61% | 47% | 41% |
| 1,197 MHz | 117% | 65% | 50% | 41% |
| 1,064 MHz | 131% | 71% | 53% | 44% |

Table 4. Normalized energy consumption on i5 platform.

| Actual (i5) | 1 core | 2 cores |
|---|---|---|
| 1,397 MHz | 100% | 55% |
| 1,297 MHz | 106% | 57% |
| 1,197 MHz | 115% | 62% |
| 1,097 MHz | 123% | 66% |
| 997 MHz | 136% | 74% |

Table 5. Normalized energy consumption on AMD platform.

| Actual (AMD) | 1 core | 2 cores | 3 cores | 4 cores |
|---|---|---|---|---|
| 1,796 MHz | 100% | 56% | 43% | 34% |
| 1,597 MHz | 107% | 61% | 45% | 37% |
| 1,298 MHz | 176% | 92% | 67% | 54% |
| 798 MHz | 210% | 107% | 75% | 60% |

Table 6. The estimated and measured results from the energy consumption analysis.

| Platform | Algorithm (pcompress) | Estimated f, n (MHz, # of cores) | Measured f, n (MHz, # of cores) | Estimated normalized energy | Measured normalized energy |
|---|---|---|---|---|---|
| i7 | JPEG (0.97) | 1462, 4 | 1462, 4 | 42% | 39% |
| i7 | JPEG2000 (0.95) | 1462, 4 | 1462, 4 | 44% | 40% |
| i7 | H.264 (0.93) | 1462, 4 | 1462, 4 | 46% | 38% |
| i5 | JPEG (0.97) | 1397, 2 | 1397, 2 | 56% | 57% |
| i5 | JPEG2000 (0.95) | 1397, 2 | 1397, 2 | 57% | 59% |
| i5 | H.264 (0.93) | 1397, 2 | 1397, 2 | 58% | 59% |
| AMD | JPEG (0.97) | 1796, 4 | 1796, 4 | 36% | 33% |
| AMD | JPEG2000 (0.95) | 1796, 4 | 1796, 4 | 38% | 35% |
| AMD | H.264 (0.93) | 1796, 4 | 1796, 4 | 40% | 35% |

Table 7. The estimated and measured results from E-D analysis on i7, i5, and AMD platforms.

| Platform | Algorithm | E-D analysis | Machine parameters f, n (MHz, # of cores) | Compression parameter q (Distortion(q) > 30 dB) | Normalized energy consumption (wired/wireless) |
|---|---|---|---|---|---|
| i7 | JPEG | Estimated | 1462, 4 | 17 | 43%/60% |
| i7 | JPEG | Measured | 1462, 4 | 20 | 44%/63% |
| i7 | JPEG2000 | Estimated | 1462, 4 | 31 | 39%/39% |
| i7 | JPEG2000 | Measured | 1462, 4 | 33 | 39%/39% |
| i7 | H.264 | Estimated | 1462, 4 | 44 | 15%/14% |
| i7 | H.264 | Measured | 1462, 4 | 37 | 18%/19% |
| i5 | JPEG | Estimated | 1397, 2 | 17 | 63%/91% |
| i5 | JPEG | Measured | 1397, 2 | 20 | 63%/91% |
| i5 | JPEG2000 | Estimated | 1397, 2 | 31 | 55%/57% |
| i5 | JPEG2000 | Measured | 1397, 2 | 33 | 55%/58% |
| i5 | H.264 | Estimated | 1397, 2 | 44 | 11%/9% |
| i5 | H.264 | Measured | 1397, 2 | 37 | 12%/10% |
| AMD | JPEG | Estimated | 1796, 4 | 17 | 37%/98% |
| AMD | JPEG | Measured | 1796, 4 | 20 | 38%/98% |
| AMD | JPEG2000 | Estimated | 1796, 4 | 31 | 39%/46% |
| AMD | JPEG2000 | Measured | 1796, 4 | 33 | 41%/67% |
| AMD | H.264 | Estimated | 1796, 4 | 44 | 4%/3% |
| AMD | H.264 | Measured | 1796, 4 | 37 | 6%/4% |

Table 8. Scenarios of the image/video transmission.

| Scenario | Frequency | # of cores | Compression parameter q |
|---|---|---|---|
| Scenario 1-A. BASELINE: Un-compression and Transmission | Maximum | 1 core | - |
| Scenario 1-B. BASELINE: Compression and Transmission | Maximum | 1 core | 25 (H.264) or 50 (JPEG/JPEG2000) |
| Scenario 2: Computer Architectural Approach | Optimum | Optimum | 25 (H.264) or 50 (JPEG/JPEG2000) |
| Scenario 3: Multimedia Compression Approach | Maximum | 1 core | Optimum |
| Scenario 4: Optimization with E-D Analysis | Optimum | Optimum | Optimum |

Table 9. The optimal machine and multimedia compression parameters.

| Algorithm | Parameter | i7 | i5 | AMD |
|---|---|---|---|---|
| JPEG | Frequency f | 1,462 MHz | 1,397 MHz | 1,796 MHz |
| JPEG | # of cores n | 4 | 2 | 4 |
| JPEG | Compression parameter q (PSNR = 30.22 dB) | 17 | 17 | 17 |
| JPEG2000 | Frequency f | 1,462 MHz | 1,397 MHz | 1,796 MHz |
| JPEG2000 | # of cores n | 4 | 2 | 4 |
| JPEG2000 | Compression parameter q (PSNR = 30.22 dB) | 31 | 31 | 31 |
| H.264 | Frequency f | 1,462 MHz | 1,397 MHz | 1,796 MHz |
| H.264 | # of cores n | 4 | 2 | 4 |
| H.264 | Compression parameter q (PSNR = 30.22 dB) | 44 | 44 | 44 |

Sensors, EISSN 1424-8220. Published by MDPI AG, Basel, Switzerland.