Electronics
  • Article
  • Open Access

10 December 2020

A Time-Based Dynamic Operation Model for Webpage Steganography Methods

Department of Defense Science (Computer Engineering), Graduate School of Defense Management, Korean National Defense University, Nonsan 33021, Korea
*
Author to whom correspondence should be addressed.
This article belongs to the Section Networks

Abstract

Webpage steganography has been used as a covert communication method for various purposes: a sender embeds a secret message into a plain webpage file, such as an HTML file, by using one of various steganography methods. It is very difficult for the human eye to distinguish the original webpage (cover webpage) from the modified webpage carrying the secret data (stego webpage) because both are displayed alike in a web browser. In this approach, when two communicating entities want to share a secret message, a sender uploads a stego webpage to a web server, or modifies an existing webpage on the web server, by using a webpage steganography method, and a receiver then accesses the stego webpage to download it and extract the embedded secret data. According to our extensive survey, most webpage steganography studies focused on proposing or improving steganography algorithms but did not address how to operate a stego webpage as time passes. However, if a stego webpage is used in a static way, such that it does not change and is constantly exposed to web clients until the sender removes it, this static operation limits or badly affects the hiding capacity and undetectability of a webpage steganography method. Motivated by this, in this paper, we propose a time-based dynamic operation model (TDOM) that improves the performance of existing webpage steganography methods in terms of hiding capacity and undetectability by dynamically replacing the stego webpage with other stego webpages or the original webpage. In addition, we designed two time-based dynamic operation algorithms: TDOA-C, which improves the hiding capacity of existing methods, and TDOA-U, which improves their undetectability.
To validate our model and show the performance of our proposed algorithms, we conducted extensive comparative experiments and numerical analysis by implementing two webpage steganography methods with our TDOM (CCL with TDOA-C and COA with TDOA-C) and tested them in the web environment. According to our experiments and analysis, our proposed algorithms could significantly improve the hiding capacity and undetectability of two existing webpage steganography methods.

1. Introduction

Steganography techniques have been used for covert communication between a sender and a receiver who want to hide their secret communication even in the presence of unauthorized entities [1,2]. In this approach, the sender creates a stego medium by embedding a secret message into a cover medium (e.g., image, audio, video, text, webpage, etc.) by using a steganography method suited to the characteristics of the cover medium. Due to their undetectability, steganography techniques have been used for terrorism, crime, and espionage, and it is not difficult to find many related cases and reports in the news media [3,4,5].
The webpage steganography technique is one of the representative steganography techniques; it uses a webpage file (e.g., HTML, XHTML, XML, SMIL, etc.) as a cover medium [6]. The webpage created with the secret message is called a stego webpage. Webpage steganography exploits the fact that the view of a webpage file such as HTML rendered in a web browser is less sensitive to changes in its syntax than programs written in other languages such as Java or Python. Thus, a sender can easily hide a secret message in a webpage file by manipulating its source code such that the changes do not affect the view of the webpage in the web browser, which avoids detection by a monitoring entity (warden). In addition, the webpage steganography technique can deliver a secret message to more receivers than other types of steganography techniques. While a sender delivers a secret message to one receiver or a group of receivers in other types of steganography techniques (Figure 1b), a stego webpage is deployed on a web server that is accessed by many web users, so a secret message can be delivered efficiently to many receivers, including anonymous receivers (Figure 1a).
Figure 1. Webpage steganography technique vs. other types of steganography techniques.
As a representative webpage steganography-based cyberattack case, Kaspersky released a report in 2019 on Platinum, one of the well-known hacking groups [7]. According to the report, Platinum had leaked critical information from governmental and military domains of Southeast Asian countries by using two webpage steganography methods to hide its behavior. One method adds as many special characters, such as spaces and tabs, as needed to embed secret data into HTML files, because those characters are not rendered when the files are parsed by the web browser (Figure 2a). The other method changes the order of attributes in HTML files; it uses the fact that the order of attributes in the same tag does not affect the display generated by the web browser (Figure 2b).
Figure 2. Webpage steganography methods used by Platinum [7].
According to our extensive survey [6,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31], most existing studies did not consider how the stego webpage should be operated as time passes, although this issue should be well addressed. If a stego webpage is operated simply in a static manner, the stego webpage does not change as time passes, and this static operation makes a webpage steganography method very inefficient and inferior in terms of hiding capacity and undetectability. We discuss this in detail in Section 2.3.
On the other hand, if we operate a stego webpage in a dynamic manner, such that the stego webpage is replaced with other stego webpages containing different secret messages or with its original webpage (cover webpage), we believe that the hiding capacity and undetectability of the webpage steganography method used for generating the stego webpage can be significantly improved. Motivated by this, in this study, we propose and design our time-based dynamic operation model for webpage steganography methods.
Our contributions in this study can be summarized as follows:
  • We proposed a time-based dynamic operation model (TDOM) that improves the performance of existing webpage steganography methods in terms of hiding capacity and undetectability.
  • We designed two time-based dynamic operation algorithms: TDOA-C, which improves the hiding capacity of existing methods, and TDOA-U, which improves the undetectability of existing methods.
  • We conducted comparative experiments and numerical analysis to validate our model and show the performance of our proposed algorithms. For this, we implemented two webpage steganography methods with our TDOM (CCL with TDOA-C and COA with TDOA-C) and tested them in the web environment; CCL stands for Changing the Cases of Letters in tags and attributes and COA stands for Changing the Order of Attributes.
  • In addition to the above contributions, we hope that this study can provide useful information about webpage steganography techniques and their dynamic operations, which can be used in cyberattacks or cybercrimes, raise an alarm to security engineers and researchers, and, thus, attract them to research defense mechanisms and techniques against them. Meanwhile, we note that our study can be used in positive use cases because the steganography-based covert communication can be used to avoid unauthorized and illegal eavesdroppers.
The rest of this paper is organized as follows. In Section 2, we review the background and related studies. In Section 3, we propose our time-based dynamic operation model (TDOM) for existing webpage steganography methods and design two time-based dynamic operation algorithms (TDOA-C and TDOA-U) based on two existing methods (CCL and COA). In Section 4, we conduct comparative experiments and numerical analysis. Finally, we conclude in Section 5.

3. Proposed Model and Algorithms

3.1. Our Approach: Time-Based Dynamic Operation Model (TDOM)

To improve the hiding capacity and undetectability of existing webpage steganography methods, we propose a simple but effective Time-based Dynamic Operation Model (TDOM) for them that controls when a stego webpage is replaced with a new stego webpage with a different secret message for higher capacity or when a stego webpage is exposed to users for higher undetectability.
Figure 6 shows how our TDOM (dynamic operation model) can improve the hiding capacity and undetectability of an existing method that is operated in a static manner (static operation model). We note that, according to our survey, no static operation model is clearly defined for existing steganography methods; for better understanding, we refer to operation in a static manner as the static operation model in this study. We compare here how an existing method works differently when it is operated in the static operation model and in our dynamic operation model.
Figure 6. Comparison of the static operation model and our Time-based Dynamic Operation Model (TDOM).
  • For higher hiding capacity (or larger amount of secret message delivery): Given the same time slot ($t_2 - t_1$), as shown in Figure 6a, a webpage steganography method delivers only one message (#1) in the static operation model. Meanwhile, as shown in Figure 6b, it can deliver four messages (#1–#4) in our TDOM since four different stego webpages are uploaded in turn in the time slot.
  • For higher undetectability: Given the same time slot ($t_2 - t_1$), as shown in Figure 6c, a stego webpage remains exposed to users, and, thus, if a powerful monitor (detection system) visits the stego webpage, the monitor can detect the existence of the stego webpage. Meanwhile, as shown in Figure 6d, when the stego webpage is deactivated and replaced with the original webpage two times, the chance that the monitor detects the stego webpage decreases accordingly.
Our model is designed as a generic architecture for existing webpage steganography methods. Thus, it is not necessary to modify existing methods, and our model can be easily combined with them. Although TDOM can be designed and implemented in various ways considering the degree of hiding capacity and undetectability, which are in a trade-off relationship, we show two time-based dynamic operation algorithms: (1) TDOA-C, mainly focusing on hiding capacity, and (2) TDOA-U, mainly focusing on undetectability.

3.2. Time-based Dynamic Operation Algorithm for Hiding Capacity (TDOA-C)

As briefly explained in Section 3.1, TDOA-C focuses on improving the hiding capacity of an existing webpage steganography method without modifying it. As shown in Figure 7, when TDOA-C is combined with an existing webpage steganography method, the hiding capacity of the method over a given time period can be significantly improved through the following steps. Assume that a sender $S$ wants to send a secret message $M_F$ to a receiver $R$ by using a certain webpage steganography method $WSM$. In addition, the sender uses a cover webpage $CW$ (original webpage) whose maximum hiding capacity is $HC[CW] = c$ bits. Algorithm 1 describes TDOA-C for the sender and receiver.
Figure 7. Working steps of Time-based Dynamic Operation Algorithm for hiding Capacity (TDOA-C).
  • Step 1. $S$ partitions $M_F$ into $n$ smaller messages $M_1, M_2, \ldots, M_n$.
  • Step 2. $S$ creates $n$ stego webpages $SW_1, SW_2, \ldots, SW_n$ by using $WSM$, $CW$, and $M_1, M_2, \ldots, M_n$.
  • Step 3. $S$ uploads $SW_1, SW_2, \ldots, SW_n$ to the web server, in turn, at the times $t_i \in T = \{t_1, t_2, \ldots, t_n\}$, which are agreed with $R$. For this step, the sender and receiver exchange such information by using another covert channel or in person.
  • Step 4. $R$ receives each stego webpage, in turn, by accessing the URL where the cover webpage is located at the times agreed with $S$.
  • Step 5. $R$ extracts the partitioned secret messages $M_1, M_2, \ldots, M_n$ from the received stego webpages.
  • Step 6. $R$ restores the complete secret message $M_F$ by combining $M_1, M_2, \ldots, M_n$.
In this manner, the existing method with TDOA-C can deliver $n \times c$ bits, which means that our algorithm raises the hiding capacity of the method to $n \times 100\%$ of its capacity in the static operation model (an $n$-fold increase). The hiding capacity (or the amount of delivered secret messages) $HC[\text{TDOA-C}]$ can be expressed as Equation (1) below.
$$HC[\text{TDOA-C}] = \sum_{i=1}^{n} HC(SW_i) = n \times c \qquad (1)$$
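Steps 1 and 6 and the capacity relation in Equation (1) can be sketched in a few lines of Python. This is only an illustration: the paper does not specify how the partitioning is done, so the equal-size chunking and the helper name `tdoa_c_capacity` are assumptions made here.

```python
def partition(message: bytes, n: int) -> list[bytes]:
    """Step 1 (assumed scheme): split the full message into n consecutive chunks."""
    if n <= 0:
        raise ValueError("n must be positive")
    size = max(-(-len(message) // n), 1)  # ceiling division, at least 1 byte
    return [message[i * size:(i + 1) * size] for i in range(n)]


def restore(parts: list[bytes]) -> bytes:
    """Step 6: concatenate the extracted chunks in order."""
    return b"".join(parts)


def tdoa_c_capacity(n: int, c: int) -> int:
    """Equation (1): n stego webpages, each carrying at most c bits."""
    return n * c


chunks = partition(b"attack at dawn", 4)
print(chunks, restore(chunks))
print(tdoa_c_capacity(n=10, c=3690))
```

Trailing chunks may be shorter (or empty) when the message length is not a multiple of the chunk size; the order of the chunks is what allows the receiver to restore the message.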
Algorithm 1 TDOA-C (Time-Based Dynamic Operation Algorithm for Hiding Capacity)
Definition
Message
  M_F: a full secret message to send to R
  M_S = {M_1, M_2, ..., M_n}: a set of partitioned messages
  n: the number of partitioned messages
Webpage
  CW: cover webpage
  SW = {SW_1, SW_2, ..., SW_n}: a set of stego webpages
Time
  T = {t_1, t_2, ..., t_n}: a set of times agreed with R
  t_start, t_end: the start and end time of covert communication between S and R
  t_current: current time
Function
  embed(CW, M_n): embeds the secret message M_n into CW; returns SW_n
  extract(SW_n): extracts the embedded secret message from SW_n; returns M_n
  partition(M_F, n): partitions M_F into n smaller messages M_1, M_2, ..., M_n; returns M_S
  restore(M_S): restores M_F by combining M_1, M_2, ..., M_n; returns M_F

Sender
Input: M_F, n, CW, T, t_start, t_end, t_current
Output: M_S, SW
Functions: partition(M_F, n), embed(CW, M_i)
begin
  M_S ← partition(M_F, n)  #Step1
  for i from 1 to n do  #Step2
    SW_i = embed(CW, M_i)
    SW ← SW_i  #add an element SW_i into set SW
  while t_start ≤ t_current < t_end
    for j from 1 to n do  #Step3
      while True  #infinitely repeat
        if t_current == t_j then
          upload SW_j to webserver
          break
end

Receiver
Input: n, T, t_start, t_end, t_current
Output: SW, M_S, M_F
Functions: extract(SW_n), restore(M_S)
begin
  while t_start ≤ t_current < t_end
    for i from 1 to n do  #Step4
      while True  #infinitely repeat
        if t_current == t_i then
          request SW_i to a webserver
          SW ← SW_i from a webserver
          break
  for j from 1 to n do  #Step5
    M_j = extract(SW_j)
    M_S ← M_j  #add M_j into set M_S
  M_F = restore(M_S)  #Step6
end
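The sender and receiver loops of Algorithm 1 can be exercised without a real web server. The following is a minimal sketch under stated assumptions: the clock is a simulated integer counter, the web server is a dictionary, and `embed`/`extract` are a toy trailing-whitespace encoding chosen only for brevity (they stand in for CCL or COA and are not the paper's implementation).

```python
def embed(cover: str, chunk: bytes) -> str:
    """Toy stand-in for a steganography method: append one space (0) or tab (1) per bit."""
    bits = "".join(f"{byte:08b}" for byte in chunk)
    return cover + "".join("\t" if bit == "1" else " " for bit in bits)


def extract(stego: str) -> bytes:
    """Read the trailing spaces and tabs back into bytes."""
    tail = stego[len(stego.rstrip(" \t")):]
    bits = "".join("1" if ch == "\t" else "0" for ch in tail)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def run_tdoa_c(message: bytes, cover: str, times: list[int],
               t_start: int, t_end: int) -> bytes:
    """Simulate Steps 1-6 with one shared, simulated clock."""
    n = len(times)
    size = max(-(-len(message) // n), 1)
    chunks = [message[i * size:(i + 1) * size] for i in range(n)]   # Step 1
    stego_pages = [embed(cover, chunk) for chunk in chunks]        # Step 2
    server = {"page": cover}   # hypothetical in-memory web server
    received = []
    for t_current in range(t_start, t_end):
        for j, t_j in enumerate(times):
            if t_current == t_j:
                server["page"] = stego_pages[j]    # Step 3: sender uploads SW_j
                received.append(server["page"])    # Step 4: receiver fetches it
    return b"".join(extract(page) for page in received)            # Steps 5-6


cover = "<html><body>hello</body></html>"
recovered = run_tdoa_c(b"SECRET MESSAGE", cover, times=[0, 10, 20, 30],
                       t_start=0, t_end=40)
print(recovered)
```

In this sketch the receiver reads the page in the same clock tick in which the sender uploads it, which sidesteps the time synchronization issue that a real deployment would have to handle.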

3.3. Time-based Dynamic Operation Algorithm for Undetectability (TDOA-U)

We introduce another algorithm (TDOA-U), based on our TDOM and the on-off strategy [37], for improving the undetectability of an existing webpage steganography method. Unlike in the static operation model, with this algorithm the sender $S$ controls when a stego webpage is activated (on) and deactivated (off) as time passes. As shown in Figure 8, the stego webpage $SW$ can be accessed by the receiver $R$ only while it is activated (on), and, thus, a secret message can be delivered to $R$ only during that time. Meanwhile, when a stego webpage is deactivated (off), it is replaced with the original cover webpage $CW$ without any secret message, and, thus, the monitor will not detect the stego webpage when it accesses $CW$. Consequently, the chance that the monitor detects the stego webpage decreases according to the time for which $SW$ is deactivated within a given time period.
Figure 8. Working steps of Time-based Dynamic Operation Algorithm for Undetectability (TDOA-U).
Our TDOA-U works through the following steps (see Algorithm 2). We assume the following. First, a monitor $M$ with steganalysis capability periodically checks whether a webpage (a cover webpage $CW$) is a stego webpage. Second, a sender uses a dynamic webpage as a cover webpage to avoid a simple webpage change detector (or a file integrity checker) without a steganography detection function. Lastly, there are multiple receivers who want to receive the same secret message from the web server for a certain period of time.
  • Step 1. The sender $S$ creates stego webpages $SW_1, SW_2, \ldots, SW_n$ by embedding a secret message into a cover webpage $CW$. We assume that all $n$ stego webpages carry the same secret message.
  • Step 2. $S$ uploads $SW_1, SW_2, \ldots, SW_n$ to the web server, in turn, at the times $t_{S_i} \in T_S = \{t_{S_1}, t_{S_2}, \ldots, t_{S_n}\}$ and replaces them with $CW$ at the times $t_{C_i} \in T_C = \{t_{C_1}, t_{C_2}, \ldots, t_{C_n}\}$, where both $T_S$ and $T_C$ are agreed in advance between $S$ and $R$. For this step, the sender and receiver need to exchange such information by using another covert channel or in person. In addition, each $SW_i$ is replaced with $CW$ again after a small amount of time to avoid the monitor. Therefore, accurate time synchronization between $S$ and $R$ is very important for this step to succeed. Moreover, to successfully get $SW_i$ even in the presence of a possible time difference between $S$ and $R$ due to unexpected network or processing delay, the receiver may need to access the web address of $SW_i$ multiple times. We do not delve into designing a sophisticated time synchronization and guaranteed delivery method for $S$ and $R$ in this study; instead, we leave it for our future research.
  • Step 3. $R$ receives each stego webpage, in turn, by accessing the URL where the cover webpage is located at the time period agreed with $S$.
  • Step 4. $R$ can extract the secret message $M$ from the received stego webpages $SW_1, SW_2, \ldots, SW_n$, which means that $R$ can obtain $M$ at any time between $t_{S_i}$ and $t_{C_i}$.
Algorithm 2 TDOA-U (Time-based Dynamic Operation Algorithm for Undetectability)
Definition
Message
  M: a secret message to send to R
  N_S: the number of stego webpages to send to R
Webpage
  CW: cover webpage
  SW = {SW_1, SW_2, ..., SW_n}: a set of stego webpages
Time
  T_S = {t_S1, t_S2, ..., t_Sn}: a set of times at which S uploads SW to the web server
  T_C = {t_C1, t_C2, ..., t_Cn}: a set of times at which S uploads CW to the web server
  t_start, t_end: the start and end time of covert communication between S and R
  t_current: current time
Function
  embed(CW, M_n): embeds the secret message M_n into CW; returns SW_n
  extract(SW_n): extracts the embedded secret message from SW_n; returns M_n
  restore(M_S): restores M_F by combining M_1, M_2, ..., M_n; returns M_F

Sender
Input: M, N_S, CW, T_S, T_C, t_start, t_end, t_current
Output: SW
Function: embed(CW, M)
begin
  for i from 1 to n do  #Step1
    SW_i = embed(CW, M)
    SW ← SW_i  #add an element SW_i into set SW
  while t_start ≤ t_current < t_end
    for j from 1 to N_S do  #Step2
      while True  #infinitely repeat
        if t_current == t_Sj then
          upload SW_j to webserver
        else if t_current == t_Cj then
          upload CW to webserver
          break
end

Receiver
Input: T_S, T_C, t_start, t_end, t_current
Output: SW, M
Functions: extract(SW_n), restore(M_S)
begin
  while t_start ≤ t_current < t_end
    for i from 1 to n do  #Step3
      while True  #infinitely repeat
        if t_Si ≤ t_current < t_Ci then
          request SW_i to webserver
          SW ← SW_i from webserver
          break
  choice j from 1 to n  #Step4
  M = extract(SW_j)
end
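The on-off schedule that Algorithm 2 implements can be expressed as a simple lookup: given the agreed times, which page does the server hold at time t? The sketch below is illustrative only; the function name `page_at` and the example schedule are assumptions, not values from the paper.

```python
def page_at(t: int, stego_on: list[int], cover_on: list[int]) -> str:
    """Return "stego" if t falls in some window [t_Si, t_Ci), otherwise "cover"."""
    for t_s, t_c in zip(stego_on, cover_on):
        if t_s <= t < t_c:
            return "stego"
    return "cover"


# Hypothetical schedule: three 30-second stego windows in a 300-second operation.
T_S = [0, 100, 200]
T_C = [30, 130, 230]

# Total number of seconds during which a visitor would see the stego webpage.
exposure = sum(page_at(t, T_S, T_C) == "stego" for t in range(0, 300))
print(exposure)
```

A receiver that knows `T_S` and `T_C` only needs to visit inside one of the windows, whereas a monitor that visits at an arbitrary time sees the cover webpage for most of the operation time.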

3.4. The Hybrid Model of Combining TDOA-C and TDOA-U

As described previously, we designed TDOA-C and TDOA-U for hiding capacity and undetectability, respectively. These two algorithms can be combined in a hybrid manner depending on which factor, hiding capacity (the amount of secret message delivery) or undetectability, is more necessary. As shown in Figure 9, the hybrid model can be designed and implemented in various ways by considering factors such as the goal of covert communication between $S$ and $R$, the expected monitoring and detection capability and its frequency or cycle, the amount of a secret message or its delivery frequency, and so on. In this paper, we neither implement this hybrid model nor consider it in our comparative experiments in Section 4, since the main goal of this study is to show how our idea (dynamic operation model) can improve existing webpage steganography methods rather than to show the optimized performance of our proposed model; we leave the latter for future work.
Figure 9. A hybrid approach that combines TDOA-C and TDOA-U.

4. Experimental Results

In this section, we conduct two experiments for the following purposes.
  • Experiment 1: Validating TDOA-C has a higher hiding capacity (or a larger amount of secret message delivery) than the existing static model.
  • Experiment 2: Validating TDOA-U has a higher undetectability than the existing static model.
We will describe details on the experimental purpose, methods, and results of each experiment. All our experiments were conducted with Python 3 on a laptop (Intel i5 10th GEN and 16GB RAM).

4.1. Experiment 1: Validation of TDOA-C

Experiment 1 consists of two parts (Part 1 and Part 2). In Part 1, we implement an existing method combined with our TDOA-C in a real web environment and show that our model works. In Part 2, we conduct numerical analysis to validate that our TDOA-C improves the hiding capacity of two representative existing methods (CCL and COA) compared with their hiding capacity in the static operation model.

4.1.1. Experiment 1. Part 1: Implementation of TDOA-C in the Web Environment

To show that our TDOA-C works in a real web environment, we built a web server and then implemented TDOA-C according to Algorithm 1 (see Figure 10). We used the Flask web framework [38] to build a web server on our laptop and the urllib.request library [39] to receive stego webpages from the web server.
Figure 10. Experimental design of Experiment 1. Part 1.
Based on the constructed web environment, we conducted Part 1 of Experiment 1 as follows (see Figure 10). For the secret message $M_F$, we generated 100 UUIDs (Universally Unique Identifiers) by using the UUID version 4 algorithm [40] and used the combination of the 100 UUIDs as $M_F$. Each UUID consists of 32 hexadecimal digits (16 bytes). Next, we created 100 stego webpages by embedding each UUID into a cover webpage, uploaded each $SW_i$ to the web server every 10 s, and then accessed and received $SW_i$ from the web server. We consider that our algorithm works in the web environment if the embedded and extracted UUIDs are the same.
The experimental results show that all 100 UUIDs match exactly. Table 1 shows part of the experimental results of Part 1 for $n = 1, 2, 3, 4, 5, 10, 50$, and $100$. For example, for $SW_1$ ($n = 1$), the sender embedded the first UUID (b8ab647e-4042-46bc-87b4-615c720ab068) into $CW$ to generate $SW_1$, and the receiver could extract the same UUID from the received $SW_1$. Since all 100 UUIDs were received and extracted successfully at the receiver, $M_F$ (the combined 100 UUIDs) was delivered correctly. Therefore, based on the experimental result, we confirmed that our TDOA-C works properly in the web environment.
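The embed-and-compare check used in this experiment can be sketched offline. The code below is a simplified, assumed variant of CCL that hides bits only in the case of tag-name letters (uppercase = 1, lowercase = 0) of a synthetic cover page; it is not the implementation used in the paper, and the receiver is assumed to know the payload length (16 bytes per UUID).

```python
import re
import uuid

# Matches "<" or "</" followed by a tag name (simplified assumption).
TAG_NAME = re.compile(r"(</?)([A-Za-z][A-Za-z0-9]*)")


def ccl_capacity(page: str) -> int:
    """Number of tag-name letters, i.e. bits this toy CCL variant can carry."""
    return sum(ch.isalpha() for m in TAG_NAME.finditer(page) for ch in m.group(2))


def ccl_embed(cover: str, payload: bytes) -> str:
    """Set the case of each tag-name letter according to the payload bits."""
    if 8 * len(payload) > ccl_capacity(cover):
        raise ValueError("payload exceeds the capacity of the cover page")
    bits = iter("".join(f"{byte:08b}" for byte in payload))

    def recase(match: re.Match) -> str:
        letters = []
        for ch in match.group(2):
            if ch.isalpha():
                ch = ch.upper() if next(bits, "0") == "1" else ch.lower()
            letters.append(ch)
        return match.group(1) + "".join(letters)

    return TAG_NAME.sub(recase, cover)


def ccl_extract(stego: str, n_bytes: int) -> bytes:
    """Read the letter cases back and keep the first n_bytes bytes."""
    bits = "".join("1" if ch.isupper() else "0"
                   for m in TAG_NAME.finditer(stego)
                   for ch in m.group(2) if ch.isalpha())[: 8 * n_bytes]
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


cover = "<html><body>" + "<div><span>x</span></div>" * 10 + "</body></html>"
ids = [uuid.uuid4() for _ in range(5)]
recovered = [uuid.UUID(bytes=ccl_extract(ccl_embed(cover, u.bytes), 16)) for u in ids]
print(recovered == ids)
```

Because HTML tag names are case-insensitive, each stego page in this sketch differs from the cover only in letter case.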
Table 1. A part of the result of experiment 1. Part 1.

4.1.2. Experiment 1. Part 2: Comparative Numerical Analysis

To show how our TDOA-C can improve the hiding capacity (or the amount of a secret message delivery) of two representative existing methods in the static operation model, we conducted the comparative numerical analysis as follows.
First, we measured the average hiding capacity of existing methods (CCL and COA) by considering the top popular global websites’ main webpages. For cover webpages, we collected the top 50 webpages introduced by two popular websites Alexa [41] and SimilarWeb [42] that provide the global website ranks in terms of the number of visitors to websites.
For the CCL method, we implemented it by adopting Sui and Luo’s CCL method [6] and measured the hiding capacity of CCL for each of the collected webpages as:
$$HC[CCL] = \sum_{n=1}^{N_{elements}} N_{letters}(e_n)$$
where $e_n$ denotes the $n$th element in the webpage, $N_{letters}$ denotes a function that returns the number of alphabetic letters in an element, excluding attribute values and contents, and $N_{elements}$ denotes the number of elements in the webpage. We averaged all obtained values of $HC[CCL]$.
In addition, for the COA method, we implemented it by adopting Huang et al.’s COA method [25] and measured the hiding capacity of COA for each of the collected webpages as:
$$HC[COA] = \sum_{n=1}^{N_{tags}} \log_2 \{N_{attributes}(t_n)\}!$$
where $t_n$ is the $n$th tag in a webpage, $N_{attributes}(t_n)$ is a function that returns the number of attributes in $t_n$, and $N_{tags}$ is the number of tags within the webpage. If $N_{attributes}(t_n) \geq 2$, the maximum number of permutations is equal to $\{N_{attributes}(t_n)\}!$, and, thus, the hiding capacity of $t_n$ is equal to $\log_2 \{N_{attributes}(t_n)\}!$ bits. We averaged all obtained values of $HC[COA]$.
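As an illustration of the two capacity formulas, the following sketch counts both quantities for one page with Python's standard `html.parser`. The counting rules are assumptions made for this example: CCL letters are taken from start-tag names, end-tag names, and attribute names, and COA bits are accumulated per start tag; the class and function names are hypothetical.

```python
import math
from html.parser import HTMLParser


class CapacityCounter(HTMLParser):
    """Accumulates HC[CCL] (letters) and HC[COA] (log2 of attribute permutations)."""

    def __init__(self) -> None:
        super().__init__()
        self.ccl_bits = 0
        self.coa_bits = 0.0

    @staticmethod
    def _letters(text: str) -> int:
        return sum(ch.isalpha() for ch in text)

    def handle_starttag(self, tag, attrs):
        # Letters of the tag name and of every attribute name (values excluded).
        self.ccl_bits += self._letters(tag)
        self.ccl_bits += sum(self._letters(name) for name, _ in attrs)
        # k attributes can be ordered in k! ways; 0 or 1 attribute gives 0 bits.
        self.coa_bits += math.log2(math.factorial(len(attrs)))

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> have no separate end tag to count.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        self.ccl_bits += self._letters(tag)


def capacities(html: str) -> tuple[int, float]:
    counter = CapacityCounter()
    counter.feed(html)
    counter.close()
    return counter.ccl_bits, counter.coa_bits


page = '<html><body><a href="x" id="y" class="z">hi</a><br/></body></html>'
ccl, coa = capacities(page)
print(ccl, coa)
```

For this page only the `<a>` tag contributes to the COA capacity, since it is the only tag with two or more attributes, while every tag and attribute name contributes letters to the CCL capacity.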
All measured values of the hiding capacity of existing methods (CCL and COA) are shown in Table 2 and Figure 11.
Table 2. The average and max hiding capacity measured for top websites when Changing the Case of Letters (CCL) in tags and attributes and Changing the Order of Attributes (COA) are used.
Figure 11. The capacity of Changing the Case of Letters (CCL) in tags and attributes and Changing the Order of Attributes (COA) from each website list.
Second, we compared the average hiding capacity of the two existing methods (CCL and COA) in the static operation model with that of our approaches (CCL with TDOA-C and COA with TDOA-C). For our approaches, we measured the average hiding capacity of our two methods as the change cycle time ($C_t$) decreases from 100 s to 10 s in steps of 10 s. For simplicity, we set $C_t$ to a fixed value between 10 and 100, which means that the interval between $t_i$ and $t_{i+1}$ is the same for all $t_i \in T$ with $1 \leq i \leq n-1$ (see Algorithm 1). For example, if $C_t$ is 2 s, an existing stego webpage $SW_i$ is replaced with the next stego webpage $SW_{i+1}$ every two seconds in our dynamic operation model. Thus, in 10 s, five different stego webpages (i.e., the number of stego webpages $N_s = 5$) are uploaded to the web server, while only one stego webpage is exposed in the static operation model regardless of $C_t$. Based on the above settings, we conducted the comparative numerical analysis in terms of hiding capacity by using various operation times (1000 s, 2000 s, 5000 s, and 10,000 s). The operation time is $t_{end} - t_{start}$.
We now explain our experimental results as follows (see Table 3 and Figure 12).
Table 3. The capacity of TDOA-C and the existing static model with CCL and COA in 1000 s. The unit of capacity is bytes. $N_S$ denotes the number of transmitted stego webpages. $C_t$ denotes the cycle time, that is, the period after which a stego webpage is replaced with the next stego webpage.
Figure 12. Average hiding capacities of our methods (CCL with TDOA-C and COA with TDOA-C) and existing methods (CCL and COA) as the value of the operation time increases from 1000 s to 10,000 s.
First, for all values of $C_t$, the average hiding capacities of our proposed algorithms (CCL with TDOA-C and COA with TDOA-C) are much higher than those of the two existing methods (CCL and COA). Table 3 shows the analysis results when the operation time is 1000 s. For example, when the top 50 websites from Alexa are considered, the average hiding capacities ($\mu_{HC}$) of CCL and COA are fixed at $\mu_{HC}[CCL] = 3690$ bytes and $\mu_{HC}[COA] = 187$ bytes, because only one stego webpage is exposed ($N_s = 1$) regardless of the operation time. On the other hand, when $C_t = 100$, our methods (CCL with TDOA-C and COA with TDOA-C) achieve 10 times the average hiding capacity of CCL and COA because 10 different stego webpages are uploaded during the 1000 s. As a result, $\mu_{HC}[\text{CCL with TDOA-C}] = 36{,}900$ bytes and $\mu_{HC}[\text{COA with TDOA-C}] = 1870$ bytes. Moreover, the difference between the average hiding capacities of the existing methods and our methods keeps increasing as $C_t$ decreases because $N_s$ grows as $C_t$ decreases, as shown in Table 3 and Figure 12. For SimilarWeb, the analysis results are similar.
Second, the difference between the hiding capacity of existing methods and our methods increases as the operation time increases from 1000 s to 2000 s, 5000 s, and 10,000 s. For example, Figure 12 shows $\mu_{HC}[CCL]$, $\mu_{HC}[COA]$, $\mu_{HC}[\text{CCL with TDOA-C}]$, and $\mu_{HC}[\text{COA with TDOA-C}]$ as the operation time grows when the top 50 websites from Alexa (Figure 12a,b) and SimilarWeb (Figure 12c,d) are considered. As the operation time increases from 1000 s to 2000 s, $\mu_{HC}[\text{CCL with TDOA-C}]$ and $\mu_{HC}[\text{COA with TDOA-C}]$ double. Likewise, when the operation time increases from 1000 s to 5000 s and from 1000 s to 10,000 s, $\mu_{HC}[\text{CCL with TDOA-C}]$ and $\mu_{HC}[\text{COA with TDOA-C}]$ grow by five times and 10 times, respectively. However, no matter how much the operation time increases, $\mu_{HC}[CCL]$ and $\mu_{HC}[COA]$ remain constant.
Third, the hiding capacity of CCL with TDOA-C is much larger than the capacity of COA with TDOA-C. For example, Figure 13a,b shows $Max_{HC}[\text{CCL with TDOA-C}]$, $\mu_{HC}[\text{CCL with TDOA-C}]$ and $Max_{HC}[\text{COA with TDOA-C}]$, $\mu_{HC}[\text{COA with TDOA-C}]$ when the top 50 websites from Alexa are considered. For all values of $C_t$, $Max_{HC}[\text{CCL with TDOA-C}] > Max_{HC}[\text{COA with TDOA-C}]$ and $\mu_{HC}[\text{CCL with TDOA-C}] > \mu_{HC}[\text{COA with TDOA-C}]$. In addition, as $C_t$ decreases, both differences $Max_{HC}[\text{CCL with TDOA-C}] - Max_{HC}[\text{COA with TDOA-C}]$ and $\mu_{HC}[\text{CCL with TDOA-C}] - \mu_{HC}[\text{COA with TDOA-C}]$ also increase. In the case of SimilarWeb, as shown in Figure 13c,d, the same tendency holds. This is because the CCL method can embed more data into a cover webpage than the COA method, as we discussed above (see Table 3).
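The scaling reported above follows from simple arithmetic: the number of stego webpages is the operation time divided by the cycle time, and the delivered amount is that number times the per-page capacity. A small sketch, using the Alexa averages quoted in the text (3690 and 187 bytes) and a hypothetical helper name:

```python
def tdoa_c_delivery(capacity_per_page: int, operation_time: int,
                    cycle_time: int) -> tuple[int, int]:
    """Return (N_s, total capacity) for a fixed cycle time C_t, all times in seconds."""
    n_s = operation_time // cycle_time   # number of stego webpages uploaded
    return n_s, n_s * capacity_per_page


for name, per_page in (("CCL", 3690), ("COA", 187)):
    for operation_time in (1000, 2000):
        n_s, total = tdoa_c_delivery(per_page, operation_time, cycle_time=100)
        print(f"{name}: operation time {operation_time} s, N_s = {n_s}, {total} bytes")
```

With $C_t = 100$ and an operation time of 1000 s this gives 10 pages, that is, 36,900 bytes for CCL and 1870 bytes for COA, and doubling the operation time doubles both totals.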
Figure 13. The capacity of our methods (CCL with TDOA-C and COA with TDOA-C) and existing methods (CCL and COA) according to C t .

4.2. Experiment 2: Validation of TDOA-U

The purpose of Experiment 2 is to show how our TDOA-U can improve the undetectability of an existing method in the static operation model. The concept to measure undetectability is depicted in Figure 14.
Figure 14. Concept to measure undetectability.
We consider that an existing method in the static operation model has zero undetectability because, as shown in Figure 14a, it will always be detected by a monitor since there is no concealed time period between $t_{start}$ and $t_{end}$. In addition, we consider that a method in Case 2 has higher undetectability than a method in Case 3 because the time for which Case 2 is exposed to users, including a monitor, is shorter than that of Case 3 given the same operation time period ($= t_{end} - t_{start}$). Thus, the chance that Case 2 is not detected by the monitor is higher than the chance that Case 3 is not detected. To compare two methods quantitatively in terms of undetectability, we define undetectability $M_U$ as:
$$M_U = 1 - \left( \frac{\sum_{i=1}^{|D_S|} d_{S_i}}{\sum_{i=1}^{|D_S|} d_{S_i} + \sum_{k=1}^{|D_C|} d_{C_k}} \right) = \frac{\sum_{k=1}^{|D_C|} d_{C_k}}{t_{end} - t_{start}}$$
where $d_{S_i}$ is the exposure time period of the $i$th stego webpage, $D_S$ is the set of $d_{S_i}$ for $i \in [1, n]$, $d_{C_k}$ is the exposure time period of the $k$th cover webpage, $D_C$ is the set of $d_{C_k}$ for $k \in [1, m]$, and $M_U \in [0, 1]$; $|D_S| = n$ and $|D_C| = m$. This metric indicates the ratio of the amount of time during which a stego webpage is not exposed to the total operation time. Thus, a method has its maximum undetectability when $M_U = 1$, and, for the existing method in the static operation model, $M_U = 0$.
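As a minimal sketch of this metric, the function below computes $M_U$ directly from the two sets of exposure periods. The function name `undetectability` and the example values are illustrative assumptions and are not taken from our implementation.

```python
from typing import Sequence


def undetectability(d_s: Sequence[float], d_c: Sequence[float]) -> float:
    """Return M_U = sum(D_C) / (sum(D_S) + sum(D_C)).

    d_s: exposure periods of the stego webpages (the set D_S), in seconds.
    d_c: exposure periods of the cover webpages (the set D_C), in seconds.
    """
    total = sum(d_s) + sum(d_c)  # equals t_end - t_start
    if total <= 0:
        raise ValueError("total operation time must be positive")
    return sum(d_c) / total


# Static operation model: the stego webpage is exposed the whole time.
static_mu = undetectability([100.0], [])

# Hypothetical dynamic schedule: 30 s of stego exposure, 70 s of cover exposure.
mixed_mu = undetectability([30.0], [70.0])
```

With these inputs, the static schedule yields $M_U = 0$ and the hypothetical dynamic schedule yields $M_U = 0.7$.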
To see how the $M_U$ of our proposed method varies depending on $D_S$ and $D_C$, we conducted a comparative numerical analysis as follows. For simplicity, we set $d_{S_1} = d_{S_2} = \cdots = d_{S_i} = \cdots = d_{S_n}$ and $d_{C_1} = d_{C_2} = \cdots = d_{C_k} = \cdots = d_{C_m}$, and set the total operation time ($= t_{end} - t_{start}$) to 1000 s. In addition, we varied $d_{S_i}$ and $d_{C_k}$ from 10 s to 100 s in steps of 10 s.
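The numerical analysis can be sketched as follows. This is an illustrative reconstruction, not our analysis code: it assumes that the schedule starts with a stego webpage, alternates strictly between stego and cover webpages, and truncates the last period at the end of the operation time. Under these assumptions, $d_{S_i} = d_{C_k} = 30$ s gives $M_U = 0.49$, which agrees with the value reported for that setting. The names `exposure_totals`, `m_u`, and `grid` are hypothetical.

```python
def exposure_totals(d_s: int, d_c: int, total: int) -> tuple[int, int]:
    """Return (stego_seconds, cover_seconds) for an alternating schedule.

    Assumption: the schedule starts with the stego webpage and the final
    period is cut off when the total operation time is reached.
    """
    stego = cover = elapsed = 0
    showing_stego = True
    while elapsed < total:
        span = min(d_s if showing_stego else d_c, total - elapsed)
        if showing_stego:
            stego += span
        else:
            cover += span
        elapsed += span
        showing_stego = not showing_stego
    return stego, cover


def m_u(d_s: int, d_c: int, total: int = 1000) -> float:
    """Undetectability: cover exposure time over total operation time."""
    stego, cover = exposure_totals(d_s, d_c, total)
    return cover / (stego + cover)


# All combinations of d_S and d_C from 10 s to 100 s in steps of 10 s.
grid = {
    (d_s, d_c): round(m_u(d_s, d_c), 2)
    for d_s in range(10, 101, 10)
    for d_c in range(10, 101, 10)
}
```

Each entry `grid[(d_s, d_c)]` corresponds to one cell of a 10 × 10 table of $M_U$ values for the 1000 s operation time.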
In addition, we implemented our CCL with TDOA-U and conducted an experiment in a real web environment to see whether the $M_U$ measured in the real web environment is similar to the $M_U$ calculated in our numerical analysis (see Figure 15). The experimental environment is the same as in Experiment 1, Part 1. To create an actual stego webpage, we used the main webpage of Google as a cover webpage and embedded the secret string “STEGANOGRAPHY” (91 bits) by using the CCL method. The total operation time period ($= t_{end} - t_{start}$) is 100 s, and we considered three cases: Case 1 ($d_{S_i} = 50$ s and $d_{C_k} = 50$ s), Case 2 ($d_{S_i} = 30$ s and $d_{C_k} = 70$ s), and Case 3 ($d_{S_i} = 70$ s and $d_{C_k} = 30$ s). In addition, we implemented a monitor that accesses the webpage once at a random time point during the operation time of 100 s and then checks whether the accessed time point falls in the stego webpage's exposed time period. The experiment was repeated for 100 rounds, and we then measured $M_U$ as the fraction of accesses in which the stego webpage was not detected, that is, one minus the total number of detections of the stego webpage over the total number of accesses, consistent with the definition of $M_U$ above.
Figure 15. An experiment to measure $M_U$ in the real web environment.
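The monitoring procedure can be approximated by a small Monte Carlo simulation. The sketch below is not the web experiment itself: it replaces the real web server and HTTP access with a time-based check, assumes that the stego webpage is shown first in each cycle, and estimates $M_U$ as the fraction of random accesses that do not fall in a stego exposure period, which is the reading consistent with the definition of $M_U$. The names `is_stego_at` and `estimate_m_u` are hypothetical.

```python
import random
from typing import Optional


def is_stego_at(t: float, d_s: float, d_c: float) -> bool:
    """True if time point t falls in a stego exposure period.

    Assumption: each cycle shows the stego webpage for d_s seconds first,
    then the cover webpage for d_c seconds.
    """
    return (t % (d_s + d_c)) < d_s


def estimate_m_u(d_s: float, d_c: float, total: float = 100.0,
                 rounds: int = 100, seed: Optional[int] = None) -> float:
    """Estimate M_U with one uniformly random monitor access per round."""
    rng = random.Random(seed)
    undetected = 0
    for _ in range(rounds):
        access_time = rng.random() * total  # uniform in [0, total)
        if not is_stego_at(access_time, d_s, d_c):
            undetected += 1
    return undetected / rounds


# The three cases considered in the experiment (d_S, d_C in seconds).
cases = {"Case 1": (50, 50), "Case 2": (30, 70), "Case 3": (70, 30)}
estimates = {name: estimate_m_u(d_s, d_c, rounds=100, seed=0)
             for name, (d_s, d_c) in cases.items()}
```

With only 100 rounds the estimate fluctuates from run to run, so small differences from the analytical values of 0.5, 0.7, and 0.3 are to be expected.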
We now explain the results of Experiment 2 (see Table 4 and Table 5).
Table 4. Measured undetectability $M_U$ of our TDOA-U according to the values of $d_{S_i}$ and $d_{C_k}$. $M_U$ is blue-colored when $M_U > 0.6$ and red-colored when $M_U < 0.4$.
Table 5. $M_{U,NA}$ and $M_{U,WEB}$.
First, our proposed method TDOA-U has higher undetectability than the existing method in the static operation model. As we can see in Table 4, for all values of $d_{S_i}$ and $d_{C_k}$ in our experimental settings, the measured values of $M_U$ were higher than zero, whereas $M_U = 0$ for the existing method in the static operation model. Therefore, we confirmed that our TDOA-U can improve the undetectability of the existing webpage steganography methods.
Second, the undetectability $M_U$ varies depending on the values of $d_{S_i}$ and $d_{C_k}$ of TDOA-U. In Table 4, $M_U$ is blue-colored when $M_U > 0.6$ and red-colored when $M_U < 0.4$. As we expected, when $d_{S_i} > d_{C_k}$, $M_U \le 0.5$ because the total amount of exposure time of $SW$ is greater than the total amount of exposure time of $CW$ during the operation time of 1000 s. On the other hand, when $d_{S_i} < d_{C_k}$, $M_U \ge 0.5$ because the total amount of exposure time of $SW$ is lower than the total amount of exposure time of $CW$ during the operation time of 1000 s. In addition, when $d_{S_i} = d_{C_k}$, $M_U \approx 0.5$ because the total amount of exposure time of $CW$ is almost equal to the total amount of exposure time of $SW$ during the operation time of 1000 s. There were some small deviations from 0.5 due to the fixed operation time of 1000 s (e.g., when $d_{S_i} = 30$ s and $d_{C_k} = 30$ s, $M_U = 0.49$). The values of $d_{S_i}$ and $d_{C_k}$ can be chosen for different purposes. For example, for higher undetectability, we can set $d_{C_k} > d_{S_i}$ and, to prioritize secret message delivery, we can set $d_{C_k} < d_{S_i}$.
Third, the undetectability $M_{U,WEB}$ measured in the real web environment is similar to the $M_{U,NA}$ calculated in our numerical analysis (see Table 5). For Case 1, Case 2, and Case 3, the measured $M_{U,WEB}$ is 0.49, 0.67, and 0.29, respectively, and these values are close to the $M_{U,NA}$ calculated for the same cases. The small differences between $M_{U,NA}$ and $M_{U,WEB}$ in each case are negligible.

5. Conclusions and Future Works

In this paper, to improve the hiding capacity or undetectability of existing webpage steganography methods, we proposed a time-based dynamic operation model (TDOM) that dynamically replaces the stego webpage with other stego webpages or the original webpage. We designed two time-based dynamic operation algorithms: TDOA-C, which improves the hiding capacity of existing methods, and TDOA-U, which improves their undetectability. In addition, to validate our proposed methods and show their performance, we conducted extensive comparative experiments and numerical analysis by implementing two webpage steganography methods with our TDOM (CCL with TDOA-C and COA with TDOA-C) and testing them in the web environment.
Our future research directions are as follows. First, we will consider a spatial factor for our dynamic operation model by studying the concept of recent moving target defense techniques [43,44] to further improve the hiding capacity or the undetectability of existing webpage steganography methods. Second, we will devise a new webpage steganography method that overcomes the limitations of existing webpage steganography methods, and then combine it with our dynamic operation model. Third, we will design a secure and sophisticated time synchronization method for the sender and receiver. Fourth, we will study randomizing the replacement pattern of the stego webpage and the cover webpage to improve the undetectability of our dynamic operation model. Lastly, we will design a hybrid model that combines TDOA-C and TDOA-U and examine how the hybrid approach can improve the performance of existing methods.

Author Contributions

Conceptualization, S.Y. and Y.C. Methodology, Y.C. Software, S.Y. Validation, S.Y. Formal analysis, S.Y. and Y.C. Investigation, S.Y. Writing—original draft preparation, S.Y. Writing—review and editing, Y.C. Visualization, S.Y. Supervision, Y.C. Project administration, Y.C. Funding acquisition, Y.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Republic of Korea Air Force Academy Research Fund, grant number ROKAFA 20-A-1.

Acknowledgments

An earlier version of this paper was presented and selected as one of the outstanding presentation papers at the KIISE Korea Software Congress 2019 (KSC 2019) in December 2019, South Korea [45]. The authors would like to thank the editor and reviewers for their valuable comments and constructive suggestions.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Johnson, N.F.; Jajodia, S. Exploring steganography: Seeing the unseen. Computer 1998, 31, 26–34.
  2. Bender, W.; Gruhl, D.; Morimoto, N.; Lu, A. Techniques for data hiding. IBM Syst. J. 1996, 35, 313–336.
  3. Steganography: A Close View of the Traditional Attack Technique that Has Created Chaos in the Cybersecurity World. Available online: https://cyware.com/news/steganography-a-close-view-of-the-traditional-attack-technique-that-has-created-chaos-in-the-cybersecurity-world-d412d190 (accessed on 2 December 2020).
  4. Steganography Anchors Pinpoint Attacks on Industrial Targets. Available online: https://threatpost.com/steganography-pinpoint-attacks-industrial-targets/156151/ (accessed on 2 December 2020).
  5. Steganography in Attacks on Industrial Enterprises. Available online: https://ics-cert.kaspersky.com/reports/2020/06/17/steganography-in-attacks-on-industrial-enterprises/ (accessed on 2 December 2020).
  6. Sui, X.G.; Luo, H. A new steganography method based on hypertext. In Proceedings of the 2004 Asia-Pacific Radio Science Conference, Qingdao, China, 24–27 August 2004; IEEE: New York, NY, USA, 2004; pp. 181–184.
  7. Platinum Is Back. Available online: https://securelist.com/platinum-is-back/91135/ (accessed on 2 December 2020).
  8. Cox, I.; Miller, M.; Bloom, J.; Fridrich, J.; Kalker, T. Digital Watermarking and Steganography; Morgan Kaufmann: San Francisco, CA, USA, 2007.
  9. Rafat, K.F. Cutting Edge Steganography Using HTML Document-An Appraisal. Int. J. Comput. Sci. Inf. Secur. 2016, 14, 960.
  10. Odeh, A.; Elleithy, K.; Faezipour, M.; Abdelfattah, E. Novel Steganography over HTML Code. In Innovations and Advances in Computing, Informatics, Systems Sciences, Networking and Engineering; Springer: Cham, Switzerland, 2015; pp. 607–611.
  11. Kis, D.; Pataki, N. Source Code-based Steganography. In Proceedings of the 10th International Conference on Applied Informatics, Eger, Hungary, 30 January–1 February 2017; pp. 157–162.
  12. Katzenbeisser, S.; Petitcolas, F.A.P. Digital Watermarking; Artech House: London, UK, 2000; Volume 2.
  13. Lee, I.S.; Tsai, W.H. Secret communication through webpages using special space codes in HTML files. Int. J. Appl. Sci. Eng. 2008, 6, 141–149.
  14. Chou, Y.C.; Huang, C.Y.; Liao, H.C. A reversible data hiding scheme using cartesian product for HTML file. In Proceedings of the 2012 Sixth International Conference on Genetic and Evolutionary Computing, Kitakyushu, Japan, 25–28 August 2012; IEEE: New York, NY, USA, 2012; pp. 153–156.
  15. Imran, S.; Khan, A.; Ahmad, B. Text Steganography Utilizing XML, HTML and XHTML Markup Languages. Int. J. Inf. Technol. Secur. 2017, 9, 99–116.
  16. Tariq, M.A.; Khan, A.T.A.A.; Ahmad, B. Boosting the Capacity of Web based Steganography by Utilizing Html Space Codes: A blind Steganography Approach. IT Ind. 2017, 5, 29–36.
  17. Bajaj, I.; Aggarwal, R.K. RSA Secured Web Based Steganography Employing HTML Space Codes and Compression Technique. In Proceedings of the 2019 International Conference on Intelligent Computing and Control Systems (ICCS), Madurai, India, 15–17 May 2019; IEEE: New York, NY, USA, 2019; pp. 865–868.
  18. Jaiswal, R.J.; Patil, N.N. Implementation of a new technique for web document protection using unicode. In Proceedings of the 2013 International Conference on Information Communication and Embedded Systems (ICICES), Chennai, India, 21–22 February 2013; IEEE: New York, NY, USA, 2013; pp. 69–72.
  19. Zhao, Q.; Lu, H. A PCA-based watermarking scheme for tamper-proof of webpages. Pattern Recognit. 2005, 38, 1321–1323.
  20. Zhao, Q.; Lu, H. PCA-based webpage watermarking. Pattern Recognit. 2007, 40, 1334–1341.
  21. Wu, C.C.; Chang, C.C.; Yang, S.R. An efficient fragile watermarking for webpages tamper-proof. In Advances in Web and Network Technologies, and Information Management; Springer: Berlin/Heidelberg, Germany, 2007; pp. 654–663.
  22. Junling, R.; Chengquan, W. A Webpage information hiding algorithm based on tag dictionary. In Proceedings of the 2012 International Conference on Computer Science and Electronics Engineering, Hangzhou, China, 23–25 March 2012; IEEE: New York, NY, USA, 2012; pp. 546–550.
  23. Ghosh, S. StegHTML: A message hiding mechanism in HTML tags; Technical Report: Charlottesville, VA, USA, 10 December 2007.
  24. Shen, D.; Zhao, H. A novel scheme of webpage information hiding based on attributes. In Proceedings of the 2010 IEEE International Conference on Information Theory and Information Security, Austin, TX, USA, 13–18 June 2010; IEEE: New York, NY, USA, 2010; pp. 1147–1150.
  25. Huang, H.; Zhong, S.; Sun, X. An algorithm of webpage information hiding based on attributes permutation. In Proceedings of the 2008 International Conference on Intelligent Information Hiding and Multimedia Signal Processing, Harbin, China, 15–17 August 2008; IEEE: New York, NY, USA, 2008; pp. 257–260.
  26. Reddy, B.S.; Kuppusamy, K.S.; Sivakumar, T. Towards Web page steganography with Attribute Truth Table. In Proceedings of the 2016 3rd International Conference on Advanced Computing and Communication Systems (ICACCS), Coimbatore, India, 22–23 January 2016; IEEE: New York, NY, USA, 2016; pp. 1–5.
  27. Singh, R.K.; Alankar, B. A Novel Approach For Data Hiding In Web Page Steganography Using Encryption With Compression Based Technique. IOSR J. Comput. Eng. 2016, 18, 73–77.
  28. Yang, Y.J.; Yang, Y.M. An efficient webpage information hiding method based on tag attributes. In Proceedings of the 2010 Seventh International Conference on Fuzzy Systems and Knowledge Discovery, Yantai, China, 10–12 August 2010; IEEE: New York, NY, USA, 2010; pp. 1181–1184.
  29. Yong, X.; Juan, L.; Yilai, Z. A high capacity information hiding method for webpage based on tag. In Proceedings of the 2012 Third International Conference on Digital Manufacturing & Automation, Guilin, China, 31 July–2 August 2012; IEEE: New York, NY, USA, 2012; pp. 62–65.
  30. Garg, M. A novel text steganography technique based on html documents. Int. J. Adv. Sci. Technol. 2011, 35, 129–138.
  31. Mahato, S.; Yadav, D.K.; Khan, D.A. A modified approach to text steganography using HyperText markup language. In Proceedings of the 2013 Third International Conference on Advanced Computing and Communication Technologies (ACCT), Rohtak, India, 6–7 April 2013; IEEE: New York, NY, USA, 2013; pp. 40–44.
  32. Fridrich, J. Applications of data hiding in digital images. In Proceedings of the Fifth International Symposium on Signal Processing and its Applications (IEEE Cat. No. 99EX359), ISSPA’99, Brisbane, QLD, Australia, 22–25 August 1999; IEEE: New York, NY, USA, 1999; Volume 1.
  33. Wbstego. Available online: http://wbstego.wbailer.com/ (accessed on 2 December 2020).
  34. Invisible Secret. Available online: http://www.invisiblesecrets.com/ (accessed on 2 December 2020).
  35. Snow. Available online: http://www.darkside.com.au/snow/ (accessed on 2 December 2020).
  36. Deogol. Available online: https://hord.ca/projects/deogol/ (accessed on 2 December 2020).
  37. Cho, Y. Intelligent On-Off Web Defacement Attacks and Random Monitoring-Based Detection Algorithms. Electronics 2019, 8, 1338.
  38. Flask Web Framework. Available online: https://flask.palletsprojects.com/en/1.1.x/ (accessed on 2 December 2020).
  39. Python urllib.request Library. Available online: https://docs.python.org/3/library/urllib.request.html (accessed on 2 December 2020).
  40. UUID_RFC4122. Available online: https://www.ietf.org/rfc/rfc4122.txt (accessed on 2 December 2020).
  41. Alexa Top 500 Sites on the Web. Available online: https://www.alexa.com/topsites (accessed on 2 December 2020).
  42. SimilarWeb Top Websites Ranking. Available online: https://www.similarweb.com/top-websites/ (accessed on 2 December 2020).
  43. Tan, J. Optimal strategy selection approach to moving target defense based on Markov robust game. Comput. Secur. 2019, 85, 63–76.
  44. Kanellopoulos, A.; Vamvoudakis, K.G. A moving target defense control framework for cyber-physical systems. IEEE Trans. Autom. Control 2019, 65, 1029–1043.
  45. Yuk, S.; Cho, Y. A New Covert Communication Method based on Webpage Steganography. In Proceedings of the KIISE Korea Software Congress, Pyeongchang, Korea, 18–20 December 2019; Volume 12, pp. 794–796.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
