Research on Image Steganography Based on Sudoku Matrix

Abstract: At present, the Sudoku matrix, turtle shell matrix, and octagonal matrix have been proposed as magic matrix-based data hiding methods. The magic matrix to be designed depends on the required embedding capacity. In addition, by classifying the points that pixel pairs map to on the magic matrix and by choosing the traversal area accordingly, the average peak signal-to-noise ratio (PSNR) can be improved. This work therefore proposes a data hiding method based on a 16 × 16 Sudoku matrix and extends it to a double-layer magic matrix. Low-cost data embedding methods are also studied in order to improve the PSNR and maintain good image quality at the same embedding capacity.


Introduction
The development and popularization of Internet technology make the exchange and dissemination of information easy and allow people to get information from all over the world at the click of a button. However, concerns about information security are growing. A great deal of information is transmitted over the Internet, and anyone's private data may be accessed or stolen by others. Leaks of the information and privacy of individuals, companies, and governments are now common.
The Internet provides a public and convenient channel for exchanging information; however, security becomes a critical issue. In particular, unauthorized users may retrieve, intercept, disguise, or copy information in transit. Fortunately, steganography provides practical solutions for exchanging secret messages on the Internet. At present, information hiding technology plays an increasingly important role in government, military intelligence, financial systems, and the medical and health field. It is also widely used in secret communications, digital copyright protection, e-commerce security, data integrity, reliability validation, and so on [1,2].
With the rapid development of computer science and Internet technology and the advent of the Big Data era, increasing amounts of digital information (image, text, video, audio, etc.) are transmitted over the network, which makes the protection of digital information security a prominent problem. Traditional encryption technology can effectively protect the security of digital information, while information hiding technology can provide copyright protection, tamper detection, access control, authentication, and other security features by hiding secret information in the digital content. Combining the two, that is, hiding information in encrypted data, can guarantee the security of digital information as well as its effective management and control. Information hiding in the encrypted domain therefore plays an important role in the information security field [3]. There is active research on securely delivering a secret message by hiding data in digital images. Unlike cryptography, which requires the secret information to be encrypted, this technology usually protects secret information by hiding it in ordinary images. Because the stego-images are mixed with the numerous ordinary images on the Internet, they are unlikely to attract the attention of attackers. The performance indicators of greatest concern in steganography are embedding capacity and stego-image quality. Traditional steganography schemes usually lack flexibility or suffer from low stego-image quality [4][5][6]. In recent years, achieving higher capacity and better stego-image quality has become a hot research topic.
The rapid development of the Internet has also brought serious security problems, such as illegal eavesdropping, interception, and malicious tampering with data, and researchers are paying increasing attention to the security field. As an important branch of information security, data hiding embeds an important secret message into commonly used digital multimedia data, such as audio, text, image, and video, and aims to make the transmission of the secret message invisible. However, compared with traditional cryptography, research on data hiding is still at an emerging stage, and many key problems remain to be explored and solved. It is therefore of great significance and value to study efficient data hiding techniques.
Sudoku-based data hiding is a novel data hiding method proposed in recent years. It starts from the mathematical characteristics of Sudoku and builds data hiding algorithms with higher security on them. This paper presents a high-capacity data hiding scheme based on the Sudoku game, which expands the usage of Sudoku: the matrix serves not only as a reference matrix for data hiding but also takes part in the data hiding algorithm itself. The scheme also increases the hiding capacity and improves the security of the algorithm, which resists brute-force attacks more effectively. The algorithm can also be extended to Sudoku matrices of other orders, so it has good scalability and adaptability.

Research Status
The existing data hiding techniques mainly focus on three domains: the frequency domain, the compression domain, and the spatial domain. In the frequency domain, because most digital images on the Internet are compressed, the transform domain is used to hide secret messages in these compressed images. Many block-based data hiding schemes, however, are proposed for the spatial domain and cannot be applied directly to the transform domain. A block-based high-capacity reversible data hiding scheme for JPEG images has been proposed. It combines the characteristics of Hamming codes and histogram shifting to make full use of the AC coefficients in DCT blocks [4,7] or the discrete wavelet transform (DWT) coefficients [8,9], in which the secret data are embedded. The scheme improves the embedding capacity, maintains good visual quality, and, more importantly, has good security (it is hard for third parties to perceive). It can also use different quality factors to meet the user's different hiding needs. If the sender has a larger number of secret bits to hide, a lower quality factor is selected for the compressed image. If the sender needs a stego-image with higher visual quality to avoid the suspicion of third-party eavesdroppers, a higher quality factor is the better choice. Some researchers have proposed data hiding schemes based on vector quantization (VQ) compression [10][11][12][13]. One such scheme is based on side match vector quantization (SMVQ) and search-order coding (SOC). In this scheme, the VQ image is compressed by SMVQ to reduce redundancy and is then recompressed by SOC. In the embedding process, the embedding capacity of the image is adjusted dynamically by a threshold to meet different capacity requirements. The reported experimental results show that the scheme can improve the embedding capacity and reduce the image distortion.
In the spatial domain, there are roughly three types of schemes: least significant bit (LSB) substitution [14][15][16][17][18][19], exploiting modification direction (EMD) [20][21][22], and magic matrix-based (MMB) schemes [23][24][25][26][27][28][29][30][31][32][33]. LSB substitution, first described by Bender et al. in 1996, is the most common scheme [3]. Wang et al. [14] proposed an optimal LSB substitution algorithm to improve the image quality and a genetic algorithm to reduce the heavy computation, with an embedding capacity of 1 bit per pixel (bpp). The LSB substitution algorithm is very simple, but the hidden data can be easily detected [16]. Mielikainen et al. [15] improved the LSB matching method and embedded data in pixel pairs by modifying the parity check, with an embedding capacity of 1 bpp. In recent years, Sahu et al. [17][18][19] proposed several new LSB-based data hiding methods to further improve the embedding capacity. Zhang and Wang [20] fully exploited the modification direction to embed one secret (2n + 1)-ary digit into a vector of n pixels by increasing or decreasing at most one pixel by 1.
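The EMD idea mentioned above can be illustrated with a minimal sketch. It assumes the commonly described extraction function f(g1, ..., gn) = (1·g1 + 2·g2 + ... + n·gn) mod (2n + 1); the function names `emd_extract` and `emd_embed` are hypothetical, and pixels at 0 or 255 (where a ±1 change would overflow) are deliberately not handled.

```python
def emd_extract(pixels):
    """Return the (2n + 1)-ary digit carried by a group of n pixels."""
    n = len(pixels)
    return sum(i * g for i, g in enumerate(pixels, start=1)) % (2 * n + 1)


def emd_embed(pixels, digit):
    """Embed one (2n + 1)-ary digit by changing at most one pixel by +/-1.

    Sketch only: boundary pixels (0 or 255) are not treated specially.
    """
    n = len(pixels)
    base = 2 * n + 1
    if not 0 <= digit < base:
        raise ValueError("digit must lie in [0, 2n]")
    shift = (digit - emd_extract(pixels)) % base
    out = list(pixels)
    if shift == 0:
        return out                  # the group already carries the digit
    if shift <= n:
        out[shift - 1] += 1         # raise pixel number `shift` by 1
    else:
        out[base - shift - 1] -= 1  # lower pixel number `base - shift` by 1
    return out
```

With n = 2, each pixel pair carries one base-5 digit, and the receiver recovers it with `emd_extract` alone, without access to the cover image.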
Unlike the above methods, several kinds of image steganography based on MMB have been put forward in the last few years. In 2008, Chang et al. [23] proposed a novel data hiding scheme using Sudoku. This scheme takes each pixel pair as the coordinates of a cell in a Sudoku matrix and modifies the pair so that the cell value equals the secret digit, embedding one 9-ary digit into each pixel pair with an embedding capacity of about 1.5 bpp. Figure 1a shows an example matrix of this scheme. Most image data hiding schemes divide the original cover image into non-overlapping small blocks and then process each block. With this block-based idea, the space complexity of the data hiding algorithm can be reduced; the algorithm is more concise, efficient, easy to implement, and convenient for future modification and optimization; and, more importantly, it is more suitable for a complicated network environment. In order to design a more efficient data hiding scheme, this paper studies a large number of data hiding schemes and related knowledge, and then proposes two block-based image data hiding algorithms with higher embedding capacity [23][24][25][26]. The turtle shell matrix-based scheme of Chang et al. [26] can maintain an embedding capacity of 1.5 bpp with less distortion; Figure 1b shows an example matrix of that scheme. In 2016, Liu's method maintained good image quality, with an average peak signal-to-noise ratio (PSNR) of 41.87 dB, at an embedding capacity of up to 2.5 bpp [27][28][29]. This scheme improves the embedding capacity, maintains good visual quality, and, more importantly, has good security (it is hard for third parties to perceive). It can also use different quality factors to meet the user's different hiding needs. If the sender has a larger number of secret bits to hide, a lower quality factor should be selected for the compressed image. If the sender needs a stego-image with higher visual quality to avoid the suspicion of third-party eavesdroppers, a higher quality factor is the better choice. In 2018, Xie et al. [30] put forward a two-layer turtle shell matrix-based data embedding method, which adds a second matrix layer, with the turtle shell as its cell, to the turtle shell matrix-based data hiding method proposed by Chang et al. [26]. It maintained an embedding capacity of up to 2.5 bpp and achieved high image quality with a PSNR of 47.12 dB. In recent years, mini-Sudoku matrix-based data hiding schemes [31][32][33] have also been proposed. In 2019, He et al. [31] proposed a mini-Sudoku matrix-based image steganographic scheme that reaches an embedding capacity of 2 bpp and a PSNR of 46.37 dB. In 2020, Horng et al. [32] proposed a cubic mini-Sudoku matrix-based image steganographic method in which the algorithm uses a cubic magic cube at the plane matrix stage. In the same year, Chen et al. [33] put forward a multi-layer mini-Sudoku matrix-based data hiding method, which showed an embedding capacity of up to 3 bpp and a PSNR of 40.01 dB.
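The capacities quoted above can be related to the number of distinct digits a pixel pair can carry. As a hedged worked check (the symbol B, the number of distinct values in the reference matrix, is notation introduced here, and the figures are upper bounds that ignore any loss when converting a bit stream to base-B digits):

$$\text{capacity} = \frac{\log_2 B}{2}\ \text{bpp}, \qquad B = 9:\ \frac{\log_2 9}{2} \approx 1.58, \qquad B = 16:\ \frac{4}{2} = 2, \qquad B = 64:\ \frac{6}{2} = 3.$$

Under this reading, a 9-valued Sudoku reference matrix gives roughly the 1.5 bpp reported for [23], a 16-valued matrix gives 2 bpp, and 64 distinct values per pixel pair would be needed for 3 bpp.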

Research Methods
In terms of the data embedding method that employs magic matrices, the key to improving the embedding capacity and the image quality lies in the construction of a magic matrix and the conditions of a traversal area. The general objective of this topic is to improve the embedding capacity while maintaining good image quality. Specifically, taking the previous research as the basis, this topic studies the construction of a magic matrix and the conditions of a traversal area.

Research Plan
This work builds on the magic matrix-based data hiding methods. Most of these methods use a Sudoku matrix, turtle shell matrix, octagonal matrix, or another matrix with construction constraints. In this research, a 16 × 16 Sudoku matrix is used for data hiding in order to improve the embedding capacity.
The magic matrix-based data hiding method is a novel method proposed in recent years. Its inputs are the magic matrix construction information, the cover image, and the binary ciphertext. First, the binary ciphertext is converted to a digit string in the base required by the method, and the corresponding magic matrix is generated from the construction information. For example, in the turtle shell matrix-based data hiding method, every three bits of the binary ciphertext are converted to one octal digit. Second, the data is embedded: two consecutive pixels of the image are read as a pixel pair; the traversal area and the search method are determined by classifying the point that the pair maps to on the magic matrix; the ciphertext digit to be embedded is located within the traversal area; and the pixel pair in the original image is replaced by the coordinates of that digit. This step is repeated until the ciphertext is exhausted or all pixels in the image have been modified, and the stego-image is then output.
With the magic matrix-based data hiding methods, decoding is straightforward. Given the stego-image and the magic matrix, each ciphertext digit is read from the magic matrix at the coordinates given by the corresponding pixel pair of the stego-image.
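The embedding and extraction steps described above can be sketched as follows. This is a minimal illustration and not the proposed scheme: the reference matrix is a hypothetical stand-in (a 2 × 4 block of the digits 0 to 7 tiled over the 256 × 256 grid), the traversal area is simply a square of assumed radius 3 around the pixel pair, and the names `ref_value`, `bits_to_digits`, `embed_pair`, `embed`, and `extract` are introduced here for illustration.

```python
BASE = 8  # octal digits, three ciphertext bits per pixel pair


def ref_value(r, c):
    """Hypothetical reference matrix: a 2 x 4 block of 0..7, tiled."""
    return 4 * (r % 2) + (c % 4)


def bits_to_digits(bits):
    """Split a bit string into 3-bit groups and read each as an octal digit."""
    if len(bits) % 3:
        bits += "0" * (3 - len(bits) % 3)  # zero-pad the tail (assumption)
    return [int(bits[i:i + 3], 2) for i in range(0, len(bits), 3)]


def embed_pair(p, q, digit, radius=3):
    """Return the nearest coordinates to (p, q) whose matrix value is digit."""
    best = None
    for r in range(max(0, p - radius), min(255, p + radius) + 1):
        for c in range(max(0, q - radius), min(255, q + radius) + 1):
            if ref_value(r, c) == digit:
                cost = (r - p) ** 2 + (c - q) ** 2
                if best is None or cost < best[0]:
                    best = (cost, r, c)
    return best[1], best[2]


def embed(pixels, bits):
    """Embed the bit string into consecutive pixel pairs of a flat pixel list."""
    stego = list(pixels)
    for k, digit in enumerate(bits_to_digits(bits)):
        i = 2 * k
        if i + 1 >= len(stego):
            break  # the image is full
        stego[i], stego[i + 1] = embed_pair(stego[i], stego[i + 1], digit)
    return stego


def extract(stego, n_digits):
    """Read each digit back from the matrix at the stego pixel-pair coordinates."""
    return [ref_value(stego[2 * k], stego[2 * k + 1]) for k in range(n_digits)]
```

Extraction needs only the stego pixels and the reference matrix, which is why decoding is simple once the matrix is known.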

Design of Magic Matrices
The design of magic matrices is one of the focuses of this topic. The 9 × 9 Sudoku matrix, turtle shell matrix, octagonal matrix, and other matrices with diverse shapes and constraints have been proposed. Magic matrices are constructed by referring to the existing mathematical magic matrices or based on the matrices designed by researchers.

1. 16 × 16 Sudoku matrix
A 16 × 16 Sudoku matrix is a magic matrix developed from a 4 × 4 magic matrix by adding conditions and complexity. Moreover, a Sudoku matrix admits an enormous number of valid arrangements. Even if the use of Sudoku matrix-based image steganography is identified, it is therefore not easy to decode and extract the hidden content without the matrix, so the security of such an algorithm is relatively high. Figure 2 is an example of the 16 × 16 Sudoku matrix.
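As an illustration of how a valid 16 × 16 Sudoku matrix can be produced, the sketch below uses a standard cyclic pattern and an optional seeded relabelling of the digits. This is an assumed construction for demonstration, not the matrix of Figure 2, and the function names are hypothetical.

```python
import random


def sudoku16(seed=None):
    """Build a valid 16 x 16 Sudoku grid over the digits 0..15.

    The base grid comes from a standard cyclic pattern; if a seed is given,
    the digits are relabelled by a seeded permutation, which preserves
    validity and can act as a shared key (assumption).
    """
    n = 4
    side = n * n
    grid = [[(n * (r % n) + r // n + c) % side for c in range(side)]
            for r in range(side)]
    if seed is not None:
        relabel = list(range(side))
        random.Random(seed).shuffle(relabel)
        grid = [[relabel[v] for v in row] for row in grid]
    return grid


def is_valid_sudoku16(grid):
    """Check that every row, column, and 4 x 4 box holds 0..15 exactly once."""
    full = set(range(16))
    rows = all(set(row) == full for row in grid)
    cols = all({grid[r][c] for r in range(16)} == full for c in range(16))
    boxes = all(
        {grid[br + i][bc + j] for i in range(4) for j in range(4)} == full
        for br in range(0, 16, 4)
        for bc in range(0, 16, 4)
    )
    return rows and cols and boxes
```

Relabelling alone yields 16! grids from one pattern; row, column, and band permutations would enlarge the key space further.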

2. Double-layer magic matrix
The double-layer magic matrix is a way to improve the embedding capacity while maintaining good image quality. At the same embedding capacity, it narrows the traversal area compared with image steganography based on a one-layer magic matrix, and the properties of a magic matrix make it more widely applicable. Figure 3 is an example of a 4-2 double-layer magic matrix with a 4 × 4 magic matrix as the first layer and a 2 × 2 magic matrix as the second layer.
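One possible reading of the 4-2 double-layer structure is sketched below, purely as an assumption: the first layer assigns one of 16 digits to each cell of a tiled 4 × 4 block, the second layer treats each 4 × 4 block as a single cell of a tiled 2 × 2 block with 4 digits, and the two are combined into one value in the range 0 to 63 (6 bits per pixel pair, that is, 3 bpp). The layer contents and the names `layer1`, `layer2`, `combined`, and `window_values` are hypothetical and do not reproduce Figure 3.

```python
def layer1(r, c):
    """Hypothetical first layer: a 4 x 4 block of the digits 0..15, tiled."""
    return 4 * (r % 4) + (c % 4)


def layer2(r, c):
    """Hypothetical second layer: each 4 x 4 block is one cell of a tiled
    2 x 2 block holding the digits 0..3."""
    return 2 * ((r // 4) % 2) + ((c // 4) % 2)


def combined(r, c):
    """Combine both layers into one value in 0..63 (6 bits per pixel pair)."""
    return 16 * layer2(r, c) + layer1(r, c)


def window_values(r0, c0, size=8):
    """Collect the combined values inside a size x size window at (r0, c0)."""
    return {combined(r, c)
            for r in range(r0, r0 + size)
            for c in range(c0, c0 + size)}
```

In this sketch every 8 × 8 window contains all 64 combined values, so a search confined to such a window always finds the digit to embed.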

Determination of Traversal Area
The traversal area is the region of the magic matrix that is searched for the value to embed; it is determined by the structure and construction constraints of the magic matrix. Its design must ensure that all values appear in the area and must define how the closest satisfying point is searched. For example, Figure 4a,b shows two ways of determining the traversal area in the 4 × 4 magic matrix-based data hiding method. The construction constraint of the 4 × 4 magic matrix is that every column sum, row sum, and diagonal sum equals 30. This basic magic matrix is tiled repeatedly to fill a 256 × 256 matrix, which is then used to simulate hiding the ciphertext digit 0 in the pixel pair p(4,3). In Figure 4a, the basic magic matrix serves as the traversal area, and the new pixel pair p′(7,0) is obtained; the PSNR in this case is 38.59 dB. In Figure 4b, the 4 × 4 area in which the origin is located at the inner lower right serves as the traversal area, and the new pixel pair p′(3,4) is obtained; the PSNR in this case is 48.13 dB. The choice of traversal area therefore has a direct effect on image quality.
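The effect of the traversal area on distortion can be sketched as follows. The 4 × 4 block used here is a hypothetical stand-in (a well-known magic square shifted to the digits 0 to 15, so that every row, column, and diagonal sums to 30); it is not the matrix of Figure 4, so the resulting coordinates differ from those quoted in the text. The two strategies compared are the aligned tile that contains the pixel pair and a 4 × 4 window placed around the pixel pair; all function names are introduced for illustration.

```python
# Hypothetical 4 x 4 magic block over 0..15; rows, columns, diagonals sum to 30.
BLOCK = [
    [15, 2, 1, 12],
    [4, 9, 10, 7],
    [8, 5, 6, 11],
    [3, 14, 13, 0],
]


def ref_value(r, c):
    """Value of the 256 x 256 reference matrix obtained by tiling BLOCK."""
    return BLOCK[r % 4][c % 4]


def search(rows, cols, p, q, digit):
    """Nearest cell to (p, q) inside rows x cols whose value equals digit."""
    candidates = [(r, c) for r in rows for c in cols if ref_value(r, c) == digit]
    return min(candidates, key=lambda rc: (rc[0] - p) ** 2 + (rc[1] - q) ** 2)


def embed_in_tile(p, q, digit):
    """Traversal area = the aligned 4 x 4 tile that contains (p, q)."""
    r0, c0 = 4 * (p // 4), 4 * (q // 4)
    return search(range(r0, r0 + 4), range(c0, c0 + 4), p, q, digit)


def embed_in_window(p, q, digit):
    """Traversal area = a 4 x 4 window around (p, q), clipped to 0..255."""
    r0 = min(max(p - 2, 0), 252)
    c0 = min(max(q - 2, 0), 252)
    return search(range(r0, r0 + 4), range(c0, c0 + 4), p, q, digit)


def squared_error(p, q, new):
    """Squared distance between the original and the modified pixel pair."""
    return (new[0] - p) ** 2 + (new[1] - q) ** 2
```

Both strategies return a cell that decodes to the same digit, but the window around the pixel pair never moves either coordinate by more than 2, whereas the aligned tile can move it by up to 3.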

Evaluation Methods
The main evaluation criteria of magic matrix-based data hiding methods are visual effect, embedding capacity (EC), and average peak signal-to-noise ratio (PSNR).

1. Visual effect
The visual effect is assessed by comparing the original image and the stego-image with the human eye. The comparison looks at color differences, the degree of difference, and other image details to check for traces of data embedding. Figure 5 displays such a comparison for the 4 × 4 magic matrix-based data hiding method; the images are sourced from the USC-SIPI Image Database [34], and the images in the green box are stego-images.

2. Embedding rate (ER)
The embedding rate, in bits per pixel (bpp), measures how much binary ciphertext data can be embedded per pixel. It is calculated according to Equation (1), where H is the image height, W is the image width, and S is the size of the binary ciphertext in bits.

$$\mathrm{ER} = \frac{S}{H \times W} \tag{1}$$
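As a worked instance of the embedding rate, taking ER as the ciphertext size S divided by the pixel count H × W (the payload sizes below are chosen for illustration, not taken from the experiments), a 512 × 512 image gives:

$$H \times W = 512 \times 512 = 262{,}144, \qquad \mathrm{ER} = \frac{524{,}288}{262{,}144} = 2\ \text{bpp}, \qquad \mathrm{ER} = \frac{786{,}432}{262{,}144} = 3\ \text{bpp}.$$

In other words, an embedding rate of 3 bpp on such an image corresponds to a payload of 786,432 bits, or 96 KiB.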

3. Peak signal-to-noise ratio (PSNR)
The peak signal-to-noise ratio (PSNR) measures the difference between the carrier image and the stego-image; it is also one of the most important indicators for evaluating the performance of an image information hiding scheme. An image's PSNR is defined through the mean squared error (MSE) between the pixels of the carrier image and those of the stego-image. For an M × N grayscale image, the MSE is defined as shown in Equation (2). The PSNR reflects, to a certain extent, how much the carrier image changes through the information hiding operation. The higher the PSNR, the smaller the difference between the carrier image and the stego-image, that is, the better the performance of the information hiding algorithm. Conversely, the lower the PSNR, the greater the difference between the two images, which means higher distortion that is easier for the human visual system to detect. In general, when the PSNR exceeds 30 dB, the carrier image and the stego-image can be considered visually indistinguishable.
The PSNR is used to evaluate the quality of the stego-image by comparing the original image with the stego-image computationally. It is calculated according to Equation (3), where MSE is the mean squared error. In Equation (2), I(i, j) is the pixel in row i, column j of the original image, and I′(i, j) is the pixel in row i, column j of the stego-image.

$$\mathrm{MSE} = \frac{1}{M \times N} \sum_{i=1}^{M} \sum_{j=1}^{N} \bigl( I(i,j) - I'(i,j) \bigr)^2 \tag{2}$$

$$\mathrm{PSNR} = 10 \log_{10} \frac{255^2}{\mathrm{MSE}} \tag{3}$$

The constant 255 in Equation (3) is the largest value an 8-bit grayscale pixel can take, since each pixel (signal) lies in the range 0 to 255. Any particular image is only one instance of an 8-bit grayscale image, so the peak is taken from the format rather than from the image. If the peak were instead set to the range of pixel values actually present in each image, the result would no longer describe an 8-bit grayscale image but only that image within its own range.
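The MSE and PSNR computation can be sketched in a few lines. This is a minimal illustration that assumes images are given as flat lists of 8-bit pixel values; the function names `mse` and `psnr` and the `peak` parameter are introduced here. The two PSNR values quoted in the traversal-area example appear consistent with evaluating the MSE over the modified pixel pair alone, which the usage lines below check.

```python
import math


def mse(original, stego):
    """Mean squared error between two equally sized flat pixel sequences."""
    if len(original) != len(stego) or not original:
        raise ValueError("images must be non-empty and of equal size")
    return sum((a - b) ** 2 for a, b in zip(original, stego)) / len(original)


def psnr(original, stego, peak=255):
    """PSNR in dB; peak is 255 for 8-bit grayscale images."""
    err = mse(original, stego)
    if err == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / err)


# Pixel pair (4, 3) moved to (7, 0): MSE = 9
print(round(psnr([4, 3], [7, 0]), 2))  # 38.59
# Pixel pair (4, 3) moved to (3, 4): MSE = 1
print(round(psnr([4, 3], [3, 4]), 2))  # 48.13
```

Over a whole image the same functions apply to the full pixel lists, so unmodified pixels lower the MSE and raise the PSNR.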

Experimental Analysis
The experiments are performed with MATLAB R2018a on an AMD 3700X CPU and an RTX 2070 Super GPU. The experimental images are grayscale images of 512 × 512 pixels. This paper studies information embedding and extraction based on encrypted-domain information hiding and proposes a tag-based encrypted-domain information embedding and extraction algorithm. Each algorithm is analyzed by simulation to compare effectiveness, security, and so on. In addition, the information embedding and extraction are implemented to evaluate their function and performance.

Turtle Shell Magic Matrix-Based Image Steganography
Table 1 lists the experimental results of the turtle shell magic matrix-based image steganography. Four images are processed by steganography. According to the experimental results, this scheme can achieve an embedding capacity of 1.5 bpp and a PSNR of 49.76 dB. There are two other steganographic schemes that can change the traversal area.

4 × 4 Magic Matrix-Based Image Steganography
Table 2 lists the experimental results of the 4 × 4 magic matrix-based image steganography. Four images are processed by steganography using the 4 × 4 magic matrix shown as an example in Figure 6. According to the experimental results, this scheme can achieve an embedding capacity of 2 bpp and a PSNR of 46.36 dB, which maintains good image quality.

4-2 Double-Layer Magic Matrix-Based Image Steganography
Table 3 lists the experimental results of the 4-2 double-layer magic matrix-based image steganography. Four images are processed by steganography using the structure of the double-layer magic matrix shown in Figure 3. According to the experimental results, this scheme can achieve an embedding capacity of 3 bpp and a PSNR of 40.73 dB. In order to improve stego-image quality and steganography efficiency [35,36], the paper redesigns the way in which the secret information is hidden, so that the scheme can hide the corresponding secret information. Extensive experimental comparisons show that the scheme can hide secret information for different capacity requirements and can improve the quality of stego-images at the same capacity.
The paper describes the three image information hiding schemes in terms of their design theory and steganographic process. It analyzes steganography efficiency, stego-image quality, steganography capacity, and security through simulation experiments. The experimental results show that, compared with previous schemes, the three schemes have some advantages in terms of steganography capacity and stego-image quality.

Conclusions
In the magic matrix-based data hiding methods proposed in previous research, the embedding capacity can reach 3 bpp while the image maintains good quality. In this paper, a larger embedding capacity is achieved by using the double-layer magic matrix.
Image quality is one of the indicators used to verify data embedding. At the same embedding capacity, constructing suitable magic matrices and designing traversal areas for specific magic matrices can improve both the image quality and the efficiency of data embedding.
Based on the 16 × 16 Sudoku magic matrix, this paper proposes an image steganography method that improves algorithm security. The double-layer magic matrix-based image steganography is applied to improve the embedding capacity. A method for determining a new traversal area is designed to improve the image quality and reduce the computational complexity. A number of magic matrix-based data hiding methods have been implemented, and on that basis it is practicable to implement a data hiding system for newly designed magic matrices.
Focusing on the security of information transmitted over the Internet, this paper presents a data hiding scheme that applies several techniques, including the magic matrix. Inspired by Sudoku, the scheme uses a magic matrix generated by a double-layer numeral system function to guide the modification of cover pixels and fully exploits the embedding space in the magic matrix. The scheme achieves a very large embedding capacity together with good security.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.