The Influence of the Form of Digital Etalons on the Effectiveness of Associative Security

Raikhlin, Vadim; Gibadullin, Ruslan; Boyko, Alexey

doi:10.3390/computers15030167

by Vadim Raikhlin, Ruslan Gibadullin * and Alexey Boyko

Department of Computer Systems, Kazan National Research Technical University named after A.N. Tupolev—KAI, Kazan 420000, Russia

* Author to whom correspondence should be addressed.

Computers 2026, 15(3), 167; https://doi.org/10.3390/computers15030167

Submission received: 18 January 2026 / Revised: 21 February 2026 / Accepted: 1 March 2026 / Published: 4 March 2026

(This article belongs to the Special Issue Cyber Security and Privacy in IoT Era)

Abstract

Opportunities to improve the effectiveness of associative protection in scene analysis can be found in changing the configurations of digital etalons (reference patterns) and in the transition from a decimal to a hexadecimal system when encoding object names and their coordinates. The relevance of this research is determined by the need for a significant increase in the number of usable keys and the advisability of further improving the security strength. Based on a preliminary analysis, a rule for selecting digital etalon configurations has been formulated from the condition of uniform distribution of bit inclusions in the pseudorandom sequence (GAMMA) container when using the decimal and hexadecimal systems for encoding. Algorithms for forming a complete and a limited test list of permutations for experimental research have been developed. The results of the computational experiment confirmed the validity of the formulated rule. For the accepted configurations, estimates of the expected number of preserved bits of the etalon were obtained.

Keywords:

associative data protection; digital etalon configurations; transition to a hexadecimal (base-16) system; bit distribution function; mathematical expectation of the number of stored etalon bits

1. Introduction

The review in [1,2] examined the state of research in the field of associative data protection in scene analysis [3,4]. Associativity is defined by the use of a masking mechanism for the binary matrix etalons of the decimal digits in the code representations of object names and their coordinates. Both are encoded by k-digit numbers. Each digit is entered into a binary matrix of size $m \times n$, $m = 2 \times n - 1$. The masking algorithm divides the etalon set $\overline{0, Q - 1}$, where Q is the base of the number system (in this case, $Q = 10$), and the sequentially formed subsets into dichotomic pairs according to the value of one bit. The position of this bit corresponds to a one in the inverse (bit-wise complemented) mask matrices of both subsets of the pair.

We used decimal coding with postal (ZIP code-style) symbols (Figure 1). Individual bits of the etalon set were placed along the outer contour and inner “zigzag” of the binary matrices (Figure 2).

The mask generation process is random. For each etalon matrix, a separate mask matrix of the same size is created, which determines the bits of the etalon that are essential for its identification. The set of masks is the key to recognition. The masked bits are subject to randomization. As a result, each numerical code is converted into a k-section steganographic container, initially filled with a segment of a pseudo-random sequence of length $L = k \times (9 \times n - 12)$, into which the stored code bits are interspersed at the positions of the ones of the inverse mask matrices. Regardless of n, the average number of such bits $k \times M \ll L$ remains constant. Here M is the mathematical expectation for a single etalon. At the same time, as n increases, the steganographic security of the method also increases due to the increase in $L / (k \times M)$ [5].

The integration of cryptography and steganography is a characteristic feature of associative security.

The previous study [2] addressed the problem of associative security primarily at the level of engineering optimization: minimizing the average number of bit inclusions M in the GAMMA container while maintaining recognition capability. However, this optimization-oriented perspective left unresolved a more fundamental issue that directly affects the practical viability of the method. Specifically, the non-uniform distribution of unmasked bits along the container length creates two interrelated vulnerabilities. First, the distribution spikes at nodal points provide an adversary with exploitable structural regularities, enabling targeted distortion of the transmitted message through intentional network noise. Second, the requirement to exclude keys producing inclusions at nodal points leads to a drastic reduction in the usable key space (to 27% of the total for $Q = 10$), thereby degrading cryptographic strength. Thus, the central problem addressed in this work is the elimination of distribution non-uniformity through principled selection of etalon configurations. This problem lies at the intersection of bit distribution uniformity and key space preservation and has direct implications for adversarial robustness. Unlike [2], which treated etalon reconfiguration as a means of reducing M, the present work elevates the uniformity of bit distribution to a primary design criterion and formulates a heuristic rule grounded in the analysis of dichotomous partition cardinalities, thereby shifting the level of inquiry from parameter optimization to structural design of the encoding system.

Notation and Definitions

To facilitate reading across disciplines, we summarize the core notation and definitions used throughout the paper.

Number system base Q: The radix of the positional number system used to encode object names and coordinates. In this work, $Q = 10$ (decimal) and $Q = 16$ (hexadecimal).

Code length k: The number of digits in the code representation of an object name or coordinate value. With k digits, the maximum number of representable values is $Q^{k}$; the examples in this work use $k = 3$.

Binary matrix-etalon: A binary matrix of size $m \times n$, where $m = 2 \times n - 1$, that encodes the graphical representation of a single digit from the set $\overline{0, Q - 1}$. The positions of ones in this matrix define the shape (configuration) of the digit. The term “etalon” (from French étalon, meaning “reference standard”) denotes a reference pattern against which incoming data is matched during recognition.

Etalon outline: The ordered sequence of matrix cells traversed along the outer contour (clockwise, starting from the lower-left corner) and then along the inner zigzag of the binary matrix (see Figure 2). This linearization maps the two-dimensional matrix to a one-dimensional bit sequence.

Nodal points: The corner cells of the binary matrix that lie at the junctions of contour segments (highlighted in Figure 2). These points are shared by two or more segments and play a special role in the dichotomous masking process.

Mask matrix: A binary matrix of the same size $m \times n$ as the etalon, generated randomly for each etalon in the set. The positions of ones in the inverse (bit-wise complemented) mask indicate the bits of the etalon that are preserved (stored) in the container; the remaining (masked) bits are replaced by pseudorandom values. The complete collection of mask matrices for all Q etalons constitutes the secret key.

Dichotomous partitioning: The recursive process by which the masking algorithm divides the etalon set $\overline{0, Q - 1}$ and its subsequent subsets into pairs based on the value of a single bit at a given position of the outline. At each step, the bit position where the partition occurs is recorded in the inverse mask matrices of both resulting subsets.
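The recursive partitioning described in this definition can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `build_inverse_masks`, the toy 3-position etalons, and the random choice of the splitting position are assumptions made purely for illustration.

```python
import random


def build_inverse_masks(etalons, rng):
    """Return {symbol: inverse mask}; a 1 marks a preserved outline position.

    `etalons` maps each symbol to a tuple of bits (the linearized outline).
    All etalons must be distinct. Hypothetical helper, for illustration only.
    """
    length = len(next(iter(etalons.values())))
    inverse = {symbol: [0] * length for symbol in etalons}

    def split(symbols):
        if len(symbols) < 2:
            return  # a singleton subset is already identified
        # Positions at which the subset contains both a 0 and a 1.
        candidates = [p for p in range(length)
                      if len({etalons[s][p] for s in symbols}) == 2]
        p = rng.choice(candidates)  # assumption: position picked at random
        for s in symbols:
            inverse[s][p] = 1  # record the split for both subsets of the pair
        split([s for s in symbols if etalons[s][p] == 0])
        split([s for s in symbols if etalons[s][p] == 1])

    split(list(etalons))
    return inverse


# Toy etalons (assumed data): four symbols on a 3-position outline.
toy = {0: (0, 0, 0), 1: (0, 1, 1), 2: (1, 0, 1), 3: (1, 1, 0)}
masks = build_inverse_masks(toy, random.Random(0))
mean_preserved = sum(sum(m) for m in masks.values()) / len(masks)
```

In this toy case `mean_preserved` plays the role of M, and the bits marked in each inverse mask are enough to tell the corresponding symbol apart from all others.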

GAMMA container: A pseudorandom binary sequence of length $L = k \times (9 \times n - 12)$ that serves as the steganographic carrier. The unmasked (preserved) bits of all k etalons are embedded into this sequence at positions determined by the inverse masks, while all other positions retain their pseudorandom values. The container thus integrates cryptographic (key-dependent masking) and steganographic (embedding in a pseudorandom carrier) protection.

Mathematical expectation of stored bits M: The average number of etalon bits that survive the masking process and are embedded into the container for a single etalon, computed based on generated keys.

Distribution function of bit inclusions: The function that maps each position along the linearized etalon outline to the total number of stored (unmasked) bits at that position, aggregated over all keys. A uniform distribution indicates that no position is disproportionately favoured, which is the design objective of this work.

Permutation: An ordered arrangement of Q symbols that defines the initial assignment of digits to etalon shapes in the masking algorithm. The total number of distinct permutations is $Q!$. Different permutations produce different keys.

2. Research Objectives

The prospects for further research on associative security outlined in [2] are linked to the reconfiguration of digital etalons and the transition from a decimal to a hexadecimal system. For $k = 3$, this transition increases the maximum number of object names and coordinate values from $10^{3} = 1000$ (decimal) to $16^{3} = 4096$ (hexadecimal). This should significantly increase cryptographic strength, provided that complete coverage is achieved, meaning that the GAMMA sequence is generated once using the cryptographic version of the Mersenne Twister pseudorandom number generator [6,7]. Given an acceptable key search, decryption of the container contents will then yield the complete set of names. The key space also expands significantly due to the much larger number of permutations available with hexadecimal symbols. The use of such permutations is characteristic of the developed masking algorithm [2].

The criterion for selecting etalon configurations is not trivial. In [2], the emphasis of reconfiguration was on minimising the average number of bit inclusions in the information-carrying GAMMA container. Our findings give $M = 3.8$ for “postal coding,” rather than $M = 5$ as in [2]. This is already quite close to the minimum possible value $M_{\min} = 3.4$ for decimal encoding. For the set $\overline{0, 15}$, the minimum is $M_{\min} = 4$, attained by sequentially dividing the entire set and the resulting subsets in half [2].
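The two minimum values quoted above can be reproduced with a short Python sketch. It assumes that the minimum is reached when every set and subset is split as evenly as possible and that each split contributes one preserved bit to every member of the subset being split; the function names are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def total_preserved_bits(size):
    """Total preserved bits over a subset of `size` etalons under even halving."""
    if size <= 1:
        return 0  # a single etalon needs no further split
    small = size // 2
    large = size - small
    # Every member of the subset receives one bit for this split.
    return size + total_preserved_bits(small) + total_preserved_bits(large)


def minimal_expected_bits(q):
    """Average number of preserved bits per etalon for a base-q etalon set."""
    return total_preserved_bits(q) / q


m_min_decimal = minimal_expected_bits(10)      # 34 / 10
m_min_hexadecimal = minimal_expected_bits(16)  # 64 / 16
```

For $Q = 10$ the halving sequence is 10 → 5 + 5 → (2 + 3) + (2 + 3) and so on, which gives 34 preserved bits over 10 etalons; for $Q = 16$ every etalon is reached after exactly four splits.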

In this article, the focus has changed while keeping an interest in the obtained M values. We consider the main issue to be the need to eliminate spikes in the distribution function of unmasked bits along the container length. Such spikes make it easier for an adversary to distort the transmitted message using intentional network noise. To obtain a uniform distribution function of inclusions along the length of the container in [2], it was necessary to prohibit the appearance of ones at the nodal points highlighted in Figure 2 when forming key sets of inverse mask matrices. This led to a significant reduction in the number of keys used (“correct” keys) and, consequently, to a decrease in cryptographic strength.

To ensure programming convenience we abandon the reverse diagonal approach introduced in [2] and return to the contour shown in Figure 2. This choice facilitates the transition from matrix to linear representation, which leads to a significant reduction in the amount of data transferred. Moreover, as the research has shown, this makes it possible to better meet the reconfiguration criterion adopted in [2].

The randomness of key (i.e., mask set) generation is largely determined by the random selection of the initial permutation of characters in the set $\overline{0, Q - 1}$. The number of permutations of r elements is $r!$. Figure 3 shows the distribution of the ones of the inverse masks along the etalon outlines (Figure 2) for postal symbols (Figure 1), obtained using the Fisher–Yates shuffling algorithm [8,9].
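For readers unfamiliar with it, the Fisher–Yates shuffle mentioned here can be sketched in a few lines of Python. This is the textbook form of the algorithm, not the authors' code; the function name and the seeded generator are assumptions for illustration.

```python
import random


def fisher_yates_shuffle(symbols, rng):
    """Return a uniformly random permutation of `symbols` (textbook form)."""
    items = list(symbols)
    # Walk from the last index down, swapping each item with a random
    # item at or before its own position.
    for i in range(len(items) - 1, 0, -1):
        j = rng.randint(0, i)  # inclusive on both ends
        items[i], items[j] = items[j], items[i]
    return items


# Example: a random initial permutation of the decimal symbols 0..9.
initial_permutation = fisher_yates_shuffle(range(10), random.Random(2026))
```

Each of the $Q!$ permutations is produced with equal probability, provided the underlying random number generator is unbiased.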

The majority of the individual bits of all keys involved are distributed among the nodes shown in Figure 2. In Figure 3, the zero position corresponds to the lower left point in Figure 2, the outer contour is traversed clockwise, then continuously in a zigzag pattern (linearisation of the matrix representation of the binary etalon).

It was useful to check whether the spikes in the distribution function were the result of an unrepresentative statistical sample. The most reliable check would be to go through all possible permutations. For $Q = 10$, the number of possible permutations is 10! = 3,628,800. The current state of computer technology allows us to generate this list in a fraction of a second. Complete iteration over the list during mask generation took 30 min and confirmed the validity of the previously obtained result (see Figure 4; $M = 3.83$ in this case).

Let us refer to Figure 3. Each key has an average of 8 occurrences of stored bits at the highlighted nodal points. This casts doubt on the existence of “correct” keys with zeros at all nodal points. Nevertheless, such keys do exist [1]. We analysed all $10!$ obtained keys. It was found that 73% of them contain at least one “critical” mask (on average, 8 such masks for each non-working key). In other words, only 27% of the keys can be used in practice. This could lead to a reduction in the cryptographic security strength.

This defines our primary research objective: finding such a rule for selection of digital etalon shapes that will eliminate spikes in the inclusion distribution function.

Spikes occur as a result of combining multiple keys. Figuring out an appropriate etalon selection rule, we should bear in mind that the key generation system [10] is subject to external influences, specifically changes in etalon shapes. Therefore, we can only observe the effects of this influence without explaining the underlying mechanisms. Explanation is, of course, a key function of science [11], but a full explanation here proves difficult.

Studying Figure 4 can provide us with valuable information regarding the heuristic rule that we seek to develop. According to the condition, the outline of the etalon points between the nodal points is composed of line segments consisting only of ones or only of zeros. The results of the first dichotomous partitions for all points of each segment, except for the nodal points, are identical. Examination of Figure 4 reveals that the distribution for each segment is also identical. Therefore, it is logical to assume that:

1. The behaviour of the system at each point of the outline is determined by the result of the first dichotomous division at that point.
2. To determine the cause of the spikes, it is enough to consider the case $n = 3$.

Threat Model and Security Implications of Distribution Non-Uniformity

The claim that spikes in the bit inclusion distribution function compromise security warrants clarification in terms of specific adversarial models. We consider three classes of attacks for which distribution non-uniformity is relevant.

Targeted channel noise attack [12]: We assume an active adversary who can inject additive noise into the communication channel but has a limited energy budget for distortion. If the distribution of embedded bits along the container is non-uniform, the adversary can concentrate noise at positions corresponding to the spikes (nodal points), thereby corrupting a disproportionately large fraction of stored etalon bits with minimal total distortion energy. Under a uniform distribution, the same energy budget would affect a smaller and more predictable fraction of embedded bits, since the adversary gains no positional advantage. This model is the primary motivation for the uniformity criterion adopted in this work.

Statistical steganalysis [13]: A passive adversary who intercepts the container and possesses knowledge of the etalon outline structure (but not the key) may attempt to distinguish the container from a purely pseudorandom sequence. Non-uniform bit inclusion produces detectable statistical deviations at predictable positions. Specifically, at spike positions, the frequency of certain bit values across multiple containers deviates from the expected pseudorandom distribution, providing the adversary with a distinguishing criterion. Under a uniform distribution, such positional statistical signatures are eliminated, and the adversary is forced to rely on aggregate statistical tests over the entire container length, which are less effective given that $k \times M \ll L$.

Key space reduction attack: The requirement to avoid inclusions at nodal points (to mitigate the effects of spikes) reduces the usable key space to 27% of the total for $Q = 10$. An adversary performing brute-force key search benefits directly from this reduction.

We emphasize that the present analysis is qualitative. A formal quantitative treatment—including bounds on adversarial advantage under specific steganalytic frameworks [14] and precise capacity–security trade-offs—is an important direction for future work.

3. The Etalon Configurations Selection Rule and the Prerequisites for Its Verification

Our Algorithm 1 for generating the complete set of permutations of r elements is based on the identity $r! = r \times (r - 1)!$.

Algorithm 1 Generation of the complete set of permutations of r elements

Require: Initial permutation $P = (p_{1}, p_{2}, \dots, p_{r})$
Ensure: Complete list of $r!$ permutations
1: procedure GeneratePermutations(P, r)
2:     if $r = 2$ then
3:         Record P
4:         Perform one cyclic left shift on P; record the result
5:         return
6:     end if
7:     for $i = 1$ to r do
8:         Record the current state of P
9:         GeneratePermutations($(p_{2}, p_{3}, \dots, p_{r})$, $r - 1$)
10:         Perform a cyclic left shift on P
11:     end for
12: end procedure

A complete permutation list is characterized by each element appearing in every position of the sequence exactly $(r - 1)!$ times. This fact is illustrated by Table 1 using the example of $r = 5$, where $(r - 1)! = 24$.
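The position-balance property stated above can be checked with a small Python sketch. The generator below follows the same idea as Algorithm 1 (cyclic shifts combined with recursion on the tail, mirroring $r! = r \times (r - 1)!$), but it is a simplified reading rather than a line-by-line transcription; the name `rotation_permutations` is hypothetical.

```python
from collections import Counter


def rotation_permutations(seq):
    """Yield every permutation of `seq` via cyclic shifts and recursion."""
    seq = tuple(seq)
    if len(seq) <= 1:
        yield seq
        return
    for shift in range(len(seq)):
        rotated = seq[shift:] + seq[:shift]  # cyclic left shift by `shift`
        head, tail = rotated[0], rotated[1:]
        # Fix the head and enumerate all arrangements of the remaining items.
        for sub in rotation_permutations(tail):
            yield (head,) + sub


perms = list(rotation_permutations(range(5)))
# For each position, count how often every element occurs there.
position_counts = [Counter(p[pos] for p in perms) for pos in range(5)]
```

For $r = 5$ the list contains 120 distinct permutations, and every entry of `position_counts` assigns the same count, $(r - 1)! = 24$, to each element.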

Let us consider the case $n = 3$. It is characteristic that the spikes in Figure 4 appear only at the nodal points of the outer contour of Figure 2a. There is no spike at the upper right point, because there is no dichotomous pair for it. Among the dichotomous pairs at the nodal points, two pairs include singleton subsets, two include subsets of two elements, and one includes a subset of three elements. At other points where no spikes are observed, five pairs include subsets of cardinality 4, three pairs include subsets of cardinality 5, and one pair includes a subset of cardinality 3. However, the node with the subset of cardinality 3 produces a spike. Therefore, configurations exhibiting this property should be excluded.

According to the assumption given in Section 2, we come to the following conclusion. A necessary condition for the absence of spikes in the distribution function is that, at each point of the outline in Figure 2a, either no dichotomy exists or (what is still hypothetical) the first division in the masking process selects subsets of cardinality 4–5 for $\overline{0, 9}$ and, by analogy, 7–8 for $\overline{0, 15}$. These cardinalities are close to those of the subsets selected in [1] to minimise the number of inclusions.

Based on this analysis, we propose the following heuristic rule:

Rule 1.

The configurations of digital etalons should be chosen in such a way that the previously specified conditions are fulfilled in order to obtain a uniform distribution function.

The above rule was derived from the analysis of the minimal case $n = 3$. We now provide a justification for its extrapolation to arbitrary values of n. The outer contour of the binary matrix of size $m \times n$, $m = 2 n - 1$, contains exactly four corner (nodal) points at fixed relative positions (Figure 2), regardless of n. Increasing n extends the contour segments between these points but does not alter the number or topology of the nodal points themselves. Since the distribution spikes occur precisely at the nodal points, the structural source of the problem is identical for all n.

Moreover, the result of the first dichotomous partition at any contour point depends solely on the number of etalons having a 1 versus a 0 at that position, which is determined entirely by the etalon configurations and not by the matrix size. Between any two adjacent nodal points, the contour consists of a segment with identical bit values for each etalon, so all non-nodal points within a segment produce the same first-partition result. This segment homogeneity holds for all n, as increasing n only adds points within each segment without affecting its uniformity. After the first partition, all subsequent recursive partitions operate on subsets whose composition is already fixed by the first step, meaning that deeper recursion levels are modulated by the n-invariant first partition. Collectively, these invariants imply that the conditions producing or eliminating spikes at the nodal points are governed by the etalon configurations alone.

To supplement this structural argument with empirical evidence, we conducted additional experiments at intermediate values $n = 7$ and $n = 15$ for both $Q = 10$ and $Q = 16$ (with configurations presented in Section 4). In all cases, the distribution of bit inclusions along the container remained uniform, with no spikes at the nodal points, and the values of M coincided with those obtained for $n = 3$ and $n = 30$. This is consistent with the analysis above: since M depends on the partition structure rather than on the matrix dimensions, it remains constant across n. What does change with n is the container length $L = k \times (9 n - 12)$ and consequently the steganographic security ratio $L / (k \times M)$, which grows linearly.

We acknowledge that this extrapolation applies specifically to the distribution uniformity criterion and the value of M. Other aspects of system behaviour—such as computational efficiency of key generation and practical resistance to specific statistical attacks at different container lengths—may exhibit n-dependent properties and require separate investigation.

This hypothesis requires experimental verification. For the case $Q = 10$, testing can be performed on the entire set of permutations. However, when $Q = 16$, the total number of permutations is 16! = 20,922,789,888,000, and exhaustively enumerating all of them is computationally infeasible. It is therefore practical to construct a suitable limited test list for research purposes.

We now clarify what “suitable” means as applied to a limited test list. A subset of permutations is suitable if it reproduces the same distribution function as would be obtained from the complete permutation set. For this purpose, we strive to maintain the property that is naturally inherent in a complete permutation set: each element appears equally often in every position of the r-sequence.

To form such a list for even values of r, the following Algorithm 2 is proposed:

Algorithm 2 Generation of the test list of permutations for even r

Require: Initial permutation $P = (p_{1}, p_{2}, \dots, p_{r})$, r is even
Ensure: Test list of $r \times (2 (r / 2)! - 1)$ permutations
1: Perform $r - 1$ cyclic left shifts on P, obtaining r permutations (including the initial one)
2: for each of the r resulting permutations do
3:     Split it into two halves of $r / 2$ elements
4:     Generate the complete list of $(r / 2)!$ permutations for each half using Algorithm 1
5: end for

For this algorithm, the cardinality of the resulting set of permutations is equal to $r \times (2 (r / 2)! - 1)$. This count is consistent with permuting each half while the other half is kept fixed and counting the unchanged permutation, which occurs in both lists, only once. Each element is present in any position of the r-sequence $2 \times (r / 2)! - 1$ times. These positions are illustrated in Table 2 ($r = 4$) and Table 3 ($r = 6$). When $r = 4$, we obtain 12 test permutations, while $r! = 24$. All permutations are different. Each element appears three times in any position. When $r = 6$, we obtain 66 different test permutations, while $r! = 720$. Each element appears 11 times in any position.

For $\overline{0, 9}$ ($r = 10$), we obtain $10 \times (2 \times 5! - 1) = 10 \times (2 \times 120 - 1) = 2390$ different test permutations, i.e., approximately $0.66 \times 10^{-3}$ of the total set. Each decimal digit appears 239 times in any position of the r-sequence. The validity of using the test list to estimate the distribution of inclusions along the length of the container is illustrated in Figure 5 using the example of $Q = 10$ (compare with Figure 4).
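The counts quoted for $r = 4$, $r = 6$ and $r = 10$ can be reproduced with the following Python sketch of the test-list construction. It rests on one reading of Algorithm 2 that matches the stated cardinality: for every cyclic shift, each half is permuted while the other half stays fixed, and duplicates are dropped. The function name `build_test_list` is hypothetical, and `itertools.permutations` is used in place of Algorithm 1 for brevity.

```python
from collections import Counter
from itertools import permutations


def build_test_list(seq):
    """Build the limited test list of permutations for an even-length `seq`."""
    seq = tuple(seq)
    r = len(seq)
    if r % 2:
        raise ValueError("the construction is defined for even r only")
    half = r // 2
    seen = set()
    result = []
    for shift in range(r):
        rotated = seq[shift:] + seq[:shift]  # cyclic left shift by `shift`
        left, right = rotated[:half], rotated[half:]
        # Permute one half at a time; the other half is kept fixed.
        variants = [p + right for p in permutations(left)]
        variants += [left + p for p in permutations(right)]
        for v in variants:
            if v not in seen:  # the unchanged permutation occurs twice
                seen.add(v)
                result.append(v)
    return result


test_list_10 = build_test_list(range(10))
# How often each digit occupies position 0 in the test list.
first_position_counts = Counter(p[0] for p in test_list_10)
```

Under this reading the list for $r = 10$ has 2390 entries and every digit occupies every position 239 times, which is the balance property the test list is meant to preserve.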

To provide a quantitative assessment of the agreement between the distributions obtained from the test list (Figure 5) and the complete permutation set (Figure 4), we introduce two statistical metrics [15]. Let $f_{i}$ and $g_{i}$ denote the normalized distribution values at position i of the etalon outline for the full permutation set and the test list, respectively, where $i = 1, \dots, N$ and $N = 9 \times n - 12$ is the outline length. Normalization is performed by dividing the raw counts by the total number of keys in each case, so that the distributions are directly comparable regardless of sample size.

The first metric is the maximum absolute deviation $D_{\max} = \max_{i} | f_{i} - g_{i} |$, analogous to the Kolmogorov–Smirnov statistic. For $Q = 10$, $n = 30$, the computed value is $D_{\max} = 0.0041$, indicating that the largest pointwise discrepancy between the two normalized distributions does not exceed 0.41%.

The second metric is the root mean square error $RMSE = \sqrt{\frac{1}{N} \sum_{i = 1}^{N} (f_{i} - g_{i})^{2}}$. The obtained value $RMSE = 0.0018$ confirms that the average deviation across all outline positions is negligible.
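Both metrics are straightforward to compute once the per-position counts are available. The Python sketch below follows the definitions given above; the function names are hypothetical, and the inputs are assumed to be lists of raw counts per outline position.

```python
from math import sqrt


def normalize(counts, num_keys):
    """Divide raw per-position counts by the number of keys used."""
    return [c / num_keys for c in counts]


def max_abs_deviation(f, g):
    """D_max: the largest pointwise gap between two normalized distributions."""
    return max(abs(a - b) for a, b in zip(f, g))


def rmse(f, g):
    """Root mean square error over all outline positions."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(f, g)) / len(f))
```

In the setting of this section, `f` would be built from the counts over all $10!$ keys and `g` from the counts over the 2390 test-list keys, each normalized by its own number of keys.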

These two metrics collectively confirm that the test list of 2390 permutations, comprising only 0.066% of the full set, reproduces the distribution of bit inclusions with high fidelity. The property of equal element presence at every position of the r-sequence, ensured by Algorithm 2, provides the structural basis for this agreement. We therefore consider the test list to be a reliable proxy for the complete permutation set in subsequent experiments with $Q = 16$, where exhaustive enumeration of $16! \approx 2.09 \times 10^{13}$ permutations is computationally prohibitive.

The requirements of the formulated rule were fulfilled in the configurations of digital etalons shown in Figure 6. Of the 15 points in Figure 2a, at $Q = 10$ (Figure 6a) there are 4 points with no dichotomous partitions and, among the rest (with existing partitions), 7 points with subsets of cardinality 4 and 4 points with subsets of cardinality 5. At $Q = 16$ (Figure 6b), there are the same 4 points without partitions, 8 points with subsets of cardinality 8, and 3 points with subsets of cardinality 7.

The proposed rule formulates necessary but not sufficient conditions. Repeated adjustments are possible in some etalons. In the course of further planned research, an increase in the duration of the computational experiment cannot be ruled out. This led to a revision of the software implementation of the masking algorithm presented in [1].

The new software implementation of the masking algorithm (see Figure A1 in Appendix A) employs recursive dichotomous partitioning and utilizes the BitVectorLib library [16] for efficient bitwise operations. The implementation targets the .NET 10 platform. On the full set of permutations in the case of $Q = 10$, the original implementation [17,18] required approximately 38 min, whereas the new implementation completes the same task in 51 s in single-threaded mode. This roughly 45-fold acceleration was achieved solely through algorithmic optimizations, including compact bit-vector representations and elimination of redundant memory allocations. Further acceleration is attainable by engaging the Parallel class of the .NET framework [19]; however, a systematic study of parallel scalability is beyond the scope of this article and is planned as a separate investigation. The testing was carried out on a hardware platform with the following characteristics: Intel Core i5-9300H processor, 16 GB DDR4 RAM, Windows 11 operating system (10.0.22631.4460).

4. Results of the Computational Experiment

This section presents the results of experimental verification of the formulated rule for the proposed etalon configurations on the sets $\overline{0, 9}$ and $\overline{0, 15}$, with evaluation of the values of M and enumeration of the complete and limited test lists of permutations, respectively. The value of n strongly affects the volume of the transmitted message. Thus, the research is limited to the case $n = 30$, which retains the advantages of associative protection [20].

Case $Q = 10$: The experiment with the shapes in Figure 6a led to the appearance of a spike in the upper-left node of Figure 2a, whereas there is no spike in the lower-right node. We attribute this to the fact that the upper-left node gives a subset of cardinality 4 in the first division, while the lower-right node divides the complete set $\overline{0, 9}$ exactly in half. To align the two situations, we had to change the representation of the digit 3 (Figure 7).

As a result, the desired uniform distribution was obtained (Figure 8) with the value $M = 3.54$.

Case $Q = 16$: The picture is similar to that observed in the case of $Q = 10$. The configurations in Figure 6b still produce the spike in the upper-left node of Figure 2a. Note that during the first division, subsets with a cardinality of $7 < Q / 2$, $Q = 16$, are selected for this and the lower-right nodes in this case. The applied correction principle is the same as before: we adjust the cardinality of the selected subsets in the upper-left node to the level $Q / 2$ adopted for the case $Q = 10$, correcting the representation of the digit 3 (Figure 9).

This ensured the required uniform distribution (Figure 10) with a value of $M = 4.34$.

Practical significance of the obtained M values. The mathematical expectation M of stored etalon bits plays a dual role in the associative protection system, simultaneously affecting recognition reliability and steganographic concealment. We now discuss the practical implications of the values $M = 3.54$ (for $Q = 10$) and $M = 4.34$ (for $Q = 16$) obtained with the proposed configurations.

From the standpoint of recognition reliability, M represents the average number of bits per etalon that survive the masking process and are available for pattern matching during decoding. A higher M increases the probability of correct identification of each digit, since more reference bits are preserved in the container. However, this advantage is bounded: beyond a certain threshold, additional preserved bits yield diminishing returns in recognition accuracy while increasing the system’s vulnerability to detection. For the decimal case, the obtained value $M = 3.54$ lies close to the theoretical minimum $M_{\min} = 3.4$, meaning that each etalon retains on average only 3.54 bits out of the 15 positions in the $n = 3$ outline. This is sufficient for unambiguous identification of each of the $Q = 10$ digits: a fixed-length code would need $\lceil \log_{2} 10 \rceil = 4$ bits, but the number of preserved bits varies from etalon to etalon, so the average can fall below 4 while remaining above $\log_{2} 10 \approx 3.32$. The information footprint is thereby kept minimal. For the hexadecimal case, $M = 4.34$ is likewise close to $M_{\min} = 4.0$ and exceeds the information-theoretic minimum of $\lceil \log_{2} 16 \rceil = 4$ bits required to distinguish 16 symbols.

From the standpoint of steganographic security, the critical parameter is not M alone but the ratio $L / (k \times M)$, which characterizes the degree to which the preserved bits are "diluted" within the pseudorandom container. For $n = 30$ and $k = 3$, the container length is $L = 3 \times (9 \times 30 - 12) = 774$ bits per code. The total number of embedded bits per code is $k \times M$, yielding embedding ratios of $774 / (3 \times 3.54) \approx 72.9$ for $Q = 10$ and $774 / (3 \times 4.34) \approx 59.4$ for $Q = 16$. In both cases, the preserved information constitutes less than $2\%$ of the container, making statistical detection by an adversary substantially more difficult [3]. A lower M improves this ratio, enhancing steganographic concealment; a higher M degrades it.
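The arithmetic behind these ratios can be reproduced with a few lines of Python. This is only a sketch: the function names are introduced here for illustration, and the general form $L = k \times (9n - 12)$ is inferred from the single instance given in the text ($n = 30$, $k = 3$) rather than stated by it.

```python
def container_length(n: int, k: int) -> int:
    """Container length in bits per code, assuming L = k * (9n - 12)."""
    return k * (9 * n - 12)


def embedding_ratio(n: int, k: int, m: float) -> float:
    """Dilution ratio L / (k * M): container bits per preserved etalon bit."""
    return container_length(n, k) / (k * m)


if __name__ == "__main__":
    n, k = 30, 3
    for q, m in ((10, 3.54), (16, 4.34)):
        ratio = embedding_ratio(n, k, m)
        share = 100.0 / ratio  # preserved bits as a percentage of the container
        print(f"Q={q}: L={container_length(n, k)}, ratio={ratio:.1f}, share={share:.2f}%")
```

Because k appears in both the numerator and the denominator, the ratio under this assumed form depends only on n and M, which is why a smaller M directly improves concealment.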

The transition from $Q = 10$ to $Q = 16$ increases M by approximately $23\%$ (from 3.54 to 4.34), which moderately reduces the embedding ratio. However, this cost is offset by three significant gains: the key space expands from $10!$ to $16!$ (an increase by a factor of approximately $5.77 \times 10^{6}$), the addressable name space grows from $10^{3} = 1000$ to $16^{3} = 4096$, and the distribution uniformity is preserved. From an engineering perspective, the slight increase in M represents an acceptable trade-off.
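The figures quoted for this trade-off follow from elementary arithmetic, which the sketch below reproduces. The helper names and the `digits=3` default (three symbols per name, matching the exponents $10^{3}$ and $16^{3}$) are assumptions made for illustration.

```python
from math import factorial


def key_space_growth(q_old: int, q_new: int) -> int:
    """Factor by which the number of permutation keys grows: q_new! / q_old!."""
    return factorial(q_new) // factorial(q_old)


def name_space(q: int, digits: int = 3) -> int:
    """Number of distinct names encodable with `digits` symbols in base q."""
    return q ** digits


def relative_increase(old: float, new: float) -> float:
    """Relative growth (new - old) / old."""
    return (new - old) / old


if __name__ == "__main__":
    print(key_space_growth(10, 16))                   # 16!/10!
    print(name_space(10), name_space(16))             # name space for Q=10 and Q=16
    print(f"{relative_increase(3.54, 4.34):.1%}")     # growth of M
```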

Regarding acceptable thresholds, the lower bound is $M_{min}$, below which the masking process cannot produce a valid key for all etalons. The upper bound is less rigid and depends on the application: for high-security scenarios where steganographic concealment is paramount, one should aim for M values as close to $M_{min}$ as possible; for scenarios prioritizing recognition robustness (e.g., in noisy channels), a moderately higher M may be preferable. The values obtained in this work (about $4\%$ above $M_{min}$ for $Q = 10$ and about $8.5\%$ above it for $Q = 16$) fall well within the range where both security and reliability requirements are satisfied for typical applications.
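The distance of each obtained M from its lower bound, and the fixed-length bit count it is compared against, can be checked as follows. This is a minimal sketch with hypothetical helper names; the input values are the ones quoted in the text.

```python
from math import ceil, log2


def bits_to_distinguish(q: int) -> int:
    """Fixed-length bit count needed to tell q symbols apart: ceil(log2 q)."""
    return ceil(log2(q))


def excess_over_minimum(m: float, m_min: float) -> float:
    """Relative excess of M over the theoretical minimum M_min."""
    return (m - m_min) / m_min


if __name__ == "__main__":
    for q, m, m_min in ((10, 3.54, 3.4), (16, 4.34, 4.0)):
        print(f"Q={q}: bits={bits_to_distinguish(q)}, "
              f"excess={excess_over_minimum(m, m_min):.1%}")
```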

Applicability boundaries and failure analysis of the proposed rule: As stated above, the proposed rule formulates necessary but not sufficient conditions for achieving a uniform distribution of bit inclusions. We now discuss in greater detail the practical boundaries of this rule, the conditions under which it may fail, and the validation steps that should accompany its application.

The rule requires that the cardinalities of the subsets formed during the first dichotomous partition at each nodal point be close to $Q / 2$. However, this condition alone does not fully determine the distribution behaviour, because the subsequent recursive partitions may introduce secondary non-uniformities that are not captured by the first-level analysis. This is precisely the situation that arose with the initial configurations in Figure 6: the rule's conditions were satisfied at all nodal points, yet a spike persisted at the upper-left node. The cause was traced to a subtle asymmetry: at the upper-left node the first partition produced a subset of cardinality 4 (for $Q = 10$), while at the lower-right node the partition divided the complete set exactly in half. The correction required modifying the representation of digit 3 (Figure 7 and Figure 9) to align the partition cardinalities at both nodes.

This example serves as a counterexample to the sufficiency of the rule in its original formulation and illustrates a general pattern: when two nodal points undergo first partitions with different subset cardinalities (even if both are individually within the range specified by the rule), the resulting distribution may still exhibit local non-uniformity. Therefore, a stricter interpretation of the rule should require not only that the cardinalities at each nodal point lie within the prescribed range, but also that the cardinalities be consistent across all nodal points of the same type.
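The stricter interpretation amounts to a two-part check, sketched below. The function name, the node labels and the `tolerance` parameter are assumptions introduced here for illustration; the cardinalities in the usage example are the ones discussed above (4 at the upper-left node and 5 at the lower-right node for $Q = 10$). The caller is expected to pass only nodal points of the same type.

```python
def check_partition_rule(cardinalities: dict[str, int], q: int,
                         tolerance: int = 1) -> dict[str, bool]:
    """Check first-partition subset sizes against the original and stricter rule.

    cardinalities: subset size produced by the first dichotomous partition
                   at each nodal point (node labels are hypothetical).
    tolerance:     assumed meaning of "close to Q/2"; not taken from the paper.
    """
    target = q / 2
    # Original rule: every cardinality lies near Q/2.
    in_range = all(abs(c - target) <= tolerance for c in cardinalities.values())
    # Stricter reading: the cardinalities also agree with each other.
    consistent = len(set(cardinalities.values())) <= 1
    return {"in_range": in_range, "consistent": consistent}


if __name__ == "__main__":
    before = {"upper_left": 4, "lower_right": 5}   # initial configuration
    after = {"upper_left": 5, "lower_right": 5}    # after correcting digit 3
    print(check_partition_rule(before, q=10))
    print(check_partition_rule(after, q=10))
```

With the initial cardinalities the range condition holds but the consistency condition does not, which mirrors the counterexample described in the text.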

More broadly, the rule may fail to guarantee uniformity in the following scenarios: (a) when the etalon configurations produce partition cardinalities that satisfy the range condition but differ significantly between nodal points, as demonstrated above; (b) when the number system base Q is such that $Q / 2$ is not an integer (i.e., Q is odd), making exact halving impossible and requiring a relaxed cardinality criterion; or (c) when the etalon shapes introduce correlations between bit values at non-nodal positions that propagate through deeper recursion levels in ways not predicted by the first-partition analysis.

Given these limitations, the following post hoc validation procedure is recommended for any new set of etalon configurations:

1. Compute the distribution function of bit inclusions over the complete permutation set (for $Q = 10$) or the test permutation list (for $Q = 16$ and larger), using the masking algorithm described in Appendix A, and verify the absence of spikes at all nodal points.
2. Compare the obtained value of M with the theoretical minimum $M_{min}$ for the given Q to ensure that the reconfiguration has not significantly increased the average number of inclusions.

These two checks are computationally inexpensive relative to the design process itself and provide a practical safeguard against over-reliance on the heuristic rule.
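One way to automate the two checks is sketched below. The sketch assumes the distribution function is already available as a list of inclusion counts per outline position; the spike criterion (a count exceeding `spike_factor` times the median) and the `max_excess` threshold are illustrative assumptions, not values taken from this work, and the counts in the usage example are toy numbers rather than measured data.

```python
from statistics import median


def find_spikes(counts: list[int], spike_factor: float = 1.5) -> list[int]:
    """Return positions whose count exceeds spike_factor * median (assumed criterion)."""
    baseline = median(counts)
    return [i for i, c in enumerate(counts) if c > spike_factor * baseline]


def validate_configuration(counts: list[int], m: float, m_min: float,
                           spike_factor: float = 1.5,
                           max_excess: float = 0.10) -> dict:
    """Run both post hoc checks: no spikes, and M not far above M_min."""
    spikes = find_spikes(counts, spike_factor)
    excess = (m - m_min) / m_min
    return {
        "spike_positions": spikes,
        "m_excess": excess,
        "passed": not spikes and excess <= max_excess,
    }


if __name__ == "__main__":
    toy_uniform = [100, 102, 98, 99, 101]   # toy data, not from the experiment
    toy_spiked = [100, 102, 98, 250, 101]   # toy data with one artificial spike
    print(validate_configuration(toy_uniform, m=4.34, m_min=4.0))
    print(validate_configuration(toy_spiked, m=4.34, m_min=4.0))
```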

5. Conclusions

The methodology employed in this article consists of establishing facts through computational experiments, hypothesizing possible causes of failures and ways to overcome them, and verifying them as needed. It is not strictly inductive. Nevertheless, it was possible to extrapolate the results of the analysis of the simplest case $n = 3$ to arbitrary values of n.

The rule formulated in this article and its robust verification provide a sound basis for choosing digital etalon representations when scene data are encoded under associative protection. Readers who find the configurations in Figure 7 and Figure 9 aesthetically less appealing can look for their own forms that satisfy this rule.

The results obtained in this work open several concrete directions for further research:

1. Comparative evaluation of the key parameters of associative protection for $Q = 10$ and $Q = 16$ with the proposed etalon forms: The present article established the etalon configurations and confirmed the uniformity of the distribution function for both number system bases. The natural next step is a systematic comparison of the principal security and efficiency parameters (including the mathematical expectation M, the steganographic embedding ratio $L / (k \times M)$, the usable key space, and the recognition error rate) between the proposed configurations and those previously obtained in [1]. Such a comparison will quantify the practical gains from the transition to the hexadecimal system and provide the basis for selecting the appropriate value of Q depending on the application requirements.
2. Determining the advisability of using a particular value of Q for text data protection: While the present work focuses on scene analysis (cartographic and coordinate data), the associative protection mechanism is potentially applicable to other data types. For text data, the choice of Q must account for the alphabet size, character frequency distributions, and the resulting embedding ratios. Investigating whether $Q = 16$ (or other values) offers advantages over $Q = 10$ in the text protection scenario requires dedicated analysis of the interplay between the number system base, the etalon configurations, and the statistical properties of natural language texts.
3. Investigation of the practical achievability of unconditional steganographic security: The embedding ratios obtained in this work (less than $2\%$ of the container for $n = 30$) suggest that the method may approach the regime of provably undetectable steganographic communication. A formal analysis within established steganalytic frameworks, such as KL-divergence-based detectability bounds [21] or hypothesis testing formulations, would determine whether the current parameter ranges provide unconditional (information-theoretic) steganographic security or whether additional constraints on n, k, or the pseudorandom generator are required.
4. Development of certified protection systems for cartographic and text data: The ultimate practical objective of this line of research is the creation of certified systems implementing associative protection for operational use. This requires not only the algorithmic advances described above, but also formal specification of the security properties, compliance with relevant cryptographic standards, and integration with existing data transmission infrastructure for cartographic and text information.

Beyond these planned directions, we identify two additional theoretical objectives that would strengthen the foundations of the method. First, a formal proof (or disproof) of the sufficiency of the proposed rule—moving from the current empirically validated heuristic to a theorem with provable guarantees—would significantly elevate the theoretical contribution. Second, a systematic comparison of associative protection with classical cryptographic and steganographic methods (such as AES-based encryption combined with LSB embedding or syndrome coding) in terms of security strength, embedding capacity, and computational cost would position the method within the broader landscape of information security techniques and clarify its comparative advantages and limitations.

Author Contributions

Conceptualization, V.R.; methodology, V.R.; software, R.G. and A.B.; validation, V.R., R.G. and A.B.; formal analysis, V.R. and R.G.; investigation, V.R. and R.G.; resources, R.G. and A.B.; data curation, V.R.; writing—original draft preparation, V.R.; writing—review and editing, V.R. and R.G.; visualization, V.R., R.G. and A.B.; supervision, V.R.; project administration, V.R. and R.G. All authors have read and agreed to the published version of the manuscript.

Funding

This work/publication was funded by a grant from the Academy of Sciences of the Republic of Tatarstan provided to higher education institutions, scientific and other organizations to support human resource development plans in terms of encouraging their research and academic staff to defend doctoral dissertations and conduct research activities. (Agreement No. 15/2025-PD-KAI dated 22 December 2025).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

The authors express their sincere gratitude to V.K. Levin for his support of the research on associative data protection.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Figure A1. Presentation of the new software implementation of the masking algorithm in the form of an activity diagram in UML.

References

1. Raikhlin, V.A.; Vershinin, I.S.; Gibadullin, R.F. The Elements of Associative Steganography Theory. Mosc. Univ. Comput. Math. Cybern. 2019, 43, 40–46.
2. Vershinin, I.S.; Gibadullin, R.F.; Raikhlin, V.A. State of research in the field of associative data protection. Uchenye Zap. Kazan. Univ. Seriya Fiz.-Mat. Nauk. 2025, 167, 413–436. (In Russian)
3. Duda, R.O.; Hart, P.E. Pattern Classification and Scene Analysis; Wiley-Interscience: New York, NY, USA, 1973.
4. Hu, J.; Guo, P. Spatial local binary patterns for scene image classification. In Proceedings of the 2012 6th International Conference on Sciences of Electronics, Technologies of Information and Telecommunications (SETIT), Sousse, Tunisia, 21–24 March 2012; pp. 326–330.
5. Ker, A.D. A capacity result for batch steganography. IEEE Signal Process. Lett. 2007, 14, 525–528.
6. Matsumoto, M.; Saito, M.; Nishimura, T.; Hagita, M. CryptMT3 Stream Cipher. In New Stream Cipher Designs. The eSTREAM Finalists; Robshaw, M., Billet, O., Eds.; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2008; Volume 4986, pp. 7–19.
7. Sharipov, B.R.; Perukhin, M.Y.; Mullayanov, B.I. Statistical Analysis of Pseudorandom Sequences and Stegocontainers. In Proceedings of the 2021 International Conference on Industrial Engineering, Applications and Manufacturing (ICIEAM), Sochi, Russia, 17–21 May 2021; pp. 434–439.
8. Fisher, R.A.; Yates, F. Statistical Tables for Biological, Agricultural and Medical Research, 3rd ed.; Oliver & Boyd: London, UK, 1948.
9. Febriani, I.; Ekawati, R.; Supriadi, U.; Abdullah, M.I. Fisher-Yates shuffle algorithm for randomization math exam on computer based-test. In Proceedings of the AIP Conference Proceedings, April 2021; AIP Publishing LLC: New York, NY, USA, 2021; Volume 2331, p. 060015.
10. Nicolis, G.; Prigogine, I. Exploring Complexity: An Introduction; W.H. Freeman: New York, NY, USA, 1989.
11. Nikitin, E.P. Ob”yasnenie—Funktsiya Nauki (Explanation as a Function of Science); Nauka: Moscow, Russia, 1970. (In Russian)
12. Anderson, R.J.; Petitcolas, F.A.P. On the limits of steganography. IEEE J. Sel. Areas Commun. 1998, 16, 474–481.
13. Liu, S.; Ma, L.; Yao, H.; Zhao, D. Universal Steganalysis Based on Statistical Models Using Reorganization of Block-based DCT Coefficients. In Proceedings of the 2009 Fifth International Conference on Information Assurance and Security (IAS), Xi’an, China, 18–20 August 2009; pp. 778–781.
14. Li, Q.; Shao, Z.; Tan, S.; Zeng, J.; Li, B. Non-structured Pruning for Deep-learning based Steganalytic Frameworks. In Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Lanzhou, China, 18–21 November 2019; pp. 1735–1739.
15. Rosenberg, J. A methodology for evaluating predictive metrics. In Proceedings of the Fifth International Software Metrics Symposium (METRICS), Bethesda, MD, USA, 20–21 November 1998; p. 181.
16. Sabouri, A. BitVector: A Memory-Efficient and High-Performance Struct for Working with Individual Bits in .NET Applications. Available online: https://github.com/alirezanet/BitVector (accessed on 19 December 2025).
17. Nadel, A.; Ryvchin, V. Bit-Vector Optimization. In Tools and Algorithms for the Construction and Analysis of Systems. TACAS 2016; Chechik, M., Raskin, J.-F., Eds.; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2016; Volume 9636, pp. 585–600.
18. Vershinin, I.S.; Gibadullin, R.F.; Pystogov, S.V.; Raikhlin, V.A. Associative Steganography. Durability of Associative Protection of Information. Lobachevskii J. Math. 2020, 41, 440–450.
19. Vasilchikov, V.V. On the recursive-parallel programming for the .NET framework. Autom. Control Comput. Sci. 2014, 48, 575–580.
20. Raikhlin, V.A.; Gibadullin, R.F.; Vershinin, I.S. Is It Possible to Reduce the Sizes of Stegomessages in Associative Steganography? Lobachevskii J. Math. 2022, 43, 455–462.
21. Cachin, C. An Information-Theoretic Model for Steganography. In Information Hiding. IH 1998; Aucsmith, D., Ed.; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 1998; Volume 1525, pp. 306–318.

Figure 1. The set of postal symbols.

Figure 2. Positions of binary units in matrix-etalons for $n = 3$ (a) and $n = 7$ (b).

Figure 3. Total distribution of significant bits for $10^{4}$ keys according to the outline of the etalon at $Q = 10$, $n = 30$.

Figure 4. Total distribution of significant bits for $10!$ keys according to the etalon outline at $Q = 10$, $n = 30$.

Figure 5. Total distribution of significant bits for a set of keys according to the etalon outline at $Q = 10$, $n = 30$ obtained by enumerating the test list of permutations.

Figure 6. The proposed etalon configurations of sets $\overline{0, 9}$ (a) and $\overline{0, 15}$ (b).

Figure 7. Found labels of the set $\overline{0, 9}$.

Figure 8. Distribution of significant bits for the found etalon labels at $Q = 10$, $n = 30$ and $10!$ permutations.

Figure 9. Refined etalon labels of the set $\overline{0, 15}$.

Figure 10. Distribution of significant bits for refined etalon configurations at $Q = 16$, $n = 30$ obtained through enumeration of the test list of permutations.

Table 1. The complete set of permutations in a sequence of 5 elements.

(E D) C B A	(D C) B A E	(C B) A E D	(B A) E D C	(A E) D C B
(E D) C A B	(D C) B E A	(C B) A D E	(B A) E C D	(A E) D B C
(E D) B A C	(D C) A E B	(C B) E D A	(B A) D C E	(A E) C B D
(E D) B C A	(D C) A B E	(C B) E A D	(B A) D E C	(A E) C D B
(E D) A C B	(D C) E B A	(C B) D A E	(B A) C E D	(A E) B D C
(E D) A B C	(D C) E A B	(C B) D E A	(B A) C D E	(A E) B C D
(E C) B A D	(D B) A E C	(C A) E D B	(B E) D C A	(A D) C B E
(E C) B D A	(D B) A C E	(C A) E B D	(B E) D A C	(A D) C E B
(E C) A D B	(D B) E C A	(C A) D B E	(B E) C A D	(A D) B E C
(E C) A B D	(D B) E A C	(C A) D E B	(B E) C D A	(A D) B C E
(E C) D B A	(D B) C A E	(C A) B E D	(B E) A D C	(A D) E C B
(E C) D A B	(D B) C E A	(C A) B D E	(B E) A C D	(A D) E B C
(E B) A D C	(D A) E C B	(C E) D B A	(B D) C A E	(A C) B E D
(E B) A C D	(D A) E B C	(C E) D A B	(B D) C E A	(A C) B D E
(E B) D C A	(D A) C B E	(C E) B A D	(B D) A E C	(A C) E D B
(E B) D A C	(D A) C E B	(C E) B D A	(B D) A C E	(A C) E B D
(E B) C A D	(D A) B E C	(C E) A D B	(B D) E C A	(A C) D B E
(E B) C D A	(D A) B C E	(C E) A B D	(B D) E A C	(A C) D E B
(E A) D C B	(D E) C B A	(C D) B A E	(B C) A E D	(A B) E D C
(E A) D B C	(D E) C A B	(C D) B E A	(B C) A D E	(A B) E C D
(E A) C B D	(D E) B A C	(C D) A E B	(B C) E D A	(A B) D C E
(E A) C D B	(D E) B C A	(C D) A B E	(B C) E A D	(A B) D E C
(E A) B D C	(D E) A C B	(C D) E B A	(B C) D A E	(A B) C E D
(E A) B C D	(D E) A B C	(C D) E A B	(B C) D E A	(A B) C D E

Table 2. Formation of the test list of permutations for $r = 4$.

(D C) (B A)	(C B) (A D)	(B A) (D C)	(A D) (C B)
C D B A	B C A D	A B D C	D A C B
D C A B	C B D A	B A C D	A D B C

Table 3. Formation of the test list of permutations for $r = 6$.

(F E D) (C B A)	(E D C) (B A F)	(D C B) (A F E)	(C B A) (F E D)	(B A F) (E D C)	(A F E) (D C B)
E D F C B A	D C E B A F	C B D A F E	B A C F E D	A F B E D C	F E A D C B
D F E C B A	C E D B A F	B D C A F E	A C B F E D	F B A E D C	E A F D C B
F D E C B A	E C D B A F	D B C A F E	C A B F E D	B F A E D C	A E F D C B
E F D C B A	D E C B A F	C D B A F E	B C A F E D	A B F E D C	F A E D C B
D E F C B A	C D E B A F	B C D A F E	A B C F E D	F A B E D C	E F A D C B
F E D B A C	E D C A F B	D C B F E A	C B A E D F	B A F D C E	A F E C B D
F E D A C B	E D C F B A	D C B E A F	C B A D F E	B A F C E D	A F E B D C
F E D C A B	E D C B F A	D C B A E F	C B A F D E	B A F E C D	A F E D B C
F E D B C A	E D C A B F	D C B F A E	C B A E F D	B A F D E C	A F E C D B
F E D A B C	E D C F A B	D C B E F A	C B A D E F	B A F C D E	A F E B C D
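
The pattern visible in Tables 2 and 3 can be reproduced programmatically. The sketch below is inferred from the table contents alone and is not the authors' implementation: each column starts from a cyclic left shift of the initial sequence, followed by every rearrangement of the left half with the right half fixed, and then every rearrangement of the right half with the left half fixed. The function name `build_test_list` is hypothetical, and the order of entries within a column may differ from the tables.

```python
from itertools import permutations


def build_test_list(symbols: str) -> list[str]:
    """Build a limited test list of permutations in the pattern of Tables 2 and 3.

    Assumes an even number of distinct symbols, e.g. "DCBA" or "FEDCBA".
    """
    r = len(symbols)
    if r % 2 != 0 or len(set(symbols)) != r:
        raise ValueError("expected an even number of distinct symbols")
    half = r // 2
    result = []
    for shift in range(r):
        # Column header: cyclic left shift of the starting sequence.
        base = symbols[shift:] + symbols[:shift]
        left, right = base[:half], base[half:]
        result.append(base)
        # Rearrange the left half while the right half stays fixed.
        result += ["".join(p) + right for p in permutations(left) if "".join(p) != left]
        # Rearrange the right half while the left half stays fixed.
        result += [left + "".join(p) for p in permutations(right) if "".join(p) != right]
    return result


if __name__ == "__main__":
    print(len(build_test_list("DCBA")))    # 12 entries, as in Table 2
    print(len(build_test_list("FEDCBA")))  # 66 entries, as in Table 3
```

For $r = 6$ this yields 66 permutations instead of the full $6! = 720$, which illustrates how the limited list keeps enumeration tractable as r grows.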

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
