Article

Roots of Binary Shuffle Squares

Institute of Mathematics, Silesian University of Technology, 44-100 Gliwice, Poland
*
Author to whom correspondence should be addressed.
Symmetry 2025, 17(2), 305; https://doi.org/10.3390/sym17020305
Submission received: 3 January 2025 / Revised: 28 January 2025 / Accepted: 10 February 2025 / Published: 17 February 2025
(This article belongs to the Special Issue Symmetry in Numerical Analysis and Applied Mathematics)

Abstract

A square is a word of the form $XX$, where $X$ is any finite non-empty word. For example, couscous is a square. A shuffle square is a finite word that can be formed by self-shuffling a word; for instance, the Spanish word acaece is a shuffle square but not a square. We discuss both known and novel enumerative problems related to shuffle squares, with a focus on the number of distinct roots of binary shuffle squares. We introduce the term explicit shuffle squares, propose several conjectures, and present some preliminary results towards their resolution. Our discussion is supported by computational experiments. In particular, we determine the exact number of distinct roots of binary shuffle squares with a length of up to 24. On the other hand, we show that every non-constant binary word of length n generates at least n different shuffle squares.

1. Introduction

Symmetry plays a significant role in the combinatorics of words, as it provides a deeper understanding of structural properties and simplifies the enumeration of patterns within sequences. By examining symmetries in word structures, we can identify invariants under specific transformations, such as reversals, rotations, or permutations of letters. In this paper, we focus on shuffle squares—a type of word characterized by a hidden symmetric structure.
Throughout this paper, we consider all words to be finite. Let $\mathcal{A}$ be a fixed alphabet, and let W be a word over $\mathcal{A}$. The number of letters in W is called the length of W. Words P, U, and S (each possibly empty) such that $W = PUS$ are called factors of W. Moreover, P is called a prefix of W, and S is called a suffix of W.
A tangram is a word in which every letter occurs an even number of times. A special type of tangram is a square, i.e., a word W such that $W = XX$ for some word X. If $W = XX$, then X is called the root of the square W, and we say that X generates W. For example, the word hots generates the square hotshots.
A shuffle square is a word W obtained by self-shuffling some finite word X; here, X is, again, called a root of the shuffle square W, and we say that X generates W. For instance, the word tuteurer is a self-shuffle of tuer. This can be visualized by marking the letters of one copy of the root (underlined here) and decomposing the shuffle square into two copies of the root:
$$\underline{\text{tu}}\,\text{t}\,\underline{\text{e}}\,\text{u}\,\underline{\text{r}}\,\text{er} \;\longrightarrow\; \underline{\text{tuer}},\ \text{tuer}.$$
More formally, for $n \ge 1$, the word $W = w_1 w_2 \cdots w_{2n}$ is a shuffle square if there exist index sets
$$I = \{ i_1, i_2, \ldots, i_n \}, \quad J = \{ j_1, j_2, \ldots, j_n \}$$
such that
$$i_1 < i_2 < \cdots < i_n, \quad j_1 < j_2 < \cdots < j_n,$$
$I \cap J = \emptyset$, and $w_{i_r} = w_{j_r}$ for $1 \le r \le n$.
Clearly, every shuffle square is a tangram.
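The index-set definition above can be checked literally on short words. The following is a minimal sketch, not the authors' software: the function name `is_shuffle_square` is hypothetical, and the search over all index sets is exponential, so it is only meant for illustration.

```python
from itertools import combinations


def is_shuffle_square(word):
    """Check the index-set definition directly (exponential time, short words only)."""
    length = len(word)
    if length == 0 or length % 2 == 1:
        return False  # the definition requires length 2n with n >= 1
    n = length // 2
    for first in combinations(range(length), n):  # candidate index set I (increasing)
        in_first = set(first)
        # J must be the complement of I, listed in increasing order.
        second = [p for p in range(length) if p not in in_first]
        if all(word[i] == word[j] for i, j in zip(first, second)):
            return True
    return False


print(is_shuffle_square("tuteurer"))  # the example word from the text
print(is_shuffle_square("0110"))      # a tangram that is not a shuffle square
```

Since $|I| = |J| = n$ and the two sets are disjoint, $J$ is forced to be the complement of $I$, which is why the sketch only enumerates $I$.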
Let a be a letter and k be a positive integer. We use the notation $a^k$ for the word that is a concatenation of k copies of a:
$$a^k = \underbrace{aa \cdots a}_{k}.$$
The word $a^k$ is called a constant word of length k.
Words that are squares were defined by Thue in 1906 in a study [1] that is considered today to be a pioneering paper in the field of combinatorics on words [2]. Shuffle squares are relatively new; they were introduced by Henshall, Rampersad, and Shallit [3] in 2012. Henshall et al. focus on the enumeration of shuffle squares of a fixed length over a given alphabet. Let us note that counting squares of a given length over a fixed alphabet is very easy: the number of squares of length $2n$ over a k-letter alphabet is $k^n$ for every positive n and k. However, to this day, the number, $f(n)$, of binary shuffle squares of length $2n$ is known only for $n \le 19$, as shown in Table 1 (OEIS sequence A191755 [4]).
A general formula for the function $f$ (the number of binary shuffle squares of length $2n$, see Table 1) or any of its generalizations over larger alphabets is not known. Some results from the study of these functions were obtained by He, Huang, Nam, and Thaper [5]; they demonstrated that the number of binary shuffle squares is at least $\binom{2n}{n}$ and that the number of shuffle squares over a k-letter alphabet is
$$\frac{1}{n+1}\binom{2n}{n}k^{n} - \binom{2n-1}{n+1}k^{n-1} + O_n\left(k^{n-2}\right),$$
where $2n$ is the length under consideration. He et al. also proposed an intriguing conjecture: almost every binary tangram is a shuffle square, meaning that the probability of selecting a shuffle square uniformly at random from all tangrams of length $2n$ tends to 1 as $n \to \infty$. A related result was obtained by Axenovich, Person, and Puzynina [6], who showed that every binary word of length n is a shuffle square up to the deletion of $o(n)$ letters; the precise asymptotic formula for the maximum number of deletable letters is not known. Most recently, Basu and Ruciński showed [7] that there exists a ternary word requiring the deletion of at least Ω ( log 2 n ) letters to become a shuffle square. Moreover, some properties of binary shuffle squares were recently presented by Fici [8], and some generalizations of shuffle squares were introduced by Grytczuk, Pawlik, and Pleszczyński [9,10].
Buss and Soltys [11] proved that recognizing shuffle squares is NP-complete for every sufficiently large alphabet. Recently, Bulteau and Vialette [12] improved upon this result and showed that this statement also holds for a binary alphabet.
In this paper, we focus on certain heuristics related to the combinatorial properties of binary shuffle squares, specifically investigating the relationship between the number of shuffle squares generated by given roots and, conversely, the number of distinct roots of a given shuffle square.
Note that the constant word 000 generates only one shuffle square, 000000, while the word 011 generates three shuffle squares: 011011, 010111, and 001111. Exemplary decompositions of these shuffle squares (with the letters of one copy of the root underlined) are
$$\underline{011}\,011, \quad \underline{01}\,0\,\underline{1}\,11, \quad \underline{0}\,0\,\underline{11}\,11.$$
On the other hand, the shuffle square 01010011 has only one root, namely 0101, while 00110011 has two distinct roots, 0011 and 0101; this can be demonstrated with the following decompositions:
$$\underline{0011}\,0011, \quad \underline{0}\,0\,\underline{1}\,1\,\underline{0}\,0\,\underline{1}\,1.$$
We investigate the minimal and maximal results in both scenarios, as well as propose some observations and conjectures.
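The roots of a given word can be enumerated by a left-to-right scan. The sketch below is an illustration under stated assumptions rather than the method used in the paper: the name `roots` is hypothetical, and it relies on the observation that the two copies can always be relabelled so that the r-th letter of one copy precedes the r-th letter of the other.

```python
from functools import lru_cache


def roots(word):
    """Return the set of distinct roots of `word` (empty if it is not a shuffle square)."""

    @lru_cache(maxsize=None)
    def rest_of_root(pos, pending):
        # `pending` holds letters already given to the leading copy
        # that the trailing copy still has to repeat, in order.
        if pos == len(word):
            return frozenset({""}) if not pending else frozenset()
        letter = word[pos]
        found = set()
        # Option 1: the letter extends the leading copy
        # (only if enough letters remain to repeat everything pending).
        if len(pending) + 1 <= len(word) - pos - 1:
            found.update(letter + tail
                         for tail in rest_of_root(pos + 1, pending + letter))
        # Option 2: the letter is the next one expected by the trailing copy.
        if pending and pending[0] == letter:
            found.update(rest_of_root(pos + 1, pending[1:]))
        return frozenset(found)

    return set(rest_of_root(0, ""))


print(sorted(roots("01010011")))  # one root according to the text
print(sorted(roots("00110011")))  # two roots according to the text
```

The memoisation on `(pos, pending)` keeps the search manageable for short binary words; the worst case is still exponential, which is in line with the hardness results cited above.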

2. Searching for Shuffle Squares for Given Roots

From now on, we consider only words over the binary alphabet $\{0, 1\}$. Let $\bar{x}$ be the letter different from $x$, i.e., $\bar{0} = 1$ and $\bar{1} = 0$.
Let us notice that every binary word generates a shuffle square.
We characterise the number of shuffle squares generated by any binary word with exactly one letter 1 :
Proposition 1.
Let $a, b$ be non-negative integers. The word $0^a 1 0^b$ generates exactly $(a+1)(b+1)$ different shuffle squares.
Proof. 
Let us notice that every shuffle square S generated by the word $0^a 1 0^b$ has the prefix $0^a$ and the suffix $0^b$. Thus,
$$S = 0^a S_1 S_2 0^b,$$
where $S_1$ and $S_2$ are words of lengths $(a+1)$ and $(b+1)$, respectively.
Notice that $S_1$ must contain exactly one 1, as the first occurrence of 1 in S cannot appear after more than $2a$ occurrences of 0. Similarly, $S_2$ must also contain exactly one 1, since the second occurrence of 1 cannot be followed by more than $2b$ occurrences of 0. Therefore, the number of possible positions for the first 1 is $(a+1)$, and the number of possible positions for the second 1 is $(b+1)$. Consequently, the word $0^a 1 0^b$ generates $(a+1)(b+1)$ distinct shuffle squares. □
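Proposition 1 can be cross-checked by brute force for small exponents. This is a minimal sketch with hypothetical helper names (`shuffle_squares`, `check_proposition_1`); it simply builds every interleaving of a root with itself and counts the distinct results.

```python
from functools import lru_cache


def shuffle_squares(root):
    """Return the set of all distinct words obtained by shuffling `root` with itself."""
    n = len(root)

    @lru_cache(maxsize=None)
    def interleave(i, j):
        # All words obtained by interleaving root[i:] with root[j:].
        if i == n and j == n:
            return frozenset({""})
        words = set()
        if i < n:
            words.update(root[i] + tail for tail in interleave(i + 1, j))
        if j < n:
            words.update(root[j] + tail for tail in interleave(i, j + 1))
        return frozenset(words)

    return set(interleave(0, 0))


def check_proposition_1(max_exponent):
    """Compare the brute-force count with (a + 1)(b + 1) for all a, b <= max_exponent."""
    for a in range(max_exponent + 1):
        for b in range(max_exponent + 1):
            root = "0" * a + "1" + "0" * b
            if len(shuffle_squares(root)) != (a + 1) * (b + 1):
                return False
    return True


print(check_proposition_1(4))
print(sorted(shuffle_squares("011")))
```

The same helper reproduces the introductory example: the word 011 yields the three shuffle squares listed in Section 1.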
Of course, a constant word generates only one shuffle square. The following proposition shows that no other word has this property.
Proposition 2.
Every non-constant binary word W generates at least two different shuffle squares.
Proof. 
Without loss of generality, let us assume that 0 is a prefix of W. Then, there exist $k \ge 1$ and a (possibly empty) word S such that
$$W = \underbrace{00 \cdots 0}_{k}\,1\,S.$$
Let us note that W generates both shuffle squares
$$S_1 = \underbrace{00 \cdots 0}_{k}\,1\,S\,\underbrace{00 \cdots 0}_{k}\,1\,S$$
and
$$S_2 = \underbrace{00 \cdots 0}_{2k}\,1\,S\,1\,S,$$
which are different because they have different prefixes of length $(k+1)$. □
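As a small worked instance of this construction (the concrete word is chosen here only for illustration), take $k = 2$ and $S = 0$:

```latex
W = 0^{2}\,1\,S = 0010, \qquad S = 0,\\
S_1 = 0^{2}\,1\,S\;0^{2}\,1\,S = 0010\,0010,\\
S_2 = 0^{4}\,1\,S\,1\,S = 0000\,1\,0\,1\,0 = 00001010.
```

The prefixes of length $k + 1 = 3$ are $001$ for $S_1$ and $000$ for $S_2$, so the two shuffle squares differ. In $S_2$, one copy of $W$ can be read from positions 1, 2, 5, 6 and the other from positions 3, 4, 7, 8.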
Now, we state a more precise result in the following.
Theorem 1.
Every non-constant binary word W of length n generates at least n different shuffle squares.
Proof. 
It is easy to verify manually that this statement holds if the word W has a length of 3 or less. We use induction on the length, n, of the word W.
Assume that all non-constant words of length n generate at least n distinct shuffle squares. We need to show that all non-constant words of length $(n+1)$ generate at least $(n+1)$ distinct shuffle squares. Let us note that a non-constant word of length $(n+1)$ can be obtained either by adding the letter $\bar{x}$ to the end of the constant word $x^n$ or by adding the letter 0 or 1 to the end of a non-constant word V of length n.
In the first case, the resulting word $x^n \bar{x}$ generates $(n+1)$ shuffle squares according to Proposition 1. Therefore, we only need to investigate the second case. Without loss of generality, we can assume that V has the suffix 1 (the proof for the case of suffix 0 is analogous to the one presented below).
We need to show that the numbers of shuffle squares generated by the words V 0 and V 1 are at least one greater than the number of shuffle squares generated by the word V.
Since V is a non-constant word with the suffix 1, there exist a (possibly empty) word P and a number s ($0 < s < n$) such that
$$V = P\,0\,\underbrace{1 \cdots 1}_{s}.$$
By the induction hypothesis, V generates at least n distinct shuffle squares; let us denote n of them as
$$ss_1(V),\ ss_2(V),\ \ldots,\ ss_n(V).$$
Case 1: Root V 0 .
Note that the words
$$ss_1(V)\,00,\ ss_2(V)\,00,\ \ldots,\ ss_n(V)\,00$$
are pairwise distinct and generated by V 0 . Hence, V 0 generates at least n shuffle squares.
Now, observe that
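$$V0 = P\,0\,\underbrace{1 \cdots 1}_{s}\,0,$$
so $V0$ generates the word
$$P\,0\,\underbrace{1 \cdots 1}_{s}\,0\;P\,0\,\underbrace{1 \cdots 1}_{s}\,0,$$
which is distinct from every word $ss_i(V)\,00$, as it has a different suffix of length 2 (namely 10 instead of 00, because $s > 0$). Thus, $V0$ generates at least one more shuffle square than V.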
Case 2: Root V 1 .
Analogously to the previous case, let us note that the words
$$ss_1(V)\,11,\ ss_2(V)\,11,\ \ldots,\ ss_n(V)\,11$$
are pairwise distinct and generated by V 1 . Hence, V 1 generates at least n shuffle squares.
We have
$$V1 = P\,0\,\underbrace{1 \cdots 1}_{s+1},$$
so $V1$ generates the word
$$P\,0\,P\,0\,\underbrace{1 \cdots 1}_{s}\,\underbrace{1 \cdots 1}_{s}\,1\,1,$$
which is distinct from every word $ss_i(V)\,11$, as it has a different suffix of length $(s+3)$. Thus, $V1$ generates at least one more shuffle square than V, which completes the proof. □
Let us note that the lower bound n given by Theorem 1 is optimal: according to Proposition 1, for every positive n, the word $0^{n-1}1$ generates exactly n different shuffle squares.
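Both the bound of Theorem 1 and its tightness can be verified exhaustively for short words. The following sketch uses hypothetical names (`count_shuffle_squares`, `theorem_1_holds`) and a plain brute-force enumeration, so it is only practical for small n.

```python
from functools import lru_cache
from itertools import product


def count_shuffle_squares(root):
    """Count the distinct words obtained by shuffling `root` with itself."""
    n = len(root)

    @lru_cache(maxsize=None)
    def interleave(i, j):
        # All words obtained by interleaving root[i:] with root[j:].
        if i == n and j == n:
            return frozenset({""})
        words = set()
        if i < n:
            words.update(root[i] + tail for tail in interleave(i + 1, j))
        if j < n:
            words.update(root[j] + tail for tail in interleave(i, j + 1))
        return frozenset(words)

    return len(interleave(0, 0))


def theorem_1_holds(n):
    """Check that every non-constant binary word of length n generates at least n words."""
    for letters in product("01", repeat=n):
        word = "".join(letters)
        if len(set(word)) > 1 and count_shuffle_squares(word) < n:
            return False
    return True


for n in range(1, 8):
    tight_word = "0" * (n - 1) + "1"
    print(n, theorem_1_holds(n), count_shuffle_squares(tight_word))
```

For each n the last column equals n, in agreement with Proposition 1 applied to $a = n - 1$ and $b = 0$.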
On the other hand, the upper bound is not known, but in Table 2, we present known terms from the sequence of the maximal number of shuffle squares for given lengths of binary words, obtained by Shallit in 2020 [13].
In Table 3, the values of k are presented for which there exists a binary word of length n generating exactly k different shuffle squares, for $k \le 20$ and $n \le 12$.
In Table 4, we present the words of length n, for $1 \le n \le 12$, that generate the most shuffle squares by self-shuffling. The numbers of shuffle squares generated by these words are presented in Table 2.
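For very short generators, the maximal counts can be recomputed by exhaustive search. This is a sketch under the assumption that brute force is acceptable for small n; the names `count_shuffle_squares` and `best_generators` are hypothetical and are not taken from the paper or from the OEIS entry.

```python
from functools import lru_cache
from itertools import product


def count_shuffle_squares(root):
    """Count the distinct words obtained by shuffling `root` with itself."""
    n = len(root)

    @lru_cache(maxsize=None)
    def interleave(i, j):
        # All words obtained by interleaving root[i:] with root[j:].
        if i == n and j == n:
            return frozenset({""})
        words = set()
        if i < n:
            words.update(root[i] + tail for tail in interleave(i + 1, j))
        if j < n:
            words.update(root[j] + tail for tail in interleave(i, j + 1))
        return frozenset(words)

    return len(interleave(0, 0))


def best_generators(n):
    """Return (maximal count, generators starting with 0 that attain it) for length n."""
    counts = {}
    for letters in product("01", repeat=n):
        word = "".join(letters)
        counts[word] = count_shuffle_squares(word)
    best = max(counts.values())
    winners = sorted(w for w, c in counts.items() if c == best and w[0] == "0")
    return best, winners


for n in range(1, 7):
    print(n, best_generators(n))
```

Only generators starting with 0 are reported, since swapping the two letters does not change the count. The printed maxima can be compared with the first entries of Table 2.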

3. Searching for Roots for Given Shuffle Squares

Let us note that for every positive n, the constant word $a^{2n}$ has only one root, $a^n$, which means that over every alphabet, there are arbitrarily long words with only one root. On the other hand, the number of roots of a single word can be arbitrarily large:
Proposition 3.
Let us consider words over a fixed alphabet $\mathcal{A}$ with at least two letters. For every natural number n, let $R_{2n}$ be the maximal number of roots of a single word of length $2n$. Then, $\lim_{n \to \infty} R_{2n} = \infty$.
Proof. 
Clearly, it is sufficient to prove the statement for the binary alphabet.
Let us note that, for $4 \le n \le 7$, there exist words of length $2n$ that have at least two roots. Examples of such words are provided in Table 5. If the word W has a root $r_W$ and the word V has a root $r_V$, then the word $WV$ has the root $r_W r_V$. Thus, if the sets of roots of the words W and V contain k and l elements, respectively, the set of roots of the word $WV$ contains at least $k \cdot l$ distinct elements (the concatenations $r_W r_V$ are pairwise distinct, because all roots of W have the same length).
Every natural number $m \ge 8$ can be decomposed as a sum
$$m = m_1 + m_2 + \cdots + m_l$$
such that
$$4 \le m_1, m_2, \ldots, m_l \le 7.$$
Therefore, for every such m, we can construct a word W of length $2m$ such that
$$W = W_1 W_2 \cdots W_l,$$
where each word $W_i$ has length $2m_i$ and is taken from Table 5. Thus, the number of roots, $R_W$, of the word W satisfies
$$R_W \ge R_{W_1} \cdot R_{W_2} \cdots R_{W_l} \ge \underbrace{2 \cdot 2 \cdots 2}_{l} = 2^l.$$
Since $l \ge m/7$, the right-hand side grows without bound as m increases, which completes the proof. □
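The product construction used in this proof can be illustrated on two words from Table 5. The sketch below is only a sanity check with hypothetical names (`roots`, `concatenated_roots`); the root enumeration is a simple memoised search, not the authors' implementation.

```python
from functools import lru_cache


def roots(word):
    """Return the set of distinct roots of `word` (empty if it is not a shuffle square)."""

    @lru_cache(maxsize=None)
    def rest_of_root(pos, pending):
        # `pending`: letters of the leading copy not yet repeated by the trailing copy.
        if pos == len(word):
            return frozenset({""}) if not pending else frozenset()
        letter = word[pos]
        found = set()
        if len(pending) + 1 <= len(word) - pos - 1:
            found.update(letter + tail
                         for tail in rest_of_root(pos + 1, pending + letter))
        if pending and pending[0] == letter:
            found.update(rest_of_root(pos + 1, pending[1:]))
        return frozenset(found)

    return set(rest_of_root(0, ""))


def concatenated_roots(first, second):
    """Roots of first + second obtained by concatenating a root of each factor."""
    return {r1 + r2 for r1 in roots(first) for r2 in roots(second)}


w = "00110011"      # Table 5, n = 4
v = "0011001100"    # Table 5, n = 5
products = concatenated_roots(w, v)
print(len(roots(w)), len(roots(v)), len(products))
print(products <= roots(w + v))  # every concatenation is a root of the longer word
```

The concatenated word may have further roots that do not arise this way, so the product is only a lower bound, which is all the proof requires.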
The results presented in Table 6 suggest that the maximal number of different roots for a single binary shuffle square W increases with the length, 2 n , of W.
A shuffle square is called explicit if it has only one root. There is no known general formula for the number of explicit shuffle squares with a given length. The number, h ( n ) , of binary explicit shuffle squares for small lengths, 2 n , is shown in Table 7.
The first step towards counting the binary explicit shuffle squares is the following.
Theorem 2.
Let W be a binary shuffle square of length $2n$ with a prefix 0. Let $a, b > 0$. If W belongs to one of the following classes, then W is an explicit shuffle square.
1. $0^{2n}$;
2. $0^{2a}\,1^{2b}$, where $a + b = n$;
3. $0^{2a}\,11\,0^{2b}$, where $a + b = n - 1$;
4. $0^{2a-1}\,101\,0^{2b}$, where $a + b = n - 1$.
Proof. 
(1) The constant shuffle square can only be obtained using a constant root.
(2) Every root has the prefix $0^a$. Since every root contains exactly a letters 0 and b letters 1, the only possible root is $0^a 1^b$.
(3) Every root has the prefix $0^a$ and the suffix $0^b$ and has to contain exactly one 1. The length of a root is $a + b + 1$, so the only possibility is $0^a 1 0^b$.
(4) As in case (3), the only possible root is $0^a 1 0^b$. □
Classes (3) and (4) in Theorem 2 suggest that all shuffle squares of the form $0^x 1 0^y 1 0^z$ might be explicit. However, this is not true; the word 0000100100 has two roots: 00100 and 00010.
Moreover, there exist binary explicit shuffle squares that do not belong to any of the classes mentioned in the above theorem. The shortest of such words is 010111.
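The four classes of Theorem 2, together with the two examples just mentioned, can be checked mechanically for small n. This is a sketch with hypothetical helper names (`roots`, `theorem_2_words`); it generates the class members for $a, b > 0$ and counts their roots by exhaustive search.

```python
from functools import lru_cache


def roots(word):
    """Return the set of distinct roots of `word` (empty if it is not a shuffle square)."""

    @lru_cache(maxsize=None)
    def rest_of_root(pos, pending):
        # `pending`: letters of the leading copy not yet repeated by the trailing copy.
        if pos == len(word):
            return frozenset({""}) if not pending else frozenset()
        letter = word[pos]
        found = set()
        if len(pending) + 1 <= len(word) - pos - 1:
            found.update(letter + tail
                         for tail in rest_of_root(pos + 1, pending + letter))
        if pending and pending[0] == letter:
            found.update(rest_of_root(pos + 1, pending[1:]))
        return frozenset(found)

    return set(rest_of_root(0, ""))


def theorem_2_words(n):
    """List the words of length 2n that belong to classes 1-4 of Theorem 2."""
    words = ["0" * (2 * n)]                                        # class 1
    for a in range(1, n):                                          # class 2, b = n - a > 0
        words.append("0" * (2 * a) + "1" * (2 * (n - a)))
    for a in range(1, n - 1):                                      # classes 3 and 4
        b = n - 1 - a                                              # b > 0
        words.append("0" * (2 * a) + "11" + "0" * (2 * b))         # class 3
        words.append("0" * (2 * a - 1) + "101" + "0" * (2 * b))    # class 4
    return words


for n in range(2, 7):
    words = theorem_2_words(n)
    print(n, len(words), all(len(roots(w)) == 1 for w in words))

print(sorted(roots("0000100100")))  # the non-explicit example from the text
print(sorted(roots("010111")))      # explicit, but outside the four classes
```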
Table 1 contains the total number, $f(n)$, of distinct binary shuffle squares, whereas Table 7 contains the number, $h(n)$, of explicit shuffle squares that start with the letter 0. Using the values from Table 1 and the values from Table 7 multiplied by 2, we can define the function $i(n)$ as follows:
$$i(n) = \frac{h(n)}{f(n)},$$
where the numerator is the doubled value from Table 7, so that both counts refer to all binary words; for example, $i(4) = 2 \cdot 38 / 82 = 38/41$.
Values of $i(n)$ for small values of n are shown in Table 8.
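The first entries of Tables 1, 7, and 8 can be recomputed by exhaustive search over all binary words of length $2n$. The following sketch uses hypothetical names (`roots`, `counts`, `ratio`) and follows the convention described above: explicit shuffle squares are counted among words starting with 0 and then doubled.

```python
from fractions import Fraction
from functools import lru_cache
from itertools import product


def roots(word):
    """Return the set of distinct roots of `word` (empty if it is not a shuffle square)."""

    @lru_cache(maxsize=None)
    def rest_of_root(pos, pending):
        # `pending`: letters of the leading copy not yet repeated by the trailing copy.
        if pos == len(word):
            return frozenset({""}) if not pending else frozenset()
        letter = word[pos]
        found = set()
        if len(pending) + 1 <= len(word) - pos - 1:
            found.update(letter + tail
                         for tail in rest_of_root(pos + 1, pending + letter))
        if pending and pending[0] == letter:
            found.update(rest_of_root(pos + 1, pending[1:]))
        return frozenset(found)

    return set(rest_of_root(0, ""))


def counts(n):
    """Return (f, h) for n >= 1: all shuffle squares, and explicit ones starting with 0."""
    total = 0
    explicit_from_zero = 0
    for letters in product("01", repeat=2 * n):
        word = "".join(letters)
        found = roots(word)
        if found:
            total += 1
            if len(found) == 1 and word[0] == "0":
                explicit_from_zero += 1
    return total, explicit_from_zero


def ratio(n):
    """Quotient of explicit shuffle squares (doubled count) and all shuffle squares."""
    total, explicit_from_zero = counts(n)
    return Fraction(2 * explicit_from_zero, total)


for n in range(1, 6):
    print(n, counts(n), ratio(n))
```

The search visits $2^{2n}$ words, so it is only feasible for the first few values of n; the larger entries in the tables require the authors' own computations.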
Using values from Table 8 and Proposition 3, we state the following.
Conjecture 1.
$\lim_{n \to \infty} i(n) = 0$.
Indeed, the values in Table 8 decrease as n increases, which might suggest that the conjecture is true. However, from Theorem 2, we know that for every length $2n$, we can create at least $(3n - 4)$ explicit shuffle squares, so this topic needs further investigation.
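The count $3n - 4$ can be traced back to the four classes of Theorem 2. The following is a sketch of the arithmetic, assuming $n \ge 2$ and using the fact that the four classes are pairwise disjoint for $a, b > 0$:

```latex
\underbrace{1}_{\text{class 1}}
+ \underbrace{(n-1)}_{\text{class 2: } a = 1, \ldots, n-1}
+ \underbrace{(n-2)}_{\text{class 3: } a = 1, \ldots, n-2}
+ \underbrace{(n-2)}_{\text{class 4: } a = 1, \ldots, n-2}
= 3n - 4 .
```

This lower bound grows only linearly in n, so it is far too small to contradict Conjecture 1.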

4. Summary

The presented study explores the combinatorial properties of binary shuffle squares, focusing on their number of roots and the generation of shuffle squares from given words. It was demonstrated that every non-constant word generates at least n shuffle squares, where n is the word’s length, and classes of explicit shuffle squares with unique roots were characterized. A conjecture was proposed suggesting that the ratio of explicit shuffle squares to all shuffle squares approaches zero as their length increases, warranting further investigation. Future research should aim to extend these findings to larger alphabets, refine asymptotic analyses, develop efficient algorithms for recognizing and analyzing shuffle squares, and conduct computational experiments for larger values of n. These efforts could enhance our understanding of shuffle squares and their applications in mathematics and computer science.

Author Contributions

Conceptualization, D.D. and B.P.; methodology, B.P.; software, D.D.; validation, D.D. and B.P.; formal analysis, B.P.; writing (original draft preparation), B.P.; writing (review and editing), D.D. and B.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Thue, A. Über unendliche Zeichenreihen. Norske vid. Selsk. Skr. Mat. Nat. Kl. 1906, 7, 1–22. Reprinted in Selected Mathematical Papers of Axel Thue; Nagell, T., Ed.; Universitetsforlaget: Oslo, Norway, 1977; pp. 139–158.
  2. Berstel, J.; Perrin, D. The origins of combinatorics on words. Eur. J. Comb. 2007, 28, 996–1022.
  3. Henshall, D.; Rampersad, N.; Shallit, J. Shuffling and Unshuffling. Bull. EATCS 2012, 107, 131–142.
  4. Available online: https://oeis.org/A191755 (accessed on 2 January 2025).
  5. He, X.; Huang, E.; Nam, I.; Thaper, R. Shuffle Squares and Reverse Shuffle Squares. Eur. J. Comb. 2024, 116, 103883.
  6. Axenovich, M.; Person, Y.; Puzynina, S. A regularity lemma and twins in words. J. Combin. Theory Ser. A 2013, 120, 733–743.
  7. Basu, A.; Ruciński, A. How far are ternary words from shuffle squares? Ars Math. Contemp. 2024.
  8. Fici, G. The Shortest Interesting Binary Words. arXiv 2024, arXiv:2412.21145.
  9. Grytczuk, J.; Pawlik, B.; Pleszczyński, M. More Variations on Shuffle Squares. Symmetry 2023, 15, 1982.
  10. Grytczuk, J.; Pawlik, B.; Pleszczyński, M. Variations on shuffle squares. arXiv 2024, arXiv:2308.13882.
  11. Buss, S.; Soltys, M. Unshuffling a square is NP-hard. J. Comput. Syst. Sci. 2014, 80, 766–776.
  12. Bulteau, L.; Vialette, S. Recognizing binary shuffle squares is NP-hard. Theor. Comput. Sci. 2020, 806, 116–132.
  13. Available online: https://oeis.org/A331850 (accessed on 2 January 2025).
Table 1. The number, $f(n)$, of distinct binary shuffle squares of length $2n$ for small values of n [4].

| n | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|---|
| f(n) | 1 | 2 | 6 | 22 | 82 | 320 | 1268 | 5102 | 20,632 |

| n | 9 | 10 | 11 | 12 | 13 |
|---|---|---|---|---|---|
| f(n) | 83,972 | 342,468 | 1,399,296 | 5,720,966 | 23,396,618 |

| n | 14 | 15 | 16 | 17 |
|---|---|---|---|---|
| f(n) | 95,654,386 | 390,868,900 | 1,596,000,418 | 6,511,211,718 |

| n | 18 | 19 |
|---|---|---|
| f(n) | 26,538,617,050 | 108,060,466,284 |
Table 2. The maximal number, $f(n)$, of distinct shuffle squares generated by a binary word of length n.

| n | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| f(n) | 1 | 2 | 4 | 9 | 18 | 54 | 120 | 324 | 900 | 2406 | 6400 | 19,600 |

| n | 13 | 14 | 15 | 16 | 17 |
|---|---|---|---|---|---|
| f(n) | 50,176 | 148,042 | 442,325 | 1,373,070 | 3,954,113 |
Table 3. The existence of binary words of length n generating exactly k distinct shuffle squares for $k \le 20$ and $n \le 12$.
[Table body: a grid of + marks with rows n = 1, …, 12 and columns k = 1, …, 20; the column positions of the marks are not recoverable here. Number of marks in rows n = 1, …, 12: 1, 2, 3, 5, 8, 7, 5, 5, 3, 3, 3, 2.]
Table 4. Words that generate the maximal number, $f(n)$, of distinct shuffle squares for binary generators of length n.

| n | Words |
|---|---|
| 1 | 0 |
| 2 | 00 |
| 3 | 010 |
| 4 | 0110 |
| 5 | 00110, 01000, 01100, 01100 |
| 6 | 011000 |
| 7 | 0110001, 0111001 |
| 8 | 01110001 |
| 9 | 011000110 |
| 10 | 0110001110, 0111000110 |
| 11 | 01110001110 |
| 12 | 011100001110 |
Table 5. Exemplary binary words of length $2n$ with multiple different roots for $4 \le n \le 7$.

| n | Word | Number of Different Roots | Roots |
|---|---|---|---|
| 4 | 00110011 | 2 | 0011, 0101 |
| 5 | 0011001100 | 3 | 00110, 01100, 01010 |
| 6 | 001010010100 | 4 | 001010, 001100, 010100, 010010 |
| 7 | 00110011001100 | 6 | 0011100, 0011010, 0110010, 0100110, 0101100, 0101010 |
Table 6. The maximum number, $g(n)$, of roots of binary shuffle squares of length $2n$ for small values of n.

| n | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| g(n) | 1 | 1 | 1 | 1 | 2 | 3 | 4 | 6 | 9 | 13 | 19 | 28 | 42 |
Table 7. The number, $h(n)$, of explicit shuffle squares of length $2n$ for small values of n.

| n | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| h(n) | 1 | 1 | 3 | 11 | 38 | 135 | 475 | 1681 | 5875 | 20,641 |

| n | 10 | 11 | 12 |
|---|---|---|---|
| h(n) | 71,955 | 250,447 | 869,331 |
Table 8. The quotient, $i(n)$, of the number of explicit shuffle squares and the number of shuffle squares of length $2n$ for small values of n.

| n | i(n) |
|---|---|
| 0 | 1/1 |
| 1 | 1/1 |
| 2 | 3/3 |
| 3 | 11/11 |
| 4 | 38/41 ≈ 0.93 |
| 5 | 135/160 ≈ 0.84 |
| 6 | 475/634 ≈ 0.75 |
| 7 | 1681/2551 ≈ 0.66 |
| 8 | 5875/10,316 ≈ 0.57 |
| 9 | 20,641/41,986 ≈ 0.49 |
| 10 | 71,955/171,234 ≈ 0.42 |
| 11 | 250,447/699,648 ≈ 0.36 |
| 12 | 869,331/2,860,483 ≈ 0.3 |
Datko, D.; Pawlik, B. Roots of Binary Shuffle Squares. Symmetry 2025, 17, 305. https://doi.org/10.3390/sym17020305
