On the entropy and letter frequencies of powerfree words

We review the recent progress in the investigation of powerfree words, with particular emphasis on binary cubefree and ternary squarefree words. Besides various bounds on the entropy, we provide bounds on letter frequencies and consider their empirical distribution obtained by an enumeration of binary cubefree words up to length 80.


Introduction
The interest in combinatorics on words goes back to the work of Axel Thue at the beginning of the 20th century [1]. He showed, in particular, that the famous morphism
$$\varrho\colon\quad 0 \mapsto 01, \qquad 1 \mapsto 10, \tag{1}$$
called the Thue-Morse morphism since the work of Morse [2], is cubefree. Its iteration on the initial word 0 produces an infinite cubefree word 0110100110010110100101100110100110010110011010010110100110010110 . . . over a binary alphabet, which means that it does not contain any subword of the form $0^3 = 000$, $1^3 = 111$, $(01)^3 = 010101$, $(10)^3 = 101010$ and so forth. Moreover, the statement that the morphism is cubefree means that it maps any cubefree word to a cubefree word, so it preserves this property. Generally, the iteration of a powerfree morphism is a convenient way to produce infinite powerfree words. The investigation of powerfree, or more generally of pattern-avoiding words, is one particular aspect of combinatorics on words; we refer the reader to the book series [3,4,5] for a comprehensive overview of the area, including algebraic formulations and applications. The area has attracted considerable activity in the past decades [6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24], and continues to do so, see [25,26,27,28,29,30,31,32,33] for some recent work. Beyond the realm of combinatorics on words and coding theory, substitution sequences, such as the Thue-Morse sequence, have been investigated for instance in the context of symbolic dynamics [34,35,36] and aperiodic order [37], to name but two. In the latter case, one is interested in systems which display order without periodicity, and substitution sequences often provide paradigmatic models, which are used in many applications in physics and materials science. However, sequences produced by a substitution such as in Eq. (1) have subexponential complexity and hence zero combinatorial entropy, cf. Definition 12 below. A natural generalisation to interesting sets of positive entropy is provided by powerfree or pattern-avoiding words.
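The iteration of the Thue-Morse morphism mentioned above can be illustrated by a minimal Python sketch. The function names (`apply_morphism`, `iterate`, `contains_cube`) are chosen here for illustration only, and the morphism is taken in its standard form 0 → 01, 1 → 10. The cube test is a brute-force check on a finite prefix, so it illustrates cubefreeness rather than proving it.

```python
def apply_morphism(images, word):
    """Apply a morphism, given as a dict letter -> image word, to a word."""
    return "".join(images[letter] for letter in word)


def iterate(images, seed, steps):
    """Apply the morphism `steps` times, starting from the word `seed`."""
    word = seed
    for _ in range(steps):
        word = apply_morphism(images, word)
    return word


def contains_cube(word):
    """Brute-force test for a factor of the form uuu with u non-empty."""
    n = len(word)
    for period in range(1, n // 3 + 1):
        for start in range(n - 3 * period + 1):
            block = word[start:start + period]
            if word[start:start + 3 * period] == block * 3:
                return True
    return False


# Standard form of the Thue-Morse morphism (assumed here).
THUE_MORSE = {"0": "01", "1": "10"}

prefix = iterate(THUE_MORSE, "0", 6)  # word of length 2**6 = 64
print(prefix)
print(contains_cube(prefix))
```

Six iterations reproduce the 64-letter prefix quoted above; longer prefixes are obtained by increasing `steps`.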
In this article, we review the recent progress on powerfree words, with emphasis on the two 'classic' cases of binary cubefree and ternary squarefree words. We include a summary of relevant results which are scattered over 25 years of literature, and also discuss some new results as well as conjectures on cubefree morphisms and letter frequencies in binary cubefree words.
The first term of interest is the combinatorial entropy of the set of powerfree words. Due to the fact that every subword of a powerfree word is again powerfree, the entropy of powerfree words exists as a limit. It is a measure for the exponential growth rate of the number of powerfree words of length n. Unfortunately, neither an explicit expression for the entropy of k-powerfree words nor an easy way to compute it numerically is known. Nevertheless, there are several strategies to derive upper and lower bounds for this limit. Upper bounds can be obtained, for example, by enumeration of all powerfree words up to a certain length, or by the derivation of generating functions for the number of powerfree words, see Section 4. Until recently all methods to achieve lower bounds relied on powerfree morphisms. However, the lower bounds obtained in this way are not particularly good, since they are considerably smaller than the upper bounds as well as reliable numerical estimates of the actual value of the entropy. A completely different approach introduced recently by Kolpakov [29], which amounts to choosing a parameter value to satisfy a number of inequalities derived from a Perron-Frobenius-type argument, provides surprisingly good lower bounds for the entropy of ternary squarefree and binary cubefree words.
In the following section, we briefly introduce the notation and basic terminology; see [3] for a more detailed introduction. We continue with a summary of results on k-powerfree morphisms, which can be used to derive lower bounds for the corresponding entropy. We then proceed by introducing the entropy of k-powerfree words and summarise the methods to derive upper and lower bounds in general, and for binary cubefree and ternary squarefree words in particular. We conclude with a discussion of the frequencies of letters in binary cubefree and ternary squarefree words.

Powerfree words and morphisms
Define an alphabet $A$ as a finite non-empty set of symbols called letters. The cardinality of $A$ is denoted by $\mathrm{Card}(A)$. Finite or infinite sequences of elements from $A$ are called words. The empty word is denoted by $\varepsilon$. The set of all finite words, the operation of concatenation of words and the empty word $\varepsilon$ form the free monoid $A^*$. The free semigroup generated by $A$ is $A^+ := A^* \setminus \{\varepsilon\}$. The length of a word $u \in A^*$, denoted by $|u|$, is the number of letters that $u$ consists of. The length of the empty word is $|\varepsilon| := 0$.
For two words $u, v \in A^*$, we say that $v$ is a subword or a factor of $u$ if there are words $x, y \in A^*$ such that $u = xvy$. If $x = \varepsilon$, the factor $v$ is called a prefix of $u$, and if $y = \varepsilon$, $v$ is called a suffix of $u$. Given a set of words $X \subset A^*$ (here and in what follows, the symbol $\subset$ is meant to include the possibility that both sets are equal), the set of all factors of words in $X$ is denoted by $\mathrm{Fact}(X)$.
A map $\varrho\colon A^* \to B^*$, where $A$ and $B$ are alphabets, is called a morphism if $\varrho(uv) = \varrho(u)\varrho(v)$ holds for all $u, v \in A^*$. Obviously, a morphism $\varrho$ is completely determined by $\varrho(a)$ for all $a \in A$, and satisfies $\varrho(\varepsilon) = \varepsilon$. A morphism $\varrho\colon A^* \to B^*$ is called $\ell$-uniform if $|\varrho(a)| = \ell$ for all $a \in A$.
For a word $u$, we define $u^0 := \varepsilon$, $u^1 := u$ and, for an integer $k > 1$, the power $u^k$ as the concatenation of $k$ occurrences of the word $u$. If $u \neq \varepsilon$, $u^k$ is called a $k$-power. A word $v$ contains a $k$-power if at least one of its factors is a $k$-power. If a word does not contain any $k$-power as a factor, it is called $k$-powerfree. If a word does not contain the $k$-power of any word up to a certain length $p$ as a factor, it is called length-$p$ $k$-powerfree, i.e., $w = x u^k y$ implies that $u = \varepsilon$ whenever $x, u, y \in A^*$ with $|u| \leq p$. We denote the set of $k$-powerfree words in an alphabet $A$ by $F^{(k)}(A) \subset A^*$ and the set of length-$p$ $k$-powerfree words by $F^{(k,p)}(A) \subset A^*$. By definition, the empty word $\varepsilon$ is $k$-powerfree for all $k$. A word $w \in A^*$ is called primitive if $w = v^n$, with $v \in A^*$ and $n \in \mathbb{N}$, implies that $n = 1$, meaning that $w$ is not a proper power of another word $v$.
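The definitions of $k$-powers, length-$p$ $k$-powerfreeness and primitivity translate directly into short tests. The following Python sketch is one possible way to implement them; the function names and the optional `max_period` argument (playing the role of $p$) are illustrative choices, not notation from the literature.

```python
def contains_k_power(word, k, max_period=None):
    """Return True if `word` has a factor u^k with u non-empty.

    If `max_period` is given, only words u with |u| <= max_period are
    considered, which corresponds to the length-p notion with p = max_period.
    """
    n = len(word)
    limit = n // k if max_period is None else min(max_period, n // k)
    for period in range(1, limit + 1):
        # A factor u^k with |u| = period exists exactly when there are
        # (k - 1) * period consecutive positions i with word[i] == word[i + period].
        run = 0
        for i in range(n - period):
            if word[i] == word[i + period]:
                run += 1
                if run >= (k - 1) * period:
                    return True
            else:
                run = 0
    return False


def is_k_powerfree(word, k, max_period=None):
    """Return True if `word` is (length-`max_period`) k-powerfree."""
    return not contains_k_power(word, k, max_period)


def is_primitive(word):
    """A non-empty word is primitive iff it occurs in word+word only at offsets 0 and |word|."""
    return len(word) > 0 and (word + word).find(word, 1) == len(word)
```

For example, `is_k_powerfree("010101", 3)` is false because of the cube $(01)^3$, whereas `is_k_powerfree("010101", 3, max_period=1)` is true, since only cubes of single letters are excluded in the length-1 case.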
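A morphism $\varrho\colon A^* \to B^*$ is called $k$-powerfree if $\varrho(w)$ is $k$-powerfree for every $k$-powerfree word $w \in A^*$. A test-set for $k$-powerfreeness of morphisms on an alphabet $A$ is a set $T \subset A^*$ such that, for any morphism $\varrho\colon A^* \to B^*$, $\varrho$ is $k$-powerfree if and only if $\varrho(T)$ is $k$-powerfree. A morphism is called powerfree if it is a $k$-powerfree morphism for every $k \geq 2$.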
In particular, 2-powerfree and 3-powerfree words and morphisms are called squarefree and cubefree, respectively. A morphism from $A^*$ to $B^*$ with $\mathrm{Card}(A) = 2$ is also called a binary morphism. The notion of powerfreeness can be extended to non-integer powers; see, for instance, Ref. [25] for an investigation of $k$-powerfree binary words for $k \geq 2$. However, in this article we shall concentrate on the cases $k = 2$ and $k = 3$, and hence restrict the discussion to integer powers.

Characterisations of k-powerfree morphisms
In what follows, we summarise a number of relevant results on k-powerfree morphisms. In particular, we are interested in the question how to test a specified morphism for k-powerfreeness. We start with results relating to the case k = 2.

Characterisations of squarefree morphisms
A sufficient (but in general not necessary) condition for the squarefreeness of a morphism has been known since 1979.
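Theorem 1. A morphism $\varrho\colon A^* \to B^*$ is squarefree if (i) $\varrho(w)$ is squarefree for every squarefree word $w \in A^*$ of length $|w| \leq 3$; (ii) $a = b$ whenever $a, b \in A$ and $\varrho(a)$ is a factor of $\varrho(b)$.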
If the morphism $\varrho$ is uniform, this condition is in fact also necessary, because in this case $\varrho(a)$ being a factor of $\varrho(b)$ implies that $\varrho(a) = \varrho(b)$. If $a, b \in A$ exist with $a \neq b$ and $\varrho(a) = \varrho(b)$, then clearly $\varrho$ is not squarefree since $\varrho(ab) = \varrho(a)\varrho(b)$ is a square. This gives the following corollary.
Corollary 2. A uniform morphism $\varrho\colon A^* \to B^*$ is squarefree if and only if $\varrho(w)$ is squarefree for every squarefree word $w \in A^*$ of length $|w| \leq 3$.
This corollary corresponds to Brandenburg's Theorem 2 in Ref. [11], which only demands that $\varrho(w)$ is squarefree for every squarefree word $w \in A^*$ of length exactly 3. A short calculation reveals that this condition is equivalent to (i), because every squarefree word of length smaller than 3 occurs as a factor of a squarefree word of length 3.
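For uniform morphisms, the criterion discussed above reduces to a finite check on the images of squarefree words of length at most 3. The following Python sketch implements this check by brute force; the function names and the two example morphisms are hypothetical and serve only to exercise the test.

```python
from itertools import product


def has_square(word):
    """Brute-force test for a factor of the form uu with u non-empty."""
    n = len(word)
    for period in range(1, n // 2 + 1):
        for start in range(n - 2 * period + 1):
            if word[start:start + period] == word[start + period:start + 2 * period]:
                return True
    return False


def squarefree_words(alphabet, length):
    """All squarefree words of the given length over the alphabet."""
    candidates = ("".join(t) for t in product(alphabet, repeat=length))
    return [w for w in candidates if not has_square(w)]


def uniform_morphism_is_squarefree(images):
    """Check images of all squarefree words of length <= 3 (uniform morphisms only)."""
    if len({len(image) for image in images.values()}) != 1:
        raise ValueError("the criterion used here applies to uniform morphisms only")
    alphabet = sorted(images)
    for length in range(1, 4):
        for w in squarefree_words(alphabet, length):
            if has_square("".join(images[letter] for letter in w)):
                return False
    return True


# Hypothetical examples: the identity is trivially squarefree, while the
# second morphism maps the squarefree word "ab" to "abba", which contains "bb".
identity = {"a": "a", "b": "b", "c": "c"}
broken = {"a": "ab", "b": "ba", "c": "ca"}
print(uniform_morphism_is_squarefree(identity))
print(uniform_morphism_is_squarefree(broken))
```

For non-uniform morphisms the check of short words alone is not sufficient, which is why the sketch rejects them explicitly.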
For the next characterisation, we need the notion of a pre-square with respect to a morphism $\varrho$. Let $A$ be an alphabet, $w \in A^*$ a squarefree word and $\varrho\colon A^* \to B^*$ a morphism. A factor $u \neq \varepsilon$ of $\varrho(w) = \alpha u \beta$ is called a pre-square with respect to $\varrho$, if there exists a word $w' \in A^*$ satisfying: $ww'$ is squarefree and $u$ is a prefix of $\beta\varrho(w')$, or $w'w$ is squarefree and $u$ is a suffix of $\varrho(w')\alpha$. Obviously, if $u$ is a pre-square, then either $\varrho(ww')$ or $\varrho(w'w)$ contains $u^2$ as a factor.
Theorem 3 (Crochemore [9]). A morphism $\varrho\colon A^* \to B^*$ is squarefree if and only if (i) $\varrho(w)$ is squarefree for every squarefree word $w \in A^*$ of length $|w| \leq 3$; (ii) for any $a \in A$, $\varrho(a)$ does not have any internal pre-squares.
It follows that, for a ternary alphabet A, a finite test-set exists, as specified in the following corollary. However, the subsequent theorem shows that, as soon as we consider an alphabet with Card(A) > 3, no such finite test-sets exist, so the situation becomes more complex when considering larger alphabets.
Theorem 5 (Crochemore [9]). Let $\mathrm{Card}(A) > 3$. For any integer $n$, there exists a morphism $\varrho\colon A^* \to B^*$ which is not squarefree, but maps all squarefree words of length up to $n$ to squarefree words.

Characterisations of cubefree and k-powerfree morphisms
We now move on to characterisations of cubefree and $k$-powerfree morphisms for $k > 3$. We start with a recent result on cubefree binary morphisms. Obviously, the set $T_{\min}$ itself is a test-set for cubefree binary morphisms. Another test-set is the set of cubefree words of length 7, as each word of $T_{\min}$ appears as a factor of this set. There are even single words which contain all the elements of $T_{\min}$ as factors. For instance, the cubefree word aabbababbabbaabaababaabb is one of the 56 words of length 24 which are test-sets for cubefree morphisms on $\{a, b\}$. The length of this word is optimal: no cubefree word of length 23 contains all the words of $T_{\min}$ as factors.
The following sufficient characterisation of k-powerfree morphisms generalises Theorem 1 to integer powers k > 2.
Theorem 7 (Bean et al. [8]). Let $\varrho\colon A^* \to B^*$ be a morphism for alphabets $A$ and $B$ and let $k > 2$. Then $\varrho$ is $k$-powerfree if (i) $\varrho(w)$ is $k$-powerfree whenever $w \in A^*$ is $k$-powerfree and of length $|w| \leq k + 1$; (ii) $a = b$ whenever $a, b \in A$ with $\varrho(a)$ a factor of $\varrho(b)$; (iii) the equality $x\varrho(a)y = \varrho(b)\varrho(c)$, with $a, b, c \in A$ and $x, y \in B^*$, implies that either $x = \varepsilon$, $a = b$ or $y = \varepsilon$, $a = c$.
As in the squarefree case above, a uniform morphism $\varrho$ for which (i) holds also meets (ii), because uniformity implies that $\varrho(a) = \varrho(b)$ whenever $\varrho(a)$ is a factor of $\varrho(b)$. If $a \neq b$, the word $a^{k-1}b$ is $k$-powerfree but $\varrho(a^{k-1}b) = \varrho(a)^k$ is a $k$-power, which produces a contradiction. The condition (iii) means that, for all letters $a \in A$, the images $\varrho(a)$ do not occur as an inner factor of $\varrho(bc)$ for any $b, c \in A$. In general, this is not necessary for uniform morphisms; an example is given by the Thue-Morse morphism $\varrho$ of Eq. (1). For instance, $\varrho(00) = 0101 = 0\varrho(1)1$, which violates condition (iii) in Theorem 7. Nevertheless, the Thue-Morse morphism is cubefree [1].
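Condition (iii) of Theorem 7 is a finite condition on the images of letters and can be checked mechanically. The following Python sketch lists all violations for a given morphism; the function name and the tuple layout of the result are illustrative choices. Applied to the Thue-Morse morphism (taken as 0 → 01, 1 → 10), it recovers the violation quoted above.

```python
def violations_of_condition_iii(images):
    """Return all tuples (a, b, c, x, y) with x rho(a) y == rho(b) rho(c)
    that are covered neither by (x empty, a == b) nor by (y empty, a == c)."""
    bad = []
    for b in images:
        for c in images:
            target = images[b] + images[c]
            for a in images:
                u = images[a]
                pos = target.find(u)
                while pos != -1:
                    x, y = target[:pos], target[pos + len(u):]
                    allowed = (x == "" and a == b) or (y == "" and a == c)
                    if not allowed:
                        bad.append((a, b, c, x, y))
                    pos = target.find(u, pos + 1)  # continue with later occurrences
    return bad


THUE_MORSE = {"0": "01", "1": "10"}
for a, b, c, x, y in violations_of_condition_iii(THUE_MORSE):
    print(f"rho({b}{c}) = {x} rho({a}) {y}")
```

An empty result means that condition (iii) holds; conditions (i) and (ii) have to be checked separately.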
Alphabets with $\mathrm{Card}(B) < 2$ only provide trivial results, because the only $k$-powerfree morphism from $A^*$ to $\{\varepsilon\}^*$ is the empty morphism $\varepsilon$, and for $\mathrm{Card}(B) = 1$ the only additional morphism is the map for $\mathrm{Card}(A) = 1$ that maps the single element in $A$ to the single letter in $B$. From now on, we consider alphabets with $\mathrm{Card}(B) \geq 2$. First, we deal with the case $\mathrm{Card}(A) \geq 3$.
Theorem 8 (Richomme, Wlazinski [23]). Given two alphabets $A$ and $B$ such that $\mathrm{Card}(A) \geq 3$ and $\mathrm{Card}(B) \geq 2$, and given any integer $k \geq 3$, there is no finite test-set for $k$-powerfree morphisms from $A^*$ to $B^*$.
This again is a negative result, which shows that the general situation is difficult to handle. In general, no finite set of words suffices to verify the $k$-powerfreeness of a morphism. The situation improves if we restrict ourselves to uniform morphisms, and look for test-sets for this restricted class of morphisms only. Here, a test-set for $k$-powerfreeness of uniform morphisms on $A^*$ is a set $T \subset A^*$ such that, for any uniform morphism $\varrho$ on $A^*$, $\varrho$ is $k$-powerfree if and only if $\varrho(T)$ is $k$-powerfree. The existence of finite test-sets of this type was recently established by Richomme and Wlazinski [28]. Let $\mathrm{Card}(A) \geq 2$ and $k \geq 3$ be an integer. The test-set $T^{(k)}(A)$ is built from two sets: $U^{(k)}(A)$ is the set of $k$-powerfree words over $A$ of length at most $k + 1$, and $V^{(k)}(A)$ is the set of words over $A$ that can be written in the form $a_0 w_1 a_1 w_2 \dots a_{k-1} w_k a_k$ with letters $a_0, a_1, \dots, a_k \in A$ and words $w_1, w_2, \dots, w_k \in A^*$ which contain every letter of $A$ at most once and satisfy $\bigl| |w_i| - |w_j| \bigr| \leq 1$. Obviously, this set is finite and comprises words with a maximum length of $\max\{ |w| : w \in T^{(k)}(A) \} \leq k\,(\mathrm{Card}(A) + 1) + 1$.
Theorem 9 (Richomme, Wlazinski [28]). Let $\mathrm{Card}(A) \geq 2$ and $k \geq 3$ be an integer. The finite set $T^{(k)}(A)$ is a test-set for $k$-powerfreeness of uniform morphisms on $A^*$.
Due to the upper bound on the maximum length of words in $T^{(k)}(A)$, the following corollary is immediate.
Corollary 10 (Richomme, Wlazinski [28]). A uniform morphism $\varrho$ on $A^*$ is $k$-powerfree for an integer power $k \geq 3$ if and only if $\varrho(w)$ is $k$-powerfree for all $k$-powerfree words $w$ of length at most $k\,(\mathrm{Card}(A) + 1) + 1$.
Although this result provides an explicit test-set for $k$-powerfreeness, it is of limited practical use, simply because the test-set becomes large very quickly. Already for $\mathrm{Card}(A) = 4$ and $k = 3$, the set $T^{(3)}(A)$ has 26247020 elements. For comparison, the set of cubefree words in four letters of length 16, as required in Corollary 10, has 1939267560 elements, so it is still much larger.
Finally, let us quote the following result of Keränen [38], which characterises k-powerfree binary morphisms and indicates that the test-set of Theorem 9 is far from optimal.

Entropy of powerfree words
Let $A$ be an alphabet. A subset $X \subset A^*$ is called factorial if for any word $x \in X$ all factors of $x$ are also contained in $X$. Define for a factorial subset $X \subset A^*$ the number of words of length $n$ occurring in $X$ by $c_X(n)$. This number gives some idea of the complexity of $X$: the larger the number of words of length $n$, the more diverse or complicated is the set. That is why $c_X\colon \mathbb{N} \to \mathbb{N}$ is called the complexity function of $X$.
Definition 12. The (combinatorial) entropy of an infinite factorial set $X \subset A^*$ is defined by
$$h(X) := \lim_{n \to \infty} \frac{1}{n} \log c_X(n).$$
The requirement that $X$ is factorial ensures the existence of the limit, see for example [39, Lemma 1].
We note the following: (i) If $X \subset A^*$ with $\mathrm{Card}(A) = r$, then $1 \leq c_X(n) \leq r^n$ for all $n$, which implies $0 \leq h(X) \leq \log r$.
(ii) If $X = A^*$ with $\mathrm{Card}(A) = r$, then $c_X(n) = r^n$ and $h(X) = \log r$.
The set of $k$-powerfree words $F^{(k)}(A)$ over an alphabet $A$ is obviously a factorial subset of $A^*$, which is infinite for suitable values of $k$ for a given alphabet $A$. The precise value of the corresponding entropy, which coincides with the topological entropy [40], is not known, but lower and upper bounds exist for many cases. Recently, much improved upper and lower bounds have been established for $h\bigl(F^{(2)}(\{0, 1, 2\})\bigr)$ and $h\bigl(F^{(3)}(\{0, 1\})\bigr)$, which will be outlined below. Generally, it is easier to find upper bounds than to give lower bounds, due to the factorial nature of the set of $k$-powerfree words, so we start with describing several methods to produce upper bounds on the entropy.

Upper bounds for the entropy
A simple way to provide upper bounds is based on the enumeration of the set of $k$-powerfree words up to some length. Clearly, for the case of $r = \mathrm{Card}(A)$ letters, the number of words $c(n) := c_{F^{(k)}(A)}(n)$ is bounded by $r^n$, so the corresponding entropy is $h := h\bigl(F^{(k)}(A)\bigr) \leq \log r$, as mentioned above. Suppose we know the actual value of $c(n)$ for some fixed $n$. Then, due to the factorial nature of the set,
$$c(mn) \leq c(n)^m$$
for any $m \geq 1$. Hence
$$h = \lim_{m \to \infty} \frac{\log c(mn)}{mn} \leq \frac{\log c(n)}{n}, \tag{2}$$
which, for any $n$, yields an upper bound for $h$. Obviously, the larger the value of $n$, the better the bound obtained in this way. In some cases, the bound can be slightly improved by considering words that overlap in a couple of letters; see [39] for an example. Sharper upper bounds can be produced by following a different approach, namely by considering a set of words that do not contain $k$-powers of a fixed finite set of words, for instance $k$-powers of all words up to a given length. This limitation means that the number of forbidden words is finite, and that the resulting factorial set contains all $k$-powerfree words, so its entropy provides an upper bound for $h$. Again, by increasing the number of forbidden words, the bounds can be systematically improved.
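The enumeration bound can be tried out on a small scale. The following Python sketch counts binary cubefree words by appending one letter at a time and evaluates $\log c(n)/n$; the function names are illustrative, and the modest length 16 is chosen only to keep the run time negligible.

```python
from math import log


def ends_with_cube(word):
    """True if some suffix of `word` is a cube uuu with u non-empty."""
    n = len(word)
    for period in range(1, n // 3 + 1):
        first = word[n - 3 * period:n - 2 * period]
        second = word[n - 2 * period:n - period]
        third = word[n - period:]
        if first == second == third:
            return True
    return False


def count_cubefree(max_length, alphabet="01"):
    """Return [c(0), c(1), ..., c(max_length)] for cubefree words over `alphabet`."""
    counts = [1]
    current = [""]
    for _ in range(max_length):
        # Every word in `current` is cubefree, so an extension is cubefree
        # exactly when the appended letter does not complete a cube at the end.
        current = [w + a for w in current for a in alphabet
                   if not ends_with_cube(w + a)]
        counts.append(len(current))
    return counts


counts = count_cubefree(16)
bounds = [log(c) / n for n, c in enumerate(counts) if n > 0]
for n, bound in enumerate(bounds, start=1):
    print(n, counts[n], round(bound, 6))
```

Storing all words limits this approach to short lengths; the printed bounds are valid upper bounds on $h$ but remain far above the values obtained from longer enumerations.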
As Noonan and Zeilberger pointed out [41], it is possible to calculate the generating function for the numbers of words avoiding a finite set of forbidden words by solving a system of linear equations. The generating functions are rational functions, and the location of the pole closest to the origin determines the radius of convergence, and hence the entropy of the corresponding set of words. This approach has been applied in Ref. [26] to derive an upper bound for the set of squarefree words in three letters, and generating functions for cubefree words in two letters are discussed below.
A related, though computationally easier approach is based on a Perron-Frobenius argument. It is sometimes referred to as the 'transfer matrix' or the 'cluster' approach. Here, a matrix is constructed, which determines how k-powerfree words of a given length can be concatenated to form k-powerfree words, and the growth rate is then determined by the maximum eigenvalue of this matrix. Both methods yield upper bounds that can be improved by increasing the length of the words involved, and in principle can approximate the entropy arbitrarily well, though in practice this is limited by the computational problem of computing the leading eigenvalue of a large matrix, or solving a large system of linear equations; see, for instance, [27] for details.
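A minimal Python sketch of the transfer matrix idea for binary length-$p$ cubefree words is given below. It is an illustration under simplifying assumptions rather than the method of Ref. [27]: the states are all allowed words of length $3p - 1$, the matrix is stored as a successor list, and the leading eigenvalue is estimated by plain power iteration, which is only practical for small $p$. All function names are hypothetical.

```python
from itertools import product
from math import log


def has_short_cube(word, p):
    """True if `word` contains a cube uuu with 1 <= |u| <= p."""
    n = len(word)
    for period in range(1, min(p, n // 3) + 1):
        for start in range(n - 3 * period + 1):
            if word[start:start + 3 * period] == word[start:start + period] * 3:
                return True
    return False


def growth_rate(p, iterations=200):
    """Estimate the leading eigenvalue of the transfer matrix by power iteration."""
    m = 3 * p - 1  # state length: one less than the longest forbidden cube
    states = ["".join(t) for t in product("01", repeat=m)]
    states = [s for s in states if not has_short_cube(s, p)]
    index = {s: i for i, s in enumerate(states)}
    # successors[i] lists the states reached from state i by appending one letter.
    successors = [
        [index[(s + a)[1:]] for a in "01" if not has_short_cube(s + a, p)]
        for s in states
    ]
    weights = [1.0] * len(states)  # scaled number of words ending in each state
    rate = 0.0
    for _ in range(iterations):
        new = [0.0] * len(states)
        for i, targets in enumerate(successors):
            for j in targets:
                new[j] += weights[i]
        total = sum(new)
        rate = total / sum(weights)  # growth factor of the total number of words
        weights = [x / total for x in new]
    return rate


for p in (1, 2):
    rate = growth_rate(p)
    print(p, rate, log(rate))  # log(rate) is an upper bound for the entropy
```

The number of states grows exponentially with $p$, so larger values of $p$ call for sparse matrix techniques and a more careful eigenvalue computation.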

Lower bounds for the entropy
Until very recently, all methods used to prove that the entropy of $k$-powerfree words is positive and to establish lower bounds on the entropy were based on $k$-powerfree morphisms. Clearly, a $k$-powerfree morphism, iterated on a single letter, produces $k$-powerfree words of increasing length and suffices to show the existence of infinite $k$-powerfree words. For example, the fact that the Thue-Morse morphism (1) is cubefree shows the existence of cubefree words of arbitrary length in two letters. To prove that the entropy is actually positive, one has to show that the number of $k$-powerfree words grows exponentially with their length. Essentially, this is achieved by considering $k$-powerfree morphisms from a larger alphabet. The following theorem is a generalisation of Brandenburg's method, compare [11], and provides a path to produce lower bounds for the entropy of $k$-powerfree words. This result means that, whenever we can find a uniform $k$-powerfree morphism from a sufficiently large alphabet, it provides a lower bound for the entropy. Clearly, the larger $r$ and the smaller $\ell$ the better the bound, so one is particularly interested in uniform $k$-powerfree morphisms from large alphabets of minimal length.
Another method due to Brinkhuis [12], which is related to Brandenburg's method, can be generalised as follows. Let again B = {b 1 , . . . , b s } be an alphabet and r ∈ N. For i = 1, . . . s let Brinkhuis' method was applied in Refs. [43,21,44]; see also below for a summary of bounds obtained for binary cubefree and ternary squarefree words. These bounds have in common that they are nowhere near the actual value of the entropy, and while a systematic improvement is possible by increasing the value of r in Theorem 13 (which, however, also means that one has to consider larger values of ℓ), it will always result in a much smaller growth rate, because only a subset of words is obtained in this way.
Recently, a different approach has been proposed [29], based essentially on the derivation of an inequality $S_m(n + 1) \geq \alpha S_m(n)$ for the weighted sum $S_m(n)$ of the number of elements in a certain subset (which depends on the choice of $m \in \mathbb{N}$) of squarefree (resp. cubefree) words of length $n$ over a ternary (resp. binary) alphabet and a parameter $\alpha > 1$ which satisfies two inequalities for $i = m, m + 1, \dots, n - 1$.
The estimation of $S_m(n + 1)$ starts from a Perron-Frobenius argument and concludes with the observation that the order of growth of the number of squarefree (resp. cubefree) words cannot be less than the order of growth of $S_m(n)$, which is $\alpha$. This implies $h\bigl(F^{(k)}(A)\bigr) \geq \log \alpha$ for $k = 2$ and $A = \{0, 1, 2\}$ or $k = 3$ and $A = \{0, 1\}$, with the corresponding values for $\alpha$. In the end, this method leads to a recipe to check, with a computer, several conditions for the parameters (including $m$), which ensure that the inequality for $S_m$ holds. By increasing the parameter $m$, it appears to be possible to estimate the growth rate of cubefree and squarefree words with an arbitrary precision. For details, we refer the reader to Ref. [29].

Bounds on the entropy of binary cubefree and ternary squarefree words
We now consider the two main examples, binary cubefree and ternary squarefree words, in more detail, reviewing the bounds derived by the various approaches mentioned above. We start with the discussion of binary cubefree words, and then give a brief summary of the analogous results for ternary squarefree words.

Binary cubefree words
The numbers $b(n)$ of binary cubefree words of length $n$ are listed in [45]; an extended list for $n \leq 80$ is shown in Table 1. They were obtained by a straightforward iterative construction of cubefree words, appending a single letter at a time. According to Eq. (2), the corresponding upper limit for the entropy $h$ is $h \leq \frac{\log b(80)}{80} \simeq 0.389855$.
For comparison, the limit obtained using the number of words of length 79 is 0.390020, which indicates that these limits are still considerably larger than the actual value of $h$. As in the case of ternary squarefree words [26], the asymptotic behaviour of $b(n)$ fits a simple form $b(n) \sim A\, x_c^{-n}$ as $n \to \infty$, pointing at a simple pole as the dominating singularity of the corresponding generating function at $x = x_c$. The estimated values of the coefficients are $A \simeq 2.847$ and $x_c^{-1} \simeq 1.4575773$, leading to a numerical estimate of $h = -\log x_c \simeq 0.3767757$ for the entropy.
Let us compare this with the upper limit derived from the generating function of the number of binary length-$p$ cubefree words. To this end, let $b_p(n) := c_{F^{(3,p)}(\{0,1\})}(n)$ denote the number of length-$p$ cubefree words of length $n$, and define
$$B_p(x) := \sum_{n=0}^{\infty} b_p(n)\, x^n$$
to be the generating function for the number of binary length-$p$ cubefree words. These functions of $x$ are rational [41]. The simplest of these generating functions is $B_0(x) = \frac{1}{1 - 2x} = 1 + 2x + 4x^2 + 8x^3 + 16x^4 + 32x^5 + 64x^6 + \dots$. The degrees of the numerator and denominator polynomials for $p \leq 14$ are given in Table 2. The generating functions $B_p(x)$ have a finite radius of convergence, determined by the location of the zero $x_c$ of the denominator polynomial which lies closest to the origin. A plot of the location of the poles is shown in Figure 1. It very much resembles the analogous distribution for ternary squarefree words [26]; again, the poles seem to accumulate, with increasing $p$, on or near the unit circle, which may indicate the presence of a natural boundary beyond which the generating function for cubefree binary words (corresponding to taking $p \to \infty$) cannot be analytically continued; see [26] for a discussion of this phenomenon in the case of ternary squarefree words. As a consequence of Pringsheim's theorem [46, Sec. 7.2], there is a dominant singularity on the positive real axis; we denote the position of the singularity by $x_c$. For the cases we considered, this simple pole appears to be the only dominant singularity. Since the radius of convergence of the power series $B_p(x)$ is given by $\bigl( \limsup_{n \to \infty} \sqrt[n]{b_p(n)} \bigr)^{-1}$, the entropy $h_p$ of the set of binary length-$p$ cubefree words is $h_p = -\log x_c$. Clearly, $h_p \geq h_{p'}$ for $p \leq p'$, and $h = \lim_{p \to \infty} h_p$, so for any finite $p$ the entropy $h_p$ provides an upper bound of the entropy $h$ of binary cubefree words. The values of the entropy $h_p$ for $p \leq 14$ are given in Table 2.
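As a worked illustration of the relation $h_p = -\log x_c$, consider the case $p = 1$, where only the cubes 000 and 111 are forbidden. The following short derivation is a sketch added for illustration; it uses the elementary observation that a word without three equal consecutive letters ends in a run of length one or two, which gives a Fibonacci-type recursion.

```latex
\begin{align*}
  b_1(0) &= 1, \quad b_1(1) = 2, \quad b_1(2) = 4, \qquad
  b_1(n) = b_1(n-1) + b_1(n-2) \quad (n \geq 3), \\
  B_1(x) &= \sum_{n \geq 0} b_1(n)\, x^n
          = \frac{1 + x + x^2}{1 - x - x^2}
          = 1 + 2x + 4x^2 + 6x^3 + 10x^4 + 16x^5 + 26x^6 + \dots, \\
  x_c &= \frac{\sqrt{5} - 1}{2} \approx 0.618034, \qquad
  h_1 = -\log x_c = \log \frac{1 + \sqrt{5}}{2} \approx 0.481212 .
\end{align*}
```

The value $h_1 \approx 0.4812$ lies well above the numerical estimate of $h$, which illustrates how much the bound gains from forbidding longer cubes.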
As was observed for ternary squarefree words [26], the values appear to converge very quickly with increasing $p$, but it is difficult to extract a reliable estimate of the true value of the entropy without making assumptions on the asymptotic behaviour. Already in 1983, Brandenburg [11] derived bounds on $b(n)$ which lead in our setting to $0.07701 \leq h \leq 0.41952$. The currently best upper bounds are due to Edlin [45] and Ochem and Reix [27]. Analysing length-15 cubefree words up to a finite length, Edlin [45] arrives at the bound of $h \leq 0.376777$ (which is what we would expect to find if we extended Table 2 to $p = 15$, but this would require huge computational effort to compute the corresponding generating function completely), while using the transfer matrix (or cluster) approach described above, Ochem and Reix obtained an upper bound on the growth rate of 1.45758131, which corresponds to the bound $h \leq 0.3767784$.
We now move on to the lower bound and cubefree morphisms. We already have seen one example above, the Thue-Morse morphism, which is a cubefree morphism from a binary alphabet to a binary alphabet. As explained above, it is also useful to find uniform cubefree morphisms from larger alphabets, because these provide lower bounds on the entropy. Clearly, if we have a uniform cubefree morphism $\varrho\colon A^* \to \{0, 1\}^*$ of length $\ell$, with $\mathrm{Card}(A) = r$, it is completely specified by the $r$ words $w_i$, $i = 1, \dots, r$, which are the images of the letters in $A$. Since any permutation of the letters in $A$ will again yield a uniform cubefree morphism, the set $\{w_1, \dots, w_r\} \subset \{0, 1\}^\ell$ of generating words determines the morphism up to permutation of the letters in $A$.
Moreover, the set $\{\overline{w}_1, \dots, \overline{w}_r\}$, where $\overline{w}$ denotes the image of $w$ under the permutation $0 \leftrightarrow 1$, also defines cubefree morphisms, as does $\{\widetilde{w}_1, \dots, \widetilde{w}_r\}$, where $\widetilde{w}$ denotes the reversal of $w$, i.e., the word $w$ read backwards. This is obvious because the test-sets of Theorem 9 are invariant under these operations. Unless the words are palindromic (which means that $w = \widetilde{w}$), the set $\{w_1, \dots, w_r\}$ thus represents four different morphisms (not taking into account permutation of letters in $A$), the fourth obtained by performing both operations, yielding $\{\widetilde{\overline{w}}_1, \dots, \widetilde{\overline{w}}_r\}$.
For cubefree morphisms from a three-letter alphabet $A$ to two letters one needs words of length at least six. For length six, there are twelve inequivalent (with respect to the permutation of letters in $A$) cubefree morphisms. The corresponding sets of generating words are formed from the four words $w_1 = 001011$, $w_2 = 001101$, $w_3 = 010110$, $w_4 = 011001$, together with the corresponding images under the two operations explained above.
It turns out that none of these morphisms actually satisfy the sufficient criterion of Theorem 7, but cubefreeness was verified using the test-set of Theorem 9.
One has to go to length nine to find cubefree morphisms from four to two letters. There are 16 inequivalent morphisms with respect to permutations of the four letters. Explicitly, they are given by five generating sets. Note that $w_9 = \widetilde{w}_9$ is a palindrome, and that two of the five sets are invariant under the permutation $0 \leftrightarrow 1$, which explains why they only represent 16 different morphisms. Beyond four letters, the test-set of Theorem 9 becomes unwieldy, but the sufficient criterion of Theorem 7 can be used to obtain morphisms. However, these may not have the optimal length, as the examples here show; again, for length nine all morphisms violate the conditions of Theorem 7. Still, this need not be the case; for instance, morphisms from a five-letter alphabet that satisfy the sufficient criterion exist for length 12, which in this case is the optimal length.
As a consequence of Theorem 13, the morphisms (6) from a four-letter alphabet show that the entropy of cubefree binary words is positive, and that $h \geq \frac{\log 2}{8} \simeq 0.08664$.
Using the sufficient condition, this bound can be improved. For instance, for length 15, one can find cubefree morphisms from 10 letters, which yields a lower bound of $h \geq \frac{\log 5}{14} \simeq 0.11496$.
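The two numerical values quoted above are consistent with a lower bound of the form $\log(r)/(\ell - 1)$, where $\ell$ is the length of the uniform morphism and $r$ is the ratio of the alphabet sizes (here $4/2 = 2$ and $10/2 = 5$). The following Python sketch reproduces them under this assumption; the function name `morphism_lower_bound` is hypothetical.

```python
from math import log


def morphism_lower_bound(card_a, card_b, ell):
    """Lower bound log(r) / (ell - 1) with r = card_a / card_b.

    The form of the bound is an assumption inferred from the quoted values.
    """
    r = card_a // card_b
    return log(r) / (ell - 1)


print(round(morphism_lower_bound(4, 2, 9), 5))    # four letters, length 9
print(round(morphism_lower_bound(10, 2, 15), 5))  # ten letters, length 15
```

Both bounds are far below the numerical estimate $h \simeq 0.3768$, which shows how much is lost by counting only words in the image of a single morphism.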
However, a large step to close the gap between these lower bounds and the upper bound was achieved by the work of Kolpakov [29]. With his approach, a lower bound of $h \geq 0.37676$, which is the best lower bound so far, has been established. The difference between this bound and the upper bound 0.3767784 by Ochem and Reix [27] is of order $10^{-5}$, showing the huge improvement over the previously available estimates.

Ternary squarefree words
Denote by $a(n) := c_{F^{(2)}(\{0,1,2\})}(n)$ the number of ternary squarefree words of length $n$ and by $a_p(n)$ the number of length-$p$ squarefree words of length $n$. For this section let $h := h\bigl(F^{(2)}(\{0, 1, 2\})\bigr)$ be the entropy of squarefree words over the alphabet $\{0, 1, 2\}$. See [39] for a list of $a(n)$ for $n \leq 90$ and [21] for $91 \leq n \leq 110$. The generating functions are defined as in the binary cubefree case. The first four of them are stated in [26, Sec. 3], which also contains a list of their radii of convergence for $p \leq 24$. Already in 1983, Brandenburg [11] derived bounds on $a(n)$ which lead in our setting to $0.03151 \leq h \leq 0.32120$.
In 1999, Noonan and Zeilberger [41] lowered the upper bound to 0.26391 by means of generating functions for the number of words avoiding squares of up to length 23. Grimm and Richard [26] used the same method to improve the upper bound to 0.263855. At the moment, the best known upper bound is 0.263740 which was established by Ochem in 2006 using an approach based on the transfer matrix (or cluster) method, see [27] for details.
In 1998, Zeilberger showed that a Brinkhuis pair of length 18 exists, which by Theorem 13 implies that the entropy is bounded from below by $h \geq \log(2)/17 \simeq 0.04077$ [47]. By going to larger alphabets, this was subsequently improved to $h \geq \log(65)/40 \simeq 0.10436$ by Grimm [21] and $h \geq \log(110)/42 \simeq 0.11192$ by Sun [44]. Again, the recent work of Kolpakov [29] has made a large difference to the lower bounds; he achieved the best current lower bound, which is $h \geq 0.26369$. The difference between the best known upper and lower bound is now just $5 \times 10^{-5}$.

Letter frequencies
For a finite word $w$ of length $n$, the frequency of the letter $a$ is $\#_a(w)/n \in [0, 1]$, where $\#_a(w)$ denotes the number of occurrences of the letter $a$ in $w$. In general, infinite $k$-powerfree words need not have well-defined letter frequencies. However, we can define upper and lower frequencies $f_a^+ \geq f_a^-$ of a letter $a \in A$ in an infinite word $w$ over $A$ as
$$f_a^+ := \sup_{\{w_n\}} \limsup_{n \to \infty} \frac{\#_a(w_n)}{n}, \qquad f_a^- := \inf_{\{w_n\}} \liminf_{n \to \infty} \frac{\#_a(w_n)}{n},$$
where $w_n$ is an $n$-letter subword of $w$. Here, we take the supremum and infimum over all sequences $\{w_n\}$. Alternatively, we can compute these frequencies from $a_n^+ = \max_{w_n \subset w} \#_a(w_n)$ and $a_n^- = \min_{w_n \subset w} \#_a(w_n)$ by $f_a^\pm = \lim_{n \to \infty} a_n^\pm / n$. The limits exist due to the subadditivity of the sequences $\{a_n^+\}$ and $\{1 - a_n^-\}$. If the infinite word $w$ is such that $f_a^+ = f_a^- =: f_a$, we call $f_a$ the frequency of the letter $a$ in $w$.
The requirement that a word is $k$-powerfree for some $k$ restricts the possible letter frequencies. For instance, for cubefree binary words, there cannot be three consecutive zeros, and hence the frequency of the letter 0 is certainly bounded from above by 2/3. Due to symmetry under permutation of letters, it is bounded from below by 1/3. In a similar way, considering maximum and minimum frequencies of letters in finite $k$-powerfree words produces bounds on the possible (upper and lower) frequencies of letters in infinite words. It is of interest for which frequency of a letter $k$-powerfree words cease to exist, and how the entropy of $k$-powerfree words depends on the letter frequency. To answer these questions, $k$-powerfree morphisms are exploited once again, and in two ways. Firstly, the argument using frequencies in finite words only produces 'negative' results, in the sense that one can exclude the existence of $k$-powerfree words for certain ranges of the frequency. To show that $k$-powerfree words of a certain frequency actually exist, these are produced as fixed points of $k$-powerfree morphisms. The letter frequency for an infinite word obtained as a fixed point of a morphism $\varrho$ on the alphabet $A = \{a_1, a_2, \dots, a_m\}$ is well-defined, and obtained from the (statistically normalised) right Perron-Frobenius eigenvector of the associated $m \times m$ substitution matrix $M$ with elements $M_{ij} = \#_{a_i}\bigl(\varrho(a_j)\bigr)$; see for instance [48]. For example, for the Thue-Morse morphism (1), the substitution matrix is $M = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$ with Perron-Frobenius eigenvalue 2 and corresponding eigenvector $\bigl(\tfrac{1}{2}, \tfrac{1}{2}\bigr)^T$, so both letters occur with frequency 1/2 in the infinite Thue-Morse word.
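The computation of letter frequencies from the substitution matrix can be sketched in a few lines of Python. The function names are illustrative, and the eigenvector is approximated by power iteration, which assumes that the substitution matrix is primitive; the Thue-Morse morphism is taken as 0 → 01, 1 → 10.

```python
def substitution_matrix(images, alphabet):
    """Matrix M with M[i][j] = number of occurrences of alphabet[i] in images[alphabet[j]]."""
    return [[images[aj].count(ai) for aj in alphabet] for ai in alphabet]


def letter_frequencies(images, alphabet, iterations=200):
    """Approximate the normalised right Perron-Frobenius eigenvector by power iteration."""
    matrix = substitution_matrix(images, alphabet)
    size = len(alphabet)
    vector = [1.0 / size] * size
    for _ in range(iterations):
        image = [sum(matrix[i][j] * vector[j] for j in range(size)) for i in range(size)]
        total = sum(image)
        vector = [x / total for x in image]  # statistical normalisation
    return dict(zip(alphabet, vector))


THUE_MORSE = {"0": "01", "1": "10"}
print(substitution_matrix(THUE_MORSE, ["0", "1"]))
print(letter_frequencies(THUE_MORSE, ["0", "1"]))
```

For the Thue-Morse morphism the iteration is stationary from the start, since the uniform vector is already the eigenvector.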
To show that there exist exponentially many words with a given letter frequency, or, in other words, that the entropy of the set of $k$-powerfree words with a given letter frequency is positive, a variant of Theorem 13 is used.

Theorem 14. Let $A = \{a_{11}, \ldots, a_{1r}, a_{21}, \ldots, a_{2r}, \ldots, a_{s1}, \ldots, a_{sr}\}$ and $B = \{b_1, \ldots, b_s\}$ be alphabets with $\mathrm{Card}(A) = rs$ and $\mathrm{Card}(B) = s$, where $r, s > 1$ are integers. Assume that there exists an $\ell$-uniform $k$-powerfree morphism $\varrho \colon A^{*} \to B^{*}$ with
\[
  \#_b\, \varrho(a_{ij}) = \#_b\, \varrho(a_{ij'})
\]
for all $b \in B$, $1 \le i \le s$ and $1 \le j, j' \le r$. Define the $r \times r$ matrix $M$ with elements given by the letter counts $\#_b\, \varrho(a_{ij})$, which by the above condition do not depend on $j$, and denote its right Perron-Frobenius eigenvector (with eigenvalue $\ell$) by $(f_1, \ldots, f_r)^{T}$, with statistical normalisation $f_1 + \ldots + f_r = 1$. Then, the entropy $h$ of the set of $k$-powerfree words in $B$ with prescribed letter frequencies $f_1, \ldots, f_r$ satisfies the same lower bound as in Theorem 13.

Proof. The bound is the same as in Theorem 13, and the statement thus follows by showing that the infinite words obtained from the uniform $k$-powerfree morphism $\varrho$ have letter frequencies given by $f_1, \ldots, f_r$. We again introduce the morphism $\varphi \colon A^{*} \to B^{*}$ by $\varphi(a_{ij}) := b_i$ for $i = 1, \ldots, s$ and $j = 1, \ldots, r$. Every $k$-powerfree word of length $m$ over $B$ has $r^{m}$ different preimages under $\varphi$ which, by construction, consist only of $k$-powerfree words. These words are mapped by $\varrho$, which is injective due to its $k$-powerfreeness, to different $k$-powerfree words of length $m\ell$ over $B$. Due to the condition $\#_b\, \varrho(a_{ij}) = \#_b\, \varrho(a_{ij'})$ on $\varrho$, the letter statistics do not depend on the choice of the preimage under $\varphi$. The letter frequencies of words obtained by the procedure described in the proof of Theorem 13 are thus well defined, and given by the right Perron-Frobenius eigenvector of the $r \times r$ matrix $M$.
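The counting condition of Theorem 14 is easy to test mechanically. The sketch below checks uniformity and the independence of the letter counts from the second index $j$, and collects the counts into a matrix with one column per first index $i$. The data layout (a dictionary keyed by index pairs) and the toy morphism are assumptions made purely for illustration; the toy morphism is not claimed to be powerfree, and powerfreeness would have to be established separately.

```python
def count_matrix(rho, letters_b):
    """Check the counting condition of Theorem 14 and return the count matrix.

    `rho` maps index pairs (i, j) to words over `letters_b` (hypothetical layout).
    Entry [row][col] is the number of letters_b[row] in rho[(i, j)] for the
    col-th value of i; the condition guarantees this is independent of j.
    """
    lengths = {len(word) for word in rho.values()}
    if len(lengths) != 1:
        raise ValueError("morphism is not uniform")
    columns = {}
    for (i, j), word in sorted(rho.items()):
        counts = tuple(word.count(b) for b in letters_b)
        # The first image seen for a given i fixes the expected counts.
        if columns.setdefault(i, counts) != counts:
            raise ValueError(f"letter counts of rho(a_{i}{j}) depend on j")
    indices = sorted(columns)
    return [[columns[i][row] for i in indices] for row in range(len(letters_b))]


if __name__ == "__main__":
    # Toy 4-uniform example with r = s = 2; NOT claimed to be powerfree.
    toy = {(1, 1): "0110", (1, 2): "1001", (2, 1): "0011", (2, 2): "1100"}
    print(count_matrix(toy, "01"))
```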
Some results for binary cubefree words, as well as a discussion of the empirical frequency distribution of cubefree binary words obtained from the enumeration up to length 80, are detailed below.

Binary cubefree words
When counting the numbers $b(n)$ of binary cubefree words of length $n$ shown in Table 1, we also counted the number $b(n, n_0)$ of words with $n_0$ occurrences of the letter 0. Clearly, these numbers satisfy $b(n) = \sum_{n_0=0}^{n} b(n, n_0)$ and $b(n, n - n_0) = b(n, n_0)$ as a consequence of the symmetry under permutation of letters. Their values for $n = 80$ are given in Table 3.
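For small lengths, the numbers $b(n, n_0)$ can be reproduced by a straightforward backtracking search. The following is a minimal sketch under that assumption, not the enumeration method used for length 80; it extends words letter by letter and discards any extension that ends in a cube. The function names are illustrative.

```python
def ends_with_cube(word):
    """True if some suffix of `word` has the form uuu with u non-empty."""
    n = len(word)
    for p in range(1, n // 3 + 1):
        if word[n - 3 * p:n - 2 * p] == word[n - 2 * p:n - p] == word[n - p:]:
            return True
    return False


def count_cubefree(n):
    """Return a list b with b[n0] = number of binary cubefree words of
    length n containing exactly n0 zeros."""
    counts = [0] * (n + 1)

    def extend(word, zeros):
        if len(word) == n:
            counts[zeros] += 1
            return
        for letter in "01":
            candidate = word + letter
            # All shorter prefixes are already cubefree, so a new cube
            # can only appear as a suffix of the extended word.
            if not ends_with_cube(candidate):
                extend(candidate, zeros + (letter == "0"))

    extend("", 0)
    return counts


if __name__ == "__main__":
    for n in range(1, 13):
        b = count_cubefree(n)
        print(n, sum(b), b)
```

The symmetry $b(n, n - n_0) = b(n, n_0)$ shows up as a palindromic list for every $n$. The running time grows exponentially with $n$, so this approach is only practical for short words.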
Obviously, there are at least 32 and at most 48 occurrences of the letter 0 in any cubefree binary word of length 80, so the frequency of a letter is bounded by $2/5 \le f_0 \le 3/5$. A stronger bound has been obtained by Ochem [30], who showed (amongst many results for a number of rational powers) that $f_0 > \frac{115}{283} \approx 0.40636$, using a backtracking algorithm. One is interested in locating the minimum frequency $f_{\min}$ such that infinite cubefree words with frequency $f_0 = f_{\min}$ exist, but none exist for any $f_0 < f_{\min}$. Clearly, the lower bound above is a lower bound for $f_{\min}$. In order to obtain an upper bound, we need to prove the existence of an infinite binary cubefree word of a given letter frequency. This is again done by using a cubefree morphism, which provides an infinite word with well-defined letter frequencies. For instance,
\[
  \begin{aligned}
    0 &\to 011011010110110011011010110 \\
    1 &\to 011011010110110011010110110
  \end{aligned}
\]
is a uniform morphism of length 27 with substitution matrix $\begin{pmatrix} 11 & 11 \\ 16 & 16 \end{pmatrix}$, so the infinite fixed point word has letter frequencies $f_0 = \frac{11}{27} \approx 0.40741$ and $f_1 = \frac{16}{27}$, and hence $f_{\min} \le \frac{11}{27}$.

Using the data from our enumeration of binary cubefree words up to length 80, we can study the empirical distribution for small length, and try to conjecture the behaviour for large words. Figure 2 shows a plot of the normalised data $b(80, 40 + e)/b(80)$ of Table 3, compared with a Gaussian distribution, which appears to fit the data very well. Here, the Gaussian profile was determined from the variance $\sigma^2$ of the data points, which is approximately $\sigma^2 \approx 2.124$.
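The letter counts of the 27-uniform morphism quoted above can be checked directly, together with a finite sanity check on one of its iterates. This is only a sketch: testing a single iterate for cubes does not prove that the morphism is cubefree, which requires a criterion for morphisms. The helper names are illustrative.

```python
from fractions import Fraction

# The 27-uniform morphism as quoted in the text.
RHO = {
    "0": "011011010110110011011010110",
    "1": "011011010110110011010110110",
}


def apply_morphism(morphism, word):
    """Apply a morphism letter by letter."""
    return "".join(morphism[c] for c in word)


def is_cubefree(word):
    """Naive check that `word` contains no factor uuu with u non-empty."""
    n = len(word)
    for p in range(1, n // 3 + 1):
        for i in range(n - 3 * p + 1):
            if word[i:i + p] == word[i + p:i + 2 * p] == word[i + 2 * p:i + 3 * p]:
                return False
    return True


if __name__ == "__main__":
    for letter, image in RHO.items():
        print(letter, len(image), image.count("0"), image.count("1"))
    f0 = Fraction(RHO["0"].count("0"), len(RHO["0"]))
    print("f0 =", f0, "exceeds 115/283:", f0 > Fraction(115, 283))
    second_iterate = apply_morphism(RHO, apply_morphism(RHO, "0"))
    print("second iterate cubefree:", is_cubefree(second_iterate))
```

Both images contain eleven zeros and sixteen ones, which reproduces the columns of the substitution matrix.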
To draw any conclusions on the limit of large word length, we need to consider the scaling of the distribution with the word length $n$. The first step is to determine how the variance scales with $n$. A plot of the numerical data is given in Figure 3, which shows that, for large $n$, the variance appears to scale linearly with $n$. A least squares fit to the data points for $40 \le n \le 80$ gives a slope of 0.021616.
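The two ingredients of this step, the variance of the letter count for fixed $n$ and a straight-line fit of the variance against $n$, can be sketched as follows. The function names are illustrative, and the numbers in the usage example are synthetic placeholders rather than values from the enumeration.

```python
def variance_of_zero_count(counts):
    """Variance of n0 under the distribution proportional to counts[n0]."""
    total = sum(counts)
    mean = sum(k * c for k, c in enumerate(counts)) / total
    return sum(c * (k - mean) ** 2 for k, c in enumerate(counts)) / total


def linear_fit(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    # b(3, n0) for n0 = 0..3: the six cubefree words of length 3.
    print(variance_of_zero_count([0, 3, 3, 0]))
    # Synthetic (length, variance) pairs, for illustration only.
    lengths = [10, 20, 30, 40]
    variances = [0.5, 0.7, 0.9, 1.1]
    print(linear_fit(lengths, variances))
```

A non-zero intercept in such a fit corresponds to the variance being asymptotically linear in $n$ without being proportional to it.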
Assuming that the distribution for fixed $n$ is Gaussian, the suitably rescaled data, considered as a function of the rescaled letter excess, should approach a Gaussian distribution with variance $\sigma^2 \approx 0.021616$. Figure 4 shows a plot of this distribution, together with the data points obtained for $40 \le n \le 80$. Clearly, there are some deviations, which is to be expected because the relationship between the variance and the length shown in Figure 3, while asymptotically linear, is not a proportionality; however, the overall agreement is reasonable. A plausible conjecture, therefore, is that the scaled distribution becomes Gaussian in the limit of large word length. In terms of the entropy, the observed concentration property is consistent with the entropy maximum occurring at letter frequency 1/2, and a lower entropy for other letter frequencies. This is similar to the observed and conjectured behaviour for ternary squarefree words in Ref. [26].
By an application of Theorem 14, the cubefree morphisms of Eq. (6) show that the entropy for the case of letter frequency $f_0 = f_1 = 1/2$ is positive. More interesting is the case of non-equal letter frequencies. As an example, consider the 13-uniform morphism where all words on the right-hand side comprise seven letters 0 and six letters 1. One can check that this morphism satisfies the criterion of Theorem 9, and hence is cubefree. Consequently, the matrix $M$ of Theorem 14 is $M = \begin{pmatrix} 7 & 7 \\ 6 & 6 \end{pmatrix}$, and the letter frequencies of any word constructed by this procedure are $f_0 = 7/13$ and $f_1 = 6/13$.

Summary and Outlook
In this paper, we reviewed recent progress on the combinatorics of $k$-powerfree words, with particular emphasis on the examples of binary cubefree and ternary squarefree words, which have attracted most attention over the years. Recent work in this area, using extensive computer searches, but also new methods, has led to a drastic improvement of the known bounds for the entropy of these sets. No analytic expression for the entropy is known to date, and the results on the generating function for the sets of length-$p$ powerfree words indicate that this may be out of reach. However, considerable progress has been made on other combinatorial questions, such as letter frequencies, where again bounds have been improved, and eventually a definite answer has emerged, in this case on the minimum letter frequency in ternary squarefree words. We also presented some new results on binary cubefree words, including an enumeration of the number of words and their letter frequencies for length up to 80. The empirical distribution of the number of words as a function of the excess of one letter was investigated, and is conjectured to become Gaussian in the limit of infinite word length after suitable scaling. We also found bounds on the letter frequency in binary cubefree words, and showed that exponentially many words with unequal letter frequency exist, as in the case of ternary squarefree words. The analysis of the generating functions of length-$p$ binary cubefree words, which we calculated for $p \le 14$, also shows a striking similarity to the case of ternary squarefree words, suggesting that the observed behaviour may be generic for sets of $k$-powerfree words.
While a lot of progress has been made, there remain many open questions. For instance, is there an explanation for the observed accumulation of poles and zeros of the generating functions on or near the unit circle, and is it possible to prove what happens in the limit when $p \to \infty$? How does the entropy depend on the power, say for binary $k$-powerfree words? A partial answer to this question is given in Ref. [25], but it would be nice to show that, at least in some region, the entropy increases by a finite amount at any rational value of $k$, as one might expect. Concerning powerfree words with given letter frequencies, how does the entropy vary as a function of the frequency? One might conjecture that the entropy changes continuously, but at present all we have are results showing that the entropy is positive for some very specific frequencies, where powerfree morphisms have been found. Some of these questions may be too hard to hope for an answer in full generality, but the recent progress in the area shows that one should keep looking for alternative approaches which may succeed.