Square-free word

Source: Wikipedia, the free encyclopedia.

In

avoids the pattern
XX.

Finite square-free words

Binary alphabet

Over a binary alphabet , the only square-free words are the empty word , and .

Ternary alphabet

Over a ternary alphabet , there are infinitely many square-free words. It is possible to count the number of ternary square-free words of length n.

The number of ternary square-free words of length n[1]
n 0 1 2 3 4 5 6 7 8 9 10 11 12
1 3 6 12 18 30 42 60 78 108 144 204 264

This number is bounded by , where .[2] The upper bound on can be found via

Fekete's Lemma and approximation by automata. The lower bound can be found by finding a substitution that preserves square-freeness.[2]

Alphabet with more than three letters

Since there are infinitely many square-free words over three-letter alphabets, this implies there are also infinitely many square-free words over an alphabet with more than three letters.

The following table shows the exact growth rate of the k-ary square-free words, rounded off to 7 digits after the decimal point, for k in the range from 4 to 15:[2]

Growth rate of the k-ary square-free words
alphabet size (k) 4 5 6 7 8 9
growth rate 2.6215080 3.7325386 4.7914069 5.8284661 6.8541173 7.8729902
alphabet size (k) 10 11 12 13 14 15
growth rate 8.8874856 9.8989813 10.9083279 11.9160804 12.9226167 13.9282035

2-dimensional words

Consider a map from to A, where A is an alphabet and is called a 2-dimensional word. Let be the entry . A word is a line of if there exists such that , and for .[3]

Carpi[4] proves that there exists a 2-dimensional word over a 16-letter alphabet such that every line of is square-free. A computer search shows that there are no 2-dimensional words over a 7-letter alphabet, such that every line of is square-free.

Generating finite square-free words

Shur[5] proposes an algorithm called R2F (random-t(w)o-free) that can generate a square-free word of length n over any alphabet with three or more letters. This algorithm is based on a modification of entropy compression: it randomly selects letters from a k-letter alphabet to generate a -ary square-free word.

algorithm R2F is
    input: alphabet size ,
           word length 
    output: a -ary square-free word wof length n.

    (Note that  is the alphabet with letters .)
    (For a word ,  is the permutation of  such that a precedes b in  if the 
     right most position of a in w is to the right of the rightmost position of b in w.
     For example,   has .)

    choose  in  uniformly at random 
    set  to  followed by all other letters of  in increasing order
    set the number N of iterations to 0

    while  do
        choose j in  uniformly at random
        append  to the end of w
        update  shifting the first j elements to the right and setting 
        increment N by 1
        if w ends with a square of rank r then
            delete the last r letters of w

    return w

Every (k+1)-ary square-free word can be the output of Algorithm R2F, because on each iteration it can append any letter except for the last letter of w.

The expected number of random k-ary letters used by Algorithm R2F to construct a -ary square-free word of length n is

Note that there exists an algorithm that can verify the square-freeness of a word of length n in time. Apostolico and Preparata[6] give an algorithm using suffix trees. Crochemore[7] uses partitioning in his algorithm. Main and Lorentz[8] provide an algorithm based on the divide-and-conquer method. A naive implementation may require time to verify the square-freeness of a word of length n.

Infinite square-free words

There exist infinitely long square-free words in any alphabet with three or more letters, as proved by Axel Thue.[9]

Examples

First difference of the Thue–Morse sequence

One example of an infinite square-free word over an alphabet of size 3 is the word over the alphabet obtained by taking the

first difference of the Thue–Morse sequence.[9]
That is, from the Thue–Morse sequence

one forms a new sequence in which each term is the difference of two consecutive terms of the Thue–Morse sequence. The resulting square-free word is

(sequence A029883 in the OEIS).

Leech's morphism

Another example found by John Leech[10] is defined recursively over the alphabet . Let be any square-free word starting with the letter 0. Define the words recursively as follows: the word is obtained from by replacing each 0 in with 0121021201210, each 1 with 1202102012021, and each 2 with 2010210120102. It is possible to prove that the sequence converges to the infinite square-free word

0121021201210120210201202120102101201021202102012021...

Generating infinite square-free words

Infinite square-free words can be generated by square-free morphism. A morphism is called square-free if the image of every square-free word is square-free. A morphism is called k–square-free if the image of every square-free word of length k is square-free.

Crochemore[11] proves that a uniform morphism h is square-free if and only if it is 3-square-free. In other words, h is square-free if and only if is square-free for all square-free w of length 3. It is possible to find a square-free morphism by brute-force search.

algorithm square-free_morphism is
    output: a square-free morphism with the lowest possible rank k.

    set 
    while True do
        set k_sf_words to the list of all square-free words of length k over a ternary alphabet
        for each  in k_sf_words do
            for each  in k_sf_words do
                for each  in k_sf_words do
                    if  then
                        break from the current loop (advance to next )
                    if  and  then
                        if  is square-free for all square-free w of length 3 then
                            return 
        increment k by 1

Over a ternary alphabet, there are exactly 144 uniform square-free morphisms of rank 11 and no uniform square-free morphisms with a lower rank than 11.

To obtain an infinite square-free words, start with any square-free word such as 0, and successively apply a square-free morphism h to it. The resulting words preserve the property of square-freeness. For example, let h be a square-free morphism, then as , is an infinite square-free word.

Note that, if a morphism over a ternary alphabet is not uniform, then this morphism is square-free if and only if it is 5-square-free.[11]

Letter combinations in square-free words

Extending a square-free word to avoid ab.

Avoid two-letter combinations

Over a ternary alphabet, a square-free word of length more than 13 contains all the square-free two-letter combinations.[12]

This can be proved by constructing a square-free word without the two-letter combination ab. As a result, bcbacbcacbaca is the longest square-free word without the combination ab and its length is equal to 13.

Note that over a more than three-letter alphabet there are square-free words of any length without an arbitrary two-letter combination.

Avoid three-letter combinations

Over a ternary alphabet, a square-free word of length more than 36 contains all the square-free three-letter combinations.[12]

However, there are square-free words of any length without the three-letter combination aba.

Note that over a more than three-letter alphabet there are square-free words of any length without an arbitrary three-letter combination.

Density of a letter

The density of a letter a in a finite word w is defined as where is the number of occurrences of a in and is the length of the word. The density of a letter a in an infinite word is where is the prefix of the word w of length l.[13]

The minimal density of a letter a in an infinite ternary square-free word is equal to .[13]

The maximum density of a letter a in an infinite ternary square-free word is equal to .[14]

Notes

  1. ^ "A006156 - OEIS". oeis.org. Retrieved 2019-03-28.
  2. ^ .
  3. .
  4. .
  5. .
  6. .
  7. .
  8. ^ .
  9. .
  10. ^ .
  11. ^ ].
  12. ^ .
  13. .

References