The Terras Theorem

One of the most important results on the 3x+1 problem is the Terras Theorem, originally stated by Riho Terras. Because it is such an important theorem for the 3x+1 problem this page is dedicated to sketching its approach and ideas. While some minimal mathematical background is probably necessary to follow this page none of the proofs require detailed knowledge of advanced techniques or deeper theorems.

Although all lemmas are proven and no steps are skipped or ignored the purpose of this page is to introduce the Terras Theorem to those new to the 3x+1 problem. Please note therefore that the emphasis is on making the theorem comprehensible rather than on 100% exact mathematics (but see this note for a more exact version).

The simplest way to summarize the theorem goes like this: Almost all numbers have finite Glide.

This is too simple really. Before we get back to a more exact version of the theorem let us first redefine the 3x+1 algorithm as follows:

3x + 1 algorithm, restated
For any positive integer N define a sequence where
S0 = N and for all i > 0 Si = Si-1 / 2 if Si is even Si = ( Si-1 * 3 + 1 ) / 2 if Si is odd
And we will call the smallest value of i for which Si < S0 the Stopping time of N
Example :
Let N = 17; then S0 = 17, S1 = 26, S2 = 13, S3 = 20, S4 = 10, S5 = 5, etc.
And since S2 is the first term < 17 the stopping time of 17 is 2.
The Terras theorem in simple words then becomes: Almost all numbers have finite Stopping Time.

Note that this redefinition has no consequences for the convergence of the series.

Next we define a parity vector for each number as follows:

Parity vector
For any number N the parity vector v(N) is defined as vi(N) = Si mod 2
Example :
For N = 17 we find v0 = 1, v1 = 0, v2 = 1, v3 = 0, v4 = 0, v5 = 1, etc.

The parity vector completely describes the way the 3x+1 iterates of N behave, since the numbers correspond directly to the criterion upon which the 3x + 1 algorithm is based.

We now state a simple lemma to increase our understanding of the parity vector :

Lemma 1 :
If N is a positive integer of the form a . 2k + b (b < 2k)
then the first k elements of the parity vector are dependent on b only
Proof :
The proof is by induction on k.
For k = 1 we have N = 2a + b with b either 0 or 1
and the lemma is true by the definition of the 3x + 1 algorithm
Now assume the lemma is true for some k
Let N = a . 2k+1 + b (b < 2k+1)
Then if S0 = N is even S1 = a . 2k + b / 2
and if S0 = N is odd S1 = 3a . 2k + (3b+1) / 2
In either case the first k elements of v(S1) depend on b only
Therefore the first k+1 elements of N are independent of a as well
This concludes the proof of the lemma.
Example :
Let N = 17 = 0 . 26 + 17. The first 6 elements of the parity vector are (1,0,1,0,0,1). From Lemma 1 this is true for every number of the form a . 26 + 17. Therefore these are also the first 6 elements of the parity vector of 81, 145, 209, 273, etc.

Another lemma about parity vectors is :

Lemma 2 :
Assume wi (0 ≤ i < k) is a parity vector of length k
Then some number N exists for which vi(N) = wi (0 ≤ i < k)
Proof :
The proof is again by induction on k.
For k = 1 we have w0 is either 0 or 1
and N is either any even or odd number
Now assume the lemma is true for some k
Let wi (0 ≤ i < k+1 ) be some parity vector of length k+1
Then some number N exists for which vi(N) = wi (0 ≤ i < k)
From Lemma 1 we can write N as a . 2k + b (b < 2k)
Now take two numbers M1 and M2, with M1 = a . 2k+1 + b and M2 = a . 2k+1 + 2k + b
For both M1 and M2 we automatically have vi(M1) = vi(M2) = wi (0 ≤ i < k)
Now take S0 = M1 and let us call Sk(M1) = X
From the definition of the algorithm we have Sk(M2) = Y = X + 3n
with n the number of odd elements of wi from 0 to k
Since 3n is always odd we see that X and Y are of different parity
Therefore wk+1(M1) ≠ wk+1(M2)
which means that for either wk+1 = 0 or wk+1 = 1
a number M must exist for which vi(M) = wi (0 ≤ i < k+1)
This concludes the proof of the lemma.
Example :
Take the parity vector (1,0,1,0,0,1,x) with x either 0 or 1. For N = 17 these are the first 6 elements of its parity vector. With S0 = N = 17 we find S6 = 8. So for N = 17 the first 7 elements of its parity vector are (1,0,1,0,0,1,0). If we take S0 = N = 17 + 26 = 81 then from Lemma 2 we know that the first 7 elements of its parity vector must be (1,0,1,0,0,1,1) and therefore that S6 must be odd, and indeed S6 = 35. Note that the parity vector contains three 1's in its first 6 elements and that S6 is also necessarily equal to 8 + 33 = 8 + 27 = 35.

The next step is to define a parity number of order k:

Parity number
For any number N the parity numbers Pk (k ≥ 1) are defined as Pk = ∑i<k (vi . 2i)
Example :
For N = 17 we find P1 = 1, P2 = 1, P3 = 5, P4 = 5, P5 = 5, P6 = 37, etc.

Note that the parity number of order k is really nothing else than the first k elements of the parity vector written in reverse order and interpreted as a binary number. Thus the parity vector of 17 starts as (1,0,1,0,0,1) and therefore P6 = 37, where 37 written in binary is 100101. Note that for any parity number of order k we have 0 ≤ Pk < 2k

We now reach an important Lemma:

Lemma 3 :
Let M and N be positive integers. Then Pk(M) = Pk(N) if and only if M ≡ N mod 2k
Proof :
• From M ≡ N mod 2k it follows that both can be written as
M = x . 2k + b and N = y . 2k + b, with b identical for both M and N
From Lemma 1 the first k elements of v depend on b only
Therefore vi(M) = vi(N) for 0 ≤ i < k and so Pk(M) = Pk(N)
• Let Pk(M) = Pk(N), and automatically vi(M) = vi(N) for 0 ≤ i < k
Assume that M and N are congruent b1 and b2 mod 2k respectively
Then we have two numbers a . 2k + b1 and a . 2k + b2 which both give rise to the same parity vector of length k
By Lemma 2 however for every parity vector w of length k a number Q exists for which vi(Q) = wi (0 ≤ i < k)
Since there are 2k different vectors of length k and since there are also
2k different congruence classes mod 2k it follows that b1 = b2
This completes the proof of Lemma 3

From Lemma 3 we can derive that every congruence class mod 2k uniquely maps to a parity number, with both numbers lying between 0 and 2k - 1. This function can therefore be seen as a permutation function.

Example :
Taking k=3 we find that 8x + 0 maps to parity number 0, 8x+1 maps to 5, etc. Written more concisely the numbers 8x + r (r=0, 1, 2, 3, 4, 5, 6, 7) map to 0, 5, 2, 3, 4, 1, 6, 7 respectively. We see that 1 maps to 5 and vice versa, but all other numbers map to themselves. Written even shorter in standard permutation notation the mapping for k=3 can be described by (1,5). Similarly the mapping for k=4 can be written as (1,5) (2,10) (9,13).

From the definition of the parity vector it should be clear that the elements describe the behaviour of the 3x + 1 sequence. In fact we can give a pretty good approximation for Sk from Vi (0 ≤ i < k). This is stated more explicitly in Lemma 4.

Lemma 4 :
Let S0 = N be a positive integer and vi its parity vector.
Let d(a,b) = ∑ vi (a ≤ i < b)
Specifically, let d(k) be a shorthand for d(0,k)
Then Sk ≈ Tk = S0 . 3d(k) . 2-k
(Or, in logarithmic form: ln(Sk) ≈ ln(Tk) = ln(S0) + d(k).ln(3) - k.ln(2)) and lim (Sk - Tk) / Sk = 0 for Sk → ∞
Proof :
Since every element of vi represents a (3x + 1) / 2 multiplication step and every zero represents a division step we immediately see that
Si+1 = (3Si+1)/2 if vi is 1 and Si+1 = Si/2 if vi is 0
Thus Sk = Tk + Rk where Rk is equal to
Rk = ∑ vi . 3d(i+1,k) / 2k-i (0 ≤ i < k)
Since lim (Rk / Sk) = 0 for Sk → ∞ this proves Lemma 4

Example :
Taking N = S0 = 17 and k = 5 we saw that S5 = 5. We find that T5 = 17 . 32 . 2-5 = 4.78125 which is less then 5% off the real value. For larger numbers the results are much closer, for instance taking N = 1,234,567 results in S5 = 3,125,000 and T5 = N . 34 . 2-5 = 3,124,997.7 where the difference between the two numbers is already less than 10-6 . S5

Note that the factor Rk is non-negative, implying that Tk ≤ Sk
Now that we have a fair way of estimating Sk from v(N) we can use v(N) to get a good indication of the convergence of N. To this end we introduce a new definition.

Convergent and Divergent vectors
Let vi (0 ≤ i < k) be a parity vector of length k
Let d(j) = ∑ vi (0 ≤ i < j)
and define c(j) as d(j).ln(3) - j.ln(2) for any j < k
Then we call v convergent if c(j) < 0 for any j < k and the smallest value j for which c(j) < 0 is called the convergence time of v, or, generally, the convergence time of any N having such a parity vector
Finally we call v divergent if such a value of j does not exist

It should be clear by Lemma 4 that a necessary though not quite sufficient condition for N to have a stopping time k is that its parity vector has Convergence Time k.
We are now getting closer to the real Terras theorem. The next lemma takes a big step towards the final goal.

Lemma 5 :
Let vi be a parity vector of length k with convergence time k and let M be the set of integers having parity vector v
Then all sufficiently large elements N from M have stopping time k
Proof :
First we note that Lemma 2 asserts that some N exist, so M is not empty
By Lemma 3 we know that all elements from M fall into a single congruence class mod 2k
We can easily see from Lemma 4 that S0 < Ti ≤ Si (0 ≤ i < k-1)
Therefore N must have a stopping time of at least k
Now the limit ( Sk ) / S0 for S0 → ∞ is equal to (S0 . ec(k) + Rk) / S0 = ec(k)
Since c(k) < 0 obviously ec(k) < 1 which proves the lemma.

Example :
All numbers ≡ 3 mod 16 (x ≥ 0) have a parity vector v starting (1,1,0,0). The convergence time of v is 4. Therefore all sufficiently large numbers ≡ 3 mod 16 have stopping time 4. The smallest such number, 3, is apparently sufficiently large enough already, since the stopping time of 3 is indeed 4.

Lemma 5 effectively states that only finitely many numbers with convergence time k do not have stopping time k. In actual fact no number is known with this property other than the trivial case where N = 1. Since such a number should (by Lemma 5) be relatively small it is generally believed that no such number exists. In other words:

Conjecture :
For all N > 1 the convergence time is equal to the stopping time

We have one more lemma to go before returning to the Terras Theorem proper. But this lemma takes us really close to our final destination.

Lemma 6 :
Let Vk be the set of all parity vectors v of length k.
Let Wk be the subset of V consisting of all divergent elements of V.
Then (taking |C| to be the number of elements of C) lim k → ∞ |Wk| / |Vk| = 0.
Proof :
From the definition of convergent vectors it should be clear that c(k) < 0 is a sufficient though not necessary condition for convergence.
Since the distribution of 1's and 0's in all parity vectors can be described by the binomial distribution, the limit above is obviously bounded by lim k → ∞ ∑ 2-k . k! / i!(k-i)! for 0 ≤ i < kα
where α = ( 1 - ln(2) / ln(3) )
This limit simply represents a part of the binomial distribution, and it is
well known that the limit is equal to zero for any α less than ½.
Since ln(2) / ln(3) > ½ the condition is fulfilled and the Lemma is proven.

Example :
The table below depicts both the number of divergent parity vectors and the relative number of those vectors for a number of values of k. Although the number of divergent parity vectors increases rapidly their relative number can be seen to drop substantially even for fairly modest values of k. For instance for k=70 the fraction of divergent vectors is approximately 0.001, which implies that 99.9% of all parity vectors with length 70 are convergent.

k|Wk||Wk| / |Vk|
110.5000
210.2500
320.2500
430.1875
540.1250
680.1250
7130.1016
8190.0742
9380.0742
10640.0625
k|Wk||Wk| / |Vk|
111280.0625
122260.0552
133670.0448
147340.0448
1512950.0395
1621140.0323
1742280.0323
1874950.0286
19149900.0286
20273280.0261
k|Wk||Wk| / |Vk|
301.28e+0070.01189418
406.40e+0090.00582334
503.73e+0120.00331669
602.22e+0150.00192219
701.24e+0180.00105159
808.03e+0200.00066440
905.09e+0230.00041078
1003.03e+0260.00023868
1102.03e+0290.00015662
1201.31e+0320.00009879
k|Wk||Wk| / |Vk|
1308.11e+0340.00005957
1405.52e+0370.00003963
1503.67e+0400.00002569
1602.52e+0430.00001726
1701.64e+0460.00001093
1801.13e+0490.00000736
1907.69e+0510.00000490
2004.92e+0540.00000306
2507.41e+0680.00000041
3001.11e+0830.00000005

We finally return to the Terras theorem itself. Stated more accurately than above the essence of the theorem is that the fraction of numbers within a given interval that can not be shown to have a finite stopping time tends to zero as the interval expands. In other words :

Theorem (Terras) :
Let M be a positive integer.
Let D(M) be the fraction of numbers < M that do not have finite stopping time.
Then the limit of D(M) for M → ∞ is equal to 0.
Proof :
First we establish that from Lemma 5 only finitely many numbers
have a stopping time different from their convergence time.
Now partition the numbers into congruence classes mod 2k
From Lemma 3 every number ≡ m mod 2k maps to a unique parity number n
( 0 ≤ n < 2k ) and therefore to a unique parity vector of length k.
But Lemma 6 proved that the fraction of divergent parity vectors of length k
for k → ∞ is zero.
On considering numbers N < 2k that do not have stopping time ≤ k
it follows that lim k → ∞ D(2k) = 0.
This concludes the proof of the Theorem.

Note : The above version of the Terras Theorem is a simplified version. It can be improved in several ways, e.g. by stating exactly 'how fast' the limit goes to zero. A more formal and more complete proof can be found in this excellent standard overview of the 3x+1 problem by Jeffrey Lagarias.


Back to the general 3x+1 page.