The Terras Theorem
One of the most important results on the 3x+1 problem is the Terras Theorem,
originally stated by Riho Terras. Because it is such an important theorem
for the 3x+1 problem this page is dedicated to sketching its approach and ideas.
While some minimal mathematical background is probably necessary to follow this page
none of the proofs require detailed knowledge of advanced techniques or deeper
theorems.
Although all lemmas are proven and no steps are skipped or ignored the purpose of
this page is to introduce the Terras Theorem to those new to the 3x+1 problem.
Please note therefore that the emphasis is on making the theorem comprehensible
rather than on 100% exact mathematics (but see this note
for a more exact version).
The simplest way to summarize the theorem goes like this:
Almost all numbers have finite Glide.
This is too simple really. Before we get back to a more exact version of
the theorem let us first redefine the 3x+1 algorithm as follows:
- 3x + 1 algorithm, restated
- For any positive integer N define a sequence where
S0 = N
and for all i > 0
Si = Si-1 / 2 if Si is even
Si = ( Si-1 * 3 + 1 ) / 2 if Si is odd
And we will call the smallest value of i for which
Si < S0 the Stopping time of N
- Example :
- Let N = 17; then S0 = 17, S1 = 26, S2 = 13,
S3 = 20, S4 = 10, S5 = 5, etc.
And since S2 is the first term < 17 the stopping time of 17 is 2.
The Terras theorem in simple words then becomes:
Almost all numbers have finite Stopping Time.
Note that this redefinition has no consequences for the convergence of the series.
Next we define a parity vector for each number as follows:
- Parity vector
- For any number N the parity vector v(N) is defined as
vi(N) = Si mod 2
- Example :
- For N = 17 we find v0 = 1, v1 = 0, v2 = 1,
v3 = 0, v4 = 0, v5 = 1, etc.
The parity vector completely describes the way the 3x+1 iterates of N behave,
since the numbers correspond directly to the criterion
upon which the 3x + 1 algorithm is based.
We now state a simple lemma to increase our understanding of the parity vector :
- Lemma 1 :
- If N is a positive integer of the form a . 2k + b (b < 2k)
then the first k elements of the parity vector are dependent on b only
- Proof :
- The proof is by induction on k.
For k = 1 we have N = 2a + b with b either 0 or 1
and the lemma is true by the definition of the 3x + 1 algorithm
Now assume the lemma is true for some k
Let N = a . 2k+1 + b (b < 2k+1)
Then if S0 = N is even S1 = a . 2k + b / 2
and if S0 = N is odd S1 = 3a . 2k + (3b+1) / 2
In either case the first k elements of v(S1) depend on b only
Therefore the first k+1 elements of N are independent of a as well
This concludes the proof of the lemma.
- Example :
- Let N = 17 = 0 . 26 + 17. The first 6 elements of the parity vector
are (1,0,1,0,0,1). From Lemma 1 this is true for every number
of the form a . 26 + 17. Therefore these are also the first 6 elements
of the parity vector of 81, 145, 209, 273, etc.
Another lemma about parity vectors is :
- Lemma 2 :
- Assume wi (0 ≤ i < k) is a parity vector of length k
Then some number N exists for which vi(N) = wi (0 ≤ i < k)
- Proof :
- The proof is again by induction on k.
For k = 1 we have w0 is either 0 or 1
and N is either any even or odd number
Now assume the lemma is true for some k
Let wi (0 ≤ i < k+1 ) be some parity vector of length k+1
Then some number N exists for which vi(N) = wi (0 ≤ i < k)
From Lemma 1 we can write N as a . 2k + b (b < 2k)
Now take two numbers M1 and M2, with M1 = a . 2k+1 + b
and M2 = a . 2k+1 + 2k + b
For both M1 and M2 we automatically have
vi(M1) = vi(M2) = wi (0 ≤ i < k)
Now take S0 = M1 and let us call Sk(M1) = X
From the definition of the algorithm we have Sk(M2) = Y = X + 3n
with n the number of odd elements of wi from 0 to k
Since 3n is always odd we see that X and Y are of different parity
Therefore wk+1(M1) ≠ wk+1(M2)
which means that for either wk+1 = 0 or wk+1 = 1
a number M must exist for which vi(M) = wi (0 ≤ i < k+1)
This concludes the proof of the lemma.
- Example :
- Take the parity vector (1,0,1,0,0,1,x) with x either 0 or 1. For N = 17 these
are the first 6 elements of its parity vector. With S0 = N = 17
we find S6 = 8. So for N = 17 the first 7 elements of its
parity vector are (1,0,1,0,0,1,0).
If we take S0 = N = 17 + 26 = 81 then from Lemma 2
we know that the first 7 elements of its parity vector must be (1,0,1,0,0,1,1)
and therefore that S6 must be odd, and indeed S6 = 35.
Note that the parity vector contains three 1's in its first 6 elements
and that S6 is also necessarily equal to 8 + 33 = 8 + 27 = 35.
The next step is to define a parity number of order k:
- Parity number
- For any number N the parity numbers Pk (k ≥ 1) are defined as
Pk = ∑i<k (vi . 2i)
- Example :
- For N = 17 we find P1 = 1, P2 = 1, P3 = 5,
P4 = 5, P5 = 5, P6 = 37, etc.
Note that the parity number of order k is really nothing else than the first
k elements of the parity vector written in reverse order and interpreted
as a binary number. Thus the parity vector of 17 starts as (1,0,1,0,0,1)
and therefore P6 = 37, where 37 written in binary is 100101.
Note that for any parity number of order k
we have 0 ≤ Pk < 2k
We now reach an important Lemma:
- Lemma 3 :
- Let M and N be positive integers.
Then Pk(M) = Pk(N) if and only if M ≡ N mod 2k
- Proof :
- • From M ≡ N mod 2k it follows that both can be written as
M = x . 2k + b and N = y . 2k + b, with b identical for both M and N
From Lemma 1 the first k elements of v depend on b only
Therefore vi(M) = vi(N) for 0 ≤ i < k
and so Pk(M) = Pk(N)
• Let Pk(M) = Pk(N),
and automatically vi(M) = vi(N) for 0 ≤ i < k
Assume that M and N are congruent b1 and b2 mod 2k respectively
Then we have two numbers a . 2k + b1 and a . 2k + b2
which both give rise to the same parity vector of length k
By Lemma 2 however for every parity vector w of length k a number Q exists
for which vi(Q) = wi (0 ≤ i < k)
Since there are 2k different vectors of length k and since there are also
2k different congruence classes mod 2k it follows that b1 = b2
This completes the proof of Lemma 3
From Lemma 3 we can derive that every congruence class mod 2k uniquely
maps to a parity number, with both numbers lying between 0 and 2k - 1.
This function can therefore be seen as a permutation function.
- Example :
- Taking k=3 we find that 8x + 0 maps to parity number 0, 8x+1 maps to 5, etc.
Written more concisely the numbers 8x + r (r=0, 1, 2, 3, 4, 5, 6, 7) map to
0, 5, 2, 3, 4, 1, 6, 7 respectively. We see that 1 maps to 5 and vice versa,
but all other numbers map to themselves. Written even shorter in standard
permutation notation the mapping for k=3 can be described by (1,5).
Similarly the mapping for k=4 can be written as (1,5) (2,10) (9,13).
From the definition of the parity vector it should be clear that the elements
describe the behaviour of the 3x + 1 sequence. In fact we can give a pretty good
approximation for Sk from Vi (0 ≤ i < k).
This is stated more explicitly in Lemma 4.
- Lemma 4 :
- Let S0 = N be a positive integer and vi its parity vector.
Let d(a,b) = ∑ vi (a ≤ i < b)
Specifically, let d(k) be a shorthand for d(0,k)
Then Sk ≈ Tk = S0 . 3d(k) . 2-k
(Or, in logarithmic form: ln(Sk) ≈ ln(Tk) =
ln(S0) + d(k).ln(3) - k.ln(2)) and lim (Sk - Tk)
/ Sk = 0 for Sk → ∞
- Proof :
- Since every element of vi represents a (3x + 1) / 2 multiplication step
and every zero represents a division step we immediately see that
Si+1 = (3Si+1)/2 if vi is 1
and Si+1 = Si/2 if vi is 0
Thus Sk = Tk + Rk where Rk is equal to
Rk = ∑ vi . 3d(i+1,k) / 2k-i (0 ≤ i < k)
Since lim (Rk / Sk) = 0 for Sk → ∞ this proves Lemma 4
- Example :
- Taking N = S0 = 17 and k = 5 we saw that S5 = 5.
We find that T5 = 17 . 32 . 2-5 = 4.78125
which is less then 5% off the real value. For larger numbers the results are much closer,
for instance taking N = 1,234,567 results in S5 = 3,125,000 and
T5 = N . 34 . 2-5 = 3,124,997.7
where the difference between the two numbers is already less than 10-6 . S5
Note that the factor Rk is non-negative, implying that Tk ≤ Sk
Now that we have a fair way of estimating Sk from v(N) we can use v(N) to get a
good indication of the convergence of N. To this end we introduce a new definition.
- Convergent and Divergent vectors
- Let vi (0 ≤ i < k) be a parity vector of length k
Let d(j) = ∑ vi (0 ≤ i < j)
and define c(j) as d(j).ln(3) - j.ln(2) for any j < k
Then we call v convergent if c(j) < 0 for any j < k
and the smallest value j for which c(j) < 0 is called the convergence time of v,
or, generally, the convergence time of any N having such a parity vector
Finally we call v divergent if such a value of j does not exist
It should be clear by Lemma 4 that a necessary though not quite sufficient condition
for N to have a stopping time k is that its parity vector has Convergence Time k.
We are now getting closer to the real Terras theorem.
The next lemma takes a big step towards the final goal.
- Lemma 5 :
- Let vi be a parity vector of length k with convergence time k
and let M be the set of integers having parity vector v
Then all sufficiently large elements N from M have stopping time k
- Proof :
- First we note that Lemma 2 asserts that some N exist, so M is not empty
By Lemma 3 we know that all elements from M fall into a single congruence class mod 2k
We can easily see from Lemma 4 that
S0 < Ti ≤ Si (0 ≤ i < k-1)
Therefore N must have a stopping time of at least k
Now the limit ( Sk ) / S0 for S0 → ∞
is equal to (S0 . ec(k) + Rk) / S0 = ec(k)
Since c(k) < 0 obviously ec(k) < 1 which proves the lemma.
- Example :
- All numbers ≡ 3 mod 16 (x ≥ 0) have a parity vector v starting (1,1,0,0).
The convergence time of v is 4. Therefore all sufficiently large numbers
≡ 3 mod 16 have stopping time 4. The smallest such number, 3,
is apparently sufficiently large enough already, since the stopping time of 3 is indeed 4.
Lemma 5 effectively states that only finitely many numbers with convergence
time k do not have stopping time k. In actual fact no number is known
with this property other than the trivial case where N = 1. Since such a number
should (by Lemma 5) be relatively small it is generally believed that
no such number exists. In other words:
- Conjecture :
- For all N > 1 the convergence time is equal to the stopping time
We have one more lemma to go before returning to the Terras Theorem proper.
But this lemma takes us really close to our final destination.
- Lemma 6 :
- Let Vk be the set of all parity vectors v of length k.
Let Wk be the subset of V consisting of all divergent elements of V.
Then (taking |C| to be the number of elements of C)
lim k → ∞ |Wk| / |Vk| = 0.
- Proof :
- From the definition of convergent vectors it should be clear that
c(k) < 0 is a sufficient though not necessary condition for convergence.
Since the distribution of 1's and 0's in all parity vectors can be described
by the binomial distribution, the limit above is obviously bounded by
lim k → ∞ ∑ 2-k . k! / i!(k-i)! for 0 ≤ i < kα
where α = ( 1 - ln(2) / ln(3) )
This limit simply represents a part of the binomial distribution, and it is
well known that the limit is equal to zero for any α less than ½.
Since ln(2) / ln(3) > ½ the condition is fulfilled and the Lemma is proven.
- Example :
The table below depicts both the number of divergent parity vectors and the
relative number of those vectors for a number of values of k.
Although the number of divergent parity vectors increases rapidly their relative
number can be seen to drop substantially even for fairly modest values of k.
For instance for k=70 the fraction of divergent vectors is approximately 0.001, which
implies that 99.9% of all parity vectors with length 70 are convergent.
k | |Wk| | |Wk| / |Vk| |
1 | 1 | 0.5000 |
2 | 1 | 0.2500 |
3 | 2 | 0.2500 |
4 | 3 | 0.1875 |
5 | 4 | 0.1250 |
6 | 8 | 0.1250 |
7 | 13 | 0.1016 |
8 | 19 | 0.0742 |
9 | 38 | 0.0742 |
10 | 64 | 0.0625 |
|
k | |Wk| | |Wk| / |Vk| |
11 | 128 | 0.0625 |
12 | 226 | 0.0552 |
13 | 367 | 0.0448 |
14 | 734 | 0.0448 |
15 | 1295 | 0.0395 |
16 | 2114 | 0.0323 |
17 | 4228 | 0.0323 |
18 | 7495 | 0.0286 |
19 | 14990 | 0.0286 |
20 | 27328 | 0.0261 |
|
k | |Wk| | |Wk| / |Vk| |
30 | 1.28e+007 | 0.01189418 |
40 | 6.40e+009 | 0.00582334 |
50 | 3.73e+012 | 0.00331669 |
60 | 2.22e+015 | 0.00192219 |
70 | 1.24e+018 | 0.00105159 |
80 | 8.03e+020 | 0.00066440 |
90 | 5.09e+023 | 0.00041078 |
100 | 3.03e+026 | 0.00023868 |
110 | 2.03e+029 | 0.00015662 |
120 | 1.31e+032 | 0.00009879 |
|
k | |Wk| | |Wk| / |Vk| |
130 | 8.11e+034 | 0.00005957 |
140 | 5.52e+037 | 0.00003963 |
150 | 3.67e+040 | 0.00002569 |
160 | 2.52e+043 | 0.00001726 |
170 | 1.64e+046 | 0.00001093 |
180 | 1.13e+049 | 0.00000736 |
190 | 7.69e+051 | 0.00000490 |
200 | 4.92e+054 | 0.00000306 |
250 | 7.41e+068 | 0.00000041 |
300 | 1.11e+083 | 0.00000005 |
|
We finally return to the Terras theorem itself.
Stated more accurately than above the essence of the theorem is that the
fraction of numbers within a given interval that can not be shown to have
a finite stopping time tends to zero as the interval expands. In other words :
- Theorem (Terras) :
- Let M be a positive integer.
Let D(M) be the fraction of numbers < M that do not have finite stopping time.
Then the limit of D(M) for M → ∞ is equal to 0.
- Proof :
- First we establish that from Lemma 5 only finitely many numbers
have a stopping time different from their convergence time.
Now partition the numbers into congruence classes mod 2k
From Lemma 3 every number ≡ m mod 2k maps to a unique parity number n
( 0 ≤ n < 2k ) and therefore to a unique parity vector of length k.
But Lemma 6 proved that the fraction of divergent parity vectors of length k
for k → ∞ is zero.
On considering numbers N < 2k that do not have stopping time ≤ k
it follows that lim k → ∞ D(2k) = 0.
This concludes the proof of the Theorem.
Note : The above version of the Terras Theorem is a simplified version.
It can be improved in several ways, e.g. by stating exactly 'how fast'
the limit goes to zero. A more formal and more complete proof can be found
in this excellent standard
overview of the 3x+1 problem by Jeffrey Lagarias.