The Terras Theorem
One of the most important results on the 3x+1 problem is the Terras Theorem,
originally stated by Riho Terras. Because it is such an important theorem
for the 3x+1 problem this page is dedicated to sketching its approach and ideas.
While some minimal mathematical background is probably necessary to follow this page
none of the proofs require detailed knowledge of advanced techniques or deeper
theorems.
Although all lemmas are proven and no steps are skipped or ignored the purpose of
this page is to introduce the Terras Theorem to those new to the 3x+1 problem.
Please note therefore that the emphasis is on making the theorem comprehensible
rather than on 100% exact mathematics (but see this note
for a more exact version).
The simplest way to summarize the theorem goes like this:
Almost all numbers have finite Glide.
This is too simple really. Before we get back to a more exact version of
the theorem let us first redefine the 3x+1 algorithm as follows:
 3x + 1 algorithm, restated
 For any positive integer N define a sequence where
S_{0} = N
and for all i > 0
S_{i} = S_{i1} / 2 if S_{i} is even
S_{i} = ( S_{i1} * 3 + 1 ) / 2 if S_{i} is odd
And we will call the smallest value of i for which
S_{i} < S_{0} the Stopping time of N
 Example :
 Let N = 17; then S_{0} = 17, S_{1} = 26, S_{2} = 13,
S_{3} = 20, S_{4} = 10, S_{5} = 5, etc.
And since S_{2} is the first term < 17 the stopping time of 17 is 2.
The Terras theorem in simple words then becomes:
Almost all numbers have finite Stopping Time.
Note that this redefinition has no consequences for the convergence of the series.
Next we define a parity vector for each number as follows:
 Parity vector
 For any number N the parity vector v(N) is defined as
v_{i}(N) = S_{i} mod 2
 Example :
 For N = 17 we find v_{0} = 1, v_{1} = 0, v_{2} = 1,
v_{3} = 0, v_{4} = 0, v_{5} = 1, etc.
The parity vector completely describes the way the 3x+1 iterates of N behave,
since the numbers correspond directly to the criterion
upon which the 3x + 1 algorithm is based.
We now state a simple lemma to increase our understanding of the parity vector :
 Lemma 1 :
 If N is a positive integer of the form a . 2^{k} + b (b < 2^{k})
then the first k elements of the parity vector are dependent on b only
 Proof :
 The proof is by induction on k.
For k = 1 we have N = 2a + b with b either 0 or 1
and the lemma is true by the definition of the 3x + 1 algorithm
Now assume the lemma is true for some k
Let N = a . 2^{k+1} + b (b < 2^{k+1})
Then if S_{0} = N is even S_{1} = a . 2^{k} + b / 2
and if S_{0} = N is odd S_{1} = 3a . 2^{k} + (3b+1) / 2
In either case the first k elements of v(S_{1}) depend on b only
Therefore the first k+1 elements of N are independent of a as well
This concludes the proof of the lemma.
 Example :
 Let N = 17 = 0 . 2^{6} + 17. The first 6 elements of the parity vector
are (1,0,1,0,0,1). From Lemma 1 this is true for every number
of the form a . 2^{6} + 17. Therefore these are also the first 6 elements
of the parity vector of 81, 145, 209, 273, etc.
Another lemma about parity vectors is :
 Lemma 2 :
 Assume w_{i} (0 ≤ i < k) is a parity vector of length k
Then some number N exists for which v_{i}(N) = w_{i} (0 ≤ i < k)
 Proof :
 The proof is again by induction on k.
For k = 1 we have w_{0} is either 0 or 1
and N is either any even or odd number
Now assume the lemma is true for some k
Let w_{i} (0 ≤ i < k+1 ) be some parity vector of length k+1
Then some number N exists for which v_{i}(N) = w_{i} (0 ≤ i < k)
From Lemma 1 we can write N as a . 2^{k} + b (b < 2^{k})
Now take two numbers M_{1} and M_{2}, with M_{1} = a . 2^{k+1} + b
and M_{2} = a . 2^{k+1} + 2^{k} + b
For both M_{1} and M_{2} we automatically have
v_{i}(M_{1}) = v_{i}(M_{2}) = w_{i} (0 ≤ i < k)
Now take S_{0} = M_{1} and let us call S_{k}(M_{1}) = X
From the definition of the algorithm we have S_{k}(M_{2}) = Y = X + 3^{n}
with n the number of odd elements of w_{i} from 0 to k
Since 3^{n} is always odd we see that X and Y are of different parity
Therefore w_{k+1}(M_{1}) ≠ w_{k+1}(M_{2})
which means that for either w_{k+1} = 0 or w_{k+1} = 1
a number M must exist for which v_{i}(M) = w_{i} (0 ≤ i < k+1)
This concludes the proof of the lemma.
 Example :
 Take the parity vector (1,0,1,0,0,1,x) with x either 0 or 1. For N = 17 these
are the first 6 elements of its parity vector. With S_{0} = N = 17
we find S_{6} = 8. So for N = 17 the first 7 elements of its
parity vector are (1,0,1,0,0,1,0).
If we take S_{0} = N = 17 + 2^{6} = 81 then from Lemma 2
we know that the first 7 elements of its parity vector must be (1,0,1,0,0,1,1)
and therefore that S_{6} must be odd, and indeed S_{6} = 35.
Note that the parity vector contains three 1's in its first 6 elements
and that S_{6} is also necessarily equal to 8 + 3^{3} = 8 + 27 = 35.
The next step is to define a parity number of order k:
 Parity number
 For any number N the parity numbers P_{k} (k ≥ 1) are defined as
P_{k} = ∑_{i<k} (v_{i} . 2^{i})
 Example :
 For N = 17 we find P_{1} = 1, P_{2} = 1, P_{3} = 5,
P_{4} = 5, P_{5} = 5, P_{6} = 37, etc.
Note that the parity number of order k is really nothing else than the first
k elements of the parity vector written in reverse order and interpreted
as a binary number. Thus the parity vector of 17 starts as (1,0,1,0,0,1)
and therefore P_{6} = 37, where 37 written in binary is 100101.
Note that for any parity number of order k
we have 0 ≤ P_{k} < 2^{k}
We now reach an important Lemma:
 Lemma 3 :
 Let M and N be positive integers.
Then P_{k}(M) = P_{k}(N) if and only if M ≡ N mod 2^{k}
 Proof :
 • From M ≡ N mod 2^{k} it follows that both can be written as
M = x . 2^{k} + b and N = y . 2^{k} + b, with b identical for both M and N
From Lemma 1 the first k elements of v depend on b only
Therefore v_{i}(M) = v_{i}(N) for 0 ≤ i < k
and so P_{k}(M) = P_{k}(N)
• Let P_{k}(M) = P_{k}(N),
and automatically v_{i}(M) = v_{i}(N) for 0 ≤ i < k
Assume that M and N are congruent b1 and b2 mod 2^{k} respectively
Then we have two numbers a . 2^{k} + b1 and a . 2^{k} + b2
which both give rise to the same parity vector of length k
By Lemma 2 however for every parity vector w of length k a number Q exists
for which v_{i}(Q) = w_{i} (0 ≤ i < k)
Since there are 2^{k} different vectors of length k and since there are also
2^{k} different congruence classes mod 2^{k} it follows that b1 = b2
This completes the proof of Lemma 3
From Lemma 3 we can derive that every congruence class mod 2^{k} uniquely
maps to a parity number, with both numbers lying between 0 and 2^{k}  1.
This function can therefore be seen as a permutation function.
 Example :
 Taking k=3 we find that 8x + 0 maps to parity number 0, 8x+1 maps to 5, etc.
Written more concisely the numbers 8x + r (r=0, 1, 2, 3, 4, 5, 6, 7) map to
0, 5, 2, 3, 4, 1, 6, 7 respectively. We see that 1 maps to 5 and vice versa,
but all other numbers map to themselves. Written even shorter in standard
permutation notation the mapping for k=3 can be described by (1,5).
Similarly the mapping for k=4 can be written as (1,5) (2,10) (9,13).
From the definition of the parity vector it should be clear that the elements
describe the behaviour of the 3x + 1 sequence. In fact we can give a pretty good
approximation for S_{k} from V_{i} (0 ≤ i < k).
This is stated more explicitly in Lemma 4.
 Lemma 4 :
 Let S_{0} = N be a positive integer and v_{i} its parity vector.
Let d(a,b) = ∑ v_{i} (a ≤ i < b)
Specifically, let d(k) be a shorthand for d(0,k)
Then S_{k} ≈ T_{k} = S_{0} . 3^{d(k)} . 2^{k}
(Or, in logarithmic form: ln(S_{k}) ≈ ln(T_{k}) =
ln(S_{0}) + d(k).ln(3)  k.ln(2)) and lim (S_{k}  T_{k})
/ S_{k} = 0 for S_{k} → ∞
 Proof :
 Since every element of v_{i} represents a (3x + 1) / 2 multiplication step
and every zero represents a division step we immediately see that
S_{i+1} = (3S_{i}+1)/2 if v_{i} is 1
and S_{i+1} = S_{i}/2 if v_{i} is 0
Thus S_{k} = T_{k} + R_{k} where R_{k} is equal to
R_{k} = ∑ v_{i} . 3^{d(i+1,k)} / 2^{ki} (0 ≤ i < k)
Since lim (R_{k} / S_{k}) = 0 for S_{k} → ∞ this proves Lemma 4
 Example :
 Taking N = S_{0} = 17 and k = 5 we saw that S_{5} = 5.
We find that T_{5} = 17 . 3^{2} . 2^{5} = 4.78125
which is less then 5% off the real value. For larger numbers the results are much closer,
for instance taking N = 1,234,567 results in S_{5} = 3,125,000 and
T_{5} = N . 3^{4} . 2^{5} = 3,124,997.7
where the difference between the two numbers is already less than 10^{6} . S_{5}
Note that the factor R_{k} is nonnegative, implying that T_{k} ≤ S_{k}
Now that we have a fair way of estimating S_{k} from v(N) we can use v(N) to get a
good indication of the convergence of N. To this end we introduce a new definition.
 Convergent and Divergent vectors
 Let v_{i} (0 ≤ i < k) be a parity vector of length k
Let d(j) = ∑ v_{i} (0 ≤ i < j)
and define c(j) as d(j).ln(3)  j.ln(2) for any j < k
Then we call v convergent if c(j) < 0 for any j < k
and the smallest value j for which c(j) < 0 is called the convergence time of v,
or, generally, the convergence time of any N having such a parity vector
Finally we call v divergent if such a value of j does not exist
It should be clear by Lemma 4 that a necessary though not quite sufficient condition
for N to have a stopping time k is that its parity vector has Convergence Time k.
We are now getting closer to the real Terras theorem.
The next lemma takes a big step towards the final goal.
 Lemma 5 :
 Let v_{i} be a parity vector of length k with convergence time k
and let M be the set of integers having parity vector v
Then all sufficiently large elements N from M have stopping time k
 Proof :
 First we note that Lemma 2 asserts that some N exist, so M is not empty
By Lemma 3 we know that all elements from M fall into a single congruence class mod 2^{k}
We can easily see from Lemma 4 that
S_{0} < T_{i} ≤ S_{i} (0 ≤ i < k1)
Therefore N must have a stopping time of at least k
Now the limit ( S_{k} ) / S_{0} for S_{0} → ∞
is equal to (S_{0} . e^{c(k)} + R_{k}) / S_{0} = e^{c(k)}
Since c(k) < 0 obviously e^{c(k)} < 1 which proves the lemma.
 Example :
 All numbers ≡ 3 mod 16 (x ≥ 0) have a parity vector v starting (1,1,0,0).
The convergence time of v is 4. Therefore all sufficiently large numbers
≡ 3 mod 16 have stopping time 4. The smallest such number, 3,
is apparently sufficiently large enough already, since the stopping time of 3 is indeed 4.
Lemma 5 effectively states that only finitely many numbers with convergence
time k do not have stopping time k. In actual fact no number is known
with this property other than the trivial case where N = 1. Since such a number
should (by Lemma 5) be relatively small it is generally believed that
no such number exists. In other words:
 Conjecture :
 For all N > 1 the convergence time is equal to the stopping time
We have one more lemma to go before returning to the Terras Theorem proper.
But this lemma takes us really close to our final destination.
 Lemma 6 :
 Let V_{k} be the set of all parity vectors v of length k.
Let W_{k} be the subset of V consisting of all divergent elements of V.
Then (taking C to be the number of elements of C)
lim k → ∞ W_{k} / V_{k} = 0.
 Proof :
 From the definition of convergent vectors it should be clear that
c(k) < 0 is a sufficient though not necessary condition for convergence.
Since the distribution of 1's and 0's in all parity vectors can be described
by the binomial distribution, the limit above is obviously bounded by
lim k → ∞ ∑ 2^{k} . k! / i!(ki)! for 0 ≤ i < kα
where α = ( 1  ln(2) / ln(3) )
This limit simply represents a part of the binomial distribution, and it is
well known that the limit is equal to zero for any α less than ½.
Since ln(2) / ln(3) > ½ the condition is fulfilled and the Lemma is proven.
 Example :
The table below depicts both the number of divergent parity vectors and the
relative number of those vectors for a number of values of k.
Although the number of divergent parity vectors increases rapidly their relative
number can be seen to drop substantially even for fairly modest values of k.
For instance for k=70 the fraction of divergent vectors is approximately 0.001, which
implies that 99.9% of all parity vectors with length 70 are convergent.
k  W_{k}  W_{k} / V_{k} 
1  1  0.5000 
2  1  0.2500 
3  2  0.2500 
4  3  0.1875 
5  4  0.1250 
6  8  0.1250 
7  13  0.1016 
8  19  0.0742 
9  38  0.0742 
10  64  0.0625 

k  W_{k}  W_{k} / V_{k} 
11  128  0.0625 
12  226  0.0552 
13  367  0.0448 
14  734  0.0448 
15  1295  0.0395 
16  2114  0.0323 
17  4228  0.0323 
18  7495  0.0286 
19  14990  0.0286 
20  27328  0.0261 

k  W_{k}  W_{k} / V_{k} 
30  1.28e+007  0.01189418 
40  6.40e+009  0.00582334 
50  3.73e+012  0.00331669 
60  2.22e+015  0.00192219 
70  1.24e+018  0.00105159 
80  8.03e+020  0.00066440 
90  5.09e+023  0.00041078 
100  3.03e+026  0.00023868 
110  2.03e+029  0.00015662 
120  1.31e+032  0.00009879 

k  W_{k}  W_{k} / V_{k} 
130  8.11e+034  0.00005957 
140  5.52e+037  0.00003963 
150  3.67e+040  0.00002569 
160  2.52e+043  0.00001726 
170  1.64e+046  0.00001093 
180  1.13e+049  0.00000736 
190  7.69e+051  0.00000490 
200  4.92e+054  0.00000306 
250  7.41e+068  0.00000041 
300  1.11e+083  0.00000005 

We finally return to the Terras theorem itself.
Stated more accurately than above the essence of the theorem is that the
fraction of numbers within a given interval that can not be shown to have
a finite stopping time tends to zero as the interval expands. In other words :
 Theorem (Terras) :
 Let M be a positive integer.
Let D(M) be the fraction of numbers < M that do not have finite stopping time.
Then the limit of D(M) for M → ∞ is equal to 0.
 Proof :
 First we establish that from Lemma 5 only finitely many numbers
have a stopping time different from their convergence time.
Now partition the numbers into congruence classes mod 2^{k}
From Lemma 3 every number ≡ m mod 2^{k} maps to a unique parity number n
( 0 ≤ n < 2^{k} ) and therefore to a unique parity vector of length k.
But Lemma 6 proved that the fraction of divergent parity vectors of length k
for k → ∞ is zero.
On considering numbers N < 2^{k} that do not have stopping time ≤ k
it follows that lim k → ∞ D(2^{k}) = 0.
This concludes the proof of the Theorem.
Note : The above version of the Terras Theorem is a simplified version.
It can be improved in several ways, e.g. by stating exactly 'how fast'
the limit goes to zero. A more formal and more complete proof can be found
in this excellent standard
overview of the 3x+1 problem by Jeffrey Lagarias.