Christopher Wellons
mosquitopsu@gmail.com
Public GPG Key
Post Index - see a list of all of the entries.
I am a computer engineer working at the Johns Hopkins University Applied Physics Laboratory. I love computers and free software.
Most of the topics you will find here are about hobby computing and programming with free software (and a few exceptions).
E-mail me: mosquitopsu@gmail.com
Projects:
- Brainfuck Compiler
- PNG Archiver
- BINI Tools
-
Parallel Mandelbrot Generator
Archives:
September 2007
October 2007
November 2007
December 2007
January 2008
February 2008
March 2008
April 2008
June 2008
July 2008
August 2008
September 2008
The 3n + 1 conjecture, also known as the Collatz conjecture, is based around this recursive function,
The conjecture is this,
This process will eventually reach the number 1, regardless of which positive integer is chosen initially.
The way I am defining this may not be entirely accurate, as I took a shortcut to make it a bit simpler. I am not a mathematician (IANAM) -- but sometimes I pretend to be one. For a really solid definition, click through to the Wikipedia article in the link above.
A sample run, starting at 7, would look like this: 7, 22, 11,
34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1. The sequence
starting at 7 contains 17 numbers. So 7 has a cycle-length of
17. Currently, there is no known positive integer that does not
eventually lead to 1. If the conjecture is true, then none exists to
be found.
I first found out about the problem when I saw it on UVa Online Judge. UVa Online Judge is a system that has a couple thousand programming problems to do. Users can submit solution programs written in C, C++, Java, or Pascal. For normal submissions, the fastest program wins.
Anyway, the way UVa Online Judge runs this problem is by providing the
solution program pairs of integers on stdin as text. The
integers define an inclusive range of integers over which the program
must return the length of the longest Collatz cycle-length for all the
integers inside that range. They don't tell you which ranges they are
checking, except that all integers will be less than 1,000,000 and the
sequences will never overflow a 32-bit integer (allowing shortcuts to
be made to increase performance).
The simple approach would be defining a function that returns the cycle length (Lua programming language),
function collatz_len (n)
local c = 1
while n > 1 do
c = c + 1
if math.mod(n, 2) == 0 then
n = n / 2
else
n = 3 * n + 1
end
end
return c
end
Then we have a function check over a range (assuming n <= m here),
function check_range (n, m)
local largest = 0
for i = n, m do
local len = collatz_len (i)
if len > largest then
largest = len
end
end
return largest
end
And top it off with the i/o. (I am just learning Lua, so I hope I did this part properly!)
while not io.stdin.eof do
n, m = io.stdin:read("*number", "*number")
-- check for eof
if n == nil or m == nil then
break
end
print (n .. " " .. m .. " " .. check_range(n, m))
end
Notice anything extremely inefficient? We are doing the same work over
and over again! Take, for example, this range: 7, 22. When we start
with 7, we get the sequence shown above: 7, 22, 11, 34, 17, 52,
26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1. Eight of these numbers
are part of the range that we are looking at. When we get up to 22, we
are going to walk down the same range again, less the 7. To make
things more efficient, we apply
some
dynamic programming and store previous calculated cycle-lengths in
an array. Once we get to a value we already calculated, we just look
it up.
I used dynamic programming in my submission, which I wrote up in C. You can grab my source here. It fills in a large array (1000000 entries) as values are found, so no cycle-length is calculated twice. When I submitted this program, it ranked 60 out of about 300,000 entries. There are probably a number of tweaks that can increase performance, such as increasing the size of the array, but I didn't care much about inching closer to the top. I would bet that the very top entries did some trial-and-error and determined what ranges are tested, using the results to seed their program accordingly. You could take my code and submit it yourself, but that wouldn't be very honest, would it?
So why am I going through all of this describing such a simple problem? Well, it is because of this neat feature of Lua that applies well to this problem. Lua is kind of like Lisp. In Lisp, everything is a list ("list processing" --> Lisp). In Lua, (almost) everything is an associative array (Maybe they should have called it Assp? Or Hashp? I am kidding.) An object is a hash with fields containing function references. There is even some syntactic sugar to help this along.
The cool thing is that we can create a hash with default entries that reference a function that calculates the Collatz cycle-length of its key. Once the cycle-length is calculated, the function reference is replaced with the value, so the function is never called again from that point. The function only actually determines the next integer, then references the hash to get the cycle-length of that next integer.
Now this hash looks like it is infinitely large. This is really a form of lazy evaluation: no values are calculated until they are needed (this is one of my favorite things about Haskell). We don't need to explicitly ask for it to be calculated, either. We just go along looking up values in the array as if they were always there. Here is how you do it,
collatz_len = { 1 }
setmetatable (collatz_len, {
__index = function (name, n)
if (math.mod (n, 2) == 0) then
name[n] = name[n/2] + 1;
else
name[n] = name[3 * n + 1] + 1;
end
return name[n]
end
})
So we replace the collatz_len function with this array
(and replace the call to an array reference) and we have applied
dynamic programming to our old program. If I run the two programs with
this sample input,
10 1000 1000 3000 300 500
and look at average running times, the dynamic programming version runs 87% faster than the original.
One problem with this, though, is the use of recursion. In Lua, it is really easy to hit recursion limits. For example, accessing element 10000 will cause the program to crash. This will probably get fixed someday, or in some implementation of Lua.
I thought there might be a way to do this in Perl, by changing the
default hash value from undef to something else, but I
was mildly disappointed to find out that this is not true.
Here is the source for the original program and the one with dynamic programming (BSD licenced): collatz_simple.lua and collatz.lua
wbf2c converts an
esoteric programming language
called brainfuck
into C code which can be machine compiled. Several optimizations are
done to make the resulting program extremely fast an efficient. The
converter supports both a static (standard 30,000 cells) array or a
dynamically-sized array. It also supports many different cell types,
from the standard char to multi-precision cells
using GMP.
The converter can also run several brainfuck programs on the same memory array at once by running each one in a thread. To make sure each brainfuck operation is atomic, each cell gets a mutex lock. The only other multi-threading brainfuck implementation I know of is Brainfork.
For an example of some brainfuck code I wrote,
+>+< [ [->>+>+<<<]>>> [-<<<+>>>]<< [->+>+<<]>> [-<<+>>]<< ]
This program fills the memory with the Fibonacci series. Make sure you
use the dynamically sized array, along with the bignum cell
type. After two or three seconds of running, my laptop (unmodified
Dell Inspiron 1000) can calculate and spit out a 140MB text file
containing the first 50,000 numbers in the series. I used
the -d dump option to see this output.
Download information, as well as some more examples, including a multi-threaded one, are on the project website.
Sorry for the lack of new material. I was apartment hunting with my fiancee (the most beautiful and wonderful girl in the world) this past weekend in the Baltimore area. I found a nice place in Columbia. Tomorrow morning I am getting some oral surgery done and will be hospitalized for awhile. So nothing new here for possibly a couple of weeks.
However, once I have recovered, I will probably make a post about Lua, another about a robot game I wrote this past summer, and another with pictures of my tournament-winning robot from last semester.
If you are reading this post months from now and there has been no new material, I guess you can assume something went bad with the surgery (unlikely). According to this book I have called Scam Proof Your Life (from the AARP), preventable medical errors in US hospitals results in 195,000 deaths. I am not worried, however.
Don't stop here! This isn't everything. Check out the archives (on the left) for more posts. Or just have a look at the index.