Null Program

Christopher Wellons
mosquitopsu@gmail.com
Public GPG Key

Chris Wellons


Post Index - see a list of all of the entries.


RSS Logo RSS Feed




Hacker Logo

I am a computer engineer working at the Johns Hopkins University Applied Physics Laboratory. I love computers and free software.

Most of the topics you will find here are about hobby computing and programming with free software (and a few exceptions).

E-mail me: mosquitopsu@gmail.com


My Git Repositories


Projects:
- Brainfuck Compiler
- PNG Archiver
- BINI Tools
- Parallel Mandelbrot Generator


Archives:
September 2007
October 2007
November 2007
December 2007
January 2008
February 2008
March 2008
April 2008
June 2008
July 2008
August 2008
September 2008


This site is powered by blosxom and DreamHost.

29 Jan 2008

The 3n + 1 Conjecture

The 3n + 1 conjecture, also known as the Collatz conjecture, is based around this recursive function,

Collatz function LaTeX screenshot

The conjecture is this,

This process will eventually reach the number 1, regardless of which positive integer is chosen initially.

The way I am defining this may not be entirely accurate, as I took a shortcut to make it a bit simpler. I am not a mathematician (IANAM) -- but sometimes I pretend to be one. For a really solid definition, click through to the Wikipedia article in the link above.

A sample run, starting at 7, would look like this: 7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1. The sequence starting at 7 contains 17 numbers. So 7 has a cycle-length of 17. Currently, there is no known positive integer that does not eventually lead to 1. If the conjecture is true, then none exists to be found.

I first found out about the problem when I saw it on UVa Online Judge. UVa Online Judge is a system that has a couple thousand programming problems to do. Users can submit solution programs written in C, C++, Java, or Pascal. For normal submissions, the fastest program wins.

Anyway, the way UVa Online Judge runs this problem is by providing the solution program pairs of integers on stdin as text. The integers define an inclusive range of integers over which the program must return the length of the longest Collatz cycle-length for all the integers inside that range. They don't tell you which ranges they are checking, except that all integers will be less than 1,000,000 and the sequences will never overflow a 32-bit integer (allowing shortcuts to be made to increase performance).

The simple approach would be defining a function that returns the cycle length (Lua programming language),

function collatz_len (n)
   local c = 1
   
   while n > 1 do
      c = c + 1
      if math.mod(n, 2) == 0 then
	 n = n / 2
      else
	 n = 3 * n + 1
      end
   end
   
   return c
end

Then we have a function check over a range (assuming n <= m here),

function check_range (n, m)
   local largest = 0
   
   for i = n, m do
      local len = collatz_len (i)

      if len > largest then
	 largest = len
      end
      
   end
   
   return largest
end

And top it off with the i/o. (I am just learning Lua, so I hope I did this part properly!)

while not io.stdin.eof do
   n, m = io.stdin:read("*number", "*number")
   
   -- check for eof
   if n == nil or m == nil then
      break
   end
   
   print (n .. " " .. m .. " " .. check_range(n, m))
end

Notice anything extremely inefficient? We are doing the same work over and over again! Take, for example, this range: 7, 22. When we start with 7, we get the sequence shown above: 7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1. Eight of these numbers are part of the range that we are looking at. When we get up to 22, we are going to walk down the same range again, less the 7. To make things more efficient, we apply some dynamic programming and store previous calculated cycle-lengths in an array. Once we get to a value we already calculated, we just look it up.

I used dynamic programming in my submission, which I wrote up in C. You can grab my source here. It fills in a large array (1000000 entries) as values are found, so no cycle-length is calculated twice. When I submitted this program, it ranked 60 out of about 300,000 entries. There are probably a number of tweaks that can increase performance, such as increasing the size of the array, but I didn't care much about inching closer to the top. I would bet that the very top entries did some trial-and-error and determined what ranges are tested, using the results to seed their program accordingly. You could take my code and submit it yourself, but that wouldn't be very honest, would it?

So why am I going through all of this describing such a simple problem? Well, it is because of this neat feature of Lua that applies well to this problem. Lua is kind of like Lisp. In Lisp, everything is a list ("list processing" --> Lisp). In Lua, (almost) everything is an associative array (Maybe they should have called it Assp? Or Hashp? I am kidding.) An object is a hash with fields containing function references. There is even some syntactic sugar to help this along.

The cool thing is that we can create a hash with default entries that reference a function that calculates the Collatz cycle-length of its key. Once the cycle-length is calculated, the function reference is replaced with the value, so the function is never called again from that point. The function only actually determines the next integer, then references the hash to get the cycle-length of that next integer.

Now this hash looks like it is infinitely large. This is really a form of lazy evaluation: no values are calculated until they are needed (this is one of my favorite things about Haskell). We don't need to explicitly ask for it to be calculated, either. We just go along looking up values in the array as if they were always there. Here is how you do it,

collatz_len = { 1 }

setmetatable (collatz_len, {
   __index = function (name, n)
      if (math.mod (n, 2) == 0) then
         name[n] = name[n/2] + 1;
      else
         name[n] = name[3 * n + 1] + 1;
      end
         return name[n] 
   end
})

So we replace the collatz_len function with this array (and replace the call to an array reference) and we have applied dynamic programming to our old program. If I run the two programs with this sample input,

10 1000
1000 3000
300 500

and look at average running times, the dynamic programming version runs 87% faster than the original.

One problem with this, though, is the use of recursion. In Lua, it is really easy to hit recursion limits. For example, accessing element 10000 will cause the program to crash. This will probably get fixed someday, or in some implementation of Lua.

I thought there might be a way to do this in Perl, by changing the default hash value from undef to something else, but I was mildly disappointed to find out that this is not true.

Here is the source for the original program and the one with dynamic programming (BSD licenced): collatz_simple.lua and collatz.lua

[] permanent link


20 Jan 2008

Optimizing, Multi-threading Brainfuck to C Converter

wbf2c converts an esoteric programming language called brainfuck into C code which can be machine compiled. Several optimizations are done to make the resulting program extremely fast an efficient. The converter supports both a static (standard 30,000 cells) array or a dynamically-sized array. It also supports many different cell types, from the standard char to multi-precision cells using GMP.

The converter can also run several brainfuck programs on the same memory array at once by running each one in a thread. To make sure each brainfuck operation is atomic, each cell gets a mutex lock. The only other multi-threading brainfuck implementation I know of is Brainfork.

For an example of some brainfuck code I wrote,

+>+<
[
[->>+>+<<<]>>>
[-<<<+>>>]<<

[->+>+<<]>>
[-<<+>>]<<
]

This program fills the memory with the Fibonacci series. Make sure you use the dynamically sized array, along with the bignum cell type. After two or three seconds of running, my laptop (unmodified Dell Inspiron 1000) can calculate and spit out a 140MB text file containing the first 50,000 numbers in the series. I used the -d dump option to see this output.

Download information, as well as some more examples, including a multi-threaded one, are on the project website.

[] permanent link


08 Jan 2008

Some News

Sorry for the lack of new material. I was apartment hunting with my fiancee (the most beautiful and wonderful girl in the world) this past weekend in the Baltimore area. I found a nice place in Columbia. Tomorrow morning I am getting some oral surgery done and will be hospitalized for awhile. So nothing new here for possibly a couple of weeks.

However, once I have recovered, I will probably make a post about Lua, another about a robot game I wrote this past summer, and another with pictures of my tournament-winning robot from last semester.

If you are reading this post months from now and there has been no new material, I guess you can assume something went bad with the surgery (unlikely). According to this book I have called Scam Proof Your Life (from the AARP), preventable medical errors in US hospitals results in 195,000 deaths. I am not worried, however.

[] permanent link


Don't stop here! This isn't everything. Check out the archives (on the left) for more posts. Or just have a look at the index.