Christopher Wellons
mosquitopsu@gmail.com
Public GPG Key
Post Index - see a list of all of the entries.
I am a computer engineer working at the Johns Hopkins University Applied Physics Laboratory. I love computers and free software.
Most of the topics you will find here are about hobby computing and programming with free software (and a few exceptions).
E-mail me: mosquitopsu@gmail.com
Projects:
- Brainfuck Compiler
- PNG Archiver
- BINI Tools
-
Parallel Mandelbrot Generator
Archives:
September 2007
October 2007
November 2007
December 2007
January 2008
February 2008
March 2008
April 2008
June 2008
July 2008
August 2008
September 2008
In December I made
a post
about a simple hash table I had written in C. I got an e-mail
from Benoit
Rouits (
http://brouits.free.fr/) pointing out a memory leak
in ht_destroy() that he found
with Valgrind. I had failed to free
the main array in the hashtable when it was being destroyed,
free (hashtable->arr);
I added the fix and updated my e-mail address in the source tarball, which is in the same location as before. There are no version numbers involved, so I guess this is version "now" and the old one is version "then", with names subject to change.
I have gotten several e-mails lately about using GNU Octave. One specifically was about blurring images in Octave. In response, I am writing this in-depth post to cover spatial filters, and how to use them in GNU Octave (a free implementation of the Matlab programming language). This should be the sort of information you would find near the beginning of an introductory digital image processing textbook, but written out more simply. In the future, I will probably be writing a post covering non-linear spatial and/or frequency domain filters in Octave.
If you want to follow along in Octave, I strongly recommend that you
upgrade to the new Octave 3.0. It is considered stable, but differs
significantly from Octave 2.1, which many people may be used to. You
will also need to install
the image
processing package
from Octave-Forge. To get
help with any Octave function, just type help
<function>.
The most common linear spatial image filtering involves convolving a filter mask, sometimes called a convolution kernel, over an image, which is a two-dimensional matrix. In the case of an RGB color image, the image is actually composed of three two-dimensional grayscale images, each representing a single color, where each is convolved with the filter mask separately.
Convolution is sliding a mask over an image. The new value at the mask's position is the sum of the value of each element of the mask multiplied by the value of the image at that position. For an example, let's start with 1-dimensional convolution. Define a mask,
5 3 2 4 8
The 2 is the anchor for the mask. Define an image,
0 0 1 2 1 0 0
As we convolve, the mask will extend beyond the image at the edges. One way to handle this is to pad the image with 0's. We start by placing the mask at the left edge. (zero-padding is underlined)
Mask: 5 3 2 4 8 Image: 0 0 0 0 1 2 1 0 0
The first output value is 8, as every other element of the mask is multiplied by zero.
Output: 8 x x x x x x
Now, slide the mask over by one position,
Mask: 5 3 2 4 8 Image: 0 0 0 1 2 1 0 0
The output here is 20, because 8*2 + 4*1 = 20;
Output: 8 20 x x x x x
If we continue sliding the mask along, the output becomes,
Output: 8 20 18 11 13 13 5
Here is the correlation done in Octave interactively,
(filter2() is the correlation function).
octave> filter2([5 3 2 4 8], [0 0 1 2 1 0 0])
ans =
8 20 18 11 13 13 5
The same thing happens in two-dimensional convolution, with the mask moving in the vertical direction as well, so that each element in the image is covered.
Sometimes you will hear this described as correlation
(Octave's filter2) or convolution
(Octave's conv2). The only difference between these
operations is that in convolution the filter masked is rotated 180
degrees. Whoop-dee-doo. Most of the time your filter is probably
symmetrical anyway. So, don't worry much about the difference between
these two. Especially in Octave, where rotating a matrix is easy
(see rot90()).
Now that we know convolution, let's introduce the sample image we will be using. I carefully put this together in Inkscape, which should give us a nice scalable test image. When converting to a raster format, there is a bit of unwanted anti-aliasing going on (couldn't find a way to turn that off), but it is minimal.
Save that image (the PNG file, not the linked SVG file) where you can
get to it in Octave. Now, let's load the image into Octave
using imread().
m = imread("image-test.png");
The image is a grayscale image, so it has only one layer. The size
of m should be 300x300. You can check this like so (note
the lack of semicolon so we can see the output),
size(m)
You can view the image stored in m
with imshow. It doesn't care about the image dimensions
or size, so until you resize the plot window, it will probably be
stretched.
imshow(m);
Now, let's make an extremely simple 5x5 filter mask.
f = ones(5) * 1/25
Octave will show us what this matrix looks like.
f = 0.040000 0.040000 0.040000 0.040000 0.040000 0.040000 0.040000 0.040000 0.040000 0.040000 0.040000 0.040000 0.040000 0.040000 0.040000 0.040000 0.040000 0.040000 0.040000 0.040000 0.040000 0.040000 0.040000 0.040000 0.040000
This filter mask is called an averaging filter. It simply averages all the pixels around the image (think about how this works out in the convolution). The effect will be to blur the image. It is important to note here that the sum of the elements is 1 (or 100% if you are thinking of averages). You can check it like so,
sum(f(:))
Now, to convolve the image with the filter mask
using filter2().
ave_m = filter2(f, m);
You can view the filtered image again with imshow()
except that we need to first convert the image matrix to a matrix of
8-bit unsigned integers. It is kind of annoying that we need this, but
this is the way it is as of this writing.
ave_m = uint8(ave_m); imshow(ave_m);
Or, we can save this image to a file
using imwrite(). Just like with imshow(),
you will first need to convert the image to uint8.
imwrite("averaged.png", ave_m);
There are a few things to notice about this image. First there is a
black border around the outside of the filtered image. This is due to
the zero-padding (black border) done by filter2(). The
border of the image had 0's averaged into them. Second, some parts of
the blurred image are "noisy". Here are some selected parts at 4x zoom.
Notice how the circle, and the "a" seem a little bit boxy? This is due to the shape of our filter. Also notice that the blurring isn't as smooth as it could be. This is because the filter itself isn't very smooth. We'll fix both these problems with a new filter later.
First, here is how we can fix the border problem: we pad the image with itself. Octave provides us three easy ways to do this. The first is replicate padding: the padding outside the image is the same as the nearest border pixel in the image. Circular padding: the padding from from the opposite side of the image, as if it was wrapped. This would be a good choice for a periodic image. Last, and probably the most useful is symmetric: the padding is a mirror reflection of the image itself.
To apply symmetric padding, we use the padarray()
function. We only want to pad the image by the amount that the mask
will "hang off". Let's pad the original image for a 9x9 filter, which
will hang off by 4 pixels each way,
mpad = padarray(m, [4 4], "symmetric");
Next, we will replace the averaging filter with a 2D Gaussian
distribution. The Gaussian, or normal, distribution has many wonderful
and useful properties (as a statistics professor I had once said,
anyone who considers themselves to be educated should know about the
normal distribution). One property that makes it useful is that if we
integrate the Gaussian distribution from minus infinity to infinity,
the result is 1. The easiest way to get the curve without having to
type in the equation is using fspecial(): a special
function for creating image filters.
f_gauss = fspecial("gaussian", 9, 2);
This creates a 9x9 Gaussian filter with variance 2. The variance controls the effective size of the filter. Increasing the size of the filter from 9 to 99 will actually have virtually no impact on the final result. It just needs to be large enough to cover the curve. Six times the variance covers over 99% of the curve, so for a variance of 2, a filter of size 7x7 (always make your filters odd in size) is plenty. A larger filter means a longer convolution time. Here is what the 9x9 filter looks like,
And to filter with the Gaussian,
gauss_m = filter2(f_gauss, mpad, "valid"; gauss_m = uint8(guass_m);
Notice the extra argument "valid"? Since we padded the
image before filtering, we don't want this padding to be part of the
image result. filter2() normally returns an image of the
same size as the input image, but we only want the part that didn't
undergo (additional) zero-padding. The result is now the same size as
the original image, but without the messy border,
Also, compare the result to the average filter above. See how much smoother this image is? If you are interested in blurring an image, you will generally want to go with a Gaussian filter like this.
Now I will let you in on a little shortcut. In Matlab, there is a
function called imfilter which does the padding and
filtering in one step. As of this writing, the Octave-Forge image
package doesn't officially include this function, but it is there in
the source repository now, meaning that it will probably appear in the
next version of that package. I actually wrote my own before I found
this one. You can grab the official one
here:
imfilter.m
With this new function, we can filter with the Gaussian and save like
this. Notice the flipping of the first two arguments
from filter2, as well as the lack of converting
to uint8.
gauss_m = imfilter(m, f, "symmetric");
imwrite("gauss.png", gauss_m);
imfilter() will also handle the 3-layer color images
seamlessly. Without it, you would need to run filter2()
on each layer separately.
So that is just about all there is. fspecial() has many
more filters available including motion
blur,
unsharp, and edge detection. For example,
the Sobel edge
detector,
octave:25> fspecial("sobel")
ans =
1 2 1
0 0 0
-1 -2 -1
It is good at detecting edges in one direction. We can rotate this each way to detect edges all over the image.
mf = uint8(zeros(size(m)));
for i = 0:3
mf += imfilter(m, rot90(fspecial("sobel"), i));
end
imshow(mf)
Happy Hacking with Octave!
You may have noticed the new index available on the left of the page. It really needs one, because looking at older posts required paging through the archives. Even more so because I have yet to add search functionality. There were Blosxom plug-ins available to do these things, but they are long gone now. I looked into getting a Google search box dropped in there, but it requires a big advertisement to go around the box (and Javascript... bleh!). I'll figure out something sometime.
Blosxom doesn't have a fancy database behind it, but rather plain flat text files. I imagine that this doesn't scale well, but I haven't looked that carefully into Blosxom's code. I should never be having problems like that anyway as I doubt I will ever surpass even 1000 posts, nor will there ever be some huge flood of users. What I do get is simplicity: to make a new post, I just write a text file and drop it on the server. It also made writing the index code very simple.
Without further ado, here is the code (under the same license as Blosxom): blog-index.perl. As you can see, it doesn't integrate with the rest of the blog; it only produces an index. I need function, not form, so this is fine for me. You can use it on your own Blosxom setup, just as long as you set the parameters at the top of the file correctly.
The index is created dynamically, but only when it needs to be updated. It first checks the date of the pre-generated index against the date of the latest entry. If the entry is newer, the index is rewritten. In both cases, this index is then served up efficiently straight from a plain XHTML file. All the proper file locking is in place (I think), so if there was a flood of requests when the index needs to be rewritten, no two requests are trying to re-write the index at the same time.
So nothing terribly exciting this week, but I hope someone finds this code useful somewhere.
Introduction:
This "news" is over two months old, simply because
I had other more interesting things to write about first. Not that I
am out of ideas: I have at least three more ideas lined up at the
moment on top of several half-written entries that may never see the
light of day. I just want to get it out of the way.
In December we held the robot competition, pitting against each other the robots that we spent the semester building. It was a double-elimination bracket with five teams. Teams competed by arranging the maze (within the rules) and deciding the initial position for their opponents. The robots do not get to know about the maze or where they are starting; they must figure this out on their own by exploring the maze.
To recap, there was an 8'x8' game area containing a 4'x8' maze of 1-square-foot cells. On the floor of the game area was a grid of white lines on black, where the white lines were about 7-inches apart. The robot started at an unknown position and orientation in the maze, which was also set up with a configuration unknown to the robot. In the non-maze open area, three small wooden blocks were placed at the intersection of white lines, with a steel washer attached to the top of each block.
In short, the robot had to move all three blocks to the repository, a pre-programmed position in the maze.
At the end of the semester, our team's robot was the only one that could successfully complete this task. The other teams needed to play in a degraded mode: known maze configuration, known starting position, known block positions. The loser bracket played this degraded version of the game. Because of this, our team was able to sweep the tournament with a perfect run. All the robot had to do was successfully run the full game. The competition, not being able to do this, automatically lost.
The robots were mostly the same, except for one team who had a robot with 4 multi-direction wheels. Every other team made a "scooter bot" type of robot: two powered wheels (with casters for balance) and chassis with three levels. The first real separation of design was when it came to picking up blocks. Each team initially had a different idea. One team was going to build a pulley system to lift the blocks. Another was going to use sweeping arms to sweep in the block. Another was going to used a stationary magnet.
Our team went with a rotating wheel in front with magnets along the outside (see images below). Once a block was found, the robot would rotate a magnet over the block, then rotate the attached block out of the way. In the end, four of the five teams ended up using this design for their own robots (the last team stuck with the stationary magnet).
These pictures were taken about a month before the competition. The wiring job was still a bit sloppy and the front magnet wheel lacks tiny magnets attached to the outside. Other than that, this is what our final robot looked like. In that last month, we attached the magnets, cleaned up the wiring, and made a whole bunch of code improvements making the robot more robust.
I will now attempt to describe some of the things you see in these images.
On the bottom of the robot you can see two casters for balancing the robot (big clunky things). You can see an IR sensor, which is pointing at the blue surface attached to the other side of the robot. This was the block detection sensor, a home-made break-beam sensor. And finally, you can see three LED lights on top of a long circuit board. This is a line tracker, with three sensors that can see the white grid on the bottom of the game board. The line tracker is how the robot navigated the open area of the board. It went back-and-forth looking for blocks, using the line tracker to stay on the line.
Also attached to this bottom layer are the powered wheels, with blue rubbers for traction, and their wheel encoders. There are spokes on the inside of the wheels (encoder disks), and the wheel encoders send a signal to the micro-controller each time it sees a spoke. The software counts the number of spokes that passed, allowing the robot to keep track of how far that wheel has turned. This information is combined with IR distance sensors to give it a very accurate idea of its position.
On top of the bottom black layer, you can see four distance IR sensors for tracking walls in the maze. They checked to make sure the robot was going straight (that's why there are two on each side), as well as map out the maze as it travels long. Hanging down from the bottom of the red layer is another IR sensor facing forward, looking at walls in front of the robot. Mounted on the front is the block retrieval device (lacking magnets at this point).
On top of the red layer are two (empty) battery packs, which holds 9 AA rechargeable NiMH batteries. This actually makes two separate power systems: a 4-pack for motors and a 5-pack for logic (micro-controller et al). In the circuit, the motors, containing coils of wire, behave like inductors, which could cause harmful voltage spikes to the logic. Separate power systems help prevent damage.
On top is the micro-controller and all of the important connections. The vertical board contains the voltage regulator and "power strip" where all of the sensors are attached. It also contains the start button, which was connected to an interrupt in the micro-controller. The micro-controller had its own restart button, but once the system started up, initialized, and self-calibrated, it waited for a signal from the start button to get things going.
I was about to post this when I was reminded by my fiancee that she took pictures at the end-of-semester presentation, after the competition. Here are some images of the robot after it was completely finished. Yes, that is a little face fastened to the front.
If you are ever at Penn State and are visiting the IST building, you can see the robot. Because the robot won the competition, it is on display and will be for years to come. You can recognize it by its face.
I have made the final robot code available
here:
final-robot-code.zip. I was the software guy, handling pretty much
all the code, so everything here,
except interupt_document.c was written by me. It's
probably not very useful as code, except for reading and learning how
our robot worked. There are a few neat hacks in there, though, which I
may discuss as posts here. It's not noted in the code itself, nor in
the zip file, but I'll make this available under my
favorite 2-clause BSD
license.
Don't stop here! This isn't everything. Check out the archives (on the left) for more posts. Or just have a look at the index.