Wednesday, September 23, 2015

A digression on recreating a higher res logo

This isn't a post about my compiler project. I haven't touched that in a while - I've been thinking about what my reference semantics are going to be, and I think that I've got stuff making sense in my head, but it'll take a bit of time to get it working. Time and uninterrupted attention, which I haven't carved out lately. I've got some vacation time coming up.

No, this is about some goofing around I did with some different approaches to solve a problem that isn't super important. You know, fun.

Many years ago, I decided I needed a stylized logo for Big Dice Games, something that'd capture some of the idea of old school gaming, in a way that'd reproduce on letterhead, or business cards, or whatnot.

So, I sat down and doodled this in Inkscape, I think:


Well, something like that, anyway. I generated an SVG, because I wanted nice crisp edges, and SVGs are good at that.

And I made business cards. And I got a local company to embroider it onto a sweatshirt. I love that sweatshirt.

And I put it as a title screen on a bunch of tiny games.

Somewhere along the way, though, I misplaced the SVG, and just have a PNG of it. And those business cards. And the sweatshirt.

So, I decided I'd try to recover, or recreate, the vector description of the logo. That's what this post is about.

Stab #1 - Genetic Algorithm
I love me some good genetic algorithms. They're fun to implement, fun to watch as they lurch around with their hilariously unfit attempts, slowly converging on... something. And then they're frustrating as they plateau, and no progress gets made, and then you turn your machine off and tell yourself you weren't interested in the project to begin with.

Or, you make some good progress, and then realize that you forgot to save the output anywhere, or that the output isn't in a useful format - several hours of GA search time too late.

Or, your computer turns off in the middle of a search, and, you didn't save the population, so you could start over, but... is it worth it?

I think those roughly sum up my experiences with genetic algorithms. Oh, I also used GAs a while ago to come in 5th place in a Games Magazine contest. I've got a Games T-Shirt as a result.

So, yeah, despite GAs' shortcomings, I decided to try to recreate the logo with a simple GA.

Maybe not so simple. Looking at that logo, it's easy to recognize there are some circles and some polygons. It's not too hard to pull out the colors, so the GA doesn't have to work to find them. So, I hardcoded a 40-dimensional representation - three numbers for each of the six circles (x,y,r), six numbers for the one triangle, and eight for each of the two quads. 40 floats, which I could seed with some vaguely neighborhoodish values.

I wrote a python function that would take an array of 40 floats, and use them to draw a series of circles and polygons. My seed numbers drew this:



Which got the GA started. The pieces of the logo are recognizably there; now it just has to fine-tune the values. Go.
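For the curious, a rough sketch of what that drawing function looks like. The genome layout (six circles as x,y,r, then a triangle, then two quads) is as I described above; the label values, function names, and the even-odd polygon test here are just illustrative stand-ins, not the exact code:

```python
import numpy as np

def point_in_polygon(px, py, verts):
    """Even-odd ray-casting test for a single point."""
    inside = False
    n = len(verts)
    for i in range(n):
        x1, y1 = verts[i]
        x2, y2 = verts[(i + 1) % n]
        if (y1 > py) != (y2 > py):
            xi = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < xi:
                inside = not inside
    return inside

def render(genome, w, h):
    """Rasterize 40 floats into a label image: labels 1-6 are the
    circles, 7-9 are the triangle and the two quads."""
    img = np.zeros((h, w), dtype=np.uint8)
    g = list(genome)
    circles = [g[i:i + 3] for i in range(0, 18, 3)]
    polys = [list(zip(g[a:b:2], g[a + 1:b:2]))
             for a, b in [(18, 24), (24, 32), (32, 40)]]
    yy, xx = np.mgrid[0:h, 0:w]
    for label, (cx, cy, r) in enumerate(circles, start=1):
        if r <= 0:
            continue  # degenerate circle, skip
        img[(xx - cx) ** 2 + (yy - cy) ** 2 <= r ** 2] = label
    for label, poly in enumerate(polys, start=7):
        for y in range(h):
            for x in range(w):
                if point_in_polygon(x, y, poly):
                    img[y, x] = label
    return img
```

A real version would map those labels to the hardcoded logo colors before comparing against the scan.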

So, the evaluation function would take the rendered output, subtract it from an image I captured by scanning one of my business cards, histogram the difference, and count the number of pixels that had a difference.
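In numpy terms, that evaluation boils down to something like this sketch (the exact thresholding I used may have differed):

```python
import numpy as np

def fitness(candidate, target):
    """Count the pixels where the rendered candidate differs from the
    scanned target in any channel; lower is better, 0 is a perfect match."""
    diff = np.abs(candidate.astype(int) - target.astype(int))
    per_pixel = diff.reshape(diff.shape[0], diff.shape[1], -1).max(axis=-1)
    return int(np.count_nonzero(per_pixel))
```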

That was a reasonably quick evaluation function, and I got some output looking like this:

Closer to the target than the seeds, in some ways, but not really all that good.

My GA is still running, around 5 days in now. Still nothing accurate. I kinda like the jaunty, rakish version there, but it's not what I was going for. I'm keeping it, in case I decide I want to revamp the logo later, though.


Stab #2 - Hough Transforms

A couple years ago, I listened to a talk about computer vision, particularly as it applied to recovering text from Google Street View photographs. The details of that talk aren't important here, except that the speaker mentioned Hough Transforms. The basic idea is that you create a big chunk of memory that captures all the possible parameters for the thing you're looking for. If you're looking for a line in your picture, you have a 2d space with some sort of line parameters. Could be the m and b in your algebra class standby y = mx + b, or maybe you're more clever, and want to avoid infinite slopes for vertical lines, so you use a radius / angle representation, or whatever. Two floats capture a line pretty well. You set up an accumulator array for those two values, and for every pixel in the source image that seems to be part of an edge, you have it "vote" on possible parameter pairs for lines that would pass through that pixel. Or, flipping it around: for each parameter pair, generate a line, then for each pixel along that line in the source image, collect votes on how strongly that line is supported by edge pixel candidates.
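Here's a rough numpy sketch of that voting scheme for lines, using the radius / angle parameterization (the bin counts and names are just illustrative):

```python
import numpy as np

def hough_lines(edges, n_theta=180):
    """Minimal (theta, rho) Hough accumulator: every edge pixel votes
    for each candidate line (one rho per theta) passing through it."""
    h, w = edges.shape
    diag = int(np.ceil(np.hypot(h, w)))  # largest possible |rho|
    thetas = np.linspace(-np.pi / 2, np.pi / 2, n_theta, endpoint=False)
    acc = np.zeros((n_theta, 2 * diag), dtype=np.int64)
    ys, xs = np.nonzero(edges)
    for x, y in zip(xs, ys):
        # rho = x*cos(theta) + y*sin(theta), shifted so indices are >= 0
        rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        acc[np.arange(n_theta), rhos + diag] += 1
    return acc, thetas, diag
```

Peaks in the accumulator correspond to lines with lots of edge support.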

It's clever, it's kind of compute-intensive, it's the kind of thing that GPUs can compute pretty easily.

There's a Python implementation, which makes me happy.

Circles can also be detected, and ellipses, as well, if that's your thing. Could be anything, really - as long as you can parameterize it, and don't mind the high dimensional space for wobbly peanut shaped whatevers you're looking for. Circles and lines were plenty for what I was trying to do.

I downloaded scikit-image, which was an annoying process: the easy_install command didn't work, so I tried to trace back through the dependencies and install those, and eventually just used apt-get to install scikit-image, which seemed to work - except that much of the documentation and sample code was out of sync with the version I installed, so I wasn't able to use the Canny edge detector, only the Sobel one. Which was OK.

Pretty much OK.

I did a quick edge detection pass (good grief, my edges are so wide, in part due to the gross job I did scanning the business card), and asked scikit-image to find straight lines in my image, and it found a couple. I think I was overloading it with edge pixels.

So, I decided to simplify things. I wrote a script to force pixels into one of the six colors that should be in the image:
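That script boils down to a nearest-color snap; here's a sketch with a placeholder palette (the real logo's six colors would go in there instead):

```python
import numpy as np

# Placeholder palette - stand-ins, not the actual logo colors.
PALETTE = np.array([
    [0, 0, 0], [255, 255, 255], [200, 30, 30],
    [30, 120, 40], [40, 60, 180], [120, 80, 40],
])

def snap_to_palette(img, palette=PALETTE):
    """Replace each pixel with the nearest palette color
    (Euclidean distance in RGB)."""
    flat = img.reshape(-1, 3).astype(int)
    d = ((flat[:, None, :] - palette[None, :, :]) ** 2).sum(axis=2)
    return palette[d.argmin(axis=1)].reshape(img.shape).astype(np.uint8)
```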


That cleans things up some, but I'm getting unfortunate gray edges that I don't want. Maybe it's not such a big deal.

I also wrote a version of that script that broke each color out into its own file:




I proceeded to open these bitmaps (well, black and white PNGs, but close enough) in GIMP, and manually cut things further apart into individual primitives.

I passed these hand-cut PNGs back into the Hough Transform scikit script, and asked for some lines and circles, and started getting fairly good values. In some cases, I got a couple different candidates for lines or circles, but at least for lines, I got a "strength" value, so I just took the strongest candidate.
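The scikit-image calls look roughly like this - shown here on a synthetic vertical edge rather than my hand-cut PNGs. hough_line_peaks hands candidates back strongest-first, which is where that "strength" ordering comes from:

```python
import numpy as np
from skimage.transform import hough_line, hough_line_peaks

# Synthetic stand-in for one of the hand-cut edge images:
# a single vertical edge at x = 20.
edges = np.zeros((50, 50), dtype=bool)
edges[:, 20] = True

h, thetas, dists = hough_line(edges)
accums, angles, rhos = hough_line_peaks(h, thetas, dists)

# Peaks come back ordered by accumulator strength, so the strongest
# line candidate is simply the first entry.
theta, rho = angles[0], rhos[0]
```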


Ok, so now that I had these, it was time to see what the output was. I started drawing some circles - ok, that looks pretty good, gray, green, white... hmm... Oops, the output from scikit-image was coming out as y,x and my drawing code was expecting x,y. Thank you, science.

Flipping things around, the circles all looked like they were coming out about right.

The polygons, though... that would be some more work.

I started with the triangle for the hat. The scikit line parameters were theta, radius - an angle from the origin to the closest point on the line, along with the distance from the origin to that point. I futzed around with some drawing code, and eventually got three very long lines tracing the outlines of the hat, and then continuing well beyond the corners.

Ok, so I need to intersect some lines.

For each line, I parameterized it starting at that closest point to the origin, and proceeding in a direction perpendicular to that radius. So, I had x0,y0,dx,dy descriptions of lines, which you could think of as x(t) = x0+dx*t, y(t) = y0+dy*t. That's convenient.

For each pair of lines, I'd have two equations in two unknowns, and as long as I don't have any infinities around, it should be a matter of algebra to figure out the point of intersection.


My algebra teacher from 7th grade insisted I show my work. So there you go.
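The same algebra as a sketch in code: line_from_hough converts a (theta, radius) pair into the x0,y0,dx,dy form above, and intersect does the 2x2 solve (the names are mine, not the original script's):

```python
import numpy as np

def line_from_hough(theta, rho):
    """Convert a (theta, rho) Hough line into (x0, y0, dx, dy):
    start at the closest point to the origin, head perpendicular
    to that radius."""
    x0, y0 = rho * np.cos(theta), rho * np.sin(theta)
    return x0, y0, -np.sin(theta), np.cos(theta)

def intersect(l1, l2):
    """Solve x0+dx*t = x1+ex*s and y0+dy*t = y1+ey*s for (t, s),
    then plug t back in. Parallel lines raise LinAlgError."""
    x0, y0, dx, dy = l1
    x1, y1, ex, ey = l2
    A = np.array([[dx, -ex], [dy, -ey]], dtype=float)
    b = np.array([x1 - x0, y1 - y0], dtype=float)
    t, _ = np.linalg.solve(A, b)
    return x0 + dx * t, y0 + dy * t
```

Run that on each pair of the hat's three lines and out come the triangle's corners.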

So, I now had three intersection points, which looked pretty sweet.

Only two more primitives to go - a blue quad for the wizard's robe, and a brown one for his staff.

I had broken the robe into two different pieces when doing the edge detection, so I ended up with something like eight different candidate edges. With trial and error, I figured out four that I wanted to keep, and the right order to draw them.

The staff was trickier - my edge detection efforts only got two edges, so I started there, and inserted random other edges from the cloak to begin with. Fortunately, the right side of the staff gets buried underneath the red fireball circle, so I could do a lot of things there and have no visible effect.

On the tail end of the staff, I tweaked the offset until it looked about right. So much for computer recovery of subpixel parameters. -300? Sure, looks good.

I had something that looked pretty good at this point. My eyes were getting a little bleary from all the pixel pushing, so I went back to the original scan, and was pleased to see that things looked pretty close to the original. The mage's left hand seemed to be gripping the staff a little high - the bottom of the circle was just about tangent to the staff polygon. So, I manually offset it down 10 pixels. Again, not doing a great job of adhering to the machine recovery process.

I'm pretty happy with the result, though:

Look at those sharp edges, those clean colors. You may not be able to tell, but it's got an alpha channel. It's still not a SVG, which I could probably have got without too much difficulty by just tracing some edges in Inkscape, but I've got values that approximate the original design that I can begin to play around with for fancy animated logo screens if I want to do that. The primitives could fall into frame from above, revealing the mage in the space of a few seconds. Or I could animate scale - stretch and squash into place. Or combinations of these. Or rasterize the logo, one primitive at a time, one scanline at a time, like one might have done on an Apple ][ back in the day.

So many options.

I could even make a new sweatshirt.