Wednesday, December 30, 2015

I've got a lot of books here.



Over the past week or so, I've taken 5 boxes of books to Half Price Books and just now filled another box to go out the next time I'm headed in that direction.

I like books. I like reading books. I like collecting books. I have maybe too many books. I have books that have perhaps been a fire hazard, making easy escape from my house difficult in case of an emergency. I have more than once bought a duplicate of a book, either because I'd bought it already and forgot, or I just couldn't find the original.

Years ago, I thought it'd be good to organize my books by Dewey Decimal, oh, or maybe Library of Congress. Or maybe some sort of hierarchical tag-based scheme that I never got around to fully exploring. Now, I'm just happy if I can keep track of my books. 

I have a Google Drive Sheets spreadsheet, which is super handy - I can check it while I'm at a bookstore and see if the book I see in front of me is also waiting for me at home. This has already been a benefit.

But it's kind of ponderous - it's just a big list of stuff, and there's not a good way of identifying where a book actually is. (Organization.) I might also like to have a list of all my GURPS books, or when I get a new Kakuro puzzle book (I shouldn't be buying more of those), it might be nice to put it near the other ones of its ilk.

Also, barcode lookup might be nice to support.

So, I've been writing a tool.

Starting with web.py, and using MySQL (for now?), I started manually pulling individual lines from my spreadsheet in. The database knows that authors and ISBNs exist, but the web app has no way to display or edit them. But there's enough to start pulling one line at a time over from my spreadsheet.

Some(!) of my books are in boxes, and each box has its own ID. Each room in my house has its own ID. My house has an ID. So, there's a chain of containment there - such and so book might be in such and so box, which is in such and so room which is in my house. There are container links up from each item, and the web app is smart enough to do a select if I want to know what the contents of a specific box are. I've started printing out listings for the contents of boxes as I do my inventory - easy, just click the "contents" link off the box listing, and hit "print".
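The chain walk itself is simple enough to sketch. Here's a toy version with SQLite standing in for the MySQL backend, and made-up table and column names (the real schema differs):

```python
import sqlite3

# Toy containment schema: everything is an "item", and each item points
# up at its container. Table/column names here are assumptions.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, container_id INTEGER)")
db.executemany("INSERT INTO items VALUES (?, ?, ?)", [
    (1, "house", None),
    (2, "office", 1),          # room in the house
    (3, "box 3017", 2),        # box in the room
    (4, "GURPS Space", 3),     # book in the box
])

def location_chain(item_id):
    """Follow container links up from an item: book -> box -> room -> house."""
    chain = []
    while item_id is not None:
        name, item_id = db.execute(
            "SELECT name, container_id FROM items WHERE id = ?", (item_id,)).fetchone()
        chain.append(name)
    return chain

def contents(container_id):
    """The 'contents' listing: everything directly inside a container."""
    return [r[0] for r in db.execute(
        "SELECT name FROM items WHERE container_id = ?", (container_id,))]

print(location_chain(4))  # → ['GURPS Space', 'box 3017', 'office', 'house']
print(contents(3))        # → ['GURPS Space']
```

The "contents" link on a box listing is just that second select.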

The web app does allow for me to use my fancy barcode scanner, storing UPC data for whatever items I feel like tagging. So far, it does nothing with that UPC ID, but I'm planning on doing an automated fetch from isbndb.org and making that data available.

I just now put in a facility where I can search for title substrings. So, if I ask for "gurps", it lists all the GURPS books I've entered so far (only 14, I'm just getting started). And when I ask for "apple", I get 21 hits, all for early '80s stuff. "6502" only hits 5 titles so far.
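The search is nothing fancy - functionally it's a SQL LIKE '%substring%' query. A toy version over a plain list (sample titles invented for illustration):

```python
# Invented sample titles; the real data lives in the database.
titles = [
    "GURPS Space", "GURPS Fantasy", "Kakuro Puzzles Vol. 2",
    "Apple II User's Guide", "Programming the 6502",
]

def title_search(substring):
    """Case-insensitive substring match, same behavior as LIKE '%substring%'."""
    needle = substring.lower()
    return [t for t in titles if needle in t.lower()]

print(title_search("gurps"))  # → ['GURPS Space', 'GURPS Fantasy']
```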

One thing that this should do is reach back to the Google Drive Sheets spreadsheet and mark which items in my spreadsheet have been added to the web app. And if I add a thing to the web app, it'd be nice if the spreadsheet reflected that without having to enter it twice. And maybe, it'd be nice to import from the spreadsheet, but if that's a lot of work, I'm OK with doing a manual inventory anyway, since my spreadsheet doesn't have a lot of UPC data, which I kind of want to collect at some point.

If you look carefully at the screenshot at the top, you'll see 4 digit IDs. That's not because I've cataloged more than 1000 items (well, actually, I have, but that's not why). It's because I had some idea that I'd use the thousands place to denote the type of object - allowing me to have enumerated categories for furniture, rooms, boxes, books, games, and other things. That categorization scheme has sort of fallen apart, but I might go back and rebuild it later, now that it's easier for a bit of code to enforce that sort of thing. Or I might ignore it, now that I've got code that makes it easier to filter and search.
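If I do revive the scheme, the enforcement is a one-liner on the thousands digit. The category assignments below are hypothetical, not what I originally used:

```python
# Hypothetical mapping from the thousands digit to an object category;
# the real assignments (if I rebuild this) may differ.
CATEGORIES = {0: "furniture", 1: "room", 2: "box", 3: "book", 4: "game"}

def category_of(item_id):
    return CATEGORIES.get(item_id // 1000, "other")

print(category_of(3017))  # → book
```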

Once I've got it reaching out and writing back to my spreadsheet, the next logical thing to do would be to reach out and write to a BoardGameGeek list, so my games would be listed there. But just games. Again, the software is probably smart enough to mostly get that right.

Also, maybe I could go back to my LibraryThing account and update my booklist there. 

Sunday, December 13, 2015

Clerics & Codenames, v0.0.2



Back in late October, I played a game of Codenames by Vlaada Chvátil. I was impressed with how much fun it is, how challenging, how much the words stuck in my head afterward, and how simple a game it seemed.

I appreciate that when somebody is really good at their art, they can make stuff that looks simple, even if it takes a lot of work. So saying that it seems like a simple game isn't meant to diminish it - it's elegant.

I also wondered how much work it'd be to make a playable set of expansion cards. I had a bunch of questions - should you go for simple words? Just nouns? Would choosing words from a theme make sense? (Star Wars? Computers? Geek Life?) Or would that narrow the range of possible connections too much?

I poked around, looking for word lists of common words (inspired by Randall Munroe's Thing Explainer, which also inspired Space Weird Thing, which is awesome, but is only super loosely connected to the game I was talking about), but couldn't find a list I liked, so I set things aside for a bit.

Then, last week, the idea somehow arrived that taking a bunch of words from fantasy role playing games might be a place to start. I went back to a couple of the works that made an impression on me at an early age, the Holmes edition of the Dungeons & Dragons ruleset and B2 - The Keep on the Borderlands. Skimming through there (and paying particular attention to glossaries), I came up with many words that were burned into my brain first through Dungeons & Dragons.

Those words went into a Google Docs Spreadsheet (er, a Google Spreadsheet as hosted on Google Drive?), which I then exported as a CSV file.

Using Reportlab and PIL (actually Pillow), I made a PDF, which I printed out on cardstock:


I invited a few of my coworkers to sit down and see if a strongly-themed cardset would actually be fun, and I was surprised first by how enthusiastic people were, and then by how well it played.

Some of the feedback included:

  • randomize the cards, because we had a bunch of 'h' words, followed by a bunch of 'c' words. Shuffling home-printed cardstock cards is challenging.
  • wouldn't it be awesome if the red and blue 'cover' cards from the Codenames set were also themed?
  • remove words that are too close in meaning. e.g. 'cave' and 'cavern', 'vial' and 'phial'.
  • rework the script so that the font size is consistent from card to card
  • I would totally buy this on DriveThruCards
  • Maybe this should be posted on Board Game Geek 



So, stuff I've done today:

Script rework. I reworked my Python script to generate either letter (print and play) or mini-card (Drive Thru Cards) page sizes. Reportlab's canvas object defaults to letter, but you can pass in a pagesize argument.

I pulled the UniversalisADFStd-Regular.otf file into FontForge and exported it as a TTF. This made Reportlab happier with the file, as the outlines were in a format that it liked. (Known issue; they're in no hurry to support PostScript outlines, or something.)

In my 1st printing, I had worked around the OTF limitations by using Pillow to create an image just large enough to hold the text, rendering the text to an in-memory image, then using Reportlab to insert that image into my PDF. Now that Reportlab can render the text directly, I can skip Pillow for this, which fixes the odd sizing artifacts that I was seeing - for some words (ones with descenders, I think), the images were sizing differently from others. Now, since I'm not using images, the font sizes don't have that step to cause strange behaviors.

Randomized the cards. For the first image (above), I wanted to make sure I could easily find "dungeon" and "dragon", so I sorted the words. This led to a whole bunch of the same first letter. This will be solved, in time, when I get better printed cards. Until then, I might as well shuffle the words before creating the PDF.

Custom Red & Blue cards. I've made some solid color cards, as well as some temporary placeholder themed cards using copyrighted content that I cannot distribute. Red and Blue line up nicely with that particular content, though.

Word curation. I got rid of 'cave'. I added 'sepulchre'. I should probably spell that the way any good United States citizen spells it. Oops. Also forgot to remove 'phial'.

Printed out a new playable set:


I haven't got my print settings set up right - the cards aren't lining up front-and-back the way I had intended. Not a huge thing, totally playable as-is, but if I wanted to distribute this as print-and-play, I'd want to make sure that it's possible to line up the cards.

Board Game Geek posting. I have had an account on BGG for years, mostly using it to read reviews. I found the link to create a new listing for a new game, which itself depends on a BGG page for designer and publisher, so I created new listings for myself and for Big Dice Games. It's like I'm a real boy!

All of the above are under review. I think that's a good policy. If it were faster, I'd include a link here to my listings.

DriveThruCards listing. As much to test out the process (is the PDF I generated from ReportLab the right flavor of PDF that DTC is expecting?) as anything, I made a listing of this on DriveThruCards. This required setting my account up as a publisher account (whee). It also required a cover image, which I made by cropping the above photo. Good enough for a work-in-progress.

Apparently, it passed the simple checks to get to the point where I could order a proof, so I have. I'm eager to see if it prints without errors, and if it gets to me before I'm on vacation.


Feels like a lot, and this is for a trivial little expansion to somebody else's game. But it's fun to play, and it's been fun to go through the process. 



Sunday, November 29, 2015

Extended Range, at the Ready.

So, back in High School, I played a bunch of the "Star Wars Lightsaber Dueling Pack" by West End Games. It's a two player diceless combat system, where one player plays Luke Skywalker, the other player plays Darth Vader. Both players simultaneously select moves from a movement reference card keyed to their player (Luke selects "Jump/Dodge [8]" , Vader selects "Hurl Objects [24]"). The cards direct you to pages in combat books, which then dispatch based on the pair of actions, and the players end up looking at pictures of what their character would see, along with points (maybe), and instructions for their opponent. In this example, Luke manages to avoid the projectiles, and the combatants face off at extended range for more combat.

(from page 49 of the Darth Vader book. I claim "Fair Use", since it's a small portion of the original work.)

In college, I took a game theory course which touched on the traditional payoff matrix analysis, but spent a lot of time on combinatorial game theory. I guess if Berlekamp is teaching, you're in for a course on nimbers, tiny, spiny and any number (ha) of other esoteric concepts.

As a project in that class, we had to analyze a game. I selected Star Wars Lightsaber Dueling Pack. I'm going to call that Saber Dueling, because the full name sounds dumb. I wrote a C program to generate a payoff matrix for Saber Dueling, but I think I trimmed a few of the details. (For instance, you can dislodge an opponent's weapon, but I don't think my program represented that.) I ran into a few weird scenarios, probably due to oversimplifications. (In the example above, Vader ends up at extended range as part of the basic logic, but Luke only ends up at extended range based on conditional logic that I probably didn't implement. If you don't get that right, Luke's close to Vader, who's far from Luke. Whoops.)

I had something like a 32x32 array of integers (probably 8-bit integers, because I was stingy back then) and I jammed that into the minimax algorithm that we had been taught on the one day of class we weren't doing the combinatorial Berlekamp stuff. And I came up with a mixed strategy for playing the game. And I forget, I probably got an OK grade on the project.
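The full 32x32 matrix really wants linear programming, but the idea is easy to show in the 2x2 case, where the optimal mixed strategy has a closed form. The payoff matrices below are toy examples, not Saber Dueling data:

```python
def mixed_strategy_2x2(m):
    """Optimal row-player mix for a 2x2 zero-sum game with no saddle point,
    via the textbook closed form p = (d - c) / (a - b - c + d)."""
    (a, b), (c, d) = m
    p = (d - c) / (a - b - c + d)
    return [p, 1 - p]

# Matching pennies: the equilibrium is to randomize 50/50.
print(mixed_strategy_2x2([[1, -1], [-1, 1]]))  # → [0.5, 0.5]
```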

And yet, it wasn't really satisfying. I had drained out a bunch of the character of the game in the simplifications I had made. I mentioned losing your weapon. It's kind of important to know that you don't have a weapon in order to select good moves. There's a "Retrieve Weapon" action you can take, but it doesn't score you points, so if you're just looking to maximize the points you score on this turn, you'll overlook it, even if the representation is there.

Another simplification that I had made was to skip over the move restrictions and bonuses. Each move leaves you (and potentially your opponent) in a situation with potentially only a subset of moves available. If I knock you down, your next move has to be a jump. If I know that your next move has to be a jump, I don't have to defend against a downswing - I might restore some hit points, or something. Maybe choose an attack to hit you when you're down. Restrictions are important, but my college project skipped all that.

Years later, and this is perhaps not central to any particular narrative thread that I'm spinning, I had a dream where I was at some sort of gaming convention. I forget if this is the same dream that featured "Cremwits", which I might post about at some other time. This was a convention maybe something like GenCon, which I've never been to. Picture a bunch of folding tables with hard to find, rare, possibly out of print games. I was super excited to find (this is still inside the dream) a copy of "Han Solo with Bomb", a compatible book to go along with the Luke and Vader books. This particular bomb would probably be a thermal detonator, or some similar grenade-like explosive device.

The idea of a new character to play against these two characters was pretty exciting to me at the time. I'm still fairly excited about it.

On the left, some files, on the right some debug output.


I spent a chunk of this past weekend revisiting the analysis of Luke vs Vader. I started by scanning in everything. I don't think that's really important, but it feels handy to have digital copies.

I proceeded to create a Google Docs Spreadsheet (ahem, a "Google Spreadsheet" inside "Google Drive") with one page each for moves, results, and the dispatch table for each character. I exported them all as CSVs, and began writing some Python scripts to wrangle the CSV information. I've got classes for restrictions, heavily laden with comments saying "TODO - prohibit use of the Force" or "TODO - enable Spin&Strike", as I still haven't got the full logic working for restrictions. I've got classes for characters. I've got a utility function that figures out what moves are available. I've got a test harness that takes two characters, with some (partially working) move restrictions, chooses legal moves, and figures out what resulting pages they end up on.

Which is, maybe, 75% of the way towards my somewhat disappointing Game Theory class project.
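The moves-available utility boils down to filtering the CSV rows by the current restrictions. The column names and flag values here are invented stand-ins, not my actual spreadsheet:

```python
import csv, io

# Hypothetical rows standing in for the real moves sheet - the actual
# column names and restriction flags in my spreadsheet differ.
MOVES_CSV = """id,name,requires_weapon,allowed_when_down
8,Jump/Dodge,no,yes
24,Hurl Objects,yes,no
30,Retrieve Weapon,no,no
"""

moves = list(csv.DictReader(io.StringIO(MOVES_CSV)))

def available_moves(moves, has_weapon=True, knocked_down=False):
    """The 'what moves are available' utility: filter by current restrictions."""
    out = []
    for m in moves:
        if m["requires_weapon"] == "yes" and not has_weapon:
            continue  # can't Hurl Objects after being disarmed
        if knocked_down and m["allowed_when_down"] != "yes":
            continue  # knocked down: next move has to be a jump
        out.append(m["name"])
    return out

print(available_moves(moves, has_weapon=False))  # → ['Jump/Dodge', 'Retrieve Weapon']
```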

I could imagine this turning into a way for me to play a version of the game solo vs a randomized AI.

I've been thinking it might be a little more interesting to allow the AI to learn how to play the game by playing against itself, or against a population of AI players. I've been toying with a Bayesian model that would adaptively figure out what the best moves are for each situation. I haven't made specific decisions about how it would be implemented, but the shapes I've considered would probably still have the one-move-lookahead problem I mentioned above - retrieving your weapon is a good idea, and you want to do it, even if it doesn't win you points right now.

The unfun bit that I was hitting yesterday was doing data entry, filling out a big dispatch table. I'm pretty sure that's done (modulo finding and correcting errors). The bit that's baking my bean right now is keeping track of what bits of information apply to which player - Player 1 looks up a move on the "Luke" movement reference, but turns to a page in the "Vader" book, and then, based on Player 2's action, Player 1 turns to a different page in the book, and tells Player 2 what Player 2's restrictions are for the next turn. Simultaneously, also vice versa.

And I haven't yet got to scoring. I've got the data for it. Each move has a score modifier. Did we properly apply those back in High School? I kind of suspect we ignored that. There's a bonus that carries over for one round that doubles the effect of The Force. There's a Force move that heals 3 hit points. There's another result that loses 6 hit points in one go. Doubling up the Force and then pulling off the heal would totally counter that one result. So, I'd like to see some AI that knows the value of a well timed healing move.


I mentioned that having one more character seems kind of exciting. Well, what about several dozen? Turns out, Saber Dueling is an adaptation of the Lost Worlds Combat Picture Books, which I was somewhat aware of in High School, but only started collecting more recently. I've got around a dozen books that I collected around a year ago. Just now, I went to a handful of different online retailers to see what copies they had lying around. I found three different places that would allow me to put titles into shopping carts, and two of them actually let me check out. It seems weird that an online store would have server problems this weekend of all weekends - or, rather, it's weird to me that this weekend, problems like not being able to complete a checkout would go unfixed.

If-when I get my Luke/Vader scripts playing Saber Dueling, then I'll consider expanding it to allow Man With Plate Mail and Broadsword vs Unicorn vs Magician With Dice Bag vs Skeleton with Scimitar. Perhaps Luke can hold his own against the skeleton. I'd be interested to see.

And, even if Luke vs Skeleton isn't a battle we can see, I think I understand the system now well enough to craft Han Solo vs somebody. One bit at a time.

Wednesday, September 23, 2015

A digression on recreating a higher res logo

This isn't a post about my compiler project. I haven't touched that in a while - I've been thinking about what my reference semantics are going to be, and I think that I've got stuff making sense in my head, but it'll take a bit of time to get it working. Time and uninterrupted attention, which I haven't carved out lately. I've got some vacation time coming up.

No, this is about some goofing around I did with some different approaches to solve a problem that isn't super important. You know, fun.

Many years ago, I decided I needed a stylized logo for Big Dice Games, something that'd capture some of the idea of old school gaming, in a way that'd reproduce on letterhead, or business cards, or whatnot.

So, I sat down and doodled this in Inkscape, I think:


Well, something like that, anyway. I generated an SVG, because I wanted nice crisp edges, and SVGs are good at that.

And I made business cards. And I got a local company to embroider it onto a sweatshirt. I love that sweatshirt.

And I put it as a title screen on a bunch of tiny games.

Somewhere along the way, though, I misplaced the SVG, and just have a PNG of it. And those business cards. And the sweatshirt.

So, I decided I'd try to recover, or recreate, the vector description of the logo. That's what this post is about.

Stab #1 - Genetic Algorithm
I love me some good genetic algorithms. They're fun to implement, fun to watch as they lurch around with their hilariously unfit attempts and slowly converge on... something. And then they're frustrating as they plateau, no progress gets made, and then you turn your machine off, and you tell yourself you weren't interested in the project to begin with.

Or, you make some good progress, and you realize that you forgot to have the output saved somewhere, or the output isn't in a useful format, and you only realize that several hours of GA searchtime later.

Or, your computer turns off in the middle of a search, and, you didn't save the population, so you could start over, but... is it worth it?

I think those roughly sum up my experiences with genetic algorithms. Oh, I also used GAs a while ago to come in 5th place in a Games Magazine contest. I've got a Games T-Shirt as a result.

So, yeah, despite GAs' shortcomings, I decided to try to recreate the logo with a simple GA.

Maybe not so simple. Looking at that logo, it's easy to recognize there are some circles and some polygons. It's not too hard to pull out the colors, so the GA doesn't have to work to find them. So, I hardcoded a 40-dimensional representation - three numbers for each circle (x,y,r), six numbers for the one triangle, and eight for the two quads. 40 floats, which I could seed with some vaguely neighborhoodish values.

I wrote a python function that would take an array of 40 floats, and use them to draw a series of circles and polygons. My seed numbers drew this:



Which gets the GA started. Recognizably using the pieces of the logo; now just fine-tune the values. Go.

So, the evaluation function would take the rendered output, subtract it from an image I captured by scanning one of my business cards, histogram the difference, and count the number of pixels that had a difference.
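In other words, the fitness function is a pixel mismatch count. A stripped-down version, with flat pixel lists standing in for the real scanned/rendered images:

```python
def fitness(rendered, target):
    """Count pixels that differ between the GA's rendering and the scanned
    target; the GA wants to minimize this. Flat lists stand in for the real
    PIL images."""
    return sum(1 for a, b in zip(rendered, target) if a != b)

target   = [0, 0, 1, 1, 2, 2]
rendered = [0, 1, 1, 1, 2, 0]
print(fitness(rendered, target))  # → 2
```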

That was a reasonably quick evaluation function, and I got some output looking like this:

Closer to the target than the seeds, in some ways, but not really all that good.

My GA is still running, around 5 days in now. Still nothing accurate. I kinda like the jaunty, rakish version there, but it's not what I was going for. I'm keeping it, in case I decide I want to revamp the logo later, though.


Stab #2 - Hough Transforms

A couple years ago, I listened to a talk about computer vision, particularly as it applied to recovering text from Google street view photographs. The details of that talk aren't important here, except that the speaker mentioned Hough Transforms. The basic idea is that you create a big chunk of memory that captures all the possible parameters for the thing you're looking for. If you're looking for a line in your picture, you have a 2d space with some sort of line parameters. Could be the m and b in your algebra class standby y = mx + b, or maybe you're more clever, and want to avoid infinite slopes for vertical lines, and you use a radius / angle representation, or whatever. 2 floats capture a line pretty well. You set up an accumulator array for those two values, and for every pixel in the source image that seems to be part of an edge, you have it "vote" on possible parameter pairs for lines that would pass through that pixel. Or, flipping it around: for each parameter pair, generate a line, then count how many edge pixel candidates in the source image lie along that line.
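Here's the voting idea at its barest, nothing like scikit-image's tuned implementation: edge pixels along a synthetic vertical line at x = 3 vote in (theta, rho) space, and the true line collects the most votes.

```python
import math

H = 8
edges = [(3, y) for y in range(H)]  # edge pixels of a vertical line at x = 3

# Coarse angle sweep; each edge pixel votes for every (theta, rho) line
# that could pass through it.
THETAS = [math.radians(d) for d in range(0, 180, 15)]
acc = {}
for x, y in edges:
    for t_i, theta in enumerate(THETAS):
        rho = int(round(x * math.cos(theta) + y * math.sin(theta)))
        acc[(t_i, rho)] = acc.get((t_i, rho), 0) + 1

# The winning accumulator cell is the detected line.
(t_i, rho), votes = max(acc.items(), key=lambda kv: kv[1])
print(math.degrees(THETAS[t_i]), rho, votes)  # → 0.0 3 8
```

All 8 pixels agree on (theta = 0, rho = 3), so that cell gets all 8 votes; every other cell gets fragments.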

It's clever, it's kind of compute-intensive, it's the kind of thing that GPUs can compute pretty easily.

There's a Python implementation, which makes me happy.

Circles can also be detected, and ellipses, as well, if that's your thing. Could be anything, really - as long as you can parameterize it, and don't mind the high dimensional space for wobbly peanut shaped whatevers you're looking for. Circles and lines were plenty for what I was trying to do.

I downloaded scikit-image - an annoying process of using the easy_install command, which didn't work, then trying to trace back through the dependencies and installing those, and eventually just using apt-get to install scikit-image, which seemed to work. Except that much of the documentation and sample code was out of sync with the version I installed, so I wasn't able to use the Canny edge detector, only the Sobel one. Which was OK.

Pretty much OK.

I did a quick edge detection pass (good grief, my edges are so wide, in part due to the gross job I did scanning the business card), and asked scikit to find straight lines in my image, and it found a couple. I think I was overloading it with edge pixels.

So, I decided to simplify things. I wrote a script to force pixels into one of the six colors that should be in the image:


That cleans things up some, but I'm getting unfortunate gray edges that I don't want. Maybe it's not such a big deal.
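The color-forcing script amounts to nearest-neighbor snapping in RGB. Palette values below are placeholders, not the logo's actual six colors:

```python
# Placeholder palette - the logo's actual six colors came from the scan.
PALETTE = [
    (0, 0, 0), (255, 255, 255), (200, 30, 30),
    (30, 120, 40), (40, 60, 180), (120, 120, 120),
]

def snap(pixel):
    """Snap a pixel to the nearest palette color (squared distance in RGB)."""
    return min(PALETTE, key=lambda c: sum((p - q) ** 2 for p, q in zip(pixel, c)))

print(snap((250, 10, 20)))  # → (200, 30, 30), a reddish scan pixel snaps to red
```

The gray edges come from scan pixels that genuinely sit closer to gray than to either neighbor color.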

I also wrote a version of that script that broke each color out into a single file:




I proceeded to open these bitmaps (well, black and white PNGs, but close enough) in GIMP, and manually cut things further apart into individual primitives.

I passed these hand-cut PNGs back into the Hough Transform scikit script, and asked for some lines and circles, and started getting fairly good values. In some cases, I got a couple different candidates for lines or circles, but at least for lines, I got a "strength" value, so I just took the strongest candidate.


Ok, so now that I had these, it was time to see what the output was. I started drawing some circles, ok, that looks pretty good, gray, green, white... hmm... Oops, the output from the scikit was coming out as y,x and my drawing code was expecting x,y. Thank you, science.

Flipping things around, the circles all looked like they were coming out about right.

The polygons, though... that would be some more work.

I started with the triangle for the hat. The scikit line parameters were theta, radius - an angle from the origin to the closest point on the line, along with the distance from the origin to that point. I futzed around with some drawing code, and eventually got three very long lines tracing the outlines of the hat, and then continuing well beyond the corners.

Ok, so I need to intersect some lines.

For each line, I parameterized it starting at that closest point to the origin, and proceeding in a direction perpendicular to that radius. So, I had x0,y0,dx,dy descriptions of lines, which you could think of as x(t) = x0+dx*t, y(t) = y0+dy*t. That's convenient.

For each pair of lines, I'd have two equations in two unknowns, and as long as I don't have any infinities around, it should be a matter of algebra to figure out the point of intersection.


My algebra teacher from 7th grade insisted I show my work. So there you go.
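In code, that algebra comes out to a few lines of Cramer's rule (my own sketch, not the exact script I used):

```python
def intersect(l1, l2):
    """Intersection of two lines given as (x0, y0, dx, dy):
    x(t) = x0 + dx*t, y(t) = y0 + dy*t. Returns None if parallel."""
    x1, y1, dx1, dy1 = l1
    x2, y2, dx2, dy2 = l2
    # Solve  dx1*t - dx2*s = x2 - x1
    #        dy1*t - dy2*s = y2 - y1   for t, via Cramer's rule.
    det = dx2 * dy1 - dx1 * dy2
    if det == 0:
        return None  # parallel lines: no single intersection point
    t = (dx2 * (y2 - y1) - dy2 * (x2 - x1)) / det
    return (x1 + dx1 * t, y1 + dy1 * t)

# Horizontal line through the origin meets a vertical line at x = 2:
print(intersect((0, 0, 1, 0), (2, -1, 0, 1)))  # → (2.0, 0.0)
```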

So, I now had three intersection points, which looked pretty sweet.

Only two more primitives to go - a blue quad for the wizard's robe, and a brown one for his staff.

I had broken the robe into two different pieces when doing the edge detection, so I ended up with something like eight different candidate edges. With trial and error, I figured out four that I wanted to keep, and the right order to draw them.

The staff was trickier - my edge detection efforts only got two edges, so I started there, and inserted random other edges from the cloak to begin with. Fortunately, the right side of the staff gets buried underneath the red fireball circle, so I could do a lot of things there and have no visible effect.

On the tail end of the staff, I tweaked the offset until it looked about right. So much for computer recovery of subpixel parameters. -300? Sure, looks good.

I had something that looked pretty good at this point. My eyes were getting a little bleary from all the pixel pushing, so I went back to the original scan, and was pleased to see that things looked pretty close to the original. The mage's left hand seemed to be gripping the staff a little high - the bottom of the circle was just about tangent to the staff polygon. So, I manually offset it down 10 pixels. Again, not doing a great job of adhering to the machine recovery process.

I'm pretty happy with the result, though:

Look at those sharp edges, those clean colors. You may not be able to tell, but it's got an alpha channel. It's still not an SVG, which I could probably have got without too much difficulty by just tracing some edges in Inkscape, but I've got values that approximate the original design that I can begin to play around with for fancy animated logo screens if I want to do that. The primitives could fall into frame from above, revealing the mage in the space of a few seconds. Or I could animate scale - stretch and squash into place. Or combinations of these. Or rasterize the logo, one primitive at a time, one scanline at a time, like one might have done on an Apple ][ back in the day.

So many options.

I could even make a new sweatshirt.







Sunday, June 28, 2015

Smooth Animation, but lots of jaggy edges


You'd think a game loop wouldn't be a hard thing.

while(true) {
  process_input();
  update_game_logic();
  render();
}

That can be adequate for a lot of simple games, but the cool thing to do is to decouple the update calls from the render calls, update on a fixed (sim) clock, and render as often as you can, using a very lightweight interpolation inside the render loop to smooth the trajectories between update calls.

In particular, I'm using a version from Robert Nystrom's "Game Programming Patterns", which looks like this in my language:

 loop {
    float currentTime = get_current_time();
    float elapsedTime = currentTime - previousTime;
    previousTime = currentTime;
    lag = lag + elapsedTime;
    
    sdl_tick_input();
    if (sdl_quit_signalled() == 1) {
      break;
    }

    while (lag >= SECONDS_PER_UPDATE) {
      update();
      lag = lag - SECONDS_PER_UPDATE;
      updates = updates + 1;
    }

    float alpha = lag / SECONDS_PER_UPDATE;
    draw(alpha);
    renders = renders + 1;
  }

The nice thing about this is that the updates are always at a fixed rate, so the simulations don't go unstable, and you still get to render as often as your machine is capable of. I could stick an artificial delay in there to keep the frame rate from going too high (rarely a problem, but worth considering if I wanted to conserve battery power on portable devices like phones, tablets, and laptops).
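One common way to use that alpha inside draw() is to keep the previous and current simulation states around and render a blend of the two. (This is my sketch of the idea, not Nystrom's code.)

```python
# alpha = lag / SECONDS_PER_UPDATE, i.e. how far we are into the next
# fixed update. Blend last update's state with the current one.
def interpolate(prev, curr, alpha):
    return prev + (curr - prev) * alpha

# ship's x went from 10 to 20 during the last fixed update; we're a
# quarter of the way into the next update interval
print(interpolate(10.0, 20.0, 0.25))  # → 12.5
```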


Several bits that I ran into along the way that I need to clean up:
  • I'm being sloppy about pointers within my compiler. I just throw an extra load() to dereference the pointer, and things seem to work out, but it feels super sloppy.
  • I'm getting a segmentation fault about 13 seconds into the run. Don't know why that is, but it's reproducible. Which is a nice thing for a crash bug.
  • I've got user-defined data types (let's call them "structs"). But if I create them at global scope, they don't get linked properly. Gotta figure that out.
I was hoping to get the ship to fly around with simulated intention by this point. That's soon, but not yet, I guess.

Wednesday, June 24, 2015

White Triangle


From a compiler/language perspective, this isn't an interesting post. All I've done is incorporate OpenGL into my existing SDL2 support from the Pong game, earlier.

It was a bit of a hassle merging the SDL2 OpenGL tutorials (which are, I guess, pretty new, and pretty spartan) with the NeHe "drawing a shape on the screen" tutorial (which has a SDL1 implementation). But, hey, there you go. A triangle, rendered in perspective, in my language.

If you're familiar with the NeHe tutorials, you might expect there to be a square on the screen, too. And the square does get drawn, just not at the point that you see here - I have a few different "game modes", and in the first state, I draw a triangle. After several seconds, I switch states, and then I draw the square. So that's even fancier than the NeHe tutorials. Take that!

For the next foreseeable bit, I'll be hooking up OpenGL to my language, so that's not pretty, but it's what needs to happen. I suppose I can write a Python script to grab GL.h and provide some amount of interface for free.
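That header-scraping script could start as a regex over prototypes. The sample lines below are fabricated, and real GL.h declarations are messier (macros, typedefs, continuation lines), so treat this as a starting point only:

```python
import re

# Fabricated sample declarations in the common GLAPI/APIENTRY shape.
HEADER = """
GLAPI void APIENTRY glBegin (GLenum mode);
GLAPI void APIENTRY glVertex3f (GLfloat x, GLfloat y, GLfloat z);
GLAPI void APIENTRY glEnd (void);
"""

# Capture return type, function name, and the raw argument list.
PROTO = re.compile(r"GLAPI\s+(\w+)\s+APIENTRY\s+(\w+)\s*\(([^)]*)\);")
funcs = PROTO.findall(HEADER)
print([name for _ret, name, _args in funcs])  # → ['glBegin', 'glVertex3f', 'glEnd']
```

From those tuples, emitting whatever foreign-function declarations my language wants is a formatting exercise.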


A few things that I should add to my TODO list:

  • GLES support - I'm using the old OpenGL glBegin(GL_TRIANGLES) style. That won't fly on all the platforms I intend to hit, so I'll rewrite the triangle using a vertex array, which hopefully my language will be able to support natively.
  • Strings - pretty soon, I'm going to want to accept user input (for a high score list?) and show text onscreen (on an about screen?) without the gross approach I used in Pong, which included a draw_char call that put a character at a position on the screen - no word wrapping, no language support for strings. Maybe that won't be too hard.

Monday, June 22, 2015

I have nothing to say, but it's OK, because now I can say it.

Apologies to The Beatles.

A bit ago, I copied the source code for my Pong game into a new directory, as a way to start work on my next game in my language. I proceeded to comment out a large chunk of code.

Tangent: languages often don't have good facilities for commenting out chunks of code that might already include commented out code. In C++, say, you've got your single line comments

// like this

And there are multiple-line comments

/*
 * that can go on for many
 * lines, like this
 */

And if you want to disable a chunk of code that has a single-line comment, it's fine:

  for (int i=0; i<10; i++) {
    if (i < 3) {
/*
      // show small numbers
      printf("small number %d", i);
*/
      process_data(i);
    }
  }

But it gets tricky if I want to disable that entire loop, because the first */ terminates the comment; C block comments don't nest, even when the comment was opened inside another comment.

C and C++ have one more tool, the preprocessor, which helps some, so I could just do this:

#if 0
  for (int i=0; i<10; i++) {
    if (i < 3) {
/*
      // show small numbers
      printf("small number %d", i);
*/
      process_data(i);
    }
  }
#endif //0


but that's still not terrific. Maybe disabling code isn't what a compiler is for. Still.
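A new language could sidestep the whole problem by letting block comments nest, which only costs the lexer a depth counter. A sketch in Python (not my actual lexer):

```python
def strip_nested_comments(src, open_tok="/*", close_tok="*/"):
    """Remove block comments, allowing them to nest by tracking depth.
    (C does not do this; this is how a new language's lexer could.)"""
    out, depth, i = [], 0, 0
    while i < len(src):
        if src.startswith(open_tok, i):
            depth += 1                  # entering a (possibly nested) comment
            i += len(open_tok)
        elif depth and src.startswith(close_tok, i):
            depth -= 1                  # only closes one level
            i += len(close_tok)
        else:
            if depth == 0:
                out.append(src[i])      # keep text outside all comments
            i += 1
    return "".join(out)

code = "a(); /* outer /* inner */ still out */ b();"
print(strip_nested_comments(code))
```

With depth tracking, the inner */ drops back to depth 1 instead of ending the comment, so "still out" stays commented out, which is what you'd want when wrapping already-commented code.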


But that's not what I came here to talk about today. As I was saying at the top, I commented out a bunch of code from my Pong game, and stuff wouldn't compile. Ok, I thought, perhaps I was sloppy and commented out half of a function, and the remaining bits are causing problems. That may well have been part of it, but I cleaned up the obvious errors like that, and still I was running into issues.

One thing that doesn't get a lot of coverage in compiler tutorials is how to deal with errors - lexing errors, parse errors. There's perhaps a paragraph talking about inserting a special error token into your language, which is what gets matched when things go wrong. That's a start, but then what do you do when that production gets called? How do you present enough information to the user of your compiler to fix the problem?

Right now, I print a warning with a line number and a little more information, which isn't currently sufficient. For one thing, the line number doesn't line up with the actual line number of the file being compiled. I think part of that comes from my compiler starting at line number 0 versus emacs, which starts at 1. Also, if I #include files, those throw off the numbering. Whoops. So, that's stuff to fix.

It'd also be nice to know what column the error was on, if I know that.

I've looked a little bit at pycparser, which seems to do a better job than I am for this sort of thing. TODO: roll in more of pycparser's approach to error handling.
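For the #include numbering problem, one approach is to record, for every expanded line, which file and 1-based line it came from. A toy sketch, with a dict standing in for the filesystem and an invented #include syntax:

```python
def expand(name, files, mapping=None, out=None):
    """Expand '#include "f"' directives, recording for every output line
    which file and 1-based line it came from. `files` maps filename to
    contents, standing in for the real filesystem."""
    if mapping is None:
        mapping, out = [], []
    for lineno, line in enumerate(files[name].splitlines(), start=1):
        if line.startswith("#include"):
            included = line.split('"')[1]
            expand(included, files, mapping, out)   # splice the file in
        else:
            out.append(line)
            mapping.append((name, lineno))          # remember the origin
    return out, mapping

files = {
    "main.src": 'x = 1\n#include "util.src"\ny = 2',
    "util.src": "helper()",
}
lines, mapping = expand("main.src", files)
# An error on output line 3 can now be reported against the right place:
print(mapping[2])  # ('main.src', 3)
```

Starting the enumerate at 1 also takes care of the off-by-one against emacs.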


And even that isn't what I meant to be talking about.

So, I waded through the output of my compiler, complaining about unexpected closing braces, and I ultimately determined that code like this was causing trouble:

void display_char(char c) {
# sdl_show_char(c);
}

Seems innocuous - a call to an external function that I commented out, since I'm rewriting my graphics library. Except, whoops, aha. The body of display_char is now empty. And, whaddayaknow, my compiler doesn't like empty bodies for functions. Or, as it turns out, for if/elif/else bodies, either:

if (phase == PLAY_GAME) {
#  draw_game();
} elif (phase == SHOW_TITLE) {
# draw_title();
} elif (phase == SHOW_CREDITS) {
# ...
}

As it turned out, that was super easy to fix - I just extended my grammar to allow statementlists to be 0 or more statements, instead of 1 or more, which is how they were before.

So, now I can have functions that do nothing, and blocks of code that are empty. Progress!    
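The change amounts to making the statement-list production accept zero statements instead of one or more. In recursive-descent terms (a sketch, not my actual parser), empty bodies then fall out for free:

```python
# Before: statementlist : statement | statement statementlist   (one or more)
# After:  statementlist : <empty>   | statement statementlist   (zero or more)

def parse_statementlist(tokens, i):
    """Gather statements until the closing brace; zero statements is fine.
    Tokens are plain strings here to keep the sketch small."""
    stmts = []
    while i < len(tokens) and tokens[i] != "}":
        stmts.append(tokens[i])   # a real parser would call parse_statement()
        i += 1
    return stmts, i

body, _ = parse_statementlist(["}"], 0)
print(body)   # [] - an empty function body now parses
body, _ = parse_statementlist(["a;", "b;", "}"], 0)
print(body)
```

Since the loop condition is checked before the first iteration, a body consisting of nothing but a closing brace is accepted without any special casing.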

Saturday, June 13, 2015

Local arrays, arrays of objects


What you see here is an array of user-defined objects (Ship) like so:

class Ship {
  Vec2f pos;
  Vec2f vel;
}

in a locally-defined array. And you can see position and velocity being set, and then the position being updated using the ship's velocity. And the values are being read out correctly.

All working as you'd expect.

After the flailing to get the arrays working to begin with, this feels anticlimactic.

Next up: graphics, including using the above position and velocity data, vertex arrays, all on the way to an Asteroids-like game.

Friday, June 12, 2015

Reading/Writing ints/floats to global arrays


It doesn't look like a big thing, but the relevant bit of my code is this:


int intvals[3];
float floatvals[10];

int main() {
  intvals[0] = 17;
  intvals[1] = 19;
  floatvals[0] = 3.14;
  floatvals[1] = 1.5;


Here, I'm allocating two arrays in global space, assigning to them (and then reading from them, not pictured in the code snippet here).

A couple of things I discovered along the way, which aren't well documented in the stuff I was looking at: array allocations need to have their "linkage" specified, which I haven't been doing for other global variables. Also, I wasn't able to get things working until I provided initializers for my arrays. That's a good thing - I was considering providing default initialization anyway - but I found it surprising that the LLVM assembler wouldn't accept my generated bytecode until I provided initialization values for my arrays.
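My compiler goes through llvmlite, but the textual IR shows what the assembler was insisting on. Here's a dependency-free sketch that just emits the declaration as text, with the linkage keyword and a zeroinitializer spelled out (the names come from the snippet above; the helper itself is invented):

```python
def declare_global_array(name, elem_ty, count, linkage="internal"):
    """Emit the textual LLVM IR for a global array declaration.
    The assembler wants both a linkage and an initializer;
    zeroinitializer gives default-initialized contents."""
    return f"@{name} = {linkage} global [{count} x {elem_ty}] zeroinitializer"

print(declare_global_array("intvals", "i32", 3))
# @intvals = internal global [3 x i32] zeroinitializer
print(declare_global_array("floatvals", "float", 10))
```

Leaving off the initializer produces a mere declaration of an external symbol rather than a definition, which is one way to read why the assembler balked.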

Next up:
- arrays declared locally (top of function, inside loops)
- arrays of structs (e.g. Vec2f, Ship)
- *GL linkage, passing arrays through to OpenGL and WebGL


Thursday, June 4, 2015

Loading into, reading from, structs


I've found that in order to make progress on my language, it helps to have very specific lines of code that I want to make sure my compiler compiles, and focus on those.

The couple lines of code that I've been struggling with off and on lately have been:

  newpos.x = pos.x + vel.x;
  newpos.y = pos.y + vel.y;

Really simple stuff; retrieve member data from an aggregate type, do some arithmetic, store the results into member fields of an aggregate type. It so happens that this is simple physics / animation code, but the compiler doesn't care about that.

I had done some reading of the LLVMlite documentation and was sure that I needed to do some sort of insert_value and extract_value code. And then there's getelementptr, which is tricky, to the point that it's got its own FAQ page: http://llvm.org/releases/3.5.1/docs/GetElementPtr.html

In my previous post, I talked about using Clang as a reference - I might not have to understand all of LLVM; if Clang's got a way to accomplish what I'm doing, I can do the same thing that it does. And all it does is simple getelementptr instructions.

So, I started stripping out stuff that I had built to handle stuff on the left hand side of assignments differently from stuff on the right hand side. I put some getelementptr calls into my code. I started seeing errors about "gep not implemented on identified struct type", which I thought I maybe had to work around, but it turns out that somebody else had run into that bug and had submitted a fix for it. So, git pull and reinstall the LLVMlite module, and all of a sudden, things were working a lot better.

There are still some gross bits left over from my efforts to set up a complex solution to what was ultimately a simple problem - that's got to get cleaned up before I push this stuff up to my repository. I think I've also found some pointer stuff that now looks ripe for refactoring.

Next short term feature: (static?) array support - being able to create an array of objects, either simple or aggregate objects, and then reading and writing those objects.

Longer term ambition: some sort of Asteroids-ish clone. That'll put the above physics / animation code into context, and having a collection of asteroids floating around the screen will motivate the array support.

Getting away from language features, an Asteroids clone will provide pressure for more OpenGL support, which will be a good thing.

Wednesday, May 27, 2015

Recursive Structs in LLVM experiments, using Clang


You don't get enough science in computer science, like with the whole experimental method.

I'm (currently?) using LLVMlite (https://github.com/numba/llvmlite) in my compiler project. Some things you might want to do aren't well covered in the LLVMlite documentation, or even in the LLVM documentation.

Case in point, I'd like to be able to generate code for something like this:

  Ship bar;
  bar.pos.y = 7;
  bar.vel.y = 3;
  bar.pos.y += bar.vel.y;

I haven't been making a lot of progress lately (you'd know), trying to wrap my head around what I'd even need to keep track of for that to work, including most recently thinking that I'd need to be able to walk the tree to create a path of element offsets, keep track of l-vals and r-vals, and some sort of stack of symbol tables.

But then I recalled that I had used Clang before to figure out how they do things.

  clang -S -emit-llvm teststruct.c 

I just pointed that commandline at some C that does what I want, and it spits out LLVM instructions that I can use as a reference.

For example, some of the instructions generated for the above code are:

  %10 = getelementptr inbounds %struct.Ship* %bar, i32 0, i32 0
  %11 = getelementptr inbounds %struct.Vec2f* %10, i32 0, i32 1
  %12 = load i32* %11, align 4
  %13 = add nsw i32 %12, %9
  store i32 %13, i32* %11, align 4
 
So, there you go. Just get the element pointer, then get the element pointer, and then bam, put the thing in the thing. That's it.
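Generating that pattern mostly takes a table of field indices per struct type. A toy sketch, echoing Clang's %N register style - the layouts and the helper are invented for illustration, not my compiler's actual data structures:

```python
import itertools

# Field layouts for the aggregate types, as a compiler might record them.
layouts = {
    "Vec2f": ["x", "y"],
    "Ship":  ["pos", "vel"],
}

def member_gep(base_reg, struct_ty, field, counter):
    """Emit a getelementptr that points at one member of a struct."""
    idx = layouts[struct_ty].index(field)
    reg = f"%{next(counter)}"
    # First i32 0 steps through the pointer; the second indexes the field.
    return reg, f"{reg} = getelementptr inbounds %struct.{struct_ty}* {base_reg}, i32 0, i32 {idx}"

counter = itertools.count(10)
# bar.vel.y: gep to the vel member of the Ship, then to y within the Vec2f.
r1, line1 = member_gep("%bar", "Ship", "vel", counter)
r2, line2 = member_gep(r1, "Vec2f", "y", counter)
print(line1)  # %10 = getelementptr inbounds %struct.Ship* %bar, i32 0, i32 1
print(line2)  # %11 = getelementptr inbounds %struct.Vec2f* %10, i32 0, i32 1
```

Each level of member access just chains another gep off the previous result, which is why the Clang output reads the way it does.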

Reverse engineering. One of my top several favorite kinds of engineering.

Sunday, May 3, 2015

SIFF 2015 Scheduling, Part 3 of the epic trilogy(?)

In the prior post, I had been publishing the showing times into CSV, which I was pulling in to Google Sheets.

Seems like a more natural way to show a sequence of things in time might be a calendar app, so I poked around with the Google Docs API.




I started off with the Python Quickstart: https://developers.google.com/google-apps/calendar/quickstart/python which walked through getting the client secret and OAuth bits. I then followed (cut-and-pasted) examples from https://developers.google.com/google-apps/calendar/v3/reference/events/insert to put the showings into a calendar.
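The request body for events.insert is just a dict with summary, location, start, and end fields. A sketch of building one showing's event, with made-up sample data; the real script's handling may differ:

```python
from datetime import datetime, timedelta

def showing_to_event(title, venue, start, minutes):
    """Build the request body for events.insert in the Calendar v3 API.
    Field names follow the API's event resource; values are sample data."""
    end = start + timedelta(minutes=minutes)
    fmt = "%Y-%m-%dT%H:%M:%S"
    return {
        "summary": title,
        "location": venue,
        "start": {"dateTime": start.strftime(fmt), "timeZone": "America/Los_Angeles"},
        "end":   {"dateTime": end.strftime(fmt),   "timeZone": "America/Los_Angeles"},
    }

event = showing_to_event("Song of the Sea", "SIFF Cinema Uptown",
                         datetime(2015, 5, 3, 19, 0), 93)
# service.events().insert(calendarId=..., body=event).execute() would send it.
print(event["end"]["dateTime"])  # 2015-05-03T20:33:00
```

Having the running time from the spider pays off here: start plus duration gives the end time the calendar wants.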

If you'd like to see the calendar, maybe because you're going to SIFF and want to see the movies in a different format than siff.net provides, that's cool - check out this link: https://www.google.com/calendar/embed?src=18j890v2366vkfpcn9nhn3828k%40group.calendar.google.com&ctz=America/Los_Angeles

There are a few problems, including Unicode still giving me grief, which I'm currently avoiding by just dropping the movies that have difficult accents. I might want to watch a movie about sake, but if it's "saké", then it's a problem. Also, some of the special programs have pages that are formatted a little differently, leading to at least one event ostensibly screening at "Buy".

I may continue to tinker, or maybe I walk away from it - it's plenty useful as it is now for me. If it's useful to you, that's groovy, as well.




SIFF 2015 - Movie 0 : Song of the Sea

I've already posted about the geekery of pulling data off the siff.net webpage and putting it into CSV or Google Calendar format. That continues.

Today, while I wasn't doing that, I was actually watching a movie at SIFF. I always think it's a bit confusing to the uninitiated, but SIFF is an organization that runs, amongst other things,  the Seattle International Film Festival event. Which is also called SIFF. SIFF, the organization, also shows films year-round. As well as during the festival.

So, today, I went to the SIFF Uptown cinema to watch a movie that's not part of the festival. Is that all clear? Did it need to be? Probably not, but that's why I'm not numbering it as a festival movie. A pre-festival movie.

http://www.siff.net/cinema/song-of-the-sea

An animated movie out of Ireland, this is the story of a family whose mother disappears under mysterious and possibly tragic circumstances right about the time of the birth of the daughter. And then the daughter turns out to be a "selkie", which is like a were-seal, I guess. I read stats for them in a Monster's Manual at some point.

This was done by the same team that did "The Secret of Kells", which I enjoyed, which was a story about calligraphers in the middle ages. Or, a kid apprenticed to calligraphers. Big eyes, very flat shapes - reminded me a whole lot of Genndy Tartakovsky's "Samurai Jack" style. And "Song of the Sea" has a lot of the same style.

One thing bothered me when I looked at the description: it's described as a movie about a boy who discovers his sister is a selkie. Is it really that, or is it a movie about a girl who is a selkie? Turns out, the description is probably a little more apt. It's about the boy's adventures, tugging his supernatural sister around. You see the rest of the family, but clearly the boy is the focus. But why is that? Isn't the interesting character the girl?

Random other observation: how do the Irish learn how to pronounce things? The girl's name is "Saoirse", pronounced "seer-shuh". It's their language, they can do what they want, but wow.

Some peril, no dirty language as far as I can recall. Seems appropriate for kids.

SIFF 2015 Scheduling (Part 2/???)

My efforts to get SIFF schedule data into a format useful to me continues.

Earlier, I had collected the movies that I had already manually tagged on SIFF's website as ones I might be interested in. There was some data that I didn't have in that original pass, like the running times of the movies, so I proceeded to spider the full 2015 festival schedule:


Before I go on, let me encourage anybody that is inspired to do any sort of crawling of a website to be considerate, perhaps rate-limiting your requests - it's easy to imagine a script getting out of hand and inadvertently becoming a "denial of service" bot. And presumably, you don't want that.

So, you can see above that I've pulled the different showing data for each movie, and each showing is its own line in my new spreadsheet. That's handy.

You will also see that the movie "Décor" got mangled in the process. I fought with Unicode and encodings and decodings and Python 2 vs Python 3, and I gave up. What's worse is that you don't see "Paco de Lucía: A Journey", because the URL for the movie is non-ASCII, which I guess is fine, except that it broke my stuff. So I skipped that movie altogether. Maybe that's a tip for webmasters that want to discourage lazy hackers: throw in some accented characters, and hope that Unicode is too much work to bother with.

I'm assured that Python 3 gets the Unicode stuff right, and if I were to start all over again right now, I might use Python 3, but how many of the libraries I depend upon currently support Python 3? (Some, I imagine. Probably not all.)

Also, while I'm here, I'll mention that I appreciate that the midnight showings are listed as 11:55pm showings. That's unambiguous and easy to understand.


Next up: Hm, I don't know - I was thinking of jamming all of this information into a Google Calendar, which could be pretty useful. A whole new set of APIs to wrestle with, which isn't entirely a bad thing. I've got the start time, duration, and location - all of which would make for useful values in a schedule.

Saturday, May 2, 2015

SIFF 2015 Scheduling (Part 1?)


It's time for the 2015 Seattle International Film Festival. Last year, I saw somewhere around 30 movies. This year, I think I might be able to do around 50. I've put some hurdles in my way like deciding to go into work most of the days of the festival, and maybe I won't see the midnight movies like I might have when I was younger.

I spent some time going through SIFF's festival website, and got somewhat impatient and frustrated with what it provided me, so I pulled down the data of movies I was (somewhat?) interested in.


SIFF's website lets you add movies to "MySIFF", which is then presented in one page of scrapable HTML. I used the Python "Beautiful Soup" tool to find the appropriate chunk of data inside the HTML file, then I carved out individual chunks of data that I wanted to track. I bundled those bits back up and sent everything out to a CSV file, which Google Sheets (or "Google Docs Spreadsheets", as I call it) was happy to import.

I did a tiny bit of special-case formatting in my script so that the movie "808" renders as a string instead of an integer. I didn't special-case that movie, but I did recognize that there might be titles that get interpreted as integers.
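One common trick for titles like "808" is to wrap anything that parses as a number in a ="..." formula, so the spreadsheet import keeps it as text. I'm not certain this is exactly what my script does, but it's the shape of it:

```python
import csv
import io

def as_text(value):
    """If a title would be parsed as a number by a spreadsheet import,
    wrap it in a ="..." formula so it stays a string. (A common trick;
    the script's actual special-casing may differ.)"""
    try:
        float(value)
        return f'="{value}"'
    except ValueError:
        return value

rows = [("808", "2015-05-20 19:00"), ("Décor", "2015-05-21 21:00")]
buf = io.StringIO()
writer = csv.writer(buf)
for title, when in rows:
    writer.writerow([as_text(title), when])
print(buf.getvalue())
```

The csv module handles the quoting of the embedded quote marks, so the formula survives the round trip into the file.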

Next Up(?) - constraint satisfaction to help me decide when I should see a particular screening of a movie (movie A conflicts with movie B now, movie A is only showing now, and movie B has no conflicts next Tuesday; therefore, movie A is the better option now).

Sunday, March 22, 2015

And now, for something completely different, and inconsequential


Taking a break from composite structure management inside my language, I wrote a little Python script to generate k-ominoes. I was first introduced to the 5-tile version, "pentominoes". If I recall correctly, somebody manufactured a playkit/puzzle of a set of all(?) of the pentominoes, and one of the challenges was to fit them all into the box.

Playing around with tetrominoes will be familiar to people who have played Tetris before.

One vaguely tongue-in-cheek idea I have for a game is "k-tris", where you get to select the value of k, that is, the number of tiles making up the pieces that you'll be dropping. For k=4, the game is Tetris, and might earn me a cease and desist, so perhaps I only allow odd values for k. k=3 becomes a pretty trivial game, and even more so for lower values. Not that it's my job to police that. Feel free to play one-block-Tetris if you want. My guess is that "Pentris" (k=5) or "Septris" (k=7) will be really hard to play. But there's no reason it'd be any harder to program than the more familiar versions.
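Generating the pieces is a nice little exercise: grow each (k-1)-omino by one neighboring cell, and canonicalize under translation, rotation, and reflection to dedupe. A sketch of the idea (not necessarily how my script does it):

```python
def canonical(cells):
    """Canonical form of a cell set under translation, rotation, and
    reflection - i.e. 'free' polyominoes, where mirror images match."""
    forms = []
    pts = list(cells)
    for flip in (False, True):
        p = [(x, -y) if flip else (x, y) for x, y in pts]
        for _ in range(4):
            p = [(y, -x) for x, y in p]           # rotate 90 degrees
            minx = min(x for x, _ in p)
            miny = min(y for _, y in p)
            forms.append(tuple(sorted((x - minx, y - miny) for x, y in p)))
    return min(forms)

def ominoes(k):
    """All free k-ominoes, as canonical tuples of cells."""
    shapes = {canonical({(0, 0)})}
    for _ in range(k - 1):
        grown = set()
        for shape in shapes:
            for x, y in shape:
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (nx, ny) not in shape:
                        grown.add(canonical(set(shape) | {(nx, ny)}))
        shapes = grown
    return shapes

print([len(ominoes(k)) for k in range(1, 6)])  # [1, 1, 2, 5, 12]
```

That final 12 is the familiar pentomino set from the puzzle box, and 5 is the count of free tetrominoes (Tetris distinguishes mirror images, which is why it has 7).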

Sunday, March 15, 2015

Storing into aggregate types

I've been working on updating the grammar to handle aggregate types (structs, classes, that sort of thing). This weekend, I've adapted some of the C grammar and got things running again.




The LLVM code you see up there uses the "insertvalue" instruction, which is the way to write into a struct. The paired instruction is "extractvalue", which the compiler doesn't yet generate.

There are a few other issues with the generated code - the correct offset isn't being used. I think the "1" at the end of each line is an index to the element within the aggregate structure. I'm computing an offset for x and y members of my 2d vectors, but it's not piped through properly.

And then there's the loading and reloading of each of the aggregate instances. I know that LLVM has some aggressive optimization, but I'd rather not rely on it to clean this sort of thing up.

Friday, March 13, 2015

K&R 2e


I don't have any progress to report on the development of the language - I'm currently trying to get structs working, and that's involved some pretty substantial breakage of the grammar. I had a grammar rule for assignments which was something like <variablename> = <expression>, which was sufficient for a while - I could evaluate functions and bind them to variables, I could have mathematical expressions, I could take the contents of variables, and assign them, things were rosy.

But then I tried "velocity.x = 3", and my structure didn't hold up anymore. I had already created the velocity struct, and that part of the work was pretty much working (see my earlier posts about migrating to llvmlite in order to get aggregate types being generated). I even knew that the "x" member of the velocity structure was the zeroth element, because I was keeping that information around.

I just ran into a stumbling block in how to represent a struct member as a "L-value", a thing that can be assigned to.

I looked at some online references for the C grammar, and found an ANSI C grammar which sent me down a rabbit hole of rewriting my expression productions. I looked at either pycparser or maybe pycparser (honestly, probably the first one) and I see a bunch of ways that my code can be cleaner, but I'm still not parsing struct assignment correctly.

I stumbled across a LALR tutorial which makes everything seem simple.

And, I guess, for simple languages, simple parsers are simple. But things get hairy quick.


C is, in a lot of ways, a simple language. I learned it in maybe a week, probably much less than that. I bought myself the K&R "white book" (pictured above) around Christmas of 1989. Hm, I'm not sure all of the chronology of this story makes sense, because I'm pretty sure I bought myself an IBM PS/2 286 in January of 1990, and I taught myself C on that 286 with the clackety keyboard and the VGA monitor and the 20 MB hard drive.

C is an easy language to dive into, and it's pretty easy to understand how things work. I still smile at the characterization of C as having the power of assembly language with the readability of assembly language. Looking at C and assembly today, it's easy for me to read either one of them, but the difference is the level of abstraction - there are more "idioms per line" in C than there are in assembly.

And then, there's C++, which I taught myself bits of in 1992 and 1993. Again, the structure of the language is there to afford more idioms per line than C. But I never found C++ quite as satisfying as C was, and over time, I managed to ship plenty of working C++ code, and pass several programming interviews, and I knew about the "diamond of death" multiple inheritance issues, and I knew about vtables, and it was fine, but I kept looking for something else.


On the flip side, when you look at the grammar of C, it's hard to call it "clean". There are simpler languages out there (LISP and PostScript come to mind, as well as a few proprietary languages that I've had to use professionally). In each of these environments, there's a balance between the complexity of the language tools (parser, compiler, interpreter, etc) and the productivity of the people using these tools. At one company I worked at, an engineer insisted that a scripting language was important to the project, but he couldn't convince management to allocate time for it, so a simple PostScript(ish) interpreter was coded up over a weekend, and that engineer was happy. Management was happy, because people were still on-schedule, and then junior engineers were given the task of creating a coding culture in this stack-based language.
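The core of such a weekend interpreter is tiny: a stack, a token loop, and a dict of operators. A toy sketch, nothing like the real proprietary one:

```python
def run_ps(program):
    """A toy PostScript-flavored interpreter: numbers push onto the stack,
    names execute. Just enough operators to show the shape of the thing."""
    stack = []
    def add():  stack.append(stack.pop() + stack.pop())
    def mul():  stack.append(stack.pop() * stack.pop())
    def dup():  stack.append(stack[-1])
    def exch():
        a, b = stack.pop(), stack.pop()
        stack.append(a)
        stack.append(b)
    ops = {"add": add, "mul": mul, "dup": dup, "exch": exch}
    for token in program.split():
        if token in ops:
            ops[token]()          # execute an operator
        else:
            stack.append(float(token))  # anything else is a literal
    return stack

print(run_ps("3 4 add dup mul"))  # [49.0]
```

Postfix notation means there's no precedence, no parentheses, and no real parser - which is exactly why it fits in a weekend, and also why a request for "regular expressions" (or infix anything) would have been a substantial change.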

Toward the end of that project, I was approached to see if I wanted to add "regular expressions" onto the language, which didn't make any sense to me. I'm pretty sure it didn't make any sense to the person asking me, either. Perhaps he meant infix operators, which would have been a substantial change to the language.


All of this to say, I still have fondness for C. I dug around in the house and found my 25 year old copy of K&R, and I'll be using that as a reference as I do some more hacking with the grammar of my language this weekend. Hopefully, I'll move closer to that sweet spot of clean language design balanced against clean language implementation.

Sunday, March 8, 2015

Test Driven Compiler Development


Some more (hard to screenshot) progress, some more features that turn out to be annoying in real development.

A while ago, I was trying to get aggregate types (e.g. structs, classes, etc) into my language so that I could represent the position of the ball in my pong game as a single variable. I had trouble with getting LLVMPY to assign to or read from members of an instance of an aggregate type.

I dug into the mailing list for LLVMPY and discovered that LLVMPY was no longer maintained, and the cool new thing is llvmlite, which is a thinner wrapper around LLVM. Ok, so that's what I need to take advantage of the full LLVM feature set, that's fine.

So, I set aside the work on aggregated types and began rewriting the compiler to use llvmlite. I'm using git, so this was as easy as creating a new branch on my local machine.

This being the recommended upgrade path, there is an LLVMPY emulation layer, which you can presumably just import instead of the old library. I figured switching all at once was the right thing to do, so at first I ignored the migration layer. Much of the migration was straightforward, but I bumped into some tricky bits, for instance managing global variables, and the migration layer turned out to be a really good reference for how those parts might be implemented.


As I was making my changes, I compiled several of my earlier test cases, verifying that the compilation completed, and then eyeballing the output. A better approach would be to embed expected output into the tests themselves and then automatically verify the built output against the expected output.
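That expected-output checker is only a few lines if the "run the compiled binary" part is injectable; a sketch, where the test names and layout are assumptions about how such a suite might look:

```python
def check_outputs(cases, run):
    """cases: list of (name, expected stdout); run(name) returns the actual
    stdout. In the real suite, run() would execute the compiled binary;
    here it is injected so the checker itself is easy to exercise."""
    passed = 0
    for name, expected in cases:
        actual = run(name)
        if actual == expected:
            passed += 1
        else:
            print(f"FAIL {name}: expected {expected!r}, got {actual!r}")
    print(f"passed {passed}/{len(cases)}")
    return passed

# Hypothetical test names and outputs, standing in for real compiled binaries:
outputs = {"print7": "7\n", "pong_title": "PONG\n"}
check_outputs(list(outputs.items()), lambda name: outputs[name])
```

With real binaries, run() would be a subprocess call capturing stdout, and the expected strings would live in sibling .expected files next to each test source.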

One step in that direction that I did implement is a simple python script that verifies that all the binaries build without compilation error, which is what's in the screenshot here:


You can see "built 17/17". Can't do much better than that.

Along the way, I updated some of the pong code to start using floats for position and velocity, but then I started getting a lot of errors where I was comparing a floating point number to an integer. My language doesn't allow this, and it has no automatic type conversion. Indeed, it has no conversion at all, so pong wasn't compiling, and I had to wade through a bunch of these lines.

Worse than changing numerous lines of code was the difficulty of finding the lines where the problems were. Several of the errors reported a line number, but they were invariably way too high, which resulted from doing two passes over the source code.

I fixed that, and now my error messages are more useful, and all of my tests compile and run.


I proceeded to check in my llvmlite branch, which now needs to be merged with my work on aggregate types.

Saturday, March 7, 2015

It prints 7

I've installed llvmlite, the successor to llvmpy, which is itself the successor to something else.

I've installed the prerequisites, run the tests, and begun poking around with the sample code and the docume... no. The nice thing about llvmlite is that it's a really thin layer around the LLVM library, so the LLVM documentation is going to be my reference.

I proceeded to hack one of the test scripts to output LLVM code to match one of my earlier tests.


and, as advertised, it prints 7:

Wednesday, March 4, 2015

Or, maybe I could do it that way.

So, my language recognizes struct declarations. And allocations. And I've been working on assigning to instances of these structs.

I had false starts that I've mentioned here before, specifically about creating anonymous structs, which are probably useful for a language feature I want to do later, but not really relevant right now.

And so, I've been looking for examples of how it's supposed to be done. I found vague references to "getelementptr" in LLVM, which in LLVMPY is exposed as "gep", because hey, it's hard to type stuff.

Interesting side trip: my language is pretty close to C, so I wrote a little bit of C code demonstrating the behavior I want in my language, and I used "clang" to compile it to LLVM.

clang -S -emit-llvm teststruct.c 

Easy peasy. Dropped a bunch of LLVM asm into teststruct.s, which used that getelementptr thing that I was trying to use. So, maybe I'm going in the right direction.

More searching, more experimentation (they call it Computer Science because we use the scientific method) and I start exploring the LLVMPY issue tracker, and I discover a feature request to support structures. Ok, good, it's a priority... ooooh. No. That's been outstanding for going on two years now.

And I start poking around the LLVMPY users' mailing list, and I discover a post suggesting that LLVMPY users switch over to LLVMlite.


So, I guess that's the next thing on my TODO list - migrate over to LLVMlite. And maybe, you know, read the documentation.

Tuesday, March 3, 2015

On my way to implementing classes



I've been working on adding classes to my language, and I've got very little to show for it so far. Previous modifications have been relatively small, adding a little bit to the grammar, handling one new production, a small test case, and post here.

The test case I'm working on is this:


I added class definitions to my language, and that seems to be working OK, and I seem to be able to declare the instances of my classes just fine, but assigning to the members (e.g. "scrpos.x = 160;") has been giving me a hard time up to this point.

The place where I was stuck was in handling the "MemberRef" production (what happens when the parser processes the "." member operator). I have an object that I know is a struct, and I know it's got two int elements, and I know I want to refer to the "x" element, but I was banging my head, trying to figure out how to get the "Vec2i" name for that object.

Turns out, when I was constructing my struct object originally, I had ignored an optional parameter that is the struct's name. Now I think I have what I need.


On the left, in black-on-white, you'll see llvm.core.Type.struct(types, self.classname). This is where I'm passing in "Vec2i" in as the name of the struct I'm constructing. On the right, in white-on-black, you'll see my debugging spew, where it is reporting that the "base type name", the name of the type of scrpos in my sample at the top, that's "Vec2i", where I need to be able to figure out what "x" means.

A couple lessons here:

1) (boring) llvm.core.Type.struct can take more than one parameter; just because the code runs doesn't mean that it's doing what you want.

2) LLVMPY's documentation can be sparse. However, in this case, I could have discovered my problem by rereading the documentation on the struct constructor. By passing in no name, earlier, I was implicitly asking for an "opaque" struct.

3) source control is your friend. I don't think I need to tell you that, but I've been ripping out and rebuilding a lot of stuff, and the only way I can feel confident doing this is knowing that I've got a checked-in "working" version to revert to if things go wrong. Indeed, I've added a bunch of stuff that's now unnecessary, and I can filter out the unuseful stuff before I commit.