Monday, July 29, 2019

Constraint Propagation, "Wave Function Collapse", Dungeon Generation


I'm still trying to get my book "Shapeshifting: The Dream of Dagrec's Legacy" finished by the end of the summer, and if at all possible, available by mid-August.


The book is intended to be an homage / parody / satire / reflection on old game systems, old game books that were evocative, but not actually sufficient to be played as written, leaving players to come up with house rules to fill in the gaps.

Also, it's being largely written by AI. More on that in a later post.

I decided that a relevant thing to add to the book was a generated dungeon. The 1st Edition Dungeon Master's Guide had some tables to help you generate a random dungeon, and there were Dungeon Geomorphs, which were map tiles that you could lay out in random assortments, if you liked.

And, of course, every roguelike that's sufficiently like Rogue will have some sort of dungeon generation system, which typically separates the world into rectangular rooms connected by a network of tunnels.

A while ago, I stumbled across people getting excited about "Wave Function Collapse", a "new" algorithm that can be used to generate plausible content, given a small input sample. The animations I looked at were pretty, but the algorithm seemed to be a repackaging of constraint propagation, which I had been using for content generation for years. I had first seen the technique in my AI class back in 1991, where it was used to find a consistent labelling of edges for image comprehension - a computer vision view on a world of blocks.

The one thing that looks like it's different in WFC is that the frequencies of patterns are recorded, and when a pattern is matched as a basis to place new content, the input frequency data is consulted to pick a plausible successor. For example, if my input had straight corridors followed by more straight corridors 9 times out of 10, and a turn 1 time out of 10, I could use that to generate corridors that resembled the input by rolling a d10 and placing a turn 1 time out of 10, otherwise continue the corridor going straight.
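In Python, that weighted pick is nearly a one-liner with random.choices. Here's a minimal sketch, using the 9-in-10 corridor example above as made-up input frequencies:

```python
import random

# Hypothetical frequencies observed in an input sample: straight corridors
# follow straight corridors 9 times out of 10, a turn 1 time in 10.
successor_counts = {"straight": 9, "turn": 1}

def pick_successor(counts, rng=random):
    """Pick the next tile, weighted by how often it followed in the input."""
    tiles = sorted(counts)
    weights = [counts[t] for t in tiles]
    return rng.choices(tiles, weights=weights, k=1)[0]
```

Over many draws, roughly one pick in ten comes up a turn, which is exactly the d10 behavior described above.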

One hesitation I've had about the various WFC implementations I've looked at is that they largely don't implement backtracking, meaning they can get stuck in dead ends. This apparently isn't a show-stopper in practice - the algorithm rarely gets stuck, and when it does, you just restart and it'll be fine. Typically.

So, I wrote a simple WFC / constraint system, without implementing backtracking, and it was generating dungeon maps pretty much to my liking, resembling the image at the top of this post. One feature in my tileset was stairs, and I didn't give the algorithm enough information to deal with them well. The generator would create loops with 3 stairways, all going down as you traveled one way around the loop. If you're familiar with Escher's "Ascending and Descending", or the Penrose stair illusion on which it's based, that's the same idea. Not really something I wanted accidentally appearing in my dungeons.
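For reference, here's a toy sketch of the collapse-and-propagate loop I'm describing - not my actual generator; the tiles and adjacency rules are invented for illustration - with restart-on-contradiction standing in for backtracking:

```python
import random

# Invented adjacency rules: tile -> tiles allowed in any neighboring cell.
ALLOWED = {
    "wall": {"wall", "corridor"},
    "corridor": {"wall", "corridor", "room"},
    "room": {"corridor", "room"},
}

def neighbors(r, c, rows, cols):
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        if 0 <= r + dr < rows and 0 <= c + dc < cols:
            yield r + dr, c + dc

def generate(rows, cols, rng=random):
    """Collapse the lowest-entropy cell, then propagate the constraints.
    No backtracking: on contradiction, return None and let the caller restart."""
    grid = [[set(ALLOWED) for _ in range(cols)] for _ in range(rows)]
    while True:
        open_cells = [(r, c) for r in range(rows) for c in range(cols)
                      if len(grid[r][c]) > 1]
        if not open_cells:
            return [[next(iter(cell)) for cell in row] for row in grid]
        # Pick the cell with the fewest remaining possibilities.
        r, c = min(open_cells, key=lambda rc: len(grid[rc[0]][rc[1]]))
        grid[r][c] = {rng.choice(sorted(grid[r][c]))}
        # Propagate the narrowed possibilities outward.
        stack = [(r, c)]
        while stack:
            cr, cc = stack.pop()
            allowed_here = set().union(*(ALLOWED[t] for t in grid[cr][cc]))
            for nr, nc in neighbors(cr, cc, rows, cols):
                narrowed = grid[nr][nc] & allowed_here
                if not narrowed:
                    return None  # contradiction: caller restarts
                if narrowed != grid[nr][nc]:
                    grid[nr][nc] = narrowed
                    stack.append((nr, nc))

def generate_with_restarts(rows, cols, tries=50):
    for _ in range(tries):
        result = generate(rows, cols)
        if result is not None:
            return result
    return None
```

With rules this forgiving, contradictions are rare; with stairs and elevation in the mix, they weren't.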

I added in some elevation tags, and I still ran into problems. Now, what was happening was that the generator was making portions of the map at wildly different elevations, and couldn't figure out how to merge them together. And, without backtracking, there'd be no way to adjust the placed tiles, so I'd have to restart. And restart. And again.

So, I rewrote the generator from scratch, this time with backtracking. One reason I had hesitated to implement backtracking myself, and why most of the WFC implementations avoid it, is the amount of partial-solution state a backtracking solver has to keep around.

But I dove in, and got a backtracking placer working. It was slow, because of all the copying I was doing. And I was doing it recursively, so I was using up the program stack. And this began to be gross, so I unrolled my recursive loop to be an iterative loop. This let me control things differently, doing reckless things to my queue of partial solutions, including throwing away a big fraction of my work when I started to get pressed for memory, and doing a random shuffle of my queue to give it an opportunity to "jump out" of dead ends. I've had fairly good results in the past from periodically shuffling my iterative work queue to get away from local maxima. It sounds goofy, but you might try it and see if it helps you.
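The pattern, independent of my actual tile placer, looks something like this: an explicit queue of partial solutions, a periodic shuffle, and a reckless purge when the queue gets big. To keep the sketch self-contained, I'm demonstrating it on 6-queens rather than tile placement:

```python
import random

def solve_iteratively(expand, initial, is_goal,
                      shuffle_every=500, max_queue=10_000, rng=random):
    """Iterative backtracking via an explicit work queue of partial solutions.
    Periodically shuffles the queue to jump out of dead-end clusters, and
    throws away the oldest half when memory pressure builds (reckless, but
    it has worked for me)."""
    queue = [initial]
    steps = 0
    while queue:
        steps += 1
        if steps % shuffle_every == 0:
            rng.shuffle(queue)
        if len(queue) > max_queue:
            queue = queue[len(queue) // 2:]  # drop a big fraction of the work
        partial = queue.pop()
        if is_goal(partial):
            return partial
        queue.extend(expand(partial))
    return None

# Demonstrated on 6-queens as a stand-in for tile placement:
N = 6

def expand(rows):
    """Extend a partial solution (queen row per column) by one column."""
    col = len(rows)
    if col == N:
        return []
    return [rows + (r,) for r in range(N)
            if all(r != pr and abs(r - pr) != col - pc
                   for pc, pr in enumerate(rows))]

solution = solve_iteratively(expand, (), lambda rows: len(rows) == N)
```

The purge sacrifices completeness for memory, which is fine when you're happy with any valid dungeon rather than a specific one.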

So, I had an iterative backtracking solver, and it was slow. Part of the problem, I'm sure, was the copying I was doing. Part of the problem was that I was using Python. I'm not really willing to rewrite this whole solver in C++ for performance - and the galling thing was that my WFC generator was able to make a reasonably-sized map (~17x22) in under a minute, while my backtracking generator was struggling to finish a map maybe a quarter of that size (8x10?) in a few hours.

So, maybe I don't need stairs. The whole reason I stepped away from the WFC implementation was that I had these non-local constraints that WFC didn't deal with well.

One thing that might be faster is to implement Knuth's "dancing links" algorithm, which I never have been able to entirely wrap my head around. It seems to allow you to do backtracking search in-place, which would be handy.

I went to my tileset data and turned off all my stair tiles, and got my old WFC implementation working again. The tunnel system looks like a block of uncooked ramen noodles, but I'm willing to go with that as an insane, but legal, generated dungeon. It also reminds me of the studies where they fed spiders a variety of drugs and looked at the kinds of webs those spiders spun.

Once this was in place, I got a feature working where each tile could specify how frequently it would appear in the output. For example, in early versions, there were a LOT of pits and one-way doors. I turned those down, and the dungeons look somewhat reasonable (still with the ramen tunnels, but I'm still OK with that).

I've also added some convenience constraints, blocking out the lower left hand corner so that I can paste in a legend of the various symbols used on the map. Not all dungeon maps need that, but my hand-drawn dungeon bits might not always be obvious to the reader. Is a circle with a star inside it a pentagram or a statue? At this level, a statue. Later on, we'll talk some more.

I intend to write out the map as a CSV to make it easy to do some editorial touch ups, and then read the edited CSV back to feed to the renderer.
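The CSV round trip itself is the easy part - a sketch with Python's csv module (the tile names here are placeholders):

```python
import csv
import io

def write_map(grid, f):
    """Write the tile grid as CSV so it can be touched up in a spreadsheet."""
    csv.writer(f).writerows(grid)

def read_map(f):
    """Read the (possibly hand-edited) CSV back for the renderer."""
    return [row for row in csv.reader(f)]

# Round-trip a toy 2x3 map in memory:
grid = [["wall", "door", "wall"], ["corridor", "corridor", "pit"]]
buf = io.StringIO()
write_map(grid, buf)
buf.seek(0)
assert read_map(buf) == grid
```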

And then I intend to move on to some deeper caves and caverns, with yet another generator, this one less tied to grid lines.

Monday, July 22, 2019

On the sad state of inline images

In part, this post is a gratuitous effort to add juice to my "Shapeshifting RPG" project, which you can find at the Shapeshifting RPG website or at the Shapeshifting RPG GitHub repo. I really want to get that "done" in the next few weeks, which includes "writing" a few more chapters, a bunch of layout, and some iteration with DriveThruRPG to get it purchasable by the end of the summer.

If you're not familiar with the Shapeshifting RPG project, allow me to explain. I'm a big fan of computer-generated art, whether that's Harold Cohen's AARON project, or computer music, or even computer-written books.

OpenAI recently released "GPT-2", a language model trained on a big chunk of Internet text, which does a surprisingly solid job of generating prose that passes for human-written. It still seems a little weird, but it does a subjectively better job of passing the Turing Test than, say, Weizenbaum's ELIZA.

To make it easier to interact with GPT-2, Adam King hosted a webpage, https://talktotransformer.com/, which allows you to use pre-generated prompts or your own text and get a couple of paragraphs of generated text.

Also, about this same time, the "Prismatic Corpse" game jam project drifted across my consciousness - people getting together to make an old-school RPG without working "together".

The ingredients were mise en placed, the table was set, it seemed like the only thing I could possibly do was to create my own RPG ruleset.


And so, for the past couple months, I've been prompting the TalkToTransformer AI with various bits of RPG rules, and it's been giving me big boluses of vaguely readable text, which I proceed to edit into chapters.

Where the AI totally wins is in generating lists of things. Hat tip to Janelle Shane, whose https://aiweirdness.com/ blog can consistently make me laugh out loud, with AI generated lists of things that begin on the weird side of plausible, and quickly descend into crazy-wrong space. So, I asked GPT-2 to make lists of monsters, and weapons, and armor, and NPCs. It loves lists. And a RPG rulebook is often a lot of pages of tables. Which are like lists.

To make my book borderline "useful", I decided to index my list of monsters, each with a four-digit number whose digits run from one to six. This gives you an easy way to roll up a foe to fight against, if you want to: just roll 4d6 and look up that roll on the table. So, all I had to do was put a picture of four 6-sided dice in front of each element in my monster table.

I'm using Scribus to do the layout of my book, which is mostly doing what I need it to - it's WYSIWYG layout, without being a word processor, which is fine. It's even got the ability to run Python scripts, so you know I like that. I poked around a lot with trying to put an image inline into a bit of text, but it seems like Scribus prefers to have images exist outside of text flows, which is great if you want your paragraph text to flow around an image like this:
To be clear, I don't think that's how I want my text to flow, but it's super cool that Scribus makes it easy to do that. (And, somewhat amusingly, I had a hard time getting the image of the text flow to get placed where I want it in Blogger's layout. Layout is hard. Layout of images relative to text is doubly hard.)

What I ended up doing, which feels like failure, but it gets the job done, is to find a font that has the numbers 1-6 as the faces of dice, and then I add that to my list of monsters, like so:



It turns out, it wasn't completely that easy - what I ended up doing was writing one Python script to walk over a monsters.txt file that I had prepared, doing some light sanity checking and cleanup (but, clearly, not fixing case consistency; Ice-Eared Goblin should probably be all lower-case, like everything else). That script output a CSV with the indices in their own fields. Then I had another Python script, this one inside Scribus, which pulled in the CSV and wrote out the text one line at a time, switching fonts along the way: my dice font for the index, then back to the text font for the monster name. This is a surprisingly time-consuming script to run, taking several seconds to generate around 18 pages of monsters.
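The first script's core - pairing each monster with a 4d6 index, digits one through six - might look roughly like this (a simplified sketch; the real sanity checks are omitted, and the sample names are invented):

```python
import csv
import io
from itertools import product

def dice_indices():
    """Yield '1111' through '6666': every four-digit index with digits 1-6,
    in order, so each possible 4d6 roll maps to exactly one monster."""
    for digits in product("123456", repeat=4):
        yield "".join(digits)

def monsters_to_csv(lines, out):
    """Pair each non-empty monster line with its dice index and write CSV,
    with the index in its own field for the Scribus-side script."""
    writer = csv.writer(out)
    names = (line.strip() for line in lines)
    for index, name in zip(dice_indices(), (n for n in names if n)):
        writer.writerow([index, name])

# Toy input standing in for monsters.txt:
sample = ["ice-eared goblin\n", "\n", "rust weasel\n"]
buf = io.StringIO()
monsters_to_csv(sample, buf)
```

Note that 6^4 is 1296 possible rolls, which is conveniently in the neighborhood of a long AI-generated monster list.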

It works, it's sufficient for this project, but it feels like yet another hack along the same lines as hacks that I've done in game development to get, for example, the PlayStation "X" button to render as an image in a tutorial text display. This feels like it doesn't need to be this hard, especially with the Unicode Consortium spending time on frowning poop, and Apple bowing to user demands for shapely peach imagery.

Friday, July 19, 2019

Mandelbrot set on an emulated Apple ][



I figured I'd start off with a pretty picture, so that maybe you'll let me go on a discursive narrative.

When I was but a wee lad, I learned to program on my family's Apple ][. I started off with the little stuff that you do, like 

10 PRINT "HELLO, WORLD"
20 GOTO 10

and the like. I began drawing pictures on the "High Resolution Graphics" (HGR) screen, with a staggering 280x160 pixels (280x192, if you were fancy and used the second page and gave up the 4 lines of text at the bottom of the screen). But, eesh, I was playing fancy games like Space Invaders, Donkey Kong, and Pac Man in the arcades, and I wasn't getting anywhere near the performance that the professional developers were. And even games written for the Apple were a lot more capable than what I was able to get done in BASIC.

Somehow, somewhere, I learned that the professionals didn't do high-performance graphics in Applesoft BASIC. The big boys used Assembly Language. So, that's where I decided I needed to focus my energy.

As an aside, probably around this time, my dad did have Apple Pascal running on the home machine, but maybe Pascal also was sluggish in its own way. For whatever reason, I never spent much time with Pascal for the Apple ][. A few years later, the Apple IIgs came around, and that's when I learned Pascal, which led directly to C, then C++, which may account for the majority of code that I've written in my life, though there are plenty of others. C and C++ I learned on a 286 IBM PC and a 486 IBM PC Compatible, back when IBM was relevant in the PC space.

There were a few dead ends that I learned on the Apple ][, though, including GraFORTH and Apple Logo - both promising to give good graphics, in their own way. GraFORTH was a Forth implementation, which introduced me to stacks, which would help me with PostScript and RPN years later. And Logo was LISP without all the parentheses. So, even though the languages themselves didn't help me make cool animations, they planted seeds.

The idea fixed in my mind was to learn Assembly Language one way or another. On a family trip to the Pacific Science Center in Seattle, Washington, we passed through the gift store (back when it occupied the balcony of Building 4), and I found a book for sale titled "Assembly Language for the Applesoft Programmer". Well, that was exactly what I wanted. This was the key to writing computer games, and being successful, and avoiding actually having to "work" at a "job" - I'd write some sort of platformer for the Apple ][, and, I don't know. Profit. I didn't have all the steps worked out, but I knew I needed this book.

I begged and pleaded for Dad to buy the book for me, and he flipped through it, and decided it was a fairly substantial work, but it had a lot of programs to type in (back in the day, we typed in programs, even from magazines). So, Dad made me a deal; he'd buy me the book if I promised that I'd work through all the programming tasks, type in all the code, actually go through the whole thing. Yes, of course. 

I probably read the first chapter in the car ride home, which introduced a number of concepts. So far, so good. But then, chapter 2 instructed the reader to open up any one of a number of commercial assemblers for the Apple ][. My heart sank - I had just used up all my negotiating capital to get a $12 book, there's no way that Dad would go on to drop $200 on a commercial assembler, as well.

So, the book gathered dust on a shelf. Dad did work through a small number of the examples, using the Apple ]['s built-in monitor, the poor man's assembler toolchain.

I felt guilty about the promise I had made, and I kept the book as a reminder. As I mentioned, the Apple IIgs came along, and I learned Pascal, which introduced me to procedural programming, and I went off to college, and learned Scheme and LISP and C and C++ and CLU. Perhaps a few others, besides. 

Years later, like probably around 30 years after making that promise to work through the examples, I had experience with working with LinApple and a variety of other Apple ][ emulators, and I knew that there were disk images of various old pieces of Apple software available on the Internet, including some software that was actually legally available.

I dusted off the book, looked up the suggestions of commercial assemblers mentioned to build the samples, and discovered that Bob Sander-Cedarlof of S-C Software had made his S-C Assembler available for free, and so I was able to begin to make progress on my long-delayed promise.

I learned about the X, Y, and A registers, I learned about the Zero Page, I learned about the various flags that arithmetic operations would set. I learned a little bit about magic locations in memory. I wrote a sort routine that would organize literally dozens of pieces of data.

Yeah, this wasn't going to actually have practical benefits, but a promise is a promise, and I continued to plug through.

The Apple ][ used the MOS Technology 6502 CPU, which had support for addition, subtraction... branching... using 8-bit integers. As I recall, some of the addressing modes allowed you to use contiguous pairs of bytes to support 16-bit integer math. The hardware did not support multiplication, division, or floating point numbers. So, I wasn't going to have it easy if I wanted to write a 3d flight simulator or first person shooter. Not really what my goals were - I just needed to complete the book.
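For the curious, the byte-pair trick is just add-with-carry. Here's a little Python model of what a 6502 CLC/ADC sequence accomplishes, with each "register" held to 8 bits:

```python
def add16(lo1, hi1, lo2, hi2):
    """Model 6502-style 16-bit addition: add the low bytes first, then add
    the high bytes plus the carry, the way paired CLC/ADC instructions do."""
    lo_sum = lo1 + lo2
    lo, carry = lo_sum & 0xFF, lo_sum >> 8
    hi = (hi1 + hi2 + carry) & 0xFF
    return lo, hi

def to16(lo, hi):
    """Combine a low/high byte pair into one 16-bit value."""
    return lo | (hi << 8)

# 300 + 300 = 600, even though no single byte ever holds more than 8 bits:
lo, hi = add16(300 & 0xFF, 300 >> 8, 300 & 0xFF, 300 >> 8)
```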

Well, except that I decided that I should build something of my own, based on what I had learned. And one of the programming go-tos (sigh) that I had gone back to over and over again was to write a Mandelbrot Set fractal renderer. I had done this in Pascal on the IIgs, C on the 286, and over and over again, as my own little proof to myself that I understood how various programming environments worked. In the years since, I've heard this referred to as a "coding kata".

So, pushing the (emulated) Apple ][ to render a Mandelbrot set, which requires complex algebra and looks best with lots of pixels, that was a challenge I felt was worthy of this journey.
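The escape-time loop at the heart of it, with the complex multiply spelled out in real arithmetic the way the 6502 forces on you, looks like this in Python (the window bounds and iteration cap here are my guesses for illustration, not the values I used back then):

```python
def mandelbrot_iterations(cr, ci, max_iter=32):
    """Escape-time count for one point: iterate z = z*z + c using only
    real multiplies and adds, as the assembly and BASIC versions had to."""
    zr = zi = 0.0
    for n in range(max_iter):
        # (zr + zi*i)^2 + (cr + ci*i), expanded into real parts
        zr, zi = zr * zr - zi * zi + cr, 2.0 * zr * zi + ci
        if zr * zr + zi * zi > 4.0:
            return n  # escaped: definitely outside the set
    return max_iter      # never escaped: treat as inside

def render(width=28, height=16, max_iter=32):
    """A tiny text render over the classic viewing window."""
    rows = []
    for y in range(height):
        ci = -1.2 + 2.4 * y / (height - 1)
        row = ""
        for x in range(width):
            cr = -2.2 + 3.0 * x / (width - 1)
            row += "*" if mandelbrot_iterations(cr, ci, max_iter) == max_iter else " "
        rows.append(row)
    return rows
```

On the Apple ][, each of those floating-point multiplies becomes a fixed-point multiply routine, which is where all the time goes.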

I built multiplication of complex numbers, I wrote pixels to the screen, I got stuff sort of looking sort of right, but not quite, and debugging was a huge pain. So, I ported my assembly program back into Applesoft BASIC, and I was happy and sad to see that the BASIC program drew pictures that looked correct to me, while my assembly code was giving me a similar-looking blob on the screen, but not the recognizable bumblebee shape. Eventually, I managed to track down a flag that was being set that I didn't expect to be set, which was causing a conditional branch to happen when I didn't expect it. I rearranged the logic, and I had my assembly program drawing images to the screen, as I had imagined to begin with:

   

What you can see in that image is that I was drawing the image progressively - here, you can see every other row drawn, but initially, I drew widely-spaced pixels:


which is enough to tease the recognizable shape. Once that pass is done, I drew a lot more pixels horizontally:


Those look like solid horizontal lines, but I'll get back to that. Then, I filled in vertically:


And then I had one more pass filling in pixels horizontally:


wait, what? If you're unfamiliar with the Apple ]['s graphics screen, you'll wonder where all that white and orange came from. If you've spent time in the pixel trenches, you probably know that orange is accomplished by lighting up the odd columns of the second palette, blue is even columns of the second palette, green is odd columns of the first palette, and purple the even columns of the first palette. Because my progressive renderer was drawing only even columns for the first several passes, there was a lot of purple and a little bit of blue - no orange, no green, until later. Also, white was a result of lighting adjacent odd and even pixels. It's almost like it wasn't a 280-column display, but more like a 140-column display.
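That color rule can be written down as a tiny lookup (a simplification - real NTSC artifact color is messier than this, but it captures the even/odd behavior described above):

```python
def hires_color(palette_bit, column, lit, neighbor_lit):
    """Approximate color of one hi-res pixel: a lone lit pixel's hue depends
    on its column's parity and the byte's palette bit; adjacent lit pixels
    merge to white."""
    if not lit:
        return "black"
    if neighbor_lit:
        return "white"
    odd = column % 2 == 1
    if palette_bit:  # second palette
        return "orange" if odd else "blue"
    return "green" if odd else "purple"  # first palette
```

Which is why a renderer that only touches even columns produces a purple-and-blue image, no matter what it thinks it's drawing.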

And so, tada:
 

Working to the level that I was shooting for. There's some color fringing, like the blue on the far left, the green on the edge of the orange field. But you know, that's fine. Also, I want to point out that the white circle actually looks like a circle, even though the Apple ][ did not have square pixels. The aspect ratio math wasn't hard. But kids these days, they don't even think about that sort of thing.

Earlier, I mentioned that I had an Applesoft implementation and an Assembly implementation - to my recollection, the Assembly version took me about 10x as long to write, and ran about 3x as fast. That doesn't seem like an especially fair trade. I consider, though, that I was pushing the mathematical capabilities (multiplication of fixed-point numbers on a 6502), which was going to be slow, no matter what. Performance was an important part of what set me on the road to learning Assembly, but by the time I had finished the book, even a fast program on a 64K 1MHz machine wasn't going to impress a lot of people. So, I considered myself done.

Giddy with my accomplishment, I printed out the various screenshots that you've seen up to this point. I took them to show off to Dad the next time I visited. He wasn't super impressed - I don't think he even remembered that day, three decades earlier, when I had committed myself to learning assembly. But he nodded and there was some small level of recognition that I had done something.


A related story, which I won't tell now, has to do with Dad and me collaborating on a Pac Man clone for the Apple ][. Things went comically badly.