Wednesday, September 11, 2019

On making a book, when one isn't really the author - the process behind "Shapeshifting"

Over the summer of 2019, I put together a book using OpenAI's GPT-2 language model, as run on http://talktotransformer.com . It's currently available in physical and electronic formats; literally dozens of copies have been printed, and over a hundred copies have been downloaded. These are not terrific numbers (yet), but just getting the book onto Amazon is an accomplishment.

This post talks about much of the process in making this book, including many of the pain points that I would try to avoid if I did something similar again.

The Influences

I've been trying to go back and identify which pieces of other people's AI projects banged together in my head to spark this process. Janelle Shane's "AI Weirdness" blog was certainly one piece; she posts an experiment each week, ranging from naming ice cream flavors to Dungeons and Dragons spells. So that feels like it probably set my mind in motion.

I also read about "Prismatic Corpse", a rewrite of D&D crowdsourced from group memory. I signed up to be a part of that game jam, but didn't get around to submitting anything - probably because I decided I was too deep in working on the book by the submission deadline. Also, I was concerned that the participants might be offended at my use of AI in their presumably human project.

Around the same time, I read "Maze Rats", a print-and-play RPG that's super light, with a lot of mileage from a couple of 2d6 table lookups.

Also, years ago, I read some of John Hodgman's books, including one with hundreds of hobo names. Cumulatively, this stuff's a riot.

I've also had a long-term project simmering on the back burner, to make a computer role-playing game, and that's a somewhat terrifying project, involving many moving parts, and lots of different dimensions of creativity.

The Book

This feels out of order, but it helps to start with what I ended up delivering, which was roughly where I was aiming. With this in mind, a lot of the rest of the process makes a little more sense (and I'll take whatever sense anybody can find).

The final book is over 100 pages, with a cover that evokes mid-1970s role playing games, and internal content looking like early 1980s rulebooks. I cut my RPG teeth on J. Eric Holmes' Basic Dungeons and Dragons (the edition with a monochrome blue cover), which was a step up from the "little brown books", themselves a rewrite of Gygax and Arneson's home rules.

In my experience, each of these is a pretty spindly skeleton, evoking what role playing could be, rather than prescribing how you had to play the game. The players (including the dungeon master) had to contribute a great deal to make the experience work.

I find myself describing Shapeshifting as somewhere between a parody and an homage, something not really a rules system, but more than just an accessory or player's aid. It's probably correctly placed in some gray area among all of these points.

There are descriptions of attributes, including strength, dexterity, and luck - but no real rules on what those attributes do in gameplay. There's no discussion of whether you roll 3d6 six times in order, or if you do some sort of point-buy system. Somehow, attributes are important, though.

Also, there's discussion of different races, from Humans to Elves to Dwarven to Half-Elk to Half-Halflings. No stat modifiers, just flavor text. Usually, when I encounter flavor text in a game, it's the stuff that I ignore, but with Shapeshifting, it's 100% of the book.

Well, not quite 100%. There are a few tables (hat tip to Maze Rats) for weapons, armor, monsters, and magic items. If you get nothing else out of the book, maybe a big table of over 1000 monsters is useful to you. Granted, a lot of the monsters are weird, so be prepared for the tone not to match your existing campaign. (If my monsters fit into your campaign, write me and let me know. I'm very interested.)

In order to make my tables line up on clean page boundaries, I got a bunch of stock art (some free, some paid for) and used it to shim the otherwise ragged ends of my pages.

The Technology 

Most of the technology required for generating the text came in the form of OpenAI's GPT-2 text generation engine. This is a neural net that was trained on a corpus of text from the Internet, gathered by following links from any post on Reddit with three or more upvotes.

The text generation software is what's called a "Transformer" (not the Cybertronian robots), and I don't entirely understand how it works. But you don't have to! Adam King posted an implementation of the algorithm at talktotransformer.com that allows a user to prompt the AI with a sentence, a list, or even just a sentence fragment, and the AI will continue the text from where the prompt leaves off.

So, I entered an awful lot of half sentences that I figured would belong in an RPG book, like "Elves are a race that..." and "When a player has initiative...", and the AI would give me a couple paragraphs at a time of text more or less on-topic. Mostly not on-topic, so I'd try again. The stuff that seemed usable would go into a big text file, and I'd keep going. In time, I had enough for a few chapters, and I'd focus on another part of what I needed for my book.

For the big lists (again, thanks to Janelle Shane), it turns out that GPT-2 loves making lists. So, I'd give it a list like:
1) sword
2) dagger
3) mace
and the AI would proceed to give me a long list of weapons, complete with numbers. The ordered list tag was one of the first things I learned when learning HTML, and it's strongly represented in GPT-2's training data, it would seem.
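As a rough illustration of the text handling around that trick, here is a minimal Python sketch. It assumes the prompting itself still happens by hand in the browser; the code only formats the seed list and pulls numbered items back out of whatever text comes back. The function names are made up for this example and are not part of any tool mentioned above.

```python
import re

def make_list_prompt(seed_items):
    """Format seed items as '1) sword' lines, ready to paste in as a prompt."""
    lines = [f"{i}) {item}" for i, item in enumerate(seed_items, start=1)]
    return "\n".join(lines) + "\n"

# Matches lines such as "12) flaming halberd" and captures the item text.
ITEM_RE = re.compile(r"^\s*\d+\)\s*(.+?)\s*$")

def parse_list_items(generated_text):
    """Keep only the lines that look like numbered list entries."""
    items = []
    for line in generated_text.splitlines():
        match = ITEM_RE.match(line)
        if match:
            items.append(match.group(1))
    return items

if __name__ == "__main__":
    print(make_list_prompt(["sword", "dagger", "mace"]))
    print(parse_list_items("4) axe\n5) spear\nsome stray sentence\n6) halberd"))
```

Lines that don't match the numbered pattern are simply dropped, which is a crude stand-in for the manual curating described next.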

I would categorize stuff that worked - sometimes I'd get armor in with my weapons, or magic items in with my monsters, but collating and curating was easy, if a little numbing.

I proceeded to organize my lists to try to help the de-duplication process, as well as giving players a little opportunity to fudge the rolls, or roll on a convenient neighborhood (rolling 2d6 around the shark section of monsters, maybe).
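A small Python sketch of that idea follows. The details are assumptions, not a record of what was actually done: duplicates are matched case-insensitively, the list is sorted alphabetically so similar names land next to each other, and "rolling in a neighborhood" is modeled as counting 2d6 entries forward from a chosen starting entry. The function names are hypothetical.

```python
import random

def dedupe_and_sort(entries):
    """Drop case-insensitive duplicates, then sort so near-duplicates sit together."""
    seen = {}
    for entry in entries:
        cleaned = " ".join(entry.split())  # collapse stray whitespace
        key = cleaned.lower()
        if key and key not in seen:
            seen[key] = cleaned  # keep the first spelling encountered
    return [seen[key] for key in sorted(seen)]

def roll_near(table, anchor, rng=random):
    """Roll 2d6 and count forward from the anchor entry, clamped to the table end."""
    start = table.index(anchor)
    roll = rng.randint(1, 6) + rng.randint(1, 6)  # 2..12
    return table[min(start + roll - 2, len(table) - 1)]

if __name__ == "__main__":
    monsters = dedupe_and_sort(["Shark", "shark ", "Goblin", "Land Shark", "Sharktopus"])
    print(monsters)
    print(roll_near(monsters, "Goblin"))
```

Sorting is what makes the neighborhood roll meaningful: once the sharks are adjacent, a short roll from the first shark stays among sharks.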

The Formatting

Having most of the text and most of the lists, I opened up Scribus, an open-source desktop publisher, and dove into layout. I briefly considered using Python tools to generate a PDF by hand, but I decided that I wanted more variability and uneven layout, which Scribus afforded me. Here, I was aided by some documents by Sine Nomine Publishing that talked about how TSR's layout changed over time. Turns out, there was a lot of variability, so I felt free to take inspiration, rather than follow a strict set of layout rules.

Scribus makes it easy to create reusable page layouts (heading here, page number there, column A, column B), which went a long way to making stuff flow together with a minimal level of cohesion and professional layout.

One of the tricky bits that I wanted to get right was to have inline dice images on all my dice tables. I was prepared to use my copious drawing skills to generate dice images, but I couldn't figure out a way to get Scribus to flow an inline image as part of paragraph text. (If you've ever written tutorial text for a console game, and are required to embed an icon of the Xbox "X" button, you'll know what I'm talking about.)

I never figured out embedded images, but what I did find was a font with dice. So all I needed to do was dump my table content into a text file and run a Python script on it to turn it into a CSV, with the die roll "index" in the first column and the rest in the next column. Except I didn't end up using commas to separate the values, because I had some phrases with commas inside them. I think I ended up using tab delimiters.
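The text-to-TSV step might look something like the sketch below. The original script isn't reproduced in this post, so the details are assumptions: entries arrive one per line, every roll uses six-sided dice, and the index is just the dice faces written as digits (11, 12, ... 16, 21, and so on). The function names are made up for illustration.

```python
import csv
import io
import itertools

def dice_index(n_entries, sides=6):
    """Yield one tuple of die faces per entry, using as few dice as will cover the table."""
    n_dice = 1
    while sides ** n_dice < n_entries:
        n_dice += 1
    faces = range(1, sides + 1)
    return itertools.islice(itertools.product(faces, repeat=n_dice), n_entries)

def table_to_tsv(entries):
    """Return tab-separated text: die roll in column one, table entry in column two."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for roll, entry in zip(dice_index(len(entries)), entries):
        writer.writerow(["".join(str(face) for face in roll), entry])
    return out.getvalue()

if __name__ == "__main__":
    weapons = ["sword, rusty", "dagger", "mace", "axe", "bow", "whip", "net"]
    print(table_to_tsv(weapons))
```

With tabs as the delimiter, an entry like "sword, rusty" passes through untouched, which is the reason given above for abandoning commas.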

I then wrote a Python plugin script to import my CSV (TSV?) in, line-by-line, switching to the dice font as needed, and then back to the body text font. This took a few seconds to run for the longer tables, but was basically painless.
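The Scribus side of that plugin only runs inside Scribus, so the sketch below covers just the part that can stand alone: reading the tab-delimited file and deciding which font each chunk of text should get. Everything here is illustrative. The font names are placeholders, the function name is hypothetical, and the step that actually inserts each run into a text frame through the Scribus scripting interface is left out rather than guessed at.

```python
import csv
import io

# Placeholder names; substitute whatever fonts the layout really uses.
DICE_FONT = "Hypothetical Dice Font"
BODY_FONT = "Hypothetical Body Font"

def plan_font_runs(tsv_text):
    """Turn TSV rows into (font, text) runs: die roll in the dice font, the rest in body text."""
    runs = []
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        roll, entry = row[0], row[1]
        runs.append((DICE_FONT, roll))
        runs.append((BODY_FONT, "\t" + entry + "\n"))
    return runs

if __name__ == "__main__":
    for font, text in plan_font_runs("11\tsword, rusty\n12\tdagger\n"):
        print(repr(font), repr(text))
```

A plugin would then walk these runs in order, appending each piece of text to the frame and applying the named font to just the characters it added.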

The Last Bits

I knew I also wanted to have a sample adventure, which included a right-angled corridors-and-rooms level, and a caves-and-caverns level.

For the corridors-and-rooms level, I wrote a Python script inspired by a lot of "Wave Function Collapse" samples that I've seen around, which honestly, I find a little annoying, because I was doing this sort of stuff decades ago, calling it "constraint propagation". The fancy bit that WFC has going for it is that, if you do it right, you can give it something that looks like your desired output, and it magically extrapolates that single input into infinite outputs. Sort of. It works better if your input is easy to parse into tiles, and then it's able to identify duplicated tiles, which lets it infer the relative frequencies of which tiles can go next to which tiles.

I tried a few existing implementations, and didn't get anywhere, so I wrote my own constraint propagation implementation from scratch, using a set of hand-drawn tiles in a little grid journal I happened to have with me when I was waiting for my car to get serviced.
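For readers who haven't seen the technique, here is a minimal Python sketch of tile-based constraint propagation. It is not the generator used for the book: the tile set is a made-up stand-in for the hand-drawn tiles (each tile only records which of its four sides are open), and a contradiction is handled by starting over rather than by anything cleverer.

```python
import random

# Hypothetical tiles: name -> (north, east, south, west), 1 = open side.
TILES = {
    "rock": (0, 0, 0, 0),
    "NS": (1, 0, 1, 0), "EW": (0, 1, 0, 1), "cross": (1, 1, 1, 1),
    "ES": (0, 1, 1, 0), "SW": (0, 0, 1, 1), "NE": (1, 1, 0, 0), "NW": (1, 0, 0, 1),
}
# (dy, dx, my side index, the neighbor's side index that faces me)
DIRS = [(-1, 0, 0, 2), (0, 1, 1, 3), (1, 0, 2, 0), (0, -1, 3, 1)]

def propagate(grid, h, w, queue):
    """Prune tiles whose edges cannot match a neighbor; return False on a contradiction."""
    while queue:
        y, x = queue.pop()
        for dy, dx, side, opp in DIRS:
            ny, nx = y + dy, x + dx
            if not (0 <= ny < h and 0 <= nx < w):
                continue
            allowed = {TILES[t][side] for t in grid[y][x]}
            keep = {t for t in grid[ny][nx] if TILES[t][opp] in allowed}
            if not keep:
                return False  # dead end: no tile fits here any more
            if keep != grid[ny][nx]:
                grid[ny][nx] = keep
                queue.append((ny, nx))
    return True

def fits_border(sides, y, x, h, w):
    """Tiles on the map edge may not open off the map."""
    n, e, s, west = sides
    return not ((y == 0 and n) or (x == w - 1 and e) or (y == h - 1 and s) or (x == 0 and west))

def generate(h, w, rng, max_tries=100):
    """Return an h-by-w grid of tile names, or None if every attempt hit a dead end."""
    for _ in range(max_tries):
        grid = [[{t for t, s in TILES.items() if fits_border(s, y, x, h, w)}
                 for x in range(w)] for y in range(h)]
        ok = propagate(grid, h, w, [(y, x) for y in range(h) for x in range(w)])
        while ok:
            undecided = [(len(grid[y][x]), y, x)
                         for y in range(h) for x in range(w) if len(grid[y][x]) > 1]
            if not undecided:
                return [[next(iter(cell)) for cell in row] for row in grid]
            _, y, x = min(undecided)  # most constrained cell first
            grid[y][x] = {rng.choice(sorted(grid[y][x]))}
            ok = propagate(grid, h, w, [(y, x)])
    return None

if __name__ == "__main__":
    for row in generate(4, 6, random.Random(7)):
        print(" ".join(f"{name:5}" for name in row))
```

Each cell starts as the set of every tile that could go there; picking a tile for one cell shrinks the sets of its neighbors, and that shrinking ripples outward until nothing changes. An empty set is the dead end described below.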

Boom, instant dungeon crawl. Except that I had Escher-like stairway loops that bothered me. I proceeded to add more information to tiles, including that this stairway tile had an entrance that might come in on level <x>, and then the other side of the tile might exit on level <x+1>. Which was more or less fine, but the constraints were a lot tighter than most WFC samples, so I'd get trapped in inconsistencies, and the system would recognize a dead end and give up.

I tried adding in backtracking, but that wasn't going well, either. I used up lots of memory and lots of time, and still wasn't getting good results.

So, I just threw away all the stairway tiles. I sort of miss them, but not enough to go back to rewrite my dungeon generator right now.

After getting the dungeon generator working, I wrote a super simple generator that made ragged shapes that I assert are caverns, and then the generator connected them with jagged shapes that I tell you are cave passageways. After the long fight to get the corridors-and-rooms level working nicely, this went super quick. Or, maybe, my standards had been lowered. But it looked like a lot of what I've seen as one page dungeons. So, in it went.
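One way to get that kind of result, sketched in Python: carve each cavern with a short random walk, then join cavern centers with a path that wanders toward its target one step at a time. This is an assumed approach for illustration, not the generator behind the book's map, and all of the names and parameters are invented here.

```python
import random

STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def carve_cavern(floor, start, steps, w, h, rng):
    """Random walk from start; every visited cell becomes floor, giving a ragged blob."""
    x, y = start
    for _ in range(steps):
        floor.add((x, y))
        dx, dy = rng.choice(STEPS)
        # Clamp so the walk never reaches the outer wall.
        x = min(max(x + dx, 1), w - 2)
        y = min(max(y + dy, 1), h - 2)

def carve_passage(floor, a, b, rng):
    """Walk from a to b, choosing the axis at random each step so the path is jagged."""
    x, y = a
    while (x, y) != b:
        floor.add((x, y))
        if x != b[0] and (y == b[1] or rng.random() < 0.5):
            x += 1 if b[0] > x else -1
        else:
            y += 1 if b[1] > y else -1
    floor.add(b)

def generate_caves(w, h, n_caverns, rng, steps=40):
    """Return (set of floor cells, list of cavern centers) for a w-by-h map."""
    floor = set()
    centers = [(rng.randint(1, w - 2), rng.randint(1, h - 2)) for _ in range(n_caverns)]
    for center in centers:
        carve_cavern(floor, center, steps, w, h, rng)
    for a, b in zip(centers, centers[1:]):
        carve_passage(floor, a, b, rng)
    return floor, centers

def render(floor, w, h):
    """Draw floor as '.' and rock as '#'."""
    return "\n".join("".join("." if (x, y) in floor else "#" for x in range(w))
                     for y in range(h))

if __name__ == "__main__":
    cave_floor, _ = generate_caves(40, 16, 5, random.Random(2019))
    print(render(cave_floor, 40, 16))
```

Because every cavern grows outward from its center and every passage runs from one center to the next, the whole level ends up connected without any extra checking.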

The Publishing

At this point, I had a PDF (really a couple different PDFs, one that had the whole book, including front and back covers, and fancy maps on the inside covers, one PDF with just the outside cover art, one PDF of just the inside) and I was ready to start uploading to places to get them into the hands of people.

I uploaded to DriveThruRPG, an obvious place for a person with an RPG PDF that they want to get into the hands of people who might want to read it. This was right around the time of GenCon, and they said "hey, we'll review your submission, but c'mon, it's GenCon", which was totally fair. I marked it as "pay what you want", with a suggested $5 price tag.

And then I turned my attention to Amazon. I pushed my PDF(s) up to them, going through their multi-page submission web wizard process. Seems like I got pulled backwards and re-entered information more than once. It's Amazon, they aren't in the job of making good web UX, right?

And, I got to the end of the flow, and got a button to request my submission be reviewed. So, bam, sent it off.

And while I waited for those things, I thought about maybe also doing a Kindle eBook. Turns out, you can upload a PDF which they'll turn into a Kindle eBook, but they prefer you upload a proprietary Kindle formatted version, to take advantage of the Kindle platform's features. Which makes sense, just feels a little up-sell-y.

Oh, and to author the Kindle formatted version, you have to use their desktop software. Maybe there could be a web version of it? But it's Amazon, they aren't in the job of making good web development tools. So, I downloaded the Windows version of the software. Onto my Linux machine. I didn't really expect it to work. Also, it did not work. So, I dusted off an underpowered, overused Mac laptop, and downloaded their Kindle authoring tool for that. And cut and pasted the text in, bit by bit. The pictures mostly didn't survive, but most of the pictures were there to make the page layout fit well, so in a land of auto-flowing text, I let go my responsibility for worrying about page breaks.

I did keep my two dungeon maps - they were more important to the content (inasmuch as anything really is important). I did a pass to add in keyword linking, because it's a Kindle feature.

And I uploaded that version. And, perhaps not entirely surprisingly, it was the Kindle eBook that got approved and available for downloading first. A Friday in early August, if my memory serves me, which I don't actually trust. And not too long after, the DriveThruRPG download went live. The paperback version went live, long enough for me to order a copy, and then it went to "ON HOLD", with a little message "to find out why your book is on hold, contact us", with a link going to a big page of FAQs about the publishing process, but not any super obvious discussion of what would make a book be put on hold.

I figured some flag got tripped somewhere in the process, perhaps having to do with DPI settings or margins. I found some means of contacting the customer service team, and asked them "hey, the UI told me to contact you, what's up?", and the customer service team said "oh, hm, we'll have to check with the tech team. We'll get back to you in like three days?". So, sure.

I sat and waited, and got a response from the customer service team, relaying a message from the tech team saying "we can't print the book until he replaces the Quentin Caps font". Which, all right, I get that fonts can be tricky. No discussion about what was tricky about this particular font. Maybe it was a bad format, maybe my use of it wasn't clearly within the rights of my license, so I found a similar font, paid a somewhat reasonable amount for that font, and re-submitted. And waited.

And after several days, I reached out again, asking why my book was still on hold. And again, customer service had to contact the tech team to find out why the book was stuck. And again, it was a font issue. This time, my new font. Still no indication about what the issue is, or how to fix it. No guidance on what to do other than to replace the font.

So, if you've read this far, which presumably you have, or maybe you're skimming, because it's been a lot of words up to this point, maybe this is the one takeaway that is of use. When you're working with a print on demand service, and you give them a PDF, you can embed fonts in the PDF, or you can convert fonts into outlines. Or, and this is the grossest option, you can take the fonts into let's say Photoshop, make an image (maybe a PNG) of the text, save that out, pull it into your desktop publishing app, and paste the image in place of the text using the problematic font.

This is bad because it increases your file size. It's bad because the layout is all kinds of sloppy. It's bad because it's caving in to a workflow that flags problems without offering solutions.

But it worked. My book is (currently) in print. You can buy paperback or two different formats of electronic versions. Or all three. You can send copies to your favorite game master for the holidays.

I'm tempted to make a Kickstarter option for the boxed set. I haven't fully imagined what it would entail. A box, certainly. More art? Better art? A bigger adventure? Better rules? Little cardboard standee figurines? Dice? Seems like at a minimum, a project this random needs some dice.

People ask me if I've played the game, and I laugh at them. This isn't a game for playing, this is a book to laugh at. Or, if you do play it, let me know. I'd be delighted to hear.