Robert,
How've you been, man?
Me? I've been writing and building internal tools to capture and contain the rush of ideas I have daily.
This note documents just one of the many recent breakthroughs I've had in evolving fiction as an entertainment class.
Hope you find it interesting.
How's Delphine going?
-Keith
IRL the leak started under the kitchen sink, but the digital leak had been going on for two years.
Right before going to bed last night, I noticed water pooled under the kitchen sink. 11 PM is the worst time for a plumbing disaster, but I had to look into it. After some investigation, I discovered the bottom seal on the garbage disposal had failed. This was annoying, but minor compared to many other home emergencies this year.
But earlier in the day, I'd caught something else that had been failing for two years. The data on my website was locked behind a competent yet difficult-to-navigate UI, and I couldn't access it to inform my writing.
Since 2024, I've run my website on Ghost. That year I migrated from WordPress to get more modern tools for running my own (mostly digital) business. Posts, drafts, pages, short stories, process notes, abandoned essays, writing fragments, and newsletters make up a real archive.
The problem is the Ghost UI gets clunky past about six months. I can't search my own backlog. I'd been writing into a system I couldn't easily query. Two years of voice, themes, recurring images, half-finished arguments were all in there somewhere, but functionally inaccessible to me unless I knew exactly what I was looking for.
Yesterday I fixed that. Or more accurately: a few short Python scripts did the job.
How I built the Keith Hayden Corpus
Creating this collection of data wasn't as difficult as you'd think. It really only took three passes.
The first pass pulled an index of the site's content into a spreadsheet: post title, status, URL, and created date. Within 30 minutes, I could see what I had on the site at a glance. The N64 fog began to clear, revealing the map of my thinking and writing.
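A minimal sketch of what a first pass like this could look like, not the actual script. It assumes the posts are already in hand as a list of dictionaries (from a Ghost export or API response) and that the keys are named `title`, `status`, `url`, and `created_at`. The sample records and the `build_index_csv` helper are hypothetical.

```python
import csv
import io

# Hypothetical sample records; real ones would come from a Ghost export or API.
posts = [
    {"title": "First Post", "status": "published",
     "url": "https://example.com/first-post/",
     "created_at": "2024-03-01T09:00:00.000Z", "html": "<p>Hello</p>"},
    {"title": "Abandoned Essay", "status": "draft",
     "url": "https://example.com/p/abc/",
     "created_at": "2024-01-15T18:30:00.000Z", "html": "<p>Unfinished</p>"},
]

# Assumed column names for the index spreadsheet.
INDEX_FIELDS = ["title", "status", "url", "created_at"]


def build_index_csv(records):
    """Return CSV text with one row per record, oldest first."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=INDEX_FIELDS, lineterminator="\n")
    writer.writeheader()
    # ISO 8601 timestamps in the same format sort correctly as plain strings.
    for record in sorted(records, key=lambda r: r.get("created_at", "")):
        # Keep only the index columns; a missing key becomes an empty cell.
        writer.writerow({field: record.get(field, "") for field in INDEX_FIELDS})
    return buffer.getvalue()


index_csv = build_index_csv(posts)
print(index_csv)
```

Saving `index_csv` to a `.csv` file gives something any spreadsheet app can open, which is all the at-a-glance view needs.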
On the second pass I expanded the scope to pull everything: posts, pages, drafts, tags, members.
On the third pass (and this is where it stopped being administrative and I started to see the potential of what I'd gathered) I stripped the HTML, extracted the pure text, and wrote the whole thing into a single `.jsonl` file. In an instant, every post and page in my entire Ghost database became machine-readable, structured, and portable. If I wanted to, I could drop it on a thumb drive and recruit any computer to help understand how I operate.
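The third pass can be sketched with nothing but the Python standard library. This is an illustration under assumptions, not the script that was actually run: it assumes each record carries its rendered body in an `html` key, and the `TextExtractor`, `html_to_text`, and `to_jsonl` names are made up for the example.

```python
import json
from html.parser import HTMLParser

# Tags that should leave a space behind so adjacent paragraphs don't fuse.
BLOCK_TAGS = {"p", "div", "br", "li", "blockquote", "h1", "h2", "h3", "h4", "h5", "h6"}
SKIP_TAGS = {"script", "style"}


class TextExtractor(HTMLParser):
    """Collect text nodes, ignoring anything inside script or style tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif tag in BLOCK_TAGS:
            self.parts.append(" ")

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth:
            self._skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self.parts.append(" ")

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def html_to_text(html):
    """Strip tags and collapse runs of whitespace into single spaces."""
    parser = TextExtractor()
    parser.feed(html or "")
    parser.close()
    return " ".join("".join(parser.parts).split())


def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    lines = []
    for record in records:
        entry = {
            "title": record.get("title", ""),
            "status": record.get("status", ""),
            "url": record.get("url", ""),
            "created_at": record.get("created_at", ""),
            "text": html_to_text(record.get("html")),
        }
        lines.append(json.dumps(entry, ensure_ascii=False))
    return "\n".join(lines) + "\n"


# Hypothetical sample record standing in for a real post.
sample = [{
    "title": "Fog", "status": "published", "url": "https://example.com/fog/",
    "created_at": "2024-05-02T08:00:00.000Z",
    "html": "<h2>Fog</h2><p>The map &amp; the <em>territory</em>.</p><script>track()</script>",
}]
corpus_jsonl = to_jsonl(sample)
print(corpus_jsonl)
```

Writing `corpus_jsonl` to disk with UTF-8 encoding produces the single portable file. One object per line means the corpus can be read back a line at a time without loading a giant JSON array.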
The Result
This is the part I'm still trying to absorb.
What I have now is not only a backup of my writing. It's a structured artifact of my own voice that any LLM can read and use as context. I can hand it to Claude and say, "here's how I write, what I think about, and how I organize my thoughts." I no longer have to reference my style or taste; it's all there in my personal corpus.
Here at the beginning of the AI Age, this is huge. It's the equivalent of putting a smartphone in your pocket for the first time in 2012.
I can feed this into a Claude Project and run my morning brief against my own historical voice instead of against generic training data. I can build what Claude called a Semantic Collision Engine to randomly pull three unrelated entries from the corpus and ask the AI to synthesize a new concept, character, or Lexicon entry from the unexpected combination. I can resurrect unfinished drafts, take inventory of past thematic fixations, or even have the archive talk to projects I'm currently working on, like my novel.
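The Semantic Collision Engine idea reduces to a small amount of code: sample three entries at random and wrap them in a prompt. The sketch below is one possible shape for it, assuming each corpus line has `title` and `text` keys. The function names, the prompt wording, and the sample entries are all invented for illustration, and it stops at building the prompt rather than calling any model.

```python
import json
import random

# Hypothetical stand-in for the contents of the .jsonl corpus file.
corpus_text = "\n".join(json.dumps(entry) for entry in [
    {"title": "Kitchen Sink Leak", "text": "Water pooled where nobody looks."},
    {"title": "N64 Fog", "text": "Old games hid the horizon to save memory."},
    {"title": "Military Trunk", "text": "A locked box of papers from another life."},
    {"title": "Morning Brief", "text": "A daily note written before the coffee."},
    {"title": "Archive Coordinate", "text": "A weekly dig through forgotten drafts."},
])


def load_corpus(jsonl_text):
    """Parse JSON Lines text into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]


def collision_prompt(entries, k=3, seed=None, excerpt_chars=500):
    """Pick k distinct entries at random and build a synthesis prompt."""
    rng = random.Random(seed)  # pass a seed to make a draw repeatable
    picks = rng.sample(entries, k)  # raises ValueError if fewer than k entries
    blocks = [
        f"[{number}] {entry['title']}\n{entry['text'][:excerpt_chars]}"
        for number, entry in enumerate(picks, start=1)
    ]
    return (
        "Here are three unrelated entries from my own archive.\n\n"
        + "\n\n".join(blocks)
        + "\n\nSynthesize one new concept, character, or Lexicon entry "
        "from the unexpected combination."
    )


entries = load_corpus(corpus_text)
prompt = collision_prompt(entries, seed=7)
print(prompt)
```

Leaving `seed` as `None` gives a fresh collision on every run. Fixing it makes an interesting draw reproducible so it can be revisited later.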
And then a deeper realization dawned: this is only the Ghost site, which is only two years of output. What about other collections of previously untapped data? How would my Google, Facebook, and ChatGPT histories strengthen the corpus? There are also physical artifacts, like my old military trunk, that can be integrated in various forms. Every one of those is precious metal waiting to be refined.
You may wonder about privacy. Shouldn't I be worried about it?
Well, the way I see it, I gave it up long ago. You likely have too. Not wittingly, but as a prerequisite for existing in the modern world.
We're already heavily surveilled. Every time you open most popular apps, make a voice call, send a DM, or perform a digital action, some company logs the movement.
That's the bargain we made for always-on internet. Google has every search, Gemini chat, location ping, every email. Meta has the friend graph and the photos. OpenAI has every prompt. The shadow corpus of me already exists in a disunified configuration. It just exists on someone else's server, optimized to serve ads, not to serve me.
That asymmetry has always been a problem. Tech companies exploit our data to adjust algorithms, flood our inboxes, feeds, and eyes with what they want us to buy, think, and feel.
You generate the raw material, they generate profit. In the past we could only accept that arrangement because there were no tools for the average person to organize, search, and make use of their own data.
In 2026, things are different.
What I did yesterday is a rebalancing of the scale. Now anything I've ever generated digitally (or that can be digitized) can, with enough patience, be indexed, structured, and made usable. My goal isn't "perfect memory" (no Total Recall here). What I want is the ability to consult past versions of myself and to harvest from my own mind in a deliberate manner for future thinking, writing, and utility.
There's a thing I keep saying to myself: the past is not the project. The past is prima materia for the current project. That rule was meant for the Archive Coordinate framework, a weekly excavation layer I've created to randomly surface old artifacts and let them collide with present work. But it applies to the corpus too. The point isn't to build the Keith Hayden collection; it's to turn my data into a valuable resource.
Does drafting using a corpus of my own writing and thinking via large language models make me less of a writer? Should I stop calling myself a novelist?
I used to worry about this a lot.
But now I realize the social trap I'd fallen into. Fact is, I would have built this sooner had I not been so insecure about openly using AI as a creative tool in my process. I won't attempt to justify it any longer.
If I must, I will renounce the novelist label just as easily as I surrendered my “Black card” years ago. The meaning of both, in the traditional interpretation, has long expired.