Why Generic Tools Break for Game Design

I’m going to make an argument that sounds self-serving, so let me front-load the disclosure: I build and sell a game-design tool. Of course I think generic tools are insufficient. This entire post could be a marketing brochure with extra steps.

I’m going to write it anyway, because I think the specific pattern of how generic tools fail is interesting independent of which alternative you pick. If you read this and decide Obsidian, Articy, pen-and-paper, or your own internal tool is the right answer for you, great. The argument below works the same way whichever solution you choose. The point is to name the shape of the failure clearly, so you can recognise it in your own workflow and act on it instead of paying the tax silently.

The shape of the problem

Game design is a knowledge-work activity with three unusual properties:

Property 1. The objects you reason about have type-specific structure. A character has a name, a portrait, stats, a voice sample, a three-beat backstory, an arc. A quest has a giver, a location, a branching objective list with conditions. A mechanic has an input, a process, an output. These aren’t “documents.” They’re records with mandatory and optional fields whose shape matters. A character without a portrait is incomplete in a different way than a character without an arc. And both are incomplete in ways that no generic document model knows how to flag.
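
To make "records with mandatory and optional fields" concrete, here is a minimal sketch in Python. The field names and the example values are my own illustration, not the schema of any particular tool; the point is only that a typed record can report which of its fields are still empty, which a page of prose cannot.

```python
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class Character:
    # Mandatory field: the record cannot be created without a name.
    name: str
    # Optional fields (hypothetical names): None or empty means "not filled in yet".
    portrait_path: Optional[str] = None
    voice_sample_path: Optional[str] = None
    arc: Optional[str] = None
    backstory_beats: list[str] = field(default_factory=list)
    stats: dict[str, int] = field(default_factory=dict)


def missing_fields(record) -> list[str]:
    """Return the names of all fields that are still empty on a record."""
    empty_values = (None, "", [], {})
    return [f.name for f in fields(record) if getattr(record, f.name) in empty_values]


# Illustrative data: a character with an arc and stats but no portrait yet.
lyra = Character(name="Lyra", arc="reluctant guide", stats={"hp": 12})
print(missing_fields(lyra))
# → ['portrait_path', 'voice_sample_path', 'backstory_beats']
```

A character missing a portrait and a character missing an arc now show up as two different, nameable gaps rather than as "the page looks a bit short."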

Property 2. Every object references every other object, densely. A quest references its giver (a character), its location (a place), its rewards (items), its mechanics (combat verbs). A level references the items that spawn in it and the conversations that trigger inside it. A conversation references characters, flags, outcomes, and the items it might give away. Pull on any single thread of the web and the whole thing moves. The web isn’t an afterthought; it’s the primary fact about game-design data.
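
One way to see the density of that web is to invert it: for any object, list everything that points at it. The sketch below assumes a hypothetical project where each object has an ID and a list of the IDs it references; the IDs themselves are invented for illustration.

```python
from collections import defaultdict

# Hypothetical project data: each object ID maps to the IDs it references.
references = {
    "quest:find-the-key": ["char:lyra", "place:old-mill", "item:rusty-key"],
    "level:old-mill": ["item:rusty-key", "conv:lyra-intro"],
    "conv:lyra-intro": ["char:lyra", "item:rusty-key"],
}


def build_reverse_index(refs: dict[str, list[str]]) -> dict[str, set[str]]:
    """Map each referenced ID to the set of objects that point at it."""
    reverse: dict[str, set[str]] = defaultdict(set)
    for source, targets in refs.items():
        for target in targets:
            reverse[target].add(source)
    return dict(reverse)


used_by = build_reverse_index(references)
print(sorted(used_by["item:rusty-key"]))
# → ['conv:lyra-intro', 'level:old-mill', 'quest:find-the-key']
```

Even in this three-object toy, changing one item touches a quest, a level, and a conversation.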

Property 3. The web is in constant motion. Characters get renamed. Items get rebalanced. Quests get split, merged, scrapped, restored. Mechanics get reframed three times before they ship. A character bio that doesn’t update when you rename the character is worse than no bio at all. It actively misleads new readers and creates two-week delays where someone is confused about which version is current.

Generic productivity tools (Notion, Google Docs, Confluence, Obsidian-as-default) handle property 1 badly, property 2 badly, and property 3 catastrophically. (I went deep on Notion specifically in a separate review; the failure modes I cover below are most acute with Notion but apply to the whole category.)

Let me show you what that looks like in concrete failure modes, from two real projects.

Project A. Notion + Miro + Sheets, six months

Six months. Solo prototype. Top-down action-adventure. Final scope: ~80 items, ~30 quests, ~15 characters, ~12 levels.

Where it broke, by month:

Month 2. The rename incident. I renamed a major NPC from Maren to Lyra. Three months later, a playtester asked me “wait, is Maren the same person as Lyra?” The two names had drifted into being treated as different characters by everyone who’d read the GDD, because Notion’s @-mentions inside page bodies hadn’t propagated the rename. I found stale references to “Maren” in seven separate docs. This was the moment I realised Notion’s cross-reference feature was decorative, not structural.

Month 3. The quest-design fork. Quest 12 needed a conditional branch. Notion’s bullet-indented list couldn’t represent it. I moved all quest design to a Miro board. From that day forward, the truth of every quest lived in Miro and the narrative lived in Notion, and the two drifted constantly. Every change had to be made in two places or risk being silently lost. I forgot to update the Miro version of three quests in one week; the implementation contractor caught two of them in code review and missed the third, which shipped wrong to the playtest build.

Month 4. The balance pass. I did a 6-hour rebalance of 80 items across three tools: Notion item pages (for the descriptions), a Sheets balance master (for the numbers), and a Miro mood board (where item icons lived). Halfway through the pass, I lost track of which items I’d done where. The contractor doing the implementation caught roughly a dozen items where the description, the stats, and the icon all referred to technically the same item but disagreed about its current state.

Month 5. The onboarding attempt. I tried to bring in a writer for some side quests. I spent half a day explaining where things lived: which Notion pages were canonical, which were drafts, which mappings between Notion and Miro to remember, which Sheets file had the “real” stats, where to find the asset library. The writer’s response, almost verbatim: “this is a lot.” They lasted two weeks before quietly drifting off. The tooling wasn’t the only reason, but it didn’t help.

Total tooling-overhead time across the project, tracked for one month and extrapolated: ~80 hours over the six months. That’s roughly 13 hours a month, or two full working weeks in total, spent not designing.

Project B. Google Docs + Sheets + a wiki, three months

I deliberately tried a different stack on the next project to see if simplicity won. Three months. Two-person team. Narrative-heavy adventure. ~40 characters, ~12 quests, ~50 dialogue scenes.

The failure pattern was the same shape, faster:

Day 4. No structured editor for characters. Every character was a Doc. Every character had a slightly different shape because the writer and I had different header conventions. Comparing two characters required opening both documents side-by-side and visually scanning for matching fields. Within a week we had three different shapes of “character bio” in circulation.

Day 11. Linking a dialogue scene to its characters meant pasting Google Doc URLs into the scene. When you renamed a doc, the URL stayed valid but the link text in the referencing doc was now wrong. Within three weeks the dialogue scenes had links saying “talk to Maren” pointing at a doc titled “Lyra (formerly Maren).”

Day 19. I noticed the wiki had silently become “the place everything actually is” and the Docs had become “the place where drafts live.” We had two homes for the same information without ever deciding to. We decided to consolidate. We never finished consolidating.

Day 32. The writer asked me “where’s the latest version of [character]?” three times in one week. I started keeping a “current source of truth” tracking doc. The tracking doc went out of date within five days.

Three months in, I killed the project (for product reasons too, not just tooling) and went back to Project A’s stack, slightly wiser about which failure mode I was paying for and why.

What both projects had in common

Different sizes. Different teams. Different tooling stacks. Identical failure pattern:

The tool’s atomic unit was the document. The project’s atomic unit was the item. Every minute spent translating between the two was a minute not spent on design.

Notion’s pages are documents. Google Docs are documents. Confluence pages are documents. Miro boards are documents. They’re all genuinely good at what they do: describing one fixed-shape thing (a memo, a meeting note, a project brief, a flowchart).

A character isn’t a document. A character is a structured record with optional long-form fields nested inside. Forcing a character into a document means either:

Flattening the structure. Everything becomes prose, which loses the typed fields you need to query, filter, sort, or balance against other characters. (You can’t find “all characters with stat X above 10” by reading prose.)
Replicating the structure manually. Everything becomes a database row with a separate body field, which splits the character across two surfaces and creates a “where do I look” problem within minutes of onboarding anyone new.
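
The "stat X above 10" query is easy to sketch once the stats are typed fields on the same record as the prose. The record shape and the cast below are assumptions for illustration (only Lyra appears elsewhere in this post), not data from either project.

```python
from dataclasses import dataclass, field


@dataclass
class CharacterRecord:
    name: str
    stats: dict[str, int] = field(default_factory=dict)
    bio: str = ""  # the long-form body lives on the same record as the typed fields


def with_stat_above(records, stat: str, threshold: int) -> list[str]:
    """Names of characters that have the given stat and whose value exceeds the threshold."""
    return [r.name for r in records if stat in r.stats and r.stats[stat] > threshold]


# Hypothetical cast with made-up numbers.
cast = [
    CharacterRecord("Lyra", {"strength": 14, "wit": 9}, bio="A guide who would rather not be one."),
    CharacterRecord("Tobin", {"strength": 8, "wit": 12}, bio="Runs the mill."),
    CharacterRecord("Isolde", {"wit": 15}, bio="No combat role, so no strength stat."),
]

print(with_stat_above(cast, "strength", 10))
# → ['Lyra']
```

With prose-only bios, the same question means opening and reading every document.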

The trade-off is unsolvable inside the tool. The only way out is to use a tool whose atomic unit is the item, not the document, whether you build that tool or buy one. The alternative is to accept the limitation and design around it.

What “item-shaped” looks like, when you align primitives

I’m going to describe what changes when you move to a tool whose primitives match the work, not as a sales pitch (I sell one such tool; I’m being upfront about that) but as a description of what vanishes when the structural mismatch goes away. The point is that the things below stop being problems you actively manage. They become non-problems.

Rename-stale references vanish. When a character is an item and every mention is a typed reference to that item, renaming the item updates every mention everywhere in real time. The “where’s the latest [character]?” question stops being asked because the question doesn’t make sense. There is only one place the character can be.
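
The mechanism behind this is simple enough to sketch. This is not how any specific product implements it; it assumes a hypothetical `{{id}}` token syntax, where body text stores a stable ID and the display name is looked up at render time.

```python
import re

# Items are stored once, keyed by a stable ID; the display name is just a field.
items = {"char_001": {"name": "Maren"}}

# Body text stores a reference token (hypothetical {{id}} syntax), never the name itself.
quest_text = "Talk to {{char_001}} at the mill."


def render(text: str, items: dict) -> str:
    """Replace every {{id}} token with the current name of the referenced item."""
    return re.sub(r"\{\{(\w+)\}\}", lambda m: items[m.group(1)]["name"], text)


before = render(quest_text, items)
items["char_001"]["name"] = "Lyra"  # the rename happens in exactly one place
after = render(quest_text, items)

print(before)  # → Talk to Maren at the mill.
print(after)   # → Talk to Lyra at the mill.
```

Because no document body ever contains the literal name, there is nothing left to go stale.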

The “where does the bio actually live” question vanishes. The character’s structured fields and the character’s long-form body are the same record. There isn’t a database row and a separate notes doc and a sub-page. There is one thing.

Translating between tools for branching vanishes. When a quest is an item with structured “branching steps,” the quest editor knows what a quest is and represents the branches natively. Same place, same surface, no Miro board to keep in sync.
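
As a rough sketch of what "structured branching steps" could mean as data, here is one possible shape: each step names its successors by condition flag. The class names, the flag names, and the example quest are all assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    id: str
    text: str
    # Maps a condition flag to the ID of the next step.
    # The key "default" is followed when no flag matches.
    next_by_flag: dict[str, str] = field(default_factory=dict)


@dataclass
class Quest:
    name: str
    start: str
    steps: dict[str, Step]


def walk(quest: Quest, flags: set[str]) -> list[str]:
    """Follow the quest from its start step, choosing branches by the flags that are set."""
    path: list[str] = []
    current = quest.start
    while current is not None:
        if current in path:
            raise ValueError(f"cycle detected at step {current!r}")
        step = quest.steps[current]
        path.append(step.id)
        matched = [nxt for flag, nxt in step.next_by_flag.items() if flag in flags]
        current = matched[0] if matched else step.next_by_flag.get("default")
    return path


# Hypothetical quest with one conditional branch.
quest = Quest(
    name="The locked mill",
    start="accept",
    steps={
        "accept": Step("accept", "Accept the job.", {"default": "reach_mill"}),
        "reach_mill": Step("reach_mill", "Reach the mill.", {"has_key": "open_door", "default": "find_key"}),
        "find_key": Step("find_key", "Find the key.", {"default": "open_door"}),
        "open_door": Step("open_door", "Open the door."),
    },
)

print(walk(quest, {"has_key"}))  # → ['accept', 'reach_mill', 'open_door']
print(walk(quest, set()))        # → ['accept', 'reach_mill', 'find_key', 'open_door']
```

A bullet-indented list can describe one of those paths; a structure like this holds both, and an editor that understands it can draw the branch without a second tool.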

Balance passes go from hours to minutes. When items are typed objects with stat grids, a balance change in one place propagates to every reference (in GDD pages, in quest reward lists, in shop inventories, in the test build). The 6-hour balance pass I described above becomes a 20-minute exercise where the thinking is the bottleneck, not the bookkeeping.

The tooling-overhead time drops from “80 hours over six months” to a few minutes per month. That’s not because I’m a better tool builder than the Notion team (they’re better than I will ever be), but because the shape of the tool matches the shape of the work, and shape-mismatch is a tax that doesn’t show up on any invoice.

When generic tools are still the right answer

Honest answer: there are real cases where the switching cost outweighs the structural-mismatch cost. Don’t switch tools just because the cool kids on the internet say to.

Tiny projects under 50 items. The cracks don’t show. Don’t introduce friction you don’t need.
Hackathons and game jams. You’re throwing the project away in 48 hours. Use whatever you already know.
Solo dev with Obsidian already set up. Obsidian’s bidirectional links cover the cross-reference problem. The structured-editor problem remains, but you might be OK.
The team has zero appetite for learning anything new AND design has not yet been the bottleneck. (Revisit when it becomes the bottleneck. It will.)
AAA studios with dedicated tools teams. You’re going to build internal tools anyway; the choice is just which off-the-shelf base to extend. Many major studios run on Confluence plus custom-built layers, and that’s a defensible choice when you have the headcount to maintain the custom layer.

There are also situations where I would not have built a tool, because the better answer was already obvious before I started:

Pure dialogue-heavy narrative game with budget. Use Articy:draft. They’ve been doing this for fifteen years and they’re excellent at it. (See the 7-tool comparison for the full landscape, including how Articy and Arcweave stack up against Notion and Obsidian.)
You only ship one project per decade. Build internal tools. The amortisation works.

The market I built Ludessy for is the middle: indie + mid-size teams who ship more than one project, can’t afford Articy enterprise pricing, and have outgrown Notion. That’s a real population and they’re systematically underserved by the existing tool landscape. (For the full backstory of why this gap pushed me to build a thing, here’s the launch post.)

The bigger argument

I think most “build a vertical SaaS in a niche” essays are correct but vague about why the niche needs its own tool. Here’s the precise version:

Generic productivity tools optimise for “fits all knowledge work reasonably well.” They are necessarily a worse fit for any specific knowledge-work activity that has both type-structured objects and dense cross-references between them, because supporting those properties well requires opinionated primitives that don’t generalise to other domains.

The same argument applies to:

Legal contract drafting. Clauses are typed records with cross-references; renaming a defined term should propagate; generic word processors don’t get this right and legal-tech tools exist for exactly this reason.
Scientific paper writing. Citations are typed records with cross-references; generic word processors get this wrong and BibTeX exists for exactly this reason.
Music composition. Notes are typed records with timing relationships; generic text editors don’t work; DAWs exist for exactly this reason.
Architectural design. Components are typed records with spatial relationships; generic drawing tools don’t work; CAD exists for exactly this reason.
Recipe development at scale. Ingredients are typed records with quantity relationships that propagate when you scale a recipe up or down; spreadsheets are almost good enough and full recipe-development tools exist for exactly this reason.

Notion’s bet is that flexibility wins across all these domains by being good enough at each. For a lot of knowledge work, this bet is correct. For game design specifically, based on two years of trying, it isn’t, and the failure mode is consistent enough across teams I’ve talked to that I’m comfortable calling it structural rather than a “you” problem.

A test you can run today

If you’re fighting your tools and you assumed it was a you problem, run this test: pick the most-renamed item in your current project (a character, a system, a location: whatever changed names in the last 30 days). Open your tool’s search. Search for the old name. If you find any hits in document bodies, the tool is failing to propagate renames. That’s the cost of the structural mismatch, expressed in concrete searchable form.

If you don’t find any stale references, congratulations: your tool is working for you. Keep it.

If you do find them, you now have a measurable problem instead of a vague frustration, and you can decide whether the cost of switching is worth the cost of paying the tax. There is no universal right answer to that trade-off. There is just a real choice instead of an invisible drift.
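
If your tool’s built-in search is unreliable, the same test can be scripted against an export. The sketch below assumes you can export your pages as Markdown or plain-text files into a folder; the folder name `gdd_export` in the usage note is a placeholder, and the function is my own illustration rather than part of any tool.

```python
import re
from pathlib import Path


def find_stale_mentions(root, old_name, suffixes=(".md", ".txt")):
    """Yield (path, line number, line) for every whole-word hit of old_name under root."""
    pattern = re.compile(rf"\b{re.escape(old_name)}\b", re.IGNORECASE)
    for path in sorted(Path(root).rglob("*")):
        # Skip directories and file types we did not ask for.
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                yield path, number, line.strip()
```

Calling `list(find_stale_mentions("gdd_export", "Maren"))` returns one entry per line that still mentions the old name. The length of that list is a rough count of the problem; an empty list means the rename propagated.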

Next step

I’m collecting the specific moments when people realised their generic tool was costing more than it was saving: the rename incident, the balance pass that lost three items, the writer who quit because the tooling was “a lot.” If you have a story, send it via the feature-request form and I’ll compile the best ones into a follow-up post about the patterns across teams.

Related posts

Ludessy is live. Here's what it does, and why I built it.

Notion for Game Design: An Honest Review After Two Years

The 7 Best Game Design Document Tools in 2026 (Tested With a Real GDD)