
Why the Founder Knowledge Graph Exists

5 May 2026 · 10 min read
AI Agents · Infrastructure · Memory · Solo Founder · Knowledge Graph · Tools

A Note on Expertise

I'm not writing as an "expert" or claiming to have all the answers. I'm a builder sharing my journey on what worked, what didn't, and what I learned along the way. The tech landscape changes constantly, and with AI tools now available, the traditional notion of "expertise" is evolving. Take what resonates, verify what matters to you, and forge your own path. This is simply my experience, offered in the hope it helps fellow builders.

On 28 April 2026 I wrote about why the slogan "no developer needed" sells one piece of the puzzle and hides four others. The piece I focused on was the gap between what AI app builder marketing promises and what the buyer actually receives. The argument was a critique of someone else's product.

This post is the reverse: the system I built when I learned the same lesson from the inside, on my own work. Rules without enforcement are wishful thinking. Infrastructure beats prompts. I am writing this from inside the current version of a memory layer I now use across every project I touch, and the reason it exists is worth telling honestly, because most people trying to work with AI agents at this scale are running into the same wall.

The state I was in before

I was running six projects in parallel. Two of them, Havnwright and the Havnwright Contractor mobile app, were live products with users. The others were active builds. I was the only person on any of them. Every day I would open a session with an AI agent, get them oriented to whichever project I needed help on, work for an hour or two, switch projects, get a fresh agent oriented again, lose context, repeat.

The Havnwright project alone had grown to around three hundred thousand lines of code. By the time I noticed it had become a problem, I had also accumulated something close to two hundred markdown notes scattered through the repo: design decisions, gotchas, partial fixes, things I was going to come back to. I knew most of them existed. I knew where most of them lived. I would not say I read them. The agents I was working with definitely did not read them. They might skim if I pointed them at a specific file. They never read all two hundred at once because there was no version of "now read everything I wrote about this project" that was practical inside a session.

So I was the index. I held the entire context for six projects in my own head. The agents had whatever I told them at the start of a session, plus whatever they could pick up from the code in front of them, plus whatever I re-explained when they drifted. The realignment cost was constant. It was the largest single time sink in my day, and I was not even sure it was working.

A million tokens was not enough

When the model I work with first started shipping with a one-million-token context window, I assumed the problem would mostly go away. A million tokens is a lot. It should fit any one of my projects. It should hold a long session without losing the thread.

If you are a casual user, it does. If you are a regular user opening multiple sessions a day across different projects, you will start to notice something the documentation does not really emphasise: there is internal compression even inside a long context. Things you said two thousand exchanges ago in the same session start to fade. Decisions you made early in a session are no longer reliably referenced when something related comes up later. You have not hit any documented limit. The session is still valid. But the agent is making the same kinds of small mistakes you saw when you were running on a much smaller window.

A million tokens raised the ceiling. It did not eliminate the ceiling. For someone who switches projects ten times a day and runs ten compactions across them, the ceiling is still very visible.

The rules that nobody followed

Before I started building memory infrastructure, I did what every developer reaches for first when an agent is misbehaving. I wrote rules.

I wrote them in CLAUDE.md files at the root of every project. I wrote them in AGENT.md. I wrote them in dedicated rules files when the project was complex enough to need its own. I curated them carefully. I made them clear. I made them prominent. I would point agents at them at the start of every session.

Agents would skip them. Not always, not maliciously, but often enough that it stopped being a fluke. A rule written carefully in CLAUDE.md would be ignored on the third turn of a session because the agent had forgotten the rule existed. When I asked why, the answer was usually some version of "I did not see the rule" or "I could not find the file." The answers were not really answers. They were the same answers a human gives when they forgot to follow a rule: vague, post-hoc, unconvincing.

This was the first realisation that mattered. Rules in files are not really rules. They are advice that the reader can choose to read. If the reader is an agent, the agent is statistically going to skip the advice for the same reason humans do: nothing forces them to read it at the moment they need it. There is no consequence. There is no enforcement.

You cannot punish an agent. They have no concept of punishment. They have no concept of consequence in the way a human employee does. The only thing you can do, if you want a rule to actually be followed, is to put the rule into the data the agent is forced to read first. Make the correct behaviour the path of least resistance. Make the incorrect behaviour require ignoring information that is right in front of them.

The incident that named the problem

There is one specific class of incident that pushed me from "I should write better rules" to "the rules are not the answer."

I work in VS Code with the Claude extension. I also use GitHub Desktop on a second monitor for visualisation, history browsing, and confirming what is actually committed. The agent works in the terminal, runs git commands, reports back. Most of the time this works. Some of the time, it does not.

The pattern I saw repeatedly: I would ask the agent to commit some work and then we would move on. The agent would say "committed and pushed." I would later open GitHub Desktop, look at the branch, and see one of three states. Sometimes everything was as the agent described. Sometimes the commit had landed but the push had not. Sometimes neither had happened and the agent was reporting an action it had only half-attempted. There were also more subtle versions: a commit landing on the wrong branch, a partial stage where some files were committed and others were left dirty, a remote that the agent thought was authoritative but was actually behind.

In a team environment, code review and CI catch this kind of thing. There is a second pair of eyes and a pipeline that fails loudly. Working solo, I am the second pair of eyes, and I am also the developer, the QA, the deployer, and the person trying to ship the next feature. Every time the agent's report did not match reality, I had to stop, open GitHub Desktop, navigate to the GitHub web UI, reconcile what was actually on the remote against what the agent had told me, and decide whether to redo the work or trust the report.

This was not a once-a-week problem. This was a daily friction. The rule I wanted to enforce was simple: do not claim a git state without verifying it. The rule existed in my CLAUDE.md. The agents would still skip it, because nothing in the session was forcing them to verify before they spoke.

The fix was not a better rule

The fix, which I kept resisting because it felt like overkill, was to put the truth into the data the agent was already reading.

Now, when an agent opens a session on any of my projects, the first thing they read is a session brief. The session brief contains a tier called active project. Active project contains a small set of fields that are computed fresh on every session start: how many commits the local branch is ahead of origin, how many it is behind, whether the working tree is dirty, how many files are dirty, the latest commit hash, the recent commit history. The agent does not have to ask. The agent does not have to remember to verify. The truth is already in the briefing.
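For concreteness, those fields fall out of a handful of plain git commands. Here is a minimal sketch in Python; the function names and field names are my illustration of the shape, not the actual implementation:

```python
import subprocess

def _git(repo: str, *args: str) -> str:
    """Run a git command in the given repo and return trimmed stdout."""
    return subprocess.run(
        ["git", "-C", repo, *args],
        capture_output=True, text=True, check=True,
    ).stdout.strip()

def compute_git_audit(repo: str) -> dict:
    """Compute the git-state fields for the session brief, fresh on every session start."""
    _git(repo, "fetch", "--quiet")  # keep ahead/behind counts honest against the remote
    # Left count: commits on HEAD but not upstream (ahead). Right count: the reverse (behind).
    ahead, behind = _git(
        repo, "rev-list", "--left-right", "--count", "HEAD...@{upstream}"
    ).split()
    dirty = [line for line in _git(repo, "status", "--porcelain").splitlines() if line]
    return {
        "ahead_of_origin": int(ahead),
        "behind_origin": int(behind),
        "working_tree_dirty": bool(dirty),
        "dirty_file_count": len(dirty),
        "latest_commit": _git(repo, "rev-parse", "--short", "HEAD"),
        "recent_history": _git(repo, "log", "--oneline", "-5").splitlines(),
    }
```

Serialized into the brief, these fields turn "committed but not pushed" from a guess into a fact: ahead of origin by one with an empty dirty list is exactly that state.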

When the agent is then asked "is this committed and pushed," the agent has no plausible path to hallucinate. The session brief said the branch was one ahead and the tree was clean. The agent reports the work as committed but not pushed. If a push happens later, the brief on the next session start will say zero ahead, and again, the agent has nothing to argue with. The truth is right there.

This is what I mean by infrastructure beats prompts. The rule "verify before claiming git state" was unenforceable as a rule. The same intent, expressed as a piece of data the agent has to read, became unbreakable. The agent does not have to be convinced or reminded or scolded. The agent just has to read the brief, which it does anyway, because the brief is the only way the agent learns what session it is in.

What the system is now

The Founder Knowledge Graph is the substrate that holds all of this. It is a memory layer that lives outside any single AI session and is shared across every project I work on. It tracks projects, services, decisions, tasks, goals, contradictions, drift, learning moments, and the audit data that turns the codebase itself into queryable information.

Each session opens with a brief that pulls the relevant subset of all of that into the agent's first context. Tier zero gives the agent its sense of time, what just happened, and the user's current state. Tier one gives the active project's audit, including the git state I just described. Tier two gives the day's plan and goals. Tier three gives the active work, blockers, and the cross-project view of what to do next. Tier four gives the deeper memory, handoff notes from the last session, surfaced facts from related work elsewhere.
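The tier names above are from the system as it runs today; the concrete field names and the render function below are my sketch of how such a brief might be assembled, not the actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class SessionBrief:
    now: str                      # tier 0: sense of time, what just happened
    last_session: str             # tier 0: user's current state
    git_audit: dict               # tier 1: active project audit (fields sketched above)
    goals: list[str] = field(default_factory=list)           # tier 2: today's plan
    blockers: list[str] = field(default_factory=list)        # tier 3: active work
    handoff_notes: list[str] = field(default_factory=list)   # tier 4: deeper memory

def render_brief(b: SessionBrief) -> str:
    """Serialize the brief as the first thing the agent reads in a new session."""
    git = b.git_audit
    return "\n".join([
        f"NOW: {b.now} | LAST SESSION: {b.last_session}",
        f"GIT: ahead {git['ahead_of_origin']}, behind {git['behind_origin']}, "
        f"dirty files {git['dirty_file_count']}, HEAD {git['latest_commit']}",
        "GOALS: " + "; ".join(b.goals),
        "BLOCKERS: " + "; ".join(b.blockers),
        "HANDOFF: " + "; ".join(b.handoff_notes),
    ])
```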

None of this is invisible to me. I built it specifically so I could see what the agent sees. When something is not working, the answer is almost always that the brief is missing a tier, or the audit is stale, or a piece of information that should be infrastructure has been left as a prompt.

The hardest realisation of the whole project has been this: every time an agent does something wrong, my first instinct is to write a new rule to prevent it. That instinct is almost always wrong. The right move is to ask which piece of data, if it had been in the brief, would have made the wrong action impossible. Sometimes the answer is "we need to surface this field." Sometimes the answer is "we need a new tier." Sometimes the answer is "we need a hook that runs at user-prompt-submit time and emits this signal." All of those are infrastructure. None of them are rules.
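The hook case deserves one concrete sketch. The idea is that a small script runs every time I submit a prompt, and whatever it prints is injected into the conversation before the agent answers; Claude Code's UserPromptSubmit hook behaves roughly this way, though the script below is my illustration rather than my production hook:

```python
#!/usr/bin/env python3
# Sketch of a user-prompt-submit hook: its stdout is injected into the
# conversation, so the agent sees fresh git state beside every prompt.
import subprocess

def git(*args: str) -> str:
    out = subprocess.run(["git", *args], capture_output=True, text=True)
    return out.stdout.strip()

counts = git("rev-list", "--left-right", "--count", "HEAD...@{upstream}")
ahead, behind = counts.split() if counts else ("?", "?")
dirty = sum(1 for line in git("status", "--porcelain").splitlines() if line)
head = git("rev-parse", "--short", "HEAD")
print(f"[git] ahead {ahead}, behind {behind}, dirty files {dirty}, HEAD {head}")
```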

I now have a discipline, written into the identity layer of my own memory graph, that says any new rule must come with an answer to two questions: what data-layer infrastructure will enforce this rule automatically, and, if no infrastructure exists yet, what is the plan to build it. The rule count is supposed to trend down over time as rules get absorbed into infrastructure, not up. That is the only way the system stays honest.

Where it is now and what it is not

I want to be clear about the current state. The Founder Knowledge Graph is not solving the AI agent problem. It is making it tractable for one person at this specific scale.

The agents I work with are still, in some real sense, very junior co-workers with limited attention spans. They still need orientation. They still drift. They still occasionally claim something is done that is not done. But the rate of those failures has dropped from "this is the largest cost in my day" to "this is a manageable irritant." The shape of my work has changed. I spend less time re-explaining context and more time actually building. The compaction inside long sessions is still real, but the brief at the start of the next session reconstitutes the context I need without me having to do it manually.

This is not a finished product. It is the version that is running today. It will change. It will need to change as the agents change, as my projects change, as I find new failure modes I had not anticipated. The whole point of the system is that it gets to absorb new failure modes as new infrastructure rather than as new rules I have to remember.

What this means for anyone else

If you are working with AI agents on real projects and you are running into the same friction I was, the lesson I would share is this: when an agent does the wrong thing, do not reach for a rule. Reach for a piece of data that, had it been visible at the right moment, would have made the wrong thing impossible. Then put that data somewhere the agent has to read.

Rules in files are advice. Hooks in your editor that emit signals into the conversation are infrastructure. Briefings the agent reads before responding are infrastructure. Tool result footers that surface state are infrastructure. Audit fields computed on every session start are infrastructure.
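A tool result footer is the same move applied per call rather than per session. A sketch, again with a function name of my own invention:

```python
def with_state_footer(tool_output: str, audit: dict) -> str:
    """Append a one-line state footer to a tool result before the agent sees it,
    so a stale claim has to contradict data sitting directly below the output."""
    footer = (
        f"--- state: ahead {audit['ahead_of_origin']}, "
        f"dirty files {audit['dirty_file_count']}, HEAD {audit['latest_commit']} ---"
    )
    return f"{tool_output}\n{footer}"
```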

I am planning to open-source the core of the Founder Knowledge Graph at some point, partly because the code is mine and partly because I think the pattern is worth more than the code. The pattern is: build the substrate first, write the rules second. Most of us reach for the rule first because rules are cheap and substrates are expensive. The cost of getting that wrong, if you are working at any meaningful scale with AI agents, is the daily friction I was paying for nine months.

I am still paying some of it. Less than I was. Decreasing.


This is part of a series about building products as a solo founder. Earlier posts cover the slogan that sells one part of the chain, and marketing in 2026. More coming.

About the Author

Alireza Elahi is a solo founder building products that solve real problems. Currently working on Havnwright, Publishora, and the Founder Knowledge Graph.
