
Building with AI: What Actually Works, and What Does Not

9 March 2026 · 9 min read
AI · Solo Founder · Claude Code · Building · Havnwright

A Note on Expertise

I'm not writing as an "expert" or claiming to have all the answers. I'm a builder sharing my journey: what worked, what didn't, and what I learned along the way. The tech landscape changes constantly, and with AI tools now available, the traditional notion of "expertise" is evolving. Take what resonates, verify what matters to you, and forge your own path. This is simply my experience, offered in the hope it helps fellow builders.

When I started building Havnwright, I did not know what Claude Code was. I did not know what Codex was. I was opening the OpenAI console app on my laptop, pasting questions, pasting answers, and trying to turn the output into something coherent. It worked, sort of. But sort of is a dangerous place to be when you are trying to build a platform that other people will eventually trust with their money and their time.

This post is about what building with AI actually looks like when you do it solo, day after day, for something real. Not the demo-day version. Not the viral tweet version. The actual version.

The tooling evolution

I tried a lot of things before landing where I am now. ChatGPT and Codex were my first stops. I used Gemini briefly, and I still use its Nano Banana model occasionally for visual concepts, though I do not need much of that. For a while I was interested in Cursor. It has good UI and good UX, and for many people it is the right choice.

Where I ended up is Claude Code, running as a VS Code extension, connected directly to the Anthropic API. That combination gives me more per dollar than any of the subscription-based editor tools. I pay for API usage rather than a fixed monthly cap, and what I get back in terms of capability, context window, and raw output volume is significantly more than what I was getting from a two-hundred-dollar editor subscription. I also use the Claude desktop app for longer conversations that are not tied to a specific codebase.

For anyone starting out, I would say this: try a few. The gap between tools has narrowed and will keep narrowing. What matters more than the tool is the context you give it and the system you build around it.

What works: the big picture workflow

The single biggest thing that changed how I work with AI is that I stopped treating it like a search engine. Early on, every conversation started with a blank page and a specific question. "How do I implement X?" "Why is Y broken?" The answers were usually generic, because the questions were context-free.

What works now is different. I come to the agent with the big picture of what I am trying to build. The agent already has access to the codebase, the git history, the project structure, and a layer of context that I built on top of all of that. The discussion is no longer "how do I do this?" It is "here is what I am trying to achieve, here is what already exists, what is the best way to fit this in?"

There is a saying I use with one of the agents: when you start a fresh session, the agent is like a person in a dark room with no windows. It does not know what time it is. It does not know what project it is working on. It does not know what was decided yesterday. You can ask it a question and it will give you the most generic answer possible, because generic is all it has.

The infrastructure I built makes that dark room impossible. The agent walks into the room and the lights are already on. It can see where it is, what was decided before, what is blocked, what is in progress, and what the bigger picture looks like. It can navigate the system to find what it needs rather than waiting for me to explain everything.

That is the workflow that works.

What does not work: expecting the agent to know

The most painful lesson I learned is that the agent does not know things you have not told it. This sounds obvious. It is not obvious in practice.

When you are building authentication, for example, the agent can generate plausible code for any individual piece. A login form. A session check. A middleware function. A database query with a user filter. Each piece works in isolation. Each piece looks correct.

The problem is that authentication only works if the pieces fit together correctly across the entire system. One forgotten filter in one query, and another user's data leaks. One race condition between two components both trying to manage auth state, and sessions start getting mixed up. The agent can build all of these pieces, but it cannot automatically check that they fit together, because it does not understand your entire system at once.

I wrote a whole post about this specific problem and how I eventually solved it. Short version: you need to enforce patterns at the architectural level, not trust individual components to remember to do the right thing. That lesson applies everywhere, not just to auth. The more complex your system becomes, the more you have to design in ways that do not rely on the agent, or you, remembering the right thing at the right moment.
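To make "enforce patterns at the architectural level" concrete, here is a minimal sketch of the idea applied to the forgotten-filter problem. This is not the actual Havnwright implementation; the names `ScopedRepository` and `Record` are invented for illustration. The point is structural: call sites go through a layer that always applies the owner filter, so there is no unscoped query for the agent (or you) to forget to restrict.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    id: int
    owner_id: int
    data: str


class ScopedRepository:
    """Every read goes through this layer, which always applies the
    owner filter. Call sites cannot forget it, because there is no
    unscoped query method to call."""

    def __init__(self, records, user_id):
        self._records = records
        self._user_id = user_id

    def all(self):
        # The filter lives here, once, instead of in every query.
        return [r for r in self._records if r.owner_id == self._user_id]

    def get(self, record_id) -> Optional[Record]:
        # Built on top of all(), so it inherits the scoping for free.
        return next((r for r in self.all() if r.id == record_id), None)


records = [
    Record(1, owner_id=7, data="mine"),
    Record(2, owner_id=9, data="someone else's"),
]
repo = ScopedRepository(records, user_id=7)
print([r.data for r in repo.all()])  # only records owned by user 7
print(repo.get(2))                   # None: another user's record is invisible
```

The design choice is that correctness comes from what the interface makes impossible, not from every generated piece of code remembering to do the right thing.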

The document explosion problem

For a long time, I tried to solve the context problem by writing everything down. Research notes, architecture decisions, user flows, feature specs. I had MD files for everything. At one point Havnwright had more than one hundred markdown files inside the project. I had folders of Word documents and PDFs from research. I had stacks of exported ChatGPT conversations. I had so much documentation outside the project that I would occasionally bundle a hundred files into one prompt and ask the agent to summarise them so I could extract what I needed.

It got to the point where the documentation itself became a project to manage. I was spending time organising, summarising, and trying to remember which file had which decision. The thing that was supposed to give the agent context was becoming a bottleneck on its own.

That is what eventually led me to build the Founder Knowledge Graph: a structured system that captures entities, relationships, decisions, tasks, goals, and learning moments in a way the agent can query directly. Instead of stacking files and asking the agent to read all of them, the agent pulls exactly what it needs for the current conversation.
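The query-first shape of that system can be sketched in a few lines. This is a toy, not the real Founder Knowledge Graph, and the entity names are invented; but it shows the difference between "read a hundred files" and "pull the neighbours of the current focus":

```python
from collections import defaultdict


class KnowledgeGraph:
    """Minimal entity store: nodes carry attributes, edges carry a
    relation label. The agent queries it instead of reading files."""

    def __init__(self):
        self._nodes = {}                 # name -> attribute dict
        self._edges = defaultdict(list)  # name -> [(relation, target)]

    def add(self, name, **attrs):
        self._nodes[name] = attrs

    def relate(self, source, relation, target):
        self._edges[source].append((relation, target))

    def neighbours(self, name, relation=None):
        """What a fresh session pulls: everything linked to the
        current focus, optionally filtered by relation type."""
        return [t for rel, t in self._edges[name]
                if relation is None or rel == relation]


kg = KnowledgeGraph()
kg.add("Havnwright", type="project")
kg.add("centralised-auth", type="decision", status="adopted")
kg.add("payment-flow", type="task", status="blocked")
kg.relate("Havnwright", "decided", "centralised-auth")
kg.relate("Havnwright", "has_task", "payment-flow")

print(kg.neighbours("Havnwright", "decided"))  # ['centralised-auth']
print(kg.neighbours("Havnwright"))             # everything linked to the project
```

A real version needs persistence, typed entities, and timestamps, but even this toy captures the shift: the agent asks a question and gets back exactly the decisions and tasks that matter, instead of a summary of a hundred documents.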

Some of this is starting to appear in the tools themselves. Claude has added memory features and better context handling. But there is still a gap between what the tools provide and what a serious multi-project setup actually needs, so the knowledge graph is what I use every day across all my projects.

The aha moments

There were a lot of aha moments along the way. I keep a physical notebook where I write these down because they happen fast and you forget them just as fast.

The biggest one was not technical. It was a mindset shift. Early on, I was terrified of getting things wrong. I was building authentication, payment systems, user data flows, features that real people would eventually rely on, and I kept reading best practices and looking at how big companies do things, trying to find the one right way. The result was paralysis. You have all this information, and you do not know which version of it applies to you.

At some point I realised: I am doing my best. I am trying to come up with the best system. It does not mean I have to get everything right at the very beginning or every time. Something might go wrong. And if it goes wrong, we fix it.

That shift sounds obvious when you read it. It is not obvious when you are in the middle of trying to ship something and you are scared of what might break. But once it clicks, you stop spending three days researching which approach is theoretically correct, and you start shipping something that works, learning from what breaks, and improving from there.

What I would tell someone starting today

A few things I wish someone had told me when I started:

Pick one tool and go deep. You can waste a lot of time comparing editors. Pick one that has direct API access if you can afford it, and commit for a few months before you evaluate.

Build a context layer early. The quality of your work with AI is mostly a function of the context you give it. If you find yourself repeating the same explanations in every conversation, that is a signal to build something that captures that context once.

Expect the agent to be wrong, and design for it. The agent is a fast collaborator, not an infallible one. Patterns that enforce correctness at the architecture level are more valuable than hoping every generated piece of code is right.

Accept that you will not understand everything. Especially in the beginning. I did not fully understand what I was building until I had built it. Learning by doing is real, and with AI it becomes both faster and more confusing at the same time.

Write things down. But then build a system that uses what you wrote down, instead of just accumulating more files.


This is part of a series about building products as a solo founder. Earlier posts cover my personal journey into building, the story behind Havnwright, and the centralised authentication pattern that came out of the lessons in this post. More coming.

About the Author

Alireza Elahi is a solo founder building products that solve real problems. Currently working on Havnwright, Publishora, and the Founder Knowledge Graph.