From "Messy Middle" to Magic: My Agentic Coding Journey (so far)

Let me take you on a journey through my love-hate relationship with AI coding assistants. If you've felt that frustration of spending more time guiding the AI than actually coding, trust me - I've been there. But 2026 changed everything.

2024: The Start: AI Autocomplete

Back in 2024, I shipped a product called GrantWrite AI - a RAG-based grant-writing tool that I built mostly with AI autocomplete. The idea was simple: index previous grants, then retrieve them as context to help write new ones. Collaborative editing, Google Docs-style, the whole deal.

The vibe in 2024? "This works with my workflow. It saves me time." RAG was king, and I felt genuinely good about where AI coding was heading.

Then came 2025.

2025: The "Messy Middle" of IDE Agents

I call this my "vibe coding / IDE agents" phase - also known as the messy middle.

I tried Copilot. I tried Cursor. And honestly? I was not a fan.

Here's what killed it for me:

Constant context management - I had to explicitly mention which files to work with
Feeling like a tour guide - I spent more time telling the AI "here's the file to find, here's the file to use" than actually coding
Spaghetti code output - The models were just okay, and the code wasn't production-ready
Workflow disruption - It felt like I could do everything faster myself

By the end of 2025, I'd basically written off agentic coding. AI autocomplete remained my bread and butter, but full agents? Not ready for prime time.

2026: Enter Claude Code - Finally, The Promised Land

I kept hearing about Claude Code. It was apparently released in February 2025, but I didn't catch the wave until 2026. Everyone was saying it was that much better. I was skeptical, but I finally installed it.

Game. Changer.

Two things made the difference:

1. Terminal UX > IDE Agents

The terminal-based workflow felt more productive, not less. No more wrestling with file mentions or context linking. Just me, the terminal, and an AI that actually understood what I wanted.

2. Models Got Significantly Better

We went from early Claude versions to the newer Sonnet and Opus models. The combination of Claude Code's terminal UX + improved models felt like unlocking a superpower. The code wasn't spaghetti anymore - it was actually usable, functioning code.

I went through the full emotional arc:

"Oh god, this thing can actually do what I do" (brief despair)
"Wait, this can really streamline my workflow" (excitement)
"I can work on 50 projects at once!" (over-ambition)
"...I have too many ideas now" (reality check)

What I Built: Pointing Poker 2.0

During our team's tech refinements, we use Pointing Poker to cost tickets. I had this random idea: what if Jeopardy hold music played while people voted?

In the past, this would've been a non-starter. Building that one small feature felt like overkill. But with Claude Code? Less than an hour.

The idea snowballed into three new features:

Hold music (Jeopardy theme, obviously) - plays for everyone during voting
Attention checks - Detects when someone goes inactive during voting and flags them
"Popcorning" - Ability to randomly assign host duties to other team members

I literally told Claude Code: "Hey, clone this Pointing Poker app, add these three features, here's my tech stack." And it just... worked.

One cool moment: When I had issues with the hold music playing only through the host's speakers (MS Teams limitations), I described the problem and the AI figured out how to play music locally on each participant's machine instead. I'm not sure how long that would've taken me to implement manually.

The Shell & The Brain: A Mental Model

Here's how I think about agentic coding now: Shell + Brain = Magic

Shell = Claude Code (preferred, or any CLI agent) - handles the UX, tool calling, workflow
Brain = The LLM - does the actual reasoning

The beautiful thing? You can swap out the brain.

I've been experimenting with Kimi K2.5 (from Moonshot AI, a Chinese company). It gives me Claude-like reasoning at a fraction of the cost. My brother put me on to it - he suggests using Claude Opus for planning and Kimi for execution.

Cost breakdown so far:

Pointing Poker: ~$25 in Claude Code credits
My next project: $20 in Kimi credits (still not fully spent)
Total: ~$45 for two production-ready projects

What I Learned (The Hard Way)

1. Shell + Brain Incompatibility is Real

Claude Code (the shell) gets updated. Sometimes those updates change how tools are invoked. I ran into a compatibility issue where my Claude Code version didn't play nicely with the Kimi model I was using. The fix? Downgrade the shell version.
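
A rough sketch of that downgrade, assuming Claude Code was installed globally through npm as the `@anthropic-ai/claude-code` package. The function name `pin_claude_code` is made up for illustration, and the version you pass in is whichever one last worked for you (this post doesn't name one). By default it only prints the command so you can check it before running anything.

```shell
# Hypothetical helper: pin Claude Code to a known-good version.
# Assumption: Claude Code was installed globally via npm.
pin_claude_code() {
  local version="${1:-}"
  if [ -z "$version" ]; then
    echo "usage: pin_claude_code <version> [--run]" >&2
    return 1
  fi
  local cmd="npm install -g @anthropic-ai/claude-code@${version}"
  if [ "${2:-}" = "--run" ]; then
    $cmd            # actually install the pinned version
  else
    echo "$cmd"     # dry run: only show what would be executed
  fi
}
```

Calling it with a version and no `--run` flag prints the install command; adding `--run` executes it.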

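Pro tip: Brain swapping is driven entirely by environment variables. It's literally just:

# API key for the provider hosting the new brain
export ANTHROPIC_AUTH_TOKEN="YOUR_MOONSHOT_KEY"
# Anthropic-compatible endpoint the shell should talk to
export ANTHROPIC_BASE_URL="https://api.moonshot.ai/anthropic"
# Model name to request from that endpoint
export ANTHROPIC_MODEL="kimi-k2-turbo-preview"
...

That's it. The shell picks up these variables and swaps the brain.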
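
If you swap brains often, the same exports can be wrapped in a couple of shell functions. This is a minimal sketch, not something built into Claude Code: the function names and the `MOONSHOT_API_KEY` variable are my own placeholders, and it only covers the three variables shown above (any further settings from the elided part are left out).

```shell
# Hypothetical helpers for switching brains; the names are illustrative.

# Point the shell at Moonshot's Anthropic-compatible endpoint.
use_kimi() {
  # Assumption: the Moonshot key is kept in MOONSHOT_API_KEY.
  if [ -z "${MOONSHOT_API_KEY:-}" ]; then
    echo "MOONSHOT_API_KEY is not set" >&2
    return 1
  fi
  export ANTHROPIC_AUTH_TOKEN="$MOONSHOT_API_KEY"
  export ANTHROPIC_BASE_URL="https://api.moonshot.ai/anthropic"
  export ANTHROPIC_MODEL="kimi-k2-turbo-preview"
}

# Clear the overrides so the shell falls back to its default brain.
use_claude() {
  unset ANTHROPIC_AUTH_TOKEN ANTHROPIC_BASE_URL ANTHROPIC_MODEL
}

# Show which brain the next session would use.
which_brain() {
  echo "${ANTHROPIC_MODEL:-default} via ${ANTHROPIC_BASE_URL:-default endpoint}"
}
```

Drop these into your shell profile, run `use_kimi` or `use_claude` before starting a session, and use `which_brain` to double-check what is active.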

2. You Still Need To Be "Somewhat" Technical

AI can't do everything. When I built Mailinator 2 (a temp email service with forwarding), I hit a wall: Railway's free tier doesn't let you set up SMTP servers. I had to understand email protocols, know about Resend (a service that sends email through an HTTP API instead of SMTP), and figure out how to wire everything together.

The AI got me 80% there. The last 20% required actual technical knowledge.
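
For a sense of what that wiring can look like, here is a minimal sketch of sending mail over HTTPS instead of SMTP. It assumes Resend's `POST https://api.resend.com/emails` endpoint with a bearer API key. `RESEND_API_KEY` and the function names are placeholders, and the payload builder does no JSON escaping, so it only suits simple values. It is not the code from Mailinator 2.

```shell
# Build the JSON body for a plain-text email: from, to, subject, text.
# Assumption: values contain no quotes or backslashes (no escaping is done).
resend_payload() {
  printf '{"from":"%s","to":["%s"],"subject":"%s","text":"%s"}' "$1" "$2" "$3" "$4"
}

# Send the email through the HTTP API (needs network access and a real key).
# Assumption: the key is kept in RESEND_API_KEY.
send_via_resend() {
  curl -sS -X POST "https://api.resend.com/emails" \
    -H "Authorization: Bearer ${RESEND_API_KEY}" \
    -H "Content-Type: application/json" \
    -d "$(resend_payload "$@")"
}
```

A call would look like `send_via_resend "inbox@example.com" "me@example.com" "Forwarded" "Hello"`, where the sender address has to belong to a domain you have verified with the service.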

3. AI Will Try To Do Everything From Scratch

When I asked for Jeopardy music, Claude Code created a synthesized version rather than grabbing the actual audio (copyright concerns). I ended up downloading the audio myself. Given that tendency, I was surprised it didn't try to build its own SMTP server when we hit the Railway limitation.

Sometimes you need to step in and say "no, use this existing thing."

4. Shiny Object Syndrome Is Real

Being able to build fast means you'll start 50 projects and finish maybe 5. I learned to scope down and finish one thing before moving to the next. Use feature branches and commit often - AI can absolutely break working code.
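
The branch-and-commit habit can be sketched as three small helpers around plain git. The function names and the `feature/` and `checkpoint:` prefixes are just my own conventions for illustration. Note that `rollback` is destructive: it discards every uncommitted change, including new untracked files.

```shell
# Hypothetical helpers; names and prefixes are illustrative.

# Start a feature branch for one scoped piece of work.
start_feature() {
  git checkout -b "feature/$1"
}

# Save a checkpoint before letting the agent make another round of changes.
checkpoint() {
  git add -A && git commit -q -m "checkpoint: $1"
}

# Throw away everything since the last checkpoint if the agent broke something.
rollback() {
  git reset -q --hard HEAD && git clean -fdq
}
```

A typical loop is `start_feature hold-music`, then `checkpoint "music plays"` whenever things work, and `rollback` when the latest round of changes makes them worse.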

5. Mobile Coding Is Now Possible

I set up Termius + Tailscale + tmux to access my terminal from my phone. Now when my wife drives, I'm building projects from the passenger seat. (Claude Code recently added QR code functionality to make this even easier.)

Getting Started: My Advice

Start with a greenfield project. I tried integrating AI into existing codebases (like GrantWrite AI) and found it much harder. Instead:

Look at software you already use
Identify 1-2 features you wish it had
Ask the AI: "Clone this project and add these features"

That's literally how I built both Pointing Poker 2 and Mailinator 2. Nothing novel - just existing software with small improvements.

The Bottom Line

We've gone from "this kinda works" (2024) → "this is actively making me slower" (2025) → "holy sh*t this is magic" (2026).

Coding now feels effortless. The moment that really hit me was watching AI deploy to Railway - something I'd spent painstaking time building templates for - without any of those templates. It just figured out the configurations, solved the problems, and shipped.

If you haven't tried agentic coding since the early IDE days, give Claude Code a shot. The terminal UX + better models might just change your mind.

Now if you'll excuse me, I have 47 unfinished projects to get back to.