Why Claude Code Goes Dumb After 2 Hours (And What You Can Do About It)

March 4, 2026

I was 3 hours into a refactoring session. Laravel backend, payment module, the kind of messy codebase that takes 20 minutes just to explain. Claude Code was flying in the first hour. Correct file edits, sensible suggestions, remembered every constraint I gave it.

Then something shifted.

It started suggesting code I had already rejected an hour earlier. It forgot that we agreed not to touch the legacy transaction handler. It tried the same broken approach three times in a row. I kept typing “no, we already discussed this” like I was arguing with someone who had just woken up from a nap in the middle of our meeting.

That is not Claude being lazy. That is a context window problem. And once you understand what is actually happening under the hood, you will stop blaming the tool and start managing it properly.

What Is a Context Window and Why Should You Care

Think of it like this. Claude Code does not have long-term memory. Every message you send, every file it reads and every command output it processes goes into one giant working buffer. That buffer has a hard limit of 200,000 tokens on standard plans.

A token is roughly four characters of text. A single 300-line PHP file with short lines could eat 1,500 to 2,000 tokens, and more if the lines are long. A long debugging thread with stack traces, file reads and rewrites can burn through 10,000 tokens in a single back-and-forth.
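To get a feel for these numbers, here is a minimal sketch of the four-characters-per-token rule of thumb. It is only a heuristic, not a real tokenizer, and the function names and the average line lengths are my own assumptions for illustration.

```python
import math

# Rule of thumb from the text: one token is roughly four characters.
# This is an approximation, not the tokenizer Claude actually uses.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str, chars_per_token: int = CHARS_PER_TOKEN) -> int:
    """Approximate a token count from the number of characters."""
    return math.ceil(len(text) / chars_per_token)


def estimate_file_tokens(line_count: int, avg_line_length: int) -> int:
    """Approximate tokens for a file from its line count and average line length."""
    # Add one character per line for the newline.
    total_chars = line_count * (avg_line_length + 1)
    return math.ceil(total_chars / CHARS_PER_TOKEN)


if __name__ == "__main__":
    # Hypothetical 300-line files with different average line lengths.
    print(estimate_file_tokens(300, 24))  # 1875
    print(estimate_file_tokens(300, 40))  # 3075
```

Under this heuristic, 300 lines averaging about 24 characters land inside the 1,500 to 2,000 range, while 300 lines averaging 40 characters cost closer to 3,000 tokens. Dense files are more expensive than they look.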

Here is the part most developers do not know: performance starts degrading well before the buffer is full. Research from Geoffrey Huntley, an engineer at Sourcegraph who builds coding agents, found that output quality noticeably drops at around 147,000 to 152,000 tokens. That is only 73 to 76 percent of the advertised limit. You still have roughly 50,000 tokens of space left on paper, but the quality of work has already fallen off.
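The percentages are easy to check yourself. This is a small sketch of the arithmetic using only the figures quoted above; the function names are mine.

```python
# Figures quoted in the text.
CONTEXT_WINDOW = 200_000                  # advertised limit
DEGRADATION_BAND = (147_000, 152_000)     # where quality reportedly drops


def share_of_window(tokens: int, window: int = CONTEXT_WINDOW) -> float:
    """Return the given token count as a percentage of the window."""
    return tokens * 100 / window


def headroom(tokens: int, window: int = CONTEXT_WINDOW) -> int:
    """Return how many tokens are nominally still free."""
    return window - tokens


if __name__ == "__main__":
    for used in DEGRADATION_BAND:
        print(used, share_of_window(used), headroom(used))
```

The band works out to 73.5 and 76 percent of the window, with 53,000 and 48,000 tokens nominally still free.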

You hit the invisible wall before you hit the real one.

The Auto-Compact Trap

Claude Code has a safety net called auto-compact. When the buffer gets too full, it automatically summarizes the earlier parts of your conversation to free up space so you can keep working.

Sounds fine. Here is the problem.

The summary is lossy. It flattens the details. The architectural decision you explained at the start of the session, the reason you chose to keep the legacy handler untouched, the three approaches you already tried and rejected, all of that gets compressed into a vague paragraph like “the developer is refactoring a payment module.” The nuance disappears.

It also eats a significant chunk of your buffer just to run. As of early 2026, Claude Code reserves around 33,000 tokens, roughly 16.5 percent of your total window, as a compaction buffer. That space is held aside before you even type a single line. Auto-compact fires at around 83.5 percent usage, which works out to roughly 167,000 tokens of context, system tools and memory files included, before the process kicks in.
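Here is a rough sketch of that budget. It assumes the trigger point is simply the window minus the reserve, which matches the figures above but is my own simplification, not a description of how Claude Code computes it. The 10,000 tokens per heavy exchange is the debugging-thread estimate from earlier in this post.

```python
# Figures quoted in the text (as of early 2026).
CONTEXT_WINDOW = 200_000
COMPACTION_RESERVE = 33_000
# Assumed cost of one heavy debugging exchange, taken from the earlier estimate.
TOKENS_PER_HEAVY_EXCHANGE = 10_000


def autocompact_trigger(window: int = CONTEXT_WINDOW,
                        reserve: int = COMPACTION_RESERVE) -> int:
    """Tokens in the buffer when auto-compact fires (assumed: window minus reserve)."""
    return window - reserve


def trigger_share(window: int = CONTEXT_WINDOW,
                  reserve: int = COMPACTION_RESERVE) -> float:
    """The trigger point as a percentage of the window."""
    return autocompact_trigger(window, reserve) * 100 / window


def exchanges_before_compaction(already_used: int = 0,
                                per_exchange: int = TOKENS_PER_HEAVY_EXCHANGE) -> int:
    """How many full heavy exchanges fit before auto-compact would fire."""
    remaining = max(0, autocompact_trigger() - already_used)
    return remaining // per_exchange


if __name__ == "__main__":
    print(autocompact_trigger())            # 167000
    print(trigger_share())                  # 83.5
    print(exchanges_before_compaction())    # 16
```

Sixteen heavy exchanges from an empty buffer is not many, and you never start from an empty buffer.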

So you get less usable space than you think and you lose more context than you expect when compaction runs.

The Degradation Timeline in a Long Session

Based on how the buffer fills during a real coding session, here is roughly what happens:

First 30 to 45 minutes. Everything is sharp. Responses are fast and accurate. Claude remembers every instruction and applies them consistently.

45 to 75 minutes in. Small drift starts appearing. Occasional repetition. It might forget a minor constraint. Still mostly reliable but you need to repeat yourself slightly more often.

75 to 120 minutes in. This is the danger zone. Suggestions start contradicting earlier decisions. It tries approaches you already ruled out. You spend more time correcting than building.

Past 120 minutes or at the “Context Left: 0%” warning. Auto-compact has fired. Claude is now working from a summary of your session, not the session itself. You are effectively starting over with a stranger who read the meeting notes but was not in the room.

How to Fight It

Check your usage with /context

Run /context during your session to see exactly how many tokens you have used. It breaks down where they are going: system tools, memory files, conversation history. Make it a habit every 30 to 40 minutes.

Use CLAUDE.md for things that must not be forgotten

CLAUDE.md is a file Claude Code reads at the start of every session. Put your core architecture decisions, project constraints and “never touch this” rules in there. Unlike conversation history, this survives auto-compact and session restarts. If something is important enough that losing it would derail your work, it belongs in CLAUDE.md and not in the chat.
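To make that concrete, here is a minimal sketch that renders a CLAUDE.md from a few lists of rules. CLAUDE.md is just a Markdown file, so you can equally write it by hand. The section names and the render_claude_md helper are hypothetical, and the rules are examples based on the session I described above.

```python
def render_claude_md(title: str, sections: dict[str, list[str]]) -> str:
    """Render a simple Markdown document with one bullet list per section."""
    lines = [f"# {title}", ""]
    for heading, rules in sections.items():
        lines.append(f"## {heading}")
        lines.extend(f"- {rule}" for rule in rules)
        lines.append("")
    return "\n".join(lines)


# Hypothetical content: replace with your own decisions and constraints.
SECTIONS = {
    "Architecture decisions": [
        "Laravel backend; the payment module is being refactored.",
    ],
    "Never touch": [
        "Do not modify the legacy transaction handler.",
    ],
    "Already rejected": [
        "<approach that was tried and ruled out, with the reason>",
    ],
}

if __name__ == "__main__":
    # Print instead of writing, so an existing CLAUDE.md is never overwritten.
    print(render_claude_md("Payment module refactor", SECTIONS))
```

The point is the content, not the script: short, explicit rules that would otherwise live only in the chat and disappear with the first compaction.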

Compact on your terms, not its

Do not wait for auto-compact to fire. Run /compact yourself with a specific instruction before you hit the limit. Something like:

/compact focus on the payment module refactor and the decisions we made about the legacy handler

You control what survives the summary instead of letting the system guess.

Clear aggressively between tasks

The /clear command resets your conversation history entirely. This feels drastic but it is actually the cleanest approach when you finish a discrete task. You do not carry the baggage of the previous conversation into the next one. Fresh context means full capacity.

Split your work across separate sessions

You can run multiple Claude Code instances at the same time in different terminal windows. One session handles the backend. Another handles the frontend. Each one has a full 200,000-token budget. Use git worktree add to create separate working directories for each session so they do not interfere with each other.
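As a sketch of that setup, the helper below builds one git worktree add command per task, each with its own branch and sibling directory. The directory layout, the session/ branch prefix and the function names are my own assumptions. It only runs git when you pass dry_run=False.

```python
import subprocess
from pathlib import Path


def worktree_command(repo_name: str, task: str) -> list[str]:
    """Build a `git worktree add -b <branch> <path>` command for one task."""
    # Hypothetical layout: a sibling directory such as ../shop-backend.
    path = (Path("..") / f"{repo_name}-{task}").as_posix()
    branch = f"session/{task}"  # hypothetical branch naming scheme
    return ["git", "worktree", "add", "-b", branch, path]


def create_worktrees(repo_name: str, tasks: list[str],
                     dry_run: bool = True) -> list[list[str]]:
    """Return the commands for all tasks, and execute them unless dry_run is set."""
    commands = [worktree_command(repo_name, task) for task in tasks]
    if not dry_run:
        for command in commands:
            # Must be run from inside the main repository checkout.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    for command in create_worktrees("shop", ["backend", "frontend"]):
        print(" ".join(command))
```

After the worktrees exist, open a terminal in each directory and start a separate Claude Code session there. Each session then sees its own checkout and its own conversation history.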

This is how developers who use Claude Code seriously actually work. Not one long marathon session but several focused sprints running in parallel.

The Mental Model Shift

The mistake most developers make, and I made it too, is treating Claude Code like a senior developer you can just keep talking to all day. You explain the full project once and expect it to remember everything forever.

It does not work that way. It is more like a contractor who reads everything you give them and does excellent work, but they have a briefcase with a maximum capacity. Once that briefcase is full, they start leaving older documents behind to fit the new ones in.

Your job is to keep that briefcase focused. One task per session. Important decisions in CLAUDE.md. Clear the session when you finish something meaningful. Check your usage before you hit the wall.

Two hours of clean, focused sessions will produce better output than four hours of one degraded marathon session. The tool is not the problem. The workflow is.

If you have been fighting Claude Code and wondering why it suddenly feels slower or less capable after a long session, now you know why. Try the /context command on your next session and see where your tokens are actually going. The number will probably surprise you.

Originally published on Medium.

By Hafiq Iqmal