If you have used a coding agent for a while, you might have noticed it reading the same file over and over. No edits in between. Nothing changed on disk. It just asks for the file again.
It looks harmless. A few hundred extra tokens each time. But the research from the past six months makes it clear this is one of the biggest hidden costs of running agents today.
What the research says
Salesforce AI Research, LoCoBench-Agent (arXiv:2511.13998). Their long-context agent benchmark treats redundant tool usage as a formal waste metric, and they find that the biggest efficiency gains come from reducing redundant operations, not from better models.
AgentDiet, arXiv:2509.23586. Researchers looked at what actually piles up in agent trajectories and found a large share is "code retrieved in previous steps that repeats in the trajectory." When they pruned this automatically, they cut input tokens by 39.9% to 59.7% and total cost by 21.1% to 35.9%, with no drop in task success.
SWE-Pruner, arXiv:2601.16746. A separate group studied the same failure mode and called it "repeated exploratory file reads." Their pruner helped agents finish tasks earlier with fewer tokens.
Anthropic's Claude Code guidance puts a number on it directly: "40 to 60% of Read tokens go to redundant reads." That is from a live shipping product, not a lab benchmark.
Four independent sources, one signal. Duplicate reads are a large, measurable share of what coding agents actually spend tokens on.
Why agents keep doing it
It has nothing to do with how smart the model is. The causes are structural: the agent keeps no durable record of what it has already read, and nothing in its loop signals that a file's content is still sitting in context. So it ends up opening the same drawer to check what is inside, even though it just looked.
What it actually costs
Take a typical 100-call coding session on a mid-sized codebase, then multiply across a team. Production agents today run $3,200 to $13,000 per month, and a meaningful chunk of that bill is paying for information the agent already had.
There is a quieter second cost. Every redundant read pushes newer, more relevant content further back in the model's working memory. Long sessions quietly get dumber as they get more expensive.
What we do about it
At ClevAgent, the duplicate-read check is the first rule we built. In our pre-launch testing across 1,680 agent sessions and 56,367 LLM calls, this one rule accounted for roughly 68% of the total conservative savings we measured.
The mechanism is simple: when the agent issues a Read for a file it has already read and that has not changed, the call is intercepted and answered with a short notice instead of the full content, along the lines of *"You already read {path} at turn N. The content is in your context. Answer from memory, or use Grep for a specific excerpt."* No model change, no retraining. The agent simply stops doing the thing it did not need to do.
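The intercept described above can be sketched in a few lines. This is a minimal illustration, not ClevAgent's actual implementation: the class name, the content-hash check, and the notice wording are assumptions layered on the idea of answering a repeat Read with a short reminder instead of the full file.

```python
import hashlib

class DuplicateReadGuard:
    """Hypothetical sketch: intercept Read calls for files the agent already has in context."""

    def __init__(self):
        # path -> (turn the file was first read, hash of the content served)
        self._seen = {}

    def check(self, path, turn, read_file):
        """Return file content on a first (or changed) read, or a short notice on a repeat."""
        content = read_file(path)
        digest = hashlib.sha256(content.encode()).hexdigest()
        prev = self._seen.get(path)
        if prev is not None and prev[1] == digest:
            # Unchanged since the last read: the full text is already in context,
            # so send a pointer back instead of re-sending the file.
            return (f"You already read {path} at turn {prev[0]}. "
                    "The content is in your context. Answer from memory, "
                    "or use Grep for a specific excerpt.")
        # First read, or the file changed on disk: serve it and record the new hash.
        self._seen[path] = (turn, digest)
        return content
```

Hashing the content rather than just remembering the path matters: if an edit lands between reads, the guard serves the fresh version instead of wrongly telling the agent to answer from memory.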
The bigger point
The next meaningful gains in agent efficiency will not come from bigger models. They will come from catching the structural waste that already sits in every trajectory. Duplicate reads are the easiest of those to measure, and for many real workflows, the single largest.
If you run agents in production, it is worth asking how much of your monthly bill is paying for information your agent already has.