If you have used a coding agent for a while, you might have noticed it reading the same file over and over. No edits in between. Nothing changed on disk. It just asks for the file again.
It looks harmless. A few hundred extra tokens each time. But the research from the past six months makes it clear this is one of the biggest hidden costs of running agents today.
What the research says
Salesforce AI Research, LoCoBench-Agent (arXiv:2511.13998). Their long-context agent benchmark treats redundant tool usage as a formal waste metric, and they find that the biggest efficiency gains come from reducing redundant operations, not from better models.
AgentDiet, arXiv:2509.23586. Researchers looked at what actually piles up in agent trajectories and found a large share is "code retrieved in previous steps that repeats in the trajectory." When they pruned this automatically, they cut input tokens by 39.9% to 59.7% and total cost by 21.1% to 35.9%, with no drop in task success.
SWE-Pruner, arXiv:2601.16746. A separate group studied the same failure mode and called it "repeated exploratory file reads." Their pruner helped agents finish tasks earlier with fewer tokens.
Anthropic's Claude Code guidance puts a number on it directly: "40 to 60% of Read tokens go to redundant reads." That is from a live shipping product, not a lab benchmark.
Four independent sources, one signal. Duplicate reads are a large, measurable share of what coding agents actually spend tokens on.
Why agents keep doing it
It has nothing to do with how smart the model is. The causes are structural: the agent keeps no durable record of what it has already read, and nothing in its loop signals that a file's content is still sitting in context. So it ends up opening the same drawer to check what is inside, even though it just looked.
What it actually costs
Take a typical 100-call coding session on a mid-sized codebase, then multiply across a team. Production agents today run $3,200 to $13,000 per month, and a meaningful chunk of that bill is paying for information the agent already had.
There is a quieter second cost. Every redundant read pushes newer, more relevant content further back in the model's working memory. Long sessions quietly get dumber as they get more expensive.
What we do about it
At ClevAgent, the duplicate-read check is the first rule we built. In our pre-launch testing across 1,680 agent sessions and 56,367 LLM calls, this one rule accounted for roughly 68% of the total conservative savings we measured.
The mechanism is simple: when the agent issues a Read for a file it has already read and that has not changed, the call is intercepted and answered with a short notice instead of the full content, along the lines of *"You already read {path} at turn N. The content is in your context. Answer from memory, or use Grep for a specific excerpt."* No model change, no retraining. The agent simply stops doing the thing it did not need to do.
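The intercept described above can be sketched in a few lines. This is a minimal illustration, not ClevAgent's actual implementation: the class name, the content-hash check, and the notice wording are assumptions layered on the idea of answering a repeat Read with a short reminder instead of the full file.

```python
import hashlib

class DuplicateReadGuard:
    """Hypothetical sketch: intercept Read calls for files the agent already has in context."""

    def __init__(self):
        # path -> (turn the file was first read, hash of the content served)
        self._seen = {}

    def check(self, path, turn, read_file):
        """Return file content on a first (or changed) read, or a short notice on a repeat."""
        content = read_file(path)
        digest = hashlib.sha256(content.encode()).hexdigest()
        prev = self._seen.get(path)
        if prev is not None and prev[1] == digest:
            # Unchanged since the last read: the full text is already in context,
            # so send a pointer back instead of re-sending the file.
            return (f"You already read {path} at turn {prev[0]}. "
                    "The content is in your context. Answer from memory, "
                    "or use Grep for a specific excerpt.")
        # First read, or the file changed on disk: serve it and record the new hash.
        self._seen[path] = (turn, digest)
        return content
```

Hashing the content rather than just remembering the path matters: if an edit lands between reads, the guard serves the fresh version instead of wrongly telling the agent to answer from memory.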
The bigger point
The next meaningful gains in agent efficiency will not come from bigger models. They will come from catching the structural waste that already sits in every trajectory. Duplicate reads are the easiest of those to measure, and for many real workflows, the single largest.
If you run agents in production, it is worth asking how much of your monthly bill is paying for information your agent already has.