Add heartbeat monitoring to LangChain and LangGraph agents. Detect stuck chains, runaway loops, and cost spikes before they burn through your budget.
Your LangChain agent works in development. Chains resolve, tools return, the ReAct loop converges.
Then you deploy it. Day one is fine. Day two, the agent processes 200 requests without a single error.
Day three, you check your OpenAI bill. $340 — on an agent that should cost $15/day. The agent got stuck in a tool-retry loop at 2 AM. The LLM kept calling a search tool that returned empty results, parsing the empty response, deciding it needed to search again, and repeating. No exceptions. No crashes. Every health check returned 200 OK.
Traditional monitoring tools — including LangSmith — would have shown you the traces after the fact. Nobody would have woken you up at 2 AM when it started.
LangChain agents fail differently from web services: the process stays up and health checks pass while a chain hangs, a tool retries forever, or token costs quietly climb.
max_iterations helps, but only if you set it — and only for iteration count, not cost. LangSmith and Langfuse are excellent for tracing — understanding what happened after the fact. But they don't answer the real-time question: is this agent alive and making progress right now?
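The 2 AM retry loop is easy to reproduce in miniature. A sketch of the pathology (the tool, token counts, and function names here are made up for illustration, not ClevAgent APIs):

```python
# A runaway tool-retry loop: every call "succeeds", so no exception
# ever fires -- but tokens accumulate on each iteration anyway.

def empty_search(query: str) -> str:
    """A tool that silently returns nothing useful."""
    return ""

def run_agent(max_iterations: int = 15, tokens_per_call: int = 2500) -> dict:
    total_tokens = 0
    iterations = 0
    while iterations < max_iterations:
        result = empty_search("latest AI agent frameworks")
        iterations += 1
        total_tokens += tokens_per_call  # each LLM round costs tokens
        if result:  # never true: only max_iterations stops the loop
            break
    return {"iterations": iterations, "total_tokens": total_tokens}

stats = run_agent()
print(stats)  # the iteration cap held, but the token bill still grew
```

max_iterations stops this particular loop at 15 rounds, but nothing in the loop itself notices that 37,500 tokens just produced zero progress.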
Free for 3 agents. No credit card required. Get your API key →
Step 1. Install the SDK.
pip install clevagent
Step 2. Initialize ClevAgent with your API key.
import clevagent

clevagent.init(
    api_key="your-api-key",
    agent="langchain-research-agent",
)
Step 3. Add the callback handler to your LLM or chain.
from clevagent.integrations.langchain import ClevAgentCallbackHandler

handler = ClevAgentCallbackHandler()
llm = ChatOpenAI(model="gpt-4o", callbacks=[handler])
That's it. Every LLM call now sends a heartbeat with token usage. If the agent stops calling the LLM — because a chain hung, a tool timed out, or the process crashed — ClevAgent detects the silence and alerts you.
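Conceptually, that's all a heartbeat handler has to do: record a timestamp on every LLM callback and treat silence past a threshold as failure. A minimal stdlib sketch of the idea (this is an illustration, not ClevAgent's actual implementation):

```python
import time

class HeartbeatMonitor:
    """Record a beat per LLM call; flag silence past a threshold."""

    def __init__(self, threshold_seconds: float = 120.0):
        self.threshold = threshold_seconds
        self.last_beat = time.monotonic()

    def beat(self) -> None:
        # Called from something like on_llm_end in a callback handler.
        self.last_beat = time.monotonic()

    def is_silent(self, now=None) -> bool:
        """True if no beat has arrived within the threshold."""
        now = time.monotonic() if now is None else now
        return (now - self.last_beat) > self.threshold
```

The real handler also ships token usage with each beat; the full agent example below shows where it plugs in.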
import clevagent
from clevagent.integrations.langchain import ClevAgentCallbackHandler
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_react_agent
from langchain.tools import Tool

clevagent.init(api_key="your-api-key", agent="research-agent")
handler = ClevAgentCallbackHandler()
llm = ChatOpenAI(model="gpt-4o", callbacks=[handler])
tools = [
    Tool(name="search", func=search_web, description="Search the web"),
    Tool(name="calculate", func=calculator, description="Do math"),
]
agent = create_react_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, max_iterations=15)
# Every LLM call and tool use is now monitored
result = executor.invoke({"input": "Research the latest AI agent frameworks"})
For LangGraph's graph-based agents, ClevAgent provides a @monitored_node decorator that wraps each node with automatic heartbeat monitoring:
from clevagent.integrations.langgraph import monitored_node
from langgraph.graph import StateGraph

@monitored_node("research")
def research_node(state):
    result = llm.invoke(state["messages"])
    return {"messages": [result]}
@monitored_node("summarize")
def summarize_node(state):
    summary = llm.invoke(f"Summarize: {state['messages'][-1].content}")
    return {"messages": [summary]}
graph = StateGraph(AgentState)
graph.add_node("research", research_node)
graph.add_node("summarize", summarize_node)
graph.add_edge("research", "summarize")
graph.set_entry_point("research")
graph.set_finish_point("summarize")
app = graph.compile()
Each node execution sends a heartbeat. If a node hangs — because an API call never returns or an LLM request times out — ClevAgent detects the gap and alerts you.
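Under the hood, a node decorator like this only needs to wrap the function and emit a beat around each call. A simplified stand-in (this is a toy, not the real @monitored_node):

```python
import functools
import time

BEATS = {}  # node name -> timestamp of last completed execution

def monitored_node_sketch(name):
    """Toy node-wrapping decorator: record a heartbeat per call."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(state):
            result = fn(state)
            BEATS[name] = time.monotonic()  # beat: node finished
            return result
        return wrapper
    return decorator

@monitored_node_sketch("research")
def research_node(state):
    # Stand-in for the LLM call in the real node.
    return {"messages": state["messages"] + ["researched"]}
```

A node that hangs simply never updates its entry in BEATS, which is exactly the silence a monitor looks for.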
Or use the explicit callback for more control:
from clevagent.integrations.langgraph import clevagent_node_callback

def research_node(state):
    result = llm.invoke(state["messages"])
    clevagent_node_callback("research", tokens=result.usage_metadata.get("total_tokens", 0))
    return {"messages": [result]}
Your agent calls an external API inside a tool. The API hangs. The chain never completes. The process is still alive — systemctl status says "running" — but no heartbeats are arriving.
ClevAgent detects the silence within your configured threshold (default: 120 seconds) and sends an alert.
The agent enters a loop: call tool → parse result → decide to call tool again → repeat. max_iterations caps the count, but what about cost? An agent that makes 15 iterations of GPT-4o calls in 30 seconds burns through tokens fast.
ClevAgent tracks cumulative token usage per heartbeat cycle. If tokens spike 10-100x above your agent's baseline, you get a cost alert — while the loop is still running, not after.
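The spike check itself reduces to comparing the current cycle's token count against a rolling baseline. An illustrative sketch (the 10x multiplier mirrors the alert rule described above; the class and its API are invented for this example):

```python
from collections import deque

class CostSpikeDetector:
    """Flag heartbeat cycles whose token usage dwarfs the baseline."""

    def __init__(self, spike_multiplier: float = 10.0, window: int = 50):
        self.spike_multiplier = spike_multiplier
        self.history = deque(maxlen=window)  # recent per-cycle token counts

    def record(self, tokens: int) -> bool:
        """Return True if this cycle looks like a cost spike."""
        spike = False
        if self.history:
            baseline = sum(self.history) / len(self.history)
            spike = tokens > baseline * self.spike_multiplier
        if not spike:
            self.history.append(tokens)  # keep spikes out of the baseline
        return spike
```

Keeping spiked cycles out of the rolling window means one runaway loop doesn't inflate the baseline and mask the next one.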
The process gets OOM-killed at 3 AM. No traceback, no error log, no alert. The agent just stops.
ClevAgent expects a heartbeat every N seconds. When it stops arriving, you get an alert within one missed interval. Optional auto-restart brings the agent back without manual intervention.
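Outside the agent process, the same heartbeat contract becomes a watchdog decision: healthy within one interval, alert after one missed interval, restart if the silence continues. A hedged sketch of that decision logic (the function and thresholds are illustrative, not ClevAgent internals):

```python
def watchdog(last_beat: float, interval: float, now: float) -> str:
    """Classify agent health from the age of the last heartbeat."""
    gap = now - last_beat
    if gap <= interval:
        return "healthy"
    if gap <= 2 * interval:
        return "alert"    # one missed interval: page someone
    return "restart"      # still silent: bring the agent back

# With a 60-second interval:
# watchdog(last_beat=0, interval=60, now=45)  -> "healthy"
# watchdog(last_beat=0, interval=60, now=90)  -> "alert"
# watchdog(last_beat=0, interval=60, now=200) -> "restart"
```

An OOM-killed agent stops beating immediately, so the gap crosses the first threshold within one interval — no traceback required.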
Three steps: pip install clevagent, clevagent.init(), and the callback handler.
*ClevAgent monitors LangChain and LangGraph agents with heartbeat detection, cost tracking, and auto-restart. Free for up to 3 agents — start monitoring →*
3 agents free · No credit card · Setup in 30 seconds