Add agent monitoring in 2 lines of code. No config files.
Get your API key from Dashboard → Settings.
Tip: Use a consistent agent name. Typos create duplicate agents. Enable Strict Mode in project settings to block unregistered names.
Two levels of protection:
Liveness (default): clevagent.init() sends heartbeats automatically from a background thread. Catches crashes, OOM kills, and clean exits.
Work-progress (recommended): Call clevagent.ping() inside your main loop. Also catches zombie states, hung API calls, and logic deadlocks.
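The difference between the two levels can be sketched without the SDK. This is an illustration only: `AgentHealth`, `classify`, and both timeout values are hypothetical and not part of ClevAgent. It shows why a background-thread heartbeat alone cannot reveal a hung main loop, while a ping made from inside the loop can.

```python
from dataclasses import dataclass


@dataclass
class AgentHealth:
    last_heartbeat: float  # refreshed by the background thread (liveness)
    last_ping: float       # refreshed only when the main loop calls ping()


def classify(health: AgentHealth, now: float,
             heartbeat_timeout: float = 120.0,   # hypothetical value
             progress_timeout: float = 300.0) -> str:  # hypothetical value
    """Classify an agent from the age of its two signals."""
    if now - health.last_heartbeat > heartbeat_timeout:
        return "dead"      # process gone: crash, OOM kill, or exit
    if now - health.last_ping > progress_timeout:
        return "stalled"   # process alive, but the main loop made no progress
    return "healthy"
```

A process whose background thread is still beating but whose loop is stuck on a hung API call would be classified as stalled here, which liveness alone would miss.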
Go to clevagent.io/dashboard. Your agent will appear automatically on the first heartbeat. No manual registration needed.
New agents are created automatically via upsert on first ping — just give them a unique agent name.
| Parameter | Default | Description |
|---|---|---|
| api_key | required | Your project API key (starts with cv_) |
| agent | required | Unique agent name within your project |
| interval | 60 | Heartbeat interval in seconds (minimum depends on your plan) |
| auto_cost | True | Auto-capture token usage from OpenAI/Anthropic SDK calls |
| endpoint | "https://clevagent.io" | ClevAgent API base URL (override for self-hosted deployments) |
| on_loop | "alert_only" | "alert_only" · "stop" · callable — action when loop is detected |
| on_cost_exceeded | "alert_only" | "stop" · "alert_only" · callable — action when daily cost budget exceeded |
If auto_cost doesn't work with your SDK version, log costs manually:
Returns a dict with the same fields echoed back, plus ok: True on success or ok: False with an error key on failure. Raises no exceptions — safe to call from within your main loop.
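Because failures are reported through the returned dict rather than through exceptions, the caller has to check `ok` explicitly. A minimal sketch of that check follows. The helper name `cost_logged` and the logger name are hypothetical; only the `ok` and `error` keys come from the description above.

```python
import logging

logger = logging.getLogger("agent.costs")  # hypothetical logger name


def cost_logged(result: dict) -> bool:
    """Return True if a manual cost log was accepted; log the error otherwise."""
    if result.get("ok"):
        return True
    # On failure the dict carries an "error" key instead of raising.
    logger.warning("cost log rejected: %s", result.get("error", "unknown error"))
    return False
```

Calling this on the dict returned by the manual cost-logging call keeps the main loop running even when a log is rejected.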
What works automatically vs. what requires manual configuration:
| Feature | Auto | Manual | Not supported |
|---|---|---|---|
| Heartbeat | ✅ 2 lines | — | — |
| Cost tracking | ✅ OpenAI / Anthropic | log_cost() for others | — |
| Auto-restart | ✅ Docker (SDK or Runner) | systemd/launchd/process (via Runner) | K8s/supervisor (use native restart) |
| Loop detection | ✅ | — | — |
You can also send heartbeats directly without the SDK:
| Method | Path | Auth |
|---|---|---|
| POST | /api/v1/heartbeat | X-API-Key header |
| POST | /api/v1/heartbeat/batch | X-API-Key header |
| GET | /api/v1/status | X-API-Key header |
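As a rough sketch, a heartbeat request for the first endpoint in the table can be built with the Python standard library. The path and the `X-API-Key` header come from the table; the payload field names `agent` and `status` are assumptions, so check /api/docs for the actual schema. The helper name is hypothetical.

```python
import json
import urllib.request


def build_heartbeat_request(api_key: str, agent: str,
                            endpoint: str = "https://clevagent.io") -> urllib.request.Request:
    """Build (but do not send) a POST to the heartbeat endpoint."""
    # Assumed payload shape; the real schema is documented at /api/docs.
    payload = {"agent": agent, "status": "ok"}
    return urllib.request.Request(
        url=f"{endpoint}/api/v1/heartbeat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


req = build_heartbeat_request("cv_example", "my-agent")
# To send it: urllib.request.urlopen(req, timeout=10)
```

Building the request separately from sending it makes the URL, headers, and body easy to inspect before any network traffic happens.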
| Status | Meaning |
|---|---|
| 200 OK | Heartbeat received |
| 401 | Invalid or missing API key |
| 422 | Validation error (missing required field) |
| 429 | Rate limit exceeded — check Retry-After header |
Error response bodies:
Rate limit: 200 req/min (Free) · 500 (Starter) · 1,000 (Pro) · 10,000 (Enterprise) per API key
Pagination (status endpoint): ?limit=N&offset=M — default 100, max 1000
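The `limit` and `offset` parameters can be combined to walk through a large project page by page. The sketch below only generates the URLs. It assumes you already know roughly how many agents exist, and the function name is hypothetical.

```python
from urllib.parse import urlencode

MAX_LIMIT = 1000  # documented maximum page size


def status_page_urls(total_agents: int, limit: int = 100,
                     endpoint: str = "https://clevagent.io"):
    """Yield one status URL per page, clamping limit to the documented maximum."""
    limit = max(1, min(limit, MAX_LIMIT))
    for offset in range(0, total_agents, limit):
        query = urlencode({"limit": limit, "offset": offset})
        yield f"{endpoint}/api/v1/status?{query}"
```

For 250 agents with the default page size this yields three URLs, with offsets 0, 100, and 200.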
GET /api/v1/status — returns status for all enabled agents in the project:
Real-time alerts go to Telegram, Slack, or Discord. Email delivers periodic digest reports (weekly on Free, daily on Starter+).
Use your own bot — zero extra cost. Three steps:
1. Message @BotFather on Telegram, run /newbot, copy the token.
2. Open https://api.telegram.org/bot<TOKEN>/getUpdates; the chat.id field is your Chat ID.

Email: Periodic digest reports (weekly on Free, daily on Starter+). Not real-time. Sent automatically to your account email.
Custom integrations (Enterprise) — Contact [email protected] for PagerDuty, Opsgenie, or other integrations.
| Plan | History |
|---|---|
| Free | 7 days |
| Starter | 30 days |
| Pro | 90 days |
| Enterprise | 1 year |
When managing many agents, send multiple heartbeats in a single request to reduce latency and API overhead.
Max batch size: 100 agents per request
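A fleet larger than the batch limit has to be split across several requests. A minimal sketch of that splitting follows. The heartbeat dicts are placeholders, and the function name is hypothetical.

```python
from typing import Iterator, Sequence

MAX_BATCH_SIZE = 100  # documented per-request limit


def chunk_heartbeats(heartbeats: Sequence[dict],
                     size: int = MAX_BATCH_SIZE) -> Iterator[list]:
    """Split a list of heartbeat payloads into batches of at most `size`."""
    for start in range(0, len(heartbeats), size):
        yield list(heartbeats[start:start + size])
```

Each yielded list can then be posted to /api/v1/heartbeat/batch as its own request.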
429 retry strategy: Wait the number of seconds given in the Retry-After response header before retrying, and back off exponentially if 429s continue. Example: 1s → 2s → 4s → 8s.
Burst allowance: Up to 2× your rate limit is allowed momentarily (1-minute window). Sustained bursts beyond that return 429.
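One way to combine the Retry-After header with exponential backoff is to wait for whichever is longer. This is a sketch under that assumption, not the SDK's own retry logic. The function name, the 1-second base, and the 60-second cap are hypothetical choices.

```python
from typing import Optional


def retry_delay(attempt: int, retry_after: Optional[str],
                base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based) after a 429."""
    backoff = min(cap, base * (2 ** attempt))  # 1s, 2s, 4s, 8s, ...
    try:
        server_wait = float(retry_after) if retry_after is not None else 0.0
    except ValueError:
        # Retry-After may also be an HTTP date; this sketch ignores that form.
        server_wait = 0.0
    return max(backoff, server_wait)
```

With no header the delays follow the 1s, 2s, 4s, 8s sequence, and a larger Retry-After value overrides them.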
Monitor and auto-restart your agents without changing any code. Install the runner daemon alongside your agent.
Get your API key from Dashboard → Settings → API Keys.
Add multiple --watch flags. Each target is monitored independently.
Go to clevagent.io/dashboard. Your agents will appear automatically on the first heartbeat. No manual registration needed.
| Flag | Default | Description |
|---|---|---|
| --api-key | required | Project API key (cv_xxx) |
| --watch | required | Target to monitor (type:name). Repeatable. |
| --endpoint | https://clevagent.io | ClevAgent server URL |
| --heartbeat-interval | 30 | Seconds between heartbeats |
| --log-level | INFO | DEBUG / INFO / WARNING / ERROR |
| Type | Target | Example | Restart method |
|---|---|---|---|
| docker | Container name | docker:my-bot | docker restart |
| systemd | Service name | systemd:my-agent.service | systemctl restart |
| launchd | Service label | launchd:com.me.agent | launchctl kickstart |
| process | PID file path (recommended) or process name substring | process:/var/run/my-bot.pid | PID file or substring matching via pgrep |
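The `type:name` format used by `--watch` can be validated before the Runner is launched, for example in a deployment script. The sketch below is a hypothetical helper, not the Runner's own parser. It splits on the first colon only, so names that contain colons or slashes stay intact.

```python
SUPPORTED_TYPES = {"docker", "systemd", "launchd", "process"}


def parse_watch_target(value: str) -> tuple:
    """Split a --watch value into (type, name), rejecting unknown types."""
    kind, sep, name = value.partition(":")  # split on the first colon only
    if not sep or not name or kind not in SUPPORTED_TYPES:
        raise ValueError(f"expected type:name with a supported type, got {value!r}")
    return kind, name
```

For instance, `parse_watch_target("process:/var/run/my-bot.pid")` separates the `process` type from the PID file path.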
Substring matching uses pgrep to find matching processes; if multiple match, the lowest PID (oldest process) is used.
Zero code changes. Runner sends heartbeats on behalf of your agent.
Best for: Existing Docker/systemd services you don't want to modify.
Add 2 lines of SDK code + run the Runner. SDK handles cost tracking and loop detection. Runner handles restarts.
Best for: Full monitoring with cost and loop alerts.
Use your native restart mechanisms — ClevAgent monitors the heartbeat and alerts you on downtime.
When ClevAgent detects a loop (repeated tool calls above threshold), the heartbeat response includes a warning field. The SDK reads this and takes action based on your on_loop setting.
| on_loop value | Behavior |
|---|---|
| "alert_only" (default) | Prints warning only — agent keeps running, you get the alert |
| "stop" | Prints warning + calls os._exit(1) to stop the agent immediately |
| callable | Calls your function — use this for custom safe shutdown (flush state, close positions, etc.) |
Example: Trading bot with safe shutdown
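A minimal sketch of such a callable, using only the standard library. The state file path, the `open_positions` dict, and the function body are hypothetical stand-ins for real broker logic. The callable accepts any arguments because the exact signature the SDK uses is an assumption here.

```python
import json
import tempfile
from pathlib import Path

STATE_FILE = Path(tempfile.gettempdir()) / "trading_bot_state.json"  # hypothetical path
open_positions = {"BTC-USD": 0.5}  # hypothetical in-memory state


def safe_shutdown(*_args, **_kwargs) -> None:
    """Persist state first, then clear positions, so nothing is lost on exit."""
    STATE_FILE.write_text(json.dumps({"positions": open_positions}))
    open_positions.clear()  # stand-in for closing positions at a broker


# Passing it to the SDK would look like this (parameter names from the table above):
# clevagent.init(api_key="cv_...", agent="trading-bot", on_loop=safe_shutdown)
```

Writing the state before closing positions means a failure partway through still leaves a record of what was open.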
Upgrade the SDK with pip install --upgrade clevagent. The backend warning field is set by the Advanced loop detection engine (available on Starter, Pro, and Enterprise plans).
ClevAgent detects loops using three signals. Default thresholds work for most agents; adjust them per agent if your workload is legitimately intensive.
| Signal | Default threshold | Adjustable? |
|---|---|---|
| Tool call rate | 10 calls/min | ✅ Dashboard · API |
| Repeated message | 5 identical messages | ✅ Dashboard · API |
| Token spike | 3× rolling average | ✅ Dashboard · API |
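To make the three signals concrete, here is a client-side illustration that applies the default thresholds from the table. It is not ClevAgent's detection engine, which runs server-side. The function name, the input shapes, and the exact comparison operators are assumptions.

```python
from collections import Counter
from statistics import mean


def loop_signals(tool_calls_last_min: int, recent_messages: list,
                 token_history: list, tokens_now: float,
                 max_calls: int = 10, max_repeats: int = 5,
                 spike_factor: float = 3.0) -> list:
    """Return the names of the signals that exceed their thresholds."""
    signals = []
    if tool_calls_last_min > max_calls:
        signals.append("tool_call_rate")
    # Most frequent message repeated at least max_repeats times.
    if recent_messages and max(Counter(recent_messages).values()) >= max_repeats:
        signals.append("repeated_message")
    # Current token usage compared with the average of earlier cycles.
    if token_history and tokens_now > spike_factor * mean(token_history):
        signals.append("token_spike")
    return signals
```

Running a function like this against your own agent's typical numbers is one way to judge whether the defaults would trigger on legitimate workloads.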
To adjust, go to Dashboard → Agent Detail → Settings, or via API:
Full TypeScript types. Zero runtime dependencies. Works with any Node.js agent framework — Vercel AI SDK, LangChain.js, AutoGen, or plain HTTP agents.
| Parameter | Type | Default | Description |
|---|---|---|---|
| apiKey | string | required | Project API key (cv_xxx) |
| agent | string | required | Unique agent name within your project |
| interval | number | 60 | Heartbeat interval in seconds |
| endpoint | string | "https://clevagent.io" | API base URL (override for self-hosted) |
| onLoop | string \| () => void | "stop" | "stop" · "alert_only" · custom function — action when loop is detected |
| onCostExceeded | string \| () => void | "alert_only" | "stop" · "alert_only" · custom function — action when daily cost budget exceeded |
| agentType | string | undefined | Framework identifier ("claude", "langchain", etc.) |
| Parameter | Type | Default | Description |
|---|---|---|---|
| status | string | "ok" | "ok" · "warning" · "error" · "shutdown" |
| message | string | undefined | Free-text status message |
| tokensUsed | number | undefined | Token count for this cycle |
| costUsd | number | undefined | Cost in USD for this cycle |
| toolCalls | number | undefined | Number of tool calls this cycle |
| iterationCount | number | undefined | Current iteration number |
| memoryMb | number | undefined | Memory usage in MB |
Record costs, prompts, tool calls, and iterations explicitly when auto-tracking isn't available:
Export ClevAgent metrics to Prometheus, Grafana, Datadog, or any OTLP-compatible backend.
Exported metrics include: agent uptime, heartbeat latency, token cost per cycle, loop detection count, and auto-restart events.
Receive Langfuse error traces and trigger auto-restart. Configure a Langfuse webhook to POST to ClevAgent when a trace has error status.
When status is "error", ClevAgent creates a langfuse_error event. If the matched agent has auto-restart enabled, a restart command is queued automatically.
In Langfuse, set the webhook URL to https://clevagent.io/api/v1/webhooks/langfuse and pass your project API key as the X-API-Key header.
Add ClevAgent to your existing agent framework in minutes.
Add heartbeat monitoring to your CrewAI crew. Each agent gets its own heartbeat.
Monitor your LangGraph agent loops. Ping inside the graph node for work-progress tracking.
Add monitoring to your AutoGen multi-agent conversations.
All error responses follow this format:
| Code | Detail | When |
|---|---|---|
| 400 | Password must be at least 8 characters | Registration with short password |
| 400 | Invalid or expired token | Email verification with bad/used token |
| 401 | Not authenticated | Missing or invalid session cookie / API key |
| 401 | Invalid credentials | Wrong email or password at login |
| 402 | upgrade_required | Action requires a higher tier (e.g. agent limit reached) |
| 403 | Access denied | No permission for this project/agent |
| 403 | EMAIL_NOT_VERIFIED | Login before email verification |
| 403 | Requires editor/admin role | Shared member lacks permission |
| 404 | Agent/Project not found | Invalid ID or wrong project |
| 409 | Email already registered | Duplicate registration attempt |
| 422 | Unknown event / validation error | Invalid request body or unknown funnel event |
| 429 | Rate limit exceeded | Too many requests (login attempts, heartbeats, funnel events) |
| 502 | Failed to create checkout/portal | Stripe API error during billing operations |
| 503 | Billing not configured | Stripe keys missing on server |
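When handling these errors in client code, it helps to extract the message defensively. The sketch below assumes the body is JSON with a `detail` key, inferred from the Detail column above. That shape is an assumption, so confirm it against /api/docs. The helper name is hypothetical.

```python
import json


def error_detail(body: bytes) -> str:
    """Extract a human-readable message from an error response body."""
    try:
        parsed = json.loads(body)
    except (ValueError, TypeError):
        # Covers invalid JSON and undecodable bytes, e.g. an HTML error page.
        return "unparseable error body"
    if isinstance(parsed, dict) and "detail" in parsed:
        return str(parsed["detail"])
    return "unknown error"
```

Falling back to a fixed string keeps logging code from raising when a proxy or gateway returns a non-JSON body.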
Full API schema: /api/docs (Swagger) or /api/redoc (ReDoc)
Questions? Email [email protected]