Introduction

Cerca is a hosted runtime for agents that need to keep working past a single prompt. Instead of running a harness on your own VM and owning the patches, restarts, recovery semantics, and sandbox, you configure agents over an API, and the runtime keeps them executing through model errors, tool failures, and process restarts. Every agent and thread is durable and addressable, so long-running work resumes where it left off after a restart or partial failure.

What the runtime gives you

Durable execution and context management. Threads recover from transient errors — bad LLM responses, timed-out tool calls, dropped connections — without losing state. Old turns are compacted and sub-agents spun up so threads stay responsive as they grow.
A cloud sandbox. Each agent has a CLI environment to write and run code in.
Secure credentials. Connections (OAuth or API keys) are stored encrypted; tools call through them, so the model never sees raw secrets.
Approvals. Tools can be marked approval-required. The runtime pauses on those calls until a decision is recorded.
Memory. Agents can write and search structured memory across threads.
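
To make the durable-execution idea concrete, here is a minimal local sketch of retrying transient errors while checkpointing progress. It is not Cerca's implementation or API: `TransientError`, `run_thread`, and the dict-based `checkpoint` are hypothetical names invented for illustration, and a plain in-memory dict stands in for durable storage.

```python
class TransientError(Exception):
    """Hypothetical stand-in for a bad LLM response, timed-out tool call, or dropped connection."""


def run_thread(steps, checkpoint, max_attempts=3):
    """Run `steps` in order, resuming from checkpoint["next_step"].

    `checkpoint` is a plain dict standing in for durable storage (an assumption).
    It is updated after every completed step, so a rerun skips finished work.
    """
    while checkpoint["next_step"] < len(steps):
        step = steps[checkpoint["next_step"]]
        for attempt in range(1, max_attempts + 1):
            try:
                result = step()
                break  # step succeeded, stop retrying
            except TransientError:
                if attempt == max_attempts:
                    raise  # give up; the checkpoint still records completed steps
        checkpoint["results"].append(result)
        checkpoint["next_step"] += 1
    return checkpoint["results"]


# A tool call that fails twice before succeeding.
attempts = {"n": 0}

def flaky_tool():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TransientError("tool call timed out")
    return "fetched"


checkpoint = {"next_step": 0, "results": []}
results = run_thread([lambda: "planned", flaky_tool, lambda: "summarized"], checkpoint)
```

Because the checkpoint advances only after a step completes, calling `run_thread` again with the same checkpoint after a failure or simulated restart continues from the first unfinished step rather than from the beginning.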

What you provide

Custom tools — HTTP endpoints or MCP servers the agent can call.
Goals and instructions — system prompt, default model, and standing context the agent can rely on.
Approval policies — which tools run autonomously and which need a human.
Inbound events — webhooks or schedules that wake an agent up.
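
The split between autonomous and approval-required tools can be pictured with a small local model. This is a sketch under stated assumptions, not the Cerca API: `ApprovalGate`, its methods, the tool names, and the `"run"`, `"paused"`, and `"rejected"` states are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ApprovalGate:
    """Toy approval policy: tools named in `approval_required` wait for a decision."""

    approval_required: set
    pending: dict = field(default_factory=dict)    # call_id -> (tool, args)
    decisions: dict = field(default_factory=dict)  # call_id -> True/False

    def request(self, call_id, tool, args):
        """Return "run", "paused", or "rejected" for a proposed tool call."""
        if tool not in self.approval_required:
            return "run"  # autonomous tool, no human needed
        if call_id in self.decisions:
            return "run" if self.decisions[call_id] else "rejected"
        self.pending[call_id] = (tool, args)  # hold the call until someone decides
        return "paused"

    def decide(self, call_id, approved):
        """Record a human decision for a pending call."""
        self.pending.pop(call_id, None)
        self.decisions[call_id] = approved


gate = ApprovalGate(approval_required={"send_email"})
first = gate.request("call-1", "search_docs", {"q": "refund policy"})
second = gate.request("call-2", "send_email", {"to": "user@example.com"})
gate.decide("call-2", approved=True)
third = gate.request("call-2", "send_email", {"to": "user@example.com"})
```

In this model the same call is re-submitted after the decision is recorded; a real runtime would more likely resume the paused thread itself once the decision arrives.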

Core objects

Agent — a durable workspace, usually one per end-user. Holds configuration, memory, connections, and any threads it has run.
Thread — one conversation or run inside an agent. Threads stream events, can pause for approvals, and can spawn sub-threads.
Fleet — shared configuration for a group of agents: tools, OAuth apps, default instructions, approval defaults. Configure tools once at the fleet level instead of per agent.
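
One way to picture how the three objects relate is as plain data classes. The field names below are illustrative assumptions, not the API's actual schema, and the rule that an agent's own instructions take precedence over the fleet default is also an assumption.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Fleet:
    """Shared configuration for a group of agents (hypothetical fields)."""
    tools: list
    default_instructions: str


@dataclass
class Thread:
    """One conversation or run; `parent_id` is set for sub-threads."""
    thread_id: str
    parent_id: Optional[str] = None


@dataclass
class Agent:
    """A durable workspace that belongs to a fleet and owns its threads."""
    agent_id: str
    fleet: Fleet
    instructions: Optional[str] = None  # assumed per-agent override
    threads: list = field(default_factory=list)

    def effective_instructions(self):
        # Fall back to the fleet default when the agent sets nothing itself.
        return self.instructions or self.fleet.default_instructions

    def start_thread(self, thread_id, parent_id=None):
        thread = Thread(thread_id, parent_id)
        self.threads.append(thread)
        return thread


fleet = Fleet(tools=["search_docs", "send_email"], default_instructions="Be concise.")
agent = Agent(agent_id="agent-for-user-42", fleet=fleet)
main = agent.start_thread("thread-1")
sub = agent.start_thread("thread-2", parent_id=main.thread_id)
```

Keeping tools on the fleet rather than on each agent mirrors the advice above: configure once, and every agent in the group sees the same tool list.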

Next steps

Quickstart — verify your API key, create an agent, start a thread.
Client libraries — install the TypeScript, Python, Go, Ruby, or CLI client.
Authentication — API keys, connections, and credential scoping.
Read the agents, threads, tools, approvals, and webhooks pages for the mental model.
API Reference — exact request shapes and response schemas.
