Some agent actions should not happen without a human saying yes. Deleting files, sending money, posting on someone’s behalf, or making an irreversible API call are all moments when you want a real person in the loop. Other times the agent simply does not have enough information to proceed and needs to ask a clarifying question before going further. Motus has first-class support for both. An agent running inside motus serve can pause itself, send a payload to the parent server, wait for a reply, and then continue from exactly where it stopped. The session API exposes this as a state called interrupted, and clients drive it back to running with POST /sessions/{id}/resume.
Pick your approach
Block dangerous tools
Ask the user a question
Build your own flow
Use the interrupt() primitive for any custom payload shape.
Try it in 30 seconds
The fastest way to see human-in-the-loop (HITL) in action is the bundled example agent and the reference CLI client.
See src/motus/serve/cli.py for the full implementation if you want to use it as a template for your own UI.
How it works
Every message you send to a session spawns a fresh worker subprocess. The worker runs your agent, and your agent can call interrupt() from anywhere inside its execution. When that happens:
The worker pauses
Your agent calls await interrupt(payload). The worker sends the payload to the parent server over a pipe and the agent’s coroutine blocks on a future, waiting for a reply.
The session enters the interrupted state
The server assigns an interrupt_id, adds it to the session’s pending interrupts, and flips the session status from running to interrupted. Any client doing a long poll on GET /sessions/{id} wakes up immediately and sees the new state. The interrupt_id is what the client will echo back when it posts the resume.
The client shows the payload to the user and collects a reply
The client posts a resume
POST /sessions/{id}/resume with the interrupt_id and a value ships the user’s reply back through the server, into the worker, and into the agent’s awaiting future. The agent picks up exactly where it left off.
AgentServer runs in both environments, so the REST API, session lifecycle, and wire protocol are the same.
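The poll, present, resume cycle described above can be sketched as a small client loop. This is a hedged illustration, not the reference client: the transport is injected as plain callables so the block runs without a server, and the response field names (status, interrupts, interrupt_id, payload) are assumptions about the JSON shape rather than a documented contract.

```python
from typing import Any, Callable

Session = dict[str, Any]


def drive_session(
    get_session: Callable[[], Session],
    post_resume: Callable[[str, Any], None],
    ask_user: Callable[[Any], Any],
) -> Session:
    """Poll a session until the turn ends, answering interrupts as they appear."""
    while True:
        # In a real client this would be a long poll on GET /sessions/{id}.
        session = get_session()
        status = session["status"]
        if status == "interrupted":
            for pending in session["interrupts"]:
                reply = ask_user(pending["payload"])
                # Echo the interrupt_id back, as POST /sessions/{id}/resume expects.
                post_resume(pending["interrupt_id"], reply)
        elif status in ("idle", "error"):
            return session
        # Otherwise the session is still running: poll again.


# Usage with an in-memory fake instead of a real server.
_states = iter([
    {"status": "running"},
    {"status": "interrupted",
     "interrupts": [{"interrupt_id": "abc", "payload": {"type": "tool_approval"}}]},
    {"status": "running"},
    {"status": "idle"},
])
resumes: list[tuple[str, Any]] = []
final = drive_session(
    get_session=lambda: next(_states),
    post_resume=lambda interrupt_id, value: resumes.append((interrupt_id, value)),
    ask_user=lambda payload: {"approved": True},
)
```

Injecting the transport keeps the loop testable; a real client would back get_session and post_resume with HTTP calls to the session API.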
Three ways to pause an agent
Tool approval gates
The simplest case: you have a tool that should never run without explicit user approval. Add requires_approval=True to the @tool decorator and Motus does the rest.
When the agent calls delete_file, Motus pauses the worker and emits an interrupt with this shape:
If approved is true, the tool runs normally. If it is false or missing, Motus raises ToolRejected; the agent sees it as a regular tool error ({"error": "User rejected delete_file"}) and can try a different approach, ask the user what to do instead, or give up gracefully. The model stays in control of recovery.
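The approve-or-reject behavior can be simulated without Motus. The sketch below is not the framework's implementation: ToolRejected is a local stand-in class, gated_call is a hypothetical helper, and the "result" key for successful calls is an assumption. Only the rejection message format is taken from the text above.

```python
import json
from typing import Any, Callable


class ToolRejected(Exception):
    """Local stand-in for the exception Motus raises on rejection."""


def gated_call(tool: Callable[..., Any], kwargs: dict[str, Any], resume_value: dict[str, Any]) -> str:
    """Run `tool` only if the resume value approves it; return a JSON tool result."""
    try:
        # A false or missing "approved" key both count as a rejection.
        if resume_value.get("approved") is not True:
            raise ToolRejected(f"User rejected {tool.__name__}")
        return json.dumps({"result": tool(**kwargs)})
    except ToolRejected as exc:
        # The model sees an ordinary tool error and decides how to recover.
        return json.dumps({"error": str(exc)})


def delete_file(path: str) -> str:
    return f"deleted {path}"  # no real deletion in this sketch


approved = gated_call(delete_file, {"path": "a.txt"}, {"approved": True})
rejected = gated_call(delete_file, {"path": "a.txt"}, {})
```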
requires_approval=True prepends an auto-generated input guardrail to the tool’s guardrail chain. The approval check runs before any guardrails you defined yourself, and you never write interrupt logic by hand.
Structured questions with ask_user_question
Sometimes the agent does not need approval. It needs information. Maybe it does not know which file to edit, or which date range to pull data for, or whether the user wants the long answer or the short one. The ask_user_question builtin tool lets the model ask structured questions with predefined options.
Instead of calling organize_files, the model calls ask_user_question, and the worker emits this interrupt payload:
The {"answers": {...}} dict is what the ask_user_question tool returns back to the model (serialized as JSON, like any other tool result), so the agent can continue reasoning with the user’s choice in hand.
Schema reference
The ask_user_question tool validates inputs against this Pydantic schema:
| Field | Type | Validation | Notes |
|---|---|---|---|
| questions | list | required, 1 to 4 items (enforced) | Top-level array. |
| questions[].question | string | required | Full question text. End with ? by convention (not enforced). |
| questions[].header | string | required | Short chip label for the UI. Keep it under ~12 chars (not enforced). |
| questions[].multiSelect | bool | optional, default false | Set to true for non-mutually-exclusive choices. |
| questions[].options | list | required, 2 to 4 items (enforced) | The list of choices. |
| questions[].options[].label | string | required | Display text, 1 to 5 words by convention. |
| questions[].options[].description | string | required | Explanation of this option. |
| questions[].options[].markdown | string \| null | optional | A code snippet or technical preview the frontend can render in a monospace box when the user hovers or focuses this option. Use it when an option is best explained by showing the actual text or code it would produce. |
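To make the enforced constraints in the table concrete, here is a plain-Python check that mirrors them. It is a stand-in for illustration only, not the Pydantic model Motus uses: validate_questions is a hypothetical helper, and the example question and options are invented.

```python
from typing import Any


def validate_questions(args: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the input passes."""
    questions = args.get("questions")
    if not isinstance(questions, list) or not 1 <= len(questions) <= 4:
        return ["questions must be a list of 1 to 4 items"]
    problems: list[str] = []
    for i, q in enumerate(questions):
        if not isinstance(q, dict):
            problems.append(f"questions[{i}] must be an object")
            continue
        for key in ("question", "header"):
            if not isinstance(q.get(key), str):
                problems.append(f"questions[{i}].{key} must be a string")
        if not isinstance(q.get("multiSelect", False), bool):
            problems.append(f"questions[{i}].multiSelect must be a bool")
        options = q.get("options")
        if not isinstance(options, list) or not 2 <= len(options) <= 4:
            problems.append(f"questions[{i}].options must be a list of 2 to 4 items")
            continue
        for j, opt in enumerate(options):
            if not isinstance(opt, dict):
                problems.append(f"questions[{i}].options[{j}] must be an object")
                continue
            for key in ("label", "description"):
                if not isinstance(opt.get(key), str):
                    problems.append(f"questions[{i}].options[{j}].{key} must be a string")
            # markdown is optional and may be a string or null.
            if not isinstance(opt.get("markdown"), (str, type(None))):
                problems.append(f"questions[{i}].options[{j}].markdown must be a string or null")
    return problems


example = {
    "questions": [{
        "question": "Which date range should I pull?",
        "header": "Range",
        "options": [
            {"label": "Last 7 days", "description": "The most recent week."},
            {"label": "Last 30 days", "description": "The most recent month."},
        ],
    }]
}
ok = validate_questions(example)
too_few = validate_questions({"questions": [{**example["questions"][0], "options": example["questions"][0]["options"][:1]}]})
```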
Custom interrupts with the interrupt() primitive
If neither pattern fits, drop into the primitive directly. interrupt() accepts any dict and returns whatever value the client posts back. This is what you reach for when you want to ask for free-form text input, present a custom UI, or build your own elicitation pattern.
The example below uses the plain (message, state) -> (response, new_state) callable shape that motus serve accepts. See the Serving guide for the full set of agent shapes.
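A minimal sketch of an agent in that shape, asking for free-form text. It assumes the callable may be async, and it defines a local stand-in for interrupt() so the block runs outside a serve worker. The free_text type and the prompt and text keys are illustrative names, not part of Motus.

```python
import asyncio
from typing import Any


async def interrupt(payload: dict[str, Any]) -> Any:
    """Stand-in for the Motus primitive so this sketch runs on its own.

    Inside a serve worker the real interrupt() blocks until the client resumes;
    here it returns a canned reply immediately.
    """
    return {"text": "Q3 revenue only"}


async def agent(message: str, state: dict[str, Any]) -> tuple[str, dict[str, Any]]:
    # Pause and ask the user for free-form input with a custom payload.
    reply = await interrupt({
        "type": "free_text",
        "prompt": f"What should '{message}' cover?",
    })
    # Whatever the client posts back as the resume value arrives here.
    new_state = {**state, "scope": reply["text"]}
    return f"Summarizing: {reply['text']}", new_state


response, new_state = asyncio.run(agent("summarize the report", {}))
```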
The type field is a convention, not enforced. Pick any string your client knows how to handle. The framework’s built-in interrupts use tool_approval and user_input. If you want the reference CLI client (motus serve chat) to render your custom interrupts, reuse one of those strings. Otherwise, the CLI prints [warn] unknown interrupt type on every poll and never resumes the interrupt, so the session stays wedged until you build your own client.
Session state machine
The session status tells your client what to do next.

| Status | What it means | What you can do |
|---|---|---|
| idle | Waiting for input. Initial state after creation. | Send a message with POST /messages. |
| running | The worker is executing the agent. | Long poll GET /sessions/{id}?wait=true. |
| interrupted | The agent paused and is waiting for one or more resumes. | Read interrupts, present to user, post POST /resume for each. |
| error | The worker failed (exception, timeout, cancellation, or crash). The error field has the message. | Read the error, optionally send a new message to retry. |
Common errors
RuntimeError: interrupt() called outside motus serve worker subprocess
You called interrupt() from a unit test, REPL, or some other context that is not a serve worker. The primitive only works inside a process spawned by AgentServer. To test, use a real serve process in your fixture (see tests/integration/serve/test_hitl.py).
ValueError: Interrupt message too large
The ValueError is raised inside interrupt() itself, so it propagates up through your agent code like any other exception. If the agent does not catch it, the worker returns an error traceback and the session transitions to error. If you need to ship something bigger (a screenshot, a large file), upload it to object storage first and pass a URL through the interrupt instead. The limit only applies to interrupts going out from the worker; resume values posted in by the client are not size checked.
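One way to apply the upload-then-link advice is to decide before calling interrupt() whether the data fits inline. The sketch below is an assumption-heavy illustration: build_payload and upload are hypothetical helpers, the payload keys are invented, and max_bytes is a caller-supplied budget because the actual limit is not stated here.

```python
import json
from typing import Any, Callable


def build_payload(
    blob: bytes,
    upload: Callable[[bytes], str],
    max_bytes: int,
) -> dict[str, Any]:
    """Inline small blobs; swap large ones for a URL before interrupting."""
    inline = {"type": "review", "data": blob.decode("utf-8", errors="replace")}
    if len(json.dumps(inline).encode("utf-8")) <= max_bytes:
        return inline
    # Too big to send through the interrupt: upload it and pass a reference.
    return {"type": "review", "url": upload(blob)}


def fake_upload(blob: bytes) -> str:
    # Placeholder for an object-storage upload; returns a made-up URL.
    return "https://example.invalid/blob"


small = build_payload(b"hi", fake_upload, max_bytes=1024)
large = build_payload(b"x" * 5000, fake_upload, max_bytes=1024)
```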
404 Not Found on POST /resume
The interrupt_id you sent does not map to a live pending interrupt. Common causes: the interrupt was already resumed (resumes are not idempotent, so the second call gets a 404), the session was deleted or timed out, you raced a resume against the worker tearing down (the server replies with "Session not actively waiting for resume"), or there is a typo in the id. Use the exact id from the most recent poll response and avoid double-resuming.
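Since resumes are not idempotent, a client can guard against sending the same one twice. A minimal sketch, assuming a hypothetical ResumeOnce wrapper around whatever function actually issues the POST:

```python
from typing import Any, Callable


class ResumeOnce:
    """Remembers which interrupt ids this client has already resumed."""

    def __init__(self, post_resume: Callable[[str, Any], None]) -> None:
        self._post_resume = post_resume
        self._done: set[str] = set()

    def resume(self, interrupt_id: str, value: Any) -> bool:
        """Post the resume unless this id was already sent; return True if posted."""
        if interrupt_id in self._done:
            return False
        self._post_resume(interrupt_id, value)
        self._done.add(interrupt_id)
        return True


sent: list[str] = []
guard = ResumeOnce(lambda interrupt_id, value: sent.append(interrupt_id))
first = guard.resume("abc", {"approved": True})
second = guard.resume("abc", {"approved": True})  # skipped, would have been a 404
```

This only protects a single client process; two separate clients resuming the same session can still race each other.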
Session never returns to idle
Make sure your client reads the interrupts array and posts a resume for each one. The CLI client (motus serve chat) only handles tool_approval and user_input. Custom interrupt types make it print [warn] unknown interrupt type on every poll without ever resuming, so the session stays wedged. If you use custom types, build your own client.
Things to know
Sessions live in memory
The motus serve process holds all sessions in a dict that does not survive restarts. In Motus Cloud, each deployment runs as a single process, so HITL just works. If you ever scale a serve deployment horizontally yourself, you need sticky session routing so resume requests land on the same process that holds the interrupted session.
Interrupted sessions bypass TTL sweeps
The --ttl flag auto-sweeps idle and errored sessions, but not running or interrupted ones. A session that pauses on an interrupt and then gets abandoned will stay in memory until you explicitly delete it or restart the server. Build a client-side timeout if you expect users to walk away mid-approval.
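A client-side timeout can be sketched with asyncio.wait_for. The three callables are hypothetical hooks into your own client (collecting the user's reply, posting the resume, deleting the session), and the timeout values are only for demonstration.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def reply_or_abandon(
    get_user_reply: Callable[[], Awaitable[Any]],
    post_resume: Callable[[Any], Awaitable[None]],
    delete_session: Callable[[], Awaitable[None]],
    timeout_s: float,
) -> bool:
    """Resume with the user's reply, or delete the session if they walk away."""
    try:
        reply = await asyncio.wait_for(get_user_reply(), timeout=timeout_s)
    except asyncio.TimeoutError:
        # Nobody answered in time: free the interrupted session explicitly.
        await delete_session()
        return False
    await post_resume(reply)
    return True


events: list[tuple] = []


async def fast_user() -> str:
    return "yes"


async def absent_user() -> str:
    await asyncio.sleep(1)
    return "yes"


async def fake_resume(value: Any) -> None:
    events.append(("resume", value))


async def fake_delete() -> None:
    events.append(("delete",))


answered = asyncio.run(reply_or_abandon(fast_user, fake_resume, fake_delete, 0.5))
abandoned = asyncio.run(reply_or_abandon(absent_user, fake_resume, fake_delete, 0.01))
```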
Cancellation kills pending interrupts
If you DELETE the session while it is interrupted, the worker is killed and any pending await interrupt(...) inside the agent raises EOFError("Worker pipe closed"). The session transitions to error with a clean message: "Turn cancelled" on delete, "Agent timed out" on timeout. Full Python tracebacks only appear when the agent itself raises an unhandled exception.
Multiple concurrent interrupts are supported
An agent can have several interrupts pending at once (for example, via asyncio.gather). The session’s pending_interrupts is a dict and each must be resumed individually. The status only flips back to running once every pending interrupt has been resolved. The order does not matter.
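The behavior of several pending interrupts can be simulated with plain asyncio futures. This is a toy model, not the server's implementation: interrupt() and resume() are local stand-ins, and the ids are supplied by the caller here, whereas the real server assigns them.

```python
import asyncio
from typing import Any

pending_interrupts: dict[str, asyncio.Future] = {}


async def interrupt(payload: dict[str, Any]) -> Any:
    """Stand-in primitive: park a future under an id until resume() fills it."""
    future = asyncio.get_running_loop().create_future()
    pending_interrupts[payload["id"]] = future
    return await future


def resume(interrupt_id: str, value: Any) -> None:
    """Resolve one pending interrupt and drop it from the pending dict."""
    pending_interrupts.pop(interrupt_id).set_result(value)


async def main() -> list[Any]:
    # The agent raises two interrupts concurrently.
    agent = asyncio.gather(
        interrupt({"id": "a", "question": "Delete logs?"}),
        interrupt({"id": "b", "question": "Rotate keys?"}),
    )
    # Wait until both interrupts are registered as pending.
    while len(pending_interrupts) < 2:
        await asyncio.sleep(0)
    # Resume in reverse order; each reply still reaches the right awaiter.
    resume("b", "no")
    resume("a", "yes")
    return await agent


results = asyncio.run(main())
```

asyncio.gather returns results in the order the awaitables were passed, which is why the resume order has no effect on what the agent sees.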
Webhooks fire on turn completion, not on interrupts
If you register a webhook on the POST /messages request, it fires when the session reaches idle or error, not on each interrupt. If your orchestration depends on knowing the exact moment an interrupt arrives, use long polling instead.
Where to go next
Serving
Sessions API
Tools
How the @tool decorator and guardrails work under the hood.
