Some agent actions should not happen without a human saying yes. Deleting files, sending money, posting on someone’s behalf, or making an irreversible API call are all moments when you want a real person in the loop. Other times the agent simply does not have enough information to proceed and needs to ask a clarifying question before going further. Motus has first-class support for both. An agent running inside motus serve can pause itself, send a payload to the parent server, wait for a reply, and then continue from exactly where it stopped. The session API exposes this as a state called interrupted, and clients drive it back to running with POST /sessions/{id}/resume.

Pick your approach

Block dangerous tools

Require user approval before specific tools run. One decorator flag does it.

Ask the user a question

Let the model present 1 to 4 questions with predefined options.

Build your own flow

Drop down to the interrupt() primitive for any custom payload shape.

Try it in 30 seconds

The fastest way to see HITL in action is the bundled example agent and the reference CLI client.
# Terminal 1: serve the example agent
motus serve start examples.serving.hitl_agent:agent --port 8000

# Terminal 2: chat with it
motus serve chat http://localhost:8000
Try saying “delete the old logs file” (triggers an approval gate) or “help me organize my downloads” (triggers a clarifying question). The chat client polls for status, prints any interrupts as they arrive, prompts you in the terminal, posts the resume, and keeps going until the turn finishes. Look at src/motus/serve/cli.py for the full implementation if you want to use it as a template for your own UI.

How it works

Every message you send to a session spawns a fresh worker subprocess. The worker runs your agent, and your agent can call interrupt() from anywhere inside its execution. When that happens:
1. The worker pauses

Your agent calls await interrupt(payload). The worker sends the payload to the parent server over a pipe and the agent’s coroutine blocks on a future, waiting for a reply.

2. The session enters the interrupted state

The server stores the payload under a freshly generated interrupt_id, adds it to the session’s pending interrupts, and flips the session status from running to interrupted. Any client doing a long poll on GET /sessions/{id} wakes up immediately and sees the new state. The interrupt_id is what the client will echo back when it posts the resume.

3. The client shows the payload to the user and collects a reply

Your frontend (or CLI, or Slack bot) reads the payload, presents whatever UI makes sense, and gathers the user’s response.

4. The client posts a resume

A POST /sessions/{id}/resume with the interrupt_id and a value ships the user’s reply back through the server, into the worker, and into the agent’s awaiting future. The agent picks up exactly where it left off.

5. The session goes back to running

Once all pending interrupts are resolved, the status flips back to running and the agent finishes its turn normally. If the agent triggers another interrupt later in the same turn, the cycle repeats.
Your local setup and Motus Cloud behave identically. The same AgentServer runs in both environments, so the REST API, session lifecycle, and wire protocol are the same.
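The five steps above can be sketched from the client's side as a small polling loop. This is a minimal sketch rather than the reference client: the transport is injected as plain callables so the loop runs without a server, and the response shape it reads (a status string plus an interrupts list whose entries carry interrupt_id and payload) is an assumption. Check the Sessions API reference for the real field names.

```python
from typing import Any, Callable

# Assumed response shape (not confirmed by this page):
# {"status": "...", "interrupts": [{"interrupt_id": "...", "payload": {...}}]}
Session = dict[str, Any]


def drive_turn(
    get_session: Callable[[], Session],       # hypothetical: wraps GET /sessions/{id}
    post_resume: Callable[[str, Any], None],  # hypothetical: wraps POST /sessions/{id}/resume
    ask_user: Callable[[dict], Any],          # shows a payload, returns the user's reply
    max_polls: int = 100,
) -> Session:
    """Poll until the turn ends, resuming every pending interrupt on the way."""
    answered: set[str] = set()
    for _ in range(max_polls):
        session = get_session()
        if session["status"] in ("idle", "error"):
            return session
        if session["status"] == "interrupted":
            for item in session.get("interrupts", []):
                interrupt_id = item["interrupt_id"]
                if interrupt_id in answered:
                    continue  # never resume the same interrupt twice
                post_resume(interrupt_id, ask_user(item["payload"]))
                answered.add(interrupt_id)
    raise TimeoutError("turn did not finish within max_polls polls")
```

In a real client, get_session would be a long poll and ask_user would render UI. Keeping them as parameters makes the loop easy to exercise against canned responses.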

Three ways to pause an agent

Tool approval gates

The simplest case: you have a tool that should never run without explicit user approval. Add requires_approval=True to the @tool decorator and Motus does the rest.
import os

from motus.tools import tool

@tool(requires_approval=True)
async def delete_file(path: str) -> str:
    """Delete a file at the given path."""
    os.remove(path)
    return f"Deleted {path}"
When the agent decides to call delete_file, Motus pauses the worker and emits an interrupt with this shape:
{
  "type": "tool_approval",
  "tool_name": "delete_file",
  "tool_args": { "path": "/tmp/old_logs.txt" }
}
Your client shows the user what is about to happen, collects a yes or no, and posts back:
{
  "interrupt_id": "<uuid from the interrupt>",
  "value": { "approved": true }
}
If approved is true, the tool runs normally. If it is false or missing, Motus raises ToolRejected; the agent sees it as a regular tool error ({"error": "User rejected delete_file"}) and can try a different approach, ask the user what to do instead, or give up gracefully. The model stays in control of recovery.
Under the hood, requires_approval=True prepends an auto-generated input guardrail to the tool’s guardrail chain. The approval check runs before any guardrails you defined yourself, and you never write interrupt logic by hand.
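On the client side, handling a tool_approval interrupt comes down to two small steps: describe the pending call, then turn the user's answer into a resume body. The helpers below are a hypothetical sketch (describe_approval and approval_resume_body are not Motus functions). They rely only on the payload and reply shapes shown above.

```python
import json


def describe_approval(payload: dict) -> str:
    """Render a tool_approval payload as a one-line confirmation prompt."""
    if payload.get("type") != "tool_approval":
        raise ValueError(f"not a tool_approval payload: {payload.get('type')!r}")
    args = json.dumps(payload.get("tool_args", {}), sort_keys=True)
    return f"Allow {payload['tool_name']} with arguments {args}? [y/N]"


def approval_resume_body(interrupt_id: str, answer: str) -> dict:
    """Build the resume body; anything but an explicit yes counts as a rejection."""
    approved = answer.strip().lower() in {"y", "yes"}
    return {"interrupt_id": interrupt_id, "value": {"approved": approved}}
```

Defaulting to rejection on empty or unrecognized input is a design choice for this sketch. It matches the behavior described above, where a missing approved value is treated the same as false.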

Structured questions with ask_user_question

Sometimes the agent does not need approval. It needs information. Maybe it does not know which file to edit, or which date range to pull data for, or whether the user wants the long answer or the short one. The ask_user_question builtin tool lets the model ask structured questions with predefined options.
from motus.agent import ReActAgent
from motus.models import OpenRouterChatClient
from motus.tools import tool
from motus.tools.builtins.ask_user import ask_user_question

@tool
async def organize_files(directory: str, strategy: str) -> str:
    """Organize files in a directory using the chosen strategy."""
    return f"Organized {directory} by {strategy}"

agent = ReActAgent(
    client=OpenRouterChatClient(),
    model_name="anthropic/claude-sonnet-4",
    tools=[organize_files, ask_user_question],
)
Serve the agent, then send a vague prompt like “help me clean up my downloads folder”. The model decides it does not have enough to run organize_files, calls ask_user_question instead, and the worker emits this interrupt payload:
{
  "type": "user_input",
  "questions": [
    {
      "question": "How would you like to organize the files?",
      "header": "Strategy",
      "multiSelect": false,
      "options": [
        { "label": "By date", "description": "Group files by year and month" },
        { "label": "By type", "description": "Group files by extension" },
        { "label": "By size", "description": "Move large files into a separate folder" }
      ]
    }
  ]
}
Your frontend renders the question, shows the options as buttons or a dropdown, and (by convention) appends a free-text “Other” input so the user can type something the model did not anticipate. The reply has this shape:
{
  "interrupt_id": "<uuid>",
  "value": {
    "answers": {
      "How would you like to organize the files?": "By date"
    }
  }
}
The whole {"answers": {...}} dict is what the ask_user_question tool returns back to the model (serialized as JSON, like any other tool result), so the agent can continue reasoning with the user’s choice in hand.
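A frontend has to map what the user picked back onto the question text, since the answers dict is keyed by the full question string. The helper below is a hypothetical sketch of that mapping for single-select questions. This page does not show how multiSelect answers are encoded, so the sketch leaves them out.

```python
def answers_resume_body(interrupt_id: str, payload: dict, choices: list[str]) -> dict:
    """Pair each question with the user's chosen label (or free text), in order."""
    questions = payload["questions"]
    if len(choices) != len(questions):
        raise ValueError("need exactly one choice per question")
    # Keys are the full question strings, matching the reply shape shown above.
    answers = {q["question"]: choice for q, choice in zip(questions, choices)}
    return {"interrupt_id": interrupt_id, "value": {"answers": answers}}
```

Because a choice may be free text from the “Other” input, the helper does not check it against the option labels.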

Schema reference

The ask_user_question tool validates inputs against this Pydantic schema:
| Field | Type | Validation | Notes |
| --- | --- | --- | --- |
| questions | list | required, 1 to 4 items (enforced) | Top-level array. |
| questions[].question | string | required | Full question text. End with ? by convention (not enforced). |
| questions[].header | string | required | Short chip label for the UI. Keep it under ~12 chars (not enforced). |
| questions[].multiSelect | bool | optional, default false | Set to true for non-mutually-exclusive choices. |
| questions[].options | list | required, 2 to 4 items (enforced) | The list of choices. |
| questions[].options[].label | string | required | Display text, 1 to 5 words by convention. |
| questions[].options[].description | string | required | Explanation of this option. |
| questions[].options[].markdown | string or null | optional | A code snippet or technical preview the frontend can render in a monospace box when the user hovers or focuses this option. Use it when an option is best explained by showing the actual text or code it would produce. |
The reference CLI client (motus serve chat) automatically appends an “Other” free-text input. If you build your own frontend, do the same so users are not boxed in by the model’s options.
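If your frontend receives payloads from sources you do not fully trust, it can help to re-check the enforced limits before rendering. The function below is an illustrative plain-Python check written from the table above. It is not the Pydantic schema Motus uses, and validate_questions is a made-up name.

```python
def validate_questions(questions: list[dict]) -> list[str]:
    """Return a list of problems; an empty list means the enforced limits hold."""
    problems: list[str] = []
    if not 1 <= len(questions) <= 4:
        problems.append(f"expected 1 to 4 questions, got {len(questions)}")
    for i, question in enumerate(questions):
        for field in ("question", "header", "options"):
            if field not in question:
                problems.append(f"questions[{i}] is missing {field!r}")
        options = question.get("options", [])
        if "options" in question and not 2 <= len(options) <= 4:
            problems.append(f"questions[{i}] needs 2 to 4 options, got {len(options)}")
        for j, option in enumerate(options):
            for field in ("label", "description"):
                if field not in option:
                    problems.append(f"questions[{i}].options[{j}] is missing {field!r}")
    return problems
```

Only the rules marked required or enforced are checked. Conventions such as header length or the trailing question mark are left alone.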

Custom interrupts with the interrupt() primitive

If neither pattern fits, drop into the primitive directly. interrupt() accepts any dict and returns whatever value the client posts back. This is what you reach for when you want to ask for free-form text input, present a custom UI, or build your own elicitation pattern. The example below uses the plain (message, state) -> (response, new_state) callable shape that motus serve accepts. See the Serving guide for the full set of agent shapes.
from motus.models import ChatMessage
from motus.serve.interrupt import interrupt

async def my_agent(message: ChatMessage, state: list[ChatMessage]):
    user_choice = await interrupt({
        "type": "color_picker",
        "prompt": "Pick a brand color for the export",
        "presets": ["#FF6B6B", "#4ECDC4", "#FFE66D"]
    })

    color = user_choice.get("hex", "#000000")
    response = ChatMessage.assistant_message(content=f"Using {color} for the export.")
    return response, state + [message, response]
Then serve it like any other agent:
motus serve start myapp:my_agent --port 8000
The type field is a convention, not enforced. Pick any string your client knows how to handle. The framework’s built-in interrupts use tool_approval and user_input. If you want the reference CLI client (motus serve chat) to render your custom interrupts, reuse one of those strings. Otherwise, the CLI prints [warn] unknown interrupt type on every poll and never resumes the interrupt, so the session stays wedged until you build your own client.
interrupt() only works inside a motus serve worker subprocess. Calling it from a unit test or standalone script raises RuntimeError("interrupt() called outside motus serve worker subprocess"). To test agents that use HITL, spin up a real serve process in your test fixture (see tests/integration/serve/test_hitl.py for the pattern).
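Since the type field is only a convention, a custom client needs its own way to route each payload to the right UI. One possible approach is a small handler registry keyed on type, sketched below. The registry, the handles decorator, and the stand-in color_picker handler are all hypothetical and exist only for illustration.

```python
from typing import Any, Callable

Handler = Callable[[dict], Any]
HANDLERS: dict[str, Handler] = {}


def handles(interrupt_type: str) -> Callable[[Handler], Handler]:
    """Register a function as the handler for one interrupt type."""
    def register(fn: Handler) -> Handler:
        HANDLERS[interrupt_type] = fn
        return fn
    return register


@handles("color_picker")
def pick_color(payload: dict) -> dict:
    # A real UI would show a picker; this stand-in takes the first preset.
    return {"hex": payload["presets"][0]}


def resolve(payload: dict) -> Any:
    """Return the resume value for a payload, or fail loudly on unknown types."""
    handler = HANDLERS.get(payload.get("type"))
    if handler is None:
        raise LookupError(f"no handler for interrupt type {payload.get('type')!r}")
    return handler(payload)
```

Raising on an unknown type, rather than skipping it, surfaces the problem immediately instead of leaving the session waiting on a resume that never comes.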

Session state machine

The session status tells your client what to do next.
| Status | What it means | What you can do |
| --- | --- | --- |
| idle | Waiting for input. Initial state after creation. | Send a message with POST /messages. |
| running | The worker is executing the agent. | Long poll GET /sessions/{id}?wait=true. |
| interrupted | The agent paused and is waiting for one or more resumes. | Read interrupts, present them to the user, and send POST /resume for each. |
| error | The worker failed (exception, timeout, cancellation, or crash). The error field has the message. | Read the error, optionally send a new message to retry. |
Transitions:
idle ──POST /messages──▶ running ──┬──▶ idle          (turn completed)
                                   ├──▶ interrupted   (agent called interrupt)
                                   └──▶ error         (worker failed)

interrupted ──POST /resume──▶ running   (only after ALL pending interrupts are resolved)
You cannot send a new message while the session is running or interrupted. The server returns 409 Conflict. Either wait for the turn to finish, post a resume, or DELETE the session to start over.
For the exact request and response shapes on every endpoint, see the Sessions API reference.
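The transitions above fit in a small lookup table, which can be handy for client-side tests or for deciding which buttons to enable. This is a sketch only. The event names are invented for illustration, and the error to running entry is inferred from the table's note that you can send a new message to retry.

```python
# (current status, event) -> next status. Event names are hypothetical labels.
TRANSITIONS = {
    ("idle", "message"): "running",
    ("running", "completed"): "idle",
    ("running", "interrupt"): "interrupted",
    ("running", "failed"): "error",
    ("interrupted", "all_resumed"): "running",
    ("error", "message"): "running",  # inferred: retry by sending a new message
}


def step(status: str, event: str) -> str:
    """Return the next status, or raise if the event is not allowed right now."""
    try:
        return TRANSITIONS[(status, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not allowed while {status}") from None
```

The missing ("running", "message") and ("interrupted", "message") entries correspond to the cases where the server answers 409 Conflict.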

Common errors

RuntimeError("interrupt() called outside motus serve worker subprocess") means you called interrupt() from a unit test, REPL, or some other context that is not a serve worker. The primitive only works inside a process spawned by AgentServer. To test, use a real serve process in your fixture (see tests/integration/serve/test_hitl.py).
Each interrupt payload is pickled before being sent across the worker pipe, and Motus enforces a hard limit of 16 KiB on outbound interrupts, raising ValueError when a payload exceeds it. The ValueError is raised inside interrupt() itself, so it propagates up through your agent code like any other exception. If the agent does not catch it, the worker returns an error traceback and the session transitions to error. If you need to ship something bigger (a screenshot, a large file), upload it to object storage first and pass a URL through the interrupt instead. The limit only applies to interrupts going out from the worker; resume values posted in by the client are not size checked.
The interrupt_id you sent does not map to a live pending interrupt. Common causes: the interrupt was already resumed (resumes are not idempotent, so the second call gets a 404), the session was deleted or timed out, you raced a resume against the worker tearing down (the server replies with "Session not actively waiting for resume"), or there is a typo in the id. Use the exact id from the most recent poll response and avoid double-resuming.
If a session stays in interrupted and never returns to running, your client is probably ignoring an interrupt. Every interrupt must be resumed before the worker can continue. Make sure your polling loop walks every entry in the interrupts array and posts a resume for each one. The CLI client (motus serve chat) only handles tool_approval and user_input. Custom interrupt types make it print [warn] unknown interrupt type on every poll without ever resuming, so the session stays wedged. If you use custom types, build your own client.
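For the 16 KiB limit on outbound interrupts described above, a cheap guard is to measure the payload before calling interrupt() and fall back to a URL when it is too large. The sketch below assumes the limit applies to the length of pickle.dumps(payload) with the default protocol. How Motus actually measures the payload is not stated here, so treat the check as approximate and leave some headroom.

```python
import pickle

MAX_INTERRUPT_BYTES = 16 * 1024  # 16 KiB, the limit described above


def fits_interrupt_limit(payload: dict, limit: int = MAX_INTERRUPT_BYTES) -> bool:
    """Estimate whether a payload fits, using its pickled size (an assumption)."""
    return len(pickle.dumps(payload)) <= limit


def shrink_for_interrupt(payload: dict, url: str) -> dict:
    """Return the payload itself if it fits, else a small reference to `url`.

    `url` is wherever you uploaded the large content; the "content_url" key is
    a hypothetical convention your client would need to understand.
    """
    if fits_interrupt_limit(payload):
        return payload
    return {"type": payload.get("type", "custom"), "content_url": url}
```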

Things to know

A motus serve process holds all sessions in a dict that does not survive restarts. In Motus Cloud, each deployment runs as a single process, so HITL just works. If you ever scale a serve deployment horizontally yourself, you need sticky session routing so resume requests land on the same process that holds the interrupted session.
The server’s --ttl flag auto-sweeps idle and errored sessions, but not running or interrupted ones. A session that pauses on an interrupt and then gets abandoned will stay in memory until you explicitly delete it or restart the server. Build a client-side timeout if you expect users to walk away mid-approval.
If the turn times out or you DELETE the session while it is interrupted, the worker is killed and any pending await interrupt(...) inside the agent raises EOFError("Worker pipe closed"). The session transitions to error with a clean message: "Turn cancelled" on delete, "Agent timed out" on timeout. Full Python tracebacks only appear when the agent itself raises an unhandled exception.
An agent can fire multiple interrupts at once (for example with asyncio.gather). The session’s pending_interrupts is a dict, and each entry must be resumed individually. The status only flips back to running once every pending interrupt has been resolved. The order in which you resume them does not matter.
If you configure a webhook in your POST /messages request, it fires when the session reaches idle or error, not on each interrupt. If your orchestration depends on knowing the exact moment an interrupt arrives, use long polling instead.
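The concurrent-interrupt behavior described above (several interrupts pending at once, resumed in any order) can be illustrated with a toy model. The class below is a stand-in for the server-side bookkeeping, written purely for illustration. It is not the Motus implementation, and its interrupt method only mimics the real primitive, which works solely inside a serve worker.

```python
import asyncio
import uuid


class PendingInterrupts:
    """Toy model: one future per pending interrupt, keyed by interrupt_id."""

    def __init__(self) -> None:
        self.pending: dict = {}

    @property
    def status(self) -> str:
        return "interrupted" if self.pending else "running"

    async def interrupt(self, payload: dict):
        interrupt_id = str(uuid.uuid4())
        future = asyncio.get_running_loop().create_future()
        self.pending[interrupt_id] = (payload, future)
        return await future  # blocks until resume() supplies a value

    def resume(self, interrupt_id: str, value) -> None:
        _payload, future = self.pending.pop(interrupt_id)
        future.set_result(value)


async def demo():
    session = PendingInterrupts()
    both = asyncio.gather(
        session.interrupt({"type": "tool_approval", "tool_name": "a"}),
        session.interrupt({"type": "tool_approval", "tool_name": "b"}),
    )
    await asyncio.sleep(0)  # let both coroutines register their interrupts
    statuses = [session.status]
    for interrupt_id in reversed(list(session.pending)):  # resume out of order
        payload, _future = session.pending[interrupt_id]
        session.resume(interrupt_id, {"approved": payload["tool_name"] == "a"})
        statuses.append(session.status)
    return statuses, await both
```

Running asyncio.run(demo()) shows the status staying interrupted after the first resume and flipping to running only after the second. The gathered results come back in call order, regardless of the order the resumes were posted in.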

Where to go next

Serving

The full serving guide covers session lifecycle, agent types, and the Python API.

Sessions API

REST reference for creating, polling, and managing sessions.

Tools

Learn how the @tool decorator and guardrails work under the hood.

Guardrails

Validate and transform agent inputs and outputs without touching your agent logic.