Serve agents over HTTP with session-based conversations and per-request process isolation, all from a single command.

`motus serve` exposes any agent as a self-managed HTTP server with session-based conversations. Each message spawns a fresh worker subprocess, so the agent runs in complete isolation: no state is shared between requests, and a crash in one turn never affects another.
Every agent type follows the same turn contract: receive a ChatMessage and the session’s prior state, return a response ChatMessage and updated state. All agent types run in worker subprocesses and must be importable from the module level.
| Parameter | Type | Description |
| --- | --- | --- |
| `message` | `ChatMessage` | The new user message (constructed by the server from the HTTP request). |
| `state` | `list[ChatMessage]` | The session's state from the previous turn (empty list on the first turn). |
Return value: `tuple[ChatMessage, list[ChatMessage]]`, containing the response message (surfaced to the HTTP client) and the updated state (stored in the session). The agent owns the state and can append to, compact, or restructure it freely.
### ServableAgent

Any object with a conforming `run_turn` method can be served directly. `ServableAgent` is a runtime-checkable Protocol, so inheritance is not required. Built-in implementations include `AgentBase` and all of its subclasses (such as `ReActAgent`).
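As a minimal sketch, an object satisfying the protocol might look like the following. The `run_turn` signature is assumed here to mirror the turn contract above, and the `ChatMessage` stand-in is purely illustrative (the real class lives in `motus.models.base`):

```python
from dataclasses import dataclass

# Illustrative stand-in for motus.models.base.ChatMessage (an assumption, not the real class).
@dataclass
class ChatMessage:
    role: str
    content: str

class EchoAgent:
    """Conforms to the assumed ServableAgent protocol: no inheritance, just a run_turn method."""

    def run_turn(
        self, message: ChatMessage, state: list[ChatMessage]
    ) -> tuple[ChatMessage, list[ChatMessage]]:
        response = ChatMessage(role="assistant", content=f"echo: {message.content}")
        # The agent owns the state: here it simply appends both turns.
        return response, state + [message, response]

agent = EchoAgent()
reply, new_state = agent.run_turn(ChatMessage("user", "hi"), [])
```

Because the protocol is runtime-checkable, any such object can be passed to `motus serve start` without subclassing anything.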
### Google ADK

Google ADK agents are supported via `motus.google_adk.agents.Agent`, a subclass of the ADK `Agent` that implements `ServableAgent`. Session history is replayed automatically each turn. Requires the optional `google-adk` dependency.

```python
from motus.google_adk.agents.llm_agent import Agent

agent = Agent(
    model="gemini-2.0-flash",
    name="my_agent",
    instruction="You are a helpful assistant.",
)
```

```shell
motus serve start myapp:agent
```
### Anthropic SDK

Anthropic SDK tool runners are supported via `motus.anthropic.ToolRunner`. Define tools with the `@beta_async_tool` decorator and pass the runner directly; a fresh runner is created per turn. Requires `anthropic>=0.49.0`. Pass `max_iterations` to limit the tool-use loop.

```python
from motus.anthropic import ToolRunner, beta_async_tool

@beta_async_tool
async def get_weather(city: str) -> str:
    """Get the weather for a city."""
    return f"Sunny in {city}"

runner = ToolRunner(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=[get_weather],
    system="You are a helpful assistant.",
)
```

```shell
motus serve start myapp:runner
```
### OpenAI Agents SDK

OpenAI Agents SDK agents are supported via auto-detection; no adapter import is needed. Guardrail tripwire exceptions are caught and returned as refusal messages, and structured output is serialized to JSON. Requires the optional `openai-agents` dependency.

```python
from agents import Agent

agent = Agent(
    name="my_agent",
    instructions="You are a helpful assistant.",
)
```

```shell
motus serve start myapp:agent
```
### Callable function

Plain functions with the signature `(message, state) -> (response, state)` are supported. Both sync and async functions work:

```python
from motus.models.base import ChatMessage

# Sync
def my_agent(message, state):
    response = ChatMessage.assistant_message(content="hello")
    return response, state + [message, response]

# Async
async def my_agent(message, state):
    result = await some_api_call(message.content)
    response = ChatMessage.assistant_message(content=result)
    return response, state + [message, response]
```
A session in the error state can still receive new messages, and a successful turn transitions it back to running. Sessions are held in memory and do not persist across server restarts.
When --ttl is set, idle and errored sessions whose last activity exceeds the TTL are swept by a background task. When --timeout is set, agent turns that exceed the limit are killed and the session transitions to error with an "Agent timed out" message.
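To make the TTL sweep concrete, here is an illustrative sketch of how such a background task could work. This is not the motus implementation; the session dictionary and field names are assumptions chosen for the example:

```python
import asyncio
import time

# Illustrative in-memory session table (an assumption, not motus internals).
sessions = {
    "a": {"status": "idle", "last_activity": time.monotonic() - 120},
    "b": {"status": "running", "last_activity": time.monotonic() - 120},
    "c": {"status": "error", "last_activity": time.monotonic()},
}

async def sweep(ttl: float, interval: float, rounds: int) -> None:
    """Periodically drop idle/errored sessions whose last activity exceeds the TTL."""
    for _ in range(rounds):
        now = time.monotonic()
        for sid, s in list(sessions.items()):
            # Only idle and errored sessions are eligible; running turns are left alone.
            if s["status"] in ("idle", "error") and now - s["last_activity"] > ttl:
                del sessions[sid]
        await asyncio.sleep(interval)

asyncio.run(sweep(ttl=60, interval=0.01, rounds=2))
```

After the sweep, the long-idle session is gone, while the running session and the recently errored one survive.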
Each message spawns a fresh subprocess via `multiprocessing.Process` with pipe-based IPC. An `asyncio.Semaphore` limits concurrency to `max_workers`. Processes are not reused: each one starts, runs the agent function, sends the result over the pipe, and exits. On timeout or cancellation, the process is killed immediately. This subprocess isolation model means:

- A crash in one agent turn never affects other sessions or the server itself.
- No shared state leaks between requests.
- Resource cleanup is automatic: when the process exits, all memory is reclaimed.
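The per-turn worker pattern described above can be sketched as follows. This is a simplified illustration, not the motus implementation; all names are hypothetical, and the `fork` start context is chosen so the agent function need not be pickled:

```python
import multiprocessing as mp

def _worker(fn, message, state, conn):
    # Child process: run one agent turn and send the result back over the pipe.
    conn.send(fn(message, state))
    conn.close()

def run_turn_isolated(fn, message, state, timeout=5.0):
    """Run one agent turn in a fresh subprocess; kill it if it exceeds the timeout."""
    ctx = mp.get_context("fork")  # fork: the child inherits fn without pickling
    parent, child = ctx.Pipe()
    proc = ctx.Process(target=_worker, args=(fn, message, state, child))
    proc.start()
    if parent.poll(timeout):
        result = parent.recv()
    else:
        proc.kill()  # timeout: the worker is killed immediately
        result = None
    proc.join()  # the process always exits; nothing is reused
    return result

def echo_agent(message, state):
    return f"echo: {message}", state + [message]

response, new_state = run_turn_isolated(echo_agent, "hi", [])
```

A real server would additionally wrap `run_turn_isolated` in an `asyncio.Semaphore` and run it off the event loop (e.g. via `asyncio.to_thread`) to cap concurrency at `max_workers`.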
```shell
# Interactive REPL (creates and cleans up a session automatically)
motus serve chat http://localhost:8000

# Single message
motus serve chat http://localhost:8000 "hello"

# Keep the session after exit for later resumption
motus serve chat http://localhost:8000 --keep

# Resume an existing session
motus serve chat http://localhost:8000 --session 550e8400-e29b-41d4-a716-446655440000
```
| Flag | Default | Description |
| --- | --- | --- |
| `--session` | — | Resume an existing session instead of creating a new one. |
| `--keep` | `false` | Preserve the session on exit (prints the session ID). |
| `--param` | — | `KEY=VALUE` per-request parameter passed as `user_params` (repeatable). |