Agentic inference is exploding. Motus is an open-source agent serving framework that enables higher-capability, lower-cost, faster agents, and keeps deployment simple across local and cloud environments at any scale. You bring the agent; Motus runs it, serves it, and deploys it. Agents can come from any framework you already use, and Motus also ships with its own agent toolkit for writing production-ready agents in clean Python.

Set up

One command installs everything and teaches your coding agent how to use Motus.
1. Run the installer

curl -fsSL https://www.lithosai.com/motus/install.sh | sh
This installs the Motus CLI, the Python library, and adds Motus plugins to Claude Code, Codex, and Cursor.
2. Use it inside your coding agent

/motus               # activate Motus skills
/motus serve         # serve locally
/motus deploy        # ship to the cloud
Your coding agent now handles scaffolding, serving, and deploying for you. See the Plugin guide for the full list of commands.

Serve and deploy any agent

Motus serves agents from any of these. Bring what you already have.
  • Motus native ReActAgent and workflows
  • OpenAI Agents SDK
  • Anthropic SDK
  • Google ADK
  • Plain Python
See the Integrations section for how each framework plugs in, what Motus adds on top, and the minimal code change needed to switch over. Once you have an agent, one command exposes it as an HTTP API or ships it to production. The code is the same either way.
# Serve locally on your own machine
motus serve start myapp:agent --port 8000

# Chat with it
motus serve chat http://localhost:8000 "Hello!"
See Serving for session management, worker pools, and webhooks, and Deployment for the cloud workflow.

The Motus library

lithosai-motus is the Python package. Alongside the serving layer, it ships with an agent toolkit you can use to write agents in clean Python. Here is what you get.

Start simple

Agents

ReActAgent runs the reasoning loop and tool dispatch with multi-turn memory, structured output, guardrails, and usage tracking baked in. A working agent in under 10 lines.
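The shape of a ReAct-style loop can be sketched in plain Python. This is a conceptual illustration with a stubbed model, not Motus's actual ReActAgent implementation; `fake_model` and the tool table are hypothetical stand-ins.

```python
# Conceptual sketch of a ReAct-style loop: each turn, the model either
# requests a tool call or produces a final answer; tool results are fed
# back into the history. `fake_model` stands in for a real LLM call.

def fake_model(history):
    # Pretend the model asks for the weather tool once, then answers.
    if not any(m["role"] == "tool" for m in history):
        return {"tool": "get_weather", "args": {"city": "Paris"}}
    return {"answer": "It is sunny in Paris."}

TOOLS = {"get_weather": lambda city: f"sunny in {city}"}

def react_agent(question, model=fake_model, max_turns=5):
    history = [{"role": "user", "content": question}]
    for _ in range(max_turns):
        step = model(history)
        if "answer" in step:                          # model is done
            return step["answer"]
        result = TOOLS[step["tool"]](**step["args"])  # dispatch the tool
        history.append({"role": "tool", "content": result})
    raise RuntimeError("agent did not converge")
```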

Tools

Write a function, get a tool. Expose class methods with @tools, wrap an MCP server with get_mcp(), nest another agent with as_tool(). Built-in utilities: skills, bash, file ops, glob / grep, todo tracking.
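"Write a function, get a tool" generally means deriving a tool spec from the function's signature and docstring. A minimal sketch of that idea using only `inspect` (this mimics the pattern, not Motus's actual decorator):

```python
import inspect

# Derive a tool spec (name, description, parameter types) from a plain
# Python function, the way function-to-tool decorators typically work.

PY_TO_JSON = {int: "integer", float: "number", str: "string", bool: "boolean"}

def as_tool_spec(fn):
    sig = inspect.signature(fn)
    params = {
        name: {"type": PY_TO_JSON.get(p.annotation, "string")}
        for name, p in sig.parameters.items()
    }
    return {
        "name": fn.__name__,
        "description": (fn.__doc__ or "").strip(),
        "parameters": params,
    }

def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b

spec = as_tool_spec(add)
```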

Workflows

@agent_task turns plain Python functions into a parallel, resilient workflow. Motus infers the dependency graph from data flow, so you write normal Python and skip the DAG wiring entirely.
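Inferring a dependency graph from data flow can be sketched with lazy call objects: passing one task's result into another creates an edge, so independent tasks could run in parallel. The names below are illustrative, not Motus's `@agent_task` API.

```python
# Sketch of data-flow dependency inference: calling a task with another
# task's (lazy) result as an argument records a dependency edge.

class Lazy:
    def __init__(self, fn, args):
        self.fn, self.args = fn, args

    def deps(self):
        # Dependencies are exactly the lazy arguments.
        return [a for a in self.args if isinstance(a, Lazy)]

    def run(self):
        resolved = [a.run() if isinstance(a, Lazy) else a for a in self.args]
        return self.fn(*resolved)

def task(fn):
    # Calling a task builds a graph node instead of executing immediately.
    return lambda *args: Lazy(fn, args)

@task
def fetch(x: int):
    return x * 2

@task
def combine(a, b):
    return a + b

# combine depends on both fetch calls purely through data flow:
graph = combine(fetch(1), fetch(2))
```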

Models

Unified client for OpenAI, Anthropic, Gemini, and OpenRouter. Switch providers by changing one line. Local models (Ollama, vLLM) work through base_url.

Tracing and debugging

Every LLM call, tool invocation, and task dependency traced automatically. Interactive HTML viewer, Jaeger export, or cloud dashboard. Enabled with one env var.

Local serving

motus serve exposes any agent as a session-based HTTP API locally. Test the full serving stack before deploying to the cloud.

Go deeper

Memory

Basic append-only memory, or compaction memory that auto-summarizes when the token budget runs thin. Session save and restore built in.
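The compaction pattern can be sketched in a few lines: append freely, and when a rough token budget is exceeded, fold the oldest messages into a summary. The summarizer here is a trivial placeholder, not a real LLM call, and the class is illustrative rather than Motus's memory API.

```python
# Sketch of compaction memory: when the token estimate exceeds the
# budget, replace the two oldest messages with a summary stub.

class CompactionMemory:
    def __init__(self, token_budget=20):
        self.messages = []
        self.token_budget = token_budget

    def _tokens(self):
        # Crude token estimate: whitespace-separated words.
        return sum(len(m.split()) for m in self.messages)

    def append(self, message):
        self.messages.append(message)
        while self._tokens() > self.token_budget and len(self.messages) > 2:
            summary = f"[summary of {len(self.messages[:2])} messages]"
            self.messages = [summary] + self.messages[2:]
```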

Guardrails

Input and output validation on both agents and individual tools. Return a dict to modify, raise to block. Structured output guardrails match Pydantic fields.
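The "return a dict to modify, raise to block" contract can be shown with a toy guardrail. Names here are illustrative, not Motus's guardrail API:

```python
# Sketch of the guardrail contract: a guard either raises to block the
# call, or returns a (possibly modified) payload to let it proceed.

class Blocked(Exception):
    pass

def profanity_guard(payload: dict) -> dict:
    text = payload["text"]
    if "forbidden" in text:
        raise Blocked("input rejected")
    # Returning a modified dict rewrites the input before the call runs.
    return {**payload, "text": text.strip()}

def run_with_guardrail(guard, payload, handler):
    payload = guard(payload)          # may raise to block
    return handler(payload)

result = run_with_guardrail(
    profanity_guard, {"text": "  hello  "}, lambda p: p["text"].upper()
)
```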

Multi-agent composition

agent.as_tool() wraps any agent as a tool. The supervisor does not know it is calling another agent. fork() creates independent conversation branches.
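Both ideas can be sketched with a toy agent: wrapping exposes a plain callable (the supervisor never sees an agent), and forking deep-copies the conversation so branches stay independent. Illustrative only; Motus's `as_tool()`/`fork()` may differ.

```python
import copy

# Sketch of multi-agent composition: an agent wrapped as a plain
# function, and an independent conversation branch via deep copy.

class Agent:
    def __init__(self, name):
        self.name = name
        self.history = []

    def run(self, prompt: str) -> str:
        self.history.append(prompt)
        return f"{self.name}: {prompt}"

    def as_tool(self):
        # The caller just sees a function, not another agent.
        return lambda prompt: self.run(prompt)

    def fork(self):
        return copy.deepcopy(self)

researcher = Agent("researcher")
tool = researcher.as_tool()
answer = tool("find sources")

branch = researcher.fork()
branch.run("branch-only turn")   # does not touch the original history
```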

MCP integration

Connect any MCP-compatible server with get_mcp(). Local via stdio, remote via HTTP, or inside a Docker container. Filter and rename tools with prefix, blocklist, and guardrails.

Docker sandboxes

Run untrusted code in isolated containers. Mount volumes, expose ports, execute shell and Python. Attach to any agent as a tool provider.

Prompt caching

Prompt caching via CachePolicy. STATIC covers system and tools, AUTO adds the conversation prefix. Cut latency and cost on long conversations.
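Prefix caching generally splits a request into a cacheable prefix and a dynamic tail. A hypothetical sketch of the STATIC vs AUTO distinction described above (not Motus's CachePolicy implementation):

```python
# Sketch of prefix-based prompt caching: STATIC caches only the system
# prompt and tool definitions; AUTO extends the cached span to cover the
# shared conversation prefix, leaving just the newest turn dynamic.

def cache_split(messages, policy="STATIC"):
    # Messages are (role, content) pairs; system/tool entries lead.
    static = [m for m in messages if m[0] in ("system", "tool_def")]
    rest = [m for m in messages if m[0] not in ("system", "tool_def")]
    if policy == "AUTO" and len(rest) > 1:
        return static + rest[:-1], rest[-1:]
    return static, rest

msgs = [("system", "You are helpful"), ("tool_def", "search"),
        ("user", "hi"), ("assistant", "hello"), ("user", "again")]
static_part, dynamic_part = cache_split(msgs, policy="AUTO")
```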

Human in the loop

Pause an agent mid-turn, ask the user for approval or clarification, then resume from exactly where you left off.
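The pause-and-resume pattern maps naturally onto a Python generator: `yield` is the suspension point, and `send()` resumes the turn with the human's answer. An illustrative pattern, not Motus's actual API:

```python
# Sketch of human-in-the-loop: the agent yields a question mid-turn and
# resumes at exactly that point once the human answers.

def agent_turn(task: str):
    plan = f"plan for {task}"
    approved = yield {"ask": f"Approve: {plan}?"}   # pause here
    if not approved:
        return "aborted"
    return f"executed {plan}"

def run_with_human(task, answer_fn):
    gen = agent_turn(task)
    request = next(gen)                  # run until the agent pauses
    try:
        gen.send(answer_fn(request))     # resume with the human's answer
    except StopIteration as done:
        return done.value

result = run_with_human("deploy", lambda req: True)
```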

Lifecycle hooks

Three level hook system (global, per task name, per task type). Tap into task_start, task_end, task_error for logging, metrics, or custom logic.
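A three-level dispatch like this can be sketched as a registry keyed by scope, where every matching hook fires for each event. The names below are illustrative, not Motus's hook API.

```python
# Sketch of three-level hooks: global, per task name, and per task type
# all fire for a matching event such as task_start or task_end.

hooks = {"global": [], "by_name": {}, "by_type": {}}
events = []

def fire(event, task_name, task_type):
    for h in hooks["global"]:
        h(event, task_name)
    for h in hooks["by_name"].get(task_name, []):
        h(event, task_name)
    for h in hooks["by_type"].get(task_type, []):
        h(event, task_name)

hooks["global"].append(lambda ev, name: events.append(("global", ev, name)))
hooks["by_name"].setdefault("fetch", []).append(
    lambda ev, name: events.append(("name", ev, name)))
hooks["by_type"].setdefault("io", []).append(
    lambda ev, name: events.append(("type", ev, name)))

fire("task_start", "fetch", "io")
fire("task_end", "fetch", "io")
```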

Cloud deployment

motus deploy ships your agent to Motus Cloud with one command. No Dockerfiles, no Kubernetes, no infra code.

SDK compatibility

A drop-in replacement for OpenAI Agents SDK, Anthropic SDK, and Google ADK. Change the import, keep your code.
This is a slice of what ships with Motus. Browse the rest of the docs to find what fits your use case.

Learn more

Quickstart

Your first agent running in under 5 minutes.

Architecture Overview

How agents, tools, models, memory, runtime, and serving fit together.

Examples

Runnable demos covering runtime patterns, MCP, multi-agent bots, and more.

Contributing

Dev environment, tests, and how to send your first PR.