Any time an agent runs a shell command or executes Python, the interesting question is where. Your laptop is fine for prototyping and wrong for almost everything else: a stray rm -rf, a runaway pip install, or a model that decides to curl a dubious URL reaches straight into your machine. A sandbox is the boundary that prevents that.
Sandbox is an abstract execution environment. You create one, run commands in it, move files in and out, and close it when you are done. Because the interface is a clean abstract class, a sandbox can be backed by anything: a local Docker container, a remote machine, a microVM, a serverless runtime, your own in-house isolation technology.
Motus ships two reference backends so you do not have to start from scratch:
  • DockerSandbox runs work inside a local Docker container. Great for local development and self-hosted deployments.
  • CloudSandbox talks to a remote sandbox over a REST API. Used automatically when your agent runs on Motus Cloud.
The same get_sandbox(...) call works in both places. You get a DockerSandbox on your laptop and a CloudSandbox when the deploy target is Motus Cloud. If neither fits, implement the Sandbox interface yourself and callers of get_sandbox() keep working unchanged.
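A custom backend mostly comes down to one method. The sketch below is not Motus's actual Sandbox class: MiniSandbox and SubprocessSandbox are hypothetical stand-ins, written only from the method list described on this page, to show how the sh and python shortcuts can be layered on a single exec(). The real abstract class may require more (put, get, endpoint, async close), so check its definition before subclassing.

```python
from __future__ import annotations

import abc
import asyncio
import os
import sys


class MiniSandbox(abc.ABC):
    """Hypothetical stand-in for the Sandbox interface (not the Motus class)."""

    @abc.abstractmethod
    async def exec(
        self,
        *cmd: str,
        input: str | None = None,
        cwd: str | None = None,
        env: dict[str, str] | None = None,
    ) -> str:
        """Run a command and return its output as a string."""

    async def sh(self, command: str) -> str:
        # Shortcut layered on exec(), mirroring the description on this page.
        return await self.exec("sh", "-c", command)

    async def python(self, script: str) -> str:
        return await self.exec("python3", "-c", script)

    def close(self) -> None:
        """Release backend resources. Nothing to release by default."""

    def __enter__(self) -> "MiniSandbox":
        return self

    def __exit__(self, *exc_info) -> None:
        self.close()


class SubprocessSandbox(MiniSandbox):
    """Toy backend that runs on the host, so it provides NO isolation."""

    async def exec(self, *cmd, input=None, cwd=None, env=None):
        proc = await asyncio.create_subprocess_exec(
            *cmd,
            stdin=asyncio.subprocess.PIPE if input is not None else None,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.STDOUT,  # fold stderr into stdout
            cwd=cwd,
            env={**os.environ, **env} if env else None,
        )
        data = input.encode() if input is not None else None
        out, _ = await proc.communicate(data)
        # A non-zero exit code is not raised; the output is returned either way.
        return out.decode(errors="replace")


async def demo() -> str:
    with SubprocessSandbox() as sb:
        # sys.executable avoids assuming a python3 binary is on PATH.
        return await sb.exec(sys.executable, "-c", "print(6 * 7)")


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

A real backend would replace the body of exec() with a call into whatever isolates the work (a container runtime, a VM agent, a remote API) and leave the shortcuts alone.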

Quick start

import asyncio
from motus.tools import get_sandbox

async def main():
    with get_sandbox(image="python:3.12") as sb:
        print(await sb.sh("echo hello from the sandbox"))
        print(await sb.sh("ls /tmp"))

asyncio.run(main())
That’s the whole loop. get_sandbox() picks the right backend, the with block tears it down on exit, and each exec/sh call returns the command’s output as a string.

Local and cloud, one call

Your code makes the same get_sandbox(...) call in both environments. Locally it returns a DockerSandbox. When the same agent runs on Motus Cloud, it returns a CloudSandbox that talks to a sandbox the cloud side manages for you. The switch happens automatically. Motus Cloud currently runs your agent inside a fixed, Motus-provided sandbox image.

Creating a sandbox

get_sandbox() is the recommended entry point. It manages a global provider behind the scenes, so repeated calls in the same process do not spin up redundant infrastructure.
with get_sandbox(image="python:3.12") as sb:
    ...
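The "global provider" idea can be pictured as a lazily created, process-wide object that every call reuses. This is a minimal sketch of that pattern only, not Motus's implementation: DemoProvider, get_provider, and get_demo_sandbox are hypothetical names, and the "sandbox" is a plain dict.

```python
import threading


class DemoProvider:
    """Hypothetical provider; counts how many times it was constructed."""

    instances_created = 0

    def __init__(self) -> None:
        DemoProvider.instances_created += 1
        self.sandboxes: list[dict] = []

    def create(self, image: str) -> dict:
        sandbox = {"image": image}
        self.sandboxes.append(sandbox)
        return sandbox


_provider = None
_provider_lock = threading.Lock()


def get_provider() -> DemoProvider:
    """Create the provider on first use, then hand back the same one."""
    global _provider
    with _provider_lock:
        if _provider is None:
            _provider = DemoProvider()
        return _provider


def get_demo_sandbox(image: str = "python:3.12") -> dict:
    # Each call yields a new sandbox, but the provider behind it is shared.
    return get_provider().create(image)
```

Two calls to get_demo_sandbox() produce two distinct sandboxes while constructing the provider only once, which is the property the paragraph above describes.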
For fine-grained control, the DockerSandbox class is a direct factory with the same options plus an async variant:
from motus.tools import DockerSandbox

async with await DockerSandbox.acreate("python:3.12", mounts={"/tmp/data": "/data"}) as sb:
    await sb.sh("ls /data")

Parameter reference

Parameter   Type                          Description
image       str                           Docker image to run. Default "python:3.12".
dockerfile  str | None                    Path to a directory with a Dockerfile. Built on creation.
name        str | None                    Container name. If provided, must be unique.
env         dict[str, str] | None         Environment variables inside the container.
mounts      dict[str, str] | None         Host-to-container bind mounts ({"/local": "/inside"}).
ports       dict[int, int | None] | None  Container-to-host port mapping. None value picks a random host port.
connect     str | None                    Docker backend only. Attach to an existing container by name or id. When set, image and dockerfile are ignored.
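The direction of each mapping is easy to mix up: mounts is keyed by the host path, ports is keyed by the container port. One way to keep them straight is to see how the same values would look as docker run flags. The helper below is purely illustrative: docker_run_args is a hypothetical function, and Motus may well talk to Docker through an SDK rather than the CLI.

```python
from __future__ import annotations


def docker_run_args(
    image: str = "python:3.12",
    name: str | None = None,
    env: dict[str, str] | None = None,
    mounts: dict[str, str] | None = None,
    ports: dict[int, int | None] | None = None,
) -> list[str]:
    """Translate sandbox-style options into an equivalent docker run command."""
    args = ["docker", "run", "--detach"]
    if name:
        args += ["--name", name]
    for key, value in (env or {}).items():
        args += ["--env", f"{key}={value}"]
    for host_path, container_path in (mounts or {}).items():
        # Key is the host side, value is the path inside the container.
        args += ["--volume", f"{host_path}:{container_path}"]
    for container_port, host_port in (ports or {}).items():
        if host_port is None:
            # Publishing only the container port lets Docker pick a host port.
            args += ["--publish", str(container_port)]
        else:
            args += ["--publish", f"{host_port}:{container_port}"]
    args.append(image)
    return args


if __name__ == "__main__":
    print(" ".join(docker_run_args(
        name="demo",
        mounts={"/tmp/data": "/data"},
        ports={8000: None, 5432: 15432},
    )))
```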

What you can do in a sandbox

Every sandbox exposes the same small surface, built on exec().
Method                          Purpose
exec(*cmd, input=, cwd=, env=)  Run an arbitrary command. Returns the command output as a string (Docker and Cloud interleave stdout + stderr; LocalShell returns stdout on success and both streams on non-zero exit).
python(script)                  Shortcut for exec("python3", "-c", script).
sh(command)                     Shortcut for exec("sh", "-c", command).
put(local_path, sandbox_path)   Copy a file in.
get(sandbox_path, local_path)   Copy a file out.
endpoint(port)                  URL to reach a service listening on port inside the sandbox.
from motus.tools import get_sandbox

async def pip_install_and_verify():
    with get_sandbox(image="python:3.12") as sb:
        # Copy the requirements file in, then install from it.
        await sb.put("./requirements.txt", "/app/requirements.txt")
        await sb.sh("cd /app && pip install -r requirements.txt")
        # Confirm the install by importing a package inside the sandbox.
        version = await sb.sh("python3 -c 'import requests; print(requests.__version__)'")
        # get() writes to disk and returns the resolved local path.
        local_path = await sb.get("/app/output.json", "./output.json")
        return version, local_path
A non-zero exit code does not raise; the command output is returned as a string so the caller can inspect it. This matches the way a terminal behaves and lets the agent handle a failure the same way it handles any other tool output.
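Because only a string comes back, a caller that needs the exit status has to carry it in the output. One possible convention, sketched below, is to have the shell print a marker line with $? after the command and strip it off afterwards. Nothing here is a Motus API: EXIT_MARKER, with_exit_code, split_exit_code, and run_checked are hypothetical helpers, and sb is assumed to be any object with an async sh(command) method.

```python
from __future__ import annotations

import asyncio

EXIT_MARKER = "__exit_code__="  # hypothetical marker, chosen to be unlikely in real output


def with_exit_code(command: str) -> str:
    """Wrap a shell command so its exit status is printed on a final line."""
    # The subshell keeps an `exit` inside the command from skipping the marker.
    return f"(\n{command}\n)\nprintf '\\n{EXIT_MARKER}%s\\n' \"$?\""


def split_exit_code(output: str) -> tuple[str, int | None]:
    """Separate command output from the marker line; None if no marker found."""
    body, sep, tail = output.rpartition(EXIT_MARKER)
    if not sep:
        return output, None
    try:
        code = int(tail.strip())
    except ValueError:
        return output, None
    # Trailing newlines (including the one added before the marker) are trimmed.
    return body.rstrip("\n"), code


async def run_checked(sb, command: str) -> tuple[str, int | None]:
    """Run a command through sb.sh() and return (output, exit_code)."""
    return split_exit_code(await sb.sh(with_exit_code(command)))
```

An agent-facing tool would usually pass the raw string through unchanged; a helper like this is mainly useful in orchestration code that has to branch on success or failure.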

Handing a sandbox to an agent

A Sandbox is both a Python object you can drive yourself and a tool collection an agent can call. Two common patterns:

Pass the sandbox directly

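import asyncio

from motus.agent import ReActAgent
from motus.models import OpenAIChatClient
from motus.tools import get_sandbox

async def main():
    with get_sandbox(image="python:3.12") as sb:
        agent = ReActAgent(
            client=OpenAIChatClient(),
            model_name="gpt-4o",
            tools=[sb],  # the sandbox itself is passed as a tool collection
        )
        # await is only valid inside a coroutine, hence the main() wrapper.
        await agent("Use python to compute the 50th Fibonacci number.")

asyncio.run(main())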
Motus extracts the sandbox’s python and sh methods and exposes them as tools (Sandbox is declared with @tools(allowlist={"python", "sh"})).
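An allowlist decorator of this kind can be pictured as a class-level tag plus a lookup that returns only the tagged methods. The sketch below illustrates the general pattern and is not Motus's @tools implementation: allow_tools, extract_tools, and DemoSandbox are hypothetical names.

```python
def allow_tools(allowlist):
    """Class decorator that records which method names may become tools."""
    def decorate(cls):
        cls.__tool_allowlist__ = frozenset(allowlist)
        return cls
    return decorate


def extract_tools(obj) -> dict:
    """Return {name: bound method} for the allowlisted methods only."""
    names = getattr(type(obj), "__tool_allowlist__", frozenset())
    return {name: getattr(obj, name) for name in sorted(names)}


@allow_tools(allowlist={"python", "sh"})
class DemoSandbox:
    async def exec(self, *cmd: str) -> str:
        return " ".join(cmd)  # stand-in: echoes the command instead of running it

    async def sh(self, command: str) -> str:
        return await self.exec("sh", "-c", command)

    async def python(self, script: str) -> str:
        return await self.exec("python3", "-c", script)

    async def put(self, local_path: str, sandbox_path: str) -> None:
        """Present on the object, but never offered to the agent."""
```

The effect is that an agent handed the object sees python and sh, while exec, put, and anything else on the class stay available to your own code only.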

Use the full builtin_tools suite

builtin_tools(sandbox=sb) wraps the sandbox in a richer developer toolkit: bash, read_file, write_file, edit_file, glob_search, grep_search, a todo list, and (when skills_dir= is passed) load_skill. See Tools for how these fit together.
import asyncio

from motus.agent import ReActAgent
from motus.models import OpenAIChatClient
from motus.tools import builtin_tools, get_sandbox

async def main():
    with get_sandbox(image="python:3.12") as sb:
        agent = ReActAgent(
            client=OpenAIChatClient(),
            model_name="gpt-4o",
            tools=[*builtin_tools(sandbox=sb)],
        )
        await agent("Find every TODO in /workspace and open an issue for each.")

asyncio.run(main())
Call builtin_tools() with no argument and it binds to a LocalShell, which runs commands directly on the host. Convenient for prototyping; not suitable once the model is running code you did not write yourself.

Lifecycle

Context managers are the recommended path. Motus supports both sync and async:
# Sync
with get_sandbox(image="python:3.12") as sb:
    ...  # sb.close() runs on exit

# Async
async with await DockerSandbox.acreate("python:3.12") as sb:
    ...  # sb.aclose() runs on exit
For long-lived sandboxes driven from tests or a custom orchestrator, you can also close manually:
sb = DockerSandbox.create("python:3.12")
try:
    ...
finally:
    sb.close()

Ownership

Sandboxes created with create() / acreate() own the underlying container: closing them attempts to stop and remove it. Sandboxes obtained with connect(name) do not own the container and will leave it running when closed. This is what you want when your sandbox is a shared dev environment rather than a disposable workspace.
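The ownership rule amounts to a single flag checked at close time. The sketch below models it against an in-memory fake so it can run without Docker. FakeRuntime and SandboxHandle are hypothetical names, not Motus classes.

```python
class FakeRuntime:
    """In-memory stand-in for a container runtime."""

    def __init__(self) -> None:
        self.running: set[str] = set()

    def start(self, name: str) -> None:
        self.running.add(name)

    def stop_and_remove(self, name: str) -> None:
        self.running.discard(name)


class SandboxHandle:
    def __init__(self, runtime: FakeRuntime, name: str, owns_container: bool) -> None:
        self.runtime = runtime
        self.name = name
        self.owns_container = owns_container
        self.closed = False

    @classmethod
    def create(cls, runtime: FakeRuntime, name: str) -> "SandboxHandle":
        runtime.start(name)
        return cls(runtime, name, owns_container=True)

    @classmethod
    def connect(cls, runtime: FakeRuntime, name: str) -> "SandboxHandle":
        if name not in runtime.running:
            raise LookupError(f"no running container named {name!r}")
        return cls(runtime, name, owns_container=False)

    def close(self) -> None:
        if self.closed:  # closing twice is harmless
            return
        self.closed = True
        if self.owns_container:
            # Only the creator tears the container down.
            self.runtime.stop_and_remove(self.name)

    def __enter__(self) -> "SandboxHandle":
        return self

    def __exit__(self, *exc_info) -> None:
        self.close()
```

With this shape, several handles can attach to the same shared container and close independently, and only the handle that created it removes it.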

Backends

DockerSandbox is the default on your laptop and on self-hosted deploys. It requires a running Docker daemon and supports images, Dockerfiles, bind mounts, port mapping, and attaching to an existing container.
The first time Motus brings up the DockerToolProvider in a process, it checks for a ghcr.io/lithos-ai/sandbox image (a Python base with common utilities) and builds it locally from the bundled Dockerfile if it is missing. This check runs only once per provider. Your get_sandbox(image=...) call is separate: it spins up whatever image you name.
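A check-then-build step like this can be approximated with two Docker CLI commands: docker image inspect exits non-zero when the image is absent, and docker build creates it. The sketch below shows the once-per-provider behaviour under those assumptions. ImageBootstrapper is a hypothetical class, the build context path is a placeholder, and Motus's own logic may differ.

```python
import subprocess


class ImageBootstrapper:
    """Ensure an image exists locally, checking at most once per instance."""

    def __init__(self, tag: str, context_dir: str, run=subprocess.run) -> None:
        self.tag = tag
        self.context_dir = context_dir
        self._run = run  # injectable so the logic can be exercised without Docker
        self._checked = False

    def ensure(self) -> bool:
        """Return True if a build was triggered, False otherwise."""
        if self._checked:
            return False
        self._checked = True
        found = self._run(
            ["docker", "image", "inspect", self.tag],
            capture_output=True,
        )
        if found.returncode == 0:
            return False  # image already present
        self._run(
            ["docker", "build", "--tag", self.tag, self.context_dir],
            check=True,
        )
        return True


if __name__ == "__main__":
    # "./sandbox-image" is a placeholder path, not the real bundled location.
    bootstrapper = ImageBootstrapper("ghcr.io/lithos-ai/sandbox", "./sandbox-image")
    print("built" if bootstrapper.ensure() else "already present")
```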

Where to go next

  • Tools: how the @tool decorator, builtin_tools, and sandbox fit together.
  • Motus Cloud: how your agents end up running on cloud sandboxes without a code change.
  • MCP integration: running MCP servers inside a sandbox when you want their side effects contained.
  • Human in the Loop: require approval before the agent runs commands that write to disk or reach the network.