# Pictograph SDK — full documentation
> One-shot bundle of every published doc page. Each section header is the
> page title; navigate by searching for `## Page: `.
## Page: Agents
_URL: https://pictograph.io/docs/agents (markdown: agents.md)_
_Section: Agents_
Every Pictograph operation is exposed as both a typed Python SDK call and an agent tool. The 28-tool registry feeds three integration paths.
```python
from pictograph.agents import create_toolkit
toolkit = create_toolkit() # reads PICTOGRAPH_API_KEY
```
## Three integration paths
### Bundled adapters — Claude / OpenAI
For raw tool-use loops against the Anthropic or OpenAI SDKs, the adapters return ready-to-pass tool dicts. No extra dependencies.
```python
from pictograph.agents import for_anthropic_messages, for_openai_responses
claude_tools = for_anthropic_messages(toolkit) # → anthropic.messages.create(tools=...)
openai_tools = for_openai_responses(toolkit) # → openai.responses.create(tools=...)
# Both paths dispatch results the same way:
result = toolkit.dispatch(name="list_datasets", args={"limit": 10})
```
For the framework SDKs (`claude-agent-sdk`, `openai-agents`) install the extra:
```bash
pip install 'pictograph[agents]'
```
```python
from pictograph.agents import for_claude_agent_sdk, for_openai_agents
claude_sdk_tools = for_claude_agent_sdk(toolkit) # @tool-decorated callables
openai_sdk_tools = for_openai_agents(toolkit) # FunctionTool objects
```
Full integration cookbooks: [Claude](/docs/agents/claude) · [OpenAI](/docs/agents/openai).
### Bundled Claude Skill
The SDK ships a Claude Skill (`pictograph-cv`) with workflow recipes, reference docs, and bash-callable Python scripts. Install it once and Claude auto-discovers it:
```bash
pictograph agents install-skill --target claude-code # → ~/.claude/skills/pictograph-cv/
pictograph agents install-skill --target claude-ai # → ./pictograph-cv.zip (upload at claude.ai/skills)
pictograph agents install-skill --target both
```
Update the skill after upgrading the SDK by re-running the install command (it overwrites the existing directory).
### Dynamic discovery — `tools.json`
For frameworks without a bundled adapter (Vercel AI SDK, LangChain, custom dispatchers), fetch the JSON Schema registry directly:
```bash
curl -H "X-API-Key: pk_live_…" https://api.pictograph.io/api/v1/developer/tools.json
```
Each entry has `name`, `description`, `input_schema`, plus metadata (`required_role`, `credit_cost`, `idempotent`). Wire it into your dispatcher and route the model's tool calls to the matching SDK method. See [Dynamic discovery](/docs/agents/dynamic-discovery) for end-to-end examples.
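As a rough illustration of how the metadata fields might be used, the sketch below narrows the registry to the tools a given key can call before the specs are handed to a model. The sample entries are hand-written, and the `viewer < member < admin` ordering is an assumption made for the example; the real payload and role hierarchy may differ.

```python
# Hypothetical role ordering; confirm the real hierarchy before relying on it.
ROLE_RANK = {"viewer": 0, "member": 1, "admin": 2}

# Hand-written sample shaped like the entries described above (not real API output).
SAMPLE_TOOLS = [
    {"name": "list_datasets", "description": "List datasets.",
     "input_schema": {"type": "object", "properties": {"limit": {"type": "integer"}}},
     "required_role": "viewer", "credit_cost": 0, "idempotent": True},
    {"name": "delete_dataset", "description": "Delete a dataset.",
     "input_schema": {"type": "object", "properties": {"name": {"type": "string"}}},
     "required_role": "admin", "credit_cost": 0, "idempotent": True},
]


def tools_for_role(tools: list[dict], role: str) -> list[dict]:
    """Keep only the tools whose required_role the given role satisfies."""
    rank = ROLE_RANK[role]
    return [t for t in tools if ROLE_RANK[t["required_role"]] <= rank]


def to_model_spec(tool: dict) -> dict:
    """Strip the metadata, leaving the fields a model needs to call the tool."""
    return {key: tool[key] for key in ("name", "description", "input_schema")}


viewer_specs = [to_model_spec(t) for t in tools_for_role(SAMPLE_TOOLS, "viewer")]
```

Filtering client-side only saves the model from attempting calls that would fail; the API still re-checks the role on every request.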
## What's in the registry
Twenty-eight tools, grouped by category. Full JSON schemas at [`/docs/api-reference/tools`](/docs/api-reference/tools).
| Category | Tools |
| --- | --- |
| Workflows | `upload_dataset_from_folder`, `auto_annotate_dataset`, `train_pipeline`, `full_pipeline` |
| Datasets | `list_datasets`, `get_dataset`, `create_dataset`, `delete_dataset` |
| Images | `upload_image`, `delete_image` |
| Annotations | `get_annotations`, `save_annotations` |
| Auto-annotate | `auto_annotate_point`, `auto_annotate_box`, `auto_annotate_text` |
| Search | `search_by_tag`, `search_by_similarity` |
| Exports | `create_export`, `list_exports`, `download_export` |
| Training | `get_training_status`, `cancel_training` |
| Models | `list_models`, `download_model` |
| Credits | `get_credit_balance`, `estimate_credit_cost` |
| Connectors | `validate_connector`, `import_from_connector` |
## Guardrails
The toolkit enforces three guardrails on every dispatch, independent of which integration path you use.
**Role gate (`required_role`)** — each tool's required role is metadata in the registry, but the API re-checks the calling key's role on every request. An agent holding a `viewer` key gets `403 ForbiddenError` on any write tool.
**Credit gate (`credit_cost`)** — paid tools (`auto_annotate_dataset`, `train_pipeline`, `full_pipeline`) have known costs. Agents can pre-flight via `get_credit_balance` + `estimate_credit_cost` (both in the registry) and refuse to start when the balance is short.
**Response cap (`max_response_tokens`)** — large list/get results that exceed the cap (default 25k tokens) are truncated with a `_truncated` marker. The agent re-calls with narrower filters. Pass `max_response_tokens=N` to `create_toolkit(...)` to override.
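One way an application-side loop could react to the response cap is sketched below. It assumes the `_truncated` marker shows up as a top-level key on a dict result and that the tool accepts a `limit` argument; both are assumptions for illustration, and `fake_dispatch` stands in for `toolkit.dispatch`.

```python
from typing import Any, Callable


def dispatch_until_complete(
    dispatch: Callable[[str, dict], Any],
    name: str,
    args: dict,
    min_limit: int = 1,
) -> Any:
    """Re-call a list tool with a smaller `limit` until the result is not truncated."""
    limit = args.get("limit", 100)
    while True:
        result = dispatch(name, {**args, "limit": limit})
        truncated = isinstance(result, dict) and result.get("_truncated")
        if not truncated or limit <= min_limit:
            return result
        limit = max(min_limit, limit // 2)  # halve the page size and try again


# Stand-in dispatcher: pretends any request above 25 rows gets truncated.
def fake_dispatch(name: str, args: dict) -> dict:
    if args["limit"] > 25:
        return {"items": [], "_truncated": True}
    return {"items": list(range(args["limit"]))}


result = dispatch_until_complete(fake_dispatch, "list_datasets", {"limit": 100})
```

Halving the limit is only one narrowing strategy; adding a tag or name filter is often a better fit when the tool supports it.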
## Recommended system prompt patterns
These three rules keep agents safe and cheap. Drop them into your system prompt.
```text
1. Before destructive actions (delete_dataset, delete_image,
cancel_training), restate exactly what will be removed and ask
for confirmation.
2. Before paid actions (auto_annotate_dataset, train_pipeline,
full_pipeline), call estimate_credit_cost first, then surface
the cost and the remaining balance. Proceed only if sufficient.
3. For multi-step tasks, prefer the workflow tools (full_pipeline,
upload_dataset_from_folder, auto_annotate_dataset, train_pipeline)
over chaining individual resource tools. They handle short-circuit
on failure and credit gating automatically.
```
## See also
- [Claude](/docs/agents/claude) · [OpenAI](/docs/agents/openai) — integration cookbooks
- [Dynamic discovery](/docs/agents/dynamic-discovery) — for framework-agnostic stacks
- [Cookbook](/docs/agents/cookbook) — recipe-style end-to-end examples
- [Tool reference](/docs/api-reference/tools) — full JSON schemas for all 28 tools
---
## Page: Claude
_URL: https://pictograph.io/docs/agents/claude (markdown: agents/claude.md)_
_Section: Agents_
Toolkit setup and guardrails live on [Agents — overview](/docs/agents). This page shows the two Claude-specific integration paths.
## Path 1: Anthropic SDK (raw tool dicts)
No extra dependencies — works with the standard `anthropic` package. You manage the tool-use loop.
```python
import anthropic
from pictograph.agents import create_toolkit, for_anthropic_messages
toolkit = create_toolkit()
tools = for_anthropic_messages(toolkit)
client = anthropic.Anthropic()
messages = [
    {"role": "user", "content": "Upload ./photos to a dataset called 'demo' and auto-annotate cars and people."},
]
response = client.messages.create(
    model="claude-opus-4",
    max_tokens=4096,
    tools=tools,
    tool_choice={"type": "auto"},
    messages=messages,
)
# Keep dispatching until Claude stops asking for tools.
while response.stop_reason == "tool_use":
    tool_uses = [b for b in response.content if b.type == "tool_use"]
    tool_results = [
        {
            "type": "tool_result",
            "tool_use_id": tu.id,
            "content": str(toolkit.dispatch(tu.name, tu.input)),
        }
        for tu in tool_uses
    ]
    messages.append({"role": "assistant", "content": response.content})
    messages.append({"role": "user", "content": tool_results})
    response = client.messages.create(
        model="claude-opus-4",
        max_tokens=4096,
        tools=tools,
        messages=messages,
    )
print(response.content)
```
`toolkit.dispatch(name, args)` validates `args` through the tool's Pydantic schema, invokes the handler, and returns a JSON-serializable result. Invalid input raises `ValidationError`.
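In a raw loop, it can help to hand a failed dispatch back to the model as an error result so it can correct its arguments instead of crashing the loop. The sketch below is one possible wrapper, not part of the SDK: `to_tool_result` is a hypothetical helper, `fake_dispatch` stands in for `toolkit.dispatch`, and `is_error` is the flag the Anthropic Messages API accepts on `tool_result` blocks.

```python
import json
from typing import Any, Callable


def to_tool_result(
    dispatch: Callable[[str, dict], Any],
    tool_use_id: str,
    name: str,
    args: dict,
) -> dict:
    """Run one tool call and always return a tool_result block."""
    try:
        content = json.dumps(dispatch(name, args), default=str)
        is_error = False
    except Exception as exc:  # includes validation failures raised by dispatch
        content = f"{type(exc).__name__}: {exc}"
        is_error = True
    return {
        "type": "tool_result",
        "tool_use_id": tool_use_id,
        "content": content,
        "is_error": is_error,
    }


# Stand-in for toolkit.dispatch so the sketch runs without the SDK.
def fake_dispatch(name: str, args: dict) -> dict:
    if "limit" in args and not isinstance(args["limit"], int):
        raise ValueError("limit must be an integer")
    return {"datasets": []}


ok = to_tool_result(fake_dispatch, "toolu_1", "list_datasets", {"limit": 10})
bad = to_tool_result(fake_dispatch, "toolu_2", "list_datasets", {"limit": "ten"})
```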
## Path 2: Claude Agent SDK
The Agent SDK manages the loop, streaming, and dispatch. Requires the extra:
```bash
pip install 'pictograph[agents]'
```
```python
from claude_agent_sdk import ClaudeSDKClient, ClaudeAgentOptions
from pictograph.agents import create_toolkit, for_claude_agent_sdk
toolkit = create_toolkit()
agent_tools = for_claude_agent_sdk(toolkit)
async with ClaudeSDKClient(
    options=ClaudeAgentOptions(
        system_prompt="You drive Pictograph for the user. Confirm destructive operations.",
        allowed_tools=[t.name for t in agent_tools],
    ),
) as client:
    async for response in client.query(
        "Train a YOLOX model on the 'road-signs' dataset, A10G GPU, 50 epochs"
    ):
        print(response)
```
## Picking between paths
| Path 1 (raw dicts) | Path 2 (Agent SDK) |
| --- | --- |
| You already call `messages.create()` directly | You want streaming + multi-turn with no loop boilerplate |
| You need custom logging or gating in the loop | You're building a long-running agent process |
| You don't want the `claude-agent-sdk` dependency | You're running inside Claude Code or another Agent-SDK runtime |
## See also
- [Agents](/docs/agents) — toolkit setup, guardrails, and the registry
- [Bundled Skill](/docs/agents) — `pictograph-cv` for Claude Code / claude.ai
- [Cookbook](/docs/agents/cookbook) — recipe-style examples
---
## Page: OpenAI
_URL: https://pictograph.io/docs/agents/openai (markdown: agents/openai.md)_
_Section: Agents_
Toolkit setup and guardrails live on [Agents — overview](/docs/agents). This page shows the two OpenAI-specific integration paths.
## Path 1: OpenAI SDK (raw function tools)
No extra dependencies — works with the standard `openai` package.
```python
import json
from openai import OpenAI
from pictograph.agents import create_toolkit, for_openai_responses
toolkit = create_toolkit()
tools = for_openai_responses(toolkit)
client = OpenAI()
input_messages = [
    {"role": "user", "content": "Show my Pictograph credit balance and recent training runs"},
]
while True:
    response = client.responses.create(
        model="gpt-5",
        input=input_messages,
        tools=tools,
    )
    function_calls = [o for o in response.output if o.type == "function_call"]
    if not function_calls:
        # No more tool calls: print the final answer and stop.
        print(response.output_text)
        break
    for call in function_calls:
        result = toolkit.dispatch(call.name, json.loads(call.arguments))
        input_messages.append(call)
        input_messages.append({
            "type": "function_call_output",
            "call_id": call.call_id,
            "output": json.dumps(result, default=str),
        })
```
`toolkit.dispatch(name, args)` is the same dispatcher used by the Anthropic path — single source of truth.
## Path 2: openai-agents SDK
The framework manages the tool-call loop, streaming, handoffs, and tracing. Requires the extra:
```bash
pip install 'pictograph[agents]'
```
```python
from agents import Agent, Runner
from pictograph.agents import create_toolkit, for_openai_agents
toolkit = create_toolkit()
agent_tools = for_openai_agents(toolkit)
agent = Agent(
    name="Pictograph driver",
    instructions=(
        "Drive Pictograph for the user. Confirm destructive operations. "
        "Prefer the workflow tools (full_pipeline, train_pipeline) over "
        "chaining individual resource tools."
    ),
    tools=agent_tools,
)
result = Runner.run_sync(
    agent,
    "Annotate ./road_signs as cars and people, then train a YOLOX model",
)
print(result.final_output)
```
Streaming:
```python
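from openai.types.responses import ResponseTextDeltaEvent

# Streaming goes through Runner.run_streamed(); run this inside an async function.
result = Runner.run_streamed(agent, user_input)
async for event in result.stream_events():
    if event.type == "raw_response_event" and isinstance(event.data, ResponseTextDeltaEvent):
        print(event.data.delta, end="", flush=True)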
```
## Picking between paths
| Path 1 (raw dicts) | Path 2 (openai-agents) |
| --- | --- |
| You already call `responses.create()` directly | You want multi-turn conversations with built-in dispatch |
| You need a custom dispatch loop | You need agent handoffs |
| You don't want the `openai-agents` dependency | You want tracing + replay + structured outputs |
## See also
- [Agents](/docs/agents) — toolkit setup, guardrails, and the registry
- [Cookbook](/docs/agents/cookbook) — recipe-style examples
- [Dynamic discovery](/docs/agents/dynamic-discovery) — for Vercel AI SDK, LangChain, and other stacks
---
## Page: Dynamic discovery
_URL: https://pictograph.io/docs/agents/dynamic-discovery (markdown: agents/dynamic-discovery.md)_
_Section: Agents_
The Pictograph SDK ships first-party adapters for **Claude** and
**OpenAI**. For everything else (Vercel AI SDK, LangChain, raw HTTP
clients, custom dispatchers), the registry is exposed as JSON Schema
at:
```
GET https://api.pictograph.io/api/v1/developer/tools.json
```
Authenticated with the same `X-API-Key` header — any role works (read-only).
## Why this exists
Per-framework adapters are a treadmill. The JSON Schema contract is
the long-term answer:
- **Pictograph** maintains one source of truth (`pictograph.agents.REGISTRY`).
- **Your agent stack** consumes JSON Schema natively — every modern
framework supports it.
- **No bespoke adapter** to maintain on either side.
The Python SDK still ships Claude + OpenAI adapters because their
ecosystems are big enough to warrant the convenience. Everyone else
gets the open-standard path.
## Vercel AI SDK
```ts
import { generateText, jsonSchema, tool } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';
const headers = { 'X-API-Key': process.env.PICTOGRAPH_API_KEY! };
const { tools } = await fetch(
  'https://api.pictograph.io/api/v1/developer/tools.json',
  { headers },
).then(r => r.json());
// Build a tools map keyed by name. The execute() function dispatches
// back to the Pictograph REST API directly; no Python required.
const pictographTools = Object.fromEntries(
  tools.map((t: any) => [
    t.name,
    tool({
      description: t.description,
      parameters: jsonSchema(t.input_schema), // wrap the raw JSON Schema for the AI SDK
      execute: async (args: any) => {
        // Map tool name → REST endpoint.
        // (See: jsonschema-to-rest mapper, or hardcode the routes you use.)
        const res = await fetch(
          `https://api.pictograph.io/api/v1/developer/_dispatch/${t.name}`,
          { method: 'POST', headers, body: JSON.stringify(args) },
        );
        return res.json();
      },
    }),
  ]),
);
const result = await generateText({
  model: anthropic('claude-opus-4'),
  tools: pictographTools,
  prompt: 'List my Pictograph datasets',
});
```
(Pictograph doesn't ship a `_dispatch` endpoint in v1.0.0 — wire
each tool to its underlying REST endpoint manually. The
`Toolkit.dispatch()` Python method is the reference behavior.)
## LangChain
```python
from langchain_core.tools import tool
import requests, os
headers = {"X-API-Key": os.environ["PICTOGRAPH_API_KEY"]}
schema = requests.get(
"https://api.pictograph.io/api/v1/developer/tools.json",
headers=headers,
).json()
# Build LangChain tools from the JSON Schema:
def make_tool(spec):
    @tool(spec["name"], description=spec["description"], args_schema=...)
    def _run(**kwargs):
        # Call the matching REST endpoint with kwargs.
        ...
    return _run

langchain_tools = [make_tool(t) for t in schema["tools"]]
```
For LangChain specifically, you can also use the Pictograph Python SDK
directly inside a `@tool` — that's often simpler than rebuilding the
dispatch loop:
```python
from langchain_core.tools import tool
from pictograph import Client
client = Client()
@tool("list_datasets", description="List Pictograph datasets in your org.")
def list_datasets(limit: int = 100) -> list[dict]:
    return [d.model_dump(mode="json") for d in client.datasets.list(limit=limit)]
```
This trades the dynamic-discovery benefit for typed, tested SDK calls —
worth it for production.
## Custom dispatchers
If you're rolling your own:
1. Fetch `/api/v1/developer/tools.json` once at startup.
2. Hand the `tools` array to your LLM as the function/tool spec.
3. When the LLM emits a tool call, look up the tool by name.
4. Map name → REST endpoint (see the [SDK source](https://github.com/pictograph-labs/pictograph-sdk/tree/main/src/pictograph/resources)
for canonical mappings).
5. Send the args as the JSON body, return the response to the LLM.
Or use the Python SDK as a server-side dispatcher and only expose tool
names + schemas to your client — usually the cleanest production
architecture.
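Steps 3 to 5 above might look roughly like the sketch below. The route table is invented for illustration (the paths are not confirmed Pictograph endpoints; take the real mappings from the SDK source), and the request is only built, not sent.

```python
import json
import urllib.request

API_BASE = "https://api.pictograph.io/api/v1/developer"

# Hypothetical routes for illustration only; replace with the canonical mappings.
ROUTES = {
    "create_dataset": ("POST", "/datasets"),
}


def build_request(tool_name: str, args: dict, api_key: str) -> urllib.request.Request:
    """Look up the tool by name, map it to a route, and attach the args as JSON."""
    if tool_name not in ROUTES:
        raise KeyError(f"unknown tool: {tool_name}")
    method, path = ROUTES[tool_name]
    return urllib.request.Request(
        API_BASE + path,
        data=json.dumps(args).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method=method,
    )


request = build_request("create_dataset", {"name": "demo"}, api_key="pk_test_example")
```

Passing `request` to `urllib.request.urlopen` would perform the call; the JSON response is then returned to the LLM as the tool result.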
## Snapshot file
The same registry ships in the Python SDK package — useful for offline
work or if you want to bundle the schema with your agent:
```python
from pictograph.agents import Toolkit
from unittest.mock import MagicMock
toolkit = Toolkit(MagicMock())
schema = toolkit.as_json_schema() # identical to the HTTP response payload
```
Or via the CLI:
```bash
pictograph agents export-tools -o tools.json
```
## See also
- [Agents — overview](/docs/agents) — the three integration paths.
- [Tool registry endpoint](/docs/api-reference/tools) — full spec.
- [Cookbook](/docs/agents/cookbook) — concrete recipe examples.
---
## Page: Agent cookbook
_URL: https://pictograph.io/docs/agents/cookbook (markdown: agents/cookbook.md)_
_Section: Agents_
Concrete patterns for driving Pictograph from agents. Each recipe
shows the system-prompt guidance plus the expected tool-call sequence.
## 1. Credit-aware training
**User asks**: "Train a YOLOX model on my 'road-signs' dataset"
**System prompt addition**:
> For paid operations, always call `estimate_credit_cost` first. If
> insufficient, tell the user the gap and ask before proceeding.
**Expected sequence**:
```
1. get_credit_balance()
→ {credits_remaining: 1500, ...}
2. estimate_credit_cost("training_a10g_per_minute", quantity=30)
→ {total_credits: 300, sufficient: true, ...}
3. (response to user) "Estimated 300 credits, you have 1500. Proceeding."
4. train_pipeline(dataset_name="road-signs", pipeline="yolox", gpu="a10g")
→ {training_run: {...}, model: {id: "model-uuid", status: "ready"}}
5. (response to user) "Trained. Model ID: model-uuid. Download with `pictograph models download model-uuid -o ./yolox.onnx`"
```
If `sufficient: false`, the agent should surface
`PaymentRequiredError.upgrade_url` to the user, not the raw exception.
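The decision in step 3 can be sketched as a small pure function over the two tool results. The field names (`credits_remaining`, `total_credits`, `sufficient`) are taken from the sequence above; `preflight_message` itself is a hypothetical helper, not an SDK function.

```python
def preflight_message(balance: dict, estimate: dict) -> tuple[bool, str]:
    """Decide whether to proceed and build the message shown to the user."""
    have = balance["credits_remaining"]
    need = estimate["total_credits"]
    if estimate["sufficient"]:
        return True, f"Estimated {need} credits, you have {have}. Proceeding."
    return False, f"Estimated {need} credits, you have {have}. Short by {need - have}."


proceed, message = preflight_message(
    {"credits_remaining": 1500},
    {"total_credits": 300, "sufficient": True},
)
```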
## 2. V7 import → auto-annotate with new classes
**User asks**: "Import 'road-damage' from V7 and add a 'pothole' class
auto-annotated by SAM3"
**Expected sequence**:
```
1. validate_connector(provider="v7", api_key="")
→ {valid: true, datasets: [{id, name: "road-damage", ...}, ...]}
2. import_from_connector(provider="v7", api_key=..., datasets=[])
→ {import_id, status: "processing", ...}
(the SDK polls until terminal — agent waits)
3. (response to user) "Imported. Now adding 'pothole' annotations…"
4. auto_annotate_dataset(
dataset_name="road-damage",
classes=[{name: "pothole", output_type: "polygon"}],
mode="batch",
)
→ {images_processed: 234, annotations_added: 487, ...}
5. (response to user summary)
```
Existing V7 annotations are preserved — auto-annotate by default skips
images that already have annotations (`overwrite=False`).
## 3. Batch SAM3 with progress + per-image fallback
**User asks**: "Auto-annotate 'wildlife' for 'tiger' and 'elephant';
fall back to text mode if batch fails"
**System prompt addition**:
> If a batch tool returns `failed_images > 0` or raises, retry the
> failed subset with the synchronous text-prompt mode.
**Expected sequence**:
```
1. auto_annotate_dataset(
dataset_name="wildlife",
classes=[{name: "tiger", output_type: "polygon"},
{name: "elephant", output_type: "polygon"}],
mode="batch",
)
→ {images_processed: 198, failed_images: 12, job_id, ...}
2. (response to user) "Batch done — 12 images failed. Retrying as text..."
3. get_dataset(name="wildlife", include_images=true)
→ list of image filenames
4. For each failed image (agent loops):
auto_annotate_text(
dataset_name="wildlife",
image_filename="img-failed-1.jpg",
text_prompt="tiger or elephant",
confidence_threshold=0.3,
)
5. (final response)
```
In practice, prefer `auto_annotate_dataset(mode="batch")` first for
speed, and fall back to `mode="text"` only for the failures.
## 4. Multi-class export with status filtering
**User asks**: "Export only the 'complete' subset of 'road-signs',
just the 'stop_sign' and 'yield' classes, in YOLO format"
**Expected sequence**:
```
1. get_dataset(name="road-signs")
→ confirms classes include both
2. create_export(
dataset_name="road-signs",
name="stop-yield-yolo-2026-04-19",
format="yolo",
include_images=true,
class_filter=["stop_sign", "yield"],
status_filter="complete",
)
→ {id, status: "completed", image_count: 412, ...}
3. download_export(
dataset_name="road-signs",
export_name="stop-yield-yolo-2026-04-19",
output_path="./stop-yield.zip",
)
4. (response to user) "Wrote ./stop-yield.zip — 412 images, 1287 annotations."
```
Always include the date in export names so re-runs don't conflict
(`409 ConflictError` on duplicate names).
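A dated name like the one in the recipe can be generated rather than typed by hand. This is a minimal sketch; `export_name` is a hypothetical helper and the `base-format-date` layout simply mirrors the example above.

```python
from datetime import date
from typing import Optional


def export_name(base: str, fmt: str, on: Optional[date] = None) -> str:
    """Build an export name ending in an ISO date, defaulting to today."""
    on = on or date.today()
    return f"{base}-{fmt}-{on.isoformat()}"


name = export_name("stop-yield", "yolo", date(2026, 4, 19))
```

Two runs on the same day still collide, so add a time component or a counter if same-day re-exports are expected.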
## 5. Cleanup confirmation flow
**User asks**: "Delete the old test datasets"
**System prompt addition**:
> Before any destructive action (`delete_dataset`, `delete_image`,
> `cancel_training`), list the targets and ask the user to confirm by
> repeating the names.
**Expected sequence**:
```
1. list_datasets(limit=100)
→ [{name: "test-1", ...}, {name: "test-2", ...}, {name: "production", ...}]
2. (response to user) "Found 2 with 'test' in the name: test-1, test-2.
Confirm by typing the names you want deleted, separated by commas."
3. (user) "test-1, test-2"
4. (agent) For each confirmed name:
delete_dataset(name="test-1")
delete_dataset(name="test-2")
5. (response to user) "Deleted: test-1, test-2."
```
The `delete_dataset` tool is marked `idempotent=True` and
`required_role="admin"` — agents using a `member` key get
`403 ForbiddenError` automatically.
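If the confirmation step is enforced in application code rather than left to the prompt, the check could look like the sketch below. `confirmed_targets` is a hypothetical helper: it only passes through names that were both offered to the user and typed back.

```python
def confirmed_targets(candidates: list[str], reply: str) -> list[str]:
    """Return only the candidates the user typed back, in candidate order."""
    typed = {part.strip() for part in reply.split(",") if part.strip()}
    return [name for name in candidates if name in typed]


to_delete = confirmed_targets(["test-1", "test-2"], "test-1, test-2")
```

A bare "yes" confirms nothing under this rule, and a name the agent never listed (such as `production`) is ignored.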
## 6. Folder reorganization with batch ops
**User asks**: "Move all images tagged 'blurry' to a 'blurry' folder"
**Expected sequence**:
```
1. search_by_tag(dataset_name="…", attributes=["blurry"], limit=500)
→ [{image_id, filename, ...}, ...]
2. batch.move(
image_ids=[r.image_id for r in results],
folder_path="/blurry",
)
→ {succeeded: [...], failed_count: 0}
3. (response to user) "Moved 87 blurry images to /blurry."
```
Agents should prefer `batch.*` over per-image loops — single call,
single round-trip, atomic at the database level.
## See also
- [Claude integration](/docs/agents/claude) — Anthropic-specific paths
- [OpenAI integration](/docs/agents/openai) — OpenAI-specific paths
- [Agents — overview](/docs/agents) — toolkit setup, guardrails, and `pictograph-cv` Skill install
---
## Page: Workflows
_URL: https://pictograph.io/docs/workflows (markdown: workflows.md)_
_Section: Workflows_
Workflows are the headline UX of the SDK. Each one chains several REST calls into a single Python function so you can express "upload this folder and train a model" without orchestrating it yourself. Workflows fail open: a partial failure does not raise, and every workflow returns a report you can inspect.
```python
from pictograph import Client
from pictograph.workflows import full_pipeline
client = Client()
report = full_pipeline(
    client,
    dataset_name="road-signs",
    folder="./road_signs",
    classes=[("stop_sign", "bbox"), ("yield", "bbox")],
    pipeline="yolox",
)
print("model:", report.model.id if report.success else report.upload.failures)
```
## When to reach for each
| Workflow | What it chains | Use when |
| --- | --- | --- |
| [`full_pipeline`](/docs/workflows/full-pipeline) | upload → auto-annotate → train | You have a folder of images and want a trained model |
| [`upload_dataset_from_folder`](/docs/workflows/upload) | walk folder → bulk upload | You only need the upload step (annotations come later) |
| [`auto_annotate_dataset`](/docs/workflows/auto-annotate) | list images → SAM3 batch → save | The dataset is uploaded; you want SAM3 to label it |
| [`train_pipeline`](/docs/workflows/train) | create export → train → fetch model | Annotations are saved; you want the model |
## Report objects, not exceptions
Workflows don't raise on partial failure. They return dataclasses with per-phase success flags and failure lists. This is intentional — agents and CI jobs need to make decisions on partial outcomes, not unwind on the first 4xx.
```python
report = upload_dataset_from_folder(client, "my-dataset", "./images")
if report.success:
    print(f"Uploaded {report.images_uploaded}")
else:
    for failure in report.failures:
        print(failure.path, failure.reason)
```
Exceptions are still raised for unrecoverable errors before any work happens (`NotFoundError` on a missing dataset, `ValidationError` on a bad pipeline name). See [Error handling](/docs/error-handling).
---
## Page: Full pipeline
_URL: https://pictograph.io/docs/workflows/full-pipeline (markdown: workflows/full-pipeline.md)_
_Section: Workflows_
`full_pipeline()` chains upload → auto-annotate → train. Each phase short-circuits on failure and the `PipelineReport` carries every sub-report so you can see exactly where the chain broke.
```python
from pictograph import Client
from pictograph.workflows import full_pipeline
client = Client()
report = full_pipeline(
client,
dataset_name="road-signs",
folder="./road_signs",
classes=[("stop_sign", "bbox"), ("yield", "bbox")],
pipeline="yolox",
)
if report.success:
    print("Model:", report.model.id)
else:
    print("Stopped at:", report.credit_skip_reason or "see sub-reports")
```
## Signature
```python
full_pipeline(
client: Client,
*,
dataset_name: str,
folder: str | Path,
classes: Sequence[BatchClass | tuple[str, str] | dict[str, str]],
pipeline: PipelineType,
gpu: GpuType = "a10g",
annotate: bool = True,
annotate_mode: AnnotateMode = "batch",
train: bool = True,
upload_workers: int = 8,
train_config: dict[str, Any] | None = None,
train_timeout: float = 7200.0,
min_credits: int | None = 1,
) -> PipelineReport
```
| Argument | Default | Purpose |
| --- | --- | --- |
| `dataset_name` | required | Destination dataset; created if missing |
| `folder` | required | Local folder of images — subdirectories become virtual folders |
| `classes` | required | Each class becomes a SAM3 target and a training label |
| `pipeline` | required | `yolox`, `detectron2`, `sm_pytorch`, `classification`, `rfdetr_detection`, `rfdetr_segmentation` |
| `gpu` | `"a10g"` | `a10g`, `a100`, or `h100` |
| `annotate` | `True` | Skip the SAM3 phase if you already have annotations |
| `annotate_mode` | `"batch"` | `batch` (async, multi-image) or `text` (synchronous per-image) |
| `train` | `True` | Skip training to do upload + annotate only |
| `upload_workers` | `8` | Concurrent upload threads |
| `train_config` | `None` | Hyperparameters (`epochs`, `batch_size`, `learning_rate`, `image_size`) |
| `train_timeout` | `7200` | Max seconds to wait for training (2 hours) |
| `min_credits` | `1` | Pre-flight balance check before paid phases — pass `None` to disable |
## How the chain fails open
Each phase only runs if the previous one succeeded.
1. **Upload** always runs. If it produces zero successes, the function returns immediately with `upload.failures` populated.
2. **Credit gate**: before any paid phase, the balance is checked. If it is below `min_credits`, the function returns with `credit_skip_reason` set.
3. **Auto-annotate** runs only when `annotate=True` and upload succeeded. If it produces zero processed images, training is skipped.
4. **Train** runs only when `train=True` and the previous phases succeeded. The export is auto-named `-`.
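The four rules above can be read as the following control flow. This is a simplified model of the described behavior, not the SDK's implementation: `plan_phases` is a hypothetical function that takes the outcomes as plain numbers and reports which phases would run.

```python
from typing import Optional


def plan_phases(
    uploaded: int,
    balance: int,
    processed: int,
    *,
    annotate: bool = True,
    train: bool = True,
    min_credits: Optional[int] = 1,
) -> tuple[list[str], Optional[str]]:
    """Return (phases that run, credit skip reason) for the given outcomes."""
    ran = ["upload"]  # rule 1: upload always runs
    if uploaded == 0:
        return ran, None  # zero successes: return immediately
    paid_phase_requested = annotate or train
    if paid_phase_requested and min_credits is not None and balance < min_credits:
        return ran, f"balance {balance} below min_credits {min_credits}"  # rule 2
    if annotate:
        ran.append("annotate")  # rule 3
        if processed == 0:
            return ran, None  # nothing processed: training is skipped
    if train:
        ran.append("train")  # rule 4
    return ran, None


phases, skip_reason = plan_phases(uploaded=10, balance=0, processed=0)
```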
## Inspecting the report
```python
@dataclass
class PipelineReport:
    dataset_name: str
    upload: UploadReport
    annotate: AnnotateReport | None
    training_run: TrainingRun | None
    model: Model | None
    credit_skip_reason: str | None

    @property
    def success(self) -> bool: ...
```
`success` is `True` only when every populated phase succeeded and no credit skip happened. Each sub-report has its own `success` property.
## Common patterns
**Upload + annotate only** (no training):
```python
full_pipeline(client, ..., pipeline="yolox", train=False)
```
**Use existing annotations** (skip SAM3):
```python
full_pipeline(client, ..., annotate=False, pipeline="yolox")
```
**Disable the credit pre-flight** (you're OK paying for partial runs):
```python
full_pipeline(client, ..., min_credits=None)
```
## Errors
The function does not raise on partial failure; inspect the report instead. It still raises for conditions it cannot recover from:
| Status | Exception | Cause |
| --- | --- | --- |
| 402 | `PaymentRequiredError` | Mid-run cost exceeds balance (from auto-annotate or training phase) |
| 422 | `ValidationError` | `pipeline` or `gpu` value invalid |
| — | `FileNotFoundError` | `folder` doesn't exist or isn't a directory |
## See also
- [Upload](/docs/workflows/upload) · [Auto-annotate](/docs/workflows/auto-annotate) · [Train](/docs/workflows/train) — the underlying workflows
- [Credits](/docs/api-reference/credits) — pre-flight cost estimation
- [Error handling](/docs/error-handling)
---
## Page: Upload from folder
_URL: https://pictograph.io/docs/workflows/upload (markdown: workflows/upload.md)_
_Section: Workflows_
`upload_dataset_from_folder()` walks a local directory of images, creates the destination dataset if needed, and uploads everything through a thread pool. Subdirectories become virtual folders on the dataset by default. Re-runs are idempotent — duplicate filenames are skipped, not failed.
```python
from pictograph import Client
from pictograph.workflows import upload_dataset_from_folder
client = Client()
report = upload_dataset_from_folder(
client,
dataset_name="road-signs",
folder="./road_signs",
)
print(f"{report.images_uploaded} uploaded, {report.images_skipped} skipped")
```
## Signature
```python
upload_dataset_from_folder(
client: Client,
dataset_name: str,
folder: str | Path,
*,
organize_by_class: bool = True,
parallel: bool = True,
max_workers: int = 8,
skip_existing: bool = True,
create_if_missing: bool = True,
progress: Callable[[int, int, str | None], None] | None = None,
) -> UploadReport
```
| Argument | Default | Purpose |
| --- | --- | --- |
| `dataset_name` | required | Destination dataset |
| `folder` | required | Local directory (walked recursively) |
| `organize_by_class` | `True` | First-level subdirectories become virtual folders |
| `parallel` | `True` | Use a thread pool |
| `max_workers` | `8` | Pool size — higher values risk hitting the rate limit |
| `skip_existing` | `True` | Treat duplicate-filename conflicts as skips, not failures |
| `create_if_missing` | `True` | Create the dataset if it doesn't exist (else `NotFoundError`) |
| `progress` | `None` | `(completed, total, filename)` callback fired after each file |
## Folder layout convention
With `organize_by_class=True` (the default), the **first-level subdirectory** becomes the virtual folder:
```
./road_signs/
├── stop/ → /stop on the dataset
│ ├── 001.jpg
│ └── 002.jpg
├── yield/ → /yield
│ └── 003.jpg
└── 004.jpg → / (root)
```
Nested subdirectories collapse — `./road_signs/stop/night/005.jpg` still lands in `/stop`. Pass `organize_by_class=False` to put every file at the root.
Supported extensions: `.jpg`, `.jpeg`, `.png`, `.webp`, `.bmp`, `.tif`, `.tiff`, `.gif`, `.heic`.
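The folder convention and the extension filter can be sketched with `pathlib`. Both helpers are hypothetical and only mirror the rules described above; matching extensions case-insensitively is an assumption.

```python
from pathlib import PurePosixPath

SUPPORTED = {".jpg", ".jpeg", ".png", ".webp", ".bmp", ".tif", ".tiff", ".gif", ".heic"}


def virtual_folder(root: str, file_path: str, organize_by_class: bool = True) -> str:
    """Map a local file to its virtual folder using the first-level subdirectory."""
    relative = PurePosixPath(file_path).relative_to(PurePosixPath(root))
    if not organize_by_class or len(relative.parts) == 1:
        return "/"  # files directly under the root, or flat layout requested
    return "/" + relative.parts[0]  # nested directories collapse to the first level


def is_supported(file_path: str) -> bool:
    """Check the extension against the supported list (case-insensitive here)."""
    return PurePosixPath(file_path).suffix.lower() in SUPPORTED


nested = virtual_folder("./road_signs", "./road_signs/stop/night/005.jpg")
```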
## Idempotency
Re-running the same call on a dataset that already has matching filenames is safe — those uploads come back as `images_skipped`. To force re-upload, set `skip_existing=False` (failures will be recorded instead).
```python
# First run — uploads everything.
report = upload_dataset_from_folder(client, "road-signs", "./road_signs")
assert report.images_uploaded == 100 and report.images_skipped == 0
# Second run — skips everything that's already there.
report = upload_dataset_from_folder(client, "road-signs", "./road_signs")
assert report.images_uploaded == 0 and report.images_skipped == 100
```
## Progress callback
```python
def on_progress(done: int, total: int, filename: str | None) -> None:
    print(f"[{done}/{total}] {filename}")

upload_dataset_from_folder(
    client, "road-signs", "./road_signs", progress=on_progress,
)
```
The callback fires once per file, regardless of success or failure.
## Inspecting the report
```python
@dataclass
class UploadReport:
    dataset_name: str
    images_attempted: int
    images_uploaded: int
    images_skipped: int
    failures: list[UploadFailure]  # each carries .path and .reason

    @property
    def success(self) -> bool: ...
```
`success` is `True` only when there are zero failures **and** at least one file uploaded. An empty folder returns a report with `success=False`.
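Read literally, the rule above amounts to the property below. This is a stand-in dataclass written for the example, assuming the field names shown in the report; the SDK's own `UploadReport` may compute `success` differently.

```python
from dataclasses import dataclass, field


@dataclass
class UploadFailure:
    path: str
    reason: str


@dataclass
class UploadReport:
    dataset_name: str
    images_attempted: int = 0
    images_uploaded: int = 0
    images_skipped: int = 0
    failures: list[UploadFailure] = field(default_factory=list)

    @property
    def success(self) -> bool:
        # Zero failures AND at least one uploaded file, as described above.
        return not self.failures and self.images_uploaded > 0


empty = UploadReport("road-signs")
good = UploadReport("road-signs", images_attempted=3, images_uploaded=3)
partial = UploadReport(
    "road-signs",
    images_attempted=3,
    images_uploaded=2,
    failures=[UploadFailure("bad.jpg", "validation")],
)
```

Under this literal reading, a re-run that skips every file would also report `success=False`; check the SDK's behavior if that distinction matters to your job.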
## Errors
| Status | Exception | Cause |
| --- | --- | --- |
| — | `FileNotFoundError` | `folder` doesn't exist or isn't a directory |
| 404 | `NotFoundError` | `dataset_name` missing and `create_if_missing=False` |
Per-file errors (network, validation, conflict) are recorded in `report.failures`, not raised.
## See also
- [Full pipeline](/docs/workflows/full-pipeline) — chains upload with annotate + train
- [Images](/docs/api-reference/images) — the underlying `client.images.upload()` method
---
## Page: Auto-annotate a dataset
_URL: https://pictograph.io/docs/workflows/auto-annotate (markdown: workflows/auto-annotate.md)_
_Section: Workflows_
`auto_annotate_dataset()` runs SAM3 over a dataset and saves the resulting annotations. By default it runs in **batch mode** — one async job over many images — which is the right call for anything above ~10 images. Text mode (one synchronous prompt per image) is available for debugging.
```python
from pictograph import Client
from pictograph.workflows import auto_annotate_dataset
client = Client()
report = auto_annotate_dataset(
    client,
    dataset_name="road-signs",
    classes=[("stop_sign", "bbox"), ("yield", "polygon")],
)
print(f"{report.annotations_added} annotations across {report.images_processed} images")
```
## Signature
```python
auto_annotate_dataset(
    client: Client,
    dataset_name: str,
    classes: Sequence[BatchClass | tuple[str, str] | dict[str, str]],
    *,
    mode: AnnotateMode = "batch",
    confidence_threshold: float = 0.5,
    overwrite: bool = False,
    max_images: int | None = None,
    poll_interval: float = 5.0,
    timeout: float = 1800.0,
) -> AnnotateReport
```
| Argument | Default | Purpose |
| --- | --- | --- |
| `dataset_name` | required | Project name |
| `classes` | required | What to detect — see "Class specs" below |
| `mode` | `"batch"` | `batch` (async multi-image) or `text` (synchronous per-image) |
| `confidence_threshold` | `0.5` | SAM3 score cutoff (0–1) |
| `overwrite` | `False` | When `False`, skip images that already have annotations |
| `max_images` | `None` | Cap (useful for dry-runs) |
| `poll_interval` | `5.0` | `batch` mode — seconds between status polls |
| `timeout` | `1800` | `batch` mode — max seconds to wait |
## Class specs
`classes` accepts three shapes — pick whichever is shortest:
```python
# 1. Tuples — name + output_type
classes=[("stop_sign", "bbox"), ("yield", "polygon")]
# 2. Dicts
classes=[
    {"name": "stop_sign", "output_type": "bbox"},
    {"name": "yield", "output_type": "polygon"},
]
# 3. BatchClass (canonical)
from pictograph.models.auto_annotate import BatchClass
classes=[BatchClass(name="stop_sign", output_type="bbox")]
```
Valid `output_type` values: `"bbox"`, `"polygon"`, `"polyline"`, `"keypoint"`.
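To make the three accepted shapes concrete, here is a minimal sketch of how they could be coerced into one canonical form and checked against the valid `output_type` values listed above. `ClassSpec` and `normalize` are hypothetical names used for illustration; the SDK's own coercion logic is not shown in these docs and may differ.

```python
from dataclasses import dataclass

VALID_OUTPUT_TYPES = {"bbox", "polygon", "polyline", "keypoint"}


@dataclass(frozen=True)
class ClassSpec:
    """Hypothetical stand-in for BatchClass."""
    name: str
    output_type: str


def normalize(spec):
    """Coerce a tuple, dict, or ClassSpec into a validated ClassSpec."""
    if isinstance(spec, ClassSpec):
        result = spec
    elif isinstance(spec, dict):
        result = ClassSpec(spec["name"], spec["output_type"])
    else:
        name, output_type = spec  # expects a (name, output_type) pair
        result = ClassSpec(name, output_type)
    if result.output_type not in VALID_OUTPUT_TYPES:
        raise ValueError(f"unknown output_type: {result.output_type!r}")
    return result


specs = [normalize(s) for s in [
    ("stop_sign", "bbox"),
    {"name": "yield", "output_type": "polygon"},
    ClassSpec("lane", "polyline"),
]]
```

Validating locally like this surfaces a misspelled `output_type` before any request is sent.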
## Batch vs text mode
`mode="batch"` (default) sends every image and every class to one async SAM3 job. The job is polled until it terminates; you get one report at the end. This is what you want for >10 images — it's faster and cheaper per image.
`mode="text"` runs one synchronous SAM3 text-prompt per image, per class. It's slower (no batching) and saves annotations as they come back. Use it when you need to debug a single image or when the dataset is small enough that the batch warmup overhead isn't worth it.
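The guidance above can be captured in a small helper that picks a mode from the image count. This is only a sketch: `pick_mode` and `BATCH_THRESHOLD` are hypothetical names, and the cutoff of 10 is the rough figure from the text rather than a hard limit enforced by the SDK.

```python
BATCH_THRESHOLD = 10  # the docs recommend batch above roughly 10 images


def pick_mode(image_count, debugging=False):
    """Return "text" for debugging or small datasets, otherwise "batch"."""
    if debugging or image_count <= BATCH_THRESHOLD:
        return "text"
    return "batch"


mode_for_large = pick_mode(500)
mode_for_small = pick_mode(4)
mode_for_debug = pick_mode(500, debugging=True)
```

The result can be passed straight through as the `mode` argument.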
## Skip vs overwrite
By default the workflow skips images that already have at least one annotation. Set `overwrite=True` to re-annotate everything:
```python
# Annotate only the unlabelled subset.
auto_annotate_dataset(client, "road-signs", classes=[...])
# Re-annotate every image (overwrites existing).
auto_annotate_dataset(client, "road-signs", classes=[...], overwrite=True)
```
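The selection rule can be illustrated without the SDK. The sketch below models each image as a `(filename, annotation_count)` pair, which is a simplification of the SDK's image objects, and `select_targets` is a hypothetical helper rather than part of the library.

```python
def select_targets(images, overwrite=False):
    """Return the filenames that would be annotated under the documented rule.

    `images` is a list of (filename, annotation_count) pairs.
    """
    if overwrite:
        return [name for name, _ in images]
    # Default: skip anything that already has at least one annotation.
    return [name for name, count in images if count == 0]


images = [("a.jpg", 0), ("b.jpg", 3), ("c.jpg", 0)]
unlabelled_only = select_targets(images)
everything = select_targets(images, overwrite=True)
```

Running a check like this first gives a rough idea of how many images a run will touch.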
## Inspecting the report
```python
@dataclass
class AnnotateReport:
    dataset_name: str
    images_attempted: int
    images_processed: int
    images_skipped: int
    annotations_added: int
    failures: list[AnnotationFailure]
    job_id: str | None  # set only when mode="batch"

    @property
    def success(self) -> bool: ...
```
In batch mode, `job_id` lets you fetch the job later via `client.auto_annotate.get_batch(job_id)` (e.g. to surface progress in a UI) or cancel it.
## Errors
| Status | Exception | Cause |
| --- | --- | --- |
| 404 | `NotFoundError` | Dataset doesn't exist |
| 402 | `PaymentRequiredError` | Insufficient credits |
| 422 | `ValidationError` | Class name invalid or `output_type` not recognised |
Per-image failures are recorded in `report.failures` — they don't raise.
## See also
- [Full pipeline](/docs/workflows/full-pipeline) — chains annotate with upload + train
- [Auto-annotate](/docs/api-reference/auto-annotate) — point / box / text / batch primitives
- [Credits](/docs/api-reference/credits) — cost estimation per image
---
## Page: Train a model
_URL: https://pictograph.io/docs/workflows/train (markdown: workflows/train.md)_
_Section: Workflows_
`train_pipeline()` chains export creation, training, and model fetch. The export is auto-named so the workflow doesn't collide with exports you've created manually.
```python
from pictograph import Client
from pictograph.workflows import train_pipeline
client = Client()
run, model = train_pipeline(
    client,
    "road-signs",
    pipeline="yolox",
    gpu="a10g",
    config={"epochs": 50, "batch_size": 16},
)
if model:
    client.models.download(model.id, "./yolox.onnx")
```
## Signature
```python
train_pipeline(
    client: Client,
    dataset_name: str,
    *,
    pipeline: PipelineType,
    gpu: GpuType = "a10g",
    name: str | None = None,
    config: dict[str, Any] | None = None,
    export_name: str | None = None,
    class_filter: list[str] | None = None,
    status_filter: str = "complete",
    wait: bool = True,
    poll_interval: float = 5.0,
    timeout: float = 7200.0,
) -> tuple[TrainingRun, Model | None]
```
| Argument | Default | Purpose |
| --- | --- | --- |
| `dataset_name` | required | Project to train on |
| `pipeline` | required | `yolox`, `detectron2`, `sm_pytorch`, `classification`, `rfdetr_detection`, `rfdetr_segmentation` |
| `gpu` | `"a10g"` | `a10g`, `a100`, or `h100` |
| `name` | auto | Run name (defaults to `-run-`) |
| `config` | `{}` | Hyperparameters (`epochs`, `batch_size`, `learning_rate`, `image_size`) |
| `export_name` | auto | Defaults to `-` |
| `class_filter` | `None` | Train only on these classes |
| `status_filter` | `"complete"` | Only include images at this annotation status |
| `wait` | `True` | When `True`, block until training terminates and fetch the model |
| `poll_interval` | `5.0` | Seconds between polls |
| `timeout` | `7200` | Max seconds to wait (2 hours) |
## Pipelines
| `pipeline` | Output | When to pick it |
| --- | --- | --- |
| `yolox` | Object detection (boxes) | Speed, edge deployment, small datasets |
| `detectron2` | Instance segmentation (polygons + masks) | Per-instance pixel masks |
| `sm_pytorch` | Semantic segmentation | Pixel-wise class maps |
| `classification` | Image classification | Tag-style labels with no geometry |
| `rfdetr_detection` | Object detection | Higher mAP than YOLOX on harder data |
| `rfdetr_segmentation` | Instance segmentation | Higher mAP than Detectron2 on harder data |
## GPU tiers
| `gpu` | Pick for |
| --- | --- |
| `a10g` (default) | YOLOX, classification, RF-DETR-detection |
| `a100` | Detectron2, large RF-DETR, big batch sizes |
| `h100` | Last resort — only when A100 OOMs |
The dataset must have **at least 5 images** with the chosen `status_filter` so the worker can split train / val / test.
## What happens under the hood
```
1. client.exports.create(dataset, "-", format="pictograph",
include_images=True, class_filter=…, status_filter=…)
→ waits for the export to finish.
2. client.training.create(dataset, export_name, pipeline_type=…, name=…,
config=…, gpu_type=…, wait=…, timeout=…)
→ kicks off the run; polls until terminal when wait=True.
3. client.models.get(run.model_id)
→ returns the trained model — only when wait=True and status=="completed".
```
## Async usage
Pass `wait=False` to fire-and-forget:
```python
run, _ = train_pipeline(client, "road-signs", pipeline="yolox", wait=False)
print("queued:", run.id)
# Poll yourself later.
run = client.training.get(run.id)
if run.status == "completed":
    model = client.models.get(run.model_id)
```
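If you poll yourself, a loop with a deadline keeps the wait bounded. The following is a generic sketch, not SDK code: `poll_until_terminal` is a hypothetical helper, and apart from `"completed"` the status names (`"queued"`, `"running"`, `"failed"`, `"cancelled"`) are assumptions about what the backend reports.

```python
import time

# "completed" appears in the docs; "failed" and "cancelled" are assumed names.
TERMINAL = {"completed", "failed", "cancelled"}


def poll_until_terminal(get_status, poll_interval=5.0, timeout=7200.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Call get_status() until it returns a terminal status or time runs out."""
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status in TERMINAL:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout}s")
        sleep(poll_interval)


# Demo with a fake status source and a no-op sleep, so it runs instantly.
fake_statuses = iter(["queued", "running", "completed"])
final = poll_until_terminal(lambda: next(fake_statuses), sleep=lambda _: None)
```

Against the real client, the status source would presumably be something like `lambda: client.training.get(run.id).status`.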
## Hyperparameters
`config` keys are pipeline-specific. Common ones across pipelines:
| Key | Type | Typical |
| --- | --- | --- |
| `epochs` | int | 30–100 |
| `batch_size` | int | 8 / 16 / 32 |
| `learning_rate` | float | `0.001`–`0.01` |
| `image_size` | int | `640` (YOLOX), `1024` (Detectron2) |
Unsupported keys are ignored.
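Since unsupported keys are dropped silently, a typo in a key name goes unnoticed. A small local check can flag it before the run starts. This sketch compares against the four common keys from the table only; `unknown_keys` is a hypothetical helper, and legitimate pipeline-specific keys would also be flagged, so treat the result as a hint rather than an error.

```python
COMMON_KEYS = {"epochs", "batch_size", "learning_rate", "image_size"}


def unknown_keys(config):
    """Return config keys outside the common set, sorted for stable output."""
    return sorted(set(config) - COMMON_KEYS)


# "learing_rate" is misspelled on purpose to show what gets flagged.
config = {"epochs": 50, "batch_size": 16, "learing_rate": 0.001}
suspicious = unknown_keys(config)
```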
## Errors
| Status | Exception | Cause |
| --- | --- | --- |
| 404 | `NotFoundError` | Dataset missing or has no `status_filter`-matching images |
| 422 | `ValidationError` | Pipeline or GPU invalid, or dataset has fewer than 5 annotated images |
| 402 | `PaymentRequiredError` | Insufficient credits for the estimated training minutes |
| 408 | `PollTimeoutError` | `wait=True` and `timeout` elapsed (the run continues; poll later) |
| 5xx | `ApiError` | Training run failed — inspect `run.error_message` |
## See also
- [Full pipeline](/docs/workflows/full-pipeline) — chains training with upload + annotate
- [Training](/docs/api-reference/training) — lower-level `create / list / get / cancel` primitives
- [Models](/docs/api-reference/models) — download trained ONNX weights
- [Credits](/docs/api-reference/credits) — `estimate("training__per_minute")`
---
## Page: API reference
_URL: https://pictograph.io/docs/api-reference (markdown: api-reference.md)_
_Section: API Reference_
The Pictograph SDK exposes 15 resource groups under `client.`. Each method maps 1:1 to a REST endpoint. Use the SDK for type safety and auto-retry; use raw REST when you need a non-Python language.
```python
from pictograph import Client
client = Client()
```
## Resources
| Resource | Purpose |
| --- | --- |
| [Datasets](/docs/api-reference/datasets) | Project CRUD and bulk download |
| [Images](/docs/api-reference/images) | Single-image upload / download / delete |
| [Annotations](/docs/api-reference/annotations) | Per-image annotation read / save / delete |
| [Auto-annotate](/docs/api-reference/auto-annotate) | SAM3 point / box / text / batch prompts |
| [Search](/docs/api-reference/search) | Tag and similarity search across a dataset |
| [Batch](/docs/api-reference/batch) | Bulk move / copy / delete / update on images |
| [Exports](/docs/api-reference/exports) | Dataset exports in COCO, YOLO, CVAT, Pascal VOC, LabelMe, CSV |
| [Training](/docs/api-reference/training) | Spawn and monitor training runs |
| [Models](/docs/api-reference/models) | List and download trained ONNX weights |
| [Credits](/docs/api-reference/credits) | Balance, ledger, pre-flight cost estimation |
| [Connectors](/docs/api-reference/connectors) | V7 / Roboflow dataset import |
| [Video](/docs/api-reference/video) | Video upload + frame extraction |
| [Organizations](/docs/api-reference/organizations) | Members and invites for the active org |
| [Projects](/docs/api-reference/projects) | Project config (classes, annotation types) |
| [API keys](/docs/api-reference/api-keys) | Programmatic key management |
| [Tools](/docs/api-reference/tools) | Agent tool registry (JSON Schema reference) |
## Prefer workflows for end-to-end tasks
For multi-step flows like "upload, annotate, train", reach for [`pictograph.workflows`](/docs/workflows) before chaining resource calls. Workflows handle short-circuit on failure, credit gating, and report aggregation for you.
| Workflow | Chains |
| --- | --- |
| [`full_pipeline`](/docs/workflows/full-pipeline) | upload → auto-annotate → train |
| [`upload_dataset_from_folder`](/docs/workflows/upload) | walk folder → bulk upload |
| [`auto_annotate_dataset`](/docs/workflows/auto-annotate) | list images → batch SAM3 → save |
| [`train_pipeline`](/docs/workflows/train) | export → train → fetch model |
Each workflow is also exposed as an [agent tool](/docs/api-reference/tools).
## Conventions across resources
- **Name-based lookups** are preferred (`get(name=...)`). UUID variants exist where useful (`get_by_id(...)`).
- **Pagination**: `.list()` returns one page; `.iter()` returns an `OffsetPager` that auto-fetches subsequent pages.
- **Long-running ops** (training, exports, batch SAM3, dataset imports) default to `wait=True` and poll until terminal. Pass `wait=False` to fire-and-forget.
- **Failure reports**: bulk operations return per-item failure lists rather than raising on the first error.
- **Idempotency**: mutating operations auto-generate `Idempotency-Key` headers — safe to retry. Pass `idempotency_key=` to set it explicitly.
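The pagination convention can be pictured as a generator over `limit` / `offset` requests. This is a conceptual sketch of offset paging in general, not the implementation of `OffsetPager`; `iter_pages` and `fake_fetch` are hypothetical names, and the stop-on-short-page rule is an assumption.

```python
def iter_pages(fetch_page, page_size=100, max_total=None):
    """Yield items across pages; fetch_page(limit, offset) returns a list."""
    offset = 0
    yielded = 0
    while True:
        page = fetch_page(page_size, offset)
        for item in page:
            if max_total is not None and yielded >= max_total:
                return
            yield item
            yielded += 1
        if len(page) < page_size:  # a short page means there is no more data
            return
        offset += page_size


# Fake backend holding 250 items, standing in for a real list endpoint.
DATA = list(range(250))
calls = []


def fake_fetch(limit, offset):
    calls.append(offset)
    return DATA[offset:offset + limit]


all_items = list(iter_pages(fake_fetch, page_size=100))
first_five = list(iter_pages(fake_fetch, page_size=100, max_total=5))
```

The first pass issues three requests (offsets 0, 100, 200); the capped pass stops after one page.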
---
## Page: Datasets
_URL: https://pictograph.io/docs/api-reference/datasets (markdown: api-reference/datasets.md)_
_Section: API Reference_
A **dataset** in Pictograph is a project — a collection of images
sharing a class set. Datasets are unique by `(organization, name)`.
The SDK strongly prefers name-based lookups (agents pass strings users
gave them; UUID indirection is friction).
```python
from pictograph import Client
client = Client()
```
## list
Single-page list of datasets in your organization.
```python
datasets = client.datasets.list(limit=100)
for ds in datasets:
    print(ds.name, ds.image_count)
```
| Arg | Type | Default | Notes |
|---|---|---|---|
| `limit` | `int` | `100` | Backend cap: 1000 |
Returns `list[Dataset]`.
## iter
Auto-paging iterator over every dataset.
```python
for ds in client.datasets.iter(page_size=100):
    print(ds.name)
# Or materialize:
all_datasets = client.datasets.iter().all()
```
| Arg | Type | Default | Notes |
|---|---|---|---|
| `page_size` | `int` | `100` | Items per backend round-trip |
| `max_total` | `int \| None` | `None` | Stop after this many items |
Returns `OffsetPager[Dataset]`.
## get
Fetch by name (case-sensitive within org).
```python
ds = client.datasets.get("road-signs", include_images=True, images_limit=200)
print(ds.image_count, len(ds.images))
```
| Arg | Type | Default | Notes |
|---|---|---|---|
| `name` | `str` | required | Dataset name |
| `include_images` | `bool` | `False` | Embed first `images_limit` `DatasetImage` summaries |
| `images_limit` | `int` | `1000` | Backend cap: 10000 |
| `images_offset` | `int` | `0` | Page the embedded image list |
Returns `Dataset`.
## get_by_id
UUID lookup. Use only when you already have the ID.
```python
ds = client.datasets.get_by_id("a3e12f...")
```
## download
Bulk-download images and / or annotations to a local directory. Fetches
a batch of signed download URLs in one call, then downloads in parallel
via a thread pool.
```python
report = client.datasets.download(
    "road-signs",
    output_dir="./dump",
    mode="full",               # "full" | "images_only" | "annotations_only"
    status_filter="complete",  # restrict to annotation-finalised images
    max_workers=10,
    progress=lambda done, total, fn: print(f"{done}/{total} {fn}"),
)
print(report.images_downloaded, report.annotations_downloaded, len(report.failures))
```
Returns a `DownloadReport`. Inspect `.failures` to retry the subset —
the call does **not** raise on individual file errors.
## Project CRUD
Project create / update / delete live on the [`projects`](/docs/api-reference/projects)
resource (it's the same underlying entity; "dataset" is the SDK alias for
the read paths and "project" is the alias for the write paths).
```python
proj = client.projects.create("new-dataset", description="…")
client.projects.update("new-dataset", description="updated")
client.projects.delete("new-dataset")
```
## Common errors
| Status | Exception | Cause |
|---|---|---|
| 404 | `NotFoundError` | Name doesn't exist (case-sensitive) or belongs to another org |
| 409 | `ConflictError` | `create` with a duplicate name |
| 403 | `ForbiddenError` | `delete` requires `admin`+ role |
## REST equivalent
```bash
curl -H "X-API-Key: pk_live_…" \
  "https://api.pictograph.io/api/v1/developer/datasets/?limit=10"
```
---
## Page: Images
_URL: https://pictograph.io/docs/api-reference/images (markdown: api-reference/images.md)_
_Section: API Reference_
For bulk operations across many images, prefer the
[upload workflow](/docs/quick-start) and the [`batch`](/docs/api-reference/batch)
resource. This page is for single-image ops.
```python
from pictograph import Client
client = Client()
```
## get
Fetch metadata for a single image.
```python
image = client.images.get("img-uuid-1")
print(image.filename, image.status, image.annotation_count)
```
Returns `Image`. Annotations live on the [annotations](/docs/api-reference/annotations)
resource — call `client.annotations.get(image.id)` to fetch them.
## upload
Upload a local file to a dataset. Three steps under the hood: get a
signed upload URL → PUT the bytes → register the image.
```python
from pathlib import Path
project = client.projects.get("my-dataset")
image = client.images.upload(
    dataset_id=project.id,
    file_path=Path("./photo.jpg"),
    folder_path="/cars",  # virtual folder on the dataset
)
```
| Arg | Type | Default | Notes |
|---|---|---|---|
| `dataset_id` | `str` | required | UUID of the destination dataset |
| `file_path` | `str \| Path` | required | Local file. Pillow extracts dimensions client-side |
| `folder_path` | `str` | `"/"` | Virtual folder (e.g. `/cars`). Storage paths are immutable |
| `idempotency_key` | `str \| None` | auto | Override the auto-generated dedup key |
Returns `Image`. Raises `ConflictError` if a file with the same name already exists in the same folder.
**Supported extensions**: `.jpg`, `.jpeg`, `.png`, `.webp`, `.bmp`, `.tif`, `.tiff`, `.gif`, `.heic`. HEIC is auto-converted to PNG server-side.
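A local pre-filter avoids round-trips for files that would be rejected. The sketch below checks only the extension list above; `is_supported` is a hypothetical helper, and matching case-insensitively is an assumption about how the server treats names like `a.JPG`.

```python
from pathlib import Path

SUPPORTED = {".jpg", ".jpeg", ".png", ".webp", ".bmp",
             ".tif", ".tiff", ".gif", ".heic"}


def is_supported(path):
    """True when the file extension is in the documented list (case-insensitive)."""
    return Path(path).suffix.lower() in SUPPORTED


candidates = ["a.JPG", "b.heic", "notes.txt", "c.tiff", "archive.tar.gz"]
uploadable = [p for p in candidates if is_supported(p)]
```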
## download
Stream the original image bytes to a local file (chunked, safe for
large images).
```python
client.images.download("img-uuid-1", output_path="./photo.jpg")
```
| Arg | Type | Default |
|---|---|---|
| `image_id` | `str` | required |
| `output_path` | `str \| Path` | required |
The bytes are served via Cloud CDN with 30-day edge caching, so repeat
downloads are fast.
## delete
Soft-delete (archive) by default. Set `permanent=True` to free the
stored bytes — irreversible.
```python
client.images.delete("img-uuid-1") # archive (recoverable)
client.images.delete("img-uuid-1", permanent=True) # permanent
```
Permanent deletes require `member`+ role on the API key.
## Bulk uploads
For directories of images, use the workflow:
```python
from pictograph.workflows import upload_dataset_from_folder
report = upload_dataset_from_folder(
    client,
    "my-dataset",
    folder="./photos",
    organize_by_class=True,  # subdirectory → virtual folder
    parallel=True,
    max_workers=8,
)
print(report.images_uploaded, len(report.failures))
```
See [Quick Start](/docs/quick-start) for the full workflow surface.
## Common errors
| Status | Exception | Cause |
|---|---|---|
| 404 | `NotFoundError` | `image_id` doesn't exist, or belongs to another org |
| 409 | `ConflictError` | Filename collision in the same virtual folder |
| 413 | `ApiError` | Uploaded file exceeds 50 MB |
| 415 | `ValidationError` | Unsupported file extension |
---
## Page: Annotations
_URL: https://pictograph.io/docs/api-reference/annotations (markdown: api-reference/annotations.md)_
_Section: API Reference_
Annotations follow the canonical Pictograph JSON schema — see
[Annotation format](/docs/annotation-format) for the full spec.
The class label field is **`name`** (not `class`). Polygons use
multi-ring `paths`, not flat coordinate arrays.
```python
from pictograph import Client
client = Client()
```
## get
Fetch the typed annotation list attached to an image.
```python
annotations = client.annotations.get("img-uuid-1")
for ann in annotations:
    print(ann.name, ann.type)
```
Returns `list[Annotation]` — a discriminated union over
`BBoxAnnotation` / `PolygonAnnotation` / `PolylineAnnotation` /
`KeypointAnnotation`. An image with no annotations returns `[]`
(never raises for the "no annotations" case — only for "no such image").
## save
Replace the image's annotations with the supplied list. **Full overwrite** —
existing annotations are dropped.
```python
from pictograph import BBoxAnnotation, BoundingBox, PolygonAnnotation, PolygonGeometry, Point
result = client.annotations.save("img-uuid-1", [
    BBoxAnnotation(
        id="ann-1",
        name="person",
        bounding_box=BoundingBox(x=100, y=200, w=50, h=80),
    ),
    PolygonAnnotation(
        id="ann-2",
        name="car",
        polygon=PolygonGeometry(paths=[
            [Point(x=0, y=0), Point(x=10, y=0), Point(x=10, y=10)],
        ]),
    ),
])
print(result.previous_count, "→", result.new_count, result.status)
```
| Arg | Type | Notes |
|---|---|---|
| `image_id` | `str` | Image UUID |
| `annotations` | `Sequence[Annotation]` | Pydantic-validated client-side; backend re-validates |
Returns `SaveResult` — `image_id`, `previous_count`, `new_count`, `status`
(`"new"` / `"in_progress"` / `"complete"`, set automatically by count).
Polygons may omit `bounding_box` on save — the backend computes the
enclosing rectangle server-side.
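For intuition, an enclosing rectangle over multi-ring `paths` can be computed as below. This is a sketch of the general technique, assuming the `x` / `y` / `w` / `h` convention from the `BoundingBox` example; the backend's actual algorithm is not documented here, and `enclosing_box` is a hypothetical helper.

```python
def enclosing_box(paths):
    """Return (x, y, w, h) of the tightest axis-aligned box around all rings.

    `paths` is a list of rings; each ring is a list of (x, y) tuples.
    """
    xs = [x for ring in paths for x, _ in ring]
    ys = [y for ring in paths for _, y in ring]
    if not xs:
        raise ValueError("paths contains no points")
    return min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)


triangle = [[(0, 0), (10, 0), (10, 10)]]
box = enclosing_box(triangle)
two_rings = enclosing_box([[(2, 3), (4, 3), (4, 8)], [(1, 5), (3, 5), (3, 6)]])
```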
## delete
Remove every annotation from the image. Equivalent to `save(image_id, [])`
but uses `DELETE` and requires `admin`+ role.
```python
result = client.annotations.delete("img-uuid-1")
print(result.deleted_count)
```
## Validation
The SDK Pydantic models reject malformed payloads at construction:
```python
from pictograph import PolygonAnnotation, PolygonGeometry, Point
PolygonGeometry(paths=[[Point(x=0, y=0)]])
# ValidationError: paths[0] has 1 point(s); polygon ring requires >= 3
```
The backend re-validates on save as defense-in-depth — agents that
construct dicts directly will hit `422 ValidationError` for the same
class of mistakes.
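When building dicts by hand, a quick local check can catch the mistakes named above before the request is sent. The sketch assumes a dict shape of `{"name": ..., "polygon": {"paths": [[{"x": ..., "y": ...}, ...]]}}`, inferred from the typed models; `check_polygon_dict` is a hypothetical helper and does not replace the backend's validation.

```python
def check_polygon_dict(ann):
    """Return a list of problems found in a polygon annotation dict."""
    problems = []
    if "class" in ann and "name" not in ann:
        problems.append("use 'name', not 'class'")
    paths = ann.get("polygon", {}).get("paths", [])
    if paths and not isinstance(paths[0], list):
        problems.append("paths must be a list of rings, not a flat array")
        return problems
    for i, ring in enumerate(paths):
        if len(ring) < 3:
            problems.append(f"paths[{i}] has {len(ring)} point(s); ring requires >= 3")
    return problems


good = {"name": "car", "polygon": {"paths": [
    [{"x": 0, "y": 0}, {"x": 10, "y": 0}, {"x": 10, "y": 10}],
]}}
bad = {"class": "car", "polygon": {"paths": [[{"x": 0, "y": 0}]]}}
flat = {"name": "car", "polygon": {"paths": [0, 0, 10, 0, 10, 10]}}
```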
## Common errors
| Status | Exception | Cause |
|---|---|---|
| 404 | `NotFoundError` | `image_id` doesn't exist |
| 422 | `ValidationError` | `class` instead of `name`, flat polygon array, unknown class label |
| 403 | `ForbiddenError` | `delete` requires `admin`+ role |
## Auto-annotate workflow
If you want SAM3 to generate annotations rather than write them by hand,
see the [`auto-annotate`](/docs/api-reference/auto-annotate) resource.
The `auto_annotate_dataset` workflow saves annotations automatically;
the single-prompt methods return a `PromptResult` and you call `save`
yourself.
## REST equivalent
```bash
curl -X POST -H "X-API-Key: pk_live_…" \
-H "Content-Type: application/json" \
-d '{"image_id":"…","annotations":[…]}' \
https://api.pictograph.io/api/v1/developer/annotations/
```
---
## Page: Auto-annotate
_URL: https://pictograph.io/docs/api-reference/auto-annotate (markdown: api-reference/auto-annotate.md)_
_Section: API Reference_
Pictograph runs SAM3 (Segment Anything Model 3) on T4 GPUs for
auto-annotation. It offers three single-image prompt modes plus an async
batch endpoint for many images at once.
```python
from pictograph import Client
client = Client()
```
## point — "click here, segment that"
Best when the user knows the object's location.
```python
result = client.auto_annotate.point(
    dataset_name="my-dataset",
    image_filename="img-1.jpg",
    x=320, y=240,
    name="car",
    positive_points=[(310, 250)],  # optional extra positives
    negative_points=[(100, 100)],  # exclude regions
    score_threshold=0.75,
)
# result.annotations[0] is a polygon — call client.annotations.save() to persist.
```
Returns `PromptResult` with `status` ∈ `{"success", "no_detection", "below_threshold"}`.
On `success`, `annotations[0]` is a `PolygonAnnotation`.
## box — "segment everything in this box"
Best when the user has drawn a rough bounding box.
```python
result = client.auto_annotate.box(
    dataset_name="my-dataset",
    image_filename="img-1.jpg",
    box={"x": 100, "y": 200, "w": 200, "h": 150},
    name="car",
    return_polygon=True,  # also include polygon (not just bbox)
    confidence_threshold=0.5,
    negative_boxes=[{"x": 50, "y": 50, "w": 30, "h": 30}],
)
```
`return_polygon=False` returns only the refined bbox.
## text — "find all "
Open-vocabulary text prompt. Best for many objects in one image.
```python
result = client.auto_annotate.text(
    dataset_name="my-dataset",
    image_filename="img-1.jpg",
    text_prompt="red cars",
    output_type="polygon",  # or "bbox"
    confidence_threshold=0.3,
    max_detections=50,
)
```
## batch — async, many images
Use for **>10 images**. Kicks off one job; polls until terminal status.
```python
from pictograph import BatchClass
job = client.auto_annotate.batch(
    dataset_name="my-dataset",
    image_filenames=["img-1.jpg", "img-2.jpg", "..."],
    classes=[
        BatchClass(name="car", output_type="polygon"),
        BatchClass(name="person", output_type="bbox"),
    ],
    confidence_threshold=0.5,
    wait=True,
    poll_interval=5.0,
    timeout=1800.0,  # 30 min default
)
print(job.status, job.processed_images, job.total_annotations_added)
```
`wait=False` returns the job immediately — poll later via:
```python
job = client.auto_annotate.get_batch(job.job_id)
job = client.auto_annotate.wait_for_batch(job.job_id, timeout=600.0)
client.auto_annotate.cancel_batch(job.job_id)
```
## auto_annotate_dataset workflow
Higher-level helper that paginates a dataset's image list, runs batch
or text mode, and returns a per-image report:
```python
from pictograph.workflows import auto_annotate_dataset
report = auto_annotate_dataset(
    client,
    "my-dataset",
    classes=[("car", "polygon"), ("person", "bbox")],
    mode="batch",              # "batch" | "text"
    confidence_threshold=0.5,
    overwrite=False,           # skip already-annotated images
    max_images=None,           # None = all
)
print(report.images_processed, report.annotations_added, len(report.failures))
```
## Choosing a mode
| Scenario | Mode |
|---|---|
| User clicks one spot | `point` |
| User drags a rough box | `box` |
| Many images, known classes | `batch` (or `text` per image for small datasets) |
| Single image, multiple objects | `text` |
## Cost
SAM3 is paid:
- **3-credit minimum** per session (image embedding generation).
- **~1 credit** per additional prompt on the same image (sub-second).
- **Batch** is charged per image processed.
Call `client.credits.estimate("sam3_per_minute", quantity=N)` for an A10G-time
estimate; the pre-charge is fixed per call. On rejection, read
`PaymentRequiredError.required` for the exact amount needed.
## Common errors
| Status | Exception | Cause |
|---|---|---|
| 402 | `PaymentRequiredError` | Out of credits |
| 404 | `NotFoundError` | Dataset or image missing |
| 408 | `PollTimeoutError` | Batch job didn't finish within `timeout` (job keeps running on the backend) |
## See also
- [Annotations](/docs/api-reference/annotations) — saving the prompt results
- [Annotation format](/docs/annotation-format) — wire format
- [Credits](/docs/api-reference/credits) — budget gating
---
## Page: Models
_URL: https://pictograph.io/docs/api-reference/models (markdown: api-reference/models.md)_
_Section: API Reference_
Models are produced by [training runs](/docs/api-reference/training).
The SDK doesn't insert model rows directly — you train, then read.
```python
from pictograph import Client
client = Client()
```
## list / iter
```python
models = client.models.list(limit=20)
for m in models:
    print(m.name, m.architecture, m.status, m.metrics)
# Or auto-page:
for m in client.models.iter(page_size=50):
    print(m.id, m.model_type)
```
| Arg | Type | Default | Notes |
|---|---|---|---|
| `limit` | `int` | `50` | Backend cap: 500 |
| `status` | `ModelStatus \| None` | `None` | `"training"` / `"ready"` / `"failed"` / `"archived"` |
| `model_type` | `ModelType \| None` | `None` | `"object_detection"` / `"semantic_segmentation"` / `"instance_segmentation"` / `"classification"` |
## get
```python
model = client.models.get("model-uuid")
print(model.architecture, model.metrics["mAP"], model.class_mapping)
```
Returns `Model`. Inspect `metrics` (mAP, precision, recall) and
`class_mapping` (index → class name) for inference setup.
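When decoding model output, the predicted class index has to be translated through `class_mapping`. The key type is not stated here, so this sketch tolerates both integer keys and the string keys that JSON objects typically carry; `label_for` is a hypothetical helper and the sample mappings are illustrative.

```python
def label_for(index, class_mapping):
    """Look up a class name by index, tolerating int or str keys."""
    if index in class_mapping:
        return class_mapping[index]
    return class_mapping[str(index)]  # fall back to a JSON-style string key


mapping_from_json = {"0": "stop_sign", "1": "yield"}
mapping_int_keys = {0: "stop_sign", 1: "yield"}

label_a = label_for(1, mapping_from_json)
label_b = label_for(0, mapping_int_keys)
```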
## download
Stream the ONNX weights to a local file. Only `status="ready"` models
are downloadable — `training` / `failed` raise `409 ConflictError`.
```python
from pathlib import Path
client.models.download("model-uuid", output_path=Path("./yolox.onnx"))
```
| Arg | Type |
|---|---|
| `model_id` | `str` |
| `output_path` | `str \| Path` |
The download is chunked and checksummed against the object MD5. Safe
for multi-GB models.
## delete
```python
client.models.delete("model-uuid")
```
Soft-delete (sets `status="archived"`). Weights are not purged on
soft-delete — contact support to permanently remove them.
## Status lifecycle
| `status` | Meaning |
|---|---|
| `training` | Training is in progress; `download` returns 409. |
| `ready` | Trained successfully; weights downloadable via `download()`. |
| `failed` | Training stopped with an error. Inspect the source `TrainingRun.error_message`. |
| `archived` | Soft-deleted. Hidden from `list()` unless `status="archived"` filter passed. |
## Inference
The SDK ships read access only — there is no `client.models.infer()`
endpoint in v1.0.0. To run inference, download the ONNX file and use
your own runtime (`onnxruntime`, `tensorrt`, etc.):
```python
import onnxruntime as ort
client.models.download(model.id, output_path="./model.onnx")
session = ort.InferenceSession("./model.onnx")
# … standard ORT inference loop
```
The `image_inference` credit operation (~1 cr/image) is reserved for a
future managed inference endpoint and is not yet exposed.
## Common errors
| Status | Exception | Cause |
|---|---|---|
| 404 | `NotFoundError` | `model_id` missing or belongs to another org |
| 409 | `ConflictError` | `download` on a non-`ready` model |
---
## Page: Credits
_URL: https://pictograph.io/docs/api-reference/credits (markdown: api-reference/credits.md)_
_Section: API Reference_
Pictograph uses a **credit ledger** (signed integers, per organization)
for paid operations. Free actions (uploads, exports, search) cost 0.
```python
from pictograph import Client
client = Client()
```
## balance
Current balance + monthly allowance + last 20 ledger entries.
```python
balance = client.credits.balance()
print(balance.credits_remaining, "/", balance.credits_monthly_allowance)
print("Resets:", balance.credits_reset_at)
for entry in balance.recent_history:
    print(entry.created_at, entry.operation, entry.amount)
```
Returns `CreditBalance`.
## history
Page through the credit ledger (newest first).
```python
entries = client.credits.history(limit=50, offset=0)
for e in entries:
    direction = "debit" if e.amount < 0 else "credit"
    print(e.created_at, direction, abs(e.amount), e.operation)
```
Sign convention: `amount < 0` = debit (operation consumed credits),
`amount > 0` = credit / refund (top-up, training overcharge refund).
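The sign convention makes ledger totals a matter of summing each side separately. The sketch below works on a plain list of signed amounts; `summarize` is a hypothetical helper and the sample ledger is made up for illustration.

```python
def summarize(amounts):
    """Split signed ledger amounts into total debits and total credits."""
    debits = sum(-a for a in amounts if a < 0)
    credits = sum(a for a in amounts if a > 0)
    return {"debited": debits, "credited": credits, "net": credits - debits}


# Hypothetical ledger: two debits, one top-up, one refund.
ledger = [-300, -3, 1000, 40]
totals = summarize(ledger)
```

With real data, the input would be `[e.amount for e in entries]`.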
## iter
Auto-paging iterator over the entire ledger.
```python
for entry in client.credits.iter(page_size=100):
    print(entry.balance_after, entry.operation)
```
## estimate
Pre-flight cost check **before** invoking a paid operation.
```python
estimate = client.credits.estimate("training_a10g_per_minute", quantity=30)
print(estimate.total_credits, "credits;", "sufficient:", estimate.sufficient)
```
`sufficient=True` is **not** a guarantee — another caller may drain credits
between the estimate and the actual call. The authoritative answer is
the operation's own `PaymentRequiredError`.
## Cost cheatsheet
| Operation slug | Approx cost |
|---|---|
| `sam3_per_minute` | 3 cr session minimum + ~1/prompt |
| `training_a10g_per_minute` | 10 cr/min |
| `training_a100_per_minute` | 60 cr/min |
| `training_h100_per_minute` | 120 cr/min |
| `image_generate_imagen_fast` | 5 cr/image |
| `image_edit_gemini_flash` | 3 cr/image |
| `image_inference` | 1 cr/image |
The full table lives server-side in `utils.tier_limits.CREDIT_COSTS`.
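As a worked example of the training rows, cost scales linearly with minutes. The rates below are copied from the cheatsheet and are approximate; `training_cost` and `affordable_minutes` are hypothetical helpers for back-of-envelope budgeting, and `client.credits.estimate()` remains the authoritative source.

```python
# Per-minute rates copied from the cheatsheet above (approximate).
RATE_PER_MINUTE = {"a10g": 10, "a100": 60, "h100": 120}


def training_cost(gpu, minutes):
    """Approximate credits for `minutes` of training on `gpu`."""
    return RATE_PER_MINUTE[gpu] * minutes


def affordable_minutes(gpu, balance):
    """Whole minutes of training that a given balance covers at that rate."""
    return balance // RATE_PER_MINUTE[gpu]


cost_30_min_a10g = training_cost("a10g", 30)   # 10 cr/min * 30 min
cost_30_min_h100 = training_cost("h100", 30)   # 120 cr/min * 30 min
minutes_on_a100 = affordable_minutes("a100", 1000)
```

At these rates, 30 minutes costs about 300 credits on an A10G and 3600 on an H100, and a 1000-credit balance covers 16 whole minutes on an A100.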
## Gating in workflows
`full_pipeline` already gates on credit balance before kicking off
paid phases:
```python
from pictograph.workflows import full_pipeline
report = full_pipeline(
    client,
    dataset_name="…", folder="…", classes=…, pipeline="yolox",
    min_credits=1,  # skip annotate + train if balance < 1
)
if report.credit_skip_reason:
    print(report.credit_skip_reason)
```
`min_credits=None` disables the check.
## PaymentRequiredError details
```python
from pictograph.exceptions import PaymentRequiredError
try:
    client.training.create(dataset_name, export_name, pipeline_type="yolox")
except PaymentRequiredError as e:
    print(f"Need {e.required}, have {e.remaining}")
    print(f"Top up at: {e.upgrade_url}")
```
## Refunds
The training pipeline auto-refunds unused GPU minutes when:
- A run is **cancelled** mid-training.
- A run **fails** before consuming the full `timeout` budget.
Refunds appear as positive ledger entries with operation
`training_refund_`. No SDK call required.
## Common errors
| Status | Exception | Cause |
|---|---|---|
| 422 | `ValidationError` | `operation` slug not in the cost table |
| 402 | `PaymentRequiredError` | (raised by the *operation* being estimated, not by `estimate` itself) |
---
## Page: Exports
_URL: https://pictograph.io/docs/api-reference/exports (markdown: api-reference/exports.md)_
_Section: API Reference_
An **export** is a ZIP of an annotated dataset in a chosen format,
optionally embedding the original image files. Export builds run
server-side (a few seconds for hundreds of images, longer for tens
of thousands).
```python
from pictograph import Client
client = Client()
```
## Formats
| `format` | Notes |
|---|---|
| `pictograph` (default) | Canonical Pictograph JSON — the wire format the SDK consumes |
| `coco` | COCO instance segmentation / object detection |
| `yolo` | YOLO darknet `.txt` files (one per image) |
| `cvat` | CVAT XML |
| `pascal_voc` | Pascal VOC XML (one per image) |
| `labelme` | LabelMe JSON (one per image) |
| `csv` | Flat CSV — bbox annotations only |
## create
Build a new export. Defaults to `wait=True`, which blocks until the
ZIP is ready.
```python
export = client.exports.create(
    "my-dataset",
    "for-yolov8",
    format="yolo",
    include_images=True,
    class_filter=["car", "truck"],  # None = all classes
    status_filter="complete",  # "all" / "complete" / "in_progress" / "new"
    wait=True,
    poll_interval=2.0,
    timeout=600.0,
)
print(export.id, export.status, export.image_count, export.annotation_count)
```
| Arg | Type | Default | Notes |
|---|---|---|---|
| `dataset_name` | `str` | required | |
| `name` | `str` | required | Unique within the dataset |
| `format` | `ExportFormat` | `"pictograph"` | See table above |
| `include_images` | `bool` | `True` | When `False`, ZIP contains only annotations |
| `class_filter` | `list[str] \| None` | `None` | Limit to these class names |
| `status_filter` | `str` | `"complete"` | Image status filter |
| `wait` | `bool` | `True` | Block until terminal status |
`wait=False` returns a `pending` / `processing` `Export`. Poll via
`get` or `wait_for_completion`.
## list / iter
```python
exports = client.exports.list(limit=20)
for e in client.exports.iter(page_size=50):
    print(e.dataset_name, e.name, e.status)
```
## get
```python
export = client.exports.get("my-dataset", "for-yolov8")
print(export.status, export.download_url)
```
## download
Stream the ZIP to a local file (chunked).
```python
from pathlib import Path
client.exports.download(
    "my-dataset", "for-yolov8",
    output_path=Path("./my-dataset.zip"),
)
```
The download URL is a signed URL valid for 60 minutes — generated
fresh on every call.
## wait_for_completion
If you used `wait=False`:
```python
export = client.exports.create("ds", "name", format="coco", wait=False)
# … later
export = client.exports.wait_for_completion("ds", "name", timeout=300.0)
```
## delete
```python
client.exports.delete("my-dataset", "for-yolov8")
```
Removes the export and the stored ZIP. Any download of the same export
that is still in progress will fail mid-stream.
## Class filtering
`class_filter` only includes annotations matching the given class names.
Images with no surviving annotations are still included if their
`status` matches `status_filter` — they get an empty annotation list.
Pass `class_filter=None` (default) to keep every annotation.
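The filtering rule above can be sketched in plain Python. This is a local illustration, not the server implementation: `apply_class_filter`, the image dicts, and the `class_name` key are hypothetical stand-ins for whatever the export builder uses internally.

```python
from typing import Optional, Sequence


def apply_class_filter(
    images: list[dict],
    class_filter: Optional[Sequence[str]],
) -> list[dict]:
    """Keep every image; drop annotations whose class is not in the filter."""
    if class_filter is None:
        # None keeps every annotation untouched.
        return [dict(img) for img in images]
    allowed = set(class_filter)
    return [
        {**img, "annotations": [a for a in img["annotations"] if a["class_name"] in allowed]}
        for img in images
    ]


# Hypothetical records: the second image has no "car" or "truck" annotations.
images = [
    {"filename": "a.jpg", "annotations": [{"class_name": "car"}, {"class_name": "person"}]},
    {"filename": "b.jpg", "annotations": [{"class_name": "person"}]},
]
filtered = apply_class_filter(images, ["car", "truck"])
```

Both images survive; `b.jpg` simply ends up with an empty annotation list, which matches the behaviour described above.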
## Common errors
| Status | Exception | Cause |
|---|---|---|
| 404 | `NotFoundError` | Dataset or export missing |
| 409 | `ConflictError` | Export name already exists in this dataset |
| 422 | `ValidationError` | Unknown `format` or `status_filter` |
| 408 | `PollTimeoutError` | `wait=True` timed out (export keeps building) |
---
## Page: Training
_URL: https://pictograph.io/docs/api-reference/training (markdown: api-reference/training.md)_
_Section: API Reference_
The training resource manages the lifecycle of a single run against a pre-built export. For the end-to-end "give me an ONNX model from this dataset" call, use [`train_pipeline`](/docs/workflows/train) instead.
```python
from pictograph import Client
client = Client()
```
## Pipelines
| `pipeline_type` | Output |
| --- | --- |
| `yolox` | Object detection (boxes) |
| `detectron2` | Instance segmentation (polygons + masks) |
| `sm_pytorch` | Semantic segmentation |
| `classification` | Image classification |
| `rfdetr_detection` | Object detection (RF-DETR) |
| `rfdetr_segmentation` | Instance segmentation (RF-DETR) |
## GPU tiers
| `gpu_type` | Approx. cost | Pick for |
| --- | --- | --- |
| `a10g` (default) | ~$0.30/hr | YOLOX, classification, RF-DETR-detection |
| `a100` | ~$2/hr | Detectron2, large RF-DETR, big batches |
| `h100` | ~$4/hr | Last resort — only when A100 OOMs |
## create
Spawn a run against an existing export.
```python
run = client.training.create(
    dataset_name="road-signs",
    export_name="road-signs-20260512-120000",
    pipeline_type="yolox",
    name="yolox-run-1",
    config={"epochs": 50},
    gpu_type="a10g",
    wait=True,
    poll_interval=5.0,
    timeout=7200.0,
)
```
| Arg | Type | Default | Notes |
| --- | --- | --- | --- |
| `dataset_name` | `str` | required | Source project |
| `export_name` | `str` | required | Pre-built export |
| `pipeline_type` | `PipelineType` | required | See table above |
| `name` | `str \| None` | auto | Defaults to `-run-` |
| `config` | `dict` | `{}` | `epochs`, `batch_size`, `learning_rate`, `image_size` |
| `gpu_type` | `GpuType` | `"a10g"` | |
| `wait` | `bool` | `True` | When `False`, returns immediately with `status="queued"` |
| `poll_interval` | `float` | `5.0` | Seconds between polls |
| `timeout` | `float` | `7200.0` | Max poll seconds (2 hours) |
Returns `TrainingRun`.
## list / iter
```python
runs = client.training.list(limit=20, status="running")
for run in client.training.iter(page_size=50):
    print(run.id, run.status, run.progress)
```
## get
```python
run = client.training.get("run-uuid")
print(run.status, run.progress, run.current_epoch, "/", run.total_epochs)
```
`status` is one of `{"pending", "queued", "running", "completed", "failed", "cancelled"}`.
## cancel
```python
client.training.cancel("run-uuid") # stops the worker, refunds remaining minutes
```
## wait_for_completion
If you created with `wait=False`, you can block later:
```python
run = client.training.wait_for_completion("run-uuid", timeout=3600.0)
if run.status == "completed":
    model = client.models.get(run.model_id)
```
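The `poll_interval` / `timeout` behaviour of `wait=True` and `wait_for_completion` can be pictured as a simple polling loop. The sketch below is not the SDK's implementation: `poll_until_terminal` and `fetch_status` are hypothetical names, the set of terminal statuses is an assumption, and the built-in `TimeoutError` stands in for `PollTimeoutError`.

```python
import time
from typing import Callable

# Assumption: these three statuses are the terminal ones.
TERMINAL = {"completed", "failed", "cancelled"}


def poll_until_terminal(
    fetch_status: Callable[[], str],
    poll_interval: float = 5.0,
    timeout: float = 7200.0,
) -> str:
    """Call fetch_status until it returns a terminal status or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL:
            return status
        if time.monotonic() >= deadline:
            # Only the polling stops here; the run itself keeps going server-side.
            raise TimeoutError(f"still {status!r} after {timeout} s")
        time.sleep(poll_interval)


# Stand-in for repeated `client.training.get(run_id).status` calls.
_fake_statuses = iter(["queued", "running", "running", "completed"])
final = poll_until_terminal(lambda: next(_fake_statuses), poll_interval=0.0, timeout=5.0)
```

Using a monotonic clock for the deadline keeps the loop robust against wall-clock adjustments during long runs.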
## Minimum dataset size
Training requires **at least 5 images** matching the export's `status_filter` so the worker can split into train / val / test. Below that, training fails with a validation error.
```python
ds = client.datasets.get("my-dataset")
assert ds.completed_image_count >= 5
```
## Cost estimation
```python
estimate = client.credits.estimate("training_a10g_per_minute", quantity=30)
if not estimate.sufficient:
    raise RuntimeError(f"Need {estimate.total_credits}, have {estimate.credits_remaining}")
```
Refunds for cancelled or under-budget runs appear automatically as positive ledger entries (`training_refund_`).
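For a quick offline budget check, the arithmetic behind a per-minute estimate is just rate times minutes. The sketch below hard-codes the per-minute training rates from the Credits cost table; `training_credits` is a hypothetical helper, and the server-side table remains the source of truth if rates change.

```python
# Per-minute rates as listed in the Credits cost table (credits per GPU minute).
CREDITS_PER_MINUTE = {"a10g": 10, "a100": 60, "h100": 120}


def training_credits(gpu_type: str, minutes: int) -> int:
    """Credits needed to reserve `minutes` of training on `gpu_type`."""
    return CREDITS_PER_MINUTE[gpu_type] * minutes


budget = training_credits("a10g", 30)  # 10 cr/min * 30 min = 300 credits
```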
## Errors
| Status | Exception | Cause |
| --- | --- | --- |
| 402 | `PaymentRequiredError` | Insufficient credits |
| 404 | `NotFoundError` | Dataset or export missing |
| 422 | `ValidationError` | Pipeline / GPU invalid, dataset too small |
| 408 | `PollTimeoutError` | `wait=True` exceeded `timeout` (run keeps going) |
## See also
- [`train_pipeline`](/docs/workflows/train) — end-to-end workflow (recommended starting point)
- [Models](/docs/api-reference/models) — download trained ONNX weights
- [Credits](/docs/api-reference/credits) — `estimate("training__per_minute")`
---
## Page: API keys
_URL: https://pictograph.io/docs/api-reference/api-keys (markdown: api-reference/api-keys.md)_
_Section: API Reference_
Use these endpoints to issue, list, update, and revoke API keys for
your organization. The full key string (`pk_live_…`) is returned **only
once** on creation — store it immediately.
```python
from pictograph import Client
client = Client()
```
## list
```python
keys = client.api_keys.list() # active org
keys = client.api_keys.list(organization_id="org-uuid") # explicit org
for k in keys:
    print(k.name, k.role, k.key_prefix, k.last_used_at)
```
Returns `list[ApiKey]` — metadata only (no full key strings).
## create
```python
created = client.api_keys.create(
    organization_id="org-uuid",
    name="ci-pipeline",
    role="member",  # viewer / member / admin / owner
    expires_at=None,  # ISO datetime or None for no expiry
)
print("Save this — it is shown once:", created.full_key)
```
Returns `CreatedApiKey` — `id`, `name`, `role`, `key_prefix`,
`full_key` (the only call that returns it).
| Arg | Type | Default | Notes |
|---|---|---|---|
| `organization_id` | `str` | required | |
| `name` | `str` | required | Human label, not unique |
| `role` | `ApiKeyRole` | required | `"viewer"` / `"member"` / `"admin"` / `"owner"` |
| `expires_at` | `datetime \| str \| None` | `None` | ISO 8601 or `None` for no expiry |
## get
```python
key = client.api_keys.get("key-uuid")
print(key.role, key.created_at, key.last_used_at)
```
## update
Patch the key's name, role, or expiry. The full key string is **not**
rotated by update — issue a new key + delete the old one to rotate.
```python
client.api_keys.update("key-uuid", name="renamed", role="admin")
client.api_keys.update("key-uuid", expires_at="2027-01-01T00:00:00Z")
```
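Because `update` never rotates the key string, rotation is a create-then-delete sequence. A minimal sketch, assuming a configured `client` and using only the `create` and `delete` calls documented on this page; `rotate_api_key` itself is a hypothetical helper, not part of the SDK.

```python
def rotate_api_key(client, organization_id: str, old_key_id: str, name: str, role: str) -> str:
    """Issue a replacement key, then revoke the old one. Returns the new full key."""
    created = client.api_keys.create(
        organization_id=organization_id,
        name=name,
        role=role,
        expires_at=None,
    )
    # Revoke only after the new key exists, so there is never a window without a valid key.
    client.api_keys.delete(old_key_id)
    # The full key is shown once: hand it to your secret store immediately.
    return created.full_key
```

Creating before deleting means a failure in `create` leaves the old key intact.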
## delete
Revokes the key immediately. In-flight requests using the key fail
with `401 AuthError` after revocation propagates (≤ 1 second).
```python
client.api_keys.delete("key-uuid")
```
## Role hierarchy
Keys can only manage keys of equal or lower role. An `admin` key cannot
create an `owner` key. Owner-tier ops require an `owner` key.
| Caller role | Can create |
|---|---|
| `viewer` | nothing — these endpoints all require `admin`+ |
| `member` | nothing |
| `admin` | `viewer`, `member`, `admin` |
| `owner` | `viewer`, `member`, `admin`, `owner` |
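The table above can be encoded as a small lookup, which is handy for failing fast in provisioning scripts before the API returns a 403. This is an illustrative sketch; `ROLE_ORDER` and `creatable_roles` are hypothetical names, and the backend remains the authority on permissions.

```python
ROLE_ORDER = ["viewer", "member", "admin", "owner"]  # lowest to highest


def creatable_roles(caller_role: str) -> list[str]:
    """Roles a key with `caller_role` may assign when creating keys."""
    rank = ROLE_ORDER.index(caller_role)
    if rank < ROLE_ORDER.index("admin"):
        # Key management requires admin or higher.
        return []
    # Admins and owners may create keys up to and including their own role.
    return ROLE_ORDER[: rank + 1]
```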
## Web app vs SDK
- **Web app** (`app.pictograph.io → Settings → API Keys`) — visual UI,
the most common path for one-off keys.
- **SDK / CLI** — for programmatic key issuance (CI provisioning,
multi-org tools, automated rotation).
The SDK enforces the same role hierarchy as the web UI.
## Common errors
| Status | Exception | Cause |
|---|---|---|
| 403 | `ForbiddenError` | Caller's role too low for the requested action |
| 404 | `NotFoundError` | `key_id` doesn't exist or belongs to another org |
| 422 | `ValidationError` | Invalid role string, malformed `expires_at` |
---
## Page: Batch
_URL: https://pictograph.io/docs/api-reference/batch (markdown: api-reference/batch.md)_
_Section: API Reference_
Bulk image operations on a single dataset. Each call accepts a list of image IDs and returns a `BatchResult` with per-item failure context — partial success does not raise.
```python
from pictograph import Client
client = Client()
```
## move
Move images to a different virtual folder within the same dataset.
```python
result = client.batch.move(
    dataset_name="my-dataset",
    image_ids=["img-1", "img-2", "img-3"],
    target_folder_path="/sorted/cars",
)
print(result.succeeded, result.failed_count, result.failures)
```
Storage paths are immutable — "move" updates `virtual_folder_path`; the underlying image bytes don't move.
## copy
Copy images to a different folder. The underlying bytes are copied server-side, so no image data passes through the client.
```python
result = client.batch.copy(
    dataset_name="my-dataset",
    image_ids=["img-1", "img-2"],
    target_folder_path="/cars-copy",
    duplicate_handling="rename",  # collision policy in the destination
    copy_annotations=False,  # destination images start without annotations
)
```
| Arg | Type | Default | Notes |
| --- | --- | --- | --- |
| `dataset_name` | `str` | required | |
| `image_ids` | `Sequence[str]` | required | |
| `target_folder_path` | `str` | `"/"` | Destination virtual folder |
| `duplicate_handling` | `Literal["rename", "skip", "overwrite"]` | `"rename"` | How to handle filename collisions |
| `copy_annotations` | `bool` | `False` | When `True`, copy `annotations_json` and `status` too |
## delete
Soft-archive by default; permanent on request.
```python
result = client.batch.delete(
    dataset_name="my-dataset",
    image_ids=["img-1", "img-2", "img-3"],
    permanent=False,  # archive (recoverable)
)
```
`permanent=True` purges the stored bytes — irreversible. Requires `admin`+ role.
## update
Update metadata fields on a batch of images. Pass exactly the fields you want to change — `None` is omitted from the request.
```python
result = client.batch.update(
    dataset_name="my-dataset",
    image_ids=["img-1", "img-2"],
    status="complete",
    is_archived=False,
)
```
| Arg | Type | Default | Notes |
| --- | --- | --- | --- |
| `dataset_name` | `str` | required | |
| `image_ids` | `Sequence[str]` | required | |
| `status` | `str \| None` | `None` | `"new"`, `"annotate"`, `"review"`, `"complete"` |
| `display_name` | `str \| None` | `None` | Display override |
| `is_archived` | `bool \| None` | `None` | `True` archives; `False` restores |
Raises `ValidationError` if every field is `None` (the update would be a no-op).
## BatchResult
| Attribute | Type | Notes |
| --- | --- | --- |
| `succeeded` | `list[str]` | IDs the op completed for |
| `failed_count` | `int` | `len(failures)` |
| `failures` | `list[BatchFailure]` | `{image_id, reason}` per failure |
| `success` | `bool` (property) | `failed_count == 0` |
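Since partial success does not raise, callers usually inspect the result and retry or report the failures. The dataclasses below are stand-ins that mirror the table so the example runs without the SDK; the real classes may differ in detail (for instance, `failed_count` is modeled here as a property), and the failure reason string is made up.

```python
from dataclasses import dataclass, field


@dataclass
class BatchFailure:
    image_id: str
    reason: str


@dataclass
class BatchResult:
    succeeded: list[str] = field(default_factory=list)
    failures: list[BatchFailure] = field(default_factory=list)

    @property
    def failed_count(self) -> int:
        return len(self.failures)

    @property
    def success(self) -> bool:
        return self.failed_count == 0


# Hypothetical outcome of a three-image batch where one ID was rejected.
result = BatchResult(
    succeeded=["img-1", "img-2"],
    failures=[BatchFailure("img-3", "not found")],
)
if not result.success:
    # Collect the IDs that need attention; the batch call itself did not raise.
    retry_ids = [f.image_id for f in result.failures]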
## Errors
| Status | Exception | Cause |
| --- | --- | --- |
| 403 | `ForbiddenError` | `permanent=True` requires `admin`+ role |
| 404 | `NotFoundError` | Dataset missing, or every `image_id` invalid |
| 422 | `ValidationError` | Invalid field value or empty update |
## Why batch over loops
Reorganizing 10K images is one round-trip with `batch.move()` versus 10K with `images.update()`. Bulk operations are implemented server-side as single statements, not loops.
---
## Page: Search
_URL: https://pictograph.io/docs/api-reference/search (markdown: api-reference/search.md)_
_Section: API Reference_
Two search modes:
1. **Visual similarity** — `by_similarity()` — SigLIP2 (1152-dim)
embeddings + pgvector HNSW index.
2. **Tag-based** — `by_tag()` — JSONB containment over the
auto-classified `image_auto_tags` field (objects / scenes / attributes).
Both auto-tag and embedding pipelines run on every upload (zero API
cost; T4 GPU). No setup required.
```python
from pictograph import Client
client = Client()
```
## by_similarity
Find images visually similar to a reference image. Scope is the
reference image's dataset + folder unless overridden.
```python
results = client.search.by_similarity(
    image_id="img-uuid-1",
    threshold=0.6,  # cosine similarity floor (0–1)
    limit=50,
    folder_path=None,  # None = inherit; "/" = whole dataset
)
for r in results:
    print(r.image_id, r.filename, f"{r.similarity:.3f}")
```
| Arg | Type | Default | Notes |
|---|---|---|---|
| `image_id` | `str` | required | UUID of the reference image |
| `threshold` | `float` | `0.6` | Minimum cosine similarity (`0.6` ≈ "visually related") |
| `limit` | `int` | `50` | Backend cap: 500 |
| `folder_path` | `str \| None` | `None` | Override folder scope |
Returns `list[SimilarImage]`, sorted by descending similarity. The
source image is excluded from results.
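To make the `threshold` and ordering semantics concrete, here is a brute-force sketch of cosine-similarity filtering on toy 2-dimensional vectors. It is only an illustration: the service uses 1152-dimensional SigLIP2 embeddings behind a pgvector HNSW index, and `cosine_similarity` / `similar` are hypothetical helper names.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def similar(reference, candidates, threshold=0.6, limit=50):
    """Return (name, similarity) pairs at or above threshold, best first."""
    scored = [(name, cosine_similarity(reference, vec)) for name, vec in candidates.items()]
    kept = [(name, score) for name, score in scored if score >= threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:limit]


reference = [1.0, 0.0]
candidates = {"near": [0.9, 0.1], "far": [0.0, 1.0], "mid": [1.0, 1.0]}
hits = similar(reference, candidates)  # "far" is orthogonal and falls below 0.6
```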
## by_tag
Find images with auto-tags matching the given filters. Pass at least
one of `objects` / `scenes` / `attributes` (an empty filter returns
nothing rather than everything — semantically clearer for agents).
```python
results = client.search.by_tag(
    objects=["car", "truck"],  # match ANY object tag
    scenes=["outdoor"],  # match ANY scene tag
    attributes=["blurry"],  # match ANY attribute tag
    dataset_name="my-dataset",  # restrict scope; None = whole org
    limit=100,
)
for r in results:
    print(r.image_id, r.tags["objects"])
```
| Arg | Type | Default | Notes |
|---|---|---|---|
| `objects` | `Sequence[str] \| None` | `None` | At least one of objects/scenes/attributes required |
| `scenes` | `Sequence[str] \| None` | `None` | |
| `attributes` | `Sequence[str] \| None` | `None` | |
| `dataset_name` | `str \| None` | `None` | Org-wide search if `None` |
| `limit` | `int` | `50` | Backend cap: 500 |
Returns `list[TaggedImage]`. Within a category, tags are OR'd; across
categories they are AND'd:
- `objects=["car","truck"]` → "car OR truck"
- `objects=["car"], scenes=["outdoor"]` → "car AND outdoor"
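The OR-within, AND-across rule can be expressed as a small predicate over one image's tags. This is a local sketch of the matching logic only (the service evaluates it with JSONB containment); `matches` is a hypothetical helper, and it returns `False` for an empty filter even though the errors table below lists a `ValidationError` when all three filters are `None`.

```python
def matches(tags: dict, objects=None, scenes=None, attributes=None) -> bool:
    """True if the image's tags satisfy every supplied category filter."""
    filters = {"objects": objects, "scenes": scenes, "attributes": attributes}
    active = {category: set(wanted) for category, wanted in filters.items() if wanted}
    if not active:
        return False  # an empty filter matches nothing
    # AND across categories; within a category any overlapping tag is enough (OR).
    return all(wanted & set(tags.get(category, [])) for category, wanted in active.items())


image_tags = {"objects": ["car"], "scenes": ["indoor"], "attributes": []}
```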
## Auto-tag taxonomy
The SigLIP2 classifier picks from ~200 curated labels per category.
Common ones:
- **objects**: car, truck, person, bicycle, dog, sign, building, etc.
- **scenes**: outdoor, indoor, urban, rural, daytime, nighttime, etc.
- **attributes**: blurry, dark, bright, high-contrast, low-light, etc.
The full taxonomy ships with the SigLIP2 service prompts; tags not in
the curated list won't be assigned.
## Cost
Search is **free**. Embeddings + auto-tags are computed once per image
on upload (T4 GPU, zero API cost) and cached.
## Common errors
| Status | Exception | Cause |
|---|---|---|
| 404 | `NotFoundError` | `image_id` (similarity) or `dataset_name` (tag) missing |
| 422 | `ValidationError` | `by_tag` called with all three filters None |
---
## Page: Connectors
_URL: https://pictograph.io/docs/api-reference/connectors (markdown: api-reference/connectors.md)_
_Section: API Reference_
Import remote datasets in two steps: validate the source API key →
kick off the import. The import runs as an async job; the SDK polls
until terminal status by default.
```python
from pictograph import Client
client = Client()
```
## Supported providers
| `provider` | Source | Notes |
|---|---|---|
| `v7` | V7 Darwin | Polygon paths, bboxes, polylines, keypoints, tags |
| `roboflow` | Roboflow | COCO export → Pictograph JSON |
## validate
Verify the source API key and list available remote datasets. No quota
consumed; the API key is sent only on this call.
```python
result = client.connectors.validate(
    provider="v7",
    api_key="v7_api_token_…",
)
if result.valid:
    for ds in result.datasets:
        print(ds.id, ds.name, ds.image_count)
else:
    print("invalid:", result.error)
```
Returns `ValidationResult` — inspect `.valid` first; `.datasets` is
populated only on success.
## check_limits
Pre-flight tier-cap check before kicking off an import.
```python
check = client.connectors.check_limits(
    total_images=12500,
    estimated_size_bytes=4_000_000_000,  # 4 GB
)
if not check.allowed:
    print("blocked by:", check.exceeded)  # "images" / "storage" / "both"
```
## import_
Kick off the import. Trailing underscore avoids shadowing the Python
`import` keyword.
```python
job = client.connectors.import_(
    provider="v7",
    api_key="v7_api_token_…",
    datasets=[
        {"id": "ds_abc", "name": "Road signs", "slug": "road-signs"},
        # OR pass RemoteDataset instances from validate():
        # *result.datasets[:2],
    ],
    wait=True,
    poll_interval=3.0,
    timeout=3600.0,  # 1h default
)
print(job.import_id, job.status)
for ds in job.datasets:
    print(ds.dataset_name, ds.images_imported, "/", ds.total_images)
```
| Arg | Type | Default | Notes |
|---|---|---|---|
| `provider` | `ConnectorProvider` | required | `"v7"` / `"roboflow"` |
| `api_key` | `str` | required | Sent only to fetch source data |
| `datasets` | `Sequence[RemoteDataset \| dict]` | required | `RemoteDataset` instances or raw dicts |
| `wait` | `bool` | `True` | Poll until terminal |
| `poll_interval` | `float` | `3.0` | seconds |
| `timeout` | `float` | `3600.0` | Max poll seconds (V7 large exports take 30+ min) |
Returns `ImportJob`.
## get_import / wait_for_import / cancel_import
```python
job = client.connectors.get_import(import_id)
job = client.connectors.wait_for_import(import_id, timeout=600.0)
job = client.connectors.cancel_import(import_id)
```
## Annotation conversion
| V7 / COCO | Pictograph |
|---|---|
| V7 `polygon.paths` | `polygon.paths` (passthrough) |
| V7 `bounding_box` (no polygon) | `bounding_box` |
| V7 `line.path` | `polyline.path` |
| V7 `keypoint` | `keypoint` |
| V7 `tag` / `ellipse` / `mask` | skipped (no Pictograph equivalent) |
| COCO `segmentation` (flat array) | `polygon.paths` (paired into points) |
| COCO `bbox` (no segmentation) | `bounding_box` |
| COCO `keypoints` triplets | `keypoint` (skips `v=0`) |
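Two of the COCO rows are simple reshaping steps and can be sketched directly. The helpers below are illustrative, not the importer's code: `pair_points` and `visible_keypoints` are hypothetical names, and representing a point as an `(x, y)` tuple is an assumption about the output shape.

```python
def pair_points(flat: list[float]) -> list[tuple[float, float]]:
    """COCO flat segmentation [x1, y1, x2, y2, ...] -> [(x1, y1), (x2, y2), ...]."""
    if len(flat) % 2:
        raise ValueError("flat segmentation needs an even number of values")
    return list(zip(flat[0::2], flat[1::2]))


def visible_keypoints(triplets: list[float]) -> list[tuple[float, float]]:
    """COCO keypoints [x, y, v, ...] -> [(x, y), ...], skipping entries with v == 0."""
    points = zip(triplets[0::3], triplets[1::3], triplets[2::3])
    return [(x, y) for x, y, v in points if v != 0]


polygon = pair_points([10, 20, 30, 40, 50, 60])
keypoints = visible_keypoints([1, 2, 2, 0, 0, 0, 5, 6, 1])
```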
## Tier caps
Imports are charged against your storage + image-count tier caps. See
[Credits](/docs/api-reference/credits) and your plan in the web app.
## Common errors
| Status | Exception | Cause |
|---|---|---|
| 401 | `AuthError` | Source provider API key rejected |
| 402 | `PaymentRequiredError` | Tier cap exceeded |
| 404 | `NotFoundError` | `import_id` missing |
| 408 | `PollTimeoutError` | `wait=True` exceeded `timeout` (job keeps running) |
| 422 | `ValidationError` | Invalid provider, empty datasets list |
---
## Page: Video
_URL: https://pictograph.io/docs/api-reference/video (markdown: api-reference/video.md)_
_Section: API Reference_
Pictograph annotates **frames**, not videos. The video resource handles
upload and frame extraction; once extracted, frames are regular images
you annotate with the standard SAM3 / annotation workflows.
```python
from pictograph import Client
client = Client()
```
## upload
Three-step upload (signed URL → PUT → register), same pattern as
images.
```python
from pathlib import Path
info = client.video.upload(
    file_path=Path("./recording.mp4"),
    dataset_id="proj-uuid",
    folder_path="/raw-footage",
)
print(info.gcs_path, info.video_id)
```
| Arg | Type | Default | Notes |
|---|---|---|---|
| `file_path` | `str \| Path` | required | Local video file |
| `dataset_id` | `str` | required | Destination dataset |
| `folder_path` | `str` | `"/"` | Virtual folder |
Supported codecs: anything ffmpeg can demux (H.264, H.265, VP9, AV1, etc.).
## probe
Inspect a video's metadata without extracting frames. Pass the
`gcs_path` returned by `upload()`.
```python
meta = client.video.probe(info.gcs_path)
print(meta.duration_seconds, meta.fps, meta.width, meta.height)
print(meta.codec, meta.frame_count)
```
Returns `VideoMetadata` from a server-side `ffprobe` invocation.
## extract_frames
Extract frames from a video into the destination dataset as images.
```python
job = client.video.extract_frames(
    gcs_path=info.gcs_path,
    dataset_id="proj-uuid",
    folder_path="/raw-footage/frames",
    fps=2.0,  # extract 2 frames per second of source video
    start_seconds=10.0,
    end_seconds=120.0,
    max_frames=200,  # cap on output count
    wait=True,
    poll_interval=5.0,
    timeout=1800.0,
)
print(job.status, job.frames_extracted)
```
Frames are written as `{video_basename}_{frame_index:06d}.jpg` in the
target folder. Each becomes a regular `Image` row — ready for
annotation, search, training.
| Arg | Type | Default | Notes |
|---|---|---|---|
| `gcs_path` | `str` | required | Source video |
| `dataset_id` | `str` | required | Destination dataset |
| `folder_path` | `str` | `"/"` | Virtual folder for the extracted frames |
| `fps` | `float` | `1.0` | Frames per source second |
| `start_seconds` | `float \| None` | `None` | Skip the first N seconds |
| `end_seconds` | `float \| None` | `None` | Stop at second N |
| `max_frames` | `int \| None` | `None` | Cap on output count |
| `wait` | `bool` | `True` | Poll until terminal |
Lower `fps` values yield fewer frames and less storage: the default `fps=1.0` keeps one frame per second, while `fps=30.0` extracts every frame of a
30 fps source. Frame extraction does **not** consume credits — you pay
only for the storage of the resulting images.
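Since storage is the only cost, it can help to estimate how many frames a job will produce before starting it. The sketch below is a rough approximation under stated assumptions: `estimate_frame_count` and `frame_filename` are hypothetical helpers, the server's exact rounding at clip boundaries may differ, and `video_basename` is assumed to mean the file name without its extension.

```python
import math
from pathlib import Path
from typing import Optional


def estimate_frame_count(
    duration_seconds: float,
    fps: float = 1.0,
    start_seconds: Optional[float] = None,
    end_seconds: Optional[float] = None,
    max_frames: Optional[int] = None,
) -> int:
    """Approximate output count: window length times fps, capped by max_frames."""
    start = start_seconds or 0.0
    end = duration_seconds if end_seconds is None else min(end_seconds, duration_seconds)
    count = max(0, math.floor((end - start) * fps))
    return count if max_frames is None else min(count, max_frames)


def frame_filename(video_path: str, frame_index: int) -> str:
    """Apply the documented {video_basename}_{frame_index:06d}.jpg pattern."""
    return f"{Path(video_path).stem}_{frame_index:06d}.jpg"


# Same settings as the extract_frames example, assuming a 300-second source video.
expected = estimate_frame_count(
    300.0, fps=2.0, start_seconds=10.0, end_seconds=120.0, max_frames=200
)
```

With these settings the 110-second window at 2 fps would give 220 frames, so the `max_frames=200` cap is what determines the output size.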
## get_extraction / wait_for_extraction
```python
job = client.video.get_extraction(job_id)
job = client.video.wait_for_extraction(job_id, timeout=600.0)
```
## Common errors
| Status | Exception | Cause |
|---|---|---|
| 404 | `NotFoundError` | `gcs_path` missing or `dataset_id` invalid |
| 415 | `ValidationError` | Unsupported codec |
| 408 | `PollTimeoutError` | Long videos may exceed default `timeout` |
| 413 | `ApiError` | Video file exceeds upload limit (10 GB) |
---
## Page: Organizations
_URL: https://pictograph.io/docs/api-reference/organizations (markdown: api-reference/organizations.md)_
_Section: API Reference_
The Developer API key carries an organization scope. Every endpoint
operates on that org — there is no cross-org access from a single key.
Use these endpoints to read org metadata, manage members, and handle
invites.
```python
from pictograph import Client
client = Client()
```
## me
Fetch the active organization (the one the API key belongs to).
```python
org = client.organizations.me()
print(org.id, org.name, org.subscription_tier, org.member_count)
```
Returns `Organization` with tier (`free` / `core` / `pro` / `enterprise`),
billing email, monthly credit allowance, etc.
## list_members
```python
for m in client.organizations.list_members():
    print(m.email, m.role, m.joined_at)
```
Returns `list[OrganizationMember]`.
## update_member_role
Promote / demote a member. `member_id` is the row's UUID, **not** the
underlying user UUID.
```python
client.organizations.update_member_role(
    member_id="member-uuid",
    role="admin",  # viewer / member / admin / owner
)
```
Role hierarchy: callers may set roles ≤ their own. An `admin` cannot
promote anyone to `owner`.
## remove_member
```python
client.organizations.remove_member(member_id="member-uuid")
```
The user's account is preserved — only the org membership is revoked.
They can be re-invited.
## list_invites
```python
invites = client.organizations.list_invites(status="pending")
for inv in invites:
    print(inv.email, inv.role, inv.expires_at, inv.invite_url)
```
Filter by `status` ∈ `{"pending", "accepted", "revoked", "expired"}`
or `None` for all.
## invite
```python
invite = client.organizations.invite(
    email="new@example.com",
    role="member",
    expires_in_days=14,  # optional; defaults to 7
)
print("Send this link:", invite.invite_url)
```
Returns `OrganizationInvite` with a one-time `invite_url`. Email
delivery is automatic when SMTP is configured server-side; if you
disabled email, share the URL manually.
## revoke_invite
```python
client.organizations.revoke_invite(invite_id="invite-uuid")
```
Sets the invite's `status` to `"revoked"` immediately. The URL stops
working.
## Permission matrix
| Op | Min role |
|---|---|
| `me`, `list_members`, `list_invites` | `viewer` |
| `update_member_role`, `invite`, `revoke_invite` | `admin` |
| `remove_member` | `admin` (can't remove `owner` unless caller is `owner`) |
## Common errors
| Status | Exception | Cause |
|---|---|---|
| 403 | `ForbiddenError` | Caller's role too low for the action |
| 404 | `NotFoundError` | `member_id` / `invite_id` doesn't exist (or belongs to another org) |
| 409 | `ConflictError` | `invite` to an email that's already a member |
| 422 | `ValidationError` | Invalid role, malformed email |
---
## Page: Projects
_URL: https://pictograph.io/docs/api-reference/projects (markdown: api-reference/projects.md)_
_Section: API Reference_
The **`projects`** resource is the write side of the
[`datasets`](/docs/api-reference/datasets) resource. Same underlying
entity; the SDK aliases the read path to "datasets" because that's the
word users say.
Use this page for: creating a project, editing its class set,
deleting it.
```python
from pictograph import Client
client = Client()
```
## list / iter
```python
projects = client.projects.list(limit=50)
for p in client.projects.iter(page_size=100):
    print(p.name, len(p.classes))
```
`Project` includes the embedded `project_config` (classes + annotation
types), unlike `Dataset` which is read-optimized for the list view.
## get
```python
project = client.projects.get("my-dataset")
for cls in project.classes:
    print(cls.name, cls.type, cls.color)
```
## create
```python
from pictograph import ProjectClass
project = client.projects.create(
    "new-dataset",
    description="Road sign detection training set",
    annotation_types=["bbox", "polygon"],
    classes=[
        ProjectClass(name="stop_sign", type="bbox", color="#ff0000"),
        ProjectClass(name="yield", type="bbox", color="#ffff00"),
    ],
)
```
| Arg | Type | Default | Notes |
|---|---|---|---|
| `name` | `str` | required | Unique within the org |
| `description` | `str \| None` | `None` | |
| `annotation_types` | `Sequence[str]` | `["bbox"]` | Allowed types for this project |
| `classes` | `Sequence[ProjectClass \| dict]` | `[]` | Class definitions; can be added later via `update` |
Returns `Project`.
## update
Patch project metadata or its config (classes / annotation types).
Pass only the fields you're changing.
```python
client.projects.update(
    "my-dataset",
    description="Updated description",
    annotation_types=["bbox", "polygon", "polyline"],
)
# Add / remove / recolor classes:
client.projects.update(
    "my-dataset",
    classes=[
        ProjectClass(name="stop_sign", type="bbox", color="#ff0000"),
        ProjectClass(name="yield", type="bbox", color="#ffaa00"),  # color change
        ProjectClass(name="speed_limit", type="bbox", color="#00aaff"),  # new
        # 'merge' class omitted → removed (also removes any annotations using it)
    ],
)
```
Class updates are atomic — the entire `classes` list is replaced.
Removing a class **also removes every annotation using that class
name** across the dataset. Be deliberate.
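Because the whole list is replaced, a cheap safeguard is to diff the current class names against the new list before calling `update`. The helper below is a hypothetical sketch; in practice `current` would come from something like `[c.name for c in client.projects.get("my-dataset").classes]`.

```python
def classes_removed_by_update(current_names: list[str], new_names: list[str]) -> list[str]:
    """Class names present now but missing from the replacement list."""
    return sorted(set(current_names) - set(new_names))


current = ["stop_sign", "yield", "merge"]
proposed = ["stop_sign", "yield", "speed_limit"]
removed = classes_removed_by_update(current, proposed)
if removed:
    # Annotations using these classes would be deleted by the update.
    print("update would remove:", removed)
```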
## delete
```python
result = client.projects.delete("my-dataset")
print(result["images_deleted"], result["annotations_deleted"])
```
Permanent. Removes:
- The project and its config
- Every image (and the underlying stored bytes)
- Every export tied to the project
Models trained from this project are **not** deleted (they're useful
even after the source dataset is gone).
Requires `admin`+ role.
## ProjectClass shape
```python
{
    "name": "stop_sign",
    "type": "bbox",  # "bbox" / "polygon" / "polyline" / "keypoint"
    "color": "#ff0000",  # hex color for UI rendering (any valid CSS color)
}
```
The class name must be unique within the project's class list — saves
fail with `ValidationError` if two classes share a name.
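A client-side check can catch duplicate names before the request is sent. This is an optional convenience sketch operating on plain dicts in the shape shown above; `duplicate_class_names` is a hypothetical helper, and the server-side validation still applies.

```python
from collections import Counter


def duplicate_class_names(classes: list[dict]) -> list[str]:
    """Names that appear more than once in a class list."""
    counts = Counter(cls["name"] for cls in classes)
    return sorted(name for name, n in counts.items() if n > 1)


classes = [
    {"name": "stop_sign", "type": "bbox", "color": "#ff0000"},
    {"name": "yield", "type": "bbox", "color": "#ffff00"},
    {"name": "stop_sign", "type": "polygon", "color": "#aa0000"},
]
dupes = duplicate_class_names(classes)
```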
## Common errors
| Status | Exception | Cause |
|---|---|---|
| 403 | `ForbiddenError` | `delete` requires `admin`+ |
| 404 | `NotFoundError` | Project name doesn't exist |
| 409 | `ConflictError` | `create` with a duplicate name |
| 422 | `ValidationError` | Duplicate class names, invalid annotation type |
---
## Page: Agent tool registry
_URL: https://pictograph.io/docs/api-reference/tools (markdown: api-reference/tools.md)_
_Section: API Reference_
`GET /api/v1/developer/tools.json` serves the agent tool registry as
a JSON Schema array. Dynamic-discovery agent stacks (Vercel AI SDK,
LangChain, raw OpenAI / Anthropic SDKs without the bundled adapters)
fetch this once and have everything they need.
The registry is the **single source of truth** — the Python SDK's
`Toolkit.as_anthropic_tools()` / `as_openai_tools()` / `as_json_schema()`
all derive from it; the backend snapshot is regenerated on every SDK
release via a CI parity check.
## Endpoints
| URL | Notes |
|---|---|
| `/api/v1/developer/tools` | Trailing-slash-tolerant. Returns the full payload. |
| `/api/v1/developer/tools.json` | Same content. Matches the conventional `*.json` URL. |
## Auth
Standard developer-API auth — pass `X-API-Key`. Any role works
(read-only).
```bash
curl -H "X-API-Key: pk_live_…" \
https://api.pictograph.io/api/v1/developer/tools.json
```
## Response shape
```json
{
  "tools": [
    {
      "name": "upload_dataset_from_folder",
      "description": "Use when the user asks to upload a folder of images …",
      "input_schema": {
        "type": "object",
        "properties": {
          "dataset_name": { "type": "string", "description": "…" },
          "folder": { "type": "string", "description": "…" }
          // … remaining fields
        },
        "required": ["dataset_name", "folder"],
        "additionalProperties": false
      },
      "required_role": "member",
      "credit_cost": 0,
      "idempotent": false
    }
    // … 27 more entries
  ],
  "version": "1.0.0",
  "count": 28,
  "generated_at": "2026-04-19T…Z"
}
```
## Tool metadata
| Field | Notes |
|---|---|
| `name` | Snake-case identifier. Stable across SDK versions. |
| `description` | Anthropic "use when X" framing. Agents read this to choose between tools. |
| `input_schema` | Pydantic-generated JSON Schema with `extra: forbid`. |
| `required_role` | Minimum org role on the calling API key. Backend re-enforces. |
| `credit_cost` | Approximate cost (0 for read-only / free ops). Agents may gate. |
| `idempotent` | When `true`, agents may safely retry on transient failures. |
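An agent stack that fetches the registry can use these fields to pre-filter what it offers the model. The sketch below works on a payload dict in the response shape shown above; `usable_tools` is a hypothetical helper, the role ordering is taken from the API keys page, and the `list_datasets` metadata values are made up for illustration.

```python
ROLE_ORDER = ["viewer", "member", "admin", "owner"]  # lowest to highest


def usable_tools(payload: dict, key_role: str, max_credit_cost: int = 0) -> list[str]:
    """Names of tools the key's role may call within the given credit cost."""
    rank = ROLE_ORDER.index(key_role)
    return [
        tool["name"]
        for tool in payload["tools"]
        if ROLE_ORDER.index(tool["required_role"]) <= rank
        and tool["credit_cost"] <= max_credit_cost
    ]


# Trimmed payload; the first entry's metadata is illustrative, not from the registry.
payload = {
    "tools": [
        {"name": "list_datasets", "required_role": "viewer", "credit_cost": 0, "idempotent": True},
        {"name": "upload_dataset_from_folder", "required_role": "member", "credit_cost": 0, "idempotent": False},
    ]
}
retry_safe = [tool["name"] for tool in payload["tools"] if tool["idempotent"]]
```

The backend re-enforces `required_role`, so this filter only reduces wasted calls; it is not a security boundary.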
## Tool list (v1.0.0)
28 tools across 11 categories. See the [agents overview](/docs/agents)
for details and the dispatch pattern.
| Category | Tools |
|---|---|
| Workflows | `upload_dataset_from_folder`, `auto_annotate_dataset`, `train_pipeline`, `full_pipeline` |
| Datasets | `list_datasets`, `get_dataset`, `create_dataset`, `delete_dataset` |
| Images | `upload_image`, `delete_image` |
| Annotations | `get_annotations`, `save_annotations` |
| Auto-annotate | `auto_annotate_point`, `auto_annotate_box`, `auto_annotate_text` |
| Search | `search_by_tag`, `search_by_similarity` |
| Exports | `create_export`, `list_exports`, `download_export` |
| Training | `get_training_status`, `cancel_training` |
| Models | `list_models`, `download_model` |
| Credits | `get_credit_balance`, `estimate_credit_cost` |
| Connectors | `validate_connector`, `import_from_connector` |
## SDK equivalents
```python
from pictograph.agents import create_toolkit
toolkit = create_toolkit()
schema = toolkit.as_json_schema() # same payload, no HTTP roundtrip
anthropic_tools = toolkit.as_anthropic_tools() # name/description/input_schema only
openai_tools = toolkit.as_openai_tools() # OpenAI function-calling format
```
The CLI also dumps the registry locally:
```bash
pictograph agents export-tools -o tools.json
```
## Versioning
The `version` field tracks the SDK release that generated the snapshot.
Tools may be added between minor versions; renames / removals only
happen at major versions and are listed in the changelog.
## Drift protection
The backend snapshot at `routes/developer/_tools_snapshot.json` is
verified against the SDK's live registry on every CI build (via
`scripts/generate_tools_snapshot.py --check`). PRs that change one
without the other fail.
---
## Page: Error handling
_URL: https://pictograph.io/docs/error-handling (markdown: error-handling.md)_
_Section: Reference_
Every SDK error subclasses **`PictographError`**. Catch the specific
subclass to handle a known failure mode; catch the base class to log
and rethrow.
## Hierarchy
```
PictographError
├── ConfigurationError — missing API key, invalid base URL
├── AuthError — 401 (bad / missing / revoked key)
├── ForbiddenError — 403 (role lacks permission)
├── NotFoundError — 404 (resource missing)
├── ConflictError — 409 (duplicate name, optimistic-lock fail)
├── ValidationError — 422 (payload shape rejected)
├── PaymentRequiredError — 402 (out of credits)
├── RateLimitError — 429 (per-key rate cap hit)
├── ServerError — 5xx (transient backend failure)
├── NetworkError — connection / DNS / TLS failure
├── RequestTimeoutError — request exceeded the SDK's timeout budget
├── PollTimeoutError — long-running job (training, batch SAM3) didn't finish
└── ApiError — catch-all for unmatched status codes
```
Import from `pictograph.exceptions`:
```python
from pictograph.exceptions import (
    PictographError, AuthError, ForbiddenError, NotFoundError,
    ConflictError, ValidationError, PaymentRequiredError,
    RateLimitError, ServerError, NetworkError, RequestTimeoutError,
    PollTimeoutError, ApiError,
)
```
## When each fires
| Exception | Common cause | What to do |
|---|---|---|
| `ConfigurationError` | `PICTOGRAPH_API_KEY` not set, no `api_key=` arg | Set the env var or pass `api_key` |
| `AuthError` (401) | Key revoked / typo | Re-issue the key |
| `ForbiddenError` (403) | `viewer` key calling a write op | Use a `member`+ key |
| `NotFoundError` (404) | Dataset name typo (case-sensitive!) | Verify with `pictograph datasets list` |
| `ConflictError` (409) | Same image filename in same folder | Pass `skip_existing=True` to the upload workflow, or use a new name |
| `ValidationError` (422) | `class` instead of `name`, flat polygon array | Fix the payload (see [Annotation format](/docs/annotation-format)) |
| `PaymentRequiredError` (402) | Out of credits mid-operation | Show `e.upgrade_url` to the user |
| `RateLimitError` (429) | Per-key burst limit | SDK auto-retries when `Retry-After` ≤ 120s; otherwise it raises |
| `ServerError` (5xx) | Backend incident | SDK retries with exponential backoff; persistent failure surfaces |
| `NetworkError` | Connection dropped | Retry idempotent ops; investigate non-idempotent |
| `PollTimeoutError` | Training run exceeded `timeout` | Re-poll with `client.training.get(run_id)` |
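The status-code side of the hierarchy can be summarised as a lookup. The sketch below is an illustration of that mapping using plain strings; it is not the SDK's internal dispatch code, and `exception_name_for` is a hypothetical helper.

```python
# Status codes taken from the hierarchy on this page. The lookup itself is
# an illustration, not the SDK's internal dispatch code.
STATUS_TO_EXCEPTION = {
    401: "AuthError",
    402: "PaymentRequiredError",
    403: "ForbiddenError",
    404: "NotFoundError",
    409: "ConflictError",
    422: "ValidationError",
    429: "RateLimitError",
}

def exception_name_for(status: int) -> str:
    """Name of the exception class documented for an HTTP status code."""
    if status in STATUS_TO_EXCEPTION:
        return STATUS_TO_EXCEPTION[status]
    if 500 <= status <= 599:
        return "ServerError"
    return "ApiError"  # documented catch-all for unmatched status codes

for code in (404, 429, 503, 418):
    print(code, exception_name_for(code))
```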
## Retry behavior
The SDK already retries on transient failures with exponential backoff:
- **5xx responses** — up to 3 retries, backoff `1s → 2s → 4s`.
- **429 with `Retry-After` ≤ 120s** — auto-waits then retries.
- **Network errors** (connection reset, DNS blip) — same 3-retry policy.
- **Idempotency** — retried requests inherit the original
`Idempotency-Key` header, so the backend dedupes.
Override on the Client:
```python
from pictograph import Client
client = Client(timeout=30.0, max_retries=5)
```
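To make the `1s → 2s → 4s` schedule concrete, here is a minimal standalone sketch of retry with exponential backoff. It does not import the SDK: `TransientError` and `call_with_backoff` are hypothetical stand-ins, and the SDK's real implementation may differ (for example in jitter or in which errors it treats as retryable).

```python
import time

class TransientError(Exception):
    """Stand-in for a retryable failure (a 5xx or a dropped connection)."""

def call_with_backoff(fn, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on TransientError wait base_delay * 2**attempt, then retry."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except TransientError:
            if attempt == max_retries:
                raise  # retries exhausted: surface the failure
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s with the defaults

# Demo: fail twice, then succeed. Delays are recorded instead of slept.
delays = []
attempts = {"n": 0}

def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TransientError("simulated 503")
    return "ok"

result = call_with_backoff(flaky, sleep=delays.append)
print(result, delays)
```

Injecting `sleep` keeps the helper testable; in real use the default `time.sleep` applies.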
## PaymentRequiredError details
```python
from pictograph.exceptions import PaymentRequiredError
try:
    client.training.create(dataset_name, export_name, pipeline_type="yolox")
except PaymentRequiredError as e:
    print(f"Need {e.required} credits, you have {e.remaining}")
    print(f"Top up at: {e.upgrade_url}")
```
`required`, `remaining`, and `upgrade_url` are populated from the
backend's `detail` block — fall back to plain `str(e)` if you only
need a user-facing message.
## ValidationError details
The backend returns a structured body listing every offending field:
```python
from pictograph.exceptions import ValidationError
try:
    client.annotations.save(image_id, [{"class": "person", "type": "bbox"}])
except ValidationError as e:
    print(e)  # human-readable summary
    print(e.errors)  # list of {"loc": [...], "msg": "...", "type": "..."}
```
The most common cause is the **`class` vs `name`** field mistake — the
backend rejects any annotation that uses `class`.
## PollTimeoutError + recovery
Long-running jobs (`train_pipeline`, batch auto-annotate, large
dataset imports) accept a `timeout` arg and raise `PollTimeoutError`
when it elapses. The job is **not cancelled** — it keeps running on the
backend.
```python
from pictograph.exceptions import PollTimeoutError
from pictograph.workflows import train_pipeline
try:
    run, model = train_pipeline(client, "ds", pipeline="yolox", timeout=60.0)
except PollTimeoutError as e:
    # Pick up later
    run_id = e.run_id  # most poll errors carry the resource ID
    run = client.training.get(run_id)
    if run.status == "completed":
        model = client.models.get(run.model_id)
```
## Idempotency
For mutating ops the SDK auto-generates an `Idempotency-Key` header,
so retries are safe. Override per-call:
```python
client.images.upload(
    dataset_id=ds.id,
    file_path="x.jpg",
    idempotency_key="upload-x-jpg-2026-04-19",
)
```
Backend dedupes within 24h. Reusing the same key with a different body
returns `409 ConflictError` (`error_code: idempotency_conflict`).
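The dedupe rule above (same key and same body replays the stored result; same key with a different body conflicts) can be modelled in a few lines. This is an in-memory illustration of the semantics only: `IdempotencyStore` and `IdempotencyConflict` are hypothetical names, the 24h expiry is not modelled, and nothing here reflects how the backend is actually built.

```python
import hashlib
import json

class IdempotencyConflict(Exception):
    """Stand-in for the 409 returned when a key is reused with a different body."""

class IdempotencyStore:
    """In-memory model of key-based dedupe (no expiry, for brevity)."""

    def __init__(self):
        self._seen = {}  # key -> (body fingerprint, stored response)

    def handle(self, key, body, perform):
        fingerprint = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()
        ).hexdigest()
        if key in self._seen:
            stored_fingerprint, response = self._seen[key]
            if stored_fingerprint != fingerprint:
                raise IdempotencyConflict(key)
            return response  # replay: the operation is not performed again
        response = perform(body)
        self._seen[key] = (fingerprint, response)
        return response

store = IdempotencyStore()
calls = []

def upload(body):
    calls.append(body)
    return {"image_id": f"img-{len(calls)}"}

first = store.handle("upload-x-jpg", {"file": "x.jpg"}, upload)
retry = store.handle("upload-x-jpg", {"file": "x.jpg"}, upload)
print(first == retry, len(calls))
```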
See [Rate limits](/docs/rate-limits) for the per-tier limits and burst
behaviour.
---
## Page: Pictograph annotation format
_URL: https://pictograph.io/docs/annotation-format (markdown: annotation-format.md)_
_Section: Reference_
Every annotation in Pictograph follows the same schema. Snake-case
field names, no shorthand: bounding boxes are objects `{x, y, w, h}`,
polygons are multi-ring `paths`, polylines are ordered point lists,
keypoints are single points.
The class-label field is **`name`** (not `class`). Do not improvise.
## Discriminator
| `type` | Geometry container | Notes |
|---|---|---|
| `bbox` | `bounding_box: {x, y, w, h}` | Axis-aligned rectangle. |
| `polygon` | `polygon: {paths: [[{x, y}, ...], ...]}` | Multi-ring (holes via even-odd). |
| `polyline` | `polyline: {path: [{x, y}, ...]}` | Open path, doesn't close. |
| `keypoint` | `keypoint: {x, y}` | Single landmark. |
## Required fields
| Field | Type | Notes |
|---|---|---|
| `id` | non-blank string | Unique within the image. UUIDs preferred. |
| `name` | non-blank string | Class label. Must match a class in `project_config.classes` (case-sensitive). |
| `type` | one of `bbox`/`polygon`/`polyline`/`keypoint` | Discriminator. |
| `bounding_box` / `polygon` / `polyline` / `keypoint` | see table above | Field name is determined by `type`. |
## Optional fields
| Field | Default | Notes |
|---|---|---|
| `confidence` | `1.0` | Range `[0, 1]`. SAM3 sets this; manual annotations get 1.0. |
| `created_by` | `null` | UUID of the creator. Backend fills this for SDK uploads. |
| `attributes` | `[]` | User-defined metadata. Backend stores opaque. |
| `bounding_box` (polygon/polyline) | computed | Backend auto-computes the enclosing rectangle if omitted. |
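For intuition about the auto-computed `bounding_box`, the sketch below derives an enclosing rectangle from polygon `paths`. It assumes `x`/`y` in a box denote the corner with the smallest coordinates (this page does not state the origin convention), and `enclosing_box` is a hypothetical helper rather than the backend's code.

```python
def enclosing_box(paths):
    """Axis-aligned {x, y, w, h} rectangle around every point in every ring.

    Assumes x/y denote the minimum-coordinate corner of the box.
    """
    xs = [p["x"] for ring in paths for p in ring]
    ys = [p["y"] for ring in paths for p in ring]
    return {
        "x": min(xs),
        "y": min(ys),
        "w": max(xs) - min(xs),
        "h": max(ys) - min(ys),
    }

car_paths = [[
    {"x": 10, "y": 20}, {"x": 110, "y": 20},
    {"x": 110, "y": 80}, {"x": 10, "y": 80},
]]
box = enclosing_box(car_paths)
print(box)
```

For a polyline, the same helper works if the single `path` is wrapped in a list.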
## Examples
### Bounding box
```json
{
"id": "ann-1",
"name": "person",
"type": "bbox",
"bounding_box": {"x": 100, "y": 200, "w": 50, "h": 80}
}
```
### Polygon
```json
{
"id": "ann-2",
"name": "car",
"type": "polygon",
"polygon": {
"paths": [[
{"x": 10, "y": 20}, {"x": 110, "y": 20},
{"x": 110, "y": 80}, {"x": 10, "y": 80}
]]
}
}
```
### Polygon with hole
```json
{
"id": "ann-3",
"name": "donut",
"type": "polygon",
"polygon": {
"paths": [
[{"x": 0, "y": 0}, {"x": 100, "y": 0}, {"x": 100, "y": 100}, {"x": 0, "y": 100}],
[{"x": 30, "y": 30}, {"x": 70, "y": 30}, {"x": 70, "y": 70}, {"x": 30, "y": 70}]
]
}
}
```
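The even-odd rule behind holes can be checked with a small ray-casting test: a point is inside when a horizontal ray from it crosses the rings' edges an odd number of times. The sketch below is a generic implementation of that rule applied to the donut above; `inside_even_odd` is a hypothetical helper, not an SDK function.

```python
def inside_even_odd(point, paths):
    """True if a rightward ray from point crosses ring edges an odd number of times."""
    px, py = point
    crossings = 0
    for ring in paths:
        for i in range(len(ring)):
            a, b = ring[i], ring[(i + 1) % len(ring)]  # edge a -> b; last point joins first
            if (a["y"] > py) != (b["y"] > py):  # edge straddles the ray's y
                x_at_py = a["x"] + (py - a["y"]) * (b["x"] - a["x"]) / (b["y"] - a["y"])
                if x_at_py > px:
                    crossings += 1
    return crossings % 2 == 1

donut = [
    [{"x": 0, "y": 0}, {"x": 100, "y": 0}, {"x": 100, "y": 100}, {"x": 0, "y": 100}],
    [{"x": 30, "y": 30}, {"x": 70, "y": 30}, {"x": 70, "y": 70}, {"x": 30, "y": 70}],
]
in_body = inside_even_odd((10, 10), donut)   # between outer ring and hole
in_hole = inside_even_odd((50, 50), donut)   # inside the inner ring
outside = inside_even_odd((150, 50), donut)  # beyond the outer ring
print(in_body, in_hole, outside)
```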
### Polyline
```json
{
"id": "ann-4",
"name": "lane_centerline",
"type": "polyline",
"polyline": {
"path": [
{"x": 0, "y": 100}, {"x": 50, "y": 100}, {"x": 100, "y": 100}
]
}
}
```
### Keypoint
```json
{
"id": "ann-5",
"name": "left_eye",
"type": "keypoint",
"keypoint": {"x": 250, "y": 180}
}
```
## Storage
Annotations are stored in `project_images.annotations_json` as a
**plain array** — no wrapper:
```json
[
{"id": "ann-1", "name": "person", "type": "bbox", "bounding_box": {…}},
{"id": "ann-2", "name": "car", "type": "polygon", "polygon": {…}}
]
```
Updating an image's annotations is a **full overwrite**: pass the
complete list every time. There is no partial-update endpoint.
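Because a save replaces the whole list, edits are typically a read-modify-write: fetch the current annotations, apply changes locally, then save the complete result. A minimal sketch of the local merge step, assuming annotations are plain dicts keyed by `id` (`merged_annotations` is a hypothetical helper):

```python
def merged_annotations(existing, changes):
    """Return the complete list to save: existing entries with changes applied by id."""
    by_id = {ann["id"]: ann for ann in existing}
    for ann in changes:
        by_id[ann["id"]] = ann  # replaces a matching id, otherwise appends
    return list(by_id.values())

existing = [
    {"id": "ann-1", "name": "person", "type": "keypoint", "keypoint": {"x": 1, "y": 2}},
    {"id": "ann-2", "name": "car", "type": "keypoint", "keypoint": {"x": 3, "y": 4}},
]
changes = [
    {"id": "ann-2", "name": "truck", "type": "keypoint", "keypoint": {"x": 3, "y": 4}},
    {"id": "ann-3", "name": "left_eye", "type": "keypoint", "keypoint": {"x": 250, "y": 180}},
]
to_save = merged_annotations(existing, changes)
print([(a["id"], a["name"]) for a in to_save])
```

The resulting list is what would be passed to the save call; sending only `changes` would drop `ann-1`.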
## Common mistakes
- ❌ `"class": "person"` — must be `"name"`.
- ❌ `"polygon": [[10, 20, 30, 40]]` — flat array. Must be `[{"x": …, "y": …}]`.
- ❌ `"bbox": [x, y, w, h]` — array. Must be `"bounding_box": {x, y, w, h}` object.
- ❌ Class label not in `project_config.classes` — backend rejects with 400.
- ❌ Polygon ring with < 3 points — Pydantic rejects on save.
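Several of these mistakes can be caught locally before a request is sent. The following is a rough pre-flight check over plain dicts that covers only the rules listed on this page; `lint_annotation` is a hypothetical helper, and the SDK's Pydantic models remain the authoritative validator.

```python
GEOMETRY_FIELD = {
    "bbox": "bounding_box",
    "polygon": "polygon",
    "polyline": "polyline",
    "keypoint": "keypoint",
}

def lint_annotation(ann: dict) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    if "class" in ann:
        problems.append('use "name", not "class"')
    for field in ("id", "name"):
        value = ann.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f'"{field}" must be a non-blank string')
    container = GEOMETRY_FIELD.get(ann.get("type"))
    if container is None:
        problems.append(f'unknown "type": {ann.get("type")!r}')
        return problems
    geometry = ann.get(container)
    if not isinstance(geometry, dict):
        problems.append(f'"{container}" must be an object')
        return problems
    if container == "polygon":
        for ring in geometry.get("paths", []):
            if not all(isinstance(p, dict) and {"x", "y"} <= p.keys() for p in ring):
                problems.append("polygon points must be {x, y} objects")
            elif len(ring) < 3:
                problems.append("polygon ring needs at least 3 points")
    return problems

good = {"id": "ann-1", "name": "person", "type": "bbox",
        "bounding_box": {"x": 100, "y": 200, "w": 50, "h": 80}}
bad = {"id": "ann-2", "class": "person", "type": "bbox", "bbox": [1, 2, 3, 4]}
print(lint_annotation(good))
print(lint_annotation(bad))
```

The check does not know the project's class list, so a label missing from `project_config.classes` still has to be caught by the backend.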
## SDK helpers
```python
from pictograph import BBoxAnnotation, BoundingBox, PolygonAnnotation, PolygonGeometry, Point
bbox = BBoxAnnotation(
    id="ann-1",
    name="person",
    bounding_box=BoundingBox(x=100, y=200, w=50, h=80),
)
polygon = PolygonAnnotation(
    id="ann-2",
    name="car",
    polygon=PolygonGeometry(paths=[
        [Point(x=10, y=20), Point(x=110, y=20), Point(x=110, y=80)],
    ]),
)
client.annotations.save(image_id, [bbox, polygon])
```
The SDK Pydantic models are the source of truth — they generate the
JSON Schema this page describes. If the backend rejects your payload,
diff your dump (`.model_dump(mode="json", exclude_none=True)`) against
the rejection message.
---
## Page: Rate limits
_URL: https://pictograph.io/docs/rate-limits (markdown: rate-limits.md)_
_Section: Reference_
Every API key is rate-limited per organization tier with a 1-hour sliding window. The SDK auto-retries short waits; longer waits raise so your code can decide.
## Per-tier limits
| Tier | Requests / hour |
| --- | --- |
| Free | 1,000 |
| Core | 5,000 |
| Pro | 20,000 |
| Enterprise | 100,000 |
## Response headers
Every successful response carries the current state of the window. Read them off the underlying response if you're tracking your own consumption — most users can ignore them and let the SDK retry automatically.
| Header | Meaning |
| --- | --- |
| `X-RateLimit-Limit` | Cap for the current window |
| `X-RateLimit-Remaining` | Calls left in the current window |
| `X-RateLimit-Reset` | Unix timestamp when the window resets |
| `Retry-After` | (429 only) seconds until the next call may succeed |
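If you do track consumption yourself, the headers reduce to a small calculation. The sketch below works on a plain header mapping; how to reach the underlying response object is not documented on this page, so the sample headers are hand-written and `window_state` is a hypothetical helper.

```python
import time

def window_state(headers, now=None):
    """Summarise the rate-limit window from the X-RateLimit-* headers."""
    now = time.time() if now is None else now
    limit = int(headers["X-RateLimit-Limit"])
    remaining = int(headers["X-RateLimit-Remaining"])
    # X-RateLimit-Reset is a Unix timestamp; clamp in case it is already past.
    reset_in = max(0.0, float(headers["X-RateLimit-Reset"]) - now)
    return {
        "used": limit - remaining,
        "remaining": remaining,
        "reset_in_seconds": reset_in,
    }

sample_headers = {
    "X-RateLimit-Limit": "5000",
    "X-RateLimit-Remaining": "4200",
    "X-RateLimit-Reset": "1000600",
}
state = window_state(sample_headers, now=1000000)
print(state)
```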
## What counts
One HTTP request → one count. Payload size doesn't matter. Streaming downloads (image / model / export blobs) count as a single request regardless of size.
Bulk operations are designed to keep counts low — prefer `client.batch.move()` over N `client.images.update()` calls, and prefer the [workflows](/docs/workflows) over hand-rolled loops.
## SDK auto-retry
`RateLimitError` carries a `retry_after` attribute. The SDK waits and retries automatically when the response includes a `Retry-After` header **and** the wait is at most 120 seconds. Anything longer raises immediately so your code can back off, queue, or fail.
```python
from pictograph.exceptions import RateLimitError
import time
try:
    client.datasets.list(limit=1000)
except RateLimitError as e:
    print(f"Hit cap; retry in {e.retry_after}s")
    time.sleep(e.retry_after)
    # …then retry. The SDK won't auto-recover for waits >120s.
```
The `pictograph` CLI inherits the same behaviour. On a long wait it prints the retry estimate to stderr so you can Ctrl-C if you don't want to wait.
## Spreading bursty load
If your workload is bursty (nightly imports, large auto-annotation runs), pace it across the hour:
```python
from time import sleep
for batch in batches:
    process(batch)
    sleep(1.0)  # at most ~3,600 req/hr at one request per batch, under Core's 5,000
```
For sustained workloads above your tier, the right move is to upgrade — retrying harder doesn't increase your share.
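To pick a pacing interval for a given tier, divide the hour by the request budget you are willing to spend. The sketch below uses the per-tier limits from the table above; the `headroom` factor and the `min_interval` helper are assumptions for illustration.

```python
# Requests per hour, from the per-tier table on this page.
TIER_LIMITS = {"Free": 1_000, "Core": 5_000, "Pro": 20_000, "Enterprise": 100_000}

def min_interval(tier, requests_per_step=1, headroom=0.8):
    """Seconds to sleep between steps so usage stays within headroom * the hourly cap."""
    budget = TIER_LIMITS[tier] * headroom
    return 3600 * requests_per_step / budget

for tier in TIER_LIMITS:
    print(tier, round(min_interval(tier), 3))
```

Leaving headroom keeps other users of the same key from being starved by a background job.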
## See also
- [Error handling](/docs/error-handling) — the full exception hierarchy
- [Credits](/docs/api-reference/credits) — paid-operation budgeting
---
## Page: CLI reference
_URL: https://pictograph.io/docs/cli (markdown: cli.md)_
_Section: Reference_
The `pictograph` CLI is a thin wrapper over the SDK with Rich-formatted
output. Same operations, same auth model, no learning curve.
## Install
```bash
pip install 'pictograph[cli]'
```
## Auth
```bash
pictograph login # interactive; writes ~/.pictograph/config.toml
# OR
export PICTOGRAPH_API_KEY=pk_live_…
# OR
pictograph datasets list --api-key pk_live_…
```
Resolution order: `--api-key` flag > `PICTOGRAPH_API_KEY` env > `~/.pictograph/config.toml`.
## Global flags
| Flag | Notes |
|---|---|
| `--version` / `-V` | Print version and exit |
| `--help` | Print help (works on every subcommand) |
| `--api-key <key>` | Override the resolved key |
| `--json` | Emit raw JSON instead of Rich tables (where applicable) |
## Top-level commands
```bash
pictograph init # drop AGENTS.md template into ./
pictograph login # save API key
```
## datasets
```bash
pictograph datasets list # list (table)
pictograph datasets list --json # list (JSON)
pictograph datasets get road-signs # by name
pictograph datasets get road-signs --include-images # with image summaries
pictograph datasets create new-dataset -d "Description"
pictograph datasets delete road-signs # confirms first
pictograph datasets delete road-signs --yes # skip confirm
pictograph datasets download road-signs -o ./dump --workers 10
```
## images
```bash
pictograph images upload ./photo.jpg --folder /cars
pictograph images download <image-id> -o ./out.jpg
pictograph images delete <image-id> --yes
```
## annotations
```bash
pictograph annotations get <image-id>
pictograph annotations save <image-id> --file ./anns.json # JSON list of annotations
pictograph annotations delete <image-id> --yes
```
## train
```bash
pictograph train start <dataset> --pipeline yolox --gpu a10g
pictograph train start <dataset> --pipeline detectron2 \
  --gpu a100 --config '{"epochs": 50}'
pictograph train status <run-id>
pictograph train cancel <run-id> --yes
pictograph train logs <run-id> # current status (SSE streaming arrives in v1.1)
```
## models
```bash
pictograph models list
pictograph models download <model-id> -o ./yolox.onnx
```
## credits
```bash
pictograph credits balance
pictograph credits balance --json
pictograph credits history --limit 100
pictograph credits estimate training_a10g_per_minute -q 30
```
## agents
```bash
pictograph agents list-tools # see all 28 tools
pictograph agents export-tools -o tools.json # JSON Schema dump
pictograph agents install-skill --target claude-code # → ~/.claude/skills/pictograph-cv/
pictograph agents install-skill --target claude-ai # → ./pictograph-cv.zip
pictograph agents install-skill --target both
```
## Examples
### Train + download a YOLOX model
```bash
pictograph datasets create road-signs
# … upload images via the SDK or web app …
pictograph train start road-signs --pipeline yolox
# … wait for completion …
pictograph train status <run-id>
pictograph models download <model-id> -o ./yolox.onnx
```
### Start a Detectron2 training run on every dataset
```bash
for ds in $(pictograph datasets list --json | jq -r '.[].name'); do
  pictograph train start "$ds" --pipeline detectron2 --no-wait
done
```
(The CLI doesn't yet have an `export` subcommand; it is planned for v1.1. To
export to COCO, use the SDK's `client.exports.create(..., format="coco")` directly.)
### Daily cost monitoring
```bash
pictograph credits balance --json | jq '.credits_remaining'
```
## Output
- **Default**: Rich tables for human-readable terminal use. Auto-detects
TTY width and wraps gracefully.
- **`--json`**: pretty-printed JSON for piping into `jq` / scripting.
Same payload structure as the SDK's `model_dump(mode="json")`.
## Errors
CLI errors print in bold red to stderr and exit with a non-zero status.
Each SDK exception maps to a one-line message:
```
$ pictograph datasets get nonexistent
error: Project 'nonexistent' not found
$ echo $?
1
```
Exit codes:
| Code | Meaning |
|---|---|
| `0` | success |
| `1` | API error (handled cleanly by the CLI) |
| `2` | usage / config error (missing args, no API key) |
## See also
- [Quick Start](/docs/quick-start) — install + first run
- [Authentication](/docs/authentication) — key resolution + roles
- [Error handling](/docs/error-handling) — exception hierarchy
---
## Page: Pictograph
_URL: https://pictograph.io/docs/index (markdown: index.md)_
_Section: Get Started_
Pictograph turns directories of images into trained CV models with as little hand-annotation as possible. The same REST API drives three surfaces: a typed Python SDK, a CLI, and an agent toolkit for Claude and OpenAI.
```python
from pictograph import Client
from pictograph.workflows import full_pipeline
client = Client()
report = full_pipeline(
    client,
    dataset_name="road-signs",
    folder="./road_signs",
    classes=[("stop_sign", "bbox"), ("yield", "bbox")],
    pipeline="yolox",
)
print("model:", report.model.id if report.success else "see report")
```
## What you can do
- **Upload** directories of images; subdirectories become virtual paths.
- **Auto-annotate** with SAM3 — point, box, or text prompts, single image or async batch.
- **Train** YOLOX, Detectron2, SM-PyTorch, RF-DETR, or classification models on A10G / A100 / H100 GPUs.
- **Export** to COCO, YOLO, CVAT, Pascal VOC, LabelMe, CSV, or Pictograph JSON.
- **Import** existing datasets from V7 (Darwin) or Roboflow.
- **Search** by visual similarity (SigLIP2) or auto-generated content tags.
- **Drive everything from agents** — Claude Agent SDK, openai-agents, Vercel AI SDK, LangChain, or any framework that speaks JSON Schema.
## Map of the docs
| Section | Pages |
| --- | --- |
| **Get Started** | [Installation](/docs/installation) · [Quickstart](/docs/quick-start) · [Authentication](/docs/authentication) |
| **Workflows** | [Full pipeline](/docs/workflows/full-pipeline) · [Upload](/docs/workflows/upload) · [Auto-annotate](/docs/workflows/auto-annotate) · [Train](/docs/workflows/train) |
| **API Reference** | [Overview](/docs/api-reference) · [Datasets](/docs/api-reference/datasets) · [Images](/docs/api-reference/images) · [Annotations](/docs/api-reference/annotations) · [Auto-annotate](/docs/api-reference/auto-annotate) · [Search](/docs/api-reference/search) · [Batch](/docs/api-reference/batch) · [Exports](/docs/api-reference/exports) · [Training](/docs/api-reference/training) · [Models](/docs/api-reference/models) · [Credits](/docs/api-reference/credits) · [Connectors](/docs/api-reference/connectors) · [Video](/docs/api-reference/video) · [Organizations](/docs/api-reference/organizations) · [Projects](/docs/api-reference/projects) · [API Keys](/docs/api-reference/api-keys) · [Tools](/docs/api-reference/tools) |
| **Agents** | [Overview](/docs/agents) · [Claude](/docs/agents/claude) · [OpenAI](/docs/agents/openai) · [Dynamic discovery](/docs/agents/dynamic-discovery) · [Cookbook](/docs/agents/cookbook) |
| **Reference** | [Annotation format](/docs/annotation-format) · [Error handling](/docs/error-handling) · [Rate limits](/docs/rate-limits) · [CLI](/docs/cli) |
Every page has a "Copy as Markdown" button and an `.md` mirror for agents to consume directly.
## For agents browsing this site
- Site index: [`/docs/llms.txt`](/docs/llms.txt)
- Full doc bundle (one file): [`/docs/llms-full.txt`](/docs/llms-full.txt)
- Tool registry (JSON Schema): [`/api/v1/developer/tools.json`](https://api.pictograph.io/api/v1/developer/tools.json)
---
## Page: Installation
_URL: https://pictograph.io/docs/installation (markdown: installation.md)_
_Section: Get Started_
## Requirements
- **Python 3.10+** (3.11+ recommended). Tested on 3.10–3.13.
- A Pictograph account and an API key (see [Quick Start](/docs/quick-start)).
## Install
```bash
pip install pictograph
```
Published on PyPI: [`pictograph`](https://pypi.org/project/pictograph/). The base install gives you the SDK Client, every resource, the agent toolkit, and the bundled `pictograph-cv` Skill.
## Optional extras
| Extra | What it adds | Install |
|---|---|---|
| `cli` | `pictograph` command (Typer + Rich) | `pip install 'pictograph[cli]'` |
| `agents` | Claude Agent SDK + openai-agents adapters | `pip install 'pictograph[agents]'` |
| `cache` | Local SQLite response cache (aiosqlite) | `pip install 'pictograph[cache]'` |
| `telemetry` | OpenTelemetry SDK for SDK calls | `pip install 'pictograph[telemetry]'` |
| `all` | Everything above | `pip install 'pictograph[all]'` |
Pillow is included in the base install (used to extract image dimensions
client-side during upload).
## Verify
```python
from pictograph import Client, REGISTRY, __version__
print(f"pictograph v{__version__}")
print(f"{len(REGISTRY)} agent tools registered")
```
Expected output:
```
pictograph v1.x.x
28 agent tools registered
```
## CLI verify
```bash
pictograph --version
pictograph --help
```
## Editable install (contributors)
```bash
git clone https://github.com/pictograph-labs/pictograph-sdk
cd pictograph-sdk
pip install -e '.[dev,cli,agents,cache,telemetry]'
pytest
```
## Next
- [Quickstart](/docs/quick-start) — run an end-to-end pipeline in five minutes
- [Authentication](/docs/authentication) — API key resolution and roles
- [Workflows](/docs/workflows) — the headline upload / annotate / train helpers
---
## Page: Quick start
_URL: https://pictograph.io/docs/quick-start (markdown: quick-start.md)_
_Section: Get Started_
## Install
```bash
pip install pictograph
```
For the CLI + Rich-formatted output:
```bash
pip install 'pictograph[cli]'
```
For the agent toolkit (Claude Agent SDK + openai-agents):
```bash
pip install 'pictograph[agents]'
```
## Get an API key
1. Sign in at [app.pictograph.io](https://app.pictograph.io).
2. Navigate to **Settings → API Keys**.
3. Click **Create API Key** and give it a role (`viewer` / `member` / `admin` / `owner`).
4. Copy the key (`pk_live_…`) — it is only shown once.
```bash
export PICTOGRAPH_API_KEY=pk_live_…
```
Or use the CLI's interactive setup:
```bash
pictograph login
```
This writes `~/.pictograph/config.toml`.
## First call
```python
from pictograph import Client
client = Client() # reads PICTOGRAPH_API_KEY
datasets = client.datasets.list(limit=10)
print(datasets)
```
## End-to-end: upload, annotate, train
The headline workflow — one function call:
```python
from pictograph import Client
from pictograph.workflows import full_pipeline
client = Client()
report = full_pipeline(
    client,
    dataset_name="road-signs",
    folder="./road_signs",
    classes=[("stop_sign", "bbox"), ("yield", "bbox")],
    pipeline="yolox",
)
if report.success:
    print(f"Trained model: {report.model.id}")
else:
    print(report.credit_skip_reason or "see sub-reports")
```
Each phase short-circuits on failure and the `PipelineReport` carries every sub-report. See [`full_pipeline`](/docs/workflows/full-pipeline) for every parameter.
## CLI equivalent
```bash
pictograph login # one-time
pictograph datasets list
pictograph train start road-signs --pipeline yolox --gpu a10g
pictograph models download <model-id> -o ./yolox.onnx
```
## Next
- [Workflows](/docs/workflows) — the four batteries-included helpers
- [Agents](/docs/agents) — wire Pictograph into Claude or OpenAI
- [Annotation format](/docs/annotation-format) — the canonical JSON schema
- [Credits](/docs/api-reference/credits) — budget gating and cost estimation
---
## Page: Authentication
_URL: https://pictograph.io/docs/authentication (markdown: authentication.md)_
_Section: Get Started_
The Developer API authenticates via **API keys** — `pk_live_…` strings
issued from **Settings → API Keys** in the web app. The same key works
for the SDK, the `pictograph` CLI, and direct REST calls.
## Get an API key
1. Sign in at [app.pictograph.io](https://app.pictograph.io).
2. **Settings → API Keys → Create API Key**.
3. Pick a role: `viewer` / `member` / `admin` / `owner` (see below).
4. Copy the key — **shown once, never again**.
## Use the key
The SDK reads `PICTOGRAPH_API_KEY` from the environment by default:
```bash
export PICTOGRAPH_API_KEY=pk_live_…
```
```python
from pictograph import Client
client = Client() # uses env var
```
Or pass it explicitly:
```python
client = Client(api_key="pk_live_…")
```
The CLI has the same resolution order plus a `~/.pictograph/config.toml`
file written by `pictograph login`:
```bash
pictograph login # prompts (input hidden), writes ~/.pictograph/config.toml
pictograph datasets list # uses the saved key
```
## Resolution order
| Priority | Source |
|---|---|
| 1 (highest) | `--api-key` flag (CLI) or `Client(api_key=...)` arg (SDK) |
| 2 | `PICTOGRAPH_API_KEY` environment variable |
| 3 | `~/.pictograph/config.toml` `[default].api_key` (CLI only) |
| 4 (failure) | `ConfigurationError` raised |
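The priority table can be read as a simple fall-through. The sketch below mirrors it with plain inputs (an explicit value, an environment mapping, and an already-parsed config dict); `resolve_api_key` and `MissingKeyError` are hypothetical stand-ins, and the real SDK raises `ConfigurationError` instead.

```python
class MissingKeyError(Exception):
    """Stand-in for the SDK's ConfigurationError."""

def resolve_api_key(explicit=None, env=None, config=None):
    """Return the first key found: explicit arg, then env var, then config file."""
    env = env or {}
    config = config or {}
    if explicit:
        return explicit  # priority 1: --api-key flag or Client(api_key=...)
    if env.get("PICTOGRAPH_API_KEY"):
        return env["PICTOGRAPH_API_KEY"]  # priority 2: environment variable
    file_key = config.get("default", {}).get("api_key")
    if file_key:
        return file_key  # priority 3: [default].api_key (CLI only)
    raise MissingKeyError("no API key found")  # priority 4: failure

env = {"PICTOGRAPH_API_KEY": "pk_live_from_env"}
config = {"default": {"api_key": "pk_live_from_file"}}
from_flag = resolve_api_key("pk_live_from_flag", env, config)
from_env = resolve_api_key(None, env, config)
from_file = resolve_api_key(None, {}, config)
print(from_flag, from_env, from_file)
```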
## REST clients
```bash
curl -H "X-API-Key: pk_live_…" https://api.pictograph.io/api/v1/developer/datasets/
```
The header name is exactly `X-API-Key`. Bearer tokens are not accepted on
developer endpoints.
## Roles + permissions
API keys carry a role that the backend re-enforces server-side. Roles are
hierarchical: `owner > admin > member > viewer`.
| Role | Read | Create / update | Delete | Invite users | Org settings |
|---|---|---|---|---|---|
| viewer | ✓ | — | — | — | — |
| member | ✓ | ✓ | own resources only | — | — |
| admin | ✓ | ✓ | ✓ | ✓ | — |
| owner | ✓ | ✓ | ✓ | ✓ | ✓ |
The agent tool registry tags each tool with `required_role` — see
[`/docs/api-reference/tools`](/docs/api-reference/tools).
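A client can mirror the table for a quick pre-check before attempting an operation. In this sketch the capability names (`delete_own`, `delete_any`, and so on) are hypothetical labels chosen for the example; the backend's own enforcement is what actually decides.

```python
# Capabilities per role, transcribed from the table above. The capability
# names are labels invented for this example.
ROLE_CAPABILITIES = {
    "viewer": {"read"},
    "member": {"read", "write", "delete_own"},
    "admin": {"read", "write", "delete_own", "delete_any", "invite"},
    "owner": {"read", "write", "delete_own", "delete_any", "invite", "org_settings"},
}

def can(role: str, capability: str) -> bool:
    """True if the role's column in the table grants the capability."""
    return capability in ROLE_CAPABILITIES[role]

def can_delete(role: str, owns_resource: bool) -> bool:
    """Members may delete only their own resources; admins and owners any."""
    return can(role, "delete_any") or (owns_resource and can(role, "delete_own"))

print(can_delete("member", owns_resource=True))
print(can_delete("member", owns_resource=False))
print(can("admin", "org_settings"))
```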
## Key format
API keys look like `pk_live_<32_random_bytes_base64>`. They are **bcrypt-
hashed** server-side (cost factor 12) and stored only as the hash plus
the first 12 chars (`pk_live_<8>`) for the prefix-lookup index. Once
created, the full key is never recoverable.
## Rotation
To rotate:
1. **Settings → API Keys → Create API Key** (new key).
2. Update `PICTOGRAPH_API_KEY` / `~/.pictograph/config.toml` / your CI secret.
3. Delete the old key once the rollout is verified.
There is no in-place rotation — every key is immutable after creation.
## Errors
| Status | Exception | Cause |
| --- | --- | --- |
| 401 | `AuthError` | Missing / malformed / unknown / revoked key |
| 403 | `ForbiddenError` | Key's role lacks permission for the operation |
| 429 | `RateLimitError` | Per-key rate cap hit (see [Rate limits](/docs/rate-limits)) |
---