# Pictograph SDK — full documentation

> One-shot bundle of every published doc page. Each section header is the
> page title; navigate by searching for `## Page: `.

## Page: Agents

_URL: https://pictograph.io/docs/agents (markdown: agents.md)_
_Section: Agents_

Every Pictograph operation is exposed as both a typed Python SDK call and an agent tool. The 28-tool registry feeds three integration paths.

```python
from pictograph.agents import create_toolkit

toolkit = create_toolkit()  # reads PICTOGRAPH_API_KEY
```

## Three integration paths

### Bundled adapters — Claude / OpenAI

For raw tool-use loops against the Anthropic or OpenAI SDKs, the adapters return ready-to-pass tool dicts. No extra dependencies.

```python
from pictograph.agents import for_anthropic_messages, for_openai_responses

claude_tools = for_anthropic_messages(toolkit)  # → anthropic.messages.create(tools=...)
openai_tools = for_openai_responses(toolkit)    # → openai.responses.create(tools=...)

# Both paths dispatch results the same way:
result = toolkit.dispatch(name="list_datasets", args={"limit": 10})
```

For the framework SDKs (`claude-agent-sdk`, `openai-agents`) install the extra:

```bash
pip install 'pictograph[agents]'
```

```python
from pictograph.agents import for_claude_agent_sdk, for_openai_agents

claude_sdk_tools = for_claude_agent_sdk(toolkit)  # @tool-decorated callables
openai_sdk_tools = for_openai_agents(toolkit)     # FunctionTool objects
```

Full integration cookbooks: [Claude](/docs/agents/claude) · [OpenAI](/docs/agents/openai).

### Bundled Claude Skill

The SDK ships a Claude Skill (`pictograph-cv`) with workflow recipes, reference docs, and bash-callable Python scripts.
Install it once and Claude auto-discovers it:

```bash
pictograph agents install-skill --target claude-code   # → ~/.claude/skills/pictograph-cv/
pictograph agents install-skill --target claude-ai     # → ./pictograph-cv.zip (upload at claude.ai/skills)
pictograph agents install-skill --target both
```

Update the skill after upgrading the SDK by re-running the install command (it overwrites the existing directory).

### Dynamic discovery — `tools.json`

For frameworks without a bundled adapter (Vercel AI SDK, LangChain, custom dispatchers), fetch the JSON Schema registry directly:

```bash
curl -H "X-API-Key: pk_live_…" https://api.pictograph.io/api/v1/developer/tools.json
```

Each entry has `name`, `description`, `input_schema`, plus metadata (`required_role`, `credit_cost`, `idempotent`). Wire it into your dispatcher and route the model's tool calls to the matching SDK method. See [Dynamic discovery](/docs/agents/dynamic-discovery) for end-to-end examples.

## What's in the registry

Twenty-eight tools, grouped by category. Full JSON schemas at [`/docs/api-reference/tools`](/docs/api-reference/tools).

| Category | Tools |
| --- | --- |
| Workflows | `upload_dataset_from_folder`, `auto_annotate_dataset`, `train_pipeline`, `full_pipeline` |
| Datasets | `list_datasets`, `get_dataset`, `create_dataset`, `delete_dataset` |
| Images | `upload_image`, `delete_image` |
| Annotations | `get_annotations`, `save_annotations` |
| Auto-annotate | `auto_annotate_point`, `auto_annotate_box`, `auto_annotate_text` |
| Search | `search_by_tag`, `search_by_similarity` |
| Exports | `create_export`, `list_exports`, `download_export` |
| Training | `get_training_status`, `cancel_training` |
| Models | `list_models`, `download_model` |
| Credits | `get_credit_balance`, `estimate_credit_cost` |
| Connectors | `validate_connector`, `import_from_connector` |

## Guardrails

The toolkit enforces three guardrails on every dispatch, independent of which integration path you use.

**Role gate (`required_role`)** — each tool's required role is metadata in the registry, but the API re-checks the calling key's role on every request. An agent holding a `viewer` key gets `403 ForbiddenError` on any write tool.

**Credit gate (`credit_cost`)** — paid tools (`auto_annotate_dataset`, `train_pipeline`, `full_pipeline`) have known costs. Agents can pre-flight via `get_credit_balance` + `estimate_credit_cost` (both in the registry) and refuse to start when the balance is short.

**Response cap (`max_response_tokens`)** — large list/get results that exceed the cap (default 25k tokens) are truncated with a `_truncated` marker. The agent re-calls with narrower filters. Pass `max_response_tokens=N` to `create_toolkit(...)` to override.

## Recommended system prompt patterns

These three rules keep agents safe and cheap. Drop them into your system prompt.

```text
1. Before destructive actions (delete_dataset, delete_image, cancel_training),
   restate exactly what will be removed and ask for confirmation.
2. Before paid actions (auto_annotate_dataset, train_pipeline, full_pipeline),
   call estimate_credit_cost first, then surface the cost and the remaining
   balance. Proceed only if sufficient.
3. For multi-step tasks, prefer the workflow tools (full_pipeline,
   upload_dataset_from_folder, auto_annotate_dataset, train_pipeline) over
   chaining individual resource tools. They handle short-circuit on failure
   and credit gating automatically.
```

## See also

- [Claude](/docs/agents/claude) · [OpenAI](/docs/agents/openai) — integration cookbooks
- [Dynamic discovery](/docs/agents/dynamic-discovery) — for framework-agnostic stacks
- [Cookbook](/docs/agents/cookbook) — recipe-style end-to-end examples
- [Tool reference](/docs/api-reference/tools) — full JSON schemas for all 28 tools

---

## Page: Claude

_URL: https://pictograph.io/docs/agents/claude (markdown: agents/claude.md)_
_Section: Agents_

Toolkit setup and guardrails live on [Agents — overview](/docs/agents).
This page shows the two Claude-specific integration paths.

## Path 1: Anthropic SDK (raw tool dicts)

No extra dependencies — works with the standard `anthropic` package. You manage the tool-use loop.

```python
import json

import anthropic
from pictograph.agents import create_toolkit, for_anthropic_messages

toolkit = create_toolkit()
tools = for_anthropic_messages(toolkit)
client = anthropic.Anthropic()

messages = [
    {"role": "user", "content": "Upload ./photos to a dataset called 'demo' and auto-annotate cars and people."},
]

response = client.messages.create(
    model="claude-opus-4",
    max_tokens=4096,
    tools=tools,
    tool_choice={"type": "auto"},
    messages=messages,
)

while response.stop_reason == "tool_use":
    tool_uses = [b for b in response.content if b.type == "tool_use"]
    tool_results = [
        {
            "type": "tool_result",
            "tool_use_id": tu.id,
            # Serialize as JSON rather than a Python repr so the model reads valid JSON.
            "content": json.dumps(toolkit.dispatch(tu.name, tu.input), default=str),
        }
        for tu in tool_uses
    ]
    messages.append({"role": "assistant", "content": response.content})
    messages.append({"role": "user", "content": tool_results})
    response = client.messages.create(
        model="claude-opus-4",
        max_tokens=4096,
        tools=tools,
        messages=messages,
    )

print(response.content)
```

`toolkit.dispatch(name, args)` validates `args` through the tool's Pydantic schema, invokes the handler, and returns a JSON-serializable result. Invalid input raises `ValidationError`.

## Path 2: Claude Agent SDK

The Agent SDK manages the loop, streaming, and dispatch. Requires the extra:

```bash
pip install 'pictograph[agents]'
```

```python
import asyncio

from claude_agent_sdk import ClaudeSDKClient, ClaudeAgentOptions
from pictograph.agents import create_toolkit, for_claude_agent_sdk

toolkit = create_toolkit()
agent_tools = for_claude_agent_sdk(toolkit)


async def main() -> None:
    async with ClaudeSDKClient(
        options=ClaudeAgentOptions(
            system_prompt="You drive Pictograph for the user. Confirm destructive operations.",
            allowed_tools=[t.name for t in agent_tools],
        ),
    ) as client:
        # query() sends the prompt; receive_response() streams the reply messages.
        await client.query(
            "Train a YOLOX model on the 'road-signs' dataset, A10G GPU, 50 epochs"
        )
        async for response in client.receive_response():
            print(response)


asyncio.run(main())
```

## Picking between paths

| Path 1 (raw dicts) | Path 2 (Agent SDK) |
| --- | --- |
| You already call `messages.create()` directly | You want streaming + multi-turn with no loop boilerplate |
| You need custom logging or gating in the loop | You're building a long-running agent process |
| You don't want the `claude-agent-sdk` dependency | You're running inside Claude Code or another Agent-SDK runtime |

## See also

- [Agents](/docs/agents) — toolkit setup, guardrails, and the registry
- [Bundled Skill](/docs/agents) — `pictograph-cv` for Claude Code / claude.ai
- [Cookbook](/docs/agents/cookbook) — recipe-style examples

---

## Page: OpenAI

_URL: https://pictograph.io/docs/agents/openai (markdown: agents/openai.md)_
_Section: Agents_

Toolkit setup and guardrails live on [Agents — overview](/docs/agents). This page shows the two OpenAI-specific integration paths.

## Path 1: OpenAI SDK (raw function tools)

No extra dependencies — works with the standard `openai` package.

```python
import json

from openai import OpenAI
from pictograph.agents import create_toolkit, for_openai_responses

toolkit = create_toolkit()
tools = for_openai_responses(toolkit)
client = OpenAI()

input_messages = [
    {"role": "user", "content": "Show my Pictograph credit balance and recent training runs"},
]

while True:
    response = client.responses.create(
        model="gpt-5",
        input=input_messages,
        tools=tools,
    )
    function_calls = [o for o in response.output if o.type == "function_call"]
    if not function_calls:
        print(response.output_text)
        break
    for call in function_calls:
        result = toolkit.dispatch(call.name, json.loads(call.arguments))
        input_messages.append(call)
        input_messages.append({
            "type": "function_call_output",
            "call_id": call.call_id,
            "output": json.dumps(result, default=str),
        })
```

`toolkit.dispatch(name, args)` is the same dispatcher used by the Anthropic path — single source of truth.

## Path 2: openai-agents SDK

The framework manages the tool-call loop, streaming, handoffs, and tracing. Requires the extra:

```bash
pip install 'pictograph[agents]'
```

```python
from agents import Agent, Runner
from pictograph.agents import create_toolkit, for_openai_agents

toolkit = create_toolkit()
agent_tools = for_openai_agents(toolkit)

agent = Agent(
    name="Pictograph driver",
    instructions=(
        "Drive Pictograph for the user. Confirm destructive operations. "
        "Prefer the workflow tools (full_pipeline, train_pipeline) over "
        "chaining individual resource tools."
    ),
    tools=agent_tools,
)

result = Runner.run_sync(
    agent,
    "Annotate ./road_signs as cars and people, then train a YOLOX model",
)
print(result.final_output)
```

Streaming:

```python
from openai.types.responses import ResponseTextDeltaEvent

# Inside an async function: run_streamed() returns at once; events arrive via stream_events().
result = Runner.run_streamed(agent, user_input)
async for event in result.stream_events():
    if event.type == "raw_response_event" and isinstance(event.data, ResponseTextDeltaEvent):
        print(event.data.delta, end="", flush=True)
```

## Picking between paths

| Path 1 (raw dicts) | Path 2 (openai-agents) |
| --- | --- |
| You already call `responses.create()` directly | You want multi-turn conversations with built-in dispatch |
| You need a custom dispatch loop | You need agent handoffs |
| You don't want the `openai-agents` dependency | You want tracing + replay + structured outputs |

## See also

- [Agents](/docs/agents) — toolkit setup, guardrails, and the registry
- [Cookbook](/docs/agents/cookbook) — recipe-style examples
- [Dynamic discovery](/docs/agents/dynamic-discovery) — for Vercel AI SDK, LangChain, and other stacks

---

## Page: Dynamic discovery

_URL: https://pictograph.io/docs/agents/dynamic-discovery (markdown: agents/dynamic-discovery.md)_
_Section: Agents_

The Pictograph SDK ships first-party adapters for **Claude** and **OpenAI**. For everything else (Vercel AI SDK, LangChain, raw HTTP clients, custom dispatchers), the registry is exposed as JSON Schema at:

```
GET https://api.pictograph.io/api/v1/developer/tools.json
```

Authenticated with the same `X-API-Key` header — any role works (read-only).

## Why this exists

Per-framework adapters are a treadmill. The JSON Schema contract is the long-term answer:

- **Pictograph** maintains one source of truth (`pictograph.agents.REGISTRY`).
- **Your agent stack** consumes JSON Schema natively — every modern framework supports it.
- **No bespoke adapter** to maintain on either side.

The Python SDK still ships Claude + OpenAI adapters because their ecosystems are big enough to warrant the convenience. Everyone else gets the open-standard path.
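
As a rough illustration of consuming the registry from plain Python, the sketch below downloads `tools.json` with the standard library and indexes the entries by name. The top-level `tools` key and the `credit_cost` field come from the examples and field list in these docs. The helper names (`fetch_registry`, `index_tools`, `paid_tools`) are hypothetical, and treating a falsy `credit_cost` as "free" is an assumption about how that field is encoded.

```python
import json
import urllib.request

TOOLS_URL = "https://api.pictograph.io/api/v1/developer/tools.json"


def fetch_registry(api_key: str, url: str = TOOLS_URL) -> dict:
    """Download the registry payload using the X-API-Key header."""
    request = urllib.request.Request(url, headers={"X-API-Key": api_key})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def index_tools(payload: dict) -> dict[str, dict]:
    """Key each registry entry by tool name for direct lookup at dispatch time."""
    return {entry["name"]: entry for entry in payload["tools"]}


def paid_tools(index: dict[str, dict]) -> list[str]:
    """Sorted names of tools with a truthy credit_cost (assumption: 0/None means free)."""
    return sorted(name for name, entry in index.items() if entry.get("credit_cost"))
```

Calling `index_tools(fetch_registry(api_key))` once at startup would give a dispatcher a name-to-schema lookup. The network call is kept in its own function so the indexing logic can be exercised offline with a saved payload.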

## Vercel AI SDK

```ts
import { generateText, jsonSchema, tool } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';

const headers = {
  'X-API-Key': process.env.PICTOGRAPH_API_KEY!,
  'Content-Type': 'application/json',
};

const { tools } = await fetch(
  'https://api.pictograph.io/api/v1/developer/tools.json',
  { headers },
).then(r => r.json());

// Build a tools map keyed by name. The execute() function dispatches
// back to the Pictograph REST API directly — no Python required.
const pictographTools = Object.fromEntries(
  tools.map((t: any) => [
    t.name,
    tool({
      description: t.description,
      parameters: jsonSchema(t.input_schema),  // wrap the raw JSON Schema for the AI SDK
      execute: async (args: any) => {
        // Map tool name → REST endpoint.
        // (See: jsonschema-to-rest mapper, or hardcode the routes you use.)
        const res = await fetch(
          `https://api.pictograph.io/api/v1/developer/_dispatch/${t.name}`,
          { method: 'POST', headers, body: JSON.stringify(args) },
        );
        return res.json();
      },
    }),
  ]),
);

const result = await generateText({
  model: anthropic('claude-opus-4'),
  tools: pictographTools,
  prompt: 'List my Pictograph datasets',
});
```

(Pictograph doesn't ship a `_dispatch` endpoint in v1.0.0 — wire each tool to its underlying REST endpoint manually. The `Toolkit.dispatch()` Python method is the reference behavior.)

## LangChain

```python
import os

import requests
from langchain_core.tools import tool

headers = {"X-API-Key": os.environ["PICTOGRAPH_API_KEY"]}
schema = requests.get(
    "https://api.pictograph.io/api/v1/developer/tools.json",
    headers=headers,
).json()

# Build LangChain tools from the JSON Schema:
def make_tool(spec):
    @tool(spec["name"], description=spec["description"], args_schema=...)
    def _run(**kwargs):
        # Call the matching REST endpoint with kwargs.
        ...
    return _run

langchain_tools = [make_tool(t) for t in schema["tools"]]
```

For LangChain specifically, you can also use the Pictograph Python SDK directly inside a `@tool` — that's often simpler than rebuilding the dispatch loop:

```python
from langchain_core.tools import tool
from pictograph import Client

client = Client()

@tool("list_datasets", description="List Pictograph datasets in your org.")
def list_datasets(limit: int = 100) -> list[dict]:
    return [d.model_dump(mode="json") for d in client.datasets.list(limit=limit)]
```

This trades the dynamic-discovery benefit for typed, tested SDK calls — worth it for production.

## Custom dispatchers

If you're rolling your own:

1. Fetch `/api/v1/developer/tools.json` once at startup.
2. Hand the `tools` array to your LLM as the function/tool spec.
3. When the LLM emits a tool call, look up the tool by name.
4. Map name → REST endpoint (see the [SDK source](https://github.com/pictograph-labs/pictograph-sdk/tree/main/src/pictograph/resources) for canonical mappings).
5. Send the args as the JSON body, return the response to the LLM.

Or use the Python SDK as a server-side dispatcher and only expose tool names + schemas to your client — usually the cleanest production architecture.

## Snapshot file

The same registry ships in the Python SDK package — useful for offline work or if you want to bundle the schema with your agent:

```python
from pictograph.agents import Toolkit
from unittest.mock import MagicMock

toolkit = Toolkit(MagicMock())
schema = toolkit.as_json_schema()  # identical to the HTTP response payload
```

Or via the CLI:

```bash
pictograph agents export-tools -o tools.json
```

## See also

- [Agents — overview](/docs/agents) — the three integration paths.
- [Tool registry endpoint](/docs/api-reference/tools) — full spec.
- [Cookbook](/docs/agents/cookbook) — concrete recipe examples.

---

## Page: Agent cookbook

_URL: https://pictograph.io/docs/agents/cookbook (markdown: agents/cookbook.md)_
_Section: Agents_

Concrete patterns for driving Pictograph from agents. Each recipe shows the system-prompt guidance plus the expected tool-call sequence.

## 1. Credit-aware training

**User asks**: "Train a YOLOX model on my 'road-signs' dataset"

**System prompt addition**:

> For paid operations, always call `estimate_credit_cost` first. If
> insufficient, tell the user the gap and ask before proceeding.

**Expected sequence**:

```
1. get_credit_balance()
   → {credits_remaining: 1500, ...}
2. estimate_credit_cost("training_a10g_per_minute", quantity=30)
   → {total_credits: 300, sufficient: true, ...}
3. (response to user) "Estimated 300 credits, you have 1500. Proceeding."
4. train_pipeline(dataset_name="road-signs", pipeline="yolox", gpu="a10g")
   → {training_run: {...}, model: {id: "model-uuid", status: "ready"}}
5. (response to user) "Trained. Model ID: model-uuid. Download with
   `pictograph models download model-uuid -o ./yolox.onnx`"
```

If `sufficient: false`, the agent should surface `PaymentRequiredError.upgrade_url` to the user, not the raw exception.

## 2. V7 import → auto-annotate with new classes

**User asks**: "Import 'road-damage' from V7 and add a 'pothole' class auto-annotated by SAM3"

**Expected sequence**:

```
1. validate_connector(provider="v7", api_key="<v7-key-from-user>")
   → {valid: true, datasets: [{id, name: "road-damage", ...}, ...]}
2. import_from_connector(provider="v7", api_key=..., datasets=[<chosen>])
   → {import_id, status: "processing", ...}
   (the SDK polls until terminal — agent waits)
3. (response to user) "Imported. Now adding 'pothole' annotations…"
4. auto_annotate_dataset(
     dataset_name="road-damage",
     classes=[{name: "pothole", output_type: "polygon"}],
     mode="batch",
   )
   → {images_processed: 234, annotations_added: 487, ...}
5. (response to user summary)
```

Existing V7 annotations are preserved — auto-annotate by default skips images that already have annotations (`overwrite=False`).

## 3. Batch SAM3 with progress + per-image fallback

**User asks**: "Auto-annotate 'wildlife' for 'tiger' and 'elephant'; fall back to text mode if batch fails"

**System prompt addition**:

> If a batch tool returns `failed_images > 0` or raises, retry the
> failed subset with the synchronous text-prompt mode.

**Expected sequence**:

```
1. auto_annotate_dataset(
     dataset_name="wildlife",
     classes=[{name: "tiger", output_type: "polygon"},
              {name: "elephant", output_type: "polygon"}],
     mode="batch",
   )
   → {images_processed: 198, failed_images: 12, job_id, ...}
2. (response to user) "Batch done — 12 images failed. Retrying as text..."
3. get_dataset(name="wildlife", include_images=true)
   → list of image filenames
4. For each failed image (agent loops):
   auto_annotate_text(
     dataset_name="wildlife",
     image_filename="img-failed-1.jpg",
     text_prompt="tiger or elephant",
     confidence_threshold=0.3,
   )
5. (final response)
```

In practice, prefer `auto_annotate_dataset(mode="batch")` first for speed, fall back to `mode="text"` only on the failures.

## 4. Multi-class export with status filtering

**User asks**: "Export only the 'complete' subset of 'road-signs', just the 'stop_sign' and 'yield' classes, in YOLO format"

**Expected sequence**:

```
1. get_dataset(name="road-signs")
   → confirms classes include both
2. create_export(
     dataset_name="road-signs",
     name="stop-yield-yolo-2026-04-19",
     format="yolo",
     include_images=true,
     class_filter=["stop_sign", "yield"],
     status_filter="complete",
   )
   → {id, status: "completed", image_count: 412, ...}
3. download_export(
     dataset_name="road-signs",
     export_name="stop-yield-yolo-2026-04-19",
     output_path="./stop-yield.zip",
   )
4. (response to user) "Wrote ./stop-yield.zip — 412 images, 1287 annotations."
```

Always include the date in export names so re-runs don't conflict (`409 ConflictError` on duplicate names).

## 5. Cleanup confirmation flow

**User asks**: "Delete the old test datasets"

**System prompt addition**:

> Before any destructive action (`delete_dataset`, `delete_image`,
> `cancel_training`), list the targets and ask the user to confirm by
> repeating the names.

**Expected sequence**:

```
1. list_datasets(limit=100)
   → [{name: "test-1", ...}, {name: "test-2", ...}, {name: "production", ...}]
2. (response to user) "Found 2 with 'test' in the name: test-1, test-2.
   Confirm by typing the names you want deleted, separated by commas."
3. (user) "test-1, test-2"
4. (agent) For each confirmed name:
   delete_dataset(name="test-1")
   delete_dataset(name="test-2")
5. (response to user) "Deleted: test-1, test-2."
```

The `delete_dataset` tool is marked `idempotent=True` and `required_role="admin"` — agents using a `member` key get `403 ForbiddenError` automatically.

## 6. Folder reorganization with batch ops

**User asks**: "Move all images tagged 'blurry' to a 'blurry' folder"

**Expected sequence**:

```
1. search_by_tag(dataset_name="…", attributes=["blurry"], limit=500)
   → [{image_id, filename, ...}, ...]
2. batch.move(
     image_ids=[r.image_id for r in results],
     folder_path="/blurry",
   )
   → {succeeded: [...], failed_count: 0}
3. (response to user) "Moved 87 blurry images to /blurry."
```

Agents should prefer `batch.*` over per-image loops — single call, single round-trip, atomic at the database level.

## See also

- [Claude integration](/docs/agents/claude) — Anthropic-specific paths
- [OpenAI integration](/docs/agents/openai) — OpenAI-specific paths
- [Agents — overview](/docs/agents) — toolkit setup, guardrails, and `pictograph-cv` Skill install

---

## Page: Workflows

_URL: https://pictograph.io/docs/workflows (markdown: workflows.md)_
_Section: Workflows_

Workflows are the headline UX of the SDK.
Each one chains several REST calls into a single Python function so you can express "upload this folder and train a model" without orchestrating it yourself. Failures fail open — every workflow returns a report you can inspect.

```python
from pictograph import Client
from pictograph.workflows import full_pipeline

client = Client()

report = full_pipeline(
    client,
    dataset_name="road-signs",
    folder="./road_signs",
    classes=[("stop_sign", "bbox"), ("yield", "bbox")],
    pipeline="yolox",
)

print("model:", report.model.id if report.success else report.upload.failures)
```

## When to reach for each

| Workflow | What it chains | Use when |
| --- | --- | --- |
| [`full_pipeline`](/docs/workflows/full-pipeline) | upload → auto-annotate → train | You have a folder of images and want a trained model |
| [`upload_dataset_from_folder`](/docs/workflows/upload) | walk folder → bulk upload | You only need the upload step (annotations come later) |
| [`auto_annotate_dataset`](/docs/workflows/auto-annotate) | list images → SAM3 batch → save | The dataset is uploaded; you want SAM3 to label it |
| [`train_pipeline`](/docs/workflows/train) | create export → train → fetch model | Annotations are saved; you want the model |

## Report objects, not exceptions

Workflows don't raise on partial failure. They return dataclasses with per-phase success flags and failure lists. This is intentional — agents and CI jobs need to make decisions on partial outcomes, not unwind on the first 4xx.

```python
report = upload_dataset_from_folder(client, "my-dataset", "./images")

if report.success:
    print(f"Uploaded {report.images_uploaded}")
else:
    for failure in report.failures:
        print(failure.path, failure.reason)
```

Exceptions are still raised for unrecoverable errors before any work happens (`NotFoundError` on a missing dataset, `ValidationError` on a bad pipeline name). See [Error handling](/docs/error-handling).
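
To make the "decide on partial outcomes" idea concrete, here is a minimal sketch of a CI gate that tolerates a small share of failed uploads. `FakeUploadReport` and `FakeFailure` are local stand-ins that mirror the report fields documented on the Upload page, so the snippet runs without the SDK. `ci_exit_code` and its `max_failure_ratio` threshold are a hypothetical policy, not part of Pictograph.

```python
from dataclasses import dataclass, field


@dataclass
class FakeFailure:
    """Stand-in for the SDK's per-file failure record (.path and .reason)."""
    path: str
    reason: str


@dataclass
class FakeUploadReport:
    """Stand-in mirroring the documented UploadReport fields."""
    dataset_name: str
    images_attempted: int
    images_uploaded: int
    images_skipped: int
    failures: list = field(default_factory=list)

    @property
    def success(self) -> bool:
        # Documented rule: zero failures and at least one file uploaded.
        return not self.failures and self.images_uploaded > 0


def ci_exit_code(report, max_failure_ratio: float = 0.05) -> int:
    """Return 0 when the failure share is tolerable, 1 otherwise (hypothetical policy)."""
    if report.images_attempted == 0:
        return 1  # nothing was attempted, so treat the run as failed
    ratio = len(report.failures) / report.images_attempted
    return 0 if ratio <= max_failure_ratio else 1
```

A real job would pass the report returned by `upload_dataset_from_folder(...)` to `ci_exit_code` and hand the result to `sys.exit`. Note that `report.success` alone would fail the job on a single bad file, which is why the sketch applies its own threshold.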

---

## Page: Full pipeline

_URL: https://pictograph.io/docs/workflows/full-pipeline (markdown: workflows/full-pipeline.md)_
_Section: Workflows_

`full_pipeline()` chains upload → auto-annotate → train. Each phase short-circuits on failure and the `PipelineReport` carries every sub-report so you can see exactly where the chain broke.

```python
from pictograph import Client
from pictograph.workflows import full_pipeline

client = Client()

report = full_pipeline(
    client,
    dataset_name="road-signs",
    folder="./road_signs",
    classes=[("stop_sign", "bbox"), ("yield", "bbox")],
    pipeline="yolox",
)

if report.success:
    print("Model:", report.model.id)
else:
    print("Stopped at:", report.credit_skip_reason or "see sub-reports")
```

## Signature

```python
full_pipeline(
    client: Client,
    *,
    dataset_name: str,
    folder: str | Path,
    classes: Sequence[BatchClass | tuple[str, str] | dict[str, str]],
    pipeline: PipelineType,
    gpu: GpuType = "a10g",
    annotate: bool = True,
    annotate_mode: AnnotateMode = "batch",
    train: bool = True,
    upload_workers: int = 8,
    train_config: dict[str, Any] | None = None,
    train_timeout: float = 7200.0,
    min_credits: int | None = 1,
) -> PipelineReport
```

| Argument | Default | Purpose |
| --- | --- | --- |
| `dataset_name` | required | Destination dataset; created if missing |
| `folder` | required | Local folder of images — subdirectories become virtual folders |
| `classes` | required | Each class becomes a SAM3 target and a training label |
| `pipeline` | required | `yolox`, `detectron2`, `sm_pytorch`, `classification`, `rfdetr_detection`, `rfdetr_segmentation` |
| `gpu` | `"a10g"` | `a10g`, `a100`, or `h100` |
| `annotate` | `True` | Skip the SAM3 phase if you already have annotations |
| `annotate_mode` | `"batch"` | `batch` (async, multi-image) or `text` (synchronous per-image) |
| `train` | `True` | Skip training to do upload + annotate only |
| `upload_workers` | `8` | Concurrent upload threads |
| `train_config` | `None` | Hyperparameters (`epochs`, `batch_size`, `learning_rate`, `image_size`) |
| `train_timeout` | `7200` | Max seconds to wait for training (2 hours) |
| `min_credits` | `1` | Pre-flight balance check before paid phases — pass `None` to disable |

## How the chain fails open

Each phase only runs if the previous one succeeded.

1. **Upload** always runs. If it produces zero successes, the function returns immediately with `upload.failures` populated.
2. **Credit gate**: before any paid phase, the balance is checked. If it is below `min_credits`, the function returns with `credit_skip_reason` set.
3. **Auto-annotate** runs only when `annotate=True` and upload succeeded. If it produces zero processed images, training is skipped.
4. **Train** runs only when `train=True` and the previous phases succeeded. The export is auto-named `<pipeline>-<timestamp>`.

## Inspecting the report

```python
@dataclass
class PipelineReport:
    dataset_name: str
    upload: UploadReport
    annotate: AnnotateReport | None
    training_run: TrainingRun | None
    model: Model | None
    credit_skip_reason: str | None

    @property
    def success(self) -> bool: ...
```

`success` is `True` only when every populated phase succeeded and no credit skip happened. Each sub-report has its own `success` property.

## Common patterns

**Upload + annotate only** (no training):

```python
full_pipeline(client, ..., pipeline="yolox", train=False)
```

**Use existing annotations** (skip SAM3):

```python
full_pipeline(client, ..., annotate=False, pipeline="yolox")
```

**Disable the credit pre-flight** (you're OK paying for partial runs):

```python
full_pipeline(client, ..., min_credits=None)
```

## Errors

The function does not raise on partial failure — inspect the report.
It still raises for unrecoverable conditions, usually before any work begins:

| Status | Exception | Cause |
| --- | --- | --- |
| 402 | `PaymentRequiredError` | Mid-run cost exceeds balance (from auto-annotate or training phase) |
| 422 | `ValidationError` | `pipeline` or `gpu` value invalid |
| — | `FileNotFoundError` | `folder` doesn't exist or isn't a directory |

## See also

- [Upload](/docs/workflows/upload) · [Auto-annotate](/docs/workflows/auto-annotate) · [Train](/docs/workflows/train) — the underlying workflows
- [Credits](/docs/api-reference/credits) — pre-flight cost estimation
- [Error handling](/docs/error-handling)

---

## Page: Upload from folder

_URL: https://pictograph.io/docs/workflows/upload (markdown: workflows/upload.md)_
_Section: Workflows_

`upload_dataset_from_folder()` walks a local directory of images, creates the destination dataset if needed, and uploads everything through a thread pool. Subdirectories become virtual folders on the dataset by default. Re-runs are idempotent — duplicate filenames are skipped, not failed.

```python
from pictograph import Client
from pictograph.workflows import upload_dataset_from_folder

client = Client()

report = upload_dataset_from_folder(
    client,
    dataset_name="road-signs",
    folder="./road_signs",
)

print(f"{report.images_uploaded} uploaded, {report.images_skipped} skipped")
```

## Signature

```python
upload_dataset_from_folder(
    client: Client,
    dataset_name: str,
    folder: str | Path,
    *,
    organize_by_class: bool = True,
    parallel: bool = True,
    max_workers: int = 8,
    skip_existing: bool = True,
    create_if_missing: bool = True,
    progress: Callable[[int, int, str | None], None] | None = None,
) -> UploadReport
```

| Argument | Default | Purpose |
| --- | --- | --- |
| `dataset_name` | required | Destination dataset |
| `folder` | required | Local directory (walked recursively) |
| `organize_by_class` | `True` | First-level subdirectories become virtual folders |
| `parallel` | `True` | Use a thread pool |
| `max_workers` | `8` | Pool size — higher values risk hitting the rate limit |
| `skip_existing` | `True` | Treat duplicate-filename conflicts as skips, not failures |
| `create_if_missing` | `True` | Create the dataset if it doesn't exist (else `NotFoundError`) |
| `progress` | `None` | `(completed, total, filename)` callback fired after each file |

## Folder layout convention

With `organize_by_class=True` (the default), the **first-level subdirectory** becomes the virtual folder:

```
./road_signs/
├── stop/          → /stop on the dataset
│   ├── 001.jpg
│   └── 002.jpg
├── yield/         → /yield
│   └── 003.jpg
└── 004.jpg        → / (root)
```

Nested subdirectories collapse — `./road_signs/stop/night/005.jpg` still lands in `/stop`. Pass `organize_by_class=False` to put every file at the root.

Supported extensions: `.jpg`, `.jpeg`, `.png`, `.webp`, `.bmp`, `.tif`, `.tiff`, `.gif`, `.heic`.

## Idempotency

Re-running the same call on a dataset that already has matching filenames is safe — those uploads come back as `images_skipped`.
To attempt those uploads anyway, set `skip_existing=False`; duplicate-filename conflicts are then recorded as failures instead of skips.

```python
# First run — uploads everything.
report = upload_dataset_from_folder(client, "road-signs", "./road_signs")
assert report.images_uploaded == 100 and report.images_skipped == 0

# Second run — skips everything that's already there.
report = upload_dataset_from_folder(client, "road-signs", "./road_signs")
assert report.images_uploaded == 0 and report.images_skipped == 100
```

## Progress callback

```python
def on_progress(done: int, total: int, filename: str | None) -> None:
    print(f"[{done}/{total}] {filename}")

upload_dataset_from_folder(
    client, "road-signs", "./road_signs",
    progress=on_progress,
)
```

The callback fires once per file, regardless of success or failure.

## Inspecting the report

```python
@dataclass
class UploadReport:
    dataset_name: str
    images_attempted: int
    images_uploaded: int
    images_skipped: int
    failures: list[UploadFailure]  # each carries .path and .reason

    @property
    def success(self) -> bool: ...
```

`success` is `True` only when there are zero failures **and** at least one file uploaded. An empty folder returns a report with `success=False`.

## Errors

| Status | Exception | Cause |
| --- | --- | --- |
| — | `FileNotFoundError` | `folder` doesn't exist or isn't a directory |
| 404 | `NotFoundError` | `dataset_name` missing and `create_if_missing=False` |

Per-file errors (network, validation, conflict) are recorded in `report.failures`, not raised.

## See also

- [Full pipeline](/docs/workflows/full-pipeline) — chains upload with annotate + train
- [Images](/docs/api-reference/images) — the underlying `client.images.upload()` method

---

## Page: Auto-annotate a dataset

_URL: https://pictograph.io/docs/workflows/auto-annotate (markdown: workflows/auto-annotate.md)_
_Section: Workflows_

`auto_annotate_dataset()` runs SAM3 over a dataset and saves the resulting annotations.
By default it runs in **batch mode** — one async job over many images — which is the right call for anything above ~10 images. Text mode (one synchronous prompt per image) is available for debugging.

```python
from pictograph import Client
from pictograph.workflows import auto_annotate_dataset

client = Client()

report = auto_annotate_dataset(
    client,
    dataset_name="road-signs",
    classes=[("stop_sign", "bbox"), ("yield", "polygon")],
)

print(f"{report.annotations_added} annotations across {report.images_processed} images")
```

## Signature

```python
auto_annotate_dataset(
    client: Client,
    dataset_name: str,
    classes: Sequence[BatchClass | tuple[str, str] | dict[str, str]],
    *,
    mode: AnnotateMode = "batch",
    confidence_threshold: float = 0.5,
    overwrite: bool = False,
    max_images: int | None = None,
    poll_interval: float = 5.0,
    timeout: float = 1800.0,
) -> AnnotateReport
```

| Argument | Default | Purpose |
| --- | --- | --- |
| `dataset_name` | required | Project name |
| `classes` | required | What to detect — see "Class specs" below |
| `mode` | `"batch"` | `batch` (async multi-image) or `text` (synchronous per-image) |
| `confidence_threshold` | `0.5` | SAM3 score cutoff (0–1) |
| `overwrite` | `False` | When `False`, skip images that already have annotations |
| `max_images` | `None` | Cap (useful for dry-runs) |
| `poll_interval` | `5.0` | `batch` mode — seconds between status polls |
| `timeout` | `1800` | `batch` mode — max seconds to wait |

## Class specs

`classes` accepts three shapes — pick whichever is shortest:

```python
# 1. Tuples — name + output_type
classes=[("stop_sign", "bbox"), ("yield", "polygon")]

# 2. Dicts
classes=[
    {"name": "stop_sign", "output_type": "bbox"},
    {"name": "yield", "output_type": "polygon"},
]

# 3. BatchClass (canonical)
from pictograph.models.auto_annotate import BatchClass
classes=[BatchClass(name="stop_sign", output_type="bbox")]
```

Valid `output_type` values: `"bbox"`, `"polygon"`, `"polyline"`, `"keypoint"`.
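
One way to picture how the three shapes collapse into a single canonical form is the sketch below. It is not the SDK's own normalizer: `ClassSpec` is a hypothetical stand-in for `BatchClass`, and `normalize_classes` is an illustrative helper that also checks `output_type` against the four documented values.

```python
from dataclasses import dataclass

VALID_OUTPUT_TYPES = {"bbox", "polygon", "polyline", "keypoint"}


@dataclass(frozen=True)
class ClassSpec:
    """Hypothetical stand-in for the SDK's BatchClass."""
    name: str
    output_type: str


def normalize_classes(specs) -> list:
    """Convert tuples, dicts, and ClassSpec objects into a list of ClassSpec."""
    normalized = []
    for spec in specs:
        if isinstance(spec, ClassSpec):
            item = spec
        elif isinstance(spec, dict):
            item = ClassSpec(spec["name"], spec["output_type"])
        else:
            # Anything else is treated as a (name, output_type) pair.
            name, output_type = spec
            item = ClassSpec(name, output_type)
        if item.output_type not in VALID_OUTPUT_TYPES:
            raise ValueError(
                f"unknown output_type {item.output_type!r} for class {item.name!r}"
            )
        normalized.append(item)
    return normalized
```

Running a check like this on user-supplied class lists before calling the workflow surfaces a bad `output_type` locally rather than as a `422` from the API.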
## Batch vs text mode `mode="batch"` (default) sends every image and every class to one async SAM3 job. The job is polled until it terminates; you get one report at the end. This is what you want for >10 images — it's faster and cheaper per image. `mode="text"` runs one synchronous SAM3 text-prompt per image, per class. It's slower (no batching) and saves annotations as they come back. Use it when you need to debug a single image or when the dataset is small enough that the batch warmup overhead isn't worth it. ## Skip vs overwrite By default the workflow skips images that already have at least one annotation. Set `overwrite=True` to re-annotate everything: ```python # Annotate only the unlabelled subset. auto_annotate_dataset(client, "road-signs", classes=[...]) # Re-annotate every image (overwrites existing). auto_annotate_dataset(client, "road-signs", classes=[...], overwrite=True) ``` ## Inspecting the report ```python @dataclass class AnnotateReport: dataset_name: str images_attempted: int images_processed: int images_skipped: int annotations_added: int failures: list[AnnotationFailure] job_id: str | None # set only when mode="batch" @property def success(self) -> bool: ... ``` In batch mode, `job_id` lets you fetch the job later via `client.auto_annotate.get_batch(job_id)` (e.g. to surface progress in a UI) or cancel it. ## Errors | Status | Exception | Cause | | --- | --- | --- | | 404 | `NotFoundError` | Dataset doesn't exist | | 402 | `PaymentRequiredError` | Insufficient credits | | 422 | `ValidationError` | Class name invalid or `output_type` not recognised | Per-image failures are recorded in `report.failures` — they don't raise. 
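
Because per-image failures land in the report instead of raising, callers usually want to inspect the report after the call. The sketch below shows one way to turn it into a short log summary. `ReportStub` mirrors the `AnnotateReport` fields documented above, while `FailureStub` and its `image` and `reason` fields are assumptions, since the shape of `AnnotationFailure` is not spelled out on this page.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FailureStub:
    """Stand-in for AnnotationFailure; both field names are assumptions."""
    image: str
    reason: str


@dataclass
class ReportStub:
    """Stand-in mirroring the documented AnnotateReport fields."""
    dataset_name: str
    images_attempted: int
    images_processed: int
    images_skipped: int
    annotations_added: int
    failures: List[FailureStub] = field(default_factory=list)
    job_id: Optional[str] = None


def summarize(report: ReportStub) -> str:
    """Build a one-line summary followed by one line per recorded failure."""
    lines = [
        f"{report.dataset_name}: {report.images_processed}/{report.images_attempted} processed, "
        f"{report.images_skipped} skipped, {report.annotations_added} annotations added"
    ]
    for failure in report.failures:
        lines.append(f"  failed: {failure.image} ({failure.reason})")
    return "\n".join(lines)


report = ReportStub(
    dataset_name="road-signs",
    images_attempted=100,
    images_processed=97,
    images_skipped=1,
    annotations_added=240,
    failures=[FailureStub("img-7.jpg", "timeout"), FailureStub("img-9.jpg", "no detection")],
)
summary = summarize(report)
print(summary)
```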
## See also - [Full pipeline](/docs/workflows/full-pipeline) — chains annotate with upload + train - [Auto-annotate](/docs/api-reference/auto-annotate) — point / box / text / batch primitives - [Credits](/docs/api-reference/credits) — cost estimation per image --- ## Page: Train a model _URL: https://pictograph.io/docs/workflows/train (markdown: workflows/train.md)_ _Section: Workflows_ `train_pipeline()` chains export creation, training, and model fetch. The export is auto-named so the workflow doesn't collide with exports you've created manually. ```python from pictograph import Client from pictograph.workflows import train_pipeline client = Client() run, model = train_pipeline( client, "road-signs", pipeline="yolox", gpu="a10g", config={"epochs": 50, "batch_size": 16}, ) if model: client.models.download(model.id, "./yolox.onnx") ``` ## Signature ```python train_pipeline( client: Client, dataset_name: str, *, pipeline: PipelineType, gpu: GpuType = "a10g", name: str | None = None, config: dict[str, Any] | None = None, export_name: str | None = None, class_filter: list[str] | None = None, status_filter: str = "complete", wait: bool = True, poll_interval: float = 5.0, timeout: float = 7200.0, ) -> tuple[TrainingRun, Model | None] ``` | Argument | Default | Purpose | | --- | --- | --- | | `dataset_name` | required | Project to train on | | `pipeline` | required | `yolox`, `detectron2`, `sm_pytorch`, `classification`, `rfdetr_detection`, `rfdetr_segmentation` | | `gpu` | `"a10g"` | `a10g`, `a100`, or `h100` | | `name` | auto | Run name (defaults to `<pipeline>-run-<timestamp>`) | | `config` | `{}` | Hyperparameters (`epochs`, `batch_size`, `learning_rate`, `image_size`) | | `export_name` | auto | Defaults to `<pipeline>-<timestamp>` | | `class_filter` | `None` | Train only on these classes | | `status_filter` | `"complete"` | Only include images at this annotation status | | `wait` | `True` | When `True`, block until training terminates and fetch the model | | 
`poll_interval` | `5.0` | Seconds between polls | | `timeout` | `7200` | Max seconds to wait (2 hours) | ## Pipelines | `pipeline` | Output | When to pick it | | --- | --- | --- | | `yolox` | Object detection (boxes) | Speed, edge deployment, small datasets | | `detectron2` | Instance segmentation (polygons + masks) | Per-instance pixel masks | | `sm_pytorch` | Semantic segmentation | Pixel-wise class maps | | `classification` | Image classification | Tag-style labels with no geometry | | `rfdetr_detection` | Object detection | Higher mAP than YOLOX on harder data | | `rfdetr_segmentation` | Instance segmentation | Higher mAP than Detectron2 on harder data | ## GPU tiers | `gpu` | Pick for | | --- | --- | | `a10g` (default) | YOLOX, classification, RF-DETR-detection | | `a100` | Detectron2, large RF-DETR, big batch sizes | | `h100` | Last resort — only when A100 OOMs | The dataset must have **at least 5 images** with the chosen `status_filter` so the worker can split train / val / test. ## What happens under the hood ``` 1. client.exports.create(dataset, "<pipeline>-<timestamp>", format="pictograph", include_images=True, class_filter=…, status_filter=…) → waits for the export to finish. 2. client.training.create(dataset, export_name, pipeline_type=…, name=…, config=…, gpu_type=…, wait=…, timeout=…) → kicks off the run; polls until terminal when wait=True. 3. client.models.get(run.model_id) → returns the trained model — only when wait=True and status=="completed". ``` ## Async usage Pass `wait=False` to fire-and-forget: ```python run, _ = train_pipeline(client, "road-signs", pipeline="yolox", wait=False) print("queued:", run.id) # Poll yourself later. run = client.training.get(run.id) if run.status == "completed": model = client.models.get(run.model_id) ``` ## Hyperparameters `config` keys are pipeline-specific. 
Common ones across pipelines: | Key | Type | Typical | | --- | --- | --- | | `epochs` | int | 30–100 | | `batch_size` | int | 8 / 16 / 32 | | `learning_rate` | float | `0.001`–`0.01` | | `image_size` | int | `640` (YOLOX), `1024` (Detectron2) | Unsupported keys are ignored. ## Errors | Status | Exception | Cause | | --- | --- | --- | | 404 | `NotFoundError` | Dataset missing or has no `status_filter`-matching images | | 422 | `ValidationError` | Pipeline or GPU invalid, or dataset has fewer than 5 annotated images | | 402 | `PaymentRequiredError` | Insufficient credits for the estimated training minutes | | 408 | `PollTimeoutError` | `wait=True` and `timeout` elapsed (the run continues; poll later) | | 5xx | `ApiError` | Training run failed — inspect `run.error_message` | ## See also - [Full pipeline](/docs/workflows/full-pipeline) — chains training with upload + annotate - [Training](/docs/api-reference/training) — lower-level `create / list / get / cancel` primitives - [Models](/docs/api-reference/models) — download trained ONNX weights - [Credits](/docs/api-reference/credits) — `estimate("training_<gpu>_per_minute")` --- ## Page: API reference _URL: https://pictograph.io/docs/api-reference (markdown: api-reference.md)_ _Section: API Reference_ The Pictograph SDK exposes 15 resource groups under `client.<resource>`. Each method maps 1:1 to a REST endpoint. Use the SDK for type safety and auto-retry; use raw REST when you need a non-Python language. 
```python from pictograph import Client client = Client() ``` ## Resources | Resource | Purpose | | --- | --- | | [Datasets](/docs/api-reference/datasets) | Project CRUD and bulk download | | [Images](/docs/api-reference/images) | Single-image upload / download / delete | | [Annotations](/docs/api-reference/annotations) | Per-image annotation read / save / delete | | [Auto-annotate](/docs/api-reference/auto-annotate) | SAM3 point / box / text / batch prompts | | [Search](/docs/api-reference/search) | Tag and similarity search across a dataset | | [Batch](/docs/api-reference/batch) | Bulk move / copy / delete / update on images | | [Exports](/docs/api-reference/exports) | Dataset exports in COCO, YOLO, CVAT, Pascal VOC, LabelMe, CSV | | [Training](/docs/api-reference/training) | Spawn and monitor training runs | | [Models](/docs/api-reference/models) | List and download trained ONNX weights | | [Credits](/docs/api-reference/credits) | Balance, ledger, pre-flight cost estimation | | [Connectors](/docs/api-reference/connectors) | V7 / Roboflow dataset import | | [Video](/docs/api-reference/video) | Video upload + frame extraction | | [Organizations](/docs/api-reference/organizations) | Members and invites for the active org | | [Projects](/docs/api-reference/projects) | Project config (classes, annotation types) | | [API keys](/docs/api-reference/api-keys) | Programmatic key management | | [Tools](/docs/api-reference/tools) | Agent tool registry (JSON Schema reference) | ## Prefer workflows for end-to-end tasks For multi-step flows like "upload, annotate, train", reach for [`pictograph.workflows`](/docs/workflows) before chaining resource calls. Workflows handle short-circuit on failure, credit gating, and report aggregation for you. 
| Workflow | Chains | | --- | --- | | [`full_pipeline`](/docs/workflows/full-pipeline) | upload → auto-annotate → train | | [`upload_dataset_from_folder`](/docs/workflows/upload) | walk folder → bulk upload | | [`auto_annotate_dataset`](/docs/workflows/auto-annotate) | list images → batch SAM3 → save | | [`train_pipeline`](/docs/workflows/train) | export → train → fetch model | Each workflow is also exposed as an [agent tool](/docs/api-reference/tools). ## Conventions across resources - **Name-based lookups** are preferred (`get(name=...)`). UUID variants exist where useful (`get_by_id(...)`). - **Pagination**: `.list()` returns one page; `.iter()` returns an `OffsetPager` that auto-fetches subsequent pages. - **Long-running ops** (training, exports, batch SAM3, dataset imports) default to `wait=True` and poll until terminal. Pass `wait=False` to fire-and-forget. - **Failure reports**: bulk operations return per-item failure lists rather than raising on the first error. - **Idempotency**: mutating operations auto-generate `Idempotency-Key` headers — safe to retry. Pass `idempotency_key=` to set it explicitly. --- ## Page: Datasets _URL: https://pictograph.io/docs/api-reference/datasets (markdown: api-reference/datasets.md)_ _Section: API Reference_ A **dataset** in Pictograph is a project — a collection of images sharing a class set. Datasets are unique by `(organization, name)`. The SDK strongly prefers name-based lookups (agents pass strings users gave them; UUID indirection is friction). ```python from pictograph import Client client = Client() ``` ## list Single-page list of datasets in your organization. ```python datasets = client.datasets.list(limit=100) for ds in datasets: print(ds.name, ds.image_count) ``` | Arg | Type | Default | Notes | |---|---|---|---| | `limit` | `int` | `100` | Backend cap: 1000 | Returns `list[Dataset]`. ## iter Auto-paging iterator over every dataset. 
```python for ds in client.datasets.iter(page_size=100): print(ds.name) # Or materialize: all_datasets = client.datasets.iter().all() ``` | Arg | Type | Default | Notes | |---|---|---|---| | `page_size` | `int` | `100` | Items per backend round-trip | | `max_total` | `int \| None` | `None` | Stop after this many items | Returns `OffsetPager[Dataset]`. ## get Fetch by name (case-sensitive within org). ```python ds = client.datasets.get("road-signs", include_images=True, images_limit=200) print(ds.image_count, len(ds.images)) ``` | Arg | Type | Default | Notes | |---|---|---|---| | `name` | `str` | required | Dataset name | | `include_images` | `bool` | `False` | Embed first `images_limit` `DatasetImage` summaries | | `images_limit` | `int` | `1000` | Backend cap: 10000 | | `images_offset` | `int` | `0` | Page the embedded image list | Returns `Dataset`. ## get_by_id UUID lookup. Use only when you already have the ID. ```python ds = client.datasets.get_by_id("a3e12f...") ``` ## download Bulk-download images and / or annotations to a local directory. Fetches a batch of signed download URLs in one call, then downloads in parallel via a thread pool. ```python report = client.datasets.download( "road-signs", output_dir="./dump", mode="full", # "full" | "images_only" | "annotations_only" status_filter="complete", # restrict to annotation-finalised images max_workers=10, progress=lambda done, total, fn: print(f"{done}/{total} {fn}"), ) print(report.images_downloaded, report.annotations_downloaded, len(report.failures)) ``` Returns a `DownloadReport`. Inspect `.failures` to retry the subset — the call does **not** raise on individual file errors. ## Project CRUD Project create / update / delete live on the [`projects`](/docs/api-reference/projects) resource (it's the same underlying entity; "dataset" is the SDK alias for the read paths and "project" is the alias for the write paths). 
```python proj = client.projects.create("new-dataset", description="…") client.projects.update("new-dataset", description="updated") client.projects.delete("new-dataset") ``` ## Common errors | Status | Exception | Cause | |---|---|---| | 404 | `NotFoundError` | Name doesn't exist (case-sensitive) or belongs to another org | | 409 | `ConflictError` | `create` with a duplicate name | | 403 | `ForbiddenError` | `delete` requires `admin`+ role | ## REST equivalent ```bash curl -H "X-API-Key: pk_live_…" \ https://api.pictograph.io/api/v1/developer/datasets/?limit=10 ``` --- ## Page: Images _URL: https://pictograph.io/docs/api-reference/images (markdown: api-reference/images.md)_ _Section: API Reference_ For bulk operations across many images, prefer the [upload workflow](/docs/quick-start) and the [`batch`](/docs/api-reference/batch) resource. This page is for single-image ops. ```python from pictograph import Client client = Client() ``` ## get Fetch metadata for a single image. ```python image = client.images.get("img-uuid-1") print(image.filename, image.status, image.annotation_count) ``` Returns `Image`. Annotations live on the [annotations](/docs/api-reference/annotations) resource — call `client.annotations.get(image.id)` to fetch them. ## upload Upload a local file to a dataset. Three steps under the hood: get a signed upload URL → PUT the bytes → register the image. ```python from pathlib import Path project = client.projects.get("my-dataset") image = client.images.upload( dataset_id=project.id, file_path=Path("./photo.jpg"), folder_path="/cars", # virtual folder on the dataset ) ``` | Arg | Type | Default | Notes | |---|---|---|---| | `dataset_id` | `str` | required | UUID of the destination dataset | | `file_path` | `str \| Path` | required | Local file. Pillow extracts dimensions client-side | | `folder_path` | `str` | `"/"` | Virtual folder (e.g. `/cars`). 
Storage paths are immutable | | `idempotency_key` | `str \| None` | auto | Override the auto-generated dedup key | Returns `Image`. Raises `ConflictError` if a file with the same name already exists in the same folder. **Supported extensions**: `.jpg`, `.jpeg`, `.png`, `.webp`, `.bmp`, `.tif`, `.tiff`, `.gif`, `.heic`. HEIC is auto-converted to PNG server-side. ## download Stream the original image bytes to a local file (chunked, safe for large images). ```python client.images.download("img-uuid-1", output_path="./photo.jpg") ``` | Arg | Type | Default | |---|---|---| | `image_id` | `str` | required | | `output_path` | `str \| Path` | required | The bytes are served via Cloud CDN with 30-day edge caching, so repeat downloads are fast. ## delete Soft-delete (archive) by default. Set `permanent=True` to free the stored bytes — irreversible. ```python client.images.delete("img-uuid-1") # archive (recoverable) client.images.delete("img-uuid-1", permanent=True) # permanent ``` Permanent deletes require `member`+ role on the API key. ## Bulk uploads For directories of images, use the workflow: ```python from pictograph.workflows import upload_dataset_from_folder report = upload_dataset_from_folder( client, "my-dataset", folder="./photos", organize_by_class=True, # subdirectory → virtual folder parallel=True, max_workers=8, ) print(report.images_uploaded, len(report.failures)) ``` See [Quick Start](/docs/quick-start) for the full workflow surface. 
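
Before a bulk upload it can be useful to filter out files the backend would reject anyway. The following is a rough client-side pre-check built from the supported-extension list above and the 50 MB limit listed under Common errors. `rejection_reason` is a hypothetical helper, not part of the SDK, and treating "50 MB" as 50 MiB and extensions as case-insensitive are both assumptions.

```python
from pathlib import Path
from typing import Optional

# Extensions listed under "Supported extensions" on this page.
SUPPORTED_EXTENSIONS = {
    ".jpg", ".jpeg", ".png", ".webp", ".bmp", ".tif", ".tiff", ".gif", ".heic",
}
# Assumption: the documented "50 MB" limit is read here as 50 MiB.
MAX_UPLOAD_BYTES = 50 * 1024 * 1024


def rejection_reason(filename: str, size_bytes: int) -> Optional[str]:
    """Return why a file would likely be rejected, or None if it looks fine."""
    ext = Path(filename).suffix.lower()  # assumption: case-insensitive match
    if ext not in SUPPORTED_EXTENSIONS:
        return f"unsupported extension {ext or '(none)'}"
    if size_bytes > MAX_UPLOAD_BYTES:
        return "file exceeds 50 MB"
    return None


candidates = [("photo.JPG", 2_000_000), ("notes.txt", 1_000), ("scan.tiff", 80 * 1024 * 1024)]
for name, size in candidates:
    print(name, "->", rejection_reason(name, size) or "ok")
```

For real files, the size would come from `Path(...).stat().st_size`. The server remains the authority, so a file that passes this check can still be rejected.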
## Common errors | Status | Exception | Cause | |---|---|---| | 404 | `NotFoundError` | `image_id` doesn't exist, or belongs to another org | | 409 | `ConflictError` | Filename collision in the same virtual folder | | 413 | `ApiError` | Uploaded file exceeds 50 MB | | 415 | `ValidationError` | Unsupported file extension | --- ## Page: Annotations _URL: https://pictograph.io/docs/api-reference/annotations (markdown: api-reference/annotations.md)_ _Section: API Reference_ Annotations follow the canonical Pictograph JSON schema — see [Annotation format](/docs/annotation-format) for the full spec. The class label field is **`name`** (not `class`). Polygons use multi-ring `paths`, not flat coordinate arrays. ```python from pictograph import Client client = Client() ``` ## get Fetch the typed annotation list attached to an image. ```python annotations = client.annotations.get("img-uuid-1") for ann in annotations: print(ann.name, ann.type) ``` Returns `list[Annotation]` — a discriminated union over `BBoxAnnotation` / `PolygonAnnotation` / `PolylineAnnotation` / `KeypointAnnotation`. An image with no annotations returns `[]` (never raises for the "no annotations" case — only for "no such image"). ## save Replace the image's annotations with the supplied list. **Full overwrite** — existing annotations are dropped. 
```python from pictograph import BBoxAnnotation, BoundingBox, PolygonAnnotation, PolygonGeometry, Point result = client.annotations.save("img-uuid-1", [ BBoxAnnotation( id="ann-1", name="person", bounding_box=BoundingBox(x=100, y=200, w=50, h=80), ), PolygonAnnotation( id="ann-2", name="car", polygon=PolygonGeometry(paths=[ [Point(x=0, y=0), Point(x=10, y=0), Point(x=10, y=10)], ]), ), ]) print(result.previous_count, "→", result.new_count, result.status) ``` | Arg | Type | Notes | |---|---|---| | `image_id` | `str` | Image UUID | | `annotations` | `Sequence[Annotation]` | Pydantic-validated client-side; backend re-validates | Returns `SaveResult` — `image_id`, `previous_count`, `new_count`, `status` (`"new"` / `"in_progress"` / `"complete"`, set automatically by count). Polygons may omit `bounding_box` on save — the backend computes the enclosing rectangle server-side. ## delete Remove every annotation from the image. Equivalent to `save(image_id, [])` but uses `DELETE` and requires `admin`+ role. ```python result = client.annotations.delete("img-uuid-1") print(result.deleted_count) ``` ## Validation The SDK Pydantic models reject malformed payloads at construction: ```python from pictograph import PolygonAnnotation, PolygonGeometry, Point PolygonGeometry(paths=[[Point(x=0, y=0)]]) # ValidationError: paths[0] has 1 point(s); polygon ring requires >= 3 ``` The backend re-validates on save as defense-in-depth — agents that construct dicts directly will hit `422 ValidationError` for the same class of mistakes. ## Common errors | Status | Exception | Cause | |---|---|---| | 404 | `NotFoundError` | `image_id` doesn't exist | | 422 | `ValidationError` | `class` instead of `name`, flat polygon array, unknown class label | | 403 | `ForbiddenError` | `delete` requires `admin`+ role | ## Auto-annotate workflow If you want SAM3 to generate annotations rather than write them by hand, see the [`auto-annotate`](/docs/api-reference/auto-annotate) resource. 
The `auto_annotate_dataset` workflow saves annotations automatically; the single-prompt methods return a `PromptResult` and you call `save` yourself. ## REST equivalent ```bash curl -X POST -H "X-API-Key: pk_live_…" \ -H "Content-Type: application/json" \ -d '{"image_id":"…","annotations":[…]}' \ https://api.pictograph.io/api/v1/developer/annotations/<image_id> ``` --- ## Page: Auto-annotate _URL: https://pictograph.io/docs/api-reference/auto-annotate (markdown: api-reference/auto-annotate.md)_ _Section: API Reference_ Pictograph runs SAM3 (Segment Anything Model 3) on T4 GPUs for auto-annotation. Three single-image prompt modes plus an async batch endpoint for many images at once. ```python from pictograph import Client client = Client() ``` ## point — "click here, segment that" Best when the user knows the object's location. ```python result = client.auto_annotate.point( dataset_name="my-dataset", image_filename="img-1.jpg", x=320, y=240, name="car", positive_points=[(310, 250)], # optional extra positives negative_points=[(100, 100)], # exclude regions score_threshold=0.75, ) # result.annotations[0] is a polygon — call client.annotations.save() to persist. ``` Returns `PromptResult` with `status` ∈ `{"success", "no_detection", "below_threshold"}`. On `success`, `annotations[0]` is a `PolygonAnnotation`. ## box — "segment everything in this box" Best when the user has drawn a rough bounding box. ```python result = client.auto_annotate.box( dataset_name="my-dataset", image_filename="img-1.jpg", box={"x": 100, "y": 200, "w": 200, "h": 150}, name="car", return_polygon=True, # also include polygon (not just bbox) confidence_threshold=0.5, negative_boxes=[{"x": 50, "y": 50, "w": 30, "h": 30}], ) ``` `return_polygon=False` returns only the refined bbox. ## text — "find all <thing>" Open-vocabulary text prompt. Best for many objects in one image. 
```python result = client.auto_annotate.text( dataset_name="my-dataset", image_filename="img-1.jpg", text_prompt="red cars", output_type="polygon", # or "bbox" confidence_threshold=0.3, max_detections=50, ) ``` ## batch — async, many images Use for **>10 images**. Kicks off one job; polls until terminal status. ```python from pictograph import BatchClass job = client.auto_annotate.batch( dataset_name="my-dataset", image_filenames=["img-1.jpg", "img-2.jpg", "..."], classes=[ BatchClass(name="car", output_type="polygon"), BatchClass(name="person", output_type="bbox"), ], confidence_threshold=0.5, wait=True, poll_interval=5.0, timeout=1800.0, # 30 min default ) print(job.status, job.processed_images, job.total_annotations_added) ``` `wait=False` returns the job immediately — poll later via: ```python job = client.auto_annotate.get_batch(job.job_id) job = client.auto_annotate.wait_for_batch(job.job_id, timeout=600.0) client.auto_annotate.cancel_batch(job.job_id) ``` ## auto_annotate_dataset workflow Higher-level helper that paginates a dataset's image list, runs batch or text mode, and returns a per-image report: ```python from pictograph.workflows import auto_annotate_dataset report = auto_annotate_dataset( client, "my-dataset", classes=[("car", "polygon"), ("person", "bbox")], mode="batch", # "batch" | "text" confidence_threshold=0.5, overwrite=False, # skip already-annotated images max_images=None, # None = all ) print(report.images_processed, report.annotations_added, len(report.failures)) ``` ## Choosing a mode | Scenario | Mode | |---|---| | User clicks one spot | `point` | | User drags a rough box | `box` | | Many images, known classes | `batch` (or `text` per image for small datasets) | | Single image, multiple objects | `text` | ## Cost SAM3 is paid: - **3-credit minimum** per session (image embedding generation). - **~1 credit** per additional prompt on the same image (sub-second). - **Batch** is charged per image processed. 
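
As a back-of-the-envelope illustration of the single-image pricing above, the sketch below adds up the 3-credit session minimum and the roughly 1-credit cost of each further prompt. It assumes one session per image and that the first prompt is covered by the session minimum. Both are readings of the bullets above and not confirmed billing rules, and `approx_sam3_credits` is a hypothetical helper.

```python
from typing import List

SESSION_MINIMUM = 3   # credits per image session (embedding generation)
PER_EXTRA_PROMPT = 1  # approximate credits per additional prompt on that image


def approx_sam3_credits(prompts_per_image: List[int]) -> int:
    """Rough credit total for single-image prompts, one list entry per image."""
    total = 0
    for prompts in prompts_per_image:
        if prompts <= 0:
            continue  # no prompt, so no session is opened for this image
        # Assumption: the first prompt is included in the session minimum.
        total += SESSION_MINIMUM + (prompts - 1) * PER_EXTRA_PROMPT
    return total


# Image 1: one prompt, image 2: four prompts, image 3: untouched.
print(approx_sam3_credits([1, 4, 0]))  # 3 + (3 + 3) + 0 = 9
```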
Call `client.credits.estimate("sam3_per_minute", quantity=N)` for an A10G-time estimate; the pre-charge is fixed per call. On rejection, read `PaymentRequiredError.required` for the exact amount needed.

## Common errors

| Status | Exception | Cause |
|---|---|---|
| 402 | `PaymentRequiredError` | Out of credits |
| 404 | `NotFoundError` | Dataset or image missing |
| 408 | `PollTimeoutError` | Batch job didn't finish within `timeout` (job keeps running on the backend) |

## See also

- [Annotations](/docs/api-reference/annotations) — saving the prompt results
- [Annotation format](/docs/annotation-format) — wire format
- [Credits](/docs/api-reference/credits) — budget gating

---

## Page: Models

_URL: https://pictograph.io/docs/api-reference/models (markdown: api-reference/models.md)_
_Section: API Reference_

Models are produced by [training runs](/docs/api-reference/training). The SDK doesn't insert model rows directly — you train, then read.

```python
from pictograph import Client

client = Client()
```

## list / iter

```python
models = client.models.list(limit=20)
for m in models:
    print(m.name, m.architecture, m.status, m.metrics)

# Or auto-page:
for m in client.models.iter(page_size=50):
    print(m.id, m.model_type)
```

| Arg | Type | Default | Notes |
|---|---|---|---|
| `limit` | `int` | `50` | Backend cap: 500 |
| `status` | `ModelStatus \| None` | `None` | `"training"` / `"ready"` / `"failed"` / `"archived"` |
| `model_type` | `ModelType \| None` | `None` | `"object_detection"` / `"semantic_segmentation"` / `"instance_segmentation"` / `"classification"` |

## get

```python
model = client.models.get("model-uuid")
print(model.architecture, model.metrics["mAP"], model.class_mapping)
```

Returns `Model`. Inspect `metrics` (mAP, precision, recall) and `class_mapping` (index → class name) for inference setup.

## download

Stream the ONNX weights to a local file. Only `status="ready"` models are downloadable — `training` / `failed` raise `409 ConflictError`.
```python
from pathlib import Path

client.models.download("model-uuid", output_path=Path("./yolox.onnx"))
```

| Arg | Type |
|---|---|
| `model_id` | `str` |
| `output_path` | `str \| Path` |

The download is chunked and checksummed against the object MD5. Safe for multi-GB models.

## delete

```python
client.models.delete("model-uuid")
```

Soft-delete (sets `status="archived"`). Weights are not purged on soft-delete — contact support to permanently remove them.

## Status lifecycle

| `status` | Meaning |
|---|---|
| `training` | Training is in progress; `download` returns 409. |
| `ready` | Trained successfully; weights downloadable via `download()`. |
| `failed` | Training stopped with an error. Inspect the source `TrainingRun.error_message`. |
| `archived` | Soft-deleted. Hidden from `list()` unless the `status="archived"` filter is passed. |

## Inference

The SDK ships read access only — there is no `client.models.infer()` endpoint in v1.0.0. To run inference, download the ONNX file and use your own runtime (`onnxruntime`, `tensorrt`, etc.):

```python
import onnxruntime as ort

client.models.download(model.id, output_path="./model.onnx")
session = ort.InferenceSession("./model.onnx")
# … standard ORT inference loop
```

The `image_inference` credit operation (~1 cr/image) is reserved for a future managed inference endpoint and is not yet exposed.

## Common errors

| Status | Exception | Cause |
|---|---|---|
| 404 | `NotFoundError` | `model_id` missing or belongs to another org |
| 409 | `ConflictError` | `download` on a non-`ready` model |

---

## Page: Credits

_URL: https://pictograph.io/docs/api-reference/credits (markdown: api-reference/credits.md)_
_Section: API Reference_

Pictograph uses a **credit ledger** (signed integers, per organization) for paid operations. Free actions (uploads, exports, search) cost 0.

```python
from pictograph import Client

client = Client()
```

## balance

Current balance + monthly allowance + last 20 ledger entries.
```python balance = client.credits.balance() print(balance.credits_remaining, "/", balance.credits_monthly_allowance) print("Resets:", balance.credits_reset_at) for entry in balance.recent_history: print(entry.created_at, entry.operation, entry.amount) ``` Returns `CreditBalance`. ## history Page through the credit ledger (newest first). ```python entries = client.credits.history(limit=50, offset=0) for e in entries: direction = "debit" if e.amount < 0 else "credit" print(e.created_at, direction, abs(e.amount), e.operation) ``` Sign convention: `amount < 0` = debit (operation consumed credits), `amount > 0` = credit / refund (top-up, training overcharge refund). ## iter Auto-paging iterator over the entire ledger. ```python for entry in client.credits.iter(page_size=100): print(entry.balance_after, entry.operation) ``` ## estimate Pre-flight cost check **before** invoking a paid operation. ```python estimate = client.credits.estimate("training_a10g_per_minute", quantity=30) print(estimate.total_credits, "credits;", "sufficient:", estimate.sufficient) ``` `sufficient=True` is **not** a guarantee — another caller may drain credits between the estimate and the actual call. The authoritative answer is the operation's own `PaymentRequiredError`. ## Cost cheatsheet | Operation slug | Approx cost | |---|---| | `sam3_per_minute` | 3 cr session minimum + ~1/prompt | | `training_a10g_per_minute` | 10 cr/min | | `training_a100_per_minute` | 60 cr/min | | `training_h100_per_minute` | 120 cr/min | | `image_generate_imagen_fast` | 5 cr/image | | `image_edit_gemini_flash` | 3 cr/image | | `image_inference` | 1 cr/image | The full table lives server-side in `utils.tier_limits.CREDIT_COSTS`. 
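
To make the per-minute training rates concrete, the sketch below computes a worst-case budget for a run that uses its whole `timeout`. The rates are the approximate values from the cheatsheet above. Rounding up to whole minutes is an assumption about billing, and `worst_case_training_credits` is a hypothetical helper, so treat `client.credits.estimate()` as the authoritative source.

```python
import math

# Approximate rates copied from the cost cheatsheet (credits per minute).
CREDITS_PER_MINUTE = {"a10g": 10, "a100": 60, "h100": 120}


def worst_case_training_credits(gpu: str, timeout_seconds: float) -> int:
    """Upper bound on credits if a run consumes its full timeout."""
    if gpu not in CREDITS_PER_MINUTE:
        raise ValueError(f"unknown gpu tier: {gpu!r}")
    minutes = math.ceil(timeout_seconds / 60)  # assumption: billed per started minute
    return CREDITS_PER_MINUTE[gpu] * minutes


# The default training timeout documented elsewhere is 7200 seconds (2 hours).
for gpu in ("a10g", "a100", "h100"):
    print(gpu, worst_case_training_credits(gpu, 7200.0))
```

With the default two-hour timeout this works out to 1200, 7200, and 14400 credits for the three tiers. A run that finishes earlier costs proportionally less.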
## Gating in workflows `full_pipeline` already gates on credit balance before kicking off paid phases: ```python from pictograph.workflows import full_pipeline report = full_pipeline( client, dataset_name="…", folder="…", classes=…, pipeline="yolox", min_credits=1, # skip annotate + train if balance < 1 ) if report.credit_skip_reason: print(report.credit_skip_reason) ``` `min_credits=None` disables the check. ## PaymentRequiredError details ```python from pictograph.exceptions import PaymentRequiredError try: client.training.create(dataset_name, export_name, pipeline_type="yolox") except PaymentRequiredError as e: print(f"Need {e.required}, have {e.remaining}") print(f"Top up at: {e.upgrade_url}") ``` ## Refunds The training pipeline auto-refunds unused GPU minutes when: - A run is **cancelled** mid-training. - A run **failed** before consuming the full `timeout` budget. Refunds appear as positive ledger entries with operation `training_refund_<gpu>`. No SDK call required. ## Common errors | Status | Exception | Cause | |---|---|---| | 422 | `ValidationError` | `operation` slug not in the cost table | | 402 | `PaymentRequiredError` | (raised by the *operation* being estimated, not by `estimate` itself) | --- ## Page: Exports _URL: https://pictograph.io/docs/api-reference/exports (markdown: api-reference/exports.md)_ _Section: API Reference_ An **export** is a ZIP of an annotated dataset in a chosen format, optionally embedding the original image files. Export builds run server-side (a few seconds for hundreds of images, longer for tens of thousands). 
```python from pictograph import Client client = Client() ``` ## Formats | `format` | Notes | |---|---| | `pictograph` (default) | Canonical Pictograph JSON — the wire format the SDK consumes | | `coco` | COCO instance segmentation / object detection | | `yolo` | YOLO darknet `.txt` files (one per image) | | `cvat` | CVAT XML | | `pascal_voc` | Pascal VOC XML (one per image) | | `labelme` | LabelMe JSON (one per image) | | `csv` | Flat CSV — bbox annotations only | ## create Build a new export. Defaults to `wait=True`, which blocks until the ZIP is ready. ```python export = client.exports.create( "my-dataset", "for-yolov8", format="yolo", include_images=True, class_filter=["car", "truck"], # None = all classes status_filter="complete", # "all" / "complete" / "in_progress" / "new" wait=True, poll_interval=2.0, timeout=600.0, ) print(export.id, export.status, export.image_count, export.annotation_count) ``` | Arg | Type | Default | Notes | |---|---|---|---| | `dataset_name` | `str` | required | | | `name` | `str` | required | Unique within the dataset | | `format` | `ExportFormat` | `"pictograph"` | See table above | | `include_images` | `bool` | `True` | When `False`, ZIP contains only annotations | | `class_filter` | `list[str] \| None` | `None` | Limit to these class names | | `status_filter` | `str` | `"complete"` | Image status filter | | `wait` | `bool` | `True` | Block until terminal status | `wait=False` returns a `pending` / `processing` `Export`. Poll via `get` or `wait_for_completion`. ## list / iter ```python exports = client.exports.list(limit=20) for e in client.exports.iter(page_size=50): print(e.dataset_name, e.name, e.status) ``` ## get ```python export = client.exports.get("my-dataset", "for-yolov8") print(export.status, export.download_url) ``` ## download Stream the ZIP to a local file (chunked). 
```python from pathlib import Path client.exports.download( "my-dataset", "for-yolov8", output_path=Path("./my-dataset.zip"), ) ``` The download URL is a signed URL valid for 60 minutes — generated fresh on every call. ## wait_for_completion If you used `wait=False`: ```python export = client.exports.create("ds", "name", format="coco", wait=False) # … later export = client.exports.wait_for_completion("ds", "name", timeout=300.0) ``` ## delete ```python client.exports.delete("my-dataset", "for-yolov8") ``` Removes the export and the stored ZIP. Other ongoing downloads of the same export will fail mid-stream. ## Class filtering `class_filter` only includes annotations matching the given class names. Images with no surviving annotations are still included if their `status` matches `status_filter` — they get an empty annotation list. Pass `class_filter=None` (default) to keep every annotation. ## Common errors | Status | Exception | Cause | |---|---|---| | 404 | `NotFoundError` | Dataset or export missing | | 409 | `ConflictError` | Export name already exists in this dataset | | 422 | `ValidationError` | Unknown `format` or `status_filter` | | 408 | `PollTimeoutError` | `wait=True` timed out (export keeps building) | --- ## Page: Training _URL: https://pictograph.io/docs/api-reference/training (markdown: api-reference/training.md)_ _Section: API Reference_ The training resource manages the lifecycle of a single run against a pre-built export. For the end-to-end "give me an ONNX model from this dataset" call, use [`train_pipeline`](/docs/workflows/train) instead. 
```python
from pictograph import Client

client = Client()
```

## Pipelines

| `pipeline_type` | Output |
| --- | --- |
| `yolox` | Object detection (boxes) |
| `detectron2` | Instance segmentation (polygons + masks) |
| `sm_pytorch` | Semantic segmentation |
| `classification` | Image classification |
| `rfdetr_detection` | Object detection (RF-DETR) |
| `rfdetr_segmentation` | Instance segmentation (RF-DETR) |

## GPU tiers

| `gpu_type` | Approx. cost | Pick for |
| --- | --- | --- |
| `a10g` (default) | ~$0.30/hr | YOLOX, classification, RF-DETR-detection |
| `a100` | ~$2/hr | Detectron2, large RF-DETR, big batches |
| `h100` | ~$4/hr | Last resort — only when A100 OOMs |

## create

Spawn a run against an existing export.

```python
run = client.training.create(
    dataset_name="road-signs",
    export_name="road-signs-20260512-120000",
    pipeline_type="yolox",
    name="yolox-run-1",
    config={"epochs": 50},
    gpu_type="a10g",
    wait=True,
    poll_interval=5.0,
    timeout=7200.0,
)
```

| Arg | Type | Default | Notes |
| --- | --- | --- | --- |
| `dataset_name` | `str` | required | Source project |
| `export_name` | `str` | required | Pre-built export |
| `pipeline_type` | `PipelineType` | required | See table above |
| `name` | `str \| None` | auto | Defaults to `<pipeline>-run-<ts>` |
| `config` | `dict` | `{}` | `epochs`, `batch_size`, `learning_rate`, `image_size` |
| `gpu_type` | `GpuType` | `"a10g"` | |
| `wait` | `bool` | `True` | When `False`, returns immediately with `status="queued"` |
| `poll_interval` | `float` | `5.0` | Seconds between polls |
| `timeout` | `float` | `7200.0` | Max poll seconds (2 hours) |

Returns `TrainingRun`.
## list / iter ```python runs = client.training.list(limit=20, status="running") for run in client.training.iter(page_size=50): print(run.id, run.status, run.progress) ``` ## get ```python run = client.training.get("run-uuid") print(run.status, run.progress, run.current_epoch, "/", run.total_epochs) ``` `status` is one of `{"pending", "queued", "running", "completed", "failed", "cancelled"}`. ## cancel ```python client.training.cancel("run-uuid") # stops the worker, refunds remaining minutes ``` ## wait_for_completion If you created with `wait=False`, you can block later: ```python run = client.training.wait_for_completion("run-uuid", timeout=3600.0) if run.status == "completed": model = client.models.get(run.model_id) ``` ## Minimum dataset size Training requires **at least 5 images** matching the export's `status_filter` so the worker can split into train / val / test. Below that, training fails with a validation error. ```python ds = client.datasets.get("my-dataset") assert ds.completed_image_count >= 5 ``` ## Cost estimation ```python estimate = client.credits.estimate("training_a10g_per_minute", quantity=30) if not estimate.sufficient: raise RuntimeError(f"Need {estimate.total_credits}, have {estimate.credits_remaining}") ``` Refunds for cancelled or under-budget runs appear automatically as positive ledger entries (`training_refund_<gpu>`). 
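As a rough worked example of the arithmetic behind the `quantity` argument, the sketch below rounds a poll `timeout` up to whole GPU minutes and multiplies by the approximate hourly rates from the GPU tiers table. The helper names `budget_minutes` and `approx_usd` are hypothetical, not SDK functions, and budgeting against the full `timeout` is an assumption made for illustration. The dollar figures are the table's approximations; credit prices still have to come from `client.credits.estimate`.

```python
import math

# Approximate hourly rates copied from the GPU tiers table (USD, not credits).
APPROX_USD_PER_HOUR = {"a10g": 0.30, "a100": 2.00, "h100": 4.00}


def budget_minutes(timeout_seconds: float) -> int:
    """Round a timeout up to whole minutes (hypothetical helper)."""
    return math.ceil(timeout_seconds / 60)


def approx_usd(gpu_type: str, minutes: int) -> float:
    """Approximate GPU cost in USD for a number of minutes (hypothetical helper)."""
    return round(APPROX_USD_PER_HOUR[gpu_type] * minutes / 60, 2)


minutes = budget_minutes(7200.0)  # the default 2-hour timeout
print(minutes, approx_usd("a10g", minutes))
```

The resulting minute count could then be passed as `quantity` to `client.credits.estimate("training_a10g_per_minute", ...)` to see the worst case in credits before starting a run.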
## Errors | Status | Exception | Cause | | --- | --- | --- | | 402 | `PaymentRequiredError` | Insufficient credits | | 404 | `NotFoundError` | Dataset or export missing | | 422 | `ValidationError` | Pipeline / GPU invalid, dataset too small | | 408 | `PollTimeoutError` | `wait=True` exceeded `timeout` (run keeps going) | ## See also - [`train_pipeline`](/docs/workflows/train) — end-to-end workflow (recommended starting point) - [Models](/docs/api-reference/models) — download trained ONNX weights - [Credits](/docs/api-reference/credits) — `estimate("training_<gpu>_per_minute")` --- ## Page: API keys _URL: https://pictograph.io/docs/api-reference/api-keys (markdown: api-reference/api-keys.md)_ _Section: API Reference_ Use these endpoints to issue, list, update, and revoke API keys for your organization. The full key string (`pk_live_…`) is returned **only once** on creation — store it immediately. ```python from pictograph import Client client = Client() ``` ## list ```python keys = client.api_keys.list() # active org keys = client.api_keys.list(organization_id="org-uuid") # explicit org for k in keys: print(k.name, k.role, k.key_prefix, k.last_used_at) ``` Returns `list[ApiKey]` — metadata only (no full key strings). ## create ```python created = client.api_keys.create( organization_id="org-uuid", name="ci-pipeline", role="member", # viewer / member / admin / owner expires_at=None, # ISO datetime or None for no expiry ) print("Save this — it is shown once:", created.full_key) ``` Returns `CreatedApiKey` — `id`, `name`, `role`, `key_prefix`, `full_key` (the only call that returns it). 
| Arg | Type | Default | Notes | |---|---|---|---| | `organization_id` | `str` | required | | | `name` | `str` | required | Human label, not unique | | `role` | `ApiKeyRole` | required | `"viewer"` / `"member"` / `"admin"` / `"owner"` | | `expires_at` | `datetime \| str \| None` | `None` | ISO 8601 or `None` for no expiry | ## get ```python key = client.api_keys.get("key-uuid") print(key.role, key.created_at, key.last_used_at) ``` ## update Patch the key's name, role, or expiry. The full key string is **not** rotated by update — issue a new key + delete the old one to rotate. ```python client.api_keys.update("key-uuid", name="renamed", role="admin") client.api_keys.update("key-uuid", expires_at="2027-01-01T00:00:00Z") ``` ## delete Revokes the key immediately. In-flight requests using the key fail with `401 AuthError` after revocation propagates (≤ 1 second). ```python client.api_keys.delete("key-uuid") ``` ## Role hierarchy Keys can only manage keys of equal or lower role. An `admin` key cannot create an `owner` key. Owner-tier ops require an `owner` key. | Caller role | Can create | |---|---| | `viewer` | nothing — these endpoints all require `admin`+ | | `member` | nothing | | `admin` | `viewer`, `member`, `admin` | | `owner` | `viewer`, `member`, `admin`, `owner` | ## Web app vs SDK - **Web app** (`app.pictograph.io → Settings → API Keys`) — visual UI, the most common path for one-off keys. - **SDK / CLI** — for programmatic key issuance (CI provisioning, multi-org tools, automated rotation). The SDK enforces the same role hierarchy as the web UI. 
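The role hierarchy table above can be sketched as a small client-side pre-check. This is a minimal illustration, assuming the four roles are ordered as listed on this page; `can_create_key` is a hypothetical helper, not part of the SDK, and the backend remains the authority (it answers with `403 ForbiddenError` when the rule is violated).

```python
# Roles ordered from least to most privileged, as listed on this page.
ROLE_ORDER = ["viewer", "member", "admin", "owner"]


def can_create_key(caller_role: str, target_role: str) -> bool:
    """Mirror the 'Can create' table: admin+ only, never above the caller."""
    caller = ROLE_ORDER.index(caller_role)
    target = ROLE_ORDER.index(target_role)
    return caller >= ROLE_ORDER.index("admin") and target <= caller


for caller in ROLE_ORDER:
    allowed = [r for r in ROLE_ORDER if can_create_key(caller, r)]
    print(caller, "->", allowed or "nothing")
```

A check like this only saves a round-trip for an obviously invalid request; it should not replace handling `ForbiddenError`.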
## Common errors | Status | Exception | Cause | |---|---|---| | 403 | `ForbiddenError` | Caller's role too low for the requested action | | 404 | `NotFoundError` | `key_id` doesn't exist or belongs to another org | | 422 | `ValidationError` | Invalid role string, malformed `expires_at` | --- ## Page: Batch _URL: https://pictograph.io/docs/api-reference/batch (markdown: api-reference/batch.md)_ _Section: API Reference_ Bulk image operations on a single dataset. Each call accepts a list of image IDs and returns a `BatchResult` with per-item failure context — partial success does not raise. ```python from pictograph import Client client = Client() ``` ## move Move images to a different virtual folder within the same dataset. ```python result = client.batch.move( dataset_name="my-dataset", image_ids=["img-1", "img-2", "img-3"], target_folder_path="/sorted/cars", ) print(result.succeeded, result.failed_count, result.failures) ``` Storage paths are immutable — "move" updates `virtual_folder_path`; the underlying image bytes don't move. ## copy Copy images to a different folder. Server-side copy of the underlying bytes (instant, zero data transfer). ```python result = client.batch.copy( dataset_name="my-dataset", image_ids=["img-1", "img-2"], target_folder_path="/cars-copy", duplicate_handling="rename", # collision policy in the destination copy_annotations=False, # destination images start without annotations ) ``` | Arg | Type | Default | Notes | | --- | --- | --- | --- | | `dataset_name` | `str` | required | | | `image_ids` | `Sequence[str]` | required | | | `target_folder_path` | `str` | `"/"` | Destination virtual folder | | `duplicate_handling` | `Literal["rename", "skip", "overwrite"]` | `"rename"` | How to handle filename collisions | | `copy_annotations` | `bool` | `False` | When `True`, copy `annotations_json` and `status` too | ## delete Soft-archive by default; permanent on request. 
```python result = client.batch.delete( dataset_name="my-dataset", image_ids=["img-1", "img-2", "img-3"], permanent=False, # archive (recoverable) ) ``` `permanent=True` purges the stored bytes — irreversible. Requires `admin`+ role. ## update Update metadata fields on a batch of images. Pass exactly the fields you want to change — `None` is omitted from the request. ```python result = client.batch.update( dataset_name="my-dataset", image_ids=["img-1", "img-2"], status="complete", is_archived=False, ) ``` | Arg | Type | Default | Notes | | --- | --- | --- | --- | | `dataset_name` | `str` | required | | | `image_ids` | `Sequence[str]` | required | | | `status` | `str \| None` | `None` | `"new"`, `"annotate"`, `"review"`, `"complete"` | | `display_name` | `str \| None` | `None` | Display override | | `is_archived` | `bool \| None` | `None` | `True` archives; `False` restores | `ValidationError` if every field is `None` (the update would be a no-op). ## BatchResult | Attribute | Type | Notes | | --- | --- | --- | | `succeeded` | `list[str]` | IDs the op completed for | | `failed_count` | `int` | `len(failures)` | | `failures` | `list[BatchFailure]` | `{image_id, reason}` per failure | | `success` | `bool` (property) | `failed_count == 0` | ## Errors | Status | Exception | Cause | | --- | --- | --- | | 403 | `ForbiddenError` | `permanent=True` requires `admin`+ role | | 404 | `NotFoundError` | Dataset missing, or every `image_id` invalid | | 422 | `ValidationError` | Invalid field value or empty update | ## Why batch over loops Reorganizing 10K images is one round-trip with `batch.move()` versus 10K with `images.update()`. Bulk operations are implemented server-side as single statements, not loops. --- ## Page: Search _URL: https://pictograph.io/docs/api-reference/search (markdown: api-reference/search.md)_ _Section: API Reference_ Two search modes: 1. **Visual similarity** — `by_similarity()` — SigLIP2 (1152-dim) embeddings + pgvector HNSW index. 2. 
**Tag-based** — `by_tag()` — JSONB containment over the auto-classified `image_auto_tags` field (objects / scenes / attributes). Both auto-tag and embedding pipelines run on every upload (zero API cost; T4 GPU). No setup required. ```python from pictograph import Client client = Client() ``` ## by_similarity Find images visually similar to a reference image. Scope is the reference image's dataset + folder unless overridden. ```python results = client.search.by_similarity( image_id="img-uuid-1", threshold=0.6, # cosine similarity floor (0–1) limit=50, folder_path=None, # None = inherit; "/" = whole dataset ) for r in results: print(r.image_id, r.filename, f"{r.similarity:.3f}") ``` | Arg | Type | Default | Notes | |---|---|---|---| | `image_id` | `str` | required | UUID of the reference image | | `threshold` | `float` | `0.6` | Minimum cosine similarity (`0.6` ≈ "visually related") | | `limit` | `int` | `50` | Backend cap: 500 | | `folder_path` | `str \| None` | `None` | Override folder scope | Returns `list[SimilarImage]`, sorted by descending similarity. The source image is excluded from results. ## by_tag Find images with auto-tags matching the given filters. Pass at least one of `objects` / `scenes` / `attributes` (an empty filter returns nothing rather than everything — semantically clearer for agents). 
```python results = client.search.by_tag( objects=["car", "truck"], # match ANY object tag scenes=["outdoor"], # match ANY scene tag attributes=["blurry"], # match ANY attribute tag dataset_name="my-dataset", # restrict scope; None = whole org limit=100, ) for r in results: print(r.image_id, r.tags["objects"]) ``` | Arg | Type | Default | Notes | |---|---|---|---| | `objects` | `Sequence[str] \| None` | `None` | At least one of objects/scenes/attributes required | | `scenes` | `Sequence[str] \| None` | `None` | | | `attributes` | `Sequence[str] \| None` | `None` | | | `dataset_name` | `str \| None` | `None` | Org-wide search if `None` | | `limit` | `int` | `50` | Backend cap: 500 | Returns `list[TaggedImage]`. Within a category, tags are OR'd; across categories they are AND'd: - `objects=["car","truck"]` → "car OR truck" - `objects=["car"], scenes=["outdoor"]` → "car AND outdoor" ## Auto-tag taxonomy The SigLIP2 classifier picks from ~200 curated labels per category. Common ones: - **objects**: car, truck, person, bicycle, dog, sign, building, etc. - **scenes**: outdoor, indoor, urban, rural, daytime, nighttime, etc. - **attributes**: blurry, dark, bright, high-contrast, low-light, etc. The full taxonomy ships with the SigLIP2 service prompts; tags not in the curated list won't be assigned. ## Cost Search is **free**. Embeddings + auto-tags are computed once per image on upload (T4 GPU, zero API cost) and cached. ## Common errors | Status | Exception | Cause | |---|---|---| | 404 | `NotFoundError` | `image_id` (similarity) or `dataset_name` (tag) missing | | 422 | `ValidationError` | `by_tag` called with all three filters None | --- ## Page: Connectors _URL: https://pictograph.io/docs/api-reference/connectors (markdown: api-reference/connectors.md)_ _Section: API Reference_ Import remote datasets in two steps: validate the source API key → kick off the import. The import runs as an async job; the SDK polls until terminal status by default. 
```python from pictograph import Client client = Client() ``` ## Supported providers | `provider` | Source | Notes | |---|---|---| | `v7` | V7 Darwin | Polygon paths, bboxes, polylines, keypoints, tags | | `roboflow` | Roboflow | COCO export → Pictograph JSON | ## validate Verify the source API key and list available remote datasets. No quota consumed; the API key is sent only on this call. ```python result = client.connectors.validate( provider="v7", api_key="v7_api_token_…", ) if result.valid: for ds in result.datasets: print(ds.id, ds.name, ds.image_count) else: print("invalid:", result.error) ``` Returns `ValidationResult` — inspect `.valid` first; `.datasets` is populated only on success. ## check_limits Pre-flight tier-cap check before kicking off an import. ```python check = client.connectors.check_limits( total_images=12500, estimated_size_bytes=4_000_000_000, # 4 GB ) if not check.allowed: print("blocked by:", check.exceeded) # "images" / "storage" / "both" ``` ## import_ Kick off the import. Trailing underscore avoids shadowing the Python `import` keyword. 
```python job = client.connectors.import_( provider="v7", api_key="v7_api_token_…", datasets=[ {"id": "ds_abc", "name": "Road signs", "slug": "road-signs"}, # OR pass RemoteDataset instances from validate(): # *result.datasets[:2], ], wait=True, poll_interval=3.0, timeout=3600.0, # 1h default ) print(job.import_id, job.status) for ds in job.datasets: print(ds.dataset_name, ds.images_imported, "/", ds.total_images) ``` | Arg | Type | Default | Notes | |---|---|---|---| | `provider` | `ConnectorProvider` | required | `"v7"` / `"roboflow"` | | `api_key` | `str` | required | Sent only to fetch source data | | `datasets` | `Sequence[RemoteDataset \| dict]` | required | `RemoteDataset` instances or raw dicts | | `wait` | `bool` | `True` | Poll until terminal | | `poll_interval` | `float` | `3.0` | seconds | | `timeout` | `float` | `3600.0` | Max poll seconds (V7 large exports take 30+ min) | Returns `ImportJob`. ## get_import / wait_for_import / cancel_import ```python job = client.connectors.get_import(import_id) job = client.connectors.wait_for_import(import_id, timeout=600.0) job = client.connectors.cancel_import(import_id) ``` ## Annotation conversion | V7 / COCO | Pictograph | |---|---| | V7 `polygon.paths` | `polygon.paths` (passthrough) | | V7 `bounding_box` (no polygon) | `bounding_box` | | V7 `line.path` | `polyline.path` | | V7 `keypoint` | `keypoint` | | V7 `tag` / `ellipse` / `mask` | skipped (no Pictograph equivalent) | | COCO `segmentation` (flat array) | `polygon.paths` (paired into points) | | COCO `bbox` (no segmentation) | `bounding_box` | | COCO `keypoints` triplets | `keypoint` (skips `v=0`) | ## Tier caps Imports are charged against your storage + image-count tier caps. See [Credits](/docs/api-reference/credits) and your plan in the web app. 
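The COCO rows of the annotation conversion table above describe two small transformations: pairing a flat `segmentation` array into points, and dropping keypoints whose visibility flag is `v=0`. The sketch below illustrates both under stated assumptions: the function names are hypothetical, and the `{"x": ..., "y": ...}` point shape is assumed for illustration rather than taken from the importer. It is not the connector's actual implementation.

```python
def pair_points(flat):
    """Turn [x1, y1, x2, y2, ...] into [{"x": x1, "y": y1}, ...]."""
    if len(flat) % 2:
        raise ValueError("flat segmentation must have an even length")
    return [{"x": flat[i], "y": flat[i + 1]} for i in range(0, len(flat), 2)]


def labeled_keypoints(triplets):
    """Split COCO (x, y, v) triplets and drop unlabeled points (v == 0)."""
    points = []
    for i in range(0, len(triplets) - 2, 3):
        x, y, v = triplets[i:i + 3]
        if v != 0:  # in COCO, v=0 means the keypoint is not labeled
            points.append({"x": x, "y": y})  # assumed point shape
    return points


print(pair_points([10, 20, 30, 20, 30, 40]))
print(labeled_keypoints([5, 5, 2, 0, 0, 0, 9, 1, 1]))
```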
## Common errors | Status | Exception | Cause | |---|---|---| | 401 | `AuthError` | Source provider API key rejected | | 402 | `PaymentRequiredError` | Tier cap exceeded | | 404 | `NotFoundError` | `import_id` missing | | 408 | `PollTimeoutError` | `wait=True` exceeded `timeout` (job keeps running) | | 422 | `ValidationError` | Invalid provider, empty datasets list | --- ## Page: Video _URL: https://pictograph.io/docs/api-reference/video (markdown: api-reference/video.md)_ _Section: API Reference_ Pictograph annotates **frames**, not videos. The video resource handles upload and frame extraction; once extracted, frames are regular images you annotate with the standard SAM3 / annotation workflows. ```python from pictograph import Client client = Client() ``` ## upload Three-step upload (signed URL → PUT → register), same pattern as images. ```python from pathlib import Path info = client.video.upload( file_path=Path("./recording.mp4"), dataset_id="proj-uuid", folder_path="/raw-footage", ) print(info.gcs_path, info.video_id) ``` | Arg | Type | Default | Notes | |---|---|---|---| | `file_path` | `str \| Path` | required | Local video file | | `dataset_id` | `str` | required | Destination dataset | | `folder_path` | `str` | `"/"` | Virtual folder | Supported codecs: anything ffmpeg can demux (H.264, H.265, VP9, AV1, etc.). ## probe Inspect a video's metadata without extracting frames. Pass the `gcs_path` returned by `upload()`. ```python meta = client.video.probe(info.gcs_path) print(meta.duration_seconds, meta.fps, meta.width, meta.height) print(meta.codec, meta.frame_count) ``` Returns `VideoMetadata` from a server-side `ffprobe` invocation. ## extract_frames Extract frames from a video into the destination dataset as images. 
```python job = client.video.extract_frames( gcs_path=info.gcs_path, dataset_id="proj-uuid", folder_path="/raw-footage/frames", fps=2.0, # extract 2 frames per second of source video start_seconds=10.0, end_seconds=120.0, max_frames=200, # cap on output count wait=True, poll_interval=5.0, timeout=1800.0, ) print(job.status, job.frames_extracted) ``` Frames are written as `{video_basename}_{frame_index:06d}.jpg` in the target folder. Each becomes a regular `Image` row — ready for annotation, search, training. | Arg | Type | Default | Notes | |---|---|---|---| | `gcs_path` | `str` | required | Source video | | `dataset_id` | `str` | required | Destination dataset | | `folder_path` | `str` | `"/"` | Virtual folder for the extracted frames | | `fps` | `float` | `1.0` | Frames per source second | | `start_seconds` | `float \| None` | `None` | Skip the first N seconds | | `end_seconds` | `float \| None` | `None` | Stop at second N | | `max_frames` | `int \| None` | `None` | Cap on output count | | `wait` | `bool` | `True` | Poll until terminal | `fps=1.0` is the cheapest setting; `fps=30.0` extracts every frame of a 30 fps source. Frame extraction does **not** consume credits — you pay only for the storage of the resulting images. ## get_extraction / wait_for_extraction ```python job = client.video.get_extraction(job_id) job = client.video.wait_for_extraction(job_id, timeout=600.0) ``` ## Common errors | Status | Exception | Cause | |---|---|---| | 404 | `NotFoundError` | `gcs_path` missing or `dataset_id` invalid | | 415 | `ValidationError` | Unsupported codec | | 408 | `PollTimeoutError` | Long videos may exceed default `timeout` | | 413 | `ApiError` | Video file exceeds upload limit (10 GB) | --- ## Page: Organizations _URL: https://pictograph.io/docs/api-reference/organizations (markdown: api-reference/organizations.md)_ _Section: API Reference_ The Developer API key carries an organization scope. 
Every endpoint operates on that org — there is no cross-org access from a single key. Use these endpoints to read org metadata, manage members, and handle invites. ```python from pictograph import Client client = Client() ``` ## me Fetch the active organization (the one the API key belongs to). ```python org = client.organizations.me() print(org.id, org.name, org.subscription_tier, org.member_count) ``` Returns `Organization` with tier (`free` / `core` / `pro` / `enterprise`), billing email, monthly credit allowance, etc. ## list_members ```python for m in client.organizations.list_members(): print(m.email, m.role, m.joined_at) ``` Returns `list[OrganizationMember]`. ## update_member_role Promote / demote a member. `member_id` is the row's UUID, **not** the underlying user UUID. ```python client.organizations.update_member_role( member_id="member-uuid", role="admin", # viewer / member / admin / owner ) ``` Role hierarchy: callers may set roles ≤ their own. An `admin` cannot promote anyone to `owner`. ## remove_member ```python client.organizations.remove_member(member_id="member-uuid") ``` The user's account is preserved — only the org membership is revoked. They can be re-invited. ## list_invites ```python invites = client.organizations.list_invites(status="pending") for inv in invites: print(inv.email, inv.role, inv.expires_at, inv.invite_url) ``` Filter by `status` ∈ `{"pending", "accepted", "revoked", "expired"}` or `None` for all. ## invite ```python invite = client.organizations.invite( email="new@example.com", role="member", expires_in_days=14, # optional; defaults to 7 ) print("Send this link:", invite.invite_url) ``` Returns `OrganizationInvite` with a one-time `invite_url`. Email delivery is automatic when SMTP is configured server-side; if you disabled email, share the URL manually. ## revoke_invite ```python client.organizations.revoke_invite(invite_id="invite-uuid") ``` Sets the invite's `status` to `"revoked"` immediately. The URL stops working. 
## Permission matrix | Op | Min role | |---|---| | `me`, `list_members`, `list_invites` | `viewer` | | `update_member_role`, `invite`, `revoke_invite` | `admin` | | `remove_member` | `admin` (can't remove `owner` unless caller is `owner`) | ## Common errors | Status | Exception | Cause | |---|---|---| | 403 | `ForbiddenError` | Caller's role too low for the action | | 404 | `NotFoundError` | `member_id` / `invite_id` doesn't exist (or belongs to another org) | | 409 | `ConflictError` | `invite` to an email that's already a member | | 422 | `ValidationError` | Invalid role, malformed email | --- ## Page: Projects _URL: https://pictograph.io/docs/api-reference/projects (markdown: api-reference/projects.md)_ _Section: API Reference_ The **`projects`** resource is the write side of the [`datasets`](/docs/api-reference/datasets) resource. Same underlying entity; the SDK aliases the read path to "datasets" because that's the word users say. Use this page for: creating a project, editing its class set, deleting it. ```python from pictograph import Client client = Client() ``` ## list / iter ```python projects = client.projects.list(limit=50) for p in client.projects.iter(page_size=100): print(p.name, len(p.classes)) ``` `Project` includes the embedded `project_config` (classes + annotation types), unlike `Dataset` which is read-optimized for the list view. 
## get ```python project = client.projects.get("my-dataset") for cls in project.classes: print(cls.name, cls.type, cls.color) ``` ## create ```python from pictograph import ProjectClass project = client.projects.create( "new-dataset", description="Road sign detection training set", annotation_types=["bbox", "polygon"], classes=[ ProjectClass(name="stop_sign", type="bbox", color="#ff0000"), ProjectClass(name="yield", type="bbox", color="#ffff00"), ], ) ``` | Arg | Type | Default | Notes | |---|---|---|---| | `name` | `str` | required | Unique within the org | | `description` | `str \| None` | `None` | | | `annotation_types` | `Sequence[str]` | `["bbox"]` | Allowed types for this project | | `classes` | `Sequence[ProjectClass \| dict]` | `[]` | Class definitions; can be added later via `update` | Returns `Project`. ## update Patch project metadata or its config (classes / annotation types). Pass only the fields you're changing. ```python client.projects.update( "my-dataset", description="Updated description", annotation_types=["bbox", "polygon", "polyline"], ) # Add / remove / recolor classes: client.projects.update( "my-dataset", classes=[ ProjectClass(name="stop_sign", type="bbox", color="#ff0000"), ProjectClass(name="yield", type="bbox", color="#ffaa00"), # color change ProjectClass(name="speed_limit", type="bbox", color="#00aaff"), # new # 'merge' class omitted → removed (also removes any annotations using it) ], ) ``` Class updates are atomic — the entire `classes` list is replaced. Removing a class **also removes every annotation using that class name** across the dataset. Be deliberate. ## delete ```python result = client.projects.delete("my-dataset") print(result["images_deleted"], result["annotations_deleted"]) ``` Permanent. Removes: - The project and its config - Every image (and the underlying stored bytes) - Every export tied to the project Models trained from this project are **not** deleted (they're useful even after the source dataset is gone). 
Requires `admin`+ role. ## ProjectClass shape ```python { "name": "stop_sign", "type": "bbox", # "bbox" / "polygon" / "polyline" / "keypoint" "color": "#ff0000", # hex color for UI rendering (any valid CSS color) } ``` The class name must be unique within the project's class list — saves fail with `ValidationError` if two classes share a name. ## Common errors | Status | Exception | Cause | |---|---|---| | 403 | `ForbiddenError` | `delete` requires `admin`+ | | 404 | `NotFoundError` | Project name doesn't exist | | 409 | `ConflictError` | `create` with a duplicate name | | 422 | `ValidationError` | Duplicate class names, invalid annotation type | --- ## Page: Agent tool registry _URL: https://pictograph.io/docs/api-reference/tools (markdown: api-reference/tools.md)_ _Section: API Reference_ `GET /api/v1/developer/tools.json` serves the agent tool registry as a JSON Schema array. Dynamic-discovery agent stacks (Vercel AI SDK, LangChain, raw OpenAI / Anthropic SDKs without the bundled adapters) fetch this once and have everything they need. The registry is the **single source of truth** — the Python SDK's `Toolkit.as_anthropic_tools()` / `as_openai_tools()` / `as_json_schema()` all derive from it; the backend snapshot is regenerated on every SDK release via a CI parity check. ## Endpoints | URL | Notes | |---|---| | `/api/v1/developer/tools` | Trailing-slash-tolerant. Returns the full payload. | | `/api/v1/developer/tools.json` | Same content. Matches the conventional `*.json` URL. | ## Auth Standard developer-API auth — pass `X-API-Key`. Any role works (read-only). 
```bash curl -H "X-API-Key: pk_live_…" \ https://api.pictograph.io/api/v1/developer/tools.json ``` ## Response shape ```json { "tools": [ { "name": "upload_dataset_from_folder", "description": "Use when the user asks to upload a folder of images …", "input_schema": { "type": "object", "properties": { "dataset_name": { "type": "string", "description": "…" }, "folder": { "type": "string", "description": "…" } // … remaining fields }, "required": ["dataset_name", "folder"], "additionalProperties": false }, "required_role": "member", "credit_cost": 0, "idempotent": false } // … 27 more entries ], "version": "1.0.0", "count": 28, "generated_at": "2026-04-19T…Z" } ``` ## Tool metadata | Field | Notes | |---|---| | `name` | Snake-case identifier. Stable across SDK versions. | | `description` | Anthropic "use when X" framing. Agents read this to choose between tools. | | `input_schema` | Pydantic-generated JSON Schema with `extra: forbid`. | | `required_role` | Minimum org role on the calling API key. Backend re-enforces. | | `credit_cost` | Approximate cost (0 for read-only / free ops). Agents may gate. | | `idempotent` | When `true`, agents may safely retry on transient failures. | ## Tool list (v1.0.0) 28 tools across 11 categories. See the [agents overview](/docs/agents) for details and the dispatch pattern. 
| Category | Tools | |---|---| | Workflows | `upload_dataset_from_folder`, `auto_annotate_dataset`, `train_pipeline`, `full_pipeline` | | Datasets | `list_datasets`, `get_dataset`, `create_dataset`, `delete_dataset` | | Images | `upload_image`, `delete_image` | | Annotations | `get_annotations`, `save_annotations` | | Auto-annotate | `auto_annotate_point`, `auto_annotate_box`, `auto_annotate_text` | | Search | `search_by_tag`, `search_by_similarity` | | Exports | `create_export`, `list_exports`, `download_export` | | Training | `get_training_status`, `cancel_training` | | Models | `list_models`, `download_model` | | Credits | `get_credit_balance`, `estimate_credit_cost` | | Connectors | `validate_connector`, `import_from_connector` | ## SDK equivalents ```python from pictograph.agents import create_toolkit toolkit = create_toolkit() schema = toolkit.as_json_schema() # same payload, no HTTP roundtrip anthropic_tools = toolkit.as_anthropic_tools() # name/description/input_schema only openai_tools = toolkit.as_openai_tools() # OpenAI function-calling format ``` The CLI also dumps the registry locally: ```bash pictograph agents export-tools -o tools.json ``` ## Versioning The `version` field tracks the SDK release that generated the snapshot. Tools may be added between minor versions; renames / removals only happen at major versions and are listed in the changelog. ## Drift protection The backend snapshot at `routes/developer/_tools_snapshot.json` is verified against the SDK's live registry on every CI build (via `scripts/generate_tools_snapshot.py --check`). PRs that change one without the other fail. --- ## Page: Error handling _URL: https://pictograph.io/docs/error-handling (markdown: error-handling.md)_ _Section: Reference_ Every SDK error subclasses **`PictographError`**. Catch the specific subclass to handle a known failure mode; catch the base class to log and rethrow. 
## Hierarchy

```
PictographError
├── ConfigurationError     — missing API key, invalid base URL
├── AuthError              — 401 (bad / missing / revoked key)
├── ForbiddenError         — 403 (role lacks permission)
├── NotFoundError          — 404 (resource missing)
├── ConflictError          — 409 (duplicate name, optimistic-lock fail)
├── ValidationError        — 422 (payload shape rejected)
├── PaymentRequiredError   — 402 (out of credits)
├── RateLimitError         — 429 (per-key rate cap hit)
├── ServerError            — 5xx (transient backend failure)
├── NetworkError           — connection / DNS / TLS failure
├── RequestTimeoutError    — request exceeded the SDK's timeout budget
├── PollTimeoutError       — long-running job (training, batch SAM3) didn't finish
└── ApiError               — catch-all for unmatched status codes
```

Import from `pictograph.exceptions`:

```python
from pictograph.exceptions import (
    PictographError,
    AuthError,
    ForbiddenError,
    NotFoundError,
    ConflictError,
    ValidationError,
    PaymentRequiredError,
    RateLimitError,
    ServerError,
    NetworkError,
    RequestTimeoutError,
    PollTimeoutError,
    ApiError,
)
```

## When each fires

| Exception | Common cause | What to do |
|---|---|---|
| `ConfigurationError` | `PICTOGRAPH_API_KEY` not set, no `api_key=` arg | Set the env var or pass `api_key` |
| `AuthError` (401) | Key revoked / typo | Re-issue the key |
| `ForbiddenError` (403) | `viewer` key calling a write op | Use a `member`+ key |
| `NotFoundError` (404) | Dataset name typo (case-sensitive!)
| Verify with `datasets list` | | `ConflictError` (409) | Same image filename in same folder | Pass `skip_existing=True` to the upload workflow, or use a new name | | `ValidationError` (422) | `class` instead of `name`, flat polygon array | Fix the payload (see [Annotation format](/docs/annotation-format)) | | `PaymentRequiredError` (402) | Out of credits mid-operation | Show `e.upgrade_url` to the user | | `RateLimitError` (429) | Per-key burst limit | SDK auto-retries when `Retry-After < 120s`; otherwise raise | | `ServerError` (5xx) | Backend incident | SDK retries with exponential backoff; persistent failure surfaces | | `NetworkError` | Connection dropped | Retry idempotent ops; investigate non-idempotent | | `PollTimeoutError` | Training run exceeded `timeout` | Re-poll with `client.training.get(run_id)` | ## Retry behavior The SDK already retries on transient failures with exponential backoff: - **5xx responses** — up to 3 retries, backoff `1s → 2s → 4s`. - **429 with `Retry-After` ≤ 120s** — auto-waits then retries. - **Network errors** (connection reset, DNS blip) — same 3-retry policy. - **Idempotency** — retried requests inherit the original `Idempotency-Key` header, so the backend dedupes. Override on the Client: ```python client = Client(timeout=30.0, max_retries=5) ``` ## PaymentRequiredError details ```python from pictograph.exceptions import PaymentRequiredError try: client.training.create(dataset_name, export_name, pipeline_type="yolox") except PaymentRequiredError as e: print(f"Need {e.required} credits, you have {e.remaining}") print(f"Top up at: {e.upgrade_url}") ``` `required`, `remaining`, and `upgrade_url` are populated from the backend's `detail` block — fall back to plain `str(e)` if you only need a user-facing message. 
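One way to build that user-facing message is sketched here with a stand-in exception so it runs without the SDK. The three attribute names come from the example above; the stand-in's constructor and the `credit_message` helper are assumptions for illustration only.

```python
class PaymentRequiredError(Exception):
    """Stand-in for pictograph.exceptions.PaymentRequiredError (constructor is assumed)."""

    def __init__(self, message, required=None, remaining=None, upgrade_url=None):
        super().__init__(message)
        self.required = required
        self.remaining = remaining
        self.upgrade_url = upgrade_url

def credit_message(err):
    """Use the structured fields when all are present, else fall back to str(err)."""
    required = getattr(err, "required", None)
    remaining = getattr(err, "remaining", None)
    url = getattr(err, "upgrade_url", None)
    if required is None or remaining is None or url is None:
        return str(err)
    return f"Need {required} credits, you have {remaining}. Top up at: {url}"
```

Reading the fields through `getattr` keeps the helper safe if a response arrives without a populated `detail` block.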
## ValidationError details

The backend returns a structured body listing every offending field:

```python
from pictograph.exceptions import ValidationError

try:
    client.annotations.save(image_id, [{"class": "person", "type": "bbox"}])
except ValidationError as e:
    print(e)         # human-readable summary
    print(e.errors)  # list of {"loc": [...], "msg": "...", "type": "..."}
```

The most common cause is the **`class` vs `name`** field mistake — the backend rejects any annotation that uses `class`.

## PollTimeoutError + recovery

Long-running jobs (`train_pipeline`, batch auto-annotate, large dataset imports) accept a `timeout` arg and raise `PollTimeoutError` when it elapses. The job is **not cancelled** — it keeps running on the backend.

```python
from pictograph.exceptions import PollTimeoutError
from pictograph.workflows import train_pipeline  # assumed to live alongside full_pipeline

try:
    run, model = train_pipeline(client, "ds", pipeline="yolox", timeout=60.0)
except PollTimeoutError as e:
    # Pick up later
    run_id = e.run_id  # most poll errors carry the resource ID
    run = client.training.get(run_id)
    if run.status == "completed":
        model = client.models.get(run.model_id)
```

## Idempotency

For mutating ops the SDK auto-generates an `Idempotency-Key` header, so retries are safe. Override per-call:

```python
client.images.upload(
    dataset_id=ds.id,
    file_path="x.jpg",
    idempotency_key="upload-x-jpg-2026-04-19",
)
```

The backend dedupes within 24h. Reusing the same key with a different body returns `409 ConflictError` (`error_code: idempotency_conflict`).

See [Rate limits](/docs/rate-limits) for the per-tier limits and burst behaviour.

---

## Page: Pictograph annotation format

_URL: https://pictograph.io/docs/annotation-format (markdown: annotation-format.md)_
_Section: Reference_

Every annotation in Pictograph follows the same schema. Snake-case field names, no shorthand: bounding boxes are objects `{x, y, w, h}`, polygons are multi-ring `paths`, polylines are ordered point lists, keypoints are single points. The class-label field is **`name`** (not `class`).
Do not improvise.

## Discriminator

| `type` | Geometry container | Notes |
|---|---|---|
| `bbox` | `bounding_box: {x, y, w, h}` | Axis-aligned rectangle. |
| `polygon` | `polygon: {paths: [[{x, y}, ...], ...]}` | Multi-ring (holes via even-odd). |
| `polyline` | `polyline: {path: [{x, y}, ...]}` | Open path, doesn't close. |
| `keypoint` | `keypoint: {x, y}` | Single landmark. |

## Required fields

| Field | Type | Notes |
|---|---|---|
| `id` | non-blank string | Unique within the image. UUIDs preferred. |
| `name` | non-blank string | Class label. Must match a class in `project_config.classes` (case-sensitive). |
| `type` | one of `bbox`/`polygon`/`polyline`/`keypoint` | Discriminator. |
| `<geometry>` | see table above | Field name is determined by `type`. |

## Optional fields

| Field | Default | Notes |
|---|---|---|
| `confidence` | `1.0` | Range `[0, 1]`. SAM3 sets this; manual annotations get 1.0. |
| `created_by` | `null` | UUID of the creator. Backend fills this for SDK uploads. |
| `attributes` | `[]` | User-defined metadata. Backend stores opaque. |
| `bounding_box` (polygon/polyline) | computed | Backend auto-computes the enclosing rectangle if omitted. |

## Examples

### Bounding box

```json
{
  "id": "ann-1",
  "name": "person",
  "type": "bbox",
  "bounding_box": {"x": 100, "y": 200, "w": 50, "h": 80}
}
```

### Polygon

```json
{
  "id": "ann-2",
  "name": "car",
  "type": "polygon",
  "polygon": {
    "paths": [[
      {"x": 10, "y": 20},
      {"x": 110, "y": 20},
      {"x": 110, "y": 80},
      {"x": 10, "y": 80}
    ]]
  }
}
```

### Polygon with hole

```json
{
  "id": "ann-3",
  "name": "donut",
  "type": "polygon",
  "polygon": {
    "paths": [
      [{"x": 0, "y": 0}, {"x": 100, "y": 0}, {"x": 100, "y": 100}, {"x": 0, "y": 100}],
      [{"x": 30, "y": 30}, {"x": 70, "y": 30}, {"x": 70, "y": 70}, {"x": 30, "y": 70}]
    ]
  }
}
```

### Polyline

```json
{
  "id": "ann-4",
  "name": "lane_centerline",
  "type": "polyline",
  "polyline": {
    "path": [
      {"x": 0, "y": 100},
      {"x": 50, "y": 100},
      {"x": 100, "y": 100}
    ]
  }
}
```

### Keypoint

```json
{
  "id": "ann-5",
  "name": "left_eye",
  "type": "keypoint",
  "keypoint": {"x": 250, "y": 180}
}
```

## Storage

Annotations are stored in `project_images.annotations_json` as a **plain array** — no wrapper:

```json
[
  {"id": "ann-1", "name": "person", "type": "bbox", "bounding_box": {…}},
  {"id": "ann-2", "name": "car", "type": "polygon", "polygon": {…}}
]
```

Updating an image's annotations is a **full overwrite**: pass the complete list every time. There is no partial-update endpoint.

## Common mistakes

- ❌ `"class": "person"` — must be `"name"`.
- ❌ `"polygon": [[10, 20, 30, 40]]` — flat array. Must be `[{"x": …, "y": …}]`.
- ❌ `"bbox": [x, y, w, h]` — array. Must be `"bounding_box": {x, y, w, h}` object.
- ❌ Class label not in `project_config.classes` — backend rejects with 400.
- ❌ Polygon ring with < 3 points — Pydantic rejects on save.
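Most of the mistakes above can be caught before a payload leaves your machine. The following is a rough pre-flight check written against the rules on this page, not the SDK's own validation; `find_mistakes` and `GEOMETRY_FIELD` are hypothetical names, and the check cannot tell whether a label exists in `project_config.classes`.

```python
# Geometry container expected for each `type`, per the Discriminator table.
GEOMETRY_FIELD = {
    "bbox": "bounding_box",
    "polygon": "polygon",
    "polyline": "polyline",
    "keypoint": "keypoint",
}

def _blank(value):
    return not isinstance(value, str) or not value.strip()

def find_mistakes(ann):
    """Return a list of problems found in one annotation dict (empty if none)."""
    problems = []
    if "class" in ann:
        problems.append('use "name", not "class"')
    if _blank(ann.get("name")):
        problems.append('"name" must be a non-blank string')
    if _blank(ann.get("id")):
        problems.append('"id" must be a non-blank string')
    field = GEOMETRY_FIELD.get(ann.get("type"))
    if field is None:
        problems.append('"type" must be bbox, polygon, polyline, or keypoint')
        return problems
    geometry = ann.get(field)
    if not isinstance(geometry, dict):
        problems.append(f'"{field}" must be an object')
        return problems
    if field == "polygon":
        for ring in geometry.get("paths") or []:
            if not isinstance(ring, list) or len(ring) < 3:
                problems.append("each polygon ring needs at least 3 points")
            elif not all(isinstance(p, dict) and "x" in p and "y" in p for p in ring):
                problems.append('polygon points must be {"x": ..., "y": ...} objects')
    return problems
```

Running it over a list before saving gives one readable report per annotation instead of a single rejected request.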
## SDK helpers

```python
from pictograph import BBoxAnnotation, BoundingBox, PolygonAnnotation, PolygonGeometry, Point

bbox = BBoxAnnotation(
    id="ann-1",
    name="person",
    bounding_box=BoundingBox(x=100, y=200, w=50, h=80),
)

polygon = PolygonAnnotation(
    id="ann-2",
    name="car",
    polygon=PolygonGeometry(paths=[
        [Point(x=10, y=20), Point(x=110, y=20), Point(x=110, y=80)],
    ]),
)

client.annotations.save(image_id, [bbox, polygon])
```

The SDK Pydantic models are the source of truth — they generate the JSON Schema this page describes. If a backend rejects your payload, diff your dump (`.model_dump(mode="json", exclude_none=True)`) against the rejection message.

---

## Page: Rate limits

_URL: https://pictograph.io/docs/rate-limits (markdown: rate-limits.md)_
_Section: Reference_

Every API key is rate-limited per organization tier with a 1-hour sliding window. The SDK auto-retries short waits; longer waits raise so your code can decide.

## Per-tier limits

| Tier | Requests / hour |
| --- | --- |
| Free | 1,000 |
| Core | 5,000 |
| Pro | 20,000 |
| Enterprise | 100,000 |

## Response headers

Every successful response carries the current state of the window. Read them off the underlying response if you're tracking your own consumption — most users can ignore them and let the SDK retry automatically.

| Header | Meaning |
| --- | --- |
| `X-RateLimit-Limit` | Cap for the current window |
| `X-RateLimit-Remaining` | Calls left in the current window |
| `X-RateLimit-Reset` | Unix timestamp when the window resets |
| `Retry-After` | (429 only) seconds until the next call may succeed |

## What counts

One HTTP request → one count. Payload size doesn't matter. Streaming downloads (image / model / export blobs) count as a single request regardless of size.

Bulk operations are designed to keep counts low — prefer `client.batch.move()` over N `client.images.update()` calls, and prefer the [workflows](/docs/workflows) over hand-rolled loops.
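If you do track consumption yourself, the `X-RateLimit-*` headers can be turned into a rough budget. This is a sketch that assumes the header values arrive as strings in a plain mapping (how you reach the underlying response depends on your HTTP layer); `window_state` is a hypothetical helper, not part of the SDK.

```python
import time

def window_state(headers, now=None):
    """Summarise the rate-limit headers of one response."""
    now = time.time() if now is None else now
    limit = int(headers["X-RateLimit-Limit"])
    remaining = int(headers["X-RateLimit-Remaining"])
    reset_at = int(headers["X-RateLimit-Reset"])  # Unix timestamp
    seconds_left = max(0, reset_at - now)
    return {
        "used": limit - remaining,
        "remaining": remaining,
        "seconds_until_reset": seconds_left,
        # Even spacing that would spend the remaining calls exactly at reset.
        "min_interval": seconds_left / remaining if remaining else None,
    }
```

With 1,200 calls left and an hour until reset, `min_interval` comes out at 3 seconds between requests.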
## SDK auto-retry

`RateLimitError` carries a `retry_after` attribute. The SDK waits and retries automatically when the response includes a `Retry-After` header **and** the wait is at most 120 seconds. Anything longer raises immediately so your code can back off, queue, or fail.

```python
import time

from pictograph.exceptions import RateLimitError

try:
    client.datasets.list(limit=1000)
except RateLimitError as e:
    print(f"Hit cap; retry in {e.retry_after}s")
    time.sleep(e.retry_after)
    # …then retry. The SDK won't auto-recover for waits >120s.
```

The `pictograph` CLI inherits the same behaviour. On a long wait it prints the retry estimate to stderr so you can Ctrl-C if you don't want to wait.

## Spreading bursty load

If your workload is bursty (nightly imports, large auto-annotation runs), pace it across the hour:

```python
from time import sleep

for batch in batches:
    process(batch)
    sleep(0.5)  # ~7,200 req/hr ceiling at one request per batch: under Pro (20,000), over Core (5,000)
```

For sustained workloads above your tier, the right move is to upgrade — retrying harder doesn't increase your share.

## See also

- [Error handling](/docs/error-handling) — the full exception hierarchy
- [Credits](/docs/api-reference/credits) — paid-operation budgeting

---

## Page: CLI reference

_URL: https://pictograph.io/docs/cli (markdown: cli.md)_
_Section: Reference_

The `pictograph` CLI is a thin wrapper over the SDK with Rich-formatted output. Same operations, same auth model, no learning curve.

## Install

```bash
pip install 'pictograph[cli]'
```

## Auth

```bash
pictograph login                              # interactive; writes ~/.pictograph/config.toml
# OR
export PICTOGRAPH_API_KEY=pk_live_…
# OR
pictograph datasets list --api-key pk_live_…
```

Resolution order: `--api-key` flag > `PICTOGRAPH_API_KEY` env > `~/.pictograph/config.toml`.
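That precedence amounts to "first non-empty source wins". A small sketch of the idea, not the CLI's actual code: `resolve_api_key` is a hypothetical function, the config is assumed to be already parsed into a dict with a `[default]` table, and `LookupError` stands in for the SDK's `ConfigurationError`.

```python
def resolve_api_key(flag=None, env=None, config=None):
    """Return the first key found: flag, then environment, then config file."""
    env = {} if env is None else env
    config = {} if config is None else config
    candidates = (
        flag,                                      # --api-key
        env.get("PICTOGRAPH_API_KEY"),             # environment variable
        config.get("default", {}).get("api_key"),  # ~/.pictograph/config.toml
    )
    for key in candidates:
        if key:
            return key
    raise LookupError("no API key found")
```

In a real CLI, `env` would be `os.environ` and `config` the parsed TOML file.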
## Global flags

| Flag | Notes |
|---|---|
| `--version` / `-V` | Print version and exit |
| `--help` | Print help (works on every subcommand) |
| `--api-key <key>` | Override the resolved key |
| `--json` | Emit raw JSON instead of Rich tables (where applicable) |

## Top-level commands

```bash
pictograph init     # drop AGENTS.md template into ./
pictograph login    # save API key
```

## datasets

```bash
pictograph datasets list                               # list (table)
pictograph datasets list --json                        # list (JSON)
pictograph datasets get road-signs                     # by name
pictograph datasets get road-signs --include-images    # with image summaries
pictograph datasets create new-dataset -d "Description"
pictograph datasets delete road-signs                  # confirms first
pictograph datasets delete road-signs --yes            # skip confirm
pictograph datasets download road-signs -o ./dump --workers 10
```

## images

```bash
pictograph images upload <dataset> ./photo.jpg --folder /cars
pictograph images download <image-uuid> -o ./out.jpg
pictograph images delete <image-uuid> --yes
```

## annotations

```bash
pictograph annotations get <image-uuid>
pictograph annotations save <image-uuid> --file ./anns.json   # JSON list of annotations
pictograph annotations delete <image-uuid> --yes
```

## train

```bash
pictograph train start <dataset> --pipeline yolox --gpu a10g
pictograph train start <dataset> --pipeline detectron2 \
  --gpu a100 --config '{"epochs": 50}'
pictograph train status <run-uuid>
pictograph train cancel <run-uuid> --yes
pictograph train logs <run-uuid>    # current status (SSE streaming arrives in v1.1)
```

## models

```bash
pictograph models list
pictograph models download <model-uuid> -o ./yolox.onnx
```

## credits

```bash
pictograph credits balance
pictograph credits balance --json
pictograph credits history --limit 100
pictograph credits estimate training_a10g_per_minute -q 30
```

## agents

```bash
pictograph agents list-tools                          # see all 28 tools
pictograph agents export-tools -o tools.json          # JSON Schema dump
pictograph agents install-skill --target claude-code  # → ~/.claude/skills/pictograph-cv/
pictograph agents install-skill --target claude-ai    # → ./pictograph-cv.zip
pictograph agents install-skill --target both
```

## Examples

### Train + download a YOLOX model

```bash
pictograph datasets create road-signs
# … upload images via the SDK or web app …
pictograph train start road-signs --pipeline yolox
# … wait for completion …
pictograph train status <run-uuid>
pictograph models download <model-uuid> -o ./yolox.onnx
```

### Start Detectron2 training on every dataset

```bash
for ds in $(pictograph datasets list --json | jq -r '.[].name'); do
  pictograph train start "$ds" --pipeline detectron2 --no-wait
done
```

(To bulk-export datasets to COCO, use the SDK's `client.exports.create(..., format="coco")` directly. The CLI doesn't yet have an `export` subcommand; it is coming in v1.1.)

### Daily cost monitoring

```bash
pictograph credits balance --json | jq '.credits_remaining'
```

## Output

- **Default**: Rich tables for human-readable terminal use. Auto-detects TTY width and wraps gracefully.
- **`--json`**: pretty-printed JSON for piping into `jq` / scripting. Same payload structure as the SDK's `model_dump(mode="json")`.

## Errors

CLI errors print bold-red to stderr and exit with non-zero status. The SDK's exception name maps to the message:

```
$ pictograph datasets get nonexistent
error: Project 'nonexistent' not found
$ echo $?
1
```

Exit codes:

| Code | Meaning |
|---|---|
| `0` | success |
| `1` | API error (handled cleanly by the CLI) |
| `2` | usage / config error (missing args, no API key) |

## See also

- [Quick Start](/docs/quick-start) — install + first run
- [Authentication](/docs/authentication) — key resolution + roles
- [Error handling](/docs/error-handling) — exception hierarchy

---

## Page: Pictograph

_URL: https://pictograph.io/docs/index (markdown: index.md)_
_Section: Get Started_

Pictograph turns directories of images into trained CV models with as little hand-annotation as possible.
The same REST API drives three surfaces: a typed Python SDK, a CLI, and an agent toolkit for Claude and OpenAI.

```python
from pictograph import Client
from pictograph.workflows import full_pipeline

client = Client()
report = full_pipeline(
    client,
    dataset_name="road-signs",
    folder="./road_signs",
    classes=[("stop_sign", "bbox"), ("yield", "bbox")],
    pipeline="yolox",
)
print("model:", report.model.id if report.success else "see report")
```

## What you can do

- **Upload** directories of images; subdirectories become virtual paths.
- **Auto-annotate** with SAM3 — point, box, or text prompts, single image or async batch.
- **Train** YOLOX, Detectron2, SM-PyTorch, RF-DETR, or classification models on A10G / A100 / H100 GPUs.
- **Export** to COCO, YOLO, CVAT, Pascal VOC, LabelMe, CSV, or Pictograph JSON.
- **Import** existing datasets from V7 (Darwin) or Roboflow.
- **Search** by visual similarity (SigLIP2) or auto-generated content tags.
- **Drive everything from agents** — Claude Agent SDK, openai-agents, Vercel AI SDK, LangChain, or any framework that speaks JSON Schema.
## Map of the docs

| Section | Pages |
| --- | --- |
| **Get Started** | [Installation](/docs/installation) · [Quickstart](/docs/quick-start) · [Authentication](/docs/authentication) |
| **Workflows** | [Full pipeline](/docs/workflows/full-pipeline) · [Upload](/docs/workflows/upload) · [Auto-annotate](/docs/workflows/auto-annotate) · [Train](/docs/workflows/train) |
| **API Reference** | [Overview](/docs/api-reference) · [Datasets](/docs/api-reference/datasets) · [Images](/docs/api-reference/images) · [Annotations](/docs/api-reference/annotations) · [Auto-annotate](/docs/api-reference/auto-annotate) · [Search](/docs/api-reference/search) · [Batch](/docs/api-reference/batch) · [Exports](/docs/api-reference/exports) · [Training](/docs/api-reference/training) · [Models](/docs/api-reference/models) · [Credits](/docs/api-reference/credits) · [Connectors](/docs/api-reference/connectors) · [Video](/docs/api-reference/video) · [Organizations](/docs/api-reference/organizations) · [Projects](/docs/api-reference/projects) · [API Keys](/docs/api-reference/api-keys) · [Tools](/docs/api-reference/tools) |
| **Agents** | [Overview](/docs/agents) · [Claude](/docs/agents/claude) · [OpenAI](/docs/agents/openai) · [Dynamic discovery](/docs/agents/dynamic-discovery) · [Cookbook](/docs/agents/cookbook) |
| **Reference** | [Annotation format](/docs/annotation-format) · [Error handling](/docs/error-handling) · [Rate limits](/docs/rate-limits) · [CLI](/docs/cli) |

Every page has a "Copy as Markdown" button and an `.md` mirror for agents to consume directly.
## For agents browsing this site

- Site index: [`/docs/llms.txt`](/docs/llms.txt)
- Full doc bundle (one file): [`/docs/llms-full.txt`](/docs/llms-full.txt)
- Tool registry (JSON Schema): [`/api/v1/developer/tools.json`](https://api.pictograph.io/api/v1/developer/tools.json)

---

## Page: Installation

_URL: https://pictograph.io/docs/installation (markdown: installation.md)_
_Section: Get Started_

## Requirements

- **Python 3.10+** (3.11+ recommended). Tested on 3.10–3.13.
- A Pictograph account and an API key (see [Quick Start](/docs/quick-start)).

## Install

```bash
pip install pictograph
```

Published on PyPI: [`pictograph`](https://pypi.org/project/pictograph/). The base install gives you the SDK Client, every resource, the agent toolkit, and the bundled `pictograph-cv` Skill.

## Optional extras

| Extra | What it adds | Install |
|---|---|---|
| `cli` | `pictograph` command (Typer + Rich) | `pip install 'pictograph[cli]'` |
| `agents` | Claude Agent SDK + openai-agents adapters | `pip install 'pictograph[agents]'` |
| `cache` | Local SQLite response cache (aiosqlite) | `pip install 'pictograph[cache]'` |
| `telemetry` | OpenTelemetry SDK for SDK calls | `pip install 'pictograph[telemetry]'` |
| `all` | Everything above | `pip install 'pictograph[all]'` |

Pillow is included in the base install (used to extract image dimensions client-side during upload).
## Verify

```python
from pictograph import Client, REGISTRY, __version__

print(f"pictograph v{__version__}")
print(f"{len(REGISTRY)} agent tools registered")
```

Expected output:

```
pictograph v1.x.x
28 agent tools registered
```

## CLI verify

```bash
pictograph --version
pictograph --help
```

## Editable install (contributors)

```bash
git clone https://github.com/pictograph-labs/pictograph-sdk
cd pictograph-sdk
pip install -e '.[dev,cli,agents,cache,telemetry]'
pytest
```

## Next

- [Quickstart](/docs/quick-start) — run an end-to-end pipeline in five minutes
- [Authentication](/docs/authentication) — API key resolution and roles
- [Workflows](/docs/workflows) — the headline upload / annotate / train helpers

---

## Page: Quick start

_URL: https://pictograph.io/docs/quick-start (markdown: quick-start.md)_
_Section: Get Started_

## Install

```bash
pip install pictograph
```

For the CLI + Rich-formatted output:

```bash
pip install 'pictograph[cli]'
```

For the agent toolkit (Claude Agent SDK + openai-agents):

```bash
pip install 'pictograph[agents]'
```

## Get an API key

1. Sign in at [app.pictograph.io](https://app.pictograph.io).
2. Navigate to **Settings → API Keys**.
3. Click **Create API Key**, give it a role (`viewer` / `member` / `admin` / `owner`).
4. Copy the key (`pk_live_…`) — it is only shown once.

```bash
export PICTOGRAPH_API_KEY=pk_live_…
```

Or use the CLI's interactive setup:

```bash
pictograph login
```

This writes `~/.pictograph/config.toml`.
## First call

```python
from pictograph import Client

client = Client()  # reads PICTOGRAPH_API_KEY
datasets = client.datasets.list(limit=10)
print(datasets)
```

## End-to-end: upload, annotate, train

The headline workflow — one function call:

```python
from pictograph import Client
from pictograph.workflows import full_pipeline

client = Client()
report = full_pipeline(
    client,
    dataset_name="road-signs",
    folder="./road_signs",
    classes=[("stop_sign", "bbox"), ("yield", "bbox")],
    pipeline="yolox",
)

if report.success:
    print(f"Trained model: {report.model.id}")
else:
    print(report.credit_skip_reason or "see sub-reports")
```

Each phase short-circuits on failure and the `PipelineReport` carries every sub-report. See [`full_pipeline`](/docs/workflows/full-pipeline) for every parameter.

## CLI equivalent

```bash
pictograph login    # one-time
pictograph datasets list
pictograph train start road-signs --pipeline yolox --gpu a10g
pictograph models download <model-id> -o ./yolox.onnx
```

## Next

- [Workflows](/docs/workflows) — the four batteries-included helpers
- [Agents](/docs/agents) — wire Pictograph into Claude or OpenAI
- [Annotation format](/docs/annotation-format) — the canonical JSON schema
- [Credits](/docs/api-reference/credits) — budget gating and cost estimation

---

## Page: Authentication

_URL: https://pictograph.io/docs/authentication (markdown: authentication.md)_
_Section: Get Started_

The Developer API authenticates via **API keys** — `pk_live_…` strings issued from **Settings → API Keys** in the web app. The same key works for the SDK, the `pictograph` CLI, and direct REST calls.

## Get an API key

1. Sign in at [app.pictograph.io](https://app.pictograph.io).
2. **Settings → API Keys → Create API Key**.
3. Pick a role: `viewer` / `member` / `admin` / `owner` (see below).
4. Copy the key — **shown once, never again**.
## Use the key

The SDK reads `PICTOGRAPH_API_KEY` from the environment by default:

```bash
export PICTOGRAPH_API_KEY=pk_live_…
```

```python
from pictograph import Client

client = Client()  # uses env var
```

Or pass it explicitly:

```python
client = Client(api_key="pk_live_…")
```

The CLI has the same resolution order plus a `~/.pictograph/config.toml` file written by `pictograph login`:

```bash
pictograph login            # prompts (input hidden), writes ~/.pictograph/config.toml
pictograph datasets list    # uses the saved key
```

## Resolution order

| Priority | Source |
|---|---|
| 1 (highest) | `--api-key` flag (CLI) or `Client(api_key=...)` arg (SDK) |
| 2 | `PICTOGRAPH_API_KEY` environment variable |
| 3 | `~/.pictograph/config.toml` `[default].api_key` (CLI only) |
| 4 (failure) | `ConfigurationError` raised |

## REST clients

```bash
curl -H "X-API-Key: pk_live_…" https://api.pictograph.io/api/v1/developer/datasets/
```

The header name is exactly `X-API-Key`. Bearer tokens are not accepted on developer endpoints.

## Roles + permissions

API keys carry a role that the backend re-enforces server-side. Roles are hierarchical: `owner > admin > member > viewer`.

| Role | Read | Create / update | Delete | Invite users | Org settings |
|---|---|---|---|---|---|
| viewer | ✓ | — | — | — | — |
| member | ✓ | ✓ | own resources only | — | — |
| admin | ✓ | ✓ | ✓ | ✓ | — |
| owner | ✓ | ✓ | ✓ | ✓ | ✓ |

The agent tool registry tags each tool with `required_role` — see [`/docs/api-reference/tools`](/docs/api-reference/tools).

## Key format

API keys look like `pk_live_<32_random_bytes_base64>`. They are **bcrypt-hashed** server-side (cost factor 12) and stored only as the hash plus the first 12 chars (`pk_live_<8>`) for the prefix-lookup index. Once created, the full key is never recoverable.

## Rotation

To rotate:

1. **Settings → API Keys → Create API Key** (new key).
2. Update `PICTOGRAPH_API_KEY` / `~/.pictograph/config.toml` / your CI secret.
3. Delete the old key once the rollout is verified.

There is no in-place rotation — every key is immutable after creation.

## Errors

| Status | Exception | Cause |
| --- | --- | --- |
| 401 | `AuthError` | Missing / malformed / unknown / revoked key |
| 403 | `ForbiddenError` | Key's role lacks permission for the operation |
| 429 | `RateLimitError` | Per-key rate cap hit (see [Rate limits](/docs/rate-limits)) |

---