Train a model

Create an export and train a model in one call. Returns the training run and, when training completes successfully, the trained model.

train_pipeline() chains export creation, training, and model fetch. Unless you pass export_name, the export is auto-named so the workflow doesn’t collide with exports you’ve created manually.

from pictograph import Client
from pictograph.pipelines import train_pipeline

client = Client()

run, model = train_pipeline(
    client,
    "road-signs",
    pipeline="yolox",
    gpu="a10g",
    config={"epochs": 50, "batch_size": 16},
)

# model is None unless the run finished with status "completed".
if model is not None:
    client.models.download(model.id, "./yolox.onnx")

Signature

train_pipeline(
    client: Client,
    dataset_name: str,
    *,
    pipeline: PipelineType,
    gpu: GpuType = "a10g",
    name: str | None = None,
    config: dict[str, Any] | None = None,
    export_name: str | None = None,
    class_filter: list[str] | None = None,
    status_filter: str = "complete",
    wait: bool = True,
    poll_interval: float = 5.0,
    timeout: float = 7200.0,
) -> tuple[TrainingRun, Model | None]

Argument	Default	Purpose
`dataset_name`	required	Project to train on
`pipeline`	required	`yolox`, `sm_pytorch`, `classification`, `rfdetr_detection`, `rfdetr_segmentation`
`gpu`	`"a10g"`	`a10g`, `a100`, or `h100`
`name`	auto	Run name (defaults to `<pipeline>-run-<timestamp>`)
`config`	`None`	Hyperparameters (`epochs`, `batch_size`, `learning_rate`, `image_size`)
`export_name`	auto	Defaults to `<pipeline>-<timestamp>`
`class_filter`	`None`	Train only on these classes
`status_filter`	`"complete"`	Only include images at this annotation status
`wait`	`True`	When `True`, block until training terminates and fetch the model
`poll_interval`	`5.0`	Seconds between polls
`timeout`	`7200.0`	Max seconds to wait (2 hours)
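
An invalid pipeline or gpu value is only rejected once the request reaches the API, so a local check can fail faster. The following is a minimal sketch, assuming the accepted values are exactly those listed in the table above; check_train_args is a hypothetical helper and not part of pictograph.

```python
# Accepted values copied from the argument table. The server remains the
# source of truth, so treat this as a convenience check only.
PIPELINES = {"yolox", "sm_pytorch", "classification",
             "rfdetr_detection", "rfdetr_segmentation"}
GPUS = {"a10g", "a100", "h100"}


def check_train_args(pipeline: str, gpu: str = "a10g") -> None:
    """Raise ValueError if pipeline or gpu is not a documented value.

    Hypothetical helper: not part of the pictograph package.
    """
    if pipeline not in PIPELINES:
        raise ValueError(
            f"unknown pipeline {pipeline!r}; expected one of {sorted(PIPELINES)}")
    if gpu not in GPUS:
        raise ValueError(
            f"unknown gpu {gpu!r}; expected one of {sorted(GPUS)}")


check_train_args("yolox")                         # passes silently
check_train_args("rfdetr_detection", gpu="a100")  # passes silently
```

Calling it just before train_pipeline() turns a typo such as "yolo" into an immediate ValueError instead of a round trip to the server.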

Pipelines

`pipeline`	Output	When to pick it
`yolox`	Object detection (boxes)	Speed, edge deployment, small datasets
`sm_pytorch`	Semantic segmentation	Pixel-wise class maps
`classification`	Image classification	Tag-style labels with no geometry
`rfdetr_detection`	Object detection	Higher mAP than YOLOX on harder data
`rfdetr_segmentation`	Instance segmentation (polygons + masks)	Best per-instance mask accuracy

GPU tiers

`gpu`	Pick for
`a10g` (default)	YOLOX, classification, RF-DETR-detection
`a100`	Large RF-DETR, big batch sizes
`h100`	Last resort — only when A100 OOMs
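
The table reads as an escalation ladder: start on a10g and move up only when the current tier is not enough. A small sketch of that ordering follows; next_gpu_tier is a hypothetical helper, and how an out-of-memory failure is reported is not covered on this page, so detecting it is left to the caller.

```python
from typing import Optional

# Tiers in the order the table recommends trying them.
GPU_LADDER = ["a10g", "a100", "h100"]


def next_gpu_tier(current: str) -> Optional[str]:
    """Return the next larger tier, or None when already on the largest.

    Hypothetical helper: not part of the pictograph package.
    """
    idx = GPU_LADDER.index(current)  # raises ValueError for an unknown tier
    if idx + 1 < len(GPU_LADDER):
        return GPU_LADDER[idx + 1]
    return None
```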

The dataset must have at least 5 images whose annotation status matches status_filter, so the worker can split them into train / val / test sets.

What happens under the hood

1. client.exports.create(dataset, "<pipeline>-<timestamp>", format="pictograph",
                         include_images=True, class_filter=…, status_filter=…)
   → waits for the export to finish.

2. client.training.create(dataset, export_name, pipeline_type=…, name=…,
                          config=…, gpu_type=…, wait=…, timeout=…)
   → kicks off the run; polls until terminal when wait=True.

3. client.models.get(run.model_id)
   → returns the trained model — only when wait=True and status=="completed".
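
The three steps above can be written out as a single function. This is a sketch built only from the calls listed above and is not the library's actual source: train_pipeline_sketch is a hypothetical name, and the timestamp format (Unix seconds here) is an assumption.

```python
import time
from typing import Any, Optional


def train_pipeline_sketch(client: Any, dataset: str, *, pipeline: str,
                          gpu: str = "a10g", name: Optional[str] = None,
                          config: Optional[dict] = None,
                          export_name: Optional[str] = None,
                          class_filter: Optional[list] = None,
                          status_filter: str = "complete",
                          wait: bool = True, timeout: float = 7200.0):
    # Assumption: Unix seconds; the real timestamp format is not documented here.
    stamp = int(time.time())
    export_name = export_name or f"{pipeline}-{stamp}"
    name = name or f"{pipeline}-run-{stamp}"

    # Step 1: create the export (documented as waiting until it finishes).
    client.exports.create(dataset, export_name, format="pictograph",
                          include_images=True, class_filter=class_filter,
                          status_filter=status_filter)

    # Step 2: start training; blocks until a terminal state when wait=True.
    run = client.training.create(dataset, export_name, pipeline_type=pipeline,
                                 name=name, config=config or {}, gpu_type=gpu,
                                 wait=wait, timeout=timeout)

    # Step 3: fetch the model only for a run that was waited on and completed.
    if wait and run.status == "completed":
        return run, client.models.get(run.model_id)
    return run, None
```

Writing it out makes the return contract visible: the second tuple element is None whenever wait=False or the run ends in any state other than "completed".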

Async usage

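Pass wait=False to return as soon as the run is created. No model is fetched in this mode, so the second element of the returned tuple is None: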

run, _ = train_pipeline(client, "road-signs", pipeline="yolox", wait=False)
print("queued:", run.id)

# Poll yourself later.
run = client.training.get(run.id)
if run.status == "completed":
    model = client.models.get(run.model_id)
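
If you want to block later on a run started with wait=False, the manual check above extends naturally into a polling loop. The sketch below uses only client.training.get() and run.status from the example; wait_for_run is a hypothetical helper, the built-in TimeoutError stands in for the library's own timeout exception, and every terminal status other than "completed" is a guess.

```python
import time
from typing import Any

# Assumption: only "completed" is confirmed on this page; "failed" and
# "cancelled" are guesses at the other terminal states.
TERMINAL = {"completed", "failed", "cancelled"}


def wait_for_run(client: Any, run_id: str, poll_interval: float = 5.0,
                 timeout: float = 7200.0) -> Any:
    """Poll client.training.get() until the run is terminal or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        run = client.training.get(run_id)
        if run.status in TERMINAL:
            return run
        # Give up if the next poll would land after the deadline.
        if time.monotonic() + poll_interval > deadline:
            raise TimeoutError(
                f"run {run_id} still {run.status!r} after {timeout}s")
        time.sleep(poll_interval)
```

time.monotonic() is used for the deadline so that system clock adjustments during a long run do not shorten or extend the wait.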

Hyperparameters

config keys are pipeline-specific. Common ones across pipelines:

Key	Type	Typical
`epochs`	int	30–100
`batch_size`	int	8 / 16 / 32
`learning_rate`	float	`0.001`–`0.01`
`image_size`	int	`640` (YOLOX), `1024` (segmentation)

Unsupported keys are ignored.
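
Because unsupported keys are ignored, a misspelled key such as "epoch" would not raise an error. A minimal sketch of a local check against the common keys in the table follows; config_warnings is a hypothetical helper, and since keys are pipeline-specific it reports warnings instead of rejecting anything.

```python
# Common keys and their expected Python types, taken from the table above.
COMMON_KEYS = {
    "epochs": (int,),
    "batch_size": (int,),
    "learning_rate": (int, float),  # accept 1 as well as 1.0
    "image_size": (int,),
}


def config_warnings(config: dict) -> list:
    """Return human-readable warnings for suspicious config entries.

    Hypothetical helper: not part of the pictograph package.
    """
    warnings = []
    for key, value in config.items():
        expected = COMMON_KEYS.get(key)
        if expected is None:
            warnings.append(
                f"{key}: not a common key; ignored unless the pipeline supports it")
        elif isinstance(value, bool) or not isinstance(value, expected):
            # bool is a subclass of int in Python, so exclude it explicitly.
            names = " or ".join(t.__name__ for t in expected)
            warnings.append(
                f"{key}: expected {names}, got {type(value).__name__}")
    return warnings
```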

Errors

Status	Exception	Cause
404	`NotFoundError`	Dataset missing or has no `status_filter`-matching images
422	`ValidationError`	Pipeline or GPU invalid, or dataset has fewer than 5 images matching `status_filter`
402	`PaymentRequiredError`	Insufficient credits for the estimated training minutes
408	`PollTimeoutError`	`wait=True` and `timeout` elapsed (the run continues; poll later)
5xx	`ApiError`	Training run failed — inspect `run.error_message`
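
One way to act on the table above is to map each exception to a next step. The sketch below matches on the exception class name because the import path for these classes is not shown on this page; advice_for and the advice strings are illustrative, not part of pictograph.

```python
# Next steps keyed by the exception names in the table above.
ADVICE = {
    "NotFoundError": "check the dataset name and status_filter",
    "ValidationError": "check pipeline and gpu values and the 5-image minimum",
    "PaymentRequiredError": "add credits before retrying",
    "PollTimeoutError": "the run continues; poll it later",
    "ApiError": "inspect run.error_message",
}


def advice_for(exc: BaseException) -> str:
    """Return a suggested next step for an exception raised during training.

    Hypothetical helper: not part of the pictograph package.
    """
    # Walk the class hierarchy, most specific first, so subclasses of a
    # listed exception also match.
    for cls in type(exc).__mro__:
        if cls.__name__ in ADVICE:
            return ADVICE[cls.__name__]
    return "unexpected error; re-raise it"
```

A typical use is inside an except block around train_pipeline(), logging advice_for(exc) before re-raising.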