Full pipeline

Upload a folder, auto-annotate it, and train a model in one call.

full_pipeline() chains upload → auto-annotate → train. The chain stops at the first phase that fails, and the PipelineReport carries every sub-report, so you can see exactly where it broke.

from pictograph import Client
from pictograph.workflows import full_pipeline

client = Client()

report = full_pipeline(
    client,
    dataset_name="road-signs",
    folder="./road_signs",
    classes=[("stop_sign", "bbox"), ("yield", "bbox")],
    pipeline="yolox",
)

if report.success:
    print("Model:", report.model.id)
else:
    print("Stopped at:", report.credit_skip_reason or "see sub-reports")

Signature

full_pipeline(
    client: Client,
    *,
    dataset_name: str,
    folder: str | Path,
    classes: Sequence[BatchClass | tuple[str, str] | dict[str, str]],
    pipeline: PipelineType,
    gpu: GpuType = "a10g",
    annotate: bool = True,
    annotate_mode: AnnotateMode = "batch",
    train: bool = True,
    upload_workers: int = 8,
    train_config: dict[str, Any] | None = None,
    train_timeout: float = 7200.0,
    min_credits: int | None = 1,
) -> PipelineReport

| Argument | Default | Purpose |
| --- | --- | --- |
| dataset_name | required | Destination dataset; created if missing |
| folder | required | Local folder of images; subdirectories become virtual folders |
| classes | required | Each class becomes a SAM3 target and a training label |
| pipeline | required | yolox, detectron2, sm_pytorch, classification, rfdetr_detection, rfdetr_segmentation |
| gpu | "a10g" | a10g, a100, or h100 |
| annotate | True | Set to False to skip the SAM3 phase if you already have annotations |
| annotate_mode | "batch" | batch (async, multi-image) or text (synchronous per-image) |
| train | True | Set to False to skip training and do upload + annotate only |
| upload_workers | 8 | Concurrent upload threads |
| train_config | None | Hyperparameters (epochs, batch_size, learning_rate, image_size) |
| train_timeout | 7200 | Max seconds to wait for training (2 hours) |
| min_credits | 1 | Pre-flight balance check before paid phases; pass None to disable |
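
As an illustration of the train_config argument, the sketch below builds a config dict restricted to the four hyperparameter names listed in the table. The helper make_train_config and the example values are hypothetical, and whether the library rejects or ignores other keys is not stated on this page, so treat the key check as a local convention rather than library behaviour.

```python
from typing import Any

# The four hyperparameter names listed in the argument table.
DOCUMENTED_KEYS = {"epochs", "batch_size", "learning_rate", "image_size"}


def make_train_config(**overrides: Any) -> dict[str, Any]:
    """Hypothetical helper: build a train_config from documented keys only."""
    unknown = set(overrides) - DOCUMENTED_KEYS
    if unknown:
        # Local safeguard against typos; not a check the library is known to make.
        raise KeyError(f"undocumented train_config keys: {sorted(unknown)}")
    return dict(overrides)


# Example values are placeholders, not recommended settings.
config = make_train_config(epochs=50, batch_size=16)
```

The resulting dict would then be passed as train_config=config.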

How the chain short-circuits

Each phase only runs if the previous one succeeded.

  1. Upload always runs. If it produces zero successes, the function returns immediately with upload.failures populated.
  2. Credit gate: before any paid phase, the balance is checked. If it is below min_credits, the function returns with credit_skip_reason set.
  3. Auto-annotate runs only when annotate=True and upload succeeded. If it produces zero processed images, training is skipped.
  4. Train runs only when train=True and the previous phases succeeded. The export is auto-named <pipeline>-<timestamp>.
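
The gating rules above can be sketched as a small stand-alone simulation. This is not the library's implementation: ChainResult, run_chain, and the callables passed in are hypothetical stand-ins, and the sketch assumes the credit gate is skipped when no paid phase is requested.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ChainResult:
    """Hypothetical stand-in for PipelineReport; not the library's class."""
    uploaded: int = 0
    annotated: Optional[int] = None  # None means the phase never ran
    trained: bool = False
    credit_skip_reason: Optional[str] = None


def run_chain(
    upload_fn: Callable[[], int],
    annotate_fn: Callable[[], int],
    train_fn: Callable[[], bool],
    balance: int,
    *,
    annotate: bool = True,
    train: bool = True,
    min_credits: Optional[int] = 1,
) -> ChainResult:
    result = ChainResult()

    # 1. Upload always runs; zero successes ends the chain.
    result.uploaded = upload_fn()
    if result.uploaded == 0:
        return result

    # 2. Credit gate before any paid phase (disabled when min_credits is None).
    #    Assumption: the gate is skipped if neither paid phase was requested.
    if (annotate or train) and min_credits is not None and balance < min_credits:
        result.credit_skip_reason = f"balance {balance} < min_credits {min_credits}"
        return result

    # 3. Auto-annotate; zero processed images skips training.
    if annotate:
        result.annotated = annotate_fn()
        if result.annotated == 0:
            return result

    # 4. Train only if requested and everything before it succeeded.
    if train:
        result.trained = train_fn()
    return result
```

Calling run_chain(lambda: 3, lambda: 0, lambda: True, balance=10) returns a result with annotated == 0 and trained == False, mirroring step 3.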

Inspecting the report

@dataclass
class PipelineReport:
    dataset_name: str
    upload: UploadReport
    annotate: AnnotateReport | None
    training_run: TrainingRun | None
    model: Model | None
    credit_skip_reason: str | None

    @property
    def success(self) -> bool: ...

success is True only when every populated phase succeeded and no credit skip happened. Each sub-report has its own success property.
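
One way to turn a failed report into a short diagnosis is to walk the fields in phase order. The sketch below uses only the attributes shown in the dataclass above; the function name stopped_at and its return labels are invented for illustration, and it assumes upload.success is False only when the upload phase is what stopped the chain.

```python
from types import SimpleNamespace


def stopped_at(report) -> str:
    """Return a short label for the first phase that did not complete.

    Hypothetical helper: relies only on the PipelineReport fields listed
    above; the labels themselves are made up for this example.
    """
    if report.success:
        return "ok"
    if not report.upload.success:
        return "upload"
    if report.credit_skip_reason:
        return "credit gate"
    if report.annotate is not None and not report.annotate.success:
        return "annotate"
    if report.model is None:
        return "train"
    return "unknown"


# Stand-in object for demonstration; a real PipelineReport comes from full_pipeline().
fake = SimpleNamespace(
    success=False,
    upload=SimpleNamespace(success=True),
    credit_skip_reason="balance too low",
    annotate=None,
    training_run=None,
    model=None,
)
print(stopped_at(fake))  # credit gate
```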

Common patterns

Upload + annotate only (no training):

full_pipeline(client, ..., pipeline="yolox", train=False)

Use existing annotations (skip SAM3):

full_pipeline(client, ..., annotate=False, pipeline="yolox")

Disable the credit pre-flight (you’re OK paying for partial runs):

full_pipeline(client, ..., min_credits=None)

Errors

The function does not raise on partial failure; inspect the report instead. It still raises for the unrecoverable conditions below, which can occur before any work begins or, in the case of 402, mid-run:

| Status | Exception | Cause |
| --- | --- | --- |
| 402 | PaymentRequiredError | Mid-run cost exceeds balance (from auto-annotate or training phase) |
| 422 | ValidationError | pipeline or gpu value invalid |
| (none) | FileNotFoundError | folder doesn’t exist or isn’t a directory |
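
Two of these conditions can be caught locally before calling full_pipeline(). The sketch below is a hypothetical pre-flight helper, not part of the library: preflight_problems is an invented name, and the allowed pipeline and gpu values are copied from the argument table on this page, so they may drift from what the server actually accepts.

```python
from pathlib import Path

# Values copied from the argument table on this page (may go stale).
PIPELINES = {
    "yolox", "detectron2", "sm_pytorch", "classification",
    "rfdetr_detection", "rfdetr_segmentation",
}
GPUS = {"a10g", "a100", "h100"}


def preflight_problems(folder, pipeline: str, gpu: str = "a10g") -> list[str]:
    """Hypothetical local check mirroring the FileNotFoundError and 422 cases."""
    problems = []
    if not Path(folder).is_dir():
        problems.append(f"folder {str(folder)!r} doesn't exist or isn't a directory")
    if pipeline not in PIPELINES:
        problems.append(f"unknown pipeline {pipeline!r}")
    if gpu not in GPUS:
        problems.append(f"unknown gpu {gpu!r}")
    return problems
```

An empty list means the arguments pass these local checks; the 402 case cannot be predicted this way because it depends on the cost incurred mid-run.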
