Datasets

List, fetch, create, and delete projects, and bulk-download their images and annotations.

A dataset in Pictograph is a project: a collection of images sharing a class set. Datasets are unique by (organization, name). The SDK strongly prefers name-based lookups, because agents usually pass along the strings users gave them and resolving a UUID first only adds friction.

from pictograph import Client
client = Client()

list

Single-page list of datasets in your organization.

datasets = client.datasets.list(limit=100)
for ds in datasets:
    print(ds.name, ds.image_count)

| Arg | Type | Default | Notes |
| --- | --- | --- | --- |
| limit | int | 100 | Backend cap: 1000 |

Returns list[Dataset].

iter

Auto-paging iterator over every dataset.

for ds in client.datasets.iter(page_size=100):
    print(ds.name)

# Or materialize:
all_datasets = client.datasets.iter().all()

| Arg | Type | Default | Notes |
| --- | --- | --- | --- |
| page_size | int | 100 | Items per backend round-trip |
| max_total | int \| None | None | Stop after this many items |

Returns OffsetPager[Dataset].
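
To make the auto-paging behaviour concrete, here is a minimal sketch of offset-based paging in general. It is not the SDK's OffsetPager implementation: `iter_offset_pages` and `fetch_page` are hypothetical names, with `fetch_page(offset, limit)` standing in for one backend round-trip. Only `page_size` and `max_total` mirror the arguments documented above.

```python
from typing import Callable, Iterator, Optional, Sequence, TypeVar

T = TypeVar("T")


def iter_offset_pages(
    fetch_page: Callable[[int, int], Sequence[T]],
    page_size: int = 100,
    max_total: Optional[int] = None,
) -> Iterator[T]:
    """Yield items one at a time, requesting a new page only when needed.

    fetch_page(offset, limit) is a hypothetical callable that returns at most
    `limit` items starting at `offset`.
    """
    offset = 0
    yielded = 0
    while max_total is None or yielded < max_total:
        page = fetch_page(offset, page_size)
        for item in page:
            if max_total is not None and yielded >= max_total:
                return
            yield item
            yielded += 1
        if len(page) < page_size:
            return  # a short (or empty) page means nothing is left
        offset += page_size
```

Because the generator is lazy, a caller that stops early never pays for the remaining round-trips.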

get

Fetch by name (case-sensitive within org).

ds = client.datasets.get("road-signs", include_images=True, images_limit=200)
print(ds.image_count, len(ds.images))

| Arg | Type | Default | Notes |
| --- | --- | --- | --- |
| name | str | required | Dataset name |
| include_images | bool | False | Embed first images_limit DatasetImage summaries |
| images_limit | int | 1000 | Backend cap: 10000 |
| images_offset | int | 0 | Page the embedded image list |

Returns Dataset.
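
For datasets with more images than one call embeds, images_limit and images_offset can be combined into a paging loop. The helper below is a sketch, not part of the SDK: `iter_dataset_images` is a hypothetical name, and it assumes that a page shorter than `page_size` (including an empty one) marks the end of the list.

```python
from typing import Any, Iterator


def iter_dataset_images(client: Any, name: str, page_size: int = 1000) -> Iterator[Any]:
    """Yield every embedded image summary of one dataset, page by page."""
    offset = 0
    while True:
        ds = client.datasets.get(
            name,
            include_images=True,
            images_limit=page_size,
            images_offset=offset,
        )
        images = ds.images or []
        yield from images
        # Assumption: a page shorter than page_size means the list is exhausted.
        if len(images) < page_size:
            break
        offset += page_size
```

Usage would look like `for img in iter_dataset_images(client, "road-signs"): ...`, with `client` created as shown at the top of this page.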

get_by_id

UUID lookup. Use only when you already have the ID.

ds = client.datasets.get_by_id("a3e12f...")

download

Bulk-download images and/or annotations to a local directory. The call fetches a batch of signed download URLs in one request, then downloads the files in parallel via a thread pool.

report = client.datasets.download(
    "road-signs",
    output_dir="./dump",
    mode="full",                     # "full" | "images_only" | "annotations_only"
    status_filter="complete",        # restrict to annotation-finalised images
    max_workers=10,
    progress=lambda done, total, fn: print(f"{done}/{total} {fn}"),
)
print(report.images_downloaded, report.annotations_downloaded, len(report.failures))

Returns a DownloadReport. Inspect .failures to retry the subset — the call does not raise on individual file errors.
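
The pattern described above, parallel downloads on a thread pool with per-file errors collected rather than raised, can be sketched with the standard library alone. This illustrates the general technique and is not the SDK's internal code: `fetch_all` and `fetch_one` are hypothetical names, and the returned tuple is only a stand-in for whatever DownloadReport actually holds.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, List, Tuple


def fetch_all(
    urls: Dict[str, str],
    fetch_one: Callable[[str, str], None],
    max_workers: int = 10,
) -> Tuple[List[str], List[Tuple[str, Exception]]]:
    """Run fetch_one(filename, url) for each entry on a thread pool.

    Returns (succeeded filenames, [(filename, error), ...]). Per-file errors
    are recorded instead of raised, so one bad file does not abort the batch.
    """
    done: List[str] = []
    failures: List[Tuple[str, Exception]] = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch_one, fn, url): fn for fn, url in urls.items()}
        for future in as_completed(futures):
            filename = futures[future]
            try:
                future.result()
                done.append(filename)
            except Exception as exc:  # keep going; report the failure later
                failures.append((filename, exc))
    return done, failures
```

Retrying then amounts to calling the same function again with only the failed filenames, which is the workflow that inspecting .failures enables.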

Project CRUD

Create, update, and delete live on the projects resource. Both names refer to the same underlying entity: “dataset” is the SDK alias for the read paths and “project” is the alias for the write paths.

proj = client.projects.create("new-dataset", description="…")
client.projects.update("new-dataset", description="updated")
client.projects.delete("new-dataset")

Common errors

| Status | Exception | Cause |
| --- | --- | --- |
| 404 | NotFoundError | Name doesn’t exist (case-sensitive) or belongs to another org |
| 409 | ConflictError | create with a duplicate name |
| 403 | ForbiddenError | delete requires admin+ role |
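
One common use of the 404 case is a get-or-create helper. The sketch below is an illustration under assumptions: `get_or_create` is a hypothetical function, the exception class is passed in as an argument because this page does not show where NotFoundError is imported from, and it assumes that `projects.create` accepts a name without a description.

```python
from typing import Any, Type


def get_or_create(client: Any, name: str, not_found_error: Type[Exception]) -> Any:
    """Return the dataset called `name`, creating the project if it is missing."""
    try:
        return client.datasets.get(name)
    except not_found_error:
        # 404: the name does not exist in this org (lookups are case-sensitive).
        # Assumption: create() works with only a name.
        client.projects.create(name)
        return client.datasets.get(name)
```

If two callers race, the slower `create` would presumably surface as the 409 ConflictError listed above; catching it and retrying the `get` is a reasonable extension.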

REST equivalent

curl -H "X-API-Key: pk_live_…" \
  "https://api.pictograph.io/api/v1/developer/datasets/?limit=10"
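
The same request can be built from Python without the SDK. This is a minimal standard-library sketch: the URL and the X-API-Key header come from the curl command above, while `build_list_request` is a hypothetical helper name and the key value is whatever your account provides.

```python
import urllib.request
from urllib.parse import urlencode

BASE_URL = "https://api.pictograph.io/api/v1/developer/datasets/"


def build_list_request(api_key: str, limit: int = 10) -> urllib.request.Request:
    """Build (but do not send) the GET request that the curl command issues."""
    url = f"{BASE_URL}?{urlencode({'limit': limit})}"
    return urllib.request.Request(url, headers={"X-API-Key": api_key})
```

Passing the returned object to `urllib.request.urlopen` sends it; the shape of the response body is not documented on this page, so parsing is left to the caller.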