Datasets
List, fetch, create, and delete projects, and bulk-download their images and annotations.
A dataset in Pictograph is a project: a collection of images
sharing a class set. Datasets are unique by (organization, name).
The SDK strongly prefers name-based lookups, because agents pass along the
strings users gave them and resolving a UUID first adds friction.
from pictograph import Client
client = Client()
list
Single-page list of datasets in your organization.
datasets = client.datasets.list(limit=100)
for ds in datasets:
    print(ds.name, ds.image_count)
| Arg | Type | Default | Notes |
|---|---|---|---|
| limit | int | 100 | Backend cap: 1000 |
Returns list[Dataset].
iter
Auto-paging iterator over every dataset.
for ds in client.datasets.iter(page_size=100):
    print(ds.name)
# Or materialize:
all_datasets = client.datasets.iter().all()
| Arg | Type | Default | Notes |
|---|---|---|---|
| page_size | int | 100 | Items per backend round-trip |
| max_total | int \| None | None | Stop after this many items |
Returns OffsetPager[Dataset].
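To make the roles of page_size and max_total concrete, here is a minimal sketch of how an offset-based auto-paging iterator can work. It is not the SDK's OffsetPager implementation: `iter_pages`, `fetch_page`, and the in-memory `fake_fetch` backend are hypothetical names used only for illustration.

```python
from typing import Callable, Iterator, List, Optional, TypeVar

T = TypeVar("T")


def iter_pages(
    fetch_page: Callable[[int, int], List[T]],
    page_size: int = 100,
    max_total: Optional[int] = None,
) -> Iterator[T]:
    """Yield items page by page until the backend runs dry or max_total is hit."""
    offset = 0
    yielded = 0
    while max_total is None or yielded < max_total:
        page = fetch_page(offset, page_size)  # one backend round-trip
        for item in page:
            yield item
            yielded += 1
            if max_total is not None and yielded >= max_total:
                return
        # A short (or empty) page means there is nothing left to fetch.
        if len(page) < page_size:
            return
        offset += len(page)


# Hypothetical stand-in backend: 250 fake dataset names served by offset/limit.
_FAKE = [f"dataset-{i}" for i in range(250)]


def fake_fetch(offset: int, limit: int) -> List[str]:
    return _FAKE[offset:offset + limit]


names = list(iter_pages(fake_fetch, page_size=100))
first_five = list(iter_pages(fake_fetch, page_size=100, max_total=5))
```

In this sketch a larger page_size means fewer round-trips, while max_total bounds the total number of items regardless of page size.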
get
Fetch by name (case-sensitive within org).
ds = client.datasets.get("road-signs", include_images=True, images_limit=200)
print(ds.image_count, len(ds.images))
| Arg | Type | Default | Notes |
|---|---|---|---|
| name | str | required | Dataset name |
| include_images | bool | False | Embed first images_limit DatasetImage summaries |
| images_limit | int | 1000 | Backend cap: 10000 |
| images_offset | int | 0 | Page the embedded image list |
Returns Dataset.
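Since images_limit and images_offset page the embedded image list, a caller can walk every image of a large dataset by advancing the offset. The sketch below uses only the `get` arguments documented above; the helper `iter_dataset_images` is hypothetical, and it assumes that `ds.images` is a list and that a page shorter than images_limit marks the end. The fake client exists only so the example runs without network access.

```python
from types import SimpleNamespace


def iter_dataset_images(client, name, page_size=1000):
    """Yield every embedded image summary by advancing images_offset."""
    offset = 0
    while True:
        ds = client.datasets.get(
            name,
            include_images=True,
            images_limit=page_size,
            images_offset=offset,
        )
        images = ds.images
        yield from images
        # Assumption: a short page means the image list is exhausted.
        if len(images) < page_size:
            return
        offset += len(images)


class _FakeDatasets:
    """Hypothetical stand-in for client.datasets, backed by a plain list."""

    def __init__(self, images):
        self._images = images

    def get(self, name, include_images=False, images_limit=1000, images_offset=0):
        window = []
        if include_images:
            window = self._images[images_offset:images_offset + images_limit]
        return SimpleNamespace(name=name, image_count=len(self._images), images=window)


fake_client = SimpleNamespace(
    datasets=_FakeDatasets([f"img-{i}.jpg" for i in range(25)])
)
all_images = list(iter_dataset_images(fake_client, "road-signs", page_size=10))
```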
get_by_id
UUID lookup. Use only when you already have the ID.
ds = client.datasets.get_by_id("a3e12f...")
download
Bulk-download images and/or annotations to a local directory. The call fetches a batch of signed download URLs in one request, then downloads the files in parallel via a thread pool.
report = client.datasets.download(
    "road-signs",
    output_dir="./dump",
    mode="full",  # "full" | "images_only" | "annotations_only"
    status_filter="complete",  # restrict to annotation-finalised images
    max_workers=10,
    progress=lambda done, total, fn: print(f"{done}/{total} {fn}"),
)
print(report.images_downloaded, report.annotations_downloaded, len(report.failures))
Returns a DownloadReport. The call does not raise on individual file
errors; inspect .failures to find the files that failed and retry that subset.
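The general pattern behind this behaviour (parallel downloads in a thread pool, with per-file errors collected instead of raised) can be sketched as follows. This is an illustration under stated assumptions, not the SDK's code: `Report`, `download_all`, and `fake_fetch` are hypothetical, and the real shape of the entries in `.failures` may differ from the (filename, message) pairs used here.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from dataclasses import dataclass, field


@dataclass
class Report:
    """Hypothetical stand-in for DownloadReport."""
    downloaded: int = 0
    failures: list = field(default_factory=list)  # (filename, error message) pairs


def download_all(items, fetch_one, max_workers=10, progress=None):
    """Run fetch_one(filename, url) for every item in parallel; never raise per file."""
    report = Report()
    total = len(items)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch_one, fn, url): fn for fn, url in items}
        for done, future in enumerate(as_completed(futures), start=1):
            fn = futures[future]
            try:
                future.result()
                report.downloaded += 1
            except Exception as exc:  # record the failure and keep going
                report.failures.append((fn, str(exc)))
            if progress is not None:
                progress(done, total, fn)
    return report


def fake_fetch(filename, url):
    """Simulated download that fails for one file."""
    if filename == "b.jpg":
        raise IOError("simulated network error")


items = [
    ("a.jpg", "https://example.invalid/a"),
    ("b.jpg", "https://example.invalid/b"),
    ("c.jpg", "https://example.invalid/c"),
]
report = download_all(items, fake_fetch, max_workers=2)

# Build the retry subset from the recorded failures.
failed_names = {fn for fn, _ in report.failures}
retry_items = [(fn, url) for fn, url in items if fn in failed_names]
```

Collecting failures this way lets one bad file leave the rest of the batch intact, at the cost of requiring the caller to check the report explicitly.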
Project CRUD
Create, update, and delete live on the projects resource. It is the
same underlying entity: “dataset” is the SDK alias for the read paths
and “project” is the alias for the write paths.
proj = client.projects.create("new-dataset", description="…")
client.projects.update("new-dataset", description="updated")
client.projects.delete("new-dataset")
Common errors
| Status | Exception | Cause |
|---|---|---|
| 404 | NotFoundError | Name doesn’t exist (case-sensitive) or belongs to another org |
| 409 | ConflictError | create with a duplicate name |
| 403 | ForbiddenError | delete requires admin+ role |
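One common use of these exceptions is a get-or-create helper that falls back to `projects.create` when the name lookup fails. The sketch below is hypothetical: the import path of the SDK's exception classes is not shown in this page, so the helper takes the exception class as a parameter, and a local `NotFoundError` plus a fake backend stand in for the real client. It also assumes that `projects.create` returns an object usable like the one from `datasets.get`.

```python
from types import SimpleNamespace


class NotFoundError(Exception):
    """Local stand-in; real code would pass the SDK's NotFoundError instead."""


def get_or_create(client, name, not_found_error, **create_kwargs):
    """Fetch a dataset by name, creating the project if the lookup raises."""
    try:
        return client.datasets.get(name)
    except not_found_error:
        return client.projects.create(name, **create_kwargs)


class _FakeBackend:
    """Hypothetical in-memory backend serving both read and write paths."""

    def __init__(self):
        self.store = {}

    def get(self, name):
        if name not in self.store:
            raise NotFoundError(name)
        return self.store[name]

    def create(self, name, description=""):
        self.store[name] = SimpleNamespace(name=name, description=description)
        return self.store[name]


backend = _FakeBackend()
fake_client = SimpleNamespace(datasets=backend, projects=backend)

created = get_or_create(fake_client, "new-dataset", NotFoundError, description="demo")
fetched = get_or_create(fake_client, "new-dataset", NotFoundError)
```

If another caller creates the same name between the lookup and the create, the create would presumably fail with the 409 ConflictError listed above, so production code may want to catch that and retry the lookup.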
REST equivalent
curl -H "X-API-Key: pk_live_…" \
  "https://api.pictograph.io/api/v1/developer/datasets/?limit=10"
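For callers who want the same request without the SDK or curl, the sketch below builds it with Python's standard library. The URL and the X-API-Key header come from the curl example; the helper name `build_list_request` is hypothetical, "YOUR_API_KEY" is a placeholder, and the response format is not documented here, so the example stops short of parsing it.

```python
import urllib.parse
import urllib.request

BASE_URL = "https://api.pictograph.io/api/v1/developer/datasets/"


def build_list_request(api_key: str, limit: int = 10) -> urllib.request.Request:
    """Build (but do not send) the GET request equivalent to the curl call."""
    query = urllib.parse.urlencode({"limit": limit})
    return urllib.request.Request(
        f"{BASE_URL}?{query}",
        headers={"X-API-Key": api_key},
        method="GET",
    )


req = build_list_request("YOUR_API_KEY", limit=10)  # placeholder key

# urllib stores header names capitalized ("X-api-key"); HTTP header names
# are case-insensitive, so the server sees the same header.
# To actually send the request: urllib.request.urlopen(req)
```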