---
title: Datasets
description: List, fetch, create, and delete projects, and bulk-download their images and annotations.
section: API Reference
order: 1
---
A **dataset** in Pictograph is a project: a collection of images that
share a class set. Datasets are unique by `(organization, name)`.
The SDK strongly prefers name-based lookups, because agents usually hold
the name a user gave them and resolving a UUID first adds friction.

```python
from pictograph import Client
client = Client()
```

## list

Single-page list of datasets in your organization.

```python
datasets = client.datasets.list(limit=100)
for ds in datasets:
    print(ds.name, ds.image_count)
```

| Arg | Type | Default | Notes |
|---|---|---|---|
| `limit` | `int` | `100` | Backend cap: 1000 |

Returns `list[Dataset]`.

## iter

Auto-paging iterator over every dataset.

```python
for ds in client.datasets.iter(page_size=100):
    print(ds.name)

# Or materialize:
all_datasets = client.datasets.iter().all()
```

| Arg | Type | Default | Notes |
|---|---|---|---|
| `page_size` | `int` | `100` | Items per backend round-trip |
| `max_total` | `int \| None` | `None` | Stop after this many items |

Returns `OffsetPager[Dataset]`.
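
To make the roles of `page_size` and `max_total` concrete, here is a minimal sketch of offset-based auto-paging. It is not the SDK's `OffsetPager` implementation: `iter_offset_pages`, `fetch_page`, and the fake backend are hypothetical names used only for illustration, and the "short page means done" stop rule is an assumption.

```python
from typing import Callable, Iterator, List, Optional, TypeVar

T = TypeVar("T")


def iter_offset_pages(
    fetch_page: Callable[[int, int], List[T]],  # hypothetical: (offset, limit) -> items
    page_size: int = 100,
    max_total: Optional[int] = None,
) -> Iterator[T]:
    """Yield items page by page until the backend runs dry or max_total is hit."""
    offset = 0
    yielded = 0
    while True:
        # Skip the round-trip entirely once the cap has been reached.
        if max_total is not None and yielded >= max_total:
            return
        page = fetch_page(offset, page_size)
        for item in page:
            if max_total is not None and yielded >= max_total:
                return
            yield item
            yielded += 1
        # Assumption: a short (or empty) page means nothing is left to fetch.
        if len(page) < page_size:
            return
        offset += len(page)


# Stand-in backend holding 250 fake dataset names.
_FAKE = [f"dataset-{i}" for i in range(250)]


def fake_fetch(offset: int, limit: int) -> List[str]:
    return _FAKE[offset : offset + limit]


names = list(iter_offset_pages(fake_fetch, page_size=100))
capped = list(iter_offset_pages(fake_fetch, page_size=100, max_total=120))
```

With the stand-in backend, `names` holds all 250 items after three round-trips, while `capped` stops partway through the second page.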

## get

Fetch a dataset by name. Names are case-sensitive within an organization.

```python
ds = client.datasets.get("road-signs", include_images=True, images_limit=200)
print(ds.image_count, len(ds.images))
```

| Arg | Type | Default | Notes |
|---|---|---|---|
| `name` | `str` | required | Dataset name |
| `include_images` | `bool` | `False` | Embed first `images_limit` `DatasetImage` summaries |
| `images_limit` | `int` | `1000` | Backend cap: 10000 |
| `images_offset` | `int` | `0` | Page the embedded image list |

Returns `Dataset`.
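
When a dataset holds more images than one `images_limit` window, `images_offset` can be advanced in a loop. The helper below is a sketch, not part of the SDK: `iter_embedded_images` is a hypothetical name, and it assumes that an embedded list shorter than `images_limit` marks the last page. A stand-in `fake_get` replaces `client.datasets.get` so the block runs without a backend.

```python
from types import SimpleNamespace
from typing import Any, Callable, Iterator


def iter_embedded_images(
    get_dataset: Callable[..., Any],  # e.g. client.datasets.get
    name: str,
    page_size: int = 1000,
) -> Iterator[Any]:
    """Walk the embedded image list by advancing images_offset."""
    offset = 0
    while True:
        ds = get_dataset(
            name,
            include_images=True,
            images_limit=page_size,
            images_offset=offset,
        )
        if not ds.images:
            return
        yield from ds.images
        # Assumption: a short page means the list is exhausted.
        if len(ds.images) < page_size:
            return
        offset += len(ds.images)


# Stand-in for client.datasets.get, backed by 25 fake image names.
_IMAGES = [f"img-{i}.jpg" for i in range(25)]


def fake_get(name, include_images=False, images_limit=1000, images_offset=0):
    images = _IMAGES[images_offset : images_offset + images_limit] if include_images else []
    return SimpleNamespace(name=name, image_count=len(_IMAGES), images=images)


all_images = list(iter_embedded_images(fake_get, "road-signs", page_size=10))
```

Against the real client the call would look like `iter_embedded_images(client.datasets.get, "road-signs")`.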

## get_by_id

UUID lookup. Use only when you already have the ID.

```python
ds = client.datasets.get_by_id("a3e12f...")
```

## download

Bulk-download images, annotations, or both to a local directory. The SDK
fetches a batch of signed download URLs in one call, then downloads the
files in parallel via a thread pool.

```python
report = client.datasets.download(
    "road-signs",
    output_dir="./dump",
    mode="full",                     # "full" | "images_only" | "annotations_only"
    status_filter="complete",        # restrict to annotation-finalised images
    max_workers=10,
    progress=lambda done, total, fn: print(f"{done}/{total} {fn}"),
)
print(report.images_downloaded, report.annotations_downloaded, len(report.failures))
```

Returns a `DownloadReport`. The call does **not** raise on individual
file errors, so inspect `.failures` and retry that subset.
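
The pattern described above (parallel downloads through a thread pool, with per-file errors collected rather than raised) can be sketched with the standard library. This is an illustration under stated assumptions, not the SDK's code: `SketchReport`, `download_all`, and `fake_fetch` are hypothetical, the shape of the real `DownloadReport.failures` entries is not documented here, and writing files to disk is omitted.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class SketchReport:  # hypothetical stand-in, not the SDK's DownloadReport
    downloaded: int = 0
    failures: List[Tuple[str, str]] = field(default_factory=list)  # (filename, error)


def download_all(
    urls: Dict[str, str],                 # filename -> signed URL
    fetch: Callable[[str], bytes],        # performs one download
    max_workers: int = 10,
    progress: Optional[Callable[[int, int, str], None]] = None,
) -> SketchReport:
    """Download every URL in parallel; record failures instead of raising."""
    report = SketchReport()
    total = len(urls)
    done = 0
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch, url): fn for fn, url in urls.items()}
        for fut in as_completed(futures):
            fn = futures[fut]
            done += 1
            try:
                fut.result()
                report.downloaded += 1
            except Exception as exc:  # one bad file must not abort the batch
                report.failures.append((fn, str(exc)))
            if progress is not None:
                progress(done, total, fn)
    return report


def fake_fetch(url: str) -> bytes:
    # Stand-in downloader: URLs ending in "bad" simulate a failed transfer.
    if url.endswith("bad"):
        raise RuntimeError("simulated failure")
    return b"data"


urls = {
    "a.jpg": "https://example.invalid/a",
    "b.jpg": "https://example.invalid/bad",
    "a.json": "https://example.invalid/a-annotations",
}
calls = []
report = download_all(urls, fake_fetch, max_workers=4,
                      progress=lambda d, t, fn: calls.append((d, t, fn)))

# Retry only the files that failed.
retry_urls = {fn: urls[fn] for fn, _ in report.failures}
```

Collecting failures in the report keeps one expired URL or network blip from discarding the files that did download; the caller decides whether to retry `retry_urls` or give up.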

## Project CRUD

Project create, update, and delete live on the [`projects`](/docs/api-reference/projects.md)
resource. Both resources address the same underlying entity: "dataset" is
the SDK alias for the read paths and "project" is the alias for the write paths.

```python
proj = client.projects.create("new-dataset", description="…")
client.projects.update("new-dataset", description="updated")
client.projects.delete("new-dataset")
```

## Common errors

| Status | Exception | Cause |
|---|---|---|
| 404 | `NotFoundError` | Name doesn't exist (case-sensitive) or belongs to another org |
| 409 | `ConflictError` | `create` with a duplicate name |
| 403 | `ForbiddenError` | `delete` requires the `admin` role or higher |

## REST equivalent

```bash
curl -H "X-API-Key: pk_live_…" \
  "https://api.pictograph.io/api/v1/developer/datasets/?limit=10"
```
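
The same request can be built from Python without the SDK. The sketch below uses only the standard library, with the URL and `X-API-Key` header taken from the curl example. `build_list_request` and `list_datasets_raw` are hypothetical helper names, `YOUR_API_KEY` is a placeholder, and the response is returned as decoded JSON without assuming its shape.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.pictograph.io/api/v1/developer/datasets/"


def build_list_request(api_key: str, limit: int = 10) -> urllib.request.Request:
    """Build (but do not send) the GET request for the dataset list."""
    query = urllib.parse.urlencode({"limit": limit})
    return urllib.request.Request(
        f"{BASE_URL}?{query}",
        headers={"X-API-Key": api_key},
    )


def list_datasets_raw(api_key: str, limit: int = 10):
    """Send the request and return the decoded JSON body as-is."""
    req = build_list_request(api_key, limit)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


# Building the request needs no network access; "YOUR_API_KEY" is a placeholder.
req = build_list_request("YOUR_API_KEY", limit=10)
```

Calling `list_datasets_raw` with a real key performs the network round-trip; `urllib` raises `urllib.error.HTTPError` on non-2xx responses.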