Search

Find images by SigLIP2 cosine similarity to a reference, or by automatic content tags (objects / scenes / attributes).

Two search modes:

Visual similarity — by_similarity() — SigLIP2 (1152-dim) embeddings + pgvector HNSW index.
Tag-based — by_tag() — JSONB containment over the auto-classified image_auto_tags field (objects / scenes / attributes).

Both the auto-tag and embedding pipelines run on every upload (on a T4 GPU, at zero API cost), so no setup is required.

Every example below shows the Python SDK call and the equivalent raw REST request. The REST examples authenticate with an X-API-Key header; set PICTOGRAPH_API_KEY in your shell to copy-and-run them. Both REST endpoints are GET with query parameters.

from pictograph import Client
client = Client()  # reads PICTOGRAPH_API_KEY

by_similarity

Find images visually similar to a reference image. By default the search is scoped to the reference image’s dataset and folder; pass folder_path to override the folder scope.

| Arg | Type | Default | Notes |
| --- | --- | --- | --- |
| `image_id` | `str` | required | UUID of the reference image |
| `threshold` | `float` | `0.6` | Minimum cosine similarity (`0.6` ≈ “visually related”) |
| `limit` | `int` | `50` | Backend cap: 500 |
| `folder_path` | `str \| None` | `None` | Override folder scope |

results = client.search.by_similarity(
    image_id="img-uuid-1",
    threshold=0.6,                   # cosine similarity floor (0–1)
    limit=50,
    folder_path=None,                # None = inherit; "/" = whole dataset
)
for r in results:
    print(r.id, r.filename, f"{r.similarity:.3f}")

curl -s "https://api.pictograph.io/api/v1/developer/search/similar?image_id=img-uuid-1&threshold=0.6&limit=50" \
  -H "X-API-Key: $PICTOGRAPH_API_KEY"

Returns list[SimilarImage], sorted by descending similarity. The source image is excluded from results.
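To make the threshold and ordering concrete, here is a minimal brute-force sketch of the ranking described above. It is not the service's implementation: the real search uses 1152-dim SigLIP2 embeddings behind an approximate HNSW index, whereas this uses toy 2-dim vectors and an exhaustive scan. The names `cosine_similarity` and `rank_similar` are hypothetical, and treating the threshold as inclusive is an assumption.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_similar(reference_id, embeddings, threshold=0.6, limit=50):
    """Hypothetical helper: exhaustive scan that mimics by_similarity's contract."""
    reference = embeddings[reference_id]
    scored = [
        (image_id, cosine_similarity(reference, vector))
        for image_id, vector in embeddings.items()
        if image_id != reference_id  # the source image is excluded
    ]
    # Assumption: the threshold is an inclusive floor.
    kept = [pair for pair in scored if pair[1] >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)  # descending similarity
    return kept[:limit]


# Toy 2-dim embeddings, for illustration only.
embeddings = {
    "ref": [1.0, 0.0],
    "near": [0.9, 0.1],
    "mid": [0.5, 0.5],
    "far": [0.0, 1.0],
}
ranked = rank_similar("ref", embeddings)
for image_id, score in ranked:
    print(image_id, f"{score:.3f}")
```

With the default threshold of 0.6, `near` and `mid` are kept and `far` (orthogonal to the reference) is dropped. Because HNSW is an approximate index, the service may occasionally miss a neighbour that an exhaustive scan like this one would return.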

by_tag

Find images with auto-tags matching the given filters. Pass at least one of objects / scenes / attributes. An empty filter is rejected with a 422 rather than returning every image, which is semantically clearer for agents.

| Arg | Type | Default | Notes |
| --- | --- | --- | --- |
| `objects` | `Sequence[str] \| None` | `None` | At least one of objects/scenes/attributes required |
| `scenes` | `Sequence[str] \| None` | `None` | |
| `attributes` | `Sequence[str] \| None` | `None` | |
| `dataset_name` | `str \| None` | `None` | Org-wide search if `None` |
| `limit` | `int` | `50` | Backend cap: 500 |
| `offset` | `int` | `0` | Pagination offset |

results = client.search.by_tag(
    objects=["car", "truck"],            # require BOTH object tags
    scenes=["outdoor"],                  # require this scene tag
    attributes=["blurry"],               # require this attribute tag
    dataset_name="my-dataset",           # restrict scope; None = whole org
    limit=100,
)
for r in results:
    print(r.id, r.image_auto_tags["objects"])

curl -s "https://api.pictograph.io/api/v1/developer/search/by-tags?objects=car&objects=truck&scenes=outdoor&attributes=blurry&dataset_name=my-dataset&limit=100" \
  -H "X-API-Key: $PICTOGRAPH_API_KEY"

Returns list[TaggedImage]. Filters use JSONB containment, so tags are AND’d both within and across categories — every listed tag must be present:

objects=["car","truck"] → “car AND truck”
objects=["car"], scenes=["outdoor"] → “car AND outdoor”
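The AND semantics can be sketched locally as a subset check per category. This is only an illustration of the matching rule, not the backend query. It assumes `image_auto_tags` is a dict mapping each category name to a list of strings, and the helper name `matches` is hypothetical.

```python
def matches(image_auto_tags, objects=None, scenes=None, attributes=None):
    """Hypothetical helper: True if every requested tag is present on the image."""
    requested = {"objects": objects, "scenes": scenes, "attributes": attributes}
    if not any(requested.values()):
        # Mirrors the rule that at least one filter must be supplied.
        raise ValueError("pass at least one of objects / scenes / attributes")
    return all(
        set(wanted) <= set(image_auto_tags.get(category, []))
        for category, wanted in requested.items()
        if wanted  # categories that were not requested are ignored
    )


# Assumed shape of the image_auto_tags field, for illustration only.
street_photo = {
    "objects": ["car", "truck", "person"],
    "scenes": ["outdoor", "urban"],
    "attributes": ["bright"],
}

print(matches(street_photo, objects=["car", "truck"]))            # car AND truck
print(matches(street_photo, objects=["car"], scenes=["indoor"]))  # car AND indoor
```

The first call matches because both object tags are present; the second does not, because the scene filter fails even though the object filter passes.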

Repeat the query key per value in REST (objects=car&objects=truck).
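When building the REST URL in code rather than by hand, the standard library's `urlencode` with `doseq=True` emits one `key=value` pair per list item, which matches the repeated-key form above. The sketch below is a hypothetical helper (`build_by_tags_url` is not part of the SDK), and it assumes the REST endpoint accepts `offset` as a query parameter in the same way the SDK does.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.pictograph.io/api/v1/developer/search/by-tags"


def build_by_tags_url(objects=None, scenes=None, attributes=None,
                      dataset_name=None, limit=50, offset=0):
    """Hypothetical helper: build the by-tags URL with repeated query keys."""
    params = {
        "objects": objects,
        "scenes": scenes,
        "attributes": attributes,
        "dataset_name": dataset_name,
        "limit": limit,
        "offset": offset,  # assumption: REST takes offset like the SDK does
    }
    # Drop filters that were not supplied so they are absent from the URL.
    params = {key: value for key, value in params.items() if value is not None}
    # doseq=True expands each list into repeated key=value pairs.
    return f"{BASE_URL}?{urlencode(params, doseq=True)}"


url = build_by_tags_url(
    objects=["car", "truck"],
    scenes=["outdoor"],
    attributes=["blurry"],
    dataset_name="my-dataset",
    limit=100,
)
print(url)
```

Requesting the next page is then a matter of calling the helper again with `offset=100`.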

Auto-tag taxonomy

The SigLIP2 classifier picks from ~200 curated labels per category. Common ones:

objects: car, truck, person, bicycle, dog, sign, building, etc.
scenes: outdoor, indoor, urban, rural, daytime, nighttime, etc.
attributes: blurry, dark, bright, high-contrast, low-light, etc.

The full taxonomy ships with the SigLIP2 service prompts; tags not in the curated list won’t be assigned.

Cost

Search is free. Embeddings + auto-tags are computed once per image on upload (T4 GPU, zero API cost) and cached.

Common errors

| Status | Exception | Cause |
| --- | --- | --- |
| 404 | `NotFoundError` | `image_id` (similarity) or `dataset_name` (tag) does not exist |
| 400 | `ValidationError` | `by_tag` called with all three filters `None` |
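For callers using raw REST instead of the SDK, the statuses above can be handled with the standard library alone. This is a rough sketch under assumptions: the helper names `classify_status` and `get_json` are hypothetical, the response body is assumed to be JSON, and 422 is grouped with 400 as a validation failure as a defensive choice.

```python
import json
import os
import urllib.error
import urllib.request


def classify_status(status):
    """Hypothetical helper: map an HTTP status to a coarse error category."""
    if status == 404:
        return "not_found"    # unknown image_id or dataset_name
    if status in (400, 422):
        return "validation"   # e.g. by-tags called without any filter
    return "other"


def get_json(url):
    """Hypothetical helper: GET a search URL and decode the assumed-JSON body."""
    request = urllib.request.Request(
        url, headers={"X-API-Key": os.environ["PICTOGRAPH_API_KEY"]}
    )
    try:
        with urllib.request.urlopen(request) as response:
            return json.load(response)
    except urllib.error.HTTPError as err:
        kind = classify_status(err.code)
        raise RuntimeError(f"search request failed ({err.code}, {kind})") from err


print(classify_status(404), classify_status(400), classify_status(500))
```

A caller would typically fix the request and retry on `validation`, and treat `not_found` as a signal to re-check the image ID or dataset name.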