Search
Find images by SigLIP2 cosine similarity to a reference, or by automatic content tags (objects / scenes / attributes).
Two search modes:
- Visual similarity: by_similarity(), backed by SigLIP2 (1152-dim) embeddings and a pgvector HNSW index.
- Tag-based: by_tag(), using JSONB containment over the auto-classified image_auto_tags field (objects / scenes / attributes).
Both auto-tag and embedding pipelines run on every upload (zero API cost; T4 GPU). No setup required.
from pictograph import Client
client = Client()
by_similarity
Find images visually similar to a reference image. Scope is the reference image’s dataset + folder unless overridden.
results = client.search.by_similarity(
    image_id="img-uuid-1",
    threshold=0.6,      # cosine similarity floor (0–1)
    limit=50,
    folder_path=None,   # None = inherit; "/" = whole dataset
)
for r in results:
    print(r.image_id, r.filename, f"{r.similarity:.3f}")
| Arg | Type | Default | Notes |
|---|---|---|---|
| image_id | str | required | UUID of the reference image |
| threshold | float | 0.6 | Minimum cosine similarity (0.6 ≈ “visually related”) |
| limit | int | 50 | Backend cap: 500 |
| folder_path | str \| None | None | Override folder scope |
Returns list[SimilarImage], sorted by descending similarity. The
source image is excluded from results.
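To make the threshold, limit, and ordering semantics concrete, here is a minimal local sketch in plain Python. It is not the service's implementation (which, per this page, relies on pgvector with an HNSW index); the names `cosine_similarity` and `rank_similar`, the toy 3-dimensional vectors, and the choice of an inclusive `>=` comparison for the threshold are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_similar(
    embeddings: Dict[str, Sequence[float]],
    reference_id: str,
    threshold: float = 0.6,
    limit: int = 50,
) -> List[Tuple[str, float]]:
    """Hypothetical helper: score every other image against the reference,
    keep scores at or above the threshold, sort descending, apply the limit."""
    reference = embeddings[reference_id]
    scored = [
        (image_id, cosine_similarity(reference, vec))
        for image_id, vec in embeddings.items()
        if image_id != reference_id  # the source image is excluded
    ]
    kept = [pair for pair in scored if pair[1] >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:limit]


# Toy embeddings (real SigLIP2 vectors are 1152-dimensional).
toy = {
    "ref": [1.0, 0.0, 0.0],
    "near": [0.9, 0.1, 0.0],
    "mid": [0.7, 0.7, 0.0],
    "far": [0.0, 1.0, 0.0],
}
ranked = rank_similar(toy, "ref", threshold=0.6, limit=50)
for image_id, score in ranked:
    print(image_id, f"{score:.3f}")
```

In this toy data, "far" is orthogonal to the reference and falls below the floor, so only "near" and "mid" come back, in that order.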
by_tag
Find images with auto-tags matching the given filters. Pass at least
one of objects / scenes / attributes (an empty filter returns
nothing rather than everything — semantically clearer for agents).
results = client.search.by_tag(
    objects=["car", "truck"],    # match ANY object tag
    scenes=["outdoor"],          # match ANY scene tag
    attributes=["blurry"],       # match ANY attribute tag
    dataset_name="my-dataset",   # restrict scope; None = whole org
    limit=100,
)
for r in results:
    print(r.image_id, r.tags["objects"])
| Arg | Type | Default | Notes |
|---|---|---|---|
| objects | Sequence[str] \| None | None | At least one of objects/scenes/attributes required |
| scenes | Sequence[str] \| None | None | |
| attributes | Sequence[str] \| None | None | |
| dataset_name | str \| None | None | Org-wide search if None |
| limit | int | 50 | Backend cap: 500 |
Returns list[TaggedImage]. Within a category, tags are OR’d; across
categories they are AND’d:
- objects=["car","truck"] → “car OR truck”
- objects=["car"], scenes=["outdoor"] → “car AND outdoor”
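The OR-within, AND-across rule can be sketched locally as a small predicate. This is an illustration of the matching logic only, not the backend query; the function name `matches` is hypothetical, and the tag layout (a dict with one list of strings per category) is an assumption inferred from the `r.tags["objects"]` access shown earlier.

```python
from typing import Dict, List, Optional, Sequence


def matches(
    tags: Dict[str, List[str]],
    objects: Optional[Sequence[str]] = None,
    scenes: Optional[Sequence[str]] = None,
    attributes: Optional[Sequence[str]] = None,
) -> bool:
    """Hypothetical predicate: True if the image's tags satisfy every
    supplied category (AND), where a category is satisfied by any one
    of its requested tags (OR)."""
    filters = {"objects": objects, "scenes": scenes, "attributes": attributes}
    # Ignore categories that were not supplied (None) or are empty.
    active = {category: wanted for category, wanted in filters.items() if wanted}
    if not active:
        return False  # an empty filter matches nothing, not everything
    return all(
        any(tag in tags.get(category, []) for tag in wanted)
        for category, wanted in active.items()
    )


image_tags = {"objects": ["car", "person"], "scenes": ["outdoor"], "attributes": []}

print(matches(image_tags, objects=["car", "truck"]))              # True: car OR truck
print(matches(image_tags, objects=["car"], scenes=["outdoor"]))   # True: car AND outdoor
print(matches(image_tags, objects=["car"], scenes=["indoor"]))    # False: scene does not match
print(matches(image_tags))                                        # False: no filter given
```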
Auto-tag taxonomy
The SigLIP2 classifier picks from ~200 curated labels per category. Common ones:
- objects: car, truck, person, bicycle, dog, sign, building, etc.
- scenes: outdoor, indoor, urban, rural, daytime, nighttime, etc.
- attributes: blurry, dark, bright, high-contrast, low-light, etc.
The full taxonomy ships with the SigLIP2 service prompts; tags not in the curated list won’t be assigned.
Cost
Search is free. Embeddings + auto-tags are computed once per image on upload (T4 GPU, zero API cost) and cached.
Common errors
| Status | Exception | Cause |
|---|---|---|
| 404 | NotFoundError | image_id (by_similarity) or dataset_name (by_tag) not found |
| 422 | ValidationError | by_tag called with all three filters None |
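A caller may prefer to catch the 422 case before making a request. The sketch below is a client-side pre-check, not part of the pictograph client: `check_tag_query` and `BACKEND_LIMIT_CAP` are hypothetical names, it raises the built-in `ValueError` rather than the library's `ValidationError`, and clamping `limit` to 500 is a choice of this sketch (the page states the cap but not whether the backend clamps or rejects larger values).

```python
from typing import Optional, Sequence

BACKEND_LIMIT_CAP = 500  # "Backend cap: 500" from the argument tables


def check_tag_query(
    objects: Optional[Sequence[str]] = None,
    scenes: Optional[Sequence[str]] = None,
    attributes: Optional[Sequence[str]] = None,
    limit: int = 50,
) -> int:
    """Hypothetical pre-check: reject a query with all three filters None
    and return a limit clamped to the documented backend cap."""
    if objects is None and scenes is None and attributes is None:
        raise ValueError("pass at least one of objects, scenes, or attributes")
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return min(limit, BACKEND_LIMIT_CAP)


safe_limit = check_tag_query(objects=["car"], limit=1000)
print(safe_limit)  # 500
```

The returned value can then be passed as `limit` to `by_tag` alongside the same filters.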