Managing Datasets

Datasets group ground-truth items by modality and challenge type. A dataset can contain thousands of items. Multiple API keys can draw from the same dataset.

Create a dataset

hiveguard datasets create "My Dataset" --modality image

Options:

Flag             Type   Description
---------------- ------ ------------------------------------------------
--modality       string image, text, or audio (default: image)
--challenge-type string Default challenge type for items in this dataset
--prompt         string Default prompt shown to solvers
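
As a rough sketch, the flags in the table above can be combined in a small wrapper that rejects an unsupported modality before the CLI is called. The function name `create_dataset` is hypothetical and not part of hiveguard. The three accepted modality values and the `image` default are taken from the table. `--challenge-type` is left out because its accepted values are not listed here.

```shell
#!/bin/sh
# Hypothetical helper (not part of hiveguard): validate the modality locally,
# then create the dataset with the documented flags.
# Usage: create_dataset NAME [MODALITY] [PROMPT]
create_dataset() {
  name=$1
  modality=${2:-image}   # image is the documented default
  prompt=${3:-}

  case "$modality" in
    image|text|audio) ;;
    *)
      echo "unsupported modality: $modality" >&2
      return 2
      ;;
  esac

  # Pass --prompt only when a prompt was supplied.
  if [ -n "$prompt" ]; then
    hiveguard datasets create "$name" --modality "$modality" --prompt "$prompt"
  else
    hiveguard datasets create "$name" --modality "$modality"
  fi
}
```

For example, `create_dataset "News Headlines" text "Is this headline real?"` would run the create command with both flags set. The prompt text is only an illustration.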

List datasets

hiveguard datasets list

Output (table format):

ID                                   Name            Modality Items
------------------------------------ --------------- -------- -----
00000000-0000-0000-0000-000000000001 ImageNet Cats   image    2500
00000000-0000-0000-0000-000000000002 News Headlines  text     800
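
If the table output needs to feed a script, one option is to filter it with `awk`. The sketch below assumes the layout matches the sample above exactly: one header row, one separator row, then one dataset per row. The helper name `dataset_ids_by_modality` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the IDs of datasets with a given modality.
# Reads the table on stdin. Dataset names may contain spaces, so fields are
# addressed from the ends of each row: the ID is the first field and the
# modality is the second-to-last field.
dataset_ids_by_modality() {
  awk -v want="$1" 'NR > 2 && NF >= 4 && $(NF - 1) == want { print $1 }'
}
```

It could be used as `hiveguard datasets list | dataset_ids_by_modality image`. Parsing a human-readable table is fragile, so the filter may need adjusting if the output layout differs from the sample.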

Inspect a dataset

hiveguard datasets show DATASET_ID

Shows item count, creation date, challenge type, and label statistics.

Export a dataset’s labels

hiveguard datasets export DATASET_ID --fmt csv --output labels.csv

The export is streamed, so it is safe to run on large datasets. See Exporting Labels for format details.

Delete a dataset

hiveguard datasets delete DATASET_ID

You’ll be prompted to confirm:

Delete dataset "ImageNet Cats" (2500 items)? [y/N]: y
Deleted.

Deleting a dataset removes all its items and labels. This cannot be undone.
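
Because deletion is permanent, it may be worth exporting the labels first. The sketch below chains the `export` and `delete` commands shown on this page and skips the delete if the export fails or leaves an empty file. The function name `backup_and_delete` and the `labels-DATASET_ID.csv` file naming are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: export labels to a CSV backup, then delete the dataset.
# Usage: backup_and_delete DATASET_ID [BACKUP_DIR]
backup_and_delete() {
  id=$1
  backup="${2:-.}/labels-$id.csv"

  # Stop here if the export command itself fails.
  if ! hiveguard datasets export "$id" --fmt csv --output "$backup"; then
    echo "export failed; dataset $id was not deleted" >&2
    return 1
  fi

  # Stop here if the backup file is missing or empty.
  if [ ! -s "$backup" ]; then
    echo "backup $backup is empty; dataset $id was not deleted" >&2
    return 1
  fi

  # The CLI still asks for its own y/N confirmation at this point.
  hiveguard datasets delete "$id"
}
```

The empty-file check is deliberately conservative: a dataset whose export produces no output would have to be deleted by hand with the plain `delete` command.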

Attach a dataset to an API key

When creating an API key, you can restrict it to specific datasets:

hiveguard keys create "production-key" --dataset DATASET_ID

Requests with that key can only draw challenges from the allowed datasets.

Dashboard

All dataset operations are also available in the dashboard under the Datasets tab.