Challenges

A challenge is a short labeling task drawn from your dataset. HiveGuard presents it to a visitor passing through your site, and the visitor’s answer becomes a vote toward a training label.

Each challenge contains two items:

A ground-truth item — an item you already know the answer to. This confirms the visitor is a real human paying attention.
An unknown item — an item from your dataset that you want labeled. This is the useful work.

The visitor sees them together as a single task and cannot tell which is which. If they answer the ground-truth item correctly, their answer on the unknown item is recorded as a candidate label. If they get the ground-truth item wrong, their answer on the unknown item is discarded as noise.
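The gating rule above can be sketched in a few lines. This is a minimal illustration, not HiveGuard's implementation: the function name `candidate_label` and the use of plain string answers are assumptions made for the example.

```python
from typing import Optional


def candidate_label(gt_answer: str, gt_expected: str, unknown_answer: str) -> Optional[str]:
    """Return the unknown-item answer only when the ground-truth answer is correct.

    All names here are hypothetical; answers are modeled as plain strings.
    """
    if gt_answer == gt_expected:
        # Visitor passed the attention check: keep their other answer.
        return unknown_answer
    # Visitor failed the attention check: treat the other answer as noise.
    return None


kept = candidate_label("yes", "yes", "no")      # ground truth correct
dropped = candidate_label("no", "yes", "no")    # ground truth wrong
```

Here `kept` holds the visitor's answer for the unknown item, while `dropped` is `None` because the ground-truth check failed.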

What a challenge looks like

{
  "challenge_id": "chal_abc123",
  "items": [
    {
      "slot": "a",
      "data_ref": "https://cdn.example.com/img001.jpg",
      "modality": "image",
      "prompt": "Does this image contain a cat?"
    },
    {
      "slot": "b",
      "data_ref": "https://cdn.example.com/img002.jpg",
      "modality": "image",
      "prompt": "Does this image contain a cat?"
    }
  ],
  "expires_at": "2024-01-01T12:05:00Z"
}
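For readers consuming this payload in code, the following sketch parses it into typed objects. The field names come from the example above; the class names `Challenge` and `ChallengeItem` and the function `parse_challenge` are hypothetical and not part of any HiveGuard SDK.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class ChallengeItem:
    slot: str
    data_ref: str
    modality: str
    prompt: str


@dataclass
class Challenge:
    challenge_id: str
    items: List[ChallengeItem]
    expires_at: str  # ISO 8601 timestamp, kept as a string here


def parse_challenge(raw: str) -> Challenge:
    """Parse a challenge JSON document shaped like the example above."""
    doc = json.loads(raw)
    return Challenge(
        challenge_id=doc["challenge_id"],
        items=[ChallengeItem(**item) for item in doc["items"]],
        expires_at=doc["expires_at"],
    )
```

Note that nothing in the parsed object says which slot holds the ground-truth item; the client only ever sees slots `a` and `b`.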

Slot assignment (which item is the ground truth and which is the unknown) is randomized server-side and never exposed to the client. This prevents bots from replaying known-good answers.
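One way a server could do this is to shuffle the two items, hand out slots `a` and `b`, and keep the slot-to-role mapping in a private record. The sketch below is an assumption about how such a step might look, not HiveGuard's actual code; `build_challenge` and the `roles` record are hypothetical names.

```python
import random
import secrets
from typing import Optional, Tuple


def build_challenge(gt_item: dict, unknown_item: dict,
                    rng: Optional[random.Random] = None) -> Tuple[dict, dict]:
    """Return (public_payload, private_record) with randomized slots.

    Only public_payload is sent to the client. private_record stays on
    the server and remembers which slot holds the ground-truth item.
    """
    if rng is None:
        rng = random.SystemRandom()  # OS-backed randomness by default

    tagged = [("ground_truth", gt_item), ("unknown", unknown_item)]
    rng.shuffle(tagged)

    public_items = []
    roles = {}
    for slot, (role, item) in zip(["a", "b"], tagged):
        public_items.append({"slot": slot, **item})  # no role field leaks out
        roles[slot] = role

    challenge_id = "chal_" + secrets.token_hex(6)
    public_payload = {"challenge_id": challenge_id, "items": public_items}
    private_record = {"challenge_id": challenge_id, "roles": roles}
    return public_payload, private_record
```

Because the role mapping lives only in the private record, a client that inspects the payload learns nothing about which answer is being checked.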

Challenge types

Modality	Task	Example prompt
`image`	Classify an image	“Does this image contain a cat?”
`text`	Classify a text snippet	“Is this review positive or negative?”
`audio`	Classify an audio clip	“Is the speaker male or female?”
`grid_select`	Select cells in an image	“Select all cells containing a road sign”
`select_all`	Select matching items from a grid	“Select all images containing a bicycle”

What happens after submission

When a visitor submits answers:

The ground-truth answer is checked. Correct → proceed. Incorrect → the session ends, no label recorded.
The unknown item answer is recorded as a vote.
When enough votes accumulate across visitors, the consensus engine finalizes a label.

Labels are not final until consensus is reached. A single visitor’s answer is a vote, not a label.
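To make the vote-versus-label distinction concrete, here is a minimal sketch of a consensus rule. The source does not specify how the consensus engine decides, so the rule below (a minimum vote count plus a minimum agreement fraction) and both thresholds are assumptions chosen purely for illustration.

```python
from collections import Counter
from typing import Optional, Sequence

MIN_VOTES = 3         # hypothetical: votes required before a label can be finalized
MIN_AGREEMENT = 0.6   # hypothetical: fraction of votes the top answer must hold


def finalize_label(votes: Sequence[str]) -> Optional[str]:
    """Return the consensus label, or None if consensus is not reached yet."""
    if len(votes) < MIN_VOTES:
        return None  # too few votes: still just votes, not a label
    label, count = Counter(votes).most_common(1)[0]
    if count / len(votes) >= MIN_AGREEMENT:
        return label
    return None  # enough votes, but visitors disagree too much
```

Under these assumed thresholds, a single vote never produces a label, and three votes split two to one finalize the majority answer.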

Challenge expiry

Challenges expire after a configurable TTL (default: 5 minutes). This prevents pre-computed answers from being replayed. If a visitor takes too long, they get a fresh challenge.
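An expiry check against the `expires_at` field could look like the sketch below. The helper name `is_expired` is hypothetical, and the code assumes the timestamp is always ISO 8601 in UTC with a trailing `Z`, as in the example payload.

```python
from datetime import datetime, timezone
from typing import Optional


def is_expired(expires_at: str, now: Optional[datetime] = None) -> bool:
    """Return True once the current time has reached the challenge's expiry."""
    # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11 on,
    # so normalize it to an explicit UTC offset first.
    expiry = datetime.fromisoformat(expires_at.replace("Z", "+00:00"))
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= expiry


# Example with a fixed clock, one second before the expiry in the sample payload.
still_valid = not is_expired(
    "2024-01-01T12:05:00Z",
    now=datetime(2024, 1, 1, 12, 4, 59, tzinfo=timezone.utc),
)
```

Passing `now` explicitly keeps the check easy to test; in production the default of the current UTC time would apply.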

The visitor experience

A visitor hitting your site sees the challenge widget. It’s a compact task, typically solvable in 2–5 seconds. They answer it, the request goes through, and in the background a vote has been recorded. The visitor did useful annotation work without knowing or caring.