
Ground Truth Items

Ground-truth items are items in your dataset for which you already know the correct answer. They are not what you want labeled — they are the quality filter that keeps your labels trustworthy.

Every challenge pairs one unknown item (what you want labeled) with one ground-truth item (what you already know). If the visitor gets the ground-truth item wrong, their answer on the unknown item is discarded. This filters out random clickers, bots, and inattentive visitors without any manual review.

Why they matter

Without ground-truth items, you have no way to distinguish a genuine human answer from a random guess. Anyone could click through challenges without reading them, and their answers would pollute your dataset.

Ground-truth items solve this by creating a simple contract: answer this one correctly (proving you looked at it), and your answer on the other one counts. Fail the quality check, and nothing is recorded.

What makes a good ground-truth item

  • Unambiguous — the correct answer should be clear to any attentive human in under 5 seconds. Borderline items cause high fail rates (see Labeler Frustration).
  • Varied — a mix of positive and negative examples prevents pattern-matching (always clicking “yes” to pass).
  • Domain-relevant — items from your actual use case are harder for bots trained on generic tasks.

Item lifecycle

You upload item with known_answer = "cat"
Item enters the ground-truth pool
Challenge generated: this item paired with an unknown item
Visitor answers both
├─ Visitor says "cat" → correct → unknown item answer recorded
└─ Visitor says "dog" → incorrect → session ends, no label recorded
pass_count or fail_count incremented
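
The lifecycle above can be sketched in a few lines of Python. This is an illustration only, not HiveGuard's implementation: the GroundTruthItem class and the score_challenge function are hypothetical names, and only known_answer, pass_count, and fail_count come from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GroundTruthItem:
    """Hypothetical record for an item whose correct answer is already known."""
    item_id: str
    known_answer: str
    pass_count: int = 0
    fail_count: int = 0


def score_challenge(gt_item: GroundTruthItem, gt_answer: str,
                    unknown_answer: str) -> Optional[str]:
    """Return the unknown-item label to record, or None if the check failed."""
    if gt_answer == gt_item.known_answer:
        gt_item.pass_count += 1
        return unknown_answer  # check passed: the visitor's label counts
    gt_item.fail_count += 1
    return None  # check failed: nothing is recorded


item = GroundTruthItem(item_id="gt-1", known_answer="cat")
print(score_challenge(item, "cat", "dog"))  # dog
print(score_challenge(item, "dog", "cat"))  # None
print(item.pass_count, item.fail_count)     # 1 1
```

Returning None on a failed check mirrors the "no label recorded" branch: the caller has nothing to store for the unknown item.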

Friction flagging

HiveGuard tracks the fail rate for every ground-truth item. If an item accumulates a high fail rate (≥ 35% with ≥ 10 attempts), it is automatically flagged as high-friction — the item may be ambiguous, broken, or genuinely too hard.
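
As a rough sketch of the flagging rule, the check below applies the two thresholds stated above. The function name is_high_friction is hypothetical, and the sketch assumes that "attempts" means pass_count plus fail_count; HiveGuard may count attempts differently.

```python
FAIL_RATE_THRESHOLD_PERCENT = 35  # from the text: fail rate of at least 35%
MIN_ATTEMPTS = 10                 # from the text: at least 10 attempts


def is_high_friction(pass_count: int, fail_count: int) -> bool:
    """Return True when an item meets both flagging thresholds."""
    attempts = pass_count + fail_count  # assumption: attempts = passes + fails
    if attempts < MIN_ATTEMPTS:
        return False  # too few attempts to judge the item
    # Integer comparison avoids floating-point rounding at exactly 35%.
    return fail_count * 100 >= FAIL_RATE_THRESHOLD_PERCENT * attempts


print(is_high_friction(pass_count=13, fail_count=7))  # True (7 of 20 is 35%)
print(is_high_friction(pass_count=2, fail_count=6))   # False (only 8 attempts)
print(is_high_friction(pass_count=9, fail_count=3))   # False (3 of 12 is 25%)
```

The minimum-attempts guard keeps a new item from being flagged after one or two unlucky answers.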

High-friction items appear in the dashboard Metrics tab. You can disable them so they no longer appear in challenges, or fix the underlying issue and re-enable them. Disabling a high-friction ground-truth (GT) item removes it from the pool immediately, so it no longer wastes visitor time or adds noise to your labels.

How many do you need

Five to ten well-distributed ground-truth items are enough to get started. A larger pool gives the system more variety to draw from, reducing the risk that visitors see the same GT item repeatedly and learn to game it.

A ratio of roughly 1 GT item per 3–5 unknown items works well in practice.
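
To make the ratio concrete, the sketch below turns a count of unknown items into a suggested GT pool size. The function suggested_gt_range is a hypothetical helper, and treating the lower end of the 5–10 starting range as a floor is an assumption made for illustration.

```python
import math

MIN_GT_ITEMS = 5  # assumption: lower end of the 5-10 starting range as a floor


def suggested_gt_range(unknown_count: int) -> tuple[int, int]:
    """Return (low, high) GT pool sizes for ratios of 1:5 and 1:3."""
    low = max(MIN_GT_ITEMS, math.ceil(unknown_count / 5))   # 1 GT per 5 unknown
    high = max(MIN_GT_ITEMS, math.ceil(unknown_count / 3))  # 1 GT per 3 unknown
    return low, high


print(suggested_gt_range(100))  # (20, 34)
print(suggested_gt_range(12))   # (5, 5)
```

For 100 unknown items this suggests roughly 20 to 34 GT items; for very small datasets the starting minimum dominates.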