Consensus Engine
A single visitor’s answer isn’t necessarily correct: they might be careless or skimming. The consensus engine aggregates responses from multiple visitors and produces a label only when enough of them agree. This is how you get reliable training data from general web traffic.
How it works
For each unknown item, HiveGuard collects responses from multiple visitors. When enough responses accumulate:
- Count responses by answer value
- Compute the fraction that agree on the most common answer
- If agreement >= consensus_threshold, finalize the label
Example with consensus_threshold = 0.6:
| Item | Responses | Agreement | Result |
|---|---|---|---|
| cat.jpg | yes, yes, yes, no | 3/4 = 0.75 | ✓ Label: “yes” |
| dog.jpg | yes, no, yes, no | 2/4 = 0.50 | ✗ Not enough agreement |
Items that don’t reach consensus are re-queued for more solvers.
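The counting steps and the two table rows above can be reproduced with a minimal Python sketch. This is an illustration of the described logic, not HiveGuard's actual implementation; the function name `try_finalize` and its return shape are assumptions made for the example.

```python
from collections import Counter
from typing import Optional, Sequence, Tuple


def try_finalize(
    responses: Sequence[str], consensus_threshold: float = 0.6
) -> Tuple[Optional[str], float]:
    """Return (label, agreement). label is None when consensus is not reached.

    Hypothetical helper: the name and signature are not part of HiveGuard.
    """
    if not responses:
        return None, 0.0
    # Count responses by answer value.
    counts = Counter(responses)
    # Fraction of responses that agree on the most common answer.
    top_answer, top_count = counts.most_common(1)[0]
    agreement = top_count / len(responses)
    # Finalize only when agreement meets the threshold.
    if agreement >= consensus_threshold:
        return top_answer, agreement
    return None, agreement


print(try_finalize(["yes", "yes", "yes", "no"]))  # ('yes', 0.75)
print(try_finalize(["yes", "no", "yes", "no"]))   # (None, 0.5)
```

A `None` label corresponds to the "not enough agreement" case, where the item would go back into the queue.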
Consensus threshold
consensus_threshold (0.0–1.0) controls how much solver agreement is required. Default: 0.6.
- Higher threshold → more agreement required → higher-quality labels, but slower (more solvers needed)
- Lower threshold → labels finalize faster, but may be noisier
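To make the speed trade-off concrete, the sketch below estimates how many additional responses an item would need before it clears a given threshold, under the optimistic assumption that every new response matches the leading answer. The helper `extra_agreeing_needed` is hypothetical and only illustrates the arithmetic.

```python
from typing import Optional


def extra_agreeing_needed(
    top_count: int, total: int, threshold: float, max_extra: int = 1000
) -> Optional[int]:
    """Smallest number of extra responses, all matching the leading answer,
    needed for agreement to reach the threshold (None if not within max_extra).

    Hypothetical helper for illustration; not part of HiveGuard.
    """
    for extra in range(max_extra + 1):
        if total + extra == 0:
            continue  # no responses yet, agreement is undefined
        if (top_count + extra) / (total + extra) >= threshold:
            return extra
    return None


# An item with 3 "yes" and 1 "no" (agreement 0.75):
for threshold in (0.6, 0.75, 0.9):
    print(threshold, extra_agreeing_needed(3, 4, threshold))
# 0.6 0
# 0.75 0
# 0.9 6
```

At 0.6 or 0.75 the item finalizes immediately, while at 0.9 it needs at least six more agreeing responses, which is why a higher threshold takes longer to produce labels.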
Change it via:
    hiveguard config set --consensus-threshold 0.75
Re-validation
Labels that were previously finalized can be re-queued for re-scoring if new information suggests the label is stale or contested. See Re-validation.
Label confidence
Once a label is finalized, it gets a confidence score equal to the agreement fraction. This is useful when exporting:
    # Only use labels with 90%+ agreement
    hiveguard labels export --fmt jsonl | \
      jq -c 'select(.confidence >= 0.9)'
Outlier rejection
Responses that are statistical outliers (e.g., one solver who always answers differently from everyone else) may be down-weighted. This reduces the influence of careless or adversarial solvers.
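How the down-weighting is computed is not specified here. Purely as an illustration, one possible scheme is a weighted vote in which each solver carries a weight between 0 and 1 and the agreement fraction is computed over weights instead of raw counts. The function name, the per-solver weight table, and the default weight of 1.0 below are all assumptions for the sketch, not HiveGuard behavior.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def weighted_agreement(
    responses: List[Tuple[str, str]], solver_weight: Dict[str, float]
) -> Tuple[Optional[str], float]:
    """Weighted vote over (solver_id, answer) pairs.

    Hypothetical scheme: solvers missing from solver_weight count fully (1.0).
    Returns (leading answer, weighted agreement fraction).
    """
    totals: Dict[str, float] = defaultdict(float)
    for solver_id, answer in responses:
        totals[answer] += solver_weight.get(solver_id, 1.0)
    total_weight = sum(totals.values())
    if total_weight == 0:
        return None, 0.0
    top_answer = max(totals, key=lambda answer: totals[answer])
    return top_answer, totals[top_answer] / total_weight


responses = [("a", "yes"), ("b", "yes"), ("c", "no")]

# Unweighted: 2 of 3 agree.
print(weighted_agreement(responses, {}))
# Solver "c" is treated as an outlier and counts half: 2.0 / 2.5 = 0.8.
print(weighted_agreement(responses, {"c": 0.5}))
```

In this sketch the outlier's vote still counts, only less, so a consistently dissenting solver shifts the agreement fraction by a smaller amount than a trusted one.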