Consensus Engine

A single visitor’s answer isn’t necessarily correct: the visitor might be careless or just skimming. The consensus engine aggregates responses from multiple visitors and produces a label only when enough of them agree. This is how you get reliable training data from general web traffic.

How it works

For each unknown item, HiveGuard collects responses from multiple visitors. When enough responses accumulate:

1. Count responses by answer value
2. Compute the fraction that agree on the most common answer
3. If agreement >= consensus_threshold, finalize the label

Example with consensus_threshold = 0.6:

Item	Responses	Agreement	Result
cat.jpg	yes, yes, yes, no	3/4 = 0.75	✓ Label: “yes”
dog.jpg	yes, no, yes, no	2/4 = 0.50	✗ Not enough agreement

Items that don’t reach consensus are re-queued for more solvers.
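The steps above can be sketched in a few lines of Python. This is an illustrative sketch, not HiveGuard's actual implementation: the function name `consensus` and the `min_responses` parameter are assumptions, since the docs say only that "enough" responses must accumulate without giving a number.

```python
from collections import Counter
from typing import Optional, Sequence, Tuple


def consensus(
    responses: Sequence[str],
    threshold: float = 0.6,   # mirrors consensus_threshold
    min_responses: int = 4,   # hypothetical: "enough responses" is not specified
) -> Tuple[Optional[str], float]:
    """Return (label, agreement); label is None when there is no consensus."""
    if len(responses) < min_responses:
        return None, 0.0

    # Step 1: count responses by answer value.
    counts = Counter(responses)

    # Step 2: fraction of responses that match the most common answer.
    top_answer, top_count = counts.most_common(1)[0]
    agreement = top_count / len(responses)

    # Step 3: finalize only if agreement reaches the threshold.
    if agreement >= threshold:
        return top_answer, agreement
    return None, agreement


print(consensus(["yes", "yes", "yes", "no"]))  # cat.jpg from the table
print(consensus(["yes", "no", "yes", "no"]))   # dog.jpg from the table
```

In this sketch, an item that returns `None` as its label is one that would be re-queued for more responses.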

Consensus threshold

consensus_threshold (0.0–1.0) controls how much solver agreement is required. Default: 0.6.

Higher threshold → more agreement required → higher-quality labels, but slower (more solvers needed)
Lower threshold → labels finalize faster, but may be noisier
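To make the trade-off concrete, the sketch below computes how many matching answers an item needs before it can finalize, given a number of responses and a threshold. The helper `min_agreeing_votes` is hypothetical and exists only for illustration. It applies the same `agreement >= consensus_threshold` rule described above.

```python
def min_agreeing_votes(num_responses: int, threshold: float) -> int:
    """Smallest count k such that k / num_responses >= threshold."""
    if num_responses <= 0:
        raise ValueError("num_responses must be positive")
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    # Use the same comparison as the consensus rule to avoid rounding surprises.
    return next(
        k for k in range(num_responses + 1) if k / num_responses >= threshold
    )


for threshold in (0.6, 0.75, 0.9):
    needed = min_agreeing_votes(10, threshold)
    print(f"threshold={threshold}: {needed} of 10 responses must agree")
```

With ten responses, raising the threshold from 0.6 to 0.9 raises the required matches from 6 to 9, so a single extra dissenting answer is enough to keep the item in the queue.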

Change it via:

hiveguard config set --consensus-threshold 0.75

Re-validation

Labels that were previously finalized can be re-queued for re-scoring if new information suggests the label is stale or contested. See Re-validation.

Label confidence

Once a label is finalized, it gets a confidence score equal to the agreement fraction. This is useful when exporting:

# Only use labels with 90%+ agreement
hiveguard labels export --fmt jsonl | \
  jq -c 'select(.confidence >= 0.9)'

Outlier rejection

Responses from solvers who are statistical outliers (e.g., a solver who consistently answers differently from everyone else) may be down-weighted. This reduces the influence of careless or adversarial solvers.
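One way to picture down-weighting is a weighted vote, where each response counts in proportion to a per-solver weight instead of counting as exactly one. The sketch below is an assumption about how this could work, not a description of HiveGuard's internals: the function `weighted_consensus`, the solver IDs, and the weights are all hypothetical, and the docs do not say how weights are derived.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple


def weighted_consensus(
    responses: Iterable[Tuple[str, str]],  # (solver_id, answer) pairs
    weights: Dict[str, float],             # hypothetical per-solver weights
    threshold: float = 0.6,
) -> Tuple[Optional[str], float]:
    """Weighted agreement: each answer counts by its solver's weight."""
    totals: Dict[str, float] = defaultdict(float)
    for solver_id, answer in responses:
        # Solvers without an explicit weight count as a full vote (1.0).
        totals[answer] += weights.get(solver_id, 1.0)

    total_weight = sum(totals.values())
    if total_weight == 0:
        return None, 0.0

    top_answer = max(totals, key=lambda answer: totals[answer])
    agreement = totals[top_answer] / total_weight
    return (top_answer if agreement >= threshold else None), agreement


votes = [("a", "yes"), ("b", "yes"), ("c", "no"), ("d", "no")]
print(weighted_consensus(votes, weights={}))           # plain 2-2 split
print(weighted_consensus(votes, weights={"d": 0.25}))  # solver "d" down-weighted
```

In this example, an even split stays unresolved when every vote counts equally, but reaches the default threshold once the outlier solver's vote is reduced to a quarter of a normal vote.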