Collecting Labels from Your Users

This guide walks through deploying HiveGuard in front of a real application — the typical setup for passive, continuous data collection.

The idea

You have an application. It gets traffic. Some of that traffic is humans. HiveGuard intercepts those human visits and asks each visitor to complete a quick labeling task before passing the request through to your application. The visit then continues normally. You get a label.

Visitor → POST /your-app/any-path
               │
               ▼
           HiveGuard (port 8000)
           ┌──────────────────────────────┐
           │ 1. Is this visitor human?    │
           │ 2. Serve a labeling task     │
           │ 3. Record the answer         │
           └──────────────────────────────┘
               │ visitor passes → forward to upstream
               ▼
           http://your-app:8080

Step 1: Point HiveGuard at your application

Set TARGET_URL in your environment:

TARGET_URL=http://your-app:8080

All requests that pass the challenge are forwarded transparently to this URL.
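
A malformed TARGET_URL (for example, one missing the http:// scheme) is an easy mistake to make. The following is a minimal pre-flight sketch, not part of HiveGuard itself: the function name validate_target_url is hypothetical, and it only checks the shape of the URL, not whether the upstream is reachable.

```python
import os
from urllib.parse import urlsplit


def validate_target_url(value: str) -> str:
    """Return the URL unchanged if it looks like a usable upstream, else raise."""
    parts = urlsplit(value)
    if parts.scheme not in ("http", "https"):
        raise ValueError(
            f"TARGET_URL must start with http:// or https://, got {value!r}"
        )
    if not parts.hostname:
        raise ValueError(f"TARGET_URL has no host: {value!r}")
    return value


if __name__ == "__main__":
    # Falls back to the example value from this guide when the variable is unset.
    print(validate_target_url(os.environ.get("TARGET_URL", "http://your-app:8080")))
```

Running a check like this before starting the proxy catches values such as your-app:8080, which has no scheme.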

Step 2: Upload your dataset

Before traffic can generate labels, you need items in the challenge pool. See Uploading Training Data for the full guide.

Quick version:

hiveguard datasets create "My Dataset" --modality image
hiveguard items upload my_items.jsonl
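
If your items start out as a list of image URLs, a short script can produce the JSONL file (one JSON object per line). This is a sketch under a stated assumption: the field name image_url is a placeholder, not HiveGuard's documented item schema, so check Uploading Training Data for the real field names before uploading.

```python
import json
from pathlib import Path


def write_items_jsonl(image_urls, path):
    """Write one JSON object per line and return the number of lines written."""
    count = 0
    with Path(path).open("w", encoding="utf-8") as fh:
        for url in image_urls:
            # ASSUMPTION: "image_url" is a hypothetical field name used for
            # illustration; substitute the schema from the upload guide.
            fh.write(json.dumps({"image_url": url}) + "\n")
            count += 1
    return count
```

For example, write_items_jsonl(["https://example.com/cat.jpg"], "my_items.jsonl") creates a one-line file that the upload command above can then read.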

Step 3: Create proxy rules (optional)

By default every request triggers a challenge. For paths that shouldn’t require one — health checks, static assets, internal API calls — create pass-through rules:

# Let health checks through without a challenge
hiveguard proxy-rules create "/health" --match-type exact --priority 10

# Let all static assets through
hiveguard proxy-rules create "/static/" --match-type prefix --priority 20

Or from the dashboard: Proxy Rules → New Rule.
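
To make the difference between the two match types concrete, here is a small model of how rules like the ones above could be evaluated. It is an illustration, not HiveGuard's implementation: the PassThroughRule class is hypothetical, and the evaluation order (lowest priority number first, first match wins) is an assumption to confirm against the proxy rules documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PassThroughRule:
    pattern: str
    match_type: str  # "exact" or "prefix", mirroring --match-type
    priority: int


def matches(rule, path):
    """Return True if the request path satisfies the rule."""
    if rule.match_type == "exact":
        return path == rule.pattern
    if rule.match_type == "prefix":
        return path.startswith(rule.pattern)
    raise ValueError(f"unknown match type: {rule.match_type!r}")


def first_matching_rule(rules, path):
    """Return the first rule that matches, or None if the path needs a challenge."""
    # ASSUMPTION: lower priority numbers are evaluated first.
    for rule in sorted(rules, key=lambda r: r.priority):
        if matches(rule, path):
            return rule
    return None


RULES = [
    PassThroughRule("/health", "exact", 10),
    PassThroughRule("/static/", "prefix", 20),
]
```

Under this model, /health and /static/app.css pass through, while /healthz and /static (no trailing slash) match nothing and would still be challenged.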

Step 4: Verify labels are flowing

Check the dashboard Metrics tab after some traffic has passed through. You should see:

Challenges today ticking up as visitors arrive
Labeled items growing as consensus is reached
Solve rate showing the fraction of visitors completing the task

If the solve rate is low, check that your ground-truth (GT) items are clear and unambiguous. The Labeler Frustration section flags GT items with high fail rates.
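
The dashboard computes these figures for you; the sketch below only shows the arithmetic they plausibly rest on, in case you want to reproduce it from your own logs. The function names, the 0.5 fail-rate threshold, and the 20-attempt minimum are all assumptions chosen for illustration, not HiveGuard defaults.

```python
def solve_rate(challenges_served, challenges_solved):
    """Fraction of served challenges that visitors completed."""
    if challenges_served == 0:
        return 0.0
    return challenges_solved / challenges_served


def high_fail_items(attempts_by_item, threshold=0.5, min_attempts=20):
    """Return ids of GT items whose fail rate exceeds the threshold.

    attempts_by_item maps an item id to a (attempts, failures) pair.
    Items with fewer than min_attempts are skipped as too noisy to judge.
    """
    flagged = []
    for item_id, (attempts, failures) in attempts_by_item.items():
        if attempts >= min_attempts and failures / attempts > threshold:
            flagged.append(item_id)
    return sorted(flagged)
```

With 200 challenges served and 150 solved, solve_rate returns 0.75. An item attempted 40 times with 30 failures has a fail rate of 0.75 and would be flagged under these assumed settings.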

Alternative: embed the widget inline

To run the labeling task inline in a page rather than as a gate, embed the HiveGuard widget in your frontend:

<script
  src="http://localhost:8000/_hiveguard/widget.js"
  data-api-key="hg_xxxxxxxxxxxxxxxx"
  data-dataset-id="00000000-..."
></script>

The widget renders the challenge inline. On completion, it fires a hiveguard:solved event that you can handle in JavaScript. The visitor’s answer is recorded whether or not you use that event to gate access.

Monitoring label quality

hiveguard metrics

The same metrics are available in the dashboard Metrics tab. Watch for:

High frustration index (visitors struggling with GT items) → review and possibly disable high-friction items
Low solve rate → GT items may be too hard, or the challenge pool may contain too few items

Exporting what you’ve collected

hiveguard labels export --fmt csv --output labels.csv

Labels are ready for model training as soon as they’re exported. See Exporting Labels for filtering options.
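
Before feeding an export into a training pipeline, it helps to confirm that the file has the columns and row count you expect. The sketch below uses only Python's standard csv module and makes no assumption about HiveGuard's column names: it reports whatever header the file contains. The function name summarize_export is hypothetical.

```python
import csv


def summarize_export(path):
    """Return the column names and the number of data rows in a CSV export."""
    with open(path, newline="", encoding="utf-8") as fh:
        reader = csv.DictReader(fh)
        rows = list(reader)
        # fieldnames is None for an empty file, so fall back to an empty list.
        return {"columns": list(reader.fieldnames or []), "rows": len(rows)}


if __name__ == "__main__":
    import sys

    # Usage: python summarize_export.py labels.csv
    if len(sys.argv) == 2:
        print(summarize_export(sys.argv[1]))
```

A row count of zero usually means no items have reached consensus yet, in which case the Metrics tab is the place to look first.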