Collecting Labels from Your Users
This guide walks through deploying HiveGuard in front of a real application — the typical setup for passive, continuous data collection.
The idea
You have an application. It gets traffic. Some of that traffic is humans. HiveGuard intercepts those human visits and asks each one to complete a quick labeling task before passing them through. The visit continues normally. You get a label.
    Visitor → POST /your-app/any-path
            │
            ▼
    HiveGuard (port 8000)
    ┌──────────────────────────────┐
    │ 1. Is this visitor human?    │
    │ 2. Serve a labeling task     │
    │ 3. Record the answer         │
    └──────────────────────────────┘
            │ visitor passes → forward to upstream
            ▼
    http://your-app:8080

Step 1: Point HiveGuard at your application
Set TARGET_URL in your environment:
    TARGET_URL=http://your-app:8080

All requests that pass the challenge are forwarded transparently to this URL.
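To make "forwarded transparently" concrete, here is a minimal sketch of how an upstream URL could be composed from TARGET_URL and the incoming request path. It is an illustration, not HiveGuard's actual code: the helper name `upstreamUrl` is hypothetical, and it assumes TARGET_URL is an origin with no path component and that the path and query string are passed through unchanged.

```javascript
// Sketch of how a forwarded URL could be composed; not HiveGuard's actual code.
// Assumptions: TARGET_URL is an origin with no path component, and the
// incoming path and query string are passed through unchanged.
const TARGET_URL = "http://your-app:8080";

// Hypothetical helper: resolve the visitor's path against the upstream origin.
function upstreamUrl(incomingPathAndQuery, targetUrl = TARGET_URL) {
  // The WHATWG URL constructor resolves the path against the target origin.
  return new URL(incomingPathAndQuery, targetUrl).href;
}

console.log(upstreamUrl("/your-app/any-path?ref=home"));
// http://your-app:8080/your-app/any-path?ref=home
```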
Step 2: Upload your dataset
Before traffic can generate labels, you need items in the challenge pool. See Uploading Training Data for the full guide.
Quick version:
    hiveguard datasets create "My Dataset" --modality image
    hiveguard items upload my_items.jsonl

Step 3: Create proxy rules (optional)
By default every request triggers a challenge. For paths that shouldn’t require one — health checks, static assets, internal API calls — create pass-through rules:
    # Let health checks through without a challenge
    hiveguard proxy-rules create "/health" --match-type exact --priority 10

    # Let all static assets through
    hiveguard proxy-rules create "/static/" --match-type prefix --priority 20

Or from the dashboard: Proxy Rules → New Rule.
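The difference between exact and prefix matching can be modeled in a few lines. The sketch below is illustrative only and does not reflect HiveGuard's internals: the rule object shape and the function names are hypothetical, and it assumes that rules with a lower priority number are checked first and that the first match wins.

```javascript
// Illustrative model of pass-through rule matching; not HiveGuard's implementation.
// Assumptions: rules with a lower priority number are checked first, and the
// first matching rule wins. Both are guesses made for this sketch.
const rules = [
  { pattern: "/health", matchType: "exact", priority: 10 },
  { pattern: "/static/", matchType: "prefix", priority: 20 },
];

function matches(rule, path) {
  if (rule.matchType === "exact") return path === rule.pattern;
  if (rule.matchType === "prefix") return path.startsWith(rule.pattern);
  return false; // unknown match types never match in this sketch
}

// Returns the first rule that matches the path, or undefined if none does.
function firstMatchingRule(path, ruleList = rules) {
  const ordered = [...ruleList].sort((a, b) => a.priority - b.priority);
  return ordered.find((rule) => matches(rule, path));
}

console.log(firstMatchingRule("/health") !== undefined);         // true
console.log(firstMatchingRule("/healthz") !== undefined);        // false
console.log(firstMatchingRule("/static/app.css") !== undefined); // true
console.log(firstMatchingRule("/checkout") !== undefined);       // false
```

Under this model, "/healthz" still triggers a challenge because the "/health" rule is exact, while anything under "/static/" passes through.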
Step 4: Verify labels are flowing
Check the dashboard Metrics tab after some traffic has passed through. You should see:
- Challenges today ticking up as visitors arrive
- Labeled items growing as consensus is reached
- Solve rate showing the fraction of visitors completing the task
If the solve rate is low, check that your ground-truth (GT) items are clear and unambiguous. The Labeler Frustration section flags GT items with high fail rates.
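As a quick illustration of the solve rate described above (the fraction of visitors completing the task), the arithmetic looks like this. The function and parameter names are hypothetical and are not HiveGuard metric fields.

```javascript
// Solve rate as described above: completed tasks divided by tasks served.
// The parameter names are hypothetical; they are not HiveGuard metric fields.
function solveRate(challengesServed, challengesSolved) {
  if (challengesServed === 0) return 0; // no traffic yet, so avoid dividing by zero
  return challengesSolved / challengesServed;
}

console.log(solveRate(200, 150)); // 0.75
```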
Embedding the widget directly
For cases where you want the labeling task inline in a page (not as a gate), embed the HiveGuard widget in your frontend:
    <script src="http://localhost:8000/_hiveguard/widget.js" data-api-key="hg_xxxxxxxxxxxxxxxx" data-dataset-id="00000000-..."></script>

The widget renders the challenge inline. On completion, it fires a hiveguard:solved event you can handle in JavaScript. The visitor’s answer is recorded regardless of whether you use it to gate access.
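A minimal sketch of handling the hiveguard:solved event with a standard DOM listener follows. It assumes the event is dispatched on, or bubbles up to, the target you listen on (typically `document` in a browser). The helper name `onHiveGuardSolved` is hypothetical, and because the event payload is not described here, the handler relies only on the event type.

```javascript
// Minimal sketch: react to the widget's completion event.
// Assumption: the event is dispatched on (or bubbles up to) the target passed in;
// in a browser that would typically be `document`. The shape of any event payload
// is not documented here, so the handler relies only on the event type.
function onHiveGuardSolved(target, handler) {
  const listener = (event) => handler(event);
  target.addEventListener("hiveguard:solved", listener);
  // Return a function that removes the listener again.
  return () => target.removeEventListener("hiveguard:solved", listener);
}

// Browser usage (only runs when a DOM is present).
if (typeof document !== "undefined") {
  onHiveGuardSolved(document, () => {
    console.log("Labeling task completed");
  });
}
```

Returning the removal function keeps cleanup simple if the widget is mounted and unmounted, for example in a single-page application.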
Monitoring label quality
    hiveguard metrics

Or via the dashboard Metrics tab. Watch for:
- High frustration index (visitors struggling with GT items) → review and possibly disable high-friction items
- Low solve rate → GT items may be too hard, or the dataset pool may be too small
Exporting what you’ve collected
    hiveguard labels export --fmt csv --output labels.csv

Labels are ready for model training as soon as they’re exported. See Exporting Labels for filtering options.