
Online Evaluation Guardrails: Catching LLM Drift Before Users Do

Offline evals are necessary, but not enough. Once a system is in production, behavior drifts. Online evals are how you catch that drift before users complain.

This is the guardrail setup I use for the offer-bundling assistant: small checks, lightweight sampling, clear thresholds.

In 30 seconds

  • Offline evals are not real-time. Online evals catch drift fast.
  • Insert small evaluators at critical points in the flow.
  • Set thresholds and alerts so you act before users notice.

Key takeaways

  • Guardrails can be lightweight and still effective.
  • Sampling reduces cost without losing signal.
  • Alerts should map to user-visible risk.

Offline vs online evals

Offline evals test known examples. Online evals watch real traffic. You need both:

  • Offline evals prevent regressions before release.
  • Online evals detect new failure modes in production.

Where to add guardrails

In the offer-bundling flow, I add checks at:

  • Output validation (schema conformity)
  • Compliance clause coverage
  • Budget threshold

Each is a small evaluator with a simple rule.

A minimal online evaluator

Example: check for missing compliance clauses in the EU.

# Online guardrail: flag EU bundles that are missing a GDPR clause
def check_gdpr(request, output, record_failure):
    if request.region == "eu" and "gdpr" not in output.compliance_clauses:
        record_failure("missing_gdpr")

This is cheap, fast, and catches real issues.
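Several such checks compose naturally into a small runner. This is a minimal sketch, not the project's real interface: the `Check` class, the lambda rules, and the dict-shaped output are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Check:
    name: str
    rule: Callable  # returns True when the output passes

def run_checks(output, checks, record_failure):
    # Run every check; record the name of each one that fails.
    for check in checks:
        if not check.rule(output):
            record_failure(check.name)

# Illustrative versions of the three checks listed earlier.
checks = [
    Check("schema_conformity", lambda o: isinstance(o.get("items"), list)),
    Check("compliance_coverage", lambda o: len(o.get("compliance_clauses", [])) > 0),
    Check("budget_threshold", lambda o: o.get("total", 0) <= o.get("budget", 0)),
]
```

Keeping each rule a single boolean function makes the guardrails cheap to add and easy to reason about individually.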

Thresholds and alerts

I set thresholds based on impact. Example:

  • If more than 2% of EU bundles miss GDPR, alert.
  • If more than 5% exceed budget, alert.

The goal is not to alert on every anomaly. It is to alert when users will feel it.
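The threshold logic itself can stay trivial. A sketch, using the two example rates from above; the failure names and the shape of the counters are assumptions, not production code:

```python
# Alert thresholds as failure-rate limits (from the examples above).
THRESHOLDS = {
    "missing_gdpr": 0.02,  # alert if >2% of EU bundles miss GDPR
    "over_budget": 0.05,   # alert if >5% exceed budget
}

def should_alert(failure_counts, total_checked):
    """Return the names of checks whose failure rate exceeds its limit."""
    alerts = []
    for name, limit in THRESHOLDS.items():
        rate = failure_counts.get(name, 0) / max(total_checked, 1)
        if rate > limit:
            alerts.append(name)
    return alerts
```

Rates rather than raw counts keep the thresholds stable as traffic grows.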

Sampling to control cost

You do not need to evaluate 100% of traffic. Start with:

  • 5% sampling for low-risk checks
  • 20% for high-risk checks

You can increase later if needed.
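Per-request sampling can be a single probabilistic gate. A sketch assuming the two rates above; the risk-level names are illustrative:

```python
import random

# Sampling rates per risk level (from the starting points above).
SAMPLE_RATES = {"low_risk": 0.05, "high_risk": 0.20}

def should_evaluate(risk_level, rng=random):
    """Return True for roughly SAMPLE_RATES[risk_level] of calls."""
    return rng.random() < SAMPLE_RATES[risk_level]
```

Passing the random source in makes the gate easy to make deterministic in tests.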

Handling false positives

Most alerts end up being harmless, but the few real ones pay for the system. I keep a short review loop:

  • Sample a few failures daily.
  • Confirm if they are real.
  • Adjust thresholds or rules.
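The daily sampling step can be one small helper. A sketch under the assumption that recorded failures are kept in a list; the function name and `k` default are made up for illustration:

```python
import random

def daily_review_sample(failures, k=5, seed=None):
    """Pick up to k recorded failures for manual review."""
    rng = random.Random(seed)
    return rng.sample(failures, min(k, len(failures)))
```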

How this connects to the project

The offer-bundling assistant already has schema validation and golden datasets. Online guardrails are the final layer: they protect against drift and real-world edge cases you never saw in tests.

Closing thought

Online evals are not a full test suite. They are a smoke alarm. You do not want them to be loud all the time, but you do want them to go off when something is actually on fire.



Written by Florin · Tech snippets for everybody