Products — Behique
Behique
Agent Drift Monitoring
What it does
The same model name doesn’t mean the same system. Weights, prompts, and settings change under a static label, and for an agent that acts, that’s a change in its decision-making — not just an isolated bad answer.
Passively, it records what an app’s AI actually does in production — every session, cost, and outcome — giving a non-technical operator an audit trail, proof-of-value analytics, and cost visibility in a plain browser.
Actively, it probes the agent on a schedule with a fixed battery of tests, compares the results against a frozen baseline of how the agent behaved on day one, and reports a per-layer profile of how far it has drifted.
The result: you find out whether the system you’re running today is still the one you signed off on — before your users do.
The approach
- A profile, never a single score. Drift is reported across distinct layers — from “is this literally the same model and prompt?” up through its core rules, skills, and persona — because one aggregate number hides which thing changed.
- Trust the instrument before the reading. No measurement is believed until the probe proves reliable on an unchanged reference (test–retest), so you never chase noise.
- Detect, don’t silently act. Drift is surfaced to a human; any automated response is an explicit, opt-in hook — never a silent change made on your behalf.
- Grounded in measurement science. Built on psychometrics and the philosophy of identity (Parfit’s “Relation R”), it measures degree of continuity — not a binary “same or different.”
Running an agent you can’t prove still behaves?