CS329A research Procedural Drift Project
Behavioral diagnostics

Evaluation approach

The evaluation treats behavioral consistency as a first-class metric. The goal is to measure whether repeated runs with fixed prompts and tools keep producing the same decisions over time.

Primary metrics

The primary metric is the Decision Disagreement Rate (DDR), which measures how often identical scenarios yield different decisions under repeated execution.

Decision Disagreement Rate (DDR)

Probability that a replay deviates from the modal (most frequent) decision under identical inputs.
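
As a minimal sketch of how DDR could be estimated for one scenario, the snippet below takes the decisions from repeated replays and reports the fraction that differ from the most frequent one. The function name and the "escalate"/"resolve" labels are hypothetical, chosen only for illustration; the project may define decisions differently.

```python
from collections import Counter


def decision_disagreement_rate(decisions):
    """Fraction of replays whose decision differs from the modal decision."""
    if not decisions:
        raise ValueError("need at least one replay decision")
    # Count of the most frequent decision; ties do not change the rate.
    modal_count = Counter(decisions).most_common(1)[0][1]
    return (len(decisions) - modal_count) / len(decisions)


# Hypothetical replays of one scenario under identical inputs.
replays = ["escalate", "escalate", "resolve", "escalate", "resolve"]
print(decision_disagreement_rate(replays))  # 0.4
```

A DDR of 0 means every replay agreed with the modal decision. With k distinct decisions the empirical value cannot exceed 1 - 1/k.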

Switch rate (SR)

Frequency of label changes between adjacent replays.
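
One way to compute the switch rate, assuming replays are kept in execution order, is to count how many adjacent pairs carry different labels. The function name and labels below are hypothetical.

```python
def switch_rate(labels):
    """Fraction of adjacent replay pairs whose labels differ."""
    if len(labels) < 2:
        return 0.0  # no adjacent pair to compare
    switches = sum(a != b for a, b in zip(labels, labels[1:]))
    return switches / (len(labels) - 1)


# Hypothetical replays in execution order.
ordered_replays = ["escalate", "escalate", "resolve", "escalate", "resolve"]
print(switch_rate(ordered_replays))  # 0.75
```

Unlike DDR, this measure depends on replay order: the same multiset of labels gives a low SR when changes are clustered and a high SR when they alternate.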

Escalation consistency

Stability of severity labels, triggers, and routing decisions.

Trace similarity

Overlap of tool calls, branch activations, and skill versions.
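
The source does not fix a particular overlap measure. The sketch below assumes Jaccard similarity over the set of trace elements (tool calls, branch activations, skill versions), each encoded as a string. The encoding scheme and all identifiers are hypothetical.

```python
def trace_similarity(trace_a, trace_b):
    """Jaccard overlap between the element sets of two traces (assumed measure)."""
    a, b = set(trace_a), set(trace_b)
    if not a and not b:
        return 1.0  # two empty traces are treated as identical
    return len(a & b) / len(a | b)


# Hypothetical traces from two replays of the same scenario.
run_1 = ["tool:search", "branch:refund", "skill:triage@v2"]
run_2 = ["tool:search", "branch:escalate", "skill:triage@v2"]
print(trace_similarity(run_1, run_2))  # 0.5
```

A set-based measure ignores call order and repetition. If ordering matters for the analysis, a sequence-based comparison would be a more suitable choice.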

Drift indicators

  • Branch activation changes across replays
  • Escalation frequency shifts over time
  • Rationale and threshold drift in explanations
  • Tool and branch divergence under identical inputs
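
As an illustration of one indicator above, a shift in escalation frequency could be tracked by comparing the escalation rate in an earlier window of replays with the rate in a later window. This is a sketch under assumptions: the windowing, the function names, and the "escalate" label are all hypothetical.

```python
def escalation_rate(decisions, escalate_label="escalate"):
    """Share of decisions in a window that are escalations."""
    if not decisions:
        raise ValueError("window must contain at least one decision")
    return sum(d == escalate_label for d in decisions) / len(decisions)


def escalation_shift(early, late, escalate_label="escalate"):
    """Late-window escalation rate minus early-window rate."""
    return escalation_rate(late, escalate_label) - escalation_rate(early, escalate_label)


# Hypothetical decisions for the same scenarios in two time windows.
early_window = ["escalate", "resolve", "resolve", "resolve"]
late_window = ["escalate", "escalate", "resolve", "resolve"]
print(escalation_shift(early_window, late_window))  # 0.25
```

A positive value means the agent escalates more often in the later window even though the inputs are unchanged.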

Analysis plan

Results will be summarized with DDR/SR statistics, escalation heatmaps, and trace similarity analysis. The analysis will focus on the delta between prompt-only and procedure-grounded agents under identical compute budgets.