2024-11-03 · rollouts · governance · teaching

Designing canary gates that humans actually respect

By Jonas Hwang


Canary gates fail for social reasons more often than statistical ones. Teams skip gates when the dashboard story does not match the risk they feel in their bones. We rehearse gate language with platform and product peers in the same room so approvals feel informed, not ceremonial.

In cohort labs, learners script two gate types: a hard stop on error budget burn and a soft pause that requires a typed rationale. The second one trains empathy for on-call engineers who inherit the decision.
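A minimal sketch of what those two gate types might look like in a lab exercise. Every name here (`hard_stop_gate`, `soft_pause_gate`, `GateResult`), the burn-rate threshold of 2.0, and the five-word minimum for a rationale are hypothetical choices for illustration, not values from the course.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Decision(Enum):
    PROCEED = "proceed"
    PAUSE = "pause"
    STOP = "stop"


@dataclass
class GateResult:
    decision: Decision
    reason: str


def hard_stop_gate(burn_rate: float, max_burn_rate: float = 2.0) -> GateResult:
    """Hard stop: block the rollout when error budget burn exceeds a threshold.

    burn_rate is assumed to be the observed error rate divided by the rate
    the SLO allows. The 2.0 default is illustrative, not a recommendation.
    """
    if burn_rate > max_burn_rate:
        return GateResult(
            Decision.STOP,
            f"burn rate {burn_rate:.1f} exceeds {max_burn_rate:.1f}",
        )
    return GateResult(Decision.PROCEED, "burn rate within budget")


def soft_pause_gate(rationale: Optional[str], min_words: int = 5) -> GateResult:
    """Soft pause: hold the rollout until a human types a rationale.

    The word-count floor is a crude, assumed stand-in for "a real sentence";
    it only rejects empty or one-word approvals.
    """
    text = (rationale or "").strip()
    if len(text.split()) < min_words:
        return GateResult(Decision.PAUSE, "typed rationale required to continue")
    # The rationale is kept in the result so whoever inherits the
    # decision can read why the rollout continued.
    return GateResult(Decision.PROCEED, f"approved with rationale: {text}")
```

The soft pause returns the rationale as part of its result on purpose: the point of the exercise is that the on-call engineer can later read the reasoning, not just see that someone clicked approve.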

We also insist on pairing metrics with qualitative checks: customer-visible latency, support ticket volume, or synthetic journeys, depending on the scenario. Numbers alone rarely carry the nuance of a partial outage.
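One way to make the pairing rule concrete is to have the gate refuse to pass on metrics alone. The sketch below is an assumption about how that could be encoded; `Check`, `evaluate_paired_gate`, and the check names in the usage example are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Check:
    name: str
    passed: bool
    qualitative: bool  # True for checks such as synthetic journeys or ticket volume


def evaluate_paired_gate(checks: List[Check]) -> List[str]:
    """Return the reasons the gate should hold; an empty list means proceed."""
    problems = [f"{c.name} failed" for c in checks if not c.passed]
    # Enforce the pairing itself: at least one check of each kind must be present.
    if not any(c.qualitative for c in checks):
        problems.append("no qualitative check paired with the metrics")
    if not any(not c.qualitative for c in checks):
        problems.append("no quantitative metric present")
    return problems


# Usage: a passing error-rate metric alone is not enough to proceed.
metrics_only = [Check("error_rate", passed=True, qualitative=False)]
paired = metrics_only + [Check("checkout synthetic journey", passed=True, qualitative=True)]
print(evaluate_paired_gate(metrics_only))
print(evaluate_paired_gate(paired))
```

Returning a list of reasons instead of a boolean keeps the gate's explanation readable for the person approving the rollout.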

Capstones include a short memo explaining which gate would have fired during a redacted historical incident from the learner's workplace. Mentors grade clarity, not heroics.