Shipped

Behind the scenes: how we safely switched on six features, investigated every test failure, and hardened the system

Engineering / Quality

A technical companion to the feature announcement. We switched on six already-built features one tier at a time: read-only views first, then additive merchant tools, and last the one feature that writes data (self-service cancellation). After every step we checked that the live system was healthy and confirmed that each role could see only what it should. Every claim was verified against the live system, including a customer cancellation tested end to end on a real phone. We then investigated each remaining automated browser-test failure at the browser level and found evidence that none was a customer-facing bug: one was a defect in the test itself (a cookie banner overlapping a button), one was a real but harmless console-logging glitch in the operator console (now fixed), and the rest were test-timing or test-environment issues. Two independent automated reviewers checked the work and found no critical or high-risk issues. The code shipped through our standard gated pipeline (secret scan, linting, type-checking, the full automated test suite of nearly a thousand tests, and a production build) with an automatic health check and one-switch rollback.

  • engineering
  • quality
  • testing
  • safety
  • transparency

How we switched the features on safely

  • Risk-ordered roll-out: read-only views first (sales summary, cash reconciliation, the operator diagnostics screen), then additive merchant tools (item snooze, running-late notice), and only last the one feature that changes data - customer self-service cancellation.
  • After turning on each group we ran the production health check and confirmed it stayed clean, then verified that the new endpoints behaved correctly and that a wrong role was always refused access.
  • Before turning anything on, every new endpoint correctly returned 'not found' - proving the features were genuinely off - and only became reachable once switched on.
  • Each feature can be switched back off individually in seconds, with no code change required.
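
The roll-out steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the production code: the flag names, the in-memory flag store, the `health_check` callable, and the 403 status for a refused role are all hypothetical. The report says only that disabled endpoints returned 'not found' and that the wrong role was refused. The sketch also assumes that a failed health check switches the latest tier back off automatically.

```python
from typing import Callable, Dict, List, Set

# Hypothetical flag names, grouped by risk tier (lowest risk first).
ROLLOUT_TIERS: List[List[str]] = [
    ["sales_summary", "cash_reconciliation", "operator_diagnostics"],  # read-only views
    ["item_snooze", "running_late_notice"],  # additive merchant tools
    ["self_service_cancellation"],  # the one feature that writes data
]


def roll_out(flags: Dict[str, bool], health_check: Callable[[], bool]) -> List[str]:
    """Enable one tier at a time, checking health after each tier.

    Assumption: if the check fails, the tier just enabled is switched
    back off and the roll-out stops. Returns the flags left enabled.
    """
    enabled: List[str] = []
    for tier in ROLLOUT_TIERS:
        for name in tier:
            flags[name] = True
        if not health_check():
            for name in tier:
                flags[name] = False  # per-feature switch, no code change
            break
        enabled.extend(tier)
    return enabled


def endpoint_status(flags: Dict[str, bool], feature: str,
                    role: str, allowed_roles: Set[str]) -> int:
    """Return an HTTP-style status for a request to a flagged endpoint."""
    if not flags.get(feature, False):
        return 404  # feature is off: the endpoint looks as if it does not exist
    if role not in allowed_roles:
        return 403  # assumed status code for "wrong role is refused"
    return 200
```

Checking the flag before the role means a disabled feature answers 'not found' to every caller, which matches the pre-launch check described above.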

How we verified it

  • Every feature was checked against the live system with real sign-ins, not assumptions: correct data shapes, access boundaries (a courier sees only their own cash, a merchant only their own sales), and sensible empty states.
  • Self-service cancellation was proven end to end on a real phone: placing a test cash order, tapping the in-app cancel button, and confirming the order moved to cancelled - plus checks that a second cancel and an unknown order are refused.
  • We confirmed the operator diagnostics screen exposes only on/off states and counts - never any personal or sensitive information.
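
The cancellation checks can be expressed as a small model. This is a sketch only: the `Order` fields, the status strings, and the `CancelRefused` exception are hypothetical names chosen for illustration, and the real checks ran against the live system rather than an in-memory dictionary.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Order:
    order_id: str
    payment: str  # "cash" or "online" (hypothetical values)
    status: str = "placed"


class CancelRefused(Exception):
    """Raised when a cancellation request is refused."""


def cancel_order(orders: Dict[str, Order], order_id: str) -> str:
    """Cancel a cash order once; refuse repeats and unknown orders."""
    order = orders.get(order_id)
    if order is None:
        raise CancelRefused("unknown order")
    if order.status == "cancelled":
        raise CancelRefused("order already cancelled")
    if order.payment == "online":
        # Per the report, online-paid orders go to the refund process
        # instead of being cancelled instantly.
        return "refund_process"
    order.status = "cancelled"
    return "cancelled"


def check_cancellation() -> None:
    """Mirror the three checks: one success, then two refusals."""
    orders = {"A1": Order("A1", "cash")}
    assert cancel_order(orders, "A1") == "cancelled"
    assert orders["A1"].status == "cancelled"
    for bad_id in ("A1", "missing"):  # second cancel, unknown order
        try:
            cancel_order(orders, bad_id)
        except CancelRefused:
            continue
        raise AssertionError(f"cancel of {bad_id} should have been refused")
```

Calling `check_cancellation()` runs the same sequence as the phone test: one successful cancel, then a refused repeat and a refused unknown order.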

Investigating the automated test failures (no shortcuts)

  • Change-address: the address pop-up opened correctly, but the cookie-consent banner was overlapping its button so an automated click could not land. This was a test gap, not a product bug - the test now dismisses the banner first, and passes.
  • Checkout cash wording: the page itself loads and shows 90+ products correctly. The failure came from the test, which was wiping its own session on each step and was sensitive to test-machine load. We made the test navigate more directly and confirmed the feature works.
  • Operator console (dispatcher): clicking 'reassign' on an already-delivered order was correctly refused by the system, but the refusal was being logged as a background error. We fixed the console to handle the expected refusal quietly - the action was always safely blocked.
  • The remaining items were test-environment or external-service timing issues, not product defects - documented openly rather than hidden.
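
The operator-console fix follows a common pattern: treat an expected refusal as a normal outcome rather than an error. The sketch below illustrates that pattern in Python under assumptions. The real console code is not shown in this report, and `ActionRefused`, `reassign`, and `handle_reassign_click` are hypothetical names.

```python
import logging

logger = logging.getLogger("operator_console")


class ActionRefused(Exception):
    """An expected, safe refusal from the backend (hypothetical type)."""


def reassign(order_status: str) -> str:
    """Stand-in for the backend call: delivered orders cannot be reassigned."""
    if order_status == "delivered":
        raise ActionRefused("order already delivered")
    return "reassigned"


def handle_reassign_click(order_status: str) -> str:
    """Return the message the operator sees after clicking 'reassign'."""
    try:
        return reassign(order_status)
    except ActionRefused as refusal:
        # Expected outcome: inform the operator and log at info level,
        # not as an error. Any other exception still propagates.
        logger.info("reassign refused: %s", refusal)
        return f"Cannot reassign: {refusal}"
```

Only the expected refusal type is caught, so a genuinely unexpected failure would still surface as an error.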

Safety and independent review

  • No payment flow was changed; online-paid orders are routed to the proper refund process rather than an instant cancel.
  • Two independent automated reviewers examined the change and raised no critical or high-risk issues; their points were either already handled by the design or noted as future user-experience improvements.
  • Shipped through the standard gated pipeline (secret scan, linting, type-checking, the full ~1,000-test suite, production build) with an automatic post-deploy health check and one-switch rollback. The live system reported zero failures and zero warnings throughout.
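
A gated pipeline of this kind can be modelled as an ordered list of checks that stops at the first failure. The sketch below is illustrative only: the `ship` function and its callables are hypothetical, and it assumes rollback is triggered automatically when the post-deploy health check fails. The report states only that the health check is automatic and that rollback takes one switch.

```python
from typing import Callable, List, Tuple

# A gate is a name plus a check that returns True when it passes.
Stage = Tuple[str, Callable[[], bool]]


def ship(stages: List[Stage], deploy: Callable[[], None],
         health_check: Callable[[], bool], rollback: Callable[[], None]) -> str:
    """Run every gate in order, deploy, then verify health."""
    for name, passed in stages:
        if not passed():
            return f"blocked at {name}"  # later gates never run
    deploy()
    if not health_check():
        rollback()  # assumed automatic here; see the note above
        return "rolled back"
    return "shipped"
```

In this model the gates would be listed in the order the report gives: secret scan, linting, type-checking, the test suite, and the production build.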

Commit: 2855cd8