Overview
Five Convoy Engineers Just Approved a Workday MFA Push They Didn't Send
It's a Tuesday morning at Convoy, the B2B collaboration platform serving ~600 paying organizations. About 55% of your customer base is in the EU, which means GDPR applies to most of what you process. The platform runs on Next.js 15 App Router on Vercel, with Supabase doing auth, Postgres, and the audit log; Stripe handles billing; Resend handles transactional email; Sentry catches the exceptions.
At 09:42 your Okta admin notices five MFA push approvals from Convoy engineers within a 90-second window, from IPs in three different countries. The sign-ins that triggered the pushes started on convoy-workday[.]com, not your domain. Three of the five engineers have prod Supabase access via service-role-equivalent JWTs cached on their dev machines. Push notifications are still allowed for Convoy prod-admin accounts because phishing-resistant MFA has been on the roadmap for a quarter without shipping. The attacker is now somewhere inside the perimeter. The question is how deep, how fast, and what they did with the Supabase tokens they grabbed.
Objectives: detection-to-declaration speed; containment vs. customer-impact tradeoff; GDPR Article 33 72-hour clock awareness; external comms discipline; recovery sequencing; evidence integrity.
Roles & Assignments
Each named role owns at least one decision moment. Fill in the player column before kickoff.
| Role | Player | Responsibilities | Owns Decisions In |
|---|---|---|---|
| Incident Commander | | Declares severity, calls phase transitions, drives the response. | Phase 1, Phase 4 |
| On-call SRE / Tech Lead | | Hands-on Supabase + Vercel; JWT rotation and audit-log preservation. | Phase 2 |
| Security Lead | | Phishing-kit triage, scope assessment, evidence chain. | Phase 1, Phase 4 |
| Comms Lead | | Status page, customer email drafts, press posture. | Phase 3 |
| Legal / DPO (Alma) | | GDPR Article 33 clock, DPA obligations, evidence hold. | Phase 2, Phase 3 |
| Exec Sponsor (CEO) | | Board-prep relays, customer-escalation absorber. | Phase 3 |
| Scribe | | Maintains the decision log and timeline in real time. | All phases |
| Facilitator | | Runs the exercise, drops injects, manages the timer, enforces tempo. | (Out of play) |
Phase 1 — First sirens: five push approvals in ninety seconds T+0:00 → T+0:22
An Okta admin notices the impossible-geography pattern. The team has minutes to call this an incident before it scales.
It's 09:42 on a Tuesday morning. The deploy bot in #releases announced four hours ago that Convoy v0.4.2 went green — a routine release that touched the org settings page and one Supabase migration. Sarah from platform is half-watching the Vercel deploy dashboard while her oat-milk latte cools. Devon, your security lead, is reading a blog post about Supabase RLS edge cases. The Slack channel #alerts-okta is mostly quiet on a normal morning; today, it lights up. Five push-MFA approvals from Convoy engineers in ninety seconds, from IPs in three different countries — Bucharest, Lagos, somewhere in the western US. All five are from the same lure URL: convoy-workday[.]com. The push notifications are still on the engineers' phones. They all hit "approve." Devon stops reading. He pings the room: "…we should look at this."
Okta alert posted to #alerts-okta: Anomalous activity detected: 5 successful MFA push approvals from 3 distinct ASNs within 90 seconds, all targeting Convoy-prod Okta tenant. Affected users: sarah.lin, devon.kim, marcus.weber, ines.ortega, hannah.bao @convoy.app. Lure URL pattern (extracted from email gateway): convoy-workday[.]com/sso?next=…. The lure was sent at 09:30 from noreply@convoy-workday[.]com.
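The pattern in this alert (five users, three ASNs, ninety seconds) can be expressed as a sliding-window check. The following is a minimal sketch, not Okta's detection logic: the `PushApproval` record, the `find_burst` helper, the ASN labels, and the sample timestamps are all hypothetical, and only the usernames and the thresholds come from the alert text.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class PushApproval:
    user: str
    asn: str        # hypothetical label for the approving IP's network
    at: datetime


def find_burst(events, window=timedelta(seconds=90), min_users=5, min_asns=3):
    """Return the first run of approvals inside `window` that meets both thresholds."""
    ordered = sorted(events, key=lambda e: e.at)
    start = 0
    for end in range(len(ordered)):
        # Advance the left edge until the span fits inside the window.
        while ordered[end].at - ordered[start].at > window:
            start += 1
        current = ordered[start:end + 1]
        users = {e.user for e in current}
        asns = {e.asn for e in current}
        if len(users) >= min_users and len(asns) >= min_asns:
            return current
    return None


# Illustrative timestamps only: the scenario gives the 90-second window, not exact times.
base = datetime(2025, 3, 4, 9, 40, 0)
sample = [
    PushApproval("sarah.lin", "AS-A", base),
    PushApproval("devon.kim", "AS-B", base + timedelta(seconds=20)),
    PushApproval("marcus.weber", "AS-A", base + timedelta(seconds=41)),
    PushApproval("ines.ortega", "AS-C", base + timedelta(seconds=63)),
    PushApproval("hannah.bao", "AS-B", base + timedelta(seconds=88)),
]

burst = find_burst(sample)
print(len(burst) if burst else "no burst")
```

A single user approving a push from an unusual network would not trip this check; the signal is the combination of several users and several networks in one short window, which is what separates a coordinated phish from ordinary Okta noise.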
Three of the five users (Sarah, Marcus, Hannah) have prod Supabase access via service-role-equivalent JWTs cached on their dev machines from past migrations work.
Slack channel #ir-active is created at 09:48. The IC declares SEV-2 and pages the on-call SRE, security lead, and General Counsel. Alma replies in 80 seconds: "On my way. Don't delete anything. Don't restart anything. Tell me what you've touched." The IR runbook on Notion is pulled up — last updated 4 months ago. The phone tree at the bottom lists the previous CISO, who left in February.
Devon and Sarah start grepping the Okta audit log without paging anyone else. They're 8 minutes into "let's see if this is a real thing" when the customer success manager drops into #general with a screenshot from a customer's IT lead asking "Did Convoy just open an OAuth app in our Microsoft Entra tenant?"
A Twitter monitor pings: an account @b0n3saw posted "interesting day at the convoy.app office. five phish hits and a sleepy soc." — 14 retweets in 4 minutes. The screenshot they attached shows what looks like the inside of an Okta admin panel for the Convoy tenant, with an OAuth grant timestamped 09:39.
Sarah pulls the Supabase Studio audit log for the last hour. At 09:38, four minutes before the Okta push approvals were noticed, a new service_role key was created in the Convoy production project. The actor field reads system@supabase.io, but Sarah remembers from the Supabase docs that this is the dashboard placeholder for any logged-in admin user. She runs a query against auth.audit_log_entries and sees her own user_id signing in from an IP in Bucharest at 09:33. She is currently in Berlin.
Severity call. SEV-1 (full activation: CEO, board chair notified, public-facing comms team activated, GDPR clock-tracking begins now even before personal data exposure is confirmed) or SEV-2 (handled inside engineering and security, escalation only on confirmed exposure)? Early signals are real but unconfirmed.
Owner: Incident Commander. Document: who called it, at what T+time, with what rationale, who was in the loop.
The severity call shapes Phase 2:
- If SEV-1: the GDPR clock-start timestamp is recorded in writing immediately. Alma joins early in Phase 2 with the breach register already pulled up.
- If SEV-2: Alma's appearance in Inject 2.3 is more abrupt; the clock-start timestamp is contested in the hotwash.
Detection-to-declaration speed. Triage discipline (Okta noise vs. coordinated phish). Severity calling under partial signal. Cross-team escalation hygiene.
Phase 2 — The token is the attacker now: Supabase under siege T+0:22 → T+0:45
The attacker pivoted from Okta into Supabase. The team must decide whether to invalidate every customer JWT or trace first.
Twenty-two minutes in. The room has settled into the rhythm of an incident — three laptops open to Supabase dashboards, two to Okta, Alma is on a tablet with the GDPR breach register pulled up, the CEO is in his office on a "give me ten minutes" hold. Devon discovers the next thing. The Supabase JWT validator pool — the keys that sign and verify auth tokens for the Convoy app — has, when he checks the configuration, eleven entries. Nine of them, your team can identify. Two of them, your team cannot.
Devon, in #ir-active: "I'm looking at the JWT signer pool. Eleven keys. Nine I can map to past rotations. Two — call them K10 and K11 — were created today at 09:39 and 09:41. They were created via the Supabase Management API using the new service_role key that appeared in the audit log earlier. Both K10 and K11 are currently valid for token issuance. Anyone holding a token signed with K10 or K11 is, as far as Supabase is concerned, a legitimate Convoy admin."
He adds: "I do not know how many tokens have been issued against K10 or K11. The audit log doesn't capture token-issuance volume in a queryable form. I can rotate them in 90 seconds. Doing so logs every Convoy user out — paying customers, EU and US, mid-Tuesday-morning, no warning."
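Devon's mapping exercise is a set difference between the keys currently configured and the keys the team can tie to a documented rotation. A minimal sketch, assuming a hypothetical key record shape (`id`, `created_at`) rather than any real Supabase API response; the creation dates for the nine known keys are placeholders, the calendar date is invented, and only the 09:39 and 09:41 times come from the inject.

```python
from datetime import datetime


def unmapped_keys(configured, known_ids):
    """Return configured keys with no matching rotation record, oldest first."""
    unknown = [k for k in configured if k["id"] not in known_ids]
    return sorted(unknown, key=lambda k: k["created_at"])


# Hypothetical rotation register: nine keys the team can account for.
known_ids = {f"K{n:02d}" for n in range(1, 10)}

# Placeholder creation dates for K01..K09; the calendar date for K10/K11 is an assumption.
configured = [
    {"id": f"K{n:02d}", "created_at": datetime(2024, n, 1, 12, 0)}
    for n in range(1, 10)
]
configured += [
    {"id": "K10", "created_at": datetime(2025, 3, 4, 9, 39)},
    {"id": "K11", "created_at": datetime(2025, 3, 4, 9, 41)},
]

suspect = unmapped_keys(configured, known_ids)
print([k["id"] for k in suspect])
```

The check only works if the rotation register is kept current. A key that was rotated in legitimately but never recorded would show up here as a false positive, which is worth raising in the hotwash.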
Sarah runs a query against the audit_log table filtered by the past hour. She finds 1,847 rows of select actions against documents, org_members, and audit_log itself, all from the suspect service_role key. The reads span 34 customer organizations. Twenty-six of those orgs are in the EU. Ten of them have data_region = 'eu-west' and explicit DPAs that name the Berlin BfDI as the lead supervisory authority. The reads include, by row count, around 11,000 rows from documents — the body field is jsonb and may contain anything customers pasted in.
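The scope numbers Sarah reports (rows, organizations, EU organizations, `eu-west` organizations) are a grouping over the audit rows joined to org metadata. A minimal sketch on a handful of made-up rows; the row shape, the `in_eu` flag, and the org names are hypothetical, and only the table names and the `data_region = 'eu-west'` value come from the inject.

```python
from collections import Counter


def summarize_reads(rows, orgs):
    """Summarize suspect reads by organization, region, and table."""
    org_ids = {r["org_id"] for r in rows}
    eu_orgs = {o for o in org_ids if orgs[o]["in_eu"]}
    eu_west = {o for o in eu_orgs if orgs[o]["data_region"] == "eu-west"}
    return {
        "rows": len(rows),
        "orgs": len(org_ids),
        "eu_orgs": len(eu_orgs),
        "eu_west_orgs": len(eu_west),
        "by_table": dict(Counter(r["table"] for r in rows)),
    }


# Hypothetical org metadata: EU presence and data region are tracked separately.
orgs = {
    "org_a": {"in_eu": True, "data_region": "eu-west"},
    "org_b": {"in_eu": True, "data_region": "us-east"},
    "org_c": {"in_eu": False, "data_region": "us-east"},
}

# Made-up audit rows attributed to the suspect key.
rows = [
    {"org_id": "org_a", "table": "documents"},
    {"org_id": "org_a", "table": "org_members"},
    {"org_id": "org_b", "table": "documents"},
    {"org_id": "org_c", "table": "documents"},
    {"org_id": "org_c", "table": "audit_log"},
]

summary = summarize_reads(rows, orgs)
print(summary)
```

Keeping EU presence and data region as separate fields mirrors the inject, where 26 of the affected orgs are in the EU but only 10 carry the `eu-west` region and the explicit DPA.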
Alma Voss, General Counsel & acting DPO. Calm, deliberate, in writing.
"I need to put a number on the GDPR clock. When did you confirm a personal-data exposure? Because the answer to that question determines whether we have 71 hours and 41 minutes left or whether we have 24 hours left. I am not asking you to know what was taken. I am asking you to tell me, for the breach register, when you became aware that personal data had been accessed by an unauthorized third party. Give me a timestamp. In writing."
Facilitator note: Alma's job is to force a precise awareness timestamp into the decision log. Article 33 starts the 72-hour clock at awareness, not at confirmation. The team must commit.
Burn the keys, or trace. Rotate K10 and K11, invalidate every active Convoy JWT, and log every paying customer out at 09:55 on a Tuesday — or leave the keys in place for another 20 minutes while Devon traces what the attacker is doing with them, accepting that data exfil may continue during the trace window.
Owner: Incident Commander, with Tech Lead's input on rotation feasibility and Comms Lead's input on customer-impact framing. Alma logs the time of call regardless of outcome.
- If rotate K10/K11 immediately: customer-impact noise dominates Phase 3; Inject 3.2 (Beck Industries) lands harder.
- If trace first: Inject 4.1's smoking gun arrives later and Decision 4 is harder because trace data is partial.
Scope assessment under uncertainty. Containment-vs-availability tradeoff. Cross-team coordination with the DPO. Discipline of recording an awareness timestamp in writing.
Phase 3 — The 72-hour clock has started: DPO arrives T+0:45 → T+1:12
Three external pressures arrive at once: the supervisory authority, the largest EU customer, and the press.
Forty-five minutes in. The keys are either rotated or they aren't, depending on the call you made. The CEO has come into the room. The customer success manager is being pinged by three different account executives whose customers have noticed something — either the forced logout, or the slowness, or, in one case, an alert from their own SOC tools that flagged a session-token anomaly. A reporter from a security trade publication has emailed press@convoy.app. Alma is now on a separate phone with someone from the Berlin BfDI's office whose name she does not yet have. The 72-hour clock is running, and the room must decide what to put in the customer email.
Alma, in #ir-active: "Per the breach register, awareness of personal-data exposure was at 10:12 (T+30 in your exercise time). Article 33 obligation to notify the supervisory authority is therefore by 10:12 on Friday. The notification must include: nature of the breach, categories and approximate number of data subjects, categories and approximate number of records, name and contact of the DPO, likely consequences, measures taken or proposed. Drafting starts now. I need a four-line factual summary from engineering by 11:30. I will not file with anything I cannot defend in writing."
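Alma's arithmetic can be checked mechanically: convert the exercise offset to wall-clock time, then add 72 hours. A minimal sketch using naive local datetimes; the calendar date is an assumption (the scenario only says Tuesday, so an arbitrary Tuesday is used), and the helper names are hypothetical.

```python
from datetime import datetime, timedelta

NOTIFICATION_WINDOW = timedelta(hours=72)  # Article 33(1) window, counted from awareness


def exercise_to_wall_clock(t_zero, minutes):
    """Convert a T+ offset in minutes to wall-clock time."""
    return t_zero + timedelta(minutes=minutes)


def notification_deadline(awareness):
    """Latest time to notify the supervisory authority."""
    return awareness + NOTIFICATION_WINDOW


def time_remaining(awareness, now):
    """Time left on the clock at `now`."""
    return notification_deadline(awareness) - now


# Assumed date: 2025-03-04 is a Tuesday; T+0:00 is 09:42 per the scenario.
t_zero = datetime(2025, 3, 4, 9, 42)
awareness = exercise_to_wall_clock(t_zero, 30)    # T+30
deadline = notification_deadline(awareness)
left_at_phase_3 = time_remaining(awareness, exercise_to_wall_clock(t_zero, 45))

print(awareness, deadline, left_at_phase_3)
```

The deadline moves with the awareness timestamp, not with the severity call or the moment of confirmation, which is why the scribe needs that timestamp in the decision log before anything else is drafted.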
Email forwarded to the CEO, from the CISO of Beck Industries (Convoy's largest EU customer, ACV €340k, contract renewal in 11 days):
"We've detected anomalous read patterns against our org from a Convoy service account, timestamped between 09:39 and 09:54 today. We are invoking the security incident clause in our DPA. We require, within four hours: (1) confirmation of whether our data was accessed, (2) the exact records or row counts, (3) your supervisory authority filing reference number once obtained. Failure to provide will trigger contractual remediation review. — Hendrik Beck, CISO."
The CEO turns to the IC. "What do I tell him?"
Email to press@convoy.app from a reporter at The Register: "I've heard from two sources that Convoy is dealing with a token-theft incident this morning that may have touched EU customer data. I'm filing at 17:00 today. Would love a comment. Quote from a named spokesperson preferred. If no comment, I'll publish with 'an anonymous source familiar with the matter.'"
The reporter is polite. The clock is real. They have your domain in the subject line and at least one specific detail — the words token theft — that suggest they have an actual source.
The customer-comms call. Send a customer email by 12:00 today (an Article 34 communication to data subjects, in plain language, while the team still doesn't have a final scope), or hold the email until tomorrow morning when forensics is firmer, accepting that customers will likely hear from the press first. Third path: send today to only the 26 EU customer orgs flagged in Inject 2.2, with a generic statement to everyone else.
Owner: CEO + Comms Lead, with Alma's veto on anything that conflicts with Article 33/34 obligations.
- If customer email goes out today: Inject 3.3 (reporter) plays differently — comms can offer named on-record statement.
- If held until tomorrow: reporter publishes with anonymous-source quote, intensifying Decision 4.
External communications under regulatory pressure. Discipline against premature precision. Customer-notification timing and scope. Holding the line on "we don't know yet" while three audiences demand specifics.
Phase 4 — Recovery: the smoking gun and the dangling thread T+1:12 → T+1:30
The technical incident is contained. A new finding emerges that won't fit in today's response.
Seventy-two minutes in. The customer email — whichever version you sent — has gone out, or hasn't. The supervisory authority filing draft is in Alma's inbox. The keys are rotated. The Okta tenant has had push-MFA disabled for prod-admin accounts pending phishing-resistant MFA rollout. Customers are emailing back, mostly calm, a few not. And Devon, who's been quiet for the last ten minutes, finds something.
Devon pulls the email gateway logs and isolates the original phishing email. The lure URL convoy-workday[.]com/sso?next=… resolves to a Cloudflare-fronted server that, when he requests the SSO page, returns a near-perfect clone of Convoy's actual Workday SSO page. But the form submission goes to /api/relay, which forwards the legitimate auth headers to the real Convoy Okta and pockets the id_token and the SAML response as they pass through. The kit is a known commercial AiTM (adversary-in-the-middle) framework called EvilProxy v3.4. It bypasses both TOTP and push MFA by relaying the entire authenticated session in real time. The phishing-resistant MFA controls Convoy had been planning to roll out (FIDO2 / WebAuthn) would have stopped this.
The five engineers approved the push because, from their perspective, they had just clicked a Workday password-reset link and authenticated normally. The push felt expected.
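The reason WebAuthn resists this kind of relay is origin binding: the browser writes the page's origin into the signed client data, so an assertion produced on the lure domain does not verify for the real one. The following is a simplified sketch of that single check, not a complete WebAuthn verification (a real relying party also verifies the challenge, the RP ID hash, and the signature). The origins and helper names are hypothetical.

```python
import base64
import json

EXPECTED_ORIGIN = "https://sso.convoy.example"  # hypothetical relying-party origin


def b64url_decode(data: str) -> bytes:
    """Decode base64url text that may be missing its padding."""
    padding = "=" * (-len(data) % 4)
    return base64.urlsafe_b64decode(data + padding)


def encode_client_data(origin: str, challenge: str = "demo-challenge") -> str:
    """Build a sample clientDataJSON the way a browser would for an assertion."""
    payload = {"type": "webauthn.get", "challenge": challenge, "origin": origin}
    raw = json.dumps(payload).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def origin_matches(client_data_b64: str, expected_origin: str) -> bool:
    """Check only the type and origin fields of the client data."""
    client_data = json.loads(b64url_decode(client_data_b64))
    return (
        client_data.get("type") == "webauthn.get"
        and client_data.get("origin") == expected_origin
    )


genuine = encode_client_data(EXPECTED_ORIGIN)
relayed = encode_client_data("https://phish.example")  # what a lure page would produce

print(origin_matches(genuine, EXPECTED_ORIGIN), origin_matches(relayed, EXPECTED_ORIGIN))
```

A push approval carries no equivalent binding. The engineer sees a prompt on their phone and has no way to tell which site started the sign-in, which is why the relay worked here.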
The K10/K11 keys are rotated. New JWTs are issued. The Supabase audit log has been preserved to a separate S3 bucket. The audit log query Sarah ran shows no further reads from the suspect service_role key. Customer support volume is settling. Then Alma, in #ir-active: "For the Article 30 record of processing activities — the reads against the affected EU customer orgs included documents from a documents.body jsonb field. Our RoPA does not currently classify the contents of that field. Customers paste anything in there. We may have processed special categories of personal data without a lawful basis recorded for that category. Flag this for the next DPIA cycle."
It's not an active fire. It's a finding that's now load-bearing for the AAR.
When do we call it done? Downgrade from SEV-1 to SEV-2 now (the technical incident is contained, the regulatory clock is being managed), or hold SEV-1 until the customer email cycle has run its course (24 to 48 more hours). The team is tired. The CEO has a board call at 14:00. Alma needs the breach register entry locked in by end of day. There is no objectively right answer.
The dust starts to settle. Customers who logged in today saw something different from what they expected, and a portion of them — you don't yet know how many — read the email and decided whether to trust you. The Berlin BfDI filing is drafted but not yet sent. The Register will publish at 17:00 with or without your quote. Two of K10/K11's actual issued tokens are still unaccounted for in your tracing — they may have expired naturally, or they may not. The forensic timeline goes to outside counsel tomorrow. The AAR meeting is on Thursday. Sarah closes her laptop. Her oat-milk latte is cold.
Recovery sequencing and verification. Evidence integrity for the post-incident inquiry. Discipline against premature all-clear. Honest characterization of what is still unknown.
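On the evidence-integrity point, one common way to make a preserved log export tamper-evident is to record a SHA-256 digest of each file at preservation time and re-check it before handing the files to outside counsel. A minimal sketch with in-memory stand-ins for the exports; the file names and contents are hypothetical, and nothing here reflects how the scenario's S3 copy was actually made.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(exports: dict) -> dict:
    """Map each export name to its digest at preservation time."""
    return {name: sha256_hex(blob) for name, blob in sorted(exports.items())}


def changed_files(exports: dict, manifest: dict) -> list:
    """Names that are missing or whose content no longer matches the manifest."""
    return sorted(
        name
        for name, digest in manifest.items()
        if name not in exports or sha256_hex(exports[name]) != digest
    )


# Hypothetical exports captured during the incident.
preserved = {
    "audit_log.jsonl": b'{"action": "select", "table": "documents"}\n',
    "okta_system_log.jsonl": b'{"event": "mfa_push_approved"}\n',
}
manifest = build_manifest(preserved)

# Simulate a later copy in which one file was altered.
later_copy = dict(preserved)
later_copy["audit_log.jsonl"] = b'{"action": "select", "table": "org_members"}\n'

print(changed_files(preserved, manifest), changed_files(later_copy, manifest))
```

The manifest is only useful if it is stored and timestamped separately from the files it describes. Recording its creation time in the decision log is one way to do that.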
Coverage matrix
Each inject and decision tagged against NIST CSF 2.0, SOC 2 (Trust Services Criteria), and GDPR (EU 2016/679). All control IDs cited here appear in the source reference at references/framework-controls.md. A — indicates the inject does not meaningfully touch that framework.
| Inject / Decision | NIST CSF | SOC 2 | GDPR |
|---|---|---|---|
| Inject 1.1 — Five-push pattern | DE.AE-02 | CC7.2 | — |
| Inject 1.2A — IR plan retrieved | RS.MA-01 | CC7.4 | — |
| Inject 1.2B — Quiet investigation | RS.MA-02 | CC7.3 | — |
| Inject 1.2C — Twitter post | DE.AE-07 | CC2.3 | — |
| Inject 1.3 — Supabase audit log lights up | DE.CM-09 | CC6.1 | — |
| Decision 1 — Severity call | RS.MA-03 | CC7.3 | — |
| Inject 2.1 — Phantom JWT keys | PR.AA-05 | CC6.3 | — |
| Inject 2.2 — Reads against EU customer orgs | DE.AE-04 | CC6.7 | Art. 32 |
| Inject 2.3 — Alma the DPO appears | GV.RR-02 | CC2.2 | Art. 33 |
| Decision 2 — Burn keys vs. trace | RS.MI-01 | CC6.1 | — |
| Inject 3.1 — Alma's clock is official | RS.CO-02 | CC2.3 | Art. 33 |
| Inject 3.2 — Beck Industries demands answer | RS.CO-02 | CC2.3 | Art. 28(3)(f) |
| Inject 3.3 — Reporter has a tip | — | CC2.3 | Art. 34 |
| Decision 3 — Customer-comms call | RS.CO-02 | CC2.3 | Art. 34 |
| Inject 4.1 — Smoking gun (EvilProxy) | RS.AN-03 | CC7.4 | — |
| Inject 4.2 — RoPA gap on documents.body | — | CC9.1 | Art. 30 |
| Decision 4 — All-clear call | RC.RP-04 | CC7.5 | Art. 33(5) |
Decision Log
Capture every consequential decision in real time. The Time column auto-fills from the exercise timer when you add a row.
| Time | Decision | Owner | Rationale |
|---|---|---|---|
Hotwash
Run this in the room while everyone's still warm. 10 minutes max. Don't let it slide into the AAR.
After-Action Capture
Fill these in within 24 hours while details are fresh. The /tabletop-aar skill ingests this section to draft the formal AAR.