Case Study · Migration

Replacing a brittle whitelabel stack with infrastructure the client actually owns.

Why patching the existing platform wasn't on the table, why migrating to direct cloud APIs was, and the principles I take from this work into every similar engagement.

Client
Educational events business · Israel
Engagement
Full-stack migration + retainer

The problem

A working business held together by a load-bearing third party.

The client had spent years building real business value on top of a whitelabel chat platform. The platform handled WhatsApp conversations, button-driven flows, and customer state. A no-code orchestrator on top of it handled CRM writes, image generation, and scheduled messages. A third-party CRM held the customer records.

On a good day, this worked. On a bad day — increasingly often — something inside the third-party platform would shift in a way the client couldn't see, debug, or fix. Buttons would render as plain text. Sessions would expire mid-conversation. Lead submissions would land in the inbox without ever reaching the CRM. Each failure was opaque: there was no log to read, no configuration to change, no support contact who could help on the timeline the business needed.

The decision

Stop patching. Go direct.

The platform's session model was the load-bearing wall, and it was rotting. Each new patch was buying days of stability, not solving the problem. The right move was to remove the platform from the architecture entirely — go direct to the underlying APIs, on infrastructure the client could own.

This was both cheaper and more durable. At the client's volume, going direct fit inside the underlying platform's free tier. Latency dropped because there was one less hop. Every failure mode became something the client could see, log, and fix.

The approach

Strangler pattern. Flow by flow. Owner sign-off after every cycle.

Big-bang migrations of revenue-critical systems are how clients lose customers. The right shape was a strangler pattern: keep the existing system live, build the replacement alongside it, cut over one flow at a time, and only after the owner had seen a real customer move through the new path end to end.

The work was sequenced by business risk, not technical convenience. The lead-capture flow — most fragile, most revenue-critical — went first. Once that ran cleanly for a week, we cut over the order flow. Then the subscription lifecycle. Then the long tail of one-off automations the orchestrator was quietly handling.

What's running now

Owned end to end.

The client now owns the integration with their messaging provider, the customer conversation history, the CRM sync logic, and every line of code that touches them. New flows ship in hours instead of days because the conversation logic lives in prompt and tool definitions, not in a third-party flow-tree UI. Fixed monthly infrastructure cost is roughly free at current volume.

When something breaks, there's a log. When a behavior needs to change, there's a single repo to change it in. When the business outgrows current capacity, the architecture scales by the same edge primitives it was built on.

What's reusable

Three principles for anyone running a business on a third-party platform.

  • Audit your dependencies for opacity, not uptime. A platform with 99.9% uptime that you can't debug when it misbehaves is more dangerous than a platform with 99.5% uptime that exposes its internals. Visibility matters more than promised availability.
  • Own the wire. If a vendor sits between your business and an API you fundamentally depend on, every change they make is a silent risk to you. Going direct trades a small amount of integration work for a permanent reduction in surface area.
  • Strangler pattern, not rewrite. Keep the old system running. Replace it flow by flow. Sign off each cut-over with the owner watching a real customer move through the new path. There's no faster way to ship — and no slower way to break trust — than the alternative.