Challenge
Every release required planned maintenance because too many moving parts had to change together: client builds across five platforms, a shared backend, CDN-hosted assets, and dedicated Unreal Engine servers on EC2. The live game could not safely keep running while those components changed underneath it.
The release surface included:
- Windows Store
- Mac App Store
- iOS
- Android
- Apple TV
- Client builds
- Common backend services
- Admin panels and SQL-backed tooling
- CDN-hosted graphical assets
- Dedicated UE game servers on EC2
Approach
- Wrote a comprehensive release checklist that dictated sequencing for backend, game servers, CDN, frontend, QA, and DevOps.
- Mapped game servers to specific client versions and build numbers.
- Mapped CDN asset sets to specific client versions and build numbers.
- Designed one-click swaps for choosing which CDN, game server set, and backend environment were active for new traffic.
- Introduced a simple folder/version structure to hold multiple CDN and game-server payloads without materially increasing infrastructure cost.
- Added forced-update parameters so incompatible clients could be controlled deliberately instead of relying on maintenance windows.
- Shifted heavy deployment steps earlier in the release flow so CDN and server assets were already staged before the critical cutover window.
How the system worked
- A release checklist defined the exact order of operations.
- Backend, game servers, CDN, and client compatibility were versioned together instead of being swapped ad hoc.
- DevOps followed the checklist exactly, including when to stage assets early and when to perform environment swaps.
- QA, development, and DevOps each had clearly timed responsibilities during the release window.
- New clients were routed to the correct backend, CDN assets, and server fleet while the existing live game remained available for active users.
Results
- Maintenance windows were no longer required for normal releases.
- Release execution became predictable because backend, CDN, and server swaps were controlled instead of improvised.
- Live users were protected while new builds rolled out across multiple stores and platform approval timelines.
- Coordination improved because QA, development, and DevOps all worked from the same operational sequence.
Key metrics
| Area | Before | After |
|---|---|---|
| Release availability | Maintenance required | True zero-downtime release flow |
| Version coordination | Manual cross-checking | Explicit client, CDN, server, and backend mapping |
| Cutover control | Multi-step manual switching | One-click environment swaps |
| Release preparation | Critical deployments during release window | Early staging reduced release-time pressure |
- Version-aware routing was the key to removing maintenance windows safely.
- Operational sequencing mattered as much as infrastructure design.
- Clear timing across QA, dev, and DevOps turned release execution into a repeatable system instead of a fragile event.