Professional case study

High-Availability Game Server Platform Architecture

A cloud-native foundation that absorbs launch-day traffic for a small team to run

  • SZ Code Lab
  • ASP.NET Core
  • Node.js
  • TypeScript
  • Docker
  • Kubernetes (EKS)
  • AWS
  • Terraform
  • Photon Engine
  • CloudWatch
  • Splunk
  • ELK
Availability
Zero-downtime rolling updates
Traffic
HPA-driven pod scale-out for spikes
Environments
Dev · Staging · Prod (IaC)
Ops team
Operated by a small engineering team

Challenge

Two projects arrived around the same time with very different server-side asks.

  • City of Holdem — Unify the client and server so feature developers could write cross-platform content logic in one language.
  • Golden Mango Casino — Survive global-launch and marketing-campaign traffic spikes, and stay manageable by a small engineering team.

Solution

City of Holdem — a unified client/server workflow

  • Prototyping: Abstracted the network module behind an Adapter, started development against a mock server, and moved to Photon Engine for real-time multiplayer prototyping.
  • Live architecture: Rewrote the server side in ASP.NET Core so client and server share the same C# content logic. The same engineer can implement a feature end-to-end without context-switching languages.

Golden Mango Casino — cloud-native infrastructure

graph TD Terraform["Terraform (IaC)
VPC · EKS · IAM · ECR
Git-tracked, version-pinned"] AWS["AWS
VPC / EKS / ELB / CloudWatch"] EKS["Kubernetes (EKS)
Rolling updates · HPA"] Pods["Game API Pods
Node.js / TypeScript
Dockerized"] Telemetry["Telemetry
CloudWatch + Splunk + ELK"] Envs["Environments
Dev / Staging / Prod"] Terraform --> AWS Terraform --> Envs AWS --> EKS EKS --> Pods Pods --> Telemetry Envs --> EKS
  • Containerization — TypeScript/Node.js API servers packaged into Docker images so the runtime environment was identical from a dev laptop to production.
  • IaC with Terraform — VPC, EKS, IAM, ECR — everything declared as code. Differences between Dev / Staging / Prod live as variables only, which kills the silent environment drift that normally rots multi-env setups.
  • Orchestration on Kubernetes — Distribution, scaling, recovery delegated to K8s. Predictable spikes (marketing campaigns) ride on HPA, and rolling updates keep deploys zero-downtime.
  • Telemetry layered intentionally — CloudWatch + Splunk for infra load, ELK for application/logic errors. Each channel watches what it should watch.

Achievements

  • Zero-downtime live updates and elastic traffic handling. Kubernetes rolling updates plus HPA absorbed major marketing-driven spikes with no service interruptions.
  • A small team running a complex cloud. Terraform + IaC discipline made the entire ecosystem manageable, extendable, and reproducible by a small engineering team.
  • Faster feature cycles thanks to client/server unification. In City of Holdem, a content change is one flow on one language instead of two parallel PRs in two stacks.

This case touches operational data, so I don’t share code excerpts or log captures publicly. The summary stays at the structural and decision level; concrete numbers and trade-offs are best discussed in an interview.

← Back to portfolio