Case Study

Designing a Scalable GKE Platform for a SaaS Application

AT A GLANCE

Industry: B2B SaaS — enterprise software

Platform: Google Cloud — GKE, Terraform, Cloud Build, Artifact Registry, Secret Manager

Services: GCP Architecture, Platform Engineering, DevSecOps

Location: Canada

Engagement led by Amit Malhotra, Principal GCP Architect at Buoyant Cloud Inc. — Toronto, Canada

A B2B SaaS Platform That Had Outgrown Its Infrastructure

A Canadian B2B SaaS company selling to enterprise customers in North America had built a product that worked, but on infrastructure that was never designed to support the growth it was now experiencing. The platform had been provisioned manually in the company's early stages: each environment was set up differently, deployments were handled by individual engineers who knew where everything lived, and there was no Terraform in place to reproduce or govern what had been built.

As the engineering team grew and the enterprise sales pipeline expanded, the infrastructure that had served the early product was creating problems that were becoming harder to ignore. New engineers took weeks to get productive because there was no standardized way to provision environments or understand how the platform was structured. Deployments to production required manual steps, tribal knowledge, and sign-off from the engineers who had built the original setup. And as enterprise prospects began asking detailed questions about the company's security architecture and deployment practices, the honest answers were becoming a liability in sales conversations.

The company needed a GKE platform foundation that could support multiple enterprise customers reliably, give the engineering team a consistent and automated deployment model, and satisfy the security and architecture questions that were now appearing in enterprise procurement processes.

THE CHALLENGE

What Needed to Change

  • No reproducible infrastructure: Every GCP environment had been provisioned manually with no Terraform, no version control, and no documentation. Dev, Staging, and Production had drifted significantly and there was no reliable way to reproduce any of them.

  • GKE without operational foundations: The team had adopted GKE but without a cluster architecture designed for multi-tenant enterprise workloads — no namespace isolation strategy, no pod security standards, no node pool design, and no Binary Authorization. The clusters worked for development but weren’t production-ready for enterprise customers.

  • Deployments dependent on individuals: There was no CI/CD pipeline. Releases required a specific engineer to manually build, push, and deploy — creating a bottleneck that slowed release velocity and made every deployment a risk event rather than a routine operation.

  • No secrets management: Service account keys were stored in environment variables and CI configuration. Database credentials were shared between environments. There was no centralized secrets management and no rotation in place.

  • Security gaps blocking enterprise sales: Enterprise procurement processes were surfacing questions about IAM architecture, audit logging, data isolation between customers, and deployment controls — questions the team couldn’t answer confidently with the current platform.

A Production-Ready GKE Platform Built on a Terraform Foundation

I designed and implemented a complete GKE platform foundation — architected for multi-tenant B2B SaaS workloads, automated with Terraform, and secured with the patterns enterprise customers expect. The engagement followed the SCALE Framework, with Security by Design and Automation as the two highest-priority pillars given the enterprise sales context.

  • Terraform IaC foundation: Complete Terraform codebase covering GCP project structure, VPC networking, GKE cluster configuration, IAM bindings, Artifact Registry, and Secret Manager — with separate workspaces for Dev, Staging, and Production provisioned from the same module library. Every GCP resource version-controlled, peer-reviewable, and reproducible.

  • GKE cluster architecture: Regional GKE cluster with per-customer namespace isolation, node pool strategy separating platform services from application workloads, Pod Security Standards enforcement, Binary Authorization policy requiring signed images, and Workload Identity per service account — replacing all static service account keys.

  • Multi-tenant isolation design: Namespace-per-customer architecture with Kubernetes network policies enforcing tenant boundaries, resource quotas preventing noisy-neighbour impact between customers, and RBAC scoped so application teams could access their own namespace without cluster-level permissions.

  • CI/CD pipeline implementation: Cloud Build pipelines for all services — container image build, vulnerability scanning with Artifact Analysis, image signing, and automated deployment to GKE with environment-specific promotion gates. No manual steps in any deployment path, and all pipelines authenticating through Workload Identity Federation with no service account keys.

  • Secrets management: Secret Manager integration replacing the credentials previously held in environment variables and CI configuration. Database credentials, API keys, and certificates are retrieved at runtime through Workload Identity, with audit logs for every access and automated rotation where supported.

  • Security documentation for enterprise sales: Architecture documentation covering IAM model, data isolation approach, audit logging configuration, and deployment controls — written to directly answer the security questionnaire questions that had been blocking enterprise deals.
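
To make the namespace-per-customer design above more concrete, here is a minimal Python sketch that generates the per-tenant Kubernetes objects: a namespace labelled for Pod Security Standards enforcement, a resource quota, a network policy that only admits traffic from the same namespace, and a namespace-scoped role binding. It is an illustration under stated assumptions, not the configuration used in the engagement: the `tenant-<id>` naming convention, the `restricted` enforcement level, the quota values, and the group name are all hypothetical.

```python
import json
import re

# DNS-1123 label rule that Kubernetes applies to namespace names.
_DNS_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")


def tenant_manifests(customer_id, cpu="4", memory="8Gi", max_pods="50",
                     dev_group="tenant-devs@example.com"):
    """Return the per-tenant Kubernetes objects as plain dicts.

    All defaults are hypothetical placeholders, not values from the engagement.
    """
    ns = f"tenant-{customer_id}"  # assumed naming convention
    if len(ns) > 63 or not _DNS_LABEL.fullmatch(ns):
        raise ValueError(f"invalid namespace name: {ns!r}")
    return [
        {   # Namespace with Pod Security Standards enforced at admission.
            "apiVersion": "v1",
            "kind": "Namespace",
            "metadata": {
                "name": ns,
                # "restricted" is an assumed level; the source names no level.
                "labels": {"pod-security.kubernetes.io/enforce": "restricted"},
            },
        },
        {   # Quota that caps what one tenant can request (noisy-neighbour guard).
            "apiVersion": "v1",
            "kind": "ResourceQuota",
            "metadata": {"name": "tenant-quota", "namespace": ns},
            "spec": {"hard": {"requests.cpu": cpu,
                              "requests.memory": memory,
                              "pods": max_pods}},
        },
        {   # Applies to every pod in the namespace; ingress is allowed only
            # from pods in the same namespace (empty podSelector in "from").
            "apiVersion": "networking.k8s.io/v1",
            "kind": "NetworkPolicy",
            "metadata": {"name": "same-namespace-ingress-only", "namespace": ns},
            "spec": {
                "podSelector": {},
                "policyTypes": ["Ingress"],
                "ingress": [{"from": [{"podSelector": {}}]}],
            },
        },
        {   # Binds the built-in "edit" ClusterRole inside this namespace only,
            # so the group gets no cluster-level permissions.
            "apiVersion": "rbac.authorization.k8s.io/v1",
            "kind": "RoleBinding",
            "metadata": {"name": "tenant-edit", "namespace": ns},
            "roleRef": {"apiGroup": "rbac.authorization.k8s.io",
                        "kind": "ClusterRole", "name": "edit"},
            "subjects": [{"apiGroup": "rbac.authorization.k8s.io",
                          "kind": "Group", "name": dev_group}],
        },
    ]


def render(customer_id):
    """Serialise the objects as a single JSON List document."""
    doc = {"apiVersion": "v1", "kind": "List",
           "items": tenant_manifests(customer_id)}
    return json.dumps(doc, indent=2)


if __name__ == "__main__":
    print(render("acme"))  # "acme" is a made-up customer id
```

Generating every tenant's objects from one function keeps the isolation controls identical across customers, which is the same reasoning behind provisioning all environments from one Terraform module library. A real cluster would likely need further allow rules (for example, for an ingress controller) on top of the same-namespace policy shown here.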

THE OUTCOMES

What Changed After the Engagement

  • 100% of infrastructure version-controlled in Terraform, with zero manually provisioned resources

  • Zero static service account keys remaining in any environment after the Workload Identity migration

  • ~45 min → ~8 min deployment time, with automated CI/CD replacing the manual build-and-deploy process

Beyond the metrics:

  • Enterprise security questionnaires now answered with accurate, documented evidence rather than verbal reassurances — removing a consistent blocker from the sales process

  • New engineers onboarded to the platform in days rather than weeks — environments self-service through Terraform, deployment process documented and automated

  • GKE clusters hardened to a security baseline that satisfied enterprise procurement requirements — pod security enforcement, Binary Authorization, network policy isolation, and Workload Identity throughout. See the DevSecOps & Cloud Security service for more on how this work is structured.

  • Development, Staging, and Production environments provisioned from the same Terraform codebase — eliminating the environment drift that had been causing production-only bugs.

This engagement is representative of the Platform Engineering and GCP Architecture & Modernization work I do for SaaS companies. See the SaaS & Technology Platforms industry page for more context.

The pattern is typical for Canadian B2B SaaS companies preparing for enterprise sales or SOC 2 audits. If your GKE platform was set up manually and your team is carrying deployment risk or failing security questionnaires, this is the approach I follow.
