
ADR-020: Shared Kubernetes Cluster with PSX

Status

Proposed

Context

PSSaaS currently runs on local Docker Compose for development. As the platform matures toward staging and eventual production, it needs a deployment topology that supports:

  • Staging access for WTPO validation (collaborators can't run Docker Compose locally)
  • Cross-platform integration testing (PSX calls PSSaaS BestEx API, PSSaaS calls PSX pricing API)
  • Shared identity infrastructure (single sign-on across the PowerSeller ecosystem)
  • Cost efficiency (avoiding duplicate infrastructure for a startup-phase product)

PSX has already established the infrastructure pattern (PSX ADR-085, accepted 2026-03-29):

  • Single AKS cluster (pss-cluster) with namespace isolation
  • pss-platform namespace for shared services (Keycloak, MCP server, knowledge DB, Vault)
  • psx-staging / psx-production namespaces for PSX application workloads
  • psx-dev namespace on a bare-metal K3s server (192.168.143.121) for local dev
  • PSX ADR-085 line 36 explicitly anticipates: pssaas-* — PowerSeller SaaS namespaces (future)

The question is whether PSSaaS should adopt this same topology or stand up independent infrastructure.

Decision

PSSaaS adopts the shared cluster model

PSSaaS will deploy to the same clusters as PSX (AKS for staging and production, bare-metal K3s for dev), using dedicated namespaces:

| Namespace | Location | Contains |
|---|---|---|
| pssaas-dev | Bare-metal K3s (alongside psx-dev) | .NET API, SQL Server container, Redis, React frontend (when built) |
| pssaas-staging | AKS cluster (alongside psx-staging) | Same app stack, SQL MI connection instead of local SQL Server |
| pssaas-production | AKS cluster (future) | Same image SHAs as staging |
| pss-platform | AKS cluster (shared, already exists) | Keycloak, Vault, MCP server (used by both PSX and PSSaaS) |
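
As a rough illustration of the namespace layout above, the following sketch builds Kubernetes Namespace objects as plain Python dictionaries and prints them as JSON, which `kubectl apply -f` accepts. The namespace names and their clusters come from the table; the label keys (`product`, `environment`, `cluster`) and the helper names are assumptions made for this example, not part of the decision.

```python
import json

# Namespace -> cluster, taken from the table above.
PSSAAS_NAMESPACES = {
    "pssaas-dev": "k3s",
    "pssaas-staging": "aks",
    "pssaas-production": "aks",
}


def namespace_manifest(name, cluster):
    """Build a core/v1 Namespace object for one PSSaaS environment."""
    environment = name.split("-", 1)[1]  # "pssaas-staging" -> "staging"
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {
            "name": name,
            # All three label keys are hypothetical, chosen for illustration.
            "labels": {
                "product": "pssaas",
                "environment": environment,
                "cluster": cluster,
            },
        },
    }


def manifests_for(cluster):
    """Return the Namespace objects that belong on the given cluster."""
    return [
        namespace_manifest(name, target)
        for name, target in PSSAAS_NAMESPACES.items()
        if target == cluster
    ]


if __name__ == "__main__":
    # Wrap the items in a v1 List so the output is a single JSON document.
    bundle = {"apiVersion": "v1", "kind": "List", "items": manifests_for("aks")}
    print(json.dumps(bundle, indent=2))
```

pss-platform is deliberately absent from the mapping because it already exists and is owned by the platform, not by PSSaaS.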

Phased adoption

This decision does not mean immediate migration. PSSaaS stays on local Docker Compose until a trigger condition is met:

| Phase | Trigger | Action |
|---|---|---|
| Current | N/A | Docker Compose on Kevin's workstation. No K8s. |
| Phase 1 | WTPO needs to test PSSaaS features | Add pssaas-staging to AKS. API + React behind ingress. SQL MI connection. |
| Phase 2 | Cross-platform integration testing needed | Add pssaas-dev to K3s bare-metal. Test PSX ↔ PSSaaS API calls locally. |
| Phase 3 | Production readiness | Add pssaas-production to AKS. Same promotion gate as PSX. |

Docker Compose remains the inner-loop dev environment even after K8s namespaces exist — they coexist, just as PSX's Docker Compose coexists with psx-dev.

What PSSaaS shares vs. isolates

Shared via pss-platform:

  • Keycloak — PSSaaS registers a new realm or client. SSO across PSX and PSSaaS.
  • Vault — PSSaaS secrets (SQL MI connection strings, API keys) stored in Vault, injected via sidecar.
  • MCP server — if PSSaaS needs AI-assisted knowledge queries (future).

Isolated per pssaas-* namespace:

  • Application database — SQL MI (not PostgreSQL). Each namespace points to its own database: PSSaaS_Dev, PSSaaS_Staging, PSSaaS_Production. Tenant databases are per-customer within the MI instance (ADR-005).
  • Redis — separate instance per namespace. No cross-product cache sharing.
  • .NET API — PSSaaS containers, built from separate Dockerfiles, separate CI/CD pipeline.
  • React frontend — PSSaaS-specific UI (when built).
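
The per-namespace database rule can be expressed as a small lookup with an isolation check. This is only a sketch: the database names come from the list above, while the function names and the error behaviour are assumptions for illustration.

```python
# Application database per namespace, as listed above.
DATABASE_BY_NAMESPACE = {
    "pssaas-dev": "PSSaaS_Dev",
    "pssaas-staging": "PSSaaS_Staging",
    "pssaas-production": "PSSaaS_Production",
}


def database_for(namespace):
    """Return the application database for a pssaas-* namespace."""
    try:
        return DATABASE_BY_NAMESPACE[namespace]
    except KeyError:
        raise ValueError(f"{namespace!r} is not a PSSaaS namespace") from None


def is_isolated(mapping):
    """True when no two namespaces point at the same database."""
    databases = list(mapping.values())
    return len(set(databases)) == len(databases)
```

Rejecting unknown namespaces outright, instead of falling back to a default database, keeps a misconfigured pod from silently connecting to another environment's data.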

Networking

  • AKS (staging/production): Cross-namespace access via Kubernetes DNS. PSSaaS calls PSX pricing API at psx-api.psx-staging.svc.cluster.local. PSX calls PSSaaS BestEx API at pssaas-api.pssaas-staging.svc.cluster.local.
  • K3s dev: Same DNS pattern. Both psx-dev and pssaas-dev namespaces on the same K3s node.
  • Ingress: PSSaaS staging gets its own ingress rules under pssaas.staging.powerseller.com (consistent with ADR-017 hostname convention). PSX staging remains at psx.staging.powerseller.com.
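
The cross-namespace names above follow the standard Kubernetes pattern `<service>.<namespace>.svc.<cluster-domain>`. A minimal sketch of that pattern, assuming the default cluster domain `cluster.local` (which the hostnames above already use); the helper name is hypothetical:

```python
def service_fqdn(service, namespace, cluster_domain="cluster.local"):
    """Build the in-cluster DNS name for a Service in a given namespace."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


# The two staging endpoints named in this ADR.
PSX_PRICING_HOST = service_fqdn("psx-api", "psx-staging")
PSSAAS_BESTEX_HOST = service_fqdn("pssaas-api", "pssaas-staging")

if __name__ == "__main__":
    print(PSX_PRICING_HOST)
    print(PSSAAS_BESTEX_HOST)
```

On K3s dev the same helper would apply with `psx-dev` and `pssaas-dev`, assuming the Service names stay the same across environments.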

Database topology divergence

PSX uses PostgreSQL (in-cluster, per-namespace). PSSaaS uses SQL Server / SQL MI (external to K8s). This is an intentional divergence (ADR-014):

| Environment | PSX Database | PSSaaS Database |
|---|---|---|
| Dev (Docker Compose) | PostgreSQL container | SQL Server container (ADR-018) |
| Dev (K3s) | PostgreSQL in psx-dev | SQL Server in pssaas-dev |
| Staging | PostgreSQL in psx-staging | Azure SQL MI (external, ExternalName service) |
| Production | PostgreSQL in psx-production | Azure SQL MI (external, ExternalName service) |

In K8s, the SQL MI connection is exposed as a Kubernetes ExternalName Service so PSSaaS pods reference it like any other in-cluster service.
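
A minimal sketch of what such a Service could look like, built as a Python dictionary and printed as JSON for `kubectl apply -f`. The Service name `pssaas-sqlmi` and the host `pssaas-sqlmi.example.com` are placeholders invented for this example; the real SQL MI hostname is not given in this ADR.

```python
import json


def external_name_service(name, namespace, external_host):
    """Build a core/v1 Service of type ExternalName.

    Cluster DNS answers lookups for this Service with a CNAME record
    pointing at external_host; no traffic is proxied by Kubernetes.
    """
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"type": "ExternalName", "externalName": external_host},
    }


if __name__ == "__main__":
    # Both the Service name and the external host are hypothetical.
    service = external_name_service(
        name="pssaas-sqlmi",
        namespace="pssaas-staging",
        external_host="pssaas-sqlmi.example.com",
    )
    print(json.dumps(service, indent=2))
```

Because ExternalName works purely at the DNS level, the SQL MI server still presents a certificate for its real hostname, so clients that validate the server name may need their connection settings adjusted accordingly.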

Consequences

Positive

  • Zero new identity infrastructure. Keycloak in pss-platform already handles OAuth2/OIDC. PSSaaS adds a client — done.
  • Cross-platform API calls over K8s DNS. No public internet hops for PSX ↔ PSSaaS communication in staging/production.
  • Cost efficiency. One AKS cluster, one Vault, one Keycloak. PSSaaS pays only for its compute pods.
  • Dev parity. pssaas-dev and psx-dev on the same K3s box enables local integration testing.
  • Consistent operational model. Same kubectl, same monitoring (Grafana/Loki/Tempo from psx-staging can monitor pssaas-staging), same CI/CD patterns.
  • PSX ADR-085 already plans for this. The namespace pssaas-* is explicitly listed as a future addition.

Negative

  • Resource contention on bare metal. The K3s dev server already runs ~17 PSX pods. Adding PSSaaS (API, SQL Server, Redis, React) requires auditing available RAM and CPU first. SQL Server alone needs 2 GB+ of RAM.
  • Blast radius on shared platform. Keycloak or Vault downtime affects both products. Mitigated by pss-platform being designed for high availability.
  • Separate CI/CD pipelines. .NET builds are fundamentally different from Python builds. GitHub Actions workflows, Dockerfiles, health check patterns, and resource profiles are all separate — no reuse from PSX CI/CD.
  • SQL MI networking. Unlike PSX's in-cluster PostgreSQL, PSSaaS's SQL MI is external to K8s. Requires AKS VNet integration with the SQL MI subnet, or public endpoint access with firewall rules. This is a one-time infra setup but adds operational complexity.
  • Migration effort when triggered. Writing K8s manifests (Deployments, Services, ConfigMaps, Secrets, Ingress) for PSSaaS is non-trivial. Should be deferred until the .NET stack is more mature.

Risks

  • K3s dev server capacity. Mitigation: audit before Phase 2. If insufficient, PSSaaS dev stays on Docker Compose and only staging/production use K8s.
  • SQL MI VNet peering. Mitigation: use SQL MI public endpoint with IP whitelisting for staging (same pattern as current kevin_pssaas_dev access). VNet peering for production.
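
The Phase 2 capacity audit boils down to simple arithmetic: node allocatable memory, minus what PSX already requests, minus what PSSaaS would add. The sketch below shows that check. Only the 2 GB figure for SQL Server comes from this ADR; every other number, the safety reserve, and the function names are hypothetical and would need to be replaced with measured values.

```python
# Memory requests in MiB for the pssaas-dev workloads.
PSSAAS_DEV_REQUESTS_MIB = {
    "sql-server": 2048,  # from this ADR: SQL Server alone needs 2 GB+
    "api": 512,          # hypothetical
    "redis": 256,        # hypothetical
    "frontend": 128,     # hypothetical
}


def headroom_mib(allocatable_mib, already_requested_mib, new_requests):
    """Memory left on the node after adding the new workloads."""
    return allocatable_mib - already_requested_mib - sum(new_requests.values())


def phase2_target(allocatable_mib, already_requested_mib, new_requests,
                  reserve_mib=1024):
    """Apply the mitigation rule: use K3s only if enough headroom remains.

    reserve_mib is a hypothetical safety margin, not a figure from the ADR.
    """
    remaining = headroom_mib(allocatable_mib, already_requested_mib, new_requests)
    if remaining >= reserve_mib:
        return "pssaas-dev on K3s"
    return "Docker Compose"
```

With these placeholder values, a node with 32768 MiB allocatable and 20000 MiB already requested would leave 9824 MiB of headroom, while a node with 16384 MiB allocatable and 14000 MiB requested would fall short and keep PSSaaS dev on Docker Compose.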

Relationship to Other ADRs

  • ADR-003 (Azure-preferred, vendor-agnostic): K8s manifests are cloud-agnostic per PSX ADR-085. Only the AKS provisioning script is Azure-coupled.
  • ADR-005 (database-per-tenant): Tenant isolation is within SQL MI, independent of K8s namespace topology.
  • ADR-013 (identity strategy): This ADR resolves ADR-013's open question — PSSaaS uses platform Keycloak in pss-platform.
  • ADR-014 (backend language divergence): Separate runtimes (.NET vs Python) live in separate namespaces. No conflict.
  • ADR-016 (Nginx proxy): Local Docker Compose proxy continues to work. K8s ingress handles the same routing in cloud environments.
  • ADR-017 (ecosystem hostnames): Staging hostname pssaas.staging.powerseller.com follows the established pattern.
  • ADR-018 (local SQL Server): SQL Server container moves from Docker Compose to pssaas-dev K3s namespace in Phase 2.
  • ADR-019 (PSX-to-SaaS BestEx integration): Cross-namespace K8s DNS enables the integration without public internet.
  • PSX ADR-085 (cloud infrastructure): PSSaaS is the pssaas-* namespace anticipated in that decision.