PSSaaS Architect — Phase 8.5: Ecosystem Auth + Embedded Superset
Role: PSSaaS Systems Architect
Phase: PowerFill Phase 8.5 (Ecosystem Integration) — PSSaaS joins the platform-Keycloak auth boundary AND replaces W2's "View in Superset" anchor links with @superset-ui/embedded-sdk <EmbeddedDashboard> component
Date dispatched: TBD (next Architect session; Phase 9 shipped 2026-04-20 sentinel phase-9-validation-ready; Phase 8.5 is the last phase before Greg demo per PO sequence preference)
Model required: Opus 4.7 High Thinking — verify in the Cursor model picker before responding. If you're running anything else, STOP and escalate to the PO.
Estimated effort: 2-3 Architect-sessions per ADR-027 (Proposed) estimate, IF the PSX file-path inheritance applies cleanly to our build shape. Could go higher if PSX gotcha #1 (PUBLIC_ROLE_LIKE silent no-op on existing Superset) bites in a way the inherited mitigation doesn't catch, OR if a fresh-eyes Alternatives-First Gate reveals our build shape diverges from PSX's static-site reference in ways ADR-027 didn't anticipate.
Predecessor work: Phase 9 (Parallel Validation Harness) shipped 2026-04-20 (commits b0f6469 + b548e78 + b350126). Sentinel phase-9-validation-ready LIVE on staging. First end-to-end harness run surfaced A69 (state-dependent psp_powerfillUE SqlException 207 on PS_DemoData) — banked as a Greg-demo asset (the harness earned its charter on its first run by catching exactly the class of finding it was built to catch). A70 banked alongside (mixed PSSaaS-deployed + legacy-encrypted proc-body state on PS_DemoData; only psp_powerfill_pool_guide is plain text). Phase 9 is COMPLETE; A69 root-cause investigation deferred per PO sequence preference (Phase 8.5 is the Greg-blocker, not A69).
FIVE substantive context updates since the Phase 9 kickoff (f41592f) that Phase 8.5 must internalize:
- Phase 9 shipped + sentinel live. Harness at `tools/parallel-validation/`; first-run report at `docs-site/docs/devlog/2026-04-20-powerfill-phase-9-first-validation-run.md`; ADR-028 documents Frame D Hybrid framing + invocation-path decisions. Phase 8.5 inherits a stable 6-step orchestration baseline + a verdict-logic shape (`RowVerdict.INCOMPARABLE`; A66/A69-aware) that demonstrates practice #13 in deployable form.
- A69 + A70 banked. A69 surfaces a hypothesis-pair worth Greg/Tom consultation (PSSaaS silently swallows the SqlException 207 internally vs PSSaaS hits a different code path). A70 refines Frame D's "same fixed proc body" claim into "whatever proc bodies live on the target DB." Most relevant for Phase 8.5: the React UI's status-page already honestly displays `RunStepResult.ErrorMessage` if populated (W2 ship); Phase 8.5 must NOT regress this honesty when wrapping the UI in auth + embedding.
- Backlog #30 closed empirically clean. PSX Infra completed the Superset → `pss-platform` migration (~3 min cutover; hostname `bi.staging.powerseller.com` UNCHANGED; all 20 dashboards / 56 charts / 77 datasets preserved; same Keycloak SSO + same `superset` OIDC client + same admin credentials + same image SHA so existing `GUEST_TOKEN_JWT_SECRET` continuity preserved). Phase 8.5 inherits a stable platform-Superset endpoint (no migration coordination needed; just consume).
- PSX Collab embedding-pattern relay COMPLETE 2026-04-19. Authoritative file paths + Q3 architectural-mismatch correction + 8 ranked gotchas + "what I am NOT providing" honesty list. Archived at `docs-site/docs/agents/cross-project-relays/2026-04-19-psx-superset-embedding-relay.md`. The Q3 correction is load-bearing: PSX uses NextAuth + Next.js for their main app + oauth2-proxy for static-site protection of admin services. Our Vite-built React UI is structurally a static site. Therefore the right reference is PSX's oauth2-proxy + Docs pattern, NOT their NextAuth + Next.js pattern. Translating NextAuth to a Vite static bundle is non-trivial because there's no server runtime to do the OIDC code exchange; oauth2-proxy is the correct shape for our build pattern.
- AKS shared-cluster pod-density resolved. PSX Infra added a 3rd node (`aks-userpool1-16401317-vmss000000`, separate node pool) on 2026-04-20 to relieve the per-node pod-count limit that briefly blocked Phase 9's API rollout. Phase 8.5 inherits expanded cluster capacity for new Deployments (oauth2-proxy will be a new Deployment in `pssaas-staging`).
PO milestone for Phase 8.5 ("when it's time to demo, I'd like to do it in staging, and I should have to authenticate via Keycloak to access /app"): achievable when Phase 8.5 ships. This phase is the final demo-blocker before Greg-demo readiness per PO sequence preference (A54 fix → W2 → Phase 9 → Phase 8.5 → Greg demo). After Phase 8.5 ships, the Greg-demo dry run becomes runnable end-to-end against the auth-protected staging URL.
Session-start checklist
Read these in this order before doing anything else:
1. `CLAUDE.md`: project identity, role-identification procedure, push-is-an-ask convention
2. `AGENTS.md`: agent memory (principles, lessons, F-PSD findings summary). Specifically the Cross-boundary cutover verification recipe (banked 2026-04-19 from Superset migration); Phase 8.5 ships a new auth boundary and the recipe applies to verifying the cutover from public-staging to authenticated-staging
3. `docs-site/docs/agents/architect-context.md`: your role definition
4. `docs-site/docs/agents/process-discipline.md`: canonical practices, gates, antipatterns. Per banked observations 2026-04-19/20 (commits `8dba7b4` + `31b5d59`), the next revision likely lands the "Writer-Time vs Reader-Time Truth Divergence" family heading + "Subagent Output Defended Beyond Scope" + "Convention Conflation Under Low-Corroboration Count" + "Single-Probe Confidence" observations. Use the latest committed version of `process-discipline.md` as authoritative; anticipate these refinements landing if the next discipline-doc revision ships in parallel
5. `docs-site/docs/agents/handoff-prompts.md`: templates for delegation
6. `docs-site/docs/handoffs/pssaas-session-handoff.md`: current state. Re-read the Backlog table at planning time per the now-canonical-adoption-anticipated trigger-based countermeasure (4-instance corroborated as of Phase 9; 3 traditional findings + 1 "0 net-new findings = pattern works" data point). Use it explicitly in §2 of your plan
7. `docs-site/docs/adr/adr-027-superset-embedding-strategy.md` (Proposed): THE LOAD-BEARING INHERITANCE for this kickoff. Captures D-8.5-1 through D-8.5-5 framing decisions inheriting from PSX Collab's authoritative reply. Status Proposed pending your refinement at dispatch time. ADR-028 (Phase 9 Harness Design) was renumbered from an initial Architect ADR-027 collision when the Collaborator-authored ADR-027 (this one) landed in `ece500e` during the Phase 9 dispatch window. Phase 8.5's ADR will NOT collide; treat ADR-027 as the framing baseline + write any new ADRs at ADR-029+
8. `docs-site/docs/agents/cross-project-relays/2026-04-19-psx-superset-embedding-relay.md`: PSX Collab's authoritative file-path references + 8 ranked gotchas + "what I am NOT providing" list. This is the empirical primary source for ADR-027's framing. The Q3 correction (oauth2-proxy + static-site, NOT NextAuth + Next.js) earned its own callout in the archive; it is load-bearing for any future re-readers
9. `docs-site/docs/specs/powerfill-engine.md`: full spec. Phase 8.5-relevant sections: §Run APIs (the operator-grade run-mgmt surface from Phases 6e/7 that the auth wraps), §Output APIs (the 8 Phase 7 endpoints feeding the embedded dashboards)
10. `docs-site/docs/handoffs/powerfill-phase-9-completion.md`: most recent completion report. Phase 9's §Capability × Environment matrix (per practice #13) is the canonical example to copy; Phase 8.5 adapts to "auth-protected staging" cells (verified vs NOT MEASURED HERE) instead of "harness comparison" cells
11. `docs-site/docs/handoffs/powerfill-phase-8-w2-completion.md`: W2 completion report. The W2 React UI is what Phase 8.5 wraps in auth + transforms anchor-links into embedded SDK. Read for the architectural primitives Phase 8.5 inherits (`<ReportPageShell>`, `<FreshnessBanner>`, `useApi`/`useReportFetch` hooks, `supersetDashboards.ts` config map)
12. `docs-site/docs/handoffs/powerfill-a54-fix-greg-demo-readiness.md`: the PO-facing Greg-demo narrative. Phase 8.5 ships the surface Greg interacts with during the demo. The narrative gains TWO load-bearing slot-in slides post-Phase-8.5: (a) "live operator workflow against auth-protected staging URL" replacing the "demo dry-run on local URL" slide; (b) "embedded dashboards inside PSSaaS UI" replacing "click-through to Superset in new tab" slide
13. `docs-site/docs/devlog/`: most recent are `2026-04-20-powerfill-phase-9.md` + `2026-04-20-powerfill-phase-9-first-validation-run.md` (Phase 9 ship + A69 first-run); expect a `2026-04-XX-powerfill-phase-8-5-*.md` as your devlog at session end
14. `docs-site/docs/specs/powerfill-assumptions-log.md`: A1 (revised), A28 + A37 (RESOLVED), A38 (RESOLVED), A41-A45, A47-A58, A60, A61, A62 (PS_DemoData view drift; Phase 9 close-out per Backlog #24 deferred again at Phase 9 ship; not Phase 8.5-relevant unless the embedding inadvertently re-surfaces it), A63-A64, A65-A66, A67 (CLOSED at Phase 9 commit `b548e78`), A68 (the tenant-id-vs-config-slot conflation; Phase 8.5 is the natural fold-in for the long-term decoupling per A68's platform-tailwind note: with Keycloak as source of truth for authenticated user identity, the OIDC `tenant_id` claim becomes the natural anchor for TenantId-on-rows), A69 (state-dependent UE failure; Greg-consultation hook), A70 (mixed PSSaaS-deployed + legacy-encrypted proc-body state on PS_DemoData)
After reading those, acknowledge your role and proceed.
YOUR TASK — Phase 8.5: Ecosystem Auth + Embedded Superset
Per ADR-027 (Proposed), Phase 8.5 ships:
- oauth2-proxy in front of `/app/` and `/api/` paths, leveraging the existing `pss-platform` Keycloak at `auth.powerseller.com`.
- Replaces W2's "View in Superset" anchor links with `@superset-ui/embedded-sdk` `<EmbeddedDashboard>` component, consuming the now-platform-hosted Superset at `bi.staging.powerseller.com`.
- Closes the public-staging-URL security debt.
- Folds in A68 long-term decoupling (TenantId from Keycloak claim instead of from connection-string-slot routing).
Phase 8.5 is the last demo-blocker before Greg-demo readiness. Phase 8.5 has substantial work but the framing decisions are inherited from ADR-027 + PSX Collab's empirical reply, so the work is execution rather than design-from-scratch.
Inherited context (do not re-litigate)
| Topic | State as of 31b5d59 |
|---|---|
| Phase 7 / Phase 8 W1 / Phase 8 W2 / A54 fix / Phase 9 | All COMPLETE; sentinel phase-9-validation-ready; staging at https://pssaas.staging.powerseller.com/ (currently unauth — Phase 8.5 closes this) |
| End-to-end Complete-run on PS_DemoData | Empirically achievable in ~30s; 12+ historical runs in pfill_run_history (3 Complete + 7 Failed + 2 Cancelled), all tagged tenant_id='ps-demodata' per A68 short-term Path γ disposition |
| PSSaaS API endpoints | 12 endpoints (4 run-mgmt + 8 reports) live; all currently unauth; Phase 8.5 wraps them with oauth2-proxy + adds the new guest-token-mint endpoint |
| W2 React UI | LIVE at https://pssaas.staging.powerseller.com/app/. 4 main pages (Home / Submit / Runs List / Run Status with polling + cancel) + 8 Phase 7 report pages with shared <ReportPageShell> + 4-verdict freshness banner per A60 + A66. Currently uses anchor-link "View in Superset" via dashboardUrl(reportDashboard) in <ReportPageShell>:105 and the equivalent in Home.tsx + RunStatus.tsx. Phase 8.5 replaces these with <EmbeddedDashboard> from the SDK |
| Superset infrastructure | Lives in pss-platform namespace post-Backlog-#30-closure. Hostname bi.staging.powerseller.com UNCHANGED. Same Keycloak SSO + same superset OIDC client + same admin credentials + same image SHA. PowerFill dashboard IDs 13-20 preserved. PSX Infra-owned operations + config |
| Keycloak | LIVE in pss-platform namespace at auth.powerseller.com. PSX uses it via superset OIDC client (for Superset SSO) + docs-proxy confidential client (for static-site protection — the canonical reference for Phase 8.5's PSSaaS client). PSX Collab provided file paths in infra/azure/scripts/configure-keycloak.ps1 (search docs-proxy). PSSaaS will create a new client pssaas-app (analog) |
| AKS cluster capacity | Expanded post-Phase-9 (3 nodes; aks-nodepool1 2 nodes + aks-userpool1 1 node). Adding oauth2-proxy as a new Deployment in pssaas-staging is well within capacity |
| A68 (tenant-id-vs-config-slot conflation) | Short-term Path γ disposition LIVE (Tenants__ps-demodata__ConnectionString env var on staging API). Long-term decoupling is THE Phase 8.5 fold-in opportunity per A68's platform-tailwind note. With Keycloak as source of truth for authenticated user identity, the OIDC tenant_id claim becomes the natural anchor for TenantId-on-rows, decoupled from connection-string-slot naming. Phase 8.5 should treat A68 closure as a deliverable, NOT a deferral |
| A69 (state-dependent UE failure) | NEW Phase 9 finding; Greg-consultation hook; NOT Phase 8.5-relevant beyond ensuring the React UI's status page continues to honestly display RunStepResult.ErrorMessage when the embedding wrap is added. Don't try to fix or hide A69; it's a demo asset. If hypothesis #1 (PSSaaS silently swallows the 207 internally) is later confirmed, that's a follow-up — Phase 8.5 inherits the W2 honesty pattern (display the field if populated; don't fabricate Complete) |
| A70 (mixed proc-body state on PS_DemoData) | NEW Phase 9 finding; production-cutover-relevant (Phase 10+); NOT Phase 8.5-relevant |
| PSX Collab pattern reference | Embedding: web/app/principal/components/SupersetEmbed.tsx (frontend SDK usage; iframe-sizing containerRef polling pattern). Backend: api/routes/principal.py:242-313 (3-step Superset handshake login → CSRF → guest_token; CSRF step is the most-missed piece). oauth2-proxy: infra/oauth2-proxy/oauth2-proxy.cfg. Keycloak client: infra/azure/scripts/configure-keycloak.ps1 search docs-proxy. PSSaaS Architect cannot read PSX repo directly; the file paths are reference points for the PO to relay to the Architect via the cross-project archive entry's compressed exchange section, OR for the PSSaaS Architect to broker pairing via PO if any specific translation friction surfaces |
| PSX Collab "what I am NOT providing" | (1) .NET 8 implementation of guest-token-mint endpoint — PSSaaS Architect's lane; FastAPI version is the pattern to copy. (2) Vite/React-specific SDK integration code — PSX uses Next.js; SDK API identical, but dynamic-imports / code-splitting / SSR-vs-CSR boundaries are PSSaaS's call. (3) Embedded dashboard registration script — PSX has none; PSSaaS would be writing it net-new. Treat these as crisp inheritance-boundary signals; PSSaaS owns these net-new |
| 8 PSX Collab gotchas (ranked by cost) | See ADR-027 §Risks. Most-cost-likely: (1) PUBLIC_ROLE_LIKE silent no-op on existing Superset — half-day debug trap if not pre-mitigated; requires PSX Infra collaboration to flip Public role permissions on platform-Superset's existing DB (NOT a fresh init scenario). (2) TALISMAN_ENABLED + X-Frame-Options nginx override. (3) CORS (avoided by our same-origin pattern; flagged for awareness). (4) Superset 3+ MessageChannel-only. (5) can_log on Superset v6 permission. (6) GUEST_TOKEN_JWT_SECRET coordination. (7) UUID ≠ dashboard ID |
| Cross-project-relays archive folder | Live; 2 entries as of 31b5d59. Phase 8.5 may surface a 3rd entry if Architect needs PSX Collab pairing during dispatch |
| Banked process observations (not yet canonical) | "Subagent Output Defended Beyond Scope" (W2 origin); "Convention Conflation Under Low-Corroboration Count" (A68 root pattern); "Single-Probe Confidence" (PSX Infra falsification + the embedded-SDK-OFF claim — already started preventing its own recurrence in same-session per the Phase 9 staging-verify probe near-miss); "Writer-Time vs Reader-Time Truth Divergence" family heading (build-shape verification + state-freshness verification, 2 members; threshold tracking: each at 1 instance from 1 agent pair, need second-instance independent-observation for canonical submission) |
Explicit scope (IN)
Workstream 1 — oauth2-proxy + Keycloak realm/client setup
- Architectural decision (Alternatives-First Gate, ADR-027 already commits to oauth2-proxy + static-site over NextAuth + Next.js per PSX Q3 correction; re-validate the framing applies cleanly to our build before implementing, but don't re-litigate the decision unless empirical primary-source evidence contradicts it)
- New `pssaas-app` Keycloak client in `pss-platform` realm. Confidential (not public: oauth2-proxy needs `client_secret` for server-side code exchange). Standard flow enabled, direct access disabled, service accounts disabled. Valid redirect URIs: `https://pssaas.staging.powerseller.com/oauth2/callback`. Web origins: `+` (single plus = same as redirect URIs). PSX Collab's `docs-proxy` client in `infra/azure/scripts/configure-keycloak.ps1` is the canonical reference shape; PSX Infra collaboration required to actually create the client (PSSaaS Architect doesn't have admin access to PSX-Infra-owned Keycloak)
- New `oauth2-proxy` Deployment + Service in `pssaas-staging` namespace. Provider `keycloak-oidc`; `oidc_issuer_url` to the Keycloak realm; `redirect_url` to `https://pssaas.staging.powerseller.com/oauth2/callback`; upstreams to existing K8s service FQDNs (`frontend.pssaas-staging.svc.cluster.local:3000` for `/app/`; `api.pssaas-staging.svc.cluster.local:8080` for `/api/`). `set_xauthrequest = true` + `pass_access_token = true` so the access token is available to the upstream via `X-Forwarded-Access-Token` header. Cookie secret + client secret via Vault → K8s Secret → env var (PSX Infra's pattern)
- Ingress reconfiguration to delegate `/app/` and `/api/` paths to oauth2-proxy upstream FIRST, with oauth2-proxy forwarding to existing `frontend` and `api` services after auth. `/docs/` should remain unauthenticated: Docusaurus is a documentation surface; gating it behind auth makes the operational runbook harder to read
- GHA workflow extension to deploy oauth2-proxy alongside existing api/docs/frontend pipeline. Path filter on `infra/azure/k8s/pssaas-staging/services.yaml` already triggers on Deployment changes; add the oauth2-proxy Deployment to the same yaml file
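
As a rough illustration of the oauth2-proxy settings named in this workstream, the sketch below generates a cookie secret and renders a config file from Python. The option names (`provider`, `client_id`, `oidc_issuer_url`, `redirect_url`, `upstreams`, `set_xauthrequest`, `pass_access_token`) are oauth2-proxy options; the `/realms/pss-platform` issuer path, the `http://` scheme and trailing paths on the upstream URLs, and both helper function names are assumptions made for illustration and should be checked against PSX's `infra/oauth2-proxy/oauth2-proxy.cfg`. The client secret and cookie secret are deliberately left out of the rendered file because they arrive via Vault → K8s Secret → env var.

```python
import base64
import secrets


def new_cookie_secret() -> str:
    """Return 32 random bytes, URL-safe base64 encoded (hypothetical helper)."""
    return base64.urlsafe_b64encode(secrets.token_bytes(32)).decode()


def render_oauth2_proxy_cfg(issuer_url: str, redirect_url: str, upstreams: list) -> str:
    """Render a minimal oauth2-proxy config file body (hypothetical helper)."""
    quoted = ", ".join(f'"{u}"' for u in upstreams)
    lines = [
        'provider = "keycloak-oidc"',
        'client_id = "pssaas-app"',
        f'oidc_issuer_url = "{issuer_url}"',
        f'redirect_url = "{redirect_url}"',
        f"upstreams = [ {quoted} ]",
        # Both flags are needed so the upstream can read X-Forwarded-Access-Token.
        "set_xauthrequest = true",
        "pass_access_token = true",
    ]
    return "\n".join(lines) + "\n"


cfg = render_oauth2_proxy_cfg(
    # ASSUMPTION: realm path; older Keycloak versions use /auth/realms/<realm>.
    issuer_url="https://auth.powerseller.com/realms/pss-platform",
    redirect_url="https://pssaas.staging.powerseller.com/oauth2/callback",
    upstreams=[
        # ASSUMPTION: scheme and path suffixes; FQDNs and ports are from the kickoff.
        "http://frontend.pssaas-staging.svc.cluster.local:3000/app/",
        "http://api.pssaas-staging.svc.cluster.local:8080/api/",
    ],
)
print(cfg)
```

Rendering from code rather than hand-editing is only one option; a static `.cfg` checked into `infra/` would serve equally well if that is closer to the PSX reference.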
Workstream 2 — Embedded Superset SDK in React UI
- Add `@superset-ui/embedded-sdk` v0.3.0 (the version PSX uses) to `src/frontend/package.json` and load it via dynamic import so it doesn't bloat the initial bundle (PSX pattern; PSSaaS Collab's empirical bundle scan that missed it was actually a feature, not a bug: the SDK loads lazily)
- New `<EmbeddedDashboard>` shared component at `src/frontend/src/components/EmbeddedDashboard.tsx`. Mirrors PSX's `web/app/principal/components/SupersetEmbed.tsx` shape: useEffect chain that (1) GETs dashboard UUID + initial token from PSSaaS API, (2) calls `embedDashboard()` with `fetchGuestToken` callback the SDK invokes on its own schedule (token refresh handled by SDK, not by us), (3) iframe-sizing trick: poll every 100ms until `containerRef.current.querySelector("iframe")` returns non-null, then apply width/height/border-radius (PSX's empirically-debugged pattern; copy faithfully)
- Replace anchor-link `<a href={dashboardUrl(reportDashboard)} target="_blank">` patterns in:
  - `src/frontend/src/pages/reports/reportShell.tsx` lines 104-108 (canonical instance; one change here propagates through all 8 report pages)
  - `src/frontend/src/pages/RunStatus.tsx` (run-status page Hub link)
  - `src/frontend/src/pages/Home.tsx` (Home page Hub card)
- Update `src/frontend/src/config/supersetDashboards.ts`: add per-dashboard UUID field alongside existing integer `id` field. UUIDs come from the per-dashboard registration script (Workstream 3). UUID is what the embedded SDK uses; integer `id` becomes the in-Superset-admin-UI navigation reference only
- Decision deferred to Architect: should the embedding fully replace the anchor-link pattern, OR ship behind a feature flag and migrate per-page? Recommend fully replace: feature-flagging adds UX confusion and Phase 8.5 is the natural cutover point. ADR-027 §"Decisions deferred to Phase 8.5 Architect" carries this open
Workstream 3 — .NET 8 guest-token-mint endpoint + Superset embedded-dashboard registration
- New `/api/superset/guest-token` endpoint (or `/api/powerfill/guest-token`; ADR-027 carries the open decision; recommend cross-cutting `/api/superset/...` since Phase 10+ may add other modules' dashboards). Translates PSX's FastAPI `api/routes/principal.py:242-313` 3-step handshake to .NET 8 + HttpClient: login → CSRF token → guest_token POST. The CSRF step is the most-missed piece per PSX gotcha: Superset's `guest_token` endpoint requires `X-CSRFToken` header even though it's an API endpoint
- Auth check at the .NET endpoint via `X-Forwarded-Access-Token` header forwarded by oauth2-proxy. Use the access token's claims (or a fresh `/userinfo` call against Keycloak) to enforce "is this user allowed to see this dashboard." For v1: any authenticated user can mint guest tokens for any of the 8 PowerFill dashboards (single-tenant PoC; per-user-per-dashboard permission filtering is Phase 10+)
- Per-resource scoping decision: ADR-027 §D-8.5-3 starts with PSX's Option A (one token per dashboard). Architect should re-evaluate at dispatch time given PSSaaS's higher dashboard count (8 vs PSX's 1). Option B (one token covering all 8 resources via array) may be more efficient but adds invalidation complexity. Decide via Alternatives-First Gate; document choice in completion report
- Per-dashboard registration script at `infra/superset/register-powerfill-embeds.py` (or similar location; Architect's call). PSX has no reference code for this; it is net-new per PSX's "what I am NOT providing" list. Uses Superset's `/api/v1/dashboard/{id}/embedded/` endpoint (POST to create, PUT to update). Captures UUIDs to a config file (e.g. `infra/superset/powerfill-embed-uuids.json`) that the .NET API reads. Idempotent so re-runs UPDATE rather than create-duplicate. Run via `kubectl exec` against the pss-platform Superset pod (per AGENTS.md updated runbook + per Phase 9's `tools/parallel-validation` pattern)
- PSX Infra collaboration: at dispatch, the PSSaaS Architect must surface to PO that platform-Superset config needs verification on:
  - `EMBEDDED_SUPERSET = True` (PSX uses this for their Principal dashboard so probably already on; verify)
  - `TALISMAN_ENABLED = False` (load-bearing for iframe rendering)
  - `HTTP_HEADERS = {"X-Frame-Options": "ALLOWALL"}` (load-bearing; nginx-side X-Frame-Options must NOT override)
  - `GUEST_TOKEN_JWT_SECRET` set and stable (not rotated mid-session)
  - `PUBLIC_ROLE_LIKE = "Gamma"` actual-state-on-existing-DB (per gotcha #1; this is the half-day debug trap if not pre-flipped via `copy_gamma_to_public.py` script OR manually via Settings → Security → List Roles → Public)
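
The login → CSRF → guest_token handshake described in this workstream can be sketched in Python, the language of the PSX FastAPI reference, before it is translated to .NET 8. This is a minimal sketch under stated assumptions, not PSX's code: the `send` transport callable, the `mint_guest_token` name, and the `pssaas-embed` guest username are hypothetical; the three URL paths and JSON field names follow Superset's security REST API as generally documented and should be confirmed against the platform-Superset version. Passing one UUID corresponds to Option A; passing all eight corresponds to Option B.

```python
from typing import Callable, List, Optional

# Hypothetical transport: (method, url, headers, json_body) -> parsed JSON dict.
# A real transport likely needs to keep cookies between steps 2 and 3
# (for example a requests.Session), since the CSRF token is session-bound.
Transport = Callable[[str, str, dict, Optional[dict]], dict]


def mint_guest_token(
    send: Transport,
    base_url: str,
    username: str,
    password: str,
    dashboard_uuids: List[str],
    guest_username: str = "pssaas-embed",  # hypothetical label for the guest user
) -> str:
    # Step 1: log in as the service user to obtain a bearer access token.
    login = send(
        "POST",
        f"{base_url}/api/v1/security/login",
        {},
        {"username": username, "password": password, "provider": "db", "refresh": True},
    )
    auth = {"Authorization": f"Bearer {login['access_token']}"}

    # Step 2: fetch a CSRF token (the step the kickoff flags as most-missed).
    csrf = send("GET", f"{base_url}/api/v1/security/csrf_token/", auth, None)["result"]

    # Step 3: request the guest token, sending the CSRF token as a header.
    body = {
        "user": {"username": guest_username},
        # Embedded UUIDs, not integer dashboard IDs.
        "resources": [{"type": "dashboard", "id": u} for u in dashboard_uuids],
        "rls": [],
    }
    headers = {**auth, "X-CSRFToken": csrf}
    return send("POST", f"{base_url}/api/v1/security/guest_token/", headers, body)["token"]
```

Injecting the transport keeps the handshake order testable without a live Superset; the .NET translation could take the same shape with an `HttpClient` wrapper behind an interface.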
Workstream 4 — A68 long-term fold-in (TenantId from Keycloak claim)
- Refactor `TenantMiddleware` and/or `TenantRegistry` to source `TenantId` from the OIDC `tenant_id` claim on the access token (or equivalent stable claim per Keycloak realm setup) rather than from the `X-Tenant-Id` header value (which currently doubles as the `TenantRegistry` config-slot lookup key per A68)
- Decouple logical TenantId from connection-string-slot routing per A68's long-term proposed shape: `TenantRegistry.Resolve(string identity)` returns `(string TenantId, string ConnectionString)`, with both fields independent. Config schema gains a `Tenants:<identity>:CanonicalId` alongside `Tenants:<identity>:ConnectionString` so the persisted column value can be the same across multiple connection-string aliases in different environments
- Migration of existing `pfill_run_history.tenant_id='ps-demodata'` rows to whatever the canonical customer identity becomes (e.g. `'sandbox'` or a UUID) via a one-shot UPDATE script. Decide the canonical identity convention via Alternatives-First Gate
- Tenant-picker UI removal: once OIDC claim is the source-of-truth, the React UI's tenant-picker dropdown becomes vestigial. Remove from the header; the user's tenant is determined by their authenticated session
- W2 contract: the React UI continues to send `X-Tenant-Id` header on API calls for backward compatibility, BUT the API ignores it in favor of the OIDC claim. (OR: remove `X-Tenant-Id` from all React fetch helpers entirely. Architect's call.)
- Test coverage: add `dotnet test`-side coverage for the new TenantMiddleware Keycloak-claim resolution. Should include a happy-path test, a missing-claim test (rejection), and a claim-vs-header conflict test (claim wins)
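
The decoupled resolution shape and the three test cases listed in this workstream can be sketched language-neutrally in Python before the .NET implementation is written. Everything named here is illustrative: `TenantBinding`, `MissingTenantClaim`, `resolve_identity`, the sample connection string, and the choice of `sandbox` as canonical id are hypothetical stand-ins, and the real convention is still to be decided via the Alternatives-First Gate.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class TenantBinding:
    tenant_id: str          # canonical id persisted on rows
    connection_string: str  # routing detail, independent of tenant_id


class MissingTenantClaim(Exception):
    """Raised when the access token carries no usable tenant claim."""


class TenantRegistry:
    def __init__(self, config: Mapping[str, Mapping[str, str]]):
        # Mirrors Tenants:<identity>:CanonicalId / Tenants:<identity>:ConnectionString.
        self._config = config

    def resolve(self, identity: str) -> TenantBinding:
        entry = self._config[identity]
        return TenantBinding(entry["CanonicalId"], entry["ConnectionString"])


def resolve_identity(
    claims: Mapping[str, str],
    header_value: Optional[str] = None,
    claim_name: str = "tenant_id",
) -> str:
    """Pick the tenant identity from token claims; the header is ignored."""
    identity = claims.get(claim_name)
    if not identity:
        raise MissingTenantClaim(f"access token has no '{claim_name}' claim")
    return identity  # claim wins even when header_value disagrees


# Placeholder config: values are illustrative only.
registry = TenantRegistry(
    {"ps-demodata": {"CanonicalId": "sandbox", "ConnectionString": "Server=example;"}}
)
binding = registry.resolve(resolve_identity({"tenant_id": "ps-demodata"}, "other-tenant"))
print(binding.tenant_id)
```

Keeping `resolve_identity` separate from the registry lookup means the claim-vs-header policy can be tested without any configuration loaded.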
Cross-cutting
- Status sentinel bump to `phase-8-5-ecosystem-ready` (preserves the `phase-N-<short-name>` pattern; do NOT carry the `-a54-fixed` sub-suffix forward, since that closure became historical at Phase 9)
- Spec amendment to `docs-site/docs/specs/powerfill-engine.md`: add a Phase 8.5 row in §Phased Implementation table (previously documented as deferred; now COMPLETE)
- Assumptions log additions: A71+ for new Phase 8.5 findings. A68 marked CLOSED if Workstream 4 ships
- NEW ADRs for any architectural decisions ADR-027 didn't pre-commit:
  - ADR-029 (proposed): PSSaaS Tenant Identity Strategy (the A68 long-term decoupling shape; OIDC claim as source-of-truth)
  - ADR-030 (proposed) if new: per-dashboard guest-token scoping pattern (Option A vs Option B from PSX)
  - Skip new ADRs if existing ADR-027 + ADR-013 cover the decisions adequately
- Pre-push docs-build check per Phase 6e/7/8-W1/8-W2/9 banked discipline (now 9-instance corroborated): `docker build -f docs-site/Dockerfile.prod docs-site` before push if any new `docs-site/docs/**` files created
- Cross-boundary cutover verification recipe (per AGENTS.md banked 2026-04-19): when Phase 8.5 ships the auth boundary, both PSSaaS Collab and PSX Infra should run primary-source verification independently. PSSaaS side checks `/app/` requires auth (HTTP 302 to anonymous, HTTP 200 with valid session), `/api/` requires auth (HTTP 401 to anonymous, HTTP 200 with valid token), `/docs/` does NOT require auth (HTTP 200 to anonymous), guest-token-mint endpoint is auth-checked (HTTP 401 to anonymous, HTTP 200 with valid session); PSX Infra side checks Keycloak shows the new `pssaas-app` client + the OIDC flow tracing for one full login cycle
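
The PSSaaS-side half of the cutover recipe is essentially a table of expected status codes, which can be encoded as data and checked mechanically. The sketch below is one hedged way to do that: the `Check` tuple, `run_checks`, and the injected `fetch_status` callable are hypothetical names, `/api/health` is borrowed from the Deploy Verification Gate row, and `/api/superset/guest-token` assumes the recommended (still open) endpoint path.

```python
from typing import Callable, List, NamedTuple


class Check(NamedTuple):
    path: str
    authenticated: bool
    expected_status: int


# Expected codes copied from the recipe; guest-token path is an assumption.
EXPECTED: List[Check] = [
    Check("/app/", False, 302),
    Check("/app/", True, 200),
    Check("/api/health", False, 401),
    Check("/api/health", True, 200),
    Check("/docs/", False, 200),
    Check("/api/superset/guest-token", False, 401),
    Check("/api/superset/guest-token", True, 200),
]


def run_checks(
    fetch_status: Callable[[str, bool], int],
    checks: List[Check] = EXPECTED,
) -> List[str]:
    """Return one human-readable line per mismatch; empty list means pass.

    fetch_status(path, authenticated) must NOT follow redirects, otherwise
    the anonymous 302 on /app/ would be reported as the login page's 200.
    """
    failures = []
    for check in checks:
        actual = fetch_status(check.path, check.authenticated)
        if actual != check.expected_status:
            who = "session" if check.authenticated else "anonymous"
            failures.append(
                f"{check.path} ({who}): expected {check.expected_status}, got {actual}"
            )
    return failures
```

The returned lines can be pasted directly into the completion report next to the raw curl output, so the artifact-level and runtime-level evidence sit side by side.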
Explicit scope (OUT)
- Multi-tenant production rollout (Phase 10+); Phase 8.5 ships single-tenant single-customer auth pattern that Phase 10+ generalizes
- Per-user-per-dashboard permission filtering at the guest-token-mint layer (Phase 10+)
- A69 root-cause investigation — banked as Phase 9 follow-up; Greg-demo asset in current form; deferred unless Phase 8.5 work surfaces a related finding
- A70 production-cutover playbook — Phase 10+ work
- Mobile-responsive design for embedded dashboards — Phase 10+ (desktop-only is fine for v1 + Greg demo)
- Internationalization — Phase 10+
- Per-environment Superset registration variance (A64 multi-tenant deferral) — Phase 10+
- Service-account auth for Phase 9 harness (the harness currently runs locally; if it needs to run against auth-protected staging, a Keycloak service account is the right shape — but Phase 9 explicitly OUT of Phase 8.5 dependency per the Phase 9 kickoff's "harness should remain runnable today against unauth staging + work post-Phase-8.5 against auth-protected staging by adding an OIDC-token mint step" framing)
- "View in Superset" anchor-link FALLBACK if embedding fails — recommend AGAINST shipping a fallback. Failed embedding should surface honestly to the user (per A66/A69-pattern of refusing to mis-claim) rather than silently degrading. Architect can defend a different position if the empirical SDK reliability surfaces unexpected fragility
Process discipline (canonical, non-negotiable)
Gates that must produce documented output
| Gate | Where to apply | What "documented output" means |
|---|---|---|
| Three-layer Primary-Source Verification Gate (now 4-instance corroborated; canonical-promotion-anticipated) | Spec-vs-implementation: verify ADR-027 framing decisions still apply to our build at dispatch time (build shapes haven't drifted). NVO-vs-implementation: less applicable for Phase 8.5 (no T-SQL ports). Implementation-vs-runtime: re-read session-handoff Backlog table during planning; cross-project-relays archive entries; ADR-027's Architect-at-dispatch checklist | A Phase 8.5 plan §2 findings table per layer + explicit Backlog re-read pass log per row |
| Alternatives-First Gate | At least 4 architectural decisions: (a) endpoint shape /api/superset/guest-token vs /api/powerfill/guest-token vs per-dashboard endpoints (recommend cross-cutting); (b) per-resource scoping Option A vs Option B (PSX's recommendation acknowledged but our 8-dashboard count differs); (c) anchor-link replacement approach (full replace vs feature-flagged migration; recommend full replace); (d) A68 canonical identity convention (UUID vs slug like sandbox vs customer-org-name like wtpo) | A Phase 8.5 plan §3 alternatives section per decision; ADR-029 (proposed) for the A68 decoupling shape |
| Required Delegation Categories | Heavily delegable: per-page anchor-link → embedded-component refactor (one delegated subagent per Phase 7 endpoint = up to 8 micro-deliverables); the per-dashboard registration script; .NET test coverage for the new TenantMiddleware Keycloak-claim resolution. Self-implement: the architectural-contract-per-artifact load-bearing parts — the .NET 8 guest-token-mint endpoint (CSRF handling is the most-missed piece per PSX), the A68 decoupling shape decision (load-bearing for tenant identity moving forward), the oauth2-proxy + ingress wiring (load-bearing for the auth boundary correctness) | A Phase 8.5 plan §8 delegation inventory with subagent prompts AND Deliberate Non-Delegation justifications per practice #9 |
| Reviewable Chunks at intra-session scope | Consider checkpointing after Workstream 1 (auth boundary works; oauth2-proxy + Keycloak client + ingress) before proceeding to Workstream 2-4 (embedding + tenant decoupling). Workstream 1 is the foundational work; W2-4 build on top. Recommend explicit plan-stage Architect Report after W1 lands + before W2-4 dispatch | If checkpointing, send a plan-stage Architect Report after W1 + initial smoke-test against oauth2-proxy |
| Deploy Verification Gate | Arm (a) sentinel = phase-8-5-ecosystem-ready. Arm (b) PSSaaS-side empirical verification per the cross-boundary cutover recipe: /app/ HTTP 302 to anonymous + HTTP 200 with valid Keycloak session; /api/health HTTP 401 to anonymous + HTTP 200 with valid token; /docs/ HTTP 200 to anonymous (UNCHANGED); guest-token-mint HTTP 401 to anonymous + HTTP 200 with valid session; embedded dashboards render in iframe with no CSP errors. Arm (c) end-to-end click-through: operator logs in via Keycloak → submits a run → watches status → clicks a report → embedded dashboard renders inline | A Phase 8.5 completion report Markdown citing: screenshots of the Keycloak login page + post-login redirect to /app/; curl outputs showing 401-vs-200 across the auth boundary; the cross-boundary cutover verification recipe applied bilaterally with PSX Infra |
| Counterfactual Retro | At session end | A retro section. Phase 9 banked 7+ observations including "the harness earned its Phase 9 charter on its very first run by surfacing A69." Phase 8.5 should report whether the auth+embedding patterns held up empirically OR whether new gotchas surfaced beyond PSX Collab's enumerated 8 |
Antipatterns to avoid (canonical list applies)
- Phase-0 Truth Rot — A57 + A59 framing applies; the cross-project-relays archive entries are the inheritance source-of-truth; if you find drift between ADR-027 framing and current empirical reality, surface as A71+ and adjust
- Empirical-Citation Type Mismatch (Phase 5 origin): when calling Keycloak's OIDC endpoints + Superset's guest-token endpoint, use the actual JSON property names from the OIDC spec + Superset 6 API docs; the OpenAPI spec at `bi.staging.powerseller.com/swagger/` (if exposed) is canonical for the Superset side
- Verification Avoidance (Phase 4 origin): `dotnet build` + `npm run build` + the cross-boundary cutover verification recipe before declaring complete; the embedded dashboard rendering inline IS the integration test
- Ghost Deploy (PSX origin): Phase 9 hit this empirically (image built + pushed to GHCR but pod didn't roll due to AKS pod-density; resolved by PSX Infra adding a 3rd node + manual rollout). Phase 8.5 ships a NEW Deployment (oauth2-proxy); confirm the rollout actually scheduled + reaches Ready before declaring complete. The `kubectl get pods -n pssaas-staging` empirical check is the countermeasure
- Delegation Skip (Phase 4 origin): per-page anchor-link refactor is a heavy delegation candidate; .NET guest-token-mint endpoint + A68 decoupling shape are yours to self-implement
- Capability Inflation (Phase 8 W1 / Claim-vs-Evidence family): the auth boundary either works (verified by 401-to-anonymous + 200-with-session) or it doesn't. Do NOT extend "code-complete" to "deployed and working"; the matrix in your completion report should explicitly distinguish "verified at the artifact level" from "verified at the runtime level" (per practice #13). Phase 9 demonstrated this in exemplary fashion by surfacing A69 in the TL;DR rather than mis-claiming parity
- Capability Drift (Claim-vs-Evidence family): if PSX-side Keycloak realm/client config changes between Phase 8.5 spec and Phase 8.5 deploy, the inheritance-from-PSX-Collab-pattern could quietly become wrong. Re-verify the Keycloak `docs-proxy` client configuration shape via PSX Infra at dispatch time (5-min ask) before treating it as load-bearing for the new `pssaas-app` client
- Subagent Output Defended Beyond Scope (banked but not yet canonical; W2 origin): when reviewing per-page-anchor-link-refactor subagent output, the explicit first question is "is this what the kickoff asked for?" before any disposition framing. Output additions outside scope are removed unless re-justified against the kickoff. The W2 PS608 incident is the canonical first instance: the kickoff didn't ask for PS608, the Architect added it (incorrectly attributed to the kickoff), Collaborator initially defended it as a naming-convention decision; PO had to push back. Don't repeat this shape
- Convention Conflation Under Low-Corroboration Count (banked but not yet canonical; A68 root pattern) — if Phase 8.5's tenant-identity convention work is single-writer (only Architect's local route), it's vulnerable to the same A68 class of bug at the next deploy that adds a second writer. Verify the convention agreement-across-writers BEFORE declaring W4 complete. Document the convention crisply in ADR-029 (proposed)
- Single-Probe Confidence (banked but not yet canonical; multi-instance origin) — when probing the auth boundary or the embedding boundary, get a second independent probe before claiming the boundary works. The PowerShell-bash variable-eating near-miss in Phase 9 staging-verify was the first "in-flight self-correction" instance; build the habit further
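
The Ghost Deploy countermeasure can be made mechanical by parsing the JSON form of the pod listing (`kubectl get pods -n pssaas-staging -o json`). A sketch under stated assumptions: `unready_pods` is a hypothetical helper and the image tag in the usage note is illustrative. It relies only on the standard Pod fields `spec.containers[].image` and `status.conditions`.

```python
import json


def unready_pods(pod_list_json, expected_image):
    """Return (names of pods running expected_image that are not Ready,
    whether any pod runs expected_image at all)."""
    pods = json.loads(pod_list_json).get("items", [])
    matching = [
        p for p in pods
        if any(c.get("image") == expected_image
               for c in p.get("spec", {}).get("containers", []))
    ]
    not_ready = []
    for pod in matching:
        conditions = pod.get("status", {}).get("conditions", [])
        ready = any(c.get("type") == "Ready" and c.get("status") == "True"
                    for c in conditions)
        if not ready:
            not_ready.append(pod["metadata"]["name"])
    return not_ready, bool(matching)
```

A result of `([], False)` is the ghost-deploy signature: the image was pushed, but no pod is running it. Only `([], True)` supports a "deployed and Ready" claim.
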
Tooling (verified post-Phase 9)
- WSL Ubuntu with `dotnet 8.0.420`, `jq 1.6`, `gh 2.4.0` (un-authed). Use `wsl.exe -- bash -lc '...'` for shell work. PowerShell-to-WSL variable expansion is fragile: use script files written via the Write tool with explicit `\n` line endings + `sed -i 's/\r$//'` to strip CR before bash execution, OR use single-quoted PowerShell strings to prevent `$` expansion before WSL sees them
- Windows-side kubectl at `C:\Program Files\Docker\Docker\resources\bin\kubectl.exe`, kubeconfig at `~/.kube/config` with `PSS-cluster` context (PSX-shared cluster). Now 3 nodes (`aks-nodepool1` 2 nodes + `aks-userpool1` 1 node) post-2026-04-20 PSX Infra capacity expansion
- Docker for local builds + GHCR manifest inspection. PSSaaS API + frontend + docs all use multi-stage `Dockerfile.prod` patterns
- Node.js / npm for React workstream: Windows-host Node v22.x + production `node:22-alpine` Docker base. Confirmed working post-W2 ship
- Python 3.10+ in WSL Ubuntu for the per-dashboard registration script (Phase 8.5 ships a new Python script analogous to `tools/parallel-validation/` pattern OR `infra/superset/deploy-powerfill.py` pattern depending on where Architect locates it)
- Pre-push docs-build check pattern (Phase 6e/7/8-W1/8-W2/9 lesson; now 9-instance corroborated): `docker build -f docs-site/Dockerfile.prod docs-site` before push if any new `docs-site/docs/**` files created. Mandatory if Architect ships ADR-029 + completion report + devlog as expected
- PSX Infra collaboration touchpoints: (a) Keycloak `pssaas-app` client creation (PSSaaS Architect doesn't have admin access); (b) platform-Superset `EMBEDDED_SUPERSET` + `TALISMAN_ENABLED` + `HTTP_HEADERS` + `GUEST_TOKEN_JWT_SECRET` + `PUBLIC_ROLE_LIKE` verification + remediation if needed (the PUBLIC_ROLE_LIKE half-day-debug trap is the most likely friction point). Surface these collaboration asks early via PO; don't block trying to do PSX Infra's work yourself
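
The CR-stripping step in the WSL bullet has a close Python equivalent for cases where the script file is prepared from Python rather than from the shell. A minimal sketch; `strip_cr` is a hypothetical helper name, and it handles CRLF pairs only (a lone trailing CR with no newline is left alone).

```python
from pathlib import Path


def strip_cr(path):
    """Rewrite the file at path with CRLF line endings converted to LF.

    Works on bytes so no text encoding is assumed; returns the number of
    CRLF pairs that were replaced.
    """
    target = Path(path)
    data = target.read_bytes()
    count = data.count(b"\r\n")
    if count:
        target.write_bytes(data.replace(b"\r\n", b"\n"))
    return count
```

Returning the count makes the step observable: a non-zero value on a file believed to be LF-only indicates the Windows side rewrote the line endings.
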
Environment state (verified post-Phase 9 + post-AKS-capacity-expansion)
| Surface | State |
|---|---|
| Local API | phase-9-validation-ready ✓ |
| Staging API | phase-9-validation-ready ✓ (live; both default AND ps-demodata tenant slots wired to PS_DemoData private endpoint per Path γ) |
| Staging React UI | LIVE at https://pssaas.staging.powerseller.com/app/; 4 main pages + 8 report pages + 4-verdict freshness banner + anchor-link "View in Superset"; CURRENTLY UNAUTH (Phase 8.5 closes this) |
| Staging Docs | LIVE at https://pssaas.staging.powerseller.com/docs/; CURRENTLY UNAUTH; should remain UNAUTH post-Phase-8.5 (operational runbook surface) |
| Phase 7 endpoints | All 8 live; auth-wrap via oauth2-proxy is Phase 8.5's job |
| `pfill_run_history` on PS_DemoData | 12+ rows tagged tenant_id='ps-demodata' (Path γ convention; A68 decoupling Workstream 4 may re-tag to canonical identity) |
| Superset infrastructure | LIVE at bi.staging.powerseller.com in pss-platform namespace (post-Backlog-#30 closure); 8 PowerFill dashboards at IDs 13-20; auth-gated via Keycloak SSO; EMBEDDED_SUPERSET flag + PUBLIC_ROLE_LIKE actual-state-on-existing-DB UNVERIFIED for PSSaaS use (PSX uses for their Principal dashboard so probably on; verify at dispatch) |
| Keycloak | LIVE in pss-platform namespace at auth.powerseller.com; PSX uses via superset + docs-proxy clients; pssaas-app client to be created at Phase 8.5 dispatch via PSX Infra |
| AKS cluster capacity | 3 nodes total post-2026-04-20 PSX Infra expansion. Adding oauth2-proxy Deployment + nginx side-car if needed is well within capacity |
| Phase 9 harness | tools/parallel-validation/ LIVE locally; runs against unauth staging today; will need OIDC-token mint step post-Phase-8.5 (out of Phase 8.5 scope; Phase 9 follow-up) |
| Backlog re-read pass at planning | 4-instance corroborated; canonical-promotion-anticipated; use it explicitly in Phase 8.5 §2 plan |
| Cross-project-relays archive | 2 entries; PSX-Claim-vs-Evidence (2026-04-19) + PSX-Superset-embedding (2026-04-19); Phase 8.5 may surface a 3rd entry if PSX Collab pairing surfaces |
Companion references
| Doc | Purpose |
|---|---|
| `docs-site/docs/adr/adr-027-superset-embedding-strategy.md` | Load-bearing inheritance for Phase 8.5; Status Proposed pending your refinement at dispatch time |
| `docs-site/docs/agents/cross-project-relays/2026-04-19-psx-superset-embedding-relay.md` | PSX Collab's authoritative file paths + 8 ranked gotchas + "what I am NOT providing" honesty list; primary-source for ADR-027 |
| `docs-site/docs/handoffs/powerfill-phase-9-completion.md` | Phase 9 completion report + Capability × Environment matrix template Phase 8.5 adapts to "auth-protected staging" cells |
| `docs-site/docs/handoffs/powerfill-phase-8-w2-completion.md` | W2 completion report; the React UI Phase 8.5 wraps in auth + transforms anchor-links |
| `docs-site/docs/handoffs/powerfill-a54-fix-greg-demo-readiness.md` | PO-facing Greg-demo narrative; Phase 8.5 adds 2 load-bearing slot-in slides |
| `docs-site/docs/specs/powerfill-engine.md` §Run APIs + §Output APIs | The 12 endpoints Phase 8.5 wraps with oauth2-proxy + the 8 reports the embedded dashboards consume |
| `src/frontend/src/config/supersetDashboards.ts` | Anchor-link config map Phase 8.5 extends with per-dashboard UUIDs |
| `src/frontend/src/pages/reports/reportShell.tsx` line 104-108 | Canonical anchor-link instance Phase 8.5 replaces (one change here propagates to all 8 report pages) |
| `src/backend/PowerSeller.SaaS.Api/Middleware/TenantMiddleware.cs` | Currently sources TenantId from X-Tenant-Id header (per A68 conflation); Workstream 4 refactors to OIDC claim source |
| `src/backend/PowerSeller.SaaS.Infrastructure/Data/TenantRegistry.cs` | Path γ short-term Dictionary lookup with StringComparer.OrdinalIgnoreCase; Workstream 4 long-term decoupling adds Resolve(string identity) returning (TenantId, ConnectionString) |
| `infra/azure/k8s/pssaas-staging/services.yaml` | K8s manifests Phase 8.5 extends with oauth2-proxy Deployment + Service |
| `infra/azure/k8s/ingress/pssaas-ingress.yaml` | Ingress Phase 8.5 reconfigures to delegate /app/ and /api/ paths to oauth2-proxy upstream |
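
The Workstream 4 decoupling shape named in the `TenantRegistry.cs` row, a case-insensitive lookup from an identity to a `(TenantId, ConnectionString)` pair, can be illustrated outside C#. The sketch below is Python, not the code that will ship. The class names, identity strings, and connection string are placeholders, and failing closed on an unknown identity is an assumption rather than a decision recorded in this kickoff.

```python
from typing import NamedTuple


class TenantBinding(NamedTuple):
    tenant_id: str
    connection_string: str


class TenantRegistry:
    """Maps an inbound identity (e.g. an OIDC claim value) to a tenant.

    Keys are lower-cased on write and on read, approximating the
    OrdinalIgnoreCase comparer used by the current Dictionary lookup.
    """

    def __init__(self, bindings):
        self._by_identity = {k.lower(): v for k, v in bindings.items()}

    def resolve(self, identity):
        try:
            return self._by_identity[identity.lower()]
        except KeyError:
            # Assumed policy: an unknown identity never falls back to a default.
            raise LookupError(f"no tenant bound to identity {identity!r}")
```

Keeping the identity key separate from the returned `tenant_id` is the point of the shape: a second writer can introduce a new identity spelling without changing what gets stored in `pfill_run_history.tenant_id`.
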
Deliverables
When Phase 8.5 is complete, the Collaborator and PO should be able to verify each without trusting your word:
- Code commits: atomic, logically grouped. DO NOT push (the PO pushes); you `git add` and `git commit` only
- `infra/azure/k8s/pssaas-staging/services.yaml` extended with oauth2-proxy Deployment + Service
- `infra/azure/k8s/ingress/pssaas-ingress.yaml` reconfigured to delegate `/app/` + `/api/` to oauth2-proxy upstream
- `infra/oauth2-proxy/oauth2-proxy.cfg` (or equivalent location) for the PSSaaS oauth2-proxy config (PSX's `infra/oauth2-proxy/oauth2-proxy.cfg` is the reference shape)
- PSX-Infra-coordination request for the new `pssaas-app` Keycloak client (write the relay; PO sends; PSX Infra creates; PSSaaS Architect inherits secret via Vault → K8s Secret pattern PSX uses)
- `src/backend/PowerSeller.SaaS.Modules.PowerFill/` (or shared module) new endpoint(s) for guest-token-mint + the OIDC claim resolution logic. Includes `dotnet test`-side coverage
- `src/frontend/src/components/EmbeddedDashboard.tsx` new shared component
- `src/frontend/src/config/supersetDashboards.ts` extended with per-dashboard UUIDs
- Anchor-link replacements in `reportShell.tsx` + `RunStatus.tsx` + `Home.tsx` (3 files; reportShell propagates to all 8 report pages)
- `infra/superset/register-powerfill-embeds.py` (or similar) per-dashboard registration script + first-run output captured to `infra/superset/powerfill-embed-uuids.json` (or similar)
- TenantMiddleware refactor + TenantRegistry decoupling (Workstream 4)
- A68 row migration script (one-shot UPDATE on `pfill_run_history.tenant_id`)
- Sentinel bump to `phase-8-5-ecosystem-ready`
- NEW ADR-029 (proposed): PSSaaS Tenant Identity Strategy documenting the A68 long-term decoupling shape
- Spec amendment marking Phase 8.5 DONE
- Assumption log A71+ for new Phase 8.5 findings; A68 marked CLOSED
- `docs-site/docs/handoffs/powerfill-phase-8-5-completion.md`: Phase 9-style completion report with Capability × Environment matrix per practice #13 + bilateral cross-boundary cutover verification recipe applied
- Devlog entry at `docs-site/docs/devlog/2026-04-XX-powerfill-phase-8-5.md`
- Pre-push docs-build check (mandatory; new `docs-site/docs/**` files expected)
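
The guest-token-mint deliverable ultimately forwards a request body to Superset. A sketch of that body in Python, assuming the payload shape documented for Superset's `POST /api/v1/security/guest_token/` (`user`, `resources`, `rls`); verify it against the deployed version's API docs before relying on it. `build_guest_token_request` and the example values are hypothetical, and the shipped endpoint is expected to be .NET.

```python
import uuid


def build_guest_token_request(username, dashboard_uuid, rls_clauses=()):
    """Build the JSON body for a Superset guest-token request.

    dashboard_uuid is the embedded-dashboard UUID, not the numeric
    dashboard ID; it is validated so a numeric ID fails fast.
    """
    # uuid.UUID raises ValueError for anything that is not a UUID string.
    embedded_id = str(uuid.UUID(str(dashboard_uuid)))
    return {
        "user": {"username": username},
        "resources": [{"type": "dashboard", "id": embedded_id}],
        "rls": [{"clause": clause} for clause in rls_clauses],
    }
```

Rejecting non-UUID input targets one specific mistake: passing a numeric dashboard ID where the per-dashboard embed UUID from the registration script is required.
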
Reporting protocol
Standard Architect Report format when you're done — what was produced / decisions / assumptions / open questions / recommended next steps / process notes.
If a Phase 8.5 component reveals an unexpected wire-shape gap with PSX's pattern (per Q3 architectural-mismatch antipattern in the cross-project-relays archive) — surface as A71+ and document.
If Keycloak/oauth2-proxy/Superset embedding encounters an unanticipated constraint NOT in PSX Collab's 8-gotcha list — STOP and surface, don't paper over. The PSX Collab pairing offer is standing; PO can broker if friction surfaces.
If a multi-day session reaches a natural pause point with partial Phase 8.5 completion (e.g. Workstream 1 ships + works but Workstreams 2-4 are incomplete), that's fine; write a handoff so the next Architect session resumes cleanly. Strongly recommend the W1-checkpoint-then-W2-4 chunking per Reviewable Chunks practice; W1 alone is meaningful demonstrable progress (auth boundary works) and de-risks the rest.
The PO milestone for this phase: "I can demo PSSaaS to Greg in staging, with Keycloak auth, with embedded Superset dashboards inside the operator UI." Achievable when Phase 8.5 ships end-to-end.
What success looks like
- `https://pssaas.staging.powerseller.com/app/` redirects to Keycloak login when accessed unauthenticated; redirects back to `/app/` after successful Keycloak login; React UI loads with user identity available
- `https://pssaas.staging.powerseller.com/api/*` returns HTTP 401 to anonymous probes; HTTP 200 with valid Keycloak access token in `Authorization: Bearer ...` OR `X-Forwarded-Access-Token` header
- `https://pssaas.staging.powerseller.com/docs/` UNCHANGED: HTTP 200 to anonymous, no auth required (operational runbook surface)
- Operator workflow click-through: login → submit run → watch status (4 active states + 3 terminal) → land on Complete → click any report → embedded Superset dashboard renders inline (NOT in new tab); BLUE Complete+empty banner per A66 displays correctly; user identity visible in header
- Hub Dashboard (Superset 13) renders inline as the run-history canonical proof-of-life on the Home page
- A69 banked finding still surfaces honestly (status display unchanged from W2 W3 honesty pattern)
- Sentinel reflects `phase-8-5-ecosystem-ready`
- A68 closed: `pfill_run_history.tenant_id` reflects the new canonical identity convention; Tenant identity sourced from OIDC claim; tenant-picker dropdown removed from React UI
- Cross-boundary cutover verification recipe applied bilaterally with PSX Infra (PSSaaS-side checks + PSX-side checks both green)
- New deployment workflow rolls oauth2-proxy alongside existing api/docs/frontend on push
- ADR-029 documents the A68 closure shape
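
The A68 closure criterion implies a one-shot UPDATE on `pfill_run_history.tenant_id` that is safe to re-run. The sketch below demonstrates that idempotent shape against an in-memory SQLite table. The real target is SQL Server, and the two-column schema and the canonical identity value (`canonical-demo`) are placeholders, since ADR-029 has not yet fixed the convention.

```python
import sqlite3


def retag_tenant(conn, old_id, new_id):
    """Re-tag rows from old_id to new_id; return the number of rows changed.

    Parameterized, and a no-op on a second run because no rows still
    carry old_id.
    """
    cur = conn.execute(
        "UPDATE pfill_run_history SET tenant_id = ? WHERE tenant_id = ?",
        (new_id, old_id),
    )
    conn.commit()
    return cur.rowcount


# Stand-in table: three rows under the Path γ tag, one unrelated row.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pfill_run_history (run_id INTEGER PRIMARY KEY, tenant_id TEXT)"
)
conn.executemany(
    "INSERT INTO pfill_run_history (tenant_id) VALUES (?)",
    [("ps-demodata",)] * 3 + [("other",)],
)
first = retag_tenant(conn, "ps-demodata", "canonical-demo")
second = retag_tenant(conn, "ps-demodata", "canonical-demo")
```

Recording the row counts from both runs in the completion report gives runtime-level evidence for the migration: the first count should match the number of rows tagged before the run, and the second should be zero.
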
Begin when ready. Local environment + staging environment are both fully wired; PowerFill is end-to-end-Complete on PS_DemoData; the operator React UI is live; Phase 9 harness is shipped + first-run report banks A69; ADR-027 + cross-project-relays archive entry capture the framing decisions inherited from PSX Collab's authoritative reply; AKS cluster capacity expanded; Keycloak + Superset already operational in pss-platform; the only remaining demo-blocker between current state and Greg-demo is Phase 8.5.
Reminder: Opus 4.7 High Thinking. Verify model in picker before sending your first response. Do NOT push.