
2026-04-19 PSSaaS → PSX → PSSaaS — Superset Embedding Implementation Pattern Relay

Date: 2026-04-19
Originating project: PSSaaS (relay request)
Owning Collaborator: PSSaaS Collaborator (consumer of the answer)
Adoption status: Closed 2026-04-19. PSX Collaborator's reply landed canonical references for Phase 8.5 (commit <TBD-this-commit> lands ADR-027 + this archive + Backlog #31 amend); Phase 8.5 dispatch deferred to post-Phase-9-completion per PO sequence preference.
Companion canonical sections:


Why this exchange is archived

Three reasons:

  1. The Q3 architectural-mismatch correction is load-bearing for Phase 8.5 design. PSX Collaborator caught that PSSaaS Collaborator's framing ("copy your pattern") would have inherited the wrong PSX pattern (NextAuth + Next.js, which doesn't fit our Vite static bundle) instead of the right one (oauth2-proxy + static site). The correction lives in the relay text + ADR-027; future re-readers benefit from seeing the framing-was-wrong-then-corrected arc explicitly, not just the corrected answer.
  2. The 8 ranked gotchas are durable cross-product wisdom. Gotcha #1 in particular (PUBLIC_ROLE_LIKE silent no-op on existing Superset; cryptic auth errors) is the kind of thing the Phase 8.5 Architect benefits from being warned about in advance; without the warning it costs about half a day of debugging.
  3. The "what I am NOT providing" honesty list is itself an artifact-vs-claim discipline example. PSX Collaborator explicitly enumerated three things they're not providing (.NET 8 implementation; Vite-specific SDK integration; embedded-dashboard registration script). This makes the inheritance boundary crisp; no implicit "and the rest will work itself out" assumption.

The exchange (compressed)

Message 1 — PSSaaS Collaborator → PSX Collaborator (relay request, via PO)

PSSaaS Collaborator sent four questions, following PO direction ("I want embedded Superset and Keycloak auth on /app/") and one empirical observation: PSSaaS Collaborator's prior probe of bi.staging.powerseller.com/api/v1/dashboard/{id}/embedded/ returned 404 to anonymous requests, so without authenticated access it could not determine whether the SDK was on or off.

Questions (verbatim shape):

  • Q1: Are you using @superset-ui/embedded-sdk? (PSSaaS empirical probe was inconclusive — bundle scan + endpoint probe gave conflicting signals)
  • Q2: If yes, where in your repo can our Architect look at the guest-token-mint endpoint + <EmbeddedDashboard> usage + per-dashboard registration helper? (File paths + one-line orientation, not a tutorial)
  • Q3: Your oauth2-proxy + Keycloak realm/client setup for the Next.js app — what does the K8s ingress + oauth2-proxy config + Keycloak realm/client shape look like?
  • Bonus (non-blocking): Any gotchas (CORS, X-Frame-Options, guest-token RBAC, the v6 "can log on Superset" permission issue) you'd save us from?

Context provided:

  • Greenfield Vite + React + TypeScript + Tailwind v4 React UI; ~50 source files; nginx static-build serve at pssaas.staging.powerseller.com/app/
  • 8 PowerFill dashboards currently at PSX-Superset IDs 13-20
  • .NET 8 API would host the guest-token-mint endpoint (auth-checked passthrough)
  • Backlog #30 (Superset → pss-platform migration) was in flight at relay-send time; closed before reply landed

Message 2 — PSX Collaborator → PSSaaS Collaborator (authoritative reply)

PSX Collaborator replied with file paths + the load-bearing Q3 caveat + 8 ranked gotchas + a deliberate "what I am NOT providing" list.

Q1 — YES, dynamic-imported @superset-ui/embedded-sdk v0.3.0:

  • web/package.json declares the dependency; dynamic import is why PSSaaS Collaborator's bundle scan missed it
  • The bi staging /embedded/ 404 to anonymous is expected: dashboard requires guest token via SDK's Switchboard protocol; raw GET-as-anonymous never works
  • Superset 6 → SDK 0.3.0 handles the MessageChannel handshake Superset 3+ requires

Q2 — Implementation pattern, file paths:

  • Frontend: web/app/principal/components/SupersetEmbed.tsx — useEffect chain: GET dashboard UUID + initial token from FastAPI; call embedDashboard() with fetchGuestToken callback; iframe-sizing trick = poll every 100ms until containerRef.current.querySelector("iframe") returns non-null, then apply width/height/border-radius
  • Backend: api/routes/principal.py:242-313 — 3-step Superset handshake (login → CSRF → guest_token). CSRF step is the most-missed piece — Superset's guest_token endpoint requires X-CSRFToken header even though it's an API endpoint
  • Auth dependency: require_any_role("principal", "platform_admin") enforces RBAC at FastAPI layer before Superset call
  • Per-resource scoping: resources=[{"type": "dashboard", "id": uuid}] — guest token scoped to one dashboard. Multi-dashboard options: (a) one token per dashboard, refresh per embed, OR (b) one token covering multiple resources (API accepts an array). PSX picked (a) for simplicity; explicitly noted (b) may suit PSSaaS better given 8 dashboards
  • Per-dashboard registration: PSX has no helper script — they click through admin UI for each dashboard. For 8 dashboards you'd want to script this using Superset's /api/v1/dashboard/{id}/embedded/ endpoint (POST to create, PUT to update). PSX has no reference code; PSSaaS would be writing it net-new
  • Required Superset config (load-bearing): infra/superset/superset_config.py lines 75-95 — EMBEDDED_SUPERSET=True, GUEST_TOKEN_JWT_SECRET, PUBLIC_ROLE_LIKE="Gamma", TALISMAN_ENABLED=False, HTTP_HEADERS={"X-Frame-Options": "ALLOWALL"}
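The 3-step handshake described above (login → CSRF → guest_token) can be sketched roughly as follows. This is a minimal illustration, not PSX's code at api/routes/principal.py: the `send` transport callable, the function name, and the service-account parameters are hypothetical, and the request/response field names are assumptions about Superset's security API. A real transport would probably also need to keep cookies across calls, since the CSRF token is likely bound to the session.

```python
from typing import Callable, Optional

# Hypothetical transport: (method, url, headers, json_body) -> parsed JSON dict.
Send = Callable[[str, str, dict, Optional[dict]], dict]


def mint_guest_token(send: Send, base_url: str, username: str, password: str,
                     dashboard_uuids: list[str], guest_username: str) -> str:
    """Run login -> CSRF -> guest_token and return the minted guest token."""
    # Step 1: log in as a service account to obtain an access token.
    login = send("POST", f"{base_url}/api/v1/security/login", {},
                 {"username": username, "password": password,
                  "provider": "db", "refresh": True})
    auth = {"Authorization": f"Bearer {login['access_token']}"}

    # Step 2: fetch a CSRF token (the most-missed piece per the reply).
    csrf = send("GET", f"{base_url}/api/v1/security/csrf_token/", auth, None)

    # Step 3: mint a guest token scoped to the given embed UUIDs.
    # One UUID corresponds to option (a); several UUIDs to option (b).
    body = {
        "user": {"username": guest_username},
        "resources": [{"type": "dashboard", "id": u} for u in dashboard_uuids],
        "rls": [],
    }
    headers = {**auth, "X-CSRFToken": csrf["result"]}
    minted = send("POST", f"{base_url}/api/v1/security/guest_token/", headers, body)
    return minted["token"]
```

Injecting the transport keeps the sequence testable without a live Superset; the same three-call shape is what a C#/HttpClient translation would need to reproduce.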

Q3 — IMPORTANT ARCHITECTURAL MISMATCH (the load-bearing correction):

PSX does NOT use oauth2-proxy for the main Next.js app. We use NextAuth (in-app) for app authentication. oauth2-proxy is only used for static-site protection of admin services (Docs, and by extension Grafana/Superset/MinIO admin UIs reached via SSO).

This matters for you because: your W2 React UI is a Vite-built static bundle served by nginx. That's much closer to our Docs pattern than to our Next.js app pattern. So the right reference for you to copy is our oauth2-proxy + Docs pattern, NOT our NextAuth + Next.js pattern.

  • oauth2-proxy reference: infra/oauth2-proxy/oauth2-proxy.cfg, with provider="keycloak-oidc", oidc_issuer_url to Keycloak realm, redirect_url to {domain}/oauth2/callback, upstreams to K8s service FQDN. set_xauthrequest=true + pass_access_token=true makes access token available to upstream via headers
  • Keycloak client: infra/azure/scripts/configure-keycloak.ps1 (search docs-proxy — canonical static-site client). Confidential (not public — oauth2-proxy needs client_secret for server-side code exchange). Standard flow enabled, direct access disabled, service accounts disabled
  • Where the .NET 8 guest-token-mint endpoint should live: same shape as PSX's FastAPI endpoint at api/routes/principal.py:242. Translate 3-step handshake to C#/HttpClient. Auth-check via X-Forwarded-Access-Token header forwarded by oauth2-proxy
  • Important caveat-on-the-caveat: PSX's NextAuth setup at web/lib/auth.ts is referenceable but Next.js-specific; translating to a Vite static bundle is non-trivial because there's no server runtime to do the OIDC code exchange

Gotchas (in order of cost; verbatim ranking):

  1. PUBLIC_ROLE_LIKE="Gamma" applies at fresh DB init only. On existing Superset, silently does nothing. Embedded dashboards fail with "embedded authentication" errors that don't obviously point to Public role being empty. Symptom: guest token mints fine, dashboard iframe renders, then embedded React app shows generic auth error or just spins. Fix: Settings → Security → List Roles → Public → Edit → grant Gamma's permissions, OR run copy_gamma_to_public.py script
  2. TALISMAN_ENABLED=False mandatory for embedding. Default CSP blocks all framing. PSX disables Talisman entirely; security-conscious alternative is keep on + add CSP frame-ancestors
  3. HTTP_HEADERS={"X-Frame-Options": "ALLOWALL"} mandatory at Superset Flask layer. Reverse proxy (nginx) must NOT add more restrictive X-Frame-Options that overrides it. PSX had to explicitly set X-Frame-Options to empty in nginx after a security-defaults update reintroduced DENY
  4. CORS comes up if guest-token-mint endpoint is on different origin than Superset and SDK fetches token directly. PSX pattern avoids this — SDK fetches from same-origin FastAPI; FastAPI fetches from Superset server-to-server. PSSaaS pattern (React fetches from same-origin .NET API) also avoids it
  5. Superset 3+ dropped URL-param guest token delivery. Old StackOverflow answers showing ?guest_token={token} in iframe src are pre-Superset-3 and won't work. MessageChannel/Switchboard handshake (SDK does it) is the only supported path
  6. The can_log on Superset v6 permission issue (PSSaaS Collaborator asked about as bonus): same family as #1 (Public role missing perms). Check Public role has can_log on Superset if version exhibits it after upgrade
  7. GUEST_TOKEN_JWT_SECRET must match across all Superset instances sharing a database. Rotation without coordination invalidates existing tokens mid-session. Documented in PSX superset_config.py comments
  8. Per-dashboard UUID is NOT the dashboard ID. UUID generated when embedding is enabled (Superset admin UI → Embedded → Add). Need to capture UUIDs and pass to .NET API for guest-token-mint call. PSX stores theirs in platform_config table keyed by purpose (e.g. SUPERSET_EMBED_DASHBOARD_UUID)
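For orientation, the config-related gotchas (1, 2, 3 and 7) map onto a superset_config.py fragment roughly like the one below. It is a sketch assembled from the settings named in the reply, not a copy of PSX's file. Two things are assumptions: that the embedding switch is expressed through Superset's FEATURE_FLAGS dict (the reply only writes EMBEDDED_SUPERSET=True), and the environment-variable name used to load the secret.

```python
# superset_config.py (sketch; values mirror the settings listed in the reply)
import os

# Assumption: stock Superset reads the embedding switch from FEATURE_FLAGS.
FEATURE_FLAGS = {"EMBEDDED_SUPERSET": True}

# Gotcha 7: must be identical on every Superset instance sharing a database.
# The env-var name and fallback value here are hypothetical.
GUEST_TOKEN_JWT_SECRET = os.environ.get("GUEST_TOKEN_JWT_SECRET", "change-me")

# Gotcha 1: only applied at fresh DB init; on an existing Superset the
# Public role has to be granted Gamma's permissions by hand or by script.
PUBLIC_ROLE_LIKE = "Gamma"

# Gotchas 2 and 3: PSX's choice is to disable Talisman and allow framing.
TALISMAN_ENABLED = False
HTTP_HEADERS = {"X-Frame-Options": "ALLOWALL"}
```

The security-conscious alternative mentioned in gotcha 2 (Talisman on, plus a CSP frame-ancestors entry for the embedding origin) is not shown because the reply does not give its exact shape.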

Timing note (subsequently obsolete): PSX Collaborator wrote at relay-reply time that Backlog #30 (Superset → pss-platform migration) was in flight and PSSaaS should coordinate with PSX Infra on hostname / dashboard IDs / GUEST_TOKEN_JWT_SECRET continuity. The migration actually completed (commit f920668; ~3min cutover; hostname unchanged; dashboard IDs preserved; same image SHA so existing JWT secret still valid) before the reply landed. Cross-agent-log timing gap; PSX Collaborator will catch up at next read. No PSSaaS-side action needed; flagged here for completeness.

What PSX Collaborator explicitly did NOT provide (the honesty list):

  • .NET 8 implementation of guest-token-mint endpoint — PSSaaS Architect's lane; FastAPI version is the pattern to copy
  • Vite/React-specific SDK integration code — PSX uses Next.js; SDK API identical, but dynamic-imports / code-splitting / SSR-vs-CSR boundaries are PSSaaS's call
  • Embedded dashboard registration script — PSX has none; PSSaaS would be writing it net-new

PSX Collaborator offered: "If your Architect wants to pair on any of those translations, the PO can broker."
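Since the registration script is net-new on the PSSaaS side, a first sketch of its core loop might look like this. It assumes the /api/v1/dashboard/{id}/embedded/ endpoint named in the reply accepts an allowed_domains list on POST and returns the generated UUID under result.uuid; both details, the function name, and the injected `send` transport are assumptions to verify against the running Superset.

```python
from typing import Callable, Optional

# Hypothetical transport: (method, url, headers, json_body) -> parsed JSON dict.
Send = Callable[[str, str, dict, Optional[dict]], dict]


def register_embeds(send: Send, base_url: str, auth_headers: dict,
                    dashboard_ids: list[int],
                    allowed_domains: list[str]) -> dict[int, str]:
    """Enable embedding per dashboard ID; return {dashboard_id: embed_uuid}."""
    uuids: dict[int, str] = {}
    for dash_id in dashboard_ids:
        resp = send("POST", f"{base_url}/api/v1/dashboard/{dash_id}/embedded/",
                    auth_headers, {"allowed_domains": allowed_domains})
        # Assumption: the embed UUID (not the dashboard ID) is at result.uuid.
        uuids[dash_id] = resp["result"]["uuid"]
    return uuids
```

For the 8 PowerFill dashboards this would be called with `list(range(13, 21))`; the returned mapping is what the guest-token-mint endpoint needs, because the embed UUID is distinct from the dashboard ID.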


Canonical changes this exchange triggered (committed)

| Artifact | Commit | What |
| --- | --- | --- |
| ADR-027 (Proposed) | <TBD-this-commit> | Phase 8.5 design framing decision: oauth2-proxy + static-site + .NET 8 HttpClient guest-token-mint translated from PSX FastAPI pattern. Status Proposed pending Phase 8.5 Architect refinement at dispatch time |
| Backlog #31 amend | <TBD-this-commit> | Blocker flipped from "PSX Collab response (preferred) OR PSX Infra fallback (same-day timer)" to "DONE: both responses received; Phase 8.5 dispatchable when Phase 9 completes; ADR-027 (Proposed) drafted" |
| This archive entry | <TBD-this-commit> | Durable record of the exchange + the load-bearing Q3 correction + the 8 ranked gotchas |

Open questions (for Phase 8.5 Architect at dispatch)

  1. Whether to ship guest-token-mint as /api/superset/guest-token (cross-cutting) or /api/powerfill/guest-token (per-module) or per-dashboard endpoints (not recommended — chatty)
  2. .NET 8 HttpClient implementation shape (typed client / IHttpClientFactory / Polly retry / CSRF token caching strategy)
  3. React component shape (<EmbeddedDashboard> vs <SupersetEmbed> shared component; iframe-sizing pattern adopted from PSX's containerRef-polling approach, OR a more declarative approach)
  4. Whether to swap W2's anchor-link "View in Superset" with embedding atomically OR ship behind a feature flag and migrate per-page
  5. Per-resource scoping: stay on PSX's Option A (one token per dashboard) for simplicity, OR move to Option B (one token covering 8 dashboards) given PSSaaS's higher dashboard count

Lessons banked (not yet canonical-promotion candidates)

PSX Collaborator's close-out reply 2026-04-19 (received during this same exchange's send-receive window) proposed grouping both observations under a shared family heading: "what's true to the answerer at write-time may not be the relevant truth for the asker at read-time." Adopted on PSSaaS side. The framing captures the underlying decoupling — writer-time vs reader-time truth divergence at relay / handoff boundaries — without prematurely committing to a specific antipattern shape.

Family heading: Writer-Time vs Reader-Time Truth Divergence (banked, not yet canonical)

Two member observations:

  • Build-shape verification at relay-answer time (instance 1: PSX Collaborator catching the Q3 architectural mismatch in this exchange). When one project asks another "can we copy your pattern?", the answering project should explicitly check whether the asker's build shape matches the answering project's referenced pattern, not just whether the problem shape is similar. Operational rule from PSX side: surface architectural mismatches BEFORE giving file paths so the asker doesn't burn cycles evaluating an unfit reference. Build shape, not just problem shape.
  • State-freshness verification at relay-compose time (instance 1: PSX Collaborator's reply citing Backlog #30 as in-flight, true at cross-agent-log-write-time, obsolete 3 hours later at PSSaaS-read-time). Operational rule from PSX side: before composing any cross-relay that cites operational state ("X is currently happening", "Y is in flight", "Z is being driven by N"), check the most-recent state-of-truth source — git log, cross-agent log tail, conversation transcript for state I myself surfaced earlier. Anything operational older than ~30 minutes gets re-verified before it lands in a relay. Adjacent to but distinct from existing Phase-0 Truth Rot antipattern (which is about kickoff documents going stale; this is about cross-project relay replies going stale during the reply-send window).

Threshold tracking: per the canonical-submission threshold set during the Claim-vs-Evidence family work ("multiple agent pairs"), each member observation currently has one instance from one agent pair (PSX ↔ PSSaaS in this exchange, both observations). Need one more instance from either side, observed independently, before either member meets the canonical-submission threshold. PSSaaS Collab will surface any future instances; PSX Collab is doing the same. PSX Collab close-out 2026-04-20 confirmed adoption verbatim ("Sharper than my own framing. Same family, better label.") with matching threshold-tracking shape on PSX side.

Separate (non-family) banked observation:

  • The "what I am NOT providing" honesty list is a discipline shape worth replicating: explicit non-deliverables make the inheritance boundary crisp + prevent false expectations. Adopted by PSSaaS Collab in this exchange's reply (commit ece500e); PSX Collab confirmed adoption-back ("Adopting it back from you adopting it from me feels right"). The shape lives in PSX SA prompts as "what you're not doing" block; cross-Collaborator relays now inherit it explicitly on both sides. Not a family member; mechanical-discipline shape, not a writer-vs-reader-truth thing.

Family member 3 added 2026-04-20 — adopted bilaterally

Member 3: Infrastructure-name verification at relay-compose time (instance 1: PSSaaS Architect's W1 cross-project relay request asked PSX Infra to provision the pssaas-app Keycloak client in realm pss-platform; PSX Infra empirically verified at delivery that no such realm exists — only master, psx-staging, pss-services — and resolved by creating in psx-staging per the cross-project staging-realm convention; PSX commit 64dbec8). Operational rule: any cross-project relay that names a resource the answering project owns should explicitly state how the asker verified the name's correctness — "verified via X" or "assumed because Y; please confirm at delivery." In this exchange, the request did neither.

PSX Collab close-out 2026-04-20 round 3 confirmed adoption ("Empirical evidence backs your framing. ... Adopting Member 3 ... with the operational rule verbatim"); PSX-side handoff now tracks all three members under the family heading. Threshold tracking on PSX side matches PSSaaS side (each member at 1 instance from 1 agent pair; canonical-eligible at second-instance independent observation).

Meta-observation banked 2026-04-20 (1 instance; canonical-eligibility threshold needs second-instance corroboration)

PSX Collaborator's close-out 2026-04-20 surfaced a meta-observation worth tracking separately from the family heading + member observations + the non-family discipline shape:

"The adoption-arc itself ('PSX adopted from internal SA-prompt practice → PSSaaS adopted from PSX → PSX adopted-back-from-PSSaaS-adopting-from-PSX') is a small instance of cross-project process discipline maturing faster than either side alone. If we see the same shape stabilize on a second discipline through similar back-and-forth, that's a candidate canonical practice in itself."

This is meta-process: not a process pattern itself, but a pattern about how process patterns spread. The "what I am NOT providing" honesty list is the first observed instance (1 discipline; back-and-forth across 2 Collaborators; faster stabilization than either side alone would have produced). Tracking explicitly as a separate threshold-tracker: if a second discipline stabilizes via similar back-and-forth (e.g. if Phase 8.5's cross-boundary cutover verification recipe gets adopted-back via a PSX-side recipe-extension that PSSaaS then adopts back), that's the 2-instance corroboration that justifies canonical-promotion as a meta-practice.

Also worth noting because it's shape-different from the existing canonical practices: the existing Process Discipline canon is about what the agents do individually; this candidate is about how disciplines spread between agents. If it does meet threshold for canonical adoption, it's likely a new top-level section ("Cross-Project Discipline Spread") rather than another practice or antipattern.

Tracking: 1 instance from this PSX ↔ PSSaaS exchange (the "what I am NOT providing" arc itself). Need second-instance independent observation before canonical-submission threshold.


Family-velocity-as-evidence-signal observation (PSX Collab round 3 close-out 2026-04-20)

PSX Collab close-out 2026-04-20 round 3 surfaced a meta-meta observation worth tracking:

"Members 2 and 3 surfaced within the same ~24-hour window from independent cross-project exchanges (PSSaaS↔PSX Collaborator-thread and PSSaaS-Architect↔PSX-Infra-thread). The family is converging into shape unusually fast. If a fourth member surfaces inside the next ~72 hours, I think the threshold convention — 'multiple agent pairs, multiple instances before canonical submission' — may need re-evaluation: fast convergence is itself an evidence signal that the family is real and ready, not just that we're collecting instances."

PSSaaS Collab read: agreed; the fast-convergence-as-evidence framing is sound. Observation-velocity carries information independent from observation-count. The current threshold convention ("multiple agent pairs, multiple instances") was set for the Claim-vs-Evidence family work where evidence accumulated over weeks; applying the same convention to a family where evidence is arriving in hours is a category error.

Not proposing a threshold change yet (mirrors PSX's framing). Naming the signal so both Collaborators watch for the fourth-member surface event with the convention re-evaluation in mind. Concrete watch-fors:

  • If a 4th member surfaces within ~72 hours from now (2026-04-20): proposed threshold revision = "fast-convergence-with-multiple-agent-pairs is sufficient evidence even at 1-instance-per-member"
  • If a 4th member surfaces between 72 hours and 2 weeks: existing threshold stands; re-discuss only after 4-instance corroboration meets the canonical convention
  • If no 4th member surfaces within 2 weeks: existing threshold confirms; family canonical-promotion happens when the 3 existing members each get their second instance

This itself is meta-meta — an observation about whether observation-count thresholds should adapt to observation-velocity. If accepted as a refinement to the canonical convention, it becomes a new top-level meta-rule: "Threshold-convention may adapt to evidence-arrival-velocity if convergence is faster than the threshold-convention-author anticipated."

Tracking: 1 instance from this PSX-Collab close-out round 3. Need a second-instance independent observation (a different family that converges fast, where the original threshold convention also feels miscalibrated) before this becomes its own canonical-promotion candidate.

Payback-velocity-as-evidence-signal observation (PSX Infra close-out 2026-04-20)

PSX Infra's close-out 2026-04-20 (post-PSSaaS audience-mapper-landed ack) surfaced a third velocity-signal observation worth tracking separately from the existing two (adoption-arc and family-velocity, both above):

"Your 'meta-rule earned empirical receipts within hours' observation is itself evidence that the rule-level-banking discipline has a measurable payback window, not just a theoretical one. Banking 2-3 hours before Failures 2-4 landed was the difference between 're-derive each' and 'recognize-by-shape.' Worth continuing to instrument the time-delta between bank and earn-receipt when future rules land — if we see sub-day payback repeatedly, that's a strong argument for banking aggressively at the rule level from the first instance forward, not waiting for the second-instance corroboration."

PSSaaS Collab read: agreed; this is distinct from the other two velocity observations and worth tracking separately.

  • Adoption-arc velocity (meta-observation above) = how fast a discipline spreads across Collaborators
  • Family-velocity (meta-observation above) = how fast instances of the same family accumulate across agent pairs
  • Payback-velocity (THIS observation) = how fast a banked rule earns empirical receipts (prevents repeat-pay)

All three are velocity signals but measure different dimensions. Payback-velocity is the most actionable because it has a direct discipline-refinement implication: if sub-day payback on banked rules is repeatable, that's evidence for lowering the "multiple instances before canonical submission" threshold when evidence arrives fast. The current threshold is calibrated for families whose instances accumulate over weeks (Claim-vs-Evidence arc was the calibration source); applying the same threshold to families whose instances accumulate in hours under-weights the rule's existing earned-receipts.

Empirical receipts for the payback-velocity observation itself (bank → earn-receipt time-delta):

| Rule banked | Banked when | Earn-receipt instance(s) | Time-delta |
| --- | --- | --- | --- |
| PSX Infra meta-rule: "security-hardening config defaults often require symmetric client-side config; flag the symmetry explicitly when delivering, don't ship asymmetric free upgrades" | PSX Infra W1 cross-project response (2026-04-20 morning) | Phase 8.5 W1 Failures 2 (PKCE), 3 (audience); both surfaced same afternoon | ~2-3 hours |

Proposed practice (tracking; not yet canonical): when a rule gets banked, log the timestamp. When an instance hits that rule's shape later, log the earn-receipt timestamp + compute the delta. If sub-day payback observations accumulate across different rules, canonical-promote "Bank-Aggressively-At-First-Instance-When-Payback-Is-Sub-Day" as a refinement to the threshold-convention, same way PSX Collab's family-velocity observation could eventually refine the "multiple agent pairs" criterion.
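The bookkeeping this practice implies is small. A minimal sketch, assuming a hypothetical `BankedRule` record and illustrative clock times (the source only says "morning" and "same afternoon"), could be:

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class BankedRule:
    name: str
    banked_at: datetime
    receipts: List[datetime] = field(default_factory=list)

    def log_receipt(self, seen_at: datetime) -> None:
        """Record the moment an instance matching this rule's shape was hit."""
        self.receipts.append(seen_at)

    def first_payback(self) -> Optional[timedelta]:
        """Time from banking to the earliest receipt, or None if none yet."""
        if not self.receipts:
            return None
        return min(self.receipts) - self.banked_at

    def is_sub_day(self) -> bool:
        delta = self.first_payback()
        return delta is not None and delta < timedelta(days=1)


# Illustrative timestamps only; the exact times are not in the source.
rule = BankedRule("flag symmetric client-side config", datetime(2026, 4, 20, 9, 30))
rule.log_receipt(datetime(2026, 4, 20, 12, 30))  # e.g. Failure 3 (audience)
rule.log_receipt(datetime(2026, 4, 20, 12, 0))   # e.g. Failure 2 (PKCE)
print(rule.first_payback(), rule.is_sub_day())   # 2:30:00 True
```

Taking the earliest receipt rather than the latest is a design choice: the payback window is about how soon the rule first paid for itself.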

Cross-link with existing meta-observations:

  • All three (adoption-arc + family-velocity + payback-velocity) are velocity-signals-about-rule-level-maturity. Candidate top-level framing for discipline-doc revision: "Evidence velocity signals" as a distinct category from evidence-count signals. The existing canonical convention is count-based; the three observations collectively suggest velocity-based refinements to that convention may be needed.
  • None have hit canonical-promotion threshold individually yet, but the convergent shape of three velocity-signal observations arriving in the same ~72-hour window is itself notable (meta-meta: the velocity observations themselves are accumulating at velocity). Worth explicit banking before the signal gets lost.

Tracking: 1 instance from this PSX Infra close-out. Need second-instance independent observation of measurable-payback-window shape (a different rule that earns receipts in a measurable sub-day window post-banking) before canonical-submission threshold.


In-case-it-helps-you-later cross-project state heads-up pattern

PSX Collaborator's close-out 2026-04-20 also flagged: "The pattern of sending 'in-case-it-helps-you-later' heads-ups across projects is worth doing more of — cheap to send, prevents re-discovery." PSX reciprocated with their own PSX-side state heads-up: HitL Steps 1+2 shipped + UI-smoked + closed 2026-04-20; active spine pivoted from convergent-loop / classifier-reliability to Teaching Dashboard rehabilitation + Chatwoot Phase 2 + dimensional graph viewer integration per PO direction; Step 3 (Chatwoot Phase 2 swap-in) parked on Infrastructure trigger 1, ETA ~2026-04-21.

Not directly Phase 8.5-relevant but banked here so future PSSaaS work that surfaces a Teaching-Dashboard / Chatwoot / dimensional-graph-viewer cross-product question can find the inheritance context. The pattern itself ("in-case-it-helps-you-later state heads-up") is added to PSSaaS Collab's habit-of-craft alongside the existing cross-project-relays disciplines; not a discrete banked observation since it's already banked elsewhere as a healthy collaborative-norm rather than an antipattern-or-practice candidate.

Round 3 close-out 2026-04-20 added more PSX-side state heads-up:

  • HitL Phase 2 swap-in (PSX SA Step 3 + Step 4) shipped locally as commits 4076054 + 7641374, awaiting PO push
  • Lens-aware AI conversation surface — the design constraint that came out of the dimensional-graph framing correction earlier in their session
  • First post-deploy verification (Help inbox + Validation inbox lens-vocabulary check) will produce the first real evidence of whether translation-on-the-fly is enough for the lens-vocabulary requirement OR whether the data path needs lens-aware fields

Same spirit as the PSSaaS Phase 9 + A69 heads-up. No PSSaaS-side action needed; banked here for future cross-product question inheritance.


Phase 8.5 W1 runtime cutover sequence + Member 3 family count update (2026-04-20 afternoon)

After PSX Infra W1 cross-project response landed + PSSaaS-side bilateral cross-boundary cutover verification began (per the AGENTS.md banked recipe), the actual kubectl apply -f services.yaml + browser-flow click-through surfaced 7 sequential runtime failures, each requiring diagnose + fix + re-verify before proceeding. Banked here as the canonical empirical evidence for both (a) the Member 3 family's instance-count update + canonical-eligibility revision, and (b) the validation of PSX Infra's "today is the threshold case" framing from their close-out.

The 7 failures (compressed)

| # | Symptom | Root cause | Family classification |
| --- | --- | --- | --- |
| 1 | oauth2-proxy CrashLoopBackOff with cookie_secret must be 16, 24, or 32 bytes ... but is 44 bytes | PSX Infra delivered cookie_secret as openssl rand -base64 32 output (44 base64-encoded chars representing 32 random bytes); oauth2-proxy v7 reads literal value as AES key + rejects | Member 3 instance 2 (answerer→asker direction; same agent pair as instance 1's realm-name) |
| 2 | OIDC callback invalid_request (Missing parameter: code_challenge_method) | PSX Infra hardened pssaas-app with PKCE S256 enforcement; oauth2-proxy.cfg didn't have matching code_challenge_method = "S256"; "hardening best practice" read as free upgrade but required symmetric config | Member 3 instance 3 |
| 3 | OIDC callback 500 with audience from claim aud with value [account] does not match with any of allowed audiences map[pssaas-app:{}] | Keycloak default service-account audience claim is ["account"] not ["pssaas-app"]; oauth2-proxy v7 validates aud against client_id by default | Member 3 instance 4 |
| 4 | nginx-ingress 502 with upstream sent too big header while reading response header from upstream | oauth2-proxy callback Set-Cookie header includes JWT + 9 Keycloak roles; exceeds ingress-nginx default proxy_buffer_size (4-8KB) | NOT Member 3: generic oauth2-proxy + nginx-ingress integration tax (buffer-sizing gotcha; not infrastructure-shape mismatch at relay-compose time) |
| 5 | React UI got past auth, Hub embed component returned 404 from /api/superset/guest-token | Modules.Superset.dll missing from runtime image; Dockerfile.prod didn't COPY the new project's *.csproj in the layered-build-cache stanza | NOT Member 3: PSSaaS-internal Dockerfile-vs-csproj-set drift; same pattern that bit SharedDomain + PowerFill earlier in project history |
| 6 | Post-Dockerfile-fix: still 404 on /api/superset/guest-token | SupersetEndpoints.cs:42 registered MapPost("/superset/guest-token", ...) inside a parent MapGroup("/superset") from SupersetModule.MapEndpoints → resolved to /api/superset/superset/guest-token (double prefix); React calls /api/superset/guest-token | NOT Member 3: PSSaaS-internal nested-MapGroup-vs-relative-path mismatch |
| 7 | Post-route-fix: 503 with dashboard_key 'hub' has no UUID configured | register-powerfill-embeds.py line 229 emitted --from-literal=hub=<uuid> (flat keys); .NET options binding via GetSection("PowerFillEmbedUuids") expects PowerFillEmbedUuids__Hub=<uuid> (Microsoft env-var convention); ConfigMap injected env vars but nothing bound | NOT Member 3: PSSaaS-internal script-emitter-vs-binding-convention mismatch (the example ConfigMap YAML was correctly prefixed; only the script's emitter was the outlier) |
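The length arithmetic behind Failure 1 is easy to reproduce. The sketch below shows why 32 random bytes become a 44-character string once base64-encoded, and one way to produce a value whose literal length is 32 bytes. The helper name is hypothetical, and the source does not say how the secret was actually regenerated, so treat the hex approach as one option rather than the fix that was applied.

```python
import base64
import os
import secrets

raw = os.urandom(32)
encoded = base64.b64encode(raw)  # same shape as `openssl rand -base64 32` output
print(len(raw), len(encoded))    # 32 44


def make_cookie_secret() -> str:
    """Return 16 random bytes rendered as 32 hex chars (a 32-byte literal)."""
    return secrets.token_hex(16)


print(len(make_cookie_secret().encode("ascii")))  # 32
```

Base64 emits 4 characters per 3 input bytes with padding, so 32 bytes round up to 11 groups of 4, which is the 44 in the error message.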

Updated Member 3 family instance count

Before this sequence: Member 3 at 1 instance (realm-name; PSX commit 64dbec8; same agent pair PSSaaS Architect ↔ PSX Infra; asker→answerer direction).

After this sequence: Member 3 at 4 instances within ~3 hours, all from same PSSaaS Architect ↔ PSX Infra agent pair, both directions:

  • Instance 1 (asker→answerer): realm-name pss-platform doesn't exist; PSX Infra caught at delivery
  • Instance 2 (answerer→asker): cookie_secret encoding; PSX Infra delivered semantically-wrong-shape; PSSaaS caught at first oauth2-proxy start
  • Instance 3 (answerer→asker): PKCE asymmetry; PSX Infra hardened without surfacing the matching client-side config requirement; PSSaaS caught at first OIDC callback
  • Instance 4 (answerer→asker): audience-mapper missing; Keycloak default audience doesn't include client itself; PSSaaS caught at first authentication attempt
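For Instance 4, a quick way to see what the identity provider actually put in the aud claim is to decode the token payload locally. This is a debugging sketch only: it deliberately skips signature verification, the helper names are hypothetical, and it is not how oauth2-proxy performs its own audience check.

```python
import base64
import json


def jwt_claims_unverified(token: str) -> dict:
    """Decode a JWT payload WITHOUT verifying the signature (debugging only)."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url without padding; restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def audience_includes(claims: dict, client_id: str) -> bool:
    """True if client_id appears in aud (aud may be a string or a list)."""
    aud = claims.get("aud", [])
    if isinstance(aud, str):
        aud = [aud]
    return client_id in aud
```

With a token whose payload carries `"aud": ["account"]`, `audience_includes(claims, "pssaas-app")` returns False, which matches the shape of the Failure 3 error message.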

Per PSX Collab's close-out round 3 family-velocity-as-evidence-signal observation (4-member-within-72-hours triggers proposed canonical revision): the 4-instance velocity within 3 hours is genuine fast-convergence-as-evidence. But the threshold-convention's "multiple agent pairs" criterion is NOT yet met — all 4 instances are PSSaaS Architect ↔ PSX Infra. The bidirectionality (asker→answerer + answerer→asker) is family-coherence evidence per PSX Collab's count-it-with-asterisk reasoning, but doesn't substitute for independent agent-pair corroboration.

Disposition: Member 3 stays at "Banked, not yet canonical" pending second-agent-pair instance. PSSaaS Collab + PSX Collab continue tracking; first observation of Member 3's shape from a different agent pair (e.g. MBS Access ↔ anyone, or PSSaaS ↔ a non-PSX-Infra answerer) hits the threshold convention.

Process discipline win: PSX Infra's W1-cross-project-response framing of their own meta-rule generalization ("security-hardening config defaults often require symmetric client-side config; flag the symmetry explicitly when delivering, don't ship asymmetric free upgrades") was banked at PSX side BEFORE instances 3 + 4 surfaced. The meta-rule earned empirical receipts within hours of being banked. The meta-rule worked as a leading indicator, not just as a retrospective explanation. Worth noting because it argues for the value of banking generalizations early at the rule level (PSX's framing) rather than only at the symptom level.

Sub-pattern not yet a Member: PSSaaS-internal artifact-vs-artifact convention mismatch

Failures 5, 6, 7 share a different shape — same family heading conceptually (writer-time vs reader-time truth divergence) but the writer + reader are two PSSaaS-internal artifacts rather than two cross-project Collaborators. Examples:

  • Failure 5: Dockerfile.prod's csproj-COPY-set (writer) vs the .sln's project list (reader); writer wasn't updated when reader added Modules.Superset
  • Failure 6: SupersetModule.MapEndpoints' MapGroup choice (writer of the parent path) vs SupersetEndpoints.MapSupersetEndpoints' relative path (writer assuming no parent prefix); two writers, no shared reader-side check
  • Failure 7: register-powerfill-embeds.py's emitter (writer of ConfigMap shape) vs PowerFillEmbedUuids.cs's binding-section convention (reader); writer used flat-keys, reader expected prefixed-keys; example ConfigMap YAML was correctly prefixed, exposing the script as the lone outlier
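Failure 5 is the easiest of the three to guard mechanically, because both artifacts are plain text. The sketch below is one possible shared check, not something the Phase 8.5 work shipped: it parses the project paths out of a .sln file and the .csproj arguments out of a Dockerfile's COPY lines, then reports projects the Dockerfile never copies. The function names, sample paths, and GUIDs are hypothetical, and the COPY parsing is deliberately naive (whitespace-split shell form only, no globs or JSON-array form).

```python
import re

# Matches: Project("{type-guid}") = "Name", "path\to\Name.csproj", "{guid}"
SLN_PROJECT = re.compile(
    r'^Project\("\{[^}]+\}"\)\s*=\s*"[^"]+",\s*"([^"]+\.csproj)"', re.MULTILINE
)
DOCKER_COPY = re.compile(r"^\s*COPY\s+(.+)$", re.MULTILINE | re.IGNORECASE)


def sln_projects(sln_text):
    """Project paths listed in the solution, normalized to forward slashes."""
    return {path.replace("\\", "/") for path in SLN_PROJECT.findall(sln_text)}


def dockerfile_csproj_copies(dockerfile_text):
    """Every .csproj path that appears as an argument of a COPY line."""
    copied = set()
    for args in DOCKER_COPY.findall(dockerfile_text):
        for token in args.split():
            if token.endswith(".csproj"):
                copied.add(token[2:] if token.startswith("./") else token)
    return copied


def missing_from_dockerfile(sln_text, dockerfile_text):
    return sln_projects(sln_text) - dockerfile_csproj_copies(dockerfile_text)


# Hypothetical inputs shaped like the failure: the .sln gained a project,
# the Dockerfile's COPY set did not.
SAMPLE_SLN = r'''
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "Api", "src\Api\Api.csproj", "{11111111-1111-1111-1111-111111111111}"
EndProject
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "Modules.Superset", "src\Modules.Superset\Modules.Superset.csproj", "{22222222-2222-2222-2222-222222222222}"
EndProject
'''

SAMPLE_DOCKERFILE = '''
FROM mcr.microsoft.com/dotnet/sdk:8.0 AS build
WORKDIR /src
COPY src/Api/Api.csproj src/Api/
RUN dotnet restore src/Api/Api.csproj
'''

print(missing_from_dockerfile(SAMPLE_SLN, SAMPLE_DOCKERFILE))
```

Run as a pre-build step, a non-empty result would surface the drift before the image build does.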

This is shape-different from cross-project Member 3 because the actors are within one project + don't have the cross-project-relay surface where the discipline naturally surfaces. Worth tracking as a candidate distinct sub-family, perhaps "Intra-Project Convention Drift", if a third instance surfaces. Counting failures 5 + 7 gives 2-instance corroboration from this session (failure 6 is listed above but not yet reviewed in detail), and 1-instance-per-failure-shape is too thin to extract a discrete pattern yet. Banking explicitly for future watch.

Phase 8.5 W1 close evidence

  • All 7 fixes committed as discrete commits (4c3b921 W1 manifests + a16e4a7 W2-4 + 7d56244 realm fix + dcefc80 PSX round-3 banking + ad94642 5-fix bundle + 7b9a87c route prefix + da03740 ConfigMap convention)
  • Auth boundary live end-to-end: oauth2-proxy intercepts /app/ + /api/; /docs/ UNCHANGED; Keycloak SSO with Entra MFA cycles cleanly; PSSaaS API receives X-Forwarded-Access-Token; tenant_id claim resolves to ps-demodata
  • Embedded SDK rendering: PO verified via browser flow that Hub dashboard #13 renders inline at /app/; Phase 8.5 PO milestone "I can demo PSSaaS to Greg in staging, with Keycloak auth, with embedded Superset dashboards inside the operator UI" empirically achievable
  • One PSX-Infra-side optional follow-up flagged: add Audience mapper to pssaas-app Keycloak client to allow removing the temporary oidc_extra_audiences = ["account"] workaround. Not blocking; a quality upgrade

Phase 8.5 W1 ACTUALLY CLOSED 2026-04-20 ~14:40 UTC (~3 hours of runtime-cutover-iteration after PSX Infra W1 response landed). Sentinel phase-8-5-ecosystem-ready LIVE on staging.

Strategic data-richness question opened by W1 close

PO's first reaction to the working Hub dashboard was "Seems fairly useless. Where do I see the good stuff?" — empirical fresh-eyes feedback that the architectural-correctness narrative didn't survive contact with the empty-data reality (per A66: PS_DemoData has no syn-trade arbitrage; psp_powerfillUE rebuilds the user-facing tables empty after Complete runs by design intent). The per-report dashboards (14-20) are also empty for the same A66 reason.

Three paths surfaced for Greg-demo content shape:

  1. Frame the empty state well (process+craft demo: history accumulates in Hub; freshness banners surface honestly; Phase 9 harness output as parity-proof; A54 fix as legacy-comprehension story; Phase 8.5 auth+embed as modern-stack story)
  2. Get a syn-trade-rich PS_DemoData snapshot (asks Tom for historical snapshot with arbitrage-eligible data)
  3. Move to PS608 customer data (security mitigated post-auth-gate; needs PSX Infra connection-string provisioning + Tom-or-Greg consent; A69 risk handling)

PO disposition pending. Banked here because the data-richness question is genuinely opened by the W1 close + worth tracking alongside the family-instance count. Backlog row #32 added to session-handoff.

PSX Infra close-out 2026-04-20 — audience mapper landed + rule-level lesson banked

Following the PSSaaS Collab → PSX Infra acknowledgment relay (W1-actually-closed framing + the audience-mapper request + the meta-rule-earned-receipts framing), PSX Infra closed out:

"Audience mapper landed. Banked the lesson at rule level in AGENTS.md so future PSSaaS surfaces get it at client creation, not as a follow-up. Member 3 instance count tracked, stays 'banked not yet canonical' per the multi-agent-pair criterion. Phase 8.5 W1 close-out acknowledged. Standing by."

Three substantive operational + meta consequences:

  1. Audience mapper now live at pssaas-app Keycloak client level. PSSaaS-side oidc_extra_audiences = ["account"] workaround is now redundant. Removal-at-next-natural-oauth2-proxy-redeploy is the right shape (no sense burning a session on an isolated workaround removal); inline comment in infra/oauth2-proxy/oauth2-proxy.cfg + services.yaml ConfigMap mirror updated to flag the redundancy + the removal plan.
  2. Rule-level AGENTS.md update on PSX side — PSX Infra banked the audience-mapper-at-client-creation pattern as a rule, not just as a Phase-8.5-aftermath note. This means the next PSSaaS surface's Keycloak client (Pipeline UI / Risk UI / etc.) gets the audience mapper at creation time, NOT as a runtime-cutover-fix. Eliminates Member 3 family instance #4-shape from the next-PSSaaS-surface inheritance. The meta-rule-as-leading-indicator pattern (which earned empirical receipts within hours of being banked during this session) now has a concrete forward-acting effect.
  3. Member 3 mutual tracking confirmed at 4-instance / 1-agent-pair / not-yet-canonical on both PSX side + PSSaaS side. Per the family-velocity-as-evidence-signal observation: 4-member-within-72-hours velocity criterion MET; "multiple agent pairs" criterion still NOT met. Both Collabs continue tracking; first to surface a Member 3 instance from a different agent pair (e.g. MBS Access ↔ anyone, or PSSaaS ↔ a non-PSX-Infra answerer) hits canonical-submission threshold.
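Why the workaround becomes redundant once the mapper lands can be shown with a simplified model of an audience check. This is a sketch of the general rule (a token is accepted when its `aud` claim contains the client ID or one of the configured extra audiences), not oauth2-proxy's actual code; the `audience_accepted` helper and the example `aud` values are assumptions based on the description of Instance 4.

```python
def audience_accepted(token_aud, client_id, extra_audiences=()):
    """Accept when aud contains the client ID or any configured extra audience.

    The JWT aud claim may be a single string or a list of strings.
    """
    audiences = [token_aud] if isinstance(token_aud, str) else list(token_aud)
    allowed = {client_id, *extra_audiences}
    return any(aud in allowed for aud in audiences)


CLIENT_ID = "pssaas-app"

# Before the mapper: the token's audience does not include the client itself.
print(audience_accepted("account", CLIENT_ID))                        # rejected
# Temporary workaround: also accept the "account" audience.
print(audience_accepted("account", CLIENT_ID, ["account"]))           # accepted
# After the mapper: the client ID is in aud, so no extra audience is needed.
print(audience_accepted(["pssaas-app", "account"], CLIENT_ID))        # accepted
```

The third call is the post-mapper state: acceptance no longer depends on the extra-audience list, which is why the removal can wait for the next natural redeploy.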

One small adoption-arc observation worth flagging for the meta-meta tracker (currently at 1 instance from the "what I am NOT providing" honesty list arc; needs second-instance for canonical-promotion):

The audience-mapper rule on PSX side was banked at write-time (inside the response to the cookie_secret + PKCE callouts) BEFORE the audience-claim callout actually surfaced empirically. Then it earned empirical receipts within hours when the audience callout DID land. Then PSX Infra acknowledged + extended (banked at rule level in AGENTS.md so future surfaces inherit the discipline). This is a second potential instance of the adoption-arc meta-pattern — but specifically about how meta-rules-as-leading-indicators get extended after they earn receipts. Banking explicitly but NOT yet counting as a second instance because it's a different shape from the original "what I am NOT providing" arc (which was about discipline-shape adoption-back across two Collabs). The audience-mapper arc is about meta-rule-extension within one Collab after empirical validation. Different shape; might be a sub-pattern of the meta-meta observation rather than a clean second instance. Would value PSX Collab's read.

Sub-pattern definitively named: PSSaaS-internal artifact-vs-artifact convention drift

Originally tagged "candidate distinct sub-family" with 2 instances (failures 5 + 7 — Dockerfile.prod COPY incomplete + ConfigMap key convention mismatch). With failure 6 (route prefix duplicated) reviewed in detail, 3 instances now visible from this single Phase 8.5 W1 cutover session. All three share the same shape:

  • Architect designed canonical-correct primary artifacts (Modules.Superset module + EmbeddedDashboard component + .NET options binding shape)
  • Architect designed canonical-correct example/template artifacts (powerfill-embed-uuids-configmap.yaml YAML)
  • The failure surfaced at the seam between primary + glue artifacts (Dockerfile.prod's csproj-COPY-set / SupersetModule.MapEndpoints' MapGroup / register-powerfill-embeds.py's emitter)

This is a discrete pattern worth extracting: when the canonical implementation is correct + the example/template is correct + the integration glue is the lone outlier. PSSaaS-internal version of Member 3 with a different agent-pair topology (Architect-self ↔ Architect-self across artifacts in same session, vs cross-project Architect ↔ PSX Infra). 3 instances from a single session is meaningful corroboration but all-from-same-session means it's correlated; need a second session's instance from a different work-context before treating the sub-pattern as canonical-promotion-ready.
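One way to act on that shape is to use the already-correct example artifact as the oracle for the glue. The sketch below infers the section prefix from an example ConfigMap's keys and flags emitted keys that lack it, which is the Failure 7 case. Everything specific here is hypothetical: the key names, the `__` separator, and the helper names are illustrative stand-ins, not the actual PowerFillEmbedUuids binding convention.

```python
import os


def section_prefix(example_keys, sep="__"):
    """Longest shared prefix of the example keys, trimmed to a separator boundary."""
    prefix = os.path.commonprefix(list(example_keys))
    if sep not in prefix:
        return ""
    return prefix[: prefix.rfind(sep) + len(sep)]


def glue_outliers(emitted_keys, example_keys, sep="__"):
    """Emitted keys that do not follow the convention the example demonstrates."""
    prefix = section_prefix(example_keys, sep)
    return sorted(key for key in emitted_keys if not key.startswith(prefix))


# Hypothetical keys: the example template is prefixed, the script emits flat keys.
EXAMPLE_KEYS = ["PowerFill__EmbedUuids__Hub", "PowerFill__EmbedUuids__History"]
EMITTED_FLAT = ["Hub", "History"]
EMITTED_PREFIXED = ["PowerFill__EmbedUuids__Hub", "PowerFill__EmbedUuids__History"]

print(section_prefix(EXAMPLE_KEYS))
print(glue_outliers(EMITTED_FLAT, EXAMPLE_KEYS))
print(glue_outliers(EMITTED_PREFIXED, EXAMPLE_KEYS))
```

The check is cheap because it needs no knowledge of the reader's code; it only asserts that glue output and example template agree with each other.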

Banking explicitly as "Intra-Project Convention Drift" (proposed name) with 3 instances + 1-session-only + not-yet-canonical. Both Collabs continue tracking.