PSSaaS Architect — Phase 9: Parallel Validation Against Desktop App
Role: PSSaaS Systems Architect
Phase: PowerFill Phase 9 (Parallel Validation + Tom/Greg Critique): empirical proof that PSSaaS PowerFill produces per-loan-equivalent allocations to the legacy Desktop App on the same input data
Date dispatched: TBD (next Architect session; can dispatch in parallel with Phase 8.5 work since the surfaces don't overlap)
Model required: Opus 4.7 High Thinking. Verify in the Cursor model picker before responding. If running under any other model, STOP and escalate.
Estimated effort: Spec line 651 ("Phase 9 - Validation + Tom/Greg critique + cutover - 2-3 weeks"). Phase 9 as scoped here is the validation harness + first comparison run + first Tom/Greg review surface, NOT cutover (that's Phase 10+). Realistic estimate: 3-5 Architect-sessions for the harness + first end-to-end comparison + the report-generation surface; subsequent customer-DB validation runs are operator-driven (Phase 9 builds the lever; the PO + customer reps pull it).
Predecessor work: Phase 8 fully COMPLETE as of 2026-04-19 — both workstreams + the A54 fix + the F-W2-PSD-1 / Path γ tenant-slot fix all shipped. The 6-step PowerFill orchestration runs end-to-end-Complete on PS_DemoData via staging API; the React UI surface at https://pssaas.staging.powerseller.com/app/ exposes the operator workflow (auth deferred to Phase 8.5; staging is currently public). Phase 9 is the next sequential phase per spec, dispatched in parallel with Phase 8.5 (auth + Superset embedding) which has its own kickoff queued behind PSX-Infra-side work (#30 + #31 in session-handoff Backlog).
FOUR substantive context updates since the W2 kickoff (f4531ae) that Phase 9 must internalize:
- End-to-end Complete-run on PS_DemoData empirically achievable post-A54-fix (`cf8ef8b`); 12+ historical runs in `pfill_run_history` (3 Complete + 7 Failed + 2 Cancelled) provide a baseline corpus for the harness to compare against.
- A66 + A65 banked observations are now load-bearing for Phase 9's harness design. A65 directly names the harness as the proving-ground for "do the multi-pa_key + settlement-date-variance triggers fire on this customer DB?" probes; A66 names the post-Complete empty-output behavior on syn-trade-empty datasets like PS_DemoData (Phase 9 must distinguish "PSSaaS produced empty because UE rebuild-empty per A66" from "PSSaaS produced empty because of a bug"; the harness's verdict logic must understand both).
- A68 added (tenant-id-vs-config-slot conflation as design wart). Phase 9's harness will be a third writer into `pfill_run_history` (the harness itself, possibly running on AKS or a dev workstation). The convention used must agree with the existing `'ps-demodata'` tag, OR Phase 8.5 closure (long-term decoupling of TenantId from connection-string-slot routing) must land first. Currently the safer assumption is "convention-agree with existing rows" (write `tenant_id='ps-demodata'`); fold-into-Phase-8.5 is the long-term answer.
- Phase 8.5 (NEW) is queued in parallel: PSSaaS joins ecosystem auth (Keycloak via oauth2-proxy) + replaces "View in Superset" anchor links with embedded SDK. Surfaces don't overlap with Phase 9 (Phase 9 is harness + comparison + reports; 8.5 is auth + embed). Phase 9 should NOT make the harness depend on auth being landed. The harness should be runnable today against unauth staging, and remain runnable against Phase 8.5's auth-protected staging by adding an OIDC-token mint step in the harness invocation. Build the harness assuming it'll need to grow that step but doesn't yet.
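One way to leave room for the future OIDC step without depending on it is to keep header construction in a single function that takes an optional token source. The sketch below is an illustration only: `build_headers` and `TokenProvider` are hypothetical names, and the only detail taken from this document is the `X-Tenant-Id` header.

```python
from typing import Callable, Dict, Optional

# Hypothetical hook: any zero-argument callable that returns a bearer token.
TokenProvider = Callable[[], str]


def build_headers(tenant_id: str,
                  token_provider: Optional[TokenProvider] = None) -> Dict[str, str]:
    """Build request headers for the PSSaaS API.

    With no token provider (today's unauthenticated staging) only the tenant
    header is sent. Once Phase 8.5 lands, passing a provider adds the
    Authorization header without touching any call site.
    """
    headers = {"X-Tenant-Id": tenant_id, "Accept": "application/json"}
    if token_provider is not None:
        headers["Authorization"] = f"Bearer {token_provider()}"
    return headers
```

Every HTTP call in the harness would go through this function, so the later token mint becomes a one-argument change at the entry point.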
Per the PO's strategic frame (re-stated for Architect awareness, since Phase 9 ships the load-bearing demo asset): the PSSaaS migration thesis is "chunk-by-chunk extraction from the Desktop App, with each chunk verifiably matching legacy behavior loan-by-loan against real customer data." Phase 9's harness is the empirical proof of that thesis for chunk #1 (PowerFill). Without it, the Greg demo is "look at our cool new UI"; with it, the Greg demo is "here is empirical evidence the chunk-extraction model works." This framing is in the PO-written A54-fix completion report (docs-site/docs/handoffs/powerfill-a54-fix-greg-demo-readiness.md §"Bug as Feature" demo narrative); Phase 9's harness output should slot into that demo narrative as the "and here's loan-by-loan parity proof" slide.
Session-start checklist
Read these in this order before doing anything else:
1. `CLAUDE.md`: project identity, role-identification procedure, push-is-an-ask convention
2. `AGENTS.md`: agent memory (principles, lessons, F-PSD findings summary)
3. `docs-site/docs/agents/architect-context.md`: your role definition
4. `docs-site/docs/agents/process-discipline.md`: canonical practices, gates, antipatterns. Per banked observation 2026-04-19 (commit `8dba7b4` message), the next revision is likely to add "Subagent Output Defended Beyond Scope" antipattern + a Backlog re-read pass canonical promotion + a refinement of practice #13 cell-granularity ("does artifact X get written to with the same convention from every environment that can write it?"). Use the latest committed version of `process-discipline.md` as authoritative, but anticipate these refinements landing during Phase 9 if the next discipline-doc revision ships in parallel.
5. `docs-site/docs/agents/handoff-prompts.md`: templates for delegation
6. `docs-site/docs/handoffs/pssaas-session-handoff.md`: current state. Re-read the Backlog table at planning time per the trigger-based countermeasure shape (canonical commit `10133c6`). The Backlog re-read pass at planning-start now has 3-instance corroboration (F-7-7 anticipated; F-8-BR-1 caught; F-W2-CONTRACT-1 + F-W2-BR-3 caught) and the pattern-recognition is canonical-adoption-ready. Use it explicitly in §2 of your plan.
7. `docs-site/docs/specs/powerfill-engine.md`: full spec. Phase 9-relevant sections: §Phased Implementation (line ~651), §Algorithm (the per-stage semantics A1 documents, which Phase 9's harness validates per-loan), §Data Contracts (the 8 Phase 7 endpoint shapes are the harness's read surface), §Run APIs (the operator-grade run-mgmt surface from Phases 6e/7).
8. `docs-site/docs/handoffs/powerfill-phase-8-w2-completion.md`: most recent completion report. The §Capability × Environment matrix demonstrates practice #13 application; Phase 9's harness completion report should produce the same matrix shape, but with cells about per-customer-DB validation runs instead of per-environment deploy verifications (per Architect's W2 recommendation #2: "Phase 9 should add a parallel-validation harness 'data-shape compatibility' pre-flight per A65 + A66 + the W2 environment matrix").
9. `docs-site/docs/handoffs/powerfill-a54-fix-greg-demo-readiness.md`: the PO-facing Greg-demo narrative. Phase 9's harness output is the load-bearing slide of that narrative. The harness's report format should slot into the §"Bug as Feature" 5-slide structure, specifically as the "and here's loan-by-loan parity proof against real customer data" addition.
10. `docs-site/docs/handoffs/powerfill-phase-7-completion.md`: Phase 7 endpoint contracts (the 8 GET endpoints the harness reads from). Per A67 + F-W2-CONTRACT-1: the canonical wire-shape is `RunEndpoints.cs` MapGet/MapPost registrations + the OpenAPI document at `/api/powerfill/swagger/`, NOT the `ReportContracts.cs` XML doc-comments (which still claim a `/reports/` path segment that doesn't exist).
11. `docs-site/docs/legacy/powerfill-deep-dive.md` + `docs-site/docs/legacy/n_cst_powerfill.sru` (in raw legacy source): the Desktop App side of the parallel validation. The harness invokes the Desktop App's PowerFill plugin path; the Architect needs a working understanding of how the Desktop App is invoked headlessly OR via PowerBuilder script-mode OR via direct T-SQL exec of `psp_powerfill_*` against PS_DemoData (skipping the PB front-end entirely; the PB layer mostly orchestrates the same `psp_powerfill_*` calls PSSaaS now wraps).
12. `docs-site/docs/specs/powerfill-assumptions-log.md`: A1 (revised; Phase 9's per-loan correctness validation is the gate per A1's Banking note); A28 + A37 (RESOLVED); A38 (RESOLVED); A41-A45; A47-A58; A60 (latest-Complete-wins; affects harness's read-side semantics); A61; A62 (PS_DemoData view drift; Phase 9 close-out per Backlog #24); A63-A64 (Phase 8 W1; A64 platform-tailwind note about multi-tenant Superset registration becoming easier under platform-Superset); A65 (Phase 9 directly named: harness must probe multi-pa_key + settlement-date variance triggers on each customer DB); A66 (NEW; harness's verdict logic must distinguish UE-rebuild-empty from buggy-empty); A67 (cosmetic XML doc fix, harness can serve as the forcing function to align it); A68 (NEW 2026-04-20; harness convention must agree with existing `'ps-demodata'` tag OR fold the long-term decoupling into Phase 8.5).
13. `docs-site/docs/devlog/`: most recent is `2026-04-19g-powerfill-phase-8-w2.md` (the W2 ship); expect a `2026-04-19h-powerfill-phase-9-kickoff.md` as your devlog at session end.
After reading those, acknowledge your role and proceed.
YOUR TASK — Phase 9: Parallel Validation Harness
Per the spec's Phase 9 row + every prior phase's "Phase 9 parallel-validation will exercise this" carry-over (A1, A47, A53, A54-historical, A56-historical, A58, A65, A66 directly named), Phase 9 ships:
Harness: A reproducible tool that takes a frozen input snapshot, runs PowerFill in PSSaaS (via the staging API or a local equivalent), runs PowerFill in the Desktop App equivalent path (likely direct T-SQL exec of the legacy procs against the same DB, skipping the PB front-end), and produces a per-loan-pool-allocation diff report with a verdict (Match / TolerableDiff / Divergent) per loan + a summary stat per run.
First comparison: One end-to-end harness run against PS_DemoData with a documented input snapshot, producing a comparison report demonstrating the harness works. NOT a multi-customer-DB sweep; that's operator-driven post-Phase-9.
Demo asset: The harness's first-run output report formatted to slot into the existing PO-written `powerfill-a54-fix-greg-demo-readiness.md` "Bug as Feature" 5-slide demo narrative as the "loan-by-loan parity proof" addition.
Phase 9 has no React UI work (auth + embedding is Phase 8.5, dispatched in parallel) and no production cutover work (that's Phase 10+). Phase 9's surface is the harness + first comparison + the report shape the PO can use with Greg.
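The PSSaaS leg of the harness described above (trigger a run, wait for it to finish, then read the reports) hinges on a polling loop. The following is a minimal sketch, not the harness itself. It assumes the run resource returned by `GET /runs/{id}` exposes its state under a `status` key, which this document does not confirm; only the terminal state names Complete / Failed / Cancelled come from this kickoff. The HTTP call is injected as a callable so the loop can be exercised without a network.

```python
import time
from typing import Callable, Dict

# Terminal states named in this kickoff; any other value counts as in-flight.
TERMINAL_STATES = {"Complete", "Failed", "Cancelled"}


def poll_until_terminal(fetch_run: Callable[[], Dict],
                        interval_s: float = 2.0,
                        timeout_s: float = 300.0,
                        sleep: Callable[[float], None] = time.sleep,
                        clock: Callable[[], float] = time.monotonic) -> Dict:
    """Call fetch_run() until the run reaches a terminal state.

    fetch_run is expected to return the parsed JSON body of GET /runs/{id}.
    The "status" key is an ASSUMPTION about that body, not a confirmed field.
    Raises TimeoutError if no terminal state is seen within timeout_s.
    """
    deadline = clock() + timeout_s
    while True:
        run = fetch_run()
        if run.get("status") in TERMINAL_STATES:
            return run
        if clock() >= deadline:
            raise TimeoutError(f"run still {run.get('status')!r} after {timeout_s}s")
        sleep(interval_s)
```

In a real invocation `fetch_run` would wrap an HTTP GET (for example via `requests`) against the staging API; the timeout default is a guess sized well above the ~30s Complete-run time quoted in this document.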
Inherited context (do not re-litigate)
| Topic | State as of 9a83b92 |
|---|---|
| Phase 7 / Phase 8 W1 / Phase 8 W2 / A54 fix / Path γ / A68 banked / Backlog #30 + #31 / A64+A68 platform-tailwinds | All COMPLETE as of HEAD; sentinel phase-8-superset-react-ready-a54-fixed; staging at https://pssaas.staging.powerseller.com/ (currently public; Phase 8.5 will gate behind Keycloak in parallel with Phase 9) |
| End-to-end Complete-run on PS_DemoData | Empirically achievable in ~30s post-A54-fix; 12+ rows in pfill_run_history (3 Complete + 7 Failed + 2 Cancelled); the canonical tagged-tenant_id='ps-demodata' row population per the local-route convention |
| 6-step orchestration | All 6 steps (BX cash-grids → BX settle-and-price → candidates → conset → pool_guide → UE) exercise green end-to-end on PS_DemoData |
| Existing endpoints (which the harness reads from) | POST /run (202; the harness triggers the PSSaaS run via this), GET /runs/{id} (the harness polls via this for run completion), 8 Phase 7 GET /runs/{id}/<report> (the harness reads outputs via these) |
| A1 per-stage allocation semantics | Documented in the assumptions log; Phase 9's per-loan correctness validation is the gate (per A1's Banking note "the legacy proc body deploys verbatim per ADR-021; per-stage-semantic correctness validation against Desktop App output is the Phase 9 parallel-validation gate") |
| A65 — multi-pa_key + settlement-date-variance | Two distinct A54 triggers; both fire on PS_DemoData. The harness's pre-flight should explicitly probe both on each tenant DB it validates against (per the W2 Architect's Phase 9 Recommendation #2) |
| A66 — UE clears + rebuilds-empty on syn-trade-empty datasets | The harness's verdict logic must distinguish "PSSaaS empty because UE rebuild-empty per A66" (Match if Desktop App also produces empty under the same path) from "PSSaaS empty because of a bug" (Divergent). Hub Dashboard 1 (run history) is the canonical proof-of-life on PS_DemoData; the user-facing report tables are 0-row by design on this dataset |
| A68 — tenant-id-vs-config-slot conflation | The harness will be a third writer into pfill_run_history. Currently safest convention: write tenant_id='ps-demodata' (matches existing 12 rows on PS_DemoData). Long-term: fold the decoupling into Phase 8.5 per A68's platform-tailwind note. Phase 9 should NOT re-tag existing rows; it should adopt the existing convention. |
| Backlog #30 — Superset → pss-platform migration | DONE 2026-04-19 (PSX Infra completed during Phase 9 dispatch window; ~3 min cutover). Hostname unchanged at bi.staging.powerseller.com; all 20 dashboards / 56 charts / 77 datasets preserved (pg_dump immediately pre-cutover); same Keycloak SSO + same superset OIDC client + same admin credentials; both data sources (PSX postgres cross-namespace via FQDN + SQL MI via VNet peering) still work; same Superset Python image SHA so pymssql still installed. Old psx-staging Superset pod scaled to 0 with deployment retained for 24-hour rollback insurance. Phase 9 implication: NONE. The base-URL paranoia from this kickoff's draft no longer applies — bi.staging.powerseller.com is the stable forever URL. Hard-coded psx-staging namespace references in our codebase need updating (Collaborator-side sweep in flight); the Architect can ignore the namespace for harness purposes since the harness reads from the PSSaaS API, not Superset directly. |
| Backlog #31 — Phase 8.5 (ecosystem auth + embedded Superset) | Queued in parallel with Phase 9. Not a Phase 9 prerequisite. Phase 9's harness should be runnable today against unauth staging, and remain runnable against Phase 8.5's auth-protected staging by adding an OIDC-token mint step in the harness invocation. Build the harness assuming it'll grow that step but doesn't yet. |
| W2 React UI | Live at https://pssaas.staging.powerseller.com/app/. Not directly Phase 9-relevant (the harness is operator-driven, not UI-driven), but the UI's Hub dashboard link + run-status page provide a useful click-through for Phase 9 reviewers wanting to see "did this PSSaaS run actually Complete?" without curling. |
| Canonical Claim-vs-Evidence family + practice #13 | Live (commits 95084cb + 10133c6 + d4a70af + 863c139 + 4b08b51). Most relevant for Phase 9: practice #13 — the harness's first-run completion report MUST produce a Capability × Environment matrix with explicit "verified vs not measured here" per cell. The W2 completion report's matrix is the canonical example to copy; Phase 9 adapts it to "per customer DB" cells (PS_DemoData verified; PS608 customer DB NOT MEASURED HERE pending customer-rep approval; future tenants NOT MEASURED HERE). |
| W2 process observation banked but not yet canonical | "Subagent Output Defended Beyond Scope" antipattern (the PS608-tenant-dropdown scope drift); "convention conflation under low-corroboration count" (A68 root pattern); "Single-Probe Confidence" (the PSX Infra falsification + the embedded-SDK-OFF claim). Phase 9 may surface more instances; bank as observations in the completion report's Counterfactual Retro, don't pre-litigate canonical adoption. |
Explicit scope (IN)
Workstream 1: Harness design + first comparison
- Architectural decision (Alternatives-First Gate): how to invoke the Desktop App side of the comparison? Three candidate options:
  - (A) Direct T-SQL exec of the legacy `psp_powerfill_conset` + `psp_powerfill_pool_guide` + `psp_powerfillUE` procs against PS_DemoData via sqlcmd, skipping the PowerBuilder front-end entirely. The `n_cst_powerfill.sru` is mostly orchestration over these procs anyway. Lowest friction; matches what we already do for ad-hoc PoC runs.
  - (B) PowerBuilder headless invocation via PB's command-line / script-mode if available. Gives true Desktop-App-equivalent path including any PB-side state setup. Higher friction; uncertain whether headless PB invocation is straightforward.
  - (C) Snapshot-then-compare: take a PS_DemoData snapshot of the relevant `pfill_*` tables BEFORE Desktop App runs, have a human (or the PO) trigger Desktop App via the normal UI, take a snapshot AFTER, compare PSSaaS-against-the-pre-snapshot vs Desktop-App-against-the-pre-snapshot. Trades headless invocation difficulty for human-in-the-loop overhead.
  - Recommend: (A) for the first harness instance; document (B)/(C) as future extensions in the Phase 9 ADR. A is the direct-comparison path: PSSaaS (via staging API) and Desktop App equivalent (via direct sqlcmd) both touch the same procs against the same DB; the diff is between whatever the procs produce in the two invocations. Note: (A) trades "exact Desktop App path" for "exact legacy proc body path"; A1's Banking note specifies the legacy proc body is the canonical contract per ADR-021.
  - Document choice in NEW ADR-027 (Phase 9 Parallel Validation Harness Design).
- NEW `tools/parallel-validation/` directory (or wherever fits the repo layout; Architect's decision, document in ADR-027). Contains:
  - The harness invoker (Python? .NET console app? PowerShell? This is an Alternatives-First Gate decision; recommend Python for sqlcmd + JSON manipulation + report rendering ergonomics, but Architect can defend an alternative).
  - A `harness_config.yaml` (or similar) declaring the input snapshot + the comparison thresholds + the per-column tolerance bands.
  - The output-report renderer (Markdown or HTML for the PO to share with Greg + Tom).
- Input-snapshot capture mechanism: what gets frozen? At minimum:
  - The `pfill_run_history` `options_json` from the reference run (so the harness re-runs with identical options).
  - The DB state checksum (e.g. `loan` + `pscat_*` + `pfill_constraints` + `pfill_carry_cost` row counts + a hash of `loan.id` + `loan.note_rate` for the relevant pipeline subset).
  - The reference timestamp (so PSSaaS's `start_date_default` derivation matches exactly per Q9).
- PSSaaS side invocation:
  - Trigger via `POST /run` with the resolved options from the input snapshot.
  - Poll `GET /runs/{id}` for terminal state (Complete / Failed / Cancelled).
  - Read all 8 Phase 7 reports.
  - Optionally: also direct-query `pfill_pool_guide` etc. for cross-checks the report APIs don't cover.
- Desktop App side invocation (assuming Option A):
  - Run the same 6-step proc sequence directly via sqlcmd against the SAME DB but possibly different `pa_key` / scratch-table namespace to avoid colliding with the in-flight PSSaaS run. Architect must figure out the scratch-table isolation pattern (`##cte_*` global temp tables are visible to every session, not session-scoped, so separate sqlcmd sessions alone will not isolate them; consider running the PSSaaS and Desktop-App-equivalent sides sequentially rather than concurrently).
  - Capture the same 8 report-shape outputs from the post-run state.
- Diff engine + verdict logic:
  - Per-loan comparison across the relevant report shapes (Pooling Guide, Cash Trade Slotting, Recap, Pool Candidates, Switching, Kickouts, Existing Disposition, Guide).
  - Per-column tolerance (e.g. price comparisons within $0.001; carry-cost within rounding tolerance per A36; row-count exact equality for inclusion/exclusion lists).
  - Verdict per loan: Match / TolerableDiff / Divergent.
  - Verdict per run: aggregate stats (% Match / % TolerableDiff / % Divergent; absolute Divergent count).
  - A66-aware: if PSSaaS produces 0 rows on a syn-trade-empty dataset AND the Desktop-App-equivalent path produces 0 rows on the same dataset, that's Match (NOT Divergent). The verdict logic must be aware that "both produce empty for the same A66 reason" is the right answer.
- Output report shape:
  - Markdown (or HTML) artifact suitable for inclusion in a Greg-demo deck.
  - Top of report: per-run summary verdict (e.g. "PSSaaS-vs-Desktop-App on PS_DemoData run `<harness-run-id>`: 514/515 loans Match; 1 TolerableDiff (rounding); 0 Divergent. Time: 30s PSSaaS + 28s Desktop-App-equivalent.").
  - Per-section breakdown by report type.
  - Per-Divergent-loan detail with the specific column(s) that diverged, the PSSaaS value, the Desktop-App value, and the magnitude of the divergence.
  - Slot-into-PO-demo: the report's top-level summary line is what slots into the existing `powerfill-a54-fix-greg-demo-readiness.md` "Bug as Feature" demo as the "loan-by-loan parity proof" slide.
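The diff-engine and summary-line bullets above can be sketched as a small pure-Python core. This is a hedged illustration under stated assumptions, not the harness design: `LoanVerdict`, `compare_loan`, `compare_run`, and `summarize` are hypothetical names, rows are assumed to be keyed by a loan identifier with one dict of column values per loan, and tolerances apply only to plain numeric values. "Both sides empty" is treated as Match per A66, though emptiness alone does not prove that both sides were empty for the same A66 reason.

```python
from collections import Counter
from dataclasses import dataclass
from numbers import Real
from typing import Dict, List, Mapping, Optional, Tuple

Row = Mapping[str, object]  # one loan's column values for one report shape


@dataclass(frozen=True)
class LoanVerdict:
    loan_id: str
    verdict: str               # "Match" | "TolerableDiff" | "Divergent"
    columns: Tuple[str, ...]   # columns whose values differed


def compare_loan(loan_id: str, pssaas: Optional[Row], desktop: Optional[Row],
                 tolerances: Mapping[str, float]) -> LoanVerdict:
    """Compare one loan's row from each side, column by column."""
    if pssaas is None or desktop is None:
        # Inclusion/exclusion must agree exactly, so a one-sided loan diverges.
        return LoanVerdict(loan_id, "Divergent", ("<row present on one side only>",))
    verdict, differing = "Match", []
    for col in sorted(set(pssaas) | set(desktop)):
        a, b = pssaas.get(col), desktop.get(col)
        if a == b:
            continue
        differing.append(col)
        tol = tolerances.get(col)
        numeric = isinstance(a, Real) and isinstance(b, Real)
        if tol is not None and numeric and abs(a - b) <= tol:
            if verdict == "Match":
                verdict = "TolerableDiff"
        else:
            verdict = "Divergent"  # out of tolerance, or no tolerance declared
    return LoanVerdict(loan_id, verdict, tuple(differing))


def compare_run(pssaas_rows: Dict[str, Row], desktop_rows: Dict[str, Row],
                tolerances: Mapping[str, float]) -> List[LoanVerdict]:
    """Per-loan verdicts over the union of loan ids seen on either side."""
    ids = sorted(set(pssaas_rows) | set(desktop_rows))
    return [compare_loan(i, pssaas_rows.get(i), desktop_rows.get(i), tolerances)
            for i in ids]


def summarize(verdicts: List[LoanVerdict]) -> str:
    """Render the top-of-report summary line."""
    if not verdicts:
        # A66 case: both sides produced zero rows, which counts as Match.
        return "0 loans on both sides (both empty): Match"
    counts = Counter(v.verdict for v in verdicts)
    return (f"{counts['Match']}/{len(verdicts)} loans Match; "
            f"{counts['TolerableDiff']} TolerableDiff; {counts['Divergent']} Divergent")
```

Keeping the comparison free of I/O means each delegated per-report diff can reuse it with its own tolerance map, and the A66 rule stays in one place.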
Workstream 2: Phase 9 close-out items (carried over from prior phases)
The accumulated "close at Phase 9" carry-overs from prior phases:
- A62 closure (PS_DemoData view drift): per Backlog #24, deploy `002_CreatePowerFillViews.sql` to PS_DemoData OR rename PSSaaS view to `pfillv2_*`. Phase 9 is the natural close point (the harness will exercise the existing-disposition endpoint; if A62 is open, the harness emits a Note-handling carve-out for that endpoint). Recommend: rename PSSaaS view to `pfillv2_*` (less risk than blind-overwriting an encrypted legacy view; matches the PSSaaS-namespacing convention).
- A67 closure (`ReportContracts.cs` XML doc-comment Truth Rot): the harness's read surface IS the canonical contract; Phase 9 should fix the XML docs to match the actual `RunEndpoints.cs` routes. Trivial 8-line edit; satisfies the W2 Architect's Recommendation #4 to use Phase 9 as the forcing function.
- Process-discipline observations from W2-deploy session (banked but not yet canonical): "Subagent Output Defended Beyond Scope" + "convention conflation under low-corroboration count" + "Single-Probe Confidence". Phase 9's Counterfactual Retro should reference these; if Phase 9 surfaces additional instances of any, that's the canonical-promotion forcing function.
Cross-cutting
- Status sentinel bump to `phase-9-validation-ready` (preserves the `phase-N-<short-name>` pattern; do NOT carry the `-a54-fixed` sub-suffix forward into Phase 9, since A54 closure is now historical, not gating).
- Spec amendment to `docs-site/docs/specs/powerfill-engine.md`: Phase 9 row in §Phased Implementation table marks "validation harness DONE; first comparison run DONE; cutover not in scope".
- Assumptions log additions: A69+ for new Phase 9 findings.
- NEW ADR-027: Phase 9 Parallel Validation Harness Design (mandatory if harness lands this session).
- Pre-push docs-build check per Phase 6e/7/8-W1/8-W2 banked discipline: `docker build -f docs-site/Dockerfile.prod docs-site` before push if any new `docs-site/docs/**` files created.
Explicit scope (OUT)
- Multi-customer-DB sweep — operator-driven post-Phase-9 (Phase 9 builds the lever; PO + customer reps pull it for PS608 / future tenants once customer-rep approval lands).
- Production cutover — Phase 10+ (the spec's "cutover" wording in the Phase 9 row is a 2026-03 spec-line assumption that's been superseded by the 8.5 + 9 + 10 split; Phase 9 ships validation only).
- Auth integration — Phase 8.5.
- Superset embedded SDK — Phase 8.5.
- React UI changes for harness output viewing — out of scope; the harness output is a Markdown/HTML artifact that the PO views via the docs site OR shares as a leave-behind.
- New PowerFill API endpoints — Phase 6e + 7 closed those; Phase 9 reads from the existing surface.
- Real-time monitoring dashboard for harness runs — out of scope; Phase 9 produces one-shot reports.
- Performance tuning of PowerFill itself — out of scope unless the harness reveals a per-loan correctness issue rooted in a perf shortcut.
- Re-tagging existing `pfill_run_history` rows to a new tenant_id convention: A68's long-term decoupling lives in Phase 8.5; Phase 9 adopts the existing `'ps-demodata'` convention.
- A54 fix re-litigation: RESOLVED 2026-04-19; if the harness reveals A54 is fired by a customer DB in a way the fix doesn't address, that's an A65 follow-up filed against the customer DB, NOT a re-opening of A54's fix.
Process discipline (canonical, non-negotiable)
Gates that must produce documented output
| Gate | Where to apply | What "documented output" means |
|---|---|---|
| Three-layer Primary-Source Verification Gate (now 3-instance corroborated; canonical-promotion-anticipated) | Spec-vs-implementation: verify Phase 7 endpoint contracts match what the harness's HTTP-client layer assumes. NVO-vs-implementation: verify the Desktop-App-equivalent invocation (Option A or B or C) actually exercises the same proc body PSSaaS does. Implementation-vs-runtime: re-read session-handoff Backlog table during planning; F-W2-CONTRACT-1 + F-W2-BR-3 caught at W2 planning are canonical evidence this catches issues before the PoC. | A Phase 9 plan §2 findings table per layer + explicit Backlog re-read pass log per row. |
| Alternatives-First Gate | At least 3 architectural decisions: (a) Desktop-App invocation path (A / B / C above); (b) harness implementation language (Python / .NET / PowerShell / TypeScript); (c) verdict-rendering format (Markdown / HTML / JSON-with-static-renderer). | A Phase 9 plan §3 alternatives section per decision; ADR-027 for the harness-design choices. |
| Required Delegation Categories | Heavily delegable: per-report diff logic (one delegated subagent per report shape = up to 8 micro-deliverables); the harness invoker scaffold; the verdict-renderer template. Self-implement: the architectural-contract-per-artifact load-bearing parts — Desktop-App invocation pattern + the "is this a real divergence or an A66 expected-empty?" verdict logic + the snapshot-capture mechanism. | A Phase 9 plan §8 delegation inventory with subagent prompts AND Deliberate Non-Delegation justifications per practice #9. |
| Reviewable Chunks at intra-session scope | Consider checkpointing after the harness scaffold lands + first end-to-end comparison run (against a known-good PS_DemoData input) before producing the full per-report diff output. | If checkpointing, send a plan-stage Architect Report after the first end-to-end run. |
| Deploy Verification Gate | Arm (a) sentinel = phase-9-validation-ready. Arm (b) harness invocable from a non-Architect machine (Collaborator-side reproducibility check). Arm (c) end-to-end harness run against PS_DemoData produces a verdict report. | A Phase 9 completion report Markdown citing screenshots / report excerpts + the per-tenant-DB Capability matrix per practice #13. |
| Counterfactual Retro | At session end | A retro section. Phase 8 W2 banked 7 observations including "Backlog re-read pass IS canonical-adoption-ready at 3-instance corroboration"; Phase 9 should report whether the practice continues to pay off (or if 4-instance corroboration justifies pulling the trigger on canonical promotion). |
Antipatterns to avoid (canonical list applies)
- Phase-0 Truth Rot: A57 was qualified by A59; the Backlog re-read pass at planning-start has been the empirical safeguard. Phase 9's harness design WILL surface contract-vs-implementation drift if any exists; embrace it as forcing function for A62 + A67 closure.
- Empirical-Citation Type Mismatch (Phase 5 origin): when reading from the 8 Phase 7 endpoints in the harness, use the actual JSON property names from `ReportContracts.cs` (snake_case via `[JsonPropertyName]`), NOT the C# property names (PascalCase). The OpenAPI / Swagger UI at `http://pssaas.powerseller.local/api/swagger/` is the canonical wire-shape reference; consider auto-generating types from it (or manually mirroring as the W2 React UI does).
- Verification Avoidance (Phase 4 origin): `dotnet build` (if any backend changes) + the harness's own test surface (if any) before declaring complete; the harness's first end-to-end comparison run on PS_DemoData IS the integration test.
- Ghost Deploy (PSX origin): the harness is operator-driven, not deployed-as-a-service, so this antipattern doesn't directly apply BUT the harness's invocation pattern should support content-match verification (e.g. the harness's output report should include the exact PSSaaS sentinel + Desktop-App-proc-version it ran against, so the PO can verify in seconds whether the report is from the right code).
- Delegation Skip (Phase 4 origin): per-report diff logic is the heaviest delegation candidate; architectural-contract-per-artifact decisions (ADR-027, the verdict semantics, the snapshot-capture mechanism) are yours to self-implement.
- Capability Inflation (Phase 8 W1 / Claim-vs-Evidence family): the harness's first comparison run produces a result on PS_DemoData ONLY. Do NOT extend that to "validates PowerFill on customer data"; that's a Capability Inflation framing. The honest claim is "validates PowerFill behavior on PS_DemoData against the legacy proc body when invoked via [chosen Option]". Customer DB validation is post-Phase-9 operator work.
- Capability Drift (Claim-vs-Evidence family): if the W2 React UI's tenant-picker constants drift between sessions (e.g. someone re-introduces PS608), the harness assumptions could quietly become wrong. Phase 9's harness should explicitly verify the tenant-id convention agrees with `pfill_run_history` row tags before invoking PSSaaS, a 1-query pre-flight that surfaces A68-class drift early.
- Subagent Output Defended Beyond Scope (banked but not yet canonical; W2 origin): when reviewing delegated diff-logic subagent output, the explicit first question is "is this what the kickoff asked for?" before any disposition framing. Output additions outside scope are removed unless re-justified against the kickoff.
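The 1-query pre-flight named under Capability Drift above could look like the following sketch. The query text and the function name `tenant_convention_problems` are illustrative assumptions; only the `pfill_run_history` table, its `tenant_id` column, and the `'ps-demodata'` tag come from this document. The check is a pure function over the query result, so it can be tested without a database.

```python
from typing import Iterable, List, Optional, Tuple

# Hypothetical pre-flight query; run it with whichever SQL client the harness adopts.
PREFLIGHT_SQL = (
    "SELECT tenant_id, COUNT(*) AS n "
    "FROM pfill_run_history GROUP BY tenant_id"
)


def tenant_convention_problems(tag_counts: Iterable[Tuple[Optional[str], int]],
                               expected: str) -> List[str]:
    """Return human-readable problems; an empty list means the convention holds.

    tag_counts is the result set of PREFLIGHT_SQL as (tenant_id, row_count) pairs.
    """
    counts = dict(tag_counts)
    problems: List[str] = []
    if expected not in counts:
        problems.append(f"no existing rows tagged {expected!r}; convention unconfirmed")
    # key=repr keeps the sort total even if a NULL tenant_id arrives as None.
    for tag in sorted((t for t in counts if t != expected), key=repr):
        problems.append(f"unexpected tenant_id {tag!r} on {counts[tag]} row(s)")
    return problems
```

A non-empty result would stop the harness before it triggers the PSSaaS run, which surfaces A68-class drift before any new row is written.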
Tooling (verified post-Phase 8 W2)
- WSL Ubuntu with `dotnet 8.0.420`, `jq 1.6`, `gh 2.4.0` (un-authed). Use `wsl.exe -- bash -lc '...'` for shell work.
- Windows-side kubectl at `C:\Program Files\Docker\Docker\resources\bin\kubectl.exe`, kubeconfig at `~/.kube/config` with `PSS-cluster` context (PSX-shared cluster).
- PS_DemoData public-endpoint password in `docker-compose.override.yml`: `M0th3rFuck1ng$$44$$` (Compose interpolates `$$` → `$`, actual `M0th3rFuck1ng$44$`). Pass via env var in single-quoted PowerShell string. Or use `docker exec -e PWORD='...' pssaas-db sh -c '...'` pattern.
- PS_DemoData private-endpoint for AKS connectivity at `hostedps-sql.086ea791c2f1.database.windows.net,1433`, wired into the staging API Deployment via secret `pssaas-secrets:SQLMI_CONNECTION_STRING`.
- EXECUTE on dbo procs is GRANTED to `kevin_pssaas_dev` on PS_DemoData. db_ddladmin also granted (per A30 resolution).
- Pre-push docs-build check pattern (Phase 6e lesson; now 4-instance corroborated): mandatory if any new docs files use URL templates or unfamiliar MDX syntax.
- Node.js / npm: not directly required for Phase 9 (harness is not a React surface) unless Architect chooses TypeScript implementation. Windows-host Node v22.x is available; production Docker base `node:22-alpine` if needed.
- Python 3.10+ in WSL Ubuntu: preferred harness implementation language per Collaborator recommendation; `pip install pyodbc requests pyyaml jinja2` covers the typical needs.
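If the harness ends up using `pyodbc`, the "pass the password via env var" guidance above can be sketched as a connection-string builder. This is an assumption-laden illustration: the function names and the `PS_DEMODATA_PASSWORD` variable name are hypothetical, and the driver name assumes Microsoft's ODBC Driver 18 is installed. The brace-wrapping follows the ODBC connection-string rule for values containing special characters, with any closing brace doubled.

```python
import os
from typing import Mapping


def odbc_value(value: str) -> str:
    """Wrap a value in braces so characters like ';' or '$' survive parsing.

    Per ODBC connection-string syntax, a literal '}' inside a braced value
    is written as '}}'.
    """
    return "{" + value.replace("}", "}}") + "}"


def build_conn_str(server: str, database: str, user: str,
                   env: Mapping[str, str] = os.environ,
                   password_var: str = "PS_DEMODATA_PASSWORD",  # hypothetical name
                   driver: str = "ODBC Driver 18 for SQL Server") -> str:
    """Build an ODBC connection string, reading the password from the environment
    so it never appears in harness config files or command lines."""
    password = env[password_var]  # KeyError here means the variable was not set
    parts = [
        ("DRIVER", "{" + driver + "}"),
        ("SERVER", server),
        ("DATABASE", database),
        ("UID", user),
        ("PWD", odbc_value(password)),
        ("Encrypt", "yes"),
    ]
    return ";".join(f"{key}={val}" for key, val in parts) + ";"
```

The resulting string would be handed to `pyodbc.connect(...)`; the server value is whichever endpoint the invoking environment is wired to.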
Environment state (verified post-Path γ + 9a83b92)
| Surface | State |
|---|---|
| Local API | phase-8-superset-react-ready-a54-fixed ✓ (ps-demodata tenant slot wired to PS_DemoData public endpoint via docker-compose.override.yml) |
| Staging API | phase-8-superset-react-ready-a54-fixed ✓ (default AND ps-demodata tenant slots both wired to PS_DemoData private endpoint via pssaas-secrets:SQLMI_CONNECTION_STRING per Path γ) |
| Phase 7 endpoints (live verified) | All 8 endpoints respond on staging + locally against PS_DemoData; the 12 historical run-history rows surface under X-Tenant-Id: ps-demodata on both routes |
| `pfill_run_history` on PS_DemoData | 12 rows total (3 Complete + 7 Failed + 2 Cancelled), all tagged tenant_id='ps-demodata'; latest Complete is 1ce2b077-af9d-4969-a348-b535ba265bbd (2026-04-19T22:10:11) |
| End-to-end PowerFill run on PS_DemoData | Empirically Complete in ~30s; allocated_count: 515; pool_guide_count: 515; UE step succeeds with 12 forensic events. Hub Dashboard 1 shows the 12-row history. |
| A66 empty-state | Post-Complete, the 11 user-facing pfill_* data tables are 0 rows on PS_DemoData (UE rebuilt-empty per A66 — by design). Phase 9's harness verdict logic must understand this. |
| Superset infrastructure | 36 + 8 = 44 queries + 6 deploy scripts in infra/superset/; Phase 8 W1 dashboards at IDs 13-20. Migration #30 DONE 2026-04-19 — bi.staging.powerseller.com is stable + dashboards intact + auth-gated via Keycloak SSO (HTTP 302 to anonymous probes; HTTP 200 with same dashboard IDs to authenticated users). The harness reads from the PSSaaS API (NOT Superset), so this is informational-only. |
| React frontend | LIVE at https://pssaas.staging.powerseller.com/app/ (currently public; Phase 8.5 will gate behind Keycloak in parallel). Not directly Phase 9-relevant but useful as a click-through verification surface. |
| Backlog re-read pass at planning | 3-instance corroborated; canonical-promotion-anticipated. Use it explicitly in Phase 9 §2 plan. |
| Practice #13 Capability × Environment matrix | Canonical; W2 completion report is the canonical example. Phase 9's matrix adapts to "per customer DB" cells (PS_DemoData verified; PS608 NOT MEASURED HERE; future tenants NOT MEASURED HERE). |
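The run-history baseline in the table above (3 Complete + 7 Failed + 2 Cancelled) is something the harness could assert before comparing anything. A minimal sketch, assuming run-history rows arrive as dicts with a `status` key; that key name and the `check_baseline` helper are illustrative assumptions, not the Phase 7 wire contract. The check uses at-least semantics because each new harness run appends rows.

```python
from collections import Counter

# Status tally on PS_DemoData as recorded in the environment table.
EXPECTED_BASELINE = {"Complete": 3, "Failed": 7, "Cancelled": 2}


def check_baseline(rows, expected=None):
    """Return (ok, observed_counts) for a list of run-history row dicts.

    "status" is a hypothetical field name used for illustration only.
    ok is True when every expected status appears at least as often as
    the baseline says, so later runs do not break the check.
    """
    expected = EXPECTED_BASELINE if expected is None else expected
    observed = Counter(row["status"] for row in rows)
    ok = all(observed[status] >= count for status, count in expected.items())
    return ok, dict(observed)
```

A failing check would suggest the harness is pointed at the wrong tenant slot or database rather than at a PowerFill regression, which is worth ruling out first.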
Companion references
| Doc | Purpose |
|---|---|
| docs-site/docs/specs/powerfill-engine.md §Phased Implementation + §Algorithm + §Data Contracts + §Run APIs | Authoritative scope + algorithm semantics + endpoint contracts |
| docs-site/docs/handoffs/powerfill-phase-8-w2-completion.md | W2 completion report; precedent for Phase 9 completion-report shape (especially §Capability × Environment matrix) |
| docs-site/docs/handoffs/powerfill-a54-fix-greg-demo-readiness.md | The PO-facing demo narrative; Phase 9's harness output slots in as the "loan-by-loan parity proof" addition |
| docs-site/docs/handoffs/powerfill-phase-7-completion.md | Phase 7 endpoint contracts (the 8 GET endpoints the harness reads) + ADR-025 |
| docs-site/docs/adr/adr-021-powerfill-port-strategy.md | Verbatim-port discipline + §Narrow Bug-Fix Carve-Out (the A54 fix is the canonical first instance; future Phase-9-surfaced legacy bugs follow the same pattern) |
| docs-site/docs/legacy/powerfill-deep-dive.md | Legacy plugin reverse-engineering; the Desktop App side context the harness invocation must respect |
| src/backend/PowerSeller.SaaS.Modules.PowerFill/Sql/008_CreateAllocationProcedure.sql + 009_CreatePoolGuideProcedure.sql + 011_CreatePowerFillUeProcedure.sql | The exact proc bodies PSSaaS deploys; the Desktop-App-equivalent invocation (Option A) reads from the SAME bodies on PS_DemoData (present there since the PSSaaS deployment that installed them) |
| src/backend/PowerSeller.SaaS.Modules.PowerFill/Contracts/ReportContracts.cs + RunContracts.cs | Phase 7 + 6e wire-shape source of truth (snake_case JSON properties; harness mirrors these) |
| src/backend/PowerSeller.SaaS.Modules.PowerFill/Endpoints/RunEndpoints.cs | The 12 endpoint contracts the harness invokes (4 from 6e + 8 from 7) |
| infra/azure/k8s/pssaas-staging/services.yaml | Kubernetes deployment manifest; Path γ-amended with Tenants__ps-demodata__ConnectionString; reference for the convention Phase 9's harness must respect |
| docker-compose.override.yml.example + docker-compose.override.yml (gitignored) | Local-dev tenant-slot wiring; Phase 9's harness may run locally against the same PS_DemoData via this path |
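Since the harness is meant to mirror the snake_case wire shape, one way to do that in Python is a dataclass whose field names equal the JSON keys. This is a sketch under assumptions: `allocated_count` and `pool_guide_count` appear in the environment-state table, while `run_id`, `status`, and their grouping into one response object are hypothetical; the real shapes live in ReportContracts.cs and RunContracts.cs.

```python
import json
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class RunSummary:
    # Field names match the snake_case JSON keys one-to-one, so no
    # renaming layer is needed between wire shape and harness model.
    run_id: str            # hypothetical key name
    status: str            # hypothetical key name
    allocated_count: int
    pool_guide_count: int


def parse_run_summary(payload: str) -> RunSummary:
    """Parse a JSON object into RunSummary.

    Unknown keys are ignored; a missing key raises KeyError so that
    contract drift is noticed instead of silently defaulted.
    """
    data = json.loads(payload)
    known = {f.name: data[f.name] for f in fields(RunSummary)}
    return RunSummary(**known)
```

Raising on a missing key is a deliberate choice for a validation harness: a silently defaulted field could mask exactly the kind of divergence the harness exists to find.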
Deliverables
When Phase 9 is complete, the Collaborator and PO should be able to verify each without trusting your word:
- Code commits: atomic, logically grouped. DO NOT push: the PO pushes; you git add and git commit only.
- tools/parallel-validation/ (or similar; Architect's directory choice documented in ADR-027) containing the harness implementation.
- Harness configuration file: declarative input snapshot + tolerance bands + comparison thresholds.
- First end-to-end harness run output: Markdown or HTML report at docs-site/docs/devlog/2026-04-XX-powerfill-phase-9-first-validation-run.md (or similar location) showing PSSaaS-vs-Desktop-App-equivalent on PS_DemoData with per-loan verdict + per-run summary.
- Demo-asset slot-in: the report's top-level summary line + selected per-loan-divergence detail formatted to slot into powerfill-a54-fix-greg-demo-readiness.md as the "loan-by-loan parity proof" addition. Could be an amendment to that doc OR a sibling doc the PO weaves in.
- Sentinel bump to phase-9-validation-ready.
- NEW ADR-027 (Phase 9 Parallel Validation Harness Design) documenting the Alternatives-First Gate decisions (invocation path / language / report format).
- Spec amendment marking Phase 9 (validation harness scope) DONE; cutover scope deferred to Phase 10+.
- Assumption log A69+ for new Phase 9 findings.
- A62 closure (if Phase 9 takes the recommended pfillv2_* rename path; mark Backlog #24 done).
- A67 closure (XML doc-comment fix; mark in assumptions log).
- docs-site/docs/handoffs/powerfill-phase-9-completion.md: W2 completion-report format + Capability × Environment matrix per practice #13.
- Devlog entry at docs-site/docs/devlog/2026-04-XX-powerfill-phase-9.md.
- Pre-push docs-build check if any new docs-site/docs/** files.
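The harness configuration deliverable (input snapshot + tolerance bands + comparison thresholds) could be modelled roughly as below. Every name and default here is a hypothetical placeholder; the real schema is an ADR-027 decision. The sketch takes a plain dict so it works whether the file on disk is YAML or JSON.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class HarnessConfig:
    tenant_id: str                                # e.g. "ps-demodata"
    snapshot_label: str                           # names the input snapshot under comparison
    amount_tolerance: Decimal = Decimal("0.01")   # placeholder per-field rounding band
    max_divergent_loans: int = 0                  # placeholder run-level threshold

    def __post_init__(self):
        # Reject nonsensical bands early instead of at comparison time.
        if self.amount_tolerance < 0:
            raise ValueError("amount_tolerance must be non-negative")
        if self.max_divergent_loans < 0:
            raise ValueError("max_divergent_loans must be non-negative")


def load_config(raw: dict) -> HarnessConfig:
    """Build a HarnessConfig from an already-parsed mapping."""
    return HarnessConfig(
        tenant_id=raw["tenant_id"],
        snapshot_label=raw["snapshot_label"],
        # Go through str() so float inputs do not carry binary noise.
        amount_tolerance=Decimal(str(raw.get("amount_tolerance", "0.01"))),
        max_divergent_loans=int(raw.get("max_divergent_loans", 0)),
    )
```

Keeping the tolerance as a `Decimal` avoids float drift when the same band is later applied to monetary fields.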
Reporting protocol
Standard Architect Report format when you're done — what was produced / decisions / assumptions / open questions / recommended next steps / process notes.
If the harness reveals a real PSSaaS-vs-Desktop-App divergence beyond rounding tolerance — STOP and surface as A69+. The disposition is PO's call (could be a PSSaaS bug to fix; could be a legacy-bug carve-out per ADR-021's pattern; could be a Tom-or-Greg consultation).
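The stop condition above depends on what "beyond rounding tolerance" means in practice. A minimal sketch of a per-loan check, assuming each side has been reduced to a mapping of loan id to one numeric value; the mapping shape, the verdict labels, and the 0.01 default are all illustrative assumptions rather than the spec's definition.

```python
from decimal import Decimal


def compare_loans(pssaas, legacy, tolerance=Decimal("0.01")):
    """Return {loan_id: verdict} across the union of both sides.

    A loan present on only one side is reported as missing rather
    than Divergent, so the report separates the two failure modes.
    """
    verdicts = {}
    for loan_id in sorted(set(pssaas) | set(legacy)):
        if loan_id not in pssaas:
            verdicts[loan_id] = "MissingInPSSaaS"
        elif loan_id not in legacy:
            verdicts[loan_id] = "MissingInLegacy"
        else:
            delta = abs(Decimal(str(pssaas[loan_id])) - Decimal(str(legacy[loan_id])))
            verdicts[loan_id] = "Match" if delta <= tolerance else "Divergent"
    return verdicts
```

Any verdict other than "Match" in such a result would be the trigger to stop and raise an A69+ entry rather than adjusting the tolerance until it passes.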
If the chosen invocation path (Option A / B / C) encounters an unanticipated constraint — STOP and surface, don't paper over.
If a multi-day session reaches a natural pause point with partial Phase 9 completion (e.g. harness scaffold + Option A invocation works but verdict logic incomplete), that's fine — write a handoff so the next Architect session resumes cleanly.
The PO milestone for this phase: "I have empirical loan-by-loan evidence that PSSaaS PowerFill matches the legacy Desktop App on PS_DemoData." Achievable when the harness's first comparison run completes with a verdict report. Subsequent customer-DB validation runs are operator-driven post-Phase-9.
What success looks like
- Harness scaffold exists and is invokable from a non-Architect machine (Collaborator-side reproducibility check passes)
- First end-to-end comparison run on PS_DemoData produces a verdict report
- Verdict report's top-level summary slot-fits into the existing PO Greg-demo narrative
- A66 expected-empty case is correctly classified as Match (NOT Divergent) — empirically demonstrates the verdict logic is A66-aware
- Sentinel reflects phase-9-validation-ready
- ADR-027 documents the harness-design choices
- A62 + A67 closure landed (or explicit defer with rationale)
- Capability × Environment matrix in completion report explicitly distinguishes "PS_DemoData verified; future customer DBs NOT MEASURED HERE pending operator-driven runs"
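The A66 criterion in the list above can be sketched as a run-level classifier that treats empty output as a Match only when emptiness is expected for the dataset. The function name, the `a66_expected_empty` flag, and the "NeedsReview" / "CompareRows" labels are hypothetical; how the harness actually detects an A66 dataset is left to the design.

```python
def classify_run(pssaas_rows, legacy_rows, a66_expected_empty):
    """Classify one run-level comparison from output row counts.

    pssaas_rows / legacy_rows: row counts of the compared output.
    a66_expected_empty: True when the dataset is known to rebuild
    empty by design (the A66 case, e.g. PS_DemoData).
    """
    if pssaas_rows == 0 and legacy_rows == 0:
        # Both empty: a Match only if emptiness is the expected state.
        return "Match" if a66_expected_empty else "NeedsReview"
    if pssaas_rows == 0 or legacy_rows == 0:
        # One side produced rows and the other did not.
        return "Divergent"
    # Both sides have rows: defer to per-loan comparison.
    return "CompareRows"
```

For example, `classify_run(0, 0, True)` yields "Match", while the same counts without the flag are routed to review instead of being silently passed.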
Begin when ready. Local environment + staging environment are both fully wired; PS_DemoData has 12 historical runs (3 Complete + 7 Failed + 2 Cancelled); the 12 PowerFill endpoints (4 run-mgmt + 8 reports) are live; the operator React UI is live at https://pssaas.staging.powerseller.com/app/ for click-through verification; A54 is RESOLVED so end-to-end Complete-runs are reproducible.
Reminder: Opus 4.7 High Thinking. Verify model in picker before sending your first response. Do NOT push.