Skip to main content

PowerFill Phase 9 — Parallel Validation Harness

Date: 2026-04-20 Agent: PSSaaS Systems Architect (Claude Opus 4.7 High Thinking) Scope: Ship the Phase 9 Parallel Validation Harness at tools/parallel-validation/ — Python 3.10+ in WSL Ubuntu, invoking local pssaas-api for the PSSaaS side + direct sqlcmd to PS_DemoData public endpoint for the legacy-equivalent side. Frame D Hybrid framing (orchestration parity on PS_DemoData; legacy-vs-fixed-body parity NOT MEASURABLE HERE per ADR-021's forward-only carve-out). Ships ADR-028 (renumbered from initial ADR-027 — a parallel Collaborator-authored ADR-027 for Superset Embedding Strategy landed in commit ece500e during the Phase 9 dispatch window) documenting all 4 architectural decisions + Frame D framing; A67 closure (XML doc-comment alignment in ReportContracts.cs); sentinel bumped from phase-8-superset-react-ready-a54-fixed to phase-9-validation-ready (drops the -a54-fixed sub-suffix per kickoff §"Cross-cutting"). First end-to-end comparison run surfaced A69 — state-dependent UE failure on non-empty post-pool_guide state on PS_DemoData; the harness's verdict logic was hardened post-first-run to honor the asymmetric-failure case via a new RowVerdict.INCOMPARABLE classification (Capability Inflation countermeasure). A70 banked documenting the Frame D framing refinement (PS_DemoData has mixed PSSaaS-deployed + legacy-encrypted proc-body state).

Why

Per the Phase 9 kickoff at powerfill-phase-9-kickoff: the PSSaaS migration thesis is "chunk-by-chunk extraction from the Desktop App, with each chunk verifiably matching legacy behavior loan-by-loan against real customer data." Phase 9 ships the empirical proof of that thesis for chunk #1 (PowerFill). Without a parallel-validation surface, the Greg-demo claim is "look at our cool new UI"; with it, the claim is "here is empirical evidence the chunk-extraction model works."

Phase 9 ships the lever (the harness as a reproducible tool); the pull (multi-customer-DB sweep) is operator-driven post-Phase-9 work the kickoff explicitly defers to operator + customer-rep workflow.

What Was Done

Code (zero backend changes beyond the 1-line sentinel bump + the A67 XML doc-comment fix)

New harness tree at tools/parallel-validation/ (~10 source files, ~1,500 LOC):

  • harness.py — main entry point + CLI + .env loader + sentinel-aware logging; sequential PSSaaS-then-sqlcmd run order (PSSaaS materializes HTTP reports into in-memory dicts BEFORE sqlcmd writes overwrite the pfill_* tables — no race condition).
  • snapshot.py — input snapshot capture: reads pfill_run_history for the latest Complete run's options_json as the reference options; computes SHA-256 state hash over per-table row counts; UTC reference timestamp for start_date_default derivation.
  • pssaas_invoker.pyPOST /run with X-Tenant-Id: ps-demodata header (per A68 convention); polls GET /runs/{id} until terminal; reads all 8 Phase 7 reports via the ACTUAL routes (per A67 closure, no /reports/ segment); chases pagination via natural-key cursor shapes from ReportContracts.cs.
  • sqlcmd_invoker.py — pyodbc EXECs the 5-step proc sequence (bx_cash_grids if floor set + bx_settle_and_price + conset + pool_guide + ue) in a separate connection (scratch-table isolation via separate session; ##cte_* are session-scoped); reads same 8 sources via direct SELECT against pfill_*. Skips the C#-side PowerFillCandidateBuilder step because conset rebuilds pfill_loan2trade_candy_level_01 itself per 008_CreateAllocationProcedure.sql lines 1300-1301 (verified at scaffold time).
  • diff_engine.py — umbrella + per-report row comparators + A66-aware verdict logic + post-first-run RowVerdict.INCOMPARABLE for asymmetric-failure case + 7-case self-test (Cases 1-6 from initial scaffold + Case 7 added post-first-run as A69-honesty regression).
  • report_renderer.py — Jinja2 → Markdown rendering; pssaas_ui_base_url config indirection so report URLs reference staging React UI even when harness binary runs locally; staging-URL composition logic in Python (testable), template focuses on structure.
  • templates/comparison_report.md.j2 — the load-bearing template containing the Frame D Capability × Environment matrix verbiage; the per-section breakdown; the §"Phase 9 finding A69" surfacing block that fires when verdict.incomparable_count > 0; the per-report Incomparable explanation.
  • harness_config.yaml — declarative config: PSSaaS API base URL + sqlcmd connection params + tolerance bands + pssaas_ui_base_url indirection per Backlog #30.
  • requirements.txt — pinned deps (pyodbc 5.1.0, requests 2.32.3, PyYAML 6.0.2, Jinja2 3.1.4).
  • install_venv.sh — one-shot WSL Ubuntu venv setup script.
  • README.md — Frame D framing + reproducibility instructions + A66 context + .env pattern.
  • .env (gitignored) — PFILL_SQL_PASSWORD (sidesteps shell-quoting hell with $ in passwords).
  • check_db.py — pre-flight pyodbc + pfill_run_history connectivity check.
  • diagnose_a69.py + diagnose_a69_v2.py + diagnose_a69_v3.py — A69 root-cause diagnostic scripts; banked alongside the harness for any follow-up session focused on A69 archaeology.

Backend modifications:

  • src/backend/PowerSeller.SaaS.Modules.PowerFill/Contracts/ReportContracts.cs — A67 closure: 9 XML doc-comment paths re-aligned from /api/powerfill/runs/{run_id}/reports/<name> to /api/powerfill/runs/{run_id}/<name> matching actual RunEndpoints.cs route registrations + a banking comment block explaining the Truth Rot pattern.
  • src/backend/PowerSeller.SaaS.Modules.PowerFill/PowerFillModule.cs — sentinel bumped from phase-8-superset-react-ready-a54-fixed to phase-9-validation-ready.

New documentation

Existing documentation amended

Files Produced / Modified

(See Phase 9 completion report §"What was produced" for the canonical exhaustive list.)

Key Decisions

  • D-9-0 (Frame D Hybrid) — orchestration parity on PS_DemoData; Capability × Environment matrix carries explicit "NOT MEASURABLE HERE" cells for legacy-vs-fixed-body parity + Customer DB column. PO-confirmed pre-plan via Andon-cord exchange.
  • D-9-1 (Option A — Direct sqlcmd) — lowest friction; matches ADR-021 verbatim-port discipline; the procs ARE the canonical contract per A1's Banking note.
  • D-9-2 (Python 3.10+ in WSL) — kickoff §"Tooling" recommendation; ergonomic stack for the harness's 4 concerns (HTTP + ODBC + YAML + Markdown rendering).
  • D-9-3 (Markdown output) — slots into the existing Greg-demo narrative format; renders in Docusaurus.
  • D-9-4 (Option L Local-only runtime) — smallest testable increment; demo asset's content references staging via config indirection so harness binary's location doesn't constrain demo surface.
  • D-9-5 (A69 honesty post-first-run) — added RowVerdict.INCOMPARABLE
    • suppressed A66 Match classification when EITHER side Failed (Capability Inflation countermeasure).
  • D-9-6 (harness_plus_a67 carry-over scope) — A67 closes; A62 defers; canonical-promotion for Backlog re-read pass banked for Collaborator nomination.

All 6 decisions documented in ADR-028

  • surfaced via Andon-cord pre-plan + during-implementation checkpoints.

What's Next

Per Phase 9 completion report §"Recommended next steps":

  1. Pre-push docs-build check + Collaborator review + PO sign-off + PO push.
  2. Post-push staging verification (sentinel + first-run-report Docusaurus rendering).
  3. Greg-demo dry-run with the harness's first-run report as the load-bearing slot-in addition to the existing "Bug as Feature" narrative.
  4. Choice between (a) A69 root-cause investigation + customer-DB sweep planning, or (b) Phase 8.5 (ecosystem auth + Superset embedding) per the parallel-track plan.

Risks Captured

  • A69 — state-dependent UE failure on non-empty post-pool_guide state on PS_DemoData; banked for Greg/Tom consultation.
  • A70 — PS_DemoData has mixed PSSaaS-deployed + legacy-encrypted proc-body state; production-cutover playbook (Phase 10+) needs explicit A70 disposition.
  • A62 — PS_DemoData view drift (pfillv_existng_pool_disposition 14-col legacy version vs PSSaaS's 15-col definition); deferred to a new Backlog row per the Phase 9 carry-over scope.
  • Backlog re-read pass canonical-promotion — banked for Collaborator nomination; 4-instance corroboration (3 traditional + 1 "0 findings = pattern works") evidence captured in the completion report's Counterfactual Retro.