ADR-028: Phase 9 Parallel Validation Harness Design
Status
Proposed (2026-04-20) — drafted as part of the Phase 9 dispatch per the
kickoff at powerfill-phase-9-kickoff.
Records three architectural decisions surfaced via Alternatives-First Gate
during Phase 9 planning + a fourth framing decision (Frame D Hybrid)
surfaced via Andon-cord pre-plan exchange between the Architect and the
PO.
Note on ADR number assignment (renumbered 2026-04-20 in commit batch):
the Phase 9 kickoff initially mentioned ADR-027 as the Phase 9 harness
ADR slot, but during the Phase 9 Architect dispatch window the
Collaborator independently authored
ADR-027 (Superset Embedding Strategy)
in commit ece500e (banking the PSX Collaborator's reply to the
embedding-pattern relay; landed before this Phase 9 ADR was committed).
Per the canonical "ADRs are numbered sequentially and never renumbered"
rule, the first-committed ADR-027 (Superset Embedding) keeps its
number; this Phase 9 harness design ADR took ADR-028 instead. The
Phase 9 kickoff document's reference to "NEW ADR-027" is now historical;
the actual Phase 9 ADR is ADR-028. The Phase 8.5 PSSaaS Auth Strategy
ADR (originally projected as ADR-028) folds into the Collaborator's
ADR-027 per the framing decision documented there.
Context
The PSSaaS migration thesis is "chunk-by-chunk extraction from the Desktop App, with each chunk verifiably matching legacy behavior loan-by-loan against real customer data." Phase 9 ships the empirical proof of that thesis for chunk #1 (PowerFill). Without a parallel-validation surface, the Greg-demo claim is "look at our cool new UI"; with it, the claim is "here is empirical evidence the chunk-extraction model works."
Phase 9's harness ships:
- A reproducible tool that runs PSSaaS PowerFill alongside a Desktop-App-equivalent invocation against the same tenant DB and produces a per-loan-allocation diff report with a verdict per loan (Match / TolerableDiff / Divergent / Incomparable) and a per-run summary stat.
- A first comparison run against PS_DemoData with a documented input snapshot, producing a comparison report demonstrating the harness works.
- A demo asset: the harness's first-run output report formatted to slot into the existing PO-written Greg-demo "Bug as Feature" narrative as the loan-by-loan parity-proof addition.
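A minimal sketch of how the four per-loan verdicts and the per-run summary might be computed. The field names (`trade_id`, `price`), the 0.005 tolerance, and the rule that a loan with no comparable row on one side is Incomparable are all illustrative assumptions, not the harness's actual diff rules.

```python
from collections import Counter
from enum import Enum
from typing import Optional


class Verdict(Enum):
    MATCH = "Match"
    TOLERABLE_DIFF = "TolerableDiff"
    DIVERGENT = "Divergent"
    INCOMPARABLE = "Incomparable"


def classify_loan(pssaas: Optional[dict], legacy: Optional[dict],
                  tolerance: float = 0.005) -> Verdict:
    """Compare one loan's allocation row from each invocation path."""
    # ASSUMPTION: None means that side produced no comparable row for the loan.
    if pssaas is None or legacy is None:
        return Verdict.INCOMPARABLE
    # ASSUMPTION: a different trade assignment is never within tolerance.
    if pssaas["trade_id"] != legacy["trade_id"]:
        return Verdict.DIVERGENT
    delta = abs(pssaas["price"] - legacy["price"])
    if delta == 0:
        return Verdict.MATCH
    return Verdict.TOLERABLE_DIFF if delta <= tolerance else Verdict.DIVERGENT


def summarize(pssaas_rows: dict, legacy_rows: dict) -> Counter:
    """Per-run summary: count of each verdict across the union of loan IDs."""
    loan_ids = sorted(set(pssaas_rows) | set(legacy_rows))
    return Counter(
        classify_loan(pssaas_rows.get(i), legacy_rows.get(i)).value
        for i in loan_ids
    )
```

Keying both sides by loan ID and iterating over the union keeps a loan that appears on only one side visible in the summary instead of silently dropping it.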
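Three decisions about how the harness is built (D-9-1 through D-9-3) drive the architecture; a framing decision (D-9-0) and a runtime-location decision (D-9-4) bound what the first run claims and where the harness runs.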
Decision
Framing Decision (D-9-0): Frame D Hybrid (PO-confirmed pre-plan)
Adopted: Frame D Hybrid.
Per the Andon-cord pre-plan exchange between the Architect and the PO
(captured in the Phase 9 plan §"Framing
locked"), Phase 9's first comparison run on PS_DemoData proves
orchestration parity (PSSaaS's C# orchestration of the SQL procs
produces row-equivalent outputs to direct sqlcmd EXEC of the same procs
against the same DB). It does NOT prove legacy-vs-fixed-body parity
because the ADR-021 §Narrow Bug-Fix Carve-Out
is forward-only — the A54-fixed psp_powerfill_pool_guide body is
deployed to PS_DemoData; both invocation paths execute the same fixed
body; the legacy unmodified body deterministically Fails on PS_DemoData
(the canonical "Bug as Feature" demo signal).
The Capability × Environment matrix in every harness output report explicitly carries cells like "Legacy unmodified proc body vs PSSaaS-fixed proc body parity: NOT MEASURABLE HERE — pending customer-rep approval" (Capability Inflation countermeasure per canonical practice #13).
Alternatives considered + rejected:
- Frame A (orchestration parity, sole framing) — rejected as sole
framing because packaging it as "loan-by-loan parity vs Desktop App"
in a Greg-demo slide titled "loan-by-loan parity proof against real
customer data" would be a Capability Inflation instance (the canonical
example named in
process-discipline.md).
- Frame B (defer to customer-DB run) — rejected because Phase 9's kickoff explicitly defers multi-customer-DB sweeps to operator-driven post-Phase-9 work; choosing B re-scopes Phase 9 to the wrong phase.
- Frame C (snapshot pre/post-fix on PS_DemoData) — rejected because
it re-tells the A54 fix story already told in
powerfill-a54-fix-greg-demo-readiness.md, conflating "the fix works" with "PSSaaS produces the right answer per loan".
Decision 1 (D-9-1): Desktop App equivalent invocation path
Chosen: Option A — Direct sqlcmd EXEC of psp_pfill_bx_settle_and_price,
psp_powerfill_conset, psp_powerfill_pool_guide, and psp_powerfillUE
(plus optionally psp_pfill_bx_cash_grids if bx_price_floor is set
per A12) against the target tenant DB via pyodbc. Skips the
PowerBuilder front-end entirely; matches the legacy ADR-021 verbatim-port
discipline (the procs ARE the canonical contract per A1's Banking note).
Alternatives:
- Option B — PowerBuilder headless invocation — rejected. Higher friction; uncertain headless support; no available test surface; re-introduces a PB dependency the modular-monolith ADR-004 + ADR-021 explicitly avoid.
- Option C — Snapshot-then-compare — rejected for first-run instance. Trades headless-invocation difficulty for human-in-the-loop overhead. Documented as a future extension in the §"Future considerations" section below.
Note on parameter mapping fidelity: the harness's
SqlcmdInvoker._resolve_six_params mirrors PowerFillRunService.cs
lines 389-394 + 549-554 byte-for-byte, including the cl/co scope
mapping and the pc/po price-mode mapping per A40 + F-6d-5. UE takes
the same 6 parameters as conset.
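A minimal sketch of the direct-EXEC plan described in Option A. The proc names come from this ADR; the position of the optional psp_pfill_bx_cash_grids step in the sequence and the helper names `plan_procs` and `exec_statement` are assumptions made for illustration, and the six parameter values are deliberately left opaque.

```python
from typing import List, Optional, Sequence

CORE_PROCS = (
    "psp_pfill_bx_settle_and_price",
    "psp_powerfill_conset",
    "psp_powerfill_pool_guide",
    "psp_powerfillUE",
)
OPTIONAL_CASH_GRIDS = "psp_pfill_bx_cash_grids"


def plan_procs(bx_price_floor: Optional[float]) -> List[str]:
    """Return the ordered proc names for one legacy-equivalent run."""
    procs = list(CORE_PROCS)
    if bx_price_floor is not None:
        # ASSUMPTION: cash grids run right after settle-and-price; this ADR
        # only says the step is optional, not where it sits in the sequence.
        procs.insert(1, OPTIONAL_CASH_GRIDS)
    return procs


def exec_statement(proc: str, params: Sequence[object]) -> str:
    """Build a T-SQL EXEC with one '?' marker per positional parameter."""
    placeholders = ", ".join("?" for _ in params)
    return f"EXEC {proc} {placeholders}".rstrip()
```

With pyodbc, each statement would then run as `cursor.execute(exec_statement(proc, params), *params)`, so parameter values travel as bound parameters and are never interpolated into the SQL text.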
Decision 2 (D-9-2): Harness implementation language
Chosen: Python 3.10+ in WSL Ubuntu with the pyodbc + requests +
PyYAML + Jinja2 stack. Kickoff §"Tooling" line 180 explicitly
recommends this combination; Architect-side fluency is high; the dev-
environment install path requires msodbcsql18 + unixodbc-dev +
python3-pip from the Microsoft apt repo, captured in the harness's
README.md §"Prerequisites".
Alternatives:
- .NET console app — rejected. The PSSaaS API contract types could
be reused via project reference, but the diff-rendering ergonomics
are weaker; would create a fourth deploy artifact (after
api, docs, frontend).
- PowerShell — rejected. Cross-platform constraints (the kickoff anticipates the harness must work in WSL Ubuntu); JSON manipulation ergonomics weaker than Python.
- TypeScript — rejected. Would re-use the React UI's existing wire-shape types but would re-introduce a Node-runtime dependency the Option L runtime constraint (see D-9-4 below) explicitly avoids.
Decision 3 (D-9-3): Verdict-rendering output format
Chosen: Markdown rendered via Jinja2 from
tools/parallel-validation/templates/comparison_report.md.j2. The
output artifact lives at docs-site/docs/devlog/2026-04-20-powerfill-phase-9-first-validation-run.md (Docusaurus-rendered; PO can paste
sections into the Greg-demo deck or screenshot for slides). Per kickoff
§"Demo asset" the artifact must "slot into the existing PO-written
powerfill-a54-fix-greg-demo-readiness.md
'Bug as Feature' demo narrative" — Markdown is the same format that
doc uses.
Alternatives:
- HTML — rejected for first instance. Heavier toolchain; less natural for the docs-site source-of-truth integration.
- JSON-with-static-renderer — rejected. Premature for a 1-shot first-run output; can be added as a sibling Jinja2 template if a future phase needs programmatic consumption.
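To illustrate the shape of the Markdown output, the sketch below renders per-loan verdicts as a pipe table. Plain Python string formatting stands in for the Jinja2 template here so the example runs without third-party packages; the column set (Loan / Verdict / Note) and the function name are assumptions, not the contents of comparison_report.md.j2.

```python
from typing import Iterable, Tuple


def render_verdict_table(rows: Iterable[Tuple[str, str, str]]) -> str:
    """Render (loan_id, verdict, note) tuples as a Markdown pipe table."""
    lines = ["| Loan | Verdict | Note |", "|---|---|---|"]
    for loan_id, verdict, note in rows:
        # Escape pipes so free-text notes cannot break the table layout.
        safe_note = note.replace("|", "\\|")
        lines.append(f"| {loan_id} | {verdict} | {safe_note} |")
    return "\n".join(lines) + "\n"
```

In a Jinja2 template the same loop would be a `{% for %}` block over the rows; the escaping concern is identical in either renderer.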
Decision 4 (D-9-4): Runtime location for the harness binary (Option L)
Chosen: Option L — Local-only. The harness runs in WSL Ubuntu,
invokes the local pssaas-api container (http://pssaas.powerseller.local/api/powerfill)
for the PSSaaS side and direct sqlcmd to the PS_DemoData public endpoint
for the legacy-equivalent side. OIDC sidecar planned-for, NOT built
this session.
Alternatives:
- Option S — Local harness against staging API + private endpoint — rejected. Requires the Architect's WSL to reach the SQL MI private endpoint (which is reachable from AKS via VNet peering, not from a dev workstation). Mixing API-against-staging + sqlcmd-against-public-endpoint creates a configuration discrepancy the harness has to encode. Architect-vs-Infra ownership ambiguity per architect-context.md "Infrastructure operations escalate to the Collaborator/PSX Infra Agent".
- Option K — Containerized K8s Job — rejected. +1 sub-session of plumbing (image build, GHCR push, K8s manifest, GHA path-filter, RBAC) before any harness-as-tool work happens. Phase 9's PO milestone is "loan-by-loan parity evidence on PS_DemoData" — building a Job-deployable harness produces zero additional evidence over Option L for the first comparison run; it just changes WHO can run it WHERE. Kept as a future-extension if post-Phase-9 customer-DB runs reveal local-only doesn't generalize.
Demo-vs-runtime dynamic (PO-clarified during Q1 follow-up): the
Greg demo lives on staging React UI; the harness output artifact's
clickable run-status URLs point at staging via the
harness_config.yaml :: report.pssaas_ui_base_url indirection (default
https://pssaas.staging.powerseller.com). The harness binary runs
locally; its output references staging surfaces. PSSaaS's API path is
identical on local + staging (same proc body via SQL MI), so the parity
claim is environment-independent.
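The config indirection can be sketched as follows, assuming `config` is the dictionary obtained by parsing harness_config.yaml. The `report.pssaas_ui_base_url` key and its default come from this ADR; the `/powerfill/runs/<run_id>` route is a hypothetical placeholder for whatever path the React UI actually serves.

```python
from urllib.parse import quote

DEFAULT_UI_BASE_URL = "https://pssaas.staging.powerseller.com"


def run_status_url(config: dict, run_id: str) -> str:
    """Build the clickable run-status link embedded in the rendered report."""
    base = config.get("report", {}).get("pssaas_ui_base_url", DEFAULT_UI_BASE_URL)
    # ASSUMPTION: '/powerfill/runs/<run_id>' is an illustrative UI route.
    return f"{base.rstrip('/')}/powerfill/runs/{quote(run_id, safe='')}"
```

Because every link is derived from one config value, a change to the staging host affects only future renders; reports already written keep the URLs they were rendered with.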
Consequences
Positive
- Zero new platform infrastructure. The harness reuses the existing local-dev pssaas-api + the PS_DemoData public endpoint. No new GHCR images, no new K8s manifests, no new GHA workflows, no new secrets.
- Honest about what's measured. Frame D Hybrid + the Capability × Environment matrix per practice #13 prevent Capability Inflation in the demo-asset framing. The harness's own self-test exercises the load-bearing semantics (A66-aware Match; Asymmetric-failure Incomparable) at unit level.
- Surfaces real findings. The first comparison run surfaced A69 (state-dependent UE failure on non-empty post-pool_guide state on PS_DemoData) — exactly the class of finding the Phase 9 harness was built to surface. The harness earned its Phase 9 charter on its very first run.
- Reproducible. tools/parallel-validation/README.md documents the install + invocation pattern; a non-Architect machine can reproduce by docker compose --profile dev up + python harness.py.
Negative
- Sqlcmd-direct path skips the C#-side PowerFillCandidateBuilder pre-step. That step writes diagnostic counters into PSSaaS's RunSummary (constraint_count / candidate_count / etc.) but does NOT populate pfill_loan2trade_candy_level_01 directly; psp_powerfill_conset rebuilds that table itself per 008_CreateAllocationProcedure.sql lines 1300-1301. So the proc-body output is symmetric across both invocation paths; only the C#-side counter set is asymmetric. The harness's verdict logic doesn't rely on those counters.
- OIDC integration is not yet wired. When Phase 8.5 lands per Backlog #31, the harness's HTTP client signatures will need a bearer-token argument. The function shapes already accept an optional bearer_token-style parameter for forward compatibility; wiring is a future commit.
- PSX Infra ownership of msodbcsql18 + ODBC Driver 18 install on any non-Architect machine. The harness's README.md §Prerequisites documents the apt sequence, but per architect-context.md infrastructure operations escalate to the Collaborator/PSX Infra Agent. Banking observation: F-W2-TOOLING-1 was Node-on-WSL; this ADR's tooling-prereqs is the analogous case for pyodbc.
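The forward-compatible bearer-token shape mentioned above might look like the sketch below. The `X-Tenant-Id: ps-demodata` header is taken from this ADR; the function name, the `Accept` header, and the exact parameter name are assumptions.

```python
from typing import Dict, Optional


def build_headers(tenant_id: str = "ps-demodata",
                  bearer_token: Optional[str] = None) -> Dict[str, str]:
    """Headers for PSSaaS API calls; the token stays optional until OIDC lands."""
    headers = {"X-Tenant-Id": tenant_id, "Accept": "application/json"}
    if bearer_token:
        # Only sent once a token is supplied, so today's calls are unchanged.
        headers["Authorization"] = f"Bearer {bearer_token}"
    return headers
```

A call such as `requests.get(url, headers=build_headers())` works unchanged today, and passing `bearer_token=...` later requires no signature change at the call sites.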
Risks and Mitigations
| Risk | Mitigation |
|---|---|
| Harness's first comparison run reveals real PSSaaS-vs-sqlcmd-direct divergence beyond rounding tolerance | STOP and surface as A69+ per kickoff §"Reporting protocol". Already exercised on Phase 9's first run (A69 banked). |
| Harness's A66-aware verdict logic mis-classifies a buggy-empty case as Match | Diff-engine self-test Cases 1, 6, 7 cover the load-bearing semantics: A66 happy-path, Failed-PSSaaS, Asymmetric-Failed. Verdict logic adjusted post-first-run to suppress A66 when EITHER side Failed. |
| Sqlcmd-direct path overwrites pfill_* tables PSSaaS just wrote, racing if the harness invokes them in parallel | Harness invokes them sequentially (PSSaaS first; reads HTTP reports into in-memory dicts; THEN sqlcmd-direct EXECs). PSSaaS's data is materialized before any sqlcmd write. |
| Harness's clickable run-status URLs point at staging via pssaas_ui_base_url config indirection; if Backlog #30's Superset migration happens to change the staging URL, the report links break | Config indirection lives in harness_config.yaml; one-line update flips all URLs in future runs. Already-rendered reports remain historical artifacts. |
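The sequential-invocation mitigation in the table above can be sketched with injected callables, which keeps the ordering testable without a database. All four callable names are hypothetical; only the ordering (PSSaaS run, in-memory copy of its reports, then sqlcmd-direct) is taken from this ADR.

```python
from typing import Callable, Dict, Tuple

Rows = Dict[str, dict]


def run_comparison(
    start_pssaas_run: Callable[[], str],
    fetch_pssaas_reports: Callable[[str], Rows],
    exec_sqlcmd_direct: Callable[[], None],
    read_pfill_tables: Callable[[], Rows],
) -> Tuple[Rows, Rows]:
    """Run both invocation paths strictly one after the other."""
    # 1. PSSaaS side first: trigger the run through the API.
    run_id = start_pssaas_run()
    # 2. Copy its reports into memory before anything can overwrite pfill_*.
    pssaas_rows = dict(fetch_pssaas_reports(run_id))
    # 3. Only now let the sqlcmd-direct path rewrite the same tables.
    exec_sqlcmd_direct()
    legacy_rows = dict(read_pfill_tables())
    return pssaas_rows, legacy_rows
```

The copy in step 2 is what makes the later overwrite harmless: the PSSaaS side of the diff no longer depends on table state once the direct EXECs begin.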
Related ADRs
- ADR-021: PowerFill Port Strategy — verbatim-port discipline + the §Narrow Bug-Fix Carve-Out that forward-only-deploys A54 fixes; the canonical reference for what Frame D's "same fixed proc body" claim means.
- ADR-022: PowerFill Allocation Algorithm — the iterative-passes algorithm whose per-stage semantics A1 documents and Phase 9's per-loan correctness validation is the gate for.
- ADR-024: PowerFill Async Run Pattern — the BackgroundService + Channel pattern that's the PSSaaS-side invocation surface the harness reads from.
- ADR-025: PowerFill Report API Pattern — the latest-Complete-wins semantics + freshness verdicts that the harness's HTTP-client layer needs to understand (see also A60 in the assumptions log).
Related Assumptions
- A1 (per-stage allocation semantics) — Phase 9's per-loan correctness validation is the canonical gate per A1's Banking note.
- A60 (latest-Complete-wins; ADR-025 reference).
- A65 (multi-pa_key + per-loan settlement-date variance are two distinct A54 triggers; harness pre-flight could probe these on a customer DB; deferred to operator-driven post-Phase-9 sweep).
- A66 (UE rebuild-empty on syn-trade-empty datasets like PS_DemoData) — encoded in the harness's verdict logic as the A66-aware Match rule.
- A67 (ReportContracts.cs XML doc Truth Rot) — closed in this Phase 9 commit batch.
- A68 (tenant_id-vs-config-slot conflation) — the harness uses X-Tenant-Id: ps-demodata to match the existing pfill_run_history row tags per A68's short-term Path γ disposition.
- A69 (NEW this session) — state-dependent UE failure on non-empty post-pool_guide state surfaced by the harness's first run; banked for Greg/Tom consultation.
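A hedged sketch of the A66-aware rule referenced in the list above: an empty UE result on both sides counts as Match only when neither side Failed, and an asymmetric failure is Incomparable. The `SideResult` type, the `ue_row_count` field, and the treatment of the both-sides-Failed case are assumptions; this ADR does not spell out that last case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SideResult:
    failed: bool        # did this invocation path end in a Failed state?
    ue_row_count: int   # rows the UE step left behind on this side


def run_level_shortcut(pssaas: SideResult, legacy: SideResult) -> Optional[str]:
    """Return a run-level verdict, or None to fall through to the row diff."""
    if pssaas.failed != legacy.failed:
        # Asymmetric failure: the two outputs are not comparable.
        return "Incomparable"
    if pssaas.failed and legacy.failed:
        # ASSUMPTION: with both sides Failed there is nothing valid to diff.
        return "Incomparable"
    if pssaas.ue_row_count == 0 and legacy.ue_row_count == 0:
        # A66: rebuild-empty UE is expected on syn-trade-empty datasets.
        return "Match"
    return None
```

Checking failure before emptiness is the point of the post-first-run adjustment: a Failed run can also leave zero rows, and that must not be read as an A66 Match.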
Future Considerations
- Snapshot replay (Option C from D-9-1) — capture pre-run DB state + replay after each comparison to enable true side-by-side comparisons of two harness runs against the same input. Currently the harness relies on the proc bodies being deterministic given identical inputs.
- Multi-tenant operator-driven sweep — point the harness at a customer DB (substituting connection string + tenant ID) for the post-Phase-9 validation runs the kickoff defers.
- Phase 8.5 OIDC sidecar — when Keycloak lands per Backlog #31, wire bearer-token authentication into the harness's HTTP client.
- HTML / JSON output formats — sibling Jinja2 templates if a future phase needs programmatic consumption beyond the Markdown v1.
- Containerized harness (Option K from D-9-4) — defer until evidence emerges that local-only invocation doesn't generalize to a customer-DB scenario.
Revision Triggers
This ADR is revised when:
- The Frame D framing is challenged by Greg/Tom in a way that makes the orchestration-equivalence claim insufficient as the demo signal.
- The first non-PS_DemoData customer-DB validation run is performed (closes the Capability × Environment matrix's "Customer DB" column).
- A second invocation-path option (B / C from D-9-1) becomes preferred over the direct-sqlcmd default.
- The Phase 8.5 OIDC integration lands and the harness's HTTP client signature needs to bake bearer-token in by default.