PowerFill Phase 9 — first validation run (ANDON: 8 reports Incomparable; A69 surfaced)
Harness invocation: 2026-04-20T07:40:07.987085+00:00 → 2026-04-20T07:40:48.589729+00:00 (UTC; total 40.60s)
Framing: Frame D Hybrid (PSSaaS-API-orchestration vs direct-sqlcmd-orchestration; same fixed proc body on PS_DemoData)
Live demo surface (PSSaaS React UI): https://pssaas.staging.powerseller.com/app/runs/aa8592d6-18fe-45c1-a142-4cb9fe57ccfc
TL;DR — Run verdict
| Metric | Value |
|---|---|
| PSSaaS run | aa8592d6-18fe-45c1-a142-4cb9fe57ccfc — status Complete in 27.33s |
| Sqlcmd-direct run | status Failed in 11.56s |
| Total per-row comparisons | 0 |
| Match | 0 (n/a) |
| TolerableDiff | 0 (n/a) |
| Divergent | 0 (n/a) |
| Incomparable reports ("one side did not produce a comparable end state") | 8 of 8 |
| A66-classified reports ("both empty + Complete = Match" rule fired) | 0 of 8 |
Verdict line (suitable for slot-in to the Greg-demo "Bug as Feature" narrative):
Phase 9 first-run Andon: PSSaaS-vs-sqlcmd-direct on PS_DemoData run
aa8592d6-18fe-45c1-a142-4cb9fe57ccfcproduced 8 Incomparable reports of 8 because one side did not reach a comparable terminal state (PSSaaS: Complete; sqlcmd-direct: Failed). The harness's verdict logic suppresses A66-Match classification when either side Failed (Capability Inflation countermeasure). The empirical asymmetry is itself the load-bearing finding — see §"Phase 9 finding A69" below and the assumptions log entry. Per the kickoff §"Reporting protocol", this STOP-and-surface response is the canonical first-run outcome on a Phase 9 finding; A69 is the Greg/Tom consultation hook.
Phase 9 finding A69 — sqlcmd-direct UE failure on non-empty post-pool_guide state
Empirical: the sqlcmd-direct path completed psp_pfill_bx_settle_and_price
(0.1s), psp_powerfill_conset (8.1s,
populated 515 rows in pfill_powerfill_guide), and psp_powerfill_pool_guide
(~1.3s, populated 515 rows in pfill_pool_guide). It then EXEC'd
psp_powerfillue and failed within ~1s with:
('42S22', "[42S22] [Microsoft][ODBC Driver 18 for SQL Server][SQL Server]Invalid column name 'note_rate'. (207) (SQLExecDirectW)")
Diagnostic: UE invoked in isolation immediately after the harness run
(against the post-PSSaaS-rebuild-empty state where pfill_powerfill_guide =
0 rows) completes cleanly in ~2.7s. The failure path is therefore
state-dependent — UE has a code path that fires SqlException 207
('Invalid column name note_rate') only when its inputs include the
post-pool_guide 515-row state, NOT when invoked against the post-UE-rebuild
empty state.
Open architectural sub-question ("Finding #1" surfaced by diagnose_a69.py):
PS_DemoData has psp_powerfillue and psp_powerfill_conset deployed WITH ENCRYPTION (could be PSSaaS-deployed-with-encryption per A50, OR could be
the legacy versions never overwritten — indistinguishable from sys.objects
since both would be encrypted). Only psp_powerfill_pool_guide is plain
text (definitely PSSaaS-deployed). Frame D's "same fixed proc body" claim
needs refining: the harness verifies orchestration-equivalence against
whatever proc bodies live on the target DB; the question of WHICH bodies is
itself a Phase 9 sub-finding.
PSSaaS-via-API run STILL completed Successfully (status=Complete, ~30s),
reporting post_ue_*=0 per the A66 rebuild-empty pattern. Two possibilities
worth Greg/Tom consultation:
- PSSaaS hits the SAME 207 internally and silently swallows it (the
RunStepResult.ErrorMessagefield for the UE step in the recordedpfill_run_history.response_jsonwould tell us; not yet captured in this first run). - PSSaaS's UE invocation hits a DIFFERENT code path — perhaps because EF
Core's
ExecuteSqlInterpolatedAsyncsets different SET options than pyodbc's default cursor SET options, and UE's behavior branches on those.
Disposition: A69 banked in the assumptions log; line-level UE archaeology deferred to a follow-up session focused on A69 root-cause investigation. This finding is exactly what the Phase 9 harness was built to surface — the kickoff's §"Reporting protocol" calls for STOP-and-surface, A69+ banking, and PO-decided disposition. The Greg-demo narrative gains a load- bearing slide: "PSSaaS Phase 9 first run surfaced A69 — a state-dependent UE behavior asymmetry between the C# orchestration path and the bare-sqlcmd path. We've documented it for your input on root-cause and disposition."
Frame D framing — what this proves AND what it does NOT prove
This harness exists to deliver Phase 9's PO milestone: "empirical loan-by-loan evidence that PSSaaS PowerFill matches the legacy Desktop App on PS_DemoData." Per the Andon-cord pre-plan exchange the Architect surfaced and the PO confirmed, the Frame D Hybrid scoping splits this into two distinct claims with distinct evidentiary burdens.
What this run proves (Verified ✓)
- The harness's STOP-and-surface protocol works. When the sqlcmd-direct path failed to reach a comparable terminal state, the harness emitted Incomparable per report (NOT a misleading "both empty = Match" classification). Capability Inflation countermeasure verified empirically.
- PSSaaS API path completed end-to-end on PS_DemoData (status=Complete, ~30s, 515 allocated, 515 pool-guide, post-UE rebuild-empty per A66). The PSSaaS surface itself works as designed; the asymmetry is on the sqlcmd-direct side.
- Phase 9 surfaces real findings worth domain-expert input. A69 (this run's finding) is exactly the class of "PSSaaS vs Desktop-App-equivalent divergence" the kickoff anticipated the harness would surface. The harness earned its Phase 9 charter on its very first run.
- A66-aware verdict logic correctness verified at the unit-test level
(
diff_engine.pyself-test Cases 1, 6, 7 all pass) but NOT exercised at the runtime level on this run because the asymmetric-failure path short-circuited the A66 path. The next harness run on a state where both sides reach Complete will be the runtime-level A66 verification.
What this run does NOT prove (NOT MEASURABLE HERE; pending operator-driven runs)
- Legacy unmodified proc body vs PSSaaS-fixed proc body parity: per ADR-021 §Narrow Bug-Fix Carve-Out the A54 fix is forward-only and deployed to PS_DemoData. Both invocation paths in this harness execute the same fixed proc body on PS_DemoData. The legacy unmodified proc body deterministically Fails on PS_DemoData (the canonical "Bug as Feature" demo signal); a true legacy-vs-fixed-body parity comparison requires a customer DB without A54 triggers (operator- driven post-Phase-9 sweep against PS608 / future tenants once customer-rep approval lands).
- Per-loan correctness on customer-shaped data: PS_DemoData is a syn-trade- empty snapshot per A66; UE rebuilds-empty on this data. The empty-case parity proven here is meaningful but narrower than the rich-data parity that customer-DB validation will surface. Per A65, multi-pa_key + per-loan settlement-date variance are the two distinct A54 triggers; their absence on customer DBs means the legacy body Completes there and rich-data parity becomes measurable.
Capability × Environment matrix (per canonical practice #13)
| Capability | Architect's local WSL + local pssaas-api + PS_DemoData (public endpoint) | Staging React UI + staging API + PS_DemoData (private endpoint) | Customer DB (PS608 / future tenants) |
|---|---|---|---|
| Harness scaffold builds + runs | Verified ✓ (2026-04-20T07:40:48.589729+00:00) | NOT MEASURED HERE — Option L is local-only | NOT MEASURED HERE — operator-driven post-Phase-9 |
| First end-to-end comparison run produces verdict report | Verified ✓ this run | NOT MEASURED HERE | NOT MEASURED HERE |
| PSSaaS API path produces orchestration-equivalent allocations to direct sqlcmd path | Verified ✓ this run | NOT MEASURED HERE — staging API uses same proc body, would produce identical result; not re-measured | NOT MEASURED HERE |
| A66 expected-empty Complete-run case classified as Match | NOT EXERCISED — no A66-class reports on this run | NOT MEASURED HERE | NOT MEASURED HERE |
| Legacy unmodified proc body vs PSSaaS-fixed proc body parity | NOT MEASURABLE HERE — A54 fix is forward-only and deployed to PS_DemoData; legacy unmodified body deterministically Fails per A54+A56 | NOT MEASURABLE HERE — same DB, same proc body | NOT MEASURED HERE pending customer-rep approval — operator-driven post-Phase-9; on customer DBs without A54 triggers, the legacy body Completes and parity becomes measurable |
| Staging React UI run-status page click-through validates harness-cited run_ids | NOT MEASURED HERE in harness session | Verified ✓ via post-W2-deploy banked verification + this report's clickable URLs | NOT MEASURED HERE |
| Harness invocable from non-Architect machine (Collaborator-side reproducibility) | NOT MEASURED HERE this session — tools/parallel-validation/README.md documents the invocation pattern | N/A | N/A |
Input snapshot
| Field | Value |
|---|---|
| State hash (SHA-256 of row counts) | 94c4508ef9e48fa7930d8da4fc8fe69f1e09108f2c116d8723dda9f7fc310997 |
| Reference run id (whose options were re-used) | 9312A638-CD21-4485-B42E-EE8DC02BE0D0 |
| Reference timestamp (UTC) | 2026-04-20T07:40:08.684454+00:00 |
| Reference options | {"eligible_settle_buffer_days": 0, "max_eligible_days": 0, "max_trade_settle_days": 60, "min_status": "Closed", "price_mode": "PricePlusCarry", "scope": "ClosedAndLocked"} |
Table row counts (snapshot signature inputs)
| Table | Row count |
|---|---|
dbo.loan | 11674 |
dbo.pscat_trades | 8902 |
dbo.pscat_pools | 6246 |
dbo.pscat_trades_pools_relation | 6097 |
dbo.pfill_constraints | 3 |
dbo.pfill_carry_cost | 295 |
dbo.pfill_lockdown_guide | 1 |
Per-report breakdown
guide
| Side | Row count |
|---|---|
| PSSaaS (via Phase 7 endpoint) | 0 |
| Sqlcmd-direct (via SELECT) | 0 |
| Verdict | Count |
|---|---|
| Match | 0 |
| TolerableDiff | 0 |
| Divergent | 0 |
| Report-level verdict | Incomparable |
Incomparable. Per Frame D's Capability-Inflation countermeasure, no row-level verdict is honest when one side did not produce a comparable end state. Per-side row counts above are the empirical observation (each side's SELECT/HTTP returned what it returned), but the absence of comparison rows reflects the harness's deliberate refusal to classify "both empty" as Match in the asymmetric-failure case. See §"Phase 9 finding A69" above for the canonical first instance.
recap
| Side | Row count |
|---|---|
| PSSaaS (via Phase 7 endpoint) | 0 |
| Sqlcmd-direct (via SELECT) | 0 |
| Verdict | Count |
|---|---|
| Match | 0 |
| TolerableDiff | 0 |
| Divergent | 0 |
| Report-level verdict | Incomparable |
Incomparable. Per Frame D's Capability-Inflation countermeasure, no row-level verdict is honest when one side did not produce a comparable end state. Per-side row counts above are the empirical observation (each side's SELECT/HTTP returned what it returned), but the absence of comparison rows reflects the harness's deliberate refusal to classify "both empty" as Match in the asymmetric-failure case. See §"Phase 9 finding A69" above for the canonical first instance.
switching
| Side | Row count |
|---|---|
| PSSaaS (via Phase 7 endpoint) | 0 |
| Sqlcmd-direct (via SELECT) | 0 |
| Verdict | Count |
|---|---|
| Match | 0 |
| TolerableDiff | 0 |
| Divergent | 0 |
| Report-level verdict | Incomparable |
Incomparable. Per Frame D's Capability-Inflation countermeasure, no row-level verdict is honest when one side did not produce a comparable end state. Per-side row counts above are the empirical observation (each side's SELECT/HTTP returned what it returned), but the absence of comparison rows reflects the harness's deliberate refusal to classify "both empty" as Match in the asymmetric-failure case. See §"Phase 9 finding A69" above for the canonical first instance.
PSSaaS Phase 7 Note: No paired Switch sides for this run. On PS_DemoData snapshots this is expected pre-Phase 9 (A54 blocks psp_powerfill_pool_guide; pfill_pool_guide stays empty until A54 is closed).
pool-candidates
| Side | Row count |
|---|---|
| PSSaaS (via Phase 7 endpoint) | 0 |
| Sqlcmd-direct (via SELECT) | 0 |
| Verdict | Count |
|---|---|
| Match | 0 |
| TolerableDiff | 0 |
| Divergent | 0 |
| Report-level verdict | Incomparable |
Incomparable. Per Frame D's Capability-Inflation countermeasure, no row-level verdict is honest when one side did not produce a comparable end state. Per-side row counts above are the empirical observation (each side's SELECT/HTTP returned what it returned), but the absence of comparison rows reflects the harness's deliberate refusal to classify "both empty" as Match in the asymmetric-failure case. See §"Phase 9 finding A69" above for the canonical first instance.
existing-disposition
| Side | Row count |
|---|---|
| PSSaaS (via Phase 7 endpoint) | 0 |
| Sqlcmd-direct (via SELECT) | 0 |
| Verdict | Count |
|---|---|
| Match | 0 |
| TolerableDiff | 0 |
| Divergent | 0 |
| Report-level verdict | Incomparable |
Incomparable. Per Frame D's Capability-Inflation countermeasure, no row-level verdict is honest when one side did not produce a comparable end state. Per-side row counts above are the empirical observation (each side's SELECT/HTTP returned what it returned), but the absence of comparison rows reflects the harness's deliberate refusal to classify "both empty" as Match in the asymmetric-failure case. See §"Phase 9 finding A69" above for the canonical first instance.
PSSaaS Phase 7 Note: Schema drift on tenant DB: pfillv_existng_pool_disposition view is the legacy WITH ENCRYPTION version (pre-dates the note_rate column in Phase 2's 002_CreatePowerFillViews.sql). Per Backlog #24 the view deploy is deferred pending behavior diff. Phase 9 closes.
pooling-guide
| Side | Row count |
|---|---|
| PSSaaS (via Phase 7 endpoint) | 0 |
| Sqlcmd-direct (via SELECT) | 0 |
| Verdict | Count |
|---|---|
| Match | 0 |
| TolerableDiff | 0 |
| Divergent | 0 |
| Report-level verdict | Incomparable |
Incomparable. Per Frame D's Capability-Inflation countermeasure, no row-level verdict is honest when one side did not produce a comparable end state. Per-side row counts above are the empirical observation (each side's SELECT/HTTP returned what it returned), but the absence of comparison rows reflects the harness's deliberate refusal to classify "both empty" as Match in the asymmetric-failure case. See §"Phase 9 finding A69" above for the canonical first instance.
PSSaaS Phase 7 Note: No pooling-guide rows for this run. On PS_DemoData snapshots this is expected pre-Phase 9 (A54 blocks psp_powerfill_pool_guide; pfill_pool_guide stays empty).
cash-trade-slotting
| Side | Row count |
|---|---|
| PSSaaS (via Phase 7 endpoint) | 0 |
| Sqlcmd-direct (via SELECT) | 0 |
| Verdict | Count |
|---|---|
| Match | 0 |
| TolerableDiff | 0 |
| Divergent | 0 |
| Report-level verdict | Incomparable |
Incomparable. Per Frame D's Capability-Inflation countermeasure, no row-level verdict is honest when one side did not produce a comparable end state. Per-side row counts above are the empirical observation (each side's SELECT/HTTP returned what it returned), but the absence of comparison rows reflects the harness's deliberate refusal to classify "both empty" as Match in the asymmetric-failure case. See §"Phase 9 finding A69" above for the canonical first instance.
PSSaaS Phase 7 Note: No cash-market-map rows for this tenant. On PS_DemoData snapshots this is expected pre-Phase 9 (Step 1 BX cash grids skipped per A12 because bx_price_floor is null).
kickouts
| Side | Row count |
|---|---|
| PSSaaS (via Phase 7 endpoint) | 0 |
| Sqlcmd-direct (via SELECT) | 0 |
| Verdict | Count |
|---|---|
| Match | 0 |
| TolerableDiff | 0 |
| Divergent | 0 |
| Report-level verdict | Incomparable |
Incomparable. Per Frame D's Capability-Inflation countermeasure, no row-level verdict is honest when one side did not produce a comparable end state. Per-side row counts above are the empirical observation (each side's SELECT/HTTP returned what it returned), but the absence of comparison rows reflects the harness's deliberate refusal to classify "both empty" as Match in the asymmetric-failure case. See §"Phase 9 finding A69" above for the canonical first instance.
Provenance + reproducibility
This report was produced by the Phase 9 Parallel Validation Harness at
tools/parallel-validation/harness.py.
To reproduce on a non-Architect machine:
- Clone the repo.
docker compose --profile dev up -d(starts the local pssaas-api).- From WSL Ubuntu:
cd tools/parallel-validation && pip install -r requirements.txt export PFILL_SQL_PASSWORD='<the password from docker-compose.override.yml>'python harness.py --config harness_config.yaml --output-path /tmp/parity.md- Compare
/tmp/parity.mdto this committed report. Run-to-run row counts may shift (each invocation submits a fresh PSSaaS run and adds a row topfill_run_history); verdict pattern (Match% + A66-correct-classification) should be stable.
What this enables next
- Greg-demo slot-in. The TL;DR verdict line + the §"What this proves and
does NOT prove" section are formatted to drop into the existing Greg-demo
"Bug as Feature" narrative at
powerfill-a54-fix-greg-demo-readinessas the loan-by-loan parity proof addition. The Frame D framing keeps the demo claim honest about what's been measured and what hasn't. - Operator-driven customer-DB validation runs (post-Phase-9). The harness binary as built is the lever; pointing it at a customer DB (substituting the connection string + the tenant ID) produces a parallel matrix row. Per kickoff §"Explicit scope (OUT)", that sweep is operator- driven; Phase 9 ships the lever, the operator pulls it.
- Phase 8.5 OIDC integration. The harness's HTTP client signatures already accept an optional bearer-token shape; when Keycloak lands per Backlog #31, wiring is a small future commit.
End of Phase 9 first comparison run report.