PowerFill Phase 6 — Open Questions for PO
Author: PSSaaS Systems Architect
Date: 2026-04-17
Status: Pending PO input
Companion docs:
- Sub-phase breakdown: powerfill-phase-6-subphase-breakdown
- Original kickoff: powerfill-architect-phase6-kickoff
- Architect-internal Phase 6a plan: .cursor/plans/powerfill-phase-6a.plan.md (gitignored)
How to use this document
The Phase 6 kickoff (line 414) requires me to surface 7 open questions to PO and acknowledge that they are pending PO input before sub-phase 6b planning begins. This document is that surface.
For each question:
- Context — what surfaced the question
- Options — 2-4 candidate answers with pros and cons
- Architect recommendation — the option I would default to if PO doesn't comment, with a one-sentence rationale
- Default behavior if no PO answer — what 6a/6b/etc. plans will assume
- Impact of being wrong — what rework looks like if a different answer turns out to be correct
PO can reply per question with "accept default", "pick option X", or "escalate" (needs Tom/Greg/Lisa input). Mixed answers across questions are fine.
The seven questions the kickoff anticipated are listed in their original order, followed by two additional questions surfaced during Phase 6 kickoff verification-gate work (Q8 and Q9, marked "surfaced during planning").
Q1 — Background-job pattern for asynchronous run execution
Context
Spec §Run Execution Model line 242: "PowerFill runs are asynchronous — API returns a run id immediately, job executes in background." Phase 6 kickoff line 116-122 confirms this is required, leaves the pattern open. Sub-phase 6e implements the async runtime; sub-phases 6a-6d ship synchronous-only.
PSSaaS does not currently have a job-queue or BackgroundService anywhere — every existing endpoint is request-scoped. This is a precedent-setting decision that other modules (Phase 7 reports? scheduled BestEx? Risk Manager batch jobs?) will likely follow.
Options
| Option | Pattern | Persistence | Pros | Cons |
|---|---|---|---|---|
| A. .NET hosted service + in-memory Channel<T> queue | IHostedService background worker; System.Threading.Channels.Channel for the queue | None — queue lost on pod restart | Zero new dependencies; fits modular monolith; cancellation works naturally | Pod restart mid-run = orphaned pfill_run_history row; no replay; only single-pod safe (BR-8 already restricts to one run per tenant, so single-pod is mostly fine) |
| B. .NET hosted service + DB-backed queue table | pfill_job_queue row insert on POST /run; hosted service polls the table; single transaction picks up the next job | DB row survives pod restart; can be reclaimed by another pod | Pod restart safe; multi-pod compatible; visible job state in DB | Need a polling interval (latency vs cost trade-off); DB polling pattern; pfill_job_queue is another PSSaaS-only table to maintain |
| C. RabbitMQ (already in stack — full profile) | Publish job message on POST /run; subscriber background worker consumes | RabbitMQ persists messages | Production-grade; aligns with the eventual full profile; pub/sub for other consumers (Phase 7 reports?) | RabbitMQ not in the dev profile today; PSSaaS API doesn't currently take a dependency on it; multi-tenant routing requires care; new infra surface |
| D. Hangfire-equivalent (e.g., Quartz.NET, Hangfire) | External library does queue + worker + persistence + dashboard | Library handles it | Mature, dashboard, retries, scheduled jobs free | New dependency; opinionated; persistent storage usually SQL Server (which we have); need to evaluate licenses (Hangfire Pro is paid, OSS is per-tenant DB unfriendly) |
Architect recommendation
Option A for sub-phase 6e, with explicit ADR-024 documenting the choice and Phase 9-10 revisit trigger.
Rationale: 6a-6d ship synchronous; the "synchronous best-effort" implementation works at PS_DemoData scale (~11K loans, ~9K trades per run). Async is needed for production scale (50K+ loans?), but Option A is the smallest viable change. Pod-restart concern is mitigated by the BR-8 single-active-run rule + a 6e detail: pfill_run_history.status = 'Failed' is set by a startup health-check that finds rows stuck in Allocating state from a prior pod. Multi-pod is not a current requirement (PSSaaS-staging runs single-replica per ADR-020).
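The startup health-check could be a single reconciliation statement run before the hosted service begins consuming the queue. A sketch, assuming the in-flight status values from Q2's Option A and the failure_message column proposed in Q7 (both are still pending decisions):

```sql
-- On service startup: any run still in an in-flight status belongs to a
-- prior pod that died mid-run. Mark it Failed so BR-8 stops blocking new runs.
UPDATE dbo.pfill_run_history
SET    status          = 'Failed',
       ended_at        = SYSUTCDATETIME(),
       failure_message = 'Orphaned by pod restart; reconciled at startup'
WHERE  status IN ('Pending', 'PreProcessing', 'Allocating', 'PostProcessing');
```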
If PO disagrees, Option B is the next-most-natural fit (DB-backed queue is one new table, no new infra, polling can be tuned). Option C is right when PSSaaS adds its second batch consumer; recommend deferring to that point.
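If PO does override to Option B, the "single transaction picks up the next job" step is the only subtle part. A sketch of the standard SQL Server claim pattern, assuming a hypothetical pfill_job_queue shape (job_id, tenant_id, run_id, status, created_at, claimed_at):

```sql
-- Claim the oldest pending job atomically. READPAST skips rows another
-- pod has already locked; UPDLOCK + ROWLOCK holds the claimed row to commit.
WITH next_job AS (
    SELECT TOP (1) *
    FROM   dbo.pfill_job_queue WITH (UPDLOCK, ROWLOCK, READPAST)
    WHERE  status = 'Pending'
    ORDER  BY created_at
)
UPDATE next_job
SET    status = 'Claimed', claimed_at = SYSUTCDATETIME()
OUTPUT inserted.job_id, inserted.tenant_id, inserted.run_id;
```

The polling loop then sleeps for the configured interval when the claim returns no rows.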
Default behavior if no PO answer
Sub-phase 6e plan adopts Option A (in-memory channel + hosted service). Goes into draft as "default unless PO overrides." A new ADR-024 (PowerFill Async Run Pattern) drafts the decision; PO approves the ADR alongside the 6e plan.
Impact of being wrong
- A → B: replace the in-memory Channel<T> with a DB-backed queue table + polling loop; ~2 days of 6e work.
- A → C: introduce RabbitMQ dependency to PSSaaS API + multi-tenant routing; ~5-7 days of 6e work plus infrastructure adjustment.
- A → D: integrate the library + per-tenant DB connection routing; ~3-5 days plus license/Pro-tier evaluation.
Q2 — Single-active-run-per-tenant enforcement mechanism
Context
Spec BR-8 (line 419-421): "Only one PowerFill run per tenant can be active at a time. Attempting a second run while one is in progress returns HTTP 409 Conflict." Mechanism is not specified. Sub-phase 6e implements this.
Options
| Option | Mechanism | Pros | Cons |
|---|---|---|---|
| A. SQL UNIQUE index on pfill_run_history (tenant_id) WHERE status IN ('Pending','PreProcessing','Allocating','PostProcessing') | Filtered unique index; INSERT-conflict surfaces as a constraint violation | Database-enforced; survives all process failures; single source of truth | DB-level conflict produces a generic exception that the API must translate to 409; SQL Server filtered indexes have edge cases around null handling |
| B. Application-level lock (in-memory SemaphoreSlim keyed by tenant_id) | API checks an in-memory dictionary before INSERTing the run row | Simple; fast; no DB constraint | Lost on pod restart; multi-pod unsafe; needs reconciliation (a stale lock from a crashed run blocks future runs) |
| C. Redis distributed lock (SETNX keyed by tenant_id) | Redis is already in PSSaaS stack | Multi-pod safe; survives pod restart with TTL; standard pattern | New runtime dependency on Redis for POST /run (currently only the future Phase 5 ↔ Phase 6 cache may use Redis); TTL must be longer than the longest run; manual unlock on completion |
| D. Advisory lock via sp_getapplock (SQL Server) | Application-level lock managed in SQL Server | Tied to the DB session lifetime; auto-released on session end | Requires explicit lock acquisition in every code path that writes to pfill_* tables; easy to forget; not enforceable by other tools |
Architect recommendation
Option A. The DB-level constraint is the strongest enforcement and matches the PSSaaS pattern of "the database is the source of truth." A wrapper at the API layer translates SQL constraint exceptions to 409 responses (~10 lines of code). Survives all pod / process failures.
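A sketch of the index DDL, assuming the status values named in the option table (the 6e plan owns the final schema, so names may shift):

```sql
-- At most one active run per tenant: the filtered unique index only covers
-- rows in an in-flight status, so completed/failed rows never conflict.
CREATE UNIQUE NONCLUSTERED INDEX ux_pfill_run_history_active_run
    ON dbo.pfill_run_history (tenant_id)
    WHERE status IN ('Pending', 'PreProcessing', 'Allocating', 'PostProcessing');
```

A second INSERT in an in-flight status then fails with constraint violation error 2601/2627, which the API wrapper maps to 409.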
Option B is the right choice if PO wants the simplest possible 6e and is OK with rare BR-8 violations during pod restart edge cases.
Default behavior if no PO answer
Sub-phase 6e plan adopts Option A (filtered unique index). 6e plan owns the pfill_run_history schema design including this index.
Impact of being wrong
- A → B: drop the DB index, add an in-memory dictionary; ~1 day of 6e work.
- A → C: add Redis dependency to API; lock acquire/release plumbing; ~2-3 days of 6e work.
- A → D: rewrite 6e to use sp_getapplock; ~3 days, plus a lasting discipline burden on anyone touching pfill_* writes later.
Q3 — pfill_run_history shape (replay-capable vs summary-only)
Context
Spec §Audit Trail line 251-253 lists the columns: run_id, tenant_id, user_id, started_at, ended_at, status, options, input_loan_count, input_trade_count, output_guide_count, output_kickout_count. The spec is silent on whether to capture the full input loan set so a run can be replayed, or just the summary stats.
The trade-off is meaningful. A full input snapshot lets us reconstruct exactly what was given to the algorithm (replay debugging, audit, "what would have happened if we'd run today instead of yesterday"). A summary-only history is much cheaper but means we can never reproduce a past run's exact conditions.
Options
| Option | What's persisted | Storage cost | Pros | Cons |
|---|---|---|---|---|
| A. Summary-only (per spec line 252) | The columns the spec lists (run metadata, summary counts, options JSON) | Tiny — a few KB per run | Cheap; matches spec verbatim | No replay; "why did this run produce X?" is unanswerable after the fact |
| B. Summary + input loan IDs (FK list) | Add input_loan_ids (varbinary or JSON array of loan_ids); fetch row content from current loan table at replay | ~80 KB per run (10K loans × 8 bytes) | Cheap; lets us know which loans were considered, even if the loan rows have changed | Loan rows may have changed since the run; "replay" gives different results |
| C. Summary + full input snapshot | Add pfill_run_history_loans (per-run snapshot of relevant loan columns) and pfill_run_history_trades (per-run snapshot of trade columns) | ~5-20 MB per run (10K loans × ~30 columns + 50 trades × ~30 columns) | True replay capability; debugging is straightforward | Storage grows quickly; ~30 days of daily runs = ~500 MB per tenant; backup/restore impact |
| D. Hybrid: summary-only by default; "snapshot=true" option triggers C | Snapshot tables exist but are sparse | Variable; depends on usage | Best of both | Complexity (two code paths); confusing UX (the same run_id may or may not be replayable) |
Architect recommendation
Option B for sub-phase 6e. The ID list is small, gives forensic value ("which loans were in the input?"), and lets us decide later whether to upgrade to C without changing the audit table primary keys.
Option A is acceptable if PO wants Phase 6e's footprint as small as possible; the trade-off is real but defensible (we can always introduce snapshots later as a Phase 7 enhancement).
Option C is the right answer if PO knows we'll want replay debugging from day one (typical when first customer onboards).
Default behavior if no PO answer
Sub-phase 6e plan adopts Option B (summary + input loan ID list). The schema reserves a JSON column for future expansion to C without a schema change.
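A sketch of the two Option B columns, assuming the base pfill_run_history table already exists per the spec's column list (column names here are proposals, not settled schema):

```sql
-- Option B shape: spec columns plus an input-ID list, and a reserved JSON
-- column so an upgrade toward Option C needs no schema change later.
ALTER TABLE dbo.pfill_run_history
    ADD input_loan_ids nvarchar(max) NULL,  -- JSON array of loan_ids in the run input
        snapshot_json  nvarchar(max) NULL;  -- reserved for future snapshot expansion
```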
Impact of being wrong
- B → A: drop the input loan IDs column; ~30 minutes of 6e schema work.
- B → C: add the two snapshot tables, populate them on run start, query them on read; ~3-5 days of 6e work.
- B → D: add the option flag + conditional snapshot population; ~2 days of 6e work.
Q4 — A1 multi-pass semantics (allocation pass boundaries and selection criteria)
Context
Assumption A1 (in powerfill-assumptions-log): "psp_powerfill_conset executes allocation in four discrete passes: exact fit, best fit, fill remaining, and orphan handling." — flagged as Medium confidence and "needs Tom/Greg confirmation." Phase 6 kickoff line 354-358 says: "Phase 6 NEEDS this answered. Either trace it from the NVO empirically (Architect's primary task) or escalate to PO if NVO ambiguity remains."
This is on the critical path of sub-phase 6b. 6b cannot port psp_powerfill_conset without knowing what the passes actually do.
Options
| Option | Source | Pros | Cons |
|---|---|---|---|
| A. Architect traces empirically from NVO | Architect reads the conset body (NVO 50-5886; specifically the multi-pass blocks the kickoff says are scattered through 7500-12500) and documents each pass as A1.1, A1.2, A1.3, A1.4 in the assumptions log. | No PO bottleneck; verifiable against NVO line citations; result is a reusable artifact | Architect interpretation of legacy code without domain expert validation; any wrong pass-boundary identification propagates into 6b and is only caught in Phase 9 parallel validation |
| B. PO escalates to Tom or Greg for direct interpretation | One real domain expert reviews the pass logic and documents intent | Authoritative; closes A1 with high confidence; sets up Tom/Greg for Phase 9 validation | Adds a calendar dependency; Tom is finite-time; A1 has been open since Phase 0 (2026-04-16) without a Tom/Greg review trigger |
| C. Hybrid: Architect drafts (A); Tom/Greg review post-6b | Architect produces 6b output; Tom/Greg do a focused review of just the pass-boundary identification + at most 1 sample run output | Lowest blocking; preserves authoritative review; review scope is small | Still requires Tom/Greg time; if the review finds errors, 6b output is partially wrong and 6c/6d/6e built on top compound the problem |
Architect recommendation
Option A with explicit Phase 9 critique hook: 6b plan documents the empirical pass identification, ships, and Phase 9 (parallel validation) becomes the authoritative gate. This is the same approach the rest of the module has used (no Tom/Greg upstream; PS_DemoData and Phase 9 are the reality check). PO can override to Option B or C if Tom is available within ~1 week.
The empirical NVO trace is doable; 6b plan estimates ~1-2 days of NVO reading inside its 7-10 day range. The NVO uses readable (if verbose) T-SQL — the passes are demarcated by separate INSERT INTO @working_table blocks with different WHERE clauses, which is structurally identifiable.
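To make "structurally identifiable" concrete, the shape being looked for is roughly the following. This is illustrative only — not actual NVO code, and the column names and fit conditions are invented placeholders; the real pass conditions are exactly what the 6b trace must establish:

```sql
-- Illustrative shape only. Each pass is a separate INSERT into the same
-- working table, narrowed by a different WHERE clause.

-- Pass 1: "exact fit" — hypothetical condition shown for shape only
INSERT INTO @working_table (loan_id, trade_id, pass_no)
SELECT l.loan_id, t.trade_id, 1
FROM   @loans l
JOIN   @trades t ON l.amount = t.remaining_amount
WHERE  NOT EXISTS (SELECT 1 FROM @working_table w WHERE w.loan_id = l.loan_id);

-- Pass 2: "best fit" — different JOIN/WHERE, same target table, pass_no = 2
-- Pass 3: "fill remaining" … Pass 4: "orphan handling" …
```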
Default behavior if no PO answer
Sub-phase 6b plan adopts Option A. The 6b plan's §3 (Algorithm) becomes the canonical pass-boundary documentation, with NVO line citations per pass. A1 in the assumptions log is updated with a 6b-completion amendment showing the verified passes.
Impact of being wrong
- A → B: 6b might re-do the multi-pass section after Tom/Greg review; up to ~3 days of 6b rework.
- A → C: same as B but the rework is concentrated post-6b ship rather than mid-6b.
Q5 — Synthetic-trades scope split (schema-first vs all-in-6d)
Context
Per the syn-trades deep dive, the 3 pfill_syn_* tables are part of psp_powerfillUE. Phase 6 kickoff line 363-366 asks: "Should sub-phase 6a include syn-trades table creation (so the schema is whole), defer to 6d, or split (table CREATE in 6a, populate logic in 6d)?"
Options
| Option | What 6a does | What 6d does | Pros | Cons |
|---|---|---|---|---|
| A. Defer all to 6d (current breakdown default) | Nothing about syn tables | Adds schema (007_*.sql) + EF entities + UE proc that populates them | Single coherent commit; deploy state never has empty syn tables | 6a-6c run with the syn tables missing; if any 6a-6c code accidentally references them, runtime error rather than schema-time error |
| B. Schema-first in 6a; populate logic in 6d | Adds 007_*.sql (schema only) + EF entities | Adds UE proc that populates the (already-created) tables | Schema is whole from 6a onward; if 6b-6c references syn tables for any reason, EF queries succeed (return empty) | 6a deploys with empty unused tables that look broken to anyone querying them; "what is this table for?" question for 30+ days |
| C. Deferred + assertions | Nothing; 6a runtime asserts the syn tables don't exist (so accidental references throw clear errors) | Same as A | Adds clarity vs A | One more piece of conditional code |
Architect recommendation
Option A (defer all to 6d). The breakdown's default. Reason: the syn tables have no consumers in 6a-6c. Empty tables in production-like environments cause more confusion than a missing-table error in dev. The cost of "missing table" if 6b-6c accidentally reference them is a clear Invalid object name 'dbo.pfill_syn_trade_base' exception, which is easier to diagnose than empty-table behavior.
If PO prefers Option B for "schema is always whole" architectural cleanliness, that's defensible — it's a small surface change.
Default behavior if no PO answer
Sub-phase breakdown stays as drafted: 6d owns both schema and populate logic.
Impact of being wrong
- A → B: move the schema work from 6d's plan to 6a's plan; ~30 minutes of plan re-shuffling.
- A → C: add a startup assertion to 6a; ~30 minutes of code.
Q6 — Calculator integration pattern (per-constraint-iteration vs single-batch)
Context
PowerFillCarryCostCalculator (Phase 5) takes a batch of CarryCostInput records and issues one DB query per batch. Phase 6 candidate-builder produces those records mid-pipeline.
Phase 6 kickoff line 367-371: "Should the calculator be called per-constraint-iteration (small batches, fresh calculator scope) or once for all candidates (large batch, single round-trip)?"
This question is internal to sub-phase 6a. PO input is welcome but not blocking; 6a plan includes a recommendation that becomes the default.
Options
| Option | Pattern | Pros | Cons |
|---|---|---|---|
| A. Per-constraint-iteration (one calculator call per constraint loop) | Loop iterates pfill_constraints; for each constraint, build candidates for that constraint, call calculator once for that batch | Small failure radius (one constraint failing doesn't taint others); easier to log; calculator scope is fresh per constraint | More calculator calls per run (~ N constraints × 1 query each); slightly more overhead |
| B. Single-batch (one calculator call for the full candidate set) | Build all candidates first, call calculator once for everything | One DB round-trip total; cleaner perf | Failure cascades across all constraints; harder to attribute scoring to a specific constraint; calculator memory footprint grows with total candidate count |
| C. Hybrid (calculator-per-trade-batch within constraint iteration) | Constraint loop batches candidates per-trade then per-constraint | Finest control; balances perf and isolation | Most complex pattern |
Architect recommendation
Option A (per-constraint-iteration). Reasons:
- Failure isolation matches BR-7 priority semantics — constraints process in priority order; if constraint #5's calculator call fails, constraints #1-#4 are already correct. Aborting at #5 leaves a partial-but-correct state.
- Performance is acceptable — the Phase 5 calculator is a single IN @markets query. PS_DemoData has 3 constraints; production tenants likely have 10-50. 50 queries × ~1 ms each = 50 ms total overhead, negligible vs allocation pass duration.
- Cleaner logging — each constraint can log its own scoring-batch outcome ({constraint_priority: 10, candidates: 1234, matched: 1100, missing_curve: 134}).
- Symmetric with the conset proc's structure — NVO's allocation engine iterates constraints; calling the calculator per constraint mirrors that natively.
Default behavior if no PO answer
Sub-phase 6a plan §3 (candidate-builder pipeline) commits to Option A.
Impact of being wrong
- A → B: refactor the candidate-builder to batch globally; ~half a day.
- A → C: add the per-trade nested batching; ~1 day.
Q7 — Failure-state semantics for in-progress runs (what happens to partial outputs)
Context
Spec BR-9 (line 423-425): "Run output is immutable — once a run completes, its output tables are not modified until the next run." Spec line 246: "Failed runs must leave the output tables in a consistent state (either fully populated with prior run data, or fully cleared with error markers)."
Phase 6 kickoff line 372-375: "BR-9 (run output is immutable) vs BR-10 (run output overwrites prior run) — what happens to an in-progress run when a new one starts? Spec says BR-8 prevents this but doesn't say what happens to the partial outputs."
The question is really three-in-one:
- (a) When run N succeeds, what state is run N's output in? — Spec is clear: BR-10 overwrites BR-9; final state is all-N or all-failed.
- (b) When run N fails mid-pass, what is the state of pfill_powerfill_guide etc.? — Spec leaves this open.
- (c) When run N is in-progress and the API attempts to start run N+1, BR-8 returns 409. What happens to N's partial output? — N continues to completion or failure per (b).
Options for (b) — failure-state cleanup
| Option | What happens on failure | Pros | Cons |
|---|---|---|---|
| A. Preserve prior run's output (rollback) | Wrap the run in a SQL transaction; on failure, the transaction rolls back and the prior run's output is preserved | Strong consistency; "no surprise" UX | Long-running transaction (a Phase 6 run could take minutes); locks on pfill_* tables are held the whole time; competing reads block; high risk of timeout / lock escalation |
| B. Clear output tables with error markers | On failure, DELETE all pfill_* run-output rows and INSERT a marker row in pfill_run_history with status=Failed, failure_step=<name>, failure_message=<text> | Output tables visibly empty after failure; no stale data; clean re-run | Prior run's results lost; "I want to see yesterday's run" is impossible after a fresh failed run |
| C. Leave the partial output and mark the run as Failed | On failure, output tables contain mid-run state; pfill_run_history.status=Failed; user sees "this is partial; do not consume" | Cheapest implementation; useful for forensics | UX: a partial run looks like a real run unless the user reads the run history first; risk of "looks valid but is wrong" decisions |
| D. Snapshot prior output before run; restore on failure | Before run N starts, COPY prior run's output to a pfill_*_prior table; on failure, swap back | Best UX (failures are invisible to consumers); no long transaction | 2x storage; copy time on every run start; complexity in pfill_* table set (now 17 tables → 34) |
Architect recommendation
Option B (clear with error markers) for sub-phase 6e default.
Rationale: the operator value of "the output tables show whatever the last run produced" is high. The operator value of "I can roll back to yesterday's results" is low — yesterday's results are exactly what would have come back from re-running with yesterday's inputs, and we have pfill_run_history to tell them what yesterday's input shape was. Option B has the sharpest "this is the current truth" semantics. Failed-run forensics goes through pfill_run_history.failure_message and pfill_run_history.failure_step rather than through stale row data.
Option A is the right answer if PO wants strong-consistency UX and is OK with the transaction-duration trade-off (which can be several minutes for a real run).
Default behavior if no PO answer
Sub-phase 6e plan adopts Option B. pfill_run_history.failure_step and failure_message columns capture the diagnostic.
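A sketch of the failure path, using the two output tables named elsewhere in this document as stand-ins (the real cleanup covers every pfill_* output table, and the parameter names are placeholders):

```sql
-- On run failure: clear run outputs atomically and leave the diagnostic
-- in run history. Short transaction — cleanup only, not the run itself.
BEGIN TRANSACTION;

DELETE FROM dbo.pfill_powerfill_guide WHERE tenant_id = @tenant_id;
DELETE FROM dbo.pfill_pool_guide      WHERE tenant_id = @tenant_id;
-- … remaining pfill_* output tables …

UPDATE dbo.pfill_run_history
SET    status          = 'Failed',
       ended_at        = SYSUTCDATETIME(),
       failure_step    = @failure_step,
       failure_message = @failure_message
WHERE  run_id = @run_id;

COMMIT TRANSACTION;
```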
Impact of being wrong
- B → A: wrap the run in a transaction; ~1 day plus performance testing.
- B → C: remove the cleanup logic; ~half a day.
- B → D: add the snapshot infrastructure; ~3-5 days.
Q8 — psp_pfill_insert4_pool_guide disposition (surfaced during planning, Phase 6 kickoff verification gate finding F-VERIFY-4)
Context
The Phase 6 kickoff completely omitted psp_pfill_insert4_pool_guide (NVO line 11712-12483, ≈ 771 lines). It was discovered during the kickoff verification gate when I enumerated all public function definitions in n_cst_powerfill.sru.
The naming convention (_insert4_pool_guide) suggests it's a sub-step that inserts into pfill_pool_guide. It almost certainly fits in sub-phase 6c (pool-action derivation) but its exact role is unconfirmed.
This is not a question of whether to do it — schema preservation says we must; otherwise pool-guide population is incomplete. The question is which sub-phase owns it and whether it's a separate enough concern to warrant its own bullet.
Options
| Option | Owner | Notes |
|---|---|---|
| A. Sub-phase 6c folds psp_pfill_insert4_pool_guide into pool-guide work | 6c plan | Most natural fit by name; 6c plan must verify the proc's role and decide whether to port verbatim or fold the logic into the main pool-guide port |
| B. Sub-phase 6b folds it into conset work | 6b plan | Possible if _insert4_ is invoked from psp_powerfill_conset; 6b plan investigates |
| C. New sub-phase between 6c and 6d | 6c.5 | If the proc is genuinely independent and large, it deserves its own sub-phase; this would push the breakdown to 6 sub-phases |
Architect recommendation
Option A with a 6c-plan investigation gate: the 6c plan's first §2 (Primary-Source Verification Gate) item is "where is psp_pfill_insert4_pool_guide invoked from?" If invoked from pool-guide, A is correct. If invoked from elsewhere, escalate.
Default behavior if no PO answer
The sub-phase breakdown above lists psp_pfill_insert4_pool_guide under 6c with the investigation gate noted. PO acceptance is implicit unless PO objects.
Impact of being wrong
Re-shuffling the proc to a different sub-phase is a plan-level change; ~half a day per move.
Q9 — Default values for run options (scope, min_status, etc.) (surfaced during planning, Phase 6 kickoff verification gate finding)
Context
Phase 6 kickoff verification gate verified the legacy defaults from w_powerfill.srw line 154-169:
- scope: "cl" (legacy code) ↔ spec calls it ClosedAndLocked ↔ spec line 232 says default is ClosedOnly
- price_value: "pc" ↔ spec calls it PricePlusCarry ↔ spec line 233 default PricePlusCarry ✓ matches
- status_code: "Docs Out" ↔ spec line 234 says "tenant default" — specifically Docs Out per legacy, but spec abstracts
- max_eligible_days: 0 ↔ spec line 235 says default 30
- max_trade_settle_days: 0 ↔ spec line 236 says default 60
- eligible_settle_buffer_days: 0 ↔ spec line 237 says default 0 ✓ matches
That's 3 out of 6 defaults that don't match between legacy and spec:
- scope: legacy default is cl (ClosedAndLocked), spec default is ClosedOnly. Different default behavior.
- min_status: legacy default is "Docs Out", spec default is "Closed" (in Phase 4 preflight settings).
- max_eligible_days / max_trade_settle_days: legacy defaults to 0, spec defaults to 30/60. 0 is suspect — likely "no limit" semantics in legacy, but spec's 30/60 is a hard upper bound.
This is a 5th Phase-0 Truth Rot finding — the spec's documented defaults don't match the legacy proven defaults. PO should rule on the canonical PSSaaS defaults before sub-phase 6a's POST /run accepts requests.
Options
| Option | Defaults | Pros | Cons |
|---|---|---|---|
| A. Match legacy defaults verbatim (per ADR-006 schema/behavior preservation) | scope=ClosedAndLocked, min_status=Docs Out, max_eligible_days=0 (no limit), max_trade_settle_days=0 (no limit), eligible_settle_buffer_days=0 | Faithful port; Phase 9 parallel validation will pass on default-options runs | Spec-vs-implementation drift; new operators may be surprised by 0=no-limit semantics |
| B. Match spec defaults verbatim (per current spec text) | scope=ClosedOnly, min_status=Closed, max_eligible_days=30, max_trade_settle_days=60, eligible_settle_buffer_days=0 | Modern; sensible upper bounds; spec stays canonical | Phase 9 parallel-validation will produce different results on default-options runs unless the user overrides; "match Desktop App" claim needs caveats |
| C. Spec is amended to match legacy; PSSaaS defaults match legacy | Same as A; spec text changes | Truth Rot fixed; ADR-006 honored | Spec changes; explicit PO acknowledgment required |
| D. Hybrid: defaults match legacy; modern defaults available as a separate "preset" | Default = A; ?preset=modern triggers B | Both ergonomics available | Two presets to document and test |
Architect recommendation
Option C (spec amended to match legacy; A's defaults become canonical PSSaaS defaults).
Rationale:
- ADR-006 schema preservation extends to behavior preservation during transition (ADR-021 confirms).
- Phase 9 parallel-validation will simply not work on default-options runs if PSSaaS defaults differ from Desktop App. We need every parity-validation lever.
- The spec drift is a one-line fix per spec line 232/234/235/236 — a small spec amendment.
- If a future PSSaaS-only modernization wants different defaults, that's a feature flag / preset / new ADR — not a default change in v1.
Default behavior if no PO answer
Sub-phase 6a plan adopts Option C and includes a small spec amendment as part of the 6a deliverable. The amendment cites this Q9 as the resolution path.
Impact of being wrong
- C → A: same defaults, different spec text; spec stays as-is, Architect adds a runbook note "PSSaaS defaults match Desktop App, spec text is being updated in Phase 7."
- C → B: a deliberate behavior divergence from Desktop App; Phase 9 parallel-validation needs explicit caveats; PO sign-off on the divergence.
- C → D: implement two preset paths; ~1 day of 6a work plus testing.
Summary — what 6a needs vs what 6b+ needs
| Question | Needed for | If unanswered, sub-phase plan defaults to |
|---|---|---|
| Q1 — background-job pattern | 6e | Option A (in-memory hosted service) |
| Q2 — single-active-run mechanism | 6e | Option A (filtered unique index) |
| Q3 — pfill_run_history shape | 6e | Option B (summary + input loan IDs) |
| Q4 — A1 multi-pass semantics | 6b — critical path | Option A (Architect empirical NVO trace) |
| Q5 — synthetic-trades scope split | 6d | Option A (defer all to 6d) |
| Q6 — calculator integration pattern | 6a | Option A (per-constraint-iteration) |
| Q7 — failure-state semantics | 6e | Option B (clear with error markers) |
| Q8 — psp_pfill_insert4_pool_guide disposition | 6c | Option A (6c with investigation gate) |
| Q9 — run-options defaults | 6a — affects API contract | Option C (spec amends to match legacy) |
Sub-phase 6a is unblocked once Q6 and Q9 are resolved (or Architect defaults are accepted). The 6a plan ships with these defaults assumed.
Sub-phase 6b cannot start until Q4 is resolved. If PO can't get Tom/Greg input, defaulting to Option A means 6b plan owns the empirical trace (~1-2 days of NVO reading inside 6b's range).
Sub-phases 6c, 6d, 6e are unblocked by the breakdown approval — their own kickoffs and plans handle their own questions per the per-sub-phase pattern.
Acknowledgement (per kickoff line 414)
I acknowledge that the above 9 questions (7 anticipated by the kickoff + 2 surfaced during planning) are pending PO input before sub-phase 6b planning begins. Sub-phase 6a planning proceeds under the Architect-default assumptions for Q6 and Q9; PO can override before 6a implementation starts. All other questions block the relevant sub-phase's planning — not the breakdown approval, which can proceed independently.