Import Normalization Rules
The PowerSeller Desktop App uses a multi-layer system to normalize incoming loan tape data from different sellers. Each seller sends data in their own format with their own terminology — the normalization rules translate everything into PowerSeller's standard schema.
Source: Azure SQL MI (PS608 / Watermark TPO), extracted 2026-03-17.
The Problem
Every seller's loan tape uses different:
- Column names ("Loan Amount" vs "UPB" vs "Principal Balance" vs "loan_amount")
- Value conventions ("Conv noPMI" vs "Conventional" vs "CONV" for the same loan type)
- Data formats (LTV as 80.0 vs 0.80, dates as MM/DD/YYYY vs YYYY-MM-DD)
- Product codes ("CONVENTIONAL 30-YEAR FIXED" vs "CF30" vs "Conv 30yr")
- File formats (CSV, fixed-width, pipe-delimited, different column orders)
Users historically spend significant time writing SQL syntax rules to normalize each new seller's data. At WTPO, there are 1,610 normalization rules across 8 sellers.
Architecture
Layer 1: File Maps (47 maps, 1,618 column definitions)
Each seller/format combination has a file map that defines how to parse the incoming file.
Source: pxcat_external_file_maps + pxcat_external_file_map_cols
| Field | Purpose |
|---|---|
map_name | Unique identifier (e.g., "Citizens Import") |
map_type | Import or export |
table_name | Target table (usually psbid_loans) |
delimiter_name | Field delimiter |
date_format | Date parsing format |
conversion_group_name | Which value conversion rules to apply |
conv_expr_group_name | Which expression conversion rules to apply |
Each map has column definitions specifying:
column_name— target column in PowerSellerfile_column_contents— source column name or position in the filecolumn_type— data typelength— field lengthletter_case— case conversionformat— display formatimport_blanks— how to handle empty values
Layer 2: Simple Value Mapping (1,016 rules)
Direct find-and-replace rules organized by seller. When a value in the incoming file matches from_value, it's replaced with to_value.
Source: pxcat_conversion_values
Conversion Groups at WTPO
| Seller | Target Table | Rule Count |
|---|---|---|
| Citizens | psbid_loans | ~100+ |
| First Alliance | psbid_loans | varies |
| First Home | psbid_loans | varies |
| Kind Lending | psbid_loans | varies |
| LenderWorks FHM | psbid_loans | varies |
| LenderWorks ICM | psbid_loans | varies |
| Loan Steady | psbid_loans | varies |
| Mortgage One | psbid_loans | varies |
| US Mortgage | psbid_loans | varies |
| Notional | loan_notional | varies |
Also includes export conversion groups for agency delivery (FHLMC, GNMA, Mornet).
Example Rules (Citizens)
| Column | From Value | To Value |
|---|---|---|
loan_type | Conv noPMI | Conventional |
loan_type | Conv wPMI | Conventional |
property_type | Condo | Condominium |
property_type | Single Family Detached | SFR |
amort_type | Amort ARM | AdjustableRate |
first_time_borrower | Y | true |
first_time_borrower | N | false |
first_time_borrower | NULL | false |
lienprioritytype | First | First Lien |
Every seller needs their own set of these mappings because every seller uses different terminology.
Layer 3: Expression-Based Conversions (594 rules)
SQL syntax rules that evaluate conditions against other columns to compute derived values. More powerful than simple mapping — handles conditional logic and calculations.
Source: pxcat_conversion_expr_values
Example Rules (Citizens)
Product code to instrument mapping:
| Syntax (IF condition) | Column | To Value |
|---|---|---|
product_code = 'CONVENTIONAL 30-YEAR FIXED' | instrument_name | CF30 |
product_code = 'CONV 30 YR FIXED RATE HOME POSSIBLE' | instrument_name | HP30 |
product_code = 'FHA FIXED 25-YEAR (GNMA II)' | instrument_name | FF30 |
product_code = 'HomePossible 30yr Fixed' | instrument_name | HP30 |
product_code = 'CONV 30-YEAR FIXED HOME READY' | instrument_name | HR30 |
Computed values:
| Syntax | Column | To Value |
|---|---|---|
1=1 (always) | cltv | cltv*100 (decimal to percentage) |
1=1 (always) | import_los_source | 'Citizens' (stamp the source) |
1=1 (always) | import_los_date | getdate()-0 (current timestamp) |
refinancecashouttype = 'Refi Equity Takeout' | cash_out_indicator | 'true' |
refinancecashouttype <> 'Refi Equity Takeout' | cash_out_indicator | 'false' |
Why This Is Painful
For each new seller, someone (usually Jay or a power user) must:
- Get a sample loan tape from the seller
- Identify which columns map to which PowerSeller fields
- Identify all unique values in each field
- Write simple conversion rules for every non-standard value
- Write expression rules for conditional logic (product code → instrument)
- Test the import and fix mismatches
- Repeat when the seller changes their format
This process can take hours to days per seller and is error-prone. Changes to a seller's file format require manual rule updates.
How PSX Replaces This
PowerSeller X's loan tape normalizer (engine/field_registry.py) solves this problem programmatically:
| Desktop App Approach | PSX approach |
|---|---|
| Manual SQL rules per seller | Automatic field detection by name/alias |
| 1,016 simple mapping rules | Built-in canonical value mappings |
| 594 expression rules | Programmatic transformations |
| Hours/days per new seller | Minutes — upload and auto-detect |
| Breaks when format changes | Fuzzy matching handles variations |
| Per-customer configuration | Centralized, learns from all sellers |
The PSX field registry maintains a dictionary of known aliases for each canonical field (e.g., loan_amount is also known as "UPB", "unpaid principal balance", "Principal Balance", "loan_amount", "baseloanamount"). When a seller uploads a loan tape, the normalizer matches columns automatically rather than requiring manual mapping rules.
This is a key value proposition of PSX over the Desktop App's import workflow.
Tables Reference
| Table | Rows | Purpose |
|---|---|---|
pxcat_external_file_maps | 47 | File format definitions (one per seller/format) |
pxcat_external_file_map_cols | 1,618 | Column mapping within each file format |
pxcat_conversion_groups | 25 | Named groups of value conversion rules |
pxcat_conversion_values | 1,016 | Simple from → to value mappings |
pxcat_conversion_expr_groups | 18 | Named groups of expression conversion rules |
pxcat_conversion_expr_values | 594 | SQL syntax-based conditional conversions |
pxcat_column_mappings | 27 | Column-level mapping definitions |
pxcat_column_mapping_contexts | 2 | Mapping context definitions |
pxcat_column_types | 14 | Supported column data types |