Overview
A common issue reported by users is the absence of specific records in the unified_coalesced or unified_preprocessed_raw tables, even when those records appear to be correctly clustered and visible in raw input tables. This KB article outlines how to investigate these "missing" records and identify if the cause is a technical bug or a legitimate system process, such as Compliance Deletion.
Issue Summary
Users may observe that while an Amperity ID exists, certain expected Primary Keys (PKs) associated with that ID are missing from the unified output tables. This typically impacts non-match reporting and data lineage validation.
Diagnostic Symptoms:
Queries against
unified_coalescedfor specific PKs return zero results.The PK might be missing from
unified_preprocessed_rawas well, despite being present in the source data.Internal Stitch diagnostic tables (like
stitch_scores) may appear empty or missing, complicating standard troubleshooting.
Troubleshooting & Investigation Steps
1. Verify Data Presence Across the Lineage
Before assuming a Stitch failure, verify exactly where the record "disappears." Run a cross-table join to see if the PK exists in the preprocessing stage:
SQL
SELECT uc.amperity_id, uc.pk AS coalesced_pk, ur.pk AS preprocessed_pk
FROM Unified_Coalesced uc
INNER JOIN unified_preprocessed_raw ur ON uc.amperity_id = ur.amperity_id
WHERE uc.amperity_id = 'target-amperity-id'
2. Inspect the Unified_Changes_PKS Table
The most definitive way to trace "lost" records is through the internal Unified_Changes_PKS table (found within the Stitch tables dataset). This table logs changes, including deletions, made during the stitching process.
What to look for:
Delete Actions: Look for the specific PKs in this table. If they are marked as deleted, the system intentionally removed them.
Job ID Reference: Identify the
job_idassociated with the deletion. This usually links back to a specific Daily_Load or Stitch workflow.
3. Check Compliance Processing
If records are present in Raw data but missing from all Unified tables, they may have been removed by a Compliance Workflow (GDPR/CCPA).
Check the Domain Tables (e.g.,
Credit:CrdtClassicAcctMstr) to see if the records were filtered out during a compliance job.Compliance deletions occur before data is ingested into the unified layers, which explains why a record might appear to "vanish" during stitching.
Root Cause: Legitimate Deletion vs. Bug
| Scenario | Explanation |
|---|---|
| Compliance Deletion | The record was part of a "Right to be Forgotten" request or filtered via a data retention policy. It is removed from domain tables and excluded from Unified tables. |
| Merged Accounts | The record might have been merged into a different "Master" PK. Check the merged_accounts_stitch table using the daas_acct_key. |
| Stitch Over-clustering | If supersized_id is present, the record might be part of a cluster that was too large to process, causing it to be dropped for performance safety. |
Resolution
If the investigation shows the records were deleted via a Unified_Changes_PKS entry linked to a compliance job, no technical fix is required. The system is performing as designed by respecting data privacy or retention rules.
Next Steps for Users:
Verify if the missing PKs belong to customers who requested data deletion.
Review compliance job logs for the timestamp identified in the
Unified_Changes_PKStable.