Overview
When executing complex SQL queries—particularly those with deep Common Table Expression (CTE) structures or numerous window functions—Amperity may return an error indicating the query has exceeded the Spark stage limit (typically a hard limit of 150 stages). This occurs when the Spark engine's execution plan becomes too fragmented for the available resource planning overhead.
Troubleshooting & Optimization Steps
If you encounter a query stage limit error, follow these technical best practices identified through internal support investigations:
1. Consolidate Sequential CTEs
Deep nesting of CTEs is the primary driver of stage explosion. Consolidate CTEs that reference the same source tables or perform sequential logic into a single block.
Example: Combine foundational campaign metadata (like
QueryInfoandTacticInfo) into one CTE rather than joining them later.
2. Position Window Functions Strategically
Window functions (e.g., ROW_NUMBER(), RANK()) are resource-intensive. Moving CTEs containing window functions lower in the "funnel" (closer to the final SELECT statement) can significantly reduce the number of stages generated early in the execution plan.
3. Refactor Complex Variables
Break down large "catch-all" CTEs that handle multiple independent logic streams (e.g., refactoring a massive DS_Variables block into smaller, dedicated entities).
4. Leverage the AI Assistant
Use the Amperity AI Assistant to identify potential CTE consolidations. It can often suggest more compact versions of your query that maintain the same business logic but use fewer stages.
The Materialization Solution
If standard tuning does not reduce the stage count sufficiently, Materialization is the recommended technical workaround. Materialization frees up memory resources by temporarily writing table data to storage mid-query.
How to use Materialization Hints:
You can "hint" to the Spark engine that specific CTEs should be materialized by adding the prefix materialized_ to the CTE name.
Syntax:
WITH materialized_Audience AS (SELECT ...)Effect: This breaks the execution plan into smaller, manageable chunks, preventing the stage count from exceeding the limit in a single run.
Advanced: Enabling the Materialization Feature Flag
In cases where naming conventions (materialized_ prefix) do not resolve the issue, or for exceptionally large queries that fail to run durably, an explicit "Enable Materialization" option can be activated in the Query window settings.
Requirement: This requires a specific Feature Flag "show materializer option" to be enabled in the Settings.
Action: Contact your Amperity Admin or Support representative to enable this flag for your tenant.
Usage: Once enabled, an admin must check the "Enable Materialization" box under Advanced Settings on the right side pane on the query window UI.
[!IMPORTANT] Note on Code Freeze: During platform code freeze periods, structural configuration changes in production are not advisable. It is recommended to create a Sandbox environment first to test if the materialization flag resolves the issue before applying it to Production.