Column Filtering for Change Retention #2110
Open
Table Column Filtering
Adds column filtering capabilities to change retention pipelines, allowing users to exclude sensitive columns (like user data fields or large metadata columns) or include only specific columns they need. This provides fine-grained control over which column data is captured and stored in change retention tables.
Data schema changes
- Adds `include_column_attnums` and `exclude_column_attnums` array fields to the `source_tables` JSONB column in `wal_pipelines`. Existing pipelines are migrated with NULL values for these fields.
- Extends the `SourceTable` embedded schema with two new optional array fields storing column attribute numbers. The changeset validates that these fields are mutually exclusive: only one can be set at a time.
- Columns are stored by attribute number (`attnum`), which is resolved from column names during YAML parsing and UI configuration.
UI changes
- Adds a column selector to the change retention form (`Form.svelte`), appearing after table selection. It displays available columns with checkboxes, automatically handles primary key constraints (PKs cannot be excluded and are always included), and shows selected columns as removable tags.
Setting Column Filtering at Change Retention Creation
Note that the PK checkbox is disabled: the primary key is always captured and cannot be toggled in either include or exclude mode.
Viewing Change Retention Details
The details show the selected columns if filtering is enabled
Editing Change Retention
The user can modify the configuration to update the filtering. A note on this field explains that filter changes apply only going forward: historic data for the affected columns is not backfilled in existing rows, which is out of scope for this PR.
Implementation details
- The `ColumnSelection` module handles filtering at multiple points:
  - Filtering is applied to both `record` and `changes` payloads.
  - Filtering runs in `MessageHandler.wal_event/2` and `Consumers.message_record/2` / `message_changes/2`, ensuring filtered columns are excluded from all downstream consumers and sinks.
- YAML parsing accepts `exclude_columns` and `include_columns` (column name lists) and converts them to attribute numbers, with validation to ensure primary keys are never excluded and that both options aren't specified simultaneously.
Configuration
Configure column selection via:
- `exclude_columns` or `include_columns` arrays in the `change_retentions` source table configuration
Primary key columns are automatically protected: they cannot be excluded and are always included, even when using include mode.
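For reviewers, the rules above can be sketched in a few lines of Elixir. This is an illustration only, not the code in this PR: the module name `ColumnSelectionSketch`, the functions `resolve/2` and `filter_payload/3`, the plain-map column shape (`name`, `attnum`, `is_pk?`), and the error tuples are all hypothetical. It also assumes that primary keys are re-added at filter time rather than being written into the stored include list; the actual `ColumnSelection` module may do this differently.

```elixir
defmodule ColumnSelectionSketch do
  @moduledoc """
  Hypothetical sketch of the column selection rules; NOT the PR's module.
  A column is a plain map: %{name: String.t(), attnum: integer(), is_pk?: boolean()}.
  """

  # Convert name-based config into attnum lists, enforcing the two rules:
  # include/exclude are mutually exclusive, and PKs can never be excluded.
  def resolve(config, columns) do
    include = Map.get(config, :include_columns)
    exclude = Map.get(config, :exclude_columns)

    cond do
      include != nil and exclude != nil ->
        {:error, :include_and_exclude_both_set}

      exclude != nil ->
        with {:ok, selected} <- lookup(exclude, columns) do
          case Enum.filter(selected, & &1.is_pk?) do
            [] -> {:ok, selection(nil, attnums(selected))}
            pks -> {:error, {:cannot_exclude_primary_key, Enum.map(pks, & &1.name)}}
          end
        end

      include != nil ->
        with {:ok, selected} <- lookup(include, columns) do
          {:ok, selection(attnums(selected), nil)}
        end

      true ->
        # No filtering configured: both fields stay nil.
        {:ok, selection(nil, nil)}
    end
  end

  # Keep only the payload keys that the selection allows.
  # Payload keys are assumed to be column names (strings).
  def filter_payload(payload, selection, columns) do
    names = for col <- columns, keep?(col, selection), do: col.name
    Map.take(payload, names)
  end

  # Primary keys are always kept, even in include mode.
  defp keep?(%{is_pk?: true}, _selection), do: true

  defp keep?(col, %{include_column_attnums: attnums}) when is_list(attnums),
    do: col.attnum in attnums

  defp keep?(col, %{exclude_column_attnums: attnums}) when is_list(attnums),
    do: col.attnum not in attnums

  defp keep?(_col, _selection), do: true

  # Resolve names to column maps, reporting any names that do not exist.
  defp lookup(names, columns) do
    by_name = Map.new(columns, &{&1.name, &1})

    case Enum.reject(names, &Map.has_key?(by_name, &1)) do
      [] -> {:ok, Enum.map(names, &Map.fetch!(by_name, &1))}
      missing -> {:error, {:unknown_columns, missing}}
    end
  end

  defp attnums(cols), do: Enum.map(cols, & &1.attnum)

  defp selection(include, exclude),
    do: %{include_column_attnums: include, exclude_column_attnums: exclude}
end

# Example table: "id" is the primary key.
columns = [
  %{name: "id", attnum: 1, is_pk?: true},
  %{name: "email", attnum: 2, is_pk?: false},
  %{name: "metadata", attnum: 3, is_pk?: false}
]

{:ok, selection} = ColumnSelectionSketch.resolve(%{exclude_columns: ["metadata"]}, columns)

# Drops "metadata" and keeps "id" and "email".
ColumnSelectionSketch.filter_payload(
  %{"id" => 7, "email" => "a@example.com", "metadata" => "{}"},
  selection,
  columns
)
```

Storing attribute numbers rather than names means a column rename does not silently change what is filtered, which is presumably why names are resolved once at configuration time.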
Tests
Added a few tests and ran `mix test`: