fix: preserve original error in exception_log on retry#430
fix: preserve original error in exception_log on retry#430mason-sharp merged 3 commits intov5_STABLEfrom
Conversation
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
… main) In TRANSDISCARD and SUB_DISABLE modes the apply worker retries a failed transaction with use_try_block=true. Previously, the original error was only written to the server log and lost before the retry pass ran. Every row in the discarded transaction was logged to spock.exception_log with error_message = NULL, stored as "unknown", giving no indication of why the transaction was discarded. Add initial_error_message[1024], initial_operation[16], and failed_action to SpockExceptionLog (shared memory). In the outer PG_CATCH, save edata->message, errcallback_arg.action_name, and xact_action_counter into these fields before FlushErrorState() clears the error context. On the retry pass, the row whose xact_action_counter matches failed_action receives the saved initial_error_message as its error_message in spock.exception_log. All other rows in the same transaction receive NULL, stored as "unavailable", which accurately describes their status (they were not the cause of the failure). Also change the NULL fallback in add_entry_to_exception_log from "unknown" to "unavailable" to eliminate the misleading implication that those rows independently encountered an unknown error. Clear initial_error_message at both exception-state reset points in handle_commit so stale messages do not carry over to subsequent transactions. Backport of commits d8cf5db, eb30df1 from main. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Port main's 013_exception_handling.pl to v5_STABLE. Adaptations from main: - GUC changes via postgresql.conf append + pg_reload_conf() instead of ALTER SYSTEM to avoid SIGHUP-propagation races with the apply worker fork. - Tables created explicitly on both nodes; no DDL-replication dependency. - Fixed sleeps replaced with wait_for_sub_status / wait_for_exception_log polling helpers. - done_testing() instead of a hard-coded test count. Part 5 added (new): TRANSDISCARD error_message quality regression test. Verifies that after the backport in d08c65db: - Bystander rows carry error_message = 'unavailable' (not 'unknown'). - The originally-failing row carries the real constraint-violation message. - No entries with the old 'unknown' fallback exist. Scenario: three-table transaction where t_ehx_b has NOT NULL on n2 only; INSERT with v=NULL fails on n2, triggering TRANSDISCARD for the whole xact. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
198afbb to
886477d
Compare
Up to standards ✅🟢 Issues
|
| Metric | Results |
|---|---|
| Complexity | 0 |
| Duplication | 2 |
NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.
- Remove dead initial_operation[16] field from SpockExceptionLog and its write site in the outer PG_CATCH; the field was written but never read. - Initialize ErrorData *edata = NULL in handle_sql_or_exception, matching all three DML handlers; the uninitialized declaration was formally UB even though the ternary short-circuit kept runtime safe. - Clear failed_action = 0 at both handle_commit reset points alongside initial_error_message[0] = '\0', making the guard explicit rather than relying on implicit coupling between the two fields. - Standardize initial_error_message[0] guard to != '\0' in all DML handlers, matching the SQL handler style. - Add LOG message with initial_error_message before each clear point in the TRANSDISCARD/SUB_DISABLE handle_commit paths, surfacing the original error in the server log (ported from main). - Reset xact_had_exception = false in handle_commit's post-commit state reset alongside xact_action_counter and use_try_block. Without this, the flag retained its 'true' value after a TRANSDISCARD retry pass and could trigger a spurious spock_disable_subscription() call if the GUC changed from transdiscard to sub_disable before the next BEGIN arrived, leaving exception_log empty and the subscription disabled prematurely. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
886477d to
5dbdcb4
Compare
In TRANSDISCARD and SUB_DISABLE modes the apply worker retries a failed
transaction with use_try_block=true. Previously, the original error was
only written to the server log and lost before the retry pass ran.
Every row in the discarded transaction was logged to spock.exception_log
with error_message = NULL, stored as "unknown", giving no indication of
why the transaction was discarded.
Add initial_error_message[1024], initial_operation[16], and failed_action
to SpockExceptionLog (shared memory). In the outer PG_CATCH, save
edata->message, errcallback_arg.action_name, and xact_action_counter into
these fields before FlushErrorState() clears the error context.
On the retry pass, the row whose xact_action_counter matches failed_action
receives the saved initial_error_message as its error_message in exception_log.