feat: add ReachableFileCleanup to expire snapshots #592
Merged
wgtmac merged 7 commits into apache:main on Apr 24, 2026
Force-pushed from 596c5d2 to d3df6e3
wgtmac requested changes on Mar 20, 2026
wgtmac reviewed on Mar 20, 2026
Force-pushed from 4f36084 to 1ebc015
Contributor
Author
@wgtmac, made the change. Can you have a look?
Member
I'm on the way :)
wgtmac reviewed on Apr 1, 2026
Contributor
Author
@wgtmac do you have time to have another look?
wgtmac reviewed on Apr 17, 2026
Member
wgtmac
left a comment
Codex review comments for the two issues found in expire snapshots cleanup.
Implement the file cleanup logic that was missing from the expire
snapshots feature (the original PR noted "TODO: File recycling will
be added in a followup PR").
Port the "reachable file cleanup" strategy from Java's
ReachableFileCleanup, following the same phased approach:
Phase 1: Collect manifest paths from expired and retained snapshots
Phase 2: Prune manifests still referenced by retained snapshots
Phase 3: Find data files only in manifests being deleted, subtract
files still reachable from retained manifests (kAll only)
Phase 4: Delete orphaned manifest files
Phase 5: Delete manifest lists from expired snapshots
Phase 6: Delete expired statistics and partition statistics files
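The set arithmetic behind Phases 1 to 3 can be sketched roughly as below. This is a minimal in-memory model, not the code from this PR: `ManifestsBySnapshot`, `FilesByManifest`, `CleanupPlan`, `CollectManifests`, and `PlanCleanup` are hypothetical names, and manifests are represented as plain maps rather than files read through FileIO.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical in-memory model, not the iceberg-cpp types.
// snapshot id -> manifest paths listed in that snapshot's manifest list
using ManifestsBySnapshot = std::map<int64_t, std::vector<std::string>>;
// manifest path -> data file paths referenced by that manifest
using FilesByManifest = std::map<std::string, std::vector<std::string>>;

struct CleanupPlan {
  std::set<std::string> manifests_to_delete;
  std::set<std::string> data_files_to_delete;
};

// Union of all manifest paths referenced by the given snapshots.
std::set<std::string> CollectManifests(const ManifestsBySnapshot& snapshots) {
  std::set<std::string> out;
  for (const auto& [snapshot_id, manifests] : snapshots) {
    out.insert(manifests.begin(), manifests.end());
  }
  return out;
}

CleanupPlan PlanCleanup(const ManifestsBySnapshot& expired,
                        const ManifestsBySnapshot& retained,
                        const FilesByManifest& files) {
  // Phase 1: collect manifest paths from expired and retained snapshots.
  const std::set<std::string> expired_manifests = CollectManifests(expired);
  const std::set<std::string> retained_manifests = CollectManifests(retained);

  CleanupPlan plan;
  // Phase 2: keep only manifests that no retained snapshot references.
  for (const auto& manifest : expired_manifests) {
    if (retained_manifests.count(manifest) == 0) {
      plan.manifests_to_delete.insert(manifest);
    }
  }
  // Phase 3, first pass: candidate data files come only from manifests
  // that are themselves being deleted.
  for (const auto& manifest : plan.manifests_to_delete) {
    if (auto it = files.find(manifest); it != files.end()) {
      plan.data_files_to_delete.insert(it->second.begin(), it->second.end());
    }
  }
  // Phase 3, second pass: subtract files still reachable from retained
  // manifests.
  for (const auto& manifest : retained_manifests) {
    if (auto it = files.find(manifest); it != files.end()) {
      for (const auto& file : it->second) {
        plan.data_files_to_delete.erase(file);
      }
    }
  }
  return plan;
}
```

In this model, a data file that appears in both a deleted manifest and a retained manifest survives, which is the point of the second pass.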
Key design decisions matching Java parity:
- Best-effort deletion: suppress errors on individual file deletions
to avoid blocking metadata updates (Java suppressFailureWhenFinished)
- Branch/tag awareness: retained snapshot set includes all snapshots
reachable from any ref (branch or tag), preventing false-positive
deletions of files still referenced by non-main branches
- Data file safety: only delete data files from manifests that are
themselves being deleted, then subtract any files still reachable
from retained manifests (two-pass approach from ReachableFileCleanup)
- Respect CleanupLevel: kNone skips all, kMetadataOnly skips data
files, kAll cleans everything
- FileIO abstraction: uses FileIO::DeleteFile for filesystem
compatibility (S3, HDFS, local), with custom DeleteWith() override
- Statistics cleanup via snapshot ID membership in retained set
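As an illustration of the best-effort deletion and CleanupLevel decisions above, here is a minimal sketch. It assumes a delete callback that returns `false` on failure in place of the real `FileIO::DeleteFile` result; `CleanupCounts`, `DeleteBestEffort`, and `RunCleanup` are hypothetical names introduced for the example, and the `CleanupLevel` enum is redeclared locally so the block is self-contained.

```cpp
#include <functional>
#include <string>
#include <vector>

// Redeclared locally for the sketch; mirrors the levels named above.
enum class CleanupLevel { kNone, kMetadataOnly, kAll };

// Hypothetical stand-in for FileIO::DeleteFile or a DeleteWith() override.
// Returns true when the file was deleted.
using DeleteFn = std::function<bool(const std::string&)>;

struct CleanupCounts {
  int metadata_deleted = 0;
  int data_deleted = 0;
};

// Best-effort deletion: a failure on one file is swallowed so that the
// remaining files are still attempted.
int DeleteBestEffort(const std::vector<std::string>& paths,
                     const DeleteFn& delete_fn) {
  int deleted = 0;
  for (const auto& path : paths) {
    if (delete_fn(path)) {
      ++deleted;
    }
  }
  return deleted;
}

CleanupCounts RunCleanup(CleanupLevel level,
                         const std::vector<std::string>& metadata_files,
                         const std::vector<std::string>& data_files,
                         const DeleteFn& delete_fn) {
  CleanupCounts counts;
  if (level == CleanupLevel::kNone) {
    return counts;  // kNone skips all physical deletion.
  }
  counts.metadata_deleted = DeleteBestEffort(metadata_files, delete_fn);
  if (level == CleanupLevel::kAll) {
    // Only kAll touches data files; kMetadataOnly stops above.
    counts.data_deleted = DeleteBestEffort(data_files, delete_fn);
  }
  return counts;
}
```

Passing the delete function as a parameter is one way to model a custom `DeleteWith()` override without tying the sketch to a concrete filesystem.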
TODOs for follow-up:
- Multi-threaded file deletion (Java uses Tasks.foreach with executor)
- IncrementalFileCleanup strategy for linear ancestry optimization
(Java uses this when no branches/cherry-picks involved)
- Fix O(M*S) I/O: Pre-cache ManifestFile objects in manifest_cache_
during Phase 1 (ReadManifestsForSnapshot), eliminating repeated
manifest list reads in FindDataFilesToDelete.
- Fix storage leak: Use LiveEntries() instead of Entries() to match
Java's ManifestFiles.readPaths behavior (only ADDED/EXISTING entries).
- Fix data loss risk: When reading a retained manifest fails, abort
data file deletion entirely instead of silently continuing. Java
retries and throws on failure here.
- Fix statistics file deletion: Use path-based set difference instead
of snapshot_id-only check, preventing erroneous deletion of statistics
files shared across snapshots.
- Remove goto anti-pattern: Extract ManifestFile lookup into
MakeManifestReader() helper and use manifest_cache_ for direct lookup.
- Improve API: FindDataFilesToDelete now returns
Result<unordered_set<string>> instead of using a mutable out-parameter.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
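The path-based set difference for statistics files mentioned in this commit might look like the sketch below. It assumes a simplified `StatisticsFile` record holding only a snapshot ID and a path; that struct and the `ExpiredStatisticsPaths` function are hypothetical and do not reproduce the metadata types in the codebase.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified statistics file record.
struct StatisticsFile {
  int64_t snapshot_id;
  std::string path;
};

// Returns the paths present in the metadata before expiration but absent
// afterwards. Comparing by path, not by snapshot id, keeps a file that is
// shared between an expired and a retained snapshot.
std::set<std::string> ExpiredStatisticsPaths(
    const std::vector<StatisticsFile>& before,
    const std::vector<StatisticsFile>& after) {
  std::set<std::string> retained_paths;
  for (const auto& stats : after) {
    retained_paths.insert(stats.path);
  }
  std::set<std::string> expired_paths;
  for (const auto& stats : before) {
    if (retained_paths.count(stats.path) == 0) {
      expired_paths.insert(stats.path);
    }
  }
  return expired_paths;
}
```

A check keyed only on snapshot ID would flag a shared path as soon as one of its snapshots expired, which is the erroneous deletion this commit message describes.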
Mirror Java's file cleanup class hierarchy for expire snapshots:
- Add abstract FileCleanupStrategy with shared DeleteFile() and
ExpiredStatisticsFilePaths() utilities (path-based set difference)
- Add ReachableFileCleanup concrete class owning manifest_cache_,
ReadManifestsForSnapshot(), and FindDataFilesToDelete()
- Move MakeManifestReader() to a free function in anonymous namespace
using ICEBERG_ASSIGN_OR_RAISE
- Remove cleanup-specific private methods and manifest_cache_ from
ExpireSnapshots class; Finalize() now delegates to the strategy
- Clear apply_result_ after consumption in Finalize()
- Rename DeleteFilePath to DeleteFile; use std::ignore for FileIO return
- Remove manifest_list.h and manifest_reader.h from the header
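The shape of the hierarchy described in this commit can be illustrated with a stripped-down sketch. The class names `FileCleanupStrategy` and `ReachableFileCleanup` come from the commit message, but every signature below is an assumption made for illustration: `CleanFiles`, `DeleteFn`, `ExpireSnapshotsSketch`, `SetApplyResult`, and `HasPendingResult` are hypothetical, and the apply result is reduced to a list of paths.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Abstract base with the shared deletion helper (signatures are assumed).
class FileCleanupStrategy {
 public:
  using DeleteFn = std::function<bool(const std::string&)>;

  explicit FileCleanupStrategy(DeleteFn delete_fn)
      : delete_fn_(std::move(delete_fn)) {}
  virtual ~FileCleanupStrategy() = default;

  virtual void CleanFiles(const std::vector<std::string>& expired_paths) = 0;

 protected:
  // Best-effort: the result of the underlying delete is ignored on purpose.
  void DeleteFile(const std::string& path) { std::ignore = delete_fn_(path); }

 private:
  DeleteFn delete_fn_;
};

// Concrete strategy; the real class would also own the manifest cache.
class ReachableFileCleanup : public FileCleanupStrategy {
 public:
  using FileCleanupStrategy::FileCleanupStrategy;

  void CleanFiles(const std::vector<std::string>& expired_paths) override {
    for (const auto& path : expired_paths) {
      DeleteFile(path);
    }
  }
};

// Stand-in for the owning operation: Finalize() delegates to the strategy
// and clears the consumed result so it cannot be processed twice.
class ExpireSnapshotsSketch {
 public:
  explicit ExpireSnapshotsSketch(std::unique_ptr<FileCleanupStrategy> strategy)
      : strategy_(std::move(strategy)) {}

  void SetApplyResult(std::vector<std::string> paths) {
    apply_result_ = std::move(paths);
  }

  void Finalize() {
    strategy_->CleanFiles(apply_result_);
    apply_result_.clear();
  }

  bool HasPendingResult() const { return !apply_result_.empty(); }

 private:
  std::unique_ptr<FileCleanupStrategy> strategy_;
  std::vector<std::string> apply_result_;
};
```

Keeping the strategy behind a base-class pointer leaves room for the IncrementalFileCleanup follow-up listed in the TODOs without changing the owning class.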
… stats file deletion
P0: ReadManifestsForSnapshot now returns bool. If any retained
snapshot's manifest list cannot be read, phases 2-4 (manifest and data
file deletion) are skipped entirely. An incomplete retained set makes it
unsafe to compute manifests_to_delete, as manifests still referenced by
unreadable snapshots would be wrongly included. This matches Java's
throwFailureWhenFinished behavior in ReachableFileCleanup. Manifest list
deletion (phase 5) is unaffected since it is keyed on expired snapshots
only.
P1: Remove physical statistics and partition-statistics file deletion
(the former phase 6). RemoveStatistics/RemovePartitionStatistics are
still not called in RemoveSnapshots (the TODO in table_metadata.cc), so
the committed metadata still references those files after they would be
deleted on disk. Deletion is deferred until the metadata-level removal
is wired in, at which point the two operations can be kept in sync.
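The P0 guard can be sketched as follows, under the assumption that a manifest-list read is modelled as a callback returning `std::nullopt` on failure. `ManifestListReader` and `ManifestsToDelete` are hypothetical names; the actual change reports the failure through a bool returned by ReadManifestsForSnapshot.

```cpp
#include <cstdint>
#include <functional>
#include <optional>
#include <set>
#include <string>
#include <vector>

// Hypothetical reader: returns the manifest paths of a snapshot, or
// std::nullopt if its manifest list could not be read.
using ManifestListReader =
    std::function<std::optional<std::vector<std::string>>(int64_t)>;

// Returns the manifests that are safe to delete, or std::nullopt when the
// retained set is incomplete and manifest/data file deletion must be
// skipped entirely.
std::optional<std::set<std::string>> ManifestsToDelete(
    const std::set<std::string>& expired_manifests,
    const std::vector<int64_t>& retained_snapshot_ids,
    const ManifestListReader& read_manifest_list) {
  std::set<std::string> retained_manifests;
  for (int64_t snapshot_id : retained_snapshot_ids) {
    auto manifests = read_manifest_list(snapshot_id);
    if (!manifests.has_value()) {
      // An unreadable retained snapshot may still reference any of the
      // expired manifests, so no subtraction result can be trusted.
      return std::nullopt;
    }
    retained_manifests.insert(manifests->begin(), manifests->end());
  }
  std::set<std::string> to_delete;
  for (const auto& manifest : expired_manifests) {
    if (retained_manifests.count(manifest) == 0) {
      to_delete.insert(manifest);
    }
  }
  return to_delete;
}
```

A caller would treat `std::nullopt` as "skip phases 2-4" while still proceeding with manifest list deletion, which depends only on the expired snapshots.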
Force-pushed from 59dfa56 to c9c1878
wgtmac approved these changes on Apr 24, 2026
Member
wgtmac
left a comment
Thanks @shangxinli for adding this!
Summary
ReachableFileCleanup