
WIP: V4 Adaptive Metadata Tree Prototype #16150

Draft
anoopj wants to merge 19 commits into apache:main from anoopj:v4-amt

Conversation

@anoopj
Contributor

@anoopj anoopj commented Apr 28, 2026

WIP PR for s.apache.org/iceberg-single-file-commit

Works end to end from Spark, including scan planning and fast appends.

Not implemented in this PR:

  1. Merge DML
  2. Writing DVs and metadata DVs (reading metadata DVs is supported)
  3. Optimizing tree shape: currently it always writes a leaf and root
  4. Inheritance on reads

@anoopj anoopj marked this pull request as draft April 28, 2026 23:41
anoopj and others added 17 commits April 28, 2026 16:44
…leteFile APIs

This adapter minimizes the v4-related code changes needed during
scan planning and commits.
…et by Default

Extends the V4 manifest writer so it can write manifests in either Parquet
or Avro based on the file extension. A default is also added so that the SDK
writes Parquet manifests when the version is 4. This could be parameterized
later, but that would require parameterizing the test suites, so I decided
on a single format (Parquet) for now.
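Choosing the format from the file extension, as this commit describes, amounts to a small dispatch. A minimal sketch under that assumption (the names `FormatDemo` and `fromExtension` are illustrative, not the actual Iceberg API):

```java
// Hypothetical sketch: pick a manifest format from the file extension.
public class FormatDemo {
  enum FileFormat { PARQUET, AVRO }

  static FileFormat fromExtension(String path) {
    int dot = path.lastIndexOf('.');
    String ext = dot < 0 ? "" : path.substring(dot + 1).toLowerCase();
    switch (ext) {
      case "parquet": return FileFormat.PARQUET;
      case "avro":    return FileFormat.AVRO;
      default:
        throw new IllegalArgumentException("Unknown manifest format: " + path);
    }
  }
}
```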

There are a few other required changes here outside of testing:

1. Handling of splitOffsets in Parquet needs to change, since
BaseFile returns an immutable view which Parquet was attempting to
reuse by clearing it.

2. Unpartitioned tables need special care since Parquet cannot store
empty structs in the schema. Reading from Parquet manifests therefore
means skipping the Parquet field and then shifting read offsets when the
partition is not defined. The read code is shared between all versions
at this time, so this change affects the older Avro readers as well.

3. Some of the test code for TestReplacePartitions assumed that you
could validate against a slightly different version of the table. This is
a problem if the table you make is partitioned and the validation table
is unpartitioned. It used to work accidentally, I think, because we would
commit unpartitioned operations to a partitioned table.
- ManifestReader: Mark partition field optional for unpartitioned tables
  instead of removing it from the projection, preserving positional
  access and avoiding ClassCastException from shifted ordinals
- BaseFile: Deep copy ByteBuffer values in copyByteBufferMap to prevent
  Parquet container reuse from corrupting bounds in copied files, which
  caused equality deletes to fail stats-based overlap checks
- BaseFile: Guard against null partition value in internalSet
- TestRewriteTablePathsAction: Simplify manifest file predicate to use
  name patterns instead of file extensions
- Collapse broken builder chain in ManifestReader.open() into a single
  fluent expression
- Extract manifest format determination in SnapshotProducer into a
  private field computed once in the constructor
- Replace magic format version 4 with
  TableMetadata.MIN_FORMAT_VERSION_PARQUET_MANIFESTS in tests
- Parameterize TestManifestFileUtil across all format versions
- Fix TestJdbcCatalog.manifestFiles to use exclusion filter instead of
  allowlisting file extensions
- Improve ParquetValueReaders container reuse comments to reference
  specific BaseFile fields
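The ByteBuffer deep-copy fix mentioned above (preventing Parquet container reuse from corrupting bounds in copied files) could look roughly like this. A minimal sketch only; `ByteBufferCopyDemo.deepCopy` is a hypothetical helper, not the actual `BaseFile.copyByteBufferMap` code:

```java
import java.nio.ByteBuffer;
import java.util.LinkedHashMap;
import java.util.Map;

public class ByteBufferCopyDemo {
  // Deep-copy every ByteBuffer value so later mutation of the originals
  // (e.g. buffer reuse by a reader) cannot corrupt the copied map.
  static Map<Integer, ByteBuffer> deepCopy(Map<Integer, ByteBuffer> src) {
    Map<Integer, ByteBuffer> out = new LinkedHashMap<>();
    for (Map.Entry<Integer, ByteBuffer> e : src.entrySet()) {
      ByteBuffer buf = e.getValue();
      ByteBuffer copy = ByteBuffer.allocate(buf.remaining());
      copy.put(buf.duplicate()); // duplicate() leaves src position untouched
      copy.flip();               // make the copy readable from position 0
      out.put(e.getKey(), copy);
    }
    return out;
  }
}
```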
Replace instanceof-then-cast with Java 16+ pattern matching to
eliminate redundant casts in outputFile() and keyMetadataBuffer().
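The refactor above is the standard Java 16+ pattern-matching idiom. A minimal illustration with a hypothetical `describe` helper (not the actual `outputFile()` / `keyMetadataBuffer()` code):

```java
// Sketch of replacing instanceof-then-cast with Java 16+ pattern matching.
public class PatternMatchDemo {
  static String describe(Object obj) {
    // Before Java 16:
    //   if (obj instanceof String) { String s = (String) obj; ... }
    // Pattern matching binds the cast result in one step:
    if (obj instanceof String s) {
      return "string of length " + s.length();
    } else if (obj instanceof Integer i) {
      return "int with value " + i;
    }
    return "unknown";
  }
}
```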
…test names

- ParquetValueReaders: skip reusing a container as a scratch buffer only
  when it is a Guava ImmutableList / ImmutableMap
- BaseFile: factor ByteBuffer map deep copy into deepCopyByteBufferMap
- V4Metadata: build file schema fields with ImmutableList.builderWithExpectedSize
- TestSnapshotProducer: rename Avro manifest compression tests for clarity
…ch reuse

Reuse ArrayList/LinkedHashMap-style buffers only via instanceof; avoids
Class.forName and non-API JDK type checks while keeping clear() safe.
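The instanceof-guarded reuse described above can be sketched as follows, assuming reuse is safe only for known-mutable container types (`ReuseDemo.scratchList` is a hypothetical helper, not Iceberg code):

```java
import java.util.ArrayList;
import java.util.List;

public class ReuseDemo {
  // Reuse the caller-supplied list as a scratch buffer only when we know
  // it is a mutable ArrayList; otherwise allocate a fresh one. This avoids
  // calling clear() on immutable containers such as Guava's ImmutableList,
  // without resorting to Class.forName or non-API JDK type checks.
  static <T> List<T> scratchList(List<T> reuse) {
    if (reuse instanceof ArrayList<T> list) {
      list.clear();            // safe: ArrayList supports mutation
      return list;
    }
    return new ArrayList<>();  // immutable or unknown type: do not touch it
  }
}
```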
For v4 tables, SnapshotProducer now writes a Parquet root manifest
containing TrackedFile entries with content_type=DATA_MANIFEST instead
of an Avro manifest list. BaseSnapshot detects Parquet format and reads
root manifests via V4ManifestReader, converting entries back to
ManifestFile objects for compatibility with the existing pipeline.
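A side note on format detection: Parquet files begin (and end) with the 4-byte magic `PAR1`, so one common way to distinguish a Parquet root manifest from an Avro one is to sniff the header. Whether BaseSnapshot detects the format this way or by file extension is not stated here; this is a generic sketch:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class MagicDemo {
  // The Parquet file format's 4-byte magic number.
  private static final byte[] PARQUET_MAGIC =
      "PAR1".getBytes(StandardCharsets.US_ASCII);

  // Returns true if the given header bytes start with the Parquet magic.
  static boolean isParquet(byte[] header) {
    return header.length >= 4
        && Arrays.equals(Arrays.copyOfRange(header, 0, 4), PARQUET_MAGIC);
  }
}
```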
@anoopj anoopj changed the title V4 Adaptive Metadata Tree Prototype WIP: V4 Adaptive Metadata Tree Prototype Apr 29, 2026
