
WIP: V4 Adaptive Metadata Tree Prototype #16150

Draft
anoopj wants to merge 19 commits into apache:main from anoopj:v4-amt

Conversation

@anoopj
Contributor

@anoopj anoopj commented Apr 28, 2026

WIP PR for s.apache.org/iceberg-single-file-commit

Works end to end from Spark, including scan planning and fast appends.

Not implemented in this PR:

  1. Merge DML
  2. Writing DVs and metadata DVs (reading metadata DVs is supported)
  3. Optimizing tree shape: currently it always writes a leaf and root
  4. Inheritance on reads

@anoopj anoopj marked this pull request as draft April 28, 2026 23:41
anoopj and others added 17 commits April 28, 2026 16:44
…leteFile APIs

This adapter minimizes the v4-related code changes needed during
scan planning and commits.
…et by Default

Extends the V4 manifest writer so it can write manifests in either Parquet
or Avro based on the file extension. A default is also added so that the SDK
writes Parquet manifests when the version is 4. This could be parameterized
later, but that would require parameterizing the test suites, so I decided
on a single format (Parquet) for now.
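Choosing the format from the file extension, as this commit describes, amounts to a small dispatch. A minimal sketch under that assumption (the names `FormatDemo` and `fromExtension` are illustrative, not the actual Iceberg API):

```java
// Hypothetical sketch: pick a manifest format from the file extension.
public class FormatDemo {
  enum FileFormat { PARQUET, AVRO }

  static FileFormat fromExtension(String path) {
    int dot = path.lastIndexOf('.');
    String ext = dot < 0 ? "" : path.substring(dot + 1).toLowerCase();
    switch (ext) {
      case "parquet": return FileFormat.PARQUET;
      case "avro":    return FileFormat.AVRO;
      default:
        throw new IllegalArgumentException("Unknown manifest format: " + path);
    }
  }
}
```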

There are a few other required changes here outside of testing:

1. Handling of splitOffsets in Parquet needs to change, since
BaseFile returns an immutable view which Parquet was attempting to
reuse by clearing it.

2. Unpartitioned tables need special care since Parquet cannot store
empty structs in the schema. Reading from Parquet manifests therefore
means skipping the Parquet field and then shifting read offsets when the
partition is not defined. The read code is shared between all versions
at this time, so this change affects the older Avro readers as well.

3. Some of the test code for TestReplacePartitions assumed that you
could validate against a slightly different version of the table. This is
a problem if the table you make is partitioned and the validation table
is unpartitioned. It used to work accidentally, I think, because we would
commit unpartitioned operations to a partitioned table.
- ManifestReader: Mark partition field optional for unpartitioned tables
  instead of removing it from the projection, preserving positional
  access and avoiding ClassCastException from shifted ordinals
- BaseFile: Deep copy ByteBuffer values in copyByteBufferMap to prevent
  Parquet container reuse from corrupting bounds in copied files, which
  caused equality deletes to fail stats-based overlap checks
- BaseFile: Guard against null partition value in internalSet
- TestRewriteTablePathsAction: Simplify manifest file predicate to use
  name patterns instead of file extensions
- Collapse broken builder chain in ManifestReader.open() into a single
  fluent expression
- Extract manifest format determination in SnapshotProducer into a
  private field computed once in the constructor
- Replace magic format version 4 with
  TableMetadata.MIN_FORMAT_VERSION_PARQUET_MANIFESTS in tests
- Parameterize TestManifestFileUtil across all format versions
- Fix TestJdbcCatalog.manifestFiles to use exclusion filter instead of
  allowlisting file extensions
- Improve ParquetValueReaders container reuse comments to reference
  specific BaseFile fields
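The ByteBuffer deep-copy fix mentioned above (preventing Parquet container reuse from corrupting bounds in copied files) could look roughly like this. A minimal sketch only; `ByteBufferCopyDemo.deepCopy` is a hypothetical helper, not the actual `BaseFile.copyByteBufferMap` code:

```java
import java.nio.ByteBuffer;
import java.util.LinkedHashMap;
import java.util.Map;

public class ByteBufferCopyDemo {
  // Deep-copy every ByteBuffer value so later mutation of the originals
  // (e.g. buffer reuse by a reader) cannot corrupt the copied map.
  static Map<Integer, ByteBuffer> deepCopy(Map<Integer, ByteBuffer> src) {
    Map<Integer, ByteBuffer> out = new LinkedHashMap<>();
    for (Map.Entry<Integer, ByteBuffer> e : src.entrySet()) {
      ByteBuffer buf = e.getValue();
      ByteBuffer copy = ByteBuffer.allocate(buf.remaining());
      copy.put(buf.duplicate()); // duplicate() leaves src position untouched
      copy.flip();               // make the copy readable from position 0
      out.put(e.getKey(), copy);
    }
    return out;
  }
}
```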
Replace instanceof-then-cast with Java 16+ pattern matching to
eliminate redundant casts in outputFile() and keyMetadataBuffer().
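The refactor above is the standard Java 16+ pattern-matching idiom. A minimal illustration with a hypothetical `describe` helper (not the actual `outputFile()` / `keyMetadataBuffer()` code):

```java
// Sketch of replacing instanceof-then-cast with Java 16+ pattern matching.
public class PatternMatchDemo {
  static String describe(Object obj) {
    // Before Java 16:
    //   if (obj instanceof String) { String s = (String) obj; ... }
    // Pattern matching binds the cast result in one step:
    if (obj instanceof String s) {
      return "string of length " + s.length();
    } else if (obj instanceof Integer i) {
      return "int with value " + i;
    }
    return "unknown";
  }
}
```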
…test names

- ParquetValueReaders: skip reusing a container as a scratch buffer only
  when it is a Guava ImmutableList / ImmutableMap
- BaseFile: factor ByteBuffer map deep copy into deepCopyByteBufferMap
- V4Metadata: build file schema fields with ImmutableList.builderWithExpectedSize
- TestSnapshotProducer: rename Avro manifest compression tests for clarity
…ch reuse

Reuse ArrayList/LinkedHashMap-style buffers only via instanceof; avoids
Class.forName and non-API JDK type checks while keeping clear() safe.
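The instanceof-guarded reuse described above can be sketched as follows, assuming reuse is safe only for known-mutable container types (`ReuseDemo.scratchList` is a hypothetical helper, not Iceberg code):

```java
import java.util.ArrayList;
import java.util.List;

public class ReuseDemo {
  // Reuse the caller-supplied list as a scratch buffer only when we know
  // it is a mutable ArrayList; otherwise allocate a fresh one. This avoids
  // calling clear() on immutable containers such as Guava's ImmutableList,
  // without resorting to Class.forName or non-API JDK type checks.
  static <T> List<T> scratchList(List<T> reuse) {
    if (reuse instanceof ArrayList<T> list) {
      list.clear();            // safe: ArrayList supports mutation
      return list;
    }
    return new ArrayList<>();  // immutable or unknown type: do not touch it
  }
}
```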
For v4 tables, SnapshotProducer now writes a Parquet root manifest
containing TrackedFile entries with content_type=DATA_MANIFEST instead
of an Avro manifest list. BaseSnapshot detects Parquet format and reads
root manifests via V4ManifestReader, converting entries back to
ManifestFile objects for compatibility with the existing pipeline.
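A side note on format detection: Parquet files begin (and end) with the 4-byte magic `PAR1`, so one common way to distinguish a Parquet root manifest from an Avro one is to sniff the header. Whether BaseSnapshot detects the format this way or by file extension is not stated here; this is a generic sketch:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class MagicDemo {
  // The Parquet file format's 4-byte magic number.
  private static final byte[] PARQUET_MAGIC =
      "PAR1".getBytes(StandardCharsets.US_ASCII);

  // Returns true if the given header bytes start with the Parquet magic.
  static boolean isParquet(byte[] header) {
    return header.length >= 4
        && Arrays.equals(Arrays.copyOfRange(header, 0, 4), PARQUET_MAGIC);
  }
}
```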
@anoopj anoopj changed the title V4 Adaptive Metadata Tree Prototype WIP: V4 Adaptive Metadata Tree Prototype Apr 29, 2026
