Enable 128bit atomics #803

Open
sleeepyjack wants to merge 9 commits into NVIDIA:dev from sleeepyjack:worktree-128bit-atomics

@sleeepyjack
Collaborator

No description provided.

@sleeepyjack self-assigned this Apr 17, 2026
@sleeepyjack added the P1: Should have, In Progress, and type: improvement labels Apr 17, 2026
Comment thread include/cuco/detail/open_addressing/open_addressing_impl.cuh Outdated
Comment thread include/cuco/detail/open_addressing/open_addressing_ref_impl.cuh Outdated
Comment thread include/cuco/detail/open_addressing/open_addressing_ref_impl.cuh Outdated
Comment thread include/cuco/detail/pair/pair.inl
Comment thread include/cuco/detail/static_map/static_map.inl
Comment thread tests/static_map/insert_or_apply_test.cu
Comment thread tests/static_map/insert_or_assign_test.cu
Comment thread tests/static_map/retrieve_test.cu
Comment thread tests/static_multimap/insert_contains_test.cu
Comment thread tests/static_multimap/insert_if_test.cu
Comment thread benchmarks/benchmark_defaults.hpp
@sleeepyjack force-pushed the worktree-128bit-atomics branch from 1cbf8bb to 4a6ccc2 April 17, 2026 23:53
@sleeepyjack added the topic: performance label Apr 17, 2026
@sleeepyjack force-pushed the worktree-128bit-atomics branch from 4a6ccc2 to 6a2a1d2 April 21, 2026 23:38
@sleeepyjack marked this pull request as ready for review April 22, 2026 01:35
@PointKernel added the Needs Review label and removed the In Progress label Apr 22, 2026
Comment thread include/cuco/detail/open_addressing/open_addressing_impl.cuh
// wait to ensure that the write to the value part also took place
this->wait_for_payload(slot_ptr->second, this->empty_value_sentinel());
}
this->maybe_wait_for_payload(slot_ptr);
Member

This makes the code much cleaner, but readability is getting worse IMO. A weak approval :)
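
For readers without the diff open: the hunk above swaps an unconditional `wait_for_payload` call for a single `maybe_wait_for_payload(slot_ptr)` call. A minimal host-side C++ sketch of what such a wrapper could look like is given here. It is not the cuco implementation: the `Slot` type, the `SingleAtomicWrite` flag, and both function bodies are hypothetical, and the sketch assumes the wait can be skipped when key and payload are published by one atomic write (for example a 128-bit CAS).

```cpp
#include <atomic>
#include <cassert>

// Hypothetical slot layout: key and payload live in separate atomics.
struct Slot {
  std::atomic<int> first;   // key
  std::atomic<int> second;  // payload
};

// Hypothetical sentinel marking a payload that has not been written yet.
constexpr int empty_value_sentinel = -1;

// Spin until the payload no longer equals the sentinel, then return it.
inline int wait_for_payload(std::atomic<int> const& payload, int sentinel)
{
  int v = payload.load(std::memory_order_acquire);
  while (v == sentinel) {
    v = payload.load(std::memory_order_acquire);
  }
  return v;
}

// Assumption: when the whole slot is written by a single atomic operation,
// a reader that sees the key also sees the payload, so no wait is needed.
template <bool SingleAtomicWrite>
inline void maybe_wait_for_payload(Slot const* slot)
{
  if (!SingleAtomicWrite) { wait_for_payload(slot->second, empty_value_sentinel); }
}

// Usage: the payload is already published, so the call returns immediately.
inline int example()
{
  Slot s;
  s.first.store(7);
  s.second.store(42);
  maybe_wait_for_payload<false>(&s);
  assert(s.second.load() == 42);
  return s.second.load();
}
```

The trade-off raised in the comment is visible here: the call site shrinks to one line, but the condition that decides whether a wait happens moves out of sight into the helper.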

Member

@PointKernel left a comment

128-bit now in place 🔥

@sleeepyjack can you please update the PR description to reflect the changes brought by this PR? Otherwise looks great to me.

Labels

Needs Review, P1: Should have, topic: performance, type: improvement
