MFDNN-14690: Replace XE3P_35_10/11/UNKNOWN Core enum values with Xe3p #4981
Open
Collapse ngen::Core::XE3P_35_10, XE3P_35_11, and XE3P_UNKNOWN into a single ngen::Core::Xe3p value. Use ngen::ProductFamily to distinguish hardware-specific features where needed (512 GRFs, F4 DPAS support, SLM capacity).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
umar456
reviewed
Apr 8, 2026
@@ -1300,7 +1293,7 @@ class GRF : public Register
     static constexpr int maxRegs() { return 512; }
     static constexpr int maxRegs(HW hw) {
         return (hw < HW::XeHP) ? 128
-             : (hw == HW::XE3P_35_11) ? 512
+             : (hw >= HW::Xe3p) ? 512
rjoursler
reviewed
Apr 9, 2026
    static int max_slm_size(gpu_arch_t gpu_arch, gpu_product_t product);
    static int max_slm_size_per_tg(gpu_arch_t gpu_arch, gpu_product_t product);
    static int max_slm_size_per_tg(gpu_arch_t gpu_arch, int tg_size,
            bool large_grf_mode, gpu_product_t product);
Contributor
We should be able to remove gpu_arch as an input, since gpu_product() alone contains all the necessary information.
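To illustrate the suggestion, here is a minimal sketch of a lookup that takes only the product. Every name in it (`Arch`, `Family`, `Product`, `arch_of`, `max_slm_size_kib`) is a hypothetical stand-in rather than the oneDNN or nGEN API, and the returned capacities are placeholders, not real hardware limits. It assumes the architecture can be derived from the product family, which is the premise of the comment above.

```cpp
// Hypothetical stand-ins; these are NOT the real oneDNN/nGEN types.
enum class Arch { ArchOld, ArchNew };
enum class Family { FamilyA, FamilyB, FamilyC };

struct Product {
    Family family;
};

// Assumption: each product family belongs to exactly one architecture,
// so the architecture never needs to be passed alongside the product.
constexpr Arch arch_of(Family f) {
    return f == Family::FamilyA ? Arch::ArchOld : Arch::ArchNew;
}

// Placeholder capacities (1, 2, 3), chosen only to make the branches visible.
// FamilyC gets a product-specific value; other families fall back to a
// per-architecture default.
constexpr int max_slm_size_kib(Product p) {
    return p.family == Family::FamilyC
            ? 3
            : (arch_of(p.family) == Arch::ArchNew ? 2 : 1);
}
```

With a single parameter, a caller cannot pass an architecture that disagrees with the product, which is the main benefit of dropping the redundant input.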
 }

-static inline size_t maxSLMPerWG(ngen::HW hw, int grfCount)
+static inline size_t maxSLMPerWG(ngen::HW hw, int grfCount, ngen::ProductFamily pf)
Contributor
Remove the ngen::HW input to this function, as it is redundant.
inline ngen::ProductFamily get_ngen_product_family(
        const compute::device_info_t *device_info) {
    return reinterpret_cast<const ngen::Product &>(device_info->gpu_product())
            .family;
Contributor
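This needs to use memcpy to avoid undefined behavior: reading the value through a reinterpret_cast to an unrelated type violates strict aliasing.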
Suggested change
-            .family;
+    return get_ngen_product(device_info).family;
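For context, a helper in the spirit of `get_ngen_product` could copy the bytes rather than alias them. The sketch below is only an illustration: `ProductFamily`, `Product`, `gpu_product_t`, `to_product`, and `to_opaque` are hypothetical stand-ins with an assumed layout, not the actual oneDNN or nGEN definitions.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical stand-ins; the real nGEN/oneDNN types may differ.
enum class ProductFamily : int { Unknown = 0, FamilyA = 1, FamilyB = 2 };

struct Product {
    ProductFamily family;
    int stepping;
};

// Opaque storage as a device-info object might hold it (assumed layout).
struct gpu_product_t {
    std::uint64_t data;
};

static_assert(sizeof(Product) <= sizeof(gpu_product_t),
        "opaque storage must be large enough to hold a Product");
static_assert(std::is_trivially_copyable<Product>::value,
        "memcpy is only valid for trivially copyable types");

// Copy the bytes into a real Product object instead of reading the
// opaque storage through a reference to an unrelated type.
inline Product to_product(const gpu_product_t &opaque) {
    Product p;
    std::memcpy(&p, &opaque, sizeof(p));
    return p;
}

// Inverse conversion, zero-initializing any storage bytes not covered.
inline gpu_product_t to_opaque(const Product &p) {
    gpu_product_t opaque = {};
    std::memcpy(&opaque, &p, sizeof(p));
    return opaque;
}
```

Compilers typically optimize a fixed-size `memcpy` like this into a plain register move, so the well-defined version usually costs nothing at run time.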
}

inline ngen::ProductFamily get_ngen_product_family(
        const compute::device_info_t *device_info) {
Contributor
Take a reference as this function does not support nullptr as input.
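A minimal sketch of the reference-taking form follows. The types here (`ProductFamily`, `device_info_t`) and the function name `get_product_family` are hypothetical stand-ins, not the real oneDNN declarations.

```cpp
// Hypothetical stand-ins; NOT the real compute::device_info_t.
enum class ProductFamily { Unknown, FamilyA, FamilyB };

struct device_info_t {
    ProductFamily family_ = ProductFamily::Unknown;
    ProductFamily family() const { return family_; }
};

// Taking a const reference documents in the signature that a null
// argument is not accepted; a caller holding a pointer must dereference
// it (and so own the null check) before calling.
inline ProductFamily get_product_family(const device_info_t &device_info) {
    return device_info.family();
}
```

A caller that holds a `const device_info_t *ptr` would write `get_product_family(*ptr)`, which keeps the null check at the call site where the pointer's validity is known.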
    MatrixAddressing sroundSeed;
    PostOpsProblem postOps; // Fused post operations to apply
    ngen::HW hw;
    ngen::ProductFamily family;
Contributor
Let's just store the full ngen::Product here, rather than just the family.
addresses MFDNN-14960