MFDNN-14690: Replace XE3P_35_10/11/UNKNOWN Core enum values with Xe3p #4981

Open
dyoussif wants to merge 1 commit into main from dyoussif/hw_rebase
Conversation

Contributor

@dyoussif dyoussif commented Apr 8, 2026

Collapse ngen::Core::XE3P_35_10, XE3P_35_11, and XE3P_UNKNOWN into a single ngen::Core::Xe3p value. Use ngen::ProductFamily to distinguish hardware-specific features where needed (512 GRFs, F4 DPAS support, SLM capacity).
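
As a rough illustration of the approach described above, the sketch below keys a hardware-specific limit on a product family once the core enum has a single Xe3p value. The enum names, the family members, and the 256 fallback are hypothetical placeholders, not the real ngen::Core or ngen::ProductFamily definitions; only the 512 GRF figure comes from the description.

```cpp
// Hypothetical enums for illustration only; the real ngen::Core and
// ngen::ProductFamily have different members and values.
enum class core_t { older, xe3p };
enum class family_t { unknown, variant_a, variant_b };

// Assumption: within the single xe3p core, only variant_b exposes 512 GRFs.
// The 256 fallback is a placeholder, not a documented hardware limit.
constexpr int max_grfs(core_t core, family_t family) {
    return (core == core_t::xe3p && family == family_t::variant_b) ? 512
                                                                   : 256;
}
```

The point of the design is that the core value answers "which ISA generation" while the family answers "which part", so feature checks that differ between parts of the same generation take the family as input.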

addresses MFDNN-14960

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@dyoussif dyoussif requested a review from a team as a code owner April 8, 2026 23:32
@github-actions github-actions bot added platform:gpu-intel Codeowner: @oneapi-src/onednn-gpu-intel third_party labels Apr 8, 2026
@@ -1300,7 +1293,7 @@ class GRF : public Register
     static constexpr int maxRegs() { return 512; }
     static constexpr int maxRegs(HW hw) {
         return (hw < HW::XeHP) ? 128
-             : (hw == HW::XE3P_35_11) ? 512
+             : (hw >= HW::Xe3p) ? 512

Potential issue here.

static int max_slm_size(gpu_arch_t gpu_arch, gpu_product_t product);
static int max_slm_size_per_tg(gpu_arch_t gpu_arch, gpu_product_t product);
static int max_slm_size_per_tg(gpu_arch_t gpu_arch, int tg_size,
bool large_grf_mode, gpu_product_t product);

We should be able to remove gpu_arch as an input, since gpu_product() alone contains all the necessary information.
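
One way to read this suggestion, sketched under assumptions: if the product uniquely determines the architecture, the query can take the product alone and derive the architecture internally. All types, the family-to-architecture mapping, and the sizes below are hypothetical placeholders, not oneDNN's actual definitions or real hardware limits.

```cpp
// Hypothetical stand-ins; not the real gpu_arch_t / gpu_product_t.
enum class arch_t { older, xe3p };
enum class family_t { unknown, variant_a, variant_b };

struct product_t {
    family_t family;
};

// Assumption: each family belongs to exactly one architecture, so the
// architecture never needs to be passed alongside the product.
constexpr arch_t arch_of(product_t p) {
    return p.family == family_t::unknown ? arch_t::older : arch_t::xe3p;
}

// Product-only query. Sizes are placeholder values in bytes.
constexpr int max_slm_size(product_t p) {
    return arch_of(p) == arch_t::xe3p
            ? (p.family == family_t::variant_b ? 256 * 1024 : 128 * 1024)
            : 64 * 1024;
}
```

With this shape, a caller cannot pass an architecture that disagrees with the product, which is the main practical benefit of dropping the redundant parameter.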

}

-static inline size_t maxSLMPerWG(ngen::HW hw, int grfCount)
+static inline size_t maxSLMPerWG(ngen::HW hw, int grfCount, ngen::ProductFamily pf)

Remove the ngen::HW input to this function, as it is redundant.

inline ngen::ProductFamily get_ngen_product_family(
const compute::device_info_t *device_info) {
return reinterpret_cast<const ngen::Product &>(device_info->gpu_product())
.family;

This needs to use a memcpy to avoid UB.

Suggested change
.family;
return get_ngen_product(device_info).family
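
For context on the memcpy remark: reading an object through a reinterpret_cast reference to an unrelated type violates the strict aliasing rule in C++, whereas copying the bytes into a properly typed object with std::memcpy is well defined for trivially copyable types. A minimal sketch follows; `opaque_product_t`, `product_t`, and the helper names are hypothetical stand-ins, not the real oneDNN or nGEN types.

```cpp
#include <cstring>
#include <type_traits>

// Hypothetical typed view of a product description.
enum class family_t : int { unknown = 0, family_a = 1, family_b = 2 };

struct product_t {
    family_t family;
    int stepping;
};

// Hypothetical opaque storage, as a device-info layer might keep it.
struct opaque_product_t {
    unsigned char data[sizeof(product_t)];
};

static_assert(std::is_trivially_copyable<product_t>::value,
        "memcpy-based conversion requires a trivially copyable type");
static_assert(sizeof(product_t) == sizeof(opaque_product_t),
        "opaque storage must match the typed layout");

// Well-defined conversion: copy the bytes into a real product_t object
// instead of aliasing the blob through a reinterpret_cast reference.
inline product_t to_product(const opaque_product_t &blob) {
    product_t p;
    std::memcpy(&p, &blob, sizeof(p));
    return p;
}

// Inverse helper, used here only to build an opaque blob for demonstration.
inline opaque_product_t to_opaque(const product_t &p) {
    opaque_product_t blob;
    std::memcpy(&blob, &p, sizeof(blob));
    return blob;
}

// Family accessor built on the safe conversion.
inline family_t to_family(const opaque_product_t &blob) {
    return to_product(blob).family;
}
```

Compilers typically optimize a fixed-size memcpy like this into plain register moves, so the safe form should cost nothing at run time.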

}

inline ngen::ProductFamily get_ngen_product_family(
const compute::device_info_t *device_info) {

Take a reference as this function does not support nullptr as input.
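
A small sketch of the difference, using a hypothetical `device_info_t` rather than the real compute::device_info_t: taking a const reference documents in the signature that a valid object is required, so callers holding a pointer must make the null decision themselves.

```cpp
// Hypothetical stand-in for a device-info type.
struct device_info_t {
    int family_id;
};

// Reference parameter: the callee needs no null check, because a caller
// can only get here by dereferencing a pointer it is responsible for.
constexpr int family_of(const device_info_t &info) {
    return info.family_id;
}

// Pointer-holding callers handle the null case explicitly at the call site.
constexpr int family_or_default(const device_info_t *info, int fallback) {
    return info ? family_of(*info) : fallback;
}
```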

MatrixAddressing sroundSeed;
PostOpsProblem postOps; // Fused post operations to apply
ngen::HW hw;
ngen::ProductFamily family;

Let's just store the full ngen::Product here, rather than just the family.

3 participants