
gh-145665: Configurable tos cache#145830

Draft
aisk wants to merge 21 commits into python:main from aisk:configurable-tos-cache

@aisk
Member

aisk commented Mar 11, 2026

@markshannon
Member

Thanks for giving this a go. I've had a quick look.

Rather than take up memory for tables and stencils for all permutations up to MAX_GENERATED_CACHED_REGISTER, we only want them up to MAX_CACHED_REGISTER.

We still want to generate the code for all possible cases, but have it turned off by the preprocessor.
For example, in executor_cases.c.h you have:

        case _NOP_r55: {
            #if MAX_CACHED_REGISTER >= 5

but it will need to be:

    #if MAX_CACHED_REGISTER >= 5
        case _NOP_r55: {
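
The difference between the two placements can be sketched in a standalone file. With the guard outside the label, a build with fewer cached registers never mentions the name at all, so the matching id can be left undefined. The `DEMO_` ids, their values, and `demo_dispatch` are made up for this sketch and are not the generated code:

```c
#include <assert.h>

/* Hypothetical build setting for this sketch only. */
#ifndef MAX_CACHED_REGISTER
#define MAX_CACHED_REGISTER 3
#endif

static_assert(MAX_CACHED_REGISTER >= 3,
              "this sketch leaves the r33 variant unguarded");

/* Hypothetical, simplified uop ids (not the real pycore_uop_ids.h values). */
enum {
    DEMO_NOP_r00 = 1,
    DEMO_NOP_r33 = 2,
#if MAX_CACHED_REGISTER >= 5
    DEMO_NOP_r55 = 3,
#endif
};

/* Returns the cache depth handled by the matching case,
 * or -1 when no case was compiled in for the given id. */
static int demo_dispatch(int uop_id)
{
    switch (uop_id) {
        case DEMO_NOP_r00:
            return 0;
        case DEMO_NOP_r33:
            return 3;
#if MAX_CACHED_REGISTER >= 5
        /* The guard wraps the label itself, so a smaller build never
         * references DEMO_NOP_r55 and emits no code for it. */
        case DEMO_NOP_r55:
            return 5;
#endif
        default:
            return -1;
    }
}
```

Had the `#if` been placed inside the braces of the case, the label would still be compiled, and the build would fail as soon as the id itself is guarded out.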

Likewise in pycore_uop_ids.h, the names will need to be guarded:

#if MAX_CACHED_REGISTER >= 5
    #define _NOP_r55
#endif

To avoid gaps, the ids will need to be sorted by the largest cached register used.
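
A minimal sketch of why that ordering avoids gaps: if ids are assigned in order of the largest cached register each variant uses, everything that gets compiled out sits at the tail, and the enabled ids stay contiguous. All `DEMO_` names and numbers below are invented for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical build setting for this sketch only. */
#ifndef MAX_CACHED_REGISTER
#define MAX_CACHED_REGISTER 4
#endif

/* Ids assigned in order of the largest cached register used.
 * Id 0 is left unused in this sketch. */
#define DEMO_NOP_r00 1
#define DEMO_NOP_r11 2
#define DEMO_NOP_r22 3
#define DEMO_NOP_r33 4
#if MAX_CACHED_REGISTER >= 4
#define DEMO_NOP_r44 5
#define DEMO_POP_r34 6
#endif
#if MAX_CACHED_REGISTER >= 5
#define DEMO_NOP_r55 7
#endif

/* Highest id that exists in this build. */
#if MAX_CACHED_REGISTER >= 5
#define DEMO_MAX_UOP_ID 7
#elif MAX_CACHED_REGISTER >= 4
#define DEMO_MAX_UOP_ID 6
#else
#define DEMO_MAX_UOP_ID 4
#endif

static_assert(DEMO_MAX_UOP_ID >= DEMO_NOP_r33,
              "the unguarded ids must always fit in the table");

/* Table sized to the enabled ids only: no slots for disabled variants. */
static const char *const demo_uop_names[DEMO_MAX_UOP_ID + 1] = {
    [DEMO_NOP_r00] = "DEMO_NOP_r00",
    [DEMO_NOP_r11] = "DEMO_NOP_r11",
    [DEMO_NOP_r22] = "DEMO_NOP_r22",
    [DEMO_NOP_r33] = "DEMO_NOP_r33",
#if MAX_CACHED_REGISTER >= 4
    [DEMO_NOP_r44] = "DEMO_NOP_r44",
    [DEMO_POP_r34] = "DEMO_POP_r34",
#endif
#if MAX_CACHED_REGISTER >= 5
    [DEMO_NOP_r55] = "DEMO_NOP_r55",
#endif
};

/* Counts unnamed slots in 1..DEMO_MAX_UOP_ID; 0 means no gaps. */
static int demo_count_gaps(void)
{
    int gaps = 0;
    for (int id = 1; id <= DEMO_MAX_UOP_ID; id++) {
        if (demo_uop_names[id] == NULL) {
            gaps++;
        }
    }
    return gaps;
}
```

If the ids were instead sorted by uop name, a guarded-out `r55` variant could land in the middle of the range and leave an empty slot in every table indexed by id.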

@aisk
Member Author

aisk commented Mar 16, 2026

Sorry, I don't have free time in the next few weeks, so I will close this temporarily in case someone else wants to work on it.

I will reopen and continue the work when I have time, if no one else has picked it up.

@aisk aisk closed this Mar 16, 2026
@aisk aisk reopened this Apr 9, 2026
@aisk aisk added the skip news label Apr 9, 2026
@aisk
Member Author

aisk commented Apr 10, 2026

Hi @markshannon, get_uop_cache_depths generates different rXY combinations for different values of MAX_CACHED_REGISTER. So I changed the generator to emit a completely separate set of cases / metadata for each MAX_CACHED_REGISTER value. Each variant is guarded by a C macro, and the same macro selects the variant at build time.

But I have a few questions:

  1. Currently, only 3–5 variants are generated, but what's the recommended range for this?
  2. What's the right way to configure MAX_CACHED_REGISTER? Currently, it's hardcoded in pycore_uop.h and _targets.py.
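
For context on question 2, one conventional C pattern is an overridable default plus a range check. This is only a sketch under assumptions, not what the PR does: it reads "3–5" as MAX_CACHED_REGISTER values 3 through 5, the `DEMO_` names are hypothetical, and it covers only the C side (the value in _targets.py would still have to be kept in sync separately).

```c
#include <assert.h>

/* Overridable default: a build could pass -DMAX_CACHED_REGISTER=4
 * on the compiler command line instead of editing the header. */
#ifndef MAX_CACHED_REGISTER
#define MAX_CACHED_REGISTER 3
#endif

/* Fail early when no variant was generated for the requested value. */
#if MAX_CACHED_REGISTER < 3 || MAX_CACHED_REGISTER > 5
#  error "no generated variant for this MAX_CACHED_REGISTER (sketch supports 3..5)"
#endif

/* In a real build each branch would pull in the matching generated
 * variant; a plain constant stands in for it here. */
#if MAX_CACHED_REGISTER == 3
#  define DEMO_VARIANT_DEPTH 3
#elif MAX_CACHED_REGISTER == 4
#  define DEMO_VARIANT_DEPTH 4
#else
#  define DEMO_VARIANT_DEPTH 5
#endif

static_assert(DEMO_VARIANT_DEPTH == MAX_CACHED_REGISTER,
              "selected variant must match the configured depth");

/* Reports which variant this translation unit was built with. */
static int demo_selected_variant(void)
{
    return DEMO_VARIANT_DEPTH;
}
```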
