[3.14]: gh-148284: Move stackref buffer to per-eval loop, and don't override PGO data.#148310
Fidget-Spinner wants to merge 7 commits into python:3.14
See analysis here #138115 (comment). This reduces the interpreter loop's C stack consumption by roughly 1 KB on recent versions of Clang.
colesbury
left a comment
I don't entirely understand the addition of the Py_EnterRecursiveCall() in _testcapi.pyobject_vectorcall. Is this needed to prevent a test from crashing?
In practice, we don't do Py_EnterRecursiveCall() checks before most PyObject_Vectorcall() calls, and I wouldn't expect extensions to do so either. It seems more common for the called function to do the check (e.g., in possibly recursive reprs or when entering the interpreter).
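To illustrate the pattern of the callee doing the check, here is a minimal, self-contained sketch. It does not use the real C API: `sketch_enter_recursive_call()` and `sketch_leave_recursive_call()` are hypothetical stand-ins for Py_EnterRecursiveCall() and Py_LeaveRecursiveCall(), the limit of 100 is arbitrary, and the real functions also take a message argument and set an exception on failure.

```c
#include <stddef.h>

/* Hypothetical stand-ins for Py_EnterRecursiveCall() and
   Py_LeaveRecursiveCall(). The limit is an arbitrary assumption. */
#define SKETCH_RECURSION_LIMIT 100
static int sketch_depth = 0;

static int sketch_enter_recursive_call(void)
{
    if (sketch_depth >= SKETCH_RECURSION_LIMIT) {
        return -1;  /* too deep: report an error instead of recursing */
    }
    sketch_depth++;
    return 0;
}

static void sketch_leave_recursive_call(void)
{
    sketch_depth--;
}

/* A possibly recursive callee, similar in spirit to a repr of a nested
   container. The guard lives inside the callee, so callers do not need
   their own check before invoking it. */
typedef struct sketch_node {
    struct sketch_node *child;
} sketch_node;

static int sketch_nesting_depth(const sketch_node *node)
{
    if (node == NULL) {
        return 0;
    }
    if (sketch_enter_recursive_call() < 0) {
        return -1;
    }
    int inner = sketch_nesting_depth(node->child);
    sketch_leave_recursive_call();  /* always paired with a successful enter */
    return inner < 0 ? -1 : inner + 1;
}
```

With this shape, a caller just invokes `sketch_nesting_depth()` directly. A chain of three nodes yields 3, while a node whose `child` points back to itself yields -1 once the limit is reached, with the depth counter restored to 0 afterwards.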
You're right. I removed it.
For those wondering why I removed _Py_HOT_FUNCTION from the eval loop: I'm almost 100% certain it's a bug to include it there.

On GCC, PGO overrides it: https://gcc.gnu.org/onlinedocs/gcc/Common-Attributes.html#index-cold
On Clang, it overrides PGO: https://clang.llvm.org/docs/AttributeReference.html#hot

The only time it matters is when there's no PGO, in which case I'm not sure why people would be using that CPython build for performance. @vstinner added it back in 2016 and said, in effect, that PGO should be preferred: https://bugs.python.org/issue28618

Back in 2016 PGO was newer. It's now a decade later, and practically all perf-critical Python builds use PGO. We should remove it to unbreak modern compilers.
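For readers unfamiliar with the attribute, here is a small sketch of what such an annotation looks like. `SKETCH_HOT_FUNCTION` is a hypothetical macro written for illustration and is not CPython's actual definition of _Py_HOT_FUNCTION; the function names are made up as well.

```c
/* Hypothetical macro in the spirit of _Py_HOT_FUNCTION: expands to the
   GCC/Clang "hot" attribute where available, and to nothing elsewhere. */
#if defined(__GNUC__) || defined(__clang__)
#  define SKETCH_HOT_FUNCTION __attribute__((hot))
#else
#  define SKETCH_HOT_FUNCTION
#endif

/* Annotated version: the compiler is told up front that this function is
   hot. Per the docs linked above, GCC ignores the attribute when profile
   feedback is available, while Clang lets it override the profile. */
SKETCH_HOT_FUNCTION
static long sketch_sum_annotated(const int *values, int count)
{
    long total = 0;
    for (int i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}

/* Unannotated version: in a PGO build, the collected profile alone
   decides how hot the compiler considers this function. */
static long sketch_sum_plain(const int *values, int count)
{
    long total = 0;
    for (int i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}
```

Both functions compute the same result; the attribute only changes how the compiler prioritizes optimization and code placement, which is why it can conflict with profile data on Clang.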
That makes sense to me. One question: do you want to do the |
Good point. I will break that out into another PR.
python -m test -v test_call segfault with clang 22.1.3 build #148284