Skip to content

gh-149238: avoid bool objects when branching on int/str in JIT#149418

Open
NekoAsakura wants to merge 2 commits intopython:mainfrom
NekoAsakura:gh-149238/bit-to-bool
Open

gh-149238: avoid bool objects when branching on int/str in JIT#149418
NekoAsakura wants to merge 2 commits intopython:mainfrom
NekoAsakura:gh-149238/bit-to-bool

Conversation

@NekoAsakura
Copy link
Copy Markdown
Contributor

@NekoAsakura NekoAsakura commented May 5, 2026

In Py_STACKREF_DEBUG builds the stack slot holds index, so I've added _Py_STACKREF_BIT_{0,1}_INDEX.
I tried LIST as well, but it doesn't work. In order for deopt to work properly, we can't leave a bare 0 or 1 on the stack, or the interpreter would read it as a wild pointer.

@read-the-docs-community
Copy link
Copy Markdown

Copy link
Copy Markdown
Member

@markshannon markshannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Be careful when converting (Python) bools to bit and bits to bools.

We might need to track the difference in the optimizer symbols, but the context should be sufficient.

We will need to add a new _BOOL_TO_BIT uop for correctness, but we should be able to optimize it away in all cases.

  • _BOOL_TO_BIT _BIT_TO_BOOL -> NOP
  • _BIT_TO_BOOL _BOOL_TO_BIT -> NOP
  • _BOOL_TO_BIT _GUARD_IS_TRUE_BIT_POP -> _GUARD_IS_TRUE_POP -> _GUARD_BIT_IS_SET_POP

etc.

#define _Py_STACKREF_FALSE_INDEX (3 << Py_TAGGED_SHIFT)
#define _Py_STACKREF_TRUE_INDEX (4 << Py_TAGGED_SHIFT)
/* Tier-2 transient bit values (gh-149238). */
#define _Py_STACKREF_BIT_0_INDEX (5 << Py_TAGGED_SHIFT)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than expose the internal representation, can you add new PyStackRef_WrapBit PyStackRef_UnwrapBit functions to present a nicer API.

How there are stored doesn't need to leak into the rest of the code.

Comment thread Python/bytecodes.c
DEAD(value);
int truthy = _PyLong_IsZero((PyLongObject *)value_o) ? 0 : 1;
PyStackRef_CLOSE_SPECIALIZED(value, _PyLong_ExactDealloc);
#ifdef Py_STACKREF_DEBUG
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#ifdef Py_STACKREF_DEBUG
bit = PyStackRef_WrapBit(truthy);

Comment thread Python/bytecodes.c
DEAD(value);
int truthy = value_o == &_Py_STR(empty) ? 0 : 1;
PyStackRef_CLOSE_SPECIALIZED(value, _PyUnicode_ExactDealloc);
#ifdef Py_STACKREF_DEBUG
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Comment thread Python/bytecodes.c
}

op (_BIT_TO_BOOL, (bit -- res)) {
#ifdef Py_STACKREF_DEBUG
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

        res = PyStackRef_UnwrapBit(bit) ? PyStackRef_True : PyStackRef_False;

int already_bool = optimize_to_bool(this_instr, ctx, value, &res,
_NOP, _SWAP);
op(_TO_BOOL_BIT_INT, (value -- bit)) {
int already_bool = optimize_to_bool(this_instr, ctx, value, &bit,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem right.
We need to treat bits (C bools) different from Python bools, otherwise we might end up omitting a conversion, or using one when we should be using another.
I think you need to emit a _BOOL_TO_BIT here (which can be optimized away later) to ensure that you have the correct kind of value on the stack.

Previously, _TO_BOOL_INT for the constant 7 would be changed to _LOAD_CONST_INLINE_BORROW True, but for _TO_BOOL_BIT_INT we need to change it to _LOAD_CONST_INLINE_BORROW True; _BOOL_TO_BIT, or add new tier 2 uops to load bit constants.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, as bit is sym_const(Py_True), _BIT_TO_BOOL replaces itself with _NOP; then _GUARD_IS_TRUE_POP is replaced by _POP_TOP. The actual emitted sequence is _POP_TOP; _LOAD_CONST_INLINE_BORROW True; _NOP; _POP_TOP. Nothing is being omitted.

    op(_BIT_TO_BOOL, (bit -- res)) {
        // Bypass when optimize_to_bool short-circuited to a real bool.
        if (sym_is_const(ctx, bit) &&
            (sym_get_const(ctx, bit) == Py_True ||
             sym_get_const(ctx, bit) == Py_False)) {
            REPLACE_OP(this_instr, _NOP, 0, 0);
            res = bit;
        }
        else {
            res = sym_new_truthiness(ctx, bit, true);
        }
    }

That said, _BOOL_TO_BIT is cleaner and better practice. I'll implement it.

@bedevere-app
Copy link
Copy Markdown

bedevere-app Bot commented May 6, 2026

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants