Skip to content

Conversation

@youknowone
Copy link
Member

@youknowone youknowone commented Jan 21, 2026

Summary by CodeRabbit

  • New Features

    • Direct small-integer loading (0–255) and locals-loading for faster execution
    • Async-generator wrapping and improved generator resume behavior
  • Performance

    • Faster integer loads and streamlined jump/loop handling for better runtime performance
  • Bug Fixes

    • More robust StopIteration handling and stricter init return checks
    • Corrected iterator cleanup for async and sync loops
  • Improvements

    • Aligns loop and jump semantics with CPython 3.14 patterns

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 21, 2026

📝 Walkthrough

Walkthrough

This PR updates bytecode and codegen to align with CPython 3.14: adds/renames instructions (LoadSmallInt, EndFor, PopIter, ReturnGenerator), consolidates Jump/JumpNoInterrupt handling, adjusts async-function scope propagation, and updates VM/JIT and oparg enums to support new intrinsics and loop semantics.

Changes

Cohort / File(s) Summary
Symbol table / scopes
crates/codegen/src/symboltable.rs
Propagate is_async into enter_scope_with_parameters; treat AsyncFunction like Function for nesting/super resolution.
Codegen & compile IR
crates/codegen/src/ir.rs, crates/codegen/src/compile.rs
Unify Jump/JumpNoInterrupt preprocessing and emission; emit END_FOR/POP_ITER loop cleanup patterns; move RETURN_GENERATOR/POP_TOP before RESUME in async setup; add LoadSmallInt constant-path optimization; replace interruptible Jumps with JumpNoInterrupt where required.
Bytecode instruction surface
crates/compiler-core/src/bytecode/instruction.rs
Rename BinarySubscr* → BinaryOpSubscr*; add CPython-3.14 pseudo-instructions (AnnotationsPlaceholder, JumpIfFalse/True, JumpNoInterrupt, LoadClosure, SetupCleanup/Finally/With, StoreFastMaybeNull); expose EndFor/PopIter/LoadSmallInt in disassembly; switch InstructionMetadata::stack_effect signature.
OpArg / intrinsics
crates/compiler-core/src/bytecode/oparg.rs
Activate IntrinsicFunction1::StopIterationError = 3 and IntrinsicFunction1::AsyncGenWrap = 4.
VM / frame semantics
crates/vm/src/frame.rs
Add LoadLocals, LoadSmallInt, ReturnGenerator, EndFor, PopIter, ExitInitCheck instructions; implement StopIterationError and AsyncGenWrap intrinsic handling; adjust StopIteration/END_FOR/POP_ITER execution patterns.
JIT handling
crates/jit/src/instructions.rs
Implement LoadSmallInt codegen path (emit immediate Cranelift const → JitValue::Int).
Stdlib opcode helper
crates/stdlib/src/_opcode.rs
Update stack_effect call site to pass only oparg (jump parameter accepted but unused for Python API compatibility).

Sequence Diagram(s)

sequenceDiagram
  participant Source as Source AST
  participant Codegen as Codegen (compile.rs)
  participant IR as IR (ir.rs)
  participant Emitter as Bytecode Emitter (compiler-core)
  participant VM as VM / JIT

  Source->>Codegen: traverse AST (FunctionDef / async)
  Codegen->>IR: emit pseudo-instructions (Jump / JumpNoInterrupt, END_FOR markers, LoadSmallInt hints)
  IR->>Emitter: map pseudo → real instructions (JumpBackwardNoInterrupt, EndFor, PopIter, LoadSmallInt)
  Emitter->>VM: write bytecode with new opcodes and intrinsics
  VM->>VM: execute patterns (END_FOR/POP_ITER, ReturnGenerator, StopIterationError, AsyncGenWrap)
  VM->>JIT: JIT handles LoadSmallInt as immediate const
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • ShaharNaveh

Poem

🐰 I hop through bytecode, light and spry,
Small ints bound quickly, constants fly,
Async nests know where to hide,
EndFor, PopIter tidy the stride,
Jumps that don't interrupt — onward I try! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 inconclusive)
Check name Status Explanation Resolution
Title check ❓ Inconclusive The title 'instruction improvements' is vague and overly broad, failing to convey the specific nature of the extensive changes across multiple compilation and instruction-handling modules. Consider a more specific title that highlights a primary change, such as 'CPython 3.14 instruction alignment' or 'Add LoadSmallInt and async handling improvements' to better guide reviewers.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed Docstring coverage is 93.33% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Contributor

github-actions bot commented Jan 21, 2026

📦 Library Dependencies

The following Lib/ modules were modified. Here are their dependencies:

Click to expand dependency information
[+] lib: cpython/Lib/_opcode_metadata.py
soft_deps:
- [ ] _opcode_metadata

Legend:

  • [+] path exists in CPython
  • [x] up-to-date, [ ] outdated
  • native: Rust/C extension modules

@youknowone youknowone force-pushed the instr branch 3 times, most recently from 12c443f to 9b1d6d6 Compare January 21, 2026 15:19
@youknowone youknowone marked this pull request as ready for review January 21, 2026 23:03
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/codegen/src/symboltable.rs (1)

2096-2172: Ensure nested-scope detection treats AsyncFunction as a function.

Now that AsyncFunction is a distinct scope, enter_scope still only treats CompilerScope::Function as “nested.” That will misclassify nested defs inside async functions. Please update the nesting predicate.

🐛 Suggested fix in enter_scope
-            .map(|table| table.is_nested || table.typ == CompilerScope::Function)
+            .map(|table| {
+                table.is_nested
+                    || matches!(table.typ, CompilerScope::Function | CompilerScope::AsyncFunction)
+            })
🤖 Fix all issues with AI agents
In `@crates/compiler-core/src/bytecode/instruction.rs`:
- Around line 509-514: The ForIter arm in Instruction::stack_effect currently
ignores the jump parameter and always returns 1; update the ForIter handling in
crates/compiler-core/src/bytecode/instruction.rs (the ForIter match arm inside
the stack_effect implementation for Instruction) to return 1 when jump is false
and 0 when jump is true so it matches CPython 3.14 semantics (jump=False => +1,
jump=True => 0); ensure the change preserves the existing comment/context about
CPython behavior and does not affect other Instruction arms.

In `@crates/vm/src/frame.rs`:
- Around line 2724-2746: The StopIteration handling in the match arm for
bytecode::IntrinsicFunction1::StopIterationError currently uses
arg.class().is(vm.ctx.exceptions.stop_iteration) which only matches the exact
StopIteration type; change this to use
arg.fast_isinstance(vm.ctx.exceptions.stop_iteration) so subclasses of
StopIteration are also treated as StopIteration, keeping the same behavior for
converting to vm.new_runtime_error("coroutine raised StopIteration") and
preserving the existing else branch that re-raises via arg.downcast().
🧹 Nitpick comments (1)
crates/vm/src/frame.rs (1)

2313-2327: Guard the END_FOR skip with a POP_ITER check.

Skipping to target + 1 assumes a valid POP_ITER follows END_FOR. Consider guarding to avoid out-of-bounds or unexpected bytecode layouts.

💡 Suggested defensive guard
-                let jump_target = if let Some(unit) = self.code.instructions.get(target_idx) {
-                    if matches!(unit.op, bytecode::Instruction::EndFor) {
-                        // Skip END_FOR, jump to next instruction
-                        bytecode::Label(target.0 + 1)
-                    } else {
-                        // Legacy pattern: jump directly to target (POP_TOP/POP_ITER)
-                        target
-                    }
-                } else {
-                    target
-                };
+                let jump_target = if let Some(unit) = self.code.instructions.get(target_idx) {
+                    if matches!(unit.op, bytecode::Instruction::EndFor)
+                        && matches!(
+                            self.code.instructions.get(target_idx + 1).map(|u| &u.op),
+                            Some(bytecode::Instruction::PopIter)
+                        )
+                    {
+                        // Skip END_FOR, jump to POP_ITER
+                        bytecode::Label(target.0 + 1)
+                    } else {
+                        // Legacy pattern: jump directly to target (POP_TOP/POP_ITER)
+                        target
+                    }
+                } else {
+                    target
+                };

Comment on lines 509 to 514
Self::ForIter { .. } => {
// CPython 3.14 pattern: always +1 for stack analysis
// jump=False: push next value (+1)
// jump=True: iterator stays on stack, no change (0)
if jump { 0 } else { 1 }
// jump=True: conceptually +1 so END_FOR(-1) + POP_ITER(-1) = -1 total
// At runtime, FOR_ITER skips END_FOR and jumps directly to POP_ITER
1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

CPython 3.14 FOR_ITER stack_effect jump parameter dis module

💡 Result:

Short answer: dis.stack_effect(opcode.FOR_ITER, oparg, jump=...) — jump=True = jump taken (iterator exhausted) → stack effect 0; jump=False = fall-through (next value produced) → stack effect +1; jump=None (default) → returns the maximal effect (+1). See the dis docs. [1][2]

Sources:

  • dis.stack_effect documentation. [1]
  • FOR_ITER opcode description (notes on popping behavior and changes across versions). [2]

🌐 Web query:

CPython 3.14 FOR_ITER opcode behavior stack semantics

💡 Result:

Short answer — semantics in CPython 3.14:

  • Input stack before FOR_ITER: [..., iter] (iterator object at TOS).
  • Execution:
    • If next(iter) yields a value: FOR_ITER pushes that value, leaving the iterator beneath it. Resulting stack: [..., iter, value].
    • If iterator is exhausted (StopIteration): FOR_ITER removes the iterator from the stack (pop) and jumps forward by the instruction’s oparg (no value pushed).
  • So the two possible “after” states are [..., iter, value] (loop body runs) or [...] (loop exit — iterator removed and jump). This is the traditional FOR_ITER behaviour used by for-loops and comprehensions; 3.14 continues that semantics (with related bytecode/iterator machinery changes such as adding POP_ITER/virtual iterators). (stackoverflow.com)

If you want the exact C implementation lines in ceval.c for 3.14 I can fetch and cite them.

Citations:


🏁 Script executed:

cd crates/compiler-core && find . -type f -name "*.rs" | head -20

Repository: RustPython/RustPython

Length of output: 212


🏁 Script executed:

grep -r "FOR_ITER\|ForIter" crates/compiler-core/src/bytecode/ -n | head -30

Repository: RustPython/RustPython

Length of output: 1596


🏁 Script executed:

cat -n crates/compiler-core/src/bytecode/instruction.rs | sed -n '509,514p'

Repository: RustPython/RustPython

Length of output: 456


🏁 Script executed:

cat -n crates/compiler-core/src/bytecode/instruction.rs | sed -n '440,520p'

Repository: RustPython/RustPython

Length of output: 4188


🏁 Script executed:

grep -r "stack_effect" crates/compiler-core/src/ -n | grep -v "test" | head -20

Repository: RustPython/RustPython

Length of output: 784


🏁 Script executed:

grep -r "END_FOR\|POP_ITER" crates/compiler-core/src/bytecode/instruction.rs -n

Repository: RustPython/RustPython

Length of output: 336


🏁 Script executed:

cat -n crates/compiler-core/src/bytecode/instruction.rs | sed -n '575,610p'

Repository: RustPython/RustPython

Length of output: 2078


🏁 Script executed:

grep -rn "\.stack_effect\|stack_effect(" crates/compiler-core/src/ --include="*.rs" | grep -v "fn stack_effect" | head -30

Repository: RustPython/RustPython

Length of output: 188


🏁 Script executed:

grep -rn "stack_effect" crates/compiler/ --include="*.rs" | grep -v "test\|compiler-core" | head -20

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

grep -rn "stack_effect" crates/ --include="*.rs" | grep -v "fn stack_effect\|test\|//.*stack_effect" | head -40

Repository: RustPython/RustPython

Length of output: 565


🏁 Script executed:

cat -n crates/compiler-core/src/bytecode/instruction.rs | sed -n '1165,1175p'

Repository: RustPython/RustPython

Length of output: 581


🏁 Script executed:

git log --oneline -20 -- crates/compiler-core/src/bytecode/instruction.rs

Repository: RustPython/RustPython

Length of output: 120


🏁 Script executed:

cat -n crates/codegen/src/ir.rs | sed -n '475,510p'

Repository: RustPython/RustPython

Length of output: 1912


🏁 Script executed:

cat -n crates/stdlib/src/opcode.rs | sed -n '185,200p'

Repository: RustPython/RustPython

Length of output: 591


🏁 Script executed:

cat -n crates/codegen/src/ir.rs | sed -n '500,515p'

Repository: RustPython/RustPython

Length of output: 901


🏁 Script executed:

grep -A 20 "Self::ForIter" crates/codegen/src/ir.rs | head -30

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

cd crates/compiler-core && cat -n src/bytecode/instruction.rs | sed -n '602,603p'

Repository: RustPython/RustPython

Length of output: 191


🏁 Script executed:

cat -n crates/compiler-core/src/bytecode/instruction.rs | sed -n '602,602p'

Repository: RustPython/RustPython

Length of output: 146


Respect the jump parameter in ForIter stack_effect to match CPython 3.14 semantics.

The jump parameter is actively used by codegen (lines 480 and 505 in crates/codegen/src/ir.rs) to calculate correct stack depths for fall-through vs. jump paths, but the ForIter implementation ignores it and unconditionally returns +1. This causes incorrect stack depth analysis for the jump case.

In CPython 3.14, FOR_ITER stack effects are:

  • jump=False (next value produced): +1
  • jump=True (iterator exhausted, jumps): 0

When the jump is taken, the iterator is popped and execution jumps away; there is no net stack change. The implementation should return 0 when jump=True and 1 when jump=False.

🤖 Prompt for AI Agents
In `@crates/compiler-core/src/bytecode/instruction.rs` around lines 509 - 514, The
ForIter arm in Instruction::stack_effect currently ignores the jump parameter
and always returns 1; update the ForIter handling in
crates/compiler-core/src/bytecode/instruction.rs (the ForIter match arm inside
the stack_effect implementation for Instruction) to return 1 when jump is false
and 0 when jump is true so it matches CPython 3.14 semantics (jump=False => +1,
jump=True => 0); ensure the change preserves the existing comment/context about
CPython behavior and does not affect other Instruction arms.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@crates/codegen/src/compile.rs`:
- Around line 7763-7772: The small-int fast-path currently only accepts 0..=255
via value.to_u8(); change it to accept the CPython 3.14 small-int cache range
-5..=256 by converting the integer to a signed type (e.g. value.to_i64()) and
checking if it falls within -5..=256, then use that signed small_int for the
emitted argument; adjust the cast passed to self.emit_arg to convert the signed
small_int into the parameter type expected by emit_arg/Instruction::LoadSmallInt
(e.g. compute small_int as i32 and cast to u32 when calling emit_arg, and update
Instruction::LoadSmallInt idx type if necessary) so negative values and 256 are
handled correctly while still adding the constant via
self.arg_constant(ConstantData::Integer { .. }).
- Around line 4989-4993: In compile_break_continue and unwind_fblock, replace
the existing POP_TOP-based loop cleanup with the CPython 3.14 END_FOR + POP_ITER
pattern: instead of emitting Instruction::PopTop for for-loop teardown paths,
emit Instruction::EndFor followed by Instruction::PopIter so the VM's
special-skip behavior (skip END_FOR to reach POP_ITER) is preserved; update all
break/continue/return cleanup sites within those functions to use this pattern
and ensure any comments reflect the CPython 3.14 semantics.
♻️ Duplicate comments (1)
crates/compiler-core/src/bytecode/instruction.rs (1)

447-447: Simplified stack_effect signature removes jump parameter.

The stack_effect method now takes only arg: OpArg, removing the jump: bool parameter. This simplification is consistent across the codebase.

However, note that ForIter now returns a fixed +1 (line 509), which differs from CPython's behavior where jump=True (iterator exhausted) should return 0 and jump=False (value produced) should return +1. This was flagged in a previous review.

Since the ir.rs stack depth analysis no longer distinguishes jump paths, this may cause conservative stack depth estimates but shouldn't cause underflow. The comment on the previous review explains the semantic difference.

Also applies to: 509-509

🧹 Nitpick comments (1)
crates/vm/src/frame.rs (1)

2313-2332: Runtime instruction inspection may indicate a compiler-level opportunity.

The code inspects the instruction stream at runtime to detect the CPython 3.14 END_FOR/POP_ITER pattern and skip END_FOR on loop exhaustion. While this works correctly, runtime instruction inspection is unusual and could be fragile if bytecode patterns change.

Consider whether the compiler could emit a different jump target directly, or if FOR_ITER should have an additional operand to indicate the number of instructions to skip. This would eliminate the runtime pattern matching.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@crates/codegen/src/compile.rs`:
- Around line 7760-7770: The JIT path mis-decodes LoadSmallInt by zero-extending
the stored u32 when computing small_int (currently doing idx.get(arg) as i64);
update the JIT handler that decodes LoadSmallInt (the place that assigns
small_int from idx.get(arg)) to cast to i32 first and then to i64 (i.e.,
small_int = (idx.get(arg) as i32) as i64) so negative values preserve
two's-complement sign like the VM handler (which uses idx.get(arg) as i32);
ensure any comments/reference to LoadSmallInt, emit_arg, and arg_constant remain
consistent.
♻️ Duplicate comments (1)
crates/compiler-core/src/bytecode/instruction.rs (1)

447-728: Re-check FOR_ITER jump-path stack effect with the new API.

stack_effect now ignores jump direction and ForIter returns +1. If jump-path depth still differs (iterator exhausted), stack depth analysis can drift. Consider a jump-aware path or special-casing ForIter in depth analysis if needed.

🧹 Nitpick comments (1)
crates/codegen/src/ir.rs (1)

339-346: Add a debug assertion that JumpNoInterrupt is backward-only.

This arm always emits JumpBackwardNoInterrupt; an assertion would catch unexpected forward targets early.

♻️ Suggested debug-assert
                         AnyInstruction::Pseudo(PseudoInstruction::JumpNoInterrupt { .. })
                             if target != BlockIdx::NULL =>
                         {
-                            // JumpNoInterrupt is always backward (used in yield-from/await loops)
+                            let target_offset = block_to_offset[target.idx()].0;
+                            debug_assert!(
+                                target_offset <= current_offset,
+                                "JumpNoInterrupt must be backward"
+                            );
+                            // JumpNoInterrupt is always backward (used in yield-from/await loops)
                             Instruction::JumpBackwardNoInterrupt {
                                 target: Arg::marker(),
                             }
                         }

Comment on lines +7760 to +7770
// Use LOAD_SMALL_INT for integers in small int cache range (-5..=256)
// Still add to co_consts for compatibility (CPython does this too)
if let ConstantData::Integer { ref value } = constant
&& let Some(small_int) = value.to_i32()
&& (-5..=256).contains(&small_int)
{
// Add to co_consts even though we use LOAD_SMALL_INT
let _idx = self.arg_constant(constant);
// Store as u32 (two's complement for negative values)
self.emit_arg(small_int as u32, |idx| Instruction::LoadSmallInt { idx });
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

rg -n "LoadSmallInt" crates -g '*.rs' -C 3

Repository: RustPython/RustPython

Length of output: 3919


LoadSmallInt VM decoder is correct; JIT decoder has a potential issue.

The VM handler correctly decodes LoadSmallInt as signed i32 (line 1188 of crates/vm/src/frame.rs):

let value = vm.ctx.new_int(idx.get(arg) as i32);

Casting u32 to i32 preserves the two's complement bit pattern, so negative values are materialized correctly. However, the JIT handler (crates/jit/src/instructions.rs:576) casts directly to i64:

let small_int = idx.get(arg) as i64;

This uses zero-extension instead of sign-extension, causing small negative integers (e.g., -1 encoded as 0xFFFFFFFF) to become large positive integers (4294967295) in the JIT path. Consider casting to i32 first, then to i64 to preserve the sign.

🤖 Prompt for AI Agents
In `@crates/codegen/src/compile.rs` around lines 7760 - 7770, The JIT path
mis-decodes LoadSmallInt by zero-extending the stored u32 when computing
small_int (currently doing idx.get(arg) as i64); update the JIT handler that
decodes LoadSmallInt (the place that assigns small_int from idx.get(arg)) to
cast to i32 first and then to i64 (i.e., small_int = (idx.get(arg) as i32) as
i64) so negative values preserve two's-complement sign like the VM handler
(which uses idx.get(arg) as i32); ensure any comments/reference to LoadSmallInt,
emit_arg, and arg_constant remain consistent.

@youknowone youknowone changed the title instructions instruction enhancements Jan 22, 2026
@youknowone youknowone changed the title instruction enhancements instruction improvements Jan 22, 2026
@youknowone youknowone merged commit 40a43f3 into RustPython:main Jan 22, 2026
14 checks passed
@youknowone youknowone deleted the instr branch January 22, 2026 08:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant