-
Notifications
You must be signed in to change notification settings - Fork 1.4k
instruction improvements #6829
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
instruction improvements #6829
Conversation
📝 WalkthroughWalkthroughThis PR updates bytecode and codegen to align with CPython 3.14: adds/renames instructions (LoadSmallInt, EndFor, PopIter, ReturnGenerator), consolidates Jump/JumpNoInterrupt handling, adjusts async-function scope propagation, and updates VM/JIT and oparg enums to support new intrinsics and loop semantics. Changes
Sequence Diagram(s)sequenceDiagram
participant Source as Source AST
participant Codegen as Codegen (compile.rs)
participant IR as IR (ir.rs)
participant Emitter as Bytecode Emitter (compiler-core)
participant VM as VM / JIT
Source->>Codegen: traverse AST (FunctionDef / async)
Codegen->>IR: emit pseudo-instructions (Jump / JumpNoInterrupt, END_FOR markers, LoadSmallInt hints)
IR->>Emitter: map pseudo → real instructions (JumpBackwardNoInterrupt, EndFor, PopIter, LoadSmallInt)
Emitter->>VM: write bytecode with new opcodes and intrinsics
VM->>VM: execute patterns (END_FOR/POP_ITER, ReturnGenerator, StopIterationError, AsyncGenWrap)
VM->>JIT: JIT handles LoadSmallInt as immediate const
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
📦 Library DependenciesThe following Lib/ modules were modified. Here are their dependencies: Click to expand dependency informationLegend:
|
12c443f to
9b1d6d6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/codegen/src/symboltable.rs (1)
2096-2172: Ensure nested-scope detection treatsAsyncFunctionas a function.Now that
AsyncFunctionis a distinct scope,enter_scopestill only treatsCompilerScope::Functionas “nested.” That will misclassify nested defs inside async functions. Please update the nesting predicate.🐛 Suggested fix in
enter_scope- .map(|table| table.is_nested || table.typ == CompilerScope::Function) + .map(|table| { + table.is_nested + || matches!(table.typ, CompilerScope::Function | CompilerScope::AsyncFunction) + })
🤖 Fix all issues with AI agents
In `@crates/compiler-core/src/bytecode/instruction.rs`:
- Around line 509-514: The ForIter arm in Instruction::stack_effect currently
ignores the jump parameter and always returns 1; update the ForIter handling in
crates/compiler-core/src/bytecode/instruction.rs (the ForIter match arm inside
the stack_effect implementation for Instruction) to return 1 when jump is false
and 0 when jump is true so it matches CPython 3.14 semantics (jump=False => +1,
jump=True => 0); ensure the change preserves the existing comment/context about
CPython behavior and does not affect other Instruction arms.
In `@crates/vm/src/frame.rs`:
- Around line 2724-2746: The StopIteration handling in the match arm for
bytecode::IntrinsicFunction1::StopIterationError currently uses
arg.class().is(vm.ctx.exceptions.stop_iteration) which only matches the exact
StopIteration type; change this to use
arg.fast_isinstance(vm.ctx.exceptions.stop_iteration) so subclasses of
StopIteration are also treated as StopIteration, keeping the same behavior for
converting to vm.new_runtime_error("coroutine raised StopIteration") and
preserving the existing else branch that re-raises via arg.downcast().
🧹 Nitpick comments (1)
crates/vm/src/frame.rs (1)
2313-2327: Guard the END_FOR skip with a POP_ITER check.Skipping to
target + 1assumes a valid POP_ITER follows END_FOR. Consider guarding to avoid out-of-bounds or unexpected bytecode layouts.💡 Suggested defensive guard
- let jump_target = if let Some(unit) = self.code.instructions.get(target_idx) { - if matches!(unit.op, bytecode::Instruction::EndFor) { - // Skip END_FOR, jump to next instruction - bytecode::Label(target.0 + 1) - } else { - // Legacy pattern: jump directly to target (POP_TOP/POP_ITER) - target - } - } else { - target - }; + let jump_target = if let Some(unit) = self.code.instructions.get(target_idx) { + if matches!(unit.op, bytecode::Instruction::EndFor) + && matches!( + self.code.instructions.get(target_idx + 1).map(|u| &u.op), + Some(bytecode::Instruction::PopIter) + ) + { + // Skip END_FOR, jump to POP_ITER + bytecode::Label(target.0 + 1) + } else { + // Legacy pattern: jump directly to target (POP_TOP/POP_ITER) + target + } + } else { + target + };
| Self::ForIter { .. } => { | ||
| // CPython 3.14 pattern: always +1 for stack analysis | ||
| // jump=False: push next value (+1) | ||
| // jump=True: iterator stays on stack, no change (0) | ||
| if jump { 0 } else { 1 } | ||
| // jump=True: conceptually +1 so END_FOR(-1) + POP_ITER(-1) = -1 total | ||
| // At runtime, FOR_ITER skips END_FOR and jumps directly to POP_ITER | ||
| 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🌐 Web query:
CPython 3.14 FOR_ITER stack_effect jump parameter dis module
💡 Result:
Short answer: dis.stack_effect(opcode.FOR_ITER, oparg, jump=...) — jump=True = jump taken (iterator exhausted) → stack effect 0; jump=False = fall-through (next value produced) → stack effect +1; jump=None (default) → returns the maximal effect (+1). See the dis docs. [1][2]
Sources:
- dis.stack_effect documentation. [1]
- FOR_ITER opcode description (notes on popping behavior and changes across versions). [2]
🌐 Web query:
CPython 3.14 FOR_ITER opcode behavior stack semantics
💡 Result:
Short answer — semantics in CPython 3.14:
- Input stack before FOR_ITER: [..., iter] (iterator object at TOS).
- Execution:
- If next(iter) yields a value: FOR_ITER pushes that value, leaving the iterator beneath it. Resulting stack: [..., iter, value].
- If iterator is exhausted (StopIteration): FOR_ITER removes the iterator from the stack (pop) and jumps forward by the instruction’s oparg (no value pushed).
- So the two possible “after” states are [..., iter, value] (loop body runs) or [...] (loop exit — iterator removed and jump). This is the traditional FOR_ITER behaviour used by for-loops and comprehensions; 3.14 continues that semantics (with related bytecode/iterator machinery changes such as adding POP_ITER/virtual iterators). (stackoverflow.com)
If you want the exact C implementation lines in ceval.c for 3.14 I can fetch and cite them.
Citations:
🏁 Script executed:
cd crates/compiler-core && find . -type f -name "*.rs" | head -20Repository: RustPython/RustPython
Length of output: 212
🏁 Script executed:
grep -r "FOR_ITER\|ForIter" crates/compiler-core/src/bytecode/ -n | head -30Repository: RustPython/RustPython
Length of output: 1596
🏁 Script executed:
cat -n crates/compiler-core/src/bytecode/instruction.rs | sed -n '509,514p'Repository: RustPython/RustPython
Length of output: 456
🏁 Script executed:
cat -n crates/compiler-core/src/bytecode/instruction.rs | sed -n '440,520p'Repository: RustPython/RustPython
Length of output: 4188
🏁 Script executed:
grep -r "stack_effect" crates/compiler-core/src/ -n | grep -v "test" | head -20Repository: RustPython/RustPython
Length of output: 784
🏁 Script executed:
grep -r "END_FOR\|POP_ITER" crates/compiler-core/src/bytecode/instruction.rs -nRepository: RustPython/RustPython
Length of output: 336
🏁 Script executed:
cat -n crates/compiler-core/src/bytecode/instruction.rs | sed -n '575,610p'Repository: RustPython/RustPython
Length of output: 2078
🏁 Script executed:
grep -rn "\.stack_effect\|stack_effect(" crates/compiler-core/src/ --include="*.rs" | grep -v "fn stack_effect" | head -30Repository: RustPython/RustPython
Length of output: 188
🏁 Script executed:
grep -rn "stack_effect" crates/compiler/ --include="*.rs" | grep -v "test\|compiler-core" | head -20Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
grep -rn "stack_effect" crates/ --include="*.rs" | grep -v "fn stack_effect\|test\|//.*stack_effect" | head -40Repository: RustPython/RustPython
Length of output: 565
🏁 Script executed:
cat -n crates/compiler-core/src/bytecode/instruction.rs | sed -n '1165,1175p'Repository: RustPython/RustPython
Length of output: 581
🏁 Script executed:
git log --oneline -20 -- crates/compiler-core/src/bytecode/instruction.rsRepository: RustPython/RustPython
Length of output: 120
🏁 Script executed:
cat -n crates/codegen/src/ir.rs | sed -n '475,510p'Repository: RustPython/RustPython
Length of output: 1912
🏁 Script executed:
cat -n crates/stdlib/src/opcode.rs | sed -n '185,200p'Repository: RustPython/RustPython
Length of output: 591
🏁 Script executed:
cat -n crates/codegen/src/ir.rs | sed -n '500,515p'Repository: RustPython/RustPython
Length of output: 901
🏁 Script executed:
grep -A 20 "Self::ForIter" crates/codegen/src/ir.rs | head -30Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
cd crates/compiler-core && cat -n src/bytecode/instruction.rs | sed -n '602,603p'Repository: RustPython/RustPython
Length of output: 191
🏁 Script executed:
cat -n crates/compiler-core/src/bytecode/instruction.rs | sed -n '602,602p'Repository: RustPython/RustPython
Length of output: 146
Respect the jump parameter in ForIter stack_effect to match CPython 3.14 semantics.
The jump parameter is actively used by codegen (lines 480 and 505 in crates/codegen/src/ir.rs) to calculate correct stack depths for fall-through vs. jump paths, but the ForIter implementation ignores it and unconditionally returns +1. This causes incorrect stack depth analysis for the jump case.
In CPython 3.14, FOR_ITER stack effects are:
jump=False(next value produced):+1jump=True(iterator exhausted, jumps):0
When the jump is taken, the iterator is popped and execution jumps away; there is no net stack change. The implementation should return 0 when jump=True and 1 when jump=False.
🤖 Prompt for AI Agents
In `@crates/compiler-core/src/bytecode/instruction.rs` around lines 509 - 514, The
ForIter arm in Instruction::stack_effect currently ignores the jump parameter
and always returns 1; update the ForIter handling in
crates/compiler-core/src/bytecode/instruction.rs (the ForIter match arm inside
the stack_effect implementation for Instruction) to return 1 when jump is false
and 0 when jump is true so it matches CPython 3.14 semantics (jump=False => +1,
jump=True => 0); ensure the change preserves the existing comment/context about
CPython behavior and does not affect other Instruction arms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@crates/codegen/src/compile.rs`:
- Around line 7763-7772: The small-int fast-path currently only accepts 0..=255
via value.to_u8(); change it to accept the CPython 3.14 small-int cache range
-5..=256 by converting the integer to a signed type (e.g. value.to_i64()) and
checking if it falls within -5..=256, then use that signed small_int for the
emitted argument; adjust the cast passed to self.emit_arg to convert the signed
small_int into the parameter type expected by emit_arg/Instruction::LoadSmallInt
(e.g. compute small_int as i32 and cast to u32 when calling emit_arg, and update
Instruction::LoadSmallInt idx type if necessary) so negative values and 256 are
handled correctly while still adding the constant via
self.arg_constant(ConstantData::Integer { .. }).
- Around line 4989-4993: In compile_break_continue and unwind_fblock, replace
the existing POP_TOP-based loop cleanup with the CPython 3.14 END_FOR + POP_ITER
pattern: instead of emitting Instruction::PopTop for for-loop teardown paths,
emit Instruction::EndFor followed by Instruction::PopIter so the VM's
special-skip behavior (skip END_FOR to reach POP_ITER) is preserved; update all
break/continue/return cleanup sites within those functions to use this pattern
and ensure any comments reflect the CPython 3.14 semantics.
♻️ Duplicate comments (1)
crates/compiler-core/src/bytecode/instruction.rs (1)
447-447: Simplified stack_effect signature removes jump parameter.The
stack_effectmethod now takes onlyarg: OpArg, removing thejump: boolparameter. This simplification is consistent across the codebase.However, note that
ForIternow returns a fixed+1(line 509), which differs from CPython's behavior wherejump=True(iterator exhausted) should return0andjump=False(value produced) should return+1. This was flagged in a previous review.Since the
ir.rsstack depth analysis no longer distinguishes jump paths, this may cause conservative stack depth estimates but shouldn't cause underflow. The comment on the previous review explains the semantic difference.Also applies to: 509-509
🧹 Nitpick comments (1)
crates/vm/src/frame.rs (1)
2313-2332: Runtime instruction inspection may indicate a compiler-level opportunity.The code inspects the instruction stream at runtime to detect the CPython 3.14
END_FOR/POP_ITERpattern and skipEND_FORon loop exhaustion. While this works correctly, runtime instruction inspection is unusual and could be fragile if bytecode patterns change.Consider whether the compiler could emit a different jump target directly, or if
FOR_ITERshould have an additional operand to indicate the number of instructions to skip. This would eliminate the runtime pattern matching.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@crates/codegen/src/compile.rs`:
- Around line 7760-7770: The JIT path mis-decodes LoadSmallInt by zero-extending
the stored u32 when computing small_int (currently doing idx.get(arg) as i64);
update the JIT handler that decodes LoadSmallInt (the place that assigns
small_int from idx.get(arg)) to cast to i32 first and then to i64 (i.e.,
small_int = (idx.get(arg) as i32) as i64) so negative values preserve
two's-complement sign like the VM handler (which uses idx.get(arg) as i32);
ensure any comments/reference to LoadSmallInt, emit_arg, and arg_constant remain
consistent.
♻️ Duplicate comments (1)
crates/compiler-core/src/bytecode/instruction.rs (1)
447-728: Re-check FOR_ITER jump-path stack effect with the new API.
stack_effectnow ignores jump direction andForIterreturns+1. If jump-path depth still differs (iterator exhausted), stack depth analysis can drift. Consider a jump-aware path or special-casingForIterin depth analysis if needed.
🧹 Nitpick comments (1)
crates/codegen/src/ir.rs (1)
339-346: Add a debug assertion that JumpNoInterrupt is backward-only.This arm always emits
JumpBackwardNoInterrupt; an assertion would catch unexpected forward targets early.♻️ Suggested debug-assert
AnyInstruction::Pseudo(PseudoInstruction::JumpNoInterrupt { .. }) if target != BlockIdx::NULL => { - // JumpNoInterrupt is always backward (used in yield-from/await loops) + let target_offset = block_to_offset[target.idx()].0; + debug_assert!( + target_offset <= current_offset, + "JumpNoInterrupt must be backward" + ); + // JumpNoInterrupt is always backward (used in yield-from/await loops) Instruction::JumpBackwardNoInterrupt { target: Arg::marker(), } }
| // Use LOAD_SMALL_INT for integers in small int cache range (-5..=256) | ||
| // Still add to co_consts for compatibility (CPython does this too) | ||
| if let ConstantData::Integer { ref value } = constant | ||
| && let Some(small_int) = value.to_i32() | ||
| && (-5..=256).contains(&small_int) | ||
| { | ||
| // Add to co_consts even though we use LOAD_SMALL_INT | ||
| let _idx = self.arg_constant(constant); | ||
| // Store as u32 (two's complement for negative values) | ||
| self.emit_arg(small_int as u32, |idx| Instruction::LoadSmallInt { idx }); | ||
| return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
rg -n "LoadSmallInt" crates -g '*.rs' -C 3Repository: RustPython/RustPython
Length of output: 3919
LoadSmallInt VM decoder is correct; JIT decoder has a potential issue.
The VM handler correctly decodes LoadSmallInt as signed i32 (line 1188 of crates/vm/src/frame.rs):
let value = vm.ctx.new_int(idx.get(arg) as i32);Casting u32 to i32 preserves the two's complement bit pattern, so negative values are materialized correctly. However, the JIT handler (crates/jit/src/instructions.rs:576) casts directly to i64:
let small_int = idx.get(arg) as i64;This uses zero-extension instead of sign-extension, causing small negative integers (e.g., -1 encoded as 0xFFFFFFFF) to become large positive integers (4294967295) in the JIT path. Consider casting to i32 first, then to i64 to preserve the sign.
🤖 Prompt for AI Agents
In `@crates/codegen/src/compile.rs` around lines 7760 - 7770, The JIT path
mis-decodes LoadSmallInt by zero-extending the stored u32 when computing
small_int (currently doing idx.get(arg) as i64); update the JIT handler that
decodes LoadSmallInt (the place that assigns small_int from idx.get(arg)) to
cast to i32 first and then to i64 (i.e., small_int = (idx.get(arg) as i32) as
i64) so negative values preserve two's-complement sign like the VM handler
(which uses idx.get(arg) as i32); ensure any comments/reference to LoadSmallInt,
emit_arg, and arg_constant remain consistent.
Summary by CodeRabbit
New Features
Performance
Bug Fixes
Improvements
✏️ Tip: You can customize this high-level summary in your review settings.