Skip to content

Conversation

@youknowone
Copy link
Member

@youknowone youknowone commented Jan 19, 2026

Remove RustPython-specific opcodes (BuildListFromTuples, BuildMapForCall, BuildSetFromTuples, BuildTupleFromTuples) and replace their usage with CPython 3.14 standard opcode sequences:

  • BuildListFromTuples → BUILD_LIST + LIST_EXTEND loop
  • BuildSetFromTuples → BUILD_SET + SET_UPDATE loop
  • BuildTupleFromTuples → BUILD_LIST + LIST_EXTEND + CALL_INTRINSIC_1(ListToTuple)
  • BuildMapForCall → DICT_MERGE loop

Implement missing opcodes:

  • ListExtend: Extend list with iterable elements
  • SetUpdate: Add iterable elements to set
  • DictMerge: Merge dict with duplicate key checking

Summary by CodeRabbit

  • Refactor
    • Star-unpacking and argument/collection assembly now use incremental streaming (per-element extend/update then final conversion), improving large/unpacked sequence handling.
    • Adjusted bytecode instruction stack accounting for correct runtime behavior.
  • Bug Fixes
    • Keyword-argument merging now detects duplicate keys and invalid mappings, surfacing clearer errors.
  • New Features
    • Runtime supports explicit list-extend and set-update operations for expanded iterables.
  • Chores
    • Added spell-check dictionary entries for new token names.

✏️ Tip: You can customize this high-level summary in your review settings.

@youknowone youknowone marked this pull request as ready for review January 19, 2026 01:14
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 19, 2026

📝 Walkthrough

Walkthrough

Reworks collection-building and call-argument emission to a two-step streaming pattern: build empty list/set, extend/update per-element (via ListExtend/SetUpdate), optionally convert to tuple (ListToTuple), and adds DictMerge for indexed dict merges. Adjusts call codegen to use these flows.

Changes

Cohort / File(s) Summary
Codegen pattern updates
crates/codegen/src/compile.rs
Replaces direct "FromTuples" builds with: emit BuildList/BuildSet, per-element ListExtend/SetUpdate/DictMerge, and ListToTuple when tuples required; refactors star-unpack and call argument assembly (codegen_call_helper, codegen_subkwargs).
VM runtime handlers
crates/vm/src/frame.rs
Adds Instruction::DictMerge { index }, ListExtend { i }, SetUpdate { i } and their execution paths; implements runtime logic for merging dicts (duplicate-key checks), extending lists, and updating sets from iterables; minor BuildList pop handling tweak.
Bytecode metadata
crates/compiler-core/src/bytecode/instruction.rs
Adjusts stack-effect metadata for DictMerge, ListExtend, and SetUpdate (previously reported 0, now -1) to reflect new stack semantics.
Spell/dict additions
.cspell.dict/cpython.txt
Adds entries: kwnames, nkwelts, subkwargs to cspell dictionary.

Sequence Diagram(s)

sequenceDiagram
  participant Codegen
  participant Bytecode
  participant VM
  participant Runtime

  Codegen->>Bytecode: emit BuildList(size=0) / BuildSet(size=0)
  Codegen->>Bytecode: emit ListExtend(i) / SetUpdate(i) / DictMerge(index) (per element)
  Codegen->>Bytecode: emit ListToTuple (if target is tuple)
  Bytecode->>VM: stream of instructions
  VM->>Runtime: execute BuildList -> create empty sequence
  VM->>Runtime: on ListExtend -> iterate source -> append items to list
  VM->>Runtime: on SetUpdate -> iterate source -> add items to set
  VM->>Runtime: on DictMerge -> validate mapping -> merge keys into target dict
  alt conversion requested
    VM->>Runtime: ListToTuple intrinsic -> produce tuple
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • ShaharNaveh

Poem

🐇
I hopped through stacks and streams today,
swapped big builds for steps that play —
extend a list, update a set,
merge the dicts with no regret.
Hooray, small hops pave the way! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 44.44% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately reflects the main objective: replacing RustPython-specific opcodes with CPython standard sequences. It directly corresponds to the primary changes across multiple files (compile.rs, instruction.rs, frame.rs).

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@crates/codegen/src/compile.rs`:
- Around line 542-568: gather_elements leaves tuple chunks on the stack before
the collection is created, so emitting Instruction::BuildList/BuildSet after
those chunks causes ListExtend/SetUpdate to index the wrong stack slot and
trigger unsafe casts; fix by ensuring the collection is created before tuple
chunks are left on the stack (emit Instruction::BuildList { size: 0 } /
Instruction::BuildSet { size: 0 } prior to the code that pushes tuple chunks in
gather_elements) or, if reordering is infeasible, adjust the ListExtend { i: ...
} / SetUpdate { i: ... } indices to account for the tuple-first layout
(increment the depth used) in the CollectionType::List and CollectionType::Set
handling (also apply same fix to the other occurrences noted).
🧹 Nitpick comments (2)
crates/vm/src/frame.rs (2)

941-958: Consider using lazy iteration over keys instead of collecting into Vec.

The current implementation collects all keys into a Vec before iterating, which allocates memory proportional to the source dict size. This could be inefficient for large mappings and doesn't match CPython's lazy iteration behavior.

Consider using the iterator protocol directly:

♻️ Suggested refactor using lazy iteration
-                // Get keys from source and check for duplicates
-                let keys_iter = vm.call_method(&source, "keys", ())?;
-                for key in keys_iter.try_to_value::<Vec<PyObjectRef>>(vm)? {
+                // Get keys from source and check for duplicates
+                let keys_obj = vm.call_method(&source, "keys", ())?;
+                let keys_iter = keys_obj.get_iter(vm)?;
+                while let PyIterReturn::Return(key) = keys_iter.next(vm)? {
                     // Check if keyword argument is a string
                     if key.downcast_ref::<PyStr>().is_none() {
                         return Err(vm.new_type_error("keywords must be strings".to_owned()));
                     }
                     // Check for duplicate keyword arguments
                     if dict.contains_key(&*key, vm) {
                         let key_repr = key.repr(vm)?;
                         return Err(vm.new_type_error(format!(
                             "got multiple values for keyword argument {}",
                             key_repr.as_str()
                         )));
                     }
                     let value = vm.call_method(&source, "__getitem__", (key.clone(),))?;
                     dict.set_item(&*key, value, vm)?;
                 }

1594-1607: Consider using PySet's update_internal method for consistency with ListExtend.

While ListExtend delegates to list.extend(), this handler manually iterates and adds items. PySet has an update_internal() method that takes a single iterable and would provide the same consistency pattern: replace the manual iteration with set.update_internal(iterable, vm)?;.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@crates/vm/src/frame.rs`:
- Around line 921-959: Instruction::DictMerge currently pops the source with
pop_value() then calls nth_value(index.get(arg)) without compensating for the
pop, so it reads the wrong stack slot; mirror the DictUpdate pattern by
subtracting 1 from the computed slot before calling nth_value (i.e., use
nth_value(index.get(arg) - 1) or otherwise adjust the index after pop_value()),
ensuring you reference the same index handling as in DictUpdate and keep the
existing checks (get_method, keys iteration, contains_key, set_item) intact.
♻️ Duplicate comments (3)
crates/codegen/src/compile.rs (3)

539-570: Fix LIST_EXTEND/SET_UPDATE stack target (collection is on TOS).

gather_elements leaves tuple chunks on the stack, then BUILD_LIST/BUILD_SET pushes the collection above them. LIST_EXTEND/SET_UPDATE expects the collection below the iterable on the stack, so this order can target a tuple instead of the list/set and fail at runtime. Please reorder stack construction or adjust the index/depth to match opcode expectations.


4436-4448: Fix LIST_EXTEND stack target for starred bases.

This sequence builds the list on TOS above tuple chunks, but LIST_EXTEND expects the list below the iterable. That can mis-target the tuple and trigger unsafe casts. Please reorder or adjust index depth as in the earlier unpacking path.


6994-7004: *Fix LIST_EXTEND stack target when building args tuple.

Here the list is pushed on top of tuple chunks, but LIST_EXTEND expects the list below the iterable. This can mis-target a tuple at runtime. Please reorder the stack or adjust the depth/index so the list is below the iterable.

@youknowone youknowone marked this pull request as draft January 19, 2026 02:24
@youknowone youknowone force-pushed the no-rustpython-ops branch 2 times, most recently from a35f3bb to d8b46b1 Compare January 19, 2026 08:42
@github-actions
Copy link
Contributor

Code has been automatically formatted

The code in this PR has been formatted using:

  • python scripts/generate_opcode_metadata.py
    Please pull the latest changes before pushing again:
git pull origin no-rustpython-ops

Remove RustPython-specific opcodes (BuildListFromTuples,
BuildMapForCall, BuildSetFromTuples, BuildTupleFromTuples)
and replace their usage with CPython 3.14 standard opcode
sequences:

- BuildListFromTuples → BUILD_LIST + LIST_EXTEND loop
- BuildSetFromTuples → BUILD_SET + SET_UPDATE loop
- BuildTupleFromTuples → BUILD_LIST + LIST_EXTEND + CALL_INTRINSIC_1(ListToTuple)
- BuildMapForCall → DICT_MERGE loop

Implement missing opcodes:
- ListExtend: Extend list with iterable elements
- SetUpdate: Add iterable elements to set
- DictMerge: Merge dict with duplicate key checking
@youknowone youknowone marked this pull request as ready for review January 19, 2026 15:22
@youknowone youknowone merged commit 82a8f67 into RustPython:main Jan 19, 2026
23 of 24 checks passed
@youknowone youknowone deleted the no-rustpython-ops branch January 19, 2026 15:34
youknowone added a commit to youknowone/RustPython that referenced this pull request Jan 19, 2026
* Replace custom opcodes with standard sequences

Remove RustPython-specific opcodes (BuildListFromTuples,
BuildMapForCall, BuildSetFromTuples, BuildTupleFromTuples)
and replace their usage with CPython 3.14 standard opcode
sequences:

- BuildListFromTuples → BUILD_LIST + LIST_EXTEND loop
- BuildSetFromTuples → BUILD_SET + SET_UPDATE loop
- BuildTupleFromTuples → BUILD_LIST + LIST_EXTEND + CALL_INTRINSIC_1(ListToTuple)
- BuildMapForCall → DICT_MERGE loop

Implement missing opcodes:
- ListExtend: Extend list with iterable elements
- SetUpdate: Add iterable elements to set
- DictMerge: Merge dict with duplicate key checking

* Auto-generate: generate_opcode_metadata.py

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
This was referenced Jan 19, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant