Skip to content

Conversation

@youknowone
Copy link
Member

@youknowone youknowone commented Jan 16, 2026

Summary by CodeRabbit

  • New Features

    • Builder-based interpreter API for configurable construction and stdlib registration.
    • Multi-phase module initialization (create + exec) to support safer circular imports.
  • Refactor

    • Stdlib and native module bootstrapping reorganized to use module definitions and exec hooks across many modules.
    • Public initialization surface unified around the builder and module_def/module_exec pattern.
  • Behavior

    • Additional file metadata fields exposed on Windows.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 16, 2026

📝 Walkthrough

Walkthrough

Introduce InterpreterBuilder and migrate stdlib/native module wiring from single‑step make_module to multi‑phase PyModuleDef APIs (module_def + module_exec). VM bootstrap now creates → inserts into sys.modules → exec (with rollback on failure). Callers updated to builder-based initialization and stdlib_module_defs(ctx).

Changes

Cohort / File(s) Summary
Interpreter & builder
crates/vm/src/vm/interpreter.rs, crates/vm/src/vm/mod.rs, crates/vm/src/lib.rs
Add InterpreterBuilder, builder API (settings, add_native_module(s), init_hook, with_frozen, build), initialize_main_vm; expose InterpreterBuilder; shift VM global state to module_defs.
Import & two‑phase init
crates/vm/src/import.rs, crates/vm/src/builtins/module.rs, crates/vm/src/builtins/mod.rs
Add PyModuleDef::create_module and exec_module; implement create → insert into sys.modules → exec with rollback on exec failure; update builtins init sequencing.
Stdlib registry
crates/stdlib/src/lib.rs, crates/vm/src/stdlib/mod.rs
Replace get_module_inits/StdlibMap with stdlib_module_defs(ctx) / builtin_module_defs(ctx) returning Vec<&'static PyModuleDef>; switch per-module wiring to module_def(ctx).
Bulk module rewiring
crates/stdlib/src/*, crates/vm/src/stdlib/*, crates/wasm/src/*
Many files: change pub(crate) use X::make_modulepub(crate) use X::module_def; remove top‑level make_module wrappers; add module_exec(vm,module) where module-specific runtime setup is required.
Stdlib modules with custom exec
examples in crates/stdlib and crates/vm/src/stdlib (e.g., array, contextvars, _sqlite3, ctypes, errno, io, socket, unicodedata, typing, etc.)
Add crate-visible module_exec wrappers that call generated __module_exec, perform registrations/platform setup, use ? error propagation, and re-export module_def.
Codegen changes (derive)
crates/derive-impl/src/pymodule.rs, crates/derive/src/lib.rs
Generator now emits module_def and module_exec wiring (rename has_extend_modulehas_module_exec), sets PyModuleDef exec slot to module_exec, and updates docs/examples.
Public API & host changes
src/interpreter.rs, src/lib.rs, src/main.rs, examples, benches, example_projects
Replace InterpreterConfig/InitHook with InterpreterBuilderExt::init_stdlib() and builder-based flows; update run() and examples/benches to use Interpreter::builder(...).add_native_modules(...).build() and stdlib_module_defs(&builder.ctx).
Minor exports & platform tweaks
crates/common/src/crt_fd.rs, crates/vm/src/builtins/mod.rs, crates/vm/src/builtins/module.rs, benches/examples
Small import/visibility tweaks (AsFd import, export PyModuleSlots), add create_module/exec_module helpers on PyModuleDef, and adjust call sites to use builder/context.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Builder as InterpreterBuilder (ctx)
    participant DefList as stdlib_module_defs(&ctx)
    participant BuilderActions as Builder.add_native_modules / build
    participant Interpreter
    participant VM as VirtualMachine
    participant Sys as sys.modules
    participant Def as PyModuleDef

    User->>Builder: Interpreter::builder(settings)
    Builder->>DefList: stdlib_module_defs(&builder.ctx)
    DefList-->>Builder: Vec<&PyModuleDef>
    User->>BuilderActions: add_native_modules(defs) / build()
    BuilderActions->>Interpreter: initialize_main_vm(...)
    Interpreter->>VM: construct VM with module_defs map
    loop for each builtin module_def
      VM->>Def: create_module(vm)           %% Phase 1: create
      Def-->>VM: PyModule instance
      VM->>Sys: insert module into sys.modules (pre-exec)
      VM->>Def: exec_module(vm, module)     %% Phase 2: exec
      Def-->>VM: module init completes or error
      alt exec failed
        VM->>Sys: remove partial module
      end
    end
    Interpreter-->>User: Interpreter (with global_state)
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

Suggested reviewers

  • ShaharNaveh
  • fanninpm
  • coolreader18

Poem

"🐇
I hop through bytes and stitch the start,
A builder seeds the interpreter's heart.
Modules born, inserted, then run —
create then exec till init is done.
A rabbit cheers: smooth flow, well spun."

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Multi phase module init' accurately describes the primary change in this changeset, which implements a multi-phase module initialization system replacing the previous single-phase approach.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@youknowone youknowone force-pushed the multi-phase-module-init branch 3 times, most recently from 660f2fb to 054b125 Compare January 16, 2026 06:32
@github-actions
Copy link
Contributor

Code has been automatically formatted

The code in this PR has been formatted using:

  • cargo fmt --all
    Please pull the latest changes before pushing again:
git pull origin multi-phase-module-init

@youknowone youknowone force-pushed the multi-phase-module-init branch 2 times, most recently from 338374a to 5b2598b Compare January 16, 2026 06:51
#[pymodule(with(cert::ssl_cert, ssl_error::ssl_error, ossl101, ossl111, windows))]
mod _ssl {
#[cfg(openssl_vendored)]
use super::PROBE;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does PROBE still need to defined on super?

@youknowone youknowone force-pushed the multi-phase-module-init branch 17 times, most recently from 85d072d to ea3a345 Compare January 21, 2026 12:11
@youknowone youknowone force-pushed the multi-phase-module-init branch from ea3a345 to 5eed597 Compare January 21, 2026 13:39
@youknowone youknowone force-pushed the multi-phase-module-init branch from 5eed597 to c6dc3d2 Compare January 21, 2026 14:07
@youknowone youknowone marked this pull request as ready for review January 21, 2026 14:23
@youknowone youknowone force-pushed the multi-phase-module-init branch from c6dc3d2 to 7a38e56 Compare January 21, 2026 14:35
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/vm/src/stdlib/os.rs (1)

1900-1923: Avoid panics in module_exec; propagate errors instead.

unwrap() during import can abort the interpreter. Returning PyResult<()> and using ? provides a proper Python exception instead. If you adopt this, update callers to handle the new result.

🛠️ Proposed fix
-pub fn module_exec(vm: &VirtualMachine, module: &Py<PyModule>) {
+pub fn module_exec(vm: &VirtualMachine, module: &Py<PyModule>) -> PyResult<()> {
     let support_funcs = _os::support_funcs();
     let supports_fd = PySet::default().into_ref(&vm.ctx);
     let supports_dir_fd = PySet::default().into_ref(&vm.ctx);
     let supports_follow_symlinks = PySet::default().into_ref(&vm.ctx);
     for support in support_funcs {
-        let func_obj = module.get_attr(support.name, vm).unwrap();
+        let func_obj = module.get_attr(support.name, vm)?;
         if support.fd.unwrap_or(false) {
-            supports_fd.clone().add(func_obj.clone(), vm).unwrap();
+            supports_fd.clone().add(func_obj.clone(), vm)?;
         }
         if support.dir_fd.unwrap_or(false) {
-            supports_dir_fd.clone().add(func_obj.clone(), vm).unwrap();
+            supports_dir_fd.clone().add(func_obj.clone(), vm)?;
         }
         if support.follow_symlinks.unwrap_or(false) {
-            supports_follow_symlinks.clone().add(func_obj, vm).unwrap();
+            supports_follow_symlinks.clone().add(func_obj, vm)?;
         }
     }

     extend_module!(vm, module, {
         "supports_fd" => supports_fd,
         "supports_dir_fd" => supports_dir_fd,
         "supports_follow_symlinks" => supports_follow_symlinks,
         "error" => vm.ctx.exceptions.os_error.to_owned(),
     });
+    Ok(())
 }
🤖 Fix all issues with AI agents
In `@crates/vm/src/import.rs`:
- Around line 91-116: The module is inserted into sys_modules via
sys_modules.set_item(...) before def.exec_module(...) and if exec_module fails
the partially-initialized module remains in sys_modules; update the two-phase
init path (the block that uses vm.state.module_defs, def.create_module,
sys_modules.set_item, def.exec_module) to remove the module from sys_modules on
exec failure (call sys_modules.del_item(module_name, vm) or equivalent) before
returning the error so subsequent imports won't return the broken module,
ensuring the original error from def.exec_module is propagated by returning
Err(e).

In `@crates/vm/src/stdlib/errno.rs`:
- Around line 9-23: The module_exec function currently uses unwrap() when
calling errorcode.set_item(...) and module.set_attr(...), which will panic on
failure; change those unwrap() calls to propagate errors by using the ? operator
so the function returns a Python exception instead of aborting. Locate
module_exec (and the loop over super::ERROR_CODES) and replace the two .unwrap()
usages on set_item and set_attr with ? so the PyResult<()> return handles
failures; ensure signatures remain unchanged and error propagation compiles.

In `@crates/vm/src/stdlib/imp.rs`:
- Around line 122-149: When inserting the module into sys.modules before calling
the def.slots.exec slot (module variable, sys_modules.set_item and
def.slots.exec), ensure you remove that entry if exec returns an Err so a
partially-initialized module is not left cached; change the exec call to
match/if-let on its Result, and on Err call sys_modules.del_item(&*name, vm) (or
the equivalent removal API) before propagating the error so behavior matches
CPython’s cleanup on exec failure.

In `@crates/vm/src/stdlib/mod.rs`:
- Around line 96-104: The posix::module_def(ctx) is being registered twice for
WASI because both #[cfg(target_os = "wasi")] and the broader
#[cfg(all(any(not(target_arch = "wasm32"), target_os = "wasi"), not(any(unix,
windows))))] match; remove the redundant explicit #[cfg(target_os = "wasi")]
posix::module_def(ctx) entry so posix::module_def(ctx) is only included once
(leave the broader cfg that covers non-unix/windows and WASI targets or vice
versa based on project convention), ensuring only a single
posix::module_def(ctx) call remains in the builtin modules list.

In `@crates/vm/src/vm/interpreter.rs`:
- Around line 30-142: The code in initialize_main_vm currently calls Box::leak
for sysconfigdata_name on every VM creation, leaking memory each time; change
this to leak the platform-specific name only once per process by introducing a
static OnceLock/OnceCell (e.g., static SYSCONFIGDATA_LEAK: OnceLock<&'static
str>) and use get_or_init to perform Box::leak only the first time, then insert
the returned &'static str into all_module_defs; reference initialize_main_vm,
sysconfigdata_name, and the all_module_defs insertion where leaked_name is
created to locate the change.
🧹 Nitpick comments (5)
crates/wasm/src/js_module.rs (1)

623-623: LGTM - Re-export added for module_def pattern.

The addition of pub(crate) use _js::module_def; aligns with the crate-wide module initialization refactoring. Note that this re-export is placed at the end of the file, unlike most other modules where it appears at the top. While functionally equivalent, you may consider moving it to line 2 (after the PyJsValue, PyPromise re-export) for consistency with other stdlib modules.

crates/vm/src/builtins/module.rs (1)

53-71: Remove redundant import.

PyPayload is already imported at the file level (line 3), so the use crate::PyPayload; on line 54 is unnecessary.

Proposed fix
     pub fn create_module(&'static self, vm: &VirtualMachine) -> PyResult<PyRef<PyModule>> {
-        use crate::PyPayload;
-
         // Create module (use create slot if provided, else default creation)
         let module = if let Some(create) = self.slots.create {
crates/stdlib/src/_sqlite3.rs (1)

847-867: Propagate setup_module_exceptions errors instead of panicking.

Line 856 invokes setup_module_exceptions, which still uses unwrap() internally. Since module_exec now returns PyResult<()>, consider returning PyResult<()> from setup_module_exceptions and using ? to avoid panics during module init.

crates/vm/src/stdlib/sysconfigdata.rs (1)

14-25: Chain __module_exec before custom init.

Custom module_exec doesn’t call the macro-generated __module_exec, so any #[pyattr]/#[pyfunction] exports added now or later won’t be registered. Consider calling it first, consistent with other stdlib modules.

♻️ Suggested change
     fn module_exec(vm: &VirtualMachine, module: &Py<PyModule>) -> PyResult<()> {
+        __module_exec(vm, module);
         // Set build_time_vars attribute
         let build_time_vars = build_time_vars(vm);
crates/vm/src/stdlib/sys.rs (1)

168-173: Consider deduplicating builtin module names after augmentation.

If any native module definition already uses "sys" or "builtins", this could yield duplicates in sys.builtin_module_names. A quick dedup after sorting prevents that edge case.

♻️ Proposed tweak
-        module_names.sort();
+        module_names.sort();
+        module_names.dedup();

Comment on lines +122 to +149
// Check sys.modules first
if let Ok(module) = sys_modules.get_item(&*name, vm) {
return Ok(module);
}

// Try multi-phase init modules first (they need special handling)
if let Some(&def) = vm.state.module_defs.get(name.as_str()) {
// Phase 1: Create module (use create slot if provided, else default creation)
let module = if let Some(create) = def.slots.create {
// Custom module creation
create(vm, &spec, def)?
} else {
// Default module creation
PyModule::from_def(def).into_ref(&vm.ctx)
};

// Initialize module dict and methods
// Corresponds to PyModule_FromDefAndSpec: md_def, _add_methods_to_object, PyModule_SetDocString
PyModule::__init_dict_from_def(vm, &module);
module.__init_methods(vm)?;

// Add to sys.modules BEFORE exec (critical for circular import handling)
sys_modules.set_item(&*name, module.clone().into(), vm)?;

// Phase 2: Call exec slot (can safely import other modules now)
if let Some(exec) = def.slots.exec {
exec(vm, &module)?;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Clean up sys.modules when the exec slot fails.

Line 143 inserts the module into sys.modules before exec. If Line 147–149 errors, a partially initialized module remains cached and may be reused by later imports. Consider removing it on failure to match CPython’s behavior.

🛠️ Proposed fix
-            if let Some(exec) = def.slots.exec {
-                exec(vm, &module)?;
-            }
+            if let Some(exec) = def.slots.exec {
+                if let Err(err) = exec(vm, &module) {
+                    let _ = sys_modules.del_item(&*name, vm);
+                    return Err(err);
+                }
+            }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Check sys.modules first
if let Ok(module) = sys_modules.get_item(&*name, vm) {
return Ok(module);
}
// Try multi-phase init modules first (they need special handling)
if let Some(&def) = vm.state.module_defs.get(name.as_str()) {
// Phase 1: Create module (use create slot if provided, else default creation)
let module = if let Some(create) = def.slots.create {
// Custom module creation
create(vm, &spec, def)?
} else {
// Default module creation
PyModule::from_def(def).into_ref(&vm.ctx)
};
// Initialize module dict and methods
// Corresponds to PyModule_FromDefAndSpec: md_def, _add_methods_to_object, PyModule_SetDocString
PyModule::__init_dict_from_def(vm, &module);
module.__init_methods(vm)?;
// Add to sys.modules BEFORE exec (critical for circular import handling)
sys_modules.set_item(&*name, module.clone().into(), vm)?;
// Phase 2: Call exec slot (can safely import other modules now)
if let Some(exec) = def.slots.exec {
exec(vm, &module)?;
}
// Check sys.modules first
if let Ok(module) = sys_modules.get_item(&*name, vm) {
return Ok(module);
}
// Try multi-phase init modules first (they need special handling)
if let Some(&def) = vm.state.module_defs.get(name.as_str()) {
// Phase 1: Create module (use create slot if provided, else default creation)
let module = if let Some(create) = def.slots.create {
// Custom module creation
create(vm, &spec, def)?
} else {
// Default module creation
PyModule::from_def(def).into_ref(&vm.ctx)
};
// Initialize module dict and methods
// Corresponds to PyModule_FromDefAndSpec: md_def, _add_methods_to_object, PyModule_SetDocString
PyModule::__init_dict_from_def(vm, &module);
module.__init_methods(vm)?;
// Add to sys.modules BEFORE exec (critical for circular import handling)
sys_modules.set_item(&*name, module.clone().into(), vm)?;
// Phase 2: Call exec slot (can safely import other modules now)
if let Some(exec) = def.slots.exec {
if let Err(err) = exec(vm, &module) {
let _ = sys_modules.del_item(&*name, vm);
return Err(err);
}
}
🤖 Prompt for AI Agents
In `@crates/vm/src/stdlib/imp.rs` around lines 122 - 149, When inserting the
module into sys.modules before calling the def.slots.exec slot (module variable,
sys_modules.set_item and def.slots.exec), ensure you remove that entry if exec
returns an Err so a partially-initialized module is not left cached; change the
exec call to match/if-let on its Result, and on Err call
sys_modules.del_item(&*name, vm) (or the equivalent removal API) before
propagating the error so behavior matches CPython’s cleanup on exec failure.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/vm/src/stdlib/sys.rs (1)

1678-1685: Revert visibility change for sysconfigdata_name.

The function is only called from within crates/vm/src/ (via crate::stdlib::sys::sysconfigdata_name() in interpreter.rs and sysconfigdata.rs), so pub(crate) visibility is sufficient. No external crates use this function. Change it back to pub(crate) to maintain proper encapsulation.

♻️ Duplicate comments (1)
crates/vm/src/vm/interpreter.rs (1)

67-72: Avoid per-interpreter Box::leak for _sysconfigdata aliasing.

This repeats a previously flagged issue: leaking on each interpreter creation grows memory unboundedly. Prefer a process-wide OnceLock to leak once.

🩹 Suggested fix (same as prior review)
     if let Some(&sysconfigdata_def) = all_module_defs.get("_sysconfigdata") {
+        use std::sync::OnceLock;
         let sysconfigdata_name = crate::stdlib::sys::sysconfigdata_name();
         // Leak the string to get a 'static lifetime for the key
-        let leaked_name: &'static str = Box::leak(sysconfigdata_name.into_boxed_str());
-        all_module_defs.insert(leaked_name, sysconfigdata_def);
+        static SYSCONFIGDATA_NAME: OnceLock<&'static str> = OnceLock::new();
+        let leaked_name =
+            *SYSCONFIGDATA_NAME.get_or_init(|| Box::leak(sysconfigdata_name.into_boxed_str()));
+        all_module_defs.insert(leaked_name, sysconfigdata_def);
     }

@youknowone youknowone force-pushed the multi-phase-module-init branch from 7a38e56 to 203756d Compare January 21, 2026 14:54
@youknowone youknowone marked this pull request as draft January 21, 2026 15:00
@youknowone youknowone force-pushed the multi-phase-module-init branch 2 times, most recently from b25c907 to 22ea6ad Compare January 21, 2026 15:14
@youknowone youknowone marked this pull request as ready for review January 21, 2026 23:31
@youknowone youknowone force-pushed the multi-phase-module-init branch from 22ea6ad to bed12ed Compare January 21, 2026 23:58
@github-actions
Copy link
Contributor

github-actions bot commented Jan 21, 2026

📦 Library Dependencies

The following Lib/ modules were modified. Here are their dependencies:

Click to expand dependency information
[+] lib: cpython/Lib/token.py
soft_deps:
- [ ] token

Legend:

  • [+] path exists in CPython
  • [x] up-to-date, [ ] outdated
  • native: Rust/C extension modules

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/vm/src/vm/mod.rs (1)

1339-1339: Update test to use builder pattern for consistency.

test_nested_frozen (line 1314) uses Interpreter::builder(), but frozen_origname_matches (line 1339) still uses Interpreter::with_init(). Update the latter to match the builder pattern for consistency with the rest of the test suite.

🤖 Fix all issues with AI agents
In `@crates/stdlib/src/_sqlite3.rs`:
- Around line 847-867: The call to __module_exec(vm, module) in module_exec
currently discards its Result; change the invocation to propagate errors (e.g.,
use __module_exec(vm, module)? or otherwise handle its PyResult) so that any
error returned by __module_exec is forwarded from module_exec; ensure
module_exec still returns PyResult<()> and that subsequent initialization
(ERROR_CODES loop, setup_module_exceptions,
CONVERTERS/ADAPTERS/USER_FUNCTION_EXCEPTION/ENABLE_TRACEBACK setup, and
module.set_attr calls) only runs after successful __module_exec.

In `@crates/vm/src/stdlib/ctypes.rs`:
- Around line 1281-1312: The call to __module_exec(vm, module) in module_exec
currently discards its PyResult return value; change the call to propagate
errors by using the ? operator (i.e., call __module_exec(vm, module)?;) so that
module_exec (which returns PyResult<()>) will return any error from
__module_exec instead of silently ignoring it.

In `@crates/vm/src/stdlib/typing.rs`:
- Around line 191-218: The call to __module_exec(vm, module) inside module_exec
currently discards its PyResult, so initialization errors are ignored; change
the call to propagate errors (e.g., use the ? operator) so that module_exec
returns early with the error when __module_exec fails—update the invocation in
the module_exec function to propagate the result rather than ignoring it.

In `@crates/wasm/src/vm_class.rs`:
- Around line 29-35: In module_exec, the result of __module_exec(vm, module) is
ignored so errors don't propagate; change the call to use the ? operator (i.e.,
__module_exec(vm, module)?;) so any PyErr returned by __module_exec is
propagated, then proceed to extend_module! and return Ok(()) from module_exec;
this ensures module initialization failures are handled consistently.
♻️ Duplicate comments (2)
crates/vm/src/stdlib/imp.rs (1)

143-149: Clean up sys.modules when exec slot fails.

This was flagged in a previous review: if def.slots.exec returns an error on line 148, the module inserted at line 144 remains in sys.modules, leaving a partially initialized module cached. This can cause issues with subsequent imports.

🔧 Suggested fix
             // Add to sys.modules BEFORE exec (critical for circular import handling)
             sys_modules.set_item(&*name, module.clone().into(), vm)?;

             // Phase 2: Call exec slot (can safely import other modules now)
             if let Some(exec) = def.slots.exec {
-                exec(vm, &module)?;
+                if let Err(err) = exec(vm, &module) {
+                    let _ = sys_modules.del_item(&*name, vm);
+                    return Err(err);
+                }
             }
crates/vm/src/stdlib/mod.rs (1)

96-102: Potential duplicate posix module registration on WASI targets.

The cfg conditions may still overlap for WASI:

  • Line 96-97: #[cfg(any(unix, target_os = "wasi"))] matches WASI
  • Lines 98-101: #[cfg(all(any(not(target_arch = "wasm32"), target_os = "wasi"), not(any(unix, windows))))] also matches WASI since target_os = "wasi" satisfies the first part and WASI is neither unix nor windows

This could cause posix::module_def(ctx) to be included twice in the vector for WASI builds.

#!/bin/bash
# Verify the cfg conditions for WASI target
# Check if both conditions can be true simultaneously for wasi

echo "Checking posix module cfg conditions in mod.rs..."
rg -n "posix::module_def" crates/vm/src/stdlib/mod.rs -B2 -A1

echo ""
echo "Checking how other files handle WASI posix module:"
rg -n "target_os.*wasi" crates/vm/src/stdlib/mod.rs
🧹 Nitpick comments (4)
crates/vm/src/vm/interpreter.rs (1)

438-474: Consider tracking the FIXME about duplicate frozen core_modules.
If you’d like, I can help draft a follow-up to avoid double-generation while keeping without_stdlib() and init_stdlib() consistent.

crates/derive-impl/src/pymodule.rs (1)

56-90: Consider guarding def.slots.exec assignment or documenting that module_exec must not be cfg-gated.

The detection at lines 86-90 sets has_module_exec = true based on the function name alone, without checking func.attrs for #[cfg] guards. While no cfg-gated module_exec functions currently exist in the codebase (all use conditional code inside the function body instead), the implementation remains fragile: if a user adds a top-level #[cfg]-gated module_exec, the fallback suppression (line 149) combined with the unconditional reference at line 142 would cause silent build failures on excluded configurations.

Prevent this by either checking cfg attributes during detection, conditionally wrapping the def.slots.exec assignment, or documenting that module_exec cannot be cfg-gated at the function level.

crates/vm/src/stdlib/os.rs (1)

1900-1924: Consider error propagation in module_exec.

The function uses multiple unwrap() calls (lines 1906, 1908, 1911, 1914) that could panic during module initialization if get_attr or add fails. While the current callers in posix.rs and nt.rs don't handle errors from this function, propagating errors would be more consistent with the pattern in errno.rs which uses ? for error handling.

Given the callers currently ignore the result anyway (they just call os::module_exec(vm, module);), this is a lower-priority refactor that could be addressed in a follow-up.

♻️ Suggested change for error propagation
-pub fn module_exec(vm: &VirtualMachine, module: &Py<PyModule>) {
+pub fn module_exec(vm: &VirtualMachine, module: &Py<PyModule>) -> PyResult<()> {
     let support_funcs = _os::support_funcs();
     let supports_fd = PySet::default().into_ref(&vm.ctx);
     let supports_dir_fd = PySet::default().into_ref(&vm.ctx);
     let supports_follow_symlinks = PySet::default().into_ref(&vm.ctx);
     for support in support_funcs {
-        let func_obj = module.get_attr(support.name, vm).unwrap();
+        let func_obj = module.get_attr(support.name, vm)?;
         if support.fd.unwrap_or(false) {
-            supports_fd.clone().add(func_obj.clone(), vm).unwrap();
+            supports_fd.clone().add(func_obj.clone(), vm)?;
         }
         if support.dir_fd.unwrap_or(false) {
-            supports_dir_fd.clone().add(func_obj.clone(), vm).unwrap();
+            supports_dir_fd.clone().add(func_obj.clone(), vm)?;
         }
         if support.follow_symlinks.unwrap_or(false) {
-            supports_follow_symlinks.clone().add(func_obj, vm).unwrap();
+            supports_follow_symlinks.clone().add(func_obj, vm)?;
         }
     }
     // ... extend_module! ...
+    Ok(())
 }

This would require updating callers in posix.rs, nt.rs, and posix_compat.rs to propagate the error.

crates/vm/src/vm/mod.rs (1)

231-232: Minor grammar issue in error message.

Consider: "rustpython_pylib may not be set" instead of "maybe not set" for grammatical correctness.

Suggested fix
             let guide_message = if cfg!(feature = "freeze-stdlib") {
-                "`rustpython_pylib` maybe not set while using `freeze-stdlib` feature. Try using `rustpython::InterpreterBuilder::init_stdlib` or manually call `builder.add_frozen_modules(rustpython_pylib::FROZEN_STDLIB)` in `rustpython_vm::Interpreter::builder()`."
+                "`rustpython_pylib` may not be set while using `freeze-stdlib` feature. Try using `rustpython::InterpreterBuilder::init_stdlib` or manually call `builder.add_frozen_modules(rustpython_pylib::FROZEN_STDLIB)` in `rustpython_vm::Interpreter::builder()`."

Comment on lines +847 to 867
pub(crate) fn module_exec(vm: &VirtualMachine, module: &Py<PyModule>) -> PyResult<()> {
__module_exec(vm, module);

for (name, code) in ERROR_CODES {
let name = vm.ctx.intern_str(*name);
let code = vm.new_pyobj(*code);
module.set_attr(name, code, vm).unwrap();
module.set_attr(name, code, vm)?;
}

setup_module_exceptions(module, vm);
setup_module_exceptions(module.as_object(), vm);

let _ = CONVERTERS.set(vm.ctx.new_dict());
let _ = ADAPTERS.set(vm.ctx.new_dict());
let _ = USER_FUNCTION_EXCEPTION.set(PyAtomicRef::from(None));
let _ = ENABLE_TRACEBACK.set(Radium::new(false));

module
.set_attr("converters", converters().to_owned(), vm)
.unwrap();
module
.set_attr("adapters", adapters().to_owned(), vm)
.unwrap();
module.set_attr("converters", converters().to_owned(), vm)?;
module.set_attr("adapters", adapters().to_owned(), vm)?;

Ok(())
}
Copy link
Contributor

@coderabbitai coderabbitai bot Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Return value of __module_exec is discarded; otherwise LGTM.

The module_exec function properly:

  • Sets up error codes with ? propagation (line 853)
  • Configures exceptions via setup_module_exceptions
  • Initializes converters/adapters
  • Uses ? for set_attr calls (lines 863-864)

However, line 848 discards the return value of __module_exec(vm, module). Consider propagating any potential errors.

🔧 Suggested fix
     pub(crate) fn module_exec(vm: &VirtualMachine, module: &Py<PyModule>) -> PyResult<()> {
-        __module_exec(vm, module);
+        __module_exec(vm, module)?;
 
         for (name, code) in ERROR_CODES {
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
pub(crate) fn module_exec(vm: &VirtualMachine, module: &Py<PyModule>) -> PyResult<()> {
__module_exec(vm, module);
for (name, code) in ERROR_CODES {
let name = vm.ctx.intern_str(*name);
let code = vm.new_pyobj(*code);
module.set_attr(name, code, vm).unwrap();
module.set_attr(name, code, vm)?;
}
setup_module_exceptions(module, vm);
setup_module_exceptions(module.as_object(), vm);
let _ = CONVERTERS.set(vm.ctx.new_dict());
let _ = ADAPTERS.set(vm.ctx.new_dict());
let _ = USER_FUNCTION_EXCEPTION.set(PyAtomicRef::from(None));
let _ = ENABLE_TRACEBACK.set(Radium::new(false));
module
.set_attr("converters", converters().to_owned(), vm)
.unwrap();
module
.set_attr("adapters", adapters().to_owned(), vm)
.unwrap();
module.set_attr("converters", converters().to_owned(), vm)?;
module.set_attr("adapters", adapters().to_owned(), vm)?;
Ok(())
}
pub(crate) fn module_exec(vm: &VirtualMachine, module: &Py<PyModule>) -> PyResult<()> {
__module_exec(vm, module)?;
for (name, code) in ERROR_CODES {
let name = vm.ctx.intern_str(*name);
let code = vm.new_pyobj(*code);
module.set_attr(name, code, vm)?;
}
setup_module_exceptions(module.as_object(), vm);
let _ = CONVERTERS.set(vm.ctx.new_dict());
let _ = ADAPTERS.set(vm.ctx.new_dict());
let _ = USER_FUNCTION_EXCEPTION.set(PyAtomicRef::from(None));
let _ = ENABLE_TRACEBACK.set(Radium::new(false));
module.set_attr("converters", converters().to_owned(), vm)?;
module.set_attr("adapters", adapters().to_owned(), vm)?;
Ok(())
}
🤖 Prompt for AI Agents
In `@crates/stdlib/src/_sqlite3.rs` around lines 847 - 867, The call to
__module_exec(vm, module) in module_exec currently discards its Result; change
the invocation to propagate errors (e.g., use __module_exec(vm, module)? or
otherwise handle its PyResult) so that any error returned by __module_exec is
forwarded from module_exec; ensure module_exec still returns PyResult<()> and
that subsequent initialization (ERROR_CODES loop, setup_module_exceptions,
CONVERTERS/ADAPTERS/USER_FUNCTION_EXCEPTION/ENABLE_TRACEBACK setup, and
module.set_attr calls) only runs after successful __module_exec.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

__module_exec returns ()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@youknowone, thank you for the clarification! You're correct—if __module_exec returns () (unit type) rather than PyResult<()>, then the current code is correct as written. The call on line 848 doesn't need the ? operator since there's no Result to propagate.

The module_exec wrapper properly:

  1. Invokes the infallible __module_exec(vm, module)
  2. Performs additional fallible setup (error code registration, module attributes) with proper ? propagation
  3. Returns PyResult<()> to handle any errors from the subsequent operations

The implementation correctly separates the auto-generated initialization (__module_exec) from the module-specific setup that can fail.

Comment on lines +191 to +218
pub(crate) fn module_exec(
vm: &VirtualMachine,
module: &Py<crate::builtins::PyModule>,
) -> PyResult<()> {
__module_exec(vm, module);

use super::{Generic, ParamSpec, ParamSpecArgs, ParamSpecKwargs, TypeVar, TypeVarTuple};

TypeVar::make_class(&vm.ctx);
ParamSpec::make_class(&vm.ctx);
TypeVarTuple::make_class(&vm.ctx);
ParamSpecArgs::make_class(&vm.ctx);
ParamSpecKwargs::make_class(&vm.ctx);
Generic::make_class(&vm.ctx);

extend_module!(vm, module, {
"NoDefault" => vm.ctx.typing_no_default.clone(),
"TypeVar" => TypeVar::class(&vm.ctx).to_owned(),
"ParamSpec" => ParamSpec::class(&vm.ctx).to_owned(),
"TypeVarTuple" => TypeVarTuple::class(&vm.ctx).to_owned(),
"ParamSpecArgs" => ParamSpecArgs::class(&vm.ctx).to_owned(),
"ParamSpecKwargs" => ParamSpecKwargs::class(&vm.ctx).to_owned(),
"Generic" => Generic::class(&vm.ctx).to_owned(),
"Union" => vm.ctx.types.union_type.to_owned(),
});

Ok(())
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Return value of __module_exec is discarded.

Similar to other modules, __module_exec(vm, module) on line 195 doesn't propagate its result. If the macro-generated initialization fails, the error will be silently ignored.

🔧 Suggested fix
     pub(crate) fn module_exec(
         vm: &VirtualMachine,
         module: &Py<crate::builtins::PyModule>,
     ) -> PyResult<()> {
-        __module_exec(vm, module);
+        __module_exec(vm, module)?;
 
         use super::{Generic, ParamSpec, ParamSpecArgs, ParamSpecKwargs, TypeVar, TypeVarTuple};
🤖 Prompt for AI Agents
In `@crates/vm/src/stdlib/typing.rs` around lines 191 - 218, The call to
__module_exec(vm, module) inside module_exec currently discards its PyResult, so
initialization errors are ignored; change the call to propagate errors (e.g.,
use the ? operator) so that module_exec returns early with the error when
__module_exec fails—update the invocation in the module_exec function to
propagate the result rather than ignoring it.

@youknowone youknowone force-pushed the multi-phase-module-init branch from bed12ed to 04a9da3 Compare January 22, 2026 00:31
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/vm/src/stdlib/os.rs (1)

1900-1923: Replace unwrap() calls in os::module_exec with ? operator; change return type to PyResult<()> and update callers.

The current module_exec function panics on get_attr() or add() failures (OOM, missing attributes). Change the return type to PyResult<()>, replace the four unwrap() calls with ? operators, and update the three callers in posix.rs, nt.rs, and posix_compat.rs to use ? when calling super::super::os::module_exec(vm, module).

🔧 Proposed fix
-pub fn module_exec(vm: &VirtualMachine, module: &Py<PyModule>) {
+pub fn module_exec(vm: &VirtualMachine, module: &Py<PyModule>) -> PyResult<()> {
     let support_funcs = _os::support_funcs();
     let supports_fd = PySet::default().into_ref(&vm.ctx);
     let supports_dir_fd = PySet::default().into_ref(&vm.ctx);
     let supports_follow_symlinks = PySet::default().into_ref(&vm.ctx);
     for support in support_funcs {
-        let func_obj = module.get_attr(support.name, vm).unwrap();
+        let func_obj = module.get_attr(support.name, vm)?;
         if support.fd.unwrap_or(false) {
-            supports_fd.clone().add(func_obj.clone(), vm).unwrap();
+            supports_fd.clone().add(func_obj.clone(), vm)?;
         }
         if support.dir_fd.unwrap_or(false) {
-            supports_dir_fd.clone().add(func_obj.clone(), vm).unwrap();
+            supports_dir_fd.clone().add(func_obj.clone(), vm)?;
         }
         if support.follow_symlinks.unwrap_or(false) {
-            supports_follow_symlinks.clone().add(func_obj, vm).unwrap();
+            supports_follow_symlinks.clone().add(func_obj, vm)?;
         }
     }

     extend_module!(vm, module, {
         "supports_fd" => supports_fd,
         "supports_dir_fd" => supports_dir_fd,
         "supports_follow_symlinks" => supports_follow_symlinks,
         "error" => vm.ctx.exceptions.os_error.to_owned(),
     });
+    Ok(())
 }

Update callers:

  • posix.rs: super::super::os::module_exec(vm, module)?;
  • nt.rs: super::super::os::module_exec(vm, module)?;
  • posix_compat.rs: super::super::os::module_exec(vm, module)?;

This aligns with the pattern used in other stdlib modules (sysconfigdata, errno, io, signal) that return PyResult<()> and propagate errors rather than panicking during initialization.

🤖 Fix all issues with AI agents
In `@crates/derive-impl/src/pymodule.rs`:
- Around line 85-90: The derive currently marks context.has_module_exec = true
when it sees an Item::Fn named module_exec but doesn't handle cfg-gated
declarations, which leads to emitting def.slots.exec = Some(module_exec)
unconditionally; update the visitor in pymodule.rs to detect if the
function/item named module_exec has any #[cfg(...)] attributes (inspect
func.attrs for a Meta path "cfg") and, if so, emit a compile_error (or return a
syn::Error) rejecting cfg‑gated module_exec with a clear message that the cfg
must be moved to the containing module or a non-gated fallback provided, instead
of setting context.has_module_exec = true and generating def.slots.exec =
Some(module_exec).

In `@crates/stdlib/src/pyexpat.rs`:
- Around line 53-65: The submodules "model" and "errors" created in module_exec
(variables model and errors) are attached to the pyexpat module but not
registered in sys.modules with their qualified names; update module_exec to
insert entries for "pyexpat.model" and "pyexpat.errors" into the VM's
sys.modules mapping after creating the submodules and before returning (use
vm.sys_module or the VM API to access sys.modules), ensuring the Py<PyModule>
values for model and errors are stored under those qualified keys so imports and
introspection that expect sys.modules["pyexpat.model"] /
sys.modules["pyexpat.errors"] will work as in CPython.

In `@crates/vm/src/stdlib/ast/python.rs`:
- Around line 100-107: The call to __module_exec(...) currently ignores its
PyResult, letting initialization errors be dropped; change module_exec to
propagate the error from __module_exec (e.g. use the ? operator or return on
Err) before calling super::super::pyast::extend_module_nodes so that failures in
__module_exec are returned as Err(PyErr) instead of always returning Ok(()).
Ensure you reference the existing functions __module_exec and module_exec and
preserve the subsequent call to extend_module_nodes only after successful
__module_exec.

In `@crates/vm/src/stdlib/sys.rs`:
- Around line 168-174: The returned tuple from builtin_module_names may contain
duplicate "sys" or "builtins" entries; in the builtin_module_names function,
after collecting and pushing module names (module_names) and sorting, remove
duplicates (e.g., dedupe the Vec<String> in-place) before converting to a
PyTupleRef so the exported tuple contains unique module names only.
♻️ Duplicate comments (5)
crates/wasm/src/vm_class.rs (1)

24-35: Propagate __module_exec errors (already noted earlier).
Line 30 ignores the PyResult from __module_exec, so init errors won’t propagate.

crates/vm/src/stdlib/typing.rs (1)

191-218: Propagate __module_exec errors.

Same issue: __module_exec(vm, module); discards errors; use ? to avoid silent init failures.

crates/vm/src/stdlib/ctypes.rs (1)

1281-1311: Propagate __module_exec errors.

Same issue: ignoring the Result hides init failures.

crates/vm/src/stdlib/imp.rs (1)

122-149: Remove sys.modules entry if exec fails.

If the exec slot errors, a partially initialized module remains cached. Clean it up before propagating the error to match CPython’s behavior.

🔧 Proposed fix
-            if let Some(exec) = def.slots.exec {
-                exec(vm, &module)?;
-            }
+            if let Some(exec) = def.slots.exec {
+                if let Err(err) = exec(vm, &module) {
+                    let _ = sys_modules.del_item(&*name, vm);
+                    return Err(err);
+                }
+            }
crates/vm/src/stdlib/mod.rs (1)

96-102: Avoid duplicate posix module registration on WASI.

Both cfgs can match for target_os = "wasi", causing posix::module_def(ctx) to be included twice. Consider narrowing the first entry to unix-only (or remove one entry).

🛠 Suggested fix
-        #[cfg(any(unix, target_os = "wasi"))]
+        #[cfg(unix)]
         posix::module_def(ctx),

Comment on lines +85 to +90
// Check if module_exec function is already defined
if let Item::Fn(func) = item
&& func.sig.ident == "module_exec"
{
context.has_module_exec = true;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's explore the file structure and locate the file
find . -name "pymodule.rs" -type f

Repository: RustPython/RustPython

Length of output: 101


🏁 Script executed:

# Read the context around lines 85-90 in the file to understand the code
cat -n crates/derive-impl/src/pymodule.rs | sed -n '75,100p'

Repository: RustPython/RustPython

Length of output: 1226


🏁 Script executed:

# Search for any cfg-gated module_exec definitions in the entire codebase
rg --type=rust -B5 'fn\s+module_exec' --multiline-dotted

Repository: RustPython/RustPython

Length of output: 519


🏁 Script executed:

# Search for where def.slots.exec is set or assigned
rg --type=rust 'def\.slots\.exec|slots\.exec\s*=' -B2 -A2

Repository: RustPython/RustPython

Length of output: 752


🏁 Script executed:

# Search for module_exec with surrounding context, checking for #[cfg(...)]
rg --type=rust -B10 'fn\s+module_exec' | head -50

Repository: RustPython/RustPython

Length of output: 2866


🏁 Script executed:

# Search specifically for cfg-gated module_exec
rg --type=rust '#\[cfg' -A10 | rg -A10 'module_exec'

Repository: RustPython/RustPython

Length of output: 2829


🏁 Script executed:

# Find where context.has_module_exec is used
rg --type=rust 'has_module_exec' -B2 -A2

Repository: RustPython/RustPython

Length of output: 1176


🏁 Script executed:

# Find where def.slots.exec = Some(module_exec) is generated with more context
rg --type=rust 'def\.slots\.exec\s*=\s*Some' -B10 -A5 crates/derive-impl/src/pymodule.rs

Repository: RustPython/RustPython

Length of output: 769


🏁 Script executed:

# Also search for where the module definition is generated to understand full flow
rg --type=rust 'gen_module_def|ModuleDef' crates/derive-impl/src/pymodule.rs | head -20

Repository: RustPython/RustPython

Length of output: 293


🏁 Script executed:

# Search for any actual cfg-gated module_exec definitions that might have caused issues
rg --type=rust -B3 '#\[cfg' -A3 | rg -B3 -A3 'fn\s+module_exec'

Repository: RustPython/RustPython

Length of output: 1309


🏁 Script executed:

# Read the full context around where has_module_exec is used to understand the code generation
cat -n crates/derive-impl/src/pymodule.rs | sed -n '110,150p'

Repository: RustPython/RustPython

Length of output: 1913


🏁 Script executed:

# Check if there's any conditional generation based on has_module_exec
rg --type=rust 'has_module_exec' -B5 -A10 crates/derive-impl/src/pymodule.rs

Repository: RustPython/RustPython

Length of output: 1806


Reject cfg‑gated module_exec to prevent build failures.

When a user defines module_exec with a #[cfg(...)] attribute, the code checks for its presence (line 87) and sets context.has_module_exec = true. However, the code generation at line 142 unconditionally generates def.slots.exec = Some(module_exec); regardless of whether module_exec is cfg-gated. In builds where the cfg condition is false, the function won't exist but the code still references it, causing a hard compile error.

The suggested diagnostic check is appropriate: detect cfg-gated module_exec and emit an error, directing users to either move the cfg to the containing module or provide an explicit fallback.

🤖 Prompt for AI Agents
In `@crates/derive-impl/src/pymodule.rs` around lines 85 - 90, The derive
currently marks context.has_module_exec = true when it sees an Item::Fn named
module_exec but doesn't handle cfg-gated declarations, which leads to emitting
def.slots.exec = Some(module_exec) unconditionally; update the visitor in
pymodule.rs to detect if the function/item named module_exec has any #[cfg(...)]
attributes (inspect func.attrs for a Meta path "cfg") and, if so, emit a
compile_error (or return a syn::Error) rejecting cfg‑gated module_exec with a
clear message that the cfg must be moved to the containing module or a non-gated
fallback provided, instead of setting context.has_module_exec = true and
generating def.slots.exec = Some(module_exec).

Comment on lines +53 to +65
pub(crate) fn module_exec(vm: &VirtualMachine, module: &Py<PyModule>) -> PyResult<()> {
__module_exec(vm, module);

// Add submodules
let model = super::_model::module_def(&vm.ctx).create_module(vm)?;
let errors = super::_errors::module_def(&vm.ctx).create_module(vm)?;

extend_module!(vm, module, {
"model" => model,
"errors" => errors,
});

Ok(())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, locate the file and understand the current implementation
find . -name "pyexpat.rs" -type f

Repository: RustPython/RustPython

Length of output: 95


🏁 Script executed:

# Search for how other modules handle submodule registration in RustPython
fd -e rs -type f | xargs grep -l "sys_modules\|sys\.modules" | head -20

Repository: RustPython/RustPython

Length of output: 235


🏁 Script executed:

# Look for any tests that use pyexpat.model or pyexpat.errors
rg -n "pyexpat\.(model|errors)" --type rs

Repository: RustPython/RustPython

Length of output: 91


🏁 Script executed:

# Check for existing submodule patterns in the codebase
rg -n "create_module" --type rs -A 5 | head -60

Repository: RustPython/RustPython

Length of output: 91


🏁 Script executed:

# Look at the _model and _errors module definitions
fd -e rs -type f -path "*pyexpat*" | xargs ls -la

Repository: RustPython/RustPython

Length of output: 3391


🏁 Script executed:

# Read the pyexpat.rs file
cat -n crates/stdlib/src/pyexpat.rs | head -100

Repository: RustPython/RustPython

Length of output: 4159


🏁 Script executed:

# Search for sys_modules usage patterns
rg -n "sys_modules\|sys\.modules" | head -20

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Search for create_module calls to see submodule patterns
rg -n "create_module" -A 3 | head -40

Repository: RustPython/RustPython

Length of output: 2163


🏁 Script executed:

# Look for pyexpat tests
find . -name "*test*" -o -name "*tests*" | xargs grep -l "pyexpat" 2>/dev/null | head -5

Repository: RustPython/RustPython

Length of output: 143


🏁 Script executed:

# Search for how other stdlib modules register submodules
rg -n "extend_module" -B 5 -A 5 | head -80

Repository: RustPython/RustPython

Length of output: 5217


🏁 Script executed:

# Check the pyexpat test file to see how submodules are used
cat Lib/test/test_pyexpat.py | head -150

Repository: RustPython/RustPython

Length of output: 4977


🏁 Script executed:

# Search for model/errors attribute access in test files
rg -n "pyexpat\.(model|errors)" Lib/

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Look at how create_module is implemented
rg -n "fn create_module" -A 10

Repository: RustPython/RustPython

Length of output: 980


🏁 Script executed:

# Check other stdlib modules that have submodules for comparison
find crates/stdlib/src -name "*.rs" -type f | head -20

Repository: RustPython/RustPython

Length of output: 699


🏁 Script executed:

# Look for other submodule patterns - search for modules that might have nested structure
rg -n "module_exec.*PyResult" -A 15 crates/stdlib/src/ | head -100

Repository: RustPython/RustPython

Length of output: 6065


🏁 Script executed:

# Search for any import statements in tests that use qualified submodule syntax
rg -n "import.*pyexpat\." Lib/

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Check if there are any tests that try to import pyexpat.model or pyexpat.errors
rg -n "import pyexpat" Lib/test/ -A 5

Repository: RustPython/RustPython

Length of output: 344


🏁 Script executed:

# Look at the _model and _errors submodule definitions in pyexpat
find crates/stdlib/src -name "*model*" -o -name "*errors*" | grep pyexpat

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Search for how other modules with submodules handle sys.modules registration
rg -n "__name__.*=.*\|sys_modules.*set" crates/stdlib/src/ | head -30

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Look at the extend_module macro implementation more closely
cat -n crates/vm/src/macros.rs | head -20

Repository: RustPython/RustPython

Length of output: 1051


🏁 Script executed:

# Look at how __module_set_attr is implemented
rg -n "__module_set_attr" crates/vm/src/ -A 5 | head -40

Repository: RustPython/RustPython

Length of output: 1128


🏁 Script executed:

# Search for how modules get registered in sys.modules
rg -n "sys_modules\|sys\.modules" crates/vm/src/builtins/module.rs -B 3 -A 3

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Check if there are any tests that specifically test pyexpat module attributes or imports
cat Lib/test/test_pyexpat.py | grep -n "expat\.\|pyexpat\." | head -20

Repository: RustPython/RustPython

Length of output: 1229


🏁 Script executed:

# Look at the module.rs file to understand module creation and sys.modules registration
cat -n crates/vm/src/builtins/module.rs | head -80

Repository: RustPython/RustPython

Length of output: 3343


🏁 Script executed:

# Search for how PyModule::from_def works and if it sets __name__ automatically
rg -n "from_def\|PyModule::new" crates/vm/src/builtins/module.rs -A 10 | head -60

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Look at how PyModule::from_def works and what __name__ is set to
rg -n "from_def\|__init_dict_from_def" crates/vm/src/builtins/module.rs -A 20 | head -80

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Check how the pymodule macro initializes module __name__
rg -n "#\[pymodule" crates/stdlib/src/pyexpat.rs -B 2 -A 10

Repository: RustPython/RustPython

Length of output: 638


🏁 Script executed:

# Look for how __name__ is set on modules during creation
rg -n "__name__\|\"__name__\"" crates/vm/src/builtins/module.rs -B 2 -A 2 | head -40

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Test if qualified imports actually work by looking at import system
rg -n "import_submodule\|find_submodule" crates/vm/src/ | head -20

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Check how CPython's pyexpat.errors and pyexpat.model are used
rg -n "expat\.errors\|expat\.model" Lib/test/test_pyexpat.py

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Look at how RustPython imports work, especially for submodules
find crates/vm/src -name "*import*" -type f

Repository: RustPython/RustPython

Length of output: 88


🏁 Script executed:

# Check the import implementation
rg -n "def import_module\|import_from\|__import__" crates/vm/src/ | head -30

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Look for how the import machinery handles parent.child imports
rg -n "parent.*module\|submodule" crates/vm/src/stdlib/ | head -20

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Check if there are any tests that test qualified submodule imports
find Lib/test -name "*.py" -exec grep -l "import.*\\..*" {} \; 2>/dev/null | head -5

Repository: RustPython/RustPython

Length of output: 195


🏁 Script executed:

# Search the codebase for examples of modules with submodules that ARE registered in sys.modules
rg -n "set_item.*sys" crates/stdlib/src/ | head -20

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Look at how json module handles its submodules (as json has encoder/decoder)
grep -rn "json" crates/stdlib/src/ | grep module_exec | head -5

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Look at the import system implementation
cat -n crates/vm/src/import.rs | head -200

Repository: RustPython/RustPython

Length of output: 8431


🏁 Script executed:

# Check if there are any package/namespace tests
rg -n "import.*\\..*as\|from.*\\..*import" Lib/test/ | grep -v "Binary\|\.pyc" | head -20

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Look specifically for how the pymodule macro generates __name__
rg -n "pymodule\|__name__" crates/vm/src/py_serde.rs | head -40

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Check the implementation of module initialization in the macro
find crates -name "*.rs" -exec grep -l "pymodule" {} \; | head -5

Repository: RustPython/RustPython

Length of output: 3529


🏁 Script executed:

# Look at a stdlib module that imports submodules to see pattern
rg -n "module_exec\|__module_exec" crates/stdlib/src/json/ -A 10 | head -60

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Check the pymodule macro implementation to understand how __name__ is set
cat -n crates/derive-impl/src/pymodule.rs | head -150

Repository: RustPython/RustPython

Length of output: 6221


🏁 Script executed:

# Look for any tests that import qualified submodules
rg -n "from pyexpat import\|import pyexpat\." Lib/test/

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Check Python standard library behavior for pyexpat submodules
python3 -c "import pyexpat; print('model' in dir(pyexpat)); import pyexpat.errors; print(pyexpat.errors.__name__)" 2>&1 || true

Repository: RustPython/RustPython

Length of output: 151


🏁 Script executed:

# Check if there's documentation or tests about how RustPython handles submodule imports
rg -n "submodule\|parent.*module" Lib/test/ | head -20

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Look at what __name__ is set to for the submodules created via create_module
rg -n "PyModule.*__name__\|set.*__name__" crates/vm/src/builtins/module.rs -B 2 -A 2

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Let me check if there are any tests specifically for pyexpat.model or pyexpat.errors
cat Lib/test/test_pyexpat.py | grep -n "\.model\|\.errors" | head -20

Repository: RustPython/RustPython

Length of output: 127


🏁 Script executed:

# Check what happens with qualified module imports in RustPython
rg -n "import.*\\..*import" Lib/test/test_import*.py | head -20

Repository: RustPython/RustPython

Length of output: 129


🏁 Script executed:

# Look at how modules set their __name__ attribute during creation
cat -n crates/vm/src/builtins/module.rs | sed -n '1,200p' | tail -100

Repository: RustPython/RustPython

Length of output: 4025


🏁 Script executed:

# Check if there's anything about parent module setup in module creation
rg -n "__package__\|__path__" crates/vm/src/builtins/module.rs

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Check if there are any tests or code that use import pyexpat.model or pyexpat.errors
rg -n "import.*pyexpat\." Lib/test/test_pyexpat.py

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Look at the test more carefully to see if qualified imports are tested
cat -n Lib/test/test_pyexpat.py | sed -n '1,50p'

Repository: RustPython/RustPython

Length of output: 2047


🏁 Script executed:

# Check how create_module is implemented to understand what __name__ gets set
cat -n crates/vm/src/builtins/module.rs | sed -n '44,75p'

Repository: RustPython/RustPython

Length of output: 1461


🏁 Script executed:

# Verify if vm.sys_module.get_attr is available and the set_item API works
rg -n "set_item\|get_attr" crates/stdlib/src/pyexpat.rs

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Look for similar patterns in other modules to see if anyone else registers submodules
rg -n "sys_modules.*set_item" crates/stdlib/src/

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Check if test_pyexpat.py or other tests actually try qualified imports like "import pyexpat.errors"
grep -rn "import pyexpat" Lib/test/ | grep -v "from pyexpat"

Repository: RustPython/RustPython

Length of output: 109


🏁 Script executed:

# Look at what the test actually expects - check line 275 more closely
sed -n '270,280p' Lib/test/test_pyexpat.py

Repository: RustPython/RustPython

Length of output: 600


🏁 Script executed:

# Check how the xml.parsers.expat module relates to pyexpat
find . -name "expat.py" -o -name "*expat*" -path "*/xml/parsers/*"

Repository: RustPython/RustPython

Length of output: 91


🏁 Script executed:

# See if there are any imports of pyexpat.model or pyexpat.errors in the entire codebase
rg -rn "pyexpat\.model\|pyexpat\.errors"

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Check the actual behavior - what __name__ does the model submodule have
rg -n "model\|errors" crates/stdlib/src/pyexpat.rs | grep -E "pymodule|module_def"

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Look at the xml.parsers.expat module to understand the relationship
cat -n Lib/xml/parsers/expat.py | head -50

Repository: RustPython/RustPython

Length of output: 368


🏁 Script executed:

# Check if expat.py re-exports or wraps pyexpat
grep -n "pyexpat\|from.*import" Lib/xml/parsers/expat.py | head -30

Repository: RustPython/RustPython

Length of output: 151


🏁 Script executed:

# Look for how sys.modules handles submodule registrations in import tests
find Lib/test -name "*.py" -exec grep -l "sys.modules" {} \; | head -5

Repository: RustPython/RustPython

Length of output: 3453


🏁 Script executed:

# Check if there's any code that tests or uses qualified submodule imports
grep -rn "sys\.modules\[.*\.\|set_item.*\." Lib/test/ | head -20

Repository: RustPython/RustPython

Length of output: 1996


🏁 Script executed:

# Check how the import system finds submodules - look for the import machinery
grep -n "find.*submodule\|parent.*module" crates/vm/src/import.rs

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# See the full expat.py file to understand how it uses the submodules
cat -n Lib/xml/parsers/expat.py

Repository: RustPython/RustPython

Length of output: 368


🏁 Script executed:

# Check if RustPython's import system is in importlib (frozen)
find . -name "*importlib*" -type f | grep -v ".pyc\|test"

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Test whether qualified imports actually work in RustPython by looking for test cases
rg -n "import.*\\..*\|sys.modules.*\\." Lib/test/test_importlib/ | head -20

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Check if there are any actual tests that try to import pyexpat.model or pyexpat.errors directly
grep -rn "import pyexpat\.model\|import pyexpat\.errors\|from pyexpat import model\|from pyexpat import errors" Lib/

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Look at what attributes are accessible on pyexpat module
cat -n crates/stdlib/src/pyexpat.rs | sed -n '53,75p'

Repository: RustPython/RustPython

Length of output: 917


🏁 Script executed:

# Check if the import system in RustPython handles qualified imports
rg -n "resolve.*parent\|split.*module.*name" crates/vm/src/import.rs | head -20

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Look for how other builtin modules with submodules handle this
find crates/stdlib/src -name "*.rs" -exec grep -l "super::_" {} \; | head -10

Repository: RustPython/RustPython

Length of output: 150


🏁 Script executed:

# Final check - see if there's a test that verifies pyexpat.errors can be imported as a qualified name
rg -rn "pyexpat\.errors\|pyexpat\.model" Lib/test/ 2>/dev/null

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Look for documentation on how Python handles module.submodule imports
grep -rn "__path__" crates/vm/src/builtins/module.rs

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Check if pyexpat should be a package by looking at CPython source patterns
rg -n "is_package\|__path__" crates/stdlib/src/ | head -20

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Verify what __name__ is actually set to for the submodules by checking module creation
cat -n crates/vm/src/builtins/module.rs | sed -n '119,130p'

Repository: RustPython/RustPython

Length of output: 513


🏁 Script executed:

# Check if there's any handling of qualified module imports in RustPython's import system
rg -n "\.split\(.*\.\)" crates/vm/src/import.rs | head -10

Repository: RustPython/RustPython

Length of output: 47


Register pyexpat submodules with qualified names in sys.modules for CPython compatibility.

The current implementation attaches model and errors submodules as attributes but doesn't register them in sys.modules with qualified names (pyexpat.model, pyexpat.errors). This follows the pattern shown in Lib/xml/parsers/expat.py, which explicitly registers submodules in sys.modules. While RustPython's import system may not currently support qualified submodule imports like import pyexpat.model, registering them ensures compatibility with code that expects this behavior and with introspection tools that check sys.modules.

💡 Suggested fix
     pub(crate) fn module_exec(vm: &VirtualMachine, module: &Py<PyModule>) -> PyResult<()> {
         __module_exec(vm, module);

         // Add submodules
-        let model = super::_model::module_def(&vm.ctx).create_module(vm)?;
-        let errors = super::_errors::module_def(&vm.ctx).create_module(vm)?;
+        let sys_modules = vm.sys_module.get_attr("modules", vm)?;
+
+        let model = super::_model::module_def(&vm.ctx).create_module(vm)?;
+        model.set_attr("__name__", vm.ctx.new_str("pyexpat.model"), vm)?;
+        model.set_attr("__package__", vm.ctx.new_str("pyexpat"), vm)?;
+        sys_modules.set_item("pyexpat.model", model.clone().into(), vm)?;
+
+        let errors = super::_errors::module_def(&vm.ctx).create_module(vm)?;
+        errors.set_attr("__name__", vm.ctx.new_str("pyexpat.errors"), vm)?;
+        errors.set_attr("__package__", vm.ctx.new_str("pyexpat"), vm)?;
+        sys_modules.set_item("pyexpat.errors", errors.clone().into(), vm)?;

         extend_module!(vm, module, {
             "model" => model,
             "errors" => errors,
         });
🤖 Prompt for AI Agents
In `@crates/stdlib/src/pyexpat.rs` around lines 53 - 65, The submodules "model"
and "errors" created in module_exec (variables model and errors) are attached to
the pyexpat module but not registered in sys.modules with their qualified names;
update module_exec to insert entries for "pyexpat.model" and "pyexpat.errors"
into the VM's sys.modules mapping after creating the submodules and before
returning (use vm.sys_module or the VM API to access sys.modules), ensuring the
Py<PyModule> values for model and errors are stored under those qualified keys
so imports and introspection that expect sys.modules["pyexpat.model"] /
sys.modules["pyexpat.errors"] will work as in CPython.

Comment on lines 168 to 174
fn builtin_module_names(vm: &VirtualMachine) -> PyTupleRef {
let mut module_names: Vec<_> = vm.state.module_inits.keys().cloned().collect();
module_names.push("sys".into());
module_names.push("builtins".into());
let mut module_names: Vec<String> =
vm.state.module_defs.keys().map(|&s| s.to_owned()).collect();
module_names.push("sys".to_owned());
module_names.push("builtins".to_owned());

module_names.sort();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Deduplicate sys.builtin_module_names entries.

module_defs may already contain sys/builtins, so pushing them unconditionally can introduce duplicates in the exported tuple. Consider deduping after sort to keep the public surface clean.

🩹 Proposed fix
         module_names.sort();
+        module_names.dedup();
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
fn builtin_module_names(vm: &VirtualMachine) -> PyTupleRef {
let mut module_names: Vec<_> = vm.state.module_inits.keys().cloned().collect();
module_names.push("sys".into());
module_names.push("builtins".into());
let mut module_names: Vec<String> =
vm.state.module_defs.keys().map(|&s| s.to_owned()).collect();
module_names.push("sys".to_owned());
module_names.push("builtins".to_owned());
module_names.sort();
fn builtin_module_names(vm: &VirtualMachine) -> PyTupleRef {
let mut module_names: Vec<String> =
vm.state.module_defs.keys().map(|&s| s.to_owned()).collect();
module_names.push("sys".to_owned());
module_names.push("builtins".to_owned());
module_names.sort();
module_names.dedup();
🤖 Prompt for AI Agents
In `@crates/vm/src/stdlib/sys.rs` around lines 168 - 174, The returned tuple from
builtin_module_names may contain duplicate "sys" or "builtins" entries; in the
builtin_module_names function, after collecting and pushing module names
(module_names) and sorting, remove duplicates (e.g., dedupe the Vec<String>
in-place) before converting to a PyTupleRef so the exported tuple contains
unique module names only.

@youknowone youknowone force-pushed the multi-phase-module-init branch from 04a9da3 to 1087d6c Compare January 22, 2026 01:15
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Fix all issues with AI agents
In `@crates/stdlib/src/openssl.rs`:
- Around line 100-108: module_exec currently ignores the PyResult returned by
__module_exec, masking initialization errors; change module_exec to propagate
that result instead of always returning Ok(()). After the existing
LazyLock::force call, call __module_exec(vm, module) and return its PyResult
(either via the ? operator or by returning the result directly) so any error
from __module_exec is forwarded to the caller.

In `@crates/vm/src/builtins/module.rs`:
- Around line 44-81: The create_module implementation must guard against create
slots returning modules with no def and pass a consistent spec object: when
self.slots.create is Some, build and pass a module spec object (not just a
string) compatible with imp.rs::create_builtin (i.e., an object with a name
attribute), then call the create slot; after creation, check the returned
module's def (module.def or module.def.is_some()) and only call
PyModule::__init_dict_from_def and module.__init_methods if the def is present;
otherwise skip those calls to avoid panics for modules created via
PyModule::new().

In `@crates/vm/src/stdlib/errno.rs`:
- Around line 9-22: The module_exec function currently calls __module_exec(vm,
module) and discards its PyResult, which can hide initialization errors; change
the call to propagate errors by using the ? operator (i.e., __module_exec(vm,
module)?;) so module_exec returns early with the underlying PyErr when
__module_exec fails, keeping the function's PyResult return type semantics
intact.

In `@crates/vm/src/stdlib/time.rs`:
- Around line 577-588: The module initialization currently calls
__module_exec(vm, module) but discards its PyResult, masking failures; update
module_exec to propagate errors by returning the result of __module_exec(vm,
module) (i.e., replace the ignored call + Ok(()) with returning __module_exec's
PyResult), preserving the existing unsafe c_tzset call and signature of
module_exec (working with VirtualMachine and Py<crate::builtins::PyModule>).
♻️ Duplicate comments (7)
crates/stdlib/src/pyexpat.rs (1)

42-66: Register pyexpat submodules in sys.modules for qualified imports

The module still attaches model/errors as attributes without registering pyexpat.model/pyexpat.errors in sys.modules, which can break qualified imports and introspection.

crates/vm/src/stdlib/imp.rs (1)

118-149: Remove partially initialized modules on exec failure.
If exec errors, the module stays in sys.modules and may be reused in a bad state.

🛠️ Suggested fix
-            if let Some(exec) = def.slots.exec {
-                exec(vm, &module)?;
-            }
+            if let Some(exec) = def.slots.exec {
+                if let Err(err) = exec(vm, &module) {
+                    let _ = sys_modules.del_item(&*name, vm);
+                    return Err(err);
+                }
+            }
crates/vm/src/stdlib/typing.rs (1)

191-218: Propagate __module_exec failures.
The result is dropped; initialization errors can be silently ignored.

🛠️ Suggested fix
-        __module_exec(vm, module);
+        __module_exec(vm, module)?;
crates/vm/src/stdlib/ctypes.rs (1)

1281-1312: Propagate __module_exec failures.
The result is dropped; initialization errors can be silently ignored.

🛠️ Suggested fix
-        __module_exec(vm, module);
+        __module_exec(vm, module)?;
crates/vm/src/stdlib/sys.rs (1)

168-174: Duplicate: dedupe builtin_module_names after sorting.
module_defs can already include "sys"/"builtins", so the tuple may contain duplicates.

crates/derive-impl/src/pymodule.rs (1)

85-90: Duplicate: reject cfg‑gated module_exec to avoid missing symbol.

crates/vm/src/stdlib/mod.rs (1)

96-102: Duplicate posix module registration on WASI targets.

On WASI targets, both conditions match:

  • #[cfg(any(unix, target_os = "wasi"))] (line 96) matches because target_os = "wasi"
  • #[cfg(all(any(not(target_arch = "wasm32"), target_os = "wasi"), not(any(unix, windows))))] (lines 98-101) also matches because WASI is wasm32 but has target_os = "wasi", and WASI is neither unix nor windows

This results in posix::module_def(ctx) being included twice in the vector for WASI builds, which could cause issues during module registration.

🔧 Suggested fix
         #[cfg(any(unix, target_os = "wasi"))]
         posix::module_def(ctx),
         #[cfg(all(
             any(not(target_arch = "wasm32"), target_os = "wasi"),
-            not(any(unix, windows))
+            not(any(unix, windows, target_os = "wasi"))
         ))]
         posix::module_def(ctx),

Alternatively, remove the WASI-specific entry since it's already covered:

-        #[cfg(any(unix, target_os = "wasi"))]
-        posix::module_def(ctx),
+        #[cfg(unix)]
+        posix::module_def(ctx),
+        #[cfg(target_os = "wasi")]
+        posix::module_def(ctx),
         #[cfg(all(
             any(not(target_arch = "wasm32"), target_os = "wasi"),
             not(any(unix, windows))
         ))]
         posix::module_def(ctx),
🧹 Nitpick comments (2)
src/interpreter.rs (2)

30-37: Potential panic if state has multiple references.

The .unwrap() on PyRc::get_mut will panic if there are other references to vm.state. While this is likely safe during initialization (when init_hook runs), consider using expect() with a descriptive message to aid debugging if assumptions are violated.

-    let state = PyRc::get_mut(&mut vm.state).unwrap();
+    let state = PyRc::get_mut(&mut vm.state)
+        .expect("VM state should be uniquely owned during init_hook");

58-76: Potential panic on non-UTF-8 paths.

The .unwrap() on into_string() (line 64) will panic if BUILDTIME_RUSTPYTHONPATH contains non-UTF-8 characters. While unlikely in controlled build environments, consider using to_string_lossy() or handling the error gracefully:

-            .map(|path| path.into_os_string().into_string().unwrap())
+            .filter_map(|path| path.into_os_string().into_string().ok())

Alternatively, if non-UTF-8 paths should cause a build/runtime failure, document this expectation.

Comment on lines +44 to +81
impl PyModuleDef {
/// Create a module from this definition (Phase 1 of multi-phase init).
///
/// This performs:
/// 1. Create module object (using create slot if provided)
/// 2. Initialize module dict from def
/// 3. Add methods to module
///
/// Does NOT add to sys.modules or call exec slot.
pub fn create_module(&'static self, vm: &VirtualMachine) -> PyResult<PyRef<PyModule>> {
use crate::PyPayload;

// Create module (use create slot if provided, else default creation)
let module = if let Some(create) = self.slots.create {
// Custom module creation
let spec = vm.ctx.new_str(self.name.as_str());
create(vm, spec.as_object(), self)?
} else {
// Default module creation
PyModule::from_def(self).into_ref(&vm.ctx)
};

// Initialize module dict and methods
PyModule::__init_dict_from_def(vm, &module);
module.__init_methods(vm)?;

Ok(module)
}

/// Execute the module's exec slot (Phase 2 of multi-phase init).
///
/// Calls the exec slot if present. Returns Ok(()) if no exec slot.
pub fn exec_module(&'static self, vm: &VirtualMachine, module: &Py<PyModule>) -> PyResult<()> {
if let Some(exec) = self.slots.exec {
exec(vm, module)?;
}
Ok(())
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# Find ModuleCreate implementations and create slot definitions
rg -n "ModuleCreate" --type rs -A 5 -B 2

Repository: RustPython/RustPython

Length of output: 91


🏁 Script executed:

# Find PyModuleSlots definitions with create field
rg -n "struct PyModuleSlots|create:" --type rs -B 2 -A 5

Repository: RustPython/RustPython

Length of output: 91


🏁 Script executed:

# Find __init_dict_from_def implementation to check for unwrap
rg -n "__init_dict_from_def" --type rs -B 2 -A 10

Repository: RustPython/RustPython

Length of output: 91


🏁 Script executed:

# Check what spec parameter is used for in create slots
rg -n "fn.*create.*spec" --type rs -B 2 -A 8

Repository: RustPython/RustPython

Length of output: 91


🏁 Script executed:

# Find ModuleCreate and create slot definitions
rg "ModuleCreate|PyModuleSlots" -A 5 -B 2 -- '*.rs'

Repository: RustPython/RustPython

Length of output: 109


🏁 Script executed:

# Find __init_dict_from_def implementation
rg "__init_dict_from_def" -A 10 -B 2 -- '*.rs'

Repository: RustPython/RustPython

Length of output: 109


🏁 Script executed:

# Look at the module.rs file directly to understand the structure
cat crates/vm/src/builtins/module.rs | head -150

Repository: RustPython/RustPython

Length of output: 4736


🏁 Script executed:

# Find where PyModule is defined and check def field
rg "pub struct PyModule|pub def:" -A 5 -- '*.rs'

Repository: RustPython/RustPython

Length of output: 109


🏁 Script executed:

# Search for actual ModuleCreate implementations or assignments
rg "ModuleCreate" -B 2 -A 8

Repository: RustPython/RustPython

Length of output: 1194


🏁 Script executed:

# Search for create: Some( patterns to find where ModuleCreate is used
rg "create:\s*Some" -B 2 -A 3

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Check if there are any tests or examples of custom create slots
rg "create.*=.*fn" -B 3 -A 5

Repository: RustPython/RustPython

Length of output: 1841


🏁 Script executed:

# Look for where PyModuleSlots is instantiated with a create function
rg "PyModuleSlots\s*\{" -A 5

Repository: RustPython/RustPython

Length of output: 810


🏁 Script executed:

# Search for any module creation patterns outside of the reviewed code
rg "PyModule::from_def|PyModule::new()" -B 3 -A 3

Repository: RustPython/RustPython

Length of output: 1769


🏁 Script executed:

# Check the complete __init_dict_from_def and __init_methods implementations
rg "__init_dict_from_def|__init_methods" -B 2 -A 8

Repository: RustPython/RustPython

Length of output: 5302


🏁 Script executed:

# Search for any actual custom module create implementations
rg "fn.*create.*PyModuleDef" -B 2 -A 10

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Look for PyModule struct definition to see the def and name fields
rg "pub struct PyModule" -A 8

Repository: RustPython/RustPython

Length of output: 2348


🏁 Script executed:

# Check what the type signature requires - look for how modules are created in practice
rg "PyRef::new_ref.*PyModule" -B 3 -A 3

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Look at imp.rs to see if the same pattern exists there
sed -n '1,200p' crates/vm/src/stdlib/imp.rs | grep -A 30 "create_module\|ModuleCreate"

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Search for any definitions or assignments of create slots
rg "create\s*:" --type-list | head -20

Repository: RustPython/RustPython

Length of output: 454


🏁 Script executed:

# Look for files that might contain ModuleCreate implementations
fd -e rs | xargs grep -l "PyModuleSlots"

Repository: RustPython/RustPython

Length of output: 127


🏁 Script executed:

# Check the entire module.rs file around the review area to understand context better
wc -l crates/vm/src/builtins/module.rs

Repository: RustPython/RustPython

Length of output: 101


🏁 Script executed:

# Get the full module.rs file to understand all the context
cat crates/vm/src/builtins/module.rs

Repository: RustPython/RustPython

Length of output: 12814


🏁 Script executed:

# Search for any actual module creation slot implementations or assignments
rg "create\s*=\s*Some" -B 3 -A 5

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Check if there are tests or examples that use custom ModuleCreate slots
rg "fn.*create.*vm.*spec.*def" -B 2 -A 8

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Look at imp.rs to see if the same pattern exists there
sed -n '1,300p' crates/vm/src/stdlib/imp.rs | cat

Repository: RustPython/RustPython

Length of output: 7377


🏁 Script executed:

# Look for any pymodule macro usage that might define custom create slots
rg "pymodule|create.*=\|slots.*=" -B 2 -A 5 | head -100

Repository: RustPython/RustPython

Length of output: 5209


🏁 Script executed:

# Check if there are any integration tests or examples
fd -e rs -path "*/test*" -o -path "*/examples*" | xargs grep -l "create\|ModuleCreate" 2>/dev/null | head -10

Repository: RustPython/RustPython

Length of output: 233


🏁 Script executed:

# Verify the type signature match - what spec parameter looks like in practice
rg "create\(vm\|spec" -B 3 -A 3

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Check if PyModule can be created with def=None
rg "def: None\|PyModule::new()" -B 2 -A 3

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Verify the type signature and what create slot implementations would need to return
rg "PyModule::from_def|from_def" -B 1 -A 1 | head -30

Repository: RustPython/RustPython

Length of output: 1798


🏁 Script executed:

# Look for any tests or debug assertions that use custom create slots
rg "debug_assert.*def\|unwrap.*def" -B 2 -A 2

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Double-check the type signature and what ModuleCreate must return
rg "pub type ModuleCreate|ModuleCreate\s*=" -A 2

Repository: RustPython/RustPython

Length of output: 348


🏁 Script executed:

# Verify __init_methods also has the same unwrap/assert pattern
rg "__init_methods" -A 8 | grep -A 8 "pub fn __init_methods"

Repository: RustPython/RustPython

Length of output: 761


🏁 Script executed:

# Check if there are any other places calling __init_dict_from_def to see the pattern
rg "__init_dict_from_def" -B 3 -A 1

Repository: RustPython/RustPython

Length of output: 1536


Guard create_module against create slots returning modules without def

create_module unconditionally calls __init_dict_from_def and __init_methods, both of which unwrap module.def (with __init_methods also asserting it's Some). If any custom create slot returns a module created via PyModule::new() (which sets def to None), this will panic at runtime. Additionally, the spec argument passed to create slots differs from imp.rs's create_builtin: this function passes a string, while create_builtin passes a spec object expected to have a name attribute. ModuleCreate implementations should be consistent with whichever signature they target.

🐛 Suggested guard to avoid panic
@@
         // Initialize module dict and methods
+        if module.def.is_none() {
+            return Err(vm.new_runtime_error(
+                "module create slot must return a module created from this def".to_owned(),
+            ));
+        }
         PyModule::__init_dict_from_def(vm, &module);
         module.__init_methods(vm)?;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
impl PyModuleDef {
/// Create a module from this definition (Phase 1 of multi-phase init).
///
/// This performs:
/// 1. Create module object (using create slot if provided)
/// 2. Initialize module dict from def
/// 3. Add methods to module
///
/// Does NOT add to sys.modules or call exec slot.
pub fn create_module(&'static self, vm: &VirtualMachine) -> PyResult<PyRef<PyModule>> {
use crate::PyPayload;
// Create module (use create slot if provided, else default creation)
let module = if let Some(create) = self.slots.create {
// Custom module creation
let spec = vm.ctx.new_str(self.name.as_str());
create(vm, spec.as_object(), self)?
} else {
// Default module creation
PyModule::from_def(self).into_ref(&vm.ctx)
};
// Initialize module dict and methods
PyModule::__init_dict_from_def(vm, &module);
module.__init_methods(vm)?;
Ok(module)
}
/// Execute the module's exec slot (Phase 2 of multi-phase init).
///
/// Calls the exec slot if present. Returns Ok(()) if no exec slot.
pub fn exec_module(&'static self, vm: &VirtualMachine, module: &Py<PyModule>) -> PyResult<()> {
if let Some(exec) = self.slots.exec {
exec(vm, module)?;
}
Ok(())
}
impl PyModuleDef {
/// Create a module from this definition (Phase 1 of multi-phase init).
///
/// This performs:
/// 1. Create module object (using create slot if provided)
/// 2. Initialize module dict from def
/// 3. Add methods to module
///
/// Does NOT add to sys.modules or call exec slot.
pub fn create_module(&'static self, vm: &VirtualMachine) -> PyResult<PyRef<PyModule>> {
use crate::PyPayload;
// Create module (use create slot if provided, else default creation)
let module = if let Some(create) = self.slots.create {
// Custom module creation
let spec = vm.ctx.new_str(self.name.as_str());
create(vm, spec.as_object(), self)?
} else {
// Default module creation
PyModule::from_def(self).into_ref(&vm.ctx)
};
// Initialize module dict and methods
if module.def.is_none() {
return Err(vm.new_runtime_error(
"module create slot must return a module created from this def".to_owned(),
));
}
PyModule::__init_dict_from_def(vm, &module);
module.__init_methods(vm)?;
Ok(module)
}
/// Execute the module's exec slot (Phase 2 of multi-phase init).
///
/// Calls the exec slot if present. Returns Ok(()) if no exec slot.
pub fn exec_module(&'static self, vm: &VirtualMachine, module: &Py<PyModule>) -> PyResult<()> {
if let Some(exec) = self.slots.exec {
exec(vm, module)?;
}
Ok(())
}
🤖 Prompt for AI Agents
In `@crates/vm/src/builtins/module.rs` around lines 44 - 81, The create_module
implementation must guard against create slots returning modules with no def and
pass a consistent spec object: when self.slots.create is Some, build and pass a
module spec object (not just a string) compatible with imp.rs::create_builtin
(i.e., an object with a name attribute), then call the create slot; after
creation, check the returned module's def (module.def or module.def.is_some())
and only call PyModule::__init_dict_from_def and module.__init_methods if the
def is present; otherwise skip those calls to avoid panics for modules created
via PyModule::new().

@youknowone youknowone force-pushed the multi-phase-module-init branch from 1087d6c to 2037645 Compare January 22, 2026 02:21
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@crates/vm/src/stdlib/io.rs`:
- Around line 4732-4744: The fileio module is missing a pub(crate) module_exec
function required by the call super::fileio::module_exec(vm, module)?; — add a
minimal wrapper with the signature pub(crate) fn module_exec(vm:
&VirtualMachine, module: &Py<PyModule>) -> PyResult<()> that calls the
autogenerated __module_exec(vm, module) and returns Ok(()), so the call site
compiles on non-WASM platforms; reference the __module_exec symbol inside the
new fileio::module_exec.
♻️ Duplicate comments (7)
crates/vm/src/stdlib/ctypes.rs (1)

1281-1288: Propagate errors from __module_exec.

Line 1287 discards the PyResult return; errors will be silently ignored.

🔧 Suggested fix
-        __module_exec(vm, module);
+        __module_exec(vm, module)?;
crates/stdlib/src/pyexpat.rs (1)

53-63: Register pyexpat.model / pyexpat.errors in sys.modules for qualified imports.

crates/vm/src/stdlib/imp.rs (1)

143-149: Rollback sys.modules if exec fails.

Line 143 inserts the module before exec; if exec errors, the partially initialized module remains cached. Clean it up on failure to avoid stale imports.

🛠️ Proposed fix
-            if let Some(exec) = def.slots.exec {
-                exec(vm, &module)?;
-            }
+            if let Some(exec) = def.slots.exec {
+                if let Err(err) = exec(vm, &module) {
+                    let _ = sys_modules.del_item(&*name, vm);
+                    return Err(err);
+                }
+            }
crates/derive-impl/src/pymodule.rs (1)

85-90: Guard against cfg‑gated module_exec to avoid build breaks.
Line 85–90 marks any module_exec as present, but def.slots.exec = Some(module_exec) is always emitted. If a user #[cfg]‑gates module_exec, builds with the cfg disabled will fail due to an unresolved symbol. This was previously flagged; consider rejecting cfg‑gated module_exec or requiring a non‑gated fallback.

🛠️ Proposed fix
-        if let Item::Fn(func) = item
-            && func.sig.ident == "module_exec"
-        {
-            context.has_module_exec = true;
-        }
+        if let Item::Fn(func) = item
+            && func.sig.ident == "module_exec"
+        {
+            let has_cfg = func.attrs.iter().any(|a| a.path().is_ident("cfg"));
+            if has_cfg {
+                context.errors.push(syn::Error::new_spanned(
+                    &func.sig.ident,
+                    "`module_exec` must not be cfg-gated; move the cfg to the module or provide a non-gated fallback",
+                ));
+            } else {
+                context.has_module_exec = true;
+            }
+        }
#!/bin/bash
# Find cfg-gated module_exec definitions to validate impact.
rg -n --type=rust '#\[cfg[^\]]*\]\s*\n\s*fn\s+module_exec' -C2
crates/vm/src/stdlib/mod.rs (1)

96-102: Avoid duplicate posix::module_def registration on WASI.
Both cfg blocks match on WASI, so posix::module_def(ctx) is added twice. Keep a single entry to prevent double registration.

🛠 Suggested fix
-        #[cfg(any(unix, target_os = "wasi"))]
-        posix::module_def(ctx),
+        #[cfg(unix)]
+        posix::module_def(ctx),
crates/wasm/src/vm_class.rs (1)

29-34: Propagate __module_exec errors.
The return value is ignored, so init failures won’t surface.

🔧 Suggested fix
-        __module_exec(vm, module);
+        __module_exec(vm, module)?;
crates/vm/src/stdlib/sys.rs (1)

168-174: Deduplicate sys.builtin_module_names entries.

module_defs may already contain sys/builtins, so pushing them unconditionally can introduce duplicates in the exported tuple. Add dedup() after sorting to ensure uniqueness.

🩹 Proposed fix
         module_names.sort();
+        module_names.dedup();
🧹 Nitpick comments (2)
crates/vm/src/stdlib/sysconfigdata.rs (1)

14-23: Optional: skip redundant sys.modules overwrite.

At Line 21–23, consider checking if sys.modules already contains the platform-specific key before setting it to avoid unnecessary mutation during reloads.

crates/vm/src/vm/interpreter.rs (1)

130-146: Consider returning global_state from the VM rather than cloning.

The comment on line 131 mentions "Don't clone here" but line 145 does clone vm.state. Since vm.state is already a PyRc<PyGlobalState>, the clone is cheap (just reference count bump). However, the function could simply return vm.state.clone() directly without the intermediate variable for clarity.

♻️ Optional simplification
     vm.initialize();

-    // Clone global_state for Interpreter after all initialization is done
-    let global_state = vm.state.clone();
-    (vm, global_state)
+    (vm, vm.state.clone())
 }

Comment on lines +4732 to +4744
pub(crate) fn module_exec(vm: &VirtualMachine, module: &Py<PyModule>) -> PyResult<()> {
// Call auto-generated initialization first
__module_exec(vm, module);

// Initialize FileIO types on non-WASM platforms
#[cfg(any(not(target_arch = "wasm32"), target_os = "wasi"))]
super::fileio::module_exec(vm, module)?;

let unsupported_operation = unsupported_operation().to_owned();
extend_module!(vm, module, {
"UnsupportedOperation" => unsupported_operation,
"BlockingIOError" => vm.ctx.exceptions.blocking_io_error.to_owned(),
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# Search for module_exec in the io.rs file and also look for the fileio module definition
rg -n "module_exec" crates/vm/src/stdlib/io.rs | head -20

Repository: RustPython/RustPython

Length of output: 255


🏁 Script executed:

# Also look for the fileio module and its structure
rg -n "mod fileio" crates/vm/src/stdlib/io.rs

Repository: RustPython/RustPython

Length of output: 82


🏁 Script executed:

# Search for fileio module_exec more broadly
rg -n "fileio.*module_exec|fn module_exec" crates/vm/src/stdlib/

Repository: RustPython/RustPython

Length of output: 1161


🏁 Script executed:

# Read the fileio module starting from line 4751
sed -n '4751,4850p' crates/vm/src/stdlib/io.rs

Repository: RustPython/RustPython

Length of output: 3361


🏁 Script executed:

# Continue reading the fileio module - let's search for module_exec within fileio specifically
sed -n '4751,5100p' crates/vm/src/stdlib/io.rs | grep -n "module_exec\|^}"

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Also count the lines to understand module extent
wc -l crates/vm/src/stdlib/io.rs

Repository: RustPython/RustPython

Length of output: 96


🏁 Script executed:

# Search for both module_exec and __module_exec in the fileio range
rg -n "__module_exec|fn module_exec" crates/vm/src/stdlib/io.rs | grep -A2 -B2 "4[7-9][0-9][0-9]\|5[0-2][0-9][0-9]"

Repository: RustPython/RustPython

Length of output: 201


🏁 Script executed:

# Find where fileio module ends by looking for the closing brace
tail -n 100 crates/vm/src/stdlib/io.rs | head -50

Repository: RustPython/RustPython

Length of output: 1750


🏁 Script executed:

# Search for the pattern "pub fn module_exec\|pub(crate) fn module_exec" specifically in fileio range
sed -n '4751,5346p' crates/vm/src/stdlib/io.rs | grep -n "pub.*fn module_exec"

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Double-check by looking at the entire fileio module definition for any module_exec
sed -n '4751,5346p' crates/vm/src/stdlib/io.rs | grep -i "module_exec"

Repository: RustPython/RustPython

Length of output: 47


Add missing module_exec to the fileio module.

The call to super::fileio::module_exec(vm, module)?; at line 4738 will not compile because module_exec does not exist in the fileio module. Add the following function to fix the compilation error on non-WASM platforms:

Minimal wrapper
+    pub(crate) fn module_exec(
+        vm: &VirtualMachine,
+        module: &Py<crate::builtins::PyModule>,
+    ) -> PyResult<()> {
+        __module_exec(vm, module);
+        Ok(())
+    }
🤖 Prompt for AI Agents
In `@crates/vm/src/stdlib/io.rs` around lines 4732 - 4744, The fileio module is
missing a pub(crate) module_exec function required by the call
super::fileio::module_exec(vm, module)?; — add a minimal wrapper with the
signature pub(crate) fn module_exec(vm: &VirtualMachine, module: &Py<PyModule>)
-> PyResult<()> that calls the autogenerated __module_exec(vm, module) and
returns Ok(()), so the call site compiles on non-WASM platforms; reference the
__module_exec symbol inside the new fileio::module_exec.

@youknowone youknowone merged commit 47df6dd into RustPython:main Jan 22, 2026
14 checks passed
@youknowone youknowone deleted the multi-phase-module-init branch January 22, 2026 03:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant