Native support for parallel imputation by stefvanbuuren · Pull Request #701 · amices/mice

stefvanbuuren · 2025-04-06T11:50:28Z

Experimental feature

This PR adds native support for parallel imputation to mice().

The mice() function now supports parallel execution of imputations via the new parallel = TRUE argument. When enabled, instead of sequentially calculating m imputations at a given iteration, the m chains are distributed across available CPU cores using the future and future.apply frameworks.
Parallel imputation may significantly reduce runtime, especially for large datasets and many imputations (m), but does not pay off for small datasets or few imputations.
Parallel execution is implemented only in the mice() function, and does not affect the mice.impute.*() functions.
To activate parallel execution:

library(mice)
imp <- mice(data, parallel = TRUE)

The default is parallel = FALSE for backward compatibility.
The argument n.core specifies the number of CPU cores to use. If n.core is not specified (the default), the number of cores actually used is min(number of available cores - 1, number of imputations).
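The default rule for n.core can be written out as a small sketch. The helper name default_n_core is hypothetical, parallel::detectCores() stands in for whatever core detection mice actually uses, and the floor of one core is an assumption added here.

```r
# Hypothetical helper, not part of mice: mirrors the rule described above.
default_n_core <- function(m, available = parallel::detectCores()) {
  # detectCores() can return NA on some platforms; fall back to one core
  if (is.na(available)) available <- 1L
  # leave one core free, never use more workers than imputations,
  # and (assumption) never go below one core
  max(1L, min(available - 1L, m))
}

default_n_core(m = 5, available = 8)   # limited by the number of imputations
default_n_core(m = 20, available = 8)  # limited by the available cores
default_n_core(m = 5, available = 1)   # single-core machine
```

With 8 available cores this gives 5 workers for m = 5 and 7 workers for m = 20.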
printFlag = TRUE prints iteration and imputation number only in sequential mode; parallel mode reports timing per iteration.
Note: mice() will automatically select a parallel backend (default is multisession). To override, users may manually call plan(...) before running mice().
The future and future.apply packages must be installed to run parallel imputation. If not installed, mice() will throw an error and suggest installing the packages.
The wrappers parlmice() and futuremice() are still functional, but now throw a warning that they will be deprecated in the future. Users are encouraged to use the new parallel argument in mice() instead.
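The backend override mentioned in the note above could look roughly like the sketch below. It assumes the experimental mice build from this PR; the guard on formals(mice::mice) is only there so the snippet does nothing on a release version without the parallel argument. The choice of two workers and the nhanes example data are illustrative.

```r
# Sketch only; the guards let it run even without the experimental mice build.
has_future <- requireNamespace("future", quietly = TRUE)
has_parallel_mice <- requireNamespace("mice", quietly = TRUE) &&
  "parallel" %in% names(formals(mice::mice))

if (has_future && has_parallel_mice) {
  # Pick the backend yourself and remember the plan that was active before.
  old_plan <- future::plan(future::multisession, workers = 2)

  imp <- mice::mice(mice::nhanes, m = 4, parallel = TRUE, printFlag = FALSE)

  # Restore the previous plan so later code is unaffected.
  future::plan(old_plan)
}
```

Saving and restoring the plan keeps the user's session unchanged after the call, whatever backend was active beforehand.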

Notes and questions:

Not yet finalized, but far enough to play around and test. As far as I can see, everything functions properly, but I did not perform any specific tests for parallelization, time gains, and so on.
I implemented only the minimal parallel and n.core arguments for mice(). To keep the mice API clean, I suggest setting alternative future plans outside mice(). Does that sound like a good idea?
Seeds are tricky in parallel environments. It is not yet clear whether we can borrow the standard seed argument from mice() to get reproducible solutions. Do we need to add a few tests for this?
Should we create a vignette, similar to https://www.gerkovink.com/parlMICE/Vignette_parlMICE.html and https://www.gerkovink.com/miceVignettes/futuremice/Vignette_futuremice.html to help users to get started?
Do we miss any arguments from those vignettes that we should support in mice()?
In parallel mode, isTRUE(printFlag) generates the time per iteration. Is that appropriate, or would it be better to report something else?
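On the seed question, one common approach is to give every chain its own L'Ecuyer-CMRG random number stream derived from a single seed, which is the idea behind future.seed = TRUE in future.apply. The base R sketch below illustrates that idea only; it is not the code in this PR, and draw_chain is a hypothetical stand-in for one imputation chain.

```r
# Switch to the generator that supports independent streams.
old_kind <- RNGkind("L'Ecuyer-CMRG")
set.seed(123)

# One stream per chain, all derived from the single seed above.
m <- 4
streams <- vector("list", m)
streams[[1]] <- .Random.seed
for (i in seq_len(m)[-1]) {
  streams[[i]] <- parallel::nextRNGStream(streams[[i - 1]])
}

# Hypothetical stand-in for one chain: draws from its own stream only.
draw_chain <- function(stream, n = 3) {
  assign(".Random.seed", stream, envir = globalenv())
  rnorm(n)
}

# The draws do not depend on the order in which chains are run.
in_order <- lapply(streams, draw_chain)
reversed <- rev(lapply(rev(streams), draw_chain))
order_free <- identical(in_order, reversed)

# Restore the generator that was active before.
RNGkind(old_kind[1])
```

Because each chain reads only its own stream, the result is the same whether chains run one after another or on different workers, which is the property a reproducibility test for mice() would need to check.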

Copilot

Copilot reviewed 6 out of 24 changed files in this pull request and generated no comments.

Files not reviewed (18)

DESCRIPTION: Language not supported
NAMESPACE: Language not supported
R/cbind.R: Language not supported
R/complete.R: Language not supported
R/edit.setup.R: Language not supported
R/futuremice.R: Language not supported
R/initialize.imp.R: Language not supported
R/internal.R: Language not supported
R/mice.R: Language not supported
R/parlmice.R: Language not supported
R/sampler.R: Language not supported
R/sampler.univ.R: Language not supported
man/futuremice.Rd: Language not supported
man/mice.Rd: Language not supported
man/parlmice.Rd: Language not supported
man/record.event.Rd: Language not supported
tests/testthat/test-as.mids.R: Language not supported
tests/testthat/test-mice.impute.norm.R: Language not supported

Comments suppressed due to low confidence (2)

_pkgdown.yml:131

Verify that the new 'record.event' reference corresponds to an existing documentation page to avoid broken links.

  - record.event

_pkgdown.yml:184

Ensure that the 'developer-notes-complete' page is properly linked and exists in the documentation to support developer guidance.

  - developer-notes-complete

stefvanbuuren · 2025-04-07T10:03:26Z

Well, thanks Copilot for your informative review.

stefvanbuuren · 2025-05-01T09:38:00Z

We also need support for custom imputation methods (cf #550).

Merge branch 'master' into future.apply # Conflicts: # DESCRIPTION # NEWS.md # R/edit.setup.R # R/mice.R # R/sampler.R # _pkgdown.yml

stefvanbuuren · 2025-05-30T09:16:19Z

See alexanderrobitzsch/miceadds#30 for an example with miceadds integration.

stefvanbuuren · 2025-05-30T09:29:51Z

Development note

The future.apply branch is merged into the dev branch. Any further development is done there.
For better integration and backward compatibility, the dev branch no longer supports the new record.event() function and the logenv environment. Logging of the imputation process relies - as before - on the loggedEvents object and the updateLog() function.

thomvolker · 2025-05-27T06:51:41Z

R/mice.R


+  # Set up parallel backend if requested
+  if (parallel) {
+    if (!requireNamespace("future.apply", quietly = TRUE)) {


You could consider making future.apply a hard dependency. Then, if parallel = FALSE, you just specify a sequential backend, and if parallel = TRUE, a parallel one. The advantage would be that the code is cleaner and easier to maintain (there is just one loop instead of the if-else later on). The disadvantage is that a future.apply call such as future_lapply() with a sequential backend may use the RNG state differently, and may thus not reproduce the original results.

This is an interesting idea that simplifies the main loop in sampler() at the price of an additional dependency. I would favor it if the algorithm is exactly reproducible. It need not be backward compatible (impossible to maintain over versions).

I think it should be reproducible on the same machine, but perhaps not across platforms. We can test this, I suppose. I think, but am not entirely certain, that mice will then also be reproducible regardless of whether the replicator uses parallel = TRUE. I can dive into this.
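A quick way to probe the backend question is to run the same seeded future_lapply() call under a sequential and a multisession plan and compare the results. The sketch below does that for plain rnorm() draws, not for mice itself; run_chains and the seed value are illustrative, and the whole block is skipped if future.apply is not installed.

```r
if (requireNamespace("future.apply", quietly = TRUE)) {
  # Hypothetical stand-in for m chains that each draw random numbers.
  run_chains <- function(m = 4) {
    future.apply::future_lapply(seq_len(m), function(i) rnorm(3),
                                future.seed = 123L)
  }

  # Same call, two backends.
  old_plan <- future::plan(future::sequential)
  res_seq <- run_chains()
  future::plan(future::multisession, workers = 2)
  res_par <- run_chains()
  future::plan(old_plan)

  # TRUE means the backend did not change the random draws.
  same <- identical(res_seq, res_par)
}
```

This only checks the RNG handling of future.apply on one machine. Whether mice() with a sequential backend reproduces the results of the existing non-future loop is a separate question that would need its own test.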

thomvolker · 2025-05-27T06:57:36Z

R/sampler.R

-            }
+          log_i <- if (exists("loggedEvents", inherits = FALSE)) get("loggedEvents", inherits = FALSE) else NULL
+          list(imp = imp_i, mean = mean_i, var = var_i, log = log_i)
+        }, future.seed = TRUE)


Add the dotdotdot here when closing the future_lapply() call, then users can specify future.packages, future.globals and eventually all other functionality that is available. This allows users to specify their own imputation functions, for example.

We can support a user-specified future.globals list as a mice dots argument. It should be straightforward to adapt the code.
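The forwarding idea can be sketched with a small wrapper. The wrapper run_chains and the function my_impute are hypothetical and stand in for sampler() and a user-written imputation function; only future_lapply() and its future.globals argument come from future.apply.

```r
# Hypothetical wrapper, not mice's sampler(): anything passed through ...
# (for example future.globals or future.packages) reaches future_lapply().
run_chains <- function(m, chain_fun, ...) {
  future.apply::future_lapply(seq_len(m), chain_fun, future.seed = TRUE, ...)
}

# Hypothetical user-written function that the workers need to see.
my_impute <- function(i) i * 10

if (requireNamespace("future.apply", quietly = TRUE)) {
  res <- run_chains(
    3,
    function(i) my_impute(i),
    future.globals = list(my_impute = my_impute)
  )
}
```

One consequence of this design is that arguments fixed inside the wrapper, such as future.seed here, cannot also be supplied through the dots without causing a duplicate-argument error.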

thomvolker · 2025-05-30T09:42:26Z

R/sampler.R

+          log_i <- if (exists("loggedEvents", inherits = FALSE)) get("loggedEvents", inherits = FALSE) else NULL
+          list(imp = imp_i, mean = mean_i, var = var_i, log = log_i)
+        },
+        future.packages = future.packages,


Hi Stef, I think the current code does not allow users to use their own custom imputation functions unless they put them in a custom package. Perhaps you can allow users to specify a list of objects (functions/variables) for future.globals, and append the two functions run_imputation_cycle and update_chain_stats to it (if non-null). Alternatively, if run_imputation_cycle and update_chain_stats are properly added to the mice package, passing them through globals is not necessary, as the futures should have access to the entire mice package.

Would that require current run_imputation_cycle() and update_chain_stats() to be exported functions?

I don't think so. That would imply that many unexported functions that are called under the hood should be exported explicitly. It's more likely that all functions in a package are available in downstream futures.

If the main concern is that functions specified in the $meth slot may not be available to the future workers, would it make sense to scan $meth, check whether any of the functions exist only in .GlobalEnv, and throw a warning to the user?
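Such a scan could look roughly like the sketch below. The helper find_global_only_methods and the example method mymean are hypothetical; the sketch relies only on the mice convention that method "xyz" is implemented by a function named mice.impute.xyz().

```r
# Hypothetical helper, not part of mice: returns the methods whose
# mice.impute.<name>() function is defined directly in the given environment.
find_global_only_methods <- function(meth, envir = globalenv()) {
  meth <- unique(meth[nzchar(meth)])  # skip variables with method ""
  fn_names <- paste0("mice.impute.", meth)
  in_global <- vapply(fn_names, exists, logical(1),
                      envir = envir, mode = "function", inherits = FALSE)
  meth[in_global]
}

# A user-written method that lives only in the global environment.
mice.impute.mymean <- function(y, ry, x, wy = NULL, ...) {
  rep(mean(y[ry]), sum(!ry))
}

meth <- c(age = "", bmi = "pmm", chl = "mymean")
custom <- find_global_only_methods(meth)
if (length(custom) > 0) {
  warning("Methods defined only in .GlobalEnv may be unavailable to workers: ",
          paste(custom, collapse = ", "), call. = FALSE)
}
```

When run at top level, only mymean is flagged, because pmm is not defined in the global environment.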

stefvanbuuren · 2025-09-28T16:07:00Z

To install the experimental version use

remotes::install_github("amices/mice@refs/pull/701/head")

stefvanbuuren added 6 commits April 5, 2025 22:25

Add parallel imputation to the internal sampler() function

811359a

Add NEWS item to announce parallel imputation

730ba11

Rename logEvent() --> report.event() to maintain mice style

f65eac1

Capture and use if plan was specified outside mice()

d3f6b6a

Add decprecation warnings to futuremice() and parlmice(), but keep th…

fddddbb

…em functional

Update NEWS, site and attach dev 9000 indicator

f1932ed

stefvanbuuren added the enhancement label Apr 6, 2025

stefvanbuuren requested a review from Copilot April 7, 2025 10:00

Copilot AI reviewed Apr 7, 2025

stefvanbuuren changed the base branch from master to dev April 9, 2025 15:26

stefvanbuuren changed the base branch from dev to master April 9, 2025 15:30

Merge branch 'master' into future.apply

d9292d3

stefvanbuuren mentioned this pull request Apr 28, 2025

On futuremice() and reproducibility #557

Open

Merge mice 3.18.0 into future.apply branch

0aca266

thomvolker reviewed May 30, 2025

This was referenced Jan 21, 2026

futuremice(): Best practices for undoing future::plan() changes #735

Open

WISH: futuremice(... future.plan = NULL) to use the currently set future plan #736

Open


4 participants
