Skip to content

Fix parallelism in cythonize cmd & callable via concurrent.futures#7183

Closed
webknjaz wants to merge 8 commits intocython:masterfrom
webknjaz:bugfixes/3262-3263-3973-spawn-in-cythonize
Closed

Fix parallelism in cythonize cmd & callable via concurrent.futures#7183
webknjaz wants to merge 8 commits intocython:masterfrom
webknjaz:bugfixes/3262-3263-3973-spawn-in-cythonize

Conversation

@webknjaz
Copy link
Contributor

@webknjaz webknjaz commented Oct 1, 2025

A similar problem was worked around in #3263 by disabling parallelism altogether. This patch reverts that and updates the worker management to use concurrent.futures.ProcessPoolExecutor instead of multiprocessing.Pool.

The symptoms are Python 3.8+, macOS and the underlying compiler invocations failing. If these are met, cythonize would hang before this commit.

Fixes #3973.

Refs:

webknjaz added a commit to webknjaz/cython that referenced this pull request Oct 1, 2025
webknjaz added a commit to webknjaz/ansible--pylibssh that referenced this pull request Oct 1, 2025
This is a workaround for a hang in Cython extention compilation under
Python 3.8+ on macOS. It hangs in `multiprocessing` when the `spawn`
method is set up so the patch simply sets the workers number to `1` to
combat that.

Fixes ansible#762 and fixes ansible#769.

Upstream fix: cython/cython#7183
webknjaz added a commit to webknjaz/ansible--pylibssh that referenced this pull request Oct 1, 2025
This is a workaround for a hang in Cython extention compilation under
Python 3.8+ on macOS. It hangs in `multiprocessing` when the `spawn`
method is set up so the patch simply sets the workers number to `1` to
combat that.

Fixes ansible#762 and fixes ansible#769.

Upstream fix: cython/cython#7183
@webknjaz webknjaz moved this to ⏰ Review reminder needed 👀 in 📅 Procrastinating in public 😵‍💫 Oct 1, 2025
webknjaz added a commit to webknjaz/ansible--pylibssh that referenced this pull request Oct 2, 2025
This is a workaround for a hang in Cython extention compilation under
Python 3.8+ on macOS. It hangs in `multiprocessing` when the `spawn`
method is set up so the patch simply sets the workers number to `1` to
combat that.

Fixes ansible#762 and fixes ansible#769.

Upstream fix: cython/cython#7183
webknjaz added a commit to webknjaz/ansible--pylibssh that referenced this pull request Oct 2, 2025
This is a workaround for a hang in Cython extention compilation under
Python 3.8+ on macOS. It hangs in `multiprocessing` when the `spawn`
method is set up so the patch simply sets the workers number to `1` to
combat that.

Fixes ansible#762 and fixes ansible#769.

Upstream fix: cython/cython#7183
CHANGES.rst Outdated
Comment on lines +142 to +144
is set up to use ``spawn()`` in :mod:`multiprocessing`. Prior to this fix,
compilation failures would cause ``cythonize`` to hang in parallel mode.
This would happen under Python 3.8+ on macOS. A related issue in a different
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I discovered that it hangs on Linux w/ fork() too. The quickest repro is CFLAGS='-boom' pip wheel 'git+https://github.com/ansible/pylibssh.git@a220d06a9c5c39a4609f4a5d574ff92713b2cdf8' -v --no-deps. I don't have a small one atm.

But this also means that #3263 might be incomplete too.

cc @da-woods @scoder where do I go from here? This sounds like a problem described in https://dnmtechs.com/python-multiprocessing-pool-hangs-at-join/.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I discovered that the setup() call @

setup(
script_name='setup.py',
script_args=script_args,
ext_modules=ext_modules,
)
raises a SystemExit. And that's what makes it hang.

Injecting something like

    except BaseException as err:
        raise Exception

in there stops the main process from hanging. However, the cythonize then proceeds and produces a successful return code, which is undesired (forcing parallel=1 makes it fail properly because it bypasses muliprocessing altogeher).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like raraising as a non-base exception + calling .get() on the result object might be what we want. I'll try playing with our options and report back.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Untested but: I think concurrent.futures.ProcessPoolExecutor might do a better job of handling unexpected termination. I switched the parallelization in runtests.py to use it because of that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I'll add that to the list of things to check. Thanks for the suggestion!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried and confirmed that it doesn't get stuck on Linux. Yet to check macOS.

Comment on lines +92 to +95
mp_configured_to_spawn = multiprocessing.get_start_method() == 'spawn'
if mp_configured_to_spawn:
print('Disabling parallel cythonization for "spawn" process start method.')

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it's be better just to use an mp_context with a start method that does work rather than disabling it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was wondering the same but wanted to focus on making sure the patch works at least somehow and is similar to the other PR before trying to remake everything.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@da-woods changed to concurrent.futures and checked that the cythonize cmd fails on mac and linux correctly.

@webknjaz webknjaz force-pushed the bugfixes/3262-3263-3973-spawn-in-cythonize branch from 3a54d7b to f283576 Compare October 4, 2025 01:10
webknjaz added a commit to webknjaz/cython that referenced this pull request Oct 4, 2025
webknjaz added a commit to webknjaz/cython that referenced this pull request Oct 4, 2025
@webknjaz webknjaz changed the title Disable parallelism in cythonize cmd on spawn Fix parallelism in cythonize cmd & callable via concurrent.futures Oct 4, 2025
@webknjaz webknjaz marked this pull request as draft October 4, 2025 01:20
@webknjaz webknjaz force-pushed the bugfixes/3262-3263-3973-spawn-in-cythonize branch from f283576 to 22e5333 Compare October 4, 2025 01:41
webknjaz added a commit to webknjaz/cython that referenced this pull request Oct 4, 2025
webknjaz added a commit to webknjaz/cython that referenced this pull request Oct 4, 2025
@webknjaz webknjaz force-pushed the bugfixes/3262-3263-3973-spawn-in-cythonize branch from 22e5333 to 10003f0 Compare October 4, 2025 01:46
webknjaz added a commit to webknjaz/cython that referenced this pull request Oct 4, 2025
@webknjaz webknjaz force-pushed the bugfixes/3262-3263-3973-spawn-in-cythonize branch from 10003f0 to 7c29c6f Compare October 4, 2025 01:48
webknjaz added a commit to webknjaz/cython that referenced this pull request Oct 4, 2025
@webknjaz webknjaz force-pushed the bugfixes/3262-3263-3973-spawn-in-cythonize branch from 7c29c6f to a21b057 Compare October 4, 2025 01:49
@webknjaz webknjaz marked this pull request as ready for review October 4, 2025 01:51
Comment on lines +86 to +91
try:
_con_fut.wait(compiler_tasks, return_when=_con_fut.FIRST_EXCEPTION)
except KeyboardInterrupt:
for task in compiler_tasks:
task.cancel()
raise
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this the try should be broader and wrap the submission step

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved this to wrap the entire CM which makes more sense to me.

Comment on lines -94 to -95
except OSError:
pool = _FakePool()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect removing this might break webassembly (for example). We may need to keep the fake pool at least for that case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. I wasn't sure what could cause that. Plus I don't know if concurrent.futures will raise the same exceptions. Do you have any way of checking this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can quickly check it with https://pyodide.org/en/stable/console.html (just as a basic REPL - I'm not sure how to set up a more complete environment).

It looks like it raises a ModuleNotFoundError rather than an ImportError. So this probably isn't a useful fallback right now anyway.

It also looks like Android doesn't support multiprocessing but I wouldn't know how to test that.

It's probably a good thing to only have one fallback path (whether that's a manual loop or a dummy executor - not sure which). I don't think it's worth trying to explicitly test these more obscure platforms though

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see what you mean. And for concurrent.futures, it's NotImplementedError even:

>>> import concurrent.futures
>>> import multiprocessing
>>> pool = multiprocessing.Pool(10)
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/lib/python313.zip/multiprocessing/context.py", line 119, in Pool
    return Pool(processes, initializer, initargs, maxtasksperchild,
                context=self.get_context())
  File "/lib/python313.zip/multiprocessing/pool.py", line 191, in __init__
    self._setup_queues()
    ~~~~~~~~~~~~~~~~~~^^
  File "/lib/python313.zip/multiprocessing/pool.py", line 346, in _setup_queues
    self._inqueue = self._ctx.SimpleQueue()
                    ~~~~~~~~~~~~~~~~~~~~~^^
  File "/lib/python313.zip/multiprocessing/context.py", line 113, in SimpleQueue
    return SimpleQueue(ctx=self.get_context())
  File "/lib/python313.zip/multiprocessing/queues.py", line 361, in __init__
    self._rlock = ctx.Lock()
                  ~~~~~~~~^^
  File "/lib/python313.zip/multiprocessing/context.py", line 67, in Lock
    from .synchronize import Lock
  File "/lib/python313.zip/multiprocessing/synchronize.py", line 17, in <module>
    import _multiprocessing
ModuleNotFoundError: No module named '_multiprocessing'
>>> concurrent.futures.ProcessPoolExecutor(10)
Traceback (most recent call last):
  File "/lib/python313.zip/concurrent/futures/process.py", line 586, in _check_system_limits
    import multiprocessing.synchronize
  File "/lib/python313.zip/multiprocessing/synchronize.py", line 17, in <module>
    import _multiprocessing
ModuleNotFoundError: No module named '_multiprocessing'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/lib/python313.zip/concurrent/futures/process.py", line 651, in __init__
    _check_system_limits()
    ~~~~~~~~~~~~~~~~~~~~^^
  File "/lib/python313.zip/concurrent/futures/process.py", line 592, in _check_system_limits
    raise NotImplementedError(_system_limited)
NotImplementedError: This Python build lacks multiprocessing.synchronize, usually due to named semaphores being u
navailable on this platform.

As for testing, it might be possible to make use of cibuildwheel for smoke tests. But that would be a separate integration effort, I imagine.

@da-woods so should we keep it removed?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering if we could catch both OSError and ModuleNotFoundError, but maybe just call _build(ext_modules, parallel=0) as the fallback rather than doing fake executors?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I shuffled things around. lmk what you think.

webknjaz added a commit to webknjaz/cython that referenced this pull request Oct 5, 2025
A similar problem was fixed in cython#3263 which is what this patch is
porting to another module. That fixed the Python-land API while this
change aims to also fix its command-line counterpart.

The symptoms are Python 3.8+, macOS and the underlying compiler
invocations failing. If these are met, `cythonize` would hang before
this commit.

Fixes cython#3973.

Refs:
* cython#3262
* cython#7146
* ansible/pylibssh#769
@webknjaz webknjaz force-pushed the bugfixes/3262-3263-3973-spawn-in-cythonize branch from 3a2e39a to 6abe6d4 Compare October 10, 2025 14:27
@webknjaz
Copy link
Contributor Author

(rebased w/o changes to see the CI outcome first)

webknjaz added a commit to webknjaz/ansible--pylibssh that referenced this pull request Oct 10, 2025
This is a workaround for a hang in Cython extention compilation under
Python 3.8+ on macOS. It hangs in `multiprocessing` when the `spawn`
method is set up so the patch simply sets the workers number to `1` to
combat that.

Fixes ansible#762 and fixes ansible#769.

Upstream fix: cython/cython#7183
webknjaz added a commit to webknjaz/ansible--pylibssh that referenced this pull request Oct 10, 2025
This is a workaround for a hang in Cython extention compilation under
Python 3.8+ on macOS. It hangs in `multiprocessing` when the `spawn`
method is set up so the patch simply sets the workers number to `1` to
combat that.

Fixes ansible#762 and fixes ansible#769.

Upstream fix: cython/cython#7183
webknjaz added a commit to webknjaz/ansible--pylibssh that referenced this pull request Oct 10, 2025
This is a workaround for a hang in Cython extention compilation under
Python 3.8+ on macOS. It hangs in `multiprocessing` when the `spawn`
method is set up so the patch simply sets the workers number to `1` to
combat that.

Fixes ansible#762 and fixes ansible#769.

Upstream fix: cython/cython#7183
@github-project-automation github-project-automation bot moved this from ⏰ Review reminder needed 👀 to 🌈 Done 🦄 in 📅 Procrastinating in public 😵‍💫 Oct 10, 2025
@webknjaz webknjaz reopened this Oct 10, 2025
@webknjaz webknjaz moved this from 🌈 Done 🦄 to ⏰ Review reminder needed 👀 in 📅 Procrastinating in public 😵‍💫 Oct 10, 2025
webknjaz added a commit to aio-libs/yarl that referenced this pull request Oct 19, 2025
This adds a new section for `cython` subcommand separate from
`cythonize`. It also applies a workaround for the Cython bug that
makes it hang in parallel mode on any compiler errors [[1]].

[1]: cython/cython#7183
Comment on lines +6 to +7
import concurrent.futures as _con_fut
import contextlib as _ctx
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying not to pollute the from [...] import * namespace with extra public objects.

I understand that. However, this module is not meant for public consumption and it already imports a couple of things globally, so more global names won't hurt. Please use readable names.

Comment on lines +95 to +97
except (OSError, ModuleNotFoundError):
# `OSError` is a historic exception in `multiprocessing`
# `ModuleNotFoundError` happens under pyodide
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I updated my suggestion.

That said, if "OSError is a historic exception in multiprocessing", then why do we still need it here?

import cython

import collections
import concurrent.futures as _con_fut
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still open.

scoder added a commit to scoder/cython that referenced this pull request Oct 21, 2025
A similar problem was worked around in cython#3263
by disabling parallelism altogether.  This patch reverts that and updates the worker
management to use concurrent.futures.ProcessPoolExecutor instead of multiprocessing.Pool.

The symptoms are Python 3.8+ and the underlying compiler invocations failing.
If these are met, cythonize would hang before this commit.

Original PR by Sviatoslav Sydorenko in cython#7183

Fixes cython#3973

See cython#3262
See cython#7146
scoder added a commit that referenced this pull request Oct 21, 2025
A similar problem was worked around in #3263
by disabling parallelism altogether.  This patch reverts that and updates the worker
management to use concurrent.futures.ProcessPoolExecutor instead of multiprocessing.Pool.

The symptoms are Python 3.8+ and the underlying compiler invocations failing.
If these are met, cythonize would hang before this commit.

Original PR by Sviatoslav Sydorenko in #7183

Fixes #3973

See #3262
See #7146
@scoder
Copy link
Contributor

scoder commented Oct 21, 2025

I cleaned up the imports and pushed your changes as 440b19a, to have them integrated into 3.2b1 in time. Thank you for your contribution.

@scoder scoder closed this Oct 21, 2025
@github-project-automation github-project-automation bot moved this from ⏰ Review reminder needed 👀 to 🌈 Done 🦄 in 📅 Procrastinating in public 😵‍💫 Oct 21, 2025
scoder added a commit that referenced this pull request Oct 21, 2025
scoder added a commit that referenced this pull request Oct 21, 2025
A similar problem was worked around in #3263
by disabling parallelism altogether.  This patch reverts that and updates the worker
management to use concurrent.futures.ProcessPoolExecutor instead of multiprocessing.Pool.

The symptoms are Python 3.8+ and the underlying compiler invocations failing.
If these are met, cythonize would hang before this commit.

Original PR by Sviatoslav Sydorenko in #7183

Fixes #3973
Closes #7183

See #3262
See #7146
@webknjaz
Copy link
Contributor Author

Thanks! I see you had to re-apply it as 440b19a.

P.S. @scoder here's how to preserve commit authorship FYI for the future: https://hynek.me/til/easier-crediting-contributors-github/

scoder added a commit that referenced this pull request Oct 23, 2025
A similar problem was worked around in #3263
by disabling parallelism altogether.  This patch reverts that and updates the worker
management to use concurrent.futures.ProcessPoolExecutor instead of multiprocessing.Pool.

The symptoms are Python 3.8+ and the underlying compiler invocations failing.
If these are met, cythonize would hang before this commit.

Original PR by Sviatoslav Sydorenko in #7183

Fixes #3973
Closes #7183

See #3262
See #7146
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] python setup.py build_ext -j4 fails on macOS with Python 3.8+

3 participants