Multiprocessing PipeConnection handles leak on failed subprocess spawn #114720

@richardsheridan


Bug report

Bug description:

On Windows, if a multiprocessing.Process fails to spawn, and it is given multiprocessing.Pipe(duplex=True) arguments, then the underlying pipe handles leak. One way to cause a subprocess to fail to spawn is forgetting to guard the spawning code with if __name__ == "__main__":. Here is an MRE along those lines:

import multiprocessing
import traceback

EXHIBIT_LEAK = True


def child(*pipes):
    print("child", pipes)
    for pipe in pipes:
        pipe.close()


def parent(*pipes):
    print("parent", pipes)
    p = multiprocessing.Process(target=child, args=pipes)
    p.start()
    for pipe in pipes:
        pipe.close()
    p.join()


if EXHIBIT_LEAK or __name__ == "__main__":
    child_send_pipe, recv_pipe = multiprocessing.Pipe(duplex=True)
    send_pipe, child_recv_pipe = multiprocessing.Pipe(duplex=True)
    parent(child_send_pipe, child_recv_pipe)
    print("after")
    try:
        send_pipe.send_bytes(b"test")
    except:
        traceback.print_exc()
    else:
        print("send failed to raise")
    try:
        recv_pipe.poll()
    except:
        traceback.print_exc()
    else:
        print("recv failed to raise")

I noticed this because it makes the death of the process impossible to detect by looking only at the pipes, leading to an ugly workaround that I only recently realized was actually leaking resources.

This is in principle the same bug as bpo-33929. To be honest I don't understand the fix there well enough to know if it could be generalized to PipeConnection objects, but it would likely prevent leaks from all sorts of handle stealing edge cases. Otherwise, a workaround for the specific case of recursive spawning could be achieved by signaling the unpickling error back to the parent during the Process.start() method. The state of the system as the MRE failure occurs can be summarized as:

  • parent calls Process.start -> self._Popen(self) -> CreateProcess
  • parent dumps prep_data and then process_obj sequentially to to_child, which has a big buffer, so the writes don't block
  • dump of process_obj induces duplication of PipeConnection handle via DupeHandle
  • child runs mp.spawn.spawn_main -> _main
  • child drains data (fd from_parent; see #7921, "bpo-33929: multiprocessing: fix handle leak on race condition") only as far as reduction.pickle.load(from_parent)
  • child unpickles __main__ module
  • child notices issue in Process.start -> self._Popen(self) -> spawn.get_preparation_data -> _check_not_importing_main
  • child raises and never unpickles the arguments and so never steals the handle via DupeHandle.detach()
  • parent calls poll on read end of PipeConnection
  • parent never gets expected BrokenPipeError

If Process.start can raise and close from_parent (actually I'm not sure if from_parent or fd or both must be closed) in the child before dump(process_obj, to_child) in the parent, all stealing leaks would be prevented. However, that would require some sort of IO wait between line 94 and 95. My first thought was to make another private pipe pair only for the purpose of signaling the parent whether prep_data was successfully unpickled or not. The parent would then either read a sentinel out of the pipe or raise some kind of exception. Also, wouldn't it be better in principle for start to raise an exception and clean up if it knows it failed to start?
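A rough sketch of that private signaling pipe, under stated assumptions: READY, SpawnFailed, report_ready and wait_for_child_ready are hypothetical names and nothing here exists in multiprocessing today. The idea is that the parent would call wait_for_child_ready after dumping prep_data and before dumping process_obj, so no handle gets duplicated for a child that has already failed.

```python
import multiprocessing

READY = b"ready"  # hypothetical sentinel, not part of multiprocessing


class SpawnFailed(RuntimeError):
    """Hypothetical error raised when the child never reports in."""


def report_ready(status_writer):
    """Child side: call once prep_data has been unpickled successfully."""
    status_writer.send_bytes(READY)
    status_writer.close()


def wait_for_child_ready(status_reader, timeout=5.0):
    """Parent side: return normally only if the child sent the sentinel."""
    try:
        if not status_reader.poll(timeout):
            raise SpawnFailed("timed out waiting for the child to report in")
        message = status_reader.recv_bytes()
    except (EOFError, OSError) as exc:
        # EOFError on POSIX, BrokenPipeError (an OSError) on Windows.
        raise SpawnFailed("child went away before unpickling prep_data") from exc
    finally:
        status_reader.close()
    if message != READY:
        raise SpawnFailed(f"unexpected handshake message: {message!r}")
```

This only works if the child's end of the status pipe is inherited directly rather than passed by the same steal-on-unpickle mechanism, otherwise the status pipe itself could leak in the same way.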

CPython versions tested on:

3.8, 3.9, 3.10, 3.11, 3.12

Operating systems tested on:

Windows
