Allow pluggable file backends in DiskManager and IPCStreamWriter to support non-OS file systems #21215

@pantShrey

Is your feature request related to a problem or challenge?

DataFusion's spill infrastructure is currently hardcoded to OS-level files in two distinct areas, leaving no way to plug in a custom file backend:

1. File Creation (DiskManager)
DiskManagerMode has no extension point for custom file creation:

pub enum DiskManagerMode {
    OsTmpDirectory,
    Directories(Vec<PathBuf>),
    Disabled,
    // No way to plug in a custom backend
}

And create_tmp_file returns a concrete struct hardcoded to NamedTempFile:

pub fn create_tmp_file(
    self: &Arc<Self>,
    request_description: &str,
) -> Result<RefCountedTempFile>  // hardcoded to NamedTempFile internally

2. File Access (IPCStreamWriter / InProgressSpillFile)
Even if DiskManager were made extensible, InProgressSpillFile::append_batch opens the file by OS path:

self.writer = Some(IPCStreamWriter::new(
    in_progress_file.path(),  // assumes an OS path exists
    schema.as_ref(),
    self.spill_writer.compression,
)?);

This means IPCStreamWriter also assumes an OS-level file with a real filesystem path.

The concrete use case

This blocks embedding DataFusion in environments where temp files must go through the host system's own file management APIs. A concrete example is Postgres extensions (like ParadeDB), where spill files need to go through Postgres's BufFile/VFD APIs in order to:

  • Respect temp_tablespaces (controls where temp files are written)
  • Enforce temp_file_limit (Postgres's GUC for capping temp file usage per session)
  • Participate in Postgres's resource owner cleanup on transaction abort — if Postgres longjmps through a FATAL error, Rust's Drop won't run, but Postgres's resource owner will still clean up BufFile-backed files automatically

Postgres's BufFile is an internal abstraction with no OS-visible filesystem path, so it cannot be made to fit the current path-based interface at either level.

Describe the solution you'd like

The fix requires abstracting both the creation and the access of these files:

Make DiskManager extensible

Introduce a TempFileFactory trait and a Custom variant in DiskManagerMode:

pub trait TempFileFactory: Send + Sync {
    fn create_temp_file(
        &self,
        request_description: &str,
    ) -> Result<Box<dyn SpillFile>, DataFusionError>;
}

pub enum DiskManagerMode {
    OsTmpDirectory,
    Directories(Vec<PathBuf>),
    Disabled,
    Custom(Arc<dyn TempFileFactory>), // new
}
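
For illustration, a custom backend built on the proposed trait might look like the sketch below. Everything beyond the TempFileFactory signature is an assumption: SpillFile is modeled as a plain Read + Write + Seek + Send stream, SpillError stands in for DataFusionError, and InMemoryFactory and read_back are hypothetical names. The file cap loosely mimics the kind of limit a host such as Postgres would enforce.

```rust
use std::io::{self, Cursor, Read, Seek, SeekFrom, Write};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Assumed shape of the proposed `SpillFile` trait: a seekable byte stream
/// with no notion of an OS path.
pub trait SpillFile: Read + Write + Seek + Send {}
impl<T: Read + Write + Seek + Send> SpillFile for T {}

/// Simplified stand-in for `DataFusionError`.
#[derive(Debug)]
pub struct SpillError(pub String);

pub trait TempFileFactory: Send + Sync {
    fn create_temp_file(
        &self,
        request_description: &str,
    ) -> Result<Box<dyn SpillFile>, SpillError>;
}

/// Hypothetical backend that keeps spill data in memory and refuses to
/// hand out more than `max_files` files.
pub struct InMemoryFactory {
    max_files: usize,
    created: AtomicUsize,
}

impl InMemoryFactory {
    pub fn new(max_files: usize) -> Self {
        InMemoryFactory { max_files, created: AtomicUsize::new(0) }
    }
}

impl TempFileFactory for InMemoryFactory {
    fn create_temp_file(
        &self,
        request_description: &str,
    ) -> Result<Box<dyn SpillFile>, SpillError> {
        // `fetch_add` returns the count before this request.
        let previous = self.created.fetch_add(1, Ordering::SeqCst);
        if previous >= self.max_files {
            return Err(SpillError(format!(
                "temp file limit reached while serving: {}",
                request_description
            )));
        }
        Ok(Box::new(Cursor::new(Vec::new())))
    }
}

/// Rewinds a spill file and returns everything written so far.
pub fn read_back(file: &mut dyn SpillFile) -> io::Result<Vec<u8>> {
    file.seek(SeekFrom::Start(0))?;
    let mut bytes = Vec::new();
    file.read_to_end(&mut bytes)?;
    Ok(bytes)
}
```

A caller would obtain a file from the factory, write spill data through Write, and later call something like read_back on it. A Postgres-backed factory would presumably return a wrapper around BufFile in place of the in-memory Cursor.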

Abstract over file access in IPCStreamWriter

Instead of accepting a path, IPCStreamWriter should accept a Write trait object so it can write through any backend:

// Instead of:
IPCStreamWriter::new(in_progress_file.path(), schema, compression)

// Something like, where `writer` is a Box<dyn Write>:
IPCStreamWriter::new(writer, schema, compression)

Similarly SpillFile (the trait returned by TempFileFactory) would need to expose Read/Write/Seek rather than a path, so InProgressSpillFile and SpillReaderStream can work through the abstraction.
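
To make the writer-side change concrete, here is a minimal sketch of a writer that only depends on Write and never sees a path. It is not the real IPCStreamWriter: FramedWriter and SharedBuffer are hypothetical names, and the length-prefixed framing is a placeholder for Arrow IPC encoding. In practice the writer would likely wrap Arrow's StreamWriter, which is generic over its Write sink.

```rust
use std::io::{self, Write};
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for `IPCStreamWriter`: it receives a boxed `Write`
/// sink instead of opening a file by path.
pub struct FramedWriter {
    sink: Box<dyn Write + Send>,
    pub num_frames: usize,
    pub num_bytes: usize,
}

impl FramedWriter {
    pub fn new(sink: Box<dyn Write + Send>) -> Self {
        FramedWriter { sink, num_frames: 0, num_bytes: 0 }
    }

    /// Writes one frame: a u64 little-endian length followed by the payload.
    pub fn write_frame(&mut self, payload: &[u8]) -> io::Result<()> {
        let len = payload.len() as u64;
        self.sink.write_all(&len.to_le_bytes())?;
        self.sink.write_all(payload)?;
        self.num_frames += 1;
        self.num_bytes += 8 + payload.len();
        Ok(())
    }

    /// Flushes the underlying sink.
    pub fn finish(&mut self) -> io::Result<()> {
        self.sink.flush()
    }
}

/// Hypothetical non-OS backend: bytes land in a shared in-memory buffer
/// that the caller can inspect after the writer is done.
#[derive(Clone, Default)]
pub struct SharedBuffer(pub Arc<Mutex<Vec<u8>>>);

impl Write for SharedBuffer {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.0.lock().unwrap().extend_from_slice(buf);
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}
```

Because the sink is a trait object, the same writer code would work unchanged with a std::fs::File, which is how the existing OS-backed behavior could be preserved.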

All existing behavior stays the default and is unchanged; the current OS-backed path becomes just one implementation of the trait.

Describe alternatives you've considered

One option is to configure DiskManager with Directories(postgres_temp_path), pointing it at Postgres's temp tablespace directory. However, this bypasses Postgres's resource accounting entirely: temp_file_limit won't be enforced, VFD descriptor limits won't be respected, and files won't be cleaned up through Postgres's resource owner on transaction abort.

Additional context

No response

Labels

enhancement (New feature or request)
