-
Notifications
You must be signed in to change notification settings - Fork 2k
Allow pluggable file backends in DiskManager and IPCStreamWriter to support non-OS file systems #21215
Description
Is your feature request related to a problem or challenge?
DataFusion's spill infrastructure is currently hardcoded to OS-level files in two distinct areas, leaving no way to plug in a custom file backend
1. File Creation (DiskManager)
DiskManagerMode has no extension point for custom file creation:
pub enum DiskManagerMode {
OsTmpDirectory,
Directories(Vec<PathBuf>),
Disabled,
// No way to plug in a custom backend
}And create_tmp_file returns a concrete struct hardcoded to NamedTempFile:
pub fn create_tmp_file(
self: &Arc,
request_description: &str,
) -> Result<RefCountedTempFile> // hardcoded to NamedTempFile internally2. File Access (IPCStreamWriter / InProgressSpillFile)
Even if DiskManager were made extensible, InProgressSpillFile::append_batch opens the file by OS path:
self.writer = Some(IPCStreamWriter::new(
in_progress_file.path(), // assumes an OS path exists
schema.as_ref(),
self.spill_writer.compression,
)?);This means IPCStreamWriter also assumes an OS-level file with a real filesystem path.
The concrete use case
This blocks embedding DataFusion in environments where temp files must go through the host system's own file management APIs. A concrete example is Postgres extensions (like ParadeDB), where spill files need to go through Postgres's BufFile/VFD APIs in order to:
- Respect
temp_tablespaces(controls where temp files are written) - Enforce
temp_file_limit(Postgres's GUC for capping temp file usage per session) - Participate in Postgres's resource owner cleanup on transaction abort — if Postgres
longjmps through aFATALerror, Rust'sDropwon't run, but Postgres's resource owner will still clean upBufFile-backed files automatically
Postgres's BufFile has no OS-visible filesystem path, it is an internal abstraction, so it cannot be made to fit the current path-based interface at either level.
Describe the solution you'd like
The fix requires abstracting both the creation and the access of these files:
Make DiskManager extensible
Introduce a TempFileFactory trait and a Custom variant in DiskManagerMode:
pub trait TempFileFactory: Send + Sync {
fn create_temp_file(
&self,
request_description: &str,
) -> Result<Box<dyn SpillFile>, DataFusionError>;
}
pub enum DiskManagerMode {
OsTmpDirectory,
Directories(Vec<PathBuf>),
Disabled,
Custom(Arc<dyn TempFileFactory>), // new
}Abstract over file access in IPCStreamWriter
Instead of accepting a path, IPCStreamWriter should accept a Write trait object so it can write through any backend:
// Instead of:
IPCStreamWriter::new(in_progress_file.path(), schema, compression)
// Something like:
IPCStreamWriter::new(Box<dyn Write>, schema, compression)Similarly SpillFile (the trait returned by TempFileFactory) would need to expose Read/Write/Seek rather than a path, so InProgressSpillFile and SpillReaderStream can work through the abstraction.
All existing behavior defaults unchanged, the current OS-backed path is just one implementation of the trait.
Describe alternatives you've considered
Configuring DiskManager with Directories(postgres_temp_path) to point it at Postgres's temp tablespace directory, but this bypasses Postgres's resource accounting entirely. temp_file_limit won't be enforced, VFD descriptor limits won't be respected, and files won't be cleaned up through Postgres's resource owner on transaction abort.
Additional context
No response