Spark: Use bulk deletes in rewrite manifests action#10343
9c20bbf to
0add66c
nastra
left a comment
The change makes sense to me. I wonder if we should have a variation of SparkCleanupUtil.deleteFiles for this, with some additional logging/exception handling for cases where bulk deletes are only partially applied (inside SparkCleanupUtil.bulkDelete).
dramaticlly
left a comment
I also noticed that there's a deleteFiles in BaseSparkAction which batches files in groups of 10k, but I think it's more for snapshot expiration or purge.
ece6a88 to
f51bf8e
f51bf8e to
d10a86f
Forgot I had this PR up :) Thanks for the reviews, @nastra @dramaticlly! I went with just using deleteFiles from BaseSparkAction, which will also log failed deletes. SparkCleanupUtil is package-private and not accessible from the action implementation. I'll go ahead and merge this.
This change uses bulk deletes where possible when cleaning up files, either after a failure or as part of the staging logic for v1 tables.
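
For readers following the thread, here is a rough sketch of the "bulk delete when possible, log on partial failure" pattern discussed above. It is not the code from this PR: `SimpleFileIO`, `BulkFileIO`, `PartialDeleteException`, and `deleteAll` are hypothetical stand-ins made up for illustration, and the real action relies on deleteFiles from BaseSparkAction instead.

```java
import java.util.List;
import java.util.function.Consumer;

// Illustrative sketch only; every type here is a hypothetical stand-in, not Iceberg API.
final class BulkDeleteSketch {

  /** Hypothetical file IO that can only delete one path at a time. */
  public interface SimpleFileIO {
    void deleteFile(String path);
  }

  /** Hypothetical file IO that additionally supports deleting many paths in one call. */
  public interface BulkFileIO extends SimpleFileIO {
    void deleteFiles(Iterable<String> paths);
  }

  /** Hypothetical exception reporting how many paths a bulk delete failed to remove. */
  public static class PartialDeleteException extends RuntimeException {
    private final int failedCount;

    public PartialDeleteException(int failedCount) {
      super("Failed to delete " + failedCount + " file(s)");
      this.failedCount = failedCount;
    }

    public int failedCount() {
      return failedCount;
    }
  }

  /**
   * Best-effort cleanup: uses a single bulk call when the IO supports it, otherwise
   * deletes files one by one. Failures are logged rather than rethrown.
   * Returns the number of files that could not be deleted.
   */
  public static int deleteAll(SimpleFileIO io, List<String> paths, Consumer<String> log) {
    if (io instanceof BulkFileIO) {
      try {
        ((BulkFileIO) io).deleteFiles(paths);
        return 0;
      } catch (PartialDeleteException e) {
        // The bulk delete was only partially applied; record how many paths remain.
        log.accept("Bulk delete failed for " + e.failedCount() + " of " + paths.size() + " files");
        return e.failedCount();
      }
    }

    // Fallback for IO implementations without bulk support.
    int failed = 0;
    for (String path : paths) {
      try {
        io.deleteFile(path);
      } catch (RuntimeException e) {
        failed++;
        log.accept("Failed to delete " + path + ": " + e.getMessage());
      }
    }
    return failed;
  }

  private BulkDeleteSketch() {}
}
```

Swallowing and logging failures is a deliberate choice in this sketch, on the assumption that cleanup is best-effort and should not mask the original error that triggered it.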