Add Spark UI metrics from Iceberg scan metrics #8717
Conversation
|
Let me take a look. |
There was a problem hiding this comment.
This looks almost good, I had some minor comments and it seems like we are missing the implementation and tests for a few total counters. @karuppayya, could you check if I got everything correctly?
|
|
||
| @Override | ||
| public String description() { | ||
| return "total delete file size"; |
There was a problem hiding this comment.
Include (bytes) at the end like we do in TotalFileSize?
068b9ae to
0a4c2c1
Compare
aokolnychyi
left a comment
There was a problem hiding this comment.
This looks almost ready to go. I had a few minor points that would be nice to fix.
| @@ -200,12 +220,31 @@ public CustomTaskMetric[] reportDriverMetrics() { | |||
| } | |||
|
|
|||
| List<CustomTaskMetric> driverMetrics = Lists.newArrayList(); | |||
There was a problem hiding this comment.
Minor: Can we add an empty line after this one and before // common?
| @@ -215,12 +254,32 @@ public CustomMetric[] supportedCustomMetrics() { | |||
| return new CustomMetric[] { | |||
| new NumSplits(), | |||
There was a problem hiding this comment.
Minor: Can we add // task metrics before this line?
Both NumSplits and NumDeletes are populated at the task level.
|
|
||
| List<CustomTaskMetric> driverMetrics = Lists.newArrayList(); | ||
| // common | ||
| driverMetrics.add(TaskTotalFileSize.from(scanReport)); |
There was a problem hiding this comment.
Could you confirm TaskTotalFileSize represents the total size of read data?
If so, shall we call these metrics as TaskTotalDataFileSize and TotalDataFileSize? I know we follow the API from core but it seems a bit confusing. I had to look up the code to understand what this metric means. If we decide to rename, let's move it to the data files block below.
|
|
||
| @Override | ||
| public String description() { | ||
| return "total delete file size in bytes"; |
There was a problem hiding this comment.
Let's follow what we have in TotalFileSize where use ... (bytes) instead of ... in bytes.
|
Thanks @aokolnychyi for the review, i have addressed the latest comments, ready for another round |
|
Thanks, @karuppayya! |
This change cherry-picks PR #8717 to Spark 3.4.
This is a followup to #7447 (comment)
cc: @aokolnychyi @RussellSpitzer @szehon-ho