Skip to content

Spec: Avoid struct field conflicts in default values#12841

Merged
RussellSpitzer merged 6 commits into
apache:mainfrom
rdblue:v3-clarify-default-values
Apr 24, 2025
Merged

Spec: Avoid struct field conflicts in default values#12841
RussellSpitzer merged 6 commits into
apache:mainfrom
rdblue:v3-clarify-default-values

Conversation

@rdblue
Copy link
Copy Markdown
Contributor

@rdblue rdblue commented Apr 18, 2025

This makes a slight change to the spec for v3 default values.

Previously, the spec allowed default values for struct fields in the struct default and in fields. For example, this was valid:

ALTER TABLE t ADD COLUMN point struct<x int NOT NULL DEFAULT 0, y int NOT NULL DEFAULT 0> DEFAULT struct(-1, -1)

This would result in a different write default depending on whether a field (default to 0) or the struct itself (default fields to -1) was missing. This behavior is difficult to implement correctly and unnecessary. It is also confusing what will happen when attempting to modify the struct's default, which should not change field defaults, and what will be used for a field's initial default.

The proposed change is to always track field defaults in the struct fields. This results in a uniform way to handle a struct: either produce null, or create a non-null default struct from field defaults. Effectively, this means that structs can default to null or non-null (struct()).

Also note that this change does not prevent us from allowing struct-level and field-level defaults in the future. It simply states that for v3, there can be no field defaults from the struct level. We can add them later if there is a use case that requires them.

@github-actions github-actions Bot added the Specification Issues that may introduce spec changes. label Apr 18, 2025
@rdblue rdblue requested review from RussellSpitzer, aokolnychyi, danielcweeks and szehon-ho and removed request for aokolnychyi April 18, 2025 18:49
Copy link
Copy Markdown
Contributor

@danielcweeks danielcweeks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with clarifying/simplifying the behavior.

Comment thread format/spec.md Outdated
Comment thread format/spec.md Outdated
Copy link
Copy Markdown
Contributor

@wmoustafa wmoustafa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fine with the overall direction. However, I think the wording does not address the case when the field within the struct is a struct. Maybe the idea is that the constraint will apply recursively, but either we state that explicitly or simplify the default spec to restrict it to primitive types and maps/arrays.

Comment thread format/spec.md Outdated
Copy link
Copy Markdown
Contributor

@nastra nastra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with the suggested wording from @RussellSpitzer

@github-actions github-actions Bot added the core label Apr 24, 2025
Comment thread format/spec.md Outdated
Comment thread format/spec.md Outdated
Comment thread core/src/main/java/org/apache/iceberg/variants/ShreddedObject.java Outdated
Comment thread format/spec.md

For example, a struct column `point` with fields `x` (default 0) and `y` (default 0) can be defaulted to `{"x": 0, "y": 0}` or `null`. A non-null default is stored by setting `initial-default` or `write-default` to an empty struct (`{}`) that will use field values set from each field's `initial-default` or `write-default`, respectively.

| `point` default | `point.x` default | `point.y` default | Data value | Result value |
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really helpful. Thank you for adding it!

Copy link
Copy Markdown
Member

@RussellSpitzer RussellSpitzer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me now, I think it's very clear.

@RussellSpitzer RussellSpitzer merged commit ee70a53 into apache:main Apr 24, 2025
2 checks passed
@RussellSpitzer
Copy link
Copy Markdown
Member

Merged! Thank you @rdblue for tightening up the spec here.

Long thanks to all the reviewers
@adutra
@manuzhang
@wmoustafa
@nastra
@danielcweeks
@amogh-jahagirdar
@szehon-ho
@aokolnychyi

@rdblue
Copy link
Copy Markdown
Contributor Author

rdblue commented Apr 24, 2025

Thanks for merging, @RussellSpitzer! And thanks to all the reviewers, too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Specification Issues that may introduce spec changes.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants