-
-
Notifications
You must be signed in to change notification settings - Fork 9.8k
[Yaml] Fix parsing of unquoted multiline scalars with comments or blank lines #62359
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Yaml] Fix parsing of unquoted multiline scalars with comments or blank lines #62359
Conversation
|
Thank you @yoeunes. |
| next line | ||
| # this comment should be ignored | ||
| final line |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure this is a valid example, all online parser that I found terminate parsing the value as soon as they stumble upon a comment line (see https://yaml-online-parser.appspot.com/?yaml=key%3A+foo%0A+bar%0A+%23+m%0A+foobar%0Akey2%3A+baz&type=json for such an example).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see also yaml/yaml#49
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @xabbuh, thank you for your feedback on the spec compliance (3.2.3.3).
I've opened a new PR to address this properly: #62409 which implements what I believe is the correct, spec-compliant behavior.
What do you think of this new approach? Is this the right way to go, or do you think we should prefer to keep a more permissive (doc-aligned) behavior?
Comments are ignored by the YAML parser and do not need to be indented according to the current level of nesting in a collection.
…or comments (yoeunes) This PR was merged into the 6.4 branch. Discussion ---------- [Yaml] Align unquoted multiline scalar parsing with spec for comments | Q | A | ------------- | --- | Branch? | 6.4 | Bug fix? | yes | New feature? | no | Deprecations? | no | Issues | - | License | MIT This PR is a follow-up to [#62359](#62359 (comment)) based on feedback from `@xabbuh` regarding spec compliance. My previous PR fixed a `ParseException` on blank lines, but it handled comments incorrectly (it **ignored** them inside the scalar). This PR aligns the parser with the YAML spec ([3.2.3.3](https://yaml.org/spec/1.2.2/#3233-comments)): 1. It **keeps the fix** for blank lines (they are correctly preserved). 2. It **changes comment handling**: instead of ignoring an indented comment, the parser now **terminates** the multiline scalar when it encounters one. Commits ------- 7911644 [Yaml] Align unquoted multiline scalar parsing with spec for comments
…lars (yoeunes) This PR was merged into the 6.4 branch. Discussion ---------- [Yaml] Fix regression handling blank lines in unquoted scalars | Q | A | ------------- | --- | Branch? | 6.4 | Bug fix? | yes | New feature? | no | Deprecations? | no | Issues | Fix #62581 | License | MIT This PR fixes a regression introduced in #62359 regarding unquoted multiline scalars. A check for blank lines was incorrectly added, causing the parser to bypass indentation checks and consume trailing newlines. This resulted in incorrect line numbers being reported in exceptions (e.g. when a colon is used in an unquoted value). Commits ------- 1480a23 [Yaml] Fix regression handling blank lines in unquoted scalars
This PR fixes two bugs when parsing unquoted multiline scalars:
ParseException.For example, the following YAML would previously fail or parse incorrectly: