Skip to content

Conversation

@serhiy-storchaka
Copy link
Member

@serhiy-storchaka serhiy-storchaka commented Nov 29, 2025

This affects string formatting as well as bytes and bytearray formatting.

  • For errors in the format string, always include the position of the start of the format unit.
  • For errors related to the formatted arguments, always include the number or the name of the argument.
  • Suggest more probable causes of errors in the format string (stray %, unsupported format, unexpected character).
  • Provide more information when the number of arguments does not match the number of format units.
  • Raise more specific errors when access of arguments by name is mixed with sequential access and if * is used with a mapping.
  • Add tests for some uncovered cases.

This affects string formatting as well as bytes and bytearray formatting.

* For errors in the format string, always include the position of the
  start of the format unit.
* For errors related to the formatted arguments, always include the number
  or the name of the argument.
* Suggest more probable causes of errors in the format string (stray %,
  unsupported format, unexpected character).
* Provide more information when the number of arguments does not match
  the number of format units.
* Raise more specific errors when access of arguments by name is mixed with
  sequential access and if * is used with a mapping.
* Add tests for some uncovered cases.
test_exc_common("abc %Id", 1, ValueError,
"unsupported format %I at position 4")
test_exc_common("abc %'d", 1, ValueError,
"stray % at position 4 or unexpected format character ''' at position 5")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

''' is not very readable. Would it be possible to format it as U+HHHH or 0xHH?

test_exc_common('%(x)r', 1, TypeError,
"format requires a mapping, not int")
test_exc_common('%*r', 3.14, TypeError,
"* requires int, not float")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without context, it's uneasy to understand that the error comes from a format string. Maybe write %* instead?

Suggested change
"* requires int, not float")
"%* requires int, not float")

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is perhaps better to remove this test, because the example is incorrect in more than one way. If * is used, then we need more than one argument, and in that case we will get "format argument N" in the error message, like in the following line..

"format argument 1: too big for precision")
test_exc_common('%d', '1', TypeError,
"%d format: a real number is required, not str")
"a real number is required for format %d, not str")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it really a real number which is expected, or an integer?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. It will be truncated to integer.

test_exc('%c', 2**128, OverflowError,
"%c argument not in range(0x110000)")
test_exc('%c', 3.14, TypeError,
"%c requires an integer or a unicode character, not float")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"%c requires an integer or a unicode character, not float")
"%c requires an integer or a Unicode character, not float")

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is used in many other places. We should do all such changes at once -- either capitalize "unicode", or remove it.

(int)arg->ch, arg->fmtstart);
}
else if (arg->ch >= 32 && arg->ch < 127) {
else if (arg->ch >= 32 && arg->ch < 127 && arg->ch != '\'') {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe ignore also the quote character in Py_UNICODE_ISPRINTABLE() test below?

Copy link
Member Author

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I currently consider an idea of using "%x expected an integer, got float" instead of "%x requires an integer, not float". What do you think?

Copy link
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I left minor comments.

Comment on lines +409 to +421
"format argument: %c requires an integer or a unicode character, not float")
test_exc('%c', (3.14,), TypeError,
"format argument 1: %c requires an integer or a unicode character, not float")
test_exc('%(x)c', {'x': 3.14}, TypeError,
"format argument 'x': %c requires an integer or a unicode character, not float")
test_exc('%c', 'ab', TypeError,
"format argument: %c requires an integer or a unicode character, not a string of length 2")
test_exc('%c', ('ab',), TypeError,
"format argument 1: %c requires an integer or a unicode character, not a string of length 2")
test_exc('%(x)c', {'x': 'ab'}, TypeError,
"format argument 'x': %c requires an integer or a unicode character, not a string of length 2")
test_exc('%c', b'x', TypeError,
"format argument: %c requires an integer or a unicode character, not bytes")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are changing error messages, I suggest to write Unicode with an uppercase U.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a different issue. "unicode character" is used in other error messages which this error message was based on.

self.assertRaisesRegex(TypeError, '%i format: a real number is required, not complex', operator.mod, '%i', 2j)
self.assertRaisesRegex(TypeError, '%d format: a real number is required, not complex', operator.mod, '%d', 1j)
self.assertRaisesRegex(TypeError, r'%c requires an int or a unicode character, not .*\.PseudoFloat', operator.mod, '%c', pi)
self.assertRaisesRegex(TypeError, '%x requires an integer, not float', operator.mod, '%x', 3.14)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to check also the format argument: prefix.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure that it is related to the purpose of this test, but I'll do this. Although it make the lines obscenely long.

@serhiy-storchaka serhiy-storchaka enabled auto-merge (squash) January 24, 2026 11:11
@serhiy-storchaka serhiy-storchaka merged commit 012c498 into python:main Jan 24, 2026
47 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants