[Test] Add unit tests for encoding_dsv32.py #21623
dondetir wants to merge 3 commits into sgl-project:main
72 tests covering encode_arguments_to_dsml, decode_dsml_to_arguments, render_tools, find_last_user_index, render_message, drop_thinking_messages, encode_messages, and _read_until_stop. No server launch, no model weights required (pure Python/JSON module). Registered with register_cpu_ci(est_time=5, suite='stage-a-test-cpu'). Closes part of sgl-project#20865
Code Review
This pull request introduces a comprehensive unit test suite for the encoding_dsv32.py module, verifying DSML encoding/decoding, tool rendering, and message processing logic across various roles and scenarios. A critical syntax error was found in the regex patterns within the roundtrip tests, and a refactor is suggested to consolidate duplicated parsing logic into a reusable helper method.
    import re
    matches = re.findall(
        rf'<\{dsml_token}parameter name="(.*?)" string="(true|false)">(.*?)</\{dsml_token}parameter>',
        dsml_str,
        flags=re.DOTALL,
    )
    tool_args = {m[0]: (m[2], m[1]) for m in matches}
This block contains a critical syntax error, and its logic is duplicated in test_roundtrip_int.
- Syntax Error: The regex pattern on line 182 (rf'<\{dsml_token}...) is invalid. In a raw f-string, a backslash cannot be used to escape a curly brace, which will raise a SyntaxError. The backslash before {dsml_token} must be removed. The same error is present on line 199.
- Code Duplication: This block for parsing DSML is duplicated in test_roundtrip_int (lines 197-203).
To address both issues, please move import re to the top of the file, correct the regex, and extract the parsing logic into a reusable helper method. Here is an example of such a helper:

    def _parse_dsml_args(self, dsml_str: str) -> dict:
        """Helper to parse DSML parameter tags back into a dict for decode_dsml_to_arguments."""
        matches = re.findall(
            rf'<{dsml_token}parameter name="(.*?)" string="(true|false)">(.*?)</{dsml_token}parameter>',
            dsml_str,
            flags=re.DOTALL,
        )
        return {m[0]: (m[2], m[1]) for m in matches}

You can then replace this block with a call to self._parse_dsml_args(dsml_str) in both roundtrip tests.
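To make the shape of the helper's return value concrete, here is a minimal standalone sketch. It is written as a module-level function rather than a test method, and the value assigned to dsml_token, the parameter names, and the sample input are assumptions for illustration, not taken from encoding_dsv32.py.

```python
import re

# Assumed token value for illustration only; the real constant is defined in
# encoding_dsv32.py and may differ.
dsml_token = "｜DSML｜"


def parse_dsml_args(dsml_str: str) -> dict:
    """Map each parameter name to a (value, string_flag) tuple."""
    matches = re.findall(
        rf'<{dsml_token}parameter name="(.*?)" string="(true|false)">(.*?)</{dsml_token}parameter>',
        dsml_str,
        flags=re.DOTALL,  # let parameter values span multiple lines
    )
    # Each match is (name, string_flag, value); reorder to (value, string_flag).
    return {name: (value, is_string) for name, is_string, value in matches}


# Hypothetical input: one string parameter and one non-string parameter.
sample = (
    f'<{dsml_token}parameter name="city" string="true">Paris</{dsml_token}parameter>\n'
    f'<{dsml_token}parameter name="days" string="false">3</{dsml_token}parameter>'
)
parsed = parse_dsml_args(sample)
print(parsed)  # {'city': ('Paris', 'true'), 'days': ('3', 'false')}
```

Both the value and the string flag come back as plain strings, so any type conversion is left to whatever consumes the dict.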
…tract helper

Remove unnecessary backslash escapes before interpolated variables in raw f-string regex patterns. On Python 3.12+ (PEP 701) these are valid but misleading: the backslash inserts a literal \ before the interpolated value, which only works by accident since the full-width pipe character is not a special regex character.

Extract duplicated DSML argument parsing logic into a shared _parse_dsml_args helper. Move import re to module top level.

Signed-off-by: rdondeti <[email protected]>
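The behavior described in this commit message can be sketched in a few lines. The token values below are assumptions for illustration: "｜DSML｜" stands in for the real dsml_token, and "dsml:" is a hypothetical token chosen only to show how the stray backslash can change the meaning of the pattern.

```python
import re

dsml_token = "｜DSML｜"  # assumed value, starting with a full-width pipe

# In a raw f-string the backslash is kept literally and {dsml_token} is still
# interpolated, so the pattern contains a backslash followed by the token.
with_backslash = rf'<\{dsml_token}parameter'
without_backslash = rf'<{dsml_token}parameter'
print(with_backslash == '<\\' + dsml_token + 'parameter')  # True

text = f'<{dsml_token}parameter name="x" string="true">'

# The regex engine reads backslash + full-width pipe as an escaped literal,
# so both patterns match the same text.
m_with = re.match(with_backslash, text)
m_without = re.match(without_backslash, text)

# Hypothetical token starting with an ASCII letter: the same backslash now
# turns the leading "d" into the digit class \d, and the pattern stops matching.
ascii_token = "dsml:"
fragile = rf'<\{ascii_token}parameter'  # becomes the regex <\dsml:parameter
m_fragile = re.match(fragile, f'<{ascii_token}parameter')
print(m_with is not None, m_without is not None, m_fragile is None)
```

If a token could ever contain regex metacharacters, wrapping it in re.escape before interpolation would be the more defensive choice. That is a general suggestion, not something this PR does.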
Summary
Part of #20865 (Improve Unit Test Coverage)
Adds 72 CPU-only unit tests for python/sglang/srt/entrypoints/openai/encoding_dsv32.py, the DSML encoding/decoding module for DeepSeek v3.2 tool calling.
Coverage
- TestEncodeArgumentsToDsml: encode_arguments_to_dsml
- TestDecodeDsmlToArguments: decode_dsml_to_arguments (incl. 2 roundtrip tests)
- TestRenderTools: render_tools
- TestFindLastUserIndex: find_last_user_index
- TestRenderMessage: render_message (all 5 role branches + error paths)
- TestDropThinkingMessages: drop_thinking_messages (boundary conditions, immutability)
- TestEncodeMessages: encode_messages (bos_token, context, drop_thinking)
- TestReadUntilStop: _read_until_stop (Unicode DeepSeek tokens included)
All 8 public functions in encoding_dsv32.py are covered with happy paths, edge cases, and error conditions.
Verification
No server launch, no model weights, no GPU required. Registered with register_cpu_ci(est_time=5, suite="stage-a-test-cpu").