[Test] Add unit tests for encoding_dsv32.py #21623

Open
dondetir wants to merge 3 commits into sgl-project:main from dondetir:test/unit-encoding-dsv32

@dondetir

Summary

Part of #20865 (Improve Unit Test Coverage)

Adds 72 CPU-only unit tests for python/sglang/srt/entrypoints/openai/encoding_dsv32.py — the DSML encoding/decoding module for DeepSeek v3.2 tool calling.

Coverage

| Class | Tests | Functions |
|---|---|---|
| TestEncodeArgumentsToDsml | 9 | encode_arguments_to_dsml |
| TestDecodeDsmlToArguments | 7 | decode_dsml_to_arguments (incl. 2 roundtrip tests) |
| TestRenderTools | 4 | render_tools |
| TestFindLastUserIndex | 7 | find_last_user_index |
| TestRenderMessage | 17 | render_message (all 5 role branches + error paths) |
| TestDropThinkingMessages | 9 | drop_thinking_messages (boundary conditions, immutability) |
| TestEncodeMessages | 7 | encode_messages (bos_token, context, drop_thinking) |
| TestReadUntilStop | 9 | _read_until_stop (Unicode DeepSeek tokens included) |

All 8 functions listed above (including the underscore-prefixed _read_until_stop) are covered with happy paths, edge cases, and error conditions.

Verification

    python -m pytest test/registered/unit/entrypoints/openai/test_encoding_dsv32.py -v
    # 72 passed in 0.35s

    ruff check test/registered/unit/entrypoints/openai/test_encoding_dsv32.py
    # Clean

No server launch, no model weights, no GPU required. Registered with register_cpu_ci(est_time=5, suite="stage-a-test-cpu").

72 tests covering encode_arguments_to_dsml, decode_dsml_to_arguments,
render_tools, find_last_user_index, render_message, drop_thinking_messages,
encode_messages, and _read_until_stop.

No server launch, no model weights required (pure Python/JSON module).
Registered with register_cpu_ci(est_time=5, suite='stage-a-test-cpu').

Closes part of sgl-project#20865

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a comprehensive unit test suite for the encoding_dsv32.py module, verifying DSML encoding/decoding, tool rendering, and message processing logic across various roles and scenarios. A critical syntax error was found in the regex patterns within the roundtrip tests, and a refactor is suggested to consolidate duplicated parsing logic into a reusable helper method.

Comment on lines +180 to +186
    import re
    matches = re.findall(
        rf'<\{dsml_token}parameter name="(.*?)" string="(true|false)">(.*?)</\{dsml_token}parameter>',
        dsml_str,
        flags=re.DOTALL,
    )
    tool_args = {m[0]: (m[2], m[1]) for m in matches}

critical

This block contains a critical syntax error and its logic is duplicated in test_roundtrip_int.

  1. Syntax Error: The regex pattern on line 182 (rf'<\{dsml_token}...) is invalid. In a raw f-string, a backslash cannot be used to escape a curly brace, which will raise a SyntaxError. The backslash before {dsml_token} must be removed. The same error is present on line 199.
  2. Code Duplication: This block for parsing DSML is duplicated in test_roundtrip_int (lines 197-203).

To address both issues, please move import re to the top of the file, correct the regex, and extract the parsing logic into a reusable helper method. Here is an example of such a helper:

    def _parse_dsml_args(self, dsml_str: str) -> dict:
        """Helper to parse DSML parameter tags back into a dict for decode_dsml_to_arguments."""
        matches = re.findall(
            rf'<{dsml_token}parameter name="(.*?)" string="(true|false)">(.*?)</{dsml_token}parameter>',
            dsml_str,
            flags=re.DOTALL,
        )
        return {m[0]: (m[2], m[1]) for m in matches}

You can then replace this block with a call to self._parse_dsml_args(dsml_str) in both roundtrip tests.
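
For anyone who wants to try the suggested parsing outside the test class, here is a minimal standalone sketch. The token value `｜DSML｜`, the function name `parse_dsml_args`, and the sample string are assumptions made for illustration and are not taken from encoding_dsv32.py. The sketch also wraps the token in `re.escape`, which goes beyond the suggestion above but keeps the pattern correct if the token ever contains regex metacharacters.

```python
import re

# Assumption: stand-in for the module's DSML token; the real value is
# defined in encoding_dsv32.py and may differ.
DSML_TOKEN = "｜DSML｜"


def parse_dsml_args(dsml_str: str, dsml_token: str = DSML_TOKEN) -> dict:
    """Map each parameter name to a (value, string_flag) tuple."""
    token = re.escape(dsml_token)  # treat the token as literal text
    pattern = (
        rf'<{token}parameter name="(.*?)" string="(true|false)">'
        rf'(.*?)</{token}parameter>'
    )
    matches = re.findall(pattern, dsml_str, flags=re.DOTALL)
    # Each match is (name, string_flag, value); reorder to (value, string_flag)
    # to mirror the {m[0]: (m[2], m[1])} mapping in the review suggestion.
    return {name: (value, is_string) for name, is_string, value in matches}


# Hypothetical DSML fragment with one string and one non-string parameter.
sample = (
    f'<{DSML_TOKEN}parameter name="city" string="true">Paris</{DSML_TOKEN}parameter>\n'
    f'<{DSML_TOKEN}parameter name="days" string="false">3</{DSML_TOKEN}parameter>'
)
parsed = parse_dsml_args(sample)
print(parsed)
```

In the test file itself, the same body would presumably sit in a `_parse_dsml_args` method and feed `decode_dsml_to_arguments`, as described above.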

…tract helper

Remove unnecessary backslash escapes before interpolated variables in
raw f-string regex patterns. On Python 3.12+ (PEP 701) these are valid
but misleading — the backslash inserts a literal \ before the
interpolated value, which only works by accident since the full-width
pipe character is not a special regex character.

Extract duplicated DSML argument parsing logic into a shared
_parse_dsml_args helper. Move import re to module top level.

Signed-off-by: rdondeti <[email protected]>
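
The behaviour described in this commit message can be checked with a short sketch. The token values below are assumptions chosen for illustration: `｜DSML｜` stands in for the real DSML token, and `dsml:` is a hypothetical token picked only because it starts with an ASCII letter.

```python
import re

# Assumption: stand-in for the real DSML token (full-width pipes).
token = "｜DSML｜"

with_backslash = rf'<\{token}parameter'   # original pattern style
without_backslash = rf'<{token}parameter'  # corrected pattern style

# In a raw f-string the backslash stays literal and {token} is still interpolated.
assert with_backslash == "<" + "\\" + token + "parameter"

text = f'<{token}parameter name="x" string="true">'

# Both patterns match here: a backslash before the full-width pipe is read by
# `re` as an escaped literal character, so the extra backslash is harmless.
assert re.search(with_backslash, text) is not None
assert re.search(without_backslash, text) is not None

# Hypothetical token starting with an ASCII letter: the stray backslash now
# turns the first character into the regex class \d (any digit), so the
# pattern no longer matches the text it was built from.
ascii_token = "dsml:"
ascii_text = f'<{ascii_token}parameter'
assert re.search(rf'<\{ascii_token}parameter', ascii_text) is None
assert re.search(rf'<{ascii_token}parameter', ascii_text) is not None
```

The last two assertions show why dropping the backslash is the safer form: the pattern then depends only on the token's characters, not on how `re` happens to treat a backslash in front of them.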