Skip to content

Comments

Fix re.fullmatch POSSESSIVE_REPEAT#7187

Merged
youknowone merged 8 commits intoRustPython:mainfrom
wvmscs:closes-#7183
Feb 20, 2026
Merged

Fix re.fullmatch POSSESSIVE_REPEAT#7187
youknowone merged 8 commits intoRustPython:mainfrom
wvmscs:closes-#7183

Conversation

@wvmscs
Copy link
Contributor

@wvmscs wvmscs commented Feb 17, 2026

This PR implements the corrections for the POSSESSIVE_REPEAT i.e. "(x)++)" given in #7183.

3 tests in the file test_re.py (in Lib/test) failed for unexpected success

UNEXPECTED SUCCESS: test_bug_gh91616 (test.test_re.ReTests.test_bug_gh91616)
UNEXPECTED SUCCESS: test_fullmatch_atomic_grouping (test.test_re.ReTests.test_fullmatch_atomic_grouping)
UNEXPECTED SUCCESS: test_fullmatch_possessive_quantifiers (test.test_re.ReTests.test_fullmatch_possessive_quantifiers)

I believe this was the old behavior before applying the correction, as the result is now the same as CPython 3.14.3 .
So, I also corrected the test file and add one new test for POSSESSIVE_REPEAT.

please review.
best regards

Summary by CodeRabbit

  • Bug Fixes

    • Improved handling of possessive repeat patterns in regex matching.
    • Fixed context management for atomic group and possessive repeat operations to ensure correct behavior.
  • Tests

    • Added comprehensive tests for possessive repeat functionality with full-match scenarios.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 17, 2026

No actionable comments were generated in the recent review. 🎉


📝 Walkthrough

Walkthrough

This PR modifies the SRE (Regex) engine to ensure explicit next-context handling when breaking to subsequent matching paths for possessive repeat and atomic group operations. The changes set toplevel = false on context objects before transitioning them, accompanied by new test cases verifying possessive repeat behavior with full-match mode.

Changes

Cohort / File(s) Summary
Engine Context Handling
crates/sre_engine/src/engine.rs
Modified four matching paths (PossessiveRepeat1, PossessiveRepeat3, ATOMIC_GROUP, PossessiveRepeat4) to explicitly set toplevel = false on next contexts before breaking to them, ensuring correct context state during transitions.
Test Coverage
crates/sre_engine/tests/tests.rs, extra_tests/snippets/stdlib_re.py
Added test cases for possessive repeat with full-match behavior, including Rust unit tests and Python integration tests verifying pattern matching on "1.25.38" with regex ([0-9]++(?:\\.[0-9]+)*+).

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related issues

Poem

🐰 Through patterns and repeats we hop,
With contexts clear, we'll never stop,
Each toplevel set to false with care,
Possessive matches now fair,
A bunny's regex delight! ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title clearly summarizes the main change: fixing re.fullmatch handling for POSSESSIVE_REPEAT patterns, which is the primary focus of the changeset across all modified files.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Contributor

github-actions bot commented Feb 17, 2026

Code has been automatically formatted

The code in this PR has been formatted using:

  • ruff format
    Please pull the latest changes before pushing again:
git pull origin closes-#7183

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
crates/sre_engine/Cargo.toml (1)

17-19: Remove the redundant explicit [[test]] entry or document the reason for making it explicit.

In Rust 2024 edition with autotests = true (the default), Cargo automatically discovers tests/tests.rs as an integration test target. The explicit [[test]] name = "tests" without a path field resolves to the same file that would already be discovered, making this entry redundant. If the intent is to make it explicit for documentation or consistency with the [[bench]] declaration above, that's acceptable—but if it serves no purpose, removing it would clean up unnecessary boilerplate.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/sre_engine/Cargo.toml` around lines 17 - 19, Remove the redundant
explicit test table: delete the `[[test]]` block that only contains `name =
"tests"` (the `[[test]] name = "tests"` entry) from Cargo.toml, or if you intend
it for documentation/consistency, add a short comment above that block
explaining why it's explicit (e.g., clarifying it's intentional to mirror the
`[[bench]]` declaration); do not add a `path` field since Cargo already
autodiscovers tests in Rust 2024 with `autotests = true`.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/sre_engine/src/engine.rs`:
- Around line 473-476: Typo fix: change the comment text "herited" to
"inherited" in all occurrences; locate the comments near the
ctx.next_offset(...) usage and the 'context label (e.g., the comment above the
let mut next = ctx.next_offset(4, Jump::PossessiveRepeat2); block and the two
other comments that reference "herited" around similar code paths) and update
the three comment instances to read "inherited".

---

Nitpick comments:
In `@crates/sre_engine/Cargo.toml`:
- Around line 17-19: Remove the redundant explicit test table: delete the
`[[test]]` block that only contains `name = "tests"` (the `[[test]] name =
"tests"` entry) from Cargo.toml, or if you intend it for
documentation/consistency, add a short comment above that block explaining why
it's explicit (e.g., clarifying it's intentional to mirror the `[[bench]]`
declaration); do not add a `path` field since Cargo already autodiscovers tests
in Rust 2024 with `autotests = true`.

Copy link
Member

@youknowone youknowone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good in general. Thank you for contributing! Please also check coderabbitai's review comments.

@youknowone youknowone changed the title Closes #7183 Fix re.fullmatch POSSESSIVE_REPEAT Feb 17, 2026
Copy link
Contributor Author

@wvmscs wvmscs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommended changes done

@wvmscs wvmscs requested a review from youknowone February 18, 2026 10:06
@youknowone
Copy link
Member

@wvmscs probably you forgot to push the changes.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
extra_tests/snippets/stdlib_re.py (1)

80-80: Unescaped . in pattern matches any character, not only literal dot.

(?:.[0-9]+) will match 1X25Y38 just as readily as 1.25.38. The test still exercises the correct code path for the issue, but using \. better expresses intent.

✏️ Proposed fix
-assert re.fullmatch(r"([0-9]++(?:.[0-9]+)*+)", "1.25.38")
-assert re.fullmatch(r"([0-9]++(?:.[0-9]+)*+)", "1.25.38").group(0) == "1.25.38"
+assert re.fullmatch(r"([0-9]++(?:\.[0-9]+)*+)", "1.25.38")
+assert re.fullmatch(r"([0-9]++(?:\.[0-9]+)*+)", "1.25.38").group(0) == "1.25.38"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@extra_tests/snippets/stdlib_re.py` at line 80, The regex in the re.fullmatch
call uses an unescaped dot in the non-capturing group (pattern
"([0-9]++(?:.[0-9]+)*+)"), so it will match any character instead of a literal
period; update the pattern passed to re.fullmatch to escape the dot in the group
(i.e., change the "(?:.[0-9]+)" portion to "(?:\.[0-9]+)") so the test enforces
literal dots (reference the re.fullmatch invocation and the pattern string).
crates/sre_engine/src/engine.rs (1)

473-473: Comments read as change descriptions rather than code explanations.

// modified next.toplevel from inherited to false describes what was changed (commit-message style) rather than why the invariant exists. A future reader maintaining this code won't immediately understand the significance.

✏️ Suggested wording (apply to all three sites)
-            // modified next.toplevel from inherited to false
-            let mut next = ctx.next_offset(4, Jump::PossessiveRepeat2);
-            next.toplevel = false;
+            let mut next = ctx.next_offset(4, Jump::PossessiveRepeat2);
+            next.toplevel = false; // sub-pattern must not apply match_all/must_advance constraints
-            let mut next = ctx.next_offset(4, Jump::PossessiveRepeat4);
-            next.toplevel = false; // modified next.toplevel from inherited to false
+            let mut next = ctx.next_offset(4, Jump::PossessiveRepeat4);
+            next.toplevel = false; // sub-pattern must not apply match_all/must_advance constraints
-            let mut next_ctx = ctx.next_offset(2, Jump::AtomicGroup1);
-            next_ctx.toplevel = false; // modified next.toplevel from inherited to false
+            let mut next_ctx = ctx.next_offset(2, Jump::AtomicGroup1);
+            next_ctx.toplevel = false; // sub-pattern must not apply match_all/must_advance constraints

Also applies to: 501-501, 841-841

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/sre_engine/src/engine.rs` at line 473, Replace the commit-style
comment "// modified next.toplevel from inherited to false" with a concise
explanation of the invariant and rationale: state why next.toplevel must be set
to false (what property/state would break if it remained inherited), what
callers or code paths rely on this being false, and any assumptions about
downstream behavior or performance; apply the same explanatory replacement at
the other two occurrences that reference next.toplevel so future maintainers
understand the intent and not just the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@extra_tests/snippets/stdlib_re.py`:
- Line 81: The assertion uses re.fullmatch(...).groups(0) which returns a tuple,
causing the comparison to always fail; change the call to use the singular
.group(0) (or .group(1) if you prefer the first capture) on the Match object
returned by re.fullmatch so the expression compares a string ("1.25.38") to a
string rather than a tuple; locate the call to re.fullmatch in the snippet and
replace .groups(0) with .group(0).

---

Nitpick comments:
In `@crates/sre_engine/src/engine.rs`:
- Line 473: Replace the commit-style comment "// modified next.toplevel from
inherited to false" with a concise explanation of the invariant and rationale:
state why next.toplevel must be set to false (what property/state would break if
it remained inherited), what callers or code paths rely on this being false, and
any assumptions about downstream behavior or performance; apply the same
explanatory replacement at the other two occurrences that reference
next.toplevel so future maintainers understand the intent and not just the
change.

In `@extra_tests/snippets/stdlib_re.py`:
- Line 80: The regex in the re.fullmatch call uses an unescaped dot in the
non-capturing group (pattern "([0-9]++(?:.[0-9]+)*+)"), so it will match any
character instead of a literal period; update the pattern passed to re.fullmatch
to escape the dot in the group (i.e., change the "(?:.[0-9]+)" portion to
"(?:\.[0-9]+)") so the test enforces literal dots (reference the re.fullmatch
invocation and the pattern string).

Copy link
Contributor Author

@wvmscs wvmscs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

request review

Copy link
Member

@youknowone youknowone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much! Welcome to RustPython project

@github-actions
Copy link
Contributor

📦 Library Dependencies

The following Lib/ modules were modified. Here are their dependencies:

[x] lib: cpython/Lib/re
[x] lib: cpython/Lib/sre_compile.py
[x] lib: cpython/Lib/sre_constants.py
[x] lib: cpython/Lib/sre_parse.py
[ ] test: cpython/Lib/test/test_re.py (TODO: 14)

dependencies:

  • re

dependent tests: (62 tests)

  • re: test_android test_ast test_asyncio test_binascii test_builtin test_bytes test_ctypes test_dict test_dis test_docxmlrpc test_dtrace test_email test_faulthandler test_filecmp test_fileinput test_format test_fstring test_functools test_future_stmt test_genericalias test_glob test_hashlib test_http_cookiejar test_httplib test_httpservers test_imaplib test_importlib test_ipaddress test_launcher test_logging test_mailbox test_mmap test_optparse test_pprint test_pydoc test_re test_regrtest test_runpy test_site test_smtplib test_socket test_ssl test_strftime test_strtod test_symtable test_syntax test_sysconfig test_tempfile test_tools test_traceback test_typing test_unittest test_venv test_webbrowser test_winapi test_with test_wsgiref test_xmlrpc test_zipfile test_zipimport test_zoneinfo test_zstd

Legend:

  • [+] path exists in CPython
  • [x] up-to-date, [ ] outdated

@youknowone youknowone merged commit dfc5de7 into RustPython:main Feb 20, 2026
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants