HTML API: Reliably parse HTML in wp_html_split().
#9270
base: trunk
Test using WordPress Playground
The changes in this pull request can be previewed and tested using a WordPress Playground instance. WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser. Some limitations apply; for more details, check out the Limitations page in the WordPress Playground documentation.
I believe this would fix https://core.trac.wordpress.org/ticket/45387.
@github-actions why don’t I come in and mess with all of your work unsolicited, huh?
	$regex  = get_html_split_regex();
	$result = benchmark_pcre_backtracking( $regex, $input, 'split' );
	return $this->assertLessThan( 200, $result );
}
There is no longer a PCRE used in wp_html_split() and therefore no backtracking.
Trac ticket: Core-63694
This probably improves the performance in terms of both CPU time and memory compared to the old PCRE-based approach.
The previous code detected a non-escaped `<` as the start of an “element” and then replaced a newline in the following text with `<!-- wpnl -->`, since it thought it was replacing inside a tag. In the end, that placeholder was translated back into a raw `\n`.
Trac ticket: Core-63694
Replaces #6651
See: (#9270), #9850, #9851
Design feedback
- `<[[gallery]]>` to be an escaped shortcode inside an HTML tag, but HTML considers it plaintext instead of a tag (because the starting character after the initial `<` is not a letter).
  a. Is this actually a shortcode inside a tag to be ignored?
  b. Is this a shortcode inside a text node?
- `<[gallery]>` and the `[gallery]` shortcode translated into a tag name then this entire thing would become a tag on replacement.

Implementation
This probably improves the performance in terms of both CPU time and memory compared to the old PCRE-based approach.
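
The design feedback rests on the HTML rule that a `<` only opens a tag when the next character is a letter. The sketch below illustrates that rule in isolation. It is not the code in this pull request: `classify_less_than()` and `is_ascii_letter()` are hypothetical helpers invented for this example, and the classification is a deliberate simplification of the HTML tokenizer's "tag open" state.

```php
<?php
/**
 * Hypothetical helper, not part of WordPress or this pull request.
 * Reports whether $byte is a single ASCII letter.
 */
function is_ascii_letter( string $byte ): bool {
	return 1 === strlen( $byte )
		&& ( ( $byte >= 'a' && $byte <= 'z' ) || ( $byte >= 'A' && $byte <= 'Z' ) );
}

/**
 * Hypothetical helper, not part of WordPress or this pull request.
 * Classifies what a `<` at byte offset $at begins, loosely following
 * the "tag open" state of the HTML tokenizer.
 */
function classify_less_than( string $html, int $at ): string {
	$next = $html[ $at + 1 ] ?? '';

	// A start tag needs an ASCII letter right after the `<`.
	if ( is_ascii_letter( $next ) ) {
		return 'start-tag';
	}

	// An end tag needs `</` followed by an ASCII letter.
	if ( '/' === $next && is_ascii_letter( $html[ $at + 2 ] ?? '' ) ) {
		return 'end-tag';
	}

	// Simplification: `<!`, `<?`, and any remaining `</` are lumped
	// together as non-tag syntax (comments, doctypes, and so on).
	// The real tokenizer distinguishes more cases.
	if ( '!' === $next || '?' === $next || '/' === $next ) {
		return 'other-syntax';
	}

	// Anything else leaves the `<` as plain text.
	return 'text';
}

echo classify_less_than( '<[[gallery]]>', 0 ), "\n"; // text
echo classify_less_than( '<p>', 0 ), "\n";           // start-tag
echo classify_less_than( '</p>', 0 ), "\n";          // end-tag
echo classify_less_than( '<!-- note -->', 0 ), "\n"; // other-syntax
```

Under this rule `<[[gallery]]>` and `<[gallery]>` are both text as written, which is why the second case only turns into a tag if the shortcode's replacement begins with a letter.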