Today
I did a quick inspection of the handlers that would lead to the most simplification of the DOMTraverser -- there are just 3 of them (dedupe-ids, gen-anchors, add-link-attributes). Looking at --profile output of a few pages, those handlers account for < 1% of total time in most profiles. So, even if those handlers sped up 25%, that is 25% of < 1% of total time, and the total page speedup is going to be marginal.
Yesterday
This is now reproducible on testwiki as well. Looking at the line that throws the error seen on testwiki, it is only triggered if you got a valid list id and $silent is not set to true. Who passes $silent? Is it from the JS code?
From this 'git grep' output, I don't see any calls to setupForUser that call it with a true value:
maintenance/populateWithTestData.php: $repository->setupForUser();
src/Api/ApiReadingListsSetup.php: $list = $this->getReadingListRepository( $this->getUser() )->setupForUser();
src/ReadingListRepository.php: public function setupForUser( $silent = false ) {
src/ReadingListRepository.php: * Check whether reading lists have been set up for the given user (i.e. setupForUser() was
src/Rest/SetupHandler.php: $this->getRepository()->setupForUser();
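For context, a minimal sketch of the control flow I'm describing -- this is not the actual ReadingListRepository code; the helper name and the exception are placeholders. The point is that the throw is only reachable when a valid list id already exists and $silent is false, which matches every caller in the grep output above:

```php
// Hypothetical sketch, not the real ReadingListRepository::setupForUser().
// getDefaultListId() and the exception message are placeholders.
public function setupForUser( $silent = false ) {
	$listId = $this->getDefaultListId(); // placeholder lookup for an existing list
	if ( $listId ) {
		if ( !$silent ) {
			// The throw seen on testwiki: reachable only with a valid
			// list id and $silent === false, i.e. via every caller above.
			throw new LogicException( 'Reading lists already set up for this user' );
		}
		return $listId;
	}
	// ... otherwise create the default list ...
}
```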
Memory limit is set to 1400 MiB in the config repo.
While this patch isn't yet deployed everywhere on wmf.1 (I see that the backport to wmf.1 is scheduled for a late backport window today), I can confirm the old failure on enwiki where this change isn't yet live.
$ curl -X POST -H "Content-Type: application/json" --data '{ "wikitext": "== Hello Jupiter ==" }' 'https://en.wikipedia.org/w/rest.php/v1/transform/wikitext/to/html/||DBMS_PIPE.RECEIVE_MESSAGE(CHR(98)||CHR(98)||CHR(98)%2C15)||'
{"message":"Error: exception of type LogicException","httpCode":500,"httpReason":"Internal Server Error"}
In an attempt to verify and close this task, I ran into this. This is not from @mszabo's change but this response should not have been HTTP 403.
$ curl "https://en.wikipedia.org/w/rest.php/v1/page/||DBMS_PIPE.RECEIVE_MESSAGE(CHR(98)||CHR(98)||CHR(98)%2C15)||/html" {"errorKey":"rest-permission-denied-title","messageTranslations":{"en":"The user does not have rights to read title (||DBMS_PIPE.RECEIVE_MESSAGE(CHR(98)||CHR(98)||CHR(98),15)||)"},"httpCode":403,"httpReason":"Forbidden"}%
Regarding OOMs, after excluding user pages and FST-based langconversion pages (which have known issues), I found at least two pages that are legitimate OOMs (I haven't looked at the others closely):
Thu, May 15
I think we will need to solve some version of this for Parsoid since the current solution doesn't help Parsoid mitigate latencies (see T392261#10824804 for example).
Spot-checking other wikis for last month:
- nlwiki: all user pages
- kowiki: no timeouts
- jawiki: 14 across all namespaces, one user page & rest wikipedia namespace
- frwiki: user pages OR project pages like this with large lists
- itwiki: excluding user pages, Wikipedia pages, and project pages, there are 12 entries -- all of them seem to have been transient, are small pages, and all use timeline charts (so this could have been a transient timeline outage).
It might have been the same thing with https://en.wikipedia.org/w/index.php?title=Yuri%27s_Night&action=history and https://en.wikipedia.org/w/index.php?title=Gagarin%27s_Start&action=history which show a number of deleted revisions.
Aha .. so, revid 1288359999 on enwiki:Sputnik_1 is a vandalized version and has 15323 uses of Template:Chem_name and 15835 uses of Template:Sic. Using --profile, it turns out that WrapTemplates explodes in time usage on that page and takes 35s! So, that is worth fixing.
That turned out to be a nothingburger for the most part. Here is the dump of parse.php times on the above titles (after resolving redirects). Except for the two math pages (Filters_in_topology, List_of_set_identities_and_relations), everything else parses pretty quickly, and I confirmed with an "?action=purge" on two of the pages that they do render fine. So, except for those two titles, everything else was probably a transient timeout.
Wed, May 14
I downloaded the logstash data from the last month, extracted the URLs of exceptions that were timeouts, stripped the revision ids, and excluded the File, Template, Category, and *Talk namespaces as well:
Ankh_Morpork_City_Watch
Battle_of_khaybar
Fairbanks%2C_Morse_and_Company
Filters_in_topology
Gagarin%27s_Start
Good_Morning%2C_Judge
List_of_Evolve_Tag_Team_Champions
List_of_set_identities_and_relations
Magnum_Airlines_Helicopters
New_Super_Mario_Bros._(series)
Sputnik_1
Yuri%27s_Night
Looking just at enwiki timeouts in our Logstash dashboard for the last 3 months,
- If I exclude the "User:" and "Wikipedia:" namespaces, we have 2072 timeouts and 1629 OOMs.
- If I look at just the "User:" namespace, we have ~16000 timeouts, and ~19700 OOMs.
- If I look at just the "Wikipedia:" namespace, we have ~6900 timeouts and ~5800 OOMs.
No, we should at least investigate this.
Parsoid handles this correctly.
$ echo "; [[File:FAQ icon (Noun like).svg|20px]] Responses to questions : such as defined and free-form responses" | php bin/parse.php 141 ↵ <dl data-parsoid='{"dsr":[0,105,0,0]}'><dt data-parsoid='{"dsr":[0,64,1,0,1,1]}'><span typeof="mw:File" data-parsoid='{"optList":[{"ck":"width","ak":"20px"}],"dsr":[2,40,null,null]}'><a href="./File:FAQ_icon_(Noun_like).svg" class="mw-file-description" data-parsoid="{}"><img resource="./File:FAQ_icon_(Noun_like).svg" src="//upload.wikimedia.org/wikipedia/commons/thumb/1/17/FAQ_icon_%28Noun_like%29.svg/20px-FAQ_icon_%28Noun_like%29.svg.png" decoding="async" data-file-width="38" data-file-height="31" data-file-type="drawing" height="16" width="20" srcset="//upload.wikimedia.org/wikipedia/commons/thumb/1/17/FAQ_icon_%28Noun_like%29.svg/30px-FAQ_icon_%28Noun_like%29.svg.png 1.5x, //upload.wikimedia.org/wikipedia/commons/thumb/1/17/FAQ_icon_%28Noun_like%29.svg/39px-FAQ_icon_%28Noun_like%29.svg.png 2x" class="mw-file-element" data-parsoid='{"a":{"resource":"./File:FAQ_icon_(Noun_like).svg","height":"16","width":"20"},"sa":{"resource":"File:FAQ icon (Noun like).svg"}}'/></a></span> Responses to questions</dt><dd data-parsoid='{"stx":"row","dsr":[64,105,1,0,1,0]}'>such as defined and free-form responses</dd></dl>
Tue, May 13
Thanks!
Anytime today or tomorrow works. We'll hold off running rt-testing till the reboot happens.
Mon, May 12
T306679 is the other task I worked on related to performance outliers; it had some patches merged and deployed.
T391416#10814194 reports the benefits from focusing on this work so far on an outlier page.
On current master (what is going to be tagged as v0.22.0-a2), parse time on this page is 0.68x of what it was on v0.21.0-a26. So, a pretty substantial improvement. Almost all of it comes from efficiencies in the token handler pipeline.
T393971 is another task I just filed.
This is what I have been doing with the patches I've submitted over the last 3 weeks. T391416 and T268584 have a bunch of tagged patches. I've looked at 5 or 10 pages at this point. I'll continue to do so and will file phab tasks based on the analyses. I am going to close this task as resolved since it doesn't need any additional action beyond creating specific actionable tasks based on reviewing pages from that performance data spreadsheet.
Fri, May 9
The goal here is to cache the entire wikilink processing going from a PEG wikilink token to the a-link HTML tokens. Wikilinks are commonly repeated on pages.
Thu, May 8
Even a simple and small page like https://en.wikipedia.org/wiki/Hospet has titles that repeat.
For bonus points, the cache implementation is smart enough that [[Foo|bar]] and [[Foo]] would still benefit from caching.
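To illustrate what that could look like: a minimal sketch, assuming a cache keyed on the link target alone, with the per-use label spliced in afterwards. None of these class or method names are Parsoid's actual implementation.

```php
// Hypothetical target-keyed wikilink cache, not Parsoid's actual code.
// The expensive per-target work (title normalization, link attribute
// computation) is cached once; the per-use label is spliced in afterwards,
// so [[Foo|bar]] and [[Foo]] hit the same cache entry.
class WikilinkCache {
	/** @var array<string,array> map: link target => computed link attributes */
	private array $cache = [];

	public function process( string $target, ?string $label ): string {
		if ( !isset( $this->cache[$target] ) ) {
			$this->cache[$target] = $this->computeLinkAttribs( $target );
		}
		$attribs = $this->cache[$target];
		// The label differs per use and is cheap; only target work is cached.
		$text = $label ?? $target;
		return "<a href=\"{$attribs['href']}\">" . htmlspecialchars( $text ) . '</a>';
	}

	private function computeLinkAttribs( string $target ): array {
		// Stand-in for the expensive title normalization / redlink checks.
		return [ 'href' => './' . str_replace( ' ', '_', $target ) ];
	}
}
```

The point is that the per-target cost is paid once per page, while the cheap per-use label handling stays outside the cache.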
Wed, May 7
The patches above create compound tokens for List & Indent-Pre. Tables are a bit trickier -- I haven't looked into it.
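Roughly what a compound token buys you -- a hedged sketch with hypothetical names (Parsoid's actual token classes differ): a run of related tokens is bundled into one token, so downstream transforms make a single pass over the whole construct instead of handling every nested token individually.

```php
// Hypothetical illustration, not Parsoid's actual token classes: bundle a
// run of list tokens into one compound token so each downstream transform
// sees one token instead of N.
class CompoundListToken {
	/** @var array the original li/dt/dd token run this token replaces */
	public array $nestedTokens;

	public function __construct( array $tokens ) {
		$this->nestedTokens = $tokens;
	}
}

// Instead of pushing N separate tokens through every transform:
$compound = new CompoundListToken( [ $li1, $li2, $li3 ] );
```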
Mon, May 5
I am not complaining about the incremental approach. All I am saying is: to deal with performance concerns, you have to eliminate unnecessary format conversions. That means your passes will (have to) fall into format-aligned buckets which you can use to make format a pipeline property.
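A minimal sketch of the format-aligned-buckets idea, with placeholder names throughout (convertTo() and the passes are illustrations, not actual Parsoid or MediaWiki APIs):

```php
// Hypothetical illustration of format-aligned buckets; all names here
// are placeholders.
function convertTo( string $format, $content ) {
	// Stand-in for the wikitext <-> DOM conversion -- the cost we want
	// to pay only once per bucket boundary.
	return $content;
}

$identityPass = fn ( $c ) => $c; // stand-in for a real transformation pass

$pipeline = [
	'wikitext' => [ $identityPass, $identityPass ], // string-based passes
	'dom'      => [ $identityPass, $identityPass ], // DOM-based passes
];

$content = $wikitext;
foreach ( $pipeline as $format => $passes ) {
	$content = convertTo( $format, $content ); // the only conversion point
	foreach ( $passes as $pass ) {
		$content = $pass( $content );          // no conversions inside a bucket
	}
}
```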
I think creating a HTMLHolder or ContentHolder is orthogonal to what I am recommending.
Fri, May 2
Tue, Apr 29
Sun, Apr 27
Fri, Apr 25
This is because of TemplateHandlers::convertToString(..) call for invalid template names. That effectively leads to 2^N calls to PipelineUtils::parseContentInPipeline where N is the number of wrappers in that particular template. The real fix is to stop tokenizing template names and args in the PEG tokenizer since Parsoid shuttles template expansions to the preprocessor, so trying to tokenize and resolve template names inside Parsoid is not very useful. In the interim, there are a few patches coming that will mitigate this issue.
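A toy model of why this is 2^N, under the assumption that each wrapper level triggers two re-parses of its inner content (one via convertToString() and one via the normal path); this is a counting sketch, not the actual TemplateHandler code:

```php
// Hypothetical recursion shape: if every nesting level re-parses its inner
// content twice, N nested wrappers yield 2^N pipeline calls.
function pipelineCalls( int $wrappers ): int {
	if ( $wrappers === 0 ) {
		return 1; // one PipelineUtils::parseContentInPipeline call at the leaf
	}
	// One re-parse from convertToString() plus one from the normal path.
	return 2 * pipelineCalls( $wrappers - 1 );
}

// pipelineCalls( 10 ) === 1024 -- the exponential blowup described above.
```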
I didn't know about this task beforehand, but https://gerrit.wikimedia.org/r/c/mediawiki/services/parsoid/+/1138519 fixed this in Parsoid because we needed more precise/accurate profiles than what microtime gave us.
Thu, Apr 24
Tue, Apr 22
Apr 20 2025
Is this your local wiki? That patch hasn't yet been deployed to the WMF production wikis and will only go out the week of April 28th.
Apr 18 2025
Apr 17 2025
This page now renders, so this task is technically resolved.
Here are two possibilities:
- Take a look at ParserPipelineFactory and the definition of the "fullparse-wikitext-to-dom" pipeline for the top-level stages to create spans for. The only difference would be that I would collapse TokenTransform2 and TokenTransform3 into a single span called "TokenTransforms" (see the sketch after this list).
- Alternatively, take a look at the output emitted (and embedded in a HTML comment at the bottom of the page) by parse.php --profile (recommend running this on parsoidtest1001) and use that output as a recipe for what spans to emit.
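A sketch of the first option. The $tracer and $pipeline APIs are assumptions; only TokenTransform2 / TokenTransform3 come from the comment above, and the other span and stage names are illustrative:

```php
// Hypothetical sketch: one tracing span per top-level pipeline stage, with
// TokenTransform2 and TokenTransform3 collapsed into one "TokenTransforms"
// span. $tracer and $pipeline are assumed APIs, not Parsoid's.
$spanToStages = [
	'Tokenizer'       => [ 'Tokenizer' ],
	'TokenTransforms' => [ 'TokenTransform2', 'TokenTransform3' ],
	'TreeBuilder'     => [ 'TreeBuilder' ],
	'DOMProcessors'   => [ 'DOMProcessors' ],
];
foreach ( $spanToStages as $spanName => $stages ) {
	$span = $tracer->startSpan( $spanName ); // assumed tracing API
	foreach ( $stages as $stage ) {
		$pipeline->runStage( $stage );       // assumed pipeline API
	}
	$span->end();
}
```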
Apr 15 2025
Seems like a largish page. Even with the legacy parser, the page times out --> www.wikidata.org/wiki/Wikidata:Database reports/Constraint violations/P17?useparsoid=0 .. This is another instance of a page used by a bot as a log file / database table.
That is good then and explains why the RedLinking time isn't worse. But, given that both legacy and Parsoid need to do redlinking, Parsoid taking 2.5s just for redlinking when legacy takes 3.5s for the entire parse indicates some other inefficiency lurking there -- but it is not going to be the biggest bang for the buck. Maybe worth accounting for THP time since it makes up 50% of it.
Apr 14 2025
Apr 13 2025
Investigation notes: getReparseType() and other code have incorrect checks for "in extension content" because the wrapper node is both from an extension *and* from a template, so it skips over the entire template content when the first node is also an extension tag. T87274: DOM nodes with multiple typeof values is related.
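Roughly the shape of the buggy check, as a hedged sketch (not the actual getReparseType() code); the typeof values are Parsoid's standard mw:Transclusion / mw:Extension/* annotations:

```php
// Hypothetical sketch of the buggy vs. fixed check; $node is a wrapper
// element whose typeof can carry multiple values, e.g.
// typeof="mw:Transclusion mw:Extension/ref" (see T87274).
$typeof = $node->getAttribute( 'typeof' );

// Buggy: any extension typeof marks the node as "in extension content",
// so the entire (template!) subtree gets skipped.
$skip = str_contains( $typeof, 'mw:Extension/' );

// Fixed: a wrapper that is also a template wrapper must not be skipped.
$skip = str_contains( $typeof, 'mw:Extension/' )
	&& !str_contains( $typeof, 'mw:Transclusion' );
```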
Apr 11 2025
Cannot reproduce anymore. The page renders.
Cannot reproduce this anymore -- must have been fixed by some deployed code.
Cannot reproduce this anymore -- must have been fixed by some deployed code.
After purge, this issue looks fixed.