cscott (C. Scott Ananian)
Parser whisperer

User Details

User Since
Oct 21 2014, 6:47 PM (558 w, 5 d)
Availability
Available
IRC Nick
cscott
LDAP User
C. Scott Ananian
MediaWiki User
Cscott

Editor since 2005; WMF developer since 2013. I work on Parsoid and OCG, and dabble with VE, real-time collaboration, and OOjs.

On github: https://github.com/cscott

See https://en.wikipedia.org/wiki/User:cscott for more.

Recent Activity

Fri, Jul 4

cscott added a comment to T398655: mediawiki-phan-config minimum php target version for libraries is broken.

I'm still getting PhanCompatibleTrailingCommaParameterList on https://gerrit.wikimedia.org/r/c/mediawiki/services/parsoid/+/1166266 for some reason (PS).

Fri, Jul 4, 1:34 AM · phan

Thu, Jul 3

cscott updated subscribers of T398655: mediawiki-phan-config minimum php target version for libraries is broken.

Sorry for wasting your time! I assumed that wikipeg got a bump to mediawiki-phan-config at the same time @Jdforrester-WMF bumped the min version in the composer.json.

Thu, Jul 3, 8:06 PM · phan
cscott renamed T398656: PageEditStash (and Content) should use JSON serialization from PageEditStash should use JSON serialization to PageEditStash (and Content) should use JSON serialization.
Thu, Jul 3, 6:48 PM · Patch-For-Review, Content-Transform-Team (Work In Progress), MediaWiki-General, JsonCodec
cscott added a subtask for T161647: RFC: Deprecate using php serialization inside MediaWiki: T398656: PageEditStash (and Content) should use JSON serialization.
Thu, Jul 3, 6:45 PM · Patch-For-Review, Security, TechCom, Platform Team Legacy (Watching / External), TechCom-RFC (TechCom-RFC-Closed), Services (watching)
cscott added a parent task for T398656: PageEditStash (and Content) should use JSON serialization: T161647: RFC: Deprecate using php serialization inside MediaWiki.
Thu, Jul 3, 6:45 PM · Patch-For-Review, Content-Transform-Team (Work In Progress), MediaWiki-General, JsonCodec
cscott updated the task description for T398656: PageEditStash (and Content) should use JSON serialization.
Thu, Jul 3, 6:28 PM · Patch-For-Review, Content-Transform-Team (Work In Progress), MediaWiki-General, JsonCodec
cscott created T398656: PageEditStash (and Content) should use JSON serialization.
Thu, Jul 3, 6:23 PM · Patch-For-Review, Content-Transform-Team (Work In Progress), MediaWiki-General, JsonCodec
cscott assigned T398655: mediawiki-phan-config minimum php target version for libraries is broken to Umherirrender.
Thu, Jul 3, 5:57 PM · phan
cscott created T398655: mediawiki-phan-config minimum php target version for libraries is broken.
Thu, Jul 3, 5:55 PM · phan
cscott added a comment to T398402: Remaining feature parity issues between the two Cite parsers.

It appears that Parsoid allows much more complex nesting of <ref> tags. This was intentionally blocked in the classic parser.

Thu, Jul 3, 2:28 PM · MW-1.45-notes (1.45.0-wmf.9; 2025-07-08), Content-Transform-Team, Epic, Cite

Wed, Jul 2

cscott added a comment to T397789: Define stable interface policy and coding conventions for named arguments.

We don't have to manually review 4000 method signatures. We can assume that the original authors gave their method parameters reasonable names, and review the fewer cases in the Make method parameter names consistent for PHP 8 named arguments patch where there seems to be some disagreement. If even that is too hard, we can add @no-named-arguments only to those classes where there is an existing disagreement, as highlighted by that patch.

Wed, Jul 2, 9:08 PM · Patch-For-Review, MW-Interfaces-Team, MediaWiki-General
cscott added a comment to T390629: Wikimedia\Assert\UnreachableException: Trying to fetch node data without loading!.

This is still causing some production errors due to bad content stuck in the ParserCache, but that should resolve itself with time.

Wed, Jul 2, 8:56 PM · Essential-Work, Content-Transform-Team (Work In Progress), Parsoid, Wikimedia-production-error
cscott merged T398489: Wikimedia\Assert\UnreachableException: Trying to fetch node data without loading! Node id: 63 Stored data: {"parsoid":{"stx":"piped","a":{"href":"./jour"},"sa":{"href":"jour"},"dsr":{"start":779,"end":793,"openWidth":7,"closeWi into T390629: Wikimedia\Assert\UnreachableException: Trying to fetch node data without loading!.
Wed, Jul 2, 8:55 PM · Essential-Work, Content-Transform-Team (Work In Progress), Parsoid, Wikimedia-production-error
cscott merged task T398489: Wikimedia\Assert\UnreachableException: Trying to fetch node data without loading! Node id: 63 Stored data: {"parsoid":{"stx":"piped","a":{"href":"./jour"},"sa":{"href":"jour"},"dsr":{"start":779,"end":793,"openWidth":7,"closeWi into T390629: Wikimedia\Assert\UnreachableException: Trying to fetch node data without loading!.
Wed, Jul 2, 8:55 PM · Content-Transform-Team, Parsoid, Wikimedia-production-error
cscott added a comment to T398489: Wikimedia\Assert\UnreachableException: Trying to fetch node data without loading! Node id: 63 Stored data: {"parsoid":{"stx":"piped","a":{"href":"./jour"},"sa":{"href":"jour"},"dsr":{"start":779,"end":793,"openWidth":7,"closeWi.

Yeah, almost certainly fixed, the bad content (html with content outside of the <body>) just takes a while to leave the parser cache.

Wed, Jul 2, 8:55 PM · Content-Transform-Team, Parsoid, Wikimedia-production-error
cscott merged T390628: Wikimedia\Assert\InvariantException: Invariant failed: Bogus nodeId given! into T390629: Wikimedia\Assert\UnreachableException: Trying to fetch node data without loading!.
Wed, Jul 2, 8:54 PM · Essential-Work, Content-Transform-Team (Work In Progress), Parsoid, Wikimedia-production-error
cscott merged task T390628: Wikimedia\Assert\InvariantException: Invariant failed: Bogus nodeId given! into T390629: Wikimedia\Assert\UnreachableException: Trying to fetch node data without loading!.
Wed, Jul 2, 8:54 PM · Essential-Work, Content-Transform-Team (Work In Progress), Parsoid, Wikimedia-production-error
cscott added a comment to T390628: Wikimedia\Assert\InvariantException: Invariant failed: Bogus nodeId given!.

Yeah, we probably fixed it, there's just bad content in the cache that will take a while to expire.

Wed, Jul 2, 8:54 PM · Essential-Work, Content-Transform-Team (Work In Progress), Parsoid, Wikimedia-production-error
cscott added a comment to T390628: Wikimedia\Assert\InvariantException: Invariant failed: Bogus nodeId given!.

Almost certainly a dup of T390629: Wikimedia\Assert\UnreachableException: Trying to fetch node data without loading! which I thought we'd fixed.

Wed, Jul 2, 8:45 PM · Essential-Work, Content-Transform-Team (Work In Progress), Parsoid, Wikimedia-production-error
cscott added a comment to T398489: Wikimedia\Assert\UnreachableException: Trying to fetch node data without loading! Node id: 63 Stored data: {"parsoid":{"stx":"piped","a":{"href":"./jour"},"sa":{"href":"jour"},"dsr":{"start":779,"end":793,"openWidth":7,"closeWi.

Almost certainly the same problem as T390629: Wikimedia\Assert\UnreachableException: Trying to fetch node data without loading!, which I thought we'd fixed.

Wed, Jul 2, 8:45 PM · Content-Transform-Team, Parsoid, Wikimedia-production-error
cscott merged T398488: PHP Warning: Undefined array key 4 into T390344: v3 parserfunction serialization doesn't properly support named arguments.
Wed, Jul 2, 7:43 PM · Patch-For-Review, Parsoid-Read-Views (Phase 3 - Main namespace of officewiki / mediawiki.org renders with Parsoid), Parsoid
cscott merged task T398488: PHP Warning: Undefined array key 4 into T390344: v3 parserfunction serialization doesn't properly support named arguments.
Wed, Jul 2, 7:43 PM · Content-Transform-Team, Parsoid, Wikimedia-production-error
cscott added a comment to T398488: PHP Warning: Undefined array key 4.

This is a dup of T390344: v3 parserfunction serialization doesn't properly support named arguments and is caused by using named arguments with wikifunctions, which is not (yet) supported.

Wed, Jul 2, 7:42 PM · Content-Transform-Team, Parsoid, Wikimedia-production-error
cscott added a comment to T397789: Define stable interface policy and coding conventions for named arguments.

To be clear, under current conventions PHP (and phan) will allow named argument calls unless there is an explicit @no-named-arguments annotation. There's a lot of work whichever way -- either someone needs to write a new phan plugin to enforce a new "no named arguments unless ..." and a new annotation type (unique to mediawiki) for permitting named arguments ... or we can use the existing tooling to sync up our argument names in short order with a small number of explicit @no-named-arguments annotations where appropriate (@Tacsipacsi makes the case that SpecialPage::execute() may be one such exception).

Wed, Jul 2, 6:00 PM · Patch-For-Review, MW-Interfaces-Team, MediaWiki-General
cscott closed T343997: Message should support FORMAT_HTML as Declined.

Declined in favor of T343994: OutputPage::setPageTitle() should not accept Message objects, introduce OutputPage::setPageTitleMsg().

Wed, Jul 2, 3:05 AM · Security, MediaWiki-Internationalization

Tue, Jul 1

cscott added a comment to T379525: Add a magic word that returns a page's DISPLAYTITLE.

One option would be to make displaytitle a separate metadata property independent of the page wikitext, like language links are now. Then the parser function could echo the metadata without causing circularity issues with wikitext parsing.

Tue, Jul 1, 8:42 PM · Patch-For-Review, patch-welcome, good first task, Content-Transform-Team, MediaWiki-Parser
cscott added a comment to T379525: Add a magic word that returns a page's DISPLAYTITLE.

New parser features should use {{#...}} syntax and be case sensitive. See T204370 and T389029.

Tue, Jul 1, 8:38 PM · Patch-For-Review, patch-welcome, good first task, Content-Transform-Team, MediaWiki-Parser
cscott added a project to T254522: Set appropriate wikitext limits for Parsoid to ensure it doesn't OOM: Parsoid-Read-Views (Performance).
Tue, Jul 1, 3:15 PM · Parsoid-Read-Views (Performance), affects-Kiwix-and-openZIM, Parsoid
cscott closed T325322: Performance implications of using dynamic properties in NodeData in newer versions of PHP as Resolved.
Tue, Jul 1, 3:10 PM · Patch-For-Review, OKR-Work, Parsoid-Read-Views (Performance), Content-Transform-Team (Work In Progress), Parsoid
cscott moved T373472: Make ParserMigration indicator optional from Q4 FY24-25 to In Progress on the Content-Transform-Team (Work In Progress) board.
Tue, Jul 1, 2:51 PM · Content-Transform-Team (Work In Progress), MW-1.44-notes (1.44.0-wmf.6; 2024-12-03), OKR-Work, MW-1.43-notes (1.43.0-wmf.6; 2024-05-21), MediaWiki-extensions-ParserMigration
cscott moved T365371: ParserMigration: Add "report visual bug" link from Q4 FY24-25 to In Progress on the Content-Transform-Team (Work In Progress) board.
Tue, Jul 1, 2:51 PM · Content-Transform-Team (Work In Progress), Parsoid-Read-Views (Small Size Wikipedias), OKR-Work, MediaWiki-extensions-ParserMigration
cscott moved T395946: PEG backtracking causes OOMs in some cases from Q4 FY24-25 to Code Review on the Content-Transform-Team (Work In Progress) board.
Tue, Jul 1, 2:50 PM · Content-Transform-Team (Work In Progress), Parsoid-Read-Views (Performance), Parsoid
cscott moved T393726: Cache WikiLink processing in WikiLinkHandler from Code Review to To Verify on the Content-Transform-Team (Work In Progress) board.
Tue, Jul 1, 2:49 PM · Content-Transform-Team (Work In Progress), Parsoid-Read-Views (Performance), Parsoid
cscott moved T391839: FlaggedRevs doesn't use Parsoid even the wiki is when configured to do so (or otherwise doesn't call Parsoid parser functions) from Code Review to To Verify on the Content-Transform-Team (Work In Progress) board.
Tue, Jul 1, 2:44 PM · Essential-Work, Parsoid-Read-Views (Wikifunctions Support), Content-Transform-Team (Work In Progress), FlaggedRevs
cscott moved T380530: Add Parsoid-compatible <link> tag to legacy parser output for redirects from Code Review to To Verify on the Content-Transform-Team (Work In Progress) board.
Tue, Jul 1, 2:44 PM · MW-1.45-notes (1.45.0-wmf.7; 2025-06-24), MW-1.44-notes (1.44.0-wmf.27; 2025-04-29), Essential-Work, Content-Transform-Team (Work In Progress), Patch-For-Review, Accessibility, MediaWiki-Redirects
cscott moved T392113: Parsoid doesn't use the "Stable" version for anonymous viewers with FlaggedRevs from Code Review to To Verify on the Content-Transform-Team (Work In Progress) board.
Tue, Jul 1, 2:44 PM · MW-1.45-notes (1.45.0-wmf.7; 2025-06-24), OKR-Work, Parsoid-Read-Views, Patch-For-Review, Content-Transform-Team (Work In Progress)

Mon, Jun 30

cscott added a comment to T363484: Update ParserMigration notice.

Updated the patchdemo with the render information as a separate sentence: https://patchdemo.wmcloud.org/wikis/d6bfc596bf/wiki/New_York_City

Mon, Jun 30, 7:50 PM · MW-1.45-notes (1.45.0-wmf.8; 2025-07-01), Patch-For-Review, Parsoid-Read-Views (Small Size Wikipedias), Content-Transform-Team (Work In Progress), OKR-Work, MediaWiki-extensions-ParserMigration
cscott added a comment to T398175: AbuseFilterConsequencesTest broken when $wgParsoidSettings['linting'] is set to true.

My wild guess would be that Parsoid is logging something (probably something inconsequential) when linting is turned on, and that's interfering with the naive way that the AF test is looking for its log entries:

The AF test in question is only making edits and then checking what change tags were applied to that edit. From a quick look at the code (getActionTags), it grabs the logging entry for the given page with the maximum log_id and matching type. This can obviously fail if tags were not applied, but also if it happens to be picking the wrong log entry.

Mon, Jun 30, 6:14 PM · MW-1.45-notes (1.45.0-wmf.9; 2025-07-08), Parsoid, AbuseFilter, ci-test-error (WMF-deployed Build Failure)
cscott added a comment to T397789: Define stable interface policy and coding conventions for named arguments.

Speaking just out of self-interest here, I would love this feature to be widely used for the simple sake of readability. I really appreciate reading a call fooSomething(page: $blah, user: $blat, showAll: true) instead of fooSomething($blah, $blat, true). So whatever we can reasonably do to get to "widely used", I'm in favour of.
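As a minimal sketch of the readability difference, here is how such a call looks in PHP 8.0+ named-argument syntax (name: value). The function fooSomething() and its parameters are hypothetical, borrowed from the comment above rather than from any MediaWiki API.

```php
<?php
declare(strict_types=1);

// Hypothetical function for illustration only; not part of MediaWiki.
function fooSomething(string $page, string $user, bool $showAll = false): string {
    return sprintf('%s|%s|%s', $page, $user, $showAll ? 'all' : 'some');
}

// Positional call: the reader has to look up what `true` means.
$positional = fooSomething('Main_Page', 'Example', true);

// Named call: each argument is labelled at the call site.
$named = fooSomething(page: 'Main_Page', user: 'Example', showAll: true);

// Named arguments may also be passed in any order.
$reordered = fooSomething(showAll: true, user: 'Example', page: 'Main_Page');
```

All three calls return the same string; only the call site differs. The flip side is that the parameter names become part of the function's interface.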

Mon, Jun 30, 5:22 PM · Patch-For-Review, MW-Interfaces-Team, MediaWiki-General
cscott updated subscribers of T397999: Have a way to include wikicode in a redirect that will be displayed at the top of the target page when the redirect is followed.

Discussed this task at the Content-Transform-Team tech forum today. A few notes:

  • @ihurbain was wondering how you planned to handle self links to redirects.
  • @ssastry noted that, if you wanted this to be a uniform feature of articles on toki wikipedia, you don't need the explicit __keeptitle__ page property, you could just make this a configuration variable. That would shrink the already-small Extension:MultiTitle by quite a bit, since a lot of what's going on at the moment is just boilerplate related to the __keeptitle__ magic word.
Mon, Jun 30, 4:05 PM · MediaWiki-Parser, MediaWiki-Redirects
cscott added a comment to T397789: Define stable interface policy and coding conventions for named arguments.

It's also worth noting briefly that by default psalm propagates the parameter names of the base class/interface to all of its subclasses/implementations. It's relatively easy to do a bulk rename by renaming the parameter on the base class/interface and then re-running psalm --alter --issues=ParamNameMismatch; you can cherry-pick the base psalm configuration from DNM: add psalm checker. So we can have a discussion about what 'good' parameter names are during the 1.45 cycle and make updates fairly easily before we fix the names in place going forward.

Mon, Jun 30, 2:21 PM · Patch-For-Review, MW-Interfaces-Team, MediaWiki-General
cscott added a comment to T397789: Define stable interface policy and coding conventions for named arguments.

This would immediately cement the names of the parameters of all existing stable methods. These names were not chosen with long-term stability and consistency in mind. We'd have to go over all of them and fix at least the really bad ones, since changing them later is rather fiddly.

Mon, Jun 30, 1:59 PM · Patch-For-Review, MW-Interfaces-Team, MediaWiki-General

Sun, Jun 29

cscott added a comment to T397789: Define stable interface policy and coding conventions for named arguments.

Let's also consider how you would safely rename a parameter.

function foo($oldName) { ... }

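First, add the new name, making it available for invocation as a named parameter, while preserving argument order:

function foo($newName = null, $oldName = null) {
  // $newName needs a default as well; otherwise foo(oldName: $x) fails with an ArgumentCountError.
  if ($oldName !== null) {
    $newName = $oldName;
    wfDeprecated(__METHOD__ . ' with $oldName', '1.xx');
  }
  // ... original body, using $newName ...
}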

then after the appropriate amount of time, the trailing $oldName parameter can be removed.
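The transition step can be exercised in a standalone script. This is a sketch under stated assumptions, not MediaWiki code: greet() is a hypothetical function, DeprecationLog is a hypothetical stand-in for wfDeprecated() that only records the message, and both parameters are given null defaults so that a call using only the old name still binds.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for MediaWiki's wfDeprecated(); it only records messages.
final class DeprecationLog {
    /** @var string[] */
    public static array $entries = [];

    public static function record(string $what, string $version): void {
        self::$entries[] = "$what (since $version)";
    }
}

// Transitional signature: the new name comes first, the old name trails it.
function greet(?string $newName = null, ?string $oldName = null): string {
    if ($oldName !== null) {
        // Caller used the old parameter name; honour it but warn.
        $newName = $oldName;
        DeprecationLog::record(__FUNCTION__ . ' with $oldName', '1.xx');
    }
    if ($newName === null) {
        // Both parameters are optional now, so the missing-argument check is manual.
        throw new ArgumentCountError('greet() expects a name');
    }
    return "Hello, $newName";
}

$positional = greet('Ada');          // existing positional callers keep working
$viaNew = greet(newName: 'Ada');     // new named callers
$viaOld = greet(oldName: 'Ada');     // old named callers work, with a deprecation
```

Only the last call adds an entry to DeprecationLog::$entries. Once the deprecation period ends, the trailing parameter and the manual null check can be dropped together.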

Sun, Jun 29, 7:52 PM · Patch-For-Review, MW-Interfaces-Team, MediaWiki-General
cscott added a comment to T310511: Metadata comparison testing between Parsoid and the legacy parser.

This can be based on the code we added to the Linter to do real time performance comparisons: T393399.

Sun, Jun 29, 7:48 PM · Parsoid-Read-Views (Phase 4 - Parsoid generates metadata needed by core), Parsoid
cscott added a comment to T397789: Define stable interface policy and coding conventions for named arguments.

Just as a straw proposal: why not do the opposite? Method parameter names should be considered part of the API unless you have an explicit @no-named-arguments annotation?

Sun, Jun 29, 7:26 PM · Patch-For-Review, MW-Interfaces-Team, MediaWiki-General
cscott added a comment to T396813: Make use of PHP 8.0 and 8.1 features in existing code.

Named arguments require that all implementations/overrides of a method use the same name for the method arguments. psalm has a sniff for this and an automatic fixer, I don't know if phan/phpcs do? This should be turned on IMO as it is a prerequisite to using named arguments in our codebase. (I've already merged patches to fix Parsoid's argument naming, written by psalm.)

phpcs wouldn't really do it, it's mostly for syntax-level code conventions, and doing anything that needs to analyze the world like this is terribly unpleasant.

phan could do it, but it doesn't quite: it checks function/method calls (which are always a runtime error), but doesn't check method overrides (which only cause a runtime error at call time, but the overriding itself is fine according to PHP, just ill-advised – see Parameter name changes during inheritance). I could probably write a plugin or an upstream patch to warn about this too.

See also T397789: Define stable interface policy and coding conventions for named arguments (I just created it based on discussion elsewhere).
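The override case described above can be reproduced in a few lines. This is an illustrative sketch with hypothetical names (Renderer, UpperRenderer, show()); it shows that PHP accepts an override that renames a parameter, and that the failure only surfaces when a caller uses the interface's parameter name.

```php
<?php
declare(strict_types=1);

// Hypothetical interface; the parameter is called $text here.
interface Renderer {
    public function render(string $text): string;
}

// Legal PHP: the implementation renames the parameter to $input.
class UpperRenderer implements Renderer {
    public function render(string $input): string {
        return strtoupper($input);
    }
}

// The caller only knows the interface, so it uses the interface's name.
function show(Renderer $renderer): string {
    return $renderer->render(text: 'hello');
}

// Positional calls are unaffected by the rename.
$positional = (new UpperRenderer())->render('hello');

// The named call fails at call time, not at class declaration time.
$message = null;
try {
    show(new UpperRenderer());
} catch (Error $e) {
    $message = $e->getMessage();
}
```

A static check for mismatched parameter names in overrides would catch this before it reaches a caller.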

Sun, Jun 29, 7:22 PM · Patch-For-Review, MW-1.45-notes (1.45.0-wmf.8; 2025-07-01), phan, MediaWiki-Codesniffer, MediaWiki-extensions-General, MediaWiki-General
cscott added a comment to T363484: Update ParserMigration notice.

Patch demo at https://patchdemo.wmcloud.org/wikis/e5a16a7d5c/wiki/New_York_(magazine) -- no back end yet, but this is a good place to check the UX layout of the dialog.

Sun, Jun 29, 6:34 PM · MW-1.45-notes (1.45.0-wmf.8; 2025-07-01), Patch-For-Review, Parsoid-Read-Views (Small Size Wikipedias), Content-Transform-Team (Work In Progress), OKR-Work, MediaWiki-extensions-ParserMigration

Fri, Jun 27

cscott updated subscribers of T363484: Update ParserMigration notice.

In his review, @Arlolra pointed out that the footer text is not used in the MinervaNeue skin. (I tested every other skin available on patchdemo and they looked fine).

Fri, Jun 27, 10:54 PM · MW-1.45-notes (1.45.0-wmf.8; 2025-07-01), Patch-For-Review, Parsoid-Read-Views (Small Size Wikipedias), Content-Transform-Team (Work In Progress), OKR-Work, MediaWiki-extensions-ParserMigration
cscott added a comment to T397999: Have a way to include wikicode in a redirect that will be displayed at the top of the target page when the redirect is followed.

(I've also looked at the code for Extension:KeepTitle which actually seems like a relatively small and efficient way to do what you want, using existing Hooks in completely appropriate ways. I'm inclined to think it would be easier to have WMF adopt that extension than to have WMF resource a new #redirect functionality, but ¯\_(ツ)_/¯)

Fri, Jun 27, 8:13 PM · MediaWiki-Parser, MediaWiki-Redirects
cscott added a comment to T397999: Have a way to include wikicode in a redirect that will be displayed at the top of the target page when the redirect is followed.

For the record, I'm on a personal quest (T204370) to eventually replace all the "magic words" in wikitext with a uniform alternative {{#....}} syntax. This would make it easier to add additional arguments, as was noted. So:

{{#redirect|NewTitle|special message}}

is a somewhat reasonable straw-proposal. That would put 'old title' => 'special message' in a db table (maybe a new column of the existing redirects table), and then the existing code which puts the redirect header at the top (OutputTransform/Stages/AddRedirectHeader.php and the post-cache part of WikitextContentHandler::fillParserOutput which calls setRedirectHeader) could swap in a custom redirect header where appropriate. Since this modification would be done post-cache it wouldn't require splitting the parser cache.

Fri, Jun 27, 7:50 PM · MediaWiki-Parser, MediaWiki-Redirects
cscott added a comment to T397999: Have a way to include wikicode in a redirect that will be displayed at the top of the target page when the redirect is followed.

As I think @bvibber probably notes, implementing this literally as the bug description states would be A Bad Idea, since it would mean that each redirected title would have to be cached as an entirely separate object, since the content of each page could depend in arbitrary ways on the "original title". That's a nonstarter, and I suspect the task description should be changed to describe /what/ is wanted, ie "text at the top that depends on the original redirect title" not the /how/ (a specific magic word).

Fri, Jun 27, 4:06 PM · MediaWiki-Parser, MediaWiki-Redirects

Thu, Jun 26

cscott added a comment to T303788: InvalidArgumentException: Asked for code outside of range (55296).

This was probably related to the fix I landed in 4cd16aa25622b2f63ba57dc8fef9c344ae2e4e5b?

Thu, Jun 26, 9:28 PM · MediaWiki-libs-utfnormal
cscott added a comment to T380516: Introduce ParserOptions type for "propagated to ParserOutput but not otherwise affecting the parse".

This is partially done: we have ParserOutput::setFromParserOptions(ParserOptions $po) now which transfers a number of things: wrapper class (T303615), section edit link flag, the collapsible sections flag (T363496) and the preview flag (T341010). The missing piece is to add a list of these in ParserOptions akin to the existing $initialCacheVaryingOptionsHash, $initialLazyOptions, etc lists so that these "only for ParserOutput" options aren't included in the cache key -- right now collapsibleSections and suppressSectionEditLinks are included in the cache key, and all of these are included in ParserOptions::matches() etc.

Thu, Jun 26, 3:03 PM · Parsoid-Read-Views (Performance), Parsoid
cscott added a comment to T303615: There's no reason to split the ParserCache on wrapclass.

Wrapclass was removed from the parser cache key in 8b0e7298ac25ccbb83d382bcb81f6c9433d053ca in 2017.

Thu, Jun 26, 3:02 PM · MediaWiki-Parser
cscott closed T385609: Wikimedia\Assert\InvariantException: Invariant failed: Bad UTF-8 at start of string as Resolved.

Likely an extension returning bad UTF-8 data. The stack trace seems to indicate that the bad UTF-8 was wrapped as a strip marker. https://he.wikipedia.org/wiki/%D7%95%D7%99%D7%A7%D7%99%D7%A4%D7%93%D7%99%D7%94%3A%D7%97%D7%93%D7%A9%D7%95%D7%AA doesn't display any errors at present, and during February we were still actively working on the fragment/strip marker code.

Thu, Jun 26, 2:53 PM · Parsoid, Wikimedia-production-error
cscott added a comment to T385753: Temporary data lost when serializing dom to html.

I suspect this has been fixed by our continued work in this area. In particular, we don't round-trip through HTML strings much (if at all) any more, so the issue roundtripping through the string form should have been fixed.

Thu, Jun 26, 2:48 PM · Content-Transform-Team (Work In Progress), Parsoid
cscott closed T373256: Define ParserCache strategy for pages with content placeholder slots (as with wikifunctions), a subtask of T374870: [EPIC] Support Wikifunctions on wikis with Parsoid readviews enabled, as Resolved.
Thu, Jun 26, 2:46 PM · Parsoid-Read-Views (Wikifunctions Support), Content-Transform-Team (Work In Progress), Abstract Wikipedia team (25Q2 (Oct–Dec)), OKR-Work, Wikifunctions, Epic
cscott closed T373256: Define ParserCache strategy for pages with content placeholder slots (as with wikifunctions), a subtask of T392133: Async content needs !misermode, as Resolved.
Thu, Jun 26, 2:46 PM · Essential-Work, MediaWiki-Parser, Abstract Wikipedia team, Content-Transform-Team (Work In Progress)
cscott closed T373256: Define ParserCache strategy for pages with content placeholder slots (as with wikifunctions) as Resolved.

We've implemented a reasonable strategy for now; see patches above. We also have metrics in place (part of the wikifunctions SLO/SLA) which will allow us to determine if our caching strategy works/backfires/etc. So we can resolve this for now, and open a new task if the metrics show that we need to revisit our initial caching strategy.

Thu, Jun 26, 2:46 PM · MW-1.44-notes (1.44.0-wmf.27; 2025-04-29), Parsoid-Read-Views (Wikifunctions Support), Parsoid, Abstract Wikipedia team, OKR-Work, Wikifunctions
cscott created T397939: Revisit how localization markup works with VE.
Thu, Jun 26, 2:36 PM · Content-Transform-Team, VisualEditor, Parsoid
cscott closed T396886: TokenUtils: PHP Warning: foreach() argument must be of type array|object, null given as Resolved.

Verified that the last instance of this exception was on Jun 19.

Thu, Jun 26, 2:20 PM · Content-Transform-Team (Work In Progress), User-brennen, Parsoid, Wikimedia-production-error
cscott added a comment to T316050: Sunset Reading List browser Extension on Web.

The Content-Transform-Team in theory owns Reading Lists, but none of us have actually touched that code AFAIK. I think clearly communicating the sunset is preferable to maintaining an illusion of support.

Thu, Jun 26, 2:14 PM · Wikipedia-Android-App-Backlog, Content-Transform-Team

Wed, Jun 25

cscott added a comment to T396835: Interwiki links with double underscore get rendered as single underscore.

Space normalization (replace all spaces with _, compress runs of spaces) is part of the wiki title normalization and shouldn't be changed.
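A simplified sketch of the normalization step described above, assuming spaces and underscores are treated as equivalent. normalizeTitleSpaces() is a hypothetical helper written for illustration; it is not MediaWiki's actual title code, which handles more cases than this.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: collapse runs of spaces/underscores into one underscore.
function normalizeTitleSpaces(string $title): string {
    // Spaces and underscores are interchangeable in titles, so match both.
    $collapsed = preg_replace('/[ _]+/', '_', $title) ?? $title;
    // Leading and trailing separators carry no meaning either.
    return trim($collapsed, '_');
}

$fromSpaces = normalizeTitleSpaces('New  York   City');
$fromUnderscores = normalizeTitleSpaces('New__York_City');
```

Under this rule a doubled underscore in a link target and a single one name the same page, which is why the link renders with a single underscore.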

Wed, Jun 25, 5:48 PM · Content-Transform-Team (Work In Progress), MediaWiki-Platform-Team (Radar), MediaWiki-Parser, MediaWiki-Interwiki

Tue, Jun 24

cscott added a comment to T393308: Charts with errors are not editable in VE.

<p><wiki-chart><div>......</div></wiki-chart></p> is parsed by every browser as <p><wiki-chart></wiki-chart></p><div>......</div><p></p> so that's not a parsoid bug per se. It's one of the dangers of using custom elements, which are always parsed as inline AFAIK.

Tue, Jun 24, 8:48 PM · Charts, Reader Growth Team, MW-1.45-notes (1.45.0-wmf.8; 2025-07-01), VisualEditor
cscott added a comment to T396813: Make use of PHP 8.0 and 8.1 features in existing code.

Named arguments require that all implementations/overrides of a method use the same name for the method arguments. psalm has a sniff for this and an automatic fixer, I don't know if phan/phpcs do? This should be turned on IMO as it is a prerequisite to using named arguments in our codebase. (I've already merged patches to fix Parsoid's argument naming, written by psalm.)

Tue, Jun 24, 3:55 PM · Patch-For-Review, MW-1.45-notes (1.45.0-wmf.8; 2025-07-01), phan, MediaWiki-Codesniffer, MediaWiki-extensions-General, MediaWiki-General
cscott added a comment to T383328: Kartographer map overlay not visible in Parsoid rendering.

I like the second idea (add useparsoid=1 to ApiQueryMapData).

Tue, Jun 24, 3:12 PM · Parsoid-Read-Views (Small Size Wikipedias), Maps (Kartographer)

Mon, Jun 23

cscott moved T363484: Update ParserMigration notice from Blocked to In Progress on the Content-Transform-Team (Work In Progress) board.
Mon, Jun 23, 3:05 PM · MW-1.45-notes (1.45.0-wmf.8; 2025-07-01), Patch-For-Review, Parsoid-Read-Views (Small Size Wikipedias), Content-Transform-Team (Work In Progress), OKR-Work, MediaWiki-extensions-ParserMigration

Sat, Jun 21

cscott added a comment to T392775: Add link color for temporary usernames.

It is stuck awaiting review. It got to a C+1 and never managed to get merged. It's not clear which team is ultimately responsible for the review.

Sat, Jun 21, 4:03 AM · MW-1.45-notes (1.45.0-wmf.9; 2025-07-08), Temporary accounts (Global wiki rollout), Essential-Work, Patch-For-Review, MediaWiki-General, Content-Transform-Team (Work In Progress)

Thu, Jun 19

cscott added a project to T161278: Add default gadget styling to Parsoid's output: Content-Transform-Team (Work In Progress).
Thu, Jun 19, 6:21 PM · Wikimania-Hackathon-2025, Content-Transform-Team (Work In Progress), MW-Interfaces-Team, Patch-Needs-Improvement, affects-Kiwix-and-openZIM, Parsoid-Rendering, Platform Engineering, Parsoid, MediaWiki-extensions-Gadgets, MediaWiki-Action-API
cscott added a comment to T363484: Update ParserMigration notice.

Sorry, you'll need the parser migration extension installed, and to have parsoid read views enabled for your account:
https://www.mediawiki.org/wiki/Help:Extension%3AParserMigration
That is installed on beta and testwiki, I believe, so it should just be a matter of enabling parsoid read views in your user preferences.

Thu, Jun 19, 2:20 PM · MW-1.45-notes (1.45.0-wmf.8; 2025-07-01), Patch-For-Review, Parsoid-Read-Views (Small Size Wikipedias), Content-Transform-Team (Work In Progress), OKR-Work, MediaWiki-extensions-ParserMigration
cscott closed T397389: Test Task as Invalid.
Thu, Jun 19, 4:04 AM · Trash
cscott renamed T397389: Test Task from Assemble in the barnyard to Test Task.
Thu, Jun 19, 4:03 AM · Trash
cscott created T397389: Test Task.
Thu, Jun 19, 4:02 AM · Trash

Mon, Jun 16

cscott updated the task description for T397073: CTT Tasks week of 2025-06-13.
Mon, Jun 16, 8:08 PM · MW-1.45-notes (1.45.0-wmf.6; 2025-06-17), Essential-Work, Content-Transform-Team (Work In Progress)
cscott updated the task description for T397073: CTT Tasks week of 2025-06-13.
Mon, Jun 16, 4:41 PM · MW-1.45-notes (1.45.0-wmf.6; 2025-06-17), Essential-Work, Content-Transform-Team (Work In Progress)
cscott updated the task description for T397073: CTT Tasks week of 2025-06-13.
Mon, Jun 16, 4:40 PM · MW-1.45-notes (1.45.0-wmf.6; 2025-06-17), Essential-Work, Content-Transform-Team (Work In Progress)
cscott renamed T396208: CTT tasks week of 2025-06-06 from CTT tasks week of 2025-06-05 to CTT tasks week of 2025-06-06.
Mon, Jun 16, 4:35 PM · MW-1.45-notes (1.45.0-wmf.5; 2025-06-10), Essential-Work, Content-Transform-Team (Work In Progress)
cscott created T397073: CTT Tasks week of 2025-06-13.
Mon, Jun 16, 4:30 PM · MW-1.45-notes (1.45.0-wmf.6; 2025-06-17), Essential-Work, Content-Transform-Team (Work In Progress)
cscott renamed T396208: CTT tasks week of 2025-06-06 from CTT tasks week of 2025-05-06 to CTT tasks week of 2025-06-05.
Mon, Jun 16, 4:27 PM · MW-1.45-notes (1.45.0-wmf.5; 2025-06-10), Essential-Work, Content-Transform-Team (Work In Progress)
cscott added a comment to T396615: ImageMap fails to render when useParsoid=1 if links are providing using the "pipetrick".

See T4700: Pre-save transform skips extensions using wikitext (gallery, references, footnotes, Cite, status indicators, pipe trick, subst, signatures), which has a patch at https://gerrit.wikimedia.org/r/c/mediawiki/core/+/981193, though I think https://gerrit.wikimedia.org/r/c/mediawiki/core/+/607367 is the better patch.

Mon, Jun 16, 3:23 PM · Content-Transform-Team, ImageMap

Sat, Jun 14

cscott added a comment to T396913: Possible parsercache corruption in Charts localization.

Because the legacy parser does language conversion pre-cache, we might already be splitting the parser cache by user variant, but Parsoid expects to do language conversion post-cache. It is more technically correct to use the user language here, because user-language strings are also pre-converted, excluded from the language converter, etc. That said, your content might be hidden behind a strip marker, and thus hidden from the language converter anyway.

Sat, Jun 14, 2:56 AM · Charts, Reader Growth Team, MW-1.45-notes (1.45.0-wmf.7; 2025-06-24), Chinese-Sites

Thu, Jun 12

cscott updated subscribers of T396656: TypeError: MediaWiki\Parser\ParserOutput::appendJsConfigVar(): Argument #2 ($value) must be of type string, int given.

Welcome to my favorite misfeature of PHP: numeric strings used as keys to associative arrays are silently converted to ints. I count 16 (string) casts in ParserOutput alone, all due to this same misfeature. phan doesn't flag this, likely because the false positive rate would be high. I've changed a number of APIs to make the problem less likely to occur for clients. For example, ParserOutput::getCategoryNames() returns a list<string> which, in combination with ParserOutput::getCategorySortKey(), avoids the problems of the previous interface (::getCategoryMap()). That interface directly exported an array<string,string> which was really (secretly) an array<string|int,string>, trapping the unwary (::getCategoryMap() at least documents the correct type now).
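
A minimal sketch of the key conversion described above. The array contents and the describe() function are made up for illustration and are not taken from ParserOutput, and strict_types is enabled here only so that the int is rejected rather than coerced:

```php
<?php
declare( strict_types=1 );

// Hypothetical category map: every key is written as a string.
$sortKeys = [ '2024' => 'year', '007' => 'bond', 'Foo' => 'foo' ];

// PHP has already rewritten the canonical-decimal key '2024' as int 2024;
// '007' (leading zeros) and 'Foo' stay strings.
$names = array_keys( $sortKeys );
var_dump( $names );

// Hypothetical string-only consumer, standing in for a strictly typed API.
function describe( string $name ): string {
	return "Category:$name";
}

$failed = false;
try {
	describe( $names[0] );
} catch ( TypeError $e ) {
	// An int arrived where a string was required.
	$failed = true;
}

// The usual workaround: cast keys back to string at the boundary.
$safe = array_map( 'strval', $names );
```

Returning a list of names and looking each sort key up separately, rather than exposing the keyed array, keeps callers from ever iterating over the converted keys.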

Thu, Jun 12, 4:15 PM · MW-1.45-notes (1.45.0-wmf.5; 2025-06-10), Content-Transform-Team (Work In Progress), User-brennen, Parsoid, Wikimedia-production-error
cscott added a comment to T390629: Wikimedia\Assert\UnreachableException: Trying to fetch node data without loading!.

This orphan tag shows up locally using the API as well, but you have to use body_only=false:

$ php bin/parse.php --domain da.wiktionary.org --page mercredi --wrapSections --body_only=false
...
</section></section></body>
Thu, Jun 12, 3:55 PM · Essential-Work, Content-Transform-Team (Work In Progress), Parsoid, Wikimedia-production-error
cscott added a comment to T390629: Wikimedia\Assert\UnreachableException: Trying to fetch node data without loading!.

Something is adding bogus HTML to the bottom of the document before the OutputTransform stages get started:

<!DOCTYPE html>
<html prefix="dc: http://purl.org/dc/terms/ mw: http://mediawiki.org/rdf/"><head prefix="mwr: https://da.wiktionary.org/wiki/Special:Redirect/"><meta charset="utf-8"/><meta property="mw:pageId" content="0"/><meta property="mw:pageNamespace" content="0"/>
...
</section></li></section></section></body>
Thu, Jun 12, 3:51 PM · Essential-Work, Content-Transform-Team (Work In Progress), Parsoid, Wikimedia-production-error
cscott added a comment to T390629: Wikimedia\Assert\UnreachableException: Trying to fetch node data without loading!.

I can reproduce on parsoidtest1001:

$ sudo -u www-data php /srv/mediawiki/multiversion/MWScript.php view.php --wiki=dawiktionary mercredi > mercredi.wt
$ sudo -u www-data php /srv/mediawiki/multiversion/MWScript.php parse.php --wiki=dawiktionary -p < mercredi.wt 

Seems like I should add a --page option to core's parse.php to save a step here. Regardless, now I have something to look at...

Thu, Jun 12, 2:56 PM · Essential-Work, Content-Transform-Team (Work In Progress), Parsoid, Wikimedia-production-error
cscott added a comment to T391031: UnreachableException should be Normalized.

Unreachable exceptions shouldn't be reached. Not sure it makes sense to try to normalize this, as opposed to fixing the bug.

Thu, Jun 12, 2:54 PM · Parsoid
cscott added a comment to T390629: Wikimedia\Assert\UnreachableException: Trying to fetch node data without loading!.

Reproducible: https://da.wiktionary.org/wiki/mercredi?useparsoid=1 ; note that https://da.wiktionary.org/wiki/mercredi?useparsoid=0 is fine. Also, running via the API or integrated modes on parsoidtest1001 is fine; this is a bug in the OutputTransform pipeline after Parsoid is done parsing.

Thu, Jun 12, 2:48 PM · Essential-Work, Content-Transform-Team (Work In Progress), Parsoid, Wikimedia-production-error
cscott added a comment to T391487: Category sort key set incorrectly.

Indeed, it looks like we're using some information in data-parsoid to indicate whether an explicit sort key should be provided, and the id attribute is telling selser to use the data-parsoid from the previous element. We should be flagging this as modified (and thus not using the old data-parsoid) but apparently are not. Yep, probably a bug in Parsoid.

Thu, Jun 12, 2:27 PM · Parsoid, VisualEditor-MediaWiki, VisualEditor
cscott added a comment to T372387: Wrong section numbering if Parsoid is used and wikitext is invalid.

OK, "Bug fix for 'Rule reference variables should not be affected by failed rules'" and the patch it follows up, "Rule reference variables should not be affected by failed rules", will probably fix this once we move Parsoid to the new wikipeg.

Thu, Jun 12, 2:17 PM · Content-Transform-Team (Work In Progress), Parsoid-Read-Views (Phase 1 - DiscussionTools support), Parsoid

Mon, Jun 9

cscott added a comment to T393306: Chart output makes VE support difficult.

To briefly update: part of the issue here is how we sanitize content cut-and-pasted into VisualEditor. VE uses a somewhat restrictive sanitizer. Supporting custom components naively would require updating the sanitizer every time to include each custom tag name defined by an extension.

Mon, Jun 9, 4:18 PM · Editing-team (Tracking), VisualEditor, Content-Transform-Team, Patch-For-Review, Charts

Jun 5 2025

cscott added a comment to T363484: Update ParserMigration notice.

@mwilliams anything learned from the "bit of time next week to explore a couple other possible solutions", or is this design ready to implement?

Jun 5 2025, 2:37 PM · MW-1.45-notes (1.45.0-wmf.8; 2025-07-01), Patch-For-Review, Parsoid-Read-Views (Small Size Wikipedias), Content-Transform-Team (Work In Progress), OKR-Work, MediaWiki-extensions-ParserMigration
cscott closed T391702: Document v3 parser function DOM, a subtask of T388786: Follow up from Parsoid Fragment support, as Resolved.
Jun 5 2025, 2:31 PM · Essential-Work, Content-Transform-Team (Work In Progress), Parsoid
cscott closed T391702: Document v3 parser function DOM as Resolved.
Jun 5 2025, 2:31 PM · Parsoid
cscott added a project to T391844: Convert data-mw-variant to use rich attributes: Parsoid-Read-Views (Language Converter Support).
Jun 5 2025, 2:27 PM · Parsoid-Read-Views (Language Converter Support), Content-Transform-Team (Work In Progress), Essential-Work, Parsoid
cscott added a comment to T395979: Provide means to extract the parameters and values of a template invocation in wikitext.

The proposed summary fails to mention a couple of additional implementations, including in Parsoid. This strikes me as a https://xkcd.com/927/ issue.

Jun 5 2025, 2:15 PM · Content-Transform-Team (Work In Progress), Patch-For-Review, MediaWiki-Parser

Jun 4 2025

cscott updated the task description for T367616: Rename data-mw.attrs to extAttrs to avoid confusion with data-mw.attribs.
Jun 4 2025, 6:37 PM · MW-1.45-notes (1.45.0-wmf.6; 2025-06-17), Content-Transform-Team (Work In Progress), Essential-Work, Parsoid
cscott added a comment to T395968: Semantic template information in TemplateData.

https://en.wikipedia.org/wiki/User:Cscott/Opportunities_for_Content_Transform_Team mentions both Subbu's Typed Templates proposal as well as a number of other possible initial clients/use cases.

Jun 4 2025, 6:23 PM · Content-Transform-Team

Jun 3 2025

cscott created T395968: Semantic template information in TemplateData.
Jun 3 2025, 9:04 PM · Content-Transform-Team
cscott created T395965: PHP Deprecated: Use of MediaWiki\Parser\ParserOutput::getText was deprecated in MediaWiki 1.42. [Called from MediaWiki\Extension\ParserMigration\MigrationEditPage::doPreviewParse].
Jun 3 2025, 8:43 PM · MediaWiki-extensions-ParserMigration, Content-Transform-Team (Work In Progress), Wikimedia-production-error
cscott added a comment to T394836: Use refactored grammar for `{{....}}` constructs.

An effective reproduction of the backtracking is this snippet:

* a {{bar|1
* b
Jun 3 2025, 7:44 PM · Parsoid-Read-Views (Performance), Parsoid