feat/refactor: 'expr.transform()'; fixes & refactors to 'replace()' and rule-application by samueltlg · Pull Request #301 · cortex-js/compute-engine

samueltlg · 2026-04-11T01:20:18Z

(Code commentary incoming)

Sorry that this was delayed again... Upon thinking it was ready to go, encountered a new set of problems/niggles, requiring a fair amount of investigation.
The change here (fixes) are the sum of quite a lot of tedious debugging!

Most of these are fixes - and an update to 'replace()' - but you may wish to discard the head/feature change here (transform()). Personally - think it is very useful (not something have added to the personal/bug-fix fork up to this point, but is something will certainly use...)

(For reference, the failing 'compilation-performance' tests, were notably also failing before any of the changes made here))

Update:

Key commits are:
- fix (workaround): ensure intended canonical - non-canonical - match pattern ... (highlights a potential unresolved wildcard/canonicalization bug, as outlined in the commit message)
- feat: ReplaceOptions.direction - May not be to your taste (is used for 'expr.transform()'). An alternative could be possible.
- Refactor (workaround): during simplification, do not simplify all numeric operands of 'lazy' operator definitions
  - this is a change bearing relevance to a domain (simplification architecture) still requiring resolution (primarily concerning the 'modularity' of simplify-rules). Lengthy outline in the commit message...
- refactor (+fix): update ReplaceOptions.canonical to 'form'; fix: when considering replacement expr. sameness, also consider expression 'form' - various comments (pending) throughout here. Also, I thought that jointly upgrading this option, along with the fundamental policy change (applicability to replacements only) both offers more flexibility, and is useful for the purposes of transform().
- Feat: 'Expression.transform()'.
  - The primary contribution. (Note that in addition to adding the 'transform()' function/method, this involves shifting around some key types (SimplifyOptions), which would benefit from your review.

Queries & points of contention (after scanning through changes):

'transform()'
- The naming of this method is at your discretion. I think 'transform' is on-point enough (designation here borrowed from other libraries).
- TransformOptions option 'targets'
  - Having this as an alternative to 'match' may not be to your taste; but presently this offers a solution and matching route not offered by traditional pattern-matching (- referential-identity; predicate-based). An alternative altogether, or a merge with 'match', may be preferred.
  - !There still exists the problem of this not being able to match against 'cached' (common) engine-exprs (One,Zero,Infinity...): although, for the majority of use-cases, it is hard to imagine the requirement for applying transformations to these expressions anyway (perhaps some cases of 'replace').

Identified further bugs, for reference:

In a 'replace()' context...:
- During expression matching, the non-canonical expr. is used... Sometimes, however, consequently there can be an unfortunate non-match where a match would otherwise take place. Example:
  - ce.expr(['Exp', 3]).match(['Exp', '_')]) will not match, due to the non-canonical pattern matching against the canonical variant of ['Exp', 3], which notably canonicalizes to a 'Power' FN expr. (['Power', 'ExponentialE', 3]).
  - BoxedString has a missing replace() method (such that replacing always to return 'null').
As highlighted in fix (workaround): ensure intended canonical - non-canonical - match pattern...:
- Requesting the canonical variant of wildcard-containing expressions causes entries (BoxedValueDefinition) to be added to global scope for the captured wildcard symbol (e.g. _ or _c).
  - If this point were addressed, this workaround should be able to be retracted.
A couple of extant issues present with the 'structural' Expression form:
- Requesting the structural form of canonical vs. non-canonical exprs. presently can produce differing output.
  - For the case of a Rational number (representation), for inputs as produced by either ce.parse('2/5', {form: ...}) or ce.expr(['Divide', 2, 5], {form: ...}):
    - In case of boxing/parsing with form 'structural' or 'raw' (and then 'expr.structural'), the representation is 'Divide' (same in both cases)
    - In case of first obtaining the canonical variant, and then requesting the structural, the result is instead a BoxedNumber (type = rational, with an ExactNumericValue value).
  - For the case of an expression with an associative/commutative function head:
    - If obtaining a structural form through requesting this from a canonical expression, then the result will be both sorted (the operands) and flattened.
    - If in contrast boxing/parsing directly as 'structural' - or requesting the 'structural' formfrom of a 'raw' expr. - then expression/operands are neither sorted, or expression flattened.
  - ^Personally here have identified the source of this to be the fact that the 'canonical' variant has access to the OperatorDefinition: whilst in other cases not.

Some possible future directions:

expr.subs() - and possibly one or two others (methods) - could also admit a 'form' specification over 'canonical'.
If sticking with keeping expr.transform():
- As per above, TransformOptions 'targets' could be merged with 'match' - or some category of alternative - according to preference.
- A verbose switch for this method: with secondary/meta data returned such as stats on replacement quantity, replacement-expression instances (or a map of replaced->replacements), transformation-specific data (e.g. 'steps' for simplify). Perhaps overall expression validity
- An async variant commensurate to 'evaluateAsync()', and others.

…ecursively/to operands

…rn canonical-variant wildcard loss

…attern variant comparison (rule application) (this change is marked as a *workaround* rather than a 'fix' per se, since the issue it tackles instead likely has its root in the current behaviour regarding the binding of 'wildcard' symbol variants (-outside the context of pattern matching / replacement. (See the closing of this message for a further remark on this point.)) Explanation: - Currently, because canonicalization of expresssions involves wildcard binding as part of symbol binding (this may be undesirable / a bug), it is presently necessary that a new scope be created upon 'match' pattern canonicalization (performed for optimization purposes) in rule application: with this otherwise running the risk of present wildcard symbols being bound to definitions in the current scope... For example, before this change: ``` ce.expr(['Add', 'x', 3]).replace({ match: ['_', '__'], replace: 'y'}); // →Expectedly captures wildcards, and replaces // top-level expression. // *However*, the canonical-request of the 'match' expression within 'applyRule' // results in symbol/wildcard '_' being attributed a function-definition // (type='function'): within the scope in which this 'expr.replace()' was called // Consequently, a subsequent replace call involving a universal wildcard ce.expr(['Add', 'x', 3]).replace({ match: ['Add', 'x', '_')], replace: 5 }); // → Returns 'null'... (assuming this call made with an unmodified/same scope as the previous) // Ultimately because the same point-of-canonicalization results in the Universal // Wildcard in this instance uptaking the previous, 'function' definition, resulting // in a MathJson-internal type-error for the canonicalized match-pattern variant // in this instance, resulting in an absent wildcard in the canonical variant, // and leading to an early exit from rule-application 'applyRule()' //For illustration, the canonical-variant of the initial, non-canonically boxed // `['Add', 'x', 3]` pattern results in the canonical-variant as: [ "Add", "x", [ "Error", [ "ErrorCode", "'incompatible-type'", "number", "function", ], ], ] ``` *Note*: - This 'fix' may no longer be necessary, if canonicalization of wildcard-containing expressions were to no longer perform name-binding on these (this may be unintentional?)

…onical' value during 'applyRule()'

…ition' Before: passing around a rule with '{ match: .., condition: 'non-delimited LatexString }' would fail to yield/parse a condition. Now: cases such as the previous result in a parsed condition (yet, presence of display/inline mode delimiters *should* still be admissible)

…so that it may evaluate)

…ctural'

…fyNonCommutativeFunction (This inadvertently resulted in *full*-rule simplification of all operands for all functions Correcting this has resulted in *no broken tests*)

…umeric operands of 'lazy' operator definitions Explanation ------------ 'evaluateNumericSubexpressions()' (function, which is notably called within the 'simplify' process only for operator-definitions with flag 'lazy' set to 'true', and operator 'Divide', presently fully evaluate all (more-or-less) numeric operands during simplification. At least a couple of problems exist with this, mentionably that 'simplify' is, use-case depending, likely not called with intention to *evaluate all numeric* operands/expressions. However, related to this, and really a more primary issue here, is that proceeding with this unconditionally is troublesome in terms of simplifying in the context of any given rules: particularly when the ruleset is a custom one, or an incomplete subset of the default ruleset. For example, consider the simplification of an 'Add' expression with evaluable 'Ln' or 'Log' arguments... If the accompanying simplify rule-set does not with it include rules covering the simplification (perhaps, 'evaluation') of this set or category of functions, then it is dubious that these should be evaluated prior to be being summed... (instead - given a ruleset consisting of a single rule which handles 'Add', an input such as 'Ln(e) + Ln(e)' should understandably reduce this to '2 * Ln(e)' (instead of simply '2'); the case therefore of a fuller simplification being dependent on presence of separate rules handling Ln/Log. (Another illustrative example, is that at time of commit, input '3!' will remain unsimplified (unevaluated)... illustrating the need for a requiring/wrapping operator (with a 'lazy' flag) for its evaluation) \*Stemming from this consideration, it may be worth considering a single, or set of 'evaluation'-rules: decoupled from respective simplification rules (plain/eager evaluation of trig., vs. identities/construction, for example)

… considering replacement expr. sameness, also consider expression 'form' - A switch to 'form' allows specification of 'structural', and also an explicit specification of 'raw'. - As per inline documentation, the same/previous behaviour can be replicated, by first ensuring the entire input bears the same form as any speified replacement form. - When considering 'successful' rule application, any differences in expression form before-> after is now considered, in spite of these anyway bearing structural similiarity. - The same change would likely benefit from being made on `expr.subs()`.

In disuse: but set for employment particularly in test-files

…explictly 'raw'

'transform()' is a wrapper around 'replace()' - always acting recursively - and offers an ergonomic means to match sub-expression targets and apply common fundamental operations/transformations in an expression 'tree', without the need for custom 'Rule' logic / workarounds. Change includes tests. - The typing of 'TransformOptions' has necessitated that 'SimplifyOptions' be shifted from its previous home of 'types-definitions(.ts)', to 'types-kernel-evaluation.ts'. It has also been made generic over Expr/SemiExpr/CE, in a similar way as for its sibling types (such as 'RuleReplaceFunction').

samueltlg · 2026-04-11T23:43:25Z


  if (canonical && match) {
    const awc = getWildcards(match);
+    pushSafeScope(ce);


(See commit message - this change indeed may not be necessary, if the described present behaviour of the wildcard-symbol canonicalization (global-scope) is not as it should be).

samueltlg · 2026-04-12T00:22:48Z

+   * **Default** is: `'left-right'` (standard post-order)
+   *
+   */
+  direction: 'left-right' | 'right-left';


At your discretion! (this is anyway achievable in a workaround-way with custom rule-functions and what-not).

(I did also intend to update the 'once' rule such that it would be typed as something like 'one-rule' | 'one-replacement' | false: where 'one-rule' would assume the old behaviour of 'true', but 'one-replacement' would command an even stricter policy (currently 'true' applies at the rule level: i.e. a rule may apply multiple times; 'one-replacement' would further restrict this to one replacement/within one rule). This would also somewhat add a 'raison d'être' for this property ('direction'), too.
However, again, both of these can, with a bit of trouble, be achieved through RuleFunctions.)

samueltlg · 2026-04-12T00:27:40Z

  // Normalize the condition to a function
  let condFn: undefined | RuleConditionFunction;
  if (typeof condition === 'string') {
-    const latex = asLatexString(condition);


Taking back on this did not appear to break anything. Concludes that it was one of either an outdated or mistaken check... the 'asLatexString' call appears to presently require that any string [input] be delimited by an '$' or '$$'

samueltlg · 2026-04-12T00:30:54Z

  // canonical one. This allows subsequent .canonical calls to perform full
  // canonicalization.
-  if (isFunction(expr) && expr.isCanonical) {
+  if (isFunction(expr) && forms.length > 0) {


Was not a test to this... but changing this appeared to correct some behaviours whilst outlining the tests for transform. Throughout the partial-canonicalization process, appears that sometimes the expression is 'intermediarily' not marked as canonical.

samueltlg · 2026-04-12T00:32:31Z

@@ -407,7 +407,7 @@ function simplifyNonCommutativeFunction(
  let last = result.at(-1)!.value;
  if (last.isSame(expr)) return steps;



Small oversight... but was inadvertently applying the entire default rule-set to operands! (indeed; no test failures result)

samueltlg · 2026-04-12T01:00:28Z

 type EvaluateOptions = KernelEvaluateOptions;
-type Rule = KernelRule<Expression, ExpressionInput, ComputeEngine>;
-type BoxedRule = KernelBoxedRule<Expression, ComputeEngine>;
-type BoxedRuleSet = KernelBoxedRuleSet<Expression, ComputeEngine>;


(These re-constructions only previously used by SimplifyOptions (now re-located))

samueltlg · 2026-04-12T01:01:42Z


+  /**
+   *
+   * Process and transform this expression *recursively* by applying one of a set of predefined


(This doc.) an outline of justification for this method

samueltlg · 2026-04-12T01:02:27Z

+ * @category Boxed Expression
+ */
+
+export type SimplifyOptions<


Possibly not the perfect place for this (i.e. 'evaluate') - but seemed most sensible given the options

samueltlg · 2026-04-12T01:02:47Z

+/** Available transformation types for `Expression.transform()`
+ *
+ * @category Boxed Expression */
+export type Transformation =


(Others could be possible)

samueltlg · 2026-04-12T01:03:40Z

+   * ::Note
+   * The 'extended' matching routes available here are unique to *transform()* and facilitate
+   * convenient and more expressive matching in the context of recursive traversal.*/
+  targets?: Expr | Expr[] | MatchConditionFunction<Expr>;


Possible point of contention: distinct matching route from 'match'

arnog · 2026-05-10T01:18:35Z

Thanks for putting in the substantial work here, @samueltlg — the bug analyses in the commit messages were really helpful in understanding what each change is solving. There's a lot of good material in this PR.

To make it easier to land incrementally, I've cherry-picked four of the lower-risk fixes from this PR into #303 with regression tests. Authorship is preserved on each cherry-picked commit:

applyRule no longer returns null after operand-level rewrites
Rule-application optimization (early exit on canonical-variant wildcard loss)
Object-rule condition strings work without $...$ delimiters
Object-rule condition strings are canonicalized so they can evaluate

Once #303 lands, a rebase off main here will pick those up cleanly.

For the remaining changes, a few things I'd like to address before the rest is ready to merge:

Cleanup

test.only / describe.only markers in transform.test.ts (about 13 of them) need to be removed so the tests run in CI.
Lint is currently failing with real issues, not just style. The Multiple exports of name 'LatexString' (and similar for LatexDictionaryEntry, ParseLatexOptions, Parser, SerializeLatexOptions) are coming from a recent refactor on main that split global-types.ts into multiple types-*.ts files — your branch needs to pick a single source for those re-exports. There's also a no-this-alias violation around line 720, and prettier wants a sweep.

Splitting

The PR is ambitious enough that I'd like to review a few pieces independentlybefore they all land together. Could we split it into roughly:

expr.transform() feature — the new method, its types, and the transform.test.ts file. This is self-contained and the most straightforward piece to review on its own.
ReplaceOptions.canonical → form — this is a public API breaking change without a deprecation path. I'd like to discuss the migration story before merging: at minimum we'd want to keep canonical as a deprecated alias for one release, and the type change should be called out in CHANGELOG. Worth a small dedicated PR so this gets visibility.
The simplify.ts BASIC_ARITHMETIC operand guard — this is the change you flagged as a workaround. It causes three previously-passing tests to be converted to test.todo (10!/(3!*7!), two exp(log(x)±y) cases). Could you open a tracking issue describing the underlying simplification problem? I'd rather understand the root cause and either fix it properly or accept the workaround with eyes open, rather than have it slip in alongside other changes. Happy to help debug if useful.
The canonical.ts forms.length > 0 change (commit 12cabb53) — this one I'd like to look at separately because it changes how partially-canonicalized expressions are wrapped, and I want to confirm the snapshot drift it causes (e.g. Power(Multiply(j, 4), -1) → Divide(1, Multiply(j, 4))) is intentional and consistent with the rest of the codebase.

The remaining smaller fixes (pushSafeScope consolidation, direction option, form-cast policy refinement) can probably go with whichever PR they're most adjacent to.

I realize this is more bureaucracy than ideal. The reason for the splitting ask is that this PR touches enough load-bearing code (rule application, simplification, canonical-form policy, the public replace() API) that I want each piece to land with focused review. It also makes bisecting much easier if anything regresses.

Thanks again — looking forward to landing this incrementally.

* fix: do not return 'null' in 'applyRule' if replacement has applied recursively/to operands * refactor: optimize rule application by exiting earlier on match-pattern canonical-variant wildcard loss * fix: parse object-Rule condition strings without '$' delimiters and canonicalize - Drop the '$...$' wrapping requirement so bare LaTeX condition strings parse correctly via ce.parse(). - Always canonicalize the condition so it can evaluate against substitutions. Co-Authored-By: samueltlg <[email protected]> * refactor: singular documentative type-cast * test: add regression tests for cherry-picked rule-application fixes Three tests covering the fixes from PR #301 commits 2bd9953, 591f870: - applyRule preserves operand rewrites when function-replace returns undefined at top (tests the !result early-return path). - Object-rule condition string without '$' delimiters is honored (tests parsing of bare LaTeX condition strings). - Object-rule condition string is canonicalized so it can evaluate (tests that non-trivial conditions like 'a^2 > 0' work). Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> --------- Co-authored-by: samueltlg <[email protected]> Co-authored-by: Claude Opus 4.7 (1M context) <[email protected]>

samueltlg · 2026-05-10T13:00:48Z

Thanks Arno - those requirements sound reasonable and sensible... Indeed, the PR had initial intention of introducing expr.transform() perhaps only with one or two accompanying changes or refactors; However, the attempt to do this resulted in encountering a series of additional issues and requirements - mostly centred in 'replace()' - which simultaneously demanded attention. (The bureaucracy and preference for splitting are reasonable). Later on tonight (UTC+0) I'll make some headway in making those splits as requested (adhering to 'cleanup' points, and will contribute to CHANGELOG too...).

(Notably, the 'transform()' set of changes will have to be made finally... with its implementation requiring several of these included fixes/refactors).

samueltlg · 2026-05-10T16:22:21Z

And glad that you are content with keeping expr.transform() - I think it's pretty cool 😎 (although, not sure if there is anything you wanted to do between the distinct 'targets' / 'matching' routes - if so just let me know & will make sure to implement).
If you'd find it easier to do the splitting yourself then also do inform; otherwise, I'll have a go at doing so today and tomorrow 👍🏻

samueltlg added 15 commits March 31, 2026 23:21

fix: do not return 'null' in 'applyRule' if replacement has applied r…

adf26f8

…ecursively/to operands

refactor: optimize rule application by exiting earlier on match-patte…

1bf80b5

…rn canonical-variant wildcard loss

fix: do not jointly consider form 'structural' in calculation of 'can…

d36d165

…onical' value during 'applyRule()'

feat: ReplaceOptions.direction

6e116c1

fix: when parsing an object Rule 'condition', ensure canonical-form (…

ccd32d6

…so that it may evaluate)

fix: consistently return partially-canonicalized expressions as 'stru…

12cabb5

…ctural'

fix: dispatch options (including rules) to simplifyOperands in simpli…

27e8e81

…fyNonCommutativeFunction (This inadvertently resulted in *full*-rule simplification of all operands for all functions Correcting this has resulted in *no broken tests*)

refactor: singular documentative type-cast

7c761b9

feat: 'assertIsFunction' / 'getOp' utilities

bec5201

In disuse: but set for employment particularly in test-files

refactor: refine form-cast policy for 'expr.replace()' where form is …

e7a30e9

…explictly 'raw'

samueltlg commented Apr 12, 2026

View reviewed changes

arnog mentioned this pull request May 10, 2026

Cherry-pick rule-application fixes from #301 #303

Merged

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat/refactor: 'expr.transform()'; fixes & refactors to 'replace()' and rule-application#301

feat/refactor: 'expr.transform()'; fixes & refactors to 'replace()' and rule-application#301
samueltlg wants to merge 15 commits into
cortex-js:mainfrom
samueltlg:expr-transform

samueltlg commented Apr 11, 2026 •

edited

Loading

Uh oh!

samueltlg Apr 11, 2026

Uh oh!

samueltlg Apr 12, 2026

Uh oh!

samueltlg Apr 12, 2026

Uh oh!

samueltlg Apr 12, 2026

Uh oh!

samueltlg Apr 12, 2026

Uh oh!

samueltlg Apr 12, 2026

Uh oh!

samueltlg Apr 12, 2026

Uh oh!

samueltlg Apr 12, 2026

Uh oh!

samueltlg Apr 12, 2026

Uh oh!

samueltlg Apr 12, 2026

Uh oh!

arnog commented May 10, 2026

Uh oh!

samueltlg commented May 10, 2026 •

edited

Loading

Uh oh!

samueltlg commented May 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

2 participants

		@@ -407,7 +407,7 @@ function simplifyNonCommutativeFunction(
		let last = result.at(-1)!.value;
		if (last.isSame(expr)) return steps;

Conversation

samueltlg commented Apr 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Queries & points of contention (after scanning through changes):

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

arnog commented May 10, 2026

Uh oh!

samueltlg commented May 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

samueltlg commented May 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

2 participants

samueltlg commented Apr 11, 2026 •

edited

Loading

samueltlg commented May 10, 2026 •

edited

Loading