Skip to content

Conversation

@mihaibudiu
Copy link
Contributor

The compiler generates some dataflow graph fragments by generating SQL strings and recursively invoking itself on these SQL program. These dataflow graph fragments are then stitched into the final dataflow graph.

For example, the error view is obtained by the following declaration:

CREATE TABLE FELDERA_ERROR_TABLE(table_or_view_name VARCHAR NOT NULL, message VARCHAR NOT NULL, metadata VARCHAR NOT NULL);
CREATE VIEW ERROR_VIEW AS SELECT * FROM FELDERA_ERROR_TABLE

The FELDERA_ERROR_TABLE is deleted later and the view is hooked to the rightful sources.

SQL User-defined functions are implemented in a similar way.

An unpleasant side-effect of this process is that the source code propagation mechanism of the compiler records source positions for these fictitious programs, which then surface while browsing the profiles (where they point to incorrect source code origins, since these fictitious programs are not recorded in the actual program sources).

This PR distinguished two kinds of source positions: internal and external. Only the external ones are propagated to the outside world.

Unfortunately this PR is much larger than I expected, because I realized that many Calcite classes incorrectly implement the APIs used by the Calcite visitors, and I am using a visitor to rewrite the position information. As a consequence I had to clone and fix all the Calcite classes in question. I may file an issue for Calcite as well, but no one seems to have noticed this problem so far.

The final two resource files in this PR show indeed that the source positions are no longer present for the ERROR_VIEW.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces the distinction between internal and external source positions in the SQL compiler to prevent internally-generated SQL from polluting profiler views with incorrect source code origins. The compiler now marks source positions as "internal" for SQL code it generates itself (such as error view definitions and UDF implementations), ensuring only user-written code positions are propagated externally.

Key Changes:

  • Introduced ExtendedSqlParserPos to track whether source positions are internal or external
  • Fixed multiple Calcite SQL node classes to properly implement visitor APIs by overriding createCall() and getOperandList() methods
  • Updated SourcePositionRange to filter out internal positions when requested

Reviewed changes

Copilot reviewed 26 out of 26 changed files in this pull request and generated no comments.

Show a summary per file
File Description
metadataTests-generateDFRecursive.json Test output showing empty positions array for internal ERROR_VIEW
metadataTests-generateDF.json Test output showing empty positions array for internal ERROR_VIEW
CreateTypeStatement.java Import path change to use custom SqlCreateType
SqlViewColumnDeclaration.java Fixed getOperandList() to include all operands and added createCall() override
SqlRemove.java Fixed getOperandList() to include all three operands
SqlPrimaryKey.java New class replacing Calcite's SqlKeyConstraint with proper visitor support
SqlLateness.java Removed empty TODO validation method
SqlForeignKey.java Added createCall() override for proper visitor support
SqlExtendedColumnDeclaration.java Fixed getOperandList() to include all operands, added createCall() override and constructor
SqlDropView.java New class with proper createCall() override
SqlDropTable.java New class with proper createCall() override
SqlDropObject.java New base class for drop operations
SqlDeclareView.java Added createCall() override for proper visitor support
SqlCreateView.java Added createCall() override, clone() method, and fixed getOperandList()
SqlCreateType.java New class replacing Calcite's version with proper visitor support
SqlCreateTable.java Added createCall() override, clone() method, and fixed getOperandList()
SqlCreateIndex.java Added createCall() override for proper visitor support
SqlCreateFunctionDeclaration.java Added createCall() override and fixed getOperandList()
SqlCreateAggregate.java Added createCall() override and fixed getOperandList()
SqlAttributeDefinition.java New class replacing Calcite's version with proper visitor support
SqlToRelCompiler.java Added ReplacePositions visitor to mark internal positions and updated parsing flow
ExtendedSqlParserPos.java New class extending SqlParserPos to track internal/external distinction
CalciteToDBSPCompiler.java Import path changes to use custom SQL classes
SourcePositionRange.java Added logic to skip internal positions when requested
ddl.ftl Updated parser template to use custom classes instead of Calcite's SqlDdlNodes
config.fmpp Updated imports to include custom SQL classes

@mihaibudiu mihaibudiu added this pull request to the merge queue Dec 20, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Dec 20, 2025
@mihaibudiu
Copy link
Contributor Author

This needs apache/calcite#4696 to be merged first

@mihaibudiu mihaibudiu enabled auto-merge December 24, 2025 22:29
@mihaibudiu mihaibudiu added this pull request to the merge queue Dec 24, 2025
@mihaibudiu mihaibudiu removed this pull request from the merge queue due to a manual request Dec 24, 2025
@mihaibudiu mihaibudiu enabled auto-merge December 24, 2025 23:01
@mihaibudiu mihaibudiu added this pull request to the merge queue Dec 24, 2025
Merged via the queue into main with commit cb1eb8f Dec 25, 2025
1 check passed
@mihaibudiu mihaibudiu deleted the noposition branch December 25, 2025 00:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants