Filter and project protobuf messages directly on the wire encoding, in a single forward pass, with zero deserialization.
WQL compiles a small query language into bytecode for the Wire Virtual Machine (WVM). The WVM scans protobuf wire bytes and can:
- Filter: evaluate a predicate (like grep for protobuf streams)
- Project: select/reshape fields (like cut/jq for protobuf)
- Both at once: filter + project in a single pass
All operations run in O(n) time over the input bytes with no heap allocation on the hot path.
cargo install --path crates/wqlc
# Filter: keep messages where age > 18
wqlc eval -q 'age > 18' -s schema.bin -m pkg.Person < messages.bin
# Project: extract only name and age, output as JSON
wqlc eval -q '{ name, age }' -s schema.bin -m pkg.Person --json < message.bin
# Combined: filter + project in one pass, stream mode (JSONL)
wqlc eval -q 'WHERE age > 18 SELECT { name, address { city } }' \
-s schema.bin -m pkg.Person --json --delimited < stream.bin
# Schema-free mode: reference fields by number
wqlc eval -q '{ #1, #3 }' < message.bin
# Compile to bytecode for embedding in other runtimes
wqlc compile -q '{ name, age }' -s schema.bin -m pkg.Person -o program.wqlbc
wqlc <command> [options]
Commands:
compile Compile a WQL query to bytecode
eval Compile and execute a WQL query on stdin
inspect Disassemble a compiled WQL program
| Flag | Description |
|---|---|
| -q <query> | WQL query string (required) |
| -s <schema.bin> | FileDescriptorSet for schema-bound mode |
| -m <message> | Root message type (required with -s) |
| -o <output> | Output file (compile only; default: stdout) |
| --delimited | Varint length-delimited stream mode (eval only) |
| --json | Output as JSON (eval only; requires -s and -m) |
Single message (default): reads one protobuf message from stdin. Filter exit code: 0 = pass, 1 = filtered out. Projections write the result to stdout.
Delimited stream (--delimited): reads/writes varint length-prefixed records. Filters pass through matching records (grep semantics). Projections emit one output record per input record.
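The framing behind --delimited can be sketched as follows, assuming each record is a varint byte length followed by exactly that many bytes. The helper names `read_varint`, `records`, and `write_record` are hypothetical and are not part of wqlc; this is an illustration of the framing, not the tool's implementation.

```rust
/// Decode a base-128 varint; returns (value, bytes consumed).
fn read_varint(buf: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    for (i, &byte) in buf.iter().enumerate().take(10) {
        value |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None // truncated, or longer than the 10-byte maximum
}

/// Split a varint length-prefixed stream into borrowed record slices.
/// Returns None if a length prefix is malformed or a record is truncated.
fn records(mut stream: &[u8]) -> Option<Vec<&[u8]>> {
    let mut out = Vec::new();
    while !stream.is_empty() {
        let (len, n) = read_varint(stream)?;
        let end = n.checked_add(usize::try_from(len).ok()?)?;
        out.push(stream.get(n..end)?);
        stream = &stream[end..];
    }
    Some(out)
}

/// Append one record to `out`, preceded by its varint length prefix.
fn write_record(out: &mut Vec<u8>, record: &[u8]) {
    let mut len = record.len() as u64;
    while len >= 0x80 {
        out.push(((len & 0x7f) as u8) | 0x80);
        len >>= 7;
    }
    out.push(len as u8);
    out.extend_from_slice(record);
}
```

With helpers like these, a grep-style filter over a stream is a loop that calls `records`, tests each slice, and passes the matching ones to `write_record` unchanged.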
A WQL query has one of three forms:
[WHERE] <predicate>
[SELECT] <projection>
WHERE <predicate> SELECT <projection>
The WHERE and SELECT keywords are optional when only a predicate or only a projection is given:
| Long form | Shorthand |
|---|---|
| SELECT { name, age } | { name, age } |
| WHERE age > 18 | age > 18 |
| WHERE age > 18 SELECT { name, age } | (no shorthand) |
With a schema (-s schema.bin -m pkg.Person), fields are referenced by name:
name
address.city
Without a schema, fields are referenced by protobuf field number:
#1
#3.#1
Named and numbered references can be mixed: address.#2, #3.city.
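One way to picture a mixed path is as a dot-separated list of segments, each either a name or a `#`-prefixed field number. The sketch below splits a path along those lines. `Segment` and `parse_path` are hypothetical names, and the rules here (no empty segments, field numbers greater than zero) are assumptions for illustration rather than the wql-compiler grammar.

```rust
/// One step of a field path: a schema name or a raw protobuf field number.
#[derive(Debug, PartialEq)]
enum Segment {
    Name(String),
    Number(u32),
}

/// Split a path such as "address.#2" into segments.
/// Returns None for empty segments or malformed field numbers.
fn parse_path(path: &str) -> Option<Vec<Segment>> {
    path.split('.')
        .map(|seg| match seg.strip_prefix('#') {
            // "#N": must parse as a positive integer.
            Some(digits) => digits
                .parse::<u32>()
                .ok()
                .filter(|&n| n > 0)
                .map(Segment::Number),
            // Anything else is treated as a field name.
            None if !seg.is_empty() => Some(Segment::Name(seg.to_string())),
            None => None,
        })
        .collect()
}
```

Named segments would need a schema to resolve to field numbers, which fits the rule that schema-free queries use `#` references only.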
A projection selects which fields to keep in the output, wrapped in braces.
Strict mode — only listed fields survive:
{ name, age }
{ name, address { city, zip } }
Copy mode (..) — keep everything, then refine:
{ name, address { city }, .. } -- keep all, but reshape address
{ .. -payload, -thumbnail } -- keep all except these fields
Deep exclusion (..-field) — remove a field at every nesting depth (requires schema):
{ ..-secret, .. } -- remove "secret" from all levels
{ name, ..-internal, .. } -- keep name, remove "internal" everywhere
A predicate is a boolean expression over field values.
Comparison — integer and string fields:
age > 18
name == "Alice"
address.city != "NYC"
Logical operators — combine with && / || / ! (or AND / OR / NOT):
age >= 21 && address.city == "NYC"
status == 1 || status == 2
!active
Set membership — test against a list of values:
status IN [1, 2, 3]
region IN ["US", "EU", "APAC"]
String matching — case-sensitive:
name STARTS_WITH "A"
email ENDS_WITH "@example.com"
description CONTAINS "urgent"
id MATCHES "^[A-Z]{3}[0-9]+$"
Field presence — distinguish missing from zero-value:
EXISTS(email)
HAS(address.city)
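On the wire, an absent field has no tag in the bytes, while a field explicitly written with value zero still appears as a tag followed by 0. The sketch below shows how a scanner can tell the two apart. It is an illustration only: `find_varint` is a hypothetical helper, it handles just varint and length-delimited wire types, and the last-occurrence-wins rule follows standard protobuf scalar semantics rather than anything confirmed about the WVM.

```rust
/// Decode a base-128 varint; returns (value, bytes consumed).
fn read_varint(buf: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    for (i, &byte) in buf.iter().enumerate().take(10) {
        value |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None // truncated, or longer than the 10-byte maximum
}

/// Look up a top-level varint field by number.
/// Ok(None) = field absent, Ok(Some(v)) = present (v may be 0), Err = malformed.
fn find_varint(mut buf: &[u8], wanted: u64) -> Result<Option<u64>, &'static str> {
    let mut found = None;
    while !buf.is_empty() {
        let (tag, n) = read_varint(buf).ok_or("bad tag")?;
        buf = &buf[n..];
        match tag & 7 {
            0 => {
                let (value, n) = read_varint(buf).ok_or("bad varint")?;
                buf = &buf[n..];
                if tag >> 3 == wanted {
                    found = Some(value); // last occurrence wins
                }
            }
            2 => {
                // Length-delimited: skip the prefix and the payload.
                let (len, n) = read_varint(buf).ok_or("bad length")?;
                let end = usize::try_from(len)
                    .ok()
                    .and_then(|l| n.checked_add(l))
                    .ok_or("bad length")?;
                buf = buf.get(end..).ok_or("truncated payload")?;
            }
            _ => return Err("wire type not handled in this sketch"),
        }
    }
    Ok(found)
}
```

Under this model a presence test is `find_varint(..)?.is_some()`, whereas a comparison such as `age == 0` has to decide separately how to treat `None`.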
Parentheses for grouping:
(status IN [1, 2] || premium == true) && age >= 18
When both clauses are present, the predicate is evaluated first. If it fails, no output is produced. If it passes, the projection is applied:
WHERE age > 18 AND EXISTS(email)
SELECT { name, email, address { city }, .. -internal }
| Type | Examples |
|---|---|
| Integer | 0, 42, -10 |
| String | "hello", "line\nbreak", "\x41" |
| Boolean | true, false |
use wql_runtime::LoadedProgram;
let program = LoadedProgram::from_bytes(&bytecode)?;
let mut output = vec![0u8; input.len() * 2];
let result = program.eval(&input, &mut output)?;
// result.matched — true if the record passed the predicate (always true when no predicate)
// result.output_len — bytes written to output (0 when no projection)
if result.matched && result.output_len > 0 {
let projected = &output[..result.output_len];
}
The program header determines what happens, so callers don't need to know whether the query is a filter, projection, or both. Pass &mut [] when you only care about filtering.
use wql_compiler::{compile, CompileOptions};
let opts = CompileOptions {
schema: Some(&schema_bytes),
root_message: Some("pkg.Person"),
..Default::default()
};
let bytecode = compile("{ name, age }", &opts)?;
WQL is split into independent crates with a strict dependency graph:
wql-ir (no_std + alloc) — shared IR types + bytecode codec
├──▶ wql-runtime (no_std) — interpreter (LoadedProgram::eval)
├──▶ wql-compiler (std) — parser → binder → emitter → linker
├──▶ wql-capi (std, cdylib)— C FFI layer
└──▶ wqlc (std, bin) — CLI tool
The compiler and runtime are fully independent: the compiler produces bytecode bytes, the runtime consumes them. Neither depends on the other.
The WVM has 19 instructions organized around a single looping construct (DISPATCH) that iterates over protobuf (tag, value) pairs:
| Category | Instructions |
|---|---|
| Control | DISPATCH, LABEL, RETURN |
| Scope | FRAME (enter sub-message via arm action) |
| Predicate: int | CMP_EQ, CMP_NEQ, CMP_LT, CMP_LTE, CMP_GT, CMP_GTE |
| Predicate: bytes | CMP_LEN_EQ, BYTES_STARTS, BYTES_ENDS, BYTES_CONTAINS, BYTES_MATCHES |
| Predicate: set | IN_SET, IS_SET |
| Logic | AND, OR, NOT |
Field actions (COPY, SKIP, DECODE, FRAME) exist as arm actions within DISPATCH, not as standalone instructions.
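As a rough illustration of the DISPATCH idea, the loop below walks top-level (tag, value) pairs once and applies a per-field arm action, writing into a caller-provided buffer. This is not the WVM: `Action`, `field_len`, `dispatch`, and the `default` parameter are hypothetical names, only COPY-like and SKIP-like arms are modeled, and FRAME, DECODE, and predicates are left out.

```rust
/// Decode a base-128 varint; returns (value, bytes consumed).
fn read_varint(buf: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    for (i, &byte) in buf.iter().enumerate().take(10) {
        value |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None // truncated, or longer than the 10-byte maximum
}

#[derive(Clone, Copy)]
enum Action {
    Copy,
    Skip,
}

/// Field number and total encoded length (tag + payload) of the field
/// at the front of `buf`. Returns None on malformed or truncated input.
fn field_len(buf: &[u8]) -> Option<(u64, usize)> {
    let (tag, tag_len) = read_varint(buf)?;
    let rest = &buf[tag_len..];
    let payload_len = match tag & 7 {
        0 => read_varint(rest)?.1, // varint
        1 => 8,                    // 64-bit fixed
        2 => {
            // length-delimited: prefix bytes plus payload bytes
            let (len, n) = read_varint(rest)?;
            n.checked_add(usize::try_from(len).ok()?)?
        }
        5 => 4,                    // 32-bit fixed
        _ => return None,          // groups are not handled in this sketch
    };
    let total = tag_len.checked_add(payload_len)?;
    if total > buf.len() {
        return None;
    }
    Some((tag >> 3, total))
}

/// Single forward pass: each field's raw bytes are copied verbatim or skipped.
/// `arms` maps field numbers to actions; unlisted fields get `default`.
/// Returns bytes written, or None on malformed input or a too-small `out`.
fn dispatch(input: &[u8], arms: &[(u64, Action)], default: Action, out: &mut [u8]) -> Option<usize> {
    let (mut pos, mut written) = (0usize, 0usize);
    while pos < input.len() {
        let (number, len) = field_len(&input[pos..])?;
        let action = arms
            .iter()
            .find(|(n, _)| *n == number)
            .map_or(default, |(_, a)| *a);
        if let Action::Copy = action {
            out.get_mut(written..written + len)?
                .copy_from_slice(&input[pos..pos + len]);
            written += len;
        }
        pos += len;
    }
    Some(written)
}
```

In this model a strict projection such as `{ #1, #3 }` corresponds to Copy arms for fields 1 and 3 with a Skip default, and a copy-mode projection corresponds to a Copy default with Skip arms for the excluded fields. Because fields are only copied or dropped, the output never exceeds the input length.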
See doc/IR.md for the full specification and doc/ARCHITECTURE.md for crate design details.
The wql-capi crate produces libwql with a stable C ABI. The workflow is: compile query to bytecode, load bytecode into a program handle, evaluate against input bytes.
// Compile (schema-free)
wql_bytes_t wql_compile(const char* query, char** errmsg);
// Compile (with schema)
wql_bytes_t wql_compile_with_schema(
const char* query,
const uint8_t* schema, size_t schema_len,
const char* root_message,
char** errmsg);
// Load bytecode into a reusable program handle
wql_program_t* wql_program_load(const uint8_t* bytecode, size_t len, char** errmsg);
// Evaluate — single entry point for filter, project, or both
typedef struct {
uintptr_t output_len; // bytes written (0 when no projection)
bool matched; // predicate result (true when no predicate)
} wql_eval_result_t;
// Evaluate a program against input bytes.
// Returns 0 on success, -1 on error. On success, *result is populated.
// For filter-only programs, pass output=NULL / output_len=0.
// For project-only programs, result->matched is always true.
// Buffer sizing: output_len >= input_len is always sufficient.
int32_t wql_eval(const wql_program_t* prog,
const uint8_t* input, size_t input_len,
uint8_t* output, size_t output_len,
wql_eval_result_t* result,
char** errmsg); // 0=ok, -1=error
// Cleanup
void wql_program_free(wql_program_t* prog);
void wql_bytes_free(wql_bytes_t bytes);
void wql_errmsg_free(char* msg);
Thread-safe: wql_eval takes const wql_program_t* and can be called concurrently. For filter-only programs, pass output=NULL / output_len=0.
#include "wql.h"
#include <string.h>
#include <stdio.h>
int main(void) {
char *err = NULL;
/* 1. Compile a query to bytecode */
wql_bytes_t bc = wql_compile("WHERE #1 > 10 SELECT { #2 }", &err);
if (bc.data == NULL) {
fprintf(stderr, "compile error: %s\n", err);
wql_errmsg_free(err);
return 1;
}
/* 2. Load bytecode into a reusable program handle */
wql_program_t *prog = wql_program_load(bc.data, bc.len, &err);
wql_bytes_free(bc);
if (prog == NULL) {
fprintf(stderr, "load error: %s\n", err);
wql_errmsg_free(err);
return 1;
}
/* 3. Evaluate against a protobuf message */
const uint8_t *input = /* your protobuf bytes */ NULL;
size_t input_len = /* byte count */ 0;
uint8_t output[4096];
wql_eval_result_t result;
memset(&result, 0, sizeof(result));
int rc = wql_eval(prog, input, input_len,
output, sizeof(output),
&result, &err);
if (rc != 0) {
fprintf(stderr, "eval error: %s\n", err);
wql_errmsg_free(err);
wql_program_free(prog);
return 1;
}
if (result.matched) {
/* Predicate passed — projected bytes are in output[..result.output_len] */
fwrite(output, 1, result.output_len, stdout);
}
/* 4. Cleanup */
wql_program_free(prog);
return 0;
}
WQL programs can be packaged as standalone WASM modules for broker-side deployment (Redpanda data transforms, WasmEdge) or any WASM host. The module bundles the WVM interpreter and a single sealed program, so there is no compilation at runtime and no external dependencies.
# 1. Compile a query to bytecode
wqlc compile -q 'WHERE age > 18 SELECT { name, email }' \
-s schema.bin -m pkg.Person -o transform.wqlbc
# 2. Package into a standalone WASM module
wqlc wasm transform.wqlbc -o transform.wasm
The output is a self-contained .wasm file (~27 KB) that exports a single function:
wql_eval(in_ptr: i32, in_len: i32, out_ptr: i32, out_len: i32) -> i64
Returns >= 0 (matched, value is output bytes written), -1 (not matched), or -2 (error). The host provides both input and output buffers in WASM linear memory.
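On the host side, the packed i64 return value can be unpacked into something easier to match on. The sketch below assumes only the convention stated above. `Outcome` and `decode_wql_eval` are hypothetical names, and treating every negative value other than -1 as an error is an assumption.

```rust
/// Host-side view of the i64 returned by the exported `wql_eval`.
#[derive(Debug, PartialEq)]
enum Outcome {
    /// Predicate passed; `output_len` bytes were written to the output buffer.
    Matched { output_len: usize },
    /// Predicate failed; nothing should be emitted.
    NotMatched,
    /// Evaluation failed (-2 in the text; other negatives are assumed errors).
    Error,
}

fn decode_wql_eval(ret: i64) -> Outcome {
    match ret {
        -1 => Outcome::NotMatched,
        n if n >= 0 => usize::try_from(n)
            .map_or(Outcome::Error, |output_len| Outcome::Matched { output_len }),
        _ => Outcome::Error,
    }
}
```

A host would call the export, decode the result this way, and read `output_len` bytes starting at `out_ptr` from linear memory only in the `Matched` case.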
Build wqlc with the wasm feature to test the full pipeline (compile, package, execute) via wasmtime:
cargo install --path crates/wqlc --features wasm
# Runs through the WASM path instead of native
wqlc eval -q '{ name, age }' -s schema.bin -m pkg.Person --wasm < message.bin
The WASM template (crates/wqlc/data/template.wasm) is checked into the repo. Rebuild it after changing wql-runtime or wql-ir:
cargo wasm-template
cp target/wasm32-unknown-unknown/release-wasm/wql_wasm.wasm crates/wqlc/data/template.wasm
See LICENSE.