Offload leaf search work to AWS Lambda functions by fulmicoton-dd · Pull Request #6157 · quickwit-oss/quickwit

fulmicoton-dd · 2026-02-12T12:49:18Z

Summary

The goal is to handle traffic spikes gracefully without provisioning additional searcher nodes: when the local search queue is saturated, overflow splits are transparently routed to Lambda for processing.

How offloading works

The offloading decision happens on the leaf side, inside the SearchPermitProvider. The permit provider already manages a bounded queue of pending split search tasks (gated by memory budget and download slots). When a leaf search request arrives, the provider checks the current queue depth against a configurable offload_threshold. If granting permits for all requested splits would exceed this threshold, only enough splits to fill up to the threshold are processed locally — the rest are marked for offloading.
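A minimal sketch of the threshold rule described above, assuming the decision only depends on the current queue depth and the number of requested splits. The function name `split_local_and_offloaded` and its parameters are hypothetical and are not taken from the PR; the real `SearchPermitProvider` also accounts for memory budget and download slots.

```rust
/// Returns `(local, offloaded)` split counts for an incoming leaf request.
///
/// `pending` is the number of split search tasks already queued and
/// `requested` is the number of splits in the new request (hypothetical names).
fn split_local_and_offloaded(
    pending: usize,
    requested: usize,
    offload_threshold: usize,
) -> (usize, usize) {
    // Room left before the queue reaches the threshold (0 if already past it).
    let room = offload_threshold.saturating_sub(pending);
    // Only fill the queue up to the threshold; the rest is marked for offloading.
    let local = requested.min(room);
    (local, requested - local)
}
```

With a threshold of 100 and 90 pending tasks, a request for 30 splits would keep 10 local and offload 20. A threshold of 0 offloads everything, which matches the "0 = always offload" note in the configuration.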

The offloaded splits are batched (up to max_splits_per_invocation splits per batch, balanced by document count) and sent to Lambda in parallel. Each Lambda invocation runs the same leaf search code path and returns per-split results individually. This is important: the per-split responses are fed back into the IncrementalCollector and populate the partial result cache, so subsequent queries hitting the same splits benefit from cached results regardless of whether the split was searched locally or on Lambda.
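One way to picture the batching step is a greedy assignment: compute the minimum number of batches, then place splits (largest first) into the batch that currently holds the fewest documents and still has room. This is only an illustration under that assumption; the `Split` struct and `batch_splits` function are hypothetical, and the PR's actual balancing strategy may differ.

```rust
/// Hypothetical stand-in for the split metadata used for balancing.
#[derive(Debug, Clone, PartialEq)]
struct Split {
    split_id: String,
    num_docs: u64,
}

/// Groups splits into batches of at most `max_splits_per_invocation`,
/// trying to even out the total document count per batch.
fn batch_splits(mut splits: Vec<Split>, max_splits_per_invocation: usize) -> Vec<Vec<Split>> {
    assert!(max_splits_per_invocation > 0);
    if splits.is_empty() {
        return Vec::new();
    }
    // Smallest number of batches that respects the per-batch limit.
    let num_batches =
        (splits.len() + max_splits_per_invocation - 1) / max_splits_per_invocation;
    // Place the largest splits first so that small ones can even things out.
    splits.sort_by(|a, b| b.num_docs.cmp(&a.num_docs));
    let mut batches: Vec<Vec<Split>> = vec![Vec::new(); num_batches];
    let mut batch_docs = vec![0u64; num_batches];
    for split in splits {
        // Pick the lightest batch that is not full yet.
        let target = (0..num_batches)
            .filter(|&i| batches[i].len() < max_splits_per_invocation)
            .min_by_key(|&i| batch_docs[i])
            .expect("at least one batch has room");
        batch_docs[target] += split.num_docs;
        batches[target].push(split);
    }
    batches
}
```

Each resulting batch would then map to one Lambda invocation, with the invocations sent in parallel.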

Auto-deployment

Depending on the configuration, the Lambda function code can be deployed automatically at startup. The quickwit-lambda-client crate embeds a compressed Lambda binary at compile time. When auto_deploy is configured, Quickwit will:

1. Check if a published Lambda version matching the current binary already exists (identified by a description tag quickwit:{version}-{hash})
2. Create or update the function and publish a new version if needed
3. Garbage-collect old versions (keeping the current one + 5 most recent)

This ensures the Lambda function always matches the running Quickwit version without any external deployment tooling. Manual deployment is also supported for users who prefer to manage Lambda functions through Terraform or other IaC tools.
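The version matching and garbage-collection steps can be sketched with two pure functions. This assumes that "keeping the current one + 5 most recent" means the current version plus the 5 newest other versions, and that version numbers grow with each publish. `PublishedVersion`, `version_description` and `versions_to_delete` are hypothetical names, not the PR's API.

```rust
/// Builds the description tag used to recognize a published version.
fn version_description(quickwit_version: &str, binary_hash: &str) -> String {
    format!("quickwit:{quickwit_version}-{binary_hash}")
}

/// Hypothetical view of a published Lambda version.
#[derive(Debug, Clone, PartialEq)]
struct PublishedVersion {
    number: u64,
    description: String,
}

/// Returns the version numbers to delete: everything except the current
/// version and the `keep_recent` newest other versions.
fn versions_to_delete(
    versions: &[PublishedVersion],
    current_description: &str,
    keep_recent: usize,
) -> Vec<u64> {
    let mut others: Vec<&PublishedVersion> = versions
        .iter()
        .filter(|version| version.description != current_description)
        .collect();
    // Newest first, so that `skip` retains the most recent ones.
    others.sort_by(|a, b| b.number.cmp(&a.number));
    others
        .into_iter()
        .skip(keep_recent)
        .map(|version| version.number)
        .collect()
}
```

If a published version already carries the expected description, the create/update step can be skipped entirely and only the garbage collection needs to run.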

Configuration

Lambda offloading is opt-in. Add a lambda section under searcher in the node configuration:

searcher:
  lambda:
    offload_threshold: 100     # queue depth before offloading kicks in (0 = always offload)
    max_splits_per_invocation: 10
    auto_deploy:
      execution_role_arn: arn:aws:iam::123456789012:role/quickwit-lambda-role
      memory_size: 5 GiB
      invocation_timeout_secs: 15

New crates

quickwit-lambda-client: Handles Lambda invocation (with metrics) and auto-deployment logic. Embeds the Lambda binary at build time.
quickwit-lambda-server: The Lambda function handler itself — receives a LeafSearchRequest, runs multi_index_leaf_search, and returns per-split LeafSearchResponses.

Key changes in existing crates

quickwit-search: New LambdaLeafSearchInvoker trait; SearchPermitProvider gains get_permits_with_offload to split work between local and offloaded; leaf.rs orchestrates local and Lambda tasks in parallel.
quickwit-config: New LambdaConfig and LambdaDeployConfig structs under SearcherConfig.
quickwit-serve: Initializes the Lambda invoker at startup when configured.
quickwit-proto: New LeafSearchResponses wrapper message for batched per-split responses.

Copilot

Pull request overview

Adds an opt-in AWS Lambda “overflow” execution path for leaf split search to handle traffic spikes without adding more searcher nodes, including auto-deploy of the Lambda binary and per-split result integration into the existing partial result cache / incremental merge flow.

Changes:

Introduces quickwit-lambda-client (invocation + auto-deploy + metrics) and quickwit-lambda-server (Lambda handler running Quickwit leaf search).
Extends searcher configuration/context to support Lambda offloading, and updates leaf-search scheduling to split work between local permits and Lambda batches.
Adds protobuf support for batched per-split responses plus docs and CI workflow for publishing the Lambda binary.

Reviewed changes

Copilot reviewed 43 out of 45 changed files in this pull request and generated 13 comments.

File	Description
quickwit/quickwit-storage/src/cache/memory_sized_cache.rs	Adds a regression test for `CacheConfig::no_cache()` behavior.
quickwit/quickwit-serve/src/lib.rs	Initializes Lambda invoker on startup when searcher+lambda are configured.
quickwit/quickwit-serve/Cargo.toml	Adds dependency on `quickwit-lambda-client`.
quickwit/quickwit-search/src/tests.rs	Updates tests for `SearcherContext::new(..., lambda_invoker)` signature changes.
quickwit/quickwit-search/src/service.rs	Extends `SearcherContext` to carry an optional Lambda invoker.
quickwit/quickwit-search/src/search_permit_provider.rs	Adds offload-aware permit acquisition logic (threshold-based truncation).
quickwit/quickwit-search/src/root.rs	Minor tracing import/use adjustment.
quickwit/quickwit-search/src/list_terms.rs	Adjusts permit sizing collection (prep for offload-aware behavior).
quickwit/quickwit-search/src/lib.rs	Exposes new `invoker` module and re-exports `LambdaLeafSearchInvoker`.
quickwit/quickwit-search/src/leaf_cache.rs	Minor whitespace cleanup.
quickwit/quickwit-search/src/leaf.rs	Implements local-vs-Lambda scheduling, batching, parallel execution, and merge integration.
quickwit/quickwit-search/src/invoker.rs	Introduces `LambdaLeafSearchInvoker` trait abstraction.
quickwit/quickwit-proto/src/error.rs	Updates error header doc comment wording.
quickwit/quickwit-proto/src/codegen/quickwit/quickwit.search.rs	Adds `LeafSearchResponses` wrapper message to generated code.
quickwit/quickwit-proto/protos/quickwit/search.proto	Adds `LeafSearchResponses` proto definition.
quickwit/quickwit-lambda/README.md	Removes old deprecation stub text.
quickwit/quickwit-lambda-server/src/lib.rs	Defines Lambda server crate exports/modules.
quickwit/quickwit-lambda-server/src/handler.rs	Implements Lambda handler: decode request, run per-split searches, encode responses.
quickwit/quickwit-lambda-server/src/error.rs	Adds Lambda error types and conversions.
quickwit/quickwit-lambda-server/src/context.rs	Builds Lambda-optimized `SearcherConfig` from env and sets caches to `no_cache`.
quickwit/quickwit-lambda-server/src/config.rs	Adds a (currently empty) config stub file.
quickwit/quickwit-lambda-server/src/bin/leaf_search.rs	Provides Lambda binary entrypoint using `lambda_runtime`.
quickwit/quickwit-lambda-server/Cargo.toml	Adds Lambda server crate definition and dependencies/features.
quickwit/quickwit-lambda-client/src/metrics.rs	Adds Prometheus metrics for Lambda invocation and payload sizes.
quickwit/quickwit-lambda-client/src/lib.rs	Exposes deploy/invoker APIs and payload types.
quickwit/quickwit-lambda-client/src/invoker.rs	Implements AWS Lambda invocation + response decoding into per-split responses.
quickwit/quickwit-lambda-client/src/deploy.rs	Implements auto-deploy logic (version discovery/publish + GC).
quickwit/quickwit-lambda-client/build.rs	Downloads and embeds Lambda zip; computes content hash for versioning.
quickwit/quickwit-lambda-client/README.md	Documents Lambda release process and content-based versioning.
quickwit/quickwit-lambda-client/Cargo.toml	Adds Lambda client crate definition and dependencies/build deps.
quickwit/quickwit-config/src/node_config/serialize.rs	Extends serialization tests to cover lambda config.
quickwit/quickwit-config/src/node_config/mod.rs	Adds `LambdaConfig`, `LambdaDeployConfig`, `SearcherConfig.lambda`, and `CacheConfig::no_cache()`.
quickwit/quickwit-config/src/lib.rs	Re-exports lambda config types.
quickwit/quickwit-config/resources/tests/node_config/quickwit.yaml	Adds lambda section to test YAML config.
quickwit/quickwit-config/resources/tests/node_config/quickwit.toml	Adds lambda section to test TOML config.
quickwit/quickwit-config/resources/tests/node_config/quickwit.json	Adds lambda section to test JSON config.
quickwit/quickwit-config/Cargo.toml	Adds `quickwit-common` testsuite feature dep for config tests.
quickwit/quickwit-aws/src/lib.rs	Bumps AWS SDK behavior version used in defaults.
quickwit/Cargo.toml	Adds new crates to workspace members + workspace deps (lambda_runtime, ureq, zip, aws-sdk-lambda, aws-smithy-mocks).
quickwit/Cargo.lock	Locks new dependencies (aws-sdk-lambda, lambda_runtime, ureq, zip, etc.).
docs/configuration/lambda-config.md	Adds end-user documentation for lambda offloading + IAM + deployment.
LICENSE-3rdparty.csv	Updates third-party license list for new deps.
.github/workflows/publish_lambda.yaml	Adds workflow to build and draft-release the Lambda binary zip.
.github/actions/cross-build-binary/action.yml	Pins upload-to-github-release action by commit SHA.
.github/actions/cargo-build-macos-binary/action.yml	Pins upload-to-github-release action by commit SHA.

Comments suppressed due to low confidence (1)

quickwit/quickwit-lambda-server/src/config.rs:17

src/config.rs appears to be an unused stub (not referenced via mod config; and contains only imports). If it’s not needed, it should be removed; if it is intended to host config parsing, wire it up and add the missing implementation.

use anyhow::Context as _;
use bytesize::ByteSize;

Copilot · 2026-02-12T13:05:42Z

quickwit/quickwit-lambda-client/build.rs

+use std::path::PathBuf;
+
+/// URL to download the pre-built Lambda zip from GitHub releases.
+/// This should be updated when a new Lambda binary is released.


The hardcoded LAMBDA_ZIP_URL does not match the release naming described in quickwit-lambda-client/README.md / .github/workflows/publish_lambda.yaml (it has a lambda-ff6fdfa5 tag and quickwit-aws-lambda--aarch64.zip filename with a double dash). If left as-is this is likely to 404 and break builds; consider deriving the URL from the Quickwit version/tag convention or documenting why this specific tag/filename is expected.

Suggested change

-/// This should be updated when a new Lambda binary is released.
+///
+/// Note:
+/// - This is intentionally pinned to a specific Lambda release tag
+///   (`lambda-ff6fdfa5`) and asset name (`quickwit-aws-lambda--aarch64.zip`),
+///   which may not follow the generic naming pattern described in
+///   `quickwit-lambda-client/README.md` or `.github/workflows/publish_lambda.yaml`.
+/// - If the Lambda binary is rebuilt or a new version is published, both the
+///   tag and the asset filename in this URL must be updated to match the
+///   current release naming used in CI.

Copilot · 2026-02-12T13:05:42Z

docs/configuration/lambda-config.md

+3. Publish a version with description format `quickwit:{version}-{sha1}` (e.g., `quickwit:0_8_0-fa752891`)
+
+The description must match the format Quickwit expects, or it won't find the function version.


Manual deployment instructions refer to quickwit:{version}-{sha1}, but the implementation uses an MD5-based LAMBDA_BINARY_HASH in the Lambda version description. Update these instructions/examples to match the actual description format so manual deployments remain discoverable by Quickwit.

Suggested change

-3. Publish a version with description format `quickwit:{version}-{sha1}` (e.g., `quickwit:0_8_0-fa752891`)
-
-The description must match the format Quickwit expects, or it won't find the function version.
+3. Publish a version with description format `quickwit:{version}-{LAMBDA_BINARY_HASH}` (e.g., `quickwit:0_8_0-3b5d5c3712955042212316173ccf37be`)
+
+The description must match the format Quickwit expects (including the MD5-based `LAMBDA_BINARY_HASH`), or it won't find the function version.

Copilot · 2026-02-12T13:05:45Z

quickwit/quickwit-search/src/leaf.rs

+            let mut split_search_guard = SplitSearchStateGuard::new(split_outcome_counters.clone());
+            split_search_guard.set_state(SplitSearchState::CacheHit);
+            incremental_merge_collector.add_result(cached_response).ok();
+        } else {


process_partial_result_cache silently drops errors from incremental_merge_collector.add_result(...) via .ok(). Since add_result can fail (e.g., aggregation merge), this can hide corruption/incompatibility in cached responses and lead to incomplete results without any signal. At minimum log the error; ideally propagate it and fail the request.
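To illustrate the difference, here is a small sketch that propagates the merge error with `?` instead of discarding it with `.ok()`. `Collector`, `MergeError` and `merge_cached` are simplified, hypothetical stand-ins and do not mirror the real `IncrementalCollector` API.

```rust
/// Hypothetical error returned when a cached response cannot be merged.
#[derive(Debug, PartialEq)]
struct MergeError(String);

/// Simplified stand-in for the incremental merge collector.
#[derive(Default)]
struct Collector {
    num_hits: u64,
}

impl Collector {
    /// Merges one cached response; `None` models an incompatible entry.
    fn add_result(&mut self, cached_hits: Option<u64>) -> Result<(), MergeError> {
        match cached_hits {
            Some(hits) => {
                self.num_hits += hits;
                Ok(())
            }
            None => Err(MergeError("incompatible cached response".to_string())),
        }
    }
}

/// Merges all cached responses, failing the request on the first error.
fn merge_cached(collector: &mut Collector, cached: &[Option<u64>]) -> Result<(), MergeError> {
    for response in cached {
        // `?` surfaces the failure to the caller instead of dropping it.
        collector.add_result(*response)?;
    }
    Ok(())
}
```

If failing the whole request is too strict, the same loop could log the error and fall back to searching the split instead of returning early.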


trinity-1686a

review still in progress

trinity-1686a · 2026-02-12T16:28:17Z

docs/configuration/lambda-config.md

+
+### Lambda execution role
+
+The Lambda function requires an execution role with S3 read access to your index data. CloudWatch logging permissions are not required.


why is CloudWatch mentioned?

trinity-1686a · 2026-02-13T16:24:24Z

quickwit/quickwit-lambda-client/src/invoker.rs

+            payload: BASE64_STANDARD.encode(&request_bytes),
+        };
+
+        let payload_json = serde_json::to_vec(&payload)


LeafSearchRequest should implement (De)Serialize. If we're forced to use JSON, I think direct JSON is probably cleaner than json(base64(protobuf(request))).

trinity-1686a · 2026-02-13T16:37:59Z

quickwit/quickwit-lambda-client/build.rs

+
+/// URL to download the pre-built Lambda zip from GitHub releases.
+/// This should be updated when a new Lambda binary is released.
+const LAMBDA_ZIP_URL: &str = "https://github.com/quickwit-oss/quickwit/releases/download/lambda-ff6fdfa5/quickwit-aws-lambda--aarch64.zip";


As with all resources downloaded from the internet, I think we should check that the hash is the expected one (if someone's account gets compromised, we don't want them to be able to bait-and-switch the Lambda that eventually gets executed).

(For the avoidance of doubt, I don't mean "make sure the Lambda client is compatible with the Lambda server and use a hash as a versioning mechanism", but "let's ensure that any resource referenced in tree can be authenticated as being the same as the one intended by the person who added that reference in tree".)

trinity-1686a · 2026-02-13T17:12:15Z

quickwit/quickwit-lambda-server/src/handler.rs

+    let mut split_search_futures: Vec<tokio::task::JoinHandle<_>> =
+        Vec::with_capacity(all_splits.len());
+    for (leaf_req_idx, split) in all_splits {
+        let leaf_request_ref = &leaf_search_request.leaf_requests[leaf_req_idx];


It took me multiple readings of the code to understand why all_splits existed and why we needed to index leaf_search_request.leaf_requests here.
I think moving the creation of the LeafSearchRequest inside the flat_map that created all_splits would be a lot easier to understand.
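A rough sketch of what that restructuring could look like, using simplified stand-in types: the per-split request is built directly inside the `flat_map`, so no index into the parent request list needs to be carried around. `LeafRequest`, `SingleSplitRequest` and `per_split_requests` are hypothetical and far smaller than the real protobuf messages.

```rust
/// Hypothetical, simplified leaf request covering several splits of one index.
#[derive(Debug, Clone, PartialEq)]
struct LeafRequest {
    index_uid: String,
    split_ids: Vec<String>,
}

/// Hypothetical request targeting exactly one split.
#[derive(Debug, Clone, PartialEq)]
struct SingleSplitRequest {
    index_uid: String,
    split_id: String,
}

/// Expands each leaf request into one request per split, preserving order.
fn per_split_requests(leaf_requests: &[LeafRequest]) -> Vec<SingleSplitRequest> {
    leaf_requests
        .iter()
        .flat_map(|leaf_request| {
            // Build the single-split request here, while the parent request
            // is in scope, rather than keeping an index to look it up later.
            leaf_request
                .split_ids
                .iter()
                .map(move |split_id| SingleSplitRequest {
                    index_uid: leaf_request.index_uid.clone(),
                    split_id: split_id.clone(),
                })
        })
        .collect()
}
```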

trinity-1686a · 2026-02-13T17:14:48Z

quickwit/quickwit-lambda-server/src/handler.rs

+
+        let searcher_context = ctx.searcher_context.clone();
+        let storage_resolver = ctx.storage_resolver.clone();
+        split_search_futures.push(tokio::task::spawn(multi_index_leaf_search(


I'm not a fan of using tokio::spawn for this kind of task. IMO this should be a JoinSet, with some logic to preserve ordering. WDYT?

trinity-1686a · 2026-02-13T17:19:01Z

quickwit/quickwit-search/src/list_terms.rs

+    // allow offload to lambda
+    // https://github.com/quickwit-oss/quickwit/issues/6150


it's not clear that this is a todo

trinity-1686a · 2026-02-13T17:19:52Z

quickwit/quickwit-search/src/search_permit_provider.rs

-        permit_sender: oneshot::Sender<Vec<SearchPermitFuture>>,
+    RequestWithOffload {
        permit_sizes: Vec<u64>,
+        /// Maximum number of pending requests. If granting permits all


Suggested change

-/// Maximum number of pending requests. If granting permits all
+/// Maximum number of pending requests. If granting all

fulmicoton requested a review from Copilot on February 12, 2026 12:51

Copilot started reviewing on behalf of fulmicoton on February 12, 2026 12:52

fulmicoton-dd force-pushed the lambda3 branch 2 times, most recently from 63a7c92 to 0441e5f, on February 12, 2026 12:58

fulmicoton changed the title from "This PR introduces the ability to offload leaf search work to AWS Lam…" to "Offload leaf search work to AWS Lambda functions" on Feb 12, 2026

Copilot AI reviewed on Feb 12, 2026

fulmicoton-dd force-pushed the lambda3 branch 2 times, most recently from e212982 to 94b0a4d, on February 12, 2026 13:48

fulmicoton-dd marked this pull request as ready for review on February 12, 2026 13:48

fulmicoton-dd force-pushed the lambda3 branch from 94b0a4d to 68cf3dd on February 12, 2026 13:57

fulmicoton requested a review from trinity-1686a on February 12, 2026 13:57

fulmicoton-dd force-pushed the lambda3 branch from 68cf3dd to c335f75 on February 13, 2026 11:26

trinity-1686a reviewed Feb 13, 2026

