Add Discord bot integration #1022

Open
cyrusagent wants to merge 2 commits into main from cyhost-731

@cyrusagent

Assignee: Payton Webber

Summary

  • Creates new cyrus-discord-event-transport package with full Discord Gateway WebSocket client (HELLO → IDENTIFY → READY → heartbeat lifecycle with automatic reconnect/resume), HTTP event transport for CYHOST-forwarded webhooks, message translator, REST message service (with 2K-char message splitting), and reaction service
  • Adds Discord platform types to cyrus-core: DiscordPlatformRef, DiscordSessionStartPlatformData, DiscordUserPromptPlatformData, isDiscordMessage type guard, "discord" added to MessageSource union
  • Implements DiscordChatAdapter (following SlackChatAdapter pattern) with Discord-specific Markdown formatting rules, thread context fetching, reply posting with message references, and emoji acknowledgment
  • Wires up registerDiscordEventTransport() in EdgeWorker with /discord-webhook endpoint, Bearer token auth, status check integration, and shutdown cleanup
  • Updates tsconfig.base.json with path mappings for all event-transport packages
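
For reviewers who want a feel for the 2K-char splitting mentioned above, here is a minimal sketch. It assumes a break-at-newline preference; `splitMessage` and `DISCORD_MAX_MESSAGE_LENGTH` are illustrative names and may not match what DiscordMessageService actually does.

```typescript
// Discord rejects message content longer than 2000 characters.
const DISCORD_MAX_MESSAGE_LENGTH = 2000;

// Hypothetical helper: splits text into chunks that each fit in one message,
// preferring to break just after the last newline inside the limit.
function splitMessage(
  text: string,
  limit: number = DISCORD_MAX_MESSAGE_LENGTH,
): string[] {
  const chunks: string[] = [];
  let remaining = text;
  while (remaining.length > limit) {
    const head = remaining.slice(0, limit);
    const newlineAt = head.lastIndexOf("\n");
    // Break after the newline when one exists past position 0;
    // otherwise hard-cut at the limit.
    const cut = newlineAt > 0 ? newlineAt + 1 : limit;
    chunks.push(remaining.slice(0, cut));
    remaining = remaining.slice(cut);
  }
  if (remaining.length > 0) chunks.push(remaining);
  return chunks;
}
```

Joining the returned chunks always reproduces the input, so nothing is dropped at the cut points.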

Test plan

  • All 638 edge-worker tests pass
  • All 74 core tests pass
  • All 62 slack-event-transport tests pass
  • Full monorepo build succeeds (all 16 packages)
  • Biome lint passes
  • TypeScript type-checking passes across all packages
  • Manual testing: connect Discord bot via OAuth, send @mention, verify response

Linear issue

CYHOST-731


Tip: I will respond to comments that @ mention @cyrusagent on this PR. You can also submit a "changes requested" review with all your feedback at once, and I will automatically wake up to address each comment.

…rker wiring)

Implements Discord bot functionality analogous to the existing Slack integration:

- New `cyrus-discord-event-transport` package with DiscordGatewayClient (WebSocket
  Gateway lifecycle), DiscordEventTransport (HTTP endpoint for CYHOST-forwarded events),
  DiscordMessageTranslator, DiscordMessageService (REST API with 2K-char message
  splitting), and DiscordReactionService
- Core types: DiscordPlatformRef, DiscordSessionStartPlatformData,
  DiscordUserPromptPlatformData, isDiscordMessage type guard, "discord" in MessageSource
- DiscordChatAdapter implementing ChatPlatformAdapter with Discord Markdown formatting
- EdgeWorker: registerDiscordEventTransport() with /discord-webhook endpoint,
  status check, and shutdown integration
- Updated tsconfig.base.json paths for all event-transport packages

CYHOST-731
The DiscordGatewayClient was fully implemented but never instantiated.
registerDiscordEventTransport() only created an HTTP endpoint, but Discord
requires a persistent WebSocket connection (unlike Slack's HTTP webhooks).

Now when DISCORD_BOT_TOKEN is set, the EdgeWorker connects to Discord's
Gateway to receive MESSAGE_CREATE events for @mentions directly.
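
As a rough illustration of the Gateway handshake this commit relies on, the sketch below builds the IDENTIFY and heartbeat payloads and reads the interval out of HELLO. The opcodes (1, 2, 10) and payload fields follow Discord's Gateway documentation; the function names and the `properties` values are assumptions, not the actual DiscordGatewayClient code.

```typescript
// Gateway opcodes as documented by Discord.
const OP_HEARTBEAT = 1;
const OP_IDENTIFY = 2;
const OP_HELLO = 10;

interface GatewayPayload {
  op: number;
  d: unknown;
  s?: number | null;
  t?: string | null;
}

// Hypothetical builder for the IDENTIFY payload sent after HELLO.
function buildIdentify(botToken: string, intents: number): GatewayPayload {
  return {
    op: OP_IDENTIFY,
    d: {
      token: botToken,
      intents,
      // Placeholder client properties; the real values are an assumption.
      properties: { os: "linux", browser: "cyrus", device: "cyrus" },
    },
  };
}

// A heartbeat carries the last sequence number seen, or null before any dispatch.
function buildHeartbeat(lastSequence: number | null): GatewayPayload {
  return { op: OP_HEARTBEAT, d: lastSequence };
}

// Returns the heartbeat interval in ms from a HELLO payload, or null otherwise.
function heartbeatIntervalFromHello(payload: GatewayPayload): number | null {
  if (payload.op !== OP_HELLO) return null;
  const d = payload.d as { heartbeat_interval?: unknown } | null;
  return d && typeof d.heartbeat_interval === "number"
    ? d.heartbeat_interval
    : null;
}
```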
@cyrusagent

Great questions — here's a breakdown:

1. Every client uses the same bot token?

Yes, currently. DISCORD_BOT_TOKEN is a single env var at the EdgeWorker level — same pattern as SLACK_BOT_TOKEN. Each edge worker deployment gets one bot token. This is fine for the current architecture where each edge worker instance serves one team/deployment.

2. How do we know what guild @mention matches to a client?

The DiscordGatewayClient has a guildFilter config option that can restrict which guild IDs it processes:

export interface DiscordGatewayConfig {
    botToken: string;
    intents: number;
    guildFilter?: string[];  // exists but not yet wired up
}

The filtering logic is implemented in handleMessageCreate() — it drops messages from guilds not in the allow-list. However, it's not currently configured when the client is instantiated in registerDiscordEventTransport().
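
To make the allow-list behaviour concrete, here is a minimal sketch of what such a check could look like. `isAllowedGuild` and `GuildScopedMessage` are hypothetical names, and two behaviours are assumptions on my part: an empty or missing filter accepts everything, and DMs are dropped when a filter is active.

```typescript
interface GuildScopedMessage {
  guild_id?: string; // absent for direct messages
  content: string;
}

// Hypothetical allow-list check, not the actual handleMessageCreate() code.
function isAllowedGuild(
  message: GuildScopedMessage,
  guildFilter?: string[],
): boolean {
  // Assumption: no filter configured means every guild is accepted.
  if (!guildFilter || guildFilter.length === 0) return true;
  // Assumption: DMs carry no guild_id, so they are dropped once a filter is set.
  if (message.guild_id === undefined) return false;
  return guildFilter.includes(message.guild_id);
}
```

Wiring this up would then only require passing the team's guild ID list as `guildFilter` when the client is constructed.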

For v1, this is analogous to Slack: one bot token per edge worker, one workspace/guild per deployment. True multi-tenant routing (one bot serving N guilds mapped to N teams) would need:

  1. Wire guildFilter from config (e.g., the team's discord_guild_id from Supabase)
  2. Include guild ID in session thread keys to prevent cross-guild collisions
  3. Potentially per-team bot tokens looked up dynamically

3. WebSocket limitations

  • One connection per bot token: Discord does not block a second unsharded connection, but every session on the same token receives the same events, so each @mention would be handled twice. Sharding only becomes mandatory at 2,500+ guilds.
  • Must run in a long-lived process — not compatible with serverless (Vercel). This is why the Gateway client lives in the edge worker (persistent Node.js process), not CYHOST.
  • Reconnection handling — the client handles RESUME, heartbeat timeouts, and exponential backoff, but a prolonged network outage means missed messages (Discord doesn't replay them after reconnect beyond the session buffer).
  • Rate limits — Discord's Gateway has identify rate limits (1 identify per 5 seconds). The client handles this via the reconnection delay logic.
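
A sketch of how the reconnection delay could respect the identify limit: exponential backoff with a cap, never dropping below 5 seconds between IDENTIFY attempts. The base, cap, and function name are assumptions for illustration; the real client may use different numbers or add jitter.

```typescript
// Discord allows roughly one IDENTIFY per 5 seconds.
const IDENTIFY_MIN_INTERVAL_MS = 5000;

// Hypothetical delay calculation: exponential growth, capped, with a 5s floor.
function reconnectDelayMs(
  attempt: number,
  baseMs: number = 1000,
  maxMs: number = 60000,
): number {
  const exponential = baseMs * Math.pow(2, Math.max(0, attempt));
  const capped = Math.min(exponential, maxMs);
  return Math.max(capped, IDENTIFY_MIN_INTERVAL_MS);
}
```

With these defaults the first few attempts all wait 5 seconds, attempt 3 waits 8 seconds, and anything from attempt 6 onward waits the full 60 seconds.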

@cyrusagent

Short answer: The current architecture already handles multi-tenancy — each team's edge worker is its own process with its own DISCORD_BOT_TOKEN.

This is the same model as Slack today: each team has their own bot token in their edge worker's env. The Gateway connection is per-edge-worker, which is per-team. No shared state.

Where it gets tricky

If we ever want a single shared Cyrus Discord app (one bot that customers invite to their servers, like a SaaS product), we'd need a centralized Gateway connection that routes events to the right edge worker. Three options:

| Approach | Pros | Cons |
| --- | --- | --- |
| 1. Per-team bot tokens (current) | Simple, isolated, no routing needed. Each team creates their own Discord app. | Customers must create their own Discord app (extra onboarding friction). |
| 2. Shared bot + Gateway proxy service | One-click "Add to Discord" install. Single bot identity. | Needs a persistent always-on service (not Vercel) to run the Gateway client and forward events to edge workers via HTTP, which is exactly what the /discord-webhook HTTP endpoint was designed for. |
| 3. Shared bot + Discord Interactions endpoint | Uses Discord's HTTP-based Interactions Endpoint instead of Gateway. Serverless-compatible. | Only works for slash commands and components, NOT for @mention messages. Would need to switch UX from @mentions to slash commands. |

Recommendation

For v1: Option 1 (per-team tokens) is correct. It's how Slack works today and scales fine — each edge worker is already isolated per-team.

For v2 (if we want one-click Discord install): Option 2 — add a lightweight persistent service (could be a single Node.js process, a Fly.io app, or a Cloudflare Durable Object) that:

  1. Maintains one Gateway WebSocket per shared bot token
  2. On MESSAGE_CREATE, looks up guild_id → team mapping (from Supabase teams.discord_guild_id)
  3. Forwards the event to the team's edge worker via HTTP /discord-webhook (the endpoint already exists for this)

This is essentially the same role CYHOST plays for Slack (receive webhook → look up team → forward to edge worker), just with a WebSocket frontend instead of an HTTP one.

@cyrusagent

Good point — Vercel can't hold WebSocket connections, so the Gateway proxy can't live in CYHOST itself. Here are the options for where to run it:

Option A: Cloudflare Durable Object (recommended)

We already use Cloudflare for tunnels (cyrus-cloudflare-tunnel-client). A Durable Object is a stateful, long-lived process on Cloudflare's edge — perfect fit:

  • Maintains the Gateway WebSocket to Discord
  • On MESSAGE_CREATE, looks up guild_id → team in Supabase, forwards to edge worker via HTTP
  • Auto-scales: one DO instance per bot token (or per shard if needed)
  • No new infra to manage, stays within our existing Cloudflare account
  • ~$0.15/million requests + $12.50/million GB-s duration — negligible for our scale

Option B: Fly.io / Railway lightweight service

A single Node.js process that:

  • Connects to Discord Gateway
  • Forwards events to edge workers via /discord-webhook
  • Could literally be DiscordGatewayClient + a fetch call — ~50 lines of code
  • $3-5/mo for an always-on micro VM

Option C: Vercel Cron + Discord REST polling (hacky, not recommended)

Poll Discord's REST API for new messages on a cron schedule (e.g., every 5s). Technically works on Vercel but:

  • 5s latency minimum
  • REST API rate limits (50 req/s per bot)
  • Misses real-time feel entirely
  • Not how Discord is meant to be used

Option D: Move the Gateway client to CYHOST's edge worker sidecar

If CYHOST ever moves off pure Vercel (e.g., to a hybrid setup with a long-running sidecar), the Gateway client could run there. But this is a bigger architectural shift.

Recommendation

Option A (Cloudflare Durable Object) is the cleanest path. We already have the Cloudflare dependency, Durable Objects are designed for exactly this pattern (stateful WebSocket ↔ HTTP bridge), and it keeps the architecture consistent. The /discord-webhook HTTP endpoint on the edge worker is already built to receive forwarded events — the DO just needs to be the thing that forwards them.

For v1 though, none of this is needed — the per-team bot token on the edge worker works fine.

@cyrusagent

Here's a concrete implementation plan for the Cloudflare Durable Object approach:

Architecture

Discord Gateway (WSS)
        ↕
Cloudflare Worker + Durable Object ("DiscordGatewayProxy")
        ↓ HTTP POST
CYHOST (Vercel) /api/discord/webhook  [NEW — mirrors /api/slack/webhook]
        ↓ HTTP POST
Edge Worker /discord-webhook  [EXISTING — already built]
        ↓
ChatSessionHandler → DiscordChatAdapter → AI runner

This mirrors the existing Slack flow exactly:

Slack HTTP webhook → CYHOST /api/slack/webhook → Edge Worker /slack-webhook

Implementation Steps

1. Create Cloudflare Worker + Durable Object package

New package: packages/discord-gateway-proxy/ (in cyrus-hosted or a new repo)

discord-gateway-proxy/
├── wrangler.toml
├── src/
│   ├── worker.ts          # CF Worker entry — routes to DO
│   └── DiscordGatewayDO.ts # Durable Object — holds Gateway WebSocket
└── package.json

The Durable Object does 3 things:

  1. Maintains a WebSocket connection to Discord's Gateway (reuse the protocol logic from our existing DiscordGatewayClient)
  2. On MESSAGE_CREATE with bot @mention, forwards the event to CYHOST via HTTP
  3. Handles reconnection, heartbeat, and session resume (same as DiscordGatewayClient)
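
Step 2 hinges on recognising a bot @mention. A minimal sketch, assuming the standard MESSAGE_CREATE shape where `mentions` is an array of user objects; `isBotMention` and `stripMention` are hypothetical helpers, not existing code.

```typescript
interface DiscordUser {
  id: string;
  bot?: boolean;
}

interface MessageCreate {
  content: string;
  author: DiscordUser;
  mentions: DiscordUser[];
}

// True when a human author mentioned the bot.
function isBotMention(message: MessageCreate, botUserId: string): boolean {
  // Ignore the bot's own messages (and other bots) to avoid reply loops.
  if (message.author.id === botUserId || message.author.bot === true) {
    return false;
  }
  return message.mentions.some((user) => user.id === botUserId);
}

// Removes the mention token so only the user's prompt is forwarded.
function stripMention(content: string, botUserId: string): string {
  // Mentions appear in content as <@ID> or the legacy nickname form <@!ID>.
  return content.replace(new RegExp(`<@!?${botUserId}>`, "g"), "").trim();
}
```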

wrangler.toml:

name = "discord-gateway-proxy"
main = "src/worker.ts"
compatibility_date = "2024-01-01"

[durable_objects]
bindings = [
  { name = "DISCORD_GATEWAY", class_name = "DiscordGatewayDO" }
]

[[migrations]]
tag = "v1"
new_classes = ["DiscordGatewayDO"]

[vars]
CYHOST_WEBHOOK_URL = "https://app.atcyrus.com/api/discord/webhook"

Key design: One DO instance per bot token. The Worker entry point routes based on bot token hash → DO instance. This means if we ever have multiple shared bots (unlikely), each gets its own connection.

2. Add Discord webhook forwarding route in CYHOST

New file: apps/app/src/app/api/discord/webhook/route.ts

This mirrors the existing Slack webhook route (/api/slack/webhook/route.ts) exactly:

  1. Receive event from Cloudflare DO (with a shared secret for auth)
  2. Look up team by discord_guild_id in Supabase
  3. Determine if team is self-host or cloud
  4. Forward to https://{tunnel_domain}/discord-webhook with Bearer auth (self-host) or http://{droplet_ip}:3000/discord-webhook (cloud)
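
Steps 2 to 4 could be sketched roughly as below. The URL shapes come from step 4; everything else is an assumption: the `TeamHosting` type, the in-memory `Map` standing in for the Supabase lookup, and the choice to send the Bearer header only on the self-host path.

```typescript
// Hypothetical team record; the real Supabase columns may differ.
type TeamHosting =
  | { kind: "self-host"; tunnelDomain: string; bearerToken: string }
  | { kind: "cloud"; dropletIp: string };

interface ForwardTarget {
  url: string;
  headers: Record<string, string>;
}

// Builds the edge-worker URL and headers for one team.
function resolveForwardTarget(team: TeamHosting): ForwardTarget {
  const headers: Record<string, string> = {
    "Content-Type": "application/json",
  };
  if (team.kind === "self-host") {
    headers["Authorization"] = `Bearer ${team.bearerToken}`;
    return { url: `https://${team.tunnelDomain}/discord-webhook`, headers };
  }
  return { url: `http://${team.dropletIp}:3000/discord-webhook`, headers };
}

// Looks up the team by guild ID; returns null for DMs and unknown guilds.
function routeByGuild(
  guildId: string | undefined,
  teamsByGuild: Map<string, TeamHosting>,
): ForwardTarget | null {
  if (guildId === undefined) return null;
  const team = teamsByGuild.get(guildId);
  return team ? resolveForwardTarget(team) : null;
}
```

Returning null for an unknown guild lets the route answer with a 404 instead of forwarding the event anywhere.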

The DB index idx_teams_discord_guild_id already exists from our migration.

3. Shared bot setup (Discord Developer Portal)

Create a single "Cyrus" Discord application owned by the Cyrus company account:

  • Bot token stored as a Cloudflare Worker secret (wrangler secret put DISCORD_BOT_TOKEN)
  • OAuth2 redirect configured for production CYHOST domain
  • Required intents: GUILDS, GUILD_MESSAGES, MESSAGE_CONTENT (privileged — needs Discord approval for 100+ guilds)
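
For reference, the three intents combine into a single bitfield that goes in the IDENTIFY payload. The bit positions below are from Discord's Gateway intents documentation; the constant names are just for illustration.

```typescript
// Bit positions from Discord's Gateway intents documentation.
const GatewayIntent = {
  GUILDS: 1 << 0, // 1
  GUILD_MESSAGES: 1 << 9, // 512
  MESSAGE_CONTENT: 1 << 15, // 32768 (privileged)
} as const;

// Combined bitfield for the `intents` field: 1 + 512 + 32768 = 33281.
const requiredIntents =
  GatewayIntent.GUILDS |
  GatewayIntent.GUILD_MESSAGES |
  GatewayIntent.MESSAGE_CONTENT;
```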

4. Update onboarding flow

When a customer clicks "Add to Discord":

  1. OAuth flow installs the shared Cyrus bot into their guild
  2. CYHOST stores discord_guild_id on the team record (already implemented)
  3. No per-team bot token needed — the shared bot is already connected via the DO

What stays the same

  • Edge worker code: No changes. The /discord-webhook endpoint already accepts forwarded events with Bearer auth.
  • DiscordChatAdapter: No changes. It already extracts discordBotToken from the event or falls back to env var.
  • V1 (per-team bot token): Still works in parallel. The Gateway client on the edge worker handles per-team tokens; the DO handles the shared bot.

Effort Estimate

| Component | Complexity | Notes |
| --- | --- | --- |
| Durable Object + Worker | Medium | ~200 lines. Can reuse Gateway protocol logic from DiscordGatewayClient. |
| CYHOST webhook route | Low | ~80 lines. Copy-paste from Slack webhook route, change field names. |
| Wrangler deploy config | Low | CI/CD for wrangler deploy. |
| Discord app setup | Low | Manual one-time setup in Discord Developer Portal. |
| Onboarding flow update | Low | Remove "create your own bot" step, use shared OAuth install. |

Should we build this now?

For CYHOST-731 (v1), the per-team bot token on the edge worker is sufficient. The DO approach is a v2 optimization for when we want one-click Discord install without customers creating their own bot app. I'd recommend shipping v1 first, validating the Discord experience works end-to-end, then building the DO proxy as a follow-up issue.

@cyrusagent

Here's what's needed to complete the V2 Discord Gateway via Rivet:

Already Done

  • Rivet actor (discordGateway) — maintains Gateway WebSocket, filters @mentions, forwards events
  • /api/rivet/[...all] route handler in CYHOST
  • /api/discord/webhook forwarding route (guild → team → edge worker)
  • RIVET_ENDPOINT and RIVET_PUBLIC_ENDPOINT env vars on Vercel
  • Deployed and verified /api/rivet/metadata returns actor info

Remaining Steps

1. Add the Vercel provider URL in the Rivet dashboard

Go to the Rivet Dashboard → your namespace → Settings → Providers → Add Vercel provider with URL:

https://cyrus-preview-cyhost-731.vercel.app/api/rivet

This tells Rivet where your runner (the Vercel app) is hosted so it can route actor requests to it.

2. Set DISCORD_BOT_TOKEN env var on Vercel

The actor reads process.env.DISCORD_BOT_TOKEN to authenticate with Discord's Gateway. This should already be set from our earlier deployment — verify with vercel env ls.

3. Set DISCORD_GATEWAY_WEBHOOK_SECRET env var on Vercel

The actor uses this shared secret when forwarding events to /api/discord/webhook. Generate a random string and set it:

openssl rand -hex 32  # generate a secret
vercel env add DISCORD_GATEWAY_WEBHOOK_SECRET preview cyhost-731

The /api/discord/webhook route reads the same variable to validate incoming requests.
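
A minimal sketch of what that validation could look like, assuming the actor sends the secret as a Bearer token in the Authorization header (the exact header is an assumption). The comparison is a hand-rolled constant-time check to keep the example dependency-free; in the real route, Node's `crypto.timingSafeEqual` is the usual choice.

```typescript
// Compares two strings without exiting early on the first mismatch.
function constantTimeEqual(a: string, b: string): boolean {
  let diff = a.length ^ b.length;
  const length = Math.max(a.length, b.length);
  for (let i = 0; i < length; i++) {
    // charCodeAt returns NaN past the end; `| 0` turns that into 0.
    diff |= (a.charCodeAt(i) | 0) ^ (b.charCodeAt(i) | 0);
  }
  return diff === 0;
}

// Hypothetical check for the forwarded-event request.
function isAuthorized(
  authorizationHeader: string | null,
  expectedSecret: string,
): boolean {
  // Refuse everything if the secret is not configured.
  if (!authorizationHeader || expectedSecret.length === 0) return false;
  const prefix = "Bearer ";
  if (!authorizationHeader.startsWith(prefix)) return false;
  return constantTimeEqual(
    authorizationHeader.slice(prefix.length),
    expectedSecret,
  );
}
```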

4. Create the actor instance

Once the provider is connected, create the discordGateway actor via the Rivet client or dashboard. The actor's run hook auto-connects to Discord's Gateway on start. From a server-side script or the Rivet inspector:

import { createClient } from "rivetkit/client";
const client = createClient<typeof registry>();
await client.discordGateway.getOrCreate(["primary"]);

Or use the Rivet dashboard to create an actor with key ["primary"].

5. Verify the connection

Check the actor status via the Rivet inspector or call the getStatus action:

const status = await client.discordGateway.getOrCreate(["primary"]).getStatus();
// { connected: true, botUserId: "1483958995852202074", sessionId: "...", reconnectAttempts: 0 }

You should also see Gateway logs in the Rivet dashboard:

  • [DiscordGateway] HELLO, heartbeat interval: 41250ms
  • [DiscordGateway] READY: session=..., bot=..., guilds=1

6. Test end-to-end

@mention the bot in Discord → actor receives MESSAGE_CREATE → forwards to /api/discord/webhook → looks up team by guild ID → forwards to edge worker /discord-webhook → ChatSessionHandler → AI response posted back to Discord.

@cyrusagent

Found the issue. When I try to create the actor via the API, I get:

No runners with name 'default' are available in any datacenter for the namespace 'cyrus-whhd-production-zpro'.
Validate the runner is listed in the Connect tab and that the runner's name matches the requested runner name.

The Vercel provider hasn't been connected yet in the Rivet dashboard. This is the step that tells Rivet where to actually run the actor.

How to fix

  1. Go to Rivet Dashboard → namespace cyrus-whhd-production-zpro → Connect tab
  2. Click Add Provider → select Vercel
  3. Set the endpoint URL to:
    https://cyrus-preview-cyhost-731.vercel.app/api/rivet
  4. Save

Once the provider is connected, creating the actor should work. I can create it via the API once you've done this:

curl -X POST "https://api.rivet.dev/actors?namespace=cyrus-whhd-production-zpro" \
  -H "Authorization: Bearer sk_..." \
  -H "Content-Type: application/json" \
  -d '{"name":"discordGateway","key":"[\"primary\"]","runner_name_selector":"default","crash_policy":"restart"}'

Or use the Rivet dashboard UI to create it after the provider is registered.
