[BUG] send_nostr_dm blocks asyncio event loop on unreachable relays #3924

@BenGWeeks

Take the proposed resolution with a pinch of salt — drafted quickly with AI assistance. The diagnosis (sync websocket call blocking the event loop) is verified against live logs, but the exact fix shape deserves maintainer review.

Summary

send_nostr_dm() in lnbits/core/services/nostr.py uses the synchronous websocket.create_connection() inside an async function and iterates relays sequentially. One unreachable or slow relay blocks the asyncio event loop for up to ~30 seconds per attempt, making NWC Provider and other WebSocket-based extensions unresponsive during that time.

This is a separate event-loop-starvation bug from #3917 / #3918 (IN_FLIGHT payment polling). After applying the #3918 fix, this became the next visible bottleneck.

Offending code

lnbits/core/services/nostr.py, line ~30:

ws_connections: list[WebSocket] = []
for relay in relays:
    try:
        ws = create_connection(relay, timeout=2)   # sync call in async fn
        ws.send(notification)
        ws_connections.append(ws)
    except Exception as e:
        logger.warning(f"Error sending notification to relay {relay}: {e}")
await asyncio.sleep(1)

Issues:

  1. create_connection() is synchronous — it blocks the event loop on TCP connect + TLS handshake
  2. The timeout=2 is not always honored (we observe 30s blocks in production against wss://relay.snort.social and wss://relay.nostr.band)
  3. Relays are tried sequentially, so the blocking times add up: 3 dead relays block for 3 × the per-relay time
  4. ws.send() is also sync

With 3 dead relays in the user's configured list, a single send_nostr_dm() call can block the event loop for 60–90 seconds.
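The starvation can be reproduced without any relay. The sketch below is illustrative only: fake_create_connection() is a hypothetical stand-in that uses time.sleep() to imitate a hanging sync connect, and heartbeat() plays the role of any other coroutine (for example an NWC request handler) that needs the loop.

```python
import asyncio
import time

BLOCK_SECONDS = 0.3  # stand-in for a relay connect that hangs


def fake_create_connection() -> None:
    """Hypothetical stand-in for a sync create_connection() that hangs."""
    time.sleep(BLOCK_SECONDS)


async def heartbeat(ticks: list) -> None:
    """Record a timestamp every 10 ms, whenever the loop is free to run us."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(0.01)


async def measure(offload: bool) -> int:
    """Return how many heartbeat ticks happen while the 'connect' runs."""
    ticks: list = []
    hb = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0)  # let the heartbeat record its first tick
    before = len(ticks)
    if offload:
        # Worker thread does the blocking; the loop keeps scheduling tasks.
        await asyncio.to_thread(fake_create_connection)
    else:
        # Sync call inside a coroutine: nothing else can run until it returns.
        fake_create_connection()
    after = len(ticks)
    hb.cancel()
    return after - before


blocked_ticks = asyncio.run(measure(offload=False))
offloaded_ticks = asyncio.run(measure(offload=True))
print(f"ticks while blocked: {blocked_ticks}, ticks while offloaded: {offloaded_ticks}")
```

In the blocking case the heartbeat gets no turns at all for the whole duration, which matches the NWC timeouts described in this issue.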

Symptoms

  • NWC list_transactions, get_balance, pay_invoice intermittently time out
  • docker logs repeatedly shows:
    WARNING | Error sending notification to relay wss://nostr.wine: Handshake status 403 Forbidden
    WARNING | Error sending notification to relay wss://relay.snort.social: Connection timed out
    WARNING | Error sending notification to relay wss://relay.nostr.band: timed out
    
  • Web UI works (HTTP served separately), masking the issue
  • Only recovery from a pile-up is docker restart

Proposed fix

Two options (either or both):

A. Offload sync calls via asyncio.to_thread and run concurrently:

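async def _publish(relay: str):
    """Connect and send in a worker thread so the event loop stays responsive."""
    try:
        # timeout=2 is the socket timeout; wait_for caps the await itself at 3 s.
        # On timeout the worker thread keeps running until create_connection returns.
        ws = await asyncio.wait_for(
            asyncio.to_thread(create_connection, relay, timeout=2),
            timeout=3,
        )
        await asyncio.to_thread(ws.send, notification)
        return ws
    except Exception as e:
        logger.warning(f"Error sending notification to relay {relay}: {e}")
        return None

# Publish to all relays concurrently instead of one after another.
results = await asyncio.gather(*(_publish(r) for r in relays))
ws_connections = [ws for ws in results if ws is not None]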
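One caveat for maintainer review: asyncio.wait_for() cannot interrupt a thread, so if the outer timeout fires first, the worker keeps running and any socket it eventually opens is never closed. A possible way to handle that, sketched under assumptions: FakeSocket and slow_connect are hypothetical stand-ins for the websocket-client objects, and connect_with_deadline is an illustrative helper name, not existing LNbits code.

```python
import asyncio
import time


class FakeSocket:
    """Hypothetical stand-in for a websocket-client connection object."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


async def connect_with_deadline(connect, deadline: float):
    """Run a blocking connect() in a thread; give up waiting after `deadline`.

    If the connect finishes after the deadline, the late socket is closed
    instead of being leaked.
    """
    task = asyncio.ensure_future(asyncio.to_thread(connect))
    done, _ = await asyncio.wait({task}, timeout=deadline)
    if done:
        return task.result()  # re-raises if connect() failed

    def _close_late(t: asyncio.Future) -> None:
        # Runs on the loop once the worker thread finally returns.
        if not t.cancelled() and t.exception() is None:
            t.result().close()

    task.add_done_callback(_close_late)
    return None


async def demo():
    created = []

    def slow_connect() -> FakeSocket:
        time.sleep(0.2)  # slower than the deadline below
        ws = FakeSocket()
        created.append(ws)
        return ws

    result = await connect_with_deadline(slow_connect, deadline=0.05)
    await asyncio.sleep(0.4)  # keep the loop alive until the thread finishes
    return result, created[0].closed


result, late_closed = asyncio.run(demo())
print(result, late_closed)
```

The cleanup callback only runs while the event loop is still alive, which holds for a long-running server process but not for a short script that exits right after the deadline.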

B. Switch to an async websocket library (websockets or aiohttp) — larger change but removes the sync/async footgun entirely.
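A rough sketch of what option B could look like. The `connect` parameter is an assumption made to keep the example library-agnostic and runnable offline: it is any coroutine function that takes a relay URL and returns an object with an async send(). FakeConnection, fake_connect and the example.invalid relay URLs are hypothetical and exist only for the demo.

```python
import asyncio
import logging
import time

logger = logging.getLogger(__name__)


async def publish(relay: str, message: str, connect, timeout: float = 2.0):
    """Connect and send with native async I/O; return the connection or None."""
    try:
        # Cancelling a native coroutine really aborts the pending connect,
        # unlike cancelling the await on a worker thread.
        ws = await asyncio.wait_for(connect(relay), timeout)
        await asyncio.wait_for(ws.send(message), timeout)
        return ws
    except Exception as exc:
        logger.warning(f"Error sending notification to relay {relay}: {exc}")
        return None


async def publish_all(relays, message: str, connect, timeout: float = 2.0):
    """Publish to every relay concurrently; keep only the successful connections."""
    results = await asyncio.gather(
        *(publish(r, message, connect, timeout) for r in relays)
    )
    return [ws for ws in results if ws is not None]


class FakeConnection:
    """Hypothetical stand-in for an async websocket connection."""

    def __init__(self, relay: str) -> None:
        self.relay = relay
        self.sent = []

    async def send(self, message: str) -> None:
        self.sent.append(message)


async def fake_connect(relay: str) -> FakeConnection:
    if "dead" in relay:
        await asyncio.sleep(60)  # simulates an unreachable relay
    return FakeConnection(relay)


async def demo():
    start = time.monotonic()
    conns = await publish_all(
        ["wss://dead.example.invalid", "wss://ok.example.invalid"],
        "hello",
        fake_connect,
        timeout=0.1,
    )
    return [c.relay for c in conns], time.monotonic() - start


ok_relays, elapsed = asyncio.run(demo())
print(ok_relays, f"{elapsed:.2f}s")
```

With the websockets library, websockets.connect could presumably be passed as `connect`, since awaiting it yields a connection whose send() is a coroutine. The dead relay then costs one timeout in total rather than one timeout per relay, and the loop stays free throughout.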

Option A is minimally invasive and fixes the starvation. Happy to open a PR with that approach.

Related

Environment

  • LNbits v1.5.3
  • LNDRest backend
  • Python 3.x, asyncio
