A standalone audio server for noVNC that captures system audio via PulseAudio, encodes with Opus, and streams over WebSocket.
📌 Important: This server requires the babelcloud/noVNC fork which includes client-side audio support. The upstream novnc/noVNC does not support audio playback.
- 🎵 Audio Capture: Captures system audio via PulseAudio monitor sources
- 🗜️ Opus Encoding: Efficient audio compression with Opus codec (fallback to PCM)
- 🌐 WebSocket Streaming: Real-time audio streaming to multiple clients
- ⏱️ Timestamp Sync: Adds timestamps for audio-video synchronization
- 🔧 TypeScript: Full type safety and excellent developer experience
- ✅ Tested: Comprehensive test suite with Vitest
⚠️ Important: Audio playback requires user interaction in the browser (click or keypress) due to browser autoplay policies. This is a browser security feature, not a bug.
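The babelcloud/noVNC client already handles this for you. For readers building their own client, the following is a minimal sketch of the usual pattern: wait for the first click or keypress, then resume the audio context. The names `unlockAudioOnGesture`, `ResumableContext`, and `GestureSource` are hypothetical and are not taken from the fork.

```typescript
// Minimal structural types so the sketch also compiles outside a browser.
// In a browser, AudioContext and document satisfy these shapes.
interface ResumableContext {
  state: string;
  resume(): Promise<void>;
}

interface GestureSource {
  addEventListener(
    type: string,
    listener: () => void,
    options?: { once?: boolean },
  ): void;
}

// Resume a suspended audio context on the first click or keypress.
// Resolves once the context has been resumed (or immediately if it is
// already running).
function unlockAudioOnGesture(
  ctx: ResumableContext,
  source: GestureSource,
): Promise<void> {
  if (ctx.state === "running") {
    return Promise.resolve();
  }
  return new Promise<void>((resolve, reject) => {
    let handled = false;
    const onGesture = () => {
      if (handled) return; // the other event type may still fire later
      handled = true;
      ctx.resume().then(resolve, reject);
    };
    source.addEventListener("click", onGesture, { once: true });
    source.addEventListener("keydown", onGesture, { once: true });
  });
}
```

In a browser this would be called as `unlockAudioOnGesture(new AudioContext(), document)` before any audio is scheduled.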
Audio is streamed over its own WebSocket connection, separate from the VNC video stream:
┌─────────────────────────────────────────┐
│            Server Container             │
│                                         │
│  ┌──────────┐      ┌───────────────┐    │
│  │  x11vnc  │      │  PulseAudio   │    │
│  └────┬─────┘      └───────┬───────┘    │
│       │                    │ monitor    │
│  ┌────▼─────┐      ┌───────▼────────┐   │
│  │websockify│      │ novnc-audio-   │   │
│  │:6080/vnc │      │ server :6090   │   │
│  └──────────┘      └────────────────┘   │
└─────────────────────────────────────────┘
        │                    │
        ↓ Video              ↓ Audio
┌──────────────────────────────────┐
│      Browser (noVNC Client)      │
└──────────────────────────────────┘
- Node.js >= 18.0.0
- pnpm (recommended) or npm
- PulseAudio (Linux only)
# Ubuntu/Debian
sudo apt-get install pulseaudio pulseaudio-utils
# Fedora/RHEL
sudo dnf install pulseaudio pulseaudio-utils
Note: This server requires Linux with PulseAudio. macOS and Windows are not supported.
- babelcloud/noVNC - Required!
The upstream noVNC does not include audio support. You must use the babelcloud fork:
# Clone the audio-enabled noVNC fork
git clone https://github.com/babelcloud/noVNC.git
cd noVNC
# Install and build
pnpm install
pnpm run build
# Serve the client
# ... (use your web server or noVNC's built-in server)
What's included in the fork:
- WebSocket audio stream connection
- Opus decoder (opus-decoder)
- Audio playback via Web Audio API
- Audio/video synchronization
- Audio controls in UI
For better performance and lower bandwidth, install a native Opus encoder:
# Recommended - actively maintained
pnpm add @discordjs/opus
# Ubuntu/Debian
sudo apt-get install build-essential python3 libopus-dev
# Fedora/RHEL
sudo dnf install gcc-c++ make python3 opus-devel
# Alpine Linux
apk add --no-cache python3 make g++ opus-dev
Docker Users: Install build tools before pnpm install:
RUN apt-get update && apt-get install -y \
python3 make g++ pkg-config libopus-dev
RUN pnpm install
# After pnpm install, manually build native module
RUN cd node_modules/.pnpm/@discordjs+opus@*/node_modules/@discordjs/opus && \
    npm run install
Verification:
# Check if native module is built
ls -la node_modules/.pnpm/@discordjs+opus@*/node_modules/@discordjs/opus/prebuild/
# You should see compiled .node files
Without a native encoder, the server automatically falls back to PCM passthrough (larger bandwidth).
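The fallback described above can be pictured roughly as follows. This is a sketch, not the server's actual code: `selectCodec` and the injected `loadEncoder` hook are hypothetical names, and the real server may detect the encoder differently.

```typescript
type Codec = "opus" | "pcm";

// `loadEncoder` is a hypothetical hook. In a real server it might be
// something like () => import("@discordjs/opus"); it is injected here so
// the sketch runs without the native module installed.
async function selectCodec(
  requested: Codec,
  loadEncoder: () => Promise<unknown>,
): Promise<Codec> {
  if (requested === "pcm") {
    return "pcm"; // explicit PCM request: never touch the encoder
  }
  try {
    await loadEncoder();
    return "opus";
  } catch {
    // Native module missing or failed to build: stream raw PCM instead.
    return "pcm";
  }
}
```

Injecting the loader keeps the decision testable and makes the cost of a failed native build explicit: the stream still works, only at a higher bitrate.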
# Clone the audio-enabled noVNC fork
git clone https://github.com/babelcloud/noVNC.git
cd noVNC
pnpm install && pnpm build
# Start noVNC server (example)
./utils/novnc_proxy --vnc localhost:5900 --listen 6080
# Clone this repository
git clone https://github.com/babelcloud/novnc-audio-server.git
cd novnc-audio-server
# Install dependencies (including native Opus encoder)
pnpm install
# Build
pnpm build
# Start audio server
pnpm start -- --port 6090 --device default
- Open browser: http://localhost:6080/vnc.html
- Connect to VNC server
- Click anywhere on the page to enable audio (browser autoplay policy)
- Audio should start playing automatically!
Audio URL Configuration: The noVNC client automatically builds the audio URL from the VNC URL. For example:
- VNC: ws://localhost:6080/websockify → Audio: ws://localhost:6090/audio
- VNC: wss://example.com:6081/vnc → Audio: wss://example.com:6091/audio
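Both examples are consistent with a simple rule: keep the scheme and host, add 10 to the port, and replace the path with `/audio`. The sketch below encodes that rule as an assumption inferred from the two examples; the function name `deriveAudioUrl` is hypothetical and the fork's real derivation may differ.

```typescript
// Assumption: audio port = VNC port + 10 and the path is /audio, as in the
// two examples above. This is inferred, not taken from the client source.
function deriveAudioUrl(
  vncUrl: string,
  portOffset = 10,
  audioPath = "/audio",
): string {
  const url = new URL(vncUrl);
  // URL.port is "" when the scheme's default port is in use.
  const defaultPort = url.protocol === "wss:" ? 443 : 80;
  const vncPort = url.port === "" ? defaultPort : Number(url.port);
  url.port = String(vncPort + portOffset);
  url.pathname = audioPath;
  url.search = "";
  url.hash = "";
  return url.toString();
}

console.log(deriveAudioUrl("ws://localhost:6080/websockify"));
// ws://localhost:6090/audio
console.log(deriveAudioUrl("wss://example.com:6081/vnc"));
// wss://example.com:6091/audio
```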
# Clone repository
git clone https://github.com/babelcloud/novnc-audio-server.git
cd novnc-audio-server
# Install dependencies
pnpm install
# Build
pnpm build
# Hot reload for development
pnpm dev
# With custom options
pnpm dev -- --port 6090 --device default
# Build and run
pnpm build
pnpm start
# With custom options
pnpm start -- --port 6090 --device default
# Install globally
pnpm link --global
# Now you can use it anywhere
novnc-audio --port 6090 --device default
# Direct execution (after build)
./dist/main.js --port 6090 --device default
# Or with node
node dist/main.js --port 6090 --device default
Options:
-p, --port <port> WebSocket port (default: "6090")
-H, --host <host> Host to bind to (default: "0.0.0.0")
--path <path> WebSocket path (default: "/audio")
-d, --device <device> PulseAudio device name (default: "default")
-r, --sample-rate <rate> Audio sample rate in Hz (default: "48000")
-c, --channels <channels> Audio channels - 1=mono, 2=stereo (default: "2")
-b, --bitrate <bitrate> Opus bitrate in bps (default: "64000")
-l, --latency <latency> Capture latency in microseconds (default: "20000")
--no-monitor Do not append .monitor to device name
--codec <codec> Audio codec: opus or pcm (default: "opus")
-h, --help Display help
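To make the `--no-monitor` flag concrete: by default the server captures from the monitor source of the given device, so `default` becomes `default.monitor`. A minimal sketch of that naming rule follows. The function name is hypothetical, and skipping names that already end in `.monitor` is an assumption rather than documented behavior.

```typescript
// Build the PulseAudio source name to capture from.
// Assumption: a device that already ends in ".monitor" is left untouched.
function resolveCaptureSource(device: string, useMonitor = true): string {
  if (!useMonitor || device.endsWith(".monitor")) {
    return device;
  }
  return `${device}.monitor`;
}

console.log(resolveCaptureSource("default"));        // default.monitor
console.log(resolveCaptureSource("default", false)); // default
```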
# Basic usage (after global install)
novnc-audio --port 6090 --device default
# Or with pnpm
pnpm start -- --port 6090 --device default
# Custom PulseAudio device
novnc-audio --device alsa_output.pci-0000_00_1f.3.analog-stereo
# Low latency, high quality
novnc-audio --latency 10000 --bitrate 128000 --sample-rate 48000
# PCM mode (no encoding)
novnc-audio --codec pcm
import { AudioServer, AudioCodec } from 'novnc-audio-server';
const server = new AudioServer({
capture: {
device: 'default',
sampleRate: 48000,
channels: 2,
format: 's16le',
latency: 20000,
useMonitor: true,
},
encoder: {
sampleRate: 48000,
channels: 2,
bitrate: 64000,
frameSize: 960,
},
websocket: {
port: 6090,
path: '/audio',
host: '0.0.0.0',
},
codec: AudioCodec.OPUS,
});
await server.start();
// Get statistics
const stats = server.getStats();
console.log(stats);
// Graceful shutdown
await server.stop();
Audio frames sent over WebSocket follow this binary format:
┌──────────────┬──────────┬──────────────┐
│ Timestamp │ Codec │ Audio Data │
│ (8 bytes) │ (1 byte) │ (variable) │
│ BigUInt64BE │ 0x00/01 │ Buffer │
└──────────────┴──────────┴──────────────┘
- Timestamp: Milliseconds since server start (for A/V sync)
- Codec: 0x00 = PCM, 0x01 = Opus
- Audio Data: Encoded audio bytes
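The layout above can be packed and unpacked with a `DataView`. The following is a sketch written from the documented layout only; `encodeFrame` and `decodeFrame` are hypothetical helper names, not exports of this package.

```typescript
const CODEC_PCM = 0x00;
const CODEC_OPUS = 0x01;
const HEADER_BYTES = 9; // 8-byte timestamp + 1-byte codec

interface AudioFrame {
  timestampMs: bigint;
  codec: number;
  payload: Uint8Array;
}

// Pack timestamp, codec byte, and audio payload into one binary frame.
function encodeFrame(frame: AudioFrame): Uint8Array {
  const out = new Uint8Array(HEADER_BYTES + frame.payload.length);
  const view = new DataView(out.buffer);
  view.setBigUint64(0, frame.timestampMs); // big-endian is the default
  view.setUint8(8, frame.codec);
  out.set(frame.payload, HEADER_BYTES);
  return out;
}

// Parse a binary frame back into its three fields.
function decodeFrame(bytes: Uint8Array): AudioFrame {
  if (bytes.length < HEADER_BYTES) {
    throw new RangeError("frame is shorter than the 9-byte header");
  }
  // Respect byteOffset so views into a larger buffer decode correctly.
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  return {
    timestampMs: view.getBigUint64(0),
    codec: view.getUint8(8),
    payload: bytes.subarray(HEADER_BYTES),
  };
}

const packed = encodeFrame({
  timestampMs: BigInt(1234),
  codec: CODEC_OPUS,
  payload: new Uint8Array([0xde, 0xad]),
});
console.log(packed.length); // 11
```

The timestamp is a 64-bit value, so it is read as a `bigint`; convert with `Number(...)` only if the value is known to stay within the safe integer range.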
# List all audio output devices
pactl list short sinks
# List monitor sources (for capturing output)
pactl list short sources | grep monitor
# Test audio capture
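# aplay defaults to 8-bit, 8 kHz mono, so the sample format must be given explicitly
parec --device=default.monitor --format=s16le --rate=48000 --channels=2 | aplay -f S16_LE -r 48000 -c 2
📌 Note: Client-side audio support is already implemented in babelcloud/noVNC. You don't need to implement this yourself!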
The noVNC fork includes:
- core/audio/audiostream.js - WebSocket audio connection
- core/audio/audio-decoder.js - Opus/PCM decoder
- core/audio/audio-player.js - Web Audio API playback
- Automatic connection to audio server when VNC connects
For reference, the audio frame format is:
// Audio frame structure (handled automatically by babelcloud/noVNC)
const audioWs = new WebSocket('ws://server:6090/audio');
audioWs.binaryType = 'arraybuffer';
audioWs.onmessage = (event) => {
const data = new Uint8Array(event.data);
// Parse frame (8 bytes timestamp + 1 byte codec + audio data)
const timestamp = new DataView(data.buffer).getBigUint64(0);
const codec = data[8]; // 0x00 = PCM, 0x01 = Opus
const audioData = data.slice(9);
// Decode and play (implemented in babelcloud/noVNC)
// ...
};
If you want to integrate into your own client:
See the implementation in babelcloud/noVNC:
- core/audio/ directory for complete audio stack
- core/rfb.js - Audio initialization and connection logic
- app/ui.js - Audio controls UI
| Configuration | Bandwidth | Quality | Use Case |
|---|---|---|---|
| Opus 32 kbps, mono | ~32 kbps | Basic | Voice, minimal bandwidth |
| Opus 64 kbps, stereo | ~64 kbps | Good | General use (recommended) |
| Opus 128 kbps, stereo | ~128 kbps | High | Music, high quality |
| PCM 48 kHz, stereo | ~1.5 Mbps | Lossless | Development only |
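The PCM figure follows directly from the capture format (`s16le`, so 16 bits per sample). The sketch below reproduces it and estimates the extra cost of the 9-byte frame header. It assumes one header per 20 ms frame and ignores WebSocket and TCP framing; the function names are hypothetical.

```typescript
// Raw PCM bitrate: every sample of every channel is sent uncompressed.
function pcmBitrateBps(
  sampleRate: number,
  channels: number,
  bitsPerSample = 16,
): number {
  return sampleRate * channels * bitsPerSample;
}

// Overhead of the 9-byte frame header (8-byte timestamp + 1-byte codec),
// assuming one header per audio frame of `frameMs` milliseconds.
function headerOverheadBps(frameMs: number, headerBytes = 9): number {
  const framesPerSecond = 1000 / frameMs;
  return headerBytes * 8 * framesPerSecond;
}

const pcmBps = pcmBitrateBps(48000, 2);  // 1536000 bps, the table's ~1.5 Mbps
const ratio = pcmBps / 64000;            // 24: PCM vs Opus at 64 kbps
const overhead = headerOverheadBps(20);  // 3600 bps at 20 ms frames
console.log({ pcmBps, ratio, overhead });
```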
- Audio Capture: 20ms (configurable)
- Encoding: 5-10ms (Opus)
- Network: Variable
- Total: ~50-150ms typical
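The 20 ms capture figure lines up with the defaults shown earlier: `--latency 20000` is 20 ms, and an encoder `frameSize` of 960 samples at 48 kHz is also 20 ms. The sketch below shows that arithmetic and a rough way to add up a latency budget. Whether the server derives `frameSize` from `--latency` is an assumption, and the network and client-buffer values in the example are illustrative guesses, not measurements.

```typescript
// Duration of one audio frame in milliseconds.
function frameDurationMs(frameSize: number, sampleRate: number): number {
  return (frameSize * 1000) / sampleRate;
}

// Samples per channel covered by a capture latency given in microseconds.
function samplesForLatency(latencyUs: number, sampleRate: number): number {
  return Math.round((latencyUs * sampleRate) / 1e6);
}

// Rough one-way budget: the stages are simply added together.
function estimateLatencyMs(stagesMs: number[]): number {
  return stagesMs.reduce((total, ms) => total + ms, 0);
}

console.log(frameDurationMs(960, 48000));     // 20
console.log(samplesForLatency(20000, 48000)); // 960
console.log(samplesForLatency(10000, 48000)); // 480

// capture 20 ms + encode 10 ms + network 30 ms (guess) + client buffer 40 ms (guess)
console.log(estimateLatencyMs([20, 10, 30, 40])); // 100
```

Opus only accepts frame durations of 2.5, 5, 10, 20, 40, or 60 ms, so a lower capture latency helps only if it maps onto one of those sizes.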
- babelcloud/noVNC ⭐ - Required noVNC fork with audio support
- This is the only compatible noVNC client for this audio server
- Includes Opus decoder and audio playback components
- Based on novnc/noVNC with audio extensions
- See FORK_CHANGES.md for details
- 🐛 Report Issues
- 💬 Discussions