Hackable HCI tools to fix all those AI bugs
An AI-first, privacy-respecting, integrated diagnostic and development suite built the Unix way: small tools that can be orchestrated together.
- Tmux Talkers: A sidebar chat in tmux
- Shell Snoopers: Tiny tools for shining up your shell
- Xorg Xtractors: LLM interception and injection in your Xorg
screen-query, sq-picker and sq-add
Simple installer:
curl day50.dev/talker | sh
An LLM intervention in your little terminal with support for adding screenshots, command output, cycling pane focus, toggling pane capture on and off, adding external context, and more, all sitting agnostically on top of tmux so there's no substantive workflow change needed. You can just beckon your trusty friend at your leisure.
demo.webm
You should also use sq-add, which can pipe anything into the context. Here's an example:
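A sketch of the idea. The real sq-add ships with screen-query; the stub below only mimics its append-to-context behavior so the snippet is self-contained:

```shell
# Stand-in for the real sq-add so this sketch runs anywhere:
# the actual tool appends stdin to the active chat's context.
sq-add() { cat >> /tmp/sq-context.txt; }

# Pipe any command output into the context:
uname -a | sq-add
echo "note: the build fails on arm64" | sq-add
```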
Once you're in, there are a few slash commands; use /help to get the current list. Multiline is available with a trailing backslash \, just like it is at the shell.
Here are some screenshots of how it works seamlessly with Streamdown's built-in Savebrace feature and how it helps the workflow.
Also, you don't need tmux! Often you'll be doing things, then realize you want the talk party, and you're not in tmux. That's fine! If you use Streamdown, sq-picker works like it does inside tmux. You can also sq-add by id. It's not great, but you're not locked in. That's the point!
shell-hook, shellwrap and wtf
A Zsh shell hook that intercepts user input before execution. It constructs a detailed prompt including system information and the user's input, sends this to an LLM, and replaces the user's input with the LLM's response.
ffmpeg, ssh port forwarding, openssl certificate checking, jq stuff ... this one is indispensable! Highly recommended!
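A minimal sketch of the prompt-construction step described above. The function and field names here are illustrative, not the hook's actual internals:

```shell
# Illustrative only: assemble system information plus the user's request
# into one prompt, the way the hook does before calling the LLM.
build_prompt() {
  printf 'OS: %s\nShell: %s\nRequest: %s\nAnswer with a single shell command.\n' \
    "$(uname -sr)" "${SHELL:-/bin/sh}" "$1"
}

build_prompt "forward local port 8080 to remote port 80 over ssh"
```

The LLM's reply then replaces the line at the prompt, so you review it before pressing enter.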
shellwrap is a new concept, generally speaking. It shepherds your input and output as a true wrapper and logs both sides of the conversation to files. Then, when you invoke the LLM, it pre-empts any existing interaction, kind of like the ssh shell escape. That's what the reversed-triangle input in the video is. It's invoked with a keyboard shortcut, currently ctrl+x.

Then you type your command and press enter. That command, plus the context of your previous input and output, is sent off to the LLM, and its response is wired up to the application's stdin.
So for instance,
- Inside the zsh shell it gives shell commands.
- Inside a full-screen program, in this case vim: the session is pre-empted with a keystroke and you just start typing. The LLM infers it's vim, knows what mode it's in from the previous keystrokes, and correctly exits.
- An interactive Python session is opened. The LLM uses the context to infer that and responds appropriately.
This works seamlessly over ssh boundaries, in visual applications, at REPLs --- anywhere.
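The core mechanism, a pty wrapper that logs the session, can be sketched with util-linux's `script`. This is only the logging half of what shellwrap does; the LLM hotkey and stdin injection sit on top of it:

```shell
# The "true wrapper" idea at its simplest: `script` runs a child under a
# pseudo-terminal and records the session to a file, while the child still
# behaves as if attached to a real terminal.
script -q -c 'echo hello from the wrapped side' /tmp/wrap.log

# /tmp/wrap.log now holds everything the child printed, ready to be
# fed back as context when the LLM is invoked.
```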
shellwrap1.webm
A tool designed to read a directory of files, describe their content, categorize their purposes, and answer basic questions. Example!
(Notice how it has no idea what shellwrap does. Told you it was new! ;-) )
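A rough sketch of the directory-reading step (function name and layout are hypothetical): collect the head of each file into one blob that can become the body of an LLM prompt.

```shell
# Illustrative only: gather a labeled snippet of every regular file in a
# directory so the LLM has something to describe and categorize.
gather_snippets() {
  for f in "$1"/*; do
    [ -f "$f" ] || continue
    printf '## %s\n' "${f##*/}"   # label each snippet with its filename
    head -c 400 "$f"              # cap each file's contribution
    printf '\n\n'
  done
}

# Usage idea (llm here stands for any CLI LLM client):
# gather_snippets ./src | llm "describe and categorize these files"
```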
kb-capture.py and llm-magic
kb-capture.py captures keyboard events from an X server and converts them into a string; it exits and prints the captured string when a semicolon (;) or colon (:) is pressed. llm-magic is a shell script that uses kb-capture.py to capture keyboard input, sends it to an LLM for processing, displays the LLM's response using dzen2, and then types it out using xdotool.
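The capture loop at the heart of this can be sketched in a few lines. The real kb-capture.py reads X key events; here, for illustration, the characters arrive on stdin instead:

```shell
# Accumulate characters until a terminator (';' or ':') arrives, then
# print what was captured -- the string that gets sent to the LLM.
capture() {
  buf=""
  while IFS= read -r -n 1 ch; do
    case "$ch" in ';'|':') break ;; esac
    buf="$buf$ch"
  done
  printf '%s' "$buf"
}

printf 'open the logs;and more' | capture   # prints: open the logs
```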
Their powers combined give you LLM prompting in any application. Here the user is:
- ssh'ing to a remote machine
- using a classic text editor (scite)
- using classic vim
I do a keystroke to invoke llm-magic, type my request, then ;, and it replaces my query with the response. Totally magic. Just like it says.
Thanks for stopping by!