
Welcome to the llmehelp wiki!

There are interesting terminals called waveterm and warp, basically an i3/tmux merge point with some kind of chat built into the side. Then there's an IDE called windsurf that watches your behavior as you navigate around. The idea here is to extend screen-query into a less clunky version of ai-chat: something that looks at your various tmux panes and uses procfs to act as a pair programmer. The problem with things like aider, plandex, codex, and goose is that they take on too much of a role: instead of being a tool, each tries to be a junior dev, assuming too much responsibility, making bad assumptions, and creating messes.
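To make the "looks at your tmux panes, uses procfs" idea concrete, here is a minimal sketch of that kind of context gathering. This is not screen-query's actual code; the function names are hypothetical, and it only shows the tmux and procfs plumbing the idea implies:

```python
# Hypothetical sketch, not screen-query's actual code: gather each tmux pane's
# visible text plus procfs context (cwd, cmdline) for the pane's shell process.
import subprocess
from pathlib import Path

def tmux(*args: str) -> str:
    """Run a tmux command and return its stdout."""
    return subprocess.run(
        ["tmux", *args], capture_output=True, text=True, check=True
    ).stdout

def pane_contexts():
    """Yield (pane_id, cwd, cmdline, visible_text) for every pane on the server."""
    for line in tmux("list-panes", "-a", "-F", "#{pane_id} #{pane_pid}").splitlines():
        pane_id, pid = line.split()
        proc = Path("/proc") / pid
        cwd = (proc / "cwd").resolve()    # working directory via procfs
        cmdline = (proc / "cmdline").read_bytes().replace(b"\0", b" ").decode().strip()
        text = tmux("capture-pane", "-p", "-t", pane_id)  # what's visible in the pane
        yield pane_id, cwd, cmdline, text
```

The point of the sketch: the context is already sitting in tmux and /proc; the tool only has to read it, not drive.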

I think I synthesized this in an HN comment on 2025-04-19:

I found that if I'm unfamiliar with the knowledge domain I'm mostly using AI, but then as I dive in, the ratio of AI to human shifts to the point where AI is at 0 and it's all human. Basically AI wins at day 1 but isn't any better at day 50. If this can change, then it's the next step.

You can see this in the featured demos. They are all "starting from 0" points of unblocking yourself. This is great: you won't do anything until you convince yourself it's easy. But the core objective of screen-query at this point, and the thing I'm excited about, is moving beyond that.

As human knowledge (and opinion) progresses, AI needs to play a "co" role: co-expert, co-craftsman, co-reference. Take the streamdown project I mentioned before. As time progressed and I discovered more subtle bugs, I found that they had been inserted by the vibe-coding I did with AI early on to get myself started. The percentage of the code that is AI went from maybe 90% on the first commit to perhaps 10% now, and it's slowly, asymptotically decreasing, because the AI can still only make what I call Day-0 or Day-1 contributions to a project I've reached Day-30 on.

The fidelity of this nuance has to be crisp without introducing noise, and the context from the HCI will get us there. That's where screen-query is going next.

It's not just about prompting a single LLM. Take teapot, for example: there are many models that excel at certain modalities of interaction. The key is to slot these multiple agents into a proper stack, using things such as file contents, commit histories, documentation, tests, and user interaction, to get past Day-0/1 contributions and stop producing coding slop once quality starts mattering. agno attempted to address this in March 2025 through its roles abstraction, and it's probably the right 1.0.
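As a sketch of what that slotting could look like, here is an illustrative role stack. The roles, model names, and keyword routing are assumptions made for illustration, not agno's or teapot's actual API:

```python
# Illustrative only: the roles, model names, and keyword routing below are
# assumptions for this sketch, not agno's or teapot's actual API.
from dataclasses import dataclass

@dataclass
class Role:
    name: str
    model: str           # the model that excels at this modality
    sources: list[str]   # which context streams this role is slotted to see

ROLES = [
    Role("reviewer",  "model-a", ["file contents", "commit history"]),
    Role("explainer", "model-b", ["documentation", "tests"]),
    Role("navigator", "model-c", ["user interaction", "file contents"]),
]

def route(task: str) -> Role:
    """Naive keyword dispatch; a real stack would classify the task properly."""
    lowered = task.lower()
    if "review" in lowered or "bug" in lowered:
        return ROLES[0]
    if lowered.startswith(("how", "why")):
        return ROLES[1]
    return ROLES[2]

def build_prompt(role: Role, task: str, context: dict[str, str]) -> str:
    """Assemble only the context streams this role is slotted to receive."""
    parts = [f"[{source}]\n{context.get(source, '')}" for source in role.sources]
    return f"You are the {role.name}.\n" + "\n\n".join(parts) + f"\n\nTask: {task}"
```

The design choice that matters here is that each role sees a different slice of the context (files, history, docs, tests, interaction), which is what lets the stack move past Day-0/1 answers.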

There's a famous 2019 essay called The Bitter Lesson about how the sophistication of models and the complexity of solutions are simply getting outclassed by naive "more compute" approaches. This is not that. These problems require problem contextualization and more nuanced presumptions; more compute with more clever AI can't get you there. This is wildly observable in humans: there is no human who can just sit down at a company on day 1 of a complex project and be productive, making meaningful contributions. Maximizing the utility of AI requires it to play a codependent, fluid, dynamic role with the human throughout the creative process.

All code is institutional, all institutions are collective behavior, and all behaviors are learned and poorly documented. Therein lies the next step-change.

~chris 2025-04-20
