LLM-powered ghost text for your terminal
I wanted Copilot-style autocomplete in my terminal
Code editors have had inline ghost text for years. You type a few characters, a dim suggestion appears, you hit Tab to accept. It’s become invisible infrastructure. You forget it’s there until it’s gone.
Terminals don’t have this. You get Ctrl+R for substring history search and tab completion for file paths, but nothing that understands what you’re trying to do. I wanted three things: inline autocomplete as I type, a way to describe commands in plain English, and history search by meaning instead of substrings.
So I built ghst. It’s a zsh plugin that adds all three: ghost text appears as you type (accept with Tab or →, word-by-word with Shift+→), Ctrl+G opens a natural language prompt, and Ctrl+R searches history by meaning. It works with any terminal emulator, supports OpenAI and Anthropic, and never auto-executes anything.
Under the hood
Zsh’s line editor (ZLE) is synchronous. Any blocking call freezes your prompt. So ghst splits into two processes: zsh widgets on one side, a Python asyncio daemon on the other, connected by a Unix domain socket.
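The split can be sketched as a toy daemon: a Python asyncio server listening on a Unix domain socket, with the client side standing in for the shell. The line-delimited JSON wire format here is an assumption for illustration, not ghst's actual protocol.

```python
import asyncio
import json
import os
import tempfile

# Hypothetical wire format: one JSON request per line, one JSON response per line.
async def handle(reader, writer):
    while line := await reader.readline():
        req = json.loads(line)
        # A real daemon would call the LLM here; this stub just completes
        # "git ch" the way the post's example does.
        suggestion = "eckout main" if req["buffer"] == "git ch" else ""
        writer.write(json.dumps({"id": req["id"],
                                 "suggestion": suggestion}).encode() + b"\n")
        await writer.drain()
    writer.close()

async def demo():
    path = os.path.join(tempfile.mkdtemp(), "ghst.sock")
    server = await asyncio.start_unix_server(handle, path=path)
    # The shell side does this once (via zsocket) and keeps the connection open.
    reader, writer = await asyncio.open_unix_connection(path)
    writer.write(json.dumps({"id": 1, "buffer": "git ch"}).encode() + b"\n")
    await writer.drain()
    resp = json.loads(await reader.readline())
    writer.close()
    server.close()
    await server.wait_closed()
    return resp["suggestion"]

suggestion = asyncio.run(demo())
print(suggestion)  # eckout main
```

The shell never blocks on the LLM: it sends the buffer and returns control to ZLE, and the daemon replies whenever it's ready.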
The shell connects once via zsocket and reuses the connection. The daemon maintains a connection pool, an LRU response cache, a circuit breaker, and in-flight request cancellation. When you type another character before the previous suggestion arrives, it cancels the pending API call.
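The cancellation piece might look like this minimal sketch, with a fake LLM call standing in for the real API client (class and method names are illustrative, not ghst's actual code):

```python
import asyncio

class SuggestionSession:
    """At most one in-flight completion per session; a newer keystroke
    cancels the older request (illustrative sketch)."""

    def __init__(self):
        self._pending = None

    async def _fake_llm(self, buffer):
        await asyncio.sleep(0.05)          # stand-in for the API round trip
        return f"completed:{buffer}"

    async def suggest(self, buffer):
        if self._pending and not self._pending.done():
            self._pending.cancel()         # the newer keystroke wins
        self._pending = asyncio.ensure_future(self._fake_llm(buffer))
        try:
            return await self._pending
        except asyncio.CancelledError:
            return None                    # superseded; caller shows nothing

async def demo():
    session = SuggestionSession()
    first = asyncio.ensure_future(session.suggest("git c"))
    await asyncio.sleep(0.01)              # user types "h" before the reply
    second = await session.suggest("git ch")
    return await first, second

results = asyncio.run(demo())
print(results)  # (None, 'completed:git ch')
```

The stale request for `git c` returns None instead of a suggestion, so the user only ever sees ghost text for what's currently in the buffer.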
Getting ghost text to render correctly was its own challenge. ZLE’s POSTDISPLAY renders ANSI escapes as literal characters in zsh 5.9 on macOS. The fix was to bypass ZLE entirely and write 256-color escape codes directly to /dev/tty. Colors 16–255 belong to the fixed part of the 256-color palette and, unlike colors 0–15, don’t get remapped by themes, so ghost text looks consistent in iTerm, Ghostty, or the default Terminal.
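The rendering trick, sketched in Python. The color choice and cursor handling are simplified assumptions; a real implementation also has to handle line wrapping and redraws.

```python
GHOST_COLOR = 245  # a mid-gray from the fixed 16-255 range; themes leave it alone

def ghost_escape(text: str, color: int = GHOST_COLOR) -> str:
    """Set a 256-color foreground, emit the suggestion, reset attributes,
    then move the cursor back so the next keystroke lands on top of it."""
    return f"\x1b[38;5;{color}m{text}\x1b[0m\x1b[{len(text)}D"

def render_ghost(text: str) -> None:
    # Bypass ZLE entirely: write to the terminal device, not stdout.
    try:
        with open("/dev/tty", "w") as tty:
            tty.write(ghost_escape(text))
    except OSError:
        pass  # no controlling terminal (e.g. running in a pipeline)

seq = ghost_escape("eckout main")
```

Because `38;5;n` indexes the standard 256-color palette directly, the same escape produces the same gray everywhere.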
Two prompting modes
Autocomplete and NL commands feel like the same feature, but they need different prompting strategies. Autocomplete is a text continuation task: the model sees git ch and continues with eckout main. NL commands are instruction-following: the model sees “list files larger than 10MB” and produces find . -size +10M. ghst uses separate models for each: a fast model like gpt-4o-mini for autocomplete, and a more capable model like gpt-4o for NL commands and history search.
This distinction also solved spacing. When the user types git ch, should the suggestion start with a space? Treating autocomplete as FIM-style (fill-in-the-middle) continuation made the question disappear: the model continues the text naturally, and spacing emerges from the continuation itself.
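The two modes can be sketched as two prompt builders. The system prompts below are illustrative stand-ins, not ghst's actual prompts.

```python
def continuation_prompt(buffer: str) -> list[dict]:
    """Autocomplete as continuation: the model completes the raw buffer
    verbatim, so any needed space is part of the model's own output."""
    return [
        {"role": "system",
         "content": "Continue this zsh command exactly from where it stops. "
                    "Output only the continuation."},
        {"role": "user", "content": buffer},
    ]

def instruction_prompt(request: str) -> list[dict]:
    """NL commands as instruction-following: intent in, full command out."""
    return [
        {"role": "system",
         "content": "Translate the request into a single zsh command. "
                    "Output only the command."},
        {"role": "user", "content": request},
    ]

# "git ch" -> continuation "eckout main"    (no leading space)
# "git"    -> continuation " checkout main" (the space comes from the model)
```

The buffer is passed through untouched in the first mode; whether the suggestion needs a leading space is the model's problem, not the plugin's.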
Safety and privacy
ghst never auto-executes anything. NL commands and history search results are placed in your buffer for review. Press Enter to run, Ctrl+Z to undo. Dangerous commands like rm -rf / and curl | sh are flagged before they reach the LLM.
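That flagging step can be sketched as a small pattern check. These patterns are illustrative; the real list is presumably longer and more careful.

```python
import re

# Illustrative danger patterns (assumption: not ghst's actual list).
DANGEROUS = [
    re.compile(r"\brm\s+-[a-z]*r[a-z]*f[a-z]*\s+/(\s|$)"),  # rm -rf /
    re.compile(r"curl[^|]*\|\s*(ba|z)?sh"),                  # curl ... | sh
    re.compile(r"\bmkfs\b"),                                 # formatting a disk
    re.compile(r">\s*/dev/sd[a-z]\b"),                       # clobbering a device
]

def is_dangerous(cmd: str) -> bool:
    return any(p.search(cmd) for p in DANGEROUS)

print(is_dangerous("rm -rf / "))                 # True
print(is_dangerous("curl get.sh.sh | sh"))       # True
print(is_dangerous("rm -rf ./build"))            # False
```

A cheap regex pass like this runs before anything is sent anywhere, so known-destructive patterns can be flagged without a network round trip.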
All shell history and terminal output is sanitized before being sent to your LLM provider: API keys, tokens, passwords, and credentials are automatically redacted. If you’d rather not store your API key in the config file, set the GHST_API_KEY environment variable instead.
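A simplified version of that sanitization pass might look like this; the redaction rules here are illustrative assumptions, not ghst's actual rules.

```python
import re

# Illustrative redaction rules: key shapes, KEY=value secrets, auth headers.
SECRET_PATTERNS = [
    (re.compile(r"sk-[A-Za-z0-9_-]{20,}"), "[REDACTED_KEY]"),
    (re.compile(r"(?i)\b(password|passwd|token|secret)=\S+"), r"\1=[REDACTED]"),
    (re.compile(r"(?i)authorization:\s*bearer\s+\S+"),
     "Authorization: Bearer [REDACTED]"),
]

def sanitize(text: str) -> str:
    """Scrub history/output before it leaves the machine."""
    for pattern, replacement in SECRET_PATTERNS:
        text = pattern.sub(replacement, text)
    return text

print(sanitize("export TOKEN=abc123"))  # export TOKEN=[REDACTED]
```

Every string leaves the machine only after passing through this function, so a leaked key in scrollback never reaches the provider.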
What’s next
- Streaming responses: each suggestion currently waits for the full LLM response. Streaming would let ghost text appear token-by-token, significantly reducing perceived latency.
- Error correction: when a command fails, show a corrected version as ghost text. Read the last command’s stderr and suggest a fix.
- Bash and Fish support: the daemon is shell-agnostic, but the ZLE widget layer is zsh-specific. Bash’s readline and Fish’s commandline need their own implementations.
- Local model support: Ollama and LM Studio for offline use. The daemon’s LLM client is already provider-agnostic; it just needs new endpoint configuration.
Building in public over a weekend
ghst went from zero to PyPI in a weekend: 52 commits, 3 releases, 2 name changes (aish → shai → ghst, after PyPI rejected shai as too similar to sha1), and roughly 5,000 lines of code. The first release had working autocomplete, NL commands, and history search. The second added directory and git awareness. The third was the UX overhaul that triggered the recursive-edit bug cascade.
The shell is a surprisingly hostile environment for async UI. Every technique that works in a modern rendering framework (diffing, state management, async updates) has to be reinvented with escape codes and file descriptors. But when it works, ghost text in your terminal feels like the kind of tool that becomes invisible infrastructure. You forget it’s there until it’s gone.
Try it
Running ghst init configures your LLM provider, adds shell integration to your .zshrc, and starts the daemon; exec zsh then reloads your shell to activate it. Check out the GitHub repo for configuration options and supported providers, or to open issues.