Put Your Agent in Jail
In Which the Author Worries About Coding Agents Growing Claws
Coding agents love to execute shell commands, through what usually gets termed the "Bash" tool.
(I'll avoid the pedantry of pointing out that the shell used often is not actually bash...)
Coding agents are happy to do everything from running ls to writing a custom Python script and running it with python3. The Bash tool is the ultimate cheat code for an agent, letting it do much of what a developer can do on her own machine.
That is a double-edged sword. What the agent can do with Bash for you could be rivaled or surpassed by what the agent can do against you.
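To make that concrete, here is a minimal sketch of what a "Bash" tool can boil down to. The function name run_bash_tool and its timeout parameter are hypothetical, not taken from any particular agent. Note that nothing in it limits what the command does, and that shell=True uses /bin/sh on POSIX systems, which is often not bash.

```python
import subprocess


def run_bash_tool(command: str, timeout: float = 30.0) -> str:
    """Hypothetical 'Bash' tool: run whatever command string the model supplies."""
    # shell=True hands the string to the system shell (/bin/sh on POSIX),
    # with the full permissions of the user who launched the agent.
    result = subprocess.run(
        command,
        shell=True,
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    # The agent gets back stdout and stderr, much as a developer would
    # see them in a terminal.
    return result.stdout + result.stderr


print(run_bash_tool("echo hello from the agent"))
```

The same call that runs echo would run rm just as readily, which is the whole problem.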
There have been plenty of horror stories about what OpenClaw has done to unwitting users. However, there are similar stories about what coding agents and related tools have done. That story about the husband who let an LLM accidentally delete his wife's photos? That was Claude Cowork, part of the Claude Code desktop tool. That hard drive that got wiped? That was Google's Antigravity.
And just as OpenClaw can be tricked by malicious skills, so can a coding agent, a risk that grows as the skills marketplaces explode. OpenClaw may be worse than common coding agents when it comes to security, but that does not mean that coding agents are safe.
My long-term ambition is to minimize how much the agents I use rely on the Bash tool. My short-term objective was to minimize how much a rogue agent could screw stuff up using Bash (or other write-access tools, which could overwrite or delete the wrong files).
Right now, I am using ai-jail for this. It is available for Linux and macOS, and helps to sandbox the coding agent. While not as strong as a separate VM, the sandbox does help reduce agent access:
- Reduces what is visible in the user's home directory
- Makes lots of system files read-only, if they were not read-only already
- Gives the sandbox a separate temp directory
With --lockdown mode, everything is read-only, including the project directory.
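The restrictions above can be pictured as a simple path policy. The sketch below is a toy model for illustration only, not how ai-jail is implemented or configured: SandboxPolicy, access_for, the example paths, and the temp directory location are all hypothetical. It also assumes that home entries left visible are read-only and that lockdown applies to the temp directory too, neither of which I have verified.

```python
import posixpath
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass
class SandboxPolicy:
    """Hypothetical model of the restrictions listed above."""
    project_dir: str
    home_dir: str
    visible_in_home: tuple = ()          # home entries the agent may still see
    temp_dir: str = "/tmp/sandbox-tmp"   # assumed location of the separate temp dir
    lockdown: bool = False


def _is_within(path: str, root: str) -> bool:
    # Normalize first so "project/../.ssh" cannot pose as a project path.
    p = PurePosixPath(posixpath.normpath(path))
    r = PurePosixPath(posixpath.normpath(root))
    return p == r or r in p.parents


def access_for(policy: SandboxPolicy, path: str) -> str:
    """Return 'hidden', 'read-only', or 'read-write' for a path."""
    writable = "read-only" if policy.lockdown else "read-write"
    # The project and the sandbox's own temp directory are the writable areas.
    if _is_within(path, policy.project_dir) or _is_within(path, policy.temp_dir):
        return writable
    # Most of the home directory is simply not visible.
    if _is_within(path, policy.home_dir):
        if any(_is_within(path, entry) for entry in policy.visible_in_home):
            return "read-only"  # assumption of this sketch
        return "hidden"
    # Everything else (system files) is read-only.
    return "read-only"


policy = SandboxPolicy(
    project_dir="/home/dev/project",
    home_dir="/home/dev",
    visible_in_home=("/home/dev/.gitconfig",),
)
for candidate in ("/home/dev/project/src/main.py",
                  "/home/dev/.ssh/id_ed25519",
                  "/usr/bin/python3"):
    print(candidate, "->", access_for(policy, candidate))
```

Even in this toy version, the useful property is visible: a rogue rm -rf ~ has far less to chew on when most of the home directory is not there to begin with.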
This is imperfect, but it is a useful "middle ground" between setting up a full VM and just letting the coding agent run amok.
If you have other suggestions of tools in this area, please leave a comment!