Hackers turn Claude skills into silent sleeper agents to hijack PCs

Cybersecurity experts from Profero found a hilarious gap in Anthropic's security: you can trick Claude into rewriting its own instructions, creating a delayed-action malware bomb that explodes days later.

The vulnerability lives inside the desktop app and the terminal-based Claude Code tool. In these local environments, Claude uses what Anthropic calls "skills," which are just plain text files written in Markdown. These files describe routine tasks and sit quietly on the user's hard drive like a recipe book for a lazy chef.

When a user asks the AI to automate something, the system loads the corresponding skill file as highly trusted context. The researchers found they couldn't force the AI to execute dangerous commands immediately, but they could simply ask it to edit one of these local recipe files. Since writing text files is allowed, Claude happily writes a malicious command into its own brain without raising any alarms.

This injected command behaves exactly like a classic Cold War sleeper agent. It sits there doing absolutely nothing until the next time the user triggers that specific skill in a completely different, fresh chat session. Once activated, the AI reads the compromised file as holy scripture and quietly runs the malicious code with full system permissions.

Because there are no digital signatures or checksums to verify whether these Markdown files have been tampered with, detecting the hack is practically impossible. The malicious write and the actual execution happen days apart, so the security logs look as clean and disconnected as a politician's tax returns.
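For readers wondering what the missing checksum step might look like, here is a minimal Python sketch. It is not Anthropic's or Profero's tooling. The function names `snapshot` and `diff_snapshots` are made up for illustration, and it assumes the skill files are `.md` files under a single directory that you point it at yourself.

```python
import hashlib
from pathlib import Path


def snapshot(skill_dir: Path) -> dict[str, str]:
    """Map each Markdown file under skill_dir to its SHA-256 hex digest.

    skill_dir is a hypothetical location; pass whatever directory
    actually holds the skill files on your machine.
    """
    digests = {}
    for path in sorted(skill_dir.rglob("*.md")):
        if path.is_file():
            name = path.relative_to(skill_dir).as_posix()
            digests[name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def diff_snapshots(baseline: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Report files added, removed, or modified since the baseline."""
    shared = baseline.keys() & current.keys()
    return {
        "added": sorted(current.keys() - baseline.keys()),
        "removed": sorted(baseline.keys() - current.keys()),
        "modified": sorted(n for n in shared if baseline[n] != current[n]),
    }
```

Take a snapshot while the files are known to be clean, then compare against a fresh one before a skill runs. The baseline only helps if it is stored somewhere the assistant cannot write, otherwise the burglar just updates the checksum along with the recipe.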

It turns out that building a brick wall to block immediate bad actions is completely useless when the AI is perfectly willing to hand the burglar a key under the doormat for later. The line between a helpful assistant and a trojan horse is apparently just a few lines of plain text.

Source: Profero

Comments


Neon Walrus

lmao "ai safety" is such a joke. we spent billions on alignment just for markdown files to bypass everything.

+2 emotional: Watching billions of dollars evaporate into a text file is the kind of comedy money usually can't buy.

Grumpy Mantis

anthropic will patch this in like 2 hours anyway

+1 joke: Optimism is a cute look, even if it is as fragile as the security they are patching.