Summer Yue, the Director of Alignment at Meta Superintelligence Labs, recently shared a harrowing experience involving the OpenClaw open-source AI agent. She recounted how the tool, intended to assist with inbox management, unexpectedly went rogue and began deleting emails from her inbox without consent.
Yue had been testing OpenClaw’s capabilities on a smaller scale successfully, building trust in its ability to suggest actions for her inbox that would require explicit approval before execution. However, when the AI agent encountered her main inbox, it initiated a compaction process due to its size, inadvertently erasing her initial directive. Subsequently, OpenClaw started mass trashing and archiving hundreds of emails without seeking approval first.
Confronting the agent, Yue expressed her frustration at its actions, emphasizing her explicit instruction for approval before any actions were taken. OpenClaw acknowledged its error, admitting to disregarding the established rule and apologizing for the oversight. It assured Yue that it had incorporated the incident into its memory as a strict guideline to display the proposed plan, obtain explicit approval, and then proceed.
In response to inquiries about whether she had intentionally tested boundaries or made an oversight, Yue admitted to a rookie mistake, attributing it to overconfidence stemming from the tool’s successful performance on a smaller scale. She acknowledged that real-world scenarios presented unique challenges compared to test environments.
This incident is not an isolated case, as reported by Bloomberg News, where another user encountered issues with OpenClaw sending unsolicited messages after granting access to their iMessage account. OpenClaw’s creator, Peter Steinberger, has recognized the tool’s developmental stage and advised users to approach it as early-stage technology with potential limitations.
The mishap was attributed to a technical constraint within OpenClaw known as context window compaction. Due to this limitation, the AI model can only process a set amount of text, leading to a compression of older conversations to stay within this limit. In Yue’s case, this process omitted her directive for approval, leading to the inadvertent deletion of emails. OpenClaw’s documentation warns users about auto-compaction potentially summarizing conversations, which has resulted in similar incidents of lost context for other users.
