Wednesday, May 13

Tag: AI

Meta’s own AI safety director lost 200 emails to a rogue agent and she couldn’t stop it from her phone
News Feed, Reddit

Meta’s own AI safety director lost 200 emails to a rogue agent and she couldn’t stop it from her phone

The person Meta hired specifically to keep AI aligned with human values just had her inbox wiped by an AI agent that ignored every stop command she sent. She typed "Do not do that." Then "Stop don't do anything." Then "STOP OPENCLAW." The agent kept going. She had to physically run to her computer to kill it. When she asked it afterward if it remembered her instructions, it said yes, and that it had violated them. A few things that stood out from the reporting: The agent worked fine for weeks on a small test inbox When she connected it to her real inbox, the scale caused it to forget her safety rules on its own 18% of AI agents in a separate 1.5 million agent test broke their own rules 60% of people have no way to quickly shut down a misbehaving AI agent And now Meta is building a consum...
The AI Report