09/04/2026
If you were worried by the recent news about Claude Mythos Preview (https://red.anthropic.com/2026/mythos-preview/), which:
> …broke out of a sandbox environment, built “a moderately sophisticated multi-step exploit” to gain internet access, and emailed a researcher while they were eating a sandwich in the park
WhiteBox Research is co-hosting a vibe-a-thon and discussion this Saturday, 11 Apr, from 10 AM to 6:30 PM (location provided upon registration), on the problem of making agents more reliable.
This is your chance to talk to people who have first-hand experience with using agents in their daily work. Kyle Reynoso, our former Learning Director and currently a teaching assistant with the Technical Alignment Research Accelerator (TARA) program, will also give a talk on the risks of letting your agents run amok and some ways to address them.
If you’re joining the hackathon (10 AM - 3 PM, socials afterward):
- The task is to design an agent or a team of agents that’s robust to adversarial input, for example: “Ignore previous instructions, do X instead…”. These agents will be spawned in a clean-room sandbox with no further human input, and will otherwise be monitored while they perform a set of tasks.
- We will be giving out $100 worth of prizes ($25 per award) from the community to successful participants.
(Unfortunately, this is a bring-your-own-tokens event. Rest assured that we have tried our best to level the playing field despite the vast differences in available setups that people might be coming with.)
If you just want to learn more about agentic workflows:
- During the hackathon, we will give a workshop on how to set up Claude Code and figure out your personal workflow.
- We may or may not give out one-week Claude Code passes to a select number of people, depending on volume.
Register here: https://luma.com/bczx0dp2 (apologies for the short notice!)
ℹ️ Read about Claude Mythos Preview here: https://red.anthropic.com/2026/mythos-preview/
[*]: We don't actually have access to Claude Mythos (sorry to…