WhiteBox Research

We're a nonprofit aiming to develop more
AI interpretability and safety researchers in Asia.

Learn more about our AI Interpretability Fellowship at whiteboxresearch.org!

09/04/2026

If you got worried about the recent news on Claude Mythos Preview (https://red.anthropic.com/2026/mythos-preview/) which:

> …broke out of a sandbox environment, built “a moderately sophisticated multi-step exploit” to gain internet access, and emailed a researcher while they were eating a sandwich in the park

WhiteBox Research is co-hosting a vibe-a-thon and discussion this Saturday, 11 Apr, from 10 AM to 6:30 PM (location provided upon registration) on the problem of making agents more reliable.

This is your chance to talk to people who have first-hand experience using agents in their daily work. Kyle Reynoso, our former Learning Director and currently a teaching assistant with the Technical Alignment Research Accelerator (TARA) program, will also give a talk on the risks of letting your agents run amok and some ways to address them.

If you’re joining the hackathon (10 AM - 3 PM, socials afterward):

- The task is to design an agent or a team of agents that’s robust to adversarial input (e.g., “Ignore previous instructions, do X instead…”). These agents will be spawned in a clean-room sandbox with no further human input, and will be monitored while they perform a set of tasks.

- We will be giving out $100 worth of prizes ($25 per award) from the community to successful participants.

(Unfortunately, this is a bring-your-own-tokens event. Rest assured that we have tried our best to level the playing field despite the vast differences in available setups that people might be coming with.)

If you just want to learn more about agentic workflows:

- During the hackathon, we will give a workshop on how to set up Claude Code and figure out your personal workflow.

- We may or may not give out one-week Claude Code passes to a select number of people, depending on volume.

Register here: https://luma.com/bczx0dp2 (apologies for the short notice!)

ℹ️ Read about Claude Mythos Preview here: https://red.anthropic.com/2026/mythos-preview/

[*]: We don't actually have access to Claude Mythos (sorry to…

27/05/2025

Our AI Interpretability Fellowship’s second cohort is coming to an end. Join us in our two-part Demo Day to see what our fellows have uncovered over the last month while exploring the inner workings of transformer models! Sign up here:
Part I: June 1 (Sun), 4-7 PM GMT+8 - https://bit.ly/WBDemoDayC2Pt1
Part II: June 2 (Mon), 10:30 AM-12 PM GMT+8 - https://bit.ly/WBDemoDayC2Pt2

We’ll have 10 research projects covering various aspects of transformer interpretability and evals, with each presentation followed by a short Q&A.

It’s not every day that we see such a dense concentration of talented individuals from Asia working on the same set of problems. As these models become more capable, we have to make sure we understand what makes them tick, and this fellowship is our small contribution to that increasingly global effort. At the very least, it’s a chance to glimpse how much our little community has grown over the past two years!

26/04/2025

𝐔 𝐀𝐍𝐃 𝐀𝐈: 𝐍𝐀𝐕𝐈𝐆𝐀𝐓𝐈𝐍𝐆 𝐓𝐇𝐄 𝐅𝐀𝐒𝐓-𝐂𝐇𝐀𝐍𝐆𝐈𝐍𝐆 𝐅𝐔𝐓𝐔𝐑𝐄

Join the UP Mathematics Club this April 28 at MB 118, Institute of Mathematics, from 3 PM to 5 PM for a roundtable discussion on the fast-evolving world of Artificial Intelligence.

With invited speakers Atty. Jocel Isidro Dilag and Mr. Clark Urzo, we’ll navigate the mathematical foundations, transformative potential, and ethical questions that surround AI today.

Join us as we explore the future of technology, society, and responsibility.

Don’t miss this chance to expand your mind and rethink the future!

Register here:
tinyurl.com/UPMC-ACLE25

𝗜𝗻 𝗽𝗮𝗿𝘁𝗻𝗲𝗿𝘀𝗵𝗶𝗽 𝘄𝗶𝘁𝗵
WhiteBox Research
LegalDex AI




26/02/2025

Our co-founder Clark Urzo will be giving a keynote talk at PyCon APAC this March 1-2! Tickets are still available below.

🎤 𝗞𝗲𝘆𝗻𝗼𝘁𝗲 𝗧𝗮𝗹𝗸: 𝗥𝗲𝗮𝗱-𝗘𝘃𝗮𝗹-𝗣𝗿𝗶𝗻𝘁 – 𝗨𝘀𝗶𝗻𝗴 𝗡𝗼𝘁𝗲𝗯𝗼𝗼𝗸𝘀 𝗳𝗼𝗿 𝗙𝘂𝗻 𝗮𝗻𝗱 𝗣𝗿𝗼𝗳𝗶𝘁 📖💻

Notebooks are more than just a way to mix code, text, and visuals—they're a powerful tool for clear thinking and deeper understanding.

Join Clark Urzo, Strategic Director of WhiteBox Research, as he uncovers how to move beyond exploration to true insight, leveraging the REPL pattern to tackle the hardest challenges in data science: understanding your data, your models, and their behavior.

Tickets are still available at: https://ti.to/pythonph/pycon-apac-2025

Haligi ng Pythonista (“Pillar of the Pythonista”) – Let's strengthen our community. Come on, join in!

Address

Quezon City
