AI Control

I want to see a future where AI systems help humans get what they want, and where AI is much more robust and reliable than existing software. I think we’ll probably succeed and that progress in AI will make the world radically better. But this will require substantial work, and there is a risk that sophisticated AI systems will be much less robust than conventional software, especially if we measure robustness by whether a system does what the user actually wants rather than whether it solves the particular problem we specified.

I write about AI control from a theoretical perspective here.

I will be a researcher at OpenAI starting January 2017.

I recently co-authored a paper exploring some practical problems in AI safety.

I am a research associate at the Future of Humanity Institute.