AI alignment

I want to see a future where AI systems help humanity thrive. I think this will probably happen, but only because I expect the ML community to put in a great deal of work. If we fail at this work, I think we run a real risk of making humanity’s future much worse.

AI alignment is the problem of building machines that faithfully try to do what we want them to do (or what we ought to want them to do). I write about alignment here.

Since January 2017 I have been working on the safety team at OpenAI.

I am on the board of Ought.

I am a research associate at the Future of Humanity Institute.

Many people have found this interview helpful for understanding my perspective on AI alignment.