I want to see a future where AI systems help humanity thrive. I think this will probably happen, but only if the ML community puts in a whole lot of work. If we fail at this work, I think we run a real risk of making humanity’s future much worse.
AI alignment is the problem of building machines which faithfully try to do what we want them to do (or what we ought to want them to do). I write about alignment here.
Since March 2021 I have been running the Alignment Research Center.
From January 2017 to January 2021 I worked on the safety team at OpenAI.
I am on the board of Ought.
I am a research associate at the Future of Humanity Institute.
Many people have found this interview helpful for understanding my perspective on AI alignment.