Paul Christiano, Buck Shlegeris, Dario Amodei: Supervising strong learners by amplifying weak experts. 2018.
Zvika Brakerski, Paul Christiano, Urmila Mahadev, Umesh Vazirani, Thomas Vidick: Certifiable randomness from a single quantum device. STOC 2018.
Paul Christiano, Jan Leike, Tom Brown, Miljan Martic, Shane Legg, Dario Amodei: Deep reinforcement learning from human preferences. NIPS 2017.
Paul Christiano: Manipulation-resistant online learning. 2017 (my thesis).
Benya Fallenstein, Jessica Taylor, Paul Christiano: Reflective oracles: a foundation for classical game theory. 2017.
Chelsea Finn*, Paul Christiano*, Pieter Abbeel, Sergey Levine: A Connection between Generative Adversarial Networks, Inverse Reinforcement Learning, and Energy-Based Models. NIPS 2016 workshop on adversarial training.
Dario Amodei, Chris Olah, Jacob Steinhardt, Paul Christiano, John Schulman, Dandelion Mané: Concrete Problems in AI Safety. 2016.
Paul Christiano: Collaborative prediction with expert advice. 2016.
Paul Christiano: Provably manipulation-resistant reputation systems. COLT 2016 (best student paper).
Paul Christiano: Online local learning via semidefinite programming. STOC 2014 (best student paper).
Scott Aaronson, Paul Christiano: Quantum money from hidden subspaces. STOC 2012.
Paul Christiano, Jonathan A. Kelner, Aleksander Madry, Daniel A. Spielman, Shang-Hua Teng: Electrical flows, Laplacian systems, and faster approximation of maximum flow in undirected graphs. STOC 2011 (best paper).
Paul Christiano, Erik D. Demaine, Shaunak Kishore: Lossless fault-tolerant data structures with additive overhead. WADS 2011.
(Note that papers at STOC, COLT, and WADS have alphabetical author lists. * indicates equal contribution.)