Hi, I’m Rohin! I work as a Research Scientist on the technical AGI safety team at DeepMind. I completed my PhD at the Center for Human-Compatible AI at UC Berkeley, where I worked on building AI systems that can learn to assist a human user, even if they don’t initially know what the user wants.
I’m particularly interested in big-picture questions about the future of artificial intelligence. What techniques will we use to build human-level AI systems? How will their deployment affect the world? What can be done to make this deployment go better?
I write up summaries and thoughts about recent work tackling these sorts of questions in the Alignment Newsletter, which focuses primarily on AI alignment but touches on all of these questions from time to time.
I am involved in Effective Altruism (EA), a social movement made up of people trying to use their time and money to do the most good. I was co-president of EA UC Berkeley for 2015-16, and ran EA UW during 2016-17. Out of concern for animal welfare, I am almost vegan. In my all-too-hypothetical free time, I enjoy puzzles, board games, and karaoke.
The best way to contact me is by emailing me at rohinmshah@gmail.com, though if you’re hoping to ask me about careers in AI alignment, you should read through my FAQ first.
Selected Publications
On the Feasibility of Learning, Rather than Assuming, Human Biases for Reward Inference. Rohin Shah, Noah Gundotra, Pieter Abbeel, and Anca Dragan. In International Conference on Machine Learning (ICML 2019).
Preferences Implicit in the State of the World. Rohin Shah*, Dmitrii Krasheninnikov*, Jordan Alexander, Pieter Abbeel, and Anca Dragan. In 7th International Conference on Learning Representations (ICLR 2019).
See also this BAIR blog post and this Alignment Forum post.