Reinforcement learning is a promising technique for training autonomous systems that perform complex tasks in the real world. However, training reinforcement learning agents is a tedious, human-in-the-loop process that requires heavy engineering and often yields suboptimal agents. In this talk we explore two main directions toward scalable reinforcement learning. First, we discuss several methods for zero-shot sim2real transfer for mobile and aerial navigation, including visual navigation and fully autonomous navigation on a severely resource-constrained nano UAV. Second, we observe that the interaction between the human engineer and the agent under training is itself a decision-making process performed by the human, and consequently automate the training by learning a decision-making policy. With that insight, we focus on zero-shot generalization and discuss learning RL loss functions and a compositional task curriculum that generalize to unseen tasks of evolving complexity. We show that, across different applications, learning-to-learn methods improve the generalization and performance of reinforcement learning agents, and raise questions about nurture vs. nature in training autonomous systems.
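To make the "learning the training process" idea concrete, the following is a minimal, hypothetical sketch in which an outer loop searches over the parameters of the inner-loop RL loss, standing in for decisions a human engineer would otherwise make by hand. The bandit task, the two-term loss parameterization (policy-gradient weight plus entropy weight), and the random-search outer loop are all illustrative assumptions for this sketch, not the specific methods presented in the talk.

```python
# Sketch: outer loop "learns" the loss used by an inner RL training loop.
# All task and search details below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def inner_train(loss_params, n_steps=200, lr=0.1):
    """Train a softmax policy on a 3-armed bandit using a parametric loss."""
    pg_weight, entropy_weight = loss_params
    true_means = np.array([0.2, 0.5, 0.8])   # expected reward of each arm
    logits = np.zeros(3)                     # policy parameters
    for _ in range(n_steps):
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        arm = rng.choice(3, p=probs)
        reward = true_means[arm] + rng.normal(scale=0.1)
        # Gradient of the policy-gradient term: -reward * d log pi(arm) / d logits
        grad_pg = -reward * (np.eye(3)[arm] - probs)
        # Gradient of the negative-entropy term (encourages exploration when minimized)
        logp = np.log(probs + 1e-8)
        grad_neg_entropy = probs * (logp - np.dot(probs, logp))
        logits -= lr * (pg_weight * grad_pg + entropy_weight * grad_neg_entropy)
    # Evaluation: expected reward of the greedy arm under the trained policy
    return true_means[np.argmax(logits)]

def outer_search(n_candidates=30):
    """Outer loop: keep the loss parameters whose trained agent scores best."""
    best_params, best_score = None, -np.inf
    for _ in range(n_candidates):
        candidate = rng.uniform(0.0, 1.0, size=2)   # (pg_weight, entropy_weight)
        score = np.mean([inner_train(candidate) for _ in range(3)])
        if score > best_score:
            best_params, best_score = candidate, score
    return best_params, best_score

if __name__ == "__main__":
    params, score = outer_search()
    print(f"learned loss parameters: {params}, evaluation score: {score:.3f}")
```

The point of the sketch is the structure, not the specifics: the inner loop is an ordinary RL update, while the outer loop treats choices about that update (here, loss weights) as something to be optimized against post-training performance rather than hand-tuned.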
Aleksandra Faust is a Senior Staff Research Scientist and Reinforcement Learning research team co-founder at Google Brain Research.