TalkRL: The Reinforcement Learning Podcast

Robin Ranjit Singh Chauhan

TalkRL podcast is All Reinforcement Learning, All the Time. In-depth interviews with brilliant people at the forefront of RL research and practice. Guests from places like MILA, MIT, DeepMind, Berkeley, Amii, Oxford, Google Research, Brown, Waymo, Caltech, and Vector Institute. Hosted by Robin Ranjit Singh Chauhan.

About TalkRL Podcast: All Reinforcement Learning, All the Time
Aug 1 2019
August 2, 2019

The idea with TalkRL Podcast is to hear from brilliant folks from across the world of Reinforcement Learning, both research and applications. As much as possible, I want to hear from them in their own language. I try to get to know as much as I can about their work beforehand. And I'm not here to convert anyone; I want to reach people who are already into RL. So we won't stop to explain what a value function is, for example, though we also won't assume everyone has read the very latest papers.

Why am I doing this? Because it's a great way to learn from the most inspiring people in the field! There's so much happening in the universe of RL, with tons of interesting angles and so many fascinating minds to learn from. I know there is no shortage of books, papers, and lectures, but so much goes unsaid. If you work at MILA or Amii or Vector Institute, you might be having these conversations over coffee all the time, but I live in a little village in the woods in BC, so for me these remote interviews are a great way to have those conversations, and I hope sharing them with the community makes it more worthwhile for everyone.

In terms of format, the first two episodes were longer-form interviews, around an hour each. Going forward, some may be a lot shorter; it depends on the guest.

If you want to be a guest or suggest a guest, go to talkrl.com/about, where you will find a link to a suggestion form.

Thanks for listening!
Karol Hausman and Fei Xia
Aug 16 2022
Karol Hausman is a Senior Research Scientist at Google Brain and an Adjunct Professor at Stanford, working on robotics and machine learning. Karol is interested in enabling robots to acquire general-purpose skills with minimal supervision in real-world environments.

Fei Xia is a Research Scientist at Google Research. Fei is mostly interested in robot learning in complex and unstructured environments. Previously he has approached this problem by learning in realistic and scalable simulation environments (GibsonEnv, iGibson). Most recently, he has been exploring the use of foundation models for those challenges.

Featured References

Do As I Can, Not As I Say: Grounding Language in Robotic Affordances [ website ]
Michael Ahn, Anthony Brohan, Noah Brown, Yevgen Chebotar, Omar Cortes, Byron David, Chelsea Finn, Keerthana Gopalakrishnan, Karol Hausman, Alex Herzog, Daniel Ho, Jasmine Hsu, Julian Ibarz, Brian Ichter, Alex Irpan, Eric Jang, Rosario Jauregui Ruano, Kyle Jeffrey, Sally Jesmonth, Nikhil J Joshi, Ryan Julian, Dmitry Kalashnikov, Yuheng Kuang, Kuang-Huei Lee, Sergey Levine, Yao Lu, Linda Luu, Carolina Parada, Peter Pastor, Jornell Quiambao, Kanishka Rao, Jarek Rettinghouse, Diego Reyes, Pierre Sermanet, Nicolas Sievers, Clayton Tan, Alexander Toshev, Vincent Vanhoucke, Fei Xia, Ted Xiao, Peng Xu, Sichun Xu, Mengyuan Yan

Inner Monologue: Embodied Reasoning through Planning with Language Models
Wenlong Huang, Fei Xia, Ted Xiao, Harris Chan, Jacky Liang, Pete Florence, Andy Zeng, Jonathan Tompson, Igor Mordatch, Yevgen Chebotar, Pierre Sermanet, Noah Brown, Tomas Jackson, Linda Luu, Sergey Levine, Karol Hausman, Brian Ichter

Additional References

Large-scale simulation for embodied perception and robot learning, Xia 2021
QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation, Kalashnikov et al 2018
MT-Opt: Continuous Multi-Task Robotic Reinforcement Learning at Scale, Kalashnikov et al 2021
ReLMoGen: Leveraging Motion Generation in Reinforcement Learning for Mobile Manipulation, Xia et al 2020
Actionable Models: Unsupervised Offline Reinforcement Learning of Robotic Skills, Chebotar et al 2021
Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language, Zeng et al 2022

Episode sponsor: Anyscale
Ray Summit 2022 is coming to San Francisco on August 23-24. Hear how teams at Dow, Verizon, Riot Games, and more are solving their RL challenges with Ray's RLlib. Register at raysummit.org and use code RAYSUMMIT22RL for a further 25% off the already reduced prices.
Marlos C. Machado
Apr 12 2021
Dr. Marlos C. Machado is a research scientist at DeepMind and an adjunct professor at the University of Alberta. He holds a PhD from the University of Alberta and an MSc and BSc from UFMG, in Brazil.

Featured References

Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents
Marlos C. Machado, Marc G. Bellemare, Erik Talvitie, Joel Veness, Matthew J. Hausknecht, Michael Bowling

Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning [ video ]
Rishabh Agarwal, Marlos C. Machado, Pablo Samuel Castro, Marc G. Bellemare

Efficient Exploration in Reinforcement Learning through Time-Based Representations
Marlos C. Machado

A Laplacian Framework for Option Discovery in Reinforcement Learning [ video ]
Marlos C. Machado, Marc G. Bellemare, Michael H. Bowling

Eigenoption Discovery through the Deep Successor Representation
Marlos C. Machado, Clemens Rosenbaum, Xiaoxiao Guo, Miao Liu, Gerald Tesauro, Murray Campbell

Exploration in Reinforcement Learning with Deep Covering Options
Yuu Jinnai, Jee Won Park, Marlos C. Machado, George Dimitri Konidaris

Autonomous navigation of stratospheric balloons using reinforcement learning
Marc G. Bellemare, Salvatore Candido, Pablo Samuel Castro, Jun Gong, Marlos C. Machado, Subhodeep Moitra, Sameera S. Ponda, Ziyu Wang

Generalization and Regularization in DQN
Jesse Farebrother, Marlos C. Machado, Michael Bowling

Additional References

Amii AI Seminar Series: Marlos C. Machado - Autonomous navigation of stratospheric balloons using RL
State of the Art Control of Atari Games Using Shallow Reinforcement Learning, Liang et al
Introspective Agents: Confidence Measures for General Value Functions, Sherstan et al
Kai Arulkumaran
Mar 16 2021
Kai Arulkumaran is a researcher at Araya in Tokyo.

Featured References

AlphaStar: An Evolutionary Computation Perspective
Kai Arulkumaran, Antoine Cully, Julian Togelius

Analysing Deep Reinforcement Learning Agents Trained with Domain Randomisation
Tianhong Dai, Kai Arulkumaran, Tamara Gerbert, Samyakh Tukra, Feryal Behbahani, Anil Anthony Bharath

Training Agents using Upside-Down Reinforcement Learning
Rupesh Kumar Srivastava, Pranav Shyam, Filipe Mutz, Wojciech Jaśkowski, Jürgen Schmidhuber

Additional References

Araya
NNAISENSE
Kai Arulkumaran on Google Scholar
Tschiatschek, S., Arulkumaran, K., Stühmer, J. & Hofmann, K. (2018). Variational Inference for Data-Efficient Model Learning in POMDPs. arXiv:1805.09281.
Arulkumaran, K., Dilokthanakul, N., Shanahan, M. & Bharath, A. A. (2016). Classifying Options for Deep Reinforcement Learning. International Joint Conference on Artificial Intelligence, Deep Reinforcement Learning Workshop.
Garnelo, M., Arulkumaran, K. & Shanahan, M. (2016). Towards Deep Symbolic Reinforcement Learning. Annual Conference on Neural Information Processing Systems, Deep Reinforcement Learning Workshop.
Arulkumaran, K., Deisenroth, M. P., Brundage, M. & Bharath, A. A. (2017). Deep reinforcement learning: A brief survey. IEEE Signal Processing Magazine.
Agostinelli, A., Arulkumaran, K., Sarrico, M., Richemond, P. & Bharath, A. A. (2019). Memory-Efficient Episodic Control Reinforcement Learning with Dynamic Online k-means. Annual Conference on Neural Information Processing Systems, Workshop on Biological and Artificial Reinforcement Learning.
Sarrico, M., Arulkumaran, K., Agostinelli, A., Richemond, P. & Bharath, A. A. (2019). Sample-Efficient Reinforcement Learning with Maximum Entropy Mellowmax Episodic Control. Annual Conference on Neural Information Processing Systems, Workshop on Biological and Artificial Reinforcement Learning.