Transferable Proximal Policy Optimization for Multitask Reinforcement Learning

Manuscript in preparation, 2024

We study how to share experience across related reinforcement learning tasks when training data and computation are constrained. Our approach extends proximal policy optimization (PPO) with an auxiliary-task selection strategy, a shared experience buffer, and a fine-tuning stage that improves stability and final performance across tasks.
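
To make the components concrete, below is a minimal sketch of how a shared experience buffer, an auxiliary-task selection heuristic, and cross-task batch sampling could fit together. All names and heuristics here (`SharedExperienceBuffer`, `select_auxiliary_task`, the lowest-return selection rule, the `share_ratio` mixing fraction) are illustrative assumptions rather than the method described in the manuscript; the PPO update and the fine-tuning stage are omitted.

```python
# Illustrative sketch only: a shared buffer with per-task views, a simple
# auxiliary-task selection heuristic, and a sampling rule that mixes in a
# fraction of experience from other tasks. Not the manuscript's implementation.
import random
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Transition:
    task_id: str
    obs: List[float]
    action: int
    reward: float
    done: bool


class SharedExperienceBuffer:
    """Experience buffer shared across tasks, with a bounded deque per task."""

    def __init__(self, capacity_per_task: int = 10_000):
        self._buffers = defaultdict(lambda: deque(maxlen=capacity_per_task))

    def add(self, transition: Transition) -> None:
        self._buffers[transition.task_id].append(transition)

    def sample(self, task_id: str, batch_size: int, share_ratio: float = 0.25) -> List[Transition]:
        """Sample mostly from the target task, plus a fraction from other tasks."""
        own = list(self._buffers[task_id])
        others = [t for tid, buf in self._buffers.items() if tid != task_id for t in buf]
        n_shared = int(batch_size * share_ratio) if others else 0
        batch = random.sample(own, min(batch_size - n_shared, len(own)))
        batch += random.sample(others, min(n_shared, len(others)))
        return batch


def select_auxiliary_task(recent_returns: Dict[str, List[float]]) -> str:
    """Pick the task with the lowest mean recent return as the next auxiliary task."""
    return min(
        recent_returns,
        key=lambda tid: sum(recent_returns[tid]) / max(len(recent_returns[tid]), 1),
    )


if __name__ == "__main__":
    buffer = SharedExperienceBuffer()
    returns = {"reach": [4.0, 5.0], "push": [0.5, 0.7]}  # hypothetical task names
    for tid in returns:
        for _ in range(100):
            buffer.add(Transition(tid, [0.0], 0, random.random(), False))
    aux_task = select_auxiliary_task(returns)      # "push": lowest mean recent return
    batch = buffer.sample("reach", batch_size=64)  # ~75% target task, ~25% shared
    print(aux_task, len(batch))
```

Mixing a small fixed fraction of shared transitions into each batch is one simple way to trade off cross-task transfer against off-task bias; the selection and sharing strategy used in the manuscript may differ.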