We are interested in methods that deliver robust decision-making in complex scenarios. Such methods can be applied in control to obtain broadly intelligent agents.

At the core of this effort is solving credit assignment over long horizons. I like to put, perhaps unorthodoxly, many problems under this umbrella: tasks requiring extreme reasoning, such as solving mathematical problems, on one side, and classical continuous control in robotics on the other.

There are several specific areas of this research:

Papers:

[1] Hierarchical Reinforcement Learning with Parameters

[2] Learning to Run challenge solutions: Adapting reinforcement learning methods for neuromusculoskeletal environments. Proximal Policy Optimization with Policy Blending

[3] Expert-augmented actor-critic for ViZDoom and Montezuma's Revenge

[4] Developmentally motivated emergence of compositional communication via template transfer

[5] Simulation-based reinforcement learning for real-world autonomous driving

[6] Uncertainty-sensitive Learning and Planning with Ensembles

[7] Model Based Reinforcement Learning for Atari

[8] Structure and randomness in planning and reinforcement learning

[9] Trust, but verify: model-based exploration in sparse reward environments

[10] CARLA Real Traffic Scenarios -- novel training ground and benchmark for autonomous driving

[11] Robust and Efficient Planning using Adaptive Entropy Tree Search

[12] Continuous Control With Ensemble Deep Deterministic Policy Gradients

[13] Off-Policy Correction For Multi-Agent Reinforcement Learning

[14] Catalytic Role Of Noise And Necessity Of Inductive Biases In The Emergence Of Compositional Communication

[15] Continual World: A Robotic Benchmark For Continual Reinforcement Learning

[16] Subgoal Search For Complex Reasoning Tasks

[17] Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers

[18] Fast and Precise: Adjusting Planning Horizon with Adaptive Subgoal Search

[19] Formal Premise Selection With Language Models

[20] Disentangling Transfer in Continual Reinforcement Learning

[21] Magnushammer: A Transformer-based Approach to Premise Selection

[22] Focused Transformer: Contrastive Training for Context Scaling