I am interested in sequential modeling and decision-making, focusing on methods that deliver robust capabilities in complex scenarios.
These methods have the potential to enable broadly intelligent agents and apply across diverse domains. I view many problems under this umbrella, from tasks that demand complex reasoning, such as mathematical problem solving, to classical continuous control in robotics. My work also builds foundational tools, including sequential modeling techniques (e.g., long-context transformers) and methods from deep learning and reinforcement learning.
This research spans several specific areas, reflected in the publications below:
[1] Hierarchical Reinforcement Learning with Parameters
[2] Learning to Run challenge solutions: Adapting reinforcement learning methods for neuromusculoskeletal environments (Proximal Policy Optimization with Policy Blending)
[3] Expert-augmented actor-critic for ViZDoom and Montezuma's Revenge
[4] Developmentally motivated emergence of compositional communication via template transfer
[5] Simulation-based reinforcement learning for real-world autonomous driving
[6] Uncertainty-sensitive Learning and Planning with Ensembles
[7] Model Based Reinforcement Learning for Atari
[8] Structure and randomness in planning and reinforcement learning
[9] Trust, but verify: model-based exploration in sparse reward environments
[10] CARLA Real Traffic Scenarios -- novel training ground and benchmark for autonomous driving
[11] Robust and Efficient Planning using Adaptive Entropy Tree Search
[12] Continuous Control With Ensemble Deep Deterministic Policy Gradients
[13] Off-Policy Correction For Multi-Agent Reinforcement Learning
[14] Catalytic Role Of Noise And Necessity Of Inductive Biases In The Emergence Of Compositional Communication
[15] Continual World: A Robotic Benchmark For Continual Reinforcement Learning
[16] Subgoal Search For Complex Reasoning Tasks
[17] Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers
[18] Fast and Precise: Adjusting Planning Horizon with Adaptive Subgoal Search
[19] Formal Premise Selection With Language Models
[20] Disentangling Transfer in Continual Reinforcement Learning
[21] Magnushammer: A Transformer-based Approach to Premise Selection
[22] Focused Transformer: Contrastive Training for Context Scaling