PhD student, Carnegie Mellon University
2 papers at NeurIPS 2025
We propose scaling the number of agent interaction steps as a new axis of test-time scaling, and we develop a curriculum-based online RL algorithm for training agents to scale interaction.
We develop an adaptive image tokenizer that compresses images into variable-sized latent features based on their content complexity.