3 papers across 3 sessions
We propose DCoLT, a method that enhances diffusion language models by treating the reverse diffusion trajectory as a latent "thinking" process optimized with reinforcement learning. It achieves promising results on several math and code benchmarks with SEDD and LLaDA.
SIU3R: the first alignment-free framework for generalizable simultaneous understanding and 3D reconstruction from unposed images.
We improve the scalability of pathology foundation models through multi-scale vector quantization (MSVQ), which retains more patch tokens under compression, achieving performance comparable to other SOTA methods with significantly less WSI data.