3 papers across 3 sessions
We propose a parallel generation method for LLMs in which multiple instances synchronize through a shared, dynamically updated attention cache.
We present GraphLand — a benchmark of 14 diverse graph datasets for node property prediction, drawn from real-world industrial applications, with both random and temporal data splits.
Alchemist: a compact (3.3k-sample) SFT dataset curated via diffusion-model filtering. Boosts text-to-image aesthetics/complexity across 5 Stable Diffusion models (fine-tuned weights released) while preserving diversity.
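The first paper's core idea — parallel instances coordinating through a shared, dynamically updated cache — can be illustrated with a toy sketch. This is NOT the paper's method, only a hypothetical illustration under assumed semantics: each "instance" is a thread, the attention (KV) cache is a lock-guarded list, and every generation step reads the combined context before appending to it. The names `SharedCache`, `worker`, and the token format are invented for illustration.

```python
import threading

class SharedCache:
    """Toy stand-in for a shared, dynamically updated attention (KV) cache.

    Hypothetical: real KV caches hold per-layer key/value tensors; here we
    keep (worker_id, token) pairs so the synchronization pattern is visible.
    """
    def __init__(self):
        self._lock = threading.Lock()
        self._entries = []  # visible to all workers

    def append(self, worker_id, token):
        with self._lock:
            self._entries.append((worker_id, token))

    def snapshot(self):
        with self._lock:
            return list(self._entries)

def worker(cache, worker_id, n_steps):
    # Each "instance" alternates between reading the shared context
    # (attending over everyone's tokens so far) and extending it.
    for step in range(n_steps):
        context = cache.snapshot()
        token = f"w{worker_id}-t{step}(ctx={len(context)})"
        cache.append(worker_id, token)

cache = SharedCache()
threads = [threading.Thread(target=worker, args=(cache, i, 3)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(cache.snapshot()))  # → 12 (4 workers x 3 steps each)
```

The point of the sketch is only the coordination pattern: concurrent generators stay consistent because every write and read goes through one shared, mutating structure rather than per-instance copies.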