5 papers across 3 sessions
Explicit clustering bias added during training improves structural consistency of cell embeddings but does not reveal clear cell types in mouse V1
We investigate the use of autoregressive models for exchangeable sequences in decision-making, showing that multi-step inference improves performance and that standard causal architectures outperform existing custom ones.
The generalization of a DiT is shaped by the inductive bias of attention locality rather than the harmonic bases that drive UNets; restricting attention windows can modify its generalization behavior.
Under input uncertainty, transformer models systematically fall back on input‑agnostic conceptual representations, increasing the likelihood of hallucinations.