4 papers across 3 sessions
We propose grafting, a simple approach to materializing new architectures by editing pretrained diffusion transformers; it enables architectural exploration under small compute budgets.
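A minimal sketch of the editing idea, under assumptions the summary does not spell out: replace a pretrained operator with a cheaper one and regression-fit the replacement to mimic the old operator's outputs on real activations. All names (`old_operator`, `new_operator`) and the least-squares fit are illustrative, not the paper's actual procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.normal(size=(8, 8))

def old_operator(x):
    # stand-in for the pretrained operator being replaced (e.g., an attention block)
    return np.tanh(x @ M)

X = rng.normal(size=(256, 8))   # activations flowing into the operator
Y = old_operator(X)             # the pretrained operator's outputs on those activations

# "graft": fit a cheaper linear replacement to reproduce the old operator's behavior
W, *_ = np.linalg.lstsq(X, Y, rcond=None)

def new_operator(x):
    return x @ W

err = np.mean((new_operator(X) - Y) ** 2)
print(f"fit error: {err:.3f}")
```

In this toy setting the replacement is fit locally from input/output pairs; a full method would presumably follow with lightweight end-to-end finetuning.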
We train a hybrid autoencoder with separate gating for neural and tree-based encoders, achieving strong low-label classification and regression while using only the gated neural encoder at inference.
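One way to picture the gating, as a hedged sketch only (the encoder stand-ins and the scalar sigmoid gate are assumptions, not the paper's architecture): a learned gate blends the neural embedding with the tree-based embedding per example.

```python
import numpy as np

rng = np.random.default_rng(0)

def neural_encoder(x, W):
    # stand-in for a neural encoder: one linear layer with tanh
    return np.tanh(x @ W)

def tree_encoder(x, thresholds):
    # stand-in for a tree-based encoder: binary split indicators per feature
    return (x[:, :thresholds.shape[0]] > thresholds).astype(float)

def gated_fuse(z_n, z_t, w_g, b_g):
    # per-example scalar gate in (0, 1), computed from the neural embedding
    g = 1.0 / (1.0 + np.exp(-(z_n @ w_g + b_g)))
    return g[:, None] * z_n + (1.0 - g[:, None]) * z_t

x = rng.normal(size=(8, 16))
W = rng.normal(size=(16, 4))
thr = rng.normal(size=4)
z = gated_fuse(neural_encoder(x, W), tree_encoder(x, thr), rng.normal(size=4), 0.0)
print(z.shape)
```

At inference, per the summary, only the gated neural path would be kept, so the tree encoder acts as a training-time regularizer rather than a deployment dependency.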
We enable tree-based decoding for state space models (SSMs), facilitating speculative decoding with tree-based verification.
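The verification step can be illustrated with a toy sketch (all of this is an assumption for illustration, not the paper's algorithm): a draft model proposes a token tree, and the target model accepts the longest root-to-leaf path that matches its own greedy predictions.

```python
def target_next(prefix):
    # toy deterministic "target model": next token is the prefix sum mod 5
    return sum(prefix) % 5

def verify_tree(prefix, tree):
    # tree maps token -> subtree; return the longest accepted token sequence
    t = target_next(prefix)
    if t in tree:
        return [t] + verify_tree(prefix + [t], tree[t])
    return []

prefix = [1, 2]
# draft tree with two branches from the root
tree = {3: {1: {}}, 0: {3: {}, 4: {}}}
print(verify_tree(prefix, tree))  # → [3, 1]
```

The SSM-specific difficulty the summary hints at is that, unlike transformers with a KV cache, recurrent state must be managed per tree branch during this verification.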