8 papers across 3 sessions
We introduce FLAMES, a novel spiking neural network architecture that effectively captures long-range temporal dependencies through Spike-Aware HiPPO and dendrite attention.
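Spike-Aware HiPPO is the paper's contribution, but the standard HiPPO-LegS operator it presumably builds on is well documented (Gu et al., 2020). A minimal sketch of that baseline construction, without the spike-aware modification:

```python
import numpy as np

def hippo_legs(N: int) -> np.ndarray:
    """Standard HiPPO-LegS state matrix A (Gu et al., 2020).

    Up to time scaling, the memory update is x'(t) = -A x(t) + B u(t),
    which compresses the input history onto Legendre polynomials.
    """
    A = np.zeros((N, N))
    for n in range(N):
        for k in range(N):
            if n > k:
                A[n, k] = np.sqrt(2 * n + 1) * np.sqrt(2 * k + 1)
            elif n == k:
                A[n, k] = n + 1
    return A

A = hippo_legs(8)  # small state dimension for illustration
```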
We provide a method for enabling length generalization within state-space models by modulating the $A$ matrices per layer.
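As a rough illustration of the idea only: one can rescale a diagonal SSM's $A$ per layer so each layer decays at a different rate. The `alpha ** layer_idx` schedule below is a hypothetical placeholder, not the paper's rule:

```python
import numpy as np

def ssm_layer(u, A_diag, B, C):
    """Diagonal linear SSM: x_{t+1} = A x_t + B u_t, y_t = C . x_t."""
    x = np.zeros_like(A_diag)
    ys = []
    for u_t in u:
        x = A_diag * x + B * u_t
        ys.append(C @ x)
    return np.array(ys)

def modulated_stack(u, A_diag, B, C, n_layers, alpha=0.9):
    """Stack of SSM layers, each with a modulated A matrix.

    Layer l uses |A| ** (alpha^l) elementwise (sign preserved), a
    hypothetical per-layer schedule standing in for the paper's method.
    """
    h = u
    for l in range(n_layers):
        A_l = np.sign(A_diag) * np.abs(A_diag) ** (alpha ** l)
        h = ssm_layer(h, A_l, B, C)
    return h
```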
We propose a proxy for backward-mode automatic differentiation that uses only forward passes, applicable to Hamiltonian recurrent units and stacks thereof (namely, SSMs), with theoretical guarantees and experimental evidence.
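The summary is reminiscent of generic forward-gradient methods (Baydin et al., 2022), sketched below; the paper's Hamiltonian-specific construction and its guarantees are not reproduced here:

```python
import numpy as np

def forward_gradient(f, theta, eps=1e-6, rng=None):
    """Gradient estimate from forward passes only (no backprop).

    Samples a direction v ~ N(0, I), estimates the directional
    derivative (grad f . v) with two forward evaluations, and returns
    (grad f . v) v, whose expectation over v is the true gradient.
    """
    rng = rng or np.random.default_rng()
    v = rng.standard_normal(theta.shape)
    dd = (f(theta + eps * v) - f(theta - eps * v)) / (2 * eps)
    return dd * v

# toy check on f(x) = ||x||^2 / 2, whose true gradient is x itself
f = lambda x: 0.5 * np.dot(x, x)
theta = np.array([1.0, -2.0, 3.0])
est = np.mean([forward_gradient(f, theta) for _ in range(5000)], axis=0)
print(est)  # approaches [1, -2, 3]
```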
We introduce a new SSM that is maximally expressive and scales to long-sequence modeling tasks.
We enable tree-based decoding on SSMs, facilitating speculative decoding with tree-based verification.
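A minimal, model-agnostic sketch of greedy tree verification (the tree format and the `target_next_token` callback are illustrative, not the paper's interface): accept the longest root-to-leaf draft path that matches the target model's greedy choices.

```python
def verify_tree(prefix, tree, target_next_token):
    """Greedily verify a draft token tree against a target model.

    `tree` maps a token to its subtree of children, e.g.
    {5: {7: {}, 2: {}}, 9: {}} drafts two branches after the prefix.
    `target_next_token(seq)` returns the target model's greedy token.
    Returns the accepted continuation plus one corrected token.
    """
    accepted = []
    node = tree
    while True:
        t = target_next_token(prefix + accepted)
        if t in node:      # draft agrees with target: accept and descend
            accepted.append(t)
            node = node[t]
        else:              # mismatch or leaf: keep target's token, stop
            accepted.append(t)
            return accepted
```

In practice the target model scores the whole tree in one batched forward pass, and the SSM-specific difficulty is managing recurrent state across branches; the loop above only makes the acceptance rule explicit.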
We present a unified theory for the study of RNN expressivity, with novel results on several popular architectures, and insights on the relationship between linear and non-linear RNNs.
We regularize the Hankel singular value distribution when training state-space models, making the trained models highly compressible.
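For reference, the Hankel singular values of a discrete LTI system $(A, B, C)$ are computable from the controllability and observability Gramians; a penalty on them (the tail-sum below is one plausible choice, not necessarily the paper's) could then be added to the training loss:

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def hankel_singular_values(A, B, C):
    """Hankel singular values of x_{t+1} = A x_t + B u_t, y_t = C x_t.

    P solves A P A^T - P + B B^T = 0 (controllability Gramian),
    Q solves A^T Q A - Q + C^T C = 0 (observability Gramian);
    the HSVs are the square roots of the eigenvalues of P Q.
    """
    P = solve_discrete_lyapunov(A, B @ B.T)
    Q = solve_discrete_lyapunov(A.T, C.T @ C)
    eig = np.linalg.eigvals(P @ Q)
    return np.sort(np.sqrt(np.abs(eig)))[::-1]

def tail_hsv_penalty(A, B, C, keep=16):
    """Hypothetical regularizer: sum of HSVs beyond the first `keep`,
    pushing the system toward a rank-`keep` (compressible) realization."""
    return hankel_singular_values(A, B, C)[keep:].sum()
```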