3 papers across 3 sessions
We propose LION, a framework that extends Linear Transformers to the bidirectional setting by providing three theoretically equivalent representations: a full attention form, a bidirectional RNN form, and a chunkwise parallel form.
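As a minimal sketch (not the paper's implementation), the snippet below illustrates the kind of attention/RNN equivalence involved: plain non-causal linear attention computed once as a full attention product and once as a forward plus backward recurrence over key-value states. Decay/selectivity, normalization, and the chunkwise form are omitted, and all shapes and names are assumptions.

```python
import numpy as np

T, d = 6, 4                      # assumed sequence length and head dimension
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((T, d)) for _ in range(3))

# 1) Full (non-causal) linear attention: O = (Q K^T) V
O_attn = (Q @ K.T) @ V

# 2) Bidirectional RNN view: forward and backward running states sum_t k_t v_t^T
S_fwd = np.zeros((d, d)); S_bwd = np.zeros((d, d))
fwd = np.zeros((T, d)); bwd = np.zeros((T, d))
for t in range(T):               # forward recurrence (includes position t)
    S_fwd += np.outer(K[t], V[t])
    fwd[t] = Q[t] @ S_fwd
for t in reversed(range(T)):     # backward recurrence (also includes position t)
    S_bwd += np.outer(K[t], V[t])
    bwd[t] = Q[t] @ S_bwd
# Position t is counted in both passes, so subtract its contribution once.
O_rnn = fwd + bwd - np.sum(Q * K, axis=1, keepdims=True) * V

assert np.allclose(O_attn, O_rnn)
```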
We demonstrate, for the first time, fully quantized training of a 7B LLM using the FP4 format.
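To make the format concrete, here is a minimal sketch of FP4 (E2M1) quantize-dequantize with per-block absmax scaling and round-to-nearest, which is one common way low-bit training is simulated; it is not the paper's training recipe, and the block size, scaling rule, and function name are assumptions for illustration.

```python
import numpy as np

# Positive FP4 E2M1 values; the signed grid mirrors them around zero.
FP4_E2M1 = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])
FP4_GRID = np.concatenate([-FP4_E2M1[::-1], FP4_E2M1])

def fake_quant_fp4(x, block=16):
    """Quantize-dequantize x to FP4 with one absmax scale per `block` elements."""
    flat = x.reshape(-1, block)
    scale = np.abs(flat).max(axis=1, keepdims=True) / FP4_E2M1[-1]  # map absmax to 6.0
    scale = np.where(scale == 0, 1.0, scale)
    scaled = flat / scale
    # Round each element to the nearest representable FP4 value.
    idx = np.abs(scaled[..., None] - FP4_GRID).argmin(axis=-1)
    return (FP4_GRID[idx] * scale).reshape(x.shape)

W = np.random.default_rng(0).standard_normal((4, 32)).astype(np.float32)
W_q = fake_quant_fp4(W)
print("max abs quantization error:", np.abs(W - W_q).max())
```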
We present the first defense against cryptanalytic parameter extraction, based on extraction-aware training with zero inference overhead and negligible accuracy impact.