3 papers across 3 sessions
We propose LION, a framework that extends Linear Transformers to the bidirectional setting by providing three theoretically equivalent representations: a full attention form, a bidirectional RNN form, and a chunkwise parallel form.
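As a minimal sketch (not the paper's implementation), the snippet below illustrates the kind of attention/RNN equivalence involved: plain non-causal linear attention computed once as a full attention product and once as a forward plus backward recurrence over key-value states. Decay/selectivity, normalization, and the chunkwise form are omitted, and all shapes and names are assumptions.

```python
import numpy as np

T, d = 6, 4                      # assumed sequence length and head dimension
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((T, d)) for _ in range(3))

# 1) Full (non-causal) linear attention: O = (Q K^T) V
O_attn = (Q @ K.T) @ V

# 2) Bidirectional RNN view: forward and backward running states sum_t k_t v_t^T
S_fwd = np.zeros((d, d)); S_bwd = np.zeros((d, d))
fwd = np.zeros((T, d)); bwd = np.zeros((T, d))
for t in range(T):               # forward recurrence (includes position t)
    S_fwd += np.outer(K[t], V[t])
    fwd[t] = Q[t] @ S_fwd
for t in reversed(range(T)):     # backward recurrence (also includes position t)
    S_bwd += np.outer(K[t], V[t])
    bwd[t] = Q[t] @ S_bwd
# Position t is counted in both passes, so subtract its contribution once.
O_rnn = fwd + bwd - np.sum(Q * K, axis=1, keepdims=True) * V

assert np.allclose(O_attn, O_rnn)
```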
We demonstrate, for the first time, fully quantized training of a 7B LLM using the FP4 format.
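To make the format concrete, here is a minimal sketch of FP4 (E2M1) quantize-dequantize with per-block absmax scaling and round-to-nearest, which is one common way low-bit training is simulated; it is not the paper's training recipe, and the block size, scaling rule, and function name are assumptions for illustration.

```python
import numpy as np

# Positive FP4 E2M1 values; the signed grid mirrors them around zero.
FP4_E2M1 = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])
FP4_GRID = np.concatenate([-FP4_E2M1[::-1], FP4_E2M1])

def fake_quant_fp4(x, block=16):
    """Quantize-dequantize x to FP4 with one absmax scale per `block` elements."""
    flat = x.reshape(-1, block)
    scale = np.abs(flat).max(axis=1, keepdims=True) / FP4_E2M1[-1]  # map absmax to 6.0
    scale = np.where(scale == 0, 1.0, scale)
    scaled = flat / scale
    # Round each element to the nearest representable FP4 value.
    idx = np.abs(scaled[..., None] - FP4_GRID).argmin(axis=-1)
    return (FP4_GRID[idx] * scale).reshape(x.shape)

W = np.random.default_rng(0).standard_normal((4, 32)).astype(np.float32)
W_q = fake_quant_fp4(W)
print("max abs quantization error:", np.abs(W - W_q).max())
```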
We present the first defense against cryptanalytic parameter extraction, based on extraction-aware training with zero inference overhead and negligible accuracy impact.