2 papers across 2 sessions
We propose an adaptive layer-reuse technique that dynamically reuses intermediate features across adjacent denoising steps, enabling efficient inference for text-to-video generation models.
Our work, Mustafar, unlocks 70% sparsity in KV-cache pruning by leveraging unstructured sparsity patterns, supported by a custom attention kernel, and boosts the inference efficiency of LLMs.