4 papers across 3 sessions
We propose a fault-tolerant algorithm with minimal memory and computation overhead.
We introduce a new method for selecting subspaces in low-rank optimization for memory-efficient pretraining of large language models (LLMs).
We present a new continual learning method that builds and reuses compact memory for logistic regression.