2 papers across 2 sessions
We propose Q3R, a novel low-rank regularization method that enables, for the first time, robust pre-training of low-rank models.
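To make the general idea of low-rank regularization concrete, here is a minimal PyTorch sketch that penalizes the singular values of a weight matrix beyond a target rank. This is an illustrative stand-in, not Q3R's actual regularizer; tail_rank_penalty, target_rank, and all hyperparameters below are hypothetical choices for the example.

import torch

def tail_rank_penalty(weight: torch.Tensor, target_rank: int) -> torch.Tensor:
    # Sum of squared singular values past `target_rank`; driving this toward
    # zero pushes `weight` toward an (at most) rank-`target_rank` matrix.
    s = torch.linalg.svdvals(weight)  # differentiable singular values
    return (s[target_rank:] ** 2).sum()

# Toy training step: fit a random regression target while regularizing W
# toward rank 8 alongside the task loss.
torch.manual_seed(0)
W = torch.randn(64, 64, requires_grad=True)
x, y = torch.randn(32, 64), torch.randn(32, 64)
opt = torch.optim.SGD([W], lr=1e-2)
for _ in range(100):
    opt.zero_grad()
    task_loss = ((x @ W - y) ** 2).mean()
    reg = 1e-3 * tail_rank_penalty(W, target_rank=8)  # illustrative weight
    (task_loss + reg).backward()
    opt.step()

Because the penalty is applied during training rather than by truncating a trained model, the weights can settle into a low-rank solution without the accuracy cliff of post-hoc compression; that is the general motivation behind regularized low-rank pre-training.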
We introduce a new subspace-selection method for low-rank optimization in memory-efficient pretraining of large language models (LLMs).
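For context, here is a minimal sketch of one common subspace-selection heuristic in low-rank optimizers: periodically take a truncated SVD of the current gradient and perform updates inside that subspace (as in GaLore-style methods). This is an assumed baseline for illustration, not the selection criterion this paper introduces; select_subspace, the refresh interval, and the rank are hypothetical.

import torch

def select_subspace(grad: torch.Tensor, rank: int) -> torch.Tensor:
    # Orthonormal basis for the top-`rank` left singular subspace of the gradient.
    U, _, _ = torch.linalg.svd(grad, full_matrices=False)
    return U[:, :rank]

torch.manual_seed(0)
W = torch.randn(64, 64, requires_grad=True)
x, y = torch.randn(32, 64), torch.randn(32, 64)
lr, rank = 1e-2, 8
P = None
for step in range(100):
    loss = ((x @ W - y) ** 2).mean()
    loss.backward()
    with torch.no_grad():
        if step % 20 == 0:                    # refresh the subspace periodically
            P = select_subspace(W.grad, rank)
        low_rank_grad = P @ (P.T @ W.grad)    # project the gradient onto span(P)
        W -= lr * low_rank_grad
        W.grad = None

In real memory-efficient optimizers the savings come from keeping optimizer state (e.g., Adam moments) in the rank-r coordinates rather than the full parameter space; the plain SGD projection above only illustrates where subspace selection enters the update.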