LoTA-QAF fine-tunes quantized LLMs with ternary adaptation, enabling lossless merging of the adapter into the quantized weights.
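The key property can be sketched as follows: if the adapter's entries are ternary (in {-1, 0, +1}) and are applied directly on the integer quantization grid, the merged weights stay on that grid, so no dequantize/requantize round-off occurs. This is a minimal illustrative sketch, not the paper's actual implementation; the function name, shapes, and 4-bit range are assumptions.

```python
import numpy as np

def merge_ternary(q_weights: np.ndarray, ternary_delta: np.ndarray,
                  qmin: int = -8, qmax: int = 7) -> np.ndarray:
    """Hypothetical merge of a ternary adapter into int4-range quantized weights.

    Because the delta is in {-1, 0, +1}, the sum is still an integer on the
    same quantization grid -- merging is exact ("lossless"), unlike adding a
    float LoRA delta, which would require re-quantization.
    """
    assert np.isin(ternary_delta, (-1, 0, 1)).all(), "delta must be ternary"
    merged = q_weights + ternary_delta
    # Clip so the result remains representable in the signed 4-bit range.
    return np.clip(merged, qmin, qmax)

rng = np.random.default_rng(0)
q = rng.integers(-8, 8, size=(4, 4))       # int4-range quantized weights
delta = rng.integers(-1, 2, size=(4, 4))   # ternary adapter in {-1, 0, +1}
merged = merge_ternary(q, delta)
print(merged)  # still exact integers on the original grid
```

Contrast this with a conventional LoRA merge, where the floating-point update pushes weights off the grid and re-quantizing them introduces rounding error.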