1 paper across 1 session
HiFC swaps LLM KV caches directly between GPU and pSLC-SSD, matching DRAM-level throughput while eliminating DRAM and slashing cost five-fold.