We present JetLM, a new family of language models that matches leading full-attention models while significantly improving generation throughput.