We introduce contextualized n-gram embeddings that extend the input embedding layer, improving model quality while keeping accelerator usage fixed at inference time.
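One common way to add n-gram information to an input embedding layer at fixed inference cost is to hash each n-gram into a fixed-size embedding table and sum the result with the token embedding. The sketch below illustrates that general idea for bigrams; it is a minimal toy assumption, not the paper's actual method, and all sizes (`VOCAB`, `NGRAM_BUCKETS`, `DIM`) and the hash function are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB = 100          # toy token vocabulary size (assumption)
NGRAM_BUCKETS = 512  # hash buckets for bigram ids (assumption)
DIM = 16             # embedding dimension (assumption)

tok_emb = rng.normal(size=(VOCAB, DIM))
bigram_emb = rng.normal(size=(NGRAM_BUCKETS, DIM))

def embed(tokens):
    """Token embeddings augmented with hashed-bigram embeddings.

    Each position after the first adds the embedding of its
    (previous, current) bigram, hashed into a fixed-size table,
    so per-token compute stays constant at inference time.
    """
    out = tok_emb[tokens].copy()
    for i in range(1, len(tokens)):
        h = (tokens[i - 1] * 1000003 + tokens[i]) % NGRAM_BUCKETS
        out[i] += bigram_emb[h]
    return out

x = embed([3, 7, 42, 7])
print(x.shape)  # (4, 16)
```

Because the bigram table has a fixed number of buckets, memory and compute do not grow with the number of distinct n-grams seen in the data.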