PhD student, Universität des Saarlandes
1 paper at NeurIPS 2025
While large-scale pretraining brings remarkable capabilities, it cannot fundamentally rewrite the architecture’s core inductive biases.