PhD student, Tel Aviv University
3 papers at NeurIPS 2025
Standard Glorot initialization becomes unstable when used in RNNs on long sequences, causing the hidden states to explode. To address this, we propose a simple rescaling that mitigates the instability.
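To make the setting concrete, the following is a minimal sketch of how one might track hidden-state norms under a Glorot-initialized recurrent matrix. It is an illustration under stated assumptions, not the rescaling proposed here: the recurrence is linear with no inputs or nonlinearity, and the `gain` parameter, the helper names, and the values 64, 200, and 0.9 are all hypothetical choices made for this example.

```python
import math
import random


def glorot_uniform(n_in, n_out, rng, gain=1.0):
    """Glorot (Xavier) uniform init: entries drawn from U(-a, a),
    with a = gain * sqrt(6 / (n_in + n_out)).
    `gain` is a hypothetical scale knob, not the abstract's rescaling."""
    limit = gain * math.sqrt(6.0 / (n_in + n_out))
    return [[rng.uniform(-limit, limit) for _ in range(n_in)]
            for _ in range(n_out)]


def hidden_norms(W, h0, steps):
    """Euclidean norm of h_t for the linear recurrence h_t = W h_{t-1}.
    Assumption: no inputs and no nonlinearity, to isolate the effect of W."""
    h = list(h0)
    norms = []
    for _ in range(steps):
        h = [sum(w * x for w, x in zip(row, h)) for row in W]
        norms.append(math.sqrt(sum(x * x for x in h)))
    return norms


n, steps = 64, 200                 # illustrative sizes only
h0 = [1.0 / math.sqrt(n)] * n      # unit-norm initial hidden state
for gain in (1.0, 0.9):            # 0.9 is an arbitrary comparison value
    W = glorot_uniform(n, n, random.Random(0), gain=gain)
    norms = hidden_norms(W, h0, steps)
    print(f"gain={gain}: ||h_1||={norms[0]:.3e}, ||h_{steps}||={norms[-1]:.3e}")
```

In this linear setting, scaling the recurrent matrix by a factor g multiplies the norm at step t by g^t, so even a small change in the initialization scale compounds over long sequences. A real RNN with inputs and a nonlinearity will behave differently in detail.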