Researcher, Apple
2 papers at NeurIPS 2025
Standard Glorot initialization becomes unstable when used in RNNs with long sequences, leading to exploding hidden states. To address this, we propose a simple rescaling that effectively mitigates the instability.
Video recognition models are not smooth as a function of time, but smoothing them improves accuracy.