2 papers across 2 sessions
We analyze the shortcomings of existing dropout-based methods on long-range modeling tasks.
We make neural network training cheaper and more accurate by progressively dropping a fraction of the training data after each epoch.
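To make the second idea concrete, here is a minimal PyTorch sketch of one way progressive data dropping could look. Everything specific is an assumption for illustration: the summary does not say how examples are selected, so this sketch drops the lowest-loss ("easiest") examples after each epoch, and the function name, drop fraction, and hyperparameters are hypothetical, not the paper's actual method.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, Subset, TensorDataset

def train_with_progressive_dropping(model, dataset, epochs=10,
                                    drop_frac=0.1, lr=1e-3):
    """Hypothetical sketch: after each epoch, permanently drop the
    drop_frac of remaining examples with the lowest per-sample loss."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss(reduction="none")  # per-sample losses
    keep = list(range(len(dataset)))                 # indices still in use

    for _ in range(epochs):
        model.train()
        for x, y in DataLoader(Subset(dataset, keep),
                               batch_size=64, shuffle=True):
            opt.zero_grad()
            loss_fn(model(x), y).mean().backward()
            opt.step()

        # Score the remaining examples, then keep only the hardest ones.
        model.eval()
        with torch.no_grad():
            scores = torch.cat([
                loss_fn(model(x), y)
                for x, y in DataLoader(Subset(dataset, keep), batch_size=256)
            ])
        n_keep = max(1, int(len(keep) * (1 - drop_frac)))
        hardest = torch.argsort(scores, descending=True)[:n_keep]
        keep = [keep[i] for i in hardest.tolist()]

    return model

# Toy usage: a linear classifier on random data.
X, y = torch.randn(1000, 20), torch.randint(0, 2, (1000,))
train_with_progressive_dropping(nn.Linear(20, 2), TensorDataset(X, y))
```

Because the kept set shrinks each epoch, later epochs touch fewer examples, which is where the cost savings would come from under these assumptions.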