Optimization for Deep Learning Highlights in 2017

Table of contents:
• Improving Adam
• Decoupling weight decay
• Fixing the exponential moving average
• Tuning the learning rate
• Warm restarts
• SGD with …