Abstract: Gradient compression algorithms are widely used to alleviate the communication bottleneck in distributed ML. However, existing gradient compression algorithms suffer from accuracy ...