Deep Learning: CPU v M1 v GPU

Measuring Relative Processing Speeds

I am building deep learning models on my Mac and have noticed a considerable gap between the run times others get on a GPU and what I see on my Mac, which has an M1 processor. PyTorch began supporting the M1 about a month ago.

My experiment multiplied two random matrices (1000x10 and 10x50) and repeated the multiplication 10,000 times. I compared three processing setups across two environments: my Mac and Colab. The latter let me observe the gain from a GPU. The three setups: NumPy on CPU, Torch on CPU, and Torch on M1 (Mac) or GPU (Colab).
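For reference, here is a minimal sketch of how a benchmark like this could be written. It is not the code behind the numbers in this post: the helper names (time_repeated, run_benchmarks), the warm-up call, the default dtypes, and the explicit device synchronization are all assumptions. The synchronization step matters because PyTorch queues GPU work asynchronously, so without it the timer can stop before the device has finished.

```python
import time


def time_repeated(fn, repeats=10_000, sync=None):
    """Return wall-clock milliseconds for `repeats` calls of fn() (hypothetical helper)."""
    fn()                      # warm-up call, excluded from the timing
    if sync is not None:
        sync()
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    if sync is not None:
        sync()                # wait for queued device work before stopping the clock
    return (time.perf_counter() - start) * 1000.0


def run_benchmarks(repeats=10_000):
    """Time a (1000x10) @ (10x50) product on every backend that is available."""
    results = {}

    try:
        import numpy as np
        a, b = np.random.rand(1000, 10), np.random.rand(10, 50)
        results["NumPy"] = time_repeated(lambda: a @ b, repeats)
    except ImportError:
        pass                  # NumPy not installed: skip that row

    try:
        import torch
    except ImportError:
        return results        # PyTorch not installed: nothing more to time

    devices = {"Torch - CPU": ("cpu", None)}
    if torch.cuda.is_available():
        devices["Torch - GPU"] = ("cuda", torch.cuda.synchronize)
    mps = getattr(torch.backends, "mps", None)
    # Assumption: only time MPS when torch.mps.synchronize exists (newer PyTorch).
    if mps is not None and mps.is_available() and hasattr(torch, "mps"):
        devices["Torch - M1"] = ("mps", torch.mps.synchronize)

    for label, (device, sync) in devices.items():
        x = torch.rand(1000, 10, device=device)
        y = torch.rand(10, 50, device=device)
        results[label] = time_repeated(lambda: x @ y, repeats, sync)
    return results


if __name__ == "__main__":
    for label, ms in run_benchmarks().items():
        print(f"{label}\t{ms:.0f} ms")
```

Backends that are not installed or not available are simply left out of the result, so the same script can run on a Mac and on Colab.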

Results:

Setup	Mac (ms)	Colab (ms)
NumPy	7190	1500
Torch - CPU	456	140
Torch - M1	458	n/a
Torch - GPU	n/a	115
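To read these timings as relative speeds, the ratios can be worked out directly. The sketch below only restates the numbers from the table; the dictionary layout and the speedup helper are illustrative names, not part of the original experiment.

```python
# Timings in milliseconds, copied from the results table.
timings_ms = {
    "Mac": {"NumPy": 7190, "Torch - CPU": 456, "Torch - M1": 458},
    "Colab": {"NumPy": 1500, "Torch - CPU": 140, "Torch - GPU": 115},
}


def speedup(env, baseline, candidate):
    """How many times faster `candidate` is than `baseline` in one environment."""
    row = timings_ms[env]
    return row[baseline] / row[candidate]


comparisons = [
    ("Mac", "NumPy", "Torch - CPU"),
    ("Mac", "Torch - CPU", "Torch - M1"),
    ("Colab", "NumPy", "Torch - CPU"),
    ("Colab", "Torch - CPU", "Torch - GPU"),
]
for env, baseline, candidate in comparisons:
    ratio = speedup(env, baseline, candidate)
    print(f"{env}: {candidate} vs {baseline}: {ratio:.2f}x")
```

On these numbers, Torch on CPU is roughly 15.8x faster than NumPy on the Mac and roughly 10.7x faster on Colab, the Colab GPU adds about another 1.2x over Torch on CPU, and the M1 ratio is essentially 1.0x.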

Conclusion

It is mysterious that there is no gain going from CPU to M1. I may be thinking about this wrong and will have to revisit it!

This great blog post runs a more robust test. Although its author observes a gain when training a full model and performing inference, he concludes that the M1 is a hard no as a replacement for GPUs when training models.