Fast and Memory Efficient Strassen’s Matrix Multiplication on GPU Cluster