Training runs painfully slow on V100

If training is failing to start and you are not receiving an error message telling you what to do, tell us about it here.


Forum rules

Read the FAQs and search the forum before posting a new topic.

This forum is for reporting errors with the Training process. If you want to get tips, or better understand the Training process, then you should look in the Training Discussion forum.

Please mark any answers that fixed your problems so others can find the solutions.

koko191
Posts: 3
Joined: Tue Aug 27, 2019 4:17 pm
Answers: 1

Training runs painfully slow on V100

Post by koko191 »

For some reason my training runs super slow on the V100, even though it runs like a mad lad on a Titan RTX. I'm using the same settings on both GPUs. When I check nvidia-smi, the script uses 12+ GB of VRAM on the Titan RTX, but only ever around 305 MB on the V100. Am I doing something wrong? From what I know, the V100 should be at least as fast as the Titan RTX, not dramatically slower.
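
The ~305 MB footprint usually means TensorFlow never actually initialized the GPU and the ops are landing on the CPU. A quick way to check is to list the devices TensorFlow registered at startup; this is a minimal TF 1.x sketch, nothing in it beyond the standard API calls:

Code:

# Minimal TF 1.x sketch: confirm TensorFlow actually registered the GPU.
# If the V100 is missing from this list, TF has silently fallen back to the CPU,
# which would match the tiny ~305 MB nvidia-smi footprint.
import tensorflow as tf
from tensorflow.python.client import device_lib

for dev in device_lib.list_local_devices():
    print(dev.name, dev.device_type, dev.physical_device_desc)

print("GPU available:", tf.test.is_gpu_available(cuda_only=True))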

koko191
Posts: 3
Joined: Tue Aug 27, 2019 4:17 pm
Answers: 1

Re: Training runs painfully slow on V100

Post by koko191 »

Solved.

Both machines had CUDA 10.1 installed (among other versions), but the RTX machine had cuDNN 7.5.1 (compatible with CUDA 10.1) while the V100 machine only had cuDNN 7.1.4, which is not compatible with CUDA 10.1. I was running TF 1.14.0. Downgrading to TF 1.12.3 fixed it: that release works with CUDA 9.2, which both machines had, and CUDA 9.2 is compatible with cuDNN 7.1.4. Everything works now.
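
For anyone else hitting this: a quick way to confirm the fix, i.e. that TensorFlow really places work on the GPU instead of silently falling back to the CPU when cuDNN fails to load, is to force a cuDNN-backed op onto the GPU with soft placement disabled. This is a minimal TF 1.x sketch; the '/gpu:0' device string and the tensor shapes are just example values:

Code:

# Minimal TF 1.x sketch: fail loudly if the GPU/cuDNN setup is broken,
# instead of silently training on the CPU.
import tensorflow as tf

config = tf.ConfigProto(log_device_placement=True,   # print where each op runs
                        allow_soft_placement=False)   # refuse CPU fallback

with tf.device('/gpu:0'):
    x = tf.random_normal([1, 64, 64, 3])
    w = tf.random_normal([3, 3, 3, 8])
    y = tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding='SAME')  # conv2d on GPU goes through cuDNN

with tf.Session(config=config) as sess:
    print(sess.run(y).shape)  # raises an error here if the op cannot be placed on the GPU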
