When I try to start training I get these error messages:
E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
Anyone have any idea how to fix this?
(I attached the crash report)
Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
Attachments: crash_report.2020.06.11.180042299945.log (87.59 KiB)
Ultimately this is an issue in Tensorflow/Cuda.
Generally, this indicates a GPU memory issue, which Allow Growth normally fixes.
Try with a very low batch size, Allow Growth ON, Memory Saving Gradients OFF.
If this doesn't work, try removing faceswap and the environment and training again.
If it still doesn't work, you will need to google around this issue for potential solutions, as it is occurring upstream of Faceswap.
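For reference, "Allow Growth" corresponds to TensorFlow's GPU memory-growth setting, which stops TensorFlow from reserving nearly all GPU memory up front (the usual trigger for these ALLOC_FAILED errors). A minimal sketch of enabling it outside the Faceswap GUI, assuming a TF 2.x install; the environment variable must be set before TensorFlow is imported:

```python
import os

# Force the GPU allocator to start small and grow on demand instead of
# grabbing (almost) all GPU memory at startup. Must be set before
# "import tensorflow" for TensorFlow to pick it up.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"

# Equivalent in-code setting (TF 2.x); must run before any GPU op:
# import tensorflow as tf
# for gpu in tf.config.list_physical_devices("GPU"):
#     tf.config.experimental.set_memory_growth(gpu, True)

print(os.environ["TF_FORCE_GPU_ALLOW_GROWTH"])
```

In Faceswap itself you don't need this snippet; ticking Allow Growth in the GUI applies the same setting.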
Re: Crash While Beginning Training
That didn't work. I've included my log in case it helps with debugging.
Attachments: faceswap.log (10.85 KiB)
- deephomage
- Posts: 33
- Joined: Fri Jul 12, 2019 6:09 pm
- Has thanked: 2 times
- Been thanked: 8 times
Re: Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
Your log file states "Crash report written to 'F:\Face Swap\faceswap\crash_report.2020.06.12.151752362351.log'. You MUST provide this file if seeking assistance." Post your crash report, not the log file.
Re: Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
deephomage wrote: ↑ Fri Jun 12, 2020 9:12 pm: Your log file states "Crash report written to 'F:\Face Swap\faceswap\crash_report.2020.06.12.151752362351.log'. You MUST provide this file if seeking assistance." Post your crash report, not the log file.
I attached it in the original post.
Re: Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
Try lowering your batch size to 32 and closing all other programs that might be using GPU memory, especially web browsers.
Re: Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
That didn't help either; I even lowered the batch size all the way down to one. It had been working for many hours, but after I shut down my computer overnight and restarted it, it started doing this.
I also made a trace-level crash report in case that helps narrow it down.
Attachments: crash_report.2020.06.12.223938299862.log (87.63 KiB)
Re: Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
Ultimately this is an issue in Tensorflow/Cuda.
Generally, this indicates a GPU memory issue, which Allow Growth normally fixes.
Try with a very low batch size, Allow Growth ON, Memory Saving Gradients OFF.
If this doesn't work, try removing faceswap and the environment and training again.
If it still doesn't work, you will need to google around this issue for potential solutions, as it is occurring upstream of Faceswap.
Re: Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
Thanks for the help. Some combination of all of those fixed it.