Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED

If training is failing to start, and you are not receiving an error message telling you what to do, tell us about it here


Hollywood
Posts: 18
Joined: Thu Jun 11, 2020 2:52 pm
Has thanked: 2 times

Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED

Post by Hollywood »

When I try to start training I get these error messages:
E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
Does anyone have any idea how to fix this?
(I attached the crash report.)

Attachments
crash_report.2020.06.11.180042299945.log
(87.59 KiB) Downloaded 241 times

Answer by torzdf » Sat Jun 13, 2020 9:35 am (full post below)

torzdf
Posts: 2649
Joined: Fri Jul 12, 2019 12:53 am
Answers: 159
Has thanked: 128 times
Been thanked: 623 times

Re: Crash While Beginning Training

Post by torzdf »

Enable the "Allow Growth" option

My word is final

Re: Crash While Beginning Training

Post by Hollywood »

torzdf wrote: Fri Jun 12, 2020 8:22 am

Enable the "Allow Growth" option

That didn't work. I've included my log in case it helps with debugging.

Attachments
faceswap.log
(10.85 KiB) Downloaded 231 times
deephomage
Posts: 33
Joined: Fri Jul 12, 2019 6:09 pm
Answers: 1
Has thanked: 2 times
Been thanked: 8 times

Re: Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED

Post by deephomage »

Your log file states "Crash report written to 'F:\Face Swap\faceswap\crash_report.2020.06.12.151752362351.log'. You MUST provide this file if seeking assistance." Post your crash report, not the log file.

Re: Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED

Post by Hollywood »

deephomage wrote: Fri Jun 12, 2020 9:12 pm

Your log file states "Crash report written to 'F:\Face Swap\faceswap\crash_report.2020.06.12.151752362351.log'. You MUST provide this file if seeking assistance." Post your crash report, not the log file.

I attached it in the original post.

Re: Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED

Post by deephomage »

Try lowering your batch size to 32 and closing all other programs that might be using GPU memory, especially web browsers.
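
To see how much GPU memory is already taken before training starts, one option is to query `nvidia-smi`. The sketch below is only an illustration: it assumes an NVIDIA driver with `nvidia-smi` on the PATH, and the helper names `parse_gpu_memory` and `query_gpu_memory` are made up for this example and are not part of Faceswap.

```python
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'used, total' rows (in MiB) from nvidia-smi's CSV output."""
    rows = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        rows.append((used, total))
    return rows


def query_gpu_memory():
    """Run nvidia-smi and return [(used_mib, total_mib), ...], one tuple per GPU.

    Assumes nvidia-smi is installed and on the PATH (hypothetical setup).
    """
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_memory(result.stdout)


# Example use (needs an NVIDIA GPU, so it is left commented out):
# for index, (used, total) in enumerate(query_gpu_memory()):
#     print(f"GPU {index}: {used} MiB used of {total} MiB")
```

If the used figure is already high with Faceswap closed, something else (often a browser) is holding GPU memory.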

Re: Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED

Post by Hollywood »

That didn't help either, even after I lowered it all the way down to 1. It had been working for many hours before; then I shut down my computer overnight, and when I tried to restart training it started doing this.
I also made a trace-level crash report in case that helps narrow it down.

Attachments
crash_report.2020.06.12.223938299862.log
(87.63 KiB) Downloaded 222 times

Re: Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED

Post by torzdf »

Ultimately this is an issue in TensorFlow/CUDA.

Generally, this indicates a GPU memory issue, which Allow Growth normally fixes.

Try with a very low batch size, Allow Growth ON, Memory Saving Gradients OFF.

If this doesn't work, try removing faceswap and the environment, reinstalling, and training again.

If it still doesn't work, you will need to google around this issue for potential solutions, as it is occurring upstream from Faceswap.
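
For anyone searching upstream: in plain TensorFlow, on-demand GPU memory allocation is usually switched on roughly as sketched below. This is a sketch under assumptions, not Faceswap's own code. It assumes a TensorFlow release that reads the `TF_FORCE_GPU_ALLOW_GROWTH` environment variable and, for the function, a 2.x release that provides `tf.config.list_physical_devices`. The function name `enable_memory_growth` is made up for this example, and whether Faceswap's "Allow Growth" option does exactly this is an assumption.

```python
import os

# Set before TensorFlow touches the GPU; TensorFlow reads this variable
# when it creates its GPU devices.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"


def enable_memory_growth():
    """Ask TensorFlow 2.x to allocate GPU memory on demand.

    Returns the number of GPUs configured (0 if TensorFlow is not installed).
    Must be called before any GPU has been initialised, otherwise
    TensorFlow raises a RuntimeError.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return 0
    gpus = tf.config.list_physical_devices("GPU")
    for gpu in gpus:
        tf.config.experimental.set_memory_growth(gpu, True)
    return len(gpus)
```

With memory growth on, TensorFlow starts with a small allocation and expands it as needed instead of reserving nearly all GPU memory up front. That matters when other programs already hold part of that memory.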

My word is final

Re: Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED

Post by Hollywood »

Thanks for the help. Some combination of all of those fixed it.
