When I try to start training I get these error messages:
E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
Anyone have any idea how to fix this?
(I attached the crash report)
Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
Attachments: crash_report.2020.06.11.180042299945.log (87.59 KiB)
Ultimately this is an issue in Tensorflow/Cuda.
Generally, this indicates a GPU memory issue, which Allow Growth normally fixes.
Try with a very low batch size, Allow Growth ON, Memory Saving Gradients OFF.
If this doesn't work, try removing faceswap and the environment and training again.
If it still doesn't work, you will need to google around this issue for potential solutions, as it is occurring upstream of Faceswap.
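For reference, "Allow Growth" corresponds to TensorFlow's GPU memory-growth setting, which stops TensorFlow from reserving nearly all GPU memory up front (the usual trigger for these ALLOC_FAILED errors). A minimal sketch of enabling it outside the Faceswap GUI, assuming a TF 2.x install; the environment variable must be set before TensorFlow is imported:

```python
import os

# Force the GPU allocator to start small and grow on demand instead of
# grabbing (almost) all GPU memory at startup. Must be set before
# "import tensorflow" for TensorFlow to pick it up.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"

# Equivalent in-code setting (TF 2.x); must run before any GPU op:
# import tensorflow as tf
# for gpu in tf.config.list_physical_devices("GPU"):
#     tf.config.experimental.set_memory_growth(gpu, True)

print(os.environ["TF_FORCE_GPU_ALLOW_GROWTH"])
```

In Faceswap itself you don't need this snippet; ticking Allow Growth in the GUI applies the same setting.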
Re: Crash While Beginning Training
That didn't work. I've included my log in case it helps with debugging.
Attachments: faceswap.log (10.85 KiB)
- deephomage
- Posts: 33
- Joined: Fri Jul 12, 2019 6:09 pm
- Has thanked: 2 times
- Been thanked: 8 times
Re: Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
Your log file states "Crash report written to 'F:\Face Swap\faceswap\crash_report.2020.06.12.151752362351.log'. You MUST provide this file if seeking assistance." Post your crash report, not the log file.
Re: Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
deephomage wrote: ↑ Fri Jun 12, 2020 9:12 pm: Your log file states "Crash report written to 'F:\Face Swap\faceswap\crash_report.2020.06.12.151752362351.log'. You MUST provide this file if seeking assistance." Post your crash report, not the log file.
I attached it in the original post.
Re: Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
Try lowering your batch size to 32 and closing all other programs that might be using GPU memory, especially web browsers.
Re: Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
That didn't help either; I even lowered the batch size all the way down to one. It had been working for many hours, but after I shut down my computer overnight and restarted it, it started doing this.
I also made a trace-level crash report in case that helps narrow it down.
Attachments: crash_report.2020.06.12.223938299862.log (87.63 KiB)
Re: Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
Ultimately this is an issue in Tensorflow/Cuda.
Generally, this indicates a GPU memory issue, which Allow Growth normally fixes.
Try with a very low batch size, Allow Growth ON, Memory Saving Gradients OFF.
If this doesn't work, try removing faceswap and the environment and training again.
If it still doesn't work, you will need to google around this issue for potential solutions, as it is occurring upstream of Faceswap.
Re: Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
Thanks for the help. Some combination of all of those fixed it.