I just set up a new workstation with Ubuntu Server 19.10 and an RTX 2060 Super. After installing faceswap and copying over some Trump/Cage test photos, I ran a python setup.py train command which crashed.
Crash log attached, any clues appreciated.
PS - This is my second workstation. The first one was Ubuntu Desktop 19.10 with a GTX 960, and I was able to get that to work - very slowly.
Thanks - you were right that those packages weren't installed. I installed them and ran the train command again. This time I hit two problems:
1. An unknown error: 2020-02-06 16:06:37.244458: E tensorflow/stream_executor/cuda/cuda_driver.cc:318] failed call to cuInit: UNKNOWN ERROR (303)
(more context in attached screenshot)
2. Painfully slow training
When I ran my initial tests on my first installation (Ubuntu Desktop 19.10 + GTX 960), training took about 3s/iteration. This installation (Ubuntu Server 19.10 + RTX 2060 Super) is taking about 30s/iteration. After about 7-8 minutes, I still wasn't up to iteration #20...
Attachments
Screenshot from 2020-02-06 08-10-52.png (114.03 KiB)
Last edited by korrupt78 on Thu Feb 06, 2020 4:19 pm, edited 1 time in total.
2020-02-06 16:06:37.244458: E tensorflow/stream_executor/cuda/cuda_driver.cc:318] failed call to cuInit: UNKNOWN ERROR (303)
This error means that Cuda failed to initialize. We'd need the crash report to properly diagnose it, but please make sure you're using the latest version of the official Nvidia graphics driver.
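One quick way to see whether the official driver is installed and responding is to query `nvidia-smi`, the command-line tool that ships with the Nvidia driver. The following is only a minimal sketch using the Python standard library; the function name `driver_status` is made up for illustration and is not part of faceswap.

```python
import shutil
import subprocess


def driver_status(timeout=10):
    """Return a short string describing whether the NVIDIA driver responds.

    Hypothetical helper for illustration; it only wraps the nvidia-smi tool.
    """
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # nvidia-smi is installed together with the proprietary driver.
        return "nvidia-smi not found: the NVIDIA driver is probably not installed"
    try:
        result = subprocess.run(
            [exe, "--query-gpu=name,driver_version", "--format=csv,noheader"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired) as err:
        return "nvidia-smi could not be run: {}".format(err)
    if result.returncode != 0:
        # A non-zero exit code usually means the tool cannot talk to the driver.
        message = result.stderr.strip() or result.stdout.strip()
        return "nvidia-smi failed: " + message
    return "driver OK: " + result.stdout.strip()


if __name__ == "__main__":
    print(driver_status())
```

If this reports that `nvidia-smi` is missing or failing, the problem is likely at the driver level rather than in faceswap or TensorFlow.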
In this case no crash report was generated because it didn't crash - in fact it's still running now, as you can see in the above screenshot.
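A run that keeps going after a failed cuInit has usually fallen back to the CPU, which would be consistent with the roughly tenfold slowdown reported earlier in the thread. A rough way to check is to ask TensorFlow which devices it can see. The sketch below assumes TensorFlow is installed in the same Python environment as faceswap; `visible_devices` and `has_gpu` are illustrative names, not faceswap functions.

```python
def visible_devices():
    """Return the device names TensorFlow can see, or None if it is not importable."""
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    return [device.name for device in device_lib.list_local_devices()]


def has_gpu(device_names):
    """True if any name is a regular GPU device such as '/device:GPU:0'."""
    return any(":GPU:" in name for name in device_names)


if __name__ == "__main__":
    names = visible_devices()
    if names is None:
        print("TensorFlow is not importable in this environment")
    else:
        print("devices:", names)
        print("GPU visible:", has_gpu(names))
```

If only CPU devices are listed, training is almost certainly running on the CPU.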
As for the Nvidia drivers, is there a good reference for the recommended way to do that on Ubuntu Server? I've been googling for the last few minutes, and the answers are all over the place - and most of them seem wrong or out of date.
I tried downloading and running NVIDIA-Linux-x86_64-440.59.run from GeForce.com, but when I ran it I got this error:
ERROR: The Nouveau kernel driver is currently in use by your system. This driver is incompatible with the NVIDIA driver, and must be disabled before proceeding. Please consult the NVIDIA driver README and your Linux distribution's documentation for details on how to correctly disable the Nouveau kernel driver.
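To confirm what the installer is complaining about, you can check whether the `nouveau` kernel module is currently loaded by reading `/proc/modules`. This is a minimal sketch that only detects the module; the function names are hypothetical. The remedy usually described is to blacklist the module and rebuild the initramfs, but the exact steps vary by distribution, so follow the NVIDIA README and the Ubuntu documentation as the error message says.

```python
def loaded_modules(proc_modules_text):
    """Parse the text of /proc/modules and return the set of loaded module names."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def nouveau_in_use(path="/proc/modules"):
    """Return True or False, or None if the file cannot be read (e.g. not on Linux)."""
    try:
        with open(path) as handle:
            return "nouveau" in loaded_modules(handle.read())
    except OSError:
        return None


if __name__ == "__main__":
    print("nouveau loaded:", nouveau_in_use())
```

After disabling Nouveau and rebooting, the same check should report `False` before the Nvidia installer is run again.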
The last command kicked off the installation of 533 packages, including nvidia driver 440, and a rebuild of the kernel. I gotta run, and it's not done - will report back when I get back.
Screw it. I'm wiping and starting over with Ubuntu Desktop instead of Server, since it already worked once on my other box, and the Nvidia driver requires installing most of the Desktop packages anyway. Hopefully that'll let me skip past all these new problems.
I'm sorry, I don't know what that means. Can you elaborate?
BTW - I'm not wedded to Ubuntu 19.10, I just figured it's best to use the newest distro. If there's a different Ubuntu version recommended by the faceswap team, I'm happy to start over with that if it will make things work smoothly.
Just thought to check python faceswap.py train -h for options and noticed the one for allow growth. Currently running with it enabled. No crash so far. Will keep my fingers crossed until I see some output, but thanks for getting me this far!
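For context, "allow growth" in TensorFlow generally means the GPU memory pool grows on demand instead of being reserved up front. The sketch below shows how that setting is typically enabled directly in TensorFlow. It assumes a TensorFlow version that provides `tf.config.experimental`, and it is not necessarily how faceswap implements its own option; `enable_memory_growth` is an illustrative name.

```python
def enable_memory_growth():
    """Ask TensorFlow to allocate GPU memory on demand.

    Returns the number of GPUs configured, or None if TensorFlow is not
    importable. This has to run before TensorFlow initializes the GPUs.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    gpus = tf.config.experimental.list_physical_devices("GPU")
    for gpu in gpus:
        # Grow the allocation as needed rather than grabbing all memory at start.
        tf.config.experimental.set_memory_growth(gpu, True)
    return len(gpus)


if __name__ == "__main__":
    print("GPUs configured for memory growth:", enable_memory_growth())
```

A result of `0` means TensorFlow found no GPU, in which case the setting has no effect.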