can't get working on new workstation

Posted: Thu Feb 06, 2020 10:31 am
by korrupt78

I just set up a new workstation with Ubuntu Server 19.10 and an RTX 2060 Super. After installing faceswap and copying over some Trump/Cage test photos, I ran a python setup.py train command, which crashed.

Crash log attached, any clues appreciated.

PS - This is my second workstation. The first one was Ubuntu Desktop 19.10 with a GTX 960, and I was able to get that to work - very slowly.


Re: can't get working on new workstation

Posted: Thu Feb 06, 2020 12:15 pm
by torzdf

Faceswap isn't tested on Ubuntu 19.10 (should be fine, but I always run the LTS versions)....

That said, this is a missing package, which may be related to 19.10.

Code: Select all

apt-get install -y libsm6 libxext6 libxrender-dev

Re: can't get working on new workstation

Posted: Thu Feb 06, 2020 4:14 pm
by korrupt78
torzdf wrote: Thu Feb 06, 2020 12:15 pm

Code: Select all

apt-get install -y libsm6 libxext6 libxrender-dev

Thanks - you were right that those packages weren't installed. I installed them and ran the train command again. This time I hit two problems:

  1. An unknown error:
    2020-02-06 16:06:37.244458: E tensorflow/stream_executor/cuda/cuda_driver.cc:318] failed call to cuInit: UNKNOWN ERROR (303)
    (more context in attached screenshot)

  2. Painfully slow:
    When I ran my initial tests on my first installation (Ubuntu Desktop 19.10 + GTX 960), training took about 3s/iteration. This installation (Ubuntu Server 19.10 + RTX 2060 Super) is taking about 30s/iteration. After about 7-8 minutes, I still wasn't up to iteration #20...


Re: can't get working on new workstation

Posted: Thu Feb 06, 2020 4:18 pm
by bryanlyon

Code: Select all

2020-02-06 16:06:37.244458: E tensorflow/stream_executor/cuda/cuda_driver.cc:318] failed call to cuInit: UNKNOWN ERROR (303)

This error means that CUDA failed to initialize. We'd need the crash report to properly diagnose it, but please make sure you're using the latest version of the official Nvidia graphics driver.


Re: can't get working on new workstation

Posted: Thu Feb 06, 2020 4:35 pm
by korrupt78
bryanlyon wrote: Thu Feb 06, 2020 4:18 pm

This error means that CUDA failed to initialize. We'd need the crash report to properly diagnose it, but please make sure you're using the latest version of the official Nvidia graphics driver.

In this case no crash report was generated because it didn't crash - in fact it's still running now, as you can see in the above screenshot.

As for the nvidia drivers, is there a good reference for the recommended way to do that on Ubuntu Server? I've been googling for the last few min, and the answers are all over the place - and most of them seem wrong or out of date.


Re: can't get working on new workstation

Posted: Thu Feb 06, 2020 4:40 pm
by korrupt78
korrupt78 wrote: Thu Feb 06, 2020 4:35 pm

As for the nvidia drivers, is there a good reference for the recommended way to do that on Ubuntu Server? I've been googling for the last few min, and the answers are all over the place - and most of them seem wrong or out of date.

I tried downloading and running NVIDIA-Linux-x86_64-440.59.run from GeForce.com, but when I ran it I got this error:

ERROR: The Nouveau kernel driver is currently in use by your system. This driver is incompatible with the NVIDIA driver, and must be disabled before proceeding. Please consult the NVIDIA driver README and your Linux distribution's documentation for details on how to correctly disable the Nouveau kernel driver.


Re: can't get working on new workstation

Posted: Thu Feb 06, 2020 4:48 pm
by korrupt78

Currently attempting to follow the instructions in this answer:

https://askubuntu.com/a/1130926/122045

The last command kicked off the installation of 533 packages, including nvidia driver 440, and a rebuild of the kernel. I gotta run, and it's not done - will report back when I get back.


Re: can't get working on new workstation

Posted: Thu Feb 06, 2020 4:49 pm
by bryanlyon

It's best to disable the Nouveau driver and install using the .run file.
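
For anyone following along, a commonly documented way to disable Nouveau is to blacklist the module, rebuild the initramfs, and reboot. The following is only a minimal sketch, assuming a stock Ubuntu install. The helper name `write_nouveau_blacklist` and the file name `blacklist-nouveau.conf` are illustrative choices, not something from this thread.

```shell
# Hypothetical helper: write a modprobe blacklist for nouveau into the given
# directory (defaults to /etc/modprobe.d, which requires root).
write_nouveau_blacklist() {
    dir="${1:-/etc/modprobe.d}"
    printf 'blacklist nouveau\noptions nouveau modeset=0\n' \
        > "$dir/blacklist-nouveau.conf"
}

# Typical use (as root), followed by an initramfs rebuild and a reboot:
#   write_nouveau_blacklist
#   update-initramfs -u
#   reboot
```

After the reboot, the .run installer should no longer complain that Nouveau is in use.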


Re: can't get working on new workstation

Posted: Thu Feb 06, 2020 5:09 pm
by torzdf

Honestly, imho the easiest and best way is:

Check the latest available graphics driver here:
https://launchpad.net/~graphics-drivers ... ubuntu/ppa

Code: Select all

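# Remove existing NVIDIA packages (quoted so the shell doesn't expand the glob)
sudo apt purge 'nvidia*'
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
sudo apt install nvidia-driver-440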

Replace 440 in the last line with the latest available version.


Re: can't get working on new workstation

Posted: Thu Feb 06, 2020 9:08 pm
by korrupt78

I finished the install and rebooted. The output of lsmod now shows many nvidia modules and nothing called Nouveau, so I assume that means it worked.
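
A rough way to script that lsmod check is sketched below. The function name `gpu_driver_status` is made up for illustration, and it only looks at module names, so it says nothing about whether CUDA itself initializes.

```shell
# Hypothetical helper: read `lsmod` output on stdin and report which
# GPU kernel driver appears to be loaded.
gpu_driver_status() {
    awk '
        $1 == "nouveau" { nouveau = 1 }
        $1 ~ /^nvidia/  { nvidia = 1 }
        END {
            if (nouveau)     print "nouveau still loaded"
            else if (nvidia) print "nvidia loaded"
            else             print "no GPU driver module found"
        }'
}

# Usage:
#   lsmod | gpu_driver_status
```

If the driver is working, nvidia-smi (installed with the driver) should also list the card and the driver version.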

However, I now get a different error. Screenshot and log attached.


Re: can't get working on new workstation

Posted: Thu Feb 06, 2020 9:32 pm
by korrupt78

Screw it. I'm wiping and starting over with Ubuntu Desktop instead of Server, since it already worked once on my other box, and the Nvidia driver requires installing most of the Desktop packages anyway. Hopefully that'll let me skip past all these new problems.


Re: can't get working on new workstation

Posted: Thu Feb 06, 2020 10:31 pm
by korrupt78

Same box (w/RTX 2060 Super), fresh install of Ubuntu Desktop 19.10, installed and configured everything correctly.

First test again failed. Screenshot and logs attached.

I really have no idea what to do now.


Re: can't get working on new workstation

Posted: Thu Feb 06, 2020 10:47 pm
by korrupt78

Decided to try some of the fixes from when I was trying to get it working on Server.

  1. I checked the list of packages recommended by torzdf and discovered libxrender-dev wasn't installed, so I installed it. Faceswap still crashed.

  2. I noticed the driver was on version 435, so I added the PPA, updated, and upgraded to 440, and rebooted. Faceswap still crashed.


Re: can't get working on new workstation

Posted: Thu Feb 06, 2020 10:49 pm
by torzdf

Enable Allow Growth


Re: can't get working on new workstation

Posted: Thu Feb 06, 2020 10:53 pm
by korrupt78
torzdf wrote: Thu Feb 06, 2020 10:49 pm

Enable Allow Growth

I'm sorry, I don't know what that means. Can you elaborate?

BTW - I'm not wedded to Ubuntu 19.10, I just figured it's best to use the newest distro. If there's a different Ubuntu version recommended by the faceswap team, I'm happy to start over with that if it will make things work smoothly.


Re: can't get working on new workstation

Posted: Thu Feb 06, 2020 10:57 pm
by torzdf
[Attachment: ag.jpg (162.45 KiB)]

Re: can't get working on new workstation

Posted: Thu Feb 06, 2020 10:58 pm
by korrupt78

Just thought to check python faceswap.py train -h for options and noticed the one for allow growth. Currently running with it enabled. No crash so far. Will keep my fingers crossed until I see some output, but thanks for getting me this far!
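
For context, "allow growth" is a TensorFlow setting that makes it claim GPU memory as needed instead of reserving nearly all of it at start-up. TensorFlow versions that support it also read an environment variable for the same behaviour. The sketch below uses that variable; whether faceswap's own option maps to exactly this setting is an assumption, so treat it as an alternative to try rather than the official method.

```shell
# TensorFlow checks this variable at start-up; "true" makes it allocate GPU
# memory on demand rather than grabbing it all up front.
export TF_FORCE_GPU_ALLOW_GROWTH=true

# Then launch training as usual from the same shell (other arguments omitted):
#   python faceswap.py train ...
```

The exact name of faceswap's own allow growth option is listed in the output of python faceswap.py train -h.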


Re: can't get working on new workstation

Posted: Thu Feb 06, 2020 11:19 pm
by korrupt78

Wow. It's literally ten times faster than what I got used to last week.

That certainly helps make up for lost time... :)