can't get working on new workstation

korrupt78
Posts: 49
Joined: Wed Jan 29, 2020 1:34 am
Has thanked: 2 times

Post by korrupt78 » Thu Feb 06, 2020 10:31 am

I just set up a new workstation with Ubuntu Server 19.10 and an RTX 2060 Super. After installing faceswap and copying over some Trump/Cage test photos, I ran a python faceswap.py train command, which crashed.

Crash log attached, any clues appreciated.

PS - This is my second workstation. The first one was Ubuntu Desktop 19.10 with a GTX 960, and I was able to get that to work - very slowly.
Attachments
crash_report.2020.02.06.102034694574.log
(14.83 KiB) Downloaded 37 times

torzdf
Posts: 547
Joined: Fri Jul 12, 2019 12:53 am
Answers: 86
Has thanked: 16 times
Been thanked: 120 times

Post by torzdf » Thu Feb 06, 2020 12:15 pm

Faceswap isn't tested on Ubuntu 19.10. It should be fine, but I always run the LTS versions.

That said, the crash comes from a missing package, which may be related to 19.10. Install the following:

Code: Select all

apt-get install -y libsm6 libxext6 libxrender-dev
My word is final
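
Before installing, it can help to see which of those packages are actually absent. The following is a minimal sketch, assuming a Debian/Ubuntu system where dpkg-query is available; the function name missing_packages is made up for illustration and is not part of faceswap.

```shell
# Print each package named in the arguments that is not fully installed.
# Assumes a Debian/Ubuntu system; missing_packages is a hypothetical helper.
missing_packages() {
    for pkg in "$@"; do
        # dpkg-query prints "install ok installed" only for installed packages
        dpkg-query -W -f='${Status}' "$pkg" 2>/dev/null |
            grep -q '^install ok installed$' || echo "$pkg"
    done
}

# Usage, with the package names from the post above:
#   missing_packages libsm6 libxext6 libxrender-dev
```

Anything the function prints still needs to be installed; no output means all listed packages are present.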

Post by korrupt78 » Thu Feb 06, 2020 4:14 pm

torzdf wrote:
Thu Feb 06, 2020 12:15 pm

Code: Select all

apt-get install -y libsm6 libxext6 libxrender-dev
Thanks - you were right that those packages weren't installed. I installed them and ran the train command again. This time I hit two problems:

1. An unknown error:
2020-02-06 16:06:37.244458: E tensorflow/stream_executor/cuda/cuda_driver.cc:318] failed call to cuInit: UNKNOWN ERROR (303)
(more context in attached screenshot)

2. Painfully slow training:
When I ran my initial tests on my first installation (with Ubuntu Desktop 19.10 + GTX 960), it took about 3s/iteration to train. This installation (Ubuntu Server 19.10 + RTX 2060 Super) is taking about 30s/iteration. After about 7-8m, I still wasn't up to iteration #20...
Attachments
Screenshot from 2020-02-06 08-10-52.png (114.03 KiB) Viewed 1547 times

bryanlyon
Site Admin
Posts: 273
Joined: Fri Jul 12, 2019 12:49 am
Answers: 20
Location: San Francisco
Has thanked: 3 times
Been thanked: 77 times

Post by bryanlyon » Thu Feb 06, 2020 4:18 pm

Code: Select all

2020-02-06 16:06:37.244458: E tensorflow/stream_executor/cuda/cuda_driver.cc:318] failed call to cuInit: UNKNOWN ERROR (303)
This error means that CUDA failed to initialize. We'd need the crash report to diagnose it properly, but please make sure you're using the latest version of the official NVIDIA graphics driver.
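
One way to check the driver is sketched below. It assumes nvidia-smi is on the PATH, which is normally the case once the official driver is installed. Both function names are hypothetical helpers written for this example, not faceswap or NVIDIA tooling.

```shell
# Print the driver version reported by nvidia-smi, or fail with a hint
# when the tool is missing (which usually means no official driver).
nvidia_driver_version() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found; is the official NVIDIA driver installed?" >&2
        return 1
    fi
    nvidia-smi --query-gpu=driver_version --format=csv,noheader
}

# Succeed when version $1 (e.g. 440.59) belongs to an older series than $2 (e.g. 440).
driver_is_older_than() {
    [ "${1%%.*}" -lt "$2" ]
}

# Usage:
#   driver_is_older_than "$(nvidia_driver_version)" 440 && echo "driver needs updating"
```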

Post by korrupt78 » Thu Feb 06, 2020 4:35 pm

bryanlyon wrote:
Thu Feb 06, 2020 4:18 pm
This error means that Cuda failed to initialize. We'd need the crash report to properly diagnose it, but please make sure you're using the latest version of the official Nvidia graphics driver.
In this case no crash report was generated because it didn't crash - in fact it's still running now, as you can see in the above screenshot.

As for the nvidia drivers, is there a good reference for the recommended way to do that on Ubuntu Server? I've been googling for the last few min, and the answers are all over the place - and most of them seem wrong or out of date.

Post by korrupt78 » Thu Feb 06, 2020 4:40 pm

korrupt78 wrote:
Thu Feb 06, 2020 4:35 pm
As for the nvidia drivers, is there a good reference for the recommended way to do that on Ubuntu Server? I've been googling for the last few min, and the answers are all over the place - and most of them seem wrong or out of date.
I tried downloading and running NVIDIA-Linux-x86_64-440.59.run from GeForce.com, but when I ran it I got this error:

ERROR: The Nouveau kernel driver is currently in use by your system. This driver is incompatible with the NVIDIA driver, and must be disabled before proceeding. Please consult the NVIDIA driver README and your Linux distribution's documentation for details on how to correctly disable the Nouveau kernel driver.

Post by korrupt78 » Thu Feb 06, 2020 4:48 pm

Currently attempting to follow the instructions in this answer:

https://askubuntu.com/a/1130926/122045

The last command kicked off the installation of 533 packages, including nvidia driver 440, and a rebuild of the kernel. I gotta run, and it's not done - will report back when I get back.

Post by bryanlyon » Thu Feb 06, 2020 4:49 pm

It's best to disable the Nouveau driver and install using the .run file.
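
Disabling Nouveau is commonly done with a modprobe blacklist file. The sketch below shows that approach. The file name blacklist-nouveau.conf and the helper write_nouveau_blacklist are assumptions made for this example, and the initramfs step applies to Ubuntu-style systems.

```shell
# Write a modprobe config that stops the nouveau module from loading.
# The target path is a parameter so the file can be inspected first;
# write_nouveau_blacklist is a hypothetical helper.
write_nouveau_blacklist() {
    printf 'blacklist nouveau\noptions nouveau modeset=0\n' > "$1"
}

# Usage, as root, followed by a reboot before running the .run installer:
#   write_nouveau_blacklist /etc/modprobe.d/blacklist-nouveau.conf
#   update-initramfs -u
```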

Post by torzdf » Thu Feb 06, 2020 5:09 pm

Honestly, imho the easiest and best way is:

Check the latest available graphics driver here:
https://launchpad.net/~graphics-drivers ... ubuntu/ppa

Code: Select all

sudo apt purge 'nvidia*'
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
sudo apt install nvidia-driver-440
Replace 440 in the last line with the latest available driver version.
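
To find the latest series without browsing the PPA page, the package list can be filtered locally. This is a rough sketch, assuming the driver packages follow the nvidia-driver-NNN naming seen above; latest_nvidia_driver is a hypothetical helper.

```shell
# Read `apt-cache search` lines on stdin and print the nvidia-driver-NNN
# package with the highest series number.
latest_nvidia_driver() {
    awk '$1 ~ /^nvidia-driver-[0-9]+$/ { print $1 }' |
        sort -t- -k3,3n |
        tail -n 1
}

# Usage, after adding the PPA and running apt update:
#   apt-cache search --names-only '^nvidia-driver-[0-9]+$' | latest_nvidia_driver
```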

Post by korrupt78 » Thu Feb 06, 2020 9:08 pm

I finished the install and rebooted. The output of lsmod now shows many nvidia modules and nothing called Nouveau, so I assume that means it worked.
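
The lsmod check described above can be scripted. This is a minimal sketch, assuming the proprietary modules all have names beginning with nvidia; gpu_driver_from_lsmod is a made-up helper name.

```shell
# Read `lsmod` output on stdin and name the GPU kernel driver that is loaded.
# nouveau takes priority in the report because it blocks the NVIDIA driver.
gpu_driver_from_lsmod() {
    awk '
        $1 == "nouveau" { found_nouveau = 1 }
        $1 ~ /^nvidia/  { found_nvidia = 1 }
        END {
            if (found_nouveau)     print "nouveau"
            else if (found_nvidia) print "nvidia"
            else                   print "none"
        }'
}

# Usage:
#   lsmod | gpu_driver_from_lsmod
```

A result of nvidia only shows that the kernel modules are loaded; it does not by itself prove that CUDA initializes correctly.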

However, I now get a different error. Screenshot and log attached.
Attachments
Screenshot from 2020-02-06 13-09-45.png (75.17 KiB) Viewed 1535 times
Screenshot from 2020-02-06 13-06-48.png (144.35 KiB) Viewed 1535 times
crash_report.2020.02.06.205507327985.log
(60.83 KiB) Downloaded 27 times

Post by korrupt78 » Thu Feb 06, 2020 9:32 pm

Screw it. I'm wiping and starting over with Ubuntu Desktop instead of Server, since it already worked once on my other box, and the Nvidia driver requires installing most of the Desktop packages anyway. Hopefully that'll let me skip past all these new problems.

Post by korrupt78 » Thu Feb 06, 2020 10:31 pm

Same box (w/RTX 2060 Super), fresh install of Ubuntu Desktop 19.10, installed and configured everything correctly.

First test again failed. Screenshot and logs attached.

I really have no idea what to do now.
Attachments
crash_report.2020.02.06.142814330494.log
(73.05 KiB) Downloaded 27 times
Screenshot from 2020-02-06 14-29-03.png (146.96 KiB) Viewed 1531 times

Post by korrupt78 » Thu Feb 06, 2020 10:47 pm

Decided to try some of the fixes from when I was trying to get it working on Server.

1. I checked the list of packages recommended by torzdf and discovered libxrender-dev wasn't installed, so I installed it. Faceswap still crashed.

2. I noticed the driver was on version 435, so I added the PPA, updated, and upgraded to 440, and rebooted. Faceswap still crashed.

Post by torzdf » Thu Feb 06, 2020 10:49 pm

Enable Allow Growth

Post by korrupt78 » Thu Feb 06, 2020 10:53 pm

torzdf wrote:
Thu Feb 06, 2020 10:49 pm
Enable Allow Growth
I'm sorry, I don't know what that means. Can you elaborate?

BTW - I'm not wedded to Ubuntu 19.10, I just figured it's best to use the newest distro. If there's a different Ubuntu version recommended by the faceswap team, I'm happy to start over with that if it will make things work smoothly.

Post by torzdf » Thu Feb 06, 2020 10:57 pm

ag.jpg (162.45 KiB) Viewed 1524 times

Post by korrupt78 » Thu Feb 06, 2020 10:58 pm

I just thought to check python faceswap.py train -h for options and noticed the one for allow growth. I'm currently running with it enabled. No crash so far. I'll keep my fingers crossed until I see some output, but thanks for getting me this far!
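
For context, allow growth corresponds to a TensorFlow setting that allocates GPU memory as it is needed instead of reserving it all at startup. The exact flag name is not given in this thread, so the sketch below looks it up from the help text rather than guessing it; find_growth_option is a hypothetical helper.

```shell
# Filter help text on stdin down to the lines that mention "growth".
find_growth_option() {
    grep -i 'growth'
}

# Usage, from the faceswap directory:
#   python faceswap.py train -h | find_growth_option
```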

Post by korrupt78 » Thu Feb 06, 2020 11:19 pm

Wow. It's literally ten times faster than what I got used to last week.

That certainly helps make up for lost time... :)
