Exclude GPU has wrong device IDs


Post by acaint »

Hello,

When using the "Exclude GPU" checkboxes, you can see the GPU name by hovering the mouse over a checkbox.
However, the GPU IDs those checkboxes indicate appear to be wrong.

Example:
I have two GPUs, a 2060S and a 2080Ti.
I want to exclude the 2060S so that extraction runs on the 2080Ti.
The checkboxes indicate 0: 2060S and 1: 2080Ti, so I tick the checkbox for ID 0 to force the operation onto the 2080Ti.
This generates a script which contains

-X 0

as it should.

However, in reality this excluded the 2080Ti and tried to use the already occupied 2060S for the execution. This can be seen in the GPU utilization: the 2080Ti remained inactive with almost no memory allocated.
When I changed the script to

-X 1

it worked fine and the work was processed on the 2080Ti. In Task Manager I could also see that GPU memory on the 2080Ti was allocated.

The IDs reported by Windows Task Manager are GPU 0: 2080Ti and GPU 2: 2060S (GPU 1 is the motherboard-integrated GPU).

BR,
A
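
One possible explanation, offered as an assumption rather than a confirmed diagnosis: CUDA's default enumeration (CUDA_DEVICE_ORDER=FASTEST_FIRST) can differ from the PCI bus order that nvidia-smi and NVML use. If the checkbox labels come from one ordering and the exclusion is applied in the other, the same index points at different cards. The sketch below only models that idea in plain Python. The Gpu class, the speed_rank field and the 2080Ti bus ID are hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gpu:
    name: str
    pci_bus_id: str  # the 2080Ti value below is made up for illustration
    speed_rank: int  # lower = faster; a stand-in for CUDA's own heuristic


def pci_bus_order(gpus):
    """Ordering by PCI bus ID, as nvidia-smi / NVML list devices."""
    return sorted(gpus, key=lambda g: g.pci_bus_id)


def fastest_first_order(gpus):
    """Rough model of CUDA's default FASTEST_FIRST ordering."""
    return sorted(gpus, key=lambda g: g.speed_rank)


def exclude(ordered, excluded_ids):
    """Drop the devices whose position in `ordered` is in `excluded_ids`."""
    return [g for i, g in enumerate(ordered) if i not in excluded_ids]


GPUS = [
    Gpu("2080Ti", "0000:09:00.0", speed_rank=0),  # hypothetical bus ID
    Gpu("2060S", "0000:08:00.0", speed_rank=1),
]

if __name__ == "__main__":
    # Same "-X 0", two different outcomes depending on the ordering used.
    print([g.name for g in exclude(pci_bus_order(GPUS), {0})])        # ['2080Ti']
    print([g.name for g in exclude(fastest_first_order(GPUS), {0})])  # ['2060S']
```

Under these assumptions, "-X 0" leaves the 2080Ti when indices follow PCI bus order but leaves the 2060S when they follow fastest-first order, which would match the behaviour described above.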

Re: Exclude GPU has wrong device IDs

Post by acaint »

Looking at the logs, something else appears to be mixed up.

Configuration in faceswap:
ID 0 = 2060 <-- excluded with checkbox
ID 1 = 2080Ti

It starts by checking the free memory of the 2080Ti.
But then TensorFlow puts the job on ID 0, that is, the 2060.

Judging by memory consumption and utilization, the extraction is indeed done by the 2060.
The 2080Ti is busy rendering at this time.
If I swap the checkboxes, it checks the 2060's memory and then tries to put the job on the 2080Ti, which immediately jams because the card is already 100% utilized by another process.

It appears that this causes faceswap to switch to serial processing, which is much slower.
I tried the same settings without the 2080Ti being occupied. In that case it runs in parallel (1-pass extraction) on the 2060. So does it check the memory of the wrong card? Or put the TensorFlow job on the wrong card?
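
A possible workaround to try, untested here: force CUDA to enumerate devices in PCI bus order so that CUDA indices line up with what nvidia-smi reports. CUDA_DEVICE_ORDER and CUDA_VISIBLE_DEVICES are standard CUDA environment variables, and they must be set before anything initializes CUDA (for example before TensorFlow is imported). The helper name pin_cuda_devices is hypothetical, and whether faceswap honors or overrides these variables is an assumption that has not been verified.

```python
import os


def pin_cuda_devices(visible_ids, env=None):
    """Set CUDA env vars so indices follow PCI bus order.

    visible_ids: CUDA indices (in PCI bus order) that should stay visible.
    env: mapping to modify; defaults to os.environ. Hypothetical helper.
    """
    env = os.environ if env is None else env
    env["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in visible_ids)
    return env


if __name__ == "__main__":
    # Keep only index 1 visible; run this before importing TensorFlow.
    pin_cuda_devices([1])
    print(os.environ["CUDA_DEVICE_ORDER"], os.environ["CUDA_VISIBLE_DEVICES"])
```

The same effect can be had by setting the two variables in the shell before launching the program, which avoids touching any code.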

Log here:

04/19/2022 09:05:04 VERBOSE  Alignments filepath: 'N:\Convert\vid\to-align\1_alignments.fsa'
04/19/2022 09:05:04 INFO     Loading Detect from S3Fd plugin...
04/19/2022 09:05:04 VERBOSE  Loading config: 'C:\Users\acaint\faceswap\config\extract.ini'
04/19/2022 09:05:04 INFO     Loading Align from Fan plugin...
04/19/2022 09:05:04 VERBOSE  Loading config: 'C:\Users\acaint\faceswap\config\extract.ini'
04/19/2022 09:05:04 INFO     Loading Mask from Components plugin...
04/19/2022 09:05:04 VERBOSE  Loading config: 'C:\Users\acaint\faceswap\config\extract.ini'
04/19/2022 09:05:04 INFO     Loading Mask from Extended plugin...
04/19/2022 09:05:04 VERBOSE  Loading config: 'C:\Users\acaint\faceswap\config\extract.ini'
04/19/2022 09:05:04 VERBOSE  GeForce RTX 2080 Ti - 4310MB free of 11264MB
04/19/2022 09:05:04 WARNING  Not enough free VRAM for parallel processing. Switching to serial
04/19/2022 09:05:04 INFO     Reset batch sizes due to available VRAM: Detect: 1
04/19/2022 09:05:04 INFO     Starting, this may take a while...
04/19/2022 09:05:04 INFO     Initializing S3FD (Detect)...
04/19/2022 09:05:04 INFO     Setting allow growth for GPU: PhysicalDevice(name='/physical_device:GPU:1', device_type='GPU')
2022-04-19 09:05:04.776504: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-04-19 09:05:05.560745: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 3955 MB memory:  -> device: 1, name: GeForce RTX 2060, pci bus id: 0000:08:00.0, compute capability: 7.5
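
The mismatch in this log can be checked mechanically: the VRAM line names the card whose memory was inspected, and the TensorFlow "Created device" line names the card the job actually landed on. The following is a minimal sketch that assumes the two log line formats look exactly like the ones above. The regular expressions and the function name checked_and_used are illustrative and not part of faceswap.

```python
import re

# Card whose free VRAM was checked, e.g. "GeForce RTX 2080 Ti - 4310MB free of 11264MB"
VRAM_RE = re.compile(
    r"VERBOSE\s+(?P<name>.+?) - (?P<free>\d+)MB free of (?P<total>\d+)MB"
)
# Card TensorFlow created its device on; \s+ tolerates wrapped log lines.
TF_DEVICE_RE = re.compile(
    r"Created device\s+\S+\s+with\s+(?P<mem>\d+)\s+MB memory:\s+->\s+"
    r"device:\s+(?P<ordinal>\d+),\s+name:\s+(?P<name>[^,]+),"
)


def checked_and_used(log_text):
    """Return (card whose VRAM was checked, card TensorFlow used)."""
    vram = VRAM_RE.search(log_text)
    tf_dev = TF_DEVICE_RE.search(log_text)
    return (
        vram.group("name") if vram else None,
        tf_dev.group("name") if tf_dev else None,
    )


# The two relevant lines, copied from the log above.
SAMPLE = (
    "04/19/2022 09:05:04 VERBOSE  GeForce RTX 2080 Ti - 4310MB free of 11264MB\n"
    "2022-04-19 09:05:05.560745: I tensorflow/core/common_runtime/gpu/"
    "gpu_device.cc:1510] Created device "
    "/job:localhost/replica:0/task:0/device:GPU:0 with 3955 MB memory:  "
    "-> device: 1, name: GeForce RTX 2060, pci bus id: 0000:08:00.0, "
    "compute capability: 7.5\n"
)

if __name__ == "__main__":
    checked, used = checked_and_used(SAMPLE)
    if checked != used:
        print(f"Mismatch: VRAM checked on {checked!r}, job placed on {used!r}")
```

For this log the two names differ: memory was checked on the GeForce RTX 2080 Ti while the device was created on the GeForce RTX 2060.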

Re: Exclude GPU has wrong device IDs

Post by torzdf »

This is the first time I have heard of this kind of issue. Everyone else I have discussed this with in a multi-GPU environment tells me it is working fine, so unfortunately there isn't much I can offer or check here.
