CPU Load is 100% and ~40% GPU


VadVergasov
Posts: 10
Joined: Wed Jul 31, 2019 11:25 pm
Been thanked: 1 time

CPU Load is 100% and ~40% GPU

Post by VadVergasov »

As I wrote on GitHub, I have an RTX 2060, an i7-8750H, and 16 GB of RAM. When I started training, I saw that only the CPU was fully loaded, not the GPU. I also used nvidia-smi to monitor the load, and it shows the same as Windows Task Manager -> GPU -> graph set to CUDA.
I have Windows 10 x64 Home 1903, Nvidia Driver 431.60 DCH, and CUDA 10.0 with cuDNN 7.6.2.24.
I installed FaceSwap both from the .exe installer and manually, and there is no difference in performance.
I tried TensorFlow's examples, and they run normally (https://www.tensorflow.org/tutorials/ and https://github.com/tensorflow/tensorflo ... cgan.ipynb). They load the CPU, but not to 100%.
At first I thought it was because FaceSwap ran out of memory, so I decreased the batch size. Now there is no memory problem, but the CPU remains at 100%.
I also tried FakeApp: it loads the CPU at 25% and the GPU at 70%.

bryanlyon
Site Admin
Posts: 793
Joined: Fri Jul 12, 2019 12:49 am
Answers: 44
Location: San Francisco
Has thanked: 4 times
Been thanked: 218 times

Re: CPU Load is 100% and ~40% GPU

Post by bryanlyon »

Which activity were you running when you got these results: Extract, Training, or Convert? Can you please post your faceswap.log?

VadVergasov
Posts: 10
Joined: Wed Jul 31, 2019 11:25 pm
Been thanked: 1 time

Re: CPU Load is 100% and ~40% GPU

Post by VadVergasov »

This happens during both extraction and training. During extraction, the GPU load is only 20%.

VadVergasov
Posts: 10
Joined: Wed Jul 31, 2019 11:25 pm
Been thanked: 1 time

Re: CPU Load is 100% and ~40% GPU

Post by VadVergasov »

I can't attach the log because of its .log extension.

Last edited by VadVergasov on Thu Aug 01, 2019 8:51 am, edited 1 time in total.

VadVergasov
Posts: 10
Joined: Wed Jul 31, 2019 11:25 pm
Been thanked: 1 time

Re: CPU Load is 100% and ~40% GPU

Post by VadVergasov »

It happens during extract and training; I don't yet know what happens during convert. My log file is also too large to attach, even after compressing it with 7-Zip: the archive is 2 MB.

torzdf
Posts: 2649
Joined: Fri Jul 12, 2019 12:53 am
Answers: 159
Has thanked: 128 times
Been thanked: 623 times

Re: CPU Load is 100% and ~40% GPU

Post by torzdf »

Ok, this is strange....

Extract can feasibly max out the CPU, as it leverages multiprocessing quite heavily. Training really shouldn't. It spins off 2 processes (one for data augmentation on each side) in addition to the main process. If you are using the GUI, that adds another process, which can be quite heavy when calculating the stats for the graph.

That said, you can test this by running from the CLI and seeing whether it still maxes out your CPU.

The next thing is that you say the log file is 2 MB zipped. This sounds way too big and suggests that you have TRACE logging enabled. TRACE adds significant overhead: it logs a LOT and will pretty much slow everything down.
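
To illustrate why the log level matters for log size, here is a minimal sketch using Python's standard logging module. The TRACE level number (5) and the helper name count_log_lines are assumptions made for this example, not taken from the FaceSwap code. It counts how many lines a dummy loop writes at INFO versus TRACE.

```python
import io
import logging

# Assumption: a custom TRACE level below DEBUG (10); the value 5 is illustrative.
TRACE = 5
logging.addLevelName(TRACE, "TRACE")


def count_log_lines(level: int, iterations: int = 1000) -> int:
    """Run a dummy loop and return how many log lines were written at `level`."""
    stream = io.StringIO()
    logger = logging.getLogger(f"demo_{level}")
    logger.setLevel(level)
    logger.propagate = False  # keep the demo output out of the root logger
    handler = logging.StreamHandler(stream)
    logger.addHandler(handler)

    logger.info("starting")
    for i in range(iterations):
        # One message per iteration, as very verbose logging would emit.
        logger.log(TRACE, "iteration %d", i)

    logger.removeHandler(handler)
    return len(stream.getvalue().splitlines())


info_lines = count_log_lines(logging.INFO)
trace_lines = count_log_lines(TRACE)
print(f"INFO: {info_lines} line(s), TRACE: {trace_lines} line(s)")
```

At INFO the per-iteration messages are filtered out before they are formatted or written, which is why a verbose level costs both disk space and time.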

Unfortunately you can't really compare FakeApp with FaceSwap. FS does more extensive augmentation than FA, which will add overhead.

What kind of training speed are you getting on which model?

My word is final

VadVergasov
Posts: 10
Joined: Wed Jul 31, 2019 11:25 pm
Been thanked: 1 time

Re: CPU Load is 100% and ~40% GPU

Post by VadVergasov »

I tried running it through the CLI with no preview. The situation seems to be the same as in the GUI.

Here is the command the GUI generates for training:

Code:

python.exe C:\Users\Vadim\faceswap\faceswap.py train -A D:/FakeApp/data_A -B D:/FakeApp/data_B -ala D:/FakeApp/video_alignments.json -alb D:/FakeApp/alignments.json -m D:/FakeApp/Model -t original -s 50 -ss 25000 -bs 32 -it 1000000 -g 1 -ps 50 -L INFO

The average speed in the GUI is 2 iterations per second.

Attachments
faceswap.7z
Here is GUI log.
(1.02 KiB) Downloaded 260 times
faceswap.7z
Here is cli log. Log level info
(1.05 KiB) Downloaded 289 times
Here is the run from the CLI. Everything is as in USAGE.md; only the paths to the data and model differ.
FaceSwap.png (15.16 KiB) Viewed 22508 times

VadVergasov
Posts: 10
Joined: Wed Jul 31, 2019 11:25 pm
Been thanked: 1 time

Re: CPU Load is 100% and ~40% GPU

Post by VadVergasov »

Here are more screenshots.

Attachments
Load of CUDA and memory when cli running
FaceSwapCLICuda.png (66.76 KiB) Viewed 22507 times
Load of CUDA and memory when GUI running
FaceSwapGUICuda.png (68.15 KiB) Viewed 22507 times
Load in GUI mode.
FaceSwapGUI.png (27.6 KiB) Viewed 22507 times

torzdf
Posts: 2649
Joined: Fri Jul 12, 2019 12:53 am
Answers: 159
Has thanked: 128 times
Been thanked: 623 times

Re: CPU Load is 100% and ~40% GPU

Post by torzdf »

Ok, you're going to need to excuse me, because I use Linux, but those grabs don't really make sense.

Python, by its nature, is a single-threaded application. You have to jump through quite a lot of hoops to get multiprocessing to work (trust me, I have jumped through a lot of those!).

For the stats you've posted to make sense on a 6-core/12-thread CPU, training would have to be using 6 threads per batch (or 5 per batch + 2 for the main code). This just isn't how Python, and particularly the training code, works. We have to specifically code for parallel processing to utilize more of your cores (which is why we split off 2 separate processes for image augmentation).

As for the speed, that sounds about right for the Original model on GPU.
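
The arithmetic behind that expectation can be sketched roughly as follows. This assumes each process keeps exactly one thread busy, which is a simplification, and the function name max_cpu_percent is made up for this illustration.

```python
def max_cpu_percent(busy_processes: int, logical_cpus: int) -> float:
    """Upper bound on total CPU load if every process keeps one thread busy."""
    if logical_cpus < 1:
        raise ValueError("logical_cpus must be at least 1")
    return 100.0 * min(busy_processes, logical_cpus) / logical_cpus


LOGICAL_CPUS = 12  # i7-8750H: 6 cores / 12 threads

# 2 augmentation processes + 1 main process when run from the CLI
cli_bound = max_cpu_percent(3, LOGICAL_CPUS)
# the GUI adds one more process
gui_bound = max_cpu_percent(4, LOGICAL_CPUS)

print(f"CLI upper bound: {cli_bound:.0f}%  GUI upper bound: {gui_bound:.0f}%")
```

Under that assumption the total load should stay well below 100%, so a fully loaded CPU implies that something is running more than one thread per process.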

My word is final

VadVergasov
Posts: 10
Joined: Wed Jul 31, 2019 11:25 pm
Been thanked: 1 time

Re: CPU Load is 100% and ~40% GPU

Post by VadVergasov »

So, 2 iterations per second is normal for a 6-core/12-thread CPU and an RTX 2060? If so, that is OK. But it is really strange that code which should execute on the GPU also uses a lot of CPU.

torzdf
Posts: 2649
Joined: Fri Jul 12, 2019 12:53 am
Answers: 159
Has thanked: 128 times
Been thanked: 623 times

Re: CPU Load is 100% and ~40% GPU

Post by torzdf »

I would expect CPU usage. Image augmentation is processor intensive, and it has to do it on 64 (32 x 2) images at a time. This is all done on CPU prior to feeding to the GPU.

I would not expect it to use your full CPU. As I said before, I don't know how that would even be possible in the current training setup.

I would say that, yes, that speed sounds about right. I don't have an RTX card though, so I can't directly compare. If you look at the Analysis tab, the "EG/s" would give me a more accurate measure and I could let you know if it seems reasonable.

My word is final

VadVergasov
Posts: 10
Joined: Wed Jul 31, 2019 11:25 pm
Been thanked: 1 time

Re: CPU Load is 100% and ~40% GPU

Post by VadVergasov »

Here is the Analysis tab, although there are a lot of 'void' runs. On long runs, I get 63 EG/s. I have now also measured the speed of FakeApp, and it runs at 1 iteration per second, so the FaceSwap result looks OK. As far as I know, you currently have no support for tensor cores, so there will be almost no difference between a 2060 and a 1070.
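
As a rough cross-check of those numbers: if EG/s is read as examples per second for one side, i.e. batch size times iterations per second (that reading is an assumption, not something confirmed in this thread), then the batch size of 32 and roughly 2 iterations per second reported earlier line up with the 63 EG/s figure.

```python
def examples_per_second(batch_size: int, iterations_per_second: float) -> float:
    """Assumed definition: examples processed per second for one side."""
    return batch_size * iterations_per_second


# Values reported earlier in the thread: -bs 32 and about 2 iterations/s.
estimate = examples_per_second(32, 2.0)
reported = 63.0  # EG/s read from the Analysis tab on long runs

relative_gap = abs(estimate - reported) / reported
print(f"estimated {estimate:.0f} EG/s vs reported {reported:.0f} EG/s "
      f"(gap {relative_gap:.1%})")
```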

Attachments
Analisys.png (45.12 KiB) Viewed 22491 times
Last edited by VadVergasov on Thu Aug 01, 2019 6:58 pm, edited 1 time in total.

torzdf
Posts: 2649
Joined: Fri Jul 12, 2019 12:53 am
Answers: 159
Has thanked: 128 times
Been thanked: 623 times

Re: CPU Load is 100% and ~40% GPU

Post by torzdf »

Ok. I'm away from my PC at the moment, and don't have the figures off the top of my head. I will double check though and get back to you tomorrow.

My word is final

VadVergasov
Posts: 10
Joined: Wed Jul 31, 2019 11:25 pm
Been thanked: 1 time

Re: CPU Load is 100% and ~40% GPU

Post by VadVergasov »

Here I ran a long session, so these should be the most accurate numbers.

Attachments
Here it is.
Analisys.png (56.34 KiB) Viewed 22470 times

torzdf
Posts: 2649
Joined: Fri Jul 12, 2019 12:53 am
Answers: 159
Has thanked: 128 times
Been thanked: 623 times

Re: CPU Load is 100% and ~40% GPU

Post by torzdf »

Thanks for this.

I am currently having GPU issues, so I will have to wait for some parts to arrive before I can run my tests.

Off the top of my head, these figures seem reasonable though.

My word is final

torzdf
Posts: 2649
Joined: Fri Jul 12, 2019 12:53 am
Answers: 159
Has thanked: 128 times
Been thanked: 623 times

Re: CPU Load is 100% and ~40% GPU

Post by torzdf »

We have identified a potential issue with certain batch sizes.

It still doesn't explain your CPU usage (although, that may be numpy doing multiprocessing in the background).

When we have a bit of time, we are going to review the data augmentation code to see if we can optimize it a bit better.

My word is final

underscorenorm
Posts: 1
Joined: Thu Aug 22, 2019 3:25 pm
Has thanked: 2 times

Re: CPU Load is 100% and ~40% GPU

Post by underscorenorm »

I can say I am experiencing the same behavior. I used the Windows installer method, and I am on a Ryzen 1700X with a GTX 1080. Batch size doesn't seem to make a huge difference, but for what it's worth, I have it set to 128.

Hope this helps!

Code:

Setting Faceswap backend to NVIDIA
08/22/2019 09:05:11 INFO     Log level set to: INFO
Using TensorFlow backend.
08/22/2019 09:05:13 INFO     Model A Directory: T:\Deep\B
08/22/2019 09:05:13 INFO     Model B Directory: T:\Deep\C
08/22/2019 09:05:13 INFO     Training data directory: T:\Deep\Models
08/22/2019 09:05:13 INFO     ===================================================
08/22/2019 09:05:13 INFO       Starting
08/22/2019 09:05:13 INFO       Press 'Terminate' to save and quit
08/22/2019 09:05:13 INFO     ===================================================
08/22/2019 09:05:14 INFO     Loading data, this may take a while...
08/22/2019 09:05:14 INFO     Loading Model from Iae plugin...
08/22/2019 09:05:14 INFO     Using configuration saved in state file
08/22/2019 09:05:14 WARNING  Support for multiple model types within the same folder has been deprecated and will be removed from a future update. Please split each model into separate folders to avoid issues in future.
08/22/2019 09:05:14 WARNING  From C:\Users\n0rm\MiniConda3\envs\faceswap\lib\site-packages\keras\backend\tensorflow_backend.py:74: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.\n
08/22/2019 09:05:14 WARNING  From C:\Users\n0rm\MiniConda3\envs\faceswap\lib\site-packages\keras\backend\tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.\n
08/22/2019 09:05:14 WARNING  From C:\Users\n0rm\MiniConda3\envs\faceswap\lib\site-packages\keras\backend\tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.\n
08/22/2019 09:05:14 WARNING  From C:\Users\n0rm\MiniConda3\envs\faceswap\lib\site-packages\keras\backend\tensorflow_backend.py:174: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.\n
08/22/2019 09:05:14 WARNING  From C:\Users\n0rm\MiniConda3\envs\faceswap\lib\site-packages\keras\backend\tensorflow_backend.py:181: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.\n
08/22/2019 09:05:17 INFO     Loaded model from disk: 'T:\Deep\Models'
08/22/2019 09:05:18 WARNING  From C:\Users\n0rm\MiniConda3\envs\faceswap\lib\site-packages\keras\optimizers.py:790: The name tf.train.Optimizer is deprecated. Please use tf.compat.v1.train.Optimizer instead.\n
08/22/2019 09:05:18 INFO     Loading Trainer from Original plugin...
08/22/2019 09:05:23 INFO     Enabled TensorBoard Logging
08/22/2019 09:05:39 INFO     Backing up models...
08/22/2019 09:05:48 INFO     [Saved models] - Average since last save: face_loss_A: 0.02298, face_loss_B: 0.03001

And then lots of INFO logs regarding face_loss

torzdf
Posts: 2649
Joined: Fri Jul 12, 2019 12:53 am
Answers: 159
Has thanked: 128 times
Been thanked: 623 times

Re: CPU Load is 100% and ~40% GPU

Post by torzdf »

For what it's worth, I overlooked that OpenCV and NumPy are both Python wrappers around native C/C++ code, so they will use multiple threads under the hood.
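
If you want to experiment with how much of the load comes from those native thread pools, one common approach is to cap them with environment variables before the libraries are imported. This is only a sketch: the helper name thread_limit_env is made up here, which variables take effect depends on how your NumPy build was compiled, and whether this changes training speed in FaceSwap is untested.

```python
import os

# Environment variables read by common native backends (OpenMP, OpenBLAS, MKL).
# Which of them applies depends on the NumPy build; setting all three is harmless.
THREAD_ENV_VARS = ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS")


def thread_limit_env(max_threads: int) -> dict:
    """Build the environment settings that cap native thread pools."""
    if max_threads < 1:
        raise ValueError("max_threads must be at least 1")
    return {name: str(max_threads) for name in THREAD_ENV_VARS}


# These are typically read when the library loads, so set them before importing numpy.
os.environ.update(thread_limit_env(2))

# import numpy as np          # would now start with the capped thread pools
# import cv2
# cv2.setNumThreads(2)        # OpenCV has its own runtime setting
```

Comparing CPU load and iterations per second with and without the cap would show whether the extra threads are doing useful work.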

Ultimately, Image processing IS CPU intensive, and we augment all of our data on the fly.

If you are concerned that you are being bottlenecked by the CPU, you can run a test. Train a more lightweight model (the "Lightweight" model would be the perfect example) at the same batch size as your bigger model, and watch the iterations fly by a lot faster!

The CPU code that augments the images is the same for the lightweight and the heavyweight model, so the fact that the bigger model trains slower shows that the bottleneck is the GPU and not the CPU.
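
The reasoning of that test can be written down as a small helper. The function name, the 15% tolerance, and the sample speeds below are all hypothetical; they are only there to make the decision rule explicit.

```python
def likely_bottleneck(light_its: float, heavy_its: float,
                      tolerance: float = 0.15) -> str:
    """Compare iterations/s of a light and a heavy model at the same batch size.

    Both models share the same CPU augmentation work, so:
    - light model clearly faster -> the heavy model is limited by the GPU
    - speeds roughly equal       -> the shared CPU work is the limit
    """
    if light_its <= 0 or heavy_its <= 0:
        raise ValueError("speeds must be positive")
    speedup = light_its / heavy_its
    return "GPU" if speedup > 1.0 + tolerance else "CPU"


# Hypothetical measurements in iterations per second:
print(likely_bottleneck(light_its=6.0, heavy_its=2.0))  # light model much faster
print(likely_bottleneck(light_its=2.1, heavy_its=2.0))  # almost no difference
```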

My word is final

VadVergasov
Posts: 10
Joined: Wed Jul 31, 2019 11:25 pm
Been thanked: 1 time

Re: CPU Load is 100% and ~40% GPU

Post by VadVergasov »

I've tested the 'light' model and performance is almost the same, so training is limited by CPU performance. However, I have now found a problem with my CPU: it is running hotter than before. Once I solve that, I'll write again if anything changes. I'll also attach configs and screenshots. Can you tell me your setup (CPU, GPU, OS)? If you are using Linux, I will test on it; maybe this is a Windows problem.

Attachments
configs.7z
Configs inside
(593 Bytes) Downloaded 267 times
Light model
Light_Stats.png (5.32 KiB) Viewed 22093 times
Original model
Original.png (52.12 KiB) Viewed 22093 times

torzdf
Posts: 2649
Joined: Fri Jul 12, 2019 12:53 am
Answers: 159
Has thanked: 128 times
Been thanked: 623 times

Re: CPU Load is 100% and ~40% GPU

Post by torzdf »

I do use Linux, specifically Ubuntu 18.04.

My word is final

Locked