VRAM Usage Limits to 80%

Talk about Hardware used for Deep Learning


Tekniklee
Posts: 37
Joined: Fri Jan 31, 2020 6:03 pm
Has thanked: 7 times
Been thanked: 3 times


Post by Tekniklee »

I have a 4 GB GPU (Nvidia GTX 1050), but only 3.2 GB (exactly 80%) ever gets allocated, according to Windows performance manager. It doesn't seem to matter what it's doing (extract, training, sorts, convert); the maximum ever allocated is 80%. That is a large performance hit on an already limited GPU. Is there some kind of config setting that tells the system to allocate only a certain percentage of VRAM?

torzdf
Posts: 2649
Joined: Fri Jul 12, 2019 12:53 am
Answers: 159
Has thanked: 128 times
Been thanked: 622 times


Post by torzdf »

I don't know for certain, so others may have a better idea. However, TensorFlow will grab all of the VRAM that the OS makes available to it at launch (if you run with verbose logging, you will see a line saying how much VRAM TensorFlow has taken).

It is entirely possible that Windows keeps a buffer back.

My word is final


Post by Tekniklee »

I can't see any memory info at all using Verbose, but I'll try Debug later. On the console it stops at about 3.2 GB, but I'll switch the logs to Debug to see if I can get specifics. I don't think it's a Windows reserve, because that would show up in the total VRAM allocation (i.e. Windows would show 4 GB in use, while TF would still have only 3.2 GB). I'm hoping it's just a parameter somewhere.


Post by torzdf »

No, it's verbose. It prints to the command line (it won't appear in the logs, as it's a TensorFlow message). It will say something like "allocated xxxxMB of yyyyMB".

This is, effectively, what the OS makes available to TensorFlow. On Linux, it gives me almost 100% of the total VRAM on the card (a little under). On Windows it gives me significantly less.


Edit, this is the line I mean:


2020-06-12 09:44:44.867865: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1325] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10312 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:05:00.0, compute capability: 7.5)

Ultimately, this is the amount of VRAM that the OS has made available to TensorFlow. On Windows, this figure will always be about 80% of your GPU VRAM, because Windows.


Post by Tekniklee »

Here is what I get:

2020-06-16 17:37:41.551462: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3001 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050, pci bus id: 0000:3f:00.0, compute capability: 6.1)

So it looks like Windows is reporting only 3 GB vs 4 GB. That's a full 25%, which is huge in my case. Unless someone else has any ideas, I'll see if I can get a response from NVIDIA.
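For anyone who wants to pull that figure out of their own console output, here is a minimal sketch that reads the MB value from a "Created TensorFlow device" line and compares it with the card's size. The function name `tf_reserved_mb` is made up for illustration, and `TOTAL_MB = 4096` is an assumption (4 GB taken as 4096 MB), not something TensorFlow prints.

```python
import re

# Matches the memory figure in TensorFlow's "Created TensorFlow device" line.
DEVICE_LINE = re.compile(r"Created TensorFlow device \(.*? with (\d+) MB memory\)")


def tf_reserved_mb(log_line: str) -> int:
    """Return the MB figure TensorFlow reports for the device it created."""
    match = DEVICE_LINE.search(log_line)
    if match is None:
        raise ValueError("not a 'Created TensorFlow device' line")
    return int(match.group(1))


# The line posted above, split across string literals for readability.
line = (
    "2020-06-16 17:37:41.551462: I tensorflow/core/common_runtime/gpu/"
    "gpu_device.cc:1304] Created TensorFlow device "
    "(/job:localhost/replica:0/task:0/device:GPU:0 with 3001 MB memory) "
    "-> physical GPU (device: 0, name: GeForce GTX 1050, "
    "pci bus id: 0000:3f:00.0, compute capability: 6.1)"
)

TOTAL_MB = 4096  # assumption: a 4 GB card counted as 4096 MB
reserved = tf_reserved_mb(line)
print(f"{reserved} MB of {TOTAL_MB} MB = {reserved / TOTAL_MB:.1%}")
```

On these numbers the TensorFlow device is a bit under three quarters of the card, so the shortfall is slightly more than the 25% estimated above.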


Post by Tekniklee »

BTW, what is "compute capability"?


Post by Tekniklee »

Here is some additional info. In my research, people kept referring to a program called nvidia-smi.exe to get the info as Windows sees it. At first I couldn't find it in the places it was supposed to be, and apparently neither can anyone else. It should be in the NVIDIA Corporation\NVSMI directory, which exists but is completely empty. I finally searched from the root and found it in... Windows\System32. ???

Anywho, the output is below. Training is in progress, so I think it's reporting the same as the Windows Performance figure, which is 3.2/4.0 GB. Don't know if this helps any, but just in case...

C:\WINDOWS\system32>nvidia-smi
Tue Jun 16 18:41:25 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 442.19       Driver Version: 442.19       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1050   WDDM  | 00000000:3F:00.0 Off |                  N/A |
|100%   77C    P0    N/A /  N/A |   3357MiB /  4096MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     14272      C   ...onda3\envs\faceswapNVIDIAGPU\python.exe N/A      |
+-----------------------------------------------------------------------------+
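If you only want the memory numbers rather than the whole table, nvidia-smi can print them as CSV through its `--query-gpu` and `--format` options. The sketch below wraps that call in Python. The helper names are made up for illustration, and the sample row is built from the figures in the table above rather than captured from a real run.

```python
import subprocess

# nvidia-smi's CSV query mode: used and total memory, in MiB, no header row.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_memory_csv(text: str) -> list:
    """Parse 'used, total' rows (one per GPU) into (used, total) tuples."""
    rows = []
    for line in text.strip().splitlines():
        used, total = (int(field) for field in line.split(","))
        rows.append((used, total))
    return rows


def query_gpu_memory() -> list:
    """Run nvidia-smi; needs an NVIDIA driver and nvidia-smi on the PATH."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_memory_csv(out.stdout)


# Sample row assembled from the table above, not real captured output.
for used, total in parse_memory_csv("3357, 4096\n"):
    print(f"{used} MiB / {total} MiB = {used / total:.1%}")
```

Calling `query_gpu_memory()` in a loop while a training session starts would show the same jump that repeated manual runs of nvidia-smi show.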

bryanlyon
Site Admin
Posts: 793
Joined: Fri Jul 12, 2019 12:49 am
Answers: 44
Location: San Francisco
Has thanked: 4 times
Been thanked: 218 times

Post by bryanlyon »

Tekniklee wrote: Wed Jun 17, 2020 4:02 am

BTW, what is "compute capability"?

https://developer.nvidia.com/cuda-gpus


Post by Tekniklee »

Yeah, I found that page also. But all it does is list the "compute capability" for their various NVidia products. It never says exactly what it is or how the compute capability is computed. It doesn't really matter - it's basically a marketing number. I'll stick with clock speed, number of cores and amount of memory.

And speaking of memory, while running the new Manual Tool beta, I noticed the following sections in the logs, which are strange for a couple of reasons. First, in the first section, the system appears to be detecting much more of the 4 GB VRAM (3763 MB). The same goes for the second section, although I included several lines so you could also see the DTG format switching in mid-log (from mm/dd/yyyy to yyyy-mm-dd). Finally, in the last section, it's back to detecting only 3001 MB of VRAM, which is the same figure reported by the training run that started this thread. Does any of that shed any light on the VRAM usage problem?

06/17/2020 16:07:02 VERBOSE Loading config: 'F:\FaceSwap NVIDIA GPU\config\extract.ini'
06/17/2020 16:07:02 VERBOSE GeForce GTX 1050 - 3763MB free of 4096MB

06/17/2020 16:07:02 INFO Initializing cv2-DNN Aligner (Align)...

06/17/2020 16:07:02 INFO Loading Mask from Extended plugin...
06/17/2020 16:07:02 VERBOSE Loading config: 'F:\FaceSwap NVIDIA GPU\config\extract.ini'
06/17/2020 16:07:02 VERBOSE GeForce GTX 1050 - 3763MB free of 4096MB
06/17/2020 16:07:02 INFO Initializing FAN (Align)...
2020-06-17 16:07:02.434606: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-06-17 16:07:02.467505: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce GTX 1050 major: 6 minor: 1 memoryClockRate(GHz): 1.493

pciBusID: 0000:3f:00.0

2020-06-17 16:07:03.250707: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0
2020-06-17 16:07:03.250815: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N
06/17/2020 16:07:03 VERBOSE Initializing plugin model: FAN
2020-06-17 16:07:03.251907: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3001 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050, pci bus id: 0000:3f:00.0, compute capability: 6.1)
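To put the two kinds of log line side by side, here is a small sketch that parses the "free of" line and compares it with the TensorFlow device size from the same log. `parse_free_line` is a made-up helper name, and the result is only an observation about these particular numbers, not an explanation of how either figure is produced.

```python
import re


def parse_free_line(line: str) -> tuple:
    """Parse '<name> - <free>MB free of <total>MB' into (free, total)."""
    match = re.search(r"(\d+)MB free of (\d+)MB", line)
    if match is None:
        raise ValueError("no 'free of' figures in line")
    return int(match.group(1)), int(match.group(2))


free_mb, total_mb = parse_free_line(
    "06/17/2020 16:07:02 VERBOSE GeForce GTX 1050 - 3763MB free of 4096MB"
)
tf_mb = 3001  # from the "Created TensorFlow device" line in the same log

print(f"free before TensorFlow starts: {free_mb / total_mb:.1%} of the card")
print(f"TensorFlow device size: {tf_mb / free_mb:.1%} of the free figure")
```

On these figures the TensorFlow device comes out at roughly 80% of the memory reported as free a second earlier, which may or may not be a coincidence.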


Post by torzdf »

Tekniklee wrote: Wed Jun 17, 2020 3:59 am

So it looks like Windows is reporting only 3 GB vs 4 GB. That's a full 25%, which is huge in my case. Unless someone else has any ideas, I'll see if I can get a response from NVIDIA.

It's not so much that this is what it is reporting; it's more that Windows has gobbled up about a gig for no discernible reason.

Windows does this. I doubt that Nvidia will have a fix, as ultimately this is at the OS level rather than the driver level.


Post by torzdf »

Compute Capability is basically the version of the CUDA feature set that the card's hardware supports. Later cards support later versions.
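Since it is a major.minor version rather than a speed rating, the natural thing to do with it is compare it against a minimum. A minimal sketch, assuming you have the "compute capability: X.Y" text from a TensorFlow log line; `MIN_REQUIRED` is an illustrative threshold only, so check what your own framework build actually needs.

```python
import re


def compute_capability(log_line: str) -> tuple:
    """Extract 'compute capability: X.Y' as a (major, minor) tuple."""
    match = re.search(r"compute capability: (\d+)\.(\d+)", log_line)
    if match is None:
        raise ValueError("no compute capability in line")
    return int(match.group(1)), int(match.group(2))


# Illustrative threshold only, not taken from this thread.
MIN_REQUIRED = (3, 5)

# Tails of the two log lines quoted earlier in the thread.
for name, tail in [
    ("GeForce GTX 1050", "compute capability: 6.1)"),
    ("GeForce RTX 2080 Ti", "compute capability: 7.5)"),
]:
    cc = compute_capability(tail)
    print(name, cc, "ok" if cc >= MIN_REQUIRED else "too old")
```

Tuples compare element by element, so (7, 5) ranks above (6, 1) the way version numbers should.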


Post by Tekniklee »

I'm not so sure it's a Windows issue. I noticed that your VRAM is reporting 10312 of 11 GB used, which would be an almost identical amount to my loss (3357 of 4 GB). Even though you're on Linux vs Windows? If that's the case, then people would get about 7325 with 8 GB of VRAM and 23325 with 24 GB. So with all that in mind, does everyone have this unused VRAM penalty, and it's just more visible with only 4 GB?
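The comparison above can be checked with a few lines of arithmetic. This is a sketch using only figures quoted in this thread. `MB_PER_GB = 1024` is an assumption about what "4Gb" and "11Gb" mean, and the `gap` helper is made up for illustration.

```python
MB_PER_GB = 1024  # assumption: a 4 GB card is 4096 MB, an 11 GB card 11264 MB

# (label, reported MB, card size in GB), all figures quoted in this thread
readings = [
    ("RTX 2080 Ti, TensorFlow log", 10312, 11),
    ("GTX 1050, nvidia-smi", 3357, 4),
    ("GTX 1050, TensorFlow log", 3001, 4),
]


def gap(reported_mb: int, card_gb: int) -> tuple:
    """Return (missing MB, missing share of the card)."""
    total = card_gb * MB_PER_GB
    missing = total - reported_mb
    return missing, missing / total


for label, reported, size in readings:
    missing, share = gap(reported, size)
    print(f"{label}: {missing} MB short ({share:.1%})")
```

Note that 10312 comes from a TensorFlow log line while 3357 comes from nvidia-smi, so the like-for-like pair is 10312 and 3001. Under these assumptions both cards come up short by roughly a gigabyte in absolute terms, which is a far larger share of a 4 GB card.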

BTW, I did some messing with nvidia-smi.exe, and noticed that:
1) It does show some VRAM in unknown use prior to execution, but it's only 77MiB. No change when FS launched.
2) As expected, doing grabs while watching the FS log progress, it appears to jump up to 3357MiB when TensorFlow does its VRAM reservation. I have too much free time.

And thanks for the compute capability explanation - not even close to what I was thinking it meant :).

FuorissimoX
Posts: 56
Joined: Mon Sep 21, 2020 6:49 am
Location: Italy
Has thanked: 10 times
Been thanked: 2 times


Post by FuorissimoX »

Hello there, I'm new here.
I have a similar issue. I have an AMD R9 270X with 4 GB, but the system never uses more than 70% of the VRAM (and about 90+% of the GPU).
I also tried changing the training batch size, but nothing changes; I can't use more than 70% of my VRAM.

I also see that I have 8 GB of RAM shared with the GPU, but it is not used (0%). Am I doing something wrong?
