GPU Training Speed

Talk about Hardware used for Deep Learning
User avatar
tochan
Posts: 9
Joined: Sun Sep 22, 2019 8:17 am

GPU Training Speed

Post by tochan » Sun Sep 22, 2019 8:41 am

Hi all,
last week my "old" GPU died... more precisely, the pump on my EVGA 1080 Hybrid.
For the warranty repair period, I bought an RTX 2080 MSI Aero (blower cooler design).

Now, my idea is a comparison list (same trainer is important) with some system info, to help others with their GPU "upgrade" plans (new card or a second one).

So, here are the results of these two cards.

My system info:
CPU: AMD 1800X, 64 GB RAM, SSD storage, Windows 10
Trainer: Dfl-H128
Batch size: 64
Warp to Landmarks
Optimizer Savings
Save interval: 100

GPU:
GTX 1080 EVGA Hybrid (not overclocked), 8 GB: 17.9 iterations (Faceswap build 09/10/19: 18.2)
RTX 2080 MSI Aero (not overclocked), 8 GB: 25.1 iterations (Faceswap build 09/22/19: 25.1)

Hope you like the idea; please share some info from your machine...

User avatar
tochan
Posts: 9
Joined: Sun Sep 22, 2019 8:17 am

Re: Hardware best practices

Post by tochan » Sun Sep 22, 2019 11:23 am

For training speed information (Trainer Dfl-H128):
GTX 1080 EVGA Hybrid (AIO water cooler): 18.2 (12k iterations)
RTX 2080 MSI Aero (blower cooler): 25.2 (15k iterations)

User avatar
torzdf
Posts: 226
Joined: Fri Jul 12, 2019 12:53 am
Answers: 57
Has thanked: 8 times
Been thanked: 43 times

Re: GPU Training Speed

Post by torzdf » Sun Sep 22, 2019 12:32 pm

I have been wanting something like this for a while, so thanks for making a start!

Ideally we could have a Google Sheet which someone could maintain.

I'll see if I can pull out some stats to add.
My word is final

User avatar
kilroythethird
Posts: 20
Joined: Fri Jul 12, 2019 11:35 pm
Answers: 1
Has thanked: 2 times
Been thanked: 9 times

Re: GPU Training Speed

Post by kilroythethird » Sun Sep 22, 2019 2:41 pm

Considering how fast faceswap develops, maybe we should use a fixed version for this (a tag on GH or some fixed commit)?

A copy-and-pasteable snippet that checks out a given commit, downloads a test faceset, runs for a fixed iteration count at a given batch size (?), and reverts to current master should do.
Someone could even write a simple batch/sh/py(?) benchmark script, something like the sketch below.
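
For illustration, a rough Python sketch of that flow (the faceset download step is left out). The commit hash, faceset paths, trainer name and the faceswap CLI flags -A/-B/-m/-t/-bs/-it are placeholders/assumptions, not verified arguments; check python faceswap.py train -h for what your version actually provides.

Code:

#!/usr/bin/env python3
"""Rough benchmark sketch: pin faceswap to a fixed commit, train for a
fixed iteration count, then switch back to master. Commit hash, faceset
paths and CLI flags are placeholders; adjust to your setup."""
import subprocess

PINNED_COMMIT = "0000000"        # hypothetical: the tag/commit agreed for the benchmark
FACESET_A = "benchmark/face_a"   # hypothetical test faceset locations
FACESET_B = "benchmark/face_b"
MODEL_DIR = "benchmark/model"

def run(*cmd):
    """Run a command, stopping the benchmark if it fails."""
    print(">>", " ".join(cmd))
    subprocess.run(cmd, check=True)

# 1. pin the repo to the agreed benchmark version
run("git", "checkout", PINNED_COMMIT)
try:
    # 2. train for a fixed iteration count at a fixed batch size
    #    (flag names are assumptions; check `python faceswap.py train -h`)
    run("python", "faceswap.py", "train",
        "-A", FACESET_A, "-B", FACESET_B, "-m", MODEL_DIR,
        "-t", "dfl-h128", "-bs", "64", "-it", "1000")
finally:
    # 3. revert to current master so normal work can continue
    run("git", "checkout", "master")

Pinning everyone to the same commit and test faceset would make the reported speeds directly comparable across machines.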
that amd guy

User avatar
torzdf
Posts: 226
Joined: Fri Jul 12, 2019 12:53 am
Answers: 57
Has thanked: 8 times
Been thanked: 43 times

Re: GPU Training Speed

Post by torzdf » Sun Sep 22, 2019 2:43 pm

I thought of that, but I figured that most people are not going to want to roll back to train a model just to post stats.

Ultimately, I expect these numbers to bump up a bit with the forthcoming augmentation optimizations, but I suspect (and hope) they will settle down for a while after that.
My word is final

User avatar
AndrewB
Posts: 8
Joined: Tue Nov 12, 2019 10:16 am
Has thanked: 1 time

Re: GPU Training Speed

Post by AndrewB » Wed Nov 13, 2019 2:17 pm

Where can I see these stats? I only see EGs/sec on the Analysis tab, and it depends on the batch size. About 10-15 EGs/sec for bs=8 (RTX 2070 Super).
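
If EGs/sec counts the training examples (faces) processed per second, then dividing by the batch size gives a rough iterations/sec figure that is easier to compare across different batch sizes. A minimal sketch, assuming that interpretation:

Code:

def iterations_per_sec(egs_per_sec: float, batch_size: int) -> float:
    """Approximate iterations/sec from an EGs/sec reading, assuming
    EGs/sec counts training examples (faces) processed per second."""
    return egs_per_sec / batch_size

# e.g. the RTX 2070 Super figure above: roughly 12 EGs/sec at batch size 8
print(iterations_per_sec(12.0, 8))   # -> 1.5 iterations/sec (approx.)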

User avatar
tochan
Posts: 9
Joined: Sun Sep 22, 2019 8:17 am

Re: GPU Training Speed

Post by tochan » Sun Nov 17, 2019 8:54 pm

Little update after my 1080 returned from repair.

At the moment I'm training a Dlight model with these two cards (now it works).

General:
Dlight model
Allow Growth "on"
Optimizer Savings
Batch Size 39
GPUs 2

Train Plugin Global:
Coverage "87.5"
Mask Type "components"
Subpixel Upscaling "on"
Loss Function "mae"
Penalized Mask Loss "on"

Dlight:
Features "Best"
Details "Good"
Output "256"

Results:
17.4 EGs/sec at batch size 39... batch size 40 crashes ;)
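
A rough way to find that limit automatically would be to launch short runs at increasing batch sizes and stop at the first crash. A sketch, reusing the assumed (unverified) faceswap CLI flags and placeholder paths from the benchmark snippet earlier in the thread:

Code:

# Probe for the largest batch size that still trains without crashing
# (here batch size 39 works, 40 does not). Paths and CLI flags are the
# same assumptions as in the earlier sketch.
import subprocess

def trains_ok(batch_size, iterations=50):
    """Launch a short training run and report whether it exits cleanly."""
    result = subprocess.run(
        ["python", "faceswap.py", "train",
         "-A", "benchmark/face_a", "-B", "benchmark/face_b",
         "-m", "benchmark/model", "-t", "dlight",
         "-bs", str(batch_size), "-it", str(iterations)])
    return result.returncode == 0

largest = None
for bs in range(8, 65, 8):   # coarse sweep; refine around the first failure
    if trains_ok(bs):
        largest = bs
    else:
        break
print("Largest working batch size in this sweep:", largest)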

User avatar
tochan
Posts: 9
Joined: Sun Sep 22, 2019 8:17 am

Re: GPU Training Speed

Post by tochan » Fri Nov 22, 2019 8:50 am

Hi again,

A little info from my side, with some "new" parts in my system.

Switched the AMD 1800X to a 3800X.

Changed the GPUs from 2080+1080 to two 2080s in SLI. Training works fine with the SLI option turned on in the OS and the NVLink bridge installed, but there is no benefit to training speed with or without NVLink.

Here is a training speed history for a Dfl-H128 model:

18.5 EGs/sec = 1800X + GTX 1080, batch size 64
25.2 EGs/sec = 1800X + RTX 2080, batch size 64
31.4 EGs/sec = 1800X + RTX 2080 + GTX 1080, batch size 112
39.9 EGs/sec = 3800X + 2x RTX 2080, batch size 117

The two 2080s are cheaper than one 2080 Ti. Info on power consumption: 300-670 W peak (status monitor of the RM750i).
