Iterations, batch size and Eg / s
Posted: Fri Sep 25, 2020 12:25 am
by FuorissimoX
Hi everyone, I'm a little confused about the importance of iterations, batch size and Eg / s.
For example, I noticed that with the Original model and BS = 64 I get 3,000 iterations per hour and 230 Eg / s. With BS = 32 I get 6,000 iterations per hour and about 210 Eg / s.
Is it better to do many iterations with a smaller BS, or the other way around? Or is the important thing the total, i.e. Eg / s multiplied by the minutes elapsed? Because at first glance nothing changes. With BS = 32 I carry out twice as many iterations in the same time as with BS = 64, at approximately the same Eg / s (but with BS = 64, reaching the same number of iterations takes more time, so more Eg are processed in total).
In summary, is the best choice to:
1) have the total number of Eg processed as high as possible?
2) have as many iterations as possible?
I just don't understand what changes between doing 100,000 iterations in one hour with BS = 1 and doing 100,000 iterations with BS = 64.
Perhaps stopping the program after X iterations is a misleading criterion. Maybe it would be more useful to think in terms of Eg processed?
We need a tool that calculates the ideal parameters for the available hardware.
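A rough sketch of the "processed Eg" idea: multiply the reported Eg / s by the elapsed time. The function name is made up for illustration, and the figures are the ones quoted in this post, not new measurements.

```python
def total_examples(egs_per_sec: float, hours: float) -> float:
    """Total examples processed at a steady Eg / s rate over a number of hours."""
    return egs_per_sec * hours * 3600


# Figures quoted in this post (Original model)
for batch_size, egs_per_sec in [(64, 230), (32, 210)]:
    print(f"BS = {batch_size}: {total_examples(egs_per_sec, 1):,.0f} Eg in one hour")
```

Counted this way, one hour at BS = 64 comes to 828,000 Eg against 756,000 at BS = 32, even though BS = 32 shows twice as many iterations.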
Re: Iterations, batch size and Eg / s
Posted: Fri Sep 25, 2020 1:20 am
by abigflea
EGs/s is the best thing to go by.
It's seeing more faces per iteration, which means it trains faster/better.
Batch tells the model how many faces it should try to process per iteration (limited by your VRAM). There is a caveat about how big the batch should be: you probably don't want to go over a hundred, but that's another long discussion that's likely unnecessary.
Iterations is really just how many cycles it's gone through.
I could set my batch to 2 and it would fly through iterations, but in reality it would be training very slowly.
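To put rough numbers on that: if each iteration processes one batch of faces, the faces seen after a run is just batch size times iterations. A small sketch, assuming that simple relationship (the helper name is hypothetical, and Faceswap's own counters may count differently):

```python
def faces_seen(batch_size: int, iterations: int) -> int:
    """Faces processed after a number of iterations, assuming one batch per iteration."""
    return batch_size * iterations


iterations = 100_000
for batch_size in (2, 64):
    seen = faces_seen(batch_size, iterations)
    print(f"batch {batch_size}: {seen:,} faces after {iterations:,} iterations")
```

Same iteration count, but the batch 64 run has seen 32 times as many faces (6,400,000 against 200,000).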
Re: Iterations, batch size and Eg / s
Posted: Fri Sep 25, 2020 1:31 am
by abigflea
How many iterations is the million-dollar question.
If you watch your loss graph, you'll see it flatten out a bit.
In the training guides you'll see mentioned how to zoom in and see that it's still going down.
Looks flat but it's still learning.
When it seems to be flat, the model should be trained.
But all that is academic... it's just math.
The real answer is: when it looks good enough!
It might be nearly kinda flat at 100k iterations, or it might not look flat until 400k.
This is where all the math and programming reach their limit, and it's up to your eyes.
If it's not good enough you can run it another fifty or a hundred k iterations. It all depends on what you see and how quickly it's learning.
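If you'd rather put a number on "looks flat" than squint at the graph, one rough way is to compare the average loss over the most recent stretch with the stretch before it. This is only a sketch: the function name, the window size and the list of loss values are assumptions for illustration, not something Faceswap provides in this form.

```python
from statistics import mean
from typing import Sequence


def relative_improvement(losses: Sequence[float], window: int) -> float:
    """Fractional drop in mean loss between the previous window and the latest one.

    Assumes loss values are positive, so the previous mean is never zero.
    """
    if len(losses) < 2 * window:
        raise ValueError("need at least two full windows of loss values")
    previous = mean(losses[-2 * window:-window])
    latest = mean(losses[-window:])
    return (previous - latest) / previous


# Made-up loss values: flat at a glance, but still creeping down
losses = [0.0300, 0.0299, 0.0298, 0.0297, 0.0296, 0.0295]
print(f"{relative_improvement(losses, window=3):.2%} lower than the previous window")
```

A small positive value means the loss is still going down, just slowly. It doesn't replace looking at the previews, it only tells you whether the curve is still moving.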
Re: Iterations, batch size and Eg / s
Posted: Fri Sep 25, 2020 5:00 am
by FuorissimoX
Of course, I know that. There are graphs in the example of my first deepfake. My question was different. What I meant is this: if with bs = 16 I run 100,000 iterations and with bs = 32 I run 50,000 in the same time, is there any difference? In my opinion, if we also consider the Eg / s value, we can deduce that it is not the number of iterations that matters but the total number of Eg processed. So I come back to the question: isn't it better to reason in millions of Eg processed? The batch size really does affect the number of Eg / s, so I normally run some tests with different values (bs 64 on the P100 seems to give the highest Eg / s) and keep whichever is highest. But then we should reason not in iterations but in Eg / s multiplied by seconds. Do you agree with this theory?
Re: Iterations, batch size and Eg / s
Posted: Fri Sep 25, 2020 5:06 am
by FuorissimoX
Yes, we agree. The most important value is Eg/s, so as to use the full power of the GPU.
A tool that calculates the best BS value for the installed GPU could be really useful.
Many thanks for your reply. I'm spending 50 hours right now trying things out and learning.
Re: Iterations, batch size and Eg / s
Posted: Fri Sep 25, 2020 5:39 am
by abigflea
FuorissimoX wrote: ↑Fri Sep 25, 2020 5:00 am
Of course, I know that. There are graphs in the example of my first deepfake. My question was different. What I meant is this: if with bs = 16 I run 100,000 iterations and with bs = 32 I run 50,000 in the same time, is there any difference?
No, it's not the same.
With a higher BS, the neural network will learn quicker and "better". Working out the equivalence is a matter of the training and your perception (which is hard to quantify).
FuorissimoX wrote: ↑Fri Sep 25, 2020 5:00 am
But then we should reason not in iterations but in Eg / s multiplied by seconds. Do you agree with this theory?
No, see above.
I know what and how you're thinking. It's all a bit more complicated under the hood. Neural networks try to mimic just that: mushy brains. There is a degree of randomness inherent to this.
You may run the same model twice and get two slightly different results of the same subjective high quality.
So just run the highest batch you can (don't go over 90 or so), and wait till it looks good.
I rarely get a batch over 30 because of the high-quality data I'm using... and it takes 600k iterations, 8-15 days, before it looks good.
Re: Iterations, batch size and Eg / s
Posted: Fri Sep 25, 2020 5:45 am
by FuorissimoX
But the goal when choosing the batch size is to reach the highest Eg/s value, is that correct? For example, on Google Cloud with a P100 I get these values:
BS     Eg/s
16     200
32     210
64     225
128    210
So I set it to 64.
Is my reasoning correct? I think a tool could be useful.
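Until such a tool exists, a tiny script can at least pick the winner from measurements like the ones in the table. This is a sketch that only ranks numbers already measured by hand; it does not benchmark the GPU itself, and the function name is made up.

```python
def best_batch_size(measurements: dict[int, float]) -> int:
    """Return the batch size with the highest measured Eg/s."""
    return max(measurements, key=measurements.get)


# Eg/s measured on a P100 (values from the table in this post)
p100 = {16: 200, 32: 210, 64: 225, 128: 210}
print(best_batch_size(p100))  # 64
```

The measuring still has to be done by running training for a while at each batch size and noting the Eg/s.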