Speed/Quality and Batch size

Post by nnifj »

I know about as little as there is to know, and I have two questions relating to speed/quality and batch size.

1. The guide states that
"while large batches train faster, batch sizes in the 8 to 16 range likely produce better quality."
Is there any reason that the range starts at 8? Would the quality be even better if I used 7 or 6, or even 1?

2. Is training different from, although closely related to, iterations? Or does training = iterations? The guide clearly suggests that a larger batch size trains faster, but a smaller one produces better quality. For my project, with 500 input A faces and 3k input B faces, I'm getting about 240 iterations per hour at a batch size of 50, about 500 iterations per hour at a batch size of 30, and about 900 iterations per hour at a batch size of 6.
So did more "training" occur after 1 hour with my high batch size and low iteration count, or with my low batch size and high iteration count?

Post by bryanlyon »

Training speed is not just iterations. It's better to measure the number of images processed in a given time period, so Batch Size * Iterations is a better calculation:

50 * 240 = 12,000 images/hour

30 * 500 = 15,000 images/hour

6 * 900 = 5,400 images/hour

(However, I believe you've likely got the speeds wrong here, and it's best to let the GUI calculate the EGs/sec, as shown in the Analysis tab.)
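
For anyone who wants to redo this arithmetic with their own numbers, here is a minimal sketch. The function name is made up for illustration and is not part of the trainer; the figures are the ones reported in the first post, which may themselves be off.

```python
def images_per_hour(batch_size: int, iterations_per_hour: int) -> int:
    """Images fed to the model per hour: batch size times iterations."""
    return batch_size * iterations_per_hour


# (batch size, iterations per hour) as reported in the first post
reported = [(50, 240), (30, 500), (6, 900)]

# Map each batch size to its throughput in images per hour
throughput = {bs: images_per_hour(bs, its) for bs, its in reported}

for bs, rate in throughput.items():
    print(f"batch size {bs:>2}: {rate:,} images/hour ({rate / 3600:.2f} images/sec)")

# The batch size that pushed the most images through in an hour
fastest = max(throughput, key=throughput.get)
print(f"highest throughput: batch size {fastest}")
```

Comparing images per hour rather than iterations per hour is what makes runs at different batch sizes comparable.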

The arguments for Batch Size and quality are VERY tenuous and really only apply to very extreme numbers. A BS of 128 will likely learn average face shape faster than a low BS, but will learn details less effectively. However, a BS of 1 will learn details better, but miss out on things like general shape and fail to learn generic features well. More important than an extremely high/low BS is to match the model to your hardware.

Post by nnifj »

bryanlyon wrote:
Sat Jan 18, 2020 7:08 pm
Training speed is not just iterations. [...] More important than an extremely high/low BS is to match the model to your hardware.

This is very interesting. If I'm understanding you right, I guess I'll try to quantify in my mind how much training has been done not by hours or iterations alone, but by how many examples were given to the model over the training process. It would be nice if they had a number for that somewhere on here like they do for iterations.

When you say "matching your model to your hardware", do you just mean your trainer? If so, then I get that: use Lightweight if you're on a potato, Villain requires lots of VRAM, etc. Are there any other guides or advice out there for matching your trainer to your computer, other than those brief couple of sentences in the official guide?

Post by bryanlyon »

viewtopic.php?f=6&t=146
This is the best source of information on this. Honestly, going much beyond what that guide shows is overthinking things. Just go for the largest Batch Size you can run without turning on any memory hacks (those save VRAM at the cost of slower training).
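
One way to picture "the largest Batch Size you can" is a simple top-down search, sketched below. Everything here is hypothetical: `fits` stands in for "start training at this batch size with the memory-saving options off and see whether it runs without an out-of-memory error", which in practice is a manual trial rather than a function the software provides, and the 48-image limit is an invented stand-in for a real GPU.

```python
from typing import Callable, Iterable, Optional


def largest_fitting_batch_size(
    candidates: Iterable[int],
    fits: Callable[[int], bool],
) -> Optional[int]:
    """Return the largest candidate for which fits(batch_size) is True, or None."""
    # Try the biggest batch size first and stop at the first one that works
    for batch_size in sorted(candidates, reverse=True):
        if fits(batch_size):
            return batch_size
    return None


def fake_fits(batch_size: int) -> bool:
    """Stand-in for a real trial run: pretend anything above 48 runs out of VRAM."""
    return batch_size <= 48


best = largest_fitting_batch_size([8, 16, 32, 64, 128], fake_fits)
print(best)
```

If nothing in the list fits, the function returns None, which would suggest picking a lighter model rather than reaching for the memory hacks.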

Post by blackomen »

bryanlyon wrote:
Sat Jan 18, 2020 7:08 pm
[...] A BS of 128 will likely learn average face shape faster than a low BS, but will learn details less effectively. However, a BS of 1 will learn details better, but miss out on things like general shape and fail to learn generic features well. [...]

In that case, if you want the best of both worlds, would it be effective to start training with a large batch size, then interrupt and continue with a smaller BS?
