Smaller batch size = Worse training?

Want to understand the training process better? Got tips for which model to use and when? This is the place for you.


Forum rules

Read the FAQs and search the forum before posting a new topic.

This forum is for discussing tips and understanding the process involved with Training a Faceswap model.

If you have found a bug or are having issues with the Training process, then you should post in the Training Support forum.

Please mark any answers that fixed your problems so others can find the solutions.

AmITheNax?
Posts: 1
Joined: Mon Oct 26, 2020 5:47 pm

Smaller batch size = Worse training?

Post by AmITheNax? »

So I'm about 70k iterations in, and my model stopped getting better roughly 30k iterations ago. This is probably due to my lack of available training data.

Either way, I had my batch size set to 16, with a loss of about 0.03-0.04 on both A and B.
So, in a sort of last-ditch effort to increase output quality, I decreased my batch size to 1. I assumed that having fewer pictures go through at once would yield marginally better output. And yet, my loss jumped to about 0.07-0.08 on both A and B.

Am I missing something?
I can't increase my batch size to 256 since, as I said, I severely lack training data, so I can't test whether I'd get a smaller loss.
Are batch sizes actually the reverse of what I think they are?

bryanlyon
Site Admin
Posts: 793
Joined: Fri Jul 12, 2019 12:49 am
Answers: 44
Location: San Francisco
Has thanked: 4 times
Been thanked: 218 times

Re: Smaller batch size = Worse training?

Post by bryanlyon »

They're not the reverse of what you think; they're a lot more complicated than that. Larger batch sizes do tend to generalize better, while smaller ones focus better. I know that sounds vague, because it is. No single batch size is going to get you universally better results. That's just not how it works.
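To see that trade-off concretely, here's a toy NumPy sketch (nothing to do with Faceswap's internals; the data and model are made up) that estimates the gradient of a simple squared-error loss from random mini-batches of different sizes:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D regression problem: y = 3x + noise, loss = mean squared error.
x = rng.normal(size=10_000)
y = 3.0 * x + rng.normal(scale=0.5, size=10_000)
w = 1.0  # current (deliberately wrong) weight

def batch_gradient(batch_size):
    """Gradient of the MSE w.r.t. w, estimated from one random mini-batch."""
    idx = rng.choice(len(x), size=batch_size, replace=False)
    xb, yb = x[idx], y[idx]
    return np.mean(2 * xb * (w * xb - yb))

for bs in (1, 16, 256):
    grads = [batch_gradient(bs) for _ in range(1_000)]
    print(f"batch={bs:4d}  mean grad={np.mean(grads):+.3f}  spread={np.std(grads):.3f}")
```

Every batch size points in roughly the same direction on average, but the spread of the estimates shrinks as the batch grows (roughly with the square root of the batch size). Big batches step in a smoother, more "general" direction; batch size 1 chases individual samples, which is the "focus" part, at the cost of a much noisier loss readout. That noise is also part of why your displayed loss jumped when you dropped to 1.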

abigflea
Posts: 182
Joined: Sat Feb 22, 2020 10:59 pm
Answers: 2
Has thanked: 20 times
Been thanked: 62 times

Re: Smaller batch size = Worse training?

Post by abigflea »

Of your two subjects, which has the least data (the fewest photos)?

What model are you using?

Go with the biggest batch size you can, although I wouldn't go over a batch size of 80-100. That tends to actually make training worse, depending on the model.

Near the end of training (usually after 600k iterations for me), I tend to drop the batch size to something relatively low to bring in a little better focus, as bryanlyon stated. What I do, and when, depends on the data, model architecture, etc.
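For illustration only, here's the kind of schedule I mean as a Python sketch. Faceswap doesn't take a schedule function like this (in practice you just stop and relaunch training with the new batch size), and the numbers are hypothetical ballparks:

```python
# Hypothetical batch-size schedule: big batches for the bulk of training
# (generalization), then a small batch for the final stretch (focus).
def batch_size_for(iteration, switch_at=600_000, big=64, small=8):
    return big if iteration < switch_at else small

for it in (0, 300_000, 599_999, 600_000):
    print(f"iteration {it:>7,} -> batch size {batch_size_for(it)}")
```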

:o I dunno what I'm doing :shock:
2X RTX 3090 : RTX 3080 : RTX: 2060 : 2x RTX 2080 Super : Ghetto 1060
