Smaller batch size = Worse training?
Posted: Fri Oct 30, 2020 2:54 am
by AmITheNax?
So I'm about 70k iterations in, and my model stopped improving roughly 30k iterations ago. This is probably due to my lack of available training data.
Either way, I had my batch size set to 16, with a loss of about 0.03-0.04 on both A and B.
So, in a sort of last-ditch effort to increase output quality, I decreased my batch size to 1. I assumed that having fewer pictures go through at once would yield marginally better output. And yet, my loss jumped to about 0.07-0.08 on both A and B.
Am I missing something?
I can't increase my batch size to 256 since, as I said, I severely lack training data, so I can't test whether that would give me a smaller loss.
Are batch sizes actually the reverse of what I think they are?
Re: Smaller batch size = Worse training?
Posted: Fri Oct 30, 2020 3:08 pm
by bryanlyon
They're not the reverse of what you think they are; they're a lot more complicated than that. Larger batch sizes do tend to generalize better, while smaller ones focus better. I know that sounds vague, because it is. No single batch size is going to get you universally better results. That's just not how it works.
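One thing worth seeing directly: the loss you watch each step is an average over the batch, so smaller batches give a noisier estimate even when the model itself hasn't changed. Here's a minimal sketch of that effect (plain NumPy with made-up per-image losses, nothing faceswap-specific):

```python
# Minimal sketch (not faceswap code): the displayed loss each step is an
# average over the batch, and the variance of that average shrinks as the
# batch size grows, so batch size 1 shows the noisiest numbers.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-image losses: same model throughout, mean ~0.035.
per_image_loss = rng.normal(loc=0.035, scale=0.02, size=100_000).clip(min=0.0)

for batch_size in (1, 16, 64):
    # Group the losses into batches and average, as the training display does.
    n_batches = per_image_loss.size // batch_size
    batch_means = per_image_loss[: n_batches * batch_size].reshape(
        n_batches, batch_size
    ).mean(axis=1)
    print(f"batch {batch_size:>3}: step-to-step std of displayed loss = "
          f"{batch_means.std():.4f}")
```

The spread shrinks roughly as 1/sqrt(batch size), so batch 1 reads about four times noisier than batch 16. That alone can make the reported loss look worse right after a change, on top of whatever the smaller batch does to the optimization itself.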
Re: Smaller batch size = Worse training?
Posted: Fri Oct 30, 2020 3:33 pm
by abigflea
Of your two subjects, which has the least data, i.e., the fewest photos?
What model are you using?
Go with the biggest batch size you can, although I wouldn't go over a batch size of 80-100; beyond that it tends to actually make training worse, depending on the model.
Near the end of training (usually after 600K iterations for me), I tend to drop the batch size to something relatively low to bring in a little better focus, as bryanlyon stated. What I do and when depends on the data, the model architecture, etc.
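To make that concrete, here's a rough sketch of that kind of schedule; the function name, batch sizes, and the 600K threshold are all illustrative (this isn't a faceswap option, you'd change the batch size manually between sessions):

```python
# Hypothetical batch-size schedule (illustrative only, not a faceswap
# feature): train at a large batch size for most of the run, then drop
# to a small one near the end for a little extra focus.
def batch_size_for(iteration: int,
                   big: int = 16,
                   small: int = 4,
                   drop_at: int = 600_000) -> int:
    """Return the batch size to use at the given iteration."""
    return big if iteration < drop_at else small

# Quick check of the schedule around the switch point.
for it in (0, 599_999, 600_000, 650_000):
    print(f"iteration {it:>7}: batch size {batch_size_for(it)}")
```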