Question on what's happening under the hood
So I decided to put an old 1080 Ti back into my rig to see if I could add its memory on top of my 3080 Ti's. It runs, and I'm able to increase my batch size from 1 to 2 (beggars can't be choosers). It runs at a snail's pace, though, and I'm assuming that's because everything falls back to full precision, since the 1080 Ti lacks tensor cores, right?
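For context, here's roughly why I think precision is the culprit — a minimal sketch, with the published compute capabilities of the two cards hardcoded (I'm not querying the driver here, and the "tensor cores from capability 7.0" cutoff is just the spec as I understand it):

```python
# Published CUDA compute capabilities, hardcoded for illustration.
# Tensor cores (and fast fp16) first appear at compute capability 7.0 (Volta).
GPUS = {
    "GTX 1080 Ti": (6, 1),   # Pascal: no tensor cores, poor fp16 throughput
    "RTX 3080 Ti": (8, 6),   # Ampere: tensor cores, fast fp16/bf16
}

def has_tensor_cores(capability):
    """Tensor cores shipped starting with compute capability 7.0."""
    major, _minor = capability
    return major >= 7

for name, cap in GPUS.items():
    print(f"{name}: sm_{cap[0]}{cap[1]}, tensor cores: {has_tensor_cores(cap)}")
```

So even with mixed precision enabled, the Pascal card would be doing fp32-speed work, and in a data-parallel setup the slower card gates every step.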
My main question, though: the model, which had been hovering between a loss of .089 and .092 at the small batch size, instantly dropped to a loss in the mid-.05s.
It seems weird that there was a ~.035 drop, instantly, just because I raised the batch size. I know we're not supposed to read too much into raw loss values, but it's still a metric we watch. So why did that happen? Why would it drop that much instantly?
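For reference, the number I'm watching is the per-batch value straight out of the criterion — a minimal sketch of how that gets reported, assuming a mean-reduced loss like PyTorch's default (the per-sample numbers below are made-up placeholders, just to show the averaging):

```python
# Sketch of how a mean-reduced loss reports one number per batch.
# Per-sample losses here are illustrative placeholders, not my real values.
def batch_loss(per_sample_losses):
    """Mean reduction: the reported loss is the average over the batch."""
    return sum(per_sample_losses) / len(per_sample_losses)

# Batch size 1: the reported loss IS the single sample's loss, so the
# curve jumps around with every example.
print(batch_loss([0.090]))

# Batch size 2: two samples are averaged into one reported number, so the
# curve gets smoother -- but averaging alone shouldn't shift its level.
print(batch_loss([0.090, 0.092]))
```

My (possibly wrong) understanding is that averaging over a bigger batch smooths the curve but shouldn't move its average level, which is part of why the sudden .035 drop confuses me.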