The questions are super simple; just let me give a little context first.
So I've made a handful of deepfakes up to this point, and if seeing each individual sharp canine tooth, and seeing the shimmer of light in the pupil crystal clear, can be considered "100% trained", then usually I settle for 90% trained. But for the first time, I have a really good model with really good data, and a really good chance at hitting that "100% trained". So I had a couple of questions about getting there, involving learning rate and batch size.
The guide states:

"A learning rate set too high will constantly swing above and below the lowest value and will never learn anything. Set the learning rate too low and the model may hit a trough, think it has reached its lowest point, and stop improving. Think of it as walking down a mountain. You want to get to the bottom, so you should always be going down. However, the way down the mountain is not always downhill; there are smaller hills and valleys on the way. The learning rate needs to be high enough to get out of these smaller valleys, but not so high that you end up on top of the next mountain."

Now, I'm well aware that training is an exponential process: the loss drops less and less as you go on. But I felt like my loss wasn't dropping after a considerably long time. It stayed within the same range it started at after I gave the model an additional 25% of its training time. In other words, over the last quarter of training, the loss didn't change at all. Is this a sign that it's time to lower the learning rate? I have good data on both the A and B side. Also, should I be lowering the batch size as I near the end, similar to lowering the learning rate?
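For anyone who wants to see the guide's mountain analogy in numbers: here is a toy sketch using plain gradient descent on a made-up 1D loss with two valleys, a shallow one near x = -0.96 and a deeper one near x = +1.04. Nothing here comes from an actual deepfake trainer; the curve, the starting point, and both learning rates are invented purely to show the two failure modes the guide describes.

```python
def loss(x):
    # Bumpy toy curve: x^4 - 2x^2 - 0.3x + 2
    # (shallow valley on the left, deeper valley on the right)
    return x**4 - 2 * x**2 - 0.3 * x + 2

def grad(x):
    # Its derivative.
    return 4 * x**3 - 4 * x - 0.3

def descend(x, lr, steps=200):
    # Vanilla gradient descent from a fixed starting point.
    for _ in range(steps):
        x -= lr * grad(x)
    return x

start = -1.5  # begin on the slope above the shallow valley

stuck = descend(start, lr=0.01)   # too low: parks in the first trough it finds
swing = descend(start, lr=0.25)   # too high: clears the small hill, then keeps
                                  # bouncing above and below the deep valley
                                  # floor without ever settling

print(f"lr=0.01 -> x={stuck:+.2f}, loss={loss(stuck):.2f}")
print(f"lr=0.25 -> x={swing:+.2f}, loss={loss(swing):.2f}, grad={grad(swing):+.2f}")
```

With the low rate the walker never leaves the shallow valley, so its final loss is stuck well above the true minimum; with the high rate the final gradient stays large because the walker is hopping back and forth across the deep valley rather than resting at its bottom.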