I was training and rotating my data like it's recommended for better training (you know, letting the model see something new after a while), and I've noticed these odd valleys showing up as a result. I've seen this happen many times and thought it was weird, but never gave it much thought. Now I think I just made a connection I hadn't made before.
Is this huge drop in loss the evidence for "rotating your data often improves training"? Like this right here is exactly why you suggest people do it?
If the answer to that question is yes, then how come this same "evidence" doesn't seem to show up with my B data? You can clearly see that the B data is still higher than where it's estimated it would have been if I hadn't rotated the data around.
Also, a question on an unrelated note: I know that running dual or triple monitors will often affect my gaming performance, and when the game is serious I'll unplug the extra monitors. Does the same apply to faceswap? If I let faceswap run without extra monitors hooked up, can I squeak out 2-3% better training times?