The bottom line of my question is, does training for shorter periods prevent NaNs?
When a NaN warning occurs, it's usually after several hours of training. If I limited each session to, say, one hour and ran 24 separate sessions, would that avoid NaNs while still giving me 24 hours' worth of training?
If the NaN warning is given after 6 hours of training, why doesn't it give a NaN warning immediately after re-starting? If it was detected before it should still be there soon after restarting, right? The NaN detection should already be stored in the back-ups, so in my (presumably faulty) logic this means it's tied to extreme training times, and therefore shorter training lengths mean less likelihood of NaNs.
Except for this article (which doesn't address image learning)
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7597167/
I can't find any articles that answer this question that relate to image machine learning and training time.
If training time does affect NaN appearance (as the abstract above suggests), would it behoove the program to have a training time/length batch option to address this issue, i.e. an automatic kill/restart of training after a certain amount of time or number of iterations?
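To make the idea concrete, here is a rough Python sketch of the kind of option I have in mind. Everything in it is hypothetical: `train_in_segments`, `train_step`, `save_checkpoint`, `load_checkpoint` and the parameter names are placeholders I made up, not functions from any real program. It only shows the control flow (train for a fixed segment, back up, and roll back to the last backup if a NaN loss shows up); whether segmenting like this actually reduces NaNs is exactly what I'm asking.

```python
import math
import time


def train_in_segments(train_step, save_checkpoint, load_checkpoint,
                      total_iters, segment_iters=1000,
                      segment_seconds=None, max_restarts=5):
    """Run training in short segments with a backup after each one.

    All three callables are hypothetical placeholders:
      train_step()      -> returns the loss (a float) for one iteration
      save_checkpoint() -> stores the current model state
      load_checkpoint() -> restores the most recently stored state
    """
    done = 0       # iterations that are safely covered by a backup
    restarts = 0   # number of rollbacks caused by a NaN loss
    while done < total_iters:
        segment_start = time.monotonic()
        in_segment = 0
        nan_hit = False
        while done + in_segment < total_iters and in_segment < segment_iters:
            loss = train_step()
            if math.isnan(loss):
                nan_hit = True
                break
            in_segment += 1
            # Optional wall-clock limit for the segment.
            if (segment_seconds is not None
                    and time.monotonic() - segment_start >= segment_seconds):
                break
        if nan_hit:
            # Discard this segment and go back to the last good backup.
            restarts += 1
            if restarts > max_restarts:
                raise RuntimeError("too many NaN rollbacks, giving up")
            load_checkpoint()
        else:
            save_checkpoint()
            done += in_segment
    return done, restarts
```

With something like this, "one hour, 24 times" would just be `segment_seconds=3600` with a large enough `total_iters`. The `max_restarts` cap is there so that a NaN which comes back after every rollback doesn't loop forever.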
Again, forgive my ignorance if this is an obvious/stupid question.