Training stops after ~2000 iterations without crash or error

If training is failing to start, and you are not receiving an error message telling you what to do, tell us about it here


Locked
kuklangaren
Posts: 2
Joined: Wed Oct 13, 2021 12:34 pm

Training stops after ~2000 iterations without crash or error

Post by kuklangaren »

Hello,

I can't seem to find any similar topic, but apologies in advance if there is one.

Recently I've been unable to do any training since it just stops after a while. No error message shows up and the program works just fine until I press stop training, which makes it stop responding.

When I first started out, with batch size 16, training kept running until I pressed stop, like it should. I've tried reinstalling several times and I've tried using smaller batch sizes, but it doesn't seem to make a difference. Back when it worked I had only tried the Original trainer, but now no trainer works.

My graphics card is a GTX 1080 and I have 32 GB of RAM, if that helps.

Thanks!

by bryanlyon » Wed Oct 13, 2021 2:20 pm

I've never seen this before. Sometimes power problems (a noisy, aging, or insufficient power supply) can cause similar issues, though. I'd see if restarting your Nvidia drivers does anything to the freeze ( Win+Ctrl+Shift+B ).

Unfortunately, if there are no messages, it's very hard to troubleshoot something like this.
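
Since a driver fault like this leaves nothing in the log, it can help to note the GPU name and driver version while troubleshooting. A minimal sketch, assuming `nvidia-smi` (which ships with the Nvidia driver) is on the PATH; the helper names `parse_gpu_info` and `query_gpu_info` are made up for illustration and are not part of the training software:

```python
import shutil
import subprocess


def parse_gpu_info(csv_text):
    """Parse 'name, driver_version' CSV lines into a list of dicts."""
    gpus = []
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        # Split on the last comma so the driver version is always the final field.
        name, _, driver = line.rpartition(",")
        gpus.append({"name": name.strip(), "driver_version": driver.strip()})
    return gpus


def query_gpu_info():
    """Return GPU info from nvidia-smi, or an empty list if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return parse_gpu_info(result.stdout)


if __name__ == "__main__":
    for gpu in query_gpu_info():
        print(f"{gpu['name']}: driver {gpu['driver_version']}")
```

Running it before and after a driver change gives you something concrete to post when there is no error message to share.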



kuklangaren

Re: Training stops after ~2000 iterations without crash or error

Post by kuklangaren »

I updated my drivers just after posting this and now it works again! It was most likely a problem with the drivers as you suggested.

Many thanks!

