Crashing after 70 iterations

jebanke
Posts: 5
Joined: Tue Sep 14, 2021 5:04 pm
Been thanked: 3 times

Hello,
I'm trying to train my model, but it keeps crashing after approximately 70 iterations.
My setup is as follows: Phaze-A with the stojo preset, 384px output size, mixed precision, a batch size of 10, in "distributed" mode (I have 5x RTX 2060S).

Here is the crash log generated by faceswap: https://pastebin.com/k7PJA2m0

I checked whether I am using the latest version of faceswap, and it says I am.

Any help is appreciated!

bryanlyon
Site Admin
Posts: 793
Joined: Fri Jul 12, 2019 12:49 am
Answers: 44
Location: San Francisco
Has thanked: 4 times
Been thanked: 218 times


This is a network/remote storage error. The log says that your network storage is becoming unavailable during training, which leaves faceswap unable to access the files. Unfortunately, this has nothing to do with Faceswap and is entirely due to your setup. Either make your network storage more reliable or move the files to local storage and train from there.
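If you go the "move the files locally" route, something like the following might help. It is only a minimal sketch using the Python standard library, not part of faceswap: the function name `copy_to_local` and both folder paths are made up for illustration, so substitute your own. It copies a folder from the network share to a local disk and checks that no files went missing along the way.

```python
import shutil
from pathlib import Path


def copy_to_local(network_dir: str, local_dir: str) -> int:
    """Copy a training folder from network storage to a local disk.

    Hypothetical helper, not a faceswap function.
    Returns the number of files present in the local copy.
    """
    src = Path(network_dir)
    dst = Path(local_dir)

    # Fail early if the network share is not reachable right now.
    if not src.is_dir():
        raise FileNotFoundError(f"Source folder is not reachable: {src}")

    # dirs_exist_ok (Python 3.8+) allows re-running after an interrupted copy.
    shutil.copytree(src, dst, dirs_exist_ok=True)

    # Compare file counts as a basic sanity check on the copy.
    src_count = sum(1 for p in src.rglob("*") if p.is_file())
    dst_count = sum(1 for p in dst.rglob("*") if p.is_file())
    if dst_count < src_count:
        raise RuntimeError(
            f"Local copy is incomplete: {dst_count} of {src_count} files"
        )
    return dst_count


# Example call (both paths are assumptions, replace them with your own):
# copy_to_local("/mnt/nas/faceswap/faces_a", "/home/user/faceswap/faces_a")
```

After copying, point the training input and model folders at the local paths instead of the network ones.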
