Do VRAM Requirements Fall?

If training is failing to start, and you are not receiving an error message telling you what to do, tell us about it here


Forum rules

Read the FAQs and search the forum before posting a new topic.

This forum is for reporting errors with the Training process. If you want to get tips, or better understand the Training process, then you should look in the Training Discussion forum.

Please mark any answers that fixed your problems so others can find the solutions.

Locked
MaxHunter
Posts: 193
Joined: Thu May 26, 2022 6:02 am
Has thanked: 176 times
Been thanked: 13 times

Do VRAM Requirements Fall?

Post by MaxHunter »

Do VRAM requirements fall as the model grows?

I'm sorry if this is a newbie question, but I can't understand why one model with the same settings fails while another works exceptionally well. I started to wonder if I had my previous model on Central Storage distribution and later switched over, because when I started training my new model it kept crashing until I switched over to Central Storage. After several thousand iterations of training, can I switch back to the default distribution?

torzdf
Posts: 2651
Joined: Fri Jul 12, 2019 12:53 am
Answers: 159
Has thanked: 129 times
Been thanked: 622 times

Re: Do VRAM Requirements Fall?

Post by torzdf »

No, but different versions of TensorFlow may handle VRAM allocation differently.

I do suspect that there is some kind of leak going on in TensorFlow recently. Not big, but enough that a model may OOM after several thousand iterations (this should not be able to happen, as TF grabs all the VRAM it requires at the start).
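As an aside, the "grabs all the VRAM at the start" behaviour mentioned above is TensorFlow's default allocator; it can be switched to on-demand allocation with memory growth. A minimal sketch, assuming the TF 2.x API (the `enable_memory_growth` helper name is just for illustration):

```python
# Sketch: ask TensorFlow to allocate VRAM on demand ("memory growth")
# instead of reserving all GPU memory up front (its default behaviour).
def enable_memory_growth():
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow not installed; nothing to configure.
        return []
    gpus = tf.config.list_physical_devices("GPU")
    for gpu in gpus:
        # Must be called before any GPU memory is allocated,
        # i.e. before the first op runs on that device.
        tf.config.experimental.set_memory_growth(gpu, True)
    return gpus

if __name__ == "__main__":
    print(enable_memory_growth())
```

Note this only changes *when* VRAM is allocated, not how much the model ultimately needs, so it would not by itself prevent a genuine leak from eventually OOMing.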

My word is final
