Same missing alignments problem while training

Post by **cosmico** » Sat Aug 29, 2020 10:22 pm

08/29/2020 15:14:28 ERROR Caught exception in thread: '_training_0'
08/29/2020 15:14:28 ERROR Alignments file does not exist: D:\Nueral Network programs\Input A frames\Training frames\alignments.fsa

It's the same error I had last time, only last time deselecting penalized mask loss made it work again. This time it doesn't work regardless of whether I have it on or off.
I was running a DFLSAE model when I decided to update, and then this happened. Randomly adjusting the loss function, mask loss function, L2 reg, eye or mouth multiplier, mask type, learn mask, and penalized mask loss all seems to do nothing. Other projects with different models and training sets don't work either. Same with starting a brand-new model.

Post by **torzdf** » Sun Aug 30, 2020 10:26 am

Eye Multiplier and Mouth Multiplier both need the alignments file.

Reduce these values to 1.

You are really going to start missing out on benefits in training if you don't use an alignments file though.

Post by **cosmico** » Sun Aug 30, 2020 5:53 pm

Thanks again for helping me out.

torzdf wrote: ↑Sun Aug 30, 2020 10:26 am
You are really going to start missing out on benefits in training

They seem like great features and I would honestly love to use them, but every time I start a new project and a new model, there's just no alignments file there, and thus I have this problem after every update. What am I doing wrong that every new project defaults to not having one? And can I add this to a model that's already halfway trained without it? Is it too late for me to fix this and use these features on my DFLSAE at 400k iterations?

Edit: I get that every time I extract I create an alignments file, but I use multiple clips and thus have multiple alignments files, and they are all named after the video clip. The only time I get a single alignments file named "alignments.fsa" is when I extract from images.
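
A quick way to see which alignments files a folder actually holds is a short Python check. This is only a sketch and not part of Faceswap: the function name `find_alignments` is made up for illustration, and the only thing it assumes is what the error log above shows, namely that training looks for a file called `alignments.fsa` inside the training folder.

```python
from pathlib import Path


def find_alignments(folder):
    """Report which .fsa files exist in a folder.

    Returns a tuple (default_present, names) where default_present is True
    when a file named exactly "alignments.fsa" is in the folder, and names
    is the sorted list of every .fsa file name found there.
    """
    folder = Path(folder)
    # Collect only the file names, not the full paths, for easy reading.
    names = sorted(p.name for p in folder.glob("*.fsa"))
    return "alignments.fsa" in names, names
```

Pointing `find_alignments` at the training folder from the log would show whether the file is missing entirely or only present under a clip-specific name.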

Post by **cosmico** » Mon Aug 31, 2020 3:41 am

I figured it out, but boy did it give me plenty of issues. At first there was a KeyError on the first picture in my B set; if I removed that picture, it would fail on the second picture, and so on. It turns out that clearing all the paths under the timelapse settings (there was nothing wrong with them) solved that. Now I'm dealing with an illegal address issue.
2020-08-30 22:51:43.060518: E tensorflow/stream_executor/cuda/cuda_event.cc:29] Error polling for event status: failed to query event: CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered
2020-08-30 22:51:43.061058: F tensorflow/core/common_runtime/gpu/gpu_event_mgr.cc:273] Unexpected Event status: 1
GeForce Experience says my drivers are up to date, and reinstalling Miniconda and Faceswap didn't solve it.

Post by **torzdf** » Mon Aug 31, 2020 7:25 am

That looks like an Out of Memory error. Try lowering your batch size.

Post by **cosmico** » Mon Aug 31, 2020 7:38 pm

If by out of memory you mean it runs out of my computer's memory, perhaps. It does seem to work best (best meaning it works for an hour or 2) when I restart my computer, and it tends to crash when I open up even light applications.

A batch size of 2 is what I'm using, as anything larger crashes after 30 seconds. Turning it down to 1 didn't seem to help much. I would have thought my 16 GB of RAM and RTX 2060 would handle DFLSAE at 144 pixels with growth, mixed precision, multiscale decoder, 512 autoencoder, 42 encoder, and 30 decoder a lot better than this.

In the info boxes, the encoder settings state that lower values can free up VRAM. Would turning off growth, mixed precision, or multiscale decoder also help free up VRAM? Also, would turning down the learning rate help in any way?

Post by **torzdf** » Mon Aug 31, 2020 10:19 pm

Honestly, I don't know the specific settings on that model too well, so I don't know what the expected requirements are. @abigflea may be able to tell you whether those settings seem sensible.

By Out of Memory, I'm referring to GPU memory. System RAM shouldn't be an issue.
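
To tell GPU memory apart from system RAM, it can help to read the VRAM figures directly from the driver. The sketch below shells out to `nvidia-smi` with its CSV query flags and parses the result. The function names and dictionary keys are made up for illustration, and it assumes `nvidia-smi` is on the PATH and reports memory in MiB.

```python
import subprocess

# nvidia-smi query that prints one CSV line per GPU: name, used MiB, total MiB.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Turn nvidia-smi CSV output into a list of per-GPU dictionaries."""
    rows = []
    for line in csv_text.strip().splitlines():
        # Split from the right so a comma inside the GPU name cannot break parsing.
        name, used, total = [field.strip() for field in line.rsplit(",", 2)]
        used_mib, total_mib = int(used), int(total)
        rows.append({
            "name": name,
            "used_mib": used_mib,
            "total_mib": total_mib,
            "free_mib": total_mib - used_mib,
        })
    return rows


def query_gpu_memory():
    """Run nvidia-smi and return the parsed memory figures."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_memory(result.stdout)
```

Calling `query_gpu_memory()` once before training starts and again while it runs shows how much VRAM other programs already hold and how close training gets to the card's limit.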

Post by **abigflea** » Tue Sep 01, 2020 11:25 pm

nnifj wrote: ↑Mon Aug 31, 2020 7:38 pm
If by out of memory you mean it runs out of my computer's memory, perhaps. It does seem to work best (best meaning it works for an hour or 2) when I restart my computer, and it tends to crash when I open up even light applications.

If your computer is crashing after an hour or 2, I personally would start thinking some component is getting hot and unstable.

nnifj wrote: ↑Mon Aug 31, 2020 7:38 pm
A batch size of 2 is what I'm using, as anything larger crashes after 30 seconds. Turning it down to 1 didn't seem to help much. I would have thought my 16 GB of RAM and RTX 2060 would handle DFLSAE at 144 pixels with growth, mixed precision, multiscale decoder, 512 autoencoder, 42 encoder, and 30 decoder a lot better than this.

DFL uses a lot of VRAM. I have a model with 42 Enc, 22 Dec, at 128px and get a batch size of 12 on my RTX 2070 8GB.
I can run a batch of 16, but I get OOM after a few thousand iterations.
With your 6GB card and those settings, you are not going to get a high batch size.
Maybe tone that decoder back down to 21 (the default).
Big disclaimer here: every model will use slightly different amounts of memory, so the batch size may go up or down a notch.
Windows will take more for itself and knock down your maximum batch size.
I suspect other running programs will use more VRAM too. I run my card without a monitor connected to keep every megabyte as free as possible.

If your system RAM is filling up, you have some other program(s) stealing your memory. FS doesn't use much (thanks, Torzdf). I can run 3 separate instances, all training, and they use about 14GB in Linux.

nnifj wrote: ↑Mon Aug 31, 2020 7:38 pm
In the info boxes, the encoder settings state that lower values can free up VRAM. Would turning off growth, mixed precision, or multiscale decoder also help free up VRAM? Also, would turning down the learning rate help in any way?

Mixed Precision should increase your available memory so leave that on.
Allow growth shouldn't make much difference with this issue. I have to leave it on because my 2070 won't behave without it.
I haven't really tried turning Multi-scale decoder on/off. It may help a little, try it out.
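
A rough back-of-the-envelope calculation shows why mixed precision frees memory: a float16 value takes 2 bytes where a float32 value takes 4, so each activation tensor stored at half precision takes half the space. The sketch below is only an estimate for a single tensor. The function name and the channel count of 128 are made up for illustration, and the real model holds many such tensors plus weights and optimizer state, which mixed precision typically still keeps in float32.

```python
# Bytes needed to store one value of each data type.
BYTES_PER_VALUE = {"float32": 4, "float16": 2}


def activation_mib(batch_size, height, width, channels, dtype="float32"):
    """Estimate the size in MiB of one activation tensor of the given shape."""
    values = batch_size * height * width * channels
    return values * BYTES_PER_VALUE[dtype] / (1024 ** 2)


# Batch of 2 at 144x144 (from the thread) with a hypothetical 128 channels.
full = activation_mib(2, 144, 144, 128, "float32")   # 20.25 MiB
half = activation_mib(2, 144, 144, 128, "float16")   # 10.125 MiB
print(f"float32: {full} MiB, float16: {half} MiB")
```

The same function also shows how memory scales linearly with batch size, which is why dropping the batch is usually the first thing to try after an OOM.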
