
Is "load weights" the right way to go if I only need the B part?

Posted: Sat Jun 18, 2022 4:05 pm
by Fed

I've read some explanations, but I'm not sure I understand correctly, so I'd like to get some clarification.

I started with the desire to exchange face A with face B.
Trained the corresponding model. All good.
Now I want to exchange face C with face B.
Seems like a waste to start training both faces from scratch.
Seems like it would be more effective to continue training face B from the point I reached while training the A->B swap.
I guess it will not save any time since face C needs to be trained anyway, but the resulting face B might be a bit better - for the same training time.

Is this the correct situation to use load weights + freeze weights?
I'm confuzled.
There's no option to load half of a model (the B face). And face A has nothing to do with face C, so I don't feel confident about loading and freezing anything related to face A.

P.S. If this is the right way to go, can I change some training settings? Like face coverage?
P.P.S. If this isn't the right way to go, is there any decent way to do this? If not, it's a bit depressing. What if I want to exchange three faces: A->B, B->C, C->A? In that case I only need three faces trained, but I would have to fully train three pairs of faces?


Re: Is "load weights" the right way to go if I only need the B part?

Posted: Sun Jun 19, 2022 11:06 am
by torzdf
Fed wrote: Sat Jun 18, 2022 4:05 pm

Is this the correct situation to use load weights + freeze weights?

It is a correct situation, yes. Other people re-use their trained encoder to kick-start training totally new identities too (say, C + D). Effectively the logic is "you can use weights from a different problem to solve this problem".

Fed wrote: Sat Jun 18, 2022 4:05 pm

There's no option to load half of a model (the B face). And face A has nothing to do with face C, so I don't feel confident about loading and freezing anything related to face A.

There is in Phaze-A; there is not in other models. The other models only give you the option to load the encoder. This will still bring you savings. The encoder has trained encodings for A + B, and these encodings will still be useful to you for training B + C (or even C + D). You would load these saved weights and freeze the encoder whilst training the B + C decoders only. Once the decoded outputs begin to resemble faces, you would unfreeze the encoder and let it adapt to the new C data.
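If it helps to picture the workflow, here is a minimal Keras-style sketch of the idea (this is not Faceswap's actual code: the build_encoder()/build_decoder() helpers, the 64px input size and the "encoder_ab.h5" filename are made up for illustration). Load the old encoder weights, freeze them while the B and C decoders train, then unfreeze later.

```python
# Conceptual sketch of "load weights + freeze weights" -- not Faceswap internals.
# Assumes the encoder from a previous A->B model was saved to "encoder_ab.h5"
# (hypothetical filename) and is rebuilt with the same architecture.
from tensorflow import keras
from tensorflow.keras import layers


def build_encoder():
    """Toy stand-in for the shared encoder architecture."""
    inputs = keras.Input(shape=(64, 64, 3))
    x = layers.Conv2D(32, 5, strides=2, padding="same", activation="relu")(inputs)
    x = layers.Conv2D(64, 5, strides=2, padding="same", activation="relu")(x)
    x = layers.Flatten()(x)
    latent = layers.Dense(256)(x)
    return keras.Model(inputs, latent, name="encoder")


def build_decoder(name):
    """Toy stand-in for a per-identity decoder."""
    latent = keras.Input(shape=(256,))
    x = layers.Dense(16 * 16 * 64, activation="relu")(latent)
    x = layers.Reshape((16, 16, 64))(x)
    x = layers.Conv2DTranspose(32, 5, strides=2, padding="same", activation="relu")(x)
    outputs = layers.Conv2DTranspose(3, 5, strides=2, padding="same", activation="sigmoid")(x)
    return keras.Model(latent, outputs, name=name)


encoder = build_encoder()
encoder.load_weights("encoder_ab.h5")   # re-use the encoder trained on A + B
encoder.trainable = False               # freeze it while the new decoders catch up

decoder_b = build_decoder("decoder_b")
decoder_c = build_decoder("decoder_c")  # brand-new decoder for identity C

inputs = keras.Input(shape=(64, 64, 3))
model_b = keras.Model(inputs, decoder_b(encoder(inputs)))
model_c = keras.Model(inputs, decoder_c(encoder(inputs)))
model_b.compile(optimizer="adam", loss="mae")
model_c.compile(optimizer="adam", loss="mae")

# ... train both sides until the decoded outputs start to resemble faces ...

encoder.trainable = True                # then unfreeze so the encoder adapts to C
model_b.compile(optimizer="adam", loss="mae")   # re-compile for the change to apply
model_c.compile(optimizer="adam", loss="mae")
```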

Fed wrote: Sat Jun 18, 2022 4:05 pm

P.S. If this is the right way to go, can I change some training settings? Like face coverage?

You can change coverage, yes. I don't know how much this will hurt things, but as you will eventually unfreeze the encoder anyway, any coverage changes should feed back into the encoder.

Fed wrote: Sat Jun 18, 2022 4:05 pm

P.P.S. If this isn't the right way to go, is there any decent way to do this? If not, it's a bit depressing. What if I want to exchange three faces: A->B, B->C, C->A? In that case I only need three faces trained, but I would have to fully train three pairs of faces?

I would say yes. With Phaze-A you could freeze the entire B side and just get decoder C caught up.
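In rough Keras terms, freezing the whole B side looks something like the sketch below (again just an illustration, re-using the hypothetical build_encoder()/build_decoder() helpers and filenames from the earlier example, not Phaze-A's real code): load and freeze both the encoder and decoder B, and only let decoder C train.

```python
# Sketch of the "freeze the entire B side" idea (hypothetical filenames,
# re-using build_encoder()/build_decoder() from the earlier sketch).
encoder = build_encoder()
encoder.load_weights("encoder_ab.h5")       # already trained on A + B
decoder_b = build_decoder("decoder_b")
decoder_b.load_weights("decoder_b.h5")      # already trained B decoder

encoder.trainable = False                   # nothing on the B side updates
decoder_b.trainable = False

decoder_c = build_decoder("decoder_c")      # the only part that still learns

inputs = keras.Input(shape=(64, 64, 3))
model_b = keras.Model(inputs, decoder_b(encoder(inputs)))   # stays fixed
model_c = keras.Model(inputs, decoder_c(encoder(inputs)))
model_c.compile(optimizer="adam", loss="mae")               # train only the C side
```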


Re: Is "load weights" the right way to go if I only need the B part?

Posted: Sun Jun 19, 2022 11:32 am
by Fed
torzdf wrote: Sun Jun 19, 2022 11:06 am

I would say yes. With Phaze-A you could freeze the entire B side and just get decoder C caught up.

Ah... Interesting. I've been using Villain so far since it looked like an easy option. Phaze-A could be interesting to look at in the future.

Thanks for the answer. Good to know that previously trained models have more than one use.

Actually, now I wonder... I have one more question.
Does this jump-start effect accumulate?
I mean, if we have two scenarios:
1) Train A->B, then use it to jump-start and train X->Y.
2) Train A->B, then use it to jump-start and train C->D, then use that to jump-start and train X->Y.
Is the jump start for X->Y going to be better in the second scenario?


Re: Is "load weights" the right way to go if I only need the B part?

Posted: Wed Jun 22, 2022 10:43 am
by torzdf

I don't know if it will be better; it may or may not be. I've never really tested this at length, so I don't know how much of the encoder's information gets rewritten. I certainly know of people who do what you are describing, though, and just keep re-using their encoder.