How to improve training?

burned4l
Posts: 2
Joined: Thu Nov 18, 2021 11:50 am

How to improve training?

Post by burned4l »

I've been looking around in the forums, but have not found much info on what I can do to improve an existing model's training. I have a model that was trained on two sets of 3000 samples for A and B with plenty of different angles. The loss in graph analysis turned pretty much into a straight line at around 0.02 error for A after 50K iterations (B side was 0.03, but I was not looking to swap B>A anyway).

My data sets were cleaned up manually fairly well - not saying they are perfect, as it is my first time at this. I read in some threads here that we can stop training and make some changes to the input data (add/remove), but not sure what I am adding or removing? For example, if I see the preview for training is having difficulty with side views, should I add more sides and remove some frontal views, or just add more side views? (I think I have an OK distribution of all angles in my datasets as-is). Should I change some settings and run the training again (loading weights where it left off)?

Anyway, any pointers/references are appreciated. Below is the config I used for training (pretty much default everything per your awesome tutorials!!!)

One note, during training with batch size of 16, I was running into a WARNING with value of an index going out of range after a few thousand iterations, so I did some training on 64 batch size and it was OK, then I tried 16 again and got the same warning, so settled for 32 batch size.

Code:

{
  "convert": {
    "Input Dir": "",
    "Output Dir": "",
    "Alignments": "",
    "Reference Video": "",
    "Model Dir": "",
    "Color Adjustment": "avg-color",
    "Mask Type": "extended",
    "Writer": "opencv",
    "Output Scale": 100,
    "Frame Ranges": "",
    "Input Aligned Dir": "",
    "Nfilter": "",
    "Filter": "",
    "Ref Threshold": 0.4,
    "Jobs": 0,
    "Trainer": "",
    "On The Fly": false,
    "Keep Unchanged": false,
    "Swap Model": false,
    "Singleprocess": false,
    "Exclude Gpus": "",
    "Configfile": "",
    "Loglevel": "INFO",
    "Logfile": ""
  },
  "extract": {
    "Input Dir": "",
    "Output Dir": "",
    "Alignments": "",
    "Detector": "s3fd",
    "Aligner": "fan",
    "Masker": "",
    "Normalization": "none",
    "Re Feed": 0,
    "Rotate Images": "",
    "Min Size": 0,
    "Nfilter": "",
    "Filter": "",
    "Ref Threshold": 0.4,
    "Size": 512,
    "Extract Every N": 1,
    "Save Interval": 0,
    "Debug Landmarks": false,
    "Singleprocess": false,
    "Skip Existing": false,
    "Skip Existing Faces": false,
    "Skip Saving Faces": false,
    "Exclude Gpus": "",
    "Configfile": "",
    "Loglevel": "INFO",
    "Logfile": ""
  },
  "train": {
    "Input A": "D:/Data/Model A/Trainer",
    "Input B": "D:/Data/Model B/Trainer",
    "Model Dir": "D:/Data/Model 1",
    "Load Weights": "D:/Data/Model 1/dfaker.h5",
    "Trainer": "dfaker",
    "Summary": false,
    "Freeze Weights": false,
    "Batch Size": 32,
    "Iterations": 1000000,
    "Distributed": false,
    "Save Interval": 250,
    "Snapshot Interval": 25000,
    "Timelapse Input A": "",
    "Timelapse Input B": "",
    "Timelapse Output": "",
    "Preview Scale": 100,
    "Preview": false,
    "Write Image": false,
    "No Logs": false,
    "Warp To Landmarks": false,
    "No Flip": false,
    "No Augment Color": false,
    "No Warp": false,
    "Exclude Gpus": "",
    "Configfile": "",
    "Loglevel": "INFO",
    "Logfile": ""
  },
  "alignments": {
    "Job": "",
    "Output": "console",
    "Alignments File": "",
    "Faces Folder": "",
    "Frames Folder": "",
    "Extract Every N": 1,
    "Size": 512,
    "Large": false,
    "Exclude Gpus": "",
    "Configfile": "",
    "Loglevel": "INFO",
    "Logfile": ""
  },
  "effmpeg": {
    "Action": "extract",
    "Input": "input",
    "Output": "",
    "Reference Video": "",
    "Fps": "-1.0",
    "Extract Filetype": ".png",
    "Start": "00:00:00",
    "End": "00:00:00",
    "Duration": "00:00:00",
    "Mux Audio": false,
    "Transpose": "",
    "Degrees": "",
    "Scale": "1920x1080",
    "Quiet": false,
    "Verbose": false,
    "Exclude Gpus": "",
    "Configfile": "",
    "Loglevel": "INFO",
    "Logfile": ""
  },
  "manual": {
    "Alignments": "",
    "Frames": "",
    "Thumb Regen": false,
    "Single Process": false,
    "Exclude Gpus": "",
    "Configfile": "",
    "Loglevel": "INFO",
    "Logfile": ""
  },
  "mask": {
    "Alignments": "",
    "Input": "",
    "Input Type": "frames",
    "Masker": "extended",
    "Processing": "missing",
    "Output Folder": "",
    "Blur Kernel": 3,
    "Threshold": 4,
    "Output Type": "combined",
    "Full Frame": false,
    "Exclude Gpus": "",
    "Configfile": "",
    "Loglevel": "INFO",
    "Logfile": ""
  },
  "preview": {
    "Input Dir": "",
    "Alignments": "",
    "Model Dir": "",
    "Swap Model": false,
    "Exclude Gpus": "",
    "Configfile": "",
    "Loglevel": "INFO",
    "Logfile": ""
  },
  "restore": {
    "Model Dir": "",
    "Exclude Gpus": "",
    "Configfile": "",
    "Loglevel": "INFO",
    "Logfile": ""
  },
  "sort": {
    "Input": "",
    "Output": "",
    "Sort By": "face",
    "Keep": false,
    "Ref Threshold": -1.0,
    "Final Process": "rename",
    "Group By": "hist",
    "Bins": 5,
    "Log Changes": false,
    "Log File": "sort_log.json",
    "Exclude Gpus": "",
    "Configfile": "",
    "Loglevel": "INFO",
    "Logfile": ""
  },
  "tab_name": "train"
}

Thank you!

torzdf
Posts: 2651
Joined: Fri Jul 12, 2019 12:53 am
Answers: 159
Has thanked: 129 times
Been thanked: 622 times

Re: How to improve training?

Post by torzdf »

Everyone's first swap is terrible. Just be prepared for that. However....

burned4l wrote: Fri Nov 19, 2021 3:23 pm

graph analysis turned pretty much into a straight line at around 0.02 error for A after 50K iterations

Make sure to re-read this section. The graph at default zoom will not give you enough information in later-stage training:
viewtopic.php?f=6&t=146#monitor
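As a rough illustration of why a flat-looking line can hide progress, here is a minimal sketch that smooths a list of loss values and compares the latest window against the one before it. This is not Faceswap code: the function names, the window size and the sample loss values are all made up for the example, and it assumes you already have the per-iteration loss values as plain numbers.

```python
from statistics import mean


def ema(values, alpha=0.1):
    """Exponential moving average of a sequence of loss values."""
    smoothed = []
    current = None
    for value in values:
        # The first value seeds the average; later values are blended in.
        current = value if current is None else alpha * value + (1 - alpha) * current
        smoothed.append(current)
    return smoothed


def recent_improvement(values, window=1000):
    """Mean loss of the previous window minus mean loss of the latest window.

    A positive result means the loss is still falling, even if the
    full-range graph looks like a straight line.
    """
    if len(values) < 2 * window:
        raise ValueError("need at least two full windows of loss values")
    previous = values[-2 * window:-window]
    latest = values[-window:]
    return mean(previous) - mean(latest)


# Hypothetical late-stage loss: drifts from 0.0210 down to 0.0200
# over 4000 iterations, which looks flat at default zoom.
losses = [0.021 - 0.001 * i / 3999 for i in range(4000)]
print(f"improvement over the last window: {recent_improvement(losses):.6f}")
```

A small positive number here is the same thing you would see by zooming the graph in on the most recent iterations.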

but not sure what I am adding or removing? For example, if I see the preview for training is having difficulty with side views, should I add more sides and remove some frontal views, or just add more side views? (I think I have an OK distribution of all angles in my datasets as-is). Should I change some settings and run the training again (loading weights where it left off)?

Either/or. You can just add, or add + remove. It doesn't really matter. No need to "load weights." That is just for starting a new model. If you have set the model-dir to a location that already contains your model, Faceswap will auto-resume from where you left off.

Anyway, any pointers/references are appreciated. Below is the config I used for training (pretty much default everything per your awesome tutorials!!!)

Bear in mind that the original model is only a 64px output. At that resolution it will only ever get to a certain level of quality, given that it will need to be resized to fit your final frame. It is a good starting out model though for quicker results whilst you learn the process.
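To put a number on the resizing point, here is a minimal sketch of the arithmetic. The face sizes are hypothetical examples, and the function is just for illustration, not part of Faceswap.

```python
def upscale_factor(face_size_in_frame_px, model_output_px=64):
    """How much a model output must be enlarged to cover the face in the frame."""
    return face_size_in_frame_px / model_output_px


# Hypothetical face sizes (in pixels) as they might appear in a final frame.
for face_px in (64, 256, 512):
    factor = upscale_factor(face_px)
    print(f"{face_px}px face in frame -> 64px output enlarged {factor:.1f}x")
```

The larger the face is in the final frame, the more a 64px output has to be stretched, which is where the quality ceiling comes from.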

One note, during training with batch size of 16, I was running into a WARNING with value of an index going out of range after a few thousand iterations, so I did some training on 64 batch size and it was OK, then I tried 16 again and got the same warning, so settled for 32 batch size.

Not sure what this error is. I would need to see it to know if it is important or not. It may just be a GUI error.

My word is final

cagonzon
Posts: 4
Joined: Sun Jan 02, 2022 4:38 am
Has thanked: 1 time
Been thanked: 1 time

Re: How to improve training?

Post by cagonzon »

I also had this same line of inquiry, so figured I'd hop on this thread, if that's ok, instead of making a new one. The FAQ (and the response above) mention simply adding to or removing from the selections.

Am I understanding this correctly? Is this the procedure to follow:

  1. Take a new batch of photos.
  2. Extract them.
  3. Place the output files into the same folder that's being used in training.
  4. Resume training using the model that's already 200k iterations in.

My question is: what about the alignments? Would those be separate? Or should I add the "new" photos to the original batch, extract all of that again, and then attempt to resume with the new Face A images in the dataset? Or would that be starting from scratch?

Also, does this apply to both the Face A and Face B "source"?

I'm sorry if I'm overthinking this, but I'd appreciate guidance. Thank you!

bryanlyon
Site Admin
Posts: 793
Joined: Fri Jul 12, 2019 12:49 am
Answers: 44
Location: San Francisco
Has thanked: 4 times
Been thanked: 218 times

Re: How to improve training?

Post by bryanlyon »

All required alignments for training are included in the training images themselves. The .fsa file is now used only for final conversion.
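In practice that means step 3 of the procedure asked about above is just a file copy. Below is a minimal sketch of it, assuming the newly extracted faces sit in their own folder and that a plain byte-for-byte copy keeps whatever is stored inside each image. The function name and the folder paths in the usage comment are hypothetical, and skipping files whose names already exist is a cautious choice for the example, not a Faceswap requirement.

```python
import shutil
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def merge_faces(new_faces_dir, training_dir):
    """Copy newly extracted face images into an existing training folder.

    Files are copied unmodified. A file whose name already exists in the
    training folder is skipped rather than overwritten.
    Returns (copied, skipped) as two lists of paths.
    """
    src = Path(new_faces_dir)
    dst = Path(training_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied, skipped = [], []
    for image in sorted(src.iterdir()):
        if not image.is_file() or image.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # ignore sub-folders and non-image files
        target = dst / image.name
        if target.exists():
            skipped.append(image)
            continue
        shutil.copy2(image, target)  # byte-for-byte copy of the image file
        copied.append(target)
    return copied, skipped


# Example usage (hypothetical paths):
# copied, skipped = merge_faces("D:/Data/New Faces A", "D:/Data/Model A/Trainer")
# print(f"copied {len(copied)} faces, skipped {len(skipped)} name clashes")
```

After the copy, resuming training with the same model folder picks up the enlarged set. The same approach would apply to the B side.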
