Alternative Extraction Workflow


Post by 363LS2GTO »

I have tried the extraction method advocated in the extraction workflow guide, and while it works for some applications, I found it overly cumbersome.

I wanted to share my solution.

First, I collect all of my sources. For videos, I place my source videos in file folders such as video 1, video 2, etc.

I then create a separate folder labeled extracted photos for video 1, video 2, etc.

I go to tools--Effmpeg and 'extract' photos from my source videos. The settings allow you to extract every Nth frame, with N chosen to suit your particular video.

I do this for all of my videos and have the extracted photos go to their respective folders.
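For anyone who prefers a script to the GUI, here is a minimal sketch of the same "every Nth frame" idea using plain ffmpeg. This is not what the Effmpeg tool runs internally; the function name, folder names, and file name pattern are hypothetical, and it assumes ffmpeg is installed and on your PATH. Newer ffmpeg builds prefer `-fps_mode vfr` over `-vsync vfr`.

```python
from pathlib import Path


def build_extract_command(video: Path, out_dir: Path, every_n: int) -> list[str]:
    """Return an ffmpeg argument list that saves every Nth frame as PNG."""
    if every_n < 1:
        raise ValueError("every_n must be at least 1")
    return [
        "ffmpeg",
        "-i", str(video),
        # Keep only frames whose index is a multiple of every_n.
        "-vf", f"select='not(mod(n,{every_n}))'",
        # Do not duplicate frames to fill the gaps left by the filter.
        "-vsync", "vfr",
        # Hypothetical naming pattern: frame_000001.png, frame_000002.png, ...
        str(out_dir / "frame_%06d.png"),
    ]


# Hypothetical paths matching the folder layout described above.
cmd = build_extract_command(
    Path("video 1") / "clip.mp4", Path("extracted photos video 1"), 10
)
print(cmd)
```

To actually run it, pass the list to `subprocess.run(cmd, check=True)` once the output folder exists.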

I then look at every photo and delete photos without faces, blurry photos, etc. This is not as difficult as it sounds. On my first try, I was able to extract and hand select over 3,000 photos (out of a total of 6,000+ generated photos) from 14 videos in just a few hours.

I can now use an AI program such as Remini AI to enhance as many photos as I want to improve source quality. As stated by others, source quality seems to be one of the largest factors in swap quality. This step takes the most time by far, and for someone who has the money, a program such as Gigapixel would be a worthwhile investment for its batch processing ability.

I then move all of the photos extracted from video, photos downloaded from the internet, and the enhanced photos to a "master source frame" folder. You will want to back up this folder and probably put the number of photos (files) in its name. DO NOT DELETE ANY SOURCE FRAMES (the thousands of photos) IN THIS FOLDER FROM THIS POINT or you will mess up the alignments file you are about to create!
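If you would rather script the merge, a minimal sketch could look like the following. Everything here is an assumption for illustration: the function name, the folder names, and the choice to prefix each file with its source folder name so that identically named frames from different videos do not overwrite each other. It copies rather than moves, so the per-video folders remain as a backup.

```python
import shutil
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def build_master_folder(sources: list[Path], parent: Path) -> Path:
    """Copy every image from the source folders into one counted master folder."""
    staging = parent / "master source frames"
    staging.mkdir(parents=True, exist_ok=True)
    count = 0
    for src in sources:
        for image in sorted(src.iterdir()):
            if image.suffix.lower() not in IMAGE_SUFFIXES:
                continue
            # Prefix with the source folder name to avoid name collisions.
            shutil.copy2(image, staging / f"{src.name}_{image.name}")
            count += 1
    # Put the number of files into the folder name.
    final = parent / f"master source frames ({count})"
    staging.rename(final)
    return final


# Small demonstration with empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("video 1", "video 2"):
        (root / name).mkdir()
        (root / name / "frame_000001.png").write_bytes(b"")
    master = build_master_folder([root / "video 1", root / "video 2"], root)
    copied = sorted(p.name for p in master.iterdir())
    print(master.name, copied)
```

Run it before extracting faces, not after, since renaming or re-copying the frames later would no longer match the alignments file.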

Now go and run "EXTRACT" as directed in the extract guide to extract the faces.

From this point you will be able to use the sort tool as normal, as well as the manual tool. The added benefit is that the manual tool now references the exact photos used for training, so every face can be precisely landmarked and masked.

You can now go to tools--mask to regenerate your masks from a single folder / alignments file.

Anytime you want to regenerate your entire face set or perform any other action, it's only one folder and one alignments file you have to reference.

This process cuts down on the repetition of having to extract faces, sort faces, manually edit faces, and generate masks from multiple sources. If you are processing for fit training or only using one or two sources, it may be advantageous to use the regular method, but this method has many advantages for saving time and improving image quality along the way.

The only downside I can see is that if a single source photo (frame) is deleted or moved, it will mess up your alignments file. Keep a backup and this will not be a problem. You could also use Effmpeg to create a single video of your photos, but this would probably reduce image quality if you had to extract from the video. Regardless, it's probably not a bad idea, as you would then have a 'master' source video of all of your hand edited photos.
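One way to catch a deleted or moved frame early is to keep your own list of file names next to the backup. The sketch below does not read or touch the Faceswap alignments file; it only records the names in the master folder and reports any that have gone missing since. The function names and the `manifest.json` file are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def write_manifest(folder: Path, manifest: Path) -> None:
    """Record the names of all files currently in the folder."""
    names = sorted(p.name for p in folder.iterdir() if p.is_file())
    manifest.write_text(json.dumps(names, indent=2))


def missing_frames(folder: Path, manifest: Path) -> list[str]:
    """Return recorded file names that are no longer present in the folder."""
    expected = json.loads(manifest.read_text())
    present = {p.name for p in folder.iterdir() if p.is_file()}
    return [name for name in expected if name not in present]


# Small demonstration: record three frames, delete one, then check.
with tempfile.TemporaryDirectory() as tmp:
    frames = Path(tmp) / "master source frames"
    frames.mkdir()
    for name in ("a.png", "b.png", "c.png"):
        (frames / name).write_bytes(b"")
    # The manifest lives outside the frames folder so it is not listed itself.
    manifest_path = Path(tmp) / "manifest.json"
    write_manifest(frames, manifest_path)
    (frames / "b.png").unlink()
    missing = missing_frames(frames, manifest_path)
    print(missing)
```

Write the manifest right before running extract, then run the check before any later mask or manual tool session.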

Effmpeg runs very quickly and I am able to extract thousands of photos in just minutes using the high quality PNG format. I use a SATA SSD but my computer is fairly old overall. The hand sorting takes time, but vastly improves quality and saves time down the road.

From this point forward, I just extract my faces to be used for training to a folder and then add and remove the faces I want to use as I train.

You can always add 'updates' in the future as additional (separate) source videos or photos with their own alignments files, as in the traditional method, but I find that I want to have all of my prep work done before I invest 60-150+ hours in training, as this is costly and time consuming.

This method also works great if you want to come back at a later point in time to use the same person again for training as it greatly simplifies your file structure and cuts down on confusion (how did I have all of this arranged 6 months ago?).

This method is working well for me so far and I am currently training on this face set generation method with great results due to the improved source quality I can achieve by hand selecting the photos, hand aligning each individual photo, and then running the photos through AI enhancement software.

Re: Alternative Extraction Workflow

Post by wader »

I figured out how to combine extracted frames from different video sources plus individual images into a single folder for extraction/masking/alignments. I think it was from the Extraction FAQ.

Like you, I saw that repeating the same process for each source would add cycles, when in the end the application doesn't care where you obtained the different images to train and/or convert against.

This is also related to my preferred end-to-end workflow, where I work entirely on the target video's frames for extraction/masking/alignment and rebuild the converted frames back to video at the end: this bypasses errors I previously encountered where the Convert step could not handle some video formats very well. Working entirely on individual frames from the target video to convert is foolproof in that regard.
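As a rough illustration of the final "frames back to video" step, here is a minimal sketch using plain ffmpeg rather than any Faceswap tool. The function name, paths, file name pattern, and the frame rate of 25 are all assumptions; use the target video's real frame rate, and note that this produces video only, with no audio track.

```python
from pathlib import Path


def build_rebuild_command(
    frames_dir: Path, pattern: str, fps: float, output: Path
) -> list[str]:
    """Return an ffmpeg argument list that encodes numbered frames to H.264."""
    return [
        "ffmpeg",
        # Input frame rate; should match the original target video.
        "-framerate", str(fps),
        "-i", str(frames_dir / pattern),
        "-c:v", "libx264",
        # yuv420p keeps the result playable in most players.
        "-pix_fmt", "yuv420p",
        str(output),
    ]


# Hypothetical folder of converted frames named frame_000001.png, and so on.
rebuild_cmd = build_rebuild_command(
    Path("converted"), "frame_%06d.png", 25, Path("converted.mp4")
)
print(rebuild_cmd)
```

As with any argument list like this, it can be executed with `subprocess.run(rebuild_cmd, check=True)`.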
