I did some messing around, and the biggest problem I encountered (and partially overcame) comes down to this: once the files are extracted and I have to clean them in the clean or sort step, I should not make a new folder for the sorted faces to go into (this was my first mistake). Leaving them in the same folder seems to keep the sort and cleaning steps from throwing errors.
The new issue I've found is that using the manual tool, finding a face which isn't helpful, right-clicking on its bounding box and hitting "delete" causes more harm than good. After deleting around 20+ of these mostly useless faces in the manual tool, the data set can no longer be trained. Training hits an error rather than skipping X number of faces. My assumption is that it has some maximum number of faces it can ignore (I say this because it will skip at least 1 in my current set without erroring), and that when I hit "delete" in the manual tool it either isn't deleting the image from the folder or isn't deleting the matching entry in the alignments file, idk which. However, by random guessing at which tool I can use to clean the alignments file, I've fixed this at least once successfully... I basically ran it through just about everything (missing faces, missing frames, missing alignments, leftover faces, whatever seemed like it might fix it), though I was doing this rather blindly since I haven't seen a good explanation of what those tools actually do. So I just went with blind faith, ran everything and hoped it'd be okay.
Also, I noticed a big place where what the guide says is actually more confusing than the explanation the tooltip in the program gives (this was a separate problem). I think it mostly revolves around the terms "faces", "frames" and "input" when using each tool (my use of the word tool here is meant to include the initial extraction, the alignment extraction, each sort and clean step, train, convert, etc.). It's confusing that in one step a given set of data is called the "faces folder", in another it is called the "frames folder", and in another we even see things like "training images", when as best I can tell I want to have every image in my training set unless I have a set of images beyond, say, 10,000. So I just select extract every N = 1.
My suggestion is to have a specific term for each folder/file and a separate term for a video file.
Data set A = Folder 1 or "original video 1"
Data set B = Folder 2 or "original video 2"
Then other files that may be created:
"Sorted folder 1" or "sorted folder 2"
"Alignment file 1" (from data set A), "Alignment file 2" (from data set B) or "Alignment file 3" (for the video to swap onto)
For the case of swapping onto video we would always have a 3rd data set (we could call it data set C), which goes through a different process than the others, so we could avoid calling it anything other than the "Final swap video", and its alignments the "Final swap video's alignment file".
To me, calling it a "face folder" or a "frames folder" gets needlessly confusing, and within the guide what was faces in one step becomes frames in another, so it gets very "wait whut?" for newcomers.
The same goes for the input/output tags since the question can arise "Wait the output for this step? or from last step??"
Also, it would help to have a tool for looking not at the outcome of the alignments file (like the manual tool does) but at the actual important contents within it, and for editing them. With that I might be able to decipher what exactly is going wrong when I use the delete function on a bounding box in the manual tool, or when I add a new bounding box on an image that didn't have one (as that too was a problem).
I do have a guess at a solution, though, that would take a lot of work off the user's end when using the manual tool and fix a few potential issues it causes. When we hit the save key, the manual tool should do a final check on all the images listed in the alignments file and make sure they are all accounted for; if one isn't, it should be removed from the file. It should also go through and add in any new alignments given to images which had none before. Then, once it has fixed those issues, it should save the changes we made to data that already existed.
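To make the idea concrete, here is a minimal sketch of that save-time check. It is only an illustration under my own assumptions, not how the manual tool or the real alignments file works: I'm pretending the alignments are a plain dict mapping an image file name to a list of faces, and `reconcile`, `alignments`, `pending` and `IMAGE_SUFFIXES` are all made-up names.

```python
from pathlib import Path

# Assumption: these are the image types in the folder.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def reconcile(alignments, image_dir, pending):
    """Clean up a hypothetical alignments mapping at save time.

    alignments: dict of image file name -> list of faces (made-up layout).
    pending:    dict of image file name -> list of faces added or edited
                during the current session.
    Returns (cleaned, removed, unaligned).
    """
    on_disk = {
        p.name for p in Path(image_dir).iterdir()
        if p.suffix.lower() in IMAGE_SUFFIXES
    }

    # 1. Drop entries whose image is no longer in the folder.
    cleaned = {name: faces for name, faces in alignments.items()
               if name in on_disk}
    removed = sorted(set(alignments) - on_disk)

    # 2. Apply new or edited alignments, but only for images that exist.
    for name, faces in pending.items():
        if name in on_disk:
            cleaned[name] = faces

    # 3. List images that still have no entry at all.
    unaligned = sorted(on_disk - set(cleaned))
    return cleaned, removed, unaligned
```

The point of returning `removed` and `unaligned` instead of silently fixing everything is that the user would get to see what the check changed before training on the result.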
Also I have found that some people mentioned crashes with the manual tool. I have info on the cause of those crashes.
For me it happens when the images and the alignments have de-synced on a folder of images (not a video), to where I have several hundred images whose facial alignments are correct in terms of the form of the face but sit far off from where the actual face in the image lies. I can go through and correct them with the manual tool as needed (idk why they happen, though; I've had about 1100 out of a data set of 2300+ faces). While correcting them I tend to stay on the bounding box screen and flip through the frames with the z & x keys, and I have noticed that if I try to pull a bounding box from outside the overall workspace into the workspace, the computer crashes. I assume that since the box is outside the workspace and the tool cannot see past the edges of the workspace, the bounding box doesn't know where to lock on to a landmark, so it just crashes. I've found a workaround, however. If I swap to the view where I see the yellow box instead of the blue bounding box (I forget the name of this screen and I'm in the middle of a training session now so I can't check) and pull the box within the bounds of the screen while still in that view (I don't just flip in and out of the screens), then it won't crash. I can then flip back to the bounding box screen, and since the bounding box is now within the bounds of the workspace it doesn't crash as it had.
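If my guess about the cause is right, the kind of guard I have in mind would look something like the sketch below: pull the box back inside the workspace before anything tries to use it. This is purely hypothetical; `clamp_box` and its parameters are names I made up, and I don't know how the manual tool actually stores its boxes.

```python
def clamp_box(left, top, width, height, canvas_width, canvas_height):
    """Return the box moved (and shrunk if needed) to fit inside the canvas.

    All values are pixels; (left, top) is the box's top-left corner.
    """
    # A box larger than the canvas cannot fit, so shrink it first.
    width = min(width, canvas_width)
    height = min(height, canvas_height)

    # Then slide it so that no edge lies outside the canvas.
    left = max(0, min(left, canvas_width - width))
    top = max(0, min(top, canvas_height - height))
    return left, top, width, height
```

For example, a 100x100 box whose left edge sits 50 pixels outside a 640x480 workspace would be slid right so its left edge lands at 0, with its size unchanged.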
I should mention I did notice that with files extracted from a video, the dots *do* have to be off to the side from the original thumbnail image we see. My best guess at why the dots for the face are off in a black area instead of over top of the face we see is that the alignments file in that instance points the dots to the location of the actual face as it would appear on screen if we watched the video in full view. The tool then saves a snapshot of that face off to the side, just for reference for us, which may not even be needed (idk if the program does anything useful with it).
Though alignments should always correspond to the faces when you input a folder of images rather than a video, so I'm unsure why for several hundred images I'm getting some sort of artifact of the video process on my images/alignments data sets in the manual tool, as if they had come from a video when they didn't.
A possible way to help everyone, I feel, would be to add another 2 tabs in the program alongside the "tools" and "convert" tabs: simply rename the first extraction tab to "extract 1" and name the two new tabs "extract 2" and "extract 3".
Though as I write this I notice it may also be convenient to have an "align/alignments 1" and "...2" for the first two data sets. If memory serves, the 3rd data set doesn't go through the alignment extraction process, since they aren't used but are instead discarded to make room in the convert step for the new faces.
Gosh, I'm sorry this is long. I'm trying to explain things in a way that my meaning isn't lost, and it's rough through text. I hope on some level I'm helping refine the workflow in a way that is more intuitive and easy to learn for newcomers.
About this section from before that you didn't understand where I said:
"I've gone back through the same data set each about three times and the best I can figure out is, after you sort and clean the file, assume the alignments file is useless and just start at the extraction step again"
What I meant was that with each alignment file and set of images I went through the steps (extract, sort, clean, alignment) about three times over and kept running into errors that said, seemingly interchangeably, that frames, faces or alignments were missing. As I mentioned earlier, I think this was caused by running the sort command, choosing a new folder, then reverting the faces' file names back after the sort. For some reason, even though this should work anyway, it didn't. So what I was doing was running the folder I had my images in after the sort process back through the original extraction process, in hopes of making a new alignment file that wasn't missing something (be it frames/faces/alignments or whatever else it wanted in order to not hit errors). I should mention I was using the "remove faces" tool through this process too; it just didn't seem to help matters much.
Found a NEW problem and this one I don't think I can fix.
When I load the manual tool, I can get it to load and I can edit masks/points, whatever I need, up until the 2107th face. After that I can see more thumbnails with more faces in the scrolling section. I can even switch them to show their points and masks instead of the thumbs, but when I click them they won't load in the workspace. Also, I can only see them on the "all frames" selection. Further, on the initial load into the manual tool I get a count for all frames of 2107 (which matches the count of images in the folder I am viewing and using with the alignments). However, after switching from all frames to frames with faces, my count drops to 2107 (the correct number of overall frames) and then goes up to 2407 (which seems to include the thumbs, and I don't know where they come from), and I cannot edit any from 2107-2407... I checked everything I have with images in it, and the most I've seen is 2541, with 4 files being the alignments and the backed up alignments. So it doesn't make sense to have the number 2407 showing up. From what I can tell, when I run the train command it actually uses the additional 300 "ghost" files and it mucks up the entire neural network's attempt at making a model; even though it started with 2107 good files, the 300 ghost ones inevitably send it to hell. This is my new struggle, and my only solution? Take my file with the lowest number of images, turn it into the source for my new extraction and start over for side B of my training...
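One thing that would help narrow this down is a plain count of what the alignments actually contain versus what is in the folder. The sketch below shows the kind of report I mean. It assumes, purely for illustration, that the alignments can be read into a dict of image file name to list of faces; `count_report` and its keys are names I made up, not anything from the program.

```python
def count_report(alignments, image_names):
    """Compare a hypothetical alignments mapping with the images on disk.

    alignments:  dict of image file name -> list of faces (made-up layout).
    image_names: iterable of the file names actually in the image folder.
    """
    image_names = set(image_names)
    return {
        # How many images are really in the folder.
        "images": len(image_names),
        # How many frames the alignments mention.
        "frames_in_alignments": len(alignments),
        # Total faces; this exceeds the frame count whenever any frame
        # holds more than one face.
        "faces_in_alignments": sum(len(f) for f in alignments.values()),
        # Entries that point at an image which is not in the folder.
        "ghost_frames": sorted(n for n in alignments if n not in image_names),
    }
```

If `ghost_frames` came back with about 300 names, that would back up my ghost file theory. If it came back empty while `faces_in_alignments` was 2407, the extras would more likely be second faces on frames I already have.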
I shouldn't be getting forced at every problem to rework a whole extraction over from the beginning like this. Something's gotta give. What can I do?