As in exactly what part of the face they're supposed to mark? I can guess some or maybe most of them, but I suspect it'd help the model if we knew exactly what each point was supposed to mark when fixing alignments with the manual tool. Things can get murky with shadows, angles, eyes closed vs opened, etc.
Is there a guide to the landmarks?
Re: Is there a guide to the landmarks?
I asked this a couple of years ago and didn't get a response. My two cents, for what it's worth.
Accuracy is important for training. I don't think it needs to be perfect, but the closer to perfect the better.
The jawline runs down to the near edge of the chin, then to the middle of the chin (the chin dimple), then to the far edge of the chin. The nose goes from the far edge of one nostril, to the middle, to the far edge of the other. The eyes go from one edge of the whites to the other edge, and the two landmarks in the middle measure the iris. For the mouth, you have the outer edge of the lips, running up to the peaks and then the dip in the middle (the outer bottom lip is pretty straightforward). The inner mouth points trace the inner lip, so they measure the opening of the mouth.
Again, this is my opinion as I never got an answer when I asked.
Hope this helps.
Re: Is there a guide to the landmarks?
So some things you should know about landmarks....
They are laid out as follows:
An example from a real face:
This is fine for forward facing, but is a little more complex in profile. Generally, with profile shots, the obscured jawline points will follow the visible face contour (excluding the nose). Obscured eye/mouth/eyebrow points appear to be placed as if they were seen 'through' the face.
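For reference when reading the point numbers quoted in this thread, here is a minimal sketch of how the points are usually grouped. It assumes the common 0-indexed 68-point layout (the iBUG 300-W style convention); the names `LANDMARK_GROUPS`, `CORE_POINTS` and `group_of` are made up for illustration and are not taken from the Faceswap code. "Right" and "left" refer to the subject's own right and left.

```python
# Assumed 0-indexed 68-point layout; group names are illustrative only.
LANDMARK_GROUPS = {
    "jaw": range(0, 17),
    "right_eyebrow": range(17, 22),
    "left_eyebrow": range(22, 27),
    "nose": range(27, 36),
    "right_eye": range(36, 42),
    "left_eye": range(42, 48),
    "mouth_outer": range(48, 60),
    "mouth_inner": range(60, 68),
}

# The 'core' points are everything except the jawline.
CORE_POINTS = [i for i in range(68) if i not in LANDMARK_GROUPS["jaw"]]


def group_of(index):
    """Return the name of the group that a landmark index belongs to."""
    for name, indices in LANDMARK_GROUPS.items():
        if index in indices:
            return name
    raise ValueError(f"landmark index out of range: {index}")


print(group_of(37), group_of(44), len(CORE_POINTS))
```

Under this assumption, points 0 to 16 are the jawline and the remaining 51 points are the eyebrow, nose, eye and mouth points.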
Ultimately though, with landmarks, I say don't worry about being 100% accurate. Landmarks are primarily used for one thing and one thing only, and that is to align a face image correctly for training/swapping. This is done purely using the 'core' landmarks (that is, the eye/nose/mouth/eyebrow points). The jawline does not come into this at all.
If you are not using a landmark based mask (and you shouldn't be, BiseNet is a much better mask solution), then you can ignore badly aligned jawlines entirely. They will not impact anything in any way.
My word is final
Re: Is there a guide to the landmarks?
Thanks, interesting and good to know.
One more general question: does the algorithm somehow take into account and use details in 4k or HD footage when swapping? I'm wondering how such relatively small size inputs can yield an algorithm that gives detailed results (like skin texture) with HD or 4k footage and faces that may take up the full height of the frame.
Re: Is there a guide to the landmarks?
@torzdf I know you've read my response before about this, but don't you think the jaw line impacts the placement of the other landmarks? I know the jaw line, when using BiseNet, doesn't affect the mask placement, but don't you think it can affect the placement of the eyes, nose or mouth by pulling those features away from their intended placement if the jaw line gets wacky?
Re: Is there a guide to the landmarks?
I will also mention that some of the more overlooked important landmarks (I know I overlooked their importance) are the inner eye landmarks, represented by numbers 37, 38, 40, 41 and 43, 44, 46, 47. It took me a while to figure out that these landmarks direct the eyes. If you get googly eyes or crossed eyes in your masks, it's my opinion that they are directly related to poor placement of your irises in training. For a couple of years now I thought it had something to do with the chosen architecture of the model, etc. It wasn't until recently, after re-extracting for higher definition training, that I realized how far off my iris landmarks were, causing some of my masks to become cross-eyed or to look in different directions.
Re: Is there a guide to the landmarks?
I'm also wondering if "variety of angles and lighting conditions" should actually exclude extreme lighting for training, like artsy low key shots where half the face is in darkness for example. Even if the target video has shots like that.
Re: Is there a guide to the landmarks?
Actually, I fill my training masks up with about 10-15% of "troubled" faces. These are faces that are blurry, extreme lighting (over/under exposed), partial faces, and extreme shots. If you're using this for general masks (rather than specific to a video, etc.) I think this can help train the mask when it encounters problems. At the very least it can't hurt.
Re: Is there a guide to the landmarks?
hullo wrote: ↑Fri Oct 13, 2023 10:22 pm
Thanks, interesting and good to know.
One more general question: does the algorithm somehow take into account and use details in 4k or HD footage when swapping? I'm wondering how such relatively small size inputs can yield an algorithm that gives detailed results (like skin texture) with HD or 4k footage and faces that may take up the full height of the frame.
You should probably open another topic for this as it is an unrelated question. However, the short answer is 'not especially'. Higher res images will yield higher quality results, but (for example) a 256px model takes faces at 256px regardless of whether they come from a low or high res source. Downsizing a 4K image to 256px will not retain any more detail than using an image that was 256px to start with.
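To put a rough number on that, here is a small sketch. It assumes a plain box (area-average) downscale purely for illustration; real resizers use other filters, and `box_downsample` is a hypothetical helper, not a Faceswap function. A one-pixel checkerboard at 2048px, which is the finest detail that image can hold, comes out as flat grey once reduced to 256px.

```python
import numpy as np


def box_downsample(img, factor):
    """Downscale a 2D image by averaging each factor x factor block."""
    height, width = img.shape
    blocks = img.reshape(height // factor, factor, width // factor, factor)
    return blocks.mean(axis=(1, 3))


# Finest possible detail at 2048px: a checkerboard with 1px squares.
rows = np.arange(2048)[:, None]
cols = np.arange(2048)[None, :]
fine = ((rows + cols) % 2).astype(float)

small = box_downsample(fine, 8)  # 2048px -> 256px

# How many source pixels end up in each pixel of the 256px input.
pixels_per_model_pixel = (2048 // 256) ** 2

print(small.shape, small.min(), small.max(), pixels_per_model_pixel)
```

Every 8x8 block of source pixels is collapsed into a single value, so whatever varied inside that block (skin texture, for example) is gone before the model sees the face.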
Re: Is there a guide to the landmarks?
MaxHunter wrote: ↑Sun Oct 15, 2023 12:19 am
@torzdf I know you've read my response before about this, but don't you think the jaw line impacts the placement of the other landmarks? I know the jaw line, when using BiseNet, doesn't affect the mask placement, but don't you think it can affect the placement of the eyes, nose or mouth by pulling those features away from their intended placement if the jaw line gets wacky?
Nope. Doesn't come into it. Either the eyes/mouth (core landmarks) are correct, or they are not. If they are correct, then you don't need to worry about the jawline. If they aren't correct, then it will be because the aligner failed to 'guess' those landmark points correctly, which, in turn, has nothing to do with the jawline.
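A rough way to see this for yourself is the sketch below. It assumes the alignment is a least-squares similarity fit (rotation, uniform scale, shift) from the core points onto a reference face, which is one common way of doing it and is not copied from the Faceswap source. The reference face here is random data, the 0-indexed 68-point layout with the jaw at points 0 to 16 is an assumption, and `similarity_transform` is a hypothetical helper.

```python
import numpy as np


def similarity_transform(src, dst):
    """Least-squares rotation + uniform scale + shift mapping src onto dst.

    Umeyama-style closed form. Returns a 2x3 affine matrix.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    src_mean, dst_mean = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - src_mean, dst - dst_mean
    cov = dst_c.T @ src_c / len(src)
    u, s, vt = np.linalg.svd(cov)
    d = np.ones(2)
    if np.linalg.det(u) * np.linalg.det(vt) < 0:
        d[-1] = -1.0  # guard against a reflection
    rot = u @ np.diag(d) @ vt
    scale = (s * d).sum() / src_c.var(axis=0).sum()
    shift = dst_mean - scale * rot @ src_mean
    return np.hstack([scale * rot, shift[:, None]])


rng = np.random.default_rng(0)
reference = rng.uniform(0.0, 1.0, size=(68, 2))  # made-up stand-in for a mean face

# A "detected" face: the reference rotated, scaled and moved into a frame.
angle = np.deg2rad(20.0)
rotation = np.array([[np.cos(angle), -np.sin(angle)],
                     [np.sin(angle), np.cos(angle)]])
detected = 180.0 * reference @ rotation.T + np.array([400.0, 300.0])

# Same face, but with a badly wrong jawline (points 0-16 only).
bad_jaw = detected.copy()
bad_jaw[:17] += rng.normal(0.0, 25.0, size=(17, 2))

core = slice(17, 68)
m_clean = similarity_transform(detected[core], reference[core])
m_bad_jaw = similarity_transform(bad_jaw[core], reference[core])

# For contrast: a fit that (wrongly) includes the jaw points does move.
all_clean = similarity_transform(detected, reference)
all_bad_jaw = similarity_transform(bad_jaw, reference)

print(np.array_equal(m_clean, m_bad_jaw), np.allclose(all_clean, all_bad_jaw))
```

Because the jaw points are never passed to the fit, moving them cannot change the resulting matrix. The second pair of fits only shows what would happen if they were included.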
Re: Is there a guide to the landmarks?
MaxHunter wrote: ↑Sun Oct 15, 2023 12:31 am
I will also mention that some of the more overlooked important landmarks (I know I overlooked their importance) are the inner eye landmarks, represented by numbers 37, 38, 40, 41 and 43, 44, 46, 47. It took me a while to figure out that these landmarks direct the eyes. If you get googly eyes or crossed eyes in your masks, it's my opinion that they are directly related to poor placement of your irises in training. For a couple of years now I thought it had something to do with the chosen architecture of the model, etc. It wasn't until recently, after re-extracting for higher definition training, that I realized how far off my iris landmarks were, causing some of my masks to become cross-eyed or to look in different directions.
Urrrgh, I hate to burst your bubble (and I definitely don't want to discourage you from doing experiments and posting findings here, as often things can be uncovered that way), but.... yeah, no. It doesn't work that way at all.
The landmarks do not feed into training directly at all (except for aforementioned masks + eye/mouth masks). The landmarks for eyes should be placed as displayed in my images, regardless of eye direction. More often than not they are not in the right place (FAN is trained on a terrible, terrible dataset, as I have recently discovered), however the impact this actually has on FS is minimal. The points don't need to be 100% correct, as long as the face is ultimately aligning correctly for training.
Even then, some of our augmentation involves rotating/shifting/zooming the face slightly in order to build a more robust model, so highly precise landmarks will rarely be used in training, other than as a starting point prior to cropping/shifting/zooming the training image.
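As a rough illustration of why pixel-perfect landmarks get washed out, here is a sketch of a random rotate/zoom/shift applied around the centre of an aligned face. The function name and the ranges (10 degrees, 5% zoom, 5% shift) are made-up values for the example, not Faceswap's actual settings.

```python
import numpy as np


def random_warp_matrix(size, rng, max_rotation_deg=10.0, max_zoom=0.05, max_shift=0.05):
    """Build a 2x3 affine matrix: rotate and zoom about the image centre, then shift.

    All ranges are hypothetical illustration values.
    """
    angle = np.deg2rad(rng.uniform(-max_rotation_deg, max_rotation_deg))
    zoom = 1.0 + rng.uniform(-max_zoom, max_zoom)
    shift = rng.uniform(-max_shift, max_shift, size=2) * size
    center = size / 2.0
    cos, sin = zoom * np.cos(angle), zoom * np.sin(angle)
    # p' = A @ (p - c) + c + shift, written out as a single 2x3 matrix.
    return np.array([
        [cos, -sin, center - cos * center + sin * center + shift[0]],
        [sin, cos, center - sin * center - cos * center + shift[1]],
    ])


rng = np.random.default_rng(1)
size = 256
matrix = random_warp_matrix(size, rng)

# See where a perfectly placed point at the face centre ends up.
center_point = np.array([size / 2.0, size / 2.0])
moved = matrix[:, :2] @ center_point + matrix[:, 2]
print(moved - center_point)
```

With these example ranges, a point can move by up to about 13px in a 256px crop on any given training iteration, which is far more than a typical small landmark error.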
Re: Is there a guide to the landmarks?
It honestly depends on your final output. For general purpose models, I will keep all images in as everything can contribute something.
For highly targeted models, then my B choice is entirely dictated by my A dataset. If there are 'extreme lighting' images in A, then I would always be looking to try to include these in B.
That being said, this is just me with my own anecdotal experience. I may not be right.
Re: Is there a guide to the landmarks?
Thanks for setting me straight.
My logic was based on the mask using a mesh, where one pull on the mesh will affect all the other "triangles" within the mesh.