In my converted video I'm noticing 3 key issues:
1- The face is shaky/jittery. I managed to fix this a little with the spatial alignments tool, but it's still a very noticeable problem. Is this caused by too many similar reference images in the training set?
2- The features blur during particular facial expressions. In particular, the bottom lip and mouth blur and fade when the mouth opens wide. I figured this would be due to a lack of data for that expression in one or both sets, but that mouth shape definitely exists in both sets.
3- The inside of the mouth and the tongue are unconvincing. The video is up close, so you can tell when the lips and tongue don't quite make the correct shape for a word. I have lots of up-close talking footage of both actors in the sets; is there anything to keep in mind to improve this?
I'm unsure how to "properly" build a training set. I'm worried I don't have enough images, but also that I have too many of the same angle. How can I tell what might be lacking and what might be too much?