Random thought/question that just popped into my mind:
The usual recommendation is that using high-quality image/video sources for training is important...
... but since models seem to train on small images (many only do 64x64), does that mean the source quality isn't all that important, as long as the face is as detailed as it can be at that size?
To put it another way: if I use a 1080p video of an interview where the faces take up, say, 512x512 pixels in each frame, but the model only trains at 64x64, will there be much of a difference compared with a 100x100 pixel source video of a head close-up where the face takes up, say, 80x80 pixels in the frame?
This all sounds kind of academic, but I guess it might help answer the question, "if I have a really high quality video, should I still keep smaller faces in the training data, as long as they're bigger than 64x64 pixels in the video?"
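
To put rough numbers on the two cases, here is a quick sketch of the arithmetic. It assumes the training pipeline simply resizes each face crop down to the model's input size (64x64 here), which may not match what any particular tool actually does. The function name `downscale_factor` is just something I made up for illustration.

```python
def downscale_factor(face_px: int, model_px: int = 64) -> float:
    """Source pixels per training pixel along one side.

    Assumes the face crop is plainly resized to model_px x model_px;
    real tools may crop, align, or pad differently.
    """
    return face_px / model_px


# The two hypothetical sources from the question above.
cases = [("1080p interview", 512), ("100x100 close-up", 80)]

for label, face_px in cases:
    factor = downscale_factor(face_px)
    # factor ** 2 is how many source pixels are merged into one training pixel.
    print(f"{label}: {face_px}px face, {factor:.2f}x per side, "
          f"{factor ** 2:.2f} source pixels per training pixel")
```

So under that assumption the 512px face is shrunk 8x per side and the 80px face only 1.25x per side. Both still start with more pixels than the model uses, which is really what the question comes down to.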