"High quality sources" vs. "training only uses 64x64 image"

Post by **Replicon** » Sat May 01, 2021 6:21 pm

Random thought/question that just popped into my mind:

The recommendation is always that using high quality images/video sources for training is important...

... but since models seem to use small images (many only doing 64x64), does that mean the source quality isn't all that important, as long as the face is as detailed as it can be at that size?

To put it another way: if I use a 1080p video of an interview where the faces take up, say, 512x512 pixels on the actual video frames, but the training model only does 64x64, will there be much of a difference between that and using a 100x100 pixel source video with a head close-up where the face takes up, say, 80x80 on the video frame?
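To make the arithmetic in that comparison concrete, here is a minimal sketch that works out how much each face crop shrinks on its way to a 64x64 training input. The function names are made up for illustration, and the numbers only describe resizing; they say nothing about noise or compression artifacts in the source.

```python
def downscale_factor(face_px: int, train_px: int = 64) -> float:
    """Linear factor by which a square face crop shrinks to reach the training size."""
    return face_px / train_px


def source_pixels_per_training_pixel(face_px: int, train_px: int = 64) -> float:
    """How many source pixels are averaged into one training pixel (area ratio)."""
    return downscale_factor(face_px, train_px) ** 2


# The two cases from the question: a 512px face in a 1080p interview
# and an 80px face in a 100x100 close-up clip.
for label, face_px in [("1080p interview", 512), ("100x100 close-up", 80)]:
    linear = downscale_factor(face_px)
    area = source_pixels_per_training_pixel(face_px)
    print(f"{label}: {face_px}px face -> 64px "
          f"({linear:.2f}x linear, {area:.2f} source px per training px)")
```

Both crops end up at the same 64x64 size; the 512px face is averaged down 8x per side while the 80px face is only shrunk 1.25x per side, so any difference in the result would come from what was in those source pixels to begin with.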

This all sounds kind of academic, but I guess it might help answer the question, "if I have a really high quality video, should I still keep smaller faces in the training data, as long as they're bigger than 64x64 pixels in the video?"

Post by **bryanlyon** » Wed May 19, 2021 3:27 pm

Quality here is more about noise and variety than about high resolution. You don't want a whole bunch of low-bitrate YouTube videos, since the compression artifacts lead to visible face distortion. You don't have to get everything from 4K Blu-rays, but you want as good a quality as possible in order to minimize the issues you run into.

If you have enough high quality faces I'd ignore the low quality ones altogether, but if you're short on faces, then sure, throw in the low resolution ones.
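That rule of thumb could be sketched roughly as below. This is only an illustration under stated assumptions: the `Face` record, the `high_quality` flag (a manual judgment about noise and artifacts) and the `min_faces` threshold are all hypothetical and are not part of any Faceswap API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Face:
    path: str           # hypothetical path to an extracted face image
    size_px: int        # side length of the face in the source frame
    high_quality: bool  # manual judgment: clean source, no visible artifacts


def select_training_faces(faces: List[Face], min_faces: int) -> List[Face]:
    """Prefer high quality faces; top up with the rest only when short on faces."""
    good = [f for f in faces if f.high_quality]
    if len(good) >= min_faces:
        return good  # enough good faces, ignore the low quality ones
    # Short on faces: add low quality ones, largest first (an assumed ordering).
    rest = sorted((f for f in faces if not f.high_quality),
                  key=lambda f: f.size_px, reverse=True)
    return good + rest[: min_faces - len(good)]


faces = [
    Face("a.png", 512, True),
    Face("b.png", 300, True),
    Face("c.png", 80, False),
    Face("d.png", 120, False),
]
print([f.path for f in select_training_faces(faces, min_faces=2)])
print([f.path for f in select_training_faces(faces, min_faces=3)])
```

How many faces count as "enough" is left open in the thread, so `min_faces` is something you would have to pick for your own data set.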

Faceswap Forum
