Hi, I am using faceswap for my course project, and I would like to understand how the data flows through the training/converting procedure.
Here is the setting. I trained a model on the Trump-Cage dataset following the instructions in USAGE.md, including extracting, training and converting. Now I want to reproduce the convert process in my own python script. I have loaded the encoder and the decoders, and I load a picture (denoted P) of Trump from the original folder (the input of extraction, i.e. before extracting). From my understanding, it should be as simple as feeding P into the encoder and feeding the output into the Cage decoder to mimic the convert process. However, the input image is (256, 256, 3), while the encoder expects an input of size (64, 64, 3). I read the training guide on this forum, and it seems there is a 68.5% coverage crop by default. Even after applying that coverage crop, though, the result is still much larger than (64, 64, 3), so the image needs a further resize. The training log also shows that the training image size is 256. Here are my questions:
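For reference, here is roughly where I am stuck (`encoder` is the encoder model I already loaded, and "P.png" is just a placeholder name for the Trump picture):

```python
import cv2
import numpy as np

# `encoder` is the encoder model I already loaded; "P.png" stands for the
# Trump picture from the original (pre-extraction) folder.
img = cv2.imread("P.png")
print(img.shape)            # (256, 256, 3) for my picture
print(encoder.input_shape)  # (None, 64, 64, 3) -- hence the mismatch

# This is what I would like to run, but the shapes do not match yet:
# latent = encoder.predict(np.expand_dims(img.astype("float32") / 255.0, 0))
```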
- There must be a procedure in your implementation that transforms the raw (256, 256, 3) image, but I have not been able to locate it in this large project. Could you kindly point out where it is? Then I can use it directly for the image processing.
- If for some reason I cannot use that directly, is the following design correct? First, take the 68.5% coverage area; then use any resize method to downscale it to (64, 64, 3) (see the sketch below). If it is correct, where is the coverage area located? Is it centred on the image? If it is wrong, what is the correct procedure to transform the (256, 256, 3) picture into the (64, 64, 3) encoder input?
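To make the second question concrete, here is a rough sketch of what I have in mind. `encoder` and `decoder_cage` stand for the models I already loaded; the centred crop is only my guess at how the 68.5% coverage is applied, and the BGR channel order and [0, 1] scaling are assumptions on my part:

```python
import cv2
import numpy as np

def to_model_input(image, coverage=0.685, size=64):
    """My guess at the preprocessing: a centred coverage crop, then a resize."""
    h, w = image.shape[:2]
    crop_h, crop_w = int(h * coverage), int(w * coverage)
    top, left = (h - crop_h) // 2, (w - crop_w) // 2
    face = image[top:top + crop_h, left:left + crop_w]
    face = cv2.resize(face, (size, size), interpolation=cv2.INTER_AREA)
    return face.astype("float32") / 255.0  # assuming the model expects floats in [0, 1]

img = cv2.imread("P.png")                       # the (256, 256, 3) Trump picture
batch = np.expand_dims(to_model_input(img), 0)  # (1, 64, 64, 3)

# encoder / decoder_cage are the models I already loaded from the trained model
swapped = decoder_cage.predict(encoder.predict(batch))[0]  # (64, 64, 3), presumably in [0, 1]
cv2.imwrite("swapped_face.png", np.clip(swapped * 255.0, 0, 255).astype("uint8"))
```

If the real pipeline uses a different crop position, interpolation, or normalisation, please correct me.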
Thank you very much for your time.