Do I need to adjust batch size when distributed training


misaka17009
Posts: 1
Joined: Fri Feb 18, 2022 5:57 pm

Do I need to adjust batch size when distributed training

Post by misaka17009 »

If I have 4 GPUs, do I need to divide batch_size by 4 to get the same result I would get with 1 GPU?

I noticed that the memory allocated to each GPU in distributed mode is the same as when training in single-GPU mode, so I assumed the effective batch size with 4 GPUs is four times what it is with 1 GPU. That is, if I set the batch size to 16 and train the model with 4 GPUs, my actual batch size would be 16 x 4 = 64.

I just want to confirm whether the script automatically divides the batch size.

bryanlyon
Site Admin
Posts: 793
Joined: Fri Jul 12, 2019 12:49 am
Answers: 44
Location: San Francisco
Has thanked: 4 times
Been thanked: 218 times

Re: Do I need to adjust batch size when distributed training

Post by bryanlyon »

Faceswap does not multiply your batch size. That's up to you. If you set the Batch Size to 16 and have 4 GPUs, it'll split that so that each GPU has a batch size of 4.
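To illustrate the arithmetic, here is a minimal sketch (not Faceswap's actual code) assuming a TensorFlow MirroredStrategy backend, where the batch size you set is treated as the global batch size and divided across the available GPUs. The variable names are illustrative only.

Code: Select all

    # Minimal sketch: global batch size vs. per-GPU batch size under
    # tf.distribute.MirroredStrategy. Assumes TensorFlow is installed
    # and one replica is created per visible GPU.
    import tensorflow as tf

    strategy = tf.distribute.MirroredStrategy()        # e.g. 4 GPUs -> 4 replicas
    num_replicas = strategy.num_replicas_in_sync

    global_batch_size = 16                              # the value you set in the GUI
    per_replica_batch_size = global_batch_size // num_replicas  # 16 / 4 = 4 per GPU

    print(f"Replicas: {num_replicas}, batch per GPU: {per_replica_batch_size}")

So with 4 GPUs and a Batch Size of 16, each GPU processes 4 samples per step; the combined step still covers 16 samples, not 64.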
