I have multiple Windows 10 machines, each with a decent video card. I'd like to have them all working together on the same training session.
ChatGPT recommended a few different options:
TensorFlow's Distributed Training
PyTorch's Distributed Data Parallel (DDP)
Horovod
Ray (RaySGD)
I was hoping to hear whether anyone else has experience doing this, or if there's a recommended workflow. Any advice/tips/tricks would be greatly appreciated.
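For context, here's roughly what I understand the PyTorch DDP route to look like. This is just a sketch with a stand-in model and random data; the one Windows-specific detail I'm aware of is that the NCCL backend is Linux-only, so on Windows 10 you'd use the gloo backend:

```python
# Minimal PyTorch DDP sketch. Model and data are placeholders;
# gloo backend is used because NCCL isn't available on Windows.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # torchrun normally sets these env vars; the defaults below
    # let the script run standalone as a single-process smoke test.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    os.environ.setdefault("RANK", "0")
    os.environ.setdefault("WORLD_SIZE", "1")

    if not dist.is_initialized():
        dist.init_process_group(backend="gloo")  # gloo works on Windows
    rank = dist.get_rank()

    model = torch.nn.Linear(10, 1)   # stand-in model
    ddp_model = DDP(model)           # wraps model; grads are all-reduced
    opt = torch.optim.SGD(ddp_model.parameters(), lr=0.01)

    for step in range(3):            # toy training loop on random data
        x = torch.randn(8, 10)
        y = torch.randn(8, 1)
        loss = torch.nn.functional.mse_loss(ddp_model(x), y)
        opt.zero_grad()
        loss.backward()              # DDP syncs gradients across ranks here
        opt.step()

    print(f"rank {rank} done, last loss {loss.item():.4f}")
    # in a real run you'd call dist.destroy_process_group() at shutdown
    return loss.item()

if __name__ == "__main__":
    main()
```

My understanding is you'd then launch this on each machine with something like `torchrun --nnodes=2 --node_rank=<0 or 1> --master_addr=<rank-0 IP> train.py`, but I haven't actually tried it across boxes yet, so corrections welcome.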