Thought I would follow up with some findings for future reference and anyone reading this post.
Using the original model with a batch size of 64, I was finding the EG/s would slow from 120 to 40. After some advice, and after installing GPU-Z and Afterburner, I noticed my GPU was hitting its thermal limit of 83°C.
GPU-Z would show the below. Note the 'GPU Load' is pretty much always 100%, except for every 100 iterations when faceswap saves the model.

After approx 2 hours, iterations would slow quite significantly and the GPU Load graph in GPU-Z would have lots of peaks and troughs like below.

So... I tried resolving this by addressing any potential cooling issues. I started by setting the GPU fan to 100% (made no difference), then cranked the fans on the r720 up to max, which sounds like a jet taking off but keeps the GPU at a relatively consistent 60°C at 100% load (23°C below its thermal limit of 83°C). Same result: I still get the GPU load issues after about 2 hours.
Below is a snapshot of Afterburner after about 2 hours of use (5 sec snapshot intervals). You can see the max temp of the GPU is 63°C, which I would think is fine. You can also see the temp stays roughly the same, but the GPU load starts peaking and troughing, with increasing frequency over time.

This would suggest to me that thermals are not the cause of this performance degradation. As an experiment, I changed the batch size from 64 to 8, and have now been running for 21 hours with no performance degradation. The graph below shows the GPU load is still constant after 21 hours.
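
For anyone who wants to check the same thing without watching graphs, here is a rough Python sketch of the comparison I was doing by eye: temperature staying flat while GPU load starts swinging. It assumes an NVIDIA card with nvidia-smi on the PATH. The function names, the 12-sample window and the standard deviation limit of 10 are my own illustrative choices, not anything from faceswap, GPU-Z or Afterburner.

```python
import statistics
import subprocess

# nvidia-smi query for temperature (deg C) and GPU utilisation (%),
# printed as one bare CSV row per GPU, e.g. "63, 100".
QUERY = [
    "nvidia-smi",
    "--query-gpu=temperature.gpu,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def parse_sample(line):
    """Turn a CSV row such as '63, 100' into (temp_c, load_pct)."""
    temp, load = (int(field.strip()) for field in line.split(","))
    return temp, load


def read_sample():
    """Take one reading from the first GPU (needs the NVIDIA driver tools)."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_sample(out.splitlines()[0])


def summarise(samples, window=12):
    """Split samples into fixed-size windows and report stats for each one.

    Trailing samples that do not fill a whole window are ignored.
    """
    rows = []
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        temps = [temp for temp, _ in chunk]
        loads = [load for _, load in chunk]
        rows.append({
            "max_temp": max(temps),
            "mean_load": statistics.mean(loads),
            "load_stdev": statistics.pstdev(loads),
        })
    return rows


def unstable_windows(rows, stdev_limit=10.0):
    """Return the indexes of windows where load swings more than stdev_limit."""
    return [i for i, row in enumerate(rows) if row["load_stdev"] > stdev_limit]
```

To use it, call read_sample() in a loop with a 5 second sleep (to match the Afterburner interval), collect the tuples in a list, then pass the list to summarise(). If unstable_windows() starts returning indexes while max_temp stays well under the thermal limit, that is the same pattern I describe above.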

This machine has a 1070Ti 8GB, 24 virtual cores (Xeon E5-2650 v2 @ 2.6GHz) and 32GB RAM. Max memory used is about 12GB.
I don't know if this is a quirk, feature or bug of the original trainer, but I'm not too concerned about it as I think I will switch to a different trainer. I just thought I would share my observations. If anyone knows of something I have done wrong, I would be glad to be made aware.
Thanks