Faceswap Model/Training folder on NVMe SSD - Great performance but with massive writes

Talk about Hardware used for Deep Learning


Post by vichitra5587 »

Just last week I finished building my deep learning rig, featuring a beast of an RTX 3080 Ti with 2 Samsung NVMe SSDs & 1 HDD.

Right now I am testing the two models I am most interested in: Realface & the Disney 256 plugin under Phaze A.
What I have noticed is that with my Model & Training images folders on the SSD, GPU usage is around 90% with Realface & 75% with the Disney model.
The Realface model is set to 128px input & 256px output.

I have to use a batch size of 1 when training the Realface model, as a batch size of 8 results in an OOM error after some time.
At batch size 1, I get around 30k iterations in 1 hour on the Realface model with my Model & Training images folders on the NVMe SSD.
But when I checked the Samsung Magician software, it reported a massive 1 TB of writes in just one day.
As far as I remember, I had used faceswap for around 4-5 hrs that day.

I think these massive writes to the SSD are due to the Save Interval, which was set to 250. Since I was getting 30k iterations/hr at batch size 1,
the model was being saved about 120 times an hour, which would account for the 1 TB written to my SSD in that one day.
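
As a rough sanity check on that theory, the figures above can be turned into an average write size per save. This is only a back-of-the-envelope sketch: it assumes every byte reported by Samsung Magician came from model saves (no OS, swap or other application writes), and it takes 4.5 hrs as the midpoint of the 4-5 hrs mentioned. The function name is made up for illustration.

```python
def estimate_write_per_save(total_written_gb, iterations_per_hour, hours, save_interval):
    """Return (number of saves, average GB written per save)."""
    saves = (iterations_per_hour * hours) / save_interval
    return saves, total_written_gb / saves


saves, gb_per_save = estimate_write_per_save(
    total_written_gb=1000,       # ~1 TB reported by Samsung Magician
    iterations_per_hour=30_000,  # Realface, batch size 1, on NVMe
    hours=4.5,                   # assumption: midpoint of "4-5 hrs"
    save_interval=250,           # Save Interval used that day
)
print(f"{saves:.0f} saves, about {gb_per_save:.2f} GB written per save")
```

That works out to roughly 540 saves at about 1.85 GB each. Comparing that number with the actual size of the model folder would show whether the saves alone can explain the writes.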

Therefore, to preserve my SSD's lifespan, I have moved my Model folder to the HDD & increased the Save Interval to 1000.
But this has reduced my speed, which is now around 20k iterations/hr at batch size 1 on the Realface model.
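
For comparison, here is a small sketch of how the Save Interval alone would change daily writes if the Model folder had stayed on the SSD. It assumes writes scale linearly with the number of saves, and it derives the per-save size from the figures in this post (about 1 TB over roughly 4.5 hrs at interval 250). The interval of 5000 and the function name are just illustrative.

```python
def daily_writes_gb(iterations_per_hour, hours_per_day, save_interval, gb_per_save):
    """Estimated GB written per day if every save writes gb_per_save."""
    saves_per_day = iterations_per_hour * hours_per_day / save_interval
    return saves_per_day * gb_per_save


# Assumption: ~1000 GB written over ~4.5 hrs at 30k it/hr with Save Interval 250.
GB_PER_SAVE = 1000 / (30_000 * 4.5 / 250)

for interval in (250, 1000, 5000):
    gb = daily_writes_gb(30_000, 4.5, interval, GB_PER_SAVE)
    print(f"Save Interval {interval}: ~{gb:.0f} GB/day")
```

Under these assumptions, going from 250 to 1000 would cut the estimate from about 1000 GB to about 250 GB per day without moving anything to the HDD.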

The other difference I have seen is GPU usage, which has gone down to around 75% from the 90% I saw on the NVMe SSD.
It does give me the benefit of lower temperatures & reduced power draw, which will be useful since I am planning to train
for 7-8 hrs/day in sessions of 1 hr.

These are the observations I made using the Open Hardware Monitor software:

On NVMe SSD

GPU - RTX 3080 Ti
Realface model - 128px Input 256px Output - Batch 1
30k iterations/hr
GPU usage - around 90%
GPU power draw - around 280-310 W

GPU - RTX 3080 Ti
Disney 256 model - Defaults - Batch 1
I didn't run it for a full hour, so iterations were not recorded
GPU usage - around 75%
GPU power draw - around 240-270 W


On HDD

GPU - RTX 3080 Ti
Realface model - 128px Input 256px Output - Batch 1
20k iterations/hr
GPU usage - around 75%
GPU power draw - around 240-270 W

GPU - RTX 3080 Ti
Disney 256 model - Defaults - Batch 1
I didn't run it for a full hour, so iterations were not recorded
GPU usage - around 50%
GPU power draw - around 190-220 W

I have only moved my Model folder to the HDD; the Training images folder still remains on the NVMe SSD.


I ran the same benchmarks again to collect some more data, such as GPU temperature:

On NVMe SSD

GPU - Asus TUF RTX 3080 Ti 12GB OC
Realface model - 128px Input 256px Output - Batch 1
30k iterations/hr
GPU usage - around 90%
GPU power draw - around 280-310 W
GPU temperature - 67-69°C

GPU - Asus TUF RTX 3080 Ti 12GB OC
Disney 256 model - Defaults - Batch 1
I didn't run it for a full hour, so iterations were not recorded
GPU usage - around 75%
GPU power draw - around 240-270 W
GPU temperature - 60-62°C


On HDD

GPU - Asus TUF RTX 3080 Ti 12GB OC
Realface model - 128px Input 256px Output - Batch 1
20k iterations/hr
GPU usage - around 75%
GPU power draw - around 240-270 W
GPU temperature - 62-64°C

GPU - Asus TUF RTX 3080 Ti 12GB OC
Disney 256 model - Defaults - Batch 1
I didn't run it for a full hour, so iterations were not recorded
GPU usage - around 40-50%
GPU power draw - around 100-170 W
GPU temperature - 42-46°C
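
The Realface figures above can also be expressed as GPU energy per 1,000 iterations, which shows whether the lower power draw on the HDD actually saves energy for the same amount of training. This sketch assumes the midpoint of each reported power range is the average draw, and it counts GPU power only (not CPU, drives or the rest of the system). The function name is made up for illustration.

```python
def wh_per_1k_iterations(watt_low, watt_high, iterations_per_hour):
    """GPU watt-hours consumed per 1,000 training iterations."""
    avg_watts = (watt_low + watt_high) / 2  # assumption: midpoint = average draw
    return avg_watts / (iterations_per_hour / 1000)


nvme = wh_per_1k_iterations(280, 310, 30_000)  # Realface, Model folder on NVMe
hdd = wh_per_1k_iterations(240, 270, 20_000)   # Realface, Model folder on HDD

print(f"NVMe: {nvme:.2f} Wh per 1k iterations")
print(f"HDD:  {hdd:.2f} Wh per 1k iterations")
print(f"HDD uses {100 * (hdd / nvme - 1):.0f}% more GPU energy per iteration")
```

With these numbers the HDD setup draws less power at any moment, but it needs about 30% more GPU energy to complete the same number of iterations, because each iteration takes longer.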



Re: Faceswap Model/Training folder on NVMe SSD - Great performance but with massive writes

Post by Boogie »

How much disk space is required? If it's not too much, you could use a ramdisk.
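
To answer the disk space question, something like the following could measure the model folder and check it against a planned ramdisk size. It is only a sketch: `MODEL_DIR` is a placeholder path, the 8 GB ramdisk size is an example, and the `headroom` factor of 2 is an assumption to leave room for any extra files written during a save.

```python
import os


def folder_size_bytes(path):
    """Sum the sizes of all regular files under path (symlinks skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def fits_in_ramdisk(path, ramdisk_gb, headroom=2.0):
    """True if the folder, multiplied by headroom, fits in ramdisk_gb."""
    needed = folder_size_bytes(path) * headroom
    return needed <= ramdisk_gb * 1024**3


MODEL_DIR = "path/to/model_folder"  # placeholder, replace with the real folder

if os.path.isdir(MODEL_DIR):
    size_gb = folder_size_bytes(MODEL_DIR) / 1024**3
    print(f"Model folder: {size_gb:.2f} GB")
    print("Fits in an 8 GB ramdisk:", fits_in_ramdisk(MODEL_DIR, ramdisk_gb=8))
```

Keep in mind that a ramdisk loses its contents on power-off or a crash, so the model folder would need to be copied back to permanent storage after each session.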

