Search found 40 matches

by dheinz70
Tue Apr 13, 2021 1:21 am
Forum: Training Support
Topic: an illegal memory access was encountered
Replies: 0
Views: 4

an illegal memory access was encountered

I've been getting this error a lot. The model runs for a while and then spits this out. AMD 3800x, 32 GB RAM, 2060 Super. Any ideas? This same hardware setup used to run like a top on the same (or heavier) models. No crashlog is created. 2021-04-12 04:43:11.527466: F ./tensorflow/core/kernels/conv_2d_gp...
by dheinz70
Tue Mar 16, 2021 3:31 pm
Forum: Training Support
Topic: GPU Training on GeForce GTX 1660 Ti under linux Ubuntu 20.04.2 LTS
Replies: 2
Views: 281

Re: GPU Training on GeForce GTX 1660 Ti under linux Ubuntu 20.04.2 LTS

Make sure you are using the proprietary Nvidia driver on Linux. A default install uses the open source display drivers. Click the "Additional Drivers" icon and check from there. Also, the 450 Nvidia driver works better than the 460 one. Just a guess.

by dheinz70
Tue Dec 29, 2020 1:37 am
Forum: Training Support
Topic: Distributed with Dual 2060 supers
Replies: 46
Views: 3425

Re: Distributed with Dual 2060 supers

My 2 cents on using two GPUs is that you really, really need a high end motherboard. I upgraded to a Ryzen 7 and a x570 mobo and dual GPUs are still flaky. I'm using Linux and I get maybe about a 20% increase in EG/s using dual 2060 supers over just using one. SO... 2 GPUs with batch of 16 only runs...
by dheinz70
Sat Dec 19, 2020 2:24 am
Forum: Hardware
Topic: MULTI GPU - double speed at same batch size or not?
Replies: 10
Views: 1002

Re: MULTI GPU - double speed at same batch size or not?

Two similarly capable cards should work. The rate-determining step will be the slowest card. Dunno if the 30x cards are supported yet.

by dheinz70
Wed Dec 02, 2020 1:08 am
Forum: Convert Discussion
Topic: Converted faces are blurry
Replies: 16
Views: 3762

Re: DLight model still blurry after 400k iterations

In my experience, DLight gives good A quality and totally washed-out B quality. Dunno if that helps you. 192 SAE and Villain give me the best results.

by dheinz70
Sat Nov 21, 2020 3:27 am
Forum: Hardware
Topic: What Do you think of this MB
Replies: 4
Views: 682

Re: What Do you think of this MB

My feeling is I'm seeing a Linux driver issue or a TensorFlow issue. I installed Windows on the new computer and tried to run the same test. Single GPU performance was about 10-20% worse under Windows. I could not get distributed to work at all. It kept throwing TensorFlow illegal memory errors. I als...
by dheinz70
Thu Nov 19, 2020 12:39 am
Forum: Hardware
Topic: What Do you think of this MB
Replies: 4
Views: 682

Re: What Do you think of this MB

Alright.... The new setup is built. Ubuntu 20.10, Ryzen 3800x, 32gb ram, and the MPG x570 Plus mb. Interesting results on my first tests. Tests with one gpu (did both GPU0 and GPU1) give me 20-21 EGs/sec with a batch of 7. Tests with distributed, batch of 14 (2x7) give me 24 EGs/sec. Only a slight g...
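
To put a number on that gain, here is a rough sketch of the speedup and scaling efficiency from the figures quoted above. It assumes 20.5 EGs/sec for the single card (the midpoint of the 20-21 range, my assumption) and 24 EGs/sec for distributed; the function name is just for illustration.

```python
def scaling(single_egs, multi_egs, n_gpus):
    """Return (speedup, efficiency) of a multi-GPU run versus one GPU.

    speedup    = multi-GPU throughput / single-GPU throughput
    efficiency = speedup / number of GPUs (1.0 would be perfect scaling)
    """
    speedup = multi_egs / single_egs
    efficiency = speedup / n_gpus
    return speedup, efficiency


# 20.5 is an assumed midpoint of the reported 20-21 EGs/sec single-GPU range.
speedup, efficiency = scaling(single_egs=20.5, multi_egs=24.0, n_gpus=2)
print(f"speedup: {speedup:.2f}x, efficiency: {efficiency:.0%}")
```

With those inputs the second card adds roughly 17% throughput, so each card is doing a bit under 60% of what it manages alone.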
by dheinz70
Sat Nov 07, 2020 12:29 am
Forum: Hardware
Topic: What Do you think of this MB
Replies: 4
Views: 682

What Do you think of this MB

https://www.newegg.com/msi-mpg-x570-gam ... 6813144262

Another question: when using both PCIe slots, do all MBs switch them to 8x8, or do some keep both slots at 16x?

by dheinz70
Wed Nov 04, 2020 10:15 pm
Forum: General Discussion
Topic: Save As bug
Replies: 3
Views: 483

Save As bug

When I save a new project it adds the extension twice.

"realfacetest.fsw .FSW" is what it named the file; I only entered "realfacetest" in the name window.

by dheinz70
Sat Oct 31, 2020 7:45 pm
Forum: Training Support
Topic: Log and graph weirdness
Replies: 14
Views: 877

Re: Log and graph weirdness

The two bugs I've seen:

Changing the smoothing from 0.9 causes the stats to crash

It shows more iterations than the session has done. Hope that helps.

by dheinz70
Mon Oct 19, 2020 11:51 pm
Forum: Training Support
Topic: Distributed with Dual 2060 supers
Replies: 46
Views: 3425

Re: Distributed with Dual 2060 supers

The fact it drops down to 8x8 tells me it is probably mostly hardware. Just thought it was weird that running two cards is almost exactly half as productive.

Screenshot from 2020-10-19 18-49-07.png
by dheinz70
Mon Oct 19, 2020 8:54 pm
Forum: Training Support
Topic: Distributed with Dual 2060 supers
Replies: 46
Views: 3425

Re: Distributed with Dual 2060 supers

Another test with the same results. 2000 iterations on Original, single batch 150, distributed batch 300. Distributed is almost exactly half as efficient. Screenshot from 2020-10-19 15-49-51.png Watching nvtop, the stats never pegged and held there. They averaged about 70% of what the gpus could hand...
by dheinz70
Sun Oct 18, 2020 6:34 pm
Forum: Training Support
Topic: Distributed with Dual 2060 supers
Replies: 46
Views: 3425

Re: Distributed with Dual 2060 supers

2000 iterations of a batch of 2 each on single and distributed. I doubt it was ever enough data to clog the pipes. My feeling is there is some hardware limitation, but I suspect there is something else going on too.

Screenshot from 2020-10-18 13-28-38.png
by dheinz70
Sun Oct 18, 2020 5:41 pm
Forum: Training Support
Topic: Distributed with Dual 2060 supers
Replies: 46
Views: 3425

Re: Distributed with Dual 2060 supers

I added the coolbits to my xorg.conf, so I could control the fans on the cards. I cranked them all to 100% and the cards ran at 55C. Same slow results on distributed. The nvidia control panel lists 93C as the slowdown temp. Batch of 12 gave me 23 EG/s. Screenshot from 2020-10-18 12-41-24.png I'm goi...
by dheinz70
Sun Oct 18, 2020 3:00 am
Forum: Training Support
Topic: Distributed with Dual 2060 supers
Replies: 46
Views: 3425

Re: Distributed with Dual 2060 supers

Did a couple thousand iterations to test. Screenshot from 2020-10-17 21-54-05.png A single gpu batch of 10, dual with a batch of 16. I'm getting better EGs/sec from the single gpu. It took 4 mins to start on distributed, 1.5 mins to start on single gpu. I watched nvtop and I never saw the rx/tx get peg...
by dheinz70
Sun Oct 18, 2020 2:09 am
Forum: Training Support
Topic: Distributed with Dual 2060 supers
Replies: 46
Views: 3425

Re: Distributed with Dual 2060 supers

Yes, I do lower to 80% of what one can handle. It isn't just a startup issue. Depending on the model it can take 5-10 mins to start. It just runs really slowly once it starts. In terms of EG/s I'm getting better performance from 1 gpu doing 1/2 as many at a time. I'm watching nvtop with the single 2...
by dheinz70
Sun Oct 18, 2020 1:44 am
Forum: Training Support
Topic: Distributed with Dual 2060 supers
Replies: 46
Views: 3425

Re: Distributed with Dual 2060 supers

PCIe 2.0 8x should be somewhere near 4 GB/sec. Faceswap uses that much?
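
For reference, that figure is gigabytes per second, per direction, and it falls out of the per-lane signalling rate and the line encoding. A rough sketch of the arithmetic is below; the table and function name are mine for illustration, and the result is a theoretical ceiling before protocol overhead, not what Faceswap actually moves.

```python
# Per-lane signalling rate in GT/s and line-encoding efficiency for each
# PCIe generation (8b/10b for gen 1-2, 128b/130b for gen 3-4).
PCIE_GENERATIONS = {
    1: (2.5, 8 / 10),
    2: (5.0, 8 / 10),
    3: (8.0, 128 / 130),
    4: (16.0, 128 / 130),
}


def pcie_bandwidth_gbytes(gen, lanes):
    """Theoretical one-direction bandwidth in GB/s for a PCIe link."""
    rate_gt, encoding = PCIE_GENERATIONS[gen]
    gbits_per_sec = rate_gt * encoding * lanes  # usable gigabits per second
    return gbits_per_sec / 8                    # 8 bits per byte


for gen, lanes in [(2, 8), (3, 8), (3, 16)]:
    print(f"PCIe {gen}.0 x{lanes}: {pcie_bandwidth_gbytes(gen, lanes):.2f} GB/s")
```

So a gen 3 slot that drops from x16 to x8 still has about twice the ceiling of a PCIe 2.0 x8 link.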

by dheinz70
Sat Oct 17, 2020 10:42 pm
Forum: Training Support
Topic: Distributed with Dual 2060 supers
Replies: 46
Views: 3425

Re: Distributed with Dual 2060 supers

The MSI site says 2x16. Other sites show 1x16 and 1x8, which would explain the drop down to 1x8. Well, gonna take out one of the cards and see if the single runs at 16x.

-edit-
Yep, one card shows 16x.

Screenshot from 2020-10-17 17-54-10.png

Time to start saving up for a Ryzen.....

by dheinz70
Sat Oct 17, 2020 9:30 pm
Forum: Training Support
Topic: Distributed with Dual 2060 supers
Replies: 46
Views: 3425

Re: Distributed with Dual 2060 supers

Hmmm, you might be on to something. This is showing only 8x PCIe when they are both in use. Specs on my MB show 2x 16...

Screenshot from 2020-10-17 16-22-51.png

Now with just GPU1 doing the work....

Screenshot from 2020-10-17 16-27-45.png

Still 8x
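
If you would rather not read the link width off a screenshot, the same information can be pulled from nvidia-smi. This is a minimal sketch, assuming the proprietary driver is installed and nvidia-smi is on the PATH. The function names are mine, and the sample text is a made-up illustration of the output format, not output from this machine. The reported link generation can also drop while the card is idle, so it is best checked while training is running.

```python
import subprocess

# Query fields understood by `nvidia-smi --query-gpu=...`.
QUERY_FIELDS = (
    "index,name,pcie.link.gen.current,"
    "pcie.link.width.current,pcie.link.width.max"
)


def parse_link_report(csv_text):
    """Parse `--format=csv,noheader` output into one dict per GPU.

    Assumes five comma-separated fields per line and numeric link values.
    """
    gpus = []
    for line in csv_text.strip().splitlines():
        index, name, gen, width, max_width = [p.strip() for p in line.split(",")]
        gpus.append({
            "index": int(index),
            "name": name,
            "gen": int(gen),
            "width": int(width),
            "max_width": int(max_width),
        })
    return gpus


def query_links():
    """Run nvidia-smi and return the parsed report, or [] if it is unusable."""
    try:
        result = subprocess.run(
            ["nvidia-smi", f"--query-gpu={QUERY_FIELDS}", "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        )
        return parse_link_report(result.stdout)
    except (FileNotFoundError, subprocess.CalledProcessError, ValueError):
        return []


# Hypothetical sample (invented values) showing two cards negotiated at x8.
SAMPLE = (
    "0, GeForce RTX 2060 SUPER, 3, 8, 16\n"
    "1, GeForce RTX 2060 SUPER, 3, 8, 16\n"
)

for gpu in parse_link_report(SAMPLE):
    note = "reduced" if gpu["width"] < gpu["max_width"] else "full width"
    print(f"GPU{gpu['index']} {gpu['name']}: gen {gpu['gen']} x{gpu['width']} ({note})")
```

On a real machine, call query_links() in place of parsing SAMPLE; it returns an empty list if nvidia-smi is missing or reports non-numeric values.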