Error during training: an illegal instruction was encountered

Posted: Fri Dec 30, 2022 10:29 am
by SPMSBH

I started a few days ago and have been experimenting with the different models.

Previously I did around 100k iterations with the Original model and 180k iterations with Dfaker models (until NaNs kept being encountered).

So I tidied up the training faces a bit and started a new Dfaker model. But after around 10 minutes (about 1,000 iterations) of training, I keep getting this error and training stops:

F .\tensorflow/core/kernels/reduction_gpu_kernels.cu.h:883] Non-OK-status: GpuLaunchKernel(ColumnReduceSimpleKernel<IN_T, OUT_T, Op>, num_blocks, threads_per_block, 0, cu_stream, in, out, extent_x, extent_y, extent_z, op) status: INTERNAL: an illegal instruction was encountered

I'm not sure what's causing this; everything seemed fine with the last model I trained. Please let me know if you need more info.

Here are the training options:
Trainer: Dfaker
Batch Size: 8
Loss function: ssim, mae 25%, lpips_alex 50%, ffl 100%
Eye multiplier: 2
Mouth multiplier: 1
Penalized mask loss: true
Mask type: Bisenet-Fp Face (all faces for training were extracted with Bisenet mask)
Other settings are default
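
For anyone reading the loss settings above, here is a rough sketch of how I understand the percentages: each secondary loss is scaled by its weight and added to the primary loss (ssim). This is only an assumption about the arithmetic, not Faceswap's actual code; the function name and the loss values are made up for illustration.

```python
def combine_losses(primary, secondary):
    """Return primary + sum(value * weight_pct / 100) over the secondary losses.

    Hypothetical helper: assumes the percentages act as plain multipliers.
    """
    total = primary
    for value, weight_pct in secondary:
        total += value * weight_pct / 100.0
    return total


# Made-up per-batch loss values, for illustration only.
total = combine_losses(
    primary=0.30,           # ssim (primary loss)
    secondary=[
        (0.10, 25),         # mae at 25%
        (0.20, 50),         # lpips_alex at 50%
        (0.40, 100),        # ffl at 100%
    ],
)
print(total)
```

If that reading is right, a NaN in any one of the four terms would make the whole total NaN.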

Here are the hardware specs in case you need them:
Processor: Core i7-10750H 2.6GHz
RAM: 16 GB
GPU: RTX2060 (Driver ver. 527.56)
Windows 11 Home


Re: Error during training: an illegal instruction was encountered

Posted: Sat Dec 31, 2022 2:46 pm
by torzdf

Do you get a crash report? If so, please provide it as per: app.php/rules#rule-4a