Error during training: an illegal instruction was encountered

If training is failing to start, and you are not receiving an error message telling you what to do, tell us about it here


Locked
SPMSBH
Posts: 1
Joined: Tue Dec 20, 2022 7:48 am

Error during training: an illegal instruction was encountered

Post by SPMSBH »

I started a few days ago and have been experimenting with the different models.

Previously I ran around 100k iterations of the Original model and 180k iterations of a Dfaker model (until NaNs kept being encountered).

So I tidied up the training faces a bit and started a new Dfaker model. But after around 10 minutes (about 1000 iterations) of training I keep getting this error and the training stops:

F .\tensorflow/core/kernels/reduction_gpu_kernels.cu.h:883] Non-OK-status: GpuLaunchKernel(ColumnReduceSimpleKernel<IN_T, OUT_T, Op>, num_blocks, threads_per_block, 0, cu_stream, in, out, extent_x, extent_y, extent_z, op) status: INTERNAL: an illegal instruction was encountered

I'm not sure what's causing this. Everything seemed fine with the previous model I trained. Please let me know if you need more info.

Here are the training options:
Trainer: Dfaker
Batch Size: 8
Loss function: ssim, mae 25%, lpips_alex 50%, ffl 100%
Eye multiplier: 2
Mouth multiplier: 1
Penalized mask loss: true
Mask type: Bisenet-Fp Face (all faces for training were extracted with Bisenet mask)
Other settings are default
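
For readers unfamiliar with how a loss setup like the one above fits together, here is a minimal sketch. It assumes ssim is the primary loss at full weight and that the percentages act as plain multipliers on the additional loss terms before everything is summed. The function name, the dictionary, and the example values are hypothetical and are not taken from Faceswap's code.

```python
# Hypothetical illustration only: this is not Faceswap's actual implementation.
from typing import Dict

# Weights taken from the options listed in the post.
# Assumption: ssim is the primary loss (weight 1.0); the others are scaled
# by their stated percentages.
LOSS_WEIGHTS: Dict[str, float] = {
    "ssim": 1.00,
    "mae": 0.25,
    "lpips_alex": 0.50,
    "ffl": 1.00,
}


def combine_losses(values: Dict[str, float],
                   weights: Dict[str, float] = LOSS_WEIGHTS) -> float:
    """Return the weighted sum of the per-term loss values."""
    missing = set(weights) - set(values)
    if missing:
        raise KeyError(f"missing loss values: {sorted(missing)}")
    return sum(weights[name] * values[name] for name in weights)


if __name__ == "__main__":
    # Made-up per-term loss values for a single batch.
    example = {"ssim": 0.20, "mae": 0.08, "lpips_alex": 0.30, "ffl": 0.05}
    print(round(combine_losses(example), 4))
```

Under these assumptions, a NaN in any single term makes the combined loss NaN as well, which is one way stacking several loss functions can turn one unstable term into a failed run.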

Here are the hardware specs in case you need them:
Processor: Core i7-10750H 2.6 GHz
RAM: 16 GB
GPU: RTX 2060 (Driver ver. 527.56)
Windows 11 Home

torzdf
Posts: 2649
Joined: Fri Jul 12, 2019 12:53 am
Answers: 159
Has thanked: 128 times
Been thanked: 622 times

Re: Error during training: an illegal instruction was encountered

Post by torzdf »

Do you get a crash report? If so, please provide it as per: app.php/rules#rule-4a

