I can't use my GPU in model training

qsbird
Posts: 2
Joined: Wed Nov 25, 2020 12:20 pm

During model training, the CPU load is very high and the GPU load stays at 0. I have included the system information output below. (cuDNN is installed correctly but is not detected; I don't know whether that matters.) I would appreciate any help.

The "Output System Information" result follows:

============ System Information ============
encoding:            cp936
git_branch:          master
git_commits:         c24bf2b GUI - Revert Conda default font fix
gpu_cuda:            10.2
gpu_cudnn:           No global version found. Check Conda packages for Conda cuDNN
gpu_devices:         GPU_0: GeForce GTX 1060
gpu_devices_active:  GPU_0
gpu_driver:          457.30
gpu_vram:            GPU_0: 6144MB
os_machine:          AMD64
os_platform:         Windows-10-10.0.18362-SP0
os_release:          10
py_command:          D:\Software\faceswap/faceswap.py gui
py_conda_version:    conda 4.9.2
py_implementation:   CPython
py_version:          3.8.5
py_virtual_env:      True
sys_cores:           8
sys_processor:       Intel64 Family 6 Model 158 Stepping 10, GenuineIntel
sys_ram:             Total: 16258MB, Available: 10086MB, Used: 6171MB, Free: 10086MB

=============== Pip Packages ===============
absl-py==0.11.0
astunparse==1.6.3
cachetools==4.1.1
certifi==2020.11.8
cycler==0.10.0
fastcluster==1.1.26
ffmpy==0.2.3
gast==0.3.3
google-auth==1.23.0
google-auth-oauthlib==0.4.2
google-pasta==0.2.0
grpcio==1.33.2
h5py==2.10.0
imageio @ file:///tmp/build/80754af9/imageio_1594161405741/work
imageio-ffmpeg @ file:///home/conda/feedstock_root/build_artifacts/imageio-ffmpeg_1589202782679/work
joblib @ file:///tmp/build/80754af9/joblib_1601912903842/work
Keras-Preprocessing==1.1.2
kiwisolver @ file:///C:/ci/kiwisolver_1604014703538/work
Markdown==3.3.3
matplotlib @ file:///C:/ci/matplotlib-base_1592837548929/work
mkl-fft==1.2.0
mkl-random==1.1.1
mkl-service==2.3.0
numpy==1.18.5
nvidia-ml-py3 @ git+https://github.com/deepfakes/nvidia-ml-py3.git@6fc29ac84b32bad877f078cb4a777c1548a00bf6
oauthlib==3.1.0
olefile==0.46
opencv-python==4.4.0.46
opt-einsum==3.3.0
pathlib==1.0.1
Pillow @ file:///C:/ci/pillow_1603823068645/work
protobuf==3.14.0
psutil @ file:///C:/ci/psutil_1598370330503/work
pyasn1==0.4.8
pyasn1-modules==0.2.8
pyparsing==2.4.7
python-dateutil==2.8.1
pywin32==227
requests-oauthlib==1.3.0
rsa==4.6
scikit-learn @ file:///C:/ci/scikit-learn_1598377018496/work
scipy @ file:///C:/ci/scipy_1604596260408/work
sip==4.19.13
six @ file:///C:/ci/six_1605187374963/work
tensorboard==2.2.2
tensorboard-plugin-wit==1.7.0
tensorflow-gpu==2.2.1
tensorflow-gpu-estimator==2.2.0
termcolor==1.1.0
threadpoolctl @ file:///tmp/tmp9twdgx9k/threadpoolctl-2.1.0-py3-none-any.whl
tornado==6.0.4
tqdm @ file:///tmp/build/80754af9/tqdm_1605303662894/work
Werkzeug==1.0.1
wincertstore==0.2
wrapt==1.12.1

============== Conda Packages ==============
# packages in environment at C:\Users\QS\MiniConda3\envs\faceswap:
#
# Name                    Version                   Build  Channel
blas                      1.0                         mkl  
ca-certificates 2020.10.14 0
certifi 2020.11.8 py38haa95532_0
cycler 0.10.0 py38_0
fastcluster 1.1.26 py38h251f6bf_2 conda-forge
ffmpeg 4.3.1 ha925a31_0 conda-forge
freetype 2.10.4 hd328e21_0
git 2.23.0 h6bb4b03_0
icc_rt 2019.0.0 h0cc432a_1
icu 58.2 ha925a31_3
imageio 2.9.0 py_0
imageio-ffmpeg 0.4.2 py_0 conda-forge
intel-openmp 2020.2 254
joblib 0.17.0 py_0
jpeg 9b hb83a4c4_2
kiwisolver 1.3.0 py38hd77b12b_0
libpng 1.6.37 h2a8f88b_0
libtiff 4.1.0 h56a325e_1
lz4-c 1.9.2 hf4a77e7_3
matplotlib 3.2.2 0
matplotlib-base 3.2.2 py38h64f37c6_0
mkl 2020.2 256
mkl-service 2.3.0 py38h2bbff1b_0
mkl_fft 1.2.0 py38h45dec08_0
mkl_random 1.1.1 py38h47e9c7a_0
numpy 1.19.2 py38hadc3359_0
numpy-base 1.19.2 py38ha3acd2a_0
nvidia-ml-py3 7.352.1 pypi_0 pypi
olefile 0.46 py_0
openssl 1.1.1h he774522_0
pathlib 1.0.1 py_1
pillow 8.0.1 py38h4fa10fc_0
pip 20.2.4 py38haa95532_0
psutil 5.7.2 py38he774522_0
pyparsing 2.4.7 py_0
pyqt 5.9.2 py38ha925a31_4
python 3.8.5 h5fd99cc_1
python-dateutil 2.8.1 py_0
python_abi 3.8 1_cp38 conda-forge
pywin32 227 py38he774522_1
qt 5.9.7 vc14h73c81de_0
scikit-learn 0.23.2 py38h47e9c7a_0
scipy 1.5.2 py38h14eb087_0
setuptools 50.3.1 py38haa95532_1
sip 4.19.13 py38ha925a31_0
six 1.15.0 py38haa95532_0
sqlite 3.33.0 h2a8f88b_0
threadpoolctl 2.1.0 pyh5ca1d4c_0
tk 8.6.10 he774522_0
tornado 6.0.4 py38he774522_1
tqdm 4.51.0 pyhd3eb1b0_0
vc 14.1 h0510ff6_4
vs2015_runtime 14.16.27012 hf0eaf9b_3
wheel 0.35.1 pyhd3eb1b0_0
wincertstore 0.2 py38_0
xz 5.2.5 h62dcd97_0
zlib 1.2.11 h62dcd97_4
zstd 1.4.5 h04227a9_0

================= Configs ==================
--------- .faceswap ---------
backend: nvidia

--------- convert.ini ---------

[color.color_transfer]
clip: True
preserve_paper: True

[color.manual_balance]
colorspace: HSV
balance_1: 0.0
balance_2: 0.0
balance_3: 0.0
contrast: 0.0
brightness: 0.0

[color.match_hist]
threshold: 99.0

[mask.box_blend]
type: gaussian
distance: 11.0
radius: 5.0
passes: 1

[mask.mask_blend]
type: normalized
kernel_size: 3
passes: 4
threshold: 4
erosion: 0.0

[scaling.sharpen]
method: none
amount: 150
radius: 0.3
threshold: 5.0

[writer.ffmpeg]
container: mp4
codec: libx264
crf: 23
preset: medium
tune: none
profile: auto
level: auto
skip_mux: False

[writer.gif]
fps: 25
loop: 0
palettesize: 256
subrectangles: False

[writer.opencv]
format: png
draw_transparent: False
jpg_quality: 75
png_compress_level: 3

[writer.pillow]
format: png
draw_transparent: False
optimize: False
gif_interlace: True
jpg_quality: 75
png_compress_level: 3
tif_compression: tiff_deflate

--------- extract.ini ---------

[global]
allow_growth: False

[align.fan]
batch-size: 12

[detect.cv2_dnn]
confidence: 50

[detect.mtcnn]
minsize: 20
threshold_1: 0.6
threshold_2: 0.7
threshold_3: 0.7
scalefactor: 0.709
batch-size: 8

[detect.s3fd]
confidence: 70
batch-size: 4

[mask.unet_dfl]
batch-size: 8

[mask.vgg_clear]
batch-size: 6

[mask.vgg_obstructed]
batch-size: 2

--------- gui.ini ---------

[global]
fullscreen: False
tab: extract
options_panel_width: 30
console_panel_height: 20
icon_size: 14
font: default
font_size: 9
autosave_last_session: prompt
timeout: 120
auto_load_model_stats: True

--------- train.ini ---------

[global]
coverage: 68.75
icnr_init: False
conv_aware_init: False
optimizer: adam
learning_rate: 5e-05
reflect_padding: False
allow_growth: False
mixed_precision: False
convert_batchsize: 16

[global.loss]
loss_function: ssim
mask_loss_function: mse
l2_reg_term: 100
eye_multiplier: 3
mouth_multiplier: 2
penalized_mask_loss: True
mask_type: extended
mask_blur_kernel: 3
mask_threshold: 4
learn_mask: False

[model.dfl_h128]
lowmem: False

[model.dfl_sae]
input_size: 128
clipnorm: True
architecture: df
autoencoder_dims: 0
encoder_dims: 42
decoder_dims: 21
multiscale_decoder: False

[model.dlight]
features: best
details: good
output_size: 256

[model.original]
lowmem: False

[model.realface]
input_size: 64
output_size: 128
dense_nodes: 1536
complexity_encoder: 128
complexity_decoder: 512

[model.unbalanced]
input_size: 128
lowmem: False
clipnorm: True
nodes: 1024
complexity_encoder: 128
complexity_decoder_a: 384
complexity_decoder_b: 512

[model.villain]
lowmem: False

[trainer.original]
preview_images: 14
zoom_amount: 5
rotation_range: 10
shift_range: 5
flip_chance: 50
disable_warp: False
color_lightness: 30
color_ab: 8
color_clahe_chance: 50
color_clahe_max_size: 4
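
A symptom like this (high CPU load, GPU load at 0) can usually be confirmed from inside the faceswap environment by asking TensorFlow which GPUs it can see. The snippet below is only a sketch and not part of faceswap: `visible_gpus` is a hypothetical helper name, and it assumes a TensorFlow 2.x install where `tf.config.list_physical_devices` is available.

```python
def visible_gpus():
    """Return the names of the GPUs TensorFlow can see.

    Returns None when TensorFlow cannot be imported at all, so the caller
    can tell "no TensorFlow" apart from "TensorFlow, but no GPU".
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    gpus = visible_gpus()
    if gpus is None:
        print("TensorFlow is not installed in this environment.")
    elif not gpus:
        # An empty list means TensorFlow will run everything on the CPU.
        print("TensorFlow found no GPU; training would run on the CPU.")
    else:
        print("TensorFlow can see:", ", ".join(gpus))
```

An empty list here would match the behaviour described above: the driver reports the card, but TensorFlow cannot load its CUDA/cuDNN libraries and falls back to the CPU.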
qsbird
Posts: 2
Joined: Wed Nov 25, 2020 12:20 pm

Solved.
The problem was my TensorFlow install: version 2.3 only supports CUDA 10.1 and cuDNN 7.6. I reinstalled CUDA and cuDNN to fix it.
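
For anyone hitting the same thing, the mismatch can be spotted by comparing the versions in the system information against what the TensorFlow release expects. The following is a minimal sketch, not faceswap code: `REQUIRED` and `find_mismatches` are hypothetical names, the 2.3 row is the requirement quoted in this post, and the 2.2 row is an added assumption based on TensorFlow's published tested build configurations.

```python
# Requirement for 2.3 is the one quoted in the post; the 2.2 row is an
# assumption taken from TensorFlow's tested build configurations.
REQUIRED = {
    "2.2": {"cuda": "10.1", "cudnn": "7.6"},
    "2.3": {"cuda": "10.1", "cudnn": "7.6"},
}


def major_minor(version):
    """Reduce '2.2.1' to '2.2' so patch releases compare equal."""
    return ".".join(version.split(".")[:2])


def find_mismatches(tf_version, cuda_version, cudnn_version):
    """Return a list of problems; an empty list means the versions line up."""
    wanted = REQUIRED.get(major_minor(tf_version))
    if wanted is None:
        return [f"no requirement recorded for TensorFlow {tf_version}"]
    problems = []
    for name, found in (("cuda", cuda_version), ("cudnn", cudnn_version)):
        if found is None:
            problems.append(f"{name} not detected, need {wanted[name]}")
        elif major_minor(found) != wanted[name]:
            problems.append(f"{name} {found} found, need {wanted[name]}")
    return problems


# Values reported in the system information earlier in this thread:
# tensorflow-gpu 2.2.1, gpu_cuda 10.2, no global cuDNN found.
for problem in find_mismatches("2.2.1", "10.2", None):
    print(problem)
```

With the values from the first post, both the CUDA version and the missing cuDNN are flagged.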

torzdf
Posts: 2682
Joined: Fri Jul 12, 2019 12:53 am
Answers: 159
Has thanked: 133 times
Been thanked: 626 times

Glad you got it solved.

If you used the installer, we recommend not installing CUDA/cuDNN system-wide at all, as it interferes with the local install we provide.
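
One way to see whether the local install is in place is to look for the CUDA and cuDNN packages in the `conda list` output for the faceswap environment. The sketch below makes some assumptions: the package names `cudatoolkit` and `cudnn` are the usual Conda names for these libraries but are not confirmed in this thread as what the installer uses, and `conda_packages` and `missing_gpu_packages` are hypothetical helper names.

```python
def conda_packages(conda_list_text):
    """Map package name -> version from the text printed by `conda list`."""
    packages = {}
    for line in conda_list_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the commented header rows
        fields = line.split()
        if len(fields) >= 2:
            packages[fields[0]] = fields[1]
    return packages


def missing_gpu_packages(conda_list_text, required=("cudatoolkit", "cudnn")):
    """Return the required package names that are absent from the environment."""
    installed = conda_packages(conda_list_text)
    return [name for name in required if name not in installed]


# A few rows copied from the Conda package list earlier in this thread.
sample = """
# Name                    Version                   Build  Channel
blas                      1.0                         mkl
numpy 1.19.2 py38hadc3359_0
python 3.8.5 h5fd99cc_1
"""

print(missing_gpu_packages(sample))
```

If either name comes back as missing, the environment has no local copy of that library and TensorFlow would have to rely on a system-wide one.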
