
llama.cpp fails to use the GPU: "warning: no usable GPU found, --gpu-layers option will be ignored"

Even after building llama.cpp with CUDA support, the GPU still could not be used:

./llama-server -m ../../../../../model/hf_models/qwen/qwen3-4b-q8_0.gguf  -ngl 40

The error output is as follows:

ggml_cuda_init: failed to initialize CUDA: forward compatibility was attempted on non supported HW
warning: no usable GPU found, --gpu-layers option will be ignored
warning: one possible reason is that llama.cpp was compiled without GPU support
warning: consult docs/build.md for compilation instructions
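In this case the binary was in fact built with CUDA, so the "compiled without GPU support" warning is misleading. But if that warning does apply to you, the CUDA build steps from docs/build.md look roughly like this (a sketch; `GGML_CUDA` is the flag used by recent llama.cpp versions — check your checkout's docs/build.md):

```shell
# Configure and build llama.cpp with the CUDA backend enabled.
# Assumes CMake and the CUDA toolkit are installed.
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j
```

After a successful CUDA build, `ggml_cuda_init` should list your devices on startup instead of printing the warnings above.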
Running nvidia-smi shows the real cause: the loaded kernel driver and the user-space NVML library are at different versions.

$ nvidia-smi 
Failed to initialize NVML: Driver/library version mismatch
NVML library version: 550.144
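Since NVML itself is broken in this state, the mismatch can be confirmed without nvidia-smi. A minimal sketch that compares the kernel-side driver version with the user-space NVML library version (paths assume a Debian/Ubuntu-style library layout; adjust for your distro):

```shell
# Kernel module version, read directly from procfs.
kernel_ver=$(grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' /proc/driver/nvidia/version | head -n1)
# User-space NVML library version, taken from the shared-object file name.
lib_ver=$(ls /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.*.* 2>/dev/null \
          | grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n1)
echo "kernel driver: ${kernel_ver:-not loaded}"
echo "NVML library:  ${lib_ver:-not found}"
# If the two versions differ, this is the mismatch NVML is complaining about.
```

This situation typically arises after a driver package upgrade: the libraries on disk are new, but the old kernel module is still loaded.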

Rebooting fixed it: after the restart, the kernel module matches the user-space NVIDIA libraries again and CUDA initializes normally.
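If a reboot is inconvenient, reloading the NVIDIA kernel modules can achieve the same realignment. A sketch, assuming nothing is currently using the GPU and that the standard module names apply to your setup:

```shell
# Stop anything holding the GPU first (X server, persistence daemon, CUDA apps).
sudo systemctl stop nvidia-persistenced 2>/dev/null || true
# Unload the stale modules (dependents first), then reload the driver.
sudo rmmod nvidia_uvm nvidia_drm nvidia_modeset nvidia
sudo modprobe nvidia
nvidia-smi  # should now print the device table instead of the NVML error
```

`rmmod` will refuse to unload a module that is in use, so this fails on a machine where the GPU drives the display; in that case a reboot is the simpler option.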

./llama-server -m ../../../../../model/hf_models/qwen/qwen3-4b-q8_0.gguf  -ngl 40
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
  Device 0: NVIDIA GeForce GTX 1660 Ti, compute capability 7.5, VMM: yes
...

load_tensors: offloading 36 repeating layers to GPU
load_tensors: offloading output layer to GPU
load_tensors: offloaded 37/37 layers to GPU
load_tensors:        CUDA0 model buffer size =  4076.43 MiB
load_tensors:   CPU_Mapped model buffer size =   394.12 MiB