Problem

I recently got access to several H100 GPUs, but running a program on them produces the following warning:

NVIDIA H100 PCIe with CUDA capability sm_90 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70 sm_75 sm_80 sm_86.
If you want to use the NVIDIA H100 PCIe GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/

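The environment originally had CUDA 12.1 and torch 2.0.1. The architecture list in the message stops at sm_86, so the installed PyTorch binary was built without kernels for sm_90 (Hopper), whatever CUDA version is installed on the system.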
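The mismatch described in the warning can be checked by comparing the GPU's compute capability with the architectures the PyTorch binary was compiled for. The sketch below is only an illustration: `device_is_supported` is a hypothetical helper, and matching on the major compute capability is a simplification, not necessarily the exact rule PyTorch applies. `torch.cuda.get_arch_list()` and `torch.cuda.get_device_capability()` are the real PyTorch calls used to read both values.

```python
import re


def device_is_supported(capability, arch_list):
    """Hypothetical helper: True if some compiled sm_XX shares the device's major version."""
    major, _minor = capability
    compiled_majors = set()
    for arch in arch_list:
        match = re.fullmatch(r"sm_(\d+)[a-z]?", arch)
        if match:  # skip PTX entries such as compute_XX
            compiled_majors.add(int(match.group(1)) // 10)
    return major in compiled_majors


# Architecture list copied from the warning message above.
OLD_ARCH_LIST = "sm_37 sm_50 sm_60 sm_70 sm_75 sm_80 sm_86".split()
print(device_is_supported((9, 0), OLD_ARCH_LIST))  # False: no sm_9x kernels

if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None
    if torch is not None and torch.cuda.is_available():
        capability = torch.cuda.get_device_capability(0)
        arch_list = torch.cuda.get_arch_list()
        print("Device capability:", capability)
        print("Compiled arch list:", arch_list)
        print("Supported:", device_is_supported(capability, arch_list))
```

If the last line prints False on the H100 machine, the installed wheel is the problem rather than the driver or the GPU.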

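Solution

Uninstall the previously installed packages, then reinstall a PyTorch build for CUDA 11.8 together with the matching CUDA 11.8 packages: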

pip install torch==2.0.0+cu118 torchaudio==2.0.0+cu118 torchvision==0.15.0+cu118 --extra-index-url https://download.pytorch.org/whl/cu118
conda install cudnn
conda install -c "nvidia/label/cuda-11.8.0" cuda-toolkit
conda install -c "nvidia/label/cuda-11.8.0" cuda-nvcc
conda install -c "nvidia/label/cuda-11.8.0" cuda-runtime
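After reinstalling, it may help to confirm that the interpreter really picked up the cu118 wheel and not a leftover build. The suffix after `+` in `torch.__version__` names the CUDA variant of the wheel, and `torch.version.cuda` reports the CUDA version the binary was compiled against. A minimal sketch, where `build_cuda_tag` is a hypothetical helper introduced only for illustration:

```python
def build_cuda_tag(version):
    """Hypothetical helper: return the local build tag of a version string, or None."""
    _, sep, tag = version.partition("+")
    return tag if sep else None


print(build_cuda_tag("2.0.0+cu118"))  # cu118
print(build_cuda_tag("2.0.1"))        # None (no explicit build tag)

if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None
    if torch is not None:
        print("Wheel build tag:", build_cuda_tag(torch.__version__))
        print("Compiled against CUDA:", torch.version.cuda)
```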

Verification

import torch
print("PyTorch Version: ", torch.__version__)
print("Is available: ", torch.cuda.is_available())
print("Current Device: ", torch.cuda.current_device())
print("Number of GPUs: ", torch.cuda.device_count())

Result

import torch
print("PyTorch Version: ", torch.__version__)
# PyTorch Version:  2.0.0+cu118
print("Is available: ", torch.cuda.is_available())
# Is available:  True
print("Current Device: ", torch.cuda.current_device())
# Current Device:  0
print("Number of GPUs: ", torch.cuda.device_count())
# Number of GPUs:  8
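These checks only query the runtime. `torch.cuda.is_available()` may still report True when the binary lacks kernels for the device, so a stricter check is to launch a small computation on the GPU. The following is a sketch under that assumption; `gpu_smoke_test` is a hypothetical name, and the function returns a short status string so that it also runs on a machine without a GPU.

```python
def gpu_smoke_test():
    """Hypothetical helper: run one tiny kernel on GPU 0 and report what happened."""
    try:
        import torch
    except ImportError:
        return "torch is not installed"
    if not torch.cuda.is_available():
        return "CUDA is not available"
    name = torch.cuda.get_device_name(0)
    major, minor = torch.cuda.get_device_capability(0)
    try:
        x = torch.ones(1024, device="cuda")
        total = (x * 2).sum().item()  # forces a real kernel launch and sync
    except RuntimeError as err:
        return f"kernel launch failed on {name}: {err}"
    if total != 2048.0:
        return f"unexpected result on {name}: {total}"
    return f"ok on {name} (sm_{major}{minor})"


print(gpu_smoke_test())
```

On a correctly installed cu118 build the status should start with "ok on"; a "kernel launch failed" status suggests the wheel still does not match the GPU architecture.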

