Background

Installing transformer-engine in a conda virtual environment on Ubuntu fails. To see the specific cause, run the install with verbose output:

pip install --no-build-isolation transformer_engine[pytorch] --verbose

Error

RuntimeError: Error when running CMake: Command '['/home/skr/miniconda3/envs/cosmos/lib/python3.12/site-packages/cmake/data/bin/cmake', '-S', '/tmp/pip-req-build-nehutrdg/transformer_engine/common', '-B', '/tmp/pip-req-build-nehutrdg/build/cmake', '-DPython_EXECUTABLE=/home/skr/miniconda3/envs/cosmos/bin/python', '-DPython_INCLUDE_DIR=/home/skr/miniconda3/envs/cosmos/include/python3.12', '-DCMAKE_BUILD_TYPE=Release', '-DCMAKE_INSTALL_PREFIX=/tmp/pip-req-build-nehutrdg/build/lib.linux-x86_64-cpython-312', '-DCMAKE_CUDA_ARCHITECTURES=70;80;89;90;100;120', '-Dpybind11_DIR=/tmp/pip-req-build-nehutrdg/.eggs/pybind11-2.13.6-py3.12.egg/pybind11/share/cmake/pybind11', '-GNinja']' returned non-zero exit status 1.
  [end of output]

Solution

Prerequisite:

cuDNN must already be installed in the environment; you can check with pip list. If it is not installed, see https://developer.nvidia.com/cudnn

Reference for the solution:

https://github.com/NVIDIA/TransformerEngine/issues/1506

Steps:

  1. export CUDNN_PATH=/path/to/cudnn
  2. export CPLUS_INCLUDE_PATH=/path/to/cudnn/include
  3. Re-run the pip install command above in the same shell, so that the build sees both variables.

Determining /path/to/cudnn

The directory is inside the current virtual environment, for example:

/data/miniconda3/envs/<your_env>/lib/python3.x/site-packages/nvidia/cudnn
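If the exact path is not obvious, it can be looked up from inside the environment. The following is a minimal sketch, not part of the referenced issue. It assumes cuDNN was installed via pip, so that it lives under nvidia/cudnn in site-packages with an include/ subdirectory. The helper names find_cudnn_dir and export_lines are hypothetical. Run it with the Python interpreter of the activated conda environment.

```python
import os
import sysconfig


def find_cudnn_dir(search_dirs=None):
    """Return the first <dir>/nvidia/cudnn that has an include/ folder, else None.

    Assumption: a pip-installed cuDNN lives under site-packages/nvidia/cudnn.
    """
    if search_dirs is None:
        # site-packages directories of the interpreter running this script
        paths = sysconfig.get_paths()
        search_dirs = [paths["purelib"], paths["platlib"]]
    for base in search_dirs:
        candidate = os.path.join(base, "nvidia", "cudnn")
        if os.path.isdir(os.path.join(candidate, "include")):
            return candidate
    return None


def export_lines(cudnn_dir):
    """Build the two export commands from the steps above for a given directory."""
    return [
        f"export CUDNN_PATH={cudnn_dir}",
        f"export CPLUS_INCLUDE_PATH={os.path.join(cudnn_dir, 'include')}",
    ]


if __name__ == "__main__":
    found = find_cudnn_dir()
    if found is None:
        print("nvidia/cudnn not found in this environment's site-packages")
    else:
        # Copy these lines into the shell before re-running pip install
        print("\n".join(export_lines(found)))
```

The script only prints the export commands; it does not modify the environment itself, since a child process cannot set variables in the calling shell.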
