YOLOv5 v3.0 training error: ImportError: cannot import name 'amp' from 'torch.cuda', and setting up the YOLOv5 v3.0 environment
My environment:
- OS: Ubuntu 18.04.1
- CUDA: 10.2.89
- cuDNN: 7.6.5
- torch: 1.6.0
- torchvision: 0.7.0
1 Error analysis
Training with the YOLOv5 v3.0 code fails with: ImportError: cannot import name 'amp' from 'torch.cuda'
(yolov5) shl@zfcv:~/project/yolov5_v3_0820$ ./4_clothes_shoes_hat_v1_train
Traceback (most recent call last):
File "train.py", line 16, in <module>
from torch.cuda import amp
ImportError: cannot import name 'amp' from 'torch.cuda' (/home/shl/anaconda3/envs/yolov5/lib/python3.7/site-packages/torch/cuda/__init__.py)
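The import fails because the installed PyTorch has no torch.cuda.amp module. To my knowledge, native mixed precision arrived with PyTorch 1.6, so older builds such as 1.4 cannot satisfy this import. A minimal sketch of a version check follows; `parse_version` and `has_native_amp` are hypothetical helpers for illustration, not part of YOLOv5 or PyTorch.

```python
import re


def parse_version(version):
    """Turn '1.4+cu100' or '1.6.0' into a comparable tuple such as (1, 4, 0)."""
    base = version.split("+")[0]              # drop a local tag such as '+cu100'
    parts = [int(n) for n in re.findall(r"\d+", base)[:3]]
    while len(parts) < 3:                     # pad '1.4' to (1, 4, 0)
        parts.append(0)
    return tuple(parts)


def has_native_amp(torch_version):
    """Assumption: torch.cuda.amp ships with PyTorch 1.6.0 and later."""
    return parse_version(torch_version) >= (1, 6, 0)


if __name__ == "__main__":
    for v in ("1.4+cu100", "1.5.1", "1.6.0"):
        print(v, "->", has_native_amp(v))
```

Passing `torch.__version__` to `has_native_amp` before starting training would give a clearer message than the raw ImportError.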
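2 Solution 1
I then started looking for a fix. According to a GitHub issue I found, this may be a mismatch between the CUDA and PyTorch versions. The suggested steps are:
1. Check your CUDA and PyTorch versions
nvcc --version or cat /usr/local/cuda/version.txt
python -c "import torch; print(torch.__version__)"
2. If your CUDA is 10.0, check which PyTorch build matches it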

3. Reinstall the PyTorch build that matches CUDA 10.0
pip install torch==1.4+cu100 torchvision==0.5.0+cu100 -f https://download.pytorch.org/whl/torch_stable.html
My PyTorch install was already torch==1.4+cu100 with torchvision==0.5.0+cu100, yet my CUDA is 10.2, and I did not want to change the CUDA version!
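The mismatch described above (a cu100 wheel on a CUDA 10.2 machine) can be spotted from the version strings alone. The sketch below is only an illustration: `cuda_tag`, `wheel_tag` and `tags_match` are hypothetical helpers, and the optional report relies on `torch.__version__` and `torch.version.cuda`, which print the installed build when torch is available.

```python
def cuda_tag(cuda_version):
    """'10.2.89' -> 'cu102' (major and minor only)."""
    major, minor = cuda_version.split(".")[:2]
    return "cu" + major + minor


def wheel_tag(torch_version):
    """'1.4+cu100' -> 'cu100'; None when the version has no local tag."""
    _, sep, local = torch_version.partition("+")
    return local if sep else None


def tags_match(torch_version, cuda_version):
    """True/False when a '+cuXYZ' tag is present, None when it is not."""
    tag = wheel_tag(torch_version)
    return None if tag is None else tag == cuda_tag(cuda_version)


if __name__ == "__main__":
    try:
        import torch  # optional: report the live installation
        print("torch", torch.__version__, "built for CUDA", torch.version.cuda)
    except ImportError:
        print("torch is not installed in this environment")
    # The combination from this post: a cu100 wheel next to CUDA 10.2.89
    print(tags_match("1.4+cu100", "10.2.89"))
```

A `None` result only means the version string carries no CUDA tag, so the check is inconclusive in that case.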
3 Solution 2
In YOLOv5 v2, train.py imported amp from apex like this:
(yolov5) shl@zfcv:~/project/yolov5_v3_0820$ cat ~/shl/yolov5/train.py -n
1 import argparse
2
3 import torch.distributed as dist
4 import torch.nn.functional as F
5 import torch.optim as optim
6 import torch.optim.lr_scheduler as lr_scheduler
7 import yaml
8 from torch.utils.tensorboard import SummaryWriter
9
10 import test # import test.py to get mAP after each epoch
11 from models.yolo import Model
12 from utils.datasets import *
13 from utils.utils import *
14
15 mixed_precision = True
16 try:  # Mixed precision training https://github.com/NVIDIA/apex
17     from apex import amp
18 except:
19     print('Apex recommended for faster mixed precision training: https://github.com/NVIDIA/apex')
20     mixed_precision = False  # not installed
So I changed from torch.cuda import amp in train.py to from apex import amp.
1. First install apex; if it is not installed, Python reports that the module does not exist
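Instead of hard-coding one import, a script can probe which AMP backend is importable. The sketch below is an assumption-laden illustration, not what train.py does; `pick_amp_backend` is a hypothetical helper. Note also that, as far as I know, apex's amp exposes a different API (amp.initialize, amp.scale_loss) from torch.cuda.amp (autocast, GradScaler), so swapping the import line alone may not be enough for code written against the native module.

```python
import importlib


def pick_amp_backend(candidates=("torch.cuda.amp", "apex.amp")):
    """Return the name of the first importable AMP module, or None."""
    for name in candidates:
        try:
            importlib.import_module(name)
        except Exception:
            # ImportError when the module is missing; a broken or wrong
            # package can raise other errors (e.g. TypeError) on import.
            continue
        return name
    return None


if __name__ == "__main__":
    backend = pick_amp_backend()
    print("AMP backend:", backend or "none, fall back to full precision")
```

Catching `Exception` rather than only `ImportError` is deliberate here, because an incompatible package can fail during import with an unrelated error type.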
(yolov5) shl@zfcv:~/project/yolov5_v3_0820$ ./4_clothes_shoes_hat_v1_train
Traceback (most recent call last):
File "train.py", line 18, in <module>
from apex import amp
ModuleNotFoundError: No module named 'apex'
2. After installing it, I ran training again and got the following error
Error: TypeError: Class advice impossible in Python3. Use the @implementer class decorator instead.
(yolov5) shl@zfcv:~/project/yolov5_v3_0820$ ./4_clothes_shoes_hat_v1_train
Traceback (most recent call last):
File "train.py", line 18, in <module>
from apex import amp
File "/home/shl/anaconda3/envs/yolov5/lib/python3.7/site-packages/apex/__init__.py", line 18, in <module>
from apex.interfaces import (ApexImplementation,
File "/home/shl/anaconda3/envs/yolov5/lib/python3.7/site-packages/apex/interfaces.py", line 10, in <module>
class ApexImplementation(object):
File "/home/shl/anaconda3/envs/yolov5/lib/python3.7/site-packages/apex/interfaces.py", line 14, in ApexImplementation
implements(IApex)
File "/home/shl/anaconda3/envs/yolov5/lib/python3.7/site-packages/zope/interface/declarations.py", line 706, in implements
raise TypeError(_ADVICE_ERROR % 'implementer')
TypeError: Class advice impossible in Python3. Use the @implementer class decorator instead.
(yolov5) shl@zfcv:~/project/yolov5_v3_0820$
3. Fix for the error (reference): TypeError: Class advice impossible in Python3. Use the @implementer class decorator instead.
Install apex from the source code on its GitHub page, as follows:
git clone https://github.com.cnpmjs.org/NVIDIA/apex.git
cd apex
python setup.py install
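The traceback above passes through apex/interfaces.py and zope.interface, which NVIDIA's apex does not use as far as I know. The package installed first was therefore probably a different project that shares the name apex (for example if it came from pip install apex; that is an assumption, since the post does not say how it was installed). The sketch below inspects the installed package's files without importing it; `apex_flavor` and its return labels are hypothetical.

```python
import importlib.util
from pathlib import Path


def apex_flavor(package="apex"):
    """Guess which project provides `package` by looking at its files."""
    spec = importlib.util.find_spec(package)   # does not import the package
    if spec is None:
        return "missing"
    if not spec.submodule_search_locations:
        return "unknown"                       # a plain module, not a package
    pkg_dir = Path(list(spec.submodule_search_locations)[0])
    if (pkg_dir / "amp").is_dir():
        return "nvidia"                        # NVIDIA apex has an amp/ subpackage
    if (pkg_dir / "interfaces.py").is_file():
        return "other"                         # layout seen in the traceback above
    return "unknown"


if __name__ == "__main__":
    print("apex flavor:", apex_flavor())
```

If this prints `other`, uninstalling that package before building NVIDIA's apex from source avoids the two shadowing each other.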
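That fixed the error completely, but another one followed right away: AttributeError: module 'torch.nn' has no attribute 'Hardswish'
The nn.Hardswish activation appears to be new in torch 1.6, and the YOLOv5 authors also state that YOLOv5 v3.0 requires torch>=1.6. So frustrating!
With no other choice, I had to create a new virtual environment and install torch 1.6 after all.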
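Both failures so far (the missing amp module and the missing nn.Hardswish) come down to features the installed torch does not provide. A small preflight check can list them all at once instead of hitting them one by one. This is a sketch under my own assumptions: `is_available` and `missing_features` are hypothetical helpers, and the default feature list only covers the two names mentioned in this post.

```python
import importlib


def is_available(dotted_name):
    """True if `dotted_name` is an importable module or a module attribute."""
    try:
        importlib.import_module(dotted_name)   # works when the name is a module
        return True
    except ImportError:
        pass
    module_name, _, attr = dotted_name.rpartition(".")
    try:
        return hasattr(importlib.import_module(module_name), attr)
    except ImportError:
        return False


def missing_features(required=("torch.cuda.amp", "torch.nn.Hardswish")):
    """Return the names from `required` that cannot be resolved."""
    return [name for name in required if not is_available(name)]


if __name__ == "__main__":
    missing = missing_features()
    print("missing:", missing if missing else "nothing")
```

Checking for the features themselves, rather than comparing version numbers, keeps the check valid for custom or nightly builds.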
4 Solution 3: a new virtual environment with torch 1.6
1. Create a new virtual environment
conda create -n yolov5-v3 python=3.7
2. Activate the virtual environment
conda activate yolov5-v3
3. Download and install PyTorch 1.6
First check on the PyTorch website which CUDA version PyTorch 1.6 depends on
Then install with the following command:
pip install torch===1.6.0 torchvision===0.7.0 -f https://download.pytorch.org/whl/torch_stable.html

4. Install the dependencies of YOLOv5 v3
pip install -r requirements.txt
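torch and torchvision are released in matching pairs, so it is worth confirming the pair after installation. The sketch below uses the two pairs that appear in this post (1.4 with 0.5.0, 1.6.0 with 0.7.0) plus 1.5.1 with 0.6.1, which is my own addition from the release history. `KNOWN_PAIRS`, `base_version` and `check_pair` are hypothetical names, and the parser only handles plain numeric versions with an optional +cuXYZ suffix.

```python
# torch base version -> matching torchvision version (deliberately incomplete)
KNOWN_PAIRS = {
    (1, 4, 0): "0.5.0",
    (1, 5, 1): "0.6.1",
    (1, 6, 0): "0.7.0",
}


def base_version(version):
    """'1.4+cu100' -> (1, 4, 0); the local '+cu100' tag is ignored."""
    nums = [int(p) for p in version.split("+")[0].split(".")]
    return tuple(nums + [0] * (3 - len(nums)))


def check_pair(torch_version, torchvision_version):
    """True/False for known torch versions, None when the table has no entry."""
    expected = KNOWN_PAIRS.get(base_version(torch_version))
    if expected is None:
        return None
    return base_version(torchvision_version) == base_version(expected)


if __name__ == "__main__":
    print(check_pair("1.6.0", "0.7.0"))
    print(check_pair("1.6.0", "0.5.0+cu100"))
```

In a live environment the two arguments would come from `torch.__version__` and `torchvision.__version__`.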
After that, a new error appeared:
torch.nn.modules.module.ModuleAttributeError: 'BatchNorm2d' object has no attribute '_non_persistent_buffers_set'

I checked the issues on the official repository; the fix suggested there is to change the torch version
https://download.pytorch.org/whl/torch_stable.html
After I installed torch 1.5.1, training failed again with: AttributeError: module 'torch.nn' has no attribute 'Hardswish'. Is this an endless loop? I decided to give up on this route!
5 Final solution
See the following blog post for the details; I will not repeat them here.
- Blog: https://shliang.blog.csdn.net/article/details/108219810

