
Huawei Atlas 200DK A2 Deployment in Practice (Part 3): Exporting a YOLOv8-pose Model Trained on a Custom Dataset to ONNX for Inference & Prediction
This article takes a YOLOv8-pose model trained on a custom dataset, exports it to ONNX format, and walks through the design of the pre-processing and post-processing modules. The modules are verified against the original PyTorch model's inference pipeline, and the ONNX model is then used on a PC to predict on single images and on live camera video.
Exporting the Custom-Dataset YOLOv8-pose Model to ONNX: Inference & Prediction
After training completes, we can first test the model's performance under the PyTorch framework with the script below.
import cv2
import numpy as np
from ultralytics import YOLO
import random

def hsv2bgr(h, s, v):
    h_i = int(h * 6)
    f = h * 6 - h_i
    p = v * (1 - s)
    q = v * (1 - f * s)
    t = v * (1 - (1 - f) * s)

    r, g, b = 0, 0, 0

    if h_i == 0:
        r, g, b = v, t, p
    elif h_i == 1:
        r, g, b = q, v, p
    elif h_i == 2:
        r, g, b = p, v, t
    elif h_i == 3:
        r, g, b = p, q, v
    elif h_i == 4:
        r, g, b = t, p, v
    elif h_i == 5:
        r, g, b = v, p, q

    return int(b * 255), int(g * 255), int(r * 255)

def random_color(id):
    h_plane = (((id << 2) ^ 0x937151) % 100) / 100.0
    s_plane = (((id << 3) ^ 0x315793) % 100) / 100.0
    return hsv2bgr(h_plane, s_plane, 1)
# Skeleton connections between keypoints
skeleton = [[1, 2], [2, 3], [3, 4], [4, 5], [1, 6], [6, 7], [7, 8], [8, 9], [6, 10],
            [10, 11], [11, 12], [12, 13], [10, 14], [14, 15], [15, 16], [16, 17], [14, 18], [18, 19], [19, 20], [20, 21], [1, 18]]

# Color palette
pose_palette = np.array([[255, 128, 0], [255, 153, 51], [255, 178, 102], [230, 230, 0], [255, 153, 255],
                         [153, 204, 255], [255, 102, 255], [255, 51, 255], [102, 178, 255], [51, 153, 255],
                         [255, 153, 153], [255, 102, 102], [255, 51, 51], [153, 255, 153], [102, 255, 102],
                         [51, 255, 51], [0, 255, 0], [0, 0, 255], [255, 0, 0], [255, 255, 255]], dtype=np.uint8)

# Keypoint colors
kpt_color = pose_palette[[0, 16, 16, 16, 16, 0, 16, 16, 16, 0, 16, 16, 16, 0, 16, 16, 16, 0, 16, 16, 16, 16, 16]]
# Skeleton (limb) colors
limb_color = pose_palette[[8, 8, 8, 8, 0, 13, 13, 13, 0, 10, 10, 10, 0, 2, 2, 2, 0, 7, 7, 7, 0]]
if __name__ == "__main__":

    model = YOLO("/home/lx/model_deploy/yolov8/runs/pose/train3/weights/best.pt")  # change to the path of your own model
    img = cv2.imread("predict_test.jpg")  # change to the image you want to run inference on
    results = model(img)[0]
    names = results.names
    boxes = results.boxes.data.tolist()

    # keypoints.data.shape -> n,21,2
    keypoints = results.keypoints.cpu().numpy()

    # keypoint -> the keypoints of one detected object
    for keypoint in keypoints.data:
        for i, (x, y) in enumerate(keypoint):
            color_k = [int(c) for c in kpt_color[i]]
            if x != 0 and y != 0:
                cv2.circle(img, (int(x), int(y)), 5, color_k, -1, lineType=cv2.LINE_AA)
        for i, sk in enumerate(skeleton):
            pos1 = (int(keypoint[(sk[0] - 1), 0]), int(keypoint[(sk[0] - 1), 1]))
            pos2 = (int(keypoint[(sk[1] - 1), 0]), int(keypoint[(sk[1] - 1), 1]))
            if pos1[0] == 0 or pos1[1] == 0 or pos2[0] == 0 or pos2[1] == 0:
                continue
            cv2.line(img, pos1, pos2, [int(c) for c in limb_color[i]], thickness=2, lineType=cv2.LINE_AA)

    for obj in boxes:
        left, top, right, bottom = int(obj[0]), int(obj[1]), int(obj[2]), int(obj[3])
        confidence = obj[4]
        label = int(obj[5])
        color = random_color(random.randint(1, 100))
        cv2.rectangle(img, (left, top), (right, bottom), color=color, thickness=2, lineType=cv2.LINE_AA)
        caption = f"{names[label]} {confidence:.2f}"
        w, h = cv2.getTextSize(caption, 0, 1, 2)[0]
        cv2.rectangle(img, (left - 3, top - 33), (left + w + 10, top), color, -1)
        cv2.putText(img, caption, (left, top - 5), 0, 1, (0, 0, 0), 2, 16)

    cv2.imwrite("predict-pose.jpg", img)
    print("save done")
Here, the skeleton array describes which keypoints are connected to each other; the connections correspond to the figure in the previous article of this series. The pose_palette array stores a set of preset colors, and you can add more of your own. kpt_color holds the color of each keypoint, and limb_color the color of each skeleton line.
After the script runs successfully, you should see a result like the image below.
Since the Huawei Ascend toolchain does not support converting a .pt model directly to the .om format, we first convert the model from .pt to ONNX, and then convert the ONNX model to .om.
The conversion from .pt to ONNX is done with the following code.
from ultralytics import YOLO
# Load a model
#model = YOLO('yolov8s-pose.pt') # load an official model
model = YOLO("/home/lx/model_deploy/yolov8/runs/pose/train3/weights/handpose.pt") # load a custom trained model
# Export the model
model.export(format='onnx',imgsz=640)
The conversion details are shown below. My environment uses onnx 1.12.0 with opset 17; if anything unexpected happens, you can explicitly set the opset version during export to match mine.
Ultralytics YOLOv8.0.225 🚀 Python-3.8.0 torch-2.0.1+cu117 CPU (Intel Core(TM) i7-10700 2.90GHz)
YOLOv8n-pose summary (fused): 187 layers, 3228485 parameters, 0 gradients, 8.9 GFLOPs
PyTorch: starting from '/home/lx/model_deploy/yolov8/runs/pose/train3/weights/best.pt' with input shape (1, 3, 640, 640) BCHW and output shape(s) (1, 47, 8400) (6.4 MB)
ONNX: starting export with onnx 1.12.0 opset 17...
============= Diagnostic Run torch.onnx.export version 2.0.1+cu117 =============
verbose: False, log level: Level.ERROR
======================= 0 NONE 0 NOTE 0 WARNING 0 ERROR ========================
ONNX: export success ✅ 0.8s, saved as '/home/lx/model_deploy/yolov8/runs/pose/train3/weights/best.onnx' (12.6 MB)
Export complete (2.1s)
Results saved to /home/lx/model_deploy/yolov8/runs/pose/train3/weights
Predict: yolo predict task=pose model=/home/lx/model_deploy/yolov8/runs/pose/train3/weights/best.onnx imgsz=640
Validate: yolo val task=pose model=/home/lx/model_deploy/yolov8/runs/pose/train3/weights/best.onnx imgsz=640 data=pose.yaml
Visualize: https://netron.app
After that, handpose.onnx appears in the weights directory (the exact file name and output path follow the path of the .pt model you loaded).
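If the export step aborts with an opset-related error, a minimal fix is to pin the opset to the version shown in the log above; opset is an optional argument of Ultralytics' export():

model.export(format='onnx', imgsz=640, opset=17)  # pin the opset to match onnx 1.12.0 / opset 17 from the log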
The onnx-related packages need to be installed yourself; in general, newer versions are better. Next, let's test the performance of the exported ONNX model; the full code is shown below.
import onnxruntime
import numpy as np
import cv2
import time
import torch
from ultralytics.nn.autobackend import AutoBackend

# Color palette
palette = np.array([[255, 128, 0], [255, 153, 51], [255, 178, 102], [230, 230, 0], [255, 153, 255],
                    [153, 204, 255], [255, 102, 255], [255, 51, 255], [102, 178, 255], [51, 153, 255],
                    [255, 153, 153], [255, 102, 102], [255, 51, 51], [153, 255, 153], [102, 255, 102],
                    [51, 255, 51], [0, 255, 0], [0, 0, 255], [255, 0, 0], [255, 255, 255]], dtype=np.uint8)

# Connection order of the 21 keypoints
skeleton = [[1, 2], [2, 3], [3, 4], [4, 5], [1, 6], [6, 7], [7, 8], [8, 9], [6, 10],
            [10, 11], [11, 12], [12, 13], [10, 14], [14, 15], [15, 16], [16, 17], [14, 18], [18, 19], [19, 20], [20, 21], [1, 18]]

# Skeleton (limb) colors
pose_limb_color = palette[[8, 8, 8, 8, 0, 13, 13, 13, 0, 10, 10, 10, 0, 2, 2, 2, 0, 7, 7, 7, 0]]
# Keypoint colors
pose_kpt_color = palette[[0, 16, 16, 16, 16, 0, 16, 16, 16, 0, 16, 16, 16, 0, 16, 16, 16, 0, 16, 16, 16, 16, 16]]
def preprocess_warpAffine(image, dst_width=640, dst_height=640):
    # Letterbox resize: scale the image to fit 640x640 and pad the rest with gray (114)
    scale = min((dst_width / image.shape[1], dst_height / image.shape[0]))
    ox = (dst_width - scale * image.shape[1]) / 2
    oy = (dst_height - scale * image.shape[0]) / 2
    M = np.array([
        [scale, 0, ox],
        [0, scale, oy]
    ], dtype=np.float32)

    img_pre = cv2.warpAffine(image, M, (dst_width, dst_height), flags=cv2.INTER_LINEAR,
                             borderMode=cv2.BORDER_CONSTANT, borderValue=(114, 114, 114))
    IM = cv2.invertAffineTransform(M)

    img_pre = (img_pre[..., ::-1] / 255.0).astype(np.float32)  # BGR -> RGB, normalize to [0, 1]
    img_pre = img_pre.transpose(2, 0, 1)[None]                 # HWC -> NCHW
    # Convert to a tensor when running inference with the PyTorch framework
    # img_pre = torch.from_numpy(img_pre)
    return img_pre, IM
def iou(box1, box2):
    def area_box(box):
        return (box[2] - box[0]) * (box[3] - box[1])

    # Intersection rectangle
    left   = max(box1[0], box2[0])
    top    = max(box1[1], box2[1])
    right  = min(box1[2], box2[2])
    bottom = min(box1[3], box2[3])

    cross = max((right - left), 0) * max((bottom - top), 0)  # intersection area
    union = area_box(box1) + area_box(box2) - cross          # union area
    if cross == 0 or union == 0:
        return 0
    return cross / union

def NMS(boxes, iou_thres):
    remove_flags = [False] * len(boxes)
    keep_boxes = []
    for i, ibox in enumerate(boxes):
        if remove_flags[i]:
            continue
        keep_boxes.append(ibox)
        for j in range(i + 1, len(boxes)):
            if remove_flags[j]:
                continue
            jbox = boxes[j]
            if iou(ibox, jbox) > iou_thres:
                remove_flags[j] = True
    return keep_boxes
def postprocess(pred, IM=[], conf_thres=0.25, iou_thres=0.45):
    # Input is the model output, i.e. 8400 candidate boxes
    # shape 1,8400,47: [cx, cy, w, h, conf, 21*2 keypoint coordinates]
    boxes = []
    for img_id, box_id in zip(*np.where(pred[..., 4] > conf_thres)):
        item = pred[img_id, box_id]
        cx, cy, w, h, conf = item[:5]
        left = cx - w * 0.5
        top = cy - h * 0.5
        right = cx + w * 0.5
        bottom = cy + h * 0.5
        # Map keypoints back to the original image with the inverse affine matrix IM
        keypoints = item[5:].reshape(-1, 2)
        keypoints[:, 0] = keypoints[:, 0] * IM[0][0] + IM[0][2]
        keypoints[:, 1] = keypoints[:, 1] * IM[1][1] + IM[1][2]
        boxes.append([left, top, right, bottom, conf, *keypoints.reshape(-1).tolist()])

    # Guard against input images that contain no target at all
    if boxes != []:
        boxes = np.array(boxes)
        # Map the boxes back to the original image as well
        lr = boxes[:, [0, 2]]
        tb = boxes[:, [1, 3]]
        boxes[:, [0, 2]] = IM[0][0] * lr + IM[0][2]
        boxes[:, [1, 3]] = IM[1][1] * tb + IM[1][2]
        boxes = sorted(boxes.tolist(), key=lambda x: x[4], reverse=True)
        return NMS(boxes, iou_thres)
    return []
def hsv2bgr(h, s, v):
    h_i = int(h * 6)
    f = h * 6 - h_i
    p = v * (1 - s)
    q = v * (1 - f * s)
    t = v * (1 - (1 - f) * s)

    r, g, b = 0, 0, 0

    if h_i == 0:
        r, g, b = v, t, p
    elif h_i == 1:
        r, g, b = q, v, p
    elif h_i == 2:
        r, g, b = p, v, t
    elif h_i == 3:
        r, g, b = p, q, v
    elif h_i == 4:
        r, g, b = t, p, v
    elif h_i == 5:
        r, g, b = v, p, q

    return int(b * 255), int(g * 255), int(r * 255)

def random_color(id):
    h_plane = (((id << 2) ^ 0x937151) % 100) / 100.0
    s_plane = (((id << 3) ^ 0x315793) % 100) / 100.0
    return hsv2bgr(h_plane, s_plane, 1)
class Keypoint():
    def __init__(self, modelpath):
        # self.session = onnxruntime.InferenceSession(modelpath, providers=['CUDAExecutionProvider', 'CPUExecutionProvider'])
        self.session = onnxruntime.InferenceSession(modelpath, providers=['CPUExecutionProvider'])
        self.input_name = self.session.get_inputs()[0].name
        self.label_name = self.session.get_outputs()[0].name

    def inference(self, image):
        # Pre-processing: img is the letterboxed image, IM the inverse affine matrix
        img, IM = preprocess_warpAffine(image)

        # PyTorch inference (used to verify the pre/post-processing)
        # model = AutoBackend(weights="/home/lx/model_deploy/yolov8/runs/pose/train3/weights/best.pt")
        # result = model(img)[0].transpose(-1, -2)  # 1,8400,47

        # ONNX inference; the raw output is float32[1, 47, 8400]
        pred = self.session.run([self.label_name], {self.input_name: img.astype(np.float32)})[0]
        # Transpose to [1, 8400, 47]
        pred = np.transpose(pred, (0, 2, 1))

        # Post-processing
        boxes = postprocess(pred, IM)

        for box in boxes:
            left, top, right, bottom = int(box[0]), int(box[1]), int(box[2]), int(box[3])
            confidence = box[4]
            label = 0
            color = random_color(label)
            cv2.rectangle(image, (left, top), (right, bottom), color, 2, cv2.LINE_AA)
            caption = f"hand {confidence:.2f}"
            w, h = cv2.getTextSize(caption, 0, 1, 2)[0]
            cv2.rectangle(image, (left - 3, top - 33), (left + w + 10, top), color, -1)
            cv2.putText(image, caption, (left, top - 5), 0, 1, (0, 0, 0), 2, 16)

            keypoints = box[5:]
            keypoints = np.array(keypoints).reshape(-1, 2)
            for i, keypoint in enumerate(keypoints):
                x, y = keypoint
                color_k = [int(c) for c in pose_kpt_color[i]]
                if x != 0 and y != 0:
                    cv2.circle(image, (int(x), int(y)), 5, color_k, -1, lineType=cv2.LINE_AA)
            for i, sk in enumerate(skeleton):
                pos1 = (int(keypoints[(sk[0] - 1), 0]), int(keypoints[(sk[0] - 1), 1]))
                pos2 = (int(keypoints[(sk[1] - 1), 0]), int(keypoints[(sk[1] - 1), 1]))
                if pos1[0] == 0 or pos1[1] == 0 or pos2[0] == 0 or pos2[1] == 0:
                    continue
                cv2.line(image, pos1, pos2, [int(c) for c in pose_limb_color[i]], thickness=2, lineType=cv2.LINE_AA)
        return image
if __name__ == '__main__':
    modelpath = r'handpose.onnx'  # change to the path of your own ONNX model
    # Instantiate the model
    keydet = Keypoint(modelpath)
    # Two modes: 1 = predict a single image and show the result; 2 = camera detection with live FPS display
    mode = 1
    if mode == 1:
        # Input image path
        image = cv2.imread('predict_test.jpg')  # the image to predict
        start = time.time()
        image = keydet.inference(image)
        end = time.time()
        det_time = (end - start) * 1000
        print("Inference time: {:.2f} ms".format(det_time))
        print("Image detection finished")
        cv2.namedWindow("keypoint", cv2.WINDOW_NORMAL)
        cv2.imshow("keypoint", image)
        cv2.imwrite('imgs/res.jpg', image)
        # Press any key to close the window
        cv2.waitKey(0)
        cv2.destroyAllWindows()
    elif mode == 2:
        # Keypoint detection on the camera stream
        cap = cv2.VideoCapture(0)
        # Record the start time
        start_time = time.time()
        counter = 0
        while True:
            # Read one frame from the camera
            ret, frame = cap.read()
            image = keydet.inference(frame)
            counter += 1  # frame counter
            # Display the FPS in real time
            if (time.time() - start_time) != 0:
                cv2.putText(image, "FPS:{0}".format(float('%.1f' % (counter / (time.time() - start_time)))), (5, 30),
                            cv2.FONT_HERSHEY_SIMPLEX, 0.75, (0, 0, 255), 1)
            # Show the frame
            cv2.imshow('keypoint', image)
            if cv2.waitKey(1) & 0xFF == ord('q'):
                break
        # Release resources
        cap.release()
        cv2.destroyAllWindows()
    else:
        print("\033[1;91m Invalid input, please check the value of mode \033[0m")
The code above supports two inference modes: one runs inference on a single input image and saves the result, and the other reads frames from a camera and displays the results in real time. You could also add a third mode that reads a local video file, runs inference, and saves the output; a sketch of that is shown below.
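A minimal sketch of that extra mode, assuming keydet is the Keypoint instance created in the script above and that input.mp4 / output.mp4 are placeholder file names:

cap = cv2.VideoCapture("input.mp4")                      # local video file (placeholder name)
fps = cap.get(cv2.CAP_PROP_FPS) or 30                    # fall back to 30 if FPS is unavailable
w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
writer = cv2.VideoWriter("output.mp4", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))
while True:
    ret, frame = cap.read()
    if not ret:
        break
    writer.write(keydet.inference(frame))                # run inference and write the annotated frame
cap.release()
writer.release()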
At the top of the script I also import torch and the AutoBackend module from YOLOv8; they are used to run the PyTorch model as a reference and further verify that the pre-processing and post-processing modules are correct (a sketch of that verification path follows). You can refer to the link for the pre/post-processing modules.
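For reference, the commented-out PyTorch path inside Keypoint.inference can be wired up roughly like this. This is a minimal sketch, not verbatim from the article: it assumes the best.pt path used earlier, that image is the input BGR frame, and that the raw PyTorch output has the same (1, 47, 8400) layout as the ONNX export:

import torch
from ultralytics.nn.autobackend import AutoBackend

model = AutoBackend(weights="/home/lx/model_deploy/yolov8/runs/pose/train3/weights/best.pt")
img, IM = preprocess_warpAffine(image)        # same pre-processing as the ONNX path
pred = model(torch.from_numpy(img))[0]        # raw output, assumed shape (1, 47, 8400)
pred = pred.transpose(-1, -2).cpu().numpy()   # -> (1, 8400, 47), reusable by postprocess()
boxes = postprocess(pred, IM)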
The pre-processing code is largely the same as most versions found online: resize the image, pad with gray bars, normalize, and convert from BGR to RGB; if you run inference with the PyTorch framework you also need to convert the image from a NumPy array to a tensor.
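As a quick numeric check of the letterbox math above (the 1280x720 input size here is purely an assumption for illustration):

scale = min(640 / 1280, 640 / 720)   # 0.5 -> the image is resized to 640x360
ox = (640 - scale * 1280) / 2        # 0.0   (no horizontal padding)
oy = (640 - scale * 720) / 2         # 140.0 (140-pixel gray bars on top and bottom)
# The inverse matrix IM then maps network coordinates back to the original image:
# x_img = (x_net - ox) / scale,  y_img = (y_net - oy) / scale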
The post-processing code differs somewhat from the versions found online; YOLOv8's own code is heavily encapsulated, so I adapted mine from an online template. In our dataset each keypoint has only two features, x and y, unlike the dataset used by the original yolov8-pose, where each keypoint has three features: x, y and v (v indicates whether the keypoint is visible). The post-processing therefore has to be written carefully: the packed keypoint features must be unpacked two at a time, not three as before, and every piece of code that touches the third feature v has to be removed, i.e. all keypoints are assumed to be visible; only then does the pipeline run through. Alternatively, you can write a small script that edits the labels directly, appending the value 2 after every keypoint's x and y (0 means invisible, 2 means visible) to restore the third feature and keep the formats aligned (a sketch of such a script follows this paragraph). In addition, the boxes list must be checked for emptiness, so the program does not index out of range and crash when the input image contains no target.
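A minimal sketch of that label-fixing alternative: it appends the visibility flag 2 after every (x, y) pair in the YOLO pose label files so they match the original x, y, v format. The labels directory pattern is a placeholder, and 21 keypoints per object is assumed:

import glob

NUM_KPTS = 21
for path in glob.glob("labels/**/*.txt", recursive=True):    # placeholder label directory
    new_lines = []
    with open(path) as f:
        for line in f:
            vals = line.split()
            if not vals:
                continue
            box, kpts = vals[:5], vals[5:]                    # class cx cy w h + 21*(x y)
            fixed = []
            for i in range(NUM_KPTS):
                x, y = kpts[2 * i], kpts[2 * i + 1]
                fixed += [x, y, "2"]                          # 2 = visible
            new_lines.append(" ".join(box + fixed))
    with open(path, "w") as f:
        f.write("\n".join(new_lines) + "\n")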
Once the pre-processing, inference and post-processing pipeline produces correct images with the PyTorch model, we swap the inference part for the ONNX model. ONNX Runtime takes NumPy input, so there is no need to convert the image to a tensor during pre-processing.
Now we can show the inference results of our ONNX model.
In testing, video inference runs at roughly 15-20 FPS; with further quantization and pruning of the ONNX model, we could do even better.
That is the whole process of exporting a YOLOv8-pose model trained on a custom dataset to ONNX and verifying it with inference ^_^
