
Open-Source Python Application Development (5): Object Detection with Python and OpenCV, Using an Open-Source Model to Recognize Objects

I recently needed to build a visual-automation tool for a project and settled on Python, so I took the opportunity to study it systematically. Because I am learning on a short timeline and need to develop quickly, I am recording the key steps here so I do not forget them.

 

Links to this series:

Open-Source Python Application Development (1): Installing Python, pip, PyAutoGUI, and OpenCV - CSDN Blog

Open-Source Python Application Development (2): Tool Automation Based on PyAutoGUI and OpenCV Visual Recognition - CSDN Blog

Open-Source Python Application Development (3): An Introduction to Python Syntax - CSDN Blog

Open-Source Python Application Development (4): Python Files and Integrated System Applications - CSDN Blog

Open-Source Python Application Development (5): Object Detection with Python and OpenCV - CSDN Blog

Open-Source Python Application Development (6): Web Crawlers - CSDN Blog

Open-Source Python Application Development (7): Data Visualization - CSDN Blog


This chapter implements object detection with the YOLOv3 (You Only Look Once, version 3) deep-learning model. It recognizes bananas, mobile phones, and cars and pedestrians in a night scene; the first time I tried it, it was a lot of fun. The model is somewhat demanding about input images, so when substituting your own, prefer a sharp, high-resolution photo. The chapter covers:

1.  The YOLOv3 model

2.  Main functions

3.  Complete code

4.  Demonstration

1. The YOLOv3 Model

This example uses a pretrained YOLOv3 model (it requires the yolov3.cfg and yolov3.weights files), trained on the COCO dataset (80 classes).

The code needs three files: yolov3.cfg, yolov3.weights, and coco.names. All three are easy to find online. To try a different model variant, swap in the matching .cfg and .weights files.

Link: YOLO: Real-Time Object Detection

The yolov3.cfg file:

The file runs to several hundred lines of repeated layer blocks; the [net] header, the first few layers, and the final detection head are shown below. Download the complete file from the official link above rather than copying it from here.

```
[net]
# Testing
# batch=1
# subdivisions=1
# Training
batch=64
subdivisions=16
width=608
height=608
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.001
burn_in=1000
max_batches = 500200
policy=steps
steps=400000,450000
scales=.1,.1

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky

# Downsample

[convolutional]
batch_normalize=1
filters=64
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

# ... further Downsample / [convolutional] / [shortcut] / [route] / [upsample]
# blocks follow the same pattern; the file ends with three [yolo] detection
# heads, the last of which is:

[yolo]
mask = 0,1,2
anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
classes=80
num=9
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1
```
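The .cfg format is just INI-style sections, so it is easy to inspect programmatically. Below is a minimal hand-rolled parser sketch (not what Darknet or OpenCV actually use internally) that splits the text into per-section key/value dictionaries; the embedded sample string is a made-up excerpt:

```python
def parse_darknet_cfg(text):
    """Split Darknet .cfg text into a list of (section, {key: value}) pairs."""
    sections = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            sections.append((line[1:-1], {}))
        elif "=" in line and sections:
            key, value = (part.strip() for part in line.split("=", 1))
            sections[-1][1][key] = value
    return sections

sample = """
[net]
width=608
height=608

[convolutional]
filters=32
activation=leaky
"""
cfg = parse_darknet_cfg(sample)
print(cfg[0][0], cfg[0][1]["width"])  # net 608
```

Repeated section names (e.g. many [convolutional] blocks) are kept as separate entries, which is why the result is a list of pairs rather than a single dictionary.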

The coco.names file (one class name per line):

```
person
bicycle
car
motorbike
aeroplane
bus
train
truck
boat
traffic light
fire hydrant
stop sign
parking meter
bench
bird
cat
dog
horse
sheep
cow
elephant
bear
zebra
giraffe
backpack
umbrella
handbag
tie
suitcase
frisbee
skis
snowboard
sports ball
kite
baseball bat
baseball glove
skateboard
surfboard
tennis racket
bottle
wine glass
cup
fork
knife
spoon
bowl
banana
apple
sandwich
orange
broccoli
carrot
hot dog
pizza
donut
cake
chair
sofa
pottedplant
bed
diningtable
toilet
tvmonitor
laptop
mouse
remote
keyboard
cell phone
microwave
oven
toaster
sink
refrigerator
book
clock
vase
scissors
teddy bear
hair drier
toothbrush
```
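Class IDs returned by the network are simply 0-based line numbers in this file. A minimal sketch of loading and indexing the list, using a short inline excerpt instead of the real file:

```python
import io

def load_class_names(fp):
    """Read one class name per line, skipping blank lines."""
    return [line.strip() for line in fp if line.strip()]

# Inline excerpt standing in for coco.names
sample = io.StringIO("person\nbicycle\ncar\nmotorbike\naeroplane\n")
classes = load_class_names(sample)
print(classes[0], classes[2])  # person car
```

So when the network reports class_id 0 on the full file, `classes[0]` is "person", and the detection code below looks labels up the same way.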

2. Main Functions

2.1  The load_yolo() function

Purpose: load the YOLOv3 model and its supporting files.

```python
def load_yolo():
    # Directory containing this script
    current_dir = os.path.dirname(os.path.abspath(__file__))
    # Build full paths to the model files
    cfg_path = os.path.join(current_dir, "yolov3.cfg")
    weights_path = os.path.join(current_dir, "yolov3.weights")
    names_path = os.path.join(current_dir, "coco.names")
    # Make sure all three files exist
    if not all(os.path.exists(f) for f in [cfg_path, weights_path, names_path]):
        raise FileNotFoundError(
            "Missing YOLO model files: put yolov3.cfg, yolov3.weights "
            "and coco.names in the script directory")
    # Load the network
    net = cv2.dnn.readNet(weights_path, cfg_path)
    with open(names_path, "r") as f:
        classes = [line.strip() for line in f.readlines()]
    layer_names = net.getLayerNames()
    output_layers = [layer_names[i - 1] for i in net.getUnconnectedOutLayers()]
    return net, classes, output_layers
```
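The `i - 1` offset exists because `getUnconnectedOutLayers()` reports 1-based layer indices while `getLayerNames()` returns an ordinary 0-based Python list. A minimal sketch of the same mapping with plain lists (the layer names here are made up for illustration):

```python
# Sketch of the 1-based -> 0-based index mapping used in load_yolo().
# Hypothetical layer names; real YOLOv3 networks have ~106 layers.
layer_names = ["conv_0", "conv_1", "yolo_82", "conv_3", "yolo_94"]
unconnected = [3, 5]          # 1-based indices, as OpenCV reports them
output_layers = [layer_names[i - 1] for i in unconnected]
print(output_layers)  # ['yolo_82', 'yolo_94']
```

Note that depending on the OpenCV version, `getUnconnectedOutLayers()` may return one-element arrays rather than plain integers, which is why some examples online write `i[0] - 1` instead.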

2.2  The detect_objects(img, net, output_layers) function

Purpose: run object detection on an input image.

```python
def detect_objects(img, net, output_layers):
    # Create a blob from the image: scale pixels to [0, 1], resize to
    # 416x416, and swap BGR -> RGB
    blob = cv2.dnn.blobFromImage(img, scalefactor=1/255.0, size=(416, 416),
                                 swapRB=True, crop=False)
    # Set the input and run a forward pass
    net.setInput(blob)
    outputs = net.forward(output_layers)
    return outputs
```
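What blobFromImage does here can be approximated in plain NumPy (leaving out the resize step): scale the pixel values, swap BGR to RGB, and reorder HWC to NCHW with a batch axis. This is a sketch of the idea, not OpenCV's exact implementation:

```python
import numpy as np

def to_blob(img_bgr, scalefactor=1/255.0):
    """Approximate cv2.dnn.blobFromImage for an already-resized image:
    scale, swap BGR -> RGB, and convert HWC to NCHW with a batch axis."""
    rgb = img_bgr[:, :, ::-1].astype(np.float32) * scalefactor
    return rgb.transpose(2, 0, 1)[np.newaxis, ...]

img = np.zeros((416, 416, 3), dtype=np.uint8)
img[..., 0] = 255            # pure blue in BGR (channel 0 = blue)
blob = to_blob(img)
print(blob.shape)            # (1, 3, 416, 416)
print(blob[0, 2, 0, 0])      # 1.0 -- blue ends up in the last RGB channel
```

The NCHW layout (batch, channels, height, width) is what DNN frameworks expect, which is why the channel axis moves to the front.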

2.3  The get_box_dimensions(outputs, height, width) function

Purpose: process the network outputs and extract bounding-box information.

```python
def get_box_dimensions(outputs, height, width):
    boxes = []
    confidences = []
    class_ids = []
    for output in outputs:
        for detection in output:
            scores = detection[5:]
            class_id = np.argmax(scores)
            confidence = scores[class_id]
            if confidence > 0.5:  # confidence threshold
                # Box center and size, scaled back to pixel coordinates
                center_x = int(detection[0] * width)
                center_y = int(detection[1] * height)
                w = int(detection[2] * width)
                h = int(detection[3] * height)
                # Top-left corner of the rectangle
                x = int(center_x - w / 2)
                y = int(center_y - h / 2)
                boxes.append([x, y, w, h])
                confidences.append(float(confidence))
                class_ids.append(class_id)
    return boxes, confidences, class_ids
```
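The coordinate math is easy to check by hand: YOLO emits (center_x, center_y, w, h) as fractions of the image, which get scaled to pixels and converted to a top-left corner. A worked example with a made-up detection vector:

```python
# One fake detection: centered in the image, box 20% wide and 40% tall.
# The first four YOLO outputs are normalized to the image dimensions.
width, height = 608, 416
detection = [0.5, 0.5, 0.2, 0.4]

center_x = int(detection[0] * width)    # 304
center_y = int(detection[1] * height)   # 208
w = int(detection[2] * width)           # 121  (0.2 * 608 = 121.6, truncated)
h = int(detection[3] * height)          # 166  (0.4 * 416 = 166.4, truncated)
x = int(center_x - w / 2)               # 243  (304 - 60.5, truncated)
y = int(center_y - h / 2)               # 125  (208 - 83.0)
print([x, y, w, h])  # [243, 125, 121, 166]
```

Indices 0-3 of each detection vector are the box; index 4 is the objectness score, and indices 5 onward are the 80 per-class scores, which is why the code slices `detection[5:]`.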

2.4  The draw_labels(boxes, confidences, class_ids, classes, img) function

Purpose: draw the detection results on the image.

```python
def draw_labels(boxes, confidences, class_ids, classes, img):
    # Apply non-maximum suppression
    # (score threshold 0.5, NMS overlap threshold 0.4)
    indexes = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)
    # One random color per class
    colors = np.random.uniform(0, 255, size=(len(classes), 3))
    if len(indexes) > 0:
        for i in indexes.flatten():
            x, y, w, h = boxes[i]
            label = str(classes[class_ids[i]])
            confidence = str(round(confidences[i], 2))
            color = colors[class_ids[i]]
            # Draw the bounding box and label
            cv2.rectangle(img, (x, y), (x + w, y + h), color, 2)
            cv2.putText(img, f"{label} {confidence}", (x, y - 5),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 2)
    return img
```
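cv2.dnn.NMSBoxes implements greedy non-maximum suppression: keep the highest-scoring box, drop every remaining box whose overlap (IoU) with it exceeds the threshold, and repeat. A minimal pure-Python sketch of the same idea, not OpenCV's implementation, with boxes in the same [x, y, w, h] format used above:

```python
def iou(a, b):
    """Intersection-over-union of two [x, y, w, h] boxes."""
    ax1, ay1, ax2, ay2 = a[0], a[1], a[0] + a[2], a[1] + a[3]
    bx1, by1, bx2, by2 = b[0], b[1], b[0] + b[2], b[1] + b[3]
    iw = max(0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0

def nms(boxes, scores, score_thr=0.5, iou_thr=0.4):
    """Greedy NMS; returns the indices of the kept boxes."""
    order = sorted((i for i, s in enumerate(scores) if s > score_thr),
                   key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_thr]
    return keep

# Two near-duplicate boxes and one distant box:
boxes = [[10, 10, 100, 100], [12, 12, 100, 100], [300, 300, 50, 50]]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2] -- the duplicate (index 1) is suppressed
```

This is why a single banana typically ends up with one box rather than the dozens of raw detections the three YOLO heads produce.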

2.5  The object_detection(image_path) function

Purpose: the main function tying the whole detection pipeline together.

```python
def object_detection(image_path):
    try:
        net, classes, output_layers = load_yolo()
        img = cv2.imread(image_path)
        if img is None:
            raise FileNotFoundError(f"Could not load image: {image_path}")
        height, width = img.shape[:2]
        outputs = detect_objects(img, net, output_layers)
        boxes, confidences, class_ids = get_box_dimensions(outputs, height, width)
        img = draw_labels(boxes, confidences, class_ids, classes, img)
        cv2.imshow("Object Detection", img)
        cv2.waitKey(0)
        cv2.destroyAllWindows()
    except Exception as e:
        print(f"An error occurred: {e}")
```

3. Complete Code

```python
import cv2
import numpy as np
import os


def load_yolo():
    # Directory containing this script
    current_dir = os.path.dirname(os.path.abspath(__file__))
    # Build full paths to the model files
    cfg_path = os.path.join(current_dir, "yolov3.cfg")
    weights_path = os.path.join(current_dir, "yolov3.weights")
    names_path = os.path.join(current_dir, "coco.names")
    # Make sure all three files exist
    if not all(os.path.exists(f) for f in [cfg_path, weights_path, names_path]):
        raise FileNotFoundError(
            "Missing YOLO model files: put yolov3.cfg, yolov3.weights "
            "and coco.names in the script directory")
    # Load the network
    net = cv2.dnn.readNet(weights_path, cfg_path)
    with open(names_path, "r") as f:
        classes = [line.strip() for line in f.readlines()]
    layer_names = net.getLayerNames()
    output_layers = [layer_names[i - 1] for i in net.getUnconnectedOutLayers()]
    return net, classes, output_layers


def detect_objects(img, net, output_layers):
    # Create a blob: scale to [0, 1], resize to 416x416, swap BGR -> RGB
    blob = cv2.dnn.blobFromImage(img, scalefactor=1/255.0, size=(416, 416),
                                 swapRB=True, crop=False)
    # Set the input and run a forward pass
    net.setInput(blob)
    outputs = net.forward(output_layers)
    return outputs


def get_box_dimensions(outputs, height, width):
    boxes = []
    confidences = []
    class_ids = []
    for output in outputs:
        for detection in output:
            scores = detection[5:]
            class_id = np.argmax(scores)
            confidence = scores[class_id]
            if confidence > 0.5:  # confidence threshold
                # Box center and size, scaled back to pixel coordinates
                center_x = int(detection[0] * width)
                center_y = int(detection[1] * height)
                w = int(detection[2] * width)
                h = int(detection[3] * height)
                # Top-left corner of the rectangle
                x = int(center_x - w / 2)
                y = int(center_y - h / 2)
                boxes.append([x, y, w, h])
                confidences.append(float(confidence))
                class_ids.append(class_id)
    return boxes, confidences, class_ids


def draw_labels(boxes, confidences, class_ids, classes, img):
    # Apply non-maximum suppression
    indexes = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)
    # One random color per class
    colors = np.random.uniform(0, 255, size=(len(classes), 3))
    if len(indexes) > 0:
        for i in indexes.flatten():
            x, y, w, h = boxes[i]
            label = str(classes[class_ids[i]])
            confidence = str(round(confidences[i], 2))
            color = colors[class_ids[i]]
            # Draw the bounding box and label
            cv2.rectangle(img, (x, y), (x + w, y + h), color, 2)
            cv2.putText(img, f"{label} {confidence}", (x, y - 5),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 2)
    return img


def object_detection(image_path):
    try:
        net, classes, output_layers = load_yolo()
        img = cv2.imread(image_path)
        if img is None:
            raise FileNotFoundError(f"Could not load image: {image_path}")
        height, width = img.shape[:2]
        outputs = detect_objects(img, net, output_layers)
        boxes, confidences, class_ids = get_box_dimensions(outputs, height, width)
        img = draw_labels(boxes, confidences, class_ids, classes, img)
        cv2.imshow("Object Detection", img)
        cv2.waitKey(0)
        cv2.destroyAllWindows()
    except Exception as e:
        print(f"An error occurred: {e}")


if __name__ == "__main__":
    # Example usage
    image_path = "myimg.jpg"  # replace with your own image path
    object_detection(image_path)
```

4. Demonstration

4.1  Recognizing a banana

4.2  Recognizing a mobile phone

4.3  Recognizing cars and pedestrians in a night scene