[New Release] CODrone: A Comprehensive Object Detection Benchmark for UAVs (with Conversion Scripts for Mainstream Data Formats)


Paper: arXiv preprint
GitHub: GitHub repository
Google Drive download: click to get

Reposts must cite this article's address on the first page. Reposting without attribution is prohibited!!!

1. Abstract

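Unmanned aerial vehicle (UAV) applications in fields such as logistics, agricultural automation, urban management, and emergency response rely heavily on oriented object detection (OOD) to enhance visual perception. Existing UAV OOD datasets are valuable resources, but they are usually designed for specific downstream tasks. As a result, they generalize poorly to real flight scenarios and cannot fully show how effective an algorithm is in practical environments. To close this gap, we introduce CODrone, a comprehensive oriented object detection dataset for UAVs that accurately reflects real-world conditions. It also serves as a new benchmark designed to meet the needs of downstream tasks, ensuring greater applicability and robustness for UAV-based OOD. Based on application requirements, we identify four main limitations of current UAV OOD datasets (low image resolution, limited object categories, single-view imaging, and restricted flight altitudes) and propose corresponding improvements to enhance applicability and robustness. In addition, CODrone contains a large number of annotated images collected from multiple cities under different lighting conditions, which makes the benchmark more realistic. To rigorously evaluate CODrone as a new benchmark and to understand the new challenges it poses, we conduct a series of experiments based on 22 classical or state-of-the-art (SOTA) methods. Our evaluation not only assesses the effectiveness of CODrone in real-world scenarios but also reveals key bottlenecks and opportunities for advancing OOD in UAV applications. In summary, CODrone fills the data gap for OOD from the UAV perspective and provides a benchmark with stronger generalization capability that is better aligned with practical applications and future algorithm development.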

2. Introduction

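The dataset was released by Xiamen University on April 29, 2025. Its annotations are oriented bounding boxes (OBB) stored in a VOC-style XML format. The paper defines 13 categories in total:

car, truck, traffic-sign, people, motor, bicycle, traffic-light, tricycle, bridge, bus, boat, ship, ignored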

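Training set: 5,002 images

Validation set: 2,000 images

Test set: 3,002 images

Every image has a resolution of 3840×2160, although the image quality has actually been reduced by compression. The scenes covered are docks, highways, villages, snow scenes, and nighttime.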


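3. Dataset Format Processing

The official release only provides the VOC-style OBB annotation format and no ordinary horizontal (axis-aligned) box format, so this section provides the conversion code.
After processing, the object categories are: car, truck, traffic-sign, people, motor, bicycle, traffic-light, tricycle, bridge, bus, boat, ship.
That is 12 categories in total; objects labeled `ignored` are dropped during conversion.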

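!!!

First, credit to the people who did the annotation work: UAV image detection now has a new benchmark, and I thank them for their dedication. That said, a complaint: sorting out this dataset genuinely tested my patience. Looking closely at the file names, my guess is that the labeling was done by different people in the same group at their university. I cannot tell whether they never agreed on a format or each had their own ideas, but the VOC files show traces of several labeling conventions.

In the VOC files, `filename` does not match the image name: it contains one extra character. This happens in more than one place, and the characters "-" and "_" are also used interchangeably.
In the VOC files, the tags inside `bndbox` are inconsistent. I hope the annotations can be made more rigorous!!!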
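The filename problem described above can be surveyed before any conversion is run. The following is a minimal sketch, not part of the official tooling: the helper names `normalize` and `find_filename_mismatches` are hypothetical, and it assumes that each XML file's own name matches its image name up to "-" versus "_" differences, which may not hold for every file in the dataset.

```python
import os
import xml.etree.ElementTree as ET


def normalize(stem):
    """Treat '-' and '_' as the same character when comparing names."""
    return stem.replace("-", "_").lower()


def find_filename_mismatches(xml_dir, img_dir):
    """List XML files whose <filename> value is not an existing image.

    Returns (xml_name, declared_filename, candidate) triples, where candidate
    is the image whose normalized stem equals the XML file's normalized stem,
    or None if no such image exists (assumption: XML and image stems agree).
    """
    images = sorted(os.listdir(img_dir))
    image_set = set(images)
    by_stem = {normalize(os.path.splitext(name)[0]): name for name in images}

    mismatches = []
    for xml_name in sorted(os.listdir(xml_dir)):
        if not xml_name.endswith(".xml"):
            continue
        root = ET.parse(os.path.join(xml_dir, xml_name)).getroot()
        declared = (root.findtext("filename") or "").strip()
        if declared not in image_set:
            stem = normalize(os.path.splitext(xml_name)[0])
            mismatches.append((xml_name, declared, by_stem.get(stem)))
    return mismatches
```

Calling `find_filename_mismatches(labels_dir, images_dir)` on one split gives a list to review by hand; the sketch only reports mismatches and deliberately does not rename or rewrite anything.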

4. Code

Since my time and energy are limited, I process the data in this order:

voc -> yolo -> coco

4.1 voc2yolo (axis-aligned boxes)

import os
from lxml import etree
import numpy as np


def VOC2YOLO(class_num, voc_img_path, voc_xml_path, yolo_txt_save_path, yolo_img_save_path=None):
    # voc_img_path and yolo_img_save_path are kept in the signature, but this
    # script only reads the XML files; it does not read or copy images.
    xmls = [x for x in os.listdir(voc_xml_path) if x.endswith('.xml')]
    if yolo_img_save_path is not None:
        os.makedirs(yolo_img_save_path, exist_ok=True)
    os.makedirs(yolo_txt_save_path, exist_ok=True)

    for one_xml in xmls:
        print(f"Processing file: {one_xml}")
        root = etree.parse(os.path.join(voc_xml_path, one_xml)).getroot()

        # Read the image size; it is needed to normalize the coordinates
        img_w, img_h = 0, 0
        img_size = root.find('size')
        if img_size is not None:
            img_width = img_size.find('width')
            if img_width is not None:
                img_w = int(img_width.text)
            img_height = img_size.find('height')
            if img_height is not None:
                img_h = int(img_height.text)
        if img_w <= 0 or img_h <= 0:
            print(f"Warning: no valid image size in {one_xml}, skipping this file.")
            continue

        yolo_data = []
        for ob in root.findall('object'):
            label = ob.find('name').text
            if label == 'ignored':
                continue
            class_id = class_num.get(label, -1)
            if class_id == -1:
                print(f"Warning: Class '{label}' not found in class_num, skipping.")
                continue
            bbox = ob.find('bndbox')
            if bbox is None:
                print(f"Warning: No 'bndbox' found in {one_xml}, skipping this object.")
                continue

            # Read the four vertices (x0, y0) ... (x3, y3) of the oriented box
            try:
                vertices = np.array([
                    [float(bbox.find(f'x{i}').text), float(bbox.find(f'y{i}').text)]
                    for i in range(4)
                ])
            except (AttributeError, TypeError, ValueError):
                print(f"Warning: malformed 'bndbox' in {one_xml}, skipping this object.")
                continue

            # Axis-aligned rectangle that encloses the four vertices
            x_min, y_min = vertices.min(axis=0)
            x_max, y_max = vertices.max(axis=0)

            # Center, width, and height, normalized by the image size
            x_center = (x_min + x_max) / 2 / img_w
            y_center = (y_min + y_max) / 2 / img_h
            width = (x_max - x_min) / img_w
            height = (y_max - y_min) / img_h

            # Append one YOLO-format line
            yolo_data.append(f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}")

        # Save the YOLO-format text file
        txt_name = os.path.splitext(one_xml)[0] + ".txt"
        with open(os.path.join(yolo_txt_save_path, txt_name), 'w') as f:
            f.write("\n".join(yolo_data))


if __name__ == '__main__':
    VOC2YOLO(
        class_num={
            'car': 0, 'truck': 1, 'traffic-sign': 2, 'people': 3,
            'motor': 4, 'bicycle': 5, 'traffic-light': 6, 'tricycle': 7,
            'bridge': 8, 'bus': 9, 'boat': 10, 'ship': 11
        },  # label classes
        voc_img_path=r'E:\1-Data\DataSet\CODrone\test\images',  # folder containing the dataset images
        voc_xml_path=r'E:\1-Data\DataSet\CODrone\test\labels',  # folder containing the XML label files
        yolo_txt_save_path=r'E:\1-Data\DataSet\CODrone\test\text'  # folder where the txt files will be written
    )
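To make the arithmetic in the script above concrete, here is a small worked example. It is a sketch that assumes a 3840×2160 image and a made-up diamond-shaped box; the helper name `obb_to_yolo_hbb` is hypothetical. Note that the result is the axis-aligned rectangle enclosing the four vertices, so for a rotated object it is looser than the original oriented box.

```python
def obb_to_yolo_hbb(vertices, img_w, img_h):
    """Enclose four (x, y) pixel vertices in an axis-aligned box, normalized YOLO style."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    return (
        (x_min + x_max) / 2 / img_w,  # normalized x center
        (y_min + y_max) / 2 / img_h,  # normalized y center
        (x_max - x_min) / img_w,      # normalized width
        (y_max - y_min) / img_h,      # normalized height
    )


# Illustrative diamond centered at (1920, 1080): 400 px wide, 200 px tall
box = obb_to_yolo_hbb(
    [(1920, 980), (2120, 1080), (1920, 1180), (1720, 1080)], 3840, 2160
)
print(" ".join(f"{v:.6f}" for v in box))
# prints: 0.500000 0.500000 0.104167 0.092593
```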

4.2 voc2yolo (OBB rotated boxes)

import os
from lxml import etree


def VOC2YOLO(class_num, voc_img_path, voc_xml_path, yolo_txt_save_path, yolo_img_save_path=None):
    # voc_img_path and yolo_img_save_path are kept in the signature, but this
    # script only reads the XML files; it does not read or copy images.
    xmls = [x for x in os.listdir(voc_xml_path) if x.endswith('.xml')]
    if yolo_img_save_path is not None:
        os.makedirs(yolo_img_save_path, exist_ok=True)
    os.makedirs(yolo_txt_save_path, exist_ok=True)

    for one_xml in xmls:
        print(f"Processing file: {one_xml}")
        root = etree.parse(os.path.join(voc_xml_path, one_xml)).getroot()

        # Read the image size; it is needed to normalize the coordinates
        img_w, img_h = 0, 0
        img_size = root.find('size')
        if img_size is not None:
            img_width = img_size.find('width')
            if img_width is not None:
                img_w = int(img_width.text)
            img_height = img_size.find('height')
            if img_height is not None:
                img_h = int(img_height.text)
        if img_w <= 0 or img_h <= 0:
            print(f"Warning: no valid image size in {one_xml}, skipping this file.")
            continue

        yolo_data = []
        for ob in root.findall('object'):
            label = ob.find('name').text
            if label == 'ignored':
                continue
            class_id = class_num.get(label, -1)
            if class_id == -1:
                print(f"Warning: Class '{label}' not found in class_num, skipping.")
                continue
            bbox = ob.find('bndbox')
            if bbox is None:
                print(f"Warning: No 'bndbox' found in {one_xml}, skipping this object.")
                continue

            # Read the four vertices and normalize them by the image size
            try:
                coords = []
                for i in range(4):
                    coords.append(float(bbox.find(f'x{i}').text) / img_w)
                    coords.append(float(bbox.find(f'y{i}').text) / img_h)
            except (AttributeError, TypeError, ValueError):
                print(f"Warning: malformed 'bndbox' in {one_xml}, skipping this object.")
                continue

            # Append one YOLO OBB line: class_id x0 y0 x1 y1 x2 y2 x3 y3
            yolo_data.append(f"{class_id} " + " ".join(f"{v:.6f}" for v in coords))

        # Save the YOLO OBB text file
        txt_name = os.path.splitext(one_xml)[0] + ".txt"
        with open(os.path.join(yolo_txt_save_path, txt_name), 'w') as f:
            f.write("\n".join(yolo_data))


if __name__ == '__main__':
    VOC2YOLO(
        class_num={
            'car': 0, 'truck': 1, 'traffic-sign': 2, 'people': 3,
            'motor': 4, 'bicycle': 5, 'traffic-light': 6, 'tricycle': 7,
            'bridge': 8, 'bus': 9, 'boat': 10, 'ship': 11
        },  # label classes
        voc_img_path=r'E:\1-Data\DataSet\CODrone\test\images',  # folder containing the dataset images
        voc_xml_path=r'E:\1-Data\DataSet\CODrone\test\labels',  # folder containing the XML label files
        yolo_txt_save_path=r'E:\1-Data\DataSet\CODrone\test\text'  # folder where the txt files will be written
    )
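When spot-checking the converted labels, it can help to turn one OBB line back into pixel coordinates and compare it with the XML by eye. The following is a minimal sketch under the assumption that each line has the nine fields written by the script above (`class_id x0 y0 x1 y1 x2 y2 x3 y3`, normalized); the helper name `parse_yolo_obb_line` is hypothetical.

```python
def parse_yolo_obb_line(line, img_w, img_h):
    """Turn 'class_id x0 y0 ... x3 y3' (normalized) into (class_id, four pixel vertices)."""
    parts = line.split()
    if len(parts) != 9:
        raise ValueError(f"expected 9 fields, got {len(parts)}")
    class_id = int(parts[0])
    coords = [float(v) for v in parts[1:]]
    # Pair up (x, y) values and scale them back to pixels
    vertices = [(coords[i] * img_w, coords[i + 1] * img_h) for i in range(0, 8, 2)]
    return class_id, vertices


class_id, vertices = parse_yolo_obb_line(
    "0 0.5 0.25 0.75 0.25 0.75 0.5 0.5 0.5", 3840, 2160
)
print(class_id, vertices)
```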

4.3 yolo2coco (axis-aligned boxes)

import os
import json
from PIL import Image

# The data must first follow the YOLO folder layout:
#   images: train val test
#   labels: train val test

# Output location: change this to where the COCO-format JSON files should be written
output_dir = r"E:\1-Data\DataSet\CODrone\coco"
# Dataset path: change this to the root of the YOLO-format dataset
dataset_path = r"E:\1-Data\DataSet\CODrone"
images_path = os.path.join(dataset_path, "images")
labels_path = os.path.join(dataset_path, "labels")

# Category mapping. COCO id = YOLO class id + 1, so the order here follows
# the class_num mapping used in the voc2yolo scripts.
categories = [
    {"id": 1, "name": "car"},
    {"id": 2, "name": "truck"},
    {"id": 3, "name": "traffic-sign"},
    {"id": 4, "name": "people"},
    {"id": 5, "name": "motor"},
    {"id": 6, "name": "bicycle"},
    {"id": 7, "name": "traffic-light"},
    {"id": 8, "name": "tricycle"},
    {"id": 9, "name": "bridge"},
    {"id": 10, "name": "bus"},
    {"id": 11, "name": "boat"},
    {"id": 12, "name": "ship"},
]


# Convert a normalized YOLO box to a COCO box [x_min, y_min, width, height] in pixels
def convert_yolo_to_coco(x_center, y_center, width, height, img_width, img_height):
    x_min = (x_center - width / 2) * img_width
    y_min = (y_center - height / 2) * img_height
    width = width * img_width
    height = height * img_height
    return [x_min, y_min, width, height]


# Initialize the COCO data structure
def init_coco_format():
    return {"images": [], "annotations": [], "categories": categories}


os.makedirs(output_dir, exist_ok=True)

# Process each dataset split
# Only a train folder:          for split in ['train']:
# Only train and val folders:   for split in ['train', 'val']:
# train, val, and test folders: for split in ['train', 'val', 'test']:
for split in ['train', 'val', 'test']:
    coco_format = init_coco_format()
    annotation_id = 0
    for img_name in os.listdir(os.path.join(images_path, split)):
        if not img_name.lower().endswith(('.png', '.jpg', '.jpeg')):
            continue
        img_path = os.path.join(images_path, split, img_name)
        label_path = os.path.join(labels_path, split, os.path.splitext(img_name)[0] + ".txt")
        with Image.open(img_path) as img:
            img_width, img_height = img.size
        image_info = {
            "file_name": img_name,
            "id": len(coco_format["images"]),
            "width": img_width,
            "height": img_height,
        }
        coco_format["images"].append(image_info)

        if os.path.exists(label_path):
            with open(label_path, "r") as file:
                for line in file:
                    if not line.strip():
                        continue  # skip blank lines
                    category_id, x_center, y_center, width, height = map(float, line.split())
                    bbox = convert_yolo_to_coco(x_center, y_center, width, height, img_width, img_height)
                    annotation = {
                        "id": annotation_id,
                        "image_id": image_info["id"],
                        # Adjust for your dataset: whether category_id needs +1 or -1
                        "category_id": int(category_id) + 1,
                        "bbox": bbox,
                        "area": bbox[2] * bbox[3],
                        "iscrowd": 0,
                    }
                    coco_format["annotations"].append(annotation)
                    annotation_id += 1

        # Print a progress message every 1000 images
        if len(coco_format["images"]) % 1000 == 0:
            print("Processing")

    # Save one JSON file per split
    with open(os.path.join(output_dir, f"{split}.json"), "w") as json_file:
        json.dump(coco_format, json_file, indent=4)
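Because the COCO `category_id` is the YOLO class id shifted by one, a quick consistency check on the generated JSON can catch mapping mistakes early. The following is a minimal sketch, not a full COCO validator: the function name `validate_coco` and the one-pixel tolerance `tol` are assumptions chosen for illustration.

```python
def validate_coco(coco, tol=1.0):
    """Return a list of human-readable problems found in a COCO-style dict."""
    problems = []
    category_ids = {c["id"] for c in coco["categories"]}
    images = {img["id"]: img for img in coco["images"]}

    for ann in coco["annotations"]:
        if ann["category_id"] not in category_ids:
            problems.append(f"annotation {ann['id']}: unknown category_id {ann['category_id']}")
        img = images.get(ann["image_id"])
        if img is None:
            problems.append(f"annotation {ann['id']}: unknown image_id {ann['image_id']}")
            continue
        x, y, w, h = ann["bbox"]
        if w <= 0 or h <= 0:
            problems.append(f"annotation {ann['id']}: non-positive box size")
        elif x < -tol or y < -tol or x + w > img["width"] + tol or y + h > img["height"] + tol:
            # tol absorbs rounding from the 6-decimal normalized coordinates
            problems.append(f"annotation {ann['id']}: box extends outside the image")
    return problems
```

A typical use would be to load one split with `json.load` and print the returned list; an empty list means none of these particular checks failed, not that the annotations are correct.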

5. Lazy Mode

Here are the dataset annotation files I have already converted.
Baidu Cloud: CODrone
