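In the ICDAR2015 and ICDAR2017 datasets, each jpg image comes with one txt annotation file. Each line of the txt file describes one text region: eight comma-separated integers giving the four corner points of the quadrilateral (x1,y1,x2,y2,x3,y3,x4,y4), followed by the remaining fields such as the transcription.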
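To make the annotation format concrete, here is a minimal sketch that parses a single ground-truth line and encloses the quadrilateral in an axis-aligned box. The helper name `parse_icdar_line` and the sample line are illustrative assumptions, not part of the dataset tooling.

```python
def parse_icdar_line(line):
    """Parse one annotation line into (xmin, ymin, xmax, ymax, rest).

    Assumes the first eight comma-separated fields are integer corner
    coordinates x1,y1,...,x4,y4; everything after them is returned as-is.
    """
    fields = line.strip().lstrip('\ufeff').split(',')  # drop a leading BOM if present
    coords = [int(v) for v in fields[:8]]
    xs, ys = coords[0::2], coords[1::2]                # even indices are x, odd are y
    return min(xs), min(ys), max(xs), max(ys), fields[8:]


# Hypothetical sample line in the ICDAR2015 layout
sample = "377,117,463,117,465,130,378,130,Genaxis Theatre"
print(parse_icdar_line(sample))
```

Taking the minimum and maximum over all four points keeps `xmin <= xmax` and `ymin <= ymax` even when the text region is rotated.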
Create a file named icdar2voc.py with the following code:

# coding:utf-8

import os
import numpy as np
import cv2

def xml(num,width,height,labelname,box,imageName,imagePath):
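    """
    Write one Pascal VOC style xml annotation file.
    :param num: path of the xml file to write
    :param width: image width in pixels
    :param height: image height in pixels
    :param labelname: list of class names, one per box
    :param box: list of boxes, each as [xmin, ymin, xmax, ymax]
    :param imageName: value written to the <filename> element
    :param imagePath: value written to the <path> element
    :return: the file object used to write the xml
    """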
    xml_file = open(num, 'w', encoding='utf-8')  # num is the output xml path
    xml_file.write('<annotation>\n')
    xml_file.write('    <folder>IMage</folder>\n')
    xml_file.write('    <filename>' + imageName + '</filename>\n')
    xml_file.write('    <path>' + imagePath + '</path>\n')
    xml_file.write('    <source>\n')
    xml_file.write('        <database>' + 'Unknown' + '</database>\n')
    xml_file.write('    </source>\n')
    xml_file.write('    <size>\n')
    xml_file.write('        <width>' + str(width) + '</width>\n')
    xml_file.write('        <height>' + str(height) + '</height>\n')
    xml_file.write('        <depth>3</depth>\n')
    xml_file.write('    </size>\n')
    xml_file.write('    <segmented>0</segmented>\n')
    for i in range(len(labelname)):
        xml_file.write('    <object>\n')
        xml_file.write('        <name>' + str(labelname[i]) + '</name>\n')
        xml_file.write('        <pose>Unspecified</pose>\n')
        xml_file.write('        <truncated>0</truncated>\n')
        xml_file.write('        <difficult>0</difficult>\n')
        xml_file.write('        <bndbox>\n')
        xml_file.write('            <xmin>' + str(box[i][0]) + '</xmin>\n')
        xml_file.write('            <ymin>' + str(box[i][1]) + '</ymin>\n')
        xml_file.write('            <xmax>' + str(box[i][2]) + '</xmax>\n')
        xml_file.write('            <ymax>' + str(box[i][3]) + '</ymax>\n')
        xml_file.write('        </bndbox>\n')
        xml_file.write('    </object>\n')

    xml_file.write('</annotation>')
    xml_file.close()  # close the handle so the content is flushed to disk
    return xml_file

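def load_annoataion(p):
    """Read one ICDAR ground-truth txt file and return (boxes, labels)."""
    text_polys = []
    text_tags = []
    label = 'text'  # every region is written with the single class name 'text'
    # 'utf-8-sig' strips a leading BOM if the txt file has one
    with open(p, "r", encoding='utf-8-sig') as f:
        for line in f:
            item = line.strip().split(',')
            if len(item) < 8:
                continue  # skip blank or malformed lines
            coords = [int(v) for v in item[:8]]
            xs, ys = coords[0::2], coords[1::2]
            # Enclose the quadrilateral in an axis-aligned box so that
            # xmin <= xmax and ymin <= ymax even for rotated text.
            xmin, ymin, xmax, ymax = min(xs), min(ys), max(xs), max(ys)
            if xmin > 1 and ymin > 1:
                text_polys.append([xmin, ymin, xmax, ymax])
                text_tags.append(label)
    return np.array(text_polys, dtype=np.int32), np.array(text_tags, dtype=str)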

base_dir = "ICDAR2017/VOC2007"

if __name__ == "__main__":
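    txt_path = "ICDAR2017/ICDAR2017/txt/"     # path to the training-set ground-truth txt files (adjust as needed)
    xml_path = base_dir+'/Annotations/'       # output directory for the generated xml files (adjust as needed)
    img_path = "ICDAR2017/ICDAR2017/image/"   # path to the training-set images (adjust as needed)
    os.makedirs(xml_path, exist_ok=True)      # make sure the output directory exists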
    print(os.path.exists(txt_path))
    txts = os.listdir(txt_path)
    for count, t in enumerate(txts):
        print(os.path.join(txt_path, t))
        boxes, labels = load_annoataion(os.path.join(txt_path, t))
        print(len(boxes),len(labels))
        filepath, tmpfilename = os.path.split(t)
        shotname, extension = os.path.splitext(tmpfilename)
        print('****************',shotname)

        # realName=shotname.split("_")[1]+"_"+shotname.split("_")[2]       # ICDAR2015
        realName = os.path.splitext(shotname)[0]                           # ICDAR2017: strip the extension
        print(realName)
        saveXml=xml_path+realName+".xml"
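        img = cv2.imread(img_path+realName+'.jpg')
        print(img_path+realName+'.jpg')
        if img is None:  # cv2.imread returns None when the image cannot be read
            print('skipping, image not found:', realName)
            continue
        h, w, d = img.shape
        print(h,w,d)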
        # for item in boxes:
        #     print(len(item))
        xml(saveXml,w,h,labels,boxes,realName+'.jpg',img_path+realName+'.jpg')

Run the script to batch-convert the annotations; it produces one xml annotation file for each jpg image.
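After the conversion it can be worth spot-checking a generated file. The following is a minimal sketch, assuming the xml layout written by the script above; the helpers `read_voc_boxes` and `box_is_valid` and the inline sample document are hypothetical and only serve the illustration. It reads an annotation back with the standard library and checks that each box lies inside the image.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def read_voc_boxes(xml_file):
    """Return (width, height, boxes) read from one VOC-style xml file."""
    root = ET.parse(xml_file).getroot()
    size = root.find('size')
    width = int(size.find('width').text)
    height = int(size.find('height').text)
    boxes = []
    for obj in root.findall('object'):
        bnd = obj.find('bndbox')
        box = tuple(int(bnd.find(tag).text)
                    for tag in ('xmin', 'ymin', 'xmax', 'ymax'))
        boxes.append((obj.find('name').text, box))
    return width, height, boxes


def box_is_valid(box, width, height):
    """A box is valid if it has positive area and lies inside the image."""
    xmin, ymin, xmax, ymax = box
    return 0 <= xmin < xmax <= width and 0 <= ymin < ymax <= height


# Hypothetical sample annotation, shaped like the converter's output
sample_xml = """<annotation>
    <size><width>1280</width><height>720</height><depth>3</depth></size>
    <object>
        <name>text</name>
        <bndbox><xmin>377</xmin><ymin>117</ymin><xmax>465</xmax><ymax>130</ymax></bndbox>
    </object>
</annotation>"""

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'sample.xml')
    with open(path, 'w', encoding='utf-8') as f:
        f.write(sample_xml)
    width, height, boxes = read_voc_boxes(path)

for name, box in boxes:
    print(name, box, 'ok' if box_is_valid(box, width, height) else 'INVALID')
```

To check real output, point `read_voc_boxes` at a file in the Annotations directory instead of the temporary sample.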
