Scripts to convert datasets (Caltech pedestrian, MS COCO, HDA) to PASCAL VOC format

liuhaolinwen/Dataset_to_VOC_converter

 
 

These scripts convert datasets (MS COCO, Caltech pedestrian, HDA) to PASCAL VOC format for later training.

Requirements

Usage

COCO

anno_json_image_urls.py: extract image URLs (the COCO-hosted source, not Flickr) from an annotation JSON file. See anno_json_image_urls.sh
download_coco_images.py: download COCO image files from the given URLs (extracted from an instance/keypoint annotation JSON file). See download.sh
anno_coco2voc.py: convert a COCO annotation JSON file to VOC XML files. See anno_coco2voc.sh
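
The URL extraction step can be sketched roughly as follows. This is a minimal illustration, not the code in anno_json_image_urls.py: the function name `extract_image_urls` and the sample data are hypothetical, and it only assumes the standard COCO layout in which each entry of `images` carries `file_name`, `coco_url` and `flickr_url`.

```python
import json


def extract_image_urls(anno):
    """Return (file_name, coco_url) pairs from a loaded COCO annotation dict.

    Hypothetical helper: entries without a COCO-hosted URL are skipped,
    and the Flickr URL is deliberately ignored.
    """
    pairs = []
    for image in anno.get("images", []):
        url = image.get("coco_url")
        if url:
            pairs.append((image["file_name"], url))
    return pairs


# Tiny in-memory stand-in for an annotation file (made-up values).
sample = json.loads("""
{
  "images": [
    {"id": 1, "file_name": "a.jpg",
     "coco_url": "http://example.org/coco/a.jpg",
     "flickr_url": "http://example.org/flickr/a.jpg"},
    {"id": 2, "file_name": "b.jpg",
     "flickr_url": "http://example.org/flickr/b.jpg"}
  ]
}
""")

for file_name, url in extract_image_urls(sample):
    print(file_name, url)
```

With a real annotation file, the dict would come from `json.load` on the opened file instead of the inline sample.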

For example, to convert the instance annotations:

python3 anno_coco2voc.py --anno_file=/Path/to/instances_train2014.json \
                         --type=instance \
                         --output_dir=/Path/to/instance_anno_dir
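
For orientation, the core of such a conversion is turning a COCO box, stored as [x, y, width, height], into the corner form VOC uses (xmin, ymin, xmax, ymax) and writing it into an XML tree. The sketch below is an assumption-laden illustration, not the logic of anno_coco2voc.py: `coco_bbox_to_voc` and `build_voc_xml` are hypothetical names, only a subset of VOC fields is written, and no 1-pixel offset is applied.

```python
import xml.etree.ElementTree as ET


def coco_bbox_to_voc(bbox):
    """Convert a COCO [x, y, width, height] box to (xmin, ymin, xmax, ymax)."""
    x, y, w, h = bbox
    return int(round(x)), int(round(y)), int(round(x + w)), int(round(y + h))


def build_voc_xml(file_name, width, height, objects):
    """Build a minimal VOC-style <annotation> element.

    `objects` is a list of (class_name, coco_bbox) tuples (hypothetical layout).
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = file_name
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    ET.SubElement(size, "depth").text = "3"  # assumes RGB images
    for class_name, bbox in objects:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = class_name
        box = ET.SubElement(obj, "bndbox")
        corners = coco_bbox_to_voc(bbox)
        for tag, value in zip(("xmin", "ymin", "xmax", "ymax"), corners):
            ET.SubElement(box, tag).text = str(value)
    return root


# Made-up example values.
root = build_voc_xml("a.jpg", 640, 480, [("person", [10.0, 20.0, 30.0, 40.0])])
print(ET.tostring(root, encoding="unicode"))
```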

Caltech

vbb2voc.py: extract the images that contain person bounding boxes from the seq files and convert the vbb annotation files to XML files.
PS: The Caltech pedestrian dataset has 4 kinds of person labels: person, person-fa, person?, and people. In my case, I only needed the person type. If you want to use other types, set person_types to the corresponding list (like ['person', 'people']) in the parse_anno_file function.

python3 vbb2voc.py --seq_dir=path/to/caltech/seq/dir \
                   --vbb_dir=path/to/caltech/vbb/dir \
                   --output_dir=/output/saving/path \
                   --person_type=person
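
The effect of restricting the person types can be pictured with a small filter like the one below. This is only a sketch under assumptions: the real script reads vbb files, whereas here each annotated object is a plain dict with a `label` key, and `filter_by_person_type` is a hypothetical name that does not come from vbb2voc.py.

```python
def filter_by_person_type(objects, person_types=("person",)):
    """Keep only the objects whose label is one of the requested person types."""
    wanted = set(person_types)
    return [obj for obj in objects if obj["label"] in wanted]


# Made-up annotations for one frame, using the four Caltech label kinds.
frame_objects = [
    {"label": "person", "bbox": [10, 20, 30, 60]},
    {"label": "people", "bbox": [50, 25, 80, 70]},
    {"label": "person?", "bbox": [5, 5, 10, 20]},
    {"label": "person-fa", "bbox": [90, 30, 20, 50]},
]

only_person = filter_by_person_type(frame_objects)
person_and_people = filter_by_person_type(frame_objects, ["person", "people"])
print(len(only_person), len(person_and_people))
```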

HDA

anno_hda2voc.py: convert HDA annotation info to VOC format.

python3 anno_hda2voc.py --input_dir=path/to/HDA_Dataset_V1.3/hda_detections/GtAnnotationsAll \
                        --output_dir=anno/saving/path
