Paper - MMDetection: Open MMLab Detection Toolbox and Benchmark - 2019
GitHub - open-mmlab/mmdetection
mmdetection - Data preparation pipeline
Updated: 2019.08.23
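In mmdetection, the data preparation pipeline is decoupled from the dataset. Typically, the dataset defines how to process the annotations, while the data preparation pipeline defines all the steps for preparing a data dict. A pipeline consists of a sequence of operations; each operation takes a dict as input and outputs a dict for the next operation.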
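In the figure, the blue blocks are pipeline operations. As the pipeline proceeds, each operator can add new keys (marked in green) to the output dict or update existing keys (marked in orange).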
The operations fall into four categories: data loading, pre-processing, formatting, and test-time augmentation.
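The "dict in, dict out" contract can be illustrated with a minimal sketch. The classes below (`LoadToyImage`, `ToyHorizontalFlip`, `ToyCompose`) are hypothetical stand-ins written for this note, not the mmdetection implementations; they only show how one operation adds keys and another updates an existing key.

```python
# Toy illustration only: these classes are hypothetical stand-ins,
# not the mmdetection implementations.


class LoadToyImage:
    """Adds the new keys img, img_shape and ori_shape to the dict."""

    def __call__(self, results):
        # A 2x3 "image" stored as nested lists (height=2, width=3).
        results['img'] = [[1, 2, 3], [4, 5, 6]]
        results['img_shape'] = (2, 3)
        results['ori_shape'] = (2, 3)
        return results


class ToyHorizontalFlip:
    """Adds the key flip and updates the existing key img."""

    def __call__(self, results):
        results['flip'] = True
        results['img'] = [row[::-1] for row in results['img']]
        return results


class ToyCompose:
    """Runs the operations in order, handing the dict from one to the next."""

    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, results):
        for transform in self.transforms:
            results = transform(results)
        return results


pipeline = ToyCompose([LoadToyImage(), ToyHorizontalFlip()])
out = pipeline({'filename': 'demo.jpg'})
print(sorted(out.keys()))
```

Because every step shares the same dict, a later operation can rely on keys written by an earlier one, which is why the order of the list matters.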
For example, the data pipeline for Faster R-CNN:
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
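Each entry in the config above is a plain dict whose `type` names an operation and whose remaining items are its arguments. A rough sketch of how such a config could be turned into objects is shown below. The registry `TOY_PIPELINES`, the decorator `register`, and the helper `build_toy_op` are hypothetical names invented for this sketch, and the two classes are empty shells; the real registry in mmdetection is more elaborate.

```python
# Hypothetical mini-registry for illustration; not mmdetection's own code.
TOY_PIPELINES = {}


def register(cls):
    """Class decorator that stores the class under its own name."""
    TOY_PIPELINES[cls.__name__] = cls
    return cls


@register
class RandomFlip:
    def __init__(self, flip_ratio=None):
        self.flip_ratio = flip_ratio


@register
class Pad:
    def __init__(self, size_divisor=None):
        self.size_divisor = size_divisor


def build_toy_op(cfg):
    """Instantiate the class named by cfg['type'] with the remaining items."""
    args = dict(cfg)  # copy, so the config itself is left untouched
    cls = TOY_PIPELINES[args.pop('type')]
    return cls(**args)


toy_cfg = [
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Pad', size_divisor=32),
]
ops = [build_toy_op(cfg) for cfg in toy_cfg]
print([type(op).__name__ for op in ops])
```

Keeping the pipeline as data rather than code is what lets a config file swap, reorder, or remove steps without touching the dataset class.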
[1] - Data loading
LoadImageFromFile
- add: img, img_shape, ori_shape
LoadAnnotations
- add: gt_bboxes, gt_bboxes_ignore, gt_labels, gt_masks, gt_semantic_seg, bbox_fields, mask_fields
LoadProposals
- add: proposals
[2] - Pre-processing
Resize
- add: scale, scale_idx, pad_shape, scale_factor, keep_ratio
- update: img, img_shape, bbox_fields, mask_fields
RandomFlip
- add: flip
- update: img, bbox_fields, mask_fields
Pad
- add: pad_fixed_size, pad_size_divisor
- update: img, pad_shape, *mask_fields
RandomCrop
- update: img, pad_shape, gt_bboxes, gt_labels, gt_masks, *bbox_fields
Normalize
- add: img_norm_cfg
- update: img
SegResizeFlipPadRescale
- update: gt_semantic_seg
PhotoMetricDistortion
- update: img
Expand
- update: img, gt_bboxes
MinIoURandomCrop
- update: img, gt_bboxes, gt_labels
Corrupt
- update: img
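To make the `Resize` and `Pad` entries concrete, here is a small arithmetic sketch of how `img_shape` and `pad_shape` might be derived for `img_scale=(1333, 800)`, `keep_ratio=True`, and `size_divisor=32`. It assumes the rescale rule is "the largest factor that keeps the long edge within 1333 and the short edge within 800", with rounding to the nearest pixel; the function names `rescale_size` and `padded_size` are made up for this note, and the exact rounding in the library may differ.

```python
import math


def rescale_size(height, width, img_scale):
    """Return (new_h, new_w, scale_factor) while keeping the aspect ratio.

    Assumed rule: the long edge must not exceed max(img_scale) and the
    short edge must not exceed min(img_scale).
    """
    max_long, max_short = max(img_scale), min(img_scale)
    scale_factor = min(max_long / max(height, width),
                       max_short / min(height, width))
    new_h = int(height * scale_factor + 0.5)
    new_w = int(width * scale_factor + 0.5)
    return new_h, new_w, scale_factor


def padded_size(height, width, size_divisor):
    """Round both sides up to the next multiple of size_divisor."""
    pad_h = int(math.ceil(height / size_divisor)) * size_divisor
    pad_w = int(math.ceil(width / size_divisor)) * size_divisor
    return pad_h, pad_w


# A 480x640 (height x width) image resized towards (1333, 800).
h, w, factor = rescale_size(480, 640, (1333, 800))
pad_shape = padded_size(h, w, 32)
print((h, w), pad_shape)
```

Under these assumptions the image is scaled by 800/480, so `img_shape` becomes 800 x 1067 and `pad_shape` becomes 800 x 1088; the image is not stretched to exactly 1333 x 800.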
[3] - Formatting
ToTensor
- update: specified by keys.
ImageToTensor
- update: specified by keys.
Transpose
- update: specified by keys.
ToDataContainer
- update: specified by fields.
DefaultFormatBundle
- update: img, proposals, gt_bboxes, gt_bboxes_ignore, gt_labels, gt_masks, gt_semantic_seg
Collect
- add: img_meta (the keys of img_meta are specified by meta_keys)
- remove: all other keys except for those specified by keys
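The add/remove behaviour of `Collect` can be sketched in a few lines. The function `toy_collect` below is a hypothetical simplification written for this note: it gathers the entries named in `meta_keys` into `img_meta` and keeps only the entries named in `keys`. Any wrapping of the values that the real operation performs is left out.

```python
def toy_collect(results, keys, meta_keys):
    """Keep only `keys`, and bundle `meta_keys` into a new img_meta entry."""
    img_meta = {key: results[key] for key in meta_keys}
    data = {'img_meta': img_meta}
    for key in keys:
        data[key] = results[key]
    return data


# A results dict as it might look near the end of a training pipeline
# (the values are placeholders, not real tensors).
results = {
    'img': 'IMAGE_TENSOR',
    'gt_bboxes': [[10, 20, 110, 220]],
    'gt_labels': [3],
    'img_shape': (800, 1067, 3),
    'pad_shape': (800, 1088, 3),
    'flip': False,
    'bbox_fields': ['gt_bboxes'],
}
data = toy_collect(
    results,
    keys=['img', 'gt_bboxes', 'gt_labels'],
    meta_keys=['img_shape', 'pad_shape', 'flip'],
)
print(sorted(data.keys()))
```

Bookkeeping keys such as `bbox_fields` are dropped at this point, so only what the model needs reaches the training loop.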
[4] - Test-time augmentation
MultiScaleFlipAug
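The idea behind `MultiScaleFlipAug` can be sketched as follows, assuming it expands one sample over every combination of scale and flip, runs the nested transforms on each copy, and merges the copies into a dict of lists. The function `toy_multi_scale_flip_aug` and the helper `tag_variant` are hypothetical and only mimic that assumed behaviour.

```python
import copy


def toy_multi_scale_flip_aug(results, img_scales, flip, transforms):
    """Expand one sample into one variant per (scale, flip) combination."""
    flip_options = [False, True] if flip else [False]
    variants = []
    for scale in img_scales:
        for do_flip in flip_options:
            sample = copy.deepcopy(results)
            sample['scale'] = scale
            sample['flip'] = do_flip
            for transform in transforms:
                sample = transform(sample)
            variants.append(sample)
    # Convert the list of dicts into a dict of lists.
    merged = {key: [] for key in variants[0]}
    for sample in variants:
        for key, value in sample.items():
            merged[key].append(value)
    return merged


def tag_variant(sample):
    """Stand-in for the real transforms: just label each variant."""
    suffix = '-flipped' if sample['flip'] else ''
    sample['variant'] = '{}x{}{}'.format(
        sample['scale'][0], sample['scale'][1], suffix)
    return sample


out = toy_multi_scale_flip_aug(
    {'img': 'IMAGE'},
    img_scales=[(1333, 800), (2000, 1200)],
    flip=True,
    transforms=[tag_variant],
)
print(out['variant'])
```

With a single scale and `flip=False`, as in the Faster R-CNN test pipeline above, the same logic yields exactly one variant per image.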
5 comments
Hello, how should I modify the code if I want to apply mixup in the data augmentation part?
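Hello, with this processing, one input sample still produces one output sample. Is there a way to input one sample and output multiple augmented samples? Thanks.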
Just modify the output part of the code.
Hello, does img_scale refer to the resize stage, i.e. the data is resized to the img_scale size?
Yes, the image is resized to the img_scale size.