Paper - MMDetection: Open MMLab Detection Toolbox and Benchmark - 2019

GitHub - open-mmlab/mmdetection

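mmdetection is an open-source object detection toolbox based on PyTorch. It is part of the open-mmlab project developed by the Multimedia Laboratory, CUHK.

The project has also open-sourced mmcv, a library for computer vision research on which mmdetection depends heavily.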

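1. Main features of mmdetection

[1] - Modular design

mmdetection decomposes the detection framework into separate modules, which makes it quick and easy to assemble a customized object detection framework.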
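
The idea of assembling a detector from interchangeable modules can be sketched with a small registry. This is only an illustration of the pattern under stated assumptions: the names `REGISTRY`, `register`, `build`, `ToyBackbone`, `ToyHead`, and `ToyDetector` are hypothetical and are not taken from mmdetection.

```python
# Hypothetical sketch of config-driven module assembly. None of these
# names come from mmdetection; they only illustrate the pattern.
REGISTRY = {}


def register(cls):
    """Record a component class under its own class name."""
    REGISTRY[cls.__name__] = cls
    return cls


def build(cfg):
    """Instantiate the class named by cfg['type'] with the remaining keys."""
    args = dict(cfg)  # copy so the caller's config is not mutated
    cls = REGISTRY[args.pop('type')]
    return cls(**args)


@register
class ToyBackbone:
    def __init__(self, depth):
        self.depth = depth


@register
class ToyHead:
    def __init__(self, num_classes):
        self.num_classes = num_classes


class ToyDetector:
    """A detector assembled from independently configured parts."""

    def __init__(self, backbone, head):
        self.backbone = build(backbone)
        self.head = build(head)


detector = ToyDetector(
    backbone=dict(type='ToyBackbone', depth=50),
    head=dict(type='ToyHead', num_classes=81),
)
print(type(detector.backbone).__name__, detector.backbone.depth)
```

Swapping a component then only means changing the `type` entry in the config dictionary. The detector class itself stays untouched.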

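[2] - Out-of-the-box support for multiple frameworks

mmdetection directly supports many detection frameworks, such as Faster RCNN, Mask RCNN, and RetinaNet.

[3] - High efficiency

All basic bbox and mask operations in mmdetection run on GPUs. Its training speed is comparable to, or even faster than, that of Detectron, maskrcnn-benchmark, and SimpleDet.

[4] - State of the art

mmdetection is developed by the MMDet team, which won the COCO Detection Challenge 2018, and it continues to be actively developed.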

1.1. Changelog

v0.6.0 (14/04/2019)

  • Up to 30% speedup compared to the model zoo.
  • Support both PyTorch stable and nightly version.
  • Replace NMS and SigmoidFocalLoss with PyTorch CUDA extensions.

v0.6rc0 (06/02/2019)

  • Migrate to PyTorch 1.0.

v0.5.7 (06/02/2019)

  • Add support for Deformable ConvNet v2. (Many thanks to the authors and @chengdazhi)
  • This is the last release based on PyTorch 0.4.1.

v0.5.6 (17/01/2019)

  • Add support for Group Normalization.
  • Unify RPNHead and single stage heads (RetinaHead, SSDHead) with AnchorHead.

v0.5.5 (22/12/2018)

  • Add SSD for COCO and PASCAL VOC.
  • Add ResNeXt backbones and detection models.
  • Refactoring for Samplers/Assigners and add OHEM.
  • Add VOC dataset and evaluation scripts.

v0.5.4 (27/11/2018)

  • Add SingleStageDetector and RetinaNet.

v0.5.3 (26/11/2018)

  • Add Cascade R-CNN and Cascade Mask R-CNN.
  • Add support for Soft-NMS in config files.

v0.5.2 (21/10/2018)

  • Add support for custom datasets.
  • Add a script to convert PASCAL VOC annotations to the expected format.

v0.5.1 (20/10/2018)

  • Add BBoxAssigner and BBoxSampler; the train_cfg field in config files is restructured.
  • ConvFCRoIHead / SharedFCRoIHead are renamed to ConvFCBBoxHead / SharedFCBBoxHead for consistency.

2. Installing mmdetection

2.1. Requirements

  • Linux (Ubuntu 16.04/18.04 and CentOS 7.2)
  • Python 3.5+
  • PyTorch 1.0+ or PyTorch-nightly
  • CUDA 9.0+ (9.0/9.2/10.0)
  • NCCL 2+ (2.1.15/2.2.13/2.3.7/2.4.2)
  • GCC 4.9+ (4.9/5.3/5.4/7.3)
  • mmcv

2.2. Installation

PyTorch 1.1 appears to require CUDA 10.0 or later.

The steps below use Ubuntu 16.04, CUDA 10.0, and Python 3.5 as an example.

[1] - Install PyTorch

sudo pip3 install https://download.pytorch.org/whl/cu100/torch-1.1.0-cp35-cp35m-linux_x86_64.whl
sudo pip3 install https://download.pytorch.org/whl/cu100/torchvision-0.3.0-cp35-cp35m-linux_x86_64.whl

[2] - Download and install mmdetection

git clone https://github.com/open-mmlab/mmdetection.git
cd mmdetection

pip install mmcv --user
python setup.py develop --user
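
After these steps, a quick sanity check is to confirm that Python can locate the installed packages. The snippet below is a minimal sketch using only the standard library. The helper name `missing_packages` is hypothetical, and the package list assumes the import names `torch`, `torchvision`, `mmcv`, and `mmdet`.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Top-level packages that the installation steps are expected to provide.
required = ['torch', 'torchvision', 'mmcv', 'mmdet']
missing = missing_packages(required)
print('missing packages:', missing if missing else 'none')
```

This only checks that each package can be found on the import path. It does not verify versions or that the CUDA extensions were compiled correctly.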

[3] - Usage

If several versions of the mmdetection library exist side by side, add the following to your Python script to select the one you want:

import sys
sys.path.insert(0, '/path/to/mmdetection')
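
Why inserting at position 0 works can be shown with a small, self-contained experiment. The sketch below creates two throwaway copies of a hypothetical package named `demo_detlib` (a stand-in for mmdetection, with made-up version strings) and shows that the copy whose directory comes first on `sys.path` is the one that gets imported.

```python
import importlib
import os
import sys
import tempfile


def make_fake_package(root, name, version):
    """Create a one-file package `name` under `root` exposing __version__."""
    pkg_dir = os.path.join(root, name)
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, '__init__.py'), 'w') as f:
        f.write('__version__ = {!r}\n'.format(version))


# Two temporary directories, each holding a different copy of the
# hypothetical package demo_detlib.
old_root = tempfile.mkdtemp()
new_root = tempfile.mkdtemp()
make_fake_package(old_root, 'demo_detlib', '0.5.7')
make_fake_package(new_root, 'demo_detlib', '0.6.0')

sys.path.append(old_root)     # searched late
sys.path.insert(0, new_root)  # searched first, so this copy wins
importlib.invalidate_caches()

import demo_detlib
print(demo_detlib.__version__, demo_detlib.__file__)
```

The `sys.path.insert` call has to run before the first import of the package. Once a module is cached in `sys.modules`, later changes to `sys.path` do not affect it.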

3. MODEL_ZOO

https://github.com/open-mmlab/mmdetection/blob/master/MODEL_ZOO.md

Backbones (table columns): ResNet, ResNeXt, SENet, VGG, HRNet

Methods (table rows):

  • RPN
  • Fast R-CNN
  • Faster R-CNN
  • Mask R-CNN
  • Cascade R-CNN
  • Cascade Mask R-CNN
  • SSD
  • RetinaNet
  • GHM
  • Mask Scoring R-CNN
  • FCOS
  • Grid R-CNN
  • Hybrid Task Cascade
  • Libra R-CNN
  • Guided Anchoring

Other features:

  • [x] DCNv2
  • [x] Group Normalization
  • [x] Weight Standardization
  • [x] OHEM
  • [x] Soft-NMS
  • [x] Generalized Attention
  • [x] GCNet
  • [x] Mixed Precision (FP16) Training

3.1. Faster R-CNN

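| Backbone | Style | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | Download |
|---|---|---|---|---|---|---|---|
| R-50-C4 | caffe | 1x | - | - | 9.5 | 34.9 | model |
| R-50-C4 | caffe | 2x | 4.0 | 0.39 | 9.3 | 36.5 | model |
| R-50-C4 | pytorch | 1x | - | - | 9.3 | 33.9 | model |
| R-50-C4 | pytorch | 2x | - | - | 9.4 | 35.9 | model |
| R-50-FPN | caffe | 1x | 3.6 | 0.333 | 13.5 | 36.6 | - |
| R-50-FPN | pytorch | 1x | 3.8 | 0.353 | 13.6 | 36.4 | model |
| R-50-FPN | pytorch | 2x | - | - | - | 37.7 | model |
| R-101-FPN | caffe | 1x | 5.5 | 0.465 | 11.5 | 38.8 | - |
| R-101-FPN | pytorch | 1x | 5.7 | 0.474 | 11.9 | 38.5 | model |
| R-101-FPN | pytorch | 2x | - | - | - | 39.4 | model |
| X-101-32x4d-FPN | pytorch | 1x | 6.9 | 0.672 | 10.3 | 40.1 | model |
| X-101-32x4d-FPN | pytorch | 2x | - | - | - | 40.4 | model |
| X-101-64x4d-FPN | pytorch | 1x | 9.8 | 1.040 | 7.3 | 41.3 | model |
| X-101-64x4d-FPN | pytorch | 2x | - | - | - | 40.7 | model |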

3.2. Mask R-CNN

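| Backbone | Style | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | mask AP | Download |
|---|---|---|---|---|---|---|---|---|
| R-50-C4 | caffe | 1x | - | - | 8.1 | 35.9 | 31.5 | model |
| R-50-C4 | caffe | 2x | 4.2 | 0.43 | 8.1 | 37.9 | 32.9 | model |
| R-50-C4 | pytorch | 1x | - | - | 7.9 | 35.1 | 31.2 | model |
| R-50-C4 | pytorch | 2x | - | - | 8.0 | 37.2 | 32.5 | model |
| R-50-FPN | caffe | 1x | 3.8 | 0.430 | 10.2 | 37.4 | 34.3 | - |
| R-50-FPN | pytorch | 1x | 3.9 | 0.453 | 10.6 | 37.3 | 34.2 | model |
| R-50-FPN | pytorch | 2x | - | - | - | 38.5 | 35.1 | model |
| R-101-FPN | caffe | 1x | 5.7 | 0.534 | 9.4 | 39.9 | 36.1 | - |
| R-101-FPN | pytorch | 1x | 5.8 | 0.571 | 9.5 | 39.4 | 35.9 | model |
| R-101-FPN | pytorch | 2x | - | - | - | 40.3 | 36.5 | model |
| X-101-32x4d-FPN | pytorch | 1x | 7.1 | 0.759 | 8.3 | 41.1 | 37.1 | model |
| X-101-32x4d-FPN | pytorch | 2x | - | - | - | 41.4 | 37.1 | model |
| X-101-64x4d-FPN | pytorch | 1x | 10.0 | 1.102 | 6.5 | 42.1 | 38.0 | model |
| X-101-64x4d-FPN | pytorch | 2x | - | - | - | 42.0 | 37.7 | model |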

3.3. RetinaNet

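| Backbone | Style | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | Download |
|---|---|---|---|---|---|---|---|
| R-50-FPN | caffe | 1x | 3.4 | 0.285 | 12.5 | 35.8 | - |
| R-50-FPN | pytorch | 1x | 3.6 | 0.308 | 12.1 | 35.6 | model |
| R-50-FPN | pytorch | 2x | - | - | - | 36.4 | model |
| R-101-FPN | caffe | 1x | 5.3 | 0.410 | 10.4 | 37.8 | - |
| R-101-FPN | pytorch | 1x | 5.5 | 0.429 | 10.9 | 37.7 | model |
| R-101-FPN | pytorch | 2x | - | - | - | 38.1 | model |
| X-101-32x4d-FPN | pytorch | 1x | 6.7 | 0.632 | 9.3 | 39.0 | model |
| X-101-32x4d-FPN | pytorch | 2x | - | - | - | 39.3 | model |
| X-101-64x4d-FPN | pytorch | 1x | 9.6 | 0.993 | 7.0 | 40.0 | model |
| X-101-64x4d-FPN | pytorch | 2x | - | - | - | 39.6 | model |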

3.4. Cascade R-CNN

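| Backbone | Style | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | Download |
|---|---|---|---|---|---|---|---|
| R-50-C4 | caffe | 1x | 8.7 | 0.92 | 5.0 | 38.7 | model |
| R-50-FPN | caffe | 1x | 3.9 | 0.464 | 10.9 | 40.5 | - |
| R-50-FPN | pytorch | 1x | 4.1 | 0.455 | 11.9 | 40.4 | model |
| R-50-FPN | pytorch | 20e | - | - | - | 41.1 | model |
| R-101-FPN | caffe | 1x | 5.8 | 0.569 | 9.6 | 42.4 | - |
| R-101-FPN | pytorch | 1x | 6.0 | 0.584 | 10.3 | 42.0 | model |
| R-101-FPN | pytorch | 20e | - | - | - | 42.5 | model |
| X-101-32x4d-FPN | pytorch | 1x | 7.2 | 0.770 | 8.9 | 43.6 | model |
| X-101-32x4d-FPN | pytorch | 20e | - | - | - | 44.0 | model |
| X-101-64x4d-FPN | pytorch | 1x | 10.0 | 1.133 | 6.7 | 44.5 | model |
| X-101-64x4d-FPN | pytorch | 20e | - | - | - | 44.7 | model |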

3.5. Cascade Mask R-CNN

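| Backbone | Style | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | mask AP | Download |
|---|---|---|---|---|---|---|---|---|
| R-50-C4 | caffe | 1x | 9.1 | 0.99 | 4.5 | 39.3 | 32.8 | model |
| R-50-FPN | caffe | 1x | 5.1 | 0.692 | 7.6 | 40.9 | 35.5 | - |
| R-50-FPN | pytorch | 1x | 5.3 | 0.683 | 7.4 | 41.2 | 35.7 | model |
| R-50-FPN | pytorch | 20e | - | - | - | 42.3 | 36.6 | model |
| R-101-FPN | caffe | 1x | 7.0 | 0.803 | 7.2 | 43.1 | 37.2 | - |
| R-101-FPN | pytorch | 1x | 7.2 | 0.807 | 6.8 | 42.6 | 37.0 | model |
| R-101-FPN | pytorch | 20e | - | - | - | 43.3 | 37.6 | model |
| X-101-32x4d-FPN | pytorch | 1x | 8.4 | 0.976 | 6.6 | 44.4 | 38.2 | model |
| X-101-32x4d-FPN | pytorch | 20e | - | - | - | 44.7 | 38.6 | model |
| X-101-64x4d-FPN | pytorch | 1x | 11.4 | 1.33 | 5.3 | 45.4 | 39.1 | model |
| X-101-64x4d-FPN | pytorch | 20e | - | - | - | 45.7 | 39.4 | model |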

3.6. Hybrid Task Cascade (HTC)

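| Backbone | Style | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | mask AP | Download |
|---|---|---|---|---|---|---|---|---|
| R-50-FPN | pytorch | 1x | 7.4 | 0.936 | 4.1 | 42.1 | 37.3 | model |
| R-50-FPN | pytorch | 20e | - | - | - | 43.2 | 38.1 | model |
| R-101-FPN | pytorch | 20e | 9.3 | 1.051 | 4.0 | 44.9 | 39.4 | model |
| X-101-32x4d-FPN | pytorch | 20e | 5.8 | 0.769 | 3.8 | 46.1 | 40.3 | model |
| X-101-64x4d-FPN | pytorch | 20e | 7.5 | 1.120 | 3.5 | 46.9 | 40.8 | model |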

3.7. benchmarks

3.8. Model accuracy

4. A quick test of mmdetection

Download one of the pretrained models that mmdetection provides in MODEL_ZOO and run a test with it.

#!/usr/bin/python3
# -*- coding: utf-8 -*-
import time

from mmdet.apis import init_detector, inference_detector, show_result

# Config file and the matching pretrained checkpoint
config_file = 'configs/cascade_rcnn_r101_fpn_1x.py'
checkpoint_file = 'checkpoints/cascade_rcnn_r101_fpn_1x_20181129-d64ebac7.pth'

# Build the model from the config and load the checkpoint
model = init_detector(config_file, checkpoint_file, device='cuda:0')

# Test a single image
img = '/path/to/test.jpg'
# Alternatively, load the image once up front (requires `import mmcv`):
# img = mmcv.imread(img)
start = time.time()
result = inference_detector(model, img)
print('[INFO]timecost: ', time.time() - start)
show_result(img, result, model.CLASSES)

# Test a list of images
imgs = ['test1.jpg', 'test2.jpg']
for i, result in enumerate(inference_detector(model, imgs)):
    show_result(imgs[i], result, model.CLASSES)
print('[INFO]Done.')
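
To work with the detections directly rather than only visualizing them, the result can be filtered by score. The sketch below assumes that, for a box-only detector, `result` holds one sequence per class and that each row is `[x1, y1, x2, y2, score]`. That layout is an assumption here, and models that also predict masks may return a different structure. The helper `filter_detections` is hypothetical, and plain lists stand in for real model output so the example runs without a GPU.

```python
def filter_detections(result, class_names, score_thr=0.3):
    """Flatten per-class detections into (label, score, box) tuples.

    Assumes `result` holds one sequence per class, each row being
    [x1, y1, x2, y2, score]. This layout is an assumption of the sketch.
    """
    kept = []
    for class_name, rows in zip(class_names, result):
        for x1, y1, x2, y2, score in rows:
            if score >= score_thr:
                kept.append((class_name, float(score), (x1, y1, x2, y2)))
    # Highest-confidence detections first.
    kept.sort(key=lambda det: det[1], reverse=True)
    return kept


# Stand-in data: plain lists instead of a real model output.
class_names = ('person', 'dog')
fake_result = [
    [[10, 20, 110, 220, 0.92], [30, 40, 80, 90, 0.12]],  # person
    [[50, 60, 150, 160, 0.55]],                          # dog
]
for label, score, box in filter_detections(fake_result, class_names):
    print(label, round(score, 2), box)
```

With a real model, the same function would be called as `filter_detections(result, model.CLASSES)`, provided the layout assumption holds for that model.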

5. Comparison with Detectron and maskrcnn-benchmark

5.1. Accuracy

5.2. Training speed

Training time is measured in s/iter. Lower is better.

| Type | Detectron (P100) | maskrcnn-benchmark (V100) | mmdetection (V100) |
|---|---|---|---|
| RPN | 0.416 | - | 0.253 |
| Faster R-CNN | 0.544 | 0.353 | 0.333 |
| Mask R-CNN | 0.889 | 0.454 | 0.430 |
| Fast R-CNN | 0.285 | - | 0.242 |
| Fast R-CNN (w/mask) | 0.377 | - | 0.328 |
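
To turn these timings into relative speedups, divide the baseline time by the mmdetection time. The snippet below is a small worked calculation over the numbers in the table. The helper `speedup` is hypothetical. Note that the Detectron column was measured on a P100 while the other two were measured on a V100, so the ratios against Detectron mix hardware and software differences.

```python
# Training time in s/iter, copied from the table above.
# None marks an entry that the table leaves empty.
train_time = {
    'RPN': (0.416, None, 0.253),
    'Faster R-CNN': (0.544, 0.353, 0.333),
    'Mask R-CNN': (0.889, 0.454, 0.430),
    'Fast R-CNN': (0.285, None, 0.242),
    'Fast R-CNN (w/mask)': (0.377, None, 0.328),
}


def speedup(baseline, candidate):
    """Return how many times faster `candidate` is than `baseline`."""
    if baseline is None or candidate is None:
        return None
    return baseline / candidate


for model, (detectron, maskrcnn_bm, mmdet) in train_time.items():
    vs_detectron = speedup(detectron, mmdet)
    vs_maskrcnn_bm = speedup(maskrcnn_bm, mmdet)
    print('{:<20s} vs Detectron: {:.2f}x   vs maskrcnn-benchmark: {}'.format(
        model,
        vs_detectron,
        'n/a' if vs_maskrcnn_bm is None else '{:.2f}x'.format(vs_maskrcnn_bm),
    ))
```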

5.3. Inference speed

Measured on a single GPU in fps (img/s). Higher is better.

| Type | Detectron (P100) | maskrcnn-benchmark (V100) | mmdetection (V100) |
|---|---|---|---|
| RPN | 12.5 | - | 16.9 |
| Faster R-CNN | 10.3 | 7.9 | 13.5 |
| Mask R-CNN | 8.5 | 7.7 | 10.2 |
| Fast R-CNN | 12.5 | - | 18.4 |
| Fast R-CNN (w/mask) | 9.9 | - | 12.8 |
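
Throughput in fps is sometimes easier to reason about as latency per image. The conversion is simply 1000 / fps, shown below for the mmdetection (V100) column. The helper `fps_to_ms` is hypothetical, and the result is an average per image derived from throughput, not a measured single-request latency.

```python
def fps_to_ms(fps):
    """Convert throughput in img/s to average time per image in ms."""
    return 1000.0 / fps


# mmdetection (V100) column of the table above, in fps.
inference_fps = {
    'RPN': 16.9,
    'Faster R-CNN': 13.5,
    'Mask R-CNN': 10.2,
    'Fast R-CNN': 18.4,
    'Fast R-CNN (w/mask)': 12.8,
}

for model, fps in inference_fps.items():
    print('{:<20s} {:5.1f} fps  ~ {:5.1f} ms/img'.format(
        model, fps, fps_to_ms(fps)))
```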

5.4. Training memory usage

| Type | Detectron | maskrcnn-benchmark | mmdetection |
|---|---|---|---|
| RPN | 6.4 | - | 3.3 |
| Faster R-CNN | 7.2 | 4.4 | 3.6 |
| Mask R-CNN | 8.6 | 5.2 | 3.8 |
| Fast R-CNN | 6.0 | - | 3.3 |
| Fast R-CNN (w/mask) | 7.9 | - | 3.4 |

Both maskrcnn-benchmark and mmdetection use less GPU memory than Detectron, mainly thanks to PyTorch. mmdetection additionally applies some memory optimization strategies.

5.5. Supported detection methods

Last modification:July 13th, 2019 at 01:40 pm