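Paper - MMDetection: Open MMLab Detection Toolbox and Benchmark - 2019
GitHub - open-mmlab/mmdetection
mmdetection is an open-source object detection toolbox based on PyTorch. It is part of the open-mmlab project developed by the Multimedia Laboratory, CUHK.
The project has also open-sourced mmcv, a library for computer vision research on which mmdetection depends heavily.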
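1. Main features of mmdetection
[1] - Modular design
mmdetection decomposes the detection framework into separate modules, which makes it quick and easy to build a customized object detection framework.
[2] - Out-of-the-box support for multiple frameworks
mmdetection directly supports many detection frameworks, e.g. Faster RCNN, Mask RCNN, RetinaNet, etc.
[3] - High efficiency
All basic bbox and mask operations in mmdetection run on GPUs. Its training speed is comparable to, or even faster than, that of Detectron, maskrcnn-benchmark and SimpleDet.
[4] - State of the art
mmdetection is developed by the MMDet team, which won the COCO Detection Challenge 2018, and it remains under active development.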
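To make the idea of a modular design more concrete, here is a minimal sketch of a config-driven component registry. All names in it (`MODULES`, `register`, `build`, `ToyBackbone`, `ToyHead`, `ToyDetector`) are hypothetical and are not mmdetection's API. The sketch only illustrates how, under such a design, swapping a component can come down to editing a config dict.

```python
# Hypothetical illustration of a config-driven, modular detector.
# None of these names come from mmdetection itself.

MODULES = {}  # maps a type name to the class that implements it


def register(cls):
    """Class decorator that records a component under its class name."""
    MODULES[cls.__name__] = cls
    return cls


def build(cfg):
    """Instantiate a component from a dict such as {'type': 'Name', ...}."""
    args = dict(cfg)  # shallow copy so the caller's config is untouched
    cls = MODULES[args.pop('type')]
    return cls(**args)


@register
class ToyBackbone:
    def __init__(self, depth):
        self.depth = depth


@register
class ToyHead:
    def __init__(self, num_classes):
        self.num_classes = num_classes


@register
class ToyDetector:
    def __init__(self, backbone, head):
        # Sub-components are themselves built from nested configs.
        self.backbone = build(backbone)
        self.head = build(head)


config = dict(
    type='ToyDetector',
    backbone=dict(type='ToyBackbone', depth=50),
    head=dict(type='ToyHead', num_classes=81),
)
detector = build(config)
print(type(detector.backbone).__name__, detector.backbone.depth)
```

In this toy setup, replacing the backbone means registering another class and changing the `type` string in `config`. The detector class itself stays untouched.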
1.1. Changelog
v0.6.0 (14/04/2019)
- Up to 30% speedup compared to the model zoo.
- Support both PyTorch stable and nightly version.
- Replace NMS and SigmoidFocalLoss with PyTorch CUDA extensions.
v0.6rc0 (06/02/2019)
- Migrate to PyTorch 1.0.
v0.5.7 (06/02/2019)
- Add support for Deformable ConvNet v2. (Many thanks to the authors and @chengdazhi)
- This is the last release based on PyTorch 0.4.1.
v0.5.6 (17/01/2019)
- Add support for Group Normalization.
- Unify RPNHead and single stage heads (RetinaHead, SSDHead) with AnchorHead.
v0.5.5 (22/12/2018)
- Add SSD for COCO and PASCAL VOC.
- Add ResNeXt backbones and detection models.
- Refactoring for Samplers/Assigners and add OHEM.
- Add VOC dataset and evaluation scripts.
v0.5.4 (27/11/2018)
- Add SingleStageDetector and RetinaNet.
v0.5.3 (26/11/2018)
- Add Cascade R-CNN and Cascade Mask R-CNN.
- Add support for Soft-NMS in config files.
v0.5.2 (21/10/2018)
- Add support for custom datasets.
- Add a script to convert PASCAL VOC annotations to the expected format.
v0.5.1 (20/10/2018)
- Add BBoxAssigner and BBoxSampler, the train_cfg field in config files is restructured.
- ConvFCRoIHead / SharedFCRoIHead are renamed to ConvFCBBoxHead / SharedFCBBoxHead for consistency.
2. Installing mmdetection
2.1. Requirements
- Linux (Ubuntu 16.04/18.04 and CentOS 7.2)
- Python 3.5+
- PyTorch 1.0+ or PyTorch-nightly
- CUDA 9.0+ (9.0/9.2/10.0)
- NCCL 2+ (2.1.15/2.2.13/2.3.7/2.4.2)
- GCC 4.9+ (4.9/5.3/5.4/7.3)
- mmcv
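Part of this list can be checked from Python before installing anything else. The sketch below uses only the standard library. `python_ok` and `missing_packages` are hypothetical helper names, and the script does not check CUDA, NCCL or GCC.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 5)  # from the requirements list above
REQUIRED_PACKAGES = ('torch', 'mmcv')


def python_ok(version=None, minimum=MIN_PYTHON):
    """Return True if the interpreter version meets the minimum."""
    if version is None:
        version = sys.version_info[:2]
    return tuple(version) >= tuple(minimum)


def missing_packages(names=REQUIRED_PACKAGES):
    """Return the packages from `names` that cannot be found for import."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == '__main__':
    print('Python >= 3.5:', python_ok())
    print('Missing packages:', missing_packages())
```

An empty list from `missing_packages()` only means the packages can be located. It says nothing about whether their versions match the list above.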
2.2. Installation
PyTorch 1.1 seems to require CUDA 10.0 or later.
The following uses Ubuntu 16.04, CUDA 10.0 and Python 3.5 as an example.
[1] - Install PyTorch
sudo pip3 install https://download.pytorch.org/whl/cu100/torch-1.1.0-cp35-cp35m-linux_x86_64.whl
sudo pip3 install https://download.pytorch.org/whl/cu100/torchvision-0.3.0-cp35-cp35m-linux_x86_64.whl
[2] - Download and install mmdetection
git clone https://github.com/open-mmlab/mmdetection.git
cd mmdetection
pip install mmcv --user
python setup.py develop --user
[3] - Usage
If several versions of mmdetection are installed side by side, add the following to your Python script to select the one to use:
import sys
sys.path.insert(0, '/path/to/mmdetection')
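After changing `sys.path`, it can be useful to confirm which copy Python would actually pick up. The following is a minimal sketch using only the standard library; `which_copy` is a hypothetical helper name. Note that the path has to be inserted before the package is imported for the first time, because already imported modules are cached in `sys.modules`.

```python
import importlib.util
import sys


def which_copy(package, preferred_dir=None):
    """Return the file path Python would load `package` from.

    If `preferred_dir` is given it is put at the front of sys.path first,
    mirroring the sys.path.insert(0, ...) call shown above.
    """
    if preferred_dir is not None and preferred_dir not in sys.path:
        sys.path.insert(0, preferred_dir)
    importlib.invalidate_caches()  # pick up freshly added directories
    spec = importlib.util.find_spec(package)
    return None if spec is None else spec.origin


if __name__ == '__main__':
    # '/path/to/mmdetection' is the placeholder used in the text above.
    print(which_copy('mmdet', '/path/to/mmdetection'))
```

The function returns `None` when the package cannot be found at all.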
3. MODEL_ZOO
https://github.com/open-mmlab/mmdetection/blob/master/MODEL_ZOO.md
Method | ResNet | ResNeXt | SENet | VGG | HRNet |
---|---|---|---|---|---|
RPN | ✓ | ✓ | ☐ | ✗ | ✓ |
Fast R-CNN | ✓ | ✓ | ☐ | ✗ | ✓ |
Faster R-CNN | ✓ | ✓ | ☐ | ✗ | ✓ |
Mask R-CNN | ✓ | ✓ | ☐ | ✗ | ✓ |
Cascade R-CNN | ✓ | ✓ | ☐ | ✗ | ✓ |
Cascade Mask R-CNN | ✓ | ✓ | ☐ | ✗ | ✓ |
SSD | ✗ | ✗ | ✗ | ✓ | ✗ |
RetinaNet | ✓ | ✓ | ☐ | ✗ | ✓ |
GHM | ✓ | ✓ | ☐ | ✗ | ✓ |
Mask Scoring R-CNN | ✓ | ✓ | ☐ | ✗ | ✓ |
FCOS | ✓ | ✓ | ☐ | ✗ | ✓ |
Grid R-CNN | ✓ | ✓ | ☐ | ✗ | ✓ |
Hybrid Task Cascade | ✓ | ✓ | ☐ | ✗ | ✓ |
Libra R-CNN | ✓ | ✓ | ☐ | ✗ | ✓ |
Guided Anchoring | ✓ | ✓ | ☐ | ✗ | ✓ |
Other supported features:
- [x] DCNv2
- [x] Group Normalization
- [x] Weight Standardization
- [x] OHEM
- [x] Soft-NMS
- [x] Generalized Attention
- [x] GCNet
- [x] Mixed Precision (FP16) Training
3.1. Faster R-CNN
Backbone | Style | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | Download |
---|---|---|---|---|---|---|---|
R-50-C4 | caffe | 1x | - | - | 9.5 | 34.9 | model |
R-50-C4 | caffe | 2x | 4.0 | 0.39 | 9.3 | 36.5 | model |
R-50-C4 | pytorch | 1x | - | - | 9.3 | 33.9 | model |
R-50-C4 | pytorch | 2x | - | - | 9.4 | 35.9 | model |
R-50-FPN | caffe | 1x | 3.6 | 0.333 | 13.5 | 36.6 | - |
R-50-FPN | pytorch | 1x | 3.8 | 0.353 | 13.6 | 36.4 | model |
R-50-FPN | pytorch | 2x | - | - | - | 37.7 | model |
R-101-FPN | caffe | 1x | 5.5 | 0.465 | 11.5 | 38.8 | - |
R-101-FPN | pytorch | 1x | 5.7 | 0.474 | 11.9 | 38.5 | model |
R-101-FPN | pytorch | 2x | - | - | - | 39.4 | model |
X-101-32x4d-FPN | pytorch | 1x | 6.9 | 0.672 | 10.3 | 40.1 | model |
X-101-32x4d-FPN | pytorch | 2x | - | - | - | 40.4 | model |
X-101-64x4d-FPN | pytorch | 1x | 9.8 | 1.040 | 7.3 | 41.3 | model |
X-101-64x4d-FPN | pytorch | 2x | - | - | - | 40.7 | model |
3.2. Mask R-CNN
Backbone | Style | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | mask AP | Download |
---|---|---|---|---|---|---|---|---|
R-50-C4 | caffe | 1x | - | - | 8.1 | 35.9 | 31.5 | model |
R-50-C4 | caffe | 2x | 4.2 | 0.43 | 8.1 | 37.9 | 32.9 | model |
R-50-C4 | pytorch | 1x | - | - | 7.9 | 35.1 | 31.2 | model |
R-50-C4 | pytorch | 2x | - | - | 8.0 | 37.2 | 32.5 | model |
R-50-FPN | caffe | 1x | 3.8 | 0.430 | 10.2 | 37.4 | 34.3 | - |
R-50-FPN | pytorch | 1x | 3.9 | 0.453 | 10.6 | 37.3 | 34.2 | model |
R-50-FPN | pytorch | 2x | - | - | - | 38.5 | 35.1 | model |
R-101-FPN | caffe | 1x | 5.7 | 0.534 | 9.4 | 39.9 | 36.1 | - |
R-101-FPN | pytorch | 1x | 5.8 | 0.571 | 9.5 | 39.4 | 35.9 | model |
R-101-FPN | pytorch | 2x | - | - | - | 40.3 | 36.5 | model |
X-101-32x4d-FPN | pytorch | 1x | 7.1 | 0.759 | 8.3 | 41.1 | 37.1 | model |
X-101-32x4d-FPN | pytorch | 2x | - | - | - | 41.4 | 37.1 | model |
X-101-64x4d-FPN | pytorch | 1x | 10.0 | 1.102 | 6.5 | 42.1 | 38.0 | model |
X-101-64x4d-FPN | pytorch | 2x | - | - | - | 42.0 | 37.7 | model |
3.3. RetinaNet
Backbone | Style | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | Download |
---|---|---|---|---|---|---|---|
R-50-FPN | caffe | 1x | 3.4 | 0.285 | 12.5 | 35.8 | - |
R-50-FPN | pytorch | 1x | 3.6 | 0.308 | 12.1 | 35.6 | model |
R-50-FPN | pytorch | 2x | - | - | - | 36.4 | model |
R-101-FPN | caffe | 1x | 5.3 | 0.410 | 10.4 | 37.8 | - |
R-101-FPN | pytorch | 1x | 5.5 | 0.429 | 10.9 | 37.7 | model |
R-101-FPN | pytorch | 2x | - | - | - | 38.1 | model |
X-101-32x4d-FPN | pytorch | 1x | 6.7 | 0.632 | 9.3 | 39.0 | model |
X-101-32x4d-FPN | pytorch | 2x | - | - | - | 39.3 | model |
X-101-64x4d-FPN | pytorch | 1x | 9.6 | 0.993 | 7.0 | 40.0 | model |
X-101-64x4d-FPN | pytorch | 2x | - | - | - | 39.6 | model |
3.4. Cascade R-CNN
Backbone | Style | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | Download |
---|---|---|---|---|---|---|---|
R-50-C4 | caffe | 1x | 8.7 | 0.92 | 5.0 | 38.7 | model |
R-50-FPN | caffe | 1x | 3.9 | 0.464 | 10.9 | 40.5 | - |
R-50-FPN | pytorch | 1x | 4.1 | 0.455 | 11.9 | 40.4 | model |
R-50-FPN | pytorch | 20e | - | - | - | 41.1 | model |
R-101-FPN | caffe | 1x | 5.8 | 0.569 | 9.6 | 42.4 | - |
R-101-FPN | pytorch | 1x | 6.0 | 0.584 | 10.3 | 42.0 | model |
R-101-FPN | pytorch | 20e | - | - | - | 42.5 | model |
X-101-32x4d-FPN | pytorch | 1x | 7.2 | 0.770 | 8.9 | 43.6 | model |
X-101-32x4d-FPN | pytorch | 20e | - | - | - | 44.0 | model |
X-101-64x4d-FPN | pytorch | 1x | 10.0 | 1.133 | 6.7 | 44.5 | model |
X-101-64x4d-FPN | pytorch | 20e | - | - | - | 44.7 | model |
3.5. Cascade Mask R-CNN
Backbone | Style | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | mask AP | Download |
---|---|---|---|---|---|---|---|---|
R-50-C4 | caffe | 1x | 9.1 | 0.99 | 4.5 | 39.3 | 32.8 | model |
R-50-FPN | caffe | 1x | 5.1 | 0.692 | 7.6 | 40.9 | 35.5 | - |
R-50-FPN | pytorch | 1x | 5.3 | 0.683 | 7.4 | 41.2 | 35.7 | model |
R-50-FPN | pytorch | 20e | - | - | - | 42.3 | 36.6 | model |
R-101-FPN | caffe | 1x | 7.0 | 0.803 | 7.2 | 43.1 | 37.2 | - |
R-101-FPN | pytorch | 1x | 7.2 | 0.807 | 6.8 | 42.6 | 37.0 | model |
R-101-FPN | pytorch | 20e | - | - | - | 43.3 | 37.6 | model |
X-101-32x4d-FPN | pytorch | 1x | 8.4 | 0.976 | 6.6 | 44.4 | 38.2 | model |
X-101-32x4d-FPN | pytorch | 20e | - | - | - | 44.7 | 38.6 | model |
X-101-64x4d-FPN | pytorch | 1x | 11.4 | 1.33 | 5.3 | 45.4 | 39.1 | model |
X-101-64x4d-FPN | pytorch | 20e | - | - | - | 45.7 | 39.4 | model |
3.6. Hybrid Task Cascade (HTC)
Backbone | Style | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | mask AP | Download |
---|---|---|---|---|---|---|---|---|
R-50-FPN | pytorch | 1x | 7.4 | 0.936 | 4.1 | 42.1 | 37.3 | model |
R-50-FPN | pytorch | 20e | - | - | - | 43.2 | 38.1 | model |
R-101-FPN | pytorch | 20e | 9.3 | 1.051 | 4.0 | 44.9 | 39.4 | model |
X-101-32x4d-FPN | pytorch | 20e | 5.8 | 0.769 | 3.8 | 46.1 | 40.3 | model |
X-101-64x4d-FPN | pytorch | 20e | 7.5 | 1.120 | 3.5 | 46.9 | 40.8 | model |
3.7. benchmarks
3.8. Model accuracy
4. A quick test of mmdetection
Download one of the pretrained models that mmdetection provides in MODEL_ZOO and run a test with it.
#!/usr/bin/python3
# -*- coding: utf-8 -*-
import time

from mmdet.apis import init_detector, inference_detector, show_result

# Config file and checkpoint
config_file = 'configs/cascade_rcnn_r101_fpn_1x.py'
checkpoint_file = 'checkpoints/cascade_rcnn_r101_fpn_1x_20181129-d64ebac7.pth'

# Build the model from the config and load the checkpoint
model = init_detector(config_file, checkpoint_file, device='cuda:0')

# Test a single image
img = '/path/to/test.jpg'
# Alternatively, load the image once and reuse the array:
# img = mmcv.imread(img)  (requires `import mmcv`)
start = time.time()
result = inference_detector(model, img)
print('[INFO]timecost: ', time.time() - start)
show_result(img, result, model.CLASSES)

# Test a list of images
imgs = ['test1.jpg', 'test2.jpg']
for i, result in enumerate(inference_detector(model, imgs)):
    show_result(imgs[i], result, model.CLASSES)
print('[INFO]Done.')
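Besides visualizing the output, the `result` returned by `inference_detector` can be post-processed directly. The sketch below assumes that, for a box-only detector such as the Cascade R-CNN used above, `result` is a list with one entry per class, each holding rows of the form `[x1, y1, x2, y2, score]`. `filter_detections` is a hypothetical helper, and the toy data is made up so that the example runs without a GPU or a checkpoint.

```python
def filter_detections(result, class_names, score_thr=0.3):
    """Flatten per-class detections into (label, score, box) tuples.

    Assumes `result[i]` holds rows [x1, y1, x2, y2, score] for class i,
    which is the layout assumed here for box-only detectors.
    """
    kept = []
    for class_name, rows in zip(class_names, result):
        for x1, y1, x2, y2, score in rows:
            if score >= score_thr:
                kept.append((class_name, float(score),
                             (float(x1), float(y1), float(x2), float(y2))))
    # Highest-confidence detections first.
    kept.sort(key=lambda det: det[1], reverse=True)
    return kept


# Toy stand-in for the output of inference_detector (values are made up).
toy_classes = ('person', 'car')
toy_result = [
    [[10, 20, 110, 220, 0.92], [15, 25, 100, 200, 0.12]],  # person
    [[200, 50, 400, 180, 0.55]],                            # car
]
for label, score, box in filter_detections(toy_result, toy_classes):
    print(label, round(score, 2), box)
```

With a real model, the call would be `filter_detections(result, model.CLASSES)`. Models that also predict masks may return a different structure, so this helper should not be applied to them as-is.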
5. Comparison with Detectron and maskrcnn-benchmark
5.1. Accuracy
5.2. Training speed
Measured in s/iter. Lower is better.
Type | Detectron (P100) | maskrcnn-benchmark (V100) | mmdetection (V100) |
---|---|---|---|
RPN | 0.416 | - | 0.253 |
Faster R-CNN | 0.544 | 0.353 | 0.333 |
Mask R-CNN | 0.889 | 0.454 | 0.430 |
Fast R-CNN | 0.285 | - | 0.242 |
Fast R-CNN (w/mask) | 0.377 | - | 0.328 |
5.3. Inference speed
Measured on a single GPU in fps (img/s). Higher is better.
Type | Detectron (P100) | maskrcnn-benchmark (V100) | mmdetection (V100) |
---|---|---|---|
RPN | 12.5 | - | 16.9 |
Faster R-CNN | 10.3 | 7.9 | 13.5 |
Mask R-CNN | 8.5 | 7.7 | 10.2 |
Fast R-CNN | 12.5 | - | 18.4 |
Fast R-CNN (w/mask) | 9.9 | - | 12.8 |
5.4. Training memory usage (GB)
Type | Detectron | maskrcnn-benchmark | mmdetection |
---|---|---|---|
RPN | 6.4 | - | 3.3 |
Faster R-CNN | 7.2 | 4.4 | 3.6 |
Mask R-CNN | 8.6 | 5.2 | 3.8 |
Fast R-CNN | 6.0 | - | 3.3 |
Fast R-CNN (w/mask) | 7.9 | - | 3.4 |
maskrcnn-benchmark and mmdetection are both more memory-efficient than Detectron, mainly thanks to PyTorch itself. mmdetection additionally applies some memory optimization strategies.
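To put the memory figures in perspective, the relative saving of mmdetection over Detectron can be computed row by row. This is a small arithmetic sketch: the numbers are copied from the table above, and `saving_percent` is a hypothetical helper name.

```python
# Training memory in GB, copied from the table above
# as (Detectron, mmdetection) pairs.
memory_gb = {
    'RPN': (6.4, 3.3),
    'Faster R-CNN': (7.2, 3.6),
    'Mask R-CNN': (8.6, 3.8),
    'Fast R-CNN': (6.0, 3.3),
    'Fast R-CNN (w/mask)': (7.9, 3.4),
}


def saving_percent(baseline, value):
    """Relative reduction of `value` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - value) / baseline


savings = {name: saving_percent(d, m) for name, (d, m) in memory_gb.items()}
for name, pct in savings.items():
    print('{:<22s}{:5.1f}%'.format(name, pct))
```

For every row in the table, the reduction falls between roughly 45% and 57%.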