DeepLab: Deep Labelling for Semantic Image Segmentation. The goal of semantic segmentation is to assign a class label, such as person or cat, to every pixel of the input image.
The DeepLab series of semantic segmentation papers:
[1] - DeepLabv1- Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs-ICLR2015
Uses atrous convolution to explicitly control the resolution of the feature maps computed by the CNN.
[2] - DeepLabv2 - DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs-TPAMI2017
Uses atrous spatial pyramid pooling (ASPP) to segment objects at multiple scales; ASPP probes the features with multiple sampling rates and effective fields-of-view to capture multi-scale context (a toy sketch is given after this list of papers).
[3] - DeepLabv3 - Rethinking Atrous Convolution for Semantic Image Segmentation-2017
Augments the ASPP module with image-level features to capture longer-range context, and adopts batch normalization to speed up training.
During training and evaluation, atrous convolution is used to extract output features at different output strides: output_stride=16 during training (which trains the BN parameters) and output_stride=8 during evaluation for higher accuracy.
[4] - DeepLabv3+ - Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation-2018
Extends DeepLabv3 with a simple yet effective decoder module to refine the segmentation results, especially along object boundaries. Moreover, in this encoder-decoder structure the resolution of the encoder features can be controlled arbitrarily via atrous convolution, trading off accuracy against runtime. These are the papers behind the TensorFlow semantic segmentation API, and DeepLabv3+ is the variant the current implementation uses.
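As promised above, here is a toy sketch of the ASPP idea using tf.keras layers: several 3x3 convolutions applied in parallel with different dilation (atrous) rates, plus a 1x1 branch, concatenated and fused by another 1x1 convolution. This is an illustrative simplification, not the implementation in the TensorFlow DeepLab codebase (which also adds image-level pooling and batch normalization).
import tensorflow as tf

def simple_aspp(features, filters=256, rates=(6, 12, 18)):
    """Minimal ASPP-style block: parallel atrous (dilated) 3x3 convolutions
    at several rates plus a 1x1 branch, concatenated and fused by a 1x1 conv.
    Simplified for illustration only (no batch norm, no image-level pooling)."""
    branches = [tf.keras.layers.Conv2D(filters, 1, padding='same',
                                       activation='relu')(features)]
    for rate in rates:
        branches.append(
            tf.keras.layers.Conv2D(filters, 3, padding='same',
                                   dilation_rate=rate,
                                   activation='relu')(features))
    x = tf.keras.layers.Concatenate()(branches)
    return tf.keras.layers.Conv2D(filters, 1, padding='same',
                                  activation='relu')(x)

# Example: a backbone feature map at output_stride=16 for a 513x513 input is 33x33.
inputs = tf.keras.Input(shape=(33, 33, 2048))
outputs = simple_aspp(inputs)
print(outputs.shape)  # (None, 33, 33, 256)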
Backbone networks used by the current implementation:
[1] - MobileNetv2 - Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation-CVPR2018
A fast network architecture for mobile devices.
[2] - Xception: Deep Learning with Depthwise Separable Convolutions-CVPR2017
A powerful network architecture intended for server-side deployment.
[3] - Deformable Convolutional Networks -- COCO Detection and Segmentation Challenge 2017 Entry
The TensorFlow DeepLab API provides IoU evaluation and visualization of the segmentation results.
The code uses the PASCAL VOC 2012 and Cityscapes benchmarks as examples.
1. Installing the DeepLab API
1.1. Dependencies
- Numpy
- Pillow 1.0
- tf Slim (included under the "tensorflow/models/research/" directory)
- Jupyter notebook
- Matplotlib
- Tensorflow
# For CPU
pip install tensorflow
# For GPU
pip install tensorflow-gpu
sudo apt-get install python-pil python-numpy
sudo pip install jupyter
sudo pip install matplotlib
1.2. Setting environment variables
Add the tensorflow/models/research/ and slim directories to PYTHONPATH:
# From tensorflow/models/research/
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
1.3. Testing the installation
From the tensorflow/models/research/ directory, run:
# From tensorflow/models/research/
python deeplab/model_test.py  # quick test
Output:
....
2018-06-06 15:58:06.598998: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1821 MB memory) -> physical GPU (device: 0, name: GeForce GTX 980 Ti, pci bus id: 0000:01:00.0, compute capability: 5.2)
....
----------------------------------------------------------------------
Ran 5 tests in 15.748s
OK
You can also run the full pipeline on the PASCAL VOC 2012 dataset:
# From tensorflow/models/research/deeplab
sh local_test.sh
2. TensorFlow Model Zoo
TensorFlow provides models pre-trained on the PASCAL VOC 2012, Cityscapes, and ADE20K datasets.
2.1. DeepLab models trained on PASCAL VOC 2012
2.1.1. Model description
Checkpoint name | Network backbone | Pretrained dataset | ASPP | Decoder |
---|---|---|---|---|
mobilenetv2_dm05_coco_voc_trainaug | MobileNet-v2 (depth multiplier = 0.5) | MS-COCO, VOC 2012 train_aug set | N/A | N/A |
mobilenetv2_dm05_coco_voc_trainval | MobileNet-v2 (depth multiplier = 0.5) | MS-COCO, VOC 2012 train_aug + trainval sets | N/A | N/A |
mobilenetv2_coco_voc_trainaug | MobileNet-v2 | MS-COCO, VOC 2012 train_aug set | N/A | N/A |
mobilenetv2_coco_voc_trainval | MobileNet-v2 | MS-COCO, VOC 2012 train_aug + trainval sets | N/A | N/A |
xception65_coco_voc_trainaug | Xception_65 | MS-COCO, VOC 2012 train_aug set | [6, 12, 18] for OS=16; [12, 24, 36] for OS=8 | OS = 4 |
xception65_coco_voc_trainval | Xception_65 | MS-COCO, VOC 2012 train_aug + trainval sets | [6, 12, 18] for OS=16; [12, 24, 36] for OS=8 | OS = 4 |
Here OS denotes the output stride.
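For a quick sense of what these output strides mean, assuming the usual DeepLab convention that crop sizes have the form OS * k + 1 (e.g., the default 513x513 crop):
# Spatial size of the final feature map for a 513x513 crop at a given output stride (OS),
# following the crop_size = OS * k + 1 convention used by DeepLab.
crop_size = 513
for output_stride in (16, 8):
    feature_size = (crop_size - 1) // output_stride + 1
    print('OS=%d -> %dx%d feature map' % (output_stride, feature_size, feature_size))
# OS=16 -> 33x33 feature map
# OS=8 -> 65x65 feature map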
2.1.2. Model download
Checkpoint name | Eval OS | Eval scales | Left-right Flip | Multiply-Adds | Runtime (sec) | PASCAL mIOU | File Size |
---|---|---|---|---|---|---|---|
mobilenetv2_dm05_coco_voc_trainaug | 16 | [1.0] | No | 0.88B | - | 70.19% (val) | 7.6MB |
mobilenetv2_dm05_coco_voc_trainval | 8 | [1.0] | No | 2.84B | - | 71.83% (test) | 7.6MB |
mobilenetv2_coco_voc_trainaug | 16 / 8 | [1.0] / [0.5:0.25:1.75] | No / Yes | 2.75B / 152.59B | 0.1 / 26.9 | 75.32% (val) / 77.33% (val) | 23MB |
mobilenetv2_coco_voc_trainval | 8 | [0.5:0.25:1.75] | Yes | 152.59B | 26.9 | 80.25% (test) | 23MB |
xception65_coco_voc_trainaug | 16 / 8 | [1.0] / [0.5:0.25:1.75] | No / Yes | 54.17B / 3055.35B | 0.7 / 223.2 | 82.20% (val) / 83.58% (val) | 439MB |
xception65_coco_voc_trainval | 8 | [0.5:0.25:1.75] | Yes | 3055.35B | 223.2 | 87.80% (test) | 439MB |
2.1.3. Tarball contents
Each downloaded .tar archive contains the following files:
[1] - frozen_inference_graph.pb.
All frozen inference graphs use output stride of 8 and a single eval scale of 1.0. No left-right flips are used, and MobileNet-v2 based models do not include the decoder module.
[2] - checkpoint files: model.ckpt.data-00000-of-00001 and model.ckpt.index.
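To double-check what a downloaded archive contains, you can list its members; the filename below is one of the PASCAL tarballs referenced in Section 3, so adjust the path to whichever archive you actually downloaded:
import tarfile

# Example path; point it at the tarball you downloaded.
tarball_path = 'deeplabv3_pascal_trainval_2018_01_04.tar.gz'
with tarfile.open(tarball_path) as tar:
    for name in tar.getnames():
        print(name)
# Expected to include frozen_inference_graph.pb plus the model.ckpt.* files
# (possibly under a subdirectory).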
2.2. DeepLab models trained on Cityscapes
2.2.1. Model description
Checkpoint name | Network backbone | Pretrained dataset | ASPP | Decoder |
---|---|---|---|---|
mobilenetv2_coco_cityscapes_trainfine | MobileNet-v2 | MS-COCO, Cityscapes train_fine set | N/A | N/A |
xception65_cityscapes_trainfine | Xception_65 | ImageNet, Cityscapes train_fine set | [6, 12, 18] for OS=16; [12, 24, 36] for OS=8 | OS = 4 |
xception71_dpc_cityscapes_trainfine | Xception_71 | ImageNet, MS-COCO, Cityscapes train_fine set | Dense Prediction Cell | OS = 4 |
xception71_dpc_cityscapes_trainval | Xception_71 | ImageNet, MS-COCO, Cityscapes trainval_fine and coarse set | Dense Prediction Cell | OS = 4 |
2.2.2. Model download
Checkpoint name | Eval OS | Eval scales | Left-right Flip | Multiply-Adds | Runtime (sec) | Cityscapes mIOU | File Size |
---|---|---|---|---|---|---|---|
mobilenetv2_coco_cityscapes_trainfine | 16 / 8 | [1.0] / [0.75:0.25:1.25] | No / Yes | 21.27B / 433.24B | 0.8 / 51.12 | 70.71% (val) / 73.57% (val) | 23MB |
xception65_cityscapes_trainfine | 16 / 8 | [1.0] / [0.75:0.25:1.25] | No / Yes | 418.64B / 8677.92B | 5.0 / 422.8 | 78.79% (val) / 80.42% (val) | 439MB |
xception71_dpc_cityscapes_trainfine | 16 | [1.0] | No | 502.07B | - | 80.31% (val) | 445MB |
xception71_dpc_cityscapes_trainval | 8 | [0.75:0.25:2] | Yes | - | - | 82.66% (test) | 446MB |
2.3. DeepLab models trained on ADE20K
2.3.1. Model description
Checkpoint name | Network backbone | Pretrained dataset | ASPP | Decoder | Input size |
---|---|---|---|---|---|
mobilenetv2_ade20k_train | MobileNet-v2 | ImageNet, ADE20K training set | N/A | OS = 4 | 257x257 |
xception65_ade20k_train | Xception_65 | ImageNet, ADE20K training set | [6, 12, 18] for OS=16; [12, 24, 36] for OS=8 | OS = 4 | 513x513 |
2.3.2. Model download
Checkpoint name | Eval OS | Eval scales | Left-right Flip | mIOU | Pixel-wise Accuracy | File Size |
---|---|---|---|---|---|---|
mobilenetv2_ade20k_train | 16 | [1.0] | No | 32.04% (val) | 75.41% (val) | 24.8MB |
xception65_ade20k_train | 8 | [0.5:0.25:1.75] | Yes | 45.65% (val) | 82.52% (val) | 439MB |
2.4. Checkpoints pretrained on ImageNet
Each checkpoint consists of the files model.ckpt.data-00000-of-00001 and model.ckpt.index.
Model name | File Size |
---|---|
xception_41_imagenet | 288MB |
xception_65_imagenet | 447MB |
xception_65_imagenet_coco | 292MB |
xception_71_imagenet | 474MB |
resnet_v1_50_beta_imagenet | 274MB |
resnet_v1_101_beta_imagenet | 477MB |
3. Demo.py
#!--*-- coding:utf-8 --*--
# Deeplab Demo
import os
import tarfile
from matplotlib import gridspec
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
import tempfile
from six.moves import urllib
import tensorflow as tf
class DeepLabModel(object):
"""
加载 DeepLab 模型;
推断 Inference.
"""
INPUT_TENSOR_NAME = 'ImageTensor:0'
OUTPUT_TENSOR_NAME = 'SemanticPredictions:0'
INPUT_SIZE = 513
FROZEN_GRAPH_NAME = 'frozen_inference_graph'
def __init__(self, tarball_path):
"""
Creates and loads pretrained deeplab model.
"""
self.graph = tf.Graph()
graph_def = None
# Extract frozen graph from tar archive.
tar_file = tarfile.open(tarball_path)
for tar_info in tar_file.getmembers():
if self.FROZEN_GRAPH_NAME in os.path.basename(tar_info.name):
file_handle = tar_file.extractfile(tar_info)
graph_def = tf.GraphDef.FromString(file_handle.read())
break
tar_file.close()
if graph_def is None:
raise RuntimeError('Cannot find inference graph in tar archive.')
with self.graph.as_default():
tf.import_graph_def(graph_def, name='')
self.sess = tf.Session(graph=self.graph)
def run(self, image):
"""
Runs inference on a single image.
Args:
image: A PIL.Image object, raw input image.
Returns:
resized_image: RGB image resized from original input image.
seg_map: Segmentation map of `resized_image`.
"""
width, height = image.size
resize_ratio = 1.0 * self.INPUT_SIZE / max(width, height)
target_size = (int(resize_ratio * width), int(resize_ratio * height))
resized_image = image.convert('RGB').resize(target_size, Image.ANTIALIAS)
batch_seg_map = self.sess.run(self.OUTPUT_TENSOR_NAME,
feed_dict={self.INPUT_TENSOR_NAME: [np.asarray(resized_image)]})
seg_map = batch_seg_map[0]
return resized_image, seg_map
def create_pascal_label_colormap():
"""
Creates a label colormap used in PASCAL VOC segmentation benchmark.
Returns:
A Colormap for visualizing segmentation results.
"""
colormap = np.zeros((256, 3), dtype=int)
ind = np.arange(256, dtype=int)
for shift in reversed(range(8)):
for channel in range(3):
colormap[:, channel] |= ((ind >> channel) & 1) << shift
ind >>= 3
return colormap
def label_to_color_image(label):
"""
Adds color defined by the dataset colormap to the label.
Args:
label: A 2D array with integer type, storing the segmentation label.
Returns:
result: A 2D array with floating type. The element of the array
is the color indexed by the corresponding element in the input label
to the PASCAL color map.
Raises:
ValueError: If label is not of rank 2 or its value is larger than color
map maximum entry.
"""
if label.ndim != 2:
raise ValueError('Expect 2-D input label')
colormap = create_pascal_label_colormap()
if np.max(label) >= len(colormap):
raise ValueError('label value too large.')
return colormap[label]
def vis_segmentation(image, seg_map):
"""Visualizes input image, segmentation map and overlay view."""
plt.figure(figsize=(15, 5))
grid_spec = gridspec.GridSpec(1, 4, width_ratios=[6, 6, 6, 1])
plt.subplot(grid_spec[0])
plt.imshow(image)
plt.axis('off')
plt.title('input image')
plt.subplot(grid_spec[1])
seg_image = label_to_color_image(seg_map).astype(np.uint8)
plt.imshow(seg_image)
plt.axis('off')
plt.title('segmentation map')
plt.subplot(grid_spec[2])
plt.imshow(image)
plt.imshow(seg_image, alpha=0.7)
plt.axis('off')
plt.title('segmentation overlay')
unique_labels = np.unique(seg_map)
ax = plt.subplot(grid_spec[3])
plt.imshow(FULL_COLOR_MAP[unique_labels].astype(np.uint8), interpolation='nearest')
ax.yaxis.tick_right()
plt.yticks(range(len(unique_labels)), LABEL_NAMES[unique_labels])
plt.xticks([], [])
ax.tick_params(width=0.0)
plt.grid('off')
plt.show()
##
LABEL_NAMES = np.asarray(
['background', 'aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus',
'car', 'cat', 'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike',
'person', 'pottedplant', 'sheep', 'sofa', 'train', 'tv' ])
FULL_LABEL_MAP = np.arange(len(LABEL_NAMES)).reshape(len(LABEL_NAMES), 1)
FULL_COLOR_MAP = label_to_color_image(FULL_LABEL_MAP)
## Download of the models provided by TensorFlow
MODEL_NAME = 'xception_coco_voctrainval'
# ['mobilenetv2_coco_voctrainaug', 'mobilenetv2_coco_voctrainval', 'xception_coco_voctrainaug', 'xception_coco_voctrainval']
_DOWNLOAD_URL_PREFIX = 'http://download.tensorflow.org/models/'
_MODEL_URLS = {'mobilenetv2_coco_voctrainaug': 'deeplabv3_mnv2_pascal_train_aug_2018_01_29.tar.gz',
'mobilenetv2_coco_voctrainval': 'deeplabv3_mnv2_pascal_trainval_2018_01_29.tar.gz',
'xception_coco_voctrainaug': 'deeplabv3_pascal_train_aug_2018_01_04.tar.gz',
'xception_coco_voctrainval': 'deeplabv3_pascal_trainval_2018_01_04.tar.gz', }
# _TARBALL_NAME = 'deeplab_model.tar.gz'
# model_dir = tempfile.mkdtemp()
# tf.gfile.MakeDirs(model_dir)
#
# download_path = os.path.join(model_dir, _TARBALL_NAME)
# print('downloading model, this might take a while...')
# urllib.request.urlretrieve(_DOWNLOAD_URL_PREFIX + _MODEL_URLS[MODEL_NAME], download_path)
# print('download completed! loading DeepLab model...')
model_dir = '/path/to/models_zoo/deeplab'
download_path = os.path.join(model_dir, _MODEL_URLS[MODEL_NAME])
MODEL = DeepLabModel(download_path)
print('model loaded successfully!')
##
def run_visualization(imagefile):
"""
DeepLab 语义分割,并可视化结果.
"""
orignal_im = Image.open(imagefile)
print('running deeplab on image %s...' % imagefile)
resized_im, seg_map = MODEL.run(orignal_im)
vis_segmentation(resized_im, seg_map)
images_dir = '/path/to/images'
images = sorted(os.listdir(images_dir))
for imgfile in images:
run_visualization(os.path.join(images_dir, imgfile))
print('Done.')
Comments
Hi, my images are 480x720. Since the source code sets INPUT_SIZE = 513, should I change it to 481 or 721?
What is fixed is the network input of 513; the image needs to be resized to 513x513.
The seg_map result contains multiple classes; how do I keep only the person class?
Extract the person class from seg_map and set the pixel locations of the non-person classes to background.
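A minimal sketch of that suggestion, reusing names from the demo above (LABEL_NAMES, MODEL, vis_segmentation); 'person' is index 15 in the PASCAL label list:
import numpy as np

resized_im, seg_map = MODEL.run(orignal_im)                  # image opened as in run_visualization above
person_idx = int(np.where(LABEL_NAMES == 'person')[0][0])    # 15 for the PASCAL labels
person_only = np.where(seg_map == person_idx, seg_map, 0)    # non-person pixels -> background (0)
vis_segmentation(resized_im, person_only)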
Hi, the model returns resized_im and seg_map; resized_im (the image) is a 3-D array while the seg_map mask is 2-D.
How should they be processed so that cv2's bitwise_and can be used to AND them?
When I run the demo in a Jupyter notebook, it ends with this error: Cannot retrieve image. Please check url: https://github.com/tensorflow/models/blob/master/research/deeplab/g3doc/img/image1.jpg?raw=true
It looks like the image cannot be fetched properly.
Hi, is a specific TensorFlow version required? Which version did you use?
I was using tensorflow-gpu 1.11 at the time.
How do I run the demo on ADE20K? Do I only need to replace the class names, or is there anything else?
The class names, the chosen model, and the model input all need to be adjusted.
What do you mean by the model input part? I'm a beginner; your demo is tested on the VOC dataset, and when I switch it to ADE20K I get errors.
What error do you get? By model input I mean the size of the input image.
Hi, I get this error:
IndexError: index 2 is out of bounds for axis 0 with size 1
Have you managed to get it working?
Locate where the error occurs and narrow down the problem, or look for a Docker image.
Solved: ADE20K has 150 classes, and I had forgotten to put a comma between the class names.
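For reference, the missing-comma problem above comes from Python's implicit string concatenation: adjacent string literals are silently merged, so the label array ends up shorter than the number of classes, and indexing it with a predicted label raises exactly that kind of IndexError. A tiny illustration using the first few ADE20K class names:
import numpy as np

labels_ok = np.asarray(['wall', 'building', 'sky'])     # 3 entries
labels_bug = np.asarray(['wall', 'building' 'sky'])     # missing comma -> 2 entries ('buildingsky')
print(len(labels_ok), len(labels_bug))                  # 3 2
# labels_bug[2] -> IndexError: index 2 is out of bounds for axis 0 ...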
When a side of the original image is larger than 513, I get "Invalid argument: padded_shape[1]=xxx is not divisible by block_shape[1]=2" (xxx is an integer that changes with the image size). What should I do?
The default input apparently has to be 513x513; add a preprocessing step that resizes the image to 513, and post-process the result back to the original size afterwards.
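A sketch of that post-processing step, building on the demo's MODEL.run (which already resizes the input so its longer side is at most 513): the predicted seg_map is resized back to the original resolution with nearest-neighbour interpolation so the integer labels stay intact.
import numpy as np
from PIL import Image

original_im = Image.open('your_image.jpg')        # example path
resized_im, seg_map = MODEL.run(original_im)      # model from the demo in Section 3
# Upsample the label map back to the original size; NEAREST avoids interpolating labels.
seg_map_full = np.array(
    Image.fromarray(seg_map.astype(np.uint8)).resize(original_im.size, Image.NEAREST))
print(seg_map_full.shape)                         # (height, width) of the original image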
In demo.py, when resize_ratio = 1.0 (i.e., the image is kept at its original size) I get an error; how can I fix it?
The error is:
Invalid argument: padded_shape[1]=245 is not divisible by block_shape[1]=2
What is the original image size?
Hi, how can I extract the region of the original image corresponding to the mask?
Just apply an AND (masking) operation.
After ANDing the original image with the mask, the unrelated regions turn black. Is there a way to make those regions white, or the dominant color of the masked region? Looping over pixels would be very slow for large images.
Many open-source segmentation projects include visualization demos that render different segmented objects in different colors, which is close to what you need.
Thanks a lot. Concretely, I used it like this:
crop_result = cv2.bitwise_and(oringnal_im, oringnal_im, mask=mask)
Why does crop_result = cv2.bitwise_and(oringnal_im, oringnal_im, mask=mask) pass oringnal_im twice? Many people online pass two different arguments. Also, the result coming out of DeepLab is a single-channel numpy.ndarray; how did you cut the image region out?
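A sketch that addresses the thread above, assuming OpenCV is installed: cv2.bitwise_and receives the image twice because it ANDs the image with itself and keeps only the pixels where the 8-bit mask is non-zero; a pure NumPy variant also lets you fill the background with white instead of black, without any per-pixel loop.
import numpy as np
import cv2

person_idx = 15                                          # 'person' in the PASCAL LABEL_NAMES above
mask = (seg_map == person_idx).astype(np.uint8) * 255    # 2-D, 8-bit mask matching resized_im
image = np.asarray(resized_im)                           # 3-D RGB array from the demo

# OpenCV: src AND src, restricted to mask (this is why the image appears twice).
crop_black_bg = cv2.bitwise_and(image, image, mask=mask)

# NumPy: same crop but with a white background, fully vectorised.
crop_white_bg = np.where(mask[..., None] > 0, image, 255).astype(np.uint8)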
Hi, how can I save only the intermediate segmentation map?
By intermediate segmentation map, do you mean the mask corresponding to the image, or the features of an intermediate network layer?
I mean the mask corresponding to the image.
seg_map is the output segmentation mask: resized_im, seg_map = MODEL.run(orignal_im)
OK, many thanks.
Hi! When I run this demo in Jupyter, I get the error FileNotFoundError: [Errno 2] No such file or directory: '/path/to/models_zoo/deeplab\deeplabv3_pascal_trainval_2018_01_04.tar.gz'
The model probably has not been downloaded.
Do I need to download the TF Model Zoo checkpoint and keep it locally, or can I just use it via the URL directly? Please correct me if I have misunderstood anything, thanks!
It can be downloaded automatically from the given URL.
Could you explain concretely how to do that?
What exactly do you mean by "how to do that"?
I ran into the same problem: it reports a path error. I tried visiting the TF model download page and got a permission-denied message.
You can download the model directly from the URL.
Hi, how do I run the demo on Cityscapes? After replacing the paths and the corresponding class names I keep getting a binary-related error. I'd appreciate your guidance, thanks!
A binary-related error? What is the exact error?