RefineNet:_Multi-Path Refinement Networks for High-Resolution Semantic Segmentation

CVPR 2017       Posted on May 9, 2020

Paper link

Take away message

  1. 设计RefineNet,结合高层语义信息和低层细节信息,得到高分辨率的分割图
  2. RefineNet中大量使用了ResNet中Identity mapping (both short range and long range) 的思想,保证了有效的端到端训练

Introduction

网络中pooling操作会降低分割图的分辨率,目前有三种解决方式:

  1. learn deconvolutional filters as an up-sampling operation: The deconvolution operations are not able to recover the low-level visual features which are lost after the downsampling operation in the convolution forward stage.
  2. Deeplab系列中的atrous convolution: a significant cost in memory, because unlike the image subsampling methods, one must retain very large numbers of feature maps at higher resolution. In practice, therefore, dilation convolution methods usually have a resolution prediction of no more than 1/8 size of the original rather than 1/4, when using a deep network.
  3. exploits features from intermediate layers for generating high-resolution prediction (本文基于的方法)

Model

Result

Reflection

感觉整个网络设计并没有太新颖的模块或思路,但可能是通过大量的identity mapping达成了有效的端对端训练,最后得到了非常棒的结果。如果是的话,说明设计网络时,网络各个模块能否得到充分的训练非常重要。