GitHub Project - Semantic Soft Segmentation

The authors of the SIGGRAPH 2018 paper Semantic Soft Segmentation have open-sourced a test implementation. It is split into two projects: feature extraction and soft segmentation.

[GitHub – SIGGRAPH18SSS – Semantic feature generator - feature extraction source](https://github.com/iyah4888/SIGGRAPH18SSS)
[GitHub – Semantic Soft Segmentation – segmentation source](https://github.com/yaksoy/SemanticSoftSegmentation)

## 1. Feature Extraction

[GitHub source – SIGGRAPH18SSS](https://github.com/iyah4888/SIGGRAPH18SSS)
[Pre-trained TensorFlow model](http://cvg.ethz.ch/research/semantic-soft-segmentation/SSS_model.zip)

This project implements feature extraction on top of [deeplab_resnet](https://github.com/iyah4888/SIGGRAPH18SSS/tree/master/deeplab_resnet) and outputs a 128-dimensional feature representation.

Test code - [main_hyper.py](https://github.com/iyah4888/SIGGRAPH18SSS/blob/master/main_hyper.py):

```python
from __future__ import print_function

import os
import scipy.io as sio
from glob import glob

import tensorflow as tf
import numpy as np

from parse_opt import get_arguments
from deeplab_resnet import HyperColumn_Deeplabv2, read_data_list

IMG_MEAN = np.array((104.00698793,116.66876762,122.67891434), dtype=np.float32)

#######################################################
'''
Helper functions
'''
def load_dir_structs(dataset_path):
    # Collect the image files found directly under dataset_path
    # types = ('*.jpg', '*.png')    # jpg is not supported yet by read_img()
    types = ('*.png',)  # trailing comma makes this a one-element tuple, not a string

    curflist = []
    for files in types:
        curflist.extend(glob(os.path.join(dataset_path, files)))
    return curflist

def read_img(t_imgfname, input_size, img_mean):
    """Read one PNG image and convert it to mean-subtracted BGR.

    Args:
      t_imgfname: path to the PNG image.
      input_size: a tuple with (height, width) values.
                  If None, the resized output is twice the original size.
      img_mean: vector of mean colour values.

    Returns:
      Two tensors: the resized image and the image at its original size.
    """

    img_contents = tf.read_file(t_imgfname)

    img = tf.image.decode_png(img_contents, channels=3)
    # Reorder the channels from RGB to BGR.
    img_r, img_g, img_b = tf.split(axis=2, num_or_size_splits=3, value=img)
    img = tf.cast(tf.concat(axis=2, values=[img_b, img_g, img_r]), dtype=tf.float32)
    # Subtract the mean.
    img -= img_mean

    if input_size is not None:
        h, w = input_size

        # Resize to the requested size.
        newshape = tf.squeeze(tf.stack([h, w]), squeeze_dims=[1])
        img2 = tf.image.resize_images(img, newshape)
    else:
        # No size given: resize to twice the original height and width.
        img2 = tf.image.resize_images(img, tf.shape(img)[0:2,]*2)

    return img2, img

#######################################################
'''
Main function
'''
if __name__ == "__main__":
    args = get_arguments()

    # Set up tf session and initialize variables.
    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True
    with tf.Session(config=config) as sess:
        model = HyperColumn_Deeplabv2(sess, args)

        # Load variables if the checkpoint is provided.
        model.load(args.snapshot_dir)

        local_imgflist = load_dir_structs(args.data_dir)
        save_folder = os.path.join(args.data_dir, args.feat_dir)
        if not os.path.exists(save_folder):
            os.mkdir(save_folder)

        for i in range(len(local_imgflist)):
            # Skip entries that have no file extension.
            if os.path.splitext(local_imgflist[i])[1] == '':
                continue

            print('{} Processing {}'.format(i, local_imgflist[i]))
            padsize = 50
            _, ori_img = read_img(local_imgflist[i], input_size = None, img_mean = IMG_MEAN)
            # Reflect-pad the image by padsize pixels on every side before inference.
            pad_img = tf.pad(ori_img, [[padsize,padsize], [padsize,padsize], [0,0]], mode='REFLECT')
            cur_embed = model.test(pad_img.eval())
            cur_embed = np.squeeze(cur_embed)
            curfname = os.path.split(os.path.splitext(local_imgflist[i])[0])[1]
            cur_svpath = os.path.join(save_folder, curfname + '.mat')
            # Save the features in .mat format so the MATLAB code can load them.
            # The padded border is cropped off before saving.
            sio.savemat(cur_svpath, {'embedmap': cur_embed[padsize:(cur_embed.shape[0]-padsize),padsize:(cur_embed.shape[1]-padsize),:]})
```
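The image handling in the script (RGB to BGR, mean subtraction, reflect padding, and cropping the border off the output) can be tried without TensorFlow. The following is a minimal NumPy sketch of those steps only. The helper names `preprocess` and `crop_border` are hypothetical, the random array stands in for the output of `model.test()`, and it assumes the network returns a map with the same height and width as its input, which is what the cropping in the script implies.

```python
import numpy as np

# Per-channel means in BGR order, copied from IMG_MEAN in main_hyper.py.
IMG_MEAN = np.array((104.00698793, 116.66876762, 122.67891434), dtype=np.float32)


def preprocess(rgb, padsize=50):
    """RGB uint8 image -> mean-subtracted, reflect-padded BGR float32 image."""
    bgr = rgb[:, :, ::-1].astype(np.float32)  # reverse the channel order: RGB -> BGR
    bgr -= IMG_MEAN                           # subtract the per-channel mean
    return np.pad(bgr, [(padsize, padsize), (padsize, padsize), (0, 0)], mode="reflect")


def crop_border(embed, padsize=50):
    """Remove the padded border from an (H, W, C) map."""
    return embed[padsize:embed.shape[0] - padsize, padsize:embed.shape[1] - padsize, :]


rgb = np.random.randint(0, 256, size=(120, 160, 3), dtype=np.uint8)  # stand-in image
padded = preprocess(rgb)

# Stand-in for model.test(): random values with the padded size and 128 channels.
fake_embed = np.random.rand(padded.shape[0], padded.shape[1], 128).astype(np.float32)
embedmap = crop_border(fake_embed)

print(padded.shape, embedmap.shape)
```

After cropping, the feature map has the same height and width as the input image again, which is the shape saved under the `embedmap` key.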

## 2. SemanticSoftSegmentation

This section follows the [paper Semantic Soft Segmentation](http://people.inf.ethz.ch/aksoyy/papers/TOG18-sss.pdf), the [paper reading notes - Semantic Soft Segmentation](https://www.aiuai.cn/aifarm366.html), and the open-source implementation [SemanticSoftSegmentation.m](https://github.com/yaksoy/SemanticSoftSegmentation/blob/master/SemanticSoftSegmentation.m).

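Main steps:

[1] - Feature dimensionality reduction, from 128 dimensions down to 3.

[2] - Superpixel processing, to compute the superpixels.

[3] - Compute the affinities and the Laplacian: the matting affinity, the semantic affinity, and the non-local color affinity, combined through affinityMatrixToLaplacian.

[4] - Eigendecomposition of the Laplacian, yielding 100 eigenvectors.

[5] - Initial optimization: compute the initial soft segments with a semantic initialization from the features, then group the segments according to the features extracted by the deep network.

[6] - Final optimization: sparsification.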
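Step [1] can be illustrated with plain PCA. The sketch below is an assumption-laden NumPy version, not the repository's `preprocessFeatures`: it skips the guided filter entirely, uses a simple per-channel min-max rescaling to reach [0, 1], and the function name `reduce_features` and the random input are hypothetical.

```python
import numpy as np


def reduce_features(features, out_dim=3):
    """Project an (H, W, C) feature map onto its top `out_dim` principal components."""
    h, w, c = features.shape
    flat = features.reshape(-1, c).astype(np.float64)  # one row per pixel (copy)
    flat -= flat.mean(axis=0)                          # centre each feature dimension
    cov = flat.T @ flat / (flat.shape[0] - 1)          # C x C covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)             # eigenvalues in ascending order
    top = eigvecs[:, ::-1][:, :out_dim]                # directions of largest variance
    reduced = flat @ top
    # Min-max rescale each output channel to [0, 1] (assumed normalization).
    reduced -= reduced.min(axis=0)
    reduced /= np.maximum(reduced.max(axis=0), 1e-12)
    return reduced.reshape(h, w, out_dim)


features = np.random.rand(60, 80, 128)  # stand-in for the saved 'embedmap'
reduced = reduce_features(features)
print(reduced.shape, reduced.min(), reduced.max())
```

A 3-channel result in [0, 1] can be viewed directly as an RGB image, which is a convenient way to inspect what the features separate.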

```matlab
% Semantic Soft Segmentation
% This function implements the soft segmentation approach described in
% Yagiz Aksoy, Tae-Hyun Oh, Sylvain Paris, Marc Pollefeys, Wojciech Matusik
% "Semantic Soft Segmentation", ACM TOG (Proc. SIGGRAPH) 2018

function [softSegments, initSoftSegments, Laplacian, affinities, features, superpixels, eigenvectors, eigenvalues] = SemanticSoftSegmentation(image, features)

    disp('Semantic Soft Segmentation')
    % Prepare the inputs and superpixels
    image = im2double(image);
    % --------------------------------------------------
    % preprocessFeatures corresponds to Section 3.5 (Semantic Feature Vectors) of the paper.
    % It applies a guided filter to the 128-dimensional features from the network,
    % reduces them with PCA, and normalizes the values to [0, 1].
    % --------------------------------------------------
    if size(features, 3) > 3 % If the features are raw, hyperdimensional, preprocess them
        features = preprocessFeatures(features, image);
    else
        features = im2double(features);
    end

    % --------------------------------------------------
    % Superpixels corresponds to Section 3.2 (Nonlocal Color Affinity) of the paper.
    % It computes a SLIC superpixel segmentation of the image.
    % --------------------------------------------------
    superpixels = Superpixels(image);
    [h, w, ~] = size(image);

    disp('     Computing affinities')
    % --------------------------------------------------
    % Compute the affinities of the image and the Laplacian built from them.
    % --------------------------------------------------
    affinities{1} = mattingAffinity(image);
    affinities{2} = superpixels.neighborAffinities(features); % semantic affinity
    affinities{3} = superpixels.nearbyAffinities(image); % non-local color affinity
    Laplacian = affinityMatrixToLaplacian(affinities{1} + 0.01 * affinities{2} + 0.01 * affinities{3}); % Equation 6

    disp('     Computing eigenvectors')
    % --------------------------------------------------
    % Eigendecomposition of the image Laplacian: eigenvectors and eigenvalues.
    % 'SM' requests the eigenvalues of smallest magnitude.
    % --------------------------------------------------
    eigCnt = 100; % We use 100 eigenvectors in the optimization
    [eigenvectors, eigenvalues] = eigs(Laplacian, eigCnt, 'SM');

    disp('     Initial optimization')
    % --------------------------------------------------
    % softSegmentsFromEigs corresponds to Section 3.4 (Creating the Layers) of the paper.
    % It computes the initial soft segments.
    % --------------------------------------------------
    initialSegmCnt = 40;
    sparsityParam = 0.8;
    iterCnt = 40;
    % Pass the eigenvectors, eigenvalues, Laplacian, reduced features,
    % and the remaining parameters to softSegmentsFromEigs.
    initSoftSegments = softSegmentsFromEigs(eigenvectors, eigenvalues, Laplacian, ...
                                            h, w, features, initialSegmCnt, iterCnt, sparsityParam, [], []);

    % Group the initial soft segments according to their semantic feature
    % vectors, i.e. the dimensionality-reduced features.
    groupedSegments = groupSegments(initSoftSegments, features, 3); % The argument 3 is the number of output layers; the default is 5.

    disp('     Final optimization')
    % --------------------------------------------------
    % sparsifySegments corresponds to the relaxed sparsification in
    % Section 3.4 (Creating the Layers) of the paper.
    % It performs the final sparsification and produces the final segments.
    % --------------------------------------------------
    softSegments = sparsifySegments(groupedSegments, Laplacian, imageGradient(image, false, 6));

    disp('     Done.')
end
```
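The Laplacian and eigenvector steps can be seen on a toy graph. The sketch below assumes the standard unnormalized graph Laplacian L = D - W, where D holds the row sums of the affinity matrix W. That is what the name `affinityMatrixToLaplacian` suggests, but the repository's helper may differ in detail. The dense `np.linalg.eigh` call stands in for MATLAB's sparse `eigs(..., 'SM')`, and the six-node graph is made up for illustration.

```python
import numpy as np


def affinity_to_laplacian(W):
    """Unnormalized graph Laplacian L = D - W, with D the diagonal matrix of row sums."""
    return np.diag(W.sum(axis=1)) - W


# Toy graph: two triangles {0, 1, 2} and {3, 4, 5} joined by one weak edge (2-3).
W = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    W[i, j] = W[j, i] = 1.0
W[2, 3] = W[3, 2] = 0.01

L = affinity_to_laplacian(W)

# eigh returns eigenvalues in ascending order, so the first columns belong
# to the smallest eigenvalues (the role played by 'SM' in the MATLAB call).
eigvals, eigvecs = np.linalg.eigh(L)
k = 2
smallest_vals, smallest_vecs = eigvals[:k], eigvecs[:, :k]

print(np.round(smallest_vals, 4))
print(np.sign(smallest_vecs[:, 1]))  # sign pattern of the second eigenvector
```

The second eigenvector takes one sign on the first triangle and the opposite sign on the second. Eigenvectors with small eigenvalues vary little across strongly connected pixels, which is why they are a useful basis for building soft segments.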


Last modified: March 4, 2021
