Original article: Manage Deep Learning Networks with Caffe* Optimized for Intel® Architecture.

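Contents:
- Abstract
- Installation
- Data layers
- Dataset preparation
- Training
- Multinode distributed training
- Fine-tuning
- Testing
- Feature extraction and visualization
- Using the Python API
- Debugging
- Examples
- Uses of Caffe
- Further reading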

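1 Abstract

Caffe is a deep learning framework developed by BVLC. It is written in C++ and CUDA C++ and provides Python and Matlab interfaces. The framework is useful for convolutional neural networks (CNNs), recurrent neural networks (RNNs), and multilayer perceptrons. Branches exist for detection, classification, and segmentation, as well as a Spark-compatible branch.

Caffe optimized for Intel architecture (Caffe-Intel) integrates the Intel Math Kernel Library (Intel MKL) 2017 and is optimized for the Advanced Vector Extensions 2 (AVX2) and AVX-512 instruction sets, so it supports Intel Xeon and Intel Xeon Phi processors. Caffe-Intel therefore keeps all the advantages of BVLC Caffe, runs efficiently on Intel architecture, and supports distributed training across many nodes.

This document explains how to build Caffe optimized for Intel architecture, how to train network models on one or more compute nodes, and how to deploy networks. It also covers several Caffe features in detail, such as fine-tuning, feature extraction and visualization for different models, and the Caffe Python API.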

Terminology:
- weights: also called kernels, filters, templates, or feature extractors;
- blob: also called a tensor; an N-dimensional data structure (N-D tensor) that holds data, gradients, or weights (including biases);
- units: also called neurons; they apply a nonlinear transformation to a data blob;
- feature maps: also called channels;
- testing: also called inference, classification, scoring, or deployment;
- model: also called topology or network architecture.

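To get familiar with Caffe quickly:
- Install Caffe-Intel
- Train and test LeNet on the MNIST dataset with Caffe-Intel
- Test a trained model, for example bvlc_googlenet.caffemodel, on a few images such as the cat and fish-bike examples
- Fine-tune an existing model on the Cats vs Dogs Challenge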

2 Installing Caffe-Intel

The instructions here cover installing Caffe on Ubuntu 14.04 only. For other Linux distributions and OS X, see the installation instructions provided by BVLC.

sudo apt-get update &&
sudo apt-get -y install build-essential git cmake &&
sudo apt-get -y install libprotobuf-dev libleveldb-dev libsnappy-dev &&
sudo apt-get -y install libopencv-dev libhdf5-serial-dev protobuf-compiler &&
sudo apt-get -y install --no-install-recommends libboost-all-dev &&
sudo apt-get -y install libgflags-dev libgoogle-glog-dev liblmdb-dev &&
sudo apt-get -y install libatlas-base-dev

On Ubuntu 16.04, the following library links are also needed:

find . -type f -exec sed -i -e 's^"hdf5.h"^"hdf5/serial/hdf5.h"^g' -e 's^"hdf5_hl.h"^"hdf5/serial/hdf5_hl.h"^g' '{}' \;
cd /usr/lib/x86_64-linux-gnu
sudo ln -s libhdf5_serial.so.10.1.0 libhdf5.so
sudo ln -s libhdf5_serial_hl.so.10.0.2 libhdf5_hl.so

On CentOS 7, install the following dependencies:

sudo yum -y update &&
sudo yum -y groupinstall "Development Tools" &&
sudo yum -y install wget cmake git &&
sudo yum -y install protobuf-devel protobuf-compiler boost-devel &&
sudo yum -y install snappy-devel opencv-devel atlas-devel &&
sudo yum -y install gflags-devel glog-devel lmdb-devel leveldb-devel hdf5-devel

# The following steps are only required if some packages failed to install
# add EPEL repository then install missing packages
wget https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
sudo rpm -ivh epel-release-latest-7.noarch.rpm
sudo yum -y install gflags-devel glog-devel lmdb-devel leveldb-devel hdf5-devel &&
sudo yum -y install protobuf-devel protobuf-compiler boost-devel

# if packages are still not found--download and install/build the packages, e.g.,
# snappy:
wget http://mirror.centos.org/centos/7/os/x86_64/Packages/snappy-devel-1.1.0-3.el7.x86_64.rpm
sudo yum -y install http://mirror.centos.org/centos/7/os/x86_64/Packages/snappy-devel-1.1.0-3.el7.x86_64.rpm
# atlas:
wget http://mirror.centos.org/centos/7/os/x86_64/Packages/atlas-devel-3.10.1-10.el7.x86_64.rpm
sudo yum -y install http://mirror.centos.org/centos/7/os/x86_64/Packages/atlas-devel-3.10.1-10.el7.x86_64.rpm
# opencv:
wget https://github.com/Itseez/opencv/archive/2.4.13.zip
unzip 2.4.13.zip
cd opencv-2.4.13/
mkdir build && cd build
cmake -DCMAKE_INSTALL_PREFIX:PATH=/usr/local ..
NUM_THREADS=$(($(grep 'core id' /proc/cpuinfo | sort -u | wc -l)*2))
make all -j $NUM_THREADS
sudo make install -j $NUM_THREADS

# optional (not required for Caffe)
# other useful repositories for CentOS are RepoForge and IUS:
wget http://pkgs.repoforge.org/rpmforge-release/rpmforge-release-0.5.3-1.el7.rf.x86_64.rpm
sudo rpm -Uvh rpmforge-release-0.5.3-1.el7.rf.x86_64.rpm
wget https://rhel7.iuscommunity.org/ius-release.rpm
sudo rpm -Uvh ius-release*.rpm

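Notes on the dependencies (source):
- boost: a C++ library, used for its math functions and shared pointers;
- glog, gflags: logging and command-line utilities, essential for debugging;
- leveldb, lmdb: database I/O, used to prepare data;
- protobuf: used to define data structures efficiently;
- BLAS (Basic Linear Algebra Subprograms): matrix multiplication, matrix addition, and similar operations, provided by Intel MKL; ATLAS and OpenBLAS are comparable libraries.

The Caffe installation guide notes that installing MKL gives better performance on CPUs.

For best performance, use Intel MKL 2017; a beta version is available free of charge through Intel® Parallel Studio XE 2017 Beta.
After installation, the library environment can be set up as follows (adjust the paths to your installation):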

echo 'source /opt/intel/bin/compilervars.sh intel64' >> ~/.bashrc
# alternatively edit <mkl_path>/mkl/bin/mklvars.sh replacing INSTALLDIR in
# CPRO_PATH=<INSTALLDIR> with the actual mkl path: CPRO_PATH=<full mkl path>
# echo 'source <mkl path>/mkl/bin/mklvars.sh intel64' >> ~/.bashrc

Clone and prepare Caffe-Intel:

cd ~
# For BVLC caffe use:
# git clone https://github.com/BVLC/caffe.git
# For intel caffe use:
git clone https://github.com/intel/caffe.git 
cd caffe
echo "export CAFFE_ROOT=`pwd`" >> ~/.bashrc
source ~/.bashrc
cp Makefile.config.example Makefile.config
# Open Makefile.config and modify it (see comments in the Makefile)
vi Makefile.config

Edit Makefile.config:

# To run on CPU only and to avoid installing CUDA installers, uncomment
CPU_ONLY := 1

# To use MKL, replace atlas with mkl as follows
# (make sure that the BLAS_DIR and BLAS_LIB paths are correct)
BLAS := mkl
BLAS_DIR := $(MKLROOT)/include
BLAS_LIB := $(MKLROOT)/lib/intel64

# To use MKL2017 DNN primitives as the default engine, uncomment
# (however leave it commented if using multinode training)
# USE_MKL2017_AS_DEFAULT_ENGINE := 1

# To customize the compiler choice, uncomment and set the following
# CUSTOM_CXX := g++

# To train on multinode uncomment and verify path
# USE_MPI := 1
# CXX := /usr/bin/mpicxx

On Ubuntu 16.04, edit the Makefile:

INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial/

and create the links:

cd /usr/lib/x86_64-linux-gnu
sudo ln -s libhdf5_serial.so.10.1.0 libhdf5.so
sudo ln -s libhdf5_serial_hl.so.10.0.2 libhdf5_hl.so 

On CentOS 7 with the ATLAS library (rather than the recommended MKL), edit the Makefile:

# Change this line
LIBRARIES += cblas atlas
# to
LIBRARIES += satlas

Build Caffe-Intel:

NUM_THREADS=$(($(grep 'core id' /proc/cpuinfo | sort -u | wc -l)*2))
make -j $NUM_THREADS
# To save the output stream to file makestdout.log use this instead
# make -j $NUM_THREADS 2>&1 | tee makestdout.log

Alternatively, build with cmake:

mkdir build
cd build
cmake -DCPU_ONLY=on -DBLAS=mkl -DUSE_MKL2017_AS_DEFAULT_ENGINE=on /path/to/caffe
NUM_THREADS=$(($(grep 'core id' /proc/cpuinfo | sort -u | wc -l)*2))
make -j $NUM_THREADS

Install the Python dependencies:

# These steps are OPTIONAL but highly recommended to use the Python interface
sudo apt-get -y install gfortran python-dev python-pip
cd ~/caffe/python
for req in $(cat requirements.txt); do sudo pip install $req; done
sudo pip install scikit-image #depends on other packages
sudo ln -s /usr/include/python2.7/ /usr/local/include/python2.7
sudo ln -s /usr/local/lib/python2.7/dist-packages/numpy/core/include/numpy/ \
  /usr/local/include/python2.7/numpy
cd ~/caffe
make pycaffe -j $NUM_THREADS
echo "export PYTHONPATH=$CAFFE_ROOT/python" >> ~/.bashrc
source ~/.bashrc

Other installation options:

# These steps are OPTIONAL to test caffe
make test -j $NUM_THREADS
make runtest #"YOU HAVE <some number> DISABLED TESTS" output is OK

# This step is OPTIONAL to disable cam hardware OpenCV driver
# alternatively, the user can skip this and ignore the harmless 
# libdc1394 error that may occasionally appear
sudo ln /dev/null /dev/raw1394

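3 Caffe data layers

This section is optional. It describes the data types that Caffe supports and is not required for learning Caffe. It is based mainly on Caffe's official layer documentation and on src/caffe/proto/caffe.proto.

Data enters Caffe through data layers, which sit at the bottom of the network and are defined in a prototxt file. The prototxt file is covered in more detail in the section on training.
Data can come from a database (LevelDB or LMDB), directly from memory, or from files on disk in HDF5 or common image formats.
Common input preprocessing (mean subtraction, scaling, random cropping, mirroring, and so on) can be specified with transform_param (not every data layer supports it; HDF5, for example, does not). It is unnecessary if the data has already been transformed.
A typical transformation definition: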

transform_param {
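  # randomly flip the image horizontally (mirroring)
  mirror: 1
  # crop a `crop_size` x `crop_size` patch:
  # - at a random position during training
  # - from the center during testing
  crop_size: 227
  # mean subtraction: give per-channel values, or load a mean.binaryproto file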
  # mean_file: name_of_mean_file.binaryproto
  mean_value: 104
  mean_value: 117
  mean_value: 123
}

Here the image is cropped, mirrored, and mean-subtracted. For other transformations, see TransformationParameter in src/caffe/proto/caffe.proto.
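
The effect of these three transformations can be sketched in NumPy. This is only an illustration under stated assumptions, not Caffe's implementation: the function name `transform` and its arguments are hypothetical, and the image is assumed to be stored as channels x height x width.

```python
import numpy as np

def transform(image, crop_size, mean_values, mirror, train, rng):
    """Mean-subtract, crop and optionally mirror a C x H x W image (illustrative)."""
    channels, height, width = image.shape
    # Subtract one mean value per channel.
    mean = np.asarray(mean_values, dtype=np.float32).reshape(channels, 1, 1)
    out = image.astype(np.float32) - mean
    if train:
        # Training: crop at a random position.
        top = rng.randint(0, height - crop_size + 1)
        left = rng.randint(0, width - crop_size + 1)
    else:
        # Testing: crop around the image center.
        top = (height - crop_size) // 2
        left = (width - crop_size) // 2
    out = out[:, top:top + crop_size, left:left + crop_size]
    # Mirroring: flip horizontally with probability 0.5 when enabled.
    if mirror and rng.randint(0, 2) == 1:
        out = out[:, :, ::-1]
    return out

rng = np.random.RandomState(0)
image = np.full((3, 256, 256), 128, dtype=np.uint8)  # synthetic gray image
test_patch = transform(image, 227, (104, 117, 123), mirror=False, train=False, rng=rng)
train_patch = transform(image, 227, (104, 117, 123), mirror=True, train=True, rng=rng)
print(test_patch.shape, train_patch.shape)
```

Both patches come out as 3 x 227 x 227 arrays; only the crop position and the flip differ between the training and testing branches.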

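3.1 LMDB data

LMDB (Lightning Memory-Mapped Database) and LevelDB are efficient input formats. They are suitable only for 1-of-K classification tasks. Because Caffe reads them efficiently, they are the recommended formats for such tasks.

data_param
Required parameters
- source: path to the directory containing the image database
- batch_size: number of inputs processed at a time

Optional parameters
- backend [default LEVELDB]: LEVELDB or LMDB
- rand_skip: number of inputs to skip at the start; useful for asynchronous SGD

For details, see DataParameter in src/caffe/proto/caffe.proto.

Definition in a prototxt: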

layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mirror: 1
    crop_size: 227
    mean_value: 104
    mean_value: 117
    mean_value: 123
  }
  data_param {
    source: "examples/imagenet/ilsvrc12_train_lmdb"
    batch_size: 32
    backend: LMDB
  }
}

Alternatively, mean subtraction can use a mean image ("data/ilsvrc12/imagenet_mean.binaryproto") instead of mean_value. The binaryproto for an LMDB dataset is computed as follows:

cd ~/caffe
build/tools/compute_image_mean examples/imagenet/ilsvrc12_train_lmdb \
  data/ilsvrc12/imagenet_mean.binaryproto

Replace examples/imagenet/ilsvrc12_train_lmdb and data/ilsvrc12/imagenet_mean.binaryproto with the lmdb folder and binaryproto file that fit your data.

3.2 ImageData

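Reads images and labels directly from image files.

image_data_param
Required parameters
- source: name of a text file that lists the input images and their labels

Optional parameters
- batch_size [default 1]: number of inputs processed at a time
- new_height [default 0]: resize images to this height; ignored if 0
- new_width [default 0]: resize images to this width; ignored if 0
- shuffle [default 0]: shuffle the data; ignored if 0
- rand_skip [default 0]: number of inputs to skip at the start; useful for asynchronous SGD

For details, see ImageDataParameter in src/caffe/proto/caffe.proto.

Definition in a prototxt: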

layer {
  name: "data"
  type: "ImageData"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mirror: true
    crop_size: 227
    mean_value: 104
    mean_value: 117
    mean_value: 123
  }
  image_data_param {
    source: "/path/to/file/train.txt"
    batch_size: 32
    shuffle: 1
  }
}

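Here the images are shuffled, cropped, mirrored, and mean-subtracted.
Note that each line of the text file must contain an image path and its label, for example "train.txt":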

/path/to/images/img3423.jpg 2
/path/to/images/img3424.jpg 13
/path/to/images/img3425.jpg 8
...
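
A list file in this format is easy to check before training. The sketch below is an illustration only; `parse_image_list` is a hypothetical helper, not part of Caffe, and it assumes one integer label at the end of each line.

```python
def parse_image_list(text):
    """Parse 'path label' lines into (path, int_label) pairs."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last run of whitespace so the label is always the final field.
        path, label = line.rsplit(None, 1)
        entries.append((path, int(label)))
    return entries

sample = """/path/to/images/img3423.jpg 2
/path/to/images/img3424.jpg 13
/path/to/images/img3425.jpg 8
"""
entries = parse_image_list(sample)
print(entries[1])
```

A line with a missing or non-integer label raises a ValueError here, which is a cheap way to catch a malformed list before a long LMDB conversion or training run.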

3.3 Input

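Takes a zero-valued blob of the specified dimensions as input.

input_param
Required parameters
- shape: either a single shape (used for every top blob) or one shape per top blob

Definition in a prototxt: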

layer {
  name: "input"
  type: "Input"
  top: "data"
  input_param {
    shape {
      dim: 32
      dim: 3
      dim: 227
      dim: 227
    }
  }
}

Equivalent form:

input: "data"
input_dim: 32
input_dim: 3
input_dim: 227
input_dim: 227

3.4 DummyData

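Similar to Input, except that the kind of data is specified (through data_filler). It is mostly used for debugging; see the example for details.

dummy_data_param
Required parameters
- shape: either a single shape (used for every top blob) or one shape per top blob

Optional parameters
- data_filler [default: ConstantFiller with value 0]: the values written to the top blob

Definition in a prototxt: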

layer {
  name: "data"
  type: "DummyData"
  top: "data"
  include {
    phase: TRAIN
  }
  dummy_data_param {
    data_filler {
      type: "constant"
      value: 0.01
    }
    shape {
      dim: 32
      dim: 3
      dim: 227
      dim: 227
    }
  }
}
layer {
  name: "data"
  type: "DummyData"
  top: "label"
  include {
    phase: TRAIN
  }
  dummy_data_param {
    data_filler {
      type: "constant"
    }
    shape {
      dim: 32
    }
  }
}

3.5 MemoryData

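Reads data directly from memory. Call MemoryDataLayer::Reset (from C++) or Net.set_input_arrays (from Python) to specify a contiguous block of data, usually a 4D array, which is read one batch_size chunk at a time.
Because the data must first be copied into memory, this can be slow to set up, but once the data is in memory the method is efficient.

memory_data_param
Required parameters
- batch_size, channels, height, width: dimensions of the data

Definition in a prototxt: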

layer {
  name: "data"
  type: "MemoryData"
  top: "data"
  top: "label"
  transform_param {
    crop_size: 227
    mirror: true
    mean_file: "mean.binaryproto"
  }
  memory_data_param {
    batch_size: 32
    channels: 3
    height: 227
    width: 227
  }
}

3.6 HDF5Data

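Reads data from HDF5 files. It works for many tasks, but generally only with FP32 and FP64 data, not uint8, so image data becomes very large.
transform_param is not allowed. Use this layer only when necessary.

hdf5_data_param
Required parameters
- source: name of a text file that lists the paths of the files holding the input data and labels
- batch_size

Optional parameters
- shuffle [default false]: shuffle the order of the HDF5 files

Definition in a prototxt: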

layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  hdf5_data_param {
    source: "examples/hdf5_classification/data/train.txt"
    batch_size: 32
  }
}

3.7 HDF5Output

The HDF5 output layer does the opposite of the other data layers: it writes its input blobs to disk.

hdf5_output_param
Required parameters
- file_name

Definition in a prototxt:

layer {
  name: "data_output"
  type: "HDF5Output"
  bottom: "data"
  bottom: "label"
  include {
    phase: TRAIN
  }
  hdf5_output_param {
    file_name: "output_file.h5"
  }
}

3.8 WindowData

Used for detection. Reads windows from image files, together with their class labels.

window_data_param
Required parameters
- source: the data source
- mean_file
- batch_size

Optional parameters
- mirror
- crop_size: randomly crop the image
- crop_mode [default "warp"]: how the detection window is cropped; "warp" warps the window to a fixed size, "square" crops the tightest square around the window
- fg_threshold [default 0.5]: foreground (object) overlap threshold
- bg_threshold [default 0.5]: background (non-object) overlap threshold
- fg_fraction [default 0.25]: fraction of the batch that should be foreground objects
- context_pad [default 0]: amount of contextual padding around a window

For details, see WindowDataParameter in src/caffe/proto/caffe.proto.

Definition in a prototxt:

layer {
  name: "data"
  type: "WindowData"
  top: "data"
  top: "label"
  window_data_param {
    source: "/path/to/file/window_train.txt"
    mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
    batch_size: 128
    mirror: true
    crop_size: 227
    fg_threshold: 0.5
    bg_threshold: 0.5
    fg_fraction: 0.25
    context_pad: 16
  }
}

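4 Dataset preparation

LMDB is the recommended data format for 1-of-K classification tasks.
Generating an LMDB with the Caffe tools requires:
- the directory containing the data
- an output directory, for example mydataset_train_lmdb (required)
- a text file listing image names and their labels, for example "train.txt", in the following format: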

img3423.jpg 2
img3424.jpg 13
img3425.jpg 8
...

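If the data is spread across several folders, "train.txt" must contain the absolute path of each file.

create_label_file.py generates the training and validation split for Kaggle's Dogs vs Cats competition, and the same approach works for other tasks.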

create_label_file.py

#!/usr/bin/env python

import sys
import os
import os.path

def main():

  TRAIN_TEXT_FILE = 'train.txt'
  VAL_TEXT_FILE = 'val.txt'
  IMAGE_FOLDER = 'train'

  # Selects 10% of the images (the ones that end in '2') for validation

  fr = open(TRAIN_TEXT_FILE, 'w')
  fv = open(VAL_TEXT_FILE, 'w')

  filenames = os.listdir(IMAGE_FOLDER)
  for filename in filenames:
    if filename[0:3] == 'cat':
      if filename[-5] == '2':# or filename[-5] == '8':
        fv.write(filename + ' 0\n')
      else:
        fr.write(filename + ' 0\n')
    if filename[0:3] == 'dog':
      if filename[-5] == '2':# or filename[-5] == '8':
        fv.write(filename + ' 1\n')
      else:
        fr.write(filename + ' 1\n')

  fr.close()
  fv.close()

# Standard boilerplate to call the main() function to begin the program.
if __name__ == '__main__':
  main()

At test time, labels are assumed not to exist. If labels are available, a test LMDB can be generated in the same way.

4.1 Preparing three-channel data (images)

The following example generates the training LMDB; the working directory is $CAFFE_ROOT:

#!/usr/bin/env sh
# folder containing the training and validation images
TRAIN_DATA_ROOT=/path/to/training/images

# folder containing the file with the name of training images
DATA=/path/to/file
# folder for the lmdb datasets
OUTPUT=/path/to/output/directory
TOOLS=/path/to/caffe/build/tools

# Set to resize the images to 256x256
RESIZE_HEIGHT=256
RESIZE_WIDTH=256
echo "Creating train lmdb..."

# Delete the shuffle line if shuffle is not desired
GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $TRAIN_DATA_ROOT/ \
    $DATA/train.txt \
    $OUTPUT/mydataset_train_lmdb
echo "Done."

Note: in any command in this document where `*` appears directly in front of a variable name, read it as `$`.

Compute the image mean of the LMDB datasets:

#!/usr/bin/env sh
# Compute the mean image in lmdb dataset
OUTPUT=/path/to/output/directory

 # folder for the lmdb datasets and output for mean image
TOOLS=/path/to/caffe/build/tools

$TOOLS/compute_image_mean $OUTPUT/mydataset_train_lmdb \
  $OUTPUT/train_mean.binaryproto

$TOOLS/compute_image_mean $OUTPUT/mydataset_val_lmdb \
  $OUTPUT/val_mean.binaryproto

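4.2 Preparing data with a different number of channels

Grayscale images (1 channel), RADAR images (2 channels), videos (4 channels), image plus depth (4 channels), vibrometry (1 channel), and spectrograms (1 channel) need to be converted before an LMDB can be generated (see the referenced material).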

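4.3 Resizing images

There are two ways to bring an image to a given size:
- warp the image to the target size
- resize proportionally so that the shorter side matches the target size, then center-crop the longer side to the target size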
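
The geometry of the second approach can be worked out with a few lines of arithmetic. The sketch below is illustrative: `resize_then_center_crop` is a hypothetical helper, and it assumes a square target and that the shorter side is scaled to exactly the target size.

```python
def resize_then_center_crop(width, height, target):
    """Return the intermediate (width, height) and the crop box for a square target."""
    # Scale so that the shorter side becomes `target`, keeping the aspect ratio.
    scale = float(target) / min(width, height)
    new_w = max(target, int(round(width * scale)))
    new_h = max(target, int(round(height * scale)))
    # Center-crop the longer side down to `target`.
    left = (new_w - target) // 2
    top = (new_h - target) // 2
    return (new_w, new_h), (left, top, left + target, top + target)

size, box = resize_then_center_crop(640, 480, 256)
print(size, box)
```

For a 640 x 480 image and a 256 x 256 target, the image is first scaled to 341 x 256 and the crop box then removes the left and right margins. Warping, by contrast, would squeeze the full width into 256 pixels and distort the aspect ratio.
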

Tools for resizing images:
- OpenCV*: build/tools/convert_imageset --resize_height=256 --resize_width=256 warps images to the given size; convert_imageset calls ReadImageToDatum, which in turn calls ReadImageToCVMat in src/caffe/util/io.cpp;
- ImageMagick: convert -resize 256x256! warps images to the given size;
- OpenCV: the script tools/extra/resize_and_crop_images.py resizes images proportionally and then center-crops them, using multiple threads

sudo pip install git+https://github.com/Yangqing/mincepie.git
sudo apt-get install -y python-opencv
vi tools/extra/launch_resize_and_crop_images.sh # set number of clients (use num_of_cores*2); file.txt, input, and output folders

In addition, images can be cropped or resized inside the network by setting parameters in the data layer:

layer {
  name: "data"
  transform_param {
    crop_size: 227
...
}
layer {
  name: "data"
  image_data_param {
    new_height: 227
    new_width: 227
...

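5 Training

Training a network requires:
- train_val.prototxt: defines the network architecture, the initialization parameters, and the local learning rates
- solver.prototxt: defines how the parameters are optimized when training the deep network
- deploy.prototxt: used only for testing; essentially the same as train_val.prototxt, except that it has no data layer and no loss layer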

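Parameter initialization is important. The main options are:
- gaussian: sample weights from a Gaussian distribution N(0, std)
- xavier: sample weights from a uniform distribution U(-a, a), where a = sqrt(3/fan_in) and fan_in is the number of incoming inputs
- msra (MSRAFiller): sample weights from a normal distribution N(0, a), where a = sqrt(2/fan_in)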
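
The three fillers can be sketched in NumPy as follows. This is an illustration, not Caffe's code: the function names are hypothetical, fan_in is assumed to be the product of all weight dimensions except the first (the number of outputs), and the a in N(0, a) is treated as a standard deviation.

```python
import numpy as np

def gaussian_fill(shape, std, rng):
    """Sample weights from N(0, std)."""
    return rng.normal(0.0, std, size=shape)

def xavier_fill(shape, rng):
    """Sample weights from U(-a, a) with a = sqrt(3 / fan_in)."""
    fan_in = int(np.prod(shape[1:]))
    a = np.sqrt(3.0 / fan_in)
    return rng.uniform(-a, a, size=shape)

def msra_fill(shape, rng):
    """Sample weights from a normal distribution with std = sqrt(2 / fan_in)."""
    fan_in = int(np.prod(shape[1:]))
    std = np.sqrt(2.0 / fan_in)
    return rng.normal(0.0, std, size=shape)

rng = np.random.RandomState(0)
w_gauss = gaussian_fill((4096, 9216), 0.005, rng)
w_xavier = xavier_fill((20, 1, 5, 5), rng)    # fan_in = 1 * 5 * 5 = 25
w_msra = msra_fill((50, 20, 5, 5), rng)       # fan_in = 20 * 5 * 5 = 500
print(w_gauss.std(), np.abs(w_xavier).max(), w_msra.std())
```

Both xavier and msra shrink the spread of the weights as fan_in grows, which keeps the scale of a layer's output roughly independent of how many inputs feed it.
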

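Learning-rate parameters:
- base_lr: the initial learning rate (default 0.01); reduce it if NaNs appear during training
- lr_mult: the per-layer multiplier; for biases it is usually set to 2x the lr_mult of the non-bias weights

Using LeNet as an example, define lenet_train_test.prototxt, deploy.prototxt, and solver.prototxt.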

solver.prototxt

# network definition
net: "examples/mnist/lenet_train_test.prototxt"

# run a validation test every 500 training iterations
test_interval: 500 
# number of iterations per validation test; the recommended value is num_val_imgs / batch_size
test_iter: 100 

# base learning rate, momentum, and weight decay of the network
base_lr: 0.01
momentum: 0.9 
weight_decay: 0.0005

# learning rate policies
#  fixed: always return base_lr.
#  step: return base_lr * gamma ^ (floor(iter / stepsize))
#  exp: return base_lr * gamma ^ iter
#  inv: return base_lr * (1 + gamma * iter) ^ (- power)
#  multistep: similar to step but it allows non uniform steps defined by stepvalue
#  poly: the effective learning rate follows a polynomial decay, to be zero by the max_iter: return base_lr * (1 - iter/max_iter) ^ (power)
#  sigmoid: the effective learning rate follows a sigmoid decay: return base_lr * ( 1/(1 + exp(-gamma * (iter - stepsize))))
lr_policy: "step"
gamma: 0.1 
stepsize: 10000 # Drop the learning rate in steps by a factor of gamma every stepsize iterations

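# display results every 100 iterations
display: 100 

# maximum number of iterations
max_iter: 10000

# snapshot every 5000 iterations, i.e. save the training state and the model weights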
snapshot: 5000
snapshot_prefix: "examples/mnist/lenet_multistep"

# solver mode: CPU or GPU
solver_mode: CPU
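
The learning-rate policies listed in the solver comments can be written out as a small function to see how a schedule evolves. This is a sketch that follows the formulas in the comments; `learning_rate` is a hypothetical helper and the multistep policy is left out.

```python
import math

def learning_rate(policy, it, base_lr, gamma=0.1, stepsize=1, power=1.0, max_iter=1):
    """Effective learning rate at iteration `it` for the policies listed above."""
    if policy == 'fixed':
        return base_lr
    if policy == 'step':
        return base_lr * gamma ** (it // stepsize)
    if policy == 'exp':
        return base_lr * gamma ** it
    if policy == 'inv':
        return base_lr * (1.0 + gamma * it) ** (-power)
    if policy == 'poly':
        return base_lr * (1.0 - float(it) / max_iter) ** power
    if policy == 'sigmoid':
        return base_lr / (1.0 + math.exp(-gamma * (it - stepsize)))
    raise ValueError('unknown policy: %s' % policy)

# The "step" policy with the values used in the solver above.
for it in (0, 5000, 9999, 10000):
    print(it, learning_rate('step', it, base_lr=0.01, gamma=0.1, stepsize=10000))
```

With stepsize equal to max_iter, as in this solver, the rate stays at base_lr for the whole run and would only drop by a factor of gamma at iteration 10000.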

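Train the network:

$CAFFE_ROOT/build/tools/caffe train -solver solver.prototxt

Training produces two kinds of files, for example:
- lenet_multistep_10000.caffemodel: the network weights, i.e. the model parameters used for testing
- lenet_multistep_10000.solverstate: the solver state, used to resume training if it is interrupted

To train the network and plot accuracy or loss on the validation set against iterations: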

#CHART_TYPE=[0-7]
#  0: Test accuracy  vs. Iters
#  1: Test accuracy  vs. Seconds
#  2: Test loss  vs. Iters
#  3: Test loss  vs. Seconds
#  4: Train learning rate  vs. Iters
#  5: Train learning rate  vs. Seconds
#  6: Train loss  vs. Iters
#  7: Train loss  vs. Seconds
CHART_TYPE=0
$CAFFE_ROOT/build/tools/caffe train -solver solver.prototxt 2>&1 | tee logfile.log
python $CAFFE_ROOT/tools/extra/plot_training_log.py.example $CHART_TYPE name_of_plot.png logfile.log

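Dropout is used on fully connected layers. During the forward pass it randomly deactivates part of the units, which prevents them from co-adapting and so reduces overfitting.
It is ignored during testing.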

layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 4096
    weight_filler {
      type: "gaussian"
      std: 0.005
    }   
    bias_filler {
      type: "constant"
      value: 1
    }   
  }
}
layer {
  name: "relu6"
  type: "ReLU"
  bottom: "fc6"
  top: "fc6"
}
layer {
  name: "drop6"
  type: "Dropout"
  bottom: "fc6"
  top: "fc6"
  dropout_param {
    dropout_ratio: 0.5 
  }
}
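
What the Dropout layer computes can be sketched in NumPy. This is an illustration under assumptions, not Caffe's source: `dropout_forward` is a hypothetical function, and the kept activations are rescaled by 1 / (1 - dropout_ratio) during training so that the test-time pass can leave the input untouched.

```python
import numpy as np

def dropout_forward(x, dropout_ratio, train, rng):
    """Randomly zero a fraction of the activations during training."""
    if not train:
        # Testing: dropout is a no-op.
        return x
    keep = 1.0 - dropout_ratio
    # 1 with probability `keep`, 0 otherwise.
    mask = rng.binomial(1, keep, size=x.shape)
    # Rescale so the expected activation matches the test-time value.
    return x * mask / keep

rng = np.random.RandomState(0)
x = np.ones((1000, 100))
train_out = dropout_forward(x, 0.5, train=True, rng=rng)
test_out = dropout_forward(x, 0.5, train=False, rng=rng)
print(train_out.mean(), test_out.mean())
```

With dropout_ratio: 0.5, about half of the entries in train_out are zero and the rest are doubled, so the mean stays close to that of the input.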

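Estimate the time of forward and backward propagation, without updating the weights:

# time NUMITER=50 forward and backward passes: total time and average time
# the training data and mean.binaryproto may be required
NUMITER=50
/path/to/caffe/build/tools/caffe time --model=train_val.prototxt -iterations $NUMITER

The Linux numactl utility can be used to manage memory allocation:

numactl -i all /path/to/caffe/build/tools/caffe time --model=train_val.prototxt -iterations $NUMITER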

Caffe Model Zoo

The Caffe Model Zoo provides network models and trained weights for a variety of tasks, which is convenient for fine-tuning or testing.

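6 Multinode distributed training

This section is based on Intel's Caffe GitHub wiki. There are two main approaches to multinode distributed training:
- model parallelism
- data parallelism

In model parallelism, the model is split across the nodes and every node processes all of the data;
in data parallelism, the data batch is split across the nodes and every node holds all of the model parameters.
Data parallelism is a good fit when the model has few weights and the data batches are large.
The two can also be combined: layers with few weights, such as convolutional layers, are trained with data parallelism, and layers with many weights, such as fully connected layers, are trained with model parallelism.
The Intel paper "Distributed Deep Learning Using Synchronous Stochastic Gradient Descent" (2016) gives a theoretical analysis of the optimal balance between data parallelism and model parallelism in the hybrid approach.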
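
The idea behind synchronous data parallelism can be checked on a toy problem: when every node holds the same weights and an equally sized shard of the batch, averaging the per-node gradients gives the gradient of the full batch. The sketch below uses a linear least-squares model purely as an illustration; the names and sizes are assumptions and nothing here is Caffe or MPI code.

```python
import numpy as np

def gradient(w, x, y):
    """Gradient of 0.5 * mean((x.dot(w) - y) ** 2) with respect to w."""
    residual = x.dot(w) - y
    return x.T.dot(residual) / len(y)

rng = np.random.RandomState(0)
x = rng.randn(64, 5)   # one batch of 64 samples
y = rng.randn(64)
w = rng.randn(5)       # every node holds the same weights

# Single node: gradient over the whole batch.
full = gradient(w, x, y)

# Four nodes: each computes the gradient on its own shard of 16 samples,
# then the gradients are averaged (the role of the all-reduce step).
num_nodes = 4
shards = zip(np.split(x, num_nodes), np.split(y, num_nodes))
averaged = np.mean([gradient(w, xs, ys) for xs, ys in shards], axis=0)

print(np.allclose(full, averaged))
```

The only data exchanged between nodes is the gradient, whose size equals the number of weights, which is why data parallelism suits models with few weights and large batches.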

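Given the popularity of deep networks with relatively few weights, such as GoogleNet and ResNet, and the success of distributed training with data parallelism, Caffe-Intel supports data parallelism.
Multinode distributed training is also an area of active development.

For multinode training, modify Makefile.config: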

USE_MPI := 1
# update with the path to binary MPI library
CXX := /usr/bin/mpicxx

Running multinode training is then straightforward:

mpirun --hostfile path/to/hostfile -n <num_processes> /path/to/caffe/build/tools/caffe train --solver=/path/to/solver.prototxt --param_server=mpi

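where
- <num_processes>: the number of nodes to use
- hostfile: a file containing the IP address of one node per line

solver.prototxt specifies the train.prototxt for each node, and each train.prototxt must point to a different part of the dataset. For more details, see the related material.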

7 Fine-tuning

Fine-tuning reuses the network architecture defined in the prototxt, with two main changes:

  • 1 Change the data layer to fit the new data
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  transform_param {
    scale: 0.00390625 # 1/255
  }
  data_param {
    source: "newdata_lmdb" # point to the new dataset
    batch_size: 64
    backend: LMDB
  }
}
  • 2 Change the output layer, here the ip2 layer (note: make the same change in deploy.prototxt)
layer {
  name: "ip2-ft" # rename the layer
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2-ft" # rename the layer output
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 2 # set to the number of classes in the new dataset, here 2
    bias_filler {
      type: "constant"
    }
  }
}

Fine-tuning in Caffe:

#From the command line on $CAFFE_ROOT
./build/tools/caffe train -solver /path/to/solver.prototxt -weights  /path/to/trained_model.caffemodel

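Fine-tuning tips:
- First learn only the last (output) layer and leave the other layers unchanged
- Reduce the initial learning rate, typically by 10x or 100x
- Use lr_mult to set a local learning rate for individual layers
- For fast optimization, freeze every layer except the last one or two by setting their local learning rate to lr_mult = 0
- Increase the local learning rate of the last layer by 10x and that of the second-to-last layer by 5x
- If the result is good enough, stop; otherwise fine-tune the other layers as well

What fine-tuning does:
- creates a new network
- copies the pretrained weights to initialize the network
- then proceeds like ordinary training; see the examples.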

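8 Testing

Testing, also called inference, classification, or scoring, can be done with the Python interface or the C++ tools that Caffe provides. The C++ tools are less flexible, so Python is recommended.
Classifying an image, a signal, or a set of images requires:
- the image(s)
- the network architecture
- the network weights

8.1 Testing a set of images

The model's prototxt should contain a TEST-phase data layer that points to the test dataset, so that the model's performance can be measured:

/path/to/caffe/build/tools/caffe test -model /path/to/train_val.prototxt \
  -weights /path/to/trained_model.caffemodel -iterations <num_iter>

This example is based on the referenced material.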

8.2 Testing a single image

First, download the trained model that will be used to classify the image:

./scripts/download_model_binary.py models/bvlc_reference_caffenet

Next, download the dataset labels, which map the network's predictions to image categories; ILSVRC2012 is used here:

./data/ilsvrc12/get_ilsvrc_aux.sh

Finally, classify the image:

./build/examples/cpp_classification/classification.bin \
  models/bvlc_reference_caffenet/deploy.prototxt \
  models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel \
  data/ilsvrc12/imagenet_mean.binaryproto \
  data/ilsvrc12/synset_words.txt \
  examples/images/cat.jpg

The output looks like this:

---------- Prediction for examples/images/cat.jpg ----------
0.3134 - "n02123045 tabby, tabby cat"
0.2380 - "n02123159 tiger cat"
0.1235 - "n02124075 Egyptian cat"
0.1003 - "n02119022 red fox, Vulpes vulpes"
0.0715 - "n02127052 lynx, catamount"
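
How a ranked list like this is obtained from a network's outputs can be sketched in NumPy. This is an illustration only: `top_k`, the shortened label list, and the raw scores are made up for the example, and the softmax step assumes the input holds unnormalized scores rather than probabilities.

```python
import numpy as np

def top_k(scores, labels, k=5):
    """Turn raw class scores into probabilities and return the k most likely classes."""
    # Numerically stable softmax.
    exp = np.exp(scores - np.max(scores))
    probs = exp / exp.sum()
    # Indices of the k largest probabilities, highest first.
    order = np.argsort(probs)[::-1][:k]
    return [(float(probs[i]), labels[i]) for i in order]

# Hypothetical labels and scores, for illustration only.
labels = ['tabby', 'tiger cat', 'Egyptian cat', 'red fox', 'lynx', 'goldfish']
scores = np.array([3.0, 2.7, 2.1, 1.9, 1.5, -1.0])

for prob, name in top_k(scores, labels):
    print('%.4f - "%s"' % (prob, name))
```

The printed probabilities are in descending order and sum to less than 1, because the classes outside the top five also receive some probability mass.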

9 Feature extraction and visualization

The weights of a convolutional layer are stored in the layout output_feature_maps x input_feature_maps x height x width; feature maps are also called channels. Caffe offers two ways to extract features: the Python API and the C++ API.

# download the model weights
scripts/download_model_binary.py models/bvlc_reference_caffenet

# Generate a list of the files to process
# Use the images that ship with caffe
find `pwd`/examples/images -type f -exec echo {} \; > examples/images/test.txt

# Add a 0 to the end of each line
# input data structures expect labels after each image file name
sed -i "s/$/ 0/" examples/images/test.txt

# Get the mean of the training set to subtract it from images
./data/ilsvrc12/get_ilsvrc_aux.sh

# Copy and modify the data layer to load and resize the images:
cp examples/feature_extraction/imagenet_val.prototxt examples/images
vi examples/images/imagenet_val.prototxt

# extract the features
./build/tools/extract_features.bin models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel \
  examples/images/imagenet_val.prototxt fc7 examples/images/features 10 lmdb

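This extracts the features of the fc7 layer, which represent the highest-level features of the model. Features of other layers, such as conv5 or pool3, can be extracted in the same way. Of the final arguments, 10 is the number of mini-batches to process and lmdb is the database type; the extracted features are saved in the LMDB folder examples/images/features.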

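10 Python API

Caffe provides a Python API for testing, classification, feature extraction, network definition, and network training.

10.1 Setting up the Caffe Python API

After building Caffe, also run make pycaffe; once that succeeds, the module can be imported:

import sys 
CAFFE_ROOT = '/path/to/caffe/' # set this to the correct path
sys.path.insert(0, CAFFE_ROOT + 'python')
import caffe
caffe.set_mode_cpu() # CPU mode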

10.2 Network loading API

The network architecture is defined in train_val.prototxt or deploy.prototxt:

net = caffe.Net('train_val.prototxt', caffe.TRAIN)

To load trained weights as well:

net = caffe.Net('deploy.prototxt', 'trained_model.caffemodel', caffe.TRAIN)

net holds the data blobs (net.blobs) and the parameter blobs (net.params). Taking the conv1 layer as an example:
- net.blobs['conv1']: the output data of conv1, also called feature maps
- net.params['conv1'][0]: the weights of conv1
- net.params['conv1'][1]: the biases of conv1
- net.blobs.items(): the data blobs of all layers

10.3 Network visualization API

This requires the pydot and graphviz packages:

sudo apt-get install -y graphviz
sudo pip install pydot

Use Caffe's draw_net.py script to draw the network:

python python/draw_net.py examples/net_surgery/deploy.prototxt train_val_net.png
open train_val_net.png

10.4 Data input API

  • Option 1: reshape the data layer to match the image size
import numpy as np
from PIL import Image
# get input image and arrange it as a 4-D tensor
im = np.array(Image.open('/path/to/caffe/examples/images/cat_gray.jpg'))
im = im[np.newaxis, np.newaxis, :, :]
# resize the blob to be the size of the input image
net.blobs['data'].reshape(*im.shape) # if the image input is different 
# compute the blobs given the input data
net.blobs['data'].data[...] = im
  • Option 2: resize the input image to match the size of the network's data layer
# load_image returns an H x W x 3 float array; this assumes a 3-channel data blob
im = caffe.io.load_image('/path/to/caffe/examples/images/cat_gray.jpg')
shape = net.blobs['data'].data.shape
# resize the image to the height and width of the data blob
im = caffe.io.resize_image(im, (shape[2], shape[3]))
# move the channels first (H x W x C -> C x H x W), then copy into the blob
net.blobs['data'].data[...] = im.transpose(2, 0, 1)
  • The input data usually also needs the same transformations that the data layer applies
net = caffe.Net('deploy.prototxt', 'trained_model.caffemodel', caffe.TRAIN)
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
ilsvrc_mean = 'python/caffe/imagenet/ilsvrc_2012_mean.npy'
transformer.set_mean('data', np.load(ilsvrc_mean).mean(1).mean(1))
# puts the channel as the first dimension
transformer.set_transpose('data', (2,0,1))
# (2,1,0) maps RGB to BGR for example
transformer.set_channel_swap('data', (2,1,0))
transformer.set_raw_scale('data', 255.0)
# the batch size can be changed on-the-fly
net.blobs['data'].reshape(1,3,227,227)
# load the image in the data layer
im = caffe.io.load_image('/path/to/caffe/examples/images/cat_gray.jpg')
# transform the image and store it in the net.blob
net.blobs['data'].data[...] = transformer.preprocess('data', im)
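
The steps configured on the transformer can be mimicked in plain NumPy to see what ends up in the blob. This is a sketch under assumptions rather than the pycaffe implementation: `preprocess` is a hypothetical function, the image is an H x W x 3 RGB array with values in [0, 1], and the steps are applied in the order transpose, channel swap, raw scale, mean subtraction.

```python
import numpy as np

def preprocess(im, transpose=(2, 0, 1), channel_swap=(2, 1, 0),
               raw_scale=255.0, mean=None):
    """Convert an H x W x C RGB image in [0, 1] to a C x H x W BGR array in [0, 255]."""
    out = im.astype(np.float32)
    # H x W x C -> C x H x W
    out = out.transpose(transpose)
    # RGB -> BGR
    out = out[list(channel_swap), :, :]
    # [0, 1] -> [0, 255]
    out = out * raw_scale
    if mean is not None:
        # Subtract one mean value per (already swapped) channel.
        out = out - np.asarray(mean, dtype=np.float32).reshape(-1, 1, 1)
    return out

im = np.zeros((4, 6, 3), dtype=np.float32)
im[..., 0] = 1.0  # a pure red image
blob = preprocess(im, mean=(104, 117, 123))
print(blob.shape, blob[:, 0, 0])
```

For the pure red test image, the red value ends up in the last channel after the swap, so the first pixel becomes -104, -117, and 132 in B, G, R order.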

Display the image:

import matplotlib.pyplot as plt
plt.imshow(im)

10.5 Inference API

Network prediction for the input image:

# assumes that images are loaded
prediction = net.forward()
print 'predicted class:', prediction['prob'].argmax()

The time of forward propagation (excluding data preprocessing) can also be measured with the IPython timeit magic:

timeit net.forward()

Caffe also provides a Python API, caffe.Classifier, that transforms and classifies several inputs at once; it can replace caffe.Net and caffe.io.Transformer.

im1 = caffe.io.load_image('/path/to/caffe/examples/images/cat.jpg')
im2 = caffe.io.load_image('/path/to/caffe/examples/images/fish-bike.jpg')
imgs = [im1, im2]
ilsvrc_mean = '/path/to/caffe/python/caffe/imagenet/ilsvrc_2012_mean.npy'
net = caffe.Classifier('deploy.prototxt', 'trained_model.caffemodel',
                       mean=np.load(ilsvrc_mean).mean(1).mean(1),
                       channel_swap=(2,1,0),
                       raw_scale=255,
                       image_dims=(256, 256))
prediction = net.predict(imgs) # predict takes any number of images
print 'predicted classes:', prediction[0].argmax(), prediction[1].argmax()

For a folder containing many images, only the imgs part needs to change:

IMAGES_FOLDER = '/path/to/folder/w/images/'
import os
images = os.listdir(IMAGES_FOLDER)
imgs = [ caffe.io.load_image(IMAGES_FOLDER + im) for im in images ]
plt.plot(prediction[0])  # visualize the probabilities of all classes
timeit net.predict([im1])  # timing
timeit net.predict([im1], oversample=0)

10.6 Feature extraction and visualization API

Using the fc7 layer as an example:

# Retrieve details of the network's layers
[(k, v.data.shape) for k, v in net.blobs.items()]

# Retrieve weights of the network's layers
[(k, v[0].data.shape) for k, v in net.params.items()]

# Retrieve the features in the last fully connected layer
# prior to outputting class probabilities
feat = net.blobs['fc7'].data[4]

# Retrieve size/dimensions of the array
feat.shape

# Assumes that the "net = caffe.Classifier" module has been called
# and data has been formatted as in the example above

# Take an array of shape (n, height, width) or (n, height, width, channels)
# and visualize each (height, width) section in a grid
# of size approx. sqrt(n) by sqrt(n)
def vis_square(data, padsize=1, padval=0):
    # values between 0 and 1
    data -= data.min()
    data /= data.max()

    # force the number of filters to be square
    n = int(np.ceil(np.sqrt(data.shape[0])))
    padding = ((0, n ** 2 - data.shape[0]), (0, padsize), (0, padsize)) + ((0, 0),) * (data.ndim - 3)
    data = np.pad(data, padding, mode='constant', constant_values=(padval, padval))

    # tile the filters into an image
    data = data.reshape((n, n) + data.shape[1:]).transpose((0, 2, 1, 3) + tuple(range(4, data.ndim + 1)))
    data = data.reshape((n * data.shape[1], n * data.shape[3]) + data.shape[4:])

    plt.imshow(data)

plt.rcParams['figure.figsize'] = (25.0, 20.0)

# visualize the weights after the 1st conv layer
net.params['conv1'][0].data.shape
filters = net.params['conv1'][0].data
vis_square(filters.transpose(0, 2, 3, 1))

# visualize the feature maps after 1st conv layer
net.blobs['conv1'].data.shape
feat = net.blobs['conv1'].data[0,:96]
vis_square(feat, padval=1)

# visualize the feature maps after the 2nd conv layer
net.blobs['conv2'].data.shape
feat = net.blobs['conv2'].data[0,:96]
vis_square(feat, padval=1)

# visualize the feature maps after the 2nd pool layer
net.blobs['pool2'].data.shape
feat = net.blobs['pool2'].data[0,:256] # change 256 to number of pool outputs
vis_square(feat, padval=1)

# Visualize the neuron activations for the 2nd fully-connected layer
net.blobs['ip2'].data.shape
feat = net.blobs['ip2'].data[0]
plt.plot(feat.flat)
plt.legend()
plt.show()

10.7 Network definition API

from caffe import layers as L
from caffe import params as P

def lenet(lmdb, batch_size):
    # auto generated LeNet
    n = caffe.NetSpec()
    n.data, n.label = L.Data(batch_size=batch_size, backend=P.Data.LMDB, source=lmdb, transform_param=dict(scale=1./255), ntop=2)
    n.conv1 = L.Convolution(n.data, kernel_size=5, num_output=20, weight_filler=dict(type='xavier'))
    n.pool1 = L.Pooling(n.conv1, kernel_size=2, stride=2, pool=P.Pooling.MAX)
    n.conv2 = L.Convolution(n.pool1, kernel_size=5, num_output=50, weight_filler=dict(type='xavier'))
    n.pool2 = L.Pooling(n.conv2, kernel_size=2, stride=2, pool=P.Pooling.MAX)
    n.ip1 = L.InnerProduct(n.pool2, num_output=500, weight_filler=dict(type='xavier'))
    n.relu1 = L.ReLU(n.ip1, in_place=True)
    n.ip2 = L.InnerProduct(n.relu1, num_output=10, weight_filler=dict(type='xavier'))
    n.loss = L.SoftmaxWithLoss(n.ip2, n.label)
    return n.to_proto()

with open('examples/mnist/lenet_auto_train.prototxt', 'w') as f:
    f.write(str(lenet('examples/mnist/mnist_train_lmdb', 64)))

with open('examples/mnist/lenet_auto_test.prototxt', 'w') as f:
    f.write(str(lenet('examples/mnist/mnist_test_lmdb', 100)))

The generated prototxt file looks like this:

layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  transform_param {
    scale: 0.00392156862745
  }
  data_param {
    source: "examples/mnist/mnist_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 20
    kernel_size: 5
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  convolution_param {
    num_output: 50
    kernel_size: 5
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
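
To sanity-check a definition like this one, the spatial size of each blob can be worked out by hand. The sketch below assumes 28x28 MNIST inputs, no padding, a convolution stride of 1, and the usual formula out = (in - kernel) / stride + 1. The helper name out_size is made up for illustration and is not part of pycaffe.

```python
def out_size(in_size, kernel, stride=1):
    """Spatial output size of a conv/pool layer without padding."""
    return (in_size - kernel) // stride + 1

size = 28                                   # MNIST images are 28x28
size = out_size(size, kernel=5)             # conv1 -> 24
size = out_size(size, kernel=2, stride=2)   # pool1 -> 12
size = out_size(size, kernel=5)             # conv2 -> 8
size = out_size(size, kernel=2, stride=2)   # pool2 -> 4

ip1_inputs = 50 * size * size               # conv2 has 50 output channels
print(size, ip1_inputs)                     # prints: 4 800
```

So ip1 sees 800 inputs per image and maps them to 500 outputs. Caffe rounds pooling output sizes up rather than down, but for these layer sizes the divisions are exact, so the result is the same.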

10.8 Network training API

solver = caffe.get_solver('models/bvlc_reference_caffenet/solver.prototxt')
net = solver.net  # the training net owned by the solver; gradients computed below live here
solver.net.forward()  # train net
solver.test_nets[0].forward()  # test net (there can be more than one)

solver.net.backward()  # compute gradients
# data gradients
net.blobs['conv1'].diff
# weight gradients
net.params['conv1'][0].diff
# bias gradients
net.params['conv1'][1].diff

solver.step(1)  # run one iteration: a forward pass, a backward pass and a weight update

solver.solve()  # run until max_iter, as defined in solver.prototxt
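
Conceptually, each solver iteration is a forward pass, a backward pass and a weight update. The sketch below imitates that cycle in plain Python for a one-weight model with loss (w*x - y)^2, using a momentum update of the form V = momentum*V - lr*grad, W = W + V, which is the form Caffe documents for its SGD solver. The function sgd_step and all values are illustrative assumptions; this is not pycaffe code.

```python
def sgd_step(w, v, x, y, lr=0.1, momentum=0.9):
    """One toy iteration: forward pass, backward pass, momentum update."""
    pred = w * x                      # forward pass
    loss = (pred - y) ** 2
    grad = 2.0 * (pred - y) * x       # backward pass: d(loss)/dw
    v = momentum * v - lr * grad      # update the velocity
    w = w + v                         # apply the velocity to the weight
    return w, v, loss

w, v = 0.0, 0.0
losses = []
for _ in range(200):                  # loosely analogous to solver.step(200)
    w, v, loss = sgd_step(w, v, x=1.0, y=3.0)
    losses.append(loss)

print(w, losses[0], losses[-1])       # w approaches 3.0 and the loss shrinks
```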

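11 Debugging

Debugging is optional and is meant for Caffe developers only.
Useful debugging techniques:
- Remove randomness
- Compare caffemodels
- Use Caffe's debug info

Removing randomness makes experiments reproducible, so their outputs can be compared. Randomness enters at several stages, for example:
- Weights are initialized randomly, typically by sampling from a probability distribution such as a Gaussian
- Input images are randomly mirrored horizontally, randomly cropped, and shuffled into a random order
- Dropout layers randomly select a subset of units to train in each iteration and ignore the rest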

One solution is to set a seed by adding the following to solver.prototxt:

# pick some value for random_seed that is greater or equal to 1, for example:
random_seed: 42

This ensures that the same 'random' values are used on every run. However, the same seed may produce different values on different machines.
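
The effect of a fixed seed can be illustrated outside Caffe with Python's standard random module: two generators created with the same seed yield the same 'random' sequence, which is what random_seed aims for inside the solver. This is only an analogy, since Caffe uses its own random number generators, and the helper name draw is made up.

```python
import random

def draw(seed, n=5):
    """Return n pseudo-random floats from a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]

same = draw(42) == draw(42)        # same seed, same sequence
different = draw(42) != draw(43)   # another seed, another sequence
print(same, different)             # prints: True True
```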
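A more robust approach across machines is:
- Prepare the data with the images already shuffled in one fixed order, so that they are not reshuffled for each experiment
- In the ImageData layer of train.prototxt, define transform_param without cropping or mirroring: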

layer {
  name: "data"
  type: "ImageData"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
 #   mirror: true
 #   crop_size: 227
    mean_value: 104
    mean_value: 117
    mean_value: 123
  }
  image_data_param {
    source: "/path/to/file/train.txt"
    batch_size: 32
    new_height: 224
    new_width: 224
  }
}
- In the dropout layers of train.prototxt, set dropout_ratio: 0
- In solver.prototxt, set lr_policy: "fixed"
- In solver.prototxt, add debug_info: 1

To compare two caffemodels, the following script sums the absolute differences between all of their weights and biases:

# Intel Corporation
# Author: Ravi Panchumarthy

import argparse
import numpy as np

def get_args():
    parser = argparse.ArgumentParser('Compare weights of two caffe models')

    parser.add_argument('-m1', dest='modelFile1', type=str, required=True,
                        help='Caffe model weights file to compare')
    parser.add_argument('-m2', dest='modelFile2', type=str, required=True,
                        help='Caffe model weights file to compare against')
    parser.add_argument('-n', dest='netFile', type=str, required=True,
                        help='Network prototxt file associated with model')
    return parser.parse_args()

if __name__ == "__main__":
    import caffe

    args = get_args()
    net = caffe.Net(args.netFile, args.modelFile1, caffe.TRAIN)
    net2compare = caffe.Net(args.netFile, args.modelFile2, caffe.TRAIN)

    wt_sumOfAbsDiffByName = dict()
    bias_sumOfAbsDiffByName = dict()

    for name, blobs in net.params.items():  # items() works on both Python 2 and 3
        wt_diffTensor = np.subtract(net.params[name][0].data, net2compare.params[name][0].data)
        wt_absDiffTensor = np.absolute(wt_diffTensor)
        wt_sumOfAbsDiff = wt_absDiffTensor.sum()
        wt_sumOfAbsDiffByName.update({name : wt_sumOfAbsDiff})

        # if args.layerDebug == 1:
        #     print("%s : %s" % (name,wt_sumOfAbsDiff))

        if len(blobs) > 1:  # skip layers that have no bias blob
            bias_diffTensor = np.subtract(net.params[name][1].data, net2compare.params[name][1].data)
            bias_absDiffTensor = np.absolute(bias_diffTensor)
            bias_sumOfAbsDiff = bias_absDiffTensor.sum()
            bias_sumOfAbsDiffByName.update({name : bias_sumOfAbsDiff})

    print("\nThe sum of absolute difference of all layer's weight is : %s" % sum(wt_sumOfAbsDiffByName.values()))
    print("The sum of absolute difference of all layer's bias is : %s" % sum(bias_sumOfAbsDiffByName.values()))

    finalDiffVal = sum(wt_sumOfAbsDiffByName.values())+ sum(bias_sumOfAbsDiffByName.values())
    print("The sum of absolute difference of all layers weight's and bias's is : %s" % finalDiffVal )
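
The core of the script is a sum of absolute differences over every parameter blob. A minimal, Caffe-free sketch of that computation follows. It uses hypothetical dictionaries of flat weight lists in place of net.params; the layer names and values are invented, and sum_abs_diff is not a Caffe function.

```python
def sum_abs_diff(params_a, params_b):
    """Sum |a - b| over all values of all layers present in params_a."""
    total = 0.0
    for name, values_a in params_a.items():
        values_b = params_b[name]
        total += sum(abs(a - b) for a, b in zip(values_a, values_b))
    return total

model1 = {"conv1": [0.5, -1.0, 2.0], "ip1": [1.0, 1.0]}
model2 = {"conv1": [0.5, -0.5, 2.0], "ip1": [1.0, 3.0]}

print(sum_abs_diff(model1, model1))   # prints: 0.0
print(sum_abs_diff(model1, model2))   # prints: 2.5
```

A total of 0 means the two models hold identical parameters, which is the outcome to look for once the sources of randomness listed above have been removed.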

For further debugging, uncomment DEBUG := 1 in Makefile.config, rebuild Caffe, and start it under gdb:

gdb /path/to/caffe/build/caffe

Once gdb has started, run:

run train -solver /path/to/solver.prototxt

12 Examples

12.1 LeNet on MNIST (handwritten digits)

# Prepare the dataset
cd $CAFFE_ROOT
./data/mnist/get_mnist.sh # downloads MNIST dataset
./examples/mnist/create_mnist.sh # creates dataset in LMDB format

# Train the model
# Reduce the number of iterations from 10K to 1K to quickly run through this example
sed -i 's/max_iter: 10000/max_iter: 1000/g' examples/mnist/lenet_solver.prototxt
./build/tools/caffe train -solver examples/mnist/lenet_solver.prototxt

# Time the forward and backward propagation
./build/tools/caffe time --model=examples/mnist/lenet_train_test.prototxt -iterations 50 # runs on CPU

# Test the model
# the file with the model should have a 'phase: TEST'
./build/tools/caffe test -model examples/mnist/lenet_train_test.prototxt \
  -weights examples/mnist/lenet_iter_1000.caffemodel -iterations 50

12.2 Dogs vs Cats

Download the Dogs vs Cats dataset from Kaggle, unzip dogvscat.zip, and run dogvscat.sh:

#!/usr/bin/env sh
CAFFE_ROOT=/path/to/caffe
mkdir dogvscat
DOG_VS_CAT_FOLDER=/path/to/dogvscat

cd $DOG_VS_CAT_FOLDER
## Download datasets (requires first a login)
#https://www.kaggle.com/c/dogs-vs-cats/download/train.zip
#https://www.kaggle.com/c/dogs-vs-cats/download/test1.zip

# Unzip train and test data
sudo apt-get -y install unzip
unzip train.zip -d .
unzip test1.zip -d .

# Format data
python create_label_file.py # creates 2 text files with labels for training and validation
./build_datasets.sh # build lmdbs

# Download ImageNet pretrained weights (takes ~20 min)
$CAFFE_ROOT/scripts/download_model_binary.py $CAFFE_ROOT/models/bvlc_reference_caffenet

# Fine-tune weights in the AlexNet architecture (takes ~100 min)
$CAFFE_ROOT/build/tools/caffe train -solver $DOG_VS_CAT_FOLDER/dogvscat_solver.prototxt \
    -weights $CAFFE_ROOT/models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel

# Classify test dataset
cd $DOG_VS_CAT_FOLDER
python convert_binaryproto2npy.py
python dogvscat_classify.py # Returns prediction.txt (takes ~30 min)

# A better approach is to train five AlexNets w/init parameters from the same distribution,
# fine-tune those five, and compute the average of the five networks
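
The ensemble idea in the last comment can be sketched as follows: each fine-tuned network outputs class probabilities for an image, and the ensemble prediction is the class with the highest mean probability. The probabilities below are invented, and average_predictions is a hypothetical helper, not part of the dogvscat scripts.

```python
def average_predictions(prob_lists):
    """Average per-class probabilities produced by several networks."""
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    return [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]

# [P(cat), P(dog)] for one image, from five hypothetical networks
probs = [
    [0.60, 0.40],
    [0.45, 0.55],
    [0.70, 0.30],
    [0.55, 0.45],
    [0.40, 0.60],
]
mean = average_predictions(probs)
label = max(range(len(mean)), key=lambda c: mean[c])   # index of the best class
print(label)   # prints: 0, the cat class in this toy example
```

Two of the five networks vote for "dog" here, but the averaged probabilities still favor "cat", which is how averaging smooths out the errors of individual networks.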

12.3 PASCAL VOC Classification

Unzip voc2012.zip and run voc2012.sh to train AlexNet:

#!/usr/bin/env sh

# Copy and unzip voc2012.zip (it contains this file) then run this file. But first
#  change paths in: voc2012.sh; build_datasets.sh; solvers/*; nets/*; classify.py

# As you run various files, you can ignore the following error if it shows up:
#  libdc1394 error: Failed to initialize libdc1394

# set Caffe root directory
CAFFE_ROOT=$CAFFE_ROOT
VOC=/path/to/voc2012

chmod 700 *.sh

# Download datasets
# Details: http://host.robots.ox.ac.uk/pascal/VOC/voc2012/index.html#devkit
if [ ! -f VOCtrainval_11-May-2012.tar ]; then
  wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
fi
# VOCtrainval_11-May-2012.tar contains the VOC folder with:
#  JPEGImages: all jpg images
#  Annotations: objects and corresponding bounding box/pose/truncated/occluded per jpg
#  ImageSets: breaks the images by the type of task they are used for
#  SegmentationClass and SegmentationObject: segmented images (duplicate directories)
tar -xvf VOCtrainval_11-May-2012.tar

# Run Python scripts to create labeled text files
python create_labeled_txt_file.py

# Execute shell script to create training and validation lmdbs
# Note that lmdbs directories w/the same name cannot exist prior to creating them
./build_datasets.sh

# Execute following command to download caffenet pre-trained weights (takes ~20 min)
#  if weights exist already then the command is ignored
$CAFFE_ROOT/scripts/download_model_binary.py $CAFFE_ROOT/models/bvlc_reference_caffenet

# Fine-tune weights in the AlexNet architecture (takes ~60 min)
# you can also choose one of six solvers: pascal_solver[1-6].prototxt
$CAFFE_ROOT/build/tools/caffe train -solver $VOC/solvers/voc2012_solver.prototxt \
  -weights $CAFFE_ROOT/models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel

# The lines below are not really needed; they served as examples on how to do some tasks

# Test against voc2012_val_lmdb dataset (name of lmdb is in the model under phase: TEST)
$CAFFE_ROOT/build/tools/caffe test -model $VOC/nets/voc2012_train_val_ft678.prototxt \
  -weights $VOC/weights_iter_5000.caffemodel -iterations 116

# Classify validation dataset: returns a file w/the labels of the val dataset
#  but it doesn't report accuracy (that would require some adjusting of the code)
python convert_binaryproto2npy.py
mkdir results
python cls_confidence.py
python average_precision.py
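
The contents of average_precision.py are not shown here. As a rough sketch of what such a metric computes, the code below ranks images by classifier confidence for one class and averages the precision measured at each true positive. This is the plain, non-interpolated form of average precision; the official PASCAL VOC evaluation may differ in details such as interpolation. The scores and labels are invented.

```python
def average_precision(scores, labels):
    """Non-interpolated AP: mean precision at the rank of each positive."""
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, is_positive) in enumerate(ranked, start=1):
        if is_positive:
            hits += 1
            precisions.append(hits / float(rank))
    return sum(precisions) / len(precisions) if precisions else 0.0

scores = [0.9, 0.8, 0.7, 0.6, 0.5]   # confidence that the image contains the class
labels = [1, 0, 1, 1, 0]             # ground truth: 1 = class present
ap = average_precision(scores, labels)
print(round(ap, 4))                  # prints: 0.8056
```

With 20 classes, one such value is computed per class, and the mean over the classes gives a single summary score.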

VOC dataset details:
- PASCAL VOC datasets
- 20 classes
- Training: 5,717 images, 13,609 objects
- Validation: 5,823 images, 13,841 objects
- Testing: 10,991 images

13 Related material

Last modification:October 9th, 2018 at 09:31 am