Caffe - Training and Deploying Deep Learning Networks with the Caffe-Intel Framework [Translation]

Author: AIHGF
Published: May 18, 2018
Category: Image classification

Original article - Manage Deep Learning Networks with Caffe* Optimized for Intel® Architecture.

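Contents:
- Abstract
- Installation
- Data layers
- Dataset preparation
- Training
- Multinode distributed training
- Fine-tuning
- Testing
- Feature extraction and visualization
- Using the Python API
- Debugging
- Examples
- Caffe use cases
- Further reading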

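1 Abstract

Caffe is a deep learning framework developed by BVLC. It is written in C++ and CUDA C++ and provides Python and Matlab interfaces. The framework is well suited to convolutional neural networks (CNNs), recurrent neural networks (RNNs), and multilayer perceptrons. Branches also exist for detection, classification, and segmentation, as well as a Spark-compatible branch.

Caffe optimized for Intel architecture (Caffe-Intel) integrates the Intel Math Kernel Library (Intel MKL) 2017 and is optimized for the Advanced Vector Extensions (AVX)-2 and AVX-512 instruction sets, so it supports Intel Xeon and Intel Xeon Phi processors. Caffe-Intel therefore keeps all the advantages of BVLC Caffe while running efficiently on Intel architecture, and it can distribute training across many nodes.

This document explains how to build Caffe optimized for Intel architecture, how to train network models on one or more compute nodes, and how to deploy networks. It also covers several Caffe features in detail, such as fine-tuning, feature extraction and visualization for different models, and the Python API.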

Terminology:
- weights - also called kernels, filters, templates, or feature extractors;
- blob - also called a tensor: an N-dimensional data structure that holds data, gradients, or weights (including biases);
- units - also called neurons; they apply a nonlinear transformation to a data blob;
- feature maps - also called channels;
- testing - also called inference, classification, scoring, or deployment;
- model - also called topology or architecture.

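Quick start with Caffe:
- Install Caffe-Intel
- Train and test LeNet on the MNIST dataset with Caffe-Intel
- Test a trained model, e.g. bvlc_googlenet.caffemodel, on a few images such as cat and fish-bike
- Fine-tune an existing model on the Cats vs Dogs Challenge

2 Installing Caffe-Intel

This section covers installation on Ubuntu 14.04 only. For other Linux distributions and OS X, BVLC provides official installation instructions.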

sudo apt-get update &&
sudo apt-get -y install build-essential git cmake &&
sudo apt-get -y install libprotobuf-dev libleveldb-dev libsnappy-dev &&
sudo apt-get -y install libopencv-dev libhdf5-serial-dev protobuf-compiler &&
sudo apt-get -y install --no-install-recommends libboost-all-dev &&
sudo apt-get -y install libgflags-dev libgoogle-glog-dev liblmdb-dev &&
sudo apt-get -y install libatlas-base-dev

On Ubuntu 16.04, the HDF5 headers and libraries additionally need to be linked:

find . -type f -exec sed -i -e 's^"hdf5.h"^"hdf5/serial/hdf5.h"^g' -e 's^"hdf5_hl.h"^"hdf5/serial/hdf5_hl.h"^g' '{}' \;
cd /usr/lib/x86_64-linux-gnu
sudo ln -s libhdf5_serial.so.10.1.0 libhdf5.so
sudo ln -s libhdf5_serial_hl.so.10.0.2 libhdf5_hl.so

On CentOS 7, install the following dependencies:

sudo yum -y update &&
sudo yum -y groupinstall "Development Tools" &&
sudo yum -y install wget cmake git &&
sudo yum -y install protobuf-devel protobuf-compiler boost-devel &&
sudo yum -y install snappy-devel opencv-devel atlas-devel &&
sudo yum -y install gflags-devel glog-devel lmdb-devel leveldb-devel hdf5-devel

# The following steps are only required if some packages failed to install
# add EPEL repository then install missing packages
wget https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
sudo rpm -ivh epel-release-latest-7.noarch.rpm
sudo yum -y install gflags-devel glog-devel lmdb-devel leveldb-devel hdf5-devel &&
sudo yum -y install protobuf-devel protobuf-compiler boost-devel

# if packages are still not found--download and install/build the packages, e.g.,
# snappy:
wget http://mirror.centos.org/centos/7/os/x86_64/Packages/snappy-devel-1.1.0-3.el7.x86_64.rpm
sudo yum -y install http://mirror.centos.org/centos/7/os/x86_64/Packages/snappy-devel-1.1.0-3.el7.x86_64.rpm
# atlas:
wget http://mirror.centos.org/centos/7/os/x86_64/Packages/atlas-devel-3.10.1-10.el7.x86_64.rpm
sudo yum -y install http://mirror.centos.org/centos/7/os/x86_64/Packages/atlas-devel-3.10.1-10.el7.x86_64.rpm
# opencv:
wget https://github.com/Itseez/opencv/archive/2.4.13.zip
unzip 2.4.13.zip
cd opencv-2.4.13/
mkdir build && cd build
cmake -DCMAKE_INSTALL_PREFIX:PATH=/usr/local ..
NUM_THREADS=$(($(grep 'core id' /proc/cpuinfo | sort -u | wc -l)*2))
make all -j $NUM_THREADS
sudo make install -j $NUM_THREADS

# optional (not required for Caffe)
# other useful repositories for CentOS are RepoForge and IUS:
wget http://pkgs.repoforge.org/rpmforge-release/rpmforge-release-0.5.3-1.el7.rf.x86_64.rpm
sudo rpm -Uvh rpmforge-release-0.5.3-1.el7.rf.x86_64.rpm
wget https://rhel7.iuscommunity.org/ius-release.rpm
sudo rpm -Uvh ius-release*.rpm

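What each dependency is for (source):
- boost - a C++ library; Caffe uses its math functions and shared pointers;
- glog, gflags - logging and command-line utilities, essential for debugging;
- leveldb, lmdb - database I/O, used when preparing data;
- protobuf - used to define data structures efficiently;
- BLAS (Basic Linear Algebra Subprograms) - matrix multiplication, matrix addition, and similar operations, provided here by Intel MKL; ATLAS and OpenBLAS are comparable libraries.

The Caffe installation guide notes that MKL gives better performance on CPUs.

For best performance use Intel MKL 2017, whose beta version is available free of charge through Intel® Parallel Studio XE 2017 Beta.
After installation, set up the library environment as follows (adjust the paths to your system):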

echo 'source /opt/intel/bin/compilervars.sh intel64' >> ~/.bashrc
# alternatively edit <mkl_path>/mkl/bin/mklvars.sh replacing INSTALLDIR in
# CPRO_PATH=<INSTALLDIR> with the actual mkl path: CPRO_PATH=<full mkl path>
# echo 'source <mkl path>/mkl/bin/mklvars.sh intel64' >> ~/.bashrc

Clone and prepare Caffe-Intel:

cd ~
# For BVLC caffe use:
# git clone https://github.com/BVLC/caffe.git
# For intel caffe use:
git clone https://github.com/intel/caffe.git 
cd caffe
echo "export CAFFE_ROOT=`pwd`" >> ~/.bashrc
source ~/.bashrc
cp Makefile.config.example Makefile.config
# Open Makefile.config and modify it (see comments in the Makefile)
vi Makefile.config

Edit Makefile.config:

# To run on CPU only and to avoid installing CUDA installers, uncomment
CPU_ONLY := 1

# To use MKL, replace atlas with mkl as follows
# (make sure that the BLAS_DIR and BLAS_LIB paths are correct)
BLAS := mkl
BLAS_DIR := $(MKLROOT)/include
BLAS_LIB := $(MKLROOT)/lib/intel64

# To use MKL2017 DNN primitives as the default engine, uncomment
# (however leave it commented if using multinode training)
# USE_MKL2017_AS_DEFAULT_ENGINE := 1

# To customized compiler choice, uncomment and set the following
# CUSTOM_CXX := g++

# To train on multinode uncomment and verify path
# USE_MPI := 1
# CXX := /usr/bin/mpicxx

On Ubuntu 16.04, edit the Makefile:

INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial/

and create the symbolic links:

cd /usr/lib/x86_64-linux-gnu
sudo ln -s libhdf5_serial.so.10.1.0 libhdf5.so
sudo ln -s libhdf5_serial_hl.so.10.0.2 libhdf5_hl.so

On CentOS 7 with the ATLAS library (rather than the recommended MKL), edit the Makefile:

# Change this line
LIBRARIES += cblas atlas
# to
LIBRARIES += satlas

Build Caffe-Intel:

NUM_THREADS=$(($(grep 'core id' /proc/cpuinfo | sort -u | wc -l)*2))
make -j $NUM_THREADS
# To save the output stream to file makestdout.log use this instead
# make -j $NUM_THREADS 2>&1 | tee makestdout.log

Alternatively, build with cmake:

mkdir build
cd build
cmake -DCPU_ONLY=on -DBLAS=mkl -DUSE_MKL2017_AS_DEFAULT_ENGINE=on /path/to/caffe
NUM_THREADS=$(($(grep 'core id' /proc/cpuinfo | sort -u | wc -l)*2))
make -j $NUM_THREADS

Install the Python dependencies:

# These steps are OPTIONAL but highly recommended to use the Python interface
sudo apt-get -y install gfortran python-dev python-pip
cd ~/caffe/python
for req in $(cat requirements.txt); do sudo pip install $req; done
sudo pip install scikit-image #depends on other packages
sudo ln -s /usr/include/python2.7/ /usr/local/include/python2.7
sudo ln -s /usr/local/lib/python2.7/dist-packages/numpy/core/include/numpy/ \
  /usr/local/include/python2.7/numpy
cd ~/caffe
make pycaffe -j $NUM_THREADS
echo "export PYTHONPATH=$CAFFE_ROOT/python" >> ~/.bashrc
source ~/.bashrc

Other installation options:

# These steps are OPTIONAL to test caffe
make test -j $NUM_THREADS
make runtest #"YOU HAVE <some number> DISABLED TESTS" output is OK

# This step is OPTIONAL to disable cam hardware OpenCV driver
# alternatively, the user can skip this and ignore the harmless
# libdc1394 error that may occasionally appear
sudo ln /dev/null /dev/raw1394

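3 Caffe Data Layers

This section is optional and not required for learning Caffe. It describes the data types Caffe supports and is based mainly on the official Caffe layers documentation and src/caffe/proto/caffe.proto.

Data enters Caffe through data layers, which sit at the bottom of the network and are defined in a prototxt file. The Caffe-Intel training section covers prototxt files in more detail.
Data can come from a database (LevelDB or LMDB), directly from memory, from HDF5 files on disk, or from common image formats.
Common input preprocessing (mean subtraction, scaling, random cropping, mirroring, and so on) is configured through transform_param. Not every data type supports this parameter; HDF5, for example, does not. If the data has already been transformed in advance, the parameter is unnecessary.
A typical transformation definition: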

transform_param {
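  # randomly flip the image horizontally (mirroring)
  mirror: 1
  # crop a `crop_size` x `crop_size` patch:
  # - at random during training
  # - from the image center during testing
  crop_size: 227
  # mean subtraction: either give per-channel values or load them from a mean.binaryproto file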
  # mean_file: name_of_mean_file.binaryproto
  mean_value: 104
  mean_value: 117
  mean_value: 123
}

Here the image is cropped, mirrored, and mean-subtracted. For the other available transformations, see the TransformationParameter message in src/caffe/proto/caffe.proto.
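The three transformations can be sketched in plain Python on a toy single-channel image. This is only an illustration of the behaviour described above, not Caffe's implementation; the function name `transform` and its arguments are hypothetical.

```python
import random

def transform(image, crop_size, mean_value, train, mirror=False, rng=random):
    """Crop, optionally mirror, and mean-subtract one single-channel image.

    `image` is a list of rows (lists of numbers).
    """
    height, width = len(image), len(image[0])
    if train:
        # training: random crop position
        top = rng.randint(0, height - crop_size)
        left = rng.randint(0, width - crop_size)
    else:
        # testing: center crop
        top = (height - crop_size) // 2
        left = (width - crop_size) // 2
    rows = [row[left:left + crop_size] for row in image[top:top + crop_size]]
    if mirror and rng.random() < 0.5:
        rows = [row[::-1] for row in rows]  # horizontal flip
    return [[pixel - mean_value for pixel in row] for row in rows]

image = [[r * 4 + c for c in range(4)] for r in range(4)]  # 4x4 toy image
center = transform(image, crop_size=2, mean_value=5, train=False)
print(center)
```

In the test-phase call the crop is deterministic, so the same input always yields the same patch.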

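3.1 LMDB Data

LMDB (Lightning Memory-Mapped Database) and LevelDB are efficient input formats.
They are only suitable for 1-of-K classification tasks, and because of how efficiently Caffe reads them, they are the recommended formats for such tasks.

data_param
Required
- source - path to the directory containing the image database
- batch_size - number of inputs processed at a time

Optional
- backend [default LEVELDB] - selects LEVELDB or LMDB
- rand_skip - number of inputs to skip at the start; useful for asynchronous SGD

For details, see the DataParameter message in src/caffe/proto/caffe.proto.

Definition in a prototxt file: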

layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mirror: 1
    crop_size: 227
    mean_value: 104
    mean_value: 117
    mean_value: 123
  }
  data_param {
    source: "examples/imagenet/ilsvrc12_train_lmdb"
    batch_size: 32
    backend: LMDB
  }
}

Alternatively, mean subtraction can use a mean image ("data/ilsvrc12/imagenet_mean.binaryproto") instead of mean_value. The binaryproto for an LMDB dataset is computed with:

cd ~/caffe
build/tools/compute_image_mean examples/imagenet/ilsvrc12_train_lmdb \
  data/ilsvrc12/imagenet_mean.binaryproto

Replace examples/imagenet/ilsvrc12_train_lmdb and data/ilsvrc12/imagenet_mean.binaryproto with your own lmdb folder and binaryproto file as needed.

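3.2 ImageData

Reads images and labels directly from image files.

image_data_param
Required
- source - name of a text file listing the input images and their labels

Optional
- batch_size [default 1] - number of inputs processed at a time
- new_height [default 0] - resize images to this height; ignored if 0
- new_width [default 0] - resize images to this width; ignored if 0
- shuffle [default 0] - shuffle the data; ignored if 0
- rand_skip [default 0] - number of inputs to skip at the start; useful for asynchronous SGD

For details, see the ImageDataParameter message in src/caffe/proto/caffe.proto.

Definition in a prototxt file: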

layer {
  name: "data"
  type: "ImageData"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mirror: true
    crop_size: 227
    mean_value: 104
    mean_value: 117
    mean_value: 123
  }
  image_data_param {
    source: "/path/to/file/train.txt"
    batch_size: 32
    shuffle: 1
  }
}

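Here the images are shuffled, cropped, mirrored, and mean-subtracted.
Note that each line of the text file must contain an image name and its label, for example in "train.txt":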

/path/to/images/img3423.jpg 2
/path/to/images/img3424.jpg 13
/path/to/images/img3425.jpg 8
...
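A list file in this format can be checked before training with a few lines of Python. The helper below is a hypothetical sketch, not part of Caffe; it assumes the label is the last whitespace-separated field on each line.

```python
def parse_image_list(lines):
    """Split each 'path label' line into a (path, int label) pair."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        path, label = line.rsplit(None, 1)  # label is the last field
        entries.append((path, int(label)))
    return entries

sample = [
    "/path/to/images/img3423.jpg 2",
    "/path/to/images/img3424.jpg 13",
    "/path/to/images/img3425.jpg 8",
]
entries = parse_image_list(sample)
print(entries[1])
```

Splitting from the right keeps paths that contain spaces intact.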

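3.3 Input

Takes a zero-valued blob of the specified dimensions as input.

input_param
Required
- shape - dimensions of the top blobs: either a single shape applied to every top, or one shape per top

Definition in a prototxt file: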

layer {
  name: "input"
  type: "Input"
  top: "data"
  input_param {
    shape {
      dim: 32
      dim: 3
      dim: 227
      dim: 227
    }
  }
}

Equivalent form:

input: "data"
input_dim: 32
input_dim: 3
input_dim: 227
input_dim: 227
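To get a feel for how large such an input blob is, the element count and memory footprint can be computed from the four dimensions. This is a small sketch; the 4-byte element size is an assumption (single-precision floats).

```python
from functools import reduce
from operator import mul

def blob_count(dims):
    """Number of elements in a blob with the given dimensions."""
    return reduce(mul, dims, 1)

dims = (32, 3, 227, 227)           # batch, channels, height, width
count = blob_count(dims)
megabytes = count * 4 / 2.0 ** 20  # assuming 4-byte (FP32) elements
print(count, round(megabytes, 1))
```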

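3.4 DummyData

Similar to Input, except that the type of data is also specified. It is mostly used for debugging; see the example for details.

dummy_data_param
Required
- shape - dimensions of the top blobs: either a single shape applied to every top, or one shape per top

Optional
- data_filler [default ConstantFiller with value 0] - specifies the values of the top blob

Definition in a prototxt file: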

layer {
  name: "data"
  type: "DummyData"
  top: "data"
  include {
    phase: TRAIN
  }
  dummy_data_param {
    data_filler {
      type: "constant"
      value: 0.01
    }
    shape {
      dim: 32
      dim: 3
      dim: 227
      dim: 227
    }
  }
}
layer {
  name: "data"
  type: "DummyData"
  top: "label"
  include {
    phase: TRAIN
  }
  dummy_data_param {
    data_filler {
      type: "constant"
    }
    shape {
      dim: 32
    }
  }
}

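3.5 MemoryData

Reads data directly from memory. Call MemoryDataLayer::Reset (from C++) or Net.set_input_arrays (from Python) to supply a contiguous block of data, usually a 4-D array, which is read one batch_size chunk at a time.
Copying the data into memory first can be slow, but once it is in memory this approach is very efficient.

memory_data_param
Required
- batch_size, channels, height, width - the dimensions of the data

Definition in a prototxt file: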

layers {
  name: "data"
  type: MEMORY_DATA
  top: "data"
  top: "label"
  transform_param {
    crop_size: 227
    mirror: true
    mean_file: "mean.binaryproto"
  }
  memory_data_param {
    batch_size: 32
    channels: 3
    height: 227
    width: 227
  }
}

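3.6 HDF5Data

Reads data from HDF5 files. It works for many tasks, but it generally handles only FP32 and FP64 data, not uint8, so image data becomes very large.
transform_param is not allowed. Use this layer only when necessary.

hdf5_data_param
Required
- source - name of a text file listing the paths to the input data and labels
- batch_size

Optional
- shuffle [default false] - shuffle the order of the HDF5 files

Definition in a prototxt file: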

layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  hdf5_data_param {
    source: "examples/hdf5_classification/data/train.txt"
    batch_size: 32
  }
}

3.7 HDF5DataOutput

The HDF5 output layer does the opposite of the other data layers: it writes its input blobs to disk.

hdf5_output_param
Required
- file_name

Definition in a prototxt file:

layer {
  name: "data_output"
  type: "HDF5Output"
  bottom: "data"
  bottom: "label"
  include {
    phase: TRAIN
  }
  hdf5_output_param {
    file_name: "output_file.h5"
  }
}

3.8 WindowData

Used for detection. Reads windows from image files together with their class labels.

window_data_param
Required
- source - specifies the data source
- mean_file
- batch_size

Optional
- mirror
- crop_size - randomly crops the image
- crop_mode [default "warp"] - how the detection window is cropped: "warp" warps the window to a fixed size, "square" crops the tightest square around the window
- fg_threshold [default 0.5] - foreground (object) overlap threshold
- bg_threshold [default 0.5] - background (non-object) overlap threshold
- fg_fraction [default 0.25] - fraction of the batch that should be foreground objects
- context_pad [default 10] - amount of contextual padding around a window

For details, see the WindowDataParameter message in src/caffe/proto/caffe.proto.

Definition in a prototxt file:

layer {
  name: "data"
  type: "WindowData"
  top: "data"
  top: "label"
  window_data_param {
    source: "/path/to/file/window_train.txt"
    mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
    batch_size: 128
    mirror: true
    crop_size: 227
    fg_threshold: 0.5
    bg_threshold: 0.5
    fg_fraction: 0.25
    context_pad: 16
  }
}

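4 Dataset Preparation

LMDB is the recommended data format for 1-of-K classification tasks.
Generating an LMDB with the Caffe tools requires:
- the directory containing the data
- the output directory, e.g. mydataset_train_lmdb (required)
- a text file listing image names and their labels, e.g. "train.txt", with contents of the form:

img3423.jpg 2
img3424.jpg 13
img3425.jpg 8
...

If the data is spread across several folders, "train.txt" must contain absolute paths.

create_label_file.py generates the training and validation split for Kaggle's Dog vs Cats Competition and can be adapted to other tasks.

create_label_file.py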

#!/usr/bin/env python

import sys
import os
import os.path

def main():

  TRAIN_TEXT_FILE = 'train.txt'
  VAL_TEXT_FILE = 'val.txt'
  IMAGE_FOLDER = 'train'

  # Selects 10% of the images (the ones that end in '2') for validation

  fr = open(TRAIN_TEXT_FILE, 'w')
  fv = open(VAL_TEXT_FILE, 'w')

  filenames = os.listdir(IMAGE_FOLDER)
  for filename in filenames:
    if filename[0:3] == 'cat':
      if filename[-5] == '2':# or filename[-5] == '8':
        fv.write(filename + ' 0\n')
      else:
        fr.write(filename + ' 0\n')
    if filename[0:3] == 'dog':
      if filename[-5] == '2':# or filename[-5] == '8':
        fv.write(filename + ' 1\n')
      else:
        fr.write(filename + ' 1\n')

  fr.close()
  fv.close()

# Standard boilerplate to call the main() function to begin the program.
if __name__ == '__main__':
  main()

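At test time, labels are assumed to be unavailable. If labels are available, a test LMDB can be generated in the same way.

4.1 Preparing Three-Channel Data (Images)

The following example generates the training LMDB; the working directory is $CAFFE_ROOT.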

#!/usr/bin/env sh
# folder containing the training and validation images
TRAIN_DATA_ROOT=/path/to/training/images

# folder containing the file with the name of training images
DATA=/path/to/file
# folder for the lmdb datasets
OUTPUT=/path/to/output/directory
TOOLS=/path/to/caffe/build/tools

# Set to resize the images to 256x256
RESIZE_HEIGHT=256
RESIZE_WIDTH=256
echo "Creating train lmdb..."

# Delete the shuffle line if shuffle is not desired
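GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $TRAIN_DATA_ROOT/ \
    $DATA/train.txt \
    $OUTPUT/mydataset_train_lmdb
echo "Done."

Note: if a command in this article shows * in front of a variable name, read it as $.

Compute the image mean of the LMDB dataset: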

#!/usr/bin/env sh
# Compute the mean image in lmdb dataset
# folder for the lmdb datasets and output for mean image
OUTPUT=/path/to/output/directory

TOOLS=/path/to/caffe/build/tools

$TOOLS/compute_image_mean $OUTPUT/mydataset_train_lmdb \
  $OUTPUT/train_mean.binaryproto

$TOOLS/compute_image_mean $OUTPUT/mydataset_val_lmdb \
  $OUTPUT/val_mean.binaryproto

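4.2 Preparing Data with Other Channel Counts

Grayscale images (one channel), RADAR images (two channels), videos (four channels), image plus depth (four channels), vibrometry (one channel), and spectrograms (one channel) need to be converted before an LMDB dataset can be generated (see the reference material).

4.3 Resizing Images

There are two ways to resize an image:
- warp the image to the target size
- resize proportionally so that the shorter side matches the target size, then center-crop the longer side to the target size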
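The geometry of the second approach can be worked out without any image library. The sketch below assumes a square target; `scale_and_center_crop` is a hypothetical helper name, and the crop box uses the (left, top, right, bottom) convention.

```python
def scale_and_center_crop(width, height, target):
    """Return (scaled_size, crop_box) for a square target.

    The shorter side is scaled to `target`, preserving the aspect ratio,
    and the longer side is center-cropped.
    """
    scale = float(target) / min(width, height)
    new_w = max(target, int(round(width * scale)))
    new_h = max(target, int(round(height * scale)))
    left = (new_w - target) // 2
    top = (new_h - target) // 2
    return (new_w, new_h), (left, top, left + target, top + target)

size, box = scale_and_center_crop(640, 480, 256)
print(size, box)
```

Unlike warping, this keeps the aspect ratio of the objects in the image at the cost of discarding the margins of the longer side.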

Tools for resizing:
- OpenCV* - build/tools/convert_imageset --resize_height=256 --resize_width=256 warps images to the given size; convert_imageset calls ReadImageToDatum, which in turn calls ReadImageToCVMat in caffe/src/util/io.cpp;
- ImageMagick - convert -resize 256x256! warps images to the given size;
- OpenCV - the script tools/extra/resize_and_crop_images.py resizes images in multiple threads, scaling proportionally and then center-cropping:

sudo pip install git+https://github.com/Yangqing/mincepie.git
sudo apt-get install -y python-opencv
vi tools/extra/launch_resize_and_crop_images.sh # set number of clients (use num_of_cores*2); file.txt, input, and output folders

Alternatively, images can be cropped or resized inside the network by setting parameters in the data layer:

layer {
  name: "data"
  transform_param {
    crop_size: 227
...
}

layer {
  name: "data"
  image_data_param {
    new_height: 227
    new_width: 227
...

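5 Training

Training a network requires:
- train_val.prototxt - defines the network architecture, initialization parameters, and learning rates
- solver.prototxt - defines how the parameters are optimized; this is the file used to train the network
- deploy.prototxt - used only for testing; it is nearly identical to train_val.prototxt but has no input data layer and no loss layer

Parameter initialization matters a great deal. The main options are:
- gaussian - samples weights from a Gaussian distribution N(0, std)
- xavier - samples weights from a uniform distribution U(-a, a), where a = sqrt(3 / fan_in) and fan_in is the number of incoming inputs
- MSRAFiller - samples weights from a normal distribution N(0, a), where a = sqrt(2 / fan_in)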
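The formulas above translate directly into code. The following sketch uses only the Python standard library and is not Caffe's filler implementation; the function names, the default `std`, and the example `fan_in` are assumptions for illustration.

```python
import math
import random

def xavier_bound(fan_in):
    """Half-width a of the uniform range U(-a, a)."""
    return math.sqrt(3.0 / fan_in)

def msra_std(fan_in):
    """Spread parameter a = sqrt(2 / fan_in) of the normal distribution."""
    return math.sqrt(2.0 / fan_in)

def sample_weights(filler, fan_in, count, std=0.01, rng=random):
    """Draw `count` weights for one of the three fillers named above."""
    if filler == "gaussian":
        return [rng.gauss(0.0, std) for _ in range(count)]
    if filler == "xavier":
        a = xavier_bound(fan_in)
        return [rng.uniform(-a, a) for _ in range(count)]
    if filler == "msra":
        return [rng.gauss(0.0, msra_std(fan_in)) for _ in range(count)]
    raise ValueError("unknown filler: %s" % filler)

fan_in = 5 * 5 * 20  # e.g. a 5x5 kernel over 20 input channels
weights = sample_weights("xavier", fan_in, 1000)
print(xavier_bound(fan_in), msra_std(fan_in))
```

Both data-dependent fillers shrink the weight range as fan_in grows, which keeps the scale of a layer's output roughly independent of how many inputs feed it.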

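Learning-rate parameters:
- base_lr - the initial learning rate, 0.01 by default; lower it if NaNs appear during training
- lr_mult - the lr_mult of a bias is usually set to 2x the lr_mult of the non-bias weights

Taking LeNet as an example, define lenet_train_test.prototxt, deploy.prototxt, and solver.prototxt: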

solver.prototxt

# network definition
net: "examples/mnist/lenet_train_test.prototxt"

# run a validation test every 500 training iterations
test_interval: 500 
# number of iterations per validation test; the recommended value is num_val_imgs / batch_size
test_iter: 100 

# base learning rate, momentum, and weight decay for training
base_lr: 0.01
momentum: 0.9 
weight_decay: 0.0005

# learning rate policies
#  fixed: always return base_lr.
#  step: return base_lr * gamma ^ (floor(iter / step))
#  exp: return base_lr * gamma ^ iter
#  inv: return base_lr * (1 + gamma * iter) ^ (- power)
#  multistep: similar to step but it allows non uniform steps defined by stepvalue
#  poly: the effective learning rate follows a polynomial decay, to be zero by the max_iter: return base_lr * (1 - iter/max_iter) ^ (power)
#  sigmoid: the effective learning rate follows a sigmoid decay: return base_lr * ( 1/(1 + exp(-gamma * (iter - stepsize))))
lr_policy: "step"
gamma: 0.1 
stepsize: 10000 # Drop the learning rate in steps by a factor of gamma every stepsize iterations

# display results every 100 iterations
display: 100 

# maximum number of iterations
max_iter: 10000

# write a snapshot (training state and model parameters) every 5000 iterations
snapshot: 5000
snapshot_prefix: "examples/mnist/lenet_multistep"

# solver mode: CPU or GPU
solver_mode: CPU
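The learning rate policies listed in the solver comments can be written out as one Python function, which makes it easy to see what rate a given configuration produces at any iteration. This is a sketch of the formulas as stated in those comments, not Caffe's solver code; the function and argument names are illustrative.

```python
import math

def learning_rate(policy, it, base_lr, gamma=0.1, stepsize=1, power=1.0,
                  max_iter=1, stepvalues=()):
    """Learning rate at iteration `it`, following the formulas listed above."""
    if policy == "fixed":
        return base_lr
    if policy == "step":
        return base_lr * gamma ** (it // stepsize)
    if policy == "exp":
        return base_lr * gamma ** it
    if policy == "inv":
        return base_lr * (1.0 + gamma * it) ** (-power)
    if policy == "multistep":
        # count how many of the (non-uniform) step boundaries were passed
        passed = sum(1 for value in stepvalues if it >= value)
        return base_lr * gamma ** passed
    if policy == "poly":
        return base_lr * (1.0 - float(it) / max_iter) ** power
    if policy == "sigmoid":
        return base_lr / (1.0 + math.exp(-gamma * (it - stepsize)))
    raise ValueError("unknown lr_policy: %s" % policy)

for it in (0, 9999, 10000):
    print(it, learning_rate("step", it, base_lr=0.01, gamma=0.1, stepsize=10000))
```

With the LeNet settings above (step policy, gamma 0.1, stepsize 10000, max_iter 10000), the rate stays at base_lr for the whole run and would only drop at iteration 10000.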

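Train the network:

$CAFFE_ROOT/build/tools/caffe train -solver solver.prototxt

Training produces two kinds of files, for example:
- lenet_multistep_10000.caffemodel - the network weights, i.e. the model parameters used for testing
- lenet_multistep_10000.solverstate - the solver state, used to resume training if it is interrupted

Train the network and plot validation accuracy or loss against iterations: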

#CHART_TYPE=[0-7]
#  0: Test accuracy  vs. Iters
#  1: Test accuracy  vs. Seconds
#  2: Test loss  vs. Iters
#  3: Test loss  vs. Seconds
#  4: Train learning rate  vs. Iters
#  5: Train learning rate  vs. Seconds
#  6: Train loss  vs. Iters
#  7: Train loss  vs. Seconds
CHART_TYPE=0
$CAFFE_ROOT/build/tools/caffe train -solver solver.prototxt 2>&1 | tee logfile.log
python $CAFFE_ROOT/tools/extra/plot_training_log.py.example $CHART_TYPE name_of_plot.png logfile.log

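Dropout is applied to fully connected layers. During the forward pass it activates only a random subset of the units, which discourages co-adaptation between weights and reduces overfitting.
It is ignored at test time.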

layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 4096
    weight_filler {
      type: "gaussian"
      std: 0.005
    }   
    bias_filler {
      type: "constant"
      value: 1
    }   
  }
}
layer {
  name: "relu6"
  type: "ReLU"
  bottom: "fc6"
  top: "fc6"
}
layer {
  name: "drop6"
  type: "Dropout"
  bottom: "fc6"
  top: "fc6"
  dropout_param {
    dropout_ratio: 0.5 
  }
}
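What a dropout layer with dropout_ratio 0.5 does to its input can be sketched as follows. The sketch assumes the common "inverted dropout" formulation, in which the kept activations are scaled by 1 / (1 - dropout_ratio) during training so that nothing needs to change at test time; the function name is hypothetical.

```python
import random

def dropout(values, ratio, train, rng=random):
    """Inverted dropout over a list of activations."""
    if not train:
        return list(values)            # test time: pass through unchanged
    scale = 1.0 / (1.0 - ratio)        # keeps the expected activation equal
    return [0.0 if rng.random() < ratio else v * scale for v in values]

rng = random.Random(0)
activations = [1.0] * 8
print(dropout(activations, 0.5, train=True, rng=rng))
print(dropout(activations, 0.5, train=False))
```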

Time the forward and backward passes without updating the weights:

# time NUMITER=50 forward and backward passes; reports the total and average time
# may require the training samples and mean.binaryproto
NUMITER=50
/path/to/caffe/build/tools/caffe time --model=train_val.prototxt -iterations $NUMITER

The Linux numactl utility can manage memory allocation:

numactl -i all /path/to/caffe/build/tools/caffe time --model=train_val.prototxt -iterations $NUMITER

Caffe Model Zoo

The Caffe Model Zoo provides network models and trained parameters for a variety of tasks, ready for fine-tuning or testing.

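6 Multinode Distributed Training

This section is based on Intel's Caffe GitHub wiki. There are two main approaches to multinode distributed training:
- model parallelism
- data parallelism

In model parallelism the model is split across nodes and every node processes all of the data.
In data parallelism the data batch is split across nodes and every node holds all of the model parameters.
Data parallelism works well when the model has few weights and the batches are large.
The two can be combined in a hybrid scheme: layers with few weights, such as convolutional layers, are trained with data parallelism, and layers with many weights, such as fully connected layers, with model parallelism.
The 2016 Intel paper "Distributed Deep Learning Using Synchronous Stochastic Gradient Descent" gives a theoretical analysis of how to balance data and model parallelism in the hybrid approach.

Given the popularity of deep networks with few weights, such as GoogleNet and ResNet, and the success of distributed training with data parallelism, Caffe-Intel supports data parallelism.
Multinode distributed training is also an area of active development.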
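The idea behind synchronous data parallelism can be illustrated with a toy one-parameter model: each node computes the gradient on its own shard of the batch, and the averaged result equals the gradient over the full batch. This is a conceptual sketch only, assuming equally sized shards; it does not reflect how Caffe-Intel implements communication between nodes.

```python
def gradient(w, batch):
    """Mean gradient of 0.5 * (w * x - y) ** 2 over (x, y) pairs."""
    return sum((w * x - y) * x for x, y in batch) / len(batch)

def data_parallel_gradient(w, batch, num_nodes):
    """Split the batch evenly across nodes and average their gradients."""
    shard = len(batch) // num_nodes
    shards = [batch[i * shard:(i + 1) * shard] for i in range(num_nodes)]
    local = [gradient(w, s) for s in shards]   # one gradient per node
    return sum(local) / num_nodes              # the averaging (all-reduce) step

batch = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]
full = gradient(0.5, batch)
parallel = data_parallel_gradient(0.5, batch, num_nodes=2)
print(full, parallel)
```

Every node needs the full set of weights but only its share of the data, which is why this scheme suits models with few weights and large batches.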

For multinode training, modify Makefile.config:

USE_MPI := 1
# update with the path to binary MPI library
CXX := /usr/bin/mpicxx

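Training on multiple nodes is then straightforward:

mpirun --hostfile path/to/hostfile -n <num_processes> /path/to/caffe/build/tools/caffe train --solver=/path/to/solver.prototxt --param_server=mpi

where
- <num_processes> - the number of nodes to use
- hostfile - a file listing the IP address of one node per line

solver.prototxt specifies the train.prototxt for each node, and each train.prototxt must point to a different part of the dataset. See the related material for details.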

7 Fine-tuning

Fine-tuning reuses the network architecture defined in the prototxt, with two main changes:

1 Change the data layer to read the new data

layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  transform_param {
    scale: 0.00390625 # 1/255
  }
  data_param {
    source: "newdata_lmdb" # point to the new dataset
    batch_size: 64
    backend: LMDB
  }
}

2 Change the output layer, here ip2 (note: make the same change in deploy.prototxt)

layer {
  name: "ip2-ft" # rename the layer
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2-ft" # rename the layer output
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 2 # set to the number of classes in the new dataset, here 2
    bias_filler {
      type: "constant"
    }
  }
}

Fine-tuning in Caffe:

#From the command line on $CAFFE_ROOT
./build/tools/caffe train -solver /path/to/solver.prototxt -weights  /path/to/trained_model.caffemodel

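Fine-tuning tips:
- Learn the last output layer first and leave the other layers unchanged
- Reduce the initial learning rate, typically by 10x or 100x
- Use the per-layer local learning rate lr_mult
- To optimize quickly, freeze every layer except the last one or two by setting their local learning rate to lr_mult=0
- Increase the local learning rate of the last layer by 10x and of the second-to-last layer by 5x
- Stop if the results are good enough; otherwise fine-tune further layers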
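The effect of these tips on each layer can be checked with a small calculation, assuming that a layer's effective learning rate is the solver's base_lr multiplied by the layer's lr_mult. The layer names and multipliers below are hypothetical values chosen only for illustration.

```python
def effective_rates(base_lr, lr_mults):
    """Per-layer learning rate = solver base_lr * layer lr_mult."""
    return {name: base_lr * mult for name, mult in lr_mults.items()}

# Hypothetical layer names and multipliers, chosen only for illustration.
lr_mults = {"conv1": 0.0, "ip1": 5.0, "ip2-ft": 10.0}
rates = effective_rates(base_lr=0.001, lr_mults=lr_mults)
for name in sorted(rates):
    print(name, rates[name])
```

A multiplier of 0 freezes a layer completely, while the renamed output layer learns fastest because it starts from freshly initialized weights.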

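What fine-tuning does:
- Creates a new network
- Copies the previously trained weights to initialize it
- Otherwise proceeds like ordinary training; see the example.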

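8 Testing

Testing is also called inference, classification, or scoring. It can be run through Caffe's Python interface or with the C++ tools. The C++ tools are less flexible, so Python is recommended.
Classifying an image, a signal, or a set of images requires:
- the image(s)
- the network architecture
- the network weights

8.1 Testing a Set of Images

The model prototxt should contain a TEST-phase data layer that points to the test dataset used to evaluate the model:

/path/to/caffe/build/tools/caffe test -model /path/to/train_val.prototxt \
  -weights /path/to/trained_model.caffemodel -iterations <num_iter>

This example is adapted from the reference material.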

8.2 Testing a Single Image

First, download the trained model that will classify the image:

./scripts/download_model_binary.py models/bvlc_reference_caffenet

Then download the dataset labels, which map the network's predictions to class names. ILSVRC2012 is used here:

./data/ilsvrc12/get_ilsvrc_aux.sh

Finally, classify the image:

./build/examples/cpp_classification/classification.bin \
  models/bvlc_reference_caffenet/deploy.prototxt \
  models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel \
  data/ilsvrc12/imagenet_mean.binaryproto \
  data/ilsvrc12/synset_words.txt \
  examples/images/cat.jpg

Sample output:

---------- Prediction for examples/images/cat.jpg ----------
0.3134 - "n02123045 tabby, tabby cat"
0.2380 - "n02123159 tiger cat"
0.1235 - "n02124075 Egyptian cat"
0.1003 - "n02119022 red fox, Vulpes vulpes"
0.0715 - "n02127052 lynx, catamount"
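A listing like this is simply the top-ranked entries of the network's probability vector paired with the label file. The sketch below reproduces that ranking step on toy values; the probabilities and labels are made up, and `top_k` is a hypothetical helper rather than part of the classification tool.

```python
def top_k(probabilities, labels, k=5):
    """Return the k (probability, label) pairs with the highest probability."""
    ranked = sorted(zip(probabilities, labels), reverse=True)
    return ranked[:k]

# Toy values; a real network emits one probability per class.
probs = [0.05, 0.60, 0.10, 0.25]
names = ["fox", "tabby", "lynx", "tiger cat"]
for p, name in top_k(probs, names, k=2):
    print('%.4f - "%s"' % (p, name))
```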

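9 Feature Extraction and Visualization

A convolutional layer's weights are stored as output_feature_maps x input_feature_maps x height x width; feature maps are also called channels. Caffe offers two ways to extract features: the Python API and the C++ API.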

# download the model weights
scripts/download_model_binary.py models/bvlc_reference_caffenet

# Generate a list of the files to process
# Use the images that ship with caffe
find `pwd`/examples/images -type f -exec echo {} \; > examples/images/test.txt

# Add a 0 to the end of each line
# input data structures expect labels after each image file name
sed -i "s/$/ 0/" examples/images/test.txt

# Get the mean of the training set to subtract it from images
./data/ilsvrc12/get_ilsvrc_aux.sh

# Copy and modify the data layer to load and resize the images:
cp examples/feature_extraction/imagenet_val.prototxt examples/images
vi examples/images/imagenet_val.prototxt

# extract the features
./build/tools/extract_features.bin models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel \
  examples/images/imagenet_val.prototxt fc7 examples/images/features 10 lmdb

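This extracts the fc7 feature maps, which represent the model's highest-level features. Features from other layers, such as conv5 or pool3, can be extracted the same way. The final arguments, 10 lmdb, are the number of mini-batches to process and the database type; the extracted features are saved in the examples/images/features database folder.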

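10 Python API

Caffe provides a Python API for testing, classification, feature extraction, network definition, and network training.

10.1 Setting Up the Caffe Python API

After building Caffe, also run make pycaffe. Once that succeeds, the module can be imported:

import sys 
CAFFE_ROOT = '/path/to/caffe/' # set this to the correct path
sys.path.insert(0, CAFFE_ROOT + 'python')
import caffe
caffe.set_mode_cpu() # CPU mode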

10.2 Loading a Network

The network architecture is defined in train_val.prototxt or deploy.prototxt:

net = caffe.Net('train_val.prototxt', caffe.TRAIN)

To load trained weights as well:

net = caffe.Net('deploy.prototxt', 'trained_model.caffemodel', caffe.TRAIN)

net holds the data blobs (net.blobs) and the parameter blobs (net.params). Taking the conv1 layer as an example:
- net.blobs['conv1'] - the output data of conv1, also called feature maps
- net.params['conv1'][0] - the conv1 weights
- net.params['conv1'][1] - the conv1 biases
- net.blobs.items() - the data blobs of all layers

10.3 Visualizing a Network

This requires the pydot and graphviz modules:

sudo apt-get install -y graphviz
sudo pip install pydot

Caffe's draw_net.py script renders the network:

python python/draw_net.py examples/net_surgery/deploy.prototxt train_val_net.png
open train_val_net.png

10.4 Feeding Input Data

Option 1: reshape the data layer to match the image size

import numpy as np
from PIL import Image
# get input image and arrange it as a 4-D tensor
im = np.array(Image.open('/path/to/caffe/examples/images/cat_gray.jpg'))
im = im[np.newaxis, np.newaxis, :, :]
# resize the blob to be the size of the input image
net.blobs['data'].reshape(*im.shape) # if the image input is different 
# compute the blobs given the input data
net.blobs['data'].data[...] = im

Option 2: resize the input to match the image size of the network's data layer

im = caffe.io.load_image('/path/to/caffe/examples/images/cat_gray.jpg')
shape = net.blobs['data'].data.shape  # (N, C, H, W)
# resize the img to the spatial size of the data blob
im = caffe.io.resize_image(im, (shape[2], shape[3]))
# move channels first, (H, W, C) -> (C, H, W), then fill the blob
# (assumes the blob's channel count matches the loaded image)
net.blobs['data'].data[...] = im.transpose(2, 0, 1)

The data layer usually applies transformations to the input:

net = caffe.Net('deploy.prototxt', 'trained_model.caffemodel', caffe.TRAIN)
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
ilsvrc_mean = 'python/caffe/imagenet/ilsvrc_2012_mean.npy'
transformer.set_mean('data', np.load(ilsvrc_mean).mean(1).mean(1))
# puts the channel as the first dimention
transformer.set_transpose('data', (2,0,1))
# (2,1,0) maps RGB to BGR for example
transformer.set_channel_swap('data', (2,1,0))
transformer.set_raw_scale('data', 255.0)
# the batch size can be changed on-the-fly
net.blobs['data'].reshape(1,3,227,227)
# load the image in the data layer
im = caffe.io.load_image('/path/to/caffe/examples/images/cat_gray.jpg')
# transform the image and store it in the net.blob
net.blobs['data'].data[...] = transformer.preprocess('data', im)

Displaying the image:

import matplotlib.pyplot as plt
plt.imshow(im)

10.5 Inference API

Predicting the class of the input image:

# assumes that images are loaded
prediction = net.forward()
print 'predicted class:', prediction['prob'].argmax()

The forward pass can also be timed (data preprocessing excluded):

timeit net.forward()

Caffe also provides caffe.Classifier, a Python API that transforms and classifies several inputs at once and can replace caffe.Net and caffe.io.Transformer.

im1 = caffe.io.load_image('/path/to/caffe/examples/images/cat.jpg')
im2 = caffe.io.load_image('/path/to/caffe/examples/images/fish-bike.jpg')
imgs = [im1, im2]
ilsvrc_mean = '/path/to/caffe/python/caffe/imagenet/ilsvrc_2012_mean.npy'
net = caffe.Classifier('deploy.prototxt', 'trained_model.caffemodel',
                       mean=np.load(ilsvrc_mean).mean(1).mean(1),
                       channel_swap=(2,1,0),
                       raw_scale=255,
                       image_dims=(256, 256))
prediction = net.predict(imgs) # predict takes any number of images
print 'predicted classes:', prediction[0].argmax(), prediction[1].argmax()

For a folder of images, only the imgs part changes:

IMAGES_FOLDER = '/path/to/folder/w/images/'
import os
images = os.listdir(IMAGES_FOLDER)
imgs = [ caffe.io.load_image(IMAGES_FOLDER + im) for im in images ]

plt.plot(prediction[0])  # plot the probabilities of all classes
timeit net.predict([im1])  # timing
timeit net.predict([im1], oversample=0)

10.6 Feature Extraction and Visualization API

Taking the fc7 layer as an example:

# Retrieve details of the network's layers
[(k, v.data.shape) for k, v in net.blobs.items()]

# Retrieve weights of the network's layers
[(k, v[0].data.shape) for k, v in net.params.items()]

# Retrieve the features in the last fully connected layer
# prior to outputting class probabilities
feat = net.blobs['fc7'].data[4]

# Retrieve size/dimensions of the array
feat.shape

# Assumes that the "net = caffe.Classifier" module has been called
# and data has been formatted as in the example above

# Take an array of shape (n, height, width) or (n, height, width, channels)
# and visualize each (height, width) section in a grid
# of size approx. sqrt(n) by sqrt(n)
def vis_square(data, padsize=1, padval=0):
    # values between 0 and 1
    data -= data.min()
    data /= data.max()

    # force the number of filters to be square
    n = int(np.ceil(np.sqrt(data.shape[0])))
    padding = ((0, n ** 2 - data.shape[0]), (0, padsize), (0, padsize)) + ((0, 0),) * (data.ndim - 3)
    data = np.pad(data, padding, mode='constant', constant_values=(padval, padval))

    # tile the filters into an image
    data = data.reshape((n, n) + data.shape[1:]).transpose((0, 2, 1, 3) + tuple(range(4, data.ndim + 1)))
    data = data.reshape((n * data.shape[1], n * data.shape[3]) + data.shape[4:])

    plt.imshow(data)

plt.rcParams['figure.figsize'] = (25.0, 20.0)

# visualize the weights after the 1st conv layer
net.params['conv1'][0].data.shape
filters = net.params['conv1'][0].data
vis_square(filters.transpose(0, 2, 3, 1))

# visualize the feature maps after 1st conv layer
net.blobs['conv1'].data.shape
feat = net.blobs['conv1'].data[0,:96]
vis_square(feat, padval=1)

# visualize the feature maps after the 2nd conv layer
net.blobs['conv2'].data.shape
feat = net.blobs['conv2'].data[0,:96]
vis_square(feat, padval=1)

# visualize the feature maps after the 2nd pool layer
net.blobs['pool2'].data.shape
feat = net.blobs['pool2'].data[0,:256] # change 256 to number of pool outputs
vis_square(feat, padval=1)

# Visualize the neuron activations for the 2nd fully-connected layer
net.blobs['ip2'].data.shape
feat = net.blobs['ip2'].data[0]
plt.plot(feat.flat)
plt.legend()
plt.show()

10.7 Network Definition API

from caffe import layers as L
from caffe import params as P

def lenet(lmdb, batch_size):
    # auto generated LeNet
    n = caffe.NetSpec()
    n.data, n.label = L.Data(batch_size=batch_size, backend=P.Data.LMDB, source=lmdb, transform_param=dict(scale=1./255), ntop=2)
    n.conv1 = L.Convolution(n.data, kernel_size=5, num_output=20, weight_filler=dict(type='xavier'))
    n.pool1 = L.Pooling(n.conv1, kernel_size=2, stride=2, pool=P.Pooling.MAX)
    n.conv2 = L.Convolution(n.pool1, kernel_size=5, num_output=50, weight_filler=dict(type='xavier'))
    n.pool2 = L.Pooling(n.conv2, kernel_size=2, stride=2, pool=P.Pooling.MAX)
    n.ip1 = L.InnerProduct(n.pool2, num_output=500, weight_filler=dict(type='xavier'))
    n.relu1 = L.ReLU(n.ip1, in_place=True)
    n.ip2 = L.InnerProduct(n.relu1, num_output=10, weight_filler=dict(type='xavier'))
    n.loss = L.SoftmaxWithLoss(n.ip2, n.label)
    return n.to_proto()

with open('examples/mnist/lenet_auto_train.prototxt', 'w') as f:
    f.write(str(lenet('examples/mnist/mnist_train_lmdb', 64)))

with open('examples/mnist/lenet_auto_test.prototxt', 'w') as f:
    f.write(str(lenet('examples/mnist/mnist_test_lmdb', 100)))

The generated prototxt file looks like this:

layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  transform_param {
    scale: 0.00392156862745
  }
  data_param {
    source: "examples/mnist/mnist_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 20
    kernel_size: 5
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  convolution_param {
    num_output: 50
    kernel_size: 5
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
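
The spatial size of every blob in this network can be checked by hand. The following sketch is not part of pycaffe: conv_out, pool_out and lenet_shapes are hypothetical helpers, the 28x28 input size is that of the MNIST images, and the rounding (down for convolution, up for pooling) is assumed to follow Caffe's usual behaviour.

```python
import math

def conv_out(size, kernel, stride=1, pad=0):
    # Convolution output size along one dimension, rounded down
    return (size + 2 * pad - kernel) // stride + 1

def pool_out(size, kernel, stride=1, pad=0):
    # Pooling output size along one dimension, rounded up
    return int(math.ceil((size + 2 * pad - kernel) / float(stride))) + 1

def lenet_shapes(size=28):
    # Follow one input dimension through the conv/pool stack of the prototxt above
    shapes = []
    size = conv_out(size, kernel=5)
    shapes.append(('conv1', 20, size))
    size = pool_out(size, kernel=2, stride=2)
    shapes.append(('pool1', 20, size))
    size = conv_out(size, kernel=5)
    shapes.append(('conv2', 50, size))
    size = pool_out(size, kernel=2, stride=2)
    shapes.append(('pool2', 50, size))
    return shapes

for name, channels, size in lenet_shapes():
    print('%s: %d x %d x %d' % (name, channels, size, size))
```

Under these assumptions pool2 holds 50 x 4 x 4 = 800 values per image, which is the input size seen by ip1.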

10.8 Network training API

solver = caffe.get_solver('models/bvlc_reference_caffenet/solver.prototxt')
net = solver.net  # the training net owned by the solver
solver.net.forward()  # train net
solver.test_nets[0].forward()  # test net (there can be more than one)

solver.net.backward()  # compute the gradients
# data gradients
net.blobs['conv1'].diff
# weight gradients
net.params['conv1'][0].diff
# bias gradients
net.params['conv1'][1].diff

solver.step(1)  # one iteration: a forward pass, a backward pass and a weight update

solver.solve()  # run the full optimization, up to max_iter from solver.prototxt
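
To make the weight update performed by solver.step(1) more concrete, here is a minimal sketch of one SGD-with-momentum update in plain Python. It assumes the solver type is plain SGD with momentum and L2 weight decay; sgd_momentum_step and the toy loss are hypothetical and only illustrate the rule, they are not Caffe code.

```python
def sgd_momentum_step(weights, grads, history, lr=0.01, momentum=0.9, weight_decay=0.0):
    # One update of every weight: the history term accumulates past updates
    new_weights, new_history = [], []
    for w, g, h in zip(weights, grads, history):
        h = momentum * h + lr * (g + weight_decay * w)
        new_history.append(h)
        new_weights.append(w - h)
    return new_weights, new_history

weights, history = [1.0, -2.0], [0.0, 0.0]
for iteration in range(3):
    grads = list(weights)  # toy loss 0.5 * w**2, whose gradient is w itself
    weights, history = sgd_momentum_step(weights, grads, history, lr=0.1)
    print(iteration, weights)
```

Each call moves the weights against the gradient, and the momentum term keeps part of the previous update, so the steps grow while the gradient keeps pointing the same way.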

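11 Debugging

This section is optional and is aimed at Caffe developers.
Useful debugging tips:
- Remove randomness
- Compare caffemodels
- Use Caffe's debug info

Removing randomness makes results reproducible. Randomness enters at several stages, for example:
- Weights are usually initialized randomly from a probability distribution, such as a Gaussian
- Input images may be randomly mirrored horizontally, randomly cropped, and shuffled into a random order
- Dropout layers randomly drop a subset of units in each iteration, so only part of the weights is trained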

One solution is to set a seed by adding the following to solver.prototxt:

# pick some value for random_seed that is greater or equal to 1, for example:
random_seed: 42

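This guarantees that the same 'random' values are used on every run. However, the same seed may still produce different values on different machines.
A more robust approach when working across machines is:
- Prepare the data with one fixed shuffled image order, i.e. do not reshuffle between experiments
- In the ImageData layer of train.prototxt, define transform_param without cropping or mirroring: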

layer {
  name: "data"
  type: "ImageData"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
 #   mirror: true
 #   crop_size: 227
    mean_value: 104
    mean_value: 117
    mean_value: 123
  }
  image_data_param {
    source: "/path/to/file/train.txt"
    batch_size: 32
    new_height: 224
    new_width: 224
  }
}

- In the dropout layers of train.prototxt, set dropout_ratio to 0
- In solver.prototxt, set lr_policy: "fixed"
- In solver.prototxt, add debug_info: 1
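
The first item of the list above (shuffle the image list once during data preparation and reuse that order in every experiment) can be sketched in plain Python. This is only an illustration: shuffled_once, the seed value and the file names are assumptions, not part of Caffe or of the original scripts.

```python
import random

def shuffled_once(lines, seed=42):
    # A private Random instance leaves the global random state untouched
    rng = random.Random(seed)
    shuffled = list(lines)
    rng.shuffle(shuffled)
    return shuffled

# Hypothetical "image label" lines, in the format used by train.txt
lines = ['img%04d.jpg %d' % (i, i % 2) for i in range(10)]
first = shuffled_once(lines)
second = shuffled_once(lines)
print(first == second)  # the same seed gives the same order
```

Writing the shuffled list to train.txt once, and leaving shuffling switched off afterwards, keeps the image order identical across runs and machines.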

To compare two caffemodels, the following script computes the sum of the absolute differences between all of their weights and biases:

# Intel Corporation
# Author: Ravi Panchumarthy

import argparse
import numpy as np

def get_args():
    parser = argparse.ArgumentParser('Compare weights of two caffe models')

    parser.add_argument('-m1', dest='modelFile1', type=str, required=True,
                        help='Caffe model weights file to compare')
    parser.add_argument('-m2', dest='modelFile2', type=str, required=True,
                        help='Caffe model weights file to compare against')
    parser.add_argument('-n', dest='netFile', type=str, required=True,
                        help='Network prototxt file associated with model')
    return parser.parse_args()

if __name__ == "__main__":
    import caffe

    args = get_args()
    net = caffe.Net(args.netFile, args.modelFile1, caffe.TRAIN)
    net2compare = caffe.Net(args.netFile, args.modelFile2, caffe.TRAIN)

    wt_sumOfAbsDiffByName = dict()
    bias_sumOfAbsDiffByName = dict()

    for name, blobs in net.params.items():  # items() works on both Python 2 and 3
        wt_diffTensor = np.subtract(net.params[name][0].data, net2compare.params[name][0].data)
        wt_absDiffTensor = np.absolute(wt_diffTensor)
        wt_sumOfAbsDiff = wt_absDiffTensor.sum()
        wt_sumOfAbsDiffByName.update({name : wt_sumOfAbsDiff})

        # if args.layerDebug == 1:
        #     print("%s : %s" % (name,wt_sumOfAbsDiff))

        # layers created with bias_term: false have no bias blob
        if len(blobs) > 1:
            bias_diffTensor = np.subtract(net.params[name][1].data, net2compare.params[name][1].data)
            bias_absDiffTensor = np.absolute(bias_diffTensor)
            bias_sumOfAbsDiff = bias_absDiffTensor.sum()
            bias_sumOfAbsDiffByName.update({name : bias_sumOfAbsDiff})

    print("\nThe sum of absolute difference of all layer's weight is : %s" % sum(wt_sumOfAbsDiffByName.values()))
    print("The sum of absolute difference of all layer's bias is : %s" % sum(bias_sumOfAbsDiffByName.values()))

    finalDiffVal = sum(wt_sumOfAbsDiffByName.values())+ sum(bias_sumOfAbsDiffByName.values())
    print("The sum of absolute difference of all layers weight's and bias's is : %s" % finalDiffVal )
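
The script only prints totals; its per-layer print is commented out. As a rough illustration of what a per-layer report could look like, the sketch below applies the same sum-of-absolute-differences metric to two toy "models". The dictionaries of plain lists are stand-ins for the real parameter blobs, and sum_abs_diff and compare_params are hypothetical names.

```python
def sum_abs_diff(a, b):
    # Sum of |x - y| over two equally long flat lists of numbers
    return sum(abs(x - y) for x, y in zip(a, b))

def compare_params(params_a, params_b):
    # Map each layer name to the summed absolute difference of its values
    return dict((name, sum_abs_diff(params_a[name], params_b[name]))
                for name in params_a)

# Toy stand-ins for the weights of two caffemodels
model_a = {'conv1': [0.5, -0.25, 1.0], 'ip1': [2.0, 0.0]}
model_b = {'conv1': [0.5, 0.25, 1.0], 'ip1': [1.5, 0.5]}

per_layer = compare_params(model_a, model_b)
for name in sorted(per_layer):
    print('%s : %s' % (name, per_layer[name]))
print('total : %s' % sum(per_layer.values()))
```

A total of zero means the two models are identical; a per-layer breakdown shows where two training runs start to diverge.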

For further debugging, uncomment DEBUG := 1 in Makefile.config, rebuild Caffe, and start gdb:

gdb /path/to/caffe/build/caffe

Once gdb has started, run:

run train -solver /path/to/solver.prototxt

12 Examples

12.1 LeNet on MNIST (handwritten digits)

# Prepare the dataset
cd $CAFFE_ROOT
./data/mnist/get_mnist.sh # downloads MNIST dataset
./examples/mnist/create_mnist.sh # creates dataset in LMDB format

# Train the model
# Reduce the number of iterations from 10K to 1K to quickly run through this example
sed -i 's/max_iter: 10000/max_iter: 1000/g' examples/mnist/lenet_solver.prototxt
./build/tools/caffe train -solver examples/mnist/lenet_solver.prototxt

# Time the forward and backward passes
./build/tools/caffe time --model=examples/mnist/lenet_train_test.prototxt -iterations 50 # runs on CPU

# Test the model
# the file with the model should have a 'phase: TEST'
./build/tools/caffe test -model examples/mnist/lenet_train_test.prototxt \
  -weights examples/mnist/lenet_iter_1000.caffemodel -iterations 50
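
The -iterations value of the test command decides how many images are actually scored. The sketch below works this out, assuming a test batch size of 100 (the value used for the test net in section 10.7) and the 10,000 images of the MNIST test set; both helper functions are hypothetical.

```python
import math

def images_covered(iterations, batch_size):
    # Number of images scored by `caffe test -iterations N`
    return iterations * batch_size

def iterations_for_full_pass(num_images, batch_size):
    # Smallest iteration count that visits every image at least once
    return int(math.ceil(num_images / float(batch_size)))

print(images_covered(50, 100))
print(iterations_for_full_pass(10000, 100))
```

Under these assumptions, 50 iterations score 5,000 images, and 100 iterations are needed for one full pass over the test set.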

12.2 Dogs vs Cats

Download the Dogs vs Cats dataset from Kaggle, unzip dogvscat.zip, and run dogvscat.sh:

#!/usr/bin/env sh
CAFFE_ROOT=/path/to/caffe
mkdir dogvscat
DOG_VS_CAT_FOLDER=/path/to/dogvscat

cd $DOG_VS_CAT_FOLDER
## Download datasets (requires first a login)
#https://www.kaggle.com/c/dogs-vs-cats/download/train.zip
#https://www.kaggle.com/c/dogs-vs-cats/download/test1.zip

# Unzip train and test data
sudo apt-get -y install unzip
unzip train.zip -d .
unzip test1.zip -d .

# Format data
python create_label_file.py # creates 2 text files with labels for training and validation
./build_datasets.sh # build lmdbs

# Download ImageNet pretrained weights (takes ~20 min)
$CAFFE_ROOT/scripts/download_model_binary.py $CAFFE_ROOT/models/bvlc_reference_caffenet

# Fine-tune weights in the AlexNet architecture (takes ~100 min)
$CAFFE_ROOT/build/tools/caffe train -solver $DOG_VS_CAT_FOLDER/dogvscat_solver.prototxt \
    -weights $CAFFE_ROOT/models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel

# Classify test dataset
cd $DOG_VS_CAT_FOLDER
python convert_binaryproto2npy.py
python dogvscat_classify.py # Returns prediction.txt (takes ~30 min)

# A better approach is to train five AlexNets w/init parameters from the same distribution,
# fine-tune those five, and compute the average of the five networks
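
The closing comment of the script suggests averaging several fine-tuned networks. One way to read this is to average the class probabilities the networks predict for each image, as in the sketch below. It is an assumption about how the averaging could be done, not the author's code; the function names and the probability values are made up, with class 0 = cat and class 1 = dog.

```python
def average_predictions(per_model_probs):
    # per_model_probs[m][i] holds model m's class probabilities for image i
    num_models = len(per_model_probs)
    averaged = []
    for image_probs in zip(*per_model_probs):
        num_classes = len(image_probs[0])
        averaged.append([sum(p[c] for p in image_probs) / num_models
                         for c in range(num_classes)])
    return averaged

def predict_labels(averaged):
    # Index of the most probable class for every image
    return [max(range(len(p)), key=p.__getitem__) for p in averaged]

# Two hypothetical models scoring two images
model_1 = [[0.9, 0.1], [0.4, 0.6]]
model_2 = [[0.7, 0.3], [0.7, 0.3]]
averaged = average_predictions([model_1, model_2])
print(predict_labels(averaged))
```

Here the two models disagree on the second image, and the averaged probabilities settle the label.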

12.3 PASCAL VOC Classification

Unzip voc2012.zip and run voc2012.sh to train AlexNet:

#!/usr/bin/env sh

# Copy and unzip voc2012.zip (it contains this file) then run this file. But first
#  change paths in: voc2012.sh; build_datasets.sh; solvers/*; nets/*; classify.py

# As you run various files, you can ignore the following error if it shows up:
#  libdc1394 error: Failed to initialize libdc1394

# set Caffe root directory
CAFFE_ROOT=$CAFFE_ROOT
VOC=/path/to/voc2012

chmod 700 *.sh

# Download datasets
# Details: http://host.robots.ox.ac.uk/pascal/VOC/voc2012/index.html#devkit
if [ ! -f VOCtrainval_11-May-2012.tar ]; then
  wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
fi
# VOCtrainval_11-May-2012.tar contains the VOC folder with:
#  JPEGImages: all jpg images
#  Annotations: objects and corresponding bounding box/pose/truncated/occluded per jpg
#  ImageSets: breaks the images by the type of task they are used for
#  SegmentationClass and SegmentationObject: segmented images (duplicate directories)
tar -xvf VOCtrainval_11-May-2012.tar

# Run Python scripts to create labeled text files
python create_labeled_txt_file.py

# Execute shell script to create training and validation lmdbs
# Note that lmdbs directories w/the same name cannot exist prior to creating them
./build_datasets.sh

# Execute following command to download caffenet pre-trained weights (takes ~20 min)
#  if weights exist already then the command is ignored
$CAFFE_ROOT/scripts/download_model_binary.py $CAFFE_ROOT/models/bvlc_reference_caffenet

# Fine-tune weights in the AlexNet architecture (takes ~60 min)
# you can also choose one of six solvers: pascal_solver[1-6].prototxt
$CAFFE_ROOT/build/tools/caffe train -solver $VOC/solvers/voc2012_solver.prototxt \
  -weights $CAFFE_ROOT/models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel

# The lines below are not really needed; they served as examples on how to do some tasks

# Test against voc2012_val_lmdb dataset (name of lmdb is the model under PHASE: test)
$CAFFE_ROOT/build/tools/caffe test -model $VOC/nets/voc2012_train_val_ft678.prototxt \
   -weights $VOC/weights_iter_5000.caffemodel -iterations 116

# Classify validation dataset: returns a file w/the labels of the val dataset
#  but it doesn't report accuracy (that would require some adjusting of the code)
python convert_binaryproto2npy.py
mkdir results
python cls_confidence.py
python average_precision.py
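
The last step of the script reports average precision. The contents of average_precision.py are not shown here, so the following is only a sketch of the plain, non-interpolated definition for a single class; the official VOC evaluation uses an interpolated variant and may give slightly different numbers. The scores and labels are made-up values.

```python
def average_precision(scores, labels):
    # Rank the images by descending confidence for the class
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    num_positive = sum(1 for _, label in ranked if label)
    if num_positive == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            # Precision measured at the rank of each positive image
            precision_sum += hits / float(rank)
    return precision_sum / num_positive

scores = [0.9, 0.8, 0.7, 0.6]   # hypothetical confidences for one class
labels = [1, 0, 1, 0]           # 1 if the image contains the class
print(average_precision(scores, labels))
```

Averaging this value over the 20 VOC classes gives a mean average precision for the classifier.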

About the VOC dataset:
- PASCAL VOC datasets
- 20 classes
- Training: 5,717 images, 13,609 objects
- Validation: 5,823 images, 13,841 objects
- Testing: 10,991 images

13 Further reading

Caffe Model Zoo
Caffe homepage
Soumith Chintala, "Intel are CPU magicians." Oct. 2015
Dipankar Das, et al., "Distributed Deep Learning Using Synchronous Stochastic Gradient Descent." Feb. 2016
Jeff Donahue, "Sequences in Caffe." CVPR Tutorial, June 2015
Andrej Karpathy, "Caffe Tutorial." Stanford CS 231n, 2015
Xinlei Chen, "Caffe Tutorial." Carnegie Mellon University 16824, 2015
MIT Scene Recognition demo: Pick an image of a scene from an URL or give it your own

Last modified: October 9, 2018


Caffe - 基于 Caffe-Intel 框架深度学习网络的训练和部署[译]

AIHGF • 2018 年 05 月 18 日

<blockquote><a class="no-external-link" href="https://software.intel.com/en-us/articles/training-and-deploying-deep-learning-networks-with-caffe-optimized-for-intel-architecture" target="_blank">原文 - Manage Deep Learning Networks with Caffe* Optimized for Intel® Architecture</a>.</blockquote>目录： 
- 摘要 
- 安装 
- 数据层 
- 数据集准备 
- 训练 
- 多节点分布式训练 
- 微调 
- 测试 
- 特征提取和可视化 
- 使用Python API 
- 调试 
- 实例 
- Caffe用处 
- 进一步阅读<h2>1 摘要</h2><a class="no-external-link" href="http://caffe.berkeleyvision.org/" target="_blank">Caffe</a>是BVLC开发的深度学习框架，基于C++和 CUDA C++语言，并提供了Python 和 Matlab接口. 该框架对于卷积神经网络CNN、循环神经网络RNN及多层感知器很有帮助. 现在已经具有对于检测、分类、分割以及Spark兼容的分支.基于Intel结构优化的Caffe(Caffe-Intel)整合了Intel Math Kernel Library(Intel MKL) 2017，并对 Advanced Vector Extensions(AVX)-2 和AVX-512 指令集进行了优化，能够支持 Intel Xeon 和 Intel Xeon Phi 处理器. 因此，Caffe-Intel 框架除了包含BVLC Caffe的所有优点外，还能在 Intel 架构上有效运行，并能在许多节点进行分布式训练.该文档主要阐述了基于Intel结构优化的Caffe框架的编译、使用一个或多个计算节点进行网络模型的训练以及网络的部署. 另外， 详细介绍了Caffe的一些函数，比如网络微调、不同模型的特征提取与可视化、Caffe的Python API接口.名词： - weights 权重 - 也被叫做核(kernels)、滤波器(filters)、模板(templates)、或特征提取器(feature extractors)； - blob 数据块 - 也被叫做张量(tesor)，一种N维数据结构，N-D维张量，包含了数据、梯度或权重(偏置bias)； 
- units 神经元 - 也被叫做 neurons，在数据块进行非线性变化； - feature maps 特征图 - 也被叫做通道(channels)； 
- testing 测试 - 也被叫做推断(inference)、分类、得分(scoring)或部署(deployment)； 
- model 模型 - 也被叫做拓扑结构或网络结构.快速熟悉Caffe： 
- <a class="no-external-link" href="https://software.intel.com/en-us/articles/training-and-deploying-deep-learning-networks-with-caffe-optimized-for-intel-architecture#Installation" target="_blank">Caffe-Intel 安装</a> 
- <a class="no-external-link" href="https://software.intel.com/en-us/articles/training-and-deploying-deep-learning-networks-with-caffe-optimized-for-intel-architecture#Examples" target="_blank">Caffe-Intel 基于 MNIST 数据集训练和测试 LeNet 网络</a> 
- 在一些图片上，比如 <a class="no-external-link" href="https://github.com/BVLC/caffe/tree/master/examples/images/cat.jpg" target="_blank">cat</a>和<a class="no-external-link" href="https://github.com/BVLC/caffe/blob/master/examples/images/fish-bike.jpg" target="_blank">fish-bike</a>，<a class="no-external-link" href="https://software.intel.com/en-us/articles/training-and-deploying-deep-learning-networks-with-caffe-optimized-for-intel-architecture#Testing" target="_blank">测试训练好的模型</a>，比如，<a class="no-external-link" href="http://dl.caffe.berkeleyvision.org/bvlc_googlenet.caffemodel" target="_blank">bvlc_googlenet.caffemodel</a> 
- 在 <a class="no-external-link" href="https://software.intel.com/en-us/articles/training-and-deploying-deep-learning-networks-with-caffe-optimized-for-intel-architecture#Examples" target="_blank">Cats vs Dogs Challenge</a> 对已有模型<a class="no-external-link" href="https://software.intel.com/en-us/articles/training-and-deploying-deep-learning-networks-with-caffe-optimized-for-intel-architecture#Fine-tuning" target="_blank">微调</a><h2>2 Caffe-Intel 安装</h2>这里仅针对 Ubuntu14.04 平台说明 Caffe 的安装，其他Linux和OS X操作系统，<a class="no-external-link" href="http://caffe.berkeleyvision.org/installation.html" target="_blank">BVLC官方</a>提供了相应的安装方法.<pre data-language=><code class="language-shell line-numbers">sudo apt-get update &amp;&amp;
sudo apt-get -y install build-essential git cmake &amp;&amp;
sudo apt-get -y install libprotobuf-dev libleveldb-dev libsnappy-dev &amp;&amp;
sudo apt-get -y install libopencv-dev libhdf5-serial-dev protobuf-compiler &amp;&amp;
sudo apt-get -y install --no-install-recommends libboost-all-dev &amp;&amp;
sudo apt-get -y install libgflags-dev libgoogle-glog-dev liblmdb-dev &amp;&amp;
sudo apt-get -y install libatlas-base-dev
</code></pre>对于Ubuntu16.04，需要进行以下库的链接：<pre data-language=><code class="language-shell line-numbers">find . -type f -exec sed -i -e 's^"hdf5.h"^"hdf5/serial/hdf5.h"^g' -e 's^"hdf5_hl.h"^"hdf5/serial/hdf5_hl.h"^g' '{}' ;
cd /usr/lib/x86_64-linux-gnu
sudo ln -s libhdf5_serial.so.10.1.0 libhdf5.so
sudo ln -s libhdf5_serial_hl.so.10.0.2 libhdf5_hl.so
</code></pre>针对CentOS7，安装以下依赖项：<pre data-language=><code class="language-shell line-numbers">sudo yum -y update &amp;&amp;
sudo yum -y groupinstall "Development Tools" &amp;&amp;
sudo yum -y install wget cmake git &amp;&amp;
sudo yum -y install protobuf-devel protobuf-compiler boost-devel &amp;&amp;
sudo yum -y install snappy-devel opencv-devel atlas-devel &amp;&amp;
sudo yum -y install gflags-devel glog-devel lmdb-devel leveldb-devel hdf5-devel

# The following steps are only required if some packages failed to install
# add EPEL repository then install missing packages
wget https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
sudo rpm -ivh epel-release-latest-7.noarch.rpm
sudo yum -y install gflags-devel glog-devel lmdb-devel leveldb-devel hdf5-devel &amp;&amp;
sudo yum -y install protobuf-devel protobuf-compiler boost-devel

# if packages are still not found--download and install/build the packages, e.g.,
# snappy:
wget http://mirror.centos.org/centos/7/os/x86_64/Packages/snappy-devel-1.1.0-3.el7.x86_64.rpm
sudo yum -y install http://mirror.centos.org/centos/7/os/x86_64/Packages/snappy-devel-1.1.0-3.el7.x86_64.rpm
# atlas:
wget http://mirror.centos.org/centos/7/os/x86_64/Packages/atlas-devel-3.10.1-10.el7.x86_64.rpm
sudo yum -y install http://mirror.centos.org/centos/7/os/x86_64/Packages/atlas-devel-3.10.1-10.el7.x86_64.rpm
# opencv:
wget https://github.com/Itseez/opencv/archive/2.4.13.zip
unzip 2.4.13.zip
cd opencv-2.4.13/
mkdir build &amp;&amp; cd build
cmake -DCMAKE_INSTALL_PREFIX:PATH=/usr/local ..
NUM_THREADS=$(($(grep 'core id' /proc/cpuinfo | sort -u | wc -l)*2))
make all -j $NUM_THREADS
sudo make install -j $NUM_THREADS

# optional (not required for Caffe)
# other useful repositories for CentOS are RepoForge and IUS:
wget http://pkgs.repoforge.org/rpmforge-release/rpmforge-release-0.5.3-1.el7.rf.x86_64.rpm
sudo rpm -Uvh rpmforge-release-0.5.3-1.el7.rf.x86_64.rpm
wget https://rhel7.iuscommunity.org/ius-release.rpm
sudo rpm -Uvh ius-release*.rpm
</code></pre>各依赖项的说明(<a class="no-external-link" href="http://graphics.cs.cmu.edu/courses/16-824/2016_spring/slides/caffe_tutorial.pdf" target="_blank">source</a>)： 
- boost - 使用 math functions 和 shared pointer 的C++库； 
- glog、gflags - 提供日志和命令行工具，对于调试十分必要； 
- leveldb、lmdb - 数据库IO，用于准备数据； 
- protobuf - 用于有效的定义数据结构； 
- BLAS(Basic Linear Algebra Subprograms) - 由Intel MKL提供的矩阵乘法、矩阵加法等操作库，类似的还有ATLAS、openBLAS 等运算库.<a class="no-external-link" href="http://caffe.berkeleyvision.org/install_apt.html" target="_blank">Caffe安装指南</a>指出对于CPU来说，安装MKL会有更好的表现.为了最佳表现，采用Intel MKL 2017，可以免费从 <a class="no-external-link" href="https://softwareproductsurvey.intel.com/f/150587/1103/" target="_blank">Intel® Parallel Studio XE 2017 Beta</a> 获取Beta版. 安装好后，正确的环境库可以设置如下(其中的路径需要根据实际情况修改)：<pre data-language=><code class="language-shell line-numbers">echo 'source /opt/intel/bin/compilervars.sh intel64' &gt;&gt; ~/.bashrc
# alternatively edit &lt;mkl_path&gt;/mkl/bin/mklvars.sh replacing INSTALLDIR in
# CPRO_PATH=&lt;INSTALLDIR&gt; with the actual mkl path: CPRO_PATH=&lt;full mkl path&gt;
# echo 'source &lt;mkl path&gt;/mkl/bin/mklvars.sh intel64' &gt;&gt; ~/.bashrc
</code></pre>克隆并准备 <a class="no-external-link" href="https://github.com/intel/caffe" target="_blank">Caffe-Intel</a>：<pre data-language=><code class="language-shell line-numbers">cd ~
# For BVLC caffe use:
# git clone https://github.com/BVLC/caffe.git
# For intel caffe use:
git clone https://github.com/intel/caffe.git 
cd caffe
echo "export CAFFE_ROOT=`pwd`" &gt;&gt; ~/.bashrc
source ~/.bashrc
cp Makefile.config.example Makefile.config
# Open Makefile.config and modify it (see comments in the Makefile)
vi Makefile.config
</code></pre>编辑Makefile.config：<pre data-language=><code class="language-shell line-numbers"># To run on CPU only and to avoid installing CUDA installers, uncomment
CPU_ONLY := 1

# To use MKL, replace atlas with mkl as follows
# (make sure that the BLAS_DIR and BLAS_LIB paths are correct)
BLAS := mkl
BLAS_DIR := $(MKLROOT)/include
BLAS_LIB := $(MKLROOT)/lib/intel64

# To use MKL2017 DNN primitives as the default engine, uncomment
# (however leave it commented if using multinode training)
# USE_MKL2017_AS_DEFAULT_ENGINE := 1

# To customized compiler choice, uncomment and set the following
# CUSTOM_CXX := g++

# To train on multinode uncomment and verify path
# USE_MPI := 1
# CXX := /usr/bin/mpicxx
</code></pre>如果是Ubuntu16.04， 编辑Makefile：<pre data-language=><code class="language-shell line-numbers">INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial/
</code></pre>并创建链接：<pre data-language=><code class="language-shell line-numbers">cd /usr/lib/x86_64-linux-gnu
sudo ln -s libhdf5_serial.so.10.1.0 libhdf5.so
sudo ln -s libhdf5_serial_hl.so.10.0.2 libhdf5_hl.so 
</code></pre>如果是CentOS7和ATLAS库(而不是推荐的MKL库)，编辑Makefile：<pre data-language=><code class="language-shell line-numbers"># Change this line
LIBRARIES += cblas atlas
# to
LIBRARIES += satlas
</code></pre>编译Caffe-Intel：<pre data-language=><code class="language-shell line-numbers">NUM_THREADS=$(($(grep 'core id' /proc/cpuinfo | sort -u | wc -l)*2))
make -j $NUM_THREADS
# To save the output stream to file makestdout.log use this instead
# make -j $NUM_THREADS 2&gt;&amp;1 | tee makestdout.log
</code></pre>另一种方式是采用cmake方式：<pre data-language=><code class="language-shell line-numbers">mkdir build
cd build
cmake -DCPU_ONLY=on -DBLAS-mkl -DUSE_MKL2017_AS_DEFAULT_ENGINE=on /path/to/caffe
NUM_THREADS=$(($(grep 'core id' /proc/cpuinfo | sort -u | wc -l)*2))
make -j $NUM_THREADS
</code></pre>安装Python依赖项：<pre data-language=><code class="language-shell line-numbers"># These steps are OPTIONAL but highly recommended to use the Python interface
sudo apt-get -y install gfortran python-dev python-pip
cd ~/caffe/python
for req in $(cat requirements.txt); do sudo pip install $req; done
sudo pip install scikit-image #depends on other packages
sudo ln -s /usr/include/python2.7/ /usr/local/include/python2.7
sudo ln -s /usr/local/lib/python2.7/dist-packages/numpy/core/include/numpy/ \
 /usr/local/include/python2.7/numpy
cd ~/caffe
make pycaffe -j NUM_THREADS
echo "export PYTHONPATH=$CAFFE_ROOT/python" &gt;&gt; ~/.bashrc
source ~/.bashrc
</code></pre>其它安装选项：<pre data-language=><code class="language-shell line-numbers"># These steps are OPTIONAL to test caffe
make test -j $NUM_THREADS
make runtest #"YOU HAVE &lt;some number&gt; DISABLED TESTS" output is OK

# This step is OPTIONAL to disable cam hardware OpenCV driver
# alternatively, the user can skip this and ignore the harmless 
# libdc1394 error that may occasionally appears
sudo ln /dev/null /dev/raw1394
</code></pre><h2>3 Caffe 数据层</h2>该部分是可选，将对 Caffe 支持的数据类型进行阐述，对于学习 Caffe 是非必须的，主要基于Caffe官方提供的<a class="no-external-link" href="http://caffe.berkeleyvision.org/tutorial/layers.html" target="_blank">layers 介绍材料</a> 和 <a class="no-external-link" href="https://github.com/BVLC/caffe/blob/master/src/caffe/proto/caffe.proto" target="_blank">src/caffe/proto/caffe.proto</a>.Data 通过数据层进入 Caffe，其位于网络的最底部，在prototxt文件中进行定义. 关于prototxt文件的更多信息会在<a class="no-external-link" href="https://software.intel.com/en-us/articles/training-and-deploying-deep-learning-networks-with-caffe-optimized-for-intel-architecture#Training" target="_blank">Caffe-Intel网络训练</a>部分详细介绍. 
Data可以来自数据库(LevelDB或LMDB)， 直接从内存、从磁盘HDF5格式文件或通用图像格式. 常用的输入图片预处理(比如中心化(mean subtraction)、尺度变换、随机裁剪、镜像处理等)变换可以通过指定transfrom_params(不是所有的数据类型都支持该参数，比如HDF5即不支持)来定义. 如果已经预先进行数据变换，则不必再使用. 
常用的数据变换定义方式：<pre data-language=><code class="language-txt line-numbers">transform_param {
 # 随机水平反转图片,镜像处理
 mirror: 1
 # 裁剪 `crop_size` x `crop_size` 图片块:
 # - 训练时随机裁剪
 # - 测试时根据图片 center 裁剪
 crop_size: 227
 # 去均值: 可以设定值, 或者从 mean.binaryproto 文件加载
 # mean_file: name_of_mean_file.binaryproto
 mean_value: 104
 mean_value: 117
 mean_value: 123
}
</code></pre>这里，图像要进行裁剪、镜像、中心化变换. 其他数据变换操作可以查看 <a class="no-external-link" href="https://github.com/BVLC/caffe/blob/master/src/caffe/proto/caffe.proto" target="_blank">src/caffe/proto/caffe.proto</a> 文件的TransformationParameter参数.<h3>3.1 LMDB 数据</h3><a class="no-external-link" href="http://en.wikipedia.org/wiki/Lightning_Memory-Mapped_Database" target="_blank">LMDB(Lightning Memory-Mapped Databases )</a> 和 <a class="no-external-link" href="https://en.wikipedia.org/wiki/LevelDB" target="_blank">LevelDB</a> 数据形式可以作为输入数据的一种有效方式. 
他们只对于1-of-K分类任务较适用. 由于Caffe在读取数据集效率问题，这两种数据形式被推荐用于1-of-K任务.data_params 属性 - source - 包含图片数据库的路径 
- batch_size - 一次处理输入的数目参数 
- backend[默认LEVELDB] - 选择采用 LEVELDB 或 LMDB 
- rand_skip - 在开始处跳过的输入数目，对于async sgd有用详细介绍查看 <a class="no-external-link" href="https://github.com/BVLC/caffe/blob/master/src/caffe/proto/caffe.proto" target="_blank">src/caffe/proto/caffe.proto</a>文件中DataParameter参数.在 prototxt 中的定义形式:<pre data-language=><code class="language-txt line-numbers">layer {
 name: "data"
 type: "Data"
 top: "data"
 top: "label"
 include {
 phase: TRAIN
 }
 transform_param {
 mirror: 1
 crop_size: 227
 mean_value: 104
 mean_value: 117
 mean_value: 123
 }
 data_param {
 source: "examples/imagenet/ilsvrc12_train_lmdb"
 batch_size: 32
 backend: LMDB
 }
}
</code></pre>或者，均值中心化可以通过均值图像("data/ilsvrc12/imagenet_mean.binaryproto") 来取代mean_value. LMDB数据集的binaryproto的计算为：<pre data-language=><code class="language-shell line-numbers">cd ~/caffe
build/tools/compute_image_mean examples/imagenet/ilsvr12_train_lmdb 
data/ilsvrc12/imagenet_mean.binaryproto
</code></pre>根据实际需求，可以分别替换examples/imagenet/ilsvr12_train_lmdb和data/ilsvrc12/imagenet_mean.binaryproto为合适的 lmdb 文件夹和binaryproto文件.<h3>3.2 ImageData</h3>直接从图像文件得到images和labels.image_data_params 属性 
- source - 包含了输入数据和labels的文本文件名字参数 - batch_size[默认为1] - 一次处理的输入数目 
- new_height[默认为0] - 调整图像height值，如果为0，则忽略 
- new_width[默认为0] - 调整图像width值，如果为0，则忽略 
- shuffle[默认为0] - 打乱数据，如果为0，则忽略 
- rand_skip[默认为0] - 在开始处跳过的输入数目，对于async sgd有用详细介绍查看<a class="no-external-link" href="https://github.com/BVLC/caffe/blob/master/src/caffe/proto/caffe.proto" target="_blank">src/caffe/proto/caffe.proto</a>文件中ImageDataParameter参数.在 prototxt 中的定义形式:<pre data-language=><code class="language-txt line-numbers">layer {
 name: "data"
 type: "ImageData"
 top: "data"
 top: "label"
 include {
 phase: TRAIN
 }
 transform_param {
 mirror: true
 crop_size: 227
 mean_value: 104
 mean_value: 117
 mean_value: 123
 }
 image_data_param {
 source: "/path/to/file/train.txt"
 batch_size: 32
 shuffle: 1
 }
}
</code></pre>这里，图像进行了顺序打乱、裁剪、镜像和中心化处理. 需要注意的是，文本中每行应为图像名和对应的labels，比如，"tran.txt"形式：<pre data-language=><code class="language-txt line-numbers">/path/to/images/img3423.jpg 2
/path/to/images/img3424.jpg 13
/path/to/images/img3425.jpg 8
...
</code></pre><h3>3.3 Input</h3>指定数据维度时，采用零值 blob 作为输入数据.input_params 
属性 - shape - 指定为1或top blobs的维度信息在 prototxt 中的定义形式:<pre data-language=><code class="language-txt line-numbers">layer {
 name: "input"
 type: "Input"
 top: "data"
 input_param {
 shape {
 dim: 32
 dim: 3
 dim: 227
 dim: 227
 }
 }
}
</code></pre>等价形式:<pre data-language=><code class="language-txt line-numbers">input: "data"
input_dim: 32
input_dim: 3
input_dim: 227
input_dim: 227
</code></pre><h3>3.4 DummyData</h3>类似于 Input 类型，不同之处在于需要指定数据类型. 往往用于调试，详细可参考<a class="no-external-link" href="https://github.com/BVLC/caffe/blob/master/examples/pycaffe/linreg.prototxt" target="_blank">例子</a>dummy_data_params 属性 - shape - 指定为1或top blobs的维度信息参数 - data_filler[默认是值为0的ConstantFiller] - 指定top blob的值在 prototxt 中的定义形式:<pre data-language=><code class="language-txt line-numbers">layer {
 name: "data"
 type: "DummyData"
 top: "data"
 include {
 phase: TRAIN
 }
 dummy_data_param {
 data_filler {
 type: "constant"
 value: 0.01
 }
 shape {
 dim: 32
 dim: 3
 dim: 227
 dim: 227
 }
 }
}
layer {
 name: "data"
 type: "DummyData"
 top: "label"
 include {
 phase: TRAIN
 }
 dummy_data_param {
 data_filler {
 type: "constant"
 }
 shape {
 dim: 32
 }
 }
}
</code></pre><h3>3.5 MemoryData</h3>直接从内存读取数据，调用方式为：调用MemoryDataLayer::Reset (from C++)和Net.set_input_arrays (from Python)来读取连续的数据，一般是4D array，一次读取一个batch_size. 由于该方式需要将数据首先送到内存中，速率可能会慢，但一旦放到内存中，这种方式很有效率.memory_data_param 属性 - bacth_size，channels， height， width - 数据的维度信息在 prototxt 中的定义形式:<pre data-language=><code class="language-txt line-numbers">layers {
 name: "data"
 type: MEMORY_DATA
 top: "data"
 top: "label"
 transform_param {
 crop_size: 227
 mirror: true
 mean_file: "mean.binaryproto"
 }
 memory_data_param {
 batch_size: 32
 channels: 3
 height: 227
 width: 227
 }
</code></pre><h3>3.6 HDF5Data</h3>以HDF5格式文件来读取数据，对于很多任务都是可用的，但一般只用于FP32和FP64数据，不是uint8，故图像数据会很大. 
不允许使用transform_param. 只在必要的时候使用该方式.hdf5_data_param 属性 - source - 包含输入数据和labels路径的文本文件名 - batch_size参数 - shuffle[默认false] - 打乱HDF5文件顺序在 prototxt 中的定义形式:<pre data-language=><code class="language-txt line-numbers">layer {
 name: "data"
 type: "HDF5_DATA"
 top: "data"
 top: "label"
 include {
 phase: TRAIN
 }
 hdf5_data_param {
 source: "examples/hdf5_classification/data/train.txt"
 batch_size: 32
 }
}
</code></pre><h3>3.7 HDF5DataOutput</h3>HDF5输出层的作用与其他数据层相反，将输入数据块写入磁盘hdf5_output_param 属性 - file_name在 prototxt 中的定义形式:<pre data-language=><code class="language-txt line-numbers">layer {
 name: "data_output"
 type: "HDF5_OUTPUT"
 bottom: "data"
 bottom: "label"
 include {
 phase: TRAIN
 }
 hdf5_output_param {
 file_name: "output_file.h5"
 }
}
</code></pre><h3>3.8 WindowData</h3>用于detection，Read windows from image files class labels.window_data_param 属性 - source - 指定数据源 
- mean_file 
- batch_size参数 - mirror 
- crop_size - 随机裁剪图像 
- crop_mode[默认"warp"] - 裁剪detection window的模式，比如，"warp"裁剪为固定尺寸， "square"在window四周裁剪紧凑方框 - fg_threshold[默认0.5] - 前景重叠阈值(foreground (object) overlap threshold) 
- bg_threshold[默认0.5] - 背景重叠阈值(background (object) overlap threshold) 
- fg_fraction[默认0.25]: 前景物体交集(fraction of batch that should be foreground) objects 
- context_pad[默认10]: 围绕window补零数目(amount of contextual padding around a window)详细信息可参考<a class="no-external-link" href="https://github.com/BVLC/caffe/blob/master/src/caffe/proto/caffe.proto" target="_blank">src/caffe/proto/caffe.proto</a>文件中的WindowDataParameter参数.在 prototxt 中的定义形式:<pre data-language=><code class="language-txt line-numbers">layers {
 name: "data"
 type: "WINDOW_DATA"
 top: "data"
 top: "label"
 window_data_param {
 source: "/path/to/file/window_train.txt"
 mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
 batch_size: 128
 mirror: true
 crop_size: 227
 fg_threshold: 0.5
 bg_threshold: 0.5
 fg_fraction: 0.25
 context_pad: 16
 }
}
</code></pre><h2>4 数据集准备</h2>对于1-of-K分类任务推荐使用LMDB数据格式. 
在使用Caffe工具生成LMDB格式数据需要指定： 
- 数据所在目录 
- 输出目录，比如mydataset_train_lmdb，必须 
- 包含图像名和对应labels的文本文件，比如，"train.txt"，内容格式为：<pre data-language=><code class="language-txt line-numbers">img3423.jpg 2
img3424.jpg 13
img3425.jpg 8
...
</code></pre>如果数据分散在不同的文件夹， "train.txt"需要包含数据的绝对路径.<a class="no-external-link" href="https://github.com/RodriguezAndres/Kaggle-dogs-vs-cats/blob/master/create_label_file.py" target="_blank">create_label_file.py</a> 可以生成针对<a class="no-external-link" href="https://www.kaggle.com/c/dogs-vs-cats/data" target="_blank">Kaggle's Dog vs Cats Competition</a>任务的 training 和 validation 数据集划分，同样适用于其它任务.create_label_file.py<pre data-language=><code class="language-python line-numbers">#!/usr/bin/env python

import sys
import os
import os.path

def main():

TRAIN_TEXT_FILE = 'train.txt'
  VAL_TEXT_FILE = 'val.txt'
  IMAGE_FOLDER = 'train'

# Selects 10% of the images (the ones that end in '2') for validation

fr = open(TRAIN_TEXT_FILE, 'w')
  fv = open(VAL_TEXT_FILE, 'w')

filenames = os.listdir(IMAGE_FOLDER)
  for filename in filenames:
    if filename[0:3] == 'cat':
      if filename[-5] == '2':# or filename[-5] == '8':
        fv.write(filename + ' 0\n')
      else:
        fr.write(filename + ' 0\n')
    if filename[0:3] == 'dog':
      if filename[-5] == '2':# or filename[-5] == '8':
        fv.write(filename + ' 1\n')
      else:
        fr.write(filename + ' 1\n')

fr.close()
  fv.close()

# Standard boilerplate to call the main() function to begin the program.
if __name__ == '__main__':
 main()
</code></pre>在测试阶段，假设labels不存在的. 如果labels可用，可以采用相同的方法生成 test LMDB数据集.<h3>4.1 准备三通道数据（图像）</h3>下面的例子生成training LMDB，工作路径位于$CAFFE_ROOT<pre data-language=><code class="language-shell line-numbers">#!/usr/bin/env sh
# folder containing the training and validation images
TRAIN_DATA_ROOT=/path/to/training/images

# folder containing the file with the name of training images
DATA=/path/to/file
# folder for the lmdb datasets
OUTPUT=/path/to/output/directory
TOOLS=/path/to/caffe/build/tools

# Set to resize the images to 256x256
RESIZE_HEIGHT=256
RESIZE_WIDTH=256
echo "Creating train lmdb..."

# Delete the shuffle line if shuffle is not desired
GLOG_logtostderr=1 *TOOLS/convert_imageset 
 --resize_height=*RESIZE_HEIGHT 
 --resize_width=*RESIZE_WIDTH 
 --shuffle 
 *TRAIN_DATA_ROOT/ 
 *DATA/train.txt 
 *OUTPUT/mydataset_train_lmdb
echo "Done."
</code></pre>注: <code>*</code> 号替换为 <code>$</code> 符号.计算LMDB数据集的图像均值：<pre data-language=><code class="language-shell line-numbers">#!/usr/bin/env sh
# Compute the mean image in lmdb dataset
OUTPUT=/path/to/output/directory

# folder for the lmdb datasets and output for mean image
TOOLS=/path/to/caffe/build/tools

*TOOLS/compute_image_mean *OUTPUT/mydataset_train_lmdb 
  *OUTPUT/train_mean.binaryproto

*TOOLS/compute_image_mean *OUTPUT/mydataset_val_lmdb 
 *OUTPUT/val_mean.binaryproto
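</code></pre><h3>4.2 Preparing data with a different number of channels</h3>Grayscale images (one channel), RADAR images (two channels), videos (four channels), image plus depth (four channels), vibrometry (one channel), and spectrograms (one channel) need a wrapper to produce the LMDB dataset (<a class="no-external-link" href="https://github.com/BVLC/caffe/issues/1698#issuecomment-70211045" target="_blank">reference</a>).<h3>4.3 Resizing images</h3>There are two ways to resize images: 
- Warp the image to the desired size 
- Resize proportionally so that the smaller side matches the desired size, then center-crop the larger side to the desired size
Tools for resizing images: 
- Using OpenCV: build/tools/convert_imageset --resize_height=256 --resize_width=256 warps images to the desired size; convert_imageset calls ReadImageToDatum, which calls ReadImageToCVMat in src/caffe/util/io.cpp 
- Using ImageMagick: convert -resize 256x256\! &lt;input_img&gt; &lt;output_img&gt; warps an image to the desired size 
- Using OpenCV: the script tools/extra/resize_and_crop_images.py resizes images proportionally in multiple threads and then center-crops them<pre data-language=><code class="language-shell line-numbers">sudo pip install git+https://github.com/Yangqing/mincepie.git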
sudo apt-get install -y python-opencv
vi tools/extra/launch_resize_and_crop_images.sh # set number of clients (use num_of_cores*2); file.txt, input, and output folders
</code></pre>In addition, images can be cropped or resized inside the network by setting parameters in the data layer:<pre data-language=><code class="language-txt line-numbers">layer {
 name: "data"
 transform_param {
 crop_size: 227
...
}
</code></pre><pre data-language=><code class="language-txt line-numbers">layer {
 name: "data"
 image_data_param {
 new_height: 227
 new_width: 227
...
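</code></pre>The second strategy from this section (proportional resize, then center crop) comes down to a little arithmetic. The sketch below only computes the resized size and the crop box; it does not touch any pixels, and the function name resize_then_center_crop is made up for this illustration.<pre data-language=><code class="language-python line-numbers">def resize_then_center_crop(width, height, target):
    """Return the resized size and the crop box for a target x target output."""
    scale = float(target) / min(width, height)    # the smaller side becomes target
    new_w = max(target, int(round(width * scale)))
    new_h = max(target, int(round(height * scale)))
    left = (new_w - target) // 2                  # crop the larger side around the center
    top = (new_h - target) // 2
    return (new_w, new_h), (left, top, left + target, top + target)


print(resize_then_center_crop(640, 480, 256))
# ((341, 256), (42, 0, 298, 256))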
</code></pre><h2>5 Training</h2>Training a network requires: 
- train_val.prototxt: defines the network architecture, the parameter initialization, and the local learning rates 
- solver.prototxt: defines the optimization settings; this is the file passed to the training command 
- deploy.prototxt: used only for testing; essentially the same as train_val.prototxt, but without the data layers and the loss layer
Parameter initialization matters a great deal. The main fillers are: 
- gaussian: samples weights from a Gaussian distribution N(0, std) 
- xavier: samples weights from a uniform distribution U(-a, a), where a = sqrt(3 / fan_in) and fan_in is the number of incoming inputs 
- MSRAFiller: samples weights from a normal distribution N(0, a), where a = sqrt(2 / fan_in)
Learning-rate parameters: 
- base_lr: the initial learning rate (default: 0.01); reduce it if NaN values appear during training 
- lr_mult: the lr_mult of a bias is usually set to 2× the lr_mult of the non-bias weights
Taking LeNet as an example, the three files are <a class="no-external-link" href="https://github.com/BVLC/caffe/blob/master/examples/mnist/lenet_train_test.prototxt" target="_blank">lenet_train_test.prototxt</a>, <a class="no-external-link" href="https://github.com/BVLC/caffe/blob/master/examples/mnist/lenet.prototxt" target="_blank">deploy.prototxt</a>, and <a class="no-external-link" href="https://github.com/BVLC/caffe/blob/master/examples/mnist/lenet_solver.prototxt" target="_blank">solver.prototxt</a>. The solver.prototxt file:<pre data-language=><code class="language-txt line-numbers"># network definition
net: "examples/mnist/lenet_train_test.prototxt"

# run a validation test every 500 training iterations
test_interval: 500 
# number of iterations per validation test; the recommended value is num_val_imgs / batch_size
test_iter: 100

# base learning rate, momentum, and weight decay of the network
base_lr: 0.01
momentum: 0.9 
weight_decay: 0.0005

# learning rate policies
#  fixed: always return base_lr.
#  step: return base_lr * gamma ^ (floor(iter / stepsize))
#  exp: return base_lr * gamma ^ iter
#  inv: return base_lr * (1 + gamma * iter) ^ (- power)
#  multistep: similar to step but it allows non uniform steps defined by stepvalue
#  poly: the effective learning rate follows a polynomial decay, to be zero by the max_iter: return base_lr * (1 - iter/max_iter) ^ (power)
#  sigmoid: the effective learning rate follows a sigmoid decay: return base_lr * ( 1/(1 + exp(-gamma * (iter - stepsize))))
lr_policy: "step"
gamma: 0.1 
stepsize: 10000 # Drop the learning rate in steps by a factor of gamma every stepsize iterations

# display progress every 100 iterations
display: 100

# maximum number of iterations
max_iter: 10000

# write a snapshot (solver state and model weights) every 5000 iterations
snapshot: 5000
snapshot_prefix: "examples/mnist/lenet_multistep"

# solver mode: CPU or GPU
solver_mode: CPU
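</code></pre>To make the learning rate policies listed in the solver comments concrete, here is a small sketch that evaluates those formulas directly. It is not Caffe code: the function name learning_rate and its keyword defaults are assumptions for illustration, and multistep is left out because it needs a list of stepvalue entries.<pre data-language=><code class="language-python line-numbers">import math


def learning_rate(policy, it, base_lr, gamma=0.1, stepsize=1, power=1.0, max_iter=1):
    """Learning rate at iteration it, following the formulas in the solver comments."""
    if policy == "fixed":
        return base_lr
    if policy == "step":
        return base_lr * gamma ** (it // stepsize)
    if policy == "exp":
        return base_lr * gamma ** it
    if policy == "inv":
        return base_lr * (1.0 + gamma * it) ** (-power)
    if policy == "poly":
        return base_lr * (1.0 - float(it) / max_iter) ** power
    if policy == "sigmoid":
        return base_lr / (1.0 + math.exp(-gamma * (it - stepsize)))
    raise ValueError("unknown lr_policy: " + policy)


# The solver above: base_lr 0.01, lr_policy "step", gamma 0.1, stepsize 10000
for it in (0, 9999, 10000):
    print(it, learning_rate("step", it, base_lr=0.01, gamma=0.1, stepsize=10000))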
</code></pre>Train the network:<pre data-language=><code class="language-shell line-numbers">$CAFFE_ROOT/build/tools/caffe train -solver solver.prototxt
</code></pre>Training produces two kinds of files, for example: 
- lenet_multistep_10000.caffemodel: the network weights, i.e. the model parameters used for testing 
- lenet_multistep_10000.solverstate: the solver state, used to resume training if it is interrupted
Train the network and plot accuracy or loss on the validation set versus iterations:<pre data-language=><code class="language-shell line-numbers">#CHART_TYPE=[0-7]
# 0: Test accuracy vs. Iters
# 1: Test accuracy vs. Seconds
# 2: Test loss vs. Iters
# 3: Test loss vs. Seconds
# 4: Train learning rate vs. Iters
# 5: Train learning rate vs. Seconds
# 6: Train loss vs. Iters
# 7: Train loss vs. Seconds
CHART_TYPE=0
$CAFFE_ROOT/build/tools/caffe train -solver solver.prototxt 2&gt;&amp;1 | tee logfile.log
python $CAFFE_ROOT/tools/extra/plot_training_log.py.example $CHART_TYPE name_of_plot.png logfile.log
</code></pre>Dropout is applied to fully connected layers. During the forward pass of training it randomly deactivates a subset of units to prevent co-adaptation, which reduces overfitting. 
It is ignored at test time.<pre data-language=><code class="language-txt line-numbers">layer {
 name: "fc6"
 type: "InnerProduct"
 bottom: "pool5"
 top: "fc6"
 param {
 lr_mult: 1
 decay_mult: 1
 }
 param {
 lr_mult: 2
 decay_mult: 0
 }
 inner_product_param {
 num_output: 4096
 weight_filler {
 type: "gaussian"
 std: 0.005
 } 
 bias_filler {
 type: "constant"
 value: 1
 } 
 }
}
layer {
 name: "relu6"
 type: "ReLU"
 bottom: "fc6"
 top: "fc6"
}
layer {
 name: "drop6"
 type: "Dropout"
 bottom: "fc6"
 top: "fc6"
 dropout_param {
 dropout_ratio: 0.5 
 }
}
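</code></pre>The effect of dropout_ratio can be sketched in a few lines. The example assumes the common "inverted dropout" convention, in which the activations that survive are scaled by 1 / (1 - dropout_ratio) during training so that nothing needs to change at test time; the function dropout_forward is a hypothetical helper, not part of Caffe.<pre data-language=><code class="language-python line-numbers">import random


def dropout_forward(values, dropout_ratio, train, rng):
    """Inverted dropout on a list of activations."""
    if not train or dropout_ratio == 0:
        return list(values)                      # test time: identity
    scale = 1.0 / (1.0 - dropout_ratio)          # keeps the expected activation unchanged
    # each mask entry is 0 with probability dropout_ratio, otherwise 1
    mask = rng.choices([0, 1], weights=[dropout_ratio, 1.0 - dropout_ratio], k=len(values))
    return [v * m * scale for v, m in zip(values, mask)]


rng = random.Random(42)
activations = [1.0, 2.0, 3.0, 4.0]
print(dropout_forward(activations, 0.5, train=True, rng=rng))   # each entry is 0 or doubled
print(dropout_forward(activations, 0.5, train=False, rng=rng))  # unchanged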
</code></pre>Time the forward and backward propagation without updating the weights:<pre data-language=><code class="language-shell line-numbers"># Time NUMITER=50 forward and backward passes; reports the total and the average time
# The training samples and mean.binaryproto may be required
NUMITER=50
/path/to/caffe/build/tools/caffe time --model=train_val.prototxt -iterations $NUMITER
</code></pre>The Linux numactl utility can be used to control the memory allocation policy:<pre data-language=><code class="language-shell line-numbers">numactl -i all /path/to/caffe/build/tools/caffe time --model=train_val.prototxt -iterations $NUMITER
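</code></pre><h3>Caffe Model Zoo</h3>The <a class="no-external-link" href="https://github.com/BVLC/caffe/wiki/Model-Zoo" target="_blank">Caffe Model Zoo</a> provides network models and trained weights for a variety of tasks, which are convenient for fine-tuning or testing.<h2>6 Multinode distributed training</h2>This section is based on <a class="no-external-link" href="http://github.com/intel/caffe/wiki/Multinode---How-to-...%3F" target="_blank">Intel's Caffe GitHub wiki</a>. There are two main approaches to multinode distributed training: 
- Model parallelism 
- Data parallelism
In model parallelism, the model is split across the nodes and every node processes all of the data. In data parallelism, the data batch is split across the nodes and every node holds a full copy of the model parameters. Data parallelism works well when the model has few weights and the batch is large. The two can also be combined in a hybrid scheme: layers with few weights, such as convolutional layers, are trained with data parallelism, and layers with many weights, such as fully connected layers, are trained with model parallelism. The paper <a class="no-external-link" href="http://arxiv.org/abs/1602.06709" target="_blank">Distributed Deep Learning Using Synchronous Stochastic Gradient Descent (Intel, 2016)</a> gives a theoretical analysis of the optimal balance between data and model parallelism in the hybrid approach. Given the popularity of deep networks with relatively few weights, such as GoogLeNet and ResNet, and the <a class="no-external-link" href="http://arxiv.org/abs/1511.00175" target="_blank">success</a> of data-parallel distributed training, Caffe-Intel supports data parallelism. Multinode distributed training is still under active development. To train on multiple nodes, modify Makefile.config:<pre data-language=><code class="language-txt line-numbers">USE_MPI := 1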
# update with the path to binary MPI library
CXX := /usr/bin/mpicxx
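</code></pre>The idea behind data parallelism with synchronous SGD can be sketched without MPI. In this toy example every "node" holds the same weight w, computes a gradient on its own shard of the batch, and the gradients are averaged before a single shared update. The model (y = w * x with a squared loss), the shard layout, and the function names are assumptions for illustration only; they do not reflect how Caffe-Intel implements it.<pre data-language=><code class="language-python line-numbers">def shard_gradient(w, shard):
    """Mean gradient of 0.5 * (w * x - y) ** 2 over one node's shard."""
    return sum((w * x - y) * x for x, y in shard) / float(len(shard))


def data_parallel_step(w, shards, lr):
    """Every node holds the same w and computes a gradient on its own shard;
    the gradients are averaged (an all-reduce) before the update."""
    grads = [shard_gradient(w, shard) for shard in shards]   # runs in parallel on real nodes
    avg_grad = sum(grads) / float(len(grads))
    return w - lr * avg_grad


batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]      # samples of y = 2 * x
shards = [batch[:2], batch[2:]]                                # two nodes, equal shards
w = 0.0
for _ in range(50):
    w = data_parallel_step(w, shards, lr=0.05)
print(round(w, 4))                                             # close to 2.0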
</code></pre>Launching multinode training is also straightforward:<pre data-language=><code class="language-shell line-numbers">mpirun --hostfile path/to/hostfile -n &lt;num_processes&gt; /path/to/caffe/build/tools/caffe train --solver=/path/to/solver.prototxt --param_server=mpi
</code></pre>where 
- &lt;num_processes&gt;: the number of nodes to use 
- hostfile: a file with the IP address of one node per line
The solver.prototxt file points each node to its train.prototxt, and each train.prototxt must point to a different part of the dataset. For details, see the <a class="no-external-link" href="https://github.com/intel/caffe/wiki/Cifar10-multinode#data" target="_blank">related guide</a>.<h2>7 Fine-tuning</h2>Fine-tuning reuses the network architecture defined in the prototxt files, with two main changes:<ul>
<li>1 Change the data layer to read the new data</li>
</ul><pre data-language=><code class="language-txt line-numbers">layer {
 name: "mnist"
 type: "Data"
 top: "data"
 top: "label"
 transform_param {
 scale: 0.00390625 # 1/255
 }
 data_param {
 source: "newdata_lmdb" # points to the new dataset
 batch_size: 64
 backend: LMDB
 }
}
</code></pre><ul>
<li>2 Change the output layer, here ip2 (note: make the same change in deploy.prototxt)</li>
</ul><pre data-language=><code class="language-txt line-numbers">layer {
 name: "ip2-ft" # new layer name
 type: "InnerProduct"
 bottom: "ip1"
 top: "ip2-ft" # new output blob name
 param {
 lr_mult: 1
 }
 param {
 lr_mult: 2
 }
 inner_product_param {
 num_output: 2 # number of classes in the new dataset, here 2
 bias_filler {
 type: "constant"
 }
 }
}
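</code></pre>Renaming the output layer matters because Caffe matches layers by name when it loads weights from a .caffemodel: a layer whose name is not found in the file keeps its fresh initialization. The dictionary-based sketch below mimics that behavior; init_from_pretrained and the toy weight lists are hypothetical and only illustrate the matching rule.<pre data-language=><code class="language-python line-numbers">def init_from_pretrained(new_layers, pretrained):
    """new_layers: layer name to freshly initialized weights;
    pretrained: layer name to trained weights."""
    result = {}
    for name, fresh in new_layers.items():
        # same name: reuse the trained weights; new name: keep the fresh initialization
        result[name] = pretrained.get(name, fresh)
    return result


pretrained = {"conv1": [0.2, -0.1], "ip1": [0.5], "ip2": [0.3] * 10}
new_net = {"conv1": [0.0, 0.0], "ip1": [0.0], "ip2-ft": [0.0] * 2}
weights = init_from_pretrained(new_net, pretrained)
print(sorted(weights))        # ['conv1', 'ip1', 'ip2-ft']
print(weights["conv1"])       # [0.2, -0.1], copied from the trained model
print(weights["ip2-ft"])      # [0.0, 0.0], not copied because the name is new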
</code></pre>Fine-tune in Caffe:<pre data-language=><code class="language-shell line-numbers">#From the command line on $CAFFE_ROOT
./build/tools/caffe train -solver /path/to/solver.prototxt -weights /path/to/trained_model.caffemodel
</code></pre>Fine-tuning guidelines: 
- Learn the last layer first and leave the other layers unchanged 
- Reduce the initial learning rate, typically by 10× or 100× 
- Caffe layers have a local learning rate multiplier, lr_mult 
- Freeze all layers except the last one or two for fast optimization, i.e. set lr_mult=0 in the frozen layers 
- Increase the local learning rate of the last layer by 10× and of the second-to-last layer by 5× 
- If the result is good enough, stop; otherwise fine-tune other layers
What fine-tuning does: 
- Creates a new network 
- Copies the previously trained weights to initialize the network 
- Proceeds like regular training; see this <a class="no-external-link" href="http://caffe.berkeleyvision.org/gathered/examples/finetune_flickr_style.html" target="_blank">example</a>.<h2>8 Testing</h2>Testing, also called inference, classification, or scoring, can be done with Caffe's Python interface or with the C++ utility. The C++ utility is less flexible, so Python is recommended. Classifying an image, a signal, or a set of images requires: 
- The image(s) 
- The network architecture 
- The network weights<h3>8.1 Testing a set of images</h3>The model prototxt should contain a data layer with phase TEST that points to the test dataset, so that the model can be evaluated:<pre data-language=><code class="language-shell line-numbers">/path/to/caffe/build/tools/caffe test -model /path/to/train_val.prototxt \
 -weights /path/to/trained_model.caffemodel -iterations &lt;num_iter&gt;
</code></pre><h3>8.2 Testing a single image</h3>This example follows the <a class="no-external-link" href="https://github.com/BVLC/caffe/blob/886563bb49080acf4479395025ccd39f733473e8/examples/cpp_classification/readme.md" target="_blank">cpp_classification readme</a>. First, download the trained model that will classify the image:<pre data-language=><code class="language-shell line-numbers">./scripts/download_model_binary.py models/bvlc_reference_caffenet
</code></pre>Next, download the dataset labels (here for <a class="no-external-link" href="http://www.image-net.org/challenges/LSVRC/2012/" target="_blank">ILSVRC2012</a>), which map the network's predictions to class names:<pre data-language=><code class="language-shell line-numbers">./data/ilsvrc12/get_ilsvrc_aux.sh
</code></pre>Finally, classify the image:<pre data-language=><code class="language-shell line-numbers">./build/examples/cpp_classification/classification.bin \
 models/bvlc_reference_caffenet/deploy.prototxt \
 models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel \
 data/ilsvrc12/imagenet_mean.binaryproto \
 data/ilsvrc12/synset_words.txt \
 examples/images/cat.jpg
</code></pre>The output looks like this:<pre data-language=><code class="language-txt line-numbers">---------- Prediction for examples/images/cat.jpg ----------
0.3134 - "n02123045 tabby, tabby cat"
0.2380 - "n02123159 tiger cat"
0.1235 - "n02124075 Egyptian cat"
0.1003 - "n02119022 red fox, Vulpes vulpes"
0.0715 - "n02127052 lynx, catamount"
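</code></pre>A ranked list like the one above is simply the network's output probabilities sorted in descending order and paired with the label file. A minimal sketch, with a made-up four-class probability vector and the hypothetical helper top_k:<pre data-language=><code class="language-python line-numbers">def top_k(probabilities, labels, k=5):
    """Return the k most probable (probability, label) pairs, best first."""
    ranked = sorted(zip(probabilities, labels), reverse=True)
    return ranked[:k]


probs = [0.05, 0.60, 0.10, 0.25]                      # made-up network output
labels = ["lynx", "tabby", "red fox", "tiger cat"]
for p, label in top_k(probs, labels, k=2):
    print("%.4f - %s" % (p, label))
# 0.6000 - tabby
# 0.2500 - tiger cat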
</code></pre><h2>9 Feature extraction and visualization</h2>In Caffe, the weight blob of a convolutional layer has the shape output_feature_maps x input_feature_maps x height x width; feature maps are also called channels. Features can be extracted in two ways: with the Python API or with the C++ API. With the C++ utility:<pre data-language=><code class="language-shell line-numbers"># Download the model weights
scripts/download_model_binary.py models/bvlc_reference_caffenet

# Generate a list of the files to process
# Use the images that ship with caffe
find `pwd`/examples/images -type f -exec echo {} \; &gt; examples/images/test.txt

# Add a 0 to the end of each line
# input data structures expect labels after each image file name
sed -i "s/$/ 0/" examples/images/test.txt

# Get the mean of the training set to subtract it from images
./data/ilsvrc12/get_ilsvrc_aux.sh

# Copy and modify the data layer to load and resize the images:
cp examples/feature_extraction/imagenet_val.prototxt examples/images
vi examples/images/imagenet_val.prototxt

# Extract the features
./build/tools/extract_features.bin models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel \
 examples/images/imagenet_val.prototxt fc7 examples/images/features 10 lmdb
</code></pre>This extracts the fc7 blob, which represents the highest-level features of the model. Other layers, such as conv5 or pool3, can be extracted in the same way. Of the last two arguments, 10 is the number of mini-batches and lmdb is the database type; the extracted features are stored in the database directory examples/images/features.<h2>10 Python API</h2>Caffe provides a Python API for testing, classification, feature extraction, network definition, and network training.<h3>10.1 Setting up the Caffe Python API</h3>After building Caffe, also run make pycaffe; the module can then be imported:<pre data-language=><code class="language-python line-numbers">import sys 
CAFFE_ROOT = '/path/to/caffe/' # make sure this path is correct
sys.path.insert(0, CAFFE_ROOT + 'python')
import caffe
caffe.set_mode_cpu() # CPU mode
</code></pre><h3>10.2 Loading the network</h3>The network architecture is defined in train_val.prototxt or deploy.prototxt:<pre data-language=><code class="language-python line-numbers">net = caffe.Net('train_val.prototxt', caffe.TRAIN)
</code></pre>To also load trained weights:<pre data-language=><code class="language-python line-numbers">net = caffe.Net('deploy.prototxt', 'trained_model.caffemodel', caffe.TRAIN)
</code></pre>The net object holds the data blobs (net.blobs) and the parameter blobs (net.params). Taking the conv1 layer as an example: 
- net.blobs['conv1']: the output data of conv1, also called feature maps 
- net.params['conv1'][0]: the weights of conv1 
- net.params['conv1'][1]: the biases of conv1 
- net.blobs.items(): the data blobs of all layers<h3>10.3 Visualizing the network</h3>This requires the pydot and graphviz packages:<pre data-language=><code class="language-shell line-numbers">sudo apt-get install -y graphviz
sudo pip install pydot
</code></pre>Draw the network with Caffe's draw_net.py script:<pre data-language=><code class="language-python line-numbers">python python/draw_net.py examples/net_surgery/deploy.prototxt train_val_net.png
open train_val_net.png
</code></pre><h3>10.4 Loading input data</h3><ul>
<li>Option 1: reshape the data blob to match the image size</li>
</ul><pre data-language=><code class="language-python line-numbers">import numpy as np
from PIL import Image
# get input image and arrange it as a 4-D tensor
im = np.array(Image.open('/path/to/caffe/examples/images/cat_gray.jpg'))
im = im[np.newaxis, np.newaxis, :, :]
# resize the blob to be the size of the input image
net.blobs['data'].reshape(im.shape) # if the image input is different 
# compute the blobs given the input data
net.blobs['data'].data[...] = im
</code></pre><ul>
<li>Option 2: resize the input image to match the data blob of the network</li>
</ul><pre data-language=><code class="language-python line-numbers">im = caffe.io.load_image('/path/to/caffe/examples/images/cat_gray.jpg', color=False)
shape = net.blobs['data'].data.shape # (batch, channels, height, width)
# resize the img to be the size of the data blob
im = caffe.io.resize_image(im, (shape[2], shape[3]))
# move channels first (H x W x C to C x H x W); assumes a one-channel data blob
net.blobs['data'].data[...] = im.transpose(2, 0, 1)
</code></pre><ul>
<li>The input is usually transformed before it is passed to the network</li>
</ul><pre data-language=><code class="language-python line-numbers">net = caffe.Net('deploy.prototxt', 'trained_model.caffemodel', caffe.TRAIN)
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
ilsvrc_mean = 'python/caffe/imagenet/ilsvrc_2012_mean.npy'
transformer.set_mean('data', np.load(ilsvrc_mean).mean(1).mean(1))
# puts the channel as the first dimension
transformer.set_transpose('data', (2,0,1))
# (2,1,0) maps RGB to BGR for example
transformer.set_channel_swap('data', (2,1,0))
transformer.set_raw_scale('data', 255.0)
# the batch size can be changed on-the-fly
net.blobs['data'].reshape(1,3,227,227)
# load the image in the data layer
im = caffe.io.load_image('/path/to/caffe/examples/images/cat_gray.jpg')
# transform the image and store it in the net.blob
net.blobs['data'].data[...] = transformer.preprocess('data', im)
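</code></pre>To show what the transformer settings do to the pixel values, the sketch below applies the same steps to a single RGB pixel with plain Python lists. The order used here (transpose, channel swap, raw scale, mean subtraction) is an assumption about how caffe.io.Transformer.preprocess combines the settings, and the function preprocess is a hypothetical stand-in, not the library code.<pre data-language=><code class="language-python line-numbers">def preprocess(image_hwc, channel_swap, raw_scale, mean_per_channel):
    """image_hwc: H x W x C nested lists with values in [0, 1]."""
    height = len(image_hwc)
    width = len(image_hwc[0])
    channels = len(image_hwc[0][0])
    # set_transpose((2, 0, 1)): H x W x C to C x H x W
    chw = [[[image_hwc[y][x][c] for x in range(width)]
            for y in range(height)]
           for c in range(channels)]
    # set_channel_swap((2, 1, 0)): reorder the channels, for example RGB to BGR
    chw = [chw[c] for c in channel_swap]
    # set_raw_scale(255.0), then set_mean: subtract one mean value per channel
    return [[[value * raw_scale - mean_per_channel[c] for value in row]
             for row in chw[c]]
            for c in range(channels)]


pixel = [[[1.0, 0.5, 0.0]]]                      # a 1 x 1 RGB image
print(preprocess(pixel, (2, 1, 0), 255.0, [104.0, 117.0, 123.0]))
# [[[-104.0]], [[10.5]], [[132.0]]]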
</code></pre>Display the image:<pre data-language=><code class="language-python line-numbers">import matplotlib.pyplot as plt
plt.imshow(im)
</code></pre><h3>10.5 Inference API</h3>Predict the class of the input image:<pre data-language=><code class="language-python line-numbers"># assumes that images are loaded
prediction = net.forward()
print('predicted class: %d' % prediction['prob'].argmax())
</code></pre>The forward propagation can also be timed (this excludes data preprocessing):<pre data-language=><code class="language-python line-numbers">%timeit net.forward() # IPython magic
</code></pre>Caffe also provides a Python API that transforms and classifies several inputs at once, caffe.Classifier, which can replace caffe.Net and caffe.io.Transformer.<pre data-language=><code class="language-python line-numbers">im1 = caffe.io.load_image('/path/to/caffe/examples/images/cat.jpg')
im2 = caffe.io.load_image('/path/to/caffe/examples/images/fish-bike.jpg')
imgs = [im1, im2]
ilsvrc_mean = '/path/to/caffe/python/caffe/imagenet/ilsvrc_2012_mean.npy'
net = caffe.Classifier('deploy.prototxt', 'trained_model.caffemodel',
 mean=np.load(ilsvrc_mean).mean(1).mean(1),
 channel_swap=(2,1,0),
 raw_scale=255,
 image_dims=(256, 256))
prediction = net.predict(imgs) # predict takes any number of images
print('predicted classes: %d %d' % (prediction[0].argmax(), prediction[1].argmax()))
</code></pre>To classify a folder of images, only the imgs list needs to change:<pre data-language=><code class="language-python line-numbers">IMAGES_FOLDER = '/path/to/folder/w/images/'
import os
images = os.listdir(IMAGES_FOLDER)
imgs = [ caffe.io.load_image(IMAGES_FOLDER + im) for im in images ]
</code></pre><pre data-language=><code class="language-python line-numbers">plt.plot(prediction[0]) # plot the probabilities of all classes
%timeit net.predict([im1]) # timing (IPython magic)
%timeit net.predict([im1], oversample=0)
</code></pre><h3>10.6 Feature extraction and visualization API</h3>Taking the fc7 layer as an example:<pre data-language=><code class="language-python line-numbers"># Retrieve details of the network's layers
[(k, v.data.shape) for k, v in net.blobs.items()]

# Retrieve weights of the network's layers
[(k, v[0].data.shape) for k, v in net.params.items()]

# Retrieve the features in the last fully connected layer
# prior to outputting class probabilities
feat = net.blobs['fc7'].data[4]

# Retrieve size/dimensions of the array
feat.shape

# Assumes that the "net = caffe.Classifier" module has been called
# and data has been formatted as in the example above

# Take an array of shape (n, height, width) or (n, height, width, channels)
# and visualize each (height, width) section in a grid
# of size approx. sqrt(n) by sqrt(n)
def vis_square(data, padsize=1, padval=0):
    # values between 0 and 1
    data -= data.min()
    data /= data.max()

    # force the number of filters to be square
    n = int(np.ceil(np.sqrt(data.shape[0])))
    padding = ((0, n ** 2 - data.shape[0]), (0, padsize), (0, padsize)) + ((0, 0),) * (data.ndim - 3)
    data = np.pad(data, padding, mode='constant', constant_values=(padval, padval))

    # tile the filters into an image
    data = data.reshape((n, n) + data.shape[1:]).transpose((0, 2, 1, 3) + tuple(range(4, data.ndim + 1)))
    data = data.reshape((n * data.shape[1], n * data.shape[3]) + data.shape[4:])

    plt.imshow(data)

plt.rcParams['figure.figsize'] = (25.0, 20.0)

# visualize the weights after the 1st conv layer
net.params['conv1'][0].data.shape
filters = net.params['conv1'][0].data
vis_square(filters.transpose(0, 2, 3, 1))

# visualize the feature maps after 1st conv layer
net.blobs['conv1'].data.shape
feat = net.blobs['conv1'].data[0,:96]
vis_square(feat, padval=1)

# visualize the feature maps after the 2nd conv layer
net.blobs['conv2'].data.shape
feat = net.blobs['conv2'].data[0,:96]
vis_square(feat, padval=1)

# visualize the feature maps after the 2nd pool layer
net.blobs['pool2'].data.shape
feat = net.blobs['pool2'].data[0,:256] # change 256 to number of pool outputs
vis_square(feat, padval=1)

# Visualize the neuron activations for the 2nd fully-connected layer
net.blobs['ip2'].data.shape
feat = net.blobs['ip2'].data[0]
plt.plot(feat.flat)
plt.legend()
plt.show()
</code></pre><h3>10.7 Network definition API</h3><pre data-language=><code class="language-python line-numbers">from caffe import layers as L
from caffe import params as P

def lenet(lmdb, batch_size):
    # auto generated LeNet
    n = caffe.NetSpec()
    n.data, n.label = L.Data(batch_size=batch_size, backend=P.Data.LMDB, source=lmdb, transform_param=dict(scale=1./255), ntop=2)
    n.conv1 = L.Convolution(n.data, kernel_size=5, num_output=20, weight_filler=dict(type='xavier'))
    n.pool1 = L.Pooling(n.conv1, kernel_size=2, stride=2, pool=P.Pooling.MAX)
    n.conv2 = L.Convolution(n.pool1, kernel_size=5, num_output=50, weight_filler=dict(type='xavier'))
    n.pool2 = L.Pooling(n.conv2, kernel_size=2, stride=2, pool=P.Pooling.MAX)
    n.ip1 = L.InnerProduct(n.pool2, num_output=500, weight_filler=dict(type='xavier'))
    n.relu1 = L.ReLU(n.ip1, in_place=True)
    n.ip2 = L.InnerProduct(n.relu1, num_output=10, weight_filler=dict(type='xavier'))
    n.loss = L.SoftmaxWithLoss(n.ip2, n.label)
    return n.to_proto()

with open('examples/mnist/lenet_auto_train.prototxt', 'w') as f:
    f.write(str(lenet('examples/mnist/mnist_train_lmdb', 64)))

with open('examples/mnist/lenet_auto_test.prototxt', 'w') as f:
    f.write(str(lenet('examples/mnist/mnist_test_lmdb', 100)))
</code></pre>The generated prototxt file looks like this:<pre data-language=><code class="language-txt line-numbers">layer {
 name: "data"
 type: "Data"
 top: "data"
 top: "label"
 transform_param {
 scale: 0.00392156862745
 }
 data_param {
 source: "examples/mnist/mnist_train_lmdb"
 batch_size: 64
 backend: LMDB
 }
}
layer {
 name: "conv1"
 type: "Convolution"
 bottom: "data"
 top: "conv1"
 convolution_param {
 num_output: 20
 kernel_size: 5
 weight_filler {
 type: "xavier"
 }
 }
}
layer {
 name: "pool1"
 type: "Pooling"
 bottom: "conv1"
 top: "pool1"
 pooling_param {
 pool: MAX
 kernel_size: 2
 stride: 2
 }
}
layer {
 name: "conv2"
 type: "Convolution"
 bottom: "pool1"
 top: "conv2"
 convolution_param {
 num_output: 50
 kernel_size: 5
 weight_filler {
 type: "xavier"
 }
 }
}
layer {
 name: "pool2"
 type: "Pooling"
 bottom: "conv2"
 top: "pool2"
 pooling_param {
 pool: MAX
 kernel_size: 2
 stride: 2
 }
}
layer {
 name: "ip1"
 type: "InnerProduct"
 bottom: "pool2"
 top: "ip1"
 inner_product_param {
 num_output: 500
 weight_filler {
 type: "xavier"
 }
 }
}
layer {
 name: "relu1"
 type: "ReLU"
 bottom: "ip1"
 top: "ip1"
}
layer {
 name: "ip2"
 type: "InnerProduct"
 bottom: "ip1"
 top: "ip2"
 inner_product_param {
 num_output: 10
 weight_filler {
 type: "xavier"
 }
 }
}
layer {
 name: "loss"
 type: "SoftmaxWithLoss"
 bottom: "ip2"
 bottom: "label"
 top: "loss"
}
</code></pre><h3>10.8 Training API</h3><pre data-language=><code class="language-python line-numbers">solver = caffe.get_solver('models/bvlc_reference_caffenet/solver.prototxt')
net = solver.net # the training network owned by the solver
solver.net.forward() # train net
solver.test_nets[0].forward() # test net (there can be more than one)

solver.net.backward() # compute the gradients
# data gradients
net.blobs['conv1'].diff
# weight gradients
net.params['conv1'][0].diff
# biases gradients
net.params['conv1'][1].diff

solver.step(1) # run one iteration: one forward propagation, one backward propagation, and one weight update

solver.solve() # run the remaining iterations up to max_iter defined in solver.prototxt
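</code></pre>What one solver iteration does to a weight can be sketched with the usual SGD-with-momentum update. The example assumes the rule V = momentum * V - lr * (gradient + weight_decay * W) followed by W = W + V; the function sgd_momentum_step is a hypothetical helper on plain lists, and the constants are the base_lr, momentum, and weight_decay values from the solver.prototxt in section 5.<pre data-language=><code class="language-python line-numbers">def sgd_momentum_step(weights, grads, velocity, lr, momentum, weight_decay):
    """One update of a list of weights; returns (new_weights, new_velocity)."""
    new_weights, new_velocity = [], []
    for w, g, v in zip(weights, grads, velocity):
        g = g + weight_decay * w                 # L2 regularization term
        v = momentum * v - lr * g                # decaying sum of past updates
        new_weights.append(w + v)
        new_velocity.append(v)
    return new_weights, new_velocity


w, v = [1.0], [0.0]
for _ in range(2):
    w, v = sgd_momentum_step(w, [2.0], v, lr=0.01, momentum=0.9, weight_decay=0.0005)
print(w, v)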
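</code></pre><h2>11 Debugging</h2>This section is optional and intended only for Caffe developers. Useful debugging techniques: 
- Remove randomness 
- Compare caffemodels 
- Use Caffe's debug info
Removing randomness makes inputs and outputs reproducible. Randomness enters at several stages: 
- Weights are usually initialized randomly from a probability distribution, such as a Gaussian 
- Input images may be randomly mirrored, randomly cropped, and shuffled 
- Dropout layers train a random subset of units and ignore the rest
One remedy is to set a seed by adding the following to solver.prototxt:<pre data-language=><code class=" line-numbers"># pick some value for random_seed that is greater or equal to 1, for example:
random_seed: 42
</code></pre>This ensures that the same 'random' values are used on every run. However, the same seed can produce different values on different machines. A more robust approach that works across machines is: 
- Prepare the data with the images in the same shuffled order, that is, do not reshuffle between experiments 
- In the ImageData layer of train.prototxt, set transform_param so that images are neither cropped nor mirrored:<pre data-language=><code class="language-txt line-numbers">layer {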
 name: "data"
 type: "ImageData"
 top: "data"
 top: "label"
 include {
 phase: TRAIN
 }
 transform_param {
 # mirror: true
 # crop_size: 227
 mean_value: 104
 mean_value: 117
 mean_value: 123
 }
 image_data_param {
 source: "/path/to/file/train.txt"
 batch_size: 32
 new_height: 224
 new_width: 224
 }
}
</code></pre><ul>
<li>In the dropout layers of train.prototxt, set dropout_ratio to 0</li>
<li>In solver.prototxt, set lr_policy: "fixed"</li>
<li>In solver.prototxt, add debug_info: 1</li>
</ul>To compare two caffemodels, the following script computes the sum of the absolute differences over all weights of the two models:<pre data-language=><code class="language-python line-numbers"># Intel Corporation
# Author: Ravi Panchumarthy

import sys, os, argparse, time
import pdb
import numpy as np

def get_args():
    parser = argparse.ArgumentParser('Compare weights of two caffe models')

    parser.add_argument('-m1', dest='modelFile1', type=str, required=True,
                        help='Caffe model weights file to compare')
    parser.add_argument('-m2', dest='modelFile2', type=str, required=True,
                        help='Caffe model weights file to compare against')
    parser.add_argument('-n', dest='netFile', type=str, required=True,
                        help='Network prototxt file associated with model')
    return parser.parse_args()

if __name__ == "__main__":
    import caffe

    args = get_args()
    net = caffe.Net(args.netFile, args.modelFile1, caffe.TRAIN)
    net2compare = caffe.Net(args.netFile, args.modelFile2, caffe.TRAIN)

    wt_sumOfAbsDiffByName = dict()
    bias_sumOfAbsDiffByName = dict()

    # items() works in both Python 2 and Python 3
    for name, blobs in net.params.items():
        wt_diffTensor = np.subtract(net.params[name][0].data, net2compare.params[name][0].data)
        wt_absDiffTensor = np.absolute(wt_diffTensor)
        wt_sumOfAbsDiff = wt_absDiffTensor.sum()
        wt_sumOfAbsDiffByName.update({name : wt_sumOfAbsDiff})

        # if args.layerDebug == 1:
        #     print("%s : %s" % (name,wt_sumOfAbsDiff))

        bias_diffTensor = np.subtract(net.params[name][1].data, net2compare.params[name][1].data)
        bias_absDiffTensor = np.absolute(bias_diffTensor)
        bias_sumOfAbsDiff = bias_absDiffTensor.sum()
        bias_sumOfAbsDiffByName.update({name : bias_sumOfAbsDiff})

    print("\nThe sum of absolute difference of all layer's weight is : %s" % sum(wt_sumOfAbsDiffByName.values()))
    print("The sum of absolute difference of all layer's bias is : %s" % sum(bias_sumOfAbsDiffByName.values()))

    finalDiffVal = sum(wt_sumOfAbsDiffByName.values())+ sum(bias_sumOfAbsDiffByName.values())
    print("The sum of absolute difference of all layers weight's and bias's is : %s" % finalDiffVal )
</code></pre>For further debugging, uncomment DEBUG := 1 in Makefile.config and rebuild, then start gdb:<pre data-language=><code class="language-shell line-numbers">gdb /path/to/caffe/build/caffe
</code></pre>Once gdb has started, run:<pre data-language=><code class="language-shell line-numbers">run train -solver /path/to/solver.prototxt
</code></pre><h2>12 Examples</h2><h3>12.1 LeNet on MNIST (handwritten digits)</h3><pre data-language=><code class="language-shell line-numbers"># Prepare the dataset
cd $CAFFE_ROOT
./data/mnist/get_mnist.sh # downloads MNIST dataset
./examples/mnist/create_mnist.sh # creates dataset in LMDB format

# Train the model
# Reduce the number of iterations from 10K to 1K to quickly run through this example
sed -i 's/max_iter: 10000/max_iter: 1000/g' examples/mnist/lenet_solver.prototxt
./build/tools/caffe train -solver examples/mnist/lenet_solver.prototxt

# Time the forward and backward propagation
./build/tools/caffe time --model=examples/mnist/lenet_train_test.prototxt -iterations 50 # runs on CPU

# Test the model
# the file with the model should have a 'phase: TEST'
./build/tools/caffe test -model examples/mnist/lenet_train_test.prototxt \
 -weights examples/mnist/lenet_iter_1000.caffemodel -iterations 50
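</code></pre>The -iterations value decides how much of the test set is scored: each iteration processes one batch. Following the num_val_imgs / batch_size recommendation from the solver comments in section 5, the number of iterations needed to cover a test set once can be computed as below. The MNIST test set has 10,000 images; the batch size of 100 is an assumption about the TEST data layer in lenet_train_test.prototxt, and test_iterations is a hypothetical helper.<pre data-language=><code class="language-python line-numbers">def test_iterations(num_images, batch_size):
    """Smallest number of test iterations that covers every image once."""
    return -(-num_images // batch_size)      # ceiling division without floats


print(test_iterations(10000, 100))   # 100
print(test_iterations(10000, 64))    # 157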
</code></pre><h3>12.2 Dogs vs Cats</h3>Download the <a class="no-external-link" href="https://www.kaggle.com/c/dogs-vs-cats/data" target="_blank">Dogs vs Cats Dataset</a> from <a class="no-external-link" href="https://www.kaggle.com/" target="_blank">Kaggle</a>, then unzip <a class="no-external-link" href="https://github.com/RodriguezAndres/Kaggle-dogs-vs-cats/blob/master/dogvscat.sh" target="_blank">dogvscat.zip</a> and run dogvscat.sh.<pre data-language=><code class="language-shell line-numbers">#!/usr/bin/env sh
CAFFE_ROOT=/path/to/caffe
mkdir dogvscat
DOG_VS_CAT_FOLDER=/path/to/dogvscat

cd $DOG_VS_CAT_FOLDER
## Download datasets (requires first a login)
#https://www.kaggle.com/c/dogs-vs-cats/download/train.zip
#https://www.kaggle.com/c/dogs-vs-cats/download/test1.zip

# Unzip train and test data
sudo apt-get -y install unzip
unzip train.zip -d .
unzip test1.zip -d .

# Format data
python create_label_file.py # creates 2 text files with labels for training and validation
./build_datasets.sh # build lmdbs

# Download ImageNet pretrained weights (takes ~20 min)
$CAFFE_ROOT/scripts/download_model_binary.py $CAFFE_ROOT/models/bvlc_reference_caffenet

# Fine-tune weights in the AlexNet architecture (takes ~100 min)
$CAFFE_ROOT/build/tools/caffe train -solver $DOG_VS_CAT_FOLDER/dogvscat_solver.prototxt \
    -weights $CAFFE_ROOT/models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel

# Classify test dataset
cd $DOG_VS_CAT_FOLDER
python convert_binaryproto2npy.py
python dogvscat_classify.py # Returns prediction.txt (takes ~30 min)

# A better approach is to train five AlexNets w/init parameters from the same distribution,
# fine-tune those five, and compute the average of the five networks
</code></pre><h3>12.3 PASCAL VOC Classification</h3>Unzip <a class="no-external-link" href="https://github.com/RodriguezAndres/Pascal-VOC-2012/blob/master/voc2012.zip" target="_blank">voc2012.zip</a> and run voc2012.sh to train AlexNet.<pre data-language=><code class="language-shell line-numbers">#!/usr/bin/env sh

# Copy and unzip voc2012.zip (it contains this file) then run this file. But first
#  change paths in: voc2012.sh; build_datasets.sh; solvers/*; nets/*; classify.py

# As you run various files, you can ignore the following error if it shows up:
#  libdc1394 error: Failed to initialize libdc1394

# set Caffe root directory
CAFFE_ROOT=$CAFFE_ROOT
VOC=/path/to/voc2012

chmod 700 *.sh

# Download datasets
# Details: http://host.robots.ox.ac.uk/pascal/VOC/voc2012/index.html#devkit
if [ ! -f VOCtrainval_11-May-2012.tar ]; then
  wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
fi
# VOCtrainval_11-May-2012.tar contains the VOC folder with:
#  JPEGImages: all jpg images
#  Annotations: objects and corresponding bounding box/pose/truncated/occluded per jpg
#  ImageSets: breaks the images by the type of task they are used for
#  SegmentationClass and SegmentationObject: segmented images (duplicate directories)
tar -xvf VOCtrainval_11-May-2012.tar

# Run Python scripts to create labeled text files
python create_labeled_txt_file.py

# Execute shell script to create training and validation lmdbs
# Note that lmdbs directories w/the same name cannot exist prior to creating them
./build_datasets.sh

# Execute following command to download caffenet pre-trained weights (takes ~20 min)
#  if weights exist already then the command is ignored
$CAFFE_ROOT/scripts/download_model_binary.py $CAFFE_ROOT/models/bvlc_reference_caffenet

# Fine-tune weights in the AlexNet architecture (takes ~60 min)
# you can also choose one of six solvers: pascal_solver[1-6].prototxt
$CAFFE_ROOT/build/tools/caffe train -solver $VOC/solvers/voc2012_solver.prototxt \
  -weights $CAFFE_ROOT/models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel

# The lines below are not really needed; they served as examples on how to do some tasks

# Test against voc2012_val_lmdb dataset (name of lmdb is the model under PHASE: test)
$CAFFE_ROOT/build/tools/caffe test -model $VOC/nets/voc2012_train_val_ft678.prototxt \
   -weights $VOC/weights_iter_5000.caffemodel -iterations 116

# Classify validation dataset: returns a file w/the labels of the val dataset
# but it doesn't report accuracy (that would require some adjusting of the code)
python convert_binaryproto2npy.py
mkdir results
python cls_confidence.py
python average_precision.py
</code></pre>About the PASCAL VOC data: 
- <a class="no-external-link" href="http://host.robots.ox.ac.uk/pascal/VOC/" target="_blank">PASCAL VOC datasets</a> 
- 20 classes 
- Training: 5,717 images, 13,609 objects 
- Validation: 5,823 images, 13,841 objects 
- Testing: 10,991 images<h2>13 Related material</h2><ul>
<li><a class="no-external-link" href="https://github.com/BVLC/caffe/wiki/Model-Zoo" target="_blank">Caffe Model-Zoo</a></li>
<li><a class="no-external-link" href="http://caffe.berkeleyvision.org/" target="_blank">Caffe home page</a></li>
<li>Soumith Chintala, "<a class="no-external-link" href="https://github.com/soumith/convnet-benchmarks/issues/59" target="_blank">Intel are CPU magicians.</a>" Oct. 2015</li>
<li>Dipankar Das, et al., "<a class="no-external-link" href="http://arxiv.org/abs/1602.06709" target="_blank">Distributed Deep Learning Using Synchronous Stochastic Gradient Descent.</a>" Feb. 2016</li>
<li>Jeff Donahue, "<a class="no-external-link" href="http://tutorial.caffe.berkeleyvision.org/caffe-cvpr15-sequences.pdf" target="_blank">Sequences in Caffe.</a>" CVPR Tutorial, June 2015</li>
<li>Andrej Karpathy, "<a class="no-external-link" href="http://vision.stanford.edu/teaching/cs231n/slides/caffe_tutorial.pdf" target="_blank">Caffe Tutorial.</a>" Stanford CS 231n, 2015</li>
<li>Xinlei Chen, "<a class="no-external-link" href="http://graphics.cs.cmu.edu/courses/16-824-S15/16824_2015/7.pptx" target="_blank">Caffe Tutorial.</a>" Carnegie Mellon University 16824, 2015</li>
<li><a class="no-external-link" href="http://places.csail.mit.edu/demo.html" target="_blank">MIT Scene Recognition demo: Pick an image of a scene from an URL or give it your own</a></li>
</ul>
