In OpenCV 4.x (and in versions after OpenCV 3.4.1), the cv2.dnn.readNetFromTensorflow(pbmodel, pbtxt) function can be used to directly load object detection models trained with TensorFlow.
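For instance, a minimal loading sketch (the file names below are placeholders for a frozen graph and its matching text graph, not files shipped with this article):

import cv2

# Load a frozen TensorFlow detection graph plus its text graph description.
net = cv2.dnn.readNetFromTensorflow('frozen_inference_graph.pb', 'graph.pbtxt')
print(net.getLayerNames()[:5])  # quick sanity check that the network was parsed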

1. TensorFlow Detection Model Zoo

Pre-trained TensorFlow object detection models:

Tensorflow Detection Model Zoo

1.1. Models Trained on the COCO Dataset

Model name | Speed (ms) | COCO mAP | Outputs
ssd_mobilenet_v1_coco | 30 | 21 | Boxes
ssd_mobilenet_v1_0.75_depth_coco ☆ | 26 | 18 | Boxes
ssd_mobilenet_v1_quantized_coco ☆ | 29 | 18 | Boxes
ssd_mobilenet_v1_0.75_depth_quantized_coco ☆ | 29 | 16 | Boxes
ssd_mobilenet_v1_ppn_coco ☆ | 26 | 20 | Boxes
ssd_mobilenet_v1_fpn_coco ☆ | 56 | 32 | Boxes
ssd_resnet_50_fpn_coco ☆ | 76 | 35 | Boxes
ssd_mobilenet_v2_coco | 31 | 22 | Boxes
ssd_mobilenet_v2_quantized_coco | 29 | 22 | Boxes
ssdlite_mobilenet_v2_coco | 27 | 22 | Boxes
ssd_inception_v2_coco | 42 | 24 | Boxes
faster_rcnn_inception_v2_coco | 58 | 28 | Boxes
faster_rcnn_resnet50_coco | 89 | 30 | Boxes
faster_rcnn_resnet50_lowproposals_coco | 64 | | Boxes
rfcn_resnet101_coco | 92 | 30 | Boxes
faster_rcnn_resnet101_coco | 106 | 32 | Boxes
faster_rcnn_resnet101_lowproposals_coco | 82 | | Boxes
faster_rcnn_inception_resnet_v2_atrous_coco | 620 | 37 | Boxes
faster_rcnn_inception_resnet_v2_atrous_lowproposals_coco | 241 | | Boxes
faster_rcnn_nas | 1833 | 43 | Boxes
faster_rcnn_nas_lowproposals_coco | 540 | | Boxes
mask_rcnn_inception_resnet_v2_atrous_coco | 771 | 36 | Masks
mask_rcnn_inception_v2_coco | 79 | 25 | Masks
mask_rcnn_resnet101_atrous_coco | 470 | 33 | Masks
mask_rcnn_resnet50_atrous_coco | 343 | 29 | Masks

Notes:

[1] - A star (☆) means the model supports TPU training.

[2] - Downloading and extracting a quantized model's .tar.gz file yields several files, including checkpoint files, a config file, and tflite frozen graphs (txt/binary).
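As an aside, the contents of such an archive can be inspected from Python before extraction; a minimal sketch, where the archive file name is a placeholder for whichever model was downloaded:

import tarfile

archive = 'ssd_mobilenet_v1_quantized_coco.tar.gz'  # placeholder file name
with tarfile.open(archive, 'r:gz') as tar:
    for member in tar.getmembers():
        print(member.name)  # checkpoint files, pipeline.config, frozen graphs, ...
    tar.extractall(path='.')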

1.2. Models Trained on the Kitti Dataset

Model name | Speed (ms) | Pascal mAP@0.5 | Outputs
faster_rcnn_resnet101_kitti | 79 | 87 | Boxes

1.3. Models Trained on the Open Images Dataset

Model name | Speed (ms) | Open Images mAP@0.5 | Outputs
faster_rcnn_inception_resnet_v2_atrous_oidv2 | 727 | 37 | Boxes
faster_rcnn_inception_resnet_v2_atrous_lowproposals_oidv2 | 347 | | Boxes
facessd_mobilenet_v2_quantized_open_image_v4 | 20 | 73 (faces) | Boxes
Model name | Speed (ms) | Open Images mAP@0.5 | Outputs
faster_rcnn_inception_resnet_v2_atrous_oidv4 | 425 | 54 | Boxes
ssd_mobilenetv2_oidv4 | 89 | 36 | Boxes
ssd_resnet_101_fpn_oidv4 | 237 | 38 | Boxes

1.4. Models Trained on the iNaturalist Species Dataset

Model name | Speed (ms) | Pascal mAP@0.5 | Outputs
faster_rcnn_resnet101_fgvc | 395 | 58 | Boxes
faster_rcnn_resnet50_fgvc | 366 | 55 | Boxes

1.5. Models Trained on AVA v2.1

Model name | Speed (ms) | Pascal mAP@0.5 | Outputs
faster_rcnn_resnet101_ava_v2.1 | 93 | 11 | Boxes

1.6. TensorFlow Object Detection API - SSD Example

In TensorFlow, a deep network is represented as a graph, where each node is a transformation of its inputs. A node can be a common network layer, such as the Convolution or MaxPooling layers implemented in C++, or a custom layer built in Python from TensorFlow operations.
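As a toy illustration (not code from the Object Detection API; the node names and shapes here are made up), the snippet below builds a small graph from raw TensorFlow 1.x operations, showing that each operation becomes a named node:

import tensorflow as tf  # TensorFlow 1.x style, matching the example below

graph = tf.Graph()
with graph.as_default():
    # 'image_tensor' mimics a typical detection-model input node.
    x = tf.placeholder(tf.float32, [None, 300, 300, 3], name='image_tensor')
    w = tf.Variable(tf.random_normal([3, 3, 3, 8]), name='conv1_weights')
    conv = tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding='SAME', name='conv1')
    pool = tf.nn.max_pool(conv, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1],
                          padding='SAME', name='pool1')

# Every operation shows up as a node in the serialized GraphDef.
print([node.name for node in graph.as_graph_def().node])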

The TensorFlow Object Detection API is a framework for building deep networks for object detection.

A model trained with TensorFlow is a binary file with the .pb suffix, which stores both the topology of the trained network and the model weights.

The following uses the ssd_mobilenet_v2_coco_2018_03_29 pre-trained model (a MobileNet-SSD model trained on the COCO dataset) as an example:

#!/usr/bin/python3
# -*- coding: utf-8 -*-
import os
import numpy as np
import cv2
import matplotlib.pyplot as plt
import tensorflow as tf


model_path = "/path/to/ssd_mobilenet_v2_coco_2018_03_29"
frozen_pb_file = os.path.join(model_path, 'frozen_inference_graph.pb')

score_threshold = 0.3

img_file = 'test.jpg'

# Read the frozen graph (TensorFlow 1.x API; under TensorFlow 2.x use tf.compat.v1).
with tf.gfile.FastGFile(frozen_pb_file, 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())


with tf.Session() as sess:
    # Restore session
    sess.graph.as_default()
    tf.import_graph_def(graph_def, name='')

    # Read and preprocess an image.
    img_cv2 = cv2.imread(img_file)
    img_height, img_width, _ = img_cv2.shape

    img_in = cv2.resize(img_cv2, (300, 300))
    img_in = img_in[:, :, [2, 1, 0]]  # BGR2RGB

    # Run the model
    outputs = sess.run([sess.graph.get_tensor_by_name('num_detections:0'),
                    sess.graph.get_tensor_by_name('detection_scores:0'),
                    sess.graph.get_tensor_by_name('detection_boxes:0'),
                    sess.graph.get_tensor_by_name('detection_classes:0')],
                   feed_dict={
                       'image_tensor:0': img_in.reshape(1, 
                                                        img_in.shape[0], 
                                                        img_in.shape[1], 
                                                        3)})

    # Visualize detected bounding boxes.
    num_detections = int(outputs[0][0])
    for i in range(num_detections):
        classId = int(outputs[3][0][i])
        score = float(outputs[1][0][i])
        bbox = [float(v) for v in outputs[2][0][i]]
        if score > score_threshold:
            x = bbox[1] * img_width
            y = bbox[0] * img_height
            right = bbox[3] * img_width
            bottom = bbox[2] * img_height
            cv2.rectangle(img_cv2, 
                          (int(x), int(y)), 
                          (int(right), int(bottom)), 
                          (125, 255, 51), 
                          thickness=2)

plt.figure(figsize=(10, 8))
plt.imshow(img_cv2[:, :, ::-1])
plt.title("TensorFlow MobileNetV2-SSD")
plt.axis("off")
plt.show()
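The script above draws only bounding boxes; outputs[3] holds the model's class IDs, which can be mapped to readable names inside the detection loop. A minimal sketch of the extra lines, assuming a hypothetical coco_labels.txt file with one class name per line, ordered by the model's class IDs (this file is not part of the article):

# Hypothetical labels file: line i holds the name for class ID i + 1.
with open('coco_labels.txt') as f:
    labels = [line.strip() for line in f]

# Inside the loop, after drawing the rectangle:
label = '%s: %.2f' % (labels[classId - 1], score)
cv2.putText(img_cv2, label, (int(x), int(y) - 5),
            cv2.FONT_HERSHEY_SIMPLEX, 0.5, (125, 255, 51))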

The detection result looks like:

[Image: TensorFlow MobileNetV2-SSD detection result]

https://github.com/opencv/opencv/wiki/TensorFlow-Object-Detection-API

2. Converting TensorFlow Detection Models to a DNN-loadable Format

When the OpenCV DNN module loads an object detection model trained with TensorFlow, it needs an additional configuration file, which is essentially a text-format version of the same graph that is serialized in protocol buffers (protobuf) format.

2.1. Detection Models Already Loadable by DNN

The TensorFlow object detection models and configuration files already provided for OpenCV are:

Model | Version | Weights | Config
MobileNet-SSD v1 | 2017_11_17 | weights | config
MobileNet-SSD v1 PPN | 2018_07_03 | weights | config
MobileNet-SSD v2 | 2018_03_29 | weights | config
Inception-SSD v2 | 2017_11_17 | weights | config
Faster-RCNN Inception v2 | 2018_01_28 | weights | config
Faster-RCNN ResNet-50 | 2018_01_28 | weights | config
Mask-RCNN Inception v2 | 2018_01_28 | weights | config

2.2. Converting Common Detection Models

The conversion scripts for the three types of TensorFlow object detection models are:

[1] - tf_text_graph_ssd.py

[2] - tf_text_graph_faster_rcnn.py

[3] - tf_text_graph_mask_rcnn.py

From: https://github.com/opencv/opencv/tree/master/samples/dnn

Input arguments of the conversion scripts:

[1] - --input: path to the TensorFlow frozen graph file.

[2] - --config: path to the *.config file used when training the TensorFlow model.

Note: for the TensorFlow *.config configuration files, see: configuration file.

Output argument of the conversion scripts:

[1] - --output: path of the output text graph file.

For example:

Faster R-CNN models:

python3 tf_text_graph_faster_rcnn.py \
    --input '/path/to/faster_rcnn_resnet50_coco_2018_01_28/frozen_inference_graph.pb' \
    --config '/path/to/faster_rcnn_resnet50_coco.config' \
    --output '/path/to/faster_rcnn_resnet50_coco_2018_01_28/graph.pbtxt'

SSD models:

python3 tf_text_graph_ssd.py \
    --input /path/to/ssd_inception_v2_coco_2018_01_28/frozen_inference_graph.pb \
    --config /path/to/ssd_inception_v2_coco.config \
    --output /path/to/ssd_inception_v2_coco_2018_01_28/graph.pbtxt

Mask R-CNN models:

python3 tf_text_graph_mask_rcnn.py \
    --input '/path/to/mask_rcnn_resnet50_atrous_coco_2018_01_28/frozen_inference_graph.pb' \
    --config '/path/to/mask_rcnn_resnet50_atrous_coco.config' \
    --output '/path/to/mask_rcnn_resnet50_atrous_coco_2018_01_28/graph.pbtxt' 

The generated graph.pbtxt can be visualized with the Netron tool.

GitHub project - Netron, a deep-network visualization tool - AIUAI
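Netron can also be launched from Python (a minimal sketch, assuming the netron pip package is installed; the path below is a placeholder):

import netron

# Serves a local web page that visualizes the text graph.
netron.start('/path/to/ssd_mobilenet_v2_coco_2018_03_29/graph.pbtxt')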

2.3. DNN Object Detection - SSD Example

As in the TensorFlow Object Detection API - SSD example above, this section runs the same SSD detection test with the OpenCV DNN module.

[1] - First, convert the model, e.g.:

python3 tf_text_graph_ssd.py \
    --input '/path/to/ssd_mobilenet_v2_coco_2018_03_29/frozen_inference_graph.pb' \
    --config '/path/to/ssd_mobilenet_v2_coco.config' \
    --output '/path/to/ssd_mobilenet_v2_coco_2018_03_29/graph.pbtxt' 

The terminal output of the conversion looks like:

Scale: [0.200000-0.950000]
Aspect ratios: [1.0, 2.0, 0.5, 3.0, 0.3333]
Reduce boxes in the lowest layer: True
Number of classes: 90
Number of layers: 6
box predictor: convolutional
Input image size: 300x300

[2] - Then, test the detection model:

#!/usr/bin/python3
# -*- coding: utf-8 -*-
import cv2
import matplotlib.pyplot as plt


pb_file = '/path/to/ssd_mobilenet_v2_coco_2018_03_29/frozen_inference_graph.pb'
pbtxt_file = '/path/to/ssd_mobilenet_v2_coco_2018_03_29/graph.pbtxt'
net = cv2.dnn.readNetFromTensorflow(pb_file, pbtxt_file)

score_threshold = 0.3

img_file = 'test.jpg'

img_cv2 = cv2.imread(img_file)
height, width, _ = img_cv2.shape
net.setInput(cv2.dnn.blobFromImage(img_cv2, 
                                   size=(300, 300), 
                                   swapRB=True, 
                                   crop=False))

out = net.forward()
# out has shape [1, 1, N, 7]; each detection is
# [batchId, classId, score, left, top, right, bottom] with normalized coordinates.
print(out)

for detection in out[0, 0, :, :]:
    score = float(detection[2])
    if score > score_threshold:
        left = detection[3] * width
        top = detection[4] * height
        right = detection[5] * width
        bottom = detection[6] * height
        cv2.rectangle(img_cv2, 
                      (int(left), int(top)), 
                      (int(right), int(bottom)), 
                      (23, 230, 210), 
                      thickness=2)

t, _ = net.getPerfProfile()
label = 'Inference time: %.2f ms' % \
            (t * 1000.0 / cv2.getTickFrequency())
cv2.putText(img_cv2, label, (0, 15), 
            cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255))


plt.figure(figsize=(10, 8))
plt.imshow(img_cv2[:, :, ::-1])
plt.title("OpenCV DNN MobileNetV2-SSD")
plt.axis("off")
plt.show()
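An optional addition that is not part of the script above: OpenCV DNN lets you pick the inference backend and target before calling net.forward(); the values shown here are the stock CPU defaults.

# Explicitly select the default OpenCV backend and CPU target; other targets
# (e.g. cv2.dnn.DNN_TARGET_OPENCL) require matching hardware and build options.
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)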

The detection result looks like:

[Image: OpenCV DNN MobileNetV2-SSD detection result]

