Github项目 - 全球AI挑战之场景分类实现

<Github 项目 - SceneClassyify>

全球AI挑战赛中场景分类 比赛源码.

这里主要包含赛期间遇到的问题,踩的坑等的总结.

数据集下载 - 官网 https://challenger.ai/dataset/scene

或:

百度网盘链接: https://pan.baidu.com/s/1cjR-xhsCq8BD5nH7yQeiIA 密码: xfcp

1. 源码快速使用

1.1 配置数据集路径

修改 config.py 文件:

# coding=utf-8
import os
import platform

os_name = platform.system().lower()

def is_mac():
    return os_name.startswith('darwin')

def is_windows():
    return os_name.startswith('windows')

def is_linux():
    return os_name.startswith('linux')

def parse_weigths(weights):
    if not weights \
            or not weights.endswith('.h5') \
            or not weights.__contains__('/') \
            or not weights.__contains__('-'):
        return None
    try:
        weights_info = weights.split(os.path.sep)[-1].replace('.h5', '').split('-')
        if len(weights_info) != 3:
            return None
        epoch = int(weights_info[0])
        val_loss = float(weights_info[1])
        val_acc = float(weights_info[2])
        return epoch, val_loss, val_acc
    except Exception as e:
        raise Exception('Parse weights failure: %s', str(e))


def CONTEXT(name, **kwargs):
    return {
        'weights': 'params/%s/{epoch:05d}-{val_loss:.4f}-{val_acc:.4f}.h5' % name,
        'summary': 'log/%s' % name,
        'predictor_cache_dir': 'cache/%s' % name,
        'load_imagenet_weights': is_windows(),
        'path_json_dump': 'eval_json/%s/result%s.json' % (
            name, ('_' + kwargs['policy']) if kwargs.__contains__('policy') else ''),
    }

# 数据集图片路径
# image path
if is_windows():
    PATH_TRAIN_BASE = 'D:/path/to/ai_challenger_scene_train_20170904'
    PATH_VAL_BASE = 'D:/path/to/ai_challenger_scene_validation_20170908'
    PATH_TEST_B = 'D:/path/to/ai_challenger_scene_test_b_20170922/scene_test_b_images_20170922'
elif is_mac():
    PATH_TRAIN_BASE = '/path/to/ai_challenger_scene_train_20170904'
    PATH_VAL_BASE = '/path/to/ai_challenger_scene_validation_20170908'
    PATH_TEST_B = ''
elif is_linux():
    PATH_TRAIN_BASE = ''
    PATH_VAL_BASE = ''
    PATH_TEST_B = ''
else:
    raise Exception('No images configured on %s' % os_name)

PATH_TRAIN_IMAGES = os.path.join(PATH_TRAIN_BASE, 'classes')
PATH_TRAIN_JSON = os.path.join(PATH_TRAIN_BASE, 'scene_train_annotations_20170904.json')

PATH_VAL_IMAGES = os.path.join(PATH_VAL_BASE, 'classes')
PATH_VAL_JSON = os.path.join(PATH_VAL_BASE, 'scene_validation_annotations_20170908.json')

PATH_JSON_DUMP = 'eval_json/resnet.json'

# train info
IM_SIZE_299 = 299
IM_SIZE_224 = 224
BATCH_SIZE = 32
CLASSES = len(os.listdir(PATH_TRAIN_IMAGES))
EPOCH = 100

if __name__ == '__main__':
    print(PATH_TRAIN_IMAGES)
    print(CONTEXT('test').values())

1.2. 数据集分类

修改split_by_class.py 脚本文件,分别对 train 数据集和 val 数据集进行按照子文件夹分类.

# coding=utf-8
import numpy as np
import config
import json
import csv
import os

# 源文件路径
PATH_BASE_DIR = config.PATH_TRAIN_BASE
# PATH_BASE_DIR = config.PATH_VAL_BASE

# 保存文件路径
PATH_SAVE_DIR = os.path.join(PATH_BASE_DIR, 'classes')
# 是否按照分类名保存
SUB_DIR_WITH_NAME = False

PATH_IMAGES = os.path.join(PATH_BASE_DIR, 'scene_train_images_20170904')
PATH_JSON = os.path.join(PATH_BASE_DIR, 'scene_train_annotations_20170904.json')

# PATH_IMAGES = os.path.join(PATH_BASE_DIR, 'scene_validation_images_20170908')
# PATH_JSON = os.path.join(PATH_BASE_DIR, 'scene_validation_annotations_20170908.json')
PATH_CSV = os.path.join(PATH_BASE_DIR, 'scene_classes.csv')
PRINT = True
# 均值处理类不均衡问题
MEAN_HANDLE = False


def output(obj):
    if PRINT:
        if isinstance(obj, list) or isinstance(obj, tuple):
            for i in obj:
                print(i)
        else:
            print(obj)


def parse_labels():
    with open(PATH_CSV, encoding='utf-8') as f:
        return [line[1] for line in csv.reader(f)]


def parse_mapping():
    with open(PATH_JSON) as f:
        mapping = json.load(f)
        image2label = {item['image_id']: 
                       int(item['label_id']) for item in mapping}
        label2image = {}
        for image, label in image2label.items():
            if not label2image.__contains__(label):
                label2image[label] = []
            label2image[label].append(image)
        return image2label, label2image


if __name__ == '__main__':
    labels = parse_labels()
    output(labels[:5])

    image2label, label2image = parse_mapping()
    output(label2image[0][:5])

    for label, images in label2image.items():
        label_format = unicode(labels[label], 'utf-8') 
                        if SUB_DIR_WITH_NAME else ('%02d' % label)
        sub_dir = os.path.join(PATH_SAVE_DIR, label_format)
        if not os.path.exists(sub_dir):
            os.makedirs(sub_dir)
        if MEAN_HANDLE:
            target_files_size = len(image2label) // len(label2image)
            if len(images) > target_files_size:
                # 多了抽取
                images = np.random.choice(images, 
                                          target_files_size, 
                                          replace=False).tolist()
            elif len(images) < target_files_size:
                # 少了添加
                added = []
                while len(images) + len(added) < target_files_size:
                    offset = target_files_size - len(images) - len(added)
                    if offset >= len(images):
                        added.extend(images)
                    else:
                        images.extend(np.random.choice(images, 
                                                       offset, 
                                                       replace=False).tolist())
                images.extend(added)
        for image in images:
            with open(os.path.join(PATH_IMAGES, image), 'rb') as old:
                target_file = os.path.join(sub_dir, image)
                while os.path.exists(target_file):
                    target_file = target_file.replace('.', '_.')
                with open(target_file, 'wb') as new:
                    new.write(old.read())
                    output('Write finish % s' % image)
    output('Completed.')

1.3. 模型训练

主要包含的模型训练脚本有:

  • classifier_10.py
  • classifier_base.py
  • classifier_inception_resnet_v2.py
  • classifier_inception_v3.py
  • classifier_resnet.py
  • classifier_vgg16.py
  • classifier_vgg19.py
  • classifier_xception.py
  • classifier_xception_trainable.py

运行任意一个 classifier_xxx.py 训练脚本(classifier_base 除外). 包含了VGG16/19XceptionInception-V3Inception-Resnet-V2等经典模型.

2. 要点概述

[1] - 支持多个单模型进行集成,可选多种集成方式

[2] - 支持多种集成方式间的任意组合和自动择优

[3] - 支持间断训练时权重文件的择优选择

[4] - 支持VGG16VGG19Resnet50Inception-V3XceptionInception-Resnet-V3模型

[5] - imgaug 图片数据增强库替换Keras自带的图片预处理

[6] - 支持多进程进行图片预处理

3. 踩坑

3.1. 数据增强很重要

Keras 自带的图片增强远远不够的,这里选择了 imgaug图片数据增强库,直接上图,这种效果是目前的Keras望尘莫及的,尽可能最大限度利用当前有限的数据集.

提高1~3个百分点

3.2. 尽可能高效使用CPU

训练任务交给GPU去做,新添加的 imgaug 图片处理方式之后,一个Epoch在1050Ti上耗时90mins+,排查发现大部分时间都在进行图片数据增强处理,于是将该部分的处理替换为多进程方式.

时间从90mins降到30mins左右

3.3. 标准化很重要

先计算出整体训练集的mean和std,然后在训练阶段的输入数据以mean和std进行高斯化处理(参mean_var_fetcher.py

提高0.5~1.0个百分点.

mean_var_fetcher.py

from PIL import Image
import numpy as np
import config


def get_files(dir):
    import os
    if not os.path.exists(dir):
        return []
    if os.path.isfile(dir):
        return [dir]
    result = []
    for subdir in os.listdir(dir):
        sub_path = os.path.join(dir, subdir)
        result += get_files(sub_path)
    return result


r = 0  # r mean
g = 0  # g mean
b = 0  # b mean

r_2 = 0  # r^2
g_2 = 0  # g^2
b_2 = 0  # b^2

total = 0

files = get_files(config.PATH_TRAIN_IMAGES)
count = len(files)

for i, image_file in enumerate(files):
    print('Process: %d/%d' % (i, count))
    img = Image.open(image_file)
    # img = img.resize((299, 299))
    img = np.asarray(img)
    img = img.astype('float32') / 255.
    total += img.shape[0] * img.shape[1]

    r += img[:, :, 0].sum()
    g += img[:, :, 1].sum()
    b += img[:, :, 2].sum()

    r_2 += (img[:, :, 0] ** 2).sum()
    g_2 += (img[:, :, 1] ** 2).sum()
    b_2 += (img[:, :, 2] ** 2).sum()

r_mean = r / total
g_mean = g / total
b_mean = b / total

r_var = r_2 / total - r_mean ** 2
g_var = g_2 / total - g_mean ** 2
b_var = b_2 / total - b_mean ** 2

print('Mean is %s' % ([r_mean, g_mean, b_mean]))
print('Var is %s' % ([r_var, g_var, b_var]))
# Mean is [0.4960301824223457, 
#          0.47806493084428053, 
#          0.44767167301470545]
# Var is [0.084966025569294362, 
#         0.082005493489533315, 
#         0.088877477602068156]

3.4. Fine-tune别绑太紧

这点尤为重要!

Fine-tune时松太开,可能导致训练耗时,也可能导致机器带不动;

绑太紧可能导致Fixed的权重参数扼制了模型的学习能力.

建议是在机器能扛得住的基础下,尽可能松绑多一些.

提高2~5个百分点

3.5. 模型选择很重要

糟糕的模型训练几天几夜,可能赶不上优势模型训练几个epoch.

VGG16=>Xception提高5~8个百分点

3.6. Loss降不下去时尝试调低LR

降不下去就调小,调下的幅度一般是5倍、10倍左右.

提高1~3个百分点

3.7. TensorbBoard监视训练状态

尽可能使用Tensorflow提供的Tensorboard可视化工具,方便从宏观把控训练过程.

3.8. 适度过拟合是良性的

训练过程中一直没有过拟合,要从两方面考虑:

  • 模型太简单,拟合能力不足,这时要考虑增强网络复杂度
  • 数据增强程度太大,学不到某些特征

3.9. 模型集成

单模型没有什么提升空间时,要尝试将多个单模型进行集成.

集成的方式可以选择投票法、均值法、按照模型Acc加权法等等.

提高0.5~1.5个百分点

3.10. 预测数据增强

为了确保预测结果的准确性,可以将待预测结果进行水平翻转(或随机裁取patch等)处理,将这多张孪生图片进行预测,最终结果取多个结果的均值.

提高0.25~1.0个百分点

Last modification:January 7th, 2019 at 09:56 pm

Leave a Comment