Source code for the scene classification track of the global AI Challenger competition.
This README also summarizes the problems encountered and pitfalls hit during the competition.
Dataset download - official site: https://challenger.ai/dataset/scene
or:
Baidu Pan link: https://pan.baidu.com/s/1cjR-xhsCq8BD5nH7yQeiIA  password: xfcp
1. Quick Start
1.1 Configure dataset paths
Edit the config.py file:
# coding=utf-8
import os
import platform

os_name = platform.system().lower()


def is_mac():
    return os_name.startswith('darwin')


def is_windows():
    return os_name.startswith('windows')


def is_linux():
    return os_name.startswith('linux')


def parse_weights(weights):
    # Weight files are named 'params/<model>/<epoch>-<val_loss>-<val_acc>.h5'.
    if not weights \
            or not weights.endswith('.h5') \
            or '/' not in weights \
            or '-' not in weights:
        return None
    try:
        weights_info = weights.split(os.path.sep)[-1].replace('.h5', '').split('-')
        if len(weights_info) != 3:
            return None
        epoch = int(weights_info[0])
        val_loss = float(weights_info[1])
        val_acc = float(weights_info[2])
        return epoch, val_loss, val_acc
    except Exception as e:
        raise Exception('Parse weights failure: %s' % str(e))


def CONTEXT(name, **kwargs):
    return {
        'weights': 'params/%s/{epoch:05d}-{val_loss:.4f}-{val_acc:.4f}.h5' % name,
        'summary': 'log/%s' % name,
        'predictor_cache_dir': 'cache/%s' % name,
        'load_imagenet_weights': is_windows(),
        'path_json_dump': 'eval_json/%s/result%s.json' % (
            name, ('_' + kwargs['policy']) if 'policy' in kwargs else ''),
    }
# Dataset image paths
if is_windows():
    PATH_TRAIN_BASE = 'D:/path/to/ai_challenger_scene_train_20170904'
    PATH_VAL_BASE = 'D:/path/to/ai_challenger_scene_validation_20170908'
    PATH_TEST_B = 'D:/path/to/ai_challenger_scene_test_b_20170922/scene_test_b_images_20170922'
elif is_mac():
    PATH_TRAIN_BASE = '/path/to/ai_challenger_scene_train_20170904'
    PATH_VAL_BASE = '/path/to/ai_challenger_scene_validation_20170908'
    PATH_TEST_B = ''
elif is_linux():
    PATH_TRAIN_BASE = ''
    PATH_VAL_BASE = ''
    PATH_TEST_B = ''
else:
    raise Exception('No images configured on %s' % os_name)
PATH_TRAIN_IMAGES = os.path.join(PATH_TRAIN_BASE, 'classes')
PATH_TRAIN_JSON = os.path.join(PATH_TRAIN_BASE, 'scene_train_annotations_20170904.json')
PATH_VAL_IMAGES = os.path.join(PATH_VAL_BASE, 'classes')
PATH_VAL_JSON = os.path.join(PATH_VAL_BASE, 'scene_validation_annotations_20170908.json')
PATH_JSON_DUMP = 'eval_json/resnet.json'

# Training info
IM_SIZE_299 = 299
IM_SIZE_224 = 224
BATCH_SIZE = 32
CLASSES = len(os.listdir(PATH_TRAIN_IMAGES))
EPOCH = 100

if __name__ == '__main__':
    print(PATH_TRAIN_IMAGES)
    print(CONTEXT('test').values())
1.2. Splitting the dataset by class
Edit the split_by_class.py script to sort both the train set and the val set into per-class subfolders.
# coding=utf-8
import numpy as np
import config
import json
import csv
import os

# Source dataset path
PATH_BASE_DIR = config.PATH_TRAIN_BASE
# PATH_BASE_DIR = config.PATH_VAL_BASE
# Output path
PATH_SAVE_DIR = os.path.join(PATH_BASE_DIR, 'classes')
# Name subfolders by class name instead of zero-padded class id
SUB_DIR_WITH_NAME = False
PATH_IMAGES = os.path.join(PATH_BASE_DIR, 'scene_train_images_20170904')
PATH_JSON = os.path.join(PATH_BASE_DIR, 'scene_train_annotations_20170904.json')
# PATH_IMAGES = os.path.join(PATH_BASE_DIR, 'scene_validation_images_20170908')
# PATH_JSON = os.path.join(PATH_BASE_DIR, 'scene_validation_annotations_20170908.json')
PATH_CSV = os.path.join(PATH_BASE_DIR, 'scene_classes.csv')
PRINT = True
# Resample every class towards the mean class size to fight class imbalance
MEAN_HANDLE = False


def output(obj):
    if PRINT:
        if isinstance(obj, (list, tuple)):
            for i in obj:
                print(i)
        else:
            print(obj)


def parse_labels():
    with open(PATH_CSV, encoding='utf-8') as f:
        return [line[1] for line in csv.reader(f)]


def parse_mapping():
    with open(PATH_JSON) as f:
        mapping = json.load(f)
    image2label = {item['image_id']: int(item['label_id']) for item in mapping}
    label2image = {}
    for image, label in image2label.items():
        if label not in label2image:
            label2image[label] = []
        label2image[label].append(image)
    return image2label, label2image


if __name__ == '__main__':
    labels = parse_labels()
    output(labels[:5])
    image2label, label2image = parse_mapping()
    output(label2image[0][:5])
    for label, images in label2image.items():
        label_format = labels[label] if SUB_DIR_WITH_NAME else ('%02d' % label)
        sub_dir = os.path.join(PATH_SAVE_DIR, label_format)
        if not os.path.exists(sub_dir):
            os.makedirs(sub_dir)
        if MEAN_HANDLE:
            target_files_size = len(image2label) // len(label2image)
            if len(images) > target_files_size:
                # Too many: subsample down to the mean class size
                images = np.random.choice(images,
                                          target_files_size,
                                          replace=False).tolist()
            elif len(images) < target_files_size:
                # Too few: duplicate random images up to the mean class size
                added = []
                while len(images) + len(added) < target_files_size:
                    offset = target_files_size - len(images) - len(added)
                    if offset >= len(images):
                        added.extend(images)
                    else:
                        added.extend(np.random.choice(images,
                                                      offset,
                                                      replace=False).tolist())
                images.extend(added)
        for image in images:
            with open(os.path.join(PATH_IMAGES, image), 'rb') as old:
                target_file = os.path.join(sub_dir, image)
                # Duplicates from over-sampling get a mangled file name
                while os.path.exists(target_file):
                    target_file = target_file.replace('.', '_.')
                with open(target_file, 'wb') as new:
                    new.write(old.read())
                output('Write finish %s' % image)
    output('Completed.')
1.3. Model training
The training scripts are:
- classifier_10.py
- classifier_base.py
- classifier_inception_resnet_v2.py
- classifier_inception_v3.py
- classifier_resnet.py
- classifier_vgg16.py
- classifier_vgg19.py
- classifier_xception.py
- classifier_xception_trainable.py
Run any of the classifier_xxx.py training scripts (except classifier_base). They cover classic models such as VGG16/19, Xception, Inception-V3, and Inception-ResNet-V2.
2. Highlights
[1] - Ensembling of multiple single models, with several ensembling strategies to choose from
[2] - Arbitrary combinations of ensembling strategies, with automatic selection of the best one
[3] - Automatic selection of the best weight file when resuming interrupted training
[4] - Supports the VGG16, VGG19, ResNet50, Inception-V3, Xception and Inception-ResNet-V2 models
[5] - The imgaug image augmentation library replaces Keras's built-in image preprocessing
[6] - Multi-process image preprocessing
3. Pitfalls
3.1. Data augmentation matters a lot
Keras's built-in image augmentation is nowhere near enough, so the imgaug image augmentation library was used instead; the variety of transforms it offers is far beyond what Keras can currently match, squeezing as much as possible out of the limited dataset.
Gain: 1-3 percentage points
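As a rough illustration of the approach, a typical imgaug pipeline looks like the sketch below; the specific transforms and parameters here are assumptions, not the exact pipeline used in this repo.

# Minimal imgaug sketch; transforms and parameters are illustrative only.
from imgaug import augmenters as iaa
import numpy as np

seq = iaa.Sequential([
    iaa.Fliplr(0.5),                                    # horizontal flip, p=0.5
    iaa.Crop(percent=(0, 0.1)),                         # random crop up to 10% per side
    iaa.Sometimes(0.5, iaa.GaussianBlur(sigma=(0, 1.0))),
    iaa.Affine(rotate=(-15, 15), scale=(0.9, 1.1)),
], random_order=True)

# images: a batch of uint8 arrays with shape (N, H, W, 3)
images = np.random.randint(0, 255, (8, 224, 224, 3), dtype=np.uint8)
augmented = seq.augment_images(images)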
3.2. Use the CPU as efficiently as possible
Leave training to the GPU. After switching to imgaug for image processing, one epoch took 90+ minutes on a 1050 Ti; profiling showed most of that time was spent on image augmentation, so that part was moved to multiple processes.
Epoch time dropped from ~90 minutes to ~30 minutes
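A minimal sketch of that idea using a multiprocessing pool; the helper name augment_batch and the worker layout are assumptions (workers must be module-level functions so they can be pickled), and the repo's actual pipeline may differ.

# Hypothetical sketch: run imgaug augmentation in worker processes so the
# GPU is not starved waiting for CPU-side preprocessing.
from multiprocessing import Pool
from imgaug import augmenters as iaa
import numpy as np

seq = iaa.Sequential([iaa.Fliplr(0.5), iaa.Affine(rotate=(-15, 15))])

def augment_batch(batch):
    # Each worker augments one batch of uint8 images (N, H, W, 3).
    return seq.augment_images(batch)

if __name__ == '__main__':
    batches = [np.random.randint(0, 255, (32, 224, 224, 3), dtype=np.uint8)
               for _ in range(8)]
    with Pool(processes=4) as pool:
        augmented_batches = pool.map(augment_batch, batches)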
3.3. Normalization matters a lot
First compute the mean and std over the whole training set, then standardize the training inputs with that mean and std (see mean_var_fetcher.py).
Gain: 0.5-1.0 percentage points.
mean_var_fetcher.py:
from PIL import Image
import numpy as np
import config


def get_files(dir):
    # Recursively collect all file paths under dir.
    import os
    if not os.path.exists(dir):
        return []
    if os.path.isfile(dir):
        return [dir]
    result = []
    for subdir in os.listdir(dir):
        sub_path = os.path.join(dir, subdir)
        result += get_files(sub_path)
    return result


r = 0  # sum of R channel
g = 0  # sum of G channel
b = 0  # sum of B channel
r_2 = 0  # sum of R^2
g_2 = 0  # sum of G^2
b_2 = 0  # sum of B^2
total = 0

files = get_files(config.PATH_TRAIN_IMAGES)
count = len(files)
for i, image_file in enumerate(files):
    print('Process: %d/%d' % (i, count))
    img = Image.open(image_file)
    # img = img.resize((299, 299))
    img = np.asarray(img)
    img = img.astype('float32') / 255.
    total += img.shape[0] * img.shape[1]
    r += img[:, :, 0].sum()
    g += img[:, :, 1].sum()
    b += img[:, :, 2].sum()
    r_2 += (img[:, :, 0] ** 2).sum()
    g_2 += (img[:, :, 1] ** 2).sum()
    b_2 += (img[:, :, 2] ** 2).sum()

# Per channel: E[X], and Var[X] = E[X^2] - E[X]^2
r_mean = r / total
g_mean = g / total
b_mean = b / total
r_var = r_2 / total - r_mean ** 2
g_var = g_2 / total - g_mean ** 2
b_var = b_2 / total - b_mean ** 2

print('Mean is %s' % ([r_mean, g_mean, b_mean]))
print('Var is %s' % ([r_var, g_var, b_var]))
# Mean is [0.4960301824223457,
#          0.47806493084428053,
#          0.44767167301470545]
# Var is [0.084966025569294362,
#         0.082005493489533315,
#         0.088877477602068156]
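To use these statistics at training time, each input image is standardized channel-wise with the computed mean and std (std = sqrt(var)). A minimal sketch; the function name is hypothetical, and the repo applies this inside its own data pipeline:

import numpy as np

# Per-channel statistics printed by mean_var_fetcher.py above.
MEAN = np.array([0.49603018, 0.47806493, 0.44767167], dtype='float32')
STD = np.sqrt(np.array([0.08496603, 0.08200549, 0.08887748], dtype='float32'))

def standardize(img):
    # img: float32 array in [0, 1], shape (H, W, 3)
    return (img - MEAN) / STD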
3.4. Don't tie fine-tuning down too tightly
This point is especially important!
Unfreeze too much when fine-tuning and training may take forever, or the machine may simply not be able to handle it;
freeze too much and the fixed weights strangle the model's ability to learn.
The advice: unfreeze as many layers as the machine can bear.
Gain: 2-5 percentage points
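A minimal Keras sketch of the idea, assuming an Xception base; the cutoff index below is an illustrative assumption to tune per machine, not the repo's actual setting.

# Hedged sketch: freeze only the early layers of a pretrained base and
# leave the rest trainable. FREEZE_UP_TO is a hypothetical cutoff.
from keras.applications.xception import Xception
from keras.layers import Dense, GlobalAveragePooling2D
from keras.models import Model

base = Xception(weights='imagenet', include_top=False, input_shape=(299, 299, 3))
x = GlobalAveragePooling2D()(base.output)
out = Dense(80, activation='softmax')(x)  # 80 scene classes in this challenge
model = Model(base.input, out)

FREEZE_UP_TO = 60  # assumed: freeze only the first 60 layers
for layer in model.layers[:FREEZE_UP_TO]:
    layer.trainable = False
for layer in model.layers[FREEZE_UP_TO:]:
    layer.trainable = True
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])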
3.5. Model choice matters a lot
A poor model trained for days and nights may still lose to a strong model trained for just a few epochs.
Switching VGG16 => Xception gained 5-8 percentage points
3.6. When the loss plateaus, try lowering the LR
If the loss won't go down, shrink the learning rate, typically by a factor of about 5 or 10.
Gain: 1-3 percentage points
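Keras can automate this with the ReduceLROnPlateau callback; a minimal sketch, where the monitored metric, factor, and patience are assumptions:

from keras.callbacks import ReduceLROnPlateau

# Divide the LR by 10 whenever val_loss stops improving for 3 epochs.
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.1,
                              patience=3, min_lr=1e-6, verbose=1)
# model.fit(..., callbacks=[reduce_lr])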
3.7. Monitor training with TensorBoard
Use TensorFlow's TensorBoard visualization tool wherever possible; it makes it easy to keep a macro view of the training process.
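With Keras this is one callback. The log directory below follows the 'summary' entry from CONTEXT in config.py; the model name 'xception' is an illustrative assumption.

from keras.callbacks import TensorBoard

# Write training summaries under log/<model-name>, then inspect them with:
#   tensorboard --logdir=log
tensorboard = TensorBoard(log_dir='log/xception')
# model.fit(..., callbacks=[tensorboard])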
3.8. A moderate amount of overfitting is healthy
If the model never overfits during training, consider two causes:
- the model is too simple and lacks capacity to fit; increase network complexity
- the data augmentation is too aggressive, so some features can never be learned
3.9. Model ensembling
When a single model has no headroom left, try ensembling several single models.
Options include majority voting, averaging the predictions, weighting models by their accuracy, and so on, as sketched below.
Gain: 0.5-1.5 percentage points
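A minimal sketch of the averaging and accuracy-weighted variants; the model count and validation accuracies here are illustrative assumptions, not the repo's actual numbers.

import numpy as np

# preds: per-model softmax outputs, each of shape (num_samples, num_classes)
preds = [np.random.rand(4, 80) for _ in range(3)]  # 3 hypothetical models
accs = np.array([0.93, 0.95, 0.94])                # assumed validation accuracies

mean_ensemble = np.mean(preds, axis=0)                        # plain averaging
weighted_ensemble = np.tensordot(accs / accs.sum(), preds, axes=1)  # acc-weighted

final_labels = weighted_ensemble.argmax(axis=1)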
3.10. Test-time augmentation
To make predictions more robust, augment each test image (horizontal flip, random patch crops, etc.), predict on all of these sibling copies, and take the mean of the results as the final prediction.
Gain: 0.25-1.0 percentage points
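A minimal sketch of the horizontal-flip variant; the model and its preprocessing are assumed to exist, and the repo may average over more augmented copies.

import numpy as np

def predict_tta(model, batch):
    # batch: preprocessed images of shape (N, H, W, 3).
    # Average the predictions on the original and the horizontally flipped copy.
    p_orig = model.predict(batch)
    p_flip = model.predict(batch[:, :, ::-1, :])  # flip along the width axis
    return (p_orig + p_flip) / 2.0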