EfficientNet is a new model scaling method. It improves top-1 accuracy over the previous best model, GPipe, by 0.1%, while being much smaller and faster: both the parameter count and the FLOPS are greatly reduced, for roughly a 10x gain in efficiency.

1. EfficientNet Overview

EfficientNet is described here in two parts: the base model and the building block. As is well known, the building block of the classic ResNet models is the identity-and-convolution block (an identity mapping plus convolutions).

EfficientNet's main building block is MBConv (the mobile inverted bottleneck), first introduced in MobileNetV2. Shortcut connections are used directly between the bottlenecks, which have far fewer channels than the expansion layers; combined with depthwise separable convolutions, this reduces computation by almost a factor of k^2 compared with traditional layers, where k is the kernel size, i.e. the height and width of the 2D convolution window.

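To see where the roughly k^2 saving comes from, the following is a minimal PyTorch sketch (the channel sizes are illustrative, not EfficientNet's actual layer configuration) comparing a standard convolution with a depthwise separable one:

import torch
from torch import nn

k, c_in, c_out = 3, 128, 128  # kernel size and channel counts (illustrative)

# Standard convolution: c_in * c_out * k * k weights.
standard = nn.Conv2d(c_in, c_out, kernel_size=k, padding=k // 2, bias=False)

# Depthwise separable convolution: a per-channel k x k depthwise conv
# followed by a 1 x 1 pointwise conv.
depthwise_separable = nn.Sequential(
    nn.Conv2d(c_in, c_in, kernel_size=k, padding=k // 2, groups=c_in, bias=False),
    nn.Conv2d(c_in, c_out, kernel_size=1, bias=False),
)

def count_params(m):
    return sum(p.numel() for p in m.parameters())

print(count_params(standard))             # 147456
print(count_params(depthwise_separable))  # 17536

x = torch.randn(1, c_in, 56, 56)
print(standard(x).shape, depthwise_separable(x).shape)  # same output shape

With k = 3 the separable version uses roughly 8.4x fewer weights, close to the k^2 = 9 factor mentioned above (the exact ratio is 1 / (1/c_out + 1/k^2)).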

EfficientNet also adds squeeze-and-excitation (SE) optimization, which helps improve performance further.
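
An SE block globally pools each feature map to a single value and passes the result through a small bottleneck to produce per-channel gating weights. A minimal PyTorch sketch (the reduction ratio of 4 and the activation choices are illustrative, not EfficientNet's exact configuration):

import torch
from torch import nn

class SqueezeExcite(nn.Module):
    # Squeeze: global average pool; Excite: bottleneck MLP with a sigmoid gate.
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # (N, C, H, W) -> (N, C, 1, 1)
        self.fc = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.SiLU(),  # Swish, the activation used throughout EfficientNet
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return x * self.fc(self.pool(x))  # channel-wise re-weighting

x = torch.randn(2, 32, 28, 28)
print(SqueezeExcite(32)(x).shape)  # torch.Size([2, 32, 28, 28])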

In addition, another advantage of EfficientNet is that it carefully balances network depth, width, and resolution, so the network can be scaled more effectively and achieve better performance.
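
Concretely, the paper scales all three dimensions with one compound coefficient φ: depth by α^φ, width by β^φ, and resolution by γ^φ, with α·β²·γ² ≈ 2 (the grid-searched values for the B0 baseline are α ≈ 1.2, β ≈ 1.1, γ ≈ 1.15). A small sketch of the idea follows; the rounding here is simplified and does not reproduce the official B1-B7 configurations:

# Compound scaling sketch: one coefficient phi scales depth, width, and resolution together.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # coefficients reported for the B0 baseline

def compound_scale(base_depth, base_width, base_resolution, phi):
    depth = round(base_depth * ALPHA ** phi)            # layers per stage
    width = round(base_width * BETA ** phi)             # channels
    resolution = round(base_resolution * GAMMA ** phi)  # input image size
    return depth, width, resolution

for phi in range(4):
    print(phi, compound_scale(base_depth=3, base_width=32, base_resolution=224, phi=phi))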

As the figure shows, from the smallest EfficientNet B0 to the largest B7, accuracy increases steadily while the models remain relatively small.

Compared with other models of the same ImageNet accuracy, EfficientNet has far fewer parameters. For example, the Keras implementation of ResNet50 has 23,534,592 parameters in total, yet its accuracy is lower than that of the smallest EfficientNet, B0, which has only 5,330,564 parameters.
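
These counts are easy to check; a quick sketch using torchvision's ResNet50 and the efficientnet_pytorch package (the exact totals differ slightly between the Keras and PyTorch implementations):

from torchvision.models import resnet50
from efficientnet_pytorch import EfficientNet

def count_params(model):
    # Total number of parameters, trainable or not.
    return sum(p.numel() for p in model.parameters())

print('ResNet50:       ', count_params(resnet50()))                                  # ~25.6M in torchvision
print('EfficientNet-B0:', count_params(EfficientNet.from_name('efficientnet-b0')))   # ~5.3M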

2. PyTorch - EfficientNet

Github - lukemelas/EfficientNet-PyTorch

2.1. Installation

Install via pip:

pip install efficientnet_pytorch

Or install from source:

git clone https://github.com/lukemelas/EfficientNet-PyTorch
cd EfficientNet-PyTorch
pip install -e .

2.2. Loading

Loading EfficientNet:

from efficientnet_pytorch import EfficientNet

# Load the network architecture without pretrained weights
model = EfficientNet.from_name('efficientnet-b0')
print(model)

# Load the network architecture with pretrained weights
model = EfficientNet.from_pretrained('efficientnet-b0')
print(model)

Accuracy of the provided pretrained weights:

Name              # Params   Top-1 Acc.   Pretrained?
efficientnet-b0   5.3M       76.3         ✓
efficientnet-b1   7.8M       78.8         ✓
efficientnet-b2   9.2M       79.8         ✓
efficientnet-b3   12M        81.1         ✓
efficientnet-b4   19M        82.6         ✓
efficientnet-b5   30M        83.3         ✓
efficientnet-b6   43M        84.0         ✓
efficientnet-b7   66M        84.4         ✓
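
Each variant also expects a different input resolution (224 for B0 up to 600 for B7). Assuming the installed version of the package exposes the EfficientNet.get_image_size helper, the expected sizes can be listed as follows:

from efficientnet_pytorch import EfficientNet

for i in range(8):
    name = 'efficientnet-b{}'.format(i)
    # Expected input resolution: 224, 240, 260, 300, 380, 456, 528, 600
    print(name, EfficientNet.get_image_size(name))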

2.3. Modifying the Output Layer

Modify the network's output layer:

Original:

(_conv_head): Conv2dStaticSamePadding(
  320, 1280, kernel_size=(1, 1), stride=(1, 1), bias=False
  (static_padding): Identity()
)
(_bn1): BatchNorm2d(1280, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True)
(_avg_pooling): AdaptiveAvgPool2d(output_size=1)
(_dropout): Dropout(p=0.2, inplace=False)
(_fc): Linear(in_features=1280, out_features=1000, bias=True)
(_swish): MemoryEfficientSwish()
)

Modified:

from torch import nn
from efficientnet_pytorch import EfficientNet

model = EfficientNet.from_pretrained('efficientnet-b0')
in_features = model._fc.in_features
# Replace the classification head with a 20-class output layer.
# (Simply setting model._fc.out_features = 20 only changes the attribute;
#  it does not re-create the layer's weights.)
model._fc = nn.Linear(in_features=in_features, out_features=20, bias=True)
print(model)

After modification:

(_conv_head): Conv2dStaticSamePadding(
  320, 1280, kernel_size=(1, 1), stride=(1, 1), bias=False
  (static_padding): Identity()
)
(_bn1): BatchNorm2d(1280, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True)
(_avg_pooling): AdaptiveAvgPool2d(output_size=1)
(_dropout): Dropout(p=0.2, inplace=False)
(_fc): Linear(in_features=1280, out_features=20, bias=True)
(_swish): MemoryEfficientSwish()
)
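
When only the replaced head should be trained (for example, quick fine-tuning on a small dataset), a common pattern is to freeze every other parameter; the following is a minimal sketch rather than an API provided by the package:

import torch
from torch import nn
from efficientnet_pytorch import EfficientNet

model = EfficientNet.from_pretrained('efficientnet-b0')
model._fc = nn.Linear(model._fc.in_features, 20)

# Freeze everything, then re-enable gradients for the new head only.
for param in model.parameters():
    param.requires_grad = False
for param in model._fc.parameters():
    param.requires_grad = True

# Pass only the trainable parameters to the optimizer.
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3)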

2.4. Prediction

#!/usr/bin/python3
#!--*-- coding:utf-8 --*--
import json
from PIL import Image
import torch
from torchvision import transforms

from efficientnet_pytorch import EfficientNet

model = EfficientNet.from_pretrained('efficientnet-b0')

# Image preprocessing
tfms = transforms.Compose([
    transforms.Resize(224),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
img = tfms(Image.open('test.jpg')).unsqueeze(0)
print(img.shape)  # torch.Size([1, 3, 224, 224])

# Load the ImageNet class names
labels_map = json.load(open('labels_map.txt'))
labels_map = [labels_map[str(i)] for i in range(1000)]

# Classification
model.eval()
with torch.no_grad():
    outputs = model(img)

# Top-5 predictions
print('-----')
for idx in torch.topk(outputs, k=5).indices.squeeze(0).tolist():
    prob = torch.softmax(outputs, dim=1)[0, idx].item()
    print('{label:<75} ({p:.2f}%)'.format(label=labels_map[idx], p=prob*100))

2.5. Feature Extraction

from PIL import Image
from torchvision import transforms
from efficientnet_pytorch import EfficientNet

model = EfficientNet.from_pretrained('efficientnet-b0')

# Image preprocessing
tfms = transforms.Compose([
    transforms.Resize(224),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
img = tfms(Image.open('test.jpg')).unsqueeze(0)
print(img.shape)  # torch.Size([1, 3, 224, 224])

# Extract convolutional features (before pooling and the classifier)
features = model.extract_features(img)
print(features.shape)  # torch.Size([1, 1280, 7, 7])

2.6. Fine-tuning

(multi-label scenario)

#!/usr/bin/python3
#!--*-- coding:utf-8 --*--
import os
import sys
from time import time

import pandas as pd
from PIL import Image
import matplotlib.pyplot as plt

import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils import data
from torch.autograd import Variable
import torchvision
import torchvision.transforms as transforms
from torchsummary import summary  # pip install torchsummary

from efficientnet_pytorch import EfficientNet


# Dataset / DataLoader
class Dataset(data.Dataset):
    def __init__(self, csv_path, images_path, transform=None):
        # Read the CSV and create the dataframe
        self.train_set = pd.read_csv(csv_path)
        self.train_path = images_path  # Images path
        self.transform = transform     # Augmentation transforms

    def __len__(self):
        return len(self.train_set)

    def __getitem__(self, idx):
        file_name = self.train_set.iloc[idx][0] + '.png'
        label = self.train_set.iloc[idx][1]
        img = Image.open(os.path.join(self.train_path, file_name))  # Load image
        if self.transform is not None:
            img = self.transform(img)
        return img, label


transform_train = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomApply([
        torchvision.transforms.RandomRotation(10),
        transforms.RandomHorizontalFlip()], 0.7),
    transforms.ToTensor()])

training_set = Dataset('/path/to/train.csv',
                       '/path/to/train_images/',
                       transform=transform_train)

params = {'batch_size': 16,
          'shuffle': True}
training_generator = data.DataLoader(training_set, **params)

use_cuda = torch.cuda.is_available()
device = torch.device("cuda:0" if use_cuda else "cpu")
print(device)

outputfolder = '/path/to/train_outputs/'
if not os.path.exists(outputfolder):
    os.mkdir(outputfolder)

# Class names (not defined in the original snippet - fill in for the dataset at hand)
classes = ['class_{}'.format(i) for i in range(20)]

# Model
model = EfficientNet.from_pretrained('efficientnet-b0', num_classes=20)
model.to(device)
summary(model, input_size=(3, 512, 512))

criterion = nn.CrossEntropyLoss()
learning_rate = 1e-3
lr_decay = 0.99
optimizer = optim.Adam(model.parameters(), lr=learning_rate)

# Training
epochs = 100
history_accuracy = []
history_loss = []
for epoch in range(epochs):
    running_loss = 0.0
    correct = 0
    total = 0
    class_correct = list(0. for _ in classes)
    class_total = list(0. for _ in classes)
    for i, batch in enumerate(training_generator, 0):
        inputs, labels = batch  # labels are assumed to be one-hot / multi-label vectors
        t0 = time()
        inputs, labels = inputs.to(device), labels.to(device)
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, torch.max(labels, 1)[1])
        _, predicted = torch.max(outputs, 1)
        _, labels = torch.max(labels, 1)
        c = (predicted == labels.data).squeeze()
        correct += (predicted == labels).sum().item()
        total += labels.size(0)
        accuracy = float(correct) / float(total)
        history_accuracy.append(accuracy)
        history_loss.append(loss.item())
        loss.backward()
        optimizer.step()
        for j in range(labels.size(0)):
            label = labels[j]
            class_correct[label] += c[j].item()
            class_total[label] += 1
        running_loss += loss.item()
        print("Epoch : ", epoch + 1, " Batch : ", i + 1,
              " Loss : ", running_loss / (i + 1),
              " Accuracy : ", accuracy,
              " Time ", round(time() - t0, 2), "s")
    for k in range(len(classes)):
        if class_total[k] != 0:
            print('Accuracy of %5s : %2d %%' % (classes[k], 100 * class_correct[k] / class_total[k]))
    print('[%d epoch] Accuracy of the network on the training images: %d %%' % (epoch + 1, 100 * correct / total))
    if epoch % 10 == 0 or epoch == 0:
        torch.save(model.state_dict(),
                   os.path.join(outputfolder, str(epoch + 1) + '_' + str(round(accuracy, 4)) + '.pth'))

# Save the final model
torch.save(model.state_dict(),
           os.path.join(outputfolder, 'Last_epoch' + str(round(accuracy, 4)) + '.pth'))

# Plot the training curves
plt.plot(history_accuracy)
plt.plot(history_loss)

# Inference with a saved checkpoint
model.load_state_dict(torch.load(os.path.join(outputfolder, '50_0.9246.pth')))
model.eval()

test_transforms = transforms.Compose([
    transforms.Resize(512),
    transforms.ToTensor(),
])

def predict_image(image):
    image_tensor = test_transforms(image)
    image_tensor = image_tensor.unsqueeze_(0)
    input = Variable(image_tensor)
    input = input.to(device)
    output = model(input)
    index = output.data.cpu().numpy().argmax()
    return index

img = Image.open('test.jpg')
prediction = predict_image(img)

3. Keras - EfficientNet

Github - qubvel/efficientnet

3.1. Installation

Dependencies:

  • Keras >= 2.2.0 / TensorFlow >= 1.12.0
  • keras_applications >= 1.0.7
  • scikit-image

Install via pip:

# stable
pip install -U efficientnet

# latest - with keras and tf.keras support
pip install -U --pre efficientnet

# source
pip install -U git+https://github.com/qubvel/efficientnet

3.2. Loading

Model initialization:

# Models can be built with the Keras or TensorFlow frameworks;
# use the efficientnet.keras and efficientnet.tfkeras modules respectively.
import efficientnet.keras as efn

model = efn.EfficientNetB0(weights='imagenet')  # or weights='noisy-student'

Loading a saved model:

# The models use some custom objects, so before loading a saved model,
# import the module your network was built with,
# e.g. import efficientnet.keras / import efficientnet.tfkeras
import efficientnet.tfkeras
from tensorflow.keras.models import load_model

model = load_model('path/to/model.h5')

Accuracy of the provided pretrained weights:

Architecture      @top1* ImageNet   @top1* Noisy-Student
EfficientNetB0    0.772             0.788
EfficientNetB1    0.791             0.815
EfficientNetB2    0.802             0.824
EfficientNetB3    0.816             0.841
EfficientNetB4    0.830             0.853
EfficientNetB5    0.837             0.861
EfficientNetB6    0.841             0.864
EfficientNetB7    0.844             0.869

3.3. Prediction

import os
import sys
import numpy as np
from skimage.io import imread
import matplotlib.pyplot as plt
from keras.applications.imagenet_utils import decode_predictions

from efficientnet.keras import EfficientNetB0
from efficientnet.keras import center_crop_and_resize, preprocess_input
## Or, with tensorflow.keras:
# from efficientnet.tfkeras import EfficientNetB0
# from efficientnet.tfkeras import center_crop_and_resize, preprocess_input

# Test image
image = imread('test.jpg')

# Load the pretrained model
model = EfficientNetB0(weights='imagenet')

# Preprocess the input
image_size = model.input_shape[1]
x = center_crop_and_resize(image, image_size=image_size)
x = preprocess_input(x)
x = np.expand_dims(x, 0)

# Predict and decode
y = model.predict(x)
decode_predictions(y)

3.4. Fine-tuning

#!/usr/bin/python3
#!--*-- coding: utf-8 --*--
from efficientnet.tfkeras import EfficientNetB0
from efficientnet.tfkeras import center_crop_and_resize, preprocess_input
from tensorflow.keras import models
from tensorflow.keras import layers
from tensorflow.keras import optimizers
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.utils import plot_model

# Settings (these were not defined in the original snippet; set them for your dataset)
height, width = 224, 224
input_shape = (height, width, 3)
batch_size = 16
epochs = 20
NUM_TRAIN = 2000
NUM_TEST = 500
train_dir = '/path/to/train/'
validation_dir = '/path/to/val/'

# Load the pretrained model as the convolutional base
conv_base = EfficientNetB0(weights="imagenet",
                           include_top=False,
                           input_shape=input_shape)

# Build the fine-tuning model
dropout_rate = 0.2
model = models.Sequential()
model.add(conv_base)
model.add(layers.GlobalMaxPooling2D(name="gap"))
# model.add(layers.Flatten(name="flatten"))
if dropout_rate > 0:
    model.add(layers.Dropout(dropout_rate, name="dropout_out"))
# model.add(layers.Dense(256, activation='relu', name="fc1"))
model.add(layers.Dense(20, activation="softmax", name="fc_out"))

# Freeze the convolutional weights of conv_base
conv_base.trainable = False
# Or unfreeze only the layers after a given block:
# conv_base.trainable = True
# set_trainable = False
# for layer in conv_base.layers:
#     if layer.name == 'multiply_16':
#         set_trainable = True
#     if set_trainable:
#         layer.trainable = True
#     else:
#         layer.trainable = False

# Training data augmentation
train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode="nearest",
)
# Validation data preprocessing
test_datagen = ImageDataGenerator(rescale=1.0 / 255)

train_generator = train_datagen.flow_from_directory(
    train_dir,
    # All images will be resized to the target height and width.
    target_size=(height, width),
    batch_size=batch_size,
    # Since we use categorical_crossentropy loss, we need categorical labels.
    class_mode="categorical",
)
validation_generator = test_datagen.flow_from_directory(
    validation_dir,
    target_size=(height, width),
    batch_size=batch_size,
    class_mode="categorical",
)

model.compile(
    loss="categorical_crossentropy",
    optimizer=optimizers.RMSprop(lr=2e-5),
    metrics=["acc"],
)
history = model.fit_generator(
    train_generator,
    steps_per_epoch=NUM_TRAIN // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=NUM_TEST // batch_size,
    verbose=1,
    use_multiprocessing=True,
    workers=4,
)

plot_model(conv_base, to_file='conv_base.png', show_shapes=True)

4. References

[1] - pytorch笔记:Efficientnet微调 - 知乎

[2] - How to do Transfer learning with Efficientnet

[3] - Keras Applications

[4] - 谷歌EfficientNet缩放模型,PyTorch实现出炉,登上GitHub热榜丨Demo可用

[5] - Github - kartikdutt18/Kaggle-ATPOS-Efficient_Net

[6] - Kaggle - Efficient_Net
