Based on the project Face editing with Generative Adversarial Networks.

https://github.com/pbaylies/stylegan-encoder

Install the dependencies:

pip --version
pip install tqdm requests -i http://mirrors.cloud.tencent.com/pypi/simple --trusted-host mirrors.cloud.tencent.com
apt-get install -y libgl1-mesa-dev libglib2.0-dev libsm6 libxrender1
pip install opencv-python pillow -i http://mirrors.cloud.tencent.com/pypi/simple --trusted-host mirrors.cloud.tencent.com

# Check the TensorFlow version:
#   import tensorflow as tf
#   print(tf.__version__)   # 1.15.2

pip install keras==2.3.1 -i http://mirrors.cloud.tencent.com/pypi/simple --trusted-host mirrors.cloud.tencent.com
# Check the Keras version:
#   import keras
#   print(keras.__version__)   # 2.3.1

pip install cmake -i http://mirrors.cloud.tencent.com/pypi/simple --trusted-host mirrors.cloud.tencent.com
pip install dlib -i http://mirrors.cloud.tencent.com/pypi/simple --trusted-host mirrors.cloud.tencent.com

1. Face Encoding

Processing pipeline:

[1] - Download the source code:

git clone https://github.com/pbaylies/stylegan-encoder
cd stylegan-encoder
ls

For example:

LICENSE.txt*                           dnnlib/                 robust_loss/
Learn_direction_in_latent_space.ipynb  encode_images.py        run_metrics.py*
Play_with_latent_directions.ipynb      encoder/                swa.py
README.md*                             ffhq_dataset/           teaser.png
StyleGAN_Encoder_Tutorial.ipynb        generate_figures.py*    train.py*
adaptive.py                            metrics/                train_effnet.py
align_images.py                        mona_example.jpg        train_resnet.py
config.py*                             pretrained_example.py*  training/
dataset_tool.py*                       requirements.txt

[2] - Face alignment

Recommendations for the test images:

  • Use HD images, ideally above 1000x1000 pixels
  • Avoid images in which the face is too small
  • Natural expressions and frontal faces give better results
  • Good lighting conditions are recommended

Create the folders:

mkdir cache                   # model files, etc.
mkdir -p data/raw_images      # unaligned face images
mkdir -p data/aligned_images  # aligned face images

Run the face alignment script:

python align_images.py data/raw_images/ data/aligned_images/ --output_size=1024

Its main processing steps are (see the detection sketch after the dlib notes below):

  • Find the faces in the image
  • Crop the faces out of the image
  • Align the faces (nose centered, eyes horizontal)
  • Rescale the faces and save them to the target path.

After alignment, you can filter out and delete the poor-quality faces in data/aligned_images/.

Since the dlib library is used, the shape_predictor_68_face_landmarks.dat.bz2 model file has to be downloaded. If the download is slow, fetch it offline, place it in ./cache, and modify align_images.py as follows:

#LANDMARKS_MODEL_URL = 'http://dlib.net/files/shape_predictor_68_face_landmarks.dat.bz2'
landmarks_model_path = unpack_bz2(get_file('./shape_predictor_68_face_landmarks.dat.bz2',
                    LANDMARKS_MODEL_URL, cache_dir='./cache',cache_subdir=''))

Apply the same change in perceptual_model.py:

landmarks_model_path = unpack_bz2(get_file('./shape_predictor_68_face_landmarks.dat.bz2',
                   LANDMARKS_MODEL_URL, cache_dir='./cache',cache_subdir=''))
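For reference, a minimal sketch of the landmark-detection step the alignment builds on (a hypothetical standalone example; it assumes the unpacked .dat file sits in ./cache, and data/raw_images/example.jpg is a placeholder name for one of your photos):

import dlib

# HOG-based frontal face detector plus the 68-point landmark model
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor('cache/shape_predictor_68_face_landmarks.dat')

img = dlib.load_rgb_image('data/raw_images/example.jpg')  # hypothetical file name
for rect in detector(img, 1):  # upsample once so smaller faces are found
    landmarks = predictor(img, rect)
    # point 30 is the nose tip; points 36-47 cover the two eyes
    print('nose tip at:', landmarks.part(30).x, landmarks.part(30).y)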

[3] - Encoding with the encoder

Download the pretrained ResNet encoder; the model takes an image as input and outputs the corresponding latent code.

Encoder model: https://drive.google.com/uc?id=1aT59NFy9-bNyXjDuZOTMl0qX0jmZc6Zb

Place the encoder model finetuned_resnet.h5 in the ./cache directory.
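As a quick smoke test, the encoder can be loaded and queried directly (a minimal sketch, assuming the .h5 file loads with stock Keras layers; the 256x256 input size matches encode_images.py's default --resnet_image_size):

from keras.models import load_model
import numpy as np

resnet = load_model('cache/finetuned_resnet.h5')
dummy_face = np.random.rand(1, 256, 256, 3)  # stand-in for a preprocessed face
latent = resnet.predict(dummy_face)
print(latent.shape)  # expected: (1, 18, 512), one w vector per StyleGAN layer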

Then run the encoding process:

Highly recommended: tune the encoding parameters, which have a large impact on both the latent representations and the generated images.

Reference: https://github.com/pbaylies/stylegan-encoder/blob/master/encode_images.py

Setting the batch size:

print("aligned_images contains %d images ready for encoding!" %len(os.listdir('data/aligned_images/')))
print("Recommended batch_size for the encode_images process: %d" %min(len(os.listdir('data/aligned_images/')), 8))

It is recommended to set batch_size no larger than the number of aligned images; to avoid GPU out-of-memory problems, keep batch_size below 8.

Network models:

The pipeline uses NVIDIA's pretrained StyleGAN network together with a VGG16 network trained on ImageNet.

The latent codes are initialized with the predictions of the pretrained ResNet model, then refined with gradient descent to optimize the latent faces.

Two configurations are available: a fast one and a slow one.

By default, the w vector is optimized rather than the z vector.
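Conceptually, the optimization loop looks like the following toy sketch (not the repo's actual code: the fixed random projection stands in for "run the generator on w and extract VGG16 features" so the snippet runs on its own; encode_images.py minimizes real perceptual losses instead):

import numpy as np
import tensorflow as tf  # 1.15, graph mode

# w is the variable being optimized; the projection is a toy stand-in
# for synthesizing an image from w and featurizing it with VGG16.
w = tf.Variable(np.random.randn(1, 18 * 512).astype(np.float32))
proj = tf.constant(np.random.randn(18 * 512, 64).astype(np.float32))
generated_features = tf.matmul(w, proj)
target_features = tf.constant(np.random.randn(1, 64).astype(np.float32))

loss = tf.reduce_mean(tf.square(generated_features - target_features))
step = tf.train.AdamOptimizer(learning_rate=0.02).minimize(loss, var_list=[w])

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(400):
        _, cur_loss = sess.run([step, loss])
    print('final loss:', cur_loss)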

#Fast
python encode_images.py --optimizer=lbfgs --face_mask=True --iterations=6 --use_lpips_loss=0 --use_discriminator_loss=0 --output_video=True --model_url=cache/karras2019stylegan-ffhq-1024x1024.pkl --mask_dir=outputs/masks/ --video_dir=outputs/videos/ data/aligned_images/ data/generated_images/ outputs/latent_representations/

#Slow
python encode_images.py --optimizer=adam --lr=0.02 --decay_rate=0.95 --decay_steps=6 --use_l1_penalty=0.3 --face_mask=True --iterations=400 --early_stopping=True --early_stopping_threshold=0.05 --average_best_loss=0.5 --use_lpips_loss=0 --use_discriminator_loss=0 --output_video=True --model_url=cache/karras2019stylegan-ffhq-1024x1024.pkl --mask_dir=outputs/masks/ --video_dir=outputs/videos/ data/aligned_images/ data/generated_images/ outputs/latent_representations/

The --face_mask=True parameter copies the hair from the source image; if that is not what you want, set --face_mask=False.

[4] - Visualize random StyleGAN samples:

import os
import pickle

import dnnlib
import dnnlib.tflib as tflib

tflib.init_tf()
synthesis_kwargs = dict(output_transform=dict(func=tflib.convert_images_to_uint8, nchw_to_nhwc=True), minibatch_size=1)

model_dir = 'cache/'
model_path = [model_dir+f for f in os.listdir(model_dir) if 'stylegan-ffhq' in f][0]
print("Loading StyleGAN model from %s..." %model_path)

with dnnlib.util.open_url(model_path) as f:
  generator_network, discriminator_network, averaged_generator_network = pickle.load(f)
  
print("StyleGAN loaded & ready for sampling!")

#
def generate_images(generator, latent_vector, z = True):
    batch_size = latent_vector.shape[0]
    
    if z: #Start from z: run the full generator network
        return generator.run(latent_vector.reshape((batch_size, 512)), None, randomize_noise=False, **synthesis_kwargs)
    else: #Start from w: skip the mapping network
        return generator.components.synthesis.run(latent_vector.reshape((batch_size, 18, 512)), randomize_noise=False, **synthesis_kwargs)
    
#
import matplotlib.pyplot as plt
import numpy as np
def plot_imgs(model, rows, columns):
  for i in range(rows):
    f, axarr = plt.subplots(1,columns, figsize = (20,8))
    for j in range(columns):
      img = generate_images(model, np.random.randn(1,512), z = True)[0]
      print('[INFO]--{}--{}--{}'.format(i, j, img.shape))
      axarr[j].imshow(img)
      axarr[j].axis('off')
      axarr[j].set_title('Resolution: %s' %str(img.shape))
    # plt.show()
    print('[INFO]outputs/results/{}_{}.jpg'.format(i, j))  
    plt.savefig('outputs/results/{}_{}.jpg'.format(i, j))
    plt.close()   
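A usage sketch for the helper above (it assumes the model loaded in the previous snippet; the output folder is created first):

import os
os.makedirs('outputs/results', exist_ok=True)

# 2 rows x 3 columns of faces sampled from random z vectors
plot_imgs(averaged_generator_network, rows=2, columns=3)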

[5] - Visualize the encoding results for the test images:

import os
import matplotlib.pyplot as plt
import numpy as np

os.makedirs('outputs/results', exist_ok=True)

for f in sorted(os.listdir('outputs/latent_representations')):
  w = np.load('outputs/latent_representations/' + f).reshape((1,18,-1))
  img = generate_images(averaged_generator_network, w, z = False)[0]
  plt.imshow(img)
  plt.axis('off')
  plt.title("Generated image from %s" %f)
  # plt.show()
  print('[INFO]outputs/results/latent_representations_{}.jpg'.format(f))  
  plt.savefig('outputs/results/latent_representations_{}.jpg'.format(f))
  plt.close()   

If the results are not good enough, go back and adjust the encoding parameters from step [3].

[6] - Compare the original test images with their encoded counterparts:

#
import os
from PIL import Image
import matplotlib.pyplot as plt

os.makedirs('outputs/compare', exist_ok=True)

def plot_two_images(img1,img2, img_id, fs = 12):
  f, axarr = plt.subplots(1,2, figsize=(fs,fs))
  axarr[0].imshow(img1)
  axarr[0].title.set_text('Encoded img %d' %img_id)
  axarr[1].imshow(img2)
  axarr[1].title.set_text('Original img %d' %img_id)
  plt.setp(plt.gcf().get_axes(), xticks=[], yticks=[])
  # plt.show()
  print('[INFO]outputs/compare/{}_compare.jpg'.format(img_id))  
  plt.savefig('outputs/compare/{}_compare.jpg'.format(img_id))
  plt.close()   

def display_sbs(folder1, folder2, res = 256):
  if folder1[-1] != '/': folder1 += '/'
  if folder2[-1] != '/': folder2 += '/'
    
  imgs1 = sorted([f for f in os.listdir(folder1) if '.png' in f])
  imgs2 = sorted([f for f in os.listdir(folder2) if '.png' in f])
  if len(imgs1)!=len(imgs2):
    print("Found different amount of images in aligned vs raw image directories. That's not supposed to happen...")
  
  for i in range(len(imgs1)):
    img1 = Image.open(folder1+imgs1[i]).resize((res,res))
    img2 = Image.open(folder2+imgs2[i]).resize((res,res))
    plot_two_images(img1,img2, i)
    print("")
     
display_sbs('data/generated_images/', 'data/aligned_images/', res = 512)
print('[INFO]Done.')

[7] - Select a subset of the better result images and save their latent vectors to disk.

Controlling the latent vectors is tricky, and it only works well for faces whose encodings already look good, so the results need to be filtered.

# Select the good face encodings, e.g.
good_images = [0, 1]  # indices as shown in step [6]

import os
import numpy as np

latents = sorted(os.listdir('outputs/latent_representations'))

out_file = 'outputs/output_vectors.npy'

final_w_vectors = []
for img_id in good_images:
  w = np.load('outputs/latent_representations/' + latents[img_id])
  final_w_vectors.append(w)

final_w_vectors = np.array(final_w_vectors)
np.save(out_file, final_w_vectors)
print("%d latent vectors of shape %s saved to %s!" %(len(good_images), str(w.shape), out_file))

2. Face Editing

Dependencies:

tensorflow-gpu==1.12

[1] - Download the project

#Original Repo: https://github.com/ShenYujun/InterFaceGAN
git clone https://github.com/tr1pzz/InterFaceGAN.git
cd InterFaceGAN/

[2] - Download the pretrained StyleGAN FFHQ model:

#https://drive.google.com/uc?id=1MEGjdvVpUsu1jB4zrXZN7Y4kBBOzizDQ
mv /path/to/karras2019stylegan-ffhq-1024x1024.pkl InterFaceGAN/models/pretrain/

[3] - Load the latent vectors obtained during face encoding (copy outputs/output_vectors.npy from the encoder project into the InterFaceGAN directory first):

import numpy as np
final_w_vectors = np.load('./output_vectors.npy')

[4] - InterFaceGAN ships many pretrained latent directions (you can also train your own); pick the latent-space control you want.

For example:

latent_direction = 'age'     #### Pick one of ['age', 'eyeglasses', 'gender', 'pose', 'smile']
morph_strength = 3           # Controls how strongly we push the face into a certain latent direction (try 1-5)
nr_interpolation_steps = 48  # The amount of intermediate steps/frames to render along the interpolation path

#
boundary_file = 'stylegan_ffhq_%s_w_boundary.npy' %latent_direction
print("Ready to start manipulating faces in the ** %s ** direction!" %latent_direction)
print("Interpolation from %d to %d with %d intermediate frames." %(-morph_strength, morph_strength, nr_interpolation_steps))
print("\nLoading latent directions from %s" %boundary_file)

For example, with latent_direction = 'smile', the faces can be pushed along the smile boundary; a sketch of the interpolation follows.
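A hedged sketch of the edit itself (the repo's own scripts do the equivalent; this assumes the boundary .npy sits in the repo's boundaries/ folder and reuses generate_images and averaged_generator_network from section 1, step [4]):

import numpy as np

boundary = np.load('boundaries/' + boundary_file).reshape((1, 1, 512))
w = final_w_vectors[0].reshape((1, 18, 512))  # first selected face

for alpha in np.linspace(-morph_strength, morph_strength, nr_interpolation_steps):
    w_edit = w + alpha * boundary  # shift all 18 layers along the boundary normal
    frame = generate_images(averaged_generator_network, w_edit, z=False)[0]
    # collect or save `frame` here (e.g. with PIL) to build the morph sequence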
