Based on the project Face editing with Generative Adversarial Networks.
Install the dependencies:
pip --version
pip install tqdm requests -i http://mirrors.cloud.tencent.com/pypi/simple --trusted-host mirrors.cloud.tencent.com
apt-get install -y libgl1-mesa-dev libglib2.0-dev libsm6 libxrender1
pip install opencv-python pillow -i http://mirrors.cloud.tencent.com/pypi/simple --trusted-host mirrors.cloud.tencent.com
# Verify the TensorFlow version (in Python):
#   import tensorflow as tf
#   print(tf.__version__)   # expected: 1.15.2
pip install keras==2.3.1 -i http://mirrors.cloud.tencent.com/pypi/simple --trusted-host mirrors.cloud.tencent.com
# Verify the Keras version (in Python):
#   import keras
#   print(keras.__version__)   # expected: 2.3.1
pip install cmake -i http://mirrors.cloud.tencent.com/pypi/simple --trusted-host mirrors.cloud.tencent.com
pip install dlib -i http://mirrors.cloud.tencent.com/pypi/simple --trusted-host mirrors.cloud.tencent.com
1. Face Encoding
Pipeline:
[1] - Download the source code:
git clone https://github.com/pbaylies/stylegan-encoder
cd stylegan-encoder
ls
Example listing:
LICENSE.txt* dnnlib/ robust_loss/
Learn_direction_in_latent_space.ipynb encode_images.py run_metrics.py*
Play_with_latent_directions.ipynb encoder/ swa.py
README.md* ffhq_dataset/ teaser.png
StyleGAN_Encoder_Tutorial.ipynb generate_figures.py* train.py*
adaptive.py metrics/ train_effnet.py
align_images.py mona_example.jpg train_resnet.py
config.py* pretrained_example.py* training/
dataset_tool.py* requirements.txt
[2] - Face alignment
Recommendations for test images:
- Use HD images, ideally with a resolution above 1000x1000 pixels
- Avoid images where the face is too small
- Frontal faces with a natural expression give the best results
- Good lighting conditions are recommended
Create the directories:
mkdir cache #models and other downloaded files
mkdir -p data/raw_images #unaligned face images
mkdir -p data/aligned_images #aligned face images
Run the face alignment script:
python align_images.py data/raw_images/ data/aligned_images/ --output_size=1024
The script performs the following steps:
- Find the faces in the image
- Crop the faces out of the image
- Align each face (nose centered, eyes horizontal)
- Rescale the face and save it to the target directory.
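The "eyes horizontal" part of the alignment above boils down to a rotation computed from eye landmarks. The snippet below is only a minimal illustration of that geometry with hypothetical landmark coordinates; it is not the actual FFHQ alignment code, which uses a full similarity transform over many landmarks.

```python
import math

def eye_roll_angle(left_eye, right_eye):
    # Angle in degrees by which the eye line deviates from horizontal;
    # rotating the image by -angle levels the eyes.
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.degrees(math.atan2(dy, dx))

def eye_center(left_eye, right_eye):
    # Midpoint between the eyes, a natural rotation/crop center.
    return ((left_eye[0] + right_eye[0]) / 2,
            (left_eye[1] + right_eye[1]) / 2)

# Hypothetical landmark positions (pixels): right eye 20 px lower than the left.
left, right = (100, 100), (200, 120)
angle = eye_roll_angle(left, right)
center = eye_center(left, right)
```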
After alignment, you can filter out the poor results in data/aligned_images/.
Because the script uses dlib, it needs to download the shape_predictor_68_face_landmarks.dat.bz2
model file. If the download is slow, fetch the file offline, place it under ./cache,
and modify align_images.py as follows:
LANDMARKS_MODEL_URL = 'http://dlib.net/files/shape_predictor_68_face_landmarks.dat.bz2'
landmarks_model_path = unpack_bz2(get_file('shape_predictor_68_face_landmarks.dat.bz2',
                                           LANDMARKS_MODEL_URL, cache_dir='./cache', cache_subdir=''))
Apply the same change in perceptual_model.py:
landmarks_model_path = unpack_bz2(get_file('shape_predictor_68_face_landmarks.dat.bz2',
                                           LANDMARKS_MODEL_URL, cache_dir='./cache', cache_subdir=''))
[3] - Encoding with the encoder
Download the pretrained ResNet encoder; it takes an image as input and outputs the corresponding latent code.
Encoder model: https://drive.google.com/uc?id=1aT59NFy9-bNyXjDuZOTMl0qX0jmZc6Zb
Place the encoder model finetuned_resnet.h5 under the ./cache directory.
Then, the encoding process:
Highly recommended: tune the encoding params; they have a large impact on both the latent representations and the generated images.
Reference: https://github.com/pbaylies/stylegan-encoder/blob/master/encode_images.py
batch_size setting:
print("aligned_images contains %d images ready for encoding!" %len(os.listdir('data/aligned_images/')))
print("Recommended batch_size for the encode_images process: %d" %min(len(os.listdir('data/aligned_images/')), 8))
Recommendation: set batch_size to at most the number of aligned images, and keep batch_size < 8 to avoid running out of GPU memory.
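The recommendation above can be wrapped in a small helper. This is just a sketch of the rule of thumb from the text (and the recommendation printed by the snippet earlier); the cap of 8 is a memory heuristic, not a hard limit of encode_images.py:

```python
def recommended_batch_size(num_aligned_images, memory_cap=8):
    # batch_size should not exceed the number of aligned images,
    # and is capped (8 here) to avoid GPU out-of-memory errors.
    return max(1, min(num_aligned_images, memory_cap))

print(recommended_batch_size(3))   # 3 aligned images -> batch_size 3
print(recommended_batch_size(20))  # plenty of images -> capped at 8
```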
Network models:
NVIDIA's pretrained StyleGAN network is used together with a VGG16 network trained on ImageNet.
Starting from the initial latent codes predicted by the pretrained ResNet model, gradient descent optimizes the latent faces.
Two variants are available: a fast one and a slow one.
By default, the w vector is optimized rather than the z vector.
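For FFHQ StyleGAN at 1024x1024, a z code is a single 512-d vector, while a w code holds one 512-d vector per synthesis layer, i.e. shape (18, 512); this is why the generation code later reshapes to (batch, 512) for z and (batch, 18, 512) for w. A shape-only sketch with random stand-ins:

```python
import numpy as np

batch = 4
z = np.random.randn(batch, 512)        # z-space: one 512-d vector per image
w = np.random.randn(batch, 18, 512)    # w-space: one 512-d vector per synthesis layer

# Encoded faces are stored one w code per image; flatten/restore as needed:
flat_w = w[0].reshape(-1)              # flat array, as in a saved .npy file
restored = flat_w.reshape(1, 18, 512)  # shape expected by the synthesis network
```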
#Fast
python encode_images.py --optimizer=lbfgs --face_mask=True --iterations=6 --use_lpips_loss=0 --use_discriminator_loss=0 --output_video=True --model_url=cache/karras2019stylegan-ffhq-1024x1024.pkl --mask_dir=outputs/masks/ --video_dir=outputs/videos/ data/aligned_images/ data/generated_images/ outputs/latent_representations/
#Slow
python encode_images.py --optimizer=adam --lr=0.02 --decay_rate=0.95 --decay_steps=6 --use_l1_penalty=0.3 --face_mask=True --iterations=400 --early_stopping=True --early_stopping_threshold=0.05 --average_best_loss=0.5 --use_lpips_loss=0 --use_discriminator_loss=0 --output_video=True --model_url=cache/karras2019stylegan-ffhq-1024x1024.pkl --mask_dir=outputs/masks/ --video_dir=outputs/videos/ data/aligned_images/ data/generated_images/ outputs/latent_representations/
The --face_mask=True flag copies the hair from the source image; if that is not desired, set --face_mask=False.
[4] - Visualize random StyleGAN samples:
import os
import pickle
import dnnlib
import dnnlib.tflib as tflib
tflib.init_tf()
synthesis_kwargs = dict(output_transform=dict(func=tflib.convert_images_to_uint8, nchw_to_nhwc=True), minibatch_size=1)
model_dir = 'cache/'
model_path = [model_dir + f for f in os.listdir(model_dir) if 'stylegan-ffhq' in f][0]
print("Loading StyleGAN model from %s..." %model_path)
with dnnlib.util.open_url(model_path) as f:
    generator_network, discriminator_network, averaged_generator_network = pickle.load(f)
print("StyleGAN loaded & ready for sampling!")
#
def generate_images(generator, latent_vector, z=True):
    batch_size = latent_vector.shape[0]
    if z:  # Start from z: run the full generator network
        return generator.run(latent_vector.reshape((batch_size, 512)), None, randomize_noise=False, **synthesis_kwargs)
    else:  # Start from w: skip the mapping network
        return generator.components.synthesis.run(latent_vector.reshape((batch_size, 18, 512)), randomize_noise=False, **synthesis_kwargs)
#
import matplotlib.pyplot as plt
import numpy as np

def plot_imgs(model, rows, columns):
    for i in range(rows):
        f, axarr = plt.subplots(1, columns, figsize=(20, 8))
        for j in range(columns):
            img = generate_images(model, np.random.randn(1, 512), z=True)[0]
            print('[INFO]--{}--{}--{}'.format(i, j, img.shape))
            axarr[j].imshow(img)
            axarr[j].axis('off')
            axarr[j].set_title('Resolution: %s' %str(img.shape))
        # plt.show()
        print('[INFO]outputs/results/{}_{}.jpg'.format(i, j))
        plt.savefig('outputs/results/{}_{}.jpg'.format(i, j))
        plt.close()
[5] - Visualize the encoding results of the test images:
import os
import matplotlib.pyplot as plt
import numpy as np

for f in sorted(os.listdir('outputs/latent_representations')):
    w = np.load('outputs/latent_representations/' + f).reshape((1, 18, -1))
    img = generate_images(averaged_generator_network, w, z=False)[0]
    plt.imshow(img)
    plt.axis('off')
    plt.title("Generated image from %s" %f)
    # plt.show()
    print('[INFO]outputs/results/latent_representations_{}.jpg'.format(f))
    plt.savefig('outputs/results/latent_representations_{}.jpg'.format(f))
    plt.close()
If the results are not good enough, tune the encoding parameters, e.g.:
- Run more optimization iterations, e.g. 500
- Lower the L1 penalty, e.g. to 0.15
- Try a smaller learning rate, e.g. 0.02, or tune the decay_rate parameter
- Try other encoding parameters; see https://github.com/pbaylies/stylegan-encoder/blob/master/encode_images.py
- For more details, see https://github.com/pbaylies/stylegan-encoder
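The lr / decay_rate / decay_steps knobs from the slow-mode command correspond to an exponential learning-rate decay schedule. The sketch below assumes smooth (non-staircase) exponential decay, which may differ in detail from the repo's exact implementation, but shows how the three parameters interact:

```python
def decayed_lr(base_lr, decay_rate, decay_steps, step):
    # Exponential decay: lr shrinks by a factor of decay_rate
    # every decay_steps iterations.
    return base_lr * decay_rate ** (step / decay_steps)

# With the slow-mode defaults --lr=0.02 --decay_rate=0.95 --decay_steps=6:
lr_start = decayed_lr(0.02, 0.95, 6, 0)   # initial learning rate: 0.02
lr_later = decayed_lr(0.02, 0.95, 6, 60)  # noticeably smaller after 60 iterations
```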
[6] - Compare the original test images with the encoded samples:
#
import os
from PIL import Image
import matplotlib.pyplot as plt

def plot_two_images(img1, img2, img_id, fs=12):
    f, axarr = plt.subplots(1, 2, figsize=(fs, fs))
    axarr[0].imshow(img1)
    axarr[0].title.set_text('Encoded img %d' %img_id)
    axarr[1].imshow(img2)
    axarr[1].title.set_text('Original img %d' %img_id)
    plt.setp(plt.gcf().get_axes(), xticks=[], yticks=[])
    # plt.show()
    print('[INFO]outputs/compare/{}_compare.jpg'.format(img_id))
    plt.savefig('outputs/compare/{}_compare.jpg'.format(img_id))
    plt.close()

def display_sbs(folder1, folder2, res=256):
    if folder1[-1] != '/': folder1 += '/'
    if folder2[-1] != '/': folder2 += '/'
    imgs1 = sorted([f for f in os.listdir(folder1) if '.png' in f])
    imgs2 = sorted([f for f in os.listdir(folder2) if '.png' in f])
    if len(imgs1) != len(imgs2):
        print("Found different amount of images in aligned vs raw image directories. That's not supposed to happen...")
    for i in range(len(imgs1)):
        img1 = Image.open(folder1 + imgs1[i]).resize((res, res))
        img2 = Image.open(folder2 + imgs2[i]).resize((res, res))
        plot_two_images(img1, img2, i)
        print("")

display_sbs('data/generated_images/', 'data/aligned_images/', res=512)
print('[INFO]Done.')
[7] - Select the good result images and save their latent vectors to disk.
Controlling the latent vectors is tricky and only works well for face encodings that already look good, so filter the results first.
# Select the good face encodings, e.g.
good_images = [0, 1]  # indices as shown in step [6]
import os
import numpy as np

latents = sorted(os.listdir('outputs/latent_representations'))
out_file = 'outputs/output_vectors.npy'
final_w_vectors = []
for img_id in good_images:
    w = np.load('outputs/latent_representations/' + latents[img_id])
    final_w_vectors.append(w)
final_w_vectors = np.array(final_w_vectors)
np.save(out_file, final_w_vectors)
print("%d latent vectors of shape %s saved to %s!" %(len(good_images), str(w.shape), out_file))
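The saved output_vectors.npy then holds an array of shape (num_selected, 18, 512) for 1024x1024 FFHQ. A small round-trip check of that save/load pattern, using random stand-ins for the real encoder outputs:

```python
import os
import tempfile
import numpy as np

# Random stand-ins for two selected w vectors (real ones come from the encoder).
final_w_vectors = np.array([np.random.randn(18, 512) for _ in range(2)])

out_file = os.path.join(tempfile.mkdtemp(), 'output_vectors.npy')
np.save(out_file, final_w_vectors)
loaded = np.load(out_file)  # shape: (2, 18, 512), values identical to what was saved
```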
2. Face Editing
Dependencies:
tensorflow-gpu==1.12
[1] - Download the project
#Original Repo: https://github.com/ShenYujun/InterFaceGAN
git clone https://github.com/tr1pzz/InterFaceGAN.git
cd InterFaceGAN/
[2] - Download the pretrained StyleGAN FFHQ model:
#https://drive.google.com/uc?id=1MEGjdvVpUsu1jB4zrXZN7Y4kBBOzizDQ
mv /path/to/karras2019stylegan-ffhq-1024x1024.pkl InterFaceGAN/models/pretrain/
[3] - Load the latent vectors obtained during face encoding:
import numpy as np
final_w_vectors = np.load('./output_vectors.npy')
[4] - InterFaceGAN provides many pretrained latent directions (you can also train your own); choose the latent-space direction to control.
For example:
latent_direction = 'age' #### Pick one of ['age', 'eyeglasses', 'gender', 'pose', 'smile']
morph_strength = 3 # Controls how strongly we push the face into a certain latent direction (try 1-5)
nr_interpolation_steps = 48 # The amount of intermediate steps/frames to render along the interpolation path
#
boundary_file = 'stylegan_ffhq_%s_w_boundary.npy' %latent_direction
print("Ready to start manipulating faces in the ** %s ** direction!" %latent_direction)
print("Interpolation from %d to %d with %d intermediate frames." %(-morph_strength, morph_strength, nr_interpolation_steps))
print("\nLoading latent directions from %s" %boundary_file)
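Editing then amounts to moving an encoded w code along the loaded boundary direction. A minimal sketch of the interpolation path implied by the settings above (random stand-ins replace the real w codes and the boundary loaded from the .npy file):

```python
import numpy as np

def interpolation_path(w, direction, strength, steps):
    # w + alpha * direction, for alpha sweeping [-strength, +strength];
    # each entry of the result is one frame of the morph.
    alphas = np.linspace(-strength, strength, steps)
    return np.stack([w + a * direction for a in alphas])

w = np.random.randn(18, 512)           # stand-in for an encoded face
direction = np.random.randn(18, 512)   # stand-in for e.g. the 'age' boundary
path = interpolation_path(w, direction, strength=3, steps=48)
```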
For example, smile: