TensorFlow ML Cookbook, Chapter 8, Section 5: Implementing DeepDream

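Guiding questions:
1. How can the DeepDream algorithm be used to explore CNNs and their features?
2. How do we declare the location of the unzipped model parameters?
3. How do we compute gradients on subregions (tiles) of an image?
4. How do we split an image into high and low frequencies and compute gradients on the low-frequency part?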

Previous post: TensorFlow ML Cookbook, Chapter 8, Sections 3 and 4: Retraining Existing CNN Models and Applying SNS (Stylenet/Neural-Style)

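Implementing DeepDream
Another use of trained CNNs exploits the fact that some intermediate nodes detect features of labels (for example, a cat's ear or a bird's feather). Using this fact, we can transform any image so that it reflects the features detected by any node we choose. This recipe walks through the DeepDream tutorial from TensorFlow's website and covers the essential parts in much more detail. The hope is to prepare the reader to use the DeepDream algorithm to explore CNNs and the features they learn.

Getting ready
TensorFlow's official tutorials show how to implement DeepDream through a script (see the first bullet point of the See also section). The purpose of this recipe is to go through that script and explain each line. The tutorial is great, but some parts can be skipped and others could use more explanation, so we aim for a more detailed line-by-line walkthrough. We also change the code to be Python 3 compliant where necessary.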

How to do it…
1. To get started with DeepDream, we need to download GoogleNet, a CNN trained on ImageNet:
[mw_shl_code=shell,true]me@computer:~$ wget https://storage.googleapis.com/download.tensorflow.org/models/inception5h.zip
me@computer:~$ unzip inception5h.zip
[/mw_shl_code]
2. We start by loading the necessary libraries and starting a graph session:
[mw_shl_code=python,true]import os
import matplotlib.pyplot as plt
import numpy as np
import PIL.Image
import tensorflow as tf
from io import BytesIO
graph = tf.Graph()
sess = tf.InteractiveSession(graph=graph)
[/mw_shl_code]
3. We now declare the location of the unzipped model parameters (from step 1) and load the parameters into a TensorFlow graph:
[mw_shl_code=python,true]# Model location
model_fn = 'tensorflow_inception_graph.pb'
# Load graph parameters
with tf.gfile.FastGFile(model_fn, 'rb') as f:
  graph_def = tf.GraphDef()
  graph_def.ParseFromString(f.read()) [/mw_shl_code]

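4. We create a placeholder for the input, store the ImageNet mean value of 117.0, and then import the graph definition with the mean-subtracted placeholder as its input: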
[mw_shl_code=python,true]# Create placeholder for input
t_input = tf.placeholder(np.float32, name='input')
# Imagenet average bias to subtract off images
imagenet_mean = 117.0
t_preprocessed = tf.expand_dims(t_input-imagenet_mean, 0)
tf.import_graph_def(graph_def, {'input':t_preprocessed}) [/mw_shl_code]
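As a rough illustration of what step 4 does to an image before it reaches the network, the NumPy sketch below mirrors the same two operations: subtracting the mean and adding a leading batch dimension. The helper name `preprocess` is hypothetical and is not part of the recipe.

```python
import numpy as np

IMAGENET_MEAN = 117.0  # same constant as in the recipe


def preprocess(img):
    """Hypothetical NumPy analogue of t_preprocessed: subtract the mean, add a batch axis."""
    centered = np.asarray(img, dtype=np.float32) - IMAGENET_MEAN
    return np.expand_dims(centered, 0)


# A flat gray image whose value equals the mean becomes all zeros
img = np.full((224, 224, 3), 117.0, dtype=np.float32)
batch = preprocess(img)
print(batch.shape)
```

The network therefore sees a batch of one image with values centered around zero, while the rest of the recipe keeps working with ordinary 0 to 255 pixel values.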

5. Next, we collect the names of the convolutional layers so that we can visualize them and use them for DeepDream processing later:
[mw_shl_code=python,true]# Create a list of layers that we can refer to later
layers = [op.name for op in graph.get_operations() if op.type=='Conv2D' and 'import/' in op.name]
# Count how many outputs for each layer
feature_nums = [int(graph.get_tensor_by_name(name+':0').get_shape()[-1]) for name in layers] [/mw_shl_code]

6. Now we pick a layer to visualize; other layers can be picked by name as well. We choose to look at feature number 139. The image starts as random noise:
[mw_shl_code=python,true]layer = 'mixed4d_3x3_bottleneck_pre_relu'
channel = 139
img_noise = np.random.uniform(size=(224,224,3)) + 100.0 [/mw_shl_code]

7. We declare a function that plots an image array:
[mw_shl_code=python,true]def showarray(a, fmt='jpeg'):
  # First make sure everything is between 0 and 255
  a = np.uint8(np.clip(a, 0, 1)*255)
  # Pick an in-memory format for image display
  f = BytesIO()
  # Create the in memory image
  PIL.Image.fromarray(a).save(f, fmt)
  # Show image
  plt.imshow(a)[/mw_shl_code]
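The conversion at the top of `showarray` can be checked on its own. The sketch below repeats only that line in a helper with the hypothetical name `to_uint8`, showing that values outside [0, 1] are clipped before scaling to the 0 to 255 range.

```python
import numpy as np


def to_uint8(a):
    """Clip to [0, 1], then scale to 0-255 and truncate, as showarray does."""
    return np.uint8(np.clip(a, 0, 1) * 255)


pixels = np.array([-0.5, 0.0, 0.5, 1.0, 1.7])
converted = to_uint8(pixels)
print(converted)
```

This is why the recipe divides images by 255.0 before passing them to `showarray`.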

8. We shorten some repetitive code by creating a function that retrieves a layer by name from the graph:
[mw_shl_code=python,true]def T(layer):
  # Helper for getting layer output tensor
  return graph.get_tensor_by_name("import/%s:0"%layer) [/mw_shl_code]

9. The next function is a wrapper that creates placeholders according to the argument types we specify:
[mw_shl_code=python,true]# The following function returns a function wrapper that will create the placeholder
# inputs of a specified dtype
def tffunc(*argtypes):
  '''Helper that transforms TF-graph generating function into a regular one.
  See "resize" function below.
  '''
  placeholders = list(map(tf.placeholder, argtypes))
  def wrap(f):
    out = f(*placeholders)
    def wrapper(*args, **kw):
      return out.eval(dict(zip(placeholders, args)), session=kw.get('session'))
    return wrapper
  return wrap [/mw_shl_code]
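The pattern behind `tffunc` is a decorator factory: placeholders are created once, the wrapped function is called once to build a deferred computation, and the returned wrapper only feeds values in. The plain-Python sketch below imitates that pattern without TensorFlow. `Placeholder` and `pyfunc` are hypothetical names, and the lambda stands in for a graph node.

```python
class Placeholder:
    """Hypothetical stand-in for tf.placeholder: a slot that is filled at call time."""

    def __init__(self, dtype):
        self.dtype = dtype


def pyfunc(*argtypes):
    # Create one placeholder per declared argument type, once
    placeholders = [Placeholder(t) for t in argtypes]

    def wrap(f):
        # Build the deferred computation once, in terms of the placeholders
        out = f(*placeholders)

        def wrapper(*args):
            # Pair each placeholder with a concrete value and evaluate
            feed = dict(zip(placeholders, args))
            return out(feed)

        return wrapper

    return wrap


@pyfunc(float, int)
def scale(x, n):
    # Return a deferred computation instead of a value, mimicking a graph node
    return lambda feed: feed[x] * feed[n]


result = scale(1.5, 4)
print(result)
```

In the recipe, the same idea lets `resize` be called with NumPy arrays even though its body is written with TensorFlow operations.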

10. We also need a function that resizes an image to a given size. We do this with TensorFlow's built-in bilinear interpolation function, tf.image.resize_bilinear():
[mw_shl_code=python,true]# Helper function that uses TF to resize an image
def resize(img, size):
  img = tf.expand_dims(img, 0)
  # Change 'img' size by linear interpolation
  return tf.image.resize_bilinear(img, size)[0,:,:,:] [/mw_shl_code]
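To make the interpolation concrete, here is a small NumPy sketch of bilinear resizing. It is an illustration, not TensorFlow's implementation: the function name `resize_bilinear_np` is hypothetical, and the sampling grid (no half-pixel offset) is an assumption that may differ from what `tf.image.resize_bilinear` does in a given version.

```python
import numpy as np


def resize_bilinear_np(img, size):
    """Bilinear resize of an (h, w, c) array to size=(H, W). Illustrative only."""
    h, w = img.shape[:2]
    H, W = int(size[0]), int(size[1])
    # Source coordinate of every output row and column (assumed sampling grid)
    ys = np.arange(H) * (h / H)
    xs = np.arange(W) * (w / W)
    # Neighbouring source pixels, clamped at the image border
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    # Fractional distances used as interpolation weights
    wy = (ys - y0)[:, None, None]
    wx = (xs - x0)[None, :, None]
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bottom = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy


# A horizontal ramp 0, 1, 2, 3 upsampled to twice the size
ramp = np.tile(np.arange(4, dtype=np.float32)[None, :, None], (2, 1, 1))
up = resize_bilinear_np(ramp, (4, 8))
print(up.shape)
print(up[0, :, 0])
```

Each new pixel is a weighted average of its nearest source pixels, so upsampling adds no new detail. The octave decomposition in step 12 relies on exactly this property.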

11. Now we need a way to update the source image so that it looks more like the feature we select. We do this by specifying how the gradient on the image is calculated. We define a function that calculates gradients on subregions (tiles) of the image to make the calculations quicker. To prevent a tiled-looking output, we randomly shift, or roll, the image in the x and y directions, which smooths out the tiling effect.
[mw_shl_code=python,true]def calc_grad_tiled(img, t_grad, tile_size=512):
  '''Compute the value of tensor t_grad over the image in a tiled way.
  Random shifts are applied to the image to blur tile boundaries over
  multiple iterations.'''
  # Pick a subregion square size
  sz = tile_size
  # Get the image height and width
  h, w = img.shape[:2]
  # Get a random shift amount in the x and y direction
  sx, sy = np.random.randint(sz, size=2)
  # Randomly shift the image (roll image) in the x and y directions
  img_shift = np.roll(np.roll(img, sx, 1), sy, 0)
  # Initialize the whole image gradient as zeros
  grad = np.zeros_like(img)
  # Now we loop through all the sub-tiles in the image
  for y in range(0, max(h-sz//2, sz), sz):
    for x in range(0, max(w-sz//2, sz), sz):
      # Select the sub image tile
      sub = img_shift[y:y+sz,x:x+sz]
      # Calculate the gradient for the tile
      g = sess.run(t_grad, {t_input:sub})
      # Apply the gradient of the tile to the whole image gradient
      grad[y:y+sz,x:x+sz] = g
  # Return the gradient, undoing the roll operation
  return np.roll(np.roll(grad, -sx, 1), -sy, 0)
[/mw_shl_code]
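The roll, tile, and unroll logic can be exercised without a network. In the sketch below, `calc_tiled` is a hypothetical copy of the loop above in which the `sess.run` call is replaced by an arbitrary function `fn`. With a pointwise function (doubling each pixel) the result must not depend on the random shift, which confirms that the final rolls undo the initial ones.

```python
import numpy as np


def calc_tiled(img, fn, tile_size=4, rng=None):
    """Apply fn tile by tile to a randomly rolled image, then undo the roll."""
    if rng is None:
        rng = np.random.default_rng()
    sz = tile_size
    h, w = img.shape[:2]
    # Random shift amounts in x and y
    sx, sy = rng.integers(sz, size=2)
    img_shift = np.roll(np.roll(img, sx, 1), sy, 0)
    out = np.zeros_like(img)
    for y in range(0, max(h - sz // 2, sz), sz):
        for x in range(0, max(w - sz // 2, sz), sz):
            sub = img_shift[y:y + sz, x:x + sz]
            # Stand-in for sess.run(t_grad, {t_input: sub})
            out[y:y + sz, x:x + sz] = fn(sub)
    # Undo the roll so the output lines up with the input
    return np.roll(np.roll(out, -sx, 1), -sy, 0)


img = np.arange(8 * 8 * 3, dtype=np.float32).reshape(8, 8, 3)
result = calc_tiled(img, lambda tile: 2.0 * tile, rng=np.random.default_rng(0))
print(np.array_equal(result, 2.0 * img))
```

Reading the loop bounds, when the image size is not a multiple of the tile size, a trailing strip of at most half a tile can be skipped in a given call and keeps a zero gradient there. Because the shift is random, a different set of pixels falls into that strip on each iteration.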
12. Now we can declare our DeepDream function. The objective of our algorithm is the mean of the feature we select, and we follow its gradient with respect to the input image so that the image activates that feature more strongly. The strategy is to split the image into a low-frequency part (a downscaled copy) and a high-frequency residual, then split the low-frequency part again and repeat. The successive scales are called octaves. Starting from the smallest scale, each pass calculates the gradients and applies them to the image, then upscales the result and adds back the stored high-frequency detail:
[mw_shl_code=python,true]def render_deepdream(t_obj, img0=img_noise, iter_n=10, step=1.5, octave_n=4, octave_scale=1.4):
  # Define the optimization objective: the mean of the selected feature
  t_score = tf.reduce_mean(t_obj)
  # The gradient tells us how to change t_input to increase t_score.
  # Here, t_score is the mean of the feature we select.
  # t_input will be the image octave (starting with the last)
  t_grad = tf.gradients(t_score, t_input)[0] # behold the power of automatic differentiation!
  # Store the image
  img = img0
  # Initialize the image octave list
  octaves = []
  # Since we stored the image, we need to only calculate n-1 octaves
  for i in range(octave_n-1):
    # Extract the image shape
    hw = img.shape[:2]
    # Resize the image, scale by the octave_scale (resize by linear interpolation)
    lo = resize(img, np.int32(np.float32(hw)/octave_scale))
    # Residual is hi. Where residual = image - (Resize lo to be hw-shape)
    hi = img-resize(lo, hw)
    # Save the lo image for re-iterating
    img = lo
    # Save the extracted hi-image
    octaves.append(hi)
  # Generate details octave by octave
  for octave in range(octave_n):
    if octave>0:
      # Start with the last octave
      hi = octaves[-octave]
      img = resize(img, hi.shape[:2])+hi
    for i in range(iter_n):
      # Calculate gradient of the image.
      g = calc_grad_tiled(img, t_grad)
      # Ideally, we would just add the gradient, g, but
      # we want to take a forward step of size 'step',
      # and divide it by the avg. norm of the gradient, so
      # we are adding a gradient of a certain size each step.
      # Also, to make sure we aren't dividing by zero, we add 1e-7.
      img += g*(step / (np.abs(g).mean()+1e-7))
      print('.',end = '')
    showarray(img/255.0)
[/mw_shl_code]
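The two ideas in step 12 can be tried in isolation without TensorFlow. The sketch below is illustrative only: `resize_nn` is a hypothetical nearest-neighbour stand-in for the recipe's bilinear `resize`, and `split_octaves`, `merge_octaves`, and `normalized_step` are names introduced here. It checks that the smallest low-frequency image plus the stored residuals rebuilds the original, and that the normalized update has a mean absolute size close to `step` whatever the scale of the gradient.

```python
import numpy as np


def resize_nn(img, size):
    """Nearest-neighbour resize; a simple stand-in for the recipe's bilinear resize."""
    h, w = img.shape[:2]
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    return img[rows][:, cols]


def split_octaves(img, octave_n=4, octave_scale=1.4):
    """Return the smallest low-frequency image and the high-frequency residuals."""
    octaves = []
    for _ in range(octave_n - 1):
        hw = img.shape[:2]
        lo = resize_nn(img, np.int32(np.float32(hw) / octave_scale))
        hi = img - resize_nn(lo, hw)  # detail lost by downscaling
        img = lo                      # keep splitting the low-frequency part
        octaves.append(hi)
    return img, octaves


def merge_octaves(img, octaves):
    """Upscale step by step, adding back the stored residual at each scale."""
    for hi in reversed(octaves):
        img = resize_nn(img, hi.shape[:2]) + hi
    return img


def normalized_step(img, g, step=1.5):
    """Gradient ascent update scaled by the mean absolute gradient."""
    return img + g * (step / (np.abs(g).mean() + 1e-7))


rng = np.random.default_rng(0)
original = rng.uniform(0, 255, size=(64, 48, 3))

smallest, residuals = split_octaves(original)
restored = merge_octaves(smallest, residuals)
print(smallest.shape, np.allclose(restored, original))

g_small = rng.normal(size=original.shape) * 1e-3
g_large = g_small * 1e6
d_small = np.abs(normalized_step(original, g_small) - original).mean()
d_large = np.abs(normalized_step(original, g_large) - original).mean()
print(d_small, d_large)
```

Because nothing is lost in the decomposition, any change made at a coarse scale is carried up to the full-size image, and the normalization keeps each update at a comparable size regardless of the raw gradient magnitude.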

13. With all the functions set up, we can now run the DeepDream algorithm:
[mw_shl_code=python,true]# Run Deep Dream
if __name__=="__main__":
  # Create resize function that has a wrapper that creates specified placeholder types
  resize = tffunc(np.float32, np.int32)(resize)
  # Open image
  img0 = PIL.Image.open('book_cover.jpg')
  img0 = np.float32(img0)
  # Show Original Image
  showarray(img0/255.0)
  # Create deep dream
  render_deepdream(T(layer)[:,:,:,139], img0, iter_n=15)
  sess.close()[/mw_shl_code]

Figure 7: The cover of the book, run through the DeepDream algorithm with feature layer numbers 50, 110, 100, and 139.

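There's more…
We urge the reader to visit the official DeepDream tutorial for further reference, and to read the original Google research blog post on DeepDream (see the second bullet point of the See also section).

See also
The TensorFlow tutorial on DeepDream:
https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/tutorials/deepdream
The original Google research blog post on DeepDream:
https://research.googleblog.com/2015/06/inceptionism-going-deeper-into-neural.html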

Original text:
Implementing DeepDream
Another usage of trained CNNs is exploiting the fact that some of the intermediate nodes detect features of labels (e.g. a cat ear or a feather of a bird). Using this fact, we can find ways to transform any image to reflect the features of any node we choose. For this recipe, we will go through the DeepDream tutorial on TensorFlow's website, covering the essential parts in much more detail. The hope is that we can prepare the reader to use the DeepDream algorithm for exploration of CNNs and features created in such CNNs.

Getting ready
TensorFlow's official tutorials show how to implement DeepDream through a script (refer to the first bullet point of the See also section). The purpose of this recipe is to go through the script they provide and explain each line. While the tutorial is great, there are some parts that are skippable and some parts that could use more explanation. We hope to provide a more detailed line-by-line explanation. We also change the code to be Python 3 compliant where necessary.

How to do it…
1.In order to get started with DeepDream, we need to download GoogleNet, which is a CNN trained on ImageNet:
me@computer:~$ wget https://storage.googleapis.com/download.tensorflow.org/models/inception5h.zip
me@computer:~$ unzip inception5h.zip

2.First, we'll start by loading the necessary libraries and starting a graph session:
import os
import matplotlib.pyplot as plt
import numpy as np
import PIL.Image
import tensorflow as tf
from io import BytesIO
graph = tf.Graph()
sess = tf.InteractiveSession(graph=graph)

3.We now declare the location of the unzipped model parameters (from step 1) and load the parameters into a TensorFlow graph:
# Model location
model_fn = 'tensorflow_inception_graph.pb'
# Load graph parameters
with tf.gfile.FastGFile(model_fn, 'rb') as f:
  graph_def = tf.GraphDef()
  graph_def.ParseFromString(f.read())

4.We create a placeholder for the input, save the imagenet mean value of 117.0, and then we import the graph definition with the normalized placeholder:
# Create placeholder for input
t_input = tf.placeholder(np.float32, name='input')
# Imagenet average bias to subtract off images
imagenet_mean = 117.0
t_preprocessed = tf.expand_dims(t_input-imagenet_mean, 0)
tf.import_graph_def(graph_def, {'input':t_preprocessed})

5.Next, we will import the convolutional layers to visualize and use for DeepDream processing later:
# Create a list of layers that we can refer to later
layers = [op.name for op in graph.get_operations() if op.type=='Conv2D' and 'import/' in op.name]
# Count how many outputs for each layer
feature_nums = [int(graph.get_tensor_by_name(name+':0').get_shape()[-1]) for name in layers]

6.Now we will pick a layer to visualize. We can pick others by name as well. We choose to look at feature number 139. The image starts with random noise:
layer = 'mixed4d_3x3_bottleneck_pre_relu'
channel = 139
img_noise = np.random.uniform(size=(224,224,3)) + 100.0

7.We declare a function that will plot an image array:
def showarray(a, fmt='jpeg'):
  # First make sure everything is between 0 and 255
  a = np.uint8(np.clip(a, 0, 1)*255)
  # Pick an in-memory format for image display
  f = BytesIO()
  # Create the in memory image
  PIL.Image.fromarray(a).save(f, fmt)
  # Show image
  plt.imshow(a)

8.We'll shorten some repetitive code by creating a function that retrieves a layer by name from the graph:
def T(layer):
  # Helper for getting layer output tensor
  return graph.get_tensor_by_name("import/%s:0"%layer)

9.The next function we will create is a wrapper function for creating placeholders according to the arguments we specify:
# The following function returns a function wrapper that will create the placeholder
# inputs of a specified dtype
def tffunc(*argtypes):
  '''Helper that transforms TF-graph generating function into a regular one.
  See "resize" function below.
  '''
  placeholders = list(map(tf.placeholder, argtypes))
  def wrap(f):
    out = f(*placeholders)
    def wrapper(*args, **kw):
      return out.eval(dict(zip(placeholders, args)), session=kw.get('session'))
    return wrapper
  return wrap

10.We also need a function that resizes an image to a size specification. We do this with TensorFlow's built-in bilinear interpolation function, tf.image.resize_bilinear():
# Helper function that uses TF to resize an image
def resize(img, size):
  img = tf.expand_dims(img, 0)
  # Change 'img' size by linear interpolation
  return tf.image.resize_bilinear(img, size)[0,:,:,:]

11.Now we need a way to update the source image to be more like a feature we select. We do this by specifying how the gradient on the image is calculated. We define a function that will calculate gradients on subregions (tiles) over the image to make the calculations quicker. In order to prevent a tiled output, we will randomly shift, or roll, the image in the x and y direction, which will smooth out the tiling effect.
def calc_grad_tiled(img, t_grad, tile_size=512):
  '''Compute the value of tensor t_grad over the image in a tiled way.
  Random shifts are applied to the image to blur tile boundaries over
  multiple iterations.'''
  # Pick a subregion square size
  sz = tile_size
  # Get the image height and width
  h, w = img.shape[:2]
  # Get a random shift amount in the x and y direction
  sx, sy = np.random.randint(sz, size=2)
  # Randomly shift the image (roll image) in the x and y directions
  img_shift = np.roll(np.roll(img, sx, 1), sy, 0)
  # Initialize the whole image gradient as zeros
  grad = np.zeros_like(img)
  # Now we loop through all the sub-tiles in the image
  for y in range(0, max(h-sz//2, sz), sz):
    for x in range(0, max(w-sz//2, sz), sz):
      # Select the sub image tile
      sub = img_shift[y:y+sz,x:x+sz]
      # Calculate the gradient for the tile
      g = sess.run(t_grad, {t_input:sub})
      # Apply the gradient of the tile to the whole image gradient
      grad[y:y+sz,x:x+sz] = g
  # Return the gradient, undoing the roll operation
  return np.roll(np.roll(grad, -sx, 1), -sy, 0)

12.Now we can declare our DeepDream function. The objective of our algorithm is the mean of the feature we select, and we follow its gradient with respect to the input image so that the image activates that feature more strongly. The strategy is to separate the image into a low-frequency part (a downscaled copy) and a high-frequency residual, then split the low-frequency part again and repeat the process. The successive scales are called octaves. Starting from the smallest scale, for each pass we calculate the gradients and apply them to the image, then upscale the result and add back the stored high-frequency detail:
def render_deepdream(t_obj, img0=img_noise,
                     iter_n=10, step=1.5, octave_n=4, octave_scale=1.4):
  # Define the optimization objective: the mean of the selected feature
  t_score = tf.reduce_mean(t_obj)
  # The gradient tells us how to change t_input to increase t_score.
  # Here, t_score is the mean of the feature we select.
  # t_input will be the image octave (starting with the last)
  t_grad = tf.gradients(t_score, t_input)[0] # behold the power of automatic differentiation!
  # Store the image
  img = img0
  # Initialize the image octave list
  octaves = []
  # Since we stored the image, we need to only calculate n-1 octaves
  for i in range(octave_n-1):
    # Extract the image shape
    hw = img.shape[:2]
    # Resize the image, scale by the octave_scale (resize by linear interpolation)
    lo = resize(img, np.int32(np.float32(hw)/octave_scale))
    # Residual is hi. Where residual = image - (Resize lo to be hw-shape)
    hi = img-resize(lo, hw)
    # Save the lo image for re-iterating
    img = lo
    # Save the extracted hi-image
    octaves.append(hi)
  # Generate details octave by octave
  for octave in range(octave_n):
    if octave>0:
      # Start with the last octave
      hi = octaves[-octave]
      img = resize(img, hi.shape[:2])+hi
    for i in range(iter_n):
      # Calculate gradient of the image.
      g = calc_grad_tiled(img, t_grad)
      # Ideally, we would just add the gradient, g, but
      # we want to take a forward step of size 'step',
      # and divide it by the avg. norm of the gradient, so
      # we are adding a gradient of a certain size each step.
      # Also, to make sure we aren't dividing by zero, we add 1e-7.
      img += g*(step / (np.abs(g).mean()+1e-7))
      print('.',end = '')
    showarray(img/255.0)

13.With all the function setup we have done, we now can perform the DeepDream algorithm.
# Run Deep Dream
if __name__=="__main__":
  # Create resize function that has a wrapper that creates specified placeholder types
  resize = tffunc(np.float32, np.int32)(resize)
  # Open image
  img0 = PIL.Image.open('book_cover.jpg')
  img0 = np.float32(img0)
  # Show Original Image
  showarray(img0/255.0)
  # Create deep dream
  render_deepdream(T(layer)[:,:,:,139], img0, iter_n=15)
  sess.close()

Figure 7: The cover of the book, run through the deep dream algorithm with feature layer numbers 50, 110, 100, and 139.

There's more…
We urge the reader to visit the official DeepDream tutorials for more reference and also to visit the original Google research blog post on DeepDream (refer to the second bullet point of the See also section).

See also
The TensorFlow tutorial on DeepDream: https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/tutorials/deepdream
The original Google research blog post on DeepDream: https://research.googleblog.com/2015/06/inceptionism-going-deeper-into-neural.html