
TensorFlow ML Cookbook, Chapter 6, Sections 6-7: Improving the Predictions of Linear Models and Learning to Play Tic Tac Toe

Guiding questions:
1. How do we declare functions that initialize the variables and layers in our model?
2. How do we plot the cross-entropy loss and the train and test set accuracies with matplotlib?
3. How can we use a neural network to learn the optimal moves of Tic Tac Toe?
4. How do we initialize the variables and loop through the training of our neural network?




Previous: TensorFlow ML Cookbook, Chapter 6, Sections 4-5: Implementing Different Layers and Using Multilayer Neural Networks

Improving the Predictions of Linear Models
In the prior recipes, we noted that the number of parameters we were fitting far exceeded that of the equivalent linear models. In this recipe, we will attempt to improve our logistic model of low birth weight by using a neural network.

Getting ready
For this recipe, we will load the low birth-weight data and use a neural network with two fully connected hidden layers with sigmoid activations to fit the probability of a low birth weight.

How to do it…
1. We start by loading the libraries and initializing our computational graph:
[mw_shl_code=python,true]import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
import requests
sess = tf.Session() [/mw_shl_code]

2. Now we will load, extract, and normalize our data just as in the prior recipe, except that we will use the low-birth-weight indicator variable as our target instead of the actual birth weight:
[mw_shl_code=python,true]birthdata_url = 'https://www.umass.edu/statdata/statdata/data/lowbwt.dat'
birth_file = requests.get(birthdata_url)
birth_data = birth_file.text.split('\r\n')[5:]
birth_header = [x for x in birth_data[0].split(' ') if len(x)>=1]
birth_data = [[float(x) for x in y.split(' ') if len(x)>=1] for y in birth_data[1:] if len(y)>=1]
y_vals = np.array([x[1] for x in birth_data])
x_vals = np.array([x[2:9] for x in birth_data])
train_indices = np.random.choice(len(x_vals), round(len(x_vals)*0.8), replace=False)
test_indices = np.array(list(set(range(len(x_vals))) - set(train_indices)))
x_vals_train = x_vals[train_indices]
x_vals_test = x_vals[test_indices]
y_vals_train = y_vals[train_indices]
y_vals_test = y_vals[test_indices]
def normalize_cols(m):
    col_max = m.max(axis=0)
    col_min = m.min(axis=0)
    return (m - col_min) / (col_max - col_min)
x_vals_train = np.nan_to_num(normalize_cols(x_vals_train))
x_vals_test = np.nan_to_num(normalize_cols(x_vals_test)) [/mw_shl_code]
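A side note on the np.nan_to_num() wrapper above: if a column happens to be constant, col_max - col_min is zero and the min-max scaling divides 0 by 0, producing NaN. The following is a minimal illustrative sketch (not from the book):
[mw_shl_code=python,true]import numpy as np
# The first column is constant, so its max - min is 0
m = np.array([[1., 5.],
              [1., 10.]])
scaled = (m - m.min(axis=0)) / (m.max(axis=0) - m.min(axis=0))
print(scaled)                 # first column is nan (0/0)
print(np.nan_to_num(scaled))  # nan is replaced by 0.0[/mw_shl_code]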

3. Next we'll declare our batch size and the placeholders for the data:
[mw_shl_code=python,true]batch_size = 90
x_data = tf.placeholder(shape=[None, 7], dtype=tf.float32)
y_target = tf.placeholder(shape=[None, 1], dtype=tf.float32)[/mw_shl_code]

4. Just as before, we will declare functions that initialize a variable and a layer in our model. To build a better logistic model, we need a function that returns a logistic layer on top of an input layer; in other words, a fully connected layer that returns an element-wise sigmoid. It is important to remember that our loss function will include the final sigmoid, so we specify on the last layer that we will not return the sigmoid of the output:
[mw_shl_code=python,true]def init_variable(shape):
    return(tf.Variable(tf.random_normal(shape=shape)))

# Create a logistic layer definition
def logistic(input_layer, multiplication_weight, bias_weight, activation = True):
    linear_layer = tf.add(tf.matmul(input_layer, multiplication_weight), bias_weight)
    if activation:
        return(tf.nn.sigmoid(linear_layer))
    else:
        return(linear_layer) [/mw_shl_code]

5. Now we will declare the three layers (two hidden layers and an output layer). We start by initializing a weight and bias matrix for each layer and defining the layer operations:
[mw_shl_code=python,true]# First logistic layer (7 inputs to 14 hidden nodes)
A1 = init_variable(shape=[7,14])
b1 = init_variable(shape=[14])
logistic_layer1 = logistic(x_data, A1, b1)
# Second logistic layer (14 hidden inputs to 5 hidden nodes)
A2 = init_variable(shape=[14,5])
b2 = init_variable(shape=[5])
logistic_layer2 = logistic(logistic_layer1, A2, b2)
# Final output layer (5 hidden nodes to 1 output)
A3 = init_variable(shape=[5,1])
b3 = init_variable(shape=[1])
final_output = logistic(logistic_layer2, A3, b3, activation=False) [/mw_shl_code]

6. Next we declare our loss function (cross-entropy) and optimization algorithm, and initialize the variables:
[mw_shl_code=python,true]# Create loss function
loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(final_output, y_target))
# Declare optimizer
my_opt = tf.train.AdamOptimizer(learning_rate = 0.002)
train_step = my_opt.minimize(loss)
# Initialize variables
init = tf.initialize_all_variables()
sess.run(init)[/mw_shl_code]

7. In order to evaluate our model and compare it to prior models, we want to create prediction and accuracy operations on the graph. This will allow us to feed in the whole test set and determine the accuracy:
[mw_shl_code=python,true]prediction = tf.round(tf.nn.sigmoid(final_output))
predictions_correct = tf.cast(tf.equal(prediction, y_target), tf.float32)
accuracy = tf.reduce_mean(predictions_correct) [/mw_shl_code]

8. Now we are ready to start our training loop. We will train for 1,500 generations and save the model loss and the train/test accuracies for plotting later:
[mw_shl_code=python,true]# Initialize loss and accuracy vectors
loss_vec = []
train_acc = []
test_acc = []
for i in range(1500):
    # Select random indices for batch selection
    rand_index = np.random.choice(len(x_vals_train), size=batch_size)
    # Select batch
    rand_x = x_vals_train[rand_index]
    rand_y = np.transpose([y_vals_train[rand_index]])
    # Run training step
    sess.run(train_step, feed_dict={x_data: rand_x, y_target: rand_y})
    # Get training loss
    temp_loss = sess.run(loss, feed_dict={x_data: rand_x, y_target: rand_y})
    loss_vec.append(temp_loss)
    # Get training accuracy
    temp_acc_train = sess.run(accuracy, feed_dict={x_data: x_vals_train, y_target: np.transpose([y_vals_train])})
    train_acc.append(temp_acc_train)
    # Get test accuracy
    temp_acc_test = sess.run(accuracy, feed_dict={x_data: x_vals_test, y_target: np.transpose([y_vals_test])})
    test_acc.append(temp_acc_test)
    if (i+1)%150==0:
        print('Loss = ' + str(temp_loss)) [/mw_shl_code]

9. This results in the following output:
[mw_shl_code=python,true]Loss = 0.696393
Loss = 0.591708
Loss = 0.59214
Loss = 0.505553
Loss = 0.541974
Loss = 0.512707
Loss = 0.590149
Loss = 0.502641
Loss = 0.518047
Loss = 0.502616 [/mw_shl_code]

10. The following code block illustrates how to plot the cross-entropy loss and the train and test set accuracies with matplotlib:
[mw_shl_code=python,true]# Plot loss over time
plt.plot(loss_vec, 'k-')
plt.title('Cross Entropy Loss per Generation')
plt.xlabel('Generation')
plt.ylabel('Cross Entropy Loss')
plt.show()
# Plot train and test accuracy
plt.plot(train_acc, 'k-', label='Train Set Accuracy')
plt.plot(test_acc, 'r--', label='Test Set Accuracy')
plt.title('Train and Test Accuracy')
plt.xlabel('Generation')
plt.ylabel('Accuracy')
plt.legend(loc='lower right')
plt.show()[/mw_shl_code]
Figure 7: Training loss over 1,500 iterations.
Within approximately 50 generations, we have reached a good model. As we continue to train, we can see that very little is gained over the remaining iterations:

Figure 8: Accuracy for the train set and the test set.
Here we can see that we arrived at a good model very quickly.

How it works…
When considering using a neural network to model data, we have to weigh the advantages and disadvantages. While our model converges faster than prior models and may be a bit more accurate in some cases, this comes at a price: we are training many more model variables and have a greater chance of overfitting. To see overfitting occurring, we look at the accuracies of the train and test sets: the training accuracy keeps increasing slightly, while the test accuracy stays the same or even decreases slightly.

To combat underfitting, we can increase the model depth or train for more iterations. To address overfitting, we can add more data or apply regularization techniques to the model.
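As a sketch of one such regularization technique (not part of the book's listing), we could add an L2 penalty on the weight matrices from step 5 to the cross-entropy loss; the 0.01 strength here is an arbitrary illustrative value:
[mw_shl_code=python,true]# Hypothetical L2-regularized loss for the network above
l2_penalty = 0.01 * (tf.reduce_sum(tf.square(A1)) +
                     tf.reduce_sum(tf.square(A2)) +
                     tf.reduce_sum(tf.square(A3)))
regularized_loss = loss + l2_penalty
train_step = my_opt.minimize(regularized_loss)[/mw_shl_code]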
It is also important to note that our model variables are not as interpretable as those of a linear model: neural network coefficients are much harder to interpret when it comes to explaining the significance of individual features.

Learning to Play Tic Tac Toe
To show how adaptable neural networks can be, we will attempt to use a neural network to learn the optimal moves of Tic Tac Toe. We approach this knowing that Tic Tac Toe is a deterministic game and that the optimal moves are already known.

Getting ready
To train our model, we will have a list of board positions, each followed by the optimal response, for a number of different boards. We can reduce the number of boards to train on by considering only board positions that differ with respect to symmetry. The non-identity transformations of a Tic Tac Toe board are rotations (in either direction) by 90, 180, and 270 degrees, a horizontal reflection, and a vertical reflection. Given this idea, we will use a shortlist of boards with their optimal moves, apply two random transformations to each, and feed the results into our neural network for learning.

If we annotate Xs by 1, Os by -1, and empty spaces by 0, the following shows how we can treat a board position and its optimal move as a row of data:
Figure 9: Here we illustrate how to consider a board and an optimal move as a row of data. Note that X = 1, O = -1, empty spaces are 0, and indexing starts at 0.
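For instance, the test board used in step 8 below, whose optimal move is at index six, would appear as the following row of data (illustrative snippet):
[mw_shl_code=python,true]# Nine board values (X = 1, O = -1, empty = 0) followed by the
# index (0-8) of the optimal response
row = [-1, 0, 0, 1, -1, -1, 0, 0, 1, 6]
board, response = row[0:9], row[9][/mw_shl_code]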
In addition to the model loss, we will check how the model performs in two ways. The first check is to remove one position-and-optimal-move row from our training set; this lets us see whether the neural network can generalize to a move it has never seen before. The second way to evaluate the model is to actually play a game against it at the end.
The list of possible boards and their optimal moves can be found in the GitHub directory for this recipe: https://github.com/nfmcclure/tensorflow_cookbook/tree/master/06_Neural_Networks/08_Learning_Tic_Tac_Toe.

How to do it…
1. We will start by loading the necessary libraries for this script:
[mw_shl_code=python,true]import tensorflow as tf
import matplotlib.pyplot as plt
import csv
import random
import numpy as np[/mw_shl_code]

2. Next we declare the batch size for training our model:
[mw_shl_code=python,true]batch_size = 50[/mw_shl_code]

3. To make visualizing the boards a bit easier, we will create a function that prints a Tic Tac Toe board with Xs and Os:
[mw_shl_code=python,true]def print_board(board):
    symbols = ['O', ' ', 'X']
    board_plus1 = [int(x) + 1 for x in board]
    print(' ' + symbols[board_plus1[0]] + ' | ' + symbols[board_plus1[1]] + ' | ' + symbols[board_plus1[2]])
    print('___________')
    print(' ' + symbols[board_plus1[3]] + ' | ' + symbols[board_plus1[4]] + ' | ' + symbols[board_plus1[5]])
    print('___________')
    print(' ' + symbols[board_plus1[6]] + ' | ' + symbols[board_plus1[7]] + ' | ' + symbols[board_plus1[8]]) [/mw_shl_code]
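As a quick illustration (not part of the book's listing), printing the test board from step 8 below would produce:
[mw_shl_code=python,true]print_board([-1, 0, 0, 1, -1, -1, 0, 0, 1])
# O |   |
#___________
# X | O | O
#___________
#   |   | X[/mw_shl_code]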

4. Now we have to create a function that returns a new board and the optimal response position under a given transformation:
[mw_shl_code=python,true]def get_symmetry(board, response, transformation):
    '''
    :param board: list of integers 9 long:
     opposing mark = -1
     friendly mark = 1
     empty space = 0
    :param transformation: one of five transformations on a board:
     rotate180, rotate90, rotate270, flip_v, flip_h
    :return: tuple: (new_board, new_response)
    '''
    if transformation == 'rotate180':
        new_response = 8 - response
        return(board[::-1], new_response)
    elif transformation == 'rotate90':
        new_response = [6, 3, 0, 7, 4, 1, 8, 5, 2].index(response)
        tuple_board = list(zip(*[board[6:9], board[3:6], board[0:3]]))
        return([value for item in tuple_board for value in item], new_response)
    elif transformation == 'rotate270':
        new_response = [2, 5, 8, 1, 4, 7, 0, 3, 6].index(response)
        tuple_board = list(zip(*[board[0:3], board[3:6], board[6:9]]))[::-1]
        return([value for item in tuple_board for value in item], new_response)
    elif transformation == 'flip_v':
        new_response = [6, 7, 8, 3, 4, 5, 0, 1, 2].index(response)
        return(board[6:9] + board[3:6] + board[0:3], new_response)
    elif transformation == 'flip_h':
        # flip_h = rotate180, then flip_v
        new_response = [2, 1, 0, 5, 4, 3, 8, 7, 6].index(response)
        new_board = board[::-1]
        return(new_board[6:9] + new_board[3:6] + new_board[0:3], new_response)
    else:
        raise ValueError('Method not implemented.') [/mw_shl_code]
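As a quick sanity check (illustrative, not from the book), rotating the step 8 test board by 180 degrees should map its optimal response from index 6 to index 8 - 6 = 2:
[mw_shl_code=python,true]board = [-1, 0, 0, 1, -1, -1, 0, 0, 1]
new_board, new_response = get_symmetry(board, 6, 'rotate180')
print(new_board)     # [1, 0, 0, -1, -1, 1, 0, 0, -1]
print(new_response)  # 2[/mw_shl_code]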

5. The list of boards and their optimal responses sits in a .csv file in the recipe directory. We will create a function that loads this file of boards and responses and stores it as a list of tuples:
[mw_shl_code=python,true]def get_moves_from_csv(csv_file):
    '''
    :param csv_file: csv file location containing the boards w/ responses
    :return: moves: list of moves with index of best response
    '''
    moves = []
    with open(csv_file, 'rt') as csvfile:
        reader = csv.reader(csvfile, delimiter=',')
        for row in reader:
            moves.append(([int(x) for x in row[0:9]], int(row[9])))
    return(moves)
[/mw_shl_code]
6. Now we'll tie everything together into a function that returns a randomly transformed board and response:
[mw_shl_code=python,true]def get_rand_move(moves, rand_transforms=2):
    # This function performs random transformations on a board.
    (board, response) = random.choice(moves)
    possible_transforms = ['rotate90', 'rotate180', 'rotate270', 'flip_v', 'flip_h']
    for i in range(rand_transforms):
        random_transform = random.choice(possible_transforms)
        (board, response) = get_symmetry(board, response, random_transform)
    return(board, response) [/mw_shl_code]

7. Next we'll initialize our graph session, load our data, and create a training set:
[mw_shl_code=python,true]sess = tf.Session()
moves = get_moves_from_csv('base_tic_tac_toe_moves.csv')
# Create a train set:
train_length = 500
train_set = []
for t in range(train_length):
    train_set.append(get_rand_move(moves))
[/mw_shl_code]
8. Remember that we want to remove one board and its optimal response from the training set to see whether the model can generalize to make the best move. The best move for the following board is to play at index six:
[mw_shl_code=python,true]test_board = [-1, 0, 0, 1, -1, -1, 0, 0, 1]
train_set = [x for x in train_set if x[0] != test_board] [/mw_shl_code]

9. We can now create functions that build our model variables and model operations. Note that we do not include a softmax() activation function in the model because it is included in the loss function:
[mw_shl_code=python,true]def init_weights(shape):
  return(tf.Variable(tf.random_normal(shape)))
def model(X, A1, A2, bias1, bias2):
  layer1 = tf.nn.sigmoid(tf.add(tf.matmul(X, A1), bias1))
  layer2 = tf.add(tf.matmul(layer1, A2), bias2)
  return(layer2) [/mw_shl_code]

10. Now we will declare our placeholders, variables, and model:
[mw_shl_code=python,true]X = tf.placeholder(dtype=tf.float32, shape=[None, 9])
Y = tf.placeholder(dtype=tf.int32, shape=[None])
A1 = init_weights([9, 81])
bias1 = init_weights([81])
A2 = init_weights([81, 9])
bias2 = init_weights([9])
model_output = model(X, A1, A2, bias1, bias2)
# The softmax is applied inside the loss (see step 9); Y holds sparse class indices
loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(model_output, Y))
train_step = tf.train.GradientDescentOptimizer(0.025).minimize(loss)
prediction = tf.argmax(model_output, 1) [/mw_shl_code]

11. We can now initialize our variables and loop through the training of our neural network:
[mw_shl_code=python,true]# Initialize variables
init = tf.initialize_all_variables()
sess.run(init)
loss_vec = []
for i in range(10000):
    # Select random indices for batch
    rand_indices = np.random.choice(range(len(train_set)), batch_size, replace=False)
    # Get batch
    batch_data = [train_set[i] for i in rand_indices]
    x_input = [x[0] for x in batch_data]
    y_target = np.array([y[1] for y in batch_data])
    # Run training step
    sess.run(train_step, feed_dict={X: x_input, Y: y_target})
    # Get training loss
    temp_loss = sess.run(loss, feed_dict={X: x_input, Y: y_target})
    loss_vec.append(temp_loss)
    if i%500==0:
        print('iteration ' + str(i) + ' Loss: ' + str(temp_loss)) [/mw_shl_code]

12. Here is the code to plot the loss over the model training:
[mw_shl_code=python,true]plt.plot(loss_vec, 'k-', label='Loss')
plt.title('Loss (MSE) per Generation')
plt.xlabel('Generation')
plt.ylabel('Loss')
plt.show()[/mw_shl_code]
Figure 10: Tic Tac Toe train set loss over 10,000 iterations.

Here we plot the loss over the training steps:
1. To test the model, we see how it performs on the test board that we removed from the training set. We hope that the model can generalize and predict the optimal index for the move, which will be index six. Most of the time, the model will succeed here:
[mw_shl_code=python,true]test_boards = [test_board]
feed_dict = {X: test_boards}
logits = sess.run(model_output, feed_dict=feed_dict)
predictions = sess.run(prediction, feed_dict=feed_dict)
print(predictions)
[/mw_shl_code]
2. This results in the following output:
[mw_shl_code=python,true][6] [/mw_shl_code]

3. In order to evaluate our model, we plan to play a game against the trained model. To do this, we have to create a function that checks for a win; that way, our program will know when to stop asking for more moves:
[mw_shl_code=python,true]def check(board):
    wins = [[0,1,2], [3,4,5], [6,7,8], [0,3,6], [1,4,7], [2,5,8], [0,4,8], [2,4,6]]
    for i in range(len(wins)):
        if board[wins[i][0]] == board[wins[i][1]] == board[wins[i][2]] == 1.:
            return(1)
        elif board[wins[i][0]] == board[wins[i][1]] == board[wins[i][2]] == -1.:
            return(1)
    return(0) [/mw_shl_code]
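For example (illustrative only), a completed top row for either player returns 1, while an unfinished board returns 0:
[mw_shl_code=python,true]print(check([1., 1., 1., -1., -1., 0., 0., 0., 0.]))  # 1 (win for X)
print(check([1., 0., 0., 0., -1., 0., 0., 0., 0.]))   # 0 (no winner yet)[/mw_shl_code]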

4. Now we can loop through and play a game against our model. We start with a blank board (all zeros); we then ask the user to input an index (0-8) of where to play and feed that into the model for a prediction. For the model's move, we take the largest prediction that is also an open space. A sample game is shown at the end; from it, we can see that our model is not perfect:
[mw_shl_code=python,true]game_tracker = [0., 0., 0., 0., 0., 0., 0., 0., 0.]
win_logical = False
num_moves = 0
while not win_logical:
    player_index = input('Input index of your move (0-8): ')
    num_moves += 1
    # Add player move to game
    game_tracker[int(player_index)] = 1.
    # Get model's move by first getting all the logits for each index
    [potential_moves] = sess.run(model_output, feed_dict={X: [game_tracker]})
    # Now find allowed moves (where game tracker values = 0.0)
    allowed_moves = [ix for ix, x in enumerate(game_tracker) if x == 0.0]
    # Find best move by taking argmax of logits if they are in allowed moves
    model_move = np.argmax([x if ix in allowed_moves else -999.0 for ix, x in enumerate(potential_moves)])
    # Add model move to game
    game_tracker[int(model_move)] = -1.
    print('Model has moved')
    print_board(game_tracker)
    # Now check for win or too many moves
    if check(game_tracker) == 1 or num_moves >= 5:
        print('Game Over!')
        win_logical = True
[/mw_shl_code]
5. This results in the following interactive output:
[mw_shl_code=python,true]Input index of your move (0-8): 4
Model has moved
O | |
___________
| X |
___________
| |
Input index of your move (0-8): 6
Model has moved
O | |
___________
| X |
___________
X | | O
Input index of your move (0-8): 2
Model has moved
O | | X
___________
O | X |
___________
X | | O
Game Over! [/mw_shl_code]

How it works…
We trained a neural network to play Tic Tac Toe by feeding in board positions as nine-dimensional vectors and predicting the optimal response. We only had to feed in a few possible Tic Tac Toe boards and apply random transformations to each board to increase the training set size.
To test our algorithm, we removed all instances of one specific board and checked whether our model could generalize to predict the optimal response. Finally, we also played a sample game against our model. While it is not yet perfect, we could still try different architectures and training procedures to improve it.



