
  • First build a computation graph (the construction phase); nothing is executed at this point.
  • A TensorFlow session (Session) places operations on devices such as CPUs and GPUs, runs them, and holds all variable values.

Constants

import tensorflow as tf

a = tf.constant(5)
b = 5 + a
c = a + b
with tf.Session() as sess:
    result = sess.run(c)
    print(result)
15

Variables

Variables hold the values that are continually optimized during training. When a value needs to be updated within a session, use a mutable tensor, i.e. tf.Variable.

x = tf.Variable(1, name="x")
y = tf.Variable(2, name="y")
f = (x + y) * a

# Initialize one Variable from another Variable
weights = tf.Variable(tf.random_normal([100, 100], stddev=2))
weight2 = tf.Variable(weights.initialized_value(), name='w2')

tf.global_variables_initializer() must be created after all Variables have been declared; variables defined afterwards are not covered by it, and running them uninitialized raises an error. (It would be convenient if creating the session and initializing variables were a single step, but in TF 1.x they are separate.)

# Initialize the variables
init_var = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init_var)
    """
    To evaluate f and c efficiently, evaluate them in a single graph run; all node values
    are dropped between runs except variable values, so evaluating them separately would
    recompute shared nodes such as b twice.
    In single-process TensorFlow, each session keeps its own copy of every variable;
    multiple sessions share no state, even if they reuse the same graph.
    A variable's lifetime starts when its initializer runs and ends when the session is closed.
    """
    f_val, c_val = sess.run([f, c])
    print(f_val, c_val)

15 15

assign

Re-assigning the value of a variable; see the sketch below.
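A minimal sketch of in-place re-assignment with tf.assign / tf.assign_add (assuming the usual import tensorflow as tf; the variable v is purely illustrative):

v = tf.Variable(10)
assign_op = tf.assign(v, 100)   # an op that writes 100 into v when it is run
add_op = tf.assign_add(v, 1)    # in-place increment

with tf.Session() as sess:
    sess.run(v.initializer)
    sess.run(assign_op)
    print(v.eval())  # 100
    sess.run(add_op)
    print(v.eval())  # 101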

scope

get_variable and Variable

  • tf.Variable requires an initial value.
  • tf.Variable always creates a new variable; if a variable with that name already exists, a suffix is appended to make the name unique.
  • tf.get_variable looks up an existing variable with the given parameters in the graph; if it does not exist, a new one is created; if it does exist (and reuse is not enabled), an error is raised.

Initialization with get_variable

from pprint import pprint

tf.Variable(0, name="a")  # name == "a:0"
tf.get_variable(initializer=0, name="a")  # name == "a_1:0"
tf.Variable(0, name="a")  # name == "a_2:0"

b = tf.get_variable(initializer=0, name="b")  # name == "b:0"
c = tf.get_variable(name="c", shape=1)  # name == "c:0"
pprint(tf.trainable_variables())
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    print(b.eval())
    print(c.eval())

name_scope and variable_scope

  • Both kinds of scope have the same effect on variables created with tf.Variable: the scope name is added as a prefix to the operation/variable name.
  • For tf.get_variable they differ: only tf.variable_scope adds a prefix, while tf.name_scope is ignored, as the example below shows.
with tf.variable_scope('var'):
    with tf.name_scope("name"):
        v1 = tf.get_variable("var1", [1], dtype=tf.float32)
        v2 = tf.Variable(1, name="var2", dtype=tf.float32)
        a = tf.add(v1, v2)

print(v1.name)  # var/var1:0
print(v2.name)  # var/name/var2:0
print(a.name)  # var/name/Add:0

scope reuse: sharing variables by pairing tf.variable_scope with tf.get_variable

with tf.variable_scope('bar'):
    foo1 = tf.get_variable(name="a", shape=1)  # name == "bar/a:0"
    foo2 = tf.get_variable(name="a", shape=1)  # ValueError: Variable bar/a already exists

with tf.variable_scope('foo', reuse=tf.AUTO_REUSE):
    foo1 = tf.get_variable(name="a", shape=1)  # name == "foo/a:0"

with tf.name_scope("bar"):
    with tf.variable_scope("foo", reuse=True):
        v1 = tf.get_variable("a")
        assert foo1 == v1

pprint(tf.trainable_variables())

Placeholders

Placeholders are typically used to feed training data into TensorFlow during training. They hold no value and perform no computation themselves; values must be supplied at run time via feed_dict.

A = tf.placeholder(tf.float32, shape=(None, 3))
B = A + 5
with tf.Session() as sess:
    B_val_1 = B.eval(feed_dict={A: [[1, 2, 3]]})
    B_val_2 = B.eval(feed_dict={A: [[4, 5, 6], [7, 8, 9]]})

print(B_val_1)
print(B_val_2)

[[6. 7. 8.]]
[[ 9. 10. 11.]
 [12. 13. 14.]]

dtype

dtype conversion

b = tf.Variable(tf.random_uniform([5, 10], 0, 2, dtype=tf.int32))
b_new = tf.cast(b, dtype=tf.float32)

# Convert a Python object / numpy array into a tensor
t = tf.convert_to_tensor([1.0, 2.0, 3.0])

save load

By default the Saver saves and restores every Variable under its own name, but if you need more control you can specify which variables to save or restore and which names to use.

saver = tf.train.Saver()
"""
Save only specific variables:
saver = tf.train.Saver({"weights": theta})
"""
with tf.Session() as sess:
    save_path = saver.save(sess, "/tmp/my_model.ckpt")

with tf.Session() as sess:
    saver.restore(sess, "/tmp/my_model_final.ckpt")


# Load a previously trained TensorFlow model
# (numpy is used below; batch_size and z_dim are assumed to be defined elsewhere)
import numpy as np
saver = tf.train.import_meta_graph('mnist_dcgan-60000.meta')
saver.restore(sess, tf.train.latest_checkpoint(''))

graph = tf.get_default_graph()
g = graph.get_tensor_by_name('generator/g/Tanh:0')
noise = graph.get_tensor_by_name('noise:0')
is_training = graph.get_tensor_by_name('is_training:0')

n = np.random.uniform(-1.0, 1.0, [batch_size, z_dim]).astype(np.float32)
gen_imgs = sess.run(g, feed_dict={noise: n, is_training: False})

TensorBoard

Save checkpoints periodically during training and visualize training progress with TensorBoard; see the checkpointing sketch below.
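A minimal sketch of periodic checkpointing inside a training loop (train_op, n_steps and checkpoints_dir are assumed to be defined elsewhere):

saver = tf.train.Saver(max_to_keep=5)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(n_steps):
        sess.run(train_op)
        if step % 1000 == 0:
            # The step number is appended to the filename, e.g. model.ckpt-1000
            saver.save(sess, checkpoints_dir + "/model.ckpt", global_step=step)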

# Displays the distribution of a tensor (e.g. weights or activations) over training
tf.summary.histogram('D_X/true', self.D_X(x))

# Typically used to plot scalars such as the loss or the accuracy
tf.summary.scalar('loss/cycle', cycle_loss)

# Logs images
tf.summary.image('X/generated', utils.batch_convert2int(self.G(x)))
# Merge all the summaries and write them out to the summaries_dir
# a single run(merged, feed_dict=...) then fetches them all, which is very convenient
# merged = tf.summary.merge_all()

# graph = sess.graph or tf.get_default_graph()
file_writer = tf.summary.FileWriter(checkpoints_dir, graph)

mse_summary = tf.summary.scalar('MSE', mse)
with tf.Session() as sess:
    for i in range(n_steps):  # n_steps: number of training steps
        s = sess.run(mse_summary)
        file_writer.add_summary(s, i)
        # file_writer.flush()

file_writer.close()

The --logdir must be the full (absolute) path; a relative path does not work here. PyCharm's Copy Path command is handy.

tensorboard --logdir path/to/log-directory

device

"/cpu:0" for the CPU devices and "/gpu:I" for the \(i^{th}\) GPU device

  • log_device_placement=True logs which device each operation is placed on, so you can verify that TensorFlow really is using the specified devices.
  • allow_soft_placement=True lets TensorFlow fall back to an existing, supported device when you are unsure about the device or the requested one is unavailable.

GPUs are much faster than CPUs for this kind of work because they have many small cores. However, using the GPU is not always advantageous for every kind of computation: the overhead associated with the GPU (e.g. moving data to and from it) can sometimes cost more than the parallelism gains. To deal with this, TensorFlow lets you place computations on specific devices. By default, if both a CPU and a GPU are present, TensorFlow gives the GPU priority.

config = tf.ConfigProto(allow_soft_placement=True, log_device_placement=True)

# Manually pin ops to a device
with tf.device('/cpu:0'):
    rand_t = tf.random_uniform([50,50], 0, 10, dtype=tf.float32, seed=0)
    a = tf.Variable(rand_t)
    b = tf.Variable(rand_t)
    c = tf.matmul(a,b)
    init = tf.global_variables_initializer()

with tf.Session(config=config) as sess:
    ...

optimizers

tf.train.exponential_decay: learning_rate is the initial learning rate; global_step is used to compute the decay exponent; decay_steps sets the decay period; decay_rate is the multiplier applied per decay period. If staircase is False the decay is a smooth exponential, while True gives a stepped decay, i.e. the learning rate stays constant over a stretch of steps (often one epoch) before dropping.

In practice you can also simply use an optimizer with an adaptive learning rate, such as AdamOptimizer; see the sketch after the training-loop example below.

The example below decays the learning rate every 100,000 steps with a decay rate of 0.95.

# Decaying learning_rate
global_step = tf.Variable(0, trainable=False)
initial_learning_rate = 0.2
learning_rate = tf.train.exponential_decay(initial_learning_rate, global_step, decay_steps=100000, decay_rate=0.95, staircase=True)
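
The decay only advances if global_step is actually incremented; in TF 1.x this is usually done by passing it to minimize (a minimal sketch, assuming a loss tensor is already defined):

optimizer = tf.train.GradientDescentOptimizer(learning_rate)
training_op = optimizer.minimize(loss, global_step=global_step)  # increments global_step on every run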

optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
train_step = optimizer.minimize(loss)

with tf.Session() as sess:
    ...
    sess.run(train_step, feed_dict={X: X_data, Y: Y_data})
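
As noted above, an adaptive optimizer can be swapped in the same way; a minimal sketch with AdamOptimizer (default hyperparameters, loss assumed defined):

optimizer = tf.train.AdamOptimizer(learning_rate=0.001)
train_step = optimizer.minimize(loss)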

Gradient Clipping

Intuitively, gradient clipping keeps the weight updates within a sensible range and prevents gradients from exploding.

trainable_variables = tf.trainable_variables()  # collect every variable the model trains
grads, _ = tf.clip_by_global_norm(tf.gradients(self.cost, trainable_variables), MAX_GRAD_NORM)  # compute the gradients and clip them by global norm
optimizer = tf.train.GradientDescentOptimizer(learning_rate = 1.0)
self.train_op = optimizer.apply_gradients(
    zip(grads, trainable_variables)
)

autoencoder

CAE

Convolutional Autoencoders (CAE)

Transposed convolution layers (tf.nn.conv2d_transpose) tend to produce checkerboard artefacts; using tf.image.resize_nearest_neighbor followed by a regular convolution avoids them.

### Encoder
conv1 = tf.layers.conv2d(X_noisy, filters[1], (k,k), padding='same', activation=activation_fn)
# Output size h_in x w_in x filters[1]
maxpool1 = tf.layers.max_pooling2d(conv1, (p,p), (s,s), padding='same')
# Output size h_l2 x w_l2 x filters[1]
conv2 = tf.layers.conv2d(maxpool1, filters[2], (k,k), padding='same', activation=activation_fn)
# Output size h_l2 x w_l2 x filters[2]
maxpool2 = tf.layers.max_pooling2d(conv2,(p,p), (s,s), padding='same')
# Output size h_l3 x w_l3 x filters[2]
conv3 = tf.layers.conv2d(maxpool2,filters[3], (k,k), padding='same', activation=activation_fn)
# Output size h_l3 x w_l3 x filters[3]
encoded = tf.layers.max_pooling2d(conv3, (p,p), (s,s), padding='same')
# Output size h_l3/s x w_l3/s x filters[3] Now 4x4x16


### Decoder
upsample1 = tf.image.resize_nearest_neighbor(encoded, (h_l3,w_l3))
# Output size h_l3 x w_l3 x filters[3]
conv4 = tf.layers.conv2d(upsample1, filters[3], (k,k), padding='same', activation=activation_fn)
# Output size h_l3 x w_l3 x filters[3]
upsample2 = tf.image.resize_nearest_neighbor(conv4, (h_l2,w_l2))
# Output size h_l2 x w_l2 x filters[3]
conv5 = tf.layers.conv2d(upsample2, filters[2], (k,k), padding='same', activation=activation_fn)
# Output size h_l2 x w_l2 x filters[2]
upsample3 = tf.image.resize_nearest_neighbor(conv5, (h_in,w_in))
# Output size h_in x w_in x filters[2]
conv6 = tf.layers.conv2d(upsample3, filters[1], (k,k), padding='same', activation=activation_fn)
# Output size h_in x w_in x filters[1]

logits = tf.layers.conv2d(conv6, 1, (k,k) , padding='same', activation=None)

# Output size h_in x w_in x 1
decoded = tf.nn.sigmoid(logits, name='decoded')


loss = tf.nn.sigmoid_cross_entropy_with_logits(labels=X, logits=logits)
cost = tf.reduce_mean(loss)
opt = tf.train.AdamOptimizer(0.001).minimize(cost)

"""
Excerpt from: Antonio Gulli, "TensorFlow 1.x Deep Learning Cookbook." Apple Books.
"""