TensorFlow
▌Introduction
(From the TensorFlow documentation) The activation ops provide different types of nonlinearities for use in neural networks. All activation ops apply componentwise, and produce a tensor of the same shape as the input tensor.
The activation function is also known as 激勵函數、激活函數、活化函數 in Chinese.
The activation function is a non-linear function that defines the output of a neuron, which is then passed as input to the next layer of the neural network.
There is an excellent article on this topic: Understanding Activation Functions in Neural Networks, by Avinash Sharma V.
We should also know what problems activation functions face, such as:
1. Vanishing gradients: In some cases the gradient becomes vanishingly small, effectively preventing the weights from changing their values.
2. Hard saturation: There is a constant c such that the gradient becomes 0 once the input moves beyond it (|x| > c).
3. Soft saturation: There is no such constant; the gradient only approaches 0 as the input goes to +/- infinity.
See Noisy Activation Functions by Caglar Gulcehre, Marcin Moczulski, Misha Denil, and Yoshua Bengio for a formal treatment; a small gradient sketch illustrating saturation follows below.
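To make the saturation behaviour concrete, here is a minimal sketch (assuming the same TensorFlow 1.x environment listed below) that uses tf.gradients to print the sigmoid gradient at a few inputs; the gradient shrinks toward 0 as |x| grows, which is exactly the soft saturation that causes vanishing gradients.

import tensorflow as tf

# Sketch: sigmoid soft-saturates, so its gradient shrinks toward 0 as |x| grows.
x = tf.placeholder(tf.float32)
y = tf.sigmoid(x)
grad = tf.gradients(y, x)[0]  # d/dx sigmoid(x) = sigmoid(x) * (1 - sigmoid(x))

with tf.Session() as sess:
    for value in [0.0, 2.0, 5.0, 10.0]:
        print(value, sess.run(grad, feed_dict={x: value}))
# The gradient drops from 0.25 at x = 0 toward roughly 4.5e-05 at x = 10.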
In the following sample code, we will see how to use activation functions in TensorFlow and look at the f(x) values produced by Sigmoid, Tanh, ReLU, and Softplus.
▌Environment
▋Python 3.6.2
▋TensorFlow 1.5.0
▌Implementation
▋Sigmoid
API Document: tf.sigmoid
Source code: Github
import tensorflow as tf

with tf.Session() as sess:
    for step in range(-10, 11):
        X = tf.convert_to_tensor(step, dtype=tf.float32)
        # X = tf.random_uniform([1, 1], minval=1.0, maxval=3.0, seed=step)  # or use a random number
        print(sess.run(X), sess.run(tf.sigmoid(X)))
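For reference, tf.sigmoid computes f(x) = 1 / (1 + exp(-x)), so every printed value lies in (0, 1). A quick sanity check against the formula (a sketch added here, not from the original post, using Python's math module):

import math
import tensorflow as tf

# Compare tf.sigmoid with the formula 1 / (1 + exp(-x)).
with tf.Session() as sess:
    x = 2.0
    tf_value = sess.run(tf.sigmoid(tf.constant(x, dtype=tf.float32)))
    manual_value = 1.0 / (1.0 + math.exp(-x))
    print(tf_value, manual_value)  # both approximately 0.8808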
▋Tanh
API Document: tf.tanh
Source code: Github
with tf.Session() as sess:
    for step in range(-10, 11):
        X = tf.convert_to_tensor(step, dtype=tf.float32)
        # X = tf.random_uniform([1, 1], minval=1.0, maxval=3.0, seed=step)  # or use a random number
        print(sess.run(X), sess.run(tf.tanh(X)))
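tf.tanh computes the hyperbolic tangent, which ranges over (-1, 1) and is a rescaled sigmoid: tanh(x) = 2 * sigmoid(2x) - 1. A small sketch (not from the original post) checking that identity:

import tensorflow as tf

# Sketch: tanh is a rescaled sigmoid, tanh(x) = 2 * sigmoid(2x) - 1.
with tf.Session() as sess:
    x = tf.constant(1.5, dtype=tf.float32)
    lhs = sess.run(tf.tanh(x))
    rhs = sess.run(2.0 * tf.sigmoid(2.0 * x) - 1.0)
    print(lhs, rhs)  # both approximately 0.9051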
▋Relu
API Document: tf.nn.relu
Source code: Github
with tf.Session() as sess:
    for step in range(-10, 11):
        X = tf.convert_to_tensor(step, dtype=tf.float32)
        # X = tf.random_uniform([1, 1], minval=1.0, maxval=3.0, seed=step)  # or use a random number
        print(sess.run(X), sess.run(tf.nn.relu(X)))
The f(x) value is 0 for X <= 0, so the gradient there is also 0 and the corresponding weights will not be adjusted during gradient descent. This is the dying ReLU problem.
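To see this numerically, the following sketch (assuming the same TF 1.x session style as above) prints the gradient of tf.nn.relu at a negative and a positive input; for the negative input it is exactly 0, so a neuron stuck in that region stops learning.

import tensorflow as tf

# Sketch: the ReLU gradient is 0 for negative inputs and 1 for positive inputs.
x = tf.placeholder(tf.float32)
y = tf.nn.relu(x)
grad = tf.gradients(y, x)[0]

with tf.Session() as sess:
    print(sess.run(grad, feed_dict={x: -3.0}))  # 0.0 -> a neuron stuck here stops learning
    print(sess.run(grad, feed_dict={x: 3.0}))   # 1.0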
▋Softplus
API Document: tf.nn.softplus
Source code: Github
with tf.Session() as sess:
    for step in range(-10, 11):
        X = tf.convert_to_tensor(step, dtype=tf.float32)
        # X = tf.random_uniform([1, 1], minval=1.0, maxval=3.0, seed=step)  # or use a random number
        print(sess.run(X), sess.run(tf.nn.softplus(X)))
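tf.nn.softplus computes f(x) = ln(1 + exp(x)), a smooth approximation of ReLU whose derivative is exactly the sigmoid, so it does not hard-saturate at 0. A brief check of the formula (a sketch added here, using Python's math module):

import math
import tensorflow as tf

# Compare tf.nn.softplus with the formula ln(1 + exp(x)).
with tf.Session() as sess:
    x = 2.0
    tf_value = sess.run(tf.nn.softplus(tf.constant(x, dtype=tf.float32)))
    manual_value = math.log(1.0 + math.exp(x))
    print(tf_value, manual_value)  # both approximately 2.1269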