Saturday, February 17, 2018

[TensorFlow] Activation functions







Introduction



The activation ops provide different types of nonlinearities for use in neural networks.
All activation ops apply componentwise, and produce a tensor of the same shape as the input tensor.
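
For example, here is a minimal sketch (an illustrative addition, using the same TensorFlow 1.x API as the code later in this post) showing that applying tf.sigmoid to a 2x2 tensor returns a 2x2 tensor of element-wise results:

import tensorflow as tf

# The activation is applied element-wise, so the output keeps the [2, 2] shape.
X = tf.constant([[-1.0, 0.0],
                 [1.0, 2.0]], dtype=tf.float32)

with tf.Session() as sess:
    print(sess.run(tf.sigmoid(X)))  # a 2x2 tensor of values in (0, 1)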

The activation function is also known as 激勵函數, 激活函數, or 活化函數 in Chinese.

The activation function is essentially a non-linear function that defines the output of a neuron, which is then passed as the input to the next layer of the neural network.


We should also know what problems activation functions face, such as:

1.  Vanishing gradients:
In some cases the gradient becomes vanishingly small, effectively preventing the weights from changing their values during training.

2.  Hard saturation:
There exists a constant c such that the gradient becomes exactly 0 for all inputs beyond c (for example, for all x > c or all x < -c).

3.  Soft saturation:
There is no such constant; the gradient only approaches 0 as the input goes to +/- infinity.

See "Noisy Activation Functions" by Caglar Gulcehre, Marcin Moczulski, Misha Denil, and Yoshua Bengio; a short sketch of this saturation behavior is shown below.
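
To make the vanishing-gradient and saturation behavior concrete, here is a small sketch (an illustrative addition, not from the paper) that evaluates the gradient of Sigmoid at a few points with tf.gradients. Far from 0 the gradient is nearly 0, which is the soft saturation described above:

import tensorflow as tf

# d/dx sigmoid(x) = sigmoid(x) * (1 - sigmoid(x)); it peaks at 0.25 for x = 0
# and shrinks towards 0 as |x| grows (soft saturation -> vanishing gradients).
X = tf.constant([-10.0, 0.0, 10.0], dtype=tf.float32)
dYdX = tf.gradients(tf.sigmoid(X), X)[0]

with tf.Session() as sess:
    print(sess.run(dYdX))  # approximately [4.5e-05, 0.25, 4.5e-05]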

In the following sample code, we will see how to use activation functions in TensorFlow and observe the f(x) values of Sigmoid, Tanh, ReLU, and Softplus.




Environment


Python 3.6.2
TensorFlow 1.5.0


Implementation


Sigmoid

API Document: tf.sigmoid
Source code: Github

import tensorflow as tf

with tf.Session() as sess:
    # sigmoid(x) = 1 / (1 + exp(-x)); squashes any input into (0, 1)
    for step in range(-10, 11):
        X = tf.convert_to_tensor(step, dtype=tf.float32)
        # Or use a random input instead:
        # X = tf.random_uniform([1, 1], minval=1.0, maxval=3.0, seed=step)
        print(sess.run(X), sess.run(tf.sigmoid(X)))





Tanh
API Document: tf.tanh
Source code: Github

with tf.Session() as sess:
    # tanh(x) = (e^x - e^-x) / (e^x + e^-x); squashes any input into (-1, 1)
    for step in range(-10, 11):
        X = tf.convert_to_tensor(step, dtype=tf.float32)
        # Or use a random input instead:
        # X = tf.random_uniform([1, 1], minval=1.0, maxval=3.0, seed=step)
        print(sess.run(X), sess.run(tf.tanh(X)))




Relu
API Document: tf.nn.relu
Source code: Github

with tf.Session() as sess:
    # relu(x) = max(0, x); outputs 0 for all negative inputs
    for step in range(-10, 11):
        X = tf.convert_to_tensor(step, dtype=tf.float32)
        # Or use a random input instead:
        # X = tf.random_uniform([1, 1], minval=1.0, maxval=3.0, seed=step)
        print(sess.run(X), sess.run(tf.nn.relu(X)))




The f(x) value is 0 for all X <= 0, so the gradient is also 0 in that region and the weights will not get adjusted during gradient descent. This is the dying ReLU problem.
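
We can check this with tf.gradients. Here is a small sketch (an illustrative addition) showing that the gradient of ReLU is 0 for every X <= 0, so a neuron stuck in that region stops learning:

import tensorflow as tf

# relu(x) = max(0, x); its gradient is 0 for x <= 0 and 1 for x > 0,
# so negative inputs contribute no weight updates (the dying ReLU problem).
X = tf.constant([-3.0, -1.0, 0.0, 1.0, 3.0], dtype=tf.float32)
dYdX = tf.gradients(tf.nn.relu(X), X)[0]

with tf.Session() as sess:
    print(sess.run(dYdX))  # [0. 0. 0. 1. 1.]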

Softplus
API Document: tf.nn.softplus
Source code: Github

with tf.Session() as sess:
    # softplus(x) = ln(1 + exp(x)); a smooth approximation of ReLU
    for step in range(-10, 11):
        X = tf.convert_to_tensor(step, dtype=tf.float32)
        # Or use a random input instead:
        # X = tf.random_uniform([1, 1], minval=1.0, maxval=3.0, seed=step)
        print(sess.run(X), sess.run(tf.nn.softplus(X)))
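
As a quick side-by-side (an illustrative addition), softplus(x) = ln(1 + e^x) behaves like a smooth version of ReLU: it stays positive for negative inputs instead of being exactly 0:

import tensorflow as tf

# Softplus never outputs exactly 0, so its gradient does not vanish
# completely for negative inputs the way ReLU's does.
X = tf.constant([-2.0, 0.0, 2.0], dtype=tf.float32)

with tf.Session() as sess:
    print(sess.run(tf.nn.relu(X)))      # [0. 0. 2.]
    print(sess.run(tf.nn.softplus(X)))  # approximately [0.1269, 0.6931, 2.1269]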





Reference


Caglar Gulcehre, Marcin Moczulski, Misha Denil, and Yoshua Bengio, "Noisy Activation Functions".