Saturday, February 17, 2018

[TensorFlow] Activation functions







Introduction



The activation ops provide different types of nonlinearities for use in neural networks.
All activation ops apply componentwise, and produce a tensor of the same shape as the input tensor.
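
For example, here is a minimal sketch (an illustrative addition, using the same TensorFlow 1.x API as the code later in this post) showing that applying tf.sigmoid to a 2x2 tensor returns a 2x2 tensor of element-wise results:

import tensorflow as tf

# The activation is applied element-wise, so the output keeps the [2, 2] shape.
X = tf.constant([[-1.0, 0.0],
                 [1.0, 2.0]], dtype=tf.float32)

with tf.Session() as sess:
    print(sess.run(tf.sigmoid(X)))  # a 2x2 tensor of values in (0, 1)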

The activation function is also known as 激勵函數, 激活函數, or 活化函數 in Chinese.

The activation function is essentially a non-linear function that defines the output of a neuron, which is then passed as the input to the next layer of the neural network.


We should also know what problems activation functions face, such as:

1.  Vanishing gradients:
In some cases the gradient becomes vanishingly small, effectively preventing the weights from changing their values during training.

2.  Hard saturation:
There exists a constant c such that the gradient becomes exactly 0 for all inputs beyond c (for example, for all x > c or all x < -c).

3.  Soft saturation:
There is no such constant; the gradient only approaches 0 as the input goes to +/- infinity.

See "Noisy Activation Functions" by Caglar Gulcehre, Marcin Moczulski, Misha Denil, and Yoshua Bengio; a short sketch of this saturation behavior is shown below.
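
To make the vanishing-gradient and saturation behavior concrete, here is a small sketch (an illustrative addition, not from the paper) that evaluates the gradient of Sigmoid at a few points with tf.gradients. Far from 0 the gradient is nearly 0, which is the soft saturation described above:

import tensorflow as tf

# d/dx sigmoid(x) = sigmoid(x) * (1 - sigmoid(x)); it peaks at 0.25 for x = 0
# and shrinks towards 0 as |x| grows (soft saturation -> vanishing gradients).
X = tf.constant([-10.0, 0.0, 10.0], dtype=tf.float32)
dYdX = tf.gradients(tf.sigmoid(X), X)[0]

with tf.Session() as sess:
    print(sess.run(dYdX))  # approximately [4.5e-05, 0.25, 4.5e-05]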

In the following sample code, we will see how to use activation functions in TensorFlow and observe the f(x) values of Sigmoid, Tanh, ReLU, and Softplus.




Environment


Python 3.6.2
TensorFlow 1.5.0


Implementation


Sigmoid

API Document: tf.sigmoid
Source code: Github

import tensorflow as tf

with tf.Session() as sess:
    # sigmoid(x) = 1 / (1 + exp(-x)); squashes any input into (0, 1)
    for step in range(-10, 11):
        X = tf.convert_to_tensor(step, dtype=tf.float32)
        # Or use a random input instead:
        # X = tf.random_uniform([1, 1], minval=1.0, maxval=3.0, seed=step)
        print(sess.run(X), sess.run(tf.sigmoid(X)))





Tanh
API Document: tf.tanh
Source code: Github

with tf.Session() as sess:
    # tanh(x) = (e^x - e^-x) / (e^x + e^-x); squashes any input into (-1, 1)
    for step in range(-10, 11):
        X = tf.convert_to_tensor(step, dtype=tf.float32)
        # Or use a random input instead:
        # X = tf.random_uniform([1, 1], minval=1.0, maxval=3.0, seed=step)
        print(sess.run(X), sess.run(tf.tanh(X)))




Relu
API Document: tf.nn.relu
Source code: Github

with tf.Session() as sess:
    # relu(x) = max(0, x); outputs 0 for all negative inputs
    for step in range(-10, 11):
        X = tf.convert_to_tensor(step, dtype=tf.float32)
        # Or use a random input instead:
        # X = tf.random_uniform([1, 1], minval=1.0, maxval=3.0, seed=step)
        print(sess.run(X), sess.run(tf.nn.relu(X)))




The f(x) value is 0 for all X <= 0, so the gradient is also 0 in that region and the weights will not get adjusted during gradient descent. This is the dying ReLU problem.
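
We can check this with tf.gradients. Here is a small sketch (an illustrative addition) showing that the gradient of ReLU is 0 for every X <= 0, so a neuron stuck in that region stops learning:

import tensorflow as tf

# relu(x) = max(0, x); its gradient is 0 for x <= 0 and 1 for x > 0,
# so negative inputs contribute no weight updates (the dying ReLU problem).
X = tf.constant([-3.0, -1.0, 0.0, 1.0, 3.0], dtype=tf.float32)
dYdX = tf.gradients(tf.nn.relu(X), X)[0]

with tf.Session() as sess:
    print(sess.run(dYdX))  # [0. 0. 0. 1. 1.]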

Softplus
API Document: tf.nn.softplus
Source code: Github

with tf.Session() as sess:
    # softplus(x) = ln(1 + exp(x)); a smooth approximation of ReLU
    for step in range(-10, 11):
        X = tf.convert_to_tensor(step, dtype=tf.float32)
        # Or use a random input instead:
        # X = tf.random_uniform([1, 1], minval=1.0, maxval=3.0, seed=step)
        print(sess.run(X), sess.run(tf.nn.softplus(X)))
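
As a quick side-by-side (an illustrative addition), softplus(x) = ln(1 + e^x) behaves like a smooth version of ReLU: it stays positive for negative inputs instead of being exactly 0:

import tensorflow as tf

# Softplus never outputs exactly 0, so its gradient does not vanish
# completely for negative inputs the way ReLU's does.
X = tf.constant([-2.0, 0.0, 2.0], dtype=tf.float32)

with tf.Session() as sess:
    print(sess.run(tf.nn.relu(X)))      # [0. 0. 2.]
    print(sess.run(tf.nn.softplus(X)))  # approximately [0.1269, 0.6931, 2.1269]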





Reference


Caglar Gulcehre, Marcin Moczulski, Misha Denil, and Yoshua Bengio, "Noisy Activation Functions".