Activation Functions

Activation functions limit (squash) a neuron's output values. This prevents outputs from saturating into ever larger ranges, which would make the network hard to train; it is also why the identity activation performs badly, because it does not limit anything.
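As a minimal NumPy sketch (with made-up layer sizes and random weights, not values from this doc), the snippet below contrasts stacking layers with no limiting activation against stacking the same layers with a squashing activation:

```python
import numpy as np

rng = np.random.default_rng(0)

# No limiting activation: magnitudes typically grow by orders of magnitude.
x = rng.normal(size=8)
for _ in range(10):
    W = rng.normal(size=(8, 8))
    x = W @ x                      # identity: nothing is limited
print(np.abs(x).max())             # very large

# Squashing activation: every layer's output stays in a fixed range.
x = rng.normal(size=8)
for _ in range(10):
    W = rng.normal(size=(8, 8))
    x = np.tanh(W @ x)             # squashed into (-1, 1)
print(np.abs(x).max())             # always <= 1
```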

Identity Activation

The identity activation function is f(x) = x. It neither changes nor limits the value passed in from the nucleus (for example, the dot product of inputs and weights).
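A minimal NumPy sketch of the identity activation, using assumed example inputs and weights for the dot product:

```python
import numpy as np

def identity(x):
    # Identity activation: returns the nucleus output unchanged.
    return x

inputs = np.array([0.5, -2.0, 3.0])    # assumed example values
weights = np.array([4.0, 1.5, 2.0])
z = np.dot(inputs, weights)            # nucleus output: 5.0
print(identity(z))                     # 5.0 -- nothing limited, nothing changed
```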

Unit-step Activation

Unit-step

f(x) = 1 if x >= 0, and f(x) = 0 otherwise.

Half-maximum Unit-step

f(x) = 1 if x > 0, f(x) = 0.5 if x = 0, and f(x) = 0 otherwise.
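A minimal NumPy sketch of both step variants defined above:

```python
import numpy as np

def unit_step(x):
    # 1 if x >= 0, 0 otherwise
    return np.where(x >= 0, 1.0, 0.0)

def half_maximum_unit_step(x):
    # 1 if x > 0, 0.5 if x == 0, 0 otherwise
    return np.where(x > 0, 1.0, np.where(x == 0, 0.5, 0.0))

x = np.array([-2.0, 0.0, 3.0])
print(unit_step(x))                # [0. 1. 1.]
print(half_maximum_unit_step(x))   # [0.  0.5 1. ]
```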

Rectifier Activations

A rectifier activation function usually has a flat section and a rectified (raised) linear section in its graph.

ReLU

The Rectified Linear Unit is the most common activation function; it is faster to compute than sigmoid-like functions. It limits the output by throwing the negative half of the nucleus output away: f(x) = max(0, x). It is good for hidden layers and makes it easier to shrink (distill) the network later.
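A minimal NumPy sketch of ReLU:

```python
import numpy as np

def relu(x):
    # Negative half of the nucleus output is thrown away,
    # everything else passes through unchanged.
    return np.maximum(0.0, x)

print(relu(np.array([-3.0, -0.5, 0.0, 2.0])))   # [0. 0. 0. 2.]
```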

Leaky ReLU

Leaky ReLU is a ReLU variant that is not flat on the negative side: it rises slightly there, to avoid vanishing values caused by multiplications with zero.
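A minimal NumPy sketch of Leaky ReLU; the negative-side slope alpha = 0.01 is an assumed, commonly used value, not one given in this doc:

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    # Small slope on the negative side instead of a flat zero.
    return np.where(x > 0, x, alpha * x)

print(leaky_relu(np.array([-3.0, -0.5, 0.0, 2.0])))   # [-0.03 -0.005  0.  2.]
```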

Sigmoid-like Activations

Sigmoid

Sigmoid is the most common S-shaped activation function. It is good for the output layer, but bad for hidden layers, as training may stall due to vanishing gradients.
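A minimal NumPy sketch of sigmoid and its derivative; the derivative peaks at 0.25 for x = 0 and shrinks towards zero for large |x|, which is where the vanishing-gradient problem comes from:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    # Derivative of sigmoid: s * (1 - s)
    s = sigmoid(x)
    return s * (1.0 - s)

x = np.array([-10.0, 0.0, 10.0])
print(sigmoid(x))        # [~0.00005, 0.5, ~0.99995]
print(sigmoid_grad(x))   # [~0.00005, 0.25, ~0.00005] -- nearly zero at the extremes
```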

Logistic

The logistic function is f(x) = 1 / (1 + e^(-x)); this is the standard sigmoid, and in practice the two names refer to the same S-shaped curve.

Softmax

Softmax is a generalization of the logistic function to multiple classes: it converts a vector of scores into a probability distribution, and with two classes it reduces to the logistic function.
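A minimal NumPy sketch of softmax, with assumed example scores; the last two lines check that with two classes (one score fixed at 0) it reduces to the logistic function:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))   # subtract max for numerical stability
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])    # assumed example scores
p = softmax(scores)
print(p, p.sum())                     # probabilities that sum to 1.0

# Two-class check: softmax([x, 0])[0] equals the logistic function of x.
x = 1.5
print(softmax(np.array([x, 0.0]))[0], 1.0 / (1.0 + np.exp(-x)))
```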

Hyperbolic Tangent

The hyperbolic tangent, tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x)), is S-shaped like sigmoid but outputs values in (-1, 1), so its outputs are centered around zero.
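A minimal NumPy sketch of tanh, also showing that it can be written in terms of the sigmoid as 2*sigmoid(2x) - 1:

```python
import numpy as np

x = np.array([-3.0, 0.0, 3.0])
print(np.tanh(x))                            # [-0.995  0.     0.995]
print(2.0 / (1.0 + np.exp(-2.0 * x)) - 1.0)  # same values via the sigmoid identity
```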


 