Reinforcement Learning

Formula

Q(S,A) += Rate × TD
Where Q(S,A) is the expected future return, Rate is the learning rate, and TD is the temporal difference: Reward + Discount × max(Q(S',A')) − Q(S,A).
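This update can be sketched as a plain Q-table step. The states, actions, and reward below are made-up placeholders, and the Rate/Discount values are arbitrary:

```python
# Minimal tabular Q-learning update (illustrative values throughout).
from collections import defaultdict

RATE = 0.1      # learning rate
DISCOUNT = 0.9  # discount factor for future rewards

Q = defaultdict(float)  # Q[(state, action)] -> expected future return

def update(state, action, reward, next_state, actions):
    # Temporal difference: target minus current estimate.
    best_next = max(Q[(next_state, a)] for a in actions)
    td = reward + DISCOUNT * best_next - Q[(state, action)]
    Q[(state, action)] += RATE * td
    return td

td = update("s0", "right", reward=1.0, next_state="s1", actions=["left", "right"])
```

With an empty table the target is just the immediate reward, so this first update moves Q(("s0","right")) from 0 to 0.1.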

Important Things vs SL

In supervised learning, the key quantities are:
• Delta = (U − Y), and the loss is calculated from the delta.
• The loss should approach zero.
In reinforcement learning, the corresponding concepts are:
• Diff (the temporal difference), and Goal = sum(abs(diff)) over the episode.
• The goal should approach the maximum sum of rewards of the episode.
abs(diff) or diff² should not be called "error" or "loss", because it is an increasing quantity, not one to be reduced.
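The contrast can be made concrete with two toy computations; every number below is invented for illustration:

```python
# Supervised learning: delta and loss, to be driven toward zero.
u, y = 0.8, 0.5                    # target and prediction (made-up values)
delta = u - y
loss = delta ** 2                  # computed from delta; should shrink

# Reinforcement learning: temporal differences and rewards (made-up values).
diffs = [0.4, -0.2, 0.1]
goal = sum(abs(d) for d in diffs)  # increases over training, not reduced
rewards = [0.0, 1.0, 2.0]
episode_return = sum(rewards)      # should approach the episode maximum
```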

Training

Logging

In reinforcement learning, the trainer programme should not log the loss of the Q-network, as it is of no use; it should log the sum of rewards at the end of each episode run. Logging Q(S,A) at square 1 (the start point of the episode) may be what people first think of, but it always increases, so it is also of no use.
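A training loop following this advice logs only the per-episode reward sum. The environment here is a hypothetical stand-in with a reset/step interface, invented for illustration:

```python
# Logging sketch: record the sum of rewards per episode, not the Q-network loss.

class ToyEnv:
    """A made-up 3-step environment that pays 1.0 reward per step."""
    def reset(self):
        self.t = 0
        return self.t
    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= 3   # next state, reward, done

def run_episode(env, policy):
    state = env.reset()
    total_reward, done = 0.0, False
    while not done:
        state, reward, done = env.step(policy(state))
        total_reward += reward
    return total_reward

def train(env, policy, episodes=5):
    history = []
    for ep in range(episodes):
        total = run_episode(env, policy)
        history.append(total)              # the useful training signal
        print(f"episode {ep}: reward sum = {total}")
    return history

history = train(ToyEnv(), policy=lambda state: 0)
```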

Q-* Related Terms

There are several related terms with the Q prefix in reinforcement learning, all based on the Bellman equation: Q-learning, Q-function, Q-table, Q-network, Q-value.
Definitions:
• Q-learning is the learning method utilising the Bellman equation.
• Q-function is used by Q-learning and gives a Q-value.
• Q-table and Q-network are the two common kinds of Q-function.
• Q-network (Deep Q-Network) is the practical replacement for the Q-table when the state space is too large for a table.
• Q-value is the result returned by a Q-function (Q-table or Q-network).
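The Q-table/Q-network distinction can be sketched as two interchangeable Q-functions. The tiny "network" below is a hypothetical linear stand-in with hand-made weights, not a real DQN:

```python
# Two kinds of Q-function, both mapping (state, action) to a Q-value.

# A Q-table: exact lookup, feasible only for small state/action spaces.
q_table = {("s0", "a0"): 0.5, ("s0", "a1"): 1.2}

def q_table_value(state, action):
    return q_table.get((state, action), 0.0)  # unseen pairs default to 0

# A "Q-network": a parametric function approximating the same mapping.
# Here a toy linear model over hand-made features (illustrative only).
weights = [0.3, -0.1]

def q_network_value(features):
    return sum(w * x for w, x in zip(weights, features))
```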

 