Q-network

The Network

A Q-network is a neural network that approximates the q-function: given a state, it returns q-values (one per action) rather than directly returning the action to take. Another name for a Q-network is DQN (Deep Q-Network), but it's still just a Q-network; "deep" simply means the network has multiple layers.
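A minimal sketch in Keras (the state size, action count, and layer sizes are illustrative assumptions, not values from the text):

```python
import numpy as np
from tensorflow import keras

# Illustrative sizes (assumptions): a 4-dimensional state, 2 possible actions.
STATE_SIZE = 4
N_ACTIONS = 2

# A small Q-network: state in, one q-value per action out.
q_network = keras.Sequential([
    keras.layers.Input(shape=(STATE_SIZE,)),
    keras.layers.Dense(24, activation="relu"),
    keras.layers.Dense(24, activation="relu"),
    keras.layers.Dense(N_ACTIONS, activation="linear"),  # raw q-values, no softmax
])
q_network.compile(optimizer="adam", loss="mse")

# The network returns q-values; the action is derived by taking the argmax.
state = np.zeros((1, STATE_SIZE), dtype=np.float32)
q_values = q_network.predict(state, verbose=0)  # shape: (1, N_ACTIONS)
action = int(np.argmax(q_values[0]))
```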

Q-learning on a Q-network

Based on the same q-value update formula as with a q-table:
Q(s, a) ← Q(s, a) + α · [r + γ · max_{a'} Q(s', a') − Q(s, a)]
For each update:
• Feed the state to the current Q-network to get the current q-values.
• Compute the target q-value with the formula above, then train the Q-network toward that target (see the sketch below).
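A sketch of one update step, reusing q_network and numpy from the sketch above. Note that when the update is done by gradient descent, the learning rate α effectively lives in the optimizer, so the fit target is simply r + γ · max Q(s', a'):

```python
GAMMA = 0.95  # discount factor γ (illustrative value); α is the optimizer's learning rate

def q_update(q_network, state, action, reward, next_state, done):
    """One Q-learning update on the network (a sketch, not a full DQN)."""
    state = np.reshape(state, (1, -1))
    next_state = np.reshape(next_state, (1, -1))

    # 1. Feed the state to the current Q-network to get the current q-values.
    q_values = q_network.predict(state, verbose=0)

    # 2. Apply the update formula to get the target q-value for the taken action.
    if done:
        target = reward
    else:
        target = reward + GAMMA * np.max(q_network.predict(next_state, verbose=0)[0])
    q_values[0][action] = target

    # 3. Train (fit) the Q-network toward the new q-value.
    q_network.fit(state, q_values, epochs=1, verbose=0)
```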

When to Train

With a q-table, the q-value is updated after every action. A Q-network takes much, much less RAM than a q-table, but a forward pass to get its output is slow compared to the constant-time table lookup, and a fit (training) call is extremely slow compared to simply setting a q-value in the table.
There are two options for when to train the Q-network:
• Train after every action
• Train after a run through the whole episode
Training after every action, as with a q-table, is usually not a good choice: it is very slow unless you have ample hardware. Training once after a whole episode run is better and faster, since the network can be fit on all of the episode's transitions in one batch (see the sketch below).
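A sketch of episode-level training, assuming a classic Gym-style env where reset() returns the state and step() returns (state, reward, done, info); the epsilon-greedy exploration is an assumption added for completeness, and GAMMA and q_network come from the sketches above:

```python
def run_episode(env, q_network, epsilon=0.1):
    """Play one full episode, then train once on all collected transitions."""
    transitions = []
    state, done = env.reset(), False
    while not done:
        # Epsilon-greedy: mostly exploit the network's q-values, sometimes explore.
        if np.random.rand() < epsilon:
            action = env.action_space.sample()
        else:
            q = q_network.predict(np.reshape(state, (1, -1)), verbose=0)
            action = int(np.argmax(q[0]))
        next_state, reward, done, _ = env.step(action)
        transitions.append((state, action, reward, next_state, done))
        state = next_state

    # One batched fit per episode instead of one slow fit per action.
    states = np.array([t[0] for t in transitions])
    next_states = np.array([t[3] for t in transitions])
    q_values = q_network.predict(states, verbose=0)
    next_q = q_network.predict(next_states, verbose=0)
    for i, (_, action, reward, _, step_done) in enumerate(transitions):
        target = reward if step_done else reward + GAMMA * np.max(next_q[i])
        q_values[i][action] = target
    q_network.fit(states, q_values, epochs=1, verbose=0)
```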