Q-network

A Q-network plays the same role as a Q-function: it returns Q-values, not the action to take. The action is then chosen from those Q-values, for example by picking the action with the highest one.

Q-learning on Q-network

Q-learning with a Q-network uses the same Q-value update formula as with a Q-table:
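
For reference, the tabular Q-learning update is usually written as below. The notation is an assumption, since this section does not define it: s and a are the current state and action, r is the reward, s' is the next state, α is the learning rate, and γ is the discount factor.

```latex
Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]
```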


For each update:

Feed the state to the current Q-network to get the current Q-value.

Compute the new Q-value with the update formula, then train the Q-network toward that new Q-value.
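
The two steps above can be sketched in plain Python as follows. This is a minimal illustration, not the network these notes have in mind: the one-layer LinearQNetwork class, the q_learning_update function, and the parameters alpha, gamma, and lr are all hypothetical names chosen for the example, and the network is assumed to take a state vector and return one Q-value per action.

```python
import random


class LinearQNetwork:
    """Hypothetical tiny Q-network: one linear layer, one Q-value per action."""

    def __init__(self, n_features, n_actions, seed=0):
        rng = random.Random(seed)
        # Small random weights, one row per action.
        self.w = [[rng.uniform(-0.1, 0.1) for _ in range(n_features)]
                  for _ in range(n_actions)]
        self.b = [0.0] * n_actions

    def predict(self, state):
        """Return the list of Q-values, one per action, for this state."""
        return [sum(wi * x for wi, x in zip(row, state)) + b
                for row, b in zip(self.w, self.b)]

    def train_step(self, state, action, target, lr):
        """One gradient step on 0.5 * (Q(s, a) - target)^2 for the chosen action."""
        error = self.predict(state)[action] - target
        for i, x in enumerate(state):
            self.w[action][i] -= lr * error * x
        self.b[action] -= lr * error


def q_learning_update(net, state, action, reward, next_state, done,
                      alpha=0.5, gamma=0.9, lr=0.1):
    # Step 1: feed the state to the current Q-network to get the current Q-value.
    q_current = net.predict(state)[action]

    # Apply the same update formula as in the Q-table case (assumed notation).
    best_next = 0.0 if done else max(net.predict(next_state))
    q_new = q_current + alpha * (reward + gamma * best_next - q_current)

    # Step 2: train the Q-network toward the new Q-value.
    net.train_step(state, action, q_new, lr)
    return q_current, q_new


if __name__ == "__main__":
    net = LinearQNetwork(n_features=2, n_actions=2)
    before, target = q_learning_update(
        net, state=[1.0, 0.0], action=0, reward=1.0,
        next_state=[0.0, 1.0], done=True)
    after = net.predict([1.0, 0.0])[0]
    print(f"before={before:.3f} target={target:.3f} after={after:.3f}")
```

A single training step only moves the prediction part of the way toward the new Q-value, which is the main difference from a Q-table, where the entry is overwritten directly. In this sketch only the output for the chosen action is trained; the other actions' parameters are left untouched.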

