Notes - MIECT
Redes E Sistemas Autónomos

Reinforcement Learning

Reinforcement learning is a machine learning technique that enables an agent to learn in an interactive environment by trial and error, using feedback from its own actions and experiences.

  1. Environment - The physical world in which the agent operates.

  2. State - The current situation of the agent.

  3. Reward - Feedback from the environment.

  4. Policy - The method that maps the agent's state to actions.

  5. Value - The expected future reward the agent would receive by taking an action in a particular state.

Q-learning

Q-learning updates the Q-values, which denote the value of performing action a in state s. The following value update rule is the core of the Q-learning algorithm.
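
In its standard form:

$$
Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]
$$

where α is the learning rate, γ the discount factor, r the reward obtained after taking action a in state s, and s' the resulting next state.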

Reward example (best path with available resources): reward = path bandwidth / path length.

The learning rate (α) and the discount factor (γ) both take values in the open interval ]0, 1[.
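
A minimal Python sketch of Q-learning driven by this reward (bandwidth / length). The single-state setting, the three candidate paths and their bandwidth and length figures are hypothetical, chosen only to illustrate the update rule:

```python
import random

# Hypothetical candidate paths (not from the notes): bandwidth in Mbps, length in hops.
paths = {
    0: {"bandwidth": 100.0, "length": 4},
    1: {"bandwidth": 54.0,  "length": 2},
    2: {"bandwidth": 300.0, "length": 10},
}

alpha = 0.5    # learning rate, in ]0, 1[
gamma = 0.9    # discount factor, in ]0, 1[
epsilon = 0.2  # exploration probability (epsilon-greedy policy)

# Single-state problem: Q maps each action (path choice) to its estimated value.
Q = {a: 0.0 for a in paths}

def reward(action):
    # Reward as in the notes: path bandwidth / path length.
    p = paths[action]
    return p["bandwidth"] / p["length"]

for episode in range(1000):
    # Policy: epsilon-greedy over the current Q-values.
    if random.random() < epsilon:
        a = random.choice(list(paths))
    else:
        a = max(Q, key=Q.get)

    r = reward(a)

    # Q-learning update; with a single state, max_a' Q(s', a') is max(Q.values()).
    Q[a] += alpha * (r + gamma * max(Q.values()) - Q[a])

best = max(Q, key=Q.get)
print("Q values:", {a: round(v, 2) for a, v in Q.items()})
print("Best path:", best)
```

With these made-up figures, the agent converges to the path with the highest bandwidth-to-length ratio (path 2).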