Reinforcement Learning

Reinforcement learning is a type of machine learning in which an agent learns in an interactive environment by trial and error, using feedback from its own actions and experiences.

  1. Environment - Physical world in which the agent operates.

  2. State - Current situation of the agent.

  3. Reward - Feedback from the environment.

  4. Policy - Method that maps the agent's state to actions.

  5. Value - Future reward that an agent would receive by taking an action in a particular state.
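To make the terms above concrete, here is a minimal sketch of an agent interacting with an environment. The corridor world, the names `step`, `random_policy`, and `run_episode`, and the reward of 1.0 at the goal are all assumptions made for illustration, not something taken from these notes.

```python
import random

# Hypothetical 1-D corridor: states 0..4, with the goal at state 4.
N_STATES = 5
ACTIONS = (-1, +1)  # step left, step right


def step(state, action):
    """Environment: apply the action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0  # feedback from the environment
    return next_state, reward, done


def random_policy(state, rng):
    """Policy: map a state to an action (here, uniformly at random)."""
    return rng.choice(ACTIONS)


def run_episode(policy, rng, max_steps=100):
    """Run one trial-and-error episode and return the total reward collected."""
    state, total = 0, 0.0
    for _ in range(max_steps):
        action = policy(state, rng)
        state, reward, done = step(state, action)
        total += reward
        if done:
            break
    return total


rng = random.Random(0)
total_reward = run_episode(random_policy, rng)
```

A learning agent would replace `random_policy` with one that prefers actions with a higher estimated value.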

Q-learning

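Q-learning updates Q-values, written Q(s, a), which denote the value of performing action a in state s. The following value update rule is the core of the Q-learning algorithm:

$$Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]$$

Here $\alpha$ is the learning rate, $\gamma$ is the discount factor, $r$ is the reward received after taking action $a$ in state $s$, and $s'$ is the resulting next state.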
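A minimal tabular sketch of one Q-value update, assuming the standard Q-learning form with a learning rate `alpha` and a discount factor `gamma`. The function name `q_update`, the dictionary-based table, and the example states and actions are hypothetical. Terminal states are not treated specially here.

```python
from collections import defaultdict


def q_update(Q, s, a, r, s_next, actions, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update to Q[(s, a)] in place; return the new value."""
    # Best value reachable from the next state under the current estimates.
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    # Target: immediate reward plus discounted best future value.
    td_target = r + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    Q[(s, a)] += alpha * (td_target - Q[(s, a)])
    return Q[(s, a)]


# Q-table with every unseen (state, action) pair defaulting to 0.0.
Q = defaultdict(float)
actions = ("left", "right")

# One observed transition: in state 0, action "right" gave reward 1.0 and led to state 1.
q_update(Q, s=0, a="right", r=1.0, s_next=1, actions=actions)
```

Repeating the same transition moves the estimate progressively closer to the target, with `alpha` controlling how large each step is.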

Reward example: to find the best path given its resources, use reward = path bandwidth / path length.
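The reward example above can be sketched as a small function. The name `path_reward`, the units (for instance Mbit/s for bandwidth and hop count for length), and the sample values are assumptions made for illustration.

```python
def path_reward(bandwidth, length):
    """Reward = path bandwidth / path length; higher bandwidth and shorter paths score higher."""
    if length <= 0:
        raise ValueError("path length must be positive")
    return bandwidth / length


# Two hypothetical paths with the same bandwidth but different lengths.
short_path = path_reward(100.0, 2)
long_path = path_reward(100.0, 4)
```

With equal bandwidth, the shorter path earns the larger reward, so the agent is pushed toward short, high-capacity routes.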

The learning rate and the discount factor each take a value in the open interval ]0, 1[.
