Reinforcement Learning
A type of machine learning in which an agent learns in an interactive environment by trial and error, using feedback from its own actions and experiences.
Environment - World (physical or simulated) in which the agent operates.
State - Current situation of the agent.
Reward - Feedback from the environment.
Policy - Method to map agent's state to actions.
Value - Expected cumulative future reward that an agent would receive by taking an action in a particular state.
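The terms above can be tied together in a minimal sketch of the agent-environment loop. Everything named here (the `Corridor` environment, `random_policy`, `run_episode`) is hypothetical and chosen only for illustration. Value is not computed in this sketch; the loop only shows how state, action, reward and policy interact.

```python
import random


class Corridor:
    """Hypothetical environment: positions 0..4; reaching position 4 ends the episode."""

    def __init__(self):
        self.state = 0  # State: the agent's current position

    def step(self, action):
        # action is -1 (left) or +1 (right); the position is clamped to 0..4
        self.state = min(4, max(0, self.state + action))
        done = self.state == 4
        reward = 1.0 if done else 0.0  # Reward: feedback from the environment
        return self.state, reward, done


def random_policy(state, rng):
    """Policy: maps a state to an action (this one ignores the state and picks at random)."""
    return rng.choice([-1, 1])


def run_episode(env, policy, rng, max_steps=1000):
    """Let the agent act until the episode ends or max_steps is reached."""
    total_reward = 0.0
    state = env.state
    for _ in range(max_steps):
        action = policy(state, rng)
        state, reward, done = env.step(action)
        total_reward += reward
        if done:
            break
    return state, total_reward


rng = random.Random(0)
final_state, total = run_episode(Corridor(), random_policy, rng)
```

A learning agent would replace `random_policy` with a policy that improves as rewards are observed.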
Q-learning
Updates Q values, which denote the value of performing action a in state s. The following value update rule is the core of the Q-learning algorithm.
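In its standard textbook form, the rule can be written as below. The symbols are assumptions of this sketch: α is the learning rate, γ the discount factor, r the reward received after taking action a in state s, and s' the resulting next state.

```latex
\[
Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]
\]
```

The bracketed term is the difference between the new estimate, r + γ max Q(s', a'), and the current one, so each update moves Q(s, a) a fraction α of the way toward the new estimate.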
Reward example: for finding the best path with respect to resources, reward = path bandwidth / path length.
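As a small illustration of this reward, the ratio can be computed as follows. The function name `path_reward` and the sample numbers are hypothetical, and no particular units are assumed.

```python
def path_reward(bandwidth: float, length: float) -> float:
    """Reward for a candidate path: path bandwidth divided by path length."""
    if length <= 0:
        raise ValueError("path length must be positive")
    return bandwidth / length


# A wide, short path earns a higher reward than a narrow, long one.
wide_short = path_reward(bandwidth=100.0, length=4)   # 25.0
narrow_long = path_reward(bandwidth=40.0, length=8)   # 5.0
```

With this ratio, higher bandwidth raises the reward and longer paths lower it, so the agent is pushed toward short, high-capacity paths.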
Learning rate and discount factor: both in the open interval ]0, 1[ (strictly between 0 and 1).
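A minimal tabular sketch of one Q-learning update shows where the two parameters enter. It assumes a dictionary-based Q table keyed by (state, action). The names `q_update`, `alpha` (learning rate) and `gamma` (discount factor), the default values, and the state and action labels are all illustrative assumptions.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """Apply one Q-learning update to the table q in place and return the new value."""
    if not (0.0 < alpha < 1.0 and 0.0 < gamma < 1.0):
        raise ValueError("alpha and gamma must lie strictly between 0 and 1")
    # Best value obtainable from the next state under the current estimates
    best_next = max(q[(next_state, a)] for a in actions)
    # Target: immediate reward plus discounted best future value
    target = reward + gamma * best_next
    # Move the current estimate a fraction alpha of the way toward the target
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


q = defaultdict(float)            # unseen (state, action) pairs start at 0.0
actions = ["left", "right"]
q[("s1", "right")] = 2.0          # pretend this value was learned earlier
new_value = q_update(q, "s0", "right", reward=1.0, next_state="s1", actions=actions)
# new_value is approximately 0.5 * (1.0 + 0.9 * 2.0) = 1.4
```

A learning rate close to 1 makes each update overwrite most of the old estimate, while a discount factor close to 1 gives distant rewards nearly as much weight as immediate ones.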