REINFORCEMENT LEARNING

What is reinforcement learning?

Reinforcement learning is a machine learning training method based on rewarding desired behaviors and/or punishing undesired ones. In general, a reinforcement learning agent is able to perceive and interpret its environment, take actions, and learn through trial and error.

How does reinforcement learning work?

In reinforcement learning, developers devise a method of rewarding desired behaviors and punishing undesired ones. This method assigns positive values to desired actions, to encourage the agent, and negative values to undesired actions. The agent is thereby driven to maximize its overall long-term reward rather than any single immediate reward, which steers it toward an optimal solution.

These long-term goals help prevent the agent from stalling on lesser goals. With time, the agent learns to avoid the negative and seek the positive. This learning method has been adopted in artificial intelligence (AI) as a way of directing machine learning without labeled examples, using rewards and penalties instead.
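One common way to make "long-term reward" concrete is a discounted return, in which later rewards count slightly less than immediate ones. The sketch below is only an illustration: the discount factor `gamma`, the function name, and the two reward sequences are assumptions introduced here, not part of the method described above.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum the rewards, weighting each by gamma ** (steps into the future)."""
    total = 0.0
    # Work backwards so each step folds in the discounted value of what follows.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical sequences: a small reward now versus a larger one later.
grab_now = [1, 0, 0, 0]
wait_for_more = [0, 0, 0, 5]

if __name__ == "__main__":
    print(discounted_return(grab_now))
    print(discounted_return(wait_for_more))
```

With these assumed numbers the delayed reward still scores higher (roughly 3.6 against 1.0), so an agent that maximizes this quantity has a reason not to stall on the smaller, earlier goal.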


Applications and examples of reinforcement learning:

While reinforcement learning has been a topic of much interest in the field of AI, its widespread, real-world adoption and application remain limited. Even so, research papers abound on theoretical applications, and there have been some successful use cases.

Current use cases include, but are not limited to, the following:

  • Gaming
  • Resource Management
  • Personalized Recommendations
  • Robotics

Gaming is likely the most common field of use for reinforcement learning. It is capable of achieving superhuman performance in numerous games. A common example involves the game Pac-Man.

A learning algorithm playing Pac-Man might have the ability to move in one of four possible directions, barring obstruction. From pixel data, an agent might be given a numeric reward for the result of each unit of travel: 0 for empty space, 1 for a pellet, 2 for fruit, 3 for a power pellet, 4 for eating a ghost after a power pellet, 5 for collecting all pellets and completing a level, and a 5-point deduction for colliding with a ghost. The agent starts from randomized play and moves to more sophisticated play, learning the goal of getting all pellets to complete the level. Given time, an agent might even learn tactics like conserving power pellets until they are needed for self-defense.
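The reward scheme in this example can be written down as a simple lookup table. This is a minimal sketch: the numeric values come from the paragraph above, while the event names and the `episode_score` helper are hypothetical labels chosen for illustration.

```python
# Reward per event, using the values from the Pac-Man example.
# The event names are illustrative assumptions, not from any real game API.
REWARDS = {
    "empty": 0,
    "pellet": 1,
    "fruit": 2,
    "power_pellet": 3,
    "ghost_eaten": 4,       # ghost caught after a power pellet
    "level_complete": 5,
    "ghost_collision": -5,  # the 5-point deduction
}


def episode_score(events):
    """Total reward collected over a sequence of events."""
    return sum(REWARDS[event] for event in events)


if __name__ == "__main__":
    run = ["pellet", "pellet", "power_pellet", "ghost_eaten", "empty"]
    print(episode_score(run))
```

The agent never sees this table directly. It only observes the number it receives after each move and has to work out for itself which behavior earns the most over a whole level.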

Reinforcement learning can operate in any situation in which a clear reward can be defined. In enterprise resource management (ERM), reinforcement learning algorithms can allocate limited resources to different tasks as long as there is an overall goal they are trying to achieve. A goal in this circumstance would be to save time or conserve resources.

In robotics, reinforcement learning has found its way into limited tests. This type of machine learning can provide robots with the ability to learn tasks a human teacher cannot demonstrate, to adapt a learned skill to a new task, or to achieve optimization even when no analytic formulation is available.

Reinforcement learning is also studied in operations research, information theory, game theory, control theory, simulation-based optimization, multiagent systems, swarm intelligence, statistics and genetic algorithms.

Challenges of applying reinforcement learning:

Reinforcement learning, while high in potential, can be difficult to deploy and remains limited in its application. One of the barriers to deploying this type of machine learning is its reliance on exploration of the environment.

For example, if you were to deploy a robot that relied on reinforcement learning to navigate a complex physical environment, it would seek new states and take different actions as it moved. It is difficult to consistently take the best actions in a real-world environment, however, because of how frequently the environment changes.
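A widely used way to balance trying new actions against repeating known good ones is an epsilon-greedy rule. The sketch below is an illustration under assumptions: the function name, the `epsilon` parameter, and the list of action-value estimates are introduced here and do not come from the text above.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the best-known one.

    q_values: estimated value of each action in the current state (assumed given).
    """
    if rng.random() < epsilon:
        # Explore: try any action, including ones that currently look poor.
        return rng.randrange(len(q_values))
    # Exploit: choose the action with the highest current estimate.
    return max(range(len(q_values)), key=lambda action: q_values[action])


if __name__ == "__main__":
    estimates = [0.1, 0.9, 0.3]  # hypothetical value estimates for three actions
    print(epsilon_greedy(estimates, epsilon=0.1))
```

A higher `epsilon` means more exploration, which helps when the environment keeps changing but also means the robot spends more of its time on actions it already believes are worse.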

The time required to ensure the learning is done properly can limit the method's usefulness and be intensive on computing resources. As the training environment grows more complex, so too do the demands on time and compute resources. Supervised learning can deliver faster, more efficient results than reinforcement learning if enough labeled data is available, as it can be employed with fewer resources.

Common reinforcement learning algorithms:

Rather than referring to one specific algorithm, the field of reinforcement learning is made up of several algorithms that take somewhat different approaches. The differences are mainly due to their strategies for exploring their environments and for learning from what they find.

  • State-action-reward-state-action (SARSA): This reinforcement learning algorithm is on-policy. The agent follows a policy, essentially a rule that gives the probability of choosing each action in a given state, and it updates its value estimates using the action that this same policy actually takes next.
  • Q-learning: This approach is off-policy. The agent updates its estimates toward the best action available in the next state, regardless of which action its exploration actually takes, so what it learns is less tied to the policy it follows while exploring.
  • Deep Q-Networks: These algorithms use neural networks in addition to reinforcement learning techniques. They keep the self-directed environment exploration of Q-learning, with a neural network estimating the value of each action. The network is trained on random samples of the agent's past experience.
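The difference between the first two algorithms comes down to one line in their update rules. The following is a minimal tabular sketch, not a full training loop: the learning rate `ALPHA`, the discount factor `GAMMA`, the table layout, and all function names are assumptions made for illustration.

```python
from collections import defaultdict

ALPHA = 0.5  # learning rate (assumed value)
GAMMA = 0.9  # discount factor (assumed value)


def new_q_table(n_actions):
    """Map each state to a list of action-value estimates, all starting at 0."""
    return defaultdict(lambda: [0.0] * n_actions)


def sarsa_update(q, state, action, reward, next_state, next_action):
    """On-policy: the target uses the action the agent actually takes next."""
    target = reward + GAMMA * q[next_state][next_action]
    q[state][action] += ALPHA * (target - q[state][action])


def q_learning_update(q, state, action, reward, next_state):
    """Off-policy: the target uses the best action available in the next state."""
    target = reward + GAMMA * max(q[next_state])
    q[state][action] += ALPHA * (target - q[state][action])


if __name__ == "__main__":
    q_sarsa, q_learn = new_q_table(2), new_q_table(2)
    for table in (q_sarsa, q_learn):
        table["s1"] = [1.0, 3.0]  # hypothetical estimates for the next state
    sarsa_update(q_sarsa, "s0", 0, 1.0, "s1", next_action=0)
    q_learning_update(q_learn, "s0", 0, 1.0, "s1")
    print(q_sarsa["s0"][0], q_learn["s0"][0])
```

Given the same experience, the two rules diverge whenever the action actually taken next is not the one with the highest estimate. A Deep Q-Network replaces the table with a neural network but keeps the Q-learning style target.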

How is reinforcement learning different from supervised and unsupervised learning?

Reinforcement learning is considered its own branch of machine learning, though it does have some similarities to the other types. Machine learning is commonly broken down into the following four domains:

  1. Supervised learning: In supervised learning, algorithms train on a body of labeled data. Supervised learning algorithms can only learn attributes that are specified in the data set. Image recognition models are a common application: they receive a set of labeled images and learn to distinguish common attributes of predefined forms.
  2. Unsupervised learning: In unsupervised learning, developers turn algorithms loose on fully unlabeled data. The algorithm learns by cataloging its own observations about data features without being told what to look for.
  3. Semisupervised learning: This method takes a middle-ground approach. Developers enter a relatively small set of labeled training data, as well as a larger corpus of unlabeled data. The algorithm is then instructed to extrapolate what it learns from the labeled data to the unlabeled data and draw conclusions from the set as a whole.
  4. Reinforcement learning: This takes a different approach altogether. It situates an agent in an environment with clear parameters defining beneficial and nonbeneficial activity and an overarching endgame to reach. It is similar in some ways to supervised learning in that developers must give algorithms clearly specified goals and define rewards and punishments. This means the level of explicit programming required is greater than in unsupervised learning. But once these parameters are set, the algorithm operates on its own, making it much more self-directed than supervised learning algorithms. For this reason, people sometimes refer to reinforcement learning as a branch of semisupervised learning, but in truth, it is most often acknowledged as its own type of machine learning.

Written By - Ritesh Pandita ©
