Operant conditioning as a Markovian decision problem: toward a dynamic model of asymptotic performance under random ratio schedules of reinforcement

J. Jozefowiez
Université de Lille, Ch. de Gaulle, France

Reinforcement learning is one of the most active research areas in contemporary artificial intelligence. It deals with the design of algorithms that allow computer agents to learn how to maximize the goods they collect while interacting with an unknown environment. Recently, neural network models of operant conditioning using these algorithms have been proposed, but most of these models do not exploit the framework of Markovian decision problems, which is central to recent developments in reinforcement learning. In this communication, we argue that Markovian decision problems and reinforcement learning can be used as a formal framework for the analysis of operant learning in animals. We provide an example of how they can be used to derive a model of asymptotic performance under random ratio schedules of reinforcement. Our goal is to achieve a dynamic model of random ratio performance, i.e. one that allows us to understand not only the molar properties of behavior under random ratio schedules but also its molecular properties, by explaining how response rate changes over time.

Keywords: