since 2002-04-10
Aimed at hierarchy formation in multi-agent systems (MAS)
Littman, M. L., Markov games as a framework for multi-agent
reinforcement learning, in W. W. Cohen & H. Hirsh, eds, `Proceedings of the
Eleventh International Conference on Machine Learning (ML-94)', Morgan Kaufmann
Publishers, Inc., New Brunswick, NJ, pp. 157--163.
http://citeseer.nj.nec.com/littman94markov.html
Littman, M. L., Value-function
reinforcement learning in Markov games, in
Journal of Cognitive Systems Research 2 (2001) 55-66.
People
Michael L. Littman, Associate Research Professor, Department of Computer Science, Rutgers University
Some unsorted "raw" papers:
Multiagent Reinforcement
Learning: Theoretical Framework and an .. - Hu, Wellman (1998)
(Correct) (26 citations)
Reinforcement
Learning with Perceptual Aliasing: The Perceptual.. - Chrisman (1992)
(Correct) (84 citations)
...1991 ] , [ Millán and Torras, 1991 ] , [ Chapman and
Kaelbling, 1991 ] ). The objective for a reinforcement learning agent is
to acquire a policy for choosing actions so as to maximize overall performance.
After each...
Learning Policies with
External Memory - Peshkin, Meuleau, Kaelbling (1999)
(Correct) (13 citations)
...We compare the performance of these two
algorithms on benchmark problems. 1 Introduction A reinforcement-learning
agent must learn a mapping from a stream of observations of the world to a
stream of actions. In...
Learning to Use
Selective Attention and Short-Term Memory in.. - McCallum (1996)
(Correct) (31 citations)
...their environment through sensors and
effectors. It focuses on the following question: How can a reinforcement
learning agent successfully learn to interact with a complex environment
when the agent's perception of that...
Towards Collaborative and
Adversarial Learning: A Case Study.. - Stone, Veloso (1997)
(Correct) (15 citations)
...in the multiagent learning literature. One of
the earliest multiagent learning papers describes a reinforcement learning
agent which incorporates information that is gathered by another agent (Tan,
1993). It is considered...
Adaptive Load
Balancing: A Study in Multi-Agent Learning - Schaerf, Shoham, Tennenholtz
(1995)
(Correct) (21 citations)
Near-Optimal
Reinforcement Learning in Polynomial Time - Kearns, Singh (1998)
(Correct) (10 citations)
...the parameters of this process, but has to
learn how to act directly from experience. Thus, the reinforcement learning
agent faces a fundamental trade-off between exploitation and exploration
(Thrun, 1992 Sutton & Barto,...
Instance-Based State
Identification for Reinforcement Learning - Andrew Mccallum (1994)
(Correct) (18 citations)
...field of view and limited attention, the robot
suffers from hidden state. More formally, we say a reinforcement learning
agent suffers from the hidden state problem if the agent's state
representation is non-Markovian with...
Multiagent Reinforcement
Learning in Stochastic Games - Hu, Wellman (1999)
(Correct) (4 citations)
The Effect of
Representation and Knowledge on Goal-Directed.. - Sven Koenig (1996)
(Correct) (10 citations)
Hierarchical Multi
Agent Reinforcement Learning - Makar, Mahadevan (2001)
(Correct) (1 citation)
...330-337, Amherst, MA. [5] Crites, R.H. &
Barto, A.G. (1998) Elevator group control using multiple reinforcement
learning agents. Machine Learning 33 pp.235-262. [6] Hu, J. & Wellman,
M. (1998) Multiagent reinforcement...
Memory Approaches To
Reinforcement Learning In Non-Markovian.. - Long-Ji Lin (1992)
(Correct) (16 citations)
...algorithm and Q-learning to solve several
nontrivial learning problems. Consider a reinforcement learning agent
whose state representation is based on only its immediate sensation. When its
sensors are not...
Multi-Agent Reinforcement
Learning: Weighting and Partitioning - Sun, Peterson (1999)
(Correct) (3 citations)
... In relation to other more frequently studied
types of learning tasks, we note that the output of a reinforcement learning
agent, such as that of Q-learning as described above, can be interpreted in
two different ways: • As ...
A General Method For
Multi-Agent Reinforcement Learning In.. - Jürgen Schmidhuber (1996)
(Correct) (8 citations)
...deal with this. This paper, however, introduces
a novel, general, sound method for multiple, reinforcement learning
agents living a single life with limited computational resources in an
unrestricted environment. The...
Learning mutual trust
- Banerjee, Mukherjee, Sen (2000)
(Correct) (1 citation)
ADVISOR: A
machine learning architecture for intelligent.. - Beck, Woolf, Beal (2000)
(Correct) (1 citation)
...users of the tutor to train the machine
learning (ML) agent. This model is combined with a reinforcement learning
agent to produce a configurable teaching policy. Research goals and previous
work Intelligent...
Hidden State and Reinforcement
Learning with Instance-Based.. - Andrew Mccallum
(Correct) (12 citations)
...is missing information needed to determine the
next correct action. More formally, we say a reinforcement learning agent
suffers from the hidden state problem if the agent's state representation is
non-Markovian with...
Learning From
Instruction And Experience: Methods For.. - Richard Frank Maclin (1995)
(Correct) (4 citations)
...to communicate advice, using statements in a
simple programming language, to a connectionist, reinforcement-learning
agent. The teacher indicates conditions of the environment and actions the
agent should take under...
An Analysis of Direct Reinforcement
Learning in non-Markovian.. - Mark Pendrith (1998)
(Correct) (2 citations)
... the side-effects of state-space representation
can lead to the domain appearing as non-Markov to a reinforcement learning
agent. In this paper, we examine various issues arising from applying
standard RL algorithms to...
A Near-Optimal
Polynomial Time Algorithm for Learning in.. - Brafman, Tennenholtz (1999)
(Correct) (1 citation)
Partitioning
in Reinforcement Learning - Ron Sun (1999)
(Correct) (1 citation)
Large-Scale Dynamic
Optimization Using Teams of Reinforcement.. - Crites (1996)
(Correct) (3 citations)
...DYNAMIC OPTIMIZATION USING TEAMS OF
REINFORCEMENT LEARNING AGENTS A Dissertation Presented by ROBERT HARRY
CRITES Submitted to the Graduate School of the...
Realistic
Multi-Agent Reinforcement Learning - Jürgen Schmidhuber (1996)
(Correct) (2 citations)
Incremental
Self-Improvement For Life-Time.. - Jieyu Zhao, Jürgen.. (1996)
(Correct) (2 citations)
Reduced
Training Time for Reinforcement Learning with Hidden.. - Andrew Mccallum
(1994)
(Correct) (2 citations)
... order of magnitude fewer steps than several
previous approaches. 1 Background and Related Work A reinforcement learning
agent suffers from the hidden state problem if at any time the agent's state
representation is missing...
Model-based Learning of Interaction
Strategies in Multi-agent.. - David Carmel (1997)
(Correct) (1 citation)
...demonstrating the superiority of the
model-based learning agent over non-adaptive agents and over
reinforcement-learning agents. 1 Introduction The recent tremendous
growth of the Internet has motivated a significant...
Dyna, an Integrated
Architecture for Learning, Planning, and.. - Richard Sutton (1991)
(Correct) (2 citations)
...policy what to do in the current situation. The
first three steps together comprise a standard reinforcement learning
agent. Given enough experience, such an agent can learn the optimal reactive
mapping from situations to ...
Reinforcement
Learning with Interacting Continually Running .. - Jürgen Schmidhuber (1990)
(Correct) (1 citation)
...over all remaining time ticks still to come.
Pain corresponds to negative reinforcement. The reinforcement learning
agent faces a very general spatio-temporal credit assignment task: No
external teacher provides...
An On-Line Algorithm
for Dynamic Reinforcement Learning and .. - Jürgen Schmidhuber (1990)
(Correct) (1 citation)
...over all remaining time ticks still to come.
Pain corresponds to negative reinforcement. The reinforcement learning
agent faces a very general spatio-temporal credit assignment task: No
external teacher provides...
Multi-Agent Reinforcement Learning
with Bidding for Segmenting .. - Sun, Sessions
(Correct) (1 citation)
...long-range ones) create problems for
reinforcement learning. Through the use of a coalition of reinforcement
learning agents, our approach seeks out proper configurations of agents, with
the goal of reducing or eliminating...
Partitioning in Multi-Agent
Reinforcement Learning - Ron Sun Todd
(Correct) (1 citation)
Learning, Cooperation,
and Coordination in Multi-Agent Systems - Berenji, Vengerov
(Correct)
...analytical
and experimental results shedding some light on the reason why cooperation
between reinforcement learning agents can give better results than the
ones predicted by Whitehead (1991). We first considered a ...
Autonomous Agent
Navigation in Grid World: Using Java Exceptions.. - Gunay (2000)
(Correct)
... Keywords:
Agent architecture, artificial intelligence, software reuse, software exceptions,
reinforcement learning, agent communications, object-oriented software
design. 1 Introduction This paper presents an account...
Rational Learning of
Mixed Equilibria in Stochastic Games - Bowling, Veloso (2000)
(Correct)
Totally Model-Free Reinforcement
Learning by Actor-Critic.. - Eiji Mizutani And
(Correct)
...University
of California at Berkeley, USA Abstract In this paper we describe how an
actor-critic reinforcement learning agent in a non-Markovian domain finds
an optimal sequence of actions in a totally model-free fashion...
A
Reinforcement Learning Agent for Personalized Information.. - Seo, Zhang
(2000)
(Correct)
...A
Reinforcement Learning Agent for Personalized Information Filtering ...
An
Algorithm for Distributed Reinforcement Learning in.. - Lauer, Riedmiller
(2000)
(Correct)
...MA: MIT
Press Crites, R. H., & Barto, A. G. (1998). Elevator group control using
multiple reinforcement learning agents. Machine Learning, 33, 235-262.
Hu, J., & Wellman, M. P. (1998). Multiagent reinforcement...
Multi-agent Reinforcement
Learning for Planning and.. - Sachiyo Arai Katia
(Correct)
Multi-Agent Reinforcement
Learning: An Approach Based on.. - Yasuo Nagayuki Shin
(Correct)
Cognitive Learning for Practical
Solution of the Frame Problem - Atsushi Ueno Hideaki
(Correct)
Reinforcement Learning
with Bidding for Automatic Segmentation - Ron Sun Cecs (1999)
(Correct)
...long-range
ones) create problems for reinforcement learning. Through the use of a coalition
of reinforcement learning agents, our approach seeks out proper configurations
of agents, with the goal of reducing or eliminating...
Rationality of
Reward Sharing in Multi-agent Reinforcement.. - Kazuteru Maza Eu
(Correct)
Chapter 6 Transferring Advice into
a Connectionist.. - Once The Teacher
(Correct)
...Chapter 6
Transferring Advice into a Connectionist Reinforcement-Learning Agent
Once the teacher specifies advice for the reinforcement-learning agent,
ratle transfers the...
Learning and control in a chaotic
system - Randløv, Barto (1999)
(Correct)
...with one or
several of these partial solutions. In this paper we consider the problem of
making a reinforcement learning agent cooperate with a hand-crafted local
controller and a global chaotic controller, and designing a...
An Analysis of non-Markov Automata
Games: Implications for.. - Mark Pendrith (1997)
(Correct)
... the
side-effects of state-space representation can lead to the domain appearing as
non-Markov to a reinforcement learning agent. In this paper, we examine
various issues arising from applying standard RL algorithms to...
A
Bibliography of Work Related to Reinforcement Learning - Kaelbling, Littman
(1994)
(Correct)
Co-Learning in Differential Games
- John W. Sheppard
(Correct)
Reinforcement Learning, Neural
Networks and PI Control.. - Charles Anderson
(Correct)
...trained to
minimize the n-step ahead error between the coil output and the set point, and a
reinforcement learning agent trained to minimize the sum of the squared
error over time. Although the PI controller works very ...
Learning Visual Routines with
Reinforcement Learning - Andrew Kachites
(Correct)
...which those
instances are organized. The leaves of the tree represent the internal states of
the reinforcement learning agent. That is, the agent's utility estimates
(Q-values) are stored in the leaves. When the agent...
Multi-Agent Reinforcement Learning
with Vicarious Rewards - Kevin Irwig
(Correct)
Synthesis of Reinforcement
Learning, Neural Networks, and PI .. - Charles Anderson
(Correct)
...trained to
minimize the n-step ahead error between the coil output and the set point, and a
reinforcement learning agent trained to minimize the sum of the squared
error over time. Although the PI controller works very ...
Multi-Agent Learning With The
Success-Story Algorithm - Jürgen Schmidhuber, Jieyu Zhao
(Correct)
Automatic Partitioning for
Multi-Agent Reinforcement Learning - Ron Sun
(Correct)
Markov games as a
framework for multi-agent reinforcement learning - Littman (1994)
(Correct) (87 citations)
...Markov games as a framework for
multi-agent reinforcement learning Michael L. Littman Brown University /...
/...can only be part of the environment and are therefore fixed in their
behavior. The framework of Markov games allows us to widen this view to
include multiple adaptive agents with interacting or competing...
Multiagent Reinforcement
Learning: Theoretical Framework and an .. - Hu, Wellman (1998)
(Correct) (26 citations)
...for multiagent reinforcement learning. The
framework we adopt is stochastic games (also called Markov games) [4,
15], which are the generalization of the Markov decision processes to the case
of two or... /... Journal of Artificial Intelligence Research, 4:237-285, May
1996. [6] Michael L. Littman. Markov games as a framework for multi-agent
reinforcement learning. In Proceedings of the Eleventh...
Learning to Cooperate
via Policy Search - Peshkin, Meuleau, Kaelbling (2000)
(Correct) (7 citations)
... and analyzes a Q-learning-like algorithm for
finding optimal policies in the framework of zero-sum Markov games, in
which two players have strictly opposite interests. Hu and Wellman [7] propose a
different... /...reactive policies for each agent. However, 1 IPSG's are also
called stochastic games [7], Markov games [9] and multi-agent Markov
decision processes [5]. 2 P(#denotes the set of probability...
Cooperative Behavior
Acquisition for Mobile Robots in.. - Asada, Uchibe, Hosoda (1999)
(Correct) (4 citations)
...in advance in order to learn successful
behaviors. Littman [15] proposed the framework of Markov Games in which
learning robots try to learn a mixed strategy optimal against the worst
possible... /...Simulation of Adaptive Behavior: From Animals to Animats, pp.
271-280, 1992. [15] M. L. Littman. Markov games as a framework for
multi-agent reinforcement learning. In Proc. of the 11th International...
Hierarchical
Multi-Agent Reinforcement Learning - Makar, Mahadevan, al. (2001)
(Correct) (1 citation)
...models. Littman [8], and Hu and Wellman [5],
among others, have studied the framework of Markov games for competitive
multi-agent learning. Here, we are primarily interested in the cooperative
case.... /...rules for multiple-vehicle AGV systems. SIMULATION, 66(2):121-130,
1996. [8] M. Littman. Markov games as a framework for multi-agent
reinforcement learning. In Proceedings of the Eleventh...
Hierarchical Multi Agent
Reinforcement Learning - Makar, Mahadevan (2001)
(Correct) (1 citation)
...to tackle this problem. Littman [3], and Hu and
Wellman [6] have studied the framework of Markov games for multi
agent learning. Tan [4] studied the extension of flat reinforcement learning to
the... /...Learning in the Multi-Robot Domain. Autonomous Robots, 4(1), 73-83.
[3] Michael Littman. (1994) Markov games as a framework for multi-agent
reinforcement learning. Proceedings of the Eleventh International...
Flow Control
Using The Theory Of Zero Sum Markov Games - Altman (1994)
(Correct) (11 citations)
...CONTROL USING THE THEORY OF ZERO SUM MARKOV
GAMES Eitan ALTMAN INRIA Centre Sophia Antipolis 06565 Valbonne Cedex,
France... /...that depends on the quality of the service. The problem is studied
in the framework of zero-sum Markov games, and a value iteration
algorithm is used to solve it. We show that there exists an optimal...
Foresight-based
pricing algorithms in an economy of software agents - Tesauro (1998)
(Correct) (4 citations)
...(DP)-style algorithms, that have recently been
extended to the domain of two-player zero-sum Markov games (Littman,
1994). 1 Introduction In prior work (Kephart et al., 1998 Sairamesh and
Kephart,... /... single-agent MDPs. Recently there has been some work
generalizing DP-type algorithms to two-player Markov games. For example,
(Littman, 1994) introduced an algorithm called minimax-Q for two-player
zero-sum...
Cooperative Behavior
Acquisition in Multi Mobile Robots.. - Uchibe, Asada, Hosoda (1998)
(Correct) (4 citations)
...movements in advance to learn the behaviors
successfully. Littman [5] proposed a framework of Markov Games in which
learning robots try to learn a mixed strategy optimal against the worst
possible... /... Simulation of Adaptive Behavior: From Animals to Animats 2.,
pp. 271-280, 1992. [5] M. L. Littman. Markov games as a framework for
multi-agent reinforcement learning. In Proc. of the 1 2 3 4 passer shooter...
Mutually Supervised
Learning in Multiagent Systems - Claudia Goldman (1995)
(Correct) (6 citations)
...learn more quickly than agents that do not
cooperate [ Tan, 1993 ] . Littman proposed the use of Markov games as a
framework for multiagent systems [ Littman, 1994 ] . He focused on two-player
games, where... /... technical report CMU-CS-93-165, Carnegie Mellon University,
1993. [ Littman, 1994 ] M. L. Littman. Markov games as a framework for
multi-agent reinforcement learning. In Machine Learning 1994, pages...
A
Generalized Reinforcement-Learning Model: Convergence and .. - Littman,
Szepesvári (1996)
(Correct) (6 citations)
...convergence to synchronous convergence.
Keywords: Reinforcement learning, Q-learning convergence, Markov games 1
INTRODUCTION Reinforcement learning is the process by which an agent improves
its behavior... /...converges to the optimal Q function under the proper
conditions [33, 31, 10]. 2.2 ALTERNATING MARKOV GAMES In alternating
Markov games, two players take turns issuing actions to try to maximize
their own ... /... learning rule [19] are given in an extended version of this
paper [28]. 4.2 Q-LEARNING FOR MARKOV GAMES Markov games are a
generalization of mdps and alternating Markov games in which both
players... /...Theorem 1, model-based methods can be used to find optimal
policies in mdps, alternating Markov games, Markov games,
risk-sensitive mdps, and exploration-sensitive mdps. Also, if R t R and P t P
for all ...
Learning mutual trust
- Banerjee, Mukherjee, Sen (2000)
(Correct) (1 citation)
...of the agents) rather than optimal strategies
for an individual agent. The stochastic-game (or Markov Games) framework,
a generalization of Markov Decision Processes for multiple players, has been
used to... /...Learning (ML'98), pages 242-250, San Francisco, CA, 1998. Morgan
Kaufmann. [4] M. L. Littman. Markov games as a framework for multi-agent
reinforcement learning. In Proceedings of the Eleventh...
Optimal Stopping of
Markov Processes: Hilbert Space Theory, .. - Tsitsiklis, Van Roy (1997)
(Correct) (3 citations)
... analysis can be extended, including
independent increment processes, finite-horizon problems, and Markov
games. Finally, connections between the ideas in this paper and the
neuro-dynamic programming and...
Mutual Adaptation
Enhanced by Social Laws - Goldman, Rosenschein (1998)
(Correct) (1 citation)
...that can follow everything that is happening in
a specific environment. Littman also looked at Markov games that deal
with the specific actions taken by the agents and how they pay back other
agents. In... /...International Conference on Genetic Algorithms, pages 303-310,
1991. [6] Michael L. Littman. Markov games as a framework for multi-agent
reinforcement learning. In Machine Learning: Proceedings of the...
Tree Based
Discretization for Continuous State Space.. - Uther, Veloso (1998)
(Correct) (1 citation)
...larger domain, hexagonal grid soccer, is
similar to the one used by Littman (1994) to investigate Markov games. It
is ordered discrete rather than strictly continuous. We increased the size of
this domain,... /...U Tree work used this capability to partially remove the
Markov assumption. As we were playing a Markov game we did not implement
this part of the U Tree algorithm in Continuous U Tree although there is no...
Strategy Classification
in Multi-agent Environment -.. - Uchibe, Asada, Hosoda (1996)
(Correct) (1 citation)
...in advance to learn the behaviors successfully.
Littman (Littman 1994) proposed the framework of Markov Games in which
Q-learning agents try to learn a mixed strategy optimal against the worst
possible... /...on Simulation of Adaptive Behavior: From Animals to Animats 2.,
271-280. Littman, M. L. 1994. Markov games as a framework for multi-agent
reinforcement learning. In Proc. of the 11th International...
Reinforcement Learning in Large
State Spaces - Simulated Robotic Soccer
(Correct)
...systems
(MAS). The first is a problem of large state spaces. Existing formalisms such as
the Markov game model [Hu99][Lit94] suffer from combinatorial explosion,
since they learn values for combinations... /...problem of modeling the
environment and other agents acting in the environment in the context of
Markov Games. The Markov game model is defined by a set of states
S, and a collection of action sets A 1 ::: ...
The Steering Approach for
Multi-Criteria Reinforcement Learning - Shie Mannor And
(Correct)
...of other
agents or to non-stationary moves of Nature. This problem is modelled as a
stochastic (Markov) game between the learning agent and an arbitrary
player, with a vector-valued reward function. The... /...policies for
approaching these sets. Approachability theory has been extended to stochastic
(Markov) games in [14], and the relevant results are briefly reviewed in
Section 2. In this paper we add the...
Least-Squares Methods in
Reinforcement Learning for Control - Michail Lagoudakis Ronald
(Correct)
...and riding,
multiagent learning in factored domains, and, recently, on two-player zero-sum
Markov games and the game of Tetris. 1 Introduction Linear least-squares
methods have been successfully used... /... factored domains. Currently, LSPI is
being tested on the game of Tetris and on two-player zero-sum Markov
games. 2 MDPs and Reinforcement Learning We assume that the underlying
control problem is a Markov...
Consumption-Savings Decisions
with Quasi-Geometric.. - First Version June
(Correct)
...decisions
with quasi-geometric discounting. 2 There is also a related literature on
differential Markov games, e.g., in applications to models with imperfect
altruism (see, e.g., Leininger (1986) and...
Generalized
Approachability Results for Stochastic Games with .. - Shie Mannor And
(Correct)
...and related
approaching policies were given. The theory was later extended to stochastic
(Markov) games under the assumption that some fixed state is recurrent
under all stationary strategies. In this...
Adversarial
Reinforcement Learning - William Uther And
(Correct)
...compared to
a previously developed adversarial Reinforcement Learning algorithm designed for
Markov games. Building upon these efforts, we introduce new algorithms to
handle the multi-agent, the... /...[Littman, 1994] took standard Q Learning,
[Watkins and Dayan, 1992], and modified it to work with Markov games. He
replaced the simple min update used in standard Q Learning with a mixed
strategy...
The
Empirical Bayes Envelope and Regret Minimization in.. - Shie Mannor And
(Correct)
...In this
paper we consider the question of regret minimization in the framework of
stochastic (Markov) games, under appropriate recurrence properties. Our
approach relies on the observed state-action... /...algorithm for generating
extended normal numbers. Preprint, May 1998. [Lit94] M.L. Littman. Markov
games as a framework for multi-agent reinforcement learning. In Morgan
Kaufman, editor, Eleventh...
William T. B. Uther - Manuela
Veloso Computer
(Correct)
...& Moore
1996). We use a similar environment to the one used by Littman (1994) to
investigate Markov games. Our environment is larger, both in number of
states and number of actions per state. This... /...original U Tree work uses
this capability to remove the Markov assumption. As we were playing a Markov
game we did not implement this part of the U Tree algorithm in Continuous U
Tree although there is no...
Learning Agents in a Homo Egualis
Society - Nowé, Verbeeck, Lenaerts (2001)
(Correct)
... : : : An !
. This model is the underlying model for the stochastic games, also referred to
as Markov games in [6, 7] The rst approach where the presence of other
agent is completely ignored, is a naive... /...P.,: Multi Agent Reinforcement
Learning in Stochastic Games. Submitted, 1999. [7] Litmann M.L.,: Markov
games as a framework for multi-agent reinforcement learning. Proceedings of
the Eleventh International...
Social Agents Playing a Periodical
Policy - Ann Now Johan
(Correct)
...: :An ! P
(). This model is the underlying model for the stochastic games, also referred
to as Markov games in [5, 6] 2.2 Drawbacks of the joint action space
approach The joint action space, is a safe... /... M. P.,: Multi Agent
Reinforcement Learning in Stochastic Games. Submitted, 1999. 6. Litmann M.L.,:
Markov games as a framework for multi-agent reinforcement learning.
Proceedings of the Eleventh International...
Nash Equilibria in
Partial-Information Games on Markov Chains - Hespanha, Prandini (2001)
(Correct)
...the Markov
chain and the actions taken by the players. We deviate from most of the
literature on Markov games in that we do not assume full-information. In
fact, each player only has available stochastic... /...distributions. Here, we
extend this class of policies to the setting of partial-information Markov
games. Moreover, we enlarge such a class by allowing the distributions over
the available actions to...
Genericity and
Markovian Behavior in Stochastic Games - Haller, Lagunoff (1999)
(Correct)
...move game.
Journal of Economic Literature Classification Numbers: C72, C73. Keywords:
stochastic games, Markov Perfect equilibria, genericity. Department of
Economics, Virginia Polytechnic Institute and...
Probabilistic Pursuit-Evasion
Games: A One-Step Nash Approach - Joao Hespanha
(Correct)
...evader in a
non-accurately mapped terrain. By describing the problem as a partial
information Markov game, we are able to integrate map-learning and
pursuit. We propose receding horizon control ... /... a priori probabilistic map is
introduced. In this paper, we describe the pursuit-evasion problem as a
Markov game ([5]) where the system evolution is 1 Research supported by
Honeywell Inc. on DARPA contract...
On the Dynamics of Evolutionary
Games - María Teresa Gallegos
(Correct)
...These bounds
are in terms of the transition matrix and easy to compute. Key Words:
Evolutionary games, Markov chains, passage times, stationary
distributions, invariant measures, stochastic bounds AMS 1991...
Optimal Strategy
In A Dice Game - Haigh, Roters
(Correct)
...calculations, that finds the optimal strategy to reach such a target.
Keywords: Gambling, dice games, Markov decision theory. AMS 1991 Subject
Classification: Primary 60G40 2 Secondary 60J05 3 1....
Probabilistic
Pursuit-Evasion Games: A One-Step Nash Approach - Joao Hespanha Hespanha
(2000)
(Correct)
...evader in a
non-accurately mapped terrain. By describing the problem as a partial
information Markov game, we are able to integrate map-learning and
pursuit. We propose receding horizon control ... /...probabilistic map is
introduced. In this paper, we describe the pursuit-evasion problem as a
Markov game ([5]) where the system evolution is 1 Research supported by
Honeywell Inc. on DARPA contract...
Zero-Sum Markov Games and
Worst-Case Optimal Control of.. - Altman, Hordijk (1994)
(Correct)
...MARKOV
GAMES AND WORST-CASE OPTIMAL CONTROL OF QUEUEING SYSTEMS Eitan ALTMAN INRIA
2004 Route des Lucioles... /... the tools. We present some existing tools for
solving finite horizon and infinite horizon discounted Markov games with
unbounded cost, and develop new ones that are typically applicable in queueing
problems. We...
A Markov Game Approach For
Optimal Routing Into A Queueing Network - Altman (1994)
(Correct)
...A MARKOV
GAME APPROACH FOR OPTIMAL ROUTING INTO A QUEUEING NETWORK Eitan ALTMAN INRIA
2004 Route des Lucioles... /...finite and infinite horizon discounted cost. The
problem is studied in the framework of zero-sum Markov games where the
server, called player 1, is assumed to play against the router, called player 2.
Each...
Monotonicity Of
Optimal Policies In A Zero Sum Game: A Flow.. - Eitan Altman (1994)
(Correct)
...both
discounted and expected average cost. The problem is studied in the framework of
zero-sum Markov games where the server, called player 1, is assumed to
play against the flow controller, called player... /... to find a best strategy
under the worst case service conditions. We model the system as a zero-sum
Markov game, where player 1 controls the service quality. Actions b and g
are assumed to be taken...
Constrained Markov
Games: Nash Equilibria - Eitan Altman (1995)
(Correct)
...Markov
Games: Nash Equilibria Eitan ALTMAN INRIA 2004 Route des Lucioles, B.P.93
06902 Sophia-Antipolis Cedex... /...Haifa 32000 Israel February 1995 Abstract In
this paper we develop the theory of constrained Markov games. We consider
the expected average cost as well as discounted cost. We allow different players
to...
Co-Learning in Differential Games
- John W. Sheppard
(Correct)
...learner and
comparable but faster performance with a tree-based reinforcement learner.
Keywords: Markov games, differential games, pursuit games, multi-agent
learning, reinforcement learning, Q-learning 1 ... /...Littman explored using
Q-learning for co-learning among homogeneous agents in the context of Markov
games (Littman 1994 Littman 1996). It appears his approach also applies to
heterogeneous agents, but...
Constrained Markov Games: Nash
Equilibria - Eitan Altman (1995)
(Correct)
...Markov
Games: Nash Equilibria Eitan ALTMAN INRIA 2004 Route des Lucioles, B.P.93
06902 Sophia-Antipolis Cedex... /...To appear, Annals of Dynamic Games Abstract
In this paper we develop the theory of constrained Markov games. We
consider the expected average cost as well as discounted cost. We allow
different players to...
Incremental and Mutual Adaptation
in Multiagent Systems - Claudia Goldman
(Correct)
...among the
agents while they learn. Each agent learns directly from each of the others.
Markov games were proposed as a framework for multi-adaptive agents [4],
although this framework is not... /...International Conference on Genetic
Algorithms, pages 303-310, 1991. [4] Michael L. Littman. Markov games as
a framework for multi-agent reinforcement learning. In Machine Learning:
Proceedings of the...
Static and Dynamic
Aspects of Optimal Sequential Decision Making - Szepesvari (1998)
(Correct)
... . 62 4.3
Q-learning with Multi-State Updates . . . . . . . . . . . . . . . . 65 4.4
Q-learning for Markov Games . . . . . . . . . . . . . . . . . . . . 69
4.5 Risk-sensitive Models . . . . . . . . . . . . . .... /...or worst-case total
discounted/undiscounted cost criterion repeated zero-sum games such as
Markov-games or alternating Markov-games all admit such a
recursive structure. Our setup has the advantage... /...of this, model-based
methods can be used to find optimal policies in mdps, alternating Markov
games, Markov games, risk-sensitive models, and exploration-sensitive
(i.e., sarsa) models [35, 57]. Also, ...
Average Optimality in Markov Games
with General State Space - Rieder
(Correct)
...Optimality
in Markov Games with General State Space Ulrich Rieder Department of
Mathematics VII University of Ulm D-89069... /... Ulrich Rieder Department of
Mathematics VII University of Ulm D-89069 Ulm, Germany Abstract A Markov
game with general state space and the average reward as optimality criterion
is considered. Asymmetric ...
Vision Based State Space
Construction for Learning Mobile.. - Uchibe, Asada, Hosoda (1997)
(Correct)
...in advance
to learn the behaviors successfully. Littman (Littman, 1994) proposed the
framework of Markov Games in which Q-learning agents try to learn a mixed
strategy optimal against the worst possible... /... Simulation of Adaptive
Behavior: From Animals to Animats 2., pages 271-280. Littman, M. L. (1994).
Markov games as a framework for multi-agent reinforcement learning. In
Proc. of the 11th International...
A Unified Analysis of
Value-Function-Based.. - Szepesvári.. (1998)
(Correct)
...model-based
reinforcement learning, Q-learning with multi-state updates, Q-learning for
Markov games, and risk-sensitive reinforcement learning. 1 Introduction A
reinforcement learner interacts... /...model-based reinforcement learning,
Q-learning with multi-state updates, Q-learning for Markov games, and
risk-sensitive reinforcement learning. Section A then proves the theorem,
providing ... /...of this, model-based methods can be used to find optimal
policies in mdps, alternating Markov games, Markov games (Littman
1994), risk-sensitive models (Heger 1994), and exploration-sensitive (i.e.,
sarsa)...
[Return to Jie Bao's Homepage]