In probability theory, a Markov reward model (or Markov reward process) is a stochastic process that extends a Markov chain or a continuous-time Markov chain by attaching a reward rate to each state. An additional variable records the reward accumulated up to the current time.[1] Quantities of interest in the model include the expected reward at a given time and the expected time to accumulate a given reward.[2] The model was introduced in Ronald A. Howard's 1971 book Dynamic Probabilistic Systems.[3] Markov reward models are often studied in the context of Markov decision processes, where a decision strategy can affect the rewards received.
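For a discrete-time chain, the expected accumulated reward obeys a simple recursion: the expected n-step reward from a state is that state's one-step reward plus the expected (n-1)-step reward from the next state. A minimal sketch of this computation (the two-state chain, its transition probabilities, and its rewards are hypothetical illustrations, not from the sources above):

```python
import numpy as np

# Hypothetical two-state Markov reward process:
# state 0 ("up") earns reward 1 per step, state 1 ("down") earns 0.
P = np.array([[0.9, 0.1],   # transition probabilities out of state 0
              [0.5, 0.5]])  # transition probabilities out of state 1
r = np.array([1.0, 0.0])    # reward earned per step in each state

def expected_reward_dtmc(P, r, start, n_steps):
    """Expected total reward accumulated over n_steps from state `start`.

    Uses the recursion v_n = r + P v_{n-1} with v_0 = 0: the expected
    n-step reward is the current step's reward plus the expected
    (n-1)-step reward from the successor state.
    """
    v = np.zeros(len(r))
    for _ in range(n_steps):
        v = r + P @ v
    return v[start]

print(expected_reward_dtmc(P, r, start=0, n_steps=10))
```

Starting in state 0, the 10-step expected reward exceeds 10 times the long-run reward rate (5/6 per step for this chain), because the process begins in the more rewarding state.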

The Markov Reward Model Checker tool can be used to numerically compute transient and stationary properties of Markov reward models.

Continuous-time Markov chain

The accumulated reward at a time t can be computed numerically over the time domain, or by evaluating the linear hyperbolic system of equations that describes the accumulated reward, using transform methods or finite-difference methods.[4]
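One time-domain approach can be sketched as follows: the vector v(t) of expected accumulated rewards, with entries E[∫₀ᵗ r(X_s) ds | X₀ = i], satisfies the linear ODE v'(t) = r + Q v(t) with v(0) = 0, where Q is the chain's generator matrix, and a forward-Euler finite-difference scheme integrates it directly. The two-state generator and reward rates below are hypothetical, chosen only for illustration:

```python
import numpy as np

# Hypothetical two-state continuous-time Markov chain:
# generator Q (rows sum to zero) and per-state reward rates r.
Q = np.array([[-2.0,  2.0],   # leaves state 0 at rate 2
              [ 1.0, -1.0]])  # leaves state 1 at rate 1
r = np.array([1.0, 0.0])      # reward accrues at rate 1 while in state 0

def accumulated_reward_ctmc(Q, r, t, n_steps=100_000):
    """Expected reward accumulated by time t, for each starting state.

    Integrates the linear ODE v'(t) = r + Q v(t), v(0) = 0, with
    forward Euler, a simple finite-difference scheme; smaller steps
    (larger n_steps) give a more accurate answer.
    """
    h = t / n_steps
    v = np.zeros(len(r))
    for _ in range(n_steps):
        v = v + h * (r + Q @ v)
    return v

print(accumulated_reward_ctmc(Q, r, t=10.0))
```

For this generator the stationary distribution is (1/3, 2/3), so both entries grow at the long-run rate t/3, offset by a transient term that depends on the starting state.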

References

  1. ^ Begain, K.; Bolch, G.; Herold, H. (2001). "Theoretical Background". Practical Performance Modeling. p. 9. doi:10.1007/978-1-4615-1387-2_2. ISBN 978-1-4613-5528-1.
  2. ^ Li, Q. L. (2010). "Markov Reward Processes". Constructive Computation in Stochastic Models with Applications. pp. 526–573. doi:10.1007/978-3-642-11492-2_10. ISBN 978-3-642-11491-5.
  3. ^ Howard, R.A. (1971). Dynamic Probabilistic Systems, Vol II: Semi-Markov and Decision Processes. New York: Wiley. ISBN 0471416657.
  4. ^ Reibman, A.; Smith, R.; Trivedi, K. (1989). "Markov and Markov reward model transient analysis: An overview of numerical approaches" (PDF). European Journal of Operational Research. 40 (2): 257. doi:10.1016/0377-2217(89)90335-4.