The definition of Markov chains has evolved during the 20th century. In 1953 the term Markov chain was used for stochastic processes with discrete or continuous index set, living on a countable or finite state space; see Doob[1] or Chung.[2] Since the late 20th century it has become more common to consider a Markov chain as a stochastic process with discrete index set, living on a measurable state space.[3][4][5]
Denote by $(E,\Sigma)$ a measurable space and by $p$ a Markov kernel with source and target $(E,\Sigma)$. A stochastic process $(X_n)_{n\in\mathbb{N}}$ on $(\Omega,\mathcal{F},\mathbb{P})$ is called a time-homogeneous Markov chain with Markov kernel $p$ and start distribution $\mu$ if

$$\mathbb{P}[X_0\in A_0,\,X_1\in A_1,\dots,X_n\in A_n]=\int_{A_0}\dots\int_{A_{n-1}}p(y_{n-1},A_n)\,p(y_{n-2},dy_{n-1})\dots p(y_0,dy_1)\,\mu(dy_0)$$

is satisfied for any $n\in\mathbb{N}$ and $A_0,\dots,A_n\in\Sigma$. For any Markov kernel and any probability measure one can construct an associated Markov chain.[4]
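On a finite state space this definition becomes concrete: a Markov kernel reduces to a row-stochastic transition matrix and the start distribution to a probability vector. The following sketch simulates one path of such a chain; the state space, matrix, and distribution are made-up illustrations, not taken from the source.

```python
import numpy as np

# Hypothetical finite setting: E = {0, 1, 2}, the kernel p is the
# row-stochastic matrix P with P[x, y] = p(x, {y}), and mu is the
# start distribution (here a Dirac measure in state 0).
P = np.array([[0.5,  0.5,  0.0],
              [0.25, 0.5,  0.25],
              [0.0,  0.5,  0.5]])
mu = np.array([1.0, 0.0, 0.0])

def simulate_chain(P, mu, n, rng):
    """Draw one path (X_0, ..., X_n) of the time-homogeneous chain."""
    x = rng.choice(len(mu), p=mu)           # X_0 ~ mu
    path = [x]
    for _ in range(n):
        x = rng.choice(P.shape[0], p=P[x])  # X_{k+1} ~ p(X_k, .)
        path.append(x)
    return path

rng = np.random.default_rng(0)
print(simulate_chain(P, mu, 5, rng))
```

Sampling each step from the row $P[x]$ is exactly the time-homogeneity in the definition: the transition law depends on the current state only, not on the time index.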
For any measure $\mu\colon\Sigma\to[0,\infty]$ and any $\mu$-integrable function $f\colon E\to\mathbb{R}\cup\{\infty,-\infty\}$, we denote the Lebesgue integral by $\int_E f(x)\,\mu(dx)$. For the measure $\nu_x\colon\Sigma\to[0,\infty]$ defined by $\nu_x(A):=p(x,A)$ we use the notation

$$\int_E f(y)\,p(x,dy):=\int_E f(y)\,\nu_x(dy).$$
Starting in a single point
If $\mu$ is a Dirac measure in $x$, we denote the Markov chain associated with a Markov kernel $p$ and starting distribution $\mu$ as $(X_n)_{n\in\mathbb{N}}$ on $(\Omega,\mathcal{F},\mathbb{P}_x)$, and write the expected value

$$\mathbb{E}_x[X]=\int_\Omega X(\omega)\,\mathbb{P}_x(d\omega)$$

for a $\mathbb{P}_x$-integrable function $X$. By definition, we then have $\mathbb{P}_x[X_0=x]=1$.
For any measurable function $f\colon E\to[0,\infty]$ we have the relation[4]

$$\int_E f(y)\,p(x,dy)=\mathbb{E}_x[f(X_1)].$$
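On a finite state space this identity is a dot product: $\int_E f(y)\,p(x,dy)$ is the $x$-th row of the transition matrix applied to the vector of values of $f$, which can be compared against a Monte Carlo estimate of $\mathbb{E}_x[f(X_1)]$. A sketch with made-up numbers:

```python
import numpy as np

# Assumed finite setting: kernel matrix P, function f given by a vector,
# chain started in the single point x (Dirac start distribution).
P = np.array([[0.5, 0.5],
              [0.2, 0.8]])
f = np.array([1.0, 3.0])
x = 0

exact = P[x] @ f  # int_E f(y) p(x, dy) as a dot product

rng = np.random.default_rng(0)
samples = rng.choice(2, size=100_000, p=P[x])  # draws of X_1 under P_x
mc = f[samples].mean()                         # estimate of E_x[f(X_1)]
print(exact, mc)
```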
Family of Markov kernels
For a Markov kernel $p$ with starting distribution $\mu$ one can introduce a family of Markov kernels $(p_n)_{n\in\mathbb{N}}$ by

$$p_{n+1}(x,A):=\int_E p_n(y,A)\,p(x,dy)$$

for $n\in\mathbb{N}$, $n\geq 1$, and $p_1:=p$. For the associated Markov chain $(X_n)_{n\in\mathbb{N}}$ according to $p$ and $\mu$ one obtains

$$\mathbb{P}[X_0\in A,\,X_n\in B]=\int_A p_n(x,B)\,\mu(dx).$$
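In the finite setting the recursion defining $p_{n+1}$ is exactly matrix multiplication, so $p_n$ corresponds to the $n$-th power of the transition matrix. A small sketch (the matrix, the sets $A=\{0\}$, $B=\{1\}$, and the start distribution are hypothetical examples):

```python
import numpy as np

# p_{n+1}(x, A) = int p_n(y, A) p(x, dy) becomes P_{n+1} = P @ P_n,
# so the n-step kernel p_n is the matrix power P^n.
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])
mu = np.array([0.5, 0.5])

def p_n(P, n):
    return np.linalg.matrix_power(P, n)

# P[X_0 in A, X_n in B] = int_A p_n(x, B) mu(dx); here A = {0}, B = {1}:
n = 3
prob = mu[0] * p_n(P, n)[0, 1]
print(prob)
```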
A probability measure $\mu$ is called a stationary measure of a Markov kernel $p$ if

$$\int_A\mu(dx)=\int_E p(x,A)\,\mu(dx)$$

holds for any $A\in\Sigma$. If $(X_n)_{n\in\mathbb{N}}$ on $(\Omega,\mathcal{F},\mathbb{P})$ denotes the Markov chain according to a Markov kernel $p$ with stationary measure $\mu$, and the distribution of $X_0$ is $\mu$, then all $X_n$ have the same probability distribution, namely

$$\mathbb{P}[X_n\in A]=\mu(A)$$

for any $A\in\Sigma$.
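For a finite chain the stationarity condition reads $\mu=\mu P$, i.e. $\mu$ is a left eigenvector of the transition matrix for eigenvalue $1$. A sketch with an illustrative two-state matrix (not from the source):

```python
import numpy as np

# Stationarity int_A mu(dx) = int_E p(x, A) mu(dx) reads mu @ P = mu
# in the finite case: mu is a left eigenvector of P for eigenvalue 1.
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])

w, v = np.linalg.eig(P.T)                     # right eigenvectors of P^T
mu = np.real(v[:, np.argmin(np.abs(w - 1))])  # eigenvector for eigenvalue 1
mu = mu / mu.sum()                            # normalize to a probability vector

print(mu)                       # stationary distribution
print(np.allclose(mu @ P, mu))  # one step of the kernel preserves mu
```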
A Markov kernel $p$ is called reversible with respect to a probability measure $\mu$ if

$$\int_A p(x,B)\,\mu(dx)=\int_B p(x,A)\,\mu(dx)$$

holds for any $A,B\in\Sigma$. Setting $A=E$ shows that if $p$ is reversible with respect to $\mu$, then $\mu$ must be a stationary measure of $p$.
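For a finite chain reversibility reduces to the detailed-balance condition $\mu(x)\,p(x,\{y\})=\mu(y)\,p(y,\{x\})$ for all states $x,y$: the matrix of "probability flows" must be symmetric. The implied stationarity can then be verified directly. A sketch with made-up numbers:

```python
import numpy as np

# Detailed balance: mu[x] * P[x, y] == mu[y] * P[y, x] for all x, y.
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])
mu = np.array([0.8, 0.2])

flow = mu[:, None] * P                  # flow[x, y] = mu(x) p(x, {y})
reversible = np.allclose(flow, flow.T)  # symmetry = detailed balance
stationary = np.allclose(mu @ P, mu)    # reversibility implies stationarity
print(reversible, stationary)
```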
^ Joseph L. Doob: Stochastic Processes. New York: John Wiley & Sons, 1953.
^ Kai L. Chung: Markov Chains with Stationary Transition Probabilities. Second edition. Berlin: Springer-Verlag, 1974.
^ Sean Meyn and Richard L. Tweedie: Markov Chains and Stochastic Stability. Second edition, 2009.
^ a b c Daniel Revuz: Markov Chains. Second edition, 1984.
^ Rick Durrett: Probability: Theory and Examples. Fourth edition, 2005.