This notes follows materials from the slides for M3S17 at Imperial College London Mathematics Department by Dr. Tony Bellotti. Questions mentioned are referring to past exam papers of M3S17 at Imperial College London.

#Chapter 3: Markov Transition Model

## Markov Transition Models

### Why Markov Chain Model?

This model allow us to track model **changes in the state** of an account over time.

e.g. Tracking credit card use.

### First-order Markov Transition Model

- Let $(X_1, X_2, …, X_n)$ be a sequence of discrete random variables that take value from ${1,2,…,K}$ with K fixed.
**first-order finite value Markov chain**

The sequence is said to be a*first-order finite value Markov chain*if

$$

P(X_{n+1}=j|X_0=x_0,X_1=x_1,X_2=x_2,…,X_{n-1}=x_{n-1},X_n = i) = P(X_{n+1}=j|X_n = i)

$$

for all $n, x_0,x_1, …, x_{n-1}$ and $i,j$ such that $1\leq i,j \leq K$.**transition probability***Transition probability*$p_n$ is defined as

$$

p_{n}(i,j)=P(X_n=j|X_{n-1}=i)

$$**transition matrix***Transition matrix*$P$ is defined as a $K \times K$ matrix such that

$$

P_n[i,j] = p_{n}(i,j)

$$**structural zeroes**

If it is certain that no transition will be made from i to j in the $n$th period, then set $p_n(i,j)=0$, this is called a*structural zero*.

###Propagation of Markov Chain

If we know the transition matrix from period $n-r$ to $n$, i.e. $P_r,P_{r+1},…,P_{n-1}$ are known, then we also be able to know the state in the $n$th period:

$$

Prob(X_{n}=j|X_{n-r}=i) = P_{n-r+1}P_{n-r+2}…P_{n}[i,j]

$$

for any $1\leq r \leq n$

Forcasting from state 0:

$$

Prob(X_{n}=j|X_{0}=i) = P_{1}P_{2}…P_{n}[i,j]

$$

let $\pi_{i}$ be the marginal distribution for $X_n$:

$$\pi_n = (P(X_n = 1),P(X_n = 2),…,P(X_n = K))^T$$

Then,

$$\pi_n = \pi_0(P_1 P_2 … P_n)$$

### Stationary Markov Chain

**stationary**

A Markov chain is*stationary*if $p_n(i,j)=p(i,j)$ for all $**n**,\ and\ i,j\ where 1\leq i, j \leq K$, for some transition probability $p$.**stationary distribution**

A*stationary distribution*for transition matrix is a distribution $\pi^\star = \pi^\star P$.In practice, most Markov chains converge (with 𝑛) to a stationary

distribution.

(Markov chains which have a periodicity in state change do not necessarily converge, but we do not cover this material in this course).

## Estimation

The aim this section, given some data, is to estimate $\mathbf{\theta}=(\theta_{ij})^{K}*{i,j=1}$ where $\theta*{ij}=p(i,j)$.

Maximum likelihood estimator for $\mathbf{\theta}$:

$$

\hat\theta_{ij}=\hat p(i,j)=\frac{n_{ij}}{\sum_{l=1}^{n}n_{il}}

$$

where $n_{ij}=|{t:x_{t-1}=i,x_{t}=j,\ for \ t\in {1,2,…,n}}|$, i.e. total number of observations that move from state $i$ to state $j$ during period 1 to n.**proof**

2017Q1

(c) Derive the maximum likelihood estimator for the first order finite-valued Markov transition model, assuming stationarity.

- likelihood function

$P(X_0=x_0,X_1=x_1,…,X_n=x_n)$

$=\prod_{t=1}^{n} P(X_{t}=x_t|X_{t-1}=x_{t-1},X_{t-2}=x_{t-2},…,X_{1}=x_{1})P(X_0=x_0)$

$=\prod_{t=1}^{n} P(X_{t}=x_t|X_{t-1}=x_{t-1})P(X_0=x_0)(Chain rule)$

$=P(X_0=x_0)\prod_{t=1}^{n} P(X_{t}=x_t|X_{t-1}=x_{t-1})$

$=P(X_0=x_0)\prod_{i=1}^{n}\prod_{j=1}^{n}[p(i,j)]^{n_{ij}}$

$=P(X_0=x_0)\prod_{i=1}^{n}\prod_{j=1}^{n}[\theta_{ij}]^{n_{ij}}$

$$

where\ n_{ij}=|{t:x_t=x_t,X_{t-1}=x_{t-1},\ for\ t\in {1,…,n}}|

$$

- log likelihood function
- differentiate and find MLE
- adjust to constraints

## Testing First-order Assumptions

In order to test if first order markov chain is a suitable assumption in each occasion,

we use chi-square hypothesis test.

### second order Markov chain

A sequence (x_1,..,x_n) is a *second order Markov chain* if

$P(x_n+1=j|X_0=x_0,…,X_{n-1}=k,X_{n}=i)=P(x_n+1=j|X_{n-1}=k,X_{n}=i)$

The **transition probability**

$p_n(k,i,j)=P(X_n=i, X_{n-1}=k)$

The **MLE for a second-Markov chain** can similarly be proved as:

$\hat p(k,i,j)=\hat\theta_{kij}=\frac{n_kij}{mki}$

where $n_{kij}=|{t:x_{t-2}=k,x_{t-1}=i,x_{t}=j\ for\ t \in {2,…,n}}|,

m_{ki}=|{t:x_{t-2}=k,x_{t-1}=i\ for\ t \in {2,…,n}}|=\sum_{l=i}^{n}n_{kil}$

**$H_{0}$**

Null hypothesis:

$P(1,i,j)=P(2,i,j)=…=P(K,i,j)=P(i,j)$

where $P(k,i,j)$ is second-order Markov transition probability from state k to i then to j.**$n_k$**

$n_k=|{t:x_t=k\ for\ t \in {1,..,n}}|$

i.e. The observed number of times for each states**$O_{kj}$**

The observed number of times state k follwed by i and then j

$O_{kj}=n{kij}=m_{ki}\hat p(i,j)$**$E_{kj}$**

The expected number of times state $k$ is followed by state $i$ and then $j$,**given null hypothesis**is:

$E_{kj}=n_{k}\hat p(k,i)\hat p(i,j)\approx m_{ki}\hat p(i,j)$

becuase $\hat p(k,i)=\frac{n_{ki}}{\sum_{l=1}^{K}n_{kl}}\approx \frac{m_{ki}}{n_k}$**Pearson Chi_square test**

$$

\chi^2 = \sum_{k\in S_1}\sum_{k\in S_2}\frac{(O_{kj}-E{kj})^2}{E_{kj}}

$$

- S1 is the set of states which do not have a structural zero moving to state $i$
- S2 is the set of states which do not have a structural zero moving from state $i$

**Degree of freedom**

$df = (K-1-z_{1})(K-1-z_{2})$

- $z_1$: the number of structural zeroes in the transition from $k$ to $i$.
- $z_2$: the number of structural zeroes in the transition from $i$ to $j$.