Monday, 19 December 2011

Markov-switching AR(1) model

Often the behaviour of economic and financial time series changes dramatically due to structural factors in the economy. An obvious example is the business cycle. During a boom the annual growth rate in GDP will be positive and the time series of growth rates will have a strictly positive mean, but during a recession the growth rate turns negative and the mean growth rate will be negative. Another example is CDS spreads for European banks. Fig. 1 shows the CDS spread for a French bank. We can see a significant change in both the level and the volatility of the series from roughly the end of July, coinciding with the downgrade of Spanish government debt.

[Fig. 1: CDS spread for a French bank]
One way of modelling these structural changes is with Markov-switching models. The basic idea is that the structural parameters which explain a time series can differ depending on an unobserved state. The number of possible states is chosen to be small; often just two states are modelled, e.g. recession and boom, or risk-averse and risk-loving. The unobserved state evolves according to a Markov process.

Example

Consider the following simple two-state Markov-switching AR(1) model

$$y_t = \beta_{0,s_t} + \beta_{1,s_t}\, y_{t-1} + \varepsilon_t$$

where

$$\varepsilon_t \sim N\left(0, \sigma^2_{s_t}\right)$$

and the state $s_t \in \{0, 1\}$ follows a first-order Markov chain.

So the transition probabilities are as follows:

$$P(s_t = 0 \mid s_{t-1} = 0) = p_{00}, \qquad P(s_t = 1 \mid s_{t-1} = 0) = 1 - p_{00},$$

$$P(s_t = 1 \mid s_{t-1} = 1) = p_{11}, \qquad P(s_t = 0 \mid s_{t-1} = 1) = 1 - p_{11},$$

and the vector of unknown parameters which needs to be estimated is

$$\theta = \left(\beta_{0,0},\, \beta_{1,0},\, \beta_{0,1},\, \beta_{1,1},\, \sigma^2_0,\, \sigma^2_1,\, p_{00},\, p_{11}\right)'.$$

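In the R function below the likelihood is maximised over an unconstrained version of this vector: the variances enter through their logarithms and the transition probabilities through a probit transform,

$$\sigma^2_0 = e^{\theta_5}, \qquad \sigma^2_1 = e^{\theta_6}, \qquad p_{00} = \Phi(\gamma_0), \qquad p_{11} = 1 - \Phi(\gamma_1),$$

where $\Phi$ is the standard normal CDF and $\gamma_0$, $\gamma_1$ are the last two elements of $\theta$. This keeps the variances positive and the probabilities inside $(0, 1)$ without imposing explicit constraints on the optimiser.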
In the R code given below I simulated 200 observations from this model and estimated the parameters by maximum likelihood. I found that the likelihood function can be very flat with lots of local maxima (in part because relabelling the two states leaves the likelihood unchanged), so the parameter estimates can differ significantly depending on the choice of starting values. To limit the chance of selecting parameter estimates that only correspond to a local maximum, I repeated the estimation procedure 50 times using random starting values to generate a list of candidate MLEs, and then simply chose the parameter estimates corresponding to the largest likelihood in the list. In a future post I'll see if I can get faster and more reliable estimates using the Gibbs sampler.

Fig. 2 shows that both the smoothed and the filtered probabilities track the true underlying state closely at each date.
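For reference, the filtered probability $P(s_t = j \mid y_1, \dots, y_t)$ is produced by the Hamilton filter as it runs forward through the sample, while the smoothed probability $P(s_t = j \mid y_1, \dots, y_T)$ conditions on the full sample. Kim's algorithm recovers the smoothed probabilities from the filtered ones via the backward recursion

$$P(s_t = j \mid Y_T) = P(s_t = j \mid Y_t) \sum_{k=0}^{1} p_{jk}\, \frac{P(s_{t+1} = k \mid Y_T)}{P(s_{t+1} = k \mid Y_t)},$$

where $p_{jk} = P(s_{t+1} = k \mid s_t = j)$; this is the recursion implemented in the kim.smooth function in the code below.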

[Fig. 2: observed series; true state with smoothed probabilities; filtered and smoothed probabilities]


R code


# Markov-switching model
# generate data for an AR(1) Markov-switching model with the following pars
# state 0: y_t = 2 + 0.5 * y_{t-1} + e1_t
# state 1: y_t = -1 + 0.8 * y_{t-1} + e2_t
# where e1_t ~ N(0,1), e2_t ~ N(0,0.5^2)
# transition probabilities p_s0_s1 = p_s1_s0 = 0.20

# generate realisations of the state

n <- 200
gamma_s0 <- qnorm(0.8)
gamma_s1 <- qnorm(0.2)
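# note: P(rnorm(1) < qnorm(0.8)) = 0.8, so comparing a standard normal draw
# with the threshold qnorm(0.8) keeps the chain in state 0 with probability
# 0.8, while the threshold qnorm(0.2) moves it from state 1 back to state 0
# with probability 0.2, matching the transition probabilities above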
gamma <- rep(0,n)
state <- rep(0,n)

# choose initial state at t=0 to be state 0 

gamma[1] <- gamma_s0
state[1] <- 0

for(i in 2:n) {
if(rnorm(1) < gamma[i-1]) {
gamma[i] <- gamma_s0
state[i] <- 0
} else {
gamma[i] <- gamma_s1
state[i] <- 1
}
}
# generate observations
# choose y_0 = 0
# recall state at t=1 was set to 0 

y1 <- 2 + 0.5 * 0 + rnorm(1, sd = 1)
y <- rep(0,n)
y[1] <- y1

for(i in 2:n) {
if(state[i] == 0) {
y[i] <- 2 + 0.5 * y[i-1] + rnorm(1, sd = 1)
} else {
y[i] <- -1 + 0.8 * y[i-1] + rnorm(1, sd = 0.5)
}
}

# convert into time series object

y <- ts(y, start = 1, freq = 1)


hamilton.filter <- function(theta, y) {

# construct parameters
beta_s0 <- theta[1:2]
beta_s1 <- theta[3:4]
sigma2_s0 <- exp(theta[5])
sigma2_s1 <- exp(theta[6])
gamma_s0 <- theta[7]
gamma_s1 <- theta[8]
# construct probabilities

# probit specification of the transition probabilities
  
p_s0_s0 <- pnorm(gamma_s0)
p_s0_s1 <- pnorm(gamma_s1)
p_s1_s0 <- 1-pnorm(gamma_s0)
p_s1_s1 <- 1-pnorm(gamma_s1)

# # # construct transition probability matrix P # # #

P.mat <- matrix(c(p_s0_s0, p_s1_s0, p_s0_s1, p_s1_s1), nrow = 2, ncol = 2, byrow = TRUE)
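# convention: rows of P.mat index the previous state s_{t-1} and columns
# the current state s_t, e.g. P.mat[1,2] = P(s_t = 1 | s_{t-1} = 0)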

# create data matrix

X <- cbind(1,y)
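# row i-1 of X, i.e. (1, y[i-1]), supplies the regressors for observation i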

# assume ergodicity of the Markov chain and use the unconditional
# (stationary) probabilities to initialise the filter
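# a two-state chain with P(0|0) = p00 and P(1|1) = p11 has stationary
# distribution pi_0 = (1 - p11) / (2 - p00 - p11), pi_1 = 1 - pi_0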

p0_s0 <- (1 - p_s1_s1) / (2 -p_s0_s0 -p_s1_s1)
p0_s1 <- 1-p0_s0

# create variables

p_s0_t_1 <- rep(0, nrow(X))
p_s1_t_1 <- rep(0, nrow(X))
p_s0_t <- rep(0, nrow(X))
p_s1_t <- rep(0, nrow(X))
f_s0 <- rep(0, nrow(X))
f_s1 <- rep(0, nrow(X))
f <- rep(0, nrow(X))
logf <- rep(0, nrow(X))

p_s0_t[1] <- p0_s0
p_s1_t[1] <- p0_s1 

# initiate hamilton filter

for(i in 2:nrow(X)) {
# calculate prior probabilities using the total probability theorem (TPT)
# the TPT for this example gives us
# p_si_t_1 = p_si_t_1_si * p_si_t + p_si_t_1_sj * p_sj_t
# where p_si_t_1 is the prob state_t = i given information @ time t-1
# p_si_t_1_sj is the prob state_t = i given state_t_1 = j, and all info @ time t-1
# p_si_t is the prob state_t = i given information @ time t

# in this simple example p_si_t_1_sj = p_si_sj

p_s0_t_1[i] <- (p_s0_s0 * p_s0_t[i-1]) + (p_s0_s1 * p_s1_t[i-1])
p_s1_t_1[i] <- (p_s1_s0 * p_s0_t[i-1]) + (p_s1_s1 * p_s1_t[i-1])

# calculate density function for observation i
# f_si is density conditional on state = i
# f is the density 
  
f_s0[i] <- dnorm(y[i]-X[i-1,]%*%beta_s0, sd = sqrt(sigma2_s0)) 
f_s1[i] <- dnorm(y[i]-X[i-1,]%*%beta_s1, sd = sqrt(sigma2_s1))
f[i] <- (f_s0[i] * p_s0_t_1[i]) + (f_s1[i] * p_s1_t_1[i])

# calculate filtered/posterior probabilities using Bayes' rule
# p_si_t is the prob that state = i given information @ time t

p_s0_t[i] <- (f_s0[i] * p_s0_t_1[i]) / f[i]
p_s1_t[i] <- (f_s1[i] * p_s1_t_1[i]) / f[i]

logf[i] <- log(f[i])

}

logl <- sum(logf)

output <- list(p_s0_t_1 = p_s0_t_1, p_s1_t_1 = p_s1_t_1,
               f.prob_s0 = p_s0_t, f.prob_s1 = p_s1_t,
               T.mat = P.mat, logLik = logl)

return(output)

}
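# quick sanity check (optional): evaluate the filter at the true simulation
# parameters; recall theta is on the transformed scale, so the variances
# enter as logs and the transition probabilities via their probit parameters
true.theta <- c(2, 0.5, -1, 0.8, log(1), log(0.25), qnorm(0.8), qnorm(0.2))
hamilton.filter(true.theta, y)$logLik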


# construct negative log-likelihood function

neg.logl <- function(theta,y) {
out <- hamilton.filter(theta,y)
return(-out$logLik)
}

# pick start values for the 8 unknown parameters
# k is the number of times the estim procedure is run
# different starting values for each run

k <- 50 
start_val <- matrix(rnorm(8*k), nrow = 8, ncol = k)

#matrix to store parameter estimates

par.cand <- matrix(NA,nrow = 8, ncol = k)

#row vector to store likelihood values

like.cand <- matrix(NA, nrow = 1, ncol = k)

#list to store hessian matrices

Hess.cand <- sapply(1:k, function(x) { matrix(0,nrow = 8, ncol = 8) }, simplify = FALSE) 

for(i in 1:k) {

x <- optim(start_val[,i], neg.logl, y = y, hessian = TRUE)

par.cand[,i] <- x$par
like.cand[,i] <- x$value
Hess.cand[[i]] <- x$hessian

cat("Iteration", i, "of", k, "\n")

}

results <- rbind(like.cand,par.cand)

# # # select mle from candidates # # # 

like.cand <- results[1,]
index <- which.min(like.cand)
mle.pars <- results[-1,index]
hessian <- Hess.cand[[index]]
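# approximate standard errors (an optional sketch, assuming the Hessian of
# the negative log-likelihood is positive definite at the MLE); note these
# apply to the transformed parameters (log-variances, probit coefficients)
mle.se <- sqrt(diag(solve(hessian)))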

# # # # # # # # # # # # # # # # # # # # # # # # #  
# # # # # # # # # # # # # # # # # # # # # # # # # 

# # # get filtered probabilities # # #

output <- hamilton.filter(mle.pars,y)

fprob_s0 <- output$f.prob_s0
fprob_s1 <- output$f.prob_s1

# # #  get transition matrix # # # 

Tmat <- output$T.mat

# # # define kim's smoothing algorithm # # #

kim.smooth <- function(T.mat,fp_s0,fp_s1,p_s0_t_1,p_s1_t_1) {

n <- length(fp_s0)

# create smoothed probability variables 
p_s0_T <- rep(0,n) 
p_s1_T <- rep(0,n)
# assign initial values to implement kim's algo

p_s0_T[n] <- fp_s0[n]
p_s1_T[n] <- fp_s1[n]

# initiate recursion

# note: rows of T.mat index s_t and columns index s_{t+1}, so the weight on
# the state-k term for current state j is T.mat[j+1, k+1]
for(i in (n-1):1) {
p_s0_T[i] <- fp_s0[i] * ( T.mat[1,1] * (p_s0_T[i+1]/p_s0_t_1[i+1]) + T.mat[1,2] * (p_s1_T[i+1]/p_s1_t_1[i+1]) )
p_s1_T[i] <- fp_s1[i] * ( T.mat[2,1] * (p_s0_T[i+1]/p_s0_t_1[i+1]) + T.mat[2,2] * (p_s1_T[i+1]/p_s1_t_1[i+1]) )
}

output <- list(p_s0_T = p_s0_T, p_s1_T = p_s1_T)
return(output)

}

# # # get smoothed probabilities using kim's algo # # #

smoothed.output <- kim.smooth(T.mat = Tmat, fp_s0 = fprob_s0, fp_s1 = fprob_s1, p_s0_t_1 = output$p_s0_t_1, p_s1_t_1 = output$p_s1_t_1)


p_s0_T <- smoothed.output$p_s0_T
p_s1_T <- smoothed.output$p_s1_T

# # # plot results # # #

state <- ts(state, start = 1, freq = 1)

par(mfrow = c(3,1))
plot.ts(y, ylab = "y", main = "Observed series")
plot.ts(cbind(state, p_s1_T), plot.type = "single", col = c("black", "red"), ylab = "probability/state value", main = "True state and smoothed probabilities", lty = c(2, 1))
legend("topleft", legend = c("true state", "smoothed prob"), col = c("black", "red"), lty = c(2, 1))
plot.ts(cbind(fprob_s1, p_s1_T), plot.type = "single", col = c("blue", "red"), ylab = "probability", main = "Filtered and smoothed probabilities", lty = c(1, 1))
legend("topleft", legend = c("filtered prob", "smoothed prob"), col = c("blue", "red"), lty = c(1, 1))
