The influence of observation sequence features on the performance of the Bayesian hidden Markov model: A Monte Carlo simulation study
by Jan-Willem Simons, Bart-Jan Boverhof, Emmeke Aarts
The hidden Markov model is a popular modeling strategy for describing and explaining latent process dynamics. Little is known about the estimation performance of the Bayesian hidden Markov model when applied to categorical, one-level data. We conducted a simulation study to assess the effect of (1) the number of observations (250 to 8,000), (2) the number of levels in the categorical outcome variable (3 to 7), and (3) state distinctiveness and state separation in the emission distribution (low, medium, high) on the performance of the Bayesian hidden Markov model. Performance is quantified in terms of convergence, accuracy, precision, and coverage. Convergence is generally achieved across conditions. Accuracy, precision, and coverage increase with a higher number of observations and an increased level of state distinctiveness, and to a lesser extent with an increased level of state separation. The number of categorical levels only marginally influences performance. A minimum of 1,000 observations is recommended to ensure adequate model performance.
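To make the simulated data concrete, the following is a minimal sketch of how one categorical observation sequence could be drawn from a hidden Markov model. It is an illustration only, not the study's code: the function name `simulate_categorical_hmm`, the choice of three hidden states, and all probability values are assumptions made for this example. The sequence length (250) and the number of categorical levels (3) are taken from the lower ends of the ranges stated above.

```python
import random


def simulate_categorical_hmm(n_obs, transition, emission, initial, seed=0):
    """Draw one hidden state path and one categorical observation sequence.

    transition[i][j]: probability of moving from state i to state j
    emission[i][k]:   probability of observing category k while in state i
    initial[i]:       probability of starting in state i
    """
    rng = random.Random(seed)
    n_states = len(transition)
    n_levels = len(emission[0])

    states, observations = [], []
    state = rng.choices(range(n_states), weights=initial)[0]
    for _ in range(n_obs):
        states.append(state)
        # Emit a category according to the current state's emission row.
        observations.append(rng.choices(range(n_levels), weights=emission[state])[0])
        # Move to the next hidden state according to the transition row.
        state = rng.choices(range(n_states), weights=transition[state])[0]
    return states, observations


# Hypothetical parameters (assumed, not taken from the study):
# three hidden states, three categorical levels.
transition = [
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
]
# Each state concentrates its emission mass on a different category,
# so the states are easy to tell apart in this illustrative setup.
emission = [
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
]
initial = [1 / 3, 1 / 3, 1 / 3]

states, observations = simulate_categorical_hmm(250, transition, emission, initial)
```

Flattening the rows of `emission` toward uniform probabilities would make the states harder to tell apart from the observations alone, which is the kind of condition the simulation factors above are designed to vary.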