Revista Colombiana de Estadística

Print version ISSN 0120-1751

Rev. Colomb. Estad. vol. 43 no. 1, Bogotá, Jan./June 2020. Epub June 05, 2020

https://doi.org/10.15446/rce.v43n1.77542 

ORIGINAL RESEARCH ARTICLES

Generalized Poisson Hidden Markov Model for Overdispersed or Underdispersed Count Data

Modelo oculto de Markov de Poisson generalizado para datos de recuento sobredispersados o subdispersos

Sebastian George (a)

Ambily Jose (b)

(a) Department of Statistics, St. Thomas College, Pala, India. PhD. E-mail: sthottom@gmail.com

(b) Department of Statistics, St. Thomas College, Pala, India. Research Scholar. E-mail: ambilystat06@gmail.com


Abstract

Hidden Markov Models (HMMs) are among the most suitable statistical methods for explaining serial dependence in time series of counts. These models assume that the observations are generated from a finite mixture of distributions, with the active component governed by a Markov chain (MC). The Poisson Hidden Markov Model (P-HMM) is perhaps the most widely used model for such data. In real-life scenarios, however, it is often not the best choice, because the Poisson distribution forces the variance to equal the mean. We therefore use the Generalized Poisson Distribution (GPD) to model count data, since it can accommodate both the overdispersion and the underdispersion that the Poisson model cannot. Combining the GPD with an HMM, we develop the Generalized Poisson Hidden Markov Model (GP-HMM). Results on simulated data and on a real data set, monthly cases of leptospirosis in the state of Kerala in South India, show good convergence properties and indicate that the GP-HMM performs better than the P-HMM.

Key words: EM algorithm; Generalized Poisson distribution; Hidden Markov Model; Overdispersion
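
The idea summarized in the abstract can be illustrated with a minimal sketch. Since the full text is available only as a PDF, the code below assumes the Consul and Jain (1973) parameterization of the GPD, P(X = x) = θ(θ + xλ)^(x-1) exp(-θ - xλ) / x!, with mean θ/(1 - λ) and variance θ/(1 - λ)^3, so that λ > 0 gives overdispersion, λ < 0 underdispersion, and λ = 0 the ordinary Poisson. The function names (`gpd_log_pmf`, `gpd_pmf`, `gp_hmm_log_likelihood`) and the example parameter values are hypothetical and are not taken from the paper. The likelihood is computed with the standard scaled forward recursion for HMMs; the paper's own EM estimation procedure is not reproduced here.

```python
import math


def gpd_log_pmf(x, theta, lam):
    """Log pmf of the generalized Poisson distribution (assumed
    Consul-Jain form). Returns -inf outside the support, which
    also truncates the distribution when lam < 0."""
    base = theta + x * lam
    if x < 0 or theta <= 0 or base <= 0:
        return -math.inf
    return (math.log(theta) + (x - 1) * math.log(base)
            - base - math.lgamma(x + 1))


def gpd_pmf(x, theta, lam):
    """Pmf of the generalized Poisson distribution."""
    return math.exp(gpd_log_pmf(x, theta, lam))


def gp_hmm_log_likelihood(obs, init, trans, thetas, lams):
    """Log-likelihood of a count series under an m-state HMM with
    generalized Poisson state-dependent distributions, using the
    scaled forward recursion.

    obs    : sequence of non-negative integer counts
    init   : initial state probabilities, length m
    trans  : m x m transition matrix, trans[i][j] = P(j | i)
    thetas : per-state theta parameters (assumed names)
    lams   : per-state lambda (dispersion) parameters
    """
    m = len(init)
    log_lik = 0.0
    alpha = None
    for t, x in enumerate(obs):
        emit = [gpd_pmf(x, thetas[j], lams[j]) for j in range(m)]
        if t == 0:
            alpha = [init[j] * emit[j] for j in range(m)]
        else:
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(m)) * emit[j]
                for j in range(m)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return -math.inf  # observation impossible in every state
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_lik


if __name__ == "__main__":
    # Illustrative two-state model: a low-count, roughly Poisson state
    # and a high-count, overdispersed state (values are made up).
    counts = [0, 2, 1, 9, 14, 11, 3, 1]
    ll = gp_hmm_log_likelihood(
        counts,
        init=[0.5, 0.5],
        trans=[[0.9, 0.1], [0.2, 0.8]],
        thetas=[1.5, 6.0],
        lams=[0.0, 0.4],
    )
    print("log-likelihood:", ll)
```

Setting every entry of `lams` to zero reduces this sketch to the P-HMM likelihood, which makes the two models directly comparable on the same series, for example through a likelihood-based criterion.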

Resumen

El método estadístico más adecuado para explicar la dependencia serial en los datos de recuento de series de tiempo es el basado en los modelos ocultos de Markov (HMM). Estos modelos suponen que las observaciones se generan a partir de una mezcla finita de distribuciones regidas por el principio de la cadena de Markov (MC). El modelo oculto de Markov de Poisson (P-HMM) puede ser el método más utilizado para modelar las situaciones mencionadas anteriormente. Sin embargo, en el escenario de la vida real, este modelo no puede considerarse como la mejor opción. Teniendo en cuenta este hecho, nosotros, en este artículo, apostamos por la distribución generalizada de Poisson (GPD) para modelar datos de conteo. Este método puede rectificar la sobredispersión y subdispersión en el modelo de Poisson. Aquí desarrollamos el modelo oculto de Markov de Poisson generalizado (GP-HMM) combinando GPD con HMM para modelar tales datos. Los resultados del estudio sobre datos simulados y una aplicación a datos reales, casos mensuales de leptospirosis en el estado de Kerala en el sur de la India, muestran buenas propiedades de convergencia, lo que demuestra que el GP-HMM es un método mejor en comparación con el P-HMM.

Palabras clave: Algoritmo EM; Distribución generalizada de Poisson; Modelo oculto de Markov; Sobredispersión

Full text available only in PDF format.

This is an open-access article distributed under the terms of the Creative Commons Attribution License.