Extreme Value Theory Applied to r Largest Order Statistics Under the Bayesian Approach

Santos da Silva, Renato; Ferraz do Nascimento, Fernando

doi:10.15446/rce.v42n2.70271

Revista Colombiana de Estadística

Print version ISSN 0120-1751

Rev.Colomb.Estad. vol.42 no.2 Bogotá July/Dec. 2019

Original research articles

Extreme Value Theory Applied to r Largest Order Statistics Under the Bayesian Approach

Teoría de valores extremos aplicada a las r estadísticas de orden superior desde el punto de vista bayesiano

Renato Santos da Silva^a

Fernando Ferraz do Nascimento^b

^a Instituto de Matemática e Estatística, Universidade de São Paulo, São Paulo, Brazil. E-mail: renatoifpi@gmail.com

^b Departamento de Estatística, Universidade Federal do Piauí, Teresina, Brazil. E-mail: ernandofn@ufpi.edu.br

Abstract

Extreme value theory (EVT) is an important tool for predicting efficient gains and losses in economic and environmental domains. EVT was initially developed for use with normal and gamma parametric distribution patterns. However, economic and environmental data are heavy-tailed in most cases, in contrast with these patterns, so framing extreme events with EVT has presented great difficulties. Furthermore, it is nearly impossible to use conventional models to make predictions about unobserved events that exceed the largest observed value. In some situations, EVT is used to analyze only the maximum values of a given dataset, which provides few observations. In such cases, it is more effective to use the r largest order statistics. This study proposes Bayesian estimators for the parameters of the r largest order statistics model. We use Monte Carlo simulation to analyze the experimental data and to study estimator properties such as the mean, confidence interval, credible interval, bias, and root mean square error (RMSE); the estimation provides inferences for these parameters and for return levels. In addition, this study proposes a procedure for selecting the optimal r for the r largest order statistics, based on the Bayesian approach and the Markov chain Monte Carlo (MCMC) method. Simulation results show that the Bayesian approach performed similarly to maximum likelihood estimation. Finally, the applications developed using the Bayesian approach showed a gain in accuracy compared with other estimators.

Key words: Markov chain Monte Carlo; Extreme value; Bayesian inference
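
Since the full text is available only as a PDF, the following Python sketch is offered only as an illustration of the kind of procedure the abstract describes, not as the authors' implementation. It uses the standard joint density of the r largest order statistics with location mu, scale sigma, and shape xi, and samples the posterior with a random-walk Metropolis algorithm. The vague normal priors, the proposal step sizes, the toy exponential data, and all function names are assumptions introduced here.

```python
import numpy as np


def gevr_loglik(theta, x):
    """Log-likelihood of the r largest order statistics model.

    theta = (mu, log_sigma, xi); x has shape (n_blocks, r) with each row
    sorted in decreasing order, so x[:, -1] is the r-th largest value.
    """
    mu, log_sigma, xi = theta
    z = (x - mu) / np.exp(log_sigma)
    if abs(xi) < 1e-8:
        # Gumbel limit of the density as xi -> 0
        return np.sum(-np.exp(-z[:, -1])) - np.sum(z) - x.size * log_sigma
    t = 1.0 + xi * z
    if np.any(t <= 0):
        return -np.inf  # at least one observation is outside the support
    return (np.sum(-t[:, -1] ** (-1.0 / xi))
            - (1.0 / xi + 1.0) * np.sum(np.log(t))
            - x.size * log_sigma)


def log_posterior(theta, x):
    """Log-posterior under assumed vague normal priors (illustrative only)."""
    mu, log_sigma, xi = theta
    log_prior = (-0.5 * (mu / 100.0) ** 2
                 - 0.5 * (log_sigma / 10.0) ** 2
                 - 0.5 * xi ** 2)
    return gevr_loglik(theta, x) + log_prior


def metropolis(x, n_iter=5000, step=(0.1, 0.1, 0.05), seed=0):
    """Random-walk Metropolis draws of (mu, log_sigma, xi)."""
    rng = np.random.default_rng(seed)
    # Start at xi = 0 so that the initial log-posterior is always finite.
    theta = np.array([np.median(x[:, 0]), np.log(np.std(x[:, 0])), 0.0])
    current = log_posterior(theta, x)
    draws = np.empty((n_iter, 3))
    for i in range(n_iter):
        proposal = theta + rng.normal(0.0, step, size=3)
        candidate = log_posterior(proposal, x)
        if np.log(rng.uniform()) < candidate - current:
            theta, current = proposal, candidate
        draws[i] = theta
    return draws


def return_level(draws, p):
    """Level exceeded by a block maximum with probability p, per draw."""
    mu, sigma, xi = draws[:, 0], np.exp(draws[:, 1]), draws[:, 2]
    y = -np.log1p(-p)
    small = np.abs(xi) < 1e-8
    safe_xi = np.where(small, 1.0, xi)  # avoids division by zero
    gev = mu - sigma / safe_xi * (1.0 - y ** (-safe_xi))
    gumbel = mu - sigma * np.log(y)
    return np.where(small, gumbel, gev)


def simulate_top_r(n_blocks, block_size, r, rng):
    """Toy data: the r largest values in each block of exponential draws."""
    raw = rng.exponential(size=(n_blocks, block_size))
    return -np.sort(-raw, axis=1)[:, :r]


rng = np.random.default_rng(42)
x = simulate_top_r(n_blocks=50, block_size=365, r=3, rng=rng)
draws = metropolis(x)[1000:]  # discard the first 1000 draws as burn-in

print("posterior means (mu, sigma, xi):",
      draws[:, 0].mean(), np.exp(draws[:, 1]).mean(), draws[:, 2].mean())
print("100-block return level, 2.5/50/97.5% quantiles:",
      np.percentile(return_level(draws, 0.01), [2.5, 50.0, 97.5]))
```

Sampling the scale on the log scale keeps sigma positive without rejecting proposals. The posterior quantiles of the return level serve as a credible interval, the quantity the abstract contrasts with the confidence interval from maximum likelihood. Repeating the fit for several values of r on the same data is one simple way to explore how the choice of r affects the estimates, although the selection procedure proposed in the paper is not reproduced here.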

Resumen

La teoría de valores extremos (EVT) es una herramienta importante para predecir ganancias y pérdidas eficientes en ámbitos económicos y ambientales. La EVT se desarrolló inicialmente para uso con patrones de distribución paramétricos normales y gamma. Sin embargo, los datos económicos y ambientales presentan una distribución de cola pesada en la mayoría de los casos, lo que contrasta con los patrones anteriores. Así, la formulación de eventos extremos con EVT presenta grandes dificultades. Además, es casi imposible usar modelos convencionales para obtener predicciones sobre eventos no observados que excedan el mayor valor observado. En algunas situaciones, la EVT se utiliza para analizar solamente los valores máximos de un conjunto de datos dado, que proporcionan pocas observaciones. En tales casos, es más eficiente usar las r estadísticas de orden superior. Este trabajo propone estimadores bayesianos para los parámetros del modelo de las r estadísticas de orden superior. Utilizamos simulaciones de Monte Carlo para analizar los datos experimentales y observar ciertas propiedades del estimador, como media, intervalos de confianza y de credibilidad, sesgo y raíz del error cuadrático medio (RMSE). Este tipo de estimación proporciona inferencias para estos parámetros y para los niveles de retorno. También proponemos un procedimiento para seleccionar el r óptimo de las r estadísticas de orden superior, basado en el enfoque bayesiano y en el método de Monte Carlo para cadenas de Markov (MCMC). Los resultados de la simulación muestran que el enfoque bayesiano produce un rendimiento similar al de la estimación de máxima verosimilitud. Finalmente, las aplicaciones desarrolladas utilizando el enfoque bayesiano mostraron una ganancia en la precisión en comparación con otros estimadores.

Palabras clave: Monte Carlo para cadena de Markov; Valores extremos; Inferencia bayesiana

Full text available only in PDF format.

This is an open-access article distributed under the terms of the Creative Commons Attribution License