Using Copula Functions to Estimate The AUC for Two Dependent Diagnostic Tests

Bravo Melo, Luis Carlos; Portilla, Jennyfer; Tovar Cuevas, Jose Rafael

doi:10.15446/rce.v43n2.80288

Revista Colombiana de Estadística

Print version ISSN 0120-1751

Rev.Colomb.Estad. vol.43 no.2 Bogotá July/Dec. 2020 Epub Dec 05, 2020

Original articles of research

Uso de funciones cópula para estimar el área bajo la curva característica de operación para dos pruebas de diagnóstico dependientes

Luis Carlos Bravo Melo¹^a

Jennyfer Portilla¹^b

Jose Rafael Tovar Cuevas¹^c

¹Statistics School, Universidad del Valle, Cali, Colombia

Abstract

When performing validation studies on diagnostic classification procedures, one or more biomarkers are typically measured in individuals. Some biomarkers may provide better information than others; moreover, more than one biomarker may be significant, and the biomarkers may be dependent on one another. This proposal estimates the Area Under the Receiver Operating Characteristic Curve (AUC) for classifying individuals in a screening study. We model the dependence between the test results with copula functions (the FGM and Gumbel-Barnett copulas) and study the respective AUC under each type of dependence. Three dependence levels were evaluated for each copula function considered. In most of the reviewed literature, the authors assume a normal model to represent the performance of the biomarkers used for clinical diagnosis. There are situations in which normality cannot be assumed because that model is not suitable for one or both biomarkers. The proposed statistical model does not depend on any distributional assumption for the biomarkers used in the diagnostic procedure and, additionally, does not require a strong or moderate linear dependence between them.

Key words: AUC; Copula function; FGM copula; Gumbel copula; ROC curve; Weak dependence
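The ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' estimator, since the full text is not reproduced here. The copula formulas are the standard FGM and Gumbel-Barnett forms. Everything else is an assumption made for illustration: the exponential marginals, the scale values, the dependence level `theta = 0.5`, the use of the sum of the two biomarkers as a combined score, and the helper names `sample_fgm` and `empirical_auc`. The AUC is computed with the nonparametric Mann-Whitney statistic.

```python
import numpy as np


def fgm_copula_cdf(u, v, theta):
    """FGM copula: C(u, v) = u v [1 + theta (1 - u)(1 - v)], -1 <= theta <= 1."""
    return u * v * (1.0 + theta * (1.0 - u) * (1.0 - v))


def gumbel_barnett_copula_cdf(u, v, theta):
    """Gumbel-Barnett copula: C(u, v) = u v exp(-theta ln(u) ln(v)), 0 <= theta <= 1."""
    return u * v * np.exp(-theta * np.log(u) * np.log(v))


def sample_fgm(n, theta, rng):
    """Draw n pairs (u, v) from the FGM copula by conditional inversion.

    Given u, the conditional CDF of v is v [1 + a (1 - v)] with
    a = theta (1 - 2u); setting it equal to w ~ U(0, 1) gives a quadratic
    in v, whose root in [0, 1] is written here in a numerically stable form.
    """
    u = rng.random(n)
    w = rng.random(n)
    a = theta * (1.0 - 2.0 * u)
    b = 1.0 + a
    v = 2.0 * w / (b + np.sqrt(b * b - 4.0 * a * w))
    return u, v


def empirical_auc(healthy, diseased):
    """Mann-Whitney estimate of P(diseased > healthy); ties count as 1/2."""
    diff = diseased[:, None] - healthy[None, :]
    return float(np.mean(diff > 0) + 0.5 * np.mean(diff == 0))


rng = np.random.default_rng(0)
theta = 0.5  # illustrative dependence level (assumption)
n = 500      # illustrative sample size per group (assumption)

# Dependent uniform pairs for each group, both with the same FGM copula.
u0, v0 = sample_fgm(n, theta, rng)
u1, v1 = sample_fgm(n, theta, rng)

# Exponential marginals via the inverse CDF x = -scale * log(1 - u).
# Scales 1.0 (non-diseased) and 2.0 (diseased) are assumptions.
healthy = -1.0 * np.log1p(-u0) - 1.0 * np.log1p(-v0)
diseased = -2.0 * np.log1p(-u1) - 2.0 * np.log1p(-v1)

# AUC of the combined score (sum of the two biomarkers, an assumption).
auc = empirical_auc(healthy, diseased)
print(f"Empirical AUC of the combined score: {auc:.3f}")
```

The FGM family can only represent weak dependence (its Spearman correlation is `theta / 3`, so it never exceeds 1/3 in absolute value), and the Gumbel-Barnett family only represents negative dependence. A sampler for the Gumbel-Barnett copula is omitted because its conditional inversion has no closed form and would need a numerical root finder.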

Resumen (translated from the Spanish)

When validation studies are carried out on diagnostic classification procedures, one or more biomarkers are usually measured in individuals. Some biomarkers may provide better information than others, and in many cases more than one may be necessary. When several biomarkers are used for classification, dependence arises between them. In this work, the area under the receiver operating characteristic curve (AUC) is estimated to establish the classification ability of two biomarkers in a clinical diagnostic procedure. The dependence between tests is studied using copulas (FGM and Gumbel-Barnett), and the corresponding area under the curve is estimated, assuming three levels for each dependence structure. In the reviewed literature, the authors assume a normal model to represent the behavior of the biomarkers used for clinical diagnosis. There are situations in which this model cannot be assumed because it is not suitable for one or both biomarkers. The proposed statistical method does not depend on a distributional assumption for the biomarkers used in the diagnostic procedure, nor is it necessary to consider a strong or moderate linear dependence between them.

Key words: AUC; FGM copula; Gumbel-Barnett copula; ROC; Weak dependence

Full text available only in PDF format.

Acknowledgements

We thank the program "Development of applied research to contribute to an effective and sustainable model of dengue intervention in Santander, Casanare and Valle del Cauca" of the AEDES Knowledge and Cooperation Network (RedAedes), Comfandi, and the clinical and epidemiological experts who participated in the consultations. This work was partially supported by the Science, Technology and Innovation Fund (FCTeI) of SGR, Colombia (BPIN 2013000100011) and by the AEDES and Comfandi Network. The participation of the second author was partially financed by a scholarship from the Virginia Gutierrez de Pineda Program for Young Researchers and Innovators of the Administrative Department of Science, Technology and Innovation (Colciencias) in Colombia.

Received: January 2019; Accepted: April 2020

^aPostgraduate student. E-mail: bravo.luis@correounivalle.edu.co

^bPostgraduate student. E-mail: jennyfer.portilla@correounivalle.edu.co

^cPh.D. E-mail: jose.r.tovar@correounivalle.edu.co

This is an open-access article distributed under the terms of the Creative Commons Attribution License