The three main tasks of modern epidemiology: description, prediction and causal inference

Calvache, José A.; Tejera, César Higgins

doi:10.5554/22562087.e1088

Colombian Journal of Anesthesiology

Print version ISSN 0120-3347; On-line version ISSN 2256-2087

Rev. colomb. anestesiol. vol.51 no.4 Bogotá Oct./Dec. 2023 Epub Nov 11, 2023

https://doi.org/10.5554/22562087.e1088

Editorial

The three main tasks of modern epidemiology: description, prediction and causal inference

José A. Calvache^a^b^c
http://orcid.org/0000-0001-9421-3717

César Higgins Tejera^d^e
http://orcid.org/0000-0001-5999-3607

^{^a} Editor in Chief, Colombian Journal of Anesthesiology. Bogotá, Colombia.

^{^b} Department of Anesthesiology, Universidad del Cauca. Popayán, Colombia.

^{^c} Department of Anesthesiology, Erasmus University Medical Center. Rotterdam, The Netherlands.

^{^d} Epidemiology Master's Program, Universidad del Magdalena. Santa Marta, Colombia.

^{^e} Epidemiology Doctoral Program, University of Michigan. Michigan, United States.

The advancement of science depends, at least partially, on editorial processes and peer review. Despite multiple challenges and limitations, editorial and peer review processes continue to serve as quality filters for the improvement of scientific publications ¹⁻³. As editors and reviewers, we work to improve the integrity and completeness of the report and discuss the methodological and analytical issues that arise during the editorial process. At times, during peer review, we identify inconsistencies between the research question, the study design, and the methodology of the study. Such inconsistencies are critical elements that reviewers weigh when assessing the viability of scientific publications ⁴.

The research question is one of the most important aspects of the scientific process. It must be clearly defined, because it informs the objectives of the study, the choice of an appropriate design, and a clear plan for analysis. As 2022 came to an end, the statistical editors of the British Medical Journal (BMJ) were looking forward to a quiet and peaceful Christmas holiday; with that in mind, they urged authors to pay attention to "twelve potential problems" they commonly identified as reviewers ⁵. At the top of their list was to have "absolute clarity of the research question". The primary suggestion was to think carefully about the research question and be clear about the objectives of the study. This first step helps to characterize the study design (cross-sectional, longitudinal, etc.) and the measure of association (relative risk, odds ratio, prevalence ratio) to be estimated ⁵. Missteps in this early phase of the research process can hardly be resolved by methodological adjustments and may lead to misinterpretations of the study results.

Briefly, there are three principal areas of modern epidemiology and data science: description, prediction, and causal inference ⁶. In biomedical research, the objectives that stem from the research question must be ascribed to one of these categories. These objectives are later translated into: 1) the selection of the study sample, which is characterized by the study population, place, and time; 2) the health outcome to be studied; 3) the measures of occurrence used to describe the event (incidence, prevalence, survival); and 4) the selection of a set of covariates that may be confounders of the relationship under study ⁷. Given the current plethora of data, statistical software, and the advent of artificial intelligence, these considerations are more relevant than ever.

To illustrate these concepts, we provide some examples recently published in the Colombian Journal of Anesthesiology.

The area of description employs data to provide a quantitative assessment, or a graphic summary, of certain characteristics of the world. Descriptive tasks include, for example, calculating a proportion - cumulative incidence or prevalence - of patients with postoperative nausea and vomiting in a large hospital database or in a cohort study. Descriptive analyses range from basic summary calculations - the mean and other measures of central tendency - to highly elaborate figures and sophisticated data synthesis techniques. For example, in a cross-sectional study, Bocanegra et al. (2022) provide a very clear description of the frequency of legal claims (closed cases) filed against anesthetists between 2013 and 2019 ⁸. Given the nature of the study design, these results cannot be generalized beyond the sample population. Such limitations must be stated explicitly in the study in order to inform the interpretations derived from it. In general, researchers must have a clear idea of the extent to which the objectives of their research are merely descriptive or pursue other aims, because the objectives of the study must be reflected in the study design, the methodology, and the interpretation of the study results.
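As a minimal sketch of a purely descriptive task, the following Python snippet computes a proportion and a 95% Wilson score interval. The counts are hypothetical and are not taken from any of the studies cited here; the function name `proportion_with_wilson_ci` is likewise an assumption made for illustration.

```python
import math

def proportion_with_wilson_ci(events, n, z=1.96):
    """Return (proportion, lower, upper) using the Wilson score interval."""
    if n <= 0 or not 0 <= events <= n:
        raise ValueError("need n > 0 and 0 <= events <= n")
    p = events / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return p, centre - half_width, centre + half_width

# Hypothetical counts: 120 of 400 patients with postoperative nausea and vomiting.
p, lo, hi = proportion_with_wilson_ci(120, 400)
print(f"proportion = {p:.3f}, 95% CI {lo:.3f} to {hi:.3f}")
```

The output describes the sample at hand; it says nothing about why the event occurs or in whom it will occur next, which are the domains of causal inference and prediction, respectively.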

Prediction consists of using data to "map" certain characteristics of the world - inputs, or predictive variables - with other world characteristics - outputs or outcomes ⁶. Prediction usually begins with simple tasks, such as quantifying the association between midazolam premedication in children and the incidence of early postoperative delirium ⁹; and advances towards more complex tasks such as using multiple variables measured upon enrollment of patients undergoing cesarean section in order to "predict" which patients have a higher probability of developing postoperative nausea and vomiting ¹⁰. Predictive analyses range from simple calculations (e.g., incidence or risk difference) to more sophisticated modeling methods such as predictive and supervised learning algorithms ⁶. Questions related to "prediction or prognosis" are classified by the PRoGnosis RESearch Strategy (PROGRESS) group ¹¹ into four distinct types: 1) the ones that study the course of health-related conditions, or prognostic research; 2) the ones that study specific prognostic-related factors (biomarkers or others), or prognostic factors research; 3) the ones that study the development, validation and determination of the impact of statistical models on individuals' disease risk and their future health outcomes, or prognostic models research; and 4) the ones that employ prognostic information for targeted individualized treatment decisions ¹¹.

A common characteristic of predictive models is that the concept of "confounding bias" can become secondary because the primary focus of these models is not to establish "causal relationships" ¹¹. However, advances in software development have enabled the integration of supervised learning models (like the Super Learner) as essential tools for estimating parameters of causal inference ¹²,¹³. The integration of these two areas holds significant promise for the advancement of epidemiological research in the 21st century.

Causal inference - defined by some authors as counterfactual prediction - uses data to predict certain features of the world, had the world been different; a journey back in time to change "something" in the past and observe what would have happened ⁶. The main aim of causal inference is to explain how the world works, and what would happen if we changed something in the world today. A widely known example of causal inference is the randomized controlled clinical trial. In these studies, the random assignment of the intervention creates a counterfactual scenario where comparison groups are similar in terms of known and unknown characteristics that could influence the outcomes of the study. In a clinical trial, Casas-Arroyave et al. (2019) compared the use of a closed-loop system for the administration of total intravenous anesthesia versus administration using a target-controlled infusion (TCI) ¹⁴. Many factors can influence the main outcome of this study, namely the performance of the system in terms of the depth of anesthesia, which is quantified using the bispectral index (BIS). However, those "factors" or confounding variables were controlled, in principle, by the methodological design of the experiment and the randomized assignment of the treatment. This strategy allows us to recreate a "journey back in time". In this journey, the same group of patients would have been subjected to the anesthetic procedure using TCI and assessed in terms of the health outcomes; later, the same group of subjects could "travel back in time" and be subjected to the closed-loop strategy. In the real world, we are only able to observe one of those potential outcomes; for this reason, causal inference is often framed as a missing data problem. In ideal conditions, the control group is used to assess what would have happened had the subjects in the study not been subjected to the study intervention, and this is what is meant by counterfactual prediction. This counterfactual reasoning represents the paradigm of epidemiological studies that employ causal inference, such as randomized trials ¹⁵,¹⁶.
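The "missing data" view of causal inference can be illustrated with a small simulation. The sketch below is an assumption-laden toy example, not a reanalysis of the trial cited above: every simulated patient has two potential outcomes (a hypothetical score under control and under treatment), only one of which is observed after random assignment. The effect size, the outcome distribution, and the function name `simulate_trial` are all invented for illustration.

```python
import random

def simulate_trial(n=20_000, true_effect=-5.0, seed=1):
    """Simulate potential outcomes and a 1:1 randomized assignment."""
    rng = random.Random(seed)
    individual_effects = []
    treated_obs, control_obs = [], []
    for _ in range(n):
        y0 = rng.gauss(50, 10)    # potential outcome under control, Y(0)
        y1 = y0 + true_effect     # potential outcome under treatment, Y(1)
        individual_effects.append(y1 - y0)   # never observable in practice
        if rng.random() < 0.5:
            treated_obs.append(y1)    # Y(0) is missing for this patient
        else:
            control_obs.append(y0)    # Y(1) is missing for this patient
    ate_true = sum(individual_effects) / n
    ate_est = (sum(treated_obs) / len(treated_obs)
               - sum(control_obs) / len(control_obs))
    return ate_true, ate_est

ate_true, ate_est = simulate_trial()
print(f"true average effect = {ate_true:.2f}, trial estimate = {ate_est:.2f}")
```

Because assignment is random, the control group stands in for the unobserved outcomes of the treated group, and the simple difference in means approximates the average of the individual effects that no real study can observe directly.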

The application of causal inference techniques in observational studies requires assumptions beyond those used in randomized trials ¹⁷⁻¹⁹. In some cases, the inherent limitations of observational studies (reverse causality, confounding bias) preclude the use of causal language when it comes to reporting and interpreting study results ²⁰. Causality is a complex phenomenon that depends not only on the information available in the data; it also requires external information, pre-existing knowledge, and the use of causal models that can be illustrated in the form of Directed Acyclic Graphs (DAGs). These graphs represent underlying premises, assumptions, and theoretical concepts, and may provide guidance in the selection of confounding variables in regression models ²¹,²². Although some studies in the area of perioperative and intensive care ²³⁻²⁵ have approached causal inference using DAGs, the use of these methods in such disciplines is still infrequent ²⁶, and even more so in epidemiologic studies in Latin America. Therefore, this editorial is a call to study the counterfactual paradigm, and to implement causal inference methods in future epidemiological studies, with the aim of advancing national and Latin American scientific production toward the epidemiology of the 21st century.

It is worth highlighting that causal inference techniques are not reserved for experimental trials; observational studies can also provide evidence regarding the "causal effects of interventions" in cases in which a randomized trial is not feasible, ethical, or appropriate. However, making causal inferences from observational data is challenging due to confounding and selection biases, as well as other threats to the internal validity of observational studies. Nevertheless, certain strategies such as Target Trial Emulation are becoming more accessible for answering causal questions in observational studies ²⁷,²⁸.
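To make the role of confounding concrete, the following sketch simulates an observational cohort under a simple assumed DAG in which a binary confounder C affects both the exposure A and the outcome Y (C → A, C → Y, A → Y). All probabilities, the true risk difference of 0.05, and the function names are hypothetical choices for illustration only. The crude comparison mixes the effect of A with that of C, whereas standardizing over the strata of C, the adjustment set suggested by this DAG, approximately recovers the assumed effect.

```python
import random

def simulate_cohort(n=200_000, seed=7):
    """Simulate (confounder, exposure, outcome) under C -> A, C -> Y, A -> Y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        c = rng.random() < 0.4                       # confounder present?
        a = rng.random() < (0.7 if c else 0.2)       # exposure depends on C
        outcome_risk = 0.10 + 0.20 * c + 0.05 * a    # assumed true risk difference for A: 0.05
        y = rng.random() < outcome_risk
        rows.append((c, a, y))
    return rows

def risk(rows, exposed, stratum=None):
    """Risk of the outcome among exposed/unexposed, optionally within one stratum of C."""
    outcomes = [y for c, a, y in rows
                if a == exposed and (stratum is None or c == stratum)]
    return sum(outcomes) / len(outcomes)

def crude_risk_difference(rows):
    return risk(rows, True) - risk(rows, False)

def standardized_risk_difference(rows):
    """Average the stratum-specific risk differences, weighted by the distribution of C."""
    total = len(rows)
    rd = 0.0
    for stratum in (False, True):
        weight = sum(1 for c, _, _ in rows if c == stratum) / total
        rd += weight * (risk(rows, True, stratum) - risk(rows, False, stratum))
    return rd

rows = simulate_cohort()
crude = crude_risk_difference(rows)
adjusted = standardized_risk_difference(rows)
print(f"crude RD = {crude:.3f}, standardized RD = {adjusted:.3f}")
```

The adjustment works here only because the simulated data were generated from the assumed graph; in real observational data, the choice of adjustment set rests on subject-matter knowledge encoded in the DAG, not on the data alone.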

Finally, it is worth mentioning that the appropriate and clear selection of a research question leads to the correct interpretation of the study results. Clearly defining the aim of the study - including the meaning of terms such as "risk factors" or "predictive factors" - is essential for the correct and transparent interpretation of the study results ²⁹, and for avoiding causal misinterpretations or clinically irrelevant recommendations ⁵. It is not uncommon to find cases in which wrong causal interpretations are made on the basis of descriptive studies with obvious biases. We also believe that it is wrong not to call things by their name, and by what they seek to accomplish; if we set out to study causality and we use the appropriate methods to do so, we should not be afraid to use the word cause during the research process ³⁰,³¹. Straightforward questions and objectives help dispel the classical confusion between association and causality, a widely discussed conundrum in epidemiology, and a persistent topic in the scientific literature.

REFERENCES

1. Vilaró M, Cortés J, Selva-O'Callaghan A, Urrutia A, Ribera J-M, Cardellach F, et al. Adherence to reporting guidelines increases the number of citations: the argument for including a methodologist in the editorial process and peer-review. BMC Medical Research Methodology. 2019;19:1-7. doi: https://doi.org/10.1186/s12874-019-0746-4.

2. Smith R. Peer review: a flawed process at the heart of science and journals. J Royal Soc Med. 2006;99:178-82. doi: https://doi.org/10.1177/014107680609900414.

3. Henderson M. Problems with peer review. BMJ. 2010;340. doi: https://doi.org/10.1136/bmj.c1409.

4. Calvache JA. Enhancing the value of research reports: time for complete reporting. Colombian Journal of Anesthesiology. 2019;47:209-10. doi: http://dx.doi.org/10.1097/CJ9.0000000000000134.

5. Riley RD, Cole TJ, Deeks J, Kirkham JJ, Morris J, Perera R, et al. On the 12th Day of Christmas, a Statistician Sent to Me... BMJ (Clinical Research Ed). 2022;379:e072883. doi: https://doi.org/10.1136/bmj-2022-072883.

6. Hernán MA, Hsu J, Healy B. A second chance to get causal inference right: A classification of data science tasks. CHANCE. 2019;32:42-9. doi: https://doi.org/10.1080/09332480.2019.1579578.

7. Lesko CR, Fox MP, Edwards JK. A framework for descriptive epidemiology. Am J Epidemiol. 2022;191:2063-70. doi: https://doi.org/10.1093/aje/kwac115.

8. Bocanegra Rivera JC, Gómez Buitrago LM, Sánchez Bello NF, Chaves Vega A. Adverse events in anesthesia: Analysis of claims against anesthetists affiliated to an insurance fund in Colombia. Cross-sectional study. Colombian Journal of Anesthesiology. 2023;51:e1043. doi: https://doi.org/10.5554/22562087.e1043.

9. González Cárdenas VH, Benítez Ávila DS, Gómez Barajas WJ, Tamayo Reina MA, Pinzón Villazón IL, Cuervo Pulgarín JL, et al. Premedication with midazolam in low-risk surgery in children does not reduce the incidence of postoperative delirium. Cohort study. Colombian Journal of Anesthesiology. 2023;51:e1055. doi: https://doi.org/10.5554/22562087.e1055.

10. Peña MDL, Giraldo OL, Aguirre DC, Peña AJDL, Arango JJ, Martínez R. Prognostic predictive model for PONV in cesarean delivery. Colombian Journal of Anesthesiology. 2023;51:e1077. doi: https://doi.org/10.5554/22562087.e1077.

11. Hemingway H, Croft P, Perel P, Hayden JA, Abrams K, Timmis A, et al. Prognosis research strategy (PROGRESS) 1: A framework for researching clinical outcomes. BMJ. 2013;346. doi: https://doi.org/10.1136/bmj.e5595.

12. van der Laan MJ, Rose S. Targeted learning. Springer New York; 2011. doi: https://doi.org/10.1007/978-1-4419-9782-1.

13. Balzer LB, Petersen ML. Invited commentary: Machine learning in causal inference - how do I love thee? Let me count the ways. Am J Epidemiol. 2021;190:1483-7. doi: https://doi.org/10.1093/aje/kwab048.

14. Casas-Arroyave FD, Fernández JM, Zuleta-Tobón JJ. Evaluation of a closed-loop intravenous total anesthesia delivery system with BIS monitoring compared to an open-loop target-controlled infusion (TCI) system: randomized controlled clinical trial. Colombian Journal of Anesthesiology. 2019;47:84-91. doi: http://dx.doi.org/10.1097/CJ9.0000000000000110.

15. Hernán MA, Robins JM. Causal inference: What if. Boca Raton: Chapman & Hall/CRC; 2020.

16. Vandenbroucke JP, Broadbent A, Pearce N. Causality and causal inference in epidemiology: the need for a pluralistic approach. Int J Epidemiol. 2016;45:1776-86. doi: https://doi.org/10.1093/ije/dyv341.

17. Hernán MA, Hernández-Díaz S, Robins JM. A structural approach to selection bias. Epidemiology. 2004;15:615-25. doi: https://doi.org/10.1097/01.ede.0000135174.63482.43.

18. Morabia A. History of the modern epidemiological concept of confounding. J Epidemiol Community Health. 2011;65:297-300. doi: https://doi.org/10.1136/jech.2010.112565.

19. Schisterman EF, Cole SR, Platt RW. Overadjustment bias and unnecessary adjustment in epidemiologic studies. Epidemiology. 2009;20:488-95. doi: https://doi.org/10.1097/ede.0b013e3181a819a1.

20. Haber NA, Wieten SE, Rohrer JM, Arah OA, Tennant PWG, Stuart EA, et al. Causal and associational language in observational health research: A systematic evaluation. Am J Epidemiol. 2022;191:2084-97. doi: https://doi.org/10.1093/aje/kwac137.

21. Greenland S, Pearl J, Robins JM. Causal diagrams for epidemiologic research. Epidemiology. 1999;10(1):37-48.

22. Lipsky AM. Causal directed acyclic graphs. JAMA. 2022;327:1083-4. doi: https://doi.org/10.1001/jama.2022.1816.

23. Lederer DJ, Bell SC, Branson RD, Chalmers JD, Marshall R, Maslove DM, et al. Control of confounding and reporting of results in causal inference studies. Guidance for Authors from editors of respiratory, sleep, and critical care journals. Ann Am Thoracic Soc. 2019;16:22-8. doi: https://doi.org/10.1513/annalsats.201808-564ps.

24. Krishnamoorthy V, Wong DJN, Wilson M, Raghunathan K, Ohnuma T, McLean D, et al. Causal inference in perioperative medicine observational research: part 1, a graphical introduction. Br J Anaesth. 2020;125:393-7. doi: https://doi.org/10.1016/j.bja.2020.03.031.

25. Gaskell A, Sleigh J. An Introduction to causal diagrams for anesthesiology research. Anesthesiology. 2020;132:951-67. doi: https://doi.org/10.1097/ALN.0000000000003193.

26. Tennant PWG, Murray EJ, Arnold KF, Berrie L, Fox MP, Gadd SC, et al. Use of directed acyclic graphs (DAGs) to identify confounders in applied health research: review and recommendations. Int J Epidemiol. 2021;50(2):620-32. doi: https://doi.org/10.1093/ije/dyaa213.

27. Hernán MA, Robins JM. Using big data to emulate a target trial when a randomized trial is not available. Am J Epidemiol. 2016;183:758-64. doi: https://doi.org/10.1093/aje/kwv254.

28. Hansford HJ, Cashin AG, Jones MD, Swanson S, Islam N, Dahabreh I, et al. Development of the transparent reporting of observational studies emulating a target trial (TARGET) guideline. BMJ Open. 2023;13:e074626. doi: https://doi.org/10.1136/bmjopen-2023-074626.

29. Huitfeldt A. Is caviar a risk factor for being a millionaire? BMJ. 2016;355. doi: https://doi.org/10.1136/bmj.i6536.

30. Hernán MA. The C-Word: scientific euphemisms do not improve causal inference from observational data. Am J Public Health. 2018;108:616-9. doi: https://doi.org/10.2105/AJPH.2018.304337.

31. Vickers AJ, Assel M, Dunn RL, Zabor EC, Kattan MW, van Smeden M, et al. Guidelines for reporting observational research in urology: the importance of clear reference to causality. BJU International. 2023;132:4-8. doi: https://doi.org/10.1111/bju.16028.

Funding: The authors declare that they received no funding for the preparation of this article.

How to cite this article: Calvache JA, Higgins Tejera C. The three main tasks of modern epidemiology: description, prediction and causal inference. Colombian Journal of Anesthesiology. 2023;51:e1088.

Received: September 19, 2023; Accepted: September 20, 2023; other: October 09, 2023

^{Correspondence:} Sociedad Colombiana de Anestesiología y Reanimación (S.C.A.R.E.), Cra 15a No. 120 - 74. Bogotá, Colombia. E-mail: jacalvache@unicauca.edu.co

^{Conflicts of interest}

The authors declare having no conflict of interest to disclose. JAC is the Editor-in-Chief for the Colombian Journal of Anesthesiology.

This is an open-access article distributed under the terms of the Creative Commons Attribution License