The Table 2 Fallacy and Overfitting: A Persistent Problem in Contemporary Research?
DOI:
https://doi.org/10.6000/1929-6029.2025.14.61
Keywords:
Bias, Confounding Factors, Epidemiologic, Mediation Analysis, Causality
Abstract
The "Table 2 fallacy" is a common methodological error in medical research, characterized by indiscriminate statistical adjustment for multiple variables without considering their causal role. This article examines the theoretical foundations of the problem, distinguishing between studies with descriptive, predictive, and explanatory objectives, and emphasizing that the research purpose should determine the adjustment strategy. We highlight the fundamental role of Directed Acyclic Graphs (DAGs) in correctly identifying confounders, mediators, and colliders, thus avoiding overadjustment and the resulting biases. To illustrate these considerations, we present two practical examples: the relationship between obesity and colorectal cancer, and between coffee consumption and breast cancer. In the first case, we demonstrate how adjustment for intestinal dysbiosis (a mediator) can attenuate the association between obesity and colorectal cancer, reducing the adjusted relative risk from 1.78 (95% CI: 1.20–2.65) to 1.49 (95% CI: 0.97–2.29), so that the association is no longer statistically significant (p=0.072). In the second example, we show how including insomnia (a collider) in the model can create an artificial association between coffee consumption and breast cancer, raising the adjusted relative risk to 1.94 (95% CI: 1.34–2.81; p<0.001), when a correctly specified model shows no such association. We conclude that, in explanatory studies, it is essential to develop causal reasoning prior to statistical analysis, using DAGs to guide the selection of adjustment variables. This approach prevents both the dilution of real causal effects and the generation of spurious associations, increasing the internal validity of epidemiological findings and their utility for clinical decision-making.
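The two mechanisms described in the abstract can be sketched with exact probability tables for three binary variables. This is only an illustration: every probability below is a hypothetical value chosen for the sketch, not data from the article, and the helper names (bern, risk, risk_ratio) are invented here. Stratifying on the third variable stands in for regression adjustment.

```python
from itertools import product


def bern(p, x):
    """Probability that a Bernoulli(p) variable takes the value x (0 or 1)."""
    return p if x == 1 else 1.0 - p


def risk(joint, exposure, stratum=None):
    """P(outcome = 1 | exposure), optionally also conditioning on the third variable.

    `joint` maps (exposure, third, outcome) tuples to exact joint probabilities.
    """
    num = den = 0.0
    for (e, t, y), p in joint.items():
        if e != exposure or (stratum is not None and t != stratum):
            continue
        den += p
        num += p * y
    return num / den


def risk_ratio(joint, stratum=None):
    """Risk ratio, exposed vs unexposed; crude if stratum is None."""
    return risk(joint, 1, stratum) / risk(joint, 0, stratum)


# Mediator structure: obesity -> dysbiosis -> cancer, plus a direct
# obesity -> cancer effect. All numbers are hypothetical assumptions.
mediator = {
    (ob, dys, ca): bern(0.3, ob)
    * bern(0.2 + 0.4 * ob, dys)
    * bern(0.02 * (1.3 if ob else 1.0) * (2.0 if dys else 1.0), ca)
    for ob, dys, ca in product((0, 1), repeat=3)
}

# Collider structure: coffee and cancer are independent, and both raise
# the probability of insomnia. All numbers are hypothetical assumptions.
collider = {
    (cof, ins, ca): bern(0.5, cof)
    * bern(0.1, ca)
    * bern(0.1 + 0.4 * cof + 0.4 * ca, ins)
    for cof, ins, ca in product((0, 1), repeat=3)
}

print("Mediator, crude RR (total effect):     ", round(risk_ratio(mediator), 2))
print("Mediator, RR within dysbiosis = 1:     ", round(risk_ratio(mediator, 1), 2))
print("Collider, crude RR (no true effect):   ", round(risk_ratio(collider), 2))
print("Collider, RR within insomnia = 1:      ", round(risk_ratio(collider, 1), 2))
```

Under these assumed values, conditioning on the mediator shrinks the risk ratio from about 1.73 (total effect) to 1.30 (direct effect only), while conditioning on the collider turns a null risk ratio of 1.00 into about 0.47. The direction and size of the collider-induced association depend entirely on the assumed parameters, so the spurious estimate can fall above or below 1; the point is only that it departs from the true null.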
License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.