- Matters Arising
arising from S. Vrizzi et al. Nature Mental Health https://doi.org/10.1038/s44220-025-00427-1 (2025)
Reliable findings are a cornerstone of scientific research, and emerging subfields in psychology and neuroscience often suffer large corrections [1,2] due to unreliable findings. Examining the reliability of methods in new subfields such as computational psychiatry is therefore essential. However, investigations into reliability have a catch. Because reliability is a measure of stability (in the case of test–retest reliability, stability over multiple assessments), any source of noise corrupts measurement and lowers reliability. Measures do not have good or bad reliability in themselves; rather, a specific measure, assessed using specific procedures in a specific population and analyzed using specific methods, has good or bad reliability. Establishing the reliability of a measure, or the lack of it, depends on this context. Claims about the reliability of a whole class of measures therefore need to take the methodological context into account. In addition, up-to-date research methods are needed to eliminate known sources of noise and to increase generalizability.
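The point that measurement noise lowers test–retest reliability can be illustrated with a small simulation. The sketch below is not taken from the article or from the study it comments on; it assumes a simple classical test theory setup in which each observed score is a stable true score plus independent session noise, and all function names and parameter values are hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def simulate_test_retest(noise_sd, n_participants=5000, true_sd=1.0, seed=0):
    """Simulated test-retest correlation for a given measurement-noise SD.

    Assumption: observed score = stable true score + independent noise,
    drawn separately for each of two sessions.
    """
    rng = random.Random(seed)
    true_scores = [rng.gauss(0.0, true_sd) for _ in range(n_participants)]
    session_1 = [t + rng.gauss(0.0, noise_sd) for t in true_scores]
    session_2 = [t + rng.gauss(0.0, noise_sd) for t in true_scores]
    return pearson_r(session_1, session_2)


def expected_reliability(noise_sd, true_sd=1.0):
    """Reliability under the same assumption: true variance / total variance."""
    return true_sd ** 2 / (true_sd ** 2 + noise_sd ** 2)


if __name__ == "__main__":
    # The same underlying trait looks more or less "reliable" depending
    # only on how much noise the measurement procedure adds.
    for noise_sd in (0.25, 0.5, 1.0, 2.0):
        simulated = simulate_test_retest(noise_sd)
        expected = expected_reliability(noise_sd)
        print(f"noise SD {noise_sd}: simulated r = {simulated:.2f}, "
              f"expected = {expected:.2f}")
```

In this toy setup the underlying trait is identical across runs; only the noise added by the measurement procedure changes, which is why a reliability estimate describes a measure in a particular methodological context rather than the measure in the abstract.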
References
1. Elliott, M. L. et al. What is the test–retest reliability of common task-functional MRI measures? New empirical evidence and a meta-analysis. Psychol. Sci. 31, 792–806 (2020).
2. Rodebaugh, T. L. et al. Unreliability as a threat to understanding psychopathology: the cautionary tale of attentional bias. J. Abnorm. Psychol. 125, 840–851 (2016).
3. Vrizzi, S. et al. Behavioral, computational and self-reported measures of reward and punishment sensitivity as predictors of mental health characteristics. Nat. Mental Health 3, 654–666 (2025).
4. Enkavi, A. Z. et al. Large-scale analysis of test-retest reliabilities of self-regulation measures. Proc. Natl Acad. Sci. USA 116, 5472–5477 (2019).
5. Chandler, J. & Shapiro, D. Conducting clinical research using crowdsourced convenience samples. Annu. Rev. Clin. Psychol. 12, 53–81 (2016).
6. Zorowitz, S., Solis, J., Niv, Y. & Bennett, D. Inattentive responding can induce spurious associations between task behaviour and symptom measures. Nat. Hum. Behav. 7, 1667–1681 (2023).
7. Brennan, C., Worrall-Davies, A., McMillan, D., Gilbody, S. & House, A. The Hospital Anxiety and Depression Scale: a diagnostic meta-analysis of case-finding ability. J. Psychosom. Res. 69, 371–378 (2010).
8. Pettersson, A., Boström, K. B., Gustavsson, P. & Ekselius, L. Which instruments to support diagnosis of depression have sufficient accuracy? A systematic review. Nord. J. Psychiatry 69, 497–508 (2015).
9. Clark, L. A. & Watson, D. Constructing validity: new developments in creating objective measuring instruments. Psychol. Assess. 31, 1412–1427 (2019).
10. Brown, V. M., Chen, J., Gillan, C. M. & Price, R. B. Improving the reliability of computational analyses: model-based planning and its relationship with compulsivity. Biol. Psychiatry Cogn. Neurosci. Neuroimaging 5, 601–609 (2020).
11. Haines, N., Sullivan-Toole, H. & Olino, T. From classical methods to generative models: tackling the unreliability of neuroscientific measures in mental health research. Biol. Psychiatry Cogn. Neurosci. Neuroimaging 8, 822–831 (2023).
12. Schurr, R., Reznik, D., Hillman, H., Bhui, R. & Gershman, S. J. Dynamic computational phenotyping of human cognition. Nat. Hum. Behav. 8, 917–931 (2024).
13. Mkrtchian, A., Valton, V. & Roiser, J. P. Reliability of decision-making and reinforcement learning computational parameters. Comput. Psychiatr. 7, 30–46 (2023).
14. Waltmann, M., Schlagenhauf, F. & Deserno, L. Sufficient reliability of the behavioral and computational readouts of a probabilistic reversal learning task. Behav. Res. Methods 54, 2993–3014 (2022).
15. Sullivan-Toole, H., Haines, N., Dale, K. & Olino, T. M. Enhancing the psychometric properties of the Iowa Gambling Task using full generative modeling. Comput. Psychiatr. 6, 189–212 (2022).
16. Zech, H. et al. Measuring self-regulation in everyday life: reliability and validity of smartphone-based experiments in alcohol use disorder. Behav. Res. Methods 55, 4329–4342 (2022).
17. Brown, V. M. et al. Reinforcement learning disruptions in individuals with depression and sensitivity to symptom change following cognitive behavioral therapy. JAMA Psychiatry 78, 1113–1122 (2021).
Ethics declarations
Competing interests
The author declares no competing interests.
About this article
Cite this article
Brown, V.M. A missed opportunity to examine reliability in computational psychiatry. Nat. Mental Health (2026). https://doi.org/10.1038/s44220-026-00662-0
DOI: https://doi.org/10.1038/s44220-026-00662-0
