References
-
Li, Y. et al. Prevalence and trends in diagnosed ADHD among US children and adolescents, 2017–2022. JAMA Netw. Open 6, e2336872 (2023).
-
American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders 5th edn (American Psychiatric Publishing, 2013).
-
Shaw, M. et al. A systematic review and analysis of long-term outcomes in attention deficit hyperactivity disorder: effects of treatment and non-treatment. BMC Med. 10, 99 (2012).
-
Hamed, A. M., Kauer, A. J. & Stevens, H. E. Why the diagnosis of attention deficit hyperactivity disorder matters. Front. Psychiatry 6, 167576 (2015).
-
McGoey, K. E., Eckert, T. L. & DuPaul, G. J. Early intervention for preschool-age children with ADHD: a literature review. J. Emot. Behav. Disord. 10, 14–28 (2002).
-
DuPaul, G. J., Kern, L., Gormley, M. J. & Volpe, R. J. Early intervention for young children with ADHD: academic outcomes for responders to behavioral treatment. School Ment. Health 3, 117–126 (2011).
-
Sonuga-Barke, E. J., Koerting, J., Smith, E., McCann, D. C. & Thompson, M. Early detection and intervention for attention-deficit/hyperactivity disorder. Expert Rev. Neurother. 11, 557–563 (2011).
-
Long, N. & Coats, H. The need for earlier recognition of attention deficit hyperactivity disorder in primary care: a qualitative meta-synthesis of the experience of receiving a diagnosis of ADHD in adulthood. Fam. Pract. 39, 1144–1155 (2022).
-
Shephard, E. et al. Systematic review and meta-analysis: the science of early-life precursors and interventions for attention-deficit/hyperactivity disorder. J. Am. Acad. Child Adolesc. Psychiatry 61, 187–226 (2022).
-
Foy, J. M. & Earls, M. F. A process for developing community consensus regarding the diagnosis and management of attention-deficit/hyperactivity disorder. Pediatrics 115, e97–e104 (2005).
-
Klein, R. G. et al. Clinical and functional outcome of childhood attention-deficit/hyperactivity disorder 33 years later. Arch. Gen. Psychiatry 69, 1295–1303 (2012).
-
Du Rietz, E. et al. Trajectories of healthcare utilization and costs of psychiatric and somatic multimorbidity in adults with childhood ADHD: a prospective register-based study. J. Child Psychol. Psychiatry 61, 959–968 (2020).
-
Rocco, I., Corso, B., Bonati, M. & Minicuci, N. Time of onset and/or diagnosis of ADHD in European children: a systematic review. BMC Psychiatry 21, 575 (2021).
-
Boulton, K. A. et al. Diagnostic delay in children with neurodevelopmental conditions attending a publicly funded developmental assessment service: findings from the Sydney Child Neurodevelopment Research Registry. BMJ Open 13, e069500 (2023).
-
Knott, R. et al. Age at diagnosis and diagnostic delay across attention-deficit hyperactivity and autism spectrums. Aust. N. Z. J. Psychiatry 58, 142–151 (2024).
-
Visser, S. N. et al. Trends in the parent-report of health care provider-diagnosed and medicated attention-deficit/hyperactivity disorder: United States, 2003–2011. J. Am. Acad. Child Adolesc. Psychiatry 53, 34–46.e2 (2014).
-
Murray, A. L. et al. Sex differences in ADHD trajectories across childhood and adolescence. Dev. Sci. 22, e12721 (2019).
-
Morgan, P. L., Hillemeier, M. M., Farkas, G. & Maczuga, S. Racial/ethnic disparities in ADHD diagnosis by kindergarten entry. J. Child Psychol. Psychiatry 55, 905–913 (2014).
-
Holland, J. & Sayal, K. Relative age and ADHD symptoms, diagnosis and medication: a systematic review. Eur. Child Adolesc. Psychiatry 28, 1417–1429 (2019).
-
Stevens, T., Peng, L. & Barnard-Brak, L. The comorbidity of ADHD in children diagnosed with autism spectrum disorder. Res. Autism Spectr. Disord. 31, 11–18 (2016).
-
Morgan, P. L., Staff, J., Hillemeier, M. M., Farkas, G. & Maczuga, S. Racial and ethnic disparities in ADHD diagnosis from kindergarten to eighth grade. Pediatrics 132, 85–93 (2013).
-
Hill, E. D. et al. Prediction of mental health risk in adolescents. Nat. Med. 31, 1840–1846 (2025).
-
Alam, S., Raja, P. & Gulzar, Y. Investigation of machine learning methods for early prediction of neurodevelopmental disorders in children. Wirel. Commun. Mob. Comput. 2022, 5766386 (2022).
-
de Lacy, N. et al. Predicting individual cases of major adolescent psychiatric conditions with artificial intelligence. Transl. Psychiatry 13, 314 (2023).
-
Birkhead, G. S., Klompas, M. & Shah, N. R. Uses of electronic health records for public health surveillance to advance public health. Annu. Rev. Public Health 36, 345–359 (2015).
-
Yang, S., Varghese, P., Stephenson, E., Tu, K. & Gronsbell, J. Machine learning approaches for electronic health records phenotyping: a methodical review. J. Am. Med. Inform. Assoc. 30, 367–381 (2022).
-
Solares, J. R. A. et al. Deep learning for electronic health records: a comparative review of multiple deep neural architectures. J. Biomed. Inform. 101, 103337 (2020).
-
Engelhard, M. M. et al. Predictive value of early autism detection models based on electronic health record data collected before age 1 year. JAMA Netw. Open 6, e2254303 (2023).
-
Chen, J. et al. Enhancing early autism prediction based on electronic records using clinical narratives. J. Biomed. Inform. 144, 104390 (2023).
-
Wang, B. et al. Prediction of early-onset bipolar using electronic health records. J. Child Psychol. Psychiatry 66, 1141–1154 (2025).
-
Roche, D., Mora, T. & Cid, J. Identifying non-adult attention-deficit/hyperactivity disorder individuals using a stacked machine learning algorithm using administrative data population registers in a universal healthcare system. JCPP Adv. 4, e12193 (2024).
-
Garcia-Argibay, M. et al. Predicting childhood and adolescent attention-deficit/hyperactivity disorder onset: a nationwide deep learning approach. Mol. Psychiatry 28, 1232–1239 (2023).
-
Steinberg, E. et al. Language models are an effective representation learning technique for electronic health record data. J. Biomed. Inform. 113, 103637 (2021).
-
Li, Y. et al. BEHRT: transformer for electronic health records. Sci. Rep. 10, 7155 (2020).
-
Goldstein, B. A., Navar, A. M., Pencina, M. J. & Ioannidis, J. P. A. Opportunities and challenges in developing risk prediction models with electronic health records data: a systematic review. J. Am. Med. Inform. Assoc. 24, 198–208 (2017).
-
Dey, T. et al. Survival analysis—time-to-event data and censoring. Nat. Methods 19, 906–908 (2022).
-
Shi, Y. et al. Racial disparities in diagnosis of attention-deficit/hyperactivity disorder in a US national birth cohort. JAMA Netw. Open 4, e210321 (2021).
-
Schober, P. & Vetter, T. R. Survival analysis and interpretation of time-to-event data: the tortoise and the hare. Anesth. Analg. 127, 792–798 (2018).
-
Loh, D. R., Hill, E. D., Liu, N., Dawson, G. & Engelhard, M. M. Limitations of binary classification for long-horizon diagnosis prediction and advantages of a discrete-time time-to-event approach: empirical analysis. JMIR AI 4, e62985 (2025).
-
Engelhard, M. & Henao, R. Disentangling whether from when in a neural mixture cure model for failure time data. Proc. Mach. Learn. Res. 151, 9571–9581 (2022).
-
World Health Organization. International Statistical Classification of Diseases and Related Health Problems, Tenth Revision, Vol. 1 (World Health Organization, 1992).
-
Geifman, Y. & El-Yaniv, R. Selective classification for deep neural networks. In Proc. 31st Int. Conf. NeurIPS 4885–4894 (2017).
-
Pessach, D. & Shmueli, E. A review on fairness in machine learning. ACM Comput. Surv. 55, 51:1–51:44 (2022).
-
Korrel, H., Mueller, K. L., Silk, T., Anderson, V. & Sciberras, E. Research review: language problems in children with attention-deficit hyperactivity disorder–a systematic meta-analytic review. J. Child Psychol. Psychiatry 58, 640–654 (2017).
-
Antshel, K. M. & Russo, N. Autism spectrum disorders and ADHD: overlapping phenomenology, diagnostic issues, and treatment considerations. Curr. Psychiatry Rep. 21, 34 (2019).
-
D’Agati, E., Curatolo, P. & Mazzone, L. Comorbidity between ADHD and anxiety disorders across the lifespan. Int. J. Psychiatry Clin. Pract. 23, 238–244 (2019).
-
Frazier, T. W., Youngstrom, E. A., Glutting, J. J. & Watkins, M. W. ADHD and achievement: meta-analysis of the child, adolescent, and adult literatures and a concomitant study with college students. J. Learn. Disabil. 40, 49–65 (2007).
-
Engelhard, M. M. et al. Health system utilization before age 1 among children later diagnosed with autism or ADHD. Sci. Rep. 10, 17677 (2020).
-
Gruschow, S. M., Yerys, B. E., Power, T. J., Durbin, D. R. & Curry, A. E. Validation of the use of electronic health records for classification of ADHD status. J. Atten. Disord. 23, 1647–1655 (2019).
-
Bannett, Y. et al. ADHD diagnosis and timing of medication initiation among children aged 3 to 5 years. JAMA Netw. Open 8, e2529610 (2025).
-
Huang, K.-L. et al. Factors affecting delayed initiation and continuation of medication use for attention-deficit/hyperactivity disorder: a nationwide study. J. Child Adolesc. Psychopharmacol. 31, 197–204 (2021).
-
Sibley, M. H. et al. Variable patterns of remission from ADHD in the Multimodal Treatment Study of ADHD. Am. J. Psychiatry 179, 142–151 (2022).
-
Prasad, V. et al. Use of healthcare services before diagnosis of attention-deficit/hyperactivity disorder: a population-based matched case-control study. Arch. Dis. Child. 109, 46–51 (2024).
-
Stolte, A. et al. Using electronic health records to understand the population of local children captured in a large health system in Durham County, NC, USA, and implications for population health research. Soc. Sci. Med. 296, 114759 (2022).
-
Hurst, J. H. et al. Development of an electronic health records datamart to support clinical and population health research. J. Clin. Transl. Sci. https://doi.org/10.1017/cts.2020.499 (2020).
-
Shi, Y. et al. Utility of medical record diagnostic codes to ascertain attention-deficit/hyperactivity disorder and learning disabilities in populations of children. BMC Pediatr. 20, 510 (2020).
-
Richesson, R. L. et al. Electronic health records based phenotyping in next-generation clinical trials: a perspective from the NIH Health Care Systems Collaboratory. J. Am. Med. Inform. Assoc. 20, e226–e231 (2013).
-
RxNorm (US National Library of Medicine, 2023); https://www.nlm.nih.gov/research/umls/rxnorm/index.html
-
LOINC (Regenstrief Institute, 2023); https://loinc.org/
-
Vaswani, A. et al. Attention is all you need. In Advances in Neural Information Processing Systems (eds Guyon, I. et al.) 5998–6008 (Curran Associates, 2017).
-
Hoffmann, J. et al. Training compute-optimal large language models. In Proc. Adv. NeurIPS 35 (2022).
-
Xiong, R. et al. On layer normalization in the transformer architecture. In Proc. 37th International Conference on Machine Learning 10524–10533 (2020).
-
Shazeer, N. GLU variants improve transformer. Preprint at https://doi.org/10.48550/arXiv.2002.05202 (2020).
-
Su, J. et al. RoFormer: enhanced transformer with rotary position embedding. Neurocomputing 568, 127063 (2024).
-
Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: pre-training of deep bidirectional transformers for language understanding. In Proc. 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Vol. 1 (Long and Short Papers) (eds Burstein, J. et al.) 4171–4186 (2019).
-
Loshchilov, I. & Hutter, F. Decoupled weight decay regularization. In International Conference on Learning Representations (2019).
-
Pascanu, R., Mikolov, T. & Bengio, Y. On the difficulty of training recurrent neural networks. Proc. Mach. Learn. Res. 28, 1310–1318 (2013).
-
Lee, C., Zame, W., Yoon, J. & van der Schaar, M. DeepHit: a deep learning approach to survival analysis with competing risks. Proc. AAAI Conf. Artif. Intell. 32, 11842 (2018).
-
Kvamme, H. & Borgan, Ø. Continuous and discrete-time survival prediction with neural networks. Lifetime Data Anal. 27, 710–736 (2021).
-
Cox, D. R. Regression models and life-tables. J. R. Stat. Soc. B 34, 187–202 (1972).
-
Wei, L.-J. The accelerated failure time model: a useful alternative to the Cox regression model in survival analysis. Stat. Med. 11, 1871–1879 (1992).
-
Suresh, K., Severn, C. & Ghosh, D. Survival prediction models: an introduction to discrete-time modeling. BMC Med. Res. Methodol. 22, 207 (2022).
-
Liu, S.-Y. et al. DoRA: weight-decomposed low-rank adaptation. In Proc. 41st International Conference on Machine Learning 235, 32100–32121 (2024).
-
Bergstra, J. & Bengio, Y. Random search for hyper-parameter optimization. J. Mach. Learn. Res. 13, 281–305 (2012).
-
Antolini, L., Boracchi, P. & Biganzoli, E. A time-dependent discrimination index for survival data. Stat. Med. 24, 3927–3944 (2005).
-
Haider, H., Hoehn, B., Davis, S. & Greiner, R. Effective ways to build and evaluate individual survival distributions. J. Mach. Learn. Res. 21, 1–63 (2020).
-
Austin, P. C. & Steyerberg, E. W. The Integrated Calibration Index (ICI) and related metrics for quantifying the calibration of logistic regression models. Stat. Med. 38, 4051–4065 (2019).
-
Graf, E., Schmoor, C., Sauerbrei, W. & Schumacher, M. Assessment and comparison of prognostic classification schemes for survival data. Stat. Med. 18, 2529–2545 (1999).
-
Altman, D. G. Practical Statistics for Medical Research (Chapman and Hall/CRC, 1990).
-
Efron, B. in Breakthroughs in Statistics (eds Kotz, S. & Johnson, N. L.) 569–593 (Springer, 1992); https://doi.org/10.1007/978-1-4612-4380-9_41
-
Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. In Proc. Adv. NeurIPS 30, 4766–4777 (2017).
-
Paszke, A. et al. PyTorch: an imperative style, high-performance deep learning library. In Proc. Adv. NeurIPS 32, 8024–8035 (2019).
-
Akiba, T., Sano, S., Yanase, T., Ohta, T. & Koyama, M. Optuna: a next-generation hyperparameter optimization framework. In Proc. KDD 2623–2631 (2019).
-
Kokhlikyan, N. et al. Captum: a unified and generic model interpretability library for PyTorch. Preprint at https://doi.org/10.48550/arXiv.2009.07896 (2020).
