The coding of depression and anxiety outcomes and support in English primary care data | Bennett Institute for Applied Data Science

Who we are

We, Lola Ojedele and Kunle Oreagba, joined the Bennett Institute for Applied Data Science this summer through HDRUK’s 2025 Health Data Science Internship Programme. In this blog we share more about our time and work at the Bennett Institute.

Introduction

We came to the Bennett Institute with different paths but a shared interest in healthcare data. Both of us jumped at the chance to research within the mental health space. Most people know someone affected by a mental health condition. Studying coding patterns in general practice gave us the opportunity to explore how anxiety disorders and depression are recorded at the first point of contact. We did this by looking across diagnoses, screening and monitoring tools such as Patient Reported Outcome Measures (PROMs), and psychological support and therapies. Understanding these patterns is essential to inform future studies with a focus on mental health and support the integration of NHS Talking Therapies data into the OpenSAFELY platform.

Joint methods

Literature review. Our projects began by reviewing NICE guidelines¹ ² and the NHS Talking Therapies Manual³, which set out the recommended PROMs and therapies for anxiety and depression. These documents became the foundation for developing our respective codelists. We used GitHub for collaboration and version control, quickly adapting to reproducible workflows.

Interviews with healthcare professionals. To refine our approach, we consulted one clinical informatician, who explained how inactive codes map onto newer SNOMED CT codes, and one GP, who shared practical insights on coding practices and common gaps.

Codelist development. Guided by this input, we set clear inclusion and exclusion criteria and built our draft codelists using OpenCodelists. Two new codelists were developed: one for PROMs in anxiety disorders and depression⁴, and another for associated psychological therapies⁵.

Analysis of trends over time. We used the opencodecounts R package to systematically search SNOMED CT code usage, identify codes missing from our searches, and analyse national coding trends in primary care from 1st August 2011 to 31st July 2024.

Classification into subcategories. We analysed trends using different categorisations and groupings of our codelists. For PROMs, we split codes by specific diagnoses and by whether they were recommended by NICE or NHS Talking Therapies. For psychological therapies, we categorised codes by low- and high-intensity options, as well as by therapeutic approach.
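The grouping step described above can be sketched as a lookup table joined onto code usage. This is only an illustration in base R: the `therapy_lookup` table and its intensity labels are assumptions made for the example, not the classification used in the published codelist, and the usage totals are copied from Table 2.

```r
# Illustrative lookup table. The intensity labels are assumptions for this
# sketch, not the authors' published classification.
# SNOMED CT codes are stored as character strings because many of them are
# too long to be held safely as numbers.
therapy_lookup <- data.frame(
  code = c("228557008", "444175001", "933221000000107"),
  description = c("CBT", "Guided self-help CBT", "MBT"),
  intensity = c("High", "Low", "High"),
  stringsAsFactors = FALSE
)

# Total usage per code, taken from Table 2.
usage <- data.frame(
  code = c("228557008", "444175001", "933221000000107"),
  usage = c(56880, 27980, 35425),
  stringsAsFactors = FALSE
)

# Attach the category to each code, then sum usage within each category.
classified <- merge(usage, therapy_lookup, by = "code")
usage_by_intensity <- aggregate(usage ~ intensity, data = classified, FUN = sum)
print(usage_by_intensity)
```

Keeping the categories in a separate lookup table means the same usage data can be regrouped (for example by therapeutic approach) without touching the counts.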

Development of reusable code. We also observed that many SNOMED CT descriptions end with a semantic tag in parentheses, which indicates the category to which the concept belongs, such as disorder, finding, procedure, or observable entity (e.g., Postpartum Depression (disorder)). To systematically analyse these tags, we helped to develop the extract_semantic_tag() function, which is now part of the opencodecounts R package. An example is shown below:

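library(stringr)

# Extract the semantic tag from the end of a SNOMED CT description
extract_semantic_tag <- function(string) {
  # Match the final parenthesised group, e.g. "(disorder)"
  sem_tag <- str_extract(string, "\\(([^()]+)\\)$")
  # Strip the surrounding parentheses
  sem_tag <- str_replace_all(sem_tag, "[()]", "")
  sem_tag
}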

extract_semantic_tag("Postpartum Depression (disorder)")
#> [1] "disorder"
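Because the stringr functions are vectorised, the same function can be applied to a whole column of descriptions and the tags tabulated. The sketch below repeats the function definition so that it runs on its own. Apart from the first one, the description strings are illustrative examples written for this sketch rather than entries taken from the codelists.

```r
library(stringr)

# Same logic as the function shown above, repeated so this block is self-contained.
extract_semantic_tag <- function(string) {
  sem_tag <- str_extract(string, "\\(([^()]+)\\)$")
  str_replace_all(sem_tag, "[()]", "")
}

# Illustrative descriptions (assumed for this example).
descriptions <- c(
  "Postpartum Depression (disorder)",
  "Generalized anxiety disorder 7 item score (observable entity)",
  "Patient health questionnaire 9 score (observable entity)",
  "Depression screening using questions (procedure)",
  "Cognitive behavioural therapy (CBT) session (procedure)", # parentheses mid-string
  "Description with no semantic tag"                          # no tag at all
)

tags <- extract_semantic_tag(descriptions)

# Count how often each tag occurs, keeping untagged descriptions visible as NA.
tag_counts <- table(tags, useNA = "ifany")
print(tag_counts)
```

Two edge cases are worth noting. Because the pattern is anchored to the end of the string, only the final parenthesised group is returned, so the mid-string "(CBT)" is ignored. A description without a trailing tag yields `NA` rather than an error.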

All code for this project is available on GitHub at https://github.com/bennettoxford/mental-health-open-data.

Results

We presented a summary of the results at the Closing Ceremony of the Health Data Science Internship Programme on Wednesday, 27th August 2025: The coding of patient reported outcome measures and psychological therapies for anxiety disorders and depression⁶.

Patient Reported Outcome Measures (PROMs)

Over the entire study period, the code “Depression screening using questions” was the most frequently used PROM, with over three times as many recorded events as the most common anxiety measure, GAD-7 (see Table 1 below).

Table 1. Total counts of code usage for PROMs for anxiety disorders and depression (2011–2024).

Disorder	SNOMED Code	Description	Semantic tag	Usage	%
Anxiety disorder	`44545505`	GAD-7 score	Observable entity	6,288,000	71.2
	`836571000000106`	GAD-2 score	Observable entity	1,136,480	12.9
	`401319005`	HADS anxiety	Observable entity	728,020	8.2
Depression	`200971000000100`	Dep screening	Procedure	21,510,040	53.6
	`720433000`	PHQ-9 score	Observable entity	16,229,920	40.5
	`401320004`	HADS Dep score	Observable entity	790,280	2.0

Notes: GAD = Generalised Anxiety Disorder; HADS = Hospital Anxiety and Depression Scale; Dep = Depression; PHQ = Patient Health Questionnaire.
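As a quick check of the comparison between depression screening and GAD-7, the ratio can be recomputed from the Table 1 totals. This is a minimal sketch; the variable names are illustrative.

```r
# Total recorded events over the study period, copied from Table 1.
dep_screening_usage <- 21510040
gad7_usage <- 6288000

# How many times more often the depression screening code was recorded.
usage_ratio <- dep_screening_usage / gad7_usage
print(round(usage_ratio, 2))
#> [1] 3.42
```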

Overall, PROM usage increased from 2012/13 to 2023/24, though individual measures fluctuated. The PHQ-9 was the second most common measure for depression, showing a dip after its Quality and Outcomes Framework (QOF) incentive ended in 2014, before rising again.

To better understand recording practices, and so inform the feasibility of research studies using individual-level EHR data, we also examined SNOMED CT semantic tags. This showed that in recent years observable entity codes, which capture PROM scores directly, have overtaken assessment scale and procedure codes (see Figure 1). This suggests a stronger focus on recording outcomes rather than simply noting the use of a measure.
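The grouping behind this comparison can be sketched as follows: each code's semantic tag is collapsed into "Observable entity" or "Other", and usage is summed per year. The yearly counts below are invented purely to illustrate the steps and are not results from the analysis. The column names are also assumptions.

```r
# Hypothetical yearly counts, invented purely to illustrate the grouping step.
yearly_usage <- data.frame(
  year = rep(c("2021/22", "2022/23"), each = 3),
  semantic_tag = rep(c("observable entity", "procedure", "assessment scale"), times = 2),
  usage = c(500, 300, 100, 700, 250, 50),
  stringsAsFactors = FALSE
)

# Collapse the semantic tags into the two groups compared in Figure 1.
yearly_usage$tag_group <- ifelse(
  yearly_usage$semantic_tag == "observable entity",
  "Observable entity",
  "Other"
)

# Sum usage within each year and tag group.
trend <- aggregate(usage ~ year + tag_group, data = yearly_usage, FUN = sum)
print(trend)
```

The resulting data frame has one row per year and group, which is the shape needed to plot one line per group over time.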

Figure 1. Yearly trends of PROM codes for anxiety disorders and depression, separated by semantic tag (Observable entity vs Other).


Psychological therapies

For therapies, coding is not usually split by anxiety disorders and depression but instead by low- and high-intensity options. In our analysis, the most frequently recorded code was “Seen by psychiatrist”, followed by referrals into NHS Talking Therapies (formerly IAPT, Improving Access to Psychological Therapies) and “Seen by psychologist” (see Table 2). These referrals could be recorded as either GP referrals or self-referrals.

Table 2. Total counts of the most used codes for psychological support and treatments for anxiety disorders and depression (2011–2024).

SNOMED Code	Description	Semantic Tag	Usage	%
`305693007`	Seen by psychiatrist	Finding	1,729,570	24.6
`3103488003`	Seen by psychologist	Finding	645,020	9.2
`762481000000100`	Seen by PWP	Finding	295,990	4.2
`380201000000109`	Referral to IAPT	Procedure	833,180	11.8
`1036481000000106`	Self-referral to IAPT	Procedure	749,820	10.7
`183528001`	Referral to psychologist for elderly ill	Procedure	201,060	2.9
`228557008`	CBT	Therapy	56,880	0.8
`933221000000107`	MBT	Therapy	35,425	0.5
`444175001`	Guided self-help CBT	Therapy	27,980	0.4

Notes: PWP = Psychological wellbeing practitioner; IAPT = Improving Access to Psychological Therapies; CBT = Cognitive Behavioural Therapy; MBT = Mindfulness-Based Therapy.

Overall, the majority of code usage reflects referral pathways and administrative recording rather than therapy sessions themselves. By contrast, specific therapies such as cognitive behavioural therapy (CBT), mindfulness-based therapy, and guided self-help CBT were coded much less often, each representing less than 1% of total usage of the codes we selected.

Therapy coding patterns were relatively stable across the study period (see Figure 2), although self-referrals increased after the COVID-19 pandemic, and acceptance into psychological talking therapies has continued to rise since 2019⁷.

Figure 2. Yearly trends of most coded psychological treatments for anxiety disorders and depression.


Main takeaways

Lola Ojedele: One of the most unexpected findings for me was realising how many PROMs exist in SNOMED CT, but that only a handful are regularly coded. Some have been recorded fewer than ten times in a decade, raising questions about why they exist at all. Speaking to a GP highlighted that many PROMs are actually recorded in free text. This means that for researchers who don’t have access to free text (as is the case for OpenSAFELY), a large amount of information is effectively hidden. This made me appreciate how crucial codelists are in shaping research outcomes, and how messy and inconsistent EHR data can be. Exploring the semantic tags attached to PROM codes showed just how complex this coding system is. My main highlight was helping to develop the extract_semantic_tag() function for the opencodecounts package, and it was amazing to see how small contributions can support wider research.

Kunle Oreagba: I was surprised to learn that EHR data is not always the ground truth, and that more in-depth research is needed to understand what drives the patterns we see. This made me appreciate the opportunity to speak to General Practitioners for more insight and reinforced the need for multidisciplinary and multistakeholder research. My main highlights were contributing to a poster presentation of our findings at the HDRUK Internship Closing Ceremony and to a paper on the opencodecounts R package. Overall, I realised how analytic tools can help inform decisions that could save lives. I enjoyed every day of the internship.