The coding of depression and anxiety outcomes and support in English primary care data

Written by:
  • Lola Ojedele, Kunle Oreagba

Who we are

We, Lola Ojedele and Kunle Oreagba, joined the Bennett Institute for Applied Data Science this summer through HDRUK’s 2025 Health Data Science Internship Programme. In this blog we share more about our time and work at the Bennett Institute.

Introduction

We came to the Bennett Institute with different paths but a shared interest in healthcare data. Both of us jumped at the chance to research within the mental health space. Most people know someone affected by a mental health condition. Studying coding patterns in general practice gave us the opportunity to explore how anxiety disorders and depression are recorded at the first point of contact. We did this by looking across diagnoses, screening and monitoring tools such as Patient Reported Outcome Measures (PROMs), and psychological support and therapies. Understanding these patterns is essential to inform future studies with a focus on mental health and support the integration of NHS Talking Therapies data into the OpenSAFELY platform.

Joint methods

Literature review. Our projects began by reviewing NICE guidelines [1, 2] and the NHS Talking Therapies Manual [3], which set out the recommended PROMs and therapies for anxiety and depression. These documents became the foundation for developing our respective codelists. We used GitHub for collaboration and version control, quickly adapting to reproducible workflows.

Interviews with healthcare professionals. To refine our approach, we consulted one clinical informatician, who explained how inactive codes map onto newer SNOMED CT codes, and one GP, who shared practical insights on coding practices and common gaps.

Codelist development. Guided by this input, we set clear inclusion and exclusion criteria and built our draft codelists using OpenCodelists. Two new codelists were developed: one for PROMs in anxiety disorders and depression [4], and another for associated psychological therapies [5].

Analysis of trends over time. We used the opencodecounts R package to systematically search SNOMED CT code usage, identify codes missing from our searches, and analyse national coding trends in primary care from 1 August 2011 to 31 July 2024.
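The filtering and aggregation described above can be sketched in base R. This is a minimal illustration and not the opencodecounts API: the `usage` data frame, its column names, and all counts are made up for the example, and the codelist combines two codes from Table 1 with one placeholder code.

```r
# Hypothetical yearly usage table (one row per code and reporting year).
# Column names and counts are illustrative assumptions, not real data.
usage <- data.frame(
  snomed_code = c("720433000", "720433000", "401319005", "401319005", "999999"),
  year_start  = c(2022, 2023, 2022, 2023, 2023),
  count       = c(100, 120, 30, 40, 5),
  stringsAsFactors = FALSE
)

# Draft codelist. SNOMED CT codes are kept as character strings because
# long identifiers can exceed R's integer range.
codelist <- c("720433000", "401319005", "000000000")  # last entry is a placeholder

# Keep only rows for codes on the codelist
in_list <- usage[usage$snomed_code %in% codelist, ]

# Total usage per reporting year
yearly <- aggregate(count ~ year_start, data = in_list, FUN = sum)

# Codelist entries that never appear in the usage data
unused <- setdiff(codelist, usage$snomed_code)

print(yearly)
print(unused)
```

Codes that appear in the codelist but not in the usage data are worth a second look: they may be inactive, rarely used, or mistyped.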

Classification into subcategories. We analysed trends using different categorisations and groupings of our codelists. For PROMs, we split codes by specific diagnoses and by whether they were recommended by NICE or NHS Talking Therapies. For psychological therapies, we categorised codes by low- and high-intensity options, as well as by therapeutic approach.
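One way to express this kind of grouping is a lookup table that is merged onto the usage data before aggregating. The sketch below is an assumption-laden illustration: the two codes come from Table 2, but the intensity labels, column names, and counts are invented for the example and are not the project's actual categorisation.

```r
# Illustrative lookup from code to intensity category. The assignments are
# assumptions for this sketch, not the project's actual codelist categories.
categories <- data.frame(
  snomed_code = c("228557008", "444175001"),
  intensity   = c("high", "low"),
  stringsAsFactors = FALSE
)

# Made-up yearly counts for the same codes
usage <- data.frame(
  snomed_code = c("228557008", "228557008", "444175001", "444175001"),
  year_start  = c(2022, 2023, 2022, 2023),
  count       = c(50, 60, 20, 25),
  stringsAsFactors = FALSE
)

# Attach the category to each usage row, then total by category and year
labelled <- merge(usage, categories, by = "snomed_code")
by_category <- aggregate(count ~ intensity + year_start, data = labelled, FUN = sum)

print(by_category)
```

Keeping the categories in a separate table means the same usage data can be regrouped (for example by therapeutic approach) without touching the counts.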

Development of reusable code. We also observed that many SNOMED CT descriptions end with a semantic tag in parentheses, which indicates the semantic category the concept belongs to, such as disorder, finding, procedure, or observable entity (e.g., Postpartum Depression (disorder)). To analyse these tags systematically, we helped to develop the extract_semantic_tag() function, which is now part of the opencodecounts R package. For example:

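library(stringr)

# Return the semantic tag from the end of a SNOMED CT description
extract_semantic_tag <- function(string) {
  # Match the trailing "(...)" group, e.g. "(disorder)"
  sem_tag <- str_extract(string, "\\(([^()]+)\\)$")
  # Strip the surrounding parentheses
  sem_tag <- str_replace_all(sem_tag, "[()]", "")
  sem_tag
}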

extract_semantic_tag("Postpartum Depression (disorder)")
#> [1] "disorder"
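The same idea extends to a whole vector of descriptions, which is how tags can be tabulated across a codelist. The sketch below uses a base-R equivalent so that it runs without stringr; the function name `extract_tag_base` is hypothetical, and the descriptions are illustrative strings rather than verified SNOMED CT terms.

```r
# Base-R equivalent of the tag extraction, written so the sketch runs
# without stringr. The function name is hypothetical.
extract_tag_base <- function(x) {
  m <- regexpr("\\([^()]+\\)$", x)       # position of a trailing "(...)"
  out <- rep(NA_character_, length(x))   # NA where there is no tag
  hit <- as.integer(m) > 0               # regexpr() returns -1 for no match
  out[hit] <- gsub("[()]", "", regmatches(x, m))
  out
}

# Illustrative descriptions (not verified SNOMED CT terms)
descriptions <- c(
  "Postpartum Depression (disorder)",
  "Seen by psychiatrist (finding)",
  "Referral to psychologist (procedure)",
  "Seen by psychologist (finding)",
  "Free text without a tag"
)

tags <- extract_tag_base(descriptions)

# Count how often each tag occurs, keeping untagged entries visible
print(table(tags, useNA = "ifany"))
```

Returning NA for descriptions without a trailing tag makes untagged entries easy to spot instead of silently passing the full description through.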

All code for this project is available on GitHub at https://github.com/bennettoxford/mental-health-open-data.

Results

We presented a summary of the results at the Closing Ceremony of the Health Data Science Internship Programme on Wednesday, 27th August 2025: The coding of patient reported outcome measures and psychological therapies for anxiety disorders and depression [6].

Patient Reported Outcome Measures (PROMs)

Over the entire study period, the code “Depression screening using questions” was the most frequently used PROM, with over three times as many recorded events as the most common anxiety measure, GAD-7 (see Table 1).

Table 1. Total counts of code usage for PROMs for anxiety disorders and depression (2011–2024).

| Disorder | SNOMED Code | Description | Semantic tag | Usage | % |
|---|---|---|---|---|---|
| Anxiety disorder | 44545505 | GAD-7 score | Observable entity | 6,288,000 | 71.2 |
| Anxiety disorder | 836571000000106 | GAD-2 score | Observable entity | 1,136,480 | 12.9 |
| Anxiety disorder | 401319005 | HADS anxiety | Observable entity | 728,020 | 8.2 |
| Depression | 200971000000100 | Dep screening | Procedure | 21,510,040 | 53.6 |
| Depression | 720433000 | PHQ-9 score | Observable entity | 16,229,920 | 40.5 |
| Depression | 401320004 | HADS Dep score | Observable entity | 790,280 | 2.0 |

Notes: GAD = Generalised Anxiety Disorder; HADS = Hospital Anxiety and Depression Scale; Dep = Depression; PHQ = Patient Health Questionnaire.
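As a quick check of the "over three times" comparison, the ratio can be computed directly from the two counts in Table 1. The variable names are only for illustration.

```r
# Counts copied from Table 1
dep_screening <- 21510040  # "Dep screening" (procedure)
gad7_score    <- 6288000   # "GAD-7 score" (observable entity)

# How many depression screening events were recorded per GAD-7 score event
ratio <- dep_screening / gad7_score
print(round(ratio, 2))
```

The ratio comes out at roughly 3.4, consistent with the statement in the text.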

Overall PROM usage increased from 2012/13 to 2023/24, though individual measures fluctuated. The PHQ-9 score was the second most commonly recorded depression measure; its usage dipped after its Quality and Outcomes Framework (QOF) incentive ended in 2014, before rising again.

To better understand recording practices, and so inform the feasibility of research studies using individual-level EHR data, we also examined SNOMED CT semantic tags. This showed that in recent years observable entity codes, which capture PROM scores directly, have overtaken assessment scale and procedure codes (see Figure 1). This suggests a stronger focus on recording outcomes rather than simply noting the use of a measure.

Figure 1. Yearly trends of PROM codes for anxiety disorders and depression, separated by semantic tag (Observable entity vs Other).


Psychological therapies

For therapies, coding is not usually split by anxiety disorders and depression but instead by low- and high-intensity options. In our analysis, the most frequently recorded code was “Seen by psychiatrist”, followed by referrals into NHS Talking Therapies (formerly IAPT, Improving Access to Psychological Therapies) and “Seen by psychologist” (see Table 2). Referrals could be recorded as either GP referrals or self-referrals.

Table 2. Total counts of the most used codes for psychological support and treatments for anxiety disorders and depression (2011–2024).

| SNOMED Code | Description | Semantic tag | Usage | % |
|---|---|---|---|---|
| 305693007 | Seen by psychiatrist | Finding | 1,729,570 | 24.6 |
| 3103488003 | Seen by psychologist | Finding | 645,020 | 9.2 |
| 762481000000100 | Seen by PWP | Finding | 295,990 | 4.2 |
| 380201000000109 | Referral to IAPT | Procedure | 833,180 | 11.8 |
| 1036481000000106 | Self-referral to IAPT | Procedure | 749,820 | 10.7 |
| 183528001 | Referral to psychologist for elderly ill | Procedure | 201,060 | 2.9 |
| 228557008 | CBT | Therapy | 56,880 | 0.8 |
| 933221000000107 | MBT | Therapy | 35,425 | 0.5 |
| 444175001 | Guided self-help CBT | Therapy | 27,980 | 0.4 |

Notes: PWP = Psychological wellbeing practitioner; IAPT = Improving Access to Psychological Therapies; CBT = Cognitive Behavioural Therapy; MBT = Mindfulness-Based Therapy.

Overall, the majority of code usage reflects referral pathways and administrative recording rather than therapy sessions themselves. By contrast, specific therapies such as cognitive behavioural therapy (CBT), mindfulness-based therapy, and guided self-help CBT were coded much less often, each representing less than 1% of total usage of the codes we selected.
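To make the contrast concrete, the counts shown in Table 2 can be summed by the table's semantic tag column. This sketch covers only the nine rows listed in Table 2, not the full codelist, and the data frame and column names are assumptions for the example.

```r
# Rows copied from Table 2 (descriptions abbreviated as in the table)
table2 <- data.frame(
  description = c("Seen by psychiatrist", "Seen by psychologist", "Seen by PWP",
                  "Referral to IAPT", "Self-referral to IAPT",
                  "Referral to psychologist for elderly ill",
                  "CBT", "MBT", "Guided self-help CBT"),
  tag = rep(c("Finding", "Procedure", "Therapy"), each = 3),
  usage = c(1729570, 645020, 295990,
            833180, 749820, 201060,
            56880, 35425, 27980),
  stringsAsFactors = FALSE
)

# Total recorded events per semantic tag
by_tag <- aggregate(usage ~ tag, data = table2, FUN = sum)
print(by_tag)
```

Among these rows, the finding and procedure codes together account for about 4.45 million events, compared with about 120,000 for the three therapy codes.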

Therapy coding patterns were relatively stable across the study period (see Figure 2), although self-referrals increased after the COVID-19 pandemic, and acceptance into psychological talking therapies has continued to rise since 2019 [7].

Figure 2. Yearly trends of most coded psychological treatments for anxiety disorders and depression.


Main takeaways

Lola Ojedele: One of the most unexpected findings for me was realising how many PROMs exist in SNOMED CT, but that only a handful are regularly coded. Some have been recorded fewer than ten times in a decade, raising questions about why they exist at all. Speaking to a GP highlighted that many PROMs are actually recorded in free text. This means that for researchers who don’t have access to free text (as is the case for OpenSAFELY), a large amount of information is effectively hidden. This made me appreciate how crucial codelists are in shaping research outcomes, and how messy and inconsistent EHR data can be. Exploring the semantic tags attached to PROM codes showed just how complex this coding system is. My main highlight was helping to develop the extract_semantic_tag() function for the opencodecounts package, and it was amazing to see how small contributions can support wider research.

Kunle Oreagba: I was surprised to learn that EHR data is not always the ground truth, and that more in-depth research is needed to understand what drives the patterns we see. This made me appreciate the opportunity to speak to General Practitioners for more insight, and reinforced the need for multidisciplinary, multi-stakeholder research. My main highlights were contributing to a poster presentation of our findings at the HDRUK Internship Closing Ceremony and to a paper on the opencodecounts R package. Overall, I realised how analytic tools can help make decisions that could save more lives. I enjoyed every day of the internship.