5  Index of Multiple Deprivation (IMD)

The latest available IMD data for England is from 2019.

Updated IMD for England

This is expected to be released in late 2025 by Oxford Consultants for Social Inclusion (OCSI).

The consultation outcome was released in 2022 by the Department for Levelling up, Housing & Communities

Thanks to the NHS-R Community for finding and sharing these links on the NHS-R Slack

IMD is very useful for categorising the area a person lives in for deprivation. Deprivation is by country and do not include the other nations. Deciles are the most commonly used way of referring to IMD and are taken from the scores which are ordered and then cut into 10. For deciles the 1 is the most deprived area and 10 is the least deprived.

Wikipedia link

5.0.1 England

5.0.2 Scotland

6 Finding the IMD for an address

To get IMD scores or deciles for a local data a join will be needed to the postcode table (like a directory of postcodes) found:

https://digital.nhs.uk/services/organisation-data-service/data-downloads/ods-postcode-files

and then to the IMD data:

https://www.gov.uk/government/statistics/english-indices-of-deprivation-2019

  • Select File 7 for the dataset.

Other data is available from this dataset including IDAOPI which relates to only older people.

Note that the column headers change, in 2015 it was LADistrictCode2013 and in 2019 it is LADistrictCode2019. Also LADistrictName2013 has become LADistrictName2019.

6.0.1 Postcode spaces

Postcode lengths vary in original data depending on whether one or two spaces are used between the parts. Consequently, it is always better when joining by postcodes to remove the spaces altogether.

In SQL this would be with the code:

REPLACE(postcode, ' ', '')

in R it can be

postcode <- "NG16 1AA"

# base R
gsub(" ", "", postcode)
[1] "NG161AA"
# {stringr} also available in {tidyverse}
stringr::str_remove(postcode, " ")
[1] "NG161AA"

6.0.2 Partial postcodes

Partial postcodes are sometimes provided to protect the data and may be the first part (before spaces) and 1 or 2 characters from the second part. This will not give a sufficiently reliable IMD score.

7 Example join code (SQL)

To get the IMD score the LSOA (Lower Super Output Area) code is required which is taken from the full postcode.

This code will not run and is dependent on the naming conventions of the SQL server. The column names of LSOA11 and LSOAcode2011 will have come from the data sources. The column PostCode_space has been added to the original data.

SELECT imd.*
FROM DIM_AI.PatientData AS p
LEFT JOIN DIM_AI.PostCodes AS pc ON p.PostCode = pc.PostCode_space
LEFT JOIN DIM_AI.IMD AS i ON pc.PC.LSOA11 = i.LSOAcode2011

8 Creating quintiles

To be able to join to the data scores will need to be put into quintiles (group of 5) rather than deciles (group of 10). Where the number of areas divides into the number of quintiles an equal number of areas can be assigned to each quintile. When it does not divide however, a choice must be made as to which quintiles should have a larger number of areas. The Office of Health Improvement & Disparities recommends using the following approach in their Technical Guide to Assigning Deprivation Categories:

  • Divide the number of small areas within the higher geography by the number of deprivation categories required (up to a maximum of 10), giving an integer and fractional part.

  • The integer-part of this number represents the minimum number of small areas that will be assigned to each deprivation category within each higher geography.

  • The below tables then shows which deprivation categories should be assigned additional small areas based on the fractional part of this number and the number of quintiles being used.

Deciles
Number after decimal point Deciles receiving an extra area
0.0 None
0.1 1
0.2 1, 6
0.3 1, 4, 7
0.4 1, 3, 6, 8
0.5 1, 3, 5, 7, 9
0.6 1, 2, 4, 6, 7, 9
0.7 1, 2, 3, 5, 6, 8, 9
0.8 1, 2, 3, 4, 6, 7, 8, 9
Quintiles
Number after decimal point Quintiles receiving an extra area
0.0 None
0.2 1
0.4 1, 3
0.6 1, 2, 4
0.8 1, 2, 3, 4
Quantiles
Number after decimal point Quantiles receiving an extra area
0.00 None
0.25 1
0.50 1, 3
0.75 1, 2, 3

8.0.1 Quintiles in SQL

SELECT DISTINCT IMDDecile,
FLOOR((IMDDecile-1)/2) + 1 AS IMDQuintile
FROM DIM_AI.IMD
ORDER BY IMDDecile

8.0.2 Quintiles in R

The PHEindicatormethods provides a convenient function that can be used in R to generate quintiles.

df <- data.frame(
  region = as.character(rep(c("Region1", "Region2", "Region3", "Region4"),
    each = 250
  )),
  smallarea = as.character(paste0("Area", seq_along(1:1000))),
  vals = as.numeric(sample(200, 1000, replace = TRUE)),
  stringsAsFactors = FALSE
)

# assign small areas to deciles across whole data frame
# print the top 15
PHEindicatormethods::phe_quantile(df, vals, type = "standard") |>
  dplyr::slice_head(n = 15)
    region smallarea vals quantile
1  Region1     Area1    4       10
2  Region1     Area2    6       10
3  Region1     Area3  194        1
4  Region1     Area4  160        3
5  Region1     Area5   53        8
6  Region1     Area6  181        2
7  Region1     Area7  166        2
8  Region1     Area8   19        9
9  Region1     Area9   67        7
10 Region1    Area10    8       10
11 Region1    Area11  172        2
12 Region1    Area12  146        3
13 Region1    Area13   43        8
14 Region1    Area14  198        1
15 Region1    Area15   61        8

9 Creating local IMDs

In areas like Nottingham/Nottinghamshire the differences between the LSOA areas is diminished when ranked against England as a whole, but when ranked locally, the variation is much more pronounced. It is possible to take the original scores and apply deciles or quintiles to those scores in order to create a local IMD.

9.1 Local IMD creation in SQL

To apply a rank use the windows partition function ROW_NUMBER() OVER(ORDER BY IMDRank) to create a new ranking score and NTILE(10) OVER (ORDER BY IMDRank) to create new deciles.

9.2 Local IMD creation in R

library(tidyverse)
library(PostcodesioR)
library(NHSRpostcodetools) # installed from GitHub not CRAN
library(NHSRpopulation) # installed from GitHub not CRAN
library(janitor)

Attaching package: 'janitor'
The following objects are masked from 'package:stats':

    chisq.test, fisher.test
# Generate random example postcodes
# Restricted to NG postcodes from Nottinghamshire because postcodes are drawn
# from all nations and don't validate within the {NHSRpopulation} package
# currently
postcodes <- purrr::map_chr(
  1:10,
  .f = ~ PostcodesioR::random_postcode("NG16") |>
    purrr::pluck(1)
)

# Create a tibble
tibble_postcodes <- dplyr::tibble(
  random_postcodes = postcodes,
)

## Debugging
# NHSRpopulation::get_data(tibble_postcodes,
#   column = "random_postcodes",
#   url_type = "imd"
# ) |>
#   dplyr::select(
#     random_postcodes,
#     new_postcode,
#     imd_decile,
#     imd_rank,
#     imd_score
#   ) |>
#   mutate(imd_decile_local = ntile(-imd_score, n = 10)) # creating new deciles from the data provided

9.3 LSOAs - by postcode and by Local Authority/District

Note that the previous code example was a very small selection of
postcodes and would require much more data to create locally derived deciles.

This usually comes from whole LSOA areas that are linked to Local Authorities/ Districts and not via row level data postcodes.

For example, Nottinghamshire Healthcare NHS Foundation Trust covers several Local Authority areas in its services and so organisations like this have to find all the geographic areas that the services cover.

Local Authority and District LSOAs are not in {NHSRpostcodetools} (warning)

Currently (March 2024) the package {NHSRpopulation} which relies upon package for postcodes {NHSRpostcodetools} and a join to the IMD data via LSOAs does not have details for Local Authority LSOAs.

There are two local authorities in the service area of Nottinghamshire Healthcare NHS Foundation Trust: Nottingham City Local Authority and several Nottinghamshire County Council LA areas. # often have their data combined, particularly for the Provider # Trusts like Nottinghamshire Healthcare NHS Foundation Trust.

# Define code for Nottingham City LA
la_code_nott_city <- c("E06000018")

# Adapted from a blog:
# https://cdu-data-science-team.github.io/team-blog/posts/2021-05-14-index-of-multiple-deprivation/#imd-in-sql)
la_code_notts_county <- tibble::tribble(
  ~`LSOAcode2011`, ~`LSOAname2011`, ~`LADistrictCode2019`, ~`LADistrictName2019`,
  "E01013812", "Nottingham 018C", "E06000018", "Nottingham",
  "E01013814", "Nottingham 022B", "E06000018", "Nottingham",
  "E01013810", "Nottingham 018A", "E06000018", "Nottingham",
  "E01013811", "Nottingham 018B", "E06000018", "Nottingham",
  "E01013815", "Nottingham 022C", "E06000018", "Nottingham"
) |>
  dplyr::select(LSOAcode2011) |>
  dplyr::pull()

# Combine all districts into one object
la_code_nott_city_notts_county <- c(la_code_nott_city, la_code_notts_county)

la_code_nott_city_notts_county
[1] "E06000018" "E01013812" "E01013814" "E01013810" "E01013811" "E01013815"

11 Referencing IMD in a paper or research

11.1 Suggested citation text

Ministry of Housing, Communities and Local Government. English Indices of Deprivation 2015. https://www.gov.uk/government/statistics/english-indices-of-deprivation-2019 (Accessed 22 March 2024)