Sam Dunn and Seb Fox
29 November 2018
PHE is an executive agency of the Department of Health and Social Care. We provide government, local government, the NHS, Parliament, industry and the public with evidence-based professional, scientific expertise and support.
We exist to protect and improve the nation’s health and wellbeing, and reduce health inequalities.
PHE was established on 1 April 2013 to bring together public health specialists from more than 70 organisations into a single public health service.
Fingertips is our main data page, where we display most of our indicators. These profiles allow users to:
* Browse indicators at different geographical levels
* Benchmark against the regional or England average
* Export data to use locally
The Public Health Data Science (PHDS) team is located within the national Knowledge and Intelligence Service and is part of the Health Improvement Directorate within PHE.
The PHDS team was formed in 2015 following a reorganisation.
Data science aims to generate insight and knowledge from data and is a developing field within public health more broadly. PHE is interested in how data science can better support decision making across its work in prevention, improving health and wellbeing, and healthcare variation and inequality.
PHE’s Public Health Data Science team consists of public health specialists, statisticians and analytical staff. It works closely with colleagues across PHE, supporting the knowledge and intelligence needs of the local and national public health system, and develops close working relationships with external stakeholders to promote the development of data science across the public health system.
5 main areas of work:
Our vision has been to “consolidate, automate, innovate”. Each step makes the subsequent step easier.
The consolidate step is essentially part of the overall data strategy and is a fundamental building block.
We have filled a “Data Lake” built on tidy data principles.
Having tidy data in a central database allows us to create efficiencies.
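As a minimal sketch of what "tidy" means here (one row per observation, one column per variable), the base R snippet below reshapes a small wide table into a long one. The area names and values are made up for illustration, and the column names `AreaName`, `Timeperiod` and `Value` are simply borrowed from the Fingertips output used later in these slides; this is not a description of how the Data Lake itself is built.

```r
# Hypothetical wide table: one column per year (made-up illustrative values)
wide <- data.frame(
  AreaName = c("Area A", "Area B"),
  `2015` = c(78.1, 79.4),
  `2016` = c(78.3, 79.6),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Tidy (long) version: one row per area and time period
year_cols <- c("2015", "2016")
tidy <- data.frame(
  AreaName = rep(wide$AreaName, times = length(year_cols)),
  Timeperiod = rep(year_cols, each = nrow(wide)),
  Value = unlist(wide[year_cols], use.names = FALSE),
  stringsAsFactors = FALSE
)

print(tidy)
```

With the data in this shape, the same filtering and plotting code can be reused across indicators, because every indicator has the same set of columns.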
There are a number of projects working on automation within Knowledge and Intelligence involving multiple teams across the organisation.
Two broad themes:
With around 1,700 indicators on our Fingertips platform, work has begun to automate many of them using SQL and R, such as:
Fingertips has opened an API to its data, making the data more accessible to users. This also includes data from our Local Health tool. In recent months we have started to see stakeholders accessing data in this way.
Code demo for life expectancy
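# Load the packages used in this demo
library(fingertipsR)
library(fingertipscharts)
library(dplyr)
library(ggplot2)
library(DT)
# An indicator ID and a profile ID are needed (both can be found using the indicators() function)
df <- fingertips_data(IndicatorID = 90366, ProfileID = 19) %>%
  mutate(AreaName = as.character(AreaName))
# Number of distinct time periods in the data
years <- length(unique(df$Timeperiod))
# Latest values for Stoke-on-Trent (E06000021), by sex
SoTCurrentMale <- df %>%
  filter(AreaCode == "E06000021", TimeperiodSortable == max(df$TimeperiodSortable), Sex == "Male")
SoTCurrentFemale <- df %>%
  filter(AreaCode == "E06000021", TimeperiodSortable == max(df$TimeperiodSortable), Sex == "Female")
# Keep England and Stoke-on-Trent for the comparison chart
p <- df %>%
  filter(AreaName %in% c("England", "Stoke-on-Trent"))
# Shared y-axis limits, ignoring any missing confidence limits
axislimits <- c(min(p$LowerCI95.0limit, na.rm = TRUE), max(p$UpperCI95.0limit, na.rm = TRUE))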
# Show the first rows as a plain table (dom = 't' shows the table body only)
datatable(head(p),
  options = list(
    autoWidth = FALSE,
    scrollX = TRUE,
    # pageLength = 25,
    # lengthMenu = c(5, 10, 15, 20, 25),
    columnDefs = list(list(className = 'dt-left', targets = "_all")),
    dom = 't'),
  rownames = FALSE,
  filter = "none",
  class = 'row-border stripe')
p <- p %>%
  trends(Timeperiod, Value, AreaName, "England", "Stoke-on-Trent",
         fill = ComparedtoEnglandvalueorpercentiles,
         title = "Life expectancy at birth",
         subtitle = "Local Authority A",
         ylab = "Age (years)",
         lowerci = LowerCI95.0limit,
         upperci = UpperCI95.0limit) +
  facet_wrap(~Sex, scales = "free_y") +
  scale_y_continuous(limits = axislimits) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
p
Automated Text:
Life expectancy has increased in Local Authority A over the last 14 years for both males and females. However, life expectancy in Local Authority A has remained significantly lower than in England throughout the last 14 years. Male life expectancy is 76.42 years and female life expectancy is 81.15 years.
- Red values denote automated text; these values will update when the data in fingertipsR updates.
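One way text like this could be generated is sketched below in base R: the sentence is assembled from values held in a data frame, so it changes whenever the underlying data changes. The object names (`latest`, `area`, `n_years`, `auto_text`) are hypothetical, and the two values are the ones quoted above rather than a live download; in the demo they would come from objects such as `SoTCurrentMale` and `SoTCurrentFemale`. This is an illustration, not necessarily how the slide text was produced.

```r
# Latest values by sex (taken from the automated text above, not a live query)
latest <- data.frame(
  Sex = c("Male", "Female"),
  Value = c(76.42, 81.15),
  stringsAsFactors = FALSE
)
area <- "Local Authority A"  # hypothetical label for the area
n_years <- 14L               # number of years covered by the data

male <- latest$Value[latest$Sex == "Male"]
female <- latest$Value[latest$Sex == "Female"]

# Build the sentence from the data rather than typing the numbers by hand
auto_text <- sprintf(
  "Over the last %d years of data for %s, the latest life expectancy is %.2f years for males and %.2f years for females.",
  n_years, area, male, female
)

print(auto_text)
```

In an R Markdown document, a string like `auto_text` can be dropped into the narrative with inline R code, so the text is regenerated each time the document is knitted.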
fingertipsR
There are a couple of vignettes that can help users get going:
fingertipscharts
Code demo for calculating Violent crime crude rate per 1,000
library(fingertipsR)
library(PHEindicatormethods)
library(dplyr)
library(DT)
## Get some sample data
df <- fingertips_data(IndicatorID = 11202, AreaTypeID = 6) %>%
  filter(is.na(ParentCode)) %>%
  select(IndicatorName, AreaName, Timeperiod, Value, Count, Denominator, LowerCI95.0limit, UpperCI95.0limit)
## Crude rate per 1,000 with 95% confidence limits
crude <- phe_rate(data = df, x = Count, n = Denominator, multiplier = 1000, type = "full", confidence = 0.95)
datatable(crude,
  options = list(
    autoWidth = FALSE,
    scrollX = TRUE,
    # pageLength = 25,
    # lengthMenu = c(5, 10, 15, 20, 25),
    columnDefs = list(list(className = 'dt-left', targets = "_all")),
    dom = 't'),
  rownames = FALSE,
  filter = "none",
  class = 'row-border stripe') %>%
  formatStyle(columns = c('value', 'lowercl', 'uppercl', 'confidence', 'statistic', 'method'),
              color = c('white'),
              backgroundColor = c('grey')) %>%
  formatRound(columns = c('Value', 'LowerCI95.0limit', 'UpperCI95.0limit', 'value', 'lowercl', 'uppercl'),
              digits = 2)
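To make the calculation behind the demo more concrete, here is a minimal base R sketch of a crude rate per 1,000 with confidence limits from Byar's approximation. It assumes Byar's method is the relevant one for counts of this size; the helper names (`byars_lower`, `byars_upper`, `crude_rate`) and the counts and denominators are hypothetical, and this is not the package's own implementation. The output column names mirror the `value`, `lowercl` and `uppercl` columns used in the table above.

```r
# Byar's approximation to the lower confidence limit of a Poisson count x
byars_lower <- function(x, confidence = 0.95) {
  z <- qnorm(1 - (1 - confidence) / 2)
  x * (1 - 1 / (9 * x) - z / (3 * sqrt(x)))^3
}

# Byar's approximation to the upper confidence limit of a Poisson count x
byars_upper <- function(x, confidence = 0.95) {
  z <- qnorm(1 - (1 - confidence) / 2)
  x1 <- x + 1
  x1 * (1 - 1 / (9 * x1) + z / (3 * sqrt(x1)))^3
}

# Crude rate = count / denominator * multiplier, with limits scaled the same way
crude_rate <- function(count, denominator, multiplier = 1000, confidence = 0.95) {
  data.frame(
    value = count / denominator * multiplier,
    lowercl = byars_lower(count, confidence) / denominator * multiplier,
    uppercl = byars_upper(count, confidence) / denominator * multiplier
  )
}

# Made-up counts and populations for two areas
example <- crude_rate(count = c(250, 1200), denominator = c(20000, 60000))
print(example)
```

Comparing hand-calculated limits like these with the package output for a few rows is a quick sanity check that the count and denominator columns have been passed the right way round.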
PHEindicatormethods
There are a couple of vignettes that can help users get going:
PHE has been using R over the last couple of years and has started to produce its outputs using our packages.
Flowers, J. (2017). JSON tutorial: Fingertips api. [Online]. Available from: https://rpubs.com/jflowers/239296.
Fox, S. & Flowers, J. (2017). fingertipsR: Fingertips data for public health. R package version 0.1.3. [Online]. Available from: https://CRAN.R-project.org/package=fingertipsR.
Microsoft (2018). Connect to a json file. [Online]. Available from: https://support.office.com/en-ie/article/connect-to-a-json-file-f65207ab-d957-4bf0-bec3-a08bb53cd4c0#ID0EAACAAA=2016.